# Smoke VM on Proxmox Test Host
Every push to `main` builds a fresh ISO (`build-iso.yml`) and then boots
it in a throwaway VM on the Proxmox test host — currently
`192.168.178.165` — to confirm the live ISO boots and the webinstaller
responds on `:5000`. If the smoke step fails, the ISO artifact is still
uploaded and the VM is left running for post-mortem.

The heavy lifting lives in [`scripts/smoke-vm.sh`](../scripts/smoke-vm.sh);
the workflow just downloads the artifact and shells out.
## Where smoke VMs live
- Node: whatever the test host reports as its node name (auto-detected)
- VMID range: `9000-9099` (`PVE_TEST_VMID_MIN` / `PVE_TEST_VMID_MAX`)
- Name: `furtka-smoke-<12-char-sha>`
- Tags: `furtka`, `smoke`, `sha-<12-char-sha>`
- MAC: `BC:24:11:<first-6-hex-of-sha>` (Proxmox's OUI; lets the runner
find the VM by scanning the LAN — the live ISO has no guest agent)
- ISO on test host: `local:iso/furtka-<short-sha>.iso`
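
The naming scheme above can be sketched as two small shell helpers. This is a minimal illustration, not the code in `scripts/smoke-vm.sh`: the function names `smoke_vm_name` and `smoke_vm_mac` are hypothetical, and the uppercase, colon-separated MAC layout is an assumption.

```bash
#!/usr/bin/env bash
# Hypothetical helpers mirroring the naming scheme above; the real
# logic lives in scripts/smoke-vm.sh and may differ in detail.

# VM name: furtka-smoke-<first 12 characters of the commit SHA>
smoke_vm_name() {
  printf 'furtka-smoke-%s\n' "${1:0:12}"
}

# MAC: Proxmox OUI BC:24:11 followed by the first 6 hex digits of the SHA.
# Assumption: uppercase hex, colon-separated octets.
smoke_vm_mac() {
  local hex
  hex=$(printf '%s' "${1:0:6}" | tr 'a-f' 'A-F')
  printf 'BC:24:11:%s:%s:%s\n' "${hex:0:2}" "${hex:2:2}" "${hex:4:2}"
}
```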

The five most recent VMs (and their ISOs) are kept; anything older is stopped
and purged (`destroy-unreferenced-disks=1`) on the next run. Tune via
`PVE_TEST_KEEP`.
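
The retention rule could look roughly like the sketch below. It assumes the caller already has the smoke VMs listed newest first, one `<vmid> <name>` pair per line; how the real script orders VMs is not documented here, and `prune_candidates` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the retention rule, not the script's actual code.
# stdin:  "<vmid> <name>" lines, newest first (ordering is the caller's job).
# stdout: VMIDs that fall outside the keep window and would be purged.
prune_candidates() {
  local keep="${1:-${PVE_TEST_KEEP:-5}}"
  awk -v keep="$keep" 'NR > keep { print $1 }'
}
```

With `keep` set to 2 and three VMs on stdin, only the oldest VMID is printed.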
## Poking a failed smoke VM
1. Find it in the Proxmox WebUI — look for `furtka-smoke-<sha>` in the
9000-range. The VM is still running.
2. Console: **Console** tab in the WebUI (SPICE or noVNC). The webinstaller
logs to `journalctl -u furtka-webinstaller.service` on the live ISO.
3. SSH: the live Arch ISO ships `sshd` enabled, but root has no password, so
SSH login from the LAN is normally not possible without credentials.
Use the WebUI console instead. (The **installed** system, post-wizard,
has the `server` user with the password the wizard set.)
4. Fetch the short-sha from the VM name → cross-reference against
`git log` to see exactly which commit built the failing ISO.
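
Step 4 can be done in one line once the SHA is stripped from the VM name. The helper below is a hypothetical convenience, not part of the repository; it only assumes the `furtka-smoke-<sha>` naming described earlier.

```bash
#!/usr/bin/env bash
# Hypothetical helper: extract the commit SHA from a smoke VM name.
# Returns non-zero if the name does not follow furtka-smoke-<sha>.
smoke_sha_from_name() {
  case "$1" in
    furtka-smoke-*) printf '%s\n' "${1#furtka-smoke-}" ;;
    *) return 1 ;;
  esac
}

# Usage (inside a checkout of the repository):
#   git log -1 --oneline "$(smoke_sha_from_name furtka-smoke-a1b2c3d4e5f6)"
```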
## Running a smoke test locally
Needs LAN access to the test Proxmox and an API token with VM perms.
```bash
PVE_TEST_HOST=192.168.178.165 \
PVE_TEST_TOKEN='user@pve!smoke=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx' \
./scripts/smoke-vm.sh iso/out/furtka-*.iso
```
The script exits 0 on success, non-zero if the VM never served
`http://<ip>:5000`. Pruning runs either way.
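
The success criterion amounts to polling the webinstaller until it answers or a retry budget runs out. A generic sketch of such a loop follows; `wait_for`, the attempt count, and the delay are illustrative assumptions, and the real script's timeouts may differ.

```bash
#!/usr/bin/env bash
# Hypothetical retry helper: run a probe command up to N times.
# Returns 0 as soon as the probe succeeds, 1 if every attempt fails.
wait_for() {
  local attempts="$1" delay="$2" i
  shift 2
  for ((i = 1; i <= attempts; i++)); do
    if "$@"; then
      return 0
    fi
    # Do not sleep after the final failed attempt.
    if ((i < attempts)); then
      sleep "$delay"
    fi
  done
  return 1
}

# Usage (values are placeholders, not the script's actual settings):
#   wait_for 60 5 curl -fsS -o /dev/null --max-time 5 "http://${ip}:5000"
```

Passing the probe as arguments keeps the loop independent of `curl`, so the same helper can be exercised with `true` or `false` when debugging.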
## Clearing the 9000-range by hand
If smoke tests wedge or you want a clean slate:
```bash
# List smoke VMs
curl -sSk -H "Authorization: PVEAPIToken=${PVE_TEST_TOKEN}" \
https://192.168.178.165:8006/api2/json/nodes/<node>/qemu \
| python3 -c 'import json,sys; [print(v["vmid"],v["name"]) for v in json.load(sys.stdin)["data"] if 9000<=int(v["vmid"])<=9099]'
# Destroy one
curl -sSk -X POST -H "Authorization: PVEAPIToken=${PVE_TEST_TOKEN}" \
https://192.168.178.165:8006/api2/json/nodes/<node>/qemu/<vmid>/status/stop
curl -sSk -X DELETE -H "Authorization: PVEAPIToken=${PVE_TEST_TOKEN}" \
"https://192.168.178.165:8006/api2/json/nodes/<node>/qemu/<vmid>?purge=1&destroy-unreferenced-disks=1"
```
Or just run `scripts/smoke-vm.sh` with `PVE_TEST_KEEP=0` and any ISO —
the prune step will sweep everything in the range except the one it
just created.
## Proxmox API token setup (one-time)
1. WebUI → **Datacenter → Permissions → API Tokens → Add**
2. User: `root@pam` (or a dedicated `smoke@pve` user — see below)
3. Token ID: `smoke`
4. Uncheck **Privilege Separation** for the quick path, or keep it
separated and grant explicit perms below
5. Save the displayed secret once — it's shown only here

Minimum perms on `/` (if privilege-separated):
`VM.Allocate`, `VM.Config.Disk`, `VM.Config.CPU`, `VM.Config.Memory`,
`VM.Config.Network`, `VM.Config.Options`, `VM.Config.HWType`,
`VM.Config.CDROM`, `VM.PowerMgmt`, `VM.Audit`, `Datastore.AllocateTemplate`
(for ISO upload/delete on the `local` content store).
Set the result as Forgejo secret `PVE_TEST_TOKEN` in the format:
```
user@realm!tokenid=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
```
…and `PVE_TEST_HOST` as `192.168.178.165`. That's all the workflow needs.
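
A malformed token is an easy mistake when pasting the secret, so a quick shape check before storing it can save a failed run. The sketch below assumes the secret is a hexadecimal UUID, as the placeholder above suggests; `valid_pve_token` is a hypothetical helper, not something the workflow provides.

```bash
#!/usr/bin/env bash
# Hypothetical check that a token looks like user@realm!tokenid=<uuid>.
# Assumption: the secret part is a hexadecimal UUID (8-4-4-4-12).
valid_pve_token() {
  local hex='[0-9a-fA-F]'
  local re='^[^@!=]+@[^@!=]+![^@!=]+='"${hex}{8}(-${hex}{4}){3}-${hex}{12}"'$'
  [[ "$1" =~ $re ]]
}

# Usage:
#   valid_pve_token "$PVE_TEST_TOKEN" || echo "PVE_TEST_TOKEN looks malformed" >&2
```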
## Assumptions
- Runner has L2 reachability to `192.168.178.0/24` (MAC→IP discovery
uses `arp-scan` from the runner).
- Test host uses default storage names: `local` for ISOs, `local-lvm` for
disks. Override via `PVE_TEST_ISO_STORAGE` / `PVE_TEST_DISK_STORAGE`.
- Bridge `vmbr0` carries LAN DHCP. Override via `PVE_TEST_BRIDGE`.

If any of those don't match, set the corresponding env var in
`build-iso.yml` (via `env:` on the smoke step) or override on the CLI
when running locally.
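
For manual debugging, the MAC→IP lookup mentioned in the first assumption can be reproduced by filtering `arp-scan` output. This is a sketch under assumptions: `ip_for_mac` is a hypothetical helper, the interface name in the usage line is a placeholder, and the real script may parse the output differently.

```bash
#!/usr/bin/env bash
# Hypothetical helper: find the IP that answered with a given MAC.
# stdin:  arp-scan output (whitespace-separated "<ip> <mac> <vendor>" rows).
# stdout: the first matching IP, or nothing if the MAC was not seen.
ip_for_mac() {
  local mac
  # Compare case-insensitively by lowercasing both sides.
  mac=$(printf '%s' "$1" | tr 'A-F' 'a-f')
  awk -v mac="$mac" 'tolower($2) == mac { print $1; exit }'
}

# Usage (interface name is a placeholder; arp-scan usually needs root):
#   sudo arp-scan --interface=eth0 192.168.178.0/24 | ip_for_mac BC:24:11:A1:B2:C3
```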