After build-iso, a new smoke-vm job uploads the freshly built ISO to the test Proxmox at 192.168.178.165 via PVE API token, boots it in a fresh VM (VMID range 9000-9099, MAC derived from commit SHA so the runner can find the DHCP IP by scanning the LAN), and curls :5000 to confirm the webinstaller answers HTTP 200. Last 5 smoke VMs + their ISOs are kept for post-mortem; older ones are purged. continue-on-error on the smoke job so a VM-side flake doesn't mark the ISO build red. Shortens the feedback loop on ISO regressions from "next manual VM test session" (days) to "next push" (minutes) — the 2026-04-15/16 VM sessions each found real boot-time bugs that unit tests missed. Docs at docs/smoke-vm.md. Requires Forgejo secrets PVE_TEST_HOST and PVE_TEST_TOKEN (dedicated smoke@pve!ci PVE token, privilege-separated). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
# Smoke VM on Proxmox Test Host

Every push to `main` builds a fresh ISO (`build-iso.yml`) and then boots
it in a throwaway VM on the Proxmox test host (currently
`192.168.178.165`) to confirm the live ISO boots and the webinstaller
responds on `:5000`. If the smoke step fails, the ISO artifact is still
uploaded and the VM is left running for post-mortem.

The heavy lifting lives in [`scripts/smoke-vm.sh`](../scripts/smoke-vm.sh);
the workflow just downloads the artifact and shells out.

## Where smoke VMs live

- Node: whatever the test host reports as its node name (auto-detected)
- VMID range: `9000–9099` (`PVE_TEST_VMID_MIN` / `PVE_TEST_VMID_MAX`)
- Name: `furtka-smoke-<12-char-sha>`
- Tags: `furtka`, `smoke`, `sha-<12-char-sha>`
- MAC: `BC:24:11:<first-6-hex-of-sha>` (Proxmox's OUI; lets the runner
  find the VM by scanning the LAN, since the live ISO has no guest agent)
- ISO on test host: `local:iso/furtka-<short-sha>.iso`

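As a rough illustration of the naming scheme in the list above, the sketch below derives the VM name and MAC from a commit SHA. The function names (`smoke_sha12`, `smoke_vm_name`, `smoke_vm_mac`) are hypothetical and not taken from `scripts/smoke-vm.sh`; the uppercase, colon-separated MAC layout is an assumption.

```bash
# Hypothetical helpers; the real logic lives in scripts/smoke-vm.sh and may differ.

# First 12 characters of the commit SHA, as used in the VM name and tag.
smoke_sha12() {
  printf '%.12s' "$1"
}

# VM name: furtka-smoke-<12-char-sha>
smoke_vm_name() {
  printf 'furtka-smoke-%s\n' "$(smoke_sha12 "$1")"
}

# Proxmox OUI (BC:24:11) followed by the first 6 hex digits of the SHA.
# Assumption: uppercase hex, one colon between octets.
smoke_vm_mac() {
  hex=$(printf '%.6s' "$1" | tr 'a-f' 'A-F')
  # Insert a colon after every two hex digits, then drop the trailing colon.
  printf 'BC:24:11:%s\n' "$(printf '%s' "$hex" | sed 's/../&:/g; s/:$//')"
}
```

For example, `smoke_vm_name "$(git rev-parse HEAD)"` would print the name to look for in the WebUI for the current commit.
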
The five most recent VMs (and their ISOs) are kept; anything older is stopped
and purged (`destroy-unreferenced-disks=1`) on the next run. Tune via
`PVE_TEST_KEEP`.

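The retention rule can be sketched as a small filter. This is an illustration only, not the script's actual code: `vmids_to_prune` is a hypothetical name, and the input format (`<creation-epoch> <vmid>` per line) is an assumption, since how `smoke-vm.sh` orders VMs by age is not described here.

```bash
# Hypothetical helper: reads "<creation-epoch> <vmid>" lines on stdin and
# prints the VMIDs that fall outside the newest N entries.
# N comes from the first argument, else $PVE_TEST_KEEP, else 5.
vmids_to_prune() {
  keep="${1:-${PVE_TEST_KEEP:-5}}"
  # Newest first, then skip the first $keep rows.
  sort -k1,1nr | awk -v keep="$keep" 'NR > keep { print $2 }'
}
```

With a keep count of `0` this filter prints every VMID it is given.
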
## Poking a failed smoke VM

1. Find it in the Proxmox WebUI: look for `furtka-smoke-<sha>` in the
   9000-range. The VM is still running.
2. Console: **Console** tab in the WebUI (SPICE or noVNC). The webinstaller
   logs to `journalctl -u furtka-webinstaller.service` on the live ISO.
3. SSH: the live Arch ISO ships `sshd` enabled, but root has no password,
   so logging in over the LAN is normally not possible without creds.
   Use the WebUI console instead. (The **installed** system, post-wizard,
   has the `server` user with the password the wizard set.)
4. Take the short SHA from the VM name and cross-reference it against
   `git log` to see exactly which commit built the failing ISO.

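Step 4 can be done by hand, but a small helper makes it less error-prone. The sketch below is an assumption-level example; `sha_from_vm_name` is a hypothetical name and is not part of the repository.

```bash
# Hypothetical helper: strip the "furtka-smoke-" prefix from a VM name
# and print the embedded commit SHA.
sha_from_vm_name() {
  case "$1" in
    furtka-smoke-*) printf '%s\n' "${1#furtka-smoke-}" ;;
    *) echo "not a smoke VM name: $1" >&2; return 1 ;;
  esac
}

# Usage, from inside the repo checkout:
#   git log -1 --oneline "$(sha_from_vm_name furtka-smoke-0123456789ab)"
```
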
## Running a smoke test locally

Needs LAN access to the test Proxmox and an API token with VM perms.

```bash
PVE_TEST_HOST=192.168.178.165 \
PVE_TEST_TOKEN='user@pve!smoke=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx' \
  ./scripts/smoke-vm.sh iso/out/furtka-*.iso
```

The script exits 0 on success, non-zero if the VM never served
`http://<ip>:5000`. Pruning runs either way.

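The success criterion (an HTTP 200 from `:5000` before giving up) could be implemented along these lines. This is a minimal sketch, not the code in `smoke-vm.sh`: `wait_for_http` is a hypothetical name, and the 300-second timeout and 5-second retry interval are assumed defaults.

```bash
# Hypothetical helper: poll a URL until it answers HTTP 200 or time runs out.
#   $1 = URL, $2 = total timeout in seconds (assumed default 300),
#   $3 = seconds between attempts (assumed default 5)
wait_for_http() {
  url="$1"
  timeout="${2:-300}"
  interval="${3:-5}"
  deadline=$(( $(date +%s) + timeout ))
  while [ "$(date +%s)" -lt "$deadline" ]; do
    # -w prints only the status code; connection errors yield "000".
    code=$(curl -s -o /dev/null -m 5 -w '%{http_code}' "$url" || true)
    if [ "$code" = "200" ]; then
      return 0
    fi
    sleep "$interval"
  done
  return 1
}

# Usage: wait_for_http "http://<ip>:5000" 300 5 || echo "webinstaller never answered" >&2
```
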
## Clearing the 9000-range by hand

If smoke tests wedge or you want a clean slate:

```bash
# List smoke VMs (replace <node> with the node name)
curl -sSk -H "Authorization: PVEAPIToken=${PVE_TEST_TOKEN}" \
  "https://192.168.178.165:8006/api2/json/nodes/<node>/qemu" \
  | python3 -c 'import json,sys; [print(v["vmid"],v.get("name","")) for v in json.load(sys.stdin)["data"] if 9000<=int(v["vmid"])<=9099]'

# Destroy one (replace <node> and <vmid>)
curl -sSk -X POST -H "Authorization: PVEAPIToken=${PVE_TEST_TOKEN}" \
  "https://192.168.178.165:8006/api2/json/nodes/<node>/qemu/<vmid>/status/stop"
curl -sSk -X DELETE -H "Authorization: PVEAPIToken=${PVE_TEST_TOKEN}" \
  "https://192.168.178.165:8006/api2/json/nodes/<node>/qemu/<vmid>?purge=1&destroy-unreferenced-disks=1"
```

Or just run `scripts/smoke-vm.sh` with `PVE_TEST_KEEP=0` and any ISO;
the prune step will sweep everything in the range except the one it
just created.

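To clear several VMs in one go, the manual calls can be wrapped in two small functions. This is a sketch under assumptions: `filter_smoke_vmids` and `destroy_smoke_vm` are hypothetical names, and `PVE_TEST_HOST` / `PVE_TEST_TOKEN` are expected in the environment. The stop request returns a task ID rather than waiting for the VM to halt, so a real script would likely poll or pause before deleting.

```bash
# Hypothetical helper: keep only VMIDs (one per line on stdin) inside the
# smoke range, honouring PVE_TEST_VMID_MIN / PVE_TEST_VMID_MAX.
filter_smoke_vmids() {
  min="${PVE_TEST_VMID_MIN:-9000}"
  max="${PVE_TEST_VMID_MAX:-9099}"
  awk -v min="$min" -v max="$max" \
    '$1 ~ /^[0-9]+$/ && $1 >= min && $1 <= max { print $1 }'
}

# Hypothetical helper: stop, then purge, one VM. $1 = node name, $2 = VMID.
destroy_smoke_vm() {
  base="https://${PVE_TEST_HOST}:8006/api2/json/nodes/$1/qemu/$2"
  curl -sSk -X POST -H "Authorization: PVEAPIToken=${PVE_TEST_TOKEN}" \
    "${base}/status/stop"
  curl -sSk -X DELETE -H "Authorization: PVEAPIToken=${PVE_TEST_TOKEN}" \
    "${base}?purge=1&destroy-unreferenced-disks=1"
}

# Usage, with NODE set to the node name:
#   printf '%s\n' 9001 9002 | filter_smoke_vmids \
#     | while read -r vmid; do destroy_smoke_vm "$NODE" "$vmid"; done
```
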
## Proxmox API token setup (one-time)

1. WebUI → **Datacenter → Permissions → API Tokens → Add**
2. User: `root@pam` (or a dedicated `smoke@pve` user; see below)
3. Token ID: `smoke`
4. Uncheck **Privilege Separation** for the quick path, or keep it
   separated and grant explicit perms below
5. Save the displayed secret once; it's shown only here

Minimum perms on `/` (if privilege-separated):
`VM.Allocate`, `VM.Config.Disk`, `VM.Config.CPU`, `VM.Config.Memory`,
`VM.Config.Network`, `VM.Config.Options`, `VM.Config.HWType`,
`VM.Config.CDROM`, `VM.PowerMgmt`, `VM.Audit`, `Datastore.AllocateTemplate`
(for ISO upload/delete on the `local` content store).

Set the result as Forgejo secret `PVE_TEST_TOKEN` in the format:

```
user@realm!tokenid=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
```

…and `PVE_TEST_HOST` as `192.168.178.165`. That's all the workflow needs.

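A malformed secret is an easy mistake to make when pasting, so a quick shape check before storing it may help. The sketch below is not part of the repository: `valid_pve_token` is a hypothetical name, and the UUID-shaped secret is inferred from the placeholder format above.

```bash
# Hypothetical helper: succeed only if $1 looks like
#   user@realm!tokenid=<uuid>
# Assumption: the secret part is a hex UUID, as the placeholder suggests.
valid_pve_token() {
  printf '%s\n' "$1" | grep -Eq \
    '^[^@!=]+@[^@!=]+![^@!=]+=[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$'
}

# Usage: valid_pve_token "$PVE_TEST_TOKEN" || echo "token looks malformed" >&2
```
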
## Assumptions

- Runner has L2 reachability to `192.168.178.0/24` (MAC→IP discovery
  uses `arp-scan` from the runner).
- Test host uses default storage names: `local` for ISOs, `local-lvm` for
  disks. Override via `PVE_TEST_ISO_STORAGE` / `PVE_TEST_DISK_STORAGE`.
- Bridge `vmbr0` carries LAN DHCP. Override via `PVE_TEST_BRIDGE`.

If any of those don't match, set the corresponding env var in
`build-iso.yml` (via `env:` on the smoke step) or override on the CLI
when running locally.

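The MAC→IP discovery mentioned in the first assumption might look roughly like the following. This is a sketch, not the script's implementation: `ip_for_mac` is a hypothetical name, and the interface name in the usage line is a placeholder.

```bash
# Hypothetical helper: reads arp-scan output (IP<TAB>MAC<TAB>vendor) on stdin
# and prints the first IP whose MAC equals $1, ignoring case.
# Exits non-zero when no line matches.
ip_for_mac() {
  awk -v mac="$1" '
    BEGIN { mac = tolower(mac) }
    tolower($2) == mac { print $1; found = 1; exit }
    END { exit !found }'
}

# Usage (arp-scan needs root; eth0 is a placeholder interface):
#   sudo arp-scan --interface=eth0 --localnet | ip_for_mac BC:24:11:A1:B2:C3
```

Because the match is done on the MAC alone, the runner does not need a guest agent or a DHCP lease lookup on the router.
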