Commit graph

14 commits

Author SHA1 Message Date
7b894f096f fix(ci): inline smoke-vm as a step instead of a downstream job
All checks were successful
CI / lint (pull_request) Successful in 1m6s
CI / test (pull_request) Successful in 1m23s
CI / validate-json (pull_request) Successful in 56s
CI / markdown-links (pull_request) Successful in 24s
The separate smoke-vm job with `needs: build-iso` required round-tripping
the 1.5 GB ISO through actions/upload-artifact + download-artifact. v3
on Forgejo has a known issue where large artifacts stall at 0.0% in the
download step — the smoke run hung today with endless "Total file count:
1 ---- Processed file #0 (0.0%)" output.

Since both jobs run on the same self-hosted runner (host mode, same
workspace available), there was never a real need for the artifact
indirection. Inlining as a step after the artifact upload reuses the
ISO already in iso/out/ and skips the download entirely.

step-level continue-on-error preserves the original guarantee that a
VM-side flake doesn't mark the ISO build red.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 12:20:58 +02:00
d499907613 feat(ci): auto-boot every main-ISO in smoke VM on .165 Proxmox
Some checks failed
Build ISO / smoke-vm (push) Blocked by required conditions
Build ISO / build-iso (push) Successful in 24m28s
CI / test (push) Successful in 3m1s
CI / validate-json (push) Successful in 55s
CI / markdown-links (push) Successful in 37s
CI / lint (push) Failing after 13m19s
After build-iso, a new smoke-vm job uploads the freshly built ISO to
the test Proxmox at 192.168.178.165 via PVE API token, boots it in a
fresh VM (VMID range 9000-9099, MAC derived from commit SHA so the
runner can find the DHCP IP by scanning the LAN), and curls :5000 to
confirm the webinstaller answers HTTP 200. Last 5 smoke VMs + their
ISOs are kept for post-mortem; older ones are purged. continue-on-error
on the smoke job so a VM-side flake doesn't mark the ISO build red.

Shortens the feedback loop on ISO regressions from "next manual VM
test session" (days) to "next push" (minutes) — the 2026-04-15/16 VM
sessions each found real boot-time bugs that unit tests missed.

Docs at docs/smoke-vm.md. Requires Forgejo secrets PVE_TEST_HOST and
PVE_TEST_TOKEN (dedicated smoke@pve!ci PVE token, privilege-separated).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 11:41:44 +02:00
b4c65f46bf fix(release): drop jq dependency, use python3 for JSON assembly
All checks were successful
Build ISO / build-iso (push) Successful in 17m30s
CI / lint (push) Successful in 25s
CI / test (push) Successful in 33s
CI / validate-json (push) Successful in 24s
CI / markdown-links (push) Successful in 12s
Release / release (push) Successful in 6s
The 26.2-alpha release workflow hung for 15+ minutes on
"apt-get install -y jq" — the runner's apt mirror was unreachable
(or very slow), and the whole publish stalled.

jq was only used for two tiny things: building the release-create
POST body and reading the release id from the response. Both are
one-liners in Python, which is guaranteed-present on the Forgejo
Actions ubuntu-latest runner image. Replaced both uses; removed
the apt-get step from release.yml entirely. Slow mirrors no
longer block tagged releases.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 17:05:21 +02:00
f0acc4427e feat(furtka): release CI + \furtka update\ / \furtka rollback\ CLI
Slice 2 of the self-update story. Tagging a release on main now
produces a downloadable self-update payload on the Forgejo releases
page, and a running box can pull it down, verify it, atomically swap
to the new version, and health-check the result.

New pieces:

- scripts/build-release-tarball.sh <version> — packages the furtka/
  package + bundled apps/ + a root-level VERSION file as
  dist/furtka-<version>.tar.gz, plus a .sha256 sidecar and a
  release.json metadata blob.
- scripts/publish-release.sh <version> — uses the Forgejo v1 API to
  create a release (body pulled from the CHANGELOG section for this
  tag, pre-release auto-flagged on -alpha/-beta/-rc) and upload the
  three assets sequentially. Needs \$FORGEJO_TOKEN.
- .forgejo/workflows/release.yml — tag-triggered, runs both scripts
  with the new \$FORGEJO_RELEASE_TOKEN repo secret.
- furtka/updater.py — check_update, prepare_update, apply_update,
  run_update, rollback. Atomic symlink swap, sha256 verify (TOCTOU-
  safe: re-hashes on-disk file), health-check post-restart with
  auto-rollback on failure, stage-by-stage progress persisted to
  /var/lib/furtka/update-state.json so the UI can poll independent
  of the (restarting) API process. Path overrides via FURTKA_ROOT /
  FURTKA_STATE_DIR / FURTKA_LOCK_PATH so tests pin a tmpdir.
- furtka/cli.py — \`furtka update [--check] [--json]\` and
  \`furtka rollback\`.
- tests/test_updater.py — 15 tests: version compare, sha256 verify,
  tarball extract (including traversal refusal), lockfile, apply
  happy + rollback paths, rollback CLI, check_update with stubbed
  Forgejo.
- iso/build.sh — writes VERSION at the tarball root so the install
  path matches the self-update path (previously assumed only the
  release script did this).

RELEASING.md now points at the automated flow — no more manually
clicking "Create release" on the Forgejo UI.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 13:30:45 +02:00
dfdbdd69aa docs: sync README roadmap, runner-setup, and ops/ to today's reality
All checks were successful
Build ISO / build-iso (push) Successful in 17m13s
CI / lint (push) Successful in 26s
CI / test (push) Successful in 32s
CI / validate-json (push) Successful in 22s
CI / markdown-links (push) Successful in 13s
A lot moved since the last docs sweep. Catching everything up in one
batch so a newcomer (or future us) reading the repo isn't lied to.

**README.md roadmap:**
- Walking-skeleton live ISO: upgraded from "screens 1-3 work
  end-to-end" to "install runs to completion on a VM and the installed
  system logs in and runs `docker ps` without sudo".
- 26.0-alpha release: dropped the "deferred" note — its blocker
  (archinstall not completing) is gone; just needs a re-tag when we
  like the installer copy.
- Added an explicit "ISO-build in CI" line for the new
  `.forgejo/workflows/build-iso.yml`.
- Split the old "mDNS + local CA" item: mDNS is live (hostname baked
  in, avahi/nss-mdns in the image), HTTPS via local CA still open.
- Noted post-install reboot button, progress bar, archinstall 4.x
  schema work, console welcome, custom_commands docker group join in
  the wizard milestone bullet.

**docs/runner-setup.md:**
- Full rewrite for the docker-outside-of-docker architecture we
  actually run now (was still describing the DinD sidecar setup).
- Documents the `/data` symlink on the host that makes host-mode
  `-v /data/…:/work` resolve — the non-obvious piece that took the
  longest to nail down today.
- Describes the two runtime modes (`ubuntu-latest:docker://…` for CI,
  `self-hosted:host` for build-iso) and why each exists.
- Adds the `upload-artifact@v3` pin note — v4+ fails on Forgejo with
  `GHESNotSupportedError`.

**ops/forgejo-runner/compose.yml + config.yml:**
- Compose now matches what's actually running: DooD (no DinD sidecar),
  runs as root so apk can install nodejs + docker-cli at startup,
  /var/run/docker.sock bind-mounted.
- Config gets the three explicit label mappings and DooD
  `docker_host` + `valid_volumes`.

**.forgejo/workflows/build-iso.yml:**
- Added `paths-ignore` for docs/website/*.md so doc-only commits don't
  kick off 5-min ISO rebuilds. Code + ISO overlay changes still
  trigger.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 19:28:33 +02:00
05ef50f74e ci: pin upload-artifact to v3 — v4+ unsupported on forgejo
All checks were successful
Build ISO / build-iso (push) Successful in 17m29s
CI / lint (push) Successful in 25s
CI / test (push) Successful in 32s
CI / validate-json (push) Successful in 24s
CI / markdown-links (push) Successful in 12s
Forgejo Actions only speaks the GHES-compatible @actions/artifact
protocol; upload-artifact@v4+ insists on the newer API and fails with
`GHESNotSupportedError`. Pin to v3, which uses the old protocol that
Forgejo implements.

Good news: the ISO itself built end-to-end in ~5m on the runner
(DooD + /data symlink resolved the path-mismatch). Only the upload
failed, and this pins it.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 19:10:16 +02:00
e9e8bd3319 ci: run build-iso on the runner host (DooD path fix)
Some checks failed
CI / test (push) Waiting to run
Build ISO / build-iso (push) Failing after 6s
CI / lint (push) Successful in 25s
CI / validate-json (push) Successful in 23s
CI / markdown-links (push) Has been cancelled
Now that the runner uses docker-outside-of-docker, volume mounts in
`build.sh` (`docker run -v \$REPO_ROOT:/work ...`) are interpreted by
host docker — so `\$REPO_ROOT` must be a real host path. When the job
runs inside a job container, `\$REPO_ROOT` is only valid in the job
container's filesystem namespace and host docker can't find it, hence
`bash: /work/iso/build.sh: No such file or directory`.

Fix: switch `runs-on` to `self-hosted`. Forgejo-runner exposes that
label out of the box and, with no matching container image mapping,
runs steps directly on the runner VM. Checkout writes to a real host
path; `docker run -v …` then mounts a path both the outer CLI and
host docker agree on.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 18:50:47 +02:00
a6cccc67c1 ci: drop duplicate docker.sock mount in build-iso
Some checks failed
Build ISO / build-iso (push) Failing after 6s
CI / lint (push) Successful in 26s
CI / test (push) Successful in 33s
CI / validate-json (push) Successful in 23s
CI / markdown-links (push) Has been cancelled
Forgejo-runner's valid_volumes already injects /var/run/docker.sock
into every job container, so the explicit `container.volumes` mount
in the workflow triggered 'Duplicate mount point' and the job never
started. Removed — DOCKER_HOST env is enough.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 18:49:12 +02:00
0f0308bf68 ci: switch build-iso to docker-outside-of-docker
Some checks failed
Build ISO / build-iso (push) Failing after 46s
CI / lint (push) Successful in 25s
CI / test (push) Successful in 32s
CI / validate-json (push) Successful in 24s
CI / markdown-links (push) Successful in 14s
The DinD setup was the wrong tool here: forgejo-runner runs on host
docker, but it spawned jobs via the DinD sidecar — meaning jobs
were isolated inside DinD's own docker namespace and couldn't reach
`docker-in-docker` by hostname, and couldn't see the
`forgejo-runner_default` network (which only exists on host docker).

Switched the runner (compose.yml + data/config.yml) to talk directly
to host docker via `/var/run/docker.sock` and added it to the host
`docker` group (GID 988) so the non-root runner user can use the
socket. `valid_volumes` now whitelists the socket so job containers
can mount it too.

Workflow now mounts /var/run/docker.sock into the job container and
points DOCKER_HOST at that unix socket. `./iso/build.sh` then runs
its inner `docker run --privileged archlinux:latest` against the
host daemon — no nested docker.

Tradeoff: this is less isolated than DinD (jobs have full host docker
access — they could spawn arbitrary containers), but on a dedicated
single-user build VM the DooD simplification is worth it.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 18:45:32 +02:00
4c5a00a0e0 ci: drop ineffective container.options override for build-iso
Some checks failed
Build ISO / build-iso (push) Failing after 1s
CI / lint (push) Failing after 1s
CI / test (push) Failing after 1s
CI / validate-json (push) Failing after 1s
CI / markdown-links (push) Failing after 1s
forgejo-runner 6.4 filters `--network` out of `container.options`, so
the workflow-level override was silently ignored and the job kept
landing on a per-task network where `docker-in-docker` didn't resolve.
Fixed at the right level by editing the runner's `/data/config.yml`
(`container.network: "forgejo-runner_default"`) and restarting the
forgejo-runner container — every job now joins the shared network so
DOCKER_HOST=tcp://docker-in-docker:2375 just works.

Workflow trimmed back to only what's needed: DOCKER_HOST env pin. The
default runner image (catthehacker/ubuntu:act-latest) already has the
docker CLI.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 18:40:16 +02:00
ba36bb4741 ci: attach build-iso job to DinD network, pin lychee-action source
Some checks failed
Build ISO / build-iso (push) Failing after 5s
CI / lint (push) Successful in 27s
CI / test (push) Successful in 44s
CI / markdown-links (push) Failing after 1s
CI / validate-json (push) Failing after 10m34s
- build-iso: the job container was on a per-job docker network, so
  `docker-in-docker` (the DinD sidecar hostname on
  `forgejo-runner_default`) didn't resolve. Pin the container to that
  shared network via `container.options: --network forgejo-runner_default`.
  catthehacker/ubuntu:act-latest already has the docker CLI, so drop
  the apt-get step.

- ci.yml markdown-links: forgejo's action mirror at data.forgejo.org
  doesn't carry `lycheeverse/lychee-action`, so `uses:` was 404ing
  before the step could even run (rendering continue-on-error moot).
  Fully-qualified GitHub URL bypasses the mirror.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 18:37:54 +02:00
a777efd4c0 ci: green the pipeline — tests match 4.x schema, build-iso hits DinD, lint clean
Some checks failed
Build ISO / build-iso (push) Failing after 20s
CI / lint (push) Successful in 26s
CI / test (push) Successful in 31s
CI / validate-json (push) Successful in 23s
CI / markdown-links (push) Failing after 2s
Three things are broken on origin/main as of 6114cb2, all found in one
red CI run:

- build-iso workflow couldn't reach docker. forgejo-runner's config
  sets `docker_host: tcp://docker-in-docker:2375` but that env doesn't
  propagate into job containers on `runs-on: ubuntu-latest`, and the
  default job image has no docker CLI. Fix: pin `DOCKER_HOST` on the
  job and apt-install `docker.io` before invoking `iso/build.sh`.

- Two tests asserted on the pre-4.x archinstall schema:
  `creds["root_password"]` (now `!root-password`) and
  `cfg["disk_config"]["device"]` / `cfg["users"]` (users moved to
  creds; disk_config is now a full `default_layout` dict). Rewrote
  the tests to reflect 4.x reality and monkeypatched `build_disk_config`
  since its real body imports archinstall, which isn't on CI.

- Ruff flagged one line of `PROGRESS_PHASES` at 107 chars — collapsed
  the column alignment. `ruff format` pulled in a couple of cosmetic
  expansions in spawn_archinstall and the tests that had been drifting.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 18:29:42 +02:00
6114cb2f27 ci: build the live ISO on push-to-main and publish as artifact
Some checks failed
Build ISO / build-iso (push) Failing after 19s
CI / lint (push) Failing after 27s
CI / test (push) Failing after 41s
CI / validate-json (push) Successful in 24s
CI / markdown-links (push) Failing after 2s
Adds `.forgejo/workflows/build-iso.yml` that runs `./iso/build.sh` and
uploads the resulting ISO as a `furtka-iso` artifact (retained 14 days).
Triggers on `push: branches: [main]` and `workflow_dispatch` only —
feature branches don't pay the 15-20 min build cost. `concurrency`
cancels older runs of the same ref so only the most recent push
produces an artifact.

This is what Robert asked for: push change → download ISO from the
Forgejo run → test without needing a laptop to build.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 18:13:15 +02:00
852efdb0ed ci: add Forgejo Actions workflow with ruff, pytest, JSON + link checks
Some checks failed
CI / lint (push) Failing after 36s
CI / test (push) Failing after 1s
CI / validate-json (push) Failing after 2s
CI / markdown-links (push) Failing after 1s
- .forgejo/workflows/ci.yml: four jobs (lint, test, validate-json,
  markdown-links) running on push to main and on pull requests
- pyproject.toml: project metadata, flask dep, dev extras (ruff, pytest),
  ruff config (E/F/I/W/B/UP rulesets, 100-char lines, py311 target),
  pytest config (pythonpath=webinstaller so tests can import drives)
- tests/test_drives.py: 11 unit tests covering parse_size_gb (TB/GB/MB,
  European comma decimal, empty input, unknown units), drive type
  scoring (nvme/ssd/hdd), size scoring bands, and score_device summing
- .gitignore: ignore .pytest_cache, *.egg-info, .ruff_cache
- webinstaller/drives.py: refactor subprocess.run to capture_output
  kwarg (ruff UP022) — drops four lines, same behavior
- webinstaller/app.py: ruff-sorted imports (isort)

All checks pass locally: ruff check + format, pytest 11/11, JSON valid.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 20:24:05 +02:00