claude-timemachine 49d1cb3280
drop restic repo encryption; rely on TLS + append-only + LUKS
User credentials now serve HTTP basic auth only. Repos init with
--insecure-no-password. Removes:
  - RESTIC_PASSWORD env in client subprocess
  - Per-repo password coordination story
  - Multi-key restic setup (user key + operator-master key)
  - Two-password recovery edge cases

Operator-side prune now runs over the filesystem path (-r /srv/.../<user>/)
which bypasses rest-server's HTTP-layer append-only enforcement. No
password needed at all.

Protection model stays:
  - TLS in transit (reverse proxy)
  - HTTP basic per-user (htpasswd) for read/write authorization
  - --private-repos for per-user URL isolation
  - --append-only for client-side delete protection
  - LUKS / disk-level for at-rest encryption (operator's responsibility)

Verified end-to-end on john: pull → push → restore round-trip works,
DELETE on bogus snapshot still returns 403 (append-only intact),
operator can read repo via filesystem path (prune-mode access works).

33 pytest still green.
2026-06-04 22:23:40 +02:00


cloud-sync — design

Per-Discord-user state sync for Minecraft. Pulls on launch, pushes on exit. Single Python zipapp drops into Prism / MMC / ATLauncher / frazclient as a pre-launch + post-exit hook.

Data plane: restic-rest-server with --private-repos --append-only. Clients hit this directly with their per-user password.

Control plane: cloud-svc Go service with two listeners — a provisioning port reachable from automc-net (called by discord-bot) and a loopback admin port (called by automc-setup wizard). Players never touch cloud-svc.

Client: cloud-sync.pyz (Python 3.10+, stdlib only) subprocesses restic. ~300 LOC. Distributed as a zipapp (single-file). Python over Java for two reasons: (a) launcher's PostExit hook can call any subprocess so language doesn't matter, (b) custom unsigned JARs that download binaries + upload files are textbook Windows Defender false-positive triggers, while Python invoked by signed python.exe mostly sidesteps that.

Why this shape

| Concern | How restic solves it |
| --- | --- |
| Snapshot semantics | Native: every restic backup is a snapshot |
| Deduplication | Chunk-level (not just file-level), built in |
| Retention policy | restic forget --keep-last/daily/weekly/monthly |
| Append-only enforcement | restic-rest-server --append-only: even with a valid password, clients can't delete |
| Per-user isolation | --private-repos: URL path must contain the authenticated username |
| Encryption at rest | Disabled (--insecure-no-password); delegated to LUKS on host disk |
| Multi-machine support | restic tags + hostname; available for free if we ever want it |

cloud-svc as originally designed was a worse re-implementation of all the above. Pivoting before it ships; cloud-svc gets reshaped into the control plane described below.

Topology

flowchart LR
    pl["player PC"]:::external
    op["operator<br/>(via SSH)"]:::external
    jar["cloud-sync.pyz<br/>(Python; in launcher's<br/>pre/post hooks)"]:::deploy
    restic["restic binary<br/>(auto-downloaded<br/>on first run)"]:::deploy

    subgraph john["john (192.168.65.33)"]
        rp{{"reverse proxy<br/>:443"}}:::deploy

        subgraph net["automc-net"]
            ao{{"restic-rest-server<br/>--private-repos<br/>--append-only<br/>:8002"}}:::deploy
            bot{{"discord-bot"}}:::deploy
            cs_int{{"cloud-svc<br/>provisioning :9091<br/>(automc-net only)"}}:::deploy
        end

        cs_admin{{"cloud-svc<br/>admin :9092<br/>(127.0.0.1 only)"}}:::deploy

        store[/"/srv/cloud-data<br/>/<discord_id>/..."/]:::pvc
        htp[/"/etc/restic-users<br/>htpasswd"/]:::pvc
    end

    pl --> jar --> restic
    restic ==>|"rest:https<br/><discord_id>:<password>"| rp
    rp -->|"loopback"| ao
    ao -->|"reads"| htp
    ao -->|"writes"| store

    bot -.->|"on /register:<br/>POST /admin/users"| cs_int
    cs_int -.->|"htpasswd add<br/>restic init<br/>--insecure-no-password"| htp
    cs_int -.->|"mints repo"| store
    bot -.->|"DM password"| pl

    op -.->|"SSH then<br/>automc-setup cloud ..."| cs_admin
    cs_admin -.->|"list / revoke"| htp
    cs_admin -.->|"prune via<br/>filesystem path"| store

    classDef deploy fill:#d5e8d4,stroke:#82b366,color:#000
    classDef pvc fill:#f5f5f5,stroke:#666,color:#000
    classDef external fill:#f5f5f5,stroke:#666,color:#000,stroke-dasharray:5 5

cloud-svc runs as one process with two listeners:

| Listener | Bind | Reachable from | Endpoints |
| --- | --- | --- | --- |
| Provisioning | automc-net:9091 (no PublishPort) | discord-bot via service-net DNS | POST /admin/users only |
| Operator | 127.0.0.1:9092 | john's loopback (SSH session) | GET/DELETE /admin/users, POST /admin/users/{id}/prune, GET /admin/users/{id}/quota, etc. |

The split means a compromised discord-bot can mint new accounts but cannot enumerate, prune, or revoke existing ones. Operator-only ops require shell access on john.

Auth model:

  • Provisioning listener: per-caller tokens. cloud-svc reads CLOUD_PROVISIONING_TOKENS_BOT, CLOUD_PROVISIONING_TOKENS_<NAME> env vars. Header Authorization: Bearer <token>. Logs include matched caller name for audit attribution.
  • Operator listener: no auth — loopback bind is the boundary, same pattern as server-manager:127.0.0.1:8080
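The per-caller token lookup on the provisioning listener can be sketched as follows. cloud-svc itself is a Go service, so this Python version only illustrates the logic; the function names `load_caller_tokens` and `match_caller`, and the choice to lowercase the env-var suffix into the caller name, are assumptions made for the sketch.

```python
import hmac
import os
from typing import Mapping, Optional

PREFIX = "CLOUD_PROVISIONING_TOKENS_"


def load_caller_tokens(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Map caller name (lowercased env-var suffix, an assumption) to its token."""
    return {
        name[len(PREFIX):].lower(): value
        for name, value in env.items()
        if name.startswith(PREFIX) and value
    }


def match_caller(authorization: Optional[str],
                 tokens: Mapping[str, str]) -> Optional[str]:
    """Return the caller name for an 'Authorization: Bearer <token>' value, or None."""
    if not authorization:
        return None
    scheme, _, presented = authorization.partition(" ")
    if scheme.lower() != "bearer" or not presented:
        return None
    matched = None
    for caller, token in tokens.items():
        # compare_digest keeps the comparison time independent of where
        # the first mismatching byte is.
        if hmac.compare_digest(presented.encode(), token.encode()):
            matched = caller
    return matched
```

The returned caller name is what would go into the audit log line; a `None` result maps to a 401.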

Auth & identity

| Element | Value |
| --- | --- |
| User identity | Discord ID (immutable, from discord-bot's existing account-card flow) |
| User credential | One password per user. HTTP basic auth ONLY, bcrypt'd in /etc/restic-users htpasswd file. Restic repos use --insecure-no-password, so this password does NOT also encrypt blobs. |
| URL pattern | rest:https://cloud.tm.center/<discord_id>/ |
| Server isolation | --private-repos enforces URL path matches authenticated user |

discord-bot's /register flow extends to call POST cloud-svc:9091/admin/users with the player's Discord ID. cloud-svc mints a random password, htpasswd -B-adds it to the file, runs restic init --insecure-no-password, and returns the password. discord-bot DMs it to the player. discord-bot itself never touches restic or htpasswd directly.
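A minimal sketch of the provisioning step, written in Python for illustration even though cloud-svc is Go. It only builds the two command lines and mints the password; it does not run anything. Assumptions not stated in this document: the password is fed to `htpasswd` on stdin via `-i` so it never appears in argv, `restic init` is run against the repo's filesystem path, and the helper names are hypothetical.

```python
import secrets
from pathlib import PurePosixPath

HTPASSWD_FILE = PurePosixPath("/etc/restic-users")  # path from the topology diagram
DATA_ROOT = PurePosixPath("/srv/cloud-data")        # path from the topology diagram


def mint_password(nbytes: int = 24) -> str:
    """Random URL-safe password; 24 random bytes encode to 32 characters."""
    return secrets.token_urlsafe(nbytes)


def provisioning_commands(discord_id: str) -> list[list[str]]:
    """Build the htpasswd and restic init command lines for one new user."""
    # Discord IDs are numeric snowflakes; rejecting anything else also
    # rules out path tricks such as "../".
    if not (discord_id.isascii() and discord_id.isdigit()):
        raise ValueError("discord_id must be numeric")
    repo = DATA_ROOT / discord_id
    return [
        # -B selects bcrypt; -i reads the password from stdin.
        ["htpasswd", "-B", "-i", str(HTPASSWD_FILE), discord_id],
        ["restic", "-r", str(repo), "--insecure-no-password", "init"],
    ]
```

The caller would pipe `mint_password()` into the first command, run the second, and return the password in the HTTP response for discord-bot to DM.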

Revocation = operator runs automc-setup cloud revoke <discord_id> which hits the loopback admin port. No token store, no scope checks, no auth-service involvement.
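On the htpasswd side, revocation comes down to dropping one line from the file. The sketch below shows that as a pure function over the file contents; the name `remove_htpasswd_user` is hypothetical, and file locking and the atomic rewrite are left out.

```python
def remove_htpasswd_user(contents: str, user: str) -> tuple[str, bool]:
    """Return (new file contents, whether an entry for `user` was removed)."""
    kept: list[str] = []
    removed = False
    for line in contents.splitlines():
        name, sep, _hash = line.partition(":")
        if sep and name == user:
            removed = True  # drop this user's entry
            continue
        kept.append(line)
    text = "\n".join(kept)
    if kept:
        text += "\n"
    return text, removed
```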

On-disk layout (client)

cloud-sync.pyz stores its state under <pack-folder>/.cloud-sync/ — per-instance, hidden by leading dot. Auto-excluded from cloud sync so a player can't accidentally upload their own credentials.

<pack-folder>/
  mods/                              # managed by packwiz
  config/                            # mixed: pack-shipped + player-modified
  options.txt
  journeymap/data/
  .cloud-sync/                       # cloud-sync owns this
    token                            # "discord_id:password" (mode 0600)
    scope.json                       # per-distribution include/exclude rules
    restic-<version>                 # auto-downloaded binary
    state.json                       # last-pull snapshot ID, last-push time
    logs/                            # rolling, capped ~5 MB

Per-instance isolation matters: a player running a cracked instance + a premium instance gets two separate .cloud-sync/ dirs with different Discord credentials. rm -rf .cloud-sync/ resets one instance entirely.
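Reading and writing the `token` file from the layout above could look like the sketch below. The function names are hypothetical; the mode-0600 handling assumes a POSIX host, since Windows largely ignores these permission bits.

```python
import os
from pathlib import Path


def write_token(pack_folder: Path, discord_id: str, password: str) -> Path:
    """Store "discord_id:password" in <pack-folder>/.cloud-sync/token."""
    state_dir = pack_folder / ".cloud-sync"
    state_dir.mkdir(parents=True, exist_ok=True)
    token_path = state_dir / "token"
    # Passing 0o600 at creation time avoids a window where the file is
    # world-readable. The mode only applies when the file is newly created.
    fd = os.open(token_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        fh.write(f"{discord_id}:{password}\n")
    return token_path


def read_token(pack_folder: Path) -> tuple[str, str]:
    """Return (discord_id, password) from the token file."""
    token_path = pack_folder / ".cloud-sync" / "token"
    line = token_path.read_text(encoding="utf-8").strip()
    # Split on the first colon only, so passwords may contain ':'.
    discord_id, sep, password = line.partition(":")
    if not sep or not discord_id or not password:
        raise ValueError("token file must contain discord_id:password on one line")
    return discord_id, password
```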

Restic binary discovery

Probed in order:

  1. <pack-folder>/.cloud-sync/restic-<version> — pinned copy from first run
  2. $PATH (which restic + version match) — honor existing system install
  3. Download from github.com/restic/restic/releases/download/v<version>/restic_<version>_<os>_<arch>.bz2 (upstream ships the Windows builds as .zip instead of .bz2), cache to <pack-folder>/.cloud-sync/

--restic-binary <path> flag overrides discovery for air-gapped operators.
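The probe order above can be sketched as below. The function names, the injectable `which` / `probe` parameters (there to keep the sketch testable), and the parsing of `restic version` output as a line starting with `restic <version>` are assumptions; the download step itself is left to the caller.

```python
import re
import shutil
import subprocess
from pathlib import Path
from typing import Callable, Optional


def release_url(version: str, os_name: str, arch: str) -> str:
    """Upstream release asset URL; Windows assets are .zip, the rest .bz2."""
    ext = "zip" if os_name == "windows" else "bz2"
    return ("https://github.com/restic/restic/releases/download/"
            f"v{version}/restic_{version}_{os_name}_{arch}.{ext}")


def parse_version(output: str) -> Optional[str]:
    """Extract '0.17.3' from output like 'restic 0.17.3 compiled with ...'."""
    match = re.match(r"restic (\S+)", output)
    return match.group(1) if match else None


def probe_version(binary: Path) -> Optional[str]:
    """Run '<binary> version'; return None if it cannot be run or parsed."""
    try:
        done = subprocess.run([str(binary), "version"], capture_output=True,
                              text=True, timeout=10, check=True)
    except (OSError, subprocess.SubprocessError):
        return None
    return parse_version(done.stdout)


def find_restic(pack_folder: Path, version: str,
                override: Optional[Path] = None,
                which: Callable[[str], Optional[str]] = shutil.which,
                probe: Callable[[Path], Optional[str]] = probe_version,
                ) -> Optional[Path]:
    """Return a usable restic binary, or None if the caller must download."""
    if override is not None:             # --restic-binary <path>
        return override
    pinned = pack_folder / ".cloud-sync" / f"restic-{version}"
    if pinned.is_file():                 # 1. pinned copy from first run
        return pinned
    on_path = which("restic")
    if on_path and probe(Path(on_path)) == version:
        return Path(on_path)             # 2. matching system install
    return None                          # 3. download via release_url(...)
```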

Zipapp placement

Stateless. Lives wherever the operator put it. Prism / MMC config references absolute path. One pyz can serve N instances; each gets its own .cloud-sync/ underneath its own --pack-folder.

Client flow

cloud-sync.pyz pull

1. Load creds from <pack-folder>/.cloud-sync/token  (format: discord_id:password on one line)
2. Locate or auto-download restic binary into <pack-folder>/.cloud-sync/restic-<version>
3. restic -r rest:https://<url>/<discord_id>/ --insecure-no-password snapshots --latest 1 --json
4. If no snapshots → exit 0 (nothing has been pushed for this user yet, so nothing to restore)
5. restic -r rest:https://<url>/<discord_id>/ --insecure-no-password restore latest --target <pack-folder> --include-file cloud-scope.txt
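The pull steps can be sketched as command builders plus a check on the snapshot listing. This is a sketch under assumptions: the helper names are hypothetical, the HTTP basic credentials are passed through restic's `RESTIC_REST_USERNAME` / `RESTIC_REST_PASSWORD` environment variables rather than embedded in the URL, and the restore step uses `--include-file`, which is the flag restic's `restore` command provides for reading include patterns from a file (restic 0.17+, the same release line that introduced `--insecure-no-password`).

```python
import json
from typing import Mapping


def rest_repo_url(base_url: str, discord_id: str) -> str:
    """Build rest:https://<host>/<discord_id>/ from the server base URL."""
    return f"rest:{base_url.rstrip('/')}/{discord_id}/"


def rest_env(base_env: Mapping[str, str], discord_id: str,
             password: str) -> dict[str, str]:
    """Copy of the environment with the REST backend credentials set."""
    env = dict(base_env)
    env["RESTIC_REST_USERNAME"] = discord_id
    env["RESTIC_REST_PASSWORD"] = password
    return env


def snapshots_cmd(restic: str, repo: str) -> list[str]:
    return [restic, "-r", repo, "--insecure-no-password",
            "snapshots", "--latest", "1", "--json"]


def restore_cmd(restic: str, repo: str, pack_folder: str,
                include_file: str) -> list[str]:
    return [restic, "-r", repo, "--insecure-no-password",
            "restore", "latest", "--target", pack_folder,
            "--include-file", include_file]


def has_snapshots(snapshots_json: str) -> bool:
    """True if the JSON listing is a non-empty array ('[]' and 'null' are empty)."""
    return bool(json.loads(snapshots_json))
```

A pull would run `snapshots_cmd(...)` with `env=rest_env(...)`, exit 0 when `has_snapshots` is false, and otherwise run `restore_cmd(...)`.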

cloud-sync.pyz push

1. Same creds + restic locator as pull
2. restic --insecure-no-password backup --files-from cloud-scope.txt --exclude-file cloud-exclude.txt, run with <pack-folder> as the working directory. <pack-folder> is not also passed as a positional target: restic backs up positional targets in addition to the --files-from entries, which would pull in the whole folder.
3. No client-side restic forget: retention runs server-side (see Retention policy)

Clients never run forget --prune. restic-rest-server in --append-only mode rejects the delete requests that forget and prune issue, and we do not relax that for clients. Pruning happens server-side via a nightly job that runs restic forget --prune directly against each repo's filesystem path with --insecure-no-password, so no operator password is involved. Clients can only add, never remove.

cloud-scope.json → restic args

| Input | Becomes |
| --- | --- |
| include: ["options.txt", "config/", "journeymap/data/"] | Listed in cloud-scope.txt, passed as --files-from cloud-scope.txt |
| exclude: ["config/simple-mod-sync*", "**/*.log"] | Listed in cloud-exclude.txt, passed as --exclude-file cloud-exclude.txt (restic backup has no --exclude-from flag) |
| max_size_mb_per_file: 50 | restic doesn't have a per-file size cap; we filter during scope generation |
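One way to do the scope generation, including the per-file size filter, is sketched below. It assumes the scope is a dict with the three keys shown in the table and that include entries are expanded to individual files so the size cap can be applied per file; the function name `build_scope_lists` is hypothetical.

```python
from pathlib import Path


def build_scope_lists(pack_folder: Path, scope: dict) -> tuple[list[str], list[str]]:
    """Return (lines for cloud-scope.txt, lines for cloud-exclude.txt)."""
    limit_mb = scope.get("max_size_mb_per_file")
    max_bytes = None if limit_mb is None else int(limit_mb * 1024 * 1024)

    files: list[str] = []
    for entry in scope.get("include", []):
        root = pack_folder / entry
        if root.is_file():
            candidates = [root]
        elif root.is_dir():
            candidates = sorted(p for p in root.rglob("*") if p.is_file())
        else:
            candidates = []  # include entry not present in this instance
        for path in candidates:
            # restic has no per-file size cap, so oversized files are
            # dropped here, before restic ever sees the list.
            if max_bytes is not None and path.stat().st_size > max_bytes:
                continue
            # Paths are relative to the pack folder, with forward slashes.
            files.append(path.relative_to(pack_folder).as_posix())

    return files, list(scope.get("exclude", []))
```

Because restic treats `--files-from` lines as patterns, file names containing glob characters may need `--files-from-verbatim` instead; which of the two the client uses is not decided in this document.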

Retention policy

Server-side cron (e.g., daily at 04:00 UTC) walks all per-user repos:

for repo in /srv/cloud-data/*/; do
    user=$(basename "$repo")
    restic -r "$repo" --insecure-no-password \
        forget --keep-last=20 --keep-daily=7 --keep-weekly=4 --keep-monthly=6 --prune
done

Operator-side prune talks to restic directly on the filesystem (-r /srv/cloud-data/<user>/), bypassing the rest-server's --append-only enforcement. No HTTP, no password. Works because:

  • Repos use --insecure-no-password (no encryption key to coordinate)
  • The operator owns the on-disk files anyway
  • --append-only is a rest-server HTTP-layer policy; the filesystem doesn't care
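For comparison with the shell loop, the same walk can be sketched in Python as a function that only builds the command lines. The name `prune_commands` is hypothetical, and skipping directories without a top-level `config` file (which every initialized restic repo has) is an added safeguard, not something this design requires.

```python
from pathlib import Path

KEEP_FLAGS = ["--keep-last=20", "--keep-daily=7",
              "--keep-weekly=4", "--keep-monthly=6"]


def prune_commands(data_root: Path) -> list[list[str]]:
    """One 'restic forget --prune' command per initialized repo under data_root."""
    commands = []
    for repo in sorted(p for p in data_root.iterdir() if p.is_dir()):
        if not (repo / "config").is_file():
            continue  # not an initialized restic repo; leave it alone
        commands.append(["restic", "-r", str(repo), "--insecure-no-password",
                         "forget", *KEEP_FLAGS, "--prune"])
    return commands
```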

Previous drafts of this doc proposed a multi-key restic setup (one key per user + one operator-master key) to enable HTTP-mode prune. That's no longer needed.

What's in v1

  • restic-rest-server with --private-repos --append-only --htpasswd-file
  • discord-bot /register extension: calls cloud-svc, which mints the password, adds the htpasswd entry, and runs restic init --insecure-no-password
  • cloud-sync.pyz that subprocesses restic for pull/push (--insecure-no-password on every call)
  • Auto-download restic binary on first run from upstream GitHub release
  • Server-side nightly prune via separate systemd timer on john, running restic forget --prune directly on filesystem paths (bypassing the rest-server's HTTP-layer append-only enforcement)

What's deferred

  • restic version pinning / auto-update of the binary (treat like packwiz-installer self-update)
  • Server-side restic check cron for repo integrity
  • Per-user quota at the rest-server level (rest-server supports --max-size per-user via .maxsize file in each repo)
  • Operator UI for "this player has 25 GB of cloud data, what's in it?"
  • Cross-machine sync UX (you can play on PC A then PC B; latest snapshot wins. No conflict UI because restic doesn't merge — restore-latest is destructive by design.)

cloud-svc — reshape, not delete

cloud-svc gets a new purpose: control plane for the restic backend. Throw away:

  • Manifest types + validation (manifest.go)
  • Blob storage + tarball extraction (storage.go body)
  • Player-facing /v1/* endpoints (server.go body)
  • Snapshot ID generation, content hash cross-check

Keep:

  • Project skeleton (go.mod, Dockerfile, Makefile, CI)
  • Auth-cache pattern from auth.go (reused for provisioning token verification)
  • Per-user mutex pattern from storage.go (still needed to serialize concurrent provisioning calls)
  • Config loader from config.go (adds new vars)

New code:

  • Two http.Server instances, one per listener
  • htpasswd writer that respects bcrypt + file locking
  • restic CLI subprocess wrapper (init repo, prune); no key management, since repos have no password
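  • Handler for the on-demand POST /admin/users/{id}/prune endpoint (the nightly prune runs from a separate systemd timer on john, not an in-process time.Ticker)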

Estimate: ~300 LOC kept, ~600 LOC new. Net smaller than current cloud-svc.

Also delete cloud_pull / cloud_push from frazclient/client.py (these get obsoleted by import cloud_sync calls; frazclient depends on the same package).

Topology consequences for automc/docs/network-exposure.md

| Layer | Bind | Public? |
| --- | --- | --- |
| restic-ao (data plane) | 127.0.0.1:8002 | Via reverse proxy at cloud.tm.center:443 |
| cloud-svc provisioning listener | automc-net:9091 (no PublishPort) | No |
| cloud-svc admin listener | 127.0.0.1:9092 | No |

Only one public HTTPS endpoint changes from the original plan: it now fronts restic-ao instead of cloud-svc. Same reverse-proxy hardening checklist applies. Threat surface differences:

| Old (cloud-svc as data path) | New (restic-ao as data path) |
| --- | --- |
| Bearer token via auth-service /auth/verify-key | HTTP Basic via htpasswd in restic-rest-server |
| Custom Go service, 33 tests | Upstream restic-rest-server, well-audited |
| Player-facing endpoints | None: cloud-svc is not public |

Operator endpoints are loopback-only and require SSH access to john to reach. No new public surface from the control plane.

Repo layout post-pivot

| Repo | Purpose |
| --- | --- |
| Timemachine/cloud-sync (this) | Python 3.10+ package + zipapp that subprocesses restic |
| Timemachine/cloud-svc | Reshaped: control plane only. Two-port Go service for provisioning + operator ops. NOT archived. |
| Timemachine/discord-bot | Extended /register flow calls cloud-svc to provision; DMs returned password |
| Timemachine/automc | Setup wizard adds automc-setup cloud {list,prune,revoke,quota} subcommands hitting cloud-svc's loopback admin port. Quadlet templates for both restic-ao (new flags) and cloud-svc (two listeners). database/schema.sql unchanged. |

Pre-implementation checklist

All locked 2026-06-02:

  • cloud-svc reshapes to control plane, not archived
  • Two-port split — automc-net for provisioning, loopback for operator
  • Repo encryption disabled (--insecure-no-password). Per-user password covers HTTP basic auth ONLY. Defense-in-depth via repo encryption was dropped for the homelab scope; protection delegated to LUKS on disk + TLS at proxy + append-only at rest-server. Cuts provisioning from 3 restic ops to 1, removes the two-password coordination problem.
  • Server-side prune over filesystem path (-r /srv/cloud-data/<user>/). Bypasses rest-server's HTTP-layer --append-only. No multi-key dance needed.
  • cloud-sync.pyz auto-downloads restic binary. Matches packwiz-installer-bootstrap pattern. First run hits https://github.com/restic/restic/releases for the matching platform binary, caches under <pack-folder>/.cloud-sync/restic-<version>. SHA256 verified against the release's SHA256SUMS file. --no-download flag for air-gapped operators.
  • Nightly prune at 04:00 UTC via separate systemd timer on john (not embedded in cloud-svc), for fault isolation — prune crash doesn't take down provisioning. The timer also runs restic copy to the homelab primary before pruning (per-user repos, mirroring john's layout).
  • Per-caller tokens, NOT shared. cloud-svc reads CLOUD_PROVISIONING_TOKENS_BOT, CLOUD_PROVISIONING_TOKENS_<OTHER> env vars — one per known caller. Logs include the matched caller name so audit trails show which service made each call. Adding a future caller (e.g., a portal) means a new env var, not a token rotation.
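The SHA256 check from the auto-download item above can be sketched as follows, assuming the SHA256SUMS file uses the usual `<hex digest>  <file name>` layout produced by `sha256sum`. The function names are hypothetical; fetching the two files is left out.

```python
import hashlib
import hmac
from typing import Optional


def expected_sha256(sums_text: str, asset_name: str) -> Optional[str]:
    """Look up the hex digest listed for asset_name in a SHA256SUMS file."""
    for line in sums_text.splitlines():
        parts = line.split()
        # sha256sum marks binary-mode entries with a leading '*' on the name.
        if len(parts) == 2 and parts[1].lstrip("*") == asset_name:
            return parts[0].lower()
    return None


def verify_asset(data: bytes, sums_text: str, asset_name: str) -> bool:
    """True only if the asset is listed and its SHA-256 digest matches."""
    expected = expected_sha256(sums_text, asset_name)
    if expected is None:
        return False  # an unlisted asset is treated as a failed check
    actual = hashlib.sha256(data).hexdigest()
    return hmac.compare_digest(actual, expected)
```

The client would run this on the downloaded archive before decompressing it and refuse to cache the binary on a `False` result.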