claude-timemachine 698a7a037c
design: pivot to restic-rest-server as the backend
cloud-svc was a worse re-implementation of what restic-rest-server
already does (--private-repos + --append-only + native retention +
chunk-level dedup). Pivoting before either ships in production.

cloud-sync.jar becomes a restic CLI wrapper. ~200 LOC instead of
~2000+ in the custom-server path. Server-side prune via operator
master password (option 1 — multi-key per repo).

Open questions flagged at end of doc for confirmation.
2026-06-02 20:44:48 +02:00

cloud-sync — design

Per-Discord-user state sync for Minecraft. Pulls on launch, pushes on exit. Single JAR drops into Prism / MMC / ATLauncher / frazclient as a pre-launch + post-exit hook.

Backend: restic-rest-server with --private-repos --append-only. No custom server code. cloud-sync.jar is a restic CLI wrapper.

Why this shape

| Concern | How restic solves it |
| --- | --- |
| Snapshot semantics | Native: every restic backup is a snapshot |
| Deduplication | Chunk-level (not just file-level), built in |
| Retention policy | restic forget --keep-last/--keep-daily/--keep-weekly/--keep-monthly |
| Append-only enforcement | restic-rest-server --append-only: even with a valid password, clients can't delete repository data |
| Per-user isolation | --private-repos: the URL path must start with the authenticated username |
| Encryption at rest | Per-repo password, built in |
| Multi-machine support | restic tags + hostname; free if we ever want it |

cloud-svc as I'd been building it was a worse re-implementation of all the above. Pivoting before it ships.

Topology

flowchart LR
    pl["player PC"]:::external
    jar["cloud-sync.jar<br/>(in launcher's<br/>pre/post hooks)"]:::deploy
    restic["restic binary<br/>(auto-downloaded<br/>on first run)"]:::deploy

    subgraph john["john (192.168.65.33)"]
        rp{{"reverse proxy<br/>:443"}}:::deploy
        ao{{"restic-rest-server<br/>--private-repos<br/>--append-only<br/>:8002"}}:::deploy
        store[/"/srv/cloud-data<br/>/<discord_id>/..."/]:::pvc
        bot{{"discord-bot"}}:::deploy
        htp[/"/etc/restic-users<br/>htpasswd"/]:::pvc
    end

    pl --> jar --> restic
    restic ==>|"rest:https<br/><discord_id>:<password>"| rp
    rp -->|"loopback"| ao
    ao -->|"reads"| htp
    ao -->|"writes"| store
    bot -.->|"on /register:<br/>htpasswd -B add"| htp
    bot -.->|"DM password"| pl

    classDef deploy fill:#d5e8d4,stroke:#82b366,color:#000
    classDef pvc fill:#f5f5f5,stroke:#666,color:#000
    classDef external fill:#f5f5f5,stroke:#666,color:#000,stroke-dasharray:5 5

Auth & identity

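| Element | Value |
| --- | --- |
| User identity | Discord ID (immutable, from discord-bot's existing account-card flow) |
| User credential | One random password per user: it is the HTTP basic-auth secret (stored bcrypt-hashed in the /etc/restic-users htpasswd file) and also the password for the user's restic repo key |
| URL pattern | rest:https://cloud.tm.center/<discord_id>/ |
| Server isolation | --private-repos enforces that the URL path starts with the authenticated username |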

discord-bot's /register flow is extended to mint a random password, add it to the htpasswd file with htpasswd -B (bcrypt), and DM the password to the player. The existing flow stays untouched for non-cloud cases.

Revocation = htpasswd -D removes the user. No token store, no scope checks, no auth-service involvement.
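
A minimal sketch of the credential half of /register and of revocation, in Python for brevity. The function names, the 32-byte password length and the choice of htpasswd -i (read the password from stdin, so it never shows up in the process list) are assumptions, not part of the existing bot.

```python
import secrets
import subprocess

HTPASSWD_FILE = "/etc/restic-users"  # path taken from this design


def mint_password(nbytes: int = 32) -> str:
    """Random URL-safe password; the length is an assumption.

    token_urlsafe never emits ':', so the discord_id:password line
    format of .cloud-token stays unambiguous.
    """
    return secrets.token_urlsafe(nbytes)


def htpasswd_add_cmd(discord_id: str) -> list[str]:
    # -B = bcrypt hash, -i = read the password from stdin
    return ["htpasswd", "-B", "-i", HTPASSWD_FILE, discord_id]


def htpasswd_revoke_cmd(discord_id: str) -> list[str]:
    # -D = delete the user from the file
    return ["htpasswd", "-D", HTPASSWD_FILE, discord_id]


def register(discord_id: str) -> str:
    """Create the htpasswd entry and return the password to DM."""
    if not (discord_id.isascii() and discord_id.isdigit()):
        raise ValueError("expected a numeric Discord ID")
    password = mint_password()
    subprocess.run(
        htpasswd_add_cmd(discord_id),
        input=password + "\n",
        text=True,
        check=True,
    )
    return password
```

Rejecting non-numeric IDs up front also keeps path separators and htpasswd's ':' delimiter out of the username.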

Client flow

cloud-sync.jar pull

1. Load creds from <pack-folder>/.cloud-token  (format: discord_id:password on one line)
2. Locate or auto-download restic binary into <jar dir>/restic-<version>/
3. restic -r rest:https://<url>/<discord_id>/ snapshots --latest 1 --json
4. If no snapshots → exit 0 (first run on this machine, nothing to restore)
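5. restic -r rest:https://<url>/<discord_id>/ restore latest --target <pack-folder> --include-file cloud-scope.txt  (restore has no --include-from; --include-file needs restic 0.17 or newer)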
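
The pull steps can be sketched as below. This is Python for brevity (the real wrapper is a Kotlin JAR), and every function name is hypothetical. It assumes credentials are handed to restic through its RESTIC_REPOSITORY, RESTIC_PASSWORD, RESTIC_REST_USERNAME and RESTIC_REST_PASSWORD environment variables rather than on the command line, and that the restore step uses restic's --include-file flag (restic 0.17+).

```python
import json
from pathlib import Path

BASE_URL = "https://cloud.tm.center"  # host from the URL pattern above


def load_creds(pack_folder: Path) -> tuple[str, str]:
    """Parse <pack-folder>/.cloud-token (discord_id:password on one line)."""
    line = (pack_folder / ".cloud-token").read_text(encoding="utf-8").strip()
    discord_id, sep, password = line.partition(":")
    if not sep or not discord_id or not password:
        raise ValueError(".cloud-token must contain discord_id:password")
    return discord_id, password


def restic_env(discord_id: str, password: str) -> dict[str, str]:
    """Environment for the restic subprocess; keeps secrets off argv."""
    return {
        "RESTIC_REPOSITORY": f"rest:{BASE_URL}/{discord_id}/",
        "RESTIC_PASSWORD": password,        # repo key password
        "RESTIC_REST_USERNAME": discord_id,  # HTTP basic auth
        "RESTIC_REST_PASSWORD": password,
    }


def snapshots_cmd(restic: str = "restic") -> list[str]:
    return [restic, "snapshots", "--latest", "1", "--json"]


def has_snapshots(json_output: str) -> bool:
    """True if `restic snapshots --json` listed at least one snapshot."""
    return bool(json.loads(json_output or "null"))


def restore_cmd(pack_folder: Path, restic: str = "restic") -> list[str]:
    return [
        restic, "restore", "latest",
        "--target", str(pack_folder),
        "--include-file", "cloud-scope.txt",
    ]
```

A caller would run snapshots_cmd() with subprocess.run(..., env={**os.environ, **restic_env(...)}, capture_output=True), exit 0 when has_snapshots() is false, and otherwise run restore_cmd().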

cloud-sync.jar push

1. Same creds + restic locator as pull
2. With <pack-folder> as the working directory: restic -r rest:https://<url>/<discord_id>/ backup --files-from cloud-scope.txt --exclude-file cloud-exclude.txt
   (Passing <pack-folder> as a positional argument as well would back up the whole folder, not just the scoped paths. restic's exclude-list flag is --exclude-file, not --exclude-from.)

The client never runs restic forget. In --append-only mode restic-rest-server rejects deletion of repository data, so a client-side forget --prune would fail even with a valid password. Pruning happens server-side instead: a nightly cron runs restic forget --keep-last 20 --keep-daily 7 --keep-weekly 4 --keep-monthly 6 --prune with the operator's full-access password against the repo. Clients can only add, never remove.
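
A sketch of the push side under the same assumptions as the pull flow: Python for brevity, hypothetical function names, credentials passed through restic's environment variables. It runs restic backup from inside the pack folder so the relative paths in cloud-scope.txt resolve, uses restic's --exclude-file flag for the exclude list, and never issues forget.

```python
import os
import subprocess
from pathlib import Path


def backup_cmd(restic: str = "restic") -> list[str]:
    """Backup limited to the scoped paths; no positional folder argument."""
    return [
        restic, "backup",
        "--files-from", "cloud-scope.txt",
        "--exclude-file", "cloud-exclude.txt",
    ]


def describe_exit(code: int) -> str:
    """Map restic backup exit codes to an outcome for the launcher hook."""
    if code == 0:
        return "ok"
    if code == 3:
        # restic created a snapshot but could not read some source files
        return "partial"
    return "failed"


def push(pack_folder: Path, env: dict[str, str], restic: str = "restic") -> str:
    """Run the backup with the pack folder as working directory."""
    result = subprocess.run(
        backup_cmd(restic),
        cwd=pack_folder,
        env={**os.environ, **env},
    )
    return describe_exit(result.returncode)
```

Whether a "partial" result should fail the post-exit hook or only log a warning is left open here.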

cloud-scope.json → restic args

| Input | Becomes |
| --- | --- |
| include: ["options.txt", "config/", "journeymap/data/"] | Listed in cloud-scope.txt, passed as --files-from cloud-scope.txt |
| exclude: ["config/simple-mod-sync*", "**/*.log"] | Listed in cloud-exclude.txt, passed as --exclude-file cloud-exclude.txt |
| max_size_mb_per_file: 50 | Passed as --exclude-larger-than 50M (restic backup's per-file size cap), so no filtering is needed during scope generation |
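
The mapping above could be implemented roughly like this (Python sketch; scope_to_restic is a hypothetical name). It assumes the three keys shown in the table, uses restic's --exclude-file flag for the exclude list, and relies on restic backup's --exclude-larger-than flag for the size cap.

```python
import json


def _lines(items: list[str]) -> str:
    """One entry per line, newline-terminated."""
    return "".join(f"{item}\n" for item in items)


def scope_to_restic(scope_json: str) -> tuple[str, str, list[str]]:
    """Return (cloud-scope.txt body, cloud-exclude.txt body, restic args)."""
    scope = json.loads(scope_json)
    scope_txt = _lines(scope.get("include", []))
    exclude_txt = _lines(scope.get("exclude", []))
    args = [
        "--files-from", "cloud-scope.txt",
        "--exclude-file", "cloud-exclude.txt",
    ]
    max_mb = scope.get("max_size_mb_per_file")
    if max_mb is not None:
        # restic accepts size suffixes such as M for mebibytes
        args += ["--exclude-larger-than", f"{int(max_mb)}M"]
    return scope_txt, exclude_txt, args
```

The caller writes the two bodies into the pack folder before invoking restic backup with the returned arguments.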

Retention policy

Server-side cron (e.g., daily at 04:00 UTC) walks all per-user repos:

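# Prune every per-user repo with the operator key; keep going if one repo fails.
for repo in /srv/cloud-data/*/; do
    user=$(basename "$repo")
    restic -r "$repo" --password-file /etc/restic-master-pass \
        forget --keep-last=20 --keep-daily=7 --keep-weekly=4 --keep-monthly=6 --prune \
        || echo "prune failed for $user" >&2
done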

This requires the operator to have a "master password" that opens any user's repo — restic doesn't have that natively. Options:

  1. Init each user's repo with TWO keys — one for the user, one for the operator-side pruner. restic supports multi-key per repo.
  2. Run the cron with each user's own password — requires storing all user passwords server-side; defeats the encryption.
  3. Don't auto-prune — let users push forever, trust quota at the rest-server level.

Recommendation: option 1 (multi-key per repo). On /register, the bot calls restic -r <repo> --password-file <operator> key add to add the player's password as a SECOND key. The pruner cron uses the operator master password.
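
A sketch of the two-key setup from option 1, expressed as the restic commands the bot would run (Python for brevity; repo_init_cmds and the player_pass_file argument are hypothetical). It assumes the bot runs on the same host as the data directory and uses restic's key add --new-password-file flag so the player's password is read from a file instead of a prompt.

```python
from pathlib import PurePosixPath

DATA_ROOT = PurePosixPath("/srv/cloud-data")      # from the topology
OPERATOR_PASS_FILE = "/etc/restic-master-pass"    # from the prune cron


def repo_init_cmds(discord_id: str, player_pass_file: str) -> list[list[str]]:
    """Commands to create a repo with an operator key plus a player key."""
    repo = str(DATA_ROOT / discord_id)
    base = ["restic", "-r", repo, "--password-file", OPERATOR_PASS_FILE]
    return [
        # 1. The operator password becomes the repo's first key.
        base + ["init"],
        # 2. Unlock with the operator key, then add the player's password
        #    as a second key.
        base + ["key", "add", "--new-password-file", player_pass_file],
    ]
```

The player's password file should be a short-lived, mode-0600 temp file that is deleted once key add returns.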

What's in v1

  • restic-rest-server with --private-repos --append-only --htpasswd-file
  • discord-bot /register extension: mint password, htpasswd add, restic init repo, restic key add player key
  • cloud-sync.jar that subprocesses restic for pull/push
  • Auto-download restic binary on first run from upstream GitHub release
  • Server-side nightly prune cron with operator-side master password key
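
For the auto-download item, a sketch of how the release asset could be chosen and verified (Python for brevity; all function names are hypothetical and the pinned version is only an example). It assumes upstream's asset naming of restic_<version>_<os>_<arch>.bz2 (.zip on Windows) and a SHA256SUMS file published alongside each release.

```python
import hashlib

RESTIC_VERSION = "0.17.3"  # example pin, not a decision made in this doc

_OS = {"Linux": "linux", "Darwin": "darwin", "Windows": "windows"}
_ARCH = {"x86_64": "amd64", "AMD64": "amd64", "arm64": "arm64", "aarch64": "arm64"}


def asset_name(version: str, system: str, machine: str) -> str:
    """Map platform.system()/platform.machine() values to an asset name."""
    try:
        os_name, arch = _OS[system], _ARCH[machine]
    except KeyError as exc:
        raise ValueError(f"unsupported platform: {system}/{machine}") from exc
    ext = "zip" if os_name == "windows" else "bz2"
    return f"restic_{version}_{os_name}_{arch}.{ext}"


def download_url(version: str, asset: str) -> str:
    return f"https://github.com/restic/restic/releases/download/v{version}/{asset}"


def sha256_matches(data: bytes, sums_text: str, asset: str) -> bool:
    """Check downloaded bytes against a sha256sum-style listing."""
    digest = hashlib.sha256(data).hexdigest()
    for line in sums_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].lstrip("*") == asset:
            return parts[0].lower() == digest
    return False  # asset not listed: treat as a failed check
```

Refusing to run a binary whose checksum is missing or wrong matters here because the JAR executes whatever it downloaded on every launch.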

What's deferred

  • restic version pinning / auto-update of the binary (treat like packwiz-installer self-update)
  • Server-side restic check cron for repo integrity
  • Per-user quota at the rest-server level (rest-server has a --max-size flag, but it caps the stored data as a whole; whether it can be applied per user still needs to be verified)
  • Operator UI for "this player has 25 GB of cloud data, what's in it?"
  • Cross-machine sync UX (you can play on PC A then PC B; latest snapshot wins. No conflict UI because restic doesn't merge — restore-latest is destructive by design.)

Migration from cloud-svc

cloud-svc was never deployed. No user data to migrate. Action:

  • Archive Timemachine/cloud-svc repo (mark archived, leave commits + DESIGN.md as a record)
  • Delete cloud_pull / cloud_push from frazclient/client.py
  • Remove automc_cloud_svc.md memory entry, replace with automc_cloud_sync.md pointing here

Topology consequences for automc/docs/network-exposure.md

Same one public endpoint (cloud.tm.center :443), same reverse-proxy hardening checklist, same threat surface. Differences:

| Old (cloud-svc) | New (restic-ao) |
| --- | --- |
| Bearer token via auth-service /auth/verify-key | Basic auth via htpasswd in restic-rest-server |
| Token leak = one user's data | Password leak = one user's data |
| Custom Go service, 33 tests | Upstream restic-rest-server, maintained by the restic project |
| 127.0.0.1:9091 loopback bind | 127.0.0.1:8002 (existing restic-ao quadlet) |
| 60s in-memory cache of verified tokens | rest-server reloads the htpasswd file when it changes |

Net: fewer moving parts, smaller attack surface.

Repo layout post-pivot

| Repo | Purpose |
| --- | --- |
| Timemachine/cloud-sync (this) | Kotlin/Gradle JAR that subprocesses restic |
| Timemachine/cloud-svc | Archived. Snapshot of the abandoned path; commits + DESIGN.md kept as decision record |
| Timemachine/discord-bot | Extended /register flow to mint htpasswd creds + init restic repo |
| Timemachine/automc | Setup wizard renders the restic-ao quadlet with the new flags; database/schema.sql unchanged |

Pre-implementation checklist

  • User reviews this design doc
  • Confirm: server-side prune via operator master password (option 1 above)
  • Confirm: archive cloud-svc rather than delete
  • Confirm: cloud-sync.jar auto-downloads restic binary vs requires it pre-installed
  • Confirm: nightly prune at 04:00 UTC vs after-each-push