01. Three surfaces, one product
HMM Trade has three deployment surfaces working together. Two we operate; the third runs on the user's machine and is the free tier's primary surface. Each surface has a clear boundary so a failure in one doesn't cascade into the others.
- Cloud control plane — Next.js app + Supabase Postgres + Stripe. Handles signup, billing, broker connections, bot lifecycle endpoints.
- Per-bot Fly Machines — a lightweight VM per paid bot, running the same Python trader code as the local agent.
- Local agent — Streamlit dashboard + multi-bot supervisor that runs on the user's laptop. Free tier's primary surface; paid users can run it side-by-side for richer visualization.
02. Cloud control plane (this app)
03. Per-bot Fly Machines (paid tiers)
Every hosted bot runs as its own lightweight VM on Fly Machines. The image is registry.fly.io/regime-trader-bot:latest (Python 3.12, ~250 MB); each machine gets 1 GB RAM and scales to zero between ticks for stock-only bots.
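The machine spec above could be expressed as a Fly Machines API create-request body. Field names follow the Machines API shape; the `name` and `env` values are illustrative:

```json
{
  "name": "bot-<id>",
  "config": {
    "image": "registry.fly.io/regime-trader-bot:latest",
    "guest": { "cpu_kind": "shared", "cpus": 1, "memory_mb": 1024 },
    "env": { "BOT_ID": "<id>" },
    "restart": { "policy": "on-failure" }
  }
}
```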
The container does:
- Boots and reads its `BOT_ID` + machine JWT.
- Calls `GET /api/v1/internal/bots/<id>` to fetch `profile_json` + connection metadata.
- Calls `POST /api/v1/internal/broker_token`, which decrypts the broker creds via KMS and returns a short-lived access token.
- Materializes `config/instances/<id>.yaml` + `.env` on disk so the existing `main.py paper` startup path reads them like a local install.
- Spawns the trader subprocess. Output is teed to Fly logs + the cloud audit sink, so /bots/<id> mirrors what `fly logs` shows.
- On SIGTERM (Fly maintenance, Stripe cancel, manual stop): forward to the trader subprocess, flush audit, exit.
04. Local agent (free tier)
The Python repo (regime-trader) ships a Streamlit dashboard + multi-bot supervisor that runs entirely on the user's machine. Free-tier users keep their broker keys local; the agent only talks to the cloud for auth + bot-profile sync.
Paid users can also install the local agent for richer visualization (walk-forward backtests, audit pipeline viewer, HMM live-detection charts) — both surfaces read the same cloud bot list, so changes in one show up in the other.
05. KMS envelope encryption
Broker credentials are stored in Postgres but encrypted with a per-payload AES-256 data key. The data key itself is encrypted under a Cloud KMS key — neither the database nor our application code can decrypt without a live KMS round-trip. Compromising the DB alone doesn't leak tokens.
Encrypt path (broker connect):
- Generate a random AES-256 data key in process memory.
- Call
KMS.Encrypton the data key → ciphertext data key. - AES-GCM encrypt the credential JSON with the plaintext data key → ciphertext + IV + auth tag.
- Persist (ciphertext, ciphertext data key, IV, auth tag, KMS key id) on
broker_connections. Zero the plaintext data key.
Decrypt path (bot launch):
- Read the ciphertext + ciphertext data key from `broker_connections`.
- Call `KMS.Decrypt` on the ciphertext data key → plaintext data key.
- AES-GCM decrypt the credential JSON and pass it to the bot over TLS. Plaintext lives in API-process memory only for the duration of one HTTPS request.
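Both paths can be sketched end-to-end. `AESGCM` is the real primitive from the `cryptography` package; the KMS round-trip is stubbed with an in-process master key so the sketch runs standalone (production wraps the data key via Cloud KMS instead):

```python
import json
import os

from cryptography.hazmat.primitives.ciphers.aead import AESGCM

# --- Stub for Cloud KMS: wraps/unwraps data keys under a master key
#     that, in production, never leaves the KMS. ---
_KMS_MASTER = AESGCM(AESGCM.generate_key(bit_length=256))


def kms_encrypt(data_key: bytes) -> bytes:
    nonce = os.urandom(12)
    return nonce + _KMS_MASTER.encrypt(nonce, data_key, None)


def kms_decrypt(wrapped: bytes) -> bytes:
    return _KMS_MASTER.decrypt(wrapped[:12], wrapped[12:], None)


def encrypt_credentials(creds: dict) -> dict:
    """Encrypt path: random data key -> KMS-wrap -> AES-GCM payload."""
    data_key = AESGCM.generate_key(bit_length=256)  # plaintext key in memory only
    iv = os.urandom(12)
    # AESGCM.encrypt appends the auth tag to the ciphertext, so the
    # "auth tag" column above is stored inside `ciphertext` here.
    ct = AESGCM(data_key).encrypt(iv, json.dumps(creds).encode(), None)
    row = {
        "ciphertext": ct,
        "wrapped_data_key": kms_encrypt(data_key),
        "iv": iv,
        "kms_key_id": "projects/.../cryptoKeys/broker-creds",  # illustrative
    }
    del data_key  # drop the plaintext data key reference
    return row


def decrypt_credentials(row: dict) -> dict:
    """Decrypt path: KMS-unwrap -> AES-GCM decrypt, plaintext transient."""
    data_key = kms_decrypt(row["wrapped_data_key"])
    return json.loads(AESGCM(data_key).decrypt(row["iv"], row["ciphertext"], None))
```

Note the property the section claims: without `kms_decrypt` (the live KMS round-trip), the database row alone is undecryptable.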
06. Failure modes by tier
- Free + cloud down — agent keeps trading on cached JWT for 24h, then asks user to reconnect.
- Free + user laptop off — bot stops. Free tier doesn't promise 24/7.
- Paid + control plane down — hosted bots keep ticking (Fly Machines don't depend on our control plane to tick). User can't change settings or see the dashboard.
- Paid + Fly outage — our incident. Status page, refund per ToS.
- Paid + KMS down — bots already running keep ticking (token decrypted in memory). New bot launches fail until KMS recovers. Bounded blast radius.