External Services

This page lists the external service credentials the platform requires today. The sovereignty roadmap aims to reduce this surface for core infrastructure, especially through native Ceph-backed storage and self-hosted/Anycast DNS.

The CLI automatically delivers credentials to each host during provisioning. Environment variables are assembled from:

  1. CLI-derived infrastructure and service defaults from manifest config (DATABASE_HOST, DATABASE_PORT, KAFKA_BROKERS, CLICKHOUSE_ADDR, CLUSTER_ID, NODE_ID)
  2. Runtime provisioning values such as the shared SERVICE_TOKEN, enrollment tokens, and internal TLS paths
  3. Shared env_files from the manifest’s top-level env_files field
  4. Per-service env_file from each service’s env_file field
  5. Inline config from the service, interface, or observability config map
  6. CLI fill-ins for values that are still unset, such as GeoIP paths and production runtime defaults
  7. CLI-composed values such as DATABASE_URL, COOKIE_DOMAIN, and BRAND_DOMAIN when the operator did not provide them explicitly

The merged result is written to /etc/frameworks/{service}.env on each host.
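
The layering above can be pictured as a last-writer-wins merge followed by fill-ins for anything still unset. The sketch below is a minimal illustration under that assumption: `merge_env` and `fill_unset` are hypothetical helpers, not CLI commands, the file names are throwaway examples, and the real CLI's precedence rules may differ.

```shell
# merge_env FILE...: print KEY=VALUE lines, later files overriding earlier ones.
# Assumption: "last writer wins"; the real CLI may resolve conflicts differently.
merge_env() {
  awk '
    /^[ \t]*(#|$)/ { next }                 # skip comments and blank lines
    {
      eq = index($0, "=")
      if (eq == 0) next                     # ignore lines without KEY=VALUE
      key = substr($0, 1, eq - 1)
      if (!(key in val)) order[++n] = key   # remember first-seen order
      val[key] = substr($0, eq + 1)         # later files overwrite earlier ones
    }
    END { for (i = 1; i <= n; i++) print order[i] "=" val[order[i]] }
  ' "$@"
}

# fill_unset KEY VALUE FILE: append KEY=VALUE only when KEY is absent from FILE.
fill_unset() {
  grep -q "^$1=" "$3" || printf '%s=%s\n' "$1" "$2" >> "$3"
}

# Demo with throwaway files standing in for two layers.
tmp=$(mktemp -d)
printf 'DATABASE_HOST=10.0.0.5\nLOG_LEVEL=info\n' > "$tmp/defaults.env"
printf 'LOG_LEVEL=debug\n' > "$tmp/service.env"
merge_env "$tmp/defaults.env" "$tmp/service.env" > "$tmp/merged.env"
fill_unset DATABASE_URL "postgres://10.0.0.5:5432/app" "$tmp/merged.env"
fill_unset LOG_LEVEL "warn" "$tmp/merged.env"   # no-op: LOG_LEVEL is already set
cat "$tmp/merged.env"
```

In this demo the per-service file overrides `LOG_LEVEL`, while the composed `DATABASE_URL` is added only because no earlier layer set it.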

| Category | Examples | Who Sets It |
| --- | --- | --- |
| Operator-provided | DATABASE_PASSWORD, CLICKHOUSE_PASSWORD, Stripe keys, SMTP | You (in shared env files) |
| Shared platform secrets | JWT_SECRET, SERVICE_TOKEN, FIELD_ENCRYPTION_KEY | You (in shared env files for non-dev deployments) |
| CLI-derived | DATABASE_URL, KAFKA_BROKERS, COOKIE_DOMAIN | CLI (from manifest) |

For dev-profile manifests, the CLI can replace missing or placeholder shared platform secrets with crypto/rand values for that provision run. Non-dev provisioning validates that shared platform secrets are present in your manifest env_files; keep them in an encrypted env file so re-provisioning uses stable values.
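
For non-dev deployments you generate the shared platform secrets yourself, once, and keep them stable. A minimal sketch of doing that from the kernel's random source follows; `gen_secret` is a hypothetical helper, and the 32-byte hex format is an assumption, so check the secret template for the encoding and length each variable actually expects.

```shell
# gen_secret [BYTES]: print BYTES (default 32) random bytes as lowercase hex.
# Assumption: hex-encoded secrets are acceptable; the platform may expect another format.
gen_secret() {
  head -c "${1:-32}" /dev/urandom | od -An -v -tx1 | tr -d ' \n'
  printf '\n'
}

# Print candidate lines for the secrets file; review them before pasting.
for name in JWT_SECRET SERVICE_TOKEN FIELD_ENCRYPTION_KEY; do
  printf '%s=%s\n' "$name" "$(gen_secret 32)"
done
```

Generate the values once and commit them in the encrypted env file; regenerating on every run would rotate secrets that running services depend on.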

During frameworks cluster provision, the CLI also provisions database users:

  • Postgres/YugabyteDB: Creates a per-database owner role for each database listed in the manifest (role name = database owner from the manifest), using DATABASE_PASSWORD. Each role has access only to its own database.
  • ClickHouse: Deploys users.xml with a frameworks user (password from CLICKHOUSE_PASSWORD) and a readonly_user (rate-limited, password from CLICKHOUSE_READONLY_PASSWORD)

This ensures runtime services can connect with the same credentials the CLI provisioned.

Setup:

  1. Create gitops/config/production.env for non-secret operator config
  2. Copy the secret template: cp config/env/secrets.env.example gitops/secrets/production.env
  3. Set at minimum in gitops/secrets/production.env: DATABASE_PASSWORD, CLICKHOUSE_PASSWORD, SERVICE_TOKEN, JWT_SECRET, PASSWORD_RESET_SECRET, FIELD_ENCRYPTION_KEY, and USAGE_HASH_SECRET
  4. Set your public URLs and other non-secret operator inputs in gitops/config/production.env
  5. Reference both files in your cluster manifest:
    env_files:
    - config/production.env
    - secrets/production.env
  6. Run frameworks cluster provision. The CLI creates database users, validates that the shared platform secrets are present (for dev-profile manifests it can generate missing ones for that run), and pushes merged env vars to all hosts
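
Before encrypting and provisioning, it can help to confirm that the minimum keys from step 3 are actually set. The following is a sketch only: `missing_keys` is a hypothetical helper that assumes a plain `KEY=VALUE` file, and the demo values are placeholders.

```shell
# missing_keys FILE KEY...: print each KEY that has no non-empty value in FILE.
missing_keys() {
  file=$1
  shift
  for key in "$@"; do
    grep -Eq "^${key}=.+" "$file" || printf '%s\n' "$key"
  done
}

# Demo against a throwaway file: JWT_SECRET is empty, USAGE_HASH_SECRET is absent.
demo=$(mktemp)
cat > "$demo" <<'EOF'
DATABASE_PASSWORD=example-db-password
CLICKHOUSE_PASSWORD=example-ch-password
SERVICE_TOKEN=example-token
JWT_SECRET=
PASSWORD_RESET_SECRET=example-reset
FIELD_ENCRYPTION_KEY=example-key
EOF
missing=$(missing_keys "$demo" DATABASE_PASSWORD CLICKHOUSE_PASSWORD SERVICE_TOKEN \
  JWT_SECRET PASSWORD_RESET_SECRET FIELD_ENCRYPTION_KEY USAGE_HASH_SECRET)
printf 'missing: %s\n' "$(printf '%s' "$missing" | tr '\n' ' ')"
```

Run the check on the plaintext file: once SOPS has encrypted it, every value is non-empty ciphertext and the check no longer tells you anything.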

The CLI supports SOPS-encrypted env files using age keys. This encrypts secrets at rest in your gitops repo — decryption happens transparently during provisioning.

One-time setup:

# Install tools
brew install sops age # macOS
# or: apt install age && go install github.com/getsops/sops/v3/cmd/sops@latest
# Generate an age keypair (create the target directory first)
mkdir -p ~/.config/sops/age
age-keygen -o ~/.config/sops/age/keys.txt
# Note the public key (age1...)

Create .sops.yaml in your gitops repo root:

creation_rules:
  - path_regex: secrets/.*\.env$
    age: age1yourpublickeyhere
  - path_regex: clusters/.*/hosts\.enc\.yaml$
    age: age1yourpublickeyhere

The second rule encrypts host inventory files (hosts.enc.yaml) that contain server IPs and SSH targets. See hosts_file below.
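
To see which files the two rules would cover, the patterns can be tried against candidate paths. This sketch approximates the matching with `grep -E`; SOPS evaluates `path_regex` with its own regex engine, so treat the result as a sanity check rather than a guarantee. `matches_rule` is a hypothetical helper.

```shell
# matches_rule PATH REGEX: succeed when PATH matches REGEX (POSIX ERE via grep -E).
matches_rule() {
  printf '%s\n' "$1" | grep -Eq -- "$2"
}

for path in secrets/production.env clusters/production/hosts.enc.yaml config/production.env; do
  if matches_rule "$path" 'secrets/.*\.env$' ||
     matches_rule "$path" 'clusters/.*/hosts\.enc\.yaml$'; then
    echo "$path: covered by a creation rule"
  else
    echo "$path: no creation rule matches"
  fi
done
```

Neither pattern is anchored at the start, so a path such as `old-secrets/x.env` would also match the first rule.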

Encrypt your secrets file:

sops -e -i secrets/production.env
git add secrets/production.env .sops.yaml
git commit -m "Encrypt production secrets with SOPS/age"
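
A simple guard against committing a plaintext secrets file is to look for SOPS ciphertext markers before `git add`. This is a heuristic sketch: `is_sops_encrypted` is a hypothetical helper, and it assumes encrypted values carry the `ENC[AES256_GCM,` prefix that SOPS normally writes.

```shell
# is_sops_encrypted FILE: succeed when FILE contains SOPS-style encrypted values.
# Heuristic only: it does not prove that every value in the file is encrypted.
is_sops_encrypted() {
  grep -q 'ENC\[AES256_GCM,' "$1"
}

# Demo with throwaway files standing in for secrets/production.env.
plain=$(mktemp)
sealed=$(mktemp)
printf 'JWT_SECRET=example-plaintext\n' > "$plain"
printf 'JWT_SECRET=ENC[AES256_GCM,data:abc,iv:def,tag:ghi,type:str]\n' > "$sealed"

for f in "$plain" "$sealed"; do
  if is_sops_encrypted "$f"; then
    echo "$f: looks encrypted"
  else
    echo "$f: plaintext, do not commit"
  fi
done
```

The same check fits naturally in a pre-commit hook scoped to the paths listed in .sops.yaml.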

Provisioning decrypts automatically — the CLI detects SOPS-encrypted files and decrypts using the age key at ~/.config/sops/age/keys.txt (or SOPS_AGE_KEY_FILE):

frameworks cluster provision --github-repo org/gitops --cluster production
# Or explicitly specify the key file:
frameworks cluster provision --github-repo org/gitops --cluster production --age-key /path/to/keys.txt
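
The key lookup described above can be sketched as a small resolver. The precedence shown (explicit flag value, then SOPS_AGE_KEY_FILE, then the default path) is an assumption about how the options combine; `resolve_age_key` is a hypothetical helper, not part of the CLI.

```shell
# resolve_age_key [FLAG_VALUE]: print the age key file that would be used.
# Assumed precedence: explicit --age-key value, then SOPS_AGE_KEY_FILE, then the default.
resolve_age_key() {
  if [ -n "${1:-}" ]; then
    printf '%s\n' "$1"
  elif [ -n "${SOPS_AGE_KEY_FILE:-}" ]; then
    printf '%s\n' "$SOPS_AGE_KEY_FILE"
  else
    printf '%s\n' "$HOME/.config/sops/age/keys.txt"
  fi
}

resolve_age_key                      # environment variable or default path
resolve_age_key /path/to/keys.txt    # explicit value wins
```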

Local development with an encrypted config/env/secrets.env:

make decrypt # Decrypt in-place (requires sops CLI)
make env # Generate .env as usual
make encrypt # Re-encrypt before committing

Separating host IPs from architecture (hosts_file)

For public gitops repos, you can keep your cluster architecture in plaintext while encrypting server IPs and SSH targets in a separate SOPS-encrypted file. Set hosts_file in your cluster manifest:

# cluster.yaml: plaintext architecture (safe to publish)
version: v1
type: cluster
env_files:
  - config/production.env
  - secrets/production.env
hosts_file: clusters/production/hosts.enc.yaml
hosts:
  central-eu-1:
    cluster: core
    roles: [infrastructure, services]
    # No external_ip or user here; they come from hosts_file

The hosts file contains connection details:

# clusters/production/hosts.enc.yaml (SOPS-encrypted)
hosts:
  central-eu-1:
    external_ip: "203.0.113.10"
    user: root
edge_nodes:
  edge-eu-1:
    external_ip: "203.0.113.20"
    user: root # optional; defaults to root

Encrypt it: sops -e -i clusters/production/hosts.enc.yaml

During provisioning, the CLI decrypts the hosts file, merges IPs into the manifest, and composes edge SSH targets as user@external_ip. Cluster manifests may still put external_ip inline under hosts:; edge manifests keep connection details in hosts_file or use inline ssh.
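
The SSH target composition is simple enough to sketch directly. `ssh_target` is a hypothetical helper that mirrors the `user@external_ip` shape and the root default described above; the `deploy` user is an illustrative value only.

```shell
# ssh_target EXTERNAL_IP [USER]: print USER@EXTERNAL_IP, defaulting USER to root.
ssh_target() {
  printf '%s@%s\n' "${2:-root}" "$1"
}

ssh_target 203.0.113.20          # prints root@203.0.113.20
ssh_target 203.0.113.20 deploy   # prints deploy@203.0.113.20
```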


Purpose: Domain management, geographic routing, load balancing

Providers:

  • Cloudflare — required. Authoritative for the root domain and root service records (api, app, control-plane hostnames).
  • Bunny DNS — optional, highly recommended for any deployment with more than a handful of edges. Owns cluster-scoped media and edge zones ({cluster_slug}.{base}, edge node records, the tenant alias cdn.{base} zone) when BUNNY_API_KEY is set. Recommended because Cloudflare’s geo-load-balancing pools cap at 20 origins per pool — too small for an edge fleet that scales horizontally. Bunny’s smart records handle arbitrary geo-routed origin sets. Deployments with fewer than ~20 edges total can operate on Cloudflare alone; once you grow past that threshold, add Bunny.

Managed By: Navigator (api_dns) - automates DNS record management based on node inventory

Keep DNS behavior, routing patterns, and certificate mechanics in DNS and Cluster Routing. This page only lists the credential inputs Navigator needs:

CLOUDFLARE_API_TOKEN="your-token"
CLOUDFLARE_ZONE_ID="your-zone-id"
CLOUDFLARE_ACCOUNT_ID="your-account-id"
BUNNY_API_KEY="your-bunny-api-key" # optional media-zone delegation

Set root_domain in the manifest. Navigator uses manifest/service inventory to decide which records to reconcile.


Purpose: HTTPS for public endpoints, platform-managed subdomains, and per-tenant wildcard bundles under cdn.{base}

Providers: Both Let’s Encrypt and Google Trust Services. LE is primary; Google Trust is the fallback issuer required for production-scale tenant alias issuance.

  • Let’s Encrypt handles root, global service pools, and cluster wildcard certificates — well within LE rate limits at typical volume.
  • Google Trust Services is required as a fallback issuer for per-tenant wildcard bundles under {tenant}.cdn.{base}. Each paying tenant gets its own wildcard cert, so issuance volume scales with paid-tenant count; LE alone hits per-registered-domain rate limits as paid tenants grow. Configure External Account Binding (EAB) credentials via NAVIGATOR_GOOGLE_TRUST_EAB_KID / NAVIGATOR_GOOGLE_TRUST_EAB_HMAC_KEY. Navigator’s CA order (NAVIGATOR_ACME_CA_ORDER) defaults to letsencrypt,google-trust when EAB credentials are present.
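
The default CA order can be pictured as a small decision on the environment. This is a sketch, not Navigator's code: `default_ca_order` is a hypothetical helper, and both the "explicit setting wins" rule and the Let's Encrypt-only fallback without EAB credentials are assumptions.

```shell
# default_ca_order: print the CA order implied by the current environment.
default_ca_order() {
  if [ -n "${NAVIGATOR_ACME_CA_ORDER:-}" ]; then
    # Assumption: an explicitly configured order always wins.
    printf '%s\n' "$NAVIGATOR_ACME_CA_ORDER"
  elif [ -n "${NAVIGATOR_GOOGLE_TRUST_EAB_KID:-}" ] &&
       [ -n "${NAVIGATOR_GOOGLE_TRUST_EAB_HMAC_KEY:-}" ]; then
    # Both EAB values present: Let's Encrypt first, Google Trust as fallback.
    printf 'letsencrypt,google-trust\n'
  else
    # Assumption: without EAB credentials only Let's Encrypt is usable.
    printf 'letsencrypt\n'
  fi
}

default_ca_order
```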

Managed By: Navigator (api_dns) — DNS-01 ACME issuance + renewal, multi-CA fallback when one issuer rate-limits, per-domain issuer pinning so renewals stay on the same CA.

See DNS and Cluster Routing for certificate routing and provider ownership. Set NAVIGATOR_CERT_ALLOWED_SUFFIXES when a deployment should restrict certificate requests beyond the manifest root_domain.


Purpose: Durable storage for recordings, clips, DVR segments, VOD assets, and thumbnails

Providers: Cloudflare R2 and Hetzner Object Storage are the supported defaults. Any S3-compatible backend works (the AWS SDK v2 client + BaseEndpoint + UsePathStyle = true cover MinIO, Wasabi, Backblaze B2, AWS S3 itself).

Consumed By:

  • Foghorn (api_balancing) — mints presigned PUT/GET URLs for clip / DVR / VOD lifecycle, writes manifests directly
  • Chandler (api_assets) — serves artifacts and caches via local LRU; per-cluster S3 lookup via Quartermaster (see infrastructure_clusters.s3_*)

Multiregion deployments map each tenant’s artifacts to the bucket whose pricing model best fits that tenant’s expected access pattern via tenants.home_region + infrastructure_clusters.s3_*.

  • Hetzner Object Storage (EU) is cheapest at-rest and on-egress in EU. Use for EU-home tenants when most viewers are same-region; storage savings dominate egress for that pattern.
  • Cloudflare R2 (US/global) has $0 egress with no AUP traps. Use where cross-region viewer fanout would otherwise compound egress bills, or where the bucket lives far from a meaningful portion of its readers. Standard storage class is the right default; Infrequent Access trades retrieval fees for storage savings and is wrong for live-streaming origins.

A deployment can also use a single provider for every cluster, EU and US alike; the two-provider split is an optimization, not a requirement.

  • Bucket per cluster (one Hetzner bucket for the EU cluster, one R2 bucket for the US cluster in the standard multiregion shape)
  • Access key + secret per provider, stored in SOPS-encrypted env files
  • CORS policy on each bucket (see below) — required for browser-direct VOD multipart uploads

Cloudflare R2:

  1. Create bucket via Cloudflare dashboard. Set Location hint explicitly (ENAM for Eastern North America, WEUR for Western Europe, etc.) — leaving Automatic places the bucket near the creating browser’s region, which can be wrong.
  2. Generate API token under R2 → Manage R2 API Tokens scoped to the bucket with Object Read & Write.
  3. Enable Local Uploads — terminates customer browser PUTs at the nearest Cloudflare PoP and routes internally to the bucket’s home region. Improves upload latency for geographically distributed customers; no downside for same-region writers.
  4. Leave default-disabled: Custom Domains, Public Development URL, R2 Data Catalog, Bucket Lock Rules, Event Notifications, On Demand Migration.
  5. Keep the default 7-day multipart-abort lifecycle rule. Tenant retention is FrameWorks-owned (api_balancing/internal/jobs/retention.go); bucket-level lifecycle would create a second authority over deletes.

Hetzner Object Storage:

  1. Create bucket via Hetzner Cloud Console under the right project + region (nbg1 Nuremberg, fsn1 Falkenstein, hel1 Helsinki).
  2. Generate an access key + secret via the bucket UI.

VOD upload uses browser-direct multipart presigned PUTs — the webapp fetch()s straight to the S3 endpoint and reads the ETag response header per part. Without CORS on the bucket the browser’s preflight OPTIONS request fails and uploads never start.

The presigned URL is the auth boundary, not the Origin header, so "AllowedOrigins": ["*"] is the correct shape and a common choice for presigned-upload backends.

The CORS policy is the same shape for both providers. Set via aws s3api put-bucket-cors once per bucket:

cat > /tmp/cors.json <<'EOF'
{
  "CORSRules": [
    {
      "AllowedOrigins": ["*"],
      "AllowedMethods": ["PUT", "GET", "HEAD"],
      "AllowedHeaders": ["*"],
      "ExposeHeaders": ["ETag"],
      "MaxAgeSeconds": 3600
    }
  ]
}
EOF
# Hetzner
aws s3api put-bucket-cors \
  --bucket <hetzner-bucket> \
  --endpoint-url https://<region>.your-objectstorage.com \
  --region us-east-1 \
  --cors-configuration file:///tmp/cors.json
# Cloudflare R2
aws s3api put-bucket-cors \
  --bucket <r2-bucket> \
  --endpoint-url https://<account-id>.r2.cloudflarestorage.com \
  --region auto \
  --cors-configuration file:///tmp/cors.json

Verify with an anonymous preflight curl (preflight precedes authentication, so no credentials needed):

curl -i -X OPTIONS --max-time 10 \
  -H "Origin: https://example.com" \
  -H "Access-Control-Request-Method: PUT" \
  -H "Access-Control-Request-Headers: Content-Type" \
  "https://<region>.your-objectstorage.com/<bucket>"

Expected: 200 OK with access-control-allow-origin, access-control-allow-methods: PUT, access-control-expose-headers: ETag. Hetzner Ceph echoes the specific matched origin instead of literally * in the response — spec-correct CORS behavior, not a deviation.
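
Because the two providers answer with different but equally valid Access-Control-Allow-Origin values, an automated check should accept either shape. The sketch below reads response headers on stdin; `cors_allows_origin` is a hypothetical helper, and the sample responses are hand-written stand-ins for real curl output.

```shell
# cors_allows_origin ORIGIN: read HTTP response headers on stdin and succeed when
# Access-Control-Allow-Origin is either "*" or exactly ORIGIN.
cors_allows_origin() {
  tr -d '\r' | awk -v origin="$1" '
    {
      colon = index($0, ":")
      if (colon == 0) next                              # status line or blank line
      name = tolower(substr($0, 1, colon - 1))          # header names are case-insensitive
      value = substr($0, colon + 1)
      sub(/^[ \t]+/, "", value)                         # trim leading whitespace
      if (name == "access-control-allow-origin" && (value == "*" || value == origin)) ok = 1
    }
    END { exit !ok }
  '
}

# Echoed-origin shape (as described for Hetzner).
printf 'HTTP/1.1 200 OK\r\naccess-control-allow-origin: https://example.com\r\n\r\n' |
  cors_allows_origin "https://example.com" && echo "preflight ok (echoed origin)"

# Wildcard shape (as described for R2).
printf 'HTTP/1.1 200 OK\r\nAccess-Control-Allow-Origin: *\r\n\r\n' |
  cors_allows_origin "https://example.com" && echo "preflight ok (wildcard)"
```

In practice the input would come from the preflight request itself, for example by piping `curl -si -X OPTIONS ...` into the function.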

Use the same variable names on every host that runs Foghorn or Chandler; only the values differ per cluster:

STORAGE_S3_ENDPOINT="https://<region>.your-objectstorage.com" # Hetzner
# or
STORAGE_S3_ENDPOINT="https://<account-id>.r2.cloudflarestorage.com" # R2
STORAGE_S3_BUCKET="<bucket-name>"
STORAGE_S3_REGION="us-east-1" # Hetzner (signing-only; ignored for routing)
# or
STORAGE_S3_REGION="auto" # R2 convention
STORAGE_S3_ACCESS_KEY="<from provider>"
STORAGE_S3_SECRET_KEY="<from provider>"
STORAGE_S3_PREFIX="" # optional key prefix inside the bucket
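
The region value follows from the endpoint for the two default providers, which a small lookup can make explicit. `s3_region_for` is a hypothetical helper covering only the two endpoint patterns shown above; other S3-compatible backends need their own value.

```shell
# s3_region_for ENDPOINT: print the STORAGE_S3_REGION convention for the endpoint.
# Returns non-zero for endpoints this sketch does not recognise.
s3_region_for() {
  case "$1" in
    *.r2.cloudflarestorage.com*) echo "auto" ;;        # R2 convention
    *.your-objectstorage.com*)   echo "us-east-1" ;;   # Hetzner: signing-only value
    *) return 1 ;;
  esac
}

s3_region_for "https://nbg1.your-objectstorage.com"
s3_region_for "https://account-id.r2.cloudflarestorage.com"
```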

In multiregion deployments, EU and US clusters use different providers and therefore different credentials. The CLI assembles env per host so each box receives the credentials for the cluster it serves — see Credential Delivery above. Cluster-bound services pull STORAGE_S3_* from cluster-scoped env_files referenced by the manifest’s service definitions.

Chandler also consults infrastructure_clusters.s3_endpoint / s3_bucket / s3_region for the local CLUSTER_ID and overrides env defaults with the cluster-row values. Credentials are never stored in infrastructure_clusters — they remain env-only.

  • aws-cli 2.34.x throws TypeError: argument of type 'NoneType' is not a container or iterable on the error path against ceph-RGW backends (Hetzner runs on Ceph). It’s a cosmetic awscli bug parsing empty <Message></Message> tags; the actual HTTP exchange succeeds. Success-path operations work fine; error responses show the bug. Read --debug output for the real HTTP response, or use mc (MinIO Client) which handles ceph cleanly.
  • Hetzner echoes specific origin, R2 echoes *: when AllowedOrigins is ["*"], Hetzner Ceph returns Access-Control-Allow-Origin: <requesting origin> while R2 returns Access-Control-Allow-Origin: *. Both are spec-compliant; don’t assert one shape in integration tests.
  • Cloudflare R2 location hint is permanent. Bucket name and location hint are set once at creation; you can’t migrate later without copying objects to a new bucket. Pick the right region the first time.

Purpose: IP geolocation for geographic routing, node placement, and analytics

Provider: MaxMind (GeoLite2 or GeoIP2), DB-IP Lite, or IP2Location LITE

What You Need:

  • Provider account (MaxMind requires a free license key for GeoLite2)
  • MMDB file provisioned to target hosts

Consumed By:

  • Foghorn (api_balancing) — geo-aware viewer routing decisions
  • Quartermaster (api_tenants) — geolocates nodes at bootstrap for federation map placement

If no MMDB is configured, GeoIP is disabled and services fall back to non-geographic behavior.

CLI Provisioning (recommended):

# Download from MaxMind and distribute to all Foghorn + Quartermaster hosts
frameworks cluster sync-geoip --license-key YOUR_MAXMIND_KEY
# Use a local MMDB file instead
frameworks cluster sync-geoip --source file --file /path/to/GeoLite2-City.mmdb
# Target specific services, restart after upload
frameworks cluster sync-geoip --license-key KEY --services foghorn,quartermaster --restart

The CLI reads your cluster manifest to find the hosts running the target services, uploads the MMDB via SSH, and optionally restarts those workloads. You can also make GeoIP part of normal cluster provisioning with a geoip: block in the manifest so frameworks cluster provision stages the MMDB before Foghorn and Quartermaster start.

Manual Provisioning:

  1. Download a City-level MMDB file from your provider
  2. Place it on each Foghorn and Quartermaster host at /var/lib/GeoIP/GeoLite2-City.mmdb
  3. Set environment variables on the target services:
    GEOIP_MMDB_PATH="/var/lib/GeoIP/GeoLite2-City.mmdb"
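
After manual provisioning, a quick host-side check can confirm the path is set and readable. This is a sketch: `geoip_status` is a hypothetical helper, and treating a missing or unreadable file the same as an unset variable is an assumption about how the services behave.

```shell
# geoip_status: report whether GeoIP would plausibly be enabled on this host.
geoip_status() {
  path=${GEOIP_MMDB_PATH:-}
  if [ -z "$path" ]; then
    echo "disabled: GEOIP_MMDB_PATH is unset"
  elif [ ! -r "$path" ]; then
    echo "disabled: $path is missing or unreadable"
  else
    echo "enabled: $path"
  fi
}

geoip_status
```

Run it as the same user the service runs as, since readability depends on file permissions.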

Cache Tuning (optional):

The shared pkg/geoip reader caches lookups. Defaults are sensible for most deployments:

GEOIP_CACHE_TTL=300s # Positive cache TTL (default: 300s)
GEOIP_CACHE_SWR=120s # Stale-while-revalidate window (default: 120s)
GEOIP_CACHE_NEG_TTL=60s # Negative (miss) cache TTL (default: 60s)
GEOIP_CACHE_MAX=50000 # Max cached entries
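
How the TTL and stale-while-revalidate settings interact can be sketched as a classification by entry age. This assumes conventional stale-while-revalidate semantics, where the SWR window starts when the TTL expires; the actual behaviour of pkg/geoip may differ, and `cache_state` is a hypothetical helper.

```shell
# cache_state AGE TTL SWR (all in seconds): classify a positive cache entry.
#   fresh   - served directly
#   stale   - served while a refresh happens in the background (assumed semantics)
#   expired - must be looked up again
cache_state() {
  age=$1
  ttl=$2
  swr=$3
  if [ "$age" -lt "$ttl" ]; then
    echo fresh
  elif [ "$age" -lt $((ttl + swr)) ]; then
    echo stale
  else
    echo expired
  fi
}

# With the defaults (TTL 300s, SWR 120s):
cache_state 100 300 120   # fresh
cache_state 350 300 120   # stale
cache_state 500 300 120   # expired
```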

Purpose: Transactional email — password reset, email verification, billing/invoice delivery, forms inbox

Providers: Fastmail, SendGrid, AWS SES, or any SMTP provider

Required: Yes. SMTP is required for production deployments. Without it:

  • Forgotten-password recovery is impossible — users locked out of accounts have no path back in.
  • Email verification at signup can’t complete — new account creation breaks.
  • Billing invoices and payment-failure notifications never reach customers — they discover failed payments only by losing service.
  • Forms API (contact submissions, support intake) hard-fails on send.

Auth flows degrade with warnings logged if SMTP is missing, but a degraded auth flow is not a viable production posture. Treat SMTP as required.

What You Need:

  • SMTP credentials (host, port, username, password)
  • Verified sender domain (SPF / DKIM configured at the DNS layer for deliverability)

Manual Steps:

  1. Choose email provider
  2. Generate SMTP credentials or app password
  3. Set environment variables:
    export SMTP_HOST="smtp.provider.com"
    export SMTP_PORT="587"
    export SMTP_USER="your-username"
    export SMTP_PASSWORD="your-password"
    export FROM_EMAIL="<sender-address>"
    export FROM_NAME="FrameWorks" # optional
    export TO_EMAIL="<inbox-address>" # Forms API inbox

Used By:

  • Auth (email verification + password resets)
  • Billing (invoice + payment-failure notifications)
  • Forms API (contact form submissions — hard-fails without SMTP)

Purpose: Billing, subscriptions, payment handling

Providers: Stripe (primary), Mollie (optional)

What You Need:

  • Stripe account and secret key (STRIPE_SECRET_KEY)
  • Stripe webhook signing secret (STRIPE_WEBHOOK_SECRET)
  • (Optional) Mollie API key (MOLLIE_API_KEY)

What’s Automated:

  • Webhook event handling (sync + asynchronous settlement)
  • Subscription management and tier activation
  • Product/price catalog sync on startup
  • Payment processing, refunds, and disputes

Manual Steps:

  1. Create Stripe account at https://stripe.com
  2. Get the secret key from Developers → API keys (sk_live_…)
  3. Add a webhook endpoint at Developers → Webhooks:
    • URL: https://yourdomain.com/webhooks/billing/stripe
    • Event destination scope: “Your account” (FrameWorks is not a Stripe Connect platform)
    • API version: 2026-05-27.dahlia — must match the bundled stripe-go SDK. The webhook payloads are parsed against this version; a mismatched endpoint version can silently mis-parse fields.
    • Subscribe these events:
      • checkout.session.completed, checkout.session.async_payment_succeeded, checkout.session.async_payment_failed, checkout.session.expired
      • customer.subscription.created, .updated, .deleted, .paused, .resumed
      • invoice.paid, invoice.payment_failed, invoice.payment_action_required
      • payment_intent.succeeded, payment_intent.payment_failed
      • charge.refunded, charge.dispute.created, .closed, .funds_withdrawn, .funds_reinstated
    • Reveal the endpoint Signing secret (whsec_…)
  4. (Optional) If using Mollie, configure: https://yourdomain.com/webhooks/billing/mollie
  5. Enable payment methods (Settings → Payment methods). The hosted Checkout pins an explicit, currency-aware allowlist in code: card always, plus sepa_debit, ideal, and bancontact for EUR checkouts (these are EUR-only at Stripe; non-EUR checkouts offer card only). The dashboard-enabled set must include the methods you bill in.
  6. Set environment variables:
    export STRIPE_SECRET_KEY="sk_live_..."
    export STRIPE_WEBHOOK_SECRET="whsec_..."
    export MOLLIE_API_KEY="live_xxx" # optional
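
The currency-aware allowlist from step 5 can be written out as a lookup, which is useful when checking that the dashboard-enabled methods cover what you bill in. `payment_methods_for` is a hypothetical helper that restates the rule above; it is not the platform's code.

```shell
# payment_methods_for CURRENCY: print the Checkout payment methods for a currency.
payment_methods_for() {
  currency=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
  case "$currency" in
    EUR) echo "card sepa_debit ideal bancontact" ;;   # EUR-only methods added
    *)   echo "card" ;;                                # every other currency
  esac
}

payment_methods_for eur
payment_methods_for USD
```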

Used By: Purser (billing API)

Optional: Not required for basic platform operation

Note: Stripe webhooks require STRIPE_WEBHOOK_SECRET in all environments. Asynchronous methods (SEPA Direct Debit, iDEAL, Bancontact) settle after checkout completes; value is granted only on the confirming event (async_payment_succeeded / invoice.paid / customer.subscription.updated), never on checkout.session.completed alone.
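
The settlement rule in the note can be summarised as a filter on event type. This sketch is deliberately simplified: `grants_value` is a hypothetical helper, it reads "async_payment_succeeded" as the checkout.session.async_payment_succeeded event, and a real handler would also inspect the event payload before granting anything.

```shell
# grants_value EVENT_TYPE: succeed only for event types that can confirm payment.
grants_value() {
  case "$1" in
    checkout.session.async_payment_succeeded|invoice.paid|customer.subscription.updated)
      return 0 ;;
    *)
      return 1 ;;   # includes checkout.session.completed on its own
  esac
}

for event in checkout.session.completed checkout.session.async_payment_succeeded invoice.paid; do
  if grants_value "$event"; then
    echo "$event: may grant value"
  else
    echo "$event: record only, wait for confirmation"
  fi
done
```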


| Service | Provider | Required | Automated by |
| --- | --- | --- | --- |
| Database | Postgres or YugabyteDB | Yes | CLI (user creation + grants) |
| ClickHouse | ClickHouse | Yes | CLI (users.xml deployment) |
| DNS | Cloudflare (required today) + Bunny DNS (optional) | Yes | Navigator (api_dns) |
| SSL/TLS | Let’s Encrypt + Google Trust Services (multi-CA) | Yes | Navigator (api_dns) |
| Object Storage | Cloudflare R2 / Hetzner / any S3-compat | Yes today | Operator (bucket + CORS); Chandler reads per-cluster row |
| GeoIP | MaxMind / DB-IP / IP2Location | Recommended | CLI (cluster sync-geoip) |
| SMTP | Any provider (Fastmail, SendGrid, AWS SES, …) | Yes | Platform (email delivery) |
| Payments | Stripe, Mollie | Optional (required for billing) | Purser (webhook handling) |