Secrets at Startup: Why We Stopped Using Environment Variables
The 12-factor app is gospel. Factor III — “Store config in the environment” — has been repeated so many times that most engineers I talk to treat it as settled law. Put your database URL in DATABASE_URL, your API key in STRIPE_SECRET_KEY, your JWT secret in JWT_SECRET, and make the environment the contract between your app and the outside world.
We used to do this. Every project had a sprawling .env file. Every docker-compose.yml had an environment: block ten lines long. Every CI pipeline had a “secrets” tab full of variables. Every developer laptop had a .env.local they’d copied from someone on Slack two years ago.
Then we stopped. Today, our .env files contain at most two lines — ENVIRONMENT= and, for AWS, REGION=. Every other value — every secret, every piece of configuration, even things like LOG_LEVEL and WORKERS — is fetched from a secrets manager at container startup and held only in memory.
This post is about why.
What 12-factor actually says vs. what people do with it
Factor III says config “varies substantially across deploys” and shouldn’t be in code. That’s true. The fix 12-factor proposes — environment variables — was the right answer in 2011, for Heroku. Heroku controlled the kernel, the process supervisor, the log drain, and the deploy pipeline. Env vars were the clean seam between the platform and your dyno.
We are not running on Heroku. We are running Docker containers on Unraid, on Proxmox, on GCP, and on AWS ECS. Our threat model is not “is my dyno config separated from my code” — we nailed that a long time ago. Our threat model is “where is this secret actually visible, and to whom.”
Let me be specific about where a value in a container’s environment is actually visible.
Where env vars leak
An environment variable set on a container is visible in all of these places, whether you put it there with docker run -e, docker-compose.yml, a .env file loaded by compose, or a Kubernetes env: block:
- `docker inspect <container>` — anyone on the host with Docker socket access can read every env var, including “secret” ones. If you’ve ever given a teammate `docker` group membership to let them restart containers, you’ve given them every secret on the host.
- `/proc/<pid>/environ` — on the host, the container’s init process exposes its env to anyone who can read that file. `cat /proc/$(pgrep -f myapp)/environ | tr '\0' '\n'` — done.
- Crash dumps — most runtimes (Python, Node, Go) include environment variables in crash reports. Sentry, Datadog, Rollbar, and Bugsnag all capture them by default unless you explicitly filter. Your `DATABASE_URL` has probably been in a Sentry issue at some point.
- Process listings on shared infrastructure — `ps auxe` shows env on some systems. If you share a host (including in dev), neighbors see everything.
- CI logs — every CI system I’ve used has accidentally echoed an env var at some point. GitHub Actions has a mask feature. It’s opt-in and trivially bypassable by a rogue action (`echo ${DB_PASS:0:4}...${DB_PASS: -4}`).
- Child processes — every subprocess inherits the parent’s env unless you explicitly scrub it. That means any shell callout, any `popen`, any background worker carries your secrets with it.
- Error messages from libraries you don’t control — I’ve seen a connection library print the full connection string (password included) to stderr on a timeout. The stderr went to a log aggregator. The log aggregator was indexed. The password was searchable.
The common thread: env vars are a broadcast channel, not a secure store. You put a value there and you’ve implicitly consented to it appearing in all of the above places. Most of the time it’s fine. The one time it isn’t, you’re paging someone at 2am about rotating a credential that leaked through a surface you didn’t know existed.
What we do instead
The pattern has a name in our wiki: secrets at startup.
The rules are:
- The only things allowed in `.env` are `ENVIRONMENT` (always) and `REGION` (AWS only).
- Everything else — database URLs, API keys, JWT secrets, and operational config like `LOG_LEVEL`, `PORT`, `WORKERS`, pool sizes, feature flags — lives in a secrets manager.
- At container startup, the app authenticates to the secrets manager, fetches everything it needs into an in-memory cache, and holds it there for the lifetime of the container. No auto-refresh, no timer, no background task. Secrets are deliberately immutable until an operator triggers a rotation (more on this below).
- If a required secret is missing, the container crashes immediately. No silent fallbacks. No default values.
Here’s the Python version, roughly what’s running in production:
import os
from functools import lru_cache
from pathlib import Path


class SecretsError(RuntimeError):
    """Raised when a required secret or bootstrap value is missing."""


class SecretsClient:
    """Fetch secrets from Infisical or AWS — never from environment variables."""

    def __init__(self, backend: str = "infisical"):
        self.backend = backend
        self._access_token = None
        if backend == "infisical":
            self.url = self._read_config("INFISICAL_URL")
            self.project_id = self._read_config("INFISICAL_PROJECT_ID")
            self.client_id = self._read_config("INFISICAL_CLIENT_ID")
            self.client_secret = self._read_config("INFISICAL_CLIENT_SECRET")

    @staticmethod
    def _read_config(name: str) -> str:
        """Read from a file first, then fall back to env. File path wins."""
        file_path = os.environ.get(f"{name}_FILE")
        if file_path and Path(file_path).exists():
            return Path(file_path).read_text().strip()
        value = os.environ.get(name)
        if not value:
            raise SecretsError(f"Missing {name}: set {name}_FILE or {name}")
        return value

    @lru_cache(maxsize=128)
    def get_secret(self, key: str, folder: str = "/") -> str:
        # ... call Infisical / AWS / etc., cache the result
        ...

    def clear_cache(self) -> None:
        # Called only by the POST /secrets/refresh endpoint during
        # rotation — never by a background timer.
        self.get_secret.cache_clear()
        self._access_token = None
In our FastAPI services, we wire this into a Pydantic BaseSettings subclass with a model validator that collects every missing secret before raising, so operators can fix them all in one pass instead of playing whack-a-mole. Feature-gated secrets (Stripe, SMTP) are only required when the feature flag is on — you don’t need a Stripe key to run a dev container that doesn’t take payments.
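The collect-before-raise behavior is the part that matters; stripped of the Pydantic plumbing, it reduces to roughly this (the spec table, secret names, and the fake backend are all illustrative):

```python
class SecretsError(Exception):
    pass

# (name, folder, always_required) — feature-gated entries are fetched
# only when their flag is on. Names and folders here are illustrative.
SECRET_SPECS = [
    ("url", "/database", True),
    ("jwt_secret", "/auth", True),
    ("stripe_key", "/payments", False),  # only when payments are enabled
]

def load_settings(client, payments_enabled: bool = False) -> dict:
    """Fetch every secret, collecting ALL misses before raising."""
    values, missing = {}, []
    for name, folder, always_required in SECRET_SPECS:
        if not always_required and not payments_enabled:
            continue
        try:
            values[name] = client.get_secret(name, folder)
        except KeyError:
            missing.append(f"{folder}/{name}")
    if missing:
        # One error naming everything, so operators fix it in one pass.
        raise SecretsError(f"Missing secrets: {', '.join(missing)}")
    return values

# Toy in-memory backend, just to show the behavior.
class FakeClient:
    def __init__(self, store):
        self.store = store
    def get_secret(self, name, folder):
        return self.store[(folder, name)]
```

Run it against a store with only the database URL and the error names `/auth/jwt_secret` too — one shot, not one failure per restart.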
The app never sees os.environ["DATABASE_URL"]. It sees settings.DATABASE_URL, which was populated by a call to client.get_secret("url", "/database") at startup. docker inspect on the running container shows ENVIRONMENT=production and nothing else interesting.
“But how do you authenticate to the secrets manager without secrets?”
This is the question that always comes up, and it’s the right question. You’ve moved the problem one layer down — now you need a bootstrap credential to talk to the secrets manager, and that credential has to come from somewhere.
We solve it two different ways depending on environment.
At work (AWS): the ECS task runs with an IAM role. AWS injects temporary credentials via the task’s container credentials endpoint (the ECS counterpart of the EC2 instance metadata service). There is literally no bootstrap config on disk or in env — the container gets its identity from the infrastructure itself. This is the cleanest possible answer.
At home (Unraid / Infisical): we use Docker secret files on tmpfs. A gitignored .secrets/ directory on the host contains four tiny files (infisical_url, infisical_client_id, infisical_client_secret, infisical_project_id). docker-compose mounts them at /run/secrets/ inside the container. The app’s _read_config method prefers the *_FILE variant, reads the file, and uses the value to authenticate with Infisical.
services:
backend:
environment:
ENVIRONMENT: "${ENVIRONMENT:-production}"
secrets:
- infisical_url
- infisical_client_id
- infisical_client_secret
- infisical_project_id
secrets:
infisical_url:
file: ${SECRETS_PATH:-.secrets}/infisical_url
infisical_client_id:
file: ${SECRETS_PATH:-.secrets}/infisical_client_id
infisical_client_secret:
file: ${SECRETS_PATH:-.secrets}/infisical_client_secret
infisical_project_id:
file: ${SECRETS_PATH:-.secrets}/infisical_project_id
The files live on the host’s filesystem and are mounted read-only at /run/secrets/ inside the container (under Docker Swarm that mount is a tmpfs, so the values exist only in memory for the container’s lifetime; plain compose bind-mounts the files). They aren’t in docker inspect output. They aren’t inherited by child processes. They aren’t in crash dumps. A small, bounded attack surface — exactly four tiny credential files, nothing else.
The ${SECRETS_PATH:-.secrets} interpolation is a small but important detail: local dev defaults to ./.secrets in the project directory, while CI overrides SECRETS_PATH to an absolute path outside the repo at deploy time. Same compose file, different deploy context.
“Isn’t this just env vars with extra steps?”
No, and the difference matters.
An env var is broadcast: anyone in the container’s process tree, anyone with Docker socket access, anyone reading /proc, anyone who captures a crash dump, anyone tailing docker inspect output, sees it.
A value held in an application’s in-memory cache is scoped: only code running inside the Python interpreter’s heap can read it. A subprocess started from that app gets a clean env (unless you explicitly pass values through). A crash dump captures the value of local variables, not the secret cache dict, unless you dump that specific object. A docker inspect sees ENVIRONMENT=production and goes home.
The blast radius is smaller. That’s the whole point. You can’t get perfect isolation without a vault-sidecar architecture — but you can get dramatically better isolation with a ~200-line secrets client.
The DHI wrinkle: you may not have a choice anyway
There’s an angle I didn’t fully appreciate until we started running on Docker Hardened Images. DHI runtime images are stripped further than Alpine or distroless — no shell, no aws-cli, no jq, no curl, no find, no sed, not even cp. You get your language runtime and the minimum libraries it needs. That’s it.
If you’ve ever written a “fetch secrets at entrypoint” script, you know what it usually looks like:
#!/bin/sh
export DATABASE_URL=$(aws secretsmanager get-secret-value \
--secret-id myapp/prod/database \
--query SecretString --output text | jq -r '.url')
export JWT_SECRET=$(aws secretsmanager get-secret-value \
--secret-id myapp/prod/auth \
--query SecretString --output text | jq -r '.jwt_secret')
exec node server.js
Every single thing in that script is unavailable on a DHI runtime. sh isn’t there. aws isn’t there. jq isn’t there. export isn’t there. You can’t even run it. The traditional pattern of “shell glue that stitches env vars from a secrets manager into the running app” is simply not an option when your runtime has been stripped to the bone.
At first that felt like a restriction. In practice it turned out to be a forcing function toward something better.
The in-process secrets client — the one you had to write anyway for the “secrets out of env vars” argument — becomes the only way to populate credentials at startup on a DHI runtime. You cannot shell out. You cannot call a sidecar in a way that writes to the environment. You must read from the secrets manager inside your application’s process, in code, in your language of choice. For Node.js that’s an .mjs entrypoint with @aws-sdk/client-secrets-manager. For Python it’s the SecretsClient class I showed earlier. For Go it’s the AWS or Infisical SDK called from main(). All roads lead to the same pattern.
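For Python, the in-process equivalent of that shell entrypoint looks roughly like this. The secret IDs match the script above; the fetcher is injected so the function is testable, and the boto3 wiring is commented out because it’s a sketch, not our exact production code:

```python
import json

def load_runtime_secrets(fetch_raw) -> dict:
    """In-process replacement for the shell entrypoint.

    fetch_raw(secret_id) returns the raw SecretString. In production
    this would wrap boto3's secretsmanager get_secret_value(); here it
    is injected so the startup path stays unit-testable.
    """
    db = json.loads(fetch_raw("myapp/prod/database"))
    auth = json.loads(fetch_raw("myapp/prod/auth"))
    return {"DATABASE_URL": db["url"], "JWT_SECRET": auth["jwt_secret"]}

# Production wiring would look something like (requires boto3):
#   import boto3
#   sm = boto3.client("secretsmanager")
#   secrets = load_runtime_secrets(
#       lambda sid: sm.get_secret_value(SecretId=sid)["SecretString"]
#   )
```

No shell, no jq, no export — just your runtime and an SDK call, which is exactly the shape a stripped image forces on you.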
And here’s the upside I didn’t expect: everything about startup becomes unit-testable. A shell entrypoint isn’t really testable — you can lint it, you can run it in a throwaway container, but you can’t write a proper test suite around it. An application-language entrypoint is a module. It has functions. You can mock the secrets backend, feed in broken configs, feed in missing secrets, feed in malformed JSON, and assert the right exceptions get raised at the right time. The “one-off” glue code that used to live in an untested entrypoint script — the migration runner, the schema check, the cache warm, the health probe the healthcheck calls — all of that lives inside your application language now, which means it lives inside your test suite.
We caught a whole class of startup bugs this way that we used to catch in staging: a missing secret throwing a cryptic KeyError instead of a named SecretsError, a fail-fast path that didn’t actually fail fast on one specific backend, a cache key collision when two different secrets had the same local name. Every one of those is a five-line test.
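For flavor, the “named error instead of cryptic KeyError” bug is exactly that kind of test. The client here is a minimal stand-in, not our real class:

```python
class SecretsError(Exception):
    pass

class SecretsClient:
    """Minimal stand-in: wraps backend misses in a named error."""
    def __init__(self, store):
        self._store = store

    def get_secret(self, key: str, folder: str = "/") -> str:
        try:
            return self._store[(folder, key)]
        except KeyError:
            # The bug we caught: without this wrap, callers saw a
            # cryptic KeyError instead of a named SecretsError.
            raise SecretsError(
                f"Missing secret {folder.rstrip('/')}/{key}"
            ) from None

def test_missing_secret_raises_named_error():
    client = SecretsClient(store={})
    try:
        client.get_secret("url", "/database")
        assert False, "expected SecretsError"
    except SecretsError as e:
        assert "/database/url" in str(e)
```

That test would have failed loudly on the buggy version — in CI, not in staging.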
If you’re on a DHI runtime (or any shell-less base image), the in-process secrets pattern isn’t the fancy way. It’s the only way. And the side effect of being forced into it is that your entire startup sequence moves into code your test suite can reach.
Rotation: deliberate, not automatic
This is the part that people usually guess wrong about when they first hear about the pattern, so I want to be explicit: we do not auto-refresh. There is no timer. There is no background task. A secret fetched at startup lives in that container’s memory until the container is stopped or something explicitly tells it to refresh.
The first version of this pattern we built did auto-refresh on a timer. We tore it out. Two reasons:
- AWS Secrets Manager has no push notification. There’s no webhook when a secret value changes, so auto-refresh has to poll. You’re paying a real fetch on every tick to catch a rare event.
- Auto-refresh is dangerous. Someone updates a secret in the AWS console — a typo, a wrong value, a test — and every running container silently picks up the bad credential on its next cache clear. The operator has no blast-radius control. The first signal is an alert on your error-rate dashboard fifteen minutes later.
Deliberate rotation is safer and, in our experience, almost always what you actually wanted. The mechanism is three small endpoints every container exposes:
- `GET /secrets/health` — was the startup fetch successful, and when?
- `GET /secrets/status` — do the currently-held credentials actually work?
- `POST /secrets/refresh` — clear the cache, re-fetch from the secrets manager, recycle the pool if creds changed
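Stripped of the routing layer, the three handlers reduce to one small state object per container. This is a sketch under assumptions: `fetch_all`, `refetch_and_diff`, and the probe are stand-ins for whatever your client and driver expose:

```python
import time

class SecretsState:
    """Backs /secrets/health, /secrets/status, /secrets/refresh."""

    def __init__(self, client, probe):
        self.client = client    # fetches from the secrets manager
        self.probe = probe      # e.g. runs SELECT 1 with current creds
        self.fetched_at = None

    def startup_fetch(self) -> None:
        self.client.fetch_all()
        self.fetched_at = time.time()

    def health(self) -> dict:
        # GET /secrets/health — was startup fetch successful? when?
        return {"ok": self.fetched_at is not None,
                "fetched_at": self.fetched_at}

    def status(self) -> dict:
        # GET /secrets/status — do the held credentials actually work?
        return {"credentials_valid": bool(self.probe())}

    def refresh(self) -> dict:
        # POST /secrets/refresh — clear, re-fetch, report what changed.
        changed = self.client.refetch_and_diff()
        self.fetched_at = time.time()
        return {"refreshed": True, "credentials_changed": changed}
```

The rotation script only ever talks to these three; nothing inside the container decides to refresh on its own.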
Rotation is scripted. A script (rotate-secrets-bluegreen.sh in our case) orchestrates the change:
1. Generate the new credential (for a database, create a new DB user with the same grants as the current user — this is the two-user rotation pattern).
2. Write the new credential to the secrets manager. Keep the old value side-by-side so both credentials work during the transition window.
3. `POST /secrets/refresh` on the inactive color in a blue/green pair.
4. `GET /secrets/status` to verify the inactive color picked up the new credential and the database accepts the connection. If not, roll the secret back and abort — nothing user-facing has changed yet.
5. Shift traffic to the inactive color.
6. `POST /secrets/refresh` on the now-inactive color (which was previously serving).
7. `GET /secrets/status` again. Confirm both colors are healthy on the new credential.
8. Drop the old DB user. Remove the side-by-side key from SM.
There are twelve steps in the real script; I’ve collapsed them for the post. The important part is that rotation is a deliberate, auditable sequence with a verification gate at every step. Auto-refresh skips all of those gates.
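The gate structure can be sketched as a skeleton that aborts at the first failed verify. Everything here is a stand-in for the real script’s HTTP calls, and the colors are hardcoded for brevity (the real script detects which is active):

```python
def rotate(create_user, write_secret, refresh, status, shift_traffic,
           rollback, drop_old_user) -> bool:
    """Gate-checked rotation skeleton: each step must pass before
    anything user-facing changes."""
    create_user()                # 1. new DB user, same grants
    write_secret()               # 2. new value side-by-side with old
    refresh("green")             # 3. inactive color re-fetches
    if not status("green"):      # 4. verify before touching traffic
        rollback()               #    nothing user-facing changed yet
        return False
    shift_traffic("green")       # 5. green is now serving
    refresh("blue")              # 6. old color re-fetches
    if not status("blue"):       # 7. confirm both colors healthy
        return False
    drop_old_user()              # 8. retire the old credential
    return True
```

Auto-refresh, by contrast, is step 3 fired on a timer with none of the gates around it.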
The /secrets/refresh endpoint itself is the only dynamic piece. When called, it clears the in-memory secrets cache, re-fetches from the manager, compares the new DB username to the one the connection pool is currently using, and — only if it changed — drains and recreates the pool. In-flight queries finish on the old connections before they close. No dropped requests, no container restart, zero downtime for the rotation. If the credentials didn’t actually change (most calls, during normal operation), the pool is left untouched and the response says so.
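The compare-then-recycle logic at the heart of that endpoint, sketched with a stand-in pool API (your driver’s pool will expose different names):

```python
def refresh_db_credentials(client, pool) -> bool:
    """Clear the cache, re-fetch, and recycle the pool only if the
    DB username actually changed. Returns True if the pool recycled."""
    old_user = pool.current_username()
    client.clear_cache()
    new_user = client.get_secret("username", "/database")
    new_password = client.get_secret("password", "/database")
    if new_user == old_user:
        return False  # most calls: nothing changed, pool untouched
    # Drain-and-recreate: in-flight queries finish on old connections
    # before they close; new checkouts use the new credential.
    pool.recreate(username=new_user, password=new_password)
    return True
```

Keying the decision on the username (not the password) is what makes the two-user rotation pattern work: a new user means a new credential generation, so the pool must move.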
Rotations in normal operation: database passwords every 90 days, JWT secrets every 180 days, API keys on whatever schedule the provider imposes. Each one runs the same script, each one hits the same endpoint, each one is a deliberate operation by someone who can watch the health dashboard while it happens.
The non-obvious win: config in the secrets manager too
The thing I didn’t expect when we made this change is how much of it isn’t really about secrets. It’s about config.
When LOG_LEVEL, WORKERS, REDIS_POOL_SIZE, and JWT_EXPIRY_MINUTES are in your secrets manager alongside your actual secrets, three things happen:
- You can change them without rebuilding the image. Flip `LOG_LEVEL` from `info` to `debug` in Infisical, call `POST /secrets/refresh`, and the next request sees the new value. Same deliberate mechanism as rotation — same verification gate, same audit trail, same ability to confirm it worked before moving on.
- You get an audit trail for free. Every config change is logged in Infisical (or CloudTrail for AWS). “When did we bump the pool size and who did it?” becomes a single query.
- Dev/staging/prod stop drifting. The thing that makes config hard is that it sprawls across `.env` files on developer laptops, `docker-compose.override.yml` files, CI variable pages, and deployment scripts. Centralizing it means there’s exactly one place where config lives, and the app always gets the right value for its environment.
The 12-factor philosophy of “config separate from code” is right. We just think the environment variable channel has outlived its usefulness as the transport.
Trade-offs (because there are some)
I’d be lying if I said this was free.
- Startup latency. The app does an extra HTTP round trip to the secrets manager on boot. With caching, this is milliseconds in steady state, but cold starts take 50–200ms longer. For our workload, that doesn’t matter. For a serverless function that cold-starts on every request, it might.
- A new external dependency at boot. If the secrets manager is down, nothing starts. We mitigate with fail-fast + clear error messages + monitoring on the secrets manager itself, but it’s a real dependency.
- More upfront setup. A new project needs the secrets client, the bootstrap file plumbing, and an Infisical folder structure. For a weekend prototype, a `.env` file is faster. This pattern earns its keep on projects you’re going to run for years.
- It only really works with a secrets manager. “Just put everything in Vault” is a sentence that hides weeks of infrastructure work if you don’t already have one. Infisical has made this dramatically cheaper than it used to be (self-hostable, free tier, good UX), but it’s still not free.
We think the trade-offs are worth it. docker inspect on any of our containers returns exactly two environment variables, and we sleep better for it.
When to deviate
A few cases where you should bend the rule:
- Frontend SPAs. Truly public config — analytics IDs, public API URLs — can be baked in at build time. Never try to fetch “secrets” from a client-side app; there are no secrets in code that runs in the browser.
- Database init containers. The postgres/mysql container itself needs a root password for first-boot initialization. Use the native `*_FILE` env var pattern (`MYSQL_ROOT_PASSWORD_FILE=/run/secrets/mysql_root_password`) and call it a day. This is the same Docker secret file pattern we use for bootstrap, just with a different consumer.
- Weekend prototypes. If you’re building something you might throw away in a week, a `.env` file is fine. This pattern is about the projects you plan to run for years.
The short version
12-factor’s “store config in the environment” was the right answer for Heroku circa 2011. It is no longer the right answer for self-hosted Docker or cloud-native container platforms in 2026. The surface area of “environment variables on a running container” has grown to include docker inspect, /proc, crash dumps, error monitoring tools, CI logs, child processes, and library error messages — not because any one of those is new, but because the cumulative blast radius is larger than most people realize when they’re typing DATABASE_URL= into a .env file.
The alternative isn’t exotic. A small secrets client, a fail-fast settings loader, a tmpfs-mounted bootstrap credential, and three small endpoints (/secrets/health, /secrets/status, /secrets/refresh) that let a rotation script deliberately refresh credentials without restarting anything. Maybe a day of work to set up on a new project. The payoff is that your secrets live in exactly one place you can audit, rotate, and reason about — and your containers stop being broadcast towers for credentials.
And if you’re on a hardened runtime like DHI — where sh, aws, jq, and curl aren’t available at all — you don’t get to choose. The in-process secrets pattern is the only one that actually runs. The upside is that your entire startup sequence moves into code your test suite can reach.
If you’re still typing secrets into .env files in 2026, it’s worth asking: not whether 12-factor is right in theory, but whether the specific transport 12-factor recommends still fits the threat model of the thing you’re actually building.
Links:
- Infisical — open-source self-hostable secrets manager
- AWS Secrets Manager — managed alternative for AWS workloads
- 12-factor Config — the original, still worth reading
- Docker secrets — the `/run/secrets/` file-mount mechanism used for bootstrap