Cross-Device Drafts with AES + Device Tokens
Here’s a problem that looks simple from the outside: someone receives a link to a long, multi-step form, opens it on their phone, gets interrupted partway through, and wants to pick it up on a desktop five minutes later. Same person, same form, different device. No accounts, no logins — they’re a one-time visitor, not a registered user, and asking them to create credentials before they can fill out the form would crater completion rates.
Also: the form contains sensitive data, so everything needs to be encrypted at rest. And whatever mechanism lets them resume on a second device has to absolutely guarantee that the phone and the desktop can’t both be saving to the same draft at the same time.
When I sat down to design this, I couldn’t find a single existing pattern that handled all three constraints together. Most collaborative-editing literature starts with “assume authenticated users.” Most draft-autosave literature starts with “assume single device.” Most encryption-at-rest literature assumes you have an account to bind the key to. We had none of those.
This post is about what we built, and — maybe more importantly — why the usual tools didn’t fit.
Why the normal answers don’t work here
My instinct when I first saw this problem was “this is basically Google Docs, just with one writer at a time.” That instinct was wrong in both directions.
CRDTs / operational transforms — the Google Docs / Figma answer — are designed to merge concurrent edits. They’re beautiful machinery. They’re also exactly what you don’t want when two devices might type different answers to the same form field. A merge that produces an answer is worse than one that produces no answer, because “no answer” at least prompts the human to resolve it. Free-text collaborative prose has a correct merge; structured form inputs do not.
Pessimistic row locks — the database-engineer answer — assume the lock holder will either finish or crash cleanly. Real users close tabs. They put their phone down, the screen goes dark, the browser gets killed by the OS when the app switches contexts. There is no ON DISCONNECT hook for a phone in someone’s pocket. A pessimistic lock held by a device that’s gone is a lock that blocks the user’s own second device five minutes later.
Session-based authentication — the web-app answer — is circular here. A session presupposes a user identity. We don’t have one.
“Just use JWTs” — no. A JWT authenticates a device, not a person. Two devices belonging to the same user should both be able to resume, but not at the same time. JWTs don’t natively express “exactly one of you holds this right now.”
So we wrote something specific to the constraint set. The short version: access to a form is gated by an unguessable capability URL that’s sent to the user out-of-band and dies when the form is submitted or expires; ownership of that URL is proven by a knowledge check against values the user themselves entered on the first page; the single writer is enforced by an opaque device token; and the previous token holder finds out they’ve been superseded over a WebSocket room keyed on the draft ID.
The longer version has six moving parts, and the one that does most of the heavy lifting is the one I want to walk through first.
Part 1: The URL is a capability
Before a user ever sees the form, the system generates a random unguessable identifier — a GUID with enough entropy that you can’t enumerate your way to one — and sends the user a link containing it out-of-band (SMS, email, a link at a front desk, whatever delivery channel the context calls for). That identifier is the only way to address the form from the outside. A visitor without a valid identifier doesn’t see an error, a login page, or a landing page — they see nothing at all. The form doesn’t exist for them.
The draft lookup service takes only that identifier. There is no endpoint anywhere in the system that takes a name, a date of birth, or any other personal detail and returns a list of matching drafts. You cannot ask the system “which drafts belong to this person?” — not because the answer is hidden, but because the system genuinely doesn’t support that query. The identifier is not an index into a searchable space. It is the space.
No draft record exists at the moment the URL is issued. The URL is a capability to start a form, not a handle to an existing record. When the user follows the link and actually fills out the first page, that’s the moment the draft is created server-side, the expiry clock starts ticking, and the encryption key is generated. Before that point there is nothing to decrypt and nothing to steal. After that point, the same URL continues to identify this specific draft until it’s submitted or expires — the user uses it to come back themselves, on the same device or any other.
Submission is instant and terminal. When the user finishes the form and the server accepts the final submission, the draft row is deleted in the same transaction, the DEK is evicted from the cache, and the URL’s identifier is flagged as used in a small separate table that outlives the draft storage. From that instant the URL stops resolving — not because “we can’t find the row” (which would be an inference) but because we explicitly remember that the identifier was spent. Even if the draft row were somehow restored from a backup, the “used” flag would still refuse the URL. Expiry works the same way: cleanup deletes the draft and marks the URL spent, in one step.
This matters for two reasons.
First: an attacker who knows everything about the user still cannot find a draft they do not already have the URL for. Enumerate GUIDs? The space is too big. Guess by name? There’s no name lookup. Brute-force the identity check? You have to be hitting a specific draft for the check to apply, and you don’t have one. The knowledge check is scoped to the draft identified by the URL, not run as a global query. Without a valid URL, there is no attack surface for the identity factor to defend at all.
Second: the URL is bound to a single lifecycle. Once the corresponding draft is submitted, the URL stops resolving. Once the draft expires from inactivity, the URL stops resolving. A visit to a URL whose draft has been submitted, expired, or never started returns a generic not-found response — the same response a random made-up GUID would return. A leaked URL whose draft has already been submitted is indistinguishable from garbage. This isn’t a rate limit on a known identifier; it’s a hard state transition from “valid capability” to “nothing ever existed here.”
Together, those two properties are what let the rest of the system tolerate a knowledge check that would otherwise be weak. We’ll get to that.
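The capability lifecycle above can be sketched as a toy in-memory store. Every name here (`CapabilityStore`, `start`, `resolve`) is illustrative rather than the real system’s API, and the dicts stand in for database tables — it’s the state transitions that matter:

```python
import secrets

class CapabilityStore:
    """Toy model of the capability-URL lifecycle: issue, start, resolve, spend."""

    def __init__(self):
        self.drafts = {}      # identifier -> page-1 payload (only while live)
        self.spent = set()    # identifiers explicitly remembered as used

    def issue(self):
        # 128 bits of randomness: the identifier IS the address space.
        return secrets.token_urlsafe(16)

    def start(self, ident, first_page):
        # The draft is created only when the user fills out page 1;
        # a spent identifier can never start a new one.
        if ident in self.spent or ident in self.drafts:
            return False
        self.drafts[ident] = first_page
        return True

    def resolve(self, ident):
        # Spent, expired, and never-started identifiers all look identical
        # from the outside: a generic not-found, no hint anything existed.
        return self.drafts.get(ident)

    def submit(self, ident):
        # Terminal: delete the draft AND remember the identifier as spent,
        # so even a backup-restored draft row would not revive the URL.
        if ident in self.drafts:
            del self.drafts[ident]
            self.spent.add(ident)
```

The `spent` set is the load-bearing detail: rejection of a used URL is an explicit fact the system remembers, not an inference from a missing row.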
Part 2: Per-draft Data Encryption Key, lazily cached
Every draft gets its own 256-bit Data Encryption Key (DEK), generated at draft creation. The draft’s payload is AES-encrypted with the DEK before it touches the database. On read, the DEK decrypts the payload in memory. At rest, the database sees only ciphertext and the DEK handle; neither half by itself is enough to recover the form data.
The DEKs themselves live in an in-memory cache on the API server, populated lazily on first access. The cache is bounded by size and TTL, evicts on draft expiry or submission, and — deliberately — is not warmed at startup. Cold-start latency for the first access to a draft is tens of milliseconds and completely invisible to users, who don’t notice the gap between typing their name and seeing the next field render.
The “per-draft DEK” part matters. A single master key encrypting every draft would mean a single compromise unlocks every user’s form. Per-draft DEKs mean a compromised key unlocks exactly one draft, which radically shrinks the blast radius of any single mistake. The cost is bookkeeping — you need a KEK (key-encrypting-key) to wrap the DEKs at rest — but that’s a rounding error compared to the reduction in failure modes.
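The cache behavior — lazy load, TTL, bounded size, explicit eviction — can be sketched without the cryptography itself. Here `unwrap` stands in for the fetch-and-KEK-unwrap call, and the parameter values are placeholders, not the system’s actual tuning:

```python
import time
from collections import OrderedDict

class DekCache:
    """Bounded, TTL'd, lazily populated per-draft DEK cache (sketch).
    `unwrap` is a stand-in for fetching and KEK-unwrapping the stored DEK."""

    def __init__(self, unwrap, max_size=1024, ttl=300.0, clock=time.monotonic):
        self.unwrap = unwrap
        self.max_size = max_size
        self.ttl = ttl
        self.clock = clock
        self._entries = OrderedDict()  # draft_id -> (dek, loaded_at)

    def get(self, draft_id):
        now = self.clock()
        entry = self._entries.get(draft_id)
        if entry and now - entry[1] < self.ttl:
            self._entries.move_to_end(draft_id)  # LRU touch
            return entry[0]
        dek = self.unwrap(draft_id)  # lazy: first access pays the cold-start cost
        self._entries[draft_id] = (dek, now)
        self._entries.move_to_end(draft_id)
        while len(self._entries) > self.max_size:
            self._entries.popitem(last=False)  # evict least recently used
        return dek

    def evict(self, draft_id):
        # Called on submit or expiry: the key must not outlive the draft.
        self._entries.pop(draft_id, None)
```

Nothing is warmed at startup; the first `get` for a draft is the only one that pays the unwrap cost.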
Part 3: Knowledge-based identity verification (scoped to the draft, not a global lookup)
A few fields on the first page of the form are designated as the knowledge check. Whatever the user types into those fields — and it’s legitimately whatever they type, the system doesn’t know the “right” answers ahead of time — becomes the verification secret for that specific draft. As soon as they submit page 1, the UI tells them plainly: “if you want to continue this form on another device, you’ll need to remember these values.” No fine print, no hidden gotcha — the user is informed at the moment the secret is created that it is a secret and they need to hang on to it.
(I’m going to stay generic about which fields, because the pattern works with anything the user controls and can realistically remember for a few days. Name-plus-date-of-birth. A chosen passphrase. Anything that sits naturally in the form’s first page and that the user isn’t going to forget between devices.)
When the user later opens the same URL on a different device, they’re asked to re-enter those values before the server will decrypt anything or issue a token.
The important detail — and the thing that makes this work at all: the identity check is scoped to the specific draft already identified by the URL, and it is only reachable when there is actually a draft to verify against. There is no “find a draft matching these facts” endpoint. The flow is always:
- User visits the URL → server looks up the identifier.
- If the identifier has been marked used (submitted form), or the draft has expired, or no draft has ever been created for this URL: the server behaves as if the URL is unknown. Generic not-found. No knowledge check, no hint that anything ever existed there.
- If a live, unexpired, unsubmitted draft exists for this URL, and the requesting device doesn’t already hold the current device token: now the knowledge check is presented, for that specific draft.
- User enters the values they set on page 1.
- Server compares against cheap-but-not-too-cheap hashes stored on the draft, outside the encrypted blob.
- If and only if they match, the server decrypts the draft and issues a device token.
Failed verification never touches the ciphertext. An attacker who somehow obtained a valid URL but doesn’t know the answers can bang on the verify endpoint forever and never get a single field back. Constant-time comparison prevents timing signals. Rate limits cap the total attempts. The response on failure is always the same generic message.
The “only reachable when a draft exists” part is worth pausing on. The knowledge check is not the front door. The front door is the URL itself, and that door is only open while a form-in-progress lives behind it. The instant the form is submitted, the door closes and stays closed — even for the legitimate user. This narrows the window an attacker has to work with to exactly the time between the user starting page 1 and the user finishing the form, and eliminates the “dig up an old email from last year” attack entirely.
The knowledge check alone is deliberately not strong — it’s a small number of values the user chose themselves and a determined attacker might plausibly guess. That’s fine, because the knowledge check is never alone. It is gated by the URL capability. An attacker has to defeat both layers, and the only plausible way to defeat the first layer is to actively intercept the out-of-band channel the link was delivered over (the user’s own SMS or email) during the narrow window before the user submits the form.
This is the two-factor separation worth highlighting: the URL proves “you were given access to this form by someone who had the right to give it,” and the knowledge check proves “you are the same person who started filling it out.” Each factor protects against the other’s failure mode. A leaked link alone is useless — you still need to match what the original user typed. Leaked personal guesses alone are useless — you still need the URL. You need both, and you need them within the window where the draft is still live.
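A “cheap-but-not-too-cheap” hash with a constant-time comparison might look like the sketch below. The choice of PBKDF2, the iteration count, and the lowercase-and-strip normalization are all assumptions for illustration, not the original system’s exact parameters:

```python
import hashlib
import hmac
import os

# Modest iteration count: expensive enough to blunt offline guessing,
# cheap enough for an interactive check. Illustrative value only.
ITERATIONS = 50_000

def hash_answers(answers):
    """Hash the page-1 knowledge-check values for storage on the draft
    record, outside the encrypted blob. `answers` is an ordered list."""
    salt = os.urandom(16)
    material = "\x1f".join(a.strip().lower() for a in answers).encode()
    digest = hashlib.pbkdf2_hmac("sha256", material, salt, ITERATIONS)
    return salt, digest

def verify_answers(answers, salt, expected):
    material = "\x1f".join(a.strip().lower() for a in answers).encode()
    digest = hashlib.pbkdf2_hmac("sha256", material, salt, ITERATIONS)
    # Constant-time comparison: no timing signal about how close a guess was.
    return hmac.compare_digest(digest, expected)
```

A failed `verify_answers` returns before any ciphertext is touched, which is the property L38 insists on.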
Part 4: Device tokens as single-writer locks
This is the part I was most proud of when it clicked.
When a device successfully unlocks a draft — either by being the original creator or by passing the URL + knowledge check on a resume — it gets a device token. The token is opaque, server-generated, bound to the specific draft, and the server only remembers one of them at a time per draft.
Every save endpoint — every one — checks the device token as part of the HTTP request. Not in the session, not in the cookie, not in the body — in a header, explicitly. If the token doesn’t match the one the server is currently holding for that draft, the save is rejected with 409 Conflict. The client gets an unambiguous error, not a silent success that corrupts data.
When a new device takes over, the server does two things atomically:
- Invalidates the previous device token (sets the draft’s “current token” to the new one).
- Generates and returns the new token to the new device.
From that instant, the previous device’s next save attempt will hit a 409. It doesn’t need to be told; the token is invalid the moment the new device authenticates.
This is a form of pessimistic locking, but it inverts the usual failure mode. In database row locks, the lock holder is assumed to be well-behaved and finish cleanly. Here, the previous lock holder is assumed to be gone — the phone is in someone’s pocket, the browser tab is closed, the app was killed. The lock is not waiting for the previous holder to release it; the new holder takes it, and the previous holder finds out on their next request (if there is one). No leases, no heartbeats, no timeouts. The lock transfers on the positive event (a successful handoff) rather than the negative one (lease expiry).
The opaqueness matters too. The token has no structure. It doesn’t embed a timestamp, a user ID, or a device fingerprint. It’s a random string the server looks up in a map. This means a leaked token cannot be forged or extended — if you want a new one, you go through the URL + knowledge gate again. And the server-side lookup means the “one writer per draft” constraint is enforced in exactly one place: the data structure that holds the current token for each draft ID. No distributed consensus to get wrong.
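That one enforcement point — the map from draft ID to current token — fits in a few lines. The class and method names are mine, not the system’s, and the dict stands in for whatever server-side store actually holds the tokens:

```python
import secrets

class TokenRegistry:
    """One current device token per draft: the single-writer check (sketch)."""

    def __init__(self):
        self._current = {}  # draft_id -> opaque token

    def take_over(self, draft_id):
        # Issued only after the URL + knowledge gate. Overwriting the entry
        # invalidates the previous holder in the same step.
        token = secrets.token_urlsafe(32)  # no structure: nothing to forge or extend
        self._current[draft_id] = token
        return token

    def check_save(self, draft_id, token):
        # Every save endpoint runs this; a mismatch maps to HTTP 409 Conflict.
        if self._current.get(draft_id) != token:
            return 409
        return 200
```

Note that the old token is never “revoked” as a separate operation — takeover and invalidation are the same write.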
Part 5: WebSocket rooms for device supersession
The missing piece: when a new device takes over, the old device needs to know. Otherwise its user is staring at a form they think they’re editing, typing answers whose rejection they won’t discover until a save finally comes back 409, and the experience is miserable. Token validation handles correctness; WebSocket supersession handles user experience.
Each draft ID has a WebSocket room. A device actively editing a draft joins the room when it loads. When the server authenticates a new device for that draft and issues a new device token, it also pushes a device_superseded event into the room. The previous device receives the event in real time and shows a friendly message: “this form is now being edited on another device.” The input fields go read-only. No confusion, no lost-work messages, no silent failures.
The room is keyed on the draft ID, not on the device or the user (since there is no user). Any device connected to the room — typically exactly one or two — gets the event. The room is cheap: a pub-sub channel with a single subscriber in the common case. It exists for the moment a handoff happens, and then nothing interesting flows through it until the next handoff or the draft is submitted.
The WebSocket is a UX layer, not a security layer. If the WebSocket is disconnected, the previous device simply doesn’t get the “you’ve been superseded” notification — but its next save still fails with 409. Security correctness doesn’t depend on the event being delivered. This is important: it means the WebSocket can drop, reconnect, run on a different node, or fail entirely, and the single-writer guarantee is still enforced by the token check. The WebSocket room is a nice-to-have that upgrades a correct-but-confusing experience into a correct-and-clear one.
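Stripped of the transport, the room logic is just per-draft fan-out. This sketch replaces actual WebSocket connections with callbacks — a stand-in of my own, chosen to show why delivery can be best-effort:

```python
from collections import defaultdict

class DraftRooms:
    """Rooms keyed on draft ID (sketch). Callbacks stand in for WebSocket
    connections; a device that has dropped simply has no callback here,
    misses the event, and still gets its 409 on the next save."""

    def __init__(self):
        self._rooms = defaultdict(dict)  # draft_id -> {device_id: callback}

    def join(self, draft_id, device_id, on_event):
        self._rooms[draft_id][device_id] = on_event

    def leave(self, draft_id, device_id):
        self._rooms[draft_id].pop(device_id, None)

    def supersede(self, draft_id, new_device_id):
        # Pushed alongside the token swap: everyone except the new holder
        # learns it should go read-only.
        for device_id, on_event in list(self._rooms[draft_id].items()):
            if device_id != new_device_id:
                on_event({"type": "device_superseded"})
```

The security-critical path never flows through this class — which is exactly the separation the paragraph above argues for.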
Part 6: Expiry and cleanup
Drafts automatically expire after a short window of inactivity (we use seven days, tuned to our flow — pick yours to match the real tail of legitimate resume attempts). A periodic cleanup job purges expired drafts, their device tokens, their URL bindings, and their cached DEKs in one pass. Submitted drafts are eligible for immediate cleanup.
The cleanup job is boring on purpose. It scans expired rows, deletes them, invalidates any matching device tokens, evicts any cached DEKs. No fancy coordination, no distributed scheduling — just one scheduled task on the API server.
Expiry doubles as a hard backstop on the capability URL’s lifetime. Even if submission never happens, the URL stops resolving within days no matter what. Defense in depth for a pattern where the depth is genuinely cheap.
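The whole cleanup pass is expressible as one function over the stores involved. All the argument names here are illustrative — plain dicts and a set stand in for the draft table, spent-identifier table, token registry, and DEK cache:

```python
import time

def cleanup(drafts, spent, tokens, dek_cache, ttl_seconds, now=None):
    """One boring pass: purge expired drafts and everything keyed on them.
    `drafts` maps identifier -> (payload, last_activity_timestamp)."""
    now = time.time() if now is None else now
    expired = [ident for ident, (_, last) in drafts.items()
               if now - last > ttl_seconds]
    for ident in expired:
        del drafts[ident]           # draft row gone
        spent.add(ident)            # URL explicitly remembered as spent
        tokens.pop(ident, None)     # device token invalidated
        dek_cache.pop(ident, None)  # cached DEK evicted
    return len(expired)
```

Marking the URL spent during expiry (not just deleting the draft) is what makes an aged-out URL indistinguishable from a submitted one.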
What this protects against (and what it doesn’t)
I want to be explicit about the threat model because knowing which attacks you’re defending against is half the hard part of any security design.
Protects against:
- Bystander exposure. A URL glimpsed on a screen, a browser history, or a receipt does nothing for a stranger — they still need to pass the knowledge check for that specific draft, which requires knowing what the original user typed on page 1.
- Database compromise. A dump of the draft table gets an attacker ciphertext and DEK handles; they still need the KEK to decrypt any of it, and even then they recover the stored data — no session tokens or credentials that would unlock anything else in the system.
- Enumeration attacks. The URL space is too large to enumerate, and there is no endpoint that returns a list of drafts matching a user. You cannot discover drafts; you can only resume one you already have a live URL for.
- Stolen URL after the form is done. Submission is terminal: the draft is deleted in the same transaction that accepts the submission, the DEK is evicted, and the URL’s identifier is flagged as used in a separate table that outlives the draft storage. A URL for a completed form doesn’t just fail to find anything — the server explicitly remembers that identifier as spent. Intercepting a link from last Tuesday’s email archive, after the user has already finished, buys nothing. Even a backup-restore of the draft row wouldn’t bring the URL back to life.
- Targeted attacker who knows the user’s personal facts but not the URL. The knowledge check is never the entry point — it is only reachable through a live URL. An attacker with the facts but no URL has no endpoint to attack.
- Concurrent-edit data corruption. Two devices cannot both successfully save to the same draft. The token check makes it impossible.
- Zombie device saves. A phone that’s been closed and reopened can’t save stale data over the desktop’s in-progress work — its token is already invalid.
Does not protect against:
- An attacker who intercepts the out-of-band channel that delivers the URL and also knows the knowledge-check facts, during the narrow window before the legitimate user resumes. This is the real residual risk, and it’s small in practice because the URL is delivered over a channel the attacker typically does not control (the user’s own SMS or email), and the resume window is short. The mitigation is “make the delivery channel itself trusted.” That’s true of any system that delivers credentials out-of-band, and it’s a known, bounded risk.
- Compromise of the API server process. If an attacker gets code execution on the API server, they can read the DEK cache out of memory and decrypt every draft that’s been accessed recently. This is the fundamental limit of any in-memory key system; the mitigation is the usual defense-in-depth (short cache TTL, restricted network, hardened runtime, audit logging).
- Two tabs on the same browser. A user who opens the same URL twice becomes their own concurrent writer. The token check prevents both tabs from saving — only the most recently authenticated one can save — but it’s worth saying out loud that the “single writer” is enforced at the network level, not at the human level.
The capability URL is doing most of the work here; the knowledge check catches the narrow case where the URL leaks before it’s consumed. The two layers together are meaningfully harder to defeat than either would suggest on its own.
When not to build this
A few cases where this pattern is overkill:
- Short forms (under a couple of minutes). The overhead of draft, resume, and capability URL handling buys nothing when the user will finish in one sitting. Skip the draft system entirely.
- Single-session flows. Timed assessments, one-shot questionnaires, anything where “pick it up later” is explicitly not a feature — don’t build drafts. The complexity is pure cost.
- Non-sensitive data. If the form doesn’t contain anything that needs to be encrypted, drop the DEK layer. Server-side drafts with a session cookie are fine, and you still get the rest of the pattern if you want it.
- Authenticated users. If users have accounts, drop all of this. Standard session management, per-user draft storage, you’re done. This pattern is specifically for the no-account case.
The short version
We needed cross-device continuity for forms that have no user accounts, contain sensitive data, and cannot tolerate concurrent writes. The combination is rare enough that the usual collaborative-editing and session-management tools didn’t fit.
The six pieces: a random capability URL delivered out-of-band that is the only way to reach the form and dies the moment the form is submitted or ages out; per-draft AES encryption with a lazy DEK cache created when the user starts page 1; a knowledge check against values the user themselves entered on that first page, scoped to the specific draft identified by the URL; opaque server-held device tokens validated on every save; WebSocket rooms keyed on draft ID that push a device_superseded event when a handoff happens; and a cleanup job that purges expired drafts and their associated keys and tokens.
The design leans on two observations.
The first is that correctness and UX can be enforced by different systems. The single-writer guarantee is enforced by the token check at the HTTP layer, which is deterministic and unambiguous. The “you’ve been taken over” experience is handled by a WebSocket event, which is best-effort. Mixing those two concerns into one mechanism would make the system either less correct or less pleasant. Separating them gives you a correct system with a nice experience on top.
The second is that a capability URL + a light knowledge check is quietly stronger than either factor would be alone. The URL cannot be enumerated or guessed, lives only for the life of one specific form, and dies the moment that form is done — so a leaked URL has a narrow shelf life, and a URL whose form has already been submitted is indistinguishable from a random string. The knowledge check cannot be queried globally and is only reachable through a live URL, so leaked personal guesses have nowhere to land. Each layer’s weakness is covered by the other. It isn’t as strong as real authentication against a real identity provider — but for the set of threats a draft system for unauthenticated users actually faces, it’s a better fit than pretending you had accounts all along.
If you ever find yourself in a similar corner — collaborative features without collaborative users, encryption without accounts, single-writer guarantees without sessions — this is one shape that fits. Your constraints will be different; mine were, until they weren’t.
Related reading:
- A survey of CRDTs — the other end of the collaborative-editing spectrum, with a great explanation of when merge semantics work and when they don’t
- Capability-based security (Wikipedia) — the older-than-the-web idea that an unguessable reference can be the authorization, not just point to something that is authorized elsewhere
- Jepsen: Consistency Models — useful for thinking about what guarantees you actually need before reaching for a consensus algorithm