Zero-Knowledge File Transfer: A Practical Taxonomy

What "encrypted" actually means across different file transfer architectures - the threat models, the tradeoffs, and where each approach genuinely protects you.

Introduction

The phrase "end-to-end encrypted" appears on the marketing pages of nearly every file transfer service. So does "zero-knowledge." So does "your files are private." These phrases are doing different amounts of work in different products, and the differences matter - they determine what actually happens to your file in the case of a server compromise, a subpoena, a malicious insider, or a network-level attack.

This guide is a taxonomy of the architectural approaches actually used by production file transfer services in 2026. It defines terms precisely, walks through each approach with its threat model honestly stated, and compares what each actually protects against - and what it doesn't.

It is written for technical readers who want to understand, rather than be reassured. It cites primary sources where possible.

Definitions, precisely

The terminology in this space is used loosely by vendors. It's worth establishing what each term actually means before comparing architectures.

Encryption in transit. Data is encrypted while it travels across the network - typically using TLS. This protects against passive network eavesdroppers but does nothing about what happens once the data reaches the server. Effectively every modern service does this; it is table stakes, not a feature.

Encryption at rest. Data is encrypted while stored on disk on the server. This protects against an attacker who steals the physical disk but does not protect against an attacker who compromises the running server, because the server holds the keys to decrypt the stored data. Like encryption in transit, this is now standard.

End-to-end encryption (E2EE). Data is encrypted on the sender's device with a key the server never sees in plaintext, transmitted as ciphertext, and decrypted only on the receiver's device. The server may transport the data and store it temporarily, but it never has access to plaintext or to the keys. This is the property that "Signal-style" messaging is famous for.^[1]

Zero-knowledge. A stronger property than E2EE. Zero-knowledge means the server has no useful information about what it is storing or transmitting beyond what is strictly necessary for routing - not the contents, not the keys, not enough metadata to reconstruct who-talked-to-whom or what-was-sent-when. A service can be E2EE without being zero-knowledge if it logs metadata or holds backup keys for "account recovery."^[2]

Forward secrecy. Compromise of a long-term key does not expose past sessions, because each session's key is derived from ephemeral material that is discarded once the session ends. Most modern key-exchange protocols (TLS 1.3, the Signal protocol, Magic Wormhole's PAKE) provide forward secrecy. File transfer services often do not, because they reuse long-lived account keys.

Server compromise resistance. This is the property that matters most in practice and is rarely advertised directly. Given an attacker who fully compromises the server (root access, all stored data, all logs, all running processes), what can they recover? In a zero-knowledge architecture, they recover nothing useful. In a conventional encrypted architecture, they recover everything that was on the server at the moment of compromise.

The honest test for any "encrypted" file transfer service is: if your relay server is compromised tonight, what does the attacker walk away with?

Architecture 1: Server-side encryption (Dropbox, Google Drive, OneDrive)

The most common architecture in mass-market file transfer is also the weakest from a privacy standpoint. Services like Dropbox, Google Drive, OneDrive, iCloud Drive, and most email attachment systems use server-side encryption. This is the default for almost everything most users do.

The mechanism is straightforward. The user uploads a file via TLS. The service encrypts the file on its servers with keys the service controls. The encrypted file sits at rest in the service's storage. When the user (or someone the user shared with) wants to access the file, the service decrypts it server-side and serves it back via TLS.

The properties this provides:

Protects against: Passive network eavesdroppers (TLS handles that). Physical theft of storage hardware (disks are encrypted at rest). Casual operational access by employees (most providers have access controls and audit logging).

Does not protect against: The service provider itself. Anyone with sufficient access at the provider - privileged employees, the operations team, anyone who compromises the running infrastructure - can decrypt and read the files. Government subpoenas can compel the provider to hand over plaintext. Server compromise can reveal plaintext for everything stored, not just what was being processed at the time, because the decryption keys live on the same infrastructure.

The honest framing: this architecture treats the service provider as part of the trusted base. You are not protecting your files from the provider; you are accepting that the provider has full access and trusting them to behave responsibly.

For most use cases, this is fine. People share vacation photos, share Google Docs with colleagues, send PDFs through email. The threat model - random network attackers, casual data theft - is well-handled. But for sensitive documents (legal materials, medical records, journalistic source material, anything subject to attorney-client privilege or regulated under HIPAA), server-side encryption is meaningfully weaker than what's actually needed.

Architecture 2: Client-side encryption with provider-held keys (most "secure" services)

Several services market themselves as "secure" or "encrypted" while operating in a gray zone between server-side encryption and true end-to-end encryption. The pattern is that data is encrypted client-side before upload - but the service still holds (or can derive) the keys.

A common implementation: the user creates an account with a password. The service derives an encryption key from the password and uses it to encrypt files client-side before upload. The service stores the encrypted files and a hash of the password.

This is better than server-side encryption in some respects - a casual operational compromise does not yield plaintext. But it has subtle weaknesses:

Password-based key derivation is the weakest link. If the password has low entropy, the encryption key has low entropy. An attacker who exfiltrates the encrypted file blob and the password hash can attempt to brute-force the password and recover the encryption key. Stretching with PBKDF2 or Argon2 raises the cost of each guess but adds no entropy, so an account password never reaches the strength of a randomly generated cryptographic key.
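To make the entropy argument concrete, here is a minimal sketch using only Python's standard library. The function names, the example password, and the iteration count are illustrative assumptions, not any particular service's parameters. It shows that the derived key is 256 bits long while the attacker's search space is still bounded by the password.

```python
import hashlib
import math
import os


def password_entropy_bits(alphabet_size: int, length: int) -> float:
    """Upper bound on the entropy of a password chosen uniformly at random."""
    return length * math.log2(alphabet_size)


def derive_key(password: str, salt: bytes, iterations: int = 100_000) -> bytes:
    """Stretch a password into a 256-bit key with PBKDF2-HMAC-SHA256.

    The iteration count is an illustrative assumption, not a recommendation.
    """
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=32
    )


salt = os.urandom(16)
key = derive_key("correct horse", salt)

# The key is 256 bits long, but an attacker only has to search the password space.
print(len(key) * 8)                             # 256
print(round(password_entropy_bits(26, 8), 1))   # 37.6 (8 random lowercase letters)
# Stretching adds roughly log2(iterations) bits of work per guess, not entropy.
print(round(math.log2(100_000), 1))             # 16.6
```

Even counting the stretching cost as extra bits, an 8-letter lowercase password lands near 54 bits of attacker work, far short of the 128 or 256 bits a randomly generated key provides.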

The provider often has key recovery mechanisms. "Forgot your password?" features that let you recover access typically mean the provider holds enough key material to enable that recovery. If they hold it, an attacker who compromises them holds it.

Sharing breaks the model. If you share a file with another user via a shareable link, the service has to make the key available to whoever has the link - which usually means the service holds the key in some form.

This architecture is sometimes still called "zero-knowledge" by its vendors. The honest claim is that it is "client-side encrypted with caveats." The threat model is meaningfully better than server-side encryption, but the trust assumption is still that the provider behaves correctly and that the password was strong.

Architecture 3: URL fragment key delivery (phntm.sh, Mega)

A more interesting architecture uses URL fragments to deliver keys out-of-band. The pattern is:

The sender's device generates a random encryption key
The file is encrypted client-side with that key
The encrypted file is uploaded to a relay server
The relay returns a URL containing the file identifier - but the encryption key is appended to the URL as a fragment (the part after #)
The recipient opens the URL; their browser fetches the encrypted blob from the relay; the JavaScript on the page reads the key from the URL fragment; the file is decrypted client-side

The clever property: URL fragments are never sent to servers in HTTP requests. When a browser navigates to https://example.com/file?id=abc#key=xyz, the server sees the request for ?id=abc but the #key=xyz part stays client-side. This means the relay never receives the key in any request it serves. Whether the channel that carried the URL to the recipient (email, messaging app) saw the key is a separate question, addressed below.
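The split between what the relay receives and what stays in the browser can be sketched with Python's standard URL parser. The host relay.example and the helpers build_share_url and request_target are hypothetical names for illustration; a real service will have its own URL layout and key encoding.

```python
import base64
import secrets
from urllib.parse import urlsplit


def build_share_url(base: str, file_id: str, key: bytes) -> str:
    """Put the file identifier in the query and the key in the fragment."""
    key_text = base64.urlsafe_b64encode(key).decode("ascii").rstrip("=")
    return f"{base}/file?id={file_id}#key={key_text}"


def request_target(url: str) -> str:
    """Path and query as they appear on the HTTP request line (no fragment)."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return target


key = secrets.token_bytes(16)  # random 128-bit key generated on the sender's device
url = build_share_url("https://relay.example", "abc", key)

print(request_target(url))      # /file?id=abc   <- all the relay is asked for
print(urlsplit(url).fragment)   # key=...        <- never leaves the browser
```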

Mega.nz pioneered this approach in 2013. phntm.sh is a more recent example.^[3] It is a well-designed architecture for what it does: the relay holds only ciphertext, the key only exists in the URL the user controls.

But it has subtle vulnerabilities that are worth understanding:

Analytics scripts can leak fragments. Some analytics tools - Vercel Analytics, Umami, Plausible - read location.href client-side, which includes the fragment. If they then send that to their analytics endpoint, the key has effectively been transmitted to a server. This was exposed in a public Reddit thread in 2026 when a phntm.sh user identified that Vercel Analytics was leaking the fragment to Vercel's servers.^[4] The author patched the issue, but the architectural risk remains: any third-party script with access to location.href is a potential leak point.
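The leak is easy to model. The sketch below is not the code of any named analytics product; naive_pageview and safe_pageview are hypothetical helpers that show the difference between reporting the full page address and stripping the fragment before anything leaves the page.

```python
from urllib.parse import urldefrag


def naive_pageview(href: str) -> dict:
    """What a script that reports the full page address would send."""
    return {"event": "pageview", "url": href}


def safe_pageview(href: str) -> dict:
    """Same event, with the fragment removed before it is sent anywhere."""
    return {"event": "pageview", "url": urldefrag(href).url}


href = "https://example.com/file?id=abc#key=xyz"

print(naive_pageview(href)["url"])  # https://example.com/file?id=abc#key=xyz
print(safe_pageview(href)["url"])   # https://example.com/file?id=abc
```

Stripping the fragment in first-party code only helps for scripts the page author controls; a third-party script can still read the address itself.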

The user has to share the URL through some channel. That channel - email, Slack, SMS, WhatsApp - sees the URL and therefore sees the key. The threat model is now "trust the channel through which the URL is shared." If the user emails the URL, the email provider sees the key. If they paste it into Slack, Slack sees it.

The URL is reusable until expiry. Anyone who obtains the URL can decrypt the file. There is no proximity property - the recipient could be anywhere in the world, and so could anyone they forward the URL to.

URL fragment delivery is a valid architecture for "encrypted file transfer where the relay can never read the file." It is not a good architecture if the threat model includes "the channel through which the link is shared" or "third-party scripts running in the browser."

Architecture 4: PAKE-based code exchange (Magic Wormhole)

Magic Wormhole, originally developed by Brian Warner, takes a different approach. The mechanism uses Password-Authenticated Key Exchange (PAKE) to derive a strong encryption key from a short human-readable code shared between sender and receiver.^[5]

The flow:

The sender's tool connects to a "rendezvous server" (mailbox server) and is assigned a numeric nameplate
The user reads out a short code consisting of the nameplate plus two random words from a wordlist (e.g., 7-crossover-clockwork)
The receiver's tool, given the same code, connects to the same rendezvous server
Both sides perform a SPAKE2 PAKE protocol. SPAKE2, due to Abdalla and Pointcheval, allows two parties who share a low-entropy password to derive a strong shared key, with the property that an active man-in-the-middle attacker gets only one guess at the password before being detected
The shared key is used to encrypt all subsequent traffic
Where possible, the two clients establish a direct connection and transfer the file peer-to-peer; where direct connection is impossible, traffic relays through a transit relay server
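The shape of the short code in the flow above can be sketched as follows. This is an illustration only, not Magic Wormhole's implementation: WORDS is a hypothetical placeholder list of 256 entries standing in for the real wordlist, make_code is a hypothetical helper, and in the real tool the nameplate is assigned by the rendezvous server rather than passed in.

```python
import secrets

# Placeholder wordlist with 256 entries (8 bits per word). Hypothetical:
# the real tool ships its own list of pronounceable words.
WORDS = [f"word{i:03d}" for i in range(256)]


def make_code(nameplate: int) -> str:
    """Combine a nameplate with two randomly chosen words."""
    first = secrets.choice(WORDS)
    second = secrets.choice(WORDS)
    return f"{nameplate}-{first}-{second}"


code = make_code(7)
print(code)             # e.g. 7-word113-word042
print(len(WORDS) ** 2)  # 65536 possible word pairs, i.e. 16 bits
```

The nameplate only routes both sides to the same mailbox; the secret part of the code is the word pair, which is what the PAKE step turns into a strong key.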

Properties:

The rendezvous server cannot decrypt traffic. It only sees ciphertext, and the keys are derived from the PAKE exchange.

An active attacker gets only one guess. The default 16-bit code gives the attacker a 1-in-65,536 chance of successfully impersonating one side per attempt. Each code is single-use, limiting attack opportunities. If an attacker guesses wrong, both legitimate parties detect the failure.
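The one-guess property is what makes a 16-bit secret tolerable. A short worked calculation, assuming each attempt is an independent single guess against a fresh code (success_probability is a hypothetical helper):

```python
CODE_SPACE = 2 ** 16  # 16-bit code: 65,536 possibilities


def success_probability(attempts: int) -> float:
    """Chance that at least one of `attempts` independent single guesses is right."""
    return 1.0 - (1.0 - 1.0 / CODE_SPACE) ** attempts


for n in (1, 100, 10_000):
    print(n, f"{success_probability(n):.4%}")
```

Under these assumptions an attacker needs on the order of a hundred interrupted transfers to reach even a 0.15% chance of success, and every failed attempt is visible to the legitimate parties.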

Forward secrecy by design. A key compromised after a session does not affect prior or future sessions; new codes derive new keys.

The drawbacks for general consumer use:

Both sides need the same software. Magic Wormhole is a CLI tool primarily; there are GUI wrappers but they are not widely deployed. It is not a "click a link in a browser" experience.

The code has to be communicated through some channel. This is the same problem as URL fragment delivery - whatever channel transmits the code (verbal, SMS, email) needs to be trusted, or trusted enough that one shot at impersonation is acceptable.

No proximity property. Like URL fragments, the code can be communicated to anyone anywhere in the world. The "presence" element is purely social.

PAKE is excellent cryptography - arguably the cleanest construction in the file transfer space. Its real limitation is that it remains a developer tool. The user experience friction is high enough that most non-technical users will not adopt it, regardless of its security properties.

Architecture 5: Acoustic key delivery (chirpfile)

A different approach uses the physical channel of sound itself to deliver the encryption key. The architecture:

The sender's device generates a random encryption key
The file is encrypted client-side with that key (AES-128-GCM, in chirpfile's case)
The encrypted blob is uploaded to a relay server
The decryption key is encoded as an FSK audio signal via the ggwave protocol and played through the sender's speaker
The receiving device, in the same room, captures the audio via its microphone and decodes the key
The receiving device fetches the encrypted blob from the relay and decrypts it locally
The relay deletes the blob after first download (burn-after-read)
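The relay's side of this flow is deliberately small. The following is a minimal in-memory sketch of burn-after-read storage; BurnAfterReadRelay and its methods are hypothetical names, and nothing here is taken from chirpfile's code. The point is that the relay stores opaque bytes and forgets them on first download.

```python
import secrets
from typing import Dict, Optional


class BurnAfterReadRelay:
    """Stores opaque ciphertext blobs and deletes each one on first download."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def upload(self, ciphertext: bytes) -> str:
        """Store a blob and return an unguessable identifier for it."""
        blob_id = secrets.token_urlsafe(16)
        self._blobs[blob_id] = ciphertext
        return blob_id

    def download(self, blob_id: str) -> Optional[bytes]:
        """Return the blob and remove it in one step; None if already burned."""
        return self._blobs.pop(blob_id, None)


relay = BurnAfterReadRelay()
blob_id = relay.upload(b"opaque ciphertext bytes")

print(relay.download(blob_id))  # b'opaque ciphertext bytes'
print(relay.download(blob_id))  # None
```

The relay never takes a key as input at any point, which is why a dump of its storage yields only ciphertext.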

The technical details of the acoustic transmission protocol are covered in our acoustic data transmission guide. What's relevant here is the security model.

Properties:

The relay never sees the key. The key exists only on the sender's device, as an audio signal in the room, and as bytes in the receiving browser's memory. It never crosses the internet in any form.

Physical presence is a cryptographic requirement. Sound attenuates rapidly with distance - roughly 6 dB per doubling, faster in cluttered environments. Acoustic transmissions are confined to the immediate room. A device that was not physically present to hear the chirp cannot recover the key. This is not a policy. It is a property of how sound propagates.
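The 6 dB figure comes from spherical spreading, where sound pressure level falls by 20·log10(d/d0) between a reference distance d0 and a distance d. A small sketch of that formula follows; it models free-field spreading only, so walls, doors, and background noise (which do most of the work of confining a chirp to a room) are not included, and spreading_loss_db is a hypothetical helper.

```python
import math


def spreading_loss_db(distance_m: float, reference_m: float = 1.0) -> float:
    """Free-field level drop relative to the reference distance, in dB."""
    return 20.0 * math.log10(distance_m / reference_m)


for d in (1, 2, 4, 8, 16):
    print(d, round(spreading_loss_db(d), 1))
# 1 0.0
# 2 6.0
# 4 12.0
# 8 18.1
# 16 24.1
```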

Server compromise leaks nothing useful. A full breach of the chirpfile relay reveals only encrypted blobs. The keys for those blobs were never on the server in any form, so even total compromise does not enable decryption.

No account, no metadata, no session linkability. chirpfile assigns no account, logs no email, and burns the encrypted blob after first download. There is nothing to subpoena that would reveal what was sent or by whom.

Drawbacks worth being honest about:

Acoustic interception is theoretically possible. A microphone in the room - a smart speaker, a phone left on the table, a hidden recording device - could capture the chirp. The attack surface is real but narrow: the attacker needs physical presence at the moment of transmission. This is much smaller than the attack surface for network-based key delivery.

Browser memory is a transient attack surface. During decryption, the key briefly exists in the receiving browser's memory. A malicious browser extension with sufficient permissions could in principle capture it. This is true of any browser-based cryptographic system and is not specific to acoustic delivery.

Hardware reliability varies. Different speaker and microphone combinations decode acoustic signals with different reliability. The protocol can fall back from ultrasonic to audible mode when reliability degrades, but acoustic transmission is not bit-perfect across all hardware combinations.

No remote use. This is a feature for proximity transfer; it is a hard limit for any use case where sender and receiver are not in the same room. Acoustic delivery is not a general-purpose file transfer mechanism - it is specifically suited to in-room transfer.

The combination of these properties is unique among production file transfer architectures. Acoustic key delivery is the only one that combines a relay-based architecture (so both devices can be on different networks), zero-knowledge (so server compromise reveals nothing decryptable), and a cryptographic proximity guarantee (so the file cannot be received outside the immediate room).

Comparing the architectures honestly

Some properties are easy to compare in a table; others require nuance. Here's the honest comparison:

Architecture	Server sees plaintext	Server holds keys	Channel for key	Network requirements	Proximity guarantee
Server-side encrypted (Drive, Dropbox)	Yes	Yes	N/A	Internet	None
Client-side encrypted, provider-held keys	No (in transit)	Effectively yes	Account password	Internet	None
URL fragment key (phntm.sh, Mega)	No	Never	URL itself, via user's chosen channel	Internet	None
PAKE code exchange (Magic Wormhole)	No	Never	Short code, communicated separately	Internet	None
Acoustic key delivery (chirpfile)	No	Never	Sound (in-room)	Internet	Yes (physical)
Local network P2P (Snapdrop, LocalSend)	N/A (no server)	N/A	mDNS discovery	Same LAN	Implicit (LAN)
AirDrop / Quick Share	N/A (no server)	N/A	Direct radio handshake	Direct radio range	Implicit (radio range)

A few comments on this table:

"Server holds keys" is sometimes ambiguous. Services that claim "zero-knowledge" while offering account recovery are effectively holding keys, even if they describe the architecture differently. The honest test is: can the service decrypt your data given enough motivation? If yes, they hold keys.

"Proximity guarantee" is the column that separates social proximity from cryptographic proximity. AirDrop, Quick Share, and chirpfile all require physical presence - but for AirDrop and Quick Share, presence is enforced by radio range, while for chirpfile it is enforced by the acoustic channel itself. Magic Wormhole and URL fragment systems can claim social proximity (the user only shares the code or URL with someone they trust to be present) but cannot enforce it.

The interesting non-obvious property: acoustic key delivery is the only architecture that combines all of "no server access to plaintext," "no key on the server ever," "works across different networks," and "physical proximity is cryptographically required." This is not a marketing claim - it is what falls out of the comparison when you list the properties precisely.
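The claim can be checked mechanically by encoding the table as data and filtering on the four properties. The encoding below is one possible reading of the table, with assumptions stated in the comments: "N/A (no server)" is stored as None, and "Implicit" proximity is kept distinct from "none" and "physical".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Architecture:
    name: str
    server_sees_plaintext: Optional[bool]  # None means no server is involved
    server_holds_keys: Optional[bool]      # None means no server is involved
    network: str                           # "internet", "same LAN", "direct radio"
    proximity: str                         # "none", "implicit", or "physical"


TABLE = [
    Architecture("Server-side encrypted", True, True, "internet", "none"),
    Architecture("Client-side, provider-held keys", False, True, "internet", "none"),
    Architecture("URL fragment key", False, False, "internet", "none"),
    Architecture("PAKE code exchange", False, False, "internet", "none"),
    Architecture("Acoustic key delivery", False, False, "internet", "physical"),
    Architecture("Local network P2P", None, None, "same LAN", "implicit"),
    Architecture("AirDrop / Quick Share", None, None, "direct radio", "implicit"),
]


def has_all_four(a: Architecture) -> bool:
    """Relay that never sees plaintext or keys, any network, proximity enforced."""
    return (
        a.server_sees_plaintext is False
        and a.server_holds_keys is False
        and a.network == "internet"
        and a.proximity != "none"
    )


print([a.name for a in TABLE if has_all_four(a)])  # ['Acoustic key delivery']
```

Relaxing the network condition admits the LAN and radio rows only if their missing server is counted as satisfying the first two conditions, which is a definitional choice rather than a property of the systems.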

Threat models, by architecture

A more honest way to compare these architectures is to walk through specific threats and see how each handles them.

Threat: passive network eavesdropper. All architectures handle this through TLS. None are vulnerable to passive interception of in-transit data.

Threat: server compromise. Server-side encryption fails completely - all plaintext is exposed. Client-side encryption with provider-held keys is mostly compromised - the provider's keys are presumably accessible. URL fragment, PAKE, and acoustic architectures all fail gracefully - the attacker recovers only ciphertext for which they don't have keys.

Threat: subpoena to the service provider. Server-side: provider can comply by handing over plaintext. Client-side with provider-held keys: provider can comply by handing over the keys derived from the user's account. URL fragment, PAKE, acoustic: provider literally cannot comply for past traffic because no keys are or were ever held by the provider.

Threat: malicious insider at the service provider. Server-side: insider has full access. Client-side with provider-held keys: insider has access through the provider's key infrastructure. URL fragment, PAKE, acoustic: insider has access only to ciphertext.

Threat: man-in-the-middle attack on the key delivery channel. URL fragment: depends entirely on the channel through which the URL is shared. PAKE: cryptographically resistant - attacker gets one guess and is detected if they guess wrong. Acoustic: requires physical presence in the room, which is a much smaller attack surface than network MITM.

Threat: malicious browser extension or compromised endpoint. All architectures are vulnerable to varying degrees. If the receiving device is compromised, anything the legitimate user can decrypt, the attacker can also decrypt. This is a fundamental limit of any system.

Threat: forwarding by recipient. Server-side, client-side with provider-held keys, URL fragment: trivially forwarded. PAKE: code can be forwarded; once the session is established, the file can be forwarded (PAKE doesn't bind to identity). Acoustic with burn-after-read: the encrypted blob is deleted after first download, so a forwarded blob reference or a recording of the chirp is useless once the legitimate download has happened. The recipient could of course forward the decrypted file - no architecture can prevent that - but the transfer mechanism itself cannot be replayed.

Compliance and regulated data

For organizations handling regulated data - HIPAA-covered health information, attorney-client privileged material, GDPR-protected personal data - the architecture matters in specific ways that vendor marketing does not always make clear.

HIPAA does not strictly require zero-knowledge architecture, but does require that any third party handling protected health information (PHI) sign a Business Associate Agreement (BAA). A service that operates in zero-knowledge mode holds only ciphertext that it cannot read, which reduces what a breach at the provider can expose. That does not by itself remove the BAA requirement: HHS guidance on cloud computing treats a provider that stores encrypted PHI as a business associate even when it holds no decryption key. Covered entities should confirm the position with counsel rather than assume zero-knowledge alone satisfies HIPAA.

Attorney-client privilege is preserved most cleanly by architectures where no third party can access the communication. Server-side encryption arguably weakens the privilege claim because the service provider can technically access the materials. Zero-knowledge architectures avoid that problem: the provider holds nothing readable, so even a court order served on the provider yields only ciphertext.

GDPR requires data minimization and security appropriate to the risk. Zero-knowledge architectures align well with GDPR's principles because the service provider is not a data processor for the file contents in any meaningful sense. They are still a data processor for whatever metadata they retain (account info, IP addresses, etc.), but the file contents themselves are not "processed" by the provider.

Audit trails are a complicating factor. Many regulated environments require demonstrable audit trails of who accessed what, when. Zero-knowledge architectures cannot provide content-aware audit trails - they don't see the content. They can still provide transmission audit trails (this user uploaded a blob of this size at this time, this user downloaded it), which may be sufficient for some compliance requirements but not others.
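A transmission audit record of the kind described here can be built from what a zero-knowledge relay does see. The sketch below is illustrative: the TransferEvent fields and the record_event helper are hypothetical, and including a SHA-256 digest of the ciphertext is an added assumption that lets an auditor tie upload and download events to the same blob without learning its content.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class TransferEvent:
    actor: str        # whatever identifier the relay has for the user
    action: str       # "upload" or "download"
    blob_sha256: str  # digest of the ciphertext, not of the plaintext
    size_bytes: int
    timestamp: str    # ISO 8601, UTC


def record_event(actor: str, action: str, ciphertext: bytes) -> TransferEvent:
    """Build an audit record from ciphertext alone; no key or plaintext needed."""
    return TransferEvent(
        actor=actor,
        action=action,
        blob_sha256=hashlib.sha256(ciphertext).hexdigest(),
        size_bytes=len(ciphertext),
        timestamp=datetime.now(timezone.utc).isoformat(),
    )


event = record_event("user-123", "upload", b"\x00" * 1024)
print(json.dumps(asdict(event), indent=2))
```

Note the tension with the no-metadata property described earlier: every field recorded here is something that can later be subpoenaed, so a service has to choose how much of this trail it keeps.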

The honest summary: zero-knowledge architectures are well-suited to regulated data when the regulation prioritizes confidentiality, and less suited when the regulation prioritizes auditability of content. Most regulations care more about confidentiality, but the tradeoff is real.

What this means for choosing an architecture

Choosing a file transfer architecture honestly comes down to threat model:

If your threat model is "casual network attackers and accidental data exposure," server-side encrypted services (Drive, Dropbox) are perfectly adequate. The conventional architecture handles this threat model well, and the convenience tradeoffs (search, preview, sharing flexibility) often outweigh the privacy ones.

If your threat model is "the service provider as an adversary, or a server compromise that leaks everything," you need at minimum end-to-end encryption with keys never held by the provider. URL fragment, PAKE-based, and acoustic architectures all qualify; conventional "zero-knowledge" services with account-derived keys are a weaker version of this.

If your threat model includes "the channel through which the recipient learns about the file," URL fragment architectures are vulnerable. PAKE and acoustic architectures handle this better - PAKE through cryptographic detection of MITM, acoustic through physical channel constraint.

If your threat model requires "physical presence as a hard requirement for decryption," only acoustic key delivery currently provides this. AirDrop and Quick Share provide it via radio range but at the cost of restrictive ecosystem and network requirements; chirpfile provides it via acoustic range without those constraints.

If your threat model is "regulated data, attorney-client privilege, journalistic source protection," zero-knowledge architectures preserve confidentiality most cleanly. The specific choice between URL fragment, PAKE, and acoustic depends on the user experience constraints of your environment.

There is no single best architecture. There is, for any given use case, an architecture that matches the threat model honestly and one or more that overstate or understate the protection they provide. Understanding the differences is the only way to choose well.

Conclusion

"Encrypted" is a marketing word. The architectural reality of any specific service is what determines what actually happens to your file. Five architecturally distinct approaches dominate the production landscape in 2026: server-side encryption, client-side encryption with provider-held keys, URL fragment key delivery, PAKE-based code exchange, and acoustic key delivery. Each handles a different set of threats well and a different set poorly.

The vendor pages describing these services tend to elide the differences. The differences are real. A server compromise that exposes plaintext for server-side encrypted files reveals nothing decryptable for any of the latter three architectures. A subpoena that compels a provider to hand over plaintext can be complied with by server-side and client-side-with-provider-held-keys architectures, but cannot be complied with by zero-knowledge architectures even if the provider wanted to.

For most users, server-side encryption is genuinely good enough. For users whose threat model includes the service provider, the network, or the channel through which the recipient learns about the file, choosing the right architecture matters - and choosing the right architecture requires understanding what the architectures actually do, not what they're marketed as.