The secure transfer of digital payloads between unassociated, physically proximate devices remains a friction-heavy problem. Existing solutions typically depend on shared local area networks (LANs), proprietary operating system ecosystems, centralized cloud accounts, or manual out-of-band authentication codes. These approaches introduce security trade-offs or significant usability barriers in modern enterprise environments where network segmentation, VPN routing, or device management policies prevent traditional peer-to-peer connections.
This paper proposes a hybrid architecture termed Acoustic Cryptographic Binding (ACB). The system combines browser-side end-to-end encryption with an automated Out-of-Band (OOB) key exchange using short-range audio signals. Encrypted payloads are transmitted through a network relay while the cryptographic key required for decryption is delivered through a brief acoustic transmission detectable only by nearby devices.
By separating payload transport from key distribution, the system enables zero-configuration file exchange across isolated networks while maintaining a zero-knowledge relay model. The server never possesses the cryptographic keys required to decrypt transferred data.
The sender encrypts the payload locally and uploads the ciphertext to a blind relay server. The decryption material and blob identifier are transmitted through a short acoustic signal to nearby devices. The receiver retrieves the encrypted blob from the relay and decrypts it locally. Optional server-side expiry or single-use deletion limits ciphertext retention.
Transferring files between devices located in the same physical space but operating on different networks is often surprisingly difficult.
A common example involves transferring a file from a smartphone on a cellular network to a corporate laptop connected to a VPN-secured Wi-Fi network. Under such conditions many common sharing tools fail because they assume a shared network environment or require accounts and persistent cloud storage.
When these tools fail, users resort to unofficial transfer methods, including emailing files to themselves, uploading files to third-party cloud drives, or using external storage devices. These workarounds introduce security risks, persistent data artifacts, or additional friction.
The fundamental challenge is establishing a secure, temporary communication channel between two devices that are physically close to each other but share no network, account, or prior pairing.
This paper introduces Acoustic Cryptographic Binding (ACB), a system that solves this problem by distributing encrypted data over the internet while delivering the cryptographic unlock key through a short acoustic transmission detectable only by nearby devices.
Existing file transfer systems generally fall into four architectural categories.
Examples include AirDrop, Quick Share, and Nearby Share. These systems typically use Bluetooth Low Energy for discovery and a direct Wi-Fi link (Wi-Fi Direct, or Apple Wireless Direct Link in the case of AirDrop) for high-speed peer-to-peer transfers.
Advantages:
Limitations:
Examples include Snapdrop, LocalSend, and similar WebRTC-based sharing tools. These tools establish peer-to-peer connections between browsers or applications on the same local network.
Advantages:
Limitations:
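They depend on permissive LAN conditions and fail under common modern networking configurations such as access point isolation, network segmentation, VPN routing, or devices attached to different networks.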
Examples include email attachments, WeTransfer, Google Drive, and Dropbox. These services upload files to centralized servers without end-to-end encryption, so the provider can read the contents, and distribute download links to recipients.
Advantages:
Limitations:
Examples include Magic Wormhole and Send Anywhere. These systems use relay servers combined with Password-Authenticated Key Exchange (PAKE) protocols. A human-readable code is displayed on the sending device and must be manually entered on the receiving device to complete the cryptographic handshake.
Advantages:
Limitations:
Manual entry of authentication codes introduces cognitive load and slows the transfer process. The requirement for human interaction reduces usability in quick device-to-device transfers.
Acoustic Cryptographic Binding eliminates manual PIN codes while preserving the zero-knowledge property of relay-based encryption systems.
The architecture separates payload transport from key exchange and automates the latter using sound.
When a file is selected for transfer, the sender generates a high-entropy ephemeral symmetric key using the browser's Web Crypto API. The key is unique to each transfer and is discarded after use.
The file is encrypted locally using AES-GCM, an authenticated encryption mode. Only the ciphertext leaves the sender's device; the plaintext never does.
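The local encryption step can be illustrated with a minimal sketch. It is written in Python with the third-party `cryptography` package as a stand-in for the browser's Web Crypto API, so it shows the shape of the operation rather than the system's actual code. The function names, the 256-bit key size, the 96-bit nonce, and the choice to prepend the nonce to the ciphertext are all assumptions made for illustration.

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

NONCE_LEN = 12  # 96-bit nonce, the size commonly used with AES-GCM (assumed here)


def encrypt_payload(plaintext: bytes) -> tuple[bytes, bytes]:
    """Return (key, blob), where blob is nonce || ciphertext || tag."""
    key = AESGCM.generate_key(bit_length=256)  # fresh key for every transfer
    nonce = os.urandom(NONCE_LEN)              # never reused with the same key
    ciphertext = AESGCM(key).encrypt(nonce, plaintext, None)
    return key, nonce + ciphertext


def decrypt_payload(key: bytes, blob: bytes) -> bytes:
    """Split the nonce off the blob, then verify the tag and decrypt."""
    nonce, ciphertext = blob[:NONCE_LEN], blob[NONCE_LEN:]
    return AESGCM(key).decrypt(nonce, ciphertext, None)
```

Only `blob` would be uploaded; `key` stays on the sender until it is handed to the key-delivery channel. Because AES-GCM is authenticated, `decrypt_payload` raises an exception if the blob was modified in transit or at rest.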
The encrypted blob is uploaded to a relay server via HTTPS.
The relay server acts as a blind transport layer: it stores and forwards ciphertext and never possesses the key required to decrypt it.
The blob is indexed using a short-lived random identifier called the Blob ID. The Blob ID acts only as a retrieval reference for the encrypted payload and carries no cryptographic meaning.
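A relay of this kind can be sketched as an in-memory store that accepts opaque bytes and hands back a random identifier. The class name `BlindRelay`, its method names, and the 64-bit identifier length are hypothetical; a real deployment would sit behind HTTPS and use persistent storage.

```python
import secrets
from typing import Optional


class BlindRelay:
    """Stores opaque ciphertext under random identifiers; it never sees a key."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def upload(self, ciphertext: bytes) -> str:
        # The Blob ID is only a lookup handle (64 random bits, an assumed length).
        blob_id = secrets.token_hex(8)
        self._blobs[blob_id] = ciphertext
        return blob_id

    def download(self, blob_id: str) -> Optional[bytes]:
        # Returns None for unknown identifiers.
        return self._blobs.get(blob_id)
```

Nothing in this interface accepts or returns key material, which is the property the blind relay model depends on.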
The receiver requires the Blob ID and the encryption key in order to retrieve and decrypt the payload.
Instead of transmitting these values over the network, the sender encodes them into a short acoustic signal using Frequency-Shift Keying (FSK).
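Before modulation, the two values need a byte layout that the receiver can parse and sanity-check. The sketch below shows one possible layout; the field sizes, the CRC-32 trailer, and the function names are assumptions rather than part of the described system. The checksum is there only to catch acoustic decoding errors early, since payload integrity is already covered by AES-GCM.

```python
import struct
import zlib

KEY_LEN = 32      # 256-bit key (assumed size)
BLOB_ID_LEN = 8   # 64-bit Blob ID (assumed size)


def pack_frame(blob_id: bytes, key: bytes) -> bytes:
    """Concatenate Blob ID and key, then append a CRC-32 over both."""
    if len(blob_id) != BLOB_ID_LEN or len(key) != KEY_LEN:
        raise ValueError("unexpected field length")
    body = blob_id + key
    return body + struct.pack(">I", zlib.crc32(body))


def unpack_frame(frame: bytes) -> tuple[bytes, bytes]:
    """Return (blob_id, key), or raise ValueError if the frame is damaged."""
    if len(frame) != BLOB_ID_LEN + KEY_LEN + 4:
        raise ValueError("unexpected frame length")
    body, (crc,) = frame[:-4], struct.unpack(">I", frame[-4:])
    if zlib.crc32(body) != crc:
        raise ValueError("checksum mismatch")
    return body[:BLOB_ID_LEN], body[BLOB_ID_LEN:]
```

With these sizes the frame is 44 bytes (352 bits), which is small enough for a low-bandwidth acoustic channel.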
The signal is emitted as a brief chirp through the sender's speaker. The receiving device listens through its microphone and decodes the acoustic message.
Because acoustic communication limits reception to devices within practical audio range of the sender, the signal introduces a physical proximity constraint.
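To make the modulation step concrete, the following is a minimal binary FSK sketch in pure Python: each bit becomes a short sine burst at one of two tones, and the decoder compares the power at the two tones using the Goertzel algorithm. The sample rate, tone frequencies, and symbol length are illustrative assumptions, and the sketch omits the synchronization, preamble detection, and error correction that a practical acoustic link would need.

```python
import math

SAMPLE_RATE = 16_000   # Hz; every constant here is an illustrative assumption
SYMBOL_SAMPLES = 160   # 10 ms per bit
FREQ_ZERO = 2_000.0    # tone that encodes a 0 bit
FREQ_ONE = 3_000.0     # tone that encodes a 1 bit


def fsk_modulate(data: bytes) -> list[float]:
    """Map each bit (most significant first) to a fixed-length sine burst."""
    samples: list[float] = []
    for byte in data:
        for shift in range(7, -1, -1):
            freq = FREQ_ONE if (byte >> shift) & 1 else FREQ_ZERO
            step = 2 * math.pi * freq / SAMPLE_RATE
            samples.extend(math.sin(step * n) for n in range(SYMBOL_SAMPLES))
    return samples


def tone_power(window: list[float], freq: float) -> float:
    """Goertzel algorithm: signal power at a single frequency in the window."""
    coeff = 2 * math.cos(2 * math.pi * freq / SAMPLE_RATE)
    s_prev, s_prev2 = 0.0, 0.0
    for x in window:
        s_prev, s_prev2 = x + coeff * s_prev - s_prev2, s_prev
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2


def fsk_demodulate(samples: list[float]) -> bytes:
    """Decide each bit by comparing power at the two tones, then pack bytes."""
    bits = []
    for start in range(0, len(samples) - SYMBOL_SAMPLES + 1, SYMBOL_SAMPLES):
        window = samples[start:start + SYMBOL_SAMPLES]
        bits.append(tone_power(window, FREQ_ONE) > tone_power(window, FREQ_ZERO))
    out = bytearray()
    for i in range(0, len(bits) - 7, 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | int(bit)
        out.append(byte)
    return bytes(out)
```

In this sketch the decoder assumes it already knows where the first symbol starts. The two tones were chosen so that each completes a whole number of cycles per symbol, which keeps them cleanly separable by the Goertzel detector.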
Once the receiver obtains the Blob ID and key material, it retrieves the encrypted blob from the relay and decrypts it locally.
Optionally, the relay server deletes the encrypted blob after the first successful download or after a short expiration time.
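The retention policy can be sketched as a store that attaches an expiry time to each blob and optionally deletes it on first read. The class name `ExpiringStore`, the default five-minute lifetime, and the injectable `clock` parameter (used here so the behavior can be tested without waiting) are assumptions for illustration.

```python
import time
from typing import Callable, Optional


class ExpiringStore:
    """Ciphertext store with a time-to-live and optional delete-on-read."""

    def __init__(self, ttl_seconds: float = 300.0, single_use: bool = True,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._single_use = single_use
        self._clock = clock
        self._blobs: dict[str, tuple[float, bytes]] = {}

    def put(self, blob_id: str, ciphertext: bytes) -> None:
        self._blobs[blob_id] = (self._clock() + self._ttl, ciphertext)

    def get(self, blob_id: str) -> Optional[bytes]:
        entry = self._blobs.get(blob_id)
        if entry is None:
            return None
        expires_at, ciphertext = entry
        if self._clock() >= expires_at:
            del self._blobs[blob_id]   # expired: purge and report as missing
            return None
        if self._single_use:
            del self._blobs[blob_id]   # first successful download consumes it
        return ciphertext
```

This version purges expired blobs lazily, when they are next requested; a production relay would more likely also run a periodic sweep so that unclaimed ciphertext does not linger.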
The architecture provides several security properties absent from traditional sharing systems.
If an attacker gains full access to the relay server infrastructure, the attacker obtains only ciphertext. Without the key, which travels only over the acoustic channel, decrypting the data is computationally infeasible.
An attacker monitoring network traffic (for example on public Wi-Fi) can observe the encrypted payload transfer but cannot obtain the decryption key, which is transmitted through the acoustic channel.
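Intercepting a transfer would require a dual-channel attack: the attacker must both obtain the ciphertext from the network or the relay and be within audio range of the sender to capture the acoustic key transmission.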
This significantly raises the practical difficulty of interception.
The system may include transfer expiration or single-use retrieval tokens so that a previously captured acoustic signal cannot be replayed later to fetch the payload.
This architecture builds upon several established domains of research and product design, recombining them to address the specific friction points of asymmetric-network proximity transfers.
The use of audio as an out-of-band channel to bootstrap trust between unassociated devices is well documented. Classic prior art includes "Loud and Clear: Human-Verifiable Authentication Based on Audio" (Goodrich et al., 2006) and subsequent research into Acoustic Integrity Codes (2020) and PairSonic (2024). These works demonstrate the viability of using commodity audio hardware to authenticate or pair nearby devices. Our architecture draws on these principles but applies them to stateless, ephemeral payload exchange across asymmetric networks rather than persistent device pairing or session establishment.
Systems such as Dhwani (Nandakumar et al., SIGCOMM 2013) frame acoustic communication as an NFC-like peer-to-peer transport medium. Open-source libraries like ggwave provide practical implementations of Frequency-Shift Keying (FSK) for low-bandwidth acoustic data exchange. While these systems use sound as the primary transport for small payloads, our architecture recognizes the bandwidth limitations of audio and delegates the heavy payload transport to an IP-based network relay.
Tools like Magic Wormhole and Send Anywhere demonstrate the value of coupling a blind server relay with end-to-end encryption, with keys agreed through a password-authenticated key exchange such as SPAKE2. However, these systems rely on manual human input, such as entering a short code or key phrase, to complete the cryptographic rendezvous.
Applications such as LocalSend and Snapdrop provide excellent P2P sharing experiences but are brittle under modern network topologies. These systems are optimized for nearby/local-network conditions and may fail under AP isolation, local-network restrictions, VPN-related interference, or cross-network scenarios where devices are not on the same reachable LAN.
Architectural Distinction
In contrast, Acoustic Cryptographic Binding uses audio exclusively as an out-of-band carrier for decryption material, while the encrypted payload itself is transported through an IP-based relay. This separation is intended for nearby devices that are physically co-located but not necessarily connected through the same local network.
By separating encrypted payload transport from cryptographic key delivery and using sound as an automated out-of-band channel, the system enables secure, cross-platform proximity transfers without shared networks, accounts, or manual authentication codes. The architecture preserves the zero-knowledge property of encrypted relay systems while maintaining the simplicity expected from proximity-based sharing tools.