Data Storage
Cocoon's data storage layer — torrent-ccip — provides decentralized, encrypted, access-controlled storage without relying on IPFS, Filecoin, or any centralized gateway. Data lives in the BitTorrent network. Smart contracts handle registration and access control. The EIP-3668 / EIP-5559 CCIP protocol bridges on-chain lookups to off-chain data retrieval.
torrent-ccip is not a separate storage network you connect to. It is built into the Erigon node. Every Cocoon node is simultaneously a BitTorrent peer that stores and serves data as part of its normal operation.
What It Solves
Traditional on-chain applications store large data off-chain and reference it by URL — a centralized, mutable pointer. IPFS improves content-addressing but requires a separate gateway or pinning service. torrent-ccip achieves:
Content-addressing via infohash (BitTorrent) and CID (UCAN envelopes)
Decentralized distribution via the BitTorrent swarm — no pinning service
On-chain registration so smart contracts can look up and verify data
Encryption so only authorized parties can read sensitive data
Access control enforced at the protocol level, not the application level
Typical use cases within Cocoon: storing identity documents linked to investor DIDs, fund reports and NAV data referenced by token contracts, UCAN delegation tokens distributed to delegatees, and WebF site zips.
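To make the content-addressing point concrete, the sketch below computes a BitTorrent v1 infohash for a single in-memory file: the SHA-1 of the bencoded info dictionary. It assumes the v1 format; this page does not say whether torrent-ccip uses v1 or v2 infohashes. The helper names `bencode` and `infohash_v1` are illustrative, not part of any Cocoon API.

```python
import hashlib


def bencode(value) -> bytes:
    """Encode ints, byte/text strings, lists, and dicts using BitTorrent bencoding."""
    if isinstance(value, bool):
        raise TypeError("bencoding has no boolean type")
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must be emitted in sorted order.
        items = sorted(
            ((k.encode("utf-8") if isinstance(k, str) else k, v) for k, v in value.items()),
            key=lambda kv: kv[0],
        )
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def infohash_v1(name: str, data: bytes, piece_length: int = 16384) -> str:
    """Return the hex v1 infohash of a single-file torrent built from `data`."""
    # Concatenate the SHA-1 digest of every fixed-size piece.
    pieces = b"".join(
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length)
    )
    info = {
        "name": name,
        "length": len(data),
        "piece length": piece_length,
        "pieces": pieces,
    }
    return hashlib.sha1(bencode(info)).hexdigest()
```

Because the infohash is derived from the content itself, any change to the data (or to the file name or piece length) yields a different identifier, which is what lets an on-chain record pin an exact version of a document.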
Registration Models
There are two ways to register data on-chain:
Contract registry — a named dataset is registered by calling a registry smart contract. The contract maps a human-readable name (or token address) to an infohash. This is used for persistent, named data like fund documents or identity credentials.
Torrent inscriptions — data is embedded in transaction calldata and indexed by the node. No separate registry contract is needed. Used for lightweight, immutable records — think of it as a calldata-native content store. Retrieved via erigon_getInscription.
CCIP Integration
EIP-3668 (CCIP-Read) and EIP-5559 (CCIP-Write) define a standard pattern where a smart contract signals that data lives off-chain. The Erigon node handles this transparently — applications call the contract normally and receive resolved data without knowing about the BitTorrent layer.
The OffchainLookup revert is caught internally by the node — the application never sees it. From the application's perspective, the eth_call just returns data.
For writes (EIP-5559), the flow is reversed: a contract signals that data should be written off-chain, and the node publishes the torrent and records the infohash on-chain.
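The read path can be pictured with a small in-memory simulation. The fields of `OffchainLookup` follow the EIP-3668 revert (sender, urls, callData, callbackFunction, extraData), and the sender check is the one the EIP requires of clients. Everything else is an assumption made for illustration: the `DocumentResolver` contract, the `torrent://` URL form, the dictionary standing in for the BitTorrent swarm, and the `CcipNode` class are hypothetical and do not describe Erigon's actual internals.

```python
from typing import Dict, List


class OffchainLookup(Exception):
    """Carries the same fields as the EIP-3668 OffchainLookup revert."""

    def __init__(self, sender: str, urls: List[str], call_data: bytes,
                 callback_function: str, extra_data: bytes):
        super().__init__("OffchainLookup")
        self.sender = sender
        self.urls = urls
        self.call_data = call_data
        self.callback_function = callback_function
        self.extra_data = extra_data


class DocumentResolver:
    """Hypothetical contract: knows a dataset's infohash but not its content."""

    address = "0xResolver"  # placeholder address for the simulation

    def __init__(self, registry: Dict[str, str]):
        self.registry = registry  # dataset name -> infohash

    def get_document(self, name: str) -> bytes:
        infohash = self.registry[name]
        # Signal that the data lives off-chain instead of returning it.
        raise OffchainLookup(
            sender=self.address,
            urls=["torrent://" + infohash],  # hypothetical URL form
            call_data=infohash.encode(),
            callback_function="resolve_callback",
            extra_data=name.encode(),
        )

    def resolve_callback(self, response: bytes, extra_data: bytes) -> bytes:
        # A real contract would verify the response before returning it.
        return response


class CcipNode:
    """Hypothetical node that resolves OffchainLookup reverts transparently."""

    def __init__(self, swarm: Dict[str, bytes]):
        self.swarm = swarm  # infohash -> content, stand-in for BitTorrent

    def eth_call(self, contract, method: str, *args) -> bytes:
        try:
            return getattr(contract, method)(*args)
        except OffchainLookup as lookup:
            # EIP-3668: the revert must originate from the contract that was called.
            if lookup.sender != contract.address:
                raise
            data = self.swarm[lookup.call_data.decode()]
            callback = getattr(contract, lookup.callback_function)
            return callback(data, lookup.extra_data)
```

A caller would only ever invoke `CcipNode.eth_call(contract, "get_document", "fund-nav")` and receive bytes; the exception never leaves the node, which mirrors the behaviour described above.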
Peer Discovery
Nodes discover each other's torrent data via two mechanisms:
Manifest torrent — each node publishes a registry.toml manifest torrent listing all datasets it seeds. New nodes bootstrap by resolving the manifest and joining relevant swarms.
ENR advertisement — nodes advertise their torrent capabilities via DevP2P ENR (Ethereum Node Records) using the discv5 discovery protocol. This means torrent peer discovery piggybacks on the existing Ethereum peer discovery infrastructure — no separate DHT bootstrap is needed.
Encryption Model
Sensitive data (identity documents, fund reports, private UCAN tokens) is encrypted before entering the torrent swarm. The encryption stack is:
Envelope: UCAN container (DAG-CBOR encoded)
Content encryption: AES-256-GCM
Key wrapping: ML-KEM-768 per-recipient key encapsulation
Key distribution: Encrypted key wrapped per authorized DID
Each authorized recipient's public ML-KEM-768 key is used to wrap the AES content key. The UCAN envelope contains one wrapped key per authorized recipient. A recipient node decrypts using its ML-KEM private key to recover the AES key, then decrypts the content.
This means:
Content is encrypted once regardless of how many recipients there are
Adding a recipient requires only wrapping the existing AES key for that recipient (not re-encrypting the content)
Post-quantum security is built in at the key-wrapping layer
ML-KEM key pairs must be managed carefully. If a node's ML-KEM private key is lost, any content encrypted only to that key is permanently unrecoverable. Store ML-KEM private keys in the configured keystore (preferably an HSM or vault), not as unprotected files on the node's disk.
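The "encrypt once, wrap per recipient" structure can be sketched with the standard library alone. The primitives below are stand-ins and are not secure: a toy Diffie-Hellman KEM takes the place of ML-KEM-768, a SHA-256 keystream takes the place of AES-256-GCM (and provides no authentication), and a plain dictionary takes the place of the DAG-CBOR UCAN envelope. All function names are hypothetical; only the shape of the flow is meant to match the description above.

```python
import hashlib
import secrets

# Stand-in KEM parameters (toy Diffie-Hellman, NOT ML-KEM-768, NOT secure).
P = 2**255 - 19
G = 2


def kem_keygen():
    priv = secrets.randbelow(P - 2) + 1
    return pow(G, priv, P), priv  # (public key, private key)


def kem_encapsulate(pub):
    eph = secrets.randbelow(P - 2) + 1
    shared = hashlib.sha256(pow(pub, eph, P).to_bytes(32, "big")).digest()
    return pow(G, eph, P), shared  # (KEM ciphertext, 32-byte shared secret)


def kem_decapsulate(kem_ct, priv):
    return hashlib.sha256(pow(kem_ct, priv, P).to_bytes(32, "big")).digest()


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def stream_cipher(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy SHA-256 counter keystream; stands in for AES-256-GCM."""
    stream = b"".join(
        hashlib.sha256(key + nonce + i.to_bytes(8, "big")).digest()
        for i in range(len(data) // 32 + 1)
    )
    return xor(data, stream)


def add_recipient(envelope: dict, content_key: bytes, did: str, pub) -> None:
    """Wrap the existing content key for one more recipient."""
    kem_ct, shared = kem_encapsulate(pub)
    envelope["wrapped_keys"][did] = (kem_ct, xor(content_key, shared))


def seal(plaintext: bytes, recipients: dict) -> dict:
    """Encrypt the content once, then wrap the content key per recipient DID."""
    content_key = secrets.token_bytes(32)
    nonce = secrets.token_bytes(12)
    envelope = {
        "nonce": nonce,
        "ciphertext": stream_cipher(content_key, nonce, plaintext),
        "wrapped_keys": {},
    }
    for did, pub in recipients.items():
        add_recipient(envelope, content_key, did, pub)
    return envelope


def unwrap_key(envelope: dict, did: str, priv) -> bytes:
    kem_ct, wrapped = envelope["wrapped_keys"][did]
    return xor(wrapped, kem_decapsulate(kem_ct, priv))


def open_envelope(envelope: dict, did: str, priv) -> bytes:
    key = unwrap_key(envelope, did, priv)
    return stream_cipher(key, envelope["nonce"], envelope["ciphertext"])
```

In this sketch, granting access to a new DID means an existing recipient recovers the content key with `unwrap_key` and calls `add_recipient`; the `ciphertext` field is never touched, which is the property the list above relies on.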
Access Control
The TorrentAccessControl contract maintains per-infohash access lists. Before a node decrypts and serves content, it checks this contract.
Access is enforced at two levels:
Cryptographic — without the correct ML-KEM private key, the AES content key cannot be recovered. Even if a node downloads the torrent segments, the content is unreadable.
Contract — the TorrentAccessControl contract provides an on-chain record of who is authorized, enabling audits, revocation (via UCAN revocation + access list update), and integration with token-gated access (e.g., only holders of a specific fund token can access its documents).
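As a rough illustration of the contract-level check, the sketch below models a per-infohash access list in memory and consults it before any decryption is attempted. The class reuses the contract's name for readability, but its methods (`grant`, `revoke`, `is_authorized`) and the `serve` helper are assumptions; the real contract's interface is not documented on this page.

```python
from typing import Callable, Dict, Set


class TorrentAccessControl:
    """In-memory stand-in for the on-chain contract: infohash -> authorized DIDs."""

    def __init__(self) -> None:
        self._lists: Dict[str, Set[str]] = {}

    def grant(self, infohash: str, did: str) -> None:
        self._lists.setdefault(infohash, set()).add(did)

    def revoke(self, infohash: str, did: str) -> None:
        self._lists.get(infohash, set()).discard(did)

    def is_authorized(self, infohash: str, did: str) -> bool:
        return did in self._lists.get(infohash, set())


def serve(acl: TorrentAccessControl, infohash: str, did: str,
          decrypt: Callable[[], bytes]) -> bytes:
    """Check the access list first; only then attempt decryption."""
    if not acl.is_authorized(infohash, did):
        raise PermissionError(f"{did} is not authorized for {infohash}")
    # The cryptographic level still applies: decrypt() fails without the right key.
    return decrypt()
```

Removing a DID from the list makes the next `serve` call fail even if that party still holds a wrapped key, which is why revocation pairs the access list update with UCAN revocation.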
Keystore Architecture
The node itself is designed to remain keyless with respect to decryption. When content needs to be decrypted for a request, the node forwards the authentication token (UCAN) to the configured keystore, which performs the ML-KEM decryption and returns the plaintext or the unwrapped AES key.
Supported keystore backends:
Local file — encrypted keystore file (development / low-security deployments)
HSM — hardware security module via PKCS#11
Vault — HashiCorp Vault or compatible secrets manager
This design means a compromised node process does not expose private keys. The keystore can enforce its own policy (rate limits, audit logs, additional authentication) independent of the node.
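The separation between node and keystore can be sketched as an interface with one backend. The names here (`Keystore`, `InMemoryKeystore`, `KeylessNode`, `unwrap_content_key`) are hypothetical, the XOR operation is only a placeholder for ML-KEM decapsulation, and the request cap is a deliberately crude example of keystore-side policy.

```python
from abc import ABC, abstractmethod
from typing import List


class Keystore(ABC):
    """Interface the node talks to; the private key never crosses it."""

    @abstractmethod
    def unwrap_content_key(self, ucan: str, wrapped_key: bytes) -> bytes:
        ...


class InMemoryKeystore(Keystore):
    """Stand-in for an HSM or Vault backend that applies its own policy."""

    def __init__(self, private_key: bytes, max_requests: int) -> None:
        self._private_key = private_key
        self._max_requests = max_requests
        self.audit_log: List[str] = []

    def unwrap_content_key(self, ucan: str, wrapped_key: bytes) -> bytes:
        # Policy is enforced here, independently of the node process.
        if len(self.audit_log) >= self._max_requests:
            raise PermissionError("keystore rate limit exceeded")
        self.audit_log.append(ucan)
        # Placeholder for ML-KEM decapsulation plus key unwrapping.
        return bytes(b ^ k for b, k in zip(wrapped_key, self._private_key))


class KeylessNode:
    """Holds a reference to the keystore, never the private key itself."""

    def __init__(self, keystore: Keystore) -> None:
        self.keystore = keystore

    def content_key_for(self, ucan: str, wrapped_key: bytes) -> bytes:
        # Forward the UCAN and the wrapped key; the keystore decides and decrypts.
        return self.keystore.unwrap_content_key(ucan, wrapped_key)
```

Swapping the backend for a PKCS#11 or Vault client would, under this design, change only the `Keystore` implementation and leave the node code untouched.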
Key APIs
erigon_resolveTorrent: Resolve a named dataset or infohash to its current content, triggering download if needed
erigon_publishTorrent: Publish data to the torrent swarm and register the infohash on-chain
erigon_getInscription: Retrieve data embedded in transaction calldata by inscription ID
erigon_listInscriptions: List inscriptions for a given address or contract
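These methods would typically be reached over the node's JSON-RPC endpoint. The sketch below builds a JSON-RPC 2.0 request body and posts it with the standard library. The method names come from the list above; the parameter shapes (a single positional argument such as an address or inscription ID) and the endpoint URL are assumptions, since this page does not document them.

```python
import json
import urllib.request
from itertools import count

_request_ids = count(1)


def rpc_request(method: str, *params) -> str:
    """Build a JSON-RPC 2.0 request body with positional parameters."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": method,
        "params": list(params),
    })


def send(url: str, body: str) -> dict:
    """POST a request body to a node endpoint and decode the JSON reply."""
    request = urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Hypothetical usage; the address argument and the URL are placeholders:
# body = rpc_request("erigon_listInscriptions", "0x0000000000000000000000000000000000000000")
# reply = send("http://localhost:8545", body)
```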