Authors: @Nicola @Peter Rabbitson
Date: Nov 22, 2021
- Introduction
- Terminology
- Protocol
- Storing SubDeals Protocol Flow
- Getting SubDeal Storage Proof Protocol Flow
- Failed deals Protocol Flow
- Aggregator Software
- Variants
- Variant 0: Aggregator Escrow
- Variant 1: Arbitrary Sizes Deals
- Variant 2: Trustless aggregator
- Variant 3: Storing straight IPFS hashes
- FAQ
- Current solutions
Introduction
Publishing individual deals on chain is too expensive. Several solutions have demonstrated the value of aggregating multiple files/DAGs into a single deal (e.g. Estuary, Web3 Storage). However, in current designs the user must trust the aggregator to actually store their files.
This proposal shows how to aggregate deals in a way that lets the user verify that their data is actually included in the aggregator's deal. This allows for a more decentralized ecosystem of aggregators.
The intuition is that subdeals (deals between the clients and the aggregators) are stored in a way that takes advantage of the structure of the PieceCID tree. This enables very simple Merkle tree proofs.
On opinions: This proposal makes several opinionated decisions in order to guarantee (1) backward compatibility with the existing system and (2) no upgrade needed, e.g. no actor code changes. However, we provide a list of variants at the end of the document. The variants are not mutually exclusive and can be combined.
On trust assumptions: Although this proposal places trust in the aggregator node, the plan is to start with it and add the variants one by one, which will eventually remove all trust assumptions.
            PieceCID
               /\
              /\
             /\
      SubPieceCID
          /  \
        /\    /\
       a  b  c  d
Terminology
SubDeal
- Deal between the client and the aggregator that stores SubPieceCID.
SubPieceCID
- CommP of linearized client DAG with necessary padding.
- Its size must be a power of 2 (for arbitrary sizes, see Variant 1)
- Must be Fr32-friendly
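To make the size constraint concrete, here is a minimal sketch of how a client could pick the padded size for a payload. It assumes the usual Fr32 expansion of 127 payload bytes into 128 padded bytes and a minimum padded piece size of 128 bytes. The function names are hypothetical and not part of this proposal.

```python
# Assumption: Fr32 padding expands every 127 payload bytes into 128 bytes.
FR32_PAYLOAD_BYTES = 127
FR32_PADDED_BYTES = 128
# Assumption: the smallest padded piece is 128 bytes.
MIN_PADDED_PIECE_SIZE = 128


def unpadded_capacity(padded_size: int) -> int:
    """Number of payload bytes that fit in a piece of the given padded size."""
    return padded_size * FR32_PAYLOAD_BYTES // FR32_PADDED_BYTES


def padded_piece_size(payload_bytes: int) -> int:
    """Smallest power-of-2 padded size whose capacity fits the payload."""
    if payload_bytes <= 0:
        raise ValueError("payload must be at least one byte")
    size = MIN_PADDED_PIECE_SIZE
    while unpadded_capacity(size) < payload_bytes:
        size *= 2  # stay on powers of 2
    return size


# A 1,000,000-byte DAG does not fit in 512 KiB once padded, so it needs 1 MiB.
print(padded_piece_size(1_000_000))
```

The payload is zero-padded up to the unpadded capacity before the SubPieceCID is computed, so two clients with the same data always arrive at the same SubPieceCID.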
Protocol
Storing SubDeals Protocol Flow
Client side
- Generate a SubPieceCID: the client takes the DAG, converts it into an Fr32-friendly form padded to the next power of 2, and generates the SubPieceCID hash (if you want the client to send straight IPFS hashes, see Variant 3)
- Wrap it into a SubDealProposal (same as a DealProposal) and sign it
- Send it to the aggregator
- Send payment to the aggregator (if you are not happy with this level of trust, see Variant 2)
Aggregator side
- Wait to receive enough SubPieceCIDs
- Generate PieceCID
- Generate an inclusion proof from PieceCID to SubPieceCID for each SubPieceCID
- Send PieceCID to the miners for storage
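The aggregator steps above can be sketched as a small Merkle tree computation. This is an illustration under simplifying assumptions, not the aggregator implementation: all sub-pieces are assumed to have the same size, their count is assumed to be a power of 2, each sub-piece is represented by its 32-byte root, and the node hash is assumed to be SHA-256 with the two highest bits of the last byte cleared. The function names are hypothetical.

```python
import hashlib
from typing import List, Tuple


def node_hash(left: bytes, right: bytes) -> bytes:
    """Hash two child nodes into their parent (assumed: truncated SHA-256)."""
    digest = bytearray(hashlib.sha256(left + right).digest())
    digest[31] &= 0x3F  # clear the two top bits so the node fits a field element
    return bytes(digest)


def build_levels(leaves: List[bytes]) -> List[List[bytes]]:
    """Return every level of the tree, from the leaves up to the single root."""
    if not leaves or len(leaves) & (len(leaves) - 1):
        raise ValueError("number of sub-pieces must be a power of 2")
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([node_hash(prev[i], prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels


def inclusion_path(levels: List[List[bytes]], index: int) -> List[bytes]:
    """Collect the sibling hashes on the way from leaf `index` to the root."""
    path = []
    for level in levels[:-1]:
        path.append(level[index ^ 1])  # the sibling differs only in the lowest bit
        index //= 2
    return path


def aggregate(sub_piece_roots: List[bytes]) -> Tuple[bytes, List[List[bytes]]]:
    """Compute the aggregate root and one inclusion path per sub-piece."""
    levels = build_levels(sub_piece_roots)
    paths = [inclusion_path(levels, i) for i in range(len(sub_piece_roots))]
    return levels[-1][0], paths
```

Because each SubPieceCID is itself the root of a subtree of the PieceCID tree, the proof is just the list of sibling hashes along the path, one per level between the two roots.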
Getting SubDeal Storage Proof Protocol Flow
- Client requests the proof for a subdeal via the /getSubDealProof API
- Aggregator constructs the proof from SubPieceCID to PieceCID and replies.
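Once the client has the reply, checking it could look like the sketch below. It assumes the proof consists of the sub-piece position in the aggregate plus the sibling hashes from SubPieceCID up to PieceCID, and that nodes are hashed with SHA-256 with the two highest bits of the last byte cleared. The function names and the proof layout are assumptions for illustration.

```python
import hashlib
from typing import List


def node_hash(left: bytes, right: bytes) -> bytes:
    """Hash two child nodes into their parent (assumed: truncated SHA-256)."""
    digest = bytearray(hashlib.sha256(left + right).digest())
    digest[31] &= 0x3F
    return bytes(digest)


def verify_inclusion(sub_piece_root: bytes, index: int,
                     path: List[bytes], piece_root: bytes) -> bool:
    """Recompute the root from the sub-piece and compare it with PieceCID."""
    node = sub_piece_root
    for sibling in path:
        # An even position is a left child, an odd position a right child.
        node = node_hash(node, sibling) if index % 2 == 0 else node_hash(sibling, node)
        index //= 2
    # The index must be fully consumed, otherwise the position was out of range.
    return index == 0 and node == piece_root


# Tiny hand-built aggregate of four sub-pieces a, b, c, d.
a, b, c, d = (hashlib.sha256(x).digest() for x in (b"a", b"b", b"c", b"d"))
ab, cd = node_hash(a, b), node_hash(c, d)
root = node_hash(ab, cd)
print(verify_inclusion(c, 2, [d, ab], root))
```

The client should compare the recomputed root against the PieceCID of the deal published on chain, not only against the PieceCID returned by the aggregator.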
Failed deals Protocol Flow
- Deal data is lost by the miner
- Aggregator gets payment back
- Aggregator is trusted to give the money back to the user (otherwise see Variant 0)
Aggregator Software
Internal Storage
- Proofs[PieceCID][SubPieceCID] → Proof
- For each PieceCID store SubPieceCIDs and their inclusion proofs
API
- /storeSubPieceCID
- /getSubDealProof/:SubPieceCID → (PieceCID, Proof)
- /getStatus → [waiting for deals, submitted, stored]
- /stats/:PieceCID → (signed set of subdeals)
- /stats → ([PieceCIDs], [signedDeals])
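The internal storage and the /getSubDealProof lookup could be backed by something as simple as the in-memory sketch below. The class and field names are hypothetical, CIDs are represented as plain strings, and a real aggregator would persist this data.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Proof:
    index: int         # position of the sub-piece inside the aggregate
    path: List[bytes]  # sibling hashes from SubPieceCID up to PieceCID


@dataclass
class ProofStore:
    # Proofs[PieceCID][SubPieceCID] -> Proof, as in the list above.
    proofs: Dict[str, Dict[str, Proof]] = field(default_factory=dict)
    # Reverse index so a lookup by SubPieceCID does not scan every piece.
    piece_of: Dict[str, str] = field(default_factory=dict)

    def record_piece(self, piece_cid: str, sub_proofs: Dict[str, Proof]) -> None:
        """Store all sub-piece proofs for a newly generated PieceCID."""
        self.proofs[piece_cid] = dict(sub_proofs)
        for sub_piece_cid in sub_proofs:
            self.piece_of[sub_piece_cid] = piece_cid

    def get_sub_deal_proof(self, sub_piece_cid: str) -> Optional[Tuple[str, Proof]]:
        """Return (PieceCID, Proof) for a sub-piece, or None if it is unknown."""
        piece_cid = self.piece_of.get(sub_piece_cid)
        if piece_cid is None:
            return None
        return piece_cid, self.proofs[piece_cid][sub_piece_cid]
```

The reverse index assumes that a given SubPieceCID is aggregated into at most one PieceCID; if the same data can be stored in several deals, the value would need to become a list.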
Variants
These variants can be composed with each other.
Variant 0: Aggregator Escrow
The goal of this variant is to ensure that if a deal is ever lost, the client can get their money back.
This differs from the base design, where the aggregator gets the money back and is trusted to distribute it.
TODO: These two options need to be fleshed out more.
Option 1:
- Extend the storage market actor escrow so that payments can be reimbursed to a wallet other than the one making the deal.
- The list of wallets should be either a list on chain (expensive) or a commitment to a tree of wallets. The latter means that retrieving the payment requires submitting a proof (which could be expensive).
Option 2:
- There is an aggregator contract to which the client sends funds when making deals.
- The address for deal failure reimbursement will be this aggregator contract
- The client can withdraw from this contract
Variant 1: Arbitrary Sizes Deals
The goal of this variant is to allow for arbitrary size deals.
@Kubuxu will find a way.
Variant 2: Trustless aggregator
The goal of this variant is to avoid requiring the client to trust the aggregator node with their payments.
NOTE: This needs a new on-chain actor.
Pessimistic On-chain Option (not ideal)
- Client sends money to the Aggregator Escrow contract for a subdeal to be included in a published storage deal
- Aggregator generates a proof that the subdeal is included in a storage deal, submits it on chain, and gets the money
- If the aggregator takes too long to do this, then the client can get their money back
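The pessimistic flow can be modelled as a small escrow with a deadline. This is only a sketch of the intended behaviour, not actor code: the deadline expressed in epochs, the `verify_proof` callback, and all names are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class PessimisticEscrow:
    amount: int                            # funds locked by the client
    deadline_epoch: int                    # assumed: last epoch at which a claim is accepted
    verify_proof: Callable[[bytes], bool]  # assumed: on-chain inclusion proof check
    settled: bool = False

    def claim(self, proof: bytes, current_epoch: int) -> str:
        """Aggregator presents the inclusion proof in time and is paid."""
        if self.settled:
            raise RuntimeError("escrow already settled")
        if current_epoch > self.deadline_epoch:
            raise RuntimeError("deadline passed")
        if not self.verify_proof(proof):
            raise ValueError("invalid inclusion proof")
        self.settled = True
        return "aggregator"

    def refund(self, current_epoch: int) -> str:
        """Client takes the funds back once the aggregator missed the deadline."""
        if self.settled:
            raise RuntimeError("escrow already settled")
        if current_epoch <= self.deadline_epoch:
            raise RuntimeError("deadline not reached yet")
        self.settled = True
        return "client"
```

Every subdeal requires an on-chain proof verification in this option, which is why it is marked as not ideal.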
Optimistic On-chain option (ideal)
- Client sends money to Aggregator Escrow
- Aggregator claims the money when the deal is submitted
- Aggregator sends the proof to the client
- If the client is not happy, they start a dispute process
- If the aggregator shows the SubDeal proof at the dispute process → aggregator gets the money
- Otherwise user gets their money back
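The optimistic flow can be read as a small state machine in which the proof only goes on chain if there is a dispute. The sketch below is one possible reading, with hypothetical names. It assumes the aggregator's claim stays provisional until either a dispute is resolved or an undisputed claim is finalized, for example after a dispute window, which the text above does not specify.

```python
from enum import Enum, auto
from typing import Callable, Optional


class State(Enum):
    FUNDED = auto()           # client has sent money to the escrow
    CLAIMED = auto()          # aggregator claimed after submitting the deal
    DISPUTED = auto()         # client opened a dispute
    AGGREGATOR_PAID = auto()
    CLIENT_REFUNDED = auto()


class OptimisticEscrow:
    def __init__(self, verify_proof: Callable[[bytes], bool]) -> None:
        self.verify_proof = verify_proof  # assumed: on-chain inclusion proof check
        self.state = State.FUNDED

    def _move(self, expected: State, new: State) -> None:
        if self.state is not expected:
            raise RuntimeError(f"expected {expected.name}, got {self.state.name}")
        self.state = new

    def claim(self) -> None:
        self._move(State.FUNDED, State.CLAIMED)

    def dispute(self) -> None:
        self._move(State.CLAIMED, State.DISPUTED)

    def finalize(self) -> None:
        # Assumption: an undisputed claim becomes final, e.g. after a dispute window.
        self._move(State.CLAIMED, State.AGGREGATOR_PAID)

    def resolve(self, proof: Optional[bytes]) -> None:
        """Settle a dispute: a valid SubDeal proof pays the aggregator."""
        valid = proof is not None and self.verify_proof(proof)
        self._move(State.DISPUTED,
                   State.AGGREGATOR_PAID if valid else State.CLIENT_REFUNDED)
```

In the common case only `claim` and `finalize` touch the chain, so no proof is verified on chain unless the client disputes.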
Payment channel option
(Same as above, but in a state channel instead. This is very similar to the original payment channels with vouchers.)
Variant 3: Storing straight IPFS hashes
If the total number of bytes to hash is small, we can use SNARKs to prove that a SubPieceCID commits to the same data as an IPFS hash.
Note 1: at current speeds it might be faster for the client to generate the SubPieceCID themselves rather than delegating it to the aggregator node (which would bring us back to the protocol as presented).
Note 2: this protocol is still valuable even if the client can generate the SubPieceCID themselves, namely in the setting where the client is unaware of the concept of PieceCID. E.g. a blockchain user uses a smart contract that stores files in Filecoin, and this user only knows about IPFS. They want to delegate everything that has nothing to do with IPFS. In this case, the smart contract can guarantee that the IPFS hash is correctly matched by verifying a proof provided by the aggregator node.
Protocol
- Client generates the IPFS hash of the DAG
- Client sends it to the aggregator node
- Aggregator node turns that into SubPieceCID
- Aggregator node generates a SNARK that SubPieceCID matches the data behind the IPFS hash
- Aggregator sends the proof to client
- Client is now convinced
Otherwise see: Making Deals for IPFS CID
FAQ
Can a client themselves be an aggregator?
This solution is not "natively recursive" because we want to respect the tree of CommD. A client can be an aggregator as long as they respect the structure of the tree. DAGs can be aggregated but not composed.
Current solutions
Platform | DAG support | Stream support | Inclusion Proof | Reimbursement on failures | User payment |
---|---|---|---|---|---|