Proposal: Simple Merkle Deals

Creator

Nicola

Created

Feb 9, 2022 12:34 PM

Stage

Graduated from Notebook

Authors: @Nicola @

Date: Nov 22, 2021

Introduction
Terminology
Protocol
Storing SubDeals Protocol Flow
Getting SubDeal Storage Proof Protocol Flow
Failed deals Protocol Flow
Aggregator Software
Variants
Variant 0: Aggregator Escrow
Variant 1: Arbitrary Sizes Deals
Variant 2: Trustless aggregator
Variant 3: Storing straight IPFS hashes
FAQ
Current solutions

Introduction

Deals are too expensive to publish on chain. Several solutions have demonstrated the value of aggregating multiple files/DAGs into a single deal (e.g. Estuary, Web3.Storage). However, in current designs the user must trust the aggregator to actually store their files.

This proposal shows how to aggregate deals so that the user can verify that their data is actually included in the aggregator's deal. This allows for a more decentralized ecosystem of aggregators.

The intuition is that subdeals (deals between clients and aggregators) are stored in a way that takes advantage of the structure of the PieceCID tree: each SubPieceCID is the root of a subtree of the PieceCID tree. This enables very simple Merkle inclusion proofs.

On opinions: This proposal makes several opinionated decisions in order to guarantee (1) backward compatibility with the existing system and (2) no upgrade needed (e.g. no actor code changes). A list of variants is provided at the end of the document; the variants are not mutually exclusive and can be combined.

On trust assumptions: This proposal places trust assumptions on the aggregator node. The plan is to start with this design and gradually add the variants one by one, which will remove all of these trust assumptions.

         PieceCID
            /\
           /\
          /\
     SubPieceCID
         /\
       /\  /\
      a b  c d

Terminology

SubDeal

Deal between the client and the aggregator that stores SubPieceCID.

SubPieceCID

CommP of the linearized client DAG with the necessary padding.
Its size must be a power of 2 (if you don't like this, see Variant 1).
It must be Fr32 friendly.
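
As a rough illustration of the two size constraints above, the sketch below computes the padded size a client payload would occupy. It assumes the usual Fr32 expansion of 127 payload bytes into 128 padded bytes; the function names and the 128-byte minimum are hypothetical choices for this example, not part of the proposal.

```python
def unpadded_capacity(padded_size: int) -> int:
    """Payload bytes that fit in a padded piece (assumed 127:128 Fr32 ratio)."""
    return padded_size // 128 * 127


def padded_piece_size(payload_bytes: int) -> int:
    """Smallest power-of-two padded size whose Fr32 capacity fits the payload."""
    if payload_bytes <= 0:
        raise ValueError("payload must be non-empty")
    size = 128  # assumed smallest padded piece for this sketch
    while unpadded_capacity(size) < payload_bytes:
        size *= 2  # stay on powers of two
    return size
```

For instance, a 127-byte payload fits in a 128-byte padded piece, while a 128-byte payload already needs the next power of two.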

Protocol

Storing SubDeals Protocol Flow

Client side

Generate a SubPieceCID: the client takes the DAG, turns it into an Fr32-friendly form padded to the next power of 2, and generates the SubPieceCID hash (if you want the client to send straight IPFS hashes, see Variant 3).
Wrap it into a SubDealProposal (same as a DealProposal) and sign it.
Send it to the aggregator.
Send payment to the aggregator (if you are not happy with this level of trust, see Variant 2).

Aggregator side

Wait to receive enough SubPieceCIDs
Generate PieceCID
For each SubPieceCID, generate an inclusion proof from PieceCID to SubPieceCID
Send PieceCID to the miners for storage
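
The aggregator steps above can be sketched as follows. This is a simplified model, not the actual CommP implementation: it treats each SubPieceCID as a raw 32-byte commitment, assumes all sub-pieces have the same size and that their count is a power of two, and uses SHA-256 with the top two bits cleared as a stand-in for the piece-tree hash. All function names are hypothetical.

```python
import hashlib
from typing import List


def node_hash(left: bytes, right: bytes) -> bytes:
    """Stand-in node hash: SHA-256 with the top two bits of the last byte cleared."""
    digest = bytearray(hashlib.sha256(left + right).digest())
    digest[31] &= 0x3F  # assumed 254-bit truncation
    return bytes(digest)


def build_levels(sub_pieces: List[bytes]) -> List[List[bytes]]:
    """Return every tree level, from the SubPieceCID level up to the root."""
    n = len(sub_pieces)
    if n == 0 or n & (n - 1):
        raise ValueError("this sketch needs a power-of-two number of sub-pieces")
    levels = [list(sub_pieces)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([node_hash(prev[i], prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels


def inclusion_proof(levels: List[List[bytes]], index: int) -> List[bytes]:
    """Sibling hashes from the sub-piece at `index` up to (but excluding) the root."""
    path = []
    for level in levels[:-1]:
        path.append(level[index ^ 1])  # sibling of the current node
        index //= 2
    return path


# Four placeholder SubPieceCID commitments (hypothetical data).
sub_pieces = [hashlib.sha256(bytes([i])).digest() for i in range(4)]
levels = build_levels(sub_pieces)
piece_root = levels[-1][0]
proofs = [inclusion_proof(levels, i) for i in range(len(sub_pieces))]
```

With four sub-pieces, each proof holds two sibling hashes, so proof size grows only logarithmically with the number of subdeals in a deal.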

Getting SubDeal Storage Proof Protocol Flow

Client requests the proof for a subdeal via /getSubDealProof API
Aggregator constructs the proof from SubPieceCID to PieceCID and replies.
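
On the client side, checking the returned proof amounts to recomputing the path to the root. The sketch below is a minimal illustration under the same assumptions as a plain binary Merkle tree: 32-byte commitments, a stand-in SHA-256 based node hash, and a proof made of the sub-piece position plus its sibling hashes. The names are hypothetical and the proof encoding is not specified by the proposal.

```python
import hashlib
from typing import List


def node_hash(left: bytes, right: bytes) -> bytes:
    """Stand-in node hash: SHA-256 with the top two bits of the last byte cleared."""
    digest = bytearray(hashlib.sha256(left + right).digest())
    digest[31] &= 0x3F  # assumed 254-bit truncation
    return bytes(digest)


def verify_inclusion(sub_piece: bytes, index: int, path: List[bytes], piece_root: bytes) -> bool:
    """Recompute the root from the sub-piece and its sibling path."""
    node = sub_piece
    for sibling in path:
        if index % 2 == 0:
            node = node_hash(node, sibling)  # current node is a left child
        else:
            node = node_hash(sibling, node)  # current node is a right child
        index //= 2
    return index == 0 and node == piece_root


# Tiny example mirroring the a, b, c, d leaves in the diagram (placeholder values).
a, b, c, d = (bytes([i]) * 32 for i in range(4))
root = node_hash(node_hash(a, b), node_hash(c, d))
proof_for_c = [d, node_hash(a, b)]
c_is_included = verify_inclusion(c, 2, proof_for_c, root)
```

The client only needs the PieceCID of the published deal and the proof; it never has to see the other clients' data.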

Failed deals Protocol Flow

Deal data is lost by the miner
Aggregator gets payment back
Aggregator is trusted to give the money back to the user (otherwise see Variant 0)

Aggregator Software

Internal Storage

Proofs[PieceCID][SubPieceCID] → Proof

For each PieceCID store SubPieceCIDs and their inclusion proofs
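
One possible in-memory shape for this mapping is sketched below. The `Proof` fields, the reverse index, and the string CIDs are assumptions made for illustration; a real aggregator would likely persist this in a database.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass(frozen=True)
class Proof:
    index: int               # position of the sub-piece within the piece (assumed field)
    path: Tuple[bytes, ...]  # sibling hashes up to the PieceCID


@dataclass
class ProofStore:
    # Proofs[PieceCID][SubPieceCID] -> Proof
    proofs: Dict[str, Dict[str, Proof]] = field(default_factory=dict)
    # Reverse index so a lookup by SubPieceCID alone is possible (assumption).
    piece_of: Dict[str, str] = field(default_factory=dict)

    def put(self, piece_cid: str, sub_piece_cid: str, proof: Proof) -> None:
        self.proofs.setdefault(piece_cid, {})[sub_piece_cid] = proof
        self.piece_of[sub_piece_cid] = piece_cid

    def get(self, sub_piece_cid: str) -> Tuple[str, Proof]:
        piece_cid = self.piece_of[sub_piece_cid]
        return piece_cid, self.proofs[piece_cid][sub_piece_cid]


store = ProofStore()
store.put("piece-1", "sub-a", Proof(index=0, path=(b"\x01" * 32,)))
found = store.get("sub-a")
```

The reverse index is what lets a proof lookup take only a SubPieceCID and still return the pair (PieceCID, Proof).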

API

/storeSubPieceCID
/getSubDealProof/:SubPieceCID → (PieceCID, Proof)
/getStatus → [waiting for deals, submitted, stored]
/stats/:PieceCID → (signed set of subdeals)
/stats → ([PieceCIDs], [signedDeals])
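
The three statuses returned by /getStatus suggest a small state machine behind the API. The sketch below is one hypothetical way to model it, assuming a batch closes once a fixed number of SubPieceCIDs has been received; the class, the method names and the fixed batch size are illustrative, not part of the proposal.

```python
from enum import Enum
from typing import List


class Status(Enum):
    WAITING_FOR_DEALS = "waiting for deals"
    SUBMITTED = "submitted"
    STORED = "stored"


class AggregatorBatch:
    def __init__(self, batch_size: int) -> None:
        self.batch_size = batch_size  # assumed trigger for publishing the deal
        self.sub_piece_cids: List[str] = []
        self.status = Status.WAITING_FOR_DEALS

    def store_sub_piece_cid(self, sub_piece_cid: str) -> Status:
        """Handler behind /storeSubPieceCID."""
        if self.status is not Status.WAITING_FOR_DEALS:
            raise RuntimeError("batch is already closed")
        self.sub_piece_cids.append(sub_piece_cid)
        if len(self.sub_piece_cids) == self.batch_size:
            self.status = Status.SUBMITTED  # enough sub-pieces: deal is sent to miners
        return self.status

    def mark_stored(self) -> None:
        """Called once the storage deal is active on chain."""
        if self.status is not Status.SUBMITTED:
            raise RuntimeError("deal has not been submitted")
        self.status = Status.STORED

    def get_status(self) -> str:
        """Handler behind /getStatus."""
        return self.status.value


batch = AggregatorBatch(batch_size=2)
first = batch.store_sub_piece_cid("sub-a")
second = batch.store_sub_piece_cid("sub-b")
batch.mark_stored()
```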

Variants

These variants can be composed with each other.

Variant 0: Aggregator Escrow

The goal of this variant is to ensure that if a deal is ever lost, the client can get their money back.

This differs from the base design, where the aggregator gets the money back and is trusted to distribute it.

TODO: These two options need to be fleshed out more.

Option 1:

Extend the storage market actor escrow so that payments can be reimbursed to a wallet different from the one making the deal.
The wallets should either be listed on-chain (expensive) or be committed to as a tree of wallets. In the latter case, retrieving the payment requires submitting a proof (which could also be expensive).

Option 2:

There is an aggregator contract to which the client sends funds when making deals.
The address for deal failure reimbursement is this aggregator (broker) contract.
The client can withdraw from this contract.

Variant 1: Arbitrary Sizes Deals

The goal of this variant is to allow for arbitrary size deals.

@ will find a way.

Variant 2: Trustless aggregator

The goal of this variant is to avoid requiring the client to trust the aggregator node with their payments.

NOTE: This needs a new on-chain actor.

Pessimistic On-chain Option (not ideal)

Client sends money to the Aggregator Escrow contract for a subdeal to be included in a published storage deal
Aggregator generates a proof that the subdeal is included in a storage deal, submits it on chain, and receives the payment
If the aggregator takes too long to do this, then the client can get their money back
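
The pessimistic flow can be modelled as an escrow with a deadline. The following is a minimal off-chain sketch of that logic, not actor code: the class, the epoch-based deadline, and the boolean `proof_is_valid` (standing in for on-chain proof verification) are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class PessimisticEscrow:
    client: str
    aggregator: str
    amount: int
    deadline: int          # last epoch at which the aggregator may claim (assumed)
    settled: bool = False

    def claim(self, proof_is_valid: bool, epoch: int) -> str:
        """Aggregator submits the inclusion proof; returns who receives the funds."""
        if self.settled:
            raise RuntimeError("escrow already settled")
        if epoch > self.deadline:
            raise ValueError("deadline has passed")
        if not proof_is_valid:
            raise ValueError("invalid inclusion proof")
        self.settled = True
        return self.aggregator

    def refund(self, epoch: int) -> str:
        """Client reclaims the funds once the aggregator has missed the deadline."""
        if self.settled:
            raise RuntimeError("escrow already settled")
        if epoch <= self.deadline:
            raise ValueError("deadline not reached yet")
        self.settled = True
        return self.client


on_time = PessimisticEscrow("client", "aggregator", amount=10, deadline=100)
payee = on_time.claim(proof_is_valid=True, epoch=90)

too_slow = PessimisticEscrow("client", "aggregator", amount=10, deadline=100)
refunded_to = too_slow.refund(epoch=101)
```

The cost of this option is that every subdeal requires a proof to be posted and verified on chain, which is why it is marked as not ideal.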

Optimistic On-chain option (ideal)

Client sends money to Aggregator Escrow
Aggregator claims the money when the deal is submitted
Aggregator sends the proof to the client
If the client is not happy, they start a dispute process

If the aggregator shows the SubDeal proof at the dispute process → aggregator gets the money
Otherwise user gets their money back
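
The optimistic flow only touches the proof when there is a dispute. A hypothetical state machine for it is sketched below; the state names and the boolean `proof_is_valid` (standing in for proof verification during the dispute) are assumptions, and dispute windows and timeouts are left out.

```python
from enum import Enum, auto


class State(Enum):
    FUNDED = auto()    # client has deposited the payment
    CLAIMED = auto()   # aggregator claimed after submitting the deal
    DISPUTED = auto()  # client opened a dispute
    CLOSED = auto()    # dispute resolved


class OptimisticEscrow:
    def __init__(self, client: str, aggregator: str, amount: int) -> None:
        self.client = client
        self.aggregator = aggregator
        self.amount = amount
        self.state = State.FUNDED

    def claim(self) -> None:
        if self.state is not State.FUNDED:
            raise RuntimeError("nothing to claim")
        self.state = State.CLAIMED

    def dispute(self) -> None:
        if self.state is not State.CLAIMED:
            raise RuntimeError("only a claimed payment can be disputed")
        self.state = State.DISPUTED

    def resolve(self, proof_is_valid: bool) -> str:
        """Return who ends up with the funds after the dispute."""
        if self.state is not State.DISPUTED:
            raise RuntimeError("no dispute in progress")
        self.state = State.CLOSED
        return self.aggregator if proof_is_valid else self.client


honest = OptimisticEscrow("client", "aggregator", amount=10)
honest.claim()
honest.dispute()
honest_winner = honest.resolve(proof_is_valid=True)

cheating = OptimisticEscrow("client", "aggregator", amount=10)
cheating.claim()
cheating.dispute()
cheating_winner = cheating.resolve(proof_is_valid=False)
```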

Payment channel options

(Same as above, but in a state channel instead. This is very similar to the original payment channels with vouchers.)

Variant 3: Storing straight IPFS hashes

If the total number of bytes to hash is small, we can use SNARKs to prove that a SubPieceCID commits to the same data as an IPFS hash.

Note 1: at current speeds it might be faster for the client to generate the SubPieceCID themselves rather than delegate it to the aggregator node (which would bring us back to the protocol as presented).

Note 2: there is still value in this protocol even if the client can generate the SubPieceCID themselves, namely in the setting where the client is unaware of the concept of PieceCID. E.g. a blockchain user uses a smart contract that stores files on Filecoin; this user only knows about IPFS and wants to delegate everything that has nothing to do with IPFS. In this case, the smart contract can guarantee that the IPFS hash is correctly matched by verifying a proof provided by the aggregator node.

Protocol

Client generates IPFS hash of the dag
Client sends it to the aggregator node
Aggregator node turns that into SubPieceCID
Aggregator node generates a SNARK that SubPieceCID matches the data behind the IPFS hash
Aggregator sends the proof to client
Client is now convinced

Otherwise see:

Making Deals for IPFS CID

FAQ

Can a client themselves be an aggregator? This solution is not "natively recursive" because we want to respect the tree of CommD. They can be a broker as long as they respect the structure of the tree. DAGs can be aggregated but not composed.

Current solutions

Platform	DAG support	Stream support	Inclusion Proof	Reimbursement on failures	User payment
L2 - Web3.Storage
L2 - Estuary
L2 - Simple Merkle Deals
L1 - Deals (GFM)
L1 - Deals (Raw)
L1 - Deals (Boost)