- Intro
- Previous solutions:
- Limitations:
- New Proposal (”Optimistic SnapDeal”)
- Protocol flow (sketch)
- Why do we think this idea is meaningful?
Intro
Some optimizations have already been proposed, see SnapDeals with Reduced Overhead - Adopting alternative Data Commitment. They introduce the use of a different hash function (SNARK friendly) to compute a second commitment tree of the data.
The scenario is the following:
- We have some input data D
- We have a first commitment Comm1 (a MT with SHA) which is efficiently computable by the client but is inefficient to be proven inside a snark
- We have a second commitment Comm2 (a MT with Poseidon) which can be efficiently proven inside a snark but represent a consistent proving overhead for the client (~ 100x)
Question: How do one knows that both Comm1 and Comm2 are commitment to the same (original) D without having a prover to (always) compute a really expensive snark or a client to have a consistent proving overhead in computing Comm2?
Previous solutions:
PoMTT: interactive and off-chain proof of Merkle Tree translation.
That is, before signing the deal offer the Client requires the SP to do probabilistic checks on different random openings of both Comm1 and Comm2.
- client chooses C random positions and send those to SP;
- how big is C?
- say we are okay if the translation is wrong in at most 4% of the positions
- then (1 - 0.04)^C = 2^(-80)
- C = -80 / log_2(1-0.04) = ~1359
- SP sends the MT paths for both Comm1 and Comm2;
- Client checks the MT paths received;
If the checks does not convince the Client, he will not sign the deal proposal.
PoWMTT: non interactive proof of Wrong Merkle Tree translation.
At a latter time (after the deal is published), anyone retrieving the data D can challenge the SP that the translation was performed wrong at some position. This challenge would be resolved on-chain, by the SP posting a (singe) proof that translation was performed correctly at the given position (i.e. a single MT path in case of commitments done via MT).
Limitations:
- PoMTT:
- client overhead: handling the challenge-response protocol, computing some Poseidon hash for verification;
- (minor) not public verifiable proof;
- PoWMTT:
- client overhead: how to find the position where to challenge the SP?
New Proposal (”Optimistic SnapDeal”)
We propose to follow the “optimistic approach” of PoWMTT and in case of a client that suspects a wrong translation, we procede by requiring the SP to do a PoMTT.
Protocol flow (sketch)
- The SP submits Comm1 and Comm2 and snaps D into a CC sector, proving the Snap using Comm2
- Comment: snark becomes cheaper than current SnapDeal and will also be cheaper than sealing a new sector!
- When a client retrieves the file D*, he checks that Comm1(D*) = Comm1(D)
- Comment: only sha cheap computation for the client;
- If Comm1(D*) ≠ Comm1(D), then the client sends a fraud message onchain. The prover has X (TBD) hours of time to produce a proof (using snark) that the snapdeal was done correctly. There are two possibility of this proof:
- Option 1 (recommended): redo the snapdeal protocol using Comm1 (same as current SnapDeal protocol);
- Option 2: snark the non-interactive version (via fiat shamir) of PoMTT;
- SP derives C challenges from comm1 and comm2
- SP computes the snark for the C MT paths for both comm1 and comm2
- With C = 1359 this should be more or less the same number of constraints of current SnapDeal protocol (1376).
- SP submits the snark proof onchain
- If the prover does not submit the proof, the sector is flagged as “wrongly translated” and we may slash the provider;
Comments:
- One can realise that Comm1 and Comm2 are not both opening to D if and only D is retrieved ;
- Option 1 is the recommended one because does not requires a new trusted setup for the snark (which is needed for option 2)
- Note that the PoMTT protocol has a margin of error (eg, 4% can be different but the proof is valid) and this will result in the case of
- client seeing comm1(D) ≠ comm1(D’);
- but SP being able to produce a valid PoMTT for comm1(D) and comm2(D’);
Why do we think this idea is meaningful?
- It is enough to download D once in order to realize if there is a mismatch. Basically it is a "once for all” retrieval
- If the file is not retrieved AND there is a mismatch, then the mismatch got unnoticed
- BUT: does it matter if there was a mismatch on some Comm1 and Comm2 regarding some data D that never gets retrieved?
- On one hand
- If the data is never retrieved, then one can claim that "nobody cares that this data is actually there” during the whole deal duration
- After deal duration any data can be removed from the network, so if nobody retrieved it there is no difference "in practice” in having the real D stored for the deal duration or a modified D*
- Space hardness is not put at risk by having Comm1 and Comm2 opening to two different data D and D*
- On the other hand
- We can not ensure a specific data is stored in the network a priori (up until it gets retrieved). This can be something undesirable from the SLA perspective