@Irene, June 2023
⚠️ Warning: This doc assumes the reader to be familiar with the Filecoin protocol and in particular knowledgeable of methods related to proving storage (ie, SDR-PoRep, WindowPoSt, WinningPoSt). If you need to study this part, please refer to .
🗣️ Please, be aware that this doc does not describes neither analyses each construction in full details. Links to the relevant documents are provided for this! [Some of these docs may not be public yet, sorry! Ping me if you need access.]
- Proofs of Useful Space - adding storage
- Asymmetric Constructions
- ZigZag
- DRG PoRep
- Symmetric Constructions
- Small SDR
- NSE
- Wide PoRep
- Others…
- PoSt & friends - auditing storage
- Latency model
- ElectionPoSt
- Sequential ElectionPoSt
- ChainPoSt
- Cost Model
- SurprisePoSt
- BucketPoSt
- Periodical Rescaling
- ChainPoSt
Security Model | Possible paired PoSt (auditing schedule) | Security analysis completed? | “Fast” (faster than SDR) retrieval ? | |
ZigZag | latency
(based on SHA256 latency) | winningPoSt | no (sealstack to solve, bug in the pebbling argument to fix) | yes (asymmetric) |
DRG PoRep | latency
(based on SHA256 latency) | winningPoSt | no (sealstack to solve) | yes (asymmetric) |
Small SDR | cost and latency
(based on SHA256 evaluation) | windowPoSt/ chainPost | yes | yes (symmetric) |
NSE | cost
(based on SHA256 evaluation) | windowPoSt
(or anything from the cost model category) | yes | yes |
Wide PoRep | latency
(based on SHA256 latency)
| winningPost
| yes | yes-ish
(no explicit params) |
Proofs of Useful Space - adding storage
Here we provide a list of possible replacements for the Stacked-DRG proof of useful space (aka “SDR”). For each construction, we provide (1) a very high level idea of the construction, (2) some comments about possible pros/cons and similarities respect to SDR (eg, security model) and (3) links to relevant documents with more details.
Asymmetric Constructions
We call a PoS “asymmetric” when the decoding algorithm (from replica back to data) is faster than the encoding algorithm (ie, the algorithm run by the prover/SP to go from data to replica).
Note any asymmetric construction used in the latency model (eg, in combo with winningPoSt) is subject to the following problem:
SealStack Attack: Assume we have a PoRep/PoUS/ with fast unsealing/decoding. The SP runs the sealing process and creates replica R (from data = 0, or any data); The SP runs the sealing process again with data = R and creates replica R'; The SP deletes R and keeps only R'. Every time a post (eg, winning PoSt) is required for R, the SP runs the unsealing algorithm on R' and gets R. If the unsealing algorithm is faster than 30s (block time/response time), then winningPoSt is broken. In other words we must have RetrievalLatency > RegenLatency > ResponseTime (Note: if unsealing algorithm is fast but still as expensive as sealing, then WindowPoSt is still secure and we can use the asymmetric construction in the cost model ) Possible solutions:
- (sub optimal) Use the construction in the cost model with windowPoSt only;
- Maybe use the construction for FIL+ deals (data can not be chosen by the prover);
ZigZag
Construction: It is similar to Stacked-DRGs (SDR) PoUS and presented in the same paper. Respect to SDR, it achieves fast retrieval because the decoding procedure can per run in parallel (ie, all nodes are decoded in parallel). In the paper, it was designed to work in the latency model (so it can work with any PoSt from the next section from the “latency model” category). The latency bound uses SHA256 running time on ASICs
It should be possible to use it in the cost model with similar analysis as the one we did for SDR. Comments:
- Same spacegap as SDR (20%), similar or slightly worse params than SDR; Sector size can stay as today (32/64GiB);
- ⚠️ The security proof in the paper is broken (irene and luca found a bug that still need to be fixed and will have consequences on the parameters)
- Sealstack
Links:
DRG PoRep
Construction:
Take a DRG, and use the “DAG encoding” to create the replica R from the data D: R_i = D_i + H(R_j for all j that are parents of i ) (note: encoding is sequential, decoding is parallelazible)
Comments:
- Sector size can stay as today (32/64GiB);
- Fast retrieval (10 s for 10MB)? Yes, this solution is an asymmetric porep with local decodability)
- 🔍 Large spacegap (if we use the DRG we have now)
- Possible fix: find better DRG construction! See for example https://docs.google.com/document/d/1GAk87FUms8IZkJXJ4Kt5_Za_4Q1b-GnRDR2UA3UVt_o/edit#bookmark=id.omi8z5hfyxc3
- It works well in the latency model, it can be used paired with winningPoSt and then a rescaling method (or fallback mechanism, see next section)
- 🔍 SealStack problem
Links:
- https://docs.google.com/document/d/1GAk87FUms8IZkJXJ4Kt5_Za_4Q1b-GnRDR2UA3UVt_o/edit#bookmark=id.1cz0pndl5udo
- https://eprint.iacr.org/2018/678.pdf
Symmetric Constructions
We call a PoS “symmetric” when the decoding algorithm is the same as decoding.
Small SDR
Construction: The same as the currently used SDR, just smaller sector size.
Comments:
- Sector size suggested to get fast retrieval: 512MiB/256MiB
- As it is today for SDR, we can use this both with latency and cost model:
- cost model: use windowPoSt (same as current shipped solution but more snarks per day per miner). Could use snarkpack/aggregation here to solve/improve this?
- latency model: we can not use winningPoSt (latency bound is smaller than 30s), maybe we can used the ChainPoSt idea (requires VDF)
- in 2021, we estimated the per day SNARK proofs generated for windowPoSt be higher than the one with SDR.
- Note that the impact of this have changed now, since we use the optimistic approach and we could use SnarkPack.
Links:
NSE
Constructions: Based on labeling a layered graph. The graph is made by expander graphs and butterfly graphs (no DRGs) and it is also divided horizontally in small (ie, 4MiB) independent windows. These can be labeled in parallel, so helping with fast labeling and faster retrieval than SDR. Designed to work in the cost model, the main parameters were give for cost based on SHA256 evaluation (same as SDR) but some informal arguments about labeling being also memory/bandwidth hard were given (but no formal analysis for that)
Comments:
- Sector size was much lager than 32/64GiB, but time to first byte was lower!
- The idea was to use it together with windowPoSt, but we estimated the per day SNARK proofs generated for windowPoSt be higher than the one with SDR.
- Note that the impact of this have changed now, since we use the optimistic approach and we could use SnarkPack.
Links: https://github.com/filecoin-project/research-private/blob/master/papers/nse.pdf
Wide PoRep
Construction:
Based on labeling a layered graph. The graph consists of a unique (larger) DRG layer on the top and then smaller (narrowed) butterfly layers below of it. Do the same labeling as in SDR and then use the “XOR encoding” to create the replica R from the data D.
Comments:
- Goal: keep decoding reasonably cheap (and low-latency) while still getting strong latency bounds
- Fast retrieval? maybe yes (no explicit params)
- How do we do the audit (PoSt)? WinningPoSt should work, or others in the latency model. Note clear if we can use this in the cost model, maybe yes!
Links:
- See construction 2 here https://github.com/filecoin-project/research-private/blob/master/papers/wide.pdf
Others…
Others (less clear or less studied proposal) proposals:
- Stacked-Expanders and similar ideas (Stack expanders with much higher degree than the one used in SDR and improve the regeneration bound, to be used in the cost model)
- Tried to use constructions similar to the Stacked-Expander or NSE in a cost model where the main cost is given by memory access (ie, bandwidth hard graph), but did not manage to find good params. Links:
- Expanded-SDR (This construction is based on the combination of small SDR graphs and a big expander that acts like a glue. To be used in the latency model, mainly)
- Mem PoRep
- Memory-intensive PoRep
- DRGs with high degree, small sector size (e.g., 4 GiB)
- Edges determined dynamically (as function of hash outputs)
- Goal: rely on memory-hardness for adversary
- Modern GPUs: very high memory bandwidth (~1 TB/s)
- Either cost or latency based
- WinPoRep (windowed porep, overaleaf by Juan)
- VDE-PoRep
- VDE stay for verifiable delay encoding (it can be instantiated using a VDF)
- R_i = VDE(D_i+ H(comm_D))
- it works in the latency model, VDF a_max can be problematic
- see https://eprint.iacr.org/2018/678.pdf
- Selective Unsealing
- For example: “per node sealing” → Each node is sealed and unsealed independently
- Toy Example: a "comb sealing", where each node of the key is the composition of multiple hash of different nodes) , K1 ← P1 ← P2 ← … ← PN , K1 =: H(P1) , Pi =: H(Pi+1), R1 = K1 + D1
- Pros: We can have both faster retrieval and updatability
- Cons: It seems this requires a huge amount of online challenges or sealing cost will spike!
- This construction works in the cost model.
- Link: https://docs.google.com/document/d/1fpxNiGOoPFhBbHETpVFrTYeUi88upyeOR6oEN5I-BiU/edit#bookmark=id.mee7y8y9um7
PoSt & friends - auditing storage
This section contains a list of alternative designs we considered for auditing storage over time before we decided to ship Filecoin with WinningPoSt and WindowPost. 👉 Note: not any PoSt can be used paired with any Proof of Useful Space. Once a PoUS is chosen, then a security model is chosen (latency or cost, rarely both are possible) and we can go a pick a PoSt in that category. Moreover, the explicit PoSt params will depends on the PoUS chosen. For example windowPoSt for SDR does not have the same parameters as windowPoSt for NSE.
Latency model
ElectionPoSt
Check every (or a fraction of) sectors at each epoch (eg, every 30 s), use the PoSt proofs as input for a standard VRF based election. Protocol overview:
- At the beginning of the block:
- Miner gets challenges from the chain (or drand)
- A subset of sectors are randomly selected: selected = sample_rate * miner_size
- For each selected sector, miner reads some random positions (can be done in parallel), hash do generating the “partial tickets”
- Miner checks if any of the partial ticket is winning the VRF-based election
- something like, partial ticket < target
- A winning ticket and can be used as to submit a block (generate a single PoSt for the block)
- the probability of scratching a winning ChallengeTicket is dependent on sector size and total storage. Miners are rewarded in proportion to the quantity of winning ChallengeTickets they generate in a given epoch, thereby incentivizing a miner to check as much of their storage as allowed in order to express their full power in a leader election.
Security (latency model):
- if a sector is not stored, the miner has no time to regenerate and can to create the partial ticket
Major Problems:
- if the sample_rate = 1 —> hardware requirements (i.e. if you are a big miner you have to try all your sector in a short period of time), checking all HDDs every 30s may not be not feasible!
- if sample_rate ≠ 1 —> possible power table inflation
Links to past relevant docs:
- https://docs.google.com/document/d/1146plsfBhltbbsXcxsQB0DoAliw9QkZr74IZH7KxiPk/edit
- https://docs.google.com/document/d/1elvcz5Ln0mqOJAv5XqsOeu0TtljqgUuKvLDoj0d57fs/edit
- https://www.overleaf.com/project/5e1dd99ac28f70000196603b
Sequential ElectionPoSt
Comments:
- Proposed to get fast sealing/unsealing
- estimation in 2019: ~137s per 32GiB
- The PoRep was DRG-PoRep
- The security model was latency and it tried to solve this problem
- large challenge-response time (1h) —> large sealing/unsealing time (120 hours), not good for product
- short challenge-response time (1 min or 15 secs) —> miners not able to post on chain on time, footprint is high, but sealing is fast
- Commit-then-prove paradigm
Protocol overview:
- At the beginning of the block:
- Miner gets challenge from the chain (or drand)
- Miner generates a PoSt with sequential challenges (PoSt on sector i gives the challenges for PoSt on sector i+1)
- Miner checks if they won the block using the output of the last post
- Commit to the post output (witness)
- After X blocks:
- Miner submits a SNARK proof
Major problems:
- same as ElectionPoSt (?)
Links to past relevant docs:
- https://docs.google.com/document/d/1qKr1SAf9RnD1GLWWgFKRfmyNTIvP36rHHB2Suqm36a4/edit
- https://docs.google.com/document/d/1syQ6H-FLYU7UJqcHY3qpg9fMZZ3zSHCShVdVa5RGVE4/edit#
ChainPoSt
Challenge a new sector every x (each sector is chosen at random), post on chain only one SNARK after many sectors are proven.
drand seed —> PoStProof for R1 → VDF → PoStProof for R2 → submit on chain
To cover block time: 30 seconds say RegenLatency = x seconds
say VDF runs for y seconds
numbers of PoSt proofs = n
⇒ n*x + (n-1)*y = 30 seconds (n = (30+y)/ (x+y)
If you do not have a sector, you could lose the block reward (same as winningPoSt Rational argument) Attack:
run the VDF in y/A_max time say for example A_max = 2
Notes:
- it should be possible to use this with “small SDR/small DRG” PoRep to get fast retrieval
- security in the latency model: either most sectors are stored, or the probability of be able to post a valid SNARK is low
- Link: https://docs.google.com/document/d/1GAk87FUms8IZkJXJ4Kt5_Za_4Q1b-GnRDR2UA3UVt_o/edit#bookmark=kix.gwtw1oh2qsha
Cost Model
SurprisePoSt
Given a proving period P consisting of k epochs and a power table with length M (meaning with M miners), at each epoch, M/k miners are sampled at random and need to prove their storage (or a fixed amount of his sectors).
Security: in the cost model
Deprecated in favour of WindowPost, since the second one was easier to implement and analyze.
Minor comments:
- WindowPoSt was called “Scheduled PoSt” in some old docs;
- We used “fallback PoSt” as a generic name for Surprise/Window PoSt. The idea was: the “main” PoSt (for security) is Election PoSt and then you need another mechanism for rescaling/power table inflation
BucketPoSt
Idea: Like winningPoSt (PoSt at election time, not periodically), but (1) use the cost model instead of latency (2) challenge a fraction of sectors instead of only 1
Problem: it require a minimum size for miners. Link: https://docs.google.com/document/d/1GAk87FUms8IZkJXJ4Kt5_Za_4Q1b-GnRDR2UA3UVt_o/edit#bookmark=id.bn7cr5ajl16j
Periodical Rescaling
[This is not a PoSt mechanism] There are periodical checkpoint where each miner locally updates his own power table according to the frequency each other miner posted Election/Winning PoSt proofs on-chain since the last checkpoint.
This can be used in the place of any fallback PoSt mechanism.
Problems:
- may be hard to implement ??
- it was not enough for “storage market soundness/security” (aka “sector accountability”): A miner has to prove that he can access/retrieve the sectors he is claiming (with some frequency). See https://docs.google.com/document/d/1gJrAmuUKsMSS4K-u3elPp11WE7NY1qt3Pzqw30mnbkE/edit
ChainPoSt
Challenge the same sector every few hours, post on chain only one SNARK after many PoSt proofs for the sector are collected.
Recap:
Consider the scenario of a scheduled/windowPoSt (ie, per sector, miner proves once every X hours). VDF can be used to space out different execution steps of a Proof of Space (aka PoSt) without having to post on chain each PoSt.
drand seed —> 1st PoStProof → VDF → 2nd PoStProof → submit on chain
Security (in the cost model):
VDF runs for Y hours
We need Y + operational time = X So Y needs to be close to X If this is not possible, repeat many time:
drand seed —> 1st PoSt → VDF → 2nd PoSt → VDF —> … n-th PoSt —> submit on chain
n*Y ~ X
Attack:
The attacker has SW and HW to run the VDF in Y/A_max (faster)
The attacker is not storing and regenerated for the 1st PoSt, then runs VDFs and the other PoSt proofs, as soon as he is done, it deletes the sector. RC = regen cost SC = storage cost for X hours (this is the cost for honest player) The attacker pays: RC + SC/X * (nY/A_max) ~ RC + SC/A_max
Or the attacker pays regeneration twice, to not store during VDF running time
Comparison formula: RC > SC* min {(A_max -1) / A_max, 1/2}
Without VDF: RC > SC
So we can have better parameters with the VDF!