This is a short outline of the features and changes needed to make CC sector upgrades a reality.
Intro:
Since 90+% of sectors in the Filecoin Network are CC sectors, a protocol that lets CC sectors be updated to store real data without a full re-sealing would massively increase the amount of real data stored on the network.
Relevant docs/links:
- https://github.com/filecoin-project/FIPs/issues/131 - FIP Issue
- https://docs.google.com/document/d/1lerNbgVMBBEDOHC4dqexiS-IKUDOBOy_eTu7fjyv5gs/edit# - Lightweight Sector Updates in Filecoin
Implementation breakdown, per component:
Actors:
- Finalizing the construction
- One message
- Actor update to v6
- DeclareUpdate implementation
  - Looks like we won't need this message
- ProveCommitReplicaUpdate implementation
- Remove old CC sector upgrade path
- Actors Tests
- State migration
Proofs:
- New: Sector upgrade circuit
- Circuit design
- Audit
- Implementation
- Phase 2 Trusted Setup
- Encoding Decoding function
- Integration into the system
- New API endpoints
  - EncodeNewDeals (encode a set of new deals into a given CC replica)
  - VerifyReplicateUpdate
  - ComputeReplicaUpdateProof
- FFI Integration
Lotus:
- Change: Lightweight CC upgrade pipeline
- FFI integration
- Actor and State upgrade integration
- Remove old CC upgrade feature from Lotus
- Testing
Audit:
- We need an internal audit. It can be done right after the actors implementation, by @Nicola and @Friedel Ziegelmayer.
- We need an external audit. We have a list of vendors for this. It is optional for this implementation.
- We need a circuits audit. We worked with J.P. on this before and will do so again.
Trusted setup:
- Participants will be selected from the groups we used for the previous setup. They need to be lined up and given guidance before we actually start running the setup. This would mostly fall under the TPM scope of work, estimated at 1 week; we should start this process 3 weeks before we start running the trusted setup.
- Running the trusted setup - 4 weeks would be needed to complete this process
- An open question, which will be answered only once we implement the circuits: how long will each participant's trusted setup run take?
- We might be able to parallelize some of the work, since we can break validators into groups: 64 GB or 32 GB test groups.
| Q3 Plan | How much time it will take - estimation | Man-days estimation | DRI: Who owns this line of work and deliveries | Comments/Flags | Current status |
|---|---|---|---|---|---|
| | Aug 25th - October 15th | ~ 40 days | @Friedel Ziegelmayer @nemo | We have 2 big unknowns: 1. Integration of new circuits into the system; we don’t know how that will impact things. 2. Circuit design changes are always tricky. | In progress |
| | August 11th - August 24th | ~ 11 days | @ToBeRemoved | We can start the internal audit of the implementation as soon as we have the actors implementation | Backlog |
| | August 26th - Sept 3rd | ~ 5 days | @Kubuxu | We can start earlier with mocking and opening the APIs; we don’t need to wait for the entire proofs implementation to be done | Backlog |
| | Sept 5th - Sept 13th | ~ 7 days | @ToBeRemoved @Kubuxu | | Backlog |
| | Sept 14th - Sept 24th | ~ 5 days | @Magik6k (we need to help with this) | It will take 5-7 days for @Kubuxu or @ToBeRemoved to do it. Would be a great knowledge-sharing opportunity. | Backlog |
| | September 24th - October 1st | ~ 5 days | @ToBeRemoved | | Backlog |
| | October 1st - October 15th | ~ 10 days | @Kubuxu | | Backlog |
| | October 15th - October 21st | ~ 4 days | @Kubuxu | | Backlog |
| | October 10th - November 10th | ~ 20 days | @Friedel Ziegelmayer + @Nicola for the internal audit, @Dragan Zurzin to TPM the circuits audit with external vendors (J.P.) | We can start the audit as soon as we have the proofs writing done | Ongoing discussion |
| | October 29th - November 7th | ~ 7 days | @Dragan Zurzin @jennijuju | | Backlog |
| | November 10th - December 17th | ~ 25 days | @Dragan Zurzin + development team representative | We would need new parameter files, which will generate new SNARKs. There is a possibility that we can start the circuits trusted setup before the planned timeline. We would need approval from all sides to do this: we freeze the circuits and start the setup on that part | Ongoing discussion |
| | December 20th - January 15th 2022 | ~ 20 days | @Aayush Rajasekaran, @ToBeRemoved | | Backlog |
| | January 15th 2022 - 5th of February 2022 | ~ 15 days | @jennijuju @Dragan Zurzin | Since we need to make good on our promises to Exchanges and Miners: 2 weeks of notice period | Backlog |
| SUM: 167 Man-Days | | | | | |
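
The date ranges and man-day estimates in the table above can be cross-checked mechanically. The following is a minimal sketch, not part of the original plan: it assumes all undated rows fall in 2021, counts Monday to Friday as working days, ignores holidays, and assumes one person per row. The helper names `working_days` and `gaps` are hypothetical.

```python
from datetime import date, timedelta


def working_days(start: date, end: date) -> int:
    """Count Monday-Friday days from start to end, both inclusive."""
    days = 0
    current = start
    while current <= end:
        if current.weekday() < 5:  # 0-4 are Monday-Friday
            days += 1
        current += timedelta(days=1)
    return days


# (start, end, estimated man-days) for the first two table rows.
# Assumption: the year is 2021; holidays are not modeled.
rows = [
    (date(2021, 8, 25), date(2021, 10, 15), 40),
    (date(2021, 8, 11), date(2021, 8, 24), 11),
]


def gaps(rows):
    """Return (start, end, estimate, calendar working days minus estimate)."""
    return [(s, e, est, working_days(s, e) - est) for s, e, est in rows]


for start, end, estimate, gap in gaps(rows):
    print(f"{start} to {end}: estimate {estimate}, calendar gap {gap:+d}")
```

A negative gap means the date range holds fewer single-person working days than the estimate, so the row either needs more than one engineer or a longer window.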
- Timeline snapshot before changes made in Madeira - 2021-11-04 - @Dragan Zurzin :
Comments:
2021-08-05 @Steve (biglep) thoughts:
Thanks a lot for putting this together! A few comments:
- What options do we have for pulling this in tighter?
@Dragan Zurzin from my perspective, options/comments are:
- @Kubuxu or @ToBeRemoved to start work on the Lotus + actors changes, since that is not connected to the proofs writing, try to finish that work ASAP, and then jump to the proofs side to help Nemo with the proofs implementation
- I see no way of doing the internal audit, the circuits audit, and the trusted setup any faster, based on the feedback I got from Friedel and Deep K.
- The RC timeline is intentionally extended by a week, since we expect a lot of "delivery-critical" people to be on holidays
- It's useful seeing the breakout of tasks - great. Did we estimate those individually to roll up into larger buckets?
- How are we accounting for items with multiple engineers, or are all things single-threaded?
- Are we applying any scaling factor to some of the dev estimates? If so, what is it?
- Can we break things down a little more where there are dependencies/blocking? For example, we say "We can start an internal audit of the implementation as soon as we have the actors implementation". Are we modeling this? Another example is that it seems like we might need a task for "create mocked proofs", which then unblocks Lotus work.
- To make the modeling clearer, we can link related items (link rows in the table to themselves). Here's an example: https://www.notion.so/protocollabs/biglep-af9cf9adf6b5486788faddf40d3026d0
- Do we need to do any further design work earlier on?
- @Kubuxu: Luka and I are planning some circuit design work for next week (to speed up the proofs implementation). Some aspects of the Lotus + proofs integration need a bit more design, but that is factored in.
- Is documentation accounted for?
- @Kubuxu: Yes, but it depends on how we integrate this capability. My plan was to allow miners (via a config) to seal deals into either new sectors or CC sectors. In the future we can enable per-deal decision making, but IMO the config would be enough for v1.
- In general, if lightweight upgrades work as well as they seem to, in future PoRep designs we may consider not allowing data to be added at the time of sealing.
- Can we describe a bit more the automated testing story? I assume we need the ability to run a testnet and go from CC sector to the upgraded sector (asserting the happy path and various failure scenarios). Do we have the ability to write a test like that today? If not, is there some integration test frame additions we need to account for?
- With the integration test refactor that has landed, we possess the capability to test this
- Right now we have a table of tasks above and then a timeline view of those tasks (copy/pasted, I think). It seems like there's a margin for error with the copy/pasting and with ensuring that the dates actually account for the amount of estimated time. Your call, but one idea to tighten this up may be to:
- Have separate columns for start/end date and a derived column for the difference between the two to ensure that aligns with the actual estimates
- Use views on one database: have a table view and a timeline view. This avoids having two databases. Example: https://www.notion.so/protocollabs/biglep-af9cf9adf6b5486788faddf40d3026d0
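
The dependency modeling asked about in the comments above (the internal audit waiting on the actors implementation, mocked proofs unblocking Lotus work, the circuits audit waiting on the proofs writing) can be sketched as a small task graph. This is an illustration only: the task names are paraphrased from this document, the durations are hypothetical placeholders rather than the plan's estimates, and placing the trusted setup after the circuits audit is an assumption. It uses `graphlib.TopologicalSorter` from the Python standard library (3.9+).

```python
from graphlib import TopologicalSorter

# Hypothetical durations in working days, for illustration only.
durations = {
    "actors implementation": 10,
    "internal audit": 11,
    "mocked proofs": 5,
    "lotus integration": 10,
    "proofs implementation": 40,
    "circuits audit": 20,
    "trusted setup": 25,
}

# task -> set of tasks that must finish before it can start.
depends_on = {
    "internal audit": {"actors implementation"},
    "lotus integration": {"mocked proofs"},
    "circuits audit": {"proofs implementation"},
    "trusted setup": {"circuits audit"},  # assumed ordering
}


def earliest_finish(durations, depends_on):
    """Earliest finish day per task, assuming unlimited parallel staffing."""
    graph = {task: depends_on.get(task, set()) for task in durations}
    finish = {}
    # static_order() yields every task after all of its prerequisites.
    for task in TopologicalSorter(graph).static_order():
        start = max((finish[dep] for dep in graph[task]), default=0)
        finish[task] = start + durations[task]
    return finish


finish = earliest_finish(durations, depends_on)
for task, day in sorted(finish.items(), key=lambda item: item[1]):
    print(f"day {day:3d}: {task} done")
```

The largest finish value is the length of the critical path, which is the part of the plan worth trying to shorten first; tasks off that path have slack.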