
    Direct data onboarding: impacts & work outline

    Creator
    Alex North
    Created
    May 31, 2023 2:10 AM
    Project
    Storage Programmability

    What is Direct Data Onboarding (aka Direct FIL+)?

    Direct data onboarding is a label for changes to the Filecoin built-in actors, and the components that interact with them, to enable gas-cheap data onboarding, including Filecoin Plus. The built-in storage market actor is bypassed, and the verified registry’s on-chain allocation/claim record suffices to represent a simple “deal”. Direct data onboarding can’t handle client payments (yet), but most deals today have no on-chain payment anyway.

    After the initial Direct data onboarding changes, subsequent, smaller changes will be possible to support scalable FIL+ allocations (single transaction covering multiple sectors) and flexibility for other kinds of data applications.

    See also:

    • 🎯Direct data onboarding & FIL+ technical design
    • 🕸️Miner protocol change dependencies
    • 🕊️Free CommD: Unconstrained data applications on Filecoin

    Impacts

    Immediate impact

    Reduced gas cost for data onboarding

    Judging by analysis of some batched deal and onboarding messages from mainnet, the potential gas savings for Direct FIL+ onboarding, as compared with a built-in market verified deal, are:

    • PublishStorageDeals: total cost 1,911M gas, direct FIL+ cost: 283M, saving: 85%
    • PreCommitSectorBatch: (not yet measured, but some saving expected)
    • ProveCommitAggregate: total cost 2,509M gas, direct FIL+ cost: 1,491M, saving: 36%

    If participants do choose to use a built-in market deal (e.g. for a client payment) but take advantage of the new APIs, deals will not need to be specified or checked at pre-commit. The PreCommitSectorBatch gas reduction will benefit these parties too.

    Note that the reduction in total gas costs is accompanied by a change in the parties paying them. Instead of the SP paying for PublishStorageDeals, the client pays for creating a datacap allocation (~15% of the PSD cost). Additional mechanisms could be added so that the SP pays this cost, if needed.
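    The PublishStorageDeals figures above can be checked with simple arithmetic. The sketch below is illustrative only: the function and constant names are made up for this example, and the gas numbers are the ones quoted above (in millions of gas units).

```python
def saving_fraction(baseline_gas: float, new_gas: float) -> float:
    """Fraction of the baseline gas cost avoided by the new flow."""
    if baseline_gas <= 0:
        raise ValueError("baseline_gas must be positive")
    return 1.0 - new_gas / baseline_gas


# Figures quoted in this document, in millions of gas units.
PSD_BUILTIN_MARKET = 1911  # PublishStorageDeals, paid by the SP today
PSD_DIRECT_FILPLUS = 283   # datacap allocation, paid by the client

psd_saving = saving_fraction(PSD_BUILTIN_MARKET, PSD_DIRECT_FILPLUS)
client_share = PSD_DIRECT_FILPLUS / PSD_BUILTIN_MARKET

print(f"PSD saving: {psd_saving:.0%}")                    # PSD saving: 85%
print(f"Client cost relative to PSD: {client_share:.0%}")  # 15%
```

    The second number is the "~15% of the PSD cost" that moves from the SP to the client.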

    The first-order impacts of this are reduced SP onboarding costs and reduced variability in those costs, improving total returns. Reduced gas use for onboarding reduces competition for chain bandwidth, likely leading to a reduction in base fee and transaction costs for other parties.

    Second-order effects of reduced gas usage and base fee include a reduced base fee burn rate in the near term (base fee burn accounts for ~12.5% of net circulating supply outflows). Over the longer term, reduced onboarding costs may induce more onboarding (locking pledge accounts for the other 87.5% of supply outflows) and more user demand for other transactions.

    Availability of 5-year maximum FIL+ term for clients

    A verified client going direct can immediately specify a 5-year maximum term. The storage provider can only up-front commit a sector for 1.5 years (3.5 years after FIP-0052), but if the client has allowed it, they will be able to extend the sector out to the 5-year maximum while retaining full QAP with no further client interaction.
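    To give a feel for what "extend the sector out to the 5-year maximum" means for an SP, here is a deliberately simplified model. It assumes (this is an assumption of the sketch, not a statement of the actual miner actor rules) that each extension is made at the last moment before expiration and can push expiration to at most "now + maximum commitment", capped by the client's maximum term. The function name is hypothetical.

```python
def extensions_needed(max_commitment_years: float, term_max_years: float) -> int:
    """Count last-moment extensions needed to reach term_max_years.

    Simplified model: the initial commitment and every extension can set the
    sector expiration to at most (current time + max_commitment_years).
    """
    if max_commitment_years <= 0:
        raise ValueError("max_commitment_years must be positive")
    # Initial commitment, made at time 0.
    expiration = min(max_commitment_years, term_max_years)
    extensions = 0
    while expiration < term_max_years:
        now = expiration  # extend just before the sector would expire
        expiration = min(now + max_commitment_years, term_max_years)
        extensions += 1
    return extensions


print(extensions_needed(1.5, 5.0))  # 3 (expirations at 1.5, 3.0, 4.5, then 5.0 years)
print(extensions_needed(3.5, 5.0))  # 1 (expiration at 3.5, then 5.0 years)
```

    Under these assumptions, the longer commitment after FIP-0052 reduces the number of extension messages from three to one; none of them require the client to act.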

    Direct data onboarding without FIL+

    The changes also enable the onboarding of data without a FIL+ allocation or any deal with the built-in market. Details of the data onboarded are still committed to the blockchain (in message history) but without any facility for on-chain client payments. Direct data onboarding without FIL+ will be even cheaper in gas than using FIL+, and possibly cheap enough for some SPs to break even on paid-for data storage.

    Downstream impacts

    The changes will involve the creation and use of new methods in the built-in miner actor for sector onboarding. These will parallel the existing ones, but accept new parameters. The existing methods and flows will continue to work and behave as they have in the past, with minimal breaking changes. Participants can continue using these methods while the downstream changes to support Direct FIL+ are maturing, but will not be able to take advantage of the benefits until switching to new flows.

    The new onboarding methods will also be capable of supporting existing flows, so the old methods may eventually be deprecated to reduce code complexity. Such deprecation would require an additional FIP and is not imminent.

    Work required by PL Lotus and Boost teams is sketched in the work outline below. Other node implementations and deal-making services would require similar adaptations to take advantage of the new flows.

    Other downstream impacts include (TODO: list incomplete):

    • 🟡 Sentinel/Lily node (monitoring tools) and Starboards (analytics powered by Lily): will need a spec
      • need to update how deal activation is tracked
      • deal state table: updated on market deal state updates; depends on market actor state
      • deal onboarding flow, SP stats, client stats: currently track market deal proposals
      • average deal duration: tracked using PublishStorageDeals (PSD)
      • ecosystem impact: these stats are constantly cited as critical metrics for the Filecoin network (e.g. by Messari). Without an updated spec and integration, they could give the false impression that Filecoin data onboarding is slowing down:
        • Newly committed deals (in TiB)
        • Storage Data Utilized
        • Historical Daily Active Deal Count
        • Historical Daily Active Deals in PiB
        • Network storage capacity
        • Network reliability (Active faults %)
        • Clients by dataset size
    • Lending:
    • FIL+ explorers (need at least 1 month turnaround time):
      • data stored in published verified deals: depends on the CID checker/Glif market dump
    • Block explorers:
      • minimal work: add new message method
      • medium work: deal explorer (one, two)
        • in the future will need something like this - needs FRC.
    • Filecoin CID checker:
      • dependent on: StateMarketDeals API
      • 🟡 used by Glif market (one, two), which is used by a decent number of builders; they will need UI changes
        • during the last month (2023-06-01 to 2023-07-01):
          • marketdeals: 5460
          • marketdeals-calibration: 1144
    • Ledger: unknown new message (might be a blocker for the one below)
      • we accept any message, but with a restriction on size: the params must be less than 200 bytes
        uint8_t params[MAX_PARAMS_BUFFER_SIZE]; // MAX_PARAMS_BUFFER_SIZE = 200
      • another limitation is the number of parameters, which cannot be more than 255

    • FIL+ clients
      • FIL+ clients make datacap allocations on-chain, instead of submitting signed deal proposals to SPs.
      • How much will Singularity/Delta/Spade abstract this difference for them, and how much will end-user workflows change?
    • SPs
      • Miners must know FIL+ allocation IDs off-chain and submit them at ProveCommit. Similarly, they must track deal PieceCIDs for submission at ProveCommit.
      • How much will Lotus abstract this difference for them, and how much will operator workflows change?
    • Other ecosystem impact and tooling:
      • Tools/dashboards that provide visibility into the amount of $FIL used for storage across all available contracts
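    The Ledger limits listed above (params under 200 bytes, at most 255 parameters) can be expressed as a simple pre-flight check. This is a sketch: the function names are hypothetical, and the 40-byte entry size used in the example is an illustrative guess, not a measured size of any real message parameter.

```python
MAX_PARAMS_BUFFER_SIZE = 200  # bytes, from the Ledger snippet quoted above
MAX_PARAMS_COUNT = 255        # maximum number of parameters


def fits_ledger_limits(params: bytes, num_params: int) -> bool:
    """True if serialized params are under the size limit and the count is allowed."""
    return len(params) < MAX_PARAMS_BUFFER_SIZE and num_params <= MAX_PARAMS_COUNT


def max_entries(entry_size: int, overhead: int = 0) -> int:
    """Largest number of fixed-size entries whose total stays under the size limit."""
    if entry_size <= 0:
        raise ValueError("entry_size must be positive")
    return max(0, (MAX_PARAMS_BUFFER_SIZE - 1 - overhead) // entry_size)


print(fits_ledger_limits(b"\x00" * 199, 3))  # True
print(fits_ledger_limits(b"\x00" * 200, 3))  # False: not strictly under 200 bytes
print(max_entries(40))                       # 4 entries of an assumed 40 bytes each
```

    If the new onboarding messages carry per-piece data, a limit of this kind would cap how many pieces a hardware-wallet-signed message could include, which is why it is flagged as a possible blocker.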

    Observable state

    This table outlines the data that is committed to and observable on the blockchain.

    • State: data is in hot storage; it is accessible to smart contracts, readable by explorers, eventually deleted, and expensive. “x2” means it is stored twice.
    • Message: data is in cold storage; it is inaccessible to smart contracts, readable by explorers, permanent, and cheaper.
    • From sector: can be inferred from state data for the sector storing the data, in addition to any explicit representation.
    • If implemented (for contracts): data can be captured in state or messages by contracts that find it useful.

    The state data for a storage application smart contract depends on what that application implements. Much of the essential data is observable in any case from the messages submitted to confirm data storage, even where no smart contract is used.

    Data                     Today                             Direct FIL+                    App contract                  “Off-chain”
    Start epoch              Message + state x2 + from sector  Message + state + from sector  From sector + if implemented  From sector
    Deal label               Message + state                   None                           If implemented                None
    End epoch                Message + state x2 + from sector  Message + state + from sector  From sector + if implemented  From sector
    Client address           Message + state x2                Message + state                If implemented                None
    Provider address         Message + state x2                Message + state                Message + if implemented      Message
    Storage price            Message + state                   None                           If implemented                None
    Piece CID                Message + state x2                Message + state                Message + if implemented      Message
    Piece Size               Message + state x2                Message + state                Message + if implemented      Message
    Data↔sector association  Message + state                   Message + state                Message + if implemented      Message

    Work outline

    Rough efforts: 🐁 = small, 🐕 = medium, 🐎 = large, 🐘 = huge.

    Declare built-in market optional (+ data termination penalty?)

    Overall an easy change once we decide the product needs. No immediate user impact, but a prerequisite for subsequent changes. See the FIP draft and discussion.

    Effort:

    • Actors: implement data termination penalty (if needed) 🐁 
    • Lotus/Venus/Forest: remove market locked balances from circulating supply 🐁
    • Boost etc: None

    Direct data onboarding/FIL+ APIs

    New onboarding APIs support claiming FIL+ allocations without built-in market deals, and more efficient activation when deals are used. This is a more involved change with some outstanding technical design. The outcome is that simple FIL+ deals avoid the built-in market, giving cheaper gas and a 5-year term.

    Old flows will continue to work, so immediately required changes are few. But some coordination between teams is required to realise the value in end-to-end flows. Effort:

    • Actors: See technical design 🐎
      • New ProveCommit API
      • Move deal↔sector state to market actor
      • Market actor implementation of sector content change notification for deal activation
      • New sector termination handling
    • Lotus (the client)/Venus/Forest
    • Lotus-miner/Venus-cluster: 🐎
      • Always specify unsealed sector CID in PreCommitSector (breaking)
      • Omit deal IDs from PreCommitSector
      • Maintain metadata for sector pieces, allocations, deals until activation
      • Call new ProveCommitAggregate method with PieceCIDs and AllocationIDs
    • Boost: 🐕?
      • Direct FIL+ allocation on-chain as alternative to creating deal proposal
      • Monitoring verified registry for indication of completion
      • Future: parameterise deals with address of the market actor being used
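    The miner-side items above (maintain per-sector piece metadata until activation, then submit PieceCIDs and AllocationIDs at ProveCommit) could look roughly like the sketch below. All class, field, and method names are hypothetical, the piece CIDs are placeholders, and the parameter shape is an illustration rather than the actual ProveCommit API, which is defined in the technical design.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class PieceRecord:
    piece_cid: str                       # placeholder string, not a real CID
    piece_size: int                      # padded piece size in bytes
    allocation_id: Optional[int] = None  # set when the piece claims a FIL+ allocation
    deal_id: Optional[int] = None        # set when a built-in market deal is used


@dataclass
class PendingSectors:
    """Off-chain metadata an SP keeps from pre-commit until activation."""
    pieces: Dict[int, List[PieceRecord]] = field(default_factory=dict)

    def add_piece(self, sector_number: int, piece: PieceRecord) -> None:
        self.pieces.setdefault(sector_number, []).append(piece)

    def prove_commit_params(self, sector_numbers: List[int]) -> List[dict]:
        """Assemble per-sector piece lists; raises KeyError for untracked sectors."""
        params = []
        for number in sector_numbers:
            records = self.pieces[number]
            params.append({
                "sector_number": number,
                "pieces": [
                    {
                        "piece_cid": r.piece_cid,
                        "piece_size": r.piece_size,
                        "allocation_id": r.allocation_id,
                    }
                    for r in records
                ],
            })
        return params

    def mark_activated(self, sector_number: int) -> None:
        """Drop metadata once the sector is proven; the chain now holds the record."""
        self.pieces.pop(sector_number, None)


tracker = PendingSectors()
tracker.add_piece(7, PieceRecord("piece-cid-a", 2**34, allocation_id=101))
tracker.add_piece(7, PieceRecord("piece-cid-b", 2**34))  # data without FIL+
batch = tracker.prove_commit_params([7])
tracker.mark_activated(7)
```

    The point of the sketch is that, with deal IDs omitted from PreCommitSector, this mapping lives only with the SP until ProveCommit, so losing it would leave the SP unable to claim the allocation.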

    Subsequent enhancements (all optional)

    Scalable FIL+ allocations

    Support FIL+ allocations and claims for multiple sectors within a single transaction at ~constant cost. The outcome is that FIL+ accounting has near-constant cost regardless of size, further reducing the cost of FIL+ onboarding. See FIP discussion #708.

    Key efforts are a cryptographic vector commitment scheme, actors changes to verified registry allocation/claim state, and metadata for batch allocations/claims tracked by off-chain participants.
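    To illustrate the idea of a vector commitment, the sketch below uses a Merkle tree as a simple stand-in: one fixed-size root commits to a whole batch of allocations, and each allocation can later be proven against that root. This document does not say which scheme would be used, so the choice of Merkle tree, the item encoding, and all function names are assumptions made for illustration; this is not production-grade code.

```python
import hashlib
from typing import List


def _leaf(item: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + item).digest()


def _node(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def build_levels(items: List[bytes]) -> List[List[bytes]]:
    """Return all tree levels, leaves first; the last level holds only the root."""
    if not items:
        raise ValueError("cannot commit to an empty batch")
    level = [_leaf(i) for i in items]
    levels = [level]
    while len(level) > 1:
        # On an odd-length level the last node is paired with itself.
        padded = level + [level[-1]] if len(level) % 2 else level
        level = [_node(padded[i], padded[i + 1]) for i in range(0, len(padded), 2)]
        levels.append(level)
    return levels


def prove(levels: List[List[bytes]], index: int) -> List[bytes]:
    """Collect the sibling hashes on the path from leaf `index` to the root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling >= len(level):
            sibling = index  # self-paired last node
        proof.append(level[sibling])
        index //= 2
    return proof


def verify(root: bytes, item: bytes, index: int, proof: List[bytes]) -> bool:
    node = _leaf(item)
    for sibling in proof:
        node = _node(node, sibling) if index % 2 == 0 else _node(sibling, node)
        index //= 2
    return node == root


# Hypothetical encoding of five allocations: "client|provider|piece|size".
allocations = [f"client-1|provider-{i}|piece-{i}|34359738368".encode() for i in range(5)]
levels = build_levels(allocations)
root = levels[-1][0]  # the only value that would need to go on-chain
proof = prove(levels, 3)
print(verify(root, allocations[3], 3, proof))  # True
```

    The on-chain cost of the commitment is one 32-byte root regardless of batch size, while each claim carries a proof that grows only logarithmically with the batch. The batch contents and proofs are the off-chain metadata that participants would need to track.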

    On-chain notifications for data onboarding

    Smart contracts can receive synchronous on-chain notifications of data commitments into sectors. The outcome is that user-programmed contracts can implement markets and other applications without off-chain oracles or monitors for activation.

    The effort depends on how much groundwork we front-load into the initial Direct FIL+ changes. A key challenge is the security of invoking untrusted code during onboarding.
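    One way to picture the security challenge is a dispatcher that calls each registered receiver and isolates failures, so that a faulty or hostile contract cannot break sector onboarding. The sketch below is a plain Python analogy with hypothetical names. Whether a receiver failure should be ignored or should reject that piece is a design decision this document leaves open, and a real implementation would also need gas limits and re-entrancy protection.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class PieceCommitted:
    sector_number: int
    piece_cid: str   # placeholder string, not a real CID
    piece_size: int


# A receiver stands in for an untrusted contract; it returns True to accept.
Receiver = Callable[[PieceCommitted], bool]


def notify_all(receivers: Dict[str, Receiver], event: PieceCommitted) -> Dict[str, bool]:
    """Notify every receiver synchronously; a raising receiver counts as a rejection."""
    results = {}
    for address, receiver in receivers.items():
        try:
            results[address] = bool(receiver(event))
        except Exception:
            # Untrusted code failed; record it without aborting the other calls.
            results[address] = False
    return results


def market_contract(event: PieceCommitted) -> bool:
    return event.piece_size > 0


def broken_contract(event: PieceCommitted) -> bool:
    raise RuntimeError("contract bug")


outcome = notify_all(
    {"market": market_contract, "broken": broken_contract},
    PieceCommitted(sector_number=7, piece_cid="piece-cid-a", piece_size=2**34),
)
print(outcome)  # {'market': True, 'broken': False}
```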

    “Pull” optional on-chain CommD and sector state

    Allow SPs to store sector unsealed CID in chain state, where it can be inspected by smart contracts.

    Proven data deletion

    SPs can prove that data has been removed from a sector. (There are tricky product impacts here; this requires more than just the technical means to prove deletion.)

    CryptoNet is a Protocol Labs initiative.