
    L2 Deal Market

    Authors
    Nicola
    Creator
    Nicola
    Created
    May 2, 2023 4:07 PM
    Project
    Storage on-boarding

    Deal Market

    • Deal Market
    • Introduction
    • FAQ
    • Model
    • L1
    • L2
    • Open Design Questions
    • Role of deal aggregators
    • Layer 2 query to Layer 1
    • Data Market Protocol
    • Work plan
    • Notes
    • WIP Diagram
    • Call Anjor - @2023/05/24
    • Call Marina - @2023/05/25

    Introduction

    A large fraction of users onboard data onto the Filecoin network via a data aggregator: a network participant that looks for clients with FIL+ data and bundles it into a single, sector-sized deal with an SP.

    Another significant segment of users onboards large datasets onto Filecoin using a system of a Lead SP and Replication SPs. The Lead SP receives data from the client and manages its replication.

    Both of these processes expose a fundamental failure of the Storage Market system within Filecoin.

    Thanks to FRCXXX, users no longer have to trust aggregators to correctly add their data to larger deals; instead, they can obtain a proof that their data was correctly stored in a deal in the Filecoin Storage Market.

    Several data aggregators (such as Estuary) are supporting this new FRC. However:

    • Clients will now need to integrate with each data aggregator’s API
    • There is no on-chain footprint of the deals
    • There is no on-chain SLA if the data is lost

    The Data Aggregators Market project aims to solve this problem by creating a layer 2 storage market (deployed on an IPC subnet), where deal information between clients and data aggregators is recorded on-chain and can be bound to smart-contract guarantees.
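    The proof mentioned above lets a client check that its sub-piece landed in an aggregated deal without trusting the aggregator. As a rough illustration only (not the actual FRC construction, which uses Filecoin's piece-commitment trees and hash function), a Merkle-style inclusion proof can be sketched in Python; the hash choice and balanced tree shape here are simplifying assumptions:

```python
import hashlib

def h(data: bytes) -> bytes:
    # Stand-in hash; Filecoin piece commitments use a different construction.
    return hashlib.sha256(data).digest()

def merkle_root(leaves: list[bytes]) -> bytes:
    """Root of a balanced binary tree over the leaf hashes.

    Assumes a power-of-two number of leaves for simplicity.
    """
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def inclusion_proof(leaves: list[bytes], index: int) -> list[tuple[bytes, bool]]:
    """Sibling hashes from leaf to root; the bool means 'sibling is on the right'."""
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        sibling = index ^ 1
        proof.append((level[sibling], sibling > index))
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return proof

def verify(leaf: bytes, proof: list[tuple[bytes, bool]], root: bytes) -> bool:
    """Recompute the root from a leaf and its sibling path."""
    acc = h(leaf)
    for sibling, is_right in proof:
        acc = h(acc + sibling) if is_right else h(sibling + acc)
    return acc == root

# A client checks its piece was included in the aggregate.
pieces = [b"piece-a", b"piece-b", b"piece-c", b"piece-d"]
root = merkle_root(pieces)
proof = inclusion_proof(pieces, 2)
assert verify(b"piece-c", proof, root)
```

    In the real protocol the leaf would be the client's CommPc and the root the deal's CommPa; the point is that the aggregator hands the client a short proof rather than asking for trust.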

    Summary of goals:

    • Enable a competitive market for data aggregators
    • Standardize a protocol for storing data with data aggregators
    • Provide on-chain (L2) information about deals and an enforced SLA

    Summary of impact:

    • Allow SPs to get data from data aggregators with a standard protocol
    • Allow clients to store small data without centralized intermediaries

    FAQ

    What are “Deal Aggregators”?

    Deal Aggregators are participants in the Filecoin ecosystem who establish connections between Clients possessing data in chunks smaller than 32GiB and Storage Providers.

    What is an “agreement”?

    A “deal” on the L2 between a client and an aggregator is called an “agreement”.

    What are the interactions possible?

    • SPs make deals with multiple Deal Aggregators
    • Deal Aggregators make deals with multiple SPs
    • Clients make agreements with multiple Deal Aggregators
    • Deal Aggregators make agreements with multiple clients

    What is out of scope of this work?

    ‼️
    The Data Preparers’ market is outside the scope of L2 Deals due to technical constraints around verifying quality and work performed by Data Preparers.

    Model

    L1

    • “Storage Market” Actor:
      • Deals between an SP and an Aggregator or clients

    L2

    • Deal Aggregator’s Market
      • A functional market between Clients possessing data smaller than 32GiB and Aggregators
      • Agreement - between Clients and Aggregators
    • Data Market (out of scope right now)
      • A functional market matching Aggregators, Data Preparers, and Clients with prepared data against SPs
      • Deal - between a party possessing at least an X GB chunk of data and a Storage Provider.
      • Facilitate the discovery of both sides of the market:
        • SPs willing to accept data for pay
        • SPs willing to pay for FIL+ data
        • Aggregators with Data looking for SPs
        • Large Clients or Data Preparers looking for SPs willing to pay for FIL+ data
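    The two-level model above (L2 agreements between clients and aggregators, L1 deals between aggregators and SPs) can be summarized with simple records. The field names below are illustrative assumptions, not a schema from this document:

```python
from dataclasses import dataclass

@dataclass
class Agreement:
    """L2: a client entrusts a sub-32GiB piece to an aggregator."""
    client: str
    aggregator: str
    piece_cid: str      # CommPc: commitment to the client's piece
    piece_size: int     # bytes, must be below the 32 GiB sector size
    term_epochs: int    # how long the client wants the data stored

@dataclass
class Deal:
    """L1: an aggregator (or large client) stores a sector-sized piece with an SP."""
    provider: str
    aggregator: str
    aggregate_cid: str           # CommPa: commitment to the aggregated piece
    agreements: list[Agreement]  # the sub-pieces folded into this deal

GIB = 1 << 30
a = Agreement("f1client", "f1agg", "baga...c1", 4 * GIB, 180 * 2880)
assert a.piece_size < 32 * GIB
```

    The key asymmetry is that only the Deal touches the L1; the L2 market's job is to keep each Agreement bound to the Deal that eventually carries it.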

    Open Design Questions

    Role of deal aggregators

    Failure scenarios:

    • Data Transfer
    • Agreement is signed but not activated in a specific timeline
    • Deal is signed but not activated
    • Deal is terminated after activation (sector lost)
    • Sector Preparer
      • Responsibilities: Takes no responsibility. Will aggregate and onboard data via SPs on a best-effort basis. Does not store the Client’s data; the Client transfers straight to the SP, or the SP requests data from the Client.
      • Data Transfer: Finds an SP willing to take the data straight from the Client. The Client needs to be available for a period of time to hand the data over directly to the SP.
      • Agreement signed, but Deal not signed: No liability
      • Deal signed, but not activated: No liability
      • Deal terminated early, sector lost: No liability
    • Forwarder
      • Responsibilities: Takes no responsibility, but will store the Client’s data for some period of time to simplify the transfer to the SP.
      • Data Transfer: Takes prepared data from the Client, who can go offline at that point. The Aggregator transfers the data to the SP.
      • Agreement signed, but Deal not signed: No liability
      • Deal signed, but not activated: No liability
      • Deal terminated early, sector lost: No liability
    • Guarantor
      • Responsibilities: Orthogonal to the previous two; takes on liability to onboard data onto Filecoin. The liability ends when the data is onboarded into a sector.
      • Data Transfer: Can work in either of the two data transfer models, but is easier to execute with the Forwarder responsibility.
      • Agreement signed, but Deal not signed: Takes on the responsibility for attempting to onboard data with an SP. Penalised if the Deal is not executed within some time frame.
      • Deal signed, but not activated: Takes responsibility.
      • Deal terminated early, sector lost: No liability.
    • Repair
      • Responsibilities: Takes on responsibility from the signing of the agreement until the given term. Acts as a repairer, creating new deals if existing deals fail.
      • Data Transfer: Responsible for following up and executing repair tasks.
      • Agreement signed, but Deal not signed: Same as Guarantor.
      • Deal signed, but not activated: Same as Guarantor.
      • Deal terminated early, sector lost: Penalised if the data loses the given replication guarantee for an extended period.

    Open questions

    • What is the role of the Aggregator?
      • Should they take on liability if the file is lost before being stored?
      • Should they try to restore it?
      • Should they be absolved of all liability?
    • Should the aggregator store data temporarily, or just redirect it to the SP?
    • How flexible should the protocol be?
      • Should it be programmable (and hence support all of the roles), or should we support the minimum and let others build guarantees on top?
    • Should the Client be able to limit the set of SPs?
    • Should the Client be able to limit the set of Aggregators?

    Data aggregator roles:

    • Sector Preparer role (doesn’t store, is not liable); essentially a matchmaking service
    • Forwarder role (stores, is not liable)
    • Guarantor role (could store, but is liable for data loss)
    • Repair role?

    TODO:
    • Complete this list to share with others
    • Reach out to the Client ICU folks/Juan for a first review
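    The liability matrix above can be encoded directly; the role and scenario names follow the table, and this is just a machine-readable restatement of it, not part of any protocol:

```python
from enum import Enum, auto

class Scenario(Enum):
    AGREEMENT_SIGNED_DEAL_NOT = auto()  # agreement signed, but deal not signed
    DEAL_SIGNED_NOT_ACTIVATED = auto()  # deal signed, but not activated
    DEAL_TERMINATED_EARLY = auto()      # deal terminated after activation, sector lost

# True = the aggregator is liable in that failure scenario.
LIABILITY = {
    "sector_preparer": {s: False for s in Scenario},
    "forwarder":       {s: False for s in Scenario},
    "guarantor": {
        Scenario.AGREEMENT_SIGNED_DEAL_NOT: True,
        Scenario.DEAL_SIGNED_NOT_ACTIVATED: True,
        Scenario.DEAL_TERMINATED_EARLY: False,
    },
    "repair": {s: True for s in Scenario},  # liable through the agreed term
}

assert not LIABILITY["forwarder"][Scenario.DEAL_TERMINATED_EARLY]
assert LIABILITY["repair"][Scenario.DEAL_TERMINATED_EARLY]
```

    Written this way, the open design question ("programmable vs. minimum") becomes whether this table is fixed by the protocol or supplied per-agreement.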

    Layer 2 query to Layer 1

    • L2 needs to know that an L1 Deal was made
      • It is a core feature of the L2 to read data from the L1
      • Chat with the consensus folks about how to do this

    Data Market Protocol

    ⚠️
    This is a WIP protocol specification

    Stage 1: Data preparation

    • Client prepares the file

    Stage 2: Agreement making

    • The Client posts the agreementInfo on the L2 Market (and can specify aggregators)
    • An Aggregator bids on the agreement on the L2 Market
    • The Aggregator facilitates a transfer from the Client to the SP and verifies that the data matches the agreement
    • The Aggregator confirms the agreement on the L2 Market
      • Possibly: from this point, the Aggregator is held liable for onboarding the data (the deal is either onboarded on the L1 or the Aggregator is liable?)
    • The Aggregator continues to collect files from other users (ideally it doesn’t receive the full file, but redirects it to the SP)
    • The Aggregator collects multiple agreements into a deal via the L1 Market
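    The agreement-making steps above amount to a small state machine for each agreement on the L2. A minimal sketch follows; the state names are assumptions chosen to mirror the bullet list, not protocol identifiers:

```python
from enum import Enum, auto

class AgreementState(Enum):
    PROPOSED = auto()    # client posted agreementInfo on the L2 market
    BID = auto()         # an aggregator bid on the agreement
    CONFIRMED = auto()   # aggregator verified the transferred data and confirmed
    AGGREGATED = auto()  # folded into an L1 deal by the aggregator

# Allowed transitions for an agreement on the L2 market.
TRANSITIONS = {
    AgreementState.PROPOSED: {AgreementState.BID},
    AgreementState.BID: {AgreementState.CONFIRMED},
    AgreementState.CONFIRMED: {AgreementState.AGGREGATED},
    AgreementState.AGGREGATED: set(),  # terminal for this stage
}

def step(state: AgreementState, nxt: AgreementState) -> AgreementState:
    """Advance an agreement, rejecting transitions the protocol doesn't allow."""
    if nxt not in TRANSITIONS[state]:
        raise ValueError(f"illegal transition {state} -> {nxt}")
    return nxt

s = AgreementState.PROPOSED
for nxt in (AgreementState.BID, AgreementState.CONFIRMED, AgreementState.AGGREGATED):
    s = step(s, nxt)
assert s is AgreementState.AGGREGATED
```

    The "possibly liable" question in the list maps to whether liability attaches at the CONFIRMED transition or only at AGGREGATED.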

    Stage 3: Agreement activation

    • The SP onboards a sector and, at ProveCommit, the deal is activated on the L1 Market
    • The Aggregator confirms the agreements collected in a deal on the L2
      • The commitment of the deal is checked on chain
      • An oracle/mechanism from the L1 shows the deal’s activation/activity status, including its CommP, size, and deal terms
      • The Aggregator shows the inclusion of CommPc (client) in CommPa (aggregate); the deal terms are verified to be consistent with the agreements
      • If all checks succeed, the agreement is activated
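    The Stage 3 confirmation checks can be sketched as one verification routine on the L2 side. The oracle interface and field names here are assumptions for illustration; the real mechanism for reading L1 state is itself an open question in this document:

```python
def confirm_agreement(agreement: dict, deal_id: int, l1_oracle, inclusion_ok: bool) -> bool:
    """L2-side check that an agreement's piece made it into an active L1 deal.

    l1_oracle: callable returning (active, comm_pa, size, term_epochs) for a deal id
    inclusion_ok: result of verifying the CommPc-in-CommPa inclusion proof
    """
    active, comm_pa, size, term_epochs = l1_oracle(deal_id)
    if not active:
        return False          # deal not (yet) activated on the L1
    if not inclusion_ok:
        return False          # client piece not proven inside CommPa
    if term_epochs < agreement["term_epochs"]:
        return False          # deal terms weaker than the agreement's terms
    return True               # all checks passed: activate the agreement

# Toy oracle standing in for the L1 activity-status lookup.
oracle = lambda deal_id: (True, "baga...agg", 32 << 30, 520_000)
assert confirm_agreement({"term_epochs": 500_000}, 7, oracle, inclusion_ok=True)
assert not confirm_agreement({"term_epochs": 600_000}, 7, oracle, inclusion_ok=True)
```

    The term-consistency check is the reason the oracle must expose deal terms and not just an activated/not-activated bit.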

    Failures: TBD

    • If an agreement is not onboarded on the L1 and shown on the L2 within a given period, the aggregator pays a penalty/is slashed/gets a reputation hit.
    • If file is lost:

    The aggregator fills the liability gap between the Client and the SP, during which the file is not yet confirmed on the L1.
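    The penalty path above could be settled by a simple collateral rule. Everything in this sketch is assumed: the deadline, the collateral amount, and the idea that the penalty is paid out to the client as a refund are design choices this document has not yet made:

```python
def settle(agreement_epoch: int, current_epoch: int, onboarded: bool,
           deadline_epochs: int = 2880 * 7, collateral: int = 100) -> tuple[int, int]:
    """Return (aggregator_penalty, client_refund) for one agreement.

    Assumes the aggregator posted `collateral` when confirming the agreement,
    and that an unmet deadline slashes it in favor of the client.
    """
    if onboarded:
        return 0, 0                    # happy path: collateral is released
    if current_epoch - agreement_epoch < deadline_epochs:
        return 0, 0                    # still within the onboarding window
    return collateral, collateral      # deadline missed: slash and refund

assert settle(0, 2880 * 8, onboarded=False) == (100, 100)
assert settle(0, 2880 * 3, onboarded=False) == (0, 0)
```

    Whether the slashed amount goes to the client, is burned, or feeds a reputation score is exactly the "penalty/slashed/reputation hit" fork left open above.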

    Work plan

    ✅ done 🔵 in progress

    Design Phase (Kuba 80%, Nicola 20%)

    • Step 0: FRC for verifiable sub pieces ✅ (April 2023)
    • Step 1: Gather requirements for data aggregators and capabilities of IPC (May 2023) 🔵
    • Step 2: Early Protocol design for review (PAUSED — mid June 2023)

    Uncertain projections:

    Implementation Phase (1 Solidity Engineer for IPC side 100%, 1 PM 50%, Kuba on Filecoin facing side 50%, Nicola 20%, Audits from Alex, Irene and FVM team for 2 weeks)

    • Step 3: Assemble a team for:
      • Smart contract development (August 2023)
      • API and software for SPs, clients, and data aggregators (August 2023)
    • Step 4: Testnet version (September 2023)
      • Testing
      • Protocol Audits
      • Smart Contract audits
    • Step 5: Mainnet version (October 2023)

    Notes

    2023-05-09

    • How does L1 know that a deal has been made?
      • L1 doesn’t need to know
    • How do we protect against malicious actors in the market?
    • Do we want order books?
      • At least an Aggregator service bulletin would be super useful
    • Do agreements guarantee storage over time, repair?
      • Depends on service level.

    2023-05-16

    • Juan values data preparation nodes
      • Common assumption Data Preparation is hard
      • what if we made it super simple?
        • Data stored flat in deal
        • Consistent IPFS hashing
        • The tree is not part of deal data but rebuilt by SP.
      • what if we made it super simple by removing IPFS from the inside?
        • If the CARification was removed then data preparation simplifies to chunking and hashing.
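    The "super simple" preparation discussed above, with CARification removed, really does reduce to fixed-size chunking plus hashing. A minimal sketch, where the chunk size is an arbitrary illustrative choice:

```python
import hashlib

CHUNK = 1 << 20  # 1 MiB; arbitrary chunk size chosen for illustration

def prepare(data: bytes) -> list[tuple[str, int]]:
    """Split data into fixed-size chunks and record (hash, length) per chunk."""
    chunks = [data[i:i + CHUNK] for i in range(0, len(data), CHUNK)]
    return [(hashlib.sha256(c).hexdigest(), len(c)) for c in chunks]

# The manifest is all an SP needs to rebuild the tree on its side.
manifest = prepare(b"x" * (2 * CHUNK + 5))
assert len(manifest) == 3
assert manifest[-1][1] == 5  # trailing partial chunk
```

    This is the sense in which the tree need not travel with the deal data: given consistent chunking and hashing rules, the SP can rebuild it from the flat bytes.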
    Data flow diagram
    • Calls:
      • Marina re Client Onboarding
      • Reached out
      • Lauren (OuterCore) re - Whole data flow
        • On Friday
      • Anjor re Analysis of Clients and their behaviour (do they want)
        1. Reached out
        2. Works mostly on big data, do we still want to meet?
      • Try to schedule this week
    Channel on Filecoin Slack: #deal-aggregators-market
    CL on information transfer from L1 to L2
    • Filecoin miners might drop deals due to unprofitability. How do we save them?
      • Are these FIL+ deals? Yes
    • It doesn’t have to be on IPC
      • Are there other L2s that work on Filecoin?
      • Custom permission L2?

    2023-05-18

    Aggregation call with FVM

    • IPC timeline for deployment on mainnet for v1 is October
    • SpaceNet can be used for demos/dev right now but is WIP
    • Accessing L1 data and state in the L2
      • All nodes in an IPC subnet are assumed to be able to access the state of the layer above; exposing a “read-only method call” looks like an option.

    WIP Diagram

    image

    Call Anjor - @2023/05/24

    • What are the hurdles for large data clients?
      • SPs don’t understand data preparation as it relates to retrieval
      • Client-Owned Preparation
        • works, but needs a lot of know-how on the part of the client
      • Standalone Data Preparer
        • Bacalhau as data preparer
      • Another big problem: Finding SPs
        • IA uses Spade to find SPs
        • Delta has a huge list of active SPs
        • Geofencing
        • Assumption: All SPs always want verified deals
          • Not always true, due to high gas fees
        • 30TiB/d data prep from IA
          • IA preps data faster than SPs are accepting
          • It goes through Spade, which is still onboarding SP.
          • Gas prices for PSD and Activation are an issue.
        • Spade is the client side of Market for finding SPs
          • Client posts that they have data, SPs query if there is some data for them.
          • Spade doesn’t handle Client payments
          • Centralized Broker
          • @caro is working on a reputation DAO for SPs
            • They don’t define what the reputation is, Client defines based on metrics they provide.
    • go-fil-dataprep
    • Solana
      • No data prep as we think about it; they have an IPLD schema, so it goes directly to a CAR file
      • The large 800GiB CAR file gets split into smaller ones.
    • “The larger ecosystem does not understand how data preparation and retrieval is connected.”

    Call Marina - @2023/05/25

    • Solving small deals
    • Do large data clients go straight to SP
      • Usually they want to have an IRL contract with the SP
      • Relationship between SP and IA
    • let’s think about large data
      • Codifying contracts between Lead SPs and replicating SPs
      • There is a Lead SP which communicates with Client
      • The Lead SP
      • Codifying agreements between Lead SP and Replication SPs !!
      • Broker dashboards based on L2 data

    1:1 Nicola - @May 30, 2023

    Calls with SPs
    Schedule time with Nicola

    @June 22, 2023

    Spade/web3.storage would like a contract where their verifiable capabilities (based on structured signed tokens) enable a broker to allocate data cap.

    CryptoNet is a Protocol Labs initiative.