Deal Market
- Deal Market
- Introduction
- FAQ
- Model
- L1
- L2
- Open Design Questions
- Role of deal aggregators
- Layer 2 query to Layer 1
- Data Market Protocol
- Work plan
- Notes
- WIP Diagram
- Call Anjor - @2023/05/24
- Call Marina - @2023/05/25
Introduction
A large fraction of users onboard data onto the Filecoin network via a data aggregator, a network participant that looks for clients with FIL+ data and bundles it into a single, sector-sized deal for the SP.
Another significant segment of users onboards large datasets onto Filecoin using a system of a Lead SP and Replication SPs. The Lead SP receives data from the client and manages its replication.
Both of these processes display a fundamental failure of the Storage Market system within Filecoin: the primary deal-making flow happens off-chain, through trusted intermediaries.
Thanks to FRCXXX, users do not have to trust aggregators to correctly add their data to a larger deal; instead, they can obtain a proof that their data was correctly stored in a deal in the Filecoin Storage Market.
Several data aggregators (such as Estuary) support this new FRC; however:
- Clients now need to integrate with each data aggregator’s API
- There is no on-chain footprint of the deals
- There is no on-chain SLA if the data is lost
The Data Aggregators Market project aims to solve this problem by creating a layer 2 storage market (deployed on an IPC subnet), where deal information between clients and data aggregators is on-chain and can be bound to smart contract guarantees.
Summary of goals:
- Enable a competitive market for data aggregators
- Standardize a protocol for storing with data aggregators
- Provide on-chain (L2) information about the deal and an enforced SLA
Summary of impact:
- Allow SPs to get data from data aggregators with a standard protocol
- Allow clients to store small data without centralized intermediaries
FAQ
What are “Deal Aggregators”?
Deal Aggregators are participants in the Filecoin ecosystem who connect Clients possessing data in chunks smaller than 32GiB with Storage Providers.
What is an “agreement”?
A “deal” on the L2 between a client and an aggregator is called an “agreement”.
What are the interactions possible?
- SPs make deals with multiple Deal Aggregators
- Deal Aggregators make deals with multiple SPs
- Clients make agreements with multiple Deal Aggregators
- Deal Aggregators make agreements with multiple clients
What is out of scope of this work?
The L1-facing Data Market between Aggregators/Data Preparers and SPs; see the Model section below.
Model
L1
- “Storage Market” Actor:
- Deals between an SP and an Aggregator or clients
L2
- Deal Aggregator’s Market
- A functional market between Clients possessing data smaller than 32GiB and Aggregators
- Agreement - between Clients and Aggregators
- Data Market (out of scope right now)
- A functional market between SPs and parties holding prepared data: Aggregators, Data Preparers, and Clients
- Deal - between a party possessing at least an X GB chunk of data and a Storage Provider (see the data-model sketch after this list)
- Facilitate the discovery of both sides of the market:
- SPs willing to accept data for pay
- SPs willing to pay for FIL+ data
- Aggregators with Data looking for SPs
- Large Clients or Data Preparers looking for SPs willing to pay for FIL+ data
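As a concrete anchor for the model above, here is a minimal sketch of the two market objects in Go. All type, field, and unit names are hypothetical illustrations for this design doc, not an existing Filecoin API.

```go
// Sketch of the two market objects described above. All names and
// fields are hypothetical, not part of any existing Filecoin API.
package main

import "fmt"

// Agreement lives on the L2: a client asks an aggregator to onboard a
// piece of data smaller than a sector (32GiB).
type Agreement struct {
	Client     string // client address on the L2
	Aggregator string // aggregator address on the L2
	CommPc     []byte // piece commitment of the client's data
	Size       uint64 // padded piece size in bytes
	Term       uint64 // requested storage duration in epochs
	Price      uint64 // price per epoch (illustrative unit)
}

// Deal lives on the L1: an aggregator (or large client) hands a
// sector-sized aggregate to a Storage Provider.
type Deal struct {
	Provider   string // SP address on the L1
	Aggregator string
	CommPa     []byte   // piece commitment of the aggregate
	Size       uint64   // aggregate size, up to a full sector
	Agreements []uint64 // L2 agreement IDs folded into this deal
}

func main() {
	a := Agreement{Client: "t1client", Aggregator: "t1agg", Size: 1 << 20}
	fmt.Printf("agreement: %+v\n", a)
}
```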
Open Design Questions
Role of deal aggregators
Failure scenarios:
- Data Transfer
- Agreement is signed but not activated in a specific timeline
- Deal is signed but not activated
- Deal is terminated after activation (sector lost)
| Role | Responsibilities | Data Transfer | Agreement signed, but Deal not signed | Deal signed, but not activated | Deal terminated early, sector lost |
| --- | --- | --- | --- | --- | --- |
| Sector Preparer | Takes no responsibility. Will aggregate and onboard data via SPs on a best-effort basis. Does not store the Client’s data; the Client transfers straight to the SP, or the SP requests data from the Client. | Finds an SP willing to take the data straight from the Client. The Client needs to be available for a period of time to hand the data over directly to the SP. | No liability | No liability | No liability |
| Forwarder | Takes no responsibility, but will store the Client’s data for some period of time to simplify the transfer to the SP. | Takes prepared data from the Client. The Client can go offline at that point. The Aggregator will transfer the data to the SP. | No liability | No liability | No liability |
| Guarantor | Orthogonal to the previous two: takes on liability to onboard data onto Filecoin. The liability ends when the data is onboarded into a sector. | Can work in either of the two data transfer models, but is easier to execute with the Forwarder responsibility. | Takes on the responsibility for attempting to onboard the data with an SP. Penalised if the Deal is not executed within some time frame. | Takes responsibility. | No liability |
| Repair | Takes on responsibility from the signing of the agreement until the given term. Will act as a repairer, creating new deals if existing deals fail. | Responsible for following up and executing repair tasks. | Same as above. | Same as above. | Penalised if the data loses its replication guarantee for an extended period. |
Open questions
Data aggregator roles (encoded in the sketch below):
- Sector Preparer role (doesn’t store, is not liable); essentially a matchmaking service
- Forwarder role (stores, is not liable)
- Liability role (could store, but it is liable for data loss)
- Repair role?
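To make the role/liability matrix above concrete, here is a hypothetical encoding in Go; the role names follow the table, and the liability mapping is a direct transcription of it.

```go
// Hypothetical encoding of the four aggregator roles from the table
// above, mapping each role to the failure stages it is liable for.
package main

import "fmt"

type FailureStage int

const (
	AgreementNotDealt FailureStage = iota // agreement signed, deal never signed
	DealNotActivated                      // deal signed, never activated
	SectorLost                            // deal terminated early, sector lost
)

type Role string

const (
	SectorPreparer Role = "sector-preparer" // matchmaker, no storage, no liability
	Forwarder      Role = "forwarder"       // stores temporarily, no liability
	Guarantor      Role = "guarantor"       // liable until data is onboarded
	Repairer       Role = "repairer"        // liable for the whole agreed term
)

// liable reports whether a role carries liability at a given failure
// stage, per the responsibility table above.
func liable(r Role, s FailureStage) bool {
	switch r {
	case Guarantor:
		return s == AgreementNotDealt || s == DealNotActivated
	case Repairer:
		return true // includes repairing lost sectors
	default:
		return false
	}
}

func main() {
	fmt.Println(liable(Guarantor, SectorLost)) // false: liability ends at onboarding
}
```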
Layer 2 query to Layer 1
- L2 needs to know that L1 Deal was made
- It is a core feature of the L2 to read data from the L1 (query surface sketched below)
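A minimal sketch of the query surface this implies, assuming a pull-based oracle; the interface, its method name, and the `DealStatus` fields are hypothetical, though the fields mirror the CommP/size/terms listed in Stage 3 of the protocol below.

```go
// Sketch of the read-only view the L2 market needs into L1 deal state.
// The interface and its method names are hypothetical; on IPC this
// could be backed by the "read-only method call" into the parent
// subnet's state discussed in the notes further down.
package main

import "fmt"

// DealStatus mirrors the L1 facts the L2 must learn: activation and terms.
type DealStatus struct {
	Activated  bool
	CommPa     []byte // aggregate piece commitment of the L1 deal
	Size       uint64
	StartEpoch int64
	EndEpoch   int64
}

// L1Oracle is the query surface: given an L1 deal ID, report its status.
type L1Oracle interface {
	DealStatus(dealID uint64) (DealStatus, error)
}

// stubOracle is a stand-in implementation for tests and demos.
type stubOracle map[uint64]DealStatus

func (o stubOracle) DealStatus(id uint64) (DealStatus, error) {
	s, ok := o[id]
	if !ok {
		return DealStatus{}, fmt.Errorf("deal %d not found on L1", id)
	}
	return s, nil
}

func main() {
	o := stubOracle{42: {Activated: true, Size: 32 << 30}}
	s, _ := o.DealStatus(42)
	fmt.Printf("deal 42 activated=%v size=%d\n", s.Activated, s.Size)
}
```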
Data Market Protocol
Stage 1: Data preparation
- Client prepares the file
Stage 2: Agreement making
- Client publishes the agreementInfo and supporting info on the L2 Market (can specify aggregators)
- Aggregator bids on the agreement on L2 Market
- Aggregator facilitates a transfer from Client to SP, verifies the data matches the agreement.
- Aggregator confirms the agreement on the L2 Market.
- Possibly: from this point on, the aggregator is held liable to onboard the data (the deal is either onboarded on the L1 or the aggregator is liable?)
- Aggregator continues to collect files from other users (ideally it does not receive the full file but redirects the transfer to the SP)
- Aggregator collects multiple agreements into a deal via the L1 Market (lifecycle sketched below)
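The lifecycle implied by Stages 2-3 can be summarized as a small state machine. This is one hypothetical reading of the flow; the state names and transitions are illustrative, not a final design.

```go
// Sketch of the agreement lifecycle implied by Stages 2-3; states and
// transitions are a hypothetical reading of the flow above.
package main

import "fmt"

type State int

const (
	Proposed  State = iota // client published agreementInfo on the L2
	Bid                    // an aggregator bid on the agreement
	Confirmed              // aggregator verified the data and confirmed on L2
	Activated              // matching L1 deal proven active, inclusion checked
	Defaulted              // not onboarded in time: penalty / slashing
)

// transitions lists the legal next states for each state.
var transitions = map[State][]State{
	Proposed:  {Bid},
	Bid:       {Confirmed, Defaulted},
	Confirmed: {Activated, Defaulted},
}

func step(from, to State) (State, error) {
	for _, s := range transitions[from] {
		if s == to {
			return to, nil
		}
	}
	return from, fmt.Errorf("illegal transition %d -> %d", from, to)
}

func main() {
	s := Proposed
	for _, next := range []State{Bid, Confirmed, Activated} {
		var err error
		if s, err = step(s, next); err != nil {
			panic(err)
		}
	}
	fmt.Println("agreement reached state:", s)
}
```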
Stage 3: Agreement activation
- The SP onboards a sector; at ProveCommit, the deal is activated on the L1 Market
- Aggregator confirms the agreements collected in a deal in L2
- The deal’s commitment is checked on-chain
- An oracle (or another mechanism from the L1) shows that the deal is activated and reports its activity status, including its CommP, size, and deal terms.
- Aggregator shows the inclusion of CommPc (client) in CommPa (aggregate); the deal terms are verified to be consistent with the agreements.
- If all checks succeed, the agreement is considered activated (inclusion check sketched below)
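The inclusion check in the last steps is, in essence, a Merkle inclusion proof. The real FRC proof operates over Filecoin piece commitments (a sha2-256-based binary tree with truncated leaves and padded data); the sketch below uses plain sha256 only to show the shape of the verification.

```go
// Minimal sketch of the CommPc-in-CommPa check: a standard binary
// Merkle inclusion proof. Plain sha256 stands in for the piece
// commitment hash used by the actual FRC.
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
)

// ProofNode is one sibling hash plus its side relative to the path.
type ProofNode struct {
	Sibling []byte
	Left    bool // true if the sibling is the left child
}

// verifyInclusion folds the leaf (CommPc) up the path and compares the
// result against the root (CommPa).
func verifyInclusion(leaf, root []byte, path []ProofNode) bool {
	cur := leaf
	for _, n := range path {
		var h [32]byte
		if n.Left {
			h = sha256.Sum256(append(append([]byte{}, n.Sibling...), cur...))
		} else {
			h = sha256.Sum256(append(append([]byte{}, cur...), n.Sibling...))
		}
		cur = h[:]
	}
	return bytes.Equal(cur, root)
}

func main() {
	// Two-leaf toy tree: root = H(leafA || leafB).
	a := sha256.Sum256([]byte("client piece"))
	b := sha256.Sum256([]byte("other piece"))
	root := sha256.Sum256(append(a[:], b[:]...))
	ok := verifyInclusion(a[:], root[:], []ProofNode{{Sibling: b[:], Left: false}})
	fmt.Println("CommPc included in CommPa:", ok)
}
```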
Failures: TBD
- If the agreement is not onboarded on the L1 and proven on the L2 within a given period, the aggregator pays a penalty / is slashed / takes a reputation hit.
- If the file is lost:
The aggregator fills the liability gap between the Client and the SP: the window during which the file is not yet confirmed on the L1.
Work plan
✅ done 🔵 in progress
Design Phase (Kuba 80%, Nicola 20%)
- Step 0: FRC for verifiable sub pieces ✅ (April 2023)
- Step 1: Gather requirements for data aggregators and capabilities of IPC (May 2023) 🔵
- Step 2: Early Protocol design for review (PAUSED — mid June 2023)
Uncertain projections:
Implementation Phase (1 Solidity Engineer for IPC side 100%, 1 PM 50%, Kuba on Filecoin facing side 50%, Nicola 20%, Audits from Alex, Irene and FVM team for 2 weeks)
- Step 3: Assemble a team for:
- Smart contract development (August 2023)
- API and software for SPs, clients, and data aggregators (August 2023)
- Step 4: Testnet version (September 2023)
- Testing
- Protocol Audits
- Smart Contract audits
- Step 5: Mainnet version (October 2023)
Notes
2023-05-09
- How does L1 know that a deal has been made?
- L1 doesn’t need to know
- How do we protect against malicious actors in the market?
- Do we want order books?
- At least an Aggregator service bulletin would be super useful
- Do agreements guarantee storage over time, repair?
- Depends on service level.
2023-05-16
- Juan values data preparation nodes
- Common assumption Data Preparation is hard
- what if we made it super simple?
- Data stored flat in deal
- Consistent IPFS hashing
- The tree is not part of deal data but rebuilt by SP.
- what if we made it super simple by removing IPFS from the inside?
- If CARification were removed, data preparation would simplify to chunking and hashing (sketched below).
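As a toy illustration of how simple that could get, the sketch below splits a byte stream into fixed-size chunks and hashes each one. Real Filecoin data prep computes piece commitments (CommP) over padded data; both sha256 and the 1 MiB chunk size here are illustrative stand-ins.

```go
// Toy illustration of "chunking and hashing" data preparation: split a
// byte stream into fixed-size chunks and hash each one.
package main

import (
	"crypto/sha256"
	"fmt"
	"io"
	"strings"
)

const chunkSize = 1 << 20 // 1 MiB, illustrative

// chunkAndHash reads the input in fixed-size chunks and returns one
// digest per chunk.
func chunkAndHash(r io.Reader) ([][32]byte, error) {
	var digests [][32]byte
	buf := make([]byte, chunkSize)
	for {
		n, err := io.ReadFull(r, buf)
		if n > 0 {
			digests = append(digests, sha256.Sum256(buf[:n]))
		}
		if err == io.EOF || err == io.ErrUnexpectedEOF {
			return digests, nil
		}
		if err != nil {
			return nil, err
		}
	}
}

func main() {
	d, err := chunkAndHash(strings.NewReader("flat data, no CAR wrapping"))
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d chunk(s), first digest %x\n", len(d), d[0][:8])
}
```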
- Calls:
- Marina re Client Onboarding
- Lauren (OuterCore) re - Whole data flow
- On Friday
- Anjor re Analysis of Clients and their behaviour (do they want)
- Works mostly on big data, do we still want to meet?
- Try to schedule this week
- Filecoin Miners might drop deals due to unprofitability. How do we save them?
- Are these FIL+ deals? Yes
- It doesn’t have to be on IPC
- Are there other L2s that work on Filecoin?
- Custom permission L2?
2023-05-18
Aggregation call with FVM
- IPC timeline for deployment on mainnet for v1 is October
- SpaceNet can be used for demos/dev right now but is WIP
- Accessing L1 data and state in the L2
- All nodes in an IPC subnet are assumed to be able to access the state of the layer above; exposing a “read-only method call” looks to be an option.
WIP Diagram
Call Anjor - @2023/05/24
- What are the hurdles for large data clients?
- SPs don’t understand how data preparation relates to retrieval
- Client-Owned Preparation
- works but needs a lot of know-how on the part of the client
- Standalone Data Preparer
- Bacalhau as data preparer
- Another big problem: Finding SPs
- IA uses Spade to find SPs
- Delta has a huge list of active SPs
- Geofencing
- Assumption: All SPs always want verified deals
- Not always true, due to high gas fees
- 30TiB/d data prep from IA
- IA preps data faster than SPs are accepting
- It goes through Spade, which is still onboarding SPs.
- Gas prices for PSD and Activation are an issue.
- Spade is the client side of Market for finding SPs
- Client posts that they have data, SPs query if there is some data for them.
- Spade doesn’t handle Client payments
- Centralized Broker
- @caro is working on a reputation DAO for SPs
- They don’t define what the reputation is; the Client defines it based on the metrics they provide.
- go-fil-dataprep
- Solana
- No data prep as we think about it; they have an IPLD schema, so it goes directly to a CAR file
- The large 800GiB CAR file gets split into smaller ones.
- “The larger ecosystem does not understand how data preparation and retrieval is connected.”
Call Marina - @2023/05/25
- Solving small deals
- Do large data clients go straight to the SP?
- Usually they want to have an IRL contract with the SP
- Relationship between SP and IA
- let’s think about large data
- Codifying contracts between Lead SPs and replicating SPs
- There is a Lead SP which communicates with the Client
- The Lead SP
- Codifying agreements between Lead SP and Replication SPs !!
- Broker dashboards based on L2 data
1:1 Nicola - @May 30, 2023
@June 22, 2023
Spade/web3.storage would like a contract where their verifiable capabilities (based on structured signed tokens) enable a broker to allocate DataCap.