Deal Market
- Deal Market
- Introduction
- FAQ
- Model
- L1
- L2
- Open Design Questions
- Role of deal aggregators
- Layer 2 query to Layer 1
- Data Market Protocol
- Work plan
- Notes
- WIP Diagram
- Call Anjor - @2023/05/24
- Call Marina - @2023/05/25
Introduction
A large fraction of users onboard data onto the Filecoin network via a data aggregator, a network participant that looks for clients with FIL+ data and bundles it into a single, sector-sized deal for the SP.
Another significant segment of users onboards large datasets onto Filecoin using a system of a Lead SP and Replication SPs. The Lead SP receives data from the client and manages its replication.
Both of these processes display a fundamental failure of the Storage Market system within Filecoin: the primary deal-making flow happens off-chain, through trusted intermediaries.
Thanks to FRCXXX, users do not have to trust aggregators to correctly add their data to a larger deal; instead, they can obtain a proof that their data was correctly stored in a deal in the Filecoin Storage Market.
Several data aggregators (such as Estuary) support this new FRC; however:
- Clients now need to integrate with each data aggregator’s API
- There is no on-chain footprint of the deals
- There is no on-chain SLA if the data is lost
The Data Aggregators Market project aims to solve this problem by creating a layer 2 storage market (deployed on an IPC subnet), where deal information between clients and data aggregators is on-chain and can be bound to smart contract guarantees.
Summary of goals:
- Enable a competitive market for data aggregators
- Standardize a protocol for storing with data aggregators
- Provide on-chain (L2) information about the deal and an enforced SLA
Summary of impact:
- Allow SPs to get data from data aggregators with a standard protocol
- Allow clients to store small data without centralized intermediaries
FAQ
What are “Deal Aggregators”?
Deal Aggregators are participants in the Filecoin ecosystem who connect Clients possessing data in chunks smaller than 32GiB with Storage Providers.
What is an “agreement”?
A “deal” on the L2 between a client and an aggregator is called an “agreement”.
What are the interactions possible?
- SPs make deals with multiple Deal Aggregators
- Deal Aggregators make deals with multiple SPs
- Clients make agreements with multiple Deal Aggregators
- Deal Aggregators make agreements with multiple clients
What is out of scope of this work?
The L1-facing Data Market between Aggregators/Data Preparers and SPs; see the Model section below.
Model
L1
- “Storage Market” Actor:
- Deals between an SP and an Aggregator or clients
L2
- Deal Aggregator’s Market
- A functional market between Clients possessing data smaller than 32GiB and Aggregators
- Agreement - between Clients and Aggregators
- Data Market (out of scope right now)
- A functional market between SPs and parties holding prepared data: Aggregators, Data Preparers, and Clients
- Deal - between a party possessing at least an X GB chunk of data and a Storage Provider (see the data-model sketch after this list)
- Facilitate the discovery of both sides of the market:
- SPs willing to accept data for pay
- SPs willing to pay for FIL+ data
- Aggregators with Data looking for SPs
- Large Clients or Data Preparers looking for SPs willing to pay for FIL+ data
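As a concrete anchor for the model above, here is a minimal sketch of the two market objects in Go. All type, field, and unit names are hypothetical illustrations for this design doc, not an existing Filecoin API.

```go
// Sketch of the two market objects described above. All names and
// fields are hypothetical, not part of any existing Filecoin API.
package main

import "fmt"

// Agreement lives on the L2: a client asks an aggregator to onboard a
// piece of data smaller than a sector (32GiB).
type Agreement struct {
	Client     string // client address on the L2
	Aggregator string // aggregator address on the L2
	CommPc     []byte // piece commitment of the client's data
	Size       uint64 // padded piece size in bytes
	Term       uint64 // requested storage duration in epochs
	Price      uint64 // price per epoch (illustrative unit)
}

// Deal lives on the L1: an aggregator (or large client) hands a
// sector-sized aggregate to a Storage Provider.
type Deal struct {
	Provider   string // SP address on the L1
	Aggregator string
	CommPa     []byte   // piece commitment of the aggregate
	Size       uint64   // aggregate size, up to a full sector
	Agreements []uint64 // L2 agreement IDs folded into this deal
}

func main() {
	a := Agreement{Client: "t1client", Aggregator: "t1agg", Size: 1 << 20}
	fmt.Printf("agreement: %+v\n", a)
}
```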
Open Design Questions
Role of deal aggregators
Failure scenarios:
- Data Transfer
- Agreement is signed but not activated in a specific timeline
- Deal is signed but not activated
- Deal is terminated after activation (sector lost)
| Role | Responsibilities | Data Transfer | Agreement signed, but Deal not signed | Deal signed, but not activated | Deal terminated early, sector lost |
| --- | --- | --- | --- | --- | --- |
| Sector Preparer | Takes no responsibility. Will aggregate and onboard data via SPs on a best-effort basis. Does not store the Client’s data; the Client transfers straight to the SP, or the SP requests data from the Client. | Finds an SP willing to take the data straight from the Client. The Client needs to be available for a period of time to hand the data over directly to the SP. | No liability | No liability | No liability |
| Forwarder | Takes no responsibility, but will store the Client’s data for some period of time to simplify the transfer to the SP. | Takes prepared data from the Client. The Client can go offline at that point. The Aggregator will transfer the data to the SP. | No liability | No liability | No liability |
| Guarantor | Orthogonal to the previous two: takes on liability to onboard data onto Filecoin. The liability ends when the data is onboarded into a sector. | Can work in either of the two data transfer models, but is easier to execute with the Forwarder responsibility. | Takes on the responsibility for attempting to onboard the data with an SP. Penalised if the Deal is not executed within some time frame. | Takes responsibility. | No liability |
| Repair | Takes on responsibility from the signing of the agreement until the given term. Will act as a repairer, creating new deals if existing deals fail. | Responsible for following up and executing repair tasks. | Same as above. | Same as above. | Penalised if the data loses its replication guarantee for an extended period. |
Open questions
Data aggregator roles (encoded in the sketch below):
- Sector Preparer role (doesn’t store, is not liable); essentially a matchmaking service
- Forwarder role (stores, is not liable)
- Liability role (could store, but it is liable for data loss)
- Repair role?
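To make the role/liability matrix above concrete, here is a hypothetical encoding in Go; the role names follow the table, and the liability mapping is a direct transcription of it.

```go
// Hypothetical encoding of the four aggregator roles from the table
// above, mapping each role to the failure stages it is liable for.
package main

import "fmt"

type FailureStage int

const (
	AgreementNotDealt FailureStage = iota // agreement signed, deal never signed
	DealNotActivated                      // deal signed, never activated
	SectorLost                            // deal terminated early, sector lost
)

type Role string

const (
	SectorPreparer Role = "sector-preparer" // matchmaker, no storage, no liability
	Forwarder      Role = "forwarder"       // stores temporarily, no liability
	Guarantor      Role = "guarantor"       // liable until data is onboarded
	Repairer       Role = "repairer"        // liable for the whole agreed term
)

// liable reports whether a role carries liability at a given failure
// stage, per the responsibility table above.
func liable(r Role, s FailureStage) bool {
	switch r {
	case Guarantor:
		return s == AgreementNotDealt || s == DealNotActivated
	case Repairer:
		return true // includes repairing lost sectors
	default:
		return false
	}
}

func main() {
	fmt.Println(liable(Guarantor, SectorLost)) // false: liability ends at onboarding
}
```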
Layer 2 query to Layer 1
- L2 needs to know that L1 Deal was made
- It is a core feature of the L2 to read data from the L1 (query surface sketched below)
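A minimal sketch of the query surface this implies, assuming a pull-based oracle; the interface, its method name, and the `DealStatus` fields are hypothetical, though the fields mirror the CommP/size/terms listed in Stage 3 of the protocol below.

```go
// Sketch of the read-only view the L2 market needs into L1 deal state.
// The interface and its method names are hypothetical; on IPC this
// could be backed by the "read-only method call" into the parent
// subnet's state discussed in the notes further down.
package main

import "fmt"

// DealStatus mirrors the L1 facts the L2 must learn: activation and terms.
type DealStatus struct {
	Activated  bool
	CommPa     []byte // aggregate piece commitment of the L1 deal
	Size       uint64
	StartEpoch int64
	EndEpoch   int64
}

// L1Oracle is the query surface: given an L1 deal ID, report its status.
type L1Oracle interface {
	DealStatus(dealID uint64) (DealStatus, error)
}

// stubOracle is a stand-in implementation for tests and demos.
type stubOracle map[uint64]DealStatus

func (o stubOracle) DealStatus(id uint64) (DealStatus, error) {
	s, ok := o[id]
	if !ok {
		return DealStatus{}, fmt.Errorf("deal %d not found on L1", id)
	}
	return s, nil
}

func main() {
	o := stubOracle{42: {Activated: true, Size: 32 << 30}}
	s, _ := o.DealStatus(42)
	fmt.Printf("deal 42 activated=%v size=%d\n", s.Activated, s.Size)
}
```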
Data Market Protocol
Stage 1: Data preparation
- Client prepares the file
Stage 2: Agreement making
- Client publishes the agreementInfo and supporting info on the L2 Market (can specify aggregators)
- Aggregator bids on the agreement on L2 Market
- Aggregator facilitates a transfer from Client to SP, verifies the data matches the agreement.
- Aggregator confirms the agreement on the L2 Market.
- Possibly: from this point on, the aggregator is held liable to onboard the data (the deal is either onboarded on the L1 or the aggregator is liable?)
- Aggregator continues to collect files from other users (ideally it does not receive the full file but redirects the transfer to the SP)
- Aggregator collects multiple agreements into a deal via the L1 Market (lifecycle sketched below)
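The lifecycle implied by Stages 2-3 can be summarized as a small state machine. This is one hypothetical reading of the flow; the state names and transitions are illustrative, not a final design.

```go
// Sketch of the agreement lifecycle implied by Stages 2-3; states and
// transitions are a hypothetical reading of the flow above.
package main

import "fmt"

type State int

const (
	Proposed  State = iota // client published agreementInfo on the L2
	Bid                    // an aggregator bid on the agreement
	Confirmed              // aggregator verified the data and confirmed on L2
	Activated              // matching L1 deal proven active, inclusion checked
	Defaulted              // not onboarded in time: penalty / slashing
)

// transitions lists the legal next states for each state.
var transitions = map[State][]State{
	Proposed:  {Bid},
	Bid:       {Confirmed, Defaulted},
	Confirmed: {Activated, Defaulted},
}

func step(from, to State) (State, error) {
	for _, s := range transitions[from] {
		if s == to {
			return to, nil
		}
	}
	return from, fmt.Errorf("illegal transition %d -> %d", from, to)
}

func main() {
	s := Proposed
	for _, next := range []State{Bid, Confirmed, Activated} {
		var err error
		if s, err = step(s, next); err != nil {
			panic(err)
		}
	}
	fmt.Println("agreement reached state:", s)
}
```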
Stage 3: Agreement activation
- The SP onboards a sector; at ProveCommit, the deal is activated on the L1 Market
- Aggregator confirms the agreements collected in a deal in L2
- The deal’s commitment is checked on-chain
- An oracle (or another mechanism from the L1) shows that the deal is activated and reports its activity status, including its CommP, size, and deal terms.
- Aggregator shows the inclusion of CommPc (client) in CommPa (aggregate); the deal terms are verified to be consistent with the agreements.
- If all checks succeed, the agreement is considered activated (inclusion check sketched below)
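The inclusion check in the last steps is, in essence, a Merkle inclusion proof. The real FRC proof operates over Filecoin piece commitments (a sha2-256-based binary tree with truncated leaves and padded data); the sketch below uses plain sha256 only to show the shape of the verification.

```go
// Minimal sketch of the CommPc-in-CommPa check: a standard binary
// Merkle inclusion proof. Plain sha256 stands in for the piece
// commitment hash used by the actual FRC.
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
)

// ProofNode is one sibling hash plus its side relative to the path.
type ProofNode struct {
	Sibling []byte
	Left    bool // true if the sibling is the left child
}

// verifyInclusion folds the leaf (CommPc) up the path and compares the
// result against the root (CommPa).
func verifyInclusion(leaf, root []byte, path []ProofNode) bool {
	cur := leaf
	for _, n := range path {
		var h [32]byte
		if n.Left {
			h = sha256.Sum256(append(append([]byte{}, n.Sibling...), cur...))
		} else {
			h = sha256.Sum256(append(append([]byte{}, cur...), n.Sibling...))
		}
		cur = h[:]
	}
	return bytes.Equal(cur, root)
}

func main() {
	// Two-leaf toy tree: root = H(leafA || leafB).
	a := sha256.Sum256([]byte("client piece"))
	b := sha256.Sum256([]byte("other piece"))
	root := sha256.Sum256(append(a[:], b[:]...))
	ok := verifyInclusion(a[:], root[:], []ProofNode{{Sibling: b[:], Left: false}})
	fmt.Println("CommPc included in CommPa:", ok)
}
```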
Failures: TBD
- If the agreement is not onboarded on the L1 and proven on the L2 within a given period, the aggregator pays a penalty / is slashed / takes a reputation hit.
- If the file is lost:
The aggregator fills the liability gap between the Client and the SP: the window during which the file is not yet confirmed on the L1.
Work plan
✅ done 🔵 in progress
Design Phase (Kuba 80%, Nicola 20%)
- Step 0: FRC for verifiable sub pieces ✅ (April 2023)
- Step 1: Gather requirements for data aggregators and capabilities of IPC (May 2023) 🔵
- Step 2: Early Protocol design for review (PAUSED — mid June 2023)
Uncertain projections:
Implementation Phase (1 Solidity Engineer for IPC side 100%, 1 PM 50%, Kuba on Filecoin facing side 50%, Nicola 20%, Audits from Alex, Irene and FVM team for 2 weeks)
- Step 3: Assemble a team for:
- Smart contract development (August 2023)
- API and software for SPs, clients, and data aggregators (August 2023)
- Step 4: Testnet version (September 2023)
- Testing
- Protocol Audits
- Smart Contract audits
- Step 5: Mainnet version (October 2023)
Notes
2023-05-09
- How does L1 know that a deal has been made?
- L1 doesn’t need to know
- How do we protect against malicious actors in the market?
- Do we want order books?
- At least an Aggregator service bulletin would be super useful
- Do agreements guarantee storage over time, repair?
- Depends on service level.
2023-05-16
- Juan values data preparation nodes
- Common assumption Data Preparation is hard
- what if we made it super simple?
- Data stored flat in deal
- Consistent IPFS hashing
- The tree is not part of deal data but rebuilt by SP.
- what if we made it super simple by removing IPFS from the inside?
- If CARification were removed, data preparation would simplify to chunking and hashing (sketched below).
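As a toy illustration of how simple that could get, the sketch below splits a byte stream into fixed-size chunks and hashes each one. Real Filecoin data prep computes piece commitments (CommP) over padded data; both sha256 and the 1 MiB chunk size here are illustrative stand-ins.

```go
// Toy illustration of "chunking and hashing" data preparation: split a
// byte stream into fixed-size chunks and hash each one.
package main

import (
	"crypto/sha256"
	"fmt"
	"io"
	"strings"
)

const chunkSize = 1 << 20 // 1 MiB, illustrative

// chunkAndHash reads the input in fixed-size chunks and returns one
// digest per chunk.
func chunkAndHash(r io.Reader) ([][32]byte, error) {
	var digests [][32]byte
	buf := make([]byte, chunkSize)
	for {
		n, err := io.ReadFull(r, buf)
		if n > 0 {
			digests = append(digests, sha256.Sum256(buf[:n]))
		}
		if err == io.EOF || err == io.ErrUnexpectedEOF {
			return digests, nil
		}
		if err != nil {
			return nil, err
		}
	}
}

func main() {
	d, err := chunkAndHash(strings.NewReader("flat data, no CAR wrapping"))
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d chunk(s), first digest %x\n", len(d), d[0][:8])
}
```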
- Calls:
- Marina re Client Onboarding
- Lauren (OuterCore) re - Whole data flow
- On Friday
- Anjor re Analysis of Clients and their behaviour (do they want)
- Works mostly on big data, do we still want to meet?
- Try to schedule this week
- Filecoin Miners might drop deals due to unprofitability. How do we save them?
- Are these FIL+ deals? Yes
- It doesn’t have to be on IPC
- Are there other L2s that work on Filecoin?
- Custom permission L2?
2023-05-18
Aggregation call with FVM
- IPC timeline for deployment on mainnet for v1 is October
- SpaceNet can be used for demos/dev right now but is WIP
- Accessing L1 data and state in the L2
- All nodes in an IPC subnet are assumed to be able to access the state of the layer above; exposing a “read-only method call” looks to be an option.
WIP Diagram
Call Anjor - @2023/05/24
- What are the hurdles for large data clients?
- SPs don’t understand how data preparation relates to retrieval
- Client-Owned Preparation
- works but needs a lot of know-how on the part of the client
- Standalone Data Preparer
- Bacalhau as data preparer
- Another big problem: Finding SPs
- IA uses Spade to find SPs
- Delta has a huge list of active SPs
- Geofencing
- Assumption: All SPs always want verified deals
- Not always true, due to high gas fees
- 30TiB/d data prep from IA
- IA preps data faster than SPs are accepting
- It goes through Spade, which is still onboarding SPs.
- Gas prices for PSD and Activation are an issue.
- Spade is the client side of Market for finding SPs
- Client posts that they have data, SPs query if there is some data for them.
- Spade doesn’t handle Client payments
- Centralized Broker
- @caro is working on a reputation DAO for SPs
- They don’t define what the reputation is; the Client defines it based on the metrics they provide.
- go-fil-dataprep
- Solana
- No data prep as we think about it; they have an IPLD schema, so it goes directly to a CAR file
- The large 800GiB CAR file gets split into smaller ones.
- “The larger ecosystem does not understand how data preparation and retrieval is connected.”
Call Marina - @2023/05/25
- Solving small deals
- Do large data clients go straight to the SP?
- Usually they want to have an IRL contract with the SP
- Relationship between SP and IA
- let’s think about large data
- Codifying contracts between Lead SPs and replicating SPs
- There is a Lead SP which communicates with the Client
- The Lead SP
- Codifying agreements between Lead SP and Replication SPs !!
- Broker dashboards based on L2 data
1:1 Nicola - @May 30, 2023
@June 22, 2023
Spade/web3.storage would like a contract where their verifiable capabilities (based on structured signed tokens) enable a broker to allocate DataCap.