Filecoin On-chain Storage Service

Authors

Nicola

Alex North

Creator

Nicola

Created

Feb 9, 2022 12:34 PM

Project

Data availability for Web3

Stage

Still Valid

Intro
Definitions
Milestones: On-chain storage and On-chain computation scheduler
Let’s design on-chain storage together
What lives in IPFS and what lives in Filecoin?

Intro

This document presents a new addition to the storage services offered by Filecoin: on-chain storage.

Storage services offered by Filecoin:

Filecoin Storage Market (current product)

Can store files (streams of bytes)
Very cheap and with simple guarantees

No guarantee on retrieval
No guarantee of perpetuity of storage

On-chain storage (presented in this doc)

Can store structured data
Data is accessible from smart contracts
Data availability
Perpetual storage
Verifiable queries/database

Definitions

On-chain storage vs On-chain state

On-chain means smart contracts can access the data
State is the equivalent of memory in a computer: it is directly accessible, but expensive on Ethereum
Storage is the equivalent of a hard disk: data is loaded as needed, and it is cheap to store (though maybe expensive to use)

On-chain storage vs Off-chain storage

Off-chain storage: data can’t be accessed by smart contracts, or access is cumbersome (NFTs and NFT metadata are stored off-chain, so smart contracts can’t access this data)
On-chain storage: data can be accessed via a smart contract (Merkle Airdrops could be considered on-chain storage)

How is on-chain storage accessed?

Unlike on-chain state, which can be read directly, on-chain storage is a bit more complex to access:

Single tx via proof (see below)
Job scheduler (see below)

What’s the difference between the Filecoin Storage Market and the On-chain Storage product?

The Filecoin Storage Market is what we offer today; its guarantees are weak (no guarantee on retrieval or perpetuity)
On-chain storage is storage with stronger guarantees

Milestones: On-chain storage and On-chain computation scheduler

The idea is to first provide on-chain storage products, then to focus on better computation over on-chain data.

This is a list of features and projects that we must work on to achieve these goals.

Step 1: On-chain storage as a product (STORAGE PRODUCTS)

Accessibility through smart contract

Project: Job scheduler
Project: On-chain verifier (software for generating proofs & so on)

Data Availability

Project: Actual Data Availability protocol
Project: Retrieval Mining(?)

Retrieval/Proving

Project: A simple way to retrieve and prove data that is not “files” (e.g. key-value stores)

Perpetual storage

Project: Bounty contract
Project: Crowdfunding contract
Project: Repair nodes contracts
Project: Pack it all up into a Perpetual storage contract

Verifiability

Project: Better Vector commitments (ops happen on chain)
Project: Better Key value store (ops happen on chain)
Project: Better generic verifiable delegated DB queries (ops happen off chain, delegated)
Project: Add these data structures to IPLD (make Provable IPLD)

Step 2: Computation on Filecoin data (COMPUTATIONAL PRODUCTS)

This project is about scaling computation, not scaling storage
Job scheduling

Project: incentivize nodes to run job scheduler for all computation

Different computational models

Verifiable delegated DB
Optimistic DB (Arbitrum-style)
Encrypted computation with ZAMA (FHE)
MPC

Let’s design on-chain storage together

This part of the document is educational: it shows why it’s important to have on-chain storage.

End-goal:

NFTs have large metadata
NFTs have a pointer to a “Filecoin Storage Contract”

The Storage contract has read functions that can be used by other smart contracts
Pays for perpetual storage
Guarantees data availability

Exercise:

There are some NFTs
NFTs have metadata (e.g. Apes wear fur, Nouns wear glasses)
Metadata could be stored on-chain or in a JSON file
We want a smart contract that emits an event saying “You are in” if the Ape wears fur (or if the Noun wears blue glasses)

On-chain State: Accessing Nouns features

Nouns Storage:

Nouns images are stored on chain (the actual data, no IPFS link)
Nouns features are stored on-chain in a descriptor smart contract

→ On-chain state, features can be accessed via smart contracts in Ethereum

https://etherscan.io/address/0x9c8ff314c9bc7f6e59a9d9225fb22946427edc03#readContract

// On-chain State (Noun)
// Check via on-chain smart contract
checkGlasses(nounId) {
	// Read the feature directly from the descriptor contract's state
	return descriptorContract.glasses(nounId)
}

ClaimInvite(nounId) {
	if (checkGlasses(nounId)) {
		Emit("You are in")
	}
}

Off-chain Storage: Accessing BoredApeYachtClub features

BAYC Storage:

The BAYC smart contract has a field called tokenURI
The tokenURI points to an IPFS hash
This hash points to a JSON file that has the features of the Ape
One of the fields is the image of the NFT

→ Off-chain storage: features of the Apes can’t be accessed by smart contracts

https://etherscan.io/address/0xbc4ca0eda7647a8ab7c2061c2e118a18a936f13d#readContract

// Off-chain Storage (BoredApe)

// We can't access this data since it's offchain as a JSON on an IPFS hash

Since smart contracts can’t access the data directly, we must get clever and find a way to show this data to the chain.

Single TX Attempt 1: for on-chain storage using straight IPFS

What if we want to read the fur attribute from the IPFS file?
Pro: we can read the fur data
Cons: passing a file in an Ethereum tx is VERY expensive
Cons: parsing the fur data from the JSON is also very expensive

// On-chain storage of files (BoredApe using IPFS hash)
// Check via IPFS hash (full hashing and then parsing of the file)
checkFur(apeId, file) {
	// The submitted file must hash to the reference stored on-chain
	if (bayc.tokenURI(apeId) !== hash(file)) {
		return false
	}
	// Parse the JSON on-chain and read the attribute
	object = JSON.parse(file)
	return object.attributes[5].value === 'fur'
}
claimThankYouWithFile(apeId, file) {
	if (checkFur(apeId, file)) {
		Emit("You are in")
	}
}
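
To make the cost of Attempt 1 concrete, here is a minimal Python sketch of the same check, run off-chain. It is an illustration only: `MockBayc`, `content_hash` and the metadata layout are hypothetical, a plain SHA-256 digest stands in for a real IPFS hash, and the attribute lookup scans the list instead of using a fixed index.

```python
import hashlib
import json


def content_hash(data: bytes) -> str:
    # Assumption: a SHA-256 hex digest stands in for a real IPFS hash (CID).
    return hashlib.sha256(data).hexdigest()


class MockBayc:
    """Hypothetical stand-in for the NFT contract: token id -> content hash."""

    def __init__(self):
        self._token_uri = {}

    def set_token_uri(self, ape_id: int, uri: str) -> None:
        self._token_uri[ape_id] = uri

    def token_uri(self, ape_id: int) -> str:
        return self._token_uri[ape_id]


def check_fur(bayc: MockBayc, ape_id: int, file: bytes) -> bool:
    # Step 1: hash the ENTIRE file and compare with the on-chain reference.
    if content_hash(file) != bayc.token_uri(ape_id):
        return False
    # Step 2: parse the whole JSON document just to read one value.
    metadata = json.loads(file)
    return any(a.get("value") == "fur" for a in metadata.get("attributes", []))


def claim_invite(bayc: MockBayc, ape_id: int, file: bytes) -> list:
    # Returns the list of emitted events (empty if the check fails).
    return ["You are in"] if check_fur(bayc, ape_id, file) else []
```

Both steps touch every byte of the file, which is why doing this inside a transaction is expensive: the caller pays to ship, hash and parse the full metadata even though the contract needs a single value.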

Single TX Attempt 2: on-chain storage using a better Key Value Store Commitment scheme

The URI reference is not an IPFS hash but a Key Value Store Commitment
We do this because the proof is smaller (smaller than the entire IPFS file) and because we no longer need to parse the JSON (we get the value we want directly)

// On-chain storage of data
// Check via KV store (delegate data)
checkFur(apeId, proof) {
	// The proof must open the on-chain commitment to a (key, value) pair
	if (!verifyKV(bayc.tokenKVCommitment(apeId), proof)) {
		return false
	}
	return proof.value === "fur"
}
claimThankYouWithProof(apeId, proof) {
	if (checkFur(apeId, proof)) {
		Emit("Thank you")
	}
}
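
One possible way to realize a key value store commitment is a Merkle tree over the sorted key/value pairs; the sketch below assumes that construction (the document does not pick a scheme, and better vector commitments are listed as a separate project). All names (`commit`, `prove`, `verify_kv`, the `clothes` key) are hypothetical, and padding odd levels by duplicating the last node is a simplification.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def leaf(key: str, value: str) -> bytes:
    # 0x00 marks a leaf; the key is length-prefixed so (key, value) is unambiguous.
    k = key.encode()
    return _h(b"\x00" + len(k).to_bytes(4, "big") + k + value.encode())


def node(left: bytes, right: bytes) -> bytes:
    # 0x01 marks an inner node.
    return _h(b"\x01" + left + right)


def build_levels(kv: dict) -> list:
    if not kv:
        raise ValueError("cannot commit to an empty store")
    level = [leaf(k, v) for k, v in sorted(kv.items())]
    levels = []
    while True:
        if len(level) > 1 and len(level) % 2:
            level = level + [level[-1]]  # pad odd levels (simplification)
        levels.append(level)
        if len(level) == 1:
            return levels
        level = [node(level[i], level[i + 1]) for i in range(0, len(level), 2)]


def commit(kv: dict) -> bytes:
    # The commitment is the Merkle root: 32 bytes regardless of store size.
    return build_levels(kv)[-1][0]


def prove(kv: dict, key: str) -> dict:
    levels = build_levels(kv)
    index = sorted(kv).index(key)
    path = []
    for level in levels[:-1]:
        # Record the sibling hash and whether our node is the right child.
        path.append((level[index ^ 1], index % 2 == 1))
        index //= 2
    return {"key": key, "value": kv[key], "path": path}


def verify_kv(commitment: bytes, proof: dict) -> bool:
    acc = leaf(proof["key"], proof["value"])
    for sibling, is_right in proof["path"]:
        acc = node(sibling, acc) if is_right else node(acc, sibling)
    return acc == commitment


def check_fur(commitment: bytes, proof: dict) -> bool:
    # Assumption: the trait lives under the hypothetical key "clothes".
    return (verify_kv(commitment, proof)
            and proof["key"] == "clothes" and proof["value"] == "fur")
```

With this construction the verifier hashes one leaf plus one sibling per tree level, so the proof grows logarithmically with the number of keys and no JSON parsing is needed.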

Single TX Attempt 3: on-chain storage using a SNARK-based DB

The idea is to delegate the smart contract check (the query that checks whether the ape has fur) to a SNARK circuit or verifiable database.

// On-chain storage of queryable data
// Check via SnarkDB (delegate the data and the check, i.e. delegate a query)
claimThankYouWithProof(apeId, proof) {
	// The proof attests that the fur query returned true
	if (SnarkDBVerifier(FurCircuit, proof)) {
		Emit("Thank you")
	}
}

Multiple TX Attempt 4: on-chain storage using a Job Scheduler + any of the above solutions

User adds to the scheduler some data that they need to show to a smart contract
Someone executes the pending jobs and for each job calls the callback

ThankYou {
	claimThankYouWithScheduler(apeId) {
		// Register a job; the result arrives later via callback
		scheduler.Add({apeId, 'look for fur', callback})
	}
	callback(task) {
		Emit("Thank you")
	}
}


Scheduler {
	taskList
	Add(taskObj) {
		taskList.add(taskObj)
	}
	ExecuteTasks(proof) {
		// Anyone can execute pending jobs by supplying a valid proof
		verify(proof)
		for each task in taskList: task.callback(task)
	}
}
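
The scheduler flow above can be sketched in Python as follows. This is an illustration under assumptions, not a contract design: the `verify` function is a placeholder passed in by the caller (any of the single-transaction checks could sit behind it), the `Task` fields are hypothetical, and, unlike the pseudocode, the sketch clears the task list after execution so callbacks do not fire twice.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Task:
    ape_id: int
    query: str
    callback: Callable[["Task"], None]


class Scheduler:
    def __init__(self, verify: Callable[[List[Task], bytes], bool]):
        # `verify` is a placeholder for a real proof check over the pending tasks.
        self._verify = verify
        self._tasks: List[Task] = []

    def add(self, task: Task) -> None:
        # Transaction 1: a user registers a job.
        self._tasks.append(task)

    def execute_tasks(self, proof: bytes) -> int:
        # Transaction 2: anyone submits a proof covering the pending jobs.
        if not self._verify(self._tasks, proof):
            raise ValueError("invalid proof")
        pending, self._tasks = self._tasks, []
        for task in pending:
            task.callback(task)
        return len(pending)


class ThankYou:
    def __init__(self, scheduler: Scheduler):
        self.scheduler = scheduler
        self.events: list = []

    def claim_thank_you_with_scheduler(self, ape_id: int) -> None:
        self.scheduler.add(Task(ape_id, "look for fur", self._callback))

    def _callback(self, task: Task) -> None:
        # Runs only after the scheduler has accepted a proof.
        self.events.append(("Thank you", task.ape_id))
```

A usage example with a dummy verifier: `s = Scheduler(lambda tasks, proof: proof == b"ok")`, then `ThankYou(s).claim_thank_you_with_scheduler(7)` queues the job and `s.execute_tasks(b"ok")` triggers the callback.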

What lives in IPFS and what lives in Filecoin?

IPFS <> HTTP <> Email <> Protocol

Content-addressable network
Peer-discovery network
Content delivery network
Filesystem: files are chunked in a specific way, with folder structuring (or IPLD if it’s just data)
(Altruistic)

Filecoin <> AWS <> Gmail <> CryptoNetwork

Publicly available database and service
Network of “trusted” users that we can delegate storage services to
Storage Products

Filecoin File System
Structured data storage (future)
Storage derivatives market (future)
On-chain storage (future)

Accessible from smart contracts
Verifiable queries/database
Data availability
Perpetual storage

Computational Tasks scheduler

Schedule a computational task using this data

So at what level should we do the on-chain storage stuff?

The better authenticated data structures (DB, KV store) should just be an extension of IPLD (e.g. a new project around IPLD data, VeryDB)
However, the actual storage service should be offered on-chain

Does it have to be Filecoin?

Yes: the miners already have the data, and having everything integrated would be nice
However, if we wanted to, we could bootstrap a new network called DBCoin (or a smart contract on top of Filecoin or Ethereum)