Filecoin On-chain Storage Service

May 23, 2022 11:32 PM
Data availability for Web3
This document presents a new addition to the storage services offered by Filecoin: on-chain storage.

Storage services offered by Filecoin:

  • Filecoin Storage Market (current product)
    • Can store files, stream of bytes
    • Very cheap and with simple guarantees
      • No guarantee on retrieval
      • No guarantee of perpetuity of storage
  • On-chain storage (presented in this doc)
    • Can store structured data
    • Data is accessible from smart contracts
    • Data availability
    • Perpetual storage
    • Verifiable queries/database


On-chain storage vs On-chain state

  • On-chain means smart contracts can access the data
  • State is the equivalent of memory in a CPU: it can be read directly, but it’s expensive on ETH
  • Storage is the equivalent of a hard disk: data is loaded only when needed, so it’s cheap to store (though maybe expensive to use)

On-chain storage vs Off-chain storage

  • Off-chain storage: data can’t be accessed by smart contracts, or only in a cumbersome way (NFTs and NFT metadata are stored off-chain, so smart contracts can’t read this data)
  • On-chain storage: data can be accessed by a smart contract (Merkle airdrops could be considered on-chain storage)

How is on-chain storage accessed?

  • Unlike on-chain state, which can be read directly, on-chain storage is a bit more complex to access. There are two options:
    • Single tx via proof (see below)
    • Job scheduler (see below)

What’s the difference between the Filecoin Storage Market and the On-chain Storage product?

  • The Storage Market is what we offer today; its guarantees are weak (no guarantee on retrieval or on perpetuity of storage)
  • On-chain storage is storage with stronger guarantees

Milestones: On-chain storage and On-chain computation scheduler

The idea is to provide on-chain storage products first, and then to focus entirely on better computation over on-chain data.

This is a list of features and projects that we must work on to achieve these goals.

Step 1: On-chain storage as a product (STORAGE PRODUCTS)

  • Accessibility through smart contract
    • Project: Job scheduler
    • Project: On-chain verifier (software for generating proofs & so on)
  • Data Availability
    • Project: Actual Data Avail protocol
    • Project: Retrieval Mining(?)
  • Retrieval/Proving
    • Project: have a simple way to retrieve and prove data that is not “files” (e.g. key value stores)
  • Perpetual storage
    • Project: Bounty contract
    • Project: Crowdfunding contract
    • Project: Repair nodes contracts
    • Project: Pack it all up into a Perpetual storage contract
  • Verifiability
    • Project: Better Vector commitments (ops happen on chain)
    • Project: Better Key value store (ops happen on chain)
    • Project: Better generic verifiable delegated DB queries (ops happen off chain, delegated)
    • Project: Add these data structures to IPLD (make Provable IPLD)

Step 2: Computation on filecoin data (COMPUTATIONAL PRODUCTS)

  • This project is about scaling computation, not scaling storage
  • Job scheduling
    • Project: incentivize nodes to run job scheduler for all computation
  • Different computational models
    • Verifiable delegated DB
    • Optimistic DB (Arbitrum style)
    • Encrypted computation with ZAMA (FHE)
    • MPC

Let’s design on-chain storage together

This part of the document is educational: it shows why on-chain storage is important.


  • NFTs have large metadata
  • NFTs have a pointer to a “Filecoin Storage Contract”
    • The Storage contract has read functions that can be used by other smart contracts
    • Pays for perpetual storage
    • Guarantees data availability


  • There are some NFTs
  • NFTs have metadata (e.g. Apes wear fur, Nouns wear glasses)
  • Metadata could be stored on-chain or in a JSON file
  • We want a smart contract that emits an event saying “You are in” if the Ape wears fur (or if the Noun wears blue glasses)

On-chain State: Accessing Nouns features

// On-chain State (Noun)
// Check via on-chain smart contract
checkGlasses(nounId) {
	// The trait lives in contract state, so it can be read directly
	return descriptContract.glasses(nounId)
}

ClaimInvite(nounId) {
	if (checkGlasses(nounId)) {
		Emit("You are in")
	}
}

Off-chain Storage: Accessing BoredApeYachtClub features

  • BAYC Storage:
    • The BAYC smart contract has a field called tokenURI
    • The tokenURI points to an IPFS hash
    • This hash points to a JSON file that has the features of the Ape
    • One of the fields is the image of the NFT

→ Storage: Off-chain storage, features of the apes can’t be accessed

// Off-chain Storage (BoredApe)

// We can't access this data since it's off-chain, stored as a JSON file behind an IPFS hash

Since the contract can’t read this data directly, we must get clever and find a way to show the data to the chain.

Single TX Attempt 1: on-chain storage using plain IPFS

  • What if we want to read the fur from the IPFS file?
  • Pro: we can read the fur data
  • Con: passing a whole file in an Ethereum tx is VERY expensive
  • Con: parsing the fur data from the JSON on-chain is also very expensive

// On-chain storage of files (BoredApe using IPFS hash)
// Check via IPFS hash (full hashing and then parsing of the file)
checkFur(apeId, file) {
	// The caller passes the entire file so that the contract can re-hash it
	if (bayc.tokenURI(apeId) !== hash(file)) {
		return false
	}
	object = JSON.parse(file)
	return object.attributes[5].value == 'fur'
}

claimThankYouWithFile(apeId, file) {
	if (checkFur(apeId, file)) {
		Emit("You are in")
	}
}
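To make the cost of Attempt 1 concrete, here is a minimal Python sketch of the same check run off-chain. It is an illustration under assumptions, not how BAYC or IPFS actually work: `content_hash` uses plain SHA-256 as a stand-in for the IPFS hash, and the metadata file, its `trait_type`/`value` layout and the `ipfs://example` link are hypothetical.

```python
import hashlib
import json


def content_hash(data: bytes) -> str:
    # Stand-in for the IPFS hash (assumption: a plain SHA-256 hex digest).
    return hashlib.sha256(data).hexdigest()


def check_fur(stored_hash: str, file_bytes: bytes) -> bool:
    # Step 1: re-hash the ENTIRE file and compare it with the stored hash.
    if content_hash(file_bytes) != stored_hash:
        return False
    # Step 2: parse the ENTIRE JSON just to read one value.
    metadata = json.loads(file_bytes)
    return any(attr.get("value") == "fur" for attr in metadata["attributes"])


# Hypothetical metadata file and the hash the NFT contract would store.
ape_file = json.dumps(
    {
        "image": "ipfs://example",
        "attributes": [{"trait_type": "Clothes", "value": "fur"}],
    }
).encode()
stored_hash = content_hash(ape_file)

print(check_fur(stored_hash, ape_file))         # the original file passes
print(check_fur(stored_hash, ape_file + b" "))  # a modified file fails the hash check
```

Both steps touch every byte of the file, which is why the transaction size and the parsing cost grow with the metadata rather than with the single value being checked.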

Single TX Attempt 2: on-chain storage using a better Key Value Store Commitment scheme

  • The URI reference is not an IPFS hash but a Key Value Store Commitment
  • We do this for two reasons: the proof is smaller than the entire IPFS file, and we don’t need to parse the JSON, since the proof gives us exactly the value we want
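// On-chain storage of data
// Check via KV store (delegate data)
checkFur(apeId, proof) {
	// The proof opens a single key of the commitment: no file is passed or parsed
	if (!verifyKV(bayc.tokenKVCommitment(apeId), proof)) {
		return false
	}
	return proof.value === "fur"
}

claimThankYouWithProof(apeId, proof) {
	if (checkFur(apeId, proof)) {
		Emit("Thank you")
	}
}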
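One possible shape for the key-value commitment check above is a Merkle tree over (key, value) pairs. The Python sketch below assumes that choice; the document does not say which commitment scheme is intended, and "better" schemes would likely have smaller proofs. All function names and the trait data are hypothetical, and the odd-level padding rule is a simplification that is not meant for production use.

```python
import hashlib


def h(*parts: bytes) -> bytes:
    return hashlib.sha256(b"".join(parts)).digest()


def leaf(key: str, value: str) -> bytes:
    # 0x00 marks a leaf; the length prefix keeps (key, value) unambiguous.
    k = key.encode()
    return h(b"\x00", len(k).to_bytes(4, "big"), k, value.encode())


def node(left: bytes, right: bytes) -> bytes:
    # 0x01 marks an inner node, so leaves and nodes can't be confused.
    return h(b"\x01", left, right)


def build_levels(leaves: list) -> list:
    # levels[0] holds the leaves, levels[-1] holds the single root.
    levels = [leaves]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        if len(cur) % 2:  # pad odd levels by repeating the last hash
            cur = cur + [cur[-1]]
        levels.append([node(cur[i], cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels


def prove(levels: list, index: int) -> list:
    # Collect one sibling hash per level, from the leaf up to the root.
    path = []
    for level in levels[:-1]:
        if len(level) % 2:
            level = level + [level[-1]]
        sibling = index ^ 1
        path.append((level[sibling], sibling < index))  # (hash, sibling_is_left)
        index //= 2
    return path


def verify_kv(commitment: bytes, key: str, value: str, path: list) -> bool:
    # Recompute the root from one leaf plus its siblings.
    acc = leaf(key, value)
    for sibling, sibling_is_left in path:
        acc = node(sibling, acc) if sibling_is_left else node(acc, sibling)
    return acc == commitment


# Hypothetical traits of one ape; only the 32-byte commitment goes on-chain.
traits = {"background": "blue", "clothes": "fur", "eyes": "bored"}
keys = sorted(traits)
levels = build_levels([leaf(k, traits[k]) for k in keys])
commitment = levels[-1][0]
proof = prove(levels, keys.index("clothes"))
print(verify_kv(commitment, "clothes", "fur", proof))
```

The verifier hashes once per tree level instead of hashing and parsing the whole file, so the proof grows logarithmically with the number of keys.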

Single TX Attempt 3: on-chain storage using a SNARK-based DB

  • The idea is that we delegate the smart contract checks (the query that checks if the ape has fur or not) to a SNARK circuit/Verifiable database
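// On-chain storage of queryable data
// Check via SnarkDB (delegate the data and delegate the check, i.e. delegate the query)
claimThankYouWithProof(apeId, proof) {
	// The proof attests that the FurCircuit query holds over the committed data
	if (SnarkDBVerifier(FurCircuit, proof)) {
		Emit("Thank you")
	}
}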

Multiple TX Attempt 4: on-chain storage using a Job Scheduler + any of the above solutions

  • User adds to the scheduler some data that they need to show to a smart contract
  • Someone executes the pending jobs and for each job calls the callback
ThankYou {
	claimThankyouWithScheduler(apeId) {
		scheduler.Add({apeId, 'look for fur', callback})
	}
	callback(task) {
		Emit("Thank you")
	}
}

Scheduler {
	Add(taskObj) {
		TaskList.push(taskObj)
	}
	ExecuteTasks(proof) {
		// One proof covers every pending task
		verify proof
		for each task in TaskList: task.callback(task)
		clear TaskList
	}
}
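The two-transaction flow of the scheduler can be sketched in Python as follows. This is a toy model under assumptions: `verify` is a hypothetical pluggable check standing in for any of the single-tx verifiers, the "proof" used in the example is just a set of ape ids, and `events` stands in for the contract's event log.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List


@dataclass
class Task:
    ape_id: int
    query: str
    callback: Callable[["Task"], None]


@dataclass
class Scheduler:
    # Hypothetical batch verifier: returns True if `proof` shows that every
    # pending task's query holds (e.g. one KV or SNARK check per task).
    verify: Callable[[List[Task], Any], bool]
    pending: List[Task] = field(default_factory=list)

    def add(self, task: Task) -> None:
        # Tx 1 (user): register the job; nothing is checked yet.
        self.pending.append(task)

    def execute_tasks(self, proof: Any) -> int:
        # Tx 2 (executor): verify once, then fire every callback.
        if not self.verify(self.pending, proof):
            raise ValueError("invalid proof")
        tasks, self.pending = self.pending, []  # clear the queue
        for task in tasks:
            task.callback(task)
        return len(tasks)


events: List[str] = []  # stands in for the contract's event log


def toy_verify(tasks: List[Task], proof: Any) -> bool:
    # Toy verifier (assumption): the "proof" is the set of ape ids with fur.
    return all(t.ape_id in proof for t in tasks)


scheduler = Scheduler(verify=toy_verify)
scheduler.add(
    Task(ape_id=42, query="look for fur", callback=lambda t: events.append("Thank you"))
)
executed = scheduler.execute_tasks(proof={42})
print(executed, events)
```

Splitting the work this way means the user's transaction stays cheap, while whoever executes the batch pays for verification once for all pending tasks.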

What lives in IPFS and what lives in Filecoin?

IPFS <> HTTP <> Email <> Protocol

  • Content-addressable network
  • Peer-discovery network
  • Content delivery network
  • Filesystem: files are chunked in a specific way and folders have a defined structure (or IPLD if it’s just data)
  • (Altruistic)

Filecoin <> AWS <> Gmail <> CryptoNetwork

  • Publicly available database and service
  • Network of “trusted” users that we can delegate storage services to
  • Storage Products
    • Filecoin File System
    • Structured data storage (future)
    • Storage derivatives market (future)
    • On-chain storage (future)
      • Accessible from smart contracts
      • Verifiable queries/database
      • Data availability
      • Perpetual storage
  • Computational Tasks scheduler
    • Schedule a computational task using this data

So at what level should we do the on-chain storage work?

  • The better authenticated data structures (DB, KV store) should be just an extension over IPLD (e.g. a new project around IPLD data (VeryDB))
  • However, the actual service for storage should be offered on-chain

Does it have to be Filecoin?

  • Yes: the miners already hold the data, so having everything integrated would be nice
  • However, if we wanted to, we could bootstrap a new network called DBCoin (or build it as a smart contract on top of Filecoin or Ethereum)