
    Filecoin On-chain Storage Service

    Authors
    Nicola, Alex North
    Creator
    Nicola
    Created
    Feb 9, 2022 12:34 PM
    Project
    Data availability for Web3
    Stage
    Still Valid
    • Intro
    • Definitions
    • Milestones: On-chain storage and On-chain computation scheduler
    • Let’s design together on-chain storage
    • What lives into IPFS and what lives into Filecoin?

    Intro

    This document presents a new addition to the storage services offered by Filecoin: on-chain storage.

    Storage services offered by Filecoin:

    • Filecoin Storage Market (current product)
      • Can store files (streams of bytes)
      • Very cheap and with simple guarantees
        • No guarantee on retrieval
        • No guarantee of perpetuity of storage
    • On-chain storage (presented in this doc)
      • Can store structured data
      • Data is accessible from smart contracts
      • Data availability
      • Perpetual storage
      • Verifiable queries/database

    Definitions

    On-chain storage vs On-chain state

    • On-chain means smart contracts can access the data
    • State is the equivalent of a computer’s memory, and it’s expensive on Ethereum
    • Storage is the equivalent of a hard disk: you load data as you need it, and it’s cheap to store (though maybe expensive to use)

    On-chain storage vs Off-chain storage

    • Off-chain storage: smart contracts can’t access the data, or accessing it is cumbersome (e.g. NFTs and NFT metadata are stored off-chain, so smart contracts can’t read them)
    • On-chain storage: smart contracts can access the data (e.g. Merkle Airdrops could be considered on-chain storage)

    How is on-chain storage accessed?

    • Unlike on-chain state, which can be read directly, on-chain storage is a bit more complex to access:
      • Single tx via proof (see below)
      • Job scheduler (see below)

    What’s the difference between Filecoin Storage Market and the On-chain Storage product

    • The Storage Market is what we offer today; its guarantees are weak (no guarantee on retrieval or on perpetuity of storage)
    • On-chain storage offers stronger guarantees

    Milestones: On-chain storage and On-chain computation scheduler

    The idea is to first provide on-chain storage products, then to focus on better computation over on-chain data.

    This is a list of features and projects that we must work on to achieve these goals.

    Step 1: On-chain storage as a product (STORAGE PRODUCTS)

    • Accessibility through smart contract
      • Project: Job scheduler
      • Project: On-chain verifier (software for generating proofs & so on)
    • Data Availability
      • Project: Actual Data Avail protocol
      • Project: Retrieval Mining(?)
    • Retrieval/Proving
      • Project: have a simple way to retrieve and prove data that is not “files” (e.g. key value stores)
    • Perpetual storage
      • Project: Bounty contract
      • Project: Crowdfunding contract
      • Project: Repair nodes contracts
      • Project: Pack it all up into a Perpetual storage contract
    • Verifiability
      • Project: Better Vector commitments (ops happen on chain)
      • Project: Better Key value store (ops happen on chain)
      • Project: Better generic verifiable delegated DB queries (ops happen off chain, delegated)
      • Project: Add these data structures to IPLD (make Provable IPLD)

    Step 2: Computation on Filecoin data (COMPUTATIONAL PRODUCTS)

    • This project is about scaling computation, not scaling storage
    • Job scheduling
      • Project: incentivize nodes to run job scheduler for all computation
    • Different computational models
      • Verifiable delegated DB
      • Optimistic DB (Arbitrum style)
      • Encrypted computation with ZAMA (FHE)
      • MPC

    Let’s design together on-chain storage

    This part of the document is educational: it shows why on-chain storage is important.

    End-goal:

    • NFTs have large metadata
    • NFTs have a pointer to a “Filecoin Storage Contract”
      • The Storage contract has read functions that can be used by other smart contracts
      • Pays for perpetual storage
      • Guarantees data availability

    Exercise:

    • There are some NFTs
    • NFTs have metadata (e.g. Apes wear fur, Nouns wear glasses)
    • Metadata could be stored on-chain or in a JSON file
    • We want a smart contract that emits an event saying “You are in” if the Ape wears fur (or if the Noun wears blue glasses)

    On-chain State: Accessing Nouns features

    • Nouns Storage:
      • Nouns images are stored on chain (the actual data, no IPFS link)
      • Nouns features are stored on-chain in a descriptor smart contract
      • → On-chain state, features can be accessed via smart contracts in Ethereum

    • https://etherscan.io/address/0x9c8ff314c9bc7f6e59a9d9225fb22946427edc03#readContract
    // On-chain State (Noun)
    // Check via on-chain smart contract: the descriptor contract is read directly
    checkGlasses(nounId) {
    	return descriptorContract.glasses(nounId)
    }

    ClaimInvite(nounId) {
    	if (checkGlasses(nounId)) {
    		Emit("You are in")
    	}
    }

    Off-chain Storage: Accessing BoredApeYachtClub features

    • BAYC Storage:
      • The BAYC smart contract has a field called tokenURI
      • The tokenURI points to an IPFS hash
      • This hash points to a JSON file that has the features of the Ape
      • One of the fields is the image of the NFT

    → Off-chain storage: features of the apes can’t be accessed by smart contracts

    • https://etherscan.io/address/0xbc4ca0eda7647a8ab7c2061c2e118a18a936f13d#readContract
    // Off-chain Storage (BoredApe)
    
    // We can't access this data since it's off-chain, stored as JSON behind an IPFS hash

    Since a smart contract can’t access the data directly, we must get clever and find a way to show this data to the chain in a form it can verify.

    Single TX Attempt 1: for on-chain storage using straight IPFS

    • What if we want to read the fur attribute from the IPFS file?
    • Pro: we can read the fur data
    • Con: passing a file in an Ethereum tx is VERY expensive
    • Con: parsing the fur data from the JSON is also very expensive
    // On-chain storage of files (BoredApe using IPFS hash)
    // Check via IPFS hash (full hashing and then parsing of the file)
    checkFur(apeId, file) {
    	// The submitted file must match the hash referenced by the token
    	if (bayc.tokenURI(apeId) !== hash(file)) {
    		return false
    	}
    	// Parse the whole JSON just to read one attribute (expensive)
    	object = JSON.parse(file)
    	return object.attributes[5].value === 'fur'
    }
    claimThankYouWithFile(apeId, file) {
    	if (checkFur(apeId, file)) {
    		Emit("You are in")
    	}
    }
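
To make the cost of Attempt 1 concrete, here is a minimal off-chain sketch in Python of the same check. It is an illustration under assumptions, not contract code: SHA-256 stands in for the IPFS content hash, the metadata layout is hypothetical, and the attribute is looked up by value rather than by the fixed index used in the pseudocode above.

```python
import hashlib
import json


def file_hash(data: bytes) -> str:
    # Assumption: a plain SHA-256 hex digest stands in for the IPFS hash.
    return hashlib.sha256(data).hexdigest()


def check_fur(committed_hash: str, file_bytes: bytes) -> bool:
    # 1. The whole file has to be submitted and hashed to match the commitment.
    if file_hash(file_bytes) != committed_hash:
        return False
    # 2. The whole document is parsed just to read a single attribute.
    metadata = json.loads(file_bytes)
    return any(
        attr.get("value") == "fur" for attr in metadata.get("attributes", [])
    )
```

Both steps touch every byte of the file, which is why this approach gets expensive when it has to run inside a transaction.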

    Single TX Attempt 2: on-chain storage using a better Key Value Store Commitment scheme

    • The URI reference is not an IPFS hash but a Key Value Store Commitment
    • We do this because the proof is smaller than the entire IPFS file, and because we don’t need to parse the data from the JSON (the proof gives us the value we want directly)
    // On-chain storage of data
    // Check via KV store (delegate data)
    checkFur(apeId, proof) {
    	// The proof must open the commitment stored for this ape
    	if (!verifyKV(bayc.tokenKVCommitment(apeId), proof)) {
    		return false
    	}
    	return proof.value === "fur"
    }
    claimThankYouWithProof(apeId, proof) {
    	if (checkFur(apeId, proof)) {
    		Emit("Thank you")
    	}
    }
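
One possible way to build such a commitment is a Merkle tree over the key/value pairs; the document does not prescribe a scheme, so the sketch below is only an assumption for illustration. The names `commit`, `prove`, `verify_kv` and the `clothes` key are hypothetical, and the tree is not hardened (it pads odd levels by duplicating the last node and does not handle an empty store).

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def leaf(key: str, value: str) -> bytes:
    # Leaves (0x00) and inner nodes (0x01) are domain-separated; the key is length-prefixed.
    k = key.encode()
    return h(b"\x00" + len(k).to_bytes(4, "big") + k + value.encode())


def node(left: bytes, right: bytes) -> bytes:
    return h(b"\x01" + left + right)


def build_levels(store: dict[str, str]) -> list[list[bytes]]:
    # Leaves are ordered by key; an odd node at any level is paired with itself.
    level = [leaf(k, store[k]) for k in sorted(store)]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]
        level = [node(level[i], level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def commit(store: dict[str, str]) -> bytes:
    # The 32-byte root is the only thing the contract would need to keep.
    return build_levels(store)[-1][0]


def prove(store: dict[str, str], key: str) -> dict:
    index = sorted(store).index(key)
    path = []
    for level in build_levels(store)[:-1]:
        sibling = index ^ 1
        if sibling >= len(level):
            sibling = index  # odd node paired with itself
        path.append((level[sibling], index % 2 == 1))  # (hash, sibling_is_left)
        index //= 2
    return {"key": key, "value": store[key], "path": path}


def verify_kv(commitment: bytes, proof: dict) -> bool:
    acc = leaf(proof["key"], proof["value"])
    for sibling, sibling_is_left in proof["path"]:
        acc = node(sibling, acc) if sibling_is_left else node(acc, sibling)
    return acc == commitment


def check_fur(commitment: bytes, proof: dict) -> bool:
    # Hypothetical key name; the value check mirrors the pseudocode above.
    return (
        verify_kv(commitment, proof)
        and proof["key"] == "clothes"
        and proof["value"] == "fur"
    )
```

The proof carries one hash per tree level, so it grows with the logarithm of the number of entries instead of with the size of the metadata file, and no JSON parsing is needed.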

    Single TX Attempt 3: on-chain storage using a SNARK-based DB

    • The idea is that we delegate the smart contract checks (the query that checks if the ape has fur or not) to a SNARK circuit/Verifiable database
    // On-chain storage of queryable data
    // Check via SnarkDB (delegate the data and delegate the check, i.e. delegate the query)
    claimThankYouWithProof(apeId, proof) {
    	if (SnarkDBVerifier(FurCircuit, proof)) {
    		Emit("Thank you")
    	}
    }

    Multiple TX Attempt 4: on-chain storage using a Job Scheduler + any of the above solutions

    • User adds to the scheduler some data that they need to show to a smart contract
    • Someone executes the pending jobs and for each job calls the callback
    ThankYou {
    	claimThankYouWithScheduler(apeId) {
    		scheduler.Add({apeId, query: 'look for fur', callback})
    	}
    	callback(task) {
    		Emit("Thank you")
    	}
    }


    Scheduler {
    	taskList
    	Add(taskObj) {
    		taskList.add(taskObj)
    	}
    	ExecuteTasks(proof) {
    		verify(proof)  // abort if the proof does not cover the pending tasks
    		for each task in taskList: task.callback(task)
    		taskList.clear()  // executed tasks must not run again
    	}
    }
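
The two-transaction flow above can be simulated off-chain with a small Python sketch. Everything here is an assumption for illustration: the `verify` callable is a stand-in for whichever proof system from the earlier attempts is used, and the `events` list plays the role of emitted events.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Task:
    ape_id: int
    query: str
    callback: Callable[["Task"], None]


@dataclass
class Scheduler:
    # Stand-in verifier: receives the pending tasks and the submitted proof.
    verify: Callable[[list[Task], bytes], bool]
    tasks: list[Task] = field(default_factory=list)

    def add(self, task: Task) -> None:
        # Transaction 1: the user only registers the job.
        self.tasks.append(task)

    def execute_tasks(self, proof: bytes) -> int:
        # Transaction 2: an executor submits one proof for the pending batch.
        if not self.verify(self.tasks, proof):
            raise ValueError("invalid proof")
        pending, self.tasks = self.tasks, []
        for task in pending:
            task.callback(task)
        return len(pending)


class ThankYou:
    def __init__(self, scheduler: Scheduler) -> None:
        self.scheduler = scheduler
        self.events: list[tuple[str, int]] = []

    def claim_thank_you_with_scheduler(self, ape_id: int) -> None:
        self.scheduler.add(Task(ape_id, "look for fur", self._callback))

    def _callback(self, task: Task) -> None:
        self.events.append(("Thank you", task.ape_id))
```

Clearing the task list before running the callbacks keeps an executed job from being replayed by a later call.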

    What lives into IPFS and what lives into Filecoin?

    IPFS <> HTTP <> Email <> Protocol

    • Content-addressable network
    • Peer-discovery network
    • Content delivery network
    • Filesystem: files are chunked in a specific way, with folder structuring (or IPLD if it’s just data)
    • (Altruistic)

    Filecoin <> AWS <> Gmail <> CryptoNetwork

    • Publicly available database and service
    • Network of “trusted” users that we can delegate storage services to
    • Storage Products
      • Filecoin File System
      • Structured data storage (future)
      • Storage derivatives market (future)
      • On-chain storage (future)
        • Accessible from smart contracts
        • Verifiable queries/database
        • Data availability
        • Perpetual storage
    • Computational Tasks scheduler
      • Schedule a computational task using this data

    So at what level should we do the on-chain storage work?

    • The better authenticated data structures (DB, KV store) should be just an extension over IPLD (e.g. a new project around IPLD data (VeryDB))
    • However, the actual service for storage should be offered on-chain

    Does it have to be Filecoin?

    • Yes: the miners already have the data, so having everything integrated would be nice
    • However, if we wanted to, we could bootstrap a new network called DBCoin (or a smart contract on top of Filecoin or Ethereum)

    CryptoNet is a Protocol Labs initiative.