👷

Off-chain sealing market design sketch

Creator: Alex North
Created: Feb 9, 2022 12:34 PM
Project: Storage on-boarding
Stage: Still Valid

This is a sketch of the mechanics for an off-chain sealing market: an arrangement whereby storage providers (SPs) can outsource the computationally intensive work of sealing a sector to compute providers that have hardware but no on-chain presence. It does not require any Filecoin protocol changes, but does require a business agreement between the parties.

The idea came to me via @ and @ZX Zhang.

Background

Sealing a sector is a resource-intensive operation. It requires (a) significant CPU use for the pre-commit stage, (b) 14x the sector size in intermediate storage, and (c) significant GPU capacity to compute the PoRep SNARK. Sealing is thus a capital-intensive operation. Achieving a high sealing rate requires significant up-front investment, much greater than the capital cost of subsequently maintaining that storage.
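To make the intermediate storage figure concrete, here is a small worked calculation. It is only a sketch: the 14x multiplier is the one quoted above, while the 32 GiB sector size and the number of sectors sealed concurrently are assumptions chosen for illustration.

```python
GIB = 1024 ** 3
TIB = 1024 ** 4

# Multiplier quoted in the text: ~14x the sector size in intermediate storage.
INTERMEDIATE_MULTIPLIER = 14


def intermediate_storage_bytes(sector_size_bytes: int, sectors_in_flight: int) -> int:
    """Scratch space needed while `sectors_in_flight` sectors are sealed at once."""
    return INTERMEDIATE_MULTIPLIER * sector_size_bytes * sectors_in_flight


# Assumption for illustration: 32 GiB sectors, ten being sealed concurrently.
scratch = intermediate_storage_bytes(32 * GIB, sectors_in_flight=10)
print(f"{scratch / TIB:.3f} TiB of scratch space")  # 4.375 TiB of scratch space
```

The scratch space scales linearly with the number of sectors in flight, which is part of why a high sealing rate is expensive to provision.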

The investment required in sealing hardware is anecdotally a significant barrier to new or smaller-scale miners onboarding storage. One must have a fairly large total storage commitment in mind to justify investing in high sealing throughput.

The current storage onboarding rate (~25 PiB/day) is about half the maximum the network reached last year. Thus a significant amount of sealing infrastructure must exist that is either idle or being used for something else. Much of it is probably in China.

Goals

Increase network-wide CC sector onboarding rate
Reduce capital investment and operational complexity required for new/small storage providers to grow fast
Minimise change required to core protocols

Design ideas

In an off-chain market for sealing, a storage provider (SP) outsources the work of sealing a committed-capacity (no deals) sector to a compute provider (CP). The parties strike a business deal; blockchain contracts are not involved. The compute provider must be trusted to (incrementally) honour the deal, but is not trusted with any access or control over the storage provider’s operations.

During the process of sealing, a sector has an identity which is tied to the SP who will commit it. Thus a CP must respond to a specific sealing request by an SP.

Outline

SP makes request to CP to seal a sector, providing

the SP’s miner actor ID
the desired sector number (unique to the SP)

CP samples the Filecoin chain to obtain seal challenge randomness, which is an input to replication

(The SP could instead provide this to the CP, but it’s a time-sensitive value. Having the CP sample it allows flexibility in scheduling between multiple clients.)

CP performs the replication step (CPU-intensive)

This generates a large amount of intermediate data that must be retained until the sector is proven

CP transmits the replica’s seal challenge epoch and computed CommR to the SP

At any point from here onwards, the CP can also transmit the sealed sector itself to the SP. This is a large transfer which may take some time. The most conservative approach would be to transmit it in full prior to pre-commitment.
The SP should verify the CommR matches the sealed sector data to ensure they will be able to compute Window PoSt in the future.

The SP submits sector pre-commitment to the chain

The SP must stake a pre-commit deposit at this point, which will be lost if the sector is not subsequently proven.

The CP observes the pre-commitment on chain, and obtains the interactive randomness, which is an input to PoRep.

(Again, the SP could provide this to the CP)

CP samples the intermediate data and performs PoRep (GPU-intensive)
CP sends the proof to the SP, completing the request
SP submits the proof of replication to the chain
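The hand-offs in the outline can be sketched as a small state machine that tracks which party must act next. This is an illustration only: the stage names, the SealJob type and the example miner ID are hypothetical, and the sealed-sector transfer (which can happen at any point after replication) is not modelled.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Stage(Enum):
    REQUESTED = auto()     # SP has asked the CP to seal (miner ID + sector number)
    REPLICATED = auto()    # CP has sampled seal randomness and sent CommR
    PRECOMMITTED = auto()  # SP has pre-committed on chain (deposit staked)
    PROVEN = auto()        # CP has delivered the PoRep proof to the SP
    COMMITTED = auto()     # SP has submitted the proof to the chain


# For each stage: the party that must act next, and the stage that follows.
NEXT = {
    Stage.REQUESTED: ("CP", Stage.REPLICATED),
    Stage.REPLICATED: ("SP", Stage.PRECOMMITTED),
    Stage.PRECOMMITTED: ("CP", Stage.PROVEN),
    Stage.PROVEN: ("SP", Stage.COMMITTED),
}


@dataclass
class SealJob:
    miner_id: str        # the SP's miner actor ID
    sector_number: int   # unique to the SP
    stage: Stage = Stage.REQUESTED

    def advance(self, actor: str) -> Stage:
        """Move to the next stage, rejecting steps taken by the wrong party."""
        if self.stage not in NEXT:
            raise ValueError("job already complete")
        expected_actor, next_stage = NEXT[self.stage]
        if actor != expected_actor:
            raise ValueError(
                f"{self.stage.name}: waiting on {expected_actor}, not {actor}"
            )
        self.stage = next_stage
        return self.stage

    def deposit_at_risk(self) -> bool:
        """True once the SP has pre-committed and until the proof is on chain."""
        return self.stage in (Stage.PRECOMMITTED, Stage.PROVEN)


# Hypothetical miner ID and sector number, for illustration only.
job = SealJob(miner_id="f01234", sector_number=7)
for actor in ("CP", "SP", "CP", "SP"):
    job.advance(actor)
print(job.stage.name)  # COMMITTED
```

Responsibility alternates strictly between the two parties, and the only window in which the SP has a deposit at risk is between pre-commitment and the proof landing on chain.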

Chain access

The basic flow above requires the compute provider to read the blockchain in order to sample the sealing and interactive randomness values. This allows the CP more scheduling flexibility and reduces the number of interactions necessary between SP and CP.

If maintaining a Lotus node for chain access is too burdensome for a compute provider, they could instead receive the values from the SP.
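The two options for obtaining randomness could sit behind a single lookup on the CP side. The sketch below is hypothetical throughout: resolve_randomness and its parameters are illustrative names, and the stand-in chain lookup does not call any real Lotus API.

```python
from typing import Callable, Optional

# A function returning the randomness bytes for a given chain epoch.
RandomnessLookup = Callable[[int], bytes]


def resolve_randomness(
    epoch: int,
    chain_lookup: Optional[RandomnessLookup] = None,
    sp_supplied: Optional[bytes] = None,
) -> bytes:
    """Prefer the CP's own chain access; fall back to a value sent by the SP."""
    if chain_lookup is not None:
        return chain_lookup(epoch)
    if sp_supplied is not None:
        return sp_supplied
    raise ValueError("no randomness source: need chain access or an SP-supplied value")


# Stand-in for a real chain query (placeholder bytes, not real randomness).
def fake_chain(epoch: int) -> bytes:
    return epoch.to_bytes(32, "big")


with_node = resolve_randomness(1000, chain_lookup=fake_chain)
without_node = resolve_randomness(1000, sp_supplied=b"\x01" * 32)
print(len(with_node), len(without_node))  # 32 32
```

A CP without a node depends on the SP sending the value promptly, since it is time-sensitive. That is the scheduling flexibility given up in exchange for not running chain infrastructure.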

Aggregation

This simple flow doesn’t address aggregation of proofs.

If the CP has sufficient throughput and bandwidth to the SP, the SP can batch pre-commits at will without needing anything else from the CP.

However, batching PoRep does require the CP to aggregate the proofs of multiple replications into a single proof. How much aggregation occurs affects the cost to the SP of submitting the proofs to the chain. Thus, aggregation might require additional orchestration between parties.
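As one illustration of the orchestration involved, the parties would need to agree which sectors go into each aggregate. The sketch below only plans the grouping. The function name and batch_size parameter are hypothetical, and it does not model the protocol's limits on aggregate size or the computation of the aggregate proof itself.

```python
from typing import List


def plan_aggregates(sector_numbers: List[int], batch_size: int) -> List[List[int]]:
    """Group sectors awaiting proof into batches of at most `batch_size`."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    ordered = sorted(sector_numbers)
    return [ordered[i:i + batch_size] for i in range(0, len(ordered), batch_size)]


print(plan_aggregates([12, 10, 11, 14, 13], batch_size=2))
# [[10, 11], [12, 13], [14]]
```

Larger batches mean fewer on-chain messages for the SP, but the CP must hold intermediate data for every sector in a batch until the whole batch is proven.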

Reducing pre-commit deposit

[This was wrong] While the pre-commit deposit for sectors is not too large at small scales, it can be reduced to about 20% of current value with a small protocol change. This would reduce the financial cost to the SP if the CP does not produce the proof of replication on time. ~~@Alex North~~ ~~wants to make this change to pre-commit deposit anyway, for other reasons.~~

Incremental trust

This flow requires the SP to trust the CP to follow through with the proof of replication after the SP has pre-committed the sector. If the CP does not follow through, the SP will lose their pre-commit deposit.

This trust is needed on a per-sector basis. Thus, it can be built up over an ongoing relationship of successful exchanges. The deposit at risk for 1 TiB of sectors is only about $100 (at FIL = $20).
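The per-sector exposure follows from the figures above. In this rough calculation, the $100 per TiB and the $20 FIL price come from the text. The 32 GiB sector size and the idea of an SP capping its exposure in dollars are assumptions added for illustration.

```python
USD_AT_RISK_PER_TIB = 100.0  # figure from the text
FIL_PRICE_USD = 20.0         # FIL price assumed in the text
SECTOR_SIZE_GIB = 32         # assumption: 32 GiB sectors

fil_per_tib = USD_AT_RISK_PER_TIB / FIL_PRICE_USD       # deposit in FIL per TiB
sectors_per_tib = 1024 // SECTOR_SIZE_GIB                # sectors in one TiB
usd_per_sector = USD_AT_RISK_PER_TIB / sectors_per_tib   # deposit in USD per sector


def max_sectors_in_flight(exposure_limit_usd: float) -> int:
    """Sectors an SP can have pre-committed but unproven within a USD limit."""
    return int(exposure_limit_usd // usd_per_sector)


print(fil_per_tib, usd_per_sector, max_sectors_in_flight(50.0))  # 5.0 3.125 16
```

An SP could start with a low exposure limit for a new CP and raise it as sectors are successfully proven.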

Business agreement

While this proposal does not use smart contracts to enforce commitments, nothing prevents the entire thing being automated in principle. Compare to a video transcoding service provided via the web.

The compute provider’s terms and pricing may be standard
Payment may be via credit card, pre-paid credits, or billing after the fact
Sealing requests may be submitted via API, and job status and results similarly available remotely
Sealed sectors could even be staged in AWS for retrieval by the SP

In practice, of course, the system might be entirely manual to start with. Success of the CP would then motivate investment in automation.

Challenges

For sealing hardware currently resident in China, it is probably quite difficult to get enough reliable network bandwidth out of China to transfer sealed sectors to SPs elsewhere.