🏞️ Background: Fast Retrieval via Extra Copy
One key feature Filecoin should provide is fast retrieval. We have analyzed fast retrieval before, and some considerations can be found here: Product Problem: Fast Retrieval
In essence, given the way our sealing process is designed today, the easiest way to ensure such a feature is to ask storage providers (SPs) to keep an unsealed copy of the files.
⚠️ Problem: How do we Know Fast Retrieval is Actually Possible?
One key question is how to assure clients that, once they agree on fast retrieval with an SP, the feature is actually available.
Put another way: what prevents a malicious storage provider from claiming to offer fast retrieval via an extra copy without actually having access to one? Today, nothing does: this malicious strategy is possible because nobody checks that an SP committing to fast retrieval actually has access to a clear copy of the files.
📢 Proposal: Periodic Check of Clear Data Access
As proposed in PDP: Checking Clear Data in Filecoin, we suggest periodically checking possession of the extra copy for classes of files for which fast retrieval is a key feature. This would include:
- Fil+ deals
- Other deals for which clients want fast retrieval guarantees
We envision a check similar to WindowPoSt, which would prove, at some regular frequency, that the SP has access to a clear copy of the file.
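A check of this kind could follow a challenge-response pattern. The sketch below shows one possible shape, assuming the verifier keeps only a Merkle root of the file and the SP answers a random chunk challenge with the chunk and its Merkle path. The chunk size, the function names and the use of SHA-256 are illustrative assumptions, not the construction specified in the PDP proposal.

```python
import hashlib
import secrets

CHUNK_SIZE = 32  # hypothetical chunk size in bytes, kept small for illustration


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def split_chunks(data: bytes, size: int = CHUNK_SIZE) -> list[bytes]:
    """Split the clear file into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)] or [b""]


def build_tree(chunks: list[bytes]) -> list[list[bytes]]:
    """Return every tree level, leaves first. An odd level pairs its last node with itself."""
    level = [_h(b"\x00" + c) for c in chunks]  # 0x00 prefix marks leaf hashes
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]
        level = [_h(b"\x01" + level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def merkle_root(chunks: list[bytes]) -> bytes:
    return build_tree(chunks)[-1][0]


def prove(chunks: list[bytes], index: int) -> tuple[bytes, list[bytes]]:
    """SP side: return the challenged chunk and the sibling hashes up to the root."""
    path = []
    idx = index
    for level in build_tree(chunks)[:-1]:
        sib = idx ^ 1
        path.append(level[sib] if sib < len(level) else level[idx])
        idx //= 2
    return chunks[index], path


def verify(root: bytes, index: int, chunk: bytes, path: list[bytes]) -> bool:
    """Verifier side: recompute the root from the chunk and its path."""
    node = _h(b"\x00" + chunk)
    idx = index
    for sibling in path:
        node = _h(b"\x01" + node + sibling) if idx % 2 == 0 else _h(b"\x01" + sibling + node)
        idx //= 2
    return node == root


if __name__ == "__main__":
    file_chunks = split_chunks(bytes(range(256)) * 4)
    root = merkle_root(file_chunks)              # stored by the verifier at deal time
    challenge = secrets.randbelow(len(file_chunks))  # stand-in for on-chain randomness
    chunk, path = prove(file_chunks, challenge)  # SP needs the clear data to answer
    print("challenge", challenge, "accepted:", verify(root, challenge, chunk, path))
```

In this sketch the SP can only answer if it can read the challenged chunk at challenge time, which is the "access" property the check is meant to establish. A real deployment would draw the challenge from a public randomness source rather than from the verifier's local generator.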
🎉 Why Should we Move Forward?
If we want to provide fast retrieval functionality, we must put something in place that improves on the status quo.
Indeed, we currently have no mechanism intended to ensure (or even encourage) fast retrieval.
We are convinced that extra copy checking, in combination with other products that are shipping soon (such as Retriev.org), represents a solid first step in this direction.
Moreover, we think that the engineering effort needed to ship a project like this would not be very large: with engineering support, we could ship it relatively quickly (the logic is not different from that of WindowPoSt, and we have already been discussing it with different teams in the org).
⁉️ Addressing Questions and Concerns:
In this section we try to address some (legitimate) questions and concerns about this idea.
- Question 1: What does it mean to check “access” to an extra copy of a file? Why is this enough?
- Question 2: Why do we still need to go through sealing, if we introduce such a check on clear data?
- Question 3: I'm not sure this is the right way to go. Why can't we build a different protocol for ensuring fast retrieval? For instance, why can't we build a reputation system for SPs?
- Question 4: I don't think this solution works. An SP can always answer the checks on the extra copy and then not give the file back to the client.
- Question 5: Why are you envisioning extra copy checks only for some classes of sectors?
Checking “access” means obtaining a proof of data possession that an SP provides for a particular file. This means that, at challenge time, the SP was able to access the data and answer some challenges on it. We do not have a guarantee that this copy of the file is unique or incompressible, but we do have a (probabilistic) guarantee that the file could be accessed (and thus retrieved). This is enough to ensure that fast retrieval is possible.
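To make the "probabilistic" part concrete, here is a small back-of-the-envelope sketch. It assumes that each challenge picks a chunk uniformly and independently at random, and that an SP cannot answer a challenge on a chunk it cannot access. Under those assumptions, an SP missing a fraction f of the chunks is caught by k challenges with probability 1 - (1 - f)^k. The function names and the example numbers are illustrative, not protocol parameters.

```python
import math


def detection_probability(missing_fraction: float, challenges: int) -> float:
    """Probability that at least one of `challenges` random challenges hits a missing chunk."""
    if not 0.0 <= missing_fraction <= 1.0:
        raise ValueError("missing_fraction must be in [0, 1]")
    return 1.0 - (1.0 - missing_fraction) ** challenges


def challenges_needed(missing_fraction: float, target: float) -> int:
    """Smallest number of challenges that reaches the target detection probability."""
    if not 0.0 < missing_fraction < 1.0 or not 0.0 < target < 1.0:
        raise ValueError("missing_fraction and target must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - missing_fraction))


if __name__ == "__main__":
    # An SP that cannot access 1% of a file, checked with a 99% detection target.
    print(challenges_needed(0.01, 0.99))  # prints 459
```

The same reasoning applies across repeated checks: challenges accumulate over time, so even a small number of challenges per period makes a persistently missing copy increasingly hard to hide.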
Sealing is still needed because this check says nothing about space hardness. Indeed, files in the clear can be compressible and can even be shared among different SPs. The only thing we check here is the ability of a particular SP to access a particular file. Thus, such a check cannot be used for storage-based consensus purposes.
We know that the solution we propose does not solve the problem once and for all. Nevertheless, we think that it represents a significant step forward. Moreover, before starting this project we tried to build a decentralized reputation system for SPs with respect to retrievability, but in the end we had to put the project on pause for lack of the basic features one would need to make it happen (see ).
This is an important point. It is true that an SP can always promise fast retrieval and then not give the file back when asked. However, this would also be true with other tools in place, such as a reputation system. Moreover, with the launch of Retriev.org, it will be possible to retrieve files even in this setting. On top of that, having to answer periodic challenges on extra copy access will act as a deterrent. Finally, the combination of Retrieval Pinning and extra copy checks can be used to build a reputation system for SPs, so that clients can pick the one they prefer.
First, we do not need extra copy checks for CC sectors: they do not contain deals and are initialized with 0s, so checks on those sectors would be meaningless. Second, some deals do not need fast retrieval (for example, deals on files in “cold storage”). Moreover, making the feature “opt-in” makes things much more flexible.
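The selection rule described in this proposal can be summarized in a few lines. The sketch below is only a model of the policy: the `Deal` and `Sector` types and their field names are hypothetical and do not correspond to actual Filecoin actor state.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Deal:
    # Hypothetical flags, named for illustration only.
    is_fil_plus: bool = False
    fast_retrieval_opt_in: bool = False


@dataclass(frozen=True)
class Sector:
    deals: tuple[Deal, ...] = ()

    @property
    def is_committed_capacity(self) -> bool:
        # A CC sector holds no deals, so there is no clear data to check.
        return len(self.deals) == 0


def deal_needs_check(deal: Deal) -> bool:
    """Fil+ deals are always checked; other deals only when the client opted in."""
    return deal.is_fil_plus or deal.fast_retrieval_opt_in


def deals_to_check(sector: Sector) -> list[Deal]:
    """Deals in a sector that would be subject to the extra copy check."""
    return [d for d in sector.deals if deal_needs_check(d)]


if __name__ == "__main__":
    sector = Sector(deals=(Deal(is_fil_plus=True), Deal(), Deal(fast_retrieval_opt_in=True)))
    print(len(deals_to_check(sector)))    # prints 2
    print(len(deals_to_check(Sector())))  # prints 0 (CC sector)
```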