Now in FIP discussion: https://github.com/filecoin-project/FIPs/discussions/382. Please contribute there.
This is a proposal for a calling convention for user-programmed native WASM actors on the Filecoin VM.
A calling convention enables syntactic composability between actors: the general ability of components of a system to be recombined into larger structures and for the output of one to be the input of another (Aragon: What is Composability). We need such a convention for actors programmed by different teams to interoperate. A convention is also necessary for the adoption of standards such as token and wallet APIs to provide morphological composability—application-level interoperability. A calling convention can also provide a helpful abstraction for actor developers, ideally a familiar one. A convention that is widely adopted provides a base for a stable development environment and justifies investments in tooling.
Many ideas here are thanks to collaboration with @Kuba Sztandera.
The FVM dispatches messages to actors, both top-level invocations rooted in blockchain messages, and internal invocations between actors. The essential fields comprising a message include:
- A method number (CBOR integer, variable length up to 8 bytes plus sign, u64 in WASM)
- Method parameters (byte array expected to be IPLD, for now always CBOR)
- Receiver address
- Token quantity transferred from sender to receiver
The VM implements internal dispatch, which means each receiving actor has a single entry point through which all messages arrive. It is expected (but not necessary) that an actor then uses information from each message to dispatch to some method implementing the requested functionality as a simple function call.
The built-in actors dispatch based on a message’s method number, and expect the parameters to decode into a fixed schema corresponding to each method. Methods are numbered sequentially, partly in order to enjoy a dense representation for top-level chain messages (method numbers <24 cost zero additional bytes).
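The "zero additional bytes" figure follows from how CBOR encodes unsigned integers: values below 24 are packed into the initial byte, and larger values are followed by 1, 2, 4 or 8 bytes. A minimal sketch of that size rule (the function name is illustrative, not part of any FVM API):

```python
def cbor_uint_extra_bytes(n: int) -> int:
    """Bytes that follow the initial CBOR byte when encoding unsigned integer n."""
    if n < 0 or n >= 1 << 64:
        raise ValueError("CBOR unsigned integers must fit in 64 bits")
    if n < 24:
        return 0  # the value is packed into the initial byte itself
    if n < 1 << 8:
        return 1
    if n < 1 << 16:
        return 2
    if n < 1 << 32:
        return 4
    return 8


for n in (0, 23, 24, 255, 256, 65_536, 2**32 - 1, 2**32):
    print(n, cbor_uint_extra_bytes(n))
```

Under this rule, nearly every 32-bit hashed method number (any value of 65,536 or more) costs four additional bytes.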
The FVM does not further constrain dispatch, and the development team wish for the FVM to remain neutral with respect to such conventions.
This proposal uses method numbers, but provides a deterministic mapping from method names to those numbers.
- Provide a familiar dispatch-by-name abstraction to programmers
- Compact, uniform-size on-chain encoding
- No change to blockchain message schema
- Permit specification of standard actor interfaces, including retro-actively in recognition of widely-adopted patterns
- Permit extending the interface of an extant actor to implement methods defined in a new standard
- Independent of programming language
This proposal assumes that method names are defined statically, i.e. that all actors will have the interface definitions of any other actors they call available at compile-time. However, it can support dynamic resolution for future reflection capabilities.
Method number computation
The method number for a symbolic method name is calculated as the first four bytes of hash(salt + methodname), interpreted as an unsigned 32-bit integer.
Zero is an invalid method number while Filecoin and the FVM continue to treat it as a special-case bare send of the native Filecoin token.
salt is chosen so that hash(salt + "Constructor") == 1, the method number currently reserved for construction by the built-in actors.
The hash function is blake2b, which is already implemented in the VM and available as a syscall. This permits easy dynamic method number calculation, though build tools will typically compute a method number statically at compile time.
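A minimal sketch of the computation follows. Several details are assumptions, because the text above does not fix them: the salt value is a placeholder (the real salt would have to be searched for so that "Constructor" maps to 1), the digest is the default 64-byte blake2b output, the name is ASCII-encoded and appended to the salt bytes, and the four bytes are read big-endian.

```python
import hashlib

# Placeholder only: this value does NOT make "Constructor" hash to 1.
# Finding such a salt requires a search that is not attempted here.
SALT = b"example-salt"


def method_number(name: str, salt: bytes = SALT) -> int:
    """First four bytes of blake2b(salt + name) as an unsigned 32-bit integer.

    Assumptions: default 64-byte digest, ASCII name, big-endian byte order.
    """
    digest = hashlib.blake2b(salt + name.encode("ascii")).digest()
    return int.from_bytes(digest[:4], "big")


number = method_number("Transfer")
# Zero is reserved for a bare send, so tooling would need to reject a name
# that happens to hash to it.
print(number, number != 0)
```

Because the function is deterministic, build tools can evaluate it once at compile time and embed the resulting constant in the actor.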
A method is exported when a method number is computed for it and the actor will recognise this number for internal dispatch. Conventions on exported method names are independent of any programming language conventions. Note that these conventions only apply to methods exported to the VM for inter-actor dispatch. Actor developers can continue to use relevant programming language conventions for simple internal function calls.
These conventions encode the loose guidelines set by the built-in actors.
Exported method names should:
- Use only the ASCII characters in [a-zA-Z0-9_] (the same set as the C programming language). Other characters, including Unicode beyond this set, are excluded in order to reduce the opportunity for misleading spelling of names in user interfaces.
- Have an initial capital letter, and use CamelCase to identify word boundaries.
- Capitalize all letters in acronyms.
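The mechanically checkable part of these conventions (character set and initial capital) can be sketched as a small validator. The function name is illustrative. The CamelCase word-boundary and acronym rules depend on meaning, so this sketch does not attempt to check them.

```python
import re

# An initial capital letter followed by any number of [a-zA-Z0-9_] characters.
_EXPORTED_NAME = re.compile(r"[A-Z][a-zA-Z0-9_]*")


def is_valid_exported_name(name: str) -> bool:
    """Check the character-set and initial-capital rules for an exported name."""
    # fullmatch ensures the whole string conforms, including no trailing newline.
    return _EXPORTED_NAME.fullmatch(name) is not None


for candidate in ("Constructor", "TransferFrom", "transfer", "Tr\u00e4nsfer"):
    print(candidate, is_valid_exported_name(candidate))
```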
Two different method names will collide on their method number with probability 1/(2^32). In the rare case of collision within a single actor, build tooling should identify the collision and prompt one of the methods to be renamed.
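A build-time check for collisions within one actor could look like the sketch below. It groups names by method number and reports any number claimed by more than one name. The hashing helper repeats the proposal's formula under assumed details (placeholder salt, default 64-byte blake2b digest, big-endian); all names here are illustrative.

```python
import hashlib
from collections import defaultdict


def method_number(name: str, salt: bytes = b"example-salt") -> int:
    # Assumed details: placeholder salt, default digest size, big-endian.
    digest = hashlib.blake2b(salt + name.encode("ascii")).digest()
    return int.from_bytes(digest[:4], "big")


def find_collisions(names, number_of=method_number):
    """Map each contested method number to the distinct names that share it."""
    by_number = defaultdict(list)
    for name in dict.fromkeys(names):  # drop repeated names, keep order
        by_number[number_of(name)].append(name)
    return {num: group for num, group in by_number.items() if len(group) > 1}


# A deliberately weak numbering function (name length) forces a collision
# so the report format is visible.
print(find_collisions(["Mint", "Burn", "Transfer"], number_of=len))
```

A build tool would fail the build when the returned mapping is non-empty and ask the developer to rename one of the listed methods.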
A harder-to-resolve collision will occur in case of a collision between names in two standard interfaces to be implemented by a single actor. The probability of such a collision grows with the product of the number of methods in the two interfaces. Two ten-method interfaces will find a collision with probability approximately (10/2^32)*10 ~= 1/43,000,000. This is judged to be rare enough that developer tooling will be able to detect collision with widely used standard names before significant adoption makes renaming one a significant burden.
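The estimate multiplies the sizes of the two interfaces and divides by the size of the 32-bit number space. A short sketch of that arithmetic, assuming hash outputs are uniformly distributed and using the simple first-order approximation rather than an exact birthday calculation:

```python
NUMBER_SPACE = 2**32


def cross_collision_probability(m: int, n: int) -> float:
    """Approximate chance that any of m names collides with any of n names.

    Counts the m * n cross-interface pairs, each colliding with
    probability 1 / 2^32 (first-order approximation).
    """
    return m * n / NUMBER_SPACE


p = cross_collision_probability(10, 10)
print(f"about 1 in {round(1 / p):,}")
```

For two ten-method interfaces the expression evaluates to roughly 1 in 43 million.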
Compatibility with built-in actors
This proposal is compatible with the existing built-in actors. All they need to do is compute the prescribed method number corresponding to each existing method and add these to their existing dispatch tables. Calls to the old, sequential method numbers can continue to be supported indefinitely.
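A dispatch table that accepts both numbering schemes might be sketched as follows. The handlers, the method name "WithdrawBalance" and the legacy numbers are hypothetical stand-ins, and the hashing helper uses assumed details (placeholder salt, default 64-byte blake2b digest, big-endian).

```python
import hashlib

SALT = b"example-salt"  # placeholder; not the salt that maps "Constructor" to 1


def method_number(name: str) -> int:
    digest = hashlib.blake2b(SALT + name.encode("ascii")).digest()
    return int.from_bytes(digest[:4], "big")


def constructor(params: bytes) -> str:
    return "constructed"


def withdraw_balance(params: bytes) -> str:
    return "withdrawn"


# Hypothetical existing table keyed by sequential method numbers.
LEGACY_METHODS = {1: constructor, 2: withdraw_balance}
# The same handlers keyed by exported name.
NAMED_METHODS = {"Constructor": constructor, "WithdrawBalance": withdraw_balance}


def build_dispatch_table() -> dict:
    """Extend the legacy table with the hashed number of each named method."""
    table = dict(LEGACY_METHODS)
    for name, handler in NAMED_METHODS.items():
        number = method_number(name)
        # A hashed number may coincide with a legacy one only for the same handler.
        if table.setdefault(number, handler) is not handler:
            raise ValueError(f"{name} collides with existing method number {number}")
    return table


DISPATCH = build_dispatch_table()


def invoke(method_num: int, params: bytes) -> str:
    """Single entry point: look up the handler for a method number and call it."""
    handler = DISPATCH.get(method_num)
    if handler is None:
        raise ValueError(f"unknown method number {method_num}")
    return handler(params)


# Both the old sequential number and the hashed number reach the same handler.
print(invoke(2, b""), invoke(method_number("WithdrawBalance"), b""))
```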
Compatibility with Ethereum
This scheme is inspired by the Solidity ABI for the Ethereum VM. It differs in the choice of hash function and exclusion of method parameter types (no overloading), so is not transparently compatible.
Note that removing both those differences would not automatically make this scheme compatible with the Solidity ABI, since method parameter types have fundamentally different schemas between the two environments. A future FVM/EVM embedding will need to translate method selectors along with the rest of the EVM ABI.
While the FVM may remain independent of calling conventions, strong network effects are likely to result in a single convention dominating, with perhaps a few others for niche use cases. It is thus worthwhile establishing a good standard now.
We could overload method names by including a description of their parameter type in the hash payload. This is rejected for simplicity. There is little evidence that method overloading has been of great utility in, e.g., Solidity. Since payloads are expected to be IPLD structures, overloading would require defining a standard serialization of an arbitrary IPLD type schema.
Excluding the parameter type from the method names means that methods may receive dynamically typed parameter payloads. This permits a kind of overloading if some part of the payload is used for dispatch in addition to the method number. A future convention could define a standard for this.
In the case of defining contract API standards such as tokens, we could prefix the interface/standard name to the method name, thus preventing the collision of short/common names between different interfaces. E.g. hash("FRC20:" + methodname). This is rejected because it would make it impossible to retroactively declare some widely used method as such a standard without making all the existing implementations non-compliant with the standard they have created.
Where authors are designing a standard ahead of its wide adoption, they are encouraged to identify interface method names with such a prefix in any case, E.g.
Removing method numbers
There has previously been some enthusiasm for removing the method number from Filecoin, since it’s external to the VM and ABI. As this would require a change to the blockchain message structure, this is a difficult and hence unlikely change. This proposal instead exploits it, and for its original purpose.
An alternative that could promote the eventual removal of method numbers is to define a standard envelope IPLD structure for message parameters, embedding the method number there. Such an envelope would be a 2-item array, with the first item being the method number as an integer and the second item being a byte array encoding the resolved method’s parameters. Actors could extract the method number before decoding the remaining parameters, in order to determine the appropriate schema.
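To make the envelope concrete, the sketch below hand-encodes the 2-item array in CBOR and decodes it again, extracting the method number before touching the parameters. It is an illustration only: the function names are hypothetical, and a real actor would use a CBOR/IPLD library rather than this minimal codec, which handles only the three item types the envelope needs.

```python
def _cbor_head(major: int, value: int) -> bytes:
    """Encode a CBOR head: 3-bit major type plus an unsigned argument."""
    if value < 0:
        raise ValueError("argument must be non-negative")
    if value < 24:
        return bytes([major << 5 | value])
    for info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if value < 1 << (8 * size):
            return bytes([major << 5 | info]) + value.to_bytes(size, "big")
    raise ValueError("argument does not fit in 64 bits")


def _read_head(data: bytes, pos: int) -> tuple[int, int, int]:
    """Return (major type, argument, position after the head)."""
    if pos >= len(data):
        raise ValueError("truncated envelope")
    major, info = data[pos] >> 5, data[pos] & 0x1F
    if info < 24:
        return major, info, pos + 1
    if info > 27:
        raise ValueError("unsupported CBOR additional info")
    end = pos + 1 + (1 << (info - 24))
    if end > len(data):
        raise ValueError("truncated envelope")
    return major, int.from_bytes(data[pos + 1:end], "big"), end


def encode_envelope(method_num: int, params: bytes) -> bytes:
    """Encode [method_num, params] as a CBOR array of (uint, byte string)."""
    return (_cbor_head(4, 2) + _cbor_head(0, method_num)
            + _cbor_head(2, len(params)) + params)


def decode_envelope(data: bytes) -> tuple[int, bytes]:
    """Extract the method number, then the still-encoded parameter bytes."""
    major, count, pos = _read_head(data, 0)
    if (major, count) != (4, 2):
        raise ValueError("envelope must be a 2-item array")
    major, method_num, pos = _read_head(data, pos)
    if major != 0:
        raise ValueError("method number must be an unsigned integer")
    major, length, pos = _read_head(data, pos)
    if major != 2 or pos + length != len(data):
        raise ValueError("parameters must be a single trailing byte string")
    return method_num, data[pos:]


envelope = encode_envelope(1, b"")
print(envelope.hex(), decode_envelope(envelope))
```

The parameter bytes are returned undecoded, so the actor can choose the schema for them after it has resolved the method number.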
The Solidity ABI spec: https://docs.soliditylang.org/en/v0.8.13/abi-spec.html