Original author: Vitalik Buterin
Original translation: Luke, Mars Finance
This article presents a radical idea for the future of the Ethereum execution layer, one as ambitious as the beam chain effort is for the consensus layer. It aims to greatly improve the efficiency of the execution layer, addressing one of the major scaling bottlenecks, while also greatly simplifying it. In fact, this may be the only way to achieve that simplification.
Core idea: Replace EVM with RISC-V as the virtual machine language for writing smart contracts.
Important clarification:
Accounts, cross-contract calls, storage, and other abstractions remain unchanged. These abstractions work well, and developers are used to them. Opcodes such as SLOAD, SSTORE, BALANCE, and CALL will become RISC-V syscalls.
In this world, smart contracts can be written in Rust, but I expect that most developers will continue to write in Solidity (or Vyper), which will add RISC-V as a backend. This is because smart contracts written in Rust tend to be hard to read, while Solidity and Vyper are more readable. The developer experience (devex) may change very little, and developers may hardly notice the change at all.
Old-style EVM contracts will continue to work and will be fully interoperable in both directions with new RISC-V contracts. There are several ways to implement this, which I discuss later.
A precedent for this is the Nervos CKB VM, which is essentially RISC-V.
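To make the first clarification concrete, here is a minimal Python sketch of how a host might serve storage and balance opcodes as syscalls. The syscall numbers, the HostState class, and the handle_syscall function are all assumptions made for illustration; the proposal does not define a syscall ABI.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Syscall(IntEnum):
    # Hypothetical numbering; the proposal does not assign syscall IDs.
    SLOAD = 1
    SSTORE = 2
    BALANCE = 3


@dataclass
class HostState:
    # storage maps (contract address, slot) -> value
    storage: dict = field(default_factory=dict)
    # balances maps address -> balance in wei
    balances: dict = field(default_factory=dict)


def handle_syscall(state: HostState, caller: str, number: int, args: tuple) -> int:
    """Serve one syscall issued by the contract at address `caller`."""
    call = Syscall(number)  # raises ValueError for an unknown syscall number
    if call is Syscall.SLOAD:
        (slot,) = args
        return state.storage.get((caller, slot), 0)
    if call is Syscall.SSTORE:
        slot, value = args
        state.storage[(caller, slot)] = value
        return 0
    # Only BALANCE remains.
    (address,) = args
    return state.balances.get(address, 0)
```

In this sketch a contract only ever reads and writes its own storage, because the host keys storage by the caller's address, which mirrors how SLOAD and SSTORE behave today.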
Why do this?
In the short term, the main bottlenecks of Ethereum L1 scalability will be addressed by the upcoming EIPs, such as block-level access lists, delayed execution, distributed history storage, and EIP-4444. In the medium term, we will tackle further issues through statelessness and ZK-EVMs. In the long term, the main limiting factors for Ethereum L1 scaling will be:
1. The stability of data availability sampling and history storage protocols.
2. The desire to keep block production a competitive market.
3. The proving capability of ZK-EVMs.
I will argue that replacing ZK-EVM with RISC-V can address the key bottlenecks in (2) and (3).
The following is a table of the number of cycles required for Succinct ZK-EVM to prove different parts of the EVM execution layer:
Four parts take up a significant share of the time: deserialize_inputs, initialize_witness_db, state_root_computation, and block_execution.
initialize_witness_db and state_root_computation are both related to the state tree.
deserialize_inputs refers to the process of converting block and witness data into an internal representation. Therefore, in reality, over 50% of the time spent is proportional to witness sizes.
By replacing the current keccak 16-ary Merkle Patricia tree with a binary tree that uses a prover-friendly hash function, these components can be heavily optimized. With Poseidon, we can prove 2 million hashes per second on a laptop (compared with about 15,000 hashes per second for keccak). There are many options besides Poseidon as well. Overall, these components have considerable room for optimization. As a bonus, we can eliminate accrue_logs_bloom by removing the bloom filter.
This leaves us with block_execution, which currently accounts for about half of the prover cycles. If we want to increase overall prover efficiency by 100 times, we must improve the EVM’s prover efficiency by at least 50 times. One option is to create a more efficient EVM implementation to reduce prover cycles. Another is to note that current ZK-EVM provers already work by proving EVM implementations compiled to RISC-V, so we can give smart contract developers direct access to that RISC-V VM.
Some data indicate that under specific circumstances, this could lead to an efficiency improvement of over 100 times:
In practice, I expect the remaining prover time to be dominated by what are today’s precompiles. If we make RISC-V the main VM, the gas schedule will reflect proving time, creating economic pressure for developers to stop using the more expensive precompiles. Even so, the gains will likely not be quite as dramatic as these numbers suggest, but there is good reason to believe they will be very significant.
(Incidentally, the roughly 50/50 split between the EVM and “other parts” also shows up in regular EVM execution, and we intuitively expect that removing the EVM as a “middleman” should yield similarly large gains.)
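The 50x figure for block_execution quoted earlier in this section follows from simple arithmetic of the kind used in Amdahl's law. The sketch below assumes block_execution is exactly half of today's prover cycles (the text says only "about half"); the function names are illustrative.

```python
def overall_speedup(exec_fraction: float, exec_speedup: float, other_speedup: float) -> float:
    """Overall prover speedup when execution and everything else are sped up separately."""
    remaining_cost = exec_fraction / exec_speedup + (1 - exec_fraction) / other_speedup
    return 1 / remaining_cost


def min_exec_speedup(exec_fraction: float, target_overall: float) -> float:
    """Lower bound on the execution speedup needed to hit the target.

    The bound is reached only in the limit where all other components cost nothing.
    """
    return exec_fraction * target_overall


EXEC_FRACTION = 0.5  # assumption: block_execution is exactly half of prover cycles
```

Under this assumption, min_exec_speedup(EXEC_FRACTION, 100) gives 50.0. Conversely, overall_speedup(EXEC_FRACTION, 1, float("inf")) gives 2.0: if execution is left untouched, the overall gain is capped at 2x no matter how fast the state-tree and deserialization components become.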
Implementation details
There are several ways to implement such a proposal. Here are a few possible ways:
1. The least disruptive approach: support two VMs and allow contracts to be written in either. Both types of contract have access to the same facilities: persistent storage (SLOAD and SSTORE), holding an ETH balance, making and receiving calls, and so on. EVM and RISC-V contracts can freely call each other; from the RISC-V side, calling an EVM contract is a syscall with special parameters, and the EVM contract receiving the message interprets it as a CALL.
2. A more radical approach: convert existing EVM contracts into contracts that call an EVM interpreter contract, written in RISC-V, to run their existing EVM code. That is, if an EVM contract has code C and the EVM interpreter lives at address X, the contract is replaced with top-level logic that, when called from outside with call parameters D, calls X with (C, D), then waits for the return value and forwards it. If the EVM interpreter itself calls back into the contract, asking it to run a CALL or SLOAD/SSTORE, the contract does so.
3. An intermediate approach: adopt the second approach, but create an explicit protocol feature for it: essentially, enshrine the concept of a “virtual machine interpreter” in the protocol and require its logic to be written in RISC-V. The EVM would be the first interpreter, but there could be others (for example, Move could be a candidate).
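The wrapper logic of the second approach can be sketched in a few lines of Python. This is only an illustration of the call-and-callback pattern: the instruction set below is made up and is not EVM bytecode, and the names toy_interpreter and WrappedContract are hypothetical.

```python
def toy_interpreter(code, calldata, host_callback):
    """Stand-in for the EVM interpreter at address X.

    `code` is a list of made-up (op, arg) tuples. Storage access is not done
    here; it is requested from the calling contract through `host_callback`.
    """
    stack = []
    for op, arg in code:
        if op == "PUSH":
            stack.append(arg)
        elif op == "CALLDATA":
            stack.append(calldata)
        elif op == "SSTORE":
            value, slot = stack.pop(), stack.pop()
            host_callback("SSTORE", slot, value)
        elif op == "SLOAD":
            slot = stack.pop()
            stack.append(host_callback("SLOAD", slot))
        elif op == "RETURN":
            return stack.pop()
        else:
            raise ValueError(f"unknown op {op}")
    return None


class WrappedContract:
    """Top-level logic that replaces a contract whose old code was C."""

    def __init__(self, code, interpreter):
        self.code = code                # C: the contract's existing code
        self.interpreter = interpreter  # X: the interpreter to delegate to
        self.storage = {}

    def _on_interpreter_request(self, op, *args):
        # The interpreter calls back to have storage operations performed here.
        if op == "SLOAD":
            (slot,) = args
            return self.storage.get(slot, 0)
        if op == "SSTORE":
            slot, value = args
            self.storage[slot] = value
            return None
        raise ValueError(f"unsupported request {op}")

    def call(self, calldata):
        # Invoke X with (C, D), wait for the return value, and forward it.
        return self.interpreter(self.code, calldata, self._on_interpreter_request)


# Toy program: store the call data in slot 0, read it back, and return it.
program = [("PUSH", 0), ("CALLDATA", None), ("SSTORE", None),
           ("PUSH", 0), ("SLOAD", None), ("RETURN", None)]
contract = WrappedContract(program, toy_interpreter)
result = contract.call(99)
```

The point of the design is that state stays with the contract while the interpreter stays stateless, so one interpreter at X can serve every converted contract.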
A key benefit of the second and third proposals is that they greatly simplify the execution layer specification. In fact, this approach may be the only practical way to achieve simplification, considering how difficult even incremental simplifications such as removing SELFDESTRUCT have been. Tinygrad has a hard rule that its code never exceeds 10,000 lines; an optimal blockchain base layer should be able to fit well within that limit, and ideally go smaller. The beam chain effort holds great promise for greatly simplifying Ethereum’s consensus layer. But for the execution layer to achieve a similar simplification, this kind of radical change may be the only viable path.