zkVM 1.0: Industry-Leading Performance Benchmarks

Tim Carstens & Santiago Campos

June 17, 2024

TLDR:

Performance Leader: zkVM 1.0 delivers high efficiency, outperforming competitors on general-purpose compute.
Future-proof: Continuous improvements ensure RISC Zero will maintain its lead in zkVM performance.
Competitive Edge: Comprehensive benchmarks show RISC Zero’s superiority in both cost and speed across various workloads.


Note: Raw data is available at the bottom of this post.

A few weeks back, we hinted at the possibility of a ZK proof system faster than Plonky3, the proof system used by a number of other projects, including Succinct’s SP1. Recently, the Valida team published independent benchmarks, revealing some interesting insights. Now, it’s time to share our data comparing RISC Zero zkVM and the SP1 zkVM.

RISC Zero’s zkVM Performance

At RISC Zero, we’re excited to share that zkVM 1.0 outperforms competitors by a significant margin. Our zkVM delivers high performance and is well-positioned for future improvements. Here’s a high-level comparison of RISC Zero’s zkVM against the SP1 zkVM:

Questions and Methodology

Overview and Rationale

Our recent benchmarks showcase RISC Zero's clear advantage over SP1, reaffirming our position as the leader in zkVM performance. Before presenting the numbers, we describe the questions we set out to answer and how we tested, so that the comparison is fair and its real-world implications are clear. We seek to investigate the following questions:

General-Purpose VM Use Cases: Evaluating which proof system is best for general-purpose VM use.
Fastest Proofs: Comparing R0 vs SP1 to determine which implementation offers the fastest proofs.
Cost Efficiency: Analyzing which implementation offers the lowest cost for proving.

Methodology:

We used a variety of hardware setups, ranging from consumer devices to cloud instances, and focused on common features:

RISC-V execution
Random-access memory
SHA2 accelerator (“precompile”)

In previous benchmarking efforts by Succinct, comparisons often included dissimilar features, such as accelerated versus unaccelerated Keccak, which obscured the performance of the underlying proof systems. Additionally, prior benchmarks compared a fine-tuned SP1 deployment against a sub-optimal RISC Zero setup. In our tests, we aimed for the best “out of the box” performance from both systems, using recommended compiler flags and preferred AWS instances.

To ensure fairness, we enabled all recommended performance features for both RISC Zero’s zkVM and Succinct’s SP1. Our zkVM is optimized for GPU proving, a feature not utilized in earlier comparisons. Benchmarking it without the GPU is like comparing the speeds of an airplane and a car while making the airplane stay on the ground.

Lastly, for these tests, we chose not to include data from Jolt or Valida:

Valida: Does not implement RISC-V, making direct comparison impossible, though their benchmarks show significant speed improvements over SP1.
Jolt: Still in early development, not yet representing its full potential.

These questions are relevant because, over time, VMs will converge on similar features. The performance of the underlying proof system will then determine which VM leads in cost and speed.

By standardizing our testing environment, we ensured the results were directly comparable and fair.

The Tests

We tested R0 zkVM 1.0.0-rc.5 against SP1 1.0-testnet.

We conducted experiments on 4 cloud instances:

AWS g4dn.xlarge; CUDA; po2=20; $0.33/hr reserved. (Note: we were unable to collect data from SP1 on this instance due to intense memory pressure that frequently caused the instance to freeze, requiring manual intervention via the AWS management console.)
AWS g6.xlarge; CUDA; po2=21; $0.52/hr reserved.
AWS g6.16xlarge; CUDA; po2=21; $2.21/hr reserved.
AWS r7i.16xlarge; CPU only; po2=21 and po2=22; $2.80/hr reserved. (Note: this is the instance used by Succinct in their Testnet benchmarks; we did not measure R0 on this instance since we were satisfied with R0’s performance on less expensive instances.)

We also tested on a variety of consumer (end-user) devices:

MacBook Pro (MBP) M3 Max with 96GB RAM; Metal; po2=20.
MacBook Pro (MBP) M2 Max with 96GB RAM; Metal; po2=20.
Ryzen 5950X with 128GB RAM; CUDA; po2=20.

Notes:

po2 refers to the segment/shard size, a critical parameter for both RISC Zero and Succinct.
All builds were performed with RUSTFLAGS='-C target-cpu=native', following Succinct’s performance recommendations.
Instance prices were obtained from Vantage.
- Region: US East (N. Virginia)
- Pricing Unit: Instance
- Cost: Hourly
- Reserved: 1-year, No Upfront
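
To make the po2 parameter concrete, here is a minimal Python sketch. It assumes po2 is the base-2 logarithm of the maximum number of cycles per segment (or shard), so a segment holds at most 2^po2 cycles. The function name `segment_count` is hypothetical, and the estimate ignores padding and any per-segment overhead.

```python
def segment_count(total_cycles: int, po2: int) -> int:
    """Estimate how many segments a workload needs.

    Assumption: each segment holds at most 2**po2 cycles.
    """
    segment_size = 1 << po2
    # Integer ceiling division avoids floating-point rounding.
    return (total_cycles + segment_size - 1) // segment_size


if __name__ == "__main__":
    # 167M cycles is the approximate upper end of the workloads in this post.
    for po2 in (20, 21, 22):
        print(f"po2={po2}: {segment_count(167_000_000, po2)} segments")
```

Under this assumption, the largest workload splits into roughly 160, 80, or 40 segments at po2 = 20, 21, and 22 respectively, while the smallest (about 2k cycles) fits in a single segment at any of these sizes.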

We conducted several experiments. Each of these experiments is parameterized by the scale of the workload. This allows us to compare the scaling properties of the underlying proof systems for different types of workloads:

sha2. This guest calculates hash(hash(hash(...))) using an iterative algorithm. This measures proof-system performance for classical cryptography use cases.
fib. The guest calculates Fibonacci numbers using an iterative algorithm. This measures RISC-V core performance.
sort. The guest generates an array of random values, sorts the array, then calculates the alternating sum of values. This measures performance for heap-bound RISC-V computations.
big_input. The guest reads a vector of numbers and returns the sum. This measures heap and I/O performance.
big_input_push. The guest reads several numbers, pushes them into a vec, and returns the sum. This measures heap and I/O performance.
big_input_vecless. The guest reads several numbers and returns the sum. This measures I/O performance.
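
For readers who want a feel for what these guests compute, the following Python sketch mirrors the logic of several workloads. It is an illustration only: the real guests are programs compiled to RISC-V, and details such as the hash variant (SHA-256 is assumed here), integer width, random seed, and function names are assumptions rather than the benchmark code.

```python
import hashlib
import random


def sha2_chain(seed: bytes, rounds: int) -> bytes:
    """hash(hash(hash(...))) computed iteratively; SHA-256 is an assumption."""
    digest = seed
    for _ in range(rounds):
        digest = hashlib.sha256(digest).digest()
    return digest


def fib(n: int) -> int:
    """Iterative Fibonacci with fib(0) = 0 and fib(1) = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def alternating_sum(values: list[int]) -> int:
    """v0 - v1 + v2 - v3 + ..."""
    return sum(v if i % 2 == 0 else -v for i, v in enumerate(values))


def sort_workload(size: int, seed: int = 0) -> int:
    """Generate random values, sort them, return the alternating sum."""
    rng = random.Random(seed)
    values = [rng.randrange(1 << 32) for _ in range(size)]
    values.sort()
    return alternating_sum(values)


def big_input(numbers: list[int]) -> int:
    """Read a vector of numbers and return the sum."""
    return sum(numbers)
```

The workload scale corresponds to `rounds`, `n`, `size`, or the length of `numbers`; increasing it increases the number of cycles the guest executes.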

Lastly, we also wanted to gather some data about critical blockchain use cases. For this, we revisited Succinct’s prior benchmark for the Tendermint Light Client, updated it to use the latest Tendermint library, and enabled GPU proving for RISC Zero.

Experiment sizes range from approximately 2k to 167M cycles. For each task, we measured the total time required to create a single ZK proof. For both RISC Zero and Succinct, this is a two-step process: first a segmented (resp. sharded) proof is generated, then it is compressed (resp. reduced) into a single proof. In certain use cases, additional reduction steps might be performed; we did not measure those additional steps in these tests.
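
The measurement described above can be pictured as one wall-clock timer wrapped around both steps. The Python sketch below is a generic illustration, not the benchmark harness: `prove_segments` and `compress` are hypothetical stand-ins for whatever each system exposes for the segmented (sharded) proof and the compression (reduction) step.

```python
import time
from typing import Callable, Tuple, TypeVar

S = TypeVar("S")  # segmented / sharded proof
P = TypeVar("P")  # final single proof


def time_single_proof(
    prove_segments: Callable[[], S],
    compress: Callable[[S], P],
) -> Tuple[P, float]:
    """Run both proving steps and return (proof, elapsed seconds)."""
    start = time.perf_counter()
    segmented = prove_segments()   # step 1: segmented (resp. sharded) proof
    proof = compress(segmented)    # step 2: compress (resp. reduce) to one proof
    elapsed = time.perf_counter() - start
    return proof, elapsed


if __name__ == "__main__":
    # Stand-in callables so the sketch runs without any prover installed.
    proof, seconds = time_single_proof(
        lambda: ["seg0", "seg1"],
        lambda segments: "|".join(segments),
    )
    print(proof, f"{seconds:.6f}s")
```

Timing both steps together matters because a system that is fast at per-segment proving but slow at compression would otherwise look better than it is for anyone who needs a single proof.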

Condensed Results and Key Takeaways

For the raw data see here.

Key Findings:

Across the board, we found that a properly configured RISC Zero zkVM outperforms a similarly configured Succinct SP1 deployment in both cost and speed. This holds true:

In the cloud;
On consumer Macs (M2 Max and M3 Max);
On consumer PCs equipped with a consumer-grade NVIDIA GPU.

Additional highlights:

Across all VM and hardware combinations, the fastest proofs come from a consumer PC equipped with an NVIDIA 4090 running RISC Zero zkVM.
In the cloud, one g6.xlarge instance running RISC Zero is both faster and cheaper than one r7i.16xlarge instance (Succinct’s preferred node for benchmarking) running SP1.
In the cloud, for any given task, RISC Zero is at least 7x less expensive than SP1. For some workloads (such as SHA2 hashing), RISC Zero is at least 30x less expensive. And for small workloads in particular, RISC Zero is nearly 60x less expensive than SP1.

In certain specific use cases, which heavily rely on specific cryptographic operations, Succinct's SP1 currently demonstrates faster performance due to their use of accelerators. RISC Zero is actively working on integrating similar accelerators to boost performance in these scenarios. While Succinct's use of accelerated Keccak provided an advantage in specific benchmarks, RISC Zero's zkVM delivers strong overall performance and cost-efficiency across a wide range of workloads compared to SP1. As we continue to enhance our zkVM with additional accelerators and optimizations, we expect to see further improvements in real-world applications.

Our Industry-leading Tendermint Light Client

Our Tendermint light client test results, illustrated in the chart below, showcase R0's industry-leading performance.

We utilized Succinct's performance benchmark suite, which includes the Tendermint light client test, and made updates to re-run the test. The methodology and full details of the benchmark suite and our updated fork can be found HERE.

Our testing was conducted on the MacBook Pro M3. For shard size 20, R0 achieved a proof duration of 3.85 minutes, significantly faster than SP1's 7.54 minutes. For shard size 22, R0 completed in 3.08 minutes compared to SP1's 4.23 minutes. In other words, R0 needed about 48% and 27% less proving time than SP1, respectively. These results highlight our lead in the industry.
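
The percentages can be checked from the rounded durations quoted above. The Python sketch below computes both the reduction in proving time and the equivalent speedup factor; the helper names are hypothetical.

```python
def time_reduction(baseline_min: float, candidate_min: float) -> float:
    """Fraction of proving time saved relative to the baseline."""
    return 1.0 - candidate_min / baseline_min


def speedup(baseline_min: float, candidate_min: float) -> float:
    """How many times faster the candidate is than the baseline."""
    return baseline_min / candidate_min


if __name__ == "__main__":
    # (shard size, SP1 minutes, R0 minutes) as quoted in this post.
    for shard, sp1, r0 in ((20, 7.54, 3.85), (22, 4.23, 3.08)):
        print(
            f"shard size {shard}: "
            f"{time_reduction(sp1, r0):.1%} less time, "
            f"{speedup(sp1, r0):.2f}x speedup"
        )
```

With these inputs the reductions come out near 48.9% and 27.2%, in line with the 48% and 27% quoted, which corresponds to speedups of roughly 1.96x and 1.37x.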

We also collected data from AWS. As with our other experiments, we tested SP1 on their recommended instance: r7i.16xlarge ($2.8005/hr, shard size 22). The proof was completed in 4.27 minutes, roughly equivalent to their performance on M3. Meanwhile, we tested RISC Zero on our recommended instances: g4dn.xlarge ($0.331/hr, shard size 20) and g6.xlarge ($0.5239/hr, shard size 21). On these instances we observed times of 3.43 minutes and 1.30 minutes, respectively.

These data show that RISC Zero is faster on consumer devices, and faster and less expensive in the cloud, for critical blockchain use cases such as the Tendermint Light Client.
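
To turn the AWS figures into a cost per proof, one can multiply the hourly price by the proving time. The Python sketch below does this for the three Tendermint runs quoted above. It assumes cost scales linearly with wall-clock proving time at the reserved hourly rate and ignores instance startup and idle time; the function and dictionary names are hypothetical.

```python
def cost_per_proof(hourly_usd: float, minutes: float) -> float:
    """Cost of one proof, assuming billing proportional to proving time."""
    return hourly_usd * minutes / 60.0


# (hourly price in USD, proof time in minutes), as quoted in this post.
RUNS = {
    "SP1 on r7i.16xlarge": (2.8005, 4.27),
    "RISC Zero on g4dn.xlarge": (0.331, 3.43),
    "RISC Zero on g6.xlarge": (0.5239, 1.30),
}

COSTS = {name: cost_per_proof(price, mins) for name, (price, mins) in RUNS.items()}

if __name__ == "__main__":
    baseline = COSTS["SP1 on r7i.16xlarge"]
    for name, cost in COSTS.items():
        print(f"{name}: ${cost:.4f} per proof ({baseline / cost:.1f}x vs SP1)")
```

Under these assumptions the Tendermint proof costs about $0.199 with SP1 on r7i.16xlarge, versus about $0.019 on g4dn.xlarge and $0.011 on g6.xlarge with RISC Zero, a gap of roughly 10x to 18x.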

Detailed Benchmarks and Reproduction

To ensure transparency and reproducibility, we have made our benchmarking scripts and data available. To replicate the results, you can check out our guide HERE.

For complete raw data, check out the spreadsheet HERE.

Conclusion

Our experiments highlighted that RISC Zero zkVM not only leads in performance and cost-efficiency but also provides a robust foundation for future enhancements. In many head-to-head comparisons between similar systems, RISC Zero demonstrates strong performance and cost-efficiency over SP1. This demonstrates not only a fundamental difference between these VMs but also between the underlying proof systems.

It's important to note that performance may vary depending on specific workloads and hardware configurations. While RISC Zero shows strong overall performance, there may be certain scenarios where other solutions, such as Succinct's SP1, excel due to factors like the use of accelerators for specific tasks.

All these systems are likely to continue improving over the coming year. RISC Zero and Succinct are both actively working on the next set of performance enhancements. Thus, one can expect more head-to-head comparisons to be published in the coming months.

This competition, along with pressure from Jolt and Valida, will benefit consumers and drive advancements in the ZK space as a whole. With ongoing enhancements and a strong foundation, RISC Zero is positioned to remain the best zkVM for the foreseeable future.

Stay Updated

To stay updated on RISC Zero benchmarks, check out our dedicated benchmarking website.
