BitcoinLayer 2 extension Technical Analysis: validity proof and fraud proof

2024-10-22 10:38:55

1 Introduction

For an Algorithm f, two mutually untrusted participants, Alice and Bob, can establish trust in the following way:

Alice inputs x, runs Algorithm f, and obtains result y. Bob also runs Algorithm f with the same input x and obtains y’. If y and y’ are equal, Bob acknowledges the input x and output y provided by Alice. This is a common validity proof mechanism, usually applied in Consensus of Blockchains, where Alice is the Node that packages Blocks, and Bob is a Node participating in Consensus.
Alice inputs x, runs the zk.prove program, and obtains the result y and proof. Bob runs the zk.verify program based on f, y, and proof. If the result is true, Bob accepts Alice’s result y; if the result is false, Bob does not accept Alice’s result y. This is a validity proof, and Bob can be an on-chain contract.
Alice inputs x and runs Algorithm f, resulting in y. Bob also runs Algorithm f with the same input x, resulting in y’. If y and y’ are equal, nothing happens; if y and y’ are not equal, Bob challenges Alice with the program f. The interaction between Alice and Bob can be single or multiple rounds. Arbitration is achieved through the challenge-response mechanism. This is a fraud proof, where Bob is the challenger, listening offline and initiating challenges online.
Alice inputs x, runs the zk.prove program, and obtains the result y and the proof proof. Bob runs the zk.verify program based on f, y, and proof. If the result is true, nothing is done; if the result is false, Bob will challenge Alice, and the challenging program is zk.verify. The interaction between Alice and Bob can be one or more rounds. Arbitration is achieved through the challenge-response mechanism. This is a kind of fraud proof, with Bob as the challenger, listening offline and initiating challenges online.

For the above 2, 3, and 4, let x be Layer2 transactions and initial state, f be Layer2 Consensus program, y be the final state of the transaction, which corresponds to the Layer2 scaling solution of the blockchain.

validity proof: Based on a pessimistic assumption, its validity must be proven before acceptance and take effect immediately. In the validity proof, evidence of correct L2 state transitions must be provided, reflecting a pessimistic view of the world - it will only be accepted when the state is proven correct.
fraud proof: Based on optimistic assumptions, it is accepted by default unless proven otherwise. It has a challenge window, and only becomes effective after the challenge window ends. In fraud proof, evidence of incorrect L2 state transition must be provided, reflecting an optimistic view of the world - state transitions are considered correct unless proven otherwise.

Table 1: Methods to Build Trust

Also, it is important to note:

The key to distinguishing between fraud proof and validity proof does not lie in whether to use ZK proof systems like SNARK or STARK. The ZK proof system only expresses the proof method used, while fraud or validity represents the content of the proof. Therefore, scenario 1 in Table 1 is referred to as validity proof.
Validity proof has better timeliness, but higher complexity for on-chain verification; while fraud proof has lower complexity for on-chain verification, but poorer timeliness.
For case 2 and 4 in Table 1, by using ZK recursive and composition techniques, multiple f can be calculated and compressed to significantly drop the verification cost of on-chain to off-chain computation.

Currently, thanks to the Turing Complete Smart Contract of Solidity, fraud proof and validity proof technologies have been widely used in the Layer2 expansion of Ethereum. However, within the BTC framework, due to the limited functionality of BTC’s opcodes, a stack element count of only 1000, and other restrictions, the application of these technologies is still in the exploratory stage. This article summarizes the limitations and breakthrough technologies under the BTC framework, studies the validity proof and fraud proof technologies, and outlines the unique script segmentation technology under the BTC framework.

Limitations and Breakthroughs Under the Paradigm of 2 Bitcoin

There are many limitations within the framework of Bitcoin, but these limitations can be overcome through various clever methods or technologies. For example, the use of Bit commitments can break through the statelessness limitation of UTXO, Taproot can solve the limitation of script space, connector outputs can overcome the limitation of UTXO spending methods, and Smart Contracts can overcome the limitation of pre-signing.

2.1 UTXO Model and Script Restrictions

Bitcoin adopts the UTXO model, where each UTXO is locked in a locking script that defines the conditions required to spend the UTXO. There are the following limitations to BTC scripts:

BTC scripts are stateless;
For P2TR output types, a single transaction can accommodate up to about 4 million opcodes, filling the entire Block, while for other output types, only 10,000 opcodes can be used;
The consumption method of a single UTXO is limited, and there is a lack of exploration of combined consumption methods;
Does not support flexible contract functionality;
The size limit of the stack is up to 1000 elements (including altstack and stack), and the maximum size of a single element is 520 bytes;
Arithmetic operations (such as addition and subtraction) are limited to 4-byte elements and cannot be expanded to 20-byte or longer elements, which is necessary for the encryption operation;
Opcode like OPMUL and OPCAT have been disabled, and it is extremely expensive to simulate them using existing opcodes, such as simulating a BLAKE3 hash, which results in a script size of about 75K.

2.2 Bit Commitment: Breaking the UTXO State Limitation

Currently, the BTC script is completely stateless. When executing the BTC script, the execution environment of each script will be reset upon completion. This makes it impossible for the BTC script to natively support scripts 1 and 2 sharing the same x value. However, this problem can be solved through some methods, the core idea is to sign a value in some way. If a signature can be generated for a value, it is possible to achieve a stateful BTC script. In other words, by checking the signatures of the x values in scripts 1 and 2, it can be ensured that the same x value is used in these two scripts. If there are conflicting signatures, that is, two different values are signed for the same variable x, penalties can be imposed. This method is called Bit commitment.

The principle of Bit commitment is relatively simple. Each Bit corresponds to two different hash values, hash0 and hash1. If the Bit value to be signed is 0, the pre-image of hash0 is revealed; if the Bit value is 1, the pre-image of hash1 is revealed.

Taking a single Bit message m ∈ {0,1} as an example, the corresponding Bit commitment unlocking script is composed of some preimages: if the Bit value is 0, the unlocking script is preimage0 - ‘0xfa7fa5b1dea37d71a0b841967f6a3b119dbea140’; if the Bit value is 1, the unlocking script is preimage1 - ‘0x47c31e611a3bd2f3a7a42207613046703fa27496’. Therefore, through Bit commitment, the stateless limitation of UTXO can be overcome, enabling stateful BTC scripts and making various interesting new features possible.

OP_HASH160

OP_DUP

0xf592e757267b7f307324f1e78b34472f8b6f46f3> // This is hash1

OP_EQUAL

OP_DUP

OP_ROT

0x100b9f19ebd537fdc371fa1367d7ccc802dc2524 > // This is hash0

OP_EQUAL

OP_BOOLOR

OP_VERIFY

// The value promised by Bit is on the stack. It can be “0” or “1”.

Currently, Bit promises two implementation methods:

Lamport one-time signature: The principle of Lamport one-time signature is relatively simple, only requiring the use of hash functions, suitable for compatibility with BTC. For each Bit in the message, two hash values need to be committed, resulting in relatively large signature data. Specifically, for a message of length v Bits, the Public Key length is 2v Bits, while the signature length is v Bits.
Winternitz One-Time Signature: Compared with Lamport signature, Winternitz signature significantly reduces the length of the signature and Public Key, but increases the complexity of signing and verification. This scheme allows flexible setting of different hash chain lengths d as needed, thus achieving a balance between length and complexity. For example, setting d = 15 can shorten the length of the Public Key and signature by about 4 times, but the complexity of verification will increase by 4 times. This is actually a trade-off between the stack space and script size of BTC. When d = 1, Lamport signature can be seen as a special case of Winternitz signature.

Currently, the BitVM2 library implements Winternitz signatures based on the Blake3 hash function. The length of a single Bit signature is approximately 26 bytes. Therefore, it can be seen that introducing state through Bit commitments has a cost. Therefore, in the implementation of BitVM2, the message is first hashed to obtain a 256-bit hash value, and then a Bit commitment is made to this hash value to save costs, instead of committing to each Bit of the original longer message directly.

2.3 Taproot: Breaking the Script Space Limit

The Taproot soft fork upgrade of Bitcoin (BTC) was activated in November 2021, including three proposals: Schnorr signatures (BIP 340), Taproot (BIP 341), and TapScript (BIP 342). This upgrade introduces a new transaction type - Pay-to-Taproot (P2TR) transactions. By combining the advantages of Taproot, MAST (Merkelized Abstract Syntax Trees), and Schnorr signatures, P2TR transactions can create more private, flexible, and scalable transaction formats.

P2TR supports two spending methods: spending via Secret Key path or script path. According to the Taproot (BIP 341) specification, when spending via the script path, the corresponding Merkle path length cannot exceed 128. This means that the number of tapleaves in the taptree cannot exceed 2^128. Since the SegWit upgrade in 2017, the BTC network measures block size in weight units, with a maximum support of 4 million weight units (about 4MB). When a P2TR output is spent via the script path, only a single tapleaf script needs to be revealed, which means that the block contains only one tapleaf script. Therefore, for P2TR transactions, the maximum script size for each tapleaf can reach about 4MB. However, according to BTC’s default policy, many nodes only forward transactions smaller than 400K; larger transactions require cooperation with miners for packaging.

The script space premium that Taproot brings makes it even more valuable to simulate cryptographic operations such as multiplication and hashing using existing opcodes. When building verifiable computations on top of P2TR, the script size is no longer limited by the 4MB limit, but can be split into multiple sub-functions distributed across multiple tapleaves, thus breaking the 4MB script space limit. In fact, the Groth16 validator algorithm currently implemented in BitVM2 is up to 2GB in size. However, it can be split and distributed across approximately 1000 tapleaves, and by combining with the Bit commitment, constrains the consistency between the inputs and outputs of each sub-function, thus ensuring the integrity and correctness of the entire computation.

2.4 Connector Output: Breaking the UTXO Spending Limitation

Currently, BTC provides two native spending methods for a single Unspent Transaction Output (UTXO): spending through script or Public Key. Therefore, as long as the correct Public Key signature or script conditions are met, the UTXO can be spent. Two UTXOs can be spent independently, and restrictions cannot be added to constrain these two UTXOs, which means additional conditions must be met to spend them.

However, the founder of ARK protocol, Burak, cleverly utilized the SIGHASH flag to achieve connector output. Specifically, Alice can create a signature to send her BTC to Bob. Since the signature can commit to multiple inputs, Alice can set her signature as a condition: it is only valid when the second input is spent in the Taketx transaction. The second input is called the connector, which connects UTXO A and UTXO B. In other words, the Taketx transaction is only valid when UTXO B is not spent by the Challengetx. Therefore, by destroying the connector output, the validity of the Taketx transaction can be prevented.

Figure 1: Connector output schematic diagram

In the BitVM2 protocol, the connector output acts as an if…else function. Once the connector output is spent by a transaction, it cannot be spent again by other transactions, ensuring its exclusivity. In practical deployment, additional UTXOs with time locks are introduced to allow for challenge-response cycles. In addition, the connector output can be set with different spending strategies as needed, such as allowing either party to spend in the case of a challenge transaction, and only allowing the operator to spend in the case of a response transaction or allowing anyone to spend after a timeout.

2.5 Contract: Breaking the Pre-Signing Restriction

Currently, BTC scripts primarily restrict unlocking conditions, but do not restrict how unspent transaction outputs (UTXOs) are further spent. This is because BTC scripts cannot read the content of the transaction itself and cannot achieve introspection of the transaction. If BTC scripts could examine any content of the transaction, including outputs, then contract functionality could be achieved.

The current contract implementation can be divided into two categories:

Presigning: Based on the existing BTC script capability, a limited function pre-scheduled contract is constructed through presigning. This means that participants need to design and sign all possible future transactions in advance, thereby locking them to specific Private Keys and fee rates. Some schemes even require participants to generate short-term Private Keys for all presigned transactions. Once the presigning is completed, these short-term Private Keys are securely deleted to prevent attackers from gaining access and stealing funds. However, this process needs to be repeated whenever new participants are added or operations are updated, resulting in high maintenance costs. For example, the Lighting Network achieves bilateral contracts through presigning and uses Hashed TimeLock Contract (HTLC) technology to implement routing for multiple bilateral contracts, thereby realizing a trust-minimized scaling strategy. However, the Lighting Network requires presigning multiple transactions and is limited to only two parties, making it cumbersome to use; in BitVM1, hundreds of transactions need to be presigned for each initialization, while in the optimized BitVM2, the number of transactions to be presigned for each initialization also reaches dozens. In BitVM1 and BitVM2, only the operators involved in presigning are eligible for reimbursement. If there are n participants involved in presigning, and each participant needs to presign m transactions, then the presigning complexity for each participant will be n * m.
Introducing Contract Operation Codes: Introducing new contract functional operation codes can significantly reduce the communication complexity and maintenance costs among contract participants, thus providing a more flexible contract implementation method for BTC. For example, the OPCAT operation code is used to concatenate byte strings. Despite its simplicity, it is very powerful and can significantly reduce the complexity of BitVM; the OPTXHASH operation code allows for more precise control of operations within the contract. If used in BitVM, it can support a wider range of operators, significantly improving the security of BitVM and minimizing trust. In addition, the nature of the pre-signature method essentially means that operators in BitVM can only adopt a reimbursement process, which leads to lower capital utilization efficiency; whereas through new contract operation codes, it may be possible to directly pay output users from the peg-in fund pool, further improving capital efficiency. Therefore, a flexible contract model will effectively break through the traditional pre-signature limitations.

3 BTC Layer2 Expansion: validity proof and fraud proof

validity proof and fraud proof can both be used for BTC’s Layer2 expansion, with the main differences seen in Table 2.

Table 2: validity proof and fraud proof

BTC-based fraud proofs can be built based on Bit commitments, Taproot, pre-signed, and connector outputs. At the same time, by introducing contract opcodes (such as OP_CAT), it is also possible to build BTC validity proofs based on Taproot. In addition, fraud proof can also be divided into licensed fraud proof and permissionless fraud proof according to whether Bob has a get on board system. In permissioned fraud proof, only a specific group of people can challenge as Bob, whereas in permissionless fraud proof, any third party can challenge as Bob. Permissionless fraud proof is more secure than permissioned fraud proof because it drops the risk of collusion between participants.

Based on the challenge-response interaction count between Alice and Bob, fraud proof can be divided into single-round fraud proof and multi-round fraud proof, as shown in Figure 2.

Figure 2: Single-round fraud proof vs. multi-round fraud proof

As shown in Table 3, fraud proof can be implemented through different interaction models, including single-round interaction model and multi-round interaction model.

Table 3: Single-turn Interaction and Multi-turn Interaction

In the Layer 2 expansion paradigm of BTC, BitVM1 adopts a multi-round fraud proof mechanism, while BitVM2 adopts a single-round fraud proof mechanism. Bitcoin-circle stark utilizes validity proof. Among these mechanisms, BitVM1 and BitVM2 can be implemented without modifying the BTC protocol, while bitcoin-circle stark requires the introduction of a new Operation Code OP_CAT.

For most computational tasks, BTC’s signet, testnet, and mainnet cannot fully support 4MB scripts, so script splitting technology is needed, that is, splitting the complete computational script into blocks smaller than 4MB and distributing them in different Tapleafs.

3.1 Bitcoin multi-round fraud proof

As shown in Table 3, multi-round fraud proof is suitable for those who want to reduce on-chain arbitration calculations, or cannot locate the problematic calculation segment in one step. As the name suggests, multi-round fraud proof requires multiple rounds of interaction between the prover and validators to identify the problematic calculation segment, and then arbitrate based on these segments.

Robin Linus’s early BitVM White Paper (usually referred to as BitVM1) adopts multiple rounds of fraud proof. Assuming each challenge period lasts for one week and uses binary search to locate the problematic computation segment, the on-chain challenge response period of the Groth16 verifier may be extended to 30 weeks, resulting in poor timeliness. Although there are teams currently researching more efficient n-ary search methods than binary search, their timeliness is still significantly lower than the 2-week cycle of single-round fraud proof.

Currently, multi-round fraud proof in BTC uses permissioned challenges, while single-round fraud proof implements a permissionless challenge method, thereby dropping the risk of collusion between participants and enhancing security. To achieve this, Robin Linus fully utilized the advantages of Taproot to optimize BitVM1, reducing the number of interaction rounds to one and expanding the challenge method to a permissionless manner, despite increasing the cost of on-chain arbitration calculation.

Round 3.2 BTC fraud proof

In this model, the verification of fraud proof only requires one interaction between the prover and validators. Validators only need to initiate one challenge, while the prover only needs to respond once. In the response, the prover must provide evidence to prove that their calculation is correct. If the validators find inconsistencies in the evidence, the challenge is successful; otherwise, the challenge fails. The characteristics of single-round interaction fraud proof are shown in Table 3.

Figure 3: Single Round Fraud Proof

On August 15, 2024, Robin Linus released the technical White Paper ‘BitVM2: Connecting BTC to Layer 2’, which implements a cross-chain bridge using a single-round fraud proof method similar to that shown in Figure 3.

3.3 Use OP_CAT’s BTCvalidity proof

OP_CAT is part of the BTC scripting language when it was released, but it was disabled in 2010 due to security vulnerabilities. Nevertheless, the BTC community has been discussing the possibility of re-enabling OP_CAT for years. Currently, OP_CAT has been assigned the number 347 and has been enabled on BTC’s signet network.

The main function of OP_CAT is to merge the two elements in the stack and push the result back to the stack. This function provides new possibilities for contracts on BTC and STARK verifiers:

Contract: Andrew Poelstra proposed CAT and Schnorr Tricks I to implement contracts on BTC using OP_CAT and Schnorr technology. The Schnorr Algorithm is a Digital Signature suitable for P2TR output types; for other output types, similar ECDSA technology can be used, as shown in the contract using CAT and ECDSA. With the OP_CAT contract, the STARK verifier Algorithm can be decomposed into multiple transactions to gradually verify the entire STARK proof.
STARK Verifier: The main function of the STARK Verifier is to connect data together and hash it. Unlike algebraic operations, hash is a native operation in BTC scripts, which can significantly reduce expenses. For example, OPSHA256 is a single opcode in its native form, while the simulated version requires hundreds of opcodes. The main hash operations in STARK include verifying Merkle paths and Fiat-Shamir transformations. Therefore, OP_CAT is very friendly to the STARK Verifier Algorithm.

3.4 BTC Script Splitting Technology

Despite the much lower computational load required for proof of validation after SNARK/STARK proof, the script size required for implementing the validator Algorithm in BTC script is still huge. Currently, based on existing BTC opcodes, the optimized sizes of Groth16 and Fflonk validator scripts both exceed 2GB, while the size of a single BTC Block is only 4MB, so it is impossible to run the entire validator script within a single Block. However, since the Taproot upgrade, BTC supports script execution through tapleaf, allowing the validator script to be split into multiple blocks, each serving as a tapleaf to construct the taptree. Consistency between blocks can be ensured through bit commitment.

Under the OP_CAT contract, the STARK verifier can be split into multiple standard transactions that are less than 400KB, so that the entire STARK validity proof verification can be completed without the need to cooperate with the Miner.

This section will focus on the splitting technique of BTC scripts under existing conditions, without introducing or activating any new opcodes.

When performing script splitting, it is necessary to balance the information of the following aspects:

The script size of a single block must not exceed 4MB and should include input bit commitment scripts, transaction signatures, and other content.
The maximum stack size for a single block must not exceed 1000. Therefore, the stack for each block should only retain necessary elements to ensure sufficient stack space for script size optimization, as Money Laundering in BTC is unrelated to stack size.
On BTC, the cost of bit commitment is high. Therefore, it is necessary to minimize the number of input and output bits between two adjacent blocks, with 1 bit currently corresponding to 26 bytes.
For the convenience of auditing, the function of each block should be as clear as possible.

Currently, the methods of script splitting can be divided into the following three categories:

Automatic Split: This method seeks a splitting method to make the script size approximately 3MB, and minimizes the stack size based on the stack size and script size. Its advantage is independent of the specific validator Algorithm and can be extended to any script splitting for computation. Disadvantages include: (1) The entire logic block must be separately marked, for example, OP_IF code blocks that cannot be split; otherwise, the execution result of the split script may be incorrect; (2) The execution result of the block may correspond to multiple elements on the stack, and the number of stack elements requiring application bit commitment needs to be marked according to the actual calculation logic; (3) The logical function of each block script has poor readability and is difficult to audit; (4) The stack may contain elements that are not needed for the next block, wasting stack space.
Function Decomposition: This method decomposes the calculation based on functional sub-functions, clarifying the input and output of each sub-function. At the same time, necessary bit commitment scripts are implemented for each block when decomposing the script, ensuring that the total script size of the final block is less than 4MB, and the stack size is less than 1000. The advantage is clear functionality, logical clarity, good readability, and ease of auditing. The disadvantage is that the original calculation logic may not match the script-level logic, and the original sub-functions may be optimal, but not necessarily the best at the script level.
Manual Split: The split point of this method is not based on functional sub-functions, but is set manually. This method is particularly suitable for cases where the size of a single sub-function exceeds 4MB. The advantage is allowing manual splitting of large script sub-functions, such as sub-functions related to Fq12 calculation; each block has clear logic and strong readability, making it easy for auditing. The disadvantage is limited by manual tuning ability. After the overall script optimization, the previously set manual split points may no longer be optimal and need to be readjusted.

For example, after multiple rounds of optimization, the script size of Groth16 verifier has dropped from about 7GB to approximately 1.26GB. In addition to overall computational optimization, each block can also be individually optimized to maximize stack space utilization. For instance, by introducing more efficient lookup table algorithms and dynamically loading and unloading lookup tables, the script size of each block can be further reduced.

Due to the significant differences in computing costs and runtime environments between web2 programming languages and BTC scripts, it is not feasible to simply convert existing algorithm implementations into BTC scripts. Therefore, specific optimizations for BTC scenarios need to be considered:

Look for an Algorithm that can optimize memory locality, even if it may increase some computational burden, to reduce the input/output bit count between blocks, thus reducing the data required for commitment in the assertTx transaction design of BitVM2.
Use the commutativity of the relevant operations (such as logical operations), such as x&y = y&x, to save nearly half of the lookup table space.
Currently, the script size operated by Fq12 is relatively large; consider using Fiat-Shamir, Schwartz-Zippel, and polynomial commitment schemes to significantly reduce the computational complexity of extending Fq12 operations.

4 Summary

This article first discusses the limitations of BTC scripts and discusses several solutions: using BTC commitments to overcome the stateless restrictions of UTXO, using Taproot to break the limitations of script space, bypassing the limitations of UTXO spending methods by connecting outputs, and using contracts to solve the limitations of pre-signed transactions. Next, the article provides a comprehensive overview of the characteristics of fraud proof and validity proof, including the characteristics of licensed and unlicensed fraud proof, the difference between single-round and multi-round fraud proof, and the relevant content of BTC script splitting technology.

Statement:

This article is reproduced from[aicoin]，著作权归属原作者[mutourend & lynndell, Bitlayer Labs]，如对转载有异议，请联系Gate Learn team，团队会根据相关流程尽速处理。
Disclaimer: The views and opinions expressed in this article are solely those of the author and do not constitute any investment advice.
The other language versions of the article are translated by the Gate Learn team, and may not be copied, disseminated, or plagiarized without mentioning [Gate.io].

BTC0,85%

TAPROOT-1,82%

MODE-5,68%

View Original

This page may contain third-party content, which is provided for information purposes only (not representations/warranties) and should not be considered as an endorsement of its views by Gate, nor as financial or professional advice. See Disclaimer for details.