What is a TEE? What is its security model? What are the common TEE vulnerabilities, and what are the best practices for avoiding them?
Original Title: “Securing TEE Apps: A Developer’s Guide”
Authors: prateek, roshan, siddhartha & linguine (Marlin), krane (Asula)
Compiled by: Shew, GodRealmX
Since Apple announced Private Cloud Compute and NVIDIA brought confidential computing to its GPUs, Trusted Execution Environments (TEEs) have become increasingly popular. Their confidentiality guarantees help protect user data (including private keys), while isolation ensures that the execution of programs deployed on them cannot be tampered with, whether by humans, other programs, or operating systems. It is therefore not surprising that TEEs are widely used to build products in the Crypto x AI field.
Like any new technology, TEE is going through an optimistic experimental phase. This article aims to give developers and general readers a basic conceptual guide to what TEE is, its security model, common vulnerabilities, and best practices for using TEE securely. (Note: to make the text easier to understand, we deliberately replaced TEE terms with simpler equivalent words.)
TEE is an isolated environment in a processor or data center where programs can run without interference from the rest of the system. Achieving this isolation requires several design measures, chief among them strict access control over how the rest of the system can reach the programs and data inside the TEE. TEEs are now ubiquitous in mobile phones, servers, PCs, and cloud environments, which makes them accessible and reasonably priced.
This may sound vague and abstract. In fact, different servers and cloud providers implement TEE in different ways, but the fundamental purpose is always the same: to prevent other programs from interfering with the TEE.
Most readers may use biometric information to log in to devices, such as unlocking their phones with fingerprints. But how do we ensure that malicious applications, websites, or jailbroken operating systems cannot access and steal this biometric information? In fact, in addition to encrypting the data, the circuits in TEE devices simply do not allow any program to access the memory and processor areas occupied by sensitive data.
Hardware wallets are another example of a TEE application. A hardware wallet connects to the computer and communicates with it in a sandbox, but the computer cannot directly access the mnemonic stored in the wallet’s memory. In both of the above cases, users trust the device manufacturer to design the chip correctly and to provide appropriate firmware updates, so that confidential data inside the TEE cannot be exported or viewed.
Unfortunately, there are many TEE implementations (Intel SGX, Intel TDX, AMD SEV, AWS Nitro Enclaves, ARM TrustZone), and each requires its own security model and analysis. In the rest of this article we mainly discuss Intel SGX, Intel TDX, and AWS Nitro, because they have more users and complete, readily available development tools. They are also the TEE systems most commonly used in Web3.
Generally speaking, the workflow of the application deployed in TEE is as follows:
Clearly, there are three potential risks here:
Fortunately, TEEs now have a solution that eliminates the above risks: reproducible builds and remote attestations.
So what is a reproducible build? Modern software development often imports a large number of dependencies, such as external tools, libraries, and frameworks, and these dependencies may carry hidden risks. Package managers such as npm record the hash of each dependency as its unique identifier. When npm finds that a dependency does not match its recorded hash, the dependency can be considered modified.
Reproducible builds can be thought of as a set of standards whose goal is that building the same code by a predetermined process yields the same hash on any device. In practice, identifiers other than a hash can also be used; here we call the identifier a code measurement.
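The hash comparison described above can be sketched in a few lines. This is a minimal illustration, assuming SHA-256 stands in for the code measurement; the function names and the byte strings standing in for build outputs are hypothetical, and real TEE platforms compute measurements in their own platform-specific way.

```python
import hashlib


def measure(artifact: bytes) -> str:
    """Return a hex SHA-256 digest, used here as a stand-in for a code measurement."""
    return hashlib.sha256(artifact).hexdigest()


def matches_pinned(artifact: bytes, pinned_digest: str) -> bool:
    """True when the artifact hashes to the digest that was recorded earlier."""
    return measure(artifact) == pinned_digest


# Hypothetical build outputs: two independent builds of the same source,
# plus one artifact that was modified after the build.
build_a = b"binary produced from source at commit X"
build_b = b"binary produced from source at commit X"
tampered = b"binary produced from source at commit X, plus a backdoor"

# The digest published by the project for the deployed program.
pinned = measure(build_a)
```

If a build is reproducible, `matches_pinned(build_b, pinned)` holds for anyone who rebuilds the source, while any modified artifact fails the check.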
Nix is a commonly used tool for reproducible builds. After the program’s source code is made public, anyone can inspect the code to ensure that developers have not inserted any malicious content. Anyone can use Nix to build the code and check whether the resulting product has the same code measurement/hash as the one deployed by the project in the production environment. But how do we know the code measurement value of the program in the TEE? This involves the concept of ‘remote attestation’.
Remote attestation is a signed message from the TEE platform (the trusted party) containing the program’s code measurement, the TEE platform version, and so on. It lets external observers know that a program is running in a secure location that no one can access (a real TEE of version xx).
Reproducibility and remote attestation enable any user to know the actual code and TEE platform version information running inside the TEE, thereby preventing developers or servers from malicious activities.
However, using a TEE always requires trusting its supplier. A malicious TEE supplier can directly forge remote attestations. Therefore, if the supplier is considered a potential attack vector, avoid relying solely on TEE; it is preferable to combine it with ZK or consensus protocols.
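The check a user performs on a remote attestation can be sketched as follows. This is a simplified model, not any vendor’s actual format: the `Attestation` fields, the `accept` function, and the integer platform version are assumptions made for illustration, and the signature check itself is vendor-specific and assumed to have been done elsewhere.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attestation:
    measurement: str        # code measurement reported by the TEE platform
    platform_version: int   # TEE platform version reported by the platform
    signature_valid: bool   # outcome of the vendor-specific signature check (assumed done elsewhere)


def accept(att: Attestation, expected_measurement: str, min_platform_version: int) -> bool:
    """Accept only a correctly signed attestation for the expected code on a recent platform."""
    if not att.signature_valid:
        return False  # not produced by a genuine TEE platform key
    if att.measurement != expected_measurement:
        return False  # the TEE runs code other than the reproducible build we audited
    return att.platform_version >= min_platform_version
```

Here `expected_measurement` would be the value obtained by rebuilding the published source, which is what ties reproducible builds and remote attestation together.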
In our opinion, the TEE’s popularity, especially in terms of its deployment friendliness for AI Agents, is mainly due to the following factors:
Regardless of the pros and cons, it is currently difficult to find alternative solutions for many use cases that heavily rely on TEE. We believe that the introduction of TEE further expands the development space for on-chain applications, which may drive the emergence of new use cases.
Programs running in a TEE are still susceptible to a range of attacks and errors, much like smart contracts. For simplicity, we classify the potential vulnerabilities as follows:
Developer Negligence
Whether intentionally or not, developers can undermine the security guarantees of programs in a TEE through the code they write. This includes:
Runtime Vulnerability
Developers, no matter how cautious they are, can still become victims of runtime vulnerabilities. Developers must carefully consider whether any of the following will affect the security guarantees of their projects:
The technology stack used by TEE applications should be handled with caution. The following issues may arise when building TEE applications:
Last but not least, there are some practical considerations about how to actually operate a server that runs TEE programs:
We divide our recommendations into the following points:
1. The safest solution: no external dependencies
Creating highly secure applications may involve eliminating external dependencies, such as external inputs, APIs, or services, to reduce the attack surface. This approach ensures that the application runs independently without external interactions that could compromise its integrity or security. While this strategy may limit the diversity of the program’s functionality, it can provide a very high level of security.
If the model runs locally, this level of security can be achieved for most Crypto x AI use cases.
2. Necessary precautions
Whether or not the application has external dependencies, the following measures are required.
Treat TEE applications like smart contracts, not backend applications: update them infrequently and test them rigorously.
Building a TEE program should be as rigorous as writing, testing, and updating a smart contract. Like smart contracts, TEEs operate in a highly sensitive and immutable environment, where erroneous or unexpected behavior can lead to serious consequences, including a complete loss of funds. Thorough audits, extensive testing, and minimal, carefully audited updates are essential to the integrity and reliability of TEE-based applications.
Audit code and check build pipeline
The security of an application depends not only on the code itself, but also on the tools used in the build process. A secure build pipeline is essential to prevent breaches. The TEE only guarantees that the code it is given runs as provided; it cannot fix defects introduced during the build process.
To reduce risk, code must be rigorously tested and audited to eliminate errors and prevent unnecessary information leakage. In addition, reproducible builds play a crucial role, especially when the code is developed by one party and used by another. Reproducible builds allow anyone to verify that the program executed within the TEE matches the original source code, ensuring transparency and trust. Without a reproducible build, it is nearly impossible to determine the exact content of the executable running within the TEE, which compromises the security of the application.
For example, the source code for DeepWorm (a project that runs a worm brain simulation model in a TEE) is completely open source. The executables that run within the TEE are built reproducibly using Nix pipelines.
Use audited or verified libraries
When handling sensitive data in TEE programs, only use audited libraries for key management and private data processing. Unaudited libraries may expose keys and compromise the security of the application. Prioritize thoroughly reviewed, security-focused dependencies to maintain the confidentiality and integrity of the data.
Always verify attestations from the TEE
Users interacting with a TEE must verify the remote attestation it produces to ensure that the interaction is secure and trustworthy. Without this check, the server may manipulate responses, making it impossible to distinguish genuine TEE output from tampered data. Remote attestation provides critical evidence about the codebase and configuration running in the TEE, which lets us determine whether the program inside the TEE matches our expectations.
Attestations can be verified on-chain (Intel SGX, AWS Nitro), off-chain using ZK proofs (Intel SGX, AWS Nitro), by users themselves, or by managed services such as t16z or MarlinHub.
3. Recommendations that depend on the use case
Depending on the target use case and the structure of the application, the following tips may help make your application more secure.
Ensure that user interactions with TEE are always performed over a secure channel
The server that hosts the TEE is essentially untrusted: it can intercept and modify communications. In some cases it may be acceptable for the server to read data as long as it cannot modify it, while in other cases even reading the data is unacceptable. To mitigate these risks, establish a secure, end-to-end encrypted channel between the user and the TEE. At a minimum, ensure that each message carries a signature that verifies its authenticity and source. In addition, users should always check the remote attestation provided by the TEE to confirm that they are communicating with the correct TEE. Together, these measures protect the integrity and confidentiality of the communication.
For example, Oyster supports secure TLS issuance through the use of CAA records and RFC 8657. In addition, it provides a TEE-native TLS protocol called Scallop, which does not rely on WebPKI.
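The minimum requirement above, that every message can be checked for authenticity, can be illustrated with a small sketch. Note the substitution: instead of an asymmetric signature, this uses an HMAC tag, and it assumes a `session_key` that the user and the TEE have already agreed on through a key exchange bound to the TEE’s attestation. The function names are hypothetical, and this is not how Oyster or Scallop work internally.

```python
import hashlib
import hmac


def tag(session_key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over the message with the shared session key."""
    return hmac.new(session_key, message, hashlib.sha256).digest()


def verify(session_key: bytes, message: bytes, received_tag: bytes) -> bool:
    """Check the tag in constant time; False means the message was altered or forged."""
    return hmac.compare_digest(tag(session_key, message), received_tag)
```

A host that relays traffic but does not know `session_key` cannot change a message without the check failing. The tag protects integrity only; hiding the content from the host additionally requires encryption.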
Know that TEE memory is transient
TEE memory is transient, which means that when TEE is turned off, its contents (including encryption keys) will be lost. Without a secure mechanism to save this information, critical data may become permanently inaccessible, potentially causing financial or operational difficulties.
A multi-party computation (MPC) network combined with a decentralized storage system such as IPFS can solve this problem. The MPC network splits the key across multiple nodes, so that no single node holds the complete key, while still allowing the network to rebuild the key when needed. Data encrypted with this key can then be stored securely on IPFS.
If necessary, the MPC network can provide keys to a new TEE server running the same image, provided certain conditions are met. This approach ensures flexibility and strong security, maintaining data accessibility and confidentiality even in untrusted environments.
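To make the key-splitting idea concrete, here is a minimal sketch using Shamir secret sharing. The source does not say which scheme an MPC network would use, so Shamir is an assumption chosen for illustration; the function names, the prime, and the 3-of-5 threshold are likewise hypothetical, and a real network would add authentication and the attestation checks mentioned above before releasing shares.

```python
import secrets

# A Mersenne prime comfortably larger than any 256-bit key.
PRIME = 2**521 - 1


def split(secret: int, n: int, k: int):
    """Split `secret` into n shares such that any k of them can rebuild it."""
    # Random polynomial of degree k-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(k - 1)]

    def evaluate(x: int) -> int:
        acc = 0
        for c in reversed(coeffs):  # Horner's rule
            acc = (acc * x + c) % PRIME
        return acc

    return [(x, evaluate(x)) for x in range(1, n + 1)]


def combine(shares) -> int:
    """Rebuild the secret by Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

With `shares = split(key, 5, 3)`, any three shares passed to `combine` return `key`, while fewer than three reveal nothing about it.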
Another solution is for the TEE to hand the relevant transactions to different MPC servers, which sign them, aggregate the signatures, and finally put the transactions on-chain. This approach is much less flexible and cannot be used to hold API keys, passwords, or arbitrary data (there is no trusted third-party storage service).
Reduce the attack surface
For security-critical use cases, it is worth reducing peripheral dependencies as much as possible, even at the expense of the developer experience. For example, Dstack comes with a minimal Yocto-based kernel that contains only the modules Dstack needs to work. It might even be worth using an older technology like SGX (over TDX), because it does not require a bootloader or operating system to be part of the TEE.
Physical Isolation
Physically isolating the TEE from possible human intervention can further enhance its security. While we can trust data centers and cloud providers that host TEE servers to provide physical security, projects like Spacecoin are exploring a rather interesting alternative: space. The SpaceTEE paper relies on security measures, such as measuring inertia after launch, to verify whether the satellite deviated from expectations on its way to orbit.
Multiprovers
Just as Ethereum relies on multiple client implementations to reduce the risk of bugs that affect the entire network, multiprovers use different TEE implementations to improve security and resiliency. By running the same computational steps across multiple TEE platforms, a multiprover setup ensures that a vulnerability in one TEE implementation does not compromise the entire application. This approach requires the computation to be deterministic, or requires consensus to be defined between the different TEE implementations in non-deterministic cases. In return, it offers significant benefits such as fault isolation, redundancy, and cross-validation, making it a good choice for applications that require reliability guarantees.
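The cross-validation step of a multiprover setup can be sketched as a simple agreement check. This is an illustrative assumption about how results might be compared, not a description of any specific project: the function name, the quorum rule, and the platform labels are hypothetical, and each output is assumed to have already been tied to a verified attestation.

```python
from collections import Counter


def agreed_output(outputs: dict, quorum: int):
    """Return the output reported by at least `quorum` TEE platforms, or None.

    `outputs` maps a platform label to the result (for example, a hex digest)
    that the platform produced for the same deterministic computation.
    """
    if not outputs:
        return None
    value, count = Counter(outputs.values()).most_common(1)[0]
    return value if count >= quorum else None
```

Requiring every platform to agree gives the strongest fault isolation; a lower quorum trades some of that for availability when one platform is down or compromised.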
TEE has clearly become a very exciting field. As mentioned earlier, the ubiquity of AI and its continuous access to sensitive user data mean that large tech companies like Apple and NVIDIA are building TEE into their products and offering it as part of them.
On the other hand, the crypto community has always been very focused on security. As developers try to extend on-chain applications and use cases, TEE has become popular as a solution that offers the right trade-off between functionality and trust assumptions. While TEE is not as trust-minimized as a full ZK solution, we expect TEE to be the avenue through which the products of Web3 companies and large tech companies gradually converge.