# Cracks in the Shield: Understanding and Mitigating Side-Channel Threats to Confidential Computing

#### Jo Van Bulck

DistriNet, KU Leuven, Belgium
jo.vanbulck@cs.kuleuven.be
vanbulck.net

Graz Security Week, Sept 4, 2025

#### Trust?

#### **Confidential Computing: Reducing Attack Surface**

Traditional layered designs: Large trusted computing base

Trusted execution: Hardware-level isolation and attestation

#### The Rise of Trusted Execution Environments (TEEs)

- 2004: ARM TrustZone
- 2015: Intel Software Guard Extensions (SGX)
- 2016: AMD Secure Encrypted Virtualization (SEV)
- 2018: IBM Protected Execution Facility (PEF)
- 2020: AMD SEV with Secure Nested Paging (SEV-SNP)
- 2022: Intel Trust Domain Extensions (TDX)
- 2023: ARM Confidential Compute Architecture (CCA)
- 2024: NVIDIA Confidential Computing

### "Confidential Computing Today, Just Computing Tomorrow" \*

#### TEE Attack Research Leads the Way . . .

- Privileged TEE attacker models set the bar!
- Idealized execution environment for attack research
- Generalizations: e.g., Foreshadow-NG, branch prediction, address translation, etc.

#### Motivation: Why Research TEE/SGX Security?

Technologies such as disk- and network-traffic encryption protect data in storage and during transmission, but data can be vulnerable to interception and tampering while in use in memory. "Confidential computing" is a rapidly emerging usage category that protects data while it is in use in a Trusted Execution Environment (TEE).

Intel SGX is the most researched, updated and battle-tested TEE for data center confidential computing, with the smallest attack surface within the system. It enables application isolation in private memory regions, called enclaves, to help protect up to 1 terabyte of code and data while in use.
#### Idea: Page Faults as a Side Channel

SGX machinery protects against direct address remapping attacks

Xu et al. "Controlled-Channel Attacks: Deterministic Side Channels for Untrusted Operating Systems", IEEE S&P 2015.

#### Intel SGX: Page Faults as a Side Channel

⇒ Page fault traces leak **private control data/flow**

#### **Spatial Resolution: Page-Granular Memory Access Traces**

Detailed trace of (coarse-grained) code and data accesses over time...

#### **Protection from Side-Channel Attacks**

Intel® SGX does not provide explicit protection from side-channel attacks. It is the enclave developer's responsibility to address side-channel attack concerns.

In general, enclave operations that require an OCall, such as thread synchronization, I/O, etc., are exposed to the untrusted domain. If using an OCall would allow an attacker to gain insight into enclave secrets, then there would be a security concern. This scenario would be classified as a side-channel attack, and it would be up to the ISV to design the enclave in a way that prevents the leaking of side-channel information.

An attacker with access to the platform can see what pages are being executed or accessed. This side-channel vulnerability can be mitigated by aligning specific code and data blocks to exist entirely within a single page. More important, the application enclave should use an appropriate crypto implementation that is side channel attack resistant inside the enclave if side-channel attacks are a concern.
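The page-fault channel described above can be illustrated with a small simulation. This is a simplified sketch, not the attack code from Xu et al.: the addresses, the toy square-and-multiply victim, and all function names are assumptions made for illustration. It models the untrusted OS as observing a fault only when execution moves to a different 4 KiB page, and then applies the single-page alignment suggested in the Intel guidance quoted above.

```python
PAGE_SIZE = 4096  # 4 KiB pages: the attacker's spatial resolution

# Hypothetical code addresses, chosen only for illustration.
LOOP_ADDR = 0x400000
SQUARE_ADDR = 0x401000
MULTIPLY_ADDR = 0x402000


def page_of(addr):
    """Page number that contains the given address."""
    return addr // PAGE_SIZE


def victim_accesses(secret_bits, loop=LOOP_ADDR, square=SQUARE_ADDR,
                    multiply=MULTIPLY_ADDR):
    """Code pages touched by a toy square-and-multiply loop, in program order."""
    pages = []
    for bit in secret_bits:
        pages.append(page_of(loop))          # loop header runs every iteration
        pages.append(page_of(square))        # square runs every iteration
        if bit:
            pages.append(page_of(multiply))  # multiply runs only for 1-bits
    return pages


def observed_faults(pages):
    """Attacker view: a fault is seen only when execution moves to another page."""
    faults = []
    for page in pages:
        if not faults or faults[-1] != page:
            faults.append(page)
    return faults


def recover_bits(faults, square=SQUARE_ADDR, multiply=MULTIPLY_ADDR):
    """Rebuild the secret bits from the fault trace alone."""
    bits = []
    for page in faults:
        if page == page_of(square):
            bits.append(0)   # new iteration; stays 0 unless a multiply follows
        elif page == page_of(multiply):
            bits[-1] = 1     # multiply page seen: the current bit was 1
    return bits


secret = [1, 0, 1, 1, 0]
leaky_trace = observed_faults(victim_accesses(secret))
recovered = recover_bits(leaky_trace)

# Mitigation from the quoted guidance: keep all three routines on one page.
single_page_trace = observed_faults(
    victim_accesses(secret, loop=0x400000, square=0x400400, multiply=0x400800)
)

print(recovered)          # [1, 0, 1, 1, 0]
print(single_page_trace)  # [1024]
```

With the three routines on separate pages, the recovered bits match the secret. With all three on one page, the modeled attacker observes a single page number regardless of the secret, so the page-granular trace carries no information about the bits in this model.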
#### Temporal Resolution Limitations for the Page-Fault Oracle

⇒ tight loop: 4 instructions, single memory operand, single code + data page

#### Counting strlen loop iterations?

Note: Page-fault attacks cannot make progress for 1 code + data page

Progress requires both pages present (non-faulting) ↔ page fault oracle

#### **Building the Side-Channel Oracle with Execution Timing?**

**Too noisy:** modern x86 processors are lightning fast...

#### **Challenge: Side-Channel Sampling Rate**

#### **SGX-Step: Executing Enclaves one Instruction at a Time**

https://github.com/jovanbulck/sgx-step

#### **SGX-Step Demo: Single-Stepping Password Comparison**

#### SGX-Step: Enabling a New Line of High-Resolution Attacks

| Yr | Venue | Paper | Step | Use Case | Drv |
|-----|---------|-------------------|------------|------------------------|-----|
| '15 | S&P | Ctrl channel | ~ Page | Probe (page fault) | ✓ |
| '16 | ESORICS | AsyncShock | ~ Page | Exploit (mem safety) | - |
| '17 | CHES | CacheZoom | × >1 | Probe (L1 cache) | ✓ |
| '17 | ATC | Hahnel et al. | × 0 - >1 | Probe (L1 cache) | ✓ |
| '17 | USENIX | BranchShadow | × 5 - 50 | Probe (BPU) | × |
| '17 | USENIX | Stealthy PTE | ~ Page | Probe (page table) | ✓ |
| '17 | USENIX | DarkROP | ~ Page | Exploit (mem safety) | ✓ |
| '17 | SysTEX | SGX-Step | ✓ 0 - 1 | Framework | ✓ |
| '18 | ESSoS | Off-limits | ✓ 0 - 1 | Probe (segmentation) | ✓ |
| '18 | AsiaCCS | Single-trace RSA | ~ Page | Probe (page fault) | ✓ |
| '18 | USENIX | Foreshadow | ✓ 0 - 1 | Probe (transient exec) | ✓ |
| '18 | EuroS&P | SgxPectre | ~ Page | Exploit (transient) | ✓ |
| '18 | CHES | CacheQuote | × >1 | Probe (L1 cache) | ✓ |
| '18 | ICCD | SGXlinger | × >1 | Probe (IRQ latency) | × |
| '18 | CCS | Nemesis | ✓ 1 | Probe (IRQ latency) | ✓ |
| '19 | USENIX | Spoiler | ✓ 1 | Probe (IRQ latency) | ✓ |
| '19 | CCS | ZombieLoad | ✓ 0 - 1 | Probe (transient exec) | ✓ |
| '19 | CCS | Fallout | - | Probe (transient exec) | ✓ |
| '19 | CCS | Tale of 2 worlds | ✓ 1 | Exploit (mem safety) | ✓ |
| '19 | ISCA | MicroScope | ~ 0 - Page | Framework | × |
| '20 | CHES | Bluethunder | ✓ 1 | Probe (BPU) | ✓ |
| '20 | USENIX | Big troubles | ~ Page | Probe (page fault) | ✓ |
| '20 | S&P | Plundervolt | - | Exploit (undervolt) | ✓ |
| '20 | CHES | Viral primitive | ✓ 1 | Probe (IRQ count) | ✓ |
| '20 | USENIX | CopyCat | ✓ 1 | Probe (IRQ count) | ✓ |
| '20 | S&P | LVI | ✓ 1 | Exploit (transient) | ✓ |
| '20 | CHES | A to Z | ~ Page | Probe (page fault) | ✓ |
| '20 | CCS | Déjà Vu NSS | ~ Page | Probe (page fault) | ✓ |
| '20 | MICRO | PTHammer | - | Probe (page walk) | ✓ |
| '21 | USENIX | Frontal | ✓ 1 | Probe (IRQ latency) | ✓ |
| '21 | S&P | CrossTalk | ✓ 1 | Probe (transient exec) | ✓ |
| '21 | CHES | Online template | ✓ 1 | Probe (IRQ count) | ✓ |
| '21 | NDSS | SpeechMiner | - | Framework | ✓ |
| '21 | S&P | Platypus | ✓ 0 - 1 | Probe (voltage) | ✓ |
| '21 | DIMVA | Aion | ✓ 1 | Probe (cache) | ✓ |
| '21 | CCS | SmashEx | ✓ 1 | Exploit (mem safety) | ✓ |
| '21 | CCS | Util::Lookup | ✓ 1 | Probe (L3 cache) | ✓ |
| '22 | USENIX | Rapid prototyping | ✓ 1 | Framework | ✓ |
| '22 | CT-RSA | Kalyna expansion | ✓ 1 | Probe (L3 cache) | ✓ |
| '22 | SEED | Enclyzer | - | Framework | ✓ |
| '22 | NordSec | Self-monitoring | ~ Page | Defense (detect) | ✓ |
| '22 | AutoSec | Robotic vehicles | ✓ 1 - >1 | Exploit (timestamp) | ✓ |
| '22 | ACSAC | MoLE | ✓ 1 | Defense (randomize) | ✓ |
| '22 | USENIX | AEPIC | ✓ 1 | Probe (I/O device) | ✓ |
| '22 | arXiv | Confidential code | ✓ 1 | Probe (IRQ latency) | ✓ |
| '23 | ComSec | FaultMorse | ~ Page | Probe (page fault) | ✓ |
| '23 | CHES | HQC timing | ✓ 1 | Probe (L3 cache) | ✓ |
| '23 | ISCA | Belong to us | ✓ 1 | Probe (BPU) | ✓ |
| '23 | USENIX | BunnyHop | ✓ 1 | Probe (BPU) | ✓ |
| '23 | USENIX | DownFall | ✓ 0 - 1 | Probe (transient exec) | ✓ |
| '23 | USENIX | AEX-Notify | ✓ 1 | Defense (prefetch) | ✓ |

#### **SGX-Step: A Versatile Open-Source Attack Framework**

## Single-Stepping Beyond Intel SGX

Based on slide from Luca Wilke.

#### **Root-causing SGX-Step: Aiming the timer interrupt**

#### **Root-causing SGX-Step: Microcode assists to the rescue!**

| PTE A-bit | Mean (cycles) | Stddev (cycles) |
|-----------|---------------|-----------------|
| A=1 | 27 | 30 |
| A=0 | 666 | 55 |

[Figure labels: "Clear PTE A-bit", "2. TLB flush", "Arm timer", "ERESUME", "3. Assisted PT walk", "page walk ($RIP)", "NOP<sub>1</sub>"]

Constable et al. "AEX-Notify: Thwarting Precise Single-Stepping Attacks through Interrupt Awareness for Intel SGX Enclaves", USENIX Security 2023.

#### **AEX-Notify: Hardware-Software Co-Design Solution**

# CHAPTER 8 ASYNCHRONOUS ENCLAVE EXIT NOTIFY AND THE EDECCSSA USER LEAF FUNCTION

#### 8.1 INTRODUCTION

Asynchronous Enclave Exit Notify (AEX-Notify) is an extension to Intel® SGX that allows Intel SGX enclaves to be notified after an asynchronous enclave exit (AEX) has occurred. EDECCSSA is a new Intel SGX user leaf function (ENCLU[EDECCSSA]) that can facilitate AEX notification handling, as well as […]

The following list summarizes the […] details are provided in Section 8.3)

- SECS.ATTRIBUTES.AEXNOTIFY
- TCS.FLAGS.AEXNOTIFY: This e[…]
- SSA.GPRSGX.AEXNOTIFY: Enclave-writable byte that allows enclave software to dynamically enable/disable AEX notifications.

An AEX notification is delivered by ENCLU[ERESUME] when the following conditions are met:

→ shipped in millions of devices ≥ 4th Gen Xeon CPU

Written by Michael Larabel in Intel on 6 November 2022 at 06:01 AM EST.

Future Intel CPUs and some existing processors via a microcode update will support a new feature called the Asynchronous EXit (AEX) notification mechanism to help with Software Guard Extensions (SGX) enclave security. Patches for the Linux kernel are pending for implementing this Intel AEX Notify support with capable processors.
Intel's Asynchronous EXit (AEX) notification mechanism lets SGX enclaves run a handler after an AEX event. Those handlers can be used for things like mitigating SGX-Step as an attack framework for precise enclave execution control.

SGX-Step led to changes in major OSs and enclave SDKs

#### There's a Catch...

Finally note that our proposed mitigation does not protect against interrupting enclaves and observing application code and data page accesses at a coarse-grained 4 KiB spatial resolution. In contrast to the fine-grained, instruction-granular interrupt-driven attacks we consider in this work, such controlled-channel attacks have received ample attention [18, 47, 56, 59] from the research community.

### Why Mitigating Single-Stepping is Not Enough

Original (left), Xu et al. (middle), our attack with AEX-Notify single-stepping defense (right)

#### Libjpeg: AEX-Notify's Temporal Reduction in Practice

# Idea: TLB as a "Filter" to Hide Page Accesses

#### TLBlur: Self-Monitoring and Restoring Enclave Page Accesses

#### Instrumentation to Self-Monitor Page Accesses at Runtime

#### Leakage Reduction in Practice: Libjpeg Single-Stepping

## Leakage Reduction in Practice: Libjpeg Page Faults

### Leakage Reduction in Practice: Libjpeg TLBlur (N=10)

## Leakage Reduction in Practice: Libjpeg TLBlur (N=20)

## Leakage Reduction in Practice: Libjpeg TLBlur (N=30)

#### **TLBlur: Compiler-Assisted Leakage Reduction in Practice**

Automated "blurring" of page-access traces in space and time

## **Confidential-Computing: Off-Chip Attacks?**

In order to conceal the enclave data while it is out of the CPU package, the memory controller has an encryption engine to transparently encrypt and decrypt enclave memory.
In CPUs prior to Ice Lake, the Memory Encryption Engine (MEE) is used to encrypt pages leaving the CPU caches.
MEE uses an n-ary Merkle tree with root in SRAM to maintain integrity of the encrypted data. This provides integrity and anti-replay protection but does not scale to large memory sizes because the time required to update the Merkle tree grows logarithmically in relation to the memory size.

CPUs starting from Icelake use Total Memory Encryption (TME) in the place of MEE. TME-based SGX implementations do not have an integrity Merkle tree, which means integrity and replay attacks are not mitigated. But it includes additional changes to prevent cipher text from being returned and SW memory aliases from being created.

DMA to enclave memory is blocked by range registers on both MEE and TME systems (SDM section 41.10).

#### **Background: TEE Trust in DRAM**

- Originally: Strong cryptographic protection
- Limitations: Overhead, small size, ...
- Scalable solutions move away from strong cryptographic guarantees

| TEE | Encryption | Confidentiality | Integrity | Freshness |
|--------------------|---------------|-----------------|-----------|-----------|
| Classic Intel SGX | AES-CTR | ✓ | ✓ | ✓ |
| Scalable Intel SGX | AES-XTS | ✓ | × | × |
| Intel TDX | AES-XTS | ✓ | × | × |
| AMD SEV-SNP | AES-XEX | ✓ | × | × |
| Arm CCA | AES-XEX/QARMA | ✓ | × | × |

# **Idea: Memory Aliasing Attacks**

# DDR3: The "Paperclip" Attack...

DDR3 <> **DDR4...**

#### **Background: Memory Initialization**

- BIOS programs memory controller based on DIMM configuration
- Incorrect configuration leads to inconsistent memory view

## **SPD-Based Memory Address Aliasing**

- BIOS configures memory controller
- Malicious SPD contents introduce aliases

### **Breaking AMD SEV-SNP**

- Attestation
- SP takes measurement of launched VM
- Encrypted under SP key
- No freshness, can replay launch digest

## **Conclusions and Take-Away**

New era of confidential computing for the cloud and IoT ... but current architectures are **not perfect!**

Scientific understanding driven by attacker-defender race