# Enforcing Speculative Non-Interference with Hardware-Software Codesign

Nicholas Mosier,<sup>1</sup> Hamed Nemati,<sup>2</sup> John C. Mitchell,<sup>1</sup> Caroline Trippel<sup>1</sup>
April 16<sup>th</sup>, 2025 • Stanford Security Lunch

$$train the CPU to predict index in-bounds$$

$$victim(x = 0) \\ victim(x = 0) \\ victim(x = 0)$$

```
if (x < 10) {

correct path
```

```
attacker
victim(x = 0)
victim(x = 0)
victim(x = 0)
victim(x = 100)
```

```
branch misprediction if (x < 10) {
y = A[x]
z = B[y]
```

```
attacker
victim(x = 0)
victim(x = 0)
victim(x = 0)
victim(x = 100)
```







Spectre attacks exploit control- or data-flow mispredictions in hardware to transiently leak secret data via transmitters.



source of misprediction



#### transient transmitter

- transient: not architecturally committed
- <u>transmitter</u>: unsafe instruction whose execution creates operand-dependent resource usage





hardware side channel

resource modulated by transmitter



## Spectre attacks [Kocher+ S&P'19]

Spectre attacks exploit control- or data-flow mispredictions in hardware to transiently leak secret data via transmitters.

#### speculation primitive

source of misprediction



#### transient transmitter

- transient: not architecturally committed
- transmitter: unsafe instruction whose execution creates operand-dependent resource usage

#### our focus

(sufficient to reason about security of Spectre defenses)



hardware side channel

resource modulated by transmitter

#### receiver

observer of side channel

## **Speculation Primitives**

#### speculation primitive

source of misprediction

conditional branch prediction [Kocher+ S&P'19] indirect branch prediction [Kocher+ S&P'19]

return address prediction [Koruyeh+ S&P'18]

control-flow

data-flow

branch type confusion [AMD+ '22] μcode branch prediction [Mosier+arXiv'25]

store-to-loadforwarding [Horn GPZ'18] predictive store forwarding [AMD '21]

load address prediction [Kim+ S&P'25] load value prediction [Kim+ USENIX'25]

• • •



#### transient transmitter

- transient: not architecturally committed
- <u>transmitter</u>: unsafe instruction whose execution creates operand-dependent resource usage

### **Transient Transmitters**





#### transient transmitter

- transient: not architecturally committed
- <u>transmitter</u>: unsafe instruction whose execution creates operand-dependent resource usage

### **Transient Transmitters**





#### transient transmitter

- transient: not architecturally committed
- transmitter: unsafe instruction whose execution creates operand-dependent resource usage

```
control flow

if (unsafe)

variable-time operations

i = unsafe/unsafe
f = unsafe * 2.5
```

Any program with secrets in its address space is **vulnerable** to Spectre attacks.

```
if (x < 10) {
    y = A[x]
    leak(y)
}</pre>
```













What does it mean for a program to be **secure** against Spectre attacks?

## Speculative Non-Interference

Formalized as a two-trace property over architectural and microarchitectural executions

A program execution satisfies **speculative non-interference (SNI)** if it leaks no more information transiently than it does sequentially [Guarnieri+ S&P'20].

#### Execution 1:



x leaked transiently but not sequentially

#### **Execution 2:**

satisfies SNI

x leaked sequentially → safe to leak transiently

#### **Execution 3:**

× violates SNI

x&1 leaked transiently but not sequentially

#### Execution 4:

satisfies SNI

x/2 is a function of x, which leaked sequentially

black = sequentially executed

gray = transiently executed

We present **Mieros**, the most performant Spectre defense to comprehensively enforce SNI to date.

### Outline

- Framework for enforcing SNI
- Mieros, a Spectre defense that enforces SNI
- Evaluation and results
- Conclusion

### Outline

- Framework for enforcing SNI
- Mieros, a Spectre defense that enforces SNI
- Evaluation and results
- Conclusion

### How to design a Spectre defense that enforces SNI?



## How to design a Spectre defense that enforces SNI? Protection policies and protection mechanisms.



In our framework, a Spectre defense is composed of a **protection policy** and a **protection mechanism**.

- <u>Protection policy</u>: defines *what data* the defense protects against leaking transiently
- <u>Protection mechanism</u>: defines *how* the defense prevents that data from leaking transiently

#### A Spectre defense enforces SNI if:

- 1. The protection policy protects all data that is not leaked via a sequential transmitter.
- 2. The protection mechanism prevents all transient transmitters from leaking any protected data.

## A naive protection policy for SNI



In our framework, a Spectre defense is composed of a **protection policy** and a **protection mechanism**.

- <u>Protection policy</u>: defines *what data* the defense protects against leaking transiently
- <u>Protection mechanism</u>: defines *how* the defense prevents that data from leaking transiently

#### A Spectre defense **enforces SNI** if:

- 1. The protection policy protects all data that is not leaked via a sequential transmitter.
- 2. The protection mechanism prevents all transient transmitters from leaking any protected data.

## A naive protection policy for SNI



In our framework, a Spectre defense is composed of a **protection policy** and a **protection mechanism**.

- <u>Protection policy</u>: defines *what data* the defense protects against leaking transiently
- <u>Protection mechanism</u>: defines *how* the defense prevents that data from leaking transiently

#### A Spectre defense **enforces SNI** if:

- 1. The protection policy protects all data that is not leaked via a sequential transmitter.
- 2. The protection mechanism prevents all transient transmitters from leaking any protected data.

## Enforcing SNI with protection policies and mechanisms



In our framework, a Spectre defense is composed of a **protection policy** and a **protection mechanism**.

- <u>Protection policy</u>: defines what data the defense protects against leaking transiently
- <u>Protection mechanism</u>: defines *how* the defense prevents that data from leaking transiently

A Spectre defense enforces SNI if:

- 1. The protection policy protects all data that is not leaked via a sequential transmitter.
- 2. The protection mechanism prevents all transient transmitters from leaking any protected data.



The protection policy should **unprotect** data that leaks in the sequential execution.

## Enforcing SNI with protection policies and mechanisms



In our framework, a Spectre defense is composed of a **protection policy** and a **protection mechanism**.

- <u>Protection policy</u>: defines *what data* the defense protects against leaking transiently
- <u>Protection mechanism</u>: defines *how* the defense prevents that data from leaking transiently

A Spectre defense enforces SNI if:

- 1. The protection policy protects all data that is not leaked via a sequential transmitter.
- 2. The protection mechanism prevents all transient transmitters from leaking any protected data.



The protection policy should unprotect data that leaks in the sequential execution.

## Protection policies can safely unprotect **past-leaked** and **bound-to-leak** data under SNI.

#### **Past-Leaked Data**

data that has already leaked in the sequential execution

#### Execution 4:

```
if (...)
  leak(x/2)
leak(x)
// x past-leaked
```

- easy to detect at runtime in hardware
- easy to detect at compile time

#### **Bound-to-Leak Data**

data that will leak along all future sequential control-flow paths

#### Execution 4:

```
// x bound-to-leak
if (...)
  leak(x/2)
leak(x)
```

- difficult to impossible to detect at runtime in hardware
- easy to detect at compile time



## A case for hardware protection mechanisms

|                                                                            | software-based         |
|----------------------------------------------------------------------------|------------------------|
| can consider all speculation primitives?                                   | no → non-comprehensive |
| restrict speculation only under microarchitecturally necessary conditions? | no → non-performant    |



Protection mechanisms should be implemented **in hardware**.

## Comprehensively and efficiently enforcing SNI requires hardware-compiler codesign.



## **Mieros** comprehensively enforces SNI via three hardware-compiler codesigned parts.



## SNICC: a compile-time protection policy for SNI

1. SNICC uses two static data-flow analyses to identify bound-to-leak and past-leaked registers at each program point.

## SNICC's Bound-to-Leak Analysis (Backward)

Intuitively, mark a register as bound-to-leak if it is passed to a transmitter along all future control-flow paths or can be inferred from data that will be.

#### **Transfer function:**

r is bound-to-leak before I if:

- a) I := leak(r), i.e., I transmits r
- b) I := r' = injective\_op(r) and r' is bound-to-leak after I
- c) r is bound-to-leak after and unmodified by I

Case (a) Case (b) Case (c) 
$$\{x\}$$
 leak(x)  $\{a\}$  a = b  $\{x\}$  nop

Meet function (set intersection):

r is bound-to-leak after I if r is boundto-leak before all successors I' of I.



SNICC conservatively underapproximates the set of bound-to-leak registers by iteratively applying these rules.

## SNICC: a compile-time protection policy for SNI

 SNICC uses two static data-flow analyses to identify bound-to-leak and past-leaked registers at each program point.

```
bound-to-leak and
past-leaked registers

source

if (*p)
    leak(x)
leak(x+1)

    t = x+1
    leak(t)
}
```

## SNICC: a compile-time protection policy for SNI

- 1. SNICC uses two static data-flow analyses to identify **bound-to-leak** and **past-leaked** registers at each program point.
- 2. SNICC assigns a **protection type**, PROT or UNPROT, to **each register definition**, based on whether it is bound-to-leak or past-leaked.





## Protection Type Extensions (PTeX): architectural and hardware support for tracking SNICC's protection policy

#### **Architectural Extension: PROT** prefix

Presence/absence of PROT prefix implies
 PROT-/UNPROT-TYPED output register

```
UNPROT x = x

UNPROT p = p

PROT b = *p

if (b)

leak(x)

leak(x+1)
```

**PTeX** 

(HW-SW interface)



## Protection Type Extensions (PTeX): architectural and hardware support for tracking SNICC's protection policy

#### Architectural Extension: PROT prefix

Presence/absence of PROT prefix implies
 PROT-/UNPROT-TYPED output register



**PTeX** (HW-SW interface)



#### Hardware Extension

Given **prot** prefixes, PTeX associates a **protection tag** with each:

- register
- memory byte, up through the L1D

#### protected register file

| reg | data | prot |  |
|-----|------|------|--|
| х   | 42   | 1    |  |
| р   | 100  | 1    |  |
| b   | 1    | 1    |  |

#### protected L1D cache

| addr | data | prot |
|------|------|------|
|      |      |      |
|      |      |      |
|      |      |      |

32



### Naive Protection Mechanism

#### Recall:

- A protection mechanism defines how the defense prevents that data from leaking transiently.
- To enforce SNI, it must prevent all transient transmitters from leaking any protected data.



<u>Naive protection mechanism</u>: block the speculative execution of **protected transmitters** (transmitters with PTeX-protected input registers).



```
prot x = *p
return
y = x
leak(y)  SNI violation!

allows unprotected transmitter to
execute transiently, leaking x
```

## **Taint Primitives**



### **Taint Primitives**

#### taint primitive

an instruction that speculatively copies data from *protected state* into *unprotected state* 

(state = register or memory byte)

```
<u>reg</u> → <u>reg</u>
```

req → mem

## return z = \*p

<u>mem</u> → <u>reg</u>

mem → mem

#### impossible

stores update memory protection to agree with data register protection

#### impossible

no instructions can directly copy memory



PTeX-enabled hardware exhibits two classes of taint primitives: reg→reg and mem→reg.

### SNI violations on PTeX hardware

Theorem. On PTeX hardware, all SNI violations are due to a transiently transmitted register that is...

- (a) tagged protected, or
- (b) tagged unprotected but depends on a reg→reg or mem→reg taint primitive.

the register is tainted



Inspires complete protection mechanism, **Taint Primitive Tracking**:

- (a) block protected transmitters from executing speculatively
- (b) block tainted transmitters from executing speculatively

- taints unprotected registers that depend on a taint primitive,
- · tracks the youngest taint primitive (YTP) that tainted registers depend on, and
- **stalls transmitters** with protected or tainted inputs
- untaints registers once their YTP becomes sequential (non-speculative)

#### register file

| reg→reg taint primitive | I1: $\mathbf{x}_{\mathbf{I1}} = \mathbf{a}$ |
|-------------------------|---------------------------------------------|
| reg→mem taint primitive | $I2: y_{T2}^{-} * p$                        |
| unprotected register op | I3: $z_{12} = x + y_{12}$                   |
|                         | I4: leak( <b>z</b> <sub>I2</sub> )          |
|                         | I4: leak(a) 🧮                               |

| reg | data | prot | taint | YTP |
|-----|------|------|-------|-----|
| а   | 42   | 1    | ı     | -   |

(gray = speculative)

- taints unprotected registers that depend on a taint primitive,
- tracks the youngest taint primitive (YTP) that tainted registers depend on, and
- stalls transmitters with protected or tainted inputs
- untaints registers once their YTP becomes sequential (non-speculative)

#### register file

| reg→reg taint primitive | I1: $x_{I1} = a$                    |
|-------------------------|-------------------------------------|
| reg→mem taint primitive | I2: $\mathbf{y}_{T2} = *\mathbf{p}$ |
| unprotected register op | I3: $z_{12} = x + y_{12}$           |
|                         | I4: leak( $z_{12}$ )                |
|                         | I4: leak(a)∑                        |

| reg | data | prot | taint | YTP |
|-----|------|------|-------|-----|
| а   | 42   | 1    | -     | -   |
| Х   | 42   | 0    | 0     | -   |
| У   | 10   | 0    | 1     | 12  |
| Z   | 52   | 0    | 1     | 12  |

- taints unprotected registers that depend on a taint primitive,
- tracks the youngest taint primitive (YTP) that tainted registers depend on, and
- stalls transmitters with protected or tainted inputs
- untaints registers once their YTP becomes sequential (non-speculative)

### reg→reg taint primitive reg→mem taint primitive unprotected register op

I1: 
$$x = a$$

I2:  $y_{I2} = *p$ 

I3:  $z_{I2} = x + y_{I2}$ 

I4:  $leak(z_{I2}) \times x$ 

I4:  $leak(a) \times x$ 

#### register file

| reg | data | prot | taint | YTP |
|-----|------|------|-------|-----|
| а   | 42   | 1    | -     | -   |
| Х   | 42   | 0    | 0     | -   |
| У   | 10   | 0    | 0     | -   |
| Z   | 52   | 0    | 0     | -   |

- taints unprotected registers that depend on a taint primitive,
- tracks the youngest taint primitive (YTP) that tainted registers depend on, and
- stalls transmitters with protected or tainted inputs
- untaints registers once their YTP becomes sequential (non-speculative)

#### register file

| reg→reg taint primitive | I1: x = a      |
|-------------------------|----------------|
| reg→mem taint primitive | I2: y = *p     |
| unprotected register op | I3: z = x + y  |
|                         | I4: leak(z)    |
|                         | I4: leak(a)  ▼ |

| reg | data | prot | taint | YTP |
|-----|------|------|-------|-----|
| а   | 42   | 1    | -     | -   |
| Х   | 42   | 0    | 0     | -   |
| У   | 10   | 0    | 0     | -   |
| Z   | 52   | 0    | 0     | -   |

- taints unprotected registers that depend on a taint primitive,
- tracks the youngest taint primitive (YTP) that tainted registers depend on, and
- stalls transmitters with protected or tainted inputs
- untaints registers once their YTP becomes sequential (non-speculative)

#### register file

| reg→reg taint primitive | I1: x = a                  |
|-------------------------|----------------------------|
| reg→mem taint primitive | I2: y = *p                 |
| unprotected register op | I3: z = x + y              |
|                         | I4: leak( <mark>z</mark> ) |
|                         | I4: leak(a) ₹              |

| reg | data | prot | taint | YTP |
|-----|------|------|-------|-----|
| а   | 42   | 1    | -     | -   |
| Х   | 42   | 0    | 0     | -   |
| У   | 10   | 0    | 0     | -   |
| Z   | 52   | 0    | 0     | -   |

- taints unprotected registers that depend on a taint primitive,
- tracks the youngest taint primitive (YTP) that tainted registers depend on, and
- stalls transmitters with protected or tainted inputs
- untaints registers once their YTP becomes sequential (non-speculative)

#### register file

| reg→reg taint primitive | I1: x = a     |
|-------------------------|---------------|
| reg→mem taint primitive | I2: y = *p    |
| unprotected register op | I3: z = x + y |
|                         | I4: leak(z)   |
|                         | T4: leak(a)▼  |

| reg | data | prot | taint | YTP |
|-----|------|------|-------|-----|
| а   | 42   | 1    | -     | -   |
| Х   | 42   | 0    | 0     | -   |
| У   | 10   | 0    | 0     | -   |
| Z   | 52   | 0    | 0     | -   |

## Outline

- Framework for enforcing SNI
- Mieros, a Spectre defense that enforces SNI
- Evaluation and results
- Conclusion

# Mieros Implementation



## Experimental Setup

- Evaluated on the SPEC2017, PARSEC, and crypto benchmarks
- Baselines (run on base binaries):
  - Unsafe: unmodified gem5 O3 CPU model

only prior Spectre defense to comprehensively enforce SNI

- Secure: Speculative Privacy Tracking (SPT) [Choudhary+ MICRO'21]
- Hardware config resembles Intel Alder Lake hybrid processor:
  - 8 performance cores (P-cores)
  - 8 efficiency cores (E-cores)

## Results: SPEC2017



#### Mieros: 49% overhead

SPT: 81% overhead



#### Runtime on the SPEC2017 benchmarks, E-core

Mieros: 18% overhead

SPT: 37% overhead



## Results: PARSEC





Mieros: 38% overhead

SPT: 62% overhead

# Results: crypto





## Outline

- Framework for enforcing SNI
- Mieros, a Spectre defense comprehensively and performantly enforcing SNI
- Evaluation and results
- Conclusion

## Conclusion and Future Work

- We propose Mieros, the first Spectre defense to use hardwarecompiler codesign to comprehensively enforce speculative noninterference.
- Mieros demonstrates how static analysis, new ISA, and hardware support can together achieve performant and secure speculation.
- Future work
  - Extending SNICC with more complex static analyses
  - Dynamic protection types (rather than static ones)
  - Program transformations to eliminate/reduce taint primitives at compile time

# Questions?

nmosier@stanford.edu

# **END**

# Backup Slides

## Speculation Primitives

control-flow



branch prediction

```
(*f)(x)
            leak(x)
foo()
  return
indirect branch prediction
```







branch type confusion



microcode branch misprediction

```
data-flow
```

```
secret;
```

store-to-load forwarding

```
secret;
leak(*q):
```

predictive store forwarding



# Security-critical projects remain vulnerable to Spectre attacks, 8 years later



#### Spectre and Meltdown Attacks Against OpenSSL

The OpenSSL Technical Committee (OTC) was recently made aware of several potential attacks against the OpenSSL libraries which might permit information leakage via the <u>Spectre</u> attack. ¹ Although there are currently no known exploits for the Spectre attacks identified, it is plausible that some of them might be exploitable.

Local side channel attacks, such as these, are outside the scope of our <u>security policy</u>, however the project generally does introduce mitigations when they are discovered. In this case, the OTC has decided that <u>these attacks will **not**</u> be mitigated by changes to the OpenSSL code base. The full reasoning behind this is given below.

https://openssl-library.org/post/2022-05-13-spectre-meltdown/





#### Spectre variant 1

For the Spectre variant 1, vulnerable kernel code (as determined by code audit or scanning tools) is annotated on a case by case basis to use nospec accessor macros for bounds clipping [2] to avoid any usable disclosure gadgets. However, it may not cover all attack vectors for Spectre variant 1.

Conclusion





For the reasons above, we now assume any active code can read any data in the same address space. The plan going forward must be to keep sensitive crossorigin data out of address spaces that run untrustworthy code, rather than relying on in-process checks.

https://chromium.googlesource.com/chromium/src/+/master/docs/security/side-channel-threat-model.md

## SNICC's Past-Leaked Analysis (Forward)

Intuitively, marks a register as past-leaked if it is computed from constant or transmitted data along all incoming control-flow paths.

#### **Transfer function**

r is past-leaked after I if:

- a) I := leak(r), i.e., I transmits r
- b) I :=  $r = regop(r_1, ..., r_n)$  where  $r_1, ..., r_n$  are past-leaked before I
- c) r is past-leaked before and unmodified by I

Meet function (set intersection)

r is past-leaked before I if r is past-leaked after all predecessors I' of I



SNICC conservatively underapproximates the set of bound-to-leak registers by iteratively applying these rules.