A new design for microarchitectural weird machines, along with a compiler and a proof-of-concept packer application.
For more details, please refer to our paper:
Ping-Lun Wang, Riccardo Paccagnella, Riad S. Wahby, Fraser Brown. "Bending microarchitectural weird machines towards practicality." USENIX Security, 2024.
- What are microarchitectural weird machines?
- Hardware requirements
- Reproduce our results
- Install the Flexo compiler
- Compile a weird machine
- Example: basic logic gates
- Configure the Flexo compiler
- Create a new weird machine with C/C++
- UPFlexo: UPX packer with Flexo weird machines
- Run Flexo on an unsupported processor
- Contacts
Microarchitectural weird machines (µWMs) are code gadgets that perform computation purely through microarchitectural side effects. They work similarly to a binary circuit: they use weird registers to store values and use weird gates to compute with them.
For example, here is a weird AND gate with two inputs and one output:
Out[In1[0] + In2[0]]
`In1` and `In2` are the two input weird registers, and this AND gate outputs to the weird register `Out`.
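To make the gate's logic concrete, here is a toy software model in C++. It is only a sketch under stated assumptions: it assumes a weird register holds 1 when its cache line is present and 0 when it is not, and it replaces the real timing behavior with a simple set lookup. The names `ToyCache` and `toy_weird_and` are hypothetical and do not come from the Flexo code base.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>

// Toy model (assumption): a weird register stores 1 when its cache line is
// present in the simulated cache and 0 when it is absent.
struct ToyCache {
    std::unordered_set<std::size_t> cached_lines;  // ids of registers holding 1

    void write(std::size_t reg, bool bit) {
        if (bit) cached_lines.insert(reg);  // models loading the line (store 1)
        else     cached_lines.erase(reg);   // models flushing the line (store 0)
    }
    bool read(std::size_t reg) const { return cached_lines.count(reg) != 0; }
};

// Models Out[In1[0] + In2[0]] running inside a transient window: the access
// to Out only completes in time when both input loads hit the cache.
void toy_weird_and(ToyCache& cache, std::size_t in1, std::size_t in2, std::size_t out) {
    assert(out != in1 && out != in2);  // the model expects a separate output register
    const bool both_inputs_hit = cache.read(in1) && cache.read(in2);
    if (both_inputs_hit) cache.write(out, true);  // Out becomes cached, i.e. 1
}
```

Writing 1 to registers 0 and 1 and then calling `toy_weird_and(cache, 0, 1, 2)` leaves register 2 holding 1; if either input holds 0, register 2 stays at 0. A real µWM has no such `if`: the effect comes from load latency, which this model deliberately abstracts away.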
Section 2.2 of our paper explains how a µWM works in detail.
Note: the code snippets in our paper (Listings 1-5) are for illustrative purposes and can differ from our actual implementation, which we show in Listing 7 in the appendix.
These µWMs can hinder both static and dynamic analysis because they convert computations into memory operations (like the AND gate example above), and debuggers and emulators may not preserve the microarchitectural behavior of a processor. For example, single-stepping stops transient execution.
Therefore, they are a great candidate for program obfuscation and potentially many other types of attacks.
The following table lists the AWS EC2 instances that we used to run our Flexo weird machines.
Microarchitecture | Instance type | Processor |
---|---|---|
Zen 1 | t3a.xlarge | AMD EPYC 7571 |
Zen 2 | c5a.xlarge | AMD EPYC 7R32 |
Zen 3 | c6a.xlarge | AMD EPYC 7R13 |
Zen 4 | m7a.xlarge | AMD EPYC 9R14 |
Skylake | c5n.xlarge | Intel Xeon 8124M |
Cascade Lake | m5n.xlarge | Intel Xeon 8259CL |
Ice Lake | m6in.xlarge | Intel Xeon 8375C |
Sapphire Rapids | m7i.xlarge | Intel Xeon 8488C |
Warning
When using a processor not included in this list, the weird machines may fail to generate correct results. While a processor with similar microarchitecture may be able to run our weird machines, we cannot guarantee the accuracy and performance when using other processors. For instructions about running our Flexo weird machine on an unsupported processor, please refer to "Run Flexo on an unsupported processor".
Follow the instructions in the README file under `reproduce/` to run the experiments in our paper.
We suggest installing the Flexo compiler using a Docker or Podman container.
The script below creates a container using the provided Dockerfile and runs the build script to build the compiler.
For installation using Podman, simply replace `docker` with `podman` in these commands.
docker build -t flexo .
docker run -i -t --rm \
--mount type=bind,source="$(pwd)"/,target=/flexo \
flexo \
bash -c "cd /flexo && ./build.sh"
The compiler will be stored inside the `build/` folder when the installation process is complete.
The Flexo compiler takes an LLVM IR file as input, so the first step of compiling a weird machine is to compile a C/C++ program implementing the weird machine into LLVM IR.
The following command uses `clang` (version 17) to compile a C/C++ program into LLVM IR.
Replace `[INPUT_WM_SOURCE]` with the filename of the input C/C++ program and `[LLVM_IR_FILE]` with the filename of the output LLVM IR.
clang-17 -fno-discard-value-names -fno-inline-functions -O1 -S -emit-llvm [INPUT_WM_SOURCE] -o [LLVM_IR_FILE]
For more information about how to write a C/C++ program that implements a weird machine, please refer to #TODO. It is also possible to implement a weird machine using structural Verilog. For more information about using Verilog to create a weird machine, please refer to #TODO.
After that, run `compile.sh` to execute the Flexo compiler and generate the weird machine, which is also in the form of LLVM IR.
The following command performs this step.
Remember to replace `[INPUT_LLVM_IR_FILE]` with the filename of the LLVM IR file generated in the previous step and `[OUTPUT_LLVM_IR_FILE]` with the filename of the output LLVM IR file that contains the weird machine.
docker run -i -t --rm \
--mount type=bind,source="$(pwd)"/,target=/flexo \
flexo \
bash -c "cd /flexo && ./compile.sh [INPUT_LLVM_IR_FILE] [OUTPUT_LLVM_IR_FILE]"
The Flexo compiler provides several compile options to adjust the construction of a weird machine. For the details of these compiler options, please refer to #TODO
Finally, you can generate an executable file from the output LLVM IR file using the following command.
Remember to replace `[OUTPUT_LLVM_IR_FILE]` with the filename of the LLVM IR file generated from the previous step and `[OUTPUT_EXECUTABLE_FILE]` with the filename of the executable file to be generated.
clang-17 [OUTPUT_LLVM_IR_FILE] -o [OUTPUT_EXECUTABLE_FILE] -lm -lstdc++
We implemented several weird machines: basic logic gates, adders, multipliers, a 4-bit ALU, the SHA-1 hash function, the AES block cipher, and the Simon block cipher.
These examples can be found in the `circuits/` folder.
The readme file under `circuits/gates` provides instructions regarding how to build a simple weird machine that computes basic logic gates (`AND`, `OR`, `NOT`, `NAND`, and `MUX`).
The Flexo compiler can be configured by setting environment variables. Line 15 of the compilation script provides an example of configuring the Flexo compiler using environment variables. The following is the list of environment variables that are available to configure the compiler.
- `WM_KEYWORD`: The compiler only converts a function to WM when it contains a "keyword" in its name. The default value of this keyword is `__weird__`.
- `WM_VERBOSE`: Show verbose output, default to `false`.
- `TMP_PATH`: A folder to store temporary files, default to `/tmp`.
- `WM_CIRCUIT_FILE`: The path to a Verilog circuit file when the WM is implemented using Verilog. No default value. See the ALU example for this configuration.
- `WM_DELAY`: The iterations of the delay loop between gate executions. Default to `256`.
- `WM_USE_FENCE`: Use memory fences instead of using a delay loop. Default to `true`.
- `RET_WM_DIV_ROUNDS`: The number of `div` instructions used when calculating the modified return address. Default to `4`.
- `RET_WM_DIV_SIZE`: The register size of the `div` instructions when calculating the modified return address. Possible values: `16` (default), `32`, or `64`.
- `RET_WM_JMP_SIZE`: The jump size of the modified return address (should be larger than the gate size). Default to `512`.
- `DUAL_WM_MAX_INPUT`: The maximum input size of a weird gate. Default to `4`.
- `WM_MAX_FANOUT`: The max number of fan-outs of an assign gate. Default to `3`.
- `WR_TYPE`: The weird register type, which can be `Baseline`, `NoBranch`, or `Dual` (default).
- `WR_MAPPING`: The mapping type of weird registers, which can be `Baseline` or `Shuffle` (default).
- `WR_OFFSET`: The memory offset (in bytes) between adjacent weird registers. Default to `960`.
- `WR_FAKE_OFFSET`: The memory offset (in bytes) between a real weird register and its fake location. Default to `512`.
- `WR_HIT_THRESHOLD`: The maximum access latency (in cycles) of a cache hit. Default to `180`.
- `WR_SYSCALL_RAND`: Use Linux syscall to generate random numbers instead of using standard library calls. Default to `false`.
- `WR_USE_MMAP`: Call the `mmap` syscall to allocate memory for WR instead of using the stack memory. May be slower if enabled, but this supports circuits with more wires. Default to `false`.
The Flexo compiler converts C or C++ functions into weird machines. These functions must comply with the following requirements:
A function that should be converted to a weird machine must contain the string `__weird__` in its name.
This string can be customized by specifying `WM_KEYWORD` in the compiler configuration.
The functions that should be converted to a weird machine must follow a special format. Here is an example:
bool __weird__fn(unsigned char* in1, unsigned char* out1, bool in2, bool* out2, bool& out3);
There are no rules for the names of the arguments or the order of the arguments, so they can be named or placed arbitrarily.
The inputs and outputs of a weird machine are all contained in the function arguments. There is no need to mark an argument as input or output, because the compiler detects this automatically by checking whether the value of an argument is read or overwritten. The input and output variables must have the following types:
- Inputs can be integer types of arbitrary length, e.g., `int`, `unsigned char`, and `long` are all valid. They can be passed by value, by pointer, or by reference.
- Outputs can also be integer types of arbitrary length, but they must be passed by pointer or by reference.
The return value of the function is NOT the output of the weird machine; instead, it contains the error detection result.
The return value is `true` when an error is detected and `false` otherwise.
When the output of the weird machine has more than one bit, or when there are multiple outputs, the return value is `true` when any output bit detects an error.
If the error detection value is not needed, the return type of the function can be set to `void`.
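As an illustration of these rules, the following is a minimal sketch of a function written in the expected format. The function `__weird__add8` and the helper `example_usage` are hypothetical examples, not part of the Flexo repository. Whether every LLVM IR instruction that clang emits for this body is accepted should be checked against `lib/Circuits/IRParser.cpp`.

```cpp
#include <cassert>

// Hypothetical 8-bit adder. The name contains the default keyword
// "__weird__". `a` and `b` are only read, so they act as inputs (passed by
// value); `sum` is only written, so it acts as the output (passed by pointer).
// The return type is void because the error detection result is not needed.
void __weird__add8(unsigned char a, unsigned char b, unsigned char* sum) {
    *sum = static_cast<unsigned char>(a + b);  // wraps around modulo 256
}

// Plain C++ caller, left untouched by the compiler because its name does
// not contain the keyword.
void example_usage() {
    unsigned char sum = 0;
    __weird__add8(200, 100, &sum);
    assert(sum == 44);  // 300 modulo 256
}
```

When built as ordinary C++ (without the Flexo pass), the function behaves like a normal adder, which is a convenient way to check the intended logic before compiling it into a weird machine.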
Flexo weird machines can perform error detection for each output bit.
The compiler provides these error detection results via special output variables, each named after the corresponding output variable with an `error_` prefix.
For example:
void __weird__fn2(unsigned char* in, unsigned char* out, unsigned char* error_out);
This function has an output `out`, and the bit-wise error detection results of this output are provided in `error_out`.
`error_out` must have the same length as `out`, and the bits in `error_out` are set when the corresponding bits in `out` detect an error.
For example, if the n-th bit of `out` detects an error, then the n-th bit of `error_out` is set to `1`; otherwise, the n-th bit of `error_out` is set to `0`.
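One possible way for a caller to consume the per-bit error flags is to re-run the weird machine until no bit reports an error. The sketch below is an assumption about usage, not a pattern prescribed by Flexo. `run_until_clean` and `FlakyStub` are hypothetical names, and `FlakyStub` merely stands in for a compiled weird machine so that the example runs as ordinary C++.

```cpp
#include <cassert>
#include <functional>

// Same shape as the prototype above: one input, one output, and the per-bit
// error flags reported for `out`.
using WeirdFn = std::function<void(unsigned char* in, unsigned char* out,
                                   unsigned char* error_out)>;

// Re-runs the weird machine until no output bit reports an error or until
// max_attempts is exhausted. Returns true when a clean result was obtained.
bool run_until_clean(const WeirdFn& wm, unsigned char in, unsigned char* out,
                     int max_attempts) {
    for (int attempt = 0; attempt < max_attempts; ++attempt) {
        unsigned char error_bits = 0;
        wm(&in, out, &error_bits);
        if (error_bits == 0) return true;  // no bit of *out flagged an error
    }
    return false;  // the caller decides how to handle persistent errors
}

// Hypothetical stand-in for a compiled weird machine: it flags bit 0 as
// erroneous on its first call and reports no error afterwards.
struct FlakyStub {
    int calls = 0;
    void operator()(unsigned char* in, unsigned char* out, unsigned char* error_out) {
        *out = static_cast<unsigned char>(~*in);
        *error_out = (calls++ == 0) ? 0x01 : 0x00;
    }
};

void example_usage() {
    unsigned char out = 0;
    WeirdFn wm = FlakyStub{};
    assert(run_until_clean(wm, 0x0F, &out, 3));  // succeeds on the second attempt
    assert(out == 0xF0);
}
```

A caller that only cares about some output bits could instead mask `error_bits` before comparing it with zero.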
The Flexo compiler parses the LLVM IR instructions inside the function body and converts them to a Verilog circuit.
The conversion is simple: it translates each LLVM IR instruction to a corresponding Verilog operator.
For example, the `Add` instruction is converted to `+`.
Because a Verilog circuit cannot express some LLVM IR functionality, some LLVM IR instructions are not allowed in the function body, and any unsupported LLVM IR instruction leads to a compile error.
For example, the function body should not contain any branch or function call, and memory operations are not allowed except for accessing the inputs and outputs.
To see which LLVM IR instructions are supported, please refer to `lib/Circuits/IRParser.cpp`.
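To illustrate the restriction on branches, the sketch below expresses a selection with bitwise operators instead of `if`/`else`. The function `__weird__mux8` is a hypothetical example. The text above only confirms support for `Add`, so treat the use of bitwise AND, OR, and NOT here as an assumption to verify against `lib/Circuits/IRParser.cpp`.

```cpp
#include <cassert>

// Hypothetical 8-bit 2-to-1 multiplexer. Each bit of `sel` picks the
// corresponding bit of `a` (sel bit = 1) or of `b` (sel bit = 0).
// The body uses only bitwise operators: no if/else, no loops, no calls,
// and the only memory access is the write to the output.
void __weird__mux8(unsigned char a, unsigned char b, unsigned char sel,
                   unsigned char* out) {
    *out = static_cast<unsigned char>((a & sel) | (b & ~sel));
}

void example_usage() {
    unsigned char out = 0;
    __weird__mux8(0xAA, 0x55, 0xF0, &out);
    assert(out == 0xA5);  // high nibble from a, low nibble from b
}
```

Writing the same selection as `sel ? a : b` might compile to a branch or to some other instruction depending on the optimizer, so the bitwise form keeps the generated IR more predictable.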
As a proof-of-concept application, we obfuscated the UPX packer with Flexo. We encrypt the packed binary using AES or Simon encryption, and at runtime, a Flexo weird machine decrypts the packed binary. Please refer to the readme of UPFlexo for more information about how to install and use this packer.
Microarchitectural weird machines are very sensitive to the nuances of a microarchitecture. When running Flexo on an unsupported processor or when creating a new weird machine, some compiler configurations may need to be adjusted so that the weird machine can execute correctly. Here are the compiler configurations that should be adjusted when tuning a weird machine:
The length of the transient window is controlled by the `RET_WM_DIV_ROUNDS` environment variable.
(For the definition of a "transient window", please refer to section 2.1 of our paper.)
We suggest trying every integer between 1 and 50 to see which number gives the best result.
In some rare cases, `RET_WM_DIV_SIZE` may also need to be adjusted.
The spacing between weird registers can impact the accuracy of a weird machine significantly.
We discuss this in section 3.3 of our paper.
We suggest setting `WR_OFFSET` to 64 ) and `WR_OFFSET` to 192, 320, 448, 576, 960, 1088, ...
Some processors may have smaller transient windows, and thus they can only support small weird gates.
In this case, reduce `DUAL_WM_MAX_INPUT` to `3` or `2` to reduce the size of weird gates.
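The suggested search over `RET_WM_DIV_ROUNDS` can be sketched as a simple sweep. This is a minimal sketch, not part of Flexo: `sweep_div_rounds` and the `measure_accuracy` callback are hypothetical. In practice the callback would have to rebuild the weird machine with the given setting, run it, and return the fraction of correct results.

```cpp
#include <cassert>
#include <functional>

// Result of a parameter sweep: the best setting and the accuracy it reached.
struct SweepResult {
    int best_rounds;
    double best_accuracy;
};

// Tries every candidate value in [lo, hi] and keeps the most accurate one.
// `measure_accuracy` is a hypothetical callback supplied by the user; it is
// expected to return a score where larger means more accurate.
SweepResult sweep_div_rounds(int lo, int hi,
                             const std::function<double(int)>& measure_accuracy) {
    assert(lo <= hi);
    SweepResult result{lo, measure_accuracy(lo)};
    for (int rounds = lo + 1; rounds <= hi; ++rounds) {
        const double accuracy = measure_accuracy(rounds);
        if (accuracy > result.best_accuracy) {
            result = {rounds, accuracy};  // strictly better: remember this setting
        }
    }
    return result;
}
```

Calling `sweep_div_rounds(1, 50, measure)` covers the range suggested above. Ties are resolved in favor of the smaller value because only a strictly better accuracy replaces the current best.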
For any questions, contact the first author: Ping-Lun Wang (pinglunw [at] andrew [dot] cmu [dot] edu).