armshaker is a processor fuzzer targeting the Armv8-A ISA. It can be used to find hidden instructions in hardware processors or other implementations of the ISA like emulators.
In essence, it works by executing undefined instructions and checking whether the execution generates an undefined instruction exception or not, in the form of a SIGILL signal sent to the process. If no SIGILL is received, the instruction is marked as hidden and logged. The limited size of the three instruction sets in Armv8-A (A64, A32 and T32) makes it feasible to do an exhaustive search over the whole instruction space using this method.
armshaker has two primary parts: the back-end doing most of the work, and a front-end adding visualizations and multiprocessing support. In most cases, the front-end is the preferred way to run the fuzzer.
The fuzzer must be compiled before use, which can usually be done with a simple
make
in the project directory.
Then an exhaustive search of the default instruction set can be initiated like so:
./armshaker.py -f2
Which will look something like this:
The particular instruction set that is fuzzed depends on the runtime of the current system. If the fuzzer is compiled with a 32-bit (AArch32) toolchain, it will be able to fuzz A32 or T32 (with the -t
option). If it is compiled with a 64-bit toolchain (AArch64), it will be able to fuzz A64, although cross-compiling and running a 32-bit fuzzer from AArch64 is possible.
If a hidden instruction is found, it will be logged in the file data/logX
, where X
corresponds to the worker ID. Each log entry will be in the following format: <instruction_encoding>,hidden,<generated_signal_number>,...
, with register value changes appended if the -p
(and optionally -g
) option is set.
In case Python 3 is not available, shell_frontend.sh
can be used instead for multiprocessing support. Otherwise the fuzzer back-end can be run directly with ./fuzzer <options>
.
armshaker is made for Armv8-A-based systems running Linux, although it should be possible to run it on Armv7-A too. Other than Linux, it only requires a C compiler (GCC), the make tool and Python 3 for the front-end.
The simplest way to compile the fuzzer is to run
make
in the project directory. This will use the bundled libopcodes disassembler and should work for most use cases.
Optionally, a local installation of libopcodes can be used instead with
make SHARED_LIBOPCODES=TRUE
Capstone can also be used in addition to libopcodes (note: not instead of). This is mostly a legacy feature, but can be used to compare disassembly results between the two. It can be enabled by adding the USE_CAPSTONE=TRUE
option when compiling.
The options available in the front-end are as follows. For more detailed descriptions, see the options for the back-end.
$ ./armshaker.py -h
usage: armshaker.py [-h] [-s INSN] [-e INSN] [-c] [-w NUM] [-p] [-n]
[-f LEVEL] [-t] [-z] [-g] [-V] [-c]
fuzzer front-end
optional arguments:
-h, --help show this help message and exit
-s INSN, --start INSN
search range start
-e INSN, --end INSN search range end
-d, --discreps Log disassembler discrepancies
-w NUM, --workers NUM
Number of worker processes
-p, --ptrace Use ptrace when testing
-n, --no-exec Don't execute instructions, just disassemble them.
-f LEVEL, --filter LEVEL
Filter certain instructions
-t, --thumb Use the thumb instruction set (only on AArch32).
-z, --random Load the registers with random values, instead of all
0s.
-g, --log-reg-changes
For hidden instructions, only log registers that
changed value.
-V, --vector Set and log vector registers (d0-d31, fpscr) when
fuzzing.
-c, --cond Set cpsr flags to match instruction condition code.
The back-end has some extra options that can be useful for analysis or targeted fuzzing. Its options are as follows.
$ ./fuzzer -h
Usage: ./fuzzer [option(s)]
General options:
-h, --help Print help information.
-q, --quiet Don't print the status line.
Search options:
-s, --start <insn> Start of instruction search range (in hex).
[default: 0x00000000]
-e, --end <insn> End of instruction search range, inclusive (in
hex). [default: 0xffffffff]
-i, --single-exec Execute a single instruction (i.e., set end=start).
-m, --mask <mask> Only update instruction bits marked in the supplied
mask. Useful for testing different operands on a
single instruction. Example: 0xf0000000 -> only
increment most significant nibble.
Execution options:
-n, --no-exec Calculate the total amount of undefined
instructions, without executing them.
-x, --exec-all Execute all instructions (regardless of the
disassembly result).
-f, --filter <level> Filter away (skip) certain instructions that would
otherwise be executed and might generate false
positives. Supports the following levels, where
each level includes the numerically lower ones:
1: Inaccurate disassemblies, primarily caused
by SBO/SBZ bits.
2: Hidden instructions induced by Linux
as a result of bugs or backwards
compatibility measures.
-p, --ptrace Execute instructions on a separate process using
ptrace. This will generally make execution slower,
but lowers the chance of the fuzzer crashing in
case hidden instructions with certain side-effects
are found. It also enables some additional options.
-c, --cond On AArch32: Set the condition flags in the CPSR to
match the condition code in the instruction
encoding. This ensures that undefined instructions
with a normally non-matching condition code won't
be skipped, as is the case in some ISA
implementations.
Logging options:
-l, --log-suffix Add a suffix to the log and status file.
-d, --discreps Log disassembler discrepancies.
Ptrace options (only available with -p option):
-t, --thumb Use the thumb instruction set (only available on
AArch32). Note: 16-bit thumb instructions have the
format XXXX0000. So to test e.g. instruction 46c0,
use 46c00000.
-r, --print-regs Print register values before/after instruction
execution.
-z, --random Load the registers with random values, instead of
all 0s. Note that the random values are generated
at startup, and remain constant throughout the
session.
-g, --log-reg-changes For hidden instructions, only log registers that
changed value.
-V, --vector Set and log vector registers (d0-d31, fpscr) when
fuzzing.
Fuzz the 16-bit part of T32 (Thumb):
$ ./fuzzer -e e7ff0000 -pt -f2
Fuzz the 32-bit part of T32 (using multiple processes):
$ ./armshaker.py -s e8000000 -pt -f2
Check if your (64-bit) Linux kernel induces undocumented 32-bit T32 SETEND instructions:
$ sudo sh -c 'echo 1 > /proc/sys/abi/setend' # Enable SETEND emulation
$ ./fuzzer -s e800b650 -m ffff0008 -ptg -f1
Check if your QEMU release has hidden VMUL instructions (in A32/T32):
$ ./fuzzer -s f3200d10 -m 004ff0ef -pVzg -f2
Check if your QEMU release has hidden VQDMULL instructions (in A32/T32):
$ ./fuzzer -s f3900d00 -m 007ff0af -pVzg -f2
Execute a defined instruction and see its effects (A64 example):
$ tools/as.sh 'mov x0, #0x1337'
d28266e0
$ ./fuzzer -s d28266e0 -i -pxr
insn: 0xd28266e0, checked: 0, skipped: 0, filtered: 0, hidden: 0, ips: 27676236
insn: d28266e0
x0: 0000000000000000 0000000000001337
x1: 0000000000000000 0000000000000000
x2: 0000000000000000 0000000000000000
...
x29: 0000000000000000 0000000000000000
x30: 0000000000000000 0000000000000000
sp: 0000000000000000 0000000000000000
pc: 000000555ba769f0 000000555ba769ec
pstate: 0000000000000000 0000000000000000
signal: 0
The fuzzer detects millions of hidden instructions in A32. Is something wrong?
It's possible that your particular processor executes undefined instructions with an unmatched condition code as NOPs, instead of generating an exception. This is an implementation defined behavior, as documented in section G1.16.1 of the DDI0487E Armv8 Architecture Reference Manual.
This behavior can in some sense be bypassed by making sure the condition code is always matched, which is what the -c
option does. See if adding that option reduces the number of reported hidden instructions.
armshaker was inspired by Christopher Domas's sandsifter project and implemented as part of my master's thesis in computer science, where I used it to fuzz a variety of Armv8-A-based systems. No hidden instructions that could be attributed to hardware were found, but the fuzzing did reveal bugs in the QEMU emulator and the Linux kernel. See the refs
directory for more information.
The main findings of my thesis were published as an article presented at the HASP 2020 workshop, under the title Uncovering Hidden Instructions in Armv8-A Implementations (preprint version).
If you would like to cite this work, you can use the following BibTex entry:
@inproceedings{10.1145/3458903.3458906,
author = {Strupe, Fredrik and Kumar, Rakesh},
title = {Uncovering Hidden Instructions in Armv8-A Implementations},
year = {2021},
isbn = {9781450388986},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3458903.3458906},
doi = {10.1145/3458903.3458906},
booktitle = {Hardware and Architectural Support for Security and Privacy},
articleno = {3},
numpages = {9},
keywords = {undocumented instructions, Armv8-A, hidden instructions, hardware security},
location = {Virtual, Greece},
series = {HASP '20}
}