Hardware accelerated AES for ARM #10

Closed
newpavlov opened this issue Oct 23, 2017 · 17 comments · Fixed by #250

Comments

@newpavlov
Member

Similar to the aesni crate, it would be nice to have hardware-accelerated AES support for ARM using specialized instructions.

@quininer

quininer commented Jan 28, 2018

Should we plan to use stdsimd? rust-lang/stdarch#295 rust-lang/stdarch#148

@newpavlov
Member Author

Excellent! I've subscribed to both issues and will try to implement both the ARM and x86 versions once the intrinsics land in stdsimd (for x86 I have a draft based on LLVM intrinsics). One thing I would like to have is the ability to specify target_feature in Cargo.toml; otherwise an intrinsics-based solution will be less convenient to use, as you'll have to list several features in RUSTFLAGS.

@jasondavies

FYI, I recently added aarch64 AES intrinsics in rust-lang/stdarch#398. I believe AES intrinsics for x86 are in there already.

@newpavlov
Member Author

Yes, I've seen it, thank you for this addition! When I have time I'll try to play with them and implement the crate (if no one beats me to it, that is).

AES-NI intrinsics are already used in the aesni crate. As for the Intel SHA extensions, I currently don't have an appropriate CPU to test code on.

@tarcieri
Member

tarcieri commented Nov 24, 2020

@jack-signal I saw you have hardware accelerated aarch64 implemented here:

https://github.com/signalapp/libsignal-client/blob/946e067/rust/aes-gcm-siv/src/aes/aarch64.rs

Any interest in upstreaming that?

@jack-signal
Contributor

At this point everything Signal is putting out, including that crate, is AGPL. I doubt there is interest in relicensing, so that probably wouldn't work. It's worth pointing out, though, that the linked code only covers AES-256 in the forward direction because that is all we needed, and it doesn't implement any of the RustCrypto traits, so it is a ways off from something you could use.

Someone recently pointed me at the MIT-licensed https://github.com/shadowsocks/crypto2/blob/master/src/blockcipher/aes/aarch64.rs which might work better as a starting point. But that code doesn't do instruction pipelining, and the key schedule isn't constant time :(

@tarcieri
Member

Ugh, alright, it seems both those approaches are a no-go.

@tarcieri
Member

@newpavlov were you thinking of trying to do an implementation of this from scratch?

@tarcieri
Member

tarcieri commented Nov 26, 2020

SUPERCOP contains public domain implementations of full AES-GCM, including one based on ARMv8 intrinsics which processes 8 blocks at a time in parallel:

https://github.com/floodyberry/supercop/blob/master/crypto_aead/aes256gcmv1/dolbeau/armv8crypto/armv8crypto.c

@newpavlov
Member Author

newpavlov commented Nov 28, 2020

@tarcieri
Unfortunately I don't have an easy way to test code which uses the ARM crypto extensions. The RPi4 does not support them, and while my phone probably does, right now I don't have a development environment or the necessary knowledge for working with it.

@tarcieri
Member

@newpavlov no worries!

I have an Apple M1 Mac Mini which, as of yesterday, has the beta Rust compiler installed, so I can test things on it. I might experiment to see if I can get it working.

@valpackett

@newpavlov you can always get an AWS EC2 instance to test code; until the end of 2020, one t4g.micro is available for free for all accounts.

@jack-signal
Contributor

qemu supports the AArch64 crypto extensions and is pretty easy to integrate into cargo test: just install qemu and an aarch64 cross gcc, and add the following to ~/.cargo/config:

[target.aarch64-unknown-linux-gnu]
linker = "aarch64-linux-gnu-gcc"
runner = ["qemu-aarch64", "-L", "/usr/aarch64-linux-gnu"]

Then you can build and test AArch64 code on x86-64.
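
As a quick sanity check for a setup like this, a minimal sketch along the following lines can report whether the CPU the test binary runs on (emulated or real) advertises AES instructions. The helper name `has_hw_aes` is hypothetical and not part of any RustCrypto crate; the detection macros are the ones from the Rust standard library.

```rust
// Hypothetical helper: reports whether the running CPU advertises AES instructions.
// One definition is selected per target architecture at compile time.

#[cfg(target_arch = "aarch64")]
pub fn has_hw_aes() -> bool {
    // Runtime check for the ARMv8 Cryptography Extensions (AES part).
    std::arch::is_aarch64_feature_detected!("aes")
}

#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
pub fn has_hw_aes() -> bool {
    // Runtime check for AES-NI.
    std::arch::is_x86_feature_detected!("aes")
}

#[cfg(not(any(target_arch = "aarch64", target_arch = "x86", target_arch = "x86_64")))]
pub fn has_hw_aes() -> bool {
    // Assumption for this sketch: no accelerated backend on other targets.
    false
}
```

Calling `has_hw_aes()` at the start of a test run under the qemu runner shows whether an accelerated code path would actually be exercised. The result depends on the CPU model qemu emulates.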

@tarcieri
Member

tarcieri commented Nov 30, 2020

As it happens, we already use cross to test against aarch64:

https://github.com/RustCrypto/block-ciphers/blob/7236bce/.github/workflows/aes.yml#L149-L181

@tarcieri
Member

I was able to translate this public domain implementation of AES-128:

https://github.com/noloader/AES-Intrinsics/blob/master/aes-arm.c

My translation is here. It runs successfully on my Apple M1, and passes the supplied test vector:

https://gist.github.com/tarcieri/f10b0c58a56dfab4917c3832f93b25af

It still needs a round key expansion but that's not terribly difficult. Likewise for AES-192 and AES-256 support.

The implementation is using #![feature(stdsimd)] so it only works on nightly for now. We could potentially introduce a nightly Cargo feature and gate it under that. Curious what @newpavlov thinks about that.
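
The missing round key expansion for AES-128 could look roughly like the sketch below. This is plain portable Rust, not the code from the gist or from any RustCrypto crate, and the names `gf_mul`, `sbox`, and `expand_key_128` are hypothetical. It computes the S-box arithmetically instead of using a lookup table, with the aim of avoiding secret-dependent memory accesses; it has not been audited for constant-time behavior.

```rust
/// Multiply two elements of GF(2^8) modulo the AES polynomial
/// x^8 + x^4 + x^3 + x + 1, without branching on the operands.
pub fn gf_mul(mut a: u8, mut b: u8) -> u8 {
    let mut acc = 0u8;
    for _ in 0..8 {
        // Add `a` into the accumulator when the low bit of `b` is set.
        acc ^= a & 0u8.wrapping_sub(b & 1);
        // Multiply `a` by x, reducing with 0x1b when the top bit falls off.
        let carry = 0u8.wrapping_sub(a >> 7);
        a = (a << 1) ^ (carry & 0x1b);
        b >>= 1;
    }
    acc
}

/// AES S-box: multiplicative inverse in GF(2^8) followed by the affine map.
pub fn sbox(x: u8) -> u8 {
    // The inverse is x^254 = x^2 * x^4 * ... * x^128 (and 0 maps to 0).
    let mut inv = 1u8;
    let mut power = x;
    for _ in 0..7 {
        power = gf_mul(power, power);
        inv = gf_mul(inv, power);
    }
    inv ^ inv.rotate_left(1)
        ^ inv.rotate_left(2)
        ^ inv.rotate_left(3)
        ^ inv.rotate_left(4)
        ^ 0x63
}

/// Expand a 128-bit key into 11 round keys of 16 bytes each.
pub fn expand_key_128(key: &[u8; 16]) -> [[u8; 16]; 11] {
    let mut round_keys = [[0u8; 16]; 11];
    round_keys[0] = *key;
    let mut rcon = 1u8;
    for r in 1..11 {
        let prev = round_keys[r - 1];
        // RotWord and SubWord applied to the last word of the previous
        // round key, with the round constant added to the first byte.
        let t = [
            sbox(prev[13]) ^ rcon,
            sbox(prev[14]),
            sbox(prev[15]),
            sbox(prev[12]),
        ];
        for i in 0..4 {
            round_keys[r][i] = prev[i] ^ t[i];
        }
        // Each remaining word is the previous round key's word XORed with
        // the word just computed before it.
        for i in 4..16 {
            round_keys[r][i] = prev[i] ^ round_keys[r][i - 4];
        }
        rcon = gf_mul(rcon, 2);
    }
    round_keys
}
```

Storing each round key as a 16-byte array is an assumption of this sketch, chosen so that a round key fits one 128-bit vector register. The output can be checked against the key expansion example in FIPS-197, Appendix A.1.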

@tarcieri
Member

Sidebar, but in researching AES on ARM I came across this paper which looks potentially helpful:

"Efficient Parallel Implementation of CTR Mode of ARX-Based Block Ciphers on ARMv8 Microcontrollers"

https://www.mdpi.com/2076-3417/11/6/2548

@tarcieri
Member

I landed ARMv8 Cryptography Extensions support in #250; however, it's only tested on, and therefore presently gated on, aarch64 targets.

If you're interested in support for 32-bit ARMv8 targets or other types of ARM acceleration, please leave a comment on this issue or make a new issue.
