Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrong codegen for wasm32 with opt-level >= 2 #50754

Closed
ngg opened this issue May 14, 2018 · 11 comments
Closed

Wrong codegen for wasm32 with opt-level >= 2 #50754

ngg opened this issue May 14, 2018 · 11 comments
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-bug Category: This is a bug. O-wasm Target: WASM (WebAssembly), http://webassembly.org/

Comments

@ngg
Copy link
Contributor

ngg commented May 14, 2018

I've tried to create a small wasm wrapper around the aes-soft crate and I realized that it gives wrong results when compiled on opt-level >= 2.

Minimized test case is uploaded to https://github.com/ngg/wasm-codegen-bug.

I'm not sure if it's a bug in Rust or in LLVM.

@kennytm kennytm added O-wasm Target: WASM (WebAssembly), http://webassembly.org/ C-bug Category: This is a bug. labels May 14, 2018
@CryZe
Copy link
Contributor

CryZe commented May 15, 2018

It looks like this is indeed a codegen bug and it didn't use to happen (doesn't happen on webassembly.studio). It's also not related to wasm-bindgen.

@ngg
Copy link
Contributor Author

ngg commented May 15, 2018

I've bisected which nightly caused the regression.
The last nightly it was working is rustc 1.26.0-nightly (e026b59cf 2018-03-03).
The next nightly already has this issue: rustc 1.26.0-nightly (259e4a678 2018-03-04).
The diff between these is e026b59...259e4a6, which contains #48125, the other merged PRs seem irrelevant to this.
So it might actually be a bug in LLD.

@kennytm
Copy link
Member

kennytm commented May 15, 2018

cc @alexcrichton #48125 Potential LLD codegen bug ↑

@alexcrichton
Copy link
Member

Thanks for the report @ngg! To confirm the bug here is that the browser shows 65530 while the answer is supposed to be 65525, right?

@CryZe
Copy link
Contributor

CryZe commented May 16, 2018

Yeah

@ngg
Copy link
Contributor Author

ngg commented May 16, 2018

Yes, it gives 65525 on a Linux host target with all opt-levels, it also gives 65525 on wasm32 with opt-level = 0,1,"s","z" but gives 65530 with opt-level = 2,3 independently from the lto setting. And it seems that before switching to LLD on wasm32 it used to give 65525 with all opt-levels as well.

@kennytm
Copy link
Member

kennytm commented May 16, 2018

Somewhat minimized, removing the wasm-bindgen dependency
#![feature(lang_items)]
#![no_std]

#[lang = "panic_fmt"]
fn panic_fmt() {}

type V2 = (u16, u16);
type V4 = (u16, u16, u16, u16);
type V8 = (u16, u16, u16, u16, u16, u16, u16, u16);

fn xor2(lhs: V2, rhs: V2) -> V2 {
    let (a0, a1) = lhs;
    let (b0, b1) = rhs;
    (a0 ^ b0, a1 ^ b1)
}

fn mul2(x: V2, y: V2) -> V2 {
    let (b, a) = x;
    let (d, c) = y;
    let e = (a ^ b) & (c ^ d);
    ((b & d) ^ e, (a & c) ^ e)
}

fn split4(v: V4) -> (V2, V2) {
    let (x0, x1, x2, x3) = v;
    ((x0, x1), (x2, x3))
}

fn mul4(x: V4, y: V4) -> V4 {
    let (b, a) = split4(x);
    let (d, c) = split4(y);
    let e = xor2(a, b);
    let (x2, x3) = xor2(mul2(a, c), e);
    let (x0, x1) = xor2(mul2(b, d), e);
    (x0, x1, x2, x3)
}

fn inv4(v: V4) -> V4 {
    let a = (v.2, v.3);
    let b = (v.0, v.1);
    let d = mul2(a, b);
    let (x2, x3) = mul2(d, b);
    let (x0, x1) = mul2(d, a);
    (x0, x1, x2, x3)
}

fn split8(v: V8) -> (V4, V4) {
    let (x0, x1, x2, x3, x4, x5, x6, x7) = v;
    ((x0, x1, x2, x3), (x4, x5, x6, x7))
}

fn inv8(v: V8) -> V8 {
    let (b, a) = split8(v);
    let (x0, x1, x2, x3) = mul4(b, a);
    let (x4, x5, x6, x7) = mul4(inv4(a), b);
    (0, !x1, x2, x3, x4, !x5, !x6, x7)
}

#[no_mangle]
pub fn test() -> u16 {
    let tmp0 = (1, 1, 1, 1, 1, 1, 1, 1);
    inv8(tmp0); // command out to get correct value.

    let tmp4 = (65525, 65530, 15, 0, 15, 65520, 5, 10);
    let tmp5 = inv8(tmp4);
    tmp5.5
}

Test shim:

const fs = require('fs');
let buf = fs.readFileSync('lib.wasm');
let m = new WebAssembly.Module(buf);
let inst = new WebAssembly.Instance(m, {});
console.log(inst.exports.test());

Execute with:

rustc --target=wasm32-unknown-unknown --crate-type=cdylib -Copt-level=1 src/lib.rs
node 1.js

With opt-level ≥ 2, the output would be 15 (wrong). With opt-level ≤ 1, the output would be 65525 (correct).

@alexcrichton
Copy link
Member

I believe this is an LLVM bug after reduction, and I've opened an upstream bug report

@alexcrichton alexcrichton added the A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. label May 16, 2018
@alexcrichton
Copy link
Member

For posterity the smallest reduction I've got so far (IR-wise) is:

#![crate_type = "cdylib"]

pub struct V4(u16, u16, u16, u16);
pub struct V8(u16, u16, u16, u16, u16, u16, u16, u16);

#[inline(never)]
fn inv4(v: &V4) -> u16 {
    v.2 ^ (v.2 & v.0)
}

fn split8(a: u16, b: u16, c: u16) -> (V4, V4) {
    (V4(0, 0, 0, a), V4(b, c, 0, 0))
}

#[inline(never)]
fn inv8(v: V8) -> u16 {
    let (_b, a) = split8(v.3, v.4, v.5);
    inv4(&a)
}

#[no_mangle]
pub extern fn test() -> u16 {
    inv8(V8(0, 0, 0, 0, 1, 65520, 0, 2))
}

@alexcrichton
Copy link
Member

This has now been fixed in upstream LLVM and I've backported it to our fork with rust-lang/llvm#116, so all that's needed now is a PR to this repo to update the LLVM submodule!

@ngg
Copy link
Contributor Author

ngg commented May 21, 2018

Thanks! This is now working, even for the original aes-soft wrapper.

@ngg ngg closed this as completed May 21, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-bug Category: This is a bug. O-wasm Target: WASM (WebAssembly), http://webassembly.org/
Projects
None yet
Development

No branches or pull requests

4 participants