Criterion benchmarks hang with target-cpu=znver1 #65618

novacrazy · 2019-10-20T06:50:58Z

On my system, a Zen 1 AMD CPU, attempting to run Criterion benchmarks with RUSTFLAGS = "-C target-cpu=znver1" results in the benchmark hanging at the Analyzing stage for a few minutes before I killed the program.

Without the flags, it doesn't hang at all, as expected.

This affects stable, beta and nightly, and is probably related to LLVM 9.

Code that can hang on stable:

#[macro_use]
extern crate criterion;

use criterion::{Criterion, ParameterizedBenchmark};

fn criterion_benchmark(c: &mut Criterion) {
    c.bench(
        "bench",
        ParameterizedBenchmark::new(
            "old",
            |b, _| {
                b.iter(|| {});
            },
            vec![()],
        )
        .with_function("new", |b, _| {
            b.iter(|| {});
        }),
    );
}

criterion_group!(my_benches, criterion_benchmark);
criterion_main!(my_benches);

using cargo +stable bench

No special flags are present in the Cargo.toml

[package]
name = "bench-test"
version = "0.1.0"
authors = ["novacrazy <[email protected]>"]
edition = "2018"

[dev-dependencies]
criterion = "0.3.0"

[[bench]]
name = "benchmark_name"
harness = false

The text was updated successfully, but these errors were encountered:

nagisa · 2019-10-20T14:28:35Z

Unable to reproduce on my 1st gen Ryzen 1700 with nightly:

...
   Compiling bench-test v0.1.0 (/tmp/crittest)
     Running `rustc --edition=2018 --crate-name bench_test bench_name.rs --error-format=json --json=diagnostic-rendered-ansi --emit=dep-info,link -C opt-level=3 --cfg test -C metadata=8c71f5ab08445b57 -C extra-filename=-8c71f5ab08445b57 --out-dir /tmp/crittest/target/release/deps -L dependency=/tmp/crittest/target/release/deps --extern bench_test=/tmp/crittest/target/release/deps/libbench_test-0f2595cb3983be9b.rlib --extern criterion=/tmp/crittest/target/release/deps/libcriterion-0d07609906f00047.rlib -Ctarget-cpu=znver1`
    Finished bench [optimized] target(s) in 2.26s
     Running `/tmp/crittest/target/release/deps/bench_test-8c71f5ab08445b57 --bench`
bench/old               time:   [0.0000 ps 0.0000 ps 0.0000 ps]                               
                        change: [-53.418% -6.0379% +85.594%] (p = 0.87 > 0.05)
                        No change in performance detected.
Found 12 outliers among 100 measurements (12.00%)
  4 (4.00%) high mild
  8 (8.00%) high severe
bench/new               time:   [0.0000 ps 0.0000 ps 0.0000 ps]                               
                        change: [-52.325% -7.9366% +80.704%] (p = 0.82 > 0.05)
                        No change in performance detected.
Found 20 outliers among 100 measurements (20.00%)
  6 (6.00%) high mild
  14 (14.00%) high severe

nagisa · 2019-10-20T14:35:54Z

Unable to reproduce with stable compiler on 2400G or 2700 either.

novacrazy · 2019-10-20T14:39:02Z

I'm on a Threadripper 1950X. Could the number of threads (32 in this case) affect it?

Another thought, does any part of Criterion rely on atomic values? Could be something odd with how code is generated and interacts with the multi-chip Threadripper.

mati865 · 2019-10-20T14:47:13Z

Cannot reproduce on 2700X using nightly linux-gnu and windows-gnu with the command RUSTFLAGS="-C target-cpu=znver1" cargo bench.
So it's not related to OS, unlike #63959

novacrazy · 2019-10-20T14:49:44Z

Ah, I should have specified I'm on the MSVC toolchain (nightly-x86_64-pc-windows-msvc).

I can confirm this does not appear on GNU toolchains, only MSVC.

Speedy37 · 2019-11-18T23:43:16Z

Same here with an AMD R7 1800X and -C target-cpu=native

rustc 1.39.0 (4560ea788 2019-11-04)
binary: rustc
commit-hash: 4560ea788cb760f0a34127156c78e2552949f734
commit-date: 2019-11-04
host: x86_64-pc-windows-msvc
release: 1.39.0
LLVM version: 9.0

I took a quick look at llvm changes since 9.0.0, but nothing seems related.

llvm/llvm-project@llvmorg-9.0.0...release/9.x

At some point the producer variable is broken in rayon helper method (iter/plumbing/mod.rs).
And there is a lot of AVX2 instructions there.

producer value is : {range={start=0x0000000000000000 end=0x00000021646c7200 } } once the folding core is running. start and end should stay between 0 and 100 000.

Things are working fine with -C target-cpu=core-avx2 but iter/plumbing/mod.rs is not using AVX2 instructions anymore.

Here are the generated set of instructions of iter/plumbing/mod.rs

znver1.txt
core-avx2.txt

I hope I didn't focus on the wrong method.

Speedy37 · 2019-11-19T02:23:30Z

Couldn't this be related to #64609 ?

Speedy37 · 2019-11-19T02:41:13Z

I just run cargo bench on the beta and nightly channel and it works fine.

mati865 · 2019-11-19T13:06:29Z

I think it's different because this one unlike #64609 doesn't appear with windows-gnu target.

Speedy37 · 2019-11-19T13:20:47Z

@mati865 I think you are right, and the part where I think there is a misscompilation is inlined.

How can we validate beta/nightly actually fixes this and it's not pure luck due to some minor change in how rust output things?
Just printing the "bad" value with println at the top of the helper function, made the program work with rustc stable.

mati865 · 2019-11-19T17:48:22Z

Just printing the "bad" value with println at the top of the helper function, made the program work with rustc stable.

Probably println prevented one of optimisations.

I don't know to move forward.
Maybe compare IR and ASM generated for mingw and msvc?

cc @eddyb

Speedy37 · 2019-11-19T23:33:30Z

Well I already reduced the incriminated code to the helper function which use a set of both xmm and ymm.
Seeing what I saw with the assembly of your test case for #63959, I bet this issue is the same.

novacrazy · 2020-01-04T01:40:52Z

This appears to have been caused by #63959, so it's fixed for me now. This could probably be closed now.

jonas-schievink added the C-bug Category: This is a bug. label Oct 20, 2019

jonas-schievink added the O-windows-msvc Toolchain: MSVC, Operating system: Windows label Oct 20, 2019

novacrazy mentioned this issue Oct 25, 2019

Miscompilation with target-cpu=znver1 (AMD Ryzen 1000/2000 series) on Windows + LLVM 9. #63959

Closed

sanxiyn added the O-x86_64 Target: x86-64 processors (like x86_64-*) label Nov 19, 2019

sanxiyn closed this as completed Jan 4, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Criterion benchmarks hang with target-cpu=znver1 #65618

Criterion benchmarks hang with target-cpu=znver1 #65618

novacrazy commented Oct 20, 2019 •

edited

Loading

nagisa commented Oct 20, 2019 •

edited

Loading

nagisa commented Oct 20, 2019

novacrazy commented Oct 20, 2019 •

edited

Loading

mati865 commented Oct 20, 2019

novacrazy commented Oct 20, 2019

Speedy37 commented Nov 18, 2019 •

edited

Loading

Speedy37 commented Nov 19, 2019 •

edited

Loading

Speedy37 commented Nov 19, 2019

mati865 commented Nov 19, 2019

Speedy37 commented Nov 19, 2019

mati865 commented Nov 19, 2019

Speedy37 commented Nov 19, 2019

novacrazy commented Jan 4, 2020

Criterion benchmarks hang with target-cpu=znver1 #65618

Criterion benchmarks hang with target-cpu=znver1 #65618

Comments

novacrazy commented Oct 20, 2019 • edited Loading

nagisa commented Oct 20, 2019 • edited Loading

nagisa commented Oct 20, 2019

novacrazy commented Oct 20, 2019 • edited Loading

mati865 commented Oct 20, 2019

novacrazy commented Oct 20, 2019

Speedy37 commented Nov 18, 2019 • edited Loading

Speedy37 commented Nov 19, 2019 • edited Loading

Speedy37 commented Nov 19, 2019

mati865 commented Nov 19, 2019

Speedy37 commented Nov 19, 2019

mati865 commented Nov 19, 2019

Speedy37 commented Nov 19, 2019

novacrazy commented Jan 4, 2020

novacrazy commented Oct 20, 2019 •

edited

Loading

nagisa commented Oct 20, 2019 •

edited

Loading

novacrazy commented Oct 20, 2019 •

edited

Loading

Speedy37 commented Nov 18, 2019 •

edited

Loading

Speedy37 commented Nov 19, 2019 •

edited

Loading