Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix regex for the binary_stats command #1884

Merged
merged 1 commit into from
Apr 2, 2024

Conversation

Kobzol
Copy link
Contributor

@Kobzol Kobzol commented Apr 1, 2024

Regressed in #1833.

@klensy
Copy link
Contributor

klensy commented Apr 2, 2024

How it broken, is there some error?

@Kobzol
Copy link
Contributor Author

Kobzol commented Apr 2, 2024

Yeah, it uses a Regex that fails to compile at runtime due to the unicode feature being disabled. Sadly this is not easy to figure out during compilation.

@klensy
Copy link
Contributor

klensy commented Apr 2, 2024

So, it wasn't tested in CI, i guess?

Is there way to reproduce error without rebuilding compiler?

@Kobzol
Copy link
Contributor Author

Kobzol commented Apr 2, 2024

So, it wasn't tested in CI, i guess?

Yeah, the collector has many commands that are not tested in CI.

Is there way to reproduce error without rebuilding compiler?

Sure, e.g.

./target/release/collector binary_stats \
    +da02fff3b6e4e27156054dcdda6675fe2a2591a6 \
    --rustc2 +31075bb493f49c2b739ff20b2e96a9045e1eba13 \
    --include helloworld \
    --profile Debug \
    --backend Llvm

@klensy
Copy link
Contributor

klensy commented Apr 2, 2024

$ ./target/debug/collector.exe binary_stats +nightly-x86_64-pc-windows-msvc --rustc2 +nightly-x86_64-pc-windows-msvc --include helloworld --profile Debug --backend Llvm
Stats for benchmark `helloworld`
--------------------
Target `helloworld` (artifact `helloworld.exe`)
┌────────────────────┬───────────────┬──────────────┬──────┬──────────┐
│ Section            │ Size (before) │ Size (after) │ Diff │ Diff (%) │
├────────────────────┼───────────────┼──────────────┼──────┼──────────┤
│ <5 unchanged rows> │    145.78 KiB │   145.78 KiB │    0 │     0.0% │
│────────────────────│───────────────│──────────────│──────│──────────│
│ Total              │    145.78 KiB │   145.78 KiB │    0 │     0.0% │
└────────────────────┴───────────────┴──────────────┴──────┴──────────┘

Stats for benchmark `helloworld-tiny`
--------------------
Target `helloworld-tiny` (artifact `helloworld-tiny.exe`)
┌────────────────────┬───────────────┬──────────────┬──────┬──────────┐
│ Section            │ Size (before) │ Size (after) │ Diff │ Diff (%) │
├────────────────────┼───────────────┼──────────────┼──────┼──────────┤
│ <5 unchanged rows> │    145.79 KiB │   145.79 KiB │    0 │     0.0% │
│────────────────────│───────────────│──────────────│──────│──────────│
│ Total              │    145.79 KiB │   145.79 KiB │    0 │     0.0% │
└────────────────────┴───────────────┴──────────────┴──────┴──────────┘

  1. --include finds match by substring, so i can't specify exact match?
  2. Well, on my win machine i don't see errors :-)

@Kobzol
Copy link
Contributor Author

Kobzol commented Apr 2, 2024

Yeah, you need to exclude helloworld-tiny instead. Maybe on Windows it uses a different implementation for regexes, IDK.

@klensy
Copy link
Contributor

klensy commented Apr 2, 2024

Ok, this is something

$ ./target/debug/collector.exe binary_stats +stable-x86_64-pc-windows-msvc --rustc2 +nightly-x86_64-pc-windows-msvc --include "helloworld" --profile Debug --backend Llvm
Stats for benchmark `helloworld`
--------------------
Target `helloworld` (artifact `helloworld.exe`)
┌─────────┬───────────────┬──────────────┬───────┬──────────┐
│ Section │ Size (before) │ Size (after) │  Diff │ Diff (%) │
├─────────┼───────────────┼──────────────┼───────┼──────────┤
│ .rdata  │     36.06 KiB │    37.57 KiB │ +1546 │    +4.2% │
│ .pdata  │      4.49 KiB │     5.39 KiB │  +924 │   +20.1% │
│ .text   │    101.92 KiB │   101.12 KiB │  -811 │    -0.8% │
│ .reloc  │         944 B │       1012 B │   +68 │    +7.2% │
│ .data   │         776 B │        728 B │   -48 │    -6.2% │
│─────────│───────────────│──────────────│───────│──────────│
│ Total   │    144.14 KiB │   145.78 KiB │ +1679 │    +1.1% │
└─────────┴───────────────┴──────────────┴───────┴──────────┘

Stats for benchmark `helloworld-tiny`
--------------------
Target `helloworld-tiny` (artifact `helloworld-tiny.exe`)
┌─────────┬───────────────┬──────────────┬───────┬──────────┐
│ Section │ Size (before) │ Size (after) │  Diff │ Diff (%) │
├─────────┼───────────────┼──────────────┼───────┼──────────┤
│ .rdata  │     36.07 KiB │    37.58 KiB │ +1546 │    +4.2% │
│ .pdata  │      4.49 KiB │     5.39 KiB │  +924 │   +20.1% │
│ .text   │    101.92 KiB │   101.12 KiB │  -811 │    -0.8% │
│ .reloc  │         944 B │       1012 B │   +68 │    +7.2% │
│ .data   │         776 B │        728 B │   -48 │    -6.2% │
│─────────│───────────────│──────────────│───────│──────────│
│ Total   │    144.15 KiB │   145.79 KiB │ +1679 │    +1.1% │
└─────────┴───────────────┴──────────────┴───────┴──────────┘

@klensy
Copy link
Contributor

klensy commented Apr 2, 2024

Hmm, linux machine:

└─$ target/debug/collector binary_stats +nightly-x86_64-unknown-linux-gnu --rustc2 +stage-1 --include "helloworld" --profile Debug --backend Llvm 
Stats for benchmark `helloworld`
--------------------
thread 'main' panicked at collector/src/artifact_stats.rs:99:91:
called `Result::unwrap()` on an `Err` value: Syntax(
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
regex parse error:
    (::)?\b[a-z0-9]{15,17}\b(\.\d+)?
                               ^^
error: Unicode-aware Perl class not found (make sure the unicode-perl feature is enabled)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
)
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

@Kobzol
Copy link
Contributor Author

Kobzol commented Apr 2, 2024

Yeah, that's the error that this PR will solve.

@klensy
Copy link
Contributor

klensy commented Apr 2, 2024

Well, rust-lang/regex#982, kind of.

Some cargo feature unification + platform dependent features.

@klensy
Copy link
Contributor

klensy commented Apr 2, 2024

Without size regression:

diff --git a/Cargo.lock b/Cargo.lock
index 4e9440d2..90b0c872 100644
--- a/Cargo.lock
+++ b/Cargo.lock
@@ -378,7 +378,7 @@ dependencies = [
  "num_cpus",
  "object",
  "rayon",
- "regex",
+ "regex-lite",
  "reqwest",
  "rustc-demangle",
  "semver",
@@ -1948,6 +1948,12 @@ dependencies = [
  "regex-syntax",
 ]
 
+[[package]]
+name = "regex-lite"
+version = "0.1.5"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "30b661b2f27137bdbc16f00eda72866a92bb28af1753ffbd56744fb6e2e9cd8e"
+
 [[package]]
 name = "regex-syntax"
 version = "0.6.28"
diff --git a/collector/Cargo.toml b/collector/Cargo.toml
index f0fdbd35..96fa01d4 100644
--- a/collector/Cargo.toml
+++ b/collector/Cargo.toml
@@ -40,7 +40,7 @@ console = "0.15"
 object = "0.32.1"
 tabled = { version = "0.14.0" , features = ["ansi-str"]}
 humansize = "2.1.3"
-regex = { version = "1.7.1", default-features = false, features = ["std", "perf"] }
+regex = { package = "regex-lite", version = "0.1" }
 analyzeme = { git = "https://github.com/rust-lang/measureme", branch = "stable" }
 
 benchlib = { path = "benchlib" }

@klensy
Copy link
Contributor

klensy commented Apr 2, 2024

└─$ target/release/collector binary_stats +nightly-x86_64-unknown-linux-gnu --rustc2 +stage-1 --include "helloworld" --profile Debug --backend Llvm
Stats for benchmark `helloworld`
--------------------
Target `helloworld` (artifact `helloworld`)
┌─────────────────────┬───────────────┬──────────────┬─────────┬──────────┐
│ Section             │ Size (before) │ Size (after) │    Diff │ Diff (%) │
├─────────────────────┼───────────────┼──────────────┼─────────┼──────────┤
│ .debug_info         │    909.71 KiB │     1.17 MiB │ +299636 │   +32.2% │
│ .debug_line         │    396.24 KiB │   555.58 KiB │ +163165 │   +40.2% │
│ .debug_str          │      1.28 MiB │     1.35 MiB │  +76547 │    +5.7% │
│ .strtab             │     48.37 KiB │    77.57 KiB │  +29895 │   +60.4% │
│ .debug_ranges       │    605.27 KiB │   628.67 KiB │  +23968 │    +3.9% │
│ .debug_abbrev       │      4.50 KiB │    21.86 KiB │  +17782 │  +386.2% │
│ .debug_aranges      │     28.50 KiB │    42.45 KiB │  +14288 │   +49.0% │
│ .text               │    247.03 KiB │   260.32 KiB │  +13616 │    +5.4% │
│ .symtab             │     17.98 KiB │    28.41 KiB │  +10680 │   +58.0% │
│ .eh_frame           │     21.54 KiB │    25.83 KiB │   +4392 │   +19.9% │
│ .gcc_except_table   │      4.08 KiB │     5.21 KiB │   +1152 │   +27.6% │
│ .rodata             │     24.13 KiB │    25.22 KiB │   +1113 │    +4.5% │
│ .eh_frame_hdr       │      3.99 KiB │     4.75 KiB │    +776 │   +19.0% │
│ .rela.dyn           │     17.48 KiB │    17.86 KiB │    +384 │    +2.1% │
│ .data.rel.ro        │      9.27 KiB │     9.53 KiB │    +272 │    +2.9% │
│ .got                │      1.69 KiB │     1.62 KiB │     -64 │    -3.7% │
│ .comment            │          82 B │         55 B │     -27 │   -32.9% │
│ .dynstr             │      1.04 KiB │     1.02 KiB │     -26 │    -2.4% │
│ .dynsym             │      1.59 KiB │     1.57 KiB │     -24 │    -1.5% │
│ .tbss               │          73 B │         81 B │      +8 │   +11.0% │
│ .gnu.version        │         136 B │        134 B │      -2 │    -1.5% │
│ <18 unchanged rows> │      1.83 KiB │     1.83 KiB │       0 │     0.0% │
│─────────────────────│───────────────│──────────────│─────────│──────────│
│ Total               │      3.57 MiB │     4.19 MiB │ +657531 │   +17.6% │
└─────────────────────┴───────────────┴──────────────┴─────────┴──────────┘

Stats for benchmark `helloworld-tiny`
--------------------
Target `helloworld-tiny` (artifact `helloworld-tiny`)
┌─────────────────────┬───────────────┬──────────────┬─────────┬──────────┐
│ Section             │ Size (before) │ Size (after) │    Diff │ Diff (%) │
├─────────────────────┼───────────────┼──────────────┼─────────┼──────────┤
│ .debug_info         │    909.71 KiB │     1.17 MiB │ +299636 │   +32.2% │
│ .debug_line         │    396.24 KiB │   555.58 KiB │ +163165 │   +40.2% │
│ .debug_str          │      1.28 MiB │     1.35 MiB │  +76548 │    +5.7% │
│ .strtab             │     48.42 KiB │    77.62 KiB │  +29897 │   +60.3% │
│ .debug_ranges       │    605.27 KiB │   628.67 KiB │  +23968 │    +3.9% │
│ .debug_abbrev       │      4.50 KiB │    21.86 KiB │  +17782 │  +386.2% │
│ .debug_aranges      │     28.50 KiB │    42.45 KiB │  +14288 │   +49.0% │
│ .text               │    247.03 KiB │   260.32 KiB │  +13616 │    +5.4% │
│ .symtab             │     17.98 KiB │    28.41 KiB │  +10680 │   +58.0% │
│ .eh_frame           │     21.54 KiB │    25.83 KiB │   +4392 │   +19.9% │
│ .gcc_except_table   │      4.08 KiB │     5.21 KiB │   +1152 │   +27.6% │
│ .rodata             │     24.13 KiB │    25.22 KiB │   +1113 │    +4.5% │
│ .eh_frame_hdr       │      3.99 KiB │     4.75 KiB │    +776 │   +19.0% │
│ .rela.dyn           │     17.48 KiB │    17.86 KiB │    +384 │    +2.1% │
│ .data.rel.ro        │      9.27 KiB │     9.53 KiB │    +272 │    +2.9% │
│ .got                │      1.69 KiB │     1.62 KiB │     -64 │    -3.7% │
│ .comment            │          82 B │         55 B │     -27 │   -32.9% │
│ .dynstr             │      1.04 KiB │     1.02 KiB │     -26 │    -2.4% │
│ .dynsym             │      1.59 KiB │     1.57 KiB │     -24 │    -1.5% │
│ .tbss               │          73 B │         81 B │      +8 │   +11.0% │
│ .gnu.version        │         136 B │        134 B │      -2 │    -1.5% │
│ <18 unchanged rows> │      1.83 KiB │     1.83 KiB │       0 │     0.0% │
│─────────────────────│───────────────│──────────────│─────────│──────────│
│ Total               │      3.57 MiB │     4.19 MiB │ +657534 │   +17.6% │
└─────────────────────┴───────────────┴──────────────┴─────────┴──────────┘

cursed stage-1 config

@Kobzol
Copy link
Contributor Author

Kobzol commented Apr 2, 2024

I would prefer to just keep the popular option, i.e. just use regex with default features. Shaving a few KiBs off the collector binary size isn't super important.

@Kobzol Kobzol merged commit f36f7a4 into rust-lang:master Apr 2, 2024
11 checks passed
@Kobzol Kobzol deleted the fix-binary-stats branch April 2, 2024 12:36
@klensy
Copy link
Contributor

klensy commented Apr 2, 2024

This can actually use unicode-perl feature instead of full unicode, i'll try it later.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants