Provide clang flag to opt out of `wasm-opt` when linking wasm? #55781

TerrorJack · 2022-05-30T19:15:59Z

Currently, when linking wasm, clang attempts to detect wasm-opt in the environment, and invokes it with certain flags based on the current optimization level. [source](https://github.com/llvm/llvm-project/blob/940e290860894d274c986c88bea2dcc17f1e93c3/clang/lib/Driver/ToolChains/WebAssembly.cpp#L133)

This behavior is undesirable when the user wants to customize when or how wasm-opt is run. It would be nice if a clang command line flag could be provided to opt out of this behavior.

llvmbot · 2022-05-30T21:04:50Z

@llvm/issue-subscribers-clang-driver

abrown · 2023-01-10T00:38:12Z

cc: @sunfishcode, @sbc100, @tlively or anyone else who may have worked on this: what do you all think about retaining the name custom section by default in wasm-opt and only removing it when the optimization level is explicitly dialed up? This section is quite useful for populating backtraces, e.g., and I was confused when the name section magically disappeared in a build.

sbc100 · 2023-01-10T01:11:57Z

I think the simplest thing would be to preserve it in all cases. If folks want to strip it they can use -Wl,--strip-debug which already exists today.

I think it might be as simple as adding -g unconditionally to wasm-opt (although that might end up adding a name section to the binary where none existed previously... I think we should fix that on the wasm-opt side if that is the case).

sbc100 · 2023-01-10T01:12:42Z

In other words, let wasm-ld decide what debug info to include, and ask wasm-opt to do its best to preserve it.

abrown · 2023-01-11T22:44:58Z

Should I submit a change request to add -g unconditionally to the driver?

aheejin · 2023-01-12T00:32:40Z

Adding -g to the driver invocation in the clang driver is one way.

I think another way might be to just change binaryen's behavior so that even without -g given, we preserve the debug info. I checked with clang and llc and they preserve the existing debug info without -g given.

For example, if I create test.ll using

clang -g test.c -c -emit-llvm -o test.ll

And finish compiling test.ll to test.o using clang or llc,

llc -filetype=obj test.ll -o test.o

clang -c test.ll -o test.o

Even though the second llc and clang don't have -g, they preserve the debug info. But this changes binaryen's current behavior, so I may be unaware of potential consequences. Thoughts? @kripken

kripken · 2023-01-12T17:11:31Z

@aheejin Hmm, I think changing binaryen may be a little risky. It's a change to a long-existing user interface and the result could be people shipping name sections unnecessarily without noticing it (if they don't measure code size).

Also, there is a reason binaryen differs from llc - llc's inputs all know if they have debug info or not, so it's natural to preserve debug info as it was in the input. But wasm-opt can also receive as input a wat file which doesn't say if it has debug info or not, so we need user input to tell us whether to emit a names section. It made more sense in the past when binaryen had a second input format without clear debug info (asm2wasm / asm.js), so I guess it's debatable. And also clang has mixed behavior, requiring -g for source files but preserving debug info on object files automatically IIUC.

Still, for the reasons in the first paragraph I think it's better not to change binaryen here. Instead, could clang pass -g to wasm-opt when it received -g itself?
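The "pass -g along only when clang received it" idea could look roughly like the sketch below. It is a shell illustration only, not the clang driver's code: `wasm_opt_debug_flags` is a hypothetical helper, and the handling of `-g0` and `-g1` to `-g3` is an assumption about which spellings matter.

```shell
# Hypothetical helper: given the arguments that were passed to clang, print
# the debug flag a driver or wrapper might forward to wasm-opt.
# Assumption: the last of -g / -g1..-g3 / -g0 on the command line wins.
wasm_opt_debug_flags() {
    debug=no
    for arg in "$@"; do
        case "$arg" in
            -g0) debug=no ;;          # debug info explicitly disabled
            -g|-g[1-3]) debug=yes ;;  # debug info requested
        esac
    done
    if [ "$debug" = yes ]; then
        echo "-g"
    fi
}
```

For example, `wasm_opt_debug_flags -O2 -g main.c` prints `-g`, while `wasm_opt_debug_flags -O2 main.c` prints nothing.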

abrown · 2023-01-12T19:46:59Z

@kripken, I also was thinking along the lines of "conditionally pass -g along" but I couldn't figure out how this might work when wasm-ld is used directly. I don't see that it has a -g flag to pass along — were you thinking we should add one?

sbc100 · 2023-01-12T19:49:38Z

It's in the clang driver linked above:

llvm-project/clang/lib/Driver/ToolChains/WebAssembly.cpp

Line 133 in 940e290

// When optimizing, if wasm-opt is available, run it.

wasm-ld never calls wasm-opt itself; it's something that runs after wasm-ld.

sbc100 · 2023-01-12T19:51:41Z

Also, I think it has to be unconditional since wasm-ld will produce debug info based on input object files, and not based on the -g flag passed to clang at link time. In fact, I think clang ignores -g at link time.

Assuming wasm-opt -g means don't strip debug info, but also don't generate any, then I think it's fine to make it unconditional.

kripken · 2023-01-12T20:34:45Z

> Assuming wasm-opt -g means don't strip debug info, but also don't generate any, then I think it's fine to make it unconditional.

Unfortunately that's not true. Binaryen IR doesn't mention the presence or lack of debug info. That is, having or not having a names section is something the wasm binary format has, but not the text format, and not Binaryen IR which abstracts over both the text and binary format. Atm wasm-opt -g will emit a name section using the names used in the IR (which, if there wasn't a names section, will be things like $0).

But if there isn't a better solution, then adding that might be ok. We already have hasFeaturesSection, and could add hasNamesSection + handling for it.

kripken · 2023-01-12T20:36:24Z

(But if we did that, we'd need to add a new flag for it, likely, because of the breaking change issue from before - we don't want wasm-opt to start or stop emitting the names section in new ways compared to what it did before.)

sbc100 · 2023-01-12T20:55:42Z

A new option sounds reasonable. Something like --preserve-debug-info... on the other hand, we could make that the default and add --strip-debug?

kripken · 2023-01-12T21:15:49Z

Looks like wasm-opt already has --strip-debug actually (which also strips the SourceMapURL and DWARF sections).

--preserve-debug-info sounds reasonable. (Or -g? 😉 )

sbc100 · 2023-01-12T21:26:39Z

The problem with -g is that it could mean "generate debug sections" even when the input file doesn't contain them... do we ever want wasm-opt to do that? Perhaps that should be the rare/opt-in thing?

sbc100 · 2023-01-12T21:28:24Z

OK, how about this: Why don't we do what wasm-ld does and preserve debug sections unless --strip-debug is passed? But only add a new names section if -g is passed? No new option needed, no clang change needed.
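The proposal above amounts to a small decision table. The sketch below spells it out in shell purely as an illustration; it is not Binaryen code. `should_emit_names` and its yes/no arguments are hypothetical, and letting `--strip-debug` win over `-g` when both are given is an assumption the proposal does not state.

```shell
# Hypothetical model of the proposed behavior.
#   $1: "yes" if the input module already has a names section
#   $2: "yes" if --strip-debug was passed
#   $3: "yes" if -g was passed
# Prints "yes" if a names section would be written, "no" otherwise.
should_emit_names() {
    input_has_names=$1
    strip_debug=$2
    generate=$3
    if [ "$strip_debug" = yes ]; then
        echo no    # assumption: stripping takes precedence over everything
    elif [ "$input_has_names" = yes ]; then
        echo yes   # preserve what the input already had
    elif [ "$generate" = yes ]; then
        echo yes   # only -g may introduce a new names section
    else
        echo no
    fi
}
```

Under this model, an input without a names section and without `-g` never gains one, which is the property the later comments ask for.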

kripken · 2023-01-12T21:34:13Z

> The problem with -g is that it could mean "generate debug sections" even when the input file doesn't contain them... do we ever want wasm-opt to do that?

It's probably rare, but an example use case of generating debug info when there was none before is: wasm-opt --name-types -g which will give wasm GC types useful stable names. Without stable names, they can get reordered during opts and the indexes mean nothing, and debugging is a total mess.

> Why don't we do what wasm-ld does and preserve debug sections unless --strip-debug is passed?

My main concern there is that it's a breaking change for users. They had an existing workflow that does not emit a names section, and now it does. This is worrying because names sections are silent bloat and the user might not realize they are starting to ship larger wasm files for no reason.

That idea might be simpler overall, but it's also riskier. In contrast, I think --preserve-debug has few downsides.

dschuff · 2023-01-12T22:25:52Z

> The problem with -g is that it could mean "generate debug sections" even when the input file doesn't contain them... do we ever want wasm-opt to do that?

I can't think of any reason we want wasm-opt to do that.

> It's probably rare, but an example use case of generating debug info when there was none before is: wasm-opt --name-types -g which will give wasm GC types useful stable names. Without stable names, they can get reordered during opts and the indexes mean nothing, and debugging is a total mess.

This involves the user passing an explicit flag though, so if --name-types means "generate new names and create a name section for them" then I think it's fine if -g without that flag always preserves but never generates new names. Probably in order for this to work we'd need to track which IR names came from the user and which were autogenerated by the parser, but that doesn't sound too terrible to do.

> Why don't we do what wasm-ld does and preserve debug sections unless --strip-debug is passed?

> My main concern there is that it's a breaking change for users. They had an existing workflow that does not emit a names section, and now it does. This is worrying because names sections are silent bloat and the user might not realize they are starting to ship larger wasm files for no reason.

> That idea might be simpler overall, but it's also riskier. In contrast, I think --preserve-debug has few downsides.

I agree that this isn't great. IMO it's slightly mitigated by the fact that wasm-opt users are expected to be toolchain developers who are driving wasm-opt with some kind of frontend (as opposed to "end" users developing apps). This means there are fewer direct users, and they will be updating their wasm-opt version explicitly, and (it seems to me) more likely to notice behavior changes than an end user (and maybe it's easier for us to warn them). But there's still some risk.

kripken · 2023-01-12T23:34:39Z

Fair point, yeah, it's mostly toolchain people that are relevant here. I agree that reduces the risk.

I don't feel super-strongly if everyone else disagrees with me. But I would prefer not to emit a name section by default in any mode, just because of that risk of silent bloat. I'd rather people need to opt-in to get something that has such a risk, using -g or a new --preserve-debug.

sbc100 · 2023-01-12T23:38:50Z

Either of those sounds reasonable to me, as long as (on their own) they don't introduce a name section where there was none before.

By exporting the WASMLABS_SKIP_WASM_OPT envvar, the wasm-opt wrapper present in the wasm-base container image will make the wasm-opt call a no-op. This is useful for cases when we don't desire wasm-opt to be called, for example when invoked directly by clang (llvm/llvm-project#55781).
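A wrapper of the kind described here might look like the sketch below. This is an illustration under stated assumptions, not the wrapper shipped in the wasm-base image: `wasm_opt_wrapper`, the `REAL_WASM_OPT` variable and its `wasm-opt.real` default are hypothetical names, and any non-empty value of `WASMLABS_SKIP_WASM_OPT` is assumed to mean "skip". A no-op only leaves a usable output if the caller optimizes the file in place, which is also assumed.

```shell
# Hypothetical stand-in for a `wasm-opt` wrapper placed ahead of the real
# binary on PATH. REAL_WASM_OPT names the real tool (assumed variable).
wasm_opt_wrapper() {
    if [ -n "${WASMLABS_SKIP_WASM_OPT:-}" ]; then
        # No-op: leave the linked module untouched and report success.
        return 0
    fi
    # Otherwise forward every argument unchanged to the real wasm-opt.
    "${REAL_WASM_OPT:-wasm-opt.real}" "$@"
}
```

With `WASMLABS_SKIP_WASM_OPT=1` exported, a call such as `wasm_opt_wrapper -O2 out.wasm -o out.wasm` returns immediately without touching `out.wasm`.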

yamt · 2023-01-27T04:45:02Z

IMO, it's simpler just to remove the automatic wasm-opt invocation.
as a user, my honest impression is that it's more surprising than useful. (it took a few hours for me to figure out that binaryen installation has silently affected my build.)
if a user wants to run wasm-opt, he can always do that by himself.

pelletier · 2023-07-12T20:47:08Z

This confused me today as well. I was inspecting the line number extraction of inlined calls in our profiler. The dwarf ranges didn't make sense, but they worked in our CI. Eventually, I realized wasm-opt was being run when on the path, which was the case on my machine, but not on our build container.

In my opinion, wasm-opt should never be invoked by the driver. If folks want to use it, they can set it up as part of their build system. Not being able to produce correct wasm modules with -O3 -g if wasm-opt happens to be on the PATH of the build machine makes things needlessly difficult.
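Until the driver behavior changes, one workaround is to hide wasm-opt from clang for a single build. The sketch below assumes clang only finds wasm-opt through `PATH` (it may also look elsewhere, such as its own install directory, which this does not cover); `path_without_wasm_opt` is a hypothetical helper name.

```shell
# Print $PATH with every directory that contains an executable `wasm-opt`
# removed. Empty PATH entries (which mean "current directory") are dropped.
path_without_wasm_opt() {
    result=
    old_ifs=$IFS
    IFS=:
    for dir in $PATH; do
        if [ -n "$dir" ] && [ ! -x "$dir/wasm-opt" ]; then
            result="${result:+$result:}$dir"
        fi
    done
    IFS=$old_ifs
    printf '%s\n' "$result"
}
```

A build could then be run as `PATH="$(path_without_wasm_opt)" clang ...`, so that the local machine and a container without Binaryen see the same tools.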

llvmbot · 2023-11-02T21:29:53Z

@llvm/issue-subscribers-lld-wasm

Author: Cheng Shao (TerrorJack)

Currently, when linking wasm, `clang` attempts to detect `wasm-opt` in the environment, and invokes it with certain flags based on current optimization level. [source](https://github.com/llvm/llvm-project/blob/940e290860894d274c986c88bea2dcc17f1e93c3/clang/lib/Driver/ToolChains/WebAssembly.cpp#L133)

This behavior is undesirable when the user wants to customize when or how wasm-opt is run. It would be nice if a clang command line flag could be provided to opt out of this behavior.

TerrorJack · 2023-11-02T21:39:00Z

Sorry, but it doesn't seem to me that 89d5635 closes this ticket at all. --keep-section is not a way to opt out of the invocation of wasm-opt when -O is enabled!

tlively · 2023-11-06T23:18:10Z

@sbc100, did you mean to close this?

RReverser · 2023-11-15T17:46:18Z

This is also a problem with 3rd-party toolchains that don't properly embed target_features list but use clang for linking (looking at you, .NET).

Arguably they should fix this issue on their side, but meanwhile it means that linking will fail as soon as optimisation is enabled, with lots of errors like:

[wasm-validator error in function 5096] unexpected false: Bulk memory operations require bulk memory [--enable-bulk-memory], on
  (memory.copy
   (local.get $0)
   (local.get $1)
   (local.get $2)
  )
[wasm-validator error in function 5097] unexpected false: Bulk memory operations require bulk memory [--enable-bulk-memory], on
  (memory.copy
   (local.get $0)
   (local.get $1)
   (local.get $2)
  )
...

Maybe wasm-opt could be at least executed with -all when doing so implicitly?

tlively · 2023-11-15T19:51:43Z

-all wouldn't be quite right because it would allow wasm-opt to introduce new uses of features that were not intended to be allowed. If we want wasm-opt to be able to do the right thing without specifically enabling features on the command line and without a target features section, we'll need to add a new flag to wasm-opt that lets it infer the allowed features from the input module.

RReverser · 2023-11-15T20:08:42Z

> -all wouldn't be quite right because it would allow wasm-opt to introduce new uses of features that were not intended to be allowed.

I thought it only affected validation? I didn't realise it also allows it to introduce new features.

If not, maybe -n (aka --no-validation) at least? Normally code produced by the linker should be valid anyway, so presumably validation isn't necessary, but it would allow cases like this one to pass.

tlively · 2023-11-15T20:19:49Z

It's not just validation. The features affect what optimizations are run and what the optimizations are allowed to do as well.

Skipping validation might help, but without e.g. the sign-ext feature being set, wasm-opt will not perform optimizations that introduce sign extension instructions. I wouldn't be surprised if some optimizations hit assertion failures if the features don't match the module, too.

mh4ck-Thales · 2024-01-25T10:50:22Z

I stumbled on this bug too. Having the behavior (and even the correct functioning) of clang with frequently used flags such as -g or -O depend on the presence of wasm-opt on the machine, which is a binary unrelated to the LLVM project and not distributed with LLVM or LLVM-based toolchains such as wasi-sdk AFAIK, is a big problem.

I believe that there are two problems here, which may be separated into two issues if you find it relevant. The first one is that clang compiling to Wasm and calling wasm-opt breaks some flags / features (-g, -O, and others). This issue can be tracked in #64909. The second one is that clang may or may not use wasm-opt without any warning and produce potentially different outputs, or that one may want to use wasm-opt with specific flags on the "clang without wasm-opt" output, and that can't be achieved right now.

Regarding these two problems, I think that the easiest solution would be to remove the wasm-opt call for now, or at least put it behind an opt-in flag. But I don't know if that can be considered as a breaking change or a regression.

If removing the default call to wasm-opt is not possible, then add an opt-out flag so that people having trouble with wasm-opt can just opt out without having to tweak their PATH or whatever. Also, add a warning / info message that tells the clang user whether wasm-opt was used, and the exact path if so.
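Until clang prints such a message itself, a build script can report the same information. A minimal sketch, assuming wasm-opt is looked up through `PATH`; `report_wasm_opt` and the message wording are hypothetical and not clang output.

```shell
# Print whether a `wasm-opt` would be found on PATH, and where.
report_wasm_opt() {
    if found=$(command -v wasm-opt 2>/dev/null); then
        echo "note: wasm-opt found at $found"
    else
        echo "note: wasm-opt not found on PATH"
    fi
}
```

Calling `report_wasm_opt` just before the link step makes it visible in build logs whether two machines are likely to produce different outputs.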

Once correct support for passing flags from clang to wasm-opt is implemented, and a way to distribute wasm-opt alongside LLVM (be it in wasi-sdk or somewhere else) is introduced, it can be added back as the default.

Finally, I'll put some questions here in order to better understand the situation around wasm-opt and maybe help with future decisions:

- Why add wasm-opt in the clang driver in the first place? Shouldn't this work be done by llvm-opt directly?
- Is wasm-opt related to LLVM / a LLVM project in any way, or is it a completely external tool that was added in here for its interesting features?
- Can we implement wasm-opt interesting features directly into LLVM?

kripken · 2024-01-25T17:21:04Z

@mh4ck-Thales

> Is wasm-opt related to LLVM / a LLVM project in any way, or is it a completely external tool that was added in here for its interesting features?

wasm-opt is part of Binaryen and entirely separate from LLVM. It is an optimizer that can make wasm files smaller and faster (on LLVM output it typically helps by 15% or so).

> Can we implement wasm-opt interesting features directly into LLVM?

In theory yes, but it would be a large amount of both initial and ongoing effort, so I doubt it. That effort can't be deduplicated because from the wasm side it makes sense to add new optimizations in wasm-opt which is a standalone, modular tool that any toolchain can use (not only LLVM), which benefits the entire wasm ecosystem.

(Personally I agree with you on the main matter here that the current clang behavior is surprising, but I defer to clang people on how to fix that.)

mh4ck-Thales · 2024-01-31T13:13:12Z

@kripken thanks for the clarifications. I understand that technically, wasm-opt is a great addition to the clang workflow and greatly improves the final Wasm binary. I also get that most of the work on Wasm-specific optimizations is directed towards wasm-opt instead of the internal LLVM optimizer (which makes sense). This builds towards the argument of keeping wasm-opt and finding a way to distribute it alongside LLVM, as a third-party but necessary dependency.

I'm curious though to know if there is any kind of overlap between LLVM optimization features and the ones of wasm-opt, and if that's the case is there an interest to try and avoid the overlap somehow to save time on big binaries (and if the benefit is worth the trouble).

I hope that we can all agree on a solution soon, and accept a patch accordingly.

This flag causes wasm-ld to preserve a section even in the face of `--strip-all`. This is useful, for example, to preserve the target_features section in the case of clang (which can run wasm-opt after linking), and emcc (which performs a bunch of post-link work). Fixes: llvm/llvm-project#60613 Fixes: llvm/llvm-project#55781 Differential Revision: https://reviews.llvm.org/D149917

whitequark · 2024-05-20T15:52:31Z

I've hit this issue a number of times recently. The two problematic outcomes for me were:

- wasm-opt corrupts DWARF debug information, therefore making the backtraces useless.
- Even on a non-LTO build, wasm-opt can take 10-20 minutes to complete on a recent laptop building for example Clang, which makes incremental workflows impossible.

sunfishcode · 2024-05-23T17:22:26Z

I'm also in favor of having a flag to disable wasm-opt. If someone could submit a patch, I'd review it.

This PR fixes llvm#55781 by adding the `--no-wasm-opt` and `--wasm-opt` flags to clang to disable/enable the `wasm-opt` optimizations. The default is to enable `wasm-opt` as before, in order not to break existing workflows.

I think that adding a warning when no flag or the `--wasm-opt` flag is given but `wasm-opt` wasn't found in the path may be relevant here. It would let people using `wasm-opt` know whether it has been used on their produced binary. The only downside I see is that people already using the toolchain with the `-O` and `-Werror` flags but without `wasm-opt` in the path will see their toolchain break (with an easy fix: either adding `--no-wasm-opt` or adding `wasm-opt` to the path). I haven't implemented this here because I haven't figured out how to add such a warning, and I don't know if this warning should be added here or in another PR.

CC @sunfishcode, who offered in the associated issue to review this patch.
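With the `--no-wasm-opt` flag from this PR, a build system can take over the optimization step itself. The sketch below is one possible shape, not a recommended pipeline: `build_wasm`, the `CLANG` and `WASM_OPT` override variables, the `wasm32-wasi` target, the `-O2 -g` flag choices and the file names are all assumptions made for illustration.

```shell
# Hypothetical two-step build: link without clang's automatic wasm-opt run,
# then optimize explicitly with flags chosen by the build system.
#   $1: C source file    $2: output wasm module
build_wasm() {
    src=$1
    out=$2
    # Step 1: compile and link, telling clang not to invoke wasm-opt itself.
    "${CLANG:-clang}" --target=wasm32-wasi -O2 -g --no-wasm-opt "$src" -o "$out" || return 1
    # Step 2: run wasm-opt in place, passing -g so it keeps debug info.
    "${WASM_OPT:-wasm-opt}" -O2 -g "$out" -o "$out"
}
```

Calling `build_wasm main.c main.wasm` then produces the same commands on every machine that has both tools, regardless of what clang would have detected on its own.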

github-actions bot added the new issue label May 30, 2022

EugeneZelenko added clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' and removed new issue labels May 30, 2022

LinusU mentioned this issue Sep 28, 2022

unexpected false: Bulk memory operation (bulk memory is disabled) WebAssembly/wasi-sdk#254

Closed

Angelmmiguel mentioned this issue Jan 18, 2023

Skip wasm-opt in PHP builds vmware-labs/webassembly-language-runtimes#34

Merged

sbc100 mentioned this issue Aug 8, 2023

Linker error when building large shared library pyodide/pyodide#4048

Closed

abrown mentioned this issue Aug 8, 2023

debug info depends on -Wl,--export= WebAssembly/wasi-sdk#129

Open

abrown mentioned this issue Sep 21, 2023

Missing debug information WebAssembly/wasi-sdk#350

Closed

sbc100 closed this as completed in 89d5635 Nov 2, 2023

EugeneZelenko added the lld:wasm label Nov 2, 2023

tlively reopened this Nov 6, 2023

lum1n0us mentioned this issue Feb 7, 2024

samples/multi-thread build failed bytecodealliance/wasm-micro-runtime#3139

Open

cfallin mentioned this issue May 20, 2024

Cranelift: assertion left == right failed: the memory base pointer may be incorrect due to sharing memory bytecodealliance/wasmtime#8652

Closed

mh4ck-Thales added a commit to ThalesGroup/llvm-project that referenced this issue Jun 11, 2024

Add flag to opt out of wasm-opt (llvm#55781)

8c7052d

mh4ck-Thales mentioned this issue Jun 12, 2024

Add flag to opt out of wasm-opt #95208

Merged

sunfishcode closed this as completed in #95208 Jun 24, 2024

sunfishcode closed this as completed in 962d7ac Jun 24, 2024

mh4ck-Thales mentioned this issue Jul 24, 2024

Add wasm-opt warning #100321

Open
