STANDALONE_WASM option #9461

kripken · 2019-09-19T16:18:00Z

This adds an option to build "standalone" wasm files, that is, files that can be run without JavaScript. They idea is that they can be run in either

A wasm server runtime like wasmer or wasmtime, where the supported APIs are wasi.
A custom wasm embedding that uses wasm as plugins or dynamic libraries, where the supported APIs may include application-specific imports and exports, some subset of wasi, and other stuff (but probably not JS-specific things).

This removes the old EMITTING_JS flag, which was only used to check whether to minify wasm imports and exports. The new STANDALONE_WASM flag also prevents such minification (since we can't assume the JS will be the only thing to run the wasm), but is more general, in that we may still emit JS with that flag, but when we do the JS is only a convenient way to run the code on the Web or in Node.js, as the wasm can also be run standalone.

Note that SIDE_MODULE is an interesting case here: with wasm, a side module is just a wasm file (no JS), so we used to set EMITTING_JS to 0. However, we can't just set STANDALONE_WASM in that case, since the side module may expect to be linked with a main module that is not standalone, that is, that depends on JS (i.e. the side module may call things in the main module which are from JS).

Aside from side modules, though, if the user says -o X.wasm (emit wasm, no JS file) then we do set STANDALONE_WASM, since that is the likely intention (otherwise, without this flag running such a wasm file would be incredibly difficult since it wasn't designed for it!).

The main reason for needing a new flag here is that while we can use many wasi APIs by default, like fd_write, there are some changes that are bad for JS. The main one is Memory handling: it's better to create the Memory early in JS, both to avoid fragmentation issues on 32-bit, and to allow using the Memory by JS while the wasm is still loading (e.g. to set up files). For standalone wasm, though, we can't do that since there is no JS to create it for us, and indeed wasi runtimes expect the memory to be created and not imported. So STANDALONE_WASM basically means, "make an effort to make the wasm as standalone as possible, even that wouldn't be good for JS." Without this flag we do still try to do that, but not when it compromises JS size.

This adds libstandalone_wasm which contains things in C to avoid using JS, like routing exit to wasi's proc_exit. There may be better ways to do some of those things, which I intend to look into as followups, but I think this is a good first step to get the flag landed in a working and tested state, in as small a PR as possible.

This adds testing in the form of running standalone wasm files in wasmer and wasmtime, on Linux. I have some ideas about how to generalize that in a nicer way, but want to leave that for followups.

This updates EMSCRIPTEN_METADATA - we need a new field to know whether a wasm is standalone or not, as the ABI is different.

After this lands I'll update the wasm standalone docs.

…ndalone, as it does change the ABI

kripken · 2019-09-19T16:21:43Z

I realized this requires EMSCRIPTEN_METADATA changes - we need a new field to know whether a wasm is standalone or not, as the ABI is different. Added now.

syrusakbary · 2019-09-19T22:56:12Z

Love this! Thanks for the effort there

sbc100

This is a great step forward!

sbc100 · 2019-09-20T17:58:12Z

src/library_wasi.js

+  proc_exit: function(code) {
+    return _exit(code);
+  },
+});


I like this. I really hope we can move fd_read in there and link it by default in the future.

sbc100 · 2019-09-20T17:58:50Z

src/modules.js

@@ -152,6 +152,10 @@ var LibraryManager = {
      libraries.push('library_glemu.js');
    }

+    if (STANDALONE_WASM) {
+      libraries.push('library_wasi.js');


Why make this conditional? Aren't all the library functions only included on demand anyway?

Yes, it's all on demand, but we also include some libraries only based on their flags, like the GL stuff right above this. I think it's nice as it reflects the fact that nothing should be used from those libraries without the flag, the error is clearer that way.

sbc100 · 2019-09-20T17:59:22Z

src/preamble.js

+#if STANDALONE_WASM
+    // In pure wasm mode the memory is created in the wasm (not imported), and
+    // then exported.
+    // TODO: do not create a Memory earlier in JS


Seems like a TODO we certainly want to fix.

sbc100 · 2019-09-20T18:01:31Z

tests/test_other.py

@@ -8296,16 +8296,21 @@ def run(args, expected):
    run(['-s', 'TOTAL_MEMORY=32MB', '-s', 'ALLOW_MEMORY_GROWTH=1', '-s', 'BINARYEN=1'], (2 * 1024 * 1024 * 1024 - 65536) // 16384)
    run(['-s', 'TOTAL_MEMORY=32MB', '-s', 'ALLOW_MEMORY_GROWTH=1', '-s', 'BINARYEN=1', '-s', 'WASM_MEM_MAX=128MB'], 2048 * 4)

-  def test_wasm_targets(self):


Is it ok that we seems to have lost of test coverage for fastcomp here?

Ah, good point, I missed something here. I thought it was fine to remove this, but actually it hid a possible regression: building in fastcomp without SIDE_MODULE but with -o X.wasm.

That never worked very well anyhow, it was a half-hearted attempt at standalone wasm files. So I am leaning towards showing a clear error in that case that the user should use the upstream wasm backend for standalone wasm?

Looks like there is a reasonable fix actually, pushed now.

sbc100 · 2019-09-20T18:05:00Z

tools/shared.py

@@ -3090,7 +3106,8 @@ def add_emscripten_metadata(js_file, wasm_file):
      WebAssembly.lebify(global_base) +
      WebAssembly.lebify(dynamic_base) +
      WebAssembly.lebify(dynamictop_ptr) +
-      WebAssembly.lebify(tempdouble_ptr)
+      WebAssembly.lebify(tempdouble_ptr) +
+      WebAssembly.lebify(int(Settings.STANDALONE_WASM))


I wonder if we should just not add this metadata for STANDALONE_WASM? Since the metadata was mostly to support for loading of emscripten-like wasm binaries outside of emscripten? Perhaps we can do that as a followup after consulting with the users of the metadata? Who are the users exactly I'm not sure.

I think we can check with people if it's unnecessary, but we should either land this with the change (which is the safe thing) or block this until we decide whether to add this change or not. I lean towards the former since the latter may take a while.

In general I think this may still be useful for the same people for the same reasons as before - these wasms are as standalone as we can make them, but still may contain e.g. WebGL calls if using OpenGL, etc. So they are not pure wasi in general. Which means it's useful to have metadata about the emscripten ABI, and this flag does affect the ABI.

Sure we can land this now.. I was just suggesting this as a followup.

Hmm, I remembered that we have promised to only append to the metadata list. So if we land this we probably don't want to remove the field later.

However, as I said earlier, I think we do want this field, so I'm not worried. Are you still concerned?

Re-reading your posts, were you saying that we should not emit EMSCRIPTEN_METADATA at all for STANDALONE_WASM? (I initially understood you to mean this new field specifically.)

I'm not talking about the field, but the entire metadata section. Its seems to be that the point of this section is so that runtimes and extract the information needs from the "emscripten-flavor" wasm file and run it without JS.

My idea is that for PURE_WASM we may not need the section at all in the future. In which case .. this field would be zeor by definition... or we could bump the major version which allows is the change it however we like.

My real long term hope is that the users of this metadata section are in fact that exact same users who want to use PURE_WASM instead and that PURE_WASM don't needs to self-describe in this way, and so we may be able to simply remove this completely?

I see now, thanks.

Yeah, maybe we eventually won't need this metadata at all, if we only use standard APIs. That seems like the very far future though, as we'll need nonstandard APIs to do many things for quite some time, if not forever (e.g. WebGL).

cggallant · 2019-11-27T01:19:43Z

Is it somehow possible to have a STANDALONE_WASM module export its table similar to how it exports its memory?

kripken · 2019-11-27T01:26:04Z

@cggallant Not currently, but we should probably add an option for that, good point! I opened #9907 for discussion.

sbc100 · 2019-11-27T01:29:12Z

I think you can just add -Wl,--export-table to the link time flags.. however, we probably should switch from a whitelist of a blacklist when filtering linker flags .. I think today we block all unknown-to-emscripten linker flags.

cggallant · 2019-11-27T01:37:40Z

Oh very nice! That flag worked and exported the table as __indirect_function_table. It's limited to a maximum of 2 elements. Is there a flag to increase that?

sbc100 · 2019-11-27T04:24:09Z

Yes it looks like there is also `--growable-table`.. although I have to admit I don't remember adding that option :) wasm-ld --help will list all these option: ``` OVERVIEW: LLVM Linker USAGE: ../llvm-build/bin/wasm-ld [options] file... OPTIONS: --allow-undefined-file=<value> Allow symbols listed in <file> to be undefined in linked binary --allow-undefined Allow undefined symbols in linked binary --check-features Check feature compatibility of linked objects (default) --color-diagnostics=<value> Use colors in diagnostics; one of 'always', 'never', 'auto' --color-diagnostics Use colors in diagnostics --compress-relocations Compress the relocation targets in the code section. --demangle Demangle symbol names --emit-relocs Generate relocations in output --entry <entry> Name of entry point symbol --error-limit=<value> Maximum number of errors to emit before stopping (0 = no limit) --export-all Export all symbols (normally combined with --no-gc-sections) --export-dynamic Put symbols in the dynamic symbol table --export-table Export function table to the environment --export=<value> Force a symbol to be exported -E Alias for --export-dynamic --fatal-warnings Treat warnings as errors --features=<value> Comma-separated used features, inferred from input objects by default. --gc-sections Enable garbage collection of unused sections --global-base=<value> Where to start to place global data --growable-table Remove maximum size from function table, allowing table to grow --help Print option help --import-memory Import memory from the environment --import-table Import function table from the environment --initial-memory=<value> Initial size of the linear memory --lto-O<opt-level> Optimization level for LTO --lto-partitions=<value> Number of LTO codegen partitions -L <dir> Add a directory to the library search path -l <libName> Root name of library to use --max-memory=<value> Maximum size of the linear memory --merge-data-segments Enable merging data segments --mllvm <value> Options to pass to LLVM --no-check-features Ignore feature compatibility of linked objects --no-color-diagnostics Do not use colors in diagnostics --no-demangle Do not demangle symbol names --no-entry Do not output any entry point --no-export-dynamic Do not put symbols in the dynamic symbol table (default) --no-gc-sections Disable garbage collection of unused sections --no-merge-data-segments Disable merging data segments --no-pie Do not create a position independent executable (default) --no-print-gc-sections Do not list removed unused sections --no-threads Do not run the linker multi-threaded --no-whole-archive Do not force load of all members in a static library (default) -O <value> Optimize output file size -o <path> Path to file to write output --pie Create a position independent executable --print-gc-sections List removed unused sections --relocatable Create relocatable object file --reproduce=<value> Dump linker invocation and input files for debugging --shared-memory Use shared linear memory --shared Build a shared object --stack-first Place stack at start of linear memory rather than after data --strip-all Strip all symbols --strip-debug Strip debugging information -S Alias for --strip-debug -s Alias for --strip-all --thinlto-cache-dir=<value> Path to ThinLTO cached object file directory --thinlto-cache-policy=<value> Pruning policy for the ThinLTO cache --thinlto-jobs=<value> Number of ThinLTO jobs --threads Run the linker multi-threaded --trace-symbol=<value> Trace references to symbols --trace Print the names of the input files -t Alias for --trace --undefined=<value> Force undefined symbol during linking --verbose Verbose mode --version Display the version number and exit -v Display the version number --whole-archive Force load of all members in a static library --wrap=<symbol>=<symbol> Use wrapper functions for symbol -y <value> Alias for --trace-symbol -z <option> Linker option extensions ```

…

On Tue, Nov 26, 2019 at 7:37 PM C. Gerard Gallant ***@***.***> wrote: Oh very nice! That flag worked and exported the table as __indirect_function_table. It's limited to a maximum of 2 elements. Is there a flag to increase that? — You are receiving this because your review was requested. Reply to this email directly, view it on GitHub <#9461?email_source=notifications&email_token=AAD55ZJ46DVJX7QJXPV5DCLQVXFOLA5CNFSM4IYNIZZ2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEFH7SUA#issuecomment-558889296>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAD55ZL3FXBCYCOUHKEAOULQVXFOLANCNFSM4IYNIZZQ> .

cggallant · 2019-11-27T12:22:32Z

You had mentioned a white list? The --growable-table flag emits a warning message and the generated module is still fixed at 2 elements:
shared:WARNING: ignoring unsupported linker flag: --growable-table

I ran the --help flag on the wasm-ld.exe in my emsdk-master\upstream\bin folder and it gave me the same list that you posted above.

cggallant · 2019-11-27T12:48:41Z

I was able to find the white list in the emcc.py file's SUPPORTED_LLD_LINKER_FLAGS

When I added --growable-table to the list and re-ran the command line, the generated module no longer has an upper limit for the table size.

With this, I was able to get my code working. Thank you very much for all your help!

This adds an option to build "standalone" wasm files, that is, files that can be run without JavaScript. They idea is that they can be run in either A wasm server runtime like wasmer or wasmtime, where the supported APIs are wasi. A custom wasm embedding that uses wasm as plugins or dynamic libraries, where the supported APIs may include application-specific imports and exports, some subset of wasi, and other stuff (but probably not JS-specific things). This removes the old EMITTING_JS flag, which was only used to check whether to minify wasm imports and exports. The new STANDALONE_WASM flag also prevents such minification (since we can't assume the JS will be the only thing to run the wasm), but is more general, in that we may still emit JS with that flag, but when we do the JS is only a convenient way to run the code on the Web or in Node.js, as the wasm can also be run standalone. Note that SIDE_MODULE is an interesting case here: with wasm, a side module is just a wasm file (no JS), so we used to set EMITTING_JS to 0. However, we can't just set STANDALONE_WASM in that case, since the side module may expect to be linked with a main module that is not standalone, that is, that depends on JS (i.e. the side module may call things in the main module which are from JS). Aside from side modules, though, if the user says -o X.wasm (emit wasm, no JS file) then we do set STANDALONE_WASM, since that is the likely intention (otherwise, without this flag running such a wasm file would be incredibly difficult since it wasn't designed for it!). The main reason for needing a new flag here is that while we can use many wasi APIs by default, like fd_write, there are some changes that are bad for JS. The main one is Memory handling: it's better to create the Memory early in JS, both to avoid fragmentation issues on 32-bit, and to allow using the Memory by JS while the wasm is still loading (e.g. to set up files). For standalone wasm, though, we can't do that since there is no JS to create it for us, and indeed wasi runtimes expect the memory to be created and not imported. So STANDALONE_WASM basically means, "make an effort to make the wasm as standalone as possible, even that wouldn't be good for JS." Without this flag we do still try to do that, but not when it compromises JS size. This adds libstandalone_wasm which contains things in C to avoid using JS, like routing exit to wasi's proc_exit. There may be better ways to do some of those things, which I intend to look into as followups, but I think this is a good first step to get the flag landed in a working and tested state, in as small a PR as possible. This adds testing in the form of running standalone wasm files in wasmer and wasmtime, on Linux. I have some ideas about how to generalize that in a nicer way, but want to leave that for followups. This updates EMSCRIPTEN_METADATA - we need a new field to know whether a wasm is standalone or not, as the ABI is different.

kripken and others added 18 commits September 7, 2019 18:47

wip

e11288c

fixups [ci skip]

479e297

Merge remote-tracking branch 'origin/incoming' into wasi6

8dd1aaa

PURE_WASM

b25ee39

wip [ci skip]

fab6ceb

fixes [ci skip]

664e17b

more [ci skip]

40e2f58

comments [ci skip]

90b1eaa

standalone [ci skip]

78db28a

cleanup [ci skip]

2e0770a

wasmtime too [ci skip]

1848451

Merge remote-tracking branch 'origin/incoming' into wasi6

6e5db08

clear error on EMITTING_JS being removed

f781f39

text

5994a4c

standalone wasm library is upstream-only

d129daf

here to there

5a0b65c

here to there

f89fa67

fixes

998aa4a

kripken requested review from sbc100 and dschuff September 19, 2019 16:18

kripken added 2 commits September 19, 2019 09:18

Merge remote-tracking branch 'origin/incoming' into wasi6

61c75eb

Update metadata, and add a metadata field for whether the wasm is sta…

521686d

…ndalone, as it does change the ABI

sbc100 approved these changes Sep 20, 2019

View reviewed changes

kripken added 5 commits September 20, 2019 12:39

fix fastcomp with -o X.wasm without SIDE_MODULE

c347034

Merge remote-tracking branch 'origin/incoming' into wasi6

5e894da

fix

8136e2b

fix

67007b8

fix circleci maybe

0f6e911

kripken merged commit 49c12a9 into incoming Sep 20, 2019

kripken deleted the wasi6 branch September 20, 2019 22:37

syrusakbary mentioned this pull request Sep 22, 2019

How to compile with STANDALONE_WASM emscripten-core/emsdk#358

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

STANDALONE_WASM option #9461

STANDALONE_WASM option #9461

kripken commented Sep 19, 2019 •

edited

Loading

kripken commented Sep 19, 2019

syrusakbary commented Sep 19, 2019

sbc100 left a comment

sbc100 Sep 20, 2019

sbc100 Sep 20, 2019

kripken Sep 20, 2019

sbc100 Sep 20, 2019

sbc100 Sep 20, 2019

kripken Sep 20, 2019

kripken Sep 20, 2019

sbc100 Sep 20, 2019

kripken Sep 20, 2019

sbc100 Sep 20, 2019

kripken Sep 20, 2019

kripken Sep 20, 2019

sbc100 Sep 20, 2019

kripken Sep 20, 2019

cggallant commented Nov 27, 2019

kripken commented Nov 27, 2019

sbc100 commented Nov 27, 2019

cggallant commented Nov 27, 2019

sbc100 commented Nov 27, 2019 via email

cggallant commented Nov 27, 2019

cggallant commented Nov 27, 2019

STANDALONE_WASM option #9461

STANDALONE_WASM option #9461

Conversation

kripken commented Sep 19, 2019 • edited Loading

kripken commented Sep 19, 2019

syrusakbary commented Sep 19, 2019

sbc100 left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

cggallant commented Nov 27, 2019

kripken commented Nov 27, 2019

sbc100 commented Nov 27, 2019

cggallant commented Nov 27, 2019

sbc100 commented Nov 27, 2019 via email

cggallant commented Nov 27, 2019

cggallant commented Nov 27, 2019

kripken commented Sep 19, 2019 •

edited

Loading