
Better JS size for small programs #5794

Closed
kripken opened this issue Nov 15, 2017 · 30 comments

@kripken
Member

kripken commented Nov 15, 2017

We've historically focused on full support of existing C/C++ code by default, which leads to us emitting a bunch of things that increase the default JS code size, like

  • longjmp and exceptions support (dynCall/invoke glue)
  • atexit support
  • Filesystem support
  • Various POSIX things (like /dev/random).
  • Ability to run on the web, node.js, various js shells
  • ccall, string stuff, etc. utilities for JS convenience
  • etc.

As a result the JS emitted for "hello world" is not tiny.

In a medium or large project, the compiled code makes up the bulk of the output anyhow, and we have very good code size optimization there. But people do notice the JS size on small programs, and with wasm increasing the interest in compiling to the web, this has been coming up.

I've been thinking that dynamic linking might get us there, as we emit no JS for a standalone wasm dynamic library (SIDE_MODULE). However, dynamic libraries add relocation (unnecessary for many things) and we don't necessarily want 0 JS, we want "minimal" JS. So dynamic libraries are not the solution here.

Two other possible paths we could go down:

  1. Audit the current JS output and fix things one by one. That is, look at what we emit and, starting from the larger things, find out whether we can optimize it out, put it behind a flag, etc. To really make progress that way, we may need breaking changes, but e.g. putting longjmp support behind a flag could be OK as long as we issue a clear message for developers (your program uses longjmp, you need to compile with -s LONGJMP_SUPPORT=1).
  2. Introduce a new codegen mode, MINIMAL_JS perhaps, which we would design from scratch to be minimal - we'd start from nothing and add just the things we want in that mode. It wouldn't support things like a filesystem or POSIX or atexit etc. (and we'd need to decide on exceptions and longjmp, asm.js or just wasm, etc.). We'd point people to the "normal" non-minimal JS for those things.

Thoughts?

@juj
Collaborator

juj commented Nov 16, 2017

Introduce a new codegen mode, MINIMAL_JS perhaps, which we would design from scratch to be minimal

I think before we do this, there's a lot we can optimize with feature-specific flags and other methods. Not too long ago I wrote a tiny chess game in GLES2, at https://github.com/juj/tiny_chess, and looking at its build output, in one evening I was able to remove about two thirds of the runtime boilerplate as unnecessary (from 150KB uncompressed down to 50KB uncompressed). There is a lot of low-hanging fruit here, and I think it would be best to start splicing off these types of items on a feature basis; that way it won't be a wholesale "old runtime" or "new runtime" question. Longjmp support is one such example. Another thing is that I'd like to start slimming down individual items in the Module object, which don't DCE too well.

@kripken
Member Author

kripken commented Nov 17, 2017

Good points. Thinking on this some more, maybe we should clarify the target goal. Is it

  • Reduce the current JS code by a factor of 2, 3, 4..? That would probably still leave a multiple of 10K (uncompressed).
  • Have an easy way to get a truly minimal amount of JS code, really just code to load the wasm and provide minimal support, leading to something like a multiple of 1K.

I think the first goal is reachable with an incremental improvement approach, but to reach the second, starting "from scratch" in a mode where we add only necessary JS seems more likely to succeed.

@jgravelle-google
Contributor

Crazy thought: if the goal is to have a minimal wasm hello world, can we change users' expectations around what that looks like? As a strawman:

#include <js_console.h>

int main() {
  console_log("Hello world!");
  return 0;
}

can generate

(import "env" "console_log" (func $console_log ...))
(data (i32.const 1024) ("Hello world!\00"))
(func $main (result i32)
  (call $console_log
    (i32.const 1024)
  )
  (i32.const 0)
)

Then the JS glue would just be Pointer_stringify and a wrapper around console.log.

The idea being, if people want minimal wasm, they're either doing something web-first, and/or they're experimenting with the wasm format, and C just happens to compile to that today. Supporting printf alone is a huge part of what makes our current hello world so large, between the libc functions it winds up including in the wasm, and the posix/syscall support in JS. If what people really want is "the ability to write to the console from C," then exposing the JS API as directly as possible is going to be the cheapest option.
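That minimal glue could be sketched roughly as follows. This is a hedged sketch: `readCString` and `makeImports` are assumed helper names matching the strawman above, not an existing Emscripten API, and `env.console_log` is the import name from the strawman wat.

```javascript
// Hypothetical minimal glue for the strawman above.
// readCString decodes a NUL-terminated UTF-8 string from wasm linear memory.
function readCString(buffer, ptr) {
  const bytes = new Uint8Array(buffer, ptr);
  let end = 0;
  while (bytes[end] !== 0) end++;
  return new TextDecoder('utf-8').decode(bytes.slice(0, end));
}

// Build the import object for the module: console_log is just a thin
// wrapper around console.log, given the instance's memory.
function makeImports(memory) {
  return {
    env: {
      console_log: (ptr) => console.log(readCString(memory.buffer, ptr)),
    },
  };
}
```

The whole "runtime" is then a string decoder and one wrapper function, which is about as close to the raw JS API as C code can get.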

@kripken
Member Author

kripken commented Nov 17, 2017

Yeah, that's very relevant here - people that want minimal output should avoid printf etc. We do already have stuff for them, like emscripten_log, but it takes a simple printf-like format string for convenience (which brings in a few K of support code). Could be nice to add to the HTML5 API something that just gets a string, and so just wraps console.log, like emscripten_console_log as you suggest.

@jgravelle-google
Contributor

I wonder if we could save code size by making a C++ API that does string conversion.

class Console {
 public:
  template <typename... Rest>
  void log(char* str, Rest... rest) {
    add_to_inner_buffer(str);
    log(rest...);
  }

  template <typename... Rest>
  void log(int x, Rest... rest) {
    add_to_inner_buffer(itoa(x));
    log(rest...);
  }

  void log() {
    console_log(inner_buffer);
    clear_inner_buffer();
  }
} console;

// Later
console.log("Foo = ", foo, ", bar = ", bar);

Anyway all that is to say that I think -s MINIMAL_WASM is worth investigating (I also think we should keep the wasm as small as possible as well as the js). It's a use case that we're seeing people want, that we don't have a great support story for.

@curiousdannii
Contributor

curiousdannii commented Nov 18, 2017

If you call printf, that's going to take a lot of code. I wouldn't consider that overhead though, it's the true cost of a powerful C function. Of course it would be good to make it easier to use cheaper logging functions.

I don't think long term it would be helpful to have two modes. If a minimal JS mode is started it should eventually be the default and then the only mode.

Changes I think could help:

  • Not pull in the whole of Browser when only some is required (Separate Browser.mainLoop from the browser API functions #5355) (potential 30kb uncompressed saving)
  • Make file loading less accommodating. Use one set of helper functions for everything. Stick with only one method (fetch?); the user can add polyfills if needed. Be more opinionated about how files are loaded, i.e. a file is either inline or loaded from one set location. If you want to differ from that, you need to manually set an option to add the code to support it.
  • Similarly, rely on TextDecoder being present and the user can add polyfills if needed
  • Maybe use native promises rather than the runtime dependencies system? I imagine that a promise chain would be more straightforward and more concise.
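Taken together, those points could collapse loading down to something like this sketch. All names here are illustrative assumptions: one fetch-based path, native promises instead of the run-dependency system, and TextDecoder simply required rather than polyfilled.

```javascript
// One opinionated loading path: fetch + native promises, no fallbacks.
// Users who need XHR or older engines add their own polyfills.
function loadWasm(url, imports) {
  return fetch(url)
    .then((resp) => resp.arrayBuffer())
    .then((bytes) => WebAssembly.instantiate(bytes, imports))
    .then(({ instance }) => instance.exports);
}
```

A promise chain like this replaces both the multiple file-loading code paths and the addRunDependency bookkeeping with a few lines.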

@rongjiecomputer
Contributor

I did some experiments on this. The result is pretty long, so I put it in a repo. Please see https://github.com/rongjiecomputer/minimal-emscripten

Demo is GameBoy emulator by @binji.

@binji
Contributor

binji commented Nov 19, 2017

@rongjiecomputer Yeah, I specifically made sure that binjgb didn't use much host functionality (GL, audio, etc.) so it was easier to port. I'm not sure many developers will want to (or even be able to) do that. But it's worth exploring that direction -- I really wish I could have a minimal bindings layer like the one that you generated!

Perhaps it would be good to explore which use cases you want to support, and have that point you toward the goals. I can think of a few:

  • Someone who wants minimal JS glue, and will do everything themselves (this is me)
  • Someone porting an existing, non-graphical C/C++ library and wants to expose a convenient JS API for the web
  • Same as previous, but wants to expose the library for node.js
  • Someone writing new code that wants to use WebAssembly. They want to write mostly normal C++ but are OK using web specific libraries and avoiding POSIX cruft
  • Someone with an existing portable application, that just wants an easy way to port to the web (emscripten works pretty well for them currently)
  • Someone who is working on their own language and wants to use WebAssembly as a backend. They need to write their own runtime that works with their language, but they don't want to redo all the work done in the library_*.js files.
  • Someone who has an existing web application and has pinpointed a hotspot in their code that they want to optimize. They want an easy way to port this code from JavaScript and drop it in.
  • Someone who isn't developing for the web, and just wants to use WebAssembly as a portable native executable format. They don't want any JavaScript at all, just the necessary information to create their own host-specific glue code.

And I'm sure there are many more! Thoughts?

@rongjiecomputer
Contributor

My experiment is basically option 2 mentioned by @kripken, that is, a new mode starts from nothing and slowly adds more things when needed.

A non-user-facing breaking change is the use of ES6 template strings instead of the hacky C preprocessor. This allows more complex operations in the template. The downside is that library_*.js will have to be rewritten, and porting all library_*.js files to the new template format might be infeasible. There would be two versions of library_*.js, and I am not sure Emscripten will want that, so this experiment might never make it into Emscripten.

Someone who wants minimal JS glue, and will do everything themselves (this is me)

This is me as well. In my opinion, this offers the best performance due to fewer wrapper layers and temporary JS objects.

Someone porting an existing, non-graphical C/C++ library and wants to expose a convenient JS API for the web. Same as previous, but wants to expose the library for node.js.

Someone who has an existing web application and has pinpointed a hotspot in their code and want to optimize it. They want an easy way to port this code from JavaScript and drop it in.

I want to support this, but the user is expected to write their own JS code to handle the filesystem as well. Most big C/C++ projects, like protobuf, do use the filesystem in some functions. The user has to do extra work to make sure these functions are not compiled into the .wasm.

Currently C++ exceptions are not supported, as I have not implemented __cxa_allocate_exception etc. yet. Even the simplest C++ code might not work because of this. I am hesitant about whether to implement the JS version now (which might be hard to implement, only to be abandoned when WebAssembly exceptions ship) or just wait for WebAssembly exceptions to be implemented in Chrome Canary (which means waiting until 2018).

Currently my plan is to let the user call gen.js; gen.js will check whether non-default values of GLOBAL_BASE, TOTAL_MEMORY, and TOTAL_STACK are set, pass extra arguments to emcc.py, read the last line of the generated .wast, and then generate the JS code.

I have a final exam to prepare, so I am going to submerge myself for a while. I might still have time to participate in the discussion, but will only have time to work on this some time in December.

@juj
Collaborator

juj commented Nov 20, 2017

people that want minimal output should avoid printf etc. We do already have stuff for them, like emscripten_log

A nice way to do simple prints from C is

#include <emscripten.h>
int main()
{
  int foo = 42;
  EM_ASM(console.log('hello ' + $0), foo);
}

which is also size-efficient (although it could be even smaller if the ASM_CONSTS array were optimized away when not needed).

Thinking on this some more, maybe we should clarify the target goal. Is it

Reduce the current JS code by a factor of 2, 3, 4..? That would probably still leave a multiple of 10K (uncompressed).
Have an easy way to get a truly minimal amount of JS code, really just code to load the wasm and provide minimal support, leading to something like a multiple of 1K.

if the goal is to have a minimal wasm hello world, can we change users' expectations around what that looks like? As a strawman:

I think we could do both; but what I'm saying is that we should have an extremely strong bias toward doing the first item first, until we run out of items to optimize, because refactoring >>> rewriting (the usual Joel on Software and Coding Horror articles, etc.). Jumping straight into a rewrite here would feel like a bad engineering call. I don't like the idea of having two runtimes and scenarios where we would end up with people asking "how do I do this in the new vs. old runtime?", "is this thing X still compatible with the new runtime?", or "will this thing X ever work in the new runtime?". There is nothing fundamentally incompatible or impossible about DCEing the current runtime one item at a time; all it takes is the engineering time to start looking at opportunities to optimize.

Different directions are raising these kinds of "hands up" reactions of "what if we just started over with the runtime?", and those kinds of thoughts come mainly from not understanding why the undesired lines exist in the runtime, when they are needed, what problems they are there to solve, and what the path would be to optimizing them away. Sure, nobody likes having to deal with overhead from other developers' problems; that is understandable. As first-party developers of all the features that go into the runtime, we do have that knowledge, so we should be able to cut those items down one at a time.

Taking a peek at building the above code example and looking at the output, there is a lot of content there that we have an opportunity to implement better DCE machinery for. If we "started over" with no better DCE machinery than we currently have, we would still lack the capability to DCE, would eventually end up in a similar situation where we can't DCE undesired code away, and would then start to implement the exact DCE methods that should probably have been built in the first place. Having flags like -s SUPPORT_ENVIRONMENT_IS_WORKER=0/1 as manual DCE methods does not sound like a bad idea as long as they are understandable and feature-based (rather than -s NEW_RUNTIME=0/1); putting preamble.js on a diet, migrating functions from there to --js-libraries, and using smaller feature-focused slices such as -s POSIX_PROCESS=1 to control whether to emit process argc/argv and exit() support, or -s ENABLE_CCALL=1 for emitting cwrap()/ccall(), would be great. That way developers will have a chance to tune the existing output to the features they need, and we won't create a rift between developers wanting to access features in old vs. new. The var Runtime object has some fields that look quite archaic and can be removed.

Then after we have slimmed down all that we have any chance to do, we can look at what remains and figure out why those lines are so fundamentally difficult that they could not become manually exportable in some fashion.

I could be wrong: the engineering effort to do the above might prove to be too much, and we might not be able to pull it off. Still, I think that would be the path to the best outcome, one that would allow all features to be used - compartmentalize different features into boxes, then either have the compiler automatically choose which boxes are used or, if that is not possible, do it manually, so that developers have the option to pay exactly for the features they use.

@juj
Collaborator

juj commented Nov 20, 2017

Then the JS glue would just be Pointer_stringify

Oh, on this specific note, I want to kill Pointer_stringify in favor of the more explicit UTF8ToString.

@binji
Contributor

binji commented Nov 20, 2017

@juj I think your suggestion of modularizing the current runtime is a good one! I agree that starting over from scratch is probably the wrong way to go. But I'm also a bit concerned about the idea of adding more -s flags to make it happen. IMO the thing that is both nice and annoying about using emscripten is that it tries to figure out what you want and provides opt-out compiler flags otherwise. It means that the runtime is provided privileged information, so it is difficult for a developer to craft their own. I'd much rather see the layers that the standard runtime provides use a well-defined interface that can easily be replaced by the developer. This "keeps you honest" in terms of what is possible, and provides the opportunity for all of the users I listed in my comment above to use emscripten the way they want.

@kripken
Member Author

kripken commented Nov 20, 2017

@rongjiecomputer - thanks for sharing that experiment! Very interesting and helpful to think about this.

I think that experiment shows it is possible to offer a "minimal JS" runtime option. The details are tricky, though - I would suggest different stuff be enabled in that mode than in the experiment ;) - which perhaps proves one of @juj's points.

@binji - what type of interfaces do you mean here? And what type of use cases do you have in mind for developers replacing parts of the runtime?

@binji
Contributor

binji commented Nov 21, 2017

what type of interfaces do you mean here?

Basically just imports/exports to each layer. The compiler provides an initial layer, and each additional layer depends on a previous one. Should be analogous to a standard module system, though probably won't be exactly that. So something like cwrap depends on very little, and something like the filesystem layer depends on more, and the IndexedDB layer even more, etc. It seems like you have a lot of this behavior already implemented w/ the --js-library stuff, perhaps it can just extend to more of the runtime, and maybe it can be more granular.
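The layering idea might look something like this sketch. Every name here is an assumption for illustration, not an existing Emscripten interface: the point is only that each layer is an ordinary function of the layers below it, so a developer can swap any layer for their own.

```javascript
// Lowest layer: just the raw instance exports, nothing privileged.
function coreLayer(instance) {
  return { exports: instance.exports };
}

// Higher layer: convenience wrappers, depending only on the core layer.
function cwrapLayer(core) {
  return {
    cwrap: (name) => (...args) => core.exports[name](...args),
  };
}

// Compose the layers you want; skip the ones you don't.
function build(instance) {
  const core = coreLayer(instance);
  return { ...core, ...cwrapLayer(core) };
}
```

Because each layer only sees the explicit object the previous layer returned, none of them depends on privileged compiler-provided state, which is the "keeps you honest" property described above.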

And what type of use cases do you have in mind for developers replacing parts of the runtime?

Well, I gave some examples of use cases above, but personally I don't need a lot of the features, so all I want is enough to get the C/C++ code off the ground (set up linear memory, run static initializers, etc.) and then allow me to call into the module. I typically won't use many C library or POSIX features (probably just printf) and will instead plumb through my own functions.

@rongjiecomputer
Contributor

@juj I also agree that starting over from scratch may be bad, because we would end up supporting two runtime versions, adding to the already heavy maintenance burden.

Due to the lack of DCE and the fact that everything is exposed in the global scope, whatever we do here will most likely break someone's code.

But I'm also a bit concerned about the idea of adding more -s flags to make it happen.

I share the same concern. New users won't know all these flags and will be completely puzzled about why their hello world is large even though they don't use ccall etc.

Any thoughts about the possibility of using ES6 template strings as the template engine instead of the C preprocessor and {{{ }}}?

  • Advantage: the Python script will only need to provide Settings.* to the Node.js script and let Node.js do the actual template generation. We might be able to do more DCE.
  • Disadvantage: massive refactoring of the Python and Node.js scripts. library_*.js also needs to be rewritten. Can't be done in a single commit.

Some of the things I want to kill:

  • STACK_ALIGN, QUANTUM_SIZE.
  • A lot of things from Runtime.
  • loadWebAssemblyModule, integrateWasmJS (we should just provide importObj and let users load/compile/instantiate/run WebAssembly themselves at their chosen time).
  • Module.{preRun,postRun,preMain,callMain,run,addOnPreRun,addOnExit,...} (since users will be running code themselves).
  • Module["asm"] (ditto; users get functions directly from instance.exports).
  • addRunDependency etc.
  • All Math polyfills.
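The instantiation-related items on that list all amount to the user driving loading themselves. A hedged sketch of that flow, where the toolchain would only supply the import object and `runWasm` is an assumed name:

```javascript
// The toolchain provides only `imports`; the user loads, compiles,
// instantiates, and runs at a time of their choosing, reading functions
// directly from instance.exports instead of through Module["asm"].
async function runWasm(bytes, imports) {
  const { instance } = await WebAssembly.instantiate(bytes, imports);
  return instance.exports;
}
```

With this shape there is no preRun/postRun machinery and no addRunDependency bookkeeping; sequencing is just ordinary async control flow in user code.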

@juj
Collaborator

juj commented Nov 21, 2017

Due to lack of DCE and the fact that everything is exposed in global, whatever we do here will most likely break someone's code.

I think it's not concerning at all to default to exporting fewer functions or features, since the breakage will then in 99% of cases manifest as some code being missing ('undefined function' exceptions), and be manageable by adding a new linker flag. For example, currently everyone gets ccall and cwrap by default, and I think we could easily migrate to users having to specify them explicitly if they want them. This will break users, but it will be a minor nuisance, since users will have an easy fix by adding a single flag -s EXTRA_RUNTIME_FUNCTIONS_TO_EXPORT=['ccall'] to the build (well, we may want to ease that flag into something like --export=ccall --export=cwrap etc. that will be easy to layer on top of each other).

If there was a new "completely from scratch" runtime, then we'd be scrambling to figure out how to make it compatible, and what use cases it's for. It would have a chance to split the developer community into two, like what happened to Python 2 and 3.

Your list of things to optimize is a good one, and I think that's already a great start to boxing up features to separate compartments to control specifically. I still think that the C preprocessor with #ifdeffing will be our best weapon for selectively phasing features out, especially the more overarching ones.

Baby steps to success, in reviewable units. #5826 to start off with.

Any thoughts about the possibility of using ES6 template strings as the template engine instead of the C preprocessor and {{{ }}}?

I think "in addition to" rather than "instead", unless this can be proven superior in practice. We will migrate to the latest Node.js LTS in emsdk in the next tagged release, so this will at least be available out of the box.

@curiousdannii
Contributor

curiousdannii commented Nov 21, 2017

With more -s flags, especially if more of them are arrays, could emscripten support a settings JSON file so that these don't all need to be passed as args? And maybe also the other emcc args? (If so, I guess the -s flags would need to be nested.)

And could emcc.py be made more modular? It's huge and very hard to approach as someone new to the codebase. (I know this would be quite the undertaking.) I also don't think it should directly contain JS code gen.

@kripken
Member Author

kripken commented Nov 21, 2017

@binji I see, thanks. That sounds good in general, but I admit I don't have a clear idea of how the layers would look yet. I think, though, that such a design could be done separately from the JS size issue? Maybe it would even become easier after we do some of the JS size shrinking, as that will entail refactoring and modularization.

@curiousdannii Both your suggestions are very good, we should do those. For emcc.py modularization, we can split out code into smaller components in tools. (Also tools/shared.py could be split up, it already has internal components, but it's all in one big file.)

@kripken
Member Author

kripken commented Nov 21, 2017

@juj

For example, currently everyone gets ccall and cwrap by default, and I think we could just easily migrate to users having to specify them explicitly if they want them. This will break users, but it will be minor nuisance

This is a key point, good to bring it up. If we focus on the incremental modularization approach (instead of a new "minimal JS" mode), then I think we need to agree that

  • There will be some breakage. We can mitigate so that it is minimally surprising, but users may need to change some things. E.g. we may export fewer things by default, like ccall/cwrap in the example above.
  • How much breakage we accept will depend on code size wins. I think we should set a target for the size of the JS we emit (for, say, hello world), and that should be a factor in deciding whether a particular breakage is worth it. (I don't know what the size target should be.)

Do those principles make sense?

@binji
Contributor

binji commented Nov 21, 2017

@kripken Yes, I think layering can help make DCE simpler, but it seems like there are low-hanging fruit to remove first.

There will be some breakage.

Awesome! I'm very excited to see this. Backward compatibility is great, but I think we as developers understand that sometimes things must break to move the project forward. As long as I can pin my emscripten version and only upgrade as desired, and there are clear errors when things break, then I am OK.

I think we should set a target for the size JS we emit (for, say, hello world)

This is a good idea, but I don't think hello world is a good example. It's true that everyone tries this out and having it be slim is good PR, but it isn't very realistic for actual users. Emscripten is pretty widely used; perhaps you can just take some concrete projects as examples and measure reduction for them instead?

@kripken
Member Author

kripken commented Nov 21, 2017

This is a good idea, but I don't think hello world is a good example.

Fair enough, yeah. We can figure out what to measure on (and the size target to aim for on it) later, assuming we agree to go down this route. And yeah, maybe a real-world project could be good (box2D or ammo maybe?).

@lukewagner
Contributor

I appreciate both the argument for why Option 2 can achieve the better end result and why we should avoid splitting the ecosystem and duplicating work.

It seems like both are achievable, though: build the Option 2 ideal in terms of general primitives while refactoring Emscripten to be implemented in terms of these same primitives. I've seen this strategy work in practice a number of times. Really, this is the path we've started on already with the upstream llvm and lld wasm integration projects; it seems like we just need to continue this refactoring "up the stack".

IMHO, the ideal end state here is that Emscripten is like a Linux distro, pulling together a collection of tools and packages, providing an easy install with good defaults and regularity, filling in the gaps with custom bits, and hosting a community. Does that make sense to everyone else?

One refinement I'd propose making to Option 2 though: in addition to defining Option 2 in terms of "minimal output size", could we also say that Option 2 specifically generates ES Modules which are designed to be used as part of a bigger app that uses lots of ES modules. This change in output form can help explain why the Option 2 environment doesn't attempt to provide a full POSIX environment and why certain things must be passed as explicit parameters instead of using the global scope. I think being able to explain Emscripten output in terms of familiar ES Module concepts will help adoption almost as much as getting the "hello world" output size down.

@kripken
Member Author

kripken commented Nov 21, 2017

@lukewagner

It seems like both are achievable, though: build the Option 2 ideal in terms of general primitives while refactoring Emscripten to be implemented in terms of these same primitives. I've seen this strategy work in practice a number of times. Really, this is the path we've started on already with the upstream llvm and lld wasm integration projects; it seems like we just need to continue this refactoring "up the stack".

I don't think I understand how that would work, and I'm not sure what the suggested primitives here would be. I am also unclear on why refactoring to use general primitives would help decrease code size which is the issue here - for example, using lld (one of your examples) will likely increase our code size. (But it's worth doing for other reasons of course!)

@kripken
Member Author

kripken commented Nov 22, 2017

After some offline discussion with @lukewagner I opened #5828 for discussion of the modularization approach.

For this issue, I think we should make sure that what we decide makes sense with the long-term goal of having ES6-module-based output, as mentioned there:

  • Option 1 (incremental shrinking, putting stuff behind flags, etc.) is mostly orthogonal to that goal. It might help, in that things put behind a flag become more separable, and so easier to put into a literal ES6 module later.
  • Option 2 (new codegen mode for minimal JS) might be more tied to modularization, since the new codegen mode could be a step towards ES6 modules. So far the discussion into that option hasn't gotten into details related to that, though, so it's also currently orthogonal.

@rongjiecomputer
Contributor

For now, I can accept the plan to just guard more components with -s, starting from the easier ones like the Math polyfills, TypedArray detection, etc.

I like the idea of --export=ccall --export=cwrap (or --export=ccall,cwrap,preRun) instead of -s EXTRA_RUNTIME_FUNCTIONS_TO_EXPORT=['ccall'].

Any thoughts about the possibility of using ES6 template strings as the template engine instead of the C preprocessor and {{{ }}}?

I think "in addition to" rather than "instead", unless this can be proven superior in practice. We will migrate to the latest Node.js LTS in emsdk in the next tagged release, so this will at least be available out of the box.

"In addition to" sounds good. runtime.js definitely can use some cleanup with ES6 template string, then more possible cleanups can be explored.

@juj
Collaborator

juj commented Nov 22, 2017

There will be some breakage. We can mitigate so that it is minimally surprising, but users may need to change some things. E.g. we may export fewer things by default, like ccall/cwrap in the example above.
How much breakage we accept will depend on code size wins. I think we should set a target for the size JS we emit (for, say, hello world), and that should be a factor in deciding if a particular breakage is worth it. (I don't know what the size target should be.)

I think rather than "how much code size it wins", I see it as "how does one deal with the breakage?". There are a lot of different types of breakage - something we analyze constantly when communicating with external parties. When there's an easy "you'll start to get a compilation/runtime error about X, so then do Y" model to follow, it's not much of an issue to disrupt users, since the path to action is clear. Even if you don't save too many characters this way, it's easy to justify if it simplifies things, since the action to resolve is easy.

A more difficult one is when there's a bidirectional breakage: old code will not compile/work on a newer compiler version, and the fixed code will not compile/work on an older compiler version. We had this with the Wasm debug table formats change. This is much more annoying than the above, because one can't write any kind of "one ideal form of code" that would be compatible with different versions. We want to avoid bidirectional breakages whenever possible, since this impacts distribution update paths in ecosystems. If it's not possible, then we should aim to be diligent to identify where such bidirectional breakages lie, so that those will be easy to discover.

Removing the Module.{preRun,postRun,preMain,callMain,run,addOnPreRun,addOnExit,...} architecture is probably next on the list in terms of difficulty. It is completely doable, and as a feature I think it is well "blocked out" into its own box to refactor. However, this will need explainers, migration guides, or similar that instruct developers how to proceed. There would be no love lost if we deleted these altogether at some point, for example in favor of the newer Promise.then mechanisms, but we need to make sure people have a path to migrating.

Then there are the changes where we know something will break, but do not want to think through how it will manifest, which existing features are (in)compatible with the change, how to migrate, or how to debug. These are in the red flag zone - users will get angry if they need to research new breakages after someone else's PR lands, and especially if they discover the breakage was not an accident but intentional.

Shrinking code size and ES6 modularization are largely orthogonal: we can certainly add ES6 module structure without putting any effort into shrinking code at all. The two features can also be worked on in parallel.

@kripken (Member, Author) commented Nov 22, 2017

@juj - good points, agreed. I'd add that I think many of the changes here could either have compile-time error messages, or if not, then at least run-time errors, something like this:

  • An existing thing (ccall, dynCall_*, etc.) is no longer exported by default on Module (so our DCE can remove it automatically from the final JS).
  • When building with ASSERTIONS, we emit a stub for the thing we removed (on Module, in these examples), basically an assertion that it is never called, saying you need to export it if you want to use it. This increases code size in ASSERTIONS mode, but that's ok. And we already encourage people to use that mode when things don't work well, so there is a reasonable path for users to get a clear error message telling them what to do.
  • We add tests checking that exporting it works, and that using it without an export in ASSERTIONS mode shows the message.
  • As mentioned in the previous comment, we mention these changes on the mailing list and in the docs, etc.

I think doing that (+ compile-time errors when possible etc.) would reasonably mitigate the breaking changes we are proposing here.
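The stub idea above can be sketched roughly as follows. Names here are illustrative, not Emscripten's exact generated code; the point is that in ASSERTIONS builds a removed export is replaced by a function that fails loudly with an actionable message.

```javascript
// Build a stub that aborts with a message explaining how to restore
// the removed runtime export (the flag name in the message is the idea,
// not necessarily the exact setting name).
function unexportedRuntimeSymbol(name) {
  return function () {
    throw new Error(
      name + ' was removed from the build. Export it at link time ' +
      '(e.g. via an EXPORTED_* setting) if you need to call it.'
    );
  };
}

const Module = {};
// ccall was DCE'd out of the final JS; in ASSERTIONS mode we install the
// stub so a caller gets a clear error instead of "undefined is not a function".
Module.ccall = unexportedRuntimeSymbol('ccall');
```

In a non-ASSERTIONS build the stub would simply be absent, so release builds pay no size cost for the diagnostics.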

@kripken (Member, Author) commented Nov 22, 2017

Opened #5836 with some data on the JS we emit for one testcase, and a list of tasks for it.

@floooh (Collaborator) commented Nov 28, 2017

Interesting thread! I share the concern about adding more -s flags. Main reasons:

  • it leaks out into (and basically requires) a build system, which in general is the 'accepted' way of doing things, but adds a layer of complexity. A good example is marking a function inline with EMSCRIPTEN_KEEPALIVE vs. using -s EXPORTED_FUNCTIONS: the EMSCRIPTEN_KEEPALIVE way is infinitely better. A similar example is Visual Studio's `#pragma comment(lib, "xxx")` versus adding a linker command line option.
  • especially smaller demos often don't even need a build system; if header-only libs are used (like the examples here: https://github.com/floooh/sokol-samples/tree/master/html5), even non-trivial programs can be built directly with `emcc bla.c -o bla.html -O3`. Adding a lot of options to get the smallest possible build would make this much less appealing.

I think the ability to build without a build system is especially important for beginners (in the sense of "people new to WASM/asm.js"), or when building WASM modules for an app that's primarily written in JS.

So from my point of view, a way to move some of these configuration options into the source code would be highly appreciated (I have no good answer for how to achieve this, but EMSCRIPTEN_KEEPALIVE and the custom #pragmas point in the right direction).

kripken added a commit that referenced this issue Dec 16, 2017
This makes us not exit the runtime by default. That means we don't emit code for atexits and other things that happen when the runtime shuts down, like flushing the stdio streams. This is beneficial for 2 reasons:

 * For #5794, this helps remove code. It avoids all the support for shutting down the runtime, emitting atexits, etc. It also enables more optimizations (the ctor evaller in wasm can do better without calls to atexit). This removes 3% of hello world's wasm size and 0.5% of its JS.
 * A saner default for the web. A program on the web that does anything asynchronous will not want the runtime to exit when main() exits, so we set this flag to 1 for many tests, which this PR lets us remove.

However, this is a breaking change. As already mentioned, the possible breakages are

 * printf("hello") will not console.log, since there is no newline - that text would only be printed when the streams are flushed at shutdown, and this change makes us not emit the flushing code.
 * atexits do not run.

Both of those risks are mitigated in this PR: in ASSERTIONS mode, we check if there is unflushed stream output, and explain what to do if so. Same if atexit is called.

This PR has a lot of test changes, some that simplify web tests - because the new default is better for the web - but others that add a param to a shell test - because the new default is less optimal in a shell environment. I think the risk here is lower than those shell tests indicate: we do test quite a lot of things in the shell, but just because it's convenient, not because that's what most users care about.

Also:

 * This PR found an unnoticed bug: FORCE_FILESYSTEM didn't actually do what the name suggests. I think we just never tested it properly with NO_EXIT_RUNTIME. Fixed in this PR.
 * emrun sets NO_EXIT_RUNTIME=0. It is a mode where we specifically want to get the exit code of the running program, as if it were a shell command, not a browser app.
 * Add an FAQ entry, and mention the FAQ.
 * Fix an existing emterpreter-async bug: if we are unwinding the stack as we leave main(), do not call exit - we are not exiting yet, as code will still run later.
 * metadce is now more effective; update the test.
 * Add an FAQ entry on "Module.* is not a function".
 * Fix browser.test_emscripten_main_loop - the pthreads part needs the runtime to exit.
@stale stale bot commented Sep 19, 2019

This issue has been automatically marked as stale because there has been no activity in the past year. It will be closed automatically if no further activity occurs in the next 7 days. Feel free to re-open at any time if this issue is still relevant.

@stale stale bot added the wontfix label Sep 19, 2019
@stale stale bot closed this as completed Sep 26, 2019