Granular imports #167

Closed
Rich-Harris opened this issue May 13, 2015 · 3 comments

@Rich-Harris
Contributor

This is possibly a bit pie-in-the-sky, or borderline delusional (given how little time I've been able to spend on esperanto recently), but I can tell it's an idea that's going to gnaw away at me until I can put fingers to keyboard, so may as well create an issue for it.

The problem

import { foo } from 'someMassiveLibrary';

We don't need all of someMassiveLibrary, just foo. In CommonJS you might do...

var foo = require( 'someMassiveLibrary/theBitsYouActuallyNeed/foo' );

...but that's a) bad (using folder structure to define an interface is a terrible idea, and it means you can also do require('undocumented/private/modules')), and b) not supported by the jsnext:main field in package.json files (source: this comment by @caridy).

The solution

Tree shaking! As part of the minification process, strip out all the junk you don't need. Unless the foo export is a function that uses bar, we can remove bar from the minified file, and so on.

The problem with the solution

Good tree shaking turns out to be really hard in JavaScript. It's extremely tricky to determine whether or not code is necessary. UglifyJS does an admirable job, but it can't remove stuff like this:

var someUnusedObject = {};
someUnusedObject.someUnusedMethod = function () {/*...*/};

It can also create a huge amount of unnecessary work, since a whole load of modules may be imported and processed (parsed, altered to deconflict variable names, and so on) only to be discarded later.
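
To make the someUnusedObject case concrete, here is a minimal sketch of one reason a minifier has to be conservative about it: in JavaScript, a property assignment on a plain object can run arbitrary code if a setter has been installed further up the prototype chain. This is an illustration of the language semantics only; it is an assumption on my part that this is among the reasons UglifyJS leaves such code alone, and the `log` array is a hypothetical name used for the demo.

```javascript
// Illustration only: shows that `obj.prop = value` is not guaranteed to be side-effect free.
const log = [];

// A setter on Object.prototype runs for assignments to plain objects
// that do not already have an own property with that name.
Object.defineProperty(Object.prototype, 'someUnusedMethod', {
  configurable: true,
  set(value) {
    log.push('setter ran');
  }
});

var someUnusedObject = {};
someUnusedObject.someUnusedMethod = function () {/*...*/};

// Clean up so the rest of the program is unaffected.
delete Object.prototype.someUnusedMethod;

console.log(log); // [ 'setter ran' ]
```

Removing the two "unused" lines would change what ends up in `log`, so a tool that cannot prove no such setter exists has to keep them.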

A possible alternative

Rather than throwing everything in and then pruning it back, it seems like it should be possible to just import the nodes we actually need - so if our bundle's entry point imports foo, we just pull in the node that defines foo, and any nodes that foo 'depends on'.

I haven't spent a huge amount of time thinking about the ramifications of all this, but I think there's a decent chance that this is a practical solution. Esperanto already generates bundles that are much more efficient than the browserify/webpack/jspm equivalent (only comparing the bundling here: those tools all do much more than esperanto), but this could be a bit of a game-changer in terms of making the case for ES6 modules over AMD and CommonJS.


Obviously this would require some fairly major rewriting, and may turn out to be completely impractical. Would be interested to hear from anyone who has already been down this road (or a similar one), and has lessons to share!

@Bobris

Bobris commented May 14, 2015

You will also need to search "global" code for side effects (setTimeout, addListener), but I don't think really good tree shaking can be done without type information (you will get false positives from aliased method names), so in the long run I am betting on TypeScript, or possibly Flow+Babel. But esperanto with uglify is currently my favorite (though I haven't tried it in practice).

@Rich-Harris
Contributor Author

I've been hacking away on this idea for a while, and I'm happy enough with the results to share them (though this is still early stage work):

https://github.com/rollup/rollup

It became clear fairly soon that it would be better to do this in a fresh codebase rather than gradually rewrite esperanto to support granular imports, hence 'rollup'.

Current status: it's self-hosting (rollup rolls up rollup), and being used in a couple of minor libraries (e.g. sorcery). I've successfully built rsvp.js with it and all the tests pass (though the relevant branch is offline on a different machine, so I can't share it right now).

To demonstrate the potential of this approach I've ported d3 to ES6 and called it d3-jsnext. Recreating this d3 example takes 8kb (minified and gzipped) - ordinarily, d3 alone is 53kb. I plan to try the same with other libraries that developers frequently use a subset of, such as lodash and three.js. (Other suggestions welcome!)

Brief(ish) description of how it works

We start by parsing the entry module and generating an array of 'statements' (which could actually be statements or declarations) from ast.body (all the top-level AST nodes, minus import declarations):

import { foo } from './foo';

function bar () {          // this is a statement
  console.log( foo() );    //
}                          //

bar();                     // so is this

export default 42;         // and this

Each statement is traversed individually, so that we can determine a) which names it defines (e.g. function foo () {...} or class Foo {...} or const foo = ... and so on), b) which names it depends on (e.g. the bar declaration in the example above depends on foo), and c) which names it modifies (e.g. foo = 1 modifies foo, obj.bar = 2 modifies obj, and doSomething(baz) is said to modify baz just in case - in future we could be smarter about determining whether doSomething can mutate its arguments).
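
The per-statement analysis described above can be sketched roughly as follows. This is not rollup's actual code: `analyse`, `record`, `rootOf` and the node-building helpers are hypothetical names, the nodes are hand-written ESTree-style objects rather than real parser output, and only a handful of node types are handled (no block scoping, classes, destructuring and so on).

```javascript
// Hypothetical sketch: work out which names one top-level statement
// defines, depends on, and modifies.
function analyse(statement) {
  const result = { defines: [], dependsOn: [], modifies: [] };

  // Record a name once, skipping function parameters (`locals`).
  const record = (list, name, locals) => {
    if (name && !locals.includes(name) && !list.includes(name)) list.push(name);
  };

  // `a.b.c` -> 'a'; anything that is not rooted in an identifier -> null.
  const rootOf = (node) => {
    while (node.type === 'MemberExpression') node = node.object;
    return node.type === 'Identifier' ? node.name : null;
  };

  function walk(node, locals) {
    switch (node.type) {
      case 'Identifier':
        record(result.dependsOn, node.name, locals);
        break;
      case 'MemberExpression':
        walk(node.object, locals); // `obj.bar` reads `obj`, not `bar`
        if (node.computed) walk(node.property, locals);
        break;
      case 'CallExpression':
        walk(node.callee, locals);
        for (const arg of node.arguments) {
          walk(arg, locals);
          // "just in case": assume the callee may mutate its arguments
          record(result.modifies, rootOf(arg), locals);
        }
        break;
      case 'AssignmentExpression':
        record(result.modifies, rootOf(node.left), locals);
        if (node.left.type === 'MemberExpression') walk(node.left, locals);
        walk(node.right, locals);
        break;
      case 'ExpressionStatement':
        walk(node.expression, locals);
        break;
      case 'BlockStatement':
        for (const child of node.body) walk(child, locals);
        break;
      case 'FunctionDeclaration':
        walk(node.body, locals.concat(node.params.map((p) => p.name)));
        break;
      case 'VariableDeclaration':
        for (const d of node.declarations) if (d.init) walk(d.init, locals);
        break;
      // Literals and unhandled node types record nothing in this sketch.
    }
  }

  if (statement.type === 'FunctionDeclaration') record(result.defines, statement.id.name, []);
  if (statement.type === 'VariableDeclaration') {
    for (const d of statement.declarations) record(result.defines, d.id.name, []);
  }
  walk(statement, []);
  return result;
}

// Small helpers for writing ESTree-style nodes by hand.
const id = (name) => ({ type: 'Identifier', name });
const call = (callee, ...args) => ({ type: 'CallExpression', callee, arguments: args });
const member = (object, prop) => ({ type: 'MemberExpression', object, property: id(prop), computed: false });
const stmt = (expression) => ({ type: 'ExpressionStatement', expression });

// function bar () { console.log( foo() ); }
const barDeclaration = {
  type: 'FunctionDeclaration',
  id: id('bar'),
  params: [],
  body: { type: 'BlockStatement', body: [stmt(call(member(id('console'), 'log'), call(id('foo'))))] }
};

// obj.bar = 2;
const assignment = stmt({
  type: 'AssignmentExpression',
  operator: '=',
  left: member(id('obj'), 'bar'),
  right: { type: 'Literal', value: 2 }
});

// doSomething(baz);
const mutatingCall = stmt(call(id('doSomething'), id('baz')));

const barInfo = analyse(barDeclaration);
const assignmentInfo = analyse(assignment);
const callInfo = analyse(mutatingCall);
console.log(barInfo, assignmentInfo, callInfo);
```

Note that `console` shows up as a dependency of bar in this sketch; a real implementation would need to distinguish globals from names defined in some module.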

We then expand each statement. Expanding a statement means finding all the statements that define names that this statement depends on (and expanding them, etc), adding them, followed by the statement itself, followed by any statements that modify any names defined by this statement. In the example above, we begin by expanding the bar declaration, which means including the definition of foo (found in foo.js) then the declaration itself. We then include the call to bar - the statements it depends on are already included, so nothing to do. Finally, the export declaration is dealt with according to the output format we're generating.
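
As a rough sketch of the expansion step just described (again, not rollup's real implementation), assume each statement has already been reduced to a record with `defines`, `dependsOn` and `modifies` lists. The record shape, the `id` labels, the `expand` and `bundleOrder` functions, and the extra `foo.count = 0` and `unused` statements are all hypothetical, added only to show a modifier being pulled in and an unreferenced declaration being left out. The export declaration is omitted because its handling depends on the output format.

```javascript
// Hypothetical statement records: two from foo.js plus an unused one,
// and two from the entry module.
const statements = [
  { id: 'foo.js: function foo', defines: ['foo'], dependsOn: [], modifies: [] },
  { id: 'foo.js: foo.count = 0', defines: [], dependsOn: ['foo'], modifies: ['foo'] },
  { id: 'foo.js: function unused', defines: ['unused'], dependsOn: [], modifies: [] },
  { id: 'main.js: function bar', defines: ['bar'], dependsOn: ['foo'], modifies: [] },
  { id: 'main.js: bar()', defines: [], dependsOn: ['bar'], modifies: [] }
];

function expand(statement, all, included, output) {
  if (included.has(statement)) return;
  included.add(statement); // mark before recursing so cycles terminate

  // 1. Statements that define the names this statement depends on.
  for (const name of statement.dependsOn) {
    for (const other of all) {
      if (other.defines.includes(name)) expand(other, all, included, output);
    }
  }

  // 2. The statement itself.
  output.push(statement.id);

  // 3. Statements that modify names defined by this statement.
  for (const name of statement.defines) {
    for (const other of all) {
      if (other.modifies.includes(name)) expand(other, all, included, output);
    }
  }
}

function bundleOrder(entryStatements, all) {
  const included = new Set();
  const output = [];
  for (const statement of entryStatements) expand(statement, all, included, output);
  return output;
}

// Every statement in the entry module is a starting point.
const entry = statements.filter((s) => s.id.startsWith('main.js'));
const order = bundleOrder(entry, statements);
console.log(order);
```

With these records the result is foo, then the statement that modifies foo, then bar, then the call to bar; the `unused` declaration never appears because nothing reachable from the entry module depends on it.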

This works amazingly well!

@Bobris side-effects are indeed problematic. At the moment rollup uses a crude heuristic that I expect meets 99% of uses: all statements in the entry module are included, and if a module has an empty import...

import 'polyfills';

...it assumes that it contains side-effects, so all of its statements are included as well. This would fail in a case like this...

import { something } from './sideEffectyModule';

...but there are solutions we could consider.
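
The heuristic, and the case where it falls short, can be sketched like this. The module graph shape, the `modulesIncludedWhole` function and the module ids are hypothetical, and import sources are assumed to be already resolved to module ids.

```javascript
// Hypothetical module graph: each module lists its imports and the
// names (specifiers) it pulls from each source.
const modules = {
  'main.js': {
    imports: [
      { source: 'polyfills', specifiers: [] }, // import 'polyfills';
      { source: './sideEffectyModule', specifiers: ['something'] }
    ]
  },
  'polyfills': { imports: [] },
  './sideEffectyModule': { imports: [] }
};

// Returns the ids of modules whose statements are all included:
// the entry module, plus any module reached via an empty import.
function modulesIncludedWhole(entryId, graph) {
  const whole = new Set([entryId]);
  const seen = new Set();
  const queue = [entryId];

  while (queue.length > 0) {
    const current = queue.shift();
    if (seen.has(current)) continue;
    seen.add(current);

    for (const imported of graph[current].imports) {
      if (imported.specifiers.length === 0) whole.add(imported.source);
      queue.push(imported.source);
    }
  }
  return whole;
}

const whole = modulesIncludedWhole('main.js', modules);
console.log([...whole]);
```

Here 'main.js' and 'polyfills' are included in full, while './sideEffectyModule' only contributes the statements needed for `something`, which is exactly the failure case if that module also relies on top-level side effects.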

As you say, good tree shaking is impossible without type info. We're not doing any tree shaking here, just selectively including code, so we actually have the opposite problem - we risk not including enough code, or somehow getting it in the wrong order. So far I haven't found that to be a problem, but it's definitely something to watch out for. In future we could potentially use things like Flow to make better decisions, and obviously this approach doesn't preclude a subsequent tree-shaking step (which would typically be more successful anyway given that the tree is leaner to begin with).

Integrating rollup into esperanto?

One option, once rollup reaches feature parity, would be to use it inside esperanto.

Another option would be to accept that one tool doesn't need to do both 1-to-1 conversions and bundling, and remove the bundling functionality from esperanto altogether. Though at that point, esperanto wouldn't really need to exist at all given that we have babel - esperanto is many times faster (because it does less), but the chances are someone using ES6 modules would already need babel for other ES6 features anyway.

Am interested in any and all feedback. Thanks!

@Rich-Harris
Contributor Author

Closing in favour of #184. See also #191.
