post transform file cache #116

Open
DonutEspresso opened this issue May 2, 2016 · 3 comments
@DonutEspresso

DonutEspresso commented May 2, 2016

Hi folks, I have some tooling built on top of module-deps. Often I already have post-transform source and dependency tree information ready to go. It looks like providing opts.cache makes module-deps skip reading/parsing the source and applying transforms. However, it's unclear to me what a "file object" should look like. Poking through the walk() function, it looks like I need at least source, package, and deps, so would the format be something like this?

opts.cache = {
  '/User/me/foo.js': {
    source: '<post-transform source code>', // string
    package: undefined, // ? not sure what goes here
    deps: {
      './fooDep1': '/User/me/fooDep1.js',
      '../fooDep2': '/User/fooDep2.js'
    }
  }
};

My questions are:

  • I see the transform stream emits other fields too (file, id, entry, etc.) but are those needed in the cache?
  • What should the value of the package field be?
  • Any gotchas when it comes to providing cache for files in node_modules?

I'll start running my own tests and exploring, but any guidance would be appreciated. Thanks!

@DonutEspresso
Author

Update: I captured the raw records emitted by the transform stream on the data event and saved those record objects. Feeding those records into a new module-deps instance as the cache object seems to do the trick.

Is there any concern around this approach? AFAICT, this seems to work correctly and significantly speeds up subsequent runs of module-deps.
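
For readers following along, the capture-and-replay approach described above might be sketched roughly as follows. This is only an illustration under assumptions: the helper names collectRecords and recordsToCache are hypothetical, and keying the cache by each record's file field is inferred from the opts.cache example earlier in the thread rather than confirmed by the maintainers. The module-deps wiring is left as a comment because it needs the package installed.

```javascript
'use strict';

// Collect every record emitted on an object-mode stream's 'data' event.
// (Hypothetical helper; works with any readable stream of objects.)
function collectRecords(stream) {
  return new Promise((resolve, reject) => {
    const records = [];
    stream.on('data', (rec) => records.push(rec));
    stream.on('end', () => resolve(records));
    stream.on('error', reject);
  });
}

// Build an object keyed by file path from the captured records.
// Assumption: each usable record carries a string `file` field.
function recordsToCache(records) {
  const cache = {};
  for (const rec of records) {
    if (typeof rec.file !== 'string') continue; // skip records without a path
    cache[rec.file] = rec; // keep the record as-is, all fields intact
  }
  return cache;
}

// Possible wiring (not run here; requires the module-deps package,
// and the path is the placeholder used earlier in this thread):
//   const mdeps = require('module-deps');
//   const first = mdeps();
//   const pending = collectRecords(first);
//   first.end({ file: '/User/me/foo.js' });
//   pending.then((records) => {
//     const second = mdeps({ cache: recordsToCache(records) });
//     // ... write the same entry files to `second`
//   });
```

The records are stored untouched here because the thread reports consistent output only when no fields were altered.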

@jmm
Collaborator

jmm commented May 5, 2016

@DonutEspresso I can't answer all of your questions, but I've been wanting that stuff to be better documented and more consistent for a long time. file|id are some of the worst offenders. See for example:

Certain properties of those record objects are meaningful at certain phases of the pipeline (module-deps being one of the phases). I think package is the object represented by package.json. entry indicates whether it is an "entry" file, i.e. whether it will be executed when the bundle executes. Normally a file added with b.add() is an entry, whereas one added with b.require() isn't, for example.

You could also take a look at what watchify does, as I think it monkeys with that cache data, and its purpose is to speed up subsequent browserify bundling operations on mostly the same set of files.
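
To make the "mostly the same set of files" idea concrete, here is a rough sketch of cache invalidation between runs. It is not watchify's actual code; the function name invalidate and the notion of a caller-supplied list of changed files are assumptions for illustration only.

```javascript
'use strict';

// Drop cached entries for files known to have changed since the last run,
// so that only those files are re-read and re-transformed next time.
// `cache` is an object keyed by absolute file path;
// `changedFiles` is an array of absolute file paths (supplied by the caller).
function invalidate(cache, changedFiles) {
  for (const file of changedFiles) {
    delete cache[file]; // deleting a missing key is a harmless no-op
  }
  return cache;
}

// Example with placeholder paths:
const cache = {
  '/User/me/foo.js': { source: '/* foo */', deps: {} },
  '/User/me/fooDep1.js': { source: '/* dep */', deps: {} }
};
invalidate(cache, ['/User/me/fooDep1.js']);
// `cache` now holds only the entry for '/User/me/foo.js'.
```

How the list of changed files is produced (file watching, mtime comparison, content hashing) is left open here.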

Related: #72

@DonutEspresso
Author

Thanks @jmm, appreciate the info and the links. I had a lot of trouble trying to "recreate" the records. I'd get inconsistent output from the stream when doing so, probably because I was feeding it bad data. Good to know I'm not the only one confused. :) In the end, I simply captured the emitted records "as-is" without changing them (all fields intact), then fed them back in the next time. That got me consistent output from run to run, so it appears to be working so far.

AFAICT, all the emitted records had id, source, file, and deps. entry: true is there, I suspect, for all files that are added directly to module-deps via write() or end(). entry seems to be missing only when the file is located in node_modules or is an unparseable file (e.g., with require('./random.ext'), random.ext would get emitted, but without the entry value).
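
A small sanity check along these lines can help verify captured records before reusing them. This is a sketch based only on the observations above; the field list and the helper names missingFields and partitionByEntry are assumptions, not a documented module-deps contract.

```javascript
'use strict';

// Fields observed on every emitted record in this thread (an observation,
// not a documented guarantee).
const REQUIRED_FIELDS = ['id', 'source', 'file', 'deps'];

// Return the names of the expected fields that a record lacks.
function missingFields(record) {
  return REQUIRED_FIELDS.filter((name) => !(name in record));
}

// Split records into those flagged `entry: true` and all others.
function partitionByEntry(records) {
  const entries = [];
  const others = [];
  for (const rec of records) {
    (rec.entry === true ? entries : others).push(rec);
  }
  return { entries, others };
}
```

Running missingFields over each captured record before saving it is one way to catch the "bad data" case early instead of seeing inconsistent output later.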

I think it might be worthwhile to delineate which things are specific to module-deps and which only apply in the context of browserify.
