Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue with mksnapshot and binding data #35930

Closed
jasnell opened this issue Nov 2, 2020 · 8 comments
Closed

Issue with mksnapshot and binding data #35930

jasnell opened this issue Nov 2, 2020 · 8 comments
Labels
confirmed-bug Issues with confirmed bugs. lib / src Issues and PRs related to general changes in the lib or src directory.

Comments

@jasnell
Copy link
Member

jasnell commented Nov 2, 2020

@joyeecheung ... I'm running into an issue with the node_mksnapshot tool during build. It apparently does not know how to handle BindingData for native modules when the snapshot is created...

I'm working on a rework of internalBinding('trace_events') module, and attempting to add binding data to it:

  TraceEventsState* const state =
      env->AddBindingData<TraceEventsState>(context, target);
  if (state == nullptr) return;

The trace_events module is loaded during bootstrap, so the created TraceEventsState object is attached as the binding data at that step.

Unfortunately, this causes node_mksnapshot to crash:

global handle not serialized: 0x327e5e9f3281: [JS_API_OBJECT_TYPE] in OldSpace
 - map: 0x2c1d4730fe89 <Map(HOLEY_ELEMENTS)> [FastProperties]
 - prototype: 0x218677ec84b1 <Object map = 0x2c1d473101a1>
 - elements: 0x1fd3c5ec0b29 <FixedArray[0]> [HOLEY_ELEMENTS]
 - embedder fields: 1
 - properties: 0x1fd3c5ec0b29 <FixedArray[0]> {}
 - embedder fields = {
    22006, aligned pointer: 0x55f6648c31e0
 }



#
# Fatal error in , line 0
# Check failed: handle_checker.CheckGlobalAndEternalHandles().
#
#
#
#FailureMessage Object: 0x7ffd2543d6c0
 1: 0x55f65ee301a1  [/home/james/node/node/out/Release/node_mksnapshot]
 2: 0x55f65fd6af3e V8_Fatal(char const*, ...) [/home/james/node/node/out/Release/node_mksnapshot]
 3: 0x55f65f3e3823 v8::SnapshotCreator::CreateBlob(v8::SnapshotCreator::FunctionCodeHandling) [/home/james/node/node/out/Release/node_mksnapshot]
 4: 0x55f65ed1c016 node::SnapshotBuilder::Generate(std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >) [/home/james/node/node/out/Release/node_mksnapshot]
 5: 0x55f65e953832 main [/home/james/node/node/out/Release/node_mksnapshot]
 6: 0x7f2df6bb1b97 __libc_start_main [/lib/x86_64-linux-gnu/libc.so.6]
 7: 0x55f65e95b42a _start [/home/james/node/node/out/Release/node_mksnapshot]
Illegal instruction (core dumped)
node.target.mk:26: recipe for target '/home/james/node/node/out/Release/obj/gen/node_snapshot.cc' failed
make[1]: *** [/home/james/node/node/out/Release/obj/gen/node_snapshot.cc] Error 132
rm 82c869d3ddd844a851343f4dda5865e1e173131d.intermediate 53b6ae97571ad65714c594648ab8f1c04245ce26.intermediate
Makefile:104: recipe for target 'node' failed
make: *** [node] Error 2

We should be able to attach BindingData to any of the internal native modules without having to fight with this. I know that addaleax's original implementation design for this bit covered but I'm not sure of the details so I'm not entirely sure what is missing in the version of your approach that landed here.

@jasnell jasnell added lib / src Issues and PRs related to general changes in the lib or src directory. confirmed-bug Issues with confirmed bugs. labels Nov 2, 2020
@addaleax
Copy link
Member

addaleax commented Nov 2, 2020

(Fwiw, I know I’ve been pinged but I’ll unsubscribe because I’m not ready to deal with this ******** again. I’ve had working code that would have solved this, the project decided against it, not my problem anymore.)

@joyeecheung
Copy link
Member

I had an alternative approach here, which implements the option2s in the pending discussions in the design doc. It was removed from #32984 since my intention was to land a minimal snapshot while keeping the options open to alternatives like the ones proposed by Anna (when the PR landed, there were no BaseObject before the pre-execution phase after refactoring anyway). I can rebase the patch and make it possible to add BaseObjects into the mksnapshot.

BTW what are you planning to add in TraceEventsState? Are there just C++ strings/numbers (that we can serialize into and deserialize from a blob on our own easily), or are there any references to JS objects or other BaseObjects? The second types of objects would be a bit complicated to serialize because of potential GC issues, as explained in the design doc though it might be possible to work around it if we put these behind private symbols upon snapshot time and let V8 do most of the work for us.

@joyeecheung
Copy link
Member

Just adding a link to the tracking issue, thanks for opening this one #35711

@joyeecheung
Copy link
Member

joyeecheung commented Nov 17, 2020

Just posting my WIP here after some trial and error (it's not working yet). I still think the best way to deal with C++ -> JS references (including references from BindingData to AliasedBuffers) should be solved by putting them behind private symbols at serialization time and restoring them back at deserialization time so that V8 does most of the graph-walking for us and it's less error-prone due to GC

@joyeecheung
Copy link
Member

When looking into all the embedder objects, I realized that we never actually use class ObjectWrap in our own code base, but this might be something that we can use to provide add-on support..my current plan is:

  • At serialization time, all BaseObjects will put their references to other JS objects (e.g. Global handles, including AliasedBuffers) behind private symbol properties, and V8 will do the work to serialize these. Other states on the native-side (most likely some strings or numbers) will be saved as a chunk passed to the serialization callback invoked by V8
  • At deserialization time, we revive the native states from the chunks and store them somewhere on the native side, and enqueue a callback to be invoked when the object graph is complete (alongside it, a Global handle to that embedder object). When the object graph is complete, we invoke the callbacks to, for example, call the BaseObject subclass constructor to create the C++ object corresponding to the embedder JS objects, and revive its states from the chunks or the private symbol properties. For aliased buffers we'll recreate the C++ object based on the JS typed array. We'll set the private symbol properties to undefined once they are reset to embedder fields.

This should work for what we currently have, which is that BaseObjects all store at slot 0 a pointer to one C++ object which in turn reference other stuff, also at slot 1 or higher some more pointers. I will try to open a PR once my branch can handle slot 0, and leave other slots as TODO (which leaves out things like UDPWrap and StreamReq)

@joyeecheung
Copy link
Member

joyeecheung commented Dec 8, 2020

Well..it looks like we can't just use the serialization callbacks for everything - because the BaseObjects are held through global handles which are checked during serialization. To avoid making BaseObjects that should be gone during GC strong, I think we could just ignore all the BaseObjects with IsWeakOrDetached() returning true, and store the rest with SnapshotCreator::AddData(), and release global handles held by BaseObjects (like the ones in AliasedBuffers) in favor of turning them into strong references from the BaseObject's JS object to the typed arrays.

@joyeecheung
Copy link
Member

joyeecheung commented Dec 16, 2020

Update: I changed the approach in the WIP a bit and stored a v8::String in the BaseObjects (only for serializable BaseObjects) so that at serialization time, we can identify the type of the object by looking at the string at slot 0. This may be useful for post-mortem debugging as well. For now, only the fs::BindingData in the WIP works. I plan to test this on a few more different BindingData before sending it for review.

I also noticed that only a handful of BaseObjects seem to be eligible for serialization

  • the binding data
  • the wraps of modules (for these we'll probably need some patches in V8 so that we can retrieve proper references to the scripts/modules at deserialization, right now they are just one-shot pointers and not even v8::Data)
  • the fast API objects (with typed arrays)

It does not seem to make sense to serialize any of the AsyncWraps or the OpenSSL contexts - they should be gone after GC when we serialize the context (which is why we need to use the serialization callback since otherwise we can't avoid touching objects that aren't supposed in the snapshot), and if they have to be saved, some logic in JS should be done to make sure they are reinitialized properly after deserialization.

@joyeecheung
Copy link
Member

hmm, I just realized that with the simplest two binding data (the fs one and the v8 one), there is no need to deserialize their AliasedBuffers at all since those are stats buffer and their content can be thrown away once consumed (which is guaranteed at snapshot time because by then no JS frame should be left on stack). I'll stash my WIP that support JS reference serialization on a different branch, and open a PR soon with a simplified approach (i.e. just re-initializing the aliased buffers, I think it's just much more straight forward than deserializing and the performance difference should be negligible)

joyeecheung added a commit that referenced this issue Feb 19, 2021
joyeecheung added a commit that referenced this issue Feb 19, 2021
targos pushed a commit that referenced this issue Feb 28, 2021
This patch adds the SnapshotableObject interface. Native objects
supporting serialization can inherit from it, implementing
PrepareForSerialization(), Serialize() and Deserialize()
to control how the native states should be serialized and
deserialized.

See doc: https://docs.google.com/document/d/15bu038I36oILq5t4Qju1sS2nKudVB6NSGWz00oD48Q8/edit

PR-URL: #36943
Fixes: #35930
Refs: #35711
Reviewed-By: James M Snell <[email protected]>
targos pushed a commit that referenced this issue Feb 28, 2021
targos pushed a commit that referenced this issue Feb 28, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
confirmed-bug Issues with confirmed bugs. lib / src Issues and PRs related to general changes in the lib or src directory.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants