Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

build: simplify build of inspector protocol #22680

Closed

Conversation

refack
Copy link
Contributor

@refack refack commented Sep 4, 2018

Main idea is to capture the code generated for the unified inspector protocol (debugging + tracing), instead of regenerating it for every build.
From now on changing the PDL will require a manual code-gen and compilation adaptation.

Multi step refactoring:

  1. inspector: refactor code generation
    • Split GYP scaffold to inspector/inspector_protocol.gyp
    • Point code-generator output to src/inspector/node_protocol
  2. inspector: store generated code
    This is instead of generating on each build. When a developer changes
    the protocol, it will require checking in the generated source.
  3. inspector,build: stop redundant code-gen
    • document howto code-gen manually
  4. tools: remove unused copy of inspector_protocol
  5. build,v8: fix gyp target race
    The complementing V8 codegen script was prone to races
Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • documentation is changed or added
  • commit message follows commit guidelines

@nodejs-github-bot nodejs-github-bot added the lib / src Issues and PRs related to general changes in the lib or src directory. label Sep 4, 2018
@refack refack added windows Issues and PRs related to the Windows platform. build Issues and PRs related to build files or the CI. tools Issues and PRs related to the tools directory. labels Sep 4, 2018
@refack
Copy link
Contributor Author

refack commented Sep 4, 2018

/CC @nodejs/v8-inspector @nodejs/trace-events @nodejs/build-files

@refack
Copy link
Contributor Author

refack commented Sep 4, 2018

@refack refack self-assigned this Sep 4, 2018
@refack refack added inspector Issues and PRs related to the V8 inspector protocol trace_events Issues and PRs related to V8, Node.js core, and userspace code trace events. labels Sep 4, 2018
@richardlau
Copy link
Member

Looks like related build failures:
https://ci.nodejs.org/job/node-test-commit-plinux/nodes=ppcle-ubuntu1404/20028/console

22:25:03 Traceback (most recent call last):
22:25:03   File "../third_party/inspector_protocol/CodeGenerator.py", line 654, in <module>
22:25:03     main()
22:25:03   File "../third_party/inspector_protocol/CodeGenerator.py", line 541, in main
22:25:03     protocol = Protocol(config)
22:25:03   File "../third_party/inspector_protocol/CodeGenerator.py", line 314, in __init__
22:25:03     self.generate_domains = self.read_protocol_file(config.protocol.path)
22:25:03   File "../third_party/inspector_protocol/CodeGenerator.py", line 335, in read_protocol_file
22:25:03     parsed_json = json.loads(json_string)
22:25:03   File "/usr/lib/python2.7/json/__init__.py", line 338, in loads
22:25:03     return _default_decoder.decode(s)
22:25:03   File "/usr/lib/python2.7/json/decoder.py", line 366, in decode
22:25:03     obj, end = self.raw_decode(s, idx=_w(s, 0).end())
22:25:03   File "/usr/lib/python2.7/json/decoder.py", line 382, in raw_decode
22:25:03     obj, end = self.scan_once(s, idx)
22:25:03 ValueError: Expecting object: line 721 column 18 (char 32767)

@richardlau
Copy link
Member

richardlau commented Sep 4, 2018

I think we need to have some sort of documentation (e.g. in https://github.com/nodejs/node/blob/master/doc/guides/maintaining-V8.md or a new separate guide for the inspector) that describes when and how to regenerate the protocol (since it's not automatic).

For example, it doesn't look like you can regenerate the protocol on Windows as this PR stands?

@refack
Copy link
Contributor Author

refack commented Sep 4, 2018

Looks like related build failures:

I'm investigating this. It's quite strange that is happens only on some platforms, and it's not trivially reproducible (even manual on a failing machine)

I think we need to have some sort of documentation (e.g. in /doc/guides/maintaining-V8.md@master or a new separate guide for the inspector) that describes when and how to regenerate the protocol (since it's not automatic).

Good point, I knew I had something in mind that I missed. Code needs to be generated only after a intentional change to https://github.com/nodejs/node/blob/master/src/inspector/node_protocol.pdl, which should be done by an informed stakeholder as per:

# Please notify @nodejs/v8-inspector and @nodejs/trace-events before modifying this file

If code is generated again after an update to the "inspector_protocol" tool it most probably will not compile (i.e. changes are not API compatible), and will require subsequent changes to node code in /src/.
In that sense we might consider src/inspector[/node_protocol] to be an external dependency of sorts.

For example, it doesn't look like you can regenerate the protocol on Windows as this PR stands?

I've actually developed this PR on a windows machine. The generate-node-inspector-code target can be run in WSL or with MSYS make.
I moved the script to Makefile since it's our catch-all for tooling triggers, but maybe a python file inside src/inspector might be more appropriate, though less accessible...

@richardlau
Copy link
Member

@refack Thanks. I'm trying to review with a build-files hat on, but it involves understanding how the previous way worked to compare. I'd not looked at node/src/inspector/node_protocol.pdl as it isn't touched by this PR (so thanks for pointing me in that direction).

re. Windows, AFAIK WSL and MSYS make are not supported build environments. Personally I'd be okay if we had some documentation/guide that just listed out the python commands to invoke on Windows (and direct people on other platforms to use the makefile) since it will be rarely(?, at least comparatively) have to be invoked. cc @nodejs/platform-windows

@refack refack force-pushed the refactor-out-second-inspector-protocol branch from e7f232a to dda2039 Compare September 4, 2018 20:01
@pavelfeldman
Copy link
Contributor

I did not look into these changes in detail, but it should be conceptually impossible to remove the copy of the inspector_protocol while maintaining regular v8 rolls due to potential version mismatch between the inspector_protocol gen used in v8 and in node. That's why we keep two copies of it in Chrome and I was hoping you would do too in Node.

@refack refack mentioned this pull request Sep 4, 2018
@refack
Copy link
Contributor Author

refack commented Sep 4, 2018

I did not look into these changes in detail, but it should be conceptually impossible to remove the copy of the inspector_protocol while maintaining regular v8 rolls due to potential version mismatch between the inspector_protocol gen used in v8 and in node. That's why we keep two copies of it in Chrome and I was hoping you would do too in Node.

Just to be clear, this PR makes the node source independent of either instance of inspector_protocol for code-gen.

The idea here is that since we don't make frequent changes to the protocol, we can treat the generated code as SCM tracked source. Any future changes to the protocol should be followed by locally running code-gen and checking-in the generated code. (Similar to how V8 stores the protocol as JSON, as well as PDL).

IMHO it's reasonable since the generated code is cross platform and is human maintainable, so even if we lose the knowledge of how to generate it again from PDL/JSON, we can still maintain it.

@kfarnung
Copy link
Contributor

kfarnung commented Sep 4, 2018

Like @pavelfeldman I think it's better to stick with a separate copy of the inspector_protocol scripts/templates as they can change in ways that make the Node implementation incompatible (usually new methods or changed signatures). Since the generation is only done on-demand it won't be caught during V8 upgrade and instead cause issues the next time someone wants to make a change to the PDL definition for Node.

@pavelfeldman
Copy link
Contributor

The idea here is that since we don't make frequent changes to the protocol, we can treat the generated code as SCM tracked source.

I don't think you would want this though, especially given that you already have the automated way wired

IMHO it's reasonable since the generated code is cross platform and is human maintainable, so even if we lose the knowledge of how to generate it again from PDL/JSON, we can still maintain it.

It was not intended to be human-readable and definitely not human-editable. It also is a subject to change for cross-cutting optimizations such as flattened operation or binary notation. You would want those in place once they are a part of the inspector_protocol.

@refack refack force-pushed the refactor-out-second-inspector-protocol branch from dda2039 to 79a236d Compare September 4, 2018 23:12
@refack refack changed the title Refactor out second inspector protocol build: simplify build of inspector protocol Sep 4, 2018
@refack refack force-pushed the refactor-out-second-inspector-protocol branch from 79a236d to a9d644c Compare September 4, 2018 23:17
@refack
Copy link
Contributor Author

refack commented Sep 4, 2018

Since the generation is only done on-demand it won't be caught during V8 upgrade and instead cause issues the next time someone wants to make a change to the PDL definition for Node.

The idea is that on change of the protocol, there would probably be required code changes in node source anyway.
Running the code-gen on demand could be done with any version of inspector_protocl the developer wants (not necessarily the one in deps/v8).

Just so it's clear I'll remove the changes to Makefile and added a README.

@kfarnung
Copy link
Contributor

kfarnung commented Sep 4, 2018

Running the code-gen on demand could be done with any version of inspector_protocl the developer wants (not necessarily the one in deps/v8).

That somewhat misses the point, the templates used to generate are tied tightly to the implementations based on them. The implementation side extends an abstract based clase which defines the interfaces. Using any random version of the protocol can cause drastic changes in the generated output, so we would want to have a specific and stable version so that the files can be regenerated at will without unnecessary churn and new build breaks.

@refack
Copy link
Contributor Author

refack commented Sep 4, 2018

I don't think you would want this though, especially given that you already have the automated way wired

Actually simplifying the automated procedure is a big motivator for me. We have a way-too-complicated build procedure. So yes I do want to remove parts that are easy to refactor out (in the sense that only 1 or 2 devs actually make use of them). This also allows us to take out the inspector_protocol tool from the tree, and not keep track of ~100 extra files.

It was not intended to be human-readable and definitely not human-editable. It also is a subject to change for cross-cutting optimizations such as flattened operation or binary notation. You would want those in place once they are a part of the inspector_protocol.

IMHO we are over-tasked with keeping track of dependencies and tools, and ATM we don't have a designated person to keep track of inspector_protocol improvements. Also AFAIU these improvements come with a maintenance cost to the wrapping code (that was the reason for the duplication of the tool in the first place).

My proposed solution would be for a knowledgeable dev to regenerate the protocol harness by using any version of the tool that they deem fit, at the same time fix the node source to accommodate any breaking changes, and this while keeping all of these concerns out of the node repo. Abstraction FTW.

@pavelfeldman
Copy link
Contributor

To me this is backwards, changes to the source code need to be versioned and traced to the reason. Using random version of the inspector_protocol might end up with crash or security vulnerability fixes to be withdrawn from the build. Let alone the practice of checking in generated files.

@refack
Copy link
Contributor Author

refack commented Sep 4, 2018

Using any random version of the protocol can cause drastic changes in the generated output, so we would want to have a specific and stable version so that the files can be regenerated at will without unnecessary churn and new build breaks.

@kfarnung that's a valid point but it also goes the other way. If we SCM track the generated source, it's the most stable we can get it to be.

To me this is backwards, changes to the source code need to be versioned and traced to the reason. Using random version of the inspector_protocol might end up with crash or security vulnerability fixes to be withdrawn from the build. Let alone the practice of checking in generated files.

@pavelfeldman I get what you are saying, but I'm looking at it from a different POV. For me (as a node focused dev) I want to abstract away the reason for these kind of changes, the same way I upgrade my compiler or even my OS.

My big assumption here is that whomever does want to make changes is knowledgeable in the concerns surrounding the needed change and the proper use of the tool.

IMO this is similar to the way we treat most of our dependencies, for example I don't know how the assembly files in openSSL where created, but I can check that they work.


Thank you all for your perspective. You have given me much to think about.

@kfarnung
Copy link
Contributor

kfarnung commented Sep 5, 2018

I would argue that having the generated code be reproducible is actually the most important aspect of generated code. In that case I think you need to have the generator/templates checked in to achieve that.

My overarching point is that you shouldn't need to be an expert to be able to generate the exact same code that's checked in already. There should be a known way to achieve that. I'm not a fan of checking in generated code, but I'm ok with it assuming that it can be regenerated at will and the generated files themselves are never edited directly.

@pavelfeldman
Copy link
Contributor

I guess I don't get what we are saving via removing this build step, especially from the node-focused pov. With the generator in place, the PRs make sense, incremental versioning makes sense, there is no maintenance involved. Without the generator, PRs are unreadable blobs, versioning is error-prone, maintenance is obvious.

I mean we have it for ourselves in a similar chrome setup, looking at it from the chrome-focused pov. There has never been an issue or confusion with the build step that is ran by the computer.

@refack
Copy link
Contributor Author

refack commented Nov 20, 2018

As per #24486 we now another motivation to check-in the generated code - python3 compatibility.
/CC @nodejs/python PTAL.

This is instead of generating on each build. When a developer changes
the protocol, it will require checking in the generated source.
* document how to code-gen manually
@refack refack force-pushed the refactor-out-second-inspector-protocol branch from a9d644c to 1cd1b88 Compare November 20, 2018 14:39
@refack refack added the python PRs and issues that require attention from people who are familiar with Python. label Nov 20, 2018
@refack refack closed this Dec 21, 2018
@refack refack deleted the refactor-out-second-inspector-protocol branch December 21, 2018 01:31
@refack refack removed their assignment Mar 11, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
build Issues and PRs related to build files or the CI. inspector Issues and PRs related to the V8 inspector protocol lib / src Issues and PRs related to general changes in the lib or src directory. python PRs and issues that require attention from people who are familiar with Python. tools Issues and PRs related to the tools directory. trace_events Issues and PRs related to V8, Node.js core, and userspace code trace events. windows Issues and PRs related to the Windows platform.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants