-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update parser to support editions syntax #2000
base: master
Are you sure you want to change the base?
Conversation
@@ -0,0 +1 @@ | |||
syntax = "proto3"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the purpose of these dummies? It seems like this could conflict with the actual versions. Also, it doesn't actually define any custom features, so anyone referencing those would have problems
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The purpose I had in mind was that an import of these files may be present in a schema originating in another language or toolchain, so having dummies prevents "file not found" errors in pbjs. Left them blank because the definitions within are not of relevance to protobuf.js anyway. But now that you say it, given that the dummies might leak out to other toolchains, it seems like a good idea to include their full contents.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, also usually people would import them and reference the features defined in them. So I think you'd still hit errors from missing options definitions?
OTOH copying them verbatim here seems like a good way to get version skew issues. How do you deal with descriptor.proto? We've been modifying that pretty rapidly in the last few years
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The missing definitions aren't an issue for protobuf.js itself. It currently parses any options literally without looking up schemas. Hence the historic comment in the parser on how properties are assumed. Similarly, descriptor.proto is optional and only updated sporadically, as it is not needed internally for protobuf.js to function. Perhaps a data point that descriptor.proto hasn't leaked out so far.
if (acceptTypeRef && typeRefRe.test(token)) | ||
return token; | ||
if (acceptIdentifier && nameRe.test(token)) | ||
return readIdentifier(token); // `ENUM_VALUE` | ||
|
||
/* istanbul ignore next */ | ||
throw illegal(token, "value"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think readRanges needs to be updated as well. In edition 2023 we changed the grammar of reserved field/enum value names to be identifiers instead of strings. This avoids accidental concatenation
option features.(pb.go).legacy_unmarshal_json_enum = true;\ | ||
option features.(pb.java).legacy_closed_enum = true;\ | ||
message A {\ | ||
repeated int32 b = 1 [features.repeated_field_encoding = EXPANDED];\ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding reserved field identifiers would be good to test the grammar changes
var option = name; | ||
var identifier = readOptionIdentifier(); | ||
|
||
// Historically, `(some.option).prop` has been interpreted as a property |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sounds like exactly what it is today. What's the change in behavior here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is mostly a comment I found useful to explain why the somewhat strange code exists in the first place. Say, someone stumbles over this part, wonders why it's there, and decides to make a PR to remove it because it doesn't seem useful to them. Other than that, there's indeed no change in behavior other than also applying the fallback to syntax previously not supported, e.g. a.(b.c).d
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I think I misunderstood what was happening here. FWIW it looks like you still don't support multiple extensions (e.g. a.(b).c.(d).e
), but we hopefully won't ever need that
syntax = "proto3"; | ||
isProto3 = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems strange. Why treat editions files like proto3?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only added the fallback to produce valid schemas that can be tested, including during resolve. I guess another option is to test via the parser only, and for now produce an "editions are not supported" error during resolve?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see, this is just a temporary test hook? That makes sense, but I definitely like the failure approach better. That way nobody starts depending on this behavior
edition
keyword. Falls back to proto3 for now.features.(pb.java).legacy_closed_enum
. Retains backwards compatibility with assumptions made by prior options logic.cc @protobufjs/maintainers