Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add C++ Pattern Validation #286

Merged
merged 3 commits into from
Dec 3, 2019
Merged

Add C++ Pattern Validation #286

merged 3 commits into from
Dec 3, 2019

Conversation

asraa
Copy link
Contributor

@asraa asraa commented Nov 5, 2019

Envoy now uses RE2 as a safe regex engine instead of std::regex (envoyproxy/envoy#7878). Because PGV already requires patterns to use RE2 syntax, one option is to use RE2 for C++ patterns as well. This implements it, for use in strings, bytes, repeated items, and may key/value pattern validation.

Implements #22

WIP: I ran in to difficulty creating the regex because a regex containing a null character would get cut off... for example, the ascii character test used the pattern, ^[\x00-x7f]+$, and consuming this as a string resulted in creating a null-terminated string pattern ^[ instead of the actual pattern. I think this might be a problem across most of the C++ code? That's why there's a terrible string construction in the pattern creation.

Signed-off-by: Asra Ali [email protected]

@asraa
Copy link
Contributor Author

asraa commented Nov 5, 2019

@akonradi

@asraa
Copy link
Contributor Author

asraa commented Nov 15, 2019

@rmichela @rodaine would either one of you be able to give this a pass over? thank you!

@rodaine
Copy link
Member

rodaine commented Nov 15, 2019

This looks fine to me, but I'm not terribly familiar with C++ enough to say if this is ideal or not.

Looking in Envoy, the RE2 usages use a StringPiece. Is that applicable here? https://github.com/envoyproxy/envoy/blob/master/source/common/common/regex.cc#L47


{{ if has .Rules "Items"}}{{ if .Rules.Items }}
{{ if has .Rules.Items.GetString_ "Pattern" }} {{ if .Rules.Items.GetString_.Pattern }}
const re2::RE2 {{ lookup .Field "Pattern" }}(std::string({{ lit .Rules.Items.GetString_.GetPattern }},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The construction of a std::string here doesn't seem terrible; I don't think there's a better way to deal with null-terminated strings unless you want to introduce absl/std::string_view. That would save a copy on startup.

@asraa
Copy link
Contributor Author

asraa commented Nov 22, 2019

(Sorry just getting back to this after KubeCon) Oh, thanks Chris and Alex. I think we can get the best of both worlds with that re2::StringPiece. Should avoid the copy. I'll update and test using that for both the matcher and the value.

Signed-off-by: Asra Ali <[email protected]>
@asraa
Copy link
Contributor Author

asraa commented Nov 26, 2019

I used re2::StringPiece where I could to avoid copies. I do have a question, seems like the pattern is to panic if a regex is invalid, so I eliminated re2::Quiet, it'll throw an error if it's an invalid pattern like the go code does.

@akonradi
Copy link
Contributor

I assume we're validating the regex when generating the code, so we shouldn't see invalid regexes during validation. I think the behavior of failing loudly is reasonable.

@asraa asraa changed the title WIP: Add C++ Pattern Validation Add C++ Pattern Validation Dec 3, 2019
@htuch htuch merged commit 76a9789 into bufbuild:master Dec 3, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants