WASM: Some builtins don't work with strings >64 characters as inputs #6376
Comments
I tried to see whether it affects any natively implemented builtin or is specific to the regexp one. I used this simplified repro rego:
package repro
c65 := count(input.c65)
c64 := count(input.c65)
and it turns out that the result of this is the same for
Surprise! The input data is one character short, it seems. But that only slightly varies the bug report, not in any material way. Also, I guess you've already ruled out that it's a more general problem, since without the trailing So I'll keep digging.
OK at this point, I'm not sure what works and what doesn't and what shouldn't 😅 -- We're passing UNANCHORED to re2 here and I guess that makes sense for submatches, but it might not square with anchored regexps fed into So, despite this being somewhat buggy -- Where did the need arise for you to use an anchored RE with find_all_string_submatch_n?
Coming back to this! 🔍
When feeding a `char *` into `re->Match()`, it was converted to a StringPiece, taking its size as `strlen()`. For our (long) input, that wasn't resulting in the correct size, and did then freak out the re2 match input validation if the regular expression has an end anchor, but the endpos wasn't the same as its length. Since the endpos was taken from `s->len`, and the "length" taken via the mentioned StringPiece's strlen() call, they did indeed not match. Worked around by feeding it a properly-constructed std::string instead. I'm a C++ novice at best, but it does the trick, and I'm reasonably certain it's less wrong than before. Fixes open-policy-agent#6376. Signed-off-by: Stephan Renatus <[email protected]>
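To illustrate the length mismatch described in the commit message, here is a minimal sketch. It is not OPA's code and does not call re2: `counted_string`, `implicit_length`, and `explicit_length` are hypothetical names, and the assumption is that the string value carries its own length while its bytes are not guaranteed to be NUL-terminated at exactly that length.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for a string value that carries its own length.
// Assumption: the bytes at `v` need not have a NUL exactly at index `len`.
struct counted_string {
    const char *v;    // pointer to the first byte
    std::size_t len;  // authoritative length of the string
};

// Length as seen when a bare `char *` is converted implicitly:
// the size is derived by scanning for the first NUL byte.
std::size_t implicit_length(const counted_string &s) {
    return std::strlen(s.v);
}

// Length as seen when a std::string is built from pointer plus size:
// the stored length is used and no scan for NUL takes place.
std::size_t explicit_length(const counted_string &s) {
    return std::string(s.v, s.len).size();
}
```

If a buffer holds `"abcdeXYZ"` and the string's stored length is 5, `implicit_length` reports 8 while `explicit_length` reports 5. An end position taken from the stored length then disagrees with the text length seen by the matcher, which is the kind of disagreement the commit message describes for end-anchored patterns.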
☝️ PR is ready. Thanks again for helping uncover this ✨
Short description
Some builtins, at least `regex.find_all_string_submatch_n`, don't work for `input` strings longer than 64 characters.
Steps To Reproduce
`repro.rego`:
`input.json`:
Running with `-t rego`
They all evaluate truthy:
Running with `-t wasm`
They all evaluate truthy:
Notice how `input_dollar_c65` is the only test evaluating as `undefined`.
So this issue is only when:
Expected behavior
That it correctly evaluates, both in `wasm` and `rego` mode.
Additional context
Tested both on the latest version and old version, as far back as v0.36.0