caddyhttp: Escaping placeholders in CEL, add `vars` and `vars_regexp`
#6594
Related to #6584, prior work in #4715 (comment)
Our current CEL placeholder regexp is too naive. It replaces everything that looks like a placeholder, and offers no good way to opt out when a literal, placeholder-looking string is passed as input to a CEL function.
To solve this, we can do the placeholder replacement in two steps:

1. Match any placeholder preceded by a character that isn't `\` (or by the start of a line) and replace it, preserving the matched preceding character (we now have two capture groups, so `${1}` has to go at the start of the replacement).
2. Match any placeholder preceded by `\`, and drop the `\`.

That gives a way to escape through the `caddyPlaceholder()` replacement and pass an input directly to the given matcher.

One interesting quirk: CEL itself has its own `\\` escape sequence, so the number of backslashes we need depends on whether we want the placeholder to be replaced by the matcher (e.g. the `header` matcher does placeholder replacement on its inputs at runtime) or we want the value to stay raw. The test shows this pretty well, I think:

- If the `header` matcher should take the value as-is and not perform placeholder replacement (because the match value is itself placeholder-like), then we need three backslashes, like `\\\{foobar}`. The last of the three escapes past `caddyPlaceholder()`, the prior two are then collapsed into one by CEL's own parsing, and that remaining backslash escapes the placeholder replacer, so the result is a clean `{foobar}` matching value.
- If the `header` matcher itself should perform placeholder replacement (not done in the CEL matcher, but deeper in the matcher itself), then a single backslash is used, like `\{http.request.uri.path}` (or `\{path}` in the Caddyfile). This only escapes past the `caddyPlaceholder()` regexp, but does not escape past the placeholder replacer which runs inside `header`.

We'll need to document this, but it's tricky 😅
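To make the two-step replacement concrete, here is a minimal standalone Go sketch. It is not the code from this PR: the regexp patterns, the `expandPlaceholders` helper name, and the `request` argument passed to `caddyPlaceholder()` are assumptions chosen for illustration, and only the regexp rewriting stage is modelled (CEL parsing and the matcher's own placeholder replacer are not).

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// Step 1 (assumed pattern): a placeholder preceded by any character
	// other than a backslash, or sitting at the very start of the input.
	// Group 1 is the preceding character (possibly empty), group 2 the key.
	unescapedPlaceholder = regexp.MustCompile(`([^\\]|^)\{([a-zA-Z][\w.-]+)\}`)

	// Step 2 (assumed pattern): a placeholder preceded by a backslash.
	// Group 1 is the key.
	escapedPlaceholder = regexp.MustCompile(`\\\{([a-zA-Z][\w.-]+)\}`)
)

// expandPlaceholders is a hypothetical helper that applies both steps in order.
func expandPlaceholders(expr string) string {
	// Step 1: rewrite unescaped placeholders into a function call, putting
	// the preserved preceding character (${1}) back in front of it.
	expr = unescapedPlaceholder.ReplaceAllString(expr, `${1}caddyPlaceholder(request, "${2}")`)
	// Step 2: drop the escaping backslash and keep the placeholder literal.
	return escapedPlaceholder.ReplaceAllString(expr, `{${1}}`)
}

func main() {
	inputs := []string{
		`{http.request.uri.path} == "/foo"`,
		`header({'X-Test': '\{http.request.uri.path}'})`,
		`header({'X-Test': '\\\{foobar}'})`,
	}
	for _, in := range inputs {
		fmt.Printf("%s\n  => %s\n", in, expandPlaceholders(in))
	}
}
```

With these patterns, the first input becomes a `caddyPlaceholder()` call, the second keeps `{http.request.uri.path}` as a literal for the `header` matcher to replace later, and the third is left as `\\{foobar}`, which is the point where CEL's own parsing and then the placeholder replacer each consume one more backslash, as described above.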
After doing the above, I decided to also implement `vars` and `vars_regexp` support in CEL, which we had skipped implementing because of the above placeholder limitations.