regex_parser transform doesn't work with nested fields #1812
Comments
A related issue has been raised on the regex crate here, but the last official comment on that one is that "Before implementation, this requires a careful specification." This seems to hint that a PR to arbitrarily add
@bruceg Yeah, I saw this issue. But I think that maintaining a fork with such a simple change should be fine for now. We could even skip sending the PR, but I think it is better to send it even if it is not going to be accepted, just to signal that there is demand for such functionality. Alternatively, it is possible to encode dots somehow in
By the way, I think that ideally we need to support not only dots, but also
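As an illustration of the "encode dots" idea mentioned above, here is a minimal sketch. It assumes a hypothetical `__` separator and hypothetical helper names (`encode_field_path`, `decode_group_name`); none of these come from the thread or from the regex crate. The point is only that a nested field name can be mapped to a group name within `[_0-9a-zA-Z]+` and mapped back after matching.

```rust
/// Hypothetical separator standing in for a dot inside a capture group name.
/// This choice is an assumption made for the sketch, not an agreed convention.
const SEPARATOR: &str = "__";

/// Turn a nested field path such as `x.y.z` into a name made only of
/// characters from `[_0-9a-zA-Z]`.
fn encode_field_path(path: &str) -> String {
    path.replace('.', SEPARATOR)
}

/// Turn an encoded capture group name back into a dotted field path.
fn decode_group_name(name: &str) -> String {
    name.replace(SEPARATOR, ".")
}
```

One drawback of this particular encoding is that it is lossy: a field whose real name already contains `__` would be decoded as if it were nested.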
Created a pull request on upstream regex and a local fork
In planning, we noted that this approach has several drawbacks:
Two alternate approaches were proposed that eliminate the dependency on a modified regex crate:
I have investigated approach 2 and don't see a viable way to use the regex syntax parser to help with this work. There is simply no convenient way of manipulating tokens between parsing and compiling without duplicating large chunks of the crate. We could do the replacements at the textual level before any parsing. However, since there is no way to do this unambiguously without a full parse, this could lead to actual patterns being replaced instead of the group names. Based on this, I think the immediate path forward is to simply add a
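To make the ambiguity of textual replacement concrete, here is a minimal sketch. The function name and the `__` replacement are hypothetical choices for illustration, not anything proposed in the thread. It rewrites dots inside anything that looks like a `(?P<name>` opener, without parsing the pattern.

```rust
/// Naively rewrite dots inside anything that textually looks like `(?P<name>`.
/// No regex parsing is done, so this cannot tell a real named group from
/// the same characters appearing as literals elsewhere in the pattern.
fn naive_rewrite_group_names(pattern: &str) -> String {
    const OPENER: &str = "(?P<";
    let mut out = String::with_capacity(pattern.len());
    let mut rest = pattern;
    while let Some(start) = rest.find(OPENER) {
        let name_start = start + OPENER.len();
        // Copy everything up to and including the opener unchanged.
        out.push_str(&rest[..name_start]);
        let after = &rest[name_start..];
        match after.find('>') {
            Some(end) => {
                // Replace dots in the presumed group name only.
                out.push_str(&after[..end].replace('.', "__"));
                rest = &after[end..];
            }
            None => {
                // No closing `>`: leave the remainder untouched.
                rest = after;
                break;
            }
        }
    }
    out.push_str(rest);
    out
}
```

For a real named group, `(?P<a.b>\d+)` becomes `(?P<a__b>\d+)` as intended. But `[(?P<a.b>]`, where the same characters sit inside a character class as literals, becomes `[(?P<a__b>]`, which changes what the pattern matches. That is the kind of corruption a text-level rewrite risks without a full parse.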
@bruceg It's an issue of future packaging! It may hamper future official distribution packaging efforts, as crates.io and some official distro repos forbid git deps. :( I note this problem already exists, but I'd rather we move away from it, not embrace it more. Another point: if we monkey patch regex and don't replace every other regex dep, it means that we need to bundle two copies of regex in the binary.
The regex crate has now added support for this. This feature is now present in regex 1.4.0 / regex-syntax 0.6.19. Should we re-open this issue or create a new one?
Let's reopen.
Actually, let's defer this. The Remap language will likely solve this.
Leaving this open so that we can verify the remap language covers this.
Closing since the Remap
Currently the regex crate which underlies our regex_parser implementation supports only [_0-9a-zA-Z]+ as possible names for capture groups. Thus nested fields of the form x.y.z cannot be captured.
For example, the following config unit test
fails with the error
I think we need to allow field names containing dots. The simplest option is to fork the regex crate and add support for it there, send a PR upstream, and use the fork until support for dots in capture group names is added to the upstream crate.