Please support locale-specific/unicode matching with --basic-regexp
#234
Comments
Thanks for the feedback! I agree with your assessment. Initially ugrep (based on RE/flex) is more of a developer-centric tool that always uses the "C" locale. I will revisit the design choices to see what can be added, but this has to wait a little bit.
You don't need to use The In a future update I want to update option
Thanks for the hint! Of course, the attraction of using
I will take a closer look at In the meantime, I've updated my dev version to implement option
Testing GNU grep 3.11 matching with
and observing that Nothing is matched by GNU grep 3.11 on MacOS or Linux for cn letters with
By contrast, with ugrep I made sure that Also ripgrep and silver searcher do not match any accented characters in the French text with pattern For We could make EDIT: the results depend on MacOS versus Linux.
So it seems that This looks good to me to move forward. Correct me if I am wrong with these:
Note that ugrep does not match newlines when part of these classes, such as
Looks good, thanks for your work on this!
GNU grep, for example, matches accented letters with `[:alpha:]` in UTF-8 locales.
Yes, I could either use GNU grep, or use `-P` to match `\p{L}` instead, but it would be nice to be able to use "standard" patterns with `ugrep -G` and not have to worry!
I also appreciate that ugrep currently documents the POSIX character classes as being ASCII-only, so it might be necessary to add a flag to support locale-sensitive POSIX regexes; at least for my use case that would be fine. I expect to have to use a shell alias or similar to "configure" ugrep to work the way I want.
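To make the difference concrete, here is a minimal sketch in Python (not ugrep or GNU grep code) contrasting an ASCII-only letter class with a Unicode-aware one. The sample text and variable names are made up for illustration. Python's `re` module supports neither `[:alpha:]` nor `\p{L}`, so `[A-Za-z]` stands in for the ASCII-only class and `[^\W\d_]` is used as a rough approximation of "any Unicode letter".

```python
import re

# Hypothetical sample text: "café déjà vu naïve 123", written with escapes
# so the accented letters are guaranteed to be precomposed code points.
text = "caf\u00e9 d\u00e9j\u00e0 vu na\u00efve 123"

# ASCII-only letters, standing in for an ASCII-only [[:alpha:]].
ascii_alpha = re.compile(r"[A-Za-z]+")

# Unicode-aware letters: word characters minus digits and underscore.
# This only approximates \p{L}; it is an assumption made for this sketch.
unicode_alpha = re.compile(r"[^\W\d_]+")

ascii_words = ascii_alpha.findall(text)
unicode_words = unicode_alpha.findall(text)

print(ascii_words)    # accented letters split the words apart
print(unicode_words)  # whole words are kept together
```

With the ASCII-only class, every accented letter acts as a word boundary, which is the behavior this issue asks to make configurable.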