Add Replacer::by_ref adaptor to use a Replacer without consuming it #449
#83 proposed something similar but without the same motivation.
Thanks @mbrubeck! I think the idea here seems OK to me, but do you have a specific example in mind where having this would help?
As far as the PR itself, it looks like this doesn't work on Rust 1.12. Is there a way to make this work for 1.12? If not, we'll probably have to leave this until I do the next version bump.
Also, could you wrap all the doc strings to 79 (inclusive) columns? Thanks.
Here's a specific problem that this solves. https://play.rust-lang.org/?gist=3a4d8af0b9ee6b8faa984a6966eb2728&version=stable
It was a real use case that I simplified into the example on the playground. `ReplacerRef` allows reusing the same `Replacer` in a generic context, which in this case fulfills the need to repeatedly apply the same `Regex` and `Replacer` to a string until it stops applying, because the replacement may result in the appearance of something that also needs to be replaced. I think implementing such a fixpoint operator for regex replacement is a legitimate need.

Consider this example, which is closer to what I needed. Suppose you have something like It can't be written as a regex with a single replacement, but can be written as a repeated replacement on pairs of adjacent

    loop { // while Regex applies
        s = Regex::new(r"f\(([^)]*)\)f\(([^)]*)\)").unwrap().replace(s, "f($1$2)");
    }
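The fixpoint idea described above can be sketched without the regex crate. This is a minimal, std-only stand-in, not the crate's implementation: the `Replacer` trait here takes a plain `&str` instead of `&Captures`, `replace_all` matches a literal needle rather than a regex, and `CountingDeleter` and `replace_until_stable` are hypothetical names introduced for illustration. It shows why a generic function that hands its replacer to a by-value API needs `by_ref` to use it on every pass.

```rust
use std::borrow::Cow;

/// Simplified stand-in for the regex crate's `Replacer` trait.
/// Assumption: the real trait receives `&Captures`, not `&str`.
pub trait Replacer {
    /// Appends the replacement for one match to `dst`.
    fn replace_append(&mut self, matched: &str, dst: &mut String);

    /// Lends the replacer so it can be passed by value more than once.
    fn by_ref(&mut self) -> ReplacerRef<'_, Self> {
        ReplacerRef(self)
    }
}

/// By-reference adaptor, shaped like the struct added in this PR.
pub struct ReplacerRef<'a, R: ?Sized + 'a>(&'a mut R);

impl<'a, R: Replacer + ?Sized + 'a> Replacer for ReplacerRef<'a, R> {
    fn replace_append(&mut self, matched: &str, dst: &mut String) {
        self.0.replace_append(matched, dst)
    }
}

/// Hypothetical stateful, non-`Clone` replacer: deletes each match and
/// counts how many matches it has seen.
pub struct CountingDeleter {
    pub hits: usize,
}

impl Replacer for CountingDeleter {
    fn replace_append(&mut self, _matched: &str, _dst: &mut String) {
        self.hits += 1;
    }
}

/// One left-to-right pass over non-overlapping occurrences of a literal
/// `needle` (a toy substitute for `Regex::replace_all`). Consumes `rep`.
pub fn replace_all<'h, R: Replacer>(
    needle: &str,
    haystack: &'h str,
    mut rep: R,
) -> Cow<'h, str> {
    let mut out = String::new();
    let mut last = 0;
    let mut matched_any = false;
    for (start, m) in haystack.match_indices(needle) {
        matched_any = true;
        out.push_str(&haystack[last..start]);
        rep.replace_append(m, &mut out);
        last = start + m.len();
    }
    if !matched_any {
        return Cow::Borrowed(haystack);
    }
    out.push_str(&haystack[last..]);
    Cow::Owned(out)
}

/// Repeats `replace_all` until a pass leaves the string unchanged.
pub fn replace_until_stable<R: Replacer>(
    needle: &str,
    haystack: &str,
    mut rep: R,
) -> String {
    let mut cur = haystack.to_owned();
    loop {
        // Passing `rep` itself would move it on the first iteration;
        // `rep.by_ref()` only borrows it for this pass.
        let next = replace_all(needle, &cur, rep.by_ref()).into_owned();
        if next == cur {
            return cur;
        }
        cur = next;
    }
}
```

With this sketch, deleting `ab` from `xaabby` takes two effective passes (`xaabby`, `xaby`, `xy`), because removing the inner match creates a new one that a single non-overlapping pass cannot see.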
Fixed.
Done.
All right, I think I'm OK with this. The use case seems a little out there, but since there's no other way to achieve it and this is a very small API addition, I'm fine with merging it. Thanks @mbrubeck!
Well, another way to define the use-case is emulating "overlapping replacement", with the repeated application of a non-overlapping replacement, which
I'm not getting there, am I? 😄 Well, suppose you want to remove the space between every two digits, e.g. You can only do that with repeated application of This is actually a simpler problem than what repeated application allows solving in general, because it could theoretically be solved with a single

The need for repeated application often arises when one wants to cheat and transform some simple subset of a context-free language with regular expressions. E.g. merge contents of consecutive tags in some simple html, or merge consecutive The alternative would be firing up whole HTML/Rust parsers and transforming the resulting ASTs, which is often overkill for the task.
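The digit example can be illustrated with a small std-only sketch. The concrete input and pattern are elided in the comment above, so everything here is an assumption made for illustration: the input `1 2 3 4`, a pass that behaves like replacing `(\d) (\d)` with `$1$2` using non-overlapping leftmost matches, ASCII digits only, and the hypothetical function names.

```rust
/// One left-to-right pass that mimics a non-overlapping replacement of
/// "digit, space, digit" with the two digits joined.
/// Assumption: only ASCII digits and a single ASCII space are matched.
pub fn join_digit_pairs_once(s: &str) -> String {
    let bytes = s.as_bytes();
    let mut out = String::with_capacity(s.len());
    let mut i = 0;
    while i < bytes.len() {
        let is_match = i + 2 < bytes.len()
            && bytes[i].is_ascii_digit()
            && bytes[i + 1] == b' '
            && bytes[i + 2].is_ascii_digit();
        if is_match {
            // Emit both digits and resume *after* the match, so the second
            // digit cannot start another match in this pass.
            out.push(bytes[i] as char);
            out.push(bytes[i + 2] as char);
            i += 3;
        } else {
            // Copy one whole character (may be multi-byte UTF-8).
            let ch = s[i..].chars().next().unwrap();
            out.push(ch);
            i += ch.len_utf8();
        }
    }
    out
}

/// Repeats the pass until the string stops changing.
pub fn join_digits(s: &str) -> String {
    let mut cur = s.to_owned();
    loop {
        let next = join_digit_pairs_once(&cur);
        if next == cur {
            return cur;
        }
        cur = next;
    }
}
```

Under these assumptions a single pass turns `1 2 3 4` into `12 34`, since the `2` is consumed by the first match and cannot also open the next one; a second pass is needed to reach `1234`.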
@albel727 No, I get it. :) I'm just confused at how you could possibly insist that something like this is common! I've been programming for a long time, and in all that time, neither myself nor any of the code I've read have ever needed something like this out of a regex library. Just because something is niche or uncommon doesn't mean I think we automatically shouldn't have it, but it definitely weighs in the cost-benefit analysis. The fact that you've presented some interesting examples, and have demonstrated that you can't really do what you want without this addition is sufficient enough for me (combined with the fact that this is a very simple and lightweight addition to the existing API). But, it's impossible to truly know. After this makes it into the next release, let's come back here in two years and look at all of regex's dependents. Count me shocked if we find more than a single digit number of uses. :-)
@mbrubeck Thanks so much for quickly updating the PR! I've added another round of nits. Thanks for your patience on this. :)
src/re_unicode.rs
Outdated
/// not be cloneable) and use it without consuming it, so it can be used
/// more than once:
///
/// ```
Sorry to nit at these things, but could you add an `# Example` header here to be consistent with how other examples are done? (I believe this is also the style in `std` too.)
///     let dst = re.replace_all(&dst, rep.by_ref());
///     dst.into_owned()
/// }
/// ```
Nice, thank you for writing out this example! Could you also add it to the `re_bytes` version too?
src/re_unicode.rs
Outdated
/// By-reference adaptor for a `Replacer`
///
/// Returned by [`Replacer::by_ref`](trait.Replacer.html#method.by_ref).
pub struct ReplacerRef<'a, R: ?Sized + 'a>(&'a mut R); |
Does it make sense to add `#[derive(Debug)]` to this? (And do the same for the `re_bytes` module.)
Note: This can't simply return `&mut Self` because a generic `impl<R: Replacer> Replacer for &mut R` would conflict with libstd's generic `impl<F: FnMut> FnMut for &mut F`.
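The note above can be made concrete with a small std-only sketch. It is an illustration under stated assumptions, not the crate's code: the trait is a simplified stand-in that takes `&str` instead of `&Captures`, and `apply_once` and `apply_twice` are hypothetical helpers. The commented-out impl is the alternative the note rules out; with the closure blanket impl present, the compiler is expected to reject it as an overlapping implementation, because `&mut F` is itself `FnMut` whenever `F` is.

```rust
/// Simplified stand-in for the regex crate's `Replacer` trait.
/// Assumption: the real trait receives `&Captures`, not `&str`.
pub trait Replacer {
    fn replace_append(&mut self, matched: &str, dst: &mut String);

    /// Returns a dedicated wrapper type instead of `&mut Self`.
    fn by_ref(&mut self) -> ReplacerRef<'_, Self> {
        ReplacerRef(self)
    }
}

/// Blanket impl for closures, analogous to the one the note refers to.
impl<F> Replacer for F
where
    F: FnMut(&str) -> String,
{
    fn replace_append(&mut self, matched: &str, dst: &mut String) {
        dst.push_str(&(*self)(matched));
    }
}

// The alternative design: expected to fail to compile alongside the
// blanket impl above, since `&mut F` also satisfies the `FnMut` bound.
//
// impl<'a, R: Replacer + ?Sized> Replacer for &'a mut R { /* ... */ }

/// A local newtype is not `FnMut`, so this impl does not overlap.
pub struct ReplacerRef<'a, R: ?Sized + 'a>(&'a mut R);

impl<'a, R: Replacer + ?Sized + 'a> Replacer for ReplacerRef<'a, R> {
    fn replace_append(&mut self, matched: &str, dst: &mut String) {
        self.0.replace_append(matched, dst)
    }
}

/// Hypothetical by-value consumer, like the replace methods.
pub fn apply_once<R: Replacer>(mut rep: R, matched: &str, dst: &mut String) {
    rep.replace_append(matched, dst);
}

/// Uses one generic replacer twice by lending it through `by_ref`.
pub fn apply_twice<R: Replacer>(mut rep: R, first: &str, second: &str) -> String {
    let mut dst = String::new();
    apply_once(rep.by_ref(), first, &mut dst);
    apply_once(rep.by_ref(), second, &mut dst);
    dst
}
```

Calling `apply_twice` with a stateful closure such as `|m: &str| { n += 1; format!("{}{}", m, n) }` keeps the closure's state across both uses, which is the behavior the adaptor exists to provide.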
Nits fixed.
Alright, fairy nuff. :) Though I've needed this quite often in my years of programming, but maybe it's just me. So now I can feel at rest, knowing that one will always be able to implement one-dimensional cellular automata with
This is useful when you want to take a generic `Replacer` (which might not be cloneable) and use it without consuming it, so it can be used more than once.