-
Notifications
You must be signed in to change notification settings - Fork 79
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implementing filters for capitalize, downcase, first, last, prepend, append and pluralize #18
Conversation
@ligthyear any updates on this? Should I preliminarily review it already? :) The travis build is failing because you mis-spelled However, #28 already introduced |
Sorry, @johannhof , I was hoping I had a bit more time to implement a few others, but the last two weeks went crazy instead. I rebased against master and updated everything to make all tests pass. I guess it is best to review/merge this batch now and do others another time. I'll update the description accordingly. So, yes, please feel free to review and merge. |
@ligthyear don't worry about it, I can totally relate. I'll review it soon! Thanks for contributing |
}, | ||
_ => chr.to_uppercase().next().unwrap(), | ||
}.to_string(); | ||
word + &next_char |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would've done something like
Str(ref s) => Ok(Str(s.split(' ').map(|word| {
let (h, t) = word.split_at(1);
h.to_uppercase() + t
}).collect::<Vec<String>>().join(" "))),
but it seems you're trying to preserve whitespace, which is fine by me. Can I just ask you to add some comments explaining your code, because I really can't wrap my head around it. Also, are the unwrap
calls guaranteed to be safe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, are we sure that the iterator to_uppercase
returns will always contain only one uppercase letter? afaik ß
maps to SS
, for instance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some random suggestions:
Converting every char
to a String
and then appending that to the other string is isn't the best way to do this. You should probably use String::push
instead. Btw. word
as the variable name for the accumulator string is a bit misleading, I think.
You're also iterating over the accumulated string to get the last char in every iteration, which means the whole function is O(n^2) in the length of the input string. That's not necessarily a problem, but it would be easy to avoid by just having a boolean that keeps track whether the last char was whitespace. That would make the function O(n) and also make it easier to read.
Finally, you're iterating over char_indices
, but never actually look at the indices. Is there a reason you couldn't just use chars
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding the ß
issue: I don't know enough about unicode to know how much of a problem this is. ß
probably won't be a problem because it never occurs at the beginning of a word anyway, but there might be other characters that have the same issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Re ß:
ß is the only letter that doesn't have any uppercase at all. The work-a-round in technical sense is to often replace it with "SS", which is technically two characters, however that never effects us, as ß also can't be used as the beginning of any word as it should only be used to extend the previous vowel. So, while in a uppercasing of a text this would be relevant (and actually mean your lowercased text is shorter than the uppercased one), in a title-case scenario (the one we are looking at), this isn't an issue in practice. Given, that someone could still feed the ß at the beginning of the word, but as far as I understand this code, this would just lead to the usage of "S" instead of "SS".
Re complexity/ not using split(' ')
:
This is complex because unicode is complex. I had a split version first, then added the "silent whitespace" test and everything failed – as it also would for a line break. Same for the join. We'd replace the whitespaces improperly.
Re char_indices
: this is relic of an earlier implementation. Yes, this should use chars
Re String::push
: good point. Though that means I need to get a mutable handle. I'm sure I can do that in the fold without breaking the borrow checking. I'll try it.
Re Boolean for tracking: indeed, that could be done. That's how I used to do that in python, but never liked it. In case in particular as my otherwise very functional closure has an ugly side effect it constantly depends on. Also not sure, the borrow checker will allow that one. I'll see if that is something I can make better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The unwrap()
s here are safe. We always have an actual character ;) .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Re Boolean for tracking: indeed, that could be done. That's how I used to do that in python, but never liked it. In case in particular as my otherwise very functional closure has an ugly side effect it constantly depends on.
That doesn't require any side effects, the accumulator variable of the fold can just be (String, bool)
instead of String
. Although I have to say that even as a Haskell programmer, I'm not sure that a fold is really better than a loop here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Re String::push: good point. Though that means I need to get a mutable handle. I'm sure I can do that in the fold without breaking the borrow checking. I'll try it.
This shouldn't be a problem, you just need to add mut
to the binding of word
. I would be surprised if that caused issues with the borrow checker.
I basically only have that comment, the rest looks good. We'd also need test coverage for the error cases (passing invalid values to the filters). If you don't have the time for that we can just open a follow-up bug. :) |
@johannhof See comments in the thread. I can/will take a look this weekend, is that soon enough? also adding some invalid values should be in there ;) . |
Take your time, this PR is not going away :)
huh, where? I can't seem to find it Thanks! |
Going through this again I think it's ok to merge, thank you very much @ligthyear! Feel free to make another PR if you still want to pick up the suggestions in this thread. |
This implements the a few more filters.
Specifically it adds:
slice - slice a string. Takes an offset and length, e.g. {{ "hello" | slice: -3, 3 }} #=> llo
References #11