Accept 0 as a valid str char boundary

Index 0 must be a valid char boundary (it is an invariant of str that it
contains valid UTF-8 data).

If we check explicitly for index == 0, that removes the need to read the
byte at index 0, so it avoids a trip to the string's memory, and it
optimizes out the slicing index's bounds check whenever it is zero.
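
To illustrate why index 0 needs no byte read, here is a minimal sketch of the UTF-8 rule behind the invariant: a character can only start on a byte that is not a continuation byte (bit pattern `0b10xx_xxxx`), and a valid `str` never begins with one. The helper names `is_continuation_byte` and `starts_on_boundary` are hypothetical and not part of this commit.

```rust
/// True if `b` is a UTF-8 continuation byte, i.e. has the bit
/// pattern 0b10xx_xxxx (values 128..=191).
pub fn is_continuation_byte(b: u8) -> bool {
    (b & 0b1100_0000) == 0b1000_0000
}

/// True if the first byte of `s`, if there is one, starts a character.
/// For any valid `&str` this always holds, which is why the check for
/// index 0 can be answered without touching the string's memory.
pub fn starts_on_boundary(s: &str) -> bool {
    match s.as_bytes().first() {
        // The empty string trivially starts on a boundary.
        None => true,
        Some(&b) => !is_continuation_byte(b),
    }
}
```

The negation of `is_continuation_byte` is the same test as `b < 128 || b >= 192`, which is the byte check `is_char_boundary` falls back to for indices other than 0 and `len`.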

With this change, the following examples all change from reading the byte
at index 0 and branching to a possible panic, to having the bounds
checking optimized away.

```rust
pub fn split(s: &str) -> (&str, &str) {
    s.split_at(0)
}

pub fn both(s: &str) -> &str {
    &s[0..s.len()]
}

pub fn first(s: &str) -> &str {
    &s[..0]
}

pub fn last(s: &str) -> &str {
    &s[0..]
}
```
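
As a quick sanity check that these operations are well defined for any string, including empty and non-ASCII ones, the sketch below runs the same four zero-index operations on an arbitrary input. `zero_index_pieces` is a hypothetical helper used only for illustration; it shows behavior, not the generated code.

```rust
/// Applies the four zero-index operations from the examples to `s` and
/// returns the results in order: `split_at(0)`, `[0..len]`, `[..0]`, `[0..]`.
/// None of these can panic, because 0 and `s.len()` are always char boundaries.
pub fn zero_index_pieces(s: &str) -> ((&str, &str), &str, &str, &str) {
    (s.split_at(0), &s[0..s.len()], &s[..0], &s[0..])
}
```

For example, calling it on `"héllo"` yields `(("", "héllo"), "héllo", "", "héllo")`.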
bluss committed Mar 24, 2016
1 parent 80e7a1b commit f621193
Showing 1 changed file with 4 additions and 1 deletion.
5 changes: 4 additions & 1 deletion src/libcore/str/mod.rs
@@ -1892,7 +1892,10 @@ impl StrExt for str {

     #[inline]
     fn is_char_boundary(&self, index: usize) -> bool {
-        if index == self.len() { return true; }
+        // 0 and len are always ok.
+        // Test for 0 explicitly so that it can optimize out the check
+        // easily and skip reading string data for that case.
+        if index == 0 || index == self.len() { return true; }
         match self.as_bytes().get(index) {
             None => false,
             Some(&b) => b < 128 || b >= 192,
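
For readers who want to try the patched logic outside of libcore, the following is a minimal sketch of the same check written as a free function over `&str`. The name `is_char_boundary_sketch` is hypothetical; the body mirrors the hunk above and is not the actual `StrExt` implementation.

```rust
/// Standalone sketch of the patched boundary check.
pub fn is_char_boundary_sketch(s: &str, index: usize) -> bool {
    // 0 and len are always ok. Testing 0 explicitly means the string's
    // bytes are never read for that case.
    if index == 0 || index == s.len() {
        return true;
    }
    match s.as_bytes().get(index) {
        // Past the end of the string: not a boundary.
        None => false,
        // A byte starts a character unless it is a continuation byte
        // (128..=191, i.e. 0b10xx_xxxx).
        Some(&b) => b < 128 || b >= 192,
    }
}
```

Under these assumptions the sketch should agree with the standard library's `str::is_char_boundary` for every index, including indices past the end of the string.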
