graphemes(s, m:n) substring slicing #44266

Merged
merged 8 commits into from
Apr 2, 2022

stevengj
Member

@stevengj stevengj commented Feb 19, 2022

People seem to want this functionality, and it's nontrivial to implement efficiently & correctly.

(Discoverability is a challenge here, because most people wanting to access a "slice of characters" in Unicode probably don't know what a grapheme is, but at least this gives us a succinct answer to such questions.)
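
As an illustration of the feature described above, here is a minimal sketch. It assumes Julia 1.9 or later (the `v1.9` stdlib paths in the CI logs suggest that is where the `m:n` method landed) and the `Unicode` standard library. The sample string is an assumption chosen for the example: it is built with an explicit combining accent so that the final "é" is one grapheme made of two code points.

```julia
using Unicode

# "e" followed by U+0301 (combining acute accent): one grapheme, two code points.
s = "expose\u0301"

println(length(s))             # number of code points (Chars)
println(length(graphemes(s)))  # number of user-perceived characters

# Slice by grapheme index rather than by code point or byte index.
sub = graphemes(s, 3:6)
println(sub)
println(sub isa SubString)
```

Slicing by grapheme keeps the accent attached to its base letter, which indexing by code point would not guarantee.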


@stevengj stevengj added strings "Strings!" unicode Related to unicode characters and encodings labels Feb 19, 2022
@KristofferC
Sponsor Member

Should anything be said about the time complexity of this operation in the docstring?

@stevengj
Member Author

> Should anything be said about the time complexity of this operation in the docstring?

Sure, added.
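
The docstring wording itself is not reproduced in this thread. As a rough sketch of why the cost can be expected to grow with the position of the slice: grapheme boundaries are not stored anywhere, so reaching the `m`-th grapheme means scanning from the start of the string. The helper `naive_grapheme_slice` below is hypothetical and written only for illustration; it is not the implementation in this PR.

```julia
using Unicode

# Hypothetical reference helper (not the PR's code): walk the graphemes from the
# start of the string and join those whose index falls inside the range `r`.
# Unlike graphemes(s, m:n), this allocates a new String.
function naive_grapheme_slice(s::AbstractString, r::AbstractUnitRange{<:Integer})
    parts = String[]
    for (i, g) in enumerate(graphemes(s))
        i > last(r) && break              # past the end of the range, stop scanning
        i >= first(r) && push!(parts, g)  # inside the range, keep this grapheme
    end
    return join(parts)
end

println(naive_grapheme_slice("expose\u0301", 3:6))
```

The loop visits every grapheme up to the end of the requested range, which is the linear scan the question about time complexity is concerned with.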

@stevengj
Member Author

stevengj commented Feb 20, 2022

buildkite CI errors seem unrelated:

LoadError("sysimg.jl", 19, LoadError("/cache/build/dockerized-amdci4-8/julialang/julia-master/tmp/test-asan/asan/usr/share/julia/stdlib/v1.9/Random/src/Random.jl", 3, LoadError("/cache/build/dockerized-amdci4-8/julialang/julia-master/tmp/test-asan/asan/usr/share/julia/stdlib/v1.9/Random/src/DSFMT.jl", 3, InexactError(:trunc, Int64, Inf))))

and

LibGit2/libgit2                          (13) |         failed at 2022-02-19T22:52:54.464
Error During Test at /cache/build/default-amdci5-8/julialang/julia-master/julia-db411f4927/share/julia/stdlib/v1.9/LibGit2/test/libgit2.jl:2224
Got exception outside of a @test
IOError: read: i/o error (EIO)

and

SuiteSparse                              (9) |  1907.82 |  11.17 |  0.6 |    9577.14 |  2255.55
Profile                                  (5) |         failed at 2022-02-20T00:35:48.352
Test Failed at /cache/build/default-amdci4-6/julialang/julia-master/julia-db411f4927/share/julia/stdlib/v1.9/Profile/test/runtests.jl:165
Expression: getline(values(fdictc)) == getline(values(fdict0)) + 2
Evaluated: nothing == 24
Test Failed at /cache/build/default-amdci4-6/julialang/julia-master/julia-db411f4927/share/julia/stdlib/v1.9/Profile/test/runtests.jl:187
Expression: parse(Int, s) > 100
Evaluated: 68 > 100

@vtjnash vtjnash added the merge me PR is reviewed. Merge when all tests are passing label Mar 28, 2022
@giordano giordano merged commit 41c2c7c into master Apr 2, 2022
@giordano giordano deleted the sgj/graphemeslice branch April 2, 2022 21:27
@giordano giordano removed the merge me PR is reviewed. Merge when all tests are passing label Apr 2, 2022
@matthieugomez
Contributor

Isn't it a bit confusing that graphemes(x, m:n) returns a SubString but graphemes(x) does not? I guess it'd be better to have a different name.
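
The difference in return types raised here can be seen in a short sketch, again assuming Julia 1.9 or later; the sample string is arbitrary.

```julia
using Unicode

x = "abc"

it  = graphemes(x)        # one-argument form: a lazy iterator over graphemes
sub = graphemes(x, 1:2)   # two-argument form: a contiguous slice of x

println(typeof(it))
println(typeof(sub))

# Each element produced by the iterator is itself a SubString.
println(first(it) isa SubString)
```

So the one-argument method yields a collection of substrings, while the `m:n` method yields a single substring spanning those graphemes.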

6 participants