-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixed bug in unicode.jl/encode16 #10948
Conversation
Thanks @ScottPJones. A couple of comments:
Note that you can amend the commit by making the changes, doing (I noticed that you pushed the change to your |
else | ||
push!(buf, UInt16(0xd7c0 + (c>>10) & 0x3ff)) | ||
push!(buf, UInt16(0xdc00 + c & 0x3ff)) | ||
throw(ArgumentError("invalid Unicode character (>0x10FFFF)")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe (> 0x10ffff)
, since we usually use spaces around >
in Julia code, and lower-case hex is the default Julia style.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great! Thanks for all the advice! I’ll fix this up and do the git commit --amend / git push -f
And your real name in the author field as long as you've configured |
@@ -55,9 +55,11 @@ function encode16(s::AbstractString) | |||
c = reinterpret(UInt32, ch) | |||
if c < 0x10000 | |||
push!(buf, UInt16(c)) | |||
elseif c <= 0x10FFFF |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lower-case hex to match the surrounding style
Hmmm... I’d like to also change it so that it displays the character value when it gets the ArgumentError... I have checked that it doesn’t create a GC frame or change in-line ability, as Stefan was concerned about, but my problem is that doing $c interpolates the number in decimal, and I want it out as 0x and hex digits... (I know this is a stupid newbie question!) |
Ah, never mind, found it! 0x$(hex(c)) |
…aracters > 0x100000
I have now updated the tests to add all of the suggestions. Thanks! |
Fixed bug in unicode.jl/encode16
Now, I'd like to look at improving the performance of some of these functions in utf8, utf16, and utf32... I now know I messed up by pushing from my master... can you please help a newbie out, and tell me Thanks so much everybody! |
This should be a reasonably bulletproof recipe. I assume here that you have two remotes; "julialang" is the remote that points at the JuliaLang/julia repository on GitHub, and "mine" points at ScottPJones/julia on GitHub. Substitute as needed. You can check your remotes with
|
Oh, and don't worry about your master branch on GitHub being out of sync...in fact, you don't even need it:
|
@pao I did what you said, but after that, when I tried to rebuild, I got a strange error:
I actually have automake version 1.15 installed on my Mac... did something change in the last two days? Thanks, Scott |
some timestamp issue might be confusing autotools - try |
What @tkelman said is a good general rule of thumb for weird-looking dependency build problems. |
There's a final tier of cleanliness if |
backported in 8eafd07 |
This fixes the bug in encode16 (called by convert and utf16), which incorrectly handles Unicode characters above 0x100000, and doesn't check for invalid characters above 0x10FFFF.
It updates the test cases as well to make sure the maximum valid Unicode character is tested.