fix the bug when keys contain non UTF8 strings #2566
Thank you for the pull request, but you haven't followed the contribution guidelines:
Could you add these, making sure you follow the linked contribution guidelines, then push these changes? Once this is done, one of the maintainers will review the pull request.
@HughParsonage Sorry for my ignorance. Will do it ASAP, thanks.
Codecov Report
@@ Coverage Diff @@
## master #2566 +/- ##
========================================
Coverage ? 91.4%
========================================
Files ? 63
Lines ? 12101
Branches ? 0
========================================
Hits ? 11061
Misses ? 1040
Partials ? 0
good catch 👍
…hinese strings can't be expressed in native encoding...
Thanks for the nice PR.
With the PR being from your personal branch, I think I have to wait for you to accept my PR to you (which is why it's failing to pass checks). I went ahead and merged to master. I've invited you to be a project member. This will allow you to create branches directly in the project next time, so we can all push to each other's PR branches. Welcome!
@mattdowle, thanks for merging this and for this great package! You should have been able to modify my code directly, since I filed this PR with editing of my code allowed. The encoding issue is indeed tricky, especially on Windows. Another idea is simply to enforce all the non-ASCII strings in Your commit of making
Current build fails for me after @mattdowle 's update:
(this is doing Also
Works as expected. So something about the options in RStudio's build&reload routine is going awry? But I've never seen this type of issue before...
@MichaelChirico Just tried and it's ok for me on a Windows machine~
Closes #2462. #1826 is the older issue...
For strings, `data.table` compares their values in UTF-8 encoding. However, because two `ENC2UTF8` calls are missing in `csort()` and `csort_pre()`, the order that `data.table` creates actually depends on the encoding. On Windows, where the default encoding is not UTF-8, this leads to some weird output when there are strings in keys.

Please tell me if there's anything more that needs to be done, thanks.
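As a rough illustration of why the ordering can depend on the encoding, here is a minimal sketch in plain Python. It is not data.table's C code and is not taken from this PR's diff; the sample strings, the choice of GBK as the "native" encoding, and the helper names `byte_order` and `to_utf8` are all assumptions made for this example. It sorts the same two strings by their raw bytes in two encodings, then shows that converting to UTF-8 before comparing gives a single consistent order.

```python
# Illustration only: plain Python, not data.table's implementation.
words = ["中", "文"]  # sample strings chosen for this sketch


def byte_order(strings, encoding):
    """Sort strings by their raw bytes in the given encoding (hypothetical helper)."""
    return sorted(strings, key=lambda s: s.encode(encoding))


def to_utf8(raw, source_encoding):
    """Re-encode raw bytes from source_encoding into UTF-8 bytes (hypothetical helper)."""
    return raw.decode(source_encoding).encode("utf-8")


# Byte-wise order in UTF-8 follows Unicode code point order.
utf8_order = byte_order(words, "utf-8")

# Byte-wise order in GBK follows GBK byte values, which differ for this pair.
gbk_order = byte_order(words, "gbk")

# Converting GBK bytes to UTF-8 before comparing restores the UTF-8 order.
gbk_bytes = [w.encode("gbk") for w in words]
normalised_order = [
    b.decode("gbk") for b in sorted(gbk_bytes, key=lambda b: to_utf8(b, "gbk"))
]

print("UTF-8 byte order:     ", utf8_order)
print("GBK byte order:       ", gbk_order)
print("GBK, compared as UTF-8:", normalised_order)
```

In this sketch the two byte-wise orders disagree for the chosen pair, while the normalised comparison matches the UTF-8 one, which is the general effect the description above attributes to converting strings to UTF-8 before sorting.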
With this commit, here's what we get:
Without this commit, on a Windows machine, the result would be different: