Add support for unicode normalization. #44

Closed
SethMMorton opened this issue Aug 18, 2017 · 1 comment

@SethMMorton
Owner

This is inspired by this StackOverflow question. The problem is that even with ns.LOCALE some non-ASCII letters are not sorted as you might expect, but running the input through unicodedata.normalize does the trick.

It is not clear if this should be the default behavior or an add-on. It is also not clear if it should default to 'NFD' or if it can be more flexible.
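To illustrate the problem, here is a minimal standard-library sketch that does not use natsort at all. The word list is made up for the example, and plain `sorted` stands in for whatever comparison the caller would actually use.

```python
import unicodedata

# Hypothetical sample data; "\u00e9" is the precomposed letter 'é'.
words = ["zebra", "\u00e9cole", "eagle", "etude"]

# Code point order: U+00E9 is greater than 'z', so the accented word sorts last.
plain = sorted(words)

# NFD splits 'é' into 'e' + U+0301 (combining acute accent), so the word
# now sorts among the other words starting with 'e'.
nfd_sorted = sorted(words, key=lambda s: unicodedata.normalize("NFD", s))

print(plain)
print(nfd_sorted)
```

With this input the accented word moves from the end of the list to just before "zebra".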

SethMMorton added a commit that referenced this issue Aug 19, 2017
All unicode input now gets 'NFD' normalization, which ensures that
all characters that look the same are represented by the same code
points. 'NFD' was chosen because it is the expanded form, which will
cause (for example) 'é' to be placed immediately after 'e' rather than
after 'z'.

Users can choose 'NFKD' with ns.COMPATIBILITYNORMALIZE (or ns.CN) which
will change certain characters to their compatible (and often ASCII)
representation. This may be useful to force numbers in odd
representations to be transformed to ASCII, which will potentially give
better sorting orders.

This will close issue #44.
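
As a rough sketch of what the two normalization forms in the commit message do, the following uses only `unicodedata` from the standard library. The helper names `to_nfd` and `to_nfkd` and the sample strings are assumptions made for this example, not part of natsort.

```python
import unicodedata


def to_nfd(text):
    """Canonical decomposition (the default described in the commit)."""
    return unicodedata.normalize("NFD", text)


def to_nfkd(text):
    """Compatibility decomposition (what ns.COMPATIBILITYNORMALIZE selects)."""
    return unicodedata.normalize("NFKD", text)


# Two ways of writing 'é': precomposed, and 'e' plus a combining accent.
composed = "\u00e9"
decomposed = "e\u0301"
same_after_nfd = to_nfd(composed) == to_nfd(decomposed)

# Hypothetical "odd" number representations: circled one, fullwidth 12,
# and x followed by superscript two.
samples = ["\u2460", "\uff11\uff12", "x\u00b2"]
nfd_samples = [to_nfd(s) for s in samples]    # left unchanged by NFD
nfkd_samples = [to_nfkd(s) for s in samples]  # mapped to ASCII digits

print(same_after_nfd)
print(nfkd_samples)
```

In natsort itself, the commit message indicates the compatibility form is selected by passing ns.COMPATIBILITYNORMALIZE (or ns.CN) as the algorithm option; the code above only shows the underlying normalization step.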
@SethMMorton
Owner Author

Added in version 5.1.0.
