Support for UTF-16 surrogate pair encoded emojis #279
Comments
tech4him1 changed the title from "Support for Unicode surrogate pair encoded emojis" to "Support for UTF-16 surrogate pair encoded emojis" on Sep 4, 2017
Still not supported 😞
steveh added a commit to steveh/yaml that referenced this issue on Apr 8, 2024
I'm new to this code base so have likely implemented this in a way that isn't ideal, but hopefully it's enough of a starting point.
References:
* https://russellcottrell.com/greek/utilities/SurrogatePairCalculator.htm
* https://mathiasbynens.be/notes/javascript-unicode
* readerc.go
Fixes go-yaml#279
I've created a PR to add support for surrogate pairs: #1029
Currently, if I try to parse YAML data containing Unicode emojis split into UTF-16 surrogate pairs (e.g. U+1F468 written as \uD83D\uDC68 in YAML), go-yaml returns the error "found invalid Unicode character escape code".
According to the YAML spec, parsers are supposed to support UTF-8 and UTF-16, including surrogate pairs:
http://www.yaml.org/spec/1.2/spec.html#id2770814
http://www.yaml.org/spec/1.2/spec.html#id2771184
This looks intentional. Are you planning to support these or not?
yaml/scannerc.go
Lines 2443 to 2447 in 25c4ec8