Surrogate code units (U+D800 through U+DFFF) cannot be encoded into UTF-8.
Ref #268
This covers the most important part of #268, which is UTF-8 compatibility. I would also like to prohibit the noncharacters that are permanently reserved for Unicode-internal use (U+FDD0 through U+FDEF, plus U+nFFFE and U+nFFFF where n is 0x0 through 0x10; two of these, U+FFFE and U+FFFF, are also invalid XML characters). The same goes for control characters (U+0000 through U+001F and U+007F through U+009F). The first 32 of those are not valid unescaped inside a JSON string, and are not valid in XML apart from the specific exceptions of tab, line feed, and carriage return. Both groups can be addressed in a followup. Any of them that should be allowed in string contents will need a corresponding escape sequence, similar to how `\\` represents a single `\`.
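For reference, the three code point classes discussed above can be sketched as simple range checks. This is only an illustration in Python, not the code in this PR. The function names (`is_surrogate`, `is_noncharacter`, `is_control`, `rejection_reason`) are hypothetical, and the sketch assumes its input is already a valid code point value in the range 0 through 0x10FFFF.

```python
from typing import Optional


def is_surrogate(cp: int) -> bool:
    # High surrogates are U+D800..U+DBFF, low surrogates are U+DC00..U+DFFF.
    return 0xD800 <= cp <= 0xDFFF


def is_noncharacter(cp: int) -> bool:
    # U+FDD0..U+FDEF, plus the last two code points of each of the 17 planes
    # (U+nFFFE and U+nFFFF for n in 0x0..0x10).
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def is_control(cp: int) -> bool:
    # C0 controls (U+0000..U+001F), DEL (U+007F), and C1 controls (U+0080..U+009F).
    return cp <= 0x1F or 0x7F <= cp <= 0x9F


def rejection_reason(cp: int) -> Optional[str]:
    # Hypothetical helper: returns why a code point would be rejected, or None.
    if is_surrogate(cp):
        return "surrogate"
    if is_noncharacter(cp):
        return "noncharacter"
    if is_control(cp):
        return "control"
    return None


def encodes_to_utf8(cp: int) -> bool:
    # Python's strict UTF-8 encoder refuses lone surrogates, which mirrors
    # the restriction this PR is about.
    try:
        chr(cp).encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False
```

Only the surrogate check is needed for UTF-8 compatibility. The noncharacter and control checks correspond to the possible followup, and any control character that should remain expressible would still need its own escape sequence.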