This issue was spotted during the removal of TextEncoder and TextDecoder (#4). TextDecoder can automatically strip a leading BOM (U+FEFF) from the input, if present. We need to emulate this in a separate encoding, perhaps BOMAwareUTF8Encoding (whose whatwg_name() is still utf-8)? This use case itself is better handled by decoders with a fallback encoding (#19), but we may still need BOM-attached Unicode encodings from time to time: many applications of UTF-16 require a BOM, for example.
I think that BOMAwareUTF8Encoding is the wrong approach. Rather, what's needed is what the spec calls decode.
It could be a BOMDecoder (or some other name) that takes a "fallback encoding" parameter. When the input starts with a BOM, the BOM is stripped and the corresponding encoding is used. Otherwise, the fallback encoding is used.
This decoder should always be used for formats that support multiple encodings, because the BOM (by proximity) is more accurate than other metadata.
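For illustration, the BOM-then-fallback behaviour described above could be sketched roughly as follows. This is only a sketch under stated assumptions, not this library's API: the names `Detected`, `sniff_bom`, `decode_with_fallback` and `decode_utf16` are hypothetical, only the standard library is used, and the fallback is limited to the three encodings a BOM can signal, whereas a real decoder would accept any encoding as the fallback.

```rust
/// Encodings a BOM can identify (hypothetical enum for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Detected {
    Utf8,
    Utf16Le,
    Utf16Be,
}

/// Returns the encoding signalled by a leading BOM together with the BOM's
/// length in bytes, or `None` if the input does not start with a BOM.
pub fn sniff_bom(input: &[u8]) -> Option<(Detected, usize)> {
    if input.starts_with(&[0xEF, 0xBB, 0xBF]) {
        Some((Detected::Utf8, 3))
    } else if input.starts_with(&[0xFF, 0xFE]) {
        Some((Detected::Utf16Le, 2))
    } else if input.starts_with(&[0xFE, 0xFF]) {
        Some((Detected::Utf16Be, 2))
    } else {
        None
    }
}

/// Decodes `input`, preferring the encoding named by its BOM (which is then
/// stripped) and using `fallback` only when no BOM is present.
/// Malformed sequences become U+FFFD.
pub fn decode_with_fallback(input: &[u8], fallback: Detected) -> String {
    let (encoding, skip) = sniff_bom(input).unwrap_or((fallback, 0));
    let body = &input[skip..];
    match encoding {
        Detected::Utf8 => String::from_utf8_lossy(body).into_owned(),
        Detected::Utf16Le => decode_utf16(body, u16::from_le_bytes),
        Detected::Utf16Be => decode_utf16(body, u16::from_be_bytes),
    }
}

/// Decodes UTF-16 bytes, building code units with the given byte order.
fn decode_utf16(body: &[u8], to_unit: fn([u8; 2]) -> u16) -> String {
    let units = body.chunks_exact(2).map(|pair| to_unit([pair[0], pair[1]]));
    let mut out: String = std::char::decode_utf16(units)
        .map(|r| r.unwrap_or(std::char::REPLACEMENT_CHARACTER))
        .collect();
    // A trailing odd byte cannot form a code unit; mark it as an error.
    if body.len() % 2 != 0 {
        out.push(std::char::REPLACEMENT_CHARACTER);
    }
    out
}
```

With this shape, `decode_with_fallback(b"\xEF\xBB\xBFhi", Detected::Utf16Le)` decodes as UTF-8 despite the UTF-16LE fallback, because the BOM takes precedence over the caller's hint.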
@SimonSapin I have updated the description. I agree that this use case should be handled differently; see #19 for a separate discussion. A BOM-aware encoding might still be useful on its own, though.