Editorial: Move ASCII-case definitions into ECMA-262
This moves the definitions of ASCII-uppercase, ASCII-lowercase, and ASCII-
case-insensitive match into ECMA-262 and deletes them from ECMA-402.

It also generalizes the definitions of ASCII-uppercase and ASCII-lowercase
to cover sequences of code points as well as strings.

This allows a simplification in the TimeZoneIANALegacyName production.
(Note: the otherwise unnecessary parentheses around |TimeZoneIANANameTail| are
there because grammarkdown seems to choke on the version without them.)
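
To illustrate why the generalization permits the simplification, here is a minimal sketch in JavaScript. The spec defines these terms in prose, so every function name below is hypothetical, and `stringToCodePoints` / `codePointsToString` are only rough analogues of the spec's StringToCodePoints and CodePointsToString. It shows that lowercasing a code point sequence directly gives the same result as the old round trip through a String.

```javascript
// Hypothetical helper names; the spec defines these terms in prose, not as functions.

// ASCII-lowercase of a String value: operates on UTF-16 code units.
function asciiLowercaseString(s) {
  let out = "";
  for (let i = 0; i < s.length; i++) {
    const unit = s.charCodeAt(i);
    // 0x0041..0x005A (A-Z) map to 0x0061..0x007A (a-z); all other code units are preserved.
    out += String.fromCharCode(unit >= 0x41 && unit <= 0x5a ? unit + 0x20 : unit);
  }
  return out;
}

// ASCII-lowercase of a sequence of code points (the generalization added by this commit).
function asciiLowercaseCodePoints(codePoints) {
  return codePoints.map((cp) => (cp >= 0x41 && cp <= 0x5a ? cp + 0x20 : cp));
}

// Rough analogues of the spec's StringToCodePoints and CodePointsToString (assumption).
function stringToCodePoints(s) {
  return Array.from(s, (ch) => ch.codePointAt(0));
}
function codePointsToString(codePoints) {
  return String.fromCodePoint(...codePoints);
}

// Code points as they might be matched by |TimeZoneIANANameTail|.
const matched = stringToCodePoints("Etc/GMT+5");

// Old wording: convert to a String, lowercase, convert back to code points.
const oldForm = stringToCodePoints(asciiLowercaseString(codePointsToString(matched)));

// New wording: lowercase the code point sequence directly.
const newForm = asciiLowercaseCodePoints(matched);

console.log(codePointsToString(newForm)); // "etc/gmt+5"
```

Both forms yield the same sequence because only code points in the ASCII uppercase range are changed, and those are each a single code unit.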
ptomato committed Oct 20, 2022
1 parent 7f91717 commit 0af69d3
Showing 3 changed files with 51 additions and 1 deletion.
2 changes: 1 addition & 1 deletion spec/abstractops.html
@@ -1009,7 +1009,7 @@ <h1>ISO 8601 grammar</h1>
TimeZoneIANANameComponent[?Legacy] `/` TimeZoneIANANameTail[?Legacy]

TimeZoneIANALegacyName :
- TimeZoneIANANameTail[+Legacy] [&gt; but only if `etc/gmt` |ASCIISign| |UnpaddedHour| matches StringToCodePoints(the ASCII-lowercase of CodePointsToString(the sequence of code points matched by |TimeZoneIANANameTail|))]
+ TimeZoneIANANameTail[+Legacy] [&gt; but only if `etc/gmt` |ASCIISign| |UnpaddedHour| matches the ASCII-lowercase of the sequence of code points matched by (|TimeZoneIANANameTail|)]
TimeZoneIANANameTail[+Legacy] [&gt; but only if the sequence of code points matched by |TimeZoneIANANameTail| is an ASCII-case-insensitive match for an element of « *"Etc/GMT0"*, *"GMT0"*, *"GMT-0"*, *"GMT+0"*, *"EST5EDT"*, *"CST6CDT"*, *"MST7MDT"*, *"PST8PDT"* »]

TimeZoneIANAName :
26 changes: 26 additions & 0 deletions spec/intl.html
@@ -14,6 +14,32 @@ <h1>Amendments to the ECMAScript® 2023 Internationalization API Specification</
</p>
</emu-note>

<emu-clause id="sup-case-sensitivity-and-case-mapping">
<h1><a href="https://tc39.es/ecma402/#sec-case-sensitivity-and-case-mapping">Case Sensitivity and Case Mapping</a></h1>

<emu-note type="editor">
<p>These definitions are moved into ECMA-262.</p>
</emu-note>

<p>
The String values used to identify locales, currencies, scripts, and time zones are interpreted in an ASCII-case-insensitive manner, treating the code units 0x0041 through 0x005A (corresponding to Unicode characters LATIN CAPITAL LETTER A through LATIN CAPITAL LETTER Z) as equivalent to the corresponding code units 0x0061 through 0x007A (corresponding to Unicode characters LATIN SMALL LETTER A through LATIN SMALL LETTER Z), both inclusive. No other case folding equivalences are applied.
</p>
<emu-note>
For example, *"ß"* (U+00DF) must not match or be mapped to *"SS"* (U+0053, U+0053). *"ı"* (U+0131) must not match or be mapped to *"I"* (U+0049).
</emu-note>
<del class="block">
<p>
The <em>ASCII-uppercase</em> of a String value _S_ is the String value derived from _S_ by replacing each occurrence of an ASCII lowercase letter code unit (0x0061 through 0x007A, inclusive) with the corresponding ASCII uppercase letter code unit (0x0041 through 0x005A, inclusive) while preserving all other code units.
</p>
<p>
The <em>ASCII-lowercase</em> of a String value _S_ is the String value derived from _S_ by replacing each occurrence of an ASCII uppercase letter code unit (0x0041 through 0x005A, inclusive) with the corresponding ASCII lowercase letter code unit (0x0061 through 0x007A, inclusive) while preserving all other code units.
</p>
<p>
A String value _A_ is an <em>ASCII-case-insensitive match</em> for String value _B_ if the ASCII-uppercase of _A_ is exactly the same sequence of code units as the ASCII-uppercase of _B_. A sequence of Unicode code points _A_ is an ASCII-case-insensitive match for _B_ if _B_ is an ASCII-case-insensitive match for ! CodePointsToString(_A_).
</p>
</del>
</emu-clause>

<emu-clause id="sup-time-zone-names">
<h1><a href="https://tc39.es/ecma402/#sec-time-zone-names">Time Zone Names</a></h1>

24 changes: 24 additions & 0 deletions spec/mainadditions.html
@@ -11,6 +11,30 @@ <h1>Amendments to the ECMAScript® 2023 Language Specification</h1>
</p>
</emu-note>

<emu-clause id="sec-ecmascript-language-types-string-type">
<h1><a href="https://tc39.es/ecma262/#sec-ecmascript-language-types-string-type">The String Type</a></h1>

<emu-note type="editor">
<p>
This section intends to move the definitions of ASCII-uppercase, ASCII-lowercase, and ASCII-case-insensitive match from ECMA-402 into ECMA-262, after the definition of the ASCII word characters, and generalizes the former two definitions to cover sequences of code points.
</p>
</emu-note>

<p>[...]</p>

<ins class="block">
<p>
The <dfn>ASCII-uppercase</dfn> of a String value _S_ is the String value derived from _S_ by replacing each occurrence of an ASCII lowercase letter code unit (0x0061 through 0x007A, inclusive) with the corresponding ASCII uppercase letter code unit (0x0041 through 0x005A, inclusive) while preserving all other code units. The ASCII-uppercase of a sequence of Unicode code points _A_ is the sequence of code points derived from _A_ by replacing each occurrence of an ASCII lowercase letter code point (U+0061 through U+007A, inclusive) with the corresponding ASCII uppercase letter code point (U+0041 through U+005A, inclusive) while preserving all other code points.
</p>
<p>
The <dfn>ASCII-lowercase</dfn> of a String value _S_ is the String value derived from _S_ by replacing each occurrence of an ASCII uppercase letter code unit (0x0041 through 0x005A, inclusive) with the corresponding ASCII lowercase letter code unit (0x0061 through 0x007A, inclusive) while preserving all other code units. The ASCII-lowercase of a sequence of Unicode code points _A_ is the sequence of code points derived from _A_ by replacing each occurrence of an ASCII uppercase letter code point (U+0041 through U+005A, inclusive) with the corresponding ASCII lowercase letter code point (U+0061 through U+007A, inclusive) while preserving all other code points.
</p>
<p>
A String value _A_ is an <dfn>ASCII-case-insensitive match</dfn> for String value _B_ if the ASCII-uppercase of _A_ is exactly the same sequence of code units as the ASCII-uppercase of _B_. A sequence of Unicode code points _A_ is an ASCII-case-insensitive match for _B_ if _B_ is an ASCII-case-insensitive match for ! CodePointsToString(_A_).
</p>
</ins>
</emu-clause>

<ins class="block">
<emu-clause id="sec-temporal-mergelists" type="abstract operation">
<h1>
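
For readers who want to see the ASCII-case-insensitive match definition in action, here is a minimal sketch in JavaScript. The function names are hypothetical (the spec defines the terms in prose), and the list is copied from the TimeZoneIANALegacyName production in this diff. It compares the ASCII-uppercase of both strings, so no other case folding is applied.

```javascript
// Hypothetical helper names; the spec defines these terms in prose, not as functions.

// ASCII-uppercase of a String value: only code units 0x0061..0x007A (a-z) are changed.
function asciiUppercase(s) {
  let out = "";
  for (let i = 0; i < s.length; i++) {
    const unit = s.charCodeAt(i);
    out += String.fromCharCode(unit >= 0x61 && unit <= 0x7a ? unit - 0x20 : unit);
  }
  return out;
}

// A is an ASCII-case-insensitive match for B if their ASCII-uppercase forms
// are exactly the same sequence of code units.
function isAsciiCaseInsensitiveMatch(a, b) {
  return asciiUppercase(a) === asciiUppercase(b);
}

// The list from the TimeZoneIANALegacyName production shown in this diff.
const legacyNames = [
  "Etc/GMT0", "GMT0", "GMT-0", "GMT+0",
  "EST5EDT", "CST6CDT", "MST7MDT", "PST8PDT",
];

function matchesLegacyList(name) {
  return legacyNames.some((entry) => isAsciiCaseInsensitiveMatch(name, entry));
}

console.log(matchesLegacyList("est5edt")); // true
console.log(isAsciiCaseInsensitiveMatch("\u00DF", "SS")); // false: U+00DF is not mapped to "SS"
console.log(isAsciiCaseInsensitiveMatch("\u0131", "I")); // false: U+0131 is not mapped to "I"
```

The built-in `String.prototype.toUpperCase` would not be a suitable substitute here, since it applies full Unicode case mappings (for example, it maps U+00DF to "SS"), which the note in the ECMA-402 text explicitly rules out.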
