Editorial: Move ASCII-case definitions into ECMA-262

This moves the definitions of ASCII-uppercase, ASCII-lowercase, and ASCII-case-insensitive match into ECMA-262 and deletes them from ECMA-402. It also generalizes the definitions of ASCII-uppercase and ASCII-lowercase to cover sequences of code points as well as strings. This allows a simplification in the TimeZoneIANALegacyName production. (Note: the otherwise unnecessary parentheses around |TimeZoneIANANameTail| are there because grammarkdown seems to choke on the version without them.)
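For readers who want a concrete picture of the three definitions, here is a minimal sketch in TypeScript. It is not spec text: the function names (`asciiLowercase`, `asciiUppercase`, `asciiLowercaseCodePoints`, `isAsciiCaseInsensitiveMatch`) are hypothetical, and representing a sequence of code points as a `number[]` is an assumption made for illustration.

```typescript
// Illustrative sketch only; all names here are hypothetical, not part of the spec.

// ASCII-lowercase of a String value: works on UTF-16 code units and
// touches only 0x0041..0x005A (A..Z); every other code unit is preserved.
function asciiLowercase(s: string): string {
  let out = "";
  for (let i = 0; i < s.length; i++) {
    const unit = s.charCodeAt(i);
    out += String.fromCharCode(unit >= 0x41 && unit <= 0x5a ? unit + 0x20 : unit);
  }
  return out;
}

// ASCII-uppercase of a String value: touches only 0x0061..0x007A (a..z).
function asciiUppercase(s: string): string {
  let out = "";
  for (let i = 0; i < s.length; i++) {
    const unit = s.charCodeAt(i);
    out += String.fromCharCode(unit >= 0x61 && unit <= 0x7a ? unit - 0x20 : unit);
  }
  return out;
}

// The generalized form: ASCII-lowercase of a sequence of code points
// (modeled here as an array of numbers, which is an assumption).
function asciiLowercaseCodePoints(codePoints: readonly number[]): number[] {
  return codePoints.map((cp) => (cp >= 0x41 && cp <= 0x5a ? cp + 0x20 : cp));
}

// ASCII-case-insensitive match for two String values: compare their
// ASCII-uppercase forms code unit by code unit.
function isAsciiCaseInsensitiveMatch(a: string, b: string): boolean {
  return asciiUppercase(a) === asciiUppercase(b);
}
```

With the code point form available, a grammar condition can lowercase the matched code points directly instead of round-tripping through CodePointsToString and StringToCodePoints, which is the simplification made in TimeZoneIANALegacyName.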
tc39 · Oct 20, 2022 · 0af69d3
1 parent 7f91717
commit 0af69d3

Showing 3 changed files with 51 additions and 1 deletion.
diff --git a/spec/abstractops.html b/spec/abstractops.html
@@ -1009,7 +1009,7 @@ <h1>ISO 8601 grammar</h1>
           TimeZoneIANANameComponent[?Legacy] `/` TimeZoneIANANameTail[?Legacy]
 
       TimeZoneIANALegacyName :
-          TimeZoneIANANameTail[+Legacy] [&gt; but only if `etc/gmt` |ASCIISign| |UnpaddedHour| matches StringToCodePoints(the ASCII-lowercase of CodePointsToString(the sequence of code points matched by |TimeZoneIANANameTail|))]
+          TimeZoneIANANameTail[+Legacy] [&gt; but only if `etc/gmt` |ASCIISign| |UnpaddedHour| matches the ASCII-lowercase of the sequence of code points matched by (|TimeZoneIANANameTail|)]
           TimeZoneIANANameTail[+Legacy] [&gt; but only if the sequence of code points matched by |TimeZoneIANANameTail| is an ASCII-case-insensitive match for an element of « *"Etc/GMT0"*, *"GMT0"*, *"GMT-0"*, *"GMT+0"*, *"EST5EDT"*, *"CST6CDT"*, *"MST7MDT"*, *"PST8PDT"* »]
 
       TimeZoneIANAName :

diff --git a/spec/intl.html b/spec/intl.html
@@ -14,6 +14,32 @@ <h1>Amendments to the ECMAScript® 2023 Internationalization API Specification</
     </p>
   </emu-note>
 
+  <emu-clause id="sup-case-sensitivity-and-case-mapping">
+    <h1><a href="https://tc39.es/ecma402/#sec-case-sensitivity-and-case-mapping">Case Sensitivity and Case Mapping</a></h1>
+
+    <emu-note type="editor">
+      <p>These definitions are moved into ECMA-262.</p>
+    </emu-note>
+
+    <p>
+      The String values used to identify locales, currencies, scripts, and time zones are interpreted in an ASCII-case-insensitive manner, treating the code units 0x0041 through 0x005A (corresponding to Unicode characters LATIN CAPITAL LETTER A through LATIN CAPITAL LETTER Z) as equivalent to the corresponding code units 0x0061 through 0x007A (corresponding to Unicode characters LATIN SMALL LETTER A through LATIN SMALL LETTER Z), both inclusive. No other case folding equivalences are applied.
+    </p>
+    <emu-note>
+      For example, *"ß"* (U+00DF) must not match or be mapped to *"SS"* (U+0053, U+0053). *"ı"* (U+0131) must not match or be mapped to *"I"* (U+0049).
+    </emu-note>
+    <del class="block">
+      <p>
+        The <em>ASCII-uppercase</em> of a String value _S_ is the String value derived from _S_ by replacing each occurrence of an ASCII lowercase letter code unit (0x0061 through 0x007A, inclusive) with the corresponding ASCII uppercase letter code unit (0x0041 through 0x005A, inclusive) while preserving all other code units.
+      </p>
+      <p>
+        The <em>ASCII-lowercase</em> of a String value _S_ is the String value derived from _S_ by replacing each occurrence of an ASCII uppercase letter code unit (0x0041 through 0x005A, inclusive) with the corresponding ASCII lowercase letter code unit (0x0061 through 0x007A, inclusive) while preserving all other code units.
+      </p>
+      <p>
+        A String value _A_ is an <em>ASCII-case-insensitive match</em> for String value _B_ if the ASCII-uppercase of _A_ is exactly the same sequence of code units as the ASCII-uppercase of _B_. A sequence of Unicode code points _A_ is an ASCII-case-insensitive match for _B_ if _B_ is an ASCII-case-insensitive match for ! CodePointsToString(_A_).
+      </p>
+    </del>
+  </emu-clause>
+
   <emu-clause id="sup-time-zone-names">
     <h1><a href="https://tc39.es/ecma402/#sec-time-zone-names">Time Zone Names</a></h1>
 

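The note in the `spec/intl.html` hunk above (no mapping of *"ß"* to *"SS"* or of *"ı"* to *"I"*) can be checked against the built-in Unicode-aware case mapping. The following is a rough sketch, assuming hypothetical helper names `asciiUppercase` and `isAsciiCaseInsensitiveMatch` that follow the definitions in this commit; only `String.prototype.toUpperCase` is a real built-in.

```typescript
// Illustrative sketch only; helper names are hypothetical.

// ASCII-uppercase: only code units 0x0061..0x007A (a..z) are changed.
function asciiUppercase(s: string): string {
  let out = "";
  for (let i = 0; i < s.length; i++) {
    const unit = s.charCodeAt(i);
    out += String.fromCharCode(unit >= 0x61 && unit <= 0x7a ? unit - 0x20 : unit);
  }
  return out;
}

// ASCII-case-insensitive match: equal ASCII-uppercase forms.
function isAsciiCaseInsensitiveMatch(a: string, b: string): boolean {
  return asciiUppercase(a) === asciiUppercase(b);
}

// The built-in mapping applies full Unicode case conversion ...
const unicodeSharpS = "\u00DF".toUpperCase();   // "SS"
const unicodeDotlessI = "\u0131".toUpperCase(); // "I"

// ... while the ASCII-only definitions leave both characters untouched,
// so neither pair counts as a match.
const sharpSMatches = isAsciiCaseInsensitiveMatch("\u00DF", "SS");  // false
const dotlessIMatches = isAsciiCaseInsensitiveMatch("\u0131", "I"); // false
```

This is why identifiers such as time zone names cannot simply be compared through `toUpperCase` or `toLowerCase`: those would introduce equivalences the definitions explicitly exclude.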
diff --git a/spec/mainadditions.html b/spec/mainadditions.html
@@ -11,6 +11,30 @@ <h1>Amendments to the ECMAScript® 2023 Language Specification</h1>
     </p>
   </emu-note>
 
+  <emu-clause id="sec-ecmascript-language-types-string-type">
+    <h1><a href="https://tc39.es/ecma262/#sec-ecmascript-language-types-string-type">The String Type</a></h1>
+
+    <emu-note type="editor">
+      <p>
+        This section intends to move the definitions of ASCII-uppercase, ASCII-lowercase, and ASCII-case-insensitive match from ECMA-402 into ECMA-262, after the definition of the ASCII word characters, and generalizes the former two definitions to cover sequences of code points.
+      </p>
+    </emu-note>
+
+    <p>[...]</p>
+
+    <ins class="block">
+      <p>
+        The <dfn>ASCII-uppercase</dfn> of a String value _S_ is the String value derived from _S_ by replacing each occurrence of an ASCII lowercase letter code unit (0x0061 through 0x007A, inclusive) with the corresponding ASCII uppercase letter code unit (0x0041 through 0x005A, inclusive) while preserving all other code units. The ASCII-uppercase of a sequence of Unicode code points _A_ is the sequence of code points derived from _A_ by replacing each occurrence of an ASCII lowercase letter code point (U+0061 through U+007A, inclusive) with the corresponding ASCII uppercase letter code point (U+0041 through U+005A, inclusive) while preserving all other code points.
+      </p>
+      <p>
+        The <dfn>ASCII-lowercase</dfn> of a String value _S_ is the String value derived from _S_ by replacing each occurrence of an ASCII uppercase letter code unit (0x0041 through 0x005A, inclusive) with the corresponding ASCII lowercase letter code unit (0x0061 through 0x007A, inclusive) while preserving all other code units. The ASCII-lowercase of a sequence of Unicode code points _A_ is the sequence of code points derived from _A_ by replacing each occurrence of an ASCII uppercase letter code point (U+0041 through U+005A, inclusive) with the corresponding ASCII lowercase letter code point (U+0061 through U+007A, inclusive) while preserving all other code points.
+      </p>
+      <p>
+        A String value _A_ is an <dfn>ASCII-case-insensitive match</dfn> for String value _B_ if the ASCII-uppercase of _A_ is exactly the same sequence of code units as the ASCII-uppercase of _B_. A sequence of Unicode code points _A_ is an ASCII-case-insensitive match for _B_ if _B_ is an ASCII-case-insensitive match for ! CodePointsToString(_A_).
+      </p>
+    </ins>
+  </emu-clause>
+
   <ins class="block">
     <emu-clause id="sec-temporal-mergelists" type="abstract operation">
       <h1>