codespell

johnkerl · Aug 20, 2022 · d8be06b
1 parent 7c9d0e2

Showing 4 changed files with 26 additions and 33 deletions.
diff --git a/.github/workflows/codespell.yml b/.github/workflows/codespell.yml
@@ -33,7 +33,4 @@ jobs:
         with:
           check_filenames: true
           ignore_words_file: .codespellignore
-          # ignore_words_list: denom,inout,iput,nd,nin,numer,te,wee
-          # There is a word "RO" in docs/src/shapes-of-data.md.in and docs/src/shapes-of-data.md
-          # which is listed in .codespellignore but which codespell refuses to ignore. Not sure why.
           skip: "*.csv,*.dkvp,*.txt,*.js,*.html,*.map,./tags,./test/cases,./docs/src/shapes-of-data.md.in,./docs/src/shapes-of-data.md"
diff --git a/docs/src/data/colours.csv b/docs/src/data/colours.csv
@@ -1,3 +1,3 @@
-KEY;DE;EN;ES;FI;FR;IT;NL;PL;RO;TR
+KEY;DE;EN;ES;FI;FR;IT;NL;PL;TO;TR
 masterdata_colourcode_1;Weiß;White;Blanco;Valkoinen;Blanc;Bianco;Wit;Biały;Alb;Beyaz
 masterdata_colourcode_2;Schwarz;Black;Negro;Musta;Noir;Nero;Zwart;Czarny;Negru;Siyah
diff --git a/docs/src/shapes-of-data.md b/docs/src/shapes-of-data.md
@@ -36,7 +36,7 @@ Use the `file` command to see if there are CR/LF terminators (in this case, ther
 <b>file data/colours.csv </b>
 </pre>
 <pre class="pre-non-highlight-in-pair">
-data/colours.csv: UTF-8 Unicode text
+data/colours.csv: Unicode text, UTF-8 text
 </pre>
 
 Look at the file to find names of fields:
@@ -45,18 +45,15 @@ Look at the file to find names of fields:
 <b>cat data/colours.csv </b>
 </pre>
 <pre class="pre-non-highlight-in-pair">
-KEY;DE;EN;ES;FI;FR;IT;NL;PL;RO;TR
-masterdata_colourcode_1;Weiß;White;Blanco;Valkoinen;Blanc;Bianco;Witter;Biały;Alb;Beyaz
+KEY;DE;EN;ES;FI;FR;IT;NL;PL;TO;TR
+masterdata_colourcode_1;Weiß;White;Blanco;Valkoinen;Blanc;Bianco;Wit;Biały;Alb;Beyaz
 masterdata_colourcode_2;Schwarz;Black;Negro;Musta;Noir;Nero;Zwart;Czarny;Negru;Siyah
 </pre>
 
 Extract a few fields:
 
-<pre class="pre-highlight-in-pair">
-<b>mlr --csv cut -f KEY,PL,RO data/colours.csv </b>
-</pre>
-<pre class="pre-non-highlight-in-pair">
-(only blank lines appear)
+<pre class="pre-highlight-non-pair">
+<b>mlr --csv cut -f KEY,PL,TO data/colours.csv </b>
 </pre>
 
 Use XTAB output format to get a sharper picture of where records/fields are being split:
@@ -65,12 +62,12 @@ Use XTAB output format to get a sharper picture of where records/fields are bein
 <b>mlr --icsv --oxtab cat data/colours.csv </b>
 </pre>
 <pre class="pre-non-highlight-in-pair">
-KEY;DE;EN;ES;FI;FR;IT;NL;PL;RO;TR masterdata_colourcode_1;Weiß;White;Blanco;Valkoinen;Blanc;Bianco;Witter;Biały;Alb;Beyaz
+KEY;DE;EN;ES;FI;FR;IT;NL;PL;TO;TR masterdata_colourcode_1;Weiß;White;Blanco;Valkoinen;Blanc;Bianco;Wit;Biały;Alb;Beyaz
 
-KEY;DE;EN;ES;FI;FR;IT;NL;PL;RO;TR masterdata_colourcode_2;Schwarz;Black;Negro;Musta;Noir;Nero;Zwart;Czarny;Negru;Siyah
+KEY;DE;EN;ES;FI;FR;IT;NL;PL;TO;TR masterdata_colourcode_2;Schwarz;Black;Negro;Musta;Noir;Nero;Zwart;Czarny;Negru;Siyah
 </pre>
 
-Using XTAB output format makes it clearer that `KEY;DE;...;RO;TR` is being treated as a single field name in the CSV header, and likewise each subsequent line is being treated as a single field value. This is because the default field separator is a comma but we have semicolons here.  Use XTAB again with different field separator (`--fs semicolon`):
+Using XTAB output format makes it clearer that `KEY;DE;...;TR` is being treated as a single field name in the CSV header, and likewise each subsequent line is being treated as a single field value. This is because the default field separator is a comma but we have semicolons here.  Use XTAB again with different field separator (`--fs semicolon`):
 
 <pre class="pre-highlight-in-pair">
 <b>mlr --icsv --ifs semicolon --oxtab cat data/colours.csv </b>
@@ -83,9 +80,9 @@ ES  Blanco
 FI  Valkoinen
 FR  Blanc
 IT  Bianco
-NL  Witter
+NL  Wit
 PL  Biały
-RO  Alb
+TO  Alb
 TR  Beyaz
 
 KEY masterdata_colourcode_2
@@ -97,17 +94,17 @@ FR  Noir
 IT  Nero
 NL  Zwart
 PL  Czarny
-RO  Negru
+TO  Negru
 TR  Siyah
 </pre>
 
 Using the new field-separator, retry the cut:
 
 <pre class="pre-highlight-in-pair">
-<b>mlr --csv --fs semicolon cut -f KEY,PL,RO data/colours.csv </b>
+<b>mlr --csv --fs semicolon cut -f KEY,PL,TO data/colours.csv </b>
 </pre>
 <pre class="pre-non-highlight-in-pair">
-KEY;PL;RO
+KEY;PL;TO
 masterdata_colourcode_1;Biały;Alb
 masterdata_colourcode_2;Czarny;Negru
 </pre>

diff --git a/docs/src/shapes-of-data.md.in b/docs/src/shapes-of-data.md.in
@@ -18,35 +18,34 @@ Use the `file` command to see if there are CR/LF terminators (in this case, ther
 
 GENMD-CARDIFY-HIGHLIGHT-ONE
 file data/colours.csv 
-data/colours.csv: UTF-8 Unicode text
+data/colours.csv: Unicode text, UTF-8 text
 GENMD-EOF
 
 Look at the file to find names of fields:
 
 GENMD-CARDIFY-HIGHLIGHT-ONE
 cat data/colours.csv 
-KEY;DE;EN;ES;FI;FR;IT;NL;PL;RO;TR
-masterdata_colourcode_1;Weiß;White;Blanco;Valkoinen;Blanc;Bianco;Witter;Biały;Alb;Beyaz
+KEY;DE;EN;ES;FI;FR;IT;NL;PL;TO;TR
+masterdata_colourcode_1;Weiß;White;Blanco;Valkoinen;Blanc;Bianco;Wit;Biały;Alb;Beyaz
 masterdata_colourcode_2;Schwarz;Black;Negro;Musta;Noir;Nero;Zwart;Czarny;Negru;Siyah
 GENMD-EOF
 
 Extract a few fields:
 
 GENMD-CARDIFY-HIGHLIGHT-ONE
-mlr --csv cut -f KEY,PL,RO data/colours.csv 
-(only blank lines appear)
+mlr --csv cut -f KEY,PL,TO data/colours.csv 
 GENMD-EOF
 
 Use XTAB output format to get a sharper picture of where records/fields are being split:
 
 GENMD-CARDIFY-HIGHLIGHT-ONE
 mlr --icsv --oxtab cat data/colours.csv 
-KEY;DE;EN;ES;FI;FR;IT;NL;PL;RO;TR masterdata_colourcode_1;Weiß;White;Blanco;Valkoinen;Blanc;Bianco;Witter;Biały;Alb;Beyaz
+KEY;DE;EN;ES;FI;FR;IT;NL;PL;TO;TR masterdata_colourcode_1;Weiß;White;Blanco;Valkoinen;Blanc;Bianco;Wit;Biały;Alb;Beyaz
 
-KEY;DE;EN;ES;FI;FR;IT;NL;PL;RO;TR masterdata_colourcode_2;Schwarz;Black;Negro;Musta;Noir;Nero;Zwart;Czarny;Negru;Siyah
+KEY;DE;EN;ES;FI;FR;IT;NL;PL;TO;TR masterdata_colourcode_2;Schwarz;Black;Negro;Musta;Noir;Nero;Zwart;Czarny;Negru;Siyah
 GENMD-EOF
 
-Using XTAB output format makes it clearer that `KEY;DE;...;RO;TR` is being treated as a single field name in the CSV header, and likewise each subsequent line is being treated as a single field value. This is because the default field separator is a comma but we have semicolons here.  Use XTAB again with different field separator (`--fs semicolon`):
+Using XTAB output format makes it clearer that `KEY;DE;...;TR` is being treated as a single field name in the CSV header, and likewise each subsequent line is being treated as a single field value. This is because the default field separator is a comma but we have semicolons here.  Use XTAB again with different field separator (`--fs semicolon`):
 
 GENMD-CARDIFY-HIGHLIGHT-ONE
 mlr --icsv --ifs semicolon --oxtab cat data/colours.csv 
@@ -57,9 +56,9 @@ ES  Blanco
 FI  Valkoinen
 FR  Blanc
 IT  Bianco
-NL  Witter
+NL  Wit
 PL  Biały
-RO  Alb
+TO  Alb
 TR  Beyaz
 
 KEY masterdata_colourcode_2
@@ -71,15 +70,15 @@ FR  Noir
 IT  Nero
 NL  Zwart
 PL  Czarny
-RO  Negru
+TO  Negru
 TR  Siyah
 GENMD-EOF
 
 Using the new field-separator, retry the cut:
 
 GENMD-CARDIFY-HIGHLIGHT-ONE
-mlr --csv --fs semicolon cut -f KEY,PL,RO data/colours.csv 
-KEY;PL;RO
+mlr --csv --fs semicolon cut -f KEY,PL,TO data/colours.csv 
+KEY;PL;TO
 masterdata_colourcode_1;Biały;Alb
 masterdata_colourcode_2;Czarny;Negru
 GENMD-EOF
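
The docs touched by this diff explain that with the default comma separator the whole header line `KEY;DE;...;TR` is read as a single field name, so cutting `KEY,PL,TO` only works once the separator is set to a semicolon. As a rough illustration of that point, here is a minimal sketch using Python's standard `csv` module rather than Miller itself. The helper names `read_records` and `cut` are hypothetical, and Miller's actual handling of missing fields may differ from what this sketch does.

```python
import csv
import io

# Sample copied from the data/colours.csv content shown in the diff above.
SAMPLE = (
    "KEY;DE;EN;ES;FI;FR;IT;NL;PL;TO;TR\n"
    "masterdata_colourcode_1;Weiß;White;Blanco;Valkoinen;Blanc;Bianco;Wit;Biały;Alb;Beyaz\n"
    "masterdata_colourcode_2;Schwarz;Black;Negro;Musta;Noir;Nero;Zwart;Czarny;Negru;Siyah\n"
)


def read_records(text, delimiter):
    """Parse CSV text into a list of dicts keyed by the header line (hypothetical helper)."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


def cut(records, fields):
    """Keep only the named fields that exist in each record (hypothetical helper)."""
    return [{f: r[f] for f in fields if f in r} for r in records]


comma_records = read_records(SAMPLE, ",")
semi_records = read_records(SAMPLE, ";")

# With a comma delimiter, each record has exactly one field,
# and its name is the entire semicolon-joined header line.
print(len(comma_records[0]))
print(cut(comma_records, ["KEY", "PL", "TO"]))

# With a semicolon delimiter, the fields split as intended and the cut finds them.
print(cut(semi_records, ["KEY", "PL", "TO"]))
```

In this sketch the comma-delimited parse yields records with no `KEY`, `PL`, or `TO` fields at all, which is the same shape of problem the docs diagnose with XTAB output before switching to `--fs semicolon`.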