6.6.0

johnkerl · Jan 1, 2023 · 9951f36
1 parent 31fdc1c
commit 9951f36

Showing 12 changed files with 104 additions and 66 deletions.
diff --git a/docs/src/data-diving-examples.md b/docs/src/data-diving-examples.md
@@ -271,19 +271,19 @@ The histogram shows the different distribution of 0/1 flags:
 <b>mlr --opprint histogram -f flag,u,v --lo -0.1 --hi 1.1 --nbins 12 data/colored-shapes.dkvp</b>
 </pre>
 <pre class="pre-non-highlight-in-pair">
-bin_lo                bin_hi              flag_count u_count v_count
--0.010000000000000002 0.09000000000000002 6058       0       36
-0.09000000000000002   0.19000000000000003 0          1062    988
-0.19000000000000003   0.29000000000000004 0          985     1003
-0.29000000000000004   0.39000000000000007 0          1024    1014
-0.39000000000000007   0.4900000000000001  0          1002    991
-0.4900000000000001    0.5900000000000002  0          989     1041
-0.5900000000000002    0.6900000000000002  0          1001    1016
-0.6900000000000002    0.7900000000000001  0          972     962
-0.7900000000000001    0.8900000000000002  0          1035    1070
-0.8900000000000002    0.9900000000000002  0          995     993
-0.9900000000000002    1.0900000000000003  4020       1013    939
-1.0900000000000003    1.1900000000000002  0          0       25
+bin_lo                              bin_hi                              flag_count u_count v_count
+-0.1                                0.000000000000000013877787807814457 6058       0       36
+0.000000000000000013877787807814457 0.10000000000000003                 0          1062    988
+0.10000000000000003                 0.20000000000000004                 0          985     1003
+0.20000000000000004                 0.30000000000000004                 0          1024    1014
+0.30000000000000004                 0.40000000000000013                 0          1002    991
+0.40000000000000013                 0.5000000000000001                  0          989     1041
+0.5000000000000001                  0.6000000000000002                  0          1001    1016
+0.6000000000000002                  0.7000000000000002                  0          972     962
+0.7000000000000002                  0.8000000000000002                  0          1035    1070
+0.8000000000000002                  0.9000000000000002                  0          995     993
+0.9000000000000002                  1                                   4020       1013    939
+1                                   1.1                                 0          0       25
 </pre>
 
 Look at univariate stats by color and shape. In particular, color-dependent flag probabilities pop out, aligning with their original Bernoulli probabilities from the data-generator script:

diff --git a/docs/src/manpage.md b/docs/src/manpage.md
@@ -50,7 +50,7 @@ MILLER(1)                                                            MILLER(1)
        insertion-ordered hash map.  This encompasses a variety of data
        formats, including but not limited to the familiar CSV, TSV, and JSON.
        (Miller can handle positionally-indexed data as a special case.) This
-       manpage documents mlr 6.5.0-dev.
+       manpage documents mlr 6.6.0.
 
 1mEXAMPLES0m
        mlr --icsv --opprint cat example.csv
@@ -197,7 +197,7 @@ MILLER(1)                                                            MILLER(1)
        most-frequent nest nothing put regularize remove-empty-columns rename reorder
        repeat reshape sample sec2gmtdate sec2gmt seqgen shuffle skip-trivial-records
        sort sort-within-records split stats1 stats2 step summary tac tail tee
-       template top utf8-to-latin1 unflatten uniq unsparsify
+       template top utf8-to-latin1 unflatten uniq unspace unsparsify
 
 1mFUNCTION LIST0m
        abs acos acosh any append apply arrayify asin asinh asserting_absent
@@ -2080,6 +2080,15 @@ MILLER(1)                                                            MILLER(1)
                      With -n, produces only one record which is the unique-record count.
                      With neither -c nor -n, produces unique records.
 
+   1munspace0m
+       Usage: mlr unspace [options]
+       Replaces spaces in record keys and/or values with _. This is helpful for PPRINT output.
+       Options:
+       -f {x}    Replace spaces with specified filler character.
+       -k        Unspace only keys, not keys and values.
+       -v        Unspace only values, not keys and values.
+       -h|--help Show this message.
+
    1munsparsify0m
        Usage: mlr unsparsify [options]
        Prints records with the union of field names over all input records.
@@ -3135,7 +3144,7 @@ MILLER(1)                                                            MILLER(1)
        int: declares an integer local variable in the current curly-braced scope.
        Type-checking happens at assignment: 'int x = 0.0' is an error.
 
-   map
+   1mmap0m
        map: declares a map-valued local variable in the current curly-braced scope.
        Type-checking happens at assignment: 'map b = 0' is an error. map b = {} is
        always OK. map b = a is OK or not depending on whether a is a map.
@@ -3288,5 +3297,5 @@ MILLER(1)                                                            MILLER(1)
 
 
 
-                                  2022-12-05                         MILLER(1)
+                                  2023-01-01                         MILLER(1)
 </pre>
diff --git a/docs/src/manpage.txt b/docs/src/manpage.txt
@@ -29,7 +29,7 @@ MILLER(1)                                                            MILLER(1)
        insertion-ordered hash map.  This encompasses a variety of data
        formats, including but not limited to the familiar CSV, TSV, and JSON.
        (Miller can handle positionally-indexed data as a special case.) This
-       manpage documents mlr 6.5.0-dev.
+       manpage documents mlr 6.6.0.
 
 1mEXAMPLES0m
        mlr --icsv --opprint cat example.csv
@@ -176,7 +176,7 @@ MILLER(1)                                                            MILLER(1)
        most-frequent nest nothing put regularize remove-empty-columns rename reorder
        repeat reshape sample sec2gmtdate sec2gmt seqgen shuffle skip-trivial-records
        sort sort-within-records split stats1 stats2 step summary tac tail tee
-       template top utf8-to-latin1 unflatten uniq unsparsify
+       template top utf8-to-latin1 unflatten uniq unspace unsparsify
 
 1mFUNCTION LIST0m
        abs acos acosh any append apply arrayify asin asinh asserting_absent
@@ -2059,6 +2059,15 @@ MILLER(1)                                                            MILLER(1)
                      With -n, produces only one record which is the unique-record count.
                      With neither -c nor -n, produces unique records.
 
+   1munspace0m
+       Usage: mlr unspace [options]
+       Replaces spaces in record keys and/or values with _. This is helpful for PPRINT output.
+       Options:
+       -f {x}    Replace spaces with specified filler character.
+       -k        Unspace only keys, not keys and values.
+       -v        Unspace only values, not keys and values.
+       -h|--help Show this message.
+
    1munsparsify0m
        Usage: mlr unsparsify [options]
        Prints records with the union of field names over all input records.
@@ -3114,7 +3123,7 @@ MILLER(1)                                                            MILLER(1)
        int: declares an integer local variable in the current curly-braced scope.
        Type-checking happens at assignment: 'int x = 0.0' is an error.
 
-   map
+   1mmap0m
        map: declares a map-valued local variable in the current curly-braced scope.
        Type-checking happens at assignment: 'map b = 0' is an error. map b = {} is
        always OK. map b = a is OK or not depending on whether a is a map.
@@ -3267,4 +3276,4 @@ MILLER(1)                                                            MILLER(1)
 
 
 
-                                  2022-12-05                         MILLER(1)
+                                  2023-01-01                         MILLER(1)
diff --git a/docs/src/operating-on-all-fields.md b/docs/src/operating-on-all-fields.md
@@ -24,10 +24,9 @@ Suppose you want to replace spaces with underscores in your column names:
 <b>cat data/spaces.csv</b>
 </pre>
 <pre class="pre-non-highlight-in-pair">
-a b c,def,g h i
-123,4567,890
-2468,1357,3579
-9987,3312,4543
+column 1,column 2,column 3
+apple,ball,cat
+dale egg,fish,gale
 </pre>
 
 The simplest way is to use `mlr rename` with `-g` (for global replace, not just first occurrence of space within each field) and `-r` for pattern-matching (rather than explicit single-column renames):
@@ -36,20 +35,18 @@ The simplest way is to use `mlr rename` with `-g` (for global replace, not just
 <b>mlr --csv rename -g -r ' ,_'  data/spaces.csv</b>
 </pre>
 <pre class="pre-non-highlight-in-pair">
-a_b_c,def,g_h_i
-123,4567,890
-2468,1357,3579
-9987,3312,4543
+column_1,column_2,column_3
+apple,ball,cat
+dale egg,fish,gale
 </pre>
 
 <pre class="pre-highlight-in-pair">
 <b>mlr --csv --opprint rename -g -r ' ,_'  data/spaces.csv</b>
 </pre>
 <pre class="pre-non-highlight-in-pair">
-a_b_c def  g_h_i
-123   4567 890
-2468  1357 3579
-9987  3312 4543
+column_1 column_2 column_3
+apple    ball     cat
+dale egg fish     gale
 </pre>
 
 You can also do this with a for-loop:
@@ -69,10 +66,9 @@ $* = newrec
 <b>mlr --icsv --opprint put -f data/bulk-rename-for-loop.mlr data/spaces.csv</b>
 </pre>
 <pre class="pre-non-highlight-in-pair">
-a_b_c def  g_h_i
-123   4567 890
-2468  1357 3579
-9987  3312 4543
+column_1 column_2 column_3
+apple    ball     cat
+dale egg fish     gale
 </pre>
 
 ## Bulk rename of fields with carriage returns

diff --git a/docs/src/reference-verbs.md b/docs/src/reference-verbs.md
@@ -4099,7 +4099,7 @@ The primary use-case is for PPRINT output, which is space-delimited. For example
 <b>cat data/spaces.csv</b>
 </pre>
 <pre class="pre-non-highlight-in-pair">
-column 1, column 2, column 3
+column 1,column 2,column 3
 apple,ball,cat
 dale egg,fish,gale
 </pre>
@@ -4108,40 +4108,40 @@ dale egg,fish,gale
 <b>mlr --icsv --opprint cat data/spaces.csv</b>
 </pre>
 <pre class="pre-non-highlight-in-pair">
-column 1  column 2  column 3
-apple    ball      cat
-dale egg fish      gale
+column 1 column 2 column 3
+apple    ball     cat
+dale egg fish     gale
 </pre>
 
 <pre class="pre-highlight-in-pair">
 <b>mlr --icsv --opprint cat data/spaces.csv</b>
 </pre>
 <pre class="pre-non-highlight-in-pair">
-column 1  column 2  column 3
-apple    ball      cat
-dale egg fish      gale
+column 1 column 2 column 3
+apple    ball     cat
+dale egg fish     gale
 </pre>
 
 <pre class="pre-highlight-in-pair">
 <b>mlr --icsv --opprint unspace data/spaces.csv</b>
 </pre>
 <pre class="pre-non-highlight-in-pair">
-column_1 _column_2 _column_3
-apple    ball      cat
-dale_egg fish      gale
+column_1 column_2 column_3
+apple    ball     cat
+dale_egg fish     gale
 </pre>
 
 <pre class="pre-highlight-in-pair">
 <b>mlr --icsv --opprint unspace data/spaces.csv | mlr --ipprint --oxtab cat</b>
 </pre>
 <pre class="pre-non-highlight-in-pair">
-column_1  apple
-_column_2 ball
-_column_3 cat
+column_1 apple
+column_2 ball
+column_3 cat
 
-column_1  dale_egg
-_column_2 fish
-_column_3 gale
+column_1 dale_egg
+column_2 fish
+column_3 gale
 </pre>
 
 ## unsparsify

diff --git a/docs/src/spaces.csv b/docs/src/spaces.csv
@@ -3,4 +3,3 @@ Zone,Total MWh
 17,39.8
 24,7.4
 30,50.5
-
diff --git a/internal/pkg/go-csv/csv_reader.go b/internal/pkg/go-csv/csv_reader.go
@@ -473,4 +473,3 @@ parseField:
 	}
 	return dst, err
 }
-
diff --git a/internal/pkg/go-csv/csv_writer.go b/internal/pkg/go-csv/csv_writer.go
@@ -179,4 +179,3 @@ func (w *Writer) fieldNeedsQuotes(field string) bool {
 	r1, _ := utf8.DecodeRuneInString(field)
 	return unicode.IsSpace(r1)
 }
-
diff --git a/internal/pkg/version/version.go b/internal/pkg/version/version.go
@@ -4,4 +4,4 @@ package version
 // Nominally things like "6.0.0" for a release, then "6.0.0-dev" in between.
 // This makes it clear that a given build is on the main dev branch, not a
 // particular snapshot tag.
-var STRING string = "6.5.0-dev"
+var STRING string = "6.6.0"
diff --git a/man/manpage.txt b/man/manpage.txt
@@ -29,7 +29,7 @@ MILLER(1)                                                            MILLER(1)
        insertion-ordered hash map.  This encompasses a variety of data
        formats, including but not limited to the familiar CSV, TSV, and JSON.
        (Miller can handle positionally-indexed data as a special case.) This
-       manpage documents mlr 6.5.0-dev.
+       manpage documents mlr 6.6.0.
 
 1mEXAMPLES0m
        mlr --icsv --opprint cat example.csv
@@ -176,7 +176,7 @@ MILLER(1)                                                            MILLER(1)
        most-frequent nest nothing put regularize remove-empty-columns rename reorder
        repeat reshape sample sec2gmtdate sec2gmt seqgen shuffle skip-trivial-records
        sort sort-within-records split stats1 stats2 step summary tac tail tee
-       template top utf8-to-latin1 unflatten uniq unsparsify
+       template top utf8-to-latin1 unflatten uniq unspace unsparsify
 
 1mFUNCTION LIST0m
        abs acos acosh any append apply arrayify asin asinh asserting_absent
@@ -2059,6 +2059,15 @@ MILLER(1)                                                            MILLER(1)
                      With -n, produces only one record which is the unique-record count.
                      With neither -c nor -n, produces unique records.
 
+   1munspace0m
+       Usage: mlr unspace [options]
+       Replaces spaces in record keys and/or values with _. This is helpful for PPRINT output.
+       Options:
+       -f {x}    Replace spaces with specified filler character.
+       -k        Unspace only keys, not keys and values.
+       -v        Unspace only values, not keys and values.
+       -h|--help Show this message.
+
    1munsparsify0m
        Usage: mlr unsparsify [options]
        Prints records with the union of field names over all input records.
@@ -3114,7 +3123,7 @@ MILLER(1)                                                            MILLER(1)
        int: declares an integer local variable in the current curly-braced scope.
        Type-checking happens at assignment: 'int x = 0.0' is an error.
 
-   map
+   1mmap0m
        map: declares a map-valued local variable in the current curly-braced scope.
        Type-checking happens at assignment: 'map b = 0' is an error. map b = {} is
        always OK. map b = a is OK or not depending on whether a is a map.
@@ -3267,4 +3276,4 @@ MILLER(1)                                                            MILLER(1)
 
 
 
-                                  2022-12-05                         MILLER(1)
+                                  2023-01-01                         MILLER(1)
diff --git a/man/mlr.1 b/man/mlr.1
@@ -2,12 +2,12 @@
 .\"     Title: mlr
 .\"    Author: [see the "AUTHOR" section]
 .\" Generator: ./mkman.rb
-.\"      Date: 2022-12-05
+.\"      Date: 2023-01-01
 .\"    Manual: \ \&
 .\"    Source: \ \&
 .\"  Language: English
 .\"
-.TH "MILLER" "1" "2022-12-05" "\ \&" "\ \&"
+.TH "MILLER" "1" "2023-01-01" "\ \&" "\ \&"
 .\" -----------------------------------------------------------------
 .\" * Portability definitions
 .\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -47,7 +47,7 @@ on integer-indexed fields: if the natural data structure for the latter is the
 array, then Miller's natural data structure is the insertion-ordered hash map.
 This encompasses a variety of data formats, including but not limited to the
 familiar CSV, TSV, and JSON.  (Miller can handle positionally-indexed data as
-a special case.) This manpage documents mlr 6.5.0-dev.
+a special case.) This manpage documents mlr 6.6.0.
 .SH "EXAMPLES"
 .sp
 
@@ -217,7 +217,7 @@ json-stringify join label latin1-to-utf8 least-frequent merge-fields
 most-frequent nest nothing put regularize remove-empty-columns rename reorder
 repeat reshape sample sec2gmtdate sec2gmt seqgen shuffle skip-trivial-records
 sort sort-within-records split stats1 stats2 step summary tac tail tee
-template top utf8-to-latin1 unflatten uniq unsparsify
+template top utf8-to-latin1 unflatten uniq unspace unsparsify
 .fi
 .if n \{\
 .RE
@@ -2604,6 +2604,21 @@ Options:
 .fi
 .if n \{\
 .RE
+.SS "unspace"
+.if n \{\
+.RS 0
+.\}
+.nf
+Usage: mlr unspace [options]
+Replaces spaces in record keys and/or values with _. This is helpful for PPRINT output.
+Options:
+-f {x}    Replace spaces with specified filler character.
+-k        Unspace only keys, not keys and values.
+-v        Unspace only values, not keys and values.
+-h|--help Show this message.
+.fi
+.if n \{\
+.RE
 .SS "unsparsify"
 .if n \{\
 .RS 0

diff --git a/miller.spec b/miller.spec
@@ -1,6 +1,6 @@
 Summary: Name-indexed data processing tool
 Name: miller
-Version: 6.5.0
+Version: 6.6.0
 Release: 1%{?dist}
 License: BSD
 Source: https://github.com/johnkerl/miller/releases/download/%{version}/miller-%{version}.tar.gz
@@ -36,6 +36,9 @@ make install
 %{_mandir}/man1/mlr.1*
 
 %changelog
+* Sun Jan 1 2023 John Kerl <[email protected]> - 6.6.0-1
+- 6.6.0 release
+
 * Sun Nov 27 2022 John Kerl <[email protected]> - 6.5.0-1
 - 6.5.0 release
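
For readers who want a feel for what the `unspace` verb documented in this commit does, here is a minimal Go sketch of the behavior described in its usage text: spaces in keys and/or values are replaced with a filler string. It is an illustration only, not Miller's implementation. The `field` type, the `unspace` function, and its `doKeys`/`doValues` parameters are hypothetical names chosen for this example; they loosely mirror the `-f`, `-k`, and `-v` options.

```go
package main

import (
	"fmt"
	"strings"
)

// field is one key-value pair. A slice of fields stands in for an
// insertion-ordered record (hypothetical type, not Miller's own).
type field struct {
	key   string
	value string
}

// unspace returns a copy of rec with every space replaced by filler,
// in keys when doKeys is set and in values when doValues is set.
// This is an assumed simplification of the verb's documented behavior.
func unspace(rec []field, filler string, doKeys, doValues bool) []field {
	out := make([]field, 0, len(rec))
	for _, f := range rec { // f is a copy, so rec itself is not modified
		if doKeys {
			f.key = strings.ReplaceAll(f.key, " ", filler)
		}
		if doValues {
			f.value = strings.ReplaceAll(f.value, " ", filler)
		}
		out = append(out, f)
	}
	return out
}

func main() {
	// Second data row of data/spaces.csv as shown in the diff above.
	rec := []field{
		{"column 1", "dale egg"},
		{"column 2", "fish"},
		{"column 3", "gale"},
	}
	for _, f := range unspace(rec, "_", true, true) {
		fmt.Printf("%s=%s\n", f.key, f.value)
	}
}
```

Passing `true, false` for the last two arguments corresponds to the keys-only behavior of `-k`, and `false, true` to the values-only behavior of `-v`.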