Miller 6.5.0 (#1134)
johnkerl committed Nov 27, 2022
1 parent b6846fc commit 63cf240
Showing 63 changed files with 1,110 additions and 3,149 deletions.
124 changes: 1 addition & 123 deletions docs/src/10min.md
@@ -39,9 +39,6 @@ purple,triangle,false,7,65,80.1405,5.8240
yellow,circle,true,8,73,63.9785,4.2370
yellow,circle,true,9,87,63.5058,8.3350
purple,square,false,10,91,72.3735,8.2430
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

But `mlr cat` can also do format conversion -- for example, you can pretty-print in tabular format:
@@ -61,9 +58,6 @@ purple triangle false 7 65 80.1405 5.8240
yellow circle true 8 73 63.9785 4.2370
yellow circle true 9 87 63.5058 8.3350
purple square false 10 91 72.3735 8.2430
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

`mlr head` and `mlr tail` count records rather than lines. Whether you're getting the first few records or the last few, the CSV header is included either way:
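
The record-versus-line distinction can be sketched in Python. This is a minimal illustration of the behaviour described above, not Miller's implementation; the helper names (`read_records`, `to_csv`, `head`, `tail`) and the three-record sample are hypothetical.

```python
import csv
import io

# Hypothetical sample: one header line followed by three records.
CSV_TEXT = """color,shape,k
yellow,triangle,1
red,square,2
red,circle,3
"""


def read_records(text):
    """Parse CSV text into (field names, list of record dicts)."""
    reader = csv.DictReader(io.StringIO(text))
    fields = reader.fieldnames  # reading this consumes the header line
    return fields, list(reader)


def to_csv(fields, records):
    """Serialize records back to CSV text, always writing the header."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return out.getvalue()


def head(text, n):
    """Keep the first n records; the header is not counted."""
    fields, records = read_records(text)
    return to_csv(fields, records[:n])


def tail(text, n):
    """Keep the last n records; the header is still emitted."""
    fields, records = read_records(text)
    return to_csv(fields, records[-n:] if n > 0 else [])


print(head(CSV_TEXT, 1), end="")
print(tail(CSV_TEXT, 1), end="")
```

Both calls print two lines: the header plus one record. A line-oriented `head -n 1` on the same text would return only the header.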
@@ -77,9 +71,6 @@ yellow,triangle,true,1,11,43.6498,9.8870
red,square,true,2,15,79.2778,0.0130
red,circle,true,3,16,13.8103,2.9010
red,square,false,4,48,77.5542,7.4670
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

<pre class="pre-highlight-in-pair">
@@ -91,9 +82,6 @@ purple,triangle,false,7,65,80.1405,5.8240
yellow,circle,true,8,73,63.9785,4.2370
yellow,circle,true,9,87,63.5058,8.3350
purple,square,false,10,91,72.3735,8.2430
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

<pre class="pre-highlight-in-pair">
@@ -120,9 +108,6 @@ go tool pprof -http=:8080 foo-stream
"rate": 8.2430
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

You can sort on a single field:
@@ -142,9 +127,6 @@ purple square false 10 91 72.3735 8.2430
yellow triangle true 1 11 43.6498 9.8870
purple triangle false 5 51 81.2290 8.5910
purple triangle false 7 65 80.1405 5.8240
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

Or, you can sort primarily alphabetically on one field, then secondarily numerically descending on another field, and so on:
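
The two-level ordering can be sketched in Python as a single sort with a compound key. This is an illustration of the ordering only, not Miller's implementation; the function name and the trimmed-down records (only `shape` and `index`) are assumptions made for the example.

```python
# Hypothetical records carrying only the two sort fields; values are strings,
# as they would be when read from a CSV file.
records = [
    {"shape": "triangle", "index": "11"},
    {"shape": "square", "index": "15"},
    {"shape": "circle", "index": "16"},
    {"shape": "square", "index": "48"},
    {"shape": "triangle", "index": "51"},
]


def sort_shape_then_index_desc(recs):
    """Sort lexically ascending on shape, then numerically descending on index."""
    # Negating the numeric value gives descending order within each shape.
    return sorted(recs, key=lambda r: (r["shape"], -float(r["index"])))


for rec in sort_shape_then_index_desc(records):
    print(rec["shape"], rec["index"])
```

Within each shape the larger `index` comes first, which matches the output shown for the square and triangle rows.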
@@ -164,9 +146,6 @@ red square true 2 15 79.2778 0.0130
purple triangle false 7 65 80.1405 5.8240
purple triangle false 5 51 81.2290 8.5910
yellow triangle true 1 11 43.6498 9.8870
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

If there are fields you don't want to see in your data, you can use `cut` to keep only the ones you want, in the same order they appeared in the input data:
@@ -186,9 +165,6 @@ triangle false
circle true
circle true
square false
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

You can also use `cut -o` to keep specified fields, but in your preferred order:
@@ -208,9 +184,6 @@ false triangle
true circle
true circle
false square
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

You can use `cut -x` to omit fields you don't care about:
@@ -230,9 +203,6 @@ purple 7 65 80.1405 5.8240
yellow 8 73 63.9785 4.2370
yellow 9 87 63.5058 8.3350
purple 10 91 72.3735 8.2430
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

Even though Miller's main selling point is name-indexing, sometimes you really want to refer to a field name by its positional index. Use `$[[3]]` to access the name of field 3 or `$[[[3]]]` to access the value of field 3:
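
The difference between addressing a field's name and its value by position can be sketched in Python, where a dict keeps insertion order. This is a minimal analogy under assumed helper names (`rename_field_at`, `set_value_at`), not Miller's implementation; positions are 1-based, as in the text above.

```python
def rename_field_at(record, position, new_name):
    """Return a copy with the field at the 1-based position renamed.

    Field order and the field's value are preserved. If new_name matches
    another existing field name, the two entries would collapse into one.
    """
    return {
        (new_name if i == position else name): value
        for i, (name, value) in enumerate(record.items(), start=1)
    }


def set_value_at(record, position, new_value):
    """Return a copy with the value at the 1-based position replaced."""
    return {
        name: (new_value if i == position else value)
        for i, (name, value) in enumerate(record.items(), start=1)
    }


record = {"color": "purple", "shape": "square", "flag": "false", "k": "10"}
print(rename_field_at(record, 3, "NEW"))  # third field's name becomes NEW
print(set_value_at(record, 3, "NEW"))     # third field's value becomes NEW
```

The first call changes the column header and leaves the data alone; the second leaves the header alone and overwrites the data.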
@@ -252,9 +222,6 @@ purple triangle false 7 65 80.1405 5.8240
yellow circle true 8 73 63.9785 4.2370
yellow circle true 9 87 63.5058 8.3350
purple square false 10 91 72.3735 8.2430
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

<pre class="pre-highlight-in-pair">
@@ -272,9 +239,6 @@ purple triangle NEW 7 65 80.1405 5.8240
yellow circle NEW 8 73 63.9785 4.2370
yellow circle NEW 9 87 63.5058 8.3350
purple square NEW 10 91 72.3735 8.2430
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

You can find the full list of verbs at the [Verbs Reference](reference-verbs.md) page.
@@ -292,9 +256,6 @@ red square true 2 15 79.2778 0.0130
red circle true 3 16 13.8103 2.9010
red square false 4 48 77.5542 7.4670
red square false 6 64 77.1991 9.5310
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

<pre class="pre-highlight-in-pair">
@@ -304,9 +265,6 @@ go tool pprof -http=:8080 foo-stream
color shape flag k index quantity rate
red square true 2 15 79.2778 0.0130
red circle true 3 16 13.8103 2.9010
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

## Computing new fields
@@ -331,9 +289,6 @@ purple triangle false 7 65 80.1405 5.8240 13.760388049450551 purple_triangl
yellow circle true 8 73 63.9785 4.2370 15.09995279679018 yellow_circle
yellow circle true 9 87 63.5058 8.3350 7.619172165566886 yellow_circle
purple square false 10 91 72.3735 8.2430 8.779995147397793 purple_square
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

When you create a new field, it can immediately be used in subsequent statements:
@@ -356,9 +311,6 @@ purple triangle false 7 65 80.1405 5.8240 66 4363
yellow circle true 8 73 63.9785 4.2370 74 5484
yellow circle true 9 87 63.5058 8.3350 88 7753
purple square false 10 91 72.3735 8.2430 92 8474
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

For `put` and `filter` we were able to type out expressions using a programming-language syntax.
@@ -379,9 +331,6 @@ Zone,Total MWh
17,39.8
24,7.4
30,50.5
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

<pre class="pre-highlight-in-pair">
@@ -393,9 +342,6 @@ Zone Total MWh
17 39.8
14 27.2
24 7.4
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

For `put` and `filter` expressions, use `${...}`:
@@ -409,9 +355,6 @@ Zone Total MWh Total KWh
17 39.8 39800
24 7.4 7400
30 50.5 50500
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

See also the [section on field names](reference-dsl-variables.md#field-names).
@@ -458,9 +401,6 @@ a,b,c
1,2,3
4,5,6
7,8,9
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

## Chaining verbs together
@@ -475,12 +415,6 @@ color shape flag k index quantity rate
purple square false 10 91 72.3735 8.2430
yellow circle true 9 87 63.5058 8.3350
yellow circle true 8 73 63.9785 4.2370
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

This works fine -- but Miller also lets you chain verbs together using the word `then`. Think of this as a Miller-internal pipe that lets you use fewer keystrokes:
@@ -493,9 +427,6 @@ color shape flag k index quantity rate
purple square false 10 91 72.3735 8.2430
yellow circle true 9 87 63.5058 8.3350
yellow circle true 8 73 63.9785 4.2370
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

As another convenience, you can put the filename first using `--from`. When you're interacting with your data at the command line, this makes it easier to up-arrow and append to the previous command:
@@ -508,9 +439,6 @@ color shape flag k index quantity rate
purple square false 10 91 72.3735 8.2430
yellow circle true 9 87 63.5058 8.3350
yellow circle true 8 73 63.9785 4.2370
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

<pre class="pre-highlight-in-pair">
@@ -524,9 +452,6 @@ shape quantity
square 72.3735
circle 63.5058
circle 63.9785
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

## Sorts and stats
@@ -543,9 +468,6 @@ color shape flag k index quantity rate
purple square false 10 91 72.3735 8.2430
yellow circle true 9 87 63.5058 8.3350
yellow circle true 8 73 63.9785 4.2370
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

Lots of Miller commands take a `-g` option for group-by: here, `head -n 1 -g shape` outputs the first record for each distinct value of the `shape` field. This means we're finding the record with highest `index` field for each distinct `shape` field:
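
The group-by behaviour of `head` can be sketched in Python as a per-group counter applied to a stream of records. This is a minimal illustration, not Miller's implementation; the function name `head_per_group` is hypothetical, and the sample assumes the input has already been sorted by `shape` and then by descending `index`.

```python
def head_per_group(records, n, group_field):
    """Keep the first n records seen for each distinct value of group_field."""
    seen = {}  # group value -> number of records seen so far
    kept = []
    for rec in records:
        key = rec[group_field]
        seen[key] = seen.get(key, 0) + 1
        if seen[key] <= n:
            kept.append(rec)
    return kept


# Assumed to be pre-sorted: shape ascending, then index descending.
sorted_records = [
    {"shape": "circle", "index": 87},
    {"shape": "circle", "index": 73},
    {"shape": "circle", "index": 16},
    {"shape": "square", "index": 91},
    {"shape": "square", "index": 64},
    {"shape": "triangle", "index": 65},
    {"shape": "triangle", "index": 51},
]

for rec in head_per_group(sorted_records, 1, "shape"):
    print(rec["shape"], rec["index"])
```

Because each group's largest `index` arrives first, keeping one record per group yields the maximum for each shape.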
@@ -558,9 +480,6 @@ color shape flag k index quantity rate
yellow circle true 9 87 63.5058 8.3350
purple square false 10 91 72.3735 8.2430
purple triangle false 7 65 80.1405 5.8240
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

Statistics can be computed with or without group-by field(s):
@@ -574,9 +493,6 @@ shape quantity_count quantity_min quantity_mean quantity_max
triangle 3 43.6498 68.33976666666666 81.229
square 4 72.3735 76.60114999999999 79.2778
circle 3 13.8103 47.0982 63.9785
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

<pre class="pre-highlight-in-pair">
@@ -591,9 +507,6 @@ circle red 1 13.8103 13.8103 13.8103
triangle purple 2 80.1405 80.68475000000001 81.229
circle yellow 2 63.5058 63.742149999999995 63.9785
square purple 1 72.3735 72.3735 72.3735
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

If your output has a lot of columns, you can use XTAB format to line things up vertically for you instead:
@@ -611,9 +524,6 @@ rate_p75 8.5910
rate_p90 9.8870
rate_p99 9.8870
rate_p100 9.8870
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

## Unicode and internationalization
@@ -646,9 +556,6 @@ UTF-8 data. For example:
κόκκινο κύκλος αληθινό 3 16 13.8103 2.9010
κίτρινο κύκλος αληθινό 8 73 63.9785 4.2370
κίτρινο κύκλος αληθινό 9 87 63.5058 8.3350
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

<pre class="pre-highlight-in-pair">
@@ -666,9 +573,6 @@ go tool pprof -http=:8080 foo-stream
κόκκινο τετράγωνο ψευδές 6 64 77.1991 9.5310
μοβ τρίγωνο ψευδές 7 65 80.1405 5.8240
μοβ τετράγωνο ψευδές 10 91 72.3735 8.2430
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

<pre class="pre-highlight-in-pair">
@@ -686,9 +590,6 @@ go tool pprof -http=:8080 foo-stream
желтый КРУГ истина 8 73 63.9785 4.2370 6
желтый КРУГ истина 9 87 63.5058 8.3350 6
фиолетовый КВАДРАТ ложь 10 91 72.3735 8.2430 10
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

## File formats and format conversion
@@ -788,9 +689,6 @@ a matter of specifying input-format and output-format flags:
"rate": 0.0130
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

<pre class="pre-highlight-in-pair">
@@ -800,9 +698,6 @@ go tool pprof -http=:8080 foo-stream
color,shape,flag,k,index,quantity,rate
yellow,triangle,true,1,11,43.6498,9.8870
red,square,true,2,15,79.2778,0.0130
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

However, if JSON data has map-valued or array-valued fields, Miller gives you choices on how to
@@ -843,9 +738,6 @@ We can convert this to CSV, or other tabular formats:
<pre class="pre-non-highlight-in-pair">
hostname,pid,req.id,req.method,req.path,req.host,req.headers.host,req.headers.user-agent,res.status_code,res.header.content-type,res.header.content-encoding
localhost,12345,6789,GET,api/check,foo.bar,bar.baz,browser,200,text,plain
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

<pre class="pre-highlight-in-pair">
@@ -863,9 +755,6 @@ req.headers.user-agent browser
res.status_code 200
res.header.content-type text
res.header.content-encoding plain
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

These transformations are reversible:
@@ -897,12 +786,6 @@ These transformations are reversible:
}
}
]
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

See the [flatten/unflatten page](flatten-unflatten.md) for more information.
@@ -992,14 +875,9 @@ If you like, you can first copy off your original data somewhere else, before do

Lastly, using `tee` within `put`, you can split your input data into separate files per one or more field names:

<pre class="pre-highlight-in-pair">
<pre class="pre-highlight-non-pair">
<b>mlr --csv --from example.csv put -q 'tee > $shape.".csv", $*'</b>
</pre>
<pre class="pre-non-highlight-in-pair">
Memory profile started.
Memory profile finished.
go tool pprof -http=:8080 foo-stream
</pre>

<pre class="pre-highlight-in-pair">
<b>cat circle.csv</b>
