for/if/while and various features

@johnkerl johnkerl released this 11 Jun 11:15

While one of Miller’s strengths is its brevity, and so its domain-specific language is intentionally simple, the ability to loop over field names is a basic thing to want, as are other control structures at the same level of complexity as awk’s. Miller has always owed much inspiration to awk; 4.1.0 makes this more explicit by providing several common language idioms.

Major features:

  • For-loops over key-value pairs in stream records and out-of-stream variables
  • Loops using while and do while
  • break and continue in for, while, and do while loops
  • If-elif-else statements
  • Nestability of all the above, as well as of existing pattern-action blocks
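As a rough illustration of the control structures listed above, here is a minimal sketch. The sample input, the field names x, sign, and foo, and the shell variables signs and padded are made up for this example; it assumes an mlr of version 4.1.0 or later on your PATH and the default DKVP format.

```shell
# Sketch only: inputs are invented for illustration.
# if/elif/else: label each record by the sign of its x field.
signs=$(printf 'x=3\nx=-2\nx=0\n' | mlr put '
  if ($x > 0) {
    $sign = "positive"
  } elif ($x < 0) {
    $sign = "negative"
  } else {
    $sign = "zero"
  }
')
printf '%s\n' "$signs"

# do-while with a nested if and break: keep appending empty fields,
# each named by the next field count, and stop once NF reaches 5.
padded=$(echo 'a=1,b=2' | mlr put '
  do {
    $[NF+1] = "";
    if (NF == 5) {
      break
    }
  } while (NF < 10);
  $foo = "bar"
')
printf '%s\n' "$padded"
```

The second command also shows nesting: the if block sits inside the do-while body, and the break leaves the loop before the while condition would have ended it.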

Additional features:

  • Computable field names using square brackets, e.g. $[$a.$b] = $a * $b
  • Type-predicate functions: isnumeric, isint, isfloat, isbool, isstring
  • Commenting using pound signs
  • The new print and eprint allow formatting of arbitrary expressions to stdout/stderr, respectively
  • In addition to the existing dump which formats all out-of-stream variables to stdout as JSON, the new edump does the same to stderr
  • Semicolon is no longer required after closing curly brace
  • emit @* and unset @* are new synonyms for emit all and unset all
  • unset $* now exists
  • mlr -n is synonymous with mlr --from /dev/null, which is useful in dataless contexts wherein all your put statements are contained within begin/end blocks
  • Bugfix: in 4.0.0, mlr put -v '@a[1][2]=$b;$new=@a[1][2]' mydata.tbl would crash with a memory-management error.
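A few of the additional features can be tried out along these lines. This is a sketch under assumptions rather than a reference: the input record and the shell variables product and answer are invented for illustration, and it assumes an mlr of version 4.1.0 or later on your PATH.

```shell
# Sketch only: input is invented for illustration.
# Computable field name, with a pound-sign comment inside the put expression.
product=$(echo 'a=3,b=4' | mlr put '
  # name the new field by concatenating the values of a and b
  $[$a . $b] = $a * $b
')
printf '%s\n' "$product"

# Dataless context: -n reads no input, so only the end block runs,
# and print writes the value of the expression to stdout.
answer=$(mlr -n put 'end { print 6 * 7 }')
printf '%s\n' "$answer"
```

Swapping print for eprint in the second command should send the same text to stderr instead of stdout.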

Syntax example:

% mlr --from estimates.tbl put '
  for (k,v in $*) {
    if (isnumeric(v) && k =~ "^[t-z].*$") {
      $sum += v; $count += 1
    }
  }
  $mean = $sum / $count # no assignment if count unset
'

Document links:

Brew update: Homebrew/homebrew-core#1895