Get data from YAML, JSON, and TOML files directly in your shell.
How can we easily parse and retrieve data from a YAML file in our shell?
That question, asked a few years ago, led us to develop niet.
At that time, we needed a way to store and retrieve data for our own needs, so we created niet to read that data. The goal was to develop a tool that would let us standardize how we parse YAML locally and in our CI pipelines. We wanted something reusable and easily distributable. Niet was born.
Over the years niet evolved to support other formats such as TOML.
Niet is like xmllint or jq but for YAML, JSON, and TOML data: you can use it to slice, filter, map, and transform structured data.
You can retrieve data with simple expressions, or use advanced xpath features to reach non-trivial data.
You can easily convert between the YAML, JSON, and TOML formats.
Niet is written in Python, so you can install it with a package manager (from PyPI) or directly by cloning this repository. No specific system rights are needed to install it.
- Extract elements by using xpath syntax
- Extract values from JSON, YAML, and TOML format
- Automatically detect format (json/yaml)
- Read data from a web resource
- Read data from file or pass data from stdin
- Format output values
- Format output to be reused by shell eval
- Convert YAML to JSON, or TOML and vice versa
$ pip install -U niet
- Python 3.9 or higher
Since niet 2.0, support for Python 2.7 has been dropped. If you only have Python 2.7 at hand, you can use a previous version (lower than 2.0), but be aware that no support is provided for these versions (no bug fixes, no new features, etc.). If you report an issue or propose a new feature, it will only be addressed in the current or later versions.
$ niet --help
usage: niet [-h] [-a ADDITIONAL_OBJECTS [ADDITIONAL_OBJECTS ...]] [-f {json,yaml,toml,eval,newline,ifs,squote,dquote,comma}] [-i] [-o OUTPUT_FILE] [-s] [-v] [--debug] object [file]
Read data from YAML or JSON file
positional arguments:
object Path to object. Based on jsmespath identifiers (https://jmespath.org/specification.html#identifiers) Use '.' to get whole file. (eg: a.b.c)
file Optional JSON or YAML local filename or distant web resource at raw format. If not provided niet read from stdin
options:
-h, --help show this help message and exit
-a ADDITIONAL_OBJECTS [ADDITIONAL_OBJECTS ...], --additional-objects ADDITIONAL_OBJECTS [ADDITIONAL_OBJECTS ...]
Path to additional objects to search. Here you can pass a list of additional researchs. Allow you to combine researchs into the same command call. The researchs will be made on the original file as with the
`object` parameter. Niet will output the results sequentially without delimiter between the results. If the `--output` argument is given by user, the results are appended at the end of the file sequentially. Based
on jsmespath identifiers (https://jmespath.org/specification.html#identifiers) Use '.' to get whole file. (eg: a.b.c)
-f {json,yaml,toml,eval,newline,ifs,squote,dquote,comma}, --format {json,yaml,toml,eval,newline,ifs,squote,dquote,comma}
output format
-i, --in-place Perform modification in place. Will so alter read file
-o OUTPUT_FILE, --output OUTPUT_FILE
Print output in a file instead of stdout (surcharged by in-place parameter if set)
-s, --silent silent mode, doesn't display message when element was not found
-v, --version print the Niet version number and exit (also --version)
--debug Activate the debug mode (based on pdb)
output formats:
json Return object in JSON
yaml Return object in YAML
toml Return object in TOML
eval Return result in a string evaluable by a shell eval command as an input
newline Return all elements of a list in a new line
ifs Return all elements of a list separated by IFS env var
squote Add single quotes to result
dquote Add double quotes to result
comma Return all elements separated by commas
$ echo '{"foo": "bar", "fizz": {"buzz": ["1", "2", "Fizz", "4", "Buzz"]}}' | niet fizz.buzz
1
2
Fizz
4
Buzz
$ echo '{"foo": "bar", "fizz": {"buzz": ["1", "2", "Fizz", "4", "Buzz"]}}' | niet fizz.buzz -f squote
'1' '2' 'Fizz' '4' 'Buzz'
$ echo '{"foo": "bar", "fizz": {"buzz": ["1", "2", "fizz", "4", "buzz"]}}' | niet . -f yaml
fizz:
buzz:
- '1'
- '2'
- fizz
- '4'
- buzz
foo: bar
$ echo '{"foo": "bar", "fizz": {"buzz": ["zero", "one", "two", "three"]}}' | niet "fizz.buzz[2]"
two
$ echo '{"foo": "bar", "fizz": {"buzz": ["zero", "one", "two", "three"]}}' | niet -f dquote "fizz.buzz[0:2]"
"zero" "one"
$ echo '{"foo": "bar", "fizz": {"buzz": ["zero", "one", "two", "three"]}}' | niet -f dquote "fizz.buzz[:3]"
"zero" "one" "two"
Consider the yaml file with the following content:
# /path/to/your/file.yaml
project:
meta:
name: my-project
foo: bar
list:
- item1
- item2
- item3
test-dash: value
You can download the previous example for local testing, for instance from the command line:
wget https://gist.githubusercontent.com/4383/53e1599663b369f499aa28e27009f2cd/raw/389b82c19499b8cb84a464784e9c79aa25d3a9d3/file.yaml
You can retrieve data from this file by using niet like this:
$ niet ".project.meta.name" /path/to/your/file.yaml
my-project
$ niet ".project.foo" /path/to/your/file.yaml
bar
$ niet ".project.list" /path/to/your/file.yaml
item1 item2 item3
$ # assign return value to shell variable
$ NAME=$(niet ".project.meta.name" /path/to/your/file.yaml)
$ echo $NAME
my-project
$ niet project.'"test-dash"' /path/to/your/file.yaml
value
Consider the json file with the following content:
{
"project": {
"meta": {
"name": "my-project"
},
"foo": "bar",
"list": [
"item1",
"item2",
"item3"
],
"test-dash": "value"
}
}
You can download the previous example for local testing, for instance from the command line:
wget https://gist.githubusercontent.com/4383/1bab8973474625de738f5f6471894322/raw/0048cd2310df2d98bf4f230ffe20da8fa615cef3/file.json
You can retrieve data from this file by using niet like this:
$ niet "project.meta.name" /path/to/your/file.json
my-project
$ niet "project.foo" /path/to/your/file.json
bar
$ niet "project.list" /path/to/your/file.json
item1 item2 item3
$ # assign return value to shell variable
$ NAME=$(niet "project.meta.name" /path/to/your/file.json)
$ echo $NAME
my-project
$ niet project.'"test-dash"' /path/to/your/file.json
value
An identifier is the most basic expression and can be used to extract a single
element from a JSON/YAML document. The return value for an identifier is
the value associated with the identifier. If the identifier does not
exist in the JSON/YAML document, then niet displays a specific message and
returns the error code 1, for example:
$ echo '{"foo": "bar", "fizz": {"buzz": ["1", "2", "3"]}}' | niet fizz.gogo
Element not found: fizz.gogo
$ echo $?
1
See the related section for more info on how to manage errors with niet.
Niet is based on jmespath to find results, so for complex searches you can
refer to the jmespath specifications to use identifiers properly.
If you search for an identifier that contains a dash, you need to surround your search expression with both single and double quotes, for example:
$ echo '{"foo-biz": "bar", "fizz": {"buzz": ["zero", "one", "two", "three"]}}' | niet -f dquote '"foo-biz"'
bar
$ echo '{"key-test": "value"}' | niet '"key-test"'
value
However, niet will detect such issues and automatically quote your
identifier if jmespath fails to handle it.
Hence, the following examples return the same results as the previous examples:
$ echo '{"foo-biz": "bar", "fizz": {"buzz": ["zero", "one", "two", "three"]}}' | niet -f dquote foo-biz
bar
$ echo '{"key-test": "value"}' | niet key-test
value
If your object is not at the root of your path (an example is available in
tests/samples/sample.json), then you only need to surround the searched
identifier, like this: project.'"test-dash"'
{
"project": {
"meta": {
"name": "my-project"
},
"foo": "bar",
"list": [
"item1",
"item2",
"item3"
],
"test-dash": "value"
}
}
Example:
$ niet project.'"test-dash"' tests/samples/sample.json
Further examples with jmespath identifiers.
Additional objects allow you to combine more than one query.
The --additional-objects parameter accepts a list of object strings.
These objects are the same kind of expression as the base object used to query
your input.
This parameter lets you generate advanced output from your input. Let's see an example:
Consider the following yaml example:
configuration:
warehouse: warehouse-name
database: database-name
object_type:
schema:
schema1: "/path/to/schema1.sql"
schema2: "/path/to/schema2.sql"
The following command generates an output made up of the results of the two objects used as queries:
$ niet ".configuration.object_type | keys(@)[0]" config.yaml \
-a ".configuration.object_type.schema.[keys(@)[0], values(@)[0]]"
The previous command will output:
schema
schema1
/path/to/schema1.sql
This output wouldn't be possible without combining the results of two queries; additional objects are made for that.
Outputs of these additional objects are printed sequentially, in the order they are given on the command line.
By default, niet prints the output on stdout.
It is possible to pass a filename using the -o or --output argument to write directly to a file. This file will be created if it does not exist, or replaced if it already exists.
It is possible to modify a file directly using the -i or --in-place argument. This replaces the input file with the output of the niet command. This can be used to extract some data from a file or to reindent a file.
You can change the output format using the -f or --format optional argument.
By default, niet detects the input format and displays complex objects in the same format. If the object is a list or a value, the newline output format is used.
Output formats are:
- ifs
- squote
- dquote
- newline
- yaml
- json
- toml
- comma
- eval
The ifs output format prints all values of a list, or a single value, on one line. Values are separated by the content of the IFS environment variable if defined, or by a space otherwise.
Examples (consider the previous YAML file example):
$ IFS="|" niet .project.list /path/to/your/file.yaml -f ifs
item1|item2|item3
$ IFS=" " niet .project.list /path/to/your/file.yaml -f ifs
item1 item2 item3
$ IFS="@" niet .project.list /path/to/your/file.yaml -f ifs
item1@item2@item3
This is useful in a shell for loop, but your content must, of course, not contain the IFS value:
OIFS="$IFS"
IFS="|"
for i in $(niet .project.list /path/to/your/file.yaml -f ifs); do
echo ${i}
done
IFS="${OIFS}"
The previous example produces the following output:
item1
item2
item3
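If you would rather not change IFS for the whole script, a minimal bash sketch is to scope it to a single read call. The string below is a hard-coded stand-in for the output of niet -f ifs (an assumption, so that the snippet runs without niet or the sample file); in a real script you would feed read from the niet command instead.

```shell
#!/usr/bin/env bash
# Stand-in for: IFS="|" niet .project.list /path/to/your/file.yaml -f ifs
line='item1|item2|item3'

# IFS is set only for this read call, so the global IFS is left untouched.
IFS='|' read -r -a items <<< "$line"

# Iterate over the resulting bash array, one element per iteration.
for i in "${items[@]}"; do
    echo "${i}"
done
```

Because IFS is never reassigned globally, there is nothing to save and restore afterwards.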
For quoted output, see the squote (single quotes) and dquote (double quotes) output formats, which also rely on IFS.
The squote output format prints all values of a list, or a single value, on one line. All values are wrapped in single quotes and separated by the IFS value.
Examples (consider the previous YAML file example):
$ # With the default IFS
$ niet .project.list /path/to/your/file.yaml -f squote
'item1' 'item2' 'item3'
$ # With a specified IFS
$ IFS="|" niet .project.list /path/to/your/file.yaml -f squote
'item1'|'item2'|'item3'
The dquote output format prints all values of a list, or a single value, on one line. All values are wrapped in double quotes and separated by the IFS value.
Examples (consider the previous YAML file example):
$ # With the default IFS
$ niet .project.list /path/to/your/file.yaml -f dquote
"item1" "item2" "item3"
$ # With a specified IFS
$ IFS="|" niet .project.list /path/to/your/file.yaml -f dquote
"item1"|"item2"|"item3"
The newline output format prints one value per line, whether the result is a list or a single value.
The newline format is mostly useful with shell while read loops and
with script interactions.
Example:
while read -r value; do
    echo "$value"
done < <(niet --format newline project.list your-file.json)
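Reading line by line also copes with values that contain spaces, which word splitting in a for loop would break apart. The sketch below illustrates this with hard-coded lines standing in for the output of niet --format newline (hypothetical values, not taken from the sample files).

```shell
# Stand-in for: niet --format newline project.list your-file.json
# (hypothetical values containing spaces)
values=$(printf '%s\n' 'item one' 'item two' 'item three')

count=0
last=""
# IFS= keeps leading/trailing whitespace, -r keeps backslashes literal.
while IFS= read -r value; do
    count=$((count + 1))
    last=$value
    echo "value ${count}: ${value}"
done <<EOF
${values}
EOF
```

Feeding the loop from a here-document rather than a pipe keeps count and last available after the loop ends.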
The comma output format prints results on the same line, separated by commas.
The comma format lets you shape your output so that other command-line
interfaces can consume it. For example, some argument parsers
let you pass multiple values for the same parameter (the
beagle command, for example, lets you repeat the --repo option).
Example of integration with beagle and shell:
$ OSLO_PROJECTS_URL=https://raw.githubusercontent.com/openstack/governance/master/reference/projects.yaml
$ beagle search \
-f link \
--repo $(niet "oslo.deliverables.*.repos[0]" ${OSLO_PROJECTS_URL} -f comma) 'venv'
The previous command returns links to all the files
that contain venv within the scope of the openstack oslo projects (pbr,
taskflow, oslo.messaging, etc).
Here is another example with a narrower scope on the openstack oslo projects:
$ niet "oslo.deliverables.*.repos[0][?contains(@, \`oslo\`) == \`true\`]" \
https://raw.githubusercontent.com/openstack/governance/master/reference/projects.yaml \
-f comma
openstack/oslo-cookiecutter,openstack/oslo-specs,openstack/oslo.cache,
openstack/oslo.concurrency,openstack/oslo.config,openstack/oslo.context,
openstack/oslo.db,openstack/oslo.i18n,openstack/oslo.limit,openstack/oslo.log,
openstack/oslo.messaging,openstack/oslo.middleware,
openstack/oslo.policy,openstack/oslo.privsep,openstack/oslo.reports,
openstack/oslo.rootwrap,openstack/oslo.serialization,openstack/oslo.service,
openstack/oslo.tools,openstack/oslo.upgradecheck,openstack/oslo.utils,
openstack/oslo.versionedobjects,openstack/oslo.vmware,openstack/oslotest
In the previous example we retrieve only the project repos that contain
oslo in their names, so other projects like taskflow, pbr, etc. are ignored.
The eval output format lets you eval the output string to initialize shell variables generated from your JSON/YAML content.
You can initialize shell variables from your entire content, for example:
$ echo '{"foo-biz": "bar", "fizz": {"buzz": ["zero", "one", "two", "three"]}}' | niet -f eval .
foo_biz="bar";fizz__buzz=( zero one two three )
$ eval $(echo '{"foo-biz": "bar", "fizz": {"buzz": ["zero", "one", "two", "three"]}}' | niet -f eval .)
$ echo ${foo_biz}
bar
$ echo ${fizz__buzz}
zero one two three
$ eval $(echo '{"foo-biz": "bar", "fizz": {"buzz": ["zero", "one", "two", "three"]}}' | niet -f eval '"foo-biz"'); echo ${foo_biz}
bar
$ echo '{"foo-biz": "bar", "fizz": {"buzz": ["zero", "one", "two", "three"]}}' | niet -f eval fizz.buzz
fizz_buzz=( zero one two three );
Parent elements are separated by __; for example, the fizz.buzz element
will be represented by a variable named fizz__buzz. You need to take this
into account when you reference your variables.
You can also initialize shell arrays from your content and loop over them in the usual shell manner:
$ eval $(echo '{"foo-biz": "bar", "fizz": {"buzz": ["zero", "one", "two", "three"]}}' | niet -f eval fizz.buzz)
$ for el in ${fizz_buzz}; do echo $el; done
zero
one
two
three
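Note that how an unindexed array reference expands depends on the shell: in bash, ${fizz_buzz} expands to the first element only, so iterating over every element requires the [@] form. A minimal bash sketch, using the eval string from the example above as a hard-coded stand-in for niet's output (so it runs without niet):

```shell
#!/usr/bin/env bash
# Stand-in for: niet -f eval fizz.buzz (string copied from the example above)
niet_output='fizz_buzz=( zero one two three );'
eval "${niet_output}"

# In bash, the unindexed form yields only the first element.
first_only="${fizz_buzz}"

# The quoted [@] form yields every element, one word each.
count=0
for el in "${fizz_buzz[@]}"; do
    echo "${el}"
    count=$((count + 1))
done
```

Quoting the [@] expansion also keeps elements that contain spaces intact.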
The YAML output format forces output to be YAML regardless of the input file format.
The JSON output format forces output to be JSON regardless of the input file format.
The TOML output format forces output to be TOML regardless of the input file format.
Niet lets you read data (json/yaml/toml) from a web resource accessible over the HTTP protocol (introduced in niet 2.1).
This can be done by passing niet a URL that refers to raw content (json, yaml, or toml).
Here are some examples with the openstack governance projects data:
$ # List all the oslo projects repos (https://wiki.openstack.org/wiki/Oslo)
$ niet "oslo.deliverables.*.repos[0]" \
https://raw.githubusercontent.com/openstack/governance/master/reference/projects.yaml
openstack/automaton
openstack/castellan
...
openstack/debtcollector
...
openstack/futurist
openstack/oslo.cache
openstack/oslo.concurrency
openstack/oslo.config
openstack/oslo.context
openstack/oslo.db
openstack/oslo.i18n
openstack/oslo.limit
openstack/oslo.log
openstack/oslo.messaging
openstack/oslo.middleware
openstack/oslo.policy
...
openstack/oslo.service
openstack/osprofiler
openstack/pbr
...
openstack/stevedore
openstack/taskflow
openstack/tooz
openstack/whereto
$ niet oslo.service \
https://raw.githubusercontent.com/openstack/governance/master/reference/projects.yaml
Common libraries
$ # Get the openstack oslo's mission
$ niet oslo.mission \
https://raw.githubusercontent.com/openstack/governance/master/reference/projects.yaml
To produce a set of python libraries containing code shared by OpenStack projects.
The APIs provided by these libraries should be high quality, stable, consistent,
documented and generally applicable.
$ eval $(niet oslo.service \
https://raw.githubusercontent.com/openstack/governance/master/reference/projects.yaml -f eval) && \
test "${oslo_service}" = "Common libraries"
$ # Get the name of the oslo PTL
$ eval $(niet oslo.ptl.name \
https://raw.githubusercontent.com/openstack/governance/master/reference/projects.yaml -f eval)
$ echo "${oslo_ptl_name}" # now display your evaluated result
$ # Convert original distant yaml file into json
$ niet . https://raw.githubusercontent.com/openstack/governance/master/reference/projects.yaml -f json
For further examples of filters and selections, please take a look at the jmespath documentation.
By default, when no result is found, niet displays a specific message and returns
the error code 1, for example:
$ echo '{"foo": "bar", "fizz": {"buzz": ["1", "2", "3"]}}' | niet fizz.gogo
Element not found: fizz.gogo
$ echo $?
1
You can avoid this behavior by running niet in silent mode.
Silent mode hides the error message but still returns
a status code equal to 1 when the key is not found.
You can enable silent mode with the -s/--silent flag, for example:
$ echo '{"foo": "bar", "fizz": {"buzz": ["1", "2", "3"]}}' | niet fizz.gogo -s
$ echo $?
1
When your JSON file content is not valid, niet displays an error and exits
with return code 1.
You can easily protect your script like this:
PROJECT_NAME=$(niet project.meta.name your-file.yaml)
if [ "$?" = "1" ]; then
echo "An error occurred: ${PROJECT_NAME}"
else
echo "Project name: ${PROJECT_NAME}"
fi
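The same check can be wrapped in a small helper that falls back to a default value when the lookup fails. This is only a sketch: get_or_default is a hypothetical function name, not part of niet, and it relies solely on the non-zero exit code described above. Standard commands stand in for niet so that the snippet runs on its own.

```shell
# get_or_default DEFAULT COMMAND [ARGS...]
# Hypothetical helper (not part of niet): prints the command's output when it
# exits with 0, and DEFAULT otherwise.
get_or_default() {
    default=$1
    shift
    if value=$("$@" 2>/dev/null); then
        printf '%s\n' "${value}"
    else
        printf '%s\n' "${default}"
    fi
}

# Demonstration with standard commands standing in for niet:
found=$(get_or_default "unknown" echo "my-project")   # command succeeds
missing=$(get_or_default "unknown" false)             # command fails
echo "${found} / ${missing}"
```

With niet, a call could look like get_or_default "unknown" niet -s project.meta.name your-file.yaml, where -s is the silent flag described earlier.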
You can try niet by using the samples provided with the project source code.
All the following examples use the sample file available in the niet source code at the following location: tests/samples/sample.yaml.
Sample example:
# tests/samples/sample.yaml
project:
meta:
name: my-project
foo: bar
list:
- item1
- item2
- item3
Retrieve the project name:
$ niet project.meta.name tests/samples/sample.yaml
my-project
Consider the following content:
$ cat /var/lib/libvirt/dnsmasq/virbr0.status
[
{
"ip-address": "192.168.122.113",
"mac-address": "52:54:00:91:14:02",
"hostname": "rhel79",
"expiry-time": 1644251254
},
{
"ip-address": "192.168.122.162",
"mac-address": "52:54:00:23:37:ed",
"hostname": "satellite",
"expiry-time": 1644251837
}
]
Here we want to retrieve the value of the ip-address field when the hostname
is equal to satellite. The following command will get this value:
$ sed 's/-/_/g' /var/lib/libvirt/dnsmasq/virbr0.status | niet "[?hostname=='satellite'].ip_address"
192.168.122.162
Notice that we first replace - with _ by using the sed
command. We do that because jmespath, the underlying library used by niet,
handles keys that contain - poorly. We chose to replace every - with _ to avoid
any issues elsewhere in the file.
Here is an example of an automated ssh connection in a KVM virtualised lab
environment: niet looks up the VM name in the DHCP status file, so the
ssh connection to the server works even if its IP has changed.
The ssh connection can be performed with this command:
ssh -o ProxyCommand='nc $(sed '\''s/-/_/g'\'' /var/lib/libvirt/dnsmasq/virbr0.status | niet "[?hostname=='\''%h'\''].ip_address") %p' root@rhel79
Tip: to make this easier to use, you can for example add this .ssh/config entry:
host lab-*
user root
ProxyCommand /usr/bin/nc $(sed 's/-/_/g' /var/lib/libvirt/dnsmasq/virbr0.status | niet "[?hostname=='$(echo %h | cut -d'-' -f2 )'].ip_address") %p
Then run ssh lab-rhel79 or ssh lab-satellite to reach any VM
in your lab by its hostname prefixed with lab-.
Deal with list of items
$ for el in $(niet project.list tests/samples/sample.yaml); do echo ${el}; done
item1
item2
item3
You can also eval your niet output to set up shell variables
that you can reuse in your shell scripts. The following example is similar to
the previous one but uses the eval output format (-f eval):
$ eval $(niet -f eval project.list tests/samples/sample.yaml)
$ for el in ${project__list}; do echo $el; done
item1
item2
item3
Extract the object as JSON to store it in a shell variable:
$ project="$(niet -f json .project tests/samples/sample.yaml)"
Then parse it later in bash, as in this example:
$ niet .meta.name <<< "$project"
my-project
With niet you can easily convert your JSON into YAML
$ niet . tests/samples/sample.json -f yaml
project:
foo: bar
list:
- item1
- item2
- item3
meta:
name: my-project
With niet you can easily convert your YAML into JSON
$ niet . tests/samples/sample.yaml -f json
{
"project": {
"meta": {
"name": "my-project"
},
"foo": "bar",
"list": [
"item1",
"item2",
"item3"
]
}
}
With niet you can easily convert your JSON into TOML
$ niet . tests/samples/sample.json -f toml
[project]
foo = "bar"
list = ["item1", "item2", "item3"]
test-dash = "value"
[project.meta]
name = "my-project"
With niet you can easily convert your YAML into TOML
$ niet . tests/samples/sample.yaml -f toml
[project]
foo = "bar"
list = ["item1", "item2", "item3"]
test-dash = "value"
[project.meta]
name = "my-project"
With niet you can easily convert your TOML into YAML
$ niet . tests/samples/sample.toml -f yaml
project:
foo: bar
list:
- item1
- item2
- item3
meta:
name: my-project
test-dash: value
This is an example of how to indent a JSON file:
$ niet . tests/samples/sample_not_indented.json
{
"project": {
"meta": {
"name": "my-project"
},
"foo": "bar",
"list": [
"item1",
"item2",
"item3"
],
"test-dash": "value"
}
}
You may want to retrieve values from keys that contain dots, for example:
.foo:
something: "a"
bar:
something: "b"
foo.z:
something: "c"
Then you must surround keys that contain dots with quotes, for example:
$ niet '".foo"' /tmp/test.yaml
something: a
$ niet '"foo.z"' /tmp/test.yaml
something: c
You can pass your search with or without quotes like this:
$ niet project.meta.name your-file.yaml
$ niet "project.meta.name" your-file.yaml
You can execute niet step by step by using the debug mode. It lets
you inspect the execution during your debug sessions.
If you want to contribute to niet, please first read the contribution guidelines.
This project is under the MIT License.