A CLI tool to transform a stream of newline-delimited JSON by applying a JS function to each JSON object.
Features:
- take the JS function to apply from a file
- the function may return async results
- preview the transformation results with the --diff option
npm i -g ndjson-apply
cat some_data.ndjson | ndjson-apply some_transform_function.js > some_data_transformed.ndjson
# Which can also be written
ndjson-apply some_transform_function.js < some_data.ndjson > some_data_transformed.ndjson
where some_transform_function.js just needs to export a JS function. This should work with both the ESM export syntax
// some_transform_function.js
export default function (doc) {
  doc.total = doc.a + doc.b
  if (doc.total % 2 === 0) {
    return doc
  } else {
    // returning null or undefined drops the entry
  }
}
or with the CommonJS export syntax
// some_transform_function.js
module.exports = function (doc) {
  doc.total = doc.a + doc.b
  if (doc.total % 2 === 0) {
    return doc
  } else {
    // returning null or undefined drops the entry
  }
}
That function can also be async:
// some_async_transform_function.js
import { getSomeExtraData } from './path/to/get_some_extra_data.js'
export default async function (doc) {
  doc.total = doc.a + doc.b
  if (doc.total % 2 === 0) {
    doc.extraData = await getSomeExtraData(doc)
    return doc
  } else {
    // returning null or undefined drops the entry
  }
}
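The per-entry behaviour described above (apply the function, wait for async results, drop the entry when the function returns null or undefined) can be sketched in plain JS. This only illustrates the semantics and is not how ndjson-apply is actually implemented; applyToNdjson is a hypothetical helper name, and the real tool works on a stream rather than on a whole string.

```javascript
// Hypothetical helper illustrating the semantics; not part of ndjson-apply
async function applyToNdjson (ndjsonText, transformFn) {
  const outputLines = []
  for (const line of ndjsonText.split('\n')) {
    // Skip empty lines, such as the one following a trailing newline
    if (line.trim() === '') continue
    // `await` accepts both plain values and promises,
    // so sync and async transform functions are handled the same way
    const result = await transformFn(JSON.parse(line))
    // `== null` matches both null and undefined: the entry is dropped
    if (result == null) continue
    outputLines.push(JSON.stringify(result))
  }
  return outputLines.join('\n')
}

// Same logic as the transform function from the examples above
function transform (doc) {
  doc.total = doc.a + doc.b
  if (doc.total % 2 === 0) return doc
}
```

Under these assumptions, calling applyToNdjson with two entries whose totals are even and odd would resolve to a single output line, for the entry with the even total.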
As a way to preview the results of your transformation, you can use the diff mode
cat some_data.ndjson | ndjson-apply some_transform_function.js --diff
which will display a colored diff of each line before and after transformation.
For readability, the diff of each line is indented and spread over several lines.
Use the JS function only to filter lines: lines for which the function returns true are let through, and no transformation is applied.
cat some_data.ndjson | ndjson-apply some_transform_function.js --filter
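As a minimal sketch, a function meant to be used with --filter could look like the following. The file name keep_even_totals.js and the fields a and b are illustrative assumptions, reusing the shape of the earlier examples.

```javascript
// keep_even_totals.js (hypothetical file name)
// Returns true for the entries that should be let through;
// the entry itself is left untouched
function keepEvenTotals (doc) {
  return (doc.a + doc.b) % 2 === 0
}

module.exports = keepEvenTotals
```

Because the function only answers yes or no, the entries that pass should come out exactly as they came in.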
Given a function_collection.js file like:
// function_collection.js
export function foo (obj) {
  obj.timestamp = Date.now()
  return obj
}
export function bar (obj) {
  obj.count += obj.count
  return obj
}
You can use those sub-functions by passing their key as an additional argument
cat some_data.ndjson | ndjson-apply ./function_collection.js foo
cat some_data.ndjson | ndjson-apply ./function_collection.js bar
This should also work with the CommonJS syntax:
// function_collection.cjs
module.exports = {
  foo: (obj) => {
    obj.timestamp = Date.now()
    return obj
  },
  bar: (obj) => {
    obj.count += obj.count
    return obj
  }
}
Any remaining arguments will be passed to the function
# Pass '123' as argument to the exported function
cat some_data.ndjson | ndjson-apply ./function.js 123
# Pass '123' as argument to the exported sub-function foo
cat some_data.ndjson | ndjson-apply ./function_collection.js foo 123
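A function consuming such an argument might look like the sketch below. It assumes that the extra arguments are passed after the JSON object and arrive as strings (as the '123' above suggests); the file name keep_above_threshold.js, the minTotal parameter, and the fields a and b are hypothetical.

```javascript
// keep_above_threshold.js (hypothetical file name)
function keepAboveThreshold (doc, minTotal) {
  // Assumption: command-line arguments arrive as strings, so convert explicitly
  const threshold = Number(minTotal)
  doc.total = doc.a + doc.b
  if (doc.total >= threshold) return doc
  // implicitly returning undefined drops the entry
}

module.exports = keepAboveThreshold
```

Converting the argument explicitly avoids comparing a number with a string, which is easy to get subtly wrong in JS.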
To use ndjson-apply with .ts files, you can execute it with tsx as follows:
# Get a tsx executable
npm install --global tsx
# Use ndjson-apply-ts just like you would use ndjson-apply
ndjson-apply-ts ./some_transform_function.ts < ./tests/assets/sample.ndjson
- jq is great for working with NDJSON:
cat entries_array.json | jq '.[]' -cr > entries.ndjson
- ndjson-cli#map
- json-apply