JSONL parser and Pick #147
Replies: 2 comments
-
The major difference is that JSONL is a huge stream of small objects, where "small" means "fits in memory and can be converted to a JS object". You should still use the main
So for JSONL the pipeline should be something like this (an unverified sketch):
let _pipeline = stream_chain([
  _stream,
  stream_jsonl_parser(), // JSONL!
  function* ({value}) {
    if (value.data.foo) yield {key: 'foo', value: value.data.foo};
    if (value.data.bar) yield {key: 'bar', value: value.data.bar};
    // replace the function body with better, more realistic, and more robust code
    // if you want to dump all keys of data, use Object.keys() or Object.entries()
    // instead of a generator function you can use many() of stream-chain (can be more performant)
  }
]);
If you need to use the main parser, use
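To make the shape of the data concrete without installing anything, here is a dependency-free sketch of the same idea. It rests on two assumptions: that the JSONL parser emits one {key, value} item per line (key being the line index, value the parsed object), and that every line carries a data object, as in the pipeline above. parseJsonl, pickData, and collect are hypothetical helper names for illustration only; they are not part of stream-json or stream-chain.

```javascript
// Hypothetical stand-in for a JSONL parser:
// yields one {key, value} item per non-empty line of text.
function* parseJsonl(text) {
  let index = 0;
  for (const line of text.split('\n')) {
    if (!line.trim()) continue; // skip blank lines
    yield {key: index++, value: JSON.parse(line)};
  }
}

// Plays the role of the generator function in the pipeline above,
// but emits every key of `data` via Object.entries().
function* pickData({value}) {
  // Assumption: lines without an object-valued `data` are silently skipped.
  if (!value || typeof value.data !== 'object' || value.data === null) return;
  for (const [key, val] of Object.entries(value.data)) {
    yield {key, value: val};
  }
}

// Runs both steps in memory and gathers the emitted pairs into an array.
function collect(text) {
  const out = [];
  for (const item of parseJsonl(text)) out.push(...pickData(item));
  return out;
}
```

In a real pipeline the pickData generator could probably be dropped into the chain in place of the inline function*, since it takes the same {value} argument; that substitution is untested here.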
-
Moved to discussions as having no action items.
-
Hey! First of all, thanks for your library.
For quite some time I worked with your library to process the following JSON file (example):
And this pipeline:
Given my previous example, that pipeline would output the following result:
I'm now changing my file format to JSONL. So the file is now:
I'd like to adapt my pipeline, but I'm quite stuck. I know about the jsonStreaming option, and it works, but my files contain hundreds of thousands of lines, and I saw that you provide stream-json/jsonl/Parser, which seems to be more performance-focused ("The only reason for its existence is improved performance").
So I tried stream-json/jsonl/Parser, but my pipeline output is now empty. From my understanding, it's because Pick doesn't accept the StreamValues.
Any insight on how to adapt my existing pipeline? Thanks!