json-ext

A set of utilities designed to extend JSON's capabilities, especially for handling large JSON data (over 100MB) efficiently:

  • parseChunked() – Parses JSON incrementally; similar to JSON.parse(), but processes JSON data in chunks.
  • stringifyChunked() – Converts JavaScript objects to JSON incrementally; similar to JSON.stringify(), but returns a generator that yields JSON strings in parts.
  • stringifyInfo() – Estimates the size of the JSON.stringify() result and identifies circular references without generating the JSON.
  • parseFromWebStream() – A helper function to parse JSON chunks directly from a Web Stream.
  • createStringifyWebStream() – A helper function to generate JSON data as a Web Stream.

Key Features

  • Optimized to handle large JSON data with minimal resource usage (see benchmarks)
  • Works seamlessly with browsers, Node.js, Deno, and Bun
  • Supports both Node.js and Web streams
  • Available in both ESM and CommonJS
  • TypeScript typings included
  • No external dependencies
  • Compact size: 9.4kB (minified), 3.8kB (min+gzip)

Why json-ext?

  • Handles large JSON files: Overcomes the limitations of V8 for strings larger than ~500MB, enabling the processing of huge JSON data.
  • Prevents main thread blocking: Distributes parsing and stringifying over time, ensuring the main thread remains responsive during heavy JSON operations (see the sketch after this list).
  • Reduces memory usage: Traditional JSON.parse() and JSON.stringify() require loading entire data into memory, leading to high memory consumption and increased garbage collection pressure. parseChunked() and stringifyChunked() process data incrementally, optimizing memory usage.
  • Size estimation: stringifyInfo() allows estimating the size of resulting JSON before generating it, enabling better decision-making for JSON generation strategies.
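
For example, to keep the main thread responsive while producing a large JSON payload, the chunks from stringifyChunked() can be consumed with periodic yields back to the event loop. A minimal sketch (the write callback and the zero-delay timer are illustrative choices, not part of the library's API):

import { stringifyChunked } from '@discoveryjs/json-ext';

async function stringifyResponsive(value, write) {
    for (const chunk of stringifyChunked(value)) {
        write(chunk); // hand the chunk to any sink: a stream, a socket, etc.
        // Yield back to the event loop so other tasks can run between chunks
        await new Promise(resolve => setTimeout(resolve, 0));
    }
}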

Install

npm install @discoveryjs/json-ext

API

parseChunked()

Functions like JSON.parse(), but iterates over chunks to reconstruct the result object and returns a Promise.

Note: The reviver parameter is not supported yet.

function parseChunked(input: Iterable<Chunk> | AsyncIterable<Chunk>): Promise<any>;
function parseChunked(input: () => (Iterable<Chunk> | AsyncIterable<Chunk>)): Promise<any>;

type Chunk = string | Buffer | Uint8Array;

Usage:

import { parseChunked } from '@discoveryjs/json-ext';

const data = await parseChunked(chunkEmitter);

The chunkEmitter parameter can be an iterable or async iterable that produces chunks, or a function returning such a value. A chunk can be a string, Uint8Array, or Node.js Buffer.

Examples:

  • Generator:
    parseChunked(function*() {
        yield '{ "hello":';
        yield Buffer.from(' "wor'); // Node.js only
        yield new TextEncoder().encode('ld" }'); // returns Uint8Array
    });
  • Async generator:
    parseChunked(async function*() {
        for await (const chunk of someAsyncSource) {
            yield chunk;
        }
    });
  • Array:
    parseChunked(['{ "hello":', ' "world"}'])
  • Function returning iterable:
    parseChunked(() => ['{ "hello":', ' "world"}'])
  • Node.js Readable stream:
    import fs from 'node:fs';
    
    parseChunked(fs.createReadStream('path/to/file.json'))
  • Web stream (e.g., using fetch()):

    Note: Iterability of Web streams was added to the Web platform relatively recently, and not all environments support it. Consider using parseFromWebStream() for broader compatibility.

    const response = await fetch('https://example.com/data.json');
    const data = await parseChunked(response.body); // body is ReadableStream
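  • Handling parse errors:

    Since parseChunked() returns a Promise, malformed or truncated input surfaces as a rejection and can be handled with a regular try/catch (the truncated input below is illustrative):

    try {
        await parseChunked(['{ "hello": ', '"world"']); // missing the closing brace
    } catch (error) {
        console.error('Failed to parse:', error);
    }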

stringifyChunked()

Functions like JSON.stringify(), but returns a generator yielding strings instead of a single string.

Note: Returns "null" when JSON.stringify() returns undefined (since a chunk cannot be undefined).
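
For example, stringifying undefined (which makes JSON.stringify() return undefined) still yields a chunk:

console.log([...stringifyChunked(undefined)]); // ['null']
console.log(JSON.stringify(undefined));        // undefined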

function stringifyChunked(value: any, replacer?: Replacer, space?: Space): Generator<string, void, unknown>;
function stringifyChunked(value: any, options: StringifyOptions): Generator<string, void, unknown>;

type Replacer =
    | ((this: any, key: string, value: any) => any)
    | (string | number)[]
    | null;
type Space = string | number | null;
type StringifyOptions = {
    replacer?: Replacer;
    space?: Space;
    highWaterMark?: number;
};

Usage:

  • Getting an array of chunks:

    const chunks = [...stringifyChunked(data)];
  • Iterating over chunks:

    for (const chunk of stringifyChunked(data)) {
        console.log(chunk);
    }
  • Specifying the minimum size of a chunk with the highWaterMark option:

    const data = [1, "hello world", 42];
    
    console.log([...stringifyChunked(data)]); // default 16kB
    // ['[1,"hello world",42]']
    
    console.log([...stringifyChunked(data, { highWaterMark: 16 })]);
    // ['[1,"hello world"', ',42]']
    
    console.log([...stringifyChunked(data, { highWaterMark: 1 })]);
    // ['[1', ',"hello world"', ',42', ']']
  • Streaming into a stream with a Promise (modern Node.js):

    import { pipeline } from 'node:stream/promises';
    import fs from 'node:fs';
    
    await pipeline(
        stringifyChunked(data),
        fs.createWriteStream('path/to/file.json')
    );
  • Streaming into a stream, wrapped in a Promise (legacy Node.js):

    import { Readable } from 'node:stream';
    
    // stream is a writable destination, e.g. fs.createWriteStream('path/to/file.json')
    await new Promise((resolve, reject) => {
        Readable.from(stringifyChunked(data))
            .on('error', reject)
            .pipe(stream)
            .on('error', reject)
            .on('finish', resolve);
    });
  • Writing into a file synchronously:

    Note: Slower than JSON.stringify(), but uses much less heap space and has no limitation on string length.

    import fs from 'node:fs';
    
    const fd = fs.openSync('output.json', 'w');
    
    for (const chunk of stringifyChunked(data)) {
        fs.writeFileSync(fd, chunk);
    }
    
    fs.closeSync(fd);
  • Using with fetch (JSON streaming):

    Note: This feature has limited support in browsers; see Streaming requests with the fetch API.

    Note: ReadableStream.from() has limited support in browsers, use createStringifyWebStream() instead.

    fetch('http://example.com', {
        method: 'POST',
        duplex: 'half',
        body: ReadableStream.from(stringifyChunked(data))
    });
  • Wrapping into ReadableStream:

    Note: Use ReadableStream.from() or createStringifyWebStream() when no extra logic is needed

    new ReadableStream({
        start() {
            this.generator = stringifyChunked(data);
        },
        pull(controller) {
            const { value, done } = this.generator.next();
    
            if (done) {
                controller.close();
            } else {
                controller.enqueue(value);
            }
        },
        cancel() {
            this.generator = null;
        }
    });
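  • Using replacer and space:

    These parameters mirror JSON.stringify(), so key filtering and pretty-printing work the same way. A minimal sketch (the sample object is illustrative):

    const user = { name: 'Alice', password: 'secret', age: 30 };
    
    // The array replacer keeps only the listed keys; space: 2 pretty-prints
    console.log([...stringifyChunked(user, ['name', 'age'], 2)].join(''));
    // {
    //   "name": "Alice",
    //   "age": 30
    // }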

stringifyInfo()

export function stringifyInfo(value: any, replacer?: Replacer, space?: Space): StringifyInfoResult;
export function stringifyInfo(value: any, options?: StringifyInfoOptions): StringifyInfoResult;

type StringifyInfoOptions = {
    replacer?: Replacer;
    space?: Space;
    continueOnCircular?: boolean;
}
type StringifyInfoResult = {
    bytes: number;      // size of JSON in bytes
    spaceBytes: number; // size of white spaces in bytes (when space option used)
    circular: object[]; // list of circular references
};

Functions like JSON.stringify(), but instead of generating JSON, returns an object with the expected size of the result and a list of circular references.

Example:

import { stringifyInfo } from '@discoveryjs/json-ext';

console.log(stringifyInfo({ test: true }, null, 4));
// {
//   bytes: 20,     // Buffer.byteLength('{\n    "test": true\n}')
//   spaceBytes: 7,
//   circular: []    
// }

Options

continueOnCircular

Type: Boolean
Default: false

Determines whether to continue collecting info for a value when a circular reference is found. Setting this option to true allows finding all circular references.
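
For example, with continueOnCircular enabled, stringifyInfo() collects every circular reference instead of stopping at the first one. A minimal sketch (the object shapes are illustrative):

import { stringifyInfo } from '@discoveryjs/json-ext';

const a = { name: 'a' };
const b = { name: 'b' };
a.self = a; // first circular reference
b.self = b; // second circular reference

const { circular } = stringifyInfo([a, b], { continueOnCircular: true });
console.log(circular.length); // 2 (both circular references are found)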

parseFromWebStream()

A helper function to consume JSON from a Web Stream. You can use parseChunked(stream) instead, but @@asyncIterator on ReadableStream has limited support in browsers (see ReadableStream compatibility table).

import { parseFromWebStream } from '@discoveryjs/json-ext';

const data = await parseFromWebStream(readableStream);
// equivalent to (when ReadableStream[@@asyncIterator] is supported):
// await parseChunked(readableStream);
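
For example, consuming a fetch() response body works regardless of whether the environment's ReadableStream is async-iterable (the URL is illustrative):

import { parseFromWebStream } from '@discoveryjs/json-ext';

const response = await fetch('https://example.com/data.json');
const data = await parseFromWebStream(response.body);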

createStringifyWebStream()

A helper function to convert stringifyChunked() into a ReadableStream (Web Stream). You can use ReadableStream.from() instead, but this method has limited support in browsers (see ReadableStream.from() compatibility table).

import { createStringifyWebStream } from '@discoveryjs/json-ext';

createStringifyWebStream({ test: true });
// equivalent to (when ReadableStream.from() is supported):
// ReadableStream.from(stringifyChunked({ test: true }))
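
For example, it can provide a streaming request body in environments without ReadableStream.from() (the endpoint is illustrative; streaming request bodies also require duplex: 'half' and environment support):

import { createStringifyWebStream } from '@discoveryjs/json-ext';

const data = { test: true };

fetch('http://example.com', {
    method: 'POST',
    duplex: 'half',
    body: createStringifyWebStream(data)
});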

License

MIT