X Experimental Benchmarks
This page describes some benchmarks and gives representative timings and "maxrss" (maximum resident set size) statistics.
Each "test" consists of a combination of a task, often given as a jq program, and some input data (possibly null). The first test, however, uses the md5 program instead, both to record the md5 value of a particular JSON file and to provide a reference point for comparison.
Each combination of task and input data is assigned a number, given in the form (N); for example, the first "test" is:
(1) md5 jeopardy.json
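As a rough illustration of how one numbered entry could be produced, here is a minimal bash sketch. The helper name `run_test` is hypothetical and not part of jq or of this page's original tooling; it relies only on bash's `time` keyword and `TIMEFORMAT` variable. It does not capture maxrss, which needs a platform-specific tool.

```shell
#!/usr/bin/env bash
# Sketch only: run_test is a hypothetical helper, not part of jq.
# It prints the "(N) task input" label, the command line as run,
# the command's own output, and then user/sys CPU times.
run_test() {
  local label=$1; shift                     # e.g. "(2) length jeopardy.json"
  local TIMEFORMAT=$'user %3Us\nsys %3Us'   # layout used by bash's time keyword
  printf '%s\n' "$label"
  printf '%s\n' "$*"
  time "$@"                                 # timing report goes to stderr
}
```

For example, `run_test "(2) length jeopardy.json" jq length jeopardy.json` would print the label, the command, the result of `length`, and the two timing lines.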
This page is organized as follows:
- the SOURCES section has one subsection each for DATA and for PROGRAMS;
- the RESULTS section is organized into GROUPS so that the timings within each group are roughly comparable. Groups are identified by a string such as "Mac OS X (High Sierra) 3GHz 16GB RAM".
In the RESULTS section, the version of jq should be specified by its tag, e.g. jq-1.5 or jq-1.6rc1.
"jeopardy.json" (aka JEOPARDY_QUESTIONS1.json) [54MB]
- https://github.com/alicemaz/super_jeopardy/blob/master/JEOPARDY_QUESTIONS1.json
- https://web.archive.org/web/20180222052746/https://raw.githubusercontent.com/alicemaz/super_jeopardy/master/JEOPARDY_QUESTIONS1.json
- https://drive.google.com/file/d/0BwT5wj_P7BKXb2hfM3d2RHU1ckE/
- gzipped: http://skeeto.s3.amazonaws.com/share/JEOPARDY_QUESTIONS1.json.gz
Description: https://www.reddit.com/r/datasets/comments/1uyd0t/200000_jeopardy_questions_in_a_json_file
"citylots.json" [181MB]
- https://raw.githubusercontent.com/zemirco/sf-city-lots-json/master/citylots.json
- https://drive.google.com/open?id=1dy6PNgsCBol5xBfLUpXXJ6sOnrUZ-QM_
Description: https://github.com/zemirco/sf-city-lots-json
"schema.jq"
- https://gist.github.com/pkoppstein/a5abb4ebef3b0f72a6ed (also available at archive.org)
Note: in the tests, the last line of "schema.jq" has been uncommented, but see footnote [*1] below for alternatives.
"testzip.jq"
def zip(headers):
  headers as $headers
  | [$headers, .] | transpose | map({(.[0]): .[1]}) | add ;

def testzip(n):
  [range(0;n)] as $row
  | $row | zip( $row|map(tostring) ) ;

testzip(1000000) | length
(1) md5 jeopardy.json
MD5 (jeopardy.json) = 2075398fa049b1c00223b2279ca5281d
user 0m0.126s
sys 0m0.025s
maxrss 11341824
(2) length jeopardy.json
jq-1.5 length jeopardy.json
216930
user 0m1.144s
sys 0m0.112s
maxrss 223440896
(2 rq) length jeopardy.json
rq 'map(s)=>{s.length}' < jeopardy.json
216930
user 4.76s
sys 0.27s
maxrss 372486144
(3) schema.jq jeopardy.json
jq-1.5 -f schema.jq jeopardy.json > jeopardy.schema.json
user 7.10s
sys 0.13s
maxrss 223457280
jq-1.6rc1 -f schema.jq jeopardy.json > jeopardy.schema.json
user 8.26s
sys 0.13s
maxrss 223526912
(4) null testzip.jq
jq-1.5 -n -f testzip.jq
1000000
user 6.11s
sys 0.35s
maxrss 711286784
(5) . jeopardy.json
jq-1.5 . jeopardy.json | wc -l
1952372
user 4.69s
sys 0.12s
maxrss 223350784
(5 rq) . jeopardy.json
rq --format readable id < jeopardy.json | wc -l
1952372
user 21.38s
sys 2.13s
maxrss 381214720
(6) 'select(length==2)' jeopardy.json # --stream
jq-1.5 --stream 'select(length==2)' jeopardy.json | wc -l
10629570
user 0m8.901s
sys 0m0.087s
maxrss 1359872
(7) null 0
jq-1.5 -n 0
user 0.002924s
sys 0.001339s
maxrss 1187840
Times are based on 1000 iterations using a bash loop, after adjusting for the times of the looping itself.
jq-1.6rc1 -n 0
user 0.030609s
sys 0.001838s
maxrss 2076672
Times are based on 1000 iterations using a bash loop, after adjusting for the times of the looping itself.
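The per-invocation figures above come from a loop whose own cost is subtracted. One way this could be done is sketched below in bash. The function names `loop_times` and `per_call`, and the choice of the `:` no-op builtin as the baseline, are assumptions for illustration. The original measurements may have been taken differently.

```shell
#!/usr/bin/env bash
# Sketch only: loop_times and per_call are hypothetical helper names.

# Print "user sys" CPU seconds consumed by running "$@" n times.
loop_times() {
  local n=$1 i; shift
  local TIMEFORMAT='%3U %3S'
  { time for ((i = 0; i < n; i++)); do "$@" >/dev/null 2>&1; done; } 2>&1
}

# Print per-invocation user/sys time, net of the loop's own overhead.
# Assumption: the overhead is estimated by looping over the ":" builtin.
per_call() {
  local n=$1 base cmd; shift
  base=$(loop_times "$n" :)
  cmd=$(loop_times "$n" "$@")
  awk -v n="$n" -v b="$base" -v c="$cmd" 'BEGIN {
    split(b, B, " "); split(c, C, " ")
    printf "user %.6fs\nsys %.6fs\n", (C[1] - B[1]) / n, (C[2] - B[2]) / n
  }'
}
```

With these definitions, `per_call 1000 jq -n 0` reports averages in the same "user"/"sys" layout as above. For very cheap commands the difference can come out slightly negative because of timer resolution.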
(8) md5 citylots.json
md5 citylots.json
MD5 (citylots.json) = 158346af5a90253d8b4390bd671eb5c5
user 0.43s
sys 0.06s
maxrss 11333632
(9) length citylots.json
jq-1.5 length citylots.json
2
user 0m6.887s
sys 0m0.772s
maxrss 1375858688
(10) '.features|length' citylots.json
jq-1.5 '.features|length' citylots.json
206560
user 6.23s
sys 0.78s
maxrss 1375899648
(11) schema.jq citylots.json
jq-1.5 -f schema.jq citylots.json > citylots.schema.json
user 67.05s
sys 1.10s
maxrss 1375961088
(12) .features[10000].properties.LOT_NUM citylots.json
jq-1.5 '.features[10000].properties.LOT_NUM' citylots.json
"091"
user 6.44s
sys 0.97s
maxrss 1371561984
jq-1.6rc1 '.features[10000].properties.LOT_NUM' citylots.json
"091"
user 5.46s
sys 0.73s
maxrss 1375936512
jq-1.5 -n --stream 'first(inputs | select(.[0] == ["features",10000,"properties","LOT_NUM"])) | .[1]' citylots.json
"091"
user 0.60s
sys 0.00s
maxrss 2084864
"jeopardy.schema.json"
{
  "air_date": "string",
  "answer": "string",
  "category": "string",
  "question": "string",
  "round": "string",
  "show_number": "string",
  "value": "string"
}
[*1] If you prefer to use schema.jq as it exists on the web, here are two alternative methods that can be considered:
(a) jq -f <(cat schema.jq; echo schema) ...
(b) jq 'include "schema"; schema' ...
For further details about using include, see the jq documentation.