Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cache parsed object names #675

Merged
merged 4 commits into from
Nov 2, 2024
Merged

Cache parsed object names #675

merged 4 commits into from
Nov 2, 2024

Conversation

byroot
Copy link
Member

@byroot byroot commented Nov 2, 2024

Looking up the fstring table isn't that cheap, if we assume most object names are likely to re-appear, it makes sense to keep a small cache of them on the stack.

e.g. [{"foo": 1, "bar": 2}, {"foo": 3, "bar": 4}]

In term of implementation, we use a simple sorted array with binary search as it's the most compact and guarantee a decent O(log n), and the comparison is first done on the string length, an then fallback to memcmp.

This helps quite a bit on activitypub and twitter benchmarks, however it is worse on citm_catalog because it has a number of unique object names, of the same size, so it behave poorly with the chosen sorting algorithm.

It's a bit of a double edge sword, but perhaps a smarter comparison function may solve it. Solved later on.

Before:

== Parsing activitypub.json (58160 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json   714.000 i/100ms
                  oj   801.000 i/100ms
          Oj::Parser   943.000 i/100ms
           rapidjson   632.000 i/100ms
Calculating -------------------------------------
                json      7.135k (± 0.7%) i/s  (140.16 μs/i) -     35.700k in   5.003978s
                  oj      7.991k (± 0.2%) i/s  (125.14 μs/i) -     40.050k in   5.012044s
          Oj::Parser      9.611k (± 0.2%) i/s  (104.04 μs/i) -     48.093k in   5.003723s
           rapidjson      6.318k (± 0.2%) i/s  (158.29 μs/i) -     31.600k in   5.001896s

Comparison:
                json:     7134.7 i/s
          Oj::Parser:     9611.5 i/s - 1.35x  faster
                  oj:     7990.8 i/s - 1.12x  faster
           rapidjson:     6317.6 i/s - 1.13x  slower

== Parsing twitter.json (567916 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json    57.000 i/100ms
                  oj    62.000 i/100ms
          Oj::Parser    78.000 i/100ms
           rapidjson    56.000 i/100ms
Calculating -------------------------------------
                json    573.527 (± 1.6%) i/s    (1.74 ms/i) -      2.907k in   5.070094s
                  oj    619.368 (± 1.6%) i/s    (1.61 ms/i) -      3.100k in   5.006550s
          Oj::Parser    770.095 (± 0.9%) i/s    (1.30 ms/i) -      3.900k in   5.064768s
           rapidjson    560.601 (± 0.4%) i/s    (1.78 ms/i) -      2.856k in   5.094597s

Comparison:
                json:      573.5 i/s
          Oj::Parser:      770.1 i/s - 1.34x  faster
                  oj:      619.4 i/s - 1.08x  faster
           rapidjson:      560.6 i/s - 1.02x  slower

== Parsing citm_catalog.json (1727030 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json    31.000 i/100ms
                  oj    34.000 i/100ms
          Oj::Parser    46.000 i/100ms
           rapidjson    38.000 i/100ms
Calculating -------------------------------------
                json    319.842 (± 0.3%) i/s    (3.13 ms/i) -      1.612k in   5.040026s
                  oj    329.315 (± 2.4%) i/s    (3.04 ms/i) -      1.666k in   5.061887s
          Oj::Parser    452.725 (± 1.1%) i/s    (2.21 ms/i) -      2.300k in   5.080996s
           rapidjson    358.160 (± 0.8%) i/s    (2.79 ms/i) -      1.824k in   5.093054s

Comparison:
                json:      319.8 i/s
          Oj::Parser:      452.7 i/s - 1.42x  faster
           rapidjson:      358.2 i/s - 1.12x  faster
                  oj:      329.3 i/s - 1.03x  faster

After:

== Parsing activitypub.json (58160 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json   810.000 i/100ms
                  oj   794.000 i/100ms
          Oj::Parser   936.000 i/100ms
           rapidjson   630.000 i/100ms
Calculating -------------------------------------
                json      7.759k (± 1.2%) i/s  (128.88 μs/i) -     38.880k in   5.011657s
                  oj      7.945k (± 1.2%) i/s  (125.87 μs/i) -     40.494k in   5.097765s
          Oj::Parser      9.658k (± 1.3%) i/s  (103.54 μs/i) -     48.672k in   5.040614s
           rapidjson      6.321k (± 0.3%) i/s  (158.20 μs/i) -     32.130k in   5.082985s

Comparison:
                json:     7759.2 i/s
          Oj::Parser:     9657.8 i/s - 1.24x  faster
                  oj:     7944.6 i/s - same-ish: difference falls within error
           rapidjson:     6321.1 i/s - 1.23x  slower

== Parsing twitter.json (567916 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json    66.000 i/100ms
                  oj    62.000 i/100ms
          Oj::Parser    78.000 i/100ms
           rapidjson    56.000 i/100ms
Calculating -------------------------------------
                json    661.653 (± 0.6%) i/s    (1.51 ms/i) -      3.366k in   5.087447s
                  oj    621.674 (± 1.3%) i/s    (1.61 ms/i) -      3.162k in   5.087171s
          Oj::Parser    775.859 (± 0.3%) i/s    (1.29 ms/i) -      3.900k in   5.026718s
           rapidjson    563.961 (± 0.4%) i/s    (1.77 ms/i) -      2.856k in   5.064230s

Comparison:
                json:      661.7 i/s
          Oj::Parser:      775.9 i/s - 1.17x  faster
                  oj:      621.7 i/s - 1.06x  slower
           rapidjson:      564.0 i/s - 1.17x  slower

== Parsing citm_catalog.json (1727030 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json    30.000 i/100ms
                  oj    34.000 i/100ms
          Oj::Parser    47.000 i/100ms
           rapidjson    38.000 i/100ms
Calculating -------------------------------------
                json    303.318 (± 0.3%) i/s    (3.30 ms/i) -      1.530k in   5.044285s
                  oj    328.832 (± 2.7%) i/s    (3.04 ms/i) -      1.666k in   5.069899s
          Oj::Parser    458.235 (± 0.4%) i/s    (2.18 ms/i) -      2.303k in   5.025895s
           rapidjson    381.204 (± 0.5%) i/s    (2.62 ms/i) -      1.938k in   5.084065s

Comparison:
                json:      303.3 i/s
          Oj::Parser:      458.2 i/s - 1.51x  faster
           rapidjson:      381.2 i/s - 1.26x  faster
                  oj:      328.8 i/s - 1.08x  faster

Looking up the fstring table isn't that cheap, if we assume most object
names are likely to re-appear, it makes sense to keep a small cache of them on
the stack.

e.g. `[{"foo": 1, "bar": 2}, {"foo": 3, "bar": 4}]`

In term of implementation, we use a simple sorted array with binary search
as it's the most compact and guarantee a decent `O(log n)`, and the comparison
is first done on the string length, an then fallback to `memcmp`.

This helps quite a bit on `activitypub` and `twitter` benchmarks, however
it is worse on `citm_catalog` because it has a number of unique object names,
of the same size, so it behave poorly with the chosen sorting algorithm.

It's a bit of a double edge sword, but perhaps a smarter comparison function
may solve it.

Before:

```
== Parsing activitypub.json (58160 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json   714.000 i/100ms
                  oj   801.000 i/100ms
          Oj::Parser   943.000 i/100ms
           rapidjson   632.000 i/100ms
Calculating -------------------------------------
                json      7.135k (± 0.7%) i/s  (140.16 μs/i) -     35.700k in   5.003978s
                  oj      7.991k (± 0.2%) i/s  (125.14 μs/i) -     40.050k in   5.012044s
          Oj::Parser      9.611k (± 0.2%) i/s  (104.04 μs/i) -     48.093k in   5.003723s
           rapidjson      6.318k (± 0.2%) i/s  (158.29 μs/i) -     31.600k in   5.001896s

Comparison:
                json:     7134.7 i/s
          Oj::Parser:     9611.5 i/s - 1.35x  faster
                  oj:     7990.8 i/s - 1.12x  faster
           rapidjson:     6317.6 i/s - 1.13x  slower

== Parsing twitter.json (567916 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json    57.000 i/100ms
                  oj    62.000 i/100ms
          Oj::Parser    78.000 i/100ms
           rapidjson    56.000 i/100ms
Calculating -------------------------------------
                json    573.527 (± 1.6%) i/s    (1.74 ms/i) -      2.907k in   5.070094s
                  oj    619.368 (± 1.6%) i/s    (1.61 ms/i) -      3.100k in   5.006550s
          Oj::Parser    770.095 (± 0.9%) i/s    (1.30 ms/i) -      3.900k in   5.064768s
           rapidjson    560.601 (± 0.4%) i/s    (1.78 ms/i) -      2.856k in   5.094597s

Comparison:
                json:      573.5 i/s
          Oj::Parser:      770.1 i/s - 1.34x  faster
                  oj:      619.4 i/s - 1.08x  faster
           rapidjson:      560.6 i/s - 1.02x  slower

== Parsing citm_catalog.json (1727030 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json    31.000 i/100ms
                  oj    34.000 i/100ms
          Oj::Parser    46.000 i/100ms
           rapidjson    38.000 i/100ms
Calculating -------------------------------------
                json    319.842 (± 0.3%) i/s    (3.13 ms/i) -      1.612k in   5.040026s
                  oj    329.315 (± 2.4%) i/s    (3.04 ms/i) -      1.666k in   5.061887s
          Oj::Parser    452.725 (± 1.1%) i/s    (2.21 ms/i) -      2.300k in   5.080996s
           rapidjson    358.160 (± 0.8%) i/s    (2.79 ms/i) -      1.824k in   5.093054s

Comparison:
                json:      319.8 i/s
          Oj::Parser:      452.7 i/s - 1.42x  faster
           rapidjson:      358.2 i/s - 1.12x  faster
                  oj:      329.3 i/s - 1.03x  faster

```

After:

```
== Parsing activitypub.json (58160 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json   810.000 i/100ms
                  oj   794.000 i/100ms
          Oj::Parser   936.000 i/100ms
           rapidjson   630.000 i/100ms
Calculating -------------------------------------
                json      7.759k (± 1.2%) i/s  (128.88 μs/i) -     38.880k in   5.011657s
                  oj      7.945k (± 1.2%) i/s  (125.87 μs/i) -     40.494k in   5.097765s
          Oj::Parser      9.658k (± 1.3%) i/s  (103.54 μs/i) -     48.672k in   5.040614s
           rapidjson      6.321k (± 0.3%) i/s  (158.20 μs/i) -     32.130k in   5.082985s

Comparison:
                json:     7759.2 i/s
          Oj::Parser:     9657.8 i/s - 1.24x  faster
                  oj:     7944.6 i/s - same-ish: difference falls within error
           rapidjson:     6321.1 i/s - 1.23x  slower

== Parsing twitter.json (567916 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json    66.000 i/100ms
                  oj    62.000 i/100ms
          Oj::Parser    78.000 i/100ms
           rapidjson    56.000 i/100ms
Calculating -------------------------------------
                json    661.653 (± 0.6%) i/s    (1.51 ms/i) -      3.366k in   5.087447s
                  oj    621.674 (± 1.3%) i/s    (1.61 ms/i) -      3.162k in   5.087171s
          Oj::Parser    775.859 (± 0.3%) i/s    (1.29 ms/i) -      3.900k in   5.026718s
           rapidjson    563.961 (± 0.4%) i/s    (1.77 ms/i) -      2.856k in   5.064230s

Comparison:
                json:      661.7 i/s
          Oj::Parser:      775.9 i/s - 1.17x  faster
                  oj:      621.7 i/s - 1.06x  slower
           rapidjson:      564.0 i/s - 1.17x  slower

== Parsing citm_catalog.json (1727030 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json    30.000 i/100ms
                  oj    34.000 i/100ms
          Oj::Parser    47.000 i/100ms
           rapidjson    38.000 i/100ms
Calculating -------------------------------------
                json    303.318 (± 0.3%) i/s    (3.30 ms/i) -      1.530k in   5.044285s
                  oj    328.832 (± 2.7%) i/s    (3.04 ms/i) -      1.666k in   5.069899s
          Oj::Parser    458.235 (± 0.4%) i/s    (2.18 ms/i) -      2.303k in   5.025895s
           rapidjson    381.204 (± 0.5%) i/s    (2.62 ms/i) -      1.938k in   5.084065s

Comparison:
                json:      303.3 i/s
          Oj::Parser:      458.2 i/s - 1.51x  faster
           rapidjson:      381.2 i/s - 1.26x  faster
                  oj:      328.8 i/s - 1.08x  faster

```
@byroot byroot marked this pull request as draft November 2, 2024 16:53
Based on how JSON documents are generally constructed, a numeric
object name is much less likely to be repeated in the document.

```
== Parsing activitypub.json (58160 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json   779.000 i/100ms
                  oj   793.000 i/100ms
          Oj::Parser   956.000 i/100ms
           rapidjson   620.000 i/100ms
Calculating -------------------------------------
                json      7.849k (± 0.3%) i/s  (127.41 μs/i) -     39.729k in   5.061875s
                  oj      7.953k (± 0.3%) i/s  (125.74 μs/i) -     40.443k in   5.085293s
          Oj::Parser      9.652k (± 0.2%) i/s  (103.61 μs/i) -     48.756k in   5.051573s
           rapidjson      6.198k (± 0.1%) i/s  (161.35 μs/i) -     31.000k in   5.001787s

Comparison:
                json:     7848.7 i/s
          Oj::Parser:     9651.7 i/s - 1.23x  faster
                  oj:     7953.0 i/s - 1.01x  faster
           rapidjson:     6197.8 i/s - 1.27x  slower

== Parsing twitter.json (567916 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json    66.000 i/100ms
                  oj    62.000 i/100ms
          Oj::Parser    79.000 i/100ms
           rapidjson    56.000 i/100ms
Calculating -------------------------------------
                json    676.145 (± 0.3%) i/s    (1.48 ms/i) -      3.432k in   5.075895s
                  oj    631.864 (± 0.3%) i/s    (1.58 ms/i) -      3.162k in   5.004287s
          Oj::Parser    791.409 (± 0.3%) i/s    (1.26 ms/i) -      4.029k in   5.090947s
           rapidjson    566.597 (± 0.2%) i/s    (1.76 ms/i) -      2.856k in   5.040641s

Comparison:
                json:      676.1 i/s
          Oj::Parser:      791.4 i/s - 1.17x  faster
                  oj:      631.9 i/s - 1.07x  slower
           rapidjson:      566.6 i/s - 1.19x  slower

== Parsing citm_catalog.json (1727030 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json    34.000 i/100ms
                  oj    32.000 i/100ms
          Oj::Parser    43.000 i/100ms
           rapidjson    37.000 i/100ms
Calculating -------------------------------------
                json    356.414 (± 0.3%) i/s    (2.81 ms/i) -      1.802k in   5.055963s
                  oj    349.397 (± 2.0%) i/s    (2.86 ms/i) -      1.760k in   5.039216s
          Oj::Parser    478.342 (± 0.4%) i/s    (2.09 ms/i) -      2.408k in   5.034158s
           rapidjson    395.222 (± 0.8%) i/s    (2.53 ms/i) -      1.998k in   5.055669s

Comparison:
                json:      356.4 i/s
          Oj::Parser:      478.3 i/s - 1.34x  faster
           rapidjson:      395.2 i/s - 1.11x  faster
                  oj:      349.4 i/s - same-ish: difference falls within error
```
@byroot
Copy link
Member Author

byroot commented Nov 2, 2024

4a04252 solves the regression.

It's simple, we just don't cache names that don't start with a letter. This may sound like benchmark gaming, but I think it makes sense as an heuristic given how most JSON documents are constructed.

== Parsing activitypub.json (58160 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json   779.000 i/100ms
                  oj   793.000 i/100ms
          Oj::Parser   956.000 i/100ms
           rapidjson   620.000 i/100ms
Calculating -------------------------------------
                json      7.849k (± 0.3%) i/s  (127.41 μs/i) -     39.729k in   5.061875s
                  oj      7.953k (± 0.3%) i/s  (125.74 μs/i) -     40.443k in   5.085293s
          Oj::Parser      9.652k (± 0.2%) i/s  (103.61 μs/i) -     48.756k in   5.051573s
           rapidjson      6.198k (± 0.1%) i/s  (161.35 μs/i) -     31.000k in   5.001787s

Comparison:
                json:     7848.7 i/s
          Oj::Parser:     9651.7 i/s - 1.23x  faster
                  oj:     7953.0 i/s - 1.01x  faster
           rapidjson:     6197.8 i/s - 1.27x  slower


== Parsing twitter.json (567916 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json    66.000 i/100ms
                  oj    62.000 i/100ms
          Oj::Parser    79.000 i/100ms
           rapidjson    56.000 i/100ms
Calculating -------------------------------------
                json    676.145 (± 0.3%) i/s    (1.48 ms/i) -      3.432k in   5.075895s
                  oj    631.864 (± 0.3%) i/s    (1.58 ms/i) -      3.162k in   5.004287s
          Oj::Parser    791.409 (± 0.3%) i/s    (1.26 ms/i) -      4.029k in   5.090947s
           rapidjson    566.597 (± 0.2%) i/s    (1.76 ms/i) -      2.856k in   5.040641s

Comparison:
                json:      676.1 i/s
          Oj::Parser:      791.4 i/s - 1.17x  faster
                  oj:      631.9 i/s - 1.07x  slower
           rapidjson:      566.6 i/s - 1.19x  slower


== Parsing citm_catalog.json (1727030 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json    34.000 i/100ms
                  oj    32.000 i/100ms
          Oj::Parser    43.000 i/100ms
           rapidjson    37.000 i/100ms
Calculating -------------------------------------
                json    356.414 (± 0.3%) i/s    (2.81 ms/i) -      1.802k in   5.055963s
                  oj    349.397 (± 2.0%) i/s    (2.86 ms/i) -      1.760k in   5.039216s
          Oj::Parser    478.342 (± 0.4%) i/s    (2.09 ms/i) -      2.408k in   5.034158s
           rapidjson    395.222 (± 0.8%) i/s    (2.53 ms/i) -      1.998k in   5.055669s

Comparison:
                json:      356.4 i/s
          Oj::Parser:      478.3 i/s - 1.34x  faster
           rapidjson:      395.2 i/s - 1.11x  faster
                  oj:      349.4 i/s - same-ish: difference falls within error

static VALUE rstring_cache_fetch(rstring_cache *cache, const char *str, const long length)
{
if (RB_UNLIKELY(length > JSON_RSTRING_CACHE_MAX_ENTRY_LENGTH)) {
rb_enc_interned_str(str, length, rb_utf8_encoding());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this missing a return?
Fixed in the next commit, but I wonder if it affects the benchmark results for this commit.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it was a mistake. I don't think we ever hit that branch on any of the benchmarks.

@byroot
Copy link
Member Author

byroot commented Nov 2, 2024

@eregon while you're here, I'm surprised to see TruffleRuby is missing rb_enc_interned_str:

/home/runner/.rubies/truffleruby-head/bin/truffleruby: symbol lookup error: /home/runner/work/json/json/ext/json/ext/parser.so: undefined symbol: rb_enc_interned_str

@eregon
Copy link
Member

eregon commented Nov 2, 2024

https://github.com/oracle/truffleruby/blob/07b978e4e0d43827ca13c600af447c98ac477a71/src/main/c/cext/string.c#L435-L442 looks like TruffleRuby has rb_enc_interned_str_cstr() and rb_str_to_interned_str() but not rb_enc_interned_str() yet -> oracle/truffleruby#3703

Also micro optimize a bit further.

```
== Parsing activitypub.json (58160 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json   832.000 i/100ms
                  oj   799.000 i/100ms
          Oj::Parser   969.000 i/100ms
           rapidjson   636.000 i/100ms
Calculating -------------------------------------
                json      8.020k (± 0.3%) i/s  (124.70 μs/i) -     40.768k in   5.083607s
                  oj      7.942k (± 1.7%) i/s  (125.92 μs/i) -     39.950k in   5.031909s
          Oj::Parser      9.515k (± 4.4%) i/s  (105.10 μs/i) -     47.481k in   5.001202s
           rapidjson      6.282k (± 2.1%) i/s  (159.20 μs/i) -     31.800k in   5.064719s

Comparison:
                json:     8019.6 i/s
          Oj::Parser:     9514.5 i/s - 1.19x  faster
                  oj:     7941.9 i/s - same-ish: difference falls within error
           rapidjson:     6281.6 i/s - 1.28x  slower

== Parsing twitter.json (567916 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json    67.000 i/100ms
                  oj    62.000 i/100ms
          Oj::Parser    79.000 i/100ms
           rapidjson    55.000 i/100ms
Calculating -------------------------------------
                json    670.935 (± 2.7%) i/s    (1.49 ms/i) -      3.417k in   5.096850s
                  oj    618.937 (± 3.2%) i/s    (1.62 ms/i) -      3.100k in   5.014800s
          Oj::Parser    768.894 (± 1.7%) i/s    (1.30 ms/i) -      3.871k in   5.036093s
           rapidjson    556.882 (± 2.7%) i/s    (1.80 ms/i) -      2.805k in   5.040970s

Comparison:
                json:      670.9 i/s
          Oj::Parser:      768.9 i/s - 1.15x  faster
                  oj:      618.9 i/s - 1.08x  slower
           rapidjson:      556.9 i/s - 1.20x  slower

== Parsing citm_catalog.json (1727030 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json    35.000 i/100ms
                  oj    32.000 i/100ms
          Oj::Parser    43.000 i/100ms
           rapidjson    39.000 i/100ms
Calculating -------------------------------------
                json    382.262 (± 2.6%) i/s    (2.62 ms/i) -      1.925k in   5.039336s
                  oj    348.721 (± 0.6%) i/s    (2.87 ms/i) -      1.760k in   5.047265s
          Oj::Parser    478.294 (± 0.6%) i/s    (2.09 ms/i) -      2.408k in   5.034798s
           rapidjson    398.740 (± 0.8%) i/s    (2.51 ms/i) -      2.028k in   5.086365s

Comparison:
                json:      382.3 i/s
          Oj::Parser:      478.3 i/s - 1.25x  faster
           rapidjson:      398.7 i/s - 1.04x  faster
                  oj:      348.7 i/s - 1.10x  slower
```
@byroot
Copy link
Member Author

byroot commented Nov 2, 2024

I fixed the TruffleRuby and Ruby 2.7 support, and squeezed some more micro-optimizations:

== Parsing activitypub.json (58160 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json   832.000 i/100ms
                  oj   799.000 i/100ms
          Oj::Parser   969.000 i/100ms
           rapidjson   636.000 i/100ms
Calculating -------------------------------------
                json      8.020k (± 0.3%) i/s  (124.70 μs/i) -     40.768k in   5.083607s
                  oj      7.942k (± 1.7%) i/s  (125.92 μs/i) -     39.950k in   5.031909s
          Oj::Parser      9.515k (± 4.4%) i/s  (105.10 μs/i) -     47.481k in   5.001202s
           rapidjson      6.282k (± 2.1%) i/s  (159.20 μs/i) -     31.800k in   5.064719s

Comparison:
                json:     8019.6 i/s
          Oj::Parser:     9514.5 i/s - 1.19x  faster
                  oj:     7941.9 i/s - same-ish: difference falls within error
           rapidjson:     6281.6 i/s - 1.28x  slower

== Parsing twitter.json (567916 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json    67.000 i/100ms
                  oj    62.000 i/100ms
          Oj::Parser    79.000 i/100ms
           rapidjson    55.000 i/100ms
Calculating -------------------------------------
                json    670.935 (± 2.7%) i/s    (1.49 ms/i) -      3.417k in   5.096850s
                  oj    618.937 (± 3.2%) i/s    (1.62 ms/i) -      3.100k in   5.014800s
          Oj::Parser    768.894 (± 1.7%) i/s    (1.30 ms/i) -      3.871k in   5.036093s
           rapidjson    556.882 (± 2.7%) i/s    (1.80 ms/i) -      2.805k in   5.040970s

Comparison:
                json:      670.9 i/s
          Oj::Parser:      768.9 i/s - 1.15x  faster
                  oj:      618.9 i/s - 1.08x  slower
           rapidjson:      556.9 i/s - 1.20x  slower

== Parsing citm_catalog.json (1727030 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json    35.000 i/100ms
                  oj    32.000 i/100ms
          Oj::Parser    43.000 i/100ms
           rapidjson    39.000 i/100ms
Calculating -------------------------------------
                json    382.262 (± 2.6%) i/s    (2.62 ms/i) -      1.925k in   5.039336s
                  oj    348.721 (± 0.6%) i/s    (2.87 ms/i) -      1.760k in   5.047265s
          Oj::Parser    478.294 (± 0.6%) i/s    (2.09 ms/i) -      2.408k in   5.034798s
           rapidjson    398.740 (± 0.8%) i/s    (2.51 ms/i) -      2.028k in   5.086365s

Comparison:
                json:      382.3 i/s
          Oj::Parser:      478.3 i/s - 1.25x  faster
           rapidjson:      398.7 i/s - 1.04x  faster
                  oj:      348.7 i/s - 1.10x  slower

I'll squash and merge.

@byroot byroot marked this pull request as ready for review November 2, 2024 21:34
@byroot byroot merged commit 6d9694e into ruby:master Nov 2, 2024
36 checks passed
@byroot byroot deleted the key-cache branch November 2, 2024 21:37
byroot added a commit to byroot/json that referenced this pull request Nov 3, 2024
This is somewhat dead code as unless you are using `JSON::Parser.new`
direcltly we never allocate `JSON::Ext::Parser` anymore.

But still, we should mark all its reference in case some code out there
uses that.

Followup: ruby#675
byroot added a commit to byroot/json that referenced this pull request Nov 3, 2024
This is somewhat dead code as unless you are using `JSON::Parser.new`
direcltly we never allocate `JSON::Ext::Parser` anymore.

But still, we should mark all its reference in case some code out there
uses that.

Followup: ruby#675
casperisfine pushed a commit to Shopify/msgpack-ruby that referenced this pull request Nov 13, 2024
Heavily inspired from ruby/json#675.

When parsing documents with the same structure repeatedly, a lot
of time can be saved by keeping a small cache of map keys encountered.

Using the existing `bench/bench.rb`, comparing with and without the
cache shows a 30% performance improvement:

```
Calculating -------------------------------------
       unpack-pooled    960.380k (± 1.4%) i/s -      4.865M in   5.066600s
    unpack-key-cache      1.245M (± 1.6%) i/s -      6.232M in   5.009060s

Comparison:
       unpack-pooled:   960379.8 i/s
    unpack-key-cache:  1244517.6 i/s - 1.30x  (± 0.00) faster
```

However, on the same benchmark, but with the cache filled with other
keys, the performance is notably degraded:

```
Calculating -------------------------------------
       unpack-pooled    926.849k (± 2.1%) i/s -      4.639M in   5.007333s
    unpack-key-cache    822.266k (± 2.4%) i/s -      4.113M in   5.004645s

Comparison:
       unpack-pooled:   926849.2 i/s
    unpack-key-cache:   822265.6 i/s - 1.13x  (± 0.00) slower
```

So this feature is powerful but situational.
casperisfine pushed a commit to Shopify/msgpack-ruby that referenced this pull request Nov 13, 2024
Heavily inspired from ruby/json#675.

When parsing documents with the same structure repeatedly, a lot
of time can be saved by keeping a small cache of map keys encountered.

Using the existing `bench/bench.rb`, comparing with and without the
cache shows a 30% performance improvement:

```
Calculating -------------------------------------
       unpack-pooled    960.380k (± 1.4%) i/s -      4.865M in   5.066600s
    unpack-key-cache      1.245M (± 1.6%) i/s -      6.232M in   5.009060s

Comparison:
       unpack-pooled:   960379.8 i/s
    unpack-key-cache:  1244517.6 i/s - 1.30x  (± 0.00) faster
```

However, on the same benchmark, but with the cache filled with other
keys, the performance is notably degraded:

```
Calculating -------------------------------------
       unpack-pooled    926.849k (± 2.1%) i/s -      4.639M in   5.007333s
    unpack-key-cache    822.266k (± 2.4%) i/s -      4.113M in   5.004645s

Comparison:
       unpack-pooled:   926849.2 i/s
    unpack-key-cache:   822265.6 i/s - 1.13x  (± 0.00) slower
```

So this feature is powerful but situational.
casperisfine pushed a commit to Shopify/msgpack-ruby that referenced this pull request Nov 13, 2024
Heavily inspired from ruby/json#675.

When parsing documents with the same structure repeatedly, a lot
of time can be saved by keeping a small cache of map keys encountered.

Using the existing `bench/bench.rb`, comparing with and without the
cache shows a 30% performance improvement:

```
Calculating -------------------------------------
       unpack-pooled    960.380k (± 1.4%) i/s -      4.865M in   5.066600s
    unpack-key-cache      1.245M (± 1.6%) i/s -      6.232M in   5.009060s

Comparison:
       unpack-pooled:   960379.8 i/s
    unpack-key-cache:  1244517.6 i/s - 1.30x  (± 0.00) faster
```

However, on the same benchmark, but with the cache filled with other
keys, the performance is notably degraded:

```
Calculating -------------------------------------
       unpack-pooled    926.849k (± 2.1%) i/s -      4.639M in   5.007333s
    unpack-key-cache    822.266k (± 2.4%) i/s -      4.113M in   5.004645s

Comparison:
       unpack-pooled:   926849.2 i/s
    unpack-key-cache:   822265.6 i/s - 1.13x  (± 0.00) slower
```

So this feature is powerful but situational.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants