make parser 70x times faster? #13

rkeytacked · 2023-09-08T08:16:08Z

Hi, I am working some time now on a really fast ISBN parser/formatter in Java. And for this Java library (https://github.com/creativecouple/isbn-validation-java/) I recently found a way to speed-up the parsing throughput from 6k to 13k ops/milliseconds (meaning just 75 nanoseconds per parse operation).

When trying out this approach for other programming languages, I found your isbn3 NPM library.
Your benchmark script was not able to measure that tiny amount of time correctly, so I put a 1000x loop around the parsing like this:

for (let i=0; i<1000; i++) {
  const data = isbns.map(isbn => parse(isbn))
}

My question is: Are you interested in rewriting your parsing engine?
Otherwise I would try to create a new npm package with that approach.

I compared your latest version of isbn3 to the old npm lib isbn and then my temporary prototype using either of these different imports:

const { parse } = require('..') // your latest version from Github
const { ISBN: { parse } } = require('isbn') // using npm i --no-save isbn (7 years old!!)
const { parse } = require('isbn-validation-js') // using npm i --no-save git://github.com/creativecouple/isbn-validation-java#tmp-javascript-version

This is the result on my machine with these three approaches:

ISBN3

$ npm run benchmark

load module: 5.498ms
parsed 1000x 5640 non-hyphenated ISBNs in: 1:06.333 (m:ss.mmm)
ISBN 978-0-00-443799-6 Group Name English language

ISBN (7 years old!!)

$ npm run benchmark

load module: 1.033ms
parsed 1000x 5640 non-hyphenated ISBNs in: 10.214s
ISBN 978-0-00-443799-6 Group Name English speaking area

my prototype from https://github.com/creativecouple/isbn-validation-java/tree/tmp-javascript-version

$ npm run benchmark

load module: 1.425ms
parsed 1000x 5640 non-hyphenated ISBNs in: 975.009ms
ISBN 978-0-00-443799-6 Group Name English language

So the old isbn package is still faster than your current version, but as you see it is possible to go sub-second for 5,640,000 parsing operations.

The text was updated successfully, but these errors were encountered:

maxlath · 2023-09-11T16:12:49Z

Hi!

So for the back-story, this module was forked from isbn2 as I attempted to better understand how it was working. It probably lost some perf optimizations on the way, but was more readable to me like this, and thus easier to maintain and keep open for future changes.

That said, performance boost that don't hinder the readability/maintainability are very welcome! Could you submit a PR with your new approach, ideally without breaking changes, and with some comments on how that works? That big hyphens array is quite cryptic ^^'

rkeytacked · 2023-09-11T17:57:21Z

Sure thing, I can try to come up with a PR that won't break the API.

Just one question for my understanding of API compatibility: Does the outcome of the parse operation have to be a plain JS object with all the fields? Or can it be a class object with getter properties instead?
(e.g. let isbn = parse("9781234567890") then isbn.group could be a getter only, which in my case increased the speed of the parse method)

maxlath · 2023-09-13T12:56:17Z

Returning a class object with getters would be a breaking change I think: for instance, passing that object to JSON.stringifiy will ignore the getters. There might be other scenario were that would introduce breaking behaviors, but I'm not so familiar with class objects as I rarely work with them. So I think it would be better to stick to plain JS objects for this PR. I would expect the biggest perf boost to rather come from the optimized hyphen lookup, right?

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

make parser 70x times faster? #13

make parser 70x times faster? #13

rkeytacked commented Sep 8, 2023

maxlath commented Sep 11, 2023

rkeytacked commented Sep 11, 2023

maxlath commented Sep 13, 2023 •

edited

Loading

make parser 70x times faster? #13

make parser 70x times faster? #13

Comments

rkeytacked commented Sep 8, 2023

ISBN3

ISBN (7 years old!!)

my prototype from https://github.com/creativecouple/isbn-validation-java/tree/tmp-javascript-version

maxlath commented Sep 11, 2023

rkeytacked commented Sep 11, 2023

maxlath commented Sep 13, 2023 • edited Loading

maxlath commented Sep 13, 2023 •

edited

Loading