Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(Chunker): JSON parsing performance #7171

Merged
merged 16 commits into from
Dec 21, 2020
Merged

fix(Chunker): JSON parsing performance #7171

merged 16 commits into from
Dec 21, 2020

Conversation

karlmcguire
Copy link
Contributor

@karlmcguire karlmcguire commented Dec 18, 2020

Fixes DGRAPH-2810.

About a 90% performance increase in NQuads/sec on the data used in the BenchmarkNoFacets() function.


This change is Reviewable

@karlmcguire karlmcguire added kind/enhancement Something could be better. area/performance Performance related issues. area/facets Issues related to face handling, querying, etc. area/parsing Issues related to the parser or lexer. labels Dec 18, 2020
@CLAassistant
Copy link

CLAassistant commented Dec 18, 2020

CLA assistant check
All committers have signed the CLA.

Copy link
Contributor

@manishrjain manishrjain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 4 of 4 files at r3.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @karlmcguire and @vvbalaji-dgraph)


go.mod, line 19 at r2 (raw file):

	github.com/blevesearch/bleve v1.0.13
	github.com/codahale/hdrhistogram v0.0.0-20161010025455-3a0bb77429bd
	github.com/davecgh/go-spew v1.1.1

remove?


go.mod, line 48 at r2 (raw file):

	github.com/hashicorp/vault/api v1.0.4
	github.com/minio/minio-go/v6 v6.0.55
	github.com/minio/simdjson-go v0.1.5

remove?


go.mod, line 52 at r3 (raw file):

	github.com/pkg/profile v1.2.1
	github.com/prometheus/client_golang v0.9.3
	github.com/prometheus/common v0.4.1 // indirect

revert go.mod. And then run go mod tidy.


chunker/json_parser.go, line 385 at r3 (raw file):

	for k, v := range m {
		if strings.Contains(k, x.FacetDelimeter) {
			mf[k] = v

I'd also delete them from m, so the iteration over m is faster.

Copy link
Contributor

@manishrjain manishrjain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @karlmcguire and @vvbalaji-dgraph)


chunker/json_parser.go, line 433 at r3 (raw file):

		// option.
		// We also skip facets here because we parse them with the corresponding predicate.
		if pred == "uid" || strings.Index(pred, x.FacetDelimeter) > 0 {

If m doesn't have facets, then you can save on the strings.Index search.

Copy link
Contributor Author

@karlmcguire karlmcguire left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 3 of 4 files reviewed, 5 unresolved discussions (waiting on @manishrjain and @vvbalaji-dgraph)


go.mod, line 19 at r2 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

remove?

Done.


go.mod, line 48 at r2 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

remove?

Done.


go.mod, line 52 at r3 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

revert go.mod. And then run go mod tidy.

Done.


chunker/json_parser.go, line 385 at r3 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

I'd also delete them from m, so the iteration over m is faster.

Done.


chunker/json_parser.go, line 433 at r3 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

If m doesn't have facets, then you can save on the strings.Index search.

Done.

Copy link
Contributor

@manishrjain manishrjain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: Nice! Feel free to squash and merge.

Reviewable status: 1 of 2 files reviewed, all discussions resolved (waiting on @manishrjain and @vvbalaji-dgraph)

@karlmcguire karlmcguire merged commit 24b0ae1 into master Dec 21, 2020
@karlmcguire karlmcguire deleted the karl/chunker branch December 21, 2020 15:48
danielmai pushed a commit that referenced this pull request Dec 21, 2020
* testing simdjson speed, saw a 4x improvement from encoding/json speed

* first pass finding facets

* walking the tree, currently not passing all tests

* walk function passes all tests

* three dimensional walk map

* 87% speed increase now

* uncomment testing function

* cleanup and documentation

* formatting fixes

* ignore deepsource warning

* fix tests

* simpler fix

* tidy

* remove comment

* delete from main map for faster iteration

* fix go.mod
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/facets Issues related to face handling, querying, etc. area/parsing Issues related to the parser or lexer. area/performance Performance related issues. kind/enhancement Something could be better.
Development

Successfully merging this pull request may close these issues.

3 participants