Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor rebuild index logic. #2851

Merged
merged 10 commits into from
Jan 3, 2019
Merged

Conversation

martinmr
Copy link
Contributor

@martinmr martinmr commented Dec 28, 2018

  • Delete the index.go file in the working module. The callers of the
    deleted methods will call the functions in the posting module
    directly.
  • The rebuild index functions delete the index first.
  • The rebuild index functions figure out themselves if the index needs
    to be rebuilt after the deletion, instead of delegating that task to
    the callers.
  • Simplify delete logic.

No fuctional changes.


This change is Reviewable

* Delete the index.go file in the working module. The callers of the
  deleted methods will call the functions in the posting module
  directly.
* The rebuild index functions delete the index first.
* The rebuild index functions figure out themselves if the index needs
  to be rebuilt after the deletion, instead of delegating that task to
  the callers.
* Simplify delete logic.

No fuctional changes.
@martinmr martinmr force-pushed the martinmr/indexing-optimization branch from 766cc2a to acf6a1e Compare December 29, 2018 01:48
* Move all reindexing logic to the postings package.
* Simplify logic.
* Fix formatting and some linting issues.
Copy link
Contributor

@manishrjain manishrjain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 5 files reviewed, 4 unresolved discussions (waiting on @martinmr and @manishrjain)


posting/index.go, line 588 at r1 (raw file):

// RebuildIndicesRequest holds the info needed to initiate a rebuilt of the indices.
type RebuildIndicesRequest struct {

You can drop the Request suffix.


posting/index.go, line 596 at r1 (raw file):

// RebuildAllIndices rebuilds all indices that need it.
func RebuildAllIndices(ctx context.Context, req *RebuildIndicesRequest) error {

func (req *RebuildIndices) Run(ctx context.Context) error


posting/index.go, line 621 at r1 (raw file):

	}
	// Index needs to be rebuilt if the scheme directive changed.
	if (current.Directive == pb.SchemaUpdate_INDEX) != (old.Directive == pb.SchemaUpdate_INDEX) {

Merge this and next if.

indexNow := current.Directive == pb.Index
indexPrev := old.Directive == pb.Index

if !indexNow {
return false?? <- Seems like it to me? Or, does this function mean rebuild or delete, and therefore this must be true?
}

if !indexPrex { return true } // Was not previously indexed.
if current.ValType != old.ValType { return true } // Was previously indexed, but with a different value type.


posting/index.go, line 633 at r1 (raw file):

	}
	for i, t := range old.Tokenizer {
		if current.Tokenizer[i] != t {

Do we know that the ordering must be the same?

posting/index.go Outdated
// Delete index entries from data store.
if err := deleteCountIndex(attr, false); err != nil {
func deleteCountIndex(attr string) error {
if err := deleteCountIndexInternal(attr, /*reverse*/ false); err != nil {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

File is not goimports-ed (from goimports)

Copy link
Contributor Author

@martinmr martinmr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 5 files reviewed, 5 unresolved discussions (waiting on @manishrjain and @golangcibot)


posting/index.go, line 588 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

You can drop the Request suffix.

Done.


posting/index.go, line 596 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

func (req *RebuildIndices) Run(ctx context.Context) error

Done.


posting/index.go, line 621 at r1 (raw file):

return false?? <- Seems like it to me? Or, does this function mean rebuild or delete, and therefore this must be true?

This function is in fact rebuild or delete. I have added comments on top of these functions explaining this.

I have also simplified the logic a bit.


posting/index.go, line 633 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Do we know that the ordering must be the same?

I didn't think so but I originally kept the logic. I have changed it to use a set to test for equality instead of checking the ordering.


posting/index.go, line 429 at r2 (raw file):

Previously, golangcibot (Bot from GolangCI) wrote…

File is not goimports-ed (from goimports)

Done.

Copy link
Contributor

@manishrjain manishrjain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm_strong: Fix the few issues before merging.

Reviewed 2 of 5 files at r1, 3 of 3 files at r3.
Reviewable status: all files reviewed, 9 unresolved discussions (waiting on @manishrjain, @golangcibot, and @martinmr)


posting/index.go, line 423 at r3 (raw file):

}

func deleteCountIndexInternal(attr string, reverse bool) error {

Remove this func.


posting/index.go, line 430 at r3 (raw file):

func deleteCountIndex(attr string) error {
	if err := deleteCountIndexInternal(attr, false /*reverse*/); err != nil {
pk := ...
if err := deleteAllEntries(pk.CountPrefix(false)); err != nil { .. }

return deleteAllEntries(pk.CountPrefix(true))

posting/index.go, line 569 at r3 (raw file):

	// If no previous schema existed, the index needs to be built.
	if old == nil {
		return true

if old == nil { old = &pb.SchemaUpdate{}}

Then it would work through the logic correctly. Same elsewhere.


posting/index.go, line 593 at r3 (raw file):

	// Index needs to be rebuilt if the tokenizers have changed
	prevTokens := mapset.NewSet()

Remove mapset library.


posting/index.go, line 654 at r3 (raw file):

	if old == nil {
		return true

here and elsewhere.


posting/index.go, line 722 at r3 (raw file):

	if old == nil {
		return true

same here. Should just set this to empty.


posting/index.go, line 777 at r3 (raw file):

	if old == nil {
		return false, nil

and here.

Copy link
Contributor

@manishrjain manishrjain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 10 unresolved discussions (waiting on @manishrjain, @golangcibot, and @martinmr)


posting/index_test.go, line 350 at r3 (raw file):

	s2 := pb.SchemaUpdate{ValueType: pb.Posting_UID}
	require.False(t, needsIndexRebuild(&s1, &s2))
	require.True(t, needsIndexRebuild(nil, &s2))

for old=nil, new=does not have count, current logic would return true for rebuildCount and rebuildList, etc. Test for that.

Copy link
Contributor Author

@martinmr martinmr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 3 of 5 files reviewed, 10 unresolved discussions (waiting on @manishrjain and @golangcibot)


posting/index.go, line 423 at r3 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Remove this func.

Done.


posting/index.go, line 430 at r3 (raw file):

Previously, manishrjain (Manish R Jain) wrote…
pk := ...
if err := deleteAllEntries(pk.CountPrefix(false)); err != nil { .. }

return deleteAllEntries(pk.CountPrefix(true))

Done.


posting/index.go, line 569 at r3 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

if old == nil { old = &pb.SchemaUpdate{}}

Then it would work through the logic correctly. Same elsewhere.

Done.


posting/index.go, line 593 at r3 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Remove mapset library.

Done.


posting/index.go, line 654 at r3 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

here and elsewhere.

Done.


posting/index.go, line 722 at r3 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

same here. Should just set this to empty.

Done.


posting/index.go, line 777 at r3 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

and here.

I left this one intact. This change didn't make sense in this function since we only want this to return true if the schema existed before. Applying this suggestion caused multiple tests to fail.


posting/index_test.go, line 350 at r3 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

for old=nil, new=does not have count, current logic would return true for rebuildCount and rebuildList, etc. Test for that.

Done.

@martinmr martinmr dismissed manishrjain’s stale review January 3, 2019 01:37

Dismissing last review because it's been addressed and LGTM was given already.

@martinmr martinmr merged commit 0caeb41 into master Jan 3, 2019
@martinmr martinmr deleted the martinmr/indexing-optimization branch January 3, 2019 01:47
@manishrjain
Copy link
Contributor

!current.Count == !old.Count
This is confusing. No different from: current.Count == old.Count. Can you fix?

dna2github pushed a commit to dna2fork/dgraph that referenced this pull request Jul 19, 2019
* Refactor rebuild index logic.

* Delete the index.go file in the working module. The callers of the
  deleted methods will call the functions in the posting module
  directly.
* The rebuild index functions delete the index first.
* The rebuild index functions figure out themselves if the index needs
  to be rebuilt after the deletion, instead of delegating that task to
  the callers.
* Simplify delete logic.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants