Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: avoid deadlock in indexer when processor errors #407

Merged
merged 1 commit into from
Mar 3, 2021

Conversation

iand
Copy link
Contributor

@iand iand commented Mar 3, 2021

No description provided.

@iand iand requested a review from frrist March 3, 2021 17:18
@@ -432,7 +432,7 @@ func (t *TipSetIndexer) runActorProcessor(ctx context.Context, p ActorProcessor,
}
}

func (t *TipSetIndexer) Close() error {
func (t *TipSetIndexer) closeProcessors() error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commentary: Should we be returning the errors in addition to logging them, or should we drop the return param from both of these close methods?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in this case we want to log the errors because we want to continue with all the close calls, not exit at first error.

We could remove the error return. I tend to keep them assuming that one day we will return an error and the caller should handle it. But that's not a very strong argument to keep it. What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be in favor of nixing the errors, it makes the code a bit more readable -- but these aren't strong feelings. I'll leave it up to you.

return nil
}

func (t *TipSetIndexer) Close() error {
// ensure there are no persist go routines left running
t.persistSlot <- struct{}{}
Copy link
Member

@frrist frrist Mar 3, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Im wondering if:

if len(t.persistSlot) == 0 {
  return
} else {
  t.persistSlot <- struct{}{}
}

might be safer?

For example, if the context is canceled, I'm not sure the channel will be emptied.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's potentially racy. Another way is to include a timeout:

select {
    case t.persistSlot <- struct{}{}:

    case time.After(3*time.Second):

}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think what's here is fine for now. I don't have enough inside into the time it takes to persist models to make an accurate guess on an appropriate timeout.

@iand iand merged commit 8f367a3 into master Mar 3, 2021
@iand iand deleted the fix/indexer-deadlock branch March 3, 2021 18:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants