should we benchmark containment rather than similarity? #2

Closed
ctb opened this issue Oct 8, 2020 · 3 comments
Comments

Member

ctb commented Oct 8, 2020

In the Results section "Scaled MinHash sketches support efficient indexing for large-scale containment queries", tbl:search-runtime shows runtimes for similarity search. Two thoughts:

first, these are surprisingly slow :(.
second, these are for similarity, not containment.

My experience with containment and gather (which uses containment) is that these are pretty fast operations; I rather rarely use similarity. Moreover, the whole paper is more focused on containment than similarity anyway.
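For readers less familiar with the distinction, here is a minimal sketch of the two measures on plain Python sets of hash values. It is an illustration only, not the paper's benchmark or sourmash's implementation; the function names and the toy `query`/`subject` sets are made up for this example.

```python
def jaccard_similarity(a: set, b: set) -> float:
    """Size of the intersection divided by the size of the union (symmetric)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def containment(query: set, subject: set) -> float:
    """Fraction of the query's hashes that also occur in the subject (asymmetric)."""
    if not query:
        return 0.0
    return len(query & subject) / len(query)


# Hypothetical toy data: a small query fully contained in a much larger subject.
query = set(range(100))
subject = set(range(10_000))

sim = jaccard_similarity(query, subject)
cont = containment(query, subject)
print(f"similarity={sim:.2f} containment={cont:.2f}")
```

With these toy sets the query is entirely inside the subject, so containment is high while similarity is dragged down by the size difference. That asymmetry is why containment is the more natural measure when looking for a small genome inside a large collection.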

Should we refocus this benchmark on containment?

Member Author

ctb commented Oct 28, 2020

yes, I think we should. :)

Member Author

ctb commented Dec 2, 2020

given the stuff going on with greyhound, we are going to ignore performance in this paper (beyond implying that it's acceptable, 'cause here are the results).

ctb closed this as completed Dec 2, 2020
Member Author

ctb commented Dec 2, 2020

(and in fact we are removing that entire section as part of the shift to #10)
