Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix MinHash.merge(...) in rust #1446

Open
ctb opened this issue Apr 7, 2021 · 4 comments
Open

fix MinHash.merge(...) in rust #1446

ctb opened this issue Apr 7, 2021 · 4 comments
Labels
good next issue An issue that should be ready to resolve. rust

Comments

@ctb
Copy link
Contributor

ctb commented Apr 7, 2021

In #1395, I detail some fun ways to make the rust code return invalid MinHash objects because merge does Bad Things when handed mixed pairs of abundance/non-abundance sketches.

This is an issue corresponding to that PR.

@ctb ctb added rust good next issue An issue that should be ready to resolve. labels Apr 7, 2021
@ctb
Copy link
Contributor Author

ctb commented Apr 7, 2021

As part of the larger "What shall we do with MinHash objects?" discussion, note that here we want to come up with tests and (hopefully simple) fixes so that things don't break. Down the road we might want to also think about splitting abund and noabund MinHash objects so that an explicit conversion from abund to noabund is needed, similarly to what we are talking about with num and scaled MinHash objects in #1354 (comment).

@keyabarve
Copy link
Contributor

@ctb I'm a little confused about what exactly needs to be done here. Could you please clarify?

@ctb
Copy link
Contributor Author

ctb commented Apr 15, 2021

hi @keyabarve the most straightforward option is to take over #1395 and fix all of the issues - basically, do systematic tests of all of the if/else branches in the Rust merge code starting in Python land, and then fix the Rust merge code to pass all of the tests. Does that make sense?

@ctb
Copy link
Contributor Author

ctb commented Apr 15, 2021

to get started, try checking out that branch with git fetch; git checkout fix/minhash_abund_merge and then do a trial removal of all of the changes to src/core/src/sketch/minhash.rs to the latest branch with git show latest:src/core/src/sketch/minhash.rs > src/core/src/sketch/minhash.rs. At that point you should see that the tests in tests/test__minhash.py don't work any more!

So your task would be to fix the code in src/core/src/sketch/minhash.rs to pass those tests, as well as seeing if there are other problems with the merge code that need fixing (as well as tests).

Note that you can get back the right version of src/core/src/sketch/minhash.rs on this branch with

git checkout src/core/src/sketch/minhash.rs

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
good next issue An issue that should be ready to resolve. rust
Projects
None yet
Development

No branches or pull requests

2 participants