Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

query k-mer coordinates #337

Merged
merged 13 commits into from
Jul 17, 2021
Merged

query k-mer coordinates #337

merged 13 commits into from
Jul 17, 2021

Conversation

karasikov
Copy link
Member

No description provided.

@karasikov karasikov marked this pull request as ready for review July 15, 2021 09:19
@karasikov karasikov requested a review from hmusta July 15, 2021 09:20
@karasikov karasikov changed the title query k-mer coordinates, WIP query k-mer coordinates Jul 15, 2021
Copy link
Collaborator

@hmusta hmusta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good so far. I'll propose a list of TODOs

  • A few integration tests to make sure serializing/loading works
  • Unit tests
  • Make get_row_tuples in TupleCSCMatrix more cache efficient
  • Implement TupleCSRMatrix (optional?)

Comment on lines +143 to +156
// TODO: reshape?
for (size_t i = 0; i < rows.size(); ++i) {
row_tuples[i].reserve(column_ranks[i].size());
for (auto [j, r] : column_ranks[i]) {
assert(r >= 1 && "matches can't have zero-rank");
size_t begin = delimiters_[j].select1(r) + 1 - r;
size_t end = delimiters_[j].select1(r + 1) - r;
Tuple tuple;
tuple.reserve(end - begin);
for (size_t t = begin; t < end; ++t) {
tuple.push_back(column_values_[j][t]);
}
row_tuples[i].emplace_back(j, std::move(tuple));
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, I think it might be more cache efficient to first aggregate a list of all accessed column IDs, then iterate through each column once to populate this vector.

@karasikov karasikov requested a review from hmusta July 15, 2021 19:23
metagraph/src/cli/query.cpp Outdated Show resolved Hide resolved
metagraph/src/graph/annotated_dbg.cpp Outdated Show resolved Hide resolved
@karasikov karasikov requested a review from hmusta July 15, 2021 21:43
@karasikov karasikov merged commit 8325931 into master Jul 17, 2021
@karasikov karasikov deleted the query_coord branch July 17, 2021 14:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants