Initial support for column-split cpu predictor #8676

Merged
merged 13 commits into from
Jan 17, 2023

rongou
Contributor

@rongou rongou commented Jan 13, 2023

When data is split by column, produce predictions in two passes by leveraging bit vectors.

Part of #8424
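
To make the two-pass idea concrete, here is a minimal single-process sketch. It assumes each worker only sees its own columns (everything else looks missing locally), records one decision bit and one missing bit per tree node, and then combines the bits across workers before walking the tree. `Node`, `RowBits`, `LocalPass`, `Combine`, and `Predict` are hypothetical names for illustration, not the code in this PR, and `Combine` stands in for whatever collective operation the real implementation uses.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical toy tree node; a leaf has left == -1.
struct Node {
  int feature;
  float threshold;
  int left;
  int right;
  bool default_left;
  float leaf_value;
};

// One decision bit and one missing bit per tree node, for a single row.
struct RowBits {
  std::vector<bool> go_left;
  std::vector<bool> missing;
};

// Pass 1 (per worker): features the worker does not hold are NaN in local_row,
// so their nodes stay marked as missing on this worker.
RowBits LocalPass(const std::vector<Node>& tree, const std::vector<float>& local_row) {
  RowBits bits{std::vector<bool>(tree.size(), false), std::vector<bool>(tree.size(), true)};
  for (std::size_t i = 0; i < tree.size(); ++i) {
    if (tree[i].left < 0) continue;  // leaf: nothing to decide
    assert(static_cast<std::size_t>(tree[i].feature) < local_row.size());
    float v = local_row[tree[i].feature];
    if (!std::isnan(v)) {
      bits.missing[i] = false;
      bits.go_left[i] = v < tree[i].threshold;
    }
  }
  return bits;
}

// Stand-in for the cross-worker reduction: OR the decision bits, AND the missing bits.
RowBits Combine(const RowBits& a, const RowBits& b) {
  assert(a.go_left.size() == b.go_left.size());
  RowBits out = a;
  for (std::size_t i = 0; i < out.go_left.size(); ++i) {
    out.go_left[i] = a.go_left[i] || b.go_left[i];
    out.missing[i] = a.missing[i] && b.missing[i];
  }
  return out;
}

// Pass 2: walk the tree using only the combined bits, no feature values needed.
float Predict(const std::vector<Node>& tree, const RowBits& bits) {
  int nid = 0;
  while (tree[nid].left >= 0) {
    bool left = bits.missing[nid] ? tree[nid].default_left : bits.go_left[nid];
    nid = left ? tree[nid].left : tree[nid].right;
  }
  return tree[nid].leaf_value;
}
```

After the combine step every worker holds the same bits, so each can run the second pass independently and arrive at the same prediction.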

@rongou
Contributor Author

rongou commented Jan 13, 2023

@trivialfis @hcho3

@trivialfis
Member

Merging the master branch should fix the CI error.

if (prev_thread_temp_size < nthread) {
out->resize(nthread, RegTree::FVec());
// init thread buffers
static void InitThreadTemp(int nthread, std::vector<RegTree::FVec> *out) {
Member

Use an anonymous namespace instead of static if it's not intended to be used outside of the TU.
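
For illustration, a minimal sketch of what that could look like. `FVec` here is a hypothetical stand-in for `RegTree::FVec`, and the function body is an assumption based on the lines visible in the diff rather than the final code in this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for RegTree::FVec.
struct FVec {
  std::vector<float> data;
};

namespace {
// Internal linkage: the function is only visible inside this translation unit,
// which is what `static` on a free function gives as well. Unlike `static`,
// an anonymous namespace can also hide types declared inside it.
void InitThreadTemp(int nthread, std::vector<FVec>* out) {
  assert(nthread >= 0);
  auto prev_thread_temp_size = static_cast<int>(out->size());
  if (prev_thread_temp_size < nthread) {
    out->resize(nthread, FVec());  // grow only; never shrink existing buffers
  }
}
}  // namespace
```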

Contributor Author

Done.


CHECK_EQ(model_.param.size_leaf_vector, 0) << "size_leaf_vector is enforced to 0 so far";
// parallel over local batch
const auto nsize = static_cast<bst_omp_uint>(batch.Size());
Member

No need to set bst_omp... explicitly; use auto and ParallelFor will do the casting:

common::ParallelFor(n_blocks, n_threads, [&](auto block_id) {}); // auto block_id has the same type as `n_blocks`.
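
A small sketch of why the explicit cast is unnecessary, assuming the helper is templated on the index type. `ParallelForSketch` and `SumBlockIds` are hypothetical names, and the loop runs serially to keep the example self-contained; it only demonstrates the type deduction, not the threading.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical, serial stand-in for a ParallelFor-style helper.
// The callback receives the loop index with the same type as `size`.
template <typename Index, typename Func>
void ParallelForSketch(Index size, int n_threads, Func fn) {
  static_assert(std::is_integral<Index>::value, "Index must be an integer type");
  assert(n_threads > 0);
  for (Index i = 0; i < size; ++i) {
    fn(i);
  }
}

std::size_t SumBlockIds(std::size_t n_blocks) {
  std::size_t total = 0;
  ParallelForSketch(n_blocks, 1, [&](auto block_id) {
    // The generic lambda parameter is deduced from the index passed in.
    static_assert(std::is_same<decltype(block_id), std::size_t>::value,
                  "block_id has the same type as n_blocks");
    total += block_id;
  });
  return total;
}
```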

Contributor Author

Done.

std::vector<RegTree::FVec> feat_vecs_{};

std::size_t n_rows_;
std::vector<BitVector::value_type> decision_storage_{};
Member

Could you please add some comments about the layout of the storage? How to index it?
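
As an example of the kind of documentation meant here, this is one possible layout, purely an assumption and not necessarily the one used in this PR: a row-major bit matrix with one bit per (row, node) pair, packed into 8-bit words. `DecisionBits` and its members are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed layout: bit (row, node) lives at flat bit index row * n_nodes + node.
// Flat bit b is stored in word b / 8 at bit position b % 8.
class DecisionBits {
 public:
  DecisionBits(std::size_t n_rows, std::size_t n_nodes)
      : n_rows_(n_rows), n_nodes_(n_nodes), storage_((n_rows * n_nodes + 7) / 8, 0) {}

  void Set(std::size_t row, std::size_t node) {
    std::size_t bit = Index(row, node);
    storage_[bit / 8] |= static_cast<std::uint8_t>(1u << (bit % 8));
  }

  bool Test(std::size_t row, std::size_t node) const {
    std::size_t bit = Index(row, node);
    return ((storage_[bit / 8] >> (bit % 8)) & 1u) != 0;
  }

  std::size_t NumWords() const { return storage_.size(); }

 private:
  std::size_t Index(std::size_t row, std::size_t node) const {
    assert(row < n_rows_ && node < n_nodes_);
    return row * n_nodes_ + node;
  }

  std::size_t n_rows_;
  std::size_t n_nodes_;
  std::vector<std::uint8_t> storage_;
};
```

Stating the flat-index formula next to the member declaration is usually enough for a reader to index the storage correctly.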

Contributor Author

Done.

@trivialfis trivialfis merged commit 78396f8 into dmlc:master Jan 17, 2023
@rongou rongou deleted the split-col-pred branch September 25, 2023 16:42