Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[REVIEW] Parallelize Treelite to FIL conversion over trees #3396

Merged
merged 12 commits into from
Jan 26, 2021

Conversation

wphicks
Copy link
Contributor

@wphicks wphicks commented Jan 21, 2021

Convert Treelite trees to FIL in parallel
Improve speed of initial prediction for a model with 300 trees of approximately 4M nodes by an average of 5.9% on a system with 12 Intel Xeon Gold 6128 CPUs available

@wphicks wphicks requested a review from a team as a code owner January 21, 2021 19:01
@wphicks
Copy link
Contributor Author

wphicks commented Jan 21, 2021

Note that @canonizer previously proposed replacing std::vector for storing FIL nodes with a raw malloc call to avoid the initialization overhead. Profiling demonstrates that this provides only a 0.5% speedup relative to an initial resize call on the vector, which makes sense given the constructors for the FIL nodes. Furthermore, looking at the generated assembly for each approach on Godbolt, I'm fairly well convinced that the vast majority of that tiny slowdown when using std::vector is genuinely useful error handling. Given that, I think it's worth it to accept the slowdown in favor of better error-handling and maintainability. If we decide we do want that extra 0.5%, we can use make_unique to generate the array and avoid the raw malloc and free, as demonstrated in an earlier commit on this branch.

@wphicks wphicks changed the title Parallelize Treelite to FIL conversion over trees [REVIEW] Parallelize Treelite to FIL conversion over trees Jan 21, 2021
@wphicks wphicks added feature request New feature or request non-breaking Non-breaking change 3 - Ready for Review Ready for review by team and removed 3 - Ready for Review Ready for review by team labels Jan 21, 2021
@wphicks wphicks changed the title [REVIEW] Parallelize Treelite to FIL conversion over trees [WIP] Parallelize Treelite to FIL conversion over trees Jan 21, 2021
@wphicks wphicks changed the title [WIP] Parallelize Treelite to FIL conversion over trees [REVIEW] Parallelize Treelite to FIL conversion over trees Jan 21, 2021
@wphicks wphicks added the 3 - Ready for Review Ready for review by team label Jan 21, 2021
@codecov-io
Copy link

codecov-io commented Jan 21, 2021

Codecov Report

Merging #3396 (92a1548) into branch-0.18 (550121b) will increase coverage by 0.14%.
The diff coverage is 85.77%.

Impacted file tree graph

@@               Coverage Diff               @@
##           branch-0.18    #3396      +/-   ##
===============================================
+ Coverage        71.48%   71.63%   +0.14%     
===============================================
  Files              207      210       +3     
  Lines            16748    16924     +176     
===============================================
+ Hits             11973    12123     +150     
- Misses            4775     4801      +26     
Impacted Files Coverage Δ
python/cuml/decomposition/incremental_pca.py 94.70% <ø> (ø)
python/cuml/dask/ensemble/base.py 19.69% <30.43%> (+0.36%) ⬆️
python/cuml/ensemble/randomforestregressor.pyx 70.83% <44.44%> (ø)
...ython/cuml/dask/ensemble/randomforestclassifier.py 30.00% <50.00%> (+0.51%) ⬆️
python/cuml/dask/ensemble/randomforestregressor.py 35.08% <50.00%> (+0.54%) ⬆️
python/cuml/fil/fil.pyx 91.87% <60.00%> (-1.88%) ⬇️
python/cuml/ensemble/randomforestclassifier.pyx 73.72% <66.66%> (ø)
python/cuml/multiclass/multiclass.py 84.21% <84.21%> (ø)
python/cuml/model_selection/_split.py 90.35% <90.35%> (ø)
python/cuml/svm/svm_base.pyx 94.27% <91.30%> (-0.63%) ⬇️
... and 17 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c48f35a...92a1548. Read the comment docs.

Copy link
Contributor

@canonizer canonizer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just small technical comments, otherwise looks good.

cpp/src/fil/fil.cu Outdated Show resolved Hide resolved
cpp/src/fil/fil.cu Outdated Show resolved Hide resolved
cpp/src/fil/fil.cu Outdated Show resolved Hide resolved
cpp/src/fil/fil.cu Show resolved Hide resolved
@wphicks wphicks added 4 - Waiting on Reviewer Waiting for reviewer to review or respond and removed 3 - Ready for Review Ready for review by team labels Jan 23, 2021
Copy link
Contributor

@canonizer canonizer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved again.

@wphicks wphicks added 5 - Ready to Merge Testing and reviews complete, ready to merge and removed 4 - Waiting on Reviewer Waiting for reviewer to review or respond labels Jan 26, 2021
@JohnZed
Copy link
Contributor

JohnZed commented Jan 26, 2021

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 9ff478f into rapidsai:branch-0.18 Jan 26, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
5 - Ready to Merge Testing and reviews complete, ready to merge feature request New feature or request libcuml non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants