Parallelization of fast_csr_mv #67
base: master
Conversation
I'm running the example code with and without this patch, and the 12 CPUs I have are all constantly at 100%, yet I don't see any speed improvement after applying the patch. It seems that Numba (or LLVM) is already deciding by default to parallelize some parts of the code.
I'm going to run a couple more benchmarks, but if the performance is the same I would be more inclined to let Numba decide which parts to parallelize, as it will likely do a better job than us at that (thinking for example of nested parallelizable loops) ;-)
And the parallelization I observe definitely comes from Numba, as I can get it down to using just one CPU with the environment variable NUMBA_NUM_THREADS=1.
For now, the only parallelization done in this part of the code base is for sampling the batches, once per epoch. Is it possible that the 100% CPU usage (which I also observe on my machines) comes from there being no deallocation between two calls to that function? What happens if we remove parallel=True from the batch sampling and put it on the matrix multiplication instead?
OK, I think I found why I was seeing all CPUs being used. The bottleneck is not the algorithm itself but computing the fw_gap used for reporting. This uses the full gradient, and hence NumPy's matrix-vector routines, which fire up all CPUs.
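One way to confirm this diagnosis is to cap only the BLAS thread pool and watch whether CPU usage drops during the gap computation. A hedged sketch using the third-party threadpoolctl package (not part of this repo; `A` and `x` are stand-in dense arrays for the full-gradient product):

```python
import numpy as np
from threadpoolctl import threadpool_limits

rng = np.random.default_rng(0)
A = rng.standard_normal((500, 500))
x = rng.standard_normal(500)

# Limit only the BLAS threads; Numba's thread pool is unaffected,
# so any remaining multi-core usage would come from the algorithm itself.
with threadpool_limits(limits=1, user_api="blas"):
    grad = A.T @ (A @ x)  # full-gradient-style products now run single-threaded

print(grad.shape)
```

With BLAS pinned to one thread, the reporting step no longer masks the scaling behavior of the algorithm under benchmark.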
Does that mean the patch would still help in practice? In applications you won't be computing the gap very often.
Parallelization for the sparse matrix multiplication, row-wise.
Cf. this thread, which reports nice speed-ups.