
parallel chunking routine and Python 2GB bottleneck for large statsmodels OLS #154

Open

turbach opened this issue Aug 2, 2019 · 0 comments

turbach (Collaborator) commented Aug 2, 2019

Appears to be a monkey-jar problem: the chunker ships jobs out to the pool that fit through Python's 2GB pickling bottleneck on the way in, but the statsmodels OLS fit inflates each chunk into a much larger results object, and the returns are too big to make it back through the bottleneck.
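The bottleneck is the size of the *pickled* return value, since multiprocessing serializes worker results before piping them back (older Pythons cap a single pipe payload at roughly 2GB). A minimal sketch of checking whether an object would squeeze back through, using only the stdlib (`TWO_GB` and the helper names are assumptions for illustration, not part of the codebase):

```python
import pickle

# Approximate single-payload ceiling for multiprocessing pipe returns
# on older Python versions (an assumption used for illustration).
TWO_GB = 2 * 1024 ** 3

def pickled_size(obj):
    """Serialized size in bytes, i.e. what has to fit back through the pipe."""
    return len(pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL))

def fits_through_pipe(obj, limit=TWO_GB):
    """True if the pickled object stays under the bottleneck."""
    return pickled_size(obj) < limit
```

Measuring the fitted results object this way, rather than the input chunk, is what exposes the inflation: the input can pass the check while its fitted return fails it.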

Possible workaround: run a small test fit to estimate the pickled size of the OLS return value, chunk so that each return stays under 2GB, and fall back to serial execution when no chunk size is small enough.
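The workaround above could be sketched roughly as follows. This is a hypothetical illustration, not the project's code: `fit_chunk` stands in for the real statsmodels OLS fit on one chunk, and `plan_chunks` probe-fits a few rows, extrapolates pickled-return bytes per row, and either sizes the chunks to fit under the limit or signals a serial fallback with `None`.

```python
import pickle

TWO_GB = 2 * 1024 ** 3

def fit_chunk(rows):
    # Hypothetical stand-in for the statsmodels OLS fit on one chunk;
    # the real RegressionResults object is far larger than the input rows.
    return {"rows": list(rows), "inflated": [0.0] * (len(rows) * 4)}

def plan_chunks(data, limit=TWO_GB, probe_rows=8):
    """Probe-fit a small chunk, extrapolate pickled return size per row,
    and split data into chunks whose returns fit through the bottleneck.
    Returns a list of chunks, or None to signal a serial fallback."""
    probe = data[:probe_rows]
    per_row = len(pickle.dumps(fit_chunk(probe))) / max(len(probe), 1)
    rows_per_chunk = int(limit // max(per_row, 1))
    if rows_per_chunk < 1:
        return None  # even one row's result is too big: run serially instead
    return [data[i:i + rows_per_chunk]
            for i in range(0, len(data), rows_per_chunk)]
```

The per-row extrapolation is only a rough estimate, so in practice a safety margin below the hard 2GB ceiling would be prudent before shipping chunks to the pool.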
