Use doRNG and foreach for reproducible parallel bootstrapping #46

Open
xrobin opened this issue Apr 7, 2019 · 0 comments
Labels
api-change (the issue describes a change in the API visible to the user), feature-request

Comments

xrobin commented Apr 7, 2019

The plyr package is old, and newer, better options now exist for parallel execution. The foreach package seems to be the way to go: it supports several backends, and the doRNG package makes parallel calculations reproducible.

Interface from the user perspective would look like:

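library(doParallel) # also attaches foreach and parallel
library(doRNG)
cl <- makeCluster(2) # 2 worker processes
registerDoParallel(cl)
registerDoRNG(1234) # seed that makes %dopar% loops reproducible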
ci(...)
stopCluster(cl)

Internally we would simply have:

resampled.values <- foreach(i=1:boot.n) %dopar% { stratified.bootstrap.test(...) }

instead of

resampled.values <- laply(1:boot.n, stratified.bootstrap.test, ...)
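
The reproducibility claim can be checked with a small sketch. It assumes the foreach, doParallel and doRNG packages are installed; `boot.stat` is a hypothetical stand-in for `stratified.bootstrap.test` (it is not pROC code), and the simulated `controls` and `cases` vectors are made up for illustration. Running the same loop twice after `registerDoRNG(1234)` should give identical results:

```r
library(foreach)
library(doParallel)
library(doRNG)

# Made-up data for illustration only.
set.seed(42)
controls <- rnorm(50)
cases <- rnorm(50, mean = 1)
boot.n <- 200

# Hypothetical stand-in for stratified.bootstrap.test(): resample each
# class separately with replacement and return a difference in means.
boot.stat <- function(controls, cases) {
  mean(sample(cases, replace = TRUE)) - mean(sample(controls, replace = TRUE))
}

cl <- makeCluster(2)      # 2 worker processes
registerDoParallel(cl)

registerDoRNG(1234)       # seed the next %dopar% loop
first <- foreach(i = seq_len(boot.n), .combine = c) %dopar% {
  boot.stat(controls, cases)
}

registerDoRNG(1234)       # same seed again
second <- foreach(i = seq_len(boot.n), .combine = c) %dopar% {
  boot.stat(controls, cases)
}

stopCluster(cl)

# Compare the values only, ignoring any attributes doRNG attaches.
reproducible <- identical(as.numeric(first), as.numeric(second))
```

Note that `foreach` returns a list by default, whereas `laply` returns an array, so a `.combine` function (here `c`) would probably be needed to keep the shape of `resampled.values` unchanged.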

Things to consider:

  • Code should run without any extra setup code from the user (in which case it runs sequentially rather than in parallel)
  • Progress bars?
  • What if some of the bootstrapping gets implemented in C++ in the future?
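
For the first point, one possible approach is to pick the loop operator at run time. This is only a sketch: `run.resampling` and its arguments are hypothetical names, not part of pROC. It relies on `getDoParRegistered()` from foreach, which reports whether a parallel backend has been registered, and falls back to the sequential `%do%` operator otherwise:

```r
library(foreach)

# Hypothetical helper: use %dopar% only if the user registered a backend,
# otherwise run the same loop sequentially with %do%.
run.resampling <- function(boot.n, stat, x) {
  `%op%` <- if (getDoParRegistered()) `%dopar%` else `%do%`
  foreach(i = seq_len(boot.n), .combine = c) %op% {
    stat(x)
  }
}

# Usage without any setup from the user: runs sequentially.
set.seed(1)
x <- rnorm(30)
resampled <- run.resampling(
  boot.n = 10,
  stat = function(x) mean(sample(x, replace = TRUE)),
  x = x
)
```

Calling `%dopar%` directly with no registered backend also runs sequentially, but foreach emits a warning the first time, which is why an explicit check may be preferable.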
xrobin added the feature-request and api-change labels on Apr 7, 2019