Short tour of parallel and foreach packages, and how to think about scaling data analyses


Beyond Single Core: Parallel Analysis in R

R is a great environment for interactive analysis on your desktop, but when your data needs outgrow your personal computer, it's not clear what to do next.

This is material for a short overview of scalable data analysis in R. The slides can be viewed at https://ljdursi.github.io/beyond-single-core-R .

It covers:

  • How to think about parallelism and scalability in data analysis;
  • The standard parallel package, which incorporates what were formerly the snow and multicore packages, using airline data as an example;
  • The foreach package, using airline data and simple stock data; and
  • A summary of best practices.
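
As a rough illustration of the packages listed above, here is a minimal sketch that runs the same toy task three ways: with forked workers (the multicore-style `mclapply`), with an explicit cluster (the snow-style `parLapply`), and with `foreach`. It is not taken from the talk materials. The function `slow_square` and the input vector are hypothetical stand-ins for a real per-chunk analysis, and the sketch assumes the `foreach` and `doParallel` packages are installed and that two worker processes are available.

```r
library(parallel)
library(foreach)
library(doParallel)   # assumed installed; supplies a parallel backend for foreach

# Hypothetical stand-in for an expensive per-item analysis step
slow_square <- function(x) {
  Sys.sleep(0.01)
  x^2
}

inputs <- 1:8
res_serial <- lapply(inputs, slow_square)

# multicore-style: forked workers (forking is unavailable on Windows,
# where mc.cores must be 1)
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L
res_fork <- mclapply(inputs, slow_square, mc.cores = n_cores)

# snow-style: an explicit cluster of worker processes, usable on all platforms
cl <- makeCluster(2)
res_snow <- parLapply(cl, inputs, slow_square)

# foreach: the same loop, with the cluster registered as the backend
registerDoParallel(cl)
res_foreach <- foreach(x = inputs) %dopar% slow_square(x)

stopCluster(cl)
```

All three parallel results should match the serial `lapply` result. For a task this small the cost of starting workers dominates, so any speedup only appears once each item does substantial work.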

Included in the materials, though not in the talk, are some more advanced methods:

  • The bigmemory package for out-of-core computation on large data matrices, with a simple physical sciences example;
  • The Rdsm package for shared memory; and
  • A brief introduction to the powerful pbdR packages for extremely large-scale computation.
