An Efficient Foundation for Big Data Processing on Modern Clusters

PhD dissertation, Vinayak Borkar, March 2016.

@phdthesis{borkar2016efficient,
  title={An Efficient Foundation for Big Data Processing on Modern Clusters},
  author={Borkar, Vinayak},
  school={University of California, Irvine},
  year=2016,
  month=03
}

In recent years, the world has seen an explosion in the amount of data being generated. Google proposed the MapReduce framework to allow programmers to easily process massive amounts of data in parallel using a cluster of shared-nothing commodity machines. What started out as a tool for human efficiency subsequently began to be used as an intermediate representation for queries compiled from higher-level declarative languages. In this thesis, we present an alternate software stack for building scalable Big Data systems. We specifically focus on two parts of the stack. Hyracks is a new partitioned-parallel runtime layer that provides an efficient, generalized model for executing data-processing jobs on a cluster of commodity machines. Algebricks is a compiler framework that helps to build high-level declarative language compilers for parallel processing on top of Hyracks.

vinayakb/phd-thesis
