machine-learning-vm

Note: if you just want to set up a running Spark virtual machine, you do not need this project. Use the ml-notebook project instead; it downloads the packaged base box and launches the VM automatically. This project is for building the VM from scratch.

This project contains the files needed to generate a virtual machine for Machine Learning/Data Science tasks. When the virtual machine is provisioned, every required piece of software is downloaded from the Internet. To see what is included in the virtual machine and what has changed between versions, look at the ChangeLog file.

The VM is managed through Vagrant. Software requirements for the host are:

Vagrant 2.1 or above (if possible, use the latest version available)
VirtualBox 6.0 or above
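
The minimum versions above can be checked before starting a build. The following is a minimal sketch, not part of this project: the function names `version_ge` and `check_host` are hypothetical, and the parsing assumes that `vagrant --version` prints `Vagrant X.Y.Z` and that `VBoxManage --version` prints a version followed by a revision suffix (such as `6.1.50r161033`). It also assumes a `sort` that supports `-V` (GNU coreutils).

```shell
#!/bin/sh
# version_ge A B: succeeds when version A is greater than or equal to version B.
# Uses `sort -V` (version sort): if B sorts first (or equals A), then A >= B.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_host: hypothetical helper comparing the installed tools with the
# minimum versions listed in this README. The output formats parsed below
# are assumptions; adjust the parsing if your tools print something else.
check_host() {
    vagrant_version="$(vagrant --version | awk '{print $2}')"
    vbox_version="$(VBoxManage --version | sed 's/[^0-9.].*//')"
    version_ge "$vagrant_version" 2.1 || {
        echo "Vagrant 2.1 or above is required" >&2
        return 1
    }
    version_ge "$vbox_version" 6.0 || {
        echo "VirtualBox 6.0 or above is required" >&2
        return 1
    }
}
```

Calling `check_host` on the host returns a non-zero status and prints a message when either tool is missing or too old.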

Process

The project creates a "base" VM that has all the needed software installed but is not fully configured to work. Another subproject, defined as a submodule and hosted in the ml-notebook repository, uses its own Vagrantfile to configure the VM as a Spark system accessed through Jupyter Notebook. That subproject uses the "base" VM as the Vagrant box to start from.

So the complete creation is a two-step process:

The first step takes place here; the spark-base64 box it produces is manually uploaded to Vagrant Cloud
The second step is implemented in the Vagrantfile in ml-notebook; it downloads the spark-base64 box from the cloud and finalizes the configuration
```
  starting box   --->     base box     --->  final VM
 [ubuntu 22.04]        [spark-base64]
```
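
The two steps can be sketched as Vagrant commands. This is an illustration under assumptions rather than the project's actual build procedure: the helper names (`run`, `build_base_box`, `start_final_vm`) and the output file name `spark-base64.box` are hypothetical, and using `vagrant package` to export the box is assumed, not confirmed by this README. Each function is meant to be called from the directory holding the corresponding Vagrantfile. By default the commands are only printed; set `RUN=1` to execute them.

```shell
#!/bin/sh
# run: print the command (dry run) unless RUN=1, in which case execute it.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

# Step 1 (this project): provision the base VM, then export it as a box file.
# The resulting file is what would be uploaded manually to Vagrant Cloud.
build_base_box() {
    run vagrant up
    run vagrant package --output spark-base64.box
}

# Step 2 (ml-notebook): its Vagrantfile downloads the base box from the
# cloud and finalizes the configuration when the VM is started.
start_final_vm() {
    run vagrant up
}
```

The upload between the two steps is left out on purpose, since the text above describes it as a manual operation.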

There is an additional submodule, nbextensions, which contains the Jupyter Notebook extensions that will be copied to the base VM. Note that by default they are not configured to be included automatically in notebooks; this is again taken care of in the Vagrantfile for the ml-notebook subproject.

Missing bits

The base Vagrantfile in this project is self-contained (downloads everything needed from public repositories), with a few exceptions.
