Public ML-programming courses carried out by NGI. Focusing on geotechnical challenges and datasets.

norwegian-geotechnical-institute/ML_course_2022_0

Machine learning workshop series spring / early summer 2022

This repository contains code for the NGI internal Machine Learning workshop presented in spring / early summer 2022. We upload code to this repo after each session.

Before you start to code, make sure you have installed:

  • An IDE: VSCode, Spyder, Atom, PyCharm etc.
  • A package and environment manager. In the coding sessions we give instructions for conda, but feel free to use other systems such as pipenv. Conda can be downloaded either with a GUI, as Anaconda (https://www.anaconda.com/products/distribution), or as miniconda (https://docs.conda.io/en/latest/miniconda.html), which comes without the GUI and the many extra packages in Anaconda that you probably don't need :-)
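
One way to check from a terminal that the command-line tools used later in this README are available is sketched below. The helper name check_tool is made up for this example and is not part of the repo; conda and git are the commands used in the sections further down.

```shell
# Hypothetical helper (not part of the repo): report whether a command is on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: not found"
        return 1
    fi
}

# Check the tools this README relies on.
# '|| true' keeps the script going if one of them is missing.
check_tool conda || true
check_tool git || true
```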

Setup

Build the directory structure:

├── Data
│   ├── raw                 <- Raw data from third party sources.
│   ├── processed           <- Processed data ready for modelling
├── Figures                 <- Saved figures from processing and results
├── src                     <- Script files with functionality

Alternatively, run the bash script in the repo to set up the structure:

bash make_data_structure.sh
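
If you prefer to create the folders by hand, the tree above can be built with a single mkdir call. This is a minimal sketch of equivalent commands, not the contents of make_data_structure.sh, which may differ.

```shell
# Create the directory tree described above, relative to the current directory.
# -p creates parent directories as needed and does not fail if they already exist.
mkdir -p Data/raw Data/processed Figures src
```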

Download and save the dataset into the Data/raw directory.

Good practice for scientific development

First, we would like to mention three excellent papers that describe good practice in scientific computing.

Version control system

Git is a version control system. Follow the steps below to get the code onto your computer. You only need to do this once.

  1. Install git

  2. In a Linux or Windows terminal, navigate to the directory where you want to store your coding projects.

  3. Clone the repo with:

    git clone <url copied from repo>

Environment

Use one environment for each coding project. These 5 sessions use the same datasets and are essentially parts of the same project, so it is fine to use the same environment for all of them.

  1. Create an environment called ml_sessions_2022 from environment.yaml using conda. If you get pip errors, install the pip libraries manually, e.g. pip install pandas

    conda env create --file environment.yaml
  2. Activate the new environment with:

    conda activate ml_sessions_2022

Version control

We recommend registering your own GitHub account and creating a repo for your code from the workshop sessions. After every session, push the code to your personal GitHub repo. This is a good exercise in version control!
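
The commit-and-push routine after a session might look like the sketch below. To keep it runnable without an account or network access, a local bare repository stands in for your personal GitHub repo. The directory names, the file session_1.py, the commit message and the demo identity are all placeholders chosen for this example.

```shell
# Sketch only: a local bare repository stands in for your personal GitHub repo.
workdir=$(mktemp -d)
git init --quiet --bare "$workdir/remote.git"      # stand-in for the GitHub repo

git init --quiet "$workdir/my_sessions"            # your local copy of the session code
cd "$workdir/my_sessions"
echo "print('session 1')" > session_1.py           # placeholder file name and content

git add session_1.py
# -c sets a throwaway identity for this demo only;
# normally you configure user.name and user.email once.
git -c user.name="Demo User" -c user.email="demo@example.com" \
    commit --quiet -m "Add code from session 1"

# With a real GitHub repo, replace the path with the URL copied from your repo page.
git remote add origin "$workdir/remote.git"
git push --quiet -u origin HEAD                    # push the current branch
```

Pushing HEAD sends whichever branch you are on, so the sketch works regardless of whether your default branch is called main or master.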

Before each coding session

We recommend the following steps before each workshop session:

  1. Get the latest code from the repo. You will probably be asked for a git token for authentication:

    git pull
  2. Update the necessary libraries and activate the environment by calling:

    conda env update --file environment.yaml
    conda activate ml_sessions_2022
