Thaliehln/ds4ph

Data Science for Public Health

This workshop is being developed by the Ifakara Health Institute (IHI) and the Swiss Tropical & Public Health Institute (Swiss TPH), with the support of the Leading House Africa (LHA), which promotes and fosters bilateral collaboration with partner institutions in Africa.


Welcome!


🗓️ September 26-28, 2022
🕘 09:00 - 17:00
🌇 Dar es Salaam, Tanzania (Protea Hotel by Marriott Dar es Salaam Courtyard)

Overview

Data science and artificial intelligence have the potential to generate fundamentally new insights on global health policies in Africa, but the full realization of this potential depends on the availability of a critical mass of highly trained health data scientists on the continent. The goal of this project is to jointly develop and implement a public health data science curriculum to enable researchers from the Ifakara Health Institute (IHI) in Tanzania to strengthen their expertise in the area.

Learning objectives

Two complementary aspects of moving into data science are:

  1. the mindset: how scientists think about and collaborate around data, and
  2. the skillset: an ecosystem of tools (mostly open-source) and practices.

Upon completing the workshop, participants will have gained:

  • exposure to the data science approach, tools, and collaborative practices
  • hands-on experience in interfacing between Stata and R, working with data in R/RStudio at a basic level, and incrementally incorporating R into existing Stata data analysis workflows. The idea is not to move everything you do in Stata into R, but to give you a foundation so that you can continue learning at your own pace after the workshop.

Is this workshop for me?

This workshop is relevant for individuals who answer yes to the following questions:

  • Do you want to develop data science projects in public health?
  • Do you want to learn more about how open and reproducible science approaches can be used in your daily practice?
  • Are you a user of Stata (or any other data analysis language) who would like to expand your data analysis skillset with R?
  • Do you want to bridge analyses between data analysis tools (Stata, R or Python) and to more easily collaborate with other researchers who use another of these tools?
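
As a small illustration of bridging tools, one common approach is to exchange data through a plain CSV file, which Stata (`import delimited`), R (`read.csv()`), and Python can all read. The sketch below is not part of the workshop material: the records and the helper names `to_csv_text` and `from_csv_text` are hypothetical, and it uses only the Python standard library.

```python
import csv
import io

# Hypothetical toy records, invented purely for illustration.
records = [
    {"facility_id": "F01", "visits": 120},
    {"facility_id": "F02", "visits": 85},
]


def to_csv_text(rows):
    """Serialize a list of dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def from_csv_text(text):
    """Parse CSV text back into a list of dicts (values come back as strings)."""
    return list(csv.DictReader(io.StringIO(text)))


csv_text = to_csv_text(records)
round_trip = from_csv_text(csv_text)
```

CSV does not carry variable labels or types, so numeric columns come back as strings here and need converting after import. Richer formats exist, but CSV is the lowest common denominator when collaborators use different tools.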

Schedule

| Time (group) | Day 1 (stream 1) | Day 1 (stream 2) | Day 2 (stream 1) | Day 2 (stream 2) |
| --- | --- | --- | --- | --- |
| 8:30-9:00 | Welcome | Welcome | Welcome | Welcome |
| 9:00-10:30 | Data science introduction | Data science introduction | Sharing | Sharing |
| Break | | | | |
| 11:00-11:30 | Introduction to Git | Introduction to Git | | |
| 11:30-13:00 | Public Health question (need) | Public Health question (data strategy) | TBD | Machine learning analysis |
| Lunch break | | | | |
| 14:00-15:30 | Analysis Plan | R, RStudio and RMarkdown | Sharing | Sharing |
| Break | | | | |
| 16:00-17:30 | N/A | ODK Central API, ruODK | Wrap-up | Wrap-up |

This project aims to help researchers progress along the following development axes:

  • Data science mindset
  • Data science skillset
    • Programming tools
      • Move from Stata to R (prerequisite: Stata)
      • R programming
        • dplyr
      • Python programming
        • pandas
        • scikit-learn (prerequisite: independent Python user)
    • Coding with best practices (R/RStudio/tidyverse)
      • Versioning using GitHub (all)
      • Using targets (prerequisite: independent R user)
    • Reporting and publishing: Dynamic report generation
      • Using Stata (prerequisite: Stata)
      • Using R/R Markdown (prerequisite: R basics)
      • Notebooks: Jupyter Notebook (Python), R Markdown as a notebook (prerequisite: Python/R basics)
    • Reproducible data
      • Use APIs (prerequisite: IT programming basics)
      • Open access data (all)
    • Statistical methods for reproducible research (advanced)

Wrap-up

What is not covered

  • Reproducible workflows (targets)
  • Reproducible environments (Binder, Docker, renv, etc.)

Instructors

From the Ifakara Health Institute (IHI)

  • Samwel Lwambura
  • Hajirani Msuya
  • Ibrahim Mtebene
  • Charles Festo

From the Swiss Tropical & Public Health Institute (Swiss TPH)

  • Fenella Beynon
  • Hélène Langet
  • Silvia Cicconi
  • Gillian Levine
  • Fabian Schär

Prerequisites

Coming soon: guidance on installing the (free) software used in the workshop.

R installation

RStudio installation

GitHub account

GitHub desktop

Data

Coming soon: a succinct description of the data used in the workshop.


This work is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.

You are free to:

  • Share — copy and redistribute the material in any medium or format
  • Adapt — remix, transform, and build upon the material for any purpose, even commercially.

Under the following terms:

  • Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
  • No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.