GitHub - pmaji/data-science-toolkit: Collection of stats, modeling, and data science tools in Python and R.

Introduction

Welcome! The purpose of this repository is to serve as a stockpile of statistical methods, modeling techniques, and data science tools. The content includes everything from educational vignettes on specific topics, to tailored functions and modeling pipelines built to enhance and optimize analyses, to notes and code from various data science conferences, to general data science utilities. This will remain a work in progress, and I welcome all contributions and constructive criticism. If you have a suggestion or request, please use the "Issues" tab and I will endeavor to respond expeditiously!

Note: GitHub often has trouble rendering larger .ipynb files. If you are unable to view one of the Jupyter notebooks linked below, I recommend copying and pasting its link into Jupyter's nbviewer, which will take you to a viewable link like this one here for my "Visualization with Plotly" notebook. To ensure that you are viewing the most up-to-date version of the notebook with nbviewer, add ?flush_cache=true to the end of the generated URL, as is described here; otherwise, your link risks being slightly out-of-date.
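The nbviewer tip above can also be scripted. The following is a minimal sketch, not part of this repository: it assumes nbviewer serves GitHub-hosted notebooks under `https://nbviewer.org/github/<user>/<repo>/blob/<branch>/<path>`, the helper name `nbviewer_url` is hypothetical, and the notebook path in the usage example is a placeholder rather than a real file in this repo.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: nbviewer mirrors a github.com notebook path under this host and prefix.
NBVIEWER_HOST = "nbviewer.org"
NBVIEWER_PREFIX = "/github"


def nbviewer_url(github_url: str, flush_cache: bool = True) -> str:
    """Turn a github.com notebook link into an nbviewer link (hypothetical helper)."""
    parts = urlsplit(github_url)
    if parts.netloc != "github.com":
        raise ValueError("expected a github.com URL")

    # Keep any existing query parameters and optionally ask nbviewer
    # to bypass its cache, as the note above recommends.
    query = dict(parse_qsl(parts.query))
    if flush_cache:
        query["flush_cache"] = "true"

    return urlunsplit(
        ("https", NBVIEWER_HOST, NBVIEWER_PREFIX + parts.path, urlencode(query), "")
    )


if __name__ == "__main__":
    # Placeholder notebook path, used only for illustration.
    print(nbviewer_url(
        "https://github.com/pmaji/data-science-toolkit/blob/master/example.ipynb"
    ))
```

Passing `flush_cache=False` yields the plain nbviewer link, which may show a cached and slightly older render.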

Playground and Basics
1. Rough Notes from ISLR Exercises -- R
2. Rough Notes from Python Data Scientist Track -- Python
Exploratory Data Analysis (EDA) and Visualization
Hypothesis Testing
1. Kolmogorov-Smirnov Test (KS Test) -- R
2. Useful Hypothesis Testing Functions -- R
Classification
Regression
1. Linear Regression -- Python
Reinforcement Learning
Text Mining and Natural Language Processing (NLP)
1. Basic Text Mining and NLP -- R
Time Series
1. Time Series Forecasting with Facebook's Prophet Package -- Python
Notes and Material from Data Science Conferences
Utilities
1. HTML File Appender (Using Beautiful Soup) -- Python

Contribution Info

All are welcome and encouraged to contribute to this repository. My only requests are that you include a detailed description of your contribution, that your code be thoroughly commented, and that you test your contribution locally with the most recent version of the master branch integrated prior to submitting the PR.

