The purpose of this library is to make the data analysis process easier and more automatic.
General objectives:
- reduce boilerplate code;
- reduce time spent on data analysis tasks; and
- offer a reproducible data analysis workflow.
Data analysis tasks often involve a lot of boilerplate code that could be replaced by reproducible mechanisms and simple data visualization methods. Another concern is data publishing: many data analysts are unaware of open data repositories, or do not consider them when communicating their scientific workflow.
Specific objectives:
- optimize data visualization;
- integrate with open data repositories to publish data;
- make data analysis tasks reproducible through storing and recovery operations.
SkData should integrate with the pandas library (Python).
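As an illustration of the reproducibility objective, the following is a minimal sketch of one possible reading of "storing and recovery operations": each cleaning step is recorded as plain data, stored as JSON, recovered later, and replayed on the raw data. It does not use or describe the scikit-data API; the names `STEPS`, `apply_steps`, `drop_missing` and `rename` are hypothetical, and plain lists of dictionaries stand in for a pandas DataFrame to keep the example dependency-free.

```python
import json

# Hypothetical registry of named, replayable steps (not part of skdata).
STEPS = {
    # Keep only the rows where `column` has a value.
    "drop_missing": lambda rows, column: [
        r for r in rows if r.get(column) is not None
    ],
    # Rename the key `old` to `new` in every row.
    "rename": lambda rows, old, new: [
        {(new if k == old else k): v for k, v in r.items()} for r in rows
    ],
}


def apply_steps(rows, log):
    """Replay every logged step, in order, on a copy of the rows."""
    result = [dict(r) for r in rows]
    for entry in log:
        result = STEPS[entry["step"]](result, **entry["params"])
    return result


# Record the steps as plain data so they can be stored and recovered.
log = [
    {"step": "drop_missing", "params": {"column": "age"}},
    {"step": "rename", "params": {"old": "age", "new": "age_years"}},
]

stored = json.dumps(log)        # store the operations (e.g. write to a file)
recovered = json.loads(stored)  # recover them in a later session

raw = [{"name": "Ana", "age": 31}, {"name": "Bo", "age": None}]
cleaned = apply_steps(raw, recovered)
```

Because the log is ordinary JSON, it can be kept next to the raw data, and replaying it on the same input yields the same cleaned result without modifying the original rows.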
- https://www.packtpub.com/big-data-and-business-intelligence/clean-data
- https://www.packtpub.com/big-data-and-business-intelligence/python-data-analysis
- https://www.packtpub.com/big-data-and-business-intelligence/mastering-machine-learning-scikit-learn
- https://www.packtpub.com/big-data-and-business-intelligence/practical-data-analysis-second-edition
- https://github.com/rsouza/MMD/blob/master/notebooks/3.1_Kaggle_Titanic.ipynb
- https://github.com/agconti/kaggle-titanic/blob/master/Titanic.ipynb
- https://github.com/donnemartin/data-science-ipython-notebooks/blob/master/kaggle/titanic.ipynb
Installing scikit-data from the conda-forge channel can be achieved by adding conda-forge to your channels with:
$ conda config --add channels conda-forge
Once the conda-forge channel has been enabled, scikit-data can be installed with:
$ conda install scikit-data
It is possible to list all of the versions of scikit-data available on your platform with:
$ conda search scikit-data --channel conda-forge
To install scikit-data with pip instead, run this command in your terminal:
$ pip install skdata
If you don't have pip installed, a Python installation guide can walk you through the process.
- License: MIT
- Documentation: https://skdata.readthedocs.io
- CUESTA, Hector; KUMAR, Sampath. Practical Data Analysis. Packt Publishing Ltd, 2016.
Electronic materials