SciKit Data

About SciKit Data

The purpose of this library is to make the data analysis process easier and more automatic.

General objectives:

  • reduce boilerplate code;
  • reduce time spent on data analysis tasks; and
  • offer a reproducible data analysis workflow.

Data analysis tasks generally involve a lot of boilerplate code that could be replaced by reproducible mechanisms and easy data visualization methods. Another concern is data publishing: many data analysts do not know about open data repositories, or do not consider them when communicating their scientific workflow.

Specific objectives:

  • optimize data visualization;
  • integrate with open data repositories to publish data;
  • make data analysis tasks reproducible through storing and recovery operations.

SkData should integrate with the pandas library (Python).
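To make the "storing and recovery" objective more concrete, here is a minimal sketch written with plain pandas and the standard library. It is not SkData's API: the step functions, the `STEPS` registry, and the recipe format are all hypothetical names chosen for illustration. The idea is that an analysis is recorded as a list of named steps, stored as JSON, and later recovered and replayed on the raw data.

```python
import json

import pandas as pd


# Hypothetical step functions; none of these names come from SkData.
def drop_missing(df, column):
    """Drop rows where `column` is missing."""
    return df.dropna(subset=[column])


def rename_column(df, old, new):
    """Rename a single column."""
    return df.rename(columns={old: new})


# Registry mapping a stored step name to the function that implements it.
STEPS = {"drop_missing": drop_missing, "rename_column": rename_column}


def apply_steps(df, recipe):
    """Apply each recorded step in order and return the resulting DataFrame."""
    for step in recipe:
        df = STEPS[step["op"]](df, **step["kwargs"])
    return df


raw = pd.DataFrame(
    {"age": [31.0, None, 45.0], "city": ["Lima", "Quito", "Bogota"]}
)
recipe = [
    {"op": "drop_missing", "kwargs": {"column": "age"}},
    {"op": "rename_column", "kwargs": {"old": "city", "new": "location"}},
]

# Store: the recipe is plain JSON, so it can be saved next to the data.
stored = json.dumps(recipe)

# Recover: load the recipe back and replay it on the untouched raw data.
first_run = apply_steps(raw, recipe)
replayed = apply_steps(raw, json.loads(stored))
```

Because the raw data is never modified in place, replaying the stored recipe should give the same result as the original run.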

The books and other materials used as references to guide this project are listed under References.

Installing scikit-data

Using conda

Installing scikit-data from the conda-forge channel can be achieved by adding conda-forge to your channels with:

$ conda config --add channels conda-forge

Once the conda-forge channel has been enabled, scikit-data can be installed with:

$ conda install scikit-data

It is possible to list all of the versions of scikit-data available on your platform with:

$ conda search scikit-data --channel conda-forge

Using pip

To install scikit-data, run this command in your terminal:

$ pip install skdata

If you don't have pip installed, this Python installation guide walks you through the process.
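After installing, you may want to confirm that the package is visible to your Python environment. The snippet below is a small sketch using only the standard library (Python 3.8+). It assumes the installed distribution is registered under the name `skdata`, as in the pip command above; the name recorded by a conda install may differ. The helper `installed_version` is a hypothetical name, not part of scikit-data.

```python
from importlib import metadata


def installed_version(dist_name="skdata"):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
if version is None:
    print("skdata is not installed in this environment")
else:
    print(f"skdata {version} is installed")
```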

More Information

References

  • CUESTA, Hector; KUMAR, Sampath. Practical Data Analysis. Packt Publishing Ltd, 2016.

Electronic materials