The INDEC ("Instituto Nacional de Estadística y Censos", Argentina's National Institute of Statistics and Census) just published a report on the Argentinian labor market and workforce.
Although the data is great, the visualization is lacking.
We'll try to fix that, while keeping the data, the tools and the process open for everyone to audit, learn from and criticise.
- The really raw data is in `.xls` files in the raw folder.
- Take the information from each `.xls` and save it as plain text in the raw/txt folder (follow the naming convention).
- Note that there are many sheets per `.xls`; each one should be its own `.txt` file.
- Run `cleanup.py`, passing in the `.txt` file as standard input (e.g. `python cleanup.py < ./raw/txt/some.file.txt`).
- Save the output as a `.csv` file in the csv folder.
- Make sure to pick the appropriate factor by modifying the chosen factor variable in `cleanup.py`.
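
The run-and-save steps above could be scripted for one file at a time. This is only a minimal sketch, not part of the repository: the helper names `csv_path_for` and `run_cleanup` are hypothetical, and it assumes the `.csv` keeps the same base name as the `.txt` it came from. Picking the factor in `cleanup.py` still has to be done by hand before each run.

```python
import subprocess
import sys
from pathlib import Path


def csv_path_for(txt_path, csv_dir="csv"):
    """Map e.g. raw/txt/some.file.txt to csv/some.file.csv.

    Assumption: the .csv reuses the .txt base name.
    """
    return Path(csv_dir) / (Path(txt_path).stem + ".csv")


def run_cleanup(txt_path, csv_dir="csv", script="cleanup.py"):
    """Feed one .txt file to the script on stdin and save its stdout as .csv.

    Equivalent to: python cleanup.py < txt_path > csv_dir/<name>.csv
    """
    out_path = csv_path_for(txt_path, csv_dir)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with open(txt_path, "rb") as src, open(out_path, "wb") as dst:
        # check=True raises CalledProcessError if the script exits with an error.
        subprocess.run([sys.executable, script], stdin=src, stdout=dst, check=True)
    return out_path
```

Calling `run_cleanup("raw/txt/some.file.txt")` from the repository root would then write `csv/some.file.csv`.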