Make salary 2021 script year agnostic #135

Closed
AetherUnbound opened this issue May 29, 2022 · 5 comments
Labels
data munge: Data ingestion/munging scripts

Comments

@AetherUnbound
Collaborator

AetherUnbound commented May 29, 2022

The salary 2021 script uses public salary information that's updated regularly. Therefore, it's not actually just 2021 data but data from whenever the script itself is run. We should make the script more agnostic to which year it's being run in, and also save the raw salary data for future ingestions/tests (since that raw data is likely to change).

The script in question is here: https://github.com/OrcaCollective/OpenOversight/blob/main/data-munging/spd_2021_salary_data.py

AetherUnbound added the "data munge" (Data ingestion/munging scripts) label on May 29, 2022
@sea-kelp
Collaborator

@AetherUnbound This line seems to be the only year-specific line in the script from what I can tell:

# Set the year to 2020
merged["year"] = 2021

Would you mind expanding on what you mean by "save the raw salary data for future ingestions/tests"?
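One way the hard-coded year quoted above could be made run-time configurable is sketched below. This is only an illustration, not the script's actual code: the `--year` flag and the `parse_year` helper are hypothetical names, and defaulting to the current calendar year is an assumption about the desired behavior.

```python
import argparse
from datetime import date
from typing import Optional, Sequence


def parse_year(argv: Optional[Sequence[str]] = None) -> int:
    """Return the year from a hypothetical --year flag, defaulting to the current year."""
    parser = argparse.ArgumentParser(description="Munge SPD salary data")
    parser.add_argument(
        "--year",
        type=int,
        default=date.today().year,
        help="Year to record on each salary row (default: current calendar year)",
    )
    return parser.parse_args(argv).year


# In the script, the result would replace the hard-coded assignment, e.g.:
#     merged["year"] = parse_year()
```

Passing the year explicitly (for example `--year 2021`) would keep re-runs reproducible, since the default changes with the date the script is run.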

@AetherUnbound
Collaborator Author

Ah, I should have linked the dataset - I meant the City of Seattle Wage Data! https://data.seattle.gov/City-Business/City-of-Seattle-Wage-Data/2khk-5ukd
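Since the dataset changes over time, a dated snapshot could be taken each time the script runs. The sketch below assumes the dataset is reachable through the standard Socrata CSV resource endpoint for the ID in the URL above (`2khk-5ukd`); the function names, file naming scheme, and `$limit` value are all hypothetical and not taken from the existing script.

```python
import urllib.request
from datetime import date
from pathlib import Path
from typing import Optional

# Dataset ID taken from the data.seattle.gov URL in the comment above.
DATASET_ID = "2khk-5ukd"


def resource_url(dataset_id: str = DATASET_ID, limit: int = 50000) -> str:
    """Build the CSV export URL (assumes the usual Socrata /resource/<id>.csv pattern)."""
    # Socrata endpoints page results by default, so an explicit $limit is passed.
    # The value is arbitrary and may need raising or paging for larger datasets.
    return f"https://data.seattle.gov/resource/{dataset_id}.csv?$limit={limit}"


def snapshot_name(run_date: date, dataset_id: str = DATASET_ID) -> str:
    """Name the raw file after the date it was downloaded, not a fixed year."""
    return f"seattle-wage-data-{dataset_id}-{run_date.isoformat()}.csv"


def download_snapshot(dest_dir: Path, run_date: Optional[date] = None) -> Path:
    """Download the current dataset and save it as a dated raw snapshot."""
    run_date = run_date or date.today()
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / snapshot_name(run_date)
    with urllib.request.urlopen(resource_url()) as response:
        dest.write_bytes(response.read())
    return dest
```

Keeping the download date in the file name would let later ingestions and tests refer to a specific snapshot even after the public data has changed.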

@sea-kelp
Collaborator

Oh right, I meant where we're planning on saving it to - in the database or in S3?

@AetherUnbound
Collaborator Author

Oh that's a good point! Probably S3; we can have a separate bucket for it (and use MinIO locally for that).
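A rough sketch of what storing the raw data in a bucket might look like follows. The bucket name, key layout, and helper names are hypothetical. The only library call relied on is the S3 client's `put_object(Bucket=..., Key=..., Body=...)` from boto3, and the client is passed in so the same code could point at AWS S3 or a local MinIO instance.

```python
from datetime import date


def raw_data_key(run_date: date, prefix: str = "raw/seattle-wage-data") -> str:
    """Build an object key grouped by year, with the download date as the file name."""
    return f"{prefix}/{run_date.year}/{run_date.isoformat()}.csv"


def upload_raw_data(client, bucket: str, body: bytes, run_date: date) -> str:
    """Upload the raw CSV bytes with an S3-compatible client and return the key used."""
    key = raw_data_key(run_date)
    client.put_object(Bucket=bucket, Key=key, Body=body)
    return key


# Example wiring (not executed here). Assumes boto3 is installed and a MinIO
# server is listening on localhost:9000; the credentials are placeholders.
#
#     import boto3
#     client = boto3.client(
#         "s3",
#         endpoint_url="http://localhost:9000",
#         aws_access_key_id="minioadmin",
#         aws_secret_access_key="minioadmin",
#     )
#     upload_raw_data(client, "salary-raw-data", csv_bytes, date.today())
```

Omitting `endpoint_url` would make the same client talk to AWS S3, so local and production runs would differ only in configuration.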

@sea-kelp
Collaborator

sea-kelp commented Jun 2, 2024

Fixed by #324

sea-kelp closed this as completed on Jun 2, 2024
Projects
Status: Done
Development

No branches or pull requests

2 participants