Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some modernization and tidying up #38

Open
wants to merge 42 commits into
base: master
Choose a base branch
from
Open

Conversation

fgregg
Copy link

@fgregg fgregg commented Jun 13, 2024

Hi Jacob,

this PR

  • brings in schema updates from propublica
  • uses black and isort to tidy up the code formatting
  • fixes all the current flake8 failures
  • points the IRS S3 bucket to Giving Tuesday's S3
  • Brings back file downloads
  • Gets tests working
  • And sets up github action CI

@asuozzo, lmk if your repo is actually the one I should be making a PR against.

@asuozzo
Copy link

asuozzo commented Jul 12, 2024

Whoops, super delayed response here, but this is great @fgregg! I've been keeping our branch updated as needed for my purposes but have utterly failed to do much in the way of testing and/or PR any changes back into this repo, so I appreciate this!

One general thought I have here is that it's fantastic that the GivingTuesday bucket exists, but since it's not an IRS resource, I wonder if it makes sense to make irsx more data-source-agnostic, with an easy option to toggle the GT bucket as the default data source. This is probably a longer-term conversation (I don't want to hold up updates that get irsx working right out of the box), but something I've been thinking about.

# This is the URL to amazon's bucket, could use another synced to it
IRS_XML_HTTP_BASE = "https://s3.amazonaws.com/irs-form-990"
# This is the URL to Giving Tuesday's bucket, could use another synced to it
IRS_XML_HTTP_BASE = env(
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@asuozzo, to your point about making the XML paths configurable, i made the path to the xml files settable environmental variable.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, I think that's a great start! My main thought is on long-term reliability of the default setting because we've been burned before, i.e. if the GT bucket gets removed or moves somewhere else, the out-of-the-box setting breaks immediately and requires an override - if the library forces some sort of initial config, then it's less reliant on a presumed default. Hopefully this line of thinking is all theoretical at this point, though!

@fgregg
Copy link
Author

fgregg commented Jul 12, 2024

@asuozzo so should your fork be the one to make against?

@asuozzo
Copy link

asuozzo commented Jul 12, 2024

@fgregg nope, I think keeping it here is good (if @jsfenfen agrees!)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants