Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

0672 spider chi northwest home equity #1007

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

yijun-li-20
Copy link

@yijun-li-20 yijun-li-20 commented Feb 14, 2021

Summary

Issue: #966

Replace "ISSUE_NUMBER" with the number of your issue so that GitHub will link this pull request with the issue and make review easier.

Checklist

All checks are run in GitHub Actions. You'll be able to see the results of the checks at the bottom of the pull request page after it's been opened, and you can click on any of the specific checks listed to see the output of each step and debug failures.

  • Tests are implemented
  • All tests are passing
  • Style checks run (see documentation for more details)
  • Style checks are passing
  • Code comments from template removed

Questions

Include any questions you have about what you're working on.

@yijun-li-20 yijun-li-20 reopened this Feb 14, 2021
@yijun-li-20 yijun-li-20 changed the title Finish tests for nw heap 0672 spider chi northwest home equity Feb 14, 2021
Copy link
Collaborator

@pjsier pjsier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! This is looking great so far, and I just have some comments on parsing each detail page rather than loading from the summary since there's some extra information on the detail pages. Let me know if you have any questions

city_scrapers/spiders/chi_northwest_home_equity.py Outdated Show resolved Hide resolved
continue
meeting = Meeting(
title=self._parse_title(item),
description="Past Meetings",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can also be an empty string

timezone = "America/Chicago"
start_urls = ["https://nwheap.com/category/meet-minutes-and-agendas/"]

def parse(self, response):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at this overall, I think it would be best if we could scrape each meeting URL separately. It seems like they don't generally have much information on the detail page, but at the very least we'll be able to get a unique URL for each meeting's source, and in some cases they've included dial-in information and links like this page https://nwheap.com/2020/11/30/december-10-2020-meeting-minutes-and-agenda-annual-budget-meeting/

chi_ssa_25 might be a good example to go off of for this. parse should yield requests to each visible detail page, and a separate method like _parse_detail can yield a single Meeting for each detail page

city_scrapers/spiders/chi_northwest_home_equity.py Outdated Show resolved Hide resolved

def _parse_links(self, item):
"""Parse or generate links."""
link = item.xpath("a/@href").get()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This link should go in the source, since links is more for things like agendas and minutes, but this will likely change with parsing each detail page

"Main Office" if addr.startswith("3234 N. Central Ave") else "Unknown place"
)
return {
"address": addr,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since there's a static address for "Main Office" we can include that here if it matches, but this will also change when parsing the detail page

tests/test_chi_northwest_home_equity.py Outdated Show resolved Hide resolved
@SubtleHyperbole
Copy link

Sorry for going AWOL, has this tracker been finished, or is it still in the state where I left it? Either way is fine -- i shouldn't have gone missing for the better part of a year -- but if it is still unfinished i would be glad to finish things out. Just unclear about the second pull request, whether its a new start or an edit of what i started.

@SubtleHyperbole
Copy link

I am unclear as to the status of this tracker, so I am going to try to start fresh on a different board meeting in need of tracking. If this one is still unfinished though, and could use my help finishing out, let me know because I don't want to interfere if someone else has taken over the issue. I will periodically check back here though.

Again, sorry for going AWOL after submitting the initial pull request. That's definitely on me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants