Crawl and use iquery.pl PACER pages to trigger alerts #837

mlissner · 2018-06-05T22:32:27Z

This is similar to freelawproject/recap#251, but focuses on using this information for alerts.

What we can probably do is only crawl this data for the intersection of jurisdictions where we don't have good RSS feeds and cases that have alerts. In other words, don't crawl these if we have good RSS feeds, and don't crawl these for cases that don't have alerts.
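As a rough illustration of that intersection (a sketch only, not CourtListener code; `Case`, `cases_to_crawl`, and the court identifiers are hypothetical names made up for this example):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    # Hypothetical minimal record; real dockets carry far more fields.
    court_id: str
    docket_id: int


def cases_to_crawl(cases, courts_with_good_rss, docket_ids_with_alerts):
    """Keep only cases that are in a court without a good RSS feed
    AND that have at least one alert attached."""
    return [
        c
        for c in cases
        if c.court_id not in courts_with_good_rss
        and c.docket_id in docket_ids_with_alerts
    ]


cases = [
    Case("court_a", 1),  # good RSS, has alert  -> skip
    Case("court_a", 2),  # good RSS, no alert   -> skip
    Case("court_b", 3),  # poor RSS, has alert  -> crawl
    Case("court_b", 4),  # poor RSS, no alert   -> skip
]
selected = cases_to_crawl(cases, {"court_a"}, {1, 3})
```

Only the `court_b` case with an alert survives the filter, which is the "don't crawl if we have good RSS, don't crawl if there's no alert" rule stated above.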

johnhawkinson · 2018-06-05T22:51:27Z

Let's make sure we think carefully about what a scheduler requires here so that when we do the 1.0 version (without a scheduler) we don't make design choices that preclude a proper scheduler later.

mlissner · 2018-06-05T22:52:41Z

What kinds of preclusions are you thinking of / can you give an example of your concerns?

mlissner · 2018-06-05T22:58:31Z

I take it you're concerned that we'll trigger alerts twice if we get an item for one of these from, say, RECAP, and then crawl and get an update that way too? Shooting in the dark here a bit.

johnhawkinson · 2018-06-05T23:00:27Z

> I take it you're concerned that we'll trigger alerts twice if we get an item for one of these from…

No. I'll get back to you on your 2018-06-05T22:52:41Z comment in a bit; I'm tied up atm.

johnhawkinson · 2018-06-06T04:03:21Z

> What kinds of preclusions are you thinking of / can you give an example of your concerns?

So, part of what I meant is the sort of thing outlined in the RSS scraper discussion at freelawproject/juriscraper#195 (comment)

There are all sorts of ways we might want to trigger a poll: If a case has activity recently, or if we know it has activity on a particular day, we might want to poll more frequently. If a case has a deadline set for a particular day, we might want to poll more on that day. If a case has pending motions, we might want to poll 14 days after the date of the motion being filed (when responses are presumptively due), as well as at the appropriate reply date if the district allows reply briefing. And whatever the number of days after a complaint, etc., etc. And before scheduled hearings. One could get really fancy (again, not a 1.0 feature). Also different behavior for districts that have a midnight deadline vs. a 6pm or 5pm deadline (good luck with that one). Etc., etc.

We should choose an architecture that does not preclude this kind of fancy scheduling.
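One way to picture an architecture that leaves room for this (purely a sketch under my own assumptions, not an existing design; `CaseState`, `nightly`, `motion_response_due`, and `next_poll` are hypothetical names): treat each polling heuristic as a small rule that proposes dates, and let the poller take the earliest proposal. A 1.0 version would ship with only the nightly rule, and fancier rules could be added later without changing the caller.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import Callable, List, Optional


@dataclass
class CaseState:
    # Hypothetical per-case data a rule might look at.
    pending_motion_dates: List[date] = field(default_factory=list)


# A rule inspects a case and proposes zero or more dates to poll on.
PollRule = Callable[[CaseState, date], List[date]]


def nightly(case: CaseState, today: date) -> List[date]:
    """1.0 behavior: always poll again tomorrow."""
    return [today + timedelta(days=1)]


def motion_response_due(case: CaseState, today: date) -> List[date]:
    """Poll 14 days after each pending motion was filed, using the
    presumptive response window mentioned above; past dates are dropped."""
    due = [d + timedelta(days=14) for d in case.pending_motion_dates]
    return [d for d in due if d >= today]


def next_poll(case: CaseState, today: date, rules: List[PollRule]) -> Optional[date]:
    """Return the earliest date any rule proposes, or None if no rule fires."""
    candidates = [d for rule in rules for d in rule(case, today)]
    return min(candidates) if candidates else None


case = CaseState(pending_motion_dates=[date(2018, 6, 1)])
today = date(2018, 6, 5)
only_motion = next_poll(case, today, [motion_response_due])
with_nightly = next_poll(case, today, [nightly, motion_response_due])
```

Deadline-time differences between districts (midnight vs. 5pm or 6pm) would need times rather than dates, which this sketch deliberately leaves out.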

mlissner · 2020-05-15T19:14:59Z

OK, after a lengthy discussion yesterday we reached the conclusion that using this endpoint for alerts isn't really that helpful. We'll begin crawling it nightly just to get the data that it has, but we won't use it to trigger alerts.

Gathering data for last filing is being done here: #1264

Alerts via other means is being done here: #1279

Closing. Thanks for the earlier comments, @johnhawkinson.

ikeboy mentioned this issue May 10, 2020

Maximize usage of iquery page on PACER to get last_date_filing, case_name, and date_filed fields (among others) #1264

Closed

mlissner closed this as completed May 15, 2020
