Attachment menu/doc1 page not detected sometimes #291

Closed
mdaniels5757 opened this issue Aug 31, 2020 · 4 comments

Comments

@mdaniels5757

I think this is my first bug filed here, apologies if I screwed this up.

Expected behavior w/ example:

  • Go to some doc1 page (e.g. S.D.N.Y 1:20-cv-05770 ECF 81)
  • RECAP detects that it is a doc1 page and uploads accordingly
  • This works for all doc1 pages where there is a "Download All" button

Actual behavior on certain doc1 pages:

  • Go to a doc1 page where the combined PDF would be over the size limit, and therefore there is no "Download All" button (e.g. S.D.N.Y 1:20-cv-05770 ECF 76)
  • Not uploaded to CourtListener

Suspected cause:

  • RECAP does not detect that the second example is a doc1 page.
  • Possibly because isAttachmentMenuPage in pacer.js does not detect pages where there is no "Download All" button.

Possible (untested!) solution:

  • In isAttachmentMenuPage, instead of (or as an alternative to) checking for a "Download All" button, check that the top bolded text on the doc1 page is "Document Selection Menu".
  • Proposed code: document.getElementsByTagName('b')[0].textContent == 'Document Selection Menu'
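A slightly more defensive version of that one-liner might look like the sketch below. The helper name hasSelectionMenuHeading is made up for illustration and is not part of pacer.js, and like the proposal above it is untested against real PACER pages. It only adds a guard for pages that have no bold elements at all, where the bare expression would throw.

```javascript
// Hypothetical helper (not part of pacer.js): returns true when the first
// bolded element on the page reads "Document Selection Menu".
function hasSelectionMenuHeading(doc) {
  const bolds = doc.getElementsByTagName('b');
  // Guard against pages with no <b> elements; indexing [0] would otherwise
  // give undefined and the .textContent access would throw.
  if (bolds.length === 0) {
    return false;
  }
  // trim() is an assumption: it tolerates stray whitespace around the text.
  return bolds[0].textContent.trim() === 'Document Selection Menu';
}
```

In the extension this would presumably be called as hasSelectionMenuHeading(document) from inside isAttachmentMenuPage.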

Best,
Michael Daniels

@mlissner
Member

Thanks Michael. This looks like a dup of #238. Your solution seems logical to me. Want to take a crack at a PR?

@johnhawkinson
Collaborator

johnhawkinson commented Aug 31, 2020

As I said in #238, I don't think we should be looking for a heading at all.

We should just look for table rows of doc1 links, and if they're there, then they're close enough to attachment pages to be worth shipping to the server for it to parse. (Because we don't have a unified parser in the client and server, and don't want to overparse in the client.)

I think this analysis remains correct, 2 years later.

So, that is:

(document.querySelectorAll(`td a[href*="/doc1"], td a[href*="/docs1"]`).length > 0)

Or maybe without even the td constraint?
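Wrapped up as a function, the check could look roughly like this. The name hasDoc1Links and the requireTd flag are hypothetical (nothing like them exists in pacer.js as far as this thread shows); the flag is only there to show both variants, with and without the td constraint.

```javascript
// Hypothetical helper (not in pacer.js): true when the page contains at
// least one link whose href includes "/doc1" or "/docs1".
function hasDoc1Links(doc, requireTd = true) {
  // With requireTd, only links inside table cells count.
  const prefix = requireTd ? 'td ' : '';
  // Both patterns are needed: "/doc1" is not a substring of "/docs1".
  const selector = `${prefix}a[href*="/doc1"], ${prefix}a[href*="/docs1"]`;
  return doc.querySelectorAll(selector).length > 0;
}
```

Calling hasDoc1Links(document) applies the td-constrained selector above; hasDoc1Links(document, false) drops the constraint. Either way the client only decides whether the page is worth shipping, and the real parsing stays on the server.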

@mlissner
Member

I'm happy with either solution.

@mlissner
Member

I believe this is fixed via freelawproject/recap-chrome#269
