Improve accessibility for screen reader users. #831

Closed
AlainGravelet opened this issue Aug 18, 2021 · 6 comments

AlainGravelet commented Aug 18, 2021

Hi,
The span method is already much better than a simple image, thank you for that.
But we could do much better.
This idea works only if the PDF file is already accessible, that is, tagged with semantic tags: H1, H2, P, UL/LI, buttons, and so on.
In that case, replacing the spans with the correct semantic tags would let blind end users navigate the document's text alternative easily. Today that navigation can be very difficult, especially in multi-page documents.
Thank you.

wojtekmaj (Owner) commented

Do you have a sample PDF that has such properties? I'll have a look at what PDF.js offers, although I don't recall it providing a suggested tag name.

It would also be great to check whether the text layer contains these elements when this PDF is opened in Firefox. If not, then it's probably not possible at the moment. If yes, however...


AlainGravelet commented Aug 26, 2021 via email


MattL75 commented Nov 20, 2021

Hello again @wojtekmaj :) I took a look at PDF.js, and it seems there is an optional parameter that can be passed to getTextContent, like so: getTextContent({includeMarkedContent: true}), which offers limited support for tagged PDFs. It's combined with a struct tree, which renders elements within the canvas and links them to the text layer using aria-owns.
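As a rough illustration of the call mentioned above, the sketch below requests text content with includeMarkedContent enabled and collects the ids of the marked-content markers it finds. It assumes `page` is a PDF.js page proxy, and that marker items carry a `type` string and an optional `id` instead of a `str`; that matches how PDF.js appears to behave, but it should be checked against the version in use. `listMarkedContentIds` is a hypothetical helper name, not part of PDF.js or react-pdf.

```javascript
// Hypothetical helper: `page` is assumed to be a PDF.js page proxy
// (any object exposing an async getTextContent works for this sketch).
async function listMarkedContentIds(page) {
  const textContent = await page.getTextContent({ includeMarkedContent: true });
  const ids = [];
  for (const item of textContent.items) {
    // Text runs carry a `str`; marked-content markers are assumed to carry
    // a `type` (and, for tagged content, an `id`) instead.
    const isMarker = item.str === undefined;
    if (isMarker && item.type === 'beginMarkedContentProps' && item.id != null) {
      ids.push(item.id);
    }
  }
  return ids;
}
```

These ids are what the struct tree layer can later point at with aria-owns.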

I tested this on the PDF.js viewer demo, and it does seem to render a more accessible structure in a separate layer. It still uses spans, but it adds roles and other attributes to help screen readers. It's definitely not perfect, but it would be nice to have as an option.

I will probably be implementing (or at least investigating) this in our internal react-pdf fork so I will update this ticket if I have interesting results.

Relevant links:
mozilla/pdf.js#13171 (comment)
mozilla/pdf.js#6269

Tagged document from WCAG:
cooking.pdf


MattL75 commented Nov 23, 2021

After doing a small (and ugly) internal PoC, this is definitely possible.

The key parts of the PDF.js PR linked in my comment above are the _processItems function in src/display/text_layer.js and, for everything related to the struct tree, web/pdf_page_view.js.
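To give an idea of what the text layer side involves, here is a minimal sketch of grouping a flat item list into nested nodes, one node per marked-content section. It is not the PDF.js implementation; it only assumes the item shape described above (`str` for text runs, `type` of beginMarkedContent, beginMarkedContentProps, or endMarkedContent for markers) and that markers are balanced. `groupMarkedContent` is a hypothetical name.

```javascript
// Groups a flat item list into nested nodes. Each marked-content section
// becomes a node { id, children }, and text runs become { text } leaves.
function groupMarkedContent(items) {
  const root = { id: null, children: [] };
  const stack = [root];
  for (const item of items) {
    const current = stack[stack.length - 1];
    if (item.str !== undefined) {
      current.children.push({ text: item.str });
    } else if (item.type === 'beginMarkedContent' || item.type === 'beginMarkedContentProps') {
      const node = { id: item.id != null ? item.id : null, children: [] };
      current.children.push(node);
      stack.push(node);
    } else if (item.type === 'endMarkedContent' && stack.length > 1) {
      // Never pop the root, so an unbalanced end marker is ignored.
      stack.pop();
    }
  }
  return root;
}
```

In a DOM renderer, each node with an id would become a wrapper element carrying that id, so the struct tree layer has something to reference.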

The PDF.js viewer renders a structure as follows:

<canvas>
  <span role="heading" aria-level="1" aria-owns="heading_id"></span>
  <span aria-owns="some_paragraph"></span>
</canvas>

In the text layer:
<span id="heading_id">Some Heading</span>
<span id="some_paragraph">Hello world!</span>

I'm not 100% sure why they went in this direction as opposed to rendering the actual tags in the text layer since they are accessible through getTextContent({includeMarkedContent: true}), but I do see that sometimes tags need to be grouped under a parent so that might be why.
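For reference, the aria-owns structure described above could be derived from a struct tree roughly as follows. This is a sketch under assumptions, not the viewer's code: it assumes tree nodes look like { role, children } with content leaves shaped like { type: 'content', id }, and it only maps heading roles (H1, H2, ...) to ARIA attributes. `describeStructTree` is a hypothetical name.

```javascript
// Walks a struct tree and returns one attribute set per node that directly
// owns content, suitable for rendering as placeholder spans.
function describeStructTree(node, out = []) {
  const children = node.children || [];
  const ownedIds = children
    .filter((child) => child.type === 'content')
    .map((child) => child.id);
  if (ownedIds.length > 0) {
    // aria-owns accepts a space-separated list of element ids.
    const attrs = { 'aria-owns': ownedIds.join(' ') };
    const heading = /^H(\d+)$/.exec(node.role || '');
    if (heading) {
      attrs.role = 'heading';
      attrs['aria-level'] = heading[1];
    }
    out.push(attrs);
  }
  for (const child of children) {
    if (child.type !== 'content') describeStructTree(child, out);
  }
  return out;
}
```

Keeping the tree in its own layer like this makes it easy to group several content ids under one parent, which may be part of the reason for the split.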

github-actions bot commented

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this issue will be closed in 14 days.

github-actions bot added the stale label Feb 28, 2022

github-actions bot commented

This issue was closed because it has been stalled for 14 days with no activity.
