No support for generating accessible PDFs (PDF/UA) #79

Closed
philipfennell opened this issue Mar 14, 2017 · 9 comments

Comments

@philipfennell

Our customer has specific requirements for the accessibility features defined in PDF/UA. When we test PDFs generated with your online sandbox in Adobe Acrobat, the accessibility report shows no support for tagged PDF, which in turn causes problems with logical reading order and reflow. The document title and primary language are also not set.

@zutnop

zutnop commented Nov 20, 2017

See #30 for how to access PDFBox PDDocument to add meta-data and other properties directly.

@ScrappyTheDev

We are also running into this during Section 508 compliance testing. Are there any plans to add 508-compliant metadata?

Issue:
The test is failing because there are no PDF Tags (Container Elements).
There may be other issues but that was the big one they handed down to us.

Standard PDF Tags:

  • Document - Document element. The root element of a document’s tag tree.
  • Part - Part element. A large division of a document; may group smaller units of content together, such as division elements, article elements, or section elements.
  • Div - Division element. A generic block-level element or group of block-level elements.
  • Art - Article element. A self-contained body of text considered to be a single narrative.
  • Sect - Section element. A general container element type, comparable to Division (DIV) in HTML, which is usually a component of a part element or an article element.
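To make the nesting of these container tags concrete, here is a minimal sketch of a tag tree built from the five grouping tags listed above. `TagNode` and its methods are hypothetical names used only for illustration; they are not part of OpenHTMLtoPDF or PDFBox, and a real PDF stores this tree as structure element dictionaries rather than Java objects.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory model of a PDF tag tree (illustration only). */
final class TagNode {
    /** The five grouping tags from the list above. */
    enum Tag { Document, Part, Div, Art, Sect }

    final Tag tag;
    final List<TagNode> children = new ArrayList<>();

    TagNode(Tag tag) {
        this.tag = tag;
    }

    /** Adds a child container and returns it, so nested containers can be chained. */
    TagNode add(Tag childTag) {
        TagNode child = new TagNode(childTag);
        children.add(child);
        return child;
    }

    /** Renders the tree as an indented outline, one tag per line. */
    String outline() {
        StringBuilder sb = new StringBuilder();
        outline(sb, 0);
        return sb.toString();
    }

    private void outline(StringBuilder sb, int depth) {
        for (int i = 0; i < depth; i++) {
            sb.append("  ");
        }
        sb.append(tag.name()).append('\n');
        for (TagNode child : children) {
            child.outline(sb, depth + 1);
        }
    }

    public static void main(String[] args) {
        TagNode root = new TagNode(Tag.Document);   // root of the tag tree
        TagNode part = root.add(Tag.Part);          // large division
        part.add(Tag.Sect);                         // section inside the part
        part.add(Tag.Art).add(Tag.Div);             // article containing a division
        System.out.print(root.outline());
    }
}
```

An accessibility checker that reports "no PDF tags" is, roughly speaking, reporting that no such tree exists in the file at all.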

@danfickle
Owner

I've started work on PDF/UA support in PR #315

danfickle added a commit that referenced this issue Jan 14, 2019
Technically we are PDF/UA compliant with simple documents according to the checker (PAC). YEAH! However, the semantics are all wrong, i.e. wrong tags, wrong parent/child relationships, etc.
@pdsway

pdsway commented Jan 14, 2019

Hi Dan,

I love your work and am interested in the PDF/UA compliance feature. I must produce 508-compliant PDFs for a government client and can help guide and evaluate your implementation, as well as contribute code.

I'm not sure how to become a contributor, so please contact me if you're interested.

Thanks,
-Paul

danfickle added a commit that referenced this issue Jan 15, 2019
…ded.

Getting the hang of number tree system also. Working well for very simple documents. Now, to make more robust.
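For readers unfamiliar with the number tree mentioned here: in tagged PDF, the structure tree root has a ParentTree entry, which is a number tree (integer keys mapped to values). For a page, the key is the page's StructParents number and the value is an array of structure elements indexed by marked-content ID (MCID). The sketch below models only that lookup with a `TreeMap`. The class and method names are hypothetical, structure elements are represented by plain strings, and this is not the code from the commit above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical model of the ParentTree lookup in a tagged PDF (illustration only). */
final class ParentTreeSketch {
    // Key: a page's StructParents number.
    // Value: the owning structure element for each MCID on that page, indexed by MCID.
    private final TreeMap<Integer, List<String>> nums = new TreeMap<>();

    /** Registers a new marked-content sequence on a page and returns the MCID assigned to it. */
    int addMarkedContent(int structParents, String structElem) {
        List<String> owners = nums.computeIfAbsent(structParents, k -> new ArrayList<>());
        owners.add(structElem);
        return owners.size() - 1;
    }

    /** Returns the structure element that owns the given marked content, or null if unknown. */
    String parentOf(int structParents, int mcid) {
        List<String> owners = nums.get(structParents);
        if (owners == null || mcid < 0 || mcid >= owners.size()) {
            return null;
        }
        return owners.get(mcid);
    }
}
```

A real number tree is stored as nested dictionaries so that large documents can split the key range across several nodes; the sorted map above ignores that detail and keeps only the key-to-array mapping.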
@ScrappyTheDev

Dan,
This is excellent.
Please let me know if there is anything I can do to help.

danfickle added a commit that referenced this issue Jan 18, 2019
…nstead of DOM elements.

Also started to use correct tags.
danfickle added a commit that referenced this issue Jan 18, 2019
Having trouble with boxes that span two or more pages.
@danfickle
Owner

Thanks @pdsway and @ScrappyTheDev.

At the moment, the most useful thing would be real-world test cases. You can either paste HTML in the issue (in a code block) or submit a PR. Commit be71e17 shows how to add a PDF/UA test case.

Obviously, code review or suggestions are also welcome.

Thanks again for your interest in this feature and project.

danfickle added a commit that referenced this issue Jan 20, 2019
danfickle added a commit that referenced this issue Jan 20, 2019
…re pages.

Previously failing layers test is now passing.
danfickle added a commit that referenced this issue Jan 20, 2019
danfickle added a commit that referenced this issue Jan 21, 2019
For images that continue over more than one page, just mark the portion on the second and subsequent pages as an artifact. The spec (PDF 1.7) is not clear on what to do in this situation.
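The rule described in this commit message can be sketched as a small decision function: the portion of the image on its first page is tagged as a figure, and every later portion is marked as an artifact so that assistive technology skips it. `SplitImageMarking` and `markingsFor` are hypothetical names, and the sketch assumes the image's first and last page numbers are already known from layout.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of marking an image that continues over several pages. */
final class SplitImageMarking {
    /**
     * Returns one label per page the image touches, in page order:
     * "Figure" for the first page, "Artifact" for each subsequent page.
     */
    static List<String> markingsFor(int firstPage, int lastPage) {
        if (lastPage < firstPage) {
            throw new IllegalArgumentException("lastPage must not be before firstPage");
        }
        List<String> labels = new ArrayList<>();
        for (int page = firstPage; page <= lastPage; page++) {
            labels.add(page == firstPage ? "Figure" : "Artifact");
        }
        return labels;
    }
}
```

Tagging only the first portion means a screen reader announces the image once, rather than once per page.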
danfickle added a commit that referenced this issue Jan 23, 2019
…margins).

Also fixed language code in document catalog.
danfickle added a commit that referenced this issue Jan 26, 2019
danfickle added a commit that referenced this issue Feb 7, 2019
Failing because of table over two pages. Need more work on table attributes.
danfickle added a commit that referenced this issue Feb 7, 2019
At least confirm that the PDF/UA implementation doesn't throw unexpected exceptions.
danfickle added a commit that referenced this issue Feb 9, 2019
…e now passes.

Also simplified adding attributes to structure elements.
danfickle added a commit that referenced this issue Feb 9, 2019
…lse method.

Also log incompatible parent/child relationships in the structure tree. All testcases are passing.
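As an illustration of what checking parent/child relationships in a structure tree can look like, here is a minimal sketch. The rule table is a deliberately small subset covering only list and table tags, the class and method names are hypothetical, and the sketch returns warnings as strings rather than writing to a logger. It is not the check added in this commit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of a parent/child compatibility check for structure tags. */
final class StructureNestingCheck {
    // Child tag -> parent tags it is expected to appear under (illustrative subset only).
    private static final Map<String, Set<String>> ALLOWED_PARENTS = new HashMap<>();

    static {
        allow("LI", "L");                                   // list item inside a list
        allow("LBody", "LI");                               // list body inside a list item
        allow("TR", "Table", "THead", "TBody", "TFoot");    // row inside a table or row group
        allow("TH", "TR");                                  // header cell inside a row
        allow("TD", "TR");                                  // data cell inside a row
    }

    private static void allow(String child, String... parents) {
        ALLOWED_PARENTS.put(child, new HashSet<>(Arrays.asList(parents)));
    }

    /**
     * Checks {parent, child} tag pairs and returns one warning per incompatible pair.
     * Child tags without an entry in the rule table are not checked.
     */
    static List<String> check(String[][] parentChildPairs) {
        List<String> warnings = new ArrayList<>();
        for (String[] pair : parentChildPairs) {
            String parent = pair[0];
            String child = pair[1];
            Set<String> allowed = ALLOWED_PARENTS.get(child);
            if (allowed != null && !allowed.contains(parent)) {
                warnings.add("Incompatible nesting: " + child + " under " + parent);
            }
        }
        return warnings;
    }
}
```

Reporting a warning instead of throwing lets PDF generation continue while still surfacing markup that maps to a questionable tag structure.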
@danfickle
Owner

OK, PDF/UA support is now available in the main branch, although it has not been released to Maven Central yet. You can find the PDF/UA docs on the wiki.

Thanks everyone.

I'm especially indebted to the work of @chris271 for his open-source PDF/UA work that this implementation was based on. Thanks a lot!

@pdsway

pdsway commented Feb 10, 2019

Thanks Dan, I will review and get back to you soon.

Our group has a list of PDF 508 requirements (which are less than full PDF/UA). I will try to include them here. They mostly use Acrobat Pro for the 508 compliance validation.

  • Meta tags: title, language, filename
  • Reading order same as object order
  • PDF container tags. These are like HTML container elements, etc., but in the PDF, so the assistive reader can identify major breaks in the document. Several levels of PDF tagging.

  • Tag "decorative" text/drawing as "artifact" so it's ignored.

I'll add more requirements as I get them.

danfickle added a commit that referenced this issue Feb 10, 2019
Also work around a bug in the PDF Accessibility Checker by setting attributes manually.
@danfickle
Owner

Hi @pdsway,

I've attached the all-in-one example, which you can use to check whether this implementation meets your requirements. I think so far, so good!

all-in-one.pdf

5 participants