PDF-Zensor censors PDF files. It strips annotations and metadata as well as textual and graphical content from a PDF file. It can also censor a file only partially and highlight certain text phrases.
The application comes with a set of predefined colors, but individual colors for censoring different elements can be configured as well.
PDF-Zensor uses a number of open source projects to work properly:
- PDFBox - The Apache PDFBox library is an open source Java tool for working with PDF documents.
- Picocli - Picocli is a framework for building Java command line applications.
- Log4J - Apache Log4j is a Java-based logging utility.
- Jackson - Jackson is a high-performance JSON processor for Java.
- Apache Commons - Apache Commons is an Apache project focused on all aspects of reusable Java components.
And of course PDF-Zensor itself is open source.
PDF-Zensor requires Java 11 or later.
Install PDF-Zensor:
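To check the Java requirement before installing, a small shell sketch like the following can help. The helper name `java_major` is hypothetical and not part of PDF-Zensor. It assumes a bash-compatible shell and that `java -version` prints a quoted version string such as `"11.0.2"` or `"1.8.0_292"`, which may not hold for every vendor.

```shell
# Hypothetical helper (not part of PDF-Zensor): print the major version
# for a Java version string such as "11.0.2", "17" or the legacy "1.8.0_292".
java_major() {
  local v="$1"
  case "$v" in
    1.*) v="${v#1.}"; echo "${v%%.*}" ;;   # legacy scheme: 1.8.0_292 -> 8
    *)   echo "${v%%[.+-]*}" ;;            # modern scheme: 11.0.2 -> 11
  esac
}

if command -v java >/dev/null 2>&1; then
  # java -version writes to stderr; take the first quoted string on a "version" line.
  version=$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')
  major=$(java_major "$version")
  if [ "${major:-0}" -ge 11 ] 2>/dev/null; then
    echo "Java $version is new enough for PDF-Zensor"
  else
    echo "Java $version is too old or unrecognized; PDF-Zensor needs Java 11 or later"
  fi
else
  echo "java was not found on PATH"
fi
```

If the check reports an old or missing Java, install a newer runtime first and re-run it.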
- Go to packages on the right side of this page.
- Choose the asset you need.
- Install or run it in the usual way for the respective format.
If you downloaded the jar, you can use the command:
$ alias pdf-zensor='java -cp "pdf-zensor-1.0-jar-with-dependencies.jar" de.uni_hannover.se.pdfzensor.App'
to create a temporary alias "pdf-zensor" which is valid for the current shell session.
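Since the alias above only lasts for the current session, one option is to generate the alias line with an absolute path to the jar and store it in your shell's startup file. The following is a sketch under assumptions: the function name `pdf_zensor_alias_line` and the example path are hypothetical, and which startup file applies depends on your shell.

```shell
# Hypothetical helper (not part of PDF-Zensor): print an alias definition
# that points at the jar given as the first argument.
pdf_zensor_alias_line() {
  printf "alias pdf-zensor='java -cp \"%s\" de.uni_hannover.se.pdfzensor.App'\n" "$1"
}

# Example path only; replace it with the location where you saved the jar.
pdf_zensor_alias_line "$HOME/tools/pdf-zensor-1.0-jar-with-dependencies.jar"
```

The function only prints the line. To make the alias permanent you could append that output to a startup file such as `~/.bashrc` using `>>`, then open a new shell.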
Want to contribute? Great! Write a message!
- (Feature): Clipping of images and the like according to the current GraphicsContext
- (Feature): Correctly censor inline drawings
- (Feature): Remove watermark
- (Feature): Correctly censor Chinese characters or similar
- (Feature): Regex works across pages
- (Feature): Detect line breaks
- 🐞 Censoring of rotated text can be strange (since we merge text according to global coordinates and not according to local ones)
- 🐞 Tokenizer cannot find tokens across the page boundary
- 🐞 Annotations::getRect returns a wrong (?) Rectangle. Avoided by HighlightAnnotation::getQuads
- 🐞 EOFException instead of a FileFormatException if no valid PDF was entered [error in PDFBox]
GNU GPLv3
Free Software, Hell Yeah!