I invoke veraPDF as a serverless OpenFaaS function, and validation from stdin is very useful for that: I do not have to write a temp file inside my Docker container before calling the veraPDF CLI. I do have to filter out the textual notice that precedes the XML output (which is a bit silly for a tool processing stdin, but I can imagine why you did that).
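For anyone who needs the same workaround, here is a minimal sketch of that filtering step. It assumes the notice and the XML arrive on the same stream and that the XML always starts with an `<?xml` declaration, as in the output shown in this issue; `strip_preamble` is a hypothetical helper name, not part of veraPDF.

```python
def strip_preamble(raw: bytes) -> bytes:
    """Drop everything before the XML declaration in captured CLI output.

    Assumption: the report begins with b"<?xml", as in the sample output
    in this issue. Anything before that marker is treated as notice text.
    """
    start = raw.find(b"<?xml")
    if start == -1:
        raise ValueError("no XML declaration found in the captured output")
    return raw[start:]
```

The function works on bytes so the XML declaration's own encoding attribute stays authoritative for whatever parser consumes the result.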
However, if I invoke veraPDF and pass the PDF via stdin, the resulting XML is fundamentally different from the format documented at http://docs.verapdf.org/cli/validation/#auto-profile. The documentation says the result XML contains a validationReport element, whereas the result of a stdin invocation contains a validationResult element, and the rest of the structure differs as well.
I tried --format mrr (see below) as well as the formats xml and text; they all yield the same result when using stdin.
When calling with a filename (not stdin), --format xml gives me a larger XML document that records the configuration used and appears to contain the output of the stdin call as a subtree.
If the behaviour of the stdin invocation is not already documented somewhere, can it be documented?
If possible, can the stdin invocation support --format mrr and (with that option) yield the same result as the invocation which passes a file name?
Making mrr the default would break the current interface and should not be done lightly, but maybe that is something to consider nevertheless. It is quite confusing that the invocation via stdin behaves differently than an invocation with file name.
Below you see how mrr appears to be ignored when stdin is used:
ds@ds-Nitro-AN515-42:~/verapdf$ cat corpus/veraPDF-corpus-staging/PDF_A-1b/6.6\ Actions/6.6.1\ General/veraPDF\ test\ suite\ 6-6-1-t01-fail-a.pdf | ./verapdf --format mrr
veraPDF is processing STDIN and is expecting an EOF marker.
If this isn't your intention you can terminate by typing an EOF equivalent:
- Linux or Mac users should type CTRL-D
- Windows users should type CTRL-Z
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<processorResult xmlns:ns2="http://www.verapdf.org/ValidationProfile" isPdf="true" isEncryptedPdf="false">
<itemDetails size="-1">
<name>STDIN</name>
</itemDetails>
<validationResult flavour="PDFA_1_B" totalAssertions="363" isCompliant="false">
<ns2:profileDetails creator="veraPDF Consortium" created="2017-09-06T15:12:20.277+02:00">
<ns2:name>PDF/A-1B validation profile</ns2:name>
<ns2:description>Validation rules against ISO 19005-1:2005, Cor.1:2007 and Cor.2:2011, Level B</ns2:description>
</ns2:profileDetails>
<ns2:assertions>
<ns2:assertion ordinal="363" status="FAILED">
<ns2:ruleId specification="ISO_19005_1" clause="6.6.1" testNumber="1"/>
<ns2:message>The Launch, Sound, Movie, ResetForm, ImportData and JavaScript actions shall not be permitted.
Additionally, the deprecated set-state and no-op actions shall not be permitted. The Hide action shall not be permitted (Corrigendum 2)</ns2:message>
<ns2:location>
<ns2:level>CosDocument</ns2:level>
<ns2:context>root/document[0]/OpenAction[0](5 0 obj PDAction)</ns2:context>
</ns2:location>
</ns2:assertion>
</ns2:assertions>
</validationResult>
<fixerResult status="NO_ACTION">
<ns2:appliedFixes/>
</fixerResult>
<featuresReport/>
<taskResult>
<taskResult type="VALIDATE" isExecuted="true" isSuccess="true">
<duration start="1547889097054" finish="1547889097643">00:00:00.589</duration>
</taskResult>
</taskResult>
</processorResult>
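Until the two formats are unified, the stdin output can still be consumed directly. The following is a rough sketch that reads the `processorResult` structure shown above with Python's standard library; it relies only on the element and attribute names visible in this sample, and `summarize` and `SAMPLE` are hypothetical names used for illustration. The sample is a trimmed copy of the output above, not a complete report.

```python
import xml.etree.ElementTree as ET

# Namespace URI copied from the xmlns:ns2 attribute in the output above.
NS = {"vp": "http://www.verapdf.org/ValidationProfile"}

# Trimmed copy of the stdin output from this issue (illustration only).
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<processorResult xmlns:ns2="http://www.verapdf.org/ValidationProfile" isPdf="true" isEncryptedPdf="false">
<validationResult flavour="PDFA_1_B" totalAssertions="363" isCompliant="false">
<ns2:assertions>
<ns2:assertion ordinal="363" status="FAILED">
<ns2:ruleId specification="ISO_19005_1" clause="6.6.1" testNumber="1"/>
<ns2:location>
<ns2:context>root/document[0]/OpenAction[0](5 0 obj PDAction)</ns2:context>
</ns2:location>
</ns2:assertion>
</ns2:assertions>
</validationResult>
</processorResult>"""


def summarize(xml_bytes: bytes) -> dict:
    """Extract compliance and failed rules from a stdin-style processorResult."""
    root = ET.fromstring(xml_bytes)
    # validationResult carries no namespace prefix in the sample output.
    result = root.find("validationResult")
    if result is None:
        raise ValueError("no validationResult element found")
    failed = []
    for assertion in result.iterfind("vp:assertions/vp:assertion", NS):
        if assertion.get("status") != "FAILED":
            continue
        rule = assertion.find("vp:ruleId", NS)
        context = assertion.findtext("vp:location/vp:context", namespaces=NS)
        failed.append((rule.get("clause"), rule.get("testNumber"), context))
    return {
        "flavour": result.get("flavour"),
        "compliant": result.get("isCompliant") == "true",
        "total_assertions": int(result.get("totalAssertions")),
        "failed": failed,
    }


if __name__ == "__main__":
    print(summarize(SAMPLE))
```

Code written against this structure would need to change once stdin input produces the same report as a file path, so it is best kept behind a small function like this one.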
For historical reasons the processing of an input stream is indeed different from the (batch) processing of a folder or a single PDF path. We'll fix this to make sure the input stream is validated as if it was a single local PDF file.