[Bug]: Highlights/annotations repeated on all pages #1399

Open
Jmuccigr opened this issue Sep 24, 2024 · 1 comment
Labels: triage (issue needs triage)

Describe the bug

In a file that has been OCRed with ocrmypdf, an annotation (text highlight) is repeated on every page of the document after saving.

Steps to reproduce

1. Run ocrmypdf on a PDF which is just a bag of images.
2. Add some annotations (text highlights).
3. Use the script [here](https://thepythoncode.com/article/redact-and-highlight-text-in-pdf-with-python) to remove the annotations via PyMuPDF.
4. Add another annotation to the resulting file.
5. Close and re-open it to find that last annotation on every page.
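
As a rough way to confirm step 5 without checking every page by eye, the sketch below lists annotations that appear on every page of a file. It assumes PyMuPDF is installed and importable as `fitz`. The function names, and the choice to identify an annotation by its type name and rounded rectangle, are illustrative assumptions; none of this is taken from ocrmypdf or from the linked script.

```python
def annotations_on_every_page(pages):
    """Return the annotation signatures that occur on every page.

    `pages` is a list with one set of signatures per page. A document with
    fewer than two pages cannot show the "repeated on all pages" symptom,
    so an empty set is returned in that case.
    """
    if len(pages) < 2:
        return set()
    return set.intersection(*(set(p) for p in pages))


def page_signatures(pdf_path):
    """Collect (type name, rounded rect) signatures for each page of a PDF."""
    import fitz  # PyMuPDF; imported lazily so the helper above works without it

    with fitz.open(pdf_path) as doc:
        return [
            {
                # annot.type is a tuple such as (8, 'Highlight')
                (annot.type[1], tuple(round(v) for v in annot.rect))
                for annot in page.annots()
            }
            for page in doc
        ]
```

Calling `annotations_on_every_page(page_signatures("output.pdf"))` on the file from step 5 would be expected to return a non-empty set if the highlight really was duplicated, and an empty set for an unaffected file.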

I realize there's more going on here than just ocrmypdf, but if I take a file from, say, JSTOR that already has OCRed text and run steps 2-4 on it, I don't get the problem. So it seems that something ocrmypdf does to the file is causing the issue.

One more thing: if I use gs to remove the highlights from the same file with something like `gs -dSAFER -dBATCH -dNOPAUSE -sDEVICE=pdfwrite -o output.pdf -c "/PreserveAnnotTypes [] def" -c "/ShowAnnotTypes [] def" -f input.pdf`, I don't get any weirdness either. I suspect gs is doing a bit more than the PyMuPDF script, too.
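
For anyone scripting the comparison, the Ghostscript command above can be wrapped in Python roughly as follows. This is a minimal sketch: the flags are copied from the command line above, while the function names and the check for a `gs` executable on `PATH` are assumptions made for illustration.

```python
import shutil
import subprocess


def gs_strip_annotations_cmd(src, dst, gs="gs"):
    """Build the argument list for the gs command quoted above."""
    return [
        gs, "-dSAFER", "-dBATCH", "-dNOPAUSE", "-sDEVICE=pdfwrite",
        "-o", dst,
        "-c", "/PreserveAnnotTypes [] def",
        "-c", "/ShowAnnotTypes [] def",
        "-f", src,
    ]


def strip_annotations(src, dst):
    """Run Ghostscript to rewrite `src` to `dst` without annotations."""
    if shutil.which("gs") is None:
        raise FileNotFoundError("gs executable not found on PATH")
    # check=True raises CalledProcessError if gs exits with a non-zero status
    subprocess.run(gs_strip_annotations_cmd(src, dst), check=True)
```

Passing the arguments as a list avoids shell quoting problems with the PostScript snippets that follow `-c`.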

Files

2. magick_2pages.pdf.zip
5. magick_ocr_highlights.pdf.zip

Here's a PDF as an example. It's generated from a JSTOR file using `magick -units pixelsperinch -density 280 input.pdf -format pdf output.pdf` to remove all metadata, etc., and convert the contents to images. I've zipped both the original version with no OCR applied and the final version where the highlighting is duplicated. Note that this happens pretty reliably for me with files from other sources.

How did you download and install the software?

Homebrew

OCRmyPDF version

16.5.0

Relevant log output

ocrmypdf 16.5.0                                                                                           __main__.py:59
Running: ['tesseract', '--version']                                                                      __init__.py:133
Found tesseract 5.4.1                                                                                    __init__.py:343
Running: ['tesseract', '--version']                                                                      __init__.py:133
Running: ['tesseract', '--version']                                                                      __init__.py:133
Running: ['gs', '--version']                                                                             __init__.py:133
Found gs 10.4.0                                                                                          __init__.py:343
Running: ['gs', '--version']                                                                             __init__.py:133
Running: ['tesseract', '--list-langs']                                                                   __init__.py:133
stdout/stderr = List of available languages in "/Users/username/Documents/tessdata/" (16):        __init__.py:73
deu
deu_frak
ell
eng
enm
fra
grc
ita
ita_old
lat
osd
script/Fraktur
script/Greek
script/Latin
spa
spa_old

pikepdf mmap enabled                                                                                      helpers.py:328
os.symlink(/Users/username/Desktop/AU6P9FDZ/2. magick_2pages.pdf,                                 helpers.py:179
/var/folders/yl/xd3tsv2x1959s23ts4k1qt9m0000gr/T/ocrmypdf.io.s_f6cvgn/origin)
os.symlink(/var/folders/yl/xd3tsv2x1959s23ts4k1qt9m0000gr/T/ocrmypdf.io.s_f6cvgn/origin,                  helpers.py:179
/var/folders/yl/xd3tsv2x1959s23ts4k1qt9m0000gr/T/ocrmypdf.io.s_f6cvgn/origin.pdf)
Gathering info with 1 thread workers                                                                         info.py:804
pikepdf mmap enabled                                                                                      helpers.py:328
Scanning contents     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 2/2 0:00:00
Using Tesseract OpenMP thread limit 3                                                               tesseract_ocr.py:199
Start processing 2 pages concurrently                                                                          ocr.py:96
pikepdf mmap enabled                                                                                      helpers.py:328
pikepdf mmap enabled                                                                                      helpers.py:328
    1 Rasterize with png16m, rotation 0                                                                 _pipeline.py:539
    2 Rasterize with png16m, rotation 0                                                                 _pipeline.py:539
    1 Running: ['gs', '-dQUIET', '-dSAFER', '-dBATCH', '-dNOPAUSE', '-dInterpolateControl=-1',           __init__.py:133
'-sDEVICE=png16m', '-dFirstPage=1', '-dLastPage=1', '-r280.000105x280.000105', '-dPDFSTOPONERROR', '-o',
'-', '-sstdout=%stderr', '-dAutoRotatePages=/None', '-f',
'/var/folders/yl/xd3tsv2x1959s23ts4k1qt9m0000gr/T/ocrmypdf.io.s_f6cvgn/origin.pdf']
    2 Running: ['gs', '-dQUIET', '-dSAFER', '-dBATCH', '-dNOPAUSE', '-dInterpolateControl=-1',           __init__.py:133
'-sDEVICE=png16m', '-dFirstPage=2', '-dLastPage=2', '-r280.000105x280.000105', '-dPDFSTOPONERROR', '-o',
'-', '-sstdout=%stderr', '-dAutoRotatePages=/None', '-f',
'/var/folders/yl/xd3tsv2x1959s23ts4k1qt9m0000gr/T/ocrmypdf.io.s_f6cvgn/origin.pdf']
    1 Rotating output by 0                                                                            ghostscript.py:149
    1 resolution (280.0096, 280.0096)                                                                   _pipeline.py:618
    1 Running: ['tesseract', '-l', 'eng',                                                                __init__.py:133
'/var/folders/yl/xd3tsv2x1959s23ts4k1qt9m0000gr/T/ocrmypdf.io.s_f6cvgn/000001_ocr.png',
'/var/folders/yl/xd3tsv2x1959s23ts4k1qt9m0000gr/T/ocrmypdf.io.s_f6cvgn/000001_ocr_hocr', 'hocr', 'txt']
    2 resolution (280.0096, 280.0096)                                                                   _pipeline.py:618
    2 Running: ['tesseract', '-l', 'eng',                                                                __init__.py:133
'/var/folders/yl/xd3tsv2x1959s23ts4k1qt9m0000gr/T/ocrmypdf.io.s_f6cvgn/000002_ocr.png',
'/var/folders/yl/xd3tsv2x1959s23ts4k1qt9m0000gr/T/ocrmypdf.io.s_f6cvgn/000002_ocr_hocr', 'hocr', 'txt']
    2 pikepdf.Matrix(0.257143, 0, 0, -0.257143, 0, 760.114)                                                 _hocr.py:203
    2 eng                                                                                                   _hocr.py:267
    2 pikepdf.Matrix(1, 0, 0, 1, 159, 213)                                                                  _hocr.py:323
    2 eng                                                                                                   _hocr.py:267
    2 pikepdf.Matrix(1, 0, 0, 1, 233, 1234)                                                                 _hocr.py:323
    2 eng                                                                                                   _hocr.py:267
    2 pikepdf.Matrix(1, 0, 0, 1, 164, 1284)                                                                 _hocr.py:323
    2 eng                                                                                                   _hocr.py:267
    2 pikepdf.Matrix(1, 0, 0, 1, 164, 1335)                                                                 _hocr.py:323
    2 eng                                                                                                   _hocr.py:267
    2 pikepdf.Matrix(1, 0, 0, 1, 164, 1386)                                                                 _hocr.py:323
    2 eng                                                                                                   _hocr.py:267
    2 pikepdf.Matrix(1, 0, 0, 1, 165, 1436)                                                                 _hocr.py:323
    2 eng                                                                                                   _hocr.py:267
    2 pikepdf.Matrix(1, 0, 0, 1, 165, 1486)                                                                 _hocr.py:323
    2 eng                                                                                                   _hocr.py:267
    2 pikepdf.Matrix(1, 0, 0, 1, 164, 1537)                                                                 _hocr.py:323
    2 eng                                                                                                   _hocr.py:267
    2 pikepdf.Matrix(1, 0, 0, 1, 163, 1587)                                                                 _hocr.py:323
    2 eng                                                                                                   _hocr.py:267
    2 pikepdf.Matrix(1, 0, 0, 1, 231, 1637)                                                                 _hocr.py:323
    2 eng                                                                                                   _hocr.py:267
    2 pikepdf.Matrix(1, 0, 0, 1, 164, 1687)                                                                 _hocr.py:323
    2 eng                                                                                                   _hocr.py:267
    2 pikepdf.Matrix(1, 0, 0, 1, 164, 1738)                                                                 _hocr.py:323
    2 eng                                                                                                   _hocr.py:267
    2 pikepdf.Matrix(1, 0, 0, 1, 165, 1789)                                                                 _hocr.py:323
    2 eng                                                                                                   _hocr.py:267
    2 pikepdf.Matrix(1, 0, 0, 1, 164, 1839)                                                                 _hocr.py:323
    2 eng                                                                                                   _hocr.py:267
    2 pikepdf.Matrix(1, 0, 0, 1, 164, 1890)                                                                 _hocr.py:323
    2 eng                                                                                                   _hocr.py:267
    1 pikepdf.Matrix(0.257143, 0, 0, -0.257143, 0, 760.114)                                                 _hocr.py:203
    2 pikepdf.Matrix(1, 0, 0, 1, 166, 1940)                                                                 _hocr.py:323
    1 eng                                                                                                   _hocr.py:267
    1 pikepdf.Matrix(1, 0, 0, 1, 320, 200)                                                                  _hocr.py:323
    2 eng                                                                                                   _hocr.py:267
    1 eng                                                                                                   _hocr.py:267
    2 pikepdf.Matrix(1, 0, 0, 1, 165, 2029)                                                                 _hocr.py:323
    1 pikepdf.Matrix(1, 0, 0, 1, 366, 282)                                                                  _hocr.py:323
    2 eng                                                                                                   _hocr.py:267
    1 eng                                                                                                   _hocr.py:267
    2 pikepdf.Matrix(1, 0, 0, 1, 165, 2067)                                                                 _hocr.py:323
    1 pikepdf.Matrix(1, 0, 0, 1, 564, 421)                                                                  _hocr.py:323
    2 eng                                                                                                   _hocr.py:267
    1 eng                                                                                                   _hocr.py:267
    2 pikepdf.Matrix(1, 0, 0, 1, 166, 2106)                                                                 _hocr.py:323
    1 pikepdf.Matrix(1, 0, 0, 1, 543, 701)                                                                  _hocr.py:323
    2 eng                                                                                                   _hocr.py:267
    1 eng                                                                                                   _hocr.py:267
    2 pikepdf.Matrix(1, 0, 0, 1, 168, 2145)                                                                 _hocr.py:323
    1 pikepdf.Matrix(1, 0, 0, 1, 213, 800)                                                                  _hocr.py:323
    2 eng                                                                                                   _hocr.py:267
    2 pikepdf.Matrix(1, 0, 0, 1, 168, 2184)                                                                 _hocr.py:323
    1 eng                                                                                                   _hocr.py:267
    2 eng                                                                                                   _hocr.py:267
    1 pikepdf.Matrix(1, 0, 0, 1, 215, 850)                                                                  _hocr.py:323
    2 pikepdf.Matrix(1, 0, 0, 1, 168, 2223)                                                                 _hocr.py:323
    1 eng                                                                                                   _hocr.py:267
    2 eng                                                                                                   _hocr.py:267
    1 pikepdf.Matrix(1, 0, 0, 1, 169, 901)                                                                  _hocr.py:323
    2 pikepdf.Matrix(1, 0, 0, 1, 168, 2261)                                                                 _hocr.py:323
    1 eng                                                                                                   _hocr.py:267
    1 pikepdf.Matrix(1, 0, 0, 1, 169, 953)                                                                  _hocr.py:323
    2 eng                                                                                                   _hocr.py:267
    1 eng                                                                                                   _hocr.py:267
    2 pikepdf.Matrix(1, 0, 0, 1, 167, 2339)                                                                 _hocr.py:323
    1 pikepdf.Matrix(1, 0, 0, 1, 170, 1005)                                                                 _hocr.py:323
    2 eng                                                                                                   _hocr.py:267
    2 pikepdf.Matrix(1, 0, 0, 1, 168, 2378)                                                                 _hocr.py:323
    1 eng                                                                                                   _hocr.py:267
    2 eng                                                                                                   _hocr.py:267
    1 pikepdf.Matrix(1, 0, 0, 1, 169, 1057)                                                                 _hocr.py:323
    2 pikepdf.Matrix(1, 0, 0, 1, 168, 2456)                                                                 _hocr.py:323
    1 eng                                                                                                   _hocr.py:267
    2 eng                                                                                                   _hocr.py:267
    1 pikepdf.Matrix(1, 0, 0, 1, 169, 1109)                                                                 _hocr.py:323
    2 pikepdf.Matrix(1, 0, 0, 1, 169, 2494)                                                                 _hocr.py:323
    2 eng                                                                                                   _hocr.py:267
    1 eng                                                                                                   _hocr.py:267
    2 pikepdf.Matrix(1, 0, 0, 1, 172, 2533)                                                                 _hocr.py:323
    1 pikepdf.Matrix(1, 0, 0, 1, 169, 1161)                                                                 _hocr.py:323
    2 eng                                                                                                   _hocr.py:267
    2 pikepdf.Matrix(1, 0, 0, 1, 170, 2572)                                                                 _hocr.py:323
    1 eng                                                                                                   _hocr.py:267
    2 eng                                                                                                   _hocr.py:267
    1 pikepdf.Matrix(1, 0, 0, 1, 170, 1212)                                                                 _hocr.py:323
    2 pikepdf.Matrix(1, 0, 0, 1, 170, 2611)                                                                 _hocr.py:323
    1 eng                                                                                                   _hocr.py:267
    2 eng                                                                                                   _hocr.py:267
    1 pikepdf.Matrix(1, 0, 0, 1, 170, 1265)                                                                 _hocr.py:323
    2 pikepdf.Matrix(1, 0, 0, 1, 971, 2028)                                                                 _hocr.py:323
    1 eng                                                                                                   _hocr.py:267
    2 eng                                                                                                   _hocr.py:267
    1 pikepdf.Matrix(1, 0, 0, 1, 240, 1315)                                                                 _hocr.py:323
    2 pikepdf.Matrix(1, 0, 0, 1, 968, 2066)                                                                 _hocr.py:323
    1 eng                                                                                                   _hocr.py:267
    2 eng                                                                                                   _hocr.py:267
    1 pikepdf.Matrix(1, 0, 0, 1, 169, 1365)                                                                 _hocr.py:323
    2 pikepdf.Matrix(1, 0, 0, 1, 968, 2143)                                                                 _hocr.py:323
    1 eng                                                                                                   _hocr.py:267
    2 eng                                                                                                   _hocr.py:267
    1 pikepdf.Matrix(1, 0, 0, 1, 169, 1416)                                                                 _hocr.py:323
    2 pikepdf.Matrix(1, 0, 0, 1, 968, 2182)                                                                 _hocr.py:323
    1 eng                                                                                                   _hocr.py:267
    2 eng                                                                                                   _hocr.py:267
    1 pikepdf.Matrix(1, 0, 0, 1, 169, 1468)                                                                 _hocr.py:323
    2 pikepdf.Matrix(1, 0, 0, 1, 969, 2221)                                                                 _hocr.py:323
    1 eng                                                                                                   _hocr.py:267
    2 eng                                                                                                   _hocr.py:267
    1 pikepdf.Matrix(1, 0, 0, 1, 169, 1519)                                                                 _hocr.py:323
    2 pikepdf.Matrix(1, 0, 0, 1, 969, 2260)                                                                 _hocr.py:323
    1 eng                                                                                                   _hocr.py:267
    1 pikepdf.Matrix(1, 0, 0, 1, 169, 1572)                                                                 _hocr.py:323
    2 eng                                                                                                   _hocr.py:267
    2 pikepdf.Matrix(1, 0, 0, 1, 969, 2299)                                                                 _hocr.py:323
    1 eng                                                                                                   _hocr.py:267
    2 eng                                                                                                   _hocr.py:267
    1 pikepdf.Matrix(1, 0, 0, 1, 170, 1624)                                                                 _hocr.py:323
    2 pikepdf.Matrix(1, 0, 0, 1, 969, 2337)                                                                 _hocr.py:323
    1 eng                                                                                                   _hocr.py:267
    2 eng                                                                                                   _hocr.py:267
    1 pikepdf.Matrix(1, 0, 0, 1, 169, 1675)                                                                 _hocr.py:323
    2 pikepdf.Matrix(0.99996, -0.00899964, 0.00899964, 0.99996, 969, 2377)                                  _hocr.py:323
    2 eng                                                                                                   _hocr.py:267
    2 pikepdf.Matrix(1, 0, 0, 1, 969, 2454)                                                                 _hocr.py:323
    1 eng                                                                                                   _hocr.py:267
    1 pikepdf.Matrix(1, 0, 0, 1, 170, 1727)                                                                 _hocr.py:323
    2 eng                                                                                                   _hocr.py:267
    2 pikepdf.Matrix(1, 0, 0, 1, 970, 2492)                                                                 _hocr.py:323
    2 eng                                                                                                   _hocr.py:267
    1 eng                                                                                                   _hocr.py:267
    2 pikepdf.Matrix(1, 0, 0, 1, 969, 2531)                                                                 _hocr.py:323
    1 pikepdf.Matrix(1, 0, 0, 1, 170, 1779)                                                                 _hocr.py:323
    2 eng                                                                                                   _hocr.py:267
    2 pikepdf.Matrix(1, 0, 0, 1, 970, 2570)                                                                 _hocr.py:323
    1 eng                                                                                                   _hocr.py:267
    2 eng                                                                                                   _hocr.py:267
    1 pikepdf.Matrix(1, 0, 0, 1, 169, 1831)                                                                 _hocr.py:323
    2 pikepdf.Matrix(1, 0, 0, 1, 972, 2609)                                                                 _hocr.py:323
    2 eng                                                                                                   _hocr.py:267
    1 eng                                                                                                   _hocr.py:267
    2 pikepdf.Matrix(1, 0, 0, 1, 754, 2762)                                                                 _hocr.py:323
    1 pikepdf.Matrix(1, 0, 0, 1, 169, 1883)                                                                 _hocr.py:323
    2 eng                                                                                                   _hocr.py:267
    2 pikepdf.Matrix(1, 0, 0, 1, 839, 2794)                                                                 _hocr.py:323
    1 eng                                                                                                   _hocr.py:267
    2 eng                                                                                                   _hocr.py:267
    1 pikepdf.Matrix(1, 0, 0, 1, 169, 1933)                                                                 _hocr.py:323
    2 pikepdf.Matrix(1, 0, 0, 1, 661, 2825)                                                                 _hocr.py:323
    1 eng                                                                                                   _hocr.py:267
    1 pikepdf.Matrix(1, 0, 0, 1, 170, 2027)                                                                 _hocr.py:323
    1 eng                                                                                                   _hocr.py:267
    1 pikepdf.Matrix(1, 0, 0, 1, 169, 2067)                                                                 _hocr.py:323
    1 eng                                                                                                   _hocr.py:267
    1 pikepdf.Matrix(1, 0, 0, 1, 169, 2107)                                                                 _hocr.py:323
    1 eng                                                                                                   _hocr.py:267
    1 pikepdf.Matrix(1, 0, 0, 1, 169, 2146)                                                                 _hocr.py:323
    1 eng                                                                                                   _hocr.py:267
    1 pikepdf.Matrix(1, 0, 0, 1, 169, 2186)                                                                 _hocr.py:323
    1 eng                                                                                                   _hocr.py:267
    1 pikepdf.Matrix(1, 0, 0, 1, 168, 2226)                                                                 _hocr.py:323
    1 eng                                                                                                   _hocr.py:267
    1 pikepdf.Matrix(1, 0, 0, 1, 168, 2266)                                                                 _hocr.py:323
    2 Text rotation: (text, autorotate, content) -> text misalignment = (0, 0, 0) -> 0                     _graft.py:140
    2 Grafting                                                                                             _graft.py:251
    1 eng                                                                                                   _hocr.py:267
    1 pikepdf.Matrix(1, 0, 0, 1, 168, 2306)                                                                 _hocr.py:323
    1 eng                                                                                                   _hocr.py:267
    1 pikepdf.Matrix(1, 0, 0, 1, 168, 2387)                                                                 _hocr.py:323
    2 Grafting with ctm pikepdf.Matrix(1, 0, 0, 1, 0, 0)                                                   _graft.py:294
    1 eng                                                                                                   _hocr.py:267
    2 Page rotation: (content, auto) -> page = (0, 0) -> 0                                                 _graft.py:165
    1 pikepdf.Matrix(1, 0, 0, 1, 168, 2425)                                                                 _hocr.py:323
    1 eng                                                                                                   _hocr.py:267
    1 pikepdf.Matrix(1, 0, 0, 1, 167, 2463)                                                                 _hocr.py:323
    1 eng                                                                                                   _hocr.py:267
    1 pikepdf.Matrix(1, 0, 0, 1, 168, 2501)                                                                 _hocr.py:323
    1 eng                                                                                                   _hocr.py:267
    1 pikepdf.Matrix(0.999988, 0.00499994, -0.00499994, 0.999988, 168, 2644)                                _hocr.py:323
    1 eng                                                                                                   _hocr.py:267
    1 pikepdf.Matrix(1, 0, 0, 1, 971, 2028)                                                                 _hocr.py:323
    1 eng                                                                                                   _hocr.py:267
    1 pikepdf.Matrix(1, 0, 0, 1, 973, 2068)                                                                 _hocr.py:323
    1 eng                                                                                                   _hocr.py:267
    1 pikepdf.Matrix(1, 0, 0, 1, 968, 2107)                                                                 _hocr.py:323
    1 eng                                                                                                   _hocr.py:267
    1 pikepdf.Matrix(1, 0, 0, 1, 971, 2148)                                                                 _hocr.py:323
    1 eng                                                                                                   _hocr.py:267
    1 pikepdf.Matrix(1, 0, 0, 1, 971, 2187)                                                                 _hocr.py:323
    1 eng                                                                                                   _hocr.py:267
    1 pikepdf.Matrix(1, 0, 0, 1, 971, 2227)                                                                 _hocr.py:323
    1 eng                                                                                                   _hocr.py:267
    1 pikepdf.Matrix(1, 0, 0, 1, 971, 2268)                                                                 _hocr.py:323
    1 eng                                                                                                   _hocr.py:267
    1 pikepdf.Matrix(1, 0, 0, 1, 973, 2308)                                                                 _hocr.py:323
    1 eng                                                                                                   _hocr.py:267
    1 pikepdf.Matrix(1, 0, 0, 1, 970, 2348)                                                                 _hocr.py:323
    1 eng                                                                                                   _hocr.py:267
    1 pikepdf.Matrix(1, 0, 0, 1, 971, 2388)                                                                 _hocr.py:323
    1 eng                                                                                                   _hocr.py:267
    1 pikepdf.Matrix(1, 0, 0, 1, 971, 2426)                                                                 _hocr.py:323
    1 eng                                                                                                   _hocr.py:267
    1 pikepdf.Matrix(1, 0, 0, 1, 971, 2503)                                                                 _hocr.py:323
    1 eng                                                                                                   _hocr.py:267
    1 pikepdf.Matrix(1, 0, 0, 1, 755, 2762)                                                                 _hocr.py:323
    1 eng                                                                                                   _hocr.py:267
    1 pikepdf.Matrix(1, 0, 0, 1, 839, 2794)                                                                 _hocr.py:323
    1 eng                                                                                                   _hocr.py:267
    1 pikepdf.Matrix(1, 0, 0, 1, 661, 2825)                                                                 _hocr.py:323
    1 Text rotation: (text, autorotate, content) -> text misalignment = (0, 0, 0) -> 0                     _graft.py:140
    1 Grafting                                                                                             _graft.py:251
    1 Grafting with ctm pikepdf.Matrix(1, 0, 0, 1, 0, 0)                                                   _graft.py:294
    1 Page rotation: (content, auto) -> page = (0, 0) -> 0                                                 _graft.py:165
OCR                   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 2/2 0:00:00
Postprocessing...                                                                                             ocr.py:144
Running: ['tesseract', '--version']                                                                      __init__.py:133
Linearizing           ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 100/100 0:00:00
Recursing into Form XObject /OCR-bWPu0wv2KG52KckIrd7dgw in page 0                                        optimize.py:265
xref 26: skipping image because it is an SMask                                                           optimize.py:280
xref 16: treating as an optimization candidate                                                           optimize.py:282
Recursing into Form XObject /OCR-BLMEU-cDuckumi1ZbRkhyg in page 1                                        optimize.py:265
xref 27: skipping image because it is an SMask                                                           optimize.py:280
xref 20: treating as an optimization candidate                                                           optimize.py:282
XrefExt(xref=16, ext='.png')                                                                             optimize.py:347
XrefExt(xref=20, ext='.png')                                                                             optimize.py:347
Optimizable images: JPEGs: 0 PNGs: 2                                                                     optimize.py:352
Recompressing JPEGs   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━   0% 0/0 -:--:--
Recursing into Form XObject /OCR-bWPu0wv2KG52KckIrd7dgw in page 0                                        optimize.py:265
xref 26: skipping image because it is an SMask                                                           optimize.py:280
xref 16: treating as an optimization candidate                                                           optimize.py:282
Recursing into Form XObject /OCR-BLMEU-cDuckumi1ZbRkhyg in page 1                                        optimize.py:265
xref 27: skipping image because it is an SMask                                                           optimize.py:280
xref 20: treating as an optimization candidate                                                           optimize.py:282
Deflating JPEGs       ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━   0% 0/0 -:--:--
Recursing into Form XObject /OCR-bWPu0wv2KG52KckIrd7dgw in page 0                                        optimize.py:265
xref 26: skipping image because it is an SMask                                                           optimize.py:280
xref 16: treating as an optimization candidate                                                           optimize.py:282
Recursing into Form XObject /OCR-BLMEU-cDuckumi1ZbRkhyg in page 1                                        optimize.py:265
xref 27: skipping image because it is an SMask                                                           optimize.py:280
xref 20: treating as an optimization candidate                                                           optimize.py:282
Optimizable images: JBIG2 groups: 0                                                                      optimize.py:363
JBIG2                 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━   0% 0/0 -:--:--
Image optimization did not improve the file - optimizations will not be used                             optimize.py:720
Running: ['jbig2', '--version']                                                                          __init__.py:133
Running: ['pngquant', '--version']                                                                       __init__.py:133
Image optimization ratio: 1.00 savings: -0.0%                                                           _pipeline.py:989
Total file size ratio: 0.99 savings: -0.7%                                                              _pipeline.py:992
/var/folders/yl/xd3tsv2x1959s23ts4k1qt9m0000gr/T/ocrmypdf.io.s_f6cvgn/optimize.pdf ->                  _pipeline.py:1064
Desktop/AU6P9FDZ/magick_ocr.pdf
@Jmuccigr added the triage label on Sep 24, 2024