OcrdExif: Pillow's IFDRational is unsupported #270

Closed
bertsky opened this issue Aug 3, 2019 · 4 comments

@bertsky
Collaborator

bertsky commented Aug 3, 2019

I get the following error when using OcrdExif on (most of) the TIFFs in the Metastore GT Bags:

14:45:37.787 INFO ocrd.task_sequence - Start processing task 'cis-ocropy-clip -I OCR-D-GT-SEG-LINE -O OCR-D-GT-SEG-LINE-CLIP'
14:45:38.968 INFO processor.OcropyClip - INPUT FILE 0 / phys_0001
Traceback (most recent call last):
  File "env3/bin/ocrd-cis-ocropy-clip", line 11, in <module>
    load_entry_point('cis-ocrd', 'console_scripts', 'ocrd-cis-ocropy-clip')()
  File "env3/lib/python3.6/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "env3/lib/python3.6/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "env3/lib/python3.6/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "env3/lib/python3.6/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "ocrd_cis/ocropy/cli.py", line 27, in cis_ocrd_ocropy_clip
    return ocrd_cli_wrap_processor(OcropyClip, *args, **kwargs)
  File "env3/lib/python3.6/site-packages/ocrd/decorators.py", line 38, in ocrd_cli_wrap_processor
    run_processor(processorClass, ocrd_tool, mets, workspace=workspace, **kwargs)
  File "env3/lib/python3.6/site-packages/ocrd/processor/base.py", line 65, in run_processor
    processor.process()
  File "ocrd_cis/ocropy/clip.py", line 96, in process
    self.workspace, page, page_id)
  File "ocrd_cis/ocropy/common.py", line 776, in image_from_page
    page_image_info = OcrdExif(page_image)
  File "env3/lib/python3.6/site-packages/ocrd_models/ocrd_exif.py", line 54, in __init__
    self.resolution = round(sqrt(self.xResolution * self.yResolution))
TypeError: unsupported operand type(s) for *: 'IFDRational' and 'IFDRational'
Traceback (most recent call last):
  File "env3/bin/ocrd", line 10, in <module>
    sys.exit(cli())
  File "env3/lib/python3.6/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "env3/lib/python3.6/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "env3/lib/python3.6/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "env3/lib/python3.6/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "env3/lib/python3.6/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "env3/lib/python3.6/site-packages/ocrd/cli/process.py", line 26, in process_cli
    run_tasks(mets, log_level, page_id, tasks)
  File "env3/lib/python3.6/site-packages/ocrd/task_sequence.py", line 95, in run_tasks
    raise Exception("%s exited with non-zero return value %s" % (task.executable, returncode))
Exception: ocrd-cis-ocropy-clip exited with non-zero return value 1

It does not happen on loeber_heuschrecken_1693 and wecker_kochbuch_1598, but it does on all 34 other bags.

We already chased this exact error in a Travis build for Python 3.4 here, but this time the error occurs on Python 3.6.5 (and Pillow 5.4.1)!

I fail to see any reason why the class IFDRational would not allow the multiplication operator.
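
One possible explanation, offered as an assumption rather than a confirmed diagnosis of Pillow 5.4.1: if a wrapper class forwards `*` to a wrapped `Fraction` without unwrapping the other operand, `Fraction.__mul__` returns `NotImplemented`, and since both operands have the same type Python does not try a reflected method, so the multiplication ends in exactly this kind of TypeError. The sketch below uses a hypothetical stand-in class `DelegatingRational` (not Pillow's `IFDRational`) and a hypothetical helper `mean_resolution` to show the failure and how casting to `float` first sidesteps it.

```python
from fractions import Fraction
from math import sqrt


class DelegatingRational:
    """Hypothetical stand-in, NOT Pillow's IFDRational.

    It forwards multiplication to a wrapped Fraction without unwrapping
    the other operand.
    """

    def __init__(self, numerator, denominator=1):
        self._val = Fraction(numerator, denominator)

    def __mul__(self, other):
        # Fraction does not recognise this wrapper type, so for a
        # DelegatingRational operand this returns NotImplemented.
        return self._val.__mul__(other)

    def __float__(self):
        return float(self._val)


def mean_resolution(x_res, y_res):
    """Rounded geometric mean of two resolutions, cast to float first."""
    return round(sqrt(float(x_res) * float(y_res)))


if __name__ == "__main__":
    x = DelegatingRational(300)
    y = DelegatingRational(300)
    try:
        x * y
    except TypeError as err:
        print("direct multiplication fails:", err)
    print("with cast:", mean_resolution(x, y))
```

Casting to `float` (or `int`) before the arithmetic avoids relying on the operator support of whatever rational type the image library returns.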

@bertsky
Collaborator Author

bertsky commented Aug 3, 2019

This is critical, because it drags down our current image_from_page with it.

Those 2 bags which evade the error are from the subset we already identified as coming from lossy JPEGs. And, it seems, these 2 are the ones which show no dpi info in Pillow at all (all others do).
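
A minimal sketch of how missing dpi info could be handled separately from the rational case, assuming a dict shaped like Pillow's `Image.info`, where `"dpi"` (when present) holds an `(x, y)` pair. The helper name `resolution_from_info` and its `default` parameter are hypothetical and not part of OcrdExif.

```python
from fractions import Fraction
from math import sqrt


def resolution_from_info(info, default=None):
    """Return the rounded geometric mean of info["dpi"], or default if absent."""
    dpi = info.get("dpi")
    if dpi is None:
        # Images without dpi metadata never reach the multiplication.
        return default
    x_res, y_res = dpi
    # Cast first so rational-like values do not need operator support.
    return round(sqrt(float(x_res) * float(y_res)))


if __name__ == "__main__":
    print(resolution_from_info({}))
    print(resolution_from_info({"dpi": (Fraction(300, 1), Fraction(300, 1))}))
```

This would be consistent with the observation that files without dpi info never hit the failing multiplication.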

The error disappears with Pillow 6.1.0 BTW.

@bertsky
Collaborator Author

bertsky commented Aug 3, 2019

So can someone please determine whether #238 still happens with 6.1.0?

@kba @finkf I traced your original description in the Gitter log and you reported the problem on a file called eval-grenzboten-shuffle/trainws/179396.tif – can you please publish this somewhere (or try to reproduce with 6.1.0 yourself)?

@kba
Member

kba commented Aug 7, 2019

Here's the TIFF. 179470.zip Debugging this now.

@kba
Member

kba commented Aug 7, 2019

> So can someone please determine whether #238 still happens with 6.1.0?

Unfortunately, yes:

def test_pil_version(self):
    """
    Test segfault issue in PIL TiffImagePlugin

    Run the same code multiple times to make segfaults more probable

    Should fail persistently:
        5.3.1 no
        5.4.1 no
        6.0.0 yes
        6.1.0 yes
    """
    for _ in range(0, 10):
        pil_image = Image.open(assets.path_to('grenzboten-test/data/OCR-D-IMG-BIN/p179470'))
        pil_image.crop(box=[1539, 202, 1626, 271])

throws random segfaults for Pillow 6.0.0 and 6.1.0 :-/
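
Since a segfault takes the whole test runner down with it, one option is to run the suspect code in a child interpreter and inspect its exit status. This is only a sketch under stated assumptions: it relies on POSIX semantics (a child killed by a signal reports a negative return code), and the helper name `crashed_with_segfault` is hypothetical.

```python
import signal
import subprocess
import sys


def crashed_with_segfault(snippet, repeats=10):
    """Run snippet in fresh interpreters; True if any run dies from SIGSEGV.

    Assumes POSIX, where subprocess reports death by signal N as -N.
    """
    for _ in range(repeats):
        result = subprocess.run([sys.executable, "-c", snippet])
        if result.returncode == -signal.SIGSEGV:
            return True
    return False


if __name__ == "__main__":
    # A harmless snippet should never be reported as a crash.
    print(crashed_with_segfault("pass", repeats=2))
```

The snippet passed in would contain the `Image.open(...)` and `crop(...)` calls from the test above; repeating the run keeps the idea of making an intermittent crash more likely to show up.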

I'll check whether there's a workaround in OcrdExif...

@kba kba closed this as completed in bdc683b Aug 8, 2019
kba added a commit that referenced this issue Aug 8, 2019
exif: cast IDFRational to int, fix #270
@bertsky bertsky mentioned this issue Oct 1, 2019