import camelot | OSError: Ghostscript is not installed #476

Open
danielbellhv opened this issue Dec 2, 2021 · 5 comments · Fixed by py-pdf/pypdf_table_extraction#149
Comments

danielbellhv commented Dec 2, 2021

Goal: import camelot and add to poetry.lock file

I am trying to install packages with Poetry from VSCode, but I am running into dependency problems.

Environment/ Setup:

  • Windows 10
  • Visual Studio Code
  • Ubuntu WSL v1 CLI Bash
  • Poetry version 1.1.11

Traceback:

me@PF2DCSXD:/mnt/c/Users/me/Documents/GitHub/workers-python/workers/data_simulator/src$ pip install "camelot-py[base]"
Defaulting to user installation because normal site-packages is not writeable
Requirement already satisfied: camelot-py[base] in /home/me/.local/lib/python3.8/site-packages (0.10.1)
Requirement already satisfied: chardet>=3.0.4 in /usr/lib/python3/dist-packages (from camelot-py[base]) (3.0.4)
Requirement already satisfied: pandas>=0.23.4 in /home/me/.local/lib/python3.8/site-packages (from camelot-py[base]) (1.3.4)
Requirement already satisfied: openpyxl>=2.5.8 in /home/me/.local/lib/python3.8/site-packages (from camelot-py[base]) (3.0.9)
Requirement already satisfied: numpy>=1.13.3 in /home/me/.local/lib/python3.8/site-packages (from camelot-py[base]) (1.21.4)
Requirement already satisfied: PyPDF2>=1.26.0 in /home/me/.local/lib/python3.8/site-packages (from camelot-py[base]) (1.26.0)
Requirement already satisfied: click>=6.7 in /usr/lib/python3/dist-packages (from camelot-py[base]) (7.0)
Requirement already satisfied: pdfminer.six>=20200726 in /home/me/.local/lib/python3.8/site-packages (from camelot-py[base]) (20211012)
Requirement already satisfied: tabulate>=0.8.9 in /home/me/.local/lib/python3.8/site-packages (from camelot-py[base]) (0.8.9)
Requirement already satisfied: pdftopng>=0.2.3 in /home/me/.local/lib/python3.8/site-packages (from camelot-py[base]) (0.2.3)
Requirement already satisfied: ghostscript>=0.7 in /home/me/.local/lib/python3.8/site-packages (from camelot-py[base]) (0.7)
Requirement already satisfied: opencv-python>=3.4.2.17 in /home/me/.local/lib/python3.8/site-packages (from camelot-py[base]) (4.5.4.60)
Requirement already satisfied: setuptools>=38.6.0 in /usr/lib/python3/dist-packages (from ghostscript>=0.7->camelot-py[base]) (45.2.0)
Requirement already satisfied: et-xmlfile in /home/me/.local/lib/python3.8/site-packages (from openpyxl>=2.5.8->camelot-py[base]) (1.1.0)
Requirement already satisfied: python-dateutil>=2.7.3 in /home/me/.local/lib/python3.8/site-packages (from pandas>=0.23.4->camelot-py[base]) (2.8.2)
Requirement already satisfied: pytz>=2017.3 in /home/me/.local/lib/python3.8/site-packages (from pandas>=0.23.4->camelot-py[base]) (2021.3)
Requirement already satisfied: cryptography in /usr/lib/python3/dist-packages (from pdfminer.six>=20200726->camelot-py[base]) (2.8)
Requirement already satisfied: six>=1.5 in /usr/lib/python3/dist-packages (from python-dateutil>=2.7.3->pandas>=0.23.4->camelot-py[base]) (1.14.0)
me@PF2DCSXD:/mnt/c/Users/me/Documents/GitHub/workers-python/workers/data_simulator/src$ poetry run python3 scrape_tables.py 
Traceback (most recent call last):
  File "scrape_tables.py", line 1, in <module>
    import camelot
ModuleNotFoundError: No module named 'camelot'
me@PF2DCSXD:/mnt/c/Users/me/Documents/GitHub/workers-python/workers/data_simulator/src$ python3 scrape_tables.py 
Traceback (most recent call last):
  File "scrape_tables.py", line 1, in <module>
    import camelot
  File "/home/me/.local/lib/python3.8/site-packages/camelot/__init__.py", line 6, in <module>
    from .io import read_pdf
  File "/home/me/.local/lib/python3.8/site-packages/camelot/io.py", line 5, in <module>
    from .handlers import PDFHandler
  File "/home/me/.local/lib/python3.8/site-packages/camelot/handlers.py", line 8, in <module>
    from .core import TableList
ImportError: cannot import name 'TableList' from 'camelot.core' (/home/me/.local/lib/python3.8/site-packages/camelot/core/__init__.py)
me@PF2DCSXD:/mnt/c/Users/me/Documents/GitHub/workers-python/workers/data_simulator/src$ 

I also have a SO post about this, but have since got it working in Jupyter Notebook, so I believe this is a library installation conflict in my environment.
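
In the transcript above, pip reports a user installation under /home/me/.local, while `poetry run` cannot find the module, which suggests the two commands may be using different interpreters or site-packages. A minimal sketch for checking this, meant to be run once with `python3` and once with `poetry run python3`; the helper name `describe_environment` is made up for illustration:

```python
import importlib.util
import sys


def describe_environment(module_name="camelot"):
    """Report which interpreter is running and whether it can see a module."""
    # find_spec returns None when a top-level module cannot be located
    spec = importlib.util.find_spec(module_name)
    return {
        "interpreter": sys.executable,
        "module_found": spec is not None,
        "module_location": getattr(spec, "origin", None),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If the two runs print different interpreter paths, the package probably has to be added to the Poetry environment itself (for example via the project's dependency list) rather than with a plain `pip install`.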

@danielbellhv danielbellhv changed the title import camelot | pip install "camelot-py[base]" ImportError: cannot import name 'TableList' from 'camelot.core' Dec 2, 2021

danielbellhv commented Dec 2, 2021

After following this solution (uninstall and re-install), I now get this error:

me@PF2DCSXD:/mnt/c/Users/me/Documents/GitHub/workers-python/workers/data_simulator/src$ python3 scrape_tables.py 
Traceback (most recent call last):
  File "scrape_tables.py", line 14, in <module>
    pdf_esg_scraped = p.map(scrape_tables, PDF_LIST)
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 364, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 771, in get
    raise self._value
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 48, in mapstar
    return list(map(*args))
  File "scrape_tables.py", line 9, in scrape_tables
    tables = camelot.read_pdf(pdf_filename)
  File "/home/danielbellhv/.local/lib/python3.8/site-packages/camelot/io.py", line 113, in read_pdf
    tables = p.parse(
  File "/home/danielbellhv/.local/lib/python3.8/site-packages/camelot/handlers.py", line 176, in parse
    t = parser.extract_tables(
  File "/home/danielbellhv/.local/lib/python3.8/site-packages/camelot/parsers/lattice.py", line 421, in extract_tables
    self.backend.convert(self.filename, self.imagename)
  File "/home/danielbellhv/.local/lib/python3.8/site-packages/camelot/backends/ghostscript_backend.py", line 31, in convert
    raise OSError(
OSError: Ghostscript is not installed. You can install it using the instructions here: https://camelot-py.readthedocs.io/en/master/user/install-deps.html

@danielbellhv danielbellhv changed the title ImportError: cannot import name 'TableList' from 'camelot.core' import camelot | OSError: Ghostscript is not installed. Dec 2, 2021
@danielbellhv danielbellhv changed the title import camelot | OSError: Ghostscript is not installed. import camelot | OSError: Ghostscript is not installed Dec 2, 2021
@danielbellhv (Author)

Attempted Solution:

I got the Ghostscript file path as output in a Jupyter Notebook:

import ctypes
from ctypes.util import find_library
find_library("".join(("gsdll", str(ctypes.sizeof(ctypes.c_voidp) * 8), ".dll")))
>>> 'C:\\Users\\me\\Anaconda3\\Library\\bin\\gsdll64.dll'

Bash:

  • PATH=$PATH:C:\Users\me\Anaconda3\Library\bin\gsdll64.dll
  • PATH=$PATH:C:\\Users\\me\\Anaconda3\\Library\\bin\\gsdll64.dll

Both give this error:

Error:

me@PF2DCSXD:/mnt/c/Users/me/Documents/GitHub/workers-python/workers/data_simulator/src$ python3 scrape_tables.py 
Traceback (most recent call last):
  File "scrape_tables.py", line 14, in <module>
    pdf_esg_scraped = p.map(scrape_tables, PDF_LIST)
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 364, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 771, in get
    raise self._value
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 48, in mapstar
    return list(map(*args))
  File "scrape_tables.py", line 9, in scrape_tables
    tables = camelot.read_pdf(pdf_filename)
  File "/home/danielbellhv/.local/lib/python3.8/site-packages/camelot/io.py", line 113, in read_pdf
    tables = p.parse(
  File "/home/danielbellhv/.local/lib/python3.8/site-packages/camelot/handlers.py", line 176, in parse
    t = parser.extract_tables(
  File "/home/danielbellhv/.local/lib/python3.8/site-packages/camelot/parsers/lattice.py", line 421, in extract_tables
    self.backend.convert(self.filename, self.imagename)
  File "/home/danielbellhv/.local/lib/python3.8/site-packages/camelot/backends/ghostscript_backend.py", line 31, in convert
    raise OSError(
OSError: Ghostscript is not installed. You can install it using the instructions here: https://camelot-py.readthedocs.io/en/master/user/install-deps.html
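
The lookup in the Jupyter snippet above uses the Windows DLL name, but the failing script runs under WSL, where Python reports a Linux platform, so a Windows DLL path appended to PATH is unlikely to be picked up. A rough sketch of a platform-aware lookup, assuming (not confirmed in this thread) that camelot's Ghostscript backend locates the shared library with `find_library`; `ghostscript_library_name` and `find_ghostscript` are hypothetical helper names:

```python
import ctypes
import sys
from ctypes.util import find_library


def ghostscript_library_name(platform=sys.platform):
    """Return the library name to look up for the given platform string."""
    if platform.startswith("win"):
        # Same construction as the Jupyter snippet: gsdll64.dll or gsdll32.dll
        bits = ctypes.sizeof(ctypes.c_void_p) * 8
        return f"gsdll{bits}.dll"
    # Assumption: on Linux (including WSL) the shared library is libgs
    return "gs"


def find_ghostscript():
    """Return the located library name/path, or None if nothing was found."""
    return find_library(ghostscript_library_name())


if __name__ == "__main__":
    print(sys.platform, find_ghostscript())
```

If this prints `None` under WSL, Ghostscript probably needs to be installed inside the Linux environment itself; the install-deps page linked in the error message covers that.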

@danielbellhv (Author)

pip install camelot-py:

Traceback (most recent call last):
  File "scrape_tables.py", line 25, in <module>
    import camelot
ModuleNotFoundError: No module named 'camelot'

pip install camelot:

danielbellhv@PF2DCSXD:/mnt/c/Users/dabell/Documents/GitHub/workers-python/workers/data_simulator/src$ pip install camelot
Requirement already satisfied: camelot in /home/me/.local/lib/python3.8/site-packages (12.6.29)
Requirement already satisfied: Elixir>=0.7.1 in /home/me/.local/lib/python3.8/site-packages (from camelot) (0.7.1)
Requirement already satisfied: SQLAlchemy<0.8.0,>=0.7.7 in /home/me/.local/lib/python3.8/site-packages (from camelot) (0.7.10)
Requirement already satisfied: xlrd==0.7.1 in /home/me/.local/lib/python3.8/site-packages (from camelot) (0.7.1)
Requirement already satisfied: Jinja2>=2.5.5 in /usr/lib/python3/dist-packages (from camelot) (2.10.1)
Requirement already satisfied: xlwt==0.7.2 in /home/me/.local/lib/python3.8/site-packages (from camelot) (0.7.2)
Requirement already satisfied: sqlalchemy-migrate>=0.7.1 in /home/me/.local/lib/python3.8/site-packages (from camelot) (0.11.0)
Requirement already satisfied: chardet>=1.0.1 in /usr/lib/python3/dist-packages (from camelot) (3.0.4)
Requirement already satisfied: decorator in /home/me/.local/lib/python3.8/site-packages (from sqlalchemy-migrate>=0.7.1->camelot) (5.1.0)
Requirement already satisfied: pbr>=1.8 in /home/me/.local/lib/python3.8/site-packages (from sqlalchemy-migrate>=0.7.1->camelot) (5.8.0)
Requirement already satisfied: Tempita>=0.4 in /home/me/.local/lib/python3.8/site-packages (from sqlalchemy-migrate>=0.7.1->camelot) (0.5.2)
Requirement already satisfied: six>=1.7.0 in /usr/lib/python3/dist-packages (from sqlalchemy-migrate>=0.7.1->camelot) (1.14.0)
Requirement already satisfied: sqlparse in /home/me/.local/lib/python3.8/site-packages (from sqlalchemy-migrate>=0.7.1->camelot) (0.4.2)
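
Judging by the dependencies listed above (Elixir, SQLAlchemy, xlrd), the `camelot` distribution on PyPI appears to be a different project from `camelot-py`, even though both are imported as `camelot`. A small sketch to see which of the two is installed in the current environment, using the standard-library `importlib.metadata` (Python 3.8+); `installed_version` is a hypothetical helper:

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Two different PyPI distributions that share the import name "camelot"
for name in ("camelot-py", "camelot"):
    print(name, installed_version(name))
```

If both report a version, uninstalling both and reinstalling only `camelot-py` may be worth trying. That is a guess based on the earlier `TableList` ImportError, not something confirmed in this thread.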


motifiy commented Jun 7, 2022

I think copying gsdll64.dll into your project path might help.

@GitAronas

This works for me on Windows 10:

  1. First, install Ghostscript from https://www.ghostscript.com/releases/gsdnld.html
  2. Then run pip install ghostscript in your Python virtual environment
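
To check whether step 1 took effect, one option is to look for the Ghostscript command-line executable on PATH. This is only a sketch: the executable names below (`gs` on Linux/macOS, `gswin64c` and `gswin32c` on Windows) are the usual defaults, camelot itself may detect Ghostscript differently, and `find_ghostscript_executable` is a hypothetical helper:

```python
import shutil


def find_ghostscript_executable():
    """Return the path of the first Ghostscript executable found on PATH."""
    # Assumed default executable names; adjust if your install differs
    for name in ("gs", "gswin64c", "gswin32c"):
        path = shutil.which(name)
        if path:
            return path
    return None


if __name__ == "__main__":
    print(find_ghostscript_executable() or "Ghostscript executable not found on PATH")
```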
