How to use PyMuPDF to extract table information? #700

kingqiuol · 2020-10-22T08:48:05Z

I need to extract information from PDFs, including text, images and tables, but I cannot find a way to extract table information with PyMuPDF.

JorjMcKie · 2020-10-22T09:43:22Z

That is correct: interpretation of table structures and other elements of tagged PDFs is not supported.

JorjMcKie · 2020-10-23T15:29:00Z

Since v1.18.0, extracting drawings is supported, meaning lines, rectangles and curves.
This may help in identifying tables ...
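
As a rough sketch of how extracted drawings might be used to spot table rulings: the snippet below assumes a recent PyMuPDF release in which the method is named `page.get_drawings()` and returns path dictionaries whose `"items"` hold tuples such as `("l", p1, p2)` for lines and `("re", rect, ...)` for rectangles. The helper names `ruling_lines` and `page_ruling_lines` and the tolerance `tol` are hypothetical, chosen only for illustration.

```python
def ruling_lines(drawings, tol=1.0):
    """Collect horizontal and vertical segments from get_drawings()-style paths.

    Points and rectangles are read by index, so PyMuPDF Point/Rect objects
    and plain tuples both work.
    """
    horizontal, vertical = [], []

    def add(x0, y0, x1, y1):
        # Classify a segment by which coordinate stays (nearly) constant.
        if abs(y0 - y1) <= tol:
            horizontal.append((x0, y0, x1, y1))
        elif abs(x0 - x1) <= tol:
            vertical.append((x0, y0, x1, y1))
        # Diagonal segments are ignored.

    for path in drawings:
        for item in path["items"]:
            kind = item[0]
            if kind == "l":  # straight line between two points
                p1, p2 = item[1], item[2]
                add(p1[0], p1[1], p2[0], p2[1])
            elif kind == "re":  # rectangle: treat its four edges as segments
                r = item[1]
                x0, y0, x1, y1 = r[0], r[1], r[2], r[3]
                add(x0, y0, x1, y0)  # top edge
                add(x0, y1, x1, y1)  # bottom edge
                add(x0, y0, x0, y1)  # left edge
                add(x1, y0, x1, y1)  # right edge
            # Curves ("c") and other item types are skipped in this sketch.
    return horizontal, vertical


def page_ruling_lines(pdf_path, page_number=0):
    """Run ruling_lines() on one page of a PDF (requires PyMuPDF)."""
    import fitz  # PyMuPDF, imported here so ruling_lines() has no dependency

    doc = fitz.open(pdf_path)
    try:
        return ruling_lines(doc[page_number].get_drawings())
    finally:
        doc.close()
```

Many horizontal and vertical segments sharing the same coordinates on a page are a hint that a ruled table may be present. Deciding which segments form cells, and which text belongs to each cell, is left to the caller.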

MartinThoma · 2023-03-26T11:36:22Z

Is there maybe an update to this?

Can PyMuPDF extract marked content from tagged PDFs?

kingqiuol added the question label Oct 22, 2020

kingqiuol assigned JorjMcKie Oct 22, 2020

JorjMcKie closed this as completed Oct 23, 2020

JorjMcKie added the resolved fixed / implemented / answered label Nov 11, 2020

MartinThoma mentioned this issue Mar 26, 2023

New feature: FPDF.table() py-pdf/fpdf2#701

Closed
