I want to know how to extract data from a PDF using Python in PyCharm. I tried importing PyPDF2 and writing some code, but it does not show any results.
Can you show what code you have so far? – kpie Feb 22 '22 at 05:11
2 Answers
1
PyPDF2, PyPDF3, and PyPDF4 were all unmaintained at the time of writing (PyPDF2 has since been continued under the name pypdf). I would recommend taking a look at this question and trying one of the many different methods discussed.
According to the PyPDF2 documentation, the extractText() method "works well for some PDF files, but poorly for others, depending on the generator used". Without seeing your code it is hard to say for sure, but the problem may well be the PDF file itself: if it was produced by a generator that extractText() handles poorly, you can get little or no text back even when your code is correct.
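To check whether the file is the culprit, it can help to see how much text each page actually yields. The following is a minimal sketch, not a definitive implementation. It assumes the pypdf package (the continuation of PyPDF2) is installed with `pip install pypdf`. The helper names `extract_page_texts`, `report`, and `main` are made up for this illustration.

```python
def extract_page_texts(reader):
    """Return the extracted text of every page; empty string when nothing comes back."""
    return [(page.extract_text() or "") for page in reader.pages]


def report(texts):
    """Build one status line per page so that empty extractions are easy to spot."""
    lines = []
    for number, text in enumerate(texts, start=1):
        if text.strip():
            lines.append(f"page {number}: {len(text)} characters")
        else:
            # Often a sign of a scanned page or an awkward PDF generator.
            lines.append(f"page {number}: no text extracted")
    return lines


def main(path):
    # Assumption: pypdf is installed. Imported here so the helpers above
    # can be used and tested without it.
    from pypdf import PdfReader

    texts = extract_page_texts(PdfReader(path))
    print("\n".join(report(texts)))
```

Calling `main("example.pdf")` (a placeholder path) prints one line per page. If every page reports no text, the PDF probably contains scanned images, and an OCR-based approach is likely needed instead.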
farzany
0
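Use this code. Note that it relies on the legacy PyPDF2 API (versions before 3.0), and that the text has to be printed explicitly, otherwise nothing is shown:
from PyPDF2 import PdfFileReader

filename = "example.pdf"  # placeholder: path to your PDF file
reader = PdfFileReader(filename)
num_pages = reader.getNumPages()  # total number of pages
for page_index in range(num_pages):
    page = reader.getPage(page_index)
    page_data = page.extractText()
    print(page_data)  # display the text extracted from this page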
Shubham Korade