Find, delete and add text into pdf file in Python

Question

I have a PDF file from which I need to delete certain text, and then add new text below some of the existing text. I'm trying to use the PyMuPDF library (fitz). I can open the file and search for the text, but I have not found how to delete it or how to add new text. Could you please help me delete the found text and add new text to the existing text? The choice of library is not important; PyPDF2 or others would be fine. A sample PDF file with a description is attached.

import fitz

# MyFilePath and OutputFilePath are placeholders for the input and output paths
doc = fitz.open(MyFilePath)
page = doc[0]

text1 = "ANA"
text_instances1 = page.searchFor(text1)

# found text should be deleted …

text_to_add = "Text"
text2 = "TAIL NO."
text_instances2 = page.searchFor(text2)

# should be added "text_to_add" after found text "text2"

doc.save(OutputFilePath, garbage=4, deflate=True, clean=True)


is the problem solved? – AzyCrw4282 Jul 09 '20 at 18:39

score 0 · Answer 1 · answered Jul 08 '20 at 17:34

The library doesn't officially support adding or deleting text in a PDF document. However, a recorded issue describes a workaround for this. You can see the answer here from the author of the library on how to get around it using a text modification method.
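A different route from the workaround mentioned above, assuming a PyMuPDF release recent enough to provide redaction annotations and the snake_case method names (`search_for`, `add_redact_annot`, `apply_redactions`, `insert_text`), could look roughly like the sketch below. The helper names `point_below` and `delete_and_add`, the 12-point gap and the font size are assumptions made for illustration; they do not come from the question or from the linked issue.

```python
def point_below(rect, gap=12.0):
    """Return (x, y) for a text baseline `gap` points below a rectangle.

    `rect` is (x0, y0, x1, y1) in page coordinates as PyMuPDF reports
    them, with y growing downward. Hypothetical helper for this sketch.
    """
    x0, _y0, _x1, y1 = rect
    return (x0, y1 + gap)


def delete_and_add(src, dst, to_delete, anchor, to_add):
    """Redact every hit of `to_delete`, then write `to_add` below `anchor`."""
    import fitz  # PyMuPDF; imported here so point_below works without it

    doc = fitz.open(src)
    page = doc[0]

    # Locate the anchor first, in case a redaction overlaps it.
    anchor_hits = page.search_for(anchor)

    # Mark each occurrence for removal, then apply all redactions at once.
    for rect in page.search_for(to_delete):
        page.add_redact_annot(rect)
    page.apply_redactions()

    # insert_text expects the start point of the new text's baseline.
    for rect in anchor_hits:
        page.insert_text(point_below(tuple(rect)), to_add, fontsize=10)

    doc.save(dst, garbage=4, deflate=True, clean=True)
    doc.close()
```

With the values from the question this would be called as `delete_and_add(MyFilePath, OutputFilePath, "ANA", "TAIL NO.", "Text")`. Applying a redaction removes all text inside the matched rectangles, so it is worth inspecting the search hits before applying them.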

It also worries me that the documentation for the library seems to be unavailable. I'm not sure whether this is permanent, but if it is, you should consider using a different library. See the answers to "Add text to Existing PDF using Python" for the best alternative libraries.

Beware that that workaround explicitly is only *for text coded in ASCII* or Latin. If you eventually get arbitrary input documents, you cannot count on that, even if the text only used characters from the ASCII range. — mkl, Jul 09 '20 at 05:02
