I'm using Tesseract from Python to read license plates with the function image_to_string(). The plates contain only uppercase letters and digits. Occasionally, Tesseract misreads a digit or an uppercase letter as a lowercase letter.

I know that I can specify a whitelist of characters that includes only uppercase letters and digits. What I really want to know is how the whitelist is applied. Does it make the OCR algorithm skip characters that are not on the whitelist and keep trying to match the symbol against the whitelisted ones? Or does it simply cause image_to_string() to discard any characters it has already recognized that are not on the whitelist?
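To make the two possibilities concrete, here is a minimal sketch of each. It assumes the pytesseract wrapper (the question only names image_to_string()) and Tesseract's `tessedit_char_whitelist` config variable; the helper names `build_config`, `read_plate`, and `filter_text` are hypothetical, and `--psm 7` (treat the image as a single text line) is just a guess at a sensible mode for plates. Whitelist support has varied between Tesseract versions and engines, so this is worth checking against the installed version.

```python
import string

# Characters allowed on the plates described above: uppercase letters and digits.
PLATE_CHARS = string.ascii_uppercase + string.digits


def build_config(whitelist=PLATE_CHARS, psm=7):
    """Build a config string asking Tesseract itself to restrict recognition."""
    return f"--psm {psm} -c tessedit_char_whitelist={whitelist}"


def read_plate(image):
    """Possibility 1: pass the whitelist to Tesseract (hypothetical helper)."""
    import pytesseract  # assumed wrapper; imported lazily so the helpers work without it
    return pytesseract.image_to_string(image, config=build_config()).strip()


def filter_text(text, allowed=PLATE_CHARS):
    """Possibility 2: plain post-filtering that drops unlisted characters."""
    return "".join(ch for ch in text if ch in allowed)
```

If the whitelist only post-filtered, `read_plate(img)` would behave like running `filter_text()` on an unrestricted result, so a misread lowercase character would simply vanish instead of being replaced by an uppercase letter or digit. Comparing the two outputs on a plate that is known to be misread would show which behaviour is in effect.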

Zizumara
See https://stackoverflow.com/questions/2363490/limit-characters-tesseract-is-looking-for; it seems like the whitelist limits the characters Tesseract looks for. – KMM May 19 '22 at 01:33

0 Answers