12

This is an interesting topic. Basically, I have an image that contains some text. How do I extract the text from the image?

I have already tried many things, but everything I do is very tedious and usually does not work. I am simply wondering if there is a fairly easy way to do this.

I came across this: http://sourceforge.net/projects/javaocr/. I have worked with it for hours, but I cannot get it to take an Image and return a String containing the text in that image.

Thank you all in advance!

Dylan Wheeler
  • You could also find this helpful: http://stackoverflow.com/questions/9480831/java-ocr-api-open-source-on-eclipse/9481603#9481603 – Nikolay May 03 '12 at 04:42

4 Answers

7

You need to look into Java OCR implementations. Take a look at this question: Java OCR

Community
Josh Diehl
4

Tess4J, a JNA wrapper around Tesseract engine, supports APIs that take BufferedImage, File, or image data as input, and return String as output.
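As a rough illustration of the "BufferedImage in, String out" flow described above, a minimal sketch might look like the following. It assumes the Tess4J jar and the native Tesseract library are available, and that a `tessdata` folder containing `eng.traineddata` exists. The class name `ImageTextExtractor`, the helper names, and the paths `sample.png` and `tessdata` are made up for this example.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

import net.sourceforge.tess4j.ITesseract;
import net.sourceforge.tess4j.Tesseract;
import net.sourceforge.tess4j.TesseractException;

public class ImageTextExtractor {

    /** Loads an image and fails loudly if ImageIO cannot decode the file. */
    public static BufferedImage loadImage(File file) throws IOException {
        // ImageIO.read returns null (rather than throwing) when no reader
        // supports the file format, so check for that explicitly.
        BufferedImage image = ImageIO.read(file);
        if (image == null) {
            throw new IOException("Unsupported or unreadable image: " + file);
        }
        return image;
    }

    /** Runs Tesseract on the image and returns the recognized text. */
    public static String extractText(BufferedImage image, String tessdataDir)
            throws TesseractException {
        ITesseract tesseract = new Tesseract();
        // Assumption: tessdataDir contains eng.traineddata.
        tesseract.setDatapath(tessdataDir);
        tesseract.setLanguage("eng");
        return tesseract.doOCR(image);
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical paths; replace with your own image and tessdata folder.
        BufferedImage image = loadImage(new File("sample.png"));
        System.out.println(extractText(image, "tessdata"));
    }
}
```

Recognition quality depends heavily on the input, so treat the returned String as a best-effort result rather than an exact transcription.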

nguyenq
  • I know I'm commenting after 3 years, but your answer should be the right answer. 'javaOCR' has many problems, but this API works very well. – SlimenTN Jun 18 '15 at 08:47
2

You need an OCR (optical character recognition) library, or you need to write your own. Check out this SO question.

Community
Pablo Santa Cruz
0

Try this character recognition library: http://sourceforge.net/projects/javaocr/

Jonathan