149

I'm looking for a Java OCR library that runs on Android; however, Asprise doesn't seem to be platform independent. Is there any open-source/free Java OCR library I can use for Android application development?

Utsav Gupta
user121196
  • 5
    Google recently released an OCR API: https://developers.google.com/vision/text-overview – Wirling Jun 28 '16 at 09:55
  • For people coming here in 2021, there is a great library for Android/iOS: https://developers.google.com/ml-kit – Karam Mar 16 '21 at 14:14

6 Answers

38

OCR can be pretty CPU intensive; you might want to reconsider doing it on a smartphone.

That aside, to my knowledge the popular OCR libraries are Asprise and Tesseract. Neither is pure Java, so you're not going to get a drop-in Android OCR library.

However, Tesseract is open source (hosted on GitHub, in fact), so you can throw some time at porting the subset you need to Java. My understanding is that it's not insane C++, so depending on how badly you need OCR, it might be worth the time.

So short answer: No.

Long answer: if you're willing to work for it.

BlueWizard
Kevin Montrose
  • 2
    By porting it over, do you mean rewriting the subset in Java? That might take a lot of effort. So there is no 100% Java OCR out there? – user121196 Jul 10 '09 at 00:39
  • To the best of my knowledge, no, there is not. – Kevin Montrose Jul 10 '09 at 06:13
  • 13
    I would recommend trying to wrap Tesseract in a JNI layer through Android NDK, rather than trying to port it to Android's Java. Tesseract already appears to be ported to ARM, so it should be easier to put a JNI API on top of it. Also, this keeps it fast(er) than any Java port would be, and would simplify long-term maintenance. – CommonsWare Sep 21 '09 at 19:00
  • 15
    There is already a Tesseract JNI interface for Java called Tessjeract. http://code.google.com/p/tesjeract/ – sventechie Dec 04 '09 at 19:21
  • 1
    Tesseract will not be a short walk from C to Java. The code I've seen is highly idiomatic '80s C and not easily transportable to other languages. – plinth Dec 21 '09 at 15:16
  • 1
    Warning: Asprise can't run on Android; it uses Swing. – Taiko Nov 26 '14 at 09:14
  • @sventechie There is no sign of TessJeRact on the web nowadays (the correct URL was probably [http://code.google.com/p/tessjeract](http://code.google.com/p/tessjeract), with `ss`). – ᴠɪɴᴄᴇɴᴛ May 19 '15 at 14:28
  • 2
    @vincent It disappeared in the last year. A JNA version is now available: https://github.com/nguyenq/tess4j and there is also an Android fork: https://github.com/rmtheis/tess-two – sventechie May 22 '15 at 16:06
21

I am having quite a lot of luck with tesseract-android-tools.

Ben Pearson
  • The question has been closed, but it's good to find someone who has had positive results. It's very hard to find people on these SourceForge-type projects. Question: did you try Tesseract with image scans of passports or ID documents? It seems OK with text PDFs, but I'm struggling with images. – PKHunter Sep 02 '14 at 01:56
  • I didn't try it with anything that had images on, it was just a document with text (same font, typeface, size) – Ben Pearson Sep 02 '14 at 13:36
  • I hope it works for me – Romantic Electron Nov 28 '14 at 20:38
  • I have worked with tesseract with images with text and it was successful – tharindu Mar 08 '21 at 12:55
20

Another option could be to post the image to a web app (possibly at a later moment) and have it OCR-processed there, avoiding both the C++ -> Java porting issues and the risk of clogging the mobile CPU.
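
A minimal sketch of the client side of that idea, assuming a hypothetical endpoint that accepts the raw image bytes as a POST body and replies with the recognized text as UTF-8 plain text. The class name `RemoteOcrClient` and the request format are illustrative assumptions, not the API of any real OCR service; only standard `java.net` classes are used.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

/** Sketch: upload image bytes to a hypothetical OCR web service and return its text reply. */
public class RemoteOcrClient {

    // Hypothetical endpoint; the real URL and protocol depend on the server you run.
    private final URL endpoint;

    public RemoteOcrClient(URL endpoint) {
        this.endpoint = endpoint;
    }

    /** Sends the image as the POST body and returns the response body decoded as UTF-8. */
    public String recognize(byte[] imageBytes) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) endpoint.openConnection();
        try {
            conn.setRequestMethod("POST");
            conn.setDoOutput(true);
            conn.setConnectTimeout(10_000);
            conn.setReadTimeout(30_000);
            conn.setRequestProperty("Content-Type", "application/octet-stream");
            // Stream the body with a known length instead of buffering it internally.
            conn.setFixedLengthStreamingMode(imageBytes.length);

            try (OutputStream out = conn.getOutputStream()) {
                out.write(imageBytes);
            }

            int status = conn.getResponseCode();
            if (status != HttpURLConnection.HTTP_OK) {
                throw new IOException("OCR service returned HTTP " + status);
            }
            try (InputStream in = conn.getInputStream()) {
                return new String(readAll(in), StandardCharsets.UTF_8);
            }
        } finally {
            conn.disconnect();
        }
    }

    // Reads the stream to the end; written as a loop so it also works on older runtimes.
    private static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return buffer.toByteArray();
    }
}
```

On Android, `recognize` has to be called from a background thread, since network access on the main thread is not allowed. Queuing the upload for later, as suggested above, fits naturally with that.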

Jaco
7

Google Goggles is the perfect application for doing both OCR and translation.
And the good news is that Google Goggles is set to become an app platform.

Until then, you can use IQ Engines.

harrymc
4

Yes there is.

But OCR is a very broad field. I know an Android application that has an OCR feature, but that might not be the kind of OCR you are looking for.

This open-source application is called Aedict, and it does OCR on handwritten Japanese characters. It is not that slow.

If it is not what you are looking for, please specify which kind of characters and which kind of data input (image or X-Y touch history) you have in mind.

Nicolas Raoul
2

You can use the Google Docs OCR reader.

Jeff Axelrod
richardwiden