OCRopus - Open-Source Layout Analysis and OCR

OCRopus is a state-of-the-art document analysis and OCR system, featuring pluggable layout analysis, pluggable character recognition, statistical natural language modeling, and multi-lingual capabilities. This server allows you to use the system through your web browser.

Please note:

You can either submit an image through the form interface, or you can submit it programmatically through HTTP.

Form Interface

Note: We may retain submitted data for debugging purposes or to improve the service.

File (max. 8MB):

Examples

If you do not have an image at hand or want to try some of our images, try one of these (note that results may be cached):
Show OCR result

Programmatic Interface

To submit your image programmatically, send an HTTP POST request to this URL with the image as a form field named "imagefile".

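From the command line, you can do this with curl. The following command saves the response headers to header.out and the returned HTML to output.html: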

curl -D header.out -F 'imagefile=@input.png;type=image/png' http://demo.iupr.org/ocropus/ > output.html

You can send the same request with the HTTP client library of your preferred programming language (C#, Python, Java, Perl, etc.).
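
As an illustration, here is a minimal Python sketch of the same upload using only the standard library. It assumes the server accepts a multipart/form-data POST, as the curl -F option sends, and returns an HTML page. The URL is copied from the curl command above. The function names build_multipart and submit_image are hypothetical helpers made up for this example and are not part of OCRopus.

```python
import mimetypes
import os
import urllib.request
import uuid

# URL copied from the curl example; whether the server is reachable is not guaranteed.
OCROPUS_URL = "http://demo.iupr.org/ocropus/"


def build_multipart(field_name, filename, data, content_type=None):
    """Encode a single file as a multipart/form-data body.

    Returns a tuple (body_bytes, content_type_header_value).
    """
    if content_type is None:
        # Guess from the file extension, falling back to a generic binary type.
        content_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n"
        "\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def submit_image(path, url=OCROPUS_URL, timeout=60):
    """POST the image at `path` as the "imagefile" field and return the response bytes."""
    with open(path, "rb") as f:
        data = f.read()
    body, header = build_multipart("imagefile", os.path.basename(path), data)
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": header}, method="POST"
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


# Example use, mirroring the curl command (requires network access and input.png):
#     with open("output.html", "wb") as out:
#         out.write(submit_image("input.png"))
```

The multipart body is built by hand here only to avoid third-party dependencies. A higher-level HTTP library that supports file uploads would do the same job with less code.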