OCRopus - Open-Source Layout Analysis and OCR
OCRopus is a state-of-the-art document analysis and OCR system, featuring pluggable layout analysis, pluggable character recognition, statistical natural language modeling, and multi-lingual capabilities. This server allows you to use the system through your web browser.
- the system is currently optimized for English text
- the best resolution to use is around 300 dpi
- the processing time increases with the image size and can be up to one minute for a large page
Send a PNG or JPG file to OCRopus.
Please note that OCRopus stores the document image you upload, and we may retain it for debugging purposes or to improve our services. By uploading a document image, you accept these conditions.
Programmatic Interface
To submit your image programmatically, send a POST request to this URL. The image must be sent as a parameter named "file", and you must also set "curl=1". Optionally, you can add "bin=1" and "pageseg=1" to request binarization and page segmentation of the image.
From the command line, you can do this using:
curl -D header.out -F 'file=@input.png;type=image/png' -F curl=1 http://demo.iupr.org/cgi-bin/upload.cgi > output.html
You can also do this easily using the HTTP implementation in your favorite programming language (C#, Python, Java, Perl, etc.).
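As a rough illustration of the same request from a programming language, here is a minimal Python sketch that uses only the standard library. It mirrors the curl command above: the image goes in the "file" parameter and "curl=1" is always set. The helper names (build_multipart, build_request, submit_image) and the 120-second timeout are assumptions made for this example, not part of the OCRopus service, and the sketch assumes the server accepts a standard multipart/form-data body like the one curl -F produces.

```python
import mimetypes
import os
import urllib.request
import uuid

# URL taken from the curl example above.
UPLOAD_URL = "http://demo.iupr.org/cgi-bin/upload.cgi"


def build_multipart(fields, file_field, filename, file_bytes, boundary=None):
    """Encode text fields plus one file as a multipart/form-data body.

    Returns (body_bytes, content_type_header). Hypothetical helper.
    """
    boundary = boundary or uuid.uuid4().hex
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    parts = []
    for name, value in fields.items():
        parts.append(
            (f"--{boundary}\r\n"
             f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
             f"{value}\r\n").encode("utf-8")
        )
    parts.append(
        (f"--{boundary}\r\n"
         f'Content-Disposition: form-data; name="{file_field}"; '
         f'filename="{filename}"\r\n'
         f"Content-Type: {mime}\r\n\r\n").encode("utf-8")
        + file_bytes + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_request(image_path, binarize=False, pageseg=False, url=UPLOAD_URL):
    """Build the POST request without sending it. Hypothetical helper."""
    fields = {"curl": "1"}          # required by the server
    if binarize:
        fields["bin"] = "1"         # optional: binarization
    if pageseg:
        fields["pageseg"] = "1"     # optional: page segmentation
    with open(image_path, "rb") as f:
        data = f.read()
    body, content_type = build_multipart(
        fields, "file", os.path.basename(image_path), data
    )
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )


def submit_image(image_path, **options):
    """Send the image and return the raw response body as bytes."""
    request = build_request(image_path, **options)
    # Large pages can take up to a minute, so allow a generous timeout.
    with urllib.request.urlopen(request, timeout=120) as response:
        return response.read()
```

Calling submit_image("input.png", binarize=True) would then return the server's response as bytes, which you could write to output.html just as the curl command does.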


