OCRopus

Document analysis and OCR system
  http://code.google.com/p/ocropus/

OCRopus(tm) is a state-of-the-art document analysis and optical character recognition (OCR) system, featuring pluggable layout analysis, pluggable character recognition, statistical natural language modeling, and multilingual capabilities.

The OCRopus engine is based on two research projects: a high-performance handwriting recognizer developed in the mid-1990s and deployed by the US Census Bureau, and novel high-performance layout analysis methods.

OCRopus development is sponsored by Google and is initially intended for high-throughput, high-volume document conversion efforts. It will also be an excellent OCR system for many other applications.