Document Recognition
Goal: Better OCR and document structure analysis
New developments:
- Document recognizers based on stochastic context-free grammars (as opposed to ad hoc regular grammars)
- Grammar-based recognition enables better treatment of layout and tables
- Automatically extract bibliographic references
- “OCRchie” toolkit handles more mathematics and produces TeX output (with Univ. of Essen)
- Math OCR improvements
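
To illustrate the idea of a stochastic context-free grammar, here is a minimal sketch, not the project's actual recognizer. It assumes a hypothetical toy grammar for a bibliographic reference, already in Chomsky normal form, over pre-classified tokens ("name", "text", "year"). All symbols, rules, and probabilities are invented for illustration. A real document recognizer would work on two-dimensional layout rather than a flat token sequence.

```python
from collections import defaultdict

# Hypothetical toy grammar (Chomsky normal form); all names and
# probabilities are illustrative assumptions, not taken from the project.
BINARY = {  # (left child, right child) -> [(parent, rule probability)]
    ("Authors", "Rest"): [("Ref", 1.0)],
    ("Author", "Authors"): [("Authors", 0.4)],
    ("Title", "Year"): [("Rest", 1.0)],
}
LEXICAL = {  # token class -> [(parent, rule probability)]
    "name": [("Authors", 0.6), ("Author", 1.0)],
    "text": [("Title", 1.0)],
    "year": [("Year", 1.0)],
}


def best_parse_probability(tokens, start="Ref"):
    """Viterbi CYK: probability of the most likely parse, 0.0 if none."""
    n = len(tokens)
    if n == 0:
        return 0.0
    best = defaultdict(float)  # (i, j, symbol) -> best probability for tokens[i:j]

    # Spans of length 1 come from the lexical rules.
    for i, tok in enumerate(tokens):
        for parent, p in LEXICAL.get(tok, []):
            best[(i, i + 1, parent)] = max(best[(i, i + 1, parent)], p)

    # Longer spans combine two adjacent sub-spans through a binary rule.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for (left, right), parents in BINARY.items():
                    p_left = best.get((i, k, left), 0.0)
                    p_right = best.get((k, j, right), 0.0)
                    if p_left == 0.0 or p_right == 0.0:
                        continue
                    for parent, p in parents:
                        candidate = p * p_left * p_right
                        if candidate > best[(i, j, parent)]:
                            best[(i, j, parent)] = candidate

    return best.get((0, n, start), 0.0)


if __name__ == "__main__":
    print(best_parse_probability(["name", "name", "text", "year"]))
    print(best_parse_probability(["text", "name"]))
```

Unlike a regular grammar, the recursive rule for Authors lets nested structure be scored as a whole, and the rule probabilities give a way to rank competing interpretations of the same input.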