Optical Character Recognition

A core Document AI task that converts text within images or documents into machine-processable text.

OCR is one of the foundational building blocks of document intelligence systems. It extracts printed or handwritten text from images so that the text becomes searchable, indexable, analyzable, and usable in automated workflows. Modern OCR systems do more than recognize individual characters; they also handle multiple fonts, degraded scans, skewed pages, and multilingual content. OCR is critical for archival digitization, invoice processing, and enterprise document automation.
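
To make the idea of turning pixels into machine-processable text concrete, here is a minimal sketch of template matching, one of the simplest recognition approaches. It is a toy, not how modern OCR engines work: the 3x5 glyph templates, the fixed-width segmentation, and every function name below are assumptions made for illustration. Production systems typically rely on learned models and layout analysis instead.

```python
# Toy OCR by nearest-template matching. All glyphs and names here are
# hypothetical and exist only to illustrate the image-to-text step.

# 3x5 binary glyphs: '#' is ink, '.' is background.
TEMPLATES = {
    "O": ["###", "#.#", "#.#", "#.#", "###"],
    "C": ["###", "#..", "#..", "#..", "###"],
    "R": ["##.", "#.#", "##.", "#.#", "#.#"],
}
GLYPH_W, GLYPH_H, GAP = 3, 5, 1


def render(text):
    """Draw text as a list of row strings with a blank column between glyphs."""
    return [
        ("." * GAP).join(TEMPLATES[ch][y] for ch in text)
        for y in range(GLYPH_H)
    ]


def flip_pixel(image, x, y):
    """Return a copy of the image with one pixel inverted (simulated scan noise)."""
    row = image[y]
    flipped = "." if row[x] == "#" else "#"
    noisy = list(image)
    noisy[y] = row[:x] + flipped + row[x + 1:]
    return noisy


def hamming(a, b):
    """Count the pixels that differ between two equally sized glyphs."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def recognize(image):
    """Cut the image into fixed-width cells and pick the closest template for each."""
    width = len(image[0])
    chars = []
    for x0 in range(0, width, GLYPH_W + GAP):
        cell = [row[x0:x0 + GLYPH_W] for row in image]
        chars.append(min(TEMPLATES, key=lambda ch: hamming(cell, TEMPLATES[ch])))
    return "".join(chars)


page = render("OCR")
noisy_page = flip_pixel(page, 5, 2)  # damage one pixel inside the "C"
print("\n".join(noisy_page))
print(recognize(noisy_page))
```

Because every pair of these templates differs in at least three pixels, a single flipped pixel still leaves the damaged glyph closest to its original template. That is a very small-scale version of the tolerance to degraded scans described above. Skewed pages, variable fonts, and multilingual text are beyond what fixed templates and fixed-width cells can handle.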