Mistral launches OCR 4 with layout-aware document understanding

Mistral updates its document AI stack with OCR 4

Mistral AI has released OCR 4, a new document-understanding model designed to do more than simply extract text from files. The company says the system now returns bounding boxes, block classifications and inline confidence scores together with the recognized text, giving developers a richer output for document workflows.

The model is aimed at enterprise document ingestion, including search, retrieval-augmented generation and domain-specific pipelines. Mistral says OCR 4 can process a wide range of formats, including PDF, DOC, PPT and OpenDocument files, and supports 170 languages across 10 language groups.

A major change in OCR 4 is its structure-aware output. Instead of producing only plain text, the model identifies where each block appears on the page and what kind of content it represents, such as titles, tables, equations or signatures. Mistral says the added structure should make it easier to build citation-ready search systems, support human review and improve downstream extraction workflows.
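As a rough illustration of how such structured output might be consumed, the sketch below models one block as a small record and flags low-confidence blocks for human review. The Block schema, its field names, the 0-to-1 confidence scale and the 0.9 threshold are all assumptions made for this example; they are not Mistral's published response format.

```python
from dataclasses import dataclass

@dataclass
class Block:
    """One recognized region of a page (hypothetical schema, not Mistral's)."""
    page: int
    kind: str          # e.g. "title", "table", "equation", "signature"
    bbox: tuple        # (x0, y0, x1, y1) in assumed page units
    text: str
    confidence: float  # assumed to lie between 0 and 1

def needs_review(blocks, threshold=0.9):
    """Return the blocks whose confidence falls below the threshold."""
    return [b for b in blocks if b.confidence < threshold]

# Made-up sample data standing in for a parsed page.
blocks = [
    Block(1, "title", (72, 60, 540, 90), "Quarterly Report", 0.99),
    Block(1, "table", (72, 120, 540, 400), "Region | Revenue", 0.95),
    Block(1, "signature", (360, 700, 540, 740), "J. Doe", 0.62),
]

flagged = needs_review(blocks)
for b in flagged:
    print(f"page {b.page} {b.kind} at {b.bbox}: {b.text!r} ({b.confidence:.2f})")
```

Because each flagged block carries its page and bounding box, a review tool could highlight the exact region a person needs to check instead of the whole page.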

Focus on enterprise workflows

Mistral is positioning OCR 4 as a tool for both high-volume and compliance-sensitive environments. The company says the model is compact enough to run in a single container and can be deployed fully on customer infrastructure. That self-hosted option is intended to help organizations with data residency, sovereignty or privacy requirements keep documents inside their own systems.

The model is also integrated into Mistral Search Toolkit, the company’s open-source search framework, as an ingestion component. In that setup, OCR 4 output can be used in retrieval pipelines for enterprise search and RAG applications.
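A minimal sketch of what such an ingestion step could look like follows. It turns layout-aware blocks into retrieval chunks that keep the page and bounding box for citation. The dictionary keys, the to_chunks helper and the file name are hypothetical; this does not use or represent the Mistral Search Toolkit API.

```python
def to_chunks(blocks, source):
    """Convert OCR blocks into retrieval chunks that retain citation data.

    `blocks` is assumed to be a list of dicts with "page", "bbox" and "text"
    keys (a made-up shape for this example).
    """
    chunks = []
    for block in blocks:
        text = block["text"].strip()
        if not text:
            continue  # skip empty regions; nothing to index
        chunks.append({
            "text": text,
            "citation": {
                "source": source,
                "page": block["page"],
                "bbox": block["bbox"],
            },
        })
    return chunks

# Made-up sample input.
blocks = [
    {"page": 3, "bbox": (72, 100, 540, 180), "text": "Payment is due in 30 days."},
    {"page": 3, "bbox": (72, 200, 540, 210), "text": "   "},
]

chunks = to_chunks(blocks, source="contract.pdf")
print(chunks[0]["citation"])
```

Keeping the citation next to the text means an answer generated from a chunk can point back to the page and region it came from.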

Mistral says developers can use OCR 4 through an API, while teams that want a no-code path can work through Document AI in Mistral Studio. The company lists OCR 4 at $4 per 1,000 pages through the API and $2 per 1,000 pages through the Batch API. Document AI is priced at $5 per 1,000 pages.
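For a sense of scale, the listed rates can be turned into a simple estimate. The sketch below assumes pricing is strictly linear in page count, with no minimums, volume discounts or rounding rules, none of which are stated in the announcement. The tier names and the estimate_cost helper are illustrative.

```python
from decimal import Decimal

# Listed prices in US dollars per 1,000 pages.
PRICE_PER_1000_PAGES = {
    "api": Decimal("4"),
    "batch": Decimal("2"),
    "document_ai": Decimal("5"),
}

def estimate_cost(pages, tier):
    """Estimate cost in dollars, assuming purely pro-rata pricing."""
    return PRICE_PER_1000_PAGES[tier] * pages / 1000

for tier in PRICE_PER_1000_PAGES:
    print(f"{tier}: ${estimate_cost(250_000, tier)}")
```

Under these assumptions, a 250,000-page archive would cost $1,000 through the API, $500 through the Batch API and $1,250 through Document AI.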

Benchmarks and performance claims

Mistral says OCR 4 outperformed competing OCR and document-AI systems in human preference tests, with independent annotators choosing it more often than rivals across the documents they reviewed. The company also says the model posted the top score on the public OlmOCRBench benchmark, with a score of 85.20, and scored 93.07 on OmniDocBench.

At the same time, Mistral cautioned that popular benchmarks can be imperfect measures of real-world accuracy. The company said some mismatches are caused by reference errors, formatting differences, equation segmentation, or reading-order issues in multi-column documents rather than by model failures.
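The reading-order point can be illustrated with a toy comparison. In the sketch below, a reference transcription reads a two-column page column by column, while the prediction reads it row by row. Every block is recognized correctly, yet a string-similarity score still drops below 1. The sample text is made up, and the metric is Python's difflib similarity ratio, used here only as a stand-in; it is not the scoring method of OlmOCRBench or OmniDocBench.

```python
from difflib import SequenceMatcher

# Same four blocks, two different reading orders for a two-column page.
reference = ["Intro left", "Body left", "Intro right", "Body right"]
predicted = ["Intro left", "Intro right", "Body left", "Body right"]

def similarity(a, b):
    """Character-level similarity of two block sequences joined by newlines."""
    return SequenceMatcher(None, "\n".join(a), "\n".join(b)).ratio()

same_content = sorted(reference) == sorted(predicted)
score = similarity(reference, predicted)

print(f"identical blocks: {same_content}")
print(f"similarity: {score:.2f}")
```

A metric of this kind cannot tell a reordered but accurate transcription from a genuinely wrong one, which is the type of mismatch Mistral describes.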

Mistral also highlighted internal multilingual testing, saying OCR 4 led across all eight of its language group categories, with especially strong results in specialized and low-resource languages. The company said those languages are often where competing systems lose accuracy most sharply.

Early use cases and limits

Mistral says early customers are using OCR 4 for tasks such as turning invoices into structured fields, digitizing archives, extracting text from technical and scientific reports, and supporting enterprise search systems.

The company also drew a line around what OCR 4 is meant to do. It describes the model as a document-understanding system rather than a decision-maker, and says it is not intended for medical diagnosis, legal advice, high-stakes financial decisions, safety-critical applications or non-document inputs such as audio and video.

OCR 4 is now available through Mistral Studio, Amazon SageMaker and Microsoft Foundry, and enterprise customers can self-host it. Mistral said an additional integration with Snowflake Parse Document is coming soon.