# PaddleOCR
**Repository Path**: simon2010/PaddleOCR
## Basic Information
- **Project Name**: PaddleOCR
- **Description**: 基于飞桨的OCR和文档解析工具库,包含文字识别PP-OCR系列模型、文档解析PP-Structure系列方案和关键信息抽取PP-ChatOCR系列方案
- **Primary Language**: Python
- **License**: Apache-2.0
- **Default Branch**: main
- **Homepage**: https://www.paddleocr.ai
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 1059
- **Created**: 2025-10-11
- **Last Updated**: 2025-10-11
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
  
       
  
English | [简体中文](./readme/README_cn.md) | [繁體中文](./readme/README_tcn.md) | [日本語](./readme/README_ja.md) | [한국어](./readme/README_ko.md) | [Français](./readme/README_fr.md) | [Русский](./readme/README_ru.md) | [Español](./readme/README_es.md) | [العربية](./readme/README_ar.md)
[](https://github.com/PaddlePaddle/PaddleOCR)
[](https://arxiv.org/abs/2507.05595)
[](https://pepy.tech/project/paddleocr)
[](https://pepy.tech/project/paddleocr)
[](https://github.com/PaddlePaddle/PaddleOCR/network/dependents)



[](./LICENSE)
[](https://deepwiki.com/PaddlePaddle/PaddleOCR)
**PaddleOCR is an industry-leading, production-ready OCR and document AI engine, offering end-to-end solutions from text extraction to intelligent document understanding**
 
# PaddleOCR
[](https://www.paddlepaddle.org.cn/en)
[](#)
[](#)
[](#)
[](#)
> [!TIP]
> PaddleOCR now provides an MCP server that supports integration with Agent applications like Claude Desktop. For details, please refer to [PaddleOCR MCP Server](https://paddlepaddle.github.io/PaddleOCR/latest/en/version3.x/deployment/mcp_server.html).
>
> The PaddleOCR 3.0 Technical Report is now available. See details at: [PaddleOCR 3.0 Technical Report](https://arxiv.org/abs/2507.05595)
**PaddleOCR** converts documents and images into **structured, AI-friendly data** (like JSON and Markdown) with **industry-leading accuracy**—powering AI applications for everyone from indie developers and startups to large enterprises worldwide. With over **50,000 stars** and deep integration into leading projects like **MinerU, RAGFlow, and OmniParser**, PaddleOCR has become the **premier solution** for developers building intelligent document applications in the **AI era**.
### PaddleOCR 3.0 Core Features
[](https://aistudio.baidu.com/community/app/91660/webUI)
[](https://aistudio.baidu.com/community/app/518494/webUI)
[](https://aistudio.baidu.com/community/app/518493/webUI)
[](https://www.modelscope.cn/organization/PaddlePaddle)
[](https://huggingface.co/PaddlePaddle)
- **PP-OCRv5 — Universal Scene Text Recognition**  
  **Single model supports five text types** (Simplified Chinese, Traditional Chinese, English, Japanese, and Pinyin) with **13% accuracy improvement**. Solves multilingual mixed document recognition challenges.
- **PP-StructureV3 — Complex Document Parsing**  
  Intelligently converts complex PDFs and document images into **Markdown and JSON files that preserve original structure**. **Outperforms** numerous commercial solutions in public benchmarks. **Perfectly maintains document layout and hierarchical structure**.
- **PP-ChatOCRv4 — Intelligent Information Extraction**  
  Natively integrates ERNIE 4.5 to **precisely extract key information** from massive documents, with 15% accuracy improvement over previous generation. Makes documents "**understand**" your questions and provide accurate answers.
In addition to providing an outstanding model library, PaddleOCR 3.0 also offers user-friendly tools covering model training, inference, and service deployment, so developers can rapidly bring AI applications to production.
| Project Name | Description |
| ------------ | ----------- |
| [RAGFlow](https://github.com/infiniflow/ragflow) 

|RAG engine based on deep document understanding.|
| [MinerU](https://github.com/opendatalab/MinerU) 

|Multi-type Document to Markdown Conversion Tool|
| [Umi-OCR](https://github.com/hiroi-sora/Umi-OCR) 

|Free, Open-source, Batch Offline OCR Software.|
| [OmniParser](https://github.com/microsoft/OmniParser)

 |OmniParser: Screen Parsing tool for Pure Vision Based GUI Agent.|
| [QAnything](https://github.com/netease-youdao/QAnything)

 |Question and Answer based on Anything.|
| [PDF-Extract-Kit](https://github.com/opendatalab/PDF-Extract-Kit) 

|A powerful open-source toolkit designed to efficiently extract high-quality content from complex and diverse PDF documents.|
| [Dango-Translator](https://github.com/PantsuDango/Dango-Translator)

 |Recognize text on the screen, translate it and show the translation results in real time.|
| [Learn more projects](./awesome_projects.md) | [More projects based on PaddleOCR](./awesome_projects.md)|