# Canpoint OCR API Documentation

**Repository Path**: canpoint_ai/interfaces_document_of_canpoint_ocr

## Basic Information

- **Project Name**: Canpoint OCR API Documentation (全品OCR接口文档)
- **Description**: Describes the Canpoint OCR APIs and how to integrate with them
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2021-08-09
- **Last Updated**: 2024-11-28

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

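# Canpoint OCR API Documentation

## Project Overview

Canpoint OCR detects and recognizes the content of primary and secondary school exam papers.

*Note: only math exam papers are currently supported; more will be added later.*

|Sample|Result|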
|:----:|:----:|
|<img src="https://gitee.com/canpoint_ai/interfaces_document_of_canpoint_ocr/raw/master/imgs/sample1.jpg" height="480px" />|<img src="https://gitee.com/canpoint_ai/interfaces_document_of_canpoint_ocr/raw/master/imgs/sample1_res.png" height="480px" />|
|<img src="https://gitee.com/canpoint_ai/interfaces_document_of_canpoint_ocr/raw/master/imgs/sample2.jpg" height="480px" />|<img src="https://gitee.com/canpoint_ai/interfaces_document_of_canpoint_ocr/raw/master/imgs/sample2_res.PNG" height="480px" />|

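## API Usage

### 1. Exam Paper Photo Recognition

#### Description

Detects text, formulas, pictures, and tables in an exam paper photo and recognizes the text and formula content. Results are returned line by line, with text converted to editable text and formulas converted to LaTeX.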

#### Endpoint

Endpoint: http://123.60.217.149:9911/photo_ocr  
Method: POST  
Parameter passing: HTTP body  
Response mode: synchronous  
Return type: JSON  
content-type: text/html  

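#### Parameters

Request parameters  
|Parameter|Type|Required|Description|
|:----|:----|:----|:----|
|image|file byte stream|Yes|Byte stream of the local image file to be processed|  

Return parameters
|Parameter|Type|Description|
|:----|:----|:----|
|code|int|Status code: 1 means success, -1 means failure|
|result|list|Each item represents one line of text; see the row result parameters|  
|cost_time|float|Recognition time|
|img_name|string|Image name|  

Row result parameters  
*Each row is a list; each list element is one item in that row*
|Key|Item|Type|Description|
|:----|:----|:----|:----|
|ench|Chinese/English text|string|Chinese and English text string|
|formula|Formula|string|Formula in LaTeX|  
|pic|Picture coordinates|list|Coordinates in order: top-left x, top-left y, bottom-right x, bottom-right y|
|table|Table coordinates|list|Table coordinates|  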
#### Example

*Python example*  

    import requests

    url = 'http://123.60.217.149:9911/photo_ocr'
    # Upload the file as multipart form data under the field name 'image'
    with open('xxx/xxx.jpg', 'rb') as f:
        response = requests.post(url, files={'image': f})
    print(response.json())

*Result*  

    [[{'ench': '第Ⅱ卷（非选择题，共84分）', 'pos': [163, -2, 415, 39]}],  
    [{'ench': '二、填空题：本大酸共6个小题，每小题分，满分24分', 'pos': [27, 28, 352, 80]}],  
    [{'formula': '\\frac { x _ { 1 } + x _ { 2 } } { 1 3 2 + 3 2 x } = \\frac { 5 } { \\sqrt { 3 } }', 'pos': [16, 42, 341, 121]},  
    ...  
    [{'pic': None, 'pos': [120, 251, 240, 377]}],  
    ]
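
Each row in the result is a list of items keyed by `ench`, `formula`, `pic`, or `table`. As a minimal sketch of how such rows might be turned into plain text lines, the snippet below wraps formulas in `$...$` and substitutes placeholders for pictures and tables. The helper name `rows_to_text` and the placeholder strings are assumptions made for illustration; they are not part of the service.

```python
def rows_to_text(rows):
    """Join OCR rows into text lines (hypothetical helper, not part of the API)."""
    lines = []
    for row in rows:
        parts = []
        for item in row:
            if 'ench' in item:
                parts.append(item['ench'])
            elif 'formula' in item:
                # Mark formulas so they can be told apart from plain text
                parts.append('$' + item['formula'] + '$')
            elif 'pic' in item:
                parts.append('[picture]')   # placeholder, an assumption
            elif 'table' in item:
                parts.append('[table]')     # placeholder, an assumption
        lines.append(' '.join(parts))
    return lines


# Rows shaped like the printed example above
sample = [
    [{'ench': '第Ⅱ卷（非选择题，共84分）', 'pos': [163, -2, 415, 39]}],
    [{'formula': '\\frac { 5 } { \\sqrt { 3 } }', 'pos': [16, 42, 341, 121]}],
    [{'pic': None, 'pos': [120, 251, 240, 377]}],
]
for line in rows_to_text(sample):
    print(line)
```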

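### 2. Exam Paper PDF Recognition

#### Description

Detects text, formulas, pictures, and tables in an exam paper PDF and recognizes the text and formula content. Results are returned line by line, with text converted to editable text and formulas converted to LaTeX.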

#### Endpoint

Endpoint: http://123.60.217.149:9911/pdf_ocr  
Method: POST  
Parameter passing: HTTP body  
Response mode: synchronous  
Return type: JSON  
content-type: text/html  

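#### Parameters

Request parameters  
|Parameter|Type|Required|Description|
|:----|:----|:----|:----|
|image|file byte stream|Yes|Byte stream of the local image file to be processed|  

Return parameters
|Parameter|Type|Description|
|:----|:----|:----|
|code|int|Status code: 1 means success, -1 means failure|
|result|list|Each item represents one line of text; see the row result parameters|  
|cost_time|float|Recognition time|
|img_name|string|Image name|  

Row result parameters  
*Each row is a list; each list element is one item in that row*
|Key|Item|Type|Description|
|:----|:----|:----|:----|
|ench|Chinese/English text|string|Chinese and English text string|
|formula|Formula|string|Formula in LaTeX|  
|pic|Picture coordinates|list|Coordinates in order: top-left x, top-left y, bottom-right x, bottom-right y|
|table|Table coordinates|list|Table coordinates|  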

#### Example

*Python example*  

    import requests

    url = 'http://123.60.217.149:9911/pdf_ocr'
    # Upload the file as multipart form data under the field name 'image'
    with open('xxx/xxx.jpg', 'rb') as f:
        response = requests.post(url, files={'image': f})
    print(response.json())

*Result*  

    [[{'ench': '第Ⅱ卷（非选择题，共84分）', 'pos': [163, -2, 415, 39]}],  
    [{'ench': '二、填空题：本大酸共6个小题，每小题分，满分24分', 'pos': [27, 28, 352, 80]}],  
    [{'formula': '\\frac { x _ { 1 } + x _ { 2 } } { 1 3 2 + 3 2 x } = \\frac { 5 } { \\sqrt { 3 } }', 'pos': [16, 42, 341, 121]},  
    ...  
    [{'pic': None, 'pos': [120, 251, 240, 377]}],  
    ]
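
The return parameters list a `code` field (1 for success, -1 for failure) alongside `result`, while the printed example shows only a bare list. The sketch below is one way a caller might handle both shapes defensively. The helper name `extract_result` is hypothetical, and the assumption that a dict payload carries `code` and `result` at the top level comes from the parameter table, not from a confirmed response sample.

```python
def extract_result(payload):
    """Return the recognition rows from a decoded JSON payload (hypothetical helper)."""
    # Assumption: a dict payload carries the documented `code` and `result` fields
    if isinstance(payload, dict):
        if payload.get('code') != 1:
            raise RuntimeError('OCR request failed, code=%r' % payload.get('code'))
        return payload.get('result', [])
    # Assumption: a bare list is already the `result` value, as in the printed example
    return payload


rows = extract_result({'code': 1, 'result': [[{'ench': 'abc'}]], 'cost_time': 0.5})
print(len(rows))
```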

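### 3. Exam Paper Photo Question Structuring

#### Description

Automatically segments an exam paper photo into individual questions and returns, for each question, its content and the coordinates of its region.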

#### Endpoint

Endpoint: http://123.60.217.149:9911/photo_subject  
Method: POST  
Parameter passing: HTTP body  
Response mode: synchronous  
Return type: JSON  
content-type: text/html  

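#### Parameters

Request parameters  
|Parameter|Type|Required|Description|
|:----|:----|:----|:----|
|image|file byte stream|Yes|Byte stream of the local image file to be processed|  

Return parameters
|Parameter|Type|Description|
|:----|:----|:----|
|code|int|Status code: 1 means success, -1 means failure|
|result|list|Each item represents one question; see the row result parameters|  
|cost_time|float|Recognition time|
|img_name|string|Image name|  

Row result parameters  
|Key|Item|Type|Description|
|:----|:----|:----|:----|
|No|Index|int|Sequential number of the recognized question, stated as 1 to n (the sample result in this section starts at 0)|
|content|Question content|string|Question content; formula segments are delimited by `$` on both sides|  
|box|Region coordinates|dict|Coordinates: top-left x, top-left y, bottom-right x, bottom-right y|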
 

#### Example

*Python example*  

    import requests

    url = 'http://123.60.217.149:9911/photo_subject'
    # Upload the file as multipart form data under the field name 'image'
    with open('xxx/xxx.jpg', 'rb') as f:
        response = requests.post(url, files={'image': f})
    print(response.json())

*Result*  

    [{'No': 0,
    'content': '14.不等式组 $\\left\\{ \\begin{array} { l } { x - 3 ( x - 2 ) > 4 } \\\\ { \\frac { 2 x - 3 } { 5 } \\leqslant \\frac { x + 1 } { 2 } } \\end{array}\\right.$ 的解集为 S.在平面直角坐标案中，点C.的坐标分别为C（2,3）、D（1,0）.现以原点为位似中心， 线段CD放大得到线段AB，若点D的对应点B在轴上且 $) B = 2$ 则点C的对应点 ',
    'box': {'start_x': '18', 'start_y': '95', 'end_x': '217', 'end_y': '201'}},
    {'No': 1,
    'content': '16.如图，将矩形ABCD沿 $\\overrightarrow { G H }$ 对折，点C落在 $Q$ 处，点D落在AB边上的 $E$ 处 $. E Q \\perp$ BO 的坐标为 相交子点 $F$ 若 $A D = 8 , A B = 6 , A E =$ 4 $\\triangle E B F$ 周长的大小为   （第17题图） （第16题图） ',
    'box': {'start_x': '33', 'start_y': '187', 'end_x': '571', 'end_y': '409'}},
    {'No': 2,
    'content': '12.如图一个几何体的三视图分别是两个短形、一个扇形，则这个几何体表面积的大小为 ',
    'box': {'start_x': '32', 'start_y': '388', 'end_x': '555', 'end_y': '441'}},
    ]
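
In the sample result, the `box` values are strings under the keys `start_x`, `start_y`, `end_x`, and `end_y`. A small sketch of converting them to integers before computing sizes or cropping is shown below. The helper name `box_to_tuple` is hypothetical, and the code assumes every box carries all four keys with integer-like string values, as in the sample.

```python
def box_to_tuple(box):
    """Convert a `box` dict with string values into a (left, top, right, bottom) int tuple."""
    # Assumption: all four keys are present, as in the sample result
    return tuple(int(box[key]) for key in ('start_x', 'start_y', 'end_x', 'end_y'))


box = {'start_x': '18', 'start_y': '95', 'end_x': '217', 'end_y': '201'}
left, top, right, bottom = box_to_tuple(box)
print(right - left, bottom - top)  # width and height of the region: 199 106
```

The resulting tuple uses the left, top, right, bottom order, which many image libraries accept for cropping.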


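### 4. Exam Paper PDF Question Structuring

#### Description

Automatically segments an exam paper PDF into individual questions and returns, for each question, its content and the coordinates of its region.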

#### Endpoint

Endpoint: http://123.60.217.149:9911/pdf_subject  
Method: POST  
Parameter passing: HTTP body  
Response mode: synchronous  
Return type: JSON  
content-type: text/html  

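#### Parameters

Request parameters  
|Parameter|Type|Required|Description|
|:----|:----|:----|:----|
|image|file byte stream|Yes|Byte stream of the local image file to be processed|  

Return parameters
|Parameter|Type|Description|
|:----|:----|:----|
|code|int|Status code: 1 means success, -1 means failure|
|result|list|Each item represents one question; see the row result parameters|  
|cost_time|float|Recognition time|
|img_name|string|Image name|  

Row result parameters  
|Key|Item|Type|Description|
|:----|:----|:----|:----|
|No|Index|int|Sequential number of the recognized question, stated as 1 to n (the sample result in this section starts at 0)|
|content|Question content|string|Question content; formula segments are delimited by `$` on both sides|  
|box|Region coordinates|dict|Coordinates: top-left x, top-left y, bottom-right x, bottom-right y|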
 

#### Example

*Python example*  

    import requests

    url = 'http://123.60.217.149:9911/pdf_subject'
    # Upload the file as multipart form data under the field name 'image'
    with open('xxx/xxx.jpg', 'rb') as f:
        response = requests.post(url, files={'image': f})
    print(response.json())

*Result*  

    [{'No': 0,
    'content': '14.不等式组 $\\left\\{ \\begin{array} { l } { x - 3 ( x - 2 ) > 4 } \\\\ { \\frac { 2 x - 3 } { 5 } \\leqslant \\frac { x + 1 } { 2 } } \\end{array}\\right.$ 的解集为 S.在平面直角坐标案中，点C.的坐标分别为C（2,3）、D（1,0）.现以原点为位似中心， 线段CD放大得到线段AB，若点D的对应点B在轴上且 $) B = 2$ 则点C的对应点 ',
    'box': {'start_x': '18', 'start_y': '95', 'end_x': '217', 'end_y': '201'}},
    {'No': 1,
    'content': '16.如图，将矩形ABCD沿 $\\overrightarrow { G H }$ 对折，点C落在 $Q$ 处，点D落在AB边上的 $E$ 处 $. E Q \\perp$ BO 的坐标为 相交子点 $F$ 若 $A D = 8 , A B = 6 , A E =$ 4 $\\triangle E B F$ 周长的大小为   （第17题图） （第16题图） ',
    'box': {'start_x': '33', 'start_y': '187', 'end_x': '571', 'end_y': '409'}},
    {'No': 2,
    'content': '12.如图一个几何体的三视图分别是两个短形、一个扇形，则这个几何体表面积的大小为 ',
    'box': {'start_x': '32', 'start_y': '388', 'end_x': '555', 'end_y': '441'}},
    ]
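
Because formula segments in `content` are delimited by `$`, a caller may want to separate plain text from LaTeX. The sketch below splits on `$` and treats every odd-numbered piece as a formula. The helper name `split_content` is hypothetical, and the approach assumes the `$` markers are balanced and never appear as literal characters in the text.

```python
def split_content(content):
    """Split question text into ('text', s) and ('formula', s) segments on `$` markers."""
    segments = []
    for index, piece in enumerate(content.split('$')):
        piece = piece.strip()
        if not piece:
            continue  # skip empty pieces, e.g. when the text starts with a formula
        # Pieces at odd positions sit between a pair of `$` markers
        kind = 'formula' if index % 2 == 1 else 'text'
        segments.append((kind, piece))
    return segments


sample = '点C落在 $Q$ 处，点D落在AB边上的 $E$ 处'
for kind, piece in split_content(sample):
    print(kind, piece)
```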