Small-Town Youth's Documentation Center

Preface

This article uses PaddleOCR 3.1.1, which is more capable than earlier releases.
PaddleOCR documentation: Quick Start - PaddleOCR Documentation

Installation

Install dependencies

shell

pip install paddlepaddle paddleocr
# Or install with uv
uv add paddlepaddle paddleocr

Code example

python

from paddleocr import PaddleOCR

ocr = PaddleOCR(
    use_doc_orientation_classify=False,
    use_doc_unwarping=False,
    use_textline_orientation=False,
)

# Run OCR inference on the sample image
result = ocr.predict(input="./data/test2.jpg")

# Save each result as a JSON file in the output directory
for res in result:
    res.save_to_json(save_path="output_ppstructurev3")
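
Once the JSON files are written, the recognized text can be read back without re-running inference. The sketch below is only an illustration: the helper name `load_recognized_lines` is hypothetical, and it assumes each saved file contains parallel `rec_texts` and `rec_scores` lists, which should be checked against the files your PaddleOCR version actually produces.

```python
import json
from pathlib import Path


def load_recognized_lines(output_dir):
    """Collect (text, score) pairs from every JSON file in output_dir.

    Assumption: each file holds parallel "rec_texts" and "rec_scores"
    lists. Files without those keys contribute nothing.
    """
    lines = []
    for json_path in sorted(Path(output_dir).glob("*.json")):
        with open(json_path, encoding="utf-8") as f:
            data = json.load(f)
        texts = data.get("rec_texts", [])
        scores = data.get("rec_scores", [])
        lines.extend(zip(texts, scores))
    return lines


if __name__ == "__main__":
    # Directory name taken from the example above.
    for text, score in load_recognized_lines("output_ppstructurev3"):
        print(f"{score:.3f}  {text}")
```

Reading from disk keeps the slow OCR step separate from any later text processing.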

For PaddleOCR 2.x code examples, see "Python OCR: A One-Stop Solution" on Zhihu.

PaddleOCR 3.x constructor parameters

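Document preprocessing parameters

Parameter	Type	Description
doc_orientation_classify_model_name	str	Name of the document orientation classification model
doc_orientation_classify_model_dir	str	Directory where the document orientation classification model is stored
doc_unwarping_model_name	str	Name of the document unwarping model (used to correct curved document images)
doc_unwarping_model_dir	str	Directory where the document unwarping model is stored
use_doc_orientation_classify	bool	Whether to enable document orientation classification
use_doc_unwarping	bool	Whether to enable document unwarping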

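Text detection parameters

Parameter	Type	Description
text_detection_model_name	str	Name of the text detection model
text_detection_model_dir	str	Directory where the text detection model is stored
text_det_limit_side_len	int	Side-length limit applied to the image for text detection
text_det_limit_type	str	How the side-length limit is applied ('max' or 'min')
text_det_thresh	float	Pixel-level detection threshold: pixels in the probability map that score above it are treated as text
text_det_box_thresh	float	Box-level threshold: a detected region is kept when the mean score of its pixels exceeds it
text_det_unclip_ratio	float	Expansion ratio applied to detected text boxes
text_det_input_shape	tuple/list	Input shape of the text detection model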
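
How `text_det_limit_side_len` and `text_det_limit_type` interact is easier to see with numbers. The function below is a rough sketch, not PaddleOCR's own code: `det_resize_shape` and its `multiple` argument are hypothetical names, and rounding each side to a multiple of 32 is an assumption based on what detection preprocessors commonly do.

```python
def det_resize_shape(height, width, limit_side_len, limit_type, multiple=32):
    """Estimate the image size fed to the detector.

    'max': shrink the image if its longer side exceeds limit_side_len.
    'min': enlarge the image if its shorter side is below limit_side_len.
    Each side is then rounded to a multiple of `multiple` (an assumption).
    """
    if limit_type == "max":
        longest = max(height, width)
        ratio = limit_side_len / longest if longest > limit_side_len else 1.0
    elif limit_type == "min":
        shortest = min(height, width)
        ratio = limit_side_len / shortest if shortest < limit_side_len else 1.0
    else:
        raise ValueError("limit_type must be 'max' or 'min'")

    new_h = max(int(round(height * ratio / multiple) * multiple), multiple)
    new_w = max(int(round(width * ratio / multiple) * multiple), multiple)
    return new_h, new_w


print(det_resize_shape(1920, 1080, 960, "max"))  # (960, 544)
print(det_resize_shape(400, 300, 736, "min"))    # (992, 736)
```

With 'max' a large scan is downscaled to save time and memory, while 'min' upscales small images so that small text stays detectable.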

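Text recognition parameters

Parameter	Type	Description
text_recognition_model_name	str	Name of the text recognition model
text_recognition_model_dir	str	Directory where the text recognition model is stored
text_recognition_batch_size	int	Batch size for text recognition
text_rec_score_thresh	float	Confidence threshold for text recognition results
text_rec_input_shape	tuple/list	Input shape of the text recognition model
return_word_box	bool	Whether to return word-level boxes (instead of text-line-level boxes)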
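
The effect of `text_rec_score_thresh` can be reproduced on results that were produced without a threshold, which helps when choosing a value. This is a minimal sketch with a hypothetical helper `filter_by_score`; treating a score equal to the threshold as passing is an assumption, not a confirmed detail of the library.

```python
def filter_by_score(texts, scores, score_thresh):
    """Keep (text, score) pairs whose score is at least score_thresh."""
    return [(t, s) for t, s in zip(texts, scores) if s >= score_thresh]


# Illustrative values only, not real OCR output.
kept = filter_by_score(["Invoice", "l1|", "Total"], [0.97, 0.31, 0.88], 0.5)
print(kept)  # [('Invoice', 0.97), ('Total', 0.88)]
```

Filtering after the fact lets you compare several thresholds on the same run before fixing one in the constructor.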

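Text line orientation parameters

Parameter	Type	Description
textline_orientation_model_name	str	Name of the text line orientation classification model
textline_orientation_model_dir	str	Directory where the text line orientation classification model is stored
textline_orientation_batch_size	int	Batch size for text line orientation classification
use_textline_orientation	bool	Whether to enable text line orientation classification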

General parameters

Parameter	Type	Description
lang	str	Recognition language (e.g. 'ch', 'en')
ocr_version	str	OCR version (e.g. 'PP-OCRv3')
**kwargs	dict	Other parameters not listed explicitly
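
The parameters in the tables above can be collected in one place and passed to the constructor together. The sketch below uses hypothetical helper names (`build_ocr_kwargs`, `create_ocr`, `enable_doc_preprocessing`); only the keyword names themselves come from the tables, and the defaults chosen here are illustrative.

```python
def build_ocr_kwargs(lang="ch", ocr_version=None, enable_doc_preprocessing=False):
    """Assemble constructor keyword arguments for PaddleOCR.

    ocr_version is only included when given, so the library's own
    default applies otherwise.
    """
    kwargs = {
        "lang": lang,
        "use_doc_orientation_classify": enable_doc_preprocessing,
        "use_doc_unwarping": enable_doc_preprocessing,
        "use_textline_orientation": False,
    }
    if ocr_version is not None:
        kwargs["ocr_version"] = ocr_version
    return kwargs


def create_ocr(**options):
    """Build a PaddleOCR instance; paddleocr is imported lazily here."""
    from paddleocr import PaddleOCR

    return PaddleOCR(**build_ocr_kwargs(**options))


print(build_ocr_kwargs(lang="en", ocr_version="PP-OCRv5"))
```

Calling `create_ocr(lang="en", ocr_version="PP-OCRv5")` would then construct the engine, provided the installed version accepts that combination of language and OCR version.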

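For the different OCR models available in PaddleOCR, see the documentation:
- PP-OCRv5: Usage Tutorial - PaddleOCR Documentation
- PP-StructureV3: Usage Tutorial - PaddleOCR Documentation
- PP-ChatOCRv4: Usage Tutorial - PaddleOCR Documentation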
