UVDoc

Introduction

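The main purpose of text image unwarping is to apply geometric transformations to an image in order to correct document distortion, skew, perspective deformation, and similar problems, so that subsequent text recognition is more accurate.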
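To make "geometric transformation" concrete, the sketch below applies a 3x3 projective matrix (a homography) to 2D points in pure Python. It only illustrates the simplest kind of perspective correction and is not a description of how UVDoc works internally; the helper name `apply_homography` and the sample matrix `H` are assumptions made for this example.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Matrix3 = Sequence[Sequence[float]]


def apply_homography(h: Matrix3, points: Sequence[Point]) -> List[Point]:
    """Map 2D points through a 3x3 projective matrix (hypothetical helper)."""
    mapped = []
    for x, y in points:
        # Lift (x, y) to homogeneous coordinates (x, y, 1) and multiply by h.
        u = h[0][0] * x + h[0][1] * y + h[0][2]
        v = h[1][0] * x + h[1][1] * y + h[1][2]
        w = h[2][0] * x + h[2][1] * y + h[2][2]
        if w == 0:
            raise ValueError("point maps to infinity under this homography")
        # Divide by w to return to ordinary image coordinates.
        mapped.append((u / w, v / w))
    return mapped


# Illustrative matrix only: shift by (10, 5) and add a mild perspective term.
H = [
    [1.0, 0.0, 10.0],
    [0.0, 1.0, 5.0],
    [0.001, 0.0, 1.0],
]

corners = [(0.0, 0.0), (100.0, 0.0), (100.0, 50.0), (0.0, 50.0)]
print(apply_homography(H, corners))
```

A single homography can only undo planar perspective distortion. Curved or folded pages need a dense, per-pixel mapping, which is the kind of problem a learned unwarping model is meant to address.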

Model	CER
UVDoc	0.179

Note: the test dataset is the DocUNet benchmark dataset.
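For readers unfamiliar with the metric, the sketch below assumes that CER here means character error rate: the edit distance between the text recognized from the unwarped image and the reference text, divided by the reference length, so lower is better. The source does not state the OCR engine or the exact evaluation protocol behind the 0.179 figure, so this is a generic illustration and not the benchmark's evaluation script.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    # previous[j] holds the distance between the processed reference prefix
    # and the first j characters of the hypothesis.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance normalized by the reference length."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# One substituted character out of eight reference characters.
print(character_error_rate("document", "docunent"))  # 0.125
```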

Quick Start

Installation

PaddlePaddle

Run one of the following commands to install PaddlePaddle with pip:

# for CUDA11.8
python -m pip install paddlepaddle-gpu==3.0.0 -i https://www.paddlepaddle.org.cn/packages/stable/cu118/

# for CUDA12.6
python -m pip install paddlepaddle-gpu==3.0.0 -i https://www.paddlepaddle.org.cn/packages/stable/cu126/

# for CPU
python -m pip install paddlepaddle==3.0.0 -i https://www.paddlepaddle.org.cn/packages/stable/cpu/

For details on installing PaddlePaddle, see the official PaddlePaddle website.
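If installation is scripted, the three commands above can be generated from a small lookup table. The sketch below is a convenience only: the function name `paddle_install_command` and the table layout are assumptions for this example, and it covers only the CUDA 11.8, CUDA 12.6, and CPU builds listed above.

```python
from typing import Optional

INDEX_BASE = "https://www.paddlepaddle.org.cn/packages/stable"

# CUDA version -> (package name, index suffix), copied from the commands above.
BUILDS = {
    "11.8": ("paddlepaddle-gpu", "cu118"),
    "12.6": ("paddlepaddle-gpu", "cu126"),
    None: ("paddlepaddle", "cpu"),
}


def paddle_install_command(cuda_version: Optional[str] = None,
                           version: str = "3.0.0") -> str:
    """Return the pip command for the given CUDA version (None means CPU)."""
    if cuda_version not in BUILDS:
        raise ValueError(f"no command listed for CUDA {cuda_version!r}")
    package, suffix = BUILDS[cuda_version]
    return f"python -m pip install {package}=={version} -i {INDEX_BASE}/{suffix}/"


print(paddle_install_command("12.6"))
```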

PaddleOCR

Install the latest version of the PaddleOCR inference package from PyPI:

python -m pip install paddleocr

Model Usage

You can try the model quickly with a single command:

paddleocr text_image_unwarping --model_name UVDoc -i https://cdn-uploads.huggingface.co/production/uploads/63d7b8ee07cd1aa3c49a2026/SfMVKd0xnMII5KBDV6Mfz.jpeg

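You can also integrate inference with the TextImageUnwarping module into your own project. Before running the following code, download the sample image to your local machine.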

from paddleocr import TextImageUnwarping

model = TextImageUnwarping(model_name="UVDoc")
output = model.predict("SfMVKd0xnMII5KBDV6Mfz.jpeg", batch_size=1)
for res in output:
    res.print()
    res.save_to_img(save_path="./output/")
    res.save_to_json(save_path="./output/res.json")

After running, the result is as follows:

{'res': {'input_path': 'doc_test.jpg', 'page_index': None, 'doctr_img': '...'}}

The visualized image is as follows:

[Image: visualized unwarping result]

For details on the command-line usage and parameter descriptions, please refer to the documentation.
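Building on the single-image example above, the sketch below unwarps every image in a folder by calling `predict` once per file. The helper names `collect_images` and `unwarp_folder`, the suffix list, and the folder name in the usage comment are assumptions for this example; only `TextImageUnwarping`, `predict`, and `save_to_img` come from the code shown earlier.

```python
from pathlib import Path
from typing import List

# Assumed set of image suffixes; adjust to the formats you actually use.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def collect_images(folder: str) -> List[str]:
    """Return sorted paths of image files directly inside folder."""
    return sorted(
        str(path)
        for path in Path(folder).iterdir()
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
    )


def unwarp_folder(folder: str, save_dir: str = "./output/") -> int:
    """Unwarp every image in folder; return the number of results saved."""
    # Imported lazily so that collect_images works without PaddleOCR installed.
    from paddleocr import TextImageUnwarping

    model = TextImageUnwarping(model_name="UVDoc")
    count = 0
    for image_path in collect_images(folder):
        # One predict call per file, mirroring the single-image example.
        for res in model.predict(image_path, batch_size=1):
            res.save_to_img(save_path=save_dir)
            count += 1
    return count


# Example call (hypothetical folder name):
# unwarp_folder("./scans")
```

Creating the model once outside the loop avoids reloading the weights for each image.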

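Pipeline Usage

A single model has limited capability, but a pipeline composed of several models can do more and handle difficult problems in real-world scenarios.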

PP-StructureV3

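Layout analysis is a technique for extracting structured information from document images. PP-StructureV3 includes the following six modules:

Layout detection module
General OCR sub-pipeline
Document image preprocessing sub-pipeline (optional)
Table recognition sub-pipeline (optional)
Seal recognition sub-pipeline (optional)
Formula recognition sub-pipeline (optional)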
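The split between required and optional modules in the list above can be written down as a small data structure. The sketch below is purely descriptive and is not PaddleOCR's API: the `Module` class and the `active_modules` helper are hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Module:
    name: str
    optional: bool


# The six modules listed above; "optional" follows the list's annotations.
PP_STRUCTUREV3_MODULES = (
    Module("layout detection module", optional=False),
    Module("general OCR sub-pipeline", optional=False),
    Module("document image preprocessing sub-pipeline", optional=True),
    Module("table recognition sub-pipeline", optional=True),
    Module("seal recognition sub-pipeline", optional=True),
    Module("formula recognition sub-pipeline", optional=True),
)


def active_modules(enabled_optional: Iterable[str] = ()) -> List[str]:
    """Return required modules plus the optional ones that were enabled."""
    enabled = set(enabled_optional)
    known_optional = {m.name for m in PP_STRUCTUREV3_MODULES if m.optional}
    unknown = enabled - known_optional
    if unknown:
        raise ValueError(f"not an optional module: {sorted(unknown)}")
    return [
        m.name
        for m in PP_STRUCTUREV3_MODULES
        if not m.optional or m.name in enabled
    ]


print(active_modules(["document image preprocessing sub-pipeline"]))
```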

You can try the PP-StructureV3 pipeline quickly with a single command.

paddleocr pp_structurev3 --use_doc_unwarping True -i https://cdn-uploads.huggingface.co/production/uploads/63d7b8ee07cd1aa3c49a2026/KP10tiSZfAjMuwZUSLtRp.png

Pipeline inference takes only a few lines of code. Taking the PP-StructureV3 pipeline as an example:

from paddleocr import PPStructureV3

# Set use_doc_unwarping to enable or disable the document unwarping module
pipeline = PPStructureV3(use_doc_unwarping=True)
output = pipeline.predict("./KP10tiSZfAjMuwZUSLtRp.png")
for res in output:
    res.print()  # Print the structured prediction output
    res.save_to_json(save_path="output")  # Save the current image's structured result in JSON format
    res.save_to_markdown(save_path="output")  # Save the current image's result in Markdown format

For details on the command-line usage and parameter descriptions, please refer to the documentation.

Links

PaddleOCR Repo

PaddleOCR Documentation
