PP-OCRv5_mobile_rec

Introduction

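PP-OCRv5_mobile_rec is one of the PP-OCRv5_rec series, the latest generation of text line recognition models developed by the PaddleOCR team. It aims to use a single model to efficiently and accurately recognize four major languages (Simplified Chinese, Traditional Chinese, English, and Japanese), as well as complex text scenarios such as handwriting, vertical text, pinyin, and rare characters. The key accuracy metrics are as follows: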

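Handwritten Chinese	Handwritten English	Printed Chinese	Printed English	Traditional Chinese	Ancient Text	Japanese	General Scenario	Pinyin	Rotation	Distortion	Artistic Text	Average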
0.4166	0.4944	0.8605	0.8753	0.7199	0.5786	0.7577	0.5570	0.7703	0.7248	0.8089	0.5398	0.8015

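Note: If any character (including punctuation) in a line is incorrect, the entire line is marked as wrong. This strict line-level criterion helps ensure higher accuracy in practical applications.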
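The line-level criterion in the note can be illustrated with a minimal sketch. The evaluation script behind the table is not part of this document, so the function name `line_accuracy`, the exact string comparison, and the sample data below are all assumptions made for illustration.

```python
def line_accuracy(predictions, references):
    """Fraction of lines whose predicted text matches the reference exactly.

    A line counts as correct only if every character, punctuation included,
    matches; a single wrong character makes the whole line wrong.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    correct = sum(pred == ref for pred, ref in zip(predictions, references))
    return correct / len(references)


# Hypothetical sample data, for illustration only.
references = ["Hello, world!", "PaddleOCR", "你好，世界"]
predictions = ["Hello, world", "PaddleOCR", "你好，世界"]  # first line lacks "!"

accuracy = line_accuracy(predictions, references)
print(f"{accuracy:.4f}")  # prints 0.6667
```

In this sketch the first prediction differs from its reference only by a missing exclamation mark, yet the whole line is counted as wrong, so two of three lines are correct.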

Usage

import requests
from PIL import Image
from transformers import AutoImageProcessor, AutoModelForTextRecognition

# Load the recognition model and its matching image processor.
model_path = "PaddlePaddle/PP-OCRv5_mobile_rec_safetensors"
model = AutoModelForTextRecognition.from_pretrained(model_path, device_map="auto")
image_processor = AutoImageProcessor.from_pretrained(model_path)

# Download a demo text-line image and convert it to RGB.
image_url = "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/general_ocr_rec_001.png"
image = Image.open(requests.get(image_url, stream=True).raw).convert("RGB")

# Preprocess the image and move the tensors to the model's device.
inputs = image_processor(images=image, return_tensors="pt").to(model.device)
outputs = model(**inputs)

# Decode the raw model outputs into recognition results.
results = image_processor.post_process_text_recognition(outputs)

for result in results:
    print(result)
