cv_resnet18_human-detection Ascend NPU Deployment Guide

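Project Overview

cv_resnet18_human-detection is a ResNet18-based human detection model from ModelScope. It uses a FasterRCNN architecture combined with the DyHead attention mechanism, outputs each person in the input image as a bounding box, and is suited to pedestrian detection in surveillance scenarios.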

Features

Inference acceleration on Ascend NPU
CPU vs NPU precision comparison test
ResNet18 + FasterRCNN + DyHead architecture
Multi-person detection
Compatible with PyTorch model weights

Requirements

Hardware: Huawei Ascend 910 series NPU
CANN: 8.0.RC1 or later
PyTorch: 2.0+ with torch_npu
Docker: container named test-modelagent
OpenCV: 4.0+

Directory Structure

cv_resnet18_human-detection-ascend/
├── inference.py              # inference and test script
├── log.txt                   # test log
├── README.md                 # this document
├── test_human.jpg            # test image
├── result_visualization.jpg  # visualized detection results
├── precision_result.json     # precision test results
├── inference_result.json     # inference output
└── fusion_result.json        # graph fusion results

Deployment Steps

1. Enter the container

docker exec -it test-modelagent bash

2. Set environment variables

source /usr/local/Ascend/ascend-toolkit/set_env.sh

3. Prepare the model files

The model files are located in /data/ysws/agentsp/5-19-1/cv_resnet18_human-detection/:

pytorch_model.pt - model weights (about 147 MB)
configuration.json - model configuration
mmcv_config.py - full model architecture definition

4. Install dependencies

pip install torch torch_npu opencv-python numpy
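
Once the dependencies are installed, it can be useful to confirm that the packages are importable before running inference.py. The following is a minimal sketch and not part of the repository: the helper name check_dependencies is hypothetical, and it only checks that each module can be located, not that an NPU is actually usable.

```python
import importlib.util

# Import names of the packages installed above (opencv-python is imported as cv2).
REQUIRED_MODULES = ["torch", "torch_npu", "cv2", "numpy"]


def check_dependencies(modules=REQUIRED_MODULES):
    """Return a dict mapping each module name to True if it can be located."""
    status = {}
    for name in modules:
        # find_spec returns None when a top-level module is not installed.
        status[name] = importlib.util.find_spec(name) is not None
    return status


for name, found in check_dependencies().items():
    print(f"{name}: {'found' if found else 'MISSING'}")
```

Any module reported as MISSING would need to be installed inside the container before continuing.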

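Usage

Option 1: Normal inference mode

Run the inference script to perform human detection:

cd /data/ysws/agentsp/5-19-1/cv_resnet18_human-detection-ascend/

# Run human detection
python3 inference.py

Option 2: Precision test mode (CPU vs NPU)

Run the precision comparison test to verify that NPU results are consistent with CPU results:

cd /data/ysws/agentsp/5-19-1/cv_resnet18_human-detection-ascend/

# Run the full precision test
python3 inference.py precision_test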

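Command-Line Arguments

Argument	Description	Default
`precision_test`	Runs the precision comparison test mode; when omitted, the script runs in normal inference mode	`inference`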
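
The argument handling inside inference.py is not reproduced in this document. As a rough sketch of how a single optional positional mode argument with a default of inference could be parsed, assuming the hypothetical names parse_mode and VALID_MODES:

```python
# Modes described in the table above; "inference" is the default.
VALID_MODES = ("inference", "precision_test")


def parse_mode(argv):
    """Pick the run mode from an argv-style list (argv[0] is the script name)."""
    if len(argv) < 2:
        return "inference"
    mode = argv[1]
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode {mode!r}; expected one of {VALID_MODES}")
    return mode


print(parse_mode(["inference.py"]))
print(parse_mode(["inference.py", "precision_test"]))
```

In a real script the list passed in would be sys.argv. Rejecting unknown modes is a design choice of this sketch, so that a typo does not silently fall back to normal inference.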

Test Verification

Precision Test Results

Metric	Measured	Threshold	Status
Max cls score difference	0.192	-	-
Max bbox pred difference	0.067	-	-
Detection count consistency	100 vs 100	equal	PASS
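
The comparison code in inference.py is not shown here. The sketch below assumes that "max difference" means the largest absolute element-wise difference between the CPU and NPU outputs, and uses plain Python lists with made-up values in place of real tensors. The function names are hypothetical.

```python
def max_abs_diff(cpu_values, npu_values):
    """Largest absolute element-wise difference between two flat sequences."""
    if len(cpu_values) != len(npu_values):
        raise ValueError("outputs must have the same number of elements")
    return max((abs(c - n) for c, n in zip(cpu_values, npu_values)), default=0.0)


def counts_match(cpu_detections, npu_detections):
    """Detection-count consistency check: True when both lists are equally long."""
    return len(cpu_detections) == len(npu_detections)


# Illustrative values only, not measurements from this project.
cpu_scores = [0.91, 0.40, 0.05]
npu_scores = [0.90, 0.43, 0.05]
print(f"max diff: {max_abs_diff(cpu_scores, npu_scores):.3f}")
print("count check:", "PASS" if counts_match([1, 2], [3, 4]) else "FAIL")
```

With PyTorch tensors, the same quantity can be computed as (cpu_out - npu_out.cpu()).abs().max().item().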

Performance Data

Metric	Value
CPU inference time	~4.00s
NPU inference time	~3.93s
Speedup	~1.02x
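
The speedup is the CPU time divided by the NPU time. A minimal sketch of that arithmetic, plus a generic timing helper, follows. time_call is a hypothetical helper, and how inference.py measured its timings (warm-up runs, device synchronization) is not stated in this document.

```python
import time


def speedup(cpu_seconds, npu_seconds):
    """Ratio of CPU time to NPU time; values above 1 mean the NPU was faster."""
    return cpu_seconds / npu_seconds


def time_call(fn, *args, repeats=3):
    """Average wall-clock time of fn(*args) over several runs, in seconds."""
    start = time.perf_counter()
    for _ in range(repeats):
        fn(*args)
    return (time.perf_counter() - start) / repeats


# Using the figures from the table above.
print(f"speedup: ~{speedup(4.00, 3.93):.2f}x")
```

When timing code on an accelerator, the device generally needs to be synchronized before the clock is read (with torch_npu, typically via torch.npu.synchronize()), otherwise only the kernel launch time is measured.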

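Example Detection Result

Input	Detections	Output file
test_human.jpg (533x948)	100 people	result_visualization.jpg

Result: NPU inference works correctly, and the detection count matches the CPU result.

Test Log

The complete test log is saved in log.txt.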

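Python API Examples

Basic human detection

import cv2
import numpy as np
import torch
import torch_npu  # registers the "npu" device with PyTorch

# Assumption: these helpers are defined in this repository's inference.py;
# the import was not part of the original snippet.
from inference import HumanDetectionModel, preprocess, decode_detections

MODEL_PATH = "/data/ysws/agentsp/5-19-1/cv_resnet18_human-detection/iic/cv_resnet18_human-detection/pytorch_model.pt"

# Load the model and move it to the NPU
model = HumanDetectionModel(MODEL_PATH)
device = torch.device("npu:0")
model = model.to(device)
model.eval()

# Read the image (cv2.imread returns None if the file cannot be read)
img = cv2.imread("test_human.jpg")
if img is None:
    raise FileNotFoundError("test_human.jpg could not be read")
img_tensor, scale, _, _ = preprocess(img)
img_tensor = torch.from_numpy(img_tensor).unsqueeze(0).to(device)

# Run inference
with torch.no_grad():
    cls_score, bbox_pred = model(img_tensor)

# Decode the detection results
detections = decode_detections(cls_score, bbox_pred, scale, img.shape)
print(f"Detected {len(detections)} people")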
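
The internals of decode_detections are not shown in this document. As a rough illustration of what detection post-processing commonly involves, the sketch below filters boxes by score and applies greedy non-maximum suppression, using the 0.6 score threshold and the 0.5 R-CNN NMS threshold listed under the inference parameters. The box format (x1, y1, x2, y2, score) and the function names are assumptions, and the actual decoding in inference.py may differ.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2, ...)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_and_nms(boxes, score_thr=0.6, iou_thr=0.5):
    """Drop low-score boxes, then greedily keep the best non-overlapping ones."""
    candidates = sorted((b for b in boxes if b[4] >= score_thr),
                        key=lambda b: b[4], reverse=True)
    kept = []
    for box in candidates:
        # Keep a box only if it does not overlap too much with a better one.
        if all(iou(box, k) <= iou_thr for k in kept):
            kept.append(box)
    return kept


# Made-up boxes: two overlapping detections of one person, plus a weak one.
boxes = [(10, 10, 110, 210, 0.95), (12, 14, 112, 208, 0.80), (300, 50, 360, 200, 0.30)]
print(filter_and_nms(boxes))
```

Here the second box is suppressed because it overlaps heavily with the higher-scoring first box, and the third is dropped by the score threshold.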

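Batch image processing

import os

# Reuses model, device, and the helpers from the previous example;
# visualize is likewise assumed to come from inference.py.
image_dir = "/path/to/images"
OUTPUT_DIR = "/path/to/output"  # placeholder; not defined in the original snippet
os.makedirs(OUTPUT_DIR, exist_ok=True)
image_files = [f for f in os.listdir(image_dir) if f.endswith('.jpg')]

for img_file in image_files:
    img_path = os.path.join(image_dir, img_file)
    img = cv2.imread(img_path)
    if img is None:
        continue  # skip files that cannot be read

    # Preprocess
    img_tensor, scale, _, _ = preprocess(img)
    img_tensor = torch.from_numpy(img_tensor).unsqueeze(0).to(device)

    # Run inference
    with torch.no_grad():
        cls_score, bbox_pred = model(img_tensor)

    detections = decode_detections(cls_score, bbox_pred, scale, img.shape)

    # Save the visualized result
    vis_path = os.path.join(OUTPUT_DIR, f"result_{img_file}")
    visualize(img, detections, vis_path)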

Model Structure

Architecture: FasterRCNN + ResNet18 + DyHead
Backbone: ResNet18 (4 stages, output indices 0-3)
Neck: FPN + DyHead (6 blocks, 128 channels)
Head: DynamicRoIHead (RPN + BBox head)

Component	Description
backbone	ResNet18 feature extractor
neck	FPN feature pyramid + DyHead attention
rpn_head	Region proposal network
roi_head	Dynamic RoI head detector

Inference Parameters

Key parameters extracted from mmcv_config.py:

# Image preprocessing
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53],
    std=[58.395, 57.12, 57.375],
    to_rgb=True
)

# Test-time image scaling
img_scale=(1333, 800)
size_divisor=32

# Detection thresholds
score_thr=0.6
# nms_threshold: 0.7 (RPN), 0.5 (R-CNN)
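
To make the preprocessing parameters concrete, the sketch below works through the arithmetic for an aspect-preserving resize to img_scale, padding to a multiple of size_divisor, and per-channel normalization. It assumes the MMDetection-style reading of img_scale=(1333, 800), where the long edge is at most 1333 and the short edge at most 800. Whether preprocess in inference.py follows exactly these steps is not confirmed here, and the function names are hypothetical.

```python
import math

MEAN = [123.675, 116.28, 103.53]   # per-channel mean, RGB order
STD = [58.395, 57.12, 57.375]      # per-channel std, RGB order


def resized_shape(height, width, img_scale=(1333, 800)):
    """Scale (height, width) so the long/short edges fit within img_scale."""
    long_edge, short_edge = max(img_scale), min(img_scale)
    scale = min(long_edge / max(height, width), short_edge / min(height, width))
    return int(height * scale + 0.5), int(width * scale + 0.5), scale


def padded_shape(height, width, size_divisor=32):
    """Round each dimension up to the next multiple of size_divisor."""
    return (math.ceil(height / size_divisor) * size_divisor,
            math.ceil(width / size_divisor) * size_divisor)


def normalize_pixel(rgb):
    """Apply (value - mean) / std to one RGB pixel."""
    return [(v - m) / s for v, m, s in zip(rgb, MEAN, STD)]


# Dimensions of the test image mentioned in this document (533x948).
new_h, new_w, scale = resized_shape(533, 948)
print("resized:", (new_h, new_w), "padded:", padded_shape(new_h, new_w))
```

The returned scale factor is what a decoder would use to map predicted boxes back to the original image coordinates.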

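FAQ

Q: How can I improve inference speed?

A: NPU inference is optimized for large-scale matrix operations, but the measured speedup over CPU for this model is currently only about 1.02x. Processing images in batches can improve throughput.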
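
As a sketch of the batching idea, the helper below groups image paths into fixed-size chunks so that each chunk could be stacked into one tensor and sent through the model in a single forward pass. chunked is a hypothetical helper. Whether HumanDetectionModel accepts batches larger than one, and how much throughput would improve, is not verified in this document.

```python
def chunked(items, batch_size):
    """Yield consecutive slices of items with at most batch_size elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


paths = ["a.jpg", "b.jpg", "c.jpg", "d.jpg", "e.jpg"]
for batch in chunked(paths, batch_size=2):
    # Each batch would be preprocessed to a common padded size and stacked.
    print(batch)
```

Images in one batch must share the same tensor shape, so in practice they would be padded to a common size before stacking.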

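Q: What should I do if the detection boxes are inaccurate?

A: Check the input image quality and the preprocessing configuration. The model may produce false detections in scenes with occlusion or small targets.

Q: Which image formats are supported?

A: All formats that OpenCV can read, including JPG, PNG, and BMP.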
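
The batch example in this document only picks up files ending in .jpg. Below is a small sketch of a case-insensitive filter covering the formats named above. list_images and the extension set are illustrative assumptions rather than part of inference.py.

```python
from pathlib import Path

# Extensions for the formats named above; extend as needed.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def list_images(image_dir):
    """Return sorted paths of files in image_dir with a known image extension."""
    return sorted(
        p for p in Path(image_dir).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )
```

Each returned path can be passed to cv2.imread as str(path).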

References

Original model: https://modelscope.cn/damo/cv_resnet18_human-detection
FasterRCNN: https://arxiv.org/abs/1506.01497
DyHead: https://arxiv.org/abs/2104.09499

License

This project is licensed under the Apache-2.0 license.