m0_74196153/sam2_hiera_base_plus.fb_r896-npu

sam2_hiera_base_plus.fb_r896-npu

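Model Introduction

sam2_hiera_base_plus r896 is the HieraDet feature-extraction model used in SAM2 (Segment Anything Model 2), built on the Hiera architecture. It extracts dense feature representations from images and serves as the backbone of the SAM2 segmentation pipeline.

The model is loaded through the timm library and outputs image feature vectors (no classification head, num_classes=0).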

Original Model

Source	URL
ModelScope	https://www.modelscope.cn/models/timm/sam2_hiera_base_plus.fb_r896
HuggingFace	https://huggingface.co/timm/sam2_hiera_base_plus.fb_r896

Task Type

Image feature extraction (Feature Extraction / Embedding)

Model Framework

PyTorch 2.9.0
torch-npu 2.9.0.post1
timm 1.0.27
HieraDet (SAM2 backbone)

Input Format

Item	Description
Input size	(3, 896, 896)
Data type	float32
Channel order	RGB
Normalization	mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]
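
The input table can be read as a preprocessing recipe. The following is a minimal sketch, not the repository's own code: the function name `preprocess` is hypothetical, the image is assumed to be already resized to 896x896, and pixel values are assumed to be scaled to [0, 1] before normalization (the usual convention for these mean/std values).

```python
import numpy as np

# Normalization constants taken from the input-format table.
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def preprocess(rgb_uint8: np.ndarray) -> np.ndarray:
    """Turn an 896x896 RGB uint8 image (H, W, 3) into a float32 (3, H, W) array."""
    if rgb_uint8.shape != (896, 896, 3):
        raise ValueError(f"expected (896, 896, 3), got {rgb_uint8.shape}")
    x = rgb_uint8.astype(np.float32) / 255.0  # assumed: scale pixels to [0, 1]
    x = (x - MEAN) / STD                       # per-channel normalization
    return np.ascontiguousarray(x.transpose(2, 0, 1))  # HWC -> CHW


# Placeholder all-black image; a real image would be loaded and resized first.
image = np.zeros((896, 896, 3), dtype=np.uint8)
tensor = preprocess(image)
print(tensor.shape, tensor.dtype)
```

Adding a leading batch axis (for example `tensor[None]`) gives the (1, 3, 896, 896) layout used in the inference examples.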

Output Format

Item	Description
Output type	Feature vector
Output dimension	768
Value range	Depends on the input; typically within [-30, 30]

Dependencies

Python 3.11
torch>=2.0.0
torchvision>=0.15.0
timm>=1.0.0
numpy>=1.22.0
Pillow>=10.0.0

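NPU Adaptation Notes

The model is a pure PyTorch implementation whose pretrained weights are loaded with timm.create_model(). Running it on the NPU requires no changes to the model code, only the following:

Call model.to('npu') to move the model to the NPU
Call torch.npu.synchronize() to wait for queued NPU work to finish
Call torch.npu.empty_cache() to release cached NPU device memory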

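Environment Setup

# Install dependencies from the Tsinghua PyPI mirror
# (NPU runs also need torch-npu, which this command does not install)
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple torch torchvision timm numpy Pillow

# Use a HuggingFace mirror (for downloading the weights)
export HF_ENDPOINT=https://hf-mirror.com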

Inference Commands

# Set environment variables
export HF_ENDPOINT=https://hf-mirror.com
export MODEL_NAME=sam2_hiera_base_plus.fb_r896

# Run inference (CPU + NPU)
python3 inference.py

# Run the CPU/NPU precision comparison
python3 compare_cpu_npu.py
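
When the model is driven from Python rather than from the shell, the same two variables can be set and read in code. This is a sketch only: how inference.py actually consumes MODEL_NAME is not shown in this README, so the fallback logic below is an assumption. huggingface_hub is understood to read HF_ENDPOINT when it is imported, so the variable is set before any timm or huggingface_hub import.

```python
import os

# Set the mirror before importing timm / huggingface_hub (assumed to read it at import time).
os.environ.setdefault("HF_ENDPOINT", "https://hf-mirror.com")

# Hypothetical: pick up MODEL_NAME from the environment, falling back to this model.
model_name = os.environ.get("MODEL_NAME", "sam2_hiera_base_plus.fb_r896")
print(model_name, os.environ["HF_ENDPOINT"])
```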

Inference Results

Inference Latency

Device	Average latency (ms)
CPU	9649.68
NPU	27.92
Speedup	345.62x

Feature Statistics

Statistic	CPU	NPU
Min	-5.417320	-5.412513
Max	13.489815	13.444062
Mean	0.000000	0.000000
Std	1.000419	1.000420

CPU/NPU Precision Test

Test Method

Create identical random input tensors using a fixed random seed (torch.manual_seed(42))
Run model inference on the CPU and on the NPU separately
Compare the output feature vectors using the following metrics:
- Cosine Similarity
- Max Absolute Error
- Mean Absolute Error
- Norm-based Relative Error
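
The metrics listed above can be sketched with NumPy as follows. This is an illustration, not the contents of compare_cpu_npu.py: the function name `compare_features` is hypothetical, the norm-based relative error is assumed to be ||ref - test|| / ||ref||, and synthetic vectors stand in for the real CPU and NPU outputs.

```python
import numpy as np


def compare_features(ref: np.ndarray, test: np.ndarray, tol: float = 0.01) -> dict:
    """Compare two feature tensors; ref plays the role of the CPU output."""
    ref = ref.astype(np.float64).ravel()
    test = test.astype(np.float64).ravel()
    diff = np.abs(ref - test)
    return {
        "cosine_similarity": float(ref @ test / (np.linalg.norm(ref) * np.linalg.norm(test))),
        "max_abs_error": float(diff.max()),
        "mean_abs_error": float(diff.mean()),
        # Assumed definition: ||ref - test||_2 / ||ref||_2
        "norm_relative_error": float(np.linalg.norm(ref - test) / np.linalg.norm(ref)),
        # Fraction of elements whose absolute difference is below tol
        "element_match_rate": float((diff < tol).mean()),
    }


# Synthetic stand-ins for the two outputs: a reference vector plus small noise.
rng = np.random.default_rng(42)
cpu_out = rng.standard_normal(1024)
npu_out = cpu_out + rng.normal(0.0, 0.01, size=cpu_out.shape)
metrics = compare_features(cpu_out, npu_out)
for name, value in metrics.items():
    print(f"{name}: {value:.6f}")
```

With real outputs, both tensors would first be moved to the CPU and converted to NumPy arrays (for example with `.cpu().numpy()`).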

Precision Test Results

Metric	Value	Criterion	Result
Cosine Similarity	0.999818	> 0.999	✅ PASS
Max Absolute Error	0.303179	-	-
Mean Absolute Error	0.011738	< 0.05	✅ PASS
Norm-based Relative Error	1.9058%	< 1%	⚠️ Exceeds 1%; accepted because cosine similarity > 0.999
Element-wise agreement rate (diff < 0.01)	53.24%	-	-
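
The cosine similarity and the norm-based relative error in this table are closely linked. Assuming the relative error is defined as ||a - b|| / ||a|| and the two outputs have nearly equal norms (the CPU and NPU standard deviations of 1.000419 and 1.000420 suggest they do), the relative error is approximately sqrt(2 * (1 - cos)). The sketch below evaluates that relation for the reported cosine similarity and for the 0.999 pass threshold; the helper name is hypothetical.

```python
import math


def rel_error_from_cosine(cos_sim: float) -> float:
    """Relative L2 error implied by a cosine similarity, assuming equal-norm vectors."""
    return math.sqrt(2.0 * (1.0 - cos_sim))


reported = rel_error_from_cosine(0.999818)   # cosine similarity from the results table
at_threshold = rel_error_from_cosine(0.999)  # the 0.999 pass threshold
print(f"{reported:.4%} {at_threshold:.4%}")
```

The first value comes out at roughly 1.91%, close to the measured 1.9058%. The second is roughly 4.47%, so under these assumptions a cosine similarity above 0.999 only bounds the relative error to about 4.5%, not to 1%.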

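Conclusion

The mean absolute error between NPU and CPU outputs is 0.011738 (< 0.05). The norm-based relative error is 1.9058%, which is above the 1% target.

Cosine similarity reaches 0.999818 (> 0.999), so the NPU and CPU feature vectors point in almost the same direction. On that basis the NPU output is treated as matching the CPU output, although the two are not element-wise identical: 53.24% of elements differ by less than 0.01, and the maximum absolute error is 0.303179.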

Simulated Terminal Output Screenshot

[Figure: terminal output screenshot]

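Deployment and Inference

1. Load with timm and run inference directly

import torch
import timm

try:
    import torch_npu  # noqa: F401  (registers the 'npu' device with PyTorch)
except ImportError:
    torch_npu = None  # CPU-only environment

# Load the model
model_name = 'sam2_hiera_base_plus.fb_r896'
model = timm.create_model(model_name, pretrained=True)
model.eval()

# CPU inference
with torch.no_grad():
    output = model(torch.randn(1, 3, 896, 896))

# NPU inference (requires an Ascend NPU)
if hasattr(torch, 'npu') and torch.npu.is_available():
    # .to() moves the module in place, so `model` also lives on the NPU afterwards
    model_npu = model.to('npu')
    input_npu = torch.randn(1, 3, 896, 896).to('npu')
    with torch.no_grad():
        output_npu = model_npu(input_npu)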

2. Use the scripts in this repository

# Install dependencies
pip install -r requirements.txt

# Run inference
export HF_ENDPOINT=https://hf-mirror.com
export MODEL_NAME=sam2_hiera_base_plus.fb_r896
python3 inference.py

# Run the precision comparison
python3 compare_cpu_npu.py

Model Tags

#+NPU #+CV #+Ascend #+ImageFeatureExtraction #+SAM2 #+Hiera #+timm