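ConvNeXtV2-nano-22k-384 is a model in the ConvNeXtV2 family. It is pretrained with the fully convolutional masked autoencoder (FCMAE) method on the ImageNet-22k dataset and fine-tuned on the ImageNet-1k dataset at 384x384 resolution.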
from PIL import Image
import requests
from transformers import AutoImageProcessor, ConvNextV2ForImageClassification
# Load model and processor
model_name = "facebook/convnextv2-nano-22k-384"
processor = AutoImageProcessor.from_pretrained(model_name)
model = ConvNextV2ForImageClassification.from_pretrained(model_name)
# Prepare input
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
inputs = processor(images=image, return_tensors="pt")
# Inference
outputs = model(**inputs)
logits = outputs.logits
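predicted_class = logits.argmax(-1).item()
# Map the class index to its human-readable label
print("Predicted class:", model.config.id2label[predicted_class])

The model has been validated on Ascend NPU (Ascend910).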
| Batch size | Latency (ms) | Throughput (images/s) |
|---|---|---|
| 1 | 6.80 | 147.00 |
| 2 | 6.98 | 286.46 |
| 4 | 8.85 | 451.80 |
| 8 | 13.52 | 591.63 |
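
The throughput column is consistent with batch size divided by per-batch latency. The sketch below shows that arithmetic. `throughput_from_latency` is a hypothetical helper and not part of this repository, and it assumes the latency column is the time for one full batch. The small gaps between estimated and reported values are presumably due to the latencies being rounded to two decimals.

```python
def throughput_from_latency(batch_size: int, latency_ms: float) -> float:
    """Images per second, assuming latency_ms is the time for one full batch."""
    return batch_size / (latency_ms / 1000.0)


# (batch size, latency in ms, reported throughput) copied from the table above
rows = [
    (1, 6.80, 147.00),
    (2, 6.98, 286.46),
    (4, 8.85, 451.80),
    (8, 13.52, 591.63),
]

for batch_size, latency_ms, reported in rows:
    estimated = throughput_from_latency(batch_size, latency_ms)
    print(
        f"batch={batch_size}: estimated {estimated:.2f} images/s, "
        f"reported {reported:.2f} images/s"
    )
```

Read this way, going from batch size 1 to 8 roughly doubles latency while throughput increases about fourfold.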
├── inference.py              # NPU inference script
├── eval/
│   ├── run_accuracy.py       # CPU vs NPU accuracy validation
│   └── run_performance.py    # Performance benchmark
└── ms_model/
    └── facebook/convnextv2-nano-22k-384/
        ├── config.json
        ├── model.safetensors
        ├── pytorch_model.bin
        └── preprocessor_config.json
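
The tree lists `eval/run_accuracy.py` as a CPU vs NPU accuracy validation, but its contents are not shown here. As a rough illustration of what such a check could compute, the sketch below compares two logit vectors by maximum absolute difference, cosine similarity, and top-1 agreement. The function names, the thresholds (`atol`, `min_cosine`), and the sample logits are all assumptions made for this example, not values taken from the repository.

```python
import math
from typing import Sequence


def max_abs_diff(a: Sequence[float], b: Sequence[float]) -> float:
    """Largest element-wise absolute difference between two vectors."""
    return max(abs(x - y) for x, y in zip(a, b))


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def outputs_match(
    cpu: Sequence[float],
    npu: Sequence[float],
    atol: float = 1e-2,        # assumed tolerance, tune for your setup
    min_cosine: float = 0.999,  # assumed threshold, tune for your setup
) -> bool:
    """True if both devices agree on the top-1 class and the logits are close."""
    same_top1 = cpu.index(max(cpu)) == npu.index(max(npu))
    close = max_abs_diff(cpu, npu) <= atol
    aligned = cosine_similarity(cpu, npu) >= min_cosine
    return same_top1 and close and aligned


# Hypothetical logits for illustration; real values would come from running
# the same input through the model on each device.
cpu_logits = [2.0, -1.0, 0.5]
npu_logits = [2.001, -0.999, 0.499]

print("max abs diff:", max_abs_diff(cpu_logits, npu_logits))
print("cosine similarity:", cosine_similarity(cpu_logits, npu_logits))
print("match:", outputs_match(cpu_logits, npu_logits))
```

Checking top-1 agreement alongside the numeric tolerances helps catch cases where small numeric drift still flips the predicted class.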