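DA3 Small model for multi-view depth estimation and camera pose estimation. An efficient foundation model built on a unified depth-ray representation.

| Property | Value |
|---|---|
| Model Series | Any-view Model |
| Parameters | 0.08B |
| License | Apache 2.0 |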
```bash
git clone https://github.com/ByteDance-Seed/depth-anything-3
cd depth-anything-3
pip install -e .
```

```python
import torch
from depth_anything_3.api import DepthAnything3

# Load model from Hugging Face Hub
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = DepthAnything3.from_pretrained("depth-anything/da3-small")
model = model.to(device=device)

# Run inference on images
images = ["image1.jpg", "image2.jpg"]  # List of image paths, PIL Images, or numpy arrays
prediction = model.inference(
    images,
    export_dir="output",
    export_format="glb",  # Options: glb, npz, ply, mini_npz, gs_ply, gs_video
)

# Access results
print(prediction.depth.shape)       # Depth maps: [N, H, W] float32
print(prediction.conf.shape)        # Confidence maps: [N, H, W] float32
print(prediction.extrinsics.shape)  # Camera poses (w2c): [N, 3, 4] float32
print(prediction.intrinsics.shape)  # Camera intrinsics: [N, 3, 3] float32
```

```bash
# Process images with auto mode
da3 auto path/to/images \
    --export-format glb \
    --export-dir output \
    --model-dir depth-anything/da3-small

# Use backend for faster repeated inference
da3 backend --model-dir depth-anything/da3-small
da3 auto path/to/images --export-format glb --use-backend
```

💎 A single plain Transformer (e.g., a vanilla DINO encoder) is sufficient as a backbone, with no architectural specialization.
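✨ A single depth-ray representation removes the need for complex multi-task learning.
🏆 Depth Anything 3 significantly outperforms comparable models in the following areas:
For detailed benchmark results, please refer to our paper.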
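
To make the depth-ray idea concrete, the following is a minimal NumPy sketch of how a depth map and per-pixel rays can be combined into 3D world points. It is an illustration under stated assumptions, not the model's internal implementation: it assumes a pinhole camera in the OpenCV convention, depth measured along the camera z-axis, and world-to-camera extrinsics `[R | t]` as in the `[N, 3, 4]` output listed in the Python example. The helper names `pixel_rays` and `depth_to_points` are hypothetical, and the camera values are synthetic stand-ins for one view of `prediction.depth`, `prediction.intrinsics`, and `prediction.extrinsics`.

```python
import numpy as np


def pixel_rays(K, w2c, height, width):
    """Per-pixel ray origin and directions in world coordinates (hypothetical helper).

    K:   (3, 3) pinhole intrinsics.
    w2c: (3, 4) world-to-camera matrix [R | t] (OpenCV convention assumed).
    """
    R, t = w2c[:, :3], w2c[:, 3]
    origin = -R.T @ t  # camera centre expressed in the world frame
    u, v = np.meshgrid(np.arange(width), np.arange(height))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).astype(np.float64)  # (H, W, 3)
    dirs_cam = pix @ np.linalg.inv(K).T  # camera-frame directions with z = 1
    dirs_world = dirs_cam @ R            # row-vector form of R^T @ d
    return origin, dirs_world


def depth_to_points(depth, K, w2c):
    """Combine z-depth with rays: point = origin + depth * direction."""
    height, width = depth.shape
    origin, dirs = pixel_rays(K, w2c, height, width)
    return origin + depth[..., None] * dirs


# Synthetic camera and depth map (assumed values, for illustration only).
K = np.array([[100.0, 0.0, 2.0],
              [0.0, 100.0, 1.0],
              [0.0, 0.0, 1.0]])
theta = np.deg2rad(30.0)
R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0, 1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])
w2c = np.hstack([R, np.array([[0.1], [-0.2], [0.3]])])
depth = np.full((3, 4), 2.0)

points = depth_to_points(depth, K, w2c)
print(points.shape)  # (3, 4, 3)
```

With real outputs, one would pass a single view such as `prediction.depth[i]`, `prediction.intrinsics[i]`, and `prediction.extrinsics[i]`, assuming they are NumPy arrays, and could first drop pixels whose `prediction.conf[i]` value is low.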
If you find Depth Anything 3 helpful for your research or projects, please cite:

```bibtex
@article{depthanything3,
  title={Depth Anything 3: Recovering the visual space from any views},
  author={Haotong Lin and Sili Chen and Jun Hao Liew and Donny Y. Chen and Zhenyu Li and Guang Shi and Jiashi Feng and Bingyi Kang},
  journal={arXiv preprint arXiv:XXXX.XXXXX},
  year={2025}
}
```

Haotong Lin · Sili Chen · Junhao Liew · Donny Y. Chen · Zhenyu Li · Guang Shi · Jiashi Feng · Bingyi Kang