NVIDIA-Nemotron-3-Nano-4B-BF16

Model Developer: NVIDIA Corporation

Model Dates:

December 2025 - January 2026

Data Freshness:

September 2024

The pretraining data has a cutoff date of September 2024.

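Model Overview

NVIDIA-Nemotron-3-Nano-4B-BF16 is a small language model (SLM) trained from scratch by NVIDIA and designed as a unified model for both reasoning and non-reasoning tasks. It responds to user queries and tasks by first generating a reasoning trace and then concluding with a final response. The model's reasoning behavior can be controlled via the system prompt. If the user prefers the final answer without intermediate reasoning traces, the model can be configured to do so, at the cost of a slight drop in accuracy on harder prompts that require reasoning. Conversely, allowing the model to generate reasoning traces first generally yields higher-quality final answers.

The model was compressed from NVIDIA-Nemotron-Nano-9B-v2 using the Nemotron Elastic framework. For details on the parent model NVIDIA-Nemotron-Nano-9B-v2, see the Nemotron-H tech report. The model uses a hybrid architecture consisting primarily of Mamba-2 and MLP layers combined with just four Attention layers.

Supported language: English. Improved using Qwen.

This model is ready for commercial use.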

License/Terms of Use

Governing Terms: Use of this model is governed by the NVIDIA Nemotron Open Model License.

Evaluation Results:

We evaluated the model in Reasoning-off mode on the following benchmarks:

Benchmark	NVIDIA-Nemotron-3-Nano-4B-BF16
BFCL v3	61.1
IFBench-Prompt	43.2
IFBench-Instruction	44.2
Orak	22.9
IFEval-Prompt	82.8
IFEval-Instruction	88
HaluEval	62.2
RULER (128k)	91.1
Tau2-Airline	28.0
Tau2-Retail	34.8
Tau2-Telecom	24.9
EQ-Bench3	63.2

We also evaluated the model in Reasoning-On mode on the following benchmarks:

Benchmark	NVIDIA-Nemotron-3-Nano-4B-BF16
AIME25	78.5
MATH500	95.4
GPQA	53.2
LCB	51.8
BFCL v3	61.1
IFEval-Prompt	87.9
IFEval-Instruction	92
Tau2-Airline	33.3
Tau2-Retail	39.8
Tau2-Telecom	33

All evaluations were run with NeMo-Skills and Orak. For Orak, we evaluated on three games: Super Mario, Darkest Dungeon, and Stardew Valley.
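
For the benchmarks reported in both modes, the effect of enabling reasoning can be read off by subtracting the two tables. The sketch below copies the scores from the tables above (with the IFEval row names normalized to one spelling); the dictionary and function names are illustrative and not part of any NVIDIA tooling.

```python
# Scores copied from the two tables above (same model, two modes).
reasoning_off = {
    "BFCL v3": 61.1,
    "IFEval-Prompt": 82.8,
    "IFEval-Instruction": 88.0,
    "Tau2-Airline": 28.0,
    "Tau2-Retail": 34.8,
    "Tau2-Telecom": 24.9,
}
reasoning_on = {
    "BFCL v3": 61.1,
    "IFEval-Prompt": 87.9,
    "IFEval-Instruction": 92.0,
    "Tau2-Airline": 33.3,
    "Tau2-Retail": 39.8,
    "Tau2-Telecom": 33.0,
}


def reasoning_gain(off, on):
    """Return {benchmark: on - off} for benchmarks present in both tables."""
    return {name: round(on[name] - off[name], 1) for name in off if name in on}


gains = reasoning_gain(reasoning_off, reasoning_on)
for name, delta in gains.items():
    print(f"{name}: {delta:+.1f}")
```

Benchmarks that appear in only one table (for example AIME25 or RULER) are left out because no paired score is reported for them.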

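Deployment Geography: Global

Use Case

NVIDIA-Nemotron-3-Nano-4B is an edge-ready small language model intended for agentic AI on edge platforms (Jetson Thor, GeForce RTX, DGX Spark). Its primary target uses are AI gaming NPCs (teammates/companions), local voice assistants (for devices, apps, and games), and IoT automation. The model is suitable for English and coding languages.

Release Date: March 16, 2026

Hugging Face, March 16, 2026, via https://huggingface.co/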

Model Architecture

Architecture Type: Mamba2-Transformer Hybrid
Network Architecture: Nemotron-Hybrid
- This model was compressed from nvidia/NVIDIA-Nemotron-Nano-9B-v2
- Number of model parameters: 3.97 x 10^9
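
As a rough sizing aid, the parameter count above translates into weight storage as follows. This is a back-of-the-envelope sketch: it assumes every parameter is stored in BF16 (2 bytes) and counts weights only, so activations, the attention KV cache, and Mamba state are not included. The function name is illustrative.

```python
PARAM_COUNT = 3.97e9   # from the model card
BYTES_PER_BF16 = 2     # bfloat16 is a 16-bit format


def weight_memory_gb(n_params, bytes_per_param):
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9


bf16_gb = weight_memory_gb(PARAM_COUNT, BYTES_PER_BF16)
print(f"BF16 weights: ~{bf16_gb:.2f} GB")
```

Actual memory use at inference time will be higher and depends on batch size, context length, and the serving engine.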

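Input

Input Type(s): Text
Input Format(s): String
Input Parameters: One-Dimensional (1D): Sequences
Other Properties Related to Input: Context length up to 262K. Supported language: English.

Output

Output Type(s): Text
Output Format(s): String
Output Parameters: One-Dimensional (1D): Sequences
Other Properties Related to Output: Sequence length up to 262K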
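
When budgeting a request, it can help to check how many tokens remain for generation after the prompt. The sketch below rests on two assumptions that the card does not state outright: that "262K" means 262,144 tokens (the value passed to --max-model-len in the vLLM section), and that prompt and generated tokens share a single window. The helper name is hypothetical.

```python
MAX_CONTEXT_TOKENS = 262_144  # assumption: "262K" means 256 * 1024 tokens


def max_new_tokens_budget(prompt_tokens, context_limit=MAX_CONTEXT_TOKENS):
    """Tokens left for generation once the prompt is counted; never negative."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    return max(context_limit - prompt_tokens, 0)


print(max_new_tokens_budget(200_000))
```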

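Our models are designed and optimized to run on NVIDIA GPU-accelerated systems. By leveraging NVIDIA's hardware (e.g., GPU cores) and software frameworks (e.g., CUDA libraries), the model achieves faster training and inference times compared to CPU-only solutions.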

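Software Integration

Runtime Engine(s): NeMo 25.07
Supported Hardware Microarchitecture Compatibility: NVIDIA A10G, NVIDIA H100-80GB, NVIDIA A100, GeForce RTX
Operating System(s): Linux

Integrating foundation and fine-tuned models into AI systems requires additional testing with use-case-specific data to ensure safe and effective deployment. Following the V-model methodology, iterative testing and validation at both the unit and system levels are essential to mitigate risks, meet technical and functional requirements, and ensure compliance with safety and ethical standards before deployment.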

Use it with Transformers

The snippet below shows how to use this model with Hugging Face Transformers (tested on version 4.48.3).

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("nvidia/NVIDIA-Nemotron-3-Nano-4B")
model = AutoModelForCausalLM.from_pretrained(
    "nvidia/NVIDIA-Nemotron-3-Nano-4B",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto"
)

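# Placeholder system prompt; replace it with your own.
system_prompt = "You are a helpful assistant."
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": "Write a haiku about GPUs"},
]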
tokenized_chat = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

outputs = model.generate(
    tokenized_chat,
    max_new_tokens=32,
    eos_token_id=tokenizer.eos_token_id
)
print(tokenizer.decode(outputs[0]))

For reasoning tasks, we recommend temperature=1.0 and top_p=0.95; for tool calling, we recommend temperature=0.6 and top_p=0.95.
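
One way to keep these recommendations in a single place is a small preset table that expands into keyword arguments for generate(). This is a sketch, not part of the model's API: the names SAMPLING_PRESETS and generation_kwargs are hypothetical, and the default of 1024 new tokens is an arbitrary choice for illustration.

```python
# Presets taken from the recommendation above; names here are hypothetical.
SAMPLING_PRESETS = {
    "reasoning": {"temperature": 1.0, "top_p": 0.95},
    "tool_calling": {"temperature": 0.6, "top_p": 0.95},
}


def generation_kwargs(task, max_new_tokens=1024):
    """Build keyword arguments for model.generate() for a given task type."""
    if task not in SAMPLING_PRESETS:
        raise KeyError(f"unknown task {task!r}; expected one of {sorted(SAMPLING_PRESETS)}")
    # do_sample=True is needed for temperature/top_p to take effect in Transformers.
    return {"do_sample": True, "max_new_tokens": max_new_tokens, **SAMPLING_PRESETS[task]}


print(generation_kwargs("reasoning"))
```

With the model and tokenized_chat from the snippet above, this could be used as model.generate(tokenized_chat, **generation_kwargs("reasoning")).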

To turn reasoning off, add enable_thinking=False to apply_chat_template(). By default, enable_thinking is set to True.

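# Placeholder system prompt; replace it with your own.
system_prompt = "You are a helpful assistant."
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": "Write a haiku about GPUs"},
]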
tokenized_chat = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    enable_thinking=False,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

outputs = model.generate(
    tokenized_chat,
    max_new_tokens=32,
    eos_token_id=tokenizer.eos_token_id
)
print(tokenizer.decode(outputs[0]))

Use it with vLLM

This model requires vllm>=0.15.1. If you are on Jetson Thor or DGX Spark, use the vLLM container for those platforms instead.

pip install -U "vllm>=0.15.1"

Download the custom parser from the Hugging Face repository.

wget https://huggingface.co/nvidia/NVIDIA-Nemotron-3-Nano-4B-BF16/resolve/main/nano_v3_reasoning_parser.py

Launch a vLLM server using the custom parser.

vllm serve nvidia/NVIDIA-Nemotron-3-Nano-4B-BF16 \
  --served-model-name nemotron3-nano-4B-BF16 \
  --max-num-seqs 8 \
  --tensor-parallel-size 1 \
  --max-model-len 262144 \
  --port 8000 \
  --trust-remote-code \
  --mamba_ssm_cache_dtype float32 \
  --enable-auto-tool-choice \
  --tool-call-parser qwen3_coder \
  --reasoning-parser-plugin nano_v3_reasoning_parser.py \
  --reasoning-parser nano_v3

Access the hosted API with a Python client.

from openai import OpenAI

# NOTE: Streaming is preferred for better performance and resource efficiency.
# It allows you to start processing responses as they arrive, reducing latency.

# Synchronous example (non-streaming)
client = OpenAI(
    api_key="your-nvapikey",
    base_url="base-url"  # e.g. "http://localhost:8000/v1" for the vLLM server started above
)

response = client.chat.completions.create(
    model="nemotron3-nano-4B-BF16",
    messages=[
        {
            "role": "user",
            "content": "Hello!"
        }
    ],
    temperature=0.7,
    max_tokens=256,
    top_p=0.7,
    stream=False
)

print(response.choices[0].message.content)
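
Since the note in the example above says streaming is preferred, here is a minimal sketch of a streaming variant. The helper name stream_chat is hypothetical; it only relies on the standard OpenAI Python client call chat.completions.create(..., stream=True) and on each chunk exposing choices[0].delta.content. It yields answer text only, so any reasoning text that the server reports in a separate field is ignored here.

```python
def stream_chat(client, model, prompt, **sampling):
    """Yield text pieces from a streaming chat completion as they arrive."""
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,
        **sampling,
    )
    for chunk in stream:
        # Some chunks carry no choices or an empty delta; skip those.
        if chunk.choices and chunk.choices[0].delta.content:
            yield chunk.choices[0].delta.content
```

With the vLLM server from this section running, it could be called as: for piece in stream_chat(client, "nemotron3-nano-4B-BF16", "Hello!", max_tokens=256): print(piece, end="", flush=True), where client is an OpenAI instance pointed at the server.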

Use it with TRT-LLM

Launch the model with TRT-LLM.

docker run -v /home/root/.cache/huggingface/:/root/.cache/huggingface/ --rm --ulimit memlock=-1 --ulimit stack=67108864 --gpus=all --ipc=host --network host -d -e MODEL=nvidia/NVIDIA-Nemotron-3-Nano-4B-BF16 -e HF_TOKEN=$HF_TOKEN nvcr.io/nvidia/tensorrt-llm/release:1.3.0rc6 bash -c '
cat > /tmp/extra-llm-api-config.yml <<EOF
kv_cache_config:
  dtype: "auto"
  enable_block_reuse: false
cuda_graph_config:
  max_batch_size: 32
  enable_padding: true
disable_overlap_scheduler: true
moe_config: 
  backend: CUTLASS
EOF

trtllm-serve  \
nvidia/NVIDIA-Nemotron-3-Nano-4B-BF16 \
--host 0.0.0.0 \
--port 8123 \
--max_batch_size 32 \
--extra_llm_api_options /tmp/extra-llm-api-config.yml '

Access the hosted endpoint with curl.

curl http://localhost:8123/v1/chat/completions -H "Content-Type: application/json"  -d '{
    "model": "NVIDIA-Nemotron-3-Nano-4B-BF16",
    "messages": [
        {
            "role": "user",
            "content": "Where is New York?"
        }
    ],
    "max_tokens": 1024,
    "top_p": 1.0
}' -w "\n"
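
The same request can be built from Python with only the standard library. The sketch below mirrors the curl command above (same URL, model name, and body); the function name build_chat_request is hypothetical, and the request is constructed but not sent so the snippet runs without a server.

```python
import json
import urllib.request


def build_chat_request(prompt, base_url="http://localhost:8123",
                       model="NVIDIA-Nemotron-3-Nano-4B-BF16"):
    """Build (but do not send) the same POST request as the curl command."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 1024,
        "top_p": 1.0,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_chat_request("Where is New York?")
print(request.full_url)
```

To send it once the TRT-LLM server is up, pass the request to urllib.request.urlopen() and parse the JSON response body.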

Model Version

v1.0

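Training, Testing, and Evaluation Datasets

Training Datasets

Data Modality: Text
Text Training Data Size: More than 10 trillion tokens
Train/Test/Valid Split: We used 100% of the corpus for pre-training and relied on external benchmarks for testing.
Data Collection Method by dataset: Hybrid: Automated, Human, Synthetic
Labeling Method by dataset: Hybrid: Automated, Human, Synthetic

Properties: The post-training corpus for NVIDIA-Nemotron-3-Nano-4B consists of English and multilingual text (German, Spanish, French, Italian, Korean, Portuguese, Russian, Japanese, and Chinese). Our sources cover a variety of document types such as webpages, dialogue, articles, and other written materials. The corpus spans domains including code, legal, math, science, finance, and more. We also include a small portion of question-answering and alignment-style data to improve model accuracy. For several of these domains we used synthetic data, specifically reasoning traces, from DeepSeek R1/R1-0528, Qwen3-235B-A22B, Nemotron 4 340B, Qwen2.5-32B-Instruct-AWQ, Qwen2.5-14B-Instruct, and Qwen 2.5 72B.

For more details on the datasets and synthetic data generation methods, see the technical report NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model.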

Public Datasets

Dataset	Collection Period
Problems in Elementary Mathematics for Home Study	4/23/2025
GSM8K	4/23/2025
PRM800K	4/23/2025
CC-NEWS	4/23/2025
Common Crawl	4/23/2025
Wikimedia	4/23/2025
Bespoke-Stratos-17k	4/23/2025
tigerbot-kaggle-leetcodesolutions-en-2k	4/23/2025
glaive-function-calling-v2	4/23/2025
APIGen Function-Calling	4/23/2025
LMSYS-Chat-1M	4/23/2025
Open Textbook Library - CC BY-SA & GNU subset and OpenStax - CC BY-SA subset	4/23/2025
Advanced Reasoning Benchmark, tigerbot-kaggle-leetcodesolutions-en-2k, PRM800K, and SciBench	4/23/2025
FineWeb-2	4/23/2025
Court Listener	Legacy Download
peS2o	Legacy Download
OpenWebMath	Legacy Download
BioRxiv	Legacy Download
PMC Open Access Subset	Legacy Download
OpenWebText2	Legacy Download
Stack Exchange Data Dump	Legacy Download
PubMed Abstracts	Legacy Download
NIH ExPorter	Legacy Download
arXiv	Legacy Download
BigScience Workshop Datasets	Legacy Download
Reddit Dataset	Legacy Download
SEC's Electronic Data Gathering, Analysis, and Retrieval (EDGAR)	Legacy Download
Public Software Heritage S3	Legacy Download
The Stack	Legacy Download
mC4	Legacy Download
Advanced Mathematical Problem Solving	Legacy Download
MathPile	Legacy Download
NuminaMath CoT	Legacy Download
PMC Article	Legacy Download
FLAN	Legacy Download
Advanced Reasoning Benchmark	Legacy Download
SciBench	Legacy Download
WikiTableQuestions	Legacy Download
FinQA	Legacy Download
Riddles	Legacy Download
Problems in Elementary Mathematics for Home Study	Legacy Download
MedMCQA	Legacy Download
Cosmos QA	Legacy Download
MCTest	Legacy Download
AI2's Reasoning Challenge	Legacy Download
OpenBookQA	Legacy Download
MMLU Auxiliary Train	Legacy Download
social-chemestry-101	Legacy Download
Moral Stories	Legacy Download
The Common Pile v0.1	Legacy Download
FineMath	Legacy Download
MegaMath	Legacy Download
FastChat	6/30/2025
MultiverseMathHard	10/2/2025
SWE-Gym	10/2/2025
WorkBench	10/2/2025
WildChat-1M	10/2/2025
OpenCodeReasoning-2	10/2/2025
HelpSteer3	10/2/2025
opc-sft-stage2	10/2/2025
Big-Math-RL-Verified	10/2/2025
NuminaMath CoT	10/2/2025
MetaMathQA	10/2/2025
simple-arithmetic-problems	10/2/2025
arithmetic	10/2/2025
Skywork-OR1-RL-Data	10/2/2025
News Commentary	10/2/2025
FastChat	10/2/2025
Essential-Web	10/2/2025
finepdfs	10/2/2025
HotpotQA	10/2/2025
SQuAD2.0	10/2/2025
NLTK Words Lists	10/2/2025

Private Non-publicly Accessible Datasets of Third Parties

Dataset
Global Regulation
Workbench

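Online Dataset Sources

The English Common Crawl data was downloaded from the Common Crawl Foundation (see their FAQ for details on their crawling) and includes the snapshots CC-MAIN-2013-20 through CC-MAIN-2025-13. The data was subsequently deduplicated and filtered in various ways described in the Nemotron-CC paper.

Additionally, we extracted data for fifteen languages from the following three Common Crawl snapshots: CC-MAIN-2024-51, CC-MAIN-2025-08, and CC-MAIN-2025-18. The fifteen languages are Arabic, Chinese, Danish, Dutch, French, German, Italian, Japanese, Korean, Polish, Portuguese, Russian, Spanish, Swedish, and Thai. Because we did not have reliable multilingual model-based quality classifiers, we applied only heuristic filtering, similar to what we did for lower-quality English data in the Nemotron-CC pipeline, but selectively removed some filters for languages where they did not work well. Deduplication was done in the same way as for Nemotron-CC.

The GitHub Crawl was collected using the GitHub REST API and the Amazon S3 API. Each crawl was operated in accordance with the rate limits set by its respective source, either GitHub or S3. We collect raw source code and subsequently remove any code whose license is not in our permissive-license set (for additional details, refer to the technical report).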

Dataset	Modality	Dataset Size (Tokens)	Collection Date
English Common Crawl	Text	3.360T	4/8/2025
Multilingual Common Crawl	Text	812.7B	5/1/2025
GitHub Crawl	Text	747.4B	4/29/2025
English Common Crawl 1.1	Text	Undisclosed	10/2/2025
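
The disclosed crawl sizes can be added up to get a lower bound on the crawled token count. The sketch below copies the three disclosed figures from the table above and converts them to billions of tokens; English Common Crawl 1.1 is left out because its size is undisclosed, so the true total is larger. Variable names are illustrative.

```python
# Sizes in billions of tokens, copied from the table above.
crawl_tokens_billion = {
    "English Common Crawl": 3360.0,   # 3.360T
    "Multilingual Common Crawl": 812.7,
    "GitHub Crawl": 747.4,
}

total_billion = sum(crawl_tokens_billion.values())
print(f"Disclosed crawl total: {total_billion / 1000:.3f}T tokens")

# Share of each source within the disclosed total.
for name, size in crawl_tokens_billion.items():
    print(f"{name}: {100 * size / total_billion:.1f}%")
```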

NVIDIA-Sourced Synthetic Datasets

Dataset	Modality	Dataset Size (Tokens)	Seed Dataset	Model(s) Used for Generation
Synthetic Art of Problem Solving from DeepSeek-R1	Text	25.5B	Art of Problem Solving; American Mathematics Competitions 8; American Mathematics Competitions 10	DeepSeek-R1
Synthetic Moral Stories and Social Chemistry from Mixtral-8x22B-v0.1	Text	327M	social-chemestry-101; Moral Stories	Mixtral-8x22B-v0.1
Synthetic Social Sciences seeded with OpenStax from DeepSeek-V3, Mixtral-8x22B-v0.1, and Qwen2.5-72B	Text	83.6M	OpenStax - CC BY-SA subset	DeepSeek-V3; Mixtral-8x22B-v0.1; Qwen2.5-72B
Synthetic Health Sciences seeded with OpenStax from DeepSeek-V3, Mixtral-8x22B-v0.1, and Qwen2.5-72B	Text	9.7M	OpenStax - CC BY-SA subset	DeepSeek-V3; Mixtral-8x22B-v0.1; Qwen2.5-72B
Synthetic STEM seeded with OpenStax, Open Textbook Library, and GSM8K from DeepSeek-R1, DeepSeek-V3, DeepSeek-V3-0324, and Qwen2.5-72B	Text	175M	OpenStax - CC BY-SA subset; GSM8K; Open Textbook Library - CC BY-SA & GNU subset	DeepSeek-R1; DeepSeek-V3; DeepSeek-V3-0324; Qwen2.5-72B
Nemotron-PrismMath	Text	4.6B	Big-Math-RL-Verified; OpenR1-Math-220k	Qwen2.5-0.5B-instruct; Qwen2.5-72B-Instruct; DeepSeek-R1-Distill-Qwen-32B
Synthetic Question Answering Data from Papers and Permissible Books from Qwen2.5-72B-Instruct	Text	350M	arXiv; National Institutes of Health ExPorter; BioRxiv; PMC Article; USPTO Backgrounds; peS2o; Global Regulation; CORE; PG-19; DOAB CC BY & CC BY-SA subset; NDLTD	Qwen2.5-72B-Instruct
Synthetic FineMath-4+ Reprocessed from DeepSeek-V3	Text	9.2B	Common Crawl	DeepSeek-V3
Synthetic FineMath-3+ Reprocessed from phi-4	Text	27.6B	Common Crawl	phi-4
Synthetic Union-3+ Reprocessed from phi-4	Text	93.1B	Common Crawl	phi-4
Refreshed Nemotron-MIND from phi-4	Text	73B	Common Crawl	phi-4
Synthetic Union-4+ Reprocessed from phi-4	Text	14.12B	Common Crawl	phi-4
Synthetic Union-3+ minus 4+ Reprocessed from phi-4	Text	78.95B	Common Crawl	phi-4
Synthetic Union-3 Refreshed from phi-4	Text	80.94B	Common Crawl	phi-4
Synthetic Union-4+ Refreshed from phi-4	Text	52.32B	Common Crawl	phi-4
Synthetic AGIEval seeded with AQUA-RAT, LogiQA, and AR-LSAT from DeepSeek-V3 and DeepSeek-V3-0324	Text	4.0B	AQUA-RAT; LogiQA; AR-LSAT	DeepSeek-V3; DeepSeek-V3-0324
Synthetic AGIEval seeded with AQUA-RAT, LogiQA, and AR-LSAT from Qwen3-30B-A3B	Text	4.2B	AQUA-RAT; LogiQA; AR-LSAT	Qwen3-30B-A3B
Synthetic Art of Problem Solving from Qwen2.5-32B-Instruct, Qwen2.5-Math-72B, Qwen2.5-Math-7B, and Qwen2.5-72B-Instruct	Text	83.1B	Art of Problem Solving; American Mathematics Competitions 8; American Mathematics Competitions 10; GSM8K; PRM800K	Qwen2.5-32B-Instruct; Qwen2.5-Math-72B; Qwen2.5-Math-7B; Qwen2.5-72B-Instruct
Synthetic MMLU Auxiliary Train from DeepSeek-R1	Text	0.5B	MMLU Auxiliary Train	DeepSeek-R1
Synthetic Long Context Continued Post-Training Data from Papers and Permissible Books from Qwen2.5-72B-Instruct	Text	5.4B	arXiv; National Institutes of Health ExPorter; BioRxiv; PMC Article; USPTO Backgrounds; peS2o; Global Regulation; CORE; PG-19; DOAB CC BY & CC BY-SA subset; NDLTD	Qwen2.5-72B-Instruct
Synthetic Common Crawl from Qwen3-30B-A3B and Mistral-Nemo-12B-Instruct	Text	1.949T	Common Crawl	Qwen3-30B-A3B; Mistral-NeMo-12B-Instruct
Synthetic Multilingual Data from Common Crawl from Qwen3-30B-A3B	Text	997.3B	Common Crawl	Qwen3-30B-A3B
Synthetic Multilingual Data from Wikimedia from Qwen3-30B-A3B	Text	55.1B	Wikimedia	Qwen3-30B-A3B
Synthetic OpenMathReasoning from DeepSeek-R1-0528	Text	1.5M	OpenMathReasoning	DeepSeek-R1-0528
Synthetic OpenCodeReasoning from DeepSeek-R1-0528	Text	1.1M	OpenCodeReasoning	DeepSeek-R1-0528
Synthetic Science Data from DeepSeek-R1-0528	Text	1.5M	-	DeepSeek-R1-0528
Synthetic Humanity's Last Exam from DeepSeek-R1-0528	Text	460K	Humanity's Last Exam	DeepSeek-R1-0528
Synthetic ToolBench from Qwen3-235B-A22B	Text	400K	ToolBench	Qwen3-235B-A22B
Synthetic Nemotron Content Safety Dataset V2, eval-safety, Gretel Synthetic Safety Alignment, and RedTeam_2K from DeepSeek-R1-0528	Text	52K	Nemotron Content Safety Dataset V2; eval-safety; Gretel Synthetic Safety Alignment; RedTeam_2K	DeepSeek-R1-0528
Synthetic HelpSteer from Qwen3-235B-A22B	Text	120K	HelpSteer3; HelpSteer2	Qwen3-235B-A22B
Synthetic Alignment data from Mixtral-8x22B-Instruct-v0.1, Mixtral-8x7B-Instruct-v0.1, and Nemotron-4 Family	Text	400K	HelpSteer2; C4; LMSYS-Chat-1M; ShareGPT52K; tigerbot-kaggle-leetcodesolutions-en-2k; GSM8K; PRM800K; lm_identity (NVIDIA internal); FinQA; WikiTableQuestions; Riddles; ChatQA nvolve-multiturn (NVIDIA internal); glaive-function-calling-v2; SciBench; OpenBookQA; Advanced Reasoning Benchmark; Public Software Heritage S3; Khan Academy Math Keywords	Nemotron-4-15B-Base (NVIDIA internal); Nemotron-4-15B-Instruct (NVIDIA internal); Nemotron-4-340B-Base; Nemotron-4-340B-Instruct; Nemotron-4-340B-Reward; Mixtral-8x7B-Instruct-v0.1; Mixtral-8x22B-Instruct-v0.1
Synthetic LMSYS-Chat-1M from Qwen3-235B-A22B	Text	1M	LMSYS-Chat-1M	Qwen3-235B-A22B
Synthetic Multilingual Reasoning data from DeepSeek-R1-0528, Qwen2.5-32B-Instruct-AWQ, and Qwen2.5-14B-Instruct	Text	25M	OpenMathReasoning; OpenCodeReasoning	DeepSeek-R1-0528; Qwen2.5-32B-Instruct-AWQ (translation); Qwen2.5-14B-Instruct (translation)
Synthetic Multilingual Reasoning data from Qwen3-235B-A22B and Gemma 3 Post-Trained models	Text	5M	WildChat	Qwen3-235B-A22B; Gemma 3 PT 12B; Gemma 3 PT 27B
Tool Calling Data	Text	26.2B		Qwen3-235B-A22B-2507; gpt-oss-120b
Synthetic Essential-Web from QwQ-32B	Text	28.1B	Essential-Web	QwQ-32B
Translated Synthetic Crawl	Text	389.9B	Common Crawl	Qwen3-30B-A3B
Translated Synthetic Wikipedia	Text	7.9B	Wikimedia	Qwen3-30B-A3B
Synthetic Art of Problem Solving from gpt-oss-120b and Qwen2.5-32B-Instruct	Text	Undisclosed	Art of Problem Solving; American Mathematics Competitions 8; American Mathematics Competitions 10	gpt-oss-120b; Qwen2.5-32B-Instruct
Synthetic Stack Exchange from gpt-oss-120b and Qwen2.5-32B-Instruct	Text	Undisclosed	Stack Exchange	gpt-oss-120b; Qwen2.5-32B-Instruct
Synthetic OpenCodeReasoning from DeepSeek-R1-0528	Text	Undisclosed	OpenCodeReasoning	DeepSeek-R1-0528
Synthetic HackerRank Coding from DeepSeek-R1-0528	Text	Undisclosed	HackerRank Coding Dataset	DeepSeek-R1-0528
Synthetic SWE-Gym from Qwen3-Coder-480B-A35B-Instruct	Text	Undisclosed	SWE-Gym	Qwen3-Coder-480B-A35B-Instruct
Synthetic Art of Problem Solving and Stack Exchange from gpt-oss-120b, Qwen2.5-32B-Instruct, and Goedel-Prover-V2-32B	Text	Undisclosed	Art of Problem Solving; American Mathematics Competitions 8; American Mathematics Competitions 10; Stack Exchange	gpt-oss-120b; Qwen2.5-32B-Instruct; Goedel-Prover-V2-32B
Synthetic Multilingual Science and Code data from DeepSeek-R1, DeepSeek-R1-0528, Qwen2.5-32B-Instruct, and Qwen3-235B-A22B, translated with Qwen2.5-32B-Instruct and Qwen2.5-14B-Instruct	Text	Undisclosed	Stack Exchange; SCP-116K; LIMO; TACO; Code Contest; Codeforces	DeepSeek-R1; DeepSeek-R1-0528; Qwen2.5-32B-Instruct; Qwen3-235B-A22B
Synthetic Safety from DeepSeek-R1-0528, gpt-oss-120b and Mixtral-8x7B-v0.1	Text	Undisclosed	Nemotron Content Safety Dataset V2; Gretel Synthetic Safety Alignment Dataset; RedTeam-2K; Malicious Tasks; Nemotron-Personas-USA	DeepSeek-R1-0528; gpt-oss-120b; Mixtral-8x7B-v0.1
Synthetic STEM from Qwen3-235B-A22B-Instruct-2507 and gpt-oss-120b	Text	Undisclosed	arXiv; National Institutes of Health ExPorter; BioRxiv; PMC Article; USPTO Backgrounds; peS2o; Global Regulation; CORE; PG-19; DOAB CC BY & CC BY-SA subset; NDLTD	Qwen3-235B-A22B-Instruct-2507; gpt-oss-120b
Synthetic KernelBook from DeepSeek-R1-0528	Text	Undisclosed	KernelBook	DeepSeek-R1-0528
Synthetic Tool Calling from Qwen3-235B-A22B-Thinking-2507 and Qwen3-Next-80B-A3B-Thinking	Text	Undisclosed	ToolBench; glaive-function-calling-v2; APIGen Function-Calling; Nemotron-Personas-USA	Qwen3-235B-A22B-Thinking-2507; Qwen3-Next-80B-A3B-Thinking
Synthetic Chat from gpt-oss-120b, Mixtral-8x22B-Instruct-v0.1, Qwen3-235B-A22B-Instruct-2507, and Qwen3-235B-A22B-Thinking-2507	Text	Undisclosed	C4; LMSYS-Chat-1M; ShareGPT; GSM8K; PRM800K; FinQA; WikiTableQuestions; Riddles; glaive-function-calling-v2; SciBench; tigerbot-kaggle-leetcodesolutions-en-2k; OpenBookQA; Advanced Reasoning Benchmark; Software Heritage; Khan Academy Math Keywords; WildChat-1M; Nemotron-Personas-USA	gpt-oss-120b; Mixtral-8x22B-Instruct-v0.1; Qwen3-235B-A22B-Instruct-2507; Qwen3-235B-A22B-Thinking-2507
Synthetic Long Context from Qwen3-235B-A22B-Instruct-2507	Text	Undisclosed	CORE; PG-19; DOAB CC BY & CC BY-SA subset; NDLTD	Qwen3-235B-A22B-Instruct-2507
Synthetic Tool Use Interactive Agent from gpt-oss-120b, DeepSeek-R1-0528, Qwen3-32B, and Qwen3-235B-A22B-Thinking-2507	Text	Undisclosed	NVIDIA Internal	gpt-oss-120b; DeepSeek-R1-0528; Qwen3-32B; Qwen3-235B-A22B-Thinking-2507
Synthetic STEM from Qwen3-235B-A22B-Thinking-2507	Text	Undisclosed	ICHO-IPH0; Physics Big; Scale HLE; OpenMathReasoning; OpenCodeReasoning	Qwen3-235B-A22B-Thinking-2507
Synthetic DocFinQA and SWE-smith from Qwen3-Coder-480B-A35B-Instruct and Kimi-K2-Thinking	Text	Undisclosed	DocFinQA; SWE-smith	Qwen3-Coder-480B-A35B-Instruct; Kimi-K2-Thinking
Synthetic Math from gpt-oss-120b and Qwen2.5-32B-Instruct	Text	Undisclosed	-	gpt-oss-120b; Qwen2.5-32B-Instruct
Synthetic Essential-Web from gpt-oss-120b	Text	Undisclosed	Essential-Web	gpt-oss-120b
Synthetic Scale HLE from gpt-oss-120b	Text	Undisclosed	Scale HLE	gpt-oss-120b
Synthetic CDQuestions from gpt-oss-120b	Text	Undisclosed	CDQuestions	gpt-oss-120b
Synthetic Stack Exchange from gpt-oss-120b	Text	Undisclosed	Stack Exchange	gpt-oss-120b
Synthetic GPQA from gpt-oss-120b and Qwen2.5-32B-Instruct	Text	Undisclosed	Stack Exchange	gpt-oss-120b; Qwen2.5-32B-Instruct
Synthetic Vedantu from gpt-oss-120b	Text	Undisclosed	Vedantu	gpt-oss-120b
Synthetic SWE-Gym and R2E-Gym-Subset from Qwen3-Coder-480B-A35B-Instruct	Text	Undisclosed	SWE-Gym; R2E-Gym-Subset	Qwen3-Coder-480B-A35B-Instruct
Synthetic SWE-Gym from Qwen3-Coder-480B-A35B-Instruct	Text	Undisclosed	SWE-Gym	Qwen3-Coder-480B-A35B-Instruct
Synthetic SWE-Gym and R2E-Gym-Subset from DeepSeek-R1-0528	Text	Undisclosed	SWE-Gym; R2E-Gym-Subset	DeepSeek-R1-0528
Synthetic HelpSteer, LMSYS-Chat-1M, and Nemotron-Personas-USA from gpt-oss-120b, Qwen3-235B-A22B-Instruct-2507, and Qwen3-235B-A22B-Thinking-2507	Text	Undisclosed	HelpSteer2; HelpSteer3; LMSYS-Chat-1M; Nemotron-Personas-USA	gpt-oss-120b; Qwen3-235B-A22B-Instruct-2507; Qwen3-235B-A22B-Thinking-2507
Synthetic Structured Outputs from Qwen3-30B-A3B-Instruct-2507, Qwen3-30B-A3B-Thinking-2507, Qwen3-235B-A22B-Instruct-2507, and Qwen3-235B-A22B-Thinking-2507	Text	Undisclosed	-	Qwen3-30B-A3B-Instruct-2507; Qwen3-30B-A3B-Thinking-2507; Qwen3-235B-A22B-Instruct-2507; Qwen3-235B-A22B-Thinking-2507
Synthetic Search STEM MCQ from Qwen3-235B-A22B and DeepSeek-R1-0528	Text	Undisclosed	-	Qwen3-235B-A22B; DeepSeek-R1-0528
Synthetic Search STEM OPENQ from DeepSeek-R1-0528	Text	Undisclosed	-	DeepSeek-R1-0528
Synthetic OpenSTEM from Qwen2.5-32B-Instruct and DeepSeek-R1-0528	Text	Undisclosed	-	Qwen2.5-32B-Instruct; DeepSeek-R1-0528
Synthetic MCQ from Qwen2.5-32B-Instruct and DeepSeek-R1-0528	Text	Undisclosed	-	Qwen2.5-32B-Instruct; DeepSeek-R1-0528
Synthetic MCQ10 from DeepSeek-R1-0528	Text	Undisclosed	-	DeepSeek-R1-0528
Synthetic MCQ4 from Qwen3-235B-A22B, DeepSeek-R1-0528, and Qwen3-235B-A22B-Instruct-2507	Text	Undisclosed	-	Qwen3-235B-A22B; DeepSeek-R1-0528; Qwen3-235B-A22B-Instruct-2507
Synthetic OpenMathReasoning from gpt-oss-120b and Qwen2.5-32B-Instruct	Text	Undisclosed	OpenMathReasoning	gpt-oss-120b; Qwen2.5-32B-Instruct
Synthetic Offline Search MCQA HLE from DeepSeek-R1-0528	Text	Undisclosed	-	DeepSeek-R1-0528
Synthetic Offline Search MCQA GPQA from Qwen3-235B-A22B and DeepSeek-R1-0528	Text	Undisclosed	-	Qwen3-235B-A22B; DeepSeek-R1-0528
Synthetic Human Preference from QwQ-32B, Qwen3-30B-A3B, Qwen3-235B-A22B, Qwen3-235B-A22B-Instruct-2507, Mistral-Small-3.1-24B-Instruct-2503, Mistral-Small-3.2-24B-Instruct-2506, MiniMax-M1-80k, MiniMax-M1-40k, Kimi-K2-Instruct, DeepSeek-V3-0324, and DeepSeek-R1-0528	Text	Undisclosed	-	QwQ-32B; Qwen3-30B-A3B; Qwen3-235B-A22B; Qwen3-235B-A22B-Instruct-2507; Mistral-Small-3.1-24B-Instruct-2503; Mistral-Small-3.2-24B-Instruct-2506; MiniMax-M1-80k; MiniMax-M1-40k; Kimi-K2-Instruct; DeepSeek-V3-0324; DeepSeek-R1-0528
Synthetic WildChat-1M and arena-human-preference-140k from DeepSeek-R1, gemma-2-2b-it, gemma-3-27b-it, gpt-oss-20b, gpt-oss-120b, Mistral-7B-Instruct-v0.3, Mixtral-8x22B-Instruct-v0.1, Nemotron-4-340B-Instruct, NVIDIA-Nemotron-Nano-9B-v2, Phi-4-mini-instruct, Phi-3-small-8k-instruct, Phi-3-medium-4k-instruct, Qwen3-235B-A22B, and QwQ-32B	Text	Undisclosed	WildChat-1M; arena-human-preference-140k	DeepSeek-R1; gemma-2-2b-it; gemma-3-27b-it; gpt-oss-20b; gpt-oss-120b; Mistral-7B-Instruct-v0.3; Mixtral-8x22B-Instruct-v0.1; Nemotron-4-340B-Instruct; NVIDIA-Nemotron-Nano-9B-v2; Phi-4-mini-instruct; Phi-3-small-8k-instruct; Phi-3-medium-4k-instruct; Qwen3-235B-A22B; QwQ-32B
Synthetic Safety from DeepSeek-R1-0528, gpt-oss-120b, DeepSeek-R1-Distill-Qwen-7B, and Mixtral-8x7B-v0.1	Text	Undisclosed	Nemotron Content Safety Dataset V2; Gretel Synthetic Safety Alignment Dataset; RedTeam-2K; Malicious Tasks	DeepSeek-R1-0528; gpt-oss-120b; DeepSeek-R1-Distill-Qwen-7B; Qwen3-30B-A3B-Thinking-2507; Qwen3-235B-A22B-Instruct-2507; Mixtral-8x7B-v0.1
Synthetic Code from Qwen3-32B	Text	Undisclosed	English Common Crawl; English Common Crawl 1.1	Qwen3-32B
Synthetic OpenCodeReasoning from DeepSeek-R1	Text	Undisclosed	OpenCodeReasoning	DeepSeek-R1
Synthetic LIMO from DeepSeek-R1-0528	Text	Undisclosed	LIMO	DeepSeek-R1-0528
Synthetic SCP from DeepSeek-R1-0528	Text	Undisclosed	SCP-116K	DeepSeek-R1-0528
Synthetic Stack Exchange from DeepSeek-R1-0528	Text	Undisclosed	Stack Exchange	DeepSeek-R1-0528
Synthetic Common Crawl from Qwen3-30B-A3B	Text	Undisclosed	Common Crawl	Qwen3-30B-A3B
Synthetic Wikipedia from Qwen3-30B-A3B	Text	Undisclosed	Wikimedia	Qwen3-30B-A3B
Synthetic Essential-Web from Qwen3-30B-A3B and Qwen3-235B-A22B-Thinking-2507	Text	Undisclosed	Essential-Web	Qwen3-30B-A3B; Qwen3-235B-A22B-Thinking-2507
Synthetic Textbook Math from Qwen3-30B-A3B, Qwen3-235B-A22B, and phi-4	Text	Undisclosed	Common Crawl; FineMath	Qwen3-30B-A3B; Qwen3-235B-A22B; phi-4
Synthetic Math and Code from DeepSeek-R1 and DeepSeek-R1-0528	Text	Undisclosed	Magicoder-Evol-Instruct-110K; opc-sft-stage2; TACO; OpenCodeReasoning; OpenMathReasoning; NuminaMath CoT	DeepSeek-R1; DeepSeek-R1-0528
Synthetic Nemotron-Personas-USA from gpt-oss-120b and Qwen3-8B	Text	Undisclosed	Nemotron-Personas-USA	gpt-oss-120b; Qwen3-8B

Evaluation Dataset:

Data Collection Method by dataset: Hybrid: Human, Synthetic
Labeling Method by dataset: Hybrid: Automated, Human, Synthetic

Inference

Engines: HF, vLLM, llama-cpp, TRT-LLM, SGLang
Test Hardware: NVIDIA GeForce RTX, H100 80GB, DGX Spark, Jetson Thor/Orin Nano

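Ethical Considerations

NVIDIA believes Trustworthy AI is a shared responsibility, and we have established policies and practices to enable development for a wide array of AI applications. When downloading or using the model in accordance with our Trustworthy AI terms of service, developers should work with their internal model team to ensure the model meets the requirements of the relevant industry and use case and addresses unforeseen product misuse.

We advise against circumventing any safety guardrails contained in the model unless a substantially similar guardrail is in place for your use case. For more details, see the Safety and Explainability Subcards.

For more detailed information on ethical considerations for this model, see the Model Card++ Bias and Privacy Subcards.

Please report security vulnerabilities or NVIDIA AI concerns here.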