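We introduce Intern-S1, our most advanced open-source multimodal reasoning model to date. Intern-S1 combines strong general-task capabilities with state-of-the-art performance on a wide range of scientific tasks, rivaling leading closed-source commercial models.

Built upon a 235B MoE language model (Qwen3) and a 6B vision encoder (InternViT), Intern-S1 has been further pretrained on 5 trillion tokens of multimodal data, including over 2.5 trillion scientific-domain tokens. This lets the model retain strong general capabilities while excelling in specialized scientific domains such as interpreting chemical structures, understanding protein sequences, and planning compound synthesis routes, making Intern-S1 a capable research assistant for real-world scientific applications.

- Strong performance across language and vision reasoning benchmarks, especially on scientific tasks.
- Continually pretrained on a massive 5T-token dataset, over 50% of which is specialized scientific data, embedding deep domain expertise.
- A dynamic tokenizer enables native understanding of molecular formulas, protein sequences, and seismic signals.

We evaluate Intern-S1 on a range of benchmarks, including both general and scientific datasets. The table below compares its performance with recent vision-language models (VLMs) and large language models (LLMs).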
| Benchmark | Intern-S1 | InternVL3-78B | Qwen2.5-VL-72B | DS-R1-0528 | Qwen3-235B-A22B | Kimi-K2-Instruct | Gemini-2.5 Pro | o3 | Grok-4 |
|---|---|---|---|---|---|---|---|---|---|
| MMLU-Pro | 83.5 ✅ | 73.0 | 72.1 | 83.4 | 82.2 | 82.7 | 86.0 | 85.0 | 85.9 |
| MMMU | 77.7 ✅ | 72.2 | 70.2 | - | - | - | 81.9 | 80.8 | 77.9 |
| GPQA | 77.3 | 49.9 | 49.0 | 80.6 | 71.1 | 77.8 | 83.8 | 83.3 | 87.5 |
| MMStar | 74.9 ✅ | 72.5 | 70.8 | - | - | - | 79.3 | 75.1 | 69.6 |
| MathVista | 81.5 👑 | 79.0 | 74.8 | - | - | - | 80.3 | 77.5 | 72.5 |
| AIME2025 | 86.0 | 10.7 | 10.9 | 87.5 | 81.5 | 51.4 | 83.0 | 88.9 | 91.7 |
| MathVision | 62.5 ✅ | 43.1 | 38.1 | - | - | - | 73.0 | 67.7 | 67.3 |
| IFEval | 86.7 | 75.6 | 83.9 | 79.7 | 85.0 | 90.2 | 91.5 | 92.2 | 92.8 |
| SFE | 44.3 👑 | 36.2 | 30.5 | - | - | - | 43.0 | 37.7 | 31.2 |
| Physics | 44.0 ✅ | 23.1 | 15.7 | - | - | - | 40.0 | 47.9 | 42.8 |
| SmolInstruct | 51.0 👑 | 19.4 | 21.0 | 30.7 | 28.7 | 48.1 | 40.4 | 43.9 | 47.3 |
| ChemBench | 83.4 👑 | 61.3 | 61.6 | 75.6 | 75.8 | 75.3 | 82.8 | 81.6 | 83.3 |
| MatBench | 75.0 👑 | 49.3 | 51.5 | 57.7 | 52.1 | 61.7 | 61.7 | 61.6 | 67.9 |
| MicroVQA | 63.9 👑 | 59.1 | 53.0 | - | - | - | 63.1 | 58.3 | 59.5 |
| ProteinLMBench | 63.1 | 61.6 | 61.0 | 61.4 | 59.8 | 66.7 | 62.9 | 67.7 | 66.2 |
| MSEarthMCQ | 65.7 👑 | 57.2 | 37.6 | - | - | - | 59.9 | 61.0 | 58.0 |
| XLRS-Bench | 55.0 👑 | 49.3 | 50.9 | - | - | - | 45.2 | 43.6 | 45.4 |
Note: ✅ marks the best performance among open-source models; 👑 marks the best performance among all models.
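As a rough illustration of how the two markers relate to a row of scores, the sketch below recomputes them for three rows copied from the table. The grouping of columns into open-source and closed-source models, the treatment of ties as "best", and the helper name `marker_for` are assumptions made for this example, not something the table states.

```python
from typing import Dict, Optional

# Assumption: these columns are treated as the open-source group.
OPEN_SOURCE = {
    "Intern-S1", "InternVL3-78B", "Qwen2.5-VL-72B",
    "DS-R1-0528", "Qwen3-235B-A22B", "Kimi-K2-Instruct",
}

def marker_for(scores: Dict[str, Optional[float]], model: str = "Intern-S1") -> str:
    """Return the crown, the check mark, or '' for `model` in one benchmark row."""
    own = scores[model]
    if own is None:
        return ""
    # Ignore cells shown as "-" in the table (stored here as None).
    reported = {name: s for name, s in scores.items() if s is not None}
    if own >= max(reported.values()):
        return "👑"  # best among all models
    open_scores = [s for name, s in reported.items() if name in OPEN_SOURCE]
    if own >= max(open_scores):
        return "✅"  # best among open-source models only
    return ""

# Rows copied from the table above.
mathvista = {"Intern-S1": 81.5, "InternVL3-78B": 79.0, "Qwen2.5-VL-72B": 74.8,
             "DS-R1-0528": None, "Qwen3-235B-A22B": None, "Kimi-K2-Instruct": None,
             "Gemini-2.5 Pro": 80.3, "o3": 77.5, "Grok-4": 72.5}
mmlu_pro = {"Intern-S1": 83.5, "InternVL3-78B": 73.0, "Qwen2.5-VL-72B": 72.1,
            "DS-R1-0528": 83.4, "Qwen3-235B-A22B": 82.2, "Kimi-K2-Instruct": 82.7,
            "Gemini-2.5 Pro": 86.0, "o3": 85.0, "Grok-4": 85.9}
gpqa = {"Intern-S1": 77.3, "InternVL3-78B": 49.9, "Qwen2.5-VL-72B": 49.0,
        "DS-R1-0528": 80.6, "Qwen3-235B-A22B": 71.1, "Kimi-K2-Instruct": 77.8,
        "Gemini-2.5 Pro": 83.8, "o3": 83.3, "Grok-4": 87.5}

for name, row in [("MathVista", mathvista), ("MMLU-Pro", mmlu_pro), ("GPQA", gpqa)]:
    print(name, row["Intern-S1"], marker_for(row))
```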
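We use OpenCompass and VLMEvalKit to evaluate all models. Please refer to this page to quickly start a text-only evaluation task.

We recommend the following hyperparameters for better results: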
top_p = 1.0
top_k = 50
min_p = 0.0
temperature = 0.7

The demo code below shows how to generate from text-only and multimodal inputs.

Please use transformers>=4.53.0 to ensure the model works normally.
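The recommended sampling settings above can be collected into keyword arguments for `generate`. The following is a minimal sketch rather than the project's official recipe: the helper name `generation_kwargs` is hypothetical, and it assumes that `do_sample=True` has to be set for the sampling values to take effect in transformers.

```python
# Sampling settings recommended in this README, expressed as generate() keyword arguments.
RECOMMENDED_SAMPLING = {
    "do_sample": True,  # assumption: enables sampling so the values below are used
    "top_p": 1.0,
    "top_k": 50,
    "min_p": 0.0,
    "temperature": 0.7,
}

def generation_kwargs(max_new_tokens: int = 32768, **overrides) -> dict:
    """Merge the recommended sampling settings with a token budget and optional overrides."""
    kwargs = {"max_new_tokens": max_new_tokens, **RECOMMENDED_SAMPLING}
    kwargs.update(overrides)
    return kwargs

# Intended use with a loaded model and prepared inputs (not executed here):
#   generate_ids = model.generate(**inputs, **generation_kwargs())
print(generation_kwargs(max_new_tokens=1024))
```

Keeping the settings in one dictionary makes it easy to reuse them across the text, image, and video examples.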
Text input:

from transformers import AutoProcessor, AutoModelForCausalLM
import torch

model_name = "internlm/Intern-S1"
processor = AutoProcessor.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", torch_dtype="auto", trust_remote_code=True)

messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "tell me about an interesting physical phenomenon."},
        ],
    }
]

# Apply the chat template and tokenize in one step, then move the tensors to the model's device
inputs = processor.apply_chat_template(messages, add_generation_prompt=True, tokenize=True, return_dict=True, return_tensors="pt").to(model.device, dtype=torch.bfloat16)

generate_ids = model.generate(**inputs, max_new_tokens=32768)
# Decode only the newly generated tokens, skipping the prompt
decoded_output = processor.decode(generate_ids[0, inputs["input_ids"].shape[1] :], skip_special_tokens=True)
print(decoded_output)

Image input:

from transformers import AutoProcessor, AutoModelForCausalLM
import torch

model_name = "internlm/Intern-S1"
processor = AutoProcessor.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", torch_dtype="auto", trust_remote_code=True)

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "http://images.cocodataset.org/val2017/000000039769.jpg"},
            {"type": "text", "text": "Please describe the image explicitly."},
        ],
    }
]

inputs = processor.apply_chat_template(messages, add_generation_prompt=True, tokenize=True, return_dict=True, return_tensors="pt").to(model.device, dtype=torch.bfloat16)

generate_ids = model.generate(**inputs, max_new_tokens=32768)
decoded_output = processor.decode(generate_ids[0, inputs["input_ids"].shape[1] :], skip_special_tokens=True)
print(decoded_output)

Video input:

Please ensure that the decord video decoding library is installed via pip install decord.

from transformers import AutoProcessor, AutoModelForCausalLM
import torch

model_name = "internlm/Intern-S1"
processor = AutoProcessor.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", torch_dtype="auto", trust_remote_code=True)

messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "video",
                "url": "https://huggingface.co/datasets/hf-internal-testing/fixtures_videos/resolve/main/tennis.mp4",
            },
            {"type": "text", "text": "What type of shot is the man performing?"},
        ],
    }
]

inputs = processor.apply_chat_template(
    messages,
    return_tensors="pt",
    add_generation_prompt=True,
    video_load_backend="decord",
    tokenize=True,
    return_dict=True,
).to(model.device, dtype=torch.float16)

generate_ids = model.generate(**inputs, max_new_tokens=32768)
decoded_output = processor.decode(generate_ids[0, inputs["input_ids"].shape[1] :], skip_special_tokens=True)
print(decoded_output)

The minimum hardware requirements for deploying Intern-S1 series models are:

| Model | A100 (GPUs) | H800 (GPUs) | H100 (GPUs) | H200 (GPUs) |
|---|---|---|---|---|
| internlm/Intern-S1 | 8 | 8 | 8 | 4 |
| internlm/Intern-S1-FP8 | - | 4 | 4 | 2 |
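For scripting a deployment check, the table above can be encoded as a small lookup. This is only an illustrative sketch: the helper names `min_gpus` and `can_deploy` are hypothetical, and the "-" cell is stored as `None`, read here as "no figure given".

```python
from typing import Optional

# Minimum GPU counts copied from the table above; None stands for the "-" cell.
MIN_GPUS = {
    "internlm/Intern-S1": {"A100": 8, "H800": 8, "H100": 8, "H200": 4},
    "internlm/Intern-S1-FP8": {"A100": None, "H800": 4, "H100": 4, "H200": 2},
}

def min_gpus(model: str, gpu: str) -> Optional[int]:
    """Look up the minimum GPU count; raises KeyError for names not in the table."""
    return MIN_GPUS[model][gpu]

def can_deploy(model: str, gpu: str, available: int) -> bool:
    """True when the table lists a figure and `available` GPUs meet it."""
    required = min_gpus(model, gpu)
    return required is not None and available >= required

print(min_gpus("internlm/Intern-S1", "H200"))           # 4
print(can_deploy("internlm/Intern-S1-FP8", "A100", 8))  # False
```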
You can use any of the following LLM inference frameworks to launch an OpenAI-compatible server:

lmdeploy:

lmdeploy serve api_server internlm/Intern-S1 --reasoning-parser intern-s1 --tool-call-parser intern-s1 --tp 8

vllm:

vllm serve internlm/Intern-S1 --tensor-parallel-size 8 --trust-remote-code

sglang:

python3 -m sglang.launch_server \
    --model-path internlm/Intern-S1 \
    --trust-remote-code \
    --tp 8 \
    --grammar-backend none

ollama:

# install ollama
curl -fsSL https://ollama.com/install.sh | sh
# fetch model
ollama pull internlm/interns1
# run model
ollama run internlm/interns1
# then use openai client to call on http://localhost:11434/v1

Many large language models (LLMs) now support tool calling, a powerful capability that lets them extend their functionality by interacting with external tools and APIs. This enables a model to perform tasks such as fetching up-to-date information, running code, or calling functions within other applications.
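On the application side, tool calling usually comes down to mapping a tool name to a local function, decoding the JSON arguments, and returning the result as a `tool` message. The sketch below shows that step offline, without a server. The `word_count` tool, the `TOOL_REGISTRY` dictionary, and the hand-written call are hypothetical stand-ins for what a model might emit.

```python
import json

def word_count(text: str) -> dict:
    """Hypothetical tool: count whitespace-separated words."""
    return {"text": text, "words": len(text.split())}

# Registry mapping tool names (as a model would emit them) to local callables.
TOOL_REGISTRY = {"word_count": word_count}

def run_tool_call(call_id: str, name: str, arguments_json: str) -> dict:
    """Execute one tool call and wrap the result as a 'tool' role message."""
    if name not in TOOL_REGISTRY:
        raise ValueError(f"Unknown tool: {name}")
    arguments = json.loads(arguments_json)  # arguments arrive as a JSON string
    result = TOOL_REGISTRY[name](**arguments)
    return {
        "role": "tool",
        "name": name,
        "content": json.dumps(result, ensure_ascii=False),
        "tool_call_id": call_id,
    }

# Hand-written stand-in for a tool call returned by a model.
tool_message = run_tool_call("call_0", "word_count", '{"text": "tool calling in three steps"}')
print(tool_message["content"])
```

A dictionary registry keeps the dispatch in one place, so adding a tool means adding one entry rather than another branch.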
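A key advantage for developers is that a growing number of open-source LLMs are designed to be compatible with the OpenAI API. This means you can use the same familiar syntax and structure from the OpenAI library to implement tool calling with these open-source models. As a result, the code demonstrated in this tutorial is versatile: it works not only with OpenAI models but with any model that follows the same interface standard.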
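One way to see this compatibility is that the same chat request body can be pointed at different servers by changing only the base URL. The sketch below builds such requests with the standard library and does not send them. The helper `build_chat_request` is hypothetical, the two base URLs are the ones used elsewhere in this README, and the model name passed for the LMDeploy server is an assumption (the README's own examples query the server for the served model name).

```python
import json
import urllib.request

def build_chat_request(base_url: str, model: str, user_text: str,
                       api_key: str = "EMPTY") -> urllib.request.Request:
    """Build (but do not send) a POST request for the OpenAI-style chat completions route."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

# Same payload shape, two different OpenAI-compatible servers.
lmdeploy_req = build_chat_request("http://0.0.0.0:23333/v1", "internlm/Intern-S1", "Hello")
ollama_req = build_chat_request("http://localhost:11434/v1", "internlm/interns1", "Hello")

print(lmdeploy_req.full_url)
print(ollama_req.full_url)
# To actually send a request once a server is running:
#   with urllib.request.urlopen(lmdeploy_req) as resp:
#       print(json.load(resp))
```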
To illustrate how this works, let's walk through a practical code example that uses tool calling to get the latest weather forecast (based on the lmdeploy api server).
from openai import OpenAI
import json


def get_current_temperature(location: str, unit: str = "celsius"):
    """Get current temperature at a location.

    Args:
        location: The location to get the temperature for, in the format "City, State, Country".
        unit: The unit to return the temperature in. Defaults to "celsius". (choices: ["celsius", "fahrenheit"])

    Returns:
        the temperature, the location, and the unit in a dict
    """
    # Stub: returns a fixed reading instead of querying a weather service
    return {
        "temperature": 26.1,
        "location": location,
        "unit": unit,
    }


def get_temperature_date(location: str, date: str, unit: str = "celsius"):
    """Get temperature at a location and date.

    Args:
        location: The location to get the temperature for, in the format "City, State, Country".
        date: The date to get the temperature for, in the format "Year-Month-Day".
        unit: The unit to return the temperature in. Defaults to "celsius". (choices: ["celsius", "fahrenheit"])

    Returns:
        the temperature, the location, the date and the unit in a dict
    """
    # Stub: returns a fixed reading instead of querying a weather service
    return {
        "temperature": 25.9,
        "location": location,
        "date": date,
        "unit": unit,
    }


def get_function_by_name(name):
    """Map a tool name returned by the model to the local Python function."""
    if name == "get_current_temperature":
        return get_current_temperature
    if name == "get_temperature_date":
        return get_temperature_date
    raise ValueError(f"Unknown tool: {name}")

tools = [{
    'type': 'function',
    'function': {
        'name': 'get_current_temperature',
        'description': 'Get current temperature at a location.',
        'parameters': {
            'type': 'object',
            'properties': {
                'location': {
                    'type': 'string',
                    'description': 'The location to get the temperature for, in the format \'City, State, Country\'.'
                },
                'unit': {
                    'type': 'string',
                    'enum': [
                        'celsius',
                        'fahrenheit'
                    ],
                    'description': 'The unit to return the temperature in. Defaults to \'celsius\'.'
                }
            },
            'required': [
                'location'
            ]
        }
    }
}, {
    'type': 'function',
    'function': {
        'name': 'get_temperature_date',
        'description': 'Get temperature at a location and date.',
        'parameters': {
            'type': 'object',
            'properties': {
                'location': {
                    'type': 'string',
                    'description': 'The location to get the temperature for, in the format \'City, State, Country\'.'
                },
                'date': {
                    'type': 'string',
                    'description': 'The date to get the temperature for, in the format \'Year-Month-Day\'.'
                },
                'unit': {
                    'type': 'string',
                    'enum': [
                        'celsius',
                        'fahrenheit'
                    ],
                    'description': 'The unit to return the temperature in. Defaults to \'celsius\'.'
                }
            },
            'required': [
                'location',
                'date'
            ]
        }
    }
}]

messages = [
    {'role': 'user', 'content': 'Today is 2024-11-14, What\'s the temperature in San Francisco now? How about tomorrow?'}
]

openai_api_key = "EMPTY"
openai_api_base = "http://0.0.0.0:23333/v1"
client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)
# Use the first model served by the endpoint
model_name = client.models.list().data[0].id

# First round: the model decides which tools to call
response = client.chat.completions.create(
    model=model_name,
    messages=messages,
    max_tokens=32768,
    temperature=0.8,
    top_p=0.8,
    stream=False,
    extra_body=dict(spaces_between_special_tokens=False, enable_thinking=False),
    tools=tools)
print(response.choices[0].message)
messages.append(response.choices[0].message)

# Run each requested tool locally and append its result as a 'tool' message
for tool_call in response.choices[0].message.tool_calls or []:
    tool_call_args = json.loads(tool_call.function.arguments)
    tool_call_result = get_function_by_name(tool_call.function.name)(**tool_call_args)
    tool_call_result = json.dumps(tool_call_result, ensure_ascii=False)
    messages.append({
        'role': 'tool',
        'name': tool_call.function.name,
        'content': tool_call_result,
        'tool_call_id': tool_call.id
    })

# Second round: the model answers using the tool results
response = client.chat.completions.create(
    model=model_name,
    messages=messages,
    temperature=0.8,
    top_p=0.8,
    stream=False,
    extra_body=dict(spaces_between_special_tokens=False, enable_thinking=False),
    tools=tools)
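print(response.choices[0].message.content)

Intern-S1 enables thinking mode by default, which enhances the model's reasoning capabilities and produces higher-quality responses. The feature can be disabled by setting enable_thinking=False in tokenizer.apply_chat_template.

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("internlm/Intern-S1", trust_remote_code=True)
messages = [{"role": "user", "content": "who are you"}]

text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False  # think mode indicator
)

With an Intern-S1 model served via LMDeploy, you can control thinking mode dynamically by adjusting the enable_thinking parameter in your requests.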
from openai import OpenAI
import json

messages = [
    {
        'role': 'user',
        'content': 'who are you'
    }, {
        'role': 'assistant',
        'content': 'I am an AI'
    }, {
        'role': 'user',
        'content': 'AGI is?'
    }]

openai_api_key = "EMPTY"
openai_api_base = "http://0.0.0.0:23333/v1"
client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)
model_name = client.models.list().data[0].id

response = client.chat.completions.create(
    model=model_name,
    messages=messages,
    temperature=0.7,
    top_p=0.8,
    max_tokens=2048,
    extra_body={
        "enable_thinking": False,
    }
)
print(json.dumps(response.model_dump(), indent=2, ensure_ascii=False))

For vllm and sglang users, configure this through:

extra_body={
    "chat_template_kwargs": {"enable_thinking": False}
}

If you find this work useful, feel free to cite it.
@misc{bai2025interns1scientificmultimodalfoundation,
title={Intern-S1: A Scientific Multimodal Foundation Model},
author={Lei Bai and Zhongrui Cai and Maosong Cao and Weihan Cao and Chiyu Chen and Haojiong Chen and Kai Chen and Pengcheng Chen and Ying Chen and Yongkang Chen and Yu Cheng and Yu Cheng and Pei Chu and Tao Chu and Erfei Cui and Ganqu Cui and Long Cui and Ziyun Cui and Nianchen Deng and Ning Ding and Nanqin Dong and Peijie Dong and Shihan Dou and Sinan Du and Haodong Duan and Caihua Fan and Ben Gao and Changjiang Gao and Jianfei Gao and Songyang Gao and Yang Gao and Zhangwei Gao and Jiaye Ge and Qiming Ge and Lixin Gu and Yuzhe Gu and Aijia Guo and Qipeng Guo and Xu Guo and Conghui He and Junjun He and Yili Hong and Siyuan Hou and Caiyu Hu and Hanglei Hu and Jucheng Hu and Ming Hu and Zhouqi Hua and Haian Huang and Junhao Huang and Xu Huang and Zixian Huang and Zhe Jiang and Lingkai Kong and Linyang Li and Peiji Li and Pengze Li and Shuaibin Li and Tianbin Li and Wei Li and Yuqiang Li and Dahua Lin and Junyao Lin and Tianyi Lin and Zhishan Lin and Hongwei Liu and Jiangning Liu and Jiyao Liu and Junnan Liu and Kai Liu and Kaiwen Liu and Kuikun Liu and Shichun Liu and Shudong Liu and Wei Liu and Xinyao Liu and Yuhong Liu and Zhan Liu and Yinquan Lu and Haijun Lv and Hongxia Lv and Huijie Lv and Qidang Lv and Ying Lv and Chengqi Lyu and Chenglong Ma and Jianpeng Ma and Ren Ma and Runmin Ma and Runyuan Ma and Xinzhu Ma and Yichuan Ma and Zihan Ma and Sixuan Mi and Junzhi Ning and Wenchang Ning and Xinle Pang and Jiahui Peng and Runyu Peng and Yu Qiao and Jiantao Qiu and Xiaoye Qu and Yuan Qu and Yuchen Ren and Fukai Shang and Wenqi Shao and Junhao Shen and Shuaike Shen and Chunfeng Song and Demin Song and Diping Song and Chenlin Su and Weijie Su and Weigao Sun and Yu Sun and Qian Tan and Cheng Tang and Huanze Tang and Kexian Tang and Shixiang Tang and Jian Tong and Aoran Wang and Bin Wang and Dong Wang and Lintao Wang and Rui Wang and Weiyun Wang and Wenhai Wang and Yi Wang and Ziyi Wang and Ling-I Wu and Wen Wu and Yue Wu and Zijian Wu and Linchen Xiao and Shuhao Xing 
and Chao Xu and Huihui Xu and Jun Xu and Ruiliang Xu and Wanghan Xu and GanLin Yang and Yuming Yang and Haochen Ye and Jin Ye and Shenglong Ye and Jia Yu and Jiashuo Yu and Jing Yu and Fei Yuan and Bo Zhang and Chao Zhang and Chen Zhang and Hongjie Zhang and Jin Zhang and Qiaosheng Zhang and Qiuyinzhe Zhang and Songyang Zhang and Taolin Zhang and Wenlong Zhang and Wenwei Zhang and Yechen Zhang and Ziyang Zhang and Haiteng Zhao and Qian Zhao and Xiangyu Zhao and Xiangyu Zhao and Bowen Zhou and Dongzhan Zhou and Peiheng Zhou and Yuhao Zhou and Yunhua Zhou and Dongsheng Zhu and Lin Zhu and Yicheng Zou},
year={2025},
eprint={2508.15763},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2508.15763},
}