Mistral Small 4 119B A6B

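Mistral Small 4 is a powerful hybrid model that can act as both a general-purpose instruct model and a reasoning model. It unifies the capabilities of three different model families, Instruct, Reasoning (previously called Magistral), and Devstral, into a single model.

With its multimodal capabilities, efficient architecture, and flexible mode switching, it is a powerful general-purpose model for any task. In a latency-optimized setup, Mistral Small 4 achieves a 40% reduction in end-to-end completion time; in a throughput-optimized setup, it handles 3x more requests per second than Mistral Small 3.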

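To further improve efficiency, you can use either of the following:

Speculative decoding with our trained eagle head, mistralai/Mistral-Small-4-119B-2603-eagle.
4-bit floating-point quantization with our NVFP4 checkpoint, mistralai/Mistral-Small-4-119B-2603-NVFP4.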
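
As a rough illustration of why a 4-bit checkpoint matters, the sketch below estimates weight storage alone from the 119B total parameter count. The 16-bit baseline is an assumption (this card does not state the precision of the default checkpoint), and the figures ignore the KV cache, activations, and the scaling factors that quantized formats store alongside the weights, so real memory use will be higher.

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


TOTAL_PARAMS = 119e9  # total parameter count stated in this card

# Assumption: a 16-bit baseline; the card does not state the default precision.
bf16_gb = weight_memory_gb(TOTAL_PARAMS, 16)
# 4 bits per weight, ignoring per-block scaling factors.
fp4_gb = weight_memory_gb(TOTAL_PARAMS, 4)

print(f"16-bit weights: ~{bf16_gb:.0f} GB")
print(f"4-bit weights:  ~{fp4_gb:.1f} GB")
```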

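Key Features

Mistral Small 4 includes the following architectural choices:

MoE: 128 experts, 4 active.
119B parameters, with 6.5B activated per token.
256k context length.
Multimodal input: accepts text and image input, with text output.
Instruct and reasoning functionality with function calling (reasoning effort configurable per request).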
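
To make "128 experts, 4 active" concrete, here is a minimal sketch of generic top-k expert routing for a single token. It is not Mistral's router: the card does not describe the gating function, so the softmax over the selected logits and the random logits below are assumptions chosen only for illustration.

```python
import math
import random

NUM_EXPERTS = 128  # from this card
TOP_K = 4          # experts active per token, from this card


def route_token(router_logits: list[float], k: int = TOP_K) -> list[tuple[int, float]]:
    """Pick the k highest-scoring experts and softmax-normalize their scores.

    Assumption: generic top-k gating; the model's actual router may differ.
    """
    top = sorted(range(len(router_logits)), key=router_logits.__getitem__, reverse=True)[:k]
    peak = max(router_logits[i] for i in top)  # subtract the max for numerical stability
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]  # stand-in router scores
chosen = route_token(logits)

print(chosen)
print(f"active experts: {len(chosen)}/{NUM_EXPERTS} = {len(chosen) / NUM_EXPERTS:.1%}")
print(f"active parameters: {6.5e9 / 119e9:.1%} of the total")
```

Only the selected experts run for a given token, which is how a 119B-parameter model can activate roughly 6.5B parameters per token.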

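Mistral Small 4 offers the following capabilities:

Reasoning mode: switch between a fast instant-reply mode and a reasoning mode, boosting performance with test-time compute when needed.
Vision: analyzes images and provides insights based on visual content, in addition to text.
Multilingual: supports dozens of languages, including English, French, Spanish, German, Italian, Portuguese, Dutch, Chinese, Japanese, Korean, and Arabic.
System prompt: strong adherence to and support for system prompts.
Agentic: best-in-class agentic capabilities with native function calling and JSON output.
Speed-optimized: best-in-class performance and speed.
Apache 2.0 license: open-source license for both commercial and non-commercial use.
Large context window: supports a 256k context window.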

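Use Cases

Mistral Small 4 is designed for general chat assistants, coding, agentic tasks, and reasoning tasks (with reasoning mode enabled). Its multimodal capabilities also enable document and image understanding for data extraction and analysis.

Its capabilities are particularly well suited for:

Developers interested in coding and agentic capabilities for software engineering automation and codebase exploration.
Enterprises that need general chat assistants, agents, and document understanding.
Researchers who rely on its math and research capabilities.

Mistral Small 4 is also well suited to customization and fine-tuning for more specialized tasks.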

Examples

General chat assistant
Document parsing and extraction
Coding agent
Research assistant
Customization and fine-tuning
And more...

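Benchmarks

Comparison with internal models

Depending on your task, you can trigger reasoning through the per-request parameter reasoning_effort. Set it as follows:

reasoning_effort="none": fast, lightweight responses for everyday tasks, with a chat style comparable to mistralai/Mistral-Small-3.2-24B-Instruct-2506.
reasoning_effort="high": deep, step-by-step reasoning for complex problems, with verbosity comparable to previous Magistral models such as mistralai/Magistral-Small-2509.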
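
One way to keep the two modes consistent across calls is a small helper that pairs each reasoning_effort value with a sampling temperature. The helper name chat_kwargs is hypothetical, and the temperatures (0.1 without reasoning, 0.7 with it) are simply the values used in the examples later in this card, not an official recommendation table. Only the two documented values are accepted here.

```python
# Temperatures taken from this card's own examples: 0.1 for "none", 0.7 for "high".
TEMPERATURE_BY_EFFORT = {"none": 0.1, "high": 0.7}


def chat_kwargs(reasoning_effort: str) -> dict:
    """Return keyword arguments to splat into client.chat.completions.create().

    Hypothetical helper: it only knows the two values documented in this card.
    """
    if reasoning_effort not in TEMPERATURE_BY_EFFORT:
        raise ValueError(f"unsupported reasoning_effort: {reasoning_effort!r}")
    return {
        "reasoning_effort": reasoning_effort,
        "temperature": TEMPERATURE_BY_EFFORT[reasoning_effort],
    }


print(chat_kwargs("none"))
print(chat_kwargs("high"))
```

With an OpenAI-compatible client, a call would then look like client.chat.completions.create(model=model, messages=messages, **chat_kwargs("high")).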

Internal benchmark

Comparison with reasoning models

Internal benchmark - Reasoning

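Comparison with other models

With reasoning enabled, Mistral Small 4 achieves competitive scores, matching or surpassing GPT-OSS 120B on all three benchmarks while generating significantly shorter outputs. On AA LCR, Mistral Small 4 scores 0.72 with only 1.6K characters, whereas Qwen models need 3.5-4x more output (5.8-6.1K characters) to reach comparable performance. On LiveCodeBench, Mistral Small 4 outperforms GPT-OSS 120B while producing 20% less output. This efficiency reduces latency and inference cost and improves the user experience.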
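
The output-length ratio quoted for AA LCR can be checked directly from the character counts given in the text. This is plain arithmetic on the card's own numbers; no benchmark data beyond those figures is assumed.

```python
mistral_chars = 1.6e3        # Mistral Small 4 output length on AA LCR, from the text
qwen_chars = (5.8e3, 6.1e3)  # range quoted for the Qwen models, from the text

# Ratio of Qwen output length to Mistral Small 4 output length.
ratios = [q / mistral_chars for q in qwen_chars]
print(", ".join(f"{r:.2f}x" for r in ratios))
```

Both ratios fall inside the 3.5-4x range stated above.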

Comparison benchmark - LCR
Comparison benchmark - LiveCodeBench
Comparison benchmark - AIME25

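Usage

Mistral Small 4 is supported by several libraries for inference and fine-tuning. We thank all the contributors and maintainers who helped make this possible.

Inference

The model can be deployed with:

vllm (recommended): see here
llama.cpp: see here for Unsloth's GGUF files
LM studio: see here
SGLang: see here
transformers: see here

For optimal performance, we recommend using the Mistral AI API if local serving is subpar.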

Fine-tuning

The model can be fine-tuned with:

Axolotl: see here.

vLLM (recommended)

We recommend using Mistral Small 4 with the vLLM library for production-ready inference.

Installation

Make sure to install the vllm nightly build:
```
uv pip install -U vllm \
    --torch-backend=auto \
    --extra-index-url https://wheels.vllm.ai/nightly
```
Running this command should automatically install mistral_common >= 1.11.0.

To check the installation:
```
python -c "import mistral_common; print(mistral_common.__version__)"
```
You can also use a ready-to-go docker image, or an image from Docker Hub.

Install transformers from the main branch:

```
uv pip install git+https://github.com/huggingface/transformers.git
```

Serving the model

We recommend a server/client setup:

```
vllm serve mistralai/Mistral-Small-4-119B-2603 --max-model-len 262144 --tensor-parallel-size 2 --attention-backend FLASH_ATTN_MLA \
  --tool-call-parser mistral --enable-auto-tool-choice --reasoning-parser mistral --max_num_batched_tokens 16384 --max_num_seqs 128 \
  --gpu_memory_utilization 0.8
```
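
Loading a model of this size can take a while, so a client may want to wait until the server answers before sending requests. The sketch below polls the OpenAI-style /v1/models endpoint, the same one the examples in this card reach through client.models.list(). The function names, the retry count, and the delay are hypothetical choices, and the base URL assumes the default local address used elsewhere in this card.

```python
import json
import time
import urllib.request


def parse_model_ids(payload: dict) -> list[str]:
    """Extract model ids from an OpenAI-style /v1/models response body."""
    return [entry["id"] for entry in payload.get("data", [])]


def wait_for_models(
    base_url: str = "http://localhost:8000/v1",  # assumed default local address
    attempts: int = 30,                          # arbitrary retry budget
    delay_s: float = 10.0,                       # arbitrary pause between retries
) -> list[str]:
    """Poll GET {base_url}/models until the server answers, then return its model ids."""
    for _ in range(attempts):
        try:
            with urllib.request.urlopen(f"{base_url}/models", timeout=5) as response:
                return parse_model_ids(json.load(response))
        except OSError:  # covers refused connections, timeouts, and urllib's URLError
            time.sleep(delay_s)
    raise RuntimeError(f"no response from {base_url} after {attempts} attempts")
```

Calling wait_for_models() once before creating the client gives a clear error if the server never comes up, instead of a connection failure in the middle of a script.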

Testing the server connection

Instruction following

Mistral Small 4 can follow your instructions to the letter.

```
from datetime import datetime, timedelta

from openai import OpenAI
from huggingface_hub import hf_hub_download

# Modify OpenAI's API key and API base to use vLLM's API server.
openai_api_key = "EMPTY"
openai_api_base = "http://localhost:8000/v1"

TEMP = 0.1
# use TEMP = 0.7 for reasoning_effort="high"

client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)

# Use the first (and normally only) model exposed by the server.
models = client.models.list()
model = models.data[0].id


def load_system_prompt(repo_id: str, filename: str) -> str:
    # Download the prompt template from the model repository and fill in its placeholders.
    file_path = hf_hub_download(repo_id=repo_id, filename=filename)
    with open(file_path, "r") as file:
        system_prompt = file.read()
    today = datetime.today().strftime("%Y-%m-%d")
    yesterday = (datetime.today() - timedelta(days=1)).strftime("%Y-%m-%d")
    model_name = repo_id.split("/")[-1]
    return system_prompt.format(name=model_name, today=today, yesterday=yesterday)


SYSTEM_PROMPT = load_system_prompt(model, "SYSTEM_PROMPT.txt")

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {
        "role": "user",
        "content": "Write me a sentence where every word starts with the next letter in the alphabet - start with 'a' and end with 'z'.",
    },
]

response = client.chat.completions.create(
    model=model,
    messages=messages,
    temperature=TEMP,
    reasoning_effort="none",
)

assistant_message = response.choices[0].message.content
print(assistant_message)
```
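
To judge the reply programmatically rather than by eye, a small checker can verify the constraint in the prompt. This is a hypothetical helper, not part of the model card's tooling; it assumes words are separated by whitespace and that surrounding punctuation should be ignored.

```python
import string


def is_alphabet_sentence(sentence: str) -> bool:
    """Return True if the sentence has 26 words whose initials run from 'a' to 'z'."""
    # Strip punctuation around each word, then drop anything that becomes empty.
    words = [word.strip(string.punctuation) for word in sentence.split()]
    initials = "".join(word[0].lower() for word in words if word)
    return initials == string.ascii_lowercase
```

For example, is_alphabet_sentence(assistant_message) could be asserted after the request above to turn the demo into a quick regression check.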

Tool calling

Let's solve some equations with our simple Python calculator tool.

```
import json
from datetime import datetime, timedelta

from openai import OpenAI
from huggingface_hub import hf_hub_download

# Modify OpenAI's API key and API base to use vLLM's API server.
openai_api_key = "EMPTY"
openai_api_base = "http://localhost:8000/v1"

TEMP = 0.1

client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)

models = client.models.list()
model = models.data[0].id


def load_system_prompt(repo_id: str, filename: str) -> str:
    file_path = hf_hub_download(repo_id=repo_id, filename=filename)
    with open(file_path, "r") as file:
        system_prompt = file.read()
    today = datetime.today().strftime("%Y-%m-%d")
    yesterday = (datetime.today() - timedelta(days=1)).strftime("%Y-%m-%d")
    model_name = repo_id.split("/")[-1]
    return system_prompt.format(name=model_name, today=today, yesterday=yesterday)


SYSTEM_PROMPT = load_system_prompt(model, "SYSTEM_PROMPT.txt")

image_url = "https://math-coaching.com/img/fiche/46/expressions-mathematiques.jpg"


def my_calculator(expression: str) -> str:
    # WARNING: eval() executes arbitrary Python. It is acceptable for this demo only;
    # never pass untrusted model output to eval() in production.
    return str(eval(expression))


tools = [
    {
        "type": "function",
        "function": {
            "name": "my_calculator",
            "description": "A calculator that can evaluate a mathematical expression.",
            "parameters": {
                "type": "object",
                "properties": {
                    "expression": {
                        "type": "string",
                        "description": "The mathematical expression to evaluate.",
                    },
                },
                "required": ["expression"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "rewrite",
            "description": "Rewrite a given text for improved clarity",
            "parameters": {
                "type": "object",
                "properties": {
                    "text": {
                        "type": "string",
                        "description": "The input text to rewrite",
                    }
                },
                "required": ["text"],
            },
        },
    },
]

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "Thanks to your calculator, compute the results for the equations that involve numbers displayed in the image.",
            },
            {
                "type": "image_url",
                "image_url": {
                    "url": image_url,
                },
            },
        ],
    },
]

response = client.chat.completions.create(
    model=model,
    messages=messages,
    temperature=TEMP,
    tools=tools,
    tool_choice="auto",
    reasoning_effort="none",
)

message = response.choices[0].message
tool_calls = message.tool_calls or []  # None when the model answers directly

if not tool_calls:
    print(message.content)
else:
    messages.append({"role": "assistant", "tool_calls": tool_calls})
    for tool_call in tool_calls:
        function_name = tool_call.function.name
        if function_name == "my_calculator":
            result = my_calculator(**json.loads(tool_call.function.arguments))
        else:
            result = f"Tool {function_name!r} is not implemented in this example."
        # Answer every tool call so each result stays paired with its call id.
        messages.append(
            {
                "role": "tool",
                "tool_call_id": tool_call.id,
                "name": function_name,
                "content": result,
            }
        )

    response = client.chat.completions.create(
        model=model,
        messages=messages,
        temperature=TEMP,
        reasoning_effort="none",
    )

    print(response.choices[0].message.content)
```
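
The calculator in the example relies on eval(), which will run any Python the model emits. As a hedged alternative, the sketch below evaluates only numeric literals and basic arithmetic operators by walking the syntax tree. The name safe_calculator and the operator whitelist are assumptions made for illustration; adjust the whitelist to the expressions you expect.

```python
import ast
import operator

# Whitelisted operators; anything else (calls, names, attributes) is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Mod: operator.mod,
    ast.Pow: operator.pow,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def _evaluate(node: ast.AST):
    """Recursively evaluate a parsed arithmetic expression."""
    if isinstance(node, ast.Constant):
        if isinstance(node.value, (int, float)) and not isinstance(node.value, bool):
            return node.value
    elif isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        return _BINARY_OPS[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    elif isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_evaluate(node.operand))
    raise ValueError(f"unsupported expression element: {type(node).__name__}")


def safe_calculator(expression: str) -> str:
    """Evaluate a plain arithmetic expression without executing arbitrary code."""
    tree = ast.parse(expression, mode="eval")
    return str(_evaluate(tree.body))
```

It can be swapped in for my_calculator without changing the tool schema. Very large exponents can still be slow to compute, so a production version would likely cap operand sizes as well.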

Vision reasoning

Let's see whether Mistral Small 4 knows when to make a move!

```
from datetime import datetime, timedelta

from openai import OpenAI
from huggingface_hub import hf_hub_download

# Modify OpenAI's API key and API base to use vLLM's API server.
openai_api_key = "EMPTY"
openai_api_base = "http://localhost:8000/v1"

TEMP = 0.7

client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)

models = client.models.list()
model = models.data[0].id


def load_system_prompt(repo_id: str, filename: str) -> str:
    file_path = hf_hub_download(repo_id=repo_id, filename=filename)
    with open(file_path, "r") as file:
        system_prompt = file.read()
    today = datetime.today().strftime("%Y-%m-%d")
    yesterday = (datetime.today() - timedelta(days=1)).strftime("%Y-%m-%d")
    model_name = repo_id.split("/")[-1]
    return system_prompt.format(name=model_name, today=today, yesterday=yesterday)


SYSTEM_PROMPT = load_system_prompt(model, "SYSTEM_PROMPT.txt")
image_url = "https://static.wikia.nocookie.net/essentialsdocs/images/7/70/Battle.png/revision/latest?cb=20220523172438"

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "What action do you think I should take in this situation? List all the possible actions and explain why you think they are good or bad.",
            },
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    },
]


response = client.chat.completions.create(
    model=model,
    messages=messages,
    temperature=TEMP,
    reasoning_effort="high",
)

print(response.choices[0].message.content)
```
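
The example above points at a remote image. For a local file, a common convention with OpenAI-style APIs is to pass a base64 data URL in the same image_url field. The helper below is a minimal sketch of that encoding; the function name is hypothetical, and whether a given server accepts data URLs is an assumption to verify against your deployment.

```python
import base64
import mimetypes
from pathlib import Path


def image_to_data_url(path: str) -> str:
    """Encode a local image file as a data URL (data:<mime>;base64,<payload>)."""
    mime_type, _ = mimetypes.guess_type(path)  # inferred from the file extension
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"not a recognized image file: {path}")
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"
```

The result would replace the remote address, for example {"type": "image_url", "image_url": {"url": image_to_data_url("battle.png")}}, where battle.png is a placeholder file name.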

Transformers

Installation

To use Mistral Small 4, you need to install Transformers from the main branch:

```
uv pip install git+https://github.com/huggingface/transformers.git
```

Inference

Python inference snippet

```
import torch
from transformers import AutoProcessor, Mistral3ForConditionalGeneration


model_id = "mistralai/Mistral-Small-4-119B-2603"

processor = AutoProcessor.from_pretrained(model_id)
model = Mistral3ForConditionalGeneration.from_pretrained(
    model_id, device_map="auto"
)

image_url = "https://static.wikia.nocookie.net/essentialsdocs/images/7/70/Battle.png/revision/latest?cb=20220523172438"

messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "What action do you think I should take in this situation? List all the possible actions and explain why you think they are good or bad.",
            },
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    },
]

inputs = processor.apply_chat_template(
    messages,
    return_tensors="pt",
    tokenize=True,
    return_dict=True,
    reasoning_effort="high",
)
inputs = inputs.to(model.device)

output = model.generate(
    **inputs,
    max_new_tokens=1024,
    do_sample=True,
    temperature=0.7,
)[0]

# Setting `skip_special_tokens=False` to visualize reasoning trace between [THINK] [/THINK] tags.
decoded_output = processor.decode(output[len(inputs["input_ids"][0]):], skip_special_tokens=False)
print(decoded_output)
```
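
Since the decoded text keeps the reasoning trace between [THINK] and [/THINK], it can be useful to separate the trace from the final answer. The function below is a hypothetical post-processing sketch based only on the tag names mentioned in the snippet above; it does not remove other special tokens that may remain in the text.

```python
def split_reasoning(
    decoded: str,
    open_tag: str = "[THINK]",
    close_tag: str = "[/THINK]",
) -> tuple[str, str]:
    """Return (reasoning, answer); reasoning is empty when no complete tag pair exists."""
    start = decoded.find(open_tag)
    end = decoded.find(close_tag, start + len(open_tag)) if start != -1 else -1
    if start == -1 or end == -1:
        # No trace, or generation stopped before the closing tag was produced.
        return "", decoded.strip()
    reasoning = decoded[start + len(open_tag):end].strip()
    answer = (decoded[:start] + decoded[end + len(close_tag):]).strip()
    return reasoning, answer
```

If generation hits max_new_tokens before the closing tag appears, the function returns the whole text as the answer, so a longer token budget may be needed for hard prompts.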

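License

This model is licensed under the Apache 2.0 License.

You must not use this model in a manner that infringes, misappropriates, or otherwise violates any third party's rights, including intellectual property rights.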

