EXAONE 4.5

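We introduce EXAONE 4.5, the first open-weight vision-language model developed by LG AI Research. By integrating a dedicated vision encoder into the existing EXAONE 4.0 framework, we extend the model's capabilities to multimodal inputs. EXAONE 4.5 has 33 billion parameters in total, of which about 1.3 billion belong to the vision encoder. It achieves competitive performance on general benchmarks and outperforms state-of-the-art models of similar size in document understanding and Korean contextual reasoning, while inheriting the strong language capabilities of our previous language models.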

For more details, please refer to the technical report, blog, and GitHub.

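Model Configuration

Model Type: Causal Language Model + Vision Encoder
Number of Parameters (Language Model): 31.7B
Number of Parameters (Vision Encoder): 1.29B
Hidden Dimension: 5,120
Intermediate Size: 27,392
Number of Layers: 64 main layers + 1 MTP layer
- Hybrid Attention Pattern: 16 x (3 sliding window attention + 1 global attention)
- Reordered Norm: normalization is applied after Attention/MLP and before the residual connection
Sliding Window Attention
- Number of Attention Heads: 40 Q-heads and 8 KV-heads
- Head Dimension: 128 for both Q and KV
- Sliding Window Size: 4096
Global Attention
- Number of Attention Heads: 40 Q-heads and 8 KV-heads
- Head Dimension: 128 for both Q and KV
- No Rotary Positional Embedding (NoPE)
Vision Encoder
- Grouped Query Attention (GQA)
- 2D RoPE for vision embeddings
Vocab Size: 153,600
Context Length: 262,144 tokens
Knowledge Cutoff: Dec 2024 (2024/12)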
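The figures in the list above are enough for a back-of-the-envelope look at the layer layout and the per-sequence KV cache. The sketch below is illustrative only. It assumes that the global-attention layer comes last in each group of four, that cache entries take 2 bytes (bf16/fp16), and that sliding-window layers keep only the most recent 4,096 tokens; none of these are stated in the list, and the MTP layer and vision encoder are ignored.

```python
# Layer layout and KV-cache arithmetic derived from the configuration list.
# Assumptions (not from the model card): 2-byte cache entries, sliding-window
# layers cache only the last WINDOW tokens, the MTP layer is ignored.

NUM_GROUPS = 16          # 16 x (3 sliding-window + 1 global)
SLIDING_PER_GROUP = 3
Q_HEADS = 40
KV_HEADS = 8
HEAD_DIM = 128
WINDOW = 4096
CONTEXT = 262_144
BYTES_PER_VALUE = 2      # assumed bf16/fp16 cache

# Assumed ordering inside a group: three sliding-window layers, then one global layer.
layer_types = (["sliding"] * SLIDING_PER_GROUP + ["global"]) * NUM_GROUPS
num_sliding = layer_types.count("sliding")
num_global = layer_types.count("global")

# Keys and values stored for one token in one layer.
bytes_per_token_per_layer = 2 * KV_HEADS * HEAD_DIM * BYTES_PER_VALUE


def kv_cache_bytes(seq_len: int) -> int:
    """KV-cache size in bytes for a single sequence of seq_len tokens."""
    global_part = num_global * seq_len * bytes_per_token_per_layer
    sliding_part = num_sliding * min(seq_len, WINDOW) * bytes_per_token_per_layer
    return global_part + sliding_part


if __name__ == "__main__":
    print(f"layers: {len(layer_types)} ({num_sliding} sliding, {num_global} global)")
    print(f"query heads per KV head: {Q_HEADS // KV_HEADS}")
    print(f"Q projection width: {Q_HEADS * HEAD_DIM}")
    print(f"KV cache at full context: {kv_cache_bytes(CONTEXT) / 2**30:.2f} GiB")
```

Under these assumptions a single full-length sequence needs about 16.75 GiB of KV cache, nearly all of it in the 16 global layers, and the Q projection width (40 x 128 = 5,120) matches the hidden dimension.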

Evaluation Results

Vision-Language Tasks

	EXAONE 4.5 33B (Reasoning)	GPT-5 mini (Reasoning: high)	Qwen3-VL 32B Thinking	Qwen3-VL 235B Thinking	Qwen3.5 27B (Reasoning)
Architecture	Dense	-	Dense	MoE	Dense
Total Params	33B	-	33B	236B	27B
Active Params	33B	-	33B	22B	27B
STEM / Puzzle
MMMU	78.7	79.0	78.1	80.6	82.3
MMMU-Pro	68.6	67.3	68.1	69.3	75.0
MedXpertQA-MM	42.1	34.4	41.6	47.6	62.4
MathVision	75.2	71.9	70.2	74.6	86.0
MathVista (mini)	85.0	79.1	85.9	85.8	87.8
WeMath	79.1	70.3	71.6	74.8	84.0
LogicVista	73.8	70.3	70.9	72.2	77.0
BabyVision	18.8	20.9	17.4	22.2	44.6
Document Understanding
AI2D	89.0	88.2	88.9	89.2	92.9
ChartQAPro	62.2	60.9	61.4	61.2	66.8
CharXiv (RQ)	71.7	68.6	65.2	66.1	79.5
OCRBench v2	63.2	55.8	68.4	66.8	67.3
OmniDocBench v1.5	81.2	77.0	83.1	84.5	88.9
General
MMStar	74.9	74.1	79.4	78.7	81.0
BLINK	68.8	67.7	68.5	67.1	71.6
HallusionBench	63.7	63.2	67.4	66.7	70.0
Korean
KMMMU	42.7	42.6	37.8	42.1	51.7
K-Viscuit	80.1	78.5	78.5	83.9	84.0
KRETA	91.9	94.8	90.3	92.8	96.5

Language-only Tasks

	EXAONE 4.5 33B (Reasoning)	GPT-5 mini (Reasoning: high)	K-EXAONE 236B (Reasoning)	Qwen3-VL 235B Thinking	Qwen3.5 27B (Reasoning)
Architecture	Dense	-	MoE	MoE	Dense
Total Params	33B	-	236B	236B	27B
Active Params	33B	-	23B	22B	27B
Reasoning
AIME 2025	92.9	91.1	92.8	89.7	93.5
AIME 2026	92.6	92.4	92.2	89.4	90.8
GPQA-Diamond	80.5	82.3	79.1	77.1	85.5
LiveCodeBench v6	81.4	78.1	80.7	70.1	80.7
MMLU-Pro	83.3	83.3	83.8	83.8	86.1
Agentic Tool Use
τ²-Bench (Retail)	77.9	78.3	78.6	67.0	84.7
τ²-Bench (Airline)	56.5	60.0	60.4	62.0	67.5
τ²-Bench (Telecom)	73.0	74.1	73.5	44.7	99.3
Instruction Following
IFBench	62.6	74.0	67.3	59.2	76.5
IFEval	89.6	92.8	89.7	88.2	95.0
Long Context Understanding
AA-LCR	50.6	68.0	53.5	58.7	67.3
Korean
KMMLU-Pro	67.6	72.5	67.3	71.1	73.0
KoBALT	52.1	63.6	61.8	51.1	54.9

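Quickstart

Serving EXAONE 4.5

For better inference speed and memory usage, we recommend serving the model with an optimized inference engine. EXAONE 4.5 is supported by several frameworks, including TensorRT-LLM, vLLM, SGLang, and llama.cpp. Support will be extended to more frameworks in the future.

In practice, you can serve EXAONE 4.5 with the full 256K context length on a single H200 GPU, or on four A100-40GB GPUs using tensor parallelism.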
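As a rough sanity check on the hardware statement above, the following sketch compares the weight footprint with the total GPU memory of both setups. It assumes 2 bytes per parameter (bf16), takes the parameter counts from the configuration list, and uses 141 GB as the H200 memory size. KV cache, activations, and framework overhead are not modeled, so the real headroom is smaller than what it prints.

```python
# Rough weight-memory check for the two deployment options mentioned above.
# Assumptions (not from the model card): bf16 weights (2 bytes per parameter),
# H200 = 141 GB of HBM, decimal gigabytes throughout.

LM_PARAMS = 31.7e9       # language model, from the configuration list
VISION_PARAMS = 1.29e9   # vision encoder, from the configuration list
BYTES_PER_PARAM = 2      # assumed bf16

SETUPS = {
    "1x H200": 1 * 141,       # GB, assumed H200 capacity
    "4x A100-40GB": 4 * 40,   # GB
}


def weight_gb() -> float:
    """Total weight footprint in GB under the bf16 assumption."""
    return (LM_PARAMS + VISION_PARAMS) * BYTES_PER_PARAM / 1e9


def headroom_gb(total_gb: float) -> float:
    """Memory left for KV cache, activations, and runtime overhead."""
    return total_gb - weight_gb()


if __name__ == "__main__":
    print(f"weights: {weight_gb():.1f} GB")
    for name, total in SETUPS.items():
        print(f"{name}: {total} GB total, {headroom_gb(total):.1f} GB left after weights")
```

The weights alone come to roughly 66 GB in bf16, which is why a single 40 GB card is not enough and the A100 setup relies on tensor parallelism.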

TensorRT-LLM

TensorRT-LLM provides day-0 support for EXAONE 4.5. Using the model requires our fork of the Transformers library, which you can install by running the following command:

pip install git+https://github.com/nuxlear/transformers.git@add-exaone4_5

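Please refer to the official installation guide, the EXAONE documentation, and the EXAONE 4.5 PR for details.

After installing TensorRT-LLM, you can launch the server with the following snippet. Feel free to remove arguments you do not need.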

trtllm-serve LGAI-EXAONE/EXAONE-4.5-33B \
    --tp_size 2 \
    --port 8000 \
    --reasoning_parser qwen3

An OpenAI-compatible API server will be available at http://localhost:8000/v1.

vLLM

Using the EXAONE 4.5 model requires our forks of Transformers and vLLM. You can install the required dependencies by running the following commands:

uv pip install git+https://github.com/lkm2835/vllm.git@add-exaone4_5
uv pip install git+https://github.com/nuxlear/transformers.git@add-exaone4_5

After installing vLLM, you can launch the server with the following snippet. Feel free to remove arguments you do not need.

vllm serve LGAI-EXAONE/EXAONE-4.5-33B \
    --served-model-name EXAONE-4.5-33B \
    --port 8000 \
    --tensor-parallel-size 2 \
    --max-model-len 262144 \
    --reasoning-parser qwen3 \
    --enable-auto-tool-choice \
    --tool-call-parser hermes \
    --limit-mm-per-prompt '{"image": 64}' \
    --speculative_config '{
        "method": "mtp", 
        "num_speculative_tokens": 3
    }'

An OpenAI-compatible API server will be available at http://localhost:8000/v1.
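The `--limit-mm-per-prompt '{"image": 64}'` setting in the vLLM command caps the number of images a single request may carry. A small client-side check, sketched below, can catch oversized requests before they reach the server. The names `count_images`, `check_image_limit`, and `MAX_IMAGES` are illustrative, and the limit has to match whatever value the server was actually started with.

```python
# Client-side guard for the per-prompt image limit.
# MAX_IMAGES mirrors the --limit-mm-per-prompt value used when launching the server.

MAX_IMAGES = 64


def count_images(messages: list) -> int:
    """Count image_url parts across all messages in OpenAI chat format."""
    total = 0
    for message in messages:
        content = message.get("content")
        if isinstance(content, list):  # multimodal content is a list of parts
            total += sum(1 for part in content if part.get("type") == "image_url")
    return total


def check_image_limit(messages: list, limit: int = MAX_IMAGES) -> None:
    """Raise ValueError if the request carries more images than the server allows."""
    n = count_images(messages)
    if n > limit:
        raise ValueError(f"{n} images in prompt, server limit is {limit}")
```

Calling `check_image_limit(messages)` right before `client.chat.completions.create(...)` keeps the failure local and the error message readable.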

SGLang

Using the EXAONE 4.5 model requires our forks of Transformers and SGLang. You can install the required dependencies by running the following commands:

uv pip install 'git+https://github.com/lkm2835/sglang.git@add-exaone4_5#subdirectory=python&egg=sglang[all]'
uv pip install git+https://github.com/nuxlear/transformers.git@add-exaone4_5

After installing SGLang, you can launch the server with the following snippet. Feel free to remove arguments you do not need.

python -m sglang.launch_server \
    --model-path LGAI-EXAONE/EXAONE-4.5-33B \
    --served-model-name EXAONE-4.5-33B \
    --port 8000 \
    --tp-size 2 \
    --mem-frac 0.81 \
    --reasoning-parser qwen3 \
    --tool-call-parser hermes \
    --speculative-algorithm EAGLE \
    --speculative-num-steps 3 \
    --speculative-eagle-topk 1 \
    --speculative-num-draft-tokens 4

An OpenAI-compatible API server will be available at http://localhost:8000/v1.
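Loading a 33B model can take a while, so it helps to confirm that the server is reachable before sending real requests. The sketch below polls the `/v1/models` endpoint of the OpenAI-compatible API using only the standard library. It assumes the server listens on the address used in the commands above; the function names are illustrative, and the exact model id returned depends on the framework and on whether `--served-model-name` was set.

```python
import json
import time
import urllib.error
import urllib.request

BASE_URL = "http://localhost:8000/v1"  # address used by the launch commands above


def served_models(base_url: str = BASE_URL, timeout: float = 5.0) -> list:
    """Return the model ids reported by GET {base_url}/models."""
    with urllib.request.urlopen(f"{base_url}/models", timeout=timeout) as resp:
        payload = json.load(resp)
    return [entry["id"] for entry in payload.get("data", [])]


def wait_until_ready(fetch=served_models, retries: int = 30, delay: float = 2.0) -> list:
    """Poll fetch() until it succeeds; raise RuntimeError after `retries` attempts."""
    for _ in range(retries):
        try:
            return fetch()
        except (urllib.error.URLError, ConnectionError, TimeoutError):
            time.sleep(delay)  # server is probably still loading weights
    raise RuntimeError(f"server not ready after {retries} attempts")
```

Calling `wait_until_ready()` returns the list of served model ids once the server answers; the id to pass as `model=` in later requests should be one of them.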

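Using EXAONE 4.5

Once an OpenAI-compatible server is running with EXAONE 4.5, you can use the model through the API with only a few lines of code, regardless of which serving framework is behind it. The examples below use the OpenAI Python SDK, so the openai library must be installed in your environment.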

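[!IMPORTANT] To achieve the expected performance, we recommend the following configurations:

For general purposes, use temperature=1.0, top_p=0.95, presence_penalty=1.5.

For OCR/document-related tasks and Korean inputs, use temperature=0.6, top_p=0.95, presence_penalty=1.5, top_k=20.

For text-only inputs, use temperature=1.0, top_p=0.95.

Unlike EXAONE-4.0, EXAONE 4.5 uses enable_thinking=True by default, so set enable_thinking=False when you want non-reasoning mode.

EXAONE 4.5 tends to answer in \boxed{} format. For better parsing accuracy, we recommend pairing it with a matching format instruction in the prompt.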
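Since answers tend to arrive inside `\boxed{}`, a small parser that tolerates nested braces (for example `\boxed{\frac{1}{2}}`) is usually more reliable than a regular expression. The sketch below is one possible approach, not part of the EXAONE tooling; the helper name `extract_boxed` and the wording of `FORMAT_INSTRUCTION` are assumptions made for illustration.

```python
from typing import Optional

# Hypothetical format instruction to append to a prompt; the wording is an assumption.
FORMAT_INSTRUCTION = "Put your final answer within \\boxed{}."


def extract_boxed(text: str) -> Optional[str]:
    """Return the content of the last \\boxed{...} in text, or None if absent."""
    marker = "\\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None
    begin = start + len(marker)
    depth = 0  # nesting depth of braces opened after the marker
    for i in range(begin, len(text)):
        ch = text[i]
        if ch == "{":
            depth += 1
        elif ch == "}":
            if depth == 0:
                return text[begin:i]  # this brace closes \boxed{
            depth -= 1
    return None  # unbalanced braces, treat as no answer


if __name__ == "__main__":
    print(extract_boxed("The ratio is \\boxed{\\frac{1}{2}}."))
```

Taking the last occurrence is a design choice: reasoning text may mention intermediate boxed values before the final one.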

You can try the model's chat completions with the OpenAI Python SDK. For a server running on your local machine, set the OpenAI client's base_url and api_key accordingly.
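The recommended sampling settings can be kept in one place and expanded into request arguments. The sketch below is a convenience wrapper rather than an official helper: the preset names and `sampling_kwargs` are made up for illustration. It also assumes that the server reads `top_k` from the request body, since `top_k` is not a named argument of the OpenAI SDK's `chat.completions.create` and therefore has to travel through `extra_body`.

```python
# Sampling presets taken from the recommendations above.
PRESETS = {
    "general": {"temperature": 1.0, "top_p": 0.95, "presence_penalty": 1.5},
    "ocr_document_korean": {
        "temperature": 0.6,
        "top_p": 0.95,
        "presence_penalty": 1.5,
        "top_k": 20,
    },
    "text_only": {"temperature": 1.0, "top_p": 0.95},
}


def sampling_kwargs(preset: str, enable_thinking: bool = True) -> dict:
    """Build keyword arguments for client.chat.completions.create()."""
    params = dict(PRESETS[preset])  # copy so the preset table is not mutated
    extra_body = {"chat_template_kwargs": {"enable_thinking": enable_thinking}}
    # Assumption: the serving framework accepts top_k as an extra body field.
    if "top_k" in params:
        extra_body["top_k"] = params.pop("top_k")
    return {**params, "extra_body": extra_body}
```

A request then becomes `client.chat.completions.create(model=..., messages=..., **sampling_kwargs("ocr_document_korean"))`.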

Image-Text QA

Reasoning mode

For tasks that require accurate results, you can run EXAONE 4.5 in reasoning mode as shown below.

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="EMPTY",
)

messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "image_url",
                "image_url": {
                    "url": "https://github.com/Aim-Highest/EXAONE-4.5/blob/main/assets/exaone45_input2.png?raw=true",
                },
            },
            {
                "type": "text",
                "text": "How much larger is the model released in winter 2025 compared with the one released in summer 2024?",
            },
        ]
    }
]

response = client.chat.completions.create(
    model="EXAONE-4.5-33B",
    messages=messages,
    max_tokens=32768,
    temperature=1.0,
    top_p=0.95,
    presence_penalty=1.5,
    extra_body={
        "chat_template_kwargs": {
            "enable_thinking": True,  # default: True
        }
    }, 
)
print(response)
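Printing the whole response object is handy for a first look, but in reasoning mode you usually want the reasoning trace and the final answer separately. The sketch below shows one way to split them. The attribute name that carries the trace is an assumption: servers running a reasoning parser have exposed it as `reasoning_content` or `reasoning` depending on the version, so the helper tries both. The demo uses a stand-in object instead of a live server.

```python
from types import SimpleNamespace


def split_reasoning(response):
    """Return (reasoning, answer) from a chat completion response."""
    message = response.choices[0].message
    # Assumed field names; which one is present depends on the serving framework.
    reasoning = getattr(message, "reasoning_content", None) or getattr(message, "reasoning", None)
    return reasoning, message.content


if __name__ == "__main__":
    # Stand-in response with the same attribute layout as the SDK object.
    fake = SimpleNamespace(
        choices=[SimpleNamespace(message=SimpleNamespace(
            reasoning_content="placeholder reasoning", content="placeholder answer"))]
    )
    reasoning, answer = split_reasoning(fake)
    print("[reasoning]", reasoning)
    print("[answer]", answer)
```

With a real server, pass the `response` returned by `client.chat.completions.create(...)` in place of the stand-in.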

Non-reasoning mode

For tasks where latency matters more than accuracy, you can run EXAONE 4.5 in non-reasoning mode as shown below.

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="EMPTY",
)

messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "image_url",
                "image_url": {
                    "url": "https://github.com/Aim-Highest/EXAONE-4.5/blob/main/assets/exaone45_input1.jpg?raw=true",
                },
            },
            {
                "type": "text",
                "text": "What dish is the person preparing, and how is it made?",
            },
        ]
    }
]

response = client.chat.completions.create(
    model="EXAONE-4.5-33B",
    messages=messages,
    max_tokens=32768,
    temperature=1.0,
    top_p=0.95,
    presence_penalty=1.5,
    extra_body={
        "chat_template_kwargs": {
            "enable_thinking": False,  # default: True
        }
    }, 
)
print(response)
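Both examples above pass a public image URL. For a file on disk, the OpenAI chat format also accepts a base64 data URL in the same `image_url` field. The sketch below builds such a content part; the helper names are illustrative, and whether very large images need resizing first depends on the serving framework.

```python
import base64
import mimetypes
from pathlib import Path


def image_to_data_url(path: str) -> str:
    """Encode a local image file as a base64 data URL."""
    mime, _ = mimetypes.guess_type(path)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"cannot determine an image MIME type for {path!r}")
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def image_part(path: str) -> dict:
    """Build an image_url content part for a chat message from a local file."""
    return {"type": "image_url", "image_url": {"url": image_to_data_url(path)}}
```

The returned dictionary can replace the `image_url` entry in the `messages` lists shown above, with the text part left unchanged.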

Text-only QA

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="EMPTY",
)

messages = [
    {
        "role": "user",
        "content": "Explain how useful you are.",
    }
]

response = client.chat.completions.create(
    model="EXAONE-4.5-33B",
    messages=messages,
    max_tokens=32768,
    temperature=1.0,
    top_p=0.95,
    extra_body={
        "chat_template_kwargs": {
            "enable_thinking": True,  # default: True
        }
    }, 
)
print(response)

Agentic Use

The following example demonstrates EXAONE 4.5's agentic capabilities on image-text input. You can pair your own agents, skills, or other tools with the model.

# If needed:
# pip install langchain langchain-openai langchain-mcp-adapters
# curl -LsSf https://astral.sh/uv/install.sh | sh
# sudo apt-get update && sudo apt-get install -y nodejs npm

import asyncio
from langchain_openai import ChatOpenAI
from langchain.agents import create_agent
from langchain_mcp_adapters.client import MultiServerMCPClient

def print_message(msg):
    parts = msg.content if isinstance(msg.content, list) else [{"type": "text", "text": msg.content or ""}]
    text_out, reasoning_out = [], []

    for p in parts:
        if isinstance(p, dict):
            if p.get("type") in ("text", "output_text") and p.get("text"):
                text_out.append(p["text"])
            elif p.get("type") in ("reasoning", "reasoning_text") and p.get("text"):
                reasoning_out.append(p["text"])

    if reasoning_out:
        print("\n[assistant_reasoning_content]")
        print("\n".join(reasoning_out))
    if text_out:
        print("\n[assistant_content]")
        print("\n".join(text_out))

async def main():
    model = ChatOpenAI(
        model="EXAONE-4.5-33B",
        base_url="http://localhost:8000/v1",
        api_key="EMPTY",
        temperature=1.0,
        model_kwargs={"top_p": 0.95},
    )

    client = MultiServerMCPClient({
        "filesystem": {
            "transport": "stdio",
            "command": "npx",
            "args": ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"],
        },
        "fetch": {
            "transport": "stdio",
            "command": "uvx",
            "args": ["mcp-server-fetch"],
        },
        "duckduckgo": {
            "transport": "stdio",
            "command": "uvx",
            "args": ["duckduckgo-mcp-server"],
        },
    })

    agent = create_agent(model, await client.get_tools())

    inputs = {
        "messages": [{
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": (
                        "Look at the image and identify the landmark. "
                        "Use the DuckDuckGo MCP tool to verify its name, height, and location. "
                        "Then use the fetch tool to read a fuller article page about it. "
                        "Create /tmp/mcp-demo and write a short markdown file to "
                        "/tmp/mcp-demo/landmark.md with: name, location, height, and a one-sentence summary of the article. "
                        "Finally, return only the exact file content."
                    ),
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://upload.wikimedia.org/wikipedia/commons/a/a8/Tour_Eiffel_Wikimedia_Commons.jpg"
                    },
                },
            ],
        }]
    }

    async for step in agent.astream(inputs, stream_mode="values"):
        msg = step["messages"][-1]
        if getattr(msg, "type", "") == "ai":
            print_message(msg)
            for tc in getattr(msg, "tool_calls", []) or []:
                print(f"\n[tool call] {tc['name']}({tc['args']})")

if __name__ == "__main__":
    asyncio.run(main())
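The agent above delegates tool handling to LangChain and MCP. The same server-side tool-call parsing (enabled by `--enable-auto-tool-choice` and `--tool-call-parser hermes` in the vLLM command) can also be exercised with a hand-written tool in the standard OpenAI `tools` format. The sketch below is hypothetical: the `word_count` tool, its schema, and the `dispatch_tool_call` helper are invented for illustration and are not part of EXAONE or any framework.

```python
import json


def word_count(text: str) -> int:
    """Toy tool: count whitespace-separated words."""
    return len(text.split())


# Tool schema in the OpenAI function-calling format.
TOOLS = [{
    "type": "function",
    "function": {
        "name": "word_count",
        "description": "Count whitespace-separated words in a text.",
        "parameters": {
            "type": "object",
            "properties": {"text": {"type": "string"}},
            "required": ["text"],
        },
    },
}]

# Maps tool names from the schema to local Python callables.
REGISTRY = {"word_count": word_count}


def dispatch_tool_call(name: str, arguments: str) -> dict:
    """Run one tool call and return the body of a `tool` role message."""
    func = REGISTRY[name]                     # KeyError for unknown tools
    result = func(**json.loads(arguments))    # arguments arrive as a JSON string
    return {"role": "tool", "content": json.dumps(result)}
```

In a request loop you would pass `tools=TOOLS` to `client.chat.completions.create`, then for each entry in `response.choices[0].message.tool_calls` call `dispatch_tool_call(tc.function.name, tc.function.arguments)`, add `tool_call_id=tc.id` to the result, and append it to `messages` before asking the model to continue.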

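Limitations

The EXAONE 4.5 model, like all existing multimodal models, has certain limitations and may occasionally generate inappropriate responses. A multimodal model generates responses based on token output probabilities, which are determined while learning from the training data. Although we have made every effort to exclude personal, harmful, and biased information from the training data, some problematic content may remain and can lead to undesirable responses. Please note that text generated by the EXAONE 4.5 model does not reflect the views of LG AI Research.

Inappropriate answers may be generated that contain personal, harmful, or otherwise inappropriate information.
Biased responses may be generated that relate to age, gender, race, and so on.
Generated responses rely heavily on statistics from the training data, which can result in semantically or syntactically incorrect sentences.
Since the model does not reflect the latest information, responses may be false or contradictory.

LG AI Research strives to reduce the potential risks that may arise from the EXAONE 4.5 model. When using the EXAONE 4.5 model, users must not engage in any malicious activity (for example, entering illegal information) intended to induce inappropriate outputs that violate LG AI's ethical principles.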

License

The model is licensed under the EXAONE AI Model License Agreement 1.2 - NC.

Citation

@article{exaone-4.5,
  title={EXAONE 4.5 Technical Report},
  author={{LG AI Research}},
  journal={arXiv preprint arXiv:2604.08644},
  year={2026}
}

Contact

LG AI Research Technical Support: contact_us@lgresearch.ai
