jd-opensource/JoyAI-LLM-Flash-GGUF
JoyAI-LLM Flash

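1. Model Introduction

JoyAI-LLM-Flash is an advanced medium-sized instruction-tuned language model with 3 billion activated parameters and 48 billion total parameters. It was pre-trained on 20 trillion text tokens with the Muon optimizer, followed by large-scale supervised fine-tuning (SFT), direct preference optimization (DPO), and reinforcement learning (RL) across diverse environments. JoyAI-LLM-Flash performs strongly on frontier knowledge, reasoning, and coding tasks, as well as on agentic capabilities.

Key Features

  • Fiber Bundle RL: Introduces fiber bundle theory into reinforcement learning and proposes a novel optimization framework, FiberPO. The method is designed for the challenges of large-scale, heterogeneous agent training and improves stability and robustness under complex data distributions.
  • Training-Inference Co-design: Combines the Muon optimizer with dense MTP (multi-token prediction) and develops novel optimization methods to address instability as the model scales, reaching 1.3x to 1.7x the throughput of the non-MTP version.
  • Agentic Intelligence: Designed for tool use, reasoning, and autonomous problem solving.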

2. Model Summary

| Property | Value |
|---|---|
| Architecture | Mixture-of-Experts (MoE) |
| Total Parameters | 48B |
| Activated Parameters | 3B |
| Number of Layers (Dense layer included) | 40 |
| Number of Dense Layers | 1 |
| Attention Hidden Dimension | 2048 |
| MoE Hidden Dimension (per Expert) | 768 |
| Number of Attention Heads | 32 |
| Number of Experts | 256 |
| Selected Experts per Token | 8 |
| Number of Shared Experts | 1 |
| Vocabulary Size | 129K |
| Context Length | 128K |
| Attention Mechanism | MLA |
| Activation Function | SwiGLU |
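
As a rough illustration of what the sparsity figures in the model summary imply, the sketch below computes the share of parameters and routed experts that are active per token. It uses only numbers from the summary; the variable names are ours, and it assumes that the 256 experts are routed experts counted separately from the shared expert and that every non-dense layer is an MoE layer. Treat the results as back-of-the-envelope values, not official figures.

```python
# Values taken from the model summary.
total_params_b = 48        # total parameters, in billions
activated_params_b = 3     # activated parameters per token, in billions
num_experts = 256          # experts per MoE layer (assumed to be routed experts)
selected_experts = 8       # experts selected per token
shared_experts = 1         # shared experts (assumed always active)
num_layers = 40            # includes the dense layer
num_dense_layers = 1

activated_fraction = activated_params_b / total_params_b
routed_fraction = selected_experts / num_experts
experts_per_token = selected_experts + shared_experts
# Assumption: every layer that is not dense is an MoE layer.
num_moe_layers = num_layers - num_dense_layers

print(f"activated parameters: {activated_fraction:.2%}")
print(f"routed experts used per token: {routed_fraction:.3%}")
print(f"experts evaluated per token per MoE layer: {experts_per_token}")
print(f"MoE layers: {num_moe_layers}")
```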

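3. Evaluation Results

| Benchmark | JoyAI-LLM Flash | Qwen3-30B-A3B-Instruct-2507 | GLM-4.7-Flash (Non-thinking) |
|---|---|---|---|
| Knowledge & Alignment | | | |
| MMLU | 89.50 | 86.87 | 80.53 |
| MMLU-Pro | 81.02 | 73.88 | 63.62 |
| CMMLU | 87.03 | 85.88 | 75.85 |
| GPQA-Diamond | 74.43 | 68.69 | 39.90 |
| SuperGPQA | 55.00 | 52.00 | 32.00 |
| LiveBench | 72.90 | 59.70 | 43.10 |
| IFEval | 86.69 | 83.18 | 82.44 |
| AlignBench | 8.24 | 8.07 | 6.85 |
| HellaSwag | 91.79 | 89.90 | 60.84 |
| Coding | | | |
| HumanEval | 96.34 | 95.12 | 74.39 |
| LiveCodeBench | 65.60 | 39.71 | 27.43 |
| SciCode | 3.08/22.92 | 3.08/22.92 | 3.08/15.11 |
| Mathematics | | | |
| GSM8K | 95.83 | 79.83 | 81.88 |
| AIME2025 | 65.83 | 62.08 | 24.17 |
| MATH 500 | 97.10 | 89.80 | 90.90 |
| Agentic | | | |
| SWE-bench Verified | 60.60 | 24.44 | 51.60 |
| Tau2-Retail | 67.55 | 53.51 | 62.28 |
| Tau2-Airline | 54.00 | 32.00 | 52.00 |
| Tau2-Telecom | 79.83 | 4.39 | 88.60 |
| Long Context | | | |
| RULER | 95.60 | 89.66 | 56.12 |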

4. Deployment

[!Note] You can access the JoyAI-LLM Flash API at https://docs.jdcloud.com/cn/jdaip/chat, which provides OpenAI/Anthropic-compatible APIs.

Currently, we recommend running JoyAI-LLM-Flash-GGUF on the following inference engines:

  • Llama.cpp
  • Ollama
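
For the Llama.cpp route, one possible way to serve the model locally is through the `llama-server` binary, which exposes an OpenAI-compatible HTTP endpoint. The sketch below only assembles the command line; the GGUF file name, port, and context size are placeholders we chose for illustration and are not taken from this repository.

```python
import shlex


def build_llama_server_cmd(model_path: str, port: int = 8080, ctx_size: int = 8192) -> list[str]:
    """Build an argument list for llama.cpp's llama-server binary."""
    return [
        "llama-server",
        "-m", model_path,      # path to the GGUF file
        "--port", str(port),   # HTTP port of the OpenAI-compatible server
        "-c", str(ctx_size),   # context window to allocate
    ]


# Hypothetical file name: substitute the GGUF file you actually downloaded.
cmd = build_llama_server_cmd("JoyAI-LLM-Flash.gguf")
print(shlex.join(cmd))
```

To launch the server from Python, the list can be passed to `subprocess.run(cmd, check=True)`, assuming `llama-server` is on your PATH. The usage scripts in section 5 would then point `base_url` at the chosen port.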

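5. Model Usage

The following examples demonstrate how to call our official API.

For third-party APIs deployed with vLLM or SGLang, please note:

[!Note] Recommended sampling parameters: temperature=0.6, top_p=1.0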
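
One way to apply the recommended sampling parameters consistently is to keep them in a single place and merge them into each request. The helper below is a minimal sketch: `build_chat_request` and the model id `joyai-llm-flash` are hypothetical names introduced here, and the returned dictionary is meant to be passed as keyword arguments to an OpenAI-compatible client.

```python
# Recommended values from the note above.
RECOMMENDED_SAMPLING = {"temperature": 0.6, "top_p": 1.0}


def build_chat_request(model: str, prompt: str, **overrides) -> dict:
    """Return keyword arguments for client.chat.completions.create()."""
    request = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        **RECOMMENDED_SAMPLING,
    }
    # Explicit overrides (e.g. max_tokens) take precedence over the defaults.
    request.update(overrides)
    return request


# "joyai-llm-flash" is a placeholder; use the id your server reports.
request = build_chat_request("joyai-llm-flash", "Hello!", max_tokens=256)
print(request)
```

With an `OpenAI` client, the request would be sent as `client.chat.completions.create(**request)`.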

Chat Completion

This is a simple chat completion script that shows how to call the JoyAI-Flash API.

from openai import OpenAI

# Replace IP:PORT with the address of your OpenAI-compatible server.
client = OpenAI(base_url="http://IP:PORT/v1", api_key="EMPTY")


def simple_chat(client: OpenAI):
    messages = [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "which one is bigger, 9.11 or 9.9? think carefully.",
                }
            ],
        },
    ]
    model_name = client.models.list().data[0].id
    response = client.chat.completions.create(
        model=model_name, messages=messages, stream=False, max_tokens=4096
    )
    print(f"response: {response.choices[0].message.content}")


if __name__ == "__main__":
    simple_chat(client)
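
The script above waits for the full reply (`stream=False`). If you prefer to receive tokens incrementally, OpenAI-compatible servers generally accept `stream=True` and return a sequence of chunks whose text sits in `choices[0].delta.content`. The sketch below shows one way to assemble those pieces; `collect_stream` and `fake_chunk` are hypothetical helpers, and the fake chunks only mimic the response shape so the code can be tried without a running server.

```python
from types import SimpleNamespace


def collect_stream(chunks) -> str:
    """Concatenate the text deltas from a streamed chat completion."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:
            continue  # some servers emit chunks without choices (e.g. usage)
        delta = chunk.choices[0].delta.content
        if delta:
            parts.append(delta)
    return "".join(parts)


def fake_chunk(text):
    # Stand-in object with the same attribute layout as a streamed chunk.
    return SimpleNamespace(
        choices=[SimpleNamespace(delta=SimpleNamespace(content=text))]
    )


demo = [fake_chunk("9.9 "), fake_chunk(None), fake_chunk("is bigger.")]
print(collect_stream(demo))
```

Against a real server, the iterable returned by `client.chat.completions.create(..., stream=True)` would take the place of `demo`.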

Tool Call Completion

This is a simple tool call completion script that shows how to call the JoyAI-Flash API.

import json

from openai import OpenAI

# Replace IP:PORT with the address of your OpenAI-compatible server.
client = OpenAI(base_url="http://IP:PORT/v1", api_key="EMPTY")


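def my_calculator(expression: str) -> str:
    # NOTE: eval() executes arbitrary code. It is used here only to keep the
    # demo short; do not run it on untrusted model output.
    return str(eval(expression))


def rewrite(text: str) -> str:
    # Placeholder implementation that returns the text unchanged.
    # The parameter name matches the "text" property in the tool schema.
    return str(text)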


def simple_tool_call(client: OpenAI):
    messages = [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "use my functions to compute the results for the equations: 6+1",
                },
            ],
        },
    ]
    tools = [
        {
            "type": "function",
            "function": {
                "name": "my_calculator",
                "description": "A calculator that can evaluate a mathematical equation and compute its results.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "expression": {
                            "type": "string",
                            "description": "The mathematical expression to evaluate.",
                        },
                    },
                    "required": ["expression"],
                },
            },
        },
        {
            "type": "function",
            "function": {
                "name": "rewrite",
                "description": "Rewrite a given text for improved clarity",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "text": {
                            "type": "string",
                            "description": "The input text to rewrite",
                        }
                    },
                    "required": ["text"],
                },
            },
        },
    ]
    model_name = client.models.list().data[0].id
    response = client.chat.completions.create(
        model=model_name,
        messages=messages,
        temperature=1.0,
        max_tokens=1024,
        tools=tools,
        tool_choice="auto",
    )
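    tool_calls = response.choices[0].message.tool_calls
    if not tool_calls:
        # The model answered directly without requesting a tool.
        print(response.choices[0].message.content)
        return

    results = []
    for tool_call in tool_calls:
        function_name = tool_call.function.name
        function_args = json.loads(tool_call.function.arguments)
        if function_name == "my_calculator":
            result = my_calculator(**function_args)
        elif function_name == "rewrite":
            result = rewrite(function_args.get("text", ""))
        else:
            result = f"Unknown tool: {function_name}"
        # One result per call keeps results aligned with tool_calls in zip().
        results.append(result)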
    messages.append({"role": "assistant", "tool_calls": tool_calls})
    for tool_call, result in zip(tool_calls, results):
        messages.append(
            {
                "role": "tool",
                "tool_call_id": tool_call.id,
                "name": tool_call.function.name,
                "content": result,
            }
        )
    response = client.chat.completions.create(
        model=model_name,
        messages=messages,
        temperature=1.0,
        max_tokens=1024,
    )
    print(response.choices[0].message.content)


if __name__ == "__main__":
    simple_tool_call(client)
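
Because `my_calculator` passes model-generated text to `eval()`, it can execute arbitrary Python. For anything beyond a demo, a restricted evaluator is a safer choice. The following is a minimal sketch of one such approach, not part of the original script: `safe_calculator` is a hypothetical replacement that walks the parsed expression with the standard `ast` module and accepts only numeric literals with `+`, `-`, `*`, and `/`.

```python
import ast
import operator

_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_calculator(expression: str) -> str:
    """Evaluate +, -, *, / on numeric literals and reject anything else."""

    def evaluate(node):
        # Plain int/float literals only (bool and str constants are rejected).
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](evaluate(node.left), evaluate(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](evaluate(node.operand))
        # Names, calls, attribute access, etc. are never evaluated.
        raise ValueError("unsupported expression")

    tree = ast.parse(expression, mode="eval")
    return str(evaluate(tree.body))


print(safe_calculator("6+1"))
```

It has the same signature as `my_calculator`, so it could be swapped in without changing the tool schema.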

6. License

Both the code repository and the model weights are released under the Modified MIT License.