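MiniCPM3-4B: suited to text generation, function calling, code interpretation, and similar scenarios, with strong general-purpose capability. It is the third-generation model in the MiniCPM series; it outperforms models such as Phi-3.5-mini-Instruct and supports a 32k context window as well as unbounded-context processing. [This summary was generated by AI.]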

MiniCPM Repo | MiniCPM Paper | MiniCPM-V Repo | Join us on Discord and WeChat

Introduction

MiniCPM3-4B is the third generation of the MiniCPM series. Its overall performance surpasses Phi-3.5-mini-Instruct and GPT-3.5-Turbo-0125 and is comparable with many recent 7B~9B models.

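Compared with MiniCPM1.0/MiniCPM2.0, MiniCPM3-4B has a more powerful and versatile skill set that supports a wider range of general-purpose use. MiniCPM3-4B supports function calling and a code interpreter. See Advanced Features for usage guidelines.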
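To make the function-calling workflow more concrete, here is a minimal, generic sketch of the application side: the model is assumed to emit a JSON object naming a tool and its arguments, and the application looks the tool up and runs it. The JSON shape, the `get_weather` tool, and the `dispatch_tool_call` helper are hypothetical illustrations, not MiniCPM3-4B's actual tool-call format; the Advanced Features guide remains the reference for real usage.

```python
import json

# Hypothetical tool for illustration; a real one would query a weather service.
def get_weather(city: str) -> str:
    return f"Sunny in {city}"

# Registry mapping tool names (as the model would emit them) to callables.
TOOLS = {"get_weather": get_weather}

def dispatch_tool_call(raw: str) -> str:
    """Parse a tool call of the assumed form {"name": ..., "arguments": {...}}
    and run the matching registered tool."""
    call = json.loads(raw)
    name = call["name"]
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**call["arguments"])

# Stand-in for text a model might produce (assumed format, not MiniCPM3's own).
model_output = '{"name": "get_weather", "arguments": {"city": "Beijing"}}'
result = dispatch_tool_call(model_output)
print(result)
```

In a full loop, the tool result would be appended to the conversation and sent back to the model so it can compose the final answer.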

MiniCPM3-4B has a 32k context window. With LLMxMapReduce, MiniCPM3-4B can in theory handle unbounded context without requiring a large amount of memory.
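The divide-and-aggregate idea behind map-reduce style long-context processing can be sketched as follows. This is a generic illustration and not the LLMxMapReduce implementation: `split_into_chunks`, `map_reduce_answer`, and the `fake_llm` stand-in are hypothetical names, and the text is chunked by characters rather than tokens to keep the sketch short.

```python
from typing import Callable, List

def split_into_chunks(text: str, chunk_size: int) -> List[str]:
    """Cut text into consecutive pieces of at most chunk_size characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

def map_reduce_answer(text: str, question: str,
                      llm: Callable[[str], str], chunk_size: int = 1000) -> str:
    chunks = split_into_chunks(text, chunk_size)
    # Map step: each call sees only one chunk, so it fits in the context window.
    partials = [llm(f"Context:\n{c}\n\nQuestion: {question}") for c in chunks]
    # Reduce step: combine the per-chunk answers in one final call.
    merged = "\n".join(partials)
    return llm(f"Partial answers:\n{merged}\n\nQuestion: {question}")

# Stand-in for a real model call; it records prompts and returns a numbered reply.
calls: List[str] = []
def fake_llm(prompt: str) -> str:
    calls.append(prompt)
    return f"answer-{len(calls)}"

final = map_reduce_answer("a" * 2500, "What is repeated?", fake_llm, chunk_size=1000)
print(len(calls), final)
```

Because every model call receives a single chunk or the short partial answers, the per-call input size is bounded by the chunk size rather than by the length of the whole document.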

Usage

Inference with Transformers

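from mindnlp.transformers import AutoModelForCausalLM, AutoTokenizer

path = "openbmb/MiniCPM3-4B"

# trust_remote_code=True allows the custom model code shipped with the checkpoint to run.
tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(path, trust_remote_code=True)

messages = [
    # Prompt (Chinese): "Recommend 5 tourist attractions in Beijing."
    {"role": "user", "content": "推荐5个北京的景点。"},
]
# Build the chat-formatted input IDs.
# Assumption: some MindNLP releases expect return_tensors="ms" (MindSpore tensors); adjust to your installed version.
model_inputs = tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True)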

model_outputs = model.generate(
    model_inputs,
    max_new_tokens=1024,
    top_p=0.7,
    temperature=0.7
)

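# Drop the prompt tokens so that only the newly generated tokens are decoded.
output_token_ids = [
    model_outputs[i][len(model_inputs[i]):] for i in range(len(model_inputs))
]

# Decode the first (and only) sequence in the batch.
responses = tokenizer.batch_decode(output_token_ids, skip_special_tokens=True)[0]
print(responses)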
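The slicing step in the snippet above relies on `generate` returning, for each sequence, the prompt tokens followed by the newly generated ones. A minimal illustration with plain Python lists standing in for token-ID tensors (the IDs are made up for the example):

```python
# Made-up token IDs; each output row is assumed to be prompt + continuation.
model_inputs = [[101, 7, 8, 9]]
model_outputs = [[101, 7, 8, 9, 42, 43, 2]]

# Same slicing as in the inference snippet: skip len(prompt) tokens per row.
output_token_ids = [
    model_outputs[i][len(model_inputs[i]):] for i in range(len(model_inputs))
]
print(output_token_ids)  # [[42, 43, 2]]
```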

Inference with vLLM

For now, you need to install our fork of vLLM.

pip install git+https://github.com/OpenBMB/vllm.git@minicpm3

from mindnlp.transformers import AutoTokenizer
from vllm import LLM, SamplingParams

model_name = "openbmb/MiniCPM3-4B"
# Prompt (Chinese): "Recommend 5 tourist attractions in Beijing."
prompt = [{"role": "user", "content": "推荐5个北京的景点。"}]

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
input_text = tokenizer.apply_chat_template(prompt, tokenize=False, add_generation_prompt=True)

llm = LLM(
    model=model_name,
    trust_remote_code=True,
    tensor_parallel_size=1
)
sampling_params = SamplingParams(top_p=0.7, temperature=0.7, max_tokens=1024, repetition_penalty=1.02)

outputs = llm.generate(prompts=input_text, sampling_params=sampling_params)

print(outputs[0].outputs[0].text)

Evaluation Results

Benchmark	Qwen2-7B-Instruct	GLM-4-9B-Chat	Gemma2-9B-it	Llama3.1-8B-Instruct	GPT-3.5-Turbo-0125	Phi-3.5-mini-Instruct(3.8B)	MiniCPM3-4B
English
MMLU (multitask language understanding)	70.5	72.4	72.6	69.4	69.2	68.4	67.2
BBH (LLM reasoning)	64.9	76.3	65.2	67.8	70.3	68.6	70.2
MT-Bench (multi-turn dialogue quality)	8.41	8.35	7.88	8.28	8.17	8.60	8.41
IFEVAL (instruction following, strict accuracy)	51.0	64.5	71.9	71.5	58.8	49.4	68.4
Chinese
CMMLU (Chinese multitask language understanding)	80.9	71.5	59.5	55.8	54.5	46.9	73.3
CEVAL (Chinese foundation model evaluation)	77.2	75.6	56.7	55.2	52.8	46.1	73.6
AlignBench v1.1 (Chinese alignment benchmark)	7.10	6.61	7.10	5.68	5.82	5.73	6.74
FollowBench-zh (Chinese instruction following, SSR)	63.0	56.4	57.0	50.6	64.6	58.1	66.8
Math
MATH (math problem solving)	49.6	50.6	46.0	51.9	41.8	46.4	46.6
GSM8K (grade-school math problems)	82.3	79.6	79.7	84.5	76.4	82.7	81.1
MathBench (comprehensive math ability)	63.4	59.4	45.8	54.3	48.9	54.9	65.6
Code
HumanEval+ (code generation)	70.1	67.1	61.6	62.8	66.5	68.9	68.3
MBPP+ (code generation and execution)	57.1	62.2	64.3	55.3	71.4	55.8	63.2
LiveCodeBench v3 (live code benchmark)	22.2	20.2	19.2	20.4	24.0	19.6	22.6
Function Calling
BFCL v2 (function calling benchmark)	71.6	70.1	19.2	73.3	75.4	48.4	76.0
Overall
Average	65.3	65.0	57.9	60.8	61.0	57.2	66.3
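The table does not say how the average row is computed. As a worked check, the sketch below assumes that the two benchmarks scored on a 10-point scale (MT-Bench and AlignBench v1.1) are multiplied by 10 before taking an unweighted mean over all 15 benchmarks. Under that assumption the reported averages for the two columns checked here are reproduced; the rescaling rule itself is an inference, not something the source confirms.

```python
# Scores copied from the table above, in row order (MMLU ... BFCL v2).
scores = {
    "Qwen2-7B-Instruct": [70.5, 64.9, 8.41, 51.0, 80.9, 77.2, 7.10, 63.0,
                          49.6, 82.3, 63.4, 70.1, 57.1, 22.2, 71.6],
    "MiniCPM3-4B": [67.2, 70.2, 8.41, 68.4, 73.3, 73.6, 6.74, 66.8,
                    46.6, 81.1, 65.6, 68.3, 63.2, 22.6, 76.0],
}

# Row indices of MT-Bench and AlignBench v1.1, assumed to be rescaled to 0-100.
TEN_POINT_ROWS = {2, 6}

def average(row_scores):
    """Unweighted mean after putting 10-point benchmarks on a 100-point scale."""
    rescaled = [s * 10 if i in TEN_POINT_ROWS else s
                for i, s in enumerate(row_scores)]
    return round(sum(rescaled) / len(rescaled), 1)

averages = {model: average(s) for model, s in scores.items()}
print(averages)
```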

Statement

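As a language model, MiniCPM3-4B generates content by learning from a vast amount of text.
However, it cannot comprehend or express personal opinions or value judgments.
Any content generated by MiniCPM3-4B does not represent the viewpoints or positions of the model developers.
Therefore, users of content generated by MiniCPM3-4B bear full responsibility for evaluating and verifying it themselves.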

License

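This repository is released under the Apache-2.0 License.
Use of the MiniCPM3-4B model weights must strictly follow MiniCPM Model License.md.
The MiniCPM3-4B model and weights are completely free for academic research. After filling out a "questionnaire" for registration, they are also available for free commercial use.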

Citation

@article{hu2024minicpm,
  title={MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies},
  author={Hu, Shengding and Tu, Yuge and Han, Xu and He, Chaoqun and Cui, Ganqu and Long, Xiang and Zheng, Zhi and Fang, Yewei and Huang, Yuxiang and Zhao, Weilin and others},
  journal={arXiv preprint arXiv:2404.06395},
  year={2024}
}