StableLM 2 12B ChatStable LM 2 12B Chat 是一个拥有 120 亿参数的指令调优语言模型,它在公开可用数据集和合成数据集的混合数据上进行训练,并采用了 直接偏好优化(DPO) 技术。
注意:本模型需要 transformers>=4.40.0
StableLM 2 12B Chat 采用以下指令 ChatML 格式。
该格式也可通过分词器的 apply_chat_template 方法获取:
from transformers import AutoModelForCausalLM, AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained('stabilityai/stablelm-2-12b-chat')
model = AutoModelForCausalLM.from_pretrained(
'stabilityai/stablelm-2-12b-chat',
device_map="auto",
)
prompt = [{'role': 'user', 'content': 'Implement snake game using pygame'}]
inputs = tokenizer.apply_chat_template(
prompt,
add_generation_prompt=True,
return_tensors='pt'
)
tokens = model.generate(
inputs.to(model.device),
max_new_tokens=100,
temperature=0.7,
do_sample=True,
)
output = tokenizer.decode(tokens[:, inputs.shape[-1]:][0], skip_special_tokens=False)
print(output)StableLM 2 12B Chat 还支持函数调用。以下是使用示例:
system_prompt = """\
You are a helpful assistant with access to the following functions. You must use them if required -\n
[
{
"type": "function",
"function": {
"name": "TextToImage",
"description": "This function is able to create, draw, or illustrate an image from a text prompt.",
"parameters": {
"type": "object",
"properties": {
"prompt": {
"type": "string",
"description": "The description of image that the user wants to create."
}
},
"required": [
"prompt"
]
}
}
}
]
"""
messages = [
{'role': 'system', 'content': system_prompt},
{'role': "user", 'content': "Please, generate a picture of the Eiffel Tower at night!"}
]
inputs = tokenizer.apply_chat_template(
messages,
add_generation_prompt=True,
return_tensors='pt'
)
tokens = model.generate(
inputs.to(model.device),
max_new_tokens=1024,
temperature=0.5,
do_sample=True
)
output = tokenizer.decode(tokens[:, inputs.shape[-1]:][0], skip_special_tokens=True)
print(output)
"""
[
{
"name": "TextToImage",
"arguments": {
"prompt": "Eiffel Tower at night."
}
}
]
"""
StableLM 2 12B Chat模型是一款基于Transformer解码器架构的自回归语言模型。lm@stability.ai。该数据集由HuggingFace Hub上的多种开放大规模数据集以及一个内部安全数据集混合组成:
| 模型 | 参数规模 | MT Bench(修正偏差后) |
|---|---|---|
| mistralai/Mixtral-8x7B-Instruct-v0.1 | 130亿/470亿 | 8.48 ± 0.06 |
| stabilityai/stablelm-2-12b-chat | 120亿 | 8.15 ± 0.08 |
| Qwen/Qwen1.5-14B-Chat | 140亿 | 7.95 ± 0.10 |
| HuggingFaceH4/zephyr-7b-gemma-v0.1 | 85亿 | 7.82 ± 0.03 |
| mistralai/Mistral-7B-Instruct-v0.2 | 70亿 | 7.48 ± 0.02 |
| meta-llama/Llama-2-70b-chat-hf | 700亿 | 7.29 ± 0.05 |
| 模型 | 参数规模 | 平均分 | ARC Challenge(25-shot) | HellaSwag(10-shot) | MMLU(5-shot) | TruthfulQA(0-shot) | Winogrande(5-shot) | GSM8K(5-shot) |
|---|---|---|---|---|---|---|---|---|
| mistralai/Mixtral-8x7B-Instruct-v0.1 | 13B/47B | 72.71 | 70.14 | 87.55 | 71.40 | 64.98 | 81.06 | 61.11 |
| stabilityai/stablelm-2-12b-chat | 12B | 68.45 | 65.02 | 86.06 | 61.14 | 62.00 | 78.77 | 57.70 |
| Qwen/Qwen1.5-14B | 14B | 66.70 | 56.57 | 81.08 | 69.36 | 52.06 | 73.48 | 67.63 |
| mistralai/Mistral-7B-Instruct-v0.2 | 7B | 65.71 | 63.14 | 84.88 | 60.78 | 60.26 | 77.19 | 40.03 |
| HuggingFaceH4/zephyr-7b-gemma-v0.1 | 8.5B | 62.41 | 58.45 | 83.48 | 60.68 | 52.07 | 74.19 | 45.56 |
| Qwen/Qwen1.5-14B-Chat | 14B | 62.37 | 58.79 | 82.33 | 68.52 | 60.38 | 73.32 | 30.86 |
| google/gemma-7b | 8.5B | 63.75 | 61.09 | 82.20 | 64.56 | 44.79 | 79.01 | 50.87 |
| stabilityai/stablelm-2-12b | 12B | 63.53 | 58.45 | 84.33 | 62.09 | 48.16 | 78.10 | 56.03 |
| mistralai/Mistral-7B-v0.1 | 7B | 60.97 | 59.98 | 83.31 | 64.16 | 42.15 | 78.37 | 37.83 |
| meta-llama/Llama-2-13b-hf | 13B | 55.69 | 59.39 | 82.13 | 55.77 | 37.38 | 76.64 | 22.82 |
| meta-llama/Llama-2-13b-chat-hf | 13B | 54.92 | 59.04 | 81.94 | 54.64 | 41.12 | 74.51 | 15.24 |
该模型旨在用于类聊天应用。开发人员必须在其特定用例中评估模型的安全性能。请阅读下文的安全与限制部分以了解更多信息。
我们强烈建议将此模型与输入和输出分类器配合使用,以防止产生有害响应。 使用此模型时,需要对输入和输出设置防护措施,确保返回的任何输出都不是幻觉内容。 此外,由于每个用例都是独特的,我们建议运行您自己的测试套件,以确保此模型的正常性能。 最后,如果模型不适合您的应用,或者用于可能对他人造成故意或非故意伤害的任何应用,请不要使用这些模型。
@article{bellagente2024stable,
title={Stable LM 2 1.6 B Technical Report},
author={Bellagente, Marco and Tow, Jonathan and Mahan, Dakota and Phung, Duy and Zhuravinskyi, Maksym and Adithyan, Reshinth and Baicoianu, James and Brooks, Ben and Cooper, Nathan and Datta, Ashish and others},
journal={arXiv preprint arXiv:2402.17834},
year={2024}
}