`StableLM 2 12B Chat`

模型说明

Stable LM 2 12B Chat 是一个拥有 120 亿参数的指令调优语言模型，它在公开可用数据集和合成数据集的混合数据上进行训练，并采用了直接偏好优化（DPO）技术。

使用方法

注意：本模型需要 transformers>=4.40.0

StableLM 2 12B Chat 采用以下指令 ChatML 格式。该格式也可通过分词器的 apply_chat_template 方法获取：

from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('stabilityai/stablelm-2-12b-chat')
model = AutoModelForCausalLM.from_pretrained(
    'stabilityai/stablelm-2-12b-chat',
    device_map="auto",
)

prompt = [{'role': 'user', 'content': 'Implement snake game using pygame'}]
inputs = tokenizer.apply_chat_template(
    prompt,
    add_generation_prompt=True,
    return_tensors='pt'
)

tokens = model.generate(
    inputs.to(model.device),
    max_new_tokens=100,
    temperature=0.7,
    do_sample=True,
)
output = tokenizer.decode(tokens[:, inputs.shape[-1]:][0], skip_special_tokens=False)

print(output)

StableLM 2 12B Chat 还支持函数调用。以下是使用示例：

system_prompt = """\
You are a helpful assistant with access to the following functions. You must use them if required -\n
[
  {
    "type": "function",
    "function": {
      "name": "TextToImage",
      "description": "This function is able to create, draw, or illustrate an image from a text prompt.",
      "parameters": {
        "type": "object",
        "properties": {
          "prompt": {
            "type": "string",
            "description": "The description of image that the user wants to create."
          }
        },
        "required": [
          "prompt"
        ]
      }
    }
  }
]
"""
messages = [
    {'role': 'system', 'content': system_prompt},
    {'role': "user", 'content': "Please, generate a picture of the Eiffel Tower at night!"}
]

inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors='pt'
)

tokens = model.generate(
    inputs.to(model.device),
    max_new_tokens=1024,
    temperature=0.5,
    do_sample=True
)
output = tokenizer.decode(tokens[:, inputs.shape[-1]:][0], skip_special_tokens=True)

print(output)
"""
[
  {
    "name": "TextToImage",
    "arguments": {
      "prompt": "Eiffel Tower at night."
    }
  }
]
"""

模型详情

开发机构：Stability AI
模型类型：StableLM 2 12B Chat模型是一款基于Transformer解码器架构的自回归语言模型。
支持语言：英语
相关论文：[Stable LM 2 Chat 技术报告]((https://arxiv.org/abs/2402.17834)
使用库：Alignment Handbook
微调基础模型：
许可协议：StabilityAI 非商业研究社区许可。如您希望将本模型用于商业产品或用途，请通过此处与我们联系以了解更多信息。
联系方式：有关模型的问题和意见，请发送电子邮件至lm@stability.ai。

训练数据集

该数据集由HuggingFace Hub上的多种开放大规模数据集以及一个内部安全数据集混合组成：

SFT 数据集

HuggingFaceH4/ultrachat_200k
meta-math/MetaMathQA
WizardLM/WizardLM_evol_instruct_V2_196k
Open-Orca/SlimOrca
openchat/openchat_sharegpt4_dataset
LDJnr/Capybara
hkust-nlp/deita-10k-v0
teknium/OpenHermes-2.5
glaiveai/glaive-function-calling-v2

安全数据集：

Anthropic/hh-rlhf
内部安全数据集

偏好数据集：

argilla/dpo-mix-7k

性能表现

MT-Bench

模型	参数规模	MT Bench（修正偏差后）
mistralai/Mixtral-8x7B-Instruct-v0.1	130亿/470亿	8.48 ± 0.06
stabilityai/stablelm-2-12b-chat	120亿	8.15 ± 0.08
Qwen/Qwen1.5-14B-Chat	140亿	7.95 ± 0.10
HuggingFaceH4/zephyr-7b-gemma-v0.1	85亿	7.82 ± 0.03
mistralai/Mistral-7B-Instruct-v0.2	70亿	7.48 ± 0.02
meta-llama/Llama-2-70b-chat-hf	700亿	7.29 ± 0.05

OpenLLM 排行榜

模型	参数规模	平均分	ARC Challenge（25-shot）	HellaSwag（10-shot）	MMLU（5-shot）	TruthfulQA（0-shot）	Winogrande（5-shot）	GSM8K（5-shot）
mistralai/Mixtral-8x7B-Instruct-v0.1	13B/47B	72.71	70.14	87.55	71.40	64.98	81.06	61.11
stabilityai/stablelm-2-12b-chat	12B	68.45	65.02	86.06	61.14	62.00	78.77	57.70
Qwen/Qwen1.5-14B	14B	66.70	56.57	81.08	69.36	52.06	73.48	67.63
mistralai/Mistral-7B-Instruct-v0.2	7B	65.71	63.14	84.88	60.78	60.26	77.19	40.03
HuggingFaceH4/zephyr-7b-gemma-v0.1	8.5B	62.41	58.45	83.48	60.68	52.07	74.19	45.56
Qwen/Qwen1.5-14B-Chat	14B	62.37	58.79	82.33	68.52	60.38	73.32	30.86
google/gemma-7b	8.5B	63.75	61.09	82.20	64.56	44.79	79.01	50.87
stabilityai/stablelm-2-12b	12B	63.53	58.45	84.33	62.09	48.16	78.10	56.03
mistralai/Mistral-7B-v0.1	7B	60.97	59.98	83.31	64.16	42.15	78.37	37.83
meta-llama/Llama-2-13b-hf	13B	55.69	59.39	82.13	55.77	37.38	76.64	22.82
meta-llama/Llama-2-13b-chat-hf	13B	54.92	59.04	81.94	54.64	41.12	74.51	15.24

使用与限制

预期用途

该模型旨在用于类聊天应用。开发人员必须在其特定用例中评估模型的安全性能。请阅读下文的安全与限制部分以了解更多信息。

限制与偏见

我们强烈建议将此模型与输入和输出分类器配合使用，以防止产生有害响应。使用此模型时，需要对输入和输出设置防护措施，确保返回的任何输出都不是幻觉内容。此外，由于每个用例都是独特的，我们建议运行您自己的测试套件，以确保此模型的正常性能。最后，如果模型不适合您的应用，或者用于可能对他人造成故意或非故意伤害的任何应用，请不要使用这些模型。

引用方式

@article{bellagente2024stable,
  title={Stable LM 2 1.6 B Technical Report},
  author={Bellagente, Marco and Tow, Jonathan and Mahan, Dakota and Phung, Duy and Zhuravinskyi, Maksym and Adithyan, Reshinth and Baicoianu, James and Brooks, Ben and Cooper, Nathan and Datta, Ashish and others},
  journal={arXiv preprint arXiv:2402.17834},
  year={2024}
}