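We introduce Llama3-Athene-70B, an open-weights large language model trained with RLHF on top of Llama-3-70B-Instruct. Athene-70B scores 77.8% on Arena-Hard-Auto, a proxy benchmark for Chatbot Arena, compared with 46.6% for Llama-3-70B.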

| Model | Arena-Hard |
|---|---|
| Claude-3.5-Sonnet (Proprietary) | 79.3% |
| GPT-4o (Proprietary) | 79.2% |
| Athene-70B (Open) | 77.8% |
| Gemini-Pro-1.5 (Proprietary) | 72.0% |
| Gemma-2-27B (Open) | 57.0% |
| Llama-3-70B (Open) | 46.6% |

Athene-70B uses the same chat template as Llama-3-70B-Instruct. Below is a simple example using the Transformers library.
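
To make the shared chat template concrete, here is a minimal sketch of how a message list is laid out as a single prompt string. It assumes the publicly documented Llama 3 Instruct prompt format; the helper name `render_llama3_prompt` is hypothetical and is not part of Transformers or of the Athene release. The authoritative template is the one shipped with the model's tokenizer.

```python
from typing import Dict, List

# Special tokens of the Llama 3 Instruct prompt format (assumed here to be
# unchanged for Athene-70B, since the two models share a chat template).
BOS = "<|begin_of_text|>"
START_HEADER = "<|start_header_id|>"
END_HEADER = "<|end_header_id|>"
EOT = "<|eot_id|>"


def render_llama3_prompt(
    messages: List[Dict[str, str]], add_generation_prompt: bool = True
) -> str:
    """Hypothetical helper: lay out chat messages as one prompt string."""
    parts = [BOS]
    for message in messages:
        # Each turn is a role header, a blank line, the trimmed content,
        # and an end-of-turn marker.
        header = f"{START_HEADER}{message['role']}{END_HEADER}\n\n"
        parts.append(header + message["content"].strip() + EOT)
    if add_generation_prompt:
        # An open assistant header cues the model to write the next turn.
        parts.append(f"{START_HEADER}assistant{END_HEADER}\n\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You can only speak with owl sounds."},
    {"role": "user", "content": "Whooo are you?"},
]
prompt = render_llama3_prompt(messages)
print(prompt)
```

In practice there is no need to build this string by hand: passing the message list to the pipeline, as in the full Transformers example that follows, applies the tokenizer's own template.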

    import transformers
    import torch

    model_id = "Nexusflow/Athene-70B"

    # Load the model in bfloat16 and spread it across the available devices.
    pipeline = transformers.pipeline(
        "text-generation",
        model=model_id,
        model_kwargs={"torch_dtype": torch.bfloat16},
        device_map="auto",
    )

    messages = [
        {"role": "system", "content": "You are an Athene Noctura, you can only speak with owl sounds. Whoooo whooo."},
        {"role": "user", "content": "Whooo are you?"},
    ]

    # Stop generating at the tokenizer's EOS token or at <|end_of_text|>.
    terminators = [
        pipeline.tokenizer.eos_token_id,
        pipeline.tokenizer.convert_tokens_to_ids("<|end_of_text|>"),
    ]

    outputs = pipeline(
        messages,
        max_new_tokens=256,
        eos_token_id=terminators,
        do_sample=True,
        temperature=0.6,
        top_p=0.9,
    )

    # The last entry of the returned conversation is the assistant's reply.
    print(outputs[0]["generated_text"][-1])

We thank the LMSYS Organization for their support in testing the model. We also thank Meta AI and the open-source community for their efforts in providing the datasets and base models.

@misc{Athene2024,
title = {Athene-70B: Redefining the Boundaries of Post-Training for Open Models},
url = {https://nexusflow.ai/blogs/athene},
author = {Frick, Evan and Jin, Peter and Li, Tianle and Ganesan, Karthik and Zhang, Jian and Jiao, Jiantao and Zhu, Banghua},
month = {July},
year = {2024}
}