
Llama3-Athene-70B

We introduce Llama3-Athene-70B, an open-weights large language model trained with RLHF on top of Llama-3-70B-Instruct. Athene-70B achieves a high score on Arena-Hard-Auto, a proxy benchmark for Chatbot Arena.

  • Developed by: The Nexusflow Team (Evan Frick*, Peter Jin*, Tianle Li*, Karthik Ganesan, Jian Zhang, Jiantao Jiao, and Banghua Zhu).
  • Model type: Chat model
  • Finetuned from model: Llama-3-70B-Instruct
  • License: Nexusflow Research License
  • Blog: https://nexusflow.ai/blogs/athene

Model | Arena-Hard
Claude-3.5-Sonnet (proprietary) | 79.3%
GPT-4o (proprietary) | 79.2%
Athene-70B (open) | 77.8%
Gemini-Pro-1.5 (proprietary) | 72.0%
Gemma-2-27B (open) | 57.0%
Llama-3-70B (open) | 46.6%

Usage

Athene-70B uses the same chat template as Llama-3-70B-Instruct. Below is a simple example using the Transformers library.

import transformers
import torch

model_id = "Nexusflow/Athene-70B"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are an Athene Noctura, you can only speak with owl sounds. Whoooo whooo."},
    {"role": "user", "content": "Whooo are you?"},
]

# Stop generation at either the tokenizer's EOS token or <|end_of_text|>.
terminators = [
    pipeline.tokenizer.eos_token_id,
    pipeline.tokenizer.convert_tokens_to_ids("<|end_of_text|>")
]

outputs = pipeline(
    messages,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
# The pipeline returns the whole conversation; the last message is the model's reply.
print(outputs[0]["generated_text"][-1])
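
To make the shared chat template more concrete, here is a minimal sketch of how a message list is rendered into a prompt string. It assumes the publicly documented Llama 3 instruct format; the helper name `render_llama3_prompt` is hypothetical and is not part of Transformers or of this model's repository. In real use, `tokenizer.apply_chat_template` should be preferred, because it reads the template shipped with the model.

```python
# Illustrative sketch only: a pure-Python rendering of the Llama 3 instruct
# prompt format, which this model card says Athene-70B shares.
# `render_llama3_prompt` is a hypothetical helper, not a library function.

def render_llama3_prompt(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts into a single prompt string."""
    parts = ["<|begin_of_text|>"]
    for message in messages:
        # Each turn is a role header, a blank line, the trimmed content,
        # and an end-of-turn marker.
        parts.append(
            f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n"
            f"{message['content'].strip()}<|eot_id|>"
        )
    if add_generation_prompt:
        # An open assistant header cues the model to write its reply.
        parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)


if __name__ == "__main__":
    example = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Whooo are you?"},
    ]
    print(render_llama3_prompt(example))
```

The pipeline call above applies the model's template automatically when it is given a list of messages, so a helper like this is only useful for inspecting or debugging prompts.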

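Acknowledgment

We thank the LMSYS Organization for supporting the testing of the model. We also thank Meta AI and the open-source community for their efforts in providing the datasets and base models.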

Citation

@misc{Athene2024,
    title = {Athene-70B: Redefining the Boundaries of Post-Training for Open Models},
    url = {https://nexusflow.ai/blogs/athene},
    author = {Frick, Evan and Jin, Peter and Li, Tianle and Ganesan, Karthik and Zhang, Jian and Jiao, Jiantao and Zhu, Banghua},
    month = {July},
    year = {2024}
}