Llama3-Athene-70B

We introduce Llama3-Athene-70B, an open-weights large language model trained with RLHF on top of Llama-3-70B-Instruct. Athene-70B scores 77.8% on Arena-Hard-Auto, a proxy benchmark for Chatbot Arena.

Developed by: The Nexusflow Team (Evan Frick*, Peter Jin*, Tianle Li*, Karthik Ganesan, Jian Zhang, Jiantao Jiao, and Banghua Zhu).
Model type: Chat model
Finetuned from model: Llama-3-70B-Instruct
License: Nexusflow Research License
Blog: https://nexusflow.ai/blogs/athene

Model	Arena-Hard
Claude-3.5-Sonnet (Proprietary)	79.3%
GPT-4o (Proprietary)	79.2%
Athene-70B (Open)	77.8%
Gemini-Pro-1.5 (Proprietary)	72.0%
Gemma-2-27B (Open)	57.0%
Llama-3-70B (Open)	46.6%

Usage

Athene-70B uses the same chat template as Llama-3-70B-Instruct. Below is a simple example using the Transformers library.

import transformers
import torch

model_id = "Nexusflow/Athene-70B"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are an Athene Noctura, you can only speak with owl sounds. Whoooo whooo."},
    {"role": "user", "content": "Whooo are you?"},
]

# Stop at the tokenizer's EOS token or at <|end_of_text|>, whichever is generated first.
terminators = [
    pipeline.tokenizer.eos_token_id,
    pipeline.tokenizer.convert_tokens_to_ids("<|end_of_text|>")
]

outputs = pipeline(
    messages,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
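# With chat-style input, generated_text is the full message list; the last entry is the assistant's reply.
print(outputs[0]["generated_text"][-1])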
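
To make "the same chat template as Llama-3-70B-Instruct" concrete, here is a minimal sketch of the prompt string that template is understood to produce. The helper `format_llama3_prompt` is hypothetical and written only for illustration; it is not part of Transformers or of the Athene release. In practice the authoritative template ships with the tokenizer, so prefer `pipeline.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` and treat this as a way to inspect the layout without loading the model.

```python
from typing import Dict, List


def format_llama3_prompt(
    messages: List[Dict[str, str]], add_generation_prompt: bool = True
) -> str:
    """Hypothetical helper: render messages in the Llama-3-Instruct chat layout."""
    parts = ["<|begin_of_text|>"]
    for message in messages:
        # Each turn is a role header, a blank line, the trimmed content, then <|eot_id|>.
        parts.append(
            f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n"
            f"{message['content'].strip()}<|eot_id|>"
        )
    if add_generation_prompt:
        # An open assistant header cues the model to write the next turn.
        parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are an Athene Noctura, you can only speak with owl sounds. Whoooo whooo."},
    {"role": "user", "content": "Whooo are you?"},
]

prompt = format_llama3_prompt(messages)
print(prompt)
```

The pipeline call above applies the tokenizer's own template automatically when it receives a list of messages, so this manual rendering is only needed for debugging or for serving stacks that expect a raw prompt string.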

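Acknowledgment

We thank the LMSYS Organization for supporting the testing of the model. We also thank Meta AI and the open-source community for their efforts in providing the datasets and base models.

Citation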

@misc{Athene2024,
    title = {Athene-70B: Redefining the Boundaries of Post-Training for Open Models},
    url = {https://nexusflow.ai/blogs/athene},
    author = {Frick, Evan and Jin, Peter and Li, Tianle and Ganesan, Karthik and Zhang, Jian and Jiao, Jiantao and Zhu, Banghua},    
    month = {July},
    year = {2024}
}
