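This is the DPO version of Jinan_AICC/Faro-Yi-9B. Compared with Faro-Yi-9B and Yi-9B-200K, the DPO model scores higher on every benchmark listed below and substantially surpasses the original Yi-9B-200K.

| Metric | MMLU | GSM8K | hellaswag | truthfulqa | ai2_arc | winogrande | CMMLU |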
|---|---|---|---|---|---|---|---|
| Yi-9B-200K | 65.73 | 50.49 | 56.72 | 33.80 | 69.25 | 71.67 | 71.97 |
| Faro-Yi-9B | 68.80 | 63.08 | 57.28 | 40.86 | 72.58 | 71.11 | 73.28 |
| Faro-Yi-9B-DPO | 69.98 | 66.11 | 59.04 | 48.01 | 75.68 | 73.40 | 75.23 |
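
To make the comparison in the table easier to read, the per-benchmark differences can be computed directly from the reported scores. The snippet below is only a small sketch: the `scores` dictionary and the `score_deltas` helper are illustrative names, and the numbers are copied from the table rather than re-measured.

```python
# Scores copied from the table above (not re-measured).
scores = {
    "Yi-9B-200K": {
        "MMLU": 65.73, "GSM8K": 50.49, "hellaswag": 56.72, "truthfulqa": 33.80,
        "ai2_arc": 69.25, "winogrande": 71.67, "CMMLU": 71.97,
    },
    "Faro-Yi-9B": {
        "MMLU": 68.80, "GSM8K": 63.08, "hellaswag": 57.28, "truthfulqa": 40.86,
        "ai2_arc": 72.58, "winogrande": 71.11, "CMMLU": 73.28,
    },
    "Faro-Yi-9B-DPO": {
        "MMLU": 69.98, "GSM8K": 66.11, "hellaswag": 59.04, "truthfulqa": 48.01,
        "ai2_arc": 75.68, "winogrande": 73.40, "CMMLU": 75.23,
    },
}


def score_deltas(model, baseline):
    """Return the per-benchmark score difference (model minus baseline), rounded to 2 decimals."""
    return {
        benchmark: round(scores[model][benchmark] - scores[baseline][benchmark], 2)
        for benchmark in scores[model]
    }


deltas = score_deltas("Faro-Yi-9B-DPO", "Yi-9B-200K")
# Print the benchmarks from largest to smallest gain.
for benchmark, delta in sorted(deltas.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{benchmark:<11} {delta:+.2f}")
```

On these numbers, the largest gains over Yi-9B-200K are on GSM8K (+15.62) and truthfulqa (+14.21), and the smallest is on winogrande (+1.73).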

Faro-Yi-9B-DPO uses the chatml template and performs well in both short-context and long-context settings.
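
For readers unfamiliar with chatml, the sketch below shows the generic layout such a prompt takes. It is an illustration only: `render_chatml` is a hypothetical helper, and the authoritative template is the one shipped with the model's tokenizer (applied via `apply_chat_template`), which may differ in details such as whitespace or a default system prompt.

```python
def render_chatml(messages, add_generation_prompt=True):
    """Render {"role", "content"} dicts in the generic ChatML layout (hypothetical helper)."""
    parts = [
        f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        for message in messages
    ]
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]
prompt = render_chatml(messages)
print(prompt)
```

In practice the tokenizer's own template should be preferred; a hand-rolled renderer like this is mainly useful for inspecting or debugging prompts.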

```shell
# Load the Ascend toolkit environment variables
source /usr/local/Ascend/ascend-toolkit/set_env.sh
export OPENMIND_FRAMEWORK=pt
```

```python
import argparse

from openmind import AutoModelForCausalLM, AutoTokenizer
from openmind_hub import snapshot_download


def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--model_name_or_path",
        type=str,
        help="Local path to the model; if omitted, Jinan_AICC/Faro-Yi-9B-DPO is downloaded",
        default=None,
    )
    return parser.parse_args()


args = parse_args()
# Fall back to downloading the hub snapshot when no local path is given
# (assumes the repo ID named in the original help text).
model_path = args.model_name_or_path or snapshot_download("Jinan_AICC/Faro-Yi-9B-DPO")

model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_path)

messages = [
    {"role": "system", "content": "You are a helpful assistant. Always answer with a short response."},
    {"role": "user", "content": "Tell me what is Pythagorean theorem like you are a pirate."},
]

# Build the chatml prompt and move it to the model's device.
input_ids = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt").to(model.device)
# Note: temperature only takes effect when sampling is enabled
# (do_sample=True here or in the model's generation config).
generated_ids = model.generate(input_ids, max_new_tokens=512, temperature=0.5)
# Decode only the newly generated tokens, skipping the prompt.
response = tokenizer.decode(generated_ids[0][input_ids.shape[-1]:], skip_special_tokens=True)  # Aye, matey! The Pythagorean theorem is a nautical rule that helps us find the length of the third side of a triangle. ...
print(response)
```