Faro-Yi-9B-DPO

This is the DPO version of Jinan_AICC/Faro-Yi-9B. It scores higher than both Faro-Yi-9B and Yi-9B-200K on all seven benchmarks below, and substantially outperforms the original Yi-9B-200K.

Metric	MMLU	GSM8K	hellaswag	truthfulqa	ai2_arc	winogrande	CMMLU
Yi-9B-200K	65.73	50.49	56.72	33.80	69.25	71.67	71.97
Faro-Yi-9B	68.80	63.08	57.28	40.86	72.58	71.11	73.28
Faro-Yi-9B-DPO	69.98	66.11	59.04	48.01	75.68	73.40	75.23
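
To make the gains in the table concrete, the sketch below computes the per-benchmark difference between two models. The scores are copied from the table; the names `BENCHMARKS`, `SCORES`, and `gains` are illustrative and not part of any library.

```python
# Scores copied from the table above (order matches the header row).
BENCHMARKS = ["MMLU", "GSM8K", "hellaswag", "truthfulqa", "ai2_arc", "winogrande", "CMMLU"]
SCORES = {
    "Yi-9B-200K": [65.73, 50.49, 56.72, 33.80, 69.25, 71.67, 71.97],
    "Faro-Yi-9B": [68.80, 63.08, 57.28, 40.86, 72.58, 71.11, 73.28],
    "Faro-Yi-9B-DPO": [69.98, 66.11, 59.04, 48.01, 75.68, 73.40, 75.23],
}


def gains(model, baseline):
    """Return {benchmark: score difference in points} for model vs. baseline."""
    return {
        name: round(m - b, 2)
        for name, m, b in zip(BENCHMARKS, SCORES[model], SCORES[baseline])
    }


if __name__ == "__main__":
    for name, delta in gains("Faro-Yi-9B-DPO", "Yi-9B-200K").items():
        print(f"{name}: {delta:+.2f}")
```

Against Yi-9B-200K, the largest gains are on GSM8K (+15.62 points) and truthfulqa (+14.21 points).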

Usage in openMind

Faro-Yi-9B-DPO uses the chatml template and performs well in both short-context and long-context scenarios.
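
For readers unfamiliar with chatml, the following is a minimal sketch of the general layout: each turn is wrapped in `<|im_start|>` and `<|im_end|>` markers. The function `render_chatml` is a hypothetical helper written for illustration. The exact template this model uses (including any extra tokens) is defined by the chat template shipped with its tokenizer, so for real inference use `tokenizer.apply_chat_template` rather than this function.

```python
def render_chatml(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts in the common ChatML layout."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
        for m in messages
    ]
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


if __name__ == "__main__":
    demo = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ]
    print(render_chatml(demo))
```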

Environment variables

# Load the Ascend toolkit environment variables and select the framework
source /usr/local/Ascend/ascend-toolkit/set_env.sh
export OPENMIND_FRAMEWORK=pt
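
Before loading the model, it can help to confirm from Python that the variable exported above is actually visible to the process. The sketch below is a hypothetical helper (`check_openmind_env` is not part of openMind). It only inspects `OPENMIND_FRAMEWORK` and does not verify that the Ascend toolkit itself was sourced correctly.

```python
import os


def check_openmind_env(env=None):
    """Return a list of problems found; an empty list means the check passed."""
    env = os.environ if env is None else env
    problems = []
    framework = env.get("OPENMIND_FRAMEWORK")
    if framework is None:
        problems.append("OPENMIND_FRAMEWORK is not set (the setup above exports 'pt')")
    elif framework != "pt":
        problems.append(f"OPENMIND_FRAMEWORK is {framework!r}, expected 'pt'")
    return problems


if __name__ == "__main__":
    for problem in check_openmind_env():
        print("warning:", problem)
```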

Inference

import argparse

from openmind import AutoModelForCausalLM, AutoTokenizer


def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--model_name_or_path",
        type=str,
        help="Model ID or local path of the model to load",
        default="Jinan_AICC/Faro-Yi-9B-DPO",
    )
    return parser.parse_args()


args = parse_args()
model_path = args.model_name_or_path
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_path)
messages = [
    {"role": "system", "content": "You are a helpful assistant. Always answer with a short response."},
    {"role": "user", "content": "Tell me what is Pythagorean theorem like you are a pirate."}
]

# Build the chatml prompt and move it to the model's device.
input_ids = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt").to(model.device)
# temperature only takes effect when sampling is enabled.
generated_ids = model.generate(input_ids, max_new_tokens=512, do_sample=True, temperature=0.5)
# Decode only the newly generated tokens, not the prompt.
response = tokenizer.decode(generated_ids[0][input_ids.shape[-1]:], skip_special_tokens=True)
print(response)
# Example output: Aye, matey! The Pythagorean theorem is a nautical rule that helps us find the length of the third side of a triangle. ...
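
To continue the conversation for another turn, the model's reply has to be appended to the message history before the next user message. The sketch below shows one way to do that; `extend_conversation` is a hypothetical helper, not an openMind API. The returned list can be passed to `tokenizer.apply_chat_template` in the same way as `messages` in the example above.

```python
def extend_conversation(messages, assistant_reply, next_user_message):
    """Return a new message list with the reply and the next user turn appended."""
    return messages + [
        {"role": "assistant", "content": assistant_reply},
        {"role": "user", "content": next_user_message},
    ]


if __name__ == "__main__":
    history = [{"role": "user", "content": "Tell me what is Pythagorean theorem."}]
    history = extend_conversation(history, "(model reply goes here)", "Now give an example.")
    for turn in history:
        print(f"{turn['role']}: {turn['content']}")
```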
