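Faro chat models focus on practicality and long-context modeling. They handle a wide range of downstream tasks with higher quality and produce stable, reliable results even when the input contains lengthy documents or complex instructions. Faro works fluently in both English and Chinese.
Faro-Yi-9B is an improved version of Yi-9B-200K, extensively instruction-tuned on Fusang-V1. Thanks to the large-scale synthetic data in Fusang-V1, Faro-Yi-9B is significantly more capable than Yi-9B-200K across downstream tasks and in long-context modeling.
Like Yi-9B-200K, Faro-Yi-9B supports a context length of up to 200K tokens.
Faro-Yi-9B uses the chatml template and performs well in both short-context and long-context scenarios.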
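
As a rough illustration of what the chatml template produces, the sketch below renders a message list by hand. It assumes the standard ChatML delimiters (`<|im_start|>` and `<|im_end|>`); `render_chatml` is a hypothetical helper written for this example, and the template shipped with the model's tokenizer remains authoritative and may differ in detail.

```python
def render_chatml(messages, add_generation_prompt=True):
    """Render chat messages as ChatML text (assumed standard delimiters)."""
    parts = []
    for message in messages:
        # Each turn is wrapped as <|im_start|>{role}\n{content}<|im_end|>\n
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # An open assistant turn tells the model it should answer next.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]
print(render_chatml(messages))
```

In practice, `tokenizer.apply_chat_template` does this for you; the hand-written version is only meant to show the shape of the prompt.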

    # Load the Ascend toolkit environment variables
    source /usr/local/Ascend/ascend-toolkit/set_env.sh
    export OPENMIND_FRAMEWORK=pt

Then run inference in Python:

    import argparse

    from openmind import AutoModelForCausalLM, AutoTokenizer
    from openmind_hub import snapshot_download


    def parse_args():
        parser = argparse.ArgumentParser()
        parser.add_argument(
            "--model_name_or_path",
            type=str,
            help="Local model path; if omitted, Jinan_AICC/Faro-Yi-9B is downloaded",
            default=None,
        )
        return parser.parse_args()


    args = parse_args()
    # Use the given path; otherwise fetch the weights from the hub.
    # The repo id is taken from the original help text.
    model_path = args.model_name_or_path or snapshot_download("Jinan_AICC/Faro-Yi-9B")

    model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto")
    tokenizer = AutoTokenizer.from_pretrained(model_path)

    messages = [
        {"role": "system", "content": "You are a helpful assistant. Always answer with a short response."},
        {"role": "user", "content": "Tell me what is Pythagorean theorem like you are a pirate."},
    ]

    # Apply the chat template and move the token ids to the model's device.
    input_ids = tokenizer.apply_chat_template(
        messages, tokenize=True, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    # temperature only takes effect when sampling is enabled (do_sample=True).
    generated_ids = model.generate(input_ids, max_new_tokens=512, temperature=0.5)
    response = tokenizer.decode(generated_ids[0], skip_special_tokens=True)
    # Aye, matey! The Pythagorean theorem is a nautical rule that helps us find the length of the third side of a triangle. ...
    print(response)

Faro-Yi-9B improves on Yi-9B-200K in most dimensions, especially long-text modeling and bilingual (English and Chinese) understanding. Faro is competitive among all open-source models at roughly the 9B-parameter scale.

| Model | MMLU | GSM8K | HellaSwag | TruthfulQA | Arc | Winogrande |
|---|---|---|---|---|---|---|
| Yi-9B-200K | 65.73 | 50.49 | 56.72 | 33.80 | 69.25 | 71.67 |
| Faro-Yi-9B | 68.80 | 63.08 | 57.28 | 40.86 | 72.58 | 71.11 |

| Model | Average_zh | Average_en | Code Completion |
|---|---|---|---|
| Yi-9B-200K | 30.288 | 36.7071 | 72.2 |
| Faro-Yi-9B | 41.092 | 40.9536 | 46.0 |

| Model | Few-shot Learning_en | Synthetic Tasks_en | Single-Doc QA_en | Multi-Doc QA_en | Summarization_en | Few-shot Learning_zh | Synthetic Tasks_zh | Single-Doc QA_zh | Multi-Doc QA_zh | Summarization_zh |
|---|---|---|---|---|---|---|---|---|---|---|
| Yi-9B-200K | 60.6 | 22.8 | 30.9 | 38.9 | 25.8 | 46.5 | 28.0 | 49.6 | 17.7 | 9.7 |
| Faro-Yi-9B | 63.8 | 40.2 | 36.2 | 38.0 | 26.3 | 30.0 | 75.1 | 55.6 | 30.7 | 14.1 |
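
To make the per-task comparison easier to read, the following sketch computes the score change for each column of the table above and separates gains from regressions. The scores are copied from the table; `scores` and `split_by_change` are illustrative names for this example, not part of any Faro tooling.

```python
# Scores copied from the table above: (Yi-9B-200K, Faro-Yi-9B).
scores = {
    "Few-shot Learning_en": (60.6, 63.8),
    "Synthetic Tasks_en": (22.8, 40.2),
    "Single-Doc QA_en": (30.9, 36.2),
    "Multi-Doc QA_en": (38.9, 38.0),
    "Summarization_en": (25.8, 26.3),
    "Few-shot Learning_zh": (46.5, 30.0),
    "Synthetic Tasks_zh": (28.0, 75.1),
    "Single-Doc QA_zh": (49.6, 55.6),
    "Multi-Doc QA_zh": (17.7, 30.7),
    "Summarization_zh": (9.7, 14.1),
}


def split_by_change(scores):
    """Return (improved, regressed) lists of (task, delta) pairs."""
    improved, regressed = [], []
    for task, (base, faro) in scores.items():
        # Round to one decimal to avoid floating-point noise in the deltas.
        delta = round(faro - base, 1)
        # No ties occur in this table; a zero delta would count as improved.
        (regressed if delta < 0 else improved).append((task, delta))
    return improved, regressed


improved, regressed = split_by_change(scores)
print(f"improved on {len(improved)} of {len(scores)} tasks")
for task, delta in regressed:
    print(f"regressed: {task} ({delta:+.1f})")
```

For this table, the script reports gains on 8 of 10 tasks, with regressions on Multi-Doc QA_en and Few-shot Learning_zh.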

| Model | MMLU | CMMLU |
|---|---|---|
| Yi-9B-200K | 65.73 | 71.97 |
| Faro-Yi-9B | 68.80 | 73.28 |