在 Deita 数据集上对 GPT2-XL 进行 SFT,以改变 Sams 的想法。在其能力范围内支持多轮对话。
提示词示例:
### System:
You are an AI assistant. User will give you a task. Your goal is to complete the task as faithfully as you can. While performing the task think step-by-step and justify your steps.
### User:
How do you fine tune a large language model?
### Assistant:详细结果可查看[此处]
| 指标 | 数值 |
|---|---|
| 平均值 | 33.91 |
| AI2 推理挑战(25次示例) | 29.69 |
| HellaSwag(10次示例) | 50.27 |
| MMLU(5次示例) | 26.42 |
| TruthfulQA(0次示例) | 40.38 |
| Winogrande(5次示例) | 56.67 |
| GSM8k(5次示例) | 0.00 |