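Nandi-Mini-150M-Instruct is a compact, efficient multilingual model designed to deliver strong performance in resource-constrained environments. It was pre-trained from scratch on 525 billion tokens, then further improved with instruction fine-tuning and Direct Preference Optimization (DPO). The model supports English and 10 Indic languages.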
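Nandi-Mini-150M-Instruct focuses on maximizing performance per parameter through architectural efficiency rather than scale. It is optimized for edge devices, local deployment, and low-latency applications, which makes it well suited to resource-constrained environments.
Nandi-Mini-150M-Instruct has the following key features: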
The Nandi series is just getting started 🚀
📢 A technical blog and deep dive are coming soon, in which we will share:
Stay tuned!
The model was trained on English and several Indic languages, including:
!pip install transformers=='5.4.0'
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
model_name = "Rta-AILabs/Nandi-Mini-150M-Instruct"
device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
# Load the weights in bfloat16, move them to the selected device, and switch to inference mode
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    trust_remote_code=True,
    dtype=torch.bfloat16,
).to(device).eval()
prompt = "Explain newton's second law of motion"
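messages = [
    {"role": "user", "content": prompt}
]
# add_generation_prompt=True appends the assistant-turn marker so the model answers
# instead of continuing the user turn
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)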
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# Sample a response of up to 500 new tokens
generated_ids = model.generate(
    **inputs,
    max_new_tokens=500,
    do_sample=True,
    temperature=0.3,
    top_p=0.90,
    top_k=20,
    repetition_penalty=1.1,
)
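# Drop the prompt tokens so that only the newly generated tokens are decoded
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(inputs.input_ids, generated_ids)
]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
We would love to hear your thoughts, feedback, and suggestions!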
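A note on the usage example above: for decoder-only models, `generate` returns each sequence as the prompt tokens followed by the new tokens, which is why the example slices the prompt off before decoding. The sketch below illustrates that slicing step with plain Python lists. The token IDs and the `strip_prompt` helper are hypothetical, chosen only for illustration; they are not real tokenizer output or part of the model's API.

```python
# Hypothetical token IDs for illustration only (not real tokenizer output).
prompt_ids = [[101, 7, 8, 9]]               # a batch with one 4-token prompt
output_ids = [[101, 7, 8, 9, 42, 43, 44]]   # prompt tokens followed by 3 new tokens


def strip_prompt(prompt_batch, output_batch):
    """Return, for each sequence, only the tokens that follow its prompt."""
    return [out[len(inp):] for inp, out in zip(prompt_batch, output_batch)]


new_tokens = strip_prompt(prompt_ids, output_ids)
print(new_tokens)  # [[42, 43, 44]]
```

Decoding only the sliced tokens keeps the user's prompt out of the printed response.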