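Model name: open-vakgyata
Model overview: open-vakgyata is an open-source language identification model that detects and classifies Indian languages from speech input.
Supported languages: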
| Language | Code |
|---|---|
| English (India) | en-IN |
| Hindi | hi-IN |
| Odia | or-IN |
| Bengali | bn-IN |
| Tamil | ta-IN |
| Telugu | te-IN |
| Kannada | kn-IN |
| Malayalam | ml-IN |
| Marathi | mr-IN |
| Gujarati | gu-IN |
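
For downstream code it can help to keep the table above as a lookup. The following is a minimal sketch: `SUPPORTED_LANGUAGES` and `describe_language` are hypothetical names introduced here, and it assumes the model's output labels use the same codes as the table, which this card does not explicitly confirm.

```python
# Language codes copied from the table above, mapped to English display names.
SUPPORTED_LANGUAGES = {
    "en-IN": "English (India)",
    "hi-IN": "Hindi",
    "or-IN": "Odia",
    "bn-IN": "Bengali",
    "ta-IN": "Tamil",
    "te-IN": "Telugu",
    "kn-IN": "Kannada",
    "ml-IN": "Malayalam",
    "mr-IN": "Marathi",
    "gu-IN": "Gujarati",
}


def describe_language(code: str) -> str:
    """Return a readable name for a language code, or flag it as unsupported."""
    name = SUPPORTED_LANGUAGES.get(code)
    if name is None:
        return f"{code} (not in the supported list)"
    return f"{name} [{code}]"


print(describe_language("hi-IN"))
```

Unknown codes are reported rather than raising an error, so a label that does not match the table is easy to spot.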
Specifications
Usage:
from transformers import Wav2Vec2ForSequenceClassification, AutoFeatureExtractor
import torch
# Select the device; set to "cuda" to run on a GPU
device = "cpu"
model_id = "onecxi/open-vakgyata"
# Load the feature extractor and the classification model from the Hugging Face Hub
processor = AutoFeatureExtractor.from_pretrained(model_id)
model = Wav2Vec2ForSequenceClassification.from_pretrained(model_id).to(device)
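Inference:
import torchaudio
# Load the audio; torchaudio returns a (channels, samples) tensor and the sample rate
audio, sr = torchaudio.load("path/to/audio.wav")
# Downmix to mono by averaging channels (flattening would concatenate stereo channels)
audio = audio.mean(dim=0)
# Resample to the rate the feature extractor expects (16 kHz for typical Wav2Vec2 checkpoints)
target_sr = processor.sampling_rate
if sr != target_sr:
    audio = torchaudio.functional.resample(audio, orig_freq=sr, new_freq=target_sr)
# Process the waveform and move the inputs to the selected device
inputs = processor(audio.numpy(), sampling_rate=target_sr, return_tensors="pt").to(device)
# Perform inference
with torch.no_grad():
    logits = model(**inputs).logits
# Convert logits to per-language probabilities
probs = logits.softmax(dim=-1).cpu().numpy()
# Look up the label of the most probable class
language = model.config.id2label.get(int(probs.argmax()))
print(language)

If you use this model in your research or applications, please consider citing the model and its underlying sources: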
@misc{vakgyata2024,
title={vakgyata: Language Identification for Indian Speech},
author={OneCXI},
year={2024},
url={https://huggingface.co/onecxi/open-vakgyata}
}
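
As a follow-up to the inference step above, it is sometimes useful to see the runner-up languages rather than only the top prediction. The sketch below is one way to do that in plain Python; `top_k_languages` is a hypothetical helper, not part of the model or of `transformers`, and the logits and labels in the example are made-up values rather than real model output.

```python
import math


def top_k_languages(logits, id2label, k=3):
    """Rank labels by softmax probability; returns a list of (label, probability)."""
    # Subtract the largest logit before exponentiating for numerical stability.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Sort class indices from most to least probable and keep the first k.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k]]


# Illustrative values only, not real model output.
example_logits = [0.2, 2.5, -1.0]
example_labels = {0: "en-IN", 1: "hi-IN", 2: "ta-IN"}
for label, prob in top_k_languages(example_logits, example_labels, k=2):
    print(f"{label}: {prob:.3f}")
```

With the loaded model, the same helper could presumably be called as `top_k_languages(logits[0].tolist(), model.config.id2label)`, assuming a single audio clip per batch.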