BETO (Spanish BERT) + Spanish SQuAD2.0 + distillation using "bert-base-multilingual-cased" as the teacher

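This model is a distilled version of BETO, fine-tuned on SQuAD-es-v2.0 for question answering (Q&A).

Thanks to distillation, it is smaller, faster, cheaper, and lighter than bert-base-spanish-wwm-cased-finetuned-spa-squad2-es.

It was fine-tuned on the same dataset, but with distillation applied during training (and with one more training epoch).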

The teacher model used for the distillation was bert-base-multilingual-cased. It is the same teacher used for distilbert-base-multilingual-cased, a.k.a. DistilmBERT (which is on average twice as fast as mBERT-base).
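To make the idea of learning from a teacher more concrete, here is a minimal pure-Python sketch of a soft-target distillation loss over per-position start (or end) logits. It is an illustration only: the function names, the temperature of 2.0, and the 0.5/0.5 weights are assumptions chosen for the example, not values confirmed for this model.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same positions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Soft-target loss: KL between softened teacher and student distributions.

    The temperature value is an assumption for illustration. Multiplying by
    temperature ** 2 keeps gradient magnitudes comparable across temperatures.
    """
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    return kl_divergence(teacher_probs, student_probs) * temperature ** 2

def combined_loss(hard_loss, soft_loss, alpha_hard=0.5, alpha_soft=0.5):
    """Weighted sum of the supervised loss and the distillation loss.

    The weights are hypothetical defaults, not the ones used for this model.
    """
    return alpha_hard * hard_loss + alpha_soft * soft_loss

# Example: start-position logits from a student and a teacher over 4 tokens.
student_start = [2.0, 0.5, -1.0, 0.1]
teacher_start = [3.0, 0.2, -2.0, 0.0]
soft = distillation_loss(student_start, teacher_start)
```

The soft loss is zero when the student reproduces the teacher's distribution exactly and grows as the two diverge, which is what pushes the student toward the teacher's behavior during fine-tuning.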

Details of the downstream task (Q&A) - Dataset

SQuAD-es-v2.0

Dataset	# Q&A pairs
SQuAD2.0 Train	130 K
SQuAD2.0-es-v2.0 dataset	111 K
SQuAD2.0 Dev	12 K
SQuAD-es-v2.0-small Dev	69 K
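Counts like the ones in the table can be checked against a SQuAD v2 style JSON file. The sketch below assumes the standard SQuAD 2.0 layout (data, paragraphs, qas, is_impossible); the helper name and the tiny inline example are hypothetical and only illustrate the structure.

```python
def count_qa_pairs(squad):
    """Return (total, unanswerable) question counts for a SQuAD v2 style dict."""
    total = 0
    unanswerable = 0
    for article in squad["data"]:
        for paragraph in article["paragraphs"]:
            for qa in paragraph["qas"]:
                total += 1
                # SQuAD 2.0 marks questions without an answer in the context.
                if qa.get("is_impossible", False):
                    unanswerable += 1
    return total, unanswerable

# Hypothetical miniature dataset with one answerable and one unanswerable question.
example = {
    "version": "v2.0",
    "data": [
        {
            "title": "Ejemplo",
            "paragraphs": [
                {
                    "context": "BETO es un modelo BERT entrenado en español.",
                    "qas": [
                        {
                            "id": "q1",
                            "question": "¿En qué idioma fue entrenado BETO?",
                            "answers": [{"text": "español", "answer_start": 36}],
                            "is_impossible": False,
                        },
                        {
                            "id": "q2",
                            "question": "¿Quién creó BETO?",
                            "answers": [],
                            "is_impossible": True,
                        },
                    ],
                }
            ],
        }
    ],
}

counts = count_qa_pairs(example)
```

For a real file, load it first with json.load and pass the resulting dictionary to count_qa_pairs.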

Model training

The model was trained on a Tesla P100 GPU with 25 GB of RAM using the following command:

!export SQUAD_DIR=/path/to/squad-v2_spanish \
&& python transformers/examples/distillation/run_squad_w_distillation.py \
  --model_type bert \
  --model_name_or_path dccuchile/bert-base-spanish-wwm-cased \
  --teacher_type bert \
  --teacher_name_or_path bert-base-multilingual-cased \
  --do_train \
  --do_eval \
  --do_lower_case \
  --train_file $SQUAD_DIR/train-v2.json \
  --predict_file $SQUAD_DIR/dev-v2.json \
  --per_gpu_train_batch_size 12 \
  --learning_rate 3e-5 \
  --num_train_epochs 5.0 \
  --max_seq_length 384 \
  --doc_stride 128 \
  --output_dir /content/model_output \
  --save_steps 5000 \
  --threads 4 \
  --version_2_with_negative
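The --max_seq_length and --doc_stride options control how contexts longer than the model input are split into overlapping windows. The following is a simplified sketch of that idea, modeled on the original BERT SQuAD preprocessing; the helper names are hypothetical, and the actual script works on tokenizer output rather than plain counts.

```python
def context_budget(max_seq_length, num_query_tokens, num_special_tokens=3):
    """Tokens left for the context after the question and special tokens.

    Assumes a BERT-style pair input: [CLS] question [SEP] context [SEP].
    """
    return max_seq_length - num_query_tokens - num_special_tokens

def window_starts(num_context_tokens, max_context_tokens, doc_stride):
    """Start offsets of the overlapping context windows."""
    starts = []
    start = 0
    while True:
        starts.append(start)
        # Stop once the current window reaches the end of the context.
        if start + max_context_tokens >= num_context_tokens:
            break
        # Advance by the stride, never by more than one full window.
        start += min(max_context_tokens, doc_stride)
    return starts

# Values from the command above, with a hypothetical 20-token question
# and a hypothetical 600-token context.
budget = context_budget(max_seq_length=384, num_query_tokens=20)
starts = window_starts(num_context_tokens=600, max_context_tokens=budget, doc_stride=128)
```

Because consecutive windows overlap, an answer that falls near the edge of one window is usually fully contained in a neighboring one.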

Results:

TBA

Model in action

Fast usage with pipelines:

from transformers import pipeline

# Important: for now the QA pipeline is not compatible with fast tokenizers, although support is being worked on. Until then, pass {"use_fast": False} to the tokenizer, as in the following example:

nlp = pipeline(
    'question-answering', 
    model='mrm8488/distill-bert-base-spanish-wwm-cased-finetuned-spa-squad2-es',
    tokenizer=(
        'mrm8488/distill-bert-base-spanish-wwm-cased-finetuned-spa-squad2-es',  
        {"use_fast": False}
    )
)

nlp(
    {
        'question': '¿Para qué lenguaje está trabajando?',
        'context': 'Manuel Romero está colaborando activamente con huggingface/transformers ' +
                    'para traer el poder de las últimas técnicas de procesamiento de lenguaje natural al idioma español'
    }
)
# Output: {'answer': 'español', 'end': 169, 'score': 0.67530957344621, 'start': 163}

Play with this model and pipelines in a Colab:

Set the context and ask some questions:

[Image: Set context and questions]

Run predictions:

[Image: Run the model]

Want to know more about Huggingface pipelines? Check out this Colab:

Created by Manuel Romero/@mrm8488

Made with ♥ in Spain