
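The xomad/gliner-model-merge-large-v1.0 model is built on the pretrained model knowledgator/gliner-multitask-large-v0.5 and was developed to explore what model merging can achieve. Merging raised the average F1 score by 3.25 percentage points, from 0.6276 to 0.6601.

To ensure broad applicability under the Apache-2.0 license, the model was trained only on datasets with commercially friendly licenses. The datasets used during training are as follows:

The process starts from the base model knowledgator/gliner-multitask-large-v0.5. Our model, xomad/gliner-model-merge-large-v1.0, is fine-tuned separately on each of the datasets above, and multiple checkpoints are saved during each fine-tuning run. We gather all of these checkpoints into a single pool and then apply the Model soups technique to produce several merged models:

- uniform_merged
- greedy_on_random
- greedy_on_sorted

Next, we apply the WiSE-FT merging technique: pairs of models are selected from the group formed by the three merged models above plus the original model, and each pair is merged to produce the wise_ft_merged model. This completes the first fine-tuning stage.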
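
The Model soups step above can be sketched as follows. This is a minimal illustration, not the authors' training code. Weights are plain floats so the sketch runs without dependencies, and the `evaluate` callback is a hypothetical held-out scoring function. Reading `greedy_on_sorted` and `greedy_on_random` as a greedy soup over checkpoints ordered by score and in random order, respectively, is an assumption drawn from the names alone.

```python
import random
from typing import Callable, Dict, Sequence

# A "state dict" maps parameter names to weights. Floats keep the sketch
# dependency-free; the same arithmetic applies element-wise to arrays/tensors.
StateDict = Dict[str, float]
Evaluate = Callable[[StateDict], float]  # hypothetical: higher is better


def uniform_soup(checkpoints: Sequence[StateDict]) -> StateDict:
    """Average every parameter across all checkpoints."""
    n = len(checkpoints)
    return {name: sum(ckpt[name] for ckpt in checkpoints) / n
            for name in checkpoints[0]}


def greedy_soup(checkpoints: Sequence[StateDict], evaluate: Evaluate) -> StateDict:
    """Visit checkpoints in the given order and keep each one only if the
    averaged model's held-out score does not drop."""
    ingredients = [checkpoints[0]]
    best = evaluate(uniform_soup(ingredients))
    for ckpt in checkpoints[1:]:
        score = evaluate(uniform_soup(ingredients + [ckpt]))
        if score >= best:
            ingredients.append(ckpt)
            best = score
    return uniform_soup(ingredients)


def greedy_on_sorted(checkpoints: Sequence[StateDict], evaluate: Evaluate) -> StateDict:
    """Greedy soup over checkpoints ranked from best to worst (assumed meaning)."""
    ranked = sorted(checkpoints, key=evaluate, reverse=True)
    return greedy_soup(ranked, evaluate)


def greedy_on_random(checkpoints: Sequence[StateDict], evaluate: Evaluate,
                     seed: int = 0) -> StateDict:
    """Greedy soup over checkpoints in a shuffled order (assumed meaning)."""
    shuffled = list(checkpoints)
    random.Random(seed).shuffle(shuffled)
    return greedy_soup(shuffled, evaluate)


if __name__ == "__main__":
    # Toy pool: one weight per checkpoint, scored by closeness to 1.0.
    pool = [{"w": 0.8}, {"w": 1.3}, {"w": 3.0}]
    score = lambda sd: -(sd["w"] - 1.0) ** 2
    print(uniform_soup(pool))
    print(greedy_on_sorted(pool, score))
```

The uniform variant averages every checkpoint unconditionally, while the greedy variants can drop checkpoints that hurt the held-out score, which is why the visiting order matters.
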
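
The process is then repeated in a second fine-tuning stage, with wise_ft_merged as the new starting point, to produce the final model. The full fine-tuning pipeline is shown in the figure below: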
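
The WiSE-FT merge that produces wise_ft_merged is, at its core, a linear interpolation between the weights of two models. The sketch below shows that interpolation, plus one possible way to choose a pair and a mixing coefficient. The grid of `alphas`, the hypothetical `evaluate` callback, and the exhaustive search over pairs are all assumptions, since the source does not say how the pair or the coefficient was selected.

```python
from itertools import combinations
from typing import Callable, Dict, Sequence, Tuple

# Floats stand in for weight tensors so the sketch has no dependencies.
StateDict = Dict[str, float]


def wise_ft(theta_a: StateDict, theta_b: StateDict, alpha: float) -> StateDict:
    """Weight-space interpolation: (1 - alpha) * theta_a + alpha * theta_b."""
    return {name: (1.0 - alpha) * theta_a[name] + alpha * theta_b[name]
            for name in theta_a}


def best_pairwise_merge(
    candidates: Dict[str, StateDict],
    evaluate: Callable[[StateDict], float],      # hypothetical held-out score
    alphas: Sequence[float] = (0.0, 0.25, 0.5, 0.75, 1.0),  # assumed grid
) -> Tuple[float, str, str, float, StateDict]:
    """Try every pair of candidates and every alpha; return the best merge as
    (score, name_a, name_b, alpha, merged_weights). Ties keep the first hit."""
    best = None
    for (name_a, a), (name_b, b) in combinations(candidates.items(), 2):
        for alpha in alphas:
            merged = wise_ft(a, b, alpha)
            score = evaluate(merged)
            if best is None or score > best[0]:
                best = (score, name_a, name_b, alpha, merged)
    if best is None:
        raise ValueError("need at least two candidates and one alpha")
    return best


if __name__ == "__main__":
    # Toy candidates named after the models in the text; weights are made up.
    candidates = {
        "original": {"w": 0.0},
        "uniform_merged": {"w": 2.0},
        "greedy_on_sorted": {"w": 4.0},
    }
    score = lambda sd: -(sd["w"] - 1.0) ** 2
    print(best_pairwise_merge(candidates, score))
```

Setting `alpha` to 0 or 1 recovers one of the two input models unchanged, so the search can never score worse than the best single candidate on the held-out set.
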

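The fine-tuned model pool and the merged models were evaluated on the CrossNER and TwitterNER benchmarks, and the results are plotted in the two figures below (as crossner_f1 and other_f1, respectively).

Stage-one fine-tuning plot:

Stage-two fine-tuning plot: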

To use this model, you must install the GLiNER Python library:

    pip install gliner

Once the GLiNER library is installed, you can import the GLiNER class and load this model with GLiNER.from_pretrained.

    from gliner import GLiNER

    # Download (on first use) and load the merged model
    model = GLiNER.from_pretrained("xomad/gliner-model-merge-large-v1.0")

    text = """
    Microsoft was founded by Bill Gates and Paul Allen on April 4, 1975 to develop and sell BASIC interpreters for the Altair 8800. During his career at Microsoft, Gates held the positions of chairman, chief executive officer, president and chief software architect, while also being the largest individual shareholder until May 2014.
    """

    # Entity types to extract; any label strings can be supplied at inference time
    labels = ["founder", "computer", "software", "position", "date", "company"]

    entities = model.predict_entities(text, labels)

    for entity in entities:
        print(entity["text"], "=>", entity["label"])

Output:

    Microsoft => company
    Bill Gates => founder
    Paul Allen => founder
    April 4, 1975 => date
    BASIC => software
    Altair 8800 => computer
    Microsoft => company
    chairman => position
    chief executive officer => position
    president => position
    chief software architect => position
    May 2014 => date

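
The flat list printed above often needs to be regrouped before further use. The helper below is a small sketch, not part of the GLiNER library: `group_by_label` is a hypothetical name, and it relies only on the `"text"` and `"label"` keys that the usage example reads from each entity.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List, Mapping


def group_by_label(entities: Iterable[Mapping[str, Any]]) -> Dict[str, List[str]]:
    """Collect entity texts per label, dropping repeats and keeping first-seen order."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for entity in entities:
        text, label = entity["text"], entity["label"]
        if text not in grouped[label]:
            grouped[label].append(text)
    return dict(grouped)


# The predictions shown above, reduced to the two keys the helper needs.
sample = [
    {"text": "Microsoft", "label": "company"},
    {"text": "Bill Gates", "label": "founder"},
    {"text": "Paul Allen", "label": "founder"},
    {"text": "April 4, 1975", "label": "date"},
    {"text": "BASIC", "label": "software"},
    {"text": "Altair 8800", "label": "computer"},
    {"text": "Microsoft", "label": "company"},
    {"text": "chairman", "label": "position"},
    {"text": "chief executive officer", "label": "position"},
    {"text": "president", "label": "position"},
    {"text": "chief software architect", "label": "position"},
    {"text": "May 2014", "label": "date"},
]

for label, texts in group_by_label(sample).items():
    print(label, "=>", texts)
```

With the model loaded as in the usage example, the same helper can be applied directly to the result of `model.predict_entities(text, labels)`.
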
Performance on several zero-shot named entity recognition (NER) benchmarks (CrossNER, mit-movie, and mit-restaurant), with data taken from https://huggingface.co/knowledgator/gliner-multitask-large-v0.5:

Detailed performance on each dataset:

| Model | Dataset | Precision | Recall | F1 Score | F1 Score (Decimal) |
|---|---|---|---|---|---|
| xomad/gliner-model-merge-large-v1.0 | CrossNER_AI | 62.66% | 57.48% | 59.96% | 0.5996 |
| | CrossNER_literature | 73.28% | 66.42% | 69.68% | 0.6968 |
| | CrossNER_music | 74.89% | 70.67% | 72.72% | 0.7272 |
| | CrossNER_politics | 79.46% | 77.57% | 78.51% | 0.7851 |
| | CrossNER_science | 74.72% | 70.24% | 72.41% | 0.7241 |
| | mit-movie | 67.33% | 57.89% | 62.25% | 0.6225 |
| | mit-restaurant | 54.94% | 40.41% | 46.57% | 0.4657 |
| | Average | | | | 0.6601 |
| numind/NuNER_Zero-span | CrossNER_AI | 63.82% | 56.82% | 60.12% | 0.6012 |
| | CrossNER_literature | 73.53% | 58.06% | 64.89% | 0.6489 |
| | CrossNER_music | 72.69% | 67.40% | 69.95% | 0.6995 |
| | CrossNER_politics | 77.28% | 68.69% | 72.73% | 0.7273 |
| | CrossNER_science | 70.08% | 63.12% | 66.42% | 0.6642 |
| | mit-movie | 63.00% | 48.88% | 55.05% | 0.5505 |
| | mit-restaurant | 54.81% | 37.62% | 44.62% | 0.4462 |
| | Average | | | | 0.6196 |
| knowledgator/gliner-multitask-v0.5 | CrossNER_AI | 51.00% | 51.11% | 51.05% | 0.5105 |
| | CrossNER_literature | 72.65% | 65.62% | 68.96% | 0.6896 |
| | CrossNER_music | 74.91% | 73.70% | 74.30% | 0.7430 |
| | CrossNER_politics | 78.84% | 77.71% | 78.27% | 0.7827 |
| | CrossNER_science | 69.20% | 65.48% | 67.29% | 0.6729 |
| | mit-movie | 61.29% | 52.59% | 56.60% | 0.5660 |
| | mit-restaurant | 50.65% | 38.13% | 43.51% | 0.4351 |
| | Average | | | | 0.6276 |
| gliner-community/gliner_large-v2.5 | CrossNER_AI | 50.85% | 63.03% | 56.29% | 0.5629 |
| | CrossNER_literature | 64.92% | 67.21% | 66.04% | 0.6604 |
| | CrossNER_music | 70.88% | 73.10% | 71.97% | 0.7197 |
| | CrossNER_politics | 72.67% | 72.93% | 72.80% | 0.7280 |
| | CrossNER_science | 61.71% | 68.85% | 65.08% | 0.6508 |
| | mit-movie | 54.63% | 52.83% | 53.71% | 0.5371 |
| | mit-restaurant | 47.99% | 42.13% | 44.87% | 0.4487 |
| | Average | | | | 0.6154 |
| urchade/gliner_large-v2.1 | CrossNER_AI | 54.98% | 52.00% | 53.45% | 0.5345 |
| | CrossNER_literature | 59.33% | 56.47% | 57.87% | 0.5787 |
| | CrossNER_music | 67.39% | 66.77% | 67.08% | 0.6708 |
| | CrossNER_politics | 66.07% | 63.76% | 64.90% | 0.6490 |
| | CrossNER_science | 61.45% | 62.56% | 62.00% | 0.6200 |
| | mit-movie | 55.94% | 47.36% | 51.29% | 0.5129 |
| | mit-restaurant | 53.34% | 40.83% | 46.25% | 0.4625 |
| | Average | | | | 0.5754 |
| EmergentMethods/gliner_large_news-v2.1 | CrossNER_AI | 59.60% | 54.55% | 56.96% | 0.5696 |
| | CrossNER_literature | 65.41% | 56.16% | 60.44% | 0.6044 |
| | CrossNER_music | 67.47% | 63.08% | 65.20% | 0.6520 |
| | CrossNER_politics | 66.05% | 60.07% | 62.92% | 0.6292 |
| | CrossNER_science | 68.44% | 63.57% | 65.92% | 0.6592 |
| | mit-movie | 65.85% | 49.59% | 56.57% | 0.5657 |
| | mit-restaurant | 54.71% | 35.94% | 43.38% | 0.4338 |
| | Average | | | | 0.5876 |
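
The columns of the table are related by the usual harmonic-mean definition of F1. The short check below recomputes a few cells from the table's own numbers. The headline average for xomad/gliner-model-merge-large-v1.0 matches an unweighted mean of its seven per-dataset F1 scores. That the reported averages were computed this way is an assumption, and it is verified here for that model only.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Per-dataset F1 scores for xomad/gliner-model-merge-large-v1.0, copied from the table.
merged_f1 = {
    "CrossNER_AI": 0.5996,
    "CrossNER_literature": 0.6968,
    "CrossNER_music": 0.7272,
    "CrossNER_politics": 0.7851,
    "CrossNER_science": 0.7241,
    "mit-movie": 0.6225,
    "mit-restaurant": 0.4657,
}

# Unweighted (macro) average over the seven datasets.
macro_average = sum(merged_f1.values()) / len(merged_f1)

# Gain over the reported base-model average, in percentage points.
gain_points = (0.6601 - 0.6276) * 100

print(round(f1_score(0.6266, 0.5748), 4))  # CrossNER_AI row: precision, recall -> F1
print(round(macro_average, 4))
print(round(gain_points, 2))
```
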
Hoan Nguyen, from xomad.com
@misc{wortsman2022modelsoupsaveragingweights,
title={Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time},
author={Mitchell Wortsman and Gabriel Ilharco and Samir Yitzhak Gadre and Rebecca Roelofs and Raphael Gontijo-Lopes and Ari S. Morcos and Hongseok Namkoong and Ali Farhadi and Yair Carmon and Simon Kornblith and Ludwig Schmidt},
year={2022},
eprint={2203.05482},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2203.05482},
}
@InProceedings{Wortsman_2022_CVPR,
author = {Wortsman, Mitchell and Ilharco, Gabriel and Kim, Jong Wook and Li, Mike and Kornblith, Simon and Roelofs, Rebecca and Lopes, Raphael Gontijo and Hajishirzi, Hannaneh and Farhadi, Ali and Namkoong, Hongseok and Schmidt, Ludwig},
title = {Robust Fine-Tuning of Zero-Shot Models},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2022},
pages = {7959-7971}
}
@misc{stepanov2024gliner,
title={GLiNER multi-task: Generalist Lightweight Model for Various Information Extraction Tasks},
author={Ihor Stepanov and Mykhailo Shtopko},
year={2024},
eprint={2406.12925},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
@misc{zaratiana2023gliner,
title={GLiNER: Generalist Model for Named Entity Recognition using Bidirectional Transformer},
author={Urchade Zaratiana and Nadi Tomeh and Pierre Holat and Thierry Charnois},
year={2023},
eprint={2311.08526},
archivePrefix={arXiv},
primaryClass={cs.CL}
}