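gliner-model-merge-large-v1.0: a model for zero-shot named entity recognition that extracts entities such as companies, people, and dates from text. It uses model merging to improve performance, reaching an average F1 score of 0.6601, and is trained only on datasets with commercially friendly licenses.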

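The xomad/gliner-model-merge-large-v1.0 model is built on the pretrained model knowledgator/gliner-multitask-large-v0.5 and was developed to explore what model merging techniques can achieve. Merging raised the average F1 score by 3.25 percentage points, from 0.6276 to 0.6601.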

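To keep the model broadly usable under the Apache-2.0 license, it was trained only on datasets with commercially friendly licenses. The datasets used during training are as follows:

⚙️ Fine-tuning process

The process starts from the base model knowledgator/gliner-multitask-large-v0.5. The model is fine-tuned separately on each of the datasets above, and multiple checkpoints are saved during each run. All of these checkpoints are collected into a single pool, and the Model soups technique is then applied to produce several merged models: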

uniform_merged
greedy_on_random
greedy_on_sorted
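
The two soup recipes named above can be sketched as follows. This is a minimal illustration of the Model soups idea, not the training code used for this model: weights are represented as plain Python dicts of float lists instead of real state dicts, `toy_score` is a hypothetical stand-in for a held-out evaluation, and reading greedy_on_sorted and greedy_on_random as the same greedy procedure run over a score-sorted or shuffled pool is an assumption based on the names.

```python
import random
from typing import Callable, Dict, List, Sequence

Weights = Dict[str, List[float]]  # toy stand-in for a model state dict


def average_weights(checkpoints: Sequence[Weights]) -> Weights:
    """Uniform soup: element-wise mean of every parameter across checkpoints."""
    n = len(checkpoints)
    return {
        name: [sum(vals) / n for vals in zip(*(ckpt[name] for ckpt in checkpoints))]
        for name in checkpoints[0]
    }


def greedy_soup(checkpoints: Sequence[Weights],
                evaluate: Callable[[Weights], float]) -> Weights:
    """Greedy soup: visit checkpoints in the given order and keep each one only
    if the averaged weights score at least as well as the current soup."""
    soup = [checkpoints[0]]
    best_score = evaluate(average_weights(soup))
    for candidate in checkpoints[1:]:
        trial_score = evaluate(average_weights(soup + [candidate]))
        if trial_score >= best_score:
            soup.append(candidate)
            best_score = trial_score
    return average_weights(soup)


def toy_score(weights: Weights) -> float:
    """Hypothetical metric: higher when the single weight is closer to 1.0."""
    return -abs(weights["w"][0] - 1.0)


pool = [{"w": [0.8]}, {"w": [1.4]}, {"w": [3.0]}]

uniform_merged = average_weights(pool)

sorted_pool = sorted(pool, key=toy_score, reverse=True)
greedy_on_sorted = greedy_soup(sorted_pool, toy_score)

shuffled_pool = list(pool)
random.Random(0).shuffle(shuffled_pool)  # fixed seed only for repeatability
greedy_on_random = greedy_soup(shuffled_pool, toy_score)
```

In this toy pool the greedy pass over the sorted checkpoints keeps the first two and rejects the third, because adding it would lower the score of the average.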

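Next, we apply the WiSE-FT merging technique: pairs of models are selected from the group made up of the three models above plus the original model, and merged to produce the wise_ft_merged model. This completes the first fine-tuning phase.

The process is then repeated in a second fine-tuning phase, with wise_ft_merged as the new starting point, to produce the final model. The whole fine-tuning flow is shown in the figure below: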
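
WiSE-FT merges two sets of weights by linear interpolation. The sketch below shows that interpolation, plus one possible way to pick a pair and a mixing coefficient by grid search. The toy weights, the `toy_score` metric, the `alphas` grid, and the exhaustive pair search are all illustrative assumptions; the source does not say how pairs or coefficients were actually chosen.

```python
from itertools import combinations
from typing import Callable, Dict, List, Sequence, Tuple

Weights = Dict[str, List[float]]  # toy stand-in for a model state dict


def interpolate(theta_a: Weights, theta_b: Weights, alpha: float) -> Weights:
    """Return (1 - alpha) * theta_a + alpha * theta_b for every parameter."""
    return {
        name: [(1.0 - alpha) * a + alpha * b
               for a, b in zip(theta_a[name], theta_b[name])]
        for name in theta_a
    }


def best_pairwise_merge(
    models: Dict[str, Weights],
    evaluate: Callable[[Weights], float],
    alphas: Sequence[float] = (0.0, 0.25, 0.5, 0.75, 1.0),
) -> Tuple[float, str, str, float, Weights]:
    """Try every model pair and every alpha; keep the best-scoring merge."""
    best = None
    for (name_a, theta_a), (name_b, theta_b) in combinations(models.items(), 2):
        for alpha in alphas:
            merged = interpolate(theta_a, theta_b, alpha)
            score = evaluate(merged)
            if best is None or score > best[0]:
                best = (score, name_a, name_b, alpha, merged)
    return best


def toy_score(weights: Weights) -> float:
    """Hypothetical metric: higher when the single weight is closer to 1.0."""
    return -abs(weights["w"][0] - 1.0)


candidates = {
    "original": {"w": [0.0]},
    "uniform_merged": {"w": [2.0]},
    "greedy_on_sorted": {"w": [4.0]},
}
score, name_a, name_b, alpha, wise_ft_merged = best_pairwise_merge(candidates, toy_score)
```

With `alpha = 0` the merge equals the first model and with `alpha = 1` it equals the second, so the grid always includes both unmerged endpoints as fallbacks.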

Finetuning flow

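The fine-tuned model pool and the merged models were evaluated on the CrossNER and TwitterNER benchmarks, and the results are plotted in the two figures below (as crossner_f1 and other_f1, respectively).

First fine-tuning phase plot: 1st finetuning phase

Second fine-tuning phase plot: 2nd finetuning phase

🛠️ Installation

To use this model, you must install the GLiNER Python library: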

pip install gliner

Once the GLiNER library is installed, you can import the GLiNER class and load this model with GLiNER.from_pretrained.

💻 Usage

from gliner import GLiNER

model = GLiNER.from_pretrained("xomad/gliner-model-merge-large-v1.0")

text = """
Microsoft was founded by Bill Gates and Paul Allen on April 4, 1975 to develop and sell BASIC interpreters for the Altair 8800. During his career at Microsoft, Gates held the positions of chairman, chief executive officer, president and chief software architect, while also being the largest individual shareholder until May 2014.
"""

labels = ["founder", "computer", "software", "position", "date", "company"]

entities = model.predict_entities(text, labels)

for entity in entities:
    print(entity["text"], "=>", entity["label"])

Output:

Microsoft => company
Bill Gates => founder
Paul Allen => founder
April 4, 1975 => date
BASIC => software
Altair 8800 => computer
Microsoft => company
chairman => position
chief executive officer => position
president => position
chief software architect => position
May 2014 => date
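
The output above lists "Microsoft" twice because every mention in the text is reported as its own entity. If one list of unique strings per label is more convenient, the predictions can be post-processed as sketched below. The helper name `group_entities` is hypothetical, and `sample_entities` is a hand-copied subset of the output above that carries only the `text` and `label` keys used in the usage example.

```python
from collections import defaultdict
from typing import Dict, List


def group_entities(entities: List[dict]) -> Dict[str, List[str]]:
    """Group entity texts by label, dropping repeated mentions but
    preserving the order in which each text first appears."""
    grouped = defaultdict(list)
    for entity in entities:
        texts = grouped[entity["label"]]
        if entity["text"] not in texts:
            texts.append(entity["text"])
    return dict(grouped)


# Hand-copied subset of the predictions shown above (not produced by a model here).
sample_entities = [
    {"text": "Microsoft", "label": "company"},
    {"text": "Bill Gates", "label": "founder"},
    {"text": "Paul Allen", "label": "founder"},
    {"text": "Microsoft", "label": "company"},
]

grouped = group_entities(sample_entities)
for label, texts in grouped.items():
    print(label, "=>", texts)
```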

📊 Benchmarks:

Model performance

Performance on several zero-shot named entity recognition (NER) benchmarks (CrossNER, mit-movie, and mit-restaurant), with data taken from https://huggingface.co/knowledgator/gliner-multitask-large-v0.5:

Model	F1 score
xomad/gliner-model-merge-large-v1.0	0.6601
knowledgator/gliner-multitask-v0.5	0.6276
numind/NuNER_Zero-span	0.6196
gliner-community/gliner_large-v2.5	0.6154
EmergentMethods/gliner_large_news-v2.1	0.5876
urchade/gliner_large-v2.1	0.5754

Detailed performance on each dataset:

Model	Dataset	Precision	Recall	F1 score	F1 score (decimal)
xomad/gliner-model-merge-large-v1.0	CrossNER_AI	62.66%	57.48%	59.96%	0.5996
	CrossNER_literature	73.28%	66.42%	69.68%	0.6968
	CrossNER_music	74.89%	70.67%	72.72%	0.7272
	CrossNER_politics	79.46%	77.57%	78.51%	0.7851
	CrossNER_science	74.72%	70.24%	72.41%	0.7241
	mit-movie	67.33%	57.89%	62.25%	0.6225
	mit-restaurant	54.94%	40.41%	46.57%	0.4657
	Average				0.6601
numind/NuNER_Zero-span	CrossNER_AI	63.82%	56.82%	60.12%	0.6012
	CrossNER_literature	73.53%	58.06%	64.89%	0.6489
	CrossNER_music	72.69%	67.40%	69.95%	0.6995
	CrossNER_politics	77.28%	68.69%	72.73%	0.7273
	CrossNER_science	70.08%	63.12%	66.42%	0.6642
	mit-movie	63.00%	48.88%	55.05%	0.5505
	mit-restaurant	54.81%	37.62%	44.62%	0.4462
	Average				0.6196
knowledgator/gliner-multitask-v0.5	CrossNER_AI	51.00%	51.11%	51.05%	0.5105
	CrossNER_literature	72.65%	65.62%	68.96%	0.6896
	CrossNER_music	74.91%	73.70%	74.30%	0.7430
	CrossNER_politics	78.84%	77.71%	78.27%	0.7827
	CrossNER_science	69.20%	65.48%	67.29%	0.6729
	mit-movie	61.29%	52.59%	56.60%	0.5660
	mit-restaurant	50.65%	38.13%	43.51%	0.4351
	Average				0.6276
gliner-community/gliner_large-v2.5	CrossNER_AI	50.85%	63.03%	56.29%	0.5629
	CrossNER_literature	64.92%	67.21%	66.04%	0.6604
	CrossNER_music	70.88%	73.10%	71.97%	0.7197
	CrossNER_politics	72.67%	72.93%	72.80%	0.7280
	CrossNER_science	61.71%	68.85%	65.08%	0.6508
	mit-movie	54.63%	52.83%	53.71%	0.5371
	mit-restaurant	47.99%	42.13%	44.87%	0.4487
	Average				0.6154
urchade/gliner_large-v2.1	CrossNER_AI	54.98%	52.00%	53.45%	0.5345
	CrossNER_literature	59.33%	56.47%	57.87%	0.5787
	CrossNER_music	67.39%	66.77%	67.08%	0.6708
	CrossNER_politics	66.07%	63.76%	64.90%	0.6490
	CrossNER_science	61.45%	62.56%	62.00%	0.6200
	mit-movie	55.94%	47.36%	51.29%	0.5129
	mit-restaurant	53.34%	40.83%	46.25%	0.4625
	Average				0.5754
EmergentMethods/gliner_large_news-v2.1	CrossNER_AI	59.60%	54.55%	56.96%	0.5696
	CrossNER_literature	65.41%	56.16%	60.44%	0.6044
	CrossNER_music	67.47%	63.08%	65.20%	0.6520
	CrossNER_politics	66.05%	60.07%	62.92%	0.6292
	CrossNER_science	68.44%	63.57%	65.92%	0.6592
	mit-movie	65.85%	49.59%	56.57%	0.5657
	mit-restaurant	54.71%	35.94%	43.38%	0.4338
	Average				0.5876
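
The figures in the tables are related by simple arithmetic, which the following sketch reproduces for xomad/gliner-model-merge-large-v1.0. It assumes that each per-dataset F1 is the harmonic mean of the listed precision and recall and that the reported average is the unweighted mean over the seven datasets. Both assumptions match the numbers in the table, but the evaluation script itself is not part of this document.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both on the same scale)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# CrossNER_AI row: precision 62.66%, recall 57.48%.
crossner_ai_f1 = f1_score(62.66, 57.48)

# Per-dataset F1 scores copied from the table above.
per_dataset_f1 = {
    "CrossNER_AI": 0.5996,
    "CrossNER_literature": 0.6968,
    "CrossNER_music": 0.7272,
    "CrossNER_politics": 0.7851,
    "CrossNER_science": 0.7241,
    "mit-movie": 0.6225,
    "mit-restaurant": 0.4657,
}
average_f1 = sum(per_dataset_f1.values()) / len(per_dataset_f1)

# Gain over the base model, in percentage points.
gain_points = (0.6601 - 0.6276) * 100

print(f"CrossNER_AI F1: {crossner_ai_f1:.2f}%")
print(f"Average F1: {average_f1:.4f}")
print(f"Gain over base model: {gain_points:.2f} points")
```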

Author

Hoan Nguyen, from xomad.com

Citation

@misc{wortsman2022modelsoupsaveragingweights,
      title={Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time}, 
      author={Mitchell Wortsman and Gabriel Ilharco and Samir Yitzhak Gadre and Rebecca Roelofs and Raphael Gontijo-Lopes and Ari S. Morcos and Hongseok Namkoong and Ali Farhadi and Yair Carmon and Simon Kornblith and Ludwig Schmidt},
      year={2022},
      eprint={2203.05482},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2203.05482}, 
}

@InProceedings{Wortsman_2022_CVPR,
    author    = {Wortsman, Mitchell and Ilharco, Gabriel and Kim, Jong Wook and Li, Mike and Kornblith, Simon and Roelofs, Rebecca and Lopes, Raphael Gontijo and Hajishirzi, Hannaneh and Farhadi, Ali and Namkoong, Hongseok and Schmidt, Ludwig},
    title     = {Robust Fine-Tuning of Zero-Shot Models},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2022},
    pages     = {7959-7971}
}

@misc{stepanov2024gliner,
      title={GLiNER multi-task: Generalist Lightweight Model for Various Information Extraction Tasks}, 
      author={Ihor Stepanov and Mykhailo Shtopko},
      year={2024},
      eprint={2406.12925},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

@misc{zaratiana2023gliner,
      title={GLiNER: Generalist Model for Named Entity Recognition using Bidirectional Transformer}, 
      author={Urchade Zaratiana and Nadi Tomeh and Pierre Holat and Thierry Charnois},
      year={2023},
      eprint={2311.08526},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
