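Update the transformers library to version 4.22+.

Pretraining large neural language models, such as BERT, has led to impressive gains on many natural language processing (NLP) tasks. However, most pretraining efforts focus on general-domain corpora, such as newswire and web text. A prevailing assumption is that even domain-specific pretraining can benefit from starting with a general-domain language model. Recent work shows that for domains with abundant unlabeled text, such as biomedicine, pretraining language models from scratch yields substantial gains over continual pretraining of general-domain language models. Follow-up work explores larger model sizes and their impact on performance on the BLURB benchmark.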
This BiomedBERT was pretrained from scratch using abstracts from PubMed.
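This card asks for transformers 4.22 or later. Below is a minimal sketch of one way to check that requirement before loading the model. The helper names (`parse_release`, `version_at_least`, `transformers_is_recent_enough`) are hypothetical and not part of transformers. The comparison looks only at the leading numeric release segment, so a pre-release such as `4.22.0.dev0` is treated the same as `4.22.0`.

```python
import re
from importlib.metadata import PackageNotFoundError, version
from typing import Tuple

# Minimum version taken from the note in this model card.
MIN_TRANSFORMERS = "4.22"


def parse_release(v: str) -> Tuple[int, ...]:
    """Return the leading numeric release segment, e.g. '4.22.0.dev0' -> (4, 22, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", v)
    if match is None:
        raise ValueError(f"unrecognized version string: {v!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def version_at_least(installed: str, minimum: str) -> bool:
    """Compare release segments numerically, padding the shorter one with zeros."""
    a, b = parse_release(installed), parse_release(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def transformers_is_recent_enough(minimum: str = MIN_TRANSFORMERS) -> bool:
    """Return False if transformers is missing or older than `minimum`."""
    try:
        installed = version("transformers")
    except PackageNotFoundError:
        return False
    return version_at_least(installed, minimum)


if __name__ == "__main__":
    if transformers_is_recent_enough():
        print("transformers meets the minimum version")
    else:
        print(f"please install transformers>={MIN_TRANSFORMERS}")
```

If the check passes, the model can be loaded in the usual way with `AutoTokenizer.from_pretrained` and `AutoModel.from_pretrained`, passing this model's Hub identifier, which is not stated in this section.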
If you find BiomedBERT useful in your research, please cite the following paper:
@misc{https://doi.org/10.48550/arxiv.2112.07869,
doi = {10.48550/ARXIV.2112.07869},
url = {https://arxiv.org/abs/2112.07869},
author = {Tinn, Robert and Cheng, Hao and Gu, Yu and Usuyama, Naoto and Liu, Xiaodong and Naumann, Tristan and Gao, Jianfeng and Poon, Hoifung},
keywords = {Computation and Language (cs.CL), Machine Learning (cs.LG), FOS: Computer and information sciences},
title = {Fine-Tuning Large Neural Language Models for Biomedical Natural Language Processing},
publisher = {arXiv},
year = {2021},
copyright = {arXiv.org perpetual, non-exclusive license}
}