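FinBERT is a BERT model pre-trained on financial communication text. Its purpose is to advance research and practice in financial NLP. It was trained on the following three financial communication corpora, which together total 4.9 billion tokens.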

Corporate reports (10-K and 10-Q): 2.5 billion tokens
Earnings call transcripts: 1.3 billion tokens
Analyst reports: 1.1 billion tokens

More technical details on FinBERT: click the link

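This release, the finbert-tone model, is the FinBERT model fine-tuned on 10,000 sentences from analyst reports, each manually annotated as positive, negative, or neutral. The model performs strongly on financial tone analysis. If you only want to use FinBERT for financial tone analysis, give it a try.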

If you use this model in academic work, please cite the following paper:

Huang, Allen H., Hui Wang, and Yi Yang. "FinBERT: A Large Language Model for Extracting Information from Financial Text." Contemporary Accounting Research (2022).

How to use

You can use this model with the Transformers pipeline for sentiment analysis.

from transformers import BertTokenizer, BertForSequenceClassification
from transformers import pipeline

# Load the fine-tuned three-class classifier and its matching tokenizer.
finbert = BertForSequenceClassification.from_pretrained('Beijing-Ascend/finbert-tone', num_labels=3)
tokenizer = BertTokenizer.from_pretrained('Beijing-Ascend/finbert-tone')

# Wrap model and tokenizer in a sentiment-analysis pipeline.
nlp = pipeline("sentiment-analysis", model=finbert, tokenizer=tokenizer)

sentences = ["there is a shortage of capital, and we need extra financing",
             "growth is strong and we have plenty of liquidity",
             "there are doubts about our finances",
             "profits are flat"]

# One result per sentence, each with a label and a score.
results = nlp(sentences)
print(results)  # LABEL_0: neutral; LABEL_1: positive; LABEL_2: negative
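
The pipeline returns one dictionary per sentence with a `label` and a `score` key. As a minimal sketch, assuming the labels come back as the generic `LABEL_n` ids listed in the comment above, a small helper can swap them for tone names. The helper name `readable` and the sample input are hypothetical, and the scores are placeholders rather than real model output.

```python
# Mapping taken from the comment in the usage snippet above.
LABEL_NAMES = {
    "LABEL_0": "neutral",
    "LABEL_1": "positive",
    "LABEL_2": "negative",
}


def readable(results):
    """Replace generic LABEL_n ids with tone names; pass unknown labels through."""
    return [
        {"label": LABEL_NAMES.get(r["label"], r["label"]), "score": r["score"]}
        for r in results
    ]


# Hypothetical pipeline output for illustration only (scores are placeholders).
sample = [
    {"label": "LABEL_2", "score": 0.9},
    {"label": "LABEL_1", "score": 0.8},
]

for item in readable(sample):
    print(item["label"], item["score"])
```

In practice you would call `readable(results)` on the pipeline output. If the model configuration already supplies its own label names, they are left unchanged.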
