🚀 Falcon-7B

Falcon-7B is a 7B-parameter causal decoder-only model built by TII and trained on 1,500B tokens of RefinedWeb enhanced with curated corpora. It is made available under the Apache 2.0 license.

Why use Falcon-7B?

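It outperforms comparable open-source models (e.g., MPT-7B, StableLM, RedPajama), thanks to being trained on 1,500B tokens of RefinedWeb enhanced with curated corpora. See the OpenLLM Leaderboard.
It features an architecture optimized for inference, with FlashAttention (Dao et al., 2022) and multiquery attention (Shazeer et al., 2019).
It is released under the permissive Apache 2.0 license, which allows commercial use without any royalties or restrictions.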
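
To give a feel for why multiquery attention helps at inference time, here is a minimal sketch of the key/value cache size with and without it. In multiquery attention all query heads share a single key/value head, so the cache shrinks by a factor equal to the number of query heads. The layer count, head count, head size, and sequence length below are illustrative assumptions, not values taken from this model card, and the helper `kv_cache_bytes` is hypothetical.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Size of the key/value cache for one sequence, in bytes."""
    # Two tensors (keys and values) per layer,
    # each of shape seq_len x num_kv_heads x head_dim.
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_value


# Illustrative shapes only (hypothetical; not taken from this model card).
LAYERS, QUERY_HEADS, HEAD_DIM, SEQ_LEN = 32, 64, 64, 2048

# Classic multi-head attention: one key/value head per query head.
multi_head = kv_cache_bytes(LAYERS, QUERY_HEADS, HEAD_DIM, SEQ_LEN)
# Multiquery attention: a single key/value head shared by all query heads.
multi_query = kv_cache_bytes(LAYERS, 1, HEAD_DIM, SEQ_LEN)

print(f"multi-head KV cache:  {multi_head / 2**20:.0f} MiB")
print(f"multi-query KV cache: {multi_query / 2**20:.0f} MiB")
print(f"reduction factor:     {multi_head // multi_query}x")
```

A smaller cache leaves more memory for larger batches or longer sequences, which is the main reason this design is attractive for serving.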

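⚠️ This is a raw, pretrained model that should be further fine-tuned for most use cases. If you are looking for a version better suited to taking generic instructions in a chat format, we recommend taking a look at Falcon-7B-Instruct.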

🔥 Looking for an even more powerful model? Falcon-40B is Falcon-7B's big brother!

💥 Falcon LLMs require PyTorch 2.0 for use with openmind!

For fast inference with Falcon, check out Text Generation Inference! Read more in this blog post.

You will need at least 16GB of memory to run inference with Falcon-7B smoothly.
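
The 16GB figure is easier to interpret with a quick back-of-the-envelope calculation. The sketch below estimates the memory taken by the weights alone, assuming a nominal 7 billion parameters and 2 bytes per parameter (bfloat16). Both numbers are assumptions made for illustration, and the estimate ignores activations, the key/value cache, and framework overhead.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int) -> float:
    """Memory needed to hold the weights alone, in GiB."""
    return num_params * bytes_per_param / 2**30


NUM_PARAMS = 7e9  # nominal "7B"; the exact parameter count is an assumption

bf16_gib = weight_memory_gib(NUM_PARAMS, 2)  # bfloat16: 2 bytes per parameter
fp32_gib = weight_memory_gib(NUM_PARAMS, 4)  # float32: 4 bytes per parameter

print(f"bfloat16 weights: {bf16_gib:.1f} GiB")
print(f"float32 weights:  {fp32_gib:.1f} GiB")
```

Under these assumptions the bfloat16 weights come to roughly 13 GiB, which leaves only a small margin within 16GB for everything else. Loading in float32 would roughly double that.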

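Model Card for Falcon-7B

Model Details

Model Description

Model type: Causal decoder-only;
Language(s) (NLP): English, German, Spanish, French (and limited capabilities in Italian, Portuguese, Polish, Dutch, Romanian, Czech, Swedish);
License: Apache 2.0.

Model Source

Paper: coming soon.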

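Uses

Direct Use

Research on large language models; as a foundation for further specialization and fine-tuning for specific use cases (e.g., summarization, text generation, chatbots).

Out-of-Scope Use

Production use without adequate assessment of risks and mitigation; any use case that may be considered irresponsible or harmful.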

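Bias, Risks, and Limitations

Falcon-7B is trained on English and French data only and will not generalize appropriately to other languages. Furthermore, because it is trained on large-scale corpora representative of the web, it will carry the stereotypes and biases commonly encountered online.

Recommendations

We recommend that users of Falcon-7B consider fine-tuning it for their specific set of target tasks, and that guardrails and appropriate precautions be taken for any production use.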

How to Get Started with the Model

import argparse

import torch
import torch_npu  # noqa: F401  (imported for its side effect: enables Ascend NPU support in PyTorch)

import openmind
from openmind import AutoTokenizer

def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--model_name_or_path",
        type=str,
        help="Model name or local path to load.",
        default="Jinan_AICC/Falcon-7B",
    )
    args = parser.parse_args()
    return args

args = parse_args()
model_path = args.model_name_or_path

# Load the tokenizer, then build a text-generation pipeline around the model.
tokenizer = AutoTokenizer.from_pretrained(model_path)
generator = openmind.pipeline(
    "text-generation",
    model=model_path,
    tokenizer=tokenizer,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",
)

# Sample one continuation, capped at 200 tokens including the prompt,
# drawing each token from the 10 most likely candidates.
sequences = generator(
    "Girafatron is obsessed with giraffes, the most glorious animal on the face of this Earth. Girafatron believes all other animals are irrelevant when compared to the glorious majesty of the giraffe.\nDaniel: Hello, Girafatron!\nGirafatron:",
    max_length=200,
    do_sample=True,
    top_k=10,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
)
for seq in sequences:
    print(f"Result: {seq['generated_text']}")
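
The example above passes do_sample=True and top_k=10, which restricts each sampling step to the 10 highest-scoring tokens. The following is a minimal, standard-library sketch of that idea on a toy vocabulary. The helper `sample_top_k` and the logits are hypothetical, written only for illustration; this is not how openmind implements sampling internally.

```python
import math
import random


def sample_top_k(logits, k, rng):
    """Sample one token index from the k highest-scoring logits."""
    # Keep the indices of the k largest logits.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax weights over the kept logits only (shifted by the max for stability).
    peak = max(logits[i] for i in top)
    weights = [math.exp(logits[i] - peak) for i in top]
    # Draw one index in proportion to those weights.
    return rng.choices(top, weights=weights, k=1)[0]


rng = random.Random(0)  # fixed seed so repeated runs give the same draws
logits = [2.0, 0.5, -1.0, 3.0, 0.0, 1.5]  # toy scores for a 6-token vocabulary
draws = [sample_top_k(logits, k=3, rng=rng) for _ in range(1000)]

# Only the three highest-scoring indices can ever appear.
print(sorted(set(draws)))
```

A small k keeps generations focused on likely tokens, while a larger k allows more varied output. With k=1 the procedure reduces to greedy decoding.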