openMind

import argparse

import torch
from openmind import is_torch_npu_available
from openmind import AutoTokenizer, AutoModelForCausalLM


def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--model_name_or_path",
        type=str,
        help="Path to model",
        default=None,
    )

    args = parser.parse_args()
    return args


def main():
    args = parse_args()
    if args.model_name_or_path:
        model_path = args.model_name_or_path
    else:
        model_path = "../"

    if is_torch_npu_available():
        device = "npu:0"
    else:
        device = "cpu"
    # Load the tokenizer and weights from the GGUF file found at model_path,
    # then move the model to the selected device.
    file_name = "SmolLM2-1.7B-Instruct.F16.gguf"
    tokenizer = AutoTokenizer.from_pretrained(model_path, gguf_file=file_name)
    model = AutoModelForCausalLM.from_pretrained(model_path, gguf_file=file_name).to(device)

    # Tokenize the prompt on the model's device and sample a short continuation.
    input_ids = tokenizer("Gra", return_tensors="pt").to(model.device)["input_ids"]
    output = model.generate(input_ids, max_new_tokens=48, do_sample=True, temperature=0.7)
    print(tokenizer.decode(output[0]))


if __name__ == "__main__":
    main()

SmolLM2-1.7B-Instruct-GGUF

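File name	Size	Quantization	Description
`SmolLM2-1.7B-Instruct.F16.gguf`	3.42 GB	FP16	Unquantized 16-bit floating point. Most accurate of the files listed here; suited to high-performance setups.
`SmolLM2-1.7B-Instruct.Q4_K_M.gguf`	1.06 GB	Q4	4-bit quantization. Prioritizes memory efficiency and faster inference at the cost of some accuracy.
`SmolLM2-1.7B-Instruct.Q5_K_M.gguf`	1.23 GB	Q5	5-bit quantization. A compromise between memory footprint and accuracy.
`SmolLM2-1.7B-Instruct.Q8_0.gguf`	1.82 GB	Q8	8-bit quantization. Moderate resource use, with better accuracy than the lower-bit files.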
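
As a rough sanity check on the table above, each file size can be converted into an approximate number of bits per weight. The sketch below assumes a parameter count of about 1.71 billion (inferred from the "1.7B" in the model name, not stated in this README), treats 1 GB as 10^9 bytes, and ignores metadata overhead, so the results are estimates only.

```python
# Approximate bits per weight for each GGUF file listed in the table.
# ASSUMPTION: ~1.71e9 parameters (inferred from the "1.7B" in the model name).
PARAM_COUNT = 1.71e9

# File sizes in GB, copied from the table above.
FILE_SIZES_GB = {
    "SmolLM2-1.7B-Instruct.F16.gguf": 3.42,
    "SmolLM2-1.7B-Instruct.Q4_K_M.gguf": 1.06,
    "SmolLM2-1.7B-Instruct.Q5_K_M.gguf": 1.23,
    "SmolLM2-1.7B-Instruct.Q8_0.gguf": 1.82,
}


def bits_per_weight(size_gb: float, param_count: float = PARAM_COUNT) -> float:
    """Estimate the average bits stored per parameter (1 GB taken as 1e9 bytes)."""
    return size_gb * 1e9 * 8 / param_count


for name, size_gb in FILE_SIZES_GB.items():
    print(f"{name}: ~{bits_per_weight(size_gb):.1f} bits per weight")
```

Under these assumptions the F16 file works out to about 16 bits per weight, and the quantized files come out somewhat above their nominal 4, 5, and 8 bits.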

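Running with Ollama 🦙

Overview

Ollama is a tool for running large language models locally. This guide walks you through downloading, installing, and running your own GGUF model in a few minutes.

Download and Install Ollama 🦙

First, download Ollama from https://ollama.com/download and install it on your Windows or Mac system.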

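Steps to Run a GGUF Model

1. Create the Model File

First, create a model file (Ollama calls this a Modelfile) and give it a suitable name. For example, you can name it metallama.

2. Add the Template Command

In your model file, include a FROM line that specifies the base model file you want to use. For example: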

FROM SmolLM2-1.7B-Instruct.F16.gguf

Make sure the GGUF file is in the same directory as your model file.
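
Steps 1 and 2 can also be scripted. The following is a minimal sketch and not part of Ollama itself: the helper name `write_model_file` is hypothetical, and it simply writes a one-line model file next to the GGUF file, refusing to continue if that file is missing.

```python
from pathlib import Path


def write_model_file(directory: str, gguf_name: str, model_file_name: str = "metallama") -> Path:
    """Write a model file containing a single FROM line (hypothetical helper)."""
    folder = Path(directory)
    gguf_path = folder / gguf_name
    # The FROM line uses a bare file name, so the GGUF file must sit in the same directory.
    if not gguf_path.is_file():
        raise FileNotFoundError(f"GGUF file not found: {gguf_path}")
    model_file = folder / model_file_name
    model_file.write_text(f"FROM {gguf_name}\n", encoding="utf-8")
    return model_file
```

For example, `write_model_file(".", "SmolLM2-1.7B-Instruct.F16.gguf")` would produce a file named metallama in the current directory, provided the GGUF file is already there.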

3. Create and Patch the Model

Open a terminal and run the following command to create and patch your model:

ollama create metallama -f ./metallama

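Once the process completes successfully, you will see a confirmation message.

To verify that the model was created, list all models with the following command: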

ollama list

Make sure metallama appears in the list of models.

Running the Model

To run your newly created model, use the following command in the terminal:
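
The create-and-verify step can be automated in the same spirit. This sketch assumes the `ollama` executable is on your PATH and that `ollama list` prints one model per line with the name, optionally followed by a `:tag`, in the first column. The helper names are hypothetical.

```python
import subprocess


def model_is_listed(list_output: str, model_name: str) -> bool:
    """Return True if model_name appears in the first column of `ollama list` output."""
    for line in list_output.splitlines():
        columns = line.split()
        if not columns:
            continue
        # ASSUMPTION: names look like "metallama:latest"; compare without the tag.
        if columns[0].split(":")[0] == model_name:
            return True
    return False


def create_and_verify(model_name: str, model_file: str) -> bool:
    """Run `ollama create`, then confirm the model shows up in `ollama list`."""
    subprocess.run(["ollama", "create", model_name, "-f", model_file], check=True)
    listing = subprocess.run(
        ["ollama", "list"], check=True, capture_output=True, text=True
    )
    return model_is_listed(listing.stdout, model_name)
```

Calling `create_and_verify("metallama", "./metallama")` mirrors the two commands above; `check=True` makes a failed `ollama create` raise an error instead of passing silently.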

ollama run metallama

Example Usage

In the command prompt, you can run:

D:\>ollama run metallama

You can then interact with the model like this:

>>> write a mini passage about space x
Space X, the private aerospace company founded by Elon Musk, is revolutionizing the field of space exploration.
With its ambitious goals to make humanity a multi-planetary species and establish a sustainable human presence in
the cosmos, Space X has become a leading player in the industry. The company's spacecraft, like the Falcon 9, have
demonstrated remarkable capabilities, allowing for the transport of crews and cargo into space with unprecedented
efficiency. As technology continues to advance, the possibility of establishing permanent colonies on Mars becomes
increasingly feasible, thanks in part to the success of reusable rockets that can launch multiple times without
sustaining significant damage. The journey towards becoming a multi-planetary species is underway, and Space X
plays a pivotal role in pushing the boundaries of human exploration and settlement.
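
Besides the interactive prompt, a running Ollama instance also exposes a local HTTP API. The sketch below sends a prompt to the `/api/generate` endpoint using only the Python standard library. It assumes Ollama is listening on its default address `http://localhost:11434` and that the model was created under the name metallama; the function names are illustrative.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # ASSUMPTION: default local address


def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming generate request for the local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str) -> str:
    """Send the prompt and return the generated text from the JSON reply."""
    with urllib.request.urlopen(build_request(model, prompt)) as reply:
        return json.loads(reply.read().decode("utf-8"))["response"]


# Example call (requires a running Ollama server):
# print(generate("metallama", "write a mini passage about space x"))
```

Setting `"stream": False` asks the server for a single JSON reply rather than a stream of partial chunks, which keeps the parsing to one `json.loads` call.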

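Conclusion

With these simple steps, you can download, install, and run your own models with Ollama. Whether you are exploring what Llama can do or building your own custom models, Ollama keeps the process simple and efficient.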
