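POINTS-Seeker: an advanced multimodal agentic search model built from scratch for long-horizon, knowledge-intensive visual reasoning tasks. It is trained natively with Agentic Seeding and uses V-Fold adaptive history-aware compression to overcome the limits of static parametric knowledge. [This summary was AI-generated.]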

🌟 Model Overview

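POINTS-Seeker-8B is an advanced multimodal agentic search model built from scratch to overcome the limits of static parametric knowledge in large multimodal models (LMMs). Rather than simply bolting search tools onto an existing LMM, POINTS-Seeker is trained natively with Agentic Seeding, a dedicated training phase that instills the foundational prerequisites for agentic behavior. It is also equipped with V-Fold, an adaptive history-aware compression mechanism that addresses the performance bottleneck in long-horizon interactions. POINTS-Seeker-8B excels on long-horizon, knowledge-intensive visual reasoning tasks.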
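
This overview does not describe how V-Fold works internally, so the snippet below is not the model's algorithm. It is only a minimal, generic sketch of the broader idea of history compression in a long agent interaction: recent turns are kept verbatim while older ones are shortened. The names `Turn` and `fold_history`, the `keep_recent` and `max_chars` parameters, and the truncation rule are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Turn:
    role: str  # e.g. 'assistant' or 'tool' (hypothetical labels)
    text: str  # raw content of the turn


def fold_history(turns: List[Turn], keep_recent: int = 2,
                 max_chars: int = 40) -> List[Turn]:
    """Keep the last `keep_recent` turns verbatim and truncate older ones.

    Hypothetical stand-in for history compression; NOT the V-Fold algorithm.
    """
    # Index separating "old" turns (to be folded) from "recent" ones.
    cut = len(turns) if keep_recent <= 0 else max(len(turns) - keep_recent, 0)
    folded = []
    for turn in turns[:cut]:
        text = turn.text
        if len(text) > max_chars:
            # Crude compression: keep only a prefix of an old turn.
            text = text[:max_chars] + ' ...[folded]'
        folded.append(Turn(turn.role, text))
    return folded + turns[cut:]
```

A real system would likely replace the truncation step with a learned or adaptive summary; the point here is only the split between a compressed past and an uncompressed recent window.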

Quick Start

Running with Transformers

First, install WePOINTS with the following commands:

git clone https://github.com/WePOINTS/WePOINTS.git
cd ./WePOINTS
pip install -e .

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, Qwen2VLImageProcessor

user_prompt = 'explain the image'  # replace with your instruction
image_path = 'your image path'     # replace with the path to your image
model_path = 'tencent/POINTS-Seeker'

# Load the model with its custom remote code, in bfloat16 on the GPU.
# Note: older transformers releases name the `dtype` argument `torch_dtype`.
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    trust_remote_code=True,
    dtype=torch.bfloat16,
    device_map='cuda',
)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
image_processor = Qwen2VLImageProcessor.from_pretrained(model_path)

# A single user turn: the image followed by the text instruction.
content = [
    dict(type='image', image=image_path),
    dict(type='text', text=user_prompt),
]
messages = [{'role': 'user', 'content': content}]

# Greedy decoding, up to 2048 new tokens.
generation_config = {
    'max_new_tokens': 2048,
    'do_sample': False,
}
response = model.chat(messages, tokenizer, image_processor, generation_config)
print(response)
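
The message structure used above can be wrapped in a small helper so it is less error-prone to build by hand. This is a minimal sketch: `build_messages` is a hypothetical helper, not part of WePOINTS or Transformers, and passing more than one image in a single turn is an assumption that this README does not confirm.

```python
from typing import Dict, List


def build_messages(image_paths: List[str], prompt: str) -> List[Dict]:
    """Build a single user turn: image entries first, then the text prompt.

    Hypothetical helper; mirrors the `messages` layout from the example above.
    """
    content = [dict(type='image', image=path) for path in image_paths]
    content.append(dict(type='text', text=prompt))
    return [{'role': 'user', 'content': content}]


# Same single-image request as in the example above.
messages = build_messages(['your image path'], 'explain the image')
```

The resulting `messages` list can then be passed to `model.chat` in place of the hand-written one.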

Multimodal Agentic Search

Please refer to our GitHub repository.