Stoic

A fast and accurate tool for protein stoichiometry prediction


[Figure: model architecture]

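Stoic predicts the copy number of each component of a protein complex directly from sequence. It can also export an AF3-ready JSON file based on the top-ranked predicted stoichiometry.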

Web version (Hugging Face Space): stoic-space
Preprint: Stoic: Fast and accurate protein stoichiometry prediction

Installation

1. Create and activate an environment

venv

python -m venv .venv
source .venv/bin/activate

conda / mamba

mamba create -n stoic-env python=3.10 -y
mamba activate stoic-env

2. Install Stoic (with the environment activated)

Install from a local clone (editable mode)

git clone https://github.com/PickyBinders/stoic.git
cd stoic
python -m pip install --upgrade pip
python -m pip install -e .

Install directly from GitHub

python -m pip install git+https://github.com/PickyBinders/stoic.git

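Note: The first inference run needs an internet connection to download the model weights from Hugging Face. Later runs reuse the files cached in ~/.cache/huggingface, so once the model is cached it can be used offline.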
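To check whether the weights are already cached before going offline, a minimal sketch along these lines may help. It assumes the standard Hugging Face Hub cache layout (a `hub/models--<org>--<name>` folder under the cache root) and that the root has not been redirected with `HF_HUB_CACHE`. The helper name `hub_cache_dir_for` is hypothetical and not part of Stoic.

```python
import os
from pathlib import Path
from typing import Optional


def hub_cache_dir_for(repo_id: str, hf_home: Optional[Path] = None) -> Path:
    """Return the folder where the Hub client would cache a model repo.

    Assumes the default cache layout; HF_HUB_CACHE overrides are ignored here.
    """
    if hf_home is None:
        # HF_HOME, when set, replaces the default ~/.cache/huggingface root.
        default_root = Path.home() / ".cache" / "huggingface"
        hf_home = Path(os.environ.get("HF_HOME", str(default_root)))
    # Model repos are stored as models--<org>--<name> inside the hub/ subfolder.
    return Path(hf_home) / "hub" / ("models--" + repo_id.replace("/", "--"))


cache_dir = hub_cache_dir_for("PickyBinders/stoic")
print(cache_dir, "(cached)" if cache_dir.is_dir() else "(not cached yet)")
```

If the folder exists, setting the environment variable `HF_HUB_OFFLINE=1` should keep the Hub client from attempting network access. Whether Stoic needs any other downloads at inference time is not stated here, so treat this as a first check only.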

Predicting stoichiometry from the command line

The stoic_predict_stoichiometry command accepts:

  1. a list of sequences,
  2. a single FASTA file,
  3. a directory of FASTA files (each FASTA is treated as a separate complex).

usage: stoic_predict_stoichiometry [-h]
                                   [--sequences SEQ [SEQ ...] | --input-path INPUT_PATH]
                                   [--model MODEL]
                                   [--top-n TOP_N]
                                   [--return-residue-weights]
                                   [--max-inference-seq-len MAX_INFERENCE_SEQ_LEN]
                                   [--output-dir OUTPUT_DIR]
                                   [--device DEVICE]

options:
  -h, --help            show this help message and exit
  --sequences SEQ [SEQ ...]
                        Protein sequences (one per unique chain)
  --input-path INPUT_PATH
                        Path to a FASTA file or a directory with FASTA files
  --model MODEL         HuggingFace model name or local path (default: PickyBinders/stoic)
  --top-n TOP_N         Number of top stoichiometry candidates (default: 3)
  --return-residue-weights
                        Return residue weights and save residue-level predictions
  --max-inference-seq-len MAX_INFERENCE_SEQ_LEN
                        Maximum sequence length for full-length inference
  --output-dir OUTPUT_DIR
                        Output directory for predictions and AF3 JSON files
  --device DEVICE       Device to use, e.g. cuda or cpu (default: auto-detect)
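When the command is driven from a script, assembling the argument list in one place can make the mutually exclusive inputs explicit. The sketch below uses only flags listed in the help text above. The helper name `build_stoic_command` is hypothetical, and the rule that exactly one input must be given is an assumption of this sketch, not something the help text states.

```python
from typing import List, Optional, Sequence


def build_stoic_command(
    sequences: Optional[Sequence[str]] = None,
    input_path: Optional[str] = None,
    top_n: int = 3,
    output_dir: Optional[str] = None,
    device: Optional[str] = None,
    return_residue_weights: bool = False,
) -> List[str]:
    """Assemble an argument list for stoic_predict_stoichiometry."""
    # The usage text shows --sequences and --input-path as alternatives;
    # this sketch additionally insists that exactly one of them is given.
    if (sequences is None) == (input_path is None):
        raise ValueError("provide exactly one of sequences or input_path")
    cmd = ["stoic_predict_stoichiometry"]
    if sequences is not None:
        cmd += ["--sequences", *sequences]
    else:
        cmd += ["--input-path", input_path]
    cmd += ["--top-n", str(top_n)]
    if return_residue_weights:
        cmd.append("--return-residue-weights")
    if output_dir is not None:
        cmd += ["--output-dir", output_dir]
    if device is not None:
        cmd += ["--device", device]
    return cmd


cmd = build_stoic_command(sequences=["SENECA", "VIRTVS"], top_n=3)
print(" ".join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` once Stoic is installed in the active environment.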

List of sequences

stoic_predict_stoichiometry \
  --sequences "SENECA" "VIRTVS" \
  --top-n 3

Single FASTA file

stoic_predict_stoichiometry \
  --input-path path/to/complex.fasta \
  --top-n 3

Directory of FASTA files

stoic_predict_stoichiometry \
  --input-path path/to/fasta_dir \
  --top-n 3 \
  --output-dir stoic_predictions

In directory mode, outputs are saved per complex (<fasta_stem>.json, <fasta_stem>_af3_input.json, and, optionally, residue-level predictions).

Output files

When --output-dir is provided:

  • Single input (list of sequences or a single FASTA file):
    • results.json
    • af3_input.json
    • residue_predictions.pkl (if --return-residue-weights is used)
  • FASTA directory input:
    • <complex_name>.json
    • <complex_name>_af3_input.json
    • <complex_name>_residue_predictions.pkl (if --return-residue-weights is used)
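For batch runs it can be handy to know in advance which files a FASTA directory should produce, for example to spot complexes whose outputs are missing. The sketch below follows the naming scheme listed above. The helper `expected_outputs` is hypothetical, and the set of recognised FASTA extensions is an assumption, since the extensions Stoic accepts are not listed here.

```python
import tempfile
from pathlib import Path
from typing import Dict, List, Sequence


def expected_outputs(
    fasta_dir: Path,
    output_dir: Path,
    residue_weights: bool = False,
    suffixes: Sequence[str] = (".fasta", ".fa", ".faa"),  # assumed extensions
) -> Dict[str, List[Path]]:
    """Map each FASTA stem in fasta_dir to the output paths it should yield."""
    outputs: Dict[str, List[Path]] = {}
    for fasta in sorted(Path(fasta_dir).iterdir()):
        if fasta.suffix.lower() not in suffixes:
            continue  # skip files that do not look like FASTA
        stem = fasta.stem
        files = [
            Path(output_dir) / f"{stem}.json",
            Path(output_dir) / f"{stem}_af3_input.json",
        ]
        if residue_weights:
            files.append(Path(output_dir) / f"{stem}_residue_predictions.pkl")
        outputs[stem] = files
    return outputs


# Example: a temporary directory with one toy complex.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "complex1.fasta").write_text(">chain_A\nSENECA\n")
    for stem, files in expected_outputs(Path(tmp), Path("stoic_predictions")).items():
        missing = [f.name for f in files if not f.exists()]
        print(stem, "missing:", missing)
```

The JSON files can then be opened with `json.load`. Their internal structure is not described in this README, so inspect a real output before relying on specific keys.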

Using the Python API

High-level inference helper

from stoic.predict_stoichiometry import predict_stoichiometry

results = predict_stoichiometry(
    sequences=["SENECA", "VIRTVS"],  # or FASTA path / FASTA dir path
    model_name="PickyBinders/stoic",
    top_n=3,
)
print(results)

Loading the model directly from Hugging Face

import torch
from stoic.model import Stoic


device = "cuda" if torch.cuda.is_available() else "cpu"
model = Stoic.from_pretrained("PickyBinders/stoic")
model.eval().to(device)
pred = model.predict_stoichiometry(["SENECA", "VIRTVS"], top_n=3)
print(pred)

Citation

If you use Stoic, please cite:

@article{litvinov2026stoic,
  title   = {Stoic: Fast and accurate protein stoichiometry prediction},
  author  = {Litvinov, Daniil and Pantolini, Lorenzo and {\v{S}}krinjar, Peter and Tauriello, Gerardo and McCafferty, Caitlyn L and Engel, Benjamin D and Schwede, Torsten and Durairaj, Janani},
  journal = {bioRxiv},
  year    = {2026},
  doi     = {10.64898/2026.03.13.711535},
  url     = {https://www.biorxiv.org/content/10.64898/2026.03.13.711535v1}
}