a369041/DeepSeek-V4-Pro-w4a8-mtp

DeepSeek-V4-Pro-w4a8-mtp

This is a quantized variant of deepseek-ai/DeepSeek-V4-Pro with W4A8 weight-activation quantization and Multi-Token Prediction (MTP) support.

Model Details

  • Base Model: DeepSeek-V4-Pro
  • Quantization: W4A8 (4-bit weights, 8-bit activations)
  • Architecture: deepseek_v4
  • Library: transformers
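
To make the W4A8 label concrete, the sketch below quantizes a small weight vector to signed 4-bit integers and an activation vector to signed 8-bit integers, accumulates their dot product in integers, and rescales the result. It is a generic illustration of symmetric per-tensor quantization only. The helper names `quantize_symmetric` and `w4a8_dot` are hypothetical, and the scheme this checkpoint actually uses (group sizes, zero points, calibration) is not described on this page and may differ.

```python
def quantize_symmetric(values, bits):
    """Symmetric per-tensor quantization to signed `bits`-bit integers.

    Illustrative assumption: one scale per tensor and no zero point.
    """
    qmax = 2 ** (bits - 1) - 1  # 7 for 4-bit, 127 for 8-bit
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer level and clamp to the symmetric range.
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return quantized, scale


def w4a8_dot(weights, activations):
    """Dot product with 4-bit weights and 8-bit activations."""
    qw, w_scale = quantize_symmetric(weights, bits=4)
    qa, a_scale = quantize_symmetric(activations, bits=8)
    # Accumulate in integers, then apply both scales once at the end.
    acc = sum(w * a for w, a in zip(qw, qa))
    return acc * w_scale * a_scale


weights = [0.12, -0.40, 0.33, 0.05]
activations = [1.0, -2.0, 0.25, 3.0]

exact = sum(w * a for w, a in zip(weights, activations))
approx = w4a8_dot(weights, activations)
print(f"exact={exact:.4f} approx={approx:.4f} error={abs(exact - approx):.4f}")
```

The gap between the two printed values is the quantization error for this toy input. The coarser 4-bit weight grid generally contributes more of it than the 8-bit activation grid.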

Usage

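from transformers import AutoModelForCausalLM, AutoTokenizer

# Path to a local copy of this checkpoint. When loading from a hub instead,
# use the full repo id with its namespace: "a369041/DeepSeek-V4-Pro-w4a8-mtp".
model_path = "DeepSeek-V4-Pro-w4a8-mtp"

# trust_remote_code=True executes Python code shipped with the checkpoint,
# so review that code before enabling it.
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    trust_remote_code=True,
    device_map="auto",  # needs the accelerate package; places layers on available devices
)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)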

License

This model inherits the MIT license from the original DeepSeek-V4-Pro model.
