This is a quantized variant of deepseek-ai/DeepSeek-V4-Pro with W4A8 weight-activation quantization and Multi-Token Prediction (MTP) support.
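
As a rough illustration of what "W4A8" means (4-bit weights, 8-bit activations), the sketch below applies symmetric integer quantization to a single linear layer using NumPy. It is a minimal sketch under stated assumptions, not the quantization recipe used for this checkpoint: the helper names `quantize_symmetric` and `w4a8_matmul`, the per-row scaling, and the tensor shapes are all hypothetical and chosen only for illustration.

```python
import numpy as np


def quantize_symmetric(x, num_bits, axis=-1):
    """Symmetric quantization to signed integers of width num_bits.

    Hypothetical helper for illustration; real W4A8 kernels typically
    pack two 4-bit values per byte and may use group-wise scales.
    """
    qmax = 2 ** (num_bits - 1) - 1  # 7 for 4-bit, 127 for 8-bit
    max_abs = np.max(np.abs(x), axis=axis, keepdims=True)
    scale = np.maximum(max_abs, 1e-12) / qmax  # avoid division by zero
    q = np.clip(np.round(x / scale), -qmax, qmax).astype(np.int8)
    return q, scale


def w4a8_matmul(x, w):
    """Approximate x @ w.T using 8-bit activations and 4-bit weights.

    x: (tokens, in_features) float activations
    w: (out_features, in_features) float weights
    """
    xq, x_scale = quantize_symmetric(x, num_bits=8)  # one scale per token
    wq, w_scale = quantize_symmetric(w, num_bits=4)  # one scale per output channel
    # Accumulate in int32, then rescale back to floating point.
    acc = xq.astype(np.int32) @ wq.astype(np.int32).T
    return acc * x_scale * w_scale.T


rng = np.random.default_rng(0)
x = rng.standard_normal((4, 64))
w = rng.standard_normal((8, 64))
approx = w4a8_matmul(x, w)
exact = x @ w.T
rel_err = np.linalg.norm(approx - exact) / np.linalg.norm(exact)
```

Here `rel_err` measures how far the quantized product drifts from the full-precision one. Most of the error comes from the 4-bit weights, which is why practical schemes often use finer-grained scales than the per-row scales assumed here.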

from transformers import AutoModelForCausalLM, AutoTokenizer

# trust_remote_code=True lets transformers run the custom model code shipped with the checkpoint.
# device_map="auto" places the weights across available devices (requires the accelerate package).
model = AutoModelForCausalLM.from_pretrained(
    "DeepSeek-V4-Pro-w4a8-mtp",
    trust_remote_code=True,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("DeepSeek-V4-Pro-w4a8-mtp", trust_remote_code=True)

This model inherits the MIT license from the original DeepSeek-V4-Pro model.