OmniCoder-9B-GGUF

OmniCoder-9B 的 GGUF 量化版本

可用量化版本

量化类型	大小	适用场景
`Q2_K`	~3.8 GB	极致压缩，最低质量
`Q3_K_S`	~4.3 GB	小巧体积
`Q3_K_M`	~4.6 GB	小巧体积，均衡表现
`Q3_K_L`	~4.9 GB	小巧体积，更高质量
`Q4_0`	~5.3 GB	良好平衡
`Q4_K_S`	~5.4 GB	良好平衡
`Q4_K_M`	~5.7 GB	推荐大多数用户使用
`Q5_0`	~6.3 GB	高质量
`Q5_K_S`	~6.3 GB	高质量
`Q5_K_M`	~6.5 GB	高质量，均衡表现
`Q6_K`	~7.4 GB	接近无损
`Q8_0`	~9.5 GB	最高质量量化
`BF16`	~17.9 GB	全精度

使用方法

# Install llama.cpp
brew install llama.cpp  # macOS
# or build from source: https://github.com/ggml-org/llama.cpp

# Interactive chat
llama-cli --hf-repo Tesslate/OmniCoder-9B-GGUF --hf-file omnicoder-9b-q4_k_m.gguf -p "Your prompt" -c 8192

# Server mode (OpenAI-compatible API)
llama-server --hf-repo Tesslate/OmniCoder-9B-GGUF --hf-file omnicoder-9b-q4_k_m.gguf -c 8192

由 Tesslate 构建 | 查看完整模型卡片：OmniCoder-9B

OmniCoder-9B-GGUF

OmniCoder-9B 的 GGUF 量化版本

可用量化版本

量化类型	大小	适用场景
`Q2_K`	~3.8 GB	极致压缩，最低质量
`Q3_K_S`	~4.3 GB	小巧体积
`Q3_K_M`	~4.6 GB	小巧体积，均衡表现
`Q3_K_L`	~4.9 GB	小巧体积，更高质量
`Q4_0`	~5.3 GB	良好平衡
`Q4_K_S`	~5.4 GB	良好平衡
`Q4_K_M`	~5.7 GB	推荐大多数用户使用
`Q5_0`	~6.3 GB	高质量
`Q5_K_S`	~6.3 GB	高质量
`Q5_K_M`	~6.5 GB	高质量，均衡表现
`Q6_K`	~7.4 GB	接近无损
`Q8_0`	~9.5 GB	最高质量量化
`BF16`	~17.9 GB	全精度

使用方法

# Install llama.cpp
brew install llama.cpp  # macOS
# or build from source: https://github.com/ggml-org/llama.cpp

# Interactive chat
llama-cli --hf-repo Tesslate/OmniCoder-9B-GGUF --hf-file omnicoder-9b-q4_k_m.gguf -p "Your prompt" -c 8192

# Server mode (OpenAI-compatible API)
llama-server --hf-repo Tesslate/OmniCoder-9B-GGUF --hf-file omnicoder-9b-q4_k_m.gguf -c 8192

由 Tesslate 构建 | 查看完整模型卡片：OmniCoder-9B