
Hyper-SD

Official repository of the paper Hyper-SD.

Project page: https://hyper-sd.github.io/

Hyper-SD preview image

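News🔥🔥🔥

  • Aug. 26, 2024. 💥💥💥 Our 8-step and 16-step FLUX.1-dev-related LoRAs are available now! We recommend a LoRA scale of about 0.125, which is adapted to the training setup, and the guidance scale can be kept at 3.5. Lower-step LoRAs are coming soon. 💥💥💥
  • Aug. 19, 2024. SD3-related CFG LoRAs are available now! We recommend setting the guidance scale to 3.0/5.0/7.0 at 4/8/16 steps. Before running inference with diffusers, do not forget to fuse the LoRA with a relatively small scale (e.g. 0.125, which is adapted to the training setup). Note that the 8-step and 16-step LoRAs can also run inference with slightly fewer steps, about 6 and 12 steps respectively. We look forward to your feedback; FLUX-related models will be released next week.
  • May 13, 2024. The 12-step CFG-preserved Hyper-SDXL-12steps-CFG-LoRA and Hyper-SD15-12steps-CFG-LoRA are also available now (supporting guidance scales of 5 to 8). They offer a better trade-off between performance and speed. Enjoy!
  • Apr. 30, 2024. Our 8-step CFG-preserved Hyper-SDXL-8steps-CFG-LoRA and Hyper-SD15-8steps-CFG-LoRA are available now (supporting guidance scales of 5 to 8). We strongly recommend making the 8-step CFG LoRA a standard configuration for all SDXL and SD15 models!
  • Apr. 28, 2024. ComfyUI workflows [link] for the unified LoRA, which uses TCDScheduler to run inference at different step counts, are released. Remember to install the ComfyUI-TCD custom nodes into your ComfyUI/custom_nodes folder! You are encouraged to adjust the eta parameter to get better results 🌟!
  • Apr. 26, 2024. Thanks to @Pete for contributing a larger canvas to our scribble demo [link] 👏.
  • Further updates (with their dates, links, and feature notes) are omitted here...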
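
The recommended settings in the news items above can be collected into a small lookup. This is a minimal sketch, not part of the official repository: the `HyperSettings` class and `recommended_settings` function are hypothetical names introduced here, and only the numeric values (LoRA scale 0.125, guidance scale 3.5 for FLUX.1-dev, 3.0/5.0/7.0 for SD3 at 4/8/16 steps) come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HyperSettings:
    """Recommended inference settings for one (family, steps) pair."""
    lora_scale: float
    guidance_scale: float


# Values transcribed from the news items; the table layout itself is illustrative.
_RECOMMENDED = {
    ("flux", 8): HyperSettings(lora_scale=0.125, guidance_scale=3.5),
    ("flux", 16): HyperSettings(lora_scale=0.125, guidance_scale=3.5),
    ("sd3", 4): HyperSettings(lora_scale=0.125, guidance_scale=3.0),
    ("sd3", 8): HyperSettings(lora_scale=0.125, guidance_scale=5.0),
    ("sd3", 16): HyperSettings(lora_scale=0.125, guidance_scale=7.0),
}


def recommended_settings(family: str, steps: int) -> HyperSettings:
    """Return the recommended settings, or raise KeyError if none are listed."""
    key = (family.lower(), steps)
    if key not in _RECOMMENDED:
        raise KeyError(f"no recommendation listed for {family!r} at {steps} steps")
    return _RECOMMENDED[key]


if __name__ == "__main__":
    s = recommended_settings("sd3", 8)
    print(s.lora_scale, s.guidance_scale)
```

The returned values would then be passed to `pipe.fuse_lora(lora_scale=...)` and to the `guidance_scale` argument of the pipeline call, as in the usage snippets of this document.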

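Try our Hugging Face demos:

  • Scribble demo: try Hyper-SD's scribble feature in a drawing interface.
  • One-step text-to-image demo: try fast generation in the text-to-image interface.

Introduction

Hyper-SD is one of the state-of-the-art diffusion model acceleration techniques. In this repository, we release models distilled from FLUX.1-dev, SD3-Medium, SDXL Base 1.0, and Stable-Diffusion v1-5.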

Checkpoints

  • Hyper-FLUX.1-dev-Nsteps-lora.safetensors: LoRA checkpoint for FLUX.1-dev-related models.
  • Hyper-SD3-Nsteps-CFG-lora.safetensors: LoRA checkpoint for SD3-related models.
  • Hyper-SDXL-Nstep-lora.safetensors: LoRA checkpoint for SDXL-related models.
  • Hyper-SD15-Nstep-lora.safetensors: LoRA checkpoint for SD1.5-related models.
  • Hyper-SDXL-1step-unet.safetensors: Unet checkpoint distilled from SDXL Base.

Text-to-Image Usage

FLUX.1-dev-related models
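
The `N` in the checkpoint patterns above stands for the step count. As a minimal sketch, the helper below builds the file names that appear in the usage snippets of this document (for example `Hyper-SDXL-2steps-lora.safetensors` and `Hyper-SDXL-1step-lora.safetensors`). The function name `lora_checkpoint_name` and the short family keys are assumptions made for illustration, and the helper does not check that a given file actually exists in the repository.

```python
# Prefixes taken from the checkpoint list; the short keys are illustrative.
_PREFIXES = {
    "flux": "Hyper-FLUX.1-dev",
    "sd3": "Hyper-SD3",
    "sdxl": "Hyper-SDXL",
    "sd15": "Hyper-SD15",
}


def lora_checkpoint_name(family: str, steps: int, cfg: bool = False) -> str:
    """Build a LoRA file name such as 'Hyper-SD15-2steps-lora.safetensors'.

    Pass cfg=True for the CFG-preserved variants (the SD3 example in this
    document uses the '-CFG' suffix).
    """
    if family not in _PREFIXES:
        raise ValueError(f"unknown family: {family!r}")
    if steps < 1:
        raise ValueError("steps must be a positive integer")
    # The usage snippets write '1step' for one step and 'Nsteps' otherwise.
    step_part = "1step" if steps == 1 else f"{steps}steps"
    cfg_part = "-CFG" if cfg else ""
    return f"{_PREFIXES[family]}-{step_part}{cfg_part}-lora.safetensors"


if __name__ == "__main__":
    print(lora_checkpoint_name("sdxl", 2))
    print(lora_checkpoint_name("sd3", 8, cfg=True))
```

The result can be passed as the second argument of `hf_hub_download("ByteDance/Hyper-SD", ...)`.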

import torch
from diffusers import FluxPipeline
from huggingface_hub import hf_hub_download
base_model_id = "black-forest-labs/FLUX.1-dev"
repo_name = "ByteDance/Hyper-SD"
# Take 8-steps lora as an example
ckpt_name = "Hyper-FLUX.1-dev-8steps-lora.safetensors"
# Load model, please fill in your access tokens since FLUX.1-dev repo is a gated model.
pipe = FluxPipeline.from_pretrained(base_model_id, token="xxx")
pipe.load_lora_weights(hf_hub_download(repo_name, ckpt_name))
pipe.fuse_lora(lora_scale=0.125)
pipe.to("cuda", dtype=torch.float16)
image=pipe(prompt="a photo of a cat", num_inference_steps=8, guidance_scale=3.5).images[0]
image.save("output.png")

SD3-related models

import torch
from diffusers import StableDiffusion3Pipeline
from huggingface_hub import hf_hub_download
base_model_id = "stabilityai/stable-diffusion-3-medium-diffusers"
repo_name = "ByteDance/Hyper-SD"
# Take 8-steps lora as an example
ckpt_name = "Hyper-SD3-8steps-CFG-lora.safetensors"
# Load model, please fill in your access tokens since SD3 repo is a gated model.
pipe = StableDiffusion3Pipeline.from_pretrained(base_model_id, token="xxx")
pipe.load_lora_weights(hf_hub_download(repo_name, ckpt_name))
pipe.fuse_lora(lora_scale=0.125)
pipe.to("cuda", dtype=torch.float16)
image=pipe(prompt="a photo of a cat", num_inference_steps=8, guidance_scale=5.0).images[0]
image.save("output.png")

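SDXL-related models

2-step, 4-step, and 8-step LoRA

The 2-step LoRA is used as an example here; you can also use the other LoRAs with the corresponding number of inference steps.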

import torch
from diffusers import DiffusionPipeline, DDIMScheduler
from huggingface_hub import hf_hub_download
base_model_id = "stabilityai/stable-diffusion-xl-base-1.0"
repo_name = "ByteDance/Hyper-SD"
# Take 2-steps lora as an example
ckpt_name = "Hyper-SDXL-2steps-lora.safetensors"
# Load model.
pipe = DiffusionPipeline.from_pretrained(base_model_id, torch_dtype=torch.float16, variant="fp16").to("cuda")
pipe.load_lora_weights(hf_hub_download(repo_name, ckpt_name))
pipe.fuse_lora()
# Ensure ddim scheduler timestep spacing set as trailing !!!
pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config, timestep_spacing="trailing")
prompt="a photo of a cat"
image=pipe(prompt=prompt, num_inference_steps=2, guidance_scale=0).images[0]
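
The snippet above insists on `timestep_spacing="trailing"`. To show why that matters at very low step counts, here is a small sketch that computes the timestep schedule for the two spacings in plain Python. The formulas follow my reading of how diffusers schedulers compute "trailing" and "leading" spacing and should be checked against the installed version; the function names are hypothetical, and 1000 training timesteps is assumed.

```python
def trailing_timesteps(num_inference_steps: int, num_train_timesteps: int = 1000) -> list:
    """Timesteps counted back from the last training timestep."""
    step_ratio = num_train_timesteps / num_inference_steps
    return [
        round(num_train_timesteps - i * step_ratio) - 1
        for i in range(num_inference_steps)
    ]


def leading_timesteps(num_inference_steps: int, num_train_timesteps: int = 1000,
                      steps_offset: int = 0) -> list:
    """Timesteps counted up from zero, returned in sampling (descending) order."""
    step_ratio = num_train_timesteps // num_inference_steps
    return [
        i * step_ratio + steps_offset
        for i in reversed(range(num_inference_steps))
    ]


if __name__ == "__main__":
    print(trailing_timesteps(2))  # [999, 499]
    print(leading_timesteps(2))   # [500, 0]
```

With two steps, trailing spacing starts sampling at timestep 999, the noisiest training timestep, while leading spacing would start at 500, which is the mismatch the trailing setting avoids.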

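Unified LoRA (supports 1 to 8 inference steps)

You can flexibly adjust the number of inference steps and the eta value to get the best results.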

import torch
from diffusers import DiffusionPipeline, TCDScheduler
from huggingface_hub import hf_hub_download
base_model_id = "stabilityai/stable-diffusion-xl-base-1.0"
repo_name = "ByteDance/Hyper-SD"
ckpt_name = "Hyper-SDXL-1step-lora.safetensors"
# Load model.
pipe = DiffusionPipeline.from_pretrained(base_model_id, torch_dtype=torch.float16, variant="fp16").to("cuda")
pipe.load_lora_weights(hf_hub_download(repo_name, ckpt_name))
pipe.fuse_lora()
# Use TCD scheduler to achieve better image quality
pipe.scheduler = TCDScheduler.from_config(pipe.scheduler.config)
# Lower eta results in more detail for multi-steps inference
eta=1.0
prompt="a photo of a cat"
image=pipe(prompt=prompt, num_inference_steps=1, guidance_scale=0, eta=eta).images[0]
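
Because the unified LoRA is meant to be run at different step counts and eta values, it can help to build the call arguments in one place. The sketch below is illustrative only: `unified_lora_call_kwargs` is a hypothetical helper, the 1 to 8 step range comes from the heading above, and the assumption that eta lies in [0, 1] reflects my understanding of TCDScheduler rather than anything stated in this document.

```python
def unified_lora_call_kwargs(prompt: str, num_inference_steps: int = 1,
                             eta: float = 1.0) -> dict:
    """Build keyword arguments for a pipeline call with the unified LoRA."""
    if not 1 <= num_inference_steps <= 8:
        raise ValueError("the unified LoRA is described as supporting 1 to 8 steps")
    if not 0.0 <= eta <= 1.0:
        # Assumed valid range for the TCD stochasticity parameter.
        raise ValueError("eta is assumed to lie in [0, 1]")
    return {
        "prompt": prompt,
        "num_inference_steps": num_inference_steps,
        "guidance_scale": 0,  # the snippets in this document run without CFG
        "eta": eta,
    }


if __name__ == "__main__":
    # With a pipeline set up as in the snippet above, one could then call:
    #   image = pipe(**unified_lora_call_kwargs("a photo of a cat", 4, eta=0.5)).images[0]
    # The eta value 0.5 is an arbitrary illustration, not a recommendation.
    print(unified_lora_call_kwargs("a photo of a cat", 4, eta=0.5))
```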

1-step SDXL Unet

For single-step inference only.

import torch
from diffusers import DiffusionPipeline, UNet2DConditionModel, LCMScheduler
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file
base_model_id = "stabilityai/stable-diffusion-xl-base-1.0"
repo_name = "ByteDance/Hyper-SD"
ckpt_name = "Hyper-SDXL-1step-Unet.safetensors"
# Load model.
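# Build the UNet from the base model's config first, then load the distilled weights into it
unet = UNet2DConditionModel.from_config(UNet2DConditionModel.load_config(base_model_id, subfolder="unet")).to("cuda", torch.float16)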
unet.load_state_dict(load_file(hf_hub_download(repo_name, ckpt_name), device="cuda"))
pipe = DiffusionPipeline.from_pretrained(base_model_id, unet=unet, torch_dtype=torch.float16, variant="fp16").to("cuda")
# Use LCM scheduler instead of ddim scheduler to support specific timestep number inputs
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
# Set start timesteps to 800 in the one-step inference to get better results
prompt="a photo of a cat"
image=pipe(prompt=prompt, num_inference_steps=1, guidance_scale=0, timesteps=[800]).images[0]

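SD1.5-related models

2-step, 4-step, and 8-step LoRA

The 2-step LoRA is used as an example here; you can also use the other LoRAs with the corresponding number of inference steps.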

import torch
from diffusers import DiffusionPipeline, DDIMScheduler
from huggingface_hub import hf_hub_download
base_model_id = "runwayml/stable-diffusion-v1-5"
repo_name = "ByteDance/Hyper-SD"
# Take 2-steps lora as an example
ckpt_name = "Hyper-SD15-2steps-lora.safetensors"
# Load model.
pipe = DiffusionPipeline.from_pretrained(base_model_id, torch_dtype=torch.float16, variant="fp16").to("cuda")
pipe.load_lora_weights(hf_hub_download(repo_name, ckpt_name))
pipe.fuse_lora()
# Ensure ddim scheduler timestep spacing set as trailing !!!
pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config, timestep_spacing="trailing")
prompt="a photo of a cat"
image=pipe(prompt=prompt, num_inference_steps=2, guidance_scale=0).images[0]

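Unified LoRA (supports 1 to 8 inference steps)

You can flexibly adjust the number of inference steps and the eta value to get the best results.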

import torch
from diffusers import DiffusionPipeline, TCDScheduler
from huggingface_hub import hf_hub_download
base_model_id = "runwayml/stable-diffusion-v1-5"
repo_name = "ByteDance/Hyper-SD"
ckpt_name = "Hyper-SD15-1step-lora.safetensors"
# Load model.
pipe = DiffusionPipeline.from_pretrained(base_model_id, torch_dtype=torch.float16, variant="fp16").to("cuda")
pipe.load_lora_weights(hf_hub_download(repo_name, ckpt_name))
pipe.fuse_lora()
# Use TCD scheduler to achieve better image quality
pipe.scheduler = TCDScheduler.from_config(pipe.scheduler.config)
# Lower eta results in more detail for multi-steps inference
eta=1.0
prompt="a photo of a cat"
image=pipe(prompt=prompt, num_inference_steps=1, guidance_scale=0, eta=eta).images[0]

ControlNet Usage

SDXL-related models

2-step, 4-step, and 8-step LoRA

Canny ControlNet with 2-step inference is used as an example:

import torch
from diffusers.utils import load_image
import numpy as np
import cv2
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionXLControlNetPipeline, AutoencoderKL, DDIMScheduler
from huggingface_hub import hf_hub_download

# Load original image
image = load_image("https://huggingface.co/datasets/hf-internal-testing/diffusers-images/resolve/main/sd_controlnet/hf-logo.png")
image = np.array(image)
# Prepare Canny Control Image
low_threshold = 100
high_threshold = 200
image = cv2.Canny(image, low_threshold, high_threshold)
image = image[:, :, None]
image = np.concatenate([image, image, image], axis=2)
control_image = Image.fromarray(image)
control_image.save("control.png")
control_weight = 0.5  # recommended for good generalization

# Initialize pipeline
controlnet = ControlNetModel.from_pretrained(
    "diffusers/controlnet-canny-sdxl-1.0",
    torch_dtype=torch.float16
)
vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)
pipe = StableDiffusionXLControlNetPipeline.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0", controlnet=controlnet, vae=vae, torch_dtype=torch.float16).to("cuda")

pipe.load_lora_weights(hf_hub_download("ByteDance/Hyper-SD", "Hyper-SDXL-2steps-lora.safetensors"))
# Ensure ddim scheduler timestep spacing set as trailing !!!
pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config, timestep_spacing="trailing")
pipe.fuse_lora()
image = pipe("A chocolate cookie", num_inference_steps=2, image=control_image, guidance_scale=0, controlnet_conditioning_scale=control_weight).images[0]
image.save('image_out.png')

Unified LoRA (supports 1 to 8 inference steps)

Canny ControlNet is used as an example:

import torch
from diffusers.utils import load_image
import numpy as np
import cv2
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionXLControlNetPipeline, AutoencoderKL, TCDScheduler
from huggingface_hub import hf_hub_download

# Load original image
image = load_image("https://huggingface.co/datasets/hf-internal-testing/diffusers-images/resolve/main/sd_controlnet/hf-logo.png")
image = np.array(image)
# Prepare Canny Control Image
low_threshold = 100
high_threshold = 200
image = cv2.Canny(image, low_threshold, high_threshold)
image = image[:, :, None]
image = np.concatenate([image, image, image], axis=2)
control_image = Image.fromarray(image)
control_image.save("control.png")
control_weight = 0.5  # recommended for good generalization

# Initialize pipeline
controlnet = ControlNetModel.from_pretrained(
    "diffusers/controlnet-canny-sdxl-1.0",
    torch_dtype=torch.float16
)
vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)
pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnet, vae=vae, torch_dtype=torch.float16).to("cuda")

# Load Hyper-SDXL-1step lora
pipe.load_lora_weights(hf_hub_download("ByteDance/Hyper-SD", "Hyper-SDXL-1step-lora.safetensors"))
pipe.fuse_lora()
# Use TCD scheduler to achieve better image quality
pipe.scheduler = TCDScheduler.from_config(pipe.scheduler.config)
# Lower eta results in more detail for multi-steps inference
eta=1.0
image = pipe("A chocolate cookie", num_inference_steps=4, image=control_image, guidance_scale=0, controlnet_conditioning_scale=control_weight, eta=eta).images[0]
image.save('image_out.png')

SD1.5-related models

2-step, 4-step, and 8-step LoRA

Canny ControlNet with 2-step inference is used as an example:

import torch
from diffusers.utils import load_image
import numpy as np
import cv2
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline, DDIMScheduler

from huggingface_hub import hf_hub_download

controlnet_checkpoint = "lllyasviel/control_v11p_sd15_canny"

# Load original image
image = load_image("https://huggingface.co/lllyasviel/control_v11p_sd15_canny/resolve/main/images/input.png")
image = np.array(image)
# Prepare Canny Control Image
low_threshold = 100
high_threshold = 200
image = cv2.Canny(image, low_threshold, high_threshold)
image = image[:, :, None]
image = np.concatenate([image, image, image], axis=2)
control_image = Image.fromarray(image)
control_image.save("control.png")

# Initialize pipeline
controlnet = ControlNetModel.from_pretrained(controlnet_checkpoint, torch_dtype=torch.float16)
pipe = StableDiffusionControlNetPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16).to("cuda")
pipe.load_lora_weights(hf_hub_download("ByteDance/Hyper-SD", "Hyper-SD15-2steps-lora.safetensors"))
pipe.fuse_lora()
# Ensure ddim scheduler timestep spacing set as trailing !!!
pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config, timestep_spacing="trailing")
image = pipe("a blue paradise bird in the jungle", num_inference_steps=2, image=control_image, guidance_scale=0).images[0]
image.save('image_out.png')

Unified LoRA (supports 1 to 8 inference steps)

Canny ControlNet is used as an example:

import torch
from diffusers.utils import load_image
import numpy as np
import cv2
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline, TCDScheduler
from huggingface_hub import hf_hub_download

controlnet_checkpoint = "lllyasviel/control_v11p_sd15_canny"

# Load original image
image = load_image("https://huggingface.co/lllyasviel/control_v11p_sd15_canny/resolve/main/images/input.png")
image = np.array(image)
# Prepare Canny Control Image
low_threshold = 100
high_threshold = 200
image = cv2.Canny(image, low_threshold, high_threshold)
image = image[:, :, None]
image = np.concatenate([image, image, image], axis=2)
control_image = Image.fromarray(image)
control_image.save("control.png")

# Initialize pipeline
controlnet = ControlNetModel.from_pretrained(controlnet_checkpoint, torch_dtype=torch.float16)
pipe = StableDiffusionControlNetPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16).to("cuda")
# Load Hyper-SD15-1step lora
pipe.load_lora_weights(hf_hub_download("ByteDance/Hyper-SD", "Hyper-SD15-1step-lora.safetensors"))
pipe.fuse_lora()
# Use TCD scheduler to achieve better image quality
pipe.scheduler = TCDScheduler.from_config(pipe.scheduler.config)
# Lower eta results in more detail for multi-steps inference
eta=1.0
image = pipe("a blue paradise bird in the jungle", num_inference_steps=1, image=control_image, guidance_scale=0, eta=eta).images[0]
image.save('image_out.png')

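ComfyUI Usage

  • Hyper-SDXL-Nsteps-lora.safetensors: text-to-image workflow
  • Hyper-SD15-Nsteps-lora.safetensors: text-to-image workflow
  • Hyper-SDXL-1step-Unet-Comfyui.fp16.safetensors: text-to-image workflow
    • Requirement / installation: to use the 1-step SDXL UNet, install our scheduler folder into your ComfyUI/custom_nodes directory so that sampling starts from timestep 800 instead of 999.
      • Make sure the path ComfyUI/custom_nodes/ComfyUI-HyperSDXL1StepUnetScheduler exists.
      • For more details, please refer to our technical report.
  • Hyper-SD15-1step-lora.safetensors: text-to-image workflow
  • Hyper-SDXL-1step-lora.safetensors: text-to-image workflow
    • Requirement / installation: for the 1-step unified LoRAs, install ComfyUI-TCD into your ComfyUI/custom_nodes to enable TCDScheduler, which supports different numbers of inference steps (1 to 8) with a single checkpoint.
      • Make sure the ComfyUI/custom_nodes/ComfyUI-TCD directory exists.
      • You are encouraged to adjust the eta parameter in TCDScheduler to get better results.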
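
The installation requirements above come down to two folders being present under ComfyUI/custom_nodes. A minimal sketch of such a check follows. The function `missing_custom_nodes` and the mapping are hypothetical, and applying the ComfyUI-TCD requirement to both 1-step LoRA files is my reading of "1-step unified LoRAs" in the list; only the folder and file names are taken from the text.

```python
from pathlib import Path

# Checkpoint file -> custom node folder it needs, as read from the list above.
REQUIRED_NODES = {
    "Hyper-SDXL-1step-Unet-Comfyui.fp16.safetensors": "ComfyUI-HyperSDXL1StepUnetScheduler",
    "Hyper-SD15-1step-lora.safetensors": "ComfyUI-TCD",
    "Hyper-SDXL-1step-lora.safetensors": "ComfyUI-TCD",
}


def missing_custom_nodes(comfyui_root, checkpoints) -> list:
    """Return the required custom node folders that are not installed."""
    custom_nodes = Path(comfyui_root) / "custom_nodes"
    needed = {REQUIRED_NODES[c] for c in checkpoints if c in REQUIRED_NODES}
    return sorted(name for name in needed if not (custom_nodes / name).is_dir())


if __name__ == "__main__":
    # "ComfyUI" is a placeholder path to a local ComfyUI checkout.
    print(missing_custom_nodes("ComfyUI", ["Hyper-SDXL-1step-lora.safetensors"]))
```

An empty list means every folder needed by the given checkpoints was found; checkpoints without a listed requirement are ignored.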

Citation

@misc{ren2024hypersd,
      title={Hyper-SD: Trajectory Segmented Consistency Model for Efficient Image Synthesis}, 
      author={Yuxi Ren and Xin Xia and Yanzuo Lu and Jiacheng Zhang and Jie Wu and Pan Xie and Xing Wang and Xuefeng Xiao},
      year={2024},
      eprint={2404.13686},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
