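This pipeline is an "instruction-tuned" version of [Stable Diffusion (v1.5)]. It was fine-tuned from the existing [InstructPix2Pix checkpoints].
The design of this pipeline is motivated partly by [FLAN] and partly by [InstructPix2Pix]. The core idea is to first create an instruction-prompted dataset (as described in [our blog]) and then run InstructPix2Pix-style training. The end goal is to make Stable Diffusion better at following specific instructions that involve image transformation operations.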
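The dataset-creation step can be pictured with a minimal sketch. It assumes each training example is a triple of original image, instruction prompt, and edited image, and that the instruction is sampled from a small pool of paraphrases. The field names, the paraphrase list, and the file paths are hypothetical and are not taken from the actual dataset.

```python
import random
from dataclasses import dataclass

# Hypothetical paraphrases of the cartoonization instruction (illustrative only).
INSTRUCTIONS = [
    "Cartoonize the following image",
    "Turn this picture into a cartoon",
    "Convert the photo to a cartoon style",
]


@dataclass(frozen=True)
class InstructionSample:
    original_image: str  # path to the unedited input image
    instruction: str     # natural-language edit instruction
    edited_image: str    # path to the target (cartoonized) image


def build_samples(pairs, instructions=INSTRUCTIONS, seed=0):
    """Attach a randomly chosen instruction to each (original, edited) pair."""
    rng = random.Random(seed)  # seeded so the dataset is reproducible
    return [
        InstructionSample(original, rng.choice(instructions), edited)
        for original, edited in pairs
    ]


# Hypothetical image pairs; in practice these would point to real files.
pairs = [
    ("photos/0001.png", "cartoons/0001.png"),
    ("photos/0002.png", "cartoons/0002.png"),
]
samples = build_samples(pairs)
for sample in samples:
    print(sample)
```

Varying the instruction wording across samples is one plausible way to keep the model from overfitting to a single phrasing. Triples of this shape are what InstructPix2Pix-style training consumes.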
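Training was conducted on the instruction-tuning-sd/cartoonization dataset. For more information, refer to [this repository]. Training logs can be found here.
Below are some results obtained from this pipeline:
You can use this pipeline to cartoonize an image, given an input image and an input prompt.
Here is how to use this model: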
import argparse

import torch
from openmind import is_torch_npu_available
from diffusers import StableDiffusionInstructPix2PixPipeline
from diffusers.utils import load_image


def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--model_name_or_path",
        type=str,
        help="Path to model",
        default=None,
    )
    args = parser.parse_args()
    return args


def main():
    args = parse_args()
    model_id = args.model_name_or_path

    # Use an Ascend NPU when one is available; otherwise fall back to the CPU.
    if is_torch_npu_available():
        device = "npu:0"
    else:
        device = "cpu"

    # Load the instruction-tuned pipeline in half precision and move it to the device.
    pipeline = StableDiffusionInstructPix2PixPipeline.from_pretrained(
        model_id, torch_dtype=torch.float16, use_auth_token=True
    ).to(device)

    # Download the example input image.
    image_path = "https://hf-mirror.com/datasets/diffusers/diffusers-images-docs/resolve/main/mountain.png"
    image = load_image(image_path)

    # Apply the cartoonization instruction and save the result.
    image = pipeline("Cartoonize the following image", image=image).images[0]
    image.save("image.png")


if __name__ == "__main__":
    main()

For notes on limitations, misuse, malicious use, and out-of-scope use, please refer to the model card.
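The script above calls the pipeline with its default settings. As a hedged sketch, the snippet below prepares several sets of call arguments that differ only in `image_guidance_scale`, so the resulting edits can be compared side by side. The keyword names `prompt`, `num_inference_steps`, `guidance_scale`, and `image_guidance_scale` are parameters of the `StableDiffusionInstructPix2PixPipeline` call in diffusers. The helper name `sweep_image_guidance` and the specific values are illustrative assumptions, not recommendations from the model authors.

```python
def sweep_image_guidance(
    prompt,
    scales=(1.0, 1.5, 2.0),
    guidance_scale=7.5,
    num_inference_steps=20,
):
    """Build one keyword-argument dict per image guidance scale.

    This hypothetical helper only prepares arguments; it does not run the model.
    """
    return [
        {
            "prompt": prompt,
            "num_inference_steps": num_inference_steps,
            "guidance_scale": guidance_scale,        # how strongly to follow the text
            "image_guidance_scale": float(scale),    # how closely to stay near the input image
        }
        for scale in scales
    ]


configs = sweep_image_guidance("Cartoonize the following image")
for config in configs:
    print(config)

# With a loaded `pipeline` and an input `image` (neither is created here),
# each configuration could be applied as:
#     result = pipeline(image=image, **config).images[0]
```

Larger `image_guidance_scale` values generally keep the output closer to the input image, while larger `guidance_scale` values push it toward the text instruction. The best balance for cartoonization is something to check empirically.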
FLAN
@inproceedings{wei2022finetuned,
  title={Finetuned Language Models are Zero-Shot Learners},
  author={Jason Wei and Maarten Bosma and Vincent Zhao and Kelvin Guu and Adams Wei Yu and Brian Lester and Nan Du and Andrew M. Dai and Quoc V Le},
  booktitle={International Conference on Learning Representations},
  year={2022},
  url={https://openreview.net/forum?id=gEZrGCozdqR}
}
InstructPix2Pix
@InProceedings{brooks2022instructpix2pix,
  author = {Brooks, Tim and Holynski, Aleksander and Efros, Alexei A.},
  title = {InstructPix2Pix: Learning to Follow Image Editing Instructions},
  booktitle = {CVPR},
  year = {2023},
}
Instruction-tuning Stable Diffusion blog
@article{Paul2023instruction-tuning-sd,
  author = {Paul, Sayak},
  title = {Instruction-tuning Stable Diffusion with InstructPix2Pix},
  journal = {Hugging Face Blog},
  year = {2023},
}