表 1 版本配套表
| 配套 | 版本 | 环境准备指导 |
|---|---|---|
| Python | 3.11.10 | - |
| torch | 2.9.0 | - |
注意:
# 增加软件包可执行权限,{version}表示软件版本号,{arch}表示CPU架构,{soc}表示昇腾AI处理器的版本。
chmod +x ./Ascend-cann-toolkit_{version}_linux-{arch}.run
chmod +x ./Ascend-cann-kernels-{soc}_{version}_linux.run
chmod +x ./Ascend-cann-nnal_{version}_linux-{arch}.run (若使用量化)
# 校验软件包安装文件的一致性和完整性
./Ascend-cann-toolkit_{version}_linux-{arch}.run --check
./Ascend-cann-kernels-{soc}_{version}_linux.run --check
./Ascend-cann-nnal{version}_linux-{arch}.run --check (若使用量化)
# 安装
./Ascend-cann-toolkit_{version}_linux-{arch}.run --install
./Ascend-cann-kernels-{soc}_{version}_linux.run --install
./Ascend-cann-nnal{version}_linux-{arch}.run --torch_atb --install (若使用量化)
# 设置环境变量
source /usr/local/Ascend/ascend-toolkit/set_env.sh
source /usr/local/Ascend/nnal/atb/set_env.shpip3 install -r requirements.txt --no-deps# 增加软件包可执行权限,{version}表示软件版本号,{arch}表示CPU架构。
chmod +x ./Ascend-mindie_${version}_linux-${arch}.run
./Ascend-mindie_${version}_linux-${arch}.run --check
# 方式一:默认路径安装
./Ascend-mindie_${version}_linux-${arch}.run --install
# 设置环境变量
cd /usr/local/Ascend/mindie && source set_env.sh
# 方式二:指定路径安装
./Ascend-mindie_${version}_linux-${arch}.run --install-path=${AieInstallPath}
# 设置环境变量
cd ${AieInstallPath}/mindie && source set_env.sh下载 pytorch_v{pytorchversion}_py{pythonversion}.tar.gz
tar -xzvf pytorch_v{pytorchversion}_py{pythonversion}.tar.gz
# 解压后,会有whl包
pip install torch_npu-{pytorchversion}.xxxx.{arch}.whl# 若环境镜像中没有gcc、g++,请用户自行安装
yum install gcc
yum install g++
# 导入头文件路径
export CPLUS_INCLUDE_PATH=/usr/include/c++/12/:/usr/include/c++/12/aarch64-openEuler-linux/:$CPLUS_INCLUDE_PATH注:若使用openeuler镜像,需要配置gcc、g++环境,否则会导致fatal error: 'stdio.h' file not found
https://huggingface.co/Wan-AI/Wan2.1-T2V-1.3Bhttps://huggingface.co/Wan-AI/Wan2.1-T2V-14Bhttps://huggingface.co/Wan-AI/Wan2.1-I2V-14B-480Phttps://huggingface.co/Wan-AI/Wan2.1-I2V-14B-720Pgit clone https://modelers.cn/MindIE/Wan2.1.git使用上一步下载的权重
model_base="./Wan2.1-T2V-1.3B/"执行命令:
# Wan2.1-T2V-1.3B
export ALGO=0
export PYTORCH_NPU_ALLOC_CONF='expandable_segments:True'
export TASK_QUEUE_ENABLE=2
export CPU_AFFINITY_CONF=1
export TOKENIZERS_PARALLELISM=false
python generate.py \
--task t2v-1.3B \
--size 832*480 \
--ckpt_dir ${model_base} \
--offload_model False \
--sample_steps 50 \
--prompt "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage."\
--base_seed 0 参数说明:
执行命令:
# Wan2.1-T2V-1.3B
export ALGO=0
export PYTORCH_NPU_ALLOC_CONF='expandable_segments:True'
export TASK_QUEUE_ENABLE=2
export CPU_AFFINITY_CONF=1
export TOKENIZERS_PARALLELISM=false
python generate.py \
--task t2v-1.3B \
--size 832*480 \
--ckpt_dir ${model_base} \
--sample_steps 50 \
--prompt "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage."\
--base_seed 0 \
--use_attentioncache \
--start_step 20 \
--attentioncache_interval 2 \
--end_step 47参数说明:
执行命令:
# 1.3B支持双卡、四卡
export ALGO=0
export PYTORCH_NPU_ALLOC_CONF='expandable_segments:True'
export TASK_QUEUE_ENABLE=2
export CPU_AFFINITY_CONF=1
export TOKENIZERS_PARALLELISM=false
torchrun --nproc_per_node=4 generate.py \
--task t2v-1.3B \
--size 832*480 \
--ckpt_dir ${model_base} \
--sample_steps 50 \
--ulysses_size 4 \
--prompt "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage." \
--base_seed 0 参数说明:
执行命令:
# 1.3B支持双卡、四卡
export ALGO=0
export PYTORCH_NPU_ALLOC_CONF='expandable_segments:True'
export TASK_QUEUE_ENABLE=2
export CPU_AFFINITY_CONF=1
export TOKENIZERS_PARALLELISM=false
torchrun --nproc_per_node=4 generate.py \
--task t2v-1.3B \
--size 832*480 \
--ckpt_dir ${model_base} \
--sample_steps 50 \
--ulysses_size 4 \
--prompt "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage." \
--base_seed 0 \
--use_attentioncache \
--start_step 20 \
--attentioncache_interval 2 \
--end_step 47参数说明:
使用上一步下载的权重
model_base="./Wan2.1-T2V-14B/"执行命令:
export ALGO=0
export PYTORCH_NPU_ALLOC_CONF='expandable_segments:True'
export TASK_QUEUE_ENABLE=2
export CPU_AFFINITY_CONF=1
export TOKENIZERS_PARALLELISM=false
torchrun --nproc_per_node=8 generate.py \
--task t2v-14B \
--size 1280*720 \
--ckpt_dir ${model_base} \
--sample_steps 50 \
--dit_fsdp \
--t5_fsdp \
--cfg_size 2 \
--ulysses_size 4 \
--vae_parallel \
--prompt "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage." \
--base_seed 0 参数说明:
执行命令:
export ALGO=0
export PYTORCH_NPU_ALLOC_CONF='expandable_segments:True'
export TASK_QUEUE_ENABLE=2
export CPU_AFFINITY_CONF=1
export TOKENIZERS_PARALLELISM=false
torchrun --nproc_per_node=8 generate.py \
--task t2v-14B \
--size 1280*720 \
--ckpt_dir ${model_base} \
--sample_steps 50 \
--t5_fsdp \
--tp_size 8 \
--vae_parallel \
--prompt "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage." \
--base_seed 0 参数说明:
执行命令:
export ALGO=0
export PYTORCH_NPU_ALLOC_CONF='expandable_segments:True'
export TASK_QUEUE_ENABLE=2
export CPU_AFFINITY_CONF=1
export TOKENIZERS_PARALLELISM=false
torchrun --nproc_per_node=16 generate.py \
--task t2v-14B \
--size 1280*720 \
--ckpt_dir ${model_base} \
--sample_steps 50 \
--dit_fsdp \
--t5_fsdp \
--cfg_size 2 \
--ulysses_size 8 \
--vae_parallel \
--prompt "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage." \
--base_seed 0参数说明:
执行命令:
export ALGO=0
export PYTORCH_NPU_ALLOC_CONF='expandable_segments:True'
export TASK_QUEUE_ENABLE=2
export CPU_AFFINITY_CONF=1
export TOKENIZERS_PARALLELISM=false
torchrun --nproc_per_node=8 generate.py \
--task t2v-14B \
--size 1280*720 \
--ckpt_dir ${model_base} \
--dit_fsdp \
--t5_fsdp \
--sample_steps 50 \
--cfg_size 2 \
--ulysses_size 4 \
--vae_parallel \
--prompt "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage." \
--base_seed 0 \
--use_attentioncache \
--start_step 20 \
--attentioncache_interval 2 \
--end_step 47参数说明:
执行命令:
export ALGO=0
export PYTORCH_NPU_ALLOC_CONF='expandable_segments:True'
export TASK_QUEUE_ENABLE=2
export CPU_AFFINITY_CONF=1
export TOKENIZERS_PARALLELISM=false
torchrun --nproc_per_node=8 generate.py \
--task t2v-14B \
--size 1280*720 \
--ckpt_dir ${model_base} \
--dit_fsdp \
--t5_fsdp \
--sample_steps 50 \
--cfg_size 2 \
--ulysses_size 4 \
--vae_parallel \
--prompt "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage." \
--base_seed 0 \
--use_rainfusion \
--sparsity 0.64 \
--sparse_start_step 15参数说明:
使用上一步下载的权重
# 生成480P的视频
model_base="./Wan2.1-I2V-14B-480P/"
# 生成720P的视频
model_base="./Wan2.1-I2V-14B-720P/"执行命令:
export ALGO=0
export PYTORCH_NPU_ALLOC_CONF='expandable_segments:True'
export TASK_QUEUE_ENABLE=2
export CPU_AFFINITY_CONF=1
export TOKENIZERS_PARALLELISM=false
torchrun --nproc_per_node=8 generate.py \
--task i2v-14B \
--size 1280*720 \
--ckpt_dir ${model_base} \
--frame_num 81 \
--sample_steps 40 \
--dit_fsdp \
--t5_fsdp \
--cfg_size 2 \
--ulysses_size 4 \
--vae_parallel \
--image examples/i2v_input.JPG \
--base_seed 0 \
--prompt "Summer beach vacation style, a white cat wearing sunglasses sits on a surfboard. The fluffy-furred feline gazes directly at the camera with a relaxed expression. Blurred beach scenery forms the background featuring crystal-clear waters, distant green hills, and a blue sky dotted with white clouds. The cat assumes a naturally relaxed posture, as if savoring the sea breeze and warm sunlight. A close-up shot highlights the feline's intricate details and the refreshing atmosphere of the seaside."参数说明:
执行命令:
export ALGO=0
export PYTORCH_NPU_ALLOC_CONF='expandable_segments:True'
export TASK_QUEUE_ENABLE=2
export CPU_AFFINITY_CONF=1
export TOKENIZERS_PARALLELISM=false
torchrun --nproc_per_node=8 generate.py \
--task i2v-14B \
--size 1280*720 \
--ckpt_dir ${model_base} \
--frame_num 81 \
--sample_steps 40 \
--t5_fsdp \
--tp_size 8 \
--vae_parallel \
--image examples/i2v_input.JPG \
--base_seed 0 \
--prompt "Summer beach vacation style, a white cat wearing sunglasses sits on a surfboard. The fluffy-furred feline gazes directly at the camera with a relaxed expression. Blurred beach scenery forms the background featuring crystal-clear waters, distant green hills, and a blue sky dotted with white clouds. The cat assumes a naturally relaxed posture, as if savoring the sea breeze and warm sunlight. A close-up shot highlights the feline's intricate details and the refreshing atmosphere of the seaside."参数说明:
执行命令:
export ALGO=0
export PYTORCH_NPU_ALLOC_CONF='expandable_segments:True'
export TASK_QUEUE_ENABLE=2
export CPU_AFFINITY_CONF=1
export TOKENIZERS_PARALLELISM=false
torchrun --nproc_per_node=8 generate.py \
--task i2v-14B \
--size 1280*720 \
--ckpt_dir ${model_base} \
--frame_num 81 \
--sample_steps 40 \
--dit_fsdp \
--t5_fsdp \
--cfg_size 2 \
--ulysses_size 4 \
--image examples/i2v_input.JPG \
--prompt "Summer beach vacation style, a white cat wearing sunglasses sits on a surfboard. The fluffy-furred feline gazes directly at the camera with a relaxed expression. Blurred beach scenery forms the background featuring crystal-clear waters, distant green hills, and a blue sky dotted with white clouds. The cat assumes a naturally relaxed posture, as if savoring the sea breeze and warm sunlight. A close-up shot highlights the feline's intricate details and the refreshing atmosphere of the seaside." \
--base_seed 0 \
--vae_parallel \
--use_attentioncache \
--start_step 12 \
--attentioncache_interval 4 \
--end_step 37参数说明:
执行命令:
export ALGO=0
export PYTORCH_NPU_ALLOC_CONF='expandable_segments:True'
export TASK_QUEUE_ENABLE=2
export CPU_AFFINITY_CONF=1
export TOKENIZERS_PARALLELISM=false
torchrun --nproc_per_node=8 generate.py \
--task i2v-14B \
--size 1280*720 \
--ckpt_dir ${model_base} \
--frame_num 81 \
--sample_steps 40 \
--dit_fsdp \
--t5_fsdp \
--cfg_size 2 \
--ulysses_size 4 \
--image examples/i2v_input.JPG \
--prompt "Summer beach vacation style, a white cat wearing sunglasses sits on a surfboard. The fluffy-furred feline gazes directly at the camera with a relaxed expression. Blurred beach scenery forms the background featuring crystal-clear waters, distant green hills, and a blue sky dotted with white clouds. The cat assumes a naturally relaxed posture, as if savoring the sea breeze and warm sunlight. A close-up shot highlights the feline's intricate details and the refreshing atmosphere of the seaside." \
--base_seed 0 \
--vae_parallel \
--use_rainfusion \
--sparsity 0.64 \
--sparse_start_step 15
参数说明:
新增对Wan2.1-T2V、Wan2.1-I2V的W8A8_dynamic量化支持,针对DiT模型进行量化,以降低显存占用,提升模型推理性能
下载并安装msmodelslim工具
git clone https://gitcode.com/Ascend/msit
cd msit/msmodelslim
bash install.sh以Wan2.1-T2V-14B模型为例,导出DiT的W8A8量化权重及描述文件
cd /path/to/Wan2.1/
model_base="./Wan2.1-T2V-14B/"
python quant_wan21.py \
--task t2v-14B \
--ckpt_dir ${model_base} \
--quant_dit_path ./quant_w8a8_dynamic \
--quant_type W8A8 \
--is_dynamic参数说明:
执行后,quant_w8a8_dynamic目录下会生成两个文件:
quant_model_description_w8a8_dynamic.json:量化配置描述文件quant_model_weight_w8a8_dynamic.safetensors:量化后的权重文件以Wan2.1-T2V-14B模型为例,执行量化推理
export ALGO=0
export PYTORCH_NPU_ALLOC_CONF='expandable_segments:True'
export TASK_QUEUE_ENABLE=2
export CPU_AFFINITY_CONF=1
export TOKENIZERS_PARALLELISM=false
model_base="./Wan2.1-T2V-14B/"
quant_dit_path="./quant_w8a8_dynamic/"
torchrun --nproc_per_node=8 generate.py \
--task t2v-14B \
--size 1280*720 \
--ckpt_dir ${model_base} \
--quant_dit_path ${quant_dit_path}
--sample_steps 50 \
--dit_fsdp \
--t5_fsdp \
--cfg_size 2 \
--ulysses_size 4 \
--vae_parallel \
--prompt "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage." \
--base_seed 0新增参数说明:
export T5_LOAD_CPU=1,以降低显存占用Directory operation failed. Reason: Directory [/usr/local/Ascend/mindie/latest/mindie-rt/aoe] does not exist,请设置环境变量unset TUNE_BANK_PATHfatal error: 'stdio.h' file not found,请参考1.6 gcc、g++安装Failed to bind the IP port. Reason: The IP address and port have been bound already.
HCCL function error :HcclGetRootInfo(&hcclID), error code is 7:请配置export HCCL_HOST_SOCKET_PORT_RANGE="auto"不指定端口
HCCL function error :HcclGetRootInfo(&hcclID), error code is 11:请配置sysctl -w net.ipv4.ip_local_reserved_ports=60000-60015预留端口