SDFT

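Overview

SDFT is a self-education project that surveys the main Stable Diffusion fine-tuning techniques. The Stable Diffusion implementation comes from the Hugging Face diffusers library.

Techniques covered:

Low-Rank Adaptation (LoRA)
Textual Inversion
Dreambooth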

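Dataset

All fine-tuning techniques are applied to a handmade toy dataset named "Dark Fantasy". The dataset was collected by generating dark-fantasy-like images with StabilityAI's Stable Diffusion XL Base-1.0 model, prompted to evoke the styles of the 1970s and 1980s. The goal is to demonstrate every covered technique on this one dataset.

The dataset can be found under the datasets/ directory.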
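The training command for LoRA passes --train_data_dir together with --caption_column="text", which suggests an image folder accompanied by a caption file. The sketch below writes such a file in the metadata.jsonl layout used by the Hugging Face datasets "imagefolder" loader. Whether SDFT's scripts read exactly this layout is an assumption, and the file names and captions are hypothetical placeholders.

```python
import json
import tempfile
from pathlib import Path


def write_metadata(data_dir: Path, captions: dict[str, str]) -> Path:
    """Write a metadata.jsonl file that maps image file names to captions."""
    metadata_path = data_dir / "metadata.jsonl"
    with metadata_path.open("w", encoding="utf-8") as f:
        for file_name, text in sorted(captions.items()):
            # "file_name" is the key the imagefolder loader looks for;
            # "text" matches --caption_column="text" (assumed to be read this way).
            f.write(json.dumps({"file_name": file_name, "text": text}) + "\n")
    return metadata_path


# Hypothetical file names and captions, for illustration only.
example_captions = {
    "0001.png": "A picture of a knight standing in front of a ruined castle.",
    "0002.png": "A picture of a dragon flying over a dark forest.",
}

# A temporary directory stands in for a folder such as datasets/dark_fantasy/.
with tempfile.TemporaryDirectory() as tmp:
    path = write_metadata(Path(tmp), example_captions)
    rows = [json.loads(line) for line in path.read_text(encoding="utf-8").splitlines()]
```

Each line of the resulting file is one JSON object, so captions can be added or corrected without touching the images themselves.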

Techniques

LoRA

Usage

Fine-tune SDXL with LoRA:

accelerate launch train_lora_sdxl.py \
    --pretrained_model_name_or_path=stabilityai/stable-diffusion-xl-base-1.0 \
    --pretrained_vae_model_name_or_path=madebyollin/sdxl-vae-fp16-fix \
    --allow_tf32 \
    --mixed_precision="fp16" \
    --rank=32 \
    --train_data_dir=datasets/dark_fantasy/ \
    --caption_column="text" \
    --dataloader_num_workers=16 \
    --resolution=512 \
    --use_center_crop \
    --use_random_flip \
    --train_batch_size=2 \
    --gradient_accumulation_steps=4 --gradient_checkpointing \
    --max_train_steps=1500 \
    --learning_rate=1e-04 \
    --max_grad_norm=5 \
    --lr_scheduler="cosine_with_restarts" \
    --lr_warmup_steps=100 \
    --output_dir=runs/lora_run/ \
    --checkpointing_steps=100 \
    --validation_epochs=10 \
    --num_validation_images=4 \
    --save_images_on_disk \
    --validation_prompt="A picture of a misterious figure in cape, back view." \
    --logging_dir="logs" \
    --seed=1337
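To make the batch and learning-rate flags above more concrete, here is a minimal sketch of the arithmetic they imply. It assumes that --max_train_steps counts optimizer steps and that "cosine_with_restarts" follows the formula of the diffusers cosine schedule with hard restarts and linear warmup, using a single cycle. Both points are assumptions about train_lora_sdxl.py rather than facts taken from it, and the helper names are hypothetical.

```python
import math

# Values copied from the training command above.
TRAIN_BATCH_SIZE = 2
GRADIENT_ACCUMULATION_STEPS = 4
MAX_TRAIN_STEPS = 1500
LEARNING_RATE = 1e-4
LR_WARMUP_STEPS = 100
NUM_CYCLES = 1  # assumption: the command does not set a cycle count


def effective_batch_size(num_processes: int = 1) -> int:
    """Number of samples that contribute to one optimizer step."""
    return TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS * num_processes


def lr_at(step: int, num_cycles: int = NUM_CYCLES) -> float:
    """Learning rate at a given optimizer step: linear warmup, then cosine decay."""
    if step < LR_WARMUP_STEPS:
        # Linear ramp from 0 up to the base learning rate.
        return LEARNING_RATE * step / max(1, LR_WARMUP_STEPS)
    progress = (step - LR_WARMUP_STEPS) / max(1, MAX_TRAIN_STEPS - LR_WARMUP_STEPS)
    if progress >= 1.0:
        return 0.0
    # Position inside the current cycle; the modulo restarts the cosine.
    cycle_position = (num_cycles * progress) % 1.0
    return LEARNING_RATE * max(0.0, 0.5 * (1.0 + math.cos(math.pi * cycle_position)))


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size())
    print("samples seen:", effective_batch_size() * MAX_TRAIN_STEPS)
    for step in (0, 50, 100, 800, 1499):
        print(step, lr_at(step))
```

Under these assumptions, a single process trains with 8 samples per optimizer step, and the learning rate peaks at step 100 before decaying toward zero at step 1500.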

Run inference with a LoRA checkpoint:

accelerate launch run_lora_inference.py \
    --pretrained_model_name_or_path=stabilityai/stable-diffusion-xl-base-1.0 \
    --pretrained_vae_model_name_or_path=madebyollin/sdxl-vae-fp16-fix \
    --output_dir=runs/lora_v1/ \
    --lora_checkpoint_path=runs/lora_run/checkpoint-100/ \
    --resolution=1024 \
    --num_images_to_generate=5 \
    --guidance_scale=5.0 \
    --num_inference_steps=40 \
    --prompt="A picture of a misterious figure in cape, back view." \
    --negative_prompt="logo, watermark, text, blurry" \
    --seed=1337

Results

Comparison of images generated without and with LoRA. Each pair of images was generated from the same latents.

"A picture of a heavy red Kenworth truck riding in the night across the abanoned city streets."

"A picture of a wounded orc warrior, climbing in misty mountains, front view, exhausted face, looking at the camera."

"A picture of space rocket launching, Earth on the background, candid photo."

"A picture of a supermassive black hole, devouring the galaxy, cinematic picture"

"A picture of a human woman warrior, black hair, looking at the camera, front view."
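The comparison above depends on both images of a pair starting from identical initial noise. The sketch below illustrates that idea with the standard library only: it derives the latent shape from the image resolution (the SDXL VAE has 4 latent channels and downsamples each side by 8) and shows that the same seed reproduces the same noise. Python's random module is used here as a stand-in for a seeded tensor generator, so this is not how the SDFT scripts sample latents, and the helper names are hypothetical.

```python
import random

VAE_SCALE_FACTOR = 8  # the SDXL VAE downsamples each spatial side by 8
LATENT_CHANNELS = 4   # number of channels in the SDXL latent space


def latent_shape(resolution: int) -> tuple[int, int, int]:
    """Latent (channels, height, width) for a square image of the given side."""
    side = resolution // VAE_SCALE_FACTOR
    return (LATENT_CHANNELS, side, side)


def sample_latents(seed: int, resolution: int) -> list[float]:
    """Draw flattened standard-normal noise from a dedicated, seeded generator."""
    rng = random.Random(seed)
    channels, height, width = latent_shape(resolution)
    return [rng.gauss(0.0, 1.0) for _ in range(channels * height * width)]


# The same seed yields the same starting noise for the run without LoRA
# and the run with LoRA, so any difference comes from the weights alone.
base_latents = sample_latents(seed=1337, resolution=512)
lora_latents = sample_latents(seed=1337, resolution=512)
```

Using a dedicated generator per run, rather than global random state, keeps the noise reproducible regardless of what else was sampled earlier in the process.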

Textual Inversion

Usage

Fine-tune SDXL with Textual Inversion (TI):

accelerate launch train_ti_sdxl.py \
    --pretrained_model_name_or_path=stabilityai/stable-diffusion-xl-base-1.0 \
    --pretrained_vae_model_name_or_path=madebyollin/sdxl-vae-fp16-fix \
    --allow_tf32 \
    --mixed_precision="fp16" \
    --train_data_dir=datasets/skull \
    --learnable_property="style" \
    --placeholder_token="<skull_lamp>" \
    --initializer_token="skull" \
    --num_vectors=8 \
    --resolution=1024 \
    --repeats=1 \
    --train_batch_size=2 \
    --gradient_accumulation_steps=4 --gradient_checkpointing \
    --max_train_steps=5000 \
    --learning_rate=3e-3 \
    --lr_scheduler="piecewise_constant" \
    --lr_warmup_steps=30 \
    --output_dir="runs/ti_run" \
    --validation_prompt="A painting of Eiffel tower in the style of <skull_lamp>" \
    --num_validation_images=4 \
    --validation_steps=100 \
    --embeddings_save_steps=500 \
    --save_images_on_disk \
    --use_random_flip \
    --use_center_crop \
    --seed=1337
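The command above combines --placeholder_token="<skull_lamp>" with --num_vectors=8, meaning the new concept is represented by eight learned embedding vectors instead of one. A common way to handle this, used by the diffusers textual inversion example, is to register one token per vector and expand the placeholder inside each prompt. The sketch below shows that expansion. Whether train_ti_sdxl.py follows the same naming scheme is an assumption, and both helper names are hypothetical.

```python
def expand_placeholder(placeholder_token: str, num_vectors: int) -> list[str]:
    """Return one token name per learned vector: the placeholder plus suffixed copies."""
    if num_vectors < 1:
        raise ValueError("num_vectors must be at least 1")
    return [placeholder_token] + [
        f"{placeholder_token}_{i}" for i in range(1, num_vectors)
    ]


def expand_prompt(prompt: str, placeholder_token: str, num_vectors: int) -> str:
    """Replace the placeholder in a prompt with the space-joined multi-vector tokens."""
    tokens = " ".join(expand_placeholder(placeholder_token, num_vectors))
    return prompt.replace(placeholder_token, tokens)


if __name__ == "__main__":
    print(expand_placeholder("<skull_lamp>", 8))
    print(expand_prompt("A <skull_lamp>, made of lego", "<skull_lamp>", 2))
```

With this scheme the user keeps writing the single placeholder in prompts, while the text encoder sees one token for each trained vector.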

Run inference with the trained TI embeddings:

accelerate launch run_ti_inference.py \
    --pretrained_model_name_or_path=stabilityai/stable-diffusion-xl-base-1.0 \
    --pretrained_vae_model_name_or_path=madebyollin/sdxl-vae-fp16-fix \
    --output_dir=runs/ti_run \
    --path_to_embeddings=runs/ti_run/ti-embeddings-final.safetensors \
    --resolution=1024 \
    --num_images_to_generate=1 \
    --guidance_scale=5.0 \
    --num_inference_steps=50 \
    --placeholder_token="<skull_lamp>" \
    --prompt="A <skull_lamp>, made of lego" \
    --negative_prompt="logo, watermark, text, blurry, bad quality" \
    --seed=1337