Unlike the automatic1111 webui, this project is aimed at developers and supports use cases such as virtual idol training.
Shown here is the effect of a LoRA trained on a small number of Dilraba Dilmurat images: a European/American mixed-style Reba.
pip install -r requirements.txt
git lfs install
# BLIP model
wget https://storage.googleapis.com/sfr-vision-language-research/BLIP/models/model_base_caption_capfilt_large.pth -P ./pretrained_models
# bert-base-uncased
cd pretrained_models
git clone https://huggingface.co/bert-base-uncased
# diffusion base model
# I use chilloutmix_NiPrunedFp32Fix
git clone https://huggingface.co/naonovn/chilloutmix_NiPrunedFp32Fix
# safetensors model conversion
cd ..
python process/convert_original_stable_diffusion_to_diffusers.py \
--checkpoint_path ./pretrained_models/chilloutmix_NiPrunedFp32Fix/chilloutmix_NiPrunedFp32Fix.safetensors \
--dump_path ./pretrained_models/chilloutmixNiPruned_Tw1O --from_safetensors
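To sanity-check the conversion, the dumped folder should load as a regular diffusers pipeline. A minimal sketch, assuming standard diffusers API and the --dump_path above:

```python
import torch
from diffusers import StableDiffusionPipeline

# Load the folder written by --dump_path; this validates the conversion.
pipe = StableDiffusionPipeline.from_pretrained(
    "./pretrained_models/chilloutmixNiPruned_Tw1O",
    torch_dtype=torch.float16,
    safety_checker=None,  # the converted dump may not ship a safety checker
).to("cuda")

image = pipe("a portrait photo, ((masterpiece))", num_inference_steps=25).images[0]
image.save("sanity.png")
```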
# download data
mkdir -p dataset
cd dataset
git clone https://huggingface.co/datasets/lambdalabs/pokemon-blip-captions/
# generate image captions
python process/run_caption.py --img_base ./dataset/custom
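run_caption.py drives the BLIP checkpoint downloaded above to produce one caption per image. As a rough standalone equivalent you could use the transformers port of BLIP; this is an illustrative assumption, the repo itself loads the .pth checkpoint directly:

```python
from pathlib import Path
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration

processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

for img_path in Path("dataset/custom").glob("*.png"):
    inputs = processor(Image.open(img_path).convert("RGB"), return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=30)
    caption = processor.decode(out[0], skip_special_tokens=True)
    img_path.with_suffix(".txt").write_text(caption)  # one caption file per image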
# replace "a woman" with <dlrb>
python process/change_txt.py --img_base ./dataset/custom --ori_txt 'a woman' --new_txt "<dlrb>"
Parameter adjustment: set self.custom to True to use your own data, or to False to use the huggingface data (see the sketch after this list).
--train_text_encoder # enable text_encoder LoRA training
--dist # disable DDP multi-machine multi-GPU training mode
--batch_size 1 # set the batch size
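Schematically, the data switch mentioned above looks like the following; the class and file names here are assumptions for illustration, so locate where self.custom is actually defined in the repo before editing:

```python
class DataConfig:  # hypothetical name; find the repo's class that defines self.custom
    def __init__(self):
        # True: train on your own captioned images under dataset/custom
        # False: train on the huggingface pokemon-blip-captions dataset
        self.custom = True
```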
# training script
python train.py --batch_size 1 --dist --train_text_encoder
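What --train_text_encoder turns on is the usual LoRA trick applied to the text encoder as well as the UNet: freeze the pretrained weight W and learn a low-rank update BA on top of it. A minimal sketch of that idea; the rank and alpha values are illustrative, not the repo's defaults:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base linear plus a trainable low-rank update: y = Wx + (alpha/r) * B(Ax)."""
    def __init__(self, base: nn.Linear, rank: int = 4, alpha: float = 4.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # only the LoRA factors are trained
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))  # zero init: no update at step 0
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale
```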
python inference.py \
--mode 'lora' \
--lora_path checkpoint/Lora/000-00000600.pth \
--prompt "<dlrb>,solo, long hair, black hair, choker, breasts, earrings, blue eyes, jewelry, lipstick, makeup, dark, bare shoulders, mountain, night, upper body, dress, large breasts, ((masterpiece))" \
--outpath results/1.png \
--num_images_per_prompt 2
The fewer training images you use, the fewer iterations the checkpoint you select should have. For example, for single-image training pick a checkpoint around 1000 iterations; for training on about 10 images, pick one around 2500 iterations.
Added ControlNet conversion; refer to Here.
python process/tool_transfer_control.py \
--path_input pretrained_models/chilloutmix_NiPrunedFp32Fix/chilloutmix_NiPrunedFp32Fix.safetensors \
--path_output pretrained_models/chilloutmix_control.pth
python process/convert_controlnet_to_diffusers.py \
--checkpoint_path pretrained_models/chilloutmix_control.pth \
--original_config_file model/third/cldm_v15.yaml \
--dump_path pretrained_models/chilloutmix_control --device cuda
python inference.py \
--mode 'control' \
--lora_path checkpoint/Lora/000-00000600.pth \
--control_path pretrained_models/chilloutmix_control \
--pose_img assets/pose.png \
--prompt "<dlrb>,solo, long hair, black hair, choker, breasts, earrings, blue eyes, jewelry, lipstick, makeup, dark, bare shoulders, mountain, night, upper body, dress, large breasts, ((masterpiece))" \
--outpath results/1.png \
--num_images_per_prompt 2
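Assuming the conversion above produced a diffusers-format ControlNet folder, the same weights can also be driven through the stock diffusers ControlNet pipeline; a sketch under that assumption, with paths mirroring the command above:

```python
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from PIL import Image

controlnet = ControlNetModel.from_pretrained(
    "pretrained_models/chilloutmix_control", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "./pretrained_models/chilloutmixNiPruned_Tw1O",  # the converted base model from earlier
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

pose = Image.open("assets/pose.png")
image = pipe("<dlrb>, solo, long hair, ((masterpiece))", image=pose).images[0]
image.save("control_sketch.png")
```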
cd pretrained_models
git clone https://huggingface.co/runwayml/stable-diffusion-inpainting
# download the parsing model
wget https://github.com/LeslieZhoa/LVT/releases/download/v0.0/face_parsing.pt -P pretrained_models
python inference.py \
--mode 'inpait' \
--inpait_path pretrained_models/stable-diffusion-inpainting \
--mask_area all \
--ref_img assets/ref.png \
--prompt "green hair,short hair,curly hair, green hair,beach,seaside" \
--outpath results/1.png \
--num_images_per_prompt 2
The inpainting results are smoother.
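inference.py wires the parsing model into mask generation; if you just want to poke at the downloaded inpainting weights directly, the standard diffusers inpaint pipeline works as a rough stand-in. The mask path here is an assumption for illustration:

```python
from diffusers import StableDiffusionInpaintPipeline
from PIL import Image

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "pretrained_models/stable-diffusion-inpainting"
).to("cuda")

image = Image.open("assets/ref.png").convert("RGB").resize((512, 512))
mask = Image.open("assets/mask.png").convert("L").resize((512, 512))  # hypothetical mask; white = repaint

result = pipe(
    prompt="green hair, short hair, curly hair, beach, seaside",
    image=image,
    mask_image=mask,
).images[0]
result.save("inpaint_sketch.png")
```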
wget https://huggingface.co/TencentARC/T2I-Adapter/resolve/main/models/t2iadapter_seg_sd14v1.pth -P pretrained_models
python inference.py \
--mode 't2iinpait' \
--ref_img assets/t2i-input.png \
--mask assets/t2i-mask.png \
--adapter_mask assets/t2i-adapter.png \
--prompt "green hair,curly hair, green hair,beach,seaside" \
--outpath results/1.png \
--num_images_per_prompt 2
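diffusers also ships a native adapter pipeline; a sketch of the same segmentation-guided generation through that route. The diffusers-format hub repo name is an assumption, and the repo's own inference.py instead consumes the raw .pth downloaded above:

```python
import torch
from diffusers import StableDiffusionAdapterPipeline, T2IAdapter
from PIL import Image

# Diffusers-format copy of the seg adapter; repo name assumed from the TencentARC hub org.
adapter = T2IAdapter.from_pretrained("TencentARC/t2iadapter_seg_sd14v1", torch_dtype=torch.float16)
pipe = StableDiffusionAdapterPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", adapter=adapter, torch_dtype=torch.float16
).to("cuda")

seg = Image.open("assets/t2i-adapter.png")  # the segmentation map used above
image = pipe("green hair, curly hair, beach, seaside", image=seg).images[0]
image.save("t2i_sketch.png")
```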
cd pretrained_models
git clone https://huggingface.co/timbrooks/instruct-pix2pix
python inference.py \
--mode 'instruct' \
--ref_img assets/t2i-input.png \
--prompt "turn her face to comic style" \
--neg_prompt None \
--image_guidance_scale 1 \
--outpath results/1.png \
--num_images_per_prompt 1
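The 'instruct' mode wraps timbrooks/instruct-pix2pix; a roughly equivalent stock diffusers call, with paths mirroring the command above, looks like this:

```python
import torch
from diffusers import StableDiffusionInstructPix2PixPipeline
from PIL import Image

pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(
    "pretrained_models/instruct-pix2pix", torch_dtype=torch.float16
).to("cuda")

img = Image.open("assets/t2i-input.png").convert("RGB")
out = pipe(
    "turn her face to comic style",
    image=img,
    image_guidance_scale=1,  # lower values let the edit stray further from the input
).images[0]
out.save("instruct_sketch.png")
```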
The model is mainly derived from FaceVid2Vid, with a 512-resolution high-definition upgrade added.
wget https://github.com/LeslieZhoa/Simple-Lora/releases/download/v0.0/script.zip
unzip script.zip && rm -rf script.zip
python script/run.py --input assets/6.png
ffmpeg -r 25 -f image2 -i results/%06d.png -vcodec libx264 11.mp4
https://github.com/huggingface/diffusers
https://github.com/AUTOMATIC1111/stable-diffusion-webui
https://github.com/salesforce/BLIP
https://github.com/haofanwang/Lora-for-Diffusers
https://github.com/lllyasviel/ControlNet
https://github.com/haofanwang/ControlNet-for-Diffusers
https://github.com/haofanwang/T2I-Adapter-for-Diffusers
https://github.com/TencentARC/T2I-Adapter
https://github.com/HimariO/diffusers-t2i-adapter
https://github.com/zhanglonghao1992/One-Shot_Free-View_Neural_Talking_Head_Synthesis