Lightweight networks that control the spatial information of Stable Diffusion, fine-tuned for Chinese prompts
ControlLoRA is a project that uses LoRA to lightly fine-tune Stable Diffusion so that its spatial information can be controlled. It typically uses a small, simple network (~7M parameters, ~25 MB of storage). More information is available from the ControlLoRA project.
This project can be regarded as a fork of ControlLoRA. It provides two Chinese-language models built with the ControlLoRA method.
You can use the online Hugging Face Spaces to upload your own pictures and Chinese prompt text and see the output. Because the Spaces are deployed on CPU, I recommend downloading the projects and running them locally on your GPU. (The apps rely on an "is_available" check, so they switch devices dynamically depending on whether a GPU is present.)
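The device switching described above can be sketched as follows. This is a minimal illustration, not the project's actual code: it assumes that "is_available" refers to PyTorch's `torch.cuda.is_available()`, and the helper name `pick_device` is hypothetical.

```python
import importlib.util


def pick_device() -> str:
    """Return "cuda" when a CUDA GPU is usable, otherwise "cpu".

    Assumption: the apps use PyTorch, so the check is torch.cuda.is_available().
    If PyTorch is not installed at all, fall back to "cpu".
    """
    if importlib.util.find_spec("torch") is None:
        return "cpu"
    import torch  # imported lazily so the helper also works without PyTorch

    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    # Hypothetical usage: a pipeline would then be moved with .to(device)
    device = pick_device()
    print(f"Running on: {device}")
```

With a check like this, the same `app.py` can run unchanged on a CPU-only Space and on a local machine with a GPU.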
| Name | HuggingFace Model Link | HuggingFace Space Link |
|---|---|---|
| ControlNet By Canny Chinese | https://huggingface.co/svjack/canny-control-lora-zh | https://huggingface.co/spaces/svjack/ControlNet-Canny-Chinese |
| ControlNet By Pose Chinese | https://huggingface.co/svjack/pose-control-lora-zh | https://huggingface.co/spaces/svjack/ControlNet-Pose-Chinese |
Install the dependencies with `pip install -r requirements.txt`.
After installation, `cd` into `ControlNet-Canny-Chinese` or `ControlNet-Pose-Chinese` and run each app separately with `python app.py`.
Then open your browser and go to http://localhost:7860 to experiment in the browser.
| Name | Prompt | Original Image | Backbone Image | Transformed Image |
|---|---|---|---|---|
| ControlNet By Canny Chinese | A playful clown | ![]() | ![]() | ![]() |
| ControlNet By Canny Chinese | Night full of meteors | ![]() | ![]() | ![]() |
| ControlNet By Canny Chinese | Cat vampire | ![]() | ![]() | ![]() |
| ControlNet By Pose Chinese | Wheat Field Watcher | ![]() | ![]() | ![]() |
| ControlNet By Pose Chinese | Military officer in military uniform | ![]() | ![]() | ![]() |
LoRA (Low-Rank Adaptation of Large Language Models) reduces the number of trainable parameters by learning pairs of rank-decomposition matrices while freezing the original weights. This greatly reduces the storage needed to adapt a large model to specific downstream tasks and makes task switching efficient at deployment time, without introducing additional inference latency. The LoRA authors also report that it outperforms several other adaptation methods, including adapters, prefix-tuning, and fine-tuning.
I also provide three Stable Diffusion models fine-tuned with LoRA. The CC3M dataset was downloaded and converted with svjack/img2dataset-pq2hf-transform-toolkit.
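To make the rank-decomposition idea concrete, here is a minimal pure-Python sketch of a LoRA-style linear layer. It is an illustration under stated assumptions, not the code used by this project: the names `LoRALinear` and `lora_param_counts`, the zero initialization of `B`, and the 768-dimensional, rank-4 example are all chosen for illustration only.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_param_counts(d_out, d_in, rank):
    """Return (full, lora) trainable parameter counts for one weight matrix."""
    full = d_out * d_in            # fine-tuning the whole matrix W
    lora = rank * (d_out + d_in)   # training only B (d_out x r) and A (r x d_in)
    return full, lora


class LoRALinear:
    """Linear layer with a frozen weight W and a trainable low-rank update B @ A."""

    def __init__(self, weight, rank, scale=1.0, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(weight), len(weight[0])
        self.weight = weight  # frozen: never updated during fine-tuning
        # A starts with small random values, B starts at zero,
        # so the layer initially behaves exactly like the original one.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]
        self.scale = scale

    def effective_weight(self):
        """Return W + scale * (B @ A)."""
        delta = matmul(self.B, self.A)
        return [
            [w + self.scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(self.weight, delta)
        ]

    def forward(self, x):
        """Apply the adapted layer to one input vector x of length d_in."""
        return [sum(w * xi for w, xi in zip(row, x)) for row in self.effective_weight()]
```

For a hypothetical 768 x 768 weight matrix at rank 4, `lora_param_counts(768, 768, 4)` returns `(589824, 6144)`, so the trainable update is roughly 1% of the full matrix. Because only `A` and `B` need to be stored per task, switching tasks means swapping these small matrices while the base weights stay shared.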
| Name | HuggingFace Model Link | Language | Fine-tuning Dataset |
|---|---|---|---|
| svjack/pokemon-sd-lora-zh | https://huggingface.co/svjack/pokemon-sd-lora-zh | Chinese | svjack/pokemon-blip-captions-en-zh |
| svjack/concept-caption-3m-sd-lora-en | https://huggingface.co/svjack/concept-caption-3m-sd-lora-en | English | Conceptual Captions (CC3M) |
| svjack/concept-caption-3m-sd-lora-zh | https://huggingface.co/svjack/concept-caption-3m-sd-lora-zh | Chinese | Conceptual Captions (CC3M) |
See each model card for instructions on how to use these models.
svjack
Project Link: https://github.com/svjack/ControlLoRA-Chinese