FlaxDiff

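This project is partially supported by Google's TPU Research Cloud. I would like to thank the Google Cloud TPU team for providing the resources to train larger text-conditional models in a multi-host distributed setting.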

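A Versatile and Simple Diffusion Library

In recent years, diffusion and score-based multi-step models have revolutionized the field of generative AI. However, the latest research in this area has become highly math-intensive, which makes it challenging to understand how state-of-the-art diffusion models work and produce such impressive images. Replicating this research in code can be daunting.

FlaxDiff is a library of tools (schedulers, samplers, models, etc.) designed and implemented in an easy-to-understand way. The focus is on understandability and readability over performance. I started this project as a hobby to get familiar with Flax and JAX and to learn about diffusion and the latest research in generative AI.

I initially started this project in Keras, being familiar with TensorFlow 2.0, but moved to Flax, powered by JAX, for its performance and ease of use. The old notebooks and models, including my first Flax models, are also provided.

The Diffusion_flax_linen.ipynb notebook is my main workspace for experiments. Several checkpoints are uploaded to the pretrained folder, along with a copy of the working notebook associated with each checkpoint. You may need to copy the notebook to the working root for it to run properly.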

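Example Notebooks from Scratch

In the example notebooks folder, you will find comprehensive notebooks on various diffusion techniques, written entirely from scratch and independent of the FlaxDiff library. Each notebook includes detailed explanations of the underlying mathematics and concepts, making them a valuable resource for learning and understanding diffusion models.

Available Notebooks and Resources

Diffusion explained (nbviewer link) (local link)
- WORK IN PROGRESS: An in-depth exploration of the concepts behind diffusion-based generative models: DDPM (Denoising Diffusion Probabilistic Models), DDIM (Denoising Diffusion Implicit Models), and the SDE/ODE generalizations of diffusion.

EDM (Elucidating the Design Space of Diffusion-based Generative Models)
- TODO: A thorough guide to EDM, discussing the innovative approaches and techniques used in this advanced diffusion model.

These notebooks aim to provide an easy-to-understand, step-by-step guide to the various diffusion models and techniques. They are designed to be beginner-friendly, so they may not follow the exact formulations and implementations of the original papers where deviating makes them easier to understand and generalize. I have still tried my best to keep them as accurate as possible. If you find any mistakes or have any suggestions, please feel free to open an issue or a pull request.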

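Other Resources

Multi-host data-parallel training script in JAX
- A training script for multi-host data-parallel training in JAX, meant as a reference for training large models on multiple GPUs/TPUs across multiple hosts. A full-fledged tutorial notebook is in the works.
TPU utilities for making life easier
- A collection of utilities and scripts that make working with TPUs easier, such as a CLI to create/start/stop/set up TPUs, a script to set up TPU VMs (installing everything you need), mounting GCS datasets, etc.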
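
As a rough illustration of the data-parallel pattern such a training script relies on, here is a minimal sketch using `jax.pmap`. It is not the repository's script: the toy linear model and the names `loss_fn` and `train_step` are assumptions made for this example. It runs in a single process on however many local devices are available; on a real multi-host setup each process would typically call `jax.distributed.initialize()` first.

```python
import functools

import jax
import jax.numpy as jnp


def loss_fn(params, x, y):
    # Toy linear-regression loss standing in for a diffusion loss (assumption).
    pred = x @ params["w"] + params["b"]
    return jnp.mean((pred - y) ** 2)


@functools.partial(jax.pmap, axis_name="devices")
def train_step(params, x, y):
    loss, grads = jax.value_and_grad(loss_fn)(params, x, y)
    # Average gradients and loss across devices so every replica applies the same update.
    grads = jax.lax.pmean(grads, axis_name="devices")
    loss = jax.lax.pmean(loss, axis_name="devices")
    params = jax.tree_util.tree_map(lambda p, g: p - 0.1 * g, params, grads)
    return params, loss


n_dev = jax.local_device_count()
params = {"w": jnp.zeros((3, 1)), "b": jnp.zeros((1,))}
# Replicate the parameters: the leading axis indexes the device.
replicated = jax.tree_util.tree_map(lambda p: jnp.stack([p] * n_dev), params)

# Shard the batch: each device receives its own slice of 8 examples.
x = jax.random.normal(jax.random.PRNGKey(0), (n_dev, 8, 3))
y = jnp.sum(x, axis=-1, keepdims=True)

replicated, loss = train_step(replicated, x, y)
```

Because the gradients are averaged with `pmean` before the update, the replicated parameters stay identical across devices after every step.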

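Disclaimer (and About Me)

I worked as a Machine Learning Researcher at Hyperverge from 2019 to 2021, focusing on computer vision, specifically face anti-spoofing and face detection and recognition. Since switching to my current job in 2021, I have not done much R&D work, which led me to start this pet project to revisit and relearn the fundamentals and get familiar with the state of the art. My current role mainly involves Golang systems engineering, with some applied ML work sprinkled in. The code may therefore reflect my learning journey. Please forgive any mistakes, and do open an issue to let me know.

Also, some of the text may have been generated with the help of GitHub Copilot, so please excuse any mistakes in the text.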

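Index

A Versatile and Simple Diffusion Library
Disclaimer (and About Me)
Features
- Schedulers
- Model Predictors
- Samplers
- Training
- Models
Installation
Getting Started
- Training Example
- Inference Example
References and Acknowledgements
Pending To-Do List
Gallery
Contribution
License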

Features

Schedulers

Implemented in flaxdiff.schedulers:

LinearNoiseSchedule (flaxdiff.schedulers.LinearNoiseSchedule): A beta-parameterized discrete scheduler.
CosineNoiseSchedule (flaxdiff.schedulers.CosineNoiseSchedule): A beta-parameterized discrete scheduler.
ExpNoiseSchedule (flaxdiff.schedulers.ExpNoiseSchedule): A beta-parameterized discrete scheduler.
CosineContinuousNoiseScheduler (flaxdiff.schedulers.CosineContinuousNoiseScheduler): A continuous scheduler.
CosineGeneralNoiseScheduler (flaxdiff.schedulers.CosineGeneralNoiseScheduler): A continuous sigma-parameterized cosine scheduler.
KarrasVENoiseScheduler (flaxdiff.schedulers.KarrasVENoiseScheduler): A sigma-parameterized continuous scheduler proposed by Karras et al. 2022, best suited for inference.
EDMNoiseScheduler (flaxdiff.schedulers.EDMNoiseScheduler): A sigma-parameterized continuous scheduler based on EDM (Elucidating the Design Space of Diffusion-based Generative Models), best suited for training, paired with KarrasVENoiseScheduler for inference.
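
To make the difference between a beta-parameterized discrete schedule and a sigma-parameterized continuous one more concrete, here is a small from-scratch sketch in JAX. It does not use or reproduce the flaxdiff.schedulers classes: the function names `linear_beta_rates` and `karras_sigmas` and their default values are assumptions for this example. The defaults follow the commonly used DDPM beta range and the sigma discretization from Karras et al. 2022.

```python
import jax.numpy as jnp


def linear_beta_rates(timesteps=1000, beta_start=1e-4, beta_end=0.02):
    """Discrete beta-parameterized schedule.

    Returns the per-step signal rate sqrt(alpha_bar_t) and noise rate
    sqrt(1 - alpha_bar_t), so that x_t = signal * x_0 + noise * eps.
    """
    betas = jnp.linspace(beta_start, beta_end, timesteps)
    alphas_cumprod = jnp.cumprod(1.0 - betas)
    return jnp.sqrt(alphas_cumprod), jnp.sqrt(1.0 - alphas_cumprod)


def karras_sigmas(num_steps, sigma_min=0.002, sigma_max=80.0, rho=7.0):
    """Sigma-parameterized schedule: noise levels from sigma_max down to sigma_min."""
    ramp = jnp.linspace(0.0, 1.0, num_steps)
    min_inv_rho = sigma_min ** (1.0 / rho)
    max_inv_rho = sigma_max ** (1.0 / rho)
    # Interpolate linearly in sigma^(1/rho) space, then map back.
    return (max_inv_rho + ramp * (min_inv_rho - max_inv_rho)) ** rho


signal, noise = linear_beta_rates()
sigmas = karras_sigmas(10)
```

A larger `rho` concentrates more of the steps near `sigma_min`, which is where fine image detail is resolved.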

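Model Predictors

Implemented in flaxdiff.predictors:

EpsilonPredictor (flaxdiff.predictors.EpsilonPredictor): Predicts the noise in the data.
X0Predictor (flaxdiff.predictors.X0Predictor): Predicts the original data from the noisy data.
VPredictor (flaxdiff.predictors.VPredictor): Predicts a linear combination of the data and the noise, commonly used in EDM.
KarrasEDMPredictor (flaxdiff.predictors.KarrasEDMPredictor): A generalized predictor for EDM that integrates the various parameterizations.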
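
The prediction targets listed above are different views of the same quantities, and it can help to see how they convert into one another. The sketch below is not the flaxdiff.predictors API. It assumes the forward process x_t = alpha * x_0 + sigma * eps with alpha^2 + sigma^2 = 1 (a variance-preserving convention) and the common definition v = alpha * eps - sigma * x_0. The function names are hypothetical.

```python
import jax
import jax.numpy as jnp


def x0_from_eps(x_t, eps, alpha, sigma):
    # Invert x_t = alpha * x0 + sigma * eps for x0.
    return (x_t - sigma * eps) / alpha


def v_from_x0_eps(x0, eps, alpha, sigma):
    # v-prediction target (assumed convention: v = alpha * eps - sigma * x0).
    return alpha * eps - sigma * x0


def x0_eps_from_v(x_t, v, alpha, sigma):
    # Only valid when alpha**2 + sigma**2 == 1.
    x0 = alpha * x_t - sigma * v
    eps = sigma * x_t + alpha * v
    return x0, eps


key_x, key_e = jax.random.split(jax.random.PRNGKey(0))
x0 = jax.random.normal(key_x, (4, 8))
eps = jax.random.normal(key_e, (4, 8))
alpha, sigma = jnp.cos(0.3), jnp.sin(0.3)

x_t = alpha * x0 + sigma * eps
v = v_from_x0_eps(x0, eps, alpha, sigma)
x0_rec, eps_rec = x0_eps_from_v(x_t, v, alpha, sigma)
```

Whichever target the network is trained on, a sampler can recover both the clean-sample estimate and the noise estimate through conversions of this kind.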

Samplers

Implemented in flaxdiff.samplers:

DDPMSampler (flaxdiff.samplers.DDPMSampler): Implements the Denoising Diffusion Probabilistic Model (DDPM) sampling process.
DDIMSampler (flaxdiff.samplers.DDIMSampler): Implements the Denoising Diffusion Implicit Model (DDIM) sampling process.
EulerSampler (flaxdiff.samplers.EulerSampler): An ODE solver sampler using Euler's method.
HeunSampler (flaxdiff.samplers.HeunSampler): An ODE solver sampler using Heun's method.
RK4Sampler (flaxdiff.samplers.RK4Sampler): An ODE solver sampler using the Runge-Kutta method.
MultiStepDPM (flaxdiff.samplers.MultiStepDPM): Implements a multi-step sampling method inspired by the multistep DPM solver, as presented in tonyduan/diffusion.
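
To show what the ODE-solver samplers do at each step, here is a self-contained toy sketch of Euler and Heun updates in the sigma parameterization, where the probability-flow ODE is dx/dsigma = (x - D(x, sigma)) / sigma. This is not the flaxdiff.samplers code. The "denoiser" is a stand-in that always returns a fixed target (an assumption that makes the ODE exactly solvable), and all function names are hypothetical.

```python
import jax
import jax.numpy as jnp


def toy_denoiser(x, sigma, target):
    # Stand-in for a trained model: always predicts the same clean sample.
    return target


def ode_derivative(x, sigma, target):
    # dx/dsigma = (x - D(x, sigma)) / sigma
    return (x - toy_denoiser(x, sigma, target)) / sigma


def euler_step(x, sigma, sigma_next, target):
    # First order: one model evaluation per step.
    return x + ode_derivative(x, sigma, target) * (sigma_next - sigma)


def heun_step(x, sigma, sigma_next, target):
    # Second order: an Euler predictor followed by a trapezoidal correction,
    # which costs two model evaluations per step.
    d = ode_derivative(x, sigma, target)
    x_euler = x + d * (sigma_next - sigma)
    if sigma_next == 0.0:
        # The derivative is undefined at sigma = 0, so fall back to Euler.
        return x_euler
    d_next = ode_derivative(x_euler, sigma_next, target)
    return x + 0.5 * (d + d_next) * (sigma_next - sigma)


def sample(step_fn, x, sigmas, target):
    for sigma, sigma_next in zip(sigmas[:-1], sigmas[1:]):
        x = step_fn(x, sigma, sigma_next, target)
    return x


target = jnp.array([1.0, -2.0, 0.5])
sigmas = [80.0, 20.0, 5.0, 1.0, 0.1, 0.0]
x_init = target + sigmas[0] * jax.random.normal(jax.random.PRNGKey(0), target.shape)

x_euler = sample(euler_step, x_init, sigmas, target)
x_heun = sample(heun_step, x_init, sigmas, target)
```

With this trivial denoiser both solvers land on the target. With a real model the trajectories are curved, and the extra evaluation in Heun's method is what buys accuracy at low step counts.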

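Training

Implemented in flaxdiff.trainer:

DiffusionTrainer (flaxdiff.trainer.DiffusionTrainer): A class designed to facilitate the training of diffusion models. It manages the training loop, loss calculation, and model updates.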
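
For readers who want to see the three responsibilities named above (training loop, loss calculation, model updates) in their simplest form, here is a from-scratch sketch in plain JAX. It is not how DiffusionTrainer is implemented. The tiny linear "model", the cosine signal/noise rates, the plain gradient-descent update, and every name in the block are assumptions chosen to keep the example short.

```python
import jax
import jax.numpy as jnp

LEARNING_RATE = 0.05


def init_params(dim):
    # One extra input row for the time conditioning.
    return {"w": jnp.zeros((dim + 1, dim)), "b": jnp.zeros((dim,))}


def predict_noise(params, x_t, t):
    # Tiny linear "model", conditioned on time by concatenation.
    inputs = jnp.concatenate([x_t, t[:, None]], axis=-1)
    return inputs @ params["w"] + params["b"]


def diffusion_loss(params, x0, key):
    key_t, key_eps = jax.random.split(key)
    t = jax.random.uniform(key_t, (x0.shape[0],))
    eps = jax.random.normal(key_eps, x0.shape)
    alpha = jnp.cos(0.5 * jnp.pi * t)[:, None]  # signal rate
    sigma = jnp.sin(0.5 * jnp.pi * t)[:, None]  # noise rate
    x_t = alpha * x0 + sigma * eps              # forward (noising) process
    return jnp.mean((predict_noise(params, x_t, t) - eps) ** 2)


@jax.jit
def train_step(params, x0, key):
    loss, grads = jax.value_and_grad(diffusion_loss)(params, x0, key)
    params = jax.tree_util.tree_map(lambda p, g: p - LEARNING_RATE * g, params, grads)
    return params, loss


data_key, eval_key, train_key = jax.random.split(jax.random.PRNGKey(0), 3)
x0 = 1.0 + 0.1 * jax.random.normal(data_key, (64, 4))  # toy "dataset"
params = init_params(4)
initial_loss = float(diffusion_loss(params, x0, eval_key))

for step in range(200):
    # Fresh randomness each step: new timesteps and new noise.
    train_key, step_key = jax.random.split(train_key)
    params, loss = train_step(params, x0, step_key)

final_loss = float(diffusion_loss(params, x0, eval_key))
```

A real trainer would swap in a UNet, an optimizer such as Adam, EMA of the parameters, and checkpointing, but the loop keeps this shape.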

Models

Implemented in flaxdiff.models:

UNet (flaxdiff.models.simple_unet.SimpleUNet): A sample UNet architecture for diffusion models.
Layers: A library of layers including upsampling (flaxdiff.models.simple_unet.Upsample), downsampling (flaxdiff.models.simple_unet.Downsample), time embeddings (flaxdiff.models.simple_unet.FouriedEmbedding), residual blocks (flaxdiff.models.simple_unet.ResidualBlock), and attention (flaxdiff.models.simple_unet.AttentionBlock).
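
As an illustration of what a time-embedding layer computes, here is a sketch of the widely used sinusoidal embedding in plain JAX. This is an assumption about the general technique, not a copy of the library's embedding layer, which may use a different variant (for example random or learned Fourier features). The function name and the `max_period` default are hypothetical.

```python
import jax.numpy as jnp


def sinusoidal_time_embedding(t, features=256, max_period=10000.0):
    """Map a batch of scalar times, shape (batch,), to vectors of shape (batch, features)."""
    half = features // 2
    # Geometrically spaced frequencies from 1 down to roughly 1 / max_period.
    freqs = jnp.exp(-jnp.log(max_period) * jnp.arange(half) / half)
    angles = t[:, None] * freqs[None, :]
    return jnp.concatenate([jnp.sin(angles), jnp.cos(angles)], axis=-1)


emb = sinusoidal_time_embedding(jnp.array([0.0, 0.5, 1.0]), features=256)
```

The resulting vector is usually passed through a small MLP and injected into each residual block so the network knows the current noise level.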

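Installation

To install FlaxDiff, you need Python 3.10 or higher. Install the required dependencies using:

pip install -r requirements.txt

The models were trained and tested with jax==0.4.28 and flax==0.8.4. However, when I updated to the latest jax==0.4.30 and flax==0.8.5, the models stopped training. There seem to have been some major changes that break the training dynamics, so I recommend sticking to the versions listed in requirements.txt.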
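
Since training is reported to depend on the exact versions, a quick check of the installed packages before a long run can save time. The helper below is a hypothetical convenience, not part of FlaxDiff. It uses only the standard library, and the pinned versions are the ones quoted above.

```python
from importlib import metadata

# Versions the models are reported to have been trained and tested with.
PINNED = {"jax": "0.4.28", "flax": "0.8.4"}


def check_versions(pinned=PINNED):
    """Return {package: (installed_version_or_None, matches_pin)}."""
    report = {}
    for package, wanted in pinned.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None
        report[package] = (installed, installed == wanted)
    return report


for package, (installed, ok) in check_versions().items():
    status = "ok" if ok else f"expected {PINNED[package]}"
    print(f"{package}: {installed} ({status})")
```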

Getting Started

Training Example

Here is a simplified example to get you started with training a diffusion model using FlaxDiff:

from flaxdiff.schedulers import EDMNoiseScheduler
from flaxdiff.predictors import KarrasPredictionTransform
from flaxdiff.models.simple_unet import SimpleUNet as UNet
from flaxdiff.trainer import DiffusionTrainer
import jax
import optax
from datetime import datetime

BATCH_SIZE = 16
IMAGE_SIZE = 64

# Define noise scheduler
edm_schedule = EDMNoiseScheduler(1, sigma_max=80, rho=7, sigma_data=0.5)

# Define model
unet = UNet(emb_features=256,
            feature_depths=[64, 128, 256, 512],
            attention_configs=[{"heads": 4}, {"heads": 4}, {"heads": 4}, {"heads": 4}, {"heads": 4}],
            num_res_blocks=2,
            num_middle_res_blocks=1)

# Load dataset
# NOTE: get_dataset is not defined in this snippet. It is assumed to be a
# user-supplied helper that returns the data iterator and the dataset length.
data, datalen = get_dataset("oxford_flowers102", batch_size=BATCH_SIZE, image_scale=IMAGE_SIZE)
batches = datalen // BATCH_SIZE

# Define optimizer
solver = optax.adam(2e-4)

# Create trainer
trainer = DiffusionTrainer(unet, optimizer=solver,
                           noise_schedule=edm_schedule,
                           rngs=jax.random.PRNGKey(4),
                           name="Diffusion_SDE_VE_" + datetime.now().strftime("%Y-%m-%d_%H:%M:%S"),
                           model_output_transform=KarrasPredictionTransform(sigma_data=edm_schedule.sigma_data))

# Train the model
final_state = trainer.fit(data, batches, epochs=2000)

Inference Example

Here is a simplified example for generating images using a trained model:

from flaxdiff.samplers import DiffusionSampler

class EulerSampler(DiffusionSampler):
    def take_next_step(self, current_samples, reconstructed_samples, pred_noise, current_step, state, next_step=None):
        # Signal (alpha) and noise (sigma) rates at the current and next steps
        current_alpha, current_sigma = self.noise_schedule.get_rates(current_step)
        next_alpha, next_sigma = self.noise_schedule.get_rates(next_step)
        # Euler step taken in sigma
        dt = next_sigma - current_sigma
        x_0_coeff = (current_alpha * next_sigma - next_alpha * current_sigma) / dt
        dx = (current_samples - x_0_coeff * reconstructed_samples) / current_sigma
        next_samples = current_samples + dx * dt
        return next_samples, state

# Create sampler (reuses trainer and edm_schedule from the training example)
sampler = EulerSampler(trainer.model, trainer.state.ema_params, edm_schedule, model_output_transform=trainer.model_output_transform)

# Generate images
samples = sampler.generate_images(num_images=64, diffusion_steps=100, start_step=1000, end_step=0)
# NOTE: plotImages is not defined in this snippet; it is assumed to be a
# user-supplied plotting helper.
plotImages(samples, dpi=300)
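
The arithmetic inside take_next_step is easier to trust once it is reduced algebraically. Assuming the rates returned by get_rates are the signal rate alpha and noise rate sigma of a forward process x_t = alpha * x0 + sigma * eps (an assumption about the convention, not something stated above), the update simplifies to x_next = alpha_next * x0_hat + sigma_next * eps_hat, which is the deterministic DDIM-style step. The sketch below re-implements the same arithmetic as a standalone function (the name `euler_update` is hypothetical) and checks that identity numerically.

```python
import jax
import jax.numpy as jnp


def euler_update(x_t, x0_hat, alpha, sigma, alpha_next, sigma_next):
    # Same arithmetic as take_next_step, written as a pure function.
    dt = sigma_next - sigma
    x0_coeff = (alpha * sigma_next - alpha_next * sigma) / dt
    dx = (x_t - x0_coeff * x0_hat) / sigma
    return x_t + dx * dt


key_x, key_e = jax.random.split(jax.random.PRNGKey(0))
x0_hat = jax.random.normal(key_x, (2, 4))   # model's clean-sample estimate
eps_hat = jax.random.normal(key_e, (2, 4))  # implied noise estimate

alpha, sigma = 0.6, 0.8            # rates at the current step
alpha_next, sigma_next = 0.8, 0.6  # rates at the next (less noisy) step

x_t = alpha * x0_hat + sigma * eps_hat
x_next = euler_update(x_t, x0_hat, alpha, sigma, alpha_next, sigma_next)

# Closed form the update should reduce to under the stated assumption.
x_next_closed_form = alpha_next * x0_hat + sigma_next * eps_hat
```

In other words, the step keeps the current estimates of the clean image and the noise and re-mixes them at the next step's rates.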

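References and Acknowledgements

Research papers and preprints

The original Denoising Diffusion Probabilistic Models (DDPM) paper
Denoising Diffusion Implicit Models (DDIM) paper
Improved Denoising Diffusion Probabilistic Models paper
Diffusion Models Beat GANs on Image Synthesis paper
Score-Based Generative Modeling through Stochastic Differential Equations paper
Elucidating the Design Space of Diffusion-Based Generative Models (EDM) paper
Perception Prioritized Training of Diffusion Models (P2 Weighting) paper
Pseudo Numerical Methods for Diffusion Models on Manifolds (PNMDM) paper
DPM-Solver: A Fast ODE Solver for Diffusion Probabilistic Model Sampling in Around 10 Steps paper

Useful blogs and codebases

An incredible series of blog posts by Sander Dieleman on various diffusion-related topics. See in particular the posts on diffusion models, typicality, the geometry of diffusion guidance, and noise schedules.
An awesome blog series by Tony Duan on diffusion models from scratch. Although it trains models on MNIST and the implementation is a bit basic, it explains the math very well. Its codebase is also available.
The k-diffusion codebase by Katherine Crowson, which hosts an exhaustive implementation of the EDM paper (Karras et al.) along with DPM-Solver and DPM-Solver++ (2S and 2M) in PyTorch. Most other diffusion libraries borrow from it.
The official EDM implementation by Tero Karras, in PyTorch. Really neat code, and the reference implementation for all the Karras-based samplers and schedules.
The Hugging Face Diffusers library, arguably the most complete set of implementations of the latest state-of-the-art techniques and concepts in this field. The repository is written mainly in PyTorch, with Flax implementations also available, and it likewise focuses on completeness and ease of understanding.
The Keras DDPM tutorial by A_K_Nain and the Keras DDIM implementation by András Béres, which are great starting points for beginners to understand the basics of diffusion models. I started out by trying to implement the concepts introduced in these tutorials from scratch.
Special thanks to ChatGPT-4 by OpenAI for helping clear my doubts.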

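Pending To-Do List

Advanced solvers such as DPM/DPM2/DPM++ etc.
SDE versions of the current ODE solvers, i.e. ancestral sampling
Text-conditioned image generation
Classifier and classifier-free guidance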
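
Two of the items above, ancestral sampling and classifier-free guidance, follow well-known recipes that fit in a few lines. The sketch below is a generic illustration, not FlaxDiff code. The guidance formula uses the convention in which a scale of 1 reproduces the conditional prediction (other codebases use (1 + w) weighting instead), and the ancestral step follows the sigma_up/sigma_down split popularized by k-diffusion. All function names are hypothetical.

```python
import jax
import jax.numpy as jnp


def classifier_free_guidance(pred_uncond, pred_cond, guidance_scale):
    # Extrapolate from the unconditional prediction towards the conditional one.
    return pred_uncond + guidance_scale * (pred_cond - pred_uncond)


def ancestral_sigmas(sigma_from, sigma_to, eta=1.0):
    # Split the target noise level into a deterministic part (sigma_down)
    # and freshly injected noise (sigma_up): sigma_down^2 + sigma_up^2 = sigma_to^2.
    sigma_up = jnp.minimum(
        sigma_to,
        eta * jnp.sqrt(sigma_to**2 * (sigma_from**2 - sigma_to**2) / sigma_from**2),
    )
    sigma_down = jnp.sqrt(sigma_to**2 - sigma_up**2)
    return sigma_down, sigma_up


def euler_ancestral_step(x, denoised, sigma_from, sigma_to, key):
    sigma_down, sigma_up = ancestral_sigmas(sigma_from, sigma_to)
    d = (x - denoised) / sigma_from          # ODE derivative dx/dsigma
    x = x + d * (sigma_down - sigma_from)    # deterministic Euler step to sigma_down
    noise = jax.random.normal(key, x.shape)
    return x + noise * sigma_up              # re-inject noise up to sigma_to


sigma_down, sigma_up = ancestral_sigmas(10.0, 5.0)
guided = classifier_free_guidance(jnp.zeros(3), jnp.ones(3), guidance_scale=2.0)
```

With `eta=0` the ancestral step collapses to the plain Euler ODE step, which is one way to see the two samplers as members of the same family.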

Gallery

Images generated by Euler Ancestral Sampler in 200 steps [text2image with CFG]

Model trained on Laion-Aesthetics 12M + CC12M + MS COCO + 1M aesthetic 6+ subset of COYO-700M on TPU-v4-32. Prompts: a beautiful landscape with a river with mountains (×16), a beautiful forest with a river and sunlight (×8), a big mansion with a garden (×8)

Parameters:
- Dataset: Laion-Aesthetics 12M + CC12M + MS COCO + 1M aesthetic 6+ subset of COYO-700M
- Batch size: 256
- Image size: 128
- Training epochs: 5
- Steps per epoch: 74573
- Model configuration: feature_depths=[128, 256, 512, 1024]
- Training noise schedule: EDMNoiseScheduler
- Inference noise schedule: KarrasEDMPredictor

[Image: EulerA with CFG]

Images generated by Euler Ancestral Sampler in 200 steps [text2image with CFG]

Images generated from the following prompts using classifier-free guidance with guidance factor = 2: 'water tulip, a water lily, a water lily, a water lily, a photo of a marigold, a water lily, a water lily, a photo of a lotus, a photo of a lotus, a photo of a lotus, a photo of a rose, a photo of a rose, a photo of a rose, a photo of a rose, a photo of a rose'

Parameters:
- Dataset: oxford_flowers102
- Batch size: 16
- Image size: 128
- Training epochs: 1000
- Steps per epoch: 511
- Training noise schedule: EDMNoiseScheduler
- Inference noise schedule: KarrasEDMPredictor

[Image: EulerA with CFG]

Images generated by Euler Ancestral Sampler in 200 steps [text2image with CFG]

Images generated from the following prompts using classifier-free guidance with guidance factor = 4: 'water tulip, a water lily, a water lily, a photo of a rose, a photo of a rose, a water lily, a water lily, a photo of a marigold, a photo of a marigold, a photo of a marigold, a water lily, a photo of a sunflower, a photo of a lotus, columbine, columbine, an orchid, an orchid, an orchid, a water lily, a water lily, a water lily, columbine, columbine, a photo of a sunflower, a photo of a sunflower, a photo of a sunflower, a photo of a lotus, a photo of a lotus, a photo of a marigold, a photo of a marigold, a photo of a rose, a photo of a rose, a photo of a rose, orange dahlia, orange dahlia, a lenten rose, a lenten rose, a water lily, a water lily, a water lily, a water lily, an orchid, an orchid, an orchid, hard-leaved pocket orchid, bird of paradise, bird of paradise, a photo of a lovely rose, a photo of a lovely rose, a photo of a globe-flower, a photo of a globe-flower, a photo of a lovely rose, a photo of a lovely rose, a photo of a ruby-lipped cattleya, a photo of a ruby-lipped cattleya, a photo of a lovely rose, a water lily, a osteospermum, a osteospermum, a water lily, a water lily, a water lily, a red rose, a red rose'

Parameters:
- Dataset: oxford_flowers102
- Batch size: 16
- Image size: 128
- Training epochs: 1000
- Steps per epoch: 511
- Training noise schedule: EDMNoiseScheduler
- Inference noise schedule: KarrasEDMPredictor

[Image: EulerA with CFG]

Images generated by DDPM Sampler in 1000 steps [Unconditional]

Parameters:
- Dataset: oxford_flowers102
- Batch size: 16
- Image size: 64
- Training epochs: 1000
- Steps per epoch: 511
- Training noise schedule: CosineNoiseSchedule
- Inference noise schedule: CosineNoiseSchedule
- Model: UNet(emb_features=256, feature_depths=[64, 128, 256, 512], attention_configs=[{"heads":4}, {"heads":4}, {"heads":4}, {"heads":4}, {"heads":4}], num_res_blocks=2, num_middle_res_blocks=1)

[Image: DDPM sampler results]

Images generated by Heun Sampler in 10 steps (Heun takes 2x inference steps) [Unconditional]

Parameters:
- Dataset: oxford_flowers102
- Batch size: 16
- Image size: 64
- Training epochs: 1000
- Steps per epoch: 511
- Training noise schedule: EDMNoiseScheduler
- Inference noise schedule: KarrasEDMPredictor
- Model: UNet(emb_features=256, feature_depths=[64, 128, 256, 512], attention_configs=[{"heads":4}, {"heads":4}, {"heads":4}, {"heads":4}, {"heads":4}], num_res_blocks=2, num_middle_res_blocks=1)

[Image: Heun sampler results]