min dalle

Python

v0.4

min(DALL·E)

YouTube Walk-through by The AI Epiphany

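This is a fast, minimal port of Boris Dayma's DALL·E Mini (with mega weights). It has been stripped down for inference and converted to PyTorch. The only third-party dependencies are numpy, requests, pillow and torch.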

Generating a 3x3 grid of DALL·E Mega images takes:

55 seconds with a T4 in Colab
33 seconds with a P100 in Colab
15 seconds with an A10G on Hugging Face

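Here is a more detailed breakdown of performance on an A100. Credit to @Technobird22 and his NeoGen Discord bot.

The Flax model and the code for converting it to torch can be found here.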

Install

$ pip install min-dalle

Usage

Load the model parameters once and reuse the model to generate multiple images.

import torch
from min_dalle import MinDalle

model = MinDalle(
    models_root='./pretrained',
    dtype=torch.float32,
    device='cuda',
    is_mega=True,
    is_reusable=True
)

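The required models will be downloaded to models_root if they are not already there. Set dtype to torch.float16 to save GPU memory. If you have an Ampere architecture GPU you can use torch.bfloat16. Set device to either "cuda" or "cpu".

Once everything has finished initializing, call generate_image with some text as many times as you want. Use a positive seed for reproducible results. Higher values of supercondition_factor give better agreement with the text but a narrower variety of generated images. Every image token is sampled from the top_k most probable tokens. The largest logit is subtracted from the logits to avoid infs. The logits are then divided by temperature. If is_seamless is true, the image grid is tiled in token space rather than pixel space.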

image = model.generate_image(
    text='Nuclear explosion broccoli',
    seed=-1,
    grid_size=4,
    is_seamless=False,
    temperature=1,
    top_k=256,
    supercondition_factor=32,
    is_verbose=False
)

display(image)

Credit to @hardmaru for the example
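The sampling steps described above (blending by supercondition_factor, keeping the top_k tokens, subtracting the largest logit, dividing by temperature) can be illustrated with a small standalone sketch. This is not the library's code: the function names supercondition and sample_top_k are hypothetical, the blending formula is an assumed formulation of superconditioning, and plain Python lists stand in for tensors so that it runs without torch.

```python
import math
import random


def supercondition(cond_logits, uncond_logits, factor):
    """Blend text-conditioned and unconditioned logits (assumed formulation)."""
    # factor == 1 returns the conditioned logits unchanged; larger values
    # push the result further away from the unconditioned prediction.
    return [u * (1 - factor) + c * factor
            for c, u in zip(cond_logits, uncond_logits)]


def sample_top_k(logits, top_k, temperature, rng):
    """Sample one token index from the top_k largest logits."""
    # Indices of the top_k largest logits, best first.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:top_k]
    # Subtracting the largest logit keeps exp() from overflowing.
    largest = logits[top[0]]
    weights = [math.exp((logits[i] - largest) / temperature) for i in top]
    # Draw one index in proportion to its weight.
    return rng.choices(top, weights=weights, k=1)[0]


rng = random.Random(0)
cond = [2.0, 1.0, 0.5, -1.0]      # toy logits with the text prompt
uncond = [1.0, 1.0, 1.0, 1.0]     # toy logits without the prompt
logits = supercondition(cond, uncond, factor=16)
token = sample_top_k(logits, top_k=2, temperature=1.0, rng=rng)
```

In this toy case the blend widens the gap between the best and second-best token, which is one way to see why a high supercondition_factor reduces variety.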

Saving Individual Images

The images can also be generated as a FloatTensor in case you want to process them manually.

images = model.generate_images(
    text='Nuclear explosion broccoli',
    seed=-1,
    grid_size=3,
    is_seamless=False,
    temperature=1,
    top_k=256,
    supercondition_factor=16,
    is_verbose=False
)

To get an image into PIL format, first move the images to the CPU and convert the tensor to a numpy array.

images = images.to('cpu').numpy()

Then image $i$ can be converted to a PIL.Image and saved

from PIL import Image
# Cast to uint8 first (assumes values are already on a 0-255 scale)
image = Image.fromarray(images[i].astype('uint8'))
image.save('image_{}.png'.format(i))
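To save every image in the batch rather than a single image i, the conversion above can be wrapped in a loop. The following is a minimal sketch and not part of min-dalle: save_images is a hypothetical helper, and it assumes the array has shape (count, height, width, 3) with values on a 0-255 scale. A small array of zeros stands in for real model output.

```python
import tempfile
from pathlib import Path

import numpy as np
from PIL import Image


def save_images(images, out_dir):
    """Save each image of a (count, height, width, 3) array as a PNG file."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, pixels in enumerate(images):
        # Assumption: values are on a 0-255 scale; clip and cast to 8-bit.
        pixels = np.clip(pixels, 0, 255).astype(np.uint8)
        path = out / 'image_{}.png'.format(i)
        Image.fromarray(pixels).save(path)
        paths.append(path)
    return paths


# Stand-in for the array produced by images.to('cpu').numpy()
demo = np.zeros((2, 8, 8, 3), dtype=np.float32)
saved = save_images(demo, tempfile.mkdtemp())
```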

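Progressive Outputs

If the model is being used interactively (e.g. in a notebook), generate_image_stream can be used to generate a stream of images as the model is decoding. The detokenizer adds a slight delay for each image. Set progressive_outputs to True to enable this. An example is implemented in the Colab.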

image_stream = model.generate_image_stream(
    text='Dali painting of WALL·E',
    seed=-1,
    grid_size=3,
    progressive_outputs=True,
    is_seamless=False,
    temperature=1,
    top_k=256,
    supercondition_factor=16,
    is_verbose=False
)

for image in image_stream:
    display(image)
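Outside a notebook there is no display function, so one option is to write each intermediate image to disk instead. The sketch below assumes only that the stream yields objects with a PIL-style save(path) method; save_stream, the out_dir argument and the frame_NNN.png naming are hypothetical and not part of min-dalle.

```python
from pathlib import Path


def save_stream(image_stream, out_dir, prefix='frame'):
    """Save every image yielded by the stream and return the last one."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    last = None
    for step, image in enumerate(image_stream):
        # Each item is assumed to expose a PIL-style save(path) method.
        image.save(out / '{}_{:03d}.png'.format(prefix, step))
        last = image
    return last
```

Called as final = save_stream(image_stream, './frames'), this leaves one file per decoding step and returns the finished image. It returns None if the stream yields nothing.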

Command Line

Use image_from_text.py to generate images from the command line.

$ python image_from_text.py --text='artificial intelligence' --no-mega
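For several prompts, the same command can be assembled from Python. This is a minimal sketch that uses only the two flags shown above (--text and --no-mega); build_command and its use_mega parameter are hypothetical names, and the script is assumed to be in the current directory.

```python
import shlex
import sys


def build_command(text, use_mega=True, script='image_from_text.py'):
    """Build the argument list for one image_from_text.py run."""
    command = [sys.executable, script, '--text={}'.format(text)]
    if not use_mega:
        command.append('--no-mega')
    return command


prompts = ['artificial intelligence', 'Dali painting of WALL·E']
commands = [build_command(p, use_mega=False) for p in prompts]
for command in commands:
    # Print a shell-quoted line; pass `command` to subprocess.run to execute it.
    print(shlex.join(command))
```

Passing the arguments as a list avoids shell quoting problems with prompts that contain spaces or quotes.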

Additional information

Version v0.4
Type Python
Updated 2025-07-14
Size 3.94MB
Source GitHub
