SparseZoo

Neural network model repository for highly sparse and sparse-quantized models with matching sparsification recipes

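Overview

SparseZoo is a constantly growing repository of sparsified (pruned and pruned-quantized) models with matching sparsification recipes for neural networks. It shortens the time needed to build performant deep learning models by providing a collection of inference-optimized models and recipes to prototype from. Read more about sparsification.

Available via API and hosted in the cloud, SparseZoo contains both baseline models and models sparsified to different degrees of inference performance versus baseline loss recovery. Recipe-driven approaches built around sparsification algorithms allow you to use the models as given, transfer-learn from the models onto private datasets, or transfer the recipes to your own architectures.

The GitHub repository contains the Python API code that handles the connection and authentication to the cloud.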

New SparseZoo Models

October 2023

Generative AI

Sparse MPT Models - 21 variants
- ⚡ Highlighted model ⚡: mpt-7b-gsm8k_mpt_pretrain-pruned80_quantized
Sparse OPT Models - 12 variants
- ⚡ Highlighted model ⚡: opt-6.7b-opt_pretrain-pruned50_quantw8a8
Sparse CodeGen (mono, multi) Models - 10 variants
- ⚡ Highlighted model ⚡: codegen_multi-350m-bigquery_thepile-pruned50_quantized

Highlights

Model Stub Architecture Overview
Available Model Recipes
sparsezoo.neuralmagic.com

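Installation

This repository is tested on Python 3.8-3.11 and on Linux/Debian systems. Installing in a virtual environment is recommended to keep your system in order.

Install with pip: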

pip install sparsezoo

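Quick Tour

The SparseZoo Python API enables you to search for and download sparsified models. Code examples are given below. We encourage users to load SparseZoo models by copying a stub directly from a model page.

Introduction to the Model Class Object

The Model is the fundamental object that serves as the main interface to the SparseZoo library. It represents a SparseZoo model together with all of its directories and files.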

Creating a Model Class Object From a SparseZoo Stub

from sparsezoo import Model

stub = "zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned95_quant-none"

model = Model(stub)
print(str(model))

>> Model(stub=zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned95_quant-none)
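
A stub like the one above is a slash-separated path, so its parts can be pulled apart with plain string handling. The following is only an illustrative sketch: the helper `parse_stub` and the field names (`domain`, `sub_domain`, `architecture`, `sub_architecture`, `framework`, `repo`, `dataset`, `sparse_tag`) are assumptions inferred from the example stubs and search arguments in this README, not part of the SparseZoo API, and it only handles stubs with seven path segments.

```python
from typing import Dict, Optional

# Hypothetical field names, inferred from the example stubs in this README.
STUB_FIELDS = (
    "domain", "sub_domain", "architecture",
    "framework", "repo", "dataset", "sparse_tag",
)


def parse_stub(stub: str) -> Dict[str, Optional[str]]:
    """Split a 'zoo:' stub into named parts (illustrative helper only)."""
    if not stub.startswith("zoo:"):
        raise ValueError(f"expected a stub starting with 'zoo:', got {stub!r}")
    # Drop the prefix and any trailing '?key=value' query arguments.
    path = stub[len("zoo:"):].split("?", 1)[0]
    parts = path.split("/")
    if len(parts) != len(STUB_FIELDS):
        raise ValueError(f"expected {len(STUB_FIELDS)} segments, got {len(parts)}")
    fields: Dict[str, Optional[str]] = dict(zip(STUB_FIELDS, parts))
    # Assumption: text after the first '-' in the architecture segment
    # is the sub-architecture (e.g. 'resnet_v1-50' -> 'resnet_v1' and '50').
    arch, _, sub_arch = parts[2].partition("-")
    fields["architecture"] = arch
    fields["sub_architecture"] = sub_arch or None
    return fields


parsed = parse_stub(
    "zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned95_quant-none"
)
print(parsed["architecture"], parsed["sub_architecture"], parsed["sparse_tag"])
```

In practice, copying the stub from a model page and passing it to Model is enough; a parser like this is only useful for bookkeeping on your side.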

Creating a Model Class Object From a Local Model Directory

from sparsezoo import Model

directory = ".../.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0"

model = Model(directory)
print(str(model))

>> Model(directory=.../.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0)

Manually Specifying the Model Download Path

Unless specified otherwise, a model created from a SparseZoo stub is saved to the local SparseZoo cache directory. This can be overridden by passing the optional download_path argument to the constructor:

from sparsezoo import Model

stub = "zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned95_quant-none"
download_directory = "./model_download_directory"

model = Model(stub, download_path=download_directory)

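Downloading the Model Files

Once the model is initialized from a stub, it can be downloaded either by calling the download() method or by accessing the path property. Both approaches work for every file in SparseZoo. Accessing the path property always triggers a download unless the file has already been downloaded.

# method 1
model.download()

# method 2
model_path = model.path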
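
The "download on first access" behavior described above can be pictured with a small toy class. This is a sketch of the general lazy-download pattern only: `LazyFile`, its `download_count` counter, and the placeholder file contents are hypothetical and are not how SparseZoo is implemented.

```python
import tempfile
from pathlib import Path


class LazyFile:
    """Toy stand-in for a remote file; not the SparseZoo implementation."""

    def __init__(self, name: str, cache_dir: Path):
        self.name = name
        self._cache_dir = cache_dir
        self.download_count = 0  # counts simulated downloads, for the demo only

    def download(self) -> None:
        target = self._cache_dir / self.name
        if target.exists():
            return  # already on disk, nothing to do
        target.write_text("placeholder contents")  # simulated download
        self.download_count += 1

    @property
    def path(self) -> str:
        # Accessing the property first makes sure the file exists locally.
        self.download()
        return str(self._cache_dir / self.name)


with tempfile.TemporaryDirectory() as tmp:
    lazy_file = LazyFile("model.md", Path(tmp))
    first = lazy_file.path   # triggers the simulated download
    second = lazy_file.path  # reuses the local copy
    downloads = lazy_file.download_count

print(downloads)
```

The design choice this illustrates is that callers never need to check whether a file is present; asking for its path is enough.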

Inspecting the Contents of the SparseZoo Model

We use the available_files property to see which files are present in the SparseZoo model. We then select a file through the corresponding attribute:

model.available_files

>> {'training': Directory(name=training),
>> 'deployment': Directory(name=deployment),
>> 'sample_inputs': Directory(name=sample_inputs.tar.gz),
>> 'sample_outputs': {'framework': Directory(name=sample_outputs.tar.gz)},
>> 'sample_labels': Directory(name=sample_labels.tar.gz),
>> 'model_card': File(name=model.md),
>> 'recipes': Directory(name=recipe),
>> 'onnx_model': File(name=model.onnx)}

We can then take a closer look at the contents of the SparseZoo model:

model_card = model.model_card
print(model_card)

>> File(name=model.md)

model_card_path = model.model_card.path
print(model_card_path)

>> .../.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0/model.md

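Model, Directory, and File

In general, every file in a SparseZoo model shares a set of attributes: name, path, URL, and parent:

name serves as the identifier of the file/directory
path points to the location of the file/directory
URL specifies the server address of the file/directory in question
parent points to the location of the parent directory of the file/directory in question

A directory is a special type of file that contains other files. For that reason, it has an additional files attribute.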

print(model.onnx_model)

>> File(name=model.onnx)

print(f"File name: {model.onnx_model.name}\n"
      f"File path: {model.onnx_model.path}\n"
      f"File URL: {model.onnx_model.url}\n"
      f"Parent directory: {model.onnx_model.parent_directory}")

>> File name: model.onnx
>> File path: .../.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0/model.onnx
>> File URL: https://models.neuralmagic.com/cv-classification/...
>> Parent directory: .../.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0

print(model.recipes)

>> Directory(name=recipe)

print(f"File name: {model.recipes.name}\n"
      f"Contains: {[file.name for file in model.recipes.files]}\n"
      f"File path: {model.recipes.path}\n"
      f"File URL: {model.recipes.url}\n"
      f"Parent directory: {model.recipes.parent_directory}")

>> File name: recipe
>> Contains: ['recipe_original.md', 'recipe_transfer-classification.md']
>> File path: /home/user/.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0/recipe
>> File URL: None
>> Parent directory: /home/user/.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0

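Selecting Checkpoint-Specific Data

A SparseZoo model may contain several checkpoints. One checkpoint may have been saved before the model was quantized; that checkpoint is the one to use for transfer learning. Another checkpoint may have been saved after the quantization step; that one is usually used directly for inference.

Recipes may also vary depending on the use case. We may want the recipe that was used to sparsify the dense model (recipe_original) or the one that enables sparse transfer learning from the already sparsified model (recipe_transfer).

There are two ways to access these specific files.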

Accessing Recipes (Through the Python API)

available_recipes = model.recipes.available
print(available_recipes)

>> ['original', 'transfer-classification']

transfer_recipe = model.recipes["transfer-classification"]
print(transfer_recipe)

>> File(name=recipe_transfer-classification.md)

original_recipe = model.recipes.default  # recipe defaults to `original`
original_recipe_path = original_recipe.path  # downloads the recipe and returns its path
print(original_recipe_path)

>> .../.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0/recipe/recipe_original.md

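Accessing Checkpoints (Through the Python API)

In general, we expect the following checkpoints to be included in a model:

checkpoint_prepruning
checkpoint_postpruning
checkpoint_preqat
checkpoint_postqat

The checkpoint that the model defaults to is the preqat state (just before the quantization step).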

from sparsezoo import Model

stub = "zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/pruned_quant_3layers-aggressive_84"

model = Model(stub)
available_checkpoints = model.training.available
print(available_checkpoints)

>> ['preqat']

preqat_checkpoint = model.training.default  # checkpoint defaults to `preqat`
preqat_checkpoint_path = preqat_checkpoint.path  # downloads the checkpoint and returns its path
print(preqat_checkpoint_path)

>> .../.cache/sparsezoo/0857c6f2-13c1-43c9-8db8-8f89a548dccd/training

# list the files that make up the checkpoint
for file in preqat_checkpoint.files:
    print(file.name)

>> vocab.txt
>> special_tokens_map.json
>> pytorch_model.bin
>> config.json
>> training_args.bin
>> tokenizer_config.json
>> trainer_state.json
>> tokenizer.json

Accessing Recipes (Through Stub String Arguments)

You can also request a specific recipe or checkpoint type directly by appending the appropriate URL query arguments to the stub:

from sparsezoo import Model

stub = "zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned95_quant-none?recipe=transfer"

model = Model(stub)

# Inspect which files are present.
# Note that the available recipes are restricted
# according to the specified URL query arguments
print(model.recipes.available)

>> ['transfer-classification']

transfer_recipe = model.recipes.default  # Now the recipes default to the one selected by the stub string arguments
print(transfer_recipe)

>> File(name=recipe_transfer-classification.md)
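
Because the stub arguments use ordinary URL query syntax, they can be separated from the stub with the standard library. The sketch below shows one way the restriction seen in the output above could come about. The helpers `split_stub_args` and `filter_recipes` are hypothetical, and the prefix-matching rule is an assumption made for illustration, not documented SparseZoo behavior.

```python
from typing import Dict, List, Tuple
from urllib.parse import parse_qsl


def split_stub_args(stub: str) -> Tuple[str, Dict[str, str]]:
    """Separate a stub from its '?key=value' arguments (illustrative helper)."""
    base, _, query = stub.partition("?")
    return base, dict(parse_qsl(query))


def filter_recipes(available: List[str], args: Dict[str, str]) -> List[str]:
    """Keep only the recipes selected by the 'recipe' argument, if present."""
    wanted = args.get("recipe")
    if wanted is None:
        return list(available)
    # Assumption: a recipe matches when its name starts with the requested type.
    return [name for name in available if name.startswith(wanted)]


base, args = split_stub_args(
    "zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/"
    "pruned95_quant-none?recipe=transfer"
)
selected = filter_recipes(["original", "transfer-classification"], args)
print(base)
print(args)
print(selected)
```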

Accessing Sample Data

You can easily request a sample batch of data that represents the inputs and outputs of the model.

sample_data = model.sample_batch(batch_size=10)

print(sample_data['sample_inputs'][0].shape)
>> (10, 3, 224, 224)  # (batch_size, num_channels, image_dim, image_dim)

print(sample_data['sample_outputs'][0].shape)
>> (10, 1000)  # (batch_size, num_classes)

Model Search

The function search_models enables you to quickly filter the contents of the SparseZoo repository to find the stubs of interest:

from sparsezoo import search_models

args = {
    "domain": "cv",
    "sub_domain": "segmentation",
    "architecture": "yolact",
}

models = search_models(**args)
for model in models:
    print(model)

>> Model(stub=zoo:cv/segmentation/yolact-darknet53/pytorch/dbolya/coco/pruned82_quant-none)
>> Model(stub=zoo:cv/segmentation/yolact-darknet53/pytorch/dbolya/coco/pruned90-none)
>> Model(stub=zoo:cv/segmentation/yolact-darknet53/pytorch/dbolya/coco/base-none)
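
If you already hold a list of stubs (for example, copied from model pages), the same kind of filtering can be done locally. This is a minimal sketch under stated assumptions: `filter_stubs` is a hypothetical helper that compares the first three stub segments, it is not how search_models works internally, and it does not contact the SparseZoo service.

```python
from typing import List, Optional

# Stubs taken from the examples in this README.
STUBS = [
    "zoo:cv/segmentation/yolact-darknet53/pytorch/dbolya/coco/pruned82_quant-none",
    "zoo:cv/segmentation/yolact-darknet53/pytorch/dbolya/coco/pruned90-none",
    "zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned95_quant-none",
]


def filter_stubs(
    stubs: List[str],
    domain: Optional[str] = None,
    sub_domain: Optional[str] = None,
    architecture: Optional[str] = None,
) -> List[str]:
    """Return the stubs whose leading segments match the given filters."""
    matches = []
    for stub in stubs:
        # Assumes every stub starts with 'zoo:' and has at least three segments.
        stub_domain, stub_sub_domain, arch_segment = stub[len("zoo:"):].split("/")[:3]
        # Assumption: the architecture is the text before the first '-'.
        stub_arch = arch_segment.split("-", 1)[0]
        if domain is not None and stub_domain != domain:
            continue
        if sub_domain is not None and stub_sub_domain != sub_domain:
            continue
        if architecture is not None and stub_arch != architecture:
            continue
        matches.append(stub)
    return matches


found = filter_stubs(STUBS, domain="cv", sub_domain="segmentation", architecture="yolact")
for stub in found:
    print(stub)
```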

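Environment Variables

You can specify the directories where models (temporarily, during download) and the required credentials are saved on your working machine.

SPARSEZOO_MODELS_PATH is the path where downloaded models are temporarily saved. Default: ~/.cache/sparsezoo/
SPARSEZOO_CREDENTIALS_PATH is the path where credentials.yaml is saved. Default: ~/.cache/sparsezoo/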
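
The lookup order implied above (use the environment variable if set, otherwise fall back to the default) can be sketched as follows. `resolve_dir` is a hypothetical helper written for illustration; only the two variable names and the default path come from this README, and treating both values as directories is an assumption.

```python
import os
from pathlib import Path
from typing import Mapping

DEFAULT_CACHE = "~/.cache/sparsezoo/"  # default stated in this README


def resolve_dir(
    env_var: str,
    environ: Mapping[str, str] = os.environ,
    default: str = DEFAULT_CACHE,
) -> Path:
    """Return the directory named by env_var, or the default if it is unset."""
    return Path(environ.get(env_var, default)).expanduser()


# Use explicit mappings here so the example does not depend on your shell.
models_dir = resolve_dir(
    "SPARSEZOO_MODELS_PATH",
    {"SPARSEZOO_MODELS_PATH": "/tmp/sparsezoo_models"},
)
credentials_dir = resolve_dir("SPARSEZOO_CREDENTIALS_PATH", {})

print(models_dir)
print(credentials_dir)
```

Calling `resolve_dir("SPARSEZOO_MODELS_PATH")` without the second argument reads the real process environment instead.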

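Console Scripts

In addition to the Python API, the sparsezoo package installs a console script entry point. This lets you interact with SparseZoo directly from your console or terminal.

Download

Download command help

sparsezoo.download -h

Download the ResNet-50 model

sparsezoo.download zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/base-none

Download the pruned and quantized ResNet-50 model

sparsezoo.download zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned_quant-moderate

Search

Search command help

sparsezoo search -h

Search for all classification MobileNetV1 models in the computer vision domain

sparsezoo search --domain cv --sub-domain classification --architecture mobilenet_v1

Search for all ResNet-50 models

sparsezoo search --domain cv --sub-domain classification \
    --architecture resnet_v1 --sub-architecture 50

For a more in-depth read, check out the SparseZoo documentation.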

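Resources

Learn More

Documentation: SparseML, SparseZoo, Sparsify, DeepSparse
Neural Magic: Blog, Resources

Release History

Official builds are hosted on PyPI

stable: sparsezoo
nightly (dev): sparsezoo-nightly

Additionally, more information can be found via GitHub Releases.

License

The project is licensed under the Apache License Version 2.0.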

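Community

Contribute

We appreciate contributions to the code, examples, integrations, and documentation, as well as bug reports and feature requests! Learn how here.

Join

For user help or questions about SparseZoo, sign up or log in to our Neural Magic Community. We are growing the community member by member and are happy to see you there. Bugs, feature requests, or additional questions can also be posted to our GitHub Issue Queue.

You can get the latest news, webinar and event invites, research papers, and other ML performance tidbits by subscribing to the Neural Magic Community.

For more general questions about Neural Magic, please fill out this form.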
