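SparseZoo is a constantly growing repository of sparsified (pruned and pruned-quantized) models with matching sparsification recipes for neural networks. It simplifies and accelerates your time-to-value in building performant deep learning models with a collection of inference-optimized models and recipes to prototype from. Read more about sparsification.
Available via API and hosted in the cloud, SparseZoo contains both baseline models and models sparsified to different degrees of inference performance vs. baseline loss recovery. Recipe-driven approaches built around sparsification algorithms allow you to use the models as given, transfer-learn from the models onto private datasets, or transfer the recipes to your own architectures.
The GitHub repository contains the Python API code to handle the connection and authentication to the cloud.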
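This repository is tested on Python 3.8-3.11 and Linux/Debian systems. It is recommended to install in a virtual environment to keep your system in order.

Install with pip using:

pip install sparsezoo

The SparseZoo Python API enables you to search for and download sparsified models. Code examples are given below. We encourage users to load SparseZoo models by copying a stub directly from a model page.

The Model is a fundamental object that serves as the main interface to the SparseZoo library. It represents a SparseZoo model, together with all its directories and files.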
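To make the structure of a stub easier to read, here is a minimal sketch that splits a stub string into its slash-separated parts. It does not use the SparseZoo library. The helper `parse_stub` and the `StubParts` class are hypothetical. The field names `domain`, `sub_domain`, `architecture`, and `sub_architecture` follow the search arguments used in this document, while `framework`, `repo`, `dataset`, and `sparse_tag` are assumed labels for the remaining segments.

```python
from dataclasses import dataclass
from typing import Dict, Optional
from urllib.parse import parse_qsl


@dataclass
class StubParts:
    """Hypothetical container for the segments of a SparseZoo stub."""

    domain: str
    sub_domain: str
    architecture: str
    sub_architecture: Optional[str]
    framework: str   # assumed label
    repo: str        # assumed label
    dataset: str     # assumed label
    sparse_tag: str  # assumed label
    params: Dict[str, str]


def parse_stub(stub: str) -> StubParts:
    """Split a 'zoo:...' stub into its parts (illustrative only)."""
    prefix = "zoo:"
    if not stub.startswith(prefix):
        raise ValueError(f"expected a stub starting with {prefix!r}: {stub!r}")

    # Separate an optional query string such as '?recipe=transfer'.
    body, _, query = stub[len(prefix):].partition("?")
    segments = body.split("/")
    if len(segments) != 7:
        raise ValueError(f"expected 7 '/'-separated segments, got {len(segments)}")

    domain, sub_domain, arch, framework, repo, dataset, sparse_tag = segments
    # 'resnet_v1-50' -> architecture 'resnet_v1', sub-architecture '50'
    architecture, _, sub_architecture = arch.partition("-")

    return StubParts(
        domain=domain,
        sub_domain=sub_domain,
        architecture=architecture,
        sub_architecture=sub_architecture or None,
        framework=framework,
        repo=repo,
        dataset=dataset,
        sparse_tag=sparse_tag,
        params=dict(parse_qsl(query)),
    )


parts = parse_stub(
    "zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned95_quant-none"
)
print(parts.architecture, parts.sub_architecture)  # prints: resnet_v1 50
```

A parser like this is only useful for inspecting or validating stubs locally; loading a model still goes through `Model(stub)`.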
from sparsezoo import Model

# Create a Model from a SparseZoo stub
stub = "zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned95_quant-none"
model = Model(stub)
print(str(model))
>> Model(stub=zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned95_quant-none)

from sparsezoo import Model

# Create a Model from a local model directory
directory = ".../.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0"
model = Model(directory)
print(str(model))
>> Model(directory=.../.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0)

Unless specified otherwise, a model created from a SparseZoo stub is saved to the local SparseZoo cache directory. This can be overridden by passing the optional download_path argument to the constructor:
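from sparsezoo import Model

stub = "zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned95_quant-none"
download_directory = "./model_download_directory"
# Save the model to a custom directory instead of the default cache
model = Model(stub, download_path=download_directory)

Once the model is initialized from a stub, it can be downloaded either by calling the download() method or by accessing the path property. Both pathways work for every file in SparseZoo. Accessing the path property always triggers a download unless the file has already been downloaded.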
# method 1
model.download()

# method 2
model_path = model.path

We inspect available_files to check which files are present in the SparseZoo model. We then select a file through the corresponding attribute:
model.available_files
>> {'training': Directory(name=training),
>> 'deployment': Directory(name=deployment),
>> 'sample_inputs': Directory(name=sample_inputs.tar.gz),
>> 'sample_outputs': {'framework': Directory(name=sample_outputs.tar.gz)},
>> 'sample_labels': Directory(name=sample_labels.tar.gz),
>> 'model_card': File(name=model.md),
>> 'recipes': Directory(name=recipe),
>> 'onnx_model': File(name=model.onnx)}

We can then take a closer look at the contents of the SparseZoo model:
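model_card = model.model_card
print(model_card)
>> File(name=model.md)

model_card_path = model.model_card.path
print(model_card_path)
>> .../.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0/model.md

In general, every file in the SparseZoo model shares a set of attributes: name, path, URL, and parent:

name serves as an identifier of the file/directory
path points to the location of the file/directory
URL specifies the server address of the file/directory in question
parent points to the location of the parent directory of the file/directory in question

A directory is the only type of file that contains other files. For that reason, it has an additional files attribute.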
print(model.onnx_model)
>> File(name=model.onnx)

print(f"File name: {model.onnx_model.name}\n"
      f"File path: {model.onnx_model.path}\n"
      f"File URL: {model.onnx_model.url}\n"
      f"Parent directory: {model.onnx_model.parent_directory}")
>> File name: model.onnx
>> File path: .../.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0/model.onnx
>> File URL: https://models.neuralmagic.com/cv-classification/...
>> Parent directory: .../.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0

print(model.recipes)
>> Directory(name=recipe)

print(f"File name: {model.recipes.name}\n"
      f"Contains: {[file.name for file in model.recipes.files]}\n"
      f"File path: {model.recipes.path}\n"
      f"File URL: {model.recipes.url}\n"
      f"Parent directory: {model.recipes.parent_directory}")
>> File name: recipe
>> Contains: ['recipe_original.md', 'recipe_transfer-classification.md']
>> File path: /home/user/.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0/recipe
>> File URL: None
>> Parent directory: /home/user/.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0

A SparseZoo model may contain several checkpoints. It may contain a checkpoint that was saved before the model was quantized; that checkpoint would be used for transfer learning. Another checkpoint may have been saved after the quantization step; that one is usually used directly for inference.
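Recipes may also vary depending on the use case. We may want to access the recipe that was used to sparsify the dense model (recipe_original) or the one that enables sparse transfer learning from the already sparsified model (recipe_transfer).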
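As an illustration of how recipe names appear to relate to the recipe file names shown in this section (recipe_original.md, recipe_transfer-classification.md), the following minimal sketch strips an assumed `recipe_` prefix and the file extension. The helper `recipe_names` is hypothetical and not part of the SparseZoo API; the naming rule is inferred from the example outputs in this document, not confirmed by the library.

```python
from pathlib import PurePosixPath
from typing import List


def recipe_names(file_names: List[str], prefix: str = "recipe_") -> List[str]:
    """Derive short recipe names from recipe file names (assumed convention)."""
    names = []
    for file_name in file_names:
        stem = PurePosixPath(file_name).stem  # drop the '.md' extension
        # Remove the assumed 'recipe_' prefix when it is present.
        names.append(stem[len(prefix):] if stem.startswith(prefix) else stem)
    return names


print(recipe_names(["recipe_original.md", "recipe_transfer-classification.md"]))
# prints: ['original', 'transfer-classification']
```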
There are two ways to access these specific files.

available_recipes = model.recipes.available
print(available_recipes)
>> ['original', 'transfer-classification']

transfer_recipe = model.recipes["transfer-classification"]
print(transfer_recipe)
>> File(name=recipe_transfer-classification.md)

original_recipe = model.recipes.default  # recipe defaults to `original`
original_recipe_path = original_recipe.path  # downloads the recipe and returns its path
print(original_recipe_path)
>> .../.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0/recipe/recipe_original.md

In general, we expect the following checkpoints to be included in the model:

checkpoint_prepruning
checkpoint_postpruning
checkpoint_preqat
checkpoint_postqat

The checkpoint that the model defaults to is the preqat state (just before the quantization step).
from sparsezoo import Model

stub = "zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/pruned_quant_3layers-aggressive_84"
model = Model(stub)
available_checkpoints = model.training.available
print(available_checkpoints)
>> ['preqat']

preqat_checkpoint = model.training.default  # checkpoint defaults to `preqat`
preqat_checkpoint_path = preqat_checkpoint.path  # downloads the checkpoint and returns its path
print(preqat_checkpoint_path)
>> .../.cache/sparsezoo/0857c6f2-13c1-43c9-8db8-8f89a548dccd/training

# List the files that make up the checkpoint
for file in preqat_checkpoint.files:
    print(file.name)
>> vocab.txt
>> special_tokens_map.json
>> pytorch_model.bin
>> config.json
>> training_args.bin
>> tokenizer_config.json
>> trainer_state.json
>> tokenizer.json

You can also request a specific recipe/checkpoint type directly by appending the appropriate URL query arguments to the stub:
from sparsezoo import Model

stub = "zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned95_quant-none?recipe=transfer"
model = Model(stub)

# Inspect which files are present.
# Note that the available recipes are restricted
# according to the specified URL query arguments
print(model.recipes.available)
>> ['transfer-classification']

transfer_recipe = model.recipes.default  # now the recipes default to the one selected by the stub string arguments
print(transfer_recipe)
>> File(name=recipe_transfer-classification.md)

Users can easily request a sample batch of data that represents the inputs and outputs of the model.
sample_data = model.sample_batch(batch_size=10)

print(sample_data['sample_inputs'][0].shape)
>> (10, 3, 224, 224)  # (batch_size, num_channels, image_dim, image_dim)

print(sample_data['sample_outputs'][0].shape)
>> (10, 1000)  # (batch_size, num_classes)

The function search_models enables users to quickly filter the contents of the SparseZoo repository to find the stubs of interest:
from sparsezoo import search_models

# Filter criteria for the search
args = {
    "domain": "cv",
    "sub_domain": "segmentation",
    "architecture": "yolact",
}

models = search_models(**args)
for model in models:
    print(model)
>> Model(stub=zoo:cv/segmentation/yolact-darknet53/pytorch/dbolya/coco/pruned82_quant-none)
>> Model(stub=zoo:cv/segmentation/yolact-darknet53/pytorch/dbolya/coco/pruned90-none)
>> Model(stub=zoo:cv/segmentation/yolact-darknet53/pytorch/dbolya/coco/base-none)

Users can specify the directory where models (temporarily, during download) and the required credentials are saved on the working machine.

SPARSEZOO_MODELS_PATH is the path where downloaded models are temporarily saved. Default: ~/.cache/sparsezoo/
SPARSEZOO_CREDENTIALS_PATH is the path for credentials.yaml. Default: ~/.cache/sparsezoo/
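The lookup described above can be sketched as follows: read the environment variable if it is set, otherwise fall back to the documented default of ~/.cache/sparsezoo/. The helper `resolve_sparsezoo_path` is hypothetical and is not the library's own implementation. It also assumes that both variables hold a directory path, which this document does not state explicitly for SPARSEZOO_CREDENTIALS_PATH.

```python
import os
from pathlib import Path

# Default location documented for both environment variables.
DEFAULT_SPARSEZOO_DIR = Path("~/.cache/sparsezoo")


def resolve_sparsezoo_path(env_var: str) -> Path:
    """Return the directory named by env_var, or the documented default."""
    value = os.environ.get(env_var)
    base = Path(value) if value else DEFAULT_SPARSEZOO_DIR
    return base.expanduser()  # expand a leading '~' to the home directory


models_dir = resolve_sparsezoo_path("SPARSEZOO_MODELS_PATH")
# Assumption: the credentials variable points at the directory holding the file.
credentials_file = resolve_sparsezoo_path("SPARSEZOO_CREDENTIALS_PATH") / "credentials.yaml"
print(models_dir)
print(credentials_file)
```

In a shell, the variables would typically be exported before starting Python so that the override applies to the whole session.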
In addition to the Python API, the sparsezoo package installs a console script entry point. This lets you interact with SparseZoo directly from your console/terminal.

Download command help

sparsezoo.download -h

Download a ResNet-50 model

sparsezoo.download zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/base-none

Download a pruned and quantized ResNet-50 model

sparsezoo.download zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned_quant-moderate

Search command help

sparsezoo search -h

Search for all classification MobileNetV1 models in the computer vision domain

sparsezoo search --domain cv --sub-domain classification --architecture mobilenet_v1

Search for all ResNet-50 models

sparsezoo search --domain cv --sub-domain classification --architecture resnet_v1 --sub-architecture 50

For a more in-depth read, check out the SparseZoo documentation.
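When the search commands need to be generated from a script, the argument list can be assembled programmatically. The sketch below only builds the command and prints it; it does not run it. The helper `build_search_command` is hypothetical, uses only the flags shown in the commands above, and assumes that --domain and --sub-domain are always supplied.

```python
import shlex
from typing import List, Optional


def build_search_command(
    domain: str,
    sub_domain: str,
    architecture: Optional[str] = None,
    sub_architecture: Optional[str] = None,
) -> List[str]:
    """Build the argument list for a `sparsezoo search` invocation."""
    command = ["sparsezoo", "search", "--domain", domain, "--sub-domain", sub_domain]
    if architecture:
        command += ["--architecture", architecture]
    if sub_architecture:
        command += ["--sub-architecture", sub_architecture]
    return command


command = build_search_command("cv", "classification", "resnet_v1", "50")
print(shlex.join(command))
# prints: sparsezoo search --domain cv --sub-domain classification --architecture resnet_v1 --sub-architecture 50
```

The resulting list can be passed to `subprocess.run(command, check=True)` on a machine where the sparsezoo package is installed.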
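Official builds are hosted on PyPI.
Additionally, more information can be found via GitHub Releases.
The project is licensed under the Apache License Version 2.0.
We appreciate contributions to the code, examples, integrations, and documentation, as well as bug reports and feature requests! Learn how here.
For user help or questions about SparseZoo, sign up or log in to our Neural Magic Community. We are growing the community member by member and are happy to see you there. Bugs, feature requests, or additional questions can also be posted to our GitHub Issue Queue.
You can get the latest news, webinar and event invites, research papers, and other ML performance tidbits by subscribing to the Neural Magic Community.
For more general questions about Neural Magic, please fill out this form.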