SparseZoo

Neural network model repository for highly sparse and sparse-quantized models with matching sparsification recipes

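Overview

SparseZoo is a constantly growing repository of sparsified (pruned and pruned-quantized) models with matching sparsification recipes for neural networks. It shortens the time needed to build performant deep learning models by providing a collection of inference-optimized models and recipes to prototype from. Read more about sparsification.

Available via API and hosted in the cloud, SparseZoo contains both baseline models and models sparsified to different degrees of inference performance versus baseline loss recovery. Recipe-driven approaches built around sparsification algorithms let you use the models as given, transfer-learn from them onto private datasets, or transfer the recipes to your own architectures.

The GitHub repository contains the Python API code that handles the connection and authentication to the cloud.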

New SparseZoo Models

October 2023

Generative AI

Sparse MPT Models - 21 variants
- ⚡ Highlighted Model ⚡: mpt-7b-gsm8k_mpt_pretrain-pruned80_quantized
Sparse OPT Models - 12 variants
- ⚡ Highlighted Model ⚡: opt-6.7b-opt_pretrain-pruned50_quantW8A8
Sparse CodeGen (mono, multi) Models - 10 variants
- ⚡ Highlighted Model ⚡: codegen_multi-350m-bigquery_thepile-pruned50_quantized

Highlights

Model Stub Architecture Overview
Available Model Recipes
sparsezoo.neuralmagic.com

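Installation

This repository is tested on Python 3.8-3.11 and on Linux/Debian systems. Installing in a virtual environment is recommended to keep your system in order.

Install with pip using: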

pip install sparsezoo

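Quick Tour

The SparseZoo Python API lets you search for and download sparsified models. Code examples are given below. We encourage users to load SparseZoo models by copying a stub directly from a model page.

Introduction to the Model Class Object

Model is the fundamental object that serves as the main interface to the SparseZoo library. It represents a SparseZoo model together with all of its directories and files.

Creating a Model Class Object From a SparseZoo Stub

from sparsezoo import Model

stub = "zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned95_quant-none"

model = Model(stub)
print(str(model))

>> Model(stub=zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned95_quant-none)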

Creating a Model Class Object From a Local Model Directory

from sparsezoo import Model

directory = ".../.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0"

model = Model(directory)
print(str(model))

>> Model(directory=.../.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0)

Manually Specifying the Model Download Path

Unless specified otherwise, a model created from a SparseZoo stub is saved to the local sparsezoo cache directory. This can be overridden by passing the optional download_path argument to the constructor:

from sparsezoo import Model

stub = "zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned95_quant-none"
download_directory = "./model_download_directory"

model = Model(stub, download_path=download_directory)

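Downloading the Model Files

Once the model has been initialized from a stub, it can be downloaded either by calling the download() method or by reading its path property. Both pathways work for every file in SparseZoo. Reading the path property always triggers a download unless the file has already been downloaded.

# method 1
model.download()

# method 2
model_path = model.path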
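
The "download on first access, reuse afterwards" behaviour of path can be pictured with a small stand-in class. This is only a sketch and not sparsezoo code: LazyFile, its download_count counter, and the temporary cache directory are hypothetical names, and the "download" is simulated by writing a placeholder file.

```python
from pathlib import Path
import tempfile


class LazyFile:
    """Toy stand-in for a remote file; NOT the sparsezoo implementation."""

    def __init__(self, name: str, cache_dir: Path):
        self.name = name
        self._target = cache_dir / name
        self.download_count = 0  # how many simulated downloads actually ran

    def download(self) -> None:
        # Skip the transfer when the file is already in the cache.
        if self._target.exists():
            return
        self._target.parent.mkdir(parents=True, exist_ok=True)
        self._target.write_text("placeholder contents")  # simulated download
        self.download_count += 1

    @property
    def path(self) -> str:
        # Reading the property guarantees the file exists locally.
        self.download()
        return str(self._target)


cache = Path(tempfile.mkdtemp())
card = LazyFile("model.md", cache)
first = card.path   # triggers the simulated download
second = card.path  # file already present, nothing is fetched again
print(first == second, card.download_count)
```

In this sketch both accesses return the same location and the download runs only once, which is the contract the paragraph above describes for download() and path.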

Inspecting the Contents of a SparseZoo Model

We read available_files to inspect which files are present in the SparseZoo model. Then we select a file through the corresponding attribute:

model.available_files

>> {'training': Directory(name=training),
>> 'deployment': Directory(name=deployment),
>> 'sample_inputs': Directory(name=sample_inputs.tar.gz),
>> 'sample_outputs': {'framework': Directory(name=sample_outputs.tar.gz)},
>> 'sample_labels': Directory(name=sample_labels.tar.gz),
>> 'model_card': File(name=model.md),
>> 'recipes': Directory(name=recipe),
>> 'onnx_model': File(name=model.onnx)}

Then we can take a closer look at the contents of the SparseZoo model:

model_card = model.model_card
print(model_card)

>> File(name=model.md)

model_card_path = model.model_card.path
print(model_card_path)

>> .../.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0/model.md

Model, Directory, and File

In general, every file in a SparseZoo model shares a set of attributes: name, path, URL, and parent:

name serves as an identifier of the file/directory
path points to the location of the file/directory
URL specifies the server address of the file/directory in question
parent points to the location of the parent directory of the file/directory in question

In the examples below, the last two are accessed as url and parent_directory.

A directory is a special type of file that contains other files. For that reason, it has an additional files attribute.

print(model.onnx_model)

>> File(name=model.onnx)

print(f"File name: {model.onnx_model.name}\n"
      f"File path: {model.onnx_model.path}\n"
      f"File URL: {model.onnx_model.url}\n"
      f"Parent directory: {model.onnx_model.parent_directory}")

>> File name: model.onnx
>> File path: .../.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0/model.onnx
>> File URL: https://models.neuralmagic.com/cv-classification/...
>> Parent directory: .../.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0

print(model.recipes)

>> Directory(name=recipe)

print(f"File name: {model.recipes.name}\n"
      f"Contains: {[file.name for file in model.recipes.files]}\n"
      f"File path: {model.recipes.path}\n"
      f"File URL: {model.recipes.url}\n"
      f"Parent directory: {model.recipes.parent_directory}")

>> File name: recipe
>> Contains: ['recipe_original.md', 'recipe_transfer-classification.md']
>> File path: /home/user/.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0/recipe
>> File URL: None
>> Parent directory: /home/user/.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0
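
To make the File/Directory relationship concrete, here is a minimal sketch that models it with plain dataclasses. It is an illustration under assumptions, not the library's own classes: the File and Directory definitions, the list_names helper, and the /tmp/example paths are all hypothetical and only reuse the attribute names shown in the examples above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class File:
    """Hypothetical stand-in carrying the shared attributes."""
    name: str
    path: str
    url: Optional[str] = None
    parent_directory: Optional[str] = None


@dataclass
class Directory(File):
    """A file that additionally holds other files."""
    files: List[File] = field(default_factory=list)


def list_names(entry: File, prefix: str = "") -> List[str]:
    """Return the relative names of every plain file under `entry`."""
    if isinstance(entry, Directory):
        names: List[str] = []
        for child in entry.files:
            names.extend(list_names(child, f"{prefix}{entry.name}/"))
        return names
    return [f"{prefix}{entry.name}"]


recipes = Directory(
    name="recipe",
    path="/tmp/example/recipe",  # made-up location for the sketch
    parent_directory="/tmp/example",
    files=[
        File("recipe_original.md", "/tmp/example/recipe/recipe_original.md"),
        File(
            "recipe_transfer-classification.md",
            "/tmp/example/recipe/recipe_transfer-classification.md",
        ),
    ],
)
print(list_names(recipes))
```

Because Directory is itself a File, the same helper handles single files and nested directories without special cases.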

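Selecting Checkpoint-Specific Data

A SparseZoo model may contain several checkpoints. It may contain a checkpoint saved before the model was quantized; that checkpoint would be used for transfer learning. Another checkpoint may have been saved after the quantization step; that one is usually used directly for inference.

Recipes may also vary depending on the use case. We may want to access the recipe that was used to sparsify the dense model (recipe_original) or the one that enables sparse transfer learning from the already sparsified model (recipe_transfer).

There are two ways to access those specific files.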

Accessing Recipes (Through the Python API)

available_recipes = model.recipes.available
print(available_recipes)

>> ['original', 'transfer-classification']

transfer_recipe = model.recipes["transfer-classification"]
print(transfer_recipe)

>> File(name=recipe_transfer-classification.md)

original_recipe = model.recipes.default  # recipe defaults to `original`
original_recipe_path = original_recipe.path  # downloads the recipe and returns its path
print(original_recipe_path)

>> .../.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0/recipe/recipe_original.md

Accessing Checkpoints (Through the Python API)

In general, we expect the following checkpoints to be included in a model:

checkpoint_prepruning
checkpoint_postpruning
checkpoint_preqat
checkpoint_postqat

The checkpoint that the model defaults to is the preqat state (just before the quantization step).

from sparsezoo import Model

stub = "zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/pruned_quant_3layers-aggressive_84"

model = Model(stub)
available_checkpoints = model.training.available
print(available_checkpoints)

>> ['preqat']

preqat_checkpoint = model.training.default  # checkpoint defaults to `preqat`
preqat_checkpoint_path = preqat_checkpoint.path  # downloads the checkpoint and returns its path
print(preqat_checkpoint_path)

>> .../.cache/sparsezoo/0857c6f2-13c1-43c9-8db8-8f89a548dccd/training

for file in preqat_checkpoint.files:
    print(file.name)

>> vocab.txt
>> special_tokens_map.json
>> pytorch_model.bin
>> config.json
>> training_args.bin
>> tokenizer_config.json
>> trainer_state.json
>> tokenizer.json
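
Since not every model ships every checkpoint, it can help to decide up front which one to ask for. The helper below is a hedged sketch of that decision, not part of sparsezoo: pick_checkpoint and its for_inference flag are hypothetical, and the preference order (postqat for inference, otherwise preqat, falling back to the preqat default) is an assumption drawn from the prose in this section.

```python
from typing import Sequence

# Default named in the text above; the helper itself is illustrative only.
DEFAULT_CHECKPOINT = "preqat"


def pick_checkpoint(available: Sequence[str], for_inference: bool = False) -> str:
    """Choose a checkpoint name from the list a model reports as available.

    Assumption: `postqat` suits inference and `preqat` suits transfer
    learning; when the preferred one is missing, fall back to `preqat`.
    """
    preferred = "postqat" if for_inference else "preqat"
    if preferred in available:
        return preferred
    if DEFAULT_CHECKPOINT in available:
        return DEFAULT_CHECKPOINT
    raise ValueError(f"no usable checkpoint in {list(available)}")


print(pick_checkpoint(["preqat"]))
print(pick_checkpoint(["preqat", "postqat"], for_inference=True))
```

The list passed in would typically come from something like model.training.available, as in the example above.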

Accessing Recipes (Through Stub String Arguments)

You can also request a specific recipe/checkpoint type directly by appending the appropriate URL query arguments to the stub:

from sparsezoo import Model

stub = "zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned95_quant-none?recipe=transfer"

model = Model(stub)

# Inspect which files are present.
# Note that the available recipes are restricted
# according to the specified URL query arguments
print(model.recipes.available)

>> ['transfer-classification']

# The default recipe is now the one selected by the stub string arguments
transfer_recipe = model.recipes.default
print(transfer_recipe)

>> File(name=recipe_transfer-classification.md)
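
To see how a long-form stub and its query arguments decompose, the following sketch splits one into named parts using only the standard library. It is not how sparsezoo parses stubs: parse_stub is a hypothetical helper, the field names framework, repo, dataset, and sparse_tag are assumed labels for the path segments, and only the seven-segment slash-separated form used in this section is handled.

```python
from urllib.parse import parse_qs


def parse_stub(stub: str) -> dict:
    """Split a long-form `zoo:` stub into labelled parts (labels are illustrative)."""
    if not stub.startswith("zoo:"):
        raise ValueError(f"not a SparseZoo stub: {stub!r}")
    body, _, query = stub[len("zoo:"):].partition("?")
    parts = body.split("/")
    if len(parts) != 7:
        raise ValueError(f"expected 7 path segments, got {len(parts)}")
    domain, sub_domain, arch, framework, repo, dataset, sparse_tag = parts
    # `resnet_v1-50` -> architecture `resnet_v1`, sub-architecture `50`
    architecture, _, sub_architecture = arch.partition("-")
    return {
        "domain": domain,
        "sub_domain": sub_domain,
        "architecture": architecture,
        "sub_architecture": sub_architecture,
        "framework": framework,
        "repo": repo,
        "dataset": dataset,
        "sparse_tag": sparse_tag,
        # Query arguments such as `?recipe=transfer`, first value per key
        "params": {key: values[0] for key, values in parse_qs(query).items()},
    }


info = parse_stub(
    "zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/"
    "pruned95_quant-none?recipe=transfer"
)
print(info["architecture"], info["sub_architecture"], info["params"])
```

Short stubs such as the generative AI model names listed earlier do not follow this layout and would be rejected by this sketch.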

Accessing Sample Data

You can easily request a sample batch of data that represents the inputs and outputs of the model.

sample_data = model.sample_batch(batch_size=10)

print(sample_data['sample_inputs'][0].shape)
>> (10, 3, 224, 224)  # (batch_size, num_channels, image_dim, image_dim)

print(sample_data['sample_outputs'][0].shape)
>> (10, 1000)  # (batch_size, num_classes)

Model Search

The search_models function lets you quickly filter the contents of the SparseZoo repository to find the stubs of interest:

from sparsezoo import search_models

args = {
    "domain": "cv",
    "sub_domain": "segmentation",
    "architecture": "yolact",
}

models = search_models(**args)
for model in models:
    print(model)

>> Model(stub=zoo:cv/segmentation/yolact-darknet53/pytorch/dbolya/coco/pruned82_quant-none)
>> Model(stub=zoo:cv/segmentation/yolact-darknet53/pytorch/dbolya/coco/pruned90-none)
>> Model(stub=zoo:cv/segmentation/yolact-darknet53/pytorch/dbolya/coco/base-none)

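Environment Variables

You can specify the directories where models (temporarily, during download) and the required credentials are saved on your working machine.

SPARSEZOO_MODELS_PATH is the path where downloaded models are saved temporarily. Default: ~/.cache/sparsezoo/
SPARSEZOO_CREDENTIALS_PATH is the path where credentials.yaml is saved. Default: ~/.cache/sparsezoo/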
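
As a rough illustration of how these settings might combine, the sketch below resolves a target directory from an explicit download path, the SPARSEZOO_MODELS_PATH variable, and the documented default. The helper resolve_models_dir is hypothetical, and the precedence it applies (explicit argument first, then environment variable, then default) is an assumption rather than documented sparsezoo behaviour.

```python
import os
from pathlib import Path
from typing import Optional

DEFAULT_CACHE = "~/.cache/sparsezoo/"  # default stated in this section


def resolve_models_dir(download_path: Optional[str] = None) -> Path:
    """Pick the directory a model would be saved to (assumed precedence)."""
    chosen = (
        download_path                                # explicit override
        or os.environ.get("SPARSEZOO_MODELS_PATH")  # environment variable
        or DEFAULT_CACHE                             # documented default
    )
    return Path(chosen).expanduser()


# Example values only; /tmp/zoo_models is a made-up location.
os.environ["SPARSEZOO_MODELS_PATH"] = "/tmp/zoo_models"
print(resolve_models_dir())
print(resolve_models_dir("./model_download_directory"))
```

In a shell, the same variable would normally be exported before starting Python so that it applies to the whole session.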

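Console Scripts

In addition to the Python API, the sparsezoo package installs console script entry points, so you can interact with SparseZoo directly from your console/terminal.

Downloading

Download command help

sparsezoo.download -h

Download the ResNet-50 model

sparsezoo.download zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/base-none

Download the pruned and quantized ResNet-50 model

sparsezoo.download zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned_quant-moderate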

Searching

Search command help

sparsezoo search -h

Search for all classification MobileNetV1 models in the computer vision domain

sparsezoo search --domain cv --sub-domain classification --architecture mobilenet_v1

Search for all ResNet-50 models

sparsezoo search --domain cv --sub-domain classification \
    --architecture resnet_v1 --sub-architecture 50
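
The search flags mirror the keyword names used with search_models, with underscores replaced by hyphens. The sketch below builds the argument list for the command above from such a dictionary. build_search_command is a hypothetical helper, and it assumes that every filter key (including sub_architecture, which appears here only as a CLI flag) maps to a flag in this way.

```python
import shlex
from typing import Dict, List


def build_search_command(filters: Dict[str, str]) -> List[str]:
    """Turn keyword-style filters into `sparsezoo search` arguments."""
    command = ["sparsezoo", "search"]
    for key, value in filters.items():
        # Keyword `sub_domain` corresponds to the flag `--sub-domain`.
        command += [f"--{key.replace('_', '-')}", str(value)]
    return command


args = {
    "domain": "cv",
    "sub_domain": "classification",
    "architecture": "resnet_v1",
    "sub_architecture": "50",
}
print(shlex.join(build_search_command(args)))
```

The returned list can be handed to subprocess.run as-is, which avoids shell quoting issues when a filter value contains spaces.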

For a more in-depth read, check out the SparseZoo documentation.

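Resources

Learning More

Documentation: SparseML, SparseZoo, Sparsify, DeepSparse
Neural Magic: Blog, Resources

Release History

Official builds are hosted on PyPI

stable: sparsezoo
nightly (dev): sparsezoo-nightly

Additionally, more information can be found via GitHub Releases.

License

The project is licensed under the Apache License Version 2.0.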

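Community

Contribute

We appreciate contributions to the code, examples, integrations, and documentation, as well as bug reports and feature requests! Learn how here.

Join

For user help or questions about SparseZoo, sign up or log in to our Neural Magic community. We are growing the community member by member and are happy to see you there. Bugs, feature requests, or additional questions can also be posted to our GitHub Issue Queue.

You can get the latest news, webinar and event invites, research papers, and other ML performance tidbits by subscribing to the Neural Magic community.

For more general questions about Neural Magic, please fill out this form.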
