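SparseZoo is a constantly growing repository of sparsified (pruned and pruned-quantized) models with matching sparsification recipes for neural networks. It simplifies and accelerates your time-to-value in building performant deep learning models with a collection of inference-optimized models and recipes to prototype from. Read more about sparsification.
Available via API and hosted in the cloud, SparseZoo contains both baseline models and models sparsified to different degrees of inference performance versus baseline loss recovery. Recipe-driven approaches built around sparsification algorithms allow you to use the models as given, transfer-learn from the models onto private datasets, or transfer the recipes to your own architectures.
The GitHub repository contains the Python API code that handles the connection and authentication to the cloud.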
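This repository is tested on Python 3.8-3.11 and Linux/Debian systems. Installing in a virtual environment is recommended to keep your system in order.
Install with pip:

pip install sparsezoo

The SparseZoo Python API enables you to search for and download sparsified models. Code examples are given below. We encourage users to load SparseZoo models by copying a stub directly from a model page.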
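The stubs used throughout this README look like slash-separated paths behind a zoo: prefix. The sketch below splits one into named parts, purely to make that layout easier to read. It does not use the sparsezoo package. The names split_stub and StubParts are hypothetical, and the field labels framework, repo, dataset, and sparse_tag are assumptions inferred from the example stubs; only domain, sub_domain, and architecture appear as search arguments elsewhere in this README.

```python
from typing import NamedTuple


class StubParts(NamedTuple):
    """Hypothetical container for the pieces of a stub (labels are assumptions)."""
    domain: str
    sub_domain: str
    architecture: str
    framework: str
    repo: str
    dataset: str
    sparse_tag: str


def split_stub(stub: str) -> StubParts:
    """Split a 'zoo:' stub into its seven slash-separated components."""
    prefix = "zoo:"
    if not stub.startswith(prefix):
        raise ValueError(f"expected a stub starting with {prefix!r}, got {stub!r}")
    parts = stub[len(prefix):].split("/")
    if len(parts) != 7:
        raise ValueError(f"expected 7 components, got {len(parts)}: {parts}")
    return StubParts(*parts)


parts = split_stub(
    "zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned95_quant-none"
)
print(parts.domain, parts.architecture, parts.sparse_tag)
```

A helper like this is only useful for logging or sanity-checking stubs copied from a model page; loading a model still goes through the Model object.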
The Model is a fundamental object that serves as the main interface to the SparseZoo library. It represents a SparseZoo model, together with all its directories and files.

from sparsezoo import Model

# Create a Model from a SparseZoo stub
stub = "zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned95_quant-none"
model = Model(stub)
print(str(model))
>> Model(stub=zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned95_quant-none)

from sparsezoo import Model

# Create a Model from a local model directory
directory = ".../.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0"
model = Model(directory)
print(str(model))
>> Model(directory=.../.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0)

Unless specified otherwise, a model created from a SparseZoo stub is saved to the local SparseZoo cache directory. This can be overridden by passing the optional download_path argument to the constructor:
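from sparsezoo import Model

stub = "zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned95_quant-none"
download_directory = "./model_download_directory"
model = Model(stub, download_path=download_directory)

Once the model is initialized from a stub, it can be downloaded either by calling the download() method or by accessing the path property. Both pathways work for every file in SparseZoo. Accessing the path property always triggers a download unless the file has already been downloaded.

# method 1
model.download()
# method 2
model_path = model.path

We use the available_files attribute to inspect which files are present in the SparseZoo model. Then, we select a file through the appropriate attribute: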
model.available_files
>> {'training': Directory(name=training),
>> 'deployment': Directory(name=deployment),
>> 'sample_inputs': Directory(name=sample_inputs.tar.gz),
>> 'sample_outputs': {'framework': Directory(name=sample_outputs.tar.gz)},
>> 'sample_labels': Directory(name=sample_labels.tar.gz),
>> 'model_card': File(name=model.md),
>> 'recipes': Directory(name=recipe),
>> 'onnx_model': File(name=model.onnx)}

Then, we might take a closer look at the contents of the SparseZoo model:
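model_card = model.model_card
print(model_card)
>> File(name=model.md)

model_card_path = model.model_card.path
print(model_card_path)
>> .../.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0/model.md

In general, every file in the SparseZoo model shares a set of attributes: name, path, URL, and parent:
- name serves as an identifier of the file/directory
- path points to the location of the file/directory
- URL specifies the server address of the file/directory in question
- parent points to the location of the parent directory of the file/directory in question

In code, URL and parent are accessed as url and parent_directory. A directory is a unique type of file that contains other files. For that reason, it has an additional files attribute.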
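To make the relationship between files and directories concrete, here is a toy model of it. These are illustrative stand-ins, not SparseZoo's own classes: ToyFile and ToyDirectory are hypothetical names, the /tmp/toy_model location is made up, and the sketch only mirrors the attributes described above (name, path, url, parent_directory, plus files for directories).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ToyFile:
    """Illustrative stand-in, not SparseZoo's own File class."""
    name: str                               # identifier of the file/directory
    path: Optional[str] = None              # local location, if any
    url: Optional[str] = None               # server address, if hosted remotely
    parent_directory: Optional[str] = None  # location of the enclosing directory


@dataclass
class ToyDirectory(ToyFile):
    """A file that contains other files, hence the extra `files` attribute."""
    files: List[ToyFile] = field(default_factory=list)


root = "/tmp/toy_model"  # hypothetical location, for illustration only
recipes = ToyDirectory(
    name="recipe",
    path=f"{root}/recipe",
    parent_directory=root,
    files=[
        ToyFile(name="recipe_original.md", parent_directory=f"{root}/recipe"),
        ToyFile(name="recipe_transfer-classification.md", parent_directory=f"{root}/recipe"),
    ],
)

# A directory is still a file, so the shared attributes are all available.
print(isinstance(recipes, ToyFile))
print([f.name for f in recipes.files])
```

Modelling the directory as a subclass means any code that only needs name, path, url, or parent_directory can treat files and directories uniformly.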
print(model.onnx_model)
>> File(name=model.onnx)

print(f"File name: {model.onnx_model.name}\n"
      f"File path: {model.onnx_model.path}\n"
      f"File URL: {model.onnx_model.url}\n"
      f"Parent directory: {model.onnx_model.parent_directory}")
>> File name: model.onnx
>> File path: .../.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0/model.onnx
>> File URL: https://models.neuralmagic.com/cv-classification/...
>> Parent directory: .../.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0

print(model.recipes)
>> Directory(name=recipe)

print(f"File name: {model.recipes.name}\n"
      f"Contains: {[file.name for file in model.recipes.files]}\n"
      f"File path: {model.recipes.path}\n"
      f"File URL: {model.recipes.url}\n"
      f"Parent directory: {model.recipes.parent_directory}")
>> File name: recipe
>> Contains: ['recipe_original.md', 'recipe_transfer-classification.md']
>> File path: /home/user/.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0/recipe
>> File URL: None
>> Parent directory: /home/user/.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0

A SparseZoo model may contain several checkpoints. It may contain a checkpoint that was saved before the model was quantized; that checkpoint would be used for transfer learning. Another checkpoint might have been saved after the quantization step; that one is usually used directly for inference.
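Recipes may also vary depending on the use case. We may want to access the recipe that was used to sparsify the dense model (recipe_original) or the one that enables sparse transfer learning from the already sparsified model (recipe_transfer).
There are two ways to access those specific files.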
available_recipes = model.recipes.available
print(available_recipes)
>> ['original', 'transfer-classification']

transfer_recipe = model.recipes["transfer-classification"]
print(transfer_recipe)
>> File(name=recipe_transfer-classification.md)

original_recipe = model.recipes.default  # recipe defaults to `original`
original_recipe_path = original_recipe.path  # downloads the recipe and returns its path
print(original_recipe_path)
>> .../.cache/sparsezoo/eb977dae-2454-471b-9870-4cf38074acf0/recipe/recipe_original.md

In general, we expect the following checkpoints to be included in the model:
- checkpoint_prepruning
- checkpoint_postpruning
- checkpoint_preqat
- checkpoint_postqat

The checkpoint that the model defaults to is the preqat state (just before the quantization step).
from sparsezoo import Model

stub = "zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/pruned_quant_3layers-aggressive_84"
model = Model(stub)
available_checkpoints = model.training.available
print(available_checkpoints)
>> ['preqat']

preqat_checkpoint = model.training.default  # checkpoint defaults to `preqat`
preqat_checkpoint_path = preqat_checkpoint.path  # downloads the checkpoint and returns its path
print(preqat_checkpoint_path)
>> .../.cache/sparsezoo/0857c6f2-13c1-43c9-8db8-8f89a548dccd/training

[print(file.name) for file in preqat_checkpoint.files]
>> vocab.txt
>> special_tokens_map.json
>> pytorch_model.bin
>> config.json
>> training_args.bin
>> tokenizer_config.json
>> trainer_state.json
>> tokenizer.json

You can also directly request a specific recipe/checkpoint type by appending the appropriate URL query arguments to the stub:
from sparsezoo import Model

stub = "zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned95_quant-none?recipe=transfer"
model = Model(stub)

# Inspect which files are present.
# Note that the available recipes are restricted
# according to the specified URL query arguments
print(model.recipes.available)
>> ['transfer-classification']

transfer_recipe = model.recipes.default  # Now the recipes default to the one selected by the stub string arguments
print(transfer_recipe)
>> File(name=recipe_transfer-classification.md)

You can easily request a sample batch of data that represents the inputs and outputs of the model.
sample_data = model.sample_batch(batch_size=10)

print(sample_data['sample_inputs'][0].shape)
>> (10, 3, 224, 224)  # (batch_size, num_channels, image_dim, image_dim)

print(sample_data['sample_outputs'][0].shape)
>> (10, 1000)  # (batch_size, num_classes)

The function search_models enables you to quickly filter the contents of the SparseZoo repository to find the stubs of interest:
from sparsezoo import search_models

args = {
    "domain": "cv",
    "sub_domain": "segmentation",
    "architecture": "yolact",
}
models = search_models(**args)
[print(model) for model in models]
>> Model(stub=zoo:cv/segmentation/yolact-darknet53/pytorch/dbolya/coco/pruned82_quant-none)
>> Model(stub=zoo:cv/segmentation/yolact-darknet53/pytorch/dbolya/coco/pruned90-none)
>> Model(stub=zoo:cv/segmentation/yolact-darknet53/pytorch/dbolya/coco/base-none)

You can specify the directories on your working machine where models (temporarily, during download) and the required credentials are saved, using two environment variables:
- SPARSEZOO_MODELS_PATH: the path where downloaded models are saved temporarily. Default: ~/.cache/sparsezoo/
- SPARSEZOO_CREDENTIALS_PATH: the path where credentials.yaml is saved. Default: ~/.cache/sparsezoo/
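As a rough illustration of how an environment-variable override with a default behaves, the sketch below resolves either variable to a concrete directory. It is not SparseZoo's internal logic: resolve_dir is a hypothetical helper, /data/zoo_models is a made-up path, and the only details taken from this README are the two variable names and the ~/.cache/sparsezoo default.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

# Default location stated in this README for both variables.
DEFAULT_DIR = Path("~/.cache/sparsezoo")


def resolve_dir(variable: str, environ: Optional[Mapping[str, str]] = None) -> Path:
    """Return the directory named by `variable`, or the default if it is unset or empty."""
    environ = os.environ if environ is None else environ
    raw = environ.get(variable) or str(DEFAULT_DIR)
    return Path(raw).expanduser()


# Variable set: the override wins (hypothetical path).
print(resolve_dir("SPARSEZOO_MODELS_PATH", {"SPARSEZOO_MODELS_PATH": "/data/zoo_models"}))

# Variable unset: fall back to the default under the home directory.
print(resolve_dir("SPARSEZOO_CREDENTIALS_PATH", {}))
```

In practice the variables would be exported in the shell (or set in the process environment) before running code that uses sparsezoo; the mapping argument here exists only so the behavior can be tried without touching the real environment.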
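In addition to the Python API, the sparsezoo package installs a console script entry point. This lets you interact with SparseZoo directly from your console/terminal.
Download command help:
sparsezoo.download -h
Download the ResNet-50 model:
sparsezoo.download zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/base-none
Download the pruned and quantized ResNet-50 model:
sparsezoo.download zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned_quant-moderate
Search command help:
sparsezoo search -h
Search for all classification MobileNetV1 models in the computer vision domain:
sparsezoo search --domain cv --sub-domain classification --architecture mobilenet_v1
Search for all ResNet-50 models:
sparsezoo search --domain cv --sub-domain classification \
    --architecture resnet_v1 --sub-architecture 50

For a more in-depth read, check out the SparseZoo documentation.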
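If the search commands need to be assembled from a script, a small helper can map keyword filters onto the flags shown in this README. This is only a sketch: build_search_command is a hypothetical function, it does not check that a flag is one the CLI accepts, and it only builds the argument list rather than running anything.

```python
import shlex
from typing import List


def build_search_command(**filters: str) -> List[str]:
    """Turn keyword filters into a `sparsezoo search` argument list."""
    command = ["sparsezoo", "search"]
    for key, value in filters.items():
        # Python-style names map onto dashed flags: sub_domain -> --sub-domain
        command += [f"--{key.replace('_', '-')}", str(value)]
    return command


command = build_search_command(
    domain="cv",
    sub_domain="classification",
    architecture="resnet_v1",
    sub_architecture="50",
)
print(shlex.join(command))
```

With the package installed, the resulting list could be handed to subprocess.run(command, check=True); keeping it as a list avoids shell-quoting issues.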
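Official builds are hosted on PyPI.
Additionally, more information can be found via GitHub Releases.
The project is licensed under the Apache License, Version 2.0.
We appreciate contributions to the code, examples, integrations, and documentation, as well as bug reports and feature requests! Learn how here.
For user help or questions about SparseZoo, sign up or log in to our Neural Magic community. We are growing the community member by member and are happy to see you there. Bugs, feature requests, or additional questions can also be posted to our GitHub issue queue.
You can get the latest news, webinar and event invites, research papers, and other ML performance tidbits by subscribing to the Neural Magic community.
For more general questions about Neural Magic, please fill out this form.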