
Swin Transformer - PyTorch

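Implementation of the Swin Transformer architecture. The paper presents a new vision Transformer, called Swin Transformer, that can serve as a general-purpose backbone for computer vision. Challenges in adapting the Transformer from language to vision arise from differences between the two domains, such as large variations in the scale of visual entities and the high resolution of pixels in images compared to words in text. To address these differences, the authors propose a hierarchical Transformer whose representation is computed with shifted windows. The shifted windowing scheme brings greater efficiency by limiting self-attention computation to non-overlapping local windows while also allowing for cross-window connections. This hierarchical architecture has the flexibility to model at various scales and has linear computational complexity with respect to image size. These qualities make Swin Transformer compatible with a broad range of vision tasks, including image classification (86.4 top-1 accuracy on ImageNet-1K) and dense prediction tasks such as object detection (58.7 box AP and 51.1 mask AP on COCO test-dev) and semantic segmentation (53.5 mIoU on ADE20K val). Its performance surpasses the previous state of the art by a large margin of +2.7 box AP and +2.6 mask AP on COCO, and +3.2 mIoU on ADE20K, demonstrating the potential of Transformer-based models as vision backbones.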
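
The linear-complexity claim can be made concrete with the two cost formulas given in the Swin Transformer paper: global multi-head self-attention on an h x w feature map with C channels costs 4hwC² + 2(hw)²C, while window-based attention with M x M windows costs 4hwC² + 2M²hwC (softmax cost omitted in both). The following is a minimal sketch that evaluates them; the function names `msa_flops` and `wmsa_flops` are illustrative and not part of this package, and C = 96, M = 7 are example values only.

```python
def msa_flops(h: int, w: int, c: int) -> int:
    """Cost of global self-attention: 4*h*w*C^2 + 2*(h*w)^2*C."""
    return 4 * h * w * c * c + 2 * (h * w) ** 2 * c


def wmsa_flops(h: int, w: int, c: int, m: int) -> int:
    """Cost of window-based self-attention with M x M windows: 4*h*w*C^2 + 2*M^2*h*w*C."""
    return 4 * h * w * c * c + 2 * m * m * h * w * c


if __name__ == "__main__":
    # Compare both costs as the feature-map side length doubles.
    for side in (56, 112, 224):
        global_cost = msa_flops(side, side, 96)
        window_cost = wmsa_flops(side, side, 96, 7)
        print(side, global_cost, window_cost, round(global_cost / window_cost, 1))
```

Doubling the side length multiplies the windowed cost by exactly four (linear in the number of pixels), whereas the global cost grows faster because of the (hw)² term.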

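This is NOT the official repository of the Swin Transformer. At the time of writing, the authors' official code is not yet available, but it can be found later at: https://github.com/microsoft/swin-transformer.

All credits go to the authors Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin and Baining Guo.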

Install

$ pip install swin-transformer-pytorch

or (if you cloned the repository)

$ pip install -r requirements.txt

Usage

import torch
from swin_transformer_pytorch import SwinTransformer

net = SwinTransformer(
    hidden_dim=96,
    layers=(2, 2, 6, 2),
    heads=(3, 6, 12, 24),
    channels=3,
    num_classes=3,
    head_dim=32,
    window_size=7,
    downscaling_factors=(4, 2, 2, 2),
    relative_pos_embedding=True
)
dummy_x = torch.randn(1, 3, 224, 224)
logits = net(dummy_x)  # (1,3)
print(net)
print(logits)
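
To see why a 224 x 224 input works with window_size = 7 and downscaling_factors = (4, 2, 2, 2), it can help to trace the feature-map size through the four stages. The sketch below is a hypothetical helper (`trace_stages` is not part of this package) that uses plain integer arithmetic. It assumes the channel width doubles at every stage (C, 2C, 4C, 8C) as in the paper, which may not match every configuration of this implementation.

```python
def trace_stages(image_size, hidden_dim=96, window_size=7,
                 downscaling_factors=(4, 2, 2, 2)):
    """Return [(side, channels), ...] per stage; raise ValueError if a size does not fit."""
    stages = []
    side = image_size
    for i, factor in enumerate(downscaling_factors):
        if side % factor != 0:
            raise ValueError(f"stage {i + 1}: side {side} not divisible by factor {factor}")
        side //= factor
        if side % window_size != 0:
            raise ValueError(f"stage {i + 1}: side {side} not divisible by window {window_size}")
        # Assumption: channel width doubles per stage, as in the paper.
        stages.append((side, hidden_dim * 2 ** i))
    return stages


if __name__ == "__main__":
    print(trace_stages(224))  # [(56, 96), (28, 192), (14, 384), (7, 768)]
```

An input of 256 x 256 would fail this check at the first stage, because 256 / 4 = 64 is not divisible by 7.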

Parameters

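hidden_dim: int.
The hidden dimension to use for the architecture, denoted C in the original paper.
layers: 4-tuple of ints divisible by 2.
How many layers to apply in each stage. Every int should be divisible by two because a regular and a shifted SwinBlock are always applied together.
heads: 4-tuple of ints.
How many attention heads to apply in each stage.
channels: int.
Number of channels of the input.
num_classes: int.
Number of classes the output should have.
head_dim: int.
The dimension of each attention head.
window_size: int.
The window size to use. Make sure that after each downscaling step the image dimensions are still divisible by the window size.
downscaling_factors: 4-tuple of ints.
The downscaling factor to use in each stage. Make sure the image dimensions are large enough for the downscaling factors.
relative_pos_embedding: bool.
Whether to use a learnable relative position embedding of size (2M-1)x(2M-1) or a full positional embedding of size (M²xM²), where M is the window size.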
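
The (2M-1)x(2M-1) size of the relative position embedding follows from the fact that two tokens inside an M x M window can differ by at most M-1 positions along each axis, in either direction. The sketch below illustrates this indexing in plain Python. The helper `relative_indices` and the query-minus-key sign convention are assumptions made for illustration, not a description of this package's internals.

```python
def relative_indices(window_size):
    """For each (query, key) token pair in an M x M window, return the
    (row, col) index into a (2M-1) x (2M-1) relative position table."""
    m = window_size
    coords = [(y, x) for y in range(m) for x in range(m)]
    return [
        # Offsets lie in [-(M-1), M-1]; adding M-1 shifts them to [0, 2M-2].
        [(qy - ky + m - 1, qx - kx + m - 1) for (ky, kx) in coords]
        for (qy, qx) in coords
    ]


if __name__ == "__main__":
    idx = relative_indices(7)
    distinct = {pair for row in idx for pair in row}
    print(len(idx), len(idx[0]))  # 49 49
    print(len(distinct))          # 169
```

For M = 7, the 49 x 49 attention matrix therefore shares only 13 x 13 = 169 learnable entries, whereas a full positional embedding would store all 49 x 49 = 2401 entries separately.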

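TODO

Adjust the code and validate on ImageNet-1K and COCO 2017

References

Some parts of the code are adapted from the PyTorch Vision Transformer repository https://github.com/lucidrains/vit-pytorch, which provides a very clean Vision Transformer implementation.

Citations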

@misc{liu2021swin,
      title={Swin Transformer: Hierarchical Vision Transformer using Shifted Windows},
      author={Ze Liu and Yutong Lin and Yue Cao and Han Hu and Yixuan Wei and Zheng Zhang and Stephen Lin and Baining Guo},
      year={2021},
      eprint={2103.14030},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

