A PyTorch adaptation of DeepMind's Perceiver model (https://arxiv.org/abs/2103.03206). The original JAX/Haiku code can be found here: https://github.com/deepmind/deepmind-research/tree/master/perceiver
Clone the repository and create a virtual environment:

```shell
git clone https://github.com/JOBR0/PerceiverIO_Pytorch
cd PerceiverIO_Pytorch
python3 -m venv perceiverEnv
source perceiverEnv/bin/activate
```

Install PyTorch following the official instructions: https://pytorch.org/get-started/locally/

Install the other required packages from requirements.txt:

```shell
pip3 install -r requirements.txt
```

The implementation covers the following example tasks for which verified models are available:
The Haiku checkpoints from the official DeepMind repository have been converted to PyTorch checkpoints and can be downloaded from Google Drive. The PyTorch checkpoints should be placed in the "pytorch_checkpoints" folder so that the example scripts can find them.
To create a new PerceiverIO for a custom task, the PerceiverIO class from perceiver_io/perceiver.py is used.
```python
class PerceiverIO(nn.Module):
    """The Perceiver: a scalable, fully attentional architecture.

    Args:
        num_blocks (int): Number of times the block is applied with shared weights. Default: 8
        num_self_attends_per_block (int): Number of self-attentions in the block. Default: 6
        num_latents (int): Number of latent vectors. Default: 512
        num_latent_channels (int): Number of channels for the latent vectors. Default: 1024
        final_project (bool): Whether to apply a linear layer to the outputs before the post-processors. Default: True
        final_project_out_channels (int): Number of output channels for the final projection layer. Default: None
        perceiver_encoder_kwargs (Dict): Additional arguments for the perceiver encoder class. Default: {}
        perceiver_decoder_kwargs (Dict): Additional arguments for the perceiver decoder class. Default: {}
        input_preprocessors (dict / nn.Module): Optional input preprocessors. 1 or none for each modality. Default: None
        output_postprocessors (dict / nn.Module): Optional output postprocessors. 1 or none for each modality. Default: None
        output_queries (dict / nn.Module): Modules that create the output queries. 1 for each modality. Default: None
        output_query_padding_channels (int): Number of learnable feature channels that are added to the output queries. Default: 0
        input_padding_channels (int): Number of learnable feature channels that are added to the preprocessed inputs. Default: 0
        input_channels (dict / int): The number of input channels; needs to be specified if NO preprocessor is used.
            Otherwise, the number will be inferred from the preprocessor. Default: None
        input_mask_probs (dict): Probability with which each input modality will be masked out. Default: None
    """
```

Below is a diagram of the Perceiver for multimodal applications:
The input preprocessors take the raw input data and preprocess it so that it can be queried by the first cross-attention. This can be something like creating patches from an image. Typically, positional encodings are incorporated by the preprocessor. It is also possible to preprocess the inputs manually instead of using a preprocessor.
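As a rough illustration, an input preprocessor is just an `nn.Module` that maps raw data to a sequence of feature vectors. The sketch below is a hedged example using plain torch; the class name, shapes, and the zero-initialized positional encoding are illustrative assumptions, not the repo's API:

```python
import torch
import torch.nn as nn

class PatchPreprocessor(nn.Module):
    """Toy input preprocessor (hypothetical): splits an image into flat
    non-overlapping patches and adds a learned positional encoding."""
    def __init__(self, patch_size=8, in_channels=3, img_size=32):
        super().__init__()
        self.patch_size = patch_size
        num_patches = (img_size // patch_size) ** 2
        patch_dim = in_channels * patch_size * patch_size
        # one learnable positional vector per patch
        self.pos = nn.Parameter(torch.zeros(num_patches, patch_dim))

    def forward(self, x):
        # x: (batch, channels, height, width)
        b, c, h, w = x.shape
        p = self.patch_size
        # carve out non-overlapping p x p patches
        x = x.unfold(2, p, p).unfold(3, p, p)          # (b, c, h/p, w/p, p, p)
        x = x.permute(0, 2, 3, 1, 4, 5).reshape(b, -1, c * p * p)
        return x + self.pos                            # (b, num_patches, patch_dim)

pre = PatchPreprocessor()
out = pre(torch.randn(2, 3, 32, 32))
print(out.shape)  # torch.Size([2, 16, 192])
```

The resulting `(batch, index, channels)` array is what the first cross-attention queries.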
Several input preprocessors can be found in perceiver_io/io_processors/preprocessors.py.
The output postprocessors take the final output of the Perceiver and process it to obtain the desired output format.
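A minimal sketch of such a postprocessor (hypothetical class name and shapes, not the repo's API) that turns the Perceiver's output array into classification logits:

```python
import torch
import torch.nn as nn

class ClassificationPostprocessor(nn.Module):
    """Toy output postprocessor (hypothetical): averages the Perceiver's
    output vectors and projects them to class logits."""
    def __init__(self, in_channels=1024, num_classes=10):
        super().__init__()
        self.linear = nn.Linear(in_channels, num_classes)

    def forward(self, x):
        # x: (batch, num_outputs, channels) -> (batch, num_classes)
        return self.linear(x.mean(dim=1))

post = ClassificationPostprocessor()
logits = post(torch.randn(2, 16, 1024))
print(logits.shape)  # torch.Size([2, 10])
```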
Several output postprocessors can be found in perceiver_io/io_processors/postprocessors.py.
The output queries create the features that are used to query the Perceiver's final latent representation to produce the output. They receive the preprocessed inputs as an argument so that they can be used if needed. They typically also contain positional encodings.
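A minimal sketch of an output query module (hypothetical; the repo's query modules may also incorporate positional encodings and make use of the preprocessed inputs they receive):

```python
import torch
import torch.nn as nn

class LearnedOutputQuery(nn.Module):
    """Toy output query (hypothetical): a fixed set of learned query
    vectors, broadcast over the batch dimension."""
    def __init__(self, num_queries=16, channels=256):
        super().__init__()
        self.query = nn.Parameter(torch.randn(num_queries, channels))

    def forward(self, inputs=None):
        # inputs: the preprocessed inputs, available here in case the
        # queries should depend on them; unused in this sketch
        batch = 1 if inputs is None else inputs.shape[0]
        return self.query.unsqueeze(0).expand(batch, -1, -1)

q = LearnedOutputQuery()
print(q(torch.randn(4, 16, 192)).shape)  # torch.Size([4, 16, 256])
```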
Several output queries can be found in perceiver_io/output_queries.py.

### Multiple modalities

To process multiple modalities at once, dictionaries that map from modality to module can be used for input_preprocessors, output_postprocessors, and output_queries (see perceiver_io/multimodal_perceiver.py). To make the different inputs compatible with each other, they can be padded to the same channel size with trainable parameters. It is also possible to use a different number of output queries than inputs.
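The trainable channel padding can be sketched roughly as follows (class and modality names are illustrative assumptions, not the repo's API):

```python
import torch
import torch.nn as nn

class ModalityPadding(nn.Module):
    """Toy version (hypothetical) of trainable channel padding: appends
    learnable feature channels so that every modality reaches a common
    channel size."""
    def __init__(self, in_channels, target_channels):
        super().__init__()
        self.pad = nn.Parameter(torch.zeros(target_channels - in_channels))

    def forward(self, x):
        # x: (batch, seq, in_channels) -> (batch, seq, target_channels)
        b, s, _ = x.shape
        pad = self.pad.expand(b, s, -1)
        return torch.cat([x, pad], dim=-1)

# one padding module per modality, keyed by modality name
target = 64
pads = nn.ModuleDict({
    "image": ModalityPadding(48, target),
    "audio": ModalityPadding(16, target),
})
img = pads["image"](torch.randn(2, 10, 48))
aud = pads["audio"](torch.randn(2, 30, 16))
# both modalities now share the channel dimension and can be concatenated
combined = torch.cat([img, aud], dim=1)
print(combined.shape)  # torch.Size([2, 40, 64])
```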
```
@misc{jaegle2021perceiver,
    title = {Perceiver IO: A General Architecture for Structured Inputs & Outputs},
    author = {Andrew Jaegle and Sebastian Borgeaud and Jean-Baptiste Alayrac and Carl Doersch and Catalin Ionescu and David Ding and Skanda Koppula and Andrew Brock and Evan Shelhamer and Olivier Hénaff and Matthew M. Botvinick and Andrew Zisserman and Oriol Vinyals and João Carreira},
    year = {2021},
    eprint = {2107.14795},
    archivePrefix = {arXiv},
    primaryClass = {cs.LG}
}
```