A PyTorch adaptation of DeepMind's Perceiver model (https://arxiv.org/abs/2103.03206). The original JAX/Haiku code can be found here: https://github.com/deepmind/deepmind-research/tree/master/master/perceiver
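git clone https://github.com/JOBR0/PerceiverIO_Pytorch
cd PerceiverIO_Pytorch
python3 -m venv perceiverEnv
source perceiverEnv/bin/activate

Install PyTorch following the official instructions: https://pytorch.org/get-started/locally/

Install the other required packages from requirements.txt:
pip3 install -r requirements.txt

The implementation covers the following example tasks, for which pretrained models are available: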
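The Haiku checkpoints from the official DeepMind repository have been converted to PyTorch checkpoints and can be downloaded from Google Drive. The PyTorch checkpoints should be placed in the "Pytorch_checkpoints" folder so that the example code can find them.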
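A minimal sketch of how example code might locate a converted checkpoint before loading it. The helper name `find_checkpoint`, the default folder constant and any file name you pass in are assumptions for illustration, not names taken from the repository; only the folder name comes from the text above.

```python
from pathlib import Path

# Folder name as given in the text above; adjust if your checkout differs.
CHECKPOINT_DIR = Path("Pytorch_checkpoints")


def find_checkpoint(name: str, folder: Path = CHECKPOINT_DIR) -> Path:
    """Return the path to a checkpoint file, or raise a helpful error.

    Hypothetical helper: the repository may resolve checkpoints differently.
    """
    path = folder / name
    if not path.is_file():
        raise FileNotFoundError(
            f"Checkpoint '{name}' not found in '{folder}'. "
            "Download the converted PyTorch checkpoints and place them there."
        )
    return path
```

With PyTorch installed, the returned path could then be passed to `torch.load(path, map_location="cpu")`. The file name itself depends on which checkpoint was downloaded.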
To create a new PerceiverIO for a custom task, use the PerceiverIO class in perceiver_io/perceiver.py.
class PerceiverIO(nn.Module):
    """The Perceiver: a scalable, fully attentional architecture.
    Args:
        num_blocks (int): Number of times the block is applied with shared weights. Default: 8
        num_self_attends_per_block (int): Number of self-attentions in the block. Default: 6
        num_latents (int): Number of latent vectors. Default: 512
        num_latent_channels (int): Number of channels for the latent vectors. Default: 1024
        final_project (bool): Whether to apply a linear layer to the outputs before the post-processors. Default: True
        final_project_out_channels (int): Number of output channels for the final projection layer. Default: None
        perceiver_encoder_kwargs (Dict): Additional arguments for the perceiver encoder class. Default: {}
        perceiver_decoder_kwargs (Dict): Additional arguments for the perceiver decoder class. Default: {}
        input_preprocessors (dict / nn.Module): Optional input preprocessors. 1 or none for each modality. Default: None
        output_postprocessors (dict / nn.Module): Optional output postprocessors. 1 or none for each modality. Default: None
        output_queries (dict / nn.Module): Modules that create the output queries. 1 for each modality. Default: None
        output_query_padding_channels (int): Number of learnable feature channels that are added to the output queries. Default: 0
        input_padding_channels (int): Number of learnable feature channels that are added to the preprocessed inputs. Default: 0
        input_channels (dict, int): The number of input channels needs to be specified if NO preprocessor is used. Otherwise,
            the number will be inferred from the preprocessor. Default: None
        input_mask_probs (dict): Probability with which each input modality will be masked out. Default: None
    """

The following figure shows the Perceiver IO for multimodal applications:
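Input preprocessors take the raw input data and preprocess it so that it can be queried by the first cross-attention. This can be something like creating patches from an image. Usually, positional encodings are incorporated by the preprocessor. Instead of using a preprocessor, the inputs can also be processed manually.
Several input_preprocessors can be found in perceiver_io/io_processors/preprocessors.py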
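To illustrate what such preprocessing might look like, here is a minimal sketch that cuts an image into patches and appends a simple positional feature to each patch. It uses plain NumPy rather than the repository's PyTorch modules, and the function name `patchify_with_positions` and the choice of scaled grid coordinates as position features are assumptions made for this example, not the encodings the repository uses.

```python
import numpy as np


def patchify_with_positions(image: np.ndarray, patch_size: int) -> np.ndarray:
    """Turn an (H, W, C) image into (num_patches, patch_size*patch_size*C + 2)."""
    height, width, channels = image.shape
    if height % patch_size or width % patch_size:
        raise ValueError("image size must be divisible by patch_size")
    rows, cols = height // patch_size, width // patch_size

    # Split both spatial axes into (grid index, offset inside the patch),
    # then bring the two grid axes to the front and flatten each patch.
    patches = image.reshape(rows, patch_size, cols, patch_size, channels)
    patches = patches.transpose(0, 2, 1, 3, 4)
    patches = patches.reshape(rows * cols, patch_size * patch_size * channels)

    # Simple positional features (an assumption): grid coordinates in [-1, 1].
    ys = np.linspace(-1.0, 1.0, rows)
    xs = np.linspace(-1.0, 1.0, cols)
    grid_y, grid_x = np.meshgrid(ys, xs, indexing="ij")
    positions = np.stack([grid_y.ravel(), grid_x.ravel()], axis=1)

    return np.concatenate([patches, positions], axis=1)


image = np.arange(16, dtype=np.float32).reshape(4, 4, 1)
tokens = patchify_with_positions(image, patch_size=2)
print(tokens.shape)  # (4, 6): four patches, 4 pixel values + 2 position features
```

Each row of `tokens` is one input element that the first cross-attention can attend to.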
Output postprocessors take the final output of the Perceiver and process it to obtain the desired output format.
Several output_postprocessors can be found in perceiver_io/io_processors/postprocessors.py
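As a rough illustration of a postprocessing step, the sketch below reshapes a flat sequence of per-pixel outputs back into an image grid. The function name `to_image` and the assumption that outputs arrive one per pixel in row-major order are hypothetical; the repository's own postprocessors are PyTorch modules and may work differently.

```python
import numpy as np


def to_image(outputs: np.ndarray, height: int, width: int) -> np.ndarray:
    """Reshape (batch, height*width, channels) to (batch, height, width, channels)."""
    batch, num_points, channels = outputs.shape
    if num_points != height * width:
        raise ValueError("number of outputs does not match height * width")
    return outputs.reshape(batch, height, width, channels)


# Batch of 2, 6 output points (a 2x3 grid), 3 channels per point.
flat = np.arange(2 * 6 * 3).reshape(2, 6, 3)
images = to_image(flat, height=2, width=3)
print(images.shape)  # (2, 2, 3, 3)
```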
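Output queries create the features that are used to query the final latent representation of the Perceiver to produce the output. They receive the preprocessed input as an argument so that they can use it if desired. They also usually incorporate positional encodings.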
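The querying step can be sketched as a single cross-attention in which each output query reads from all latent vectors. This is a simplified NumPy illustration with one head and no learned projections, so it is not the repository's decoder; the function name `cross_attention_decode` and the random arrays are assumptions for the example.

```python
import numpy as np


def cross_attention_decode(queries: np.ndarray, latents: np.ndarray):
    """Single-head attention: each query row reads from all latent rows."""
    scale = 1.0 / np.sqrt(latents.shape[-1])
    scores = (queries @ latents.T) * scale                # (num_queries, num_latents)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ latents, weights                     # outputs: (num_queries, channels)


rng = np.random.default_rng(0)
latents = rng.normal(size=(8, 16))   # 8 latent vectors with 16 channels
queries = rng.normal(size=(5, 16))   # 5 output queries, e.g. built from position encodings
outputs, weights = cross_attention_decode(queries, latents)
print(outputs.shape)  # (5, 16): one output row per query
```

The number of output rows is set by the number of queries, not by the number of latents, which is why the output size can be chosen independently of the latent size.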
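Several output_queries can be found in perceiver_io/output_queries.py

### Multiple modalities
To process multiple modalities at once, a dictionary that maps from modality to module can be used for the input_preprocessors, output_postprocessors and output_queries (see perceiver_io/multimodal_perceiver.py). To make the different inputs compatible with each other, they are padded to the same channel size with trainable parameters. It is also possible to use a different number of output queries than inputs.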
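The padding idea can be sketched as follows: each modality's features are extended to a common channel width with a per-modality padding vector, after which all modalities are concatenated into one input array. This is a NumPy illustration under stated assumptions; the function name `pad_and_concat`, the modality names and the random stand-ins for trainable parameters are hypothetical and not taken from the repository.

```python
import numpy as np


def pad_and_concat(inputs: dict, padding: dict) -> np.ndarray:
    """Pad each (num_items, channels) array to a common width and stack them."""
    target = max(x.shape[1] for x in inputs.values())
    padded = []
    for name in sorted(inputs):  # fixed modality order
        x = inputs[name]
        missing = target - x.shape[1]
        # Repeat the per-modality padding vector for every item of that modality.
        pad = np.broadcast_to(padding[name][:missing], (x.shape[0], missing))
        padded.append(np.concatenate([x, pad], axis=1))
    return np.concatenate(padded, axis=0)


inputs = {
    "audio": np.ones((10, 4)),  # 10 items with 4 channels
    "image": np.ones((6, 7)),   # 6 items with 7 channels
}
# Stand-ins for trainable parameters (these would be nn.Parameter in PyTorch).
rng = np.random.default_rng(0)
padding = {name: rng.normal(size=7) for name in inputs}

combined = pad_and_concat(inputs, padding)
print(combined.shape)  # (16, 7)
```

Because the padding values are parameters rather than zeros, a model trained this way can learn to use them to tell modalities apart.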
@misc{jaegle2021perceiver,
    title         = {Perceiver IO: A General Architecture for Structured Inputs & Outputs},
    author        = {Andrew Jaegle and Sebastian Borgeaud and Jean-Baptiste Alayrac and Carl Doersch and Catalin Ionescu and David Ding and Skanda Koppula and Andrew Brock and Evan Shelhamer and Olivier Hénaff and Matthew M. Botvinick and Andrew Zisserman and Oriol Vinyals and João Carreira},
    year          = {2021},
    eprint        = {2107.14795},
    archivePrefix = {arXiv},
    primaryClass  = {cs.LG}
}