mixture of experts
1.0.0
This repository contains a PyTorch re-implementation of the sparsely-gated Mixture-of-Experts (MoE) layer described in the paper "Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer" (Shazeer et al., 2017). Example usage:
from moe import MoE
import torch

# instantiate the MoE layer
model = MoE(input_size=1000, output_size=20, num_experts=10, hidden_size=66, k=4, noisy_gating=True)
X = torch.rand(32, 1000)

# train
model.train()
# forward
y_hat, aux_loss = model(X)

# evaluation
model.eval()
y_hat, aux_loss = model(X)
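The forward pass returns both the model output and an auxiliary load-balancing loss, which should be added to the task loss during training. Below is a minimal sketch of one training step on the dummy batch above; the cross-entropy criterion and Adam optimizer are illustrative assumptions, not necessarily what example.py uses:

import torch
from moe import MoE

model = MoE(input_size=1000, output_size=20, num_experts=10, hidden_size=66, k=4, noisy_gating=True)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # illustrative optimizer choice
X = torch.rand(32, 1000)
target = torch.randint(0, 20, (32,))  # dummy class labels

model.train()
y_hat, aux_loss = model(X)
# add the auxiliary loss so the gate learns to balance load across experts;
# cross-entropy here assumes y_hat are raw scores -- adjust if the experts already apply a softmax
loss = torch.nn.functional.cross_entropy(y_hat, target) + aux_loss
optimizer.zero_grad()
loss.backward()
optimizer.step()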
To install the requirements, run:
pip install -r requirements.txt
The file example.py contains a minimal working example that illustrates how to train and evaluate the MoE layer with dummy inputs and targets. To run the example:
python example.py
The file cifar10_example.py contains a minimal working example on the CIFAR-10 dataset. With arbitrary hyper-parameters and without full convergence, it reaches 39% accuracy. To run the example:
python cifar10_example.py
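The sketch below illustrates one way the MoE layer could be plugged into a CIFAR-10 training loop, by flattening each 3x32x32 image into a 3072-dimensional vector; the data pipeline, hyper-parameters, and loss are illustrative assumptions and may differ from what cifar10_example.py does:

import torch
import torchvision
import torchvision.transforms as transforms
from moe import MoE

# CIFAR-10 images are 3x32x32, i.e. 3072 features once flattened
trainset = torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=transforms.ToTensor())
trainloader = torch.utils.data.DataLoader(trainset, batch_size=64, shuffle=True)

net = MoE(input_size=3072, output_size=10, num_experts=10, hidden_size=256, k=4, noisy_gating=True)
optimizer = torch.optim.Adam(net.parameters(), lr=1e-3)  # illustrative optimizer choice

net.train()
for inputs, labels in trainloader:
    inputs = inputs.view(inputs.size(0), -1)  # flatten images to (batch, 3072)
    y_hat, aux_loss = net(inputs)
    # combine the task loss with the auxiliary load-balancing loss
    loss = torch.nn.functional.cross_entropy(y_hat, labels) + aux_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()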
This implementation was used as the reference PyTorch implementation for single-GPU training in FastMoE: A Fast Mixture-of-Expert Training System.
The code is based on the TensorFlow implementation, which can be found here. To cite this repository:
@misc{rau2019moe,
title={Sparsely-gated Mixture-of-Experts PyTorch implementation},
author={Rau, David},
journal={https://github.com/davidmrau/mixture-of-experts},
year={2019}
}