# MoE

PyTorch implementation of the Sparsely-Gated Mixture-of-Experts (MoE) layer from "Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer".
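Below is a minimal sketch of a sparsely-gated MoE layer with noisy top-k gating, illustrating the mechanism the paper describes. The class names, constructor arguments, and helper structure here are illustrative assumptions, not the API of this repository, and the sketch omits the paper's load-balancing (importance/load) auxiliary losses and capacity handling.

```python
# Illustrative sketch of noisy top-k gating over a set of experts.
# Names (Expert, SparseMoE, w_gate, w_noise) are hypothetical, not this repo's API.
import torch
import torch.nn as nn
import torch.nn.functional as F


class Expert(nn.Module):
    """A simple feed-forward expert network."""

    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(input_size, hidden_size),
            nn.ReLU(),
            nn.Linear(hidden_size, output_size),
        )

    def forward(self, x):
        return self.net(x)


class SparseMoE(nn.Module):
    """Routes each input to the top-k experts chosen by a noisy gating network."""

    def __init__(self, input_size, output_size, num_experts=8, hidden_size=128, k=2):
        super().__init__()
        self.k = k
        self.experts = nn.ModuleList(
            [Expert(input_size, hidden_size, output_size) for _ in range(num_experts)]
        )
        self.w_gate = nn.Linear(input_size, num_experts, bias=False)
        self.w_noise = nn.Linear(input_size, num_experts, bias=False)

    def forward(self, x):
        # Noisy top-k gating: add input-dependent Gaussian noise to the gate
        # logits, keep only the k largest, and renormalize with a softmax.
        clean_logits = self.w_gate(x)
        noise_std = F.softplus(self.w_noise(x))
        logits = clean_logits + torch.randn_like(clean_logits) * noise_std

        top_logits, top_indices = logits.topk(self.k, dim=-1)
        top_gates = F.softmax(top_logits, dim=-1)

        # Combine the outputs of the selected experts, weighted by their gates.
        out_features = self.experts[0].net[-1].out_features
        output = torch.zeros(x.size(0), out_features, device=x.device)
        for i, expert in enumerate(self.experts):
            mask = top_indices == i                  # (batch, k): expert i selected?
            if mask.any():
                rows = mask.any(dim=-1)              # inputs routed to this expert
                gate = (top_gates * mask).sum(dim=-1)[rows].unsqueeze(-1)
                output[rows] += gate * expert(x[rows])
        return output


if __name__ == "__main__":
    layer = SparseMoE(input_size=32, output_size=16, num_experts=4, k=2)
    y = layer(torch.randn(10, 32))
    print(y.shape)  # torch.Size([10, 16])
```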