An implementation of "Retentive Network: A Successor to Transformer for Large Language Models"
Updated Nov 2, 2023 - Python
Reimplementation of Sparse Variational Dropout in Keras-Core/Keras 3.0