Experimental
I am currently experimenting with a Variational Autoencoder (VAE) that generates embeddings as a replacement for RWKV's own. For a given input sequence, the VAE produces an embedding matrix that is reconstructed from the compressed latent space it has learned from the dataset. Training has gone smoothly, with the loss converging to a good minimum.
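To make the encode, sample, and decode steps concrete, here is a minimal NumPy sketch of a VAE forward pass and its loss. It is not the model used in this repository: the layer sizes, the single linear encoder and decoder, the squared-error reconstruction term, and every name (`encode`, `reparameterize`, `decode`, `vae_loss`) are assumptions chosen for illustration, and the weights are random rather than trained.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 8 input rows, 16 features each, 4 latent dimensions.
n_rows, n_features, n_latent = 8, 16, 4
x = rng.normal(size=(n_rows, n_features))

# Untrained linear weights, for illustration only.
w_mu = rng.normal(scale=0.1, size=(n_features, n_latent))
w_logvar = rng.normal(scale=0.1, size=(n_features, n_latent))
w_dec = rng.normal(scale=0.1, size=(n_latent, n_features))


def encode(x):
    """Map inputs to the mean and log-variance of a diagonal Gaussian."""
    return x @ w_mu, x @ w_logvar


def reparameterize(mu, logvar):
    """Reparameterization trick: z = mu + sigma * eps with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps


def decode(z):
    """Reconstruct the input from a latent sample."""
    return z @ w_dec


def vae_loss(x, x_hat, mu, logvar):
    """Squared-error reconstruction plus KL(q(z|x) || N(0, I)), averaged over rows."""
    recon = np.sum((x - x_hat) ** 2, axis=1)
    kl = -0.5 * np.sum(1.0 + logvar - mu ** 2 - np.exp(logvar), axis=1)
    return float(np.mean(recon + kl))


mu, logvar = encode(x)
z = reparameterize(mu, logvar)   # compressed latent representation
x_hat = decode(z)                # reconstructed matrix
loss = vae_loss(x, x_hat, mu, logvar)
```

In a trained model, the reconstructed matrix (or the latent code itself) is what would serve as the embedding; which of the two this project uses is not stated here.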
Replacing the embedding layer of the RWKV model with this pre-trained version should help generate more meaningful sequences that reflect the "soul" of a given dataset more reliably, because the embeddings capture its essential components. It also provides a way to steer the generated sequences in different directions, depending on the characteristics of the pre-trained embeddings that we inject into the target model.
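As a rough illustration of the injection step, the sketch below swaps the embedding table in a dictionary of model weights. It assumes the table is stored under a key such as `emb.weight` and that the VAE output has the same `(vocab_size, n_embd)` shape as the table it replaces; the helper `inject_embeddings` and the toy sizes are hypothetical and not part of this repository.

```python
import numpy as np


def inject_embeddings(weights, pretrained, key="emb.weight"):
    """Return a copy of `weights` with the embedding table under `key` replaced.

    `key` is an assumed name; adjust it to match the actual checkpoint.
    """
    current = weights[key]
    if pretrained.shape != current.shape:
        raise ValueError(
            f"shape mismatch: expected {current.shape}, got {pretrained.shape}"
        )
    updated = dict(weights)  # shallow copy, other tensors are shared
    updated[key] = pretrained.astype(current.dtype)
    return updated


rng = np.random.default_rng(0)
vocab_size, n_embd = 10, 4  # toy sizes for illustration

# Stand-in for a loaded model checkpoint.
weights = {
    "emb.weight": rng.normal(size=(vocab_size, n_embd)).astype(np.float32),
}

# Stand-in for the embedding matrix produced by the pre-trained VAE.
vae_embeddings = rng.normal(size=(vocab_size, n_embd))

patched = inject_embeddings(weights, vae_embeddings)

# Token lookup now reads rows from the injected table.
tokens = np.array([1, 3, 3, 7])
embedded = patched["emb.weight"][tokens]
```

Returning a new dictionary instead of mutating the original keeps the untouched checkpoint available for comparison, which is useful when testing embeddings with different characteristics.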
A sample training script is available in this notebook, or you can follow the instructions here.