Experimental

VAE Embeddings

I am currently experimenting with a Variational Autoencoder (VAE) that generates embeddings to replace RWKV's own. For a given input sequence, the VAE produces an embedding matrix that is reconstructed from the compressed latent space it learned from the dataset. Training has gone smoothly so far, with the loss converging to a good minimum:

[Figure: training loss curve]
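As a rough illustration of what a VAE of this kind is usually trained to minimize, here is a minimal pure-Python sketch of the standard objective: a reconstruction error plus a KL term that pulls the latent distribution toward a standard normal. This is the textbook formulation, not necessarily the exact loss used in the training script; the function names, the squared-error reconstruction term, and the `beta` weight are all assumptions made for illustration.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), elementwise."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, sigma^2) || N(0, 1)), summed over latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))


def reconstruction_error(x, x_hat):
    """Sum of squared errors between the input and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))


def vae_loss(x, x_hat, mu, log_var, beta=1.0):
    """Reconstruction error plus beta-weighted KL term (beta is an assumed knob)."""
    return reconstruction_error(x, x_hat) + beta * kl_to_standard_normal(mu, log_var)


# Toy values: a 2-dimensional latent that already matches the N(0, 1) prior,
# and a reconstruction identical to the input.
rng = random.Random(0)
mu = [0.0, 0.0]
log_var = [0.0, 0.0]
z = reparameterize(mu, log_var, rng)
loss = vae_loss([1.0, 2.0], [1.0, 2.0], mu, log_var)
```

In a real model, `mu` and `log_var` would come from the encoder and `x_hat` from the decoder applied to `z`; the lists above only stand in for those outputs.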

Replacing the embedding layer of the RWKV model with this pre-trained version should help it generate more meaningful sequences that reflect the "soul" of a given dataset more reliably, because the embeddings capture the dataset's essential components. It also offers a way to steer the generated sequences in different directions, depending on the characteristics of the pre-trained embeddings injected into the target model.
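The replacement step described above might look roughly like the following sketch, which swaps an embedding matrix inside a dictionary of model weights after checking that the shapes agree. Everything here is hypothetical: the weights are plain nested lists rather than real tensors, `inject_embeddings` is an illustrative helper, and the `"emb.weight"` key is an assumption about how the target checkpoint names its embedding matrix.

```python
def shape(matrix):
    """Return (rows, columns) of a rectangular list-of-lists matrix."""
    return (len(matrix), len(matrix[0]) if matrix else 0)


def inject_embeddings(state, new_emb, key="emb.weight"):
    """Return a copy of `state` whose embedding matrix is replaced by `new_emb`.

    `key` is an assumed name for the embedding weights; adjust it to match
    the checkpoint you are actually patching.
    """
    if key not in state:
        raise KeyError(f"no entry named {key!r} in the model weights")
    if shape(state[key]) != shape(new_emb):
        raise ValueError(
            f"shape mismatch: model has {shape(state[key])}, "
            f"new embeddings have {shape(new_emb)}"
        )
    patched = dict(state)  # shallow copy: other weights are shared, not duplicated
    patched[key] = [list(row) for row in new_emb]
    return patched


# Toy weights: a vocabulary of 3 tokens with 2-dimensional embeddings.
state = {
    "emb.weight": [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]],
    "head.weight": [[1.0, 1.0]],
}
vae_embeddings = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
patched = inject_embeddings(state, vae_embeddings)
```

The shape check matters because the pre-trained matrix has to match both the vocabulary size and the embedding width of the target model; a mismatch would otherwise only surface later, during generation.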

A sample training script is available in this notebook, or you can follow the instructions here.
