Juno-T/jax-rl

Hands-on RL algorithm implementation with JAX.

RL algorithms with JAX

Hands-on project started after finishing DeepMind's RL course; I decided to learn and use JAX for the implementations.

🚧 The project is still in progress.

TODO

Utilities:

  • Experience accumulator
    • ✔️ by episodes
    • 🔲 by transitions (🚧 in progress)
  • ✔️ Training experiment
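
The "by episodes" accumulator above could look roughly like the following minimal sketch (class and method names are assumptions for illustration, not the repo's actual API): transitions are buffered until a terminal flag closes the episode, at which point the whole trajectory is archived.

```python
# Minimal sketch of an experience accumulator grouped by episodes.
# Names (EpisodeAccumulator, push) are illustrative assumptions.
class EpisodeAccumulator:
    def __init__(self):
        self.episodes = []   # list of completed episodes
        self._current = []   # transitions of the episode in progress

    def push(self, state, action, reward, next_state, done):
        """Record one (s, a, r, s', done) transition."""
        self._current.append((state, action, reward, next_state, done))
        if done:
            # Episode boundary: archive the trajectory and start fresh.
            self.episodes.append(self._current)
            self._current = []

acc = EpisodeAccumulator()
acc.push(0, 1, 1.0, 1, done=False)
acc.push(1, 0, 1.0, 2, done=True)   # closes the first episode
```

The in-progress "by transitions" variant would instead keep one flat buffer of transitions (as a DQN-style replay buffer does), dropping the episode grouping.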

Environments:

  • gym
    • ✔️ Blackjack
    • ✔️ Cartpole
    • 🔲 Atari (🚧 in progress)
  • mujoco ?
  • evogym (so cool, must try)

Algorithms:

  • Value function approximator
    • ✔️ Tabular
    • ✔️ Linear
    • ✔️ Neural Nets
  • Value approximation/heuristic
    • TD
      • ✔️ TD(0)
      • ✔️ n-step TD
      • ✔️ TD(λ)
    • Q-learning
      • ✔️ vanilla q-learning
      • 🔲 λ q-learning
  • Simple agents
    • ✔️ Tabular + TD, Q (with ε-greedy)
    • ✔️ Linear + Q (with ε-greedy)
  • DQN
    • ✔️ Barebones (NN + Q)
    • 🔲 Vanilla DQN (🚧 in progress)
    • 🔲 Rainbow?
  • Policy Gradient
    • 🔲 Vanilla
    • 🔲 Trust Region/PPO?
  • Model-based ?
  • GVF ?
  • Combining with Evolutionary ?
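
To make the TD family above concrete, here is a minimal JAX sketch of the TD(0) update for a linear value-function approximator, assuming v(s) = w · φ(s). The function name and signature are illustrative assumptions, not the repo's actual code.

```python
# Minimal sketch of a jitted TD(0) update for linear value approximation.
# Assumes v(s) = dot(w, phi_s); names here are illustrative, not the repo's API.
import jax
import jax.numpy as jnp

@jax.jit
def td0_update(w, phi_s, reward, phi_s_next, done, alpha=0.1, gamma=0.99):
    """One TD(0) step: w <- w + α · δ · φ(s), with δ = r + γ v(s') - v(s)."""
    v_s = jnp.dot(w, phi_s)
    # Bootstrap from the next state unless the episode terminated.
    v_next = jnp.where(done, 0.0, jnp.dot(w, phi_s_next))
    td_error = reward + gamma * v_next - v_s
    return w + alpha * td_error * phi_s

w = jnp.zeros(4)
w = td0_update(w, jnp.array([1.0, 0.0, 0.0, 0.0]), 1.0,
               jnp.array([0.0, 1.0, 0.0, 0.0]), False)
```

The tabular case is the same update with one-hot features, and n-step TD / TD(λ) replace the one-step target r + γ v(s') with an n-step or λ-weighted return.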
