A hands-on project after finishing DeepMind's RL course. I decided to learn and use JAX for the implementations.
🚧 The project is still in progress.
Utilities:
- Experience accumulator
  - ✔️ by episodes
  - 🔲 by transitions (🚧 in progress)
- ✔️ Training experiment
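The by-transitions accumulator is still in progress; as a sketch, a per-transition buffer could be as simple as a bounded deque with uniform sampling (the class and method names here are illustrative, not the repo's actual API):

```python
import random
from collections import deque

class TransitionBuffer:
    """Fixed-capacity accumulator of (s, a, r, s_next, done) transitions."""

    def __init__(self, capacity):
        self.storage = deque(maxlen=capacity)  # oldest transitions evicted first

    def push(self, s, a, r, s_next, done):
        self.storage.append((s, a, r, s_next, done))

    def sample(self, batch_size, seed=None):
        # Uniform sampling without replacement, as in vanilla experience replay.
        return random.Random(seed).sample(list(self.storage), batch_size)

    def __len__(self):
        return len(self.storage)

buf = TransitionBuffer(capacity=100)
for t in range(5):
    buf.push(t, 0, 1.0, t + 1, False)
batch = buf.sample(batch_size=2, seed=0)
```

Accumulating by transitions rather than by whole episodes is what later lets DQN-style agents sample decorrelated minibatches.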
Environments:
- gym
  - ✔️ Blackjack
  - ✔️ CartPole
  - 🔲 Atari (🚧 in progress)
- MuJoCo?
- evogym (so cool, must try)
Algorithms:
- Value function approximator
  - ✔️ Tabular
  - ✔️ Linear
  - ✔️ Neural nets
- Value approximation/heuristic
  - TD
    - ✔️ TD(0)
    - ✔️ n-step TD
    - ✔️ TD(λ)
  - Q-learning
    - ✔️ vanilla Q-learning
    - 🔲 Q(λ)
- Simple agents
  - ✔️ Tabular + TD, Q (with ε-greedy)
  - ✔️ Linear + Q (with ε-greedy)
- DQN
  - ✔️ Barebones (NN + Q)
  - 🔲 Vanilla DQN (🚧 in progress)
  - 🔲 Rainbow?
- Policy gradient
  - 🔲 Vanilla
  - 🔲 Trust region/PPO?
- Model-based?
- GVF?
- Combining with evolutionary methods?
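As a taste of the JAX style used throughout, here is a minimal sketch of the tabular pieces above: a TD(0) state-value backup, an ε-greedy action chooser, and a vanilla Q-learning update. Function names and hyperparameter values are illustrative, not the repo's actual API.

```python
import jax
import jax.numpy as jnp

def td0_update(v, s, r, s_next, alpha=0.1, gamma=0.99):
    # One-step backup of V(s) toward the TD(0) target r + γ V(s').
    td_error = r + gamma * v[s_next] - v[s]
    return v.at[s].add(alpha * td_error)

def epsilon_greedy(key, q_row, epsilon=0.1):
    # With probability ε take a uniform random action, else the greedy one.
    explore_key, action_key = jax.random.split(key)
    random_action = jax.random.randint(action_key, (), 0, q_row.shape[0])
    return jnp.where(jax.random.uniform(explore_key) < epsilon,
                     random_action, jnp.argmax(q_row))

def q_learning_update(q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    # Off-policy backup of Q(s, a) toward r + γ max_a' Q(s', a').
    td_error = r + gamma * jnp.max(q[s_next]) - q[s, a]
    return q.at[s, a].add(alpha * td_error)

v = td0_update(jnp.zeros(5), s=0, r=1.0, s_next=1)                  # v[0] -> 0.1
q = q_learning_update(jnp.zeros((3, 2)), s=0, a=1, r=1.0, s_next=2)
action = epsilon_greedy(jax.random.PRNGKey(0), q[0], epsilon=0.0)
```

The functional `array.at[...].add` updates keep the tables immutable, which is what makes these updates `jit`- and `vmap`-friendly.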