This repository contains the reference implementation of the [DAU+]DSUP algorithms from the paper
Action Gaps and Advantages in Continuous-Time Distributional Reinforcement Learning
by Harley Wiltzer*, Marc G. Bellemare, David Meger, Patrick Shafto, and Yash Jhaveri*.
This project uses PDM for dependency management. See https://pdm-project.org/latest/#installation for installation instructions.
Once PDM has been installed, execute the following from the project root to sync the dependencies:
pdm venv create
pdm install
Before running any code, be sure to activate the virtual environment (from the project root):
source .venv/bin/activate
Some environments simulate dynamics from datasets. The download_data.sh script downloads these datasets. Make the script executable:
chmod +x download_data.sh
Then run the script to download the datasets:
./download_data.sh
This script will create a data directory in the project root with the requisite datasets.
The easiest way to run training scripts is with our justfile, using the just command runner.
To train agents for risk-neutral option trading, execute:
just writer=[aim | comet] agent=[dsup | qrdqn | dau] option_idx=<int> time_mul=<int> train_options
Here, option_idx specifies the commodity for the environment, and time_mul sets the decision frequency as a multiple of the base frequency. Setting time_mul=1 gives the base frequency, and time_mul=n gives n times the base frequency.
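As a rough illustration of what the frequency multiplier means, the sketch below converts time_mul into the time between consecutive decisions. The names BASE_STEP, decision_step, and decisions_per_horizon are hypothetical and do not come from this repository, and the base step of 1.0 is an arbitrary assumed unit.

```python
# Hypothetical illustration only: BASE_STEP is an assumed value,
# not a constant taken from this repository.
BASE_STEP = 1.0  # time between decisions at the base frequency (arbitrary units)


def decision_step(time_mul: int, base_step: float = BASE_STEP) -> float:
    """Time between consecutive decisions when deciding time_mul times as often."""
    if time_mul < 1:
        raise ValueError("time_mul must be a positive integer")
    return base_step / time_mul


def decisions_per_horizon(base_decisions: int, time_mul: int) -> int:
    """Number of decisions over a fixed horizon, given the count at the base frequency."""
    if time_mul < 1:
        raise ValueError("time_mul must be a positive integer")
    return base_decisions * time_mul


if __name__ == "__main__":
    for n in (1, 2, 4):
        print(f"time_mul={n}: step={decision_step(n)}, "
              f"decisions={decisions_per_horizon(10, n)}")
```

Under these assumptions, doubling time_mul halves the time between decisions and doubles the number of decisions over the same horizon.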
To train the DAU+DSUP(1/2) variant, replace train_options with train_options_dsup_shifted.
To train agents for risk-sensitive option trading with CVaR (conditional value at risk), execute:
just writer=[aim | comet] agent=[dsup | qrdqn | dau] option_idx=<int> time_mul=<int> risk_param=<float> train_options_risky
Here, risk_param refers to the CVaR level for the experiment.
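For readers unfamiliar with CVaR, the sketch below estimates it from a set of sampled returns under one common convention: CVaR at a given level is the mean of the worst fraction of outcomes, so a level of 1 recovers the ordinary mean. The function name cvar is hypothetical, and the convention used by risk_param in this repository may differ.

```python
import math


def cvar(returns, level):
    """Lower-tail CVaR estimate: mean of the worst `level` fraction of returns.

    Assumes the convention where level=1.0 is risk-neutral (plain mean) and
    smaller levels focus on worse outcomes. This is an illustrative sketch,
    not the repository's implementation.
    """
    if not 0.0 < level <= 1.0:
        raise ValueError("level must lie in (0, 1]")
    ordered = sorted(returns)  # worst (lowest) returns first
    # Keep at least one sample so the tail is never empty.
    k = max(1, math.ceil(level * len(ordered)))
    tail = ordered[:k]
    return sum(tail) / len(tail)


if __name__ == "__main__":
    sample_returns = [-2.0, 0.0, 1.0, 3.0]
    for level in (1.0, 0.5, 0.25):
        print(f"level={level}: CVaR={cvar(sample_returns, level)}")
```

With the four sample returns above, lowering the level shrinks the tail from all four values to only the single worst one.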
If you build on our work or find it useful, please cite it using the following BibTeX entry:
@inproceedings{wiltzer2024action,
title={Action Gaps and Advantages in Continuous-Time Distributional Reinforcement Learning},
author={Harley Wiltzer and Marc G. Bellemare and David Meger and Patrick Shafto and Yash Jhaveri},
booktitle={The Thirty-eighth Annual Conference on Neural Information Processing Systems},
year={2024},
url={https://openreview.net/forum?id=BRW0MKJ7Rr}
}