Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Updated tutorials, requirements, and added workflow to test tutorials (experimental)
1 parent 9b4b602, commit c3713d5
Showing 11 changed files with 126 additions and 11 deletions.
@@ -0,0 +1,36 @@
# This workflow will install Python dependencies, run tests and lint with a variety of Python versions
# For more information see: https://help.github.com/actions/language-and-framework-guides/using-python-with-github-actions
---
name: Tutorial tests

on:
  push:
    branches: [master]
  pull_request:
    branches: [master]

permissions:
  contents: read

jobs:
  tutorial-test:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        python-version: ['3.7', '3.8', '3.9', '3.10']  # '3.11' - broken due to numba
        tutorial: ['GreedyAgent']
    steps:
      - uses: actions/checkout@v3
      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v3
        with:
          python-version: ${{ matrix.python-version }}
      - name: Install dependencies and run tutorials
        run: |
          sudo apt-get install python3-opengl xvfb
          cd tutorials/${{ matrix.tutorial }}
          pip install -r requirements.txt
          pip uninstall -y pettingzoo
          pip install -e ../..
          for f in *.py; do xvfb-run -a -s "-screen 0 1024x768x24" python "$f"; done
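
The last workflow step loops over every `*.py` file in the tutorial directory and runs it. For trying the same check locally, a rough Python equivalent might look like the sketch below. The function name `run_tutorials` and the `timeout` default are illustrative assumptions, not part of this commit, and the sketch does not start a virtual display, so scripts that open a pygame window still need a real or virtual screen.

```python
import subprocess
import sys
from pathlib import Path


def run_tutorials(directory, timeout=600):
    """Run every *.py file in `directory` and return {filename: exit code}.

    Hypothetical helper: it mirrors the shell loop in the workflow, but keeps
    going after a failure so that all exit codes are collected.
    """
    results = {}
    for script in sorted(Path(directory).glob("*.py")):
        completed = subprocess.run(
            [sys.executable, script.name],  # same interpreter as the caller
            cwd=directory,                  # run from inside the tutorial folder
            timeout=timeout,                # assumed limit, in seconds
        )
        results[script.name] = completed.returncode
    return results
```

Calling `run_tutorials("tutorials/GreedyAgent")` from the repository root would return a dictionary of exit codes, where any non-zero value marks a tutorial script that failed.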
File renamed without changes.
@@ -7,4 +7,5 @@ pytest==7.1.2
 ray==2.2.0
 tianshou==0.4.11
 torch==1.12.1
-pre-commit==3.1.1
+pre-commit==3.1.1
+hypothesis==2.4.0
@@ -0,0 +1,21 @@
# Tutorial: Greedy Agent
This tutorial provides a basic example of running the gobblet environment with greedy agents.

The agents are greedy in the sense that they only choose actions which:
1. Win the game
2. Block the opponent from winning

The `depth` parameter controls how many turns ahead the agents search.

For example, with a depth of 2 an agent also considers moves that set it up to win on its next move, regardless of what the opponent does.
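
The win-or-block rule can be illustrated on a much smaller game. The sketch below uses tic-tac-toe as a stand-in and searches only one move ahead; it is not the `GreedyGobbletPolicy` implementation, and the names `WIN_LINES`, `winning_moves`, and `greedy_move` are hypothetical. Returning `None` when no winning or blocking move exists is also an assumption, since the fallback behavior is not described here.

```python
# Toy illustration on a 3x3 tic-tac-toe board stored as a flat list of 9 cells.
# Each cell holds a player symbol or None. This is NOT the gobblet policy.
WIN_LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]


def winning_moves(board, player):
    """Return the empty cells where `player` would complete a line."""
    moves = []
    for cell in range(9):
        if board[cell] is not None:
            continue
        trial = list(board)
        trial[cell] = player
        if any(all(trial[i] == player for i in line) for line in WIN_LINES):
            moves.append(cell)
    return moves


def greedy_move(board, player, opponent):
    """Prefer a winning move, then a blocking move, else None (assumed fallback)."""
    wins = winning_moves(board, player)
    if wins:
        return wins[0]
    blocks = winning_moves(board, opponent)
    if blocks:
        return blocks[0]
    return None
```

A deeper search would repeat the same check on the positions reachable after each candidate move, which is roughly what a larger `depth` value asks for.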

This script randomizes the first move for each agent to add variety. The underlying policy in `greedy_policy.py` also prevents an agent from repeating any of its previous 3 moves, to avoid getting stuck in a loop.
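
One way to picture the no-repeat rule is as a filter applied to the action mask before a move is chosen. The sketch below is an assumption about how such a filter could work, not the code in `greedy_policy.py`; the name `mask_recent_moves` and the fallback to the unfiltered mask are hypothetical.

```python
from collections import deque


def mask_recent_moves(action_mask, recent_moves):
    """Return a copy of `action_mask` with recently played actions disabled.

    If filtering would leave no legal action, the original mask is returned
    unchanged (an assumed fallback, so the agent can always move).
    """
    filtered = [
        1 if legal and action not in recent_moves else 0
        for action, legal in enumerate(action_mask)
    ]
    return filtered if any(filtered) else list(action_mask)


# A deque with maxlen=3 keeps only the three most recent actions:
# appending a fourth action silently drops the oldest one.
history = deque(maxlen=3)
for played in (0, 1, 4, 2):
    history.append(played)

mask = mask_recent_moves([1, 1, 1, 0, 1], history)
```

In this example `history` ends up holding actions 1, 4, and 2, so only action 0 stays enabled in `mask`.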

## Usage:

1. (Optional) Create a virtual environment: `conda create -n gobblet python=3.10`
2. (Optional) Activate the virtual environment: `conda activate gobblet`
3. Install gobblet: run `pip install gobblet-rl`, or run `pip install -e .` from the repository root
4. Install the requirements for this tutorial: `cd tutorials/GreedyAgent && pip install -r requirements.txt`
5. Run the tutorial: `python tutorial_greedy.py`
@@ -0,0 +1,5 @@
gym==0.23.1
gymnasium==0.27.1
numpy==1.22.0
PettingZoo==1.22.3
pygame==2.1.2
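
Because every requirement above is pinned with `==`, a quick check that the active environment matches the pins can save debugging time. The sketch below is one possible way to do that with the standard library; `parse_pins` and `find_mismatches` are hypothetical helper names, and `importlib.metadata` is only in the standard library from Python 3.8 onward.

```python
from importlib import metadata


def parse_pins(text):
    """Parse `name==version` lines into a dict, skipping comments and blanks."""
    pins = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def find_mismatches(pins):
    """Return {name: (pinned, installed)} for packages that differ.

    `installed` is None when the package is not installed at all.
    """
    mismatches = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches
```

Passing the contents of `requirements.txt` through `parse_pins` and then `find_mismatches` yields an empty dictionary when the environment matches all pins.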
@@ -0,0 +1,49 @@
import time
import numpy as np

from gobblet import gobblet_v1  # noqa: E402

PLAYER = 0
DEPTH = 2
RENDER_MODE = "human"

if __name__ == "__main__":
    env = gobblet_v1.env(render_mode="human", args=None)

    greedy_policy = gobblet_v1.GreedyGobbletPolicy(depth=DEPTH)

    # Render 3 games between greedy agents
    for _ in range(3):
        env.reset()
        env.render()  # need to render the environment before pygame can take user input

        iter = 0

        for agent in env.agent_iter():
            observation, reward, termination, truncation, info = env.last()

            if termination or truncation:
                env.render()
                time.sleep(1)
                print(f"Agent: ({agent}), Reward: {reward}, info: {info}")
                break

            if iter < 2:
                # Randomize the first action for variety (games can be repeated otherwise)
                action_mask = observation["action_mask"]
                action = np.random.choice(
                    np.arange(len(action_mask)), p=action_mask / np.sum(action_mask)
                )
                # Wait 1 second between moves so the user can follow the sequence of moves
                time.sleep(1)

            else:
                action = greedy_policy.compute_action(
                    observation["observation"], observation["action_mask"]
                )
                # Wait 1 second between moves so the user can follow the sequence of moves
                time.sleep(1)

            env.step(action)

            iter += 1
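
The random opening move in the script divides the action mask by its sum, which turns a 0/1 mask into a uniform distribution over the legal actions. The sketch below restates that idea in plain Python so the arithmetic is visible; the function names are hypothetical, and it uses the standard `random` module rather than NumPy.

```python
import random


def legal_action_probabilities(action_mask):
    """Turn a 0/1 mask into probabilities: 1/k for each of the k legal actions."""
    total = sum(action_mask)
    if total == 0:
        raise ValueError("no legal actions in mask")
    return [m / total for m in action_mask]


def sample_legal_action(action_mask, rng=random):
    """Draw one action index uniformly from the legal ones.

    Illegal actions get weight 0, so they are never selected.
    """
    probs = legal_action_probabilities(action_mask)
    return rng.choices(range(len(action_mask)), weights=probs, k=1)[0]
```

For a mask such as `[0, 1, 0, 1, 1]`, each of the three legal actions gets probability 1/3 and the two illegal ones get 0. Passing a seeded `random.Random` instance as `rng` makes the draw reproducible.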
File renamed without changes.