Game Playing Agent

The agent is trained using proximal policy optimization (PPO).
The results below were recorded after 500,000 training steps.

Performance

Trial1.mp4

t2.mp4

TODO

Implement soft-actor-critic (SAC).
Empirically optimize the neural network parameters: depth and number of input-layer neurons.
Hyperparameter tuning.

Reward Function

The reward function is:

timeSurvived + ( 3 * rocksDestroyed ) + ( 5 * enemyShipsDestroyed )

Rationale: the agent is incentivised to shoot down rocks and enemy ships and to maximize its survival time.
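
As a minimal sketch, the formula above can be written as a small Python function. The function name, argument names, and weight constants are illustrative and are not taken from the project's script; the unit of `time_survived` (seconds or steps) is not stated in this README and is left to the caller.

```python
# Weights taken from the reward formula in this README.
ROCK_WEIGHT = 3
ENEMY_SHIP_WEIGHT = 5


def compute_reward(time_survived: float,
                   rocks_destroyed: int,
                   enemy_ships_destroyed: int) -> float:
    """Return timeSurvived + 3 * rocksDestroyed + 5 * enemyShipsDestroyed."""
    return (time_survived
            + ROCK_WEIGHT * rocks_destroyed
            + ENEMY_SHIP_WEIGHT * enemy_ships_destroyed)


# Example: 10 time units survived, 2 rocks and 1 enemy ship destroyed.
example = compute_reward(10.0, 2, 1)
```

Because a destroyed enemy ship is worth more than a destroyed rock, an agent with limited shots is nudged toward targeting ships first, while the survival term keeps rewarding it for simply staying alive.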

Please see this script for more details

The input observations are the positions of the objects currently in the scene, fed sequentially, with a label after each position to distinguish between object types.
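
One possible way to build such an observation vector is sketched below. Everything specific here is an assumption made for illustration: 2D positions, the numeric label values, zero padding, and a fixed length of 30 (chosen only to line up with the 30 input neurons listed under Parameters Used). The project's actual encoding may differ.

```python
from typing import List, Tuple

# Hypothetical numeric labels; the real label scheme is not given in this README.
LABELS = {"rock": 1.0, "enemy_ship": 2.0, "enemy_bullet": 3.0}

# Assumption: 30 inputs = 10 objects x (x, y, label).
OBS_SIZE = 30


def encode_observation(objects: List[Tuple[str, float, float]],
                       size: int = OBS_SIZE) -> List[float]:
    """Flatten (kind, x, y) tuples into [x, y, label, x, y, label, ...]."""
    obs: List[float] = []
    for kind, x, y in objects:
        if len(obs) + 3 > size:
            break  # drop objects that do not fit in the fixed-size vector
        obs.extend([x, y, LABELS[kind]])
    # Pad with zeros so the network always sees the same input length.
    obs.extend([0.0] * (size - len(obs)))
    return obs


scene = [("rock", 0.5, -0.25), ("enemy_ship", 1.0, 2.0)]
observation = encode_observation(scene)
```

Placing the label directly after each position keeps every object in one contiguous triple, and using non-zero labels lets the network tell a real object apart from zero padding.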

Parameters Used:

Input layer neurons: 30
Depth: 2

Shortcomings

Limited Computational Power
