Reward Tuning for Self-adaptive Policy in MDP Based Distributed Decision-Making to Ensure a Safe Mission Planning
This project focuses on reward tuning in Markov Decision Processes (MDPs) to enable self-adaptive policies in distributed decision-making, ensuring safe mission planning. The method is validated on a UAV tracking mission case study.
Markov Decision Processes (MDPs) are a standard model for sequential decision-making under uncertainty, typically involving a single agent making decisions. However, in fields like robotics, multiple actions may need to be executed simultaneously by different agents. The complexity of missions often requires the decomposition of an MDP into sub-MDPs, which can lead to conflicts between agents performing concurrent actions. Furthermore, system or sensor failures may introduce risks during missions, such as UAV crashes caused by battery depletion.
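For readers new to the formalism, the sketch below shows a generic MDP solved by value iteration in MATLAB. Everything in it (the state count, transition tensor T, rewards R, and discount factor) is a hypothetical placeholder for illustration, not the model used in this repository:

    % Generic MDP solved by value iteration (illustrative sketch only;
    % all dimensions and matrices are hypothetical, not this repo's model).
    nS = 4; nA = 2;               % number of states and actions
    rng(0);                       % reproducible example
    T = rand(nS, nS, nA);         % T(s, s2, a) ~ P(s2 | s, a), unnormalized
    for a = 1:nA                  % normalize so each (s, a) row sums to 1
        T(:, :, a) = bsxfun(@rdivide, T(:, :, a), sum(T(:, :, a), 2));
    end
    R = randn(nS, nA);            % R(s, a) = immediate reward
    gamma = 0.95;                 % discount factor

    V = zeros(nS, 1);             % value function
    for iter = 1:1000
        Q = zeros(nS, nA);
        for a = 1:nA
            Q(:, a) = R(:, a) + gamma * T(:, :, a) * V;   % Bellman backup
        end
        [Vnew, policy] = max(Q, [], 2);   % greedy policy: policy(s) = best action
        if max(abs(Vnew - V)) < 1e-6, V = Vnew; break; end
        V = Vnew;
    end

In the distributed setting described above, each agent solves such a sub-MDP, and the reward matrix R is what the proposed method retunes.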
This project presents a new method to prevent behavior conflicts in distributed decision-making and emphasizes action selection to ensure system safety. The approach recomputes rewards dynamically, taking into account antagonistic actions and thresholds on transition functions, to promote actions that guarantee mission safety. The method is validated using a UAV tracking mission case study, where rewards are recomputed to avoid conflicts and ensure constraint satisfaction.
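As a rough illustration of that idea (a sketch under assumed names and values, not the repository's actual implementation), the following MATLAB fragment penalizes actions whose probability of reaching an unsafe state exceeds a threshold on the transition function, and demotes one member of a hypothetical antagonistic action pair:

    % Hedged sketch of the reward-tuning idea (all names, states, and
    % thresholds are illustrative assumptions, not the repository's code).
    nS = 4; nA = 3;
    rng(0);
    R = randn(nS, nA);                       % nominal rewards R(s, a)
    T = rand(nS, nS, nA);
    for a = 1:nA                             % normalize transitions per action
        T(:, :, a) = bsxfun(@rdivide, T(:, :, a), sum(T(:, :, a), 2));
    end
    unsafe = 4;                              % hypothetical unsafe (crash) state
    pMax = 0.2;                              % threshold on the transition function
    penalty = -100;                          % strong penalty promoting safe actions
    antagonistic = [1 2];                    % hypothetical pair of conflicting actions

    Rtuned = R;
    for s = 1:nS
        for a = 1:nA
            if T(s, unsafe, a) > pMax        % risk of crash above threshold?
                Rtuned(s, a) = Rtuned(s, a) + penalty;
            end
        end
    end
    % Where the greedy action belongs to the conflicting pair, demote the
    % lower-priority member so distributed agents cannot both select it.
    [~, greedy] = max(Rtuned, [], 2);
    conflictStates = ismember(greedy, antagonistic);
    Rtuned(conflictStates, antagonistic(2)) = ...
        Rtuned(conflictStates, antagonistic(2)) + penalty;

Recomputing the policy from Rtuned then favors actions that keep the mission within the safety constraints.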
Requirements:

- MATLAB (any version supporting base MATLAB functions)
1. Clone the project repository:

       git clone https://github.com/MohandHAMADOUCHE/Reward_Tuning_for_Self-adaptive_Policy_in_MDP.git
       cd Reward_Tuning_for_Self-adaptive_Policy_in_MDP
2. Choose the appropriate test for your case:
2.1. To test the MDP example, navigate to the folder "0. MDP Example - UAV control" and run the MATLAB file MDP_Racing_UAV.m:

    cd '0. MDP Example - UAV control'
    run('MDP_Racing_UAV.m')
2.2. To test reward tuning at the UAV level, navigate to the folder "1. At UAV level" and run the MATLAB file main.m:

    cd '1. At UAV level'
    run('main.m')
2.3. To test reward tuning at the swarm level, navigate to the folder "2. At SWARM level" and run the MATLAB file main.m:

    cd '2. At SWARM level'
    run('main.m')
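Alternatively, assuming MATLAB R2019a or later (which provides the -batch startup option), each test can be run non-interactively from a shell, for example:

    matlab -batch "cd('1. At UAV level'); main"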
The simulation results and the full methodology are detailed in the published paper, Reward Tuning for Self-adaptive Policy in MDP Based Distributed Decision-Making to Ensure a Safe Mission Planning (DSN-W 2020); see the citation below.
We welcome contributions! Please follow these steps:
- Fork the repository.
- Create a new branch for your feature or bug fix.
- Submit a pull request with a detailed explanation of your changes.
This project is licensed under the MIT License - see the LICENSE file for details.
Please cite this work in your publications:
    @inproceedings{hamadouche2020reward,
      title={Reward tuning for self-adaptive policy in {MDP} based distributed decision-making to ensure a safe mission planning},
      author={Hamadouche, Mohand and Dezan, Catherine and Branco, Kalinka RLJC},
      booktitle={2020 50th Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops (DSN-W)},
      pages={78--85},
      year={2020},
      organization={IEEE}
    }