Hybrid Transformer based Multi-agent Reinforcement Learning for Multiple Unmanned Aerial Vehicle Coordination in Air Corridors
4 air corridors, cylinder-torus-torus-cylinder, 12 UAVs, 4-static, and 3-mobile
10 air corridors, cylinder-torus-torus-cylinder-torus-torus-cylinder-torus-torus-cylinder, 12 UAVs, 4-static, and 3-mobile
- The embedding network normalizes the input values and standardizes the input dimensions.
- The transformer processes the information of dynamic neighbors using encoders and decoders.
- The actor-critic network outputs the estimated state value and a stochastic action in spherical coordinates.
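
As a minimal sketch of the last step, a spherical action could be sampled and converted to a Cartesian vector as shown below. It assumes the physics convention (radius r, polar angle theta measured from +z, azimuth phi) and independent Gaussian sampling, neither of which is stated above. The function names are hypothetical and are not taken from the repository.

```python
import math
import random


def spherical_to_cartesian(r, theta, phi):
    """Convert (r, polar angle theta, azimuth phi) to (x, y, z).

    Assumed convention: theta is measured from the +z axis,
    phi is measured in the x-y plane from the +x axis.
    """
    x = r * math.sin(theta) * math.cos(phi)
    y = r * math.sin(theta) * math.sin(phi)
    z = r * math.cos(theta)
    return (x, y, z)


def sample_action(mean, std, rng):
    """Sample a stochastic spherical action (r, theta, phi).

    Assumption: each component is drawn from an independent Gaussian.
    The radius is clipped to be non-negative and theta to [0, pi].
    """
    r = max(0.0, rng.gauss(mean[0], std[0]))
    theta = min(math.pi, max(0.0, rng.gauss(mean[1], std[1])))
    phi = rng.gauss(mean[2], std[2])
    return (r, theta, phi)


# Example: sample one action and convert it to a Cartesian vector.
rng = random.Random(0)
action = sample_action(mean=(1.0, math.pi / 2, 0.0), std=(0.1, 0.1, 0.1), rng=rng)
velocity = spherical_to_cartesian(*action)
```

The length of the resulting vector equals the sampled radius, so under these assumptions a speed limit could be enforced on a single component.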
Train with a single set of parameters: main.py
Train a batch of runs with a parameter grid search: batched_grid_search.sh
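
For illustration only, a grid search enumerates every combination of the candidate parameter values. The sketch below shows that expansion in Python. The parameter names and values are hypothetical, and nothing here describes what batched_grid_search.sh actually does or which arguments main.py accepts.

```python
import itertools


def expand_grid(grid):
    """Return one dict per combination of the values in `grid`."""
    keys = list(grid)
    combos = itertools.product(*(grid[k] for k in keys))
    return [dict(zip(keys, values)) for values in combos]


# Hypothetical search space, not taken from the repository.
grid = {
    "learning_rate": [1e-4, 3e-4],
    "num_agents": [6, 12],
}

configs = expand_grid(grid)
for config in configs:
    # Each config would correspond to one training run.
    print(config)
```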
The actor and critic models are saved every 0.25 million steps. Training progress is reported in the terminal log and visualized in TensorBoard.
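
A minimal sketch of the save schedule, assuming the interval is counted in training steps starting from zero (whether these are environment steps or update steps is not stated here). The helper names are hypothetical.

```python
SAVE_INTERVAL = 250_000  # 0.25 million steps


def should_save(step, interval=SAVE_INTERVAL):
    """Return True when `step` lands on a checkpoint boundary."""
    return step > 0 and step % interval == 0


def checkpoint_steps(total_steps, interval=SAVE_INTERVAL):
    """List every step at which a checkpoint would be written."""
    return list(range(interval, total_steps + 1, interval))


# A 1-million-step run would produce four checkpoints under this schedule.
steps = checkpoint_steps(1_000_000)
```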