In this repository we provide details of our interface, installation guidelines, links to saved checkpoints and training logs, and instructions to reproduce the experiments.
- Install Python 3.7.
- Install Poetry. Please refer to: https://python-poetry.org/docs/#installation
- Download the codebase we shared.
- Create a virtual environment in `ScenicGFootBall` using `poetry env use python3.7`. Activate it using `poetry shell`.
- In `ScenicGFootBall`, run `poetry install`. This will install Scenic and our interface in editable mode.
- Install Google Research Football. Please refer to: https://github.com/google-research/football#on-your-computer
- Install the RL training dependencies, including TensorFlow 1.15, Sonnet, and OpenAI Baselines. Please refer to: https://github.com/google-research/football#run-training
- Go to the folder `training` and run `python3 -m pip install -e .` to install our package `gfrl`, which we use to conduct the experiments.
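As a quick sanity check (a minimal sketch, not part of the repository), the following imports should succeed inside the Poetry environment once all of the above steps are complete:

```python
# Quick sanity check: all three packages should be importable after installation.
import scenic      # Scenic and our GRF interface (installed via poetry install)
import gfootball   # Google Research Football
import gfrl        # our training package (installed from the training folder)

print("Installation looks good.")
```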
Run the following program to create an environment from a Scenic scenario script and run a random agent.
```python
from scenic.simulators.gfootball.utilities.scenic_helper import buildScenario
from scenic.simulators.gfootball.rl.gfScenicEnv_v2 import GFScenicEnv_v2

# Find all our scenarios in training/gfrl/_scenarios/
scenario = buildScenario("..path to a scenic script..")

env_settings = {
    "stacked": True,
    "rewards": "scoring",
    "representation": "extracted",
    "players": ["agent:left_players=1"],
    "real_time": True
}

env = GFScenicEnv_v2(initial_scenario=scenario, gf_env_settings=env_settings)

# Run one episode with a random agent
env.reset()
done = False
while not done:
    action = env.action_space.sample()
    _, _, done, _ = env.step(action)
```
Our model, action, and behavior libraries can be found in the directory `src/scenic/simulators/gfootball/` in `model.scenic`, `actions.py`, and `behaviors.scenic`, respectively.
All of our proposed scenarios can be found in the `training/gfrl/_scenarios` directory, categorized according to their type. The proposed defense and offense scenarios are placed in the `training/gfrl/_scenarios/defense` and `training/gfrl/_scenarios/offense` directories, respectively. Scenic scenario scripts corresponding to select GRF scenarios can be found in `training/gfrl/_scenarios/grf`. `training/gfrl/_scenarios/testing_generalization` contains testing scripts corresponding to all of the above-mentioned scenarios. Scenic semi-expert policy scripts for select scenarios can be found in `training/gfrl/_scenarios/demonstration`. Data generated from these policies are placed in `training/gfrl/_demonstration_data`.
First, create the Scenic scenario object:
```python
from scenic.simulators.gfootball.utilities.scenic_helper import buildScenario
scenario = buildScenario(scenario_file)
```
Then, create a Gym environment from the Scenic scenario. We offer two different environment classes, `GFScenicEnv_v1` and `GFScenicEnv_v2`, selected via `env_type`. We recommend using `GFScenicEnv_v2`, which is our default environment and implements all the features and experiments discussed in the paper. `GFScenicEnv_v1` only supports specifying the initial distribution of states; it does not allow using Scenic behaviors for non-RL agents, i.e., it always uses the default AI behavior provided by GRF for the non-RL agents.
```python
from scenic.simulators.gfootball.rl.gfScenicEnv_v1 import GFScenicEnv_v1
from scenic.simulators.gfootball.rl.gfScenicEnv_v2 import GFScenicEnv_v2

# gf_env_settings: the same GRF settings dictionary shown in the example above
env_type = "... use appropriate env class according to your need ..."

if env_type == "v1":
    env = GFScenicEnv_v1(initial_scenario=scenario, gf_env_settings=gf_env_settings, compute_scenic_behavior=True)
elif env_type == "v2":
    env = GFScenicEnv_v2(initial_scenario=scenario, gf_env_settings=gf_env_settings)
else:
    assert False, "invalid env_type"
```
Please refer to `training/gfrl/base/bc/utils.py` for further details.
To generate expert data using Scenic policies, we execute the Scenic policies at each timestep to produce an action. At any timestep, one can use the environment method `env.simulation.get_scenic_designated_player_action()` to read the action computed for the active player.
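As an illustration, here is a minimal sketch of such a data-collection loop (assuming an environment created with Scenic behaviors enabled, e.g. `GFScenicEnv_v1` with `compute_scenic_behavior=True`; the exact recording format used by our helper scripts may differ):

```python
# Minimal sketch: roll out one episode while recording the action the Scenic
# policy computes for the designated (active) player at every timestep.
obs = env.reset()
done = False
observations, actions = [], []
while not done:
    # Action computed by the Scenic policy for the active player this timestep
    action = env.simulation.get_scenic_designated_player_action()
    observations.append(obs)
    actions.append(action)
    obs, reward, done, info = env.step(action)
```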
We provide helper scripts to automate this process, though. One can simply follow these steps to generate data:
- Open `training/gfrl/experiments/gen_demonstration.py`
- Change the following fields:
  - scenario: path to the Scenic policy scripts
  - data_path: the output file path, without any extension
- Run `python3 training/gfrl/experiments/gen_demonstration.py`
To read the saved offline data, run the following:

```python
from gfrl.common.mybase.cloning.dataset import get_datasets

tds, vds = get_datasets("..path to the saved data...", validation_ratio=0.0)

print("train")
print(tds.summary())
print()

print("validation")
print(vds.summary())
print()
```
To reproduce the PPO results from the paper, please refer to:
- `training/gfrl/experiments/score_scenic.sh`
- Open `training/gfrl/experiments/bc.sh`
- Change the following fields:
  - level: path to the Scenic scenario scripts
  - eval_level: should be the same as level
  - dataset: the .npz demonstration data file generated by `gen_demonstration.py`
  - n_epochs: change it to 5 for offense and 16 for grf scenarios to reproduce the paper's results
  - exp_root: where to store the training outputs
  - exp_name: the output directory's name
- Run `bash training/gfrl/experiments/bc.sh`

With default settings, the script will train the model for 5M timesteps.
- Open `training/gfrl/experiments/pretrain.sh`
- Change the following fields:
  - level: path to the Scenic scenario scripts
  - eval_level: should be the same as level
  - exp_root: where to store the training outputs
  - exp_name: the output directory's name
  - load_path: path to the behavior cloning model
- Run `bash training/gfrl/experiments/pretrain.sh`
Please use the following script to evaluate the model's mean score.
- Open `training/gfrl/experiments/test_agent.sh`
- Change the following fields:
  - eval_level: path to the Scenic scenario scripts
  - load_path: path to the model to be evaluated
  - write_video: set it to False
  - dump_full_episodes: set it to False
- Run `bash training/gfrl/experiments/test_agent.sh`
- The result will be printed at the end in the following format: `exp_name, reward_mean, score_mean, ep_len_mean, num_test_epi, test_total_timesteps, eval_level, load_path`
We publicly share the TensorBoard logs and saved checkpoints for all our experiments here.
We'd like to thank the Scenic and GRF teams for open-sourcing their projects.