This is the source code for my Quake reinforcement learning project, which I described in my video “Teaching a computer to strafe jump in Quake with reinforcement learning”.
To summarize, I developed a facsimile of Quake player movement physics in Python, wrapped it in an OpenAI `gym.Env`, and then trained an agent on it using RLLib’s implementation of PPO. In the end the agent was able to beat the current human speedrun record on a speedrunning practice map.
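For intuition about what the physics facsimile has to model, the well-known Quake accelerate routine, which strafe jumping exploits, can be sketched as follows. This is a simplified reconstruction of the original game’s movement code, not this project’s exact implementation, and all names and numbers here are illustrative:

```python
def accelerate(velocity, wishdir, wishspeed, accel, frametime):
    """Simplified Quake-style acceleration. Speed is only capped along the
    wish direction, so turning while accelerating (strafe jumping) can push
    total speed past the nominal wishspeed cap."""
    currentspeed = sum(v * w for v, w in zip(velocity, wishdir))
    addspeed = wishspeed - currentspeed
    if addspeed <= 0:
        return list(velocity)
    accelspeed = min(accel * frametime * wishspeed, addspeed)
    return [v + accelspeed * w for v, w in zip(velocity, wishdir)]

# Moving at full speed forwards, a sideways wishdir still grants acceleration,
# because the projection of velocity onto wishdir is zero:
v = accelerate([320.0, 0.0, 0.0], [0.0, 1.0, 0.0], 30.0, 10.0, 0.1)
print(v)  # [320.0, 30.0, 0.0]
```

The resulting speed exceeds 320 units/s even though each individual acceleration step is capped, which is the effect the agent has to learn to exploit.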
Setup

Find below instructions to use the gym environment, train your own model, and produce a demo file.
Installing the gym env
If you just want to play with the gym environment then you need only clone this repo and install the env package:
```shell
git clone https://github.com/matthewearl/q1physrl.git
pip install -e q1physrl/q1physrl_env
```
The environment ID is `Q1PhysEnv-v0`, which is registered by importing `q1physrl_env.env`. The environment accepts a single argument, `config`, which is an instance of `q1physrl_env.env.Config`. See `q1physrl_env.env.PhysEnv` for a detailed description of the environment, and `q1physrl_env.env.Config` for a description of the configuration options.
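A typical interaction then follows the standard gym loop. In the sketch below a stub with the same `reset`/`step` interface stands in for the real environment, which would be created with `gym.make("Q1PhysEnv-v0", config=...)` after importing `q1physrl_env.env`; the stub’s observations, actions, and rewards are placeholder values, not the real environment’s:

```python
# Sketch of the standard gym interaction loop with a stand-in environment.
import random

class StubPhysEnv:
    def reset(self):
        self.t = 0
        return [0.0, 0.0]            # placeholder observation

    def step(self, action):
        self.t += 1
        obs = [float(self.t), 0.0]   # placeholder observation
        reward = 1.0                 # e.g. progress made this frame
        done = self.t >= 10          # episode ends after 10 steps
        return obs, reward, done, {}

env = StubPhysEnv()
obs = env.reset()
total_reward = 0.0
done = False
while not done:
    action = random.randrange(3)     # real env: env.action_space.sample()
    obs, reward, done, info = env.step(action)
    total_reward += reward
print(total_reward)  # 10.0
```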
Training
If instead you want to train the model used in the video, including trying out new hyperparameters and environment settings, then follow the steps below:
- Set up a clean virtualenv using Python 3.7.x (tested with Python 3.7.2).
- Install this repo and its requirements:

```shell
git clone https://github.com/matthewearl/q1physrl.git
cd q1physrl
pip install -r requirements_q1physrl.txt
pip install -e q1physrl_env
pip install -e .
```

- Run:

```shell
q1physrl_train data/params.yml
```

…and wait for convergence. Modify the file referenced in the argument to change hyperparameters and environment settings. My current PB (the one in the video) gets about … on the `zero_start_total_reward_mean` metric after … million steps, which takes about a day on my i7:
Ray (the framework that RLLib is a part of) will launch a local web server showing a dashboard of useful information. This includes an embedded TensorBoard instance on which you can track the `zero_start_total_reward_mean` metric, among others.
I’m using RLLib’s implementation of PPO; see the RLLib PPO docs for how you can tweak the algorithm parameters, and `q1physrl_env.env.Config` for the environment parameters.
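For context, the most important of those algorithm parameters feed PPO’s clipped surrogate objective. Below is a minimal pure-Python sketch of that objective (generic PPO, not RLLib’s internals; the function name is illustrative):

```python
def ppo_clip_objective(ratios, advantages, clip_param=0.2):
    """Mean clipped surrogate: min(r * A, clip(r, 1 - eps, 1 + eps) * A)."""
    def clip(r):
        return max(1.0 - clip_param, min(1.0 + clip_param, r))
    terms = [min(r * a, clip(r) * a) for r, a in zip(ratios, advantages)]
    return sum(terms) / len(terms)

# A large policy ratio with a positive advantage is clipped, which limits
# how far a single update can move the policy:
print(ppo_clip_objective([2.0], [1.0]))  # 1.2 (clipped from 2.0)
```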
RLLib will write checkpoint files and run parameters into `~/ray_results`, which are needed for analysis and for producing a demo file (see below). The checkpoint files can also be used to resume a previous run, by setting `checkpoint_fname` in `params.yml`. To use a different results directory, set the `TUNE_RESULT_DIR` environment variable.
Use this notebook to analyze the checkpoints produced from training. Click through to see an example of what it produces.
Producing a Quake demo file
Demo files (extension `.dem`) are how games are recorded and shared in Quake.
To produce and play back demos you’ll need a registered version of Quake. The reason for this is that the shareware version’s license does not permit playing user-developed levels with the game, and the level used here is one (see `licinfo.txt` in the shareware distribution). If you don’t already have it, you can buy it on Steam for not much money at all.
Once you have a registered copy of the game, the easiest way to produce a demo is with the following script, which makes use of a Docker image containing the rest of the required software. It is invoked like this:

```shell
docker/docker-make-demo.sh \
    ~/.steam/steam/steamapps/common/Quake/Id1 \
    data/checkpoints/wr/checkpoint \
    data/checkpoints/wr/params.json \
    /tmp/wr.dem
```
The first argument is the `id1` directory from your Quake installation, which must contain `pak0.pak` and `pak1.pak`. The second and third arguments are the checkpoint file and the run parameters. The values shown above reproduce the demo shown in the video; substitute files from `~/ray_results/` if you want to produce a demo using an agent that you have trained yourself. The final argument is the demo file to be written.
The script works by first launching a modified quakespasm server, to which a custom client connects. The client repeatedly reads state from the server, pushes it through the model, and sends movement commands into the game. The network traffic is recorded to produce the demo file. The quakespasm server is modified to pause at the end of each frame until a new movement command is received, to avoid synchronization issues.
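That frame-by-frame handshake can be sketched as a lockstep loop. The snippet below mimics it in a single process, with queues standing in for the network connection; the real client/server, state, and protocol details differ:

```python
# Single-process sketch of the lockstep protocol: the "server" sends a frame
# state, then blocks until the "client" replies with a movement command.
import queue
import threading

cmd_q = queue.Queue()
state_q = queue.Queue()

def server(num_frames):
    state = 0
    for _ in range(num_frames):
        state_q.put(state)   # send this frame's state to the client
        cmd = cmd_q.get()    # pause here until a command arrives
        state += cmd         # advance "physics" using the command
    state_q.put(None)        # signal end of the recording

def client():
    states = []
    while True:
        s = state_q.get()
        if s is None:
            break
        states.append(s)     # real setup: push state through the model...
        cmd_q.put(1)         # ...and send back a movement command
    return states

t = threading.Thread(target=server, args=(5,))
t.start()
frames = client()
t.join()
print(frames)  # [0, 1, 2, 3, 4]
```

Because the server never advances without a command, the agent sees every frame exactly once, regardless of how slowly the model runs.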
Playing the demo file
The demo file can be played by following these steps:

- Install a registered version of Quake. (See the section above for why the registered version is necessary.)