The Rust-loving team at Immunant has been hard at work on C2Rust, a migration framework that takes the drudgery out of migrating to Rust. Our goal is to make safety improvements to the translated Rust automatically where we can, and help the programmer do the same where we cannot. First, however, we have to build a rock-solid translator that gets people up and running in Rust. Testing on small CLI programs gets old eventually, so we decided to try translating Quake 3 into Rust. After a couple of days, we were likely the first people to ever play Quake3 in Rust!
Setting the stage: Quake 3 sources
After looking at the original Quake 3 source code and various forks, we settled on ioquake3. It is a community fork of Quake 3 that is still maintained and builds on modern platforms.
As a starting point, we made sure we could build the project as is:

$ make release
The ioquake3 build produces a few different libraries and executables:

$ tree --prune -I missionpack -P "*.so|*x86_64"
.
└── build
    └── debug-linux-x86_64
        ├── baseq3
        │   ├── cgamex86_64.so          # client
        │   ├── qagamex86_64.so         # game server
        │   └── uix86_64.so             # ui
        ├── ioq3ded.x86_64              # dedicated server binary
        ├── ioquake3.x86_64             # main binary
        ├── renderer_opengl1_x86_64.so  # opengl1 renderer
        └── renderer_opengl2_x86_64.so  # opengl2 renderer
Of these libraries, the UI, client, and server libraries can be built as either Quake VM assembly or native x86 shared libraries. We opted to use the native versions of these libraries for our project. Translating just the VM into Rust and using the QVM versions would have been significantly simpler, but we wanted to thoroughly test out C2Rust.
We focused on the UI, game, client, OpenGL1 renderer and main binary for our translation. It would be possible to translate the OpenGL2 renderer as well, but we chose to skip it as it makes significant use of .glsl shader files which the build system embeds as literal strings in C source code. While we could add custom build script support for embedding the GLSL code into Rust strings after we transpile, there's not a good automatic way to transpile these autogenerated, temporary files. We instead just translated the OpenGL1 renderer library and forced the game to use it instead of the default renderer. Finally, we decided to skip the dedicated server and mission pack files, as they wouldn't be hard to translate but were also not necessary for our demonstration.

Transpiling Quake 3
To preserve the directory structure used by Quake 3 and not need to change its source code, we needed to produce exactly the same binaries as the native build, meaning four shared libraries and one executable. Since C2Rust produces Cargo build files, each binary needs its own Rust crate with a corresponding Cargo.toml file. For C2Rust to produce one crate per output binary, it would need a list of the binaries along with their corresponding object or source files, and the linker invocation used to produce each binary (used to determine other details like library dependencies).
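As a rough illustration of the per-binary information described above, here is a minimal sketch that pulls the output name, object files and library dependencies out of a single linker invocation. The `LinkedBinary` type and the `parse_link_command` function are hypothetical names chosen for this example; they are not part of C2Rust, and a real linker command line has many more cases than the three handled here.

```rust
/// Hypothetical record of what one crate per binary needs to know.
#[derive(Debug, PartialEq)]
struct LinkedBinary {
    /// Value of the `-o` argument, e.g. a shared library name.
    output: String,
    /// Object files passed to the linker.
    objects: Vec<String>,
    /// Libraries requested with `-l<name>`.
    libs: Vec<String>,
}

/// Parse a (simplified) linker invocation. Returns `None` for compile-only
/// commands (`-c`) and for commands without an `-o` argument.
fn parse_link_command(args: &[&str]) -> Option<LinkedBinary> {
    let mut output = None;
    let mut objects = Vec::new();
    let mut libs = Vec::new();

    // Skip argv[0], the compiler or linker driver itself.
    let mut iter = args.iter().skip(1);
    while let Some(&arg) = iter.next() {
        if arg == "-c" {
            // A compilation command, not a link step.
            return None;
        } else if arg == "-o" {
            output = iter.next().map(|s| String::from(*s));
        } else if let Some(lib) = arg.strip_prefix("-l") {
            if !lib.is_empty() {
                libs.push(lib.to_string());
            }
        } else if arg.ends_with(".o") {
            objects.push(arg.to_string());
        }
    }

    output.map(|output| LinkedBinary { output, objects, libs })
}
```

Calling `parse_link_command(&["cc", "-shared", "-o", "cgamex86_64.so", "cg_main.o", "-lm"])` yields one record with the output name, one object file and the `m` library, which is the kind of grouping a crate-per-binary translation needs.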
However, we quickly ran into one limitation with the way C2Rust intercepts the native build process: C2Rust takes a compilation database file as an input, which contains a list of compilation commands executed during the build. This database contains only compilation commands, and not any linker invocations. Most tools that produce this database have this intentional limitation, e.g., bear, cmake with CMAKE_EXPORT_COMPILE_COMMANDS, and compiledb. To our knowledge, the only tool that does include linking commands is build-logger from CodeChecker, which we did not use because we only learned about it after writing our own wrappers (described below). This meant that we couldn't use a compile_commands.json file produced by any of the common tools to transpile a multi-binary C program.

Instead, we wrote our own compiler and linker wrapper scripts that dump out all compiler and linker invocations to a database, and then convert that into an extended compile_commands.json. Instead of the normal build using a command like:

$ make release
We add wrappers to intercept the build using:
$ make release CC=/path/to/C2Rust/scripts/cc-wrappers/cc

The wrappers produce a directory full of JSON files, one per invocation. A second script aggregates all of them into a new compile_commands.json file that contains both compilation and linking commands. We then extended C2Rust to read the linking commands from the database, and produce a separate crate per linked binary. Additionally, C2Rust now also reads the library dependencies of each binary and automatically adds them to that crate's build.rs file.

As a quality of life improvement, all of the binaries can be built at once by having them within a workspace. C2Rust produces a top-level workspace Cargo.toml file, so we can build the project with a single cargo build command in the quake3-rs directory:

$ tree -L 1
.
├── Cargo.lock
├── Cargo.toml
├── cgamex86_64
├── ioquake3
├── qagamex86_64
├── renderer_opengl1_x86_64
├── rust-toolchain
└── uix86_64

$ cargo build --release

Fixing a few Papercuts
When we first tried to build the translated code, we hit a couple of issues with the Quake 3 sources: corner cases that C2Rust couldn't handle (correctly or at all).
Pointers to Arrays
In a few places, the original source code contains expressions that point one past the last element of an array. Here is a simplified example of the C code:
int array[1024];
int *p;

// ...

if (p >= &array[1024]) {
    // error...
}
The C standard (see e.g. C11, Section 6.5.6) allows pointers to an element one past the end of the array. However, Rust forbids this, even if we are only taking the address of the element. We found examples of this pattern in the AAS_TraceClientBBox function.
The Rust compiler also flagged a similar but actually buggy example in G_TryPushingEntity, where the conditional is >, not >=. The out-of-bounds pointer was then dereferenced after the conditional, which is an actual memory safety bug.

To avoid this issue in the future, we fixed the C2Rust transpiler to use pointer arithmetic to calculate the address of an array element instead of using an array indexing operation. With this fix, code that uses this "address of element past the array end" pattern will now correctly translate and run with no modifications necessary.
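The difference between indexing and pointer arithmetic can be sketched in a few lines of Rust. This is only an illustration of the idea, not the code C2Rust emits; `element_addr`, `is_past_end` and `checked_element` are hypothetical helper names, and the sketch uses the safe `wrapping_add` method rather than whatever pointer operation the transpiler actually generates.

```rust
const LEN: usize = 1024;

/// Address of `array[index]` computed with pointer arithmetic only. The
/// element is never read, so `index == LEN` (one past the end) is allowed.
fn element_addr(array: &[i32; LEN], index: usize) -> *const i32 {
    assert!(index <= LEN, "only in-bounds or one-past-the-end addresses");
    array.as_ptr().wrapping_add(index)
}

/// Rust rendering of the C check `p >= &array[1024]`.
fn is_past_end(array: &[i32; LEN], p: *const i32) -> bool {
    p >= element_addr(array, LEN)
}

/// Bounds-checked lookup: indexing with `array[index]` performs the same
/// check but panics instead of returning `None`.
fn checked_element(array: &[i32; LEN], index: usize) -> Option<&i32> {
    array.get(index)
}
```

With these helpers, `checked_element(&array, LEN)` returns `None` (the indexing form would panic), while `element_addr(&array, LEN)` yields a valid one-past-the-end address that can be compared against but must not be dereferenced.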
Flexible Array Members
We started up a game to test things out and immediately got a panic from Rust:

thread 'main' panicked at 'index out of bounds: the len is 4 but the index is 4', quake3-client/src/cm_polylib.rs

Taking a look at cm_polylib.c, we noticed that it was dereferencing the p field in the following struct:

typedef struct
{
    int     numpoints;
    vec3_t  p[4];        // variable sized
} winding_t;

The p field in this struct is a pre-C99, non-compliant version of a "flexible array member", which is still accepted by gcc. C2Rust recognizes flexible array members with the C99 syntax (vec3_t p[]) and implements simple heuristics to also detect some pre-C99 versions of this pattern (0- and 1-sized arrays at the end of structures; we also found a few of those in the ioquake3 source code).

Changing the above struct to C99 syntax fixed the panic:

typedef struct
{
    int     numpoints;
    vec3_t  p[];        // variable sized
} winding_t;

Trying to automatically fix this pattern in the general case (arrays of sizes other than 0 or 1) would be extremely difficult, since we would have to distinguish between regular arrays and flexible array members of arbitrary sizes. Instead, we recommend that the original C code is fixed manually, just like we did for ioquake3.
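To make the panic concrete, the sketch below contrasts a fixed `[Vec3; 4]` array, where index 4 is always out of bounds, with a header followed by a run-time number of points. It is a hand-written illustration, not C2Rust output: the `Winding` layout with a zero-length trailing array, and the helpers `fixed_get`, `winding_layout` and `last_point_x`, are assumptions made for this example.

```rust
use std::alloc::{alloc_zeroed, dealloc, Layout};

type Vec3 = [f32; 3];

/// Header with a zero-length trailing array standing in for `vec3_t p[]`.
#[repr(C)]
struct Winding {
    numpoints: i32,
    p: [Vec3; 0], // the real points live directly after the header
}

/// With a fixed `[Vec3; 4]`, index 4 is out of bounds no matter how much
/// memory the C code allocated behind the struct.
fn fixed_get(index: usize) -> Option<Vec3> {
    let p: [Vec3; 4] = [[0.0; 3]; 4];
    p.get(index).copied()
}

/// Layout of a header followed by `n` points.
fn winding_layout(n: usize) -> Layout {
    let header = Layout::new::<Winding>();
    let points = Layout::array::<Vec3>(n).expect("point count too large");
    let (layout, _offset) = header.extend(points).expect("layout overflow");
    layout.pad_to_align()
}

/// Allocate a winding with `n` points, fill them through a raw pointer
/// (no array bounds check involved) and return the x value of the last one.
fn last_point_x(n: usize) -> f32 {
    assert!(n > 0);
    let layout = winding_layout(n);
    unsafe {
        let w = alloc_zeroed(layout) as *mut Winding;
        assert!(!w.is_null(), "allocation failed");
        (*w).numpoints = n as i32;
        // Pointer to the first trailing element, derived from the allocation.
        let points = std::ptr::addr_of_mut!((*w).p) as *mut Vec3;
        for i in 0..n {
            points.add(i).write([i as f32, 0.0, 0.0]);
        }
        let x = points.add(n - 1).read()[0];
        dealloc(w as *mut u8, layout);
        x
    }
}
```

Here `fixed_get(4)` returns `None`, mirroring the "len is 4 but the index is 4" panic, while `last_point_x(5)` can reach the fifth point because access goes through pointer arithmetic over an allocation sized for five points.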
Tied Operands in Inline Assembly
Another source of crashes was this C inline assembly code from the /usr/include/bits/select.h system header:

# define __FD_ZERO(fdsp) \
  do {                                                          \
    int __d0, __d1;                                             \
    __asm__ __volatile__ ("cld; rep; " __FD_ZERO_STOS           \
                          : "=c" (__d0), "=D" (__d1)            \
                          : "a" (0), "0" (sizeof (fd_set)       \
                                          / sizeof (__fd_mask)), \
                            "1" (&__FDS_BITS (fdsp)[0])         \
                          : "memory");                          \
  } while (0)

which defines the internal version of the __FD_ZERO macro. This definition hits a rare corner case of gcc inline assembly: tied input/output operands with different sizes. The "=D" (__d1) output operand binds the edi register to the __d1 variable as a 32-bit value, while "1" (&__FDS_BITS (fdsp)[0]) binds the same register to the address of fdsp->fds_bits as a 64-bit pointer. gcc and clang fix this mismatch by using the 64-bit register rdi instead and then truncating its value before the assignment to __d1, while Rust defaults to LLVM's semantics, which leave this case undefined. What we saw happening for debug builds (but not release builds, which behaved correctly) was that both operands would be assigned to the edi register, causing the pointer to be truncated to 32 bits before the inline assembly, which would cause crashes.
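The effect of the mismatch can be reproduced with plain integer casts, without any inline assembly. The sketch below is only a model of the register behaviour described above; `through_edi` and `through_rdi` are hypothetical names, and the example address is made up.

```rust
/// Models the debug-build behaviour: the 64-bit pointer is squeezed into
/// the 32-bit `edi` register, so its upper half is lost.
fn through_edi(ptr: u64) -> u64 {
    (ptr as u32) as u64
}

/// Models the gcc/clang behaviour: the pointer travels in the 64-bit `rdi`
/// register and only the value assigned to the `int` output is truncated.
/// Returns (address seen by the assembly, value stored in `__d1`).
fn through_rdi(ptr: u64) -> (u64, i32) {
    let register = ptr; // full 64-bit value reaches the assembly
    let d1 = register as i32; // truncation happens only on the way out
    (register, d1)
}
```

For a typical stack address such as `0x0000_7ffd_1234_5678`, `through_edi` returns `0x1234_5678`, which no longer points at the `fd_set`, while `through_rdi` keeps the full address for the assembly and truncates only the unused output.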
Since rustc passes Rust inline assembly to LLVM with very few changes, we decided to fix this particular case in C2Rust. We implemented a new c2rust-asm-casts crate that fixes the issue above via the Rust type system, using a trait and some helper functions that automatically extend and truncate the values of tied operands to an internal size that is large enough to hold both operands. The __FD_ZERO macro above correctly transpiles to the following:
let mut __d0: c_int = 0;
let mut __d1: c_int = 0;
// Reference to the output value of the first operand
let fresh5 = &mut __d0;
// The internal storage for the first tied operand
let fresh6;
// Reference to the output value of the second operand
let fresh7 = &mut __d1;
// The internal storage for the second tied operand
let fresh8;
// Input value of the first operand
let fresh9 = (::std::mem::size_of::<fd_set>() as c_ulong)
    .wrapping_div(::std::mem::size_of::<__fd_mask>() as c_ulong);
// Input value of the second operand
let fresh10 = &mut *fdset.__fds_bits.as_mut_ptr().offset(0) as *mut __fd_mask;
asm!("cld; rep; stosq"
     : "={cx}" (fresh6), "={di}" (fresh8)
     : "{ax}" (0),
       // Cast the input operands into the internal storage type
       // with optional zero- or sign-extension
       "0" (AsmCast::cast_in(fresh5, fresh9)),
       "1" (AsmCast::cast_in(fresh7, fresh10))
     : "memory"
     : "volatile");
// Cast the operands out (types are inferred) with truncation
AsmCast::cast_out(fresh5, fresh9, fresh6);
AsmCast::cast_out(fresh7, fresh10, fresh8);
Note that the code above does not require the types for any input or output values in the assembly statement, relying instead on Rust's type inference to resolve those types (mainly the types of fresh6 and fresh8 above).

Aligned Global Variables
The final source of crashes we encountered was the following global variable that stores an SSE constant:

static unsigned char ssemask[16] __attribute__((aligned(16))) =
{
    "\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF\x00\x00\x00\x00"
};

Rust currently supports the alignment attribute on structure types, but not on global variables, i.e., static items. We are looking into ways to solve this in the general case in either Rust or C2Rust, but have decided to fix this issue manually for ioquake3 with a short patch file for now. This patch file replaces the Rust equivalent of ssemask with:

#[repr(C, align(16))]
struct SseMask([u8; 16]);
static mut ssemask: SseMask = SseMask([
    255, 255, 255, 255, 255, 255, 255, 255,
    255, 255, 255, 255, 0, 0, 0, 0,
]);

Running quake3-rs
Running cargo build --release emits the binaries, but they are all emitted under target/release using a directory structure that the ioquake3 binary does not recognize. We wrote a script that creates symbolic links in the current directory to replicate the correct directory structure (including links to the .pk3 files containing the game assets):

$ /path/to/make_quake3_rs_links.sh /path/to/quake3-rs/target/release /path/to/paks
The /path/to/paks path should point to a directory containing the .pk3 files.

Now let's run the game! We need to pass +set vm_game 0, etc., so that we load these modules as Rust shared libraries instead of QVM assembly, and set cl_renderer to use the OpenGL1 renderer.

$ ./ioquake3 +set sv_pure 0 +set vm_game 0 +set vm_cgame 0 +set vm_ui 0 +set cl_renderer "opengl1"
And…
We have Quake3 running in Rust!
Here is a video of us transpiling Quake 3, loading the game and playing for a bit:
You may browse the transpiled sources in the transpiled branch of our repository. We also provide the refactored branch containing the same sources with some refactoring commands pre-applied.

Transpiling Instructions
If you want to try translating Quake 3 and run it yourself, please be aware that you will need to own the original Quake 3 game assets or download the demo assets from the web. You'll also need to install C2Rust (the required Rust nightly version at the time of writing is a dated nightly-… toolchain, but we recommend you check the C2Rust repository or crates.io for the latest one):

$ cargo +nightly-… install c2rust
and copies of our C2Rust and ioquake3 repositories:
$ git clone git@github.com:immunant/c2rust.git
$ git clone git@github.com:immunant/ioq3.git

As an alternative to installing c2rust with the command above, you may build C2Rust manually using cargo build --release. In either case, the C2Rust repository is still required as it contains the compiler wrapper scripts that are required to transpile ioquake3.
We provide a script that automatically transpiles the C code and applies the ssemask patch. To use it, run the following command from the top level of the ioq3 repository:

$ ./transpile.sh …

This command should produce a quake3-rs subdirectory containing the Rust code, where you can subsequently run cargo build --release and the rest of the steps described earlier.
As we continue to develop C2Rust, we'd love to hear what you want to see translated next. Drop us a line at [email protected] and let us know! If you have legacy C code you need modernized and translated, the team here at Immunant is here to help. We are available for consulting and contracting engagements ranging from one-time support to full-service code modernization.