Boxcars: an Example Rocket League Replay Parser Using Rust and Nom
Rust and nom showing the way to cut through the binary weeds
Boxcars, also stylized as boxca-rs, is an example of a Rocket League replay parser written in Rust using nom for parsing and serde for serialization. As stated in the title, this is an example: this library in no way competes with feature-complete parsers such as Octane and RocketLeagueReplayParser. Rather, let boxcars be a good example of Rust code using nom and serde, as extensive examples are hard to come by. While it lacks feature completeness and user-friendly error messages, among other issues, its tests and documentation strive to be thorough.
I highly recommend checking out the repo, as the comments made here are also made in the code!
At lunch one day at work, we were discussing Rocket League. If you’re not familiar with Rocket League, look up a match! It’s a soccer + cars video game, and after a match you can save a replay. The replay is what we were talking about at lunch. In a replay, you’re a free floating camera and can view the action from any angle. A player controlled camera ruled out a traditional video format, so we hypothesized that parsing the replay would be possible.
Searching revealed parsers already existed, and a website dedicated to analyzing uploaded replays. This sparked collective daydreaming about 3D heatmaps of where players and balls were found during a match.
My coworkers let the topic die after lunch, but I didn’t. I saw it as an opportunity. I love parsing video game formats and making the parsers extremely efficient. The state of the art parser is Octane, which claims on its front page:
Octane parses most replays in less than 5 seconds.
I took a look at a replay, saw it was 1MB, so I accepted the challenge.
I decided on Rust and nom for a variety of reasons. The first is performance and code ergonomics: Rust emphasizes zero cost abstractions, type inference, and pattern matching, while nom makes a pretty large claim in its readme:
speed: benchmarks have shown that nom parsers often outperform many parser combinators library like Parsec and attoparsec, some regular expression engines and even handwritten C parsers
The second reason for choosing these libraries is that they provided an ample learning opportunity, as I’m new to Rust and parser combinators! I love learning.
Additionally, nom makes parsing binary data a breeze, which we’ll dive more into now.
A Rocket League game replay is a binary encoded file with an emphasis on little endian encodings. The number 100 would be represented as the four byte sequence:
0x64 0x00 0x00 0x00
This is in contrast to big-endian, which would represent the number as:
0x00 0x00 0x00 0x64
Remember, little endian means least significant byte first!
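To make the byte order concrete, here is a minimal sketch using only the Rust standard library (not nom). The function names are mine, chosen for illustration; the manual version spells out what `u32::from_le_bytes` does.

```rust
/// Interpret four bytes as a 32-bit unsigned integer, least significant byte first.
fn decode_le(bytes: [u8; 4]) -> u32 {
    u32::from_le_bytes(bytes)
}

/// Interpret the same four bytes most significant byte first.
fn decode_be(bytes: [u8; 4]) -> u32 {
    u32::from_be_bytes(bytes)
}

/// Little endian decoding written out by hand: byte 0 is the lowest 8 bits,
/// byte 1 the next 8 bits, and so on.
fn decode_le_manual(bytes: [u8; 4]) -> u32 {
    (bytes[0] as u32)
        | (bytes[1] as u32) << 8
        | (bytes[2] as u32) << 16
        | (bytes[3] as u32) << 24
}
```

Feeding `[0x64, 0x00, 0x00, 0x00]` to `decode_le` gives 100, while the same bytes read as big-endian give a very different (and much larger) number.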
A replay is split into three major sections, a header, body, and footer.
The first four bytes of a replay are the number of bytes that comprise the header. An integer length prefix is very common throughout a replay. The prefix may refer either to the number of bytes an element takes up, as just seen, or to the number of elements in a list.
The next four bytes make up the cyclic redundancy check (CRC) for the header. The check ensures that the data has not been tampered with or, more likely, corrupted. Unfortunately, it remains an outstanding issue to implement this check. I tried utilizing crc-rs with community-calculated parameters, but didn’t get anywhere.
The game’s major and minor versions follow, each a 32-bit integer.
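The four fields described so far can be read with a handful of lines. This is a sketch using plain slices rather than the nom combinators the repo uses; the `Preamble` struct, its field names, and the choice of unsigned integers are my assumptions for illustration, and the CRC is only read, not verified.

```rust
/// The first 16 bytes of a replay, as described above (names are illustrative).
#[derive(Debug, PartialEq)]
struct Preamble {
    header_size: u32,
    crc: u32,
    major_version: u32,
    minor_version: u32,
}

/// Read one little endian u32 and return it with the remaining input.
fn read_u32(input: &[u8]) -> Option<(u32, &[u8])> {
    if input.len() < 4 {
        return None; // not enough data
    }
    let (head, rest) = input.split_at(4);
    let value = u32::from_le_bytes([head[0], head[1], head[2], head[3]]);
    Some((value, rest))
}

/// Read header size, CRC, major version, and minor version, in that order.
fn read_preamble(input: &[u8]) -> Option<(Preamble, &[u8])> {
    let (header_size, rest) = read_u32(input)?;
    let (crc, rest) = read_u32(rest)?;
    let (major_version, rest) = read_u32(rest)?;
    let (minor_version, rest) = read_u32(rest)?;
    let preamble = Preamble { header_size, crc, major_version, minor_version };
    Some((preamble, rest))
}
```

Each step hands the unconsumed bytes to the next one, which is the same shape a parser combinator library gives you, minus the error reporting.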
Subsequently, the game type is encoded as a string. Strings in Rocket League replay files are length prefixed. The last byte of the text is a null terminator, so we trim it off. It may seem redundant to store both a length prefix and a null terminator, but stackoverflow contains a nice reasoning for why it may have been done this way. There is a potential for UTF-16 strings, in which case the prefixed length will be negative and its magnitude will be half the required byte length, since each character in a UTF-16 string is two bytes. Decoding UTF-16 strings is not supported at this time because I don’t have a reproducible replay with a UTF-16 string.
For example, the string “None” is encoded as a length of 5 (four characters plus the null terminator) followed by the text:
0x05 0x00 0x00 0x00 0x4e 0x6f 0x6e 0x65 0x00
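A sketch of that string decoding with plain slices follows. The function name is mine, and two choices are assumptions rather than anything the format guarantees: the text is decoded leniently with `String::from_utf8_lossy`, and a negative (UTF-16) length is simply rejected, matching the "not supported at this time" note above.

```rust
/// Read a length prefixed, null terminated string and return it with the
/// remaining input. Returns None on short input or a UTF-16 (negative) length.
fn read_string(input: &[u8]) -> Option<(String, &[u8])> {
    if input.len() < 4 {
        return None;
    }
    let (prefix, rest) = input.split_at(4);
    let len = i32::from_le_bytes([prefix[0], prefix[1], prefix[2], prefix[3]]);

    // Negative means UTF-16 (unsupported here); zero leaves no room for the
    // null terminator.
    if len <= 0 {
        return None;
    }
    let len = len as usize;
    if rest.len() < len {
        return None;
    }

    let (text, rest) = rest.split_at(len);
    // The prefix counts the trailing null byte, so trim it off.
    let (body, terminator) = text.split_at(len - 1);
    if terminator[0] != 0 {
        return None;
    }

    // Assumption: lenient UTF-8 decoding is good enough for this sketch.
    Some((String::from_utf8_lossy(body).into_owned(), rest))
}
```

Running it over the nine bytes above yields the string "None" with nothing left over.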
Next come the properties, which is where all the good nuggets of info reside:
- When and who scored a goal
- Player stats (goals, assists, score, etc).
- Date and level played on
A property can be a number, string, or a more complex object such as an array containing additional properties.
Header properties are encoded in a pretty simple format, with some oddities. The first 64 bits are data that can be discarded. Some people think that the 64 bits are the length of the data, while others think that the first 32 bits are the header length in bytes with the subsequent 32 bits unknown. It doesn’t matter to us; we throw them out anyway. The rest of the bytes are decoded according to the property type.
One can visualize the properties as a map of strings to various types (number, string, array) that continues until a “None” key is found. That visualization shouldn’t be taken as a guarantee, though: since replay data is not a defined format, it would be foolish to assume that duplicate keys can’t exist. Thus, to be safe, boxcars stores all the information in a vector of key-value tuples.
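The "read until a None key, keep everything in a vector" loop can be sketched independently of how keys and values are actually laid out. In the sketch below, `read_key` and `read_value` are hypothetical caller-supplied decoders (they are not the functions in the repo), so the only thing being shown is the loop and the choice of `Vec` over a map.

```rust
/// Collect (key, value) pairs in file order until a "None" key is read.
/// Duplicate keys are kept, which a HashMap would silently collapse.
///
/// `read_key` and `read_value` are stand-ins supplied by the caller; each
/// returns the decoded item plus the remaining input, or None on failure.
fn read_properties<'a, V>(
    mut input: &'a [u8],
    read_key: fn(&[u8]) -> Option<(String, &[u8])>,
    read_value: fn(&[u8]) -> Option<(V, &[u8])>,
) -> Option<(Vec<(String, V)>, &'a [u8])> {
    let mut properties = Vec::new();
    loop {
        let (key, rest) = read_key(input)?;
        if key == "None" {
            // The sentinel key carries no value; hand back what follows it.
            return Some((properties, rest));
        }
        let (value, rest) = read_value(rest)?;
        properties.push((key, value));
        input = rest;
    }
}
```

If the same key shows up twice, both entries survive in the order they were read, and it is left to the consumer to decide what a duplicate means.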
The body is the least implemented section, but it contains some familiar notions, such as length prefixed data structures.
Out of the body we get:
- Levels (which level the match took place on)
- Keyframes: as defined in the video compression section of the Wikipedia article, these are the main frames from which the following frame data is derived. Since we are not decoding the network stream, this is more a nice-to-decode than a necessity
- The body’s crc. This check is actually for the rest of the content (including the footer).
Since everything is length prefixed, we’re able to skip the network stream data. This would be 90% of the file, and it’s a shame that my enthusiasm for implementing this section waned. When the developers of the game say the section isn’t easy to parse, the major Rocket League libraries dedicate half of their code to parsing the section, and everything breaks with each patch, it’s an incredible feat for anyone to retain enthusiasm. Way to go, maintainers!
We already extracted most of the interesting bits like player stats and goals contained in the header, so it’s not a tremendous loss if we can’t parse the network data. If we were able to parse the network data, it would allow us to run benchmarks against other implementations.
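Skipping a section you don't understand is cheap when its size is written in front of it. The sketch below assumes the section is preceded by a 32-bit little endian byte count (my reading of "length prefixed" for this section); the function name is illustrative.

```rust
/// Split off a length prefixed section without interpreting it.
/// Returns (skipped bytes, remaining input), or None if the input is short.
fn skip_length_prefixed(input: &[u8]) -> Option<(&[u8], &[u8])> {
    if input.len() < 4 {
        return None;
    }
    let (prefix, rest) = input.split_at(4);
    // Assumption: the prefix is an unsigned byte count.
    let len = u32::from_le_bytes([prefix[0], prefix[1], prefix[2], prefix[3]]) as usize;
    if rest.len() < len {
        return None;
    }
    Some(rest.split_at(len))
}
```

The skipped slice is returned rather than dropped, so it can be handed to a real network stream parser later without touching the surrounding code.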
What comes after the network stream isn’t of much interest to us, as it mostly supports the network stream, but there was a low barrier to parsing it. From the footer we see:
- Debug info: think of it as a
- Tickmarks typically represent a significant event in the game (e.g. a goal). The tick mark is placed before the event happens so there is a ramp-up time. For instance, a tickmark could be at frame 396 for a goal at frame 441. At 30 fps, this would be 1.5 seconds of ramp-up time.
- Several pieces of string info and other classes that seem totally worthless if the network data isn’t parsed
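The ramp-up arithmetic for tickmarks is just a frame difference divided by the frame rate: (441 - 396) / 30 = 1.5 seconds. As a small sketch (the function name and the choice to pass the frame rate as a parameter are mine):

```rust
/// Seconds between a tickmark and the event it announces.
/// Returns None if the tickmark comes after the event.
fn ramp_up_seconds(tick_frame: u32, event_frame: u32, fps: f64) -> Option<f64> {
    let frames = event_frame.checked_sub(tick_frame)?;
    Some(f64::from(frames) / fps)
}
```

With the numbers from the example, `ramp_up_seconds(396, 441, 30.0)` gives 1.5.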
All of the models for the replay are contained in one file because of serde. On Rust nightly, which is what this library needs to compile, no special actions are needed for serde to work. However, to achieve compilation on Rust stable, an extra build step can be introduced (EDIT (Aug 2017): not needed anymore now that serde can compile on stable). Part of the intermediate step is to split all the serde specific code into a separate file, which is what I’ve done. To me, it is not critical to have this library compile on stable, hence this work remains unfinished.
For serde, we only care about serialization, specifically JSON serialization. Deserialization from our JSON output is not implemented because the output is lossy (JSON isn’t the best with different numeric/string types). Asking “why JSON” would be the next logical step, and that’s due to other Rocket League replay parsers (like Octane) using JSON; however, the output of this library is not compatible with that of other Rocket League replay parsers.
Remember the current goal:
Octane parses most replays in less than 5 seconds.
This is what initially got me curious whether, utilizing the right tools, I could do better.
Running Octane on assets/rumble.replay, found in the repo, decoded the information and converted it to JSON in 2.3s. Considering the file is 1MB, I saw room for improvement. Using the implementation here to output the header and footer data in JSON took 1ms. Yes, this is not an apples to apples comparison, and one should continue using proven tools, not some example project, but if I were to extrapolate, there isn’t 1000x additional work needed.
Work to be Done
The following issues are being tracked in the repo, but for completeness’ sake I’m going to list what I find to be the major enhancements needed to make this project not just an example but a serious competitor:
- Parse the network stream (low priority as most of the interesting bits are in the header, but if I do ever want to do that 3D heatmap, positional data will need to be parsed out)
- Implement CRC check to ensure that data isn’t corrupted
- Better error messages: instead of Incomplete(4778), the output should be something like “Not enough data to parse header”
- Compile on Rust stable
- Instead of ingesting the complete file, use streaming
- Decode UTF-16 strings properly
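As a taste of what the "better error messages" item could look like, here is one possible shape using only the standard library. Everything in it is hypothetical: `Section`, `ParseError`, and the message wording are not part of boxcars or nom, and the number is passed through exactly as the parser reported it, without claiming what it counts.

```rust
use std::fmt;

/// Which part of the replay was being parsed (illustrative).
#[derive(Debug, PartialEq)]
enum Section {
    Header,
    Body,
    Footer,
}

/// A friendlier error than a bare `Incomplete(4778)` (hypothetical type).
#[derive(Debug, PartialEq)]
enum ParseError {
    NotEnoughData { section: Section, needed: usize },
}

impl fmt::Display for ParseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ParseError::NotEnoughData { section, needed } => {
                let name = match section {
                    Section::Header => "header",
                    Section::Body => "body",
                    Section::Footer => "footer",
                };
                write!(
                    f,
                    "Not enough data to parse {} (parser reported {})",
                    name, needed
                )
            }
        }
    }
}

impl std::error::Error for ParseError {}
```

Because the type implements `std::error::Error`, it can be returned from `main` or boxed alongside I/O errors, and the person running the tool sees a sentence instead of a raw parser state.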