Contributing Guide
Welcome to the Rakata workspace! This guide outlines how we build, how we test, and the core rules for keeping our code clean, compliant, and maintainable.
License Policy
- License: All workspace crates use GPL-3.0-or-later.
- Third-Party Components: New dependencies must be compatible (MIT, Apache-2.0, BSD). Add them to THIRD_PARTY_NOTICES.md before merge.
Clean Room Implementation
To ensure everything we build is 100% our own original work and not accidentally borrowed from other community tools, we follow the rules below (if you’re curious about why we’re so strict about this, check out docs/LEGAL.md):
- Reference Policy: Treat existing tools (like PyKotor) as behavioral references, not copy sources.
- No Copy-Paste: Do not copy source code blocks, large comments, or docstrings from third-party sources into Rust files.
- Re-Derivation: Derive implementation logic from format documentation, observed behavior (hex dumps), and black-box fixture analysis.
- Reverse Engineering:
  - Behavior verification via disassembly tools (e.g., Ghidra) is allowed for interoperability analysis.
  - Do not copy decompiled code into source files.
  - Record findings as paraphrased behavior notes directly within the relevant format specification under docs/src/formats/.
What belongs in the Engine Audits
The entire Rakata format specifications manual (docs/src/formats/) serves as the engine audit layer between reverse engineering and implementation. All Rust code is written strictly from these engine audits (specifically the Engine Audits & Decompilation sections embedded in each format’s blueprint), not from raw decompilation output.
- Record: Field names, data types, default values, error conditions, and observable behavioral rules (e.g., “field X is clamped to range 0–100”, “list is sorted ascending by field Y”).
- Do not record: Step-by-step algorithmic sequences, control flow structure, or implementation details that go beyond what is needed for interoperability. The test is: could someone implement correct behavior from this note without it dictating a specific code structure?
Format Work vs Engine Reimplementation
Right now, this workspace is exclusively focused on format parsing, linting, and modding tools - reading, writing, and validating the game’s actual data files. We are fundamentally just mapping out how the original game structures its data so we can build cool tools around it.
Building an actual game engine replacement (with gameplay logic, AI, and rendering pipelines) is a completely different beast for another day. But that’s exactly why these format blueprints are so critical: if someone wants to build an engine later, they can just use our shared engine audits to understand the data, rather than having to dig through raw decompiled binaries themselves!
Code Style & Linting
Pre-commit Hooks
We use pre-commit to keep the codebase consistently formatted without anyone having to manually police it. After cloning the repository, it’s highly recommended to set up the hooks:
pre-commit install
pre-commit install --hook-type pre-push
This registers two quick automated stages:
- pre-commit: Formats your code via cargo fmt --all (auto-fixing it for you) and runs cargo clippy across all targets.
- pre-push: Runs cargo test --workspace --all-features to ensure tests are green before you push.
Try to avoid skipping hooks using --no-verify. If a hook catches something, it’s usually just a helpful clippy suggestion or a quick formatting tweak!
Manual Checks
If you don’t like automated hooks and prefer running things manually from the workspace root before committing, you absolutely can:
cargo fmt --all
cargo clippy --workspace --all-targets --all-features
cargo test --workspace --all-features
Note: Passing --all-features to clippy and test is important so that they cover optional code paths like serde and tracing! We just ask that fmt and clippy run cleanly before you open a Pull Request.
Idiomatic Rust
To keep the codebase consistently safe, lean, and fast, we heavily rely on a few core Rust principles:
- Safe Numeric Casts: To prevent silent truncation bugs, we enforce #![warn(clippy::as_conversions)]. Avoid the raw “as” keyword; lean on From, TryFrom, or .into(). If a lossy cast is truly unavoidable (like an f32 down to an i32), use a scoped #[allow(clippy::as_conversions)] and drop an inline comment explaining why it’s safe.
- No Primitive Obsession: We heavily utilize strongly-typed wrappers (like ResRef) rather than passing raw [u8; 16] or String primitives around.
- Strict Error Handling: We explicitly forbid .unwrap() and .unwrap_unchecked() in library code. Everything must propagate cleanly via Result using typed error enums (managed via thiserror).
- Composition over Hierarchy: We prefer lean, flat structs and trait combinators over deep, messy object-oriented class hierarchies.
- Iterators over Loops: We prefer functional iterator chains (map, filter, fold) over maintaining manual mutable state in for loops.
- Zero-cost Features: Optional functionality (like serde serialization or tracing telemetry) must introduce absolutely zero overhead when disabled.
- Safe by Default: We use #![forbid(unsafe_code)] across all core parser crates to enforce strict memory safety boundaries.
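As a rough illustration of how several of these principles combine, here is a minimal sketch. Everything in it is hypothetical: the ResRef shown is a stand-in and not the workspace's real type, the function names are invented for the example, and the error impls are written by hand only so the snippet has no dependencies (the workspace itself derives them with thiserror).

```rust
use std::convert::TryFrom;
use std::fmt;

/// Hypothetical typed error enum (hand-written here; the workspace uses `thiserror`).
#[derive(Debug, PartialEq, Eq)]
pub enum ResRefError {
    TooLong { len: usize },
    NotAscii,
}

impl fmt::Display for ResRefError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::TooLong { len } => write!(f, "resref is {} bytes, maximum is 16", len),
            Self::NotAscii => write!(f, "resref contains non-ASCII characters"),
        }
    }
}

impl std::error::Error for ResRefError {}

/// Illustrative stand-in for a strongly-typed wrapper around `[u8; 16]`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ResRef([u8; 16]);

impl ResRef {
    pub fn new(name: &str) -> Result<Self, ResRefError> {
        if !name.is_ascii() {
            return Err(ResRefError::NotAscii);
        }
        let bytes = name.as_bytes();
        let mut buf = [0u8; 16];
        // `get_mut` yields None when the name does not fit, so nothing can panic.
        buf.get_mut(..bytes.len())
            .ok_or(ResRefError::TooLong { len: bytes.len() })?
            .copy_from_slice(bytes);
        Ok(Self(buf))
    }
}

/// Checked narrowing instead of `as`: returns an error rather than truncating.
pub fn entry_count(raw: u64) -> Result<u32, std::num::TryFromIntError> {
    u32::try_from(raw)
}

/// Iterator chain with lossless widening instead of a loop with a mutable sum.
pub fn total_size(entry_sizes: &[u32]) -> u64 {
    entry_sizes.iter().copied().map(u64::from).sum()
}

/// Scoped exception: float-to-int has no `TryFrom`, so `as` is allowed here only.
#[allow(clippy::as_conversions)]
pub fn round_to_i32(value: f32) -> i32 {
    // Safe: float-to-int `as` saturates at the i32 bounds and maps NaN to 0.
    value.round() as i32
}
```

Callers get a Result at every fallible boundary, so an oversized name or an out-of-range count surfaces as a typed error instead of a panic or a silently truncated value.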
Testing & Quality
Our testing approach is a Gray Box strategy: we use our hard-earned white-box knowledge of the game engine (via Ghidra audits) to build strictly validated black-box test cases for our parsers. We want to test against how the real game engine behaves, not against artificial mocks.
When adding a brand new format, please make sure your PR includes:
- Fixture-Backed Tests: Full roundtrip coverage using synthetic test files (stored in fixtures/). We never commit real game assets; run cargo test --test gen_fixtures -- --ignored to safely generate them! Byte-exact roundtrip assertions are the gold standard for any format where the engine consumes bytes exactly as written.
- Mutation Tests: A quick pass to verify the parser safely rejects malformed or corrupted inputs without panicking (usually wired up via corruption_matrix.rs).
- Module Documentation: A clean rustdoc block showing the basic format layout.
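To make the first two requirements concrete, here is a minimal sketch of a byte-exact roundtrip check and a tiny corruption pass. The "DEMO" format (4-byte magic, little-endian u32 length, payload) is invented purely for illustration, as are all function names; the real corruption_matrix.rs helpers may look quite different.

```rust
use std::convert::TryFrom;

#[derive(Debug, PartialEq, Eq)]
pub enum ParseError {
    Truncated,
    BadMagic,
    LengthMismatch,
}

/// Hypothetical parsed form of the invented "DEMO" format.
#[derive(Debug, PartialEq, Eq)]
pub struct Demo {
    pub payload: Vec<u8>,
}

const MAGIC: &[u8] = b"DEMO";

pub fn parse(bytes: &[u8]) -> Result<Demo, ParseError> {
    if bytes.get(..4).ok_or(ParseError::Truncated)? != MAGIC {
        return Err(ParseError::BadMagic);
    }
    let len = match bytes.get(4..8) {
        Some(&[a, b, c, d]) => u32::from_le_bytes([a, b, c, d]),
        _ => return Err(ParseError::Truncated),
    };
    let len = usize::try_from(len).map_err(|_| ParseError::LengthMismatch)?;
    let payload = bytes.get(8..).ok_or(ParseError::Truncated)?;
    if payload.len() != len {
        return Err(ParseError::LengthMismatch);
    }
    Ok(Demo { payload: payload.to_vec() })
}

pub fn write(demo: &Demo) -> Result<Vec<u8>, ParseError> {
    let len = u32::try_from(demo.payload.len()).map_err(|_| ParseError::LengthMismatch)?;
    let mut out = Vec::with_capacity(8 + demo.payload.len());
    out.extend_from_slice(MAGIC);
    out.extend_from_slice(&len.to_le_bytes());
    out.extend_from_slice(&demo.payload);
    Ok(out)
}

/// Synthetic fixture built in code; no real game asset is involved.
pub fn fixture() -> Vec<u8> {
    b"DEMO\x03\x00\x00\x00abc".to_vec()
}

/// Byte-exact roundtrip: parsing then writing must reproduce the input.
pub fn roundtrips(bytes: &[u8]) -> bool {
    parse(bytes)
        .and_then(|demo| write(&demo))
        .map_or(false, |out| out == bytes)
}

/// Every strict prefix of a valid file must be rejected with an error.
pub fn truncations_rejected(valid: &[u8]) -> bool {
    (0..valid.len()).all(|n| valid.get(..n).map_or(false, |cut| parse(cut).is_err()))
}

/// Inverting any single header byte (magic or length) must be rejected.
pub fn header_flips_rejected(valid: &[u8]) -> bool {
    (0..8usize).all(|i| {
        let mut copy = valid.to_vec();
        match copy.get_mut(i) {
            Some(byte) => *byte ^= 0xFF,
            None => return false,
        }
        parse(&copy).is_err()
    })
}
```

In a real test module each helper would sit behind a #[test] function; because the parser only uses checked slicing, a malformed input can produce an Err but cannot panic.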
The Reserved Field Rule
Game engines are weird, and sometimes they leave mysterious “padding” or “reserved” sections in their binary formats. Every struct field that corresponds to a reserved region must be:
- Stored strictly as a named array (e.g., reserved: [u8; N]) in the format struct.
- Read directly from the source bytes verbatim.
- Written back verbatim during a roundtrip.
If a writer zeroes out or silently drops a reserved field you parsed, we consider that a “lossless bug” – even if the engine doesn’t explicitly seem to use those bytes. If you’re constructing a brand new file from scratch, you can safely write zeroes for reserved regions, but the struct must be capable of storing exactly what it read off disk.
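A minimal sketch of the rule in practice, assuming a made-up 12-byte header (a little-endian u32 version followed by 8 reserved bytes). The Header type, its layout, and the Truncated error are hypothetical and exist only for this example.

```rust
/// Hypothetical error for inputs shorter than the header.
#[derive(Debug, PartialEq, Eq)]
pub struct Truncated;

/// Invented 12-byte header: u32 version, then 8 reserved bytes.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Header {
    pub version: u32,
    /// Stored as a named array so a roundtrip can reproduce it exactly.
    pub reserved: [u8; 8],
}

impl Header {
    pub const SIZE: usize = 12;

    /// Building a new file from scratch: zeroed reserved bytes are fine.
    pub fn new(version: u32) -> Self {
        Self { version, reserved: [0; 8] }
    }

    /// Reads the reserved region verbatim from the source bytes.
    pub fn read(bytes: &[u8]) -> Result<Self, Truncated> {
        let (ver, rest) = bytes.get(..Self::SIZE).ok_or(Truncated)?.split_at(4);
        let mut version = [0u8; 4];
        version.copy_from_slice(ver);
        let mut reserved = [0u8; 8];
        reserved.copy_from_slice(rest);
        Ok(Self { version: u32::from_le_bytes(version), reserved })
    }

    /// Writes the reserved region back exactly as it was read.
    pub fn write(&self) -> Vec<u8> {
        let mut out = Vec::with_capacity(Self::SIZE);
        out.extend_from_slice(&self.version.to_le_bytes());
        out.extend_from_slice(&self.reserved);
        out
    }
}
```

Reading a header whose reserved bytes are non-zero and writing it straight back yields the identical 12 bytes, while Header::new still produces zeroes for files created from scratch.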
Release Process
(TODO: We haven’t cut an official production release yet! Right now we are aggressively building out the rakata-lint engine rules and expanding our format coverage. Once we publish a stable v0.3.0 to crates.io, we’ll formalize our exact release checklist, dependency license refreshes, and CI pipelines here.)