Contributing Guide

Welcome to the Rakata workspace! This guide outlines how we build, how we test, and the core rules for keeping our code clean, compliant, and maintainable.

License Policy

License: All workspace crates use GPL-3.0-or-later.
Third-Party Components: New dependencies must carry a license compatible with GPL-3.0-or-later (e.g., MIT, Apache-2.0, BSD). Add them to THIRD_PARTY_NOTICES.md before merge.

The rules below ensure that everything we build is our own original work and that we never accidentally borrow from other community tools. If you're curious about why we're so strict about this, check out docs/src/legal.md.

Reference Policy: Treat existing tools (like PyKotor) as behavioral references, not copy sources.
No Copy-Paste: Do not copy source code blocks, large comments, or docstrings from third-party sources into Rust files.
Re-Derivation: Derive implementation logic from format documentation, observed behavior (hex dumps), and black-box fixture analysis.
Reverse Engineering:
- Behavior verification via disassembly tools (e.g., Ghidra) is allowed for interoperability analysis.
- Do not copy decompiled code into source files.
- Record findings as paraphrased behavior notes directly in the relevant format specification under docs/src/formats/.

What belongs in the Engine Audits

The entire Rakata format specifications manual (docs/src/formats/) serves as the engine audit layer between reverse engineering and implementation. All Rust code is written strictly from these engine audits (specifically the Engine Audits & Decompilation sections embedded in each format’s blueprint), not from raw decompilation output.

Record: Field names, data types, default values, error conditions, and observable behavioral rules (e.g., “field X is clamped to range 0–100”, “list is sorted ascending by field Y”).
Do not record: Step-by-step algorithmic sequences, control flow structure, or implementation details that go beyond what is needed for interoperability. The test is: could someone implement correct behavior from this note without it dictating a specific code structure?

Format Work vs Engine Reimplementation

Right now, this workspace is exclusively focused on format parsing, linting, and modding tools: reading, writing, and validating the game's actual data files. We are fundamentally just mapping out how the original game structures its data so we can build cool tools around it.

Building an actual game engine replacement (with gameplay logic, AI, and rendering pipelines) is a completely different beast for another day. But that’s exactly why these format blueprints are so critical: if someone wants to build an engine later, they can just use our shared engine audits to understand the data, rather than having to dig through raw decompiled binaries themselves!

Code Style & Linting

Pre-commit Hooks

We use pre-commit to keep the codebase consistently formatted without anyone having to manually police it. After cloning the repository, it’s highly recommended to set up the hooks:

pre-commit install
pre-commit install --hook-type pre-push

This registers two quick automated stages:

pre-commit: Formats your code via cargo fmt --all (auto-fixing it for you) and runs cargo clippy across all targets.
pre-push: Runs cargo test --workspace --all-features to ensure tests are green before you push.

Try to avoid skipping hooks using --no-verify. If a hook catches something, it’s usually just a helpful clippy suggestion or a quick formatting tweak!

Manual Checks

If you don’t like automated hooks and prefer running things manually from the workspace root before committing, you absolutely can:

cargo fmt --all
cargo clippy --workspace --all-targets --all-features
cargo test --workspace --all-features

Note: Passing --all-features to clippy and test is important so that both commands cover optional code paths like serde and tracing! We just ask that fmt and clippy run cleanly before you open a Pull Request.

Idiomatic Rust

To keep the codebase consistently safe, lean, and fast, we heavily rely on a few core Rust principles:

Safe Numeric Casts: To prevent silent truncation bugs, we enforce #![warn(clippy::as_conversions)]. Avoid the raw as keyword; lean on From, TryFrom, or .into(). If a lossy cast is truly unavoidable (like an f32 down to an i32), use a scoped #[allow(clippy::as_conversions)] and add an inline comment explaining why the conversion is acceptable.
No Primitive Obsession: We heavily utilize strongly-typed wrappers (like ResRef) rather than passing raw [u8; 16] or String primitives around.
Strict Error Handling: We explicitly forbid .unwrap() and .unwrap_unchecked() in library code. Errors must propagate cleanly via Result using typed error enums (managed via thiserror).
Composition over Hierarchy: We prefer lean, flat structs and trait combinators over deep, messy object-oriented class hierarchies.
Honest Projections in Typed Views: Typed views over GFF in rakata-generics (Utc, Uti, Are, Git, Dlg, Ifo, Utd, Ute, Utm, Utp, Uts, Utt, Utw) model only the fields they enumerate. from_gff silently drops unmodelled fields and to_gff writes only the modelled ones. Do not add an extra_fields accumulator on the struct; callers that need byte-exact preservation work with the raw Gff tree directly. See Typed Views and Raw GFF for the rationale.
Project-Then-Snapshot Decoded Views: Decoded views (UtiSnapshot, future UtcSnapshot, etc.) split into a scope-free projection (format.project(...)) and a per-scope snapshot (projection.snapshot(&mut cache)). Uti::snapshot(&mut cache) is the single-scope shortcut. All snapshot queries are &self borrow-free reads against eager-resolved cached state; the snapshot does not retain the cache borrow. To query under a different scope, snapshot the same projection again against a different context. See Decoded Views: Projection and Snapshot for the rationale.
Iterators over Loops: We prefer functional iterator chains (map, filter, fold) over maintaining manual mutable state in for loops.
Zero-cost Features: Optional functionality (like serde serialization or tracing telemetry) must introduce absolutely zero overhead when disabled.
Safe by Default: We use #![forbid(unsafe_code)] across all core parser crates to enforce strict memory safety boundaries.
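Several of the principles above (checked casts, a typed wrapper, typed errors, and iterator chains) can be sketched together. This is a minimal illustration, not workspace code: SketchError, NameSketch, checked_offset, round_to_i32, and total_len are all hypothetical names, and the error impls are written by hand only to keep the sketch dependency-free, whereas the workspace manages its error enums via thiserror.

```rust
// Explicit import; already in the prelude from edition 2021 onward.
use std::convert::TryFrom;
use std::fmt;

/// Hypothetical typed error enum (hand-written here instead of `thiserror`).
#[derive(Debug, PartialEq, Eq)]
pub enum SketchError {
    /// An on-disk offset did not fit the target integer type.
    OffsetOverflow(u64),
    /// A resource name exceeded the 16-byte buffer.
    NameTooLong(usize),
}

impl fmt::Display for SketchError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::OffsetOverflow(n) => write!(f, "offset {} does not fit in u32", n),
            Self::NameTooLong(n) => write!(f, "name is {} bytes, limit is 16", n),
        }
    }
}

impl std::error::Error for SketchError {}

/// Hypothetical stand-in for a strongly-typed wrapper such as `ResRef`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct NameSketch([u8; 16]);

impl NameSketch {
    /// Builds a zero-padded name, returning an error instead of panicking.
    pub fn new(name: &str) -> Result<Self, SketchError> {
        let bytes = name.as_bytes();
        let mut buf = [0u8; 16];
        // `get_mut` yields `None` when the name is longer than the buffer.
        let dst = buf
            .get_mut(..bytes.len())
            .ok_or(SketchError::NameTooLong(bytes.len()))?;
        dst.copy_from_slice(bytes);
        Ok(Self(buf))
    }
}

/// Checked narrowing via `TryFrom` instead of `raw as u32`.
pub fn checked_offset(raw: u64) -> Result<u32, SketchError> {
    u32::try_from(raw).map_err(|_| SketchError::OffsetOverflow(raw))
}

/// A lossy float-to-int conversion behind a scoped, documented allow.
#[allow(clippy::as_conversions)]
pub fn round_to_i32(value: f32) -> i32 {
    // Float-to-int `as` saturates at the i32 bounds and maps NaN to 0,
    // so this cast cannot produce an undefined or wrapped result.
    value.round() as i32
}

/// Iterator chain with lossless widening instead of a mutable accumulator.
pub fn total_len(raw_lens: &[u32]) -> u64 {
    raw_lens.iter().copied().map(u64::from).sum()
}
```

Callers get a Result from every fallible step, so a too-long name or an oversized offset surfaces as a typed error value rather than a panic or a silently truncated number.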

Testing & Quality

Our testing approach is a Gray Box strategy: we use our hard-earned white-box knowledge of the game engine (via Ghidra audits) to build strictly validated black-box test cases for our parsers. We want to test against how the real game engine behaves, not against artificial mocks.

When adding a brand new format, please make sure your PR includes:

Fixture-Backed Tests: Full roundtrip coverage using synthetic test files (stored in fixtures/). We never commit real game assets; run cargo test --test gen_fixtures -- --ignored to safely generate them! Byte-exact roundtrip assertions are the gold standard for any format where the engine consumes bytes exactly as written.
Mutation Tests: A quick pass to verify the parser safely rejects malformed or corrupted inputs without panicking (usually wired up via corruption_matrix.rs).
Module Documentation: A clean rustdoc block showing the basic format layout.
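As a rough illustration of the mutation-test idea above, the sketch below checks that a parser returns Err, rather than panicking, on truncated and corrupted input. Everything here is an assumption made for the example: the 8-byte "SKCH" header, ParseError, parse_count, and the two helper functions are hypothetical and do not describe the contents of corruption_matrix.rs.

```rust
/// Hypothetical parse error for the sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum ParseError {
    Truncated { needed: usize, got: usize },
    BadMagic([u8; 4]),
}

/// Copies 4 bytes starting at `at`, or reports how many bytes were needed.
fn take4(bytes: &[u8], at: usize) -> Result<[u8; 4], ParseError> {
    let src = bytes.get(at..at + 4).ok_or(ParseError::Truncated {
        needed: at + 4,
        got: bytes.len(),
    })?;
    let mut out = [0u8; 4];
    out.copy_from_slice(src);
    Ok(out)
}

/// Hypothetical header: 4-byte magic "SKCH", then a little-endian u32 count.
pub fn parse_count(bytes: &[u8]) -> Result<u32, ParseError> {
    let magic = take4(bytes, 0)?;
    if &magic != b"SKCH" {
        return Err(ParseError::BadMagic(magic));
    }
    Ok(u32::from_le_bytes(take4(bytes, 4)?))
}

/// True when every strict prefix of `valid` is rejected with `Err`.
pub fn rejects_all_truncations(valid: &[u8]) -> bool {
    (0..valid.len()).all(|n| {
        valid
            .get(..n)
            .map_or(false, |prefix| parse_count(prefix).is_err())
    })
}

/// True when inverting any single magic byte is rejected with `Err`.
pub fn rejects_corrupt_magic(valid: &[u8]) -> bool {
    (0..4).all(|i| {
        let mut bytes = valid.to_vec();
        match bytes.get_mut(i) {
            Some(b) => {
                *b ^= 0xFF;
                parse_count(&bytes).is_err()
            }
            None => false,
        }
    })
}
```

In a real test file, helpers like these would be called from a #[test] function against a synthetic fixture; any panic inside the parser fails the test, which is the property being checked.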

The Reserved Field Rule

Game engines are weird, and sometimes they leave mysterious “padding” or “reserved” sections in their binary formats. Every struct field that corresponds to a reserved region must be:

Stored strictly as a named array (e.g., reserved: [u8; N]) in the format struct.
Read directly from the source bytes verbatim.
Written back verbatim during a roundtrip.

If a writer zeroes out or silently drops a reserved field you parsed, we consider that a lossless-roundtrip bug, even if the engine doesn't explicitly seem to use those bytes. If you're constructing a brand new file from scratch, you can safely write zeroes for reserved regions, but the struct must be capable of storing exactly what it read off disk.
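The rule can be sketched on a made-up layout. HeaderSketch and its 8-byte format (a little-endian u32 count followed by 4 reserved bytes) are hypothetical and exist only to show the read-verbatim, write-verbatim pattern; a real format struct would return a typed error rather than Option.

```rust
/// Hypothetical 8-byte header: little-endian u32 count, then 4 reserved bytes.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct HeaderSketch {
    pub count: u32,
    /// Reserved region, stored exactly as it was read.
    pub reserved: [u8; 4],
}

impl HeaderSketch {
    /// A header built from scratch may use zeroes for the reserved region.
    pub fn new(count: u32) -> Self {
        Self { count, reserved: [0; 4] }
    }

    /// Reads the header, copying the reserved bytes verbatim.
    /// Returns `None` when fewer than 8 bytes are available.
    pub fn read(bytes: &[u8]) -> Option<Self> {
        let mut raw_count = [0u8; 4];
        raw_count.copy_from_slice(bytes.get(0..4)?);
        let mut reserved = [0u8; 4];
        reserved.copy_from_slice(bytes.get(4..8)?);
        Some(Self {
            count: u32::from_le_bytes(raw_count),
            reserved,
        })
    }

    /// Writes the header, emitting the stored reserved bytes unchanged.
    pub fn write(&self) -> [u8; 8] {
        let mut out = [0u8; 8];
        out[..4].copy_from_slice(&self.count.to_le_bytes());
        out[4..].copy_from_slice(&self.reserved);
        out
    }
}
```

Reading a header whose reserved bytes are non-zero and writing it back yields the same 8 bytes, which is exactly what a byte-exact roundtrip assertion would check.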

Release Process

(TODO: We haven't cut an official production release yet! Right now we are aggressively building out the rakata-lint engine rules and expanding our format coverage. Once we publish a stable v0.3.0 to crates.io, we'll formalize our exact release checklist, dependency license refreshes, and CI pipelines here.)
