Rakata

Rakata is a clean-room Rust implementation of Knights of the Old Republic (KotOR) data formats and tooling. It provides a modular workspace designed for robust, type-safe, and canonical handling of Odyssey Engine game data.

This Wiki serves as the definitive reference manual for KOTOR Formats and Engine Behaviors, designed to decouple format knowledge from the underlying Rust source code.

Requirements

Rust Version: 1.85.0 or newer.

Documentation Domains

Rakata’s documentation operates on two tiers: the Software API and the Format Specifications.

1. The Workspace (Code API)

The workspace is organized into focused crates and tools. If you are developing against Rakata and need to know the semantic layout of types, functions, and data structures, refer to the respective Rustdocs:

Libraries (`crates/`)

rakata-core: Foundational primitives (ResRef, ResourceType, ResourceId) and core utilities (encoding, filesystem, detection).
rakata-formats: Binary and text format readers/writers for 19 KotOR formats including GFF, ERF, RIM, KEY/BIF, MDL/MDX, TPC, TGA, and more.
rakata-generics: Typed wrappers around GFF-backed resources (all 13 types: UTW, UTC, UTI, etc.). from_gff / to_gff are honest projections that model only the enumerated fields; byte-exact preservation is the raw Gff tree’s job.
rakata-extract: Resource resolution logic, composite module handling (.mod + _s.rim + _dlg.erf), and game-wide resource access (GameResources).
rakata-lint: Comprehensive resource validation against engine-derived field schemas. Catches crash-causing mod errors across all formats before they hit the engine.
rakata-save: Save game parsing and modification logic.
rakata: Facade crate re-exporting the ecosystem.

Tools (`tools/`)

rakata-saveeditor: Desktop GUI application for editing save games.
vanilla-inspector: Corpus validation tool for testing format implementations against all vanilla game assets.

🔗 View Rakata Rustdocs

2. Format Specifications (This Wiki)

The entire formats/ specification manual effectively serves as Rakata’s formal Evidence Log. If you need to understand binary structure, historical context, or how the original swkotor.exe engine interprets byte bounds under the hood (via Ghidra-backed engine constraints), you are in the right place!

Navigate through the sidebar to explore our exhaustive, decoupled format libraries:

Archive Formats – Detailed overviews of encapsulated containers (BIF, KEY, ERF, RIM).
GFF Structure – The bedrock of KOTOR’s data, exposing the 13 distinct blueprint constraints (Creatures, Dialogues, Triggers, etc.).
3D Models & Mesh – MDL/MDX structures and binary walkmesh topologies.
Textures & Audio – Overviews detailing graphic compression (TPC, DDS) and MP3/Miles Sound System wrappers.
Text & Data Formats – Localized Talk Tables (TLK), rule mappings (2DA), and hierarchical layout geometries (LYT, VIS).

Ready to dive in? Head over to the Goals & Roadmap to see where the project is heading, or look into the Architecture logic that powers the Rakata suite.

Project Roadmap

This document outlines what we’re tinkering with in rakata and where the project is heading.

Note

For day-to-day progress, bug fixes, and specific technical tasks, check out the Codeberg Issues tracker instead.

The End Goal

Right now, our libraries are mostly just good at reading and writing individual game files - like extracting a 3D model, opening a save file, or decoding audio. But the real dream for rakata is to build a full, modern KOTOR engine integration.

Eventually, it would be cool to tie all these isolated pieces together into an actual rendering pipeline. For example: dropping a vanilla model into an active window and having the engine stream the textures and background audio straight from the game data.

How We Get There

Since this is a passion project, we try to match the original game behavior down to the exact byte before building higher-level abstractions on top of it. It takes a bit longer, but it keeps us from having to constantly rewrite core parsers when we stumble into weird edge cases.

1. Laying the Foundation (Mostly Done)

Our core libraries (rakata-formats, rakata-save, etc.) can currently read, write, and safely roundtrip over 17 different KOTOR file formats. We’ve tackled a lot of the weird legacy archives (BIF), models (MDL/MDX), and raw textures (TPC), ensuring they line up with vanilla behavior.

However, the foundation is still growing! We still have a handful of outstanding data formats to map out and implement, including Pathfinding (PTH), UI Layouts (GUI), and Walkmeshes (WOK/DWK/PWK).

Additionally, formatting and bytecode support for NCS (Compiled Scripts) is actively being prioritized (see Issue #19) to allow rakata to interface natively with upcoming Rust-based community compilers and decompilers.

2. Building Real Tools (Our Active Focus)

Now that we can parse the data reliably, we are building stuff the community can actually use:

Mod Linter: A tool to scan modded files and point out if they break the game’s actual data constraints, catching crashes before you load them in-game.
Save Editor: A basic offline save editor (rakata-saveeditor) built directly on top of our stable format parsers.
Audio Streaming: Updating the generic audio logic (rakata-audio) so we can natively stream game music and voice lines instead of loading giant buffers into memory.
Drop-in Replacements: Providing modern, reliable drop-in replacements for legendary (but aging) community tools. By backing these with rakata’s strict parsing rules, we can offer faster, safer, cross-platform native tools for unpacking archives, compiling models, and building mods. (Note: While we aim to replace these tools, we will not inherit their legacy bugs or non-vanilla API quirks. When in doubt, the original game engine is our only source of truth).

3. KOTOR 2 (TSL) Support

We are strictly focusing on KOTOR 1 right now, but extending parsing support for TSL via compatibility flags is a planned enhancement for further down the line once K1 is completely stabilized.

4. The Runtime Engine (The dream but probably a few years away)

Once our standalone tools prove that our format parsers are perfectly stable, we have a pipe dream to one day start weaving them together into a natively synchronized rendering loop.

Architecture Guide

This document outlines how the rakata workspace is structured and the design principles we try to stick to.

Core Principles

Vanilla K1 First
- By default, we target the original vanilla behavior of KotOR 1.
- Compatibility for TSL or community tools is strictly opt-in behind feature flags, not the default assumption.
- When deciding how to parse something, the original game engine is our ultimate source of truth. We use local fixtures and original game data to prove our parsers work, rather than just copying how older community tools did things.
Aim for Lossless
- We want to be able to read a file and write it back out to the exact same bytes. We’ve largely achieved this for standard archives and data formats (GFF, ERF, RIM, KEY, TLK, etc.).
- For highly complex formats (like MDL/MDX models), there are some known divergences where achieving a byte-exact roundtrip is essentially impossible due to how the original compilers ordered geometry blocks. We track these exceptions, but the output still safely runs in-game.
- No Lazy Pass-throughs: If a file has undocumented fields, we don’t just read them as an opaque Vec<u8> blob and blindly pass them through. Our goal is to properly reverse-engineer and map every single struct boundary. However, if we identify defined “reserved” fields in the binary layout that we haven’t cracked the meaning of yet, we will map them as properly sized reserved values so we don’t accidentally drop data the engine might rely on. (Note: explicit blank padding bytes aren’t stored in memory at all - we just recalculate those dynamically on write).
- Layer scope: this lossless guarantee applies to the byte-level format layer (rakata-formats). Typed views in rakata-generics (Utc, Uti, Are, …) are explicitly honest projections that model only the fields they enumerate; byte-exact preservation stays with the raw Gff tree. See Typed Views and Raw GFF below for the full rule.
Strict Text Handling
- All text decoding goes through rakata-core::text.
- Localized text (TLK entries, strings) uses language-aware encodings (Windows-1252, Shift-JIS, etc.) to match what the engine expects.
- Binary strings (like node names or texture paths) use TextEncoding::Windows1252 since that’s what the engine actually uses under the hood. No silently stripping weird characters with lossless backups.

(For day-to-day coding rules around iterators, zero-cost abstractions, and memory safety, see the Idiomatic Rust section in the contributing.md guide!)

Workspace Boundaries

Note: This layout is a living target! Some of these crates (like rakata-audio and rakata-saveeditor) are currently under active development. As we tackle our near-term roadmap goals – like building out the rakata-lint validation engine – expect these existing crates to flesh out, alongside brand new sibling crates being added to the ecosystem.

The workspace is organized in a clean dependency chain. Crates can only depend on crates listed “above” them:

rakata-core          (no workspace deps)
  rakata-formats     (depends on: core)
    rakata-audio     (depends on: core, formats)
    rakata-generics  (depends on: core, formats)
    rakata-extract   (depends on: core, formats)
    rakata-lint      (depends on: formats, generics)
    rakata-save      (depends on: core, formats)
rakata               (facade: re-exports all library crates)

Library Crates (`crates/`)

rakata-core: The absolute basics (ResRef, IDs) and core utilities like file streams and text encoding.
rakata-formats: Our massive library of parsers and writers (GFF, ERF, BIF, MDL, TPC, etc.). This parses bytes into objects, but doesn’t know anything about how the game actually uses them.
rakata-audio: Audio streaming and decoding for the engine’s various sound formats (WAV, ADPCM, MP3).
rakata-generics: Strongly-typed Rust models for all the different GFF files (like Doors, Items, Characters).
rakata-extract: The logic for hunting down actual game files in the wild. It knows how to look inside ERFs, check the Override folder, and resolve files just like the engine does.
rakata-lint: Our rule engine for scanning modded files and checking them against vanilla schema constraints.
rakata-save: High-level logic for safely reading, editing, and backing up save files.
rakata: A handy facade crate that re-exports everything so you only need to add one dependency.

Tool Crates (`tools/`)

rakata-saveeditor: The actual desktop application for editing save files.
vanilla-inspector: A testing utility for validating our parsers against the actual mass of game files.

Format API Guidelines

Public API Shape

Every format parser in rakata-formats generally provides the same clean interface:

read_<fmt><R: Read>(reader: &mut R) -> Result<T, E>
read_<fmt>_from_bytes(bytes: &[u8]) -> Result<T, E>
write_<fmt><W: Write>(writer: &mut W, data: &T) -> Result<(), E>
write_<fmt>_to_vec(data: &T) -> Result<Vec<u8>, E>

Formats with multiple output modes (like exporting models to ASCII text or JSON) just use variations of these names (for example, read_mdl_ascii()).

Generic Traits: We strongly prefer accepting generic I/O trait bounds (Read, BufRead, Write, Seek) over concrete types. Accept the narrowest trait that covers your API’s needs so callers aren’t forced to jump through hoops.
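The four-function shape above can be sketched for a made-up format. Everything here is hypothetical: `Toy`, its `TOY ` magic, and the `read_toy` / `write_toy` names exist only for illustration, and `std::io::Error` stands in for the per-format error enum a real module would define.

```rust
use std::io::{self, Read, Write};

/// Hypothetical toy format: a 4-byte magic "TOY " followed by a little-endian u32.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Toy {
    pub value: u32,
}

const TOY_MAGIC: [u8; 4] = *b"TOY ";

/// Reads a `Toy` from any `Read` source (the narrowest trait this parser needs).
pub fn read_toy<R: Read>(reader: &mut R) -> io::Result<Toy> {
    let mut magic = [0u8; 4];
    reader.read_exact(&mut magic)?;
    if magic != TOY_MAGIC {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "bad magic"));
    }
    let mut raw = [0u8; 4];
    reader.read_exact(&mut raw)?;
    Ok(Toy { value: u32::from_le_bytes(raw) })
}

/// Convenience wrapper for callers that already hold the bytes in memory.
pub fn read_toy_from_bytes(bytes: &[u8]) -> io::Result<Toy> {
    let mut cursor = bytes; // `&[u8]` implements `Read`
    read_toy(&mut cursor)
}

/// Writes a `Toy` to any `Write` sink.
pub fn write_toy<W: Write>(writer: &mut W, data: &Toy) -> io::Result<()> {
    writer.write_all(&TOY_MAGIC)?;
    writer.write_all(&data.value.to_le_bytes())
}

/// Convenience wrapper that returns an owned buffer.
pub fn write_toy_to_vec(data: &Toy) -> io::Result<Vec<u8>> {
    let mut out = Vec::with_capacity(8);
    write_toy(&mut out, data)?;
    Ok(out)
}
```

The `_from_bytes` and `_to_vec` helpers are thin wrappers over the generic functions, so the parsing logic lives in one place and callers with a file, a socket, or a slice all reach it without adapters.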

Error Handling

Robust parsing means strict error boundaries:

Each format module must define its own domain-specific error enum (e.g., GffError, ErfError) using the thiserror crate. Do not use generic stringly-typed errors or Box<dyn Error>.
Low-level read failures (like sudden bounds exhaustion or bad magic numbers) should wrap our shared BinaryLayoutError.
Never unwrap() at an API boundary! Only fail explicitly via Result or use .expect() with a hardcoded rationale if it is impossible to fail.
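As a rough illustration of these rules, the sketch below layers a per-format enum over a shared low-level error. It is written with std only so it stands alone; in the workspace the `Display` and `From` boilerplate would presumably come from `thiserror` derives instead. `ToyError`, `parse_toy_header`, and the variants shown for `BinaryLayoutError` are assumptions, since the real definitions are not part of this document.

```rust
use std::error::Error;
use std::fmt;

/// Stand-in for the shared low-level error (variants are assumed, not the real ones).
#[derive(Debug, PartialEq, Eq)]
pub enum BinaryLayoutError {
    UnexpectedEof { needed: usize, available: usize },
    BadMagic { found: [u8; 4] },
}

/// Hypothetical domain-specific error enum for a toy format.
#[derive(Debug, PartialEq, Eq)]
pub enum ToyError {
    Layout(BinaryLayoutError),
    UnsupportedVersion(u8),
}

impl fmt::Display for BinaryLayoutError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::UnexpectedEof { needed, available } => {
                write!(f, "needed {} bytes, only {} available", needed, available)
            }
            Self::BadMagic { found } => write!(f, "bad magic {:?}", found),
        }
    }
}

impl fmt::Display for ToyError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::Layout(_) => write!(f, "malformed binary layout"),
            Self::UnsupportedVersion(v) => write!(f, "unsupported version {}", v),
        }
    }
}

impl Error for BinaryLayoutError {}

impl Error for ToyError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            Self::Layout(inner) => Some(inner),
            Self::UnsupportedVersion(_) => None,
        }
    }
}

/// Lets `?` lift low-level failures into the format's own error type.
impl From<BinaryLayoutError> for ToyError {
    fn from(inner: BinaryLayoutError) -> Self {
        Self::Layout(inner)
    }
}

/// Parses "TOY " + a version byte, failing via `Result` instead of panicking.
pub fn parse_toy_header(bytes: &[u8]) -> Result<u8, ToyError> {
    let header = bytes.get(..5).ok_or(BinaryLayoutError::UnexpectedEof {
        needed: 5,
        available: bytes.len(),
    })?;
    let found = [header[0], header[1], header[2], header[3]];
    if found != *b"TOY " {
        return Err(BinaryLayoutError::BadMagic { found }.into());
    }
    match header[4] {
        1 => Ok(1),
        other => Err(ToyError::UnsupportedVersion(other)),
    }
}
```

Callers can match on `ToyError` variants directly, or walk `source()` when they only want to log the chain.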

Memory & Ownership

While we try to avoid deep cloning and heavy allocations behind the scenes, we default to owned data types when crossing public API boundaries. Unless a module is explicitly built and documented as a zero-copy “View” type, you should avoid passing nasty lifetimes into the caller’s lap.

Keeping Concerns Separated

Dumb Parsers: Format modules in rakata-formats are intentionally “dumb”. They solely translate between raw byte streams and Rust structs without any awareness of game architecture, filesystems, or what a “module” is.
Smart Extractors: All the messy environment logic – hunting down loose files, enforcing vanilla precedence rules (e.g., checking the Override folder before extracting from a BIF archive), and assembling composite files – lives safely isolated inside rakata-extract. This separation guarantees our parsers can cleanly process isolated test files just as well as they operate in a massive live-game workflow.

Tracing & Telemetry

We strongly encourage instrumenting format parsers with tracing::instrument spans to help pinpoint exactly where a badly formed file breaks during a parse. However, this telemetry must remain entirely zero-cost for consumers who don’t need it! We achieve this by wrapping public parser entry points in conditional attributes: #[cfg_attr(feature = "tracing", tracing::instrument(...))]. If a user doesn’t explicitly opt-in via their Cargo.toml, the Rust compiler strips the instrumentation entirely.
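A minimal sketch of that conditional attribute on a made-up entry point follows. The function name, the `TruncatedInput` error, and the Cargo feature wiring in the comment are assumptions for illustration; only the `cfg_attr` pattern itself comes from the text above.

```rust
// Assumed Cargo.toml wiring (not taken from the real workspace):
//
// [features]
// tracing = ["dep:tracing"]
//
// [dependencies]
// tracing = { version = "0.1", optional = true }

/// Hypothetical error for the toy parser below.
#[derive(Debug, PartialEq, Eq)]
pub struct TruncatedInput;

/// Hypothetical public parser entry point.
///
/// With the `tracing` feature off, the attribute below expands to nothing,
/// so the compiled function is identical to an uninstrumented one.
#[cfg_attr(feature = "tracing", tracing::instrument(skip(bytes)))]
pub fn read_toy_value(bytes: &[u8]) -> Result<u32, TruncatedInput> {
    match bytes {
        // Take the first four bytes as a little-endian u32.
        [a, b, c, d, ..] => Ok(u32::from_le_bytes([*a, *b, *c, *d])),
        _ => Err(TruncatedInput),
    }
}
```

Skipping the `bytes` argument keeps large buffers out of the span's recorded fields when the feature is enabled.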

Serialization (Serde)

Just like tracing, serde support for exporting our parsed files to JSON or YAML must be treated as a zero-cost, opt-in feature. Format structs and types should generously derive Serialize and Deserialize when the serde feature flag is enabled. This allows downstream utilities (like the Save Editor) to effortlessly convert memory layouts into text formats, while ensuring the core parsers stay extremely light for purely binary-focused applications.

Beyond Basic Parsing

While rakata-formats gives us the ability to parse isolated bytes, the game engine is much more complicated. Our higher-level crates exist to bridge that gap between “dumb bytes” and “actual game logic”.

Finding Files (`rakata-extract`)

rakata-extract handles the messy reality of finding files scattered across a massive KOTOR installation. It mirrors the vanilla engine’s lookup hierarchy in three distinct layers:

Primitives: Grabbing a file out of a single archive (like unpacking a standalone ERF or BIF file).
Composition: Treating related archive sets as a single “Module” (like grouping a .mod file with its matching _s.rim and _dlg.erf files so they load transparently together).
Game-wide: Creating a unified GameResources tree that maps out the entire game installation.

Because we want our extraction to perfectly mirror vanilla behavior, lookups are strictly case-insensitive, and loading precedence is explicitly designed to mirror how the original game works (so a file in the Override folder automatically beats a file buried in a BIF archive).
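The two rules in that paragraph (case-insensitive names, Override beating BIF) can be sketched with a toy resolver. `ToyResources` and `Source` are hypothetical and far simpler than the real `GameResources`; the sketch models only the two layers the text names.

```rust
use std::collections::HashMap;

/// Which layer a resource came from (only the two layers named above are modelled).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Source {
    Override,
    Bif,
}

/// Hypothetical two-layer resource index keyed by lowercased file name.
#[derive(Default)]
pub struct ToyResources {
    override_dir: HashMap<String, Vec<u8>>,
    bif: HashMap<String, Vec<u8>>,
}

impl ToyResources {
    /// Registers a resource; names are normalised so later lookups ignore case.
    pub fn insert(&mut self, source: Source, name: &str, data: Vec<u8>) {
        let key = name.to_ascii_lowercase();
        match source {
            Source::Override => self.override_dir.insert(key, data),
            Source::Bif => self.bif.insert(key, data),
        };
    }

    /// Resolves a name, checking the Override layer before the BIF layer.
    pub fn resolve(&self, name: &str) -> Option<(Source, &[u8])> {
        let key = name.to_ascii_lowercase();
        if let Some(data) = self.override_dir.get(&key) {
            return Some((Source::Override, data.as_slice()));
        }
        self.bif.get(&key).map(|data| (Source::Bif, data.as_slice()))
    }
}
```

Returning the `Source` alongside the bytes lets a caller such as a linter report which layer won the lookup.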

Strongly-Typed Data (`rakata-generics`)

When we parse a .utc Character file, rakata-formats just hands us a raw GFF tree of untyped labels and values. rakata-generics wraps those raw data blobs in strongly-typed Rust structs (Utc, Uti, Are, Git, Dlg, Ifo, and friends). This guarantees that if a developer needs to access a character’s “Strength” stat, they get a guaranteed u8 property rather than blindly guessing string handles inside a raw binary tree.

Typed Views and Raw GFF

These typed structs sit beside the raw Gff tree, not on top of it. They are projections, not replacements. You construct one with Uti::from_gff(&gff) and round-trip back with uti.to_gff(); the original Gff stays accessible the whole time.

The projection layer follows one load-bearing rule: model what’s enumerated; drop what isn’t. from_gff extracts the fields each typed view documents and silently ignores anything else; to_gff writes only those documented fields. There is intentionally no extra_fields: Vec<GffField> accumulator on Utc / Uti / Are / etc. that would round-trip unmodelled fields through the typed layer.

The reason is correctness. Unmodelled fields often depend semantically on neighbouring fields (a savegame’s animation state only makes sense at the exact moment of save; a toolset’s custom annotations describe a specific revision). If the typed view silently preserved them while a caller edited a modelled field, the output would be internally inconsistent. The staleness contract is real, but it belongs explicitly with whoever needs byte-exact preservation, not buried inside a layer whose only job is type-safe access to known fields.

This naturally splits into two audiences served by one storage layer:

Tools (save editor, mod linter, format inspector) reach for the typed views. They want type-safe access to known fields and don’t care about unmodelled bytes.
Engines or byte-fidelity workflows (a future engine shim, a roundtrip auditor, anyone preserving toolset annotations) work directly with the raw Gff from rakata-formats. They own the staleness contract explicitly.

The one place where projection meets enumeration-by-design is rakata_generics::decoded::DecodedProperty. UTI item properties carry a PropertyName that indexes into itempropdef.2da, a table mods can extend with new rows. The enum has an Unknown variant that preserves the raw fields for one entry within an enumerated list, so an unrecognized property kind still surfaces through the decode pass instead of being dropped. It is a per-entry catch-all, not a struct-level accumulator, and the staleness risk is low because property entries are independent records.

When in doubt: if you need byte-exact preservation across a parse-then-write cycle, work with the raw Gff. If you need ergonomic, type-safe access to the fields Rakata has audited, work with the typed view.
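To make the "model what's enumerated; drop what isn't" rule concrete, here is a small sketch using stand-in types. `ToyGff`, `ToyValue`, and `ToyUtc` are hypothetical simplifications of the real `Gff` and `Utc`, and falling back to a default when a field is missing is an assumption of this sketch.

```rust
use std::collections::BTreeMap;

/// Stand-in for a GFF field value (the real tree has many more kinds).
#[derive(Debug, Clone, PartialEq)]
pub enum ToyValue {
    Byte(u8),
    Text(String),
}

/// Stand-in for the raw GFF tree: label -> value.
pub type ToyGff = BTreeMap<String, ToyValue>;

/// Hypothetical typed view that enumerates exactly two fields.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ToyUtc {
    pub tag: String,
    pub strength: u8,
}

impl ToyUtc {
    /// Extracts the enumerated fields and ignores everything else.
    pub fn from_gff(gff: &ToyGff) -> Self {
        let tag = match gff.get("Tag") {
            Some(ToyValue::Text(text)) => text.clone(),
            _ => String::new(), // assumed default for a missing or mistyped field
        };
        let strength = match gff.get("Str") {
            Some(ToyValue::Byte(value)) => *value,
            _ => 0, // assumed default
        };
        ToyUtc { tag, strength }
    }

    /// Writes only the enumerated fields; there is no extra_fields accumulator.
    pub fn to_gff(&self) -> ToyGff {
        let mut gff = ToyGff::new();
        gff.insert("Tag".to_string(), ToyValue::Text(self.tag.clone()));
        gff.insert("Str".to_string(), ToyValue::Byte(self.strength));
        gff
    }
}
```

A caller that needs an unmodelled field (a toolset annotation, say) keeps the original raw tree and edits that directly, which is the staleness contract described above.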

Decoded Views: Projection and Snapshot

The typed structs (Uti, Utc, etc.) bring file-native fields into Rust types. A second layer on top, the decoded view, resolves those file-native fields against external context. For UTI, that context is the 2DA tables the engine consults at item-property evaluation time: itempropdef.2da for property-kind dispatch, baseitems.2da for combat / equip metadata, the iprp_* cost tables for magnitude resolution.

A decoded view splits into two stages so cross-scope analysis is a first-class operation:

Uti::project(itempropdef) -> UtiProjection<'_>. File-native typed-variant dispatch. Cheap, scope-free, takes only the minimal context (the property-kind dispatch table) needed to pick variants. The projection is the intermediate from which one or many snapshots are built.
UtiProjection::snapshot(&mut TwoDaCache) -> UtiSnapshot<'_>. Resolves the projection against a full per-scope context. Loads every table the snapshot’s query methods could need and caches the resolved values. All query methods on the snapshot are &self borrow-free reads against that cached snapshot.
Uti::snapshot(&mut TwoDaCache) -> UtiSnapshot<'_>. Single-scope shortcut for project(...).snapshot(...). Most callers want this.

The split exists because tools, the linter, and a future engine shim want to ask “what does this UTI look like under condition X” without re-running the file-native dispatch step for each context. Mod conflict analysis (does this item resolve differently with mod A loaded?), vanilla-vs-modded diffs, and the upcoming save / module VFS scoping (Codeberg #27, #28) all reduce to “build one projection, snapshot under several contexts, compare.” The projection step is shared across snapshots; only the per-scope resolution repeats.

Snapshots do not retain the cache borrow once constructed. To query a snapshot under a different scope, call projection.snapshot(&mut other_cache) again on the same projection. The typed-variant dispatch is not redone.
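The "build one projection, snapshot under several contexts, compare" flow might look roughly like this. All types are hypothetical stand-ins: the real `Uti::project` takes a dispatch table and the real types carry lifetimes, both of which this sketch omits for brevity.

```rust
use std::collections::HashMap;

/// Stand-in for a per-scope 2DA cache: table name -> row labels.
#[derive(Default)]
pub struct ToyCache {
    tables: HashMap<String, Vec<String>>,
    /// Counts row lookups so the sketch can show the cache stays usable afterwards.
    pub loads: usize,
}

impl ToyCache {
    pub fn with_table(mut self, name: &str, rows: &[&str]) -> Self {
        let rows = rows.iter().map(|row| row.to_string()).collect();
        self.tables.insert(name.to_string(), rows);
        self
    }

    fn row(&mut self, table: &str, index: usize) -> Option<String> {
        self.loads += 1;
        self.tables.get(table).and_then(|rows| rows.get(index)).cloned()
    }
}

/// Stand-in for a file-native item: each property is a row index.
pub struct ToyItem {
    pub property_ids: Vec<usize>,
}

/// Scope-free intermediate, built once and shared across snapshots.
pub struct ToyProjection {
    property_ids: Vec<usize>,
}

/// Per-scope result; owns its resolved values and holds no cache borrow.
pub struct ToySnapshot {
    labels: Vec<Option<String>>,
}

impl ToyItem {
    pub fn project(&self) -> ToyProjection {
        ToyProjection { property_ids: self.property_ids.clone() }
    }
}

impl ToyProjection {
    /// Eagerly resolves every property against one scope's tables.
    pub fn snapshot(&self, cache: &mut ToyCache) -> ToySnapshot {
        let labels = self
            .property_ids
            .iter()
            .map(|&id| cache.row("itempropdef", id))
            .collect();
        ToySnapshot { labels }
    }
}

impl ToySnapshot {
    /// Plain `&self` read against the values cached at snapshot time.
    pub fn property_label(&self, index: usize) -> Option<&str> {
        self.labels.get(index).and_then(|label| label.as_deref())
    }
}
```

Because the snapshot owns its resolved labels, two snapshots from different caches can be held and compared side by side, which is the shape a vanilla-vs-modded diff needs.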

The cost-table magnitude resolution recipe each UTI snapshot bakes in is documented in the Cost-Table Magnitude Resolution subsection of the UTI engine audit: which iprp_costtable.2da index every typed property kind dispatches through, which column carries the magnitude, and which handlers bypass the dispatch chain entirely.

UTC follows the same shape with format-specific differences: Utc::project() takes no minimal context (UTC has no single dispatch table; typed list dispatch happens at snapshot time against per-list 2DAs), while UtcProjection::snapshot(&mut TwoDaCache) loads racialtypes.2da / appearance.2da / portraits.2da / soundset.2da / classes.2da / spells.2da and caches scalar-id resolutions, typed DecodedClass variants, and typed DecodedSpecialAbility variants. UtcSnapshot exposes the same &self borrow-free query surface (race_label, classes, total_level, is_force_user, is_droid, has_class, special_abilities, equipment, inventory, etc.). AreSnapshot and any future generic that grows a decoded view follow the same two-stage rule.

High-Level Interaction (`rakata-save` & `rakata-lint`)

Finally, crates at the top of the stack use our extraction logic and strongly typed generic structs to actually do things. rakata-lint compares typed structs against vanilla constraints to catch modding errors, while rakata-save gracefully handles unpacking, editing, and re-compressing massive save-game directories without corrupting the player’s campaign!

Contributing Guide

Welcome to the Rakata workspace! This guide outlines how we build, how we test, and the core rules for keeping our code clean, compliant, and maintainable.

License Policy

License: All workspace crates use GPL-3.0-or-later.
Third-Party Components: New dependencies must be compatible (MIT, Apache-2.0, BSD). Add them to THIRD_PARTY_NOTICES.md before merge.

Clean Room Implementation

To ensure everything we build is 100% our own original work and we aren’t accidentally borrowing from other community tools, we follow the rules below (if you’re curious about why we’re so strict about this, check out docs/src/legal.md):

Reference Policy: Treat existing tools (like PyKotor) as behavioral references, not copy sources.
No Copy-Paste: Do not copy source code blocks, large comments, or docstrings from third-party sources into Rust files.
Re-Derivation: Derive implementation logic from format documentation, observed behavior (hex dumps), and black-box fixture analysis.
Reverse Engineering:
- Behavior verification via disassembly tools (e.g., Ghidra) is allowed for interoperability analysis.
- Do not copy decompiled code into source files.
- Record findings as paraphrased behavior notes natively within the relevant format specification under docs/src/formats/.

What belongs in the Engine Audits

The entire Rakata format specifications manual (docs/src/formats/) serves as the engine audit layer between reverse engineering and implementation. All Rust code is written strictly from these engine audits (specifically the Engine Audits & Decompilation sections embedded in each format’s blueprint), not from raw decompilation output.

Record: Field names, data types, default values, error conditions, and observable behavioral rules (e.g., “field X is clamped to range 0–100”, “list is sorted ascending by field Y”).
Do not record: Step-by-step algorithmic sequences, control flow structure, or implementation details that go beyond what is needed for interoperability. The test is: could someone implement correct behavior from this note without it dictating a specific code structure?

Format Work vs Engine Reimplementation

Right now, this workspace is exclusively focused on format parsing, linting, and modding tools - reading, writing, and validating the game’s actual data files. We are fundamentally just mapping out how the original game structures its data so we can build cool tools around it.

Building an actual game engine replacement (with gameplay logic, AI, and rendering pipelines) is a completely different beast for another day. But that’s exactly why these format blueprints are so critical: if someone wants to build an engine later, they can just use our shared engine audits to understand the data, rather than having to dig through raw decompiled binaries themselves!

Code Style & Linting

Pre-commit Hooks

We use pre-commit to keep the codebase consistently formatted without anyone having to manually police it. After cloning the repository, it’s highly recommended to set up the hooks:

pre-commit install
pre-commit install --hook-type pre-push

This registers two quick automated stages:

pre-commit: Formats your code via cargo fmt --all (auto-fixing it for you) and runs cargo clippy across all targets.
pre-push: Runs cargo test --workspace --all-features to ensure tests are green before you push.

Try to avoid skipping hooks using --no-verify. If a hook catches something, it’s usually just a helpful clippy suggestion or a quick formatting tweak!

Manual Checks

If you don’t like automated hooks and prefer running things manually from the workspace root before committing, you absolutely can:

cargo fmt --all
cargo clippy --workspace --all-targets --all-features
cargo test --workspace --all-features

Note: Passing --all-features to clippy and test is important so they catch optional code paths like serde and tracing! We just ask that fmt and clippy run cleanly before you open a Pull Request.

Idiomatic Rust

To keep the codebase consistently safe, lean, and fast, we heavily rely on a few core Rust principles:

Safe Numeric Casts: To prevent silent truncation bugs, we enforce #![warn(clippy::as_conversions)]. Avoid the raw as keyword; lean on From, TryFrom, or .into(). If a lossy cast is truly unavoidable (like an f32 down to an i32), use a scoped #[allow(clippy::as_conversions)] and drop an inline comment explaining why it’s safe.
No Primitive Obsession: We heavily utilize strongly-typed wrappers (like ResRef) rather than passing raw [u8; 16] or String primitives around.
Strict Error Handling: We explicitly forbid .unwrap() and .unwrap_unchecked() in library code. Everything must propagate cleanly via Result using typed error enums (managed via thiserror).
Composition over Hierarchy: We prefer lean, flat structs and trait combinators over deep, messy object-oriented class hierarchies.
Honest Projections in Typed Views: Typed views over GFF in rakata-generics (Utc, Uti, Are, Git, Dlg, Ifo, Utd, Ute, Utm, Utp, Uts, Utt, Utw) model only the fields they enumerate. from_gff silently drops unmodelled fields and to_gff writes only the modelled ones. Do not add an extra_fields accumulator on the struct; callers that need byte-exact preservation work with the raw Gff tree directly. See Typed Views and Raw GFF for the rationale.
Project-Then-Snapshot Decoded Views: Decoded views (UtiSnapshot, UtcSnapshot, etc.) split into a scope-free projection (format.project(...)) and a per-scope snapshot (projection.snapshot(&mut cache)). Uti::snapshot(&mut cache) is the single-scope shortcut. All snapshot queries are &self borrow-free reads against eager-resolved cached state; the snapshot does not retain the cache borrow. To query under a different scope, snapshot the same projection again against a different context. See Decoded Views: Projection and Snapshot for the rationale.
Iterators over Loops: We prefer functional iterator chains (map, filter, fold) over maintaining manual mutable state in for loops.
Zero-cost Features: Optional functionality (like serde serialization or tracing telemetry) must introduce absolutely zero overhead when disabled.
Safe by Default: We use #![forbid(unsafe_code)] across all core parser crates to enforce strict memory safety boundaries.
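For the Safe Numeric Casts rule in particular, a short sketch of the three cases (fallible narrowing, infallible widening, and a scoped allow) may help. The function names and the `CountOverflow` error are hypothetical.

```rust
// Explicit import keeps this compiling on editions before 2021 as well.
use std::convert::TryFrom;

/// Hypothetical error for a count that does not fit its on-disk field.
#[derive(Debug, PartialEq, Eq)]
pub struct CountOverflow(pub usize);

/// Narrowing: an in-memory length going into a 16-bit header field can fail,
/// so it goes through `TryFrom` rather than a truncating `as`.
pub fn encode_count(len: usize) -> Result<u16, CountOverflow> {
    u16::try_from(len).map_err(|_| CountOverflow(len))
}

/// Widening: u32 -> u64 can never lose data, so `From` expresses it directly.
pub fn widen_offset(offset: u32) -> u64 {
    u64::from(offset)
}

/// Float -> integer has no `From`/`TryFrom` impl in std, so `as` is unavoidable here.
pub fn round_to_i32(value: f32) -> i32 {
    // Float-to-int `as` saturates at the i32 bounds and maps NaN to 0,
    // so the result is always well defined.
    #[allow(clippy::as_conversions)]
    let rounded = value.round() as i32;
    rounded
}
```

Keeping the `allow` on the single statement, rather than on the function or module, means any new `as` added nearby still trips the lint.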

Testing & Quality

Our testing approach is a Gray Box strategy: we use our hard-earned white-box knowledge of the game engine (via Ghidra audits) to build strictly validated black-box test cases for our parsers. We want to test against how the real game engine behaves, not against artificial mocks.

When adding a brand new format, please make sure your PR includes:

Fixture-Backed Tests: Full roundtrip coverage using synthetic test files (stored in fixtures/). We never commit real game assets; run cargo test --test gen_fixtures -- --ignored to safely generate them! Byte-exact roundtrip assertions are the gold standard for any format where the engine consumes bytes exactly as written.
Mutation Tests: A quick pass to verify the parser safely rejects malformed or corrupted inputs without panicking (usually wired up via corruption_matrix.rs).
Module Documentation: A clean rustdoc block showing the basic format layout.
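As an illustration of the mutation-test idea, the sketch below builds a small corruption matrix (every truncation plus every single-byte flip of a valid file) and checks that a toy parser never panics on any of them. The parser, its `ToyError`, and the helper names are hypothetical; this is not the layout of the real corruption_matrix.rs.

```rust
use std::panic;

/// Hypothetical error enum for the toy parser under test.
#[derive(Debug, PartialEq, Eq)]
pub enum ToyError {
    Truncated,
    BadMagic,
    TrailingBytes,
}

/// Toy format: magic "TOY ", one length byte, then exactly that many payload bytes.
pub fn parse_toy(bytes: &[u8]) -> Result<Vec<u8>, ToyError> {
    let magic = bytes.get(..4).ok_or(ToyError::Truncated)?;
    if magic != b"TOY " {
        return Err(ToyError::BadMagic);
    }
    let len = usize::from(*bytes.get(4).ok_or(ToyError::Truncated)?);
    let payload = bytes.get(5..5 + len).ok_or(ToyError::Truncated)?;
    if bytes.len() != 5 + len {
        return Err(ToyError::TrailingBytes);
    }
    Ok(payload.to_vec())
}

/// Builds every strict truncation of `valid` plus every single-byte inversion.
pub fn corruption_matrix(valid: &[u8]) -> Vec<Vec<u8>> {
    let truncations = (0..valid.len()).map(|end| valid[..end].to_vec());
    let flips = (0..valid.len()).map(|index| {
        let mut copy = valid.to_vec();
        copy[index] ^= 0xFF;
        copy
    });
    truncations.chain(flips).collect()
}

/// Counts how many mutated inputs make the parser panic (the goal is zero).
/// Returning `Ok` or `Err` is equally acceptable; only a panic is a failure.
pub fn count_panics(valid: &[u8]) -> usize {
    corruption_matrix(valid)
        .into_iter()
        .filter(|input| {
            panic::catch_unwind(|| {
                let _ = parse_toy(input);
            })
            .is_err()
        })
        .count()
}
```

Note that some mutations (a flipped payload byte, for instance) still parse successfully, which is why the check is "no panic" rather than "always an error".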

The Reserved Field Rule

Game engines are weird, and sometimes they leave mysterious “padding” or “reserved” sections in their binary formats. Every struct field that corresponds to a reserved region must be:

Stored strictly as a named array (e.g., reserved: [u8; N]) in the format struct.
Read directly from the source bytes verbatim.
Written back verbatim during a roundtrip.

If a writer zeroes out or silently drops a reserved field you parsed, we consider that a “lossless bug” – even if the engine doesn’t explicitly seem to use those bytes. If you’re constructing a brand new file from scratch, you can safely write zeroes for reserved regions, but the struct must be capable of storing exactly what it read off disk.
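A minimal sketch of the rule, assuming a made-up 12-byte header (4-byte magic, little-endian `u32` flags, 4 reserved bytes). `ToyHeader` and its layout are hypothetical.

```rust
/// Hypothetical header: "TOY " magic, u32 flags (little-endian), 4 reserved bytes.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ToyHeader {
    pub flags: u32,
    /// Stored exactly as read, even though its meaning is unknown.
    pub reserved: [u8; 4],
}

impl ToyHeader {
    /// Building a fresh file: zeroes are acceptable for the reserved region.
    pub fn new(flags: u32) -> Self {
        ToyHeader { flags, reserved: [0; 4] }
    }

    /// Reads the header, copying the reserved bytes verbatim.
    pub fn parse(bytes: &[u8]) -> Option<Self> {
        if bytes.len() != 12 || &bytes[..4] != b"TOY " {
            return None;
        }
        let mut flags = [0u8; 4];
        flags.copy_from_slice(&bytes[4..8]);
        let mut reserved = [0u8; 4];
        reserved.copy_from_slice(&bytes[8..12]);
        Some(ToyHeader { flags: u32::from_le_bytes(flags), reserved })
    }

    /// Writes the header, emitting the reserved bytes verbatim.
    pub fn to_bytes(&self) -> [u8; 12] {
        let mut out = [0u8; 12];
        out[..4].copy_from_slice(b"TOY ");
        out[4..8].copy_from_slice(&self.flags.to_le_bytes());
        out[8..].copy_from_slice(&self.reserved);
        out
    }
}
```

A writer that emitted `[0; 4]` instead of `self.reserved` would still produce a loadable file here, but the roundtrip assertion would fail, which is exactly the "lossless bug" the rule is meant to catch.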

Release Process

(TODO: We haven’t cut an official production release yet! Right now we are aggressively building out the rakata-lint engine rules and expanding our format coverage. Once we officially publish a stable v0.3.0 to crates.io, we’ll formalize our exact release checklist, dependency license refreshes, and CI pipelines here.)

Legal & Compliance

Disclaimer: We aren’t lawyers! The following information references specific legal statutes regarding software interoperability and reverse engineering simply to clearly demonstrate our commitment to strictly lawful development.

Project Intent

Rakata is an open-source research project and software library strictly designed to build interoperability with the data formats used by Star Wars: Knights of the Old Republic (KOTOR).

Our Goal: We want to empower users to access, read, edit, and safely modify their own legally purchased game files on modern operating systems using open-source tools.
No DRM Circumvention: This project completely avoids the game executable. We do not bypass, strip, or defeat any Digital Rights Management (DRM) or software encryption. We solely parse static data files (like .rim, .bif, and .mdl files) for the pure purpose of compatibility.
No Pirated Assets: This repository does not contain, distribute, or host any copyrighted game assets (art, sound, proprietary code, or binaries) owned by the original rights holders. You must supply your own legally obtained copy of the game to do anything useful with this software.

Legal Basis for Reverse Engineering

This project operates under the specific “Interoperability” exceptions provided by copyright law in major jurisdictions:

🇨🇦 Canada (Jurisdiction of Maintainer)

Under the Copyright Act (R.S.C., 1985, c. C-42), this project relies on Section 30.61, which permits the reproduction of a computer program for the purpose of:

(a) obtaining information that is necessary to allow the computer program to be compatible with another computer program; or

(b) correcting errors in the computer program.

🇺🇸 United States

Under the Digital Millennium Copyright Act (DMCA), this project operates under the 17 U.S.C. § 1201(f) exception for Reverse Engineering, which states:

(1) … a person who has lawfully obtained the right to use a copy of a computer program may circumvent a technological measure… for the sole purpose of identifying and analyzing those elements of the program that are necessary to achieve interoperability of an independently created computer program with other programs…

🇪🇺 European Union (Host Jurisdiction - Codeberg)

Under Directive 2009/24/EC (Legal Protection of Computer Programs), this project adheres to Article 6 (Decompilation), which allows for the reproduction of code and translation of its form when:

(a) these acts are performed by the licensee or by another person having a right to use a copy of a program…

(b) the information necessary to achieve interoperability has not previously been readily available…

(c) these acts are confined to the parts of the original program which are necessary to achieve interoperability.

Acknowledgements

Portions of the initial file format logic were originally derived from research by the awesome PyKotor project (licensed under LGPL-3.0-or-later) and verified against original game binaries using clean-room reverse engineering techniques (via Ghidra and ret-sync).

This project is open-source and licensed under GPL-3.0-or-later.
Star Wars: Knights of the Old Republic is a trademark of its respective owners. This passion project is not affiliated with, endorsed by, or connected to BioWare, LucasArts, or Disney in any way.

Format Implementation Reference

This launchpad tracks the implementation status of KotOR file formats across our parsing libraries (rakata-formats) and our strongly-typed wrappers (rakata-generics).

Status Legend:

Full: Binary reader/writer implemented with roundtrip tests.
Generics: Strongly-typed wrappers and linting schemas implemented.
Partial: Basic parsing support, advanced features deferred.
Canonical: Validated against vanilla KotOR (K1) runtime behavior.

Archive Formats

Format	Status	Notes
BIF	Full	Supports variable/fixed tables. Deterministic 4-byte payload alignment. `BZF` compression feature-gated.
KEY	Full	First-match lookup semantics (native verified). Duplicate key insertions ignored.
ERF	Full	Supports `ERF`/`MOD`/`SAV`. Optional blank-block emission for MODs is explicit opt-in.
RIM	Full	Supports `V1.0`. Offset fallback handled. Tight packing.

GFF & Blueprints

Format	Status	Notes
GFF Structure	Full	Core binary parity for structs/lists/fields. Localized strings supported. Stable list ordering.
Generics	Generics	13 typed blueprints completed: ARE, DLG, GIT, IFO, UTC, UTD, UTE, UTI, UTM, UTP, UTS, UTT, UTW. Tied into `rakata-lint`.

3D Models & Walkmeshes

Format	Status	Notes
MDL/MDX	Full	Binary reader/writer with full geometry, node hierarchy, controllers, and MDX vertex data. ASCII reader/writer for modder interop. In-game verified.
BWM / WOK	Full	`V1.0` binary tables (vertices, faces, materials, etc.). Strict bounds validation.

Texture Formats

Format	Status	Notes
TPC	Full	Container header/payload/footer. Canonical pixel-type mapping (DXT5 for type 4). Mip payload sizing matches native right-shift.
DDS	Full	Supports standard D3D headers and K1-specific `CResDDS` prefix (20-byte metadata).
TGA	Full	Reader normalizes to RGBA8888. Canonical mode rejects grayscale RLE. Lossless passthrough when source pixels are unmodified.
TXI	Full	ASCII format. Case-insensitive command tokens (native verified). Coordinate block support.

Text & Data Formats

Format	Status	Notes
2DA	Full	Binary `V2.b`.
TLK	Full	Strict language-aware decode/encode. Validated against `test.tlk`.
VIS	Full	ASCII format. Case-insensitive room normalization. Deterministic ordering.
LYT	Full	ASCII format. Strict Windows-1252 text handling. Count-driven parsing.
LTR	Full	`V1.0` headers. 28-char probability tables.

Audio Formats

Format	Status	Notes
WAV	Full	Standard RIFF + KotOR SFX/VO obfuscation wrappers. MP3-in-WAV unwrapping support.
LIP	Full	`V1.0` header + keyframes. Deterministic writer.
SSF	Full	`V1.1` header + 28-slot sound table.

Missing / Deferred Formats

These formats are currently unimplemented or do not yet have strongly-typed wrappers in rakata-generics.

Format	Status	Notes
NCS / NSS	Deferred	NWScript Source and Compiled bytecode. NCS decompilation is slated for future work via an independent pipeline.
GUI	Deferred	Graphical User Interface layout blueprints (GFF).
JRL	Deferred	Journal and quest tracking blueprints (GFF).
FAC	Deferred	Faction mappings and reputations (GFF).
PTH	Deferred	Pathfinding graphs and navigation waypoints (GFF).
ITP	Deferred	Item Palette definitions (GFF).
BIK	Deferred	Bink Video container (proprietary video format). Unlikely to be implemented natively.

Provenance Policy

Because this project seeks to achieve strict interoperability with a two-decade-old engine, mere “correctness” is insufficient. We guarantee canonical behavior.

Target: Canonical vanilla Star Wars: Knights of the Old Republic 1 (2003).
Engine Audits: We do not guess how the engine behaves. Code is written exclusively from observed engine evidence notes derived from clean-room reverse engineering (via Ghidra/ret-sync). Every implementation choice is documented directly inside that format’s specific page on this site.
Verification: Behaviors are locked via deep integration tests against synthetic fixtures. If a parser perfectly round-trips an invalid file but the game engine rejects it, it is treated as a critical bug.

Archive Formats

At the heart of the Odyssey Engine is its virtual file system. Instead of loading tens of thousands of tiny loose files straight from the local disk, the engine efficiently streams them from large, concatenated archive blobs. You can think of these formats as extremely specialized zip files used to store binary models, compiled scripts, textures, and UI data.

Note

KOTOR uses a strict two-tier architecture. BIF and KEY form the registry for all base-game assets (e.g. data/models.bif is resolved through chitin.key, the global lookup index). ERF and RIM files are independent, self-contained archives used for module levels, save games, and community mods.

Implementation Blueprints

Format	Name	Layout & Purpose
BIF	Binary Information File	Massive binary payload silos containing raw game assets packed end-to-end.
KEY	Global Index File	Master lookup table mapping precise file names directly to their internal BIF payload offset block.
ERF	Encapsulated Resource File	Extremely versatile package format utilized heavily for modules (`.mod`), stateful save games (`.sav`), and generic archives (`.erf`).
RIM	Resource Image	Stripped-down, fast-loading, highly compact localized module containers (often used to split up geometry models vs dynamic entity layouts).

BIF (Binary Information File)

BIFs are essentially giant, uncompressed data silos. Because they act as the raw storage tier of the KOTOR engine, they don’t waste bytes on complex metadata or internal filenames – they are simply pure, tightly packed continuous byte arrays for game resources. They are designed to be randomly accessed extremely quickly at runtime strictly via their companion KEY index file.

At a Glance

Property	Value
Extension(s)	`.bif`
Magic Signatures	`BIFF` (version `V1` )
Type	Archive Blob Payload
Rust Reference	View `rakata_formats::Bif` in Rustdocs

Data Model Structure

The rakata-formats crate handles raw Bif parsing for you by reading the internal offset tables. However, developers very rarely interact with a raw Bif file on its own.

Unified Access: Typically, you’ll use the KeyFile API (rakata_extract::keyfile::KeyFile), which automatically ties .key index files to their .bif data payloads so you don’t have to map them yourself.
Seek Performance: To prevent loading 100MB+ binary files completely into memory just to read a tiny script, Rakata jumps directly to the exact file coordinate on your hard drive (via KeyFile::read_resource_by_seek), extracting only the single resource you specifically asked for!
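The seek-based access pattern described above can be sketched with nothing but the standard library. This is a minimal illustration, not the body of KeyFile::read_resource_by_seek: the helper name `read_span` is hypothetical, and it assumes the caller has already resolved the resource's offset and length from the index.

```rust
use std::io::{self, Read, Seek, SeekFrom};

/// Hypothetical helper: reads exactly `len` bytes starting at the absolute
/// `offset` of any seekable source (a `File`, a `Cursor`, ...), without
/// loading the rest of the source into memory.
pub fn read_span<R: Read + Seek>(source: &mut R, offset: u64, len: usize) -> io::Result<Vec<u8>> {
    // Jump straight to the payload instead of streaming from the start.
    source.seek(SeekFrom::Start(offset))?;
    let mut buf = vec![0u8; len];
    // `read_exact` fails with `UnexpectedEof` if the span runs past the end.
    source.read_exact(&mut buf)?;
    Ok(buf)
}
```

Being generic over `Read + Seek` lets the same helper run against an on-disk file in production and an in-memory `Cursor` in tests.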

Tip

The compressed BZF BIF variant did not exist in the original 2003 PC version of the game. It was added much later by Aspyr for their modern iOS, Android, and Nintendo Switch ports simply to save storage space on mobile devices. While our parser can read the BZF layout, it falls slightly outside our core focus on the original PC version and hasn’t been heavily tested against real mobile game files yet.

Engine Audits & Decompilation

The following documents the engine’s exact load sequence and field requirements for .bif archive headers mapped from swkotor.exe.

(Decompilation logic for this section was entirely audited and verified via native Ghidra pipeline against swkotor.exe, explicitly pulling from CExoResFile::LoadHeader (0x0040d910) and CExoResFile::ReadResource (0x0040da20).)

Archive Initialization (`CExoResFile::LoadHeader`)

Mapped from 0x0040d910.

Pipeline Event	Engine Behavior & Result
Signature Check	The engine strictly validates both the `BIFF` magic and the exact `V1` version. It does not actively process any files that deviate from this signature pair.
Variable Table Loading	The system extracts the `variable_count` value from the header and physically reads `variable_count * 16` bytes from the `variable_table_offset` to map the resource keys.
Fixed Table Bypass	The `fixed_count` header scalar is entirely decorative. It is not part of the active runtime read path (files with nonzero values are accepted but never mapped).
Direct Asset Extraction	When reading an asset out of the `.bif`, the engine computes the entry’s byte offset within the variable table as `(resource_id & 0x3fff) * 0x10`. It then calls C `fseek(SEEK_SET)` with the raw `data_offset` taken from that 16-byte variable table entry. No alignment or structural normalization is applied; the bytes at that offset are read as-is.

Caution

Because the engine passes the internal data_offset integer directly into a raw C fseek(SEEK_SET), any custom BIF files must guarantee byte-perfect offset tables. If an offset is off by even one byte, the engine reads garbage data into the stream, which typically crashes the game.
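The lookup arithmetic from the audit can be sketched as follows. The mask `0x3fff` and the 16-byte stride come from the table above; the field order inside an entry (id, offset, size, type) and the little-endian encoding are assumptions taken from public BIF V1 descriptions, and `VariableEntry` / `lookup_entry` are hypothetical names rather than rakata-formats API.

```rust
/// Size of one variable-table entry, per the audit above.
const ENTRY_SIZE: usize = 0x10;

/// Hypothetical view of one entry. The field order is an assumption based on
/// public BIF V1 descriptions, not on the audit itself.
#[derive(Debug, PartialEq)]
pub struct VariableEntry {
    pub resource_id: u32,
    pub data_offset: u32,
    pub data_size: u32,
    pub resource_type: u32,
}

/// Reads a little-endian u32 at `pos`, or `None` if it would run out of bounds.
fn u32_at(bytes: &[u8], pos: usize) -> Option<u32> {
    let raw = bytes.get(pos..pos.checked_add(4)?)?;
    Some(u32::from_le_bytes([raw[0], raw[1], raw[2], raw[3]]))
}

/// Locates the entry for `resource_id` inside the raw variable table bytes.
pub fn lookup_entry(variable_table: &[u8], resource_id: u32) -> Option<VariableEntry> {
    // Byte offset of the entry: (resource_id & 0x3fff) * 0x10.
    let start = (resource_id & 0x3fff) as usize * ENTRY_SIZE;
    Some(VariableEntry {
        resource_id: u32_at(variable_table, start)?,
        data_offset: u32_at(variable_table, start + 4)?,
        data_size: u32_at(variable_table, start + 8)?,
        resource_type: u32_at(variable_table, start + 12)?,
    })
}
```

Unlike the engine, this sketch bounds-checks the table and returns `None` instead of reading past the end, which is the behavior a validating tool would usually want.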

KEY (Global Index)

Think of the KEY file as the absolute master table of contents governing the entire game directory. Because uncompressed BIF archives are completely blind payloads that contain no internal filenames, the KEY file acts as the singular, authoritative index that tells the engine exactly which BIF holds which file, and precisely where to seek inside that BIF to find it.

At a Glance

Property	Value
Extension(s)	`.key`
Magic Signatures	`KEY` (version `V1` )
Type	Archive Global Index
Rust Reference	View `rakata_formats::Key` in Rustdocs

Data Model Structure

The rakata-formats crate treats the .key file as the authoritative mapping for global engine initialization.

Indices Hierarchy: Internally, the format houses an array of bif_entries bounding archive paths and sizes, alongside a massive array of KeyResourceEntry structures fusing a standard ResRef string and a format TypeCode to a bit-packed numeric ResourceId.
Conflict Resolution: Because the game engine relies on a strict override hierarchy, multiple KEYs might accidentally declare the same resource! When constructing the active dictionary out of a KEY file (KeyFile::build_key_resource_index), Rakata explicitly utilizes or_insert() to strictly ensure only the first defined entry for a conflict is honored, perfectly mimicking the engine’s aggressive linear-scan precedence rules.
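The first-entry-wins rule maps directly onto the standard library's `HashMap::entry(..).or_insert(..)`. The sketch below is illustrative only: `build_index` and the tuple-based key are hypothetical stand-ins for the real KeyFile::build_key_resource_index types, and the type codes in the usage are arbitrary numbers.

```rust
use std::collections::HashMap;

/// Hypothetical index key: resource name plus numeric type code.
pub type ResKey = (String, u16);

/// Builds a lookup table in declaration order. `or_insert` only writes when
/// the key is vacant, so a later duplicate never replaces the first entry.
pub fn build_index(entries: &[(&str, u16, u32)]) -> HashMap<ResKey, u32> {
    let mut index = HashMap::new();
    for &(resref, type_code, resource_id) in entries {
        index
            .entry((resref.to_string(), type_code))
            .or_insert(resource_id);
    }
    index
}
```

For example, feeding `("p_bastila", 2009, 0x0010_0005)` followed by `("p_bastila", 2009, 0x0020_0001)` leaves `0x0010_0005` in the map, while the same name under a different type code gets its own slot.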

Engine Audits & Decompilation

The following information documents the KOTOR engine’s exact load sequence and field constraints for genuine .key files. All behavior was mapped natively from swkotor.exe during clean-room reverse engineering.

Key Table Registration (`CExoKeyTable::AddKeyTableContents`)

Mapped from 0x0040fb80.

Action	Engine Behavior
Signature Check	Validates exactly for the `KEY` magic and the explicit `V1` version signature.
Version Branching	There is absolutely zero logic handling any speculative `V1.1` version branch in vanilla K1. It is currently unknown if a `V1.1` KEY format actually exists in the wild, but the engine certainly wouldn’t load it.
Payload Mapping	Derives the file location by splitting the `ResourceId` bit field into the target `BIF` file index and the offset into that BIF’s internal entry array.

Note

The engine handles KEY table loading extremely early in the application lifecycle during CExoBase::InitObject. If a global KEY fails to mount due to malformed headers, the engine immediately aborts execution.
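As a rough sketch of the `ResourceId` split: the low-bits mask `0x3fff` is the one quoted in the BIF audit, while placing the BIF index in the top 12 bits (`>> 20`) is an assumption based on the commonly documented KEY layout rather than on this page. The function name is hypothetical.

```rust
/// Splits a packed resource id into `(bif_index, variable_index)`.
/// Assumption: the BIF index occupies the top 12 bits of the id.
pub fn split_resource_id(resource_id: u32) -> (u32, u32) {
    let bif_index = resource_id >> 20;
    // Same mask the engine applies before multiplying by the entry stride.
    let variable_index = resource_id & 0x3fff;
    (bif_index, variable_index)
}
```

Under that assumption, `0x0050_0012` resolves to BIF number 5 and variable entry `0x12`.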

ERF (Encapsulated Resource File)

ERFs are the heavy lifters for standard game modules (.mod) and save game architectures (.sav). Unlike BIFs, which rely entirely on an external KEY file to resolve their resource identities, ERFs are completely self-contained entities that carry their own internal file tables, localized descriptions, and asset payloads.

At a Glance

Property	Value
Extension(s)	`.erf`, `.mod`, `.hak`, `.sav`
Magic Signatures	`ERF` , `MOD` , `HAK` , `SAV` (version `V1.0`)
Type	Self-Contained Archive
Rust Reference	View `rakata_formats::Erf` in Rustdocs

Data Model Structure

Because ERF files share the exact same structural responsibility as RIM files (acting as self-contained module wrappers), the rakata-extract crate abstracts both ERF and RIM parsing directly behind the unified Capsule struct.

Capsule Generalization: Standard module extraction relies entirely on calling rakata_extract::Capsule::read_from_bytes(). This actively probes and dynamically mounts either ERF or RIM boundaries identically in memory, completely hiding the underlying structural container differences from the developer API.

Engine Audits & Decompilation

The following information documents the KOTOR engine’s exact load sequence and field requirements for genuine .erf capsule variants. All behavior was mapped natively from swkotor.exe during clean-room reverse engineering.

Capsule Header Initialization (`CExoEncapsulatedFile::LoadHeader`)

Mapped from 0x0040e1f0.

Action	Engine Behavior
Signature Check	Explicitly validates the header against exactly matching `ERF` , `MOD` , or `HAK` signatures, paired with the mandatory `V1.0` version string.
Unchecked Saves	The engine has no validation branch for `.sav` files. If a file is loaded as a Save Game (param flag 1), the engine falls through the validation tree and requires the `MOD` magic string. An ERF file with `SAV` magic is rejected, or crashes the engine, at this point.
Header Truncation	The loader pulls the entire 160-byte header into scope (`CExoFile::Read(..., 0xa0)`), but only evaluates offsets `0x00` through `0x1C`. Offset `0x18` (Key List) and everything beyond `0x1C` are ignored during initialization.

Tip

The 116-Byte “Dead Zone”: The block of bytes at offsets 0x2C through 0x9F of the 160-byte header is loaded into the engine’s memory and then discarded immediately. It is inert data containing old BioWare build metadata.
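One way to read the signature rules above is the predicate below. It is an interpretation of the audit table, not decompiled code: the function name is hypothetical, the magics are assumed to be four bytes padded with a trailing space, and rejecting short headers is an assumption of this sketch.

```rust
/// The loader reads 0xa0 bytes of header, per the audit above.
pub const ERF_HEADER_LEN: usize = 0xa0;

/// Returns true if the header would pass the described signature checks.
pub fn header_accepted(header: &[u8], loaded_as_save: bool) -> bool {
    // Assumption: a header shorter than 160 bytes cannot be loaded at all.
    if header.len() < ERF_HEADER_LEN {
        return false;
    }
    let magic = &header[0..4];
    let version = &header[4..8];
    if version != b"V1.0" {
        return false;
    }
    if loaded_as_save {
        // Save-game path: only the MOD magic is accepted, never SAV.
        magic == b"MOD "
    } else {
        magic == b"ERF " || magic == b"MOD " || magic == b"HAK "
    }
}
```

A writer that targets the engine can use a check like this to refuse emitting `SAV` magic for save archives.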

RIM (Resource Image)

RIM files operate as a radically leaner alternative to ERFs. They are used exclusively by the game engine for distributing absolutely essential or lightweight modules without the hefty structural metadata overhead of an ERF file. They provide rapid, self-contained loading for core engine environments.

At a Glance

Property	Value
Extension(s)	`.rim`
Magic Signatures	`RIM` (version `V1.0`)
Type	Lightweight Archive
Rust Reference	View `rakata_formats::Rim` in Rustdocs

Data Model Structure

Because RIM files act as a lightweight twin to the ERF format, the rakata-extract crate extracts them identically.

Capsule Generalization: Standard module extraction relies entirely on calling rakata_extract::Capsule::read_from_bytes(). The developer API makes absolutely no programmatic distinction between querying an ERF module or a RIM module—it behaves perfectly seamlessly either way.

Engine Audits & Decompilation

The following information documents the KOTOR engine’s exact load sequence and field requirements for genuine .rim capsule variants. All behavior was mapped natively from swkotor.exe during clean-room reverse engineering.

Resource Image Overrides (`CExoKeyTable::AddResourceImageContents`)

Mapped from 0x0040f990.

Action	Engine Behavior
Signature Check	Validates the exact `RIM` magic and the `V1.0` version string upon loading.
Header Evaluation	The engine physically reads the `entry_count` (offset `0x0C`) and the `keys_offset` (offset `0x10`) from the header to explicitly navigate the file structures.

Tip

The 96-Byte “Dead Zone”: Like the ERF dead zone, RIM files carry 96 bytes of inert padding at offsets 0x18 through 0x77 of the 120-byte header. The engine skips it during initialization, so it is safe to zero out this region when generating new synthetic fixtures.
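For fixture generation, the header fields and the inert region can be handled as in this sketch. The offsets (`0x0C`, `0x10`, `0x18..0x78`) come from the text above; the function names, the space-padded `RIM ` magic, and the little-endian encoding are assumptions of the example.

```rust
pub const RIM_HEADER_LEN: usize = 120;
/// Inert region: offsets 0x18 through 0x77 inclusive (96 bytes).
const DEAD_ZONE: std::ops::Range<usize> = 0x18..0x78;

fn u32_le(bytes: &[u8], pos: usize) -> u32 {
    u32::from_le_bytes([bytes[pos], bytes[pos + 1], bytes[pos + 2], bytes[pos + 3]])
}

/// Returns `(entry_count, keys_offset)` if the signature pair matches.
pub fn read_rim_header(header: &[u8]) -> Option<(u32, u32)> {
    if header.len() < RIM_HEADER_LEN || &header[0..4] != b"RIM " || &header[4..8] != b"V1.0" {
        return None;
    }
    Some((u32_le(header, 0x0c), u32_le(header, 0x10)))
}

/// Clears the padding the engine never inspects, leaving live fields intact.
pub fn zero_dead_zone(header: &mut [u8; RIM_HEADER_LEN]) {
    header[DEAD_ZONE].fill(0);
}
```

Zeroing the region before comparing fixtures keeps byte-level diffs focused on the fields the engine actually reads.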

GFF (Generic File Format)

The Generic File Format (GFF) is BioWare’s core binary serialization format, functioning like a binary JSON object or XML tree. It holds arbitrarily nested structures, typed fields, and lists, powering UI layouts, character sheets, dialogues, and area descriptions.

At a Glance

Property	Value
Extension(s)	`.gff`, `.utc`, `.uti`, `.utp`, `.ute`, `.utd`, `.dlg`, `.are`, `.ifo`, etc.
Magic Signature	Target type (e.g. `UTC` ) / `V3.2`
Type	Generic Hierarchical Data
Rust Reference	View `rakata_formats::Gff` in Rustdocs

Data Model Structure

The rakata-formats crate gracefully abstracts the GFF struct/field/list indexing graph into a user-friendly memory model (rakata_formats::Gff).

Typestate Wrapping: GFF natively supports discrete types (e.g., BYTE, SHORT, VOID, STRUCT, LIST). rakata_formats::GffValue encapsulates these identically, shielding developers from raw byte layouts and indirect arrays.
Data Deduplication: Unlike standard web JSON, GFF binaries limit all field labels to 16 characters and deduplicate them via a contiguous LabelTable. The rakata-formats implementation mimics this memory layout exactly during serialization, guaranteeing structurally deterministic binaries natively acceptable by the engine!

Engine Audits & Decompilation

Binary: swkotor.exe

Serialization Architecture (`WriteGFFFile`)

Derived from 0x00413030 / 0x004113d0.

The engine allocates the output buffer entirely in memory and serializes exactly 7 contiguous sections in a strict order. No inter-section padding or reserved alignment bytes are inserted. Each section’s byte offset is recorded in the 56-byte header as it is written. This is the canonical write path used for save games and area extraction.

Phasing Order	Section Component	Memory Footprint / Quirk
Phase 1	Root Header	Exactly `56` bytes (`0x38`).
Phase 2	Struct Array	`12B × struct_count`
Phase 3	Field Array	`12B × field_count`
Phase 4	Label Array	`16B × label_count`
Phase 5	Field Data Blob	Arbitrary bounds constraint.
Phase 6	Field Indices	Dynamic array bounds.
Phase 7	List Indices	Dynamic array bounds.
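Because the sections are written back to back with no padding, every section offset follows from the sizes of the sections before it. The arithmetic can be sketched like this; the struct and function names are hypothetical, and the sketch ignores `u32` overflow for brevity.

```rust
/// Size of the root header (0x38).
pub const HEADER_LEN: u32 = 56;

/// Element counts and raw byte lengths of the six sections after the header.
pub struct SectionSizes {
    pub struct_count: u32,
    pub field_count: u32,
    pub label_count: u32,
    pub field_data_len: u32,
    pub field_indices_len: u32,
    pub list_indices_len: u32,
}

#[derive(Debug, PartialEq)]
pub struct SectionOffsets {
    pub structs: u32,
    pub fields: u32,
    pub labels: u32,
    pub field_data: u32,
    pub field_indices: u32,
    pub list_indices: u32,
    pub total_len: u32,
}

/// Each offset is the previous offset plus the previous section's size.
pub fn layout(s: &SectionSizes) -> SectionOffsets {
    let structs = HEADER_LEN;
    let fields = structs + 12 * s.struct_count;
    let labels = fields + 12 * s.field_count;
    let field_data = labels + 16 * s.label_count;
    let field_indices = field_data + s.field_data_len;
    let list_indices = field_indices + s.field_indices_len;
    let total_len = list_indices + s.list_indices_len;
    SectionOffsets { structs, fields, labels, field_data, field_indices, list_indices, total_len }
}
```

With 2 structs, 3 fields, 3 labels, and 10/12/8 bytes in the three trailing sections, the offsets come out as 56, 80, 116, 164, 174, and 186, for a 194-byte file.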

Warning

Because BioWare enforces fixed 16-byte elements inside the Label array, any label longer than 16 characters is truncated to fit its slot.
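The fixed-width label slot can be illustrated with a small encoder. This is a sketch under two assumptions that the audit does not state: shorter labels are padded with NUL bytes, and truncation happens at the byte level. The name `encode_label` is hypothetical.

```rust
/// Packs a label into a fixed 16-byte slot: at most 16 bytes are copied and
/// any remainder of the slot stays zeroed (assumed NUL padding).
pub fn encode_label(label: &str) -> [u8; 16] {
    let mut slot = [0u8; 16];
    let bytes = label.as_bytes();
    let n = bytes.len().min(16);
    slot[..n].copy_from_slice(&bytes[..n]);
    slot
}
```

A label such as `ThisLabelIsWayTooLong` ends up stored as `ThisLabelIsWayTo`, which is why a tool may prefer to reject over-long labels up front rather than truncate silently.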

Engine Blueprints: Specialized GFF Containers

While the gff.md reference explains the layout of raw GFF nodes, the engine frequently uses GFF as a structural wrapper to serialize completely deterministic entities known as Blueprints. These blueprints operate as the strict layouts defining creatures, dialogue trees, placeables, and area parameters.

Because rakata-lint provides deep behavioral validation over these blueprints natively, we have comprehensively audited how the K1 GOG executable (swkotor.exe) maps these layouts into active memory via its Load*FromGFF functions.

Note

The typed blueprint structs documented below (Utc, Uti, Are, Git, Dlg, Ifo, Utd, Ute, Utm, Utp, Uts, Utt, Utw) are projections over raw GFF, not replacements. Each from_gff extracts only the documented fields and silently drops anything else; to_gff writes only those documented fields. The raw Gff tree stays alongside the typed view for callers that need byte-exact fidelity. See Typed Views and Raw GFF in the architecture guide for the full rationale and the choose-which-layer guidance.

The Blueprint Engine Audits

The audits listed in this section’s navigation bar are formal, decompilation-backed blueprints cataloging KOTOR’s physical constraints. They document the exact fields, load phasing, and engine rule evaluations that supersede any generic structural validity.

If a field exists in GFF but breaks the engine, our Linter rules will flag it using these documentation audits as the source of truth.

Ext	Type	Core Function
`.are`	Area Static Blueprint	Defines overarching static world properties (weather, day/night limits, physics constraints).
`.dlg`	Dialogue	Encapsulates the conversation graph, branching logic, and cinematic execution sequences.
`.git`	Game Instance Template	The physical object manifest. Orchestrates exact placement, vector orientations, and template spawning.
`.ifo`	Module Info	Root environment metadata bridging modules together and orchestrating spawn states.
`.utc`	Creature	Instantiates NPCs, stat-blocks, and character body configurations.
`.utd`	Door	Configures transitions, linked bounds, and structural barriers.
`.ute`	Encounter	Orchestrates dynamic boundary triggers and valid enemy spawning constraints.
`.uti`	Item	Unifies structural stats across weapons, armors, and consumables.
`.utm`	Store	Limits merchant arrays and details markup/markdown behaviors.
`.utp`	Placeable	Standardizes interactive storage boxes, unusable statues, and deployable traps.
`.uts`	Sound	Configures local dynamic audio emitters and distance volume calculations.
`.utt`	Trigger	Plots physical interactive polygons tracking spatial events.
`.utw`	Waypoint	Anchors spatial float positions for navigation grids and area transitions.

ARE Format (Area Static Blueprint)

The Area (.are) blueprint format operates as the static environmental foundation of any game module. It establishes the rigid, overarching properties of a level, orchestrating the terrain’s grass rendering definitions, dynamic sunlight and fog constraints, ambient audio scale, and the primary interior/exterior state configurations. It effectively constructs the structural ‘stage’ that dynamic entities (like creatures and doors) populate later on.

At a Glance

Property	Value
Extension(s)	`.are`
Magic Signature	`ARE` / `V3.2`
Type	Area Static Blueprint
Rust Reference	View `rakata_generics::Are` in Rustdocs

Data Model Structure

Rakata maps the Area definition directly into the rakata_generics::Are struct. To view the exhaustive binary schema and strict GFF field mappings, please refer to the Rustdocs for this struct, where each field is explicitly documented.

Engine Audits & Decompilation

The following documents the engine’s exact load sequence and field requirements for .are files mapped from swkotor.exe.

(Decompilation logic for this section was entirely audited and verified via native Ghidra pipeline against swkotor.exe, explicitly pulling from CSWSArea::LoadArea at 0x0050e190.)

The initial LoadArea dispatch branches out to parse the .are GFF, .lyt layout, .git instance tracking, and .pth bounds. The engine processes roughly 61 scalar fields, 4 scripts, 3 lists, and a nested minigame struct natively within the LoadAreaHeader subroutine.

Core Environmental Identity

Field Category	Engine Property & Behavioral Quirk
Identity	`Name` (LocString), `Comments` (String), `ID` (Int) -> Standard definition strings.
Identity	`Tag` (String) -> Lowercased on load (via `CExoString::LowerCase`). The only tag to behave this way!
Scripts	`OnHeartbeat`, `OnUserDefined`, `...` -> `CResRef` script payloads.
State Flags	`Flags` (DWord) -> `Bit 0` explicitly marks an `Interior` environment.
State Flags	`RestrictMode` (Byte) -> Hardcoded Event: Changing this to a non-zero value during gameplay forces `CSWPartyTable::UnstealthParty`.

Note

Interior Weather Truncation: If Flags (Bit 0) marks the area as an interior space, the engine zeros out all weather properties on load, discarding any prior weather assignments.

Weather & Terrain Generation

Field	Type	Engine Evaluation
`ChanceFog`	`INT`	Stored persistently as an integer.
`ChanceRain`, `ChanceSnow`, `ChanceLightning`, `WindPower`	`INT`	Warning: The engine explicitly truncates these INT properties to 8-bit bytes at runtime. Values over 255 silently wrap around.
`Grass_TexName`	`ResRef`	If empty or invalid, the engine forces a hard fallback to `"grass"`.
`AlphaTest`	`FLOAT`	Defaults to `0.2` (older tools commonly assume `0.0`).
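Two of the rows above reduce to one-liners. Rust's `as u8` cast keeps only the low 8 bits, which matches the wrap-around described for the weather fields; the grass fallback below models only the empty case, not the "invalid" case. Both function names are hypothetical.

```rust
/// Models the runtime narrowing of a stored INT weather value to a byte.
/// `as u8` keeps the low 8 bits, so values above 255 wrap around.
pub fn runtime_weather_byte(stored: i32) -> u8 {
    stored as u8
}

/// Models the texture fallback for an empty `Grass_TexName`.
pub fn effective_grass_texture(resref: &str) -> &str {
    if resref.is_empty() { "grass" } else { resref }
}
```

A stored `ChanceRain` of 300 therefore behaves like 44 at runtime, and 256 behaves like 0.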

Area Lighting & Sun/Moon Tracking

KOTOR handles dynamic sunlight constraints separately between Sun and Moon.

Property Groups	Type	Engine Evaluation
Fog Ranges (`MoonFogNear/Far`, `SunFogNear/Far`)	`FLOAT`	Defaults to an immense distance of `10000.0`. The engine aggressively clamps values to be `≥0.0`.
Tints (`AmbientColor`, `DiffuseColor`, `*FogColor`)	`DWORD`	Processed seamlessly as standard `DWORD` color masks.
Environment Shadows (`ShadowOpacity`, `*Shadows`)	`BYTE`	Basic toggles and opacities orchestrating render limits.

Map Transitions & Saving States

Feature Category	Engine Evaluation & Triggers
Minimap Logic	Geographic vectors (`MapResX`, spatial coordinate structs like `WorldPt1X`) are only loaded if an actual Minimap TGA/TPC asset matching the level name exists on disk!
Parsing Type	If read, the engine parses `MapPt` along a dual-path logic checking if it is formally a `FLOAT` or `INT` type.
Zoom Bias	`MapZoom` defaults to a scale of `1`, not `0`.
Stealth Save-States	The stealth framework uses the `.are` struct to snapshot `StealthXPMax` and `StealthXPCurrent` directly as `DWORD`s when parsing the layout.

The Minigame Struct

Read via CSWMiniGame::Load (0x006723d0). If a minigame context triggers, the .are reads the nested Type (DWORD mapping 1=Swoop, 2=Turret). It injects highly specialized float properties modifying basic terrain speeds:

Field	Injection Default / Constraint
`LateralAccel`	Defaults safely to `60.0`.
`MovementPerSec`	Scales to `6.0` (Swoops), `90.0` (Turrets), or `0.0` otherwise!
`Bump_Plane`	Bounds are heavily clamped to `0..3`.
Nested Arrays	The struct natively requires sub-struct `Player` arrays (Models, Camera, Axes) and `Enemy/Obstacles` lists to operate properly.
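The defaults in this table can be expressed as two small functions. The type codes and values come from the text above; the function names are hypothetical, and treating `Bump_Plane` as a signed 32-bit integer is an assumption of the sketch.

```rust
/// Default `MovementPerSec` by minigame `Type` (1 = Swoop, 2 = Turret).
pub fn default_movement_per_sec(minigame_type: u32) -> f32 {
    match minigame_type {
        1 => 6.0,
        2 => 90.0,
        _ => 0.0,
    }
}

/// Clamps `Bump_Plane` into the 0..=3 range described above.
pub fn clamp_bump_plane(value: i32) -> i32 {
    value.clamp(0, 3)
}
```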

Implemented Linter Rules (Rakata-Lint)

Phase 1 (intra-resource, no context)

Implemented under rakata_lint::rules::are.

ARE-001 (Context Discards): Warns when interior areas (Flags & 1) carry non-zero ChanceRain, ChanceSnow, ChanceLightning, or WindPower; the engine discards weather for interiors.
ARE-002 (Weather Truncation): Warns when ChanceRain, ChanceSnow, ChanceLightning, or WindPower exceed 255; the engine truncates these to bytes at runtime.
ARE-003 (Fog Clamping): Warns when MoonFogNear/Far or SunFogNear/Far are negative; the engine clamps fog distances to >= 0.0.
ARE-004 (Tag Lowercasing): Warns when Tag contains uppercase characters; the engine lowercases area tags on load.
ARE-005 (Toolset / K2 Fields): Informs when DisableTransit, NoHangBack, PlayerOnly, or PlayerVsPlayer are set; never read by the K1 engine.
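The first four Phase 1 rules can be approximated in a few lines. This is a sketch of the rule conditions as described above, not the code in rakata_lint::rules::are: `AreFields` is a hypothetical, flattened stand-in for the typed Are struct, and findings are reported as bare rule IDs.

```rust
/// Hypothetical, flattened view of the fields the rules inspect.
pub struct AreFields {
    pub flags: u32,
    pub tag: String,
    /// ChanceRain, ChanceSnow, ChanceLightning, WindPower.
    pub weather: [i32; 4],
    /// MoonFogNear, MoonFogFar, SunFogNear, SunFogFar.
    pub fog: [f32; 4],
}

/// Returns the IDs of the rules that fire, in rule order.
pub fn phase1_findings(are: &AreFields) -> Vec<&'static str> {
    let mut hits = Vec::new();
    let interior = (are.flags & 1) != 0;
    // ARE-001: interiors discard weather, so non-zero values are dead data.
    if interior && are.weather.iter().any(|&w| w != 0) {
        hits.push("ARE-001");
    }
    // ARE-002: values above 255 are truncated to a byte at runtime.
    if are.weather.iter().any(|&w| w > 255) {
        hits.push("ARE-002");
    }
    // ARE-003: negative fog distances are clamped to 0.0.
    if are.fog.iter().any(|&d| d < 0.0) {
        hits.push("ARE-003");
    }
    // ARE-004: the engine lowercases the tag on load.
    if are.tag.chars().any(|c| c.is_uppercase()) {
        hits.push("ARE-004");
    }
    hits
}
```

Each rule is an independent predicate over already-parsed fields, which is why these checks need no `LintContext`.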

Phase 2 (resource existence, requires `LintContext`)

Implemented under rakata_lint::rules::are_range.

ARE-006 (Resref Existence): Warns when any of OnEnter, OnExit, OnHeartbeat, or OnUserDefined (.ncs) does not resolve, or when any Rooms[i].PartSounds[j].Sound (.wav) does not resolve in the configured resource sources.

Pending

Grass Texture Fallback: Informs when Grass_TexName is empty; the engine treats this as the literal string "grass".
Texture / MiniGame Resref Existence: DefaultEnvMap, Grass_TexName, and the nested MiniGame model / track / music graph – ResourceTypeCode mapping for engine-specific texture and model packs is still being audited.

DLG Format (Dialogue Blueprint)

The Dialogue (.dlg) format is the beating heart of KOTOR’s storytelling. It acts as the master “script” for every conversation, cutscene, and cinematic sequence. Rather than just holding localized text, it acts as a branching storyboard that tells the engine exactly what the characters should say in audio, what animations they should perform, which camera angles to use, and when to fire off scripts that impact the plot.

At a Glance

Property	Value
Extension(s)	`.dlg`
Magic Signature	`DLG` / `V3.2`
Type	Dialogue Blueprint
Rust Reference	View `rakata_generics::Dlg` in Rustdocs

Data Model Structure

Rakata parses the raw GFF structure into the rakata_generics::Dlg struct.

Typestate Extraction: By extracting the loosely-typed GFF binary into a strict Rust struct, Rakata replaces unsafe dynamic string queries with compile-time guaranteed data types (such as DlgAnimation and DlgCamera models).

Engine Audits & Decompilation

The following documents the engine’s exact load sequence and field requirements for .dlg files mapped from swkotor.exe.

(Decompilation logic for this section was entirely audited and verified via native Ghidra pipeline against swkotor.exe, explicitly pulling from CSWSDialog::LoadDialog (0x005a2ae0), cascading through LoadDialogBase (0x005a11c0) and LoadDialogCamera (0x005a1ab0).)

The LoadDialog subroutine processes the root-level conversation configuration before iterating over the heavily nested EntryList and ReplyList. For each of those conversational nodes, it delegates parsing to LoadDialogBase (for text and scripts) and LoadDialogCamera (for viewport directions).

Additionally, StartingList provides the dialogue entry points, while the StuntList associates cutscene actor models.

Root Conversation Configuration

Field Category	Engine Property & Type	Notable Default or Behavioral Quirk
Identity & Rules	`CameraModel` (ResRef), `DelayEntry/Reply` (DWord)	`DelayEntry` and `DelayReply` safely default to `0` if missing.
Identity & Rules	`Skippable` (Byte)	Explicitly defaults to `1` (True) if missing.
Logic Hooks	`EndConversation`, `EndConverAbort` (ResRefs), `AmbientTrack`	Fire when the dialogue terminates abruptly or via conclusion. Fallback to empty strings `""` if missing.
Hardware Interfacing	`ConversationType` (Int)	`0` = Cinematic, `1` = Computer, `2` = Special. Cinematic explicitly unstealths the party.
Hardware Interfacing	`ComputerType` (Byte)	Only evaluated if `ConversationType` is `1`. Otherwise, standard camera positioning and animations are bypassed.
Equipment & Actions	`UnequipItems`, `UnequipHItem`, `AnimatedCut`, `OldHitCheck`	`AnimatedCut` forces a global unpauseable state if non-zero. `AnimatedCut` and `OldHitCheck` default to `0`.

Shared Dialogue Node Properties (`LoadDialogBase`)

These fields apply to both entries (NPC spoken) and replies (Player spoken), and are parsed via LoadDialogBase.

Field	Type	Engine Evaluation
`Text`	`LocString`	The spoken localized string.
`Script`, `Speaker`, `Quest`	Strings/ResRefs	Standard execution scripts and entity mapping.
`Sound`, `VO_ResRef`	`ResRef`	Sound Fallback: If `Sound` fails to execute, the engine will attempt to play `VO_ResRef`. If both fail, the bitmask `SoundExists` is forcibly downgraded to `0`.
`Delay`	`DWord`	Delay Special Case: If value is `0xFFFFFFFF`, the engine explicitly reads from the root `DelayEntry`/`DelayReply` field instead and modulates `WaitFlags`!
`FadeType`	`Byte`	Determines the `FadeDelay` and `FadeLength`. If set to `0` or missing, all fade configurations are zeroed inherently.
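The `Delay` sentinel and the `FadeType` gate can be modeled as below. The sketch covers only the value resolution described in the table: it does not model `WaitFlags`, it expects the caller to pass `DelayEntry` for entries and `DelayReply` for replies as `root_delay`, and it treats a missing `FadeType` as `0`. All names are hypothetical.

```rust
/// Sentinel meaning "use the root-level delay instead".
pub const DELAY_USE_ROOT: u32 = 0xFFFF_FFFF;

/// Resolves a node's delay against the matching root delay field.
pub fn effective_delay(node_delay: u32, root_delay: u32) -> u32 {
    if node_delay == DELAY_USE_ROOT { root_delay } else { node_delay }
}

/// Returns the `(FadeDelay, FadeLength)` pair that survives loading.
pub fn effective_fade(fade_type: u8, fade_delay: f32, fade_length: f32) -> (f32, f32) {
    if fade_type == 0 { (0.0, 0.0) } else { (fade_delay, fade_length) }
}
```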

Viewport Framing (`LoadDialogCamera`)

Field	Type	Engine Evaluation
`CameraID`	`INT`	Dependent Field: Only permitted when `CameraAngle = 6` (Placeable Camera). Otherwise, the engine forces the ID to `-1` regardless of the static binary value.
`CamFieldOfView`	`FLOAT`	Aggressively validated. If the property is entirely missing or is explicitly negative, the engine forces the perspective to `-1.0`.
`CamHeightOffset`, `TarHeightOffset`	`FLOAT`	Standard float deltas.
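The two camera overrides above amount to the following value resolution. This is a sketch with hypothetical function names; the integer width of `CameraAngle` is an assumption, and a missing `CamFieldOfView` is represented as `None`.

```rust
/// `CameraID` only survives when `CameraAngle` is 6 (Placeable Camera).
pub fn effective_camera_id(camera_angle: u32, stored_id: i32) -> i32 {
    if camera_angle == 6 { stored_id } else { -1 }
}

/// A missing or negative field of view is replaced with -1.0.
pub fn effective_fov(stored: Option<f32>) -> f32 {
    match stored {
        Some(v) if v >= 0.0 => v,
        _ => -1.0,
    }
}
```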

Relational Data Trees

Dialogues operate as highly interconnected linked lists.

Entry -> Reply Links (RepliesList within an Entry Node): Maps the Index (DWORD) to the overarching .ReplyList bounds. Unique in that it exclusively parses the DisplayInactive Byte.
Reply -> Entry Links (EntriesList within a Reply Node): Maps the Index to the .EntryList bounds.
Start Indices (StartingList): Uses the exact same linkage schema as a Reply->Entry link. Validates Index against entry_count.

Warning

Corrupted Link Constraints: Index values are strictly checked against the target array bounds before traversal. If a node tries to link out of bounds, it immediately triggers a fatal Load Failure within the engine.
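
A pre-flight check for the bounds rule above might look like the following sketch. `DlgLinks` is a hypothetical, flattened view of the link indices (it is not the `rakata_generics` type), and the message format is invented for illustration.

```rust
/// Hypothetical, simplified view of the three link sources in a DLG.
struct DlgLinks {
    entry_count: usize,
    reply_count: usize,
    /// `Index` values from every entry's `RepliesList` (point into `ReplyList`).
    entry_to_reply: Vec<u32>,
    /// `Index` values from every reply's `EntriesList` (point into `EntryList`).
    reply_to_entry: Vec<u32>,
    /// `Index` values from `StartingList` (point into `EntryList`).
    starting: Vec<u32>,
}

/// Records every index in `indices` that is not below `limit`.
fn check(label: &str, indices: &[u32], limit: usize, problems: &mut Vec<String>) {
    for (pos, &index) in indices.iter().enumerate() {
        if index as usize >= limit {
            problems.push(format!("{label}[{pos}]: Index {index} >= {limit}"));
        }
    }
}

/// Returns a description of each out-of-bounds link; empty means safe to load.
fn out_of_bounds_links(dlg: &DlgLinks) -> Vec<String> {
    let mut problems = Vec::new();
    check("RepliesList", &dlg.entry_to_reply, dlg.reply_count, &mut problems);
    check("EntriesList", &dlg.reply_to_entry, dlg.entry_count, &mut problems);
    check("StartingList", &dlg.starting, dlg.entry_count, &mut problems);
    problems
}
```

Note that the starting list is checked against the entry count, matching the Reply -> Entry schema described above.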

Ancillary Configuration Lists

AnimList: Defines custom Participant models and their accompanying Animation (WORD) action index to loop.
StuntList: Dictates which StuntModel should proxy standard rendering behavior for a given Participant.

Implemented Linter Rules (Rakata-Lint)

Phase 1 (intra-resource, no context)

Implemented under rakata_lint::rules::dlg.

DLG-001 (Camera Angle Compliance): Warns when CameraID is populated while CameraAngle != 6; the engine forces the ID to -1.
DLG-002 (Conversation Type Mismatch): Warns when ComputerType is set but ConversationType != 1 (Computer Dialog); ComputerType is dead data otherwise.
DLG-003 (Ghost Delay Flags): Warns when an entry delay is maxed (0xFFFFFFFF) but no sound/VO is configured and the parent fallback delay is 0; the node terminates instantly.
DLG-004 (Fatal Bounds Checking): Errors when any Index in a node’s link list, the starting list, or a reply list exceeds the target array bounds; this triggers a fatal engine load failure.
DLG-005 (Context Zeroing): Warns when FadeDelay, FadeLength, or FadeColor are configured but FadeType=0; the engine discards the timings.

Phase 2 (resource existence, requires `LintContext`)

Implemented under rakata_lint::rules::dlg_range.

DLG-006 (Resref Existence): Warns when any of EndConversation / EndConverAbort (.ncs), CameraModel (.mdl), AmbientTrack (.wav), per-stunt StuntList[i].StuntModel (.mdl), or per-node Script (.ncs), Sound / VO_ResRef (.wav), and Links[j].Active condition scripts (.ncs) do not resolve in the configured resource sources.

GIT Format (Game Instance Template)

Description: The Game Instance Template (.git) orchestrates the exact placement of every single entity within an environment. If the .are file is the underlying “stage”, the .git file acts as the blueprint for its “actors”–defining exactly where creatures initially spawn, where placeables sit, the physical rotation of doors, and the bounds of any active sound emitters.

At a Glance

Property	Value
Extension(s)	`.git`
Magic Signature	`GIT` / `V3.2`
Type	Instance Blueprint
Rust Reference	View `rakata_generics::Git` in Rustdocs

Data Model Structure

Rakata parses the raw GFF structure into the rakata_generics::Git struct.

Typestate Extraction: By extracting the loosely-typed GFF binary into a strict Rust struct, Rakata inherently standardizes all 13 object sub-lists, creating deterministic representations of GitCreature, GitDoor, GitPlaceable, etc.

Engine Audits & Decompilation

The following documents the engine’s exact load sequence and field requirements for .git files mapped from swkotor.exe.

(Decompilation logic for this section was entirely audited and verified via native Ghidra pipeline against swkotor.exe, explicitly pulling from CSWSArea::LoadGIT at 0x0050dd80.)

The LoadGIT subroutine is a massive dispatcher. It evaluates 3 immediate root scalars before handing off evaluation to 13 distinct object-list loaders mapping entities. Crucially, the flag UseTemplates dominates this process by dictating whether these lists refer to external files or contain fully inline entity data.

Root Behavior Properties

Field	Type	Engine Evaluation
`UseTemplates`	`BYTE`	Controls whether object arrays read `TemplateResRef` to construct entities, or fall back to inline evaluation.
`CurrentWeather`	`BYTE`	Standard `BYTE`. Reset to `0xFF` on Interior Areas.
`WeatherStarted`	`BYTE`	Standard `BYTE`. Zeroed to `0` on Interior Areas.

(The engine validates weather fields against the .are properties immediately during load).
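
The interior override can be pictured with a small sketch. The `Weather` struct and `weather_after_load` are hypothetical names, and `is_interior` is assumed to be derived from the matching `.are` file by the caller.

```rust
/// Hypothetical pair of root weather fields from a `.git`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Weather {
    current_weather: u8,
    weather_started: u8,
}

/// Returns the values that remain after load.
/// `is_interior` is an assumed input taken from the area's `.are` data.
fn weather_after_load(stored: Weather, is_interior: bool) -> Weather {
    if is_interior {
        // Interior areas discard whatever the file stored.
        Weather { current_weather: 0xFF, weather_started: 0 }
    } else {
        stored
    }
}
```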

Field Naming Inconsistencies

Due to legacy asset sprawl, the engine evaluates vectors explicitly according to vastly different naming conventions depending entirely on the entity class. This is hardcoded into swkotor.exe.

Target Lists	Position Paradigm	Orientation Paradigm
Creatures, Items, Waypoints, Stores	`XPosition`, `YPosition`, `ZPosition`	`XOrientation`, `YOrientation`, `ZOrientation`
Doors, Placeables	`X`, `Y`, `Z`	`Bearing` (Float angle)
Area Effects	`PositionX`, `PositionY`, `PositionZ`	`OrientationX`, `OrientationY`, `OrientationZ`
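
Tools that read or write these lists generically need a per-list label lookup. The sketch below encodes the table above; the `GitList` enum and both function names are hypothetical and exist only for this example.

```rust
/// The GIT instance lists covered by the naming table above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum GitList {
    Creatures,
    Items,
    Waypoints,
    Stores,
    Doors,
    Placeables,
    AreaEffects,
}

/// GFF labels read for an instance's position.
fn position_fields(list: GitList) -> [&'static str; 3] {
    match list {
        GitList::Creatures | GitList::Items | GitList::Waypoints | GitList::Stores => {
            ["XPosition", "YPosition", "ZPosition"]
        }
        GitList::Doors | GitList::Placeables => ["X", "Y", "Z"],
        GitList::AreaEffects => ["PositionX", "PositionY", "PositionZ"],
    }
}

/// GFF labels read for an instance's facing.
/// Doors and placeables use a single `Bearing` float instead of a vector.
fn orientation_fields(list: GitList) -> &'static [&'static str] {
    match list {
        GitList::Creatures | GitList::Items | GitList::Waypoints | GitList::Stores => {
            &["XOrientation", "YOrientation", "ZOrientation"]
        }
        GitList::Doors | GitList::Placeables => &["Bearing"],
        GitList::AreaEffects => &["OrientationX", "OrientationY", "OrientationZ"],
    }
}
```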

Warning

Orientation Normalization: The engine normalizes 3D orientation vectors on load. If an orientation vector (as in StoreList or AreaEffectList) resolves to 0.0 and therefore cannot be normalized, the engine catches the math fault and applies a hard fallback vector of (0, 1, 0).
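
One way to read the warning above is "normalize, and fall back when the length is zero". The sketch below implements that reading only; the exact condition the engine tests is an assumption here, and `normalize_or_fallback` is a hypothetical name.

```rust
/// Fallback orientation named in the warning above.
const FALLBACK_ORIENTATION: [f32; 3] = [0.0, 1.0, 0.0];

/// Normalizes `v`, returning the fallback when the vector has zero length.
/// Assumption: "resolves to 0.0" is interpreted as a zero-length vector.
fn normalize_or_fallback(v: [f32; 3]) -> [f32; 3] {
    let len = (v[0] * v[0] + v[1] * v[1] + v[2] * v[2]).sqrt();
    if len == 0.0 {
        return FALLBACK_ORIENTATION;
    }
    [v[0] / len, v[1] / len, v[2] / len]
}
```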

Standard Instance Arrays

Standard loaders evaluate the generic ObjectId, process the localized position/orientation floats, and dispatch behavior mapping logic.

List Name	Struct Target	Engine Triggers & Fallbacks
Creature List	`LoadCreatures`	Positions are explicitly validated defensively through `ComputeSafeLocation` bounds.
Door List	`LoadDoors`	Save states trigger `LoadObjectState`. External templates dynamically route to `LoadDoorExternal`.
WaypointList	`LoadWaypoints`	Completely ignores `UseTemplates`–it solely relies on inline data! Z-height is shifted dynamically via `ComputeHeight`.
TriggerList	`LoadTriggers`	Geometry properties reuse native `UTT` formatting. Contains unique linkage arrays: `LinkedToModule`, `TransitionDestination`, `LinkedTo`.

Specialized Struct Parsings

Engine Dispatch Target	Description & Findings
`LoadSounds` (`0x00505560`)	Discard logic: Reads `GeneratedType` as a DWord, but truncates it to an 8-bit byte on save, silently discarding the upper 24 bits!
`LoadEncounters` (`0x00505060`)	Highly nested structural array reusing the `Geometry` and `SpawnPointList` formats originally built for UTE.
`LoadPlaceableCameras` (`0x00505eb0`)	Client-side only struct that reads composite GFF spatial types natively. Camera Limit: If the list reaches `51` camera entries, the loader rejects it.
“List” (Items) (`0x00504de0`)	Bizarrely, the generic parent entity list `List` is used specifically to orchestrate Item instances!

Singular Structs

AreaProperties: Tracks stealth behavior state and dynamic audio state. It reads AmbientSndDayVol / AmbientSndNitVol as INT fields and truncates each one to a single runtime byte.
AreaMap: A strict binary blob of map rendering data (AreaMapData). It is bypassed entirely during fresh loads and only read when loading from a save game.
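
The byte truncations and the camera limit described in this section can be reproduced with plain integer casts. This is a sketch under assumptions: the helper names are hypothetical, and the truncation is assumed to keep the low 8 bits, which is what a plain `as u8` cast does in Rust.

```rust
/// Largest camera count the loader accepts; a list of 51 entries is rejected.
const MAX_CAMERAS: usize = 50;

/// `GeneratedType` is read as a DWORD but only the low byte survives a save.
fn truncate_generated_type(value: u32) -> u8 {
    value as u8
}

/// `AmbientSndDayVol` / `AmbientSndNitVol` are INT fields kept as one byte.
/// Assumption: the low 8 bits are kept, as with an `as u8` cast.
fn truncate_ambient_volume(value: i32) -> u8 {
    value as u8
}

/// True when the camera list is small enough to load.
fn camera_list_accepted(count: usize) -> bool {
    count <= MAX_CAMERAS
}
```

A value of `300` therefore comes back as `44`, which is why GIT-004 and GIT-005 flag anything outside `0..=255`.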

Implemented Linter Rules (Rakata-Lint)

Phase 1 (intra-resource, no context)

Implemented under rakata_lint::rules::git.

GIT-001 (Weather Zeroing): Informs when CurrentWeather != 0xFF or WeatherStarted=true is configured; if the area is an interior, the engine forcibly zeros these on load.
GIT-002 (Camera Array Bounds): Errors when CameraList contains 51 or more entries; triggers an immediate engine-level loader failure.
GIT-003 (Stealth Clamping): Warns when StealthXPCurrent > StealthXPMax; the engine clamps on evaluation.
GIT-004 (Ambient Volume Truncation): Warns when AmbientSndDayVol or AmbientSndNitVol are outside 0..=255; the engine truncates to an 8-bit byte.
GIT-005 (Sound GeneratedType Truncation): Warns when any sound’s GeneratedType exceeds 255; the engine truncates to an 8-bit byte on save.

Phase 2 (resource existence, requires `LintContext`)

Implemented under rakata_lint::rules::git_range.

GIT-006 (Template Resref Existence): Warns when any per-instance TemplateResRef does not resolve to its expected typed template file: CreatureList[].TemplateResRef (.utc), List[].TemplateResRef (.uti, item instances), Placeable List[].TemplateResRef (.utp), SoundList[].TemplateResRef (.uts), TriggerList[].TemplateResRef (.utt), StoreList[].ResRef (.utm), and Encounter List[].TemplateResRef (.ute). Door and Waypoint instances are inlined and have no template; trigger LinkedToModule is deferred to Phase 3 cross-resource checks.

IFO Format (Module Info Blueprint)

Description: The Module Info (.ifo) is the absolute root metadata file for any environment. It dictates global module behavior, handling everything from the starting spawn location, to the local calendar and time-of-day progression, to script execution for global module events.

At a Glance

Property	Value
Extension(s)	`.ifo`
Magic Signature	`IFO` / `V3.2`
Type	Module Blueprint
Rust Reference	View `rakata_generics::Ifo` in Rustdocs

Data Model Structure

Rakata parses the raw GFF structure into the rakata_generics::Ifo struct.

Typestate Extraction: By extracting the loosely-typed GFF binary into a strict Rust struct, Rakata natively implements robust parsing for all module-level configuration.

Engine Audits & Decompilation

The following documents the engine’s exact load sequence and field requirements for .ifo files mapped from swkotor.exe.

(Decompilation logic for this section was entirely audited and verified via native Ghidra pipeline against swkotor.exe, explicitly pulling from CSWSModule::LoadModuleStart at 0x004c9050.)

Global State Configurations

Field	Type	Engine Evaluation
`Mod_Entry_Area`	`ResRef`	The primary spawning area ResRef.
`Mod_Entry_X` / `Mod_Entry_Y` / `Mod_Entry_Z`	`FLOAT`	Exact spawning XYZ coordinates.
`Mod_Entry_Dir_X` / `Mod_Entry_Dir_Y`	`FLOAT`	Entry Direction Fallback: If the engine cannot evaluate `Mod_Entry_Dir_Y`, it forces a hard graphical fallback rendering the entity facing east (X=1.0, Y=0.0).
`Mod_XPScale`	`BYTE`	Global XP multiplier scale. Defaults natively to `10`.

Time & Cycle Management

Field	Type	Description
`Mod_DawnHour`	`BYTE`	Dawn hour integer marker.
`Mod_DuskHour`	`BYTE`	Dusk hour integer marker.
`Mod_MinPerHour`	`BYTE`	Number of real-time gameplay minutes that make up one module hour.

Note

Day/Night Cycle Computations: The engine continuously computes the day/night phase from Mod_DawnHour, Mod_DuskHour, and the current_hour. The result updates an internal state flag denoting: 1=Day, 2=Night, 3=Dawn, 4=Dusk.

Global Event Scripts

Event scripts are universally evaluated as string ResRef pointers executing compiled NSS logic. The engine evaluates 15 separate global events (like Mod_OnHeartbeat, Mod_OnModLoad, Mod_OnClientEntr, Mod_OnPlrDeath, etc).

Asymmetric I/O (Equipping): Mod_OnEquipItem is loaded at module startup (LoadModuleStart), but it is omitted entirely from the save-game serialization cycle (SaveModuleIFOStart).

Save-State Injection (Save Games Only)

Certain blocks of data inside the .ifo are deliberately evaluated only when the engine is mounting a module directly from a loaded .sav archive block.

Engine Target	Description
Player / Mod Variables	Structures like `Mod_PlayerList`, `Mod_Tokens`, `VarTable`, and the `EventQueue` are strictly bypassed unless natively evaluated under `is_save_game` conditions.
Area Overrides	The `Mod_Area_list` technically supports arrays (for NWN legacy), but KOTOR strictly enforces a single active area boundary. The secondary `ObjectId` within this specific array is only ever read natively inside a save state flow.
Legacy Hak De-sync	“Hak Packs” are custom override archives natively used in Neverwinter Nights (the engine’s predecessor). While KOTOR’s save routine (`SaveModuleIFOStart`) blindly writes a `Mod_Hak` string into save-games as leftover legacy behavior, the actual load cycle (`LoadModuleStart`) completely ignores it. Modders cannot use this field to hook custom archives.

Implemented Linter Rules (Rakata-Lint)

Phase 1 (intra-resource, no context)

Implemented under rakata_lint::rules::ifo.

IFO-001 (Direction Fallback): Warns when Mod_Entry_Dir_X and Mod_Entry_Dir_Y both evaluate to 0.0; the engine forces an unrecorded fallback direction locking spawn toward (1.0, 0.0).
IFO-002 (XP Dead-Scaling): Warns when Mod_XPScale == 0; aggressively halts all localized XP acquisition.
IFO-003 (Eternal Day/Night Bounds): Warns when Mod_DawnHour == Mod_DuskHour; locks the module into perpetual daylight (engine forces transition indices to 3 or 4).
IFO-004 (Void Area Initialization): Errors when Mod_Area_list is empty; directly faults the load cycle.
IFO-005 (Dangling NWM Structure): Warns when Mod_IsNWMFile=true without Mod_NWMResName; evaluates to an unstable execution state.
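
The first four Phase 1 checks reduce to simple comparisons. The sketch below restates them over a hypothetical `IfoSummary` struct; it does not reflect the real `rakata_lint` rule API, and the field widths are assumptions.

```rust
/// Hypothetical subset of IFO fields needed by the checks below.
struct IfoSummary {
    entry_dir_x: f32,
    entry_dir_y: f32,
    xp_scale: u8,
    dawn_hour: u8,
    dusk_hour: u8,
    /// Number of entries in `Mod_Area_list`.
    area_count: usize,
}

/// Returns the IDs of the rules (IFO-001 to IFO-004) that would fire.
fn ifo_phase1_findings(ifo: &IfoSummary) -> Vec<&'static str> {
    let mut ids = Vec::new();
    if ifo.entry_dir_x == 0.0 && ifo.entry_dir_y == 0.0 {
        ids.push("IFO-001"); // direction falls back to (1.0, 0.0)
    }
    if ifo.xp_scale == 0 {
        ids.push("IFO-002"); // XP gain is halted
    }
    if ifo.dawn_hour == ifo.dusk_hour {
        ids.push("IFO-003"); // day/night cycle is locked
    }
    if ifo.area_count == 0 {
        ids.push("IFO-004"); // load cycle faults
    }
    ids
}
```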

Phase 2 (resource existence, requires `LintContext`)

Implemented under rakata_lint::rules::ifo_range.

IFO-006 (Resref Existence): Warns when Mod_Entry_Area (.are), any Mod_Area_list[i].Area_Name (.are), or any of the 15 Mod_On* script hooks (.ncs) does not resolve in the configured resource sources.

Pending

Mod_StartMovie (.bik): No ResourceTypeCode variant for the Bink movie format yet.
Mod_CutSceneList[i].CutScene_Name: Engine resolution is .dlg or .bik depending on context (audit deferred).

UTC Format (Creature Blueprint)

Description: The Creature (.utc) blueprint format defines the attributes, stats, and behavior of all in-scene NPCs and monsters. It covers a creature’s identity, class/level, appearance, equipment, and event scripts. Because they hold so much state, Creatures are one of the most dynamic and memory-heavy templates processed by the Odyssey Engine.

At a Glance

Property	Value
Extension(s)	`.utc`
Magic Signature	`UTC` / `V3.2`
Type	Creature Blueprint
Rust Reference	View `rakata_generics::Utc` in Rustdocs

Data Model Structure

Rakata maps the Creature definition directly into the rakata_generics::Utc struct. To view the exhaustive binary schema and strict GFF field mappings, please refer to the Rustdocs for this struct, where each field is explicitly documented.

A Creature breaks down into six main categories:

Core Statistics: The basic stats that define the creature’s physical capabilities (e.g., Strength, Dexterity, base HitPoints).
Identity & Graphics: Identifiers that define who the creature is and what 3D model they use (e.g., Tag, Appearance_Type, Conversation).
Class & Skill Progression: The mechanics that define their level, classes, and skills (e.g., ClassList, SkillList).
Combat Capabilities: The specific feats and Force powers the creature can use (e.g., FeatList, SpellList).
Inventory & Equipment: The exact items the creature spawns with, including both equipped gear and inventory drops (e.g., Equip_ItemList, ItemList).
Event Hooks (Scripts): The behavior scripts that run when the creature reacts to the world, such as taking damage or noticing an enemy (e.g., OnNotice, OnDamaged).

State Validation: rakata-lint checks the data against engine constraints to prevent fatal runtime crashes.

Engine Audits & Decompilation

The following documents the engine’s exact load sequence and field requirements for .utc files mapped from swkotor.exe.

(Decompilation logic for this section was entirely audited and verified via native Ghidra pipeline against swkotor.exe, explicitly pulling from CSWSCreatureStats::ReadStatsFromGff at 0x005afce0.)

Structural Load Phasing

Function	Size	Behavior
`ReadStatsFromGff`	7835 B	The massive initial pass that parses 57 basic creature scalars including strength, dexterity, and physical appearance.
`LoadCreature`	–	Sets up how the creature physically sits in the world, handling their stealth states, collision size, and idle animations.
`ReadScriptsFromGff`	–	Attaches all the custom event scripts that fire when the creature notices an enemy, takes damage, dies, or simply stands around (heartbeat).
`ReadItemsFromGff`	–	Pulls all loot into memory, structuring items specifically into equipped slots, the backpack, or dropping them entirely if a creature spawns dead.
`ReadSpellsFromGff`	–	Specifically extracts the list of any Force powers or combat feats the creature is allowed to use.

Note

Zeroed Data Elements: Legacy structures referencing Tail and Wings are explicitly hardcoded to 0 during parsing and completely bypassed by the binary loader.

Core Structural Findings

The engine strictly validates parameters when loading a .utc file. Improper formatting will trigger some of KOTOR’s most notorious game crashes.

Warning

Understanding Fatal Crash Codes (0x5fX): When the game engine parses a file and hits an invalid stat, it aborts loading. Instead of recovering gracefully, the engine deliberately crashes to the desktop and returns a specific hexadecimal error code (e.g., 0x5f7 or 0x5f4). The rules below track the specific scenarios where the game will crash.

Engine Rule	Runtime Behavior
Class Limits	The engine expects a strict limit of 2 discrete class types. Providing duplicate class configuration completely crashes the game (Engine Error `0x5f7`).
Race Bounds	The engine compares `Race` against the compiled row count of `racialtypes.2da`. Exceeding this boundary fatally crashes the map loader (Engine Error `0x5f4`).
Saves Calculation	Pre-computed saving throws (`SaveWill`, `SaveFortitude`) in the `.utc` file are completely ignored dead data. The engine overrides them exclusively by reading `willbonus` and `fortbonus`.
Perception Faults	A non-PC `PerceptionRange` initiates a read against `appearance.2da` for `PERCEPTIONDIST`. Failing to resolve this distance fails the entire creature load (Engine Error `0x5f5`).
Movement Fallbacks	If a unique `MovementRate` isn’t declared, the engine logic falls back directly to default `WalkRate` parameters.
Hard Clamping	The engine limits specific numeric fields on load: `Gender` is clamped to a maximum of `4`, and `GoodEvil` is clamped to a maximum of `100`.
Appearance Shifting	If `Appearance_Head` is `0`, the engine overrides it to `1` to prevent rendering bugs.
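
The clamping and appearance rules above amount to three small adjustments. The sketch uses hypothetical helper names and assumes byte-wide fields purely for illustration; the actual field widths are documented in the Rustdocs.

```rust
/// `Gender` is capped at 4 on load.
fn clamp_gender(gender: u8) -> u8 {
    gender.min(4)
}

/// `GoodEvil` is capped at 100 on load.
fn clamp_good_evil(good_evil: u8) -> u8 {
    good_evil.min(100)
}

/// An `Appearance_Head` of 0 is replaced with 1.
fn effective_appearance_head(head: u8) -> u8 {
    if head == 0 {
        1
    } else {
        head
    }
}
```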

Legacy & Ignored Data

Finding Type	Explanation
Legacy Engine Artifacts	A staggering 17 `.utc` fields (such as `Morale`, `SaveWill`, `BlindSpot`, `PaletteID`) present in older files are actually Neverwinter Nights or KOTOR 2 superset metrics that the K1 engine natively ignores.

Vanilla Data Anomalies

Corpus surveys of the K1 GOG .utc set surface two anomalies in the SpecAbilityList field. Both are vanilla data quirks rather than decoder bugs; the structural reader represents them faithfully.

Stacked `SpecAbilityList` entries on the Bastila variants

Six Bastila .utc templates (bastila00c, p_bastilla, p_bastilla001, p_bastilla003, p_bastilla005, p_bastilla006) each carry 99 identical entries of Spell = 52 (SPECIAL_ABILITY_BODY_FUEL) in their SpecAbilityList. The engine’s loader (the SpecAbilityList block of CSWSCreatureStats::ReadStatsFromGff at 0x005afce0) walks each list element and unconditionally appends (Spell, SpellFlags, SpellCasterLevel) to the in-memory special_abilities_ array; there is no deduplication step. Each of the 99 entries occupies its own array slot with independent SpellFlags and SpellCasterLevel, so the stacking is faithfully preserved at runtime.

The loop iteration count itself is taken from CResGFF::GetListCount masked to a single byte (uVar17 & 0xff), so any UTC with more than 255 SpecAbilityList entries would have its count wrap modulo 256 at load: a list of 256 entries would load none, and a list of 300 would load only 44. Bastila’s 99 sits comfortably below that cap.
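
The effect of the byte mask is easiest to see with concrete numbers. The function below is a hypothetical stand-in that applies the same `& 0xff` mask to a list count; it is not the decompiled routine.

```rust
/// Number of `SpecAbilityList` iterations after the count is masked to a byte.
fn spec_ability_iterations(list_count: u32) -> u32 {
    list_count & 0xff
}
```

Because the mask keeps only the low 8 bits, counts above 255 wrap around rather than saturate: `256` yields `0` iterations and `300` yields `44`.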

Out-of-range `Spell` id on the `partymember` template

The partymember.utc template references Spell = 299. Vanilla K1 spells.2da has 132 rows (0–131), so 299 does not resolve to any row. The SpecAbilityList loader does not validate Spell against spells.2da at load time; the value is stored verbatim in the in-memory entry. spells.2da is itself read into a per-row struct array sized exactly to row_count (CSWClass::LoadSpellsTable at 0x005be4c0), so a use-time lookup of Spell = 299 indexes past the end of that array. The realised behaviour depends on heap layout at runtime and is not deterministic from the load path alone.
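
A tool can avoid the out-of-range lookup by resolving the id through a bounds-checked accessor. The sketch below is generic over the row type because the layout of a `spells.2da` row is not modeled here; `spell_row` is a hypothetical name.

```rust
/// Returns the row for `spell`, or `None` when the id is past the table end.
fn spell_row<T>(rows: &[T], spell: u32) -> Option<&T> {
    rows.get(spell as usize)
}
```

With a 132-row table, id `131` resolves and id `299` returns `None`, which is the condition a future lint rule could report.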

Both anomalies are candidate targets for future Phase 2 / Phase 3 UTC lint rules (e.g., “SpecAbilityList[].Spell must resolve to a row in spells.2da”; optionally “warn on stacked-duplicate SpecAbilityList entries unless explicitly whitelisted as a known vanilla pattern”).

Implemented Linter Rules (Rakata-Lint)

Phase 1 (intra-resource, no context)

Implemented under rakata_lint::rules::utc.

UTC-001 (Appearance Correction): Warns when Appearance_Head == 0; the engine forces this to 1 at runtime.
UTC-002 (Class Limit): Warns when more than 2 entries appear in ClassList; the engine ignores classes beyond the second.
UTC-003 (Class Duplications): Errors when duplicate class IDs exist in ClassList; causes a fatal engine crash (0x5f7) on load.
UTC-004 (Dead Save Fields): Informs when SaveWill or SaveFortitude are populated; the engine reads willbonus/fortbonus instead.
UTC-005 (Gender Clamp): Warns when Gender > 4; the engine clamps to a maximum of 4.
UTC-006 (GoodEvil Clamp): Warns when GoodEvil > 100; the engine clamps to a maximum of 100.
UTC-007 (Toolset / Legacy Fields): Informs when any of Comment, Morale*, PaletteID, BodyVariation, TextureVar, BlindSpot, MultiplierSet, NoPermDeath, IgnoreCrePath, Hologram, WillNotRender, or LawfulChaotic are set; never read by the K1 engine.
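
UTC-002 and UTC-003 both inspect `ClassList`, so they can be sketched together. The `ClassFinding` enum and `check_class_list` are hypothetical and do not mirror the real `rakata_lint` types; class ids are assumed to fit in an `i32`.

```rust
use std::collections::HashSet;

/// Hypothetical findings for the two `ClassList` rules.
#[derive(Debug, PartialEq, Eq)]
enum ClassFinding {
    /// UTC-002: more than two classes; the extras are ignored.
    TooManyClasses { count: usize },
    /// UTC-003: a class id appears again; fatal 0x5f7 on load.
    DuplicateClass { class_id: i32 },
}

/// Checks the class ids in list order.
/// A duplicate is reported once for each repeated occurrence.
fn check_class_list(class_ids: &[i32]) -> Vec<ClassFinding> {
    let mut findings = Vec::new();
    if class_ids.len() > 2 {
        findings.push(ClassFinding::TooManyClasses { count: class_ids.len() });
    }
    let mut seen = HashSet::new();
    for &class_id in class_ids {
        // `insert` returns false when the id was already present.
        if !seen.insert(class_id) {
            findings.push(ClassFinding::DuplicateClass { class_id });
        }
    }
    findings
}
```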

Phase 2 (range / 2DA / resref existence, requires `LintContext`)

Implemented under rakata_lint::rules::utc_range.

UTC-008 (Race Bounds): Errors when Race does not resolve to a row in racialtypes.2da; engine crash 0x5f4 on load.
UTC-009 (Class Bounds): Errors when any ClassList[].Class does not resolve to a row in classes.2da (or is negative); engine load failure.
UTC-010 (Appearance Bounds): Errors when Appearance does not resolve to a row in appearance.2da; engine renders missing model.
UTC-011 (Portrait Bounds): Errors when PortraitId (when not the 0xFFFE “use string Portrait” sentinel) does not resolve to a row in portraits.2da.
UTC-012 (Resref Existence): Warns when Conversation (.dlg), Portrait (.tga), any of the 14 Script* hooks (.ncs), Equip_ItemList[i].EquippedRes (.uti), or ItemList[i].InventoryRes (.uti) does not resolve in the configured resource sources.

UTD Format (Door Blueprint)

Description: The Door (.utd) blueprint defines interactive pathways on a level map. Beyond acting as physical barriers or transitions between areas, doors house lock mechanics, trap configurations, script hooks, and basic visual states (open, destroyed, jammed).

At a Glance

Property	Value
Extension(s)	`.utd`
Magic Signature	`UTD` / `V3.2`
Type	Door Blueprint
Rust Reference	View `rakata_generics::Utd` in Rustdocs

Data Model Structure

Rakata maps the Door definition directly into the rakata_generics::Utd struct. To view the exhaustive binary schema and strict GFF field mappings, please refer to the Rustdocs for this struct, where each field is explicitly documented.

A Door breaks down into four main categories:

Core Identity & Geometry: The configuration for what the door looks like, its faction, and the text displayed when targeted (e.g., Appearance, TemplateResRef, LocName).
Lock & Trap Mechanics: The parameters defining whether it’s locked, what key is needed, and rules for any attached traps (e.g., Locked, KeyName, TrapType, DisarmDC).
Transition Pathways: The linked destination used when a door acts as a loading zone to another area (e.g., LinkedTo, LinkedToFlags).
Behavioral Hooks (Scripts): The scripts that run when a player opens, destroys, or fails to unlock the door (e.g., OnOpen, OnFailToOpen, OnMeleeAttacked).

Active Validation: rakata-lint enforces checks against missing keys or invalid transition references before a module ever reaches the game engine.

Engine Audits & Decompilation

The following information documents the engine’s exact load sequence and field requirements for .utd files mapped from swkotor.exe.

Structural Load Phasing

The engine processes a Door structurally by mapping its sub-fields into distinct operational constraints.

Domain	Sub-fields Evaluated	Purpose
Scales & State	22	Reads the physical health, visual appearance, and base traits determining whether the door is locked or indestructible.
Hooks	15	Attaches custom event scripts that fire when the door is opened, forced, unlocked, or trapped.
Mechanical	9	Configures the lock difficulty tiers and the specific skill hurdles required to detect and disarm any attached traps.
Transitions	4	Links the door strictly to another area (`.are`), turning it into a physical loading screen transition node.

Core Structural Findings

The CSWSDoor parser natively guarantees strict state adjustments upon parsing.

Engine Rule	Runtime Behavior
Appearance Truncation	The engine reads `Appearance` as a 32-bit integer but forcefully truncates it to a single byte (`(byte)uVar5`). Any ID above `255` wraps modulo 256 (for example, `256` becomes `0` and `300` becomes `44`) and breaks the physical door model.
Static Enforcement	If the door is marked `Static`, the engine automatically forces `plot = 1`. This safely guarantees that static level architecture cannot be destroyed by players.
Portrait Shadowing	If `PortraitId` is `0`, the engine hardcodes it to `0x22E`. If `PortraitId` is `< 0xFFFE`, the engine completely ignores the `Portrait` string ResRef and relies entirely on the ID; in that case any value in the `Portrait` ResRef field is treated as dead data.
Trap Hook Fallback	If the `OnTrapTriggered` script is left empty, set to null, or literally named `"default"`, the engine pulls the default standard script from `traps.2da` instead.
HP Synchronization	`CurrentHP` is securely clamped against the door’s maximum `HP` to prevent overflow bugs.
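
Taken together, the rules in this table behave like a normalization pass over the stored fields. The sketch below is illustrative only: `DoorFields`, `LoadedDoor`, and `load_door` are hypothetical, the field widths are assumptions, and the `"default"` comparison is assumed to be an exact match.

```rust
/// Hypothetical subset of stored `.utd` fields.
struct DoorFields {
    appearance: u32,
    is_static: bool,
    plot: bool,
    portrait_id: u16,
    current_hp: i16,
    hp: i16,
    on_trap_triggered: String,
}

/// Hypothetical view of the values after the engine's adjustments.
struct LoadedDoor {
    appearance: u8,
    plot: bool,
    portrait_id: u16,
    current_hp: i16,
    /// True when the script is taken from `traps.2da` instead of the file.
    uses_trap_2da_script: bool,
}

fn load_door(f: &DoorFields) -> LoadedDoor {
    LoadedDoor {
        // Only the low byte of `Appearance` survives.
        appearance: f.appearance as u8,
        // Static doors are always plot doors.
        plot: f.plot || f.is_static,
        // A zero portrait id is replaced with 0x22E.
        portrait_id: if f.portrait_id == 0 { 0x22E } else { f.portrait_id },
        // `CurrentHP` cannot exceed `HP`.
        current_hp: f.current_hp.min(f.hp),
        uses_trap_2da_script: f.on_trap_triggered.is_empty()
            || f.on_trap_triggered == "default",
    }
}
```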

Legacy & Ignored Data

Finding Type	Explanation
Legacy Engine Artifacts	7 explicitly mapped template structures (like `AnimationState`, `NotBlastable`, `OpenLockDiff`) are Neverwinter Nights or KOTOR 2 legacy dependencies inherently ignored by the K1 parser.

Implemented Linter Rules (Rakata-Lint)

Phase 1 (intra-resource, no context)

Implemented under rakata_lint::rules::utd.

UTD-001 (Static Parity): Warns when Static=true but Plot=false; the engine forces Plot to true at runtime.
UTD-002 (HP Bounds): Errors when CurrentHP > HP; the engine clamps to HP on template load.
UTD-003 (Portrait Shadowing): Warns when PortraitId < 0xFFFE and Portrait resref is set; the resref is ignored at runtime.

Phase 2 (range / 2DA / resref existence, requires `LintContext`)

Implemented under rakata_lint::rules::utd_range.

UTD-004 (Generic Door Type Bounds): Errors when GenericType does not resolve to a row in genericdoors.2da; engine renders missing model.
UTD-005 (Portrait Bounds): Errors when PortraitId (when not the 0xFFFE “use string Portrait” sentinel) does not resolve to a row in portraits.2da.
UTD-006 (Resref Existence): Warns when Conversation (.dlg), Portrait (.tga), or any of the 15 On* script hooks (.ncs) does not resolve in the configured resource sources. LinkedToModule (area transition) is deferred to Phase 3.

Pending

Appearance Truncation: Flags legacy Appearance (u32) values above 255 (engine truncates to a single byte).
Trap Hook Fallback Detection: Scans for empty / null / literally-named "default" OnTrapTriggered references that silently invoke the traps.2da fallback.
Portrait Zero Hardcode: Detects PortraitId == 0 mappings since the engine hardcodes lookup to 0x22E.

UTE Format (Encounter Blueprint)

Description: The Encounter (.ute) blueprint defines interactive spawn points and boundary triggers across a level map. Instead of acting merely as a spatial zone, encounters handle difficulty scaling, creature limits, a spawn pool sorted by Challenge Rating, and explicit coordinate vertices to dynamically deploy combatants when a player crosses their geometry bounds.

At a Glance

Property	Value
Extension(s)	`.ute`
Magic Signature	`UTE` / `V3.2`
Type	Encounter Blueprint
Rust Reference	View `rakata_generics::Ute` in Rustdocs

Data Model Structure

Rakata maps the Encounter definition directly into the rakata_generics::Ute struct. To view the exhaustive binary schema and strict GFF field mappings, please refer to the Rustdocs for this struct, where each field is explicitly documented.

An Encounter breaks down into four main categories:

Spawn Population (CreatureList): The list of creature blueprints the encounter can spawn.
Difficulty & Limits: Setting how many creatures spawn at once and how difficult they should be relative to the player (e.g., MaxCreatures, DifficultyIndex).
Trigger Boundaries (Geometry): The coordinates defining the physical tripwire that triggers the spawn.
Behavioral Hooks (Scripts): The scripts that run when a player enters or exits the trigger, or when the spawn pool runs dry (e.g., OnEntered, OnExhausted).

Model Validation: rakata-lint checks the data against engine constraints to prevent fatal runtime crashes.

Engine Audits & Decompilation

The following information documents the engine’s exact load sequence and field requirements for .ute files mapped from swkotor.exe.

(Decompilation logic for this section was audited and verified via native Ghidra pipeline against swkotor.exe, explicitly pulling from the primary dispatcher CSWSEncounter::LoadEncounter at 0x00593830.)

Structural Load Phasing

The engine processes an Encounter structurally across several chunked subroutines, each responsible for unique spatial and logic bindings.

Function	Size	Behavior
`ReadEncounterFromGff`	3445 B	The initial pass that sets up the encounter’s identity, difficulty limits, and the spawn list.
`ReadEncounterScriptsFromGff`	567 B	Attaches scripts that trigger when players enter, exit, or exhaust the spawn pool.
`LoadEncounterSpawnPoints`	364 B	Reads the coordinates so the engine knows exactly where to spawn the creatures.
`LoadEncounterGeometry`	651 B	Reads the coordinates that trace the trigger’s boundaries on the floor.

Core Structural Findings

The engine rigorously evaluates geometric and spatial boundaries. Improper definitions break the spawn mapping algorithm.

Warning

Understanding Fatal Log Drops: While minor coordinate math errors usually just cause creatures to spawn inside walls, failing strict geometry constraints causes KOTOR to abruptly abort parsing the Encounter. Specifically, if a .ute file declares it has geometry boundaries but fails to provide the actual coordinate vertices, the engine dumps a fatal error to its trace log and refuses to spawn the encounter at all.

Engine Rule	Runtime Behavior
Tag Overrides	The engine forcefully converts any `Tag` to all-lowercase via `CSWSObject::SetTag`. Any static casing is lost immediately upon load.
Geometry Integrity	If `Geometry` is explicitly defined but has 0 vertices, the engine logs a “has geometry, but no vertices” error and aborts loading the encounter entirely.
Geometry Synthesis	If the `Geometry` list is completely omitted from the blueprint, the engine falls back and safely synthesizes a default 4-vertex spatial box.
Difficulty Resolution	The engine prioritizes using `DifficultyIndex` to look up the difficulty in `encdifficulty.2da`. The static `Difficulty` field is ignored unless the 2DA table fails to resolve.
Bubble Sorting	Upon loading the `CreatureList`, the engine runs a Bubble Sort algorithm to firmly re-order the encounter’s spawn pool by ascending CR (Challenge Rating), completely overriding any custom static display order.
Area Instantiation	`AreaList` buffer allocation size is strictly dictated by `AreaListMaxSize`. If the real list exceeds this size, the buffer will silently overrun.
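
Two of the rules above lend themselves to a short sketch: the three possible outcomes for `Geometry`, and the ascending-CR bubble sort over `CreatureList`. All names are hypothetical, the CR is assumed to be an `f32`, and the dimensions of the synthesized box are not modeled because they are not documented here.

```rust
/// What happens to an encounter's geometry at load time.
#[derive(Debug, PartialEq)]
enum GeometryOutcome {
    /// The stored vertices are used as-is.
    Use(Vec<[f32; 3]>),
    /// `Geometry` was omitted: a default 4-vertex box is synthesized.
    SynthesizeDefaultBox,
    /// `Geometry` was present but empty: the encounter load is aborted.
    AbortLoad,
}

/// `None` models a blueprint with no `Geometry` list at all.
fn resolve_geometry(geometry: Option<Vec<[f32; 3]>>) -> GeometryOutcome {
    match geometry {
        None => GeometryOutcome::SynthesizeDefaultBox,
        Some(vertices) if vertices.is_empty() => GeometryOutcome::AbortLoad,
        Some(vertices) => GeometryOutcome::Use(vertices),
    }
}

/// Bubble sort of `(creature, cr)` pairs by ascending CR.
fn bubble_sort_by_cr<T>(creatures: &mut [(T, f32)]) {
    let n = creatures.len();
    for pass in 0..n {
        let mut swapped = false;
        // Each pass moves the largest remaining CR to the end.
        for i in 0..n.saturating_sub(1 + pass) {
            if creatures[i].1 > creatures[i + 1].1 {
                creatures.swap(i, i + 1);
                swapped = true;
            }
        }
        if !swapped {
            break;
        }
    }
}
```

Since the sort runs on load, any ordering authored in the blueprint is only cosmetic.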

Legacy & Ignored Data

Finding Type	Explanation
Passive Legacy Artifacts	Unused fields left over from older tools or Odyssey branches (e.g., `TemplateResRef`, `Comment`, `PaletteID`) are completely dark. The engine inherently ignores them.
Superseded Legacy Fields	The static `Difficulty` field is a completely inactive legacy metric as long as `DifficultyIndex` maps to a valid row inside `encdifficulty.2da`.
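
The precedence between `DifficultyIndex` and the static `Difficulty` field can be sketched as a lookup with a fallback. Here `encdifficulty_values` is a hypothetical slice holding one resolved value per `encdifficulty.2da` row; which column the engine reads, and the integer types, are assumptions.

```rust
/// Picks the effective difficulty: the 2DA row wins, the static field is
/// used only when the index does not resolve to a row.
fn resolve_difficulty(
    difficulty_index: i32,
    encdifficulty_values: &[i32],
    static_difficulty: i32,
) -> i32 {
    let from_table = if difficulty_index >= 0 {
        encdifficulty_values.get(difficulty_index as usize).copied()
    } else {
        None
    };
    from_table.unwrap_or(static_difficulty)
}
```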

Implemented Linter Rules (Rakata-Lint)

Phase 1 (intra-resource, no context)

Implemented under rakata_lint::rules::ute.

UTE-001 (Dead Difficulty Traces): Warns when Difficulty > 0 while DifficultyIndex >= 0; the engine ignores the static Difficulty in favor of the 2DA lookup.
UTE-002 (Deficient Spawn Loops): Warns when an encounter is marked Active=true but CreatureList is empty.
UTE-003 (Dead Field Evaluation): Informs when TemplateResRef, Comment, or PaletteID are populated; never read by the K1 engine.
UTE-004 (Geometry Integrity Risk): Warns when Geometry has 0 vertices; an explicitly defined but empty geometry list makes the engine log a “has geometry, but no vertices” error and abort loading the encounter.
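
As a rough illustration of how the four Phase 1 checks fit together, the sketch below evaluates them over a hypothetical `UteFields` projection. It is not the actual `rakata_lint::rules::ute` implementation; the struct, the `Severity` enum, and the reading of "populated" as non-empty or non-zero are all assumptions made for the example.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Severity {
    Info,
    Warning,
}

/// Hypothetical projection of the fields the Phase 1 rules inspect.
#[derive(Debug, Default)]
pub struct UteFields {
    pub difficulty: i32,
    pub difficulty_index: i32,
    pub active: bool,
    pub creature_count: usize,
    pub template_resref: String,
    pub comment: String,
    pub palette_id: u8,
    /// `None` when the `Geometry` list is absent; `Some(n)` when present.
    pub geometry_vertices: Option<usize>,
}

pub fn lint_ute(ute: &UteFields) -> Vec<(&'static str, Severity)> {
    let mut findings = Vec::new();
    // UTE-001: static Difficulty is shadowed by the 2DA lookup.
    if ute.difficulty > 0 && ute.difficulty_index >= 0 {
        findings.push(("UTE-001", Severity::Warning));
    }
    // UTE-002: active encounter with nothing to spawn.
    if ute.active && ute.creature_count == 0 {
        findings.push(("UTE-002", Severity::Warning));
    }
    // UTE-003: "populated" is modelled as non-empty / non-zero (an assumption).
    if !ute.template_resref.is_empty() || !ute.comment.is_empty() || ute.palette_id != 0 {
        findings.push(("UTE-003", Severity::Info));
    }
    // UTE-004: geometry list present but empty.
    if ute.geometry_vertices == Some(0) {
        findings.push(("UTE-004", Severity::Warning));
    }
    findings
}
```

Note that an absent `Geometry` list (`None`) is deliberately not flagged, since the engine synthesizes a default box in that case.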

Phase 2 (resource existence, requires `LintContext`)

Implemented under rakata_lint::rules::ute_range.

UTE-005 (Resref Existence): Warns when any of OnEntered, OnExit, OnHeartbeat, OnExhausted, or OnUserDefined (.ncs) does not resolve, or when any CreatureList[i].ResRef (.utc) does not resolve in the configured resource sources.

UTI Format (Item Blueprint)

The Item (.uti) blueprint serves as the central data model for all tangible loot, weapons, armor, and usable gear in the game. It defines how an item appears on characters, what properties or stat bonuses it applies through its upgrade hierarchy, its monetary cost, and how it behaves at runtime when dropped into the world.

At a Glance

Property	Value
Extension(s)	`.uti`
Magic Signature	`UTI` / `V3.2`
Type	Item Blueprint
Rust Reference	View `rakata_generics::Uti` in Rustdocs

Data Model Structure

Rakata maps the Item definition directly into the rakata_generics::Uti struct. To view the exhaustive binary schema and strict GFF field mappings, please refer to the Rustdocs for this struct, where each field is explicitly documented.

An Item breaks down into four main categories:

Core Identity: The basic text strings that provide the item’s name and description, including both identified and unidentified states (e.g., TemplateResRef, LocName, Description).
Economic & Charge Mechanics: The value of the item, and the number of charges left for consumable abilities (e.g., Cost, Charges).
Visual Geometry (Appearance): Setting what the item looks like when dropped on the floor or equipped (e.g., ModelVariation, TextureVar).
Combat & Upgrade Properties (PropertiesList): The stat buffs, damage modifiers, and abilities bound to the item, alongside slots for workbench upgrades.

Model Validation: rakata-lint checks the data against engine constraints to prevent fatal runtime crashes.

Engine Audits & Decompilation

The following information documents the engine’s exact load sequence and field requirements for .uti files mapped from swkotor.exe.

(Decompilation logic for this section was entirely audited and verified via native Ghidra pipeline against swkotor.exe, explicitly pulling from the primary dispatcher CSWSItem::LoadDataFromGff at 0x0055fcd0, the active-property predicate CSWSItem::IsFriendlyUsableItem at 0x00553900, the property-string resolver CSWSItem::GetPropertyStrings at 0x00554e00, and the IPRP table loaders CTwoDimArrays::LoadIPRPCostTables at 0x005c4730 / LoadIPRPParamTables at 0x005c49c0.)

Structural Load Phasing

The engine loads an Item in several passes, each handled by a dedicated function.

Function	Size	Behavior
`LoadDataFromGff`	–	The main parser that sets what the item is, how many charges it holds, and its descriptions.
`LoadItemPropertiesFromGff`	–	Reads the special properties (like energy damage or stat boosts), splitting them into ‘useable’ abilities versus permanent buffs.
`LoadItem`	–	The constructor that decides whether to load the item onto a character or leave it idle in an inventory.
`LoadFromTemplate`	–	A fallback used when spawning an item dynamically from a script instead of off a character.
`SaveItem` / `SaveItemProperties`	–	The opposite pipeline that writes the item into a save game, which notoriously forces the item to always be flagged as “Identified”.

Core Structural Findings

The engine resolves base-item constraints from 2DA tables at load time and overrides improperly defined model fields.

Engine Rule	Runtime Behavior
Description Cross-Swap	If either `Description` or `DescIdentified` is missing, the engine automatically duplicates the provided string into the missing field so item identification mechanics never crash the game.
Model Truncation	If an older tool incorrectly configures `ModelVariation` to `0`, the engine forcefully bumps it to `1` upon load, ensuring the item always has visible geometry instead of rendering an invisible weapon or armor piece.
Model & Body Variation Hooks	The engine ignores the `.uti`’s `BodyVariation` field and uses the `body_var` value defined in `baseitems.2da` instead. Additionally, `TextureVar` is ignored unless the item’s base type is configured as Model Type 1.
Cost Generation Fallback	The `Cost` integer stored in the file is dead data. The engine computes the item’s value at runtime via `GetCost()` from its properties and ignores the stored value.
Identifier Enforcement	When serializing via `SaveItem` (when the player creates a save game), the engine unconditionally writes `Identified` as `1`.
Property Capabilities	Item properties are structurally split into Active and Passive memory tables at load. The engine evaluates every `PropertyName` index: any ID strictly mapping to `10`, `37`, `46`, or `53` (e.g., Cast Power, Trap) is actively hooked as a usable player ability, while all other integers are silently applied as passive stat modifiers.
Data-Driven Property Kinds	The engine does not hardcode a “PropertyName N -> semantic kind” table. Property-kind classification (Damage Bonus, Ability Bonus, Save Bonus, etc.) is resolved entirely by reading the `Label` column of `itempropdef.2da` at the row indexed by `PropertyName`. Mods that add new rows surface as new property kinds without engine changes.
Property Field Defaults	When a `PropertiesList` entry omits a field, the engine fills it from a fixed table: `Useable=1` for active properties (`PropertyName` 10/37/46/53), `Useable=0` for passive ones; `UsesPerDay=0xFF` and `UpgradeType=0xFF` – both `0xFF` values function as “not set” sentinels rather than valid row indices.
Bit Flags Application	The `Dropable` boolean explicitly sets bit 3 of the item’s internal memory flags, while `Pickpocketable` sets bit 4. Missing fields safely default to `0`.
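
A minimal sketch of several of the rules above, written as plain functions. `ItemFields` and every function name are hypothetical and exist only for this example; the flag helper assumes zero-based bit numbering (bit 3 = `0x08`, bit 4 = `0x10`), which the table does not state explicitly.

```rust
/// Hypothetical subset of item fields touched by the load rules above.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct ItemFields {
    pub description: Option<String>,
    pub desc_identified: Option<String>,
    pub model_variation: u8,
}

/// `PropertyName` ids the engine hooks as usable abilities.
pub const ACTIVE_PROPERTY_IDS: [u16; 4] = [10, 37, 46, 53];

pub fn is_active_property(property_name: u16) -> bool {
    ACTIVE_PROPERTY_IDS.contains(&property_name)
}

/// Default for a missing `Useable` field: 1 for active properties, else 0.
pub fn default_useable(property_name: u16) -> u8 {
    if is_active_property(property_name) {
        1
    } else {
        0
    }
}

pub fn apply_load_rules(item: &mut ItemFields) {
    // Description cross-swap: copy whichever string exists into the missing one.
    if item.desc_identified.is_none() {
        item.desc_identified = item.description.clone();
    } else if item.description.is_none() {
        item.description = item.desc_identified.clone();
    }
    // A ModelVariation of 0 is bumped to 1.
    if item.model_variation == 0 {
        item.model_variation = 1;
    }
}

/// `Dropable` maps to bit 3 and `Pickpocketable` to bit 4
/// (zero-based bit numbering is assumed).
pub fn item_flag_bits(dropable: bool, pickpocketable: bool) -> u32 {
    (u32::from(dropable) << 3) | (u32::from(pickpocketable) << 4)
}
```

For example, an item with only `Description` set ends up with both description fields holding the same string after `apply_load_rules`.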

Property Table Dispatch

A UtiProperty carries three indices that point through three separate registry-of-registries chains. The engine holds no hardcoded mapping for any of them; every dispatch is a 2DA cell read, so mods that extend the underlying tables surface without engine modification.

Per-property subtype dispatch (resolved at display time inside GetPropertyStrings at 0x00554e00):

Step	2DA	Indexed By	Column Read	Purpose
1	`itempropdef.2da`	`PropertyName`	`Name` (INT)	TLK strref for the property’s display name (e.g. “Damage Bonus”).
2	`itempropdef.2da`	`PropertyName`	`SubTypeResRef` (string)	Resref of the per-property subtype 2DA (e.g. `iprp_damagetype`). Empty/missing means the property has no subtype dimension.
3	(subtype 2DA from step 2)	`Subtype`	`Name` (INT)	TLK strref for the subtype’s display name (e.g. “Acid”).

Cost-table dispatch (resolved eagerly at startup inside LoadIPRPCostTables at 0x005c4730):

Step	2DA	Indexed By	Column Read	Purpose
1	`iprp_costtable.2da`	`CostTable`	`Name` (string)	Resref of the cost-specific 2DA (e.g. `iprp_meleecost`). Used as a resref despite the column name suggesting a label.
2	`iprp_costtable.2da`	`CostTable`	`ClientLoad` (INT, optional)	When set and the engine is running in client mode, the loader skips loading this row’s cost 2DA. Treated as server-only.
3	(cost 2DA from step 1)	`CostValue`	(table-specific)	The row at `CostValue` carries the cost effect for this property; column layout varies per cost table.

Param-table dispatch (resolved eagerly at startup inside LoadIPRPParamTables at 0x005c49c0):

Step	2DA	Indexed By	Column Read	Purpose
1	`iprp_paramtable.2da`	`Param1`	`TableResRef` (string)	Resref of the param-specific 2DA.
2	(param 2DA from step 1)	`Param1Value`	(table-specific)	The row at `Param1Value` carries the parameter value for this property; column layout varies per param table.

Engine constraints:

Both iprp_costtable.2da and iprp_paramtable.2da row counts are stored as byte (u8) in CTwoDimArrays. Rows past index 255 are silently truncated by the loader and the affected per-property tables never get loaded into memory.
Column-name lookups in 2DAs are case-sensitive at the engine API (C2DA::GetINTEntry / GetCExoStringEntry compare verbatim). The exact spellings the engine uses are Name, SubTypeResRef, TableResRef, Label, and ClientLoad.
The subtype 2DA listed in SubTypeResRef is loaded lazily on display via GetPropertyStrings, not eagerly at startup. A missing subtype 2DA fails only the call that needs it, not the whole game load.
The Name column on every level of the dispatch is a TLK strref. The Label column on the same row holds a developer-readable identifier (e.g. Damage_Bonus) that does not require talktable resolution.
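
The per-property subtype chain can be sketched with a small in-memory stand-in for 2DA tables. Everything here is illustrative: `TwoDa`, `Tables`, and both resolver functions are hypothetical names, tables are keyed by lowercased resref, and blank cells are modelled as either an empty string or the `****` placeholder used by 2DA files.

```rust
use std::collections::HashMap;

/// Minimal in-memory stand-in for a 2DA: one column -> cell map per row.
#[derive(Debug, Default, Clone)]
pub struct TwoDa {
    rows: Vec<HashMap<String, String>>,
}

impl TwoDa {
    pub fn from_rows(rows: Vec<Vec<(&str, &str)>>) -> Self {
        let rows = rows
            .into_iter()
            .map(|cells| {
                cells
                    .into_iter()
                    .map(|(column, value)| (column.to_string(), value.to_string()))
                    .collect()
            })
            .collect();
        TwoDa { rows }
    }

    /// Case-sensitive column lookup; blank cells ("" or "****") read as `None`.
    pub fn cell(&self, row: usize, column: &str) -> Option<&str> {
        let value = self.rows.get(row)?.get(column)?.as_str();
        if value.is_empty() || value == "****" {
            None
        } else {
            Some(value)
        }
    }
}

/// Tables keyed by lowercased resref (a convention chosen for this sketch).
pub type Tables = HashMap<String, TwoDa>;

/// Step 1: TLK strref of the property's display name.
pub fn property_name_strref(tables: &Tables, property_name: usize) -> Option<u32> {
    tables.get("itempropdef")?.cell(property_name, "Name")?.parse().ok()
}

/// Steps 2-3: follow `SubTypeResRef` to the subtype 2DA, then read `Name`.
/// Returns `None` when the property has no subtype dimension or a link is missing.
pub fn subtype_name_strref(tables: &Tables, property_name: usize, subtype: usize) -> Option<u32> {
    let resref = tables
        .get("itempropdef")?
        .cell(property_name, "SubTypeResRef")?
        .to_ascii_lowercase();
    tables.get(&resref)?.cell(subtype, "Name")?.parse().ok()
}
```

Because each step is an `Option`, a missing subtype table fails only the one lookup that needs it, which loosely mirrors the lazy-load behavior described above.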

Cost-Table Magnitude Resolution

The cost-table dispatch chain documented above ends at “the row at CostValue carries the cost effect for this property; column layout varies per cost table.” This section pins down the column layout for vanilla K1’s iprp_costtable.2da entries and how each Apply<PropertyKind> handler reads from them, sourced from the CSWSItemPropertyHandler::Apply* family in swkotor.exe (handlers cluster around 0x004e5490-0x004e7e80 and 0x004e9230-0x004e9390).

iprp_costtable.2da (vanilla K1) — index to per-cost 2DA mapping:

Index	`Name` (resref of per-cost 2DA)	`Label`	`ClientLoad`
0	`IPRP_BASE1`	`Base1`	0
1	`IPRP_BONUSCOST`	`Bonus`	0
2	`IPRP_MELEECOST`	`Melee`	1
3	`IPRP_CHARGECOST`	`SpellUse`	0
4	`IPRP_DAMAGECOST`	`Damage`	0
5	`IPRP_IMMUNCOST`	`Immune`	0
6	`IPRP_SOAKCOST`	`DamageSoak`	0
7	`IPRP_RESISTCOST`	`DamageResist`	0
8	`IPRP_BLADECOST`	`DancingScimitar`	0
9	`IPRP_SLOTSCOST`	`Slots`	0
10	`IPRP_WEIGHTCOST`	`Weight`	0
11	`IPRP_SRCOST`	`SpellResist`	0
12	`IPRP_STAMINACOST`	`Stamina`	0
13	`IPRP_SPELLLVCOST`	`SpellLevel`	0
14	`IPRP_AMMOCOST`	`Ammo`	0
15	`IPRP_REDCOST`	`WeightReduction`	0
16	`IPRP_SPELLCOST`	`Spells`	0
17	`IPRP_TRAPCOST`	`Traps`	0
18	`IPRP_LIGHTCOST`	`Light`	1
19	`IPRP_MONSTCOST`	`Monster_Cost`	0
20	`IPRP_NEG5COST`	`Negative_Modifiers`	0
21	`IPRP_NEG10COST`	`Negative_Modifiers`	0
22	`IPRP_DAMVULCOST`	`Damage_vulnerability`	0
23	`IPRP_SPELLLVLIMM`	`Spell_Level_Immunity`	0
24	`IPRP_ONHITCOST`	`OnHitCosts`	0
25	`IPRP_ONHITDC`	`OnHitDC_saves`	0

Per-handler magnitude resolution. Each Apply<Kind> handler that needs a cost-table magnitude calls CTwoDimArrays::GetIPRPCostTable(<index>) then C2DA::GetINTEntry(table, row=CostValue, column, out). The integer that comes back is the engine-side magnitude (in the units appropriate to the property kind: bonus number, damage soak amount, save delta, etc.). The column name is read case-sensitively; the only two columns the vanilla handlers consult are Value and Amount.

Handler	CostTable index	Per-cost 2DA	Column	Post-processing
`ApplyAbilityBonus`	1	`iprp_bonuscost`	`Value`	–
`ApplyACBonus`	1	`iprp_bonuscost`	`Value`	–
`ApplyImprovedSavingThrow`	1	`iprp_bonuscost`	`Value`	–
`ApplyDamageReduction`	6	`iprp_soakcost`	`Amount`	–
`ApplyDamageResistance`	7	`iprp_resistcost`	`Amount`	–
`ApplyImprovedForceResistance`	11 (`0xB`)	`iprp_srcost`	`Value`	–
`ApplyAttackPenalty`	20 (`0x14`)	`iprp_neg5cost`	`Value`	negate
`ApplyDamagePenalty`	20 (`0x14`)	`iprp_neg5cost`	`Value`	negate
`ApplyReducedSavingThrows`	20 (`0x14`)	`iprp_neg5cost`	`Value`	none (table holds negatives)
`ApplyDecreasedAC`	20 (`0x14`)	`iprp_neg5cost`	`Value`	negate
`ApplyDecreasedAbilityScore`	21 (`0x15`)	`iprp_neg10cost`	`Value`	negate
`ApplyDecreasedSkillModifier`	21 (`0x15`)	`iprp_neg10cost`	`Value`	negate
`ApplyDamageVulnerability`	22 (`0x16`)	`iprp_damvulcost`	`Value`	–
`ApplyDamageImmunity`	dynamic (`property.cost_table`)	per-property	`Value`	–

Handlers that bypass the cost-table dispatch. A surprising number of vanilla handlers do not call GetIPRPCostTable at all and instead consume CostValue (or another property field) directly as the magnitude:

ApplyDamageBonus (covers PropertyName 11 Damage, 12 DamageAlignmentGroup, and 13 DamageRacialGroup in one switch) reads CostValue straight as the damage amount. There is no per-cost 2DA lookup. The iprp_damagecost.2da table is used for cost calculation (GetCost), not for damage-magnitude resolution.
ApplyEnhancementBonus and ApplyAttackBonus read (Rules->internal).all_2DAs->iprp_meleecost via direct struct-field access (not through GetIPRPCostTable), then read column Value. Equivalent to a cost-table-index 2 (iprp_meleecost) dispatch, just inlined.
ApplySkillBonus and ApplyBonusFeat read the magnitude / feat id from the property struct directly.
ApplyImmunity switches on the subtype id and assigns one of ten hardcoded engine constants; no 2DA is consulted.
ApplyRegeneration uses CostValue as the regen amount and a hardcoded 6000 ms tick interval; no 2DA.

Implications for decoded magnitude resolution. A decoder that resolves property magnitudes should:

First check whether the property kind is on the cost-table list above; if yes, read the resolved magnitude from the listed cost 2DA at row CostValue, column Value or Amount, with the documented post-processing.
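If the property kind is on the bypass list, no GetIPRPCostTable lookup applies: ApplyDamageBonus and ApplyRegeneration use CostValue directly, ApplyEnhancementBonus and ApplyAttackBonus read column Value of iprp_meleecost through the inlined access, ApplySkillBonus and ApplyBonusFeat take their value from the property struct, and ApplyImmunity assigns a hardcoded constant per subtype.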
For ApplyDamageImmunity, the cost-table index is read from the property’s own CostTable field rather than being hardcoded per handler; mod-extended cost tables resolve through the same path.
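
One way a decoder might encode the handler tables above is as a lookup from handler name to a magnitude source. This is a sketch under stated assumptions rather than the `rakata_generics` decoder: `MagnitudeSource`, `magnitude_source`, and `resolve_magnitude` are hypothetical, and the `lookup` closure stands in for the `GetIPRPCostTable` plus `GetINTEntry` pair.

```rust
/// Where a handler's magnitude comes from (simplified, hypothetical model).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MagnitudeSource {
    /// Fixed cost-table index; read `column` at row `CostValue`.
    CostTable { index: u8, column: &'static str, negate: bool },
    /// Cost-table index taken from the property's own `CostTable` field.
    PropertyCostTable { column: &'static str },
    /// `CostValue` itself is the magnitude.
    CostValueDirect,
    /// Resolved from property struct fields or hardcoded constants.
    Other,
}

fn table(index: u8, column: &'static str, negate: bool) -> Option<MagnitudeSource> {
    Some(MagnitudeSource::CostTable { index, column, negate })
}

pub fn magnitude_source(handler: &str) -> Option<MagnitudeSource> {
    match handler {
        "ApplyAbilityBonus" | "ApplyACBonus" | "ApplyImprovedSavingThrow" => table(1, "Value", false),
        // Inlined struct-field access in the engine, equivalent to index 2.
        "ApplyEnhancementBonus" | "ApplyAttackBonus" => table(2, "Value", false),
        "ApplyDamageReduction" => table(6, "Amount", false),
        "ApplyDamageResistance" => table(7, "Amount", false),
        "ApplyImprovedForceResistance" => table(11, "Value", false),
        "ApplyAttackPenalty" | "ApplyDamagePenalty" | "ApplyDecreasedAC" => table(20, "Value", true),
        // The table already holds negatives, so no negation here.
        "ApplyReducedSavingThrows" => table(20, "Value", false),
        "ApplyDecreasedAbilityScore" | "ApplyDecreasedSkillModifier" => table(21, "Value", true),
        "ApplyDamageVulnerability" => table(22, "Value", false),
        "ApplyDamageImmunity" => Some(MagnitudeSource::PropertyCostTable { column: "Value" }),
        "ApplyDamageBonus" | "ApplyRegeneration" => Some(MagnitudeSource::CostValueDirect),
        "ApplySkillBonus" | "ApplyBonusFeat" | "ApplyImmunity" => Some(MagnitudeSource::Other),
        _ => None,
    }
}

/// `lookup(table_index, row, column)` returns the integer cell, if any.
pub fn resolve_magnitude<F>(handler: &str, cost_table: u8, cost_value: u16, lookup: F) -> Option<i32>
where
    F: Fn(u8, u16, &str) -> Option<i32>,
{
    match magnitude_source(handler)? {
        MagnitudeSource::CostTable { index, column, negate } => {
            let value = lookup(index, cost_value, column)?;
            Some(if negate { -value } else { value })
        }
        MagnitudeSource::PropertyCostTable { column } => lookup(cost_table, cost_value, column),
        MagnitudeSource::CostValueDirect => Some(i32::from(cost_value)),
        MagnitudeSource::Other => None,
    }
}
```

Returning `None` for the `Other` group keeps the sketch honest: those handlers need data (subtype constants, property struct fields) that this simplified signature does not carry.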

Vanilla `itempropdef.2da` Label Reference

The following table lists every label in the vanilla K1 itempropdef.2da. The decoder in rakata_generics::decoded matches on the Label column at the row indexed by UtiProperty::property_name. The Subtype 2DA column is the file’s SubTypeResRef cell verbatim (lowercased per the engine’s case-insensitive resref handling); an empty cell means the property has no subtype dimension.

The four rows the engine treats as active (loaded into the per-character usable-ability table per IsFriendlyUsableItem) are marked. Every other row is passive.

Row	Label	Subtype 2DA	Notes
0	`Ability`	`iprp_abilities`
1	`Armor`	–	AC base bonus
2	`ArmorAlignmentGroup`	`iprp_aligngrp`
3	`ArmorDamageType`	`iprp_combatdam`
4	`ArmorRacialGroup`	`racialtypes`
5	`Enhancement`	–	Enhancement bonus to weapons
6	`EnhancementAlignmentGroup`	`iprp_aligngrp`
7	`EnhancementRacialGroup`	`racialtypes`
8	`AttackPenalty`	–
9	`BonusFeats`	`feat`
10	`CastSpell`	`spells`	active
11	`Damage`	`iprp_damagetype`
12	`DamageAlignmentGroup`	`iprp_aligngrp`
13	`DamageRacialGroup`	`racialtypes`
14	`DamageImmunity`	`iprp_damagetype`
15	`DamagePenalty`	–
16	`DamageReduced`	`iprp_protection`
17	`DamageResist`	`iprp_damagetype`
18	`Damage_Vulnerability`	`iprp_damagetype`
19	`DecreaseAbilityScore`	`iprp_abilities`
20	`DecreaseAC`	`iprp_acmodtype`
21	`DecreasedSkill`	`skills`
22	`DamageMelee`	`iprp_combatdam`
23	`DamageRanged`	`iprp_combatdam`
24	`Immunity`	`iprp_immunity`
25	`ImprovedMagicResist`	–
26	`ImprovedSavingThrows`	`iprp_saveelement`
27	`ImprovedSavingThrowsSpecific`	`iprp_savingthrow`
28	`Keen`	–
29	`Light`	–
30	`Mighty`	–
31	`DamageNone`	–
32	`OnHit`	`iprp_onhit`
33	`ReducedSavingThrows`	`iprp_saveelement`
34	`ReducedSpecificSavingThrow`	`iprp_savingthrow`
35	`Regeneration`	–
36	`Skill`	`skills`
37	`ThievesTools`	–	active
38	`AttackBonus`	–
39	`AttackBonusAlignmentGroup`	`iprp_aligngrp`
40	`AttackBonusRacialGroup`	`racialtypes`
41	`ToHitPenalty`	–
42	`UnlimitedAmmo`	`iprp_ammotype`
43	`UseLimitationAlignmentGroup`	`iprp_aligngrp`
44	`UseLimitationClass`	`classes`
45	`UseLimitationRacial`	`racialtypes`
46	`Trap`	`traps`	active
47	`True_Seeing`	–
48	`OnMonsterHit`	`iprp_monsterhit`
49	`Massive_Criticals`	–
50	`Freedom_of_Movement`	–
51	`Monster_damage`	–
52	`Special_Walk`	`iprp_walk`
53	`Computer_Spike`	–	active
54	`Regeneration_Force_Points`	–
55	`Blaster_Bolt_Deflect_Increase`	–
56	`Blaster_Bolt_Defect_Decrease`	–	Vanilla typo (`Defect` not `Deflect`); decoder must match the file spelling exactly.
57	`Use_Limitation_Feat`	`feat`
58	`Droid_Repair_Kit`	–
59	`Disguise`	`appearance`

Mod content extends this table with rows past index 59. The decoder’s typed-variant dispatch matches by Label, so a mod-added kind surfaces as DecodedProperty::Unknown { property_label: Some("ModLabel"), .. } instead of as a dispatch hole.
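
Label-based dispatch with an `Unknown` fallback might look roughly like the sketch below. Only the shape of the `Unknown` variant is taken from the text above; the other variants, the `property_name` field, and the `decode` signature are simplified stand-ins and should not be read as the real `rakata_generics::decoded` API.

```rust
/// Simplified stand-in for the decoder's output type. Only the `Unknown`
/// shape follows the text above; the other variants are illustrative.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum DecodedProperty {
    AbilityBonus,
    DamageBonus,
    BlasterBoltDeflectDecrease,
    Unknown {
        property_label: Option<String>,
        property_name: u16,
    },
}

/// Rows 10, 37, 46 and 53 are the active rows in the vanilla table.
pub fn is_active_row(property_name: u16) -> bool {
    matches!(property_name, 10 | 37 | 46 | 53)
}

/// `label` is the `Label` cell of `itempropdef.2da` at row `property_name`,
/// or `None` when the row does not exist. Matching is exact and case-sensitive.
pub fn decode(property_name: u16, label: Option<&str>) -> DecodedProperty {
    match label {
        Some("Ability") => DecodedProperty::AbilityBonus,
        Some("Damage") => DecodedProperty::DamageBonus,
        // Vanilla spelling ("Defect"), matched exactly as it appears in the file.
        Some("Blaster_Bolt_Defect_Decrease") => DecodedProperty::BlasterBoltDeflectDecrease,
        other => DecodedProperty::Unknown {
            property_label: other.map(str::to_string),
            property_name,
        },
    }
}
```

With this shape, a mod-added row still carries its label through to the caller, and a corrected spelling such as `Blaster_Bolt_Deflect_Decrease` would fall into `Unknown` rather than silently matching.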

Legacy & Ignored Data

Finding Type	Explanation
Superseded Legacy Fields	Static `Cost` and `BodyVariation` values are a byproduct of older file versions; the engine never uses them and evaluates the corresponding values at runtime (`GetCost()` and `baseitems.2da`, respectively).
Passive Legacy Artifacts	Fields left over from older tools (`TemplateResRef`, `Comment`, `PaletteID`, and `UpgradeLevel`) are bypassed entirely on load.

Implemented Linter Rules (Rakata-Lint)

Phase 1 (intra-resource, no context)

Implemented under rakata_lint::rules::uti.

UTI-001 (Model Truncation Safety): Warns when ModelVariation == 0; the engine forces this to 1 at runtime.
UTI-002 (Dead Cost Fields): Informs when Cost is set; the engine ignores this and computes item cost dynamically.
UTI-003 (Dead Body Overrides): Informs when BodyVariation is set; the engine queries baseitems.2da instead.
UTI-004 (Toolset-Only Fields): Informs when any of TemplateResRef, Comment, PaletteID, or UpgradeLevel are set; never read by the K1 engine.
UTI-005 (Conditional TextureVar): Informs when TextureVar is set; only evaluated if the base item’s 2DA model_type is exactly 1.

Phase 2 (range / 2DA, requires `LintContext`)

Implemented under rakata_lint::rules::uti_range.

UTI-006 (Base Item Bounds): Errors when BaseItem does not resolve to a row in baseitems.2da (or is negative); the engine indexes the table directly to look up model type, equip slot, and weapon class – an invalid id either crashes the load or produces a corrupt item.
UTI-007 (Valid Capability Bounds): Errors per PropertiesList entry when PropertyName does not resolve to a row in itempropdef.2da, or when Subtype does not resolve to a row in the per-property iprp_*.2da named by itempropdef[PropertyName].SubTypeResRef (skipped when the row has no SubTypeResRef, i.e. the property kind has no subtype dimension). UpgradeType and UsesPerDay use the engine’s 0xFF “not set” sentinel; both the absent-field and explicit-0xFF forms decode as “not set” and the rule does not flag either.
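
The sentinel handling and range checks described for UTI-007 can be illustrated with a few helpers. These are hypothetical functions written for this page, not the `rakata_lint::rules::uti_range` code; row counts are passed in directly rather than read from a real `LintContext`.

```rust
/// The engine's "not set" sentinel for `UpgradeType` and `UsesPerDay`.
pub const NOT_SET: u8 = 0xFF;

/// Both an absent field (`None`) and an explicit 0xFF decode as "not set".
pub fn decode_sentinel(raw: Option<u8>) -> Option<u8> {
    raw.filter(|&value| value != NOT_SET)
}

/// `PropertyName` must index an existing row of `itempropdef.2da`.
pub fn property_name_ok(property_name: u16, itempropdef_rows: usize) -> bool {
    usize::from(property_name) < itempropdef_rows
}

/// `subtype_rows` is `None` when the property row has no `SubTypeResRef`,
/// in which case the subtype check is skipped.
pub fn subtype_ok(subtype: u16, subtype_rows: Option<usize>) -> bool {
    subtype_rows.map_or(true, |rows| usize::from(subtype) < rows)
}
```

Treating the absent and explicit-`0xFF` forms identically is what keeps the rule from flagging either.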

Pending

Resref Existence: UTI’s only ResRef field is the toolset-only TemplateResRef (never read by the engine), so the per-format resref-existence rule from B6 was deliberately omitted.

UTM Format (Merchant Blueprint)

Description: The Merchant (.utm) blueprint natively handles the interactive storefront data for merchants and shops. Because shops strictly behave as container interfaces that dynamically buy, sell, and map economic value onto spawned .uti items, the structure of a .utm is highly compact, primarily consisting of economic markups and inventory sorting parameters.

At a Glance

Property	Value
Extension(s)	`.utm`
Magic Signature	`UTM` / `V3.2`
Type	Merchant Blueprint
Rust Reference	View `rakata_generics::Utm` in Rustdocs

Data Model Structure

Rakata maps the Merchant definition directly into the rakata_generics::Utm struct. To view the exhaustive binary schema and strict GFF field mappings, please refer to the Rustdocs for this struct, where each field is explicitly documented.

A Merchant breaks down into three main categories:

Core Identity: The basic identifiers providing the shop’s name and tag (e.g., Tag, LocName).
Economic Metrics: The percentages controlling price scaling when buying or selling items, alongside basic shop rules (e.g., MarkUp, MarkDown, BuySellFlag).
Store Inventory (ItemList): The list of items actively available in the shop’s stock, including rules for infinite regeneration.

Typestate Extraction: By mapping the raw GFF structure into the strict rakata_generics::Utm struct, fields like MarkUp, BuySellFlag, and inventory lists are safely constrained to proper domain types (e.g., bounds-checked i32 or strictly formatted ResRef).

Engine Audits & Decompilation

Because a .utm is structurally straightforward, the engine avoids heavy memory allocation and maps its fields in a single fast pass.

(Decompilation logic for this section was entirely audited and verified via native Ghidra pipeline against swkotor.exe, explicitly pulling from CSWSStore::LoadStore at 0x005c7180.)

Structural Load Phasing

Function	Size	Behavior
`LoadStore`	1341 B	The primary parser that pulls the merchant’s basic identity, economic constraints (`MarkUp`/`MarkDown`), and buying capabilities.
`ItemList Read`	–	Iterates through the list of store stock, actively pulling either explicitly saved item instances or generating them freshly from templates (`InventoryRes`).
`AddItemToInventory`	–	Pushes the fully sorted loot stack into the physical storefront container so the player can actually interact with and purchase them.

Core Structural Findings

Engine Rule	Runtime Behavior
Cost Sorting	When building the store inventory, the engine actively sorts the merchant’s final stock from cheapest to most expensive by checking the cost of each item. This completely overrides whatever custom display order you try to dictate statically.
Dynamic Economics	The engine relies entirely on the `MarkUp` and `MarkDown` integers to control shop prices. These act as simple percentages that mathematically bump or slash the base cost of every item the merchant sells or buys.
Buy/Sell Bit Flags	`BuySellFlag` is split into basic toggles: bit 0 controls whether you are allowed to sell your gear to the merchant, and bit 1 controls whether the merchant will actually sell anything to you.
Infinite Stacking	If an item is flagged as `Infinite`, the engine specifically locks that item in memory so that no matter how many times a player buys it, the shop never physically runs out of stock.
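
The flag and pricing rules above reduce to a few lines of arithmetic. The sketch below is illustrative only: the names are hypothetical, `BuySellFlag` is modelled as a `u8`, and the price formula assumes the percentage is applied as `base * percent / 100` with integer truncation, which the table does not spell out.

```rust
/// Bit 0: the merchant buys, i.e. the player may sell to the merchant.
pub const MERCHANT_BUYS: u8 = 0b01;
/// Bit 1: the merchant sells, i.e. the player may buy from the merchant.
pub const MERCHANT_SELLS: u8 = 0b10;

pub fn player_can_sell(buy_sell_flag: u8) -> bool {
    (buy_sell_flag & MERCHANT_BUYS) != 0
}

pub fn player_can_buy(buy_sell_flag: u8) -> bool {
    (buy_sell_flag & MERCHANT_SELLS) != 0
}

/// Bits outside the two documented toggles.
pub fn unknown_flag_bits(buy_sell_flag: u8) -> u8 {
    buy_sell_flag & !(MERCHANT_BUYS | MERCHANT_SELLS)
}

/// Applies `MarkUp` or `MarkDown` as a percentage of the base cost.
/// Truncating integer division is an assumption of this sketch.
pub fn scaled_price(base_cost: i32, percent: i32) -> i64 {
    i64::from(base_cost) * i64::from(percent) / 100
}

/// Hypothetical stock entry: a resref plus its computed cost.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct StockItem {
    pub resref: String,
    pub cost: i32,
}

/// Cheapest-first ordering, regardless of the order stored in the file.
pub fn sort_stock(stock: &mut [StockItem]) {
    stock.sort_by_key(|item| item.cost);
}
```

For instance, under the assumed formula a base cost of 200 with a `MarkUp` of 150 yields 300, and a `MarkDown` of 50 yields 100.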

Legacy & Ignored Data

Finding Type	Explanation
Legacy Interface Configurations	Some older tools expose positional values like `Repos_PosX` or `Repos_PosY` inherited from other Odyssey games, but the engine completely ignores them. The game physically builds its shop UI dynamically when you open it, rendering those grid coordinates totally useless.

Implemented Linter Rules (Rakata-Lint)

Phase 1 (intra-resource, no context)

Implemented under rakata_lint::rules::utm.

UTM-001 (Legacy Grid Coordinates): Informs when inventory items contain non-zero Repos_PosX or Repos_PosY; the engine builds its shop UI dynamically and ignores these coordinates.
UTM-002 (Unknown Buy/Sell Flags): Warns when BuySellFlag has bits set outside the canonical buy (bit 0, the merchant buys from the player) and sell (bit 1, the merchant sells to the player) toggles.
UTM-003 (Legacy Store UI Fallback): Warns when BuySellFlag == 0 (missing or empty); the engine falls back to legacy UI behaviors and forcefully clamps MarkUp to 100.

Phase 2 (resource existence, requires `LintContext`)

Implemented under rakata_lint::rules::utm_range.

UTM-004 (Resref Existence): Warns when OnOpenStore (.ncs) or any ItemList[i].InventoryRes (.uti) does not resolve in the configured resource sources. The toolset-only top-level ResRef (merchant template) is intentionally skipped – it is never read by the engine.

UTP Format (Placeable Blueprint)

Description: The Placeable (.utp) blueprint dictates the configuration of universally interactive scenery and containers within a map. Ranging from simple locked footlockers to rigged command consoles and explodable starship barricades, .utp structs blend physical static properties (like structural HP and lock difficulties) with heavy dynamic script bindings.

At a Glance

Property	Value
Extension(s)	`.utp`
Magic Signature	`UTP` / `V3.2`
Type	Placeable Blueprint
Rust Reference	View `rakata_generics::Utp` in Rustdocs

Data Model Structure

Rakata maps the Placeable definition directly into the rakata_generics::Utp struct. To view the exhaustive binary schema and strict GFF field mappings, please refer to the Rustdocs for this struct, where each field is explicitly documented.

A Placeable breaks down into five main categories:

Core Identity & Geometry: The configuration for what the placeable looks like, its faction, and the text displayed when targeted (e.g., Appearance, TemplateResRef, LocName).
Interactive State & Dialogue: Flags determining if the placeable can be clicked, if it starts a conversation/computer sequence, or if it acts as a loot container (e.g., Useable, Conversation, HasInventory).
Lock & Trap Mechanics: The parameters defining whether it’s locked, what key is needed, and rules for any attached traps (e.g., Locked, KeyName, TrapType, DisarmDC).
Health & Destruction: The physical integrity of the object, defining if it can be destroyed and its defensive thresholds (e.g., HP, Hardness, Static, Plot).
Behavioral Hooks (Scripts): The scripts that run when a player explores, attacks, or opens the placeable (e.g., OnOpen, OnInvDisturbed, OnDamaged).

State Validation: rakata-lint checks the data against engine constraints to prevent runtime bugs.

Engine Audits & Decompilation

(Decompilation logic for this section was entirely audited and verified via native Ghidra pipeline against swkotor.exe, explicitly pulling from CSWSPlaceable::LoadPlaceable at 0x00585670.)

Because Placeables are a common attachment point for event scripts, they expose a large set of script triggers.

Structural Load Phasing

Function	Size	Behavior
`LoadPlaceable`	5092 B	The primary physical parser evaluating 46 core metrics including health, conversation dialogues, basic trap bindings, and physical alignment states.
`ReadScriptsFromGff`	–	Attaches 16 dedicated script hooks dictating behavior when the placeable is bashed, opened, unlocked, or triggered.

Core Structural Findings

Engine Rule	Runtime Behavior
Appearance Truncation	The engine reads `Appearance` as a 32-bit integer but truncates it to a single byte. Any ID above `255` wraps modulo 256 (`256` becomes `0`, `300` becomes `44`), which breaks the placeable’s model rendering.
Static vs. Plot Chaining	Just like Doors, if a Placeable is marked `Static=1`, the engine completely overrides all other behaviors and acts as if `Plot=1` is true, making the placeable totally indestructible even if it has an HP value defined.
Default Usability Check	If the `Static` field is missing from the binary file, the engine derives it as the inverse of `Useable` (`Static = !Useable`): a placeable that is not usable is treated as static.
Portrait Shadowing	If `PortraitId` is `< 0xFFFE`, the engine completely ignores the `Portrait` string ResRef and relies entirely on the ID. Any value in the `Portrait` ResRef field is treated as dead data.
Ground Pile Forcing	The engine reads the `GroundPile` value from the file but immediately overwrites it with `1` in memory, so setting this field in the blueprint has no effect.
Missing Door Hooks	Toolsets erroneously expose `OnFailToOpen` for Placeables, but the engine specifically treats this as a Door-exclusive (`.utd`) script hook and completely ignores it here.
Trap Hook Fallback	If a trap bounds check fails or the `OnTrapTriggered` script is left blank, the engine automatically attempts to read the `traps.2da` table and pulls the default script based on the specific `TrapType`.
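
Several of the rules above are simple enough to express as one-line functions. The following is a sketch with hypothetical names; modelling `PortraitId` as a `u16` is an assumption made for the example.

```rust
/// The 32-bit `Appearance` value keeps only its low byte.
pub fn stored_appearance(appearance: u32) -> u8 {
    appearance as u8
}

/// A missing `Static` field is derived as `!Useable`.
pub fn effective_static(static_field: Option<bool>, useable: bool) -> bool {
    static_field.unwrap_or(!useable)
}

/// `Static=1` makes the placeable behave as if `Plot=1`.
pub fn effective_plot(plot: bool, is_static: bool) -> bool {
    plot || is_static
}

/// The `Portrait` resref is only consulted when `PortraitId` is not below 0xFFFE.
pub fn portrait_resref_is_read(portrait_id: u16) -> bool {
    portrait_id >= 0xFFFE
}

/// Whatever the file says, `GroundPile` ends up as 1 in memory.
pub fn effective_ground_pile(_file_value: bool) -> bool {
    true
}
```

The byte cast shows why an out-of-range appearance does not fail loudly: `stored_appearance(300)` quietly becomes row 44 instead of producing an error.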

Legacy & Ignored Data

Finding Type	Explanation
Legacy Engine Artifacts	Placeable binaries are littered with legacy metrics from older tools or other Odyssey games (`Comment`, `OpenLockDiff`, `Interruptable`, `Type`, `PaletteID`). The physical KOTOR engine constructor entirely ignores these.

Implemented Linter Rules (Rakata-Lint)

Phase 1 (intra-resource, no context)

Implemented under rakata_lint::rules::utp.

UTP-001 (Plot Chaining Context): Warns when Static=true but Plot=false; the engine forces Plot to true at runtime.
UTP-002 (Ghost Value Detection): Informs when GroundPile=false since the engine immediately overwrites this to true on load.
UTP-003 (Dead Hook Pruning): Flags OnFailToOpen instances because placeables ignore this event hook (it is door-exclusive).
UTP-004 (HP Health Ceiling): Errors when CurrentHP > HP; the engine clamps to HP on template load.
UTP-005 (Portrait Shadowing): Warns when PortraitId < 0xFFFE and Portrait resref is set; the resref is ignored at runtime.

Phase 2 (range / 2DA / resref existence, requires `LintContext`)

Implemented under rakata_lint::rules::utp_range.

UTP-006 (Appearance Bounds): Errors when Appearance does not resolve to a row in placeables.2da; engine renders missing model.
UTP-007 (Portrait Bounds): Errors when PortraitId (when not the 0xFFFE “use string Portrait” sentinel) does not resolve to a row in portraits.2da.
UTP-008 (Resref Existence): Warns when Conversation (.dlg), Portrait (.tga), any of the 16 On* script hooks (.ncs), or ItemList[i].InventoryRes (.uti) does not resolve in the configured resource sources. OnFailToOpen is intentionally NOT included – UTP-003 already flags it as door-exclusive dead data.

Pending

Appearance Truncation: Warns when Appearance exceeds 255 (engine truncates to a single byte before lookup, distinct from the row-count check in UTP-006).
Animation Conditional Limits: Verifies that custom AnimationState indices are strictly guarded by Open==0 closures.

UTS Format (Sound Object Blueprint)

Description: The Sound Object (.uts) blueprint defines dynamic, positional, and ambient audio emitters placed throughout a game map. Ranging from environmental hums and randomized crowd chatter to highly localized looping sound effects, .uts files act as physical sound nodes combining strict spatial coordinates with randomized pitch, interval, and varying volume matrices.

At a Glance

Property	Value
Extension(s)	`.uts`
Magic Signature	`UTS` / `V3.2`
Type	Sound Object Blueprint
Rust Reference	View `rakata_generics::Uts` in Rustdocs

Data Model Structure

Rakata maps the Sound Object definition directly into the rakata_generics::Uts struct. To view the exhaustive binary schema and strict GFF field mappings, please refer to the Rustdocs for this struct, where each field is explicitly documented.

A Sound Object breaks down into five main categories:

Audio Emitters (Sounds List): An array containing the audio files (.wav files) the engine will sequence or shuffle through.
Spatial Geometry: Distance boundaries determining exactly where the sound is audible in the map (MinDistance, MaxDistance).
Playback Automation: Rules for how the sound loops and strings together (Continuous, Random, Active, Looping).
Algorithmic Variations: Modifiers that dynamically distort the audio file’s pitch and volume at runtime (PitchVariation, FixedVariance, VolumeVrtn).
Procedural Generators: Identifiers that tell the engine if the sound represents specific background noise like crowd chatter or combat ambiance (GeneratedType).

State Validation: rakata-lint checks the data against engine constraints to prevent runtime bugs.

Engine Audits & Decompilation

(Decompilation logic for this section was entirely audited and verified via native Ghidra pipeline against swkotor.exe, explicitly pulling from CSWSSoundObject::Load at 0x005c9040.)

Sound Objects have one of the most streamlined parsers in the engine. They lack script triggers entirely and consist almost entirely of positional data and randomized variation parameters.

Structural Load Phasing

Function	Size	Behavior
`Load`	1345 B	The primary physical parser evaluating 24 core audio metric bounds, defining spatial positioning, volume variation, pitch scales, and active looping capabilities.
`Sounds List`	–	Iterates through the list of associated audio clips, actively loading sound resrefs into memory sequentially for playback.

Core Structural Findings

Engine Rule	Runtime Behavior
Generated Type Truncation	The engine reads `GeneratedType` as a 32-bit integer from the file, but truncates it and stores only the lowest byte in memory. Any value above 255 therefore loses its upper bits and no longer maps to the generator type the author intended.
Constructor Defaults	If fields are missing from the `.uts` binary, the engine physically relies on its internal C++ constructor to populate default values, completely avoiding hardcoded literal checks during parse time.
Spatial Loading Context	When loaded globally via a static map (`CSWSArea::LoadSounds`), the engine skips reading positional coordinates from the `.uts` file entirely and strictly enforces the `X`, `Y`, and `Z` coordinates defined in the area’s `.git` file.
Silent Sound Lists	When pulling the list of sounds, the engine actively ignores missing entries. It only pushes a sound struct into playable memory if the file actually provided a valid `Sound` reference string.
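
As a rough illustration of the truncation and silent-skip rules in the table above, the sketch below models both in plain Rust. The function names `stored_generated_type` and `playable_sounds` are hypothetical and are not part of the Rakata API; sound entries are simplified to plain strings.

```rust
/// Models the documented truncation: the file holds a 32-bit value,
/// but only the lowest byte survives in memory.
/// (Hypothetical helper, not a Rakata function.)
fn stored_generated_type(raw: u32) -> u8 {
    (raw & 0xFF) as u8
}

/// Keeps only entries that carry a non-empty `Sound` reference,
/// mirroring how blank list entries are skipped at load.
/// (Hypothetical helper; entries are simplified to string slices.)
fn playable_sounds<'a>(entries: &[&'a str]) -> Vec<&'a str> {
    entries.iter().copied().filter(|s| !s.is_empty()).collect()
}
```

Under this model a `GeneratedType` of 256 is stored as 0 and 257 is stored as 1, which is why UTS-004 treats anything above 255 as an error.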

Legacy & Ignored Data

Finding Type	Explanation
Legacy Engine Artifacts	Some older tools and legacy file revisions include values like `TemplateResRef`, `LocName`, `Comment`, `Elevation`, `Priority`, and `PaletteID`. These are artifacts from other Odyssey Engine branches (like Neverwinter Nights) and the KOTOR engine never evaluates them natively.

Implemented Linter Rules (Rakata-Lint)

Phase 1 (intra-resource, no context)

Implemented under rakata_lint::rules::uts.

UTS-001 (Volume Ceiling): Warns when Volume > 127; values outside the engine’s byte threshold cause distortion or clipping.
UTS-002 (Audio Integrity): Warns when the Sounds list contains blank entries; the engine skips them silently.
UTS-003 (Emitter Verification): Errors when the Sounds list is empty; the object loads as a dead audio node.
UTS-004 (GeneratedType Truncation): Errors when GeneratedType > 255; the engine truncates to a single byte and corrupts intended behavior.
UTS-005 (Legacy Engine Artifacts): Informs when TemplateResRef, Elevation, Priority, or PaletteID are populated; never natively evaluated by the K1 engine.
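
The first four Phase 1 checks can be sketched as simple predicates over a flattened view of the blueprint. This is only an illustration of the thresholds listed above: `UtsView`, `Severity`, and `lint_uts` are hypothetical names, and the real rules under `rakata_lint::rules::uts` may be structured quite differently.

```rust
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum Severity {
    Warning,
    Error,
}

/// Hypothetical, simplified view of the fields the rules inspect.
struct UtsView {
    volume: u32,
    generated_type: u32,
    sounds: Vec<String>,
}

/// Returns (rule code, severity) pairs for UTS-001 through UTS-004.
fn lint_uts(uts: &UtsView) -> Vec<(&'static str, Severity)> {
    let mut findings = Vec::new();
    if uts.volume > 127 {
        findings.push(("UTS-001", Severity::Warning));
    }
    if uts.sounds.iter().any(|s| s.is_empty()) {
        findings.push(("UTS-002", Severity::Warning));
    }
    if uts.sounds.is_empty() {
        findings.push(("UTS-003", Severity::Error));
    }
    if uts.generated_type > 255 {
        findings.push(("UTS-004", Severity::Error));
    }
    findings
}
```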

Phase 2 (resource existence, requires `LintContext`)

Implemented under rakata_lint::rules::uts_range.

UTS-006 (Sound Resref Existence): Warns when any non-blank Sounds[i].Sound does not resolve to a .wav resource in the configured sources. Blank entries are skipped (UTS-002 already covers them).

UTT Format (Trigger Blueprint)

Description: The Trigger (.utt) blueprint defines invisible zones placed across level maps. While encounters spawn creatures, triggers operate as tripwires – firing scripts, acting as loading zones to new areas, or springing mechanical traps when a character crosses them.

At a Glance

Property	Value
Extension(s)	`.utt`
Magic Signature	`UTT` / `V3.2`
Type	Trigger Blueprint
Rust Reference	View `rakata_generics::Utt` in Rustdocs

Data Model Structure

Rakata maps the Trigger definition directly into the rakata_generics::Utt struct. To view the exhaustive binary schema and strict GFF field mappings, please refer to the Rustdocs for this struct, where each field is explicitly documented.

A Trigger breaks down into four main categories:

Core Identity & Geometry: The basic identifiers and coordinate boundaries that define what the trigger is and where it sits on the ground (e.g., Tag, Geometry).
Interactive State & Sub-types: Settings that determine if the trigger acts as a loading zone, a trap, or just a generic scripting boundary (e.g., Type, Cursor, HighlightHeight).
Trap Mechanics: The parameters defining rules for trap visibility and skill checks required to disarm them (e.g., TrapType, TrapOneShot).
Transition & Behavioral Hooks (Scripts): The event scripts that fire when a character enters, clicks, leaves, or disarms the trigger, as well as the destination area if the trigger acts as a loading zone (e.g., ScriptOnEnter, LinkedTo).

State Validation: rakata-lint checks the GFF structure directly against the constraints the engine expects.

Engine Audits & Decompilation

(Decompilation logic for this section was entirely audited and verified via native Ghidra pipeline against swkotor.exe, explicitly pulling from CSWSTrigger::LoadTrigger at 0x0058da80.)

Structural Load Phasing

Function	Size	Behavior
`LoadTrigger`	3381 B	The main constructor. It reads the trigger’s properties, scripts, and trap rules.
`LoadTriggerGeometry`	743 B	Reads the X, Y, and Z coordinates that draw the trigger’s boundary on the floor.

Core Structural Findings

Engine Rule	Runtime Behavior
Behavior Derived from Type	The engine determines the trigger’s behavior and UI cursor based on the `Type` field. Type 1 makes it a map transition zone. Type 2 makes it a trap.
OnClick Duplication Bug	The engine has a known bug where it copies the `ScriptOnEnter` value and uses it to overwrite the `OnClick` listener by default, unless explicitly overridden.
Trap Hook Fallback	If the `OnTrapTriggered` script is left empty, set to null, or named `"default"`, the engine ignores it and pulls the default script from `traps.2da` based on the `TrapType`.
Highlight Clamping	The trigger’s `HighlightHeight` is ignored by the engine unless it is greater than `0.0`. If it is exactly zero or negative, the engine falls back to a default rendering height of `0.1`.
Contextual Loading	Fields like `LinkedTo`, `LinkedToModule`, `AutoRemoveKey`, `Tag`, and `Faction` are only loaded into memory when the Trigger is processed from a `.git` area layout file.
Portrait Shadowing	If `PortraitId` is `< 0xFFFE`, the engine completely ignores the `Portrait` string ResRef and relies entirely on the ID. Any value in the `Portrait` ResRef field is treated as dead data.
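
Three of the rules above reduce to small decisions that are easy to model. The sketch below is a hedged restatement of the table, not decompiled code: the function names are hypothetical, and the exact-match comparison against `"default"` (including case sensitivity) is an assumption.

```rust
/// Highlight Clamping: non-positive heights fall back to 0.1.
fn effective_highlight_height(raw: f32) -> f32 {
    if raw > 0.0 { raw } else { 0.1 }
}

/// Trap Hook Fallback: a missing, empty, or "default" script means the
/// script is taken from traps.2da instead.
/// (Assumes an exact, case-sensitive match on "default".)
fn uses_traps_2da_fallback(on_trap_triggered: Option<&str>) -> bool {
    match on_trap_triggered {
        None => true,
        Some(script) => script.is_empty() || script == "default",
    }
}

/// Portrait Shadowing: the `Portrait` resref is only consulted when
/// `PortraitId` is not below 0xFFFE.
fn portrait_resref_is_read(portrait_id: u16) -> bool {
    portrait_id >= 0xFFFE
}
```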

Legacy & Ignored Data

Finding Type	Explanation
Legacy Engine Artifacts	As with other templates, older asset revisions include `TemplateResRef`, `Comment`, `PaletteID`, and `PartyRequired`. The engine completely ignores these.
Superseded Legacy Fields	Older asset revisions typically map `TrapDetectDC` and `DisarmDC` in the `.utt` file itself, but the engine ignores them – it calculates DCs dynamically using the rules in the `.2da` files instead.

Implemented Linter Rules (Rakata-Lint)

Phase 1 (intra-resource, no context)

Implemented under rakata_lint::rules::utt.

UTT-001 (Transition Enforcement): Warns when Type==1 (Transition) but no destination (LinkedTo, LinkedToModule, or TransitionDestin) is configured.
UTT-002 (Trap Consistency): Informs when TrapDetectDC/DisarmDC are set (engine reads from traps.2da); also warns when TrapFlag=true but Type != 2.
UTT-003 (Geometry Safety): Warns when the trigger’s geometry contains fewer than 3 vertices.
UTT-004 (OnClick on Generic Trigger): Informs when OnClick is set on a Generic trigger (Type==0); the event only fires for Transition triggers.
UTT-005 (Highlight Bounding): Informs when HighlightHeight <= 0.0; the engine falls back to a default of 0.1.
UTT-006 (Portrait Shadowing): Warns when PortraitId < 0xFFFE and Portrait resref is set; the resref is ignored at runtime.
UTT-007 (PartyRequired Dead Data): Informs when PartyRequired is set; the K1 engine never reads this field.

Phase 2 (range / 2DA / resref existence, requires `LintContext`)

Implemented under rakata_lint::rules::utt_range.

UTT-008 (Portrait Bounds): Errors when PortraitId (when not the 0xFFFE “use string Portrait” sentinel) does not resolve to a row in portraits.2da.
UTT-009 (Resref Existence): Warns when any of OnDisarm, OnTrapTriggered, OnClick, OnHeartbeat, OnEnter, OnExit, or OnUserDefined (.ncs), or Portrait (.tga), does not resolve in the configured resource sources. LinkedToModule (area transition) is deferred to Phase 3.

Pending

Default Script Identification: Identifies empty / null / literally-named "default" OnTrapTriggered entries that silently invoke the traps.2da fallback.

UTW Format (Waypoint Blueprint)

Description: The Waypoint (.utw) blueprint defines static reference coordinates within an area map. Unlike functional triggers or physical placeables, waypoints act exclusively as invisible logic markers. They provide coordinate anchors for creature patrol routes, spawn locations, camera focal points, or visible map pins in the player’s UI.

At a Glance

Property	Value
Extension(s)	`.utw`
Magic Signature	`UTW` / `V3.2`
Type	Waypoint Blueprint
Rust Reference	View `rakata_generics::Utw` in Rustdocs

Data Model Structure

Rakata maps the Waypoint definition directly into the rakata_generics::Utw struct. To view the exhaustive binary schema and strict GFF field mappings, please refer to the Rustdocs for this struct, where each field is explicitly documented.

A Waypoint breaks down into three main categories:

Core Identity: The basic identifiers that define the waypoint’s name and tag used heavily by scripts (e.g., Tag, LocalizedName).
Spatial Geometry: The exact map coordinates and facing orientation that creatures or cameras will reference (e.g., XPosition, XOrientation).
Map Navigation Notes: The text and toggles that dictate whether the waypoint draws a physical pin on the player’s mini-map UI (e.g., HasMapNote, MapNote).

State Validation: rakata-lint checks the data against engine constraints to prevent runtime bugs or dead data paths.

Engine Audits & Decompilation

The following documents the engine’s exact load sequence and field requirements for .utw files mapped from swkotor.exe.

(Decompilation logic for this section was audited and verified via native Ghidra pipeline against swkotor.exe, explicitly pulling from CSWSWaypoint::LoadWaypoint at 0x005c7f30.)

Structural Load Phasing

Function	Size	Behavior
`LoadWaypoint`	682 B	The main constructor. It loads the waypoint’s identity, map geometry, and checks for mini-map pins.
`LoadFromTemplate`	134 B	A fallback used when dynamically spawning a waypoint from a script.

Core Structural Findings

Engine Rule	Runtime Behavior
Map Note Two-Gate Pattern	If `HasMapNote` is `0` or missing, the engine skips reading the map note entirely. If it is `1`, it reads the strings but uses a second gate: if the `MapNote` string itself is missing, the entire map pin block is discarded silently.
Orientation Normalization	The engine computes the squared magnitude of the orientation vectors. If it is not exactly `1.0`, it automatically calls `Vector::Normalize()` to fix the math. Non-unit vectors are tolerated but corrected instantly at load.
Position Override	When a waypoint is loaded from a `.git` area layout via `LoadWaypoints`, the engine re-reads the X and Y coordinates directly from the `.git` file, completely overriding the `.utw`. It also forcefully calculates the Z height based on the terrain collision mesh via `ComputeHeight`.
Dynamic Identification	Waypoints never pull an `ObjectId` from their own `.utw` file. It is always forcibly assigned by the `.git` list element (defaulting to `0x7f000000`).
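
The two-gate map note check and the load-time normalization can be sketched as follows. This is a minimal model of the behavior described in the table: the names are hypothetical, the vector is kept generic over its component count because the table does not fix it, and leaving a zero-length vector untouched is an assumption made only to avoid a division by zero.

```rust
/// Map Note Two-Gate Pattern: the pin survives only if the flag is set
/// AND the note string is present. A missing flag is modelled as `false`.
fn map_pin(has_map_note: bool, map_note: Option<&str>) -> Option<&str> {
    if has_map_note { map_note } else { None }
}

/// Orientation Normalization: if the squared magnitude is not exactly 1.0,
/// rescale to unit length. Zero vectors are returned unchanged (assumption).
fn normalize_at_load<const N: usize>(v: [f32; N]) -> [f32; N] {
    let mag_sq: f32 = v.iter().map(|c| c * c).sum();
    if mag_sq == 1.0 || mag_sq == 0.0 {
        return v;
    }
    let inv = 1.0 / mag_sq.sqrt();
    v.map(|c| c * inv)
}
```

For example, an authored orientation of `[3.0, 4.0]` would come out as approximately `[0.6, 0.8]`, while `[1.0, 0.0]` passes through untouched.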

Legacy & Ignored Data

Finding Type	Explanation
Superseded Legacy Fields	Older asset revisions pad the file with fields like `TemplateResRef`, `Appearance`, `PaletteID`, `Comment`, `LinkedTo`, and `Description`. The KOTOR engine completely ignores these.

Implemented Linter Rules (Rakata-Lint)

These diagnostics are implemented under rakata_lint::rules::utw.

UTW-001 (Map Note Double-Gating): Warns when MapNote or MapNoteEnabled are populated but HasMapNote=false; this data is silently discarded by the engine.
UTW-002 (Orientation Warnings): Informs when the orientation vector magnitude is not within ~0.001 of 1.0; the engine forcibly normalizes at load.
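
A hedged sketch of the two predicates, using hypothetical function names and a flattened argument list rather than the real `rakata_lint` types. The 0.001 tolerance is taken from the approximate figure quoted in UTW-002 and may not match the implementation exactly.

```rust
/// UTW-001: map note data is present but `HasMapNote` is false.
fn utw_001_fires(has_map_note: bool, map_note_populated: bool, map_note_enabled: bool) -> bool {
    !has_map_note && (map_note_populated || map_note_enabled)
}

/// UTW-002: orientation magnitude deviates from 1.0 by more than ~0.001.
fn utw_002_fires(orientation: &[f32]) -> bool {
    let magnitude = orientation.iter().map(|c| c * c).sum::<f32>().sqrt();
    (magnitude - 1.0).abs() > 0.001
}
```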

Pending

Tag Enforcement: Flags empty Tag values since waypoints are primarily targeted by name from scripts.

3D Geometry & Models

At the heart of the Odyssey Engine’s visual presentation is a proprietary structural design for interpreting and rendering 3D geometry. Modern formats like .glTF or .fbx bundle all visual and physical data into a single asset. KotOR, however, splits this data across several distinct files. The engine strictly decouples the node hierarchy tree, the raw vertex buffers, and the mathematical collision boundaries.

Note

If you are looking for the exact underlying Ghidra-derived notes detailing the K1 Engine’s InputBinary::Read pipeline and structural layout bytes, please refer to the MDL & MDX Deep Dive.

Implementation Blueprints

This section documents the primary pillars of KOTOR geometry and their mathematical foundations, backed by swkotor.exe clean-room reverse engineering.

Format	Name	Layout & Purpose
MDL	Model Hierarchy	The architectural scaffold holding the model together. It defines the scene bounding volumes, spatial rotations, embedded animations, engine rendering parameters, and a deep recursive tree of typed `Nodes` (e.g., Lights, Bones, Emitters, Trimeshes).
MDX	Vertex Data	The abstract mathematical arrays defining the actual rendering payload. It directly encodes interleaved array blocks mapping exact spatial coordinates (`X, Y, Z`), texture UV layouts, and Lighting Normals.
BWM	Walkmeshes	The raw mathematical graph of AABB bounds and face intersections that serve as physics collision boxes for area environments (`.wok`), placeables (`.pwk`), and interactive doors (`.dwk`).
Math	TriMesh Derivations	Documentation explaining exactly how variables like coordinate bounds and face offsets are mathematically derived across both visual Trimeshes and collision Walkmeshes.

MDL Format (Model Hierarchy)

The .mdl format serves as the overarching structural spine for 3D model geometry. Rather than storing literal vertex positions directly, it recursively structures a tree of generalized nodes (Bones, Trimeshes, Lights, Emitters) into a unified visual mesh. It delegates vertex geometry out, binds textures, links dynamic controllers (keyframe transformations), and maps bounding sphere matrices directly to the model’s rigid physical space.

At a Glance

Property	Value
Extension(s)	`.mdl`
Magic Signature	Text (`filedependancy`) or Binary (`\0` byte header)
Type	3D Hierarchical Mesh
Rust Reference	View `rakata_formats::Mdl` in Rustdocs

Data Model Structure

Rakata maps the .mdl binary tree exactly into rakata_formats::Mdl.

Because a model intrinsically utilizes 11 distinct struct sub-types, Rakata resolves the pointer-based tree structure into a secure Rust Vec<MdlNode>. Native file pointer offsets, which KOTOR normally resolves via an explicit raw memory relocation pass, are converted into safe recursive structures at parse time.

Node Sub-Types

The engine determines exact node allocations using a rigid bitflag header.

Sub-Type	Description
Base	A pure structure node (Dummy) acting strictly as an invisible visual group or spatial pivot.
Light	Projects localized dynamic lighting, lens flares, and shading priorities.
Emitter	Configures particle spawning systems (fountains, single-shots, lightning, explosions).
Camera	An empty node serving as a static viewport anchor for dialogue cinematics.
Reference	An anchor point explicitly linking an external 3D model asset to a point.
TriMesh	A rigid standard triangle geometry boundary carrying static vertex arrays.
SkinMesh	A procedural mesh utilizing skeleton bone-weights and vectors to calculate organic deformations.
AnimMesh	A mesh carrying hardcoded, explicitly sampled vertex coordinate animation loops.
DanglyMesh	A sub-mesh evaluated through swinging physics constraints (displacement, tightness, period).
AABB	A strict spatial collision tree structurally defining an internal walkmesh barrier.
Saber	Allocates dynamic 3D quad arrays utilized exclusively to generate stretching lightsaber swing trails.

Engine Audits & Decompilation

Deep Dive: For an exhaustive archive of the Ghidra decompilation notes detailing the exact byte-level layout of the binary MDL format and engine loading pipeline, refer to the MDL & MDX Deep Dive.

The following information documents the engine’s exact load sequence for genuine Binary MDL models. All behavior was mapped from natively analyzing swkotor.exe execution pipelines via Ghidra.

Loading and Wrapper Validation

Read initially via Input::Read (0x004a14b0).

Pipeline Event	Ghidra Provenance & Engine Behavior
Binary vs ASCII Detection	The engine checks the exact first byte of the file. If it hits a `\0` (NULL), it dispatches the asset entirely to the `InputBinary` track. If it hits text (`"filedependancy"` or `"newmodel"`), it loops into the `FuncInterp` ASCII parser track.
Wrapper Mapping	The Binary format evaluates the initial 12 bytes as an abstract Wrapper block defining explicit sizes for the `.MDL` and the associated `.MDX` geometry.
In-Memory Heap Dump	The engine allocates the sizes noted in the wrapper, runs `memcpy` on both the `.MDL` and `.MDX` assets blindly into memory, and then runs the recursive `Reset` path to relocate spatial internal pointer offsets to absolute memory addresses.
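
The detection and wrapper steps above can be sketched as a small classifier. This is an illustration only: `MdlKind` and `classify_mdl` are hypothetical names, little-endian byte order is assumed, and placing the MDL size at `+0x04` is an assumption (the MDX size at `+0x08` is the offset this manual documents for the wrapper).

```rust
#[derive(Debug, PartialEq, Eq)]
enum MdlKind {
    /// Binary model with the two sizes taken from the 12-byte wrapper.
    Binary { mdl_size: u32, mdx_size: u32 },
    /// Anything whose first byte is not NUL is treated as ASCII.
    Ascii,
}

/// Returns `None` when the input is empty or a binary wrapper is truncated.
fn classify_mdl(bytes: &[u8]) -> Option<MdlKind> {
    let first = *bytes.first()?;
    if first != 0 {
        return Some(MdlKind::Ascii);
    }
    let w = bytes.get(..12)?;
    // Assumed layout: +0x04 = MDL size, +0x08 = MDX size, little-endian.
    let mdl_size = u32::from_le_bytes([w[4], w[5], w[6], w[7]]);
    let mdx_size = u32::from_le_bytes([w[8], w[9], w[10], w[11]]);
    Some(MdlKind::Binary { mdl_size, mdx_size })
}
```

Unlike the engine's blind `memcpy`, a tool would normally compare both sizes against the actual file lengths before allocating anything.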

Node Dispatch Architecture

Read initially via InputBinary::ResetMdlNode (0x004a0900). The engine recursively navigates downwards matching against a constant 16-bit node-type flag lookup spanning from 0x0001 (Base Node) to 0x0821 (Lightsaber).

Mapped Property	Engine Behavior
Sub-node Allocation Sizes	Nodes are dynamically allocated varying byte lengths strictly based on their type-mask. A root Base node only evaluates 80 contiguous bytes, but an Emitter allocates 304, and a Skin allocates 512.
Parent/Child Graph Resolution	Engine structures evaluate nodes continuously downward via embedded raw pointer arrays. These arrays branch a group of distinct sub-children implicitly off their master parent. At load time, the engine must safely rewrite all relative file offsets into absolute physical memory locations, otherwise the entire hierarchy will instantly detach.
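
Because the type field is a bitmask, a composite value such as `0x0821` can be split into its individual bits. The sketch below shows that decomposition. Only `0x0001` (Base) and the composite `0x0821` (Lightsaber) come from the text above; the names `NODE_MESH` and `NODE_SABER` for the remaining two bits are assumptions chosen for illustration.

```rust
const NODE_BASE: u16 = 0x0001;
const NODE_MESH: u16 = 0x0020; // assumed meaning of this bit
const NODE_SABER: u16 = 0x0800; // assumed meaning of this bit

/// True when every bit in `flags` is present in `type_mask`.
fn has_flags(type_mask: u16, flags: u16) -> bool {
    (type_mask & flags) == flags
}

/// Lists the individual bits set in a node type mask, lowest first.
fn set_bits(type_mask: u16) -> Vec<u16> {
    (0..16)
        .map(|i| 1u16 << i)
        .filter(|bit| (type_mask & *bit) != 0)
        .collect()
}
```

Here `set_bits(0x0821)` yields `[0x0001, 0x0020, 0x0800]`, which is consistent with a saber node being a base node and a mesh plus one extra flag.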

Mapped Behavior Quirks

Mapped Property	Ghidra Provenance & Engine Behavior
LOD Suffix Generation	The engine natively evaluates if the `cullWithLOD` property is set. If true, it explicitly triggers string concatenations for `FindModel(name + "_x")` and `FindModel(name + "_z")` sequentially to dynamically attach lower-quality auxiliary geometry instances based on viewport distance.
Animation Bone Binding	When building the live hierarchy tree for a rendering sequence, the engine explicitly ignores the node’s textual string name. Instead, it rigidly evaluates physical pairings against a mapped `node_id` integer. If the bone isn’t properly sequenced to that numeric ID array, it detaches from the runtime arrays entirely.
Self-Describing Keyframes	Unlike older properties that rely on rigid dictionaries, KOTOR determines how an animation was saved dynamically by reading the keyframe’s controller type integer. It applies a bitwise AND check against the type’s lowest hex digit (`& 0x0F`) to instantly dictate whether the loaded keyframe is a single float (like scaling), 3 floats (like an XYZ positional vector), or 4 floats (for a Slerp quaternion rotation).
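
Two of these quirks lend themselves to a short sketch. The code below assumes that the value left after masking with `0x0F` is itself the number of floats per keyframe (1, 3, or 4), which is one reading of the description above; `KeyframeShape`, `keyframe_shape`, and `lod_candidates` are hypothetical names.

```rust
#[derive(Debug, PartialEq, Eq)]
enum KeyframeShape {
    Scalar,     // 1 float, e.g. scale
    Vector3,    // 3 floats, e.g. XYZ position
    Quaternion, // 4 floats, e.g. rotation
}

/// Masks off the low hex digit and maps it to a keyframe shape.
/// Any other masked value is reported as unknown.
fn keyframe_shape(type_field: u32) -> Option<KeyframeShape> {
    match type_field & 0x0F {
        1 => Some(KeyframeShape::Scalar),
        3 => Some(KeyframeShape::Vector3),
        4 => Some(KeyframeShape::Quaternion),
        _ => None,
    }
}

/// Names probed, in order, when `cullWithLOD` is set.
fn lod_candidates(name: &str) -> [String; 2] {
    [format!("{name}_x"), format!("{name}_z")]
}
```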

Proposed Linter Rules (Rakata-Lint)

While rakata-lint currently only evaluates GFF formats and does not yet parse .mdl models dynamically, the engine behaviors above hint at some suggested lint diagnostics:

Planned Lint Diagnostics:

Skeleton / Animation Tracing: Flags animation nodes where the internal skeletal node_number binding parameter implicitly equals 0, ensuring the mesh does not hard freeze via pointing to the rigid root spine.
Controller Mask Encoding: Validates that generic Controller properties properly bit-mask against the Bezier indicator (0x10) rather than reading explicitly raw quaternion values (which causes cascading loop failures through the rest of the array block).
Emitter Detonation Allocation: Flags interactive Emitter nodes attempting to bind the detonate key (Controller 502) while structurally mis-identifying as "Fountain". The engine natively maps controller 502 data only to strict "Explosion" memory paths, resulting in an aggressive Access Violation engine crash otherwise.
Name Graph Sanitization: Notifies developers if the node graph contains artificially un-referenced graph pointers mapped under the unified Name Table. (BioWare notoriously shipped identical shared name tables compiling .pwk and .wok models into .mdl nodes natively throughout the 2003 pipeline).

MDX Format (Vertex Data)

The .mdx format is a companion file that always pairs tightly with a .mdl model. While the .mdl file handles the complex math, skeletal hierarchy, and animation logic, the .mdx file acts as bulk storage, holding the massive lists of raw 3D coordinates (vertices) that make up the physical shape of the model.

Architecturally, the swkotor.exe engine treats these two files as a single combined asset: the .mdl dictates where and how the model moves, and the .mdx provides the points to physically draw on the screen.

At a Glance

Property	Value
Extension(s)	`.mdx`
Magic Signature	Raw binary stream (No explicit signature block)
Type	Interleaved Vertex Payload Array
Rust Reference	View `rakata_formats::Mdx` in Rustdocs

Data Model Structure

Rakata safely consumes the unindexed byte sequences into a typed geometry definition mapped within rakata_formats::Mdx.

At the raw binary level, .mdx data is strictly an interleaved buffer. Variables (like positional 3D XYZ vectors, Texture Parameter UV planes, and light-calculating Normals) are sequentially woven directly across the byte stream.

Engine Audits & Decompilation

Deep Dive: For an exhaustive archive of the Ghidra decompilation notes detailing the exact byte-level layout of the binary MDL/MDX format and engine loading pipeline, refer to the MDL & MDX Deep Dive.

The following documents the engine’s exact load sequence and structure for .mdx interleaved data pipelines mapped from swkotor.exe.

(Decompilation logic for this section was entirely audited and verified via native Ghidra pipeline against swkotor.exe, explicitly pulling from InputBinary::Read (0x004a1230) and InputBinary::ResetMdlNode (0x004a0900).)

Loading and Lifecycle

Pipeline Event	Ghidra Provenance & Engine Behavior
Memory Wrapping	Triggered immediately alongside the `.mdl`. The wrapper dynamically outlines the exact byte-count of `.mdx` data required (`wrapper + 0x08`).
Buffer Liberation	MDX arrays are entirely stateless. Once `InputBinary::ResetMdlNode` computes the geometry arrays and translates the buffer directly into the OpenGL hardware render-pools during loading, the engine immediately calls `free()` wiping the MDX byte arrays from physical memory entirely.

TriMesh Structural Addressing

The KOTOR Engine avoids parsing the MDX data by scanning through it block-for-block. Instead, traversing the actual MDL hierarchy drives vertex payload requests explicitly.

Mapped Property	Ghidra Provenance & Engine Behavior
Array Slicing	Every distinct `TriMesh` instantiated in the parent MDL tree explicitly registers an `mdx_data_offset` pointer (`TriMesh + 0x144`). This dictates exactly where the engine explicitly seeks within the interleaved `.mdx` payload array to fetch this mesh’s native points.
Node Alignment Constraints	Vanilla assets maintain extremely strict alignment formats. Meshes are dynamically sorted prior to hardware parsing: static rendering models fall to the top of the index chain, whereas dynamic procedural meshes (like character `.Skin` nodes) are specifically dumped sequentially to the rear of the `.mdx`.
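
A tool following the same offset-driven approach would slice the MDX buffer per mesh rather than scanning it front to back. The sketch below shows one bounds-checked way to do that. `mesh_slice` is a hypothetical helper, and the `vertex_count` and `stride` parameters are assumed to come from the owning TriMesh header; only `mdx_data_offset` is named in the table above.

```rust
/// Returns the bytes belonging to one mesh, or `None` if the requested
/// span would run past the end of the MDX buffer or overflow.
fn mesh_slice(mdx: &[u8], mdx_data_offset: u32, vertex_count: u32, stride: u32) -> Option<&[u8]> {
    let start = mdx_data_offset as usize;
    let len = (vertex_count as usize).checked_mul(stride as usize)?;
    let end = start.checked_add(len)?;
    mdx.get(start..end)
}
```

Returning `Option` instead of indexing directly means a corrupt offset produces a recoverable error rather than a panic.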

Note

Ghost Payload Sentinels: During memory extraction, the engine implicitly pads geometric mesh payloads out to distinct 16-byte aligned boundaries using Terminator Rows. Any mesh vertex iteration falling slightly out of stride will be explicitly back-filled with ghost/sentinel float arrays ([0.0, 0.0, 0.0]) to ensure OpenGL buffer calculations remain strictly uniform without overflowing pointer indexes during hardware streaming.

Proposed Linter Rules (Rakata-Lint)

Incorrectly calculated .mdx offset spans or payload array lengths can cause the engine to read misaligned bytes or overflow data bounds. A linter rule that validates these payload alignments helps prevent geometry corruption and potential engine or GPU crashes.

While rakata-lint currently only evaluates GFF formats and does not yet parse .mdx buffers dynamically, the engine behaviors above hint at the foundational requirements for .mdx stability:

Planned Lint Diagnostics:

Mesh Slice Verification: Enforces explicit iteration seeking. Validates .mdx vector boundaries by explicitly jumping pointers down the file according to individual mdx_data_offset assignments mapped on explicitly bound TriMesh headers, rather than assuming unverified sequential payload lengths.

Walkmesh (BWM / WOK)

Walkmeshes govern physical collision and pathfinding across an area. They dictate exactly where a character can stand, what slopes they can climb, and what physical materials block their path.

BWM Binary

The binary implementation of the Walkmesh is entirely designed to be dumped straight into memory. Instead of smoothly parsing the file piece-by-piece, the engine constantly jumps around the file using a complex array of offsets located at the very top.

At a Glance

Property	Value
Extension(s)	`.bwm`, `.wok`
Magic Signature	None standard header block
Type	Memory-Mapped Collision Net
Rust Reference	View `rakata_formats::Bwm` in Rustdocs

Engine Audits & Decompilation

The following documents the engine’s exact load sequence and field requirements for Binary Walkmeshes mapped from swkotor.exe.

(Decompilation logic for this section was entirely audited and verified via native Ghidra pipeline against swkotor.exe, explicitly pulling from CSWCollisionMesh::LoadMeshBinary at 0x00597120.)

Pipeline Event	Ghidra Provenance & Engine Behavior
Pointer Jumping	The engine doesn’t read the file linearly from top to bottom. Instead, it uses direct memory math (pointer arithmetic) to aggressively jump between the header and the raw data payload.
Offset Extraction	The beginning of the file contains exact byte locations the engine uses to orient itself: • `+0x08` yields the total `vertex_count` • `+0x0C..+0x18` provides the maximum limits for faces, materials, and walk-edges • `+0x18..+0x24` yields adjacency boundaries • `+0x3C..+0x48` stores the direct starting addresses for the geometry data
Bounding Box Offsets	The spans immediately following (`+0x48..+0x6C` and `+0x6C..+0x84`) are reserved specifically for tracking offsets that point to the Axis-Aligned Bounding Box (AABB) collision trees.
Ignoring the Magic ID	Magic bypass: Magic and version identifiers (`BWM` ) are actually ignored natively during the `LoadMeshBinary` process. It relies on a different system entirely to verify file signatures beforehand.
Read-Only Format	One-Way Flow: Vanilla KOTOR contains strictly read-only capabilities for BWM binaries. Developers removed any functionality needed to compile or save collision data dynamically!

Tip

Orphaned Memory Gaps: The engine entirely skips reading two massive blocks of bytes off the disk: +0x24..+0x3C (24 bytes) and +0x64..+0x6C (8 bytes). For a byte-perfect roundtrip toolset, these gaps must absolutely be preserved verbatim!
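
One way to honor both the documented offsets and the two unread gaps is to carry the gap bytes alongside the parsed fields. The sketch below is a minimal illustration under stated assumptions: `BwmHeaderView`, `read_header`, and `write_gaps` are hypothetical names, little-endian byte order is assumed, and only `vertex_count` is decoded.

```rust
/// Hypothetical header view: one parsed field plus the two opaque gaps.
struct BwmHeaderView {
    vertex_count: u32, // +0x08
    gap_a: [u8; 24],   // +0x24..+0x3C, never read by the engine
    gap_b: [u8; 8],    // +0x64..+0x6C, never read by the engine
}

/// Returns `None` if the buffer is too short for any of the spans.
fn read_header(bytes: &[u8]) -> Option<BwmHeaderView> {
    let vc = bytes.get(0x08..0x0C)?;
    let mut gap_a = [0u8; 24];
    gap_a.copy_from_slice(bytes.get(0x24..0x3C)?);
    let mut gap_b = [0u8; 8];
    gap_b.copy_from_slice(bytes.get(0x64..0x6C)?);
    Some(BwmHeaderView {
        vertex_count: u32::from_le_bytes([vc[0], vc[1], vc[2], vc[3]]),
        gap_a,
        gap_b,
    })
}

/// Writes the preserved gap bytes back at their original offsets.
fn write_gaps(out: &mut [u8], header: &BwmHeaderView) -> Option<()> {
    out.get_mut(0x24..0x3C)?.copy_from_slice(&header.gap_a);
    out.get_mut(0x64..0x6C)?.copy_from_slice(&header.gap_b);
    Some(())
}
```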

BWM ASCII

For tooling purposes, BioWare engine modules support a raw ASCII readable version of the walkmesh that can be dynamically parsed at runtime at a massive performance cost.

At a Glance

Property	Value
Extension(s)	`.bwm` (ASCII formatted)
Magic Signature	ASCII Text Directives
Type	Uncompiled Collision Text

Engine Audits & Decompilation

The following documents the engine’s exact load sequence and constraints for ASCII text walkmeshes mapped from swkotor.exe.

(Decompilation logic for this section was audited and verified via native Ghidra pipeline against swkotor.exe, explicitly pulling from CSWRoomSurfaceMesh::LoadMeshText at 0x00582d70.)

Pipeline Event	Ghidra Provenance & Engine Behavior
Searching for Keywords	The engine scans the text file reading line-by-line to look for the specific keywords `node`, `verts`, `faces`, and `aabb`.
Strict Face Formatting	Every defined face string must strictly format exactly 8 numbers separated by spaces. Interestingly, while the engine reads the adjacency input, it immediately deletes it! The engine forces adjacency math to be physically recomputed from scratch post-load to prevent geometric errors from old assets.
Line Length Limits	The engine will aggressively truncate or glitch if any single text line stretches beyond 256 characters (`0x100` bytes).
Face Reordering	Using the `surfacemat.2da` file, the engine completely shuffles the order of the faces while loading. It essentially groups every geometry face marked “walkable” at the absolute top of the array, and pushes all non-walkable geometry straight to the bottom.
Fudging the Boundaries	When figuring out the Axis-Aligned Bounding Box (AABB) limits, the text loader artificially stretches the box outwards by roughly `0.01` across every axis. Due to the face reordering mentioned above, the engine also has to build a temporary remap table under the hood just to keep track of where everything moved!
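
The reordering and its remap table can be modelled as a stable partition. In the sketch below the `surfacemat.2da` lookup is reduced to a slice of booleans, the function names are hypothetical, and preserving relative order within each group is an assumption (the table above only states that walkable faces move to the front). The padding helper applies the approximate `0.01` figure quoted above.

```rust
/// Returns (order, remap) where `order[new] = old` and `remap[old] = new`.
/// Walkable faces come first; relative order inside each group is kept
/// because `sort_by_key` is a stable sort.
fn reorder_faces(walkable: &[bool]) -> (Vec<usize>, Vec<usize>) {
    let mut order: Vec<usize> = (0..walkable.len()).collect();
    order.sort_by_key(|&i| !walkable[i]); // `false` (walkable) sorts first
    let mut remap = vec![0usize; walkable.len()];
    for (new, &old) in order.iter().enumerate() {
        remap[old] = new;
    }
    (order, remap)
}

/// Stretches an axis-aligned box outwards by roughly 0.01 on every axis.
fn pad_aabb(min: [f32; 3], max: [f32; 3]) -> ([f32; 3], [f32; 3]) {
    (min.map(|c| c - 0.01), max.map(|c| c + 0.01))
}
```

Any per-face data that refers to faces by index, such as adjacency, has to be pushed through `remap` after the shuffle.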

Warning

Because the ASCII face-reordering mechanism radically shuffles the root array indexes from walkable to unwalkable clusters via the LoadMeshText routine, it is impossible to do a clean 1-to-1 binary-to-ASCII-to-binary round trip of a KOTOR walkmesh without completely losing the original face indexing format!

TriMesh Derived & Computed Fields Reference

This document catalogs derived or computable fields specifically impacting TriMesh generation for MDL/MDX structures.

At a Glance

Property	Value
Extension(s)	`.mdl`
Domain	Geometry Math / Model Reconstruction
Rust Reference	View `rakata_formats::MdlNodeTriMesh` in Rustdocs

Data Model Structure

Rakata attempts to make building a TriMesh as painless as possible by handling the complex math under the hood.

Derived Fields: Rakata explicitly understands the difference between data you must supply (like static 3D coordinates) and data that can safely be calculated on the fly (like bounding limits, spherical radii, or adjacency maps). The rakata-formats API automatically calculates all of these required boundaries for you seamlessly whenever you serialize the file!

Engine Audits & Decompilation

This document catalogues every field on MdlMesh and MdlFace that can be derived from geometry, documenting what each field means, how community tools handle it, and what algorithm is needed to recompute it. This is the reference for future model-editing API work.

Field Categories

User-authored: Provided by the modeller. Never recomputed.
Derivable: Can be recomputed from geometry. Tools recompute on ASCII import / model rebuild; preserve verbatim on binary roundtrip.
Runtime-only: Written by the engine at load time. On-disk values are meaningless stubs.

1. Internal CExoArrayList Fields (+0x98 .. +0xC8)

The five CExoArrayList slots in the TriMesh header form a coordinated GL index buffer submission system. Each stores a 12-byte header (ptr/count/alloc) in the mesh header plus a single u32 data value in the content blob.

1.1 vertex_indices (+0x98) – Dead in KotOR

What it is: A legacy engine array block. In BioWare’s older titles (like Neverwinter Nights), this block pointed to vertex index data. In KOTOR, the engine never actually looks at this field at all.

Community tools:

mdledit: Misidentifies as cTexture3 (12-byte string). Byte-exact preserve.
mdlops: Reads as raw bytes via darray struct. Byte-exact preserve.
PyKotor: Reads as indices_counts. Byte-exact preserve.
xoreos/reone: Skip entirely.

Vanilla values: Always zeros (ptr=0, count=0, alloc=0).

Rakata Processing Rule: Store as [u8; 12] for lossless preservation, or zero on write. No computation needed.

1.2 left_over_faces (+0xA4) – Dead in KotOR

What it is: Another legacy array block. In NWN, this stored “left over” face geometry. In KOTOR, the engine updates the pointer location dynamically but never actually reads or uses the data during the OpenGL rendering loop.

Community tools:

mdledit: Misidentifies as cTexture4 (12-byte string). Byte-exact preserve.
mdlops: Reads as raw bytes via darray struct. Points to the packed u16 vertex index data (mdlops uses this as the indirection to find face indices).
PyKotor: Reads as indices_offsets. Byte-exact preserve.
xoreos: Only field it actually follows – reads the pointer to find packed u16 face vertex indices.
reone: Reads as indicesOffsetArrayDef. Uses first element as pointer to u16 index data.

Vanilla values: Typically non-zero. The pointer value points to the packed u16 face vertex index data. Count is 1, alloc is 1.

Rakata Processing Rule: Store the raw pointer and count variables. The pointer is content-relative and must be explicitly backpatched on write to point to the packed u16 face index data block.

1.3 vertex_indices_count (+0xB0) – Derivable

What it is: Single u32 value = total number of u16 vertex indices in the face index buffer.

Formula: face_count * 3

Community tools:

mdledit: Recomputes on every write (nVertIndicesCount = Faces.size() * 3).
mdlops: Recomputes on ASCII import.
PyKotor: Preserves from binary, creates empty for new models.

Rakata Processing Rule: Dynamically derive from faces.len() * 3. Never store a static value in the struct.

1.4 mdx_offsets (+0xBC) – Derivable (pointer)

What it is: Single u32 value = content-relative offset to the packed u16 face vertex index data in the MDL content blob.

Community tools:

mdledit: Writes placeholder, backpatches when VertIndices data is written.
mdlops: Same approach.
PyKotor: Same approach.

Rakata Processing Rule: Compute strictly at serialization time via the binary writer. Never store a static value in the struct.

1.5 index_buffer_pools / Inverted Counter (+0xC8) – Preserve or Derive

What it is: A single u32 value. On disk, it acts purely as a sequence counter that numbers meshes using an unusual “inverted” counting pattern. Once the engine loads the file into memory, it overwrites this slot with an OpenGL handle.

The inverted counter formula (from mdledit asciipostprocess.cpp:1024):

mesh_counter: sequential 1-based index across all mesh nodes in DFS tree order.
              Saber meshes consume TWO increments (one per inverted counter).

Quo = mesh_counter / 100
Mod = mesh_counter % 100
inverted_counter = (2^Quo) * 100 - mesh_counter
                 + (Mod != 0 ? Quo * 100 : 0)
                 + (Quo != 0 ? 0 : -1)

Example sequence: 98, 97, 96, …, 1, 0, 100, 199, 198, …, 101, 200, …

Community tools:

mdledit: Preserves from binary. Recomputes from formula only for ASCII import when value is missing (!nMeshInvertedCounter.Valid()).
mdlops: Recomputes on ASCII import using same formula.
PyKotor: Preserves from binary.

Rakata Processing Rule: Map as a static u32 field to perfectly preserve binary roundtripping. When natively constructing new models, dynamically compute the inverted sequence according to the formula using a DFS mesh counter.
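The formula above can be sketched in Rust as follows. This is a minimal illustration, not Rakata's actual implementation: the function name `inverted_counter` is hypothetical, and the sketch assumes the mesh counter stays far below 6300 so the power of two fits in an `i64`.

```rust
/// Inverted counter for a 1-based DFS mesh index, following the
/// mdledit formula quoted above. Hypothetical helper name.
fn inverted_counter(mesh_counter: u32) -> u32 {
    let quo = mesh_counter / 100;
    let rem = mesh_counter % 100;
    // (2^Quo) * 100 - mesh_counter
    let mut value = (1i64 << quo) * 100 - i64::from(mesh_counter);
    // + (Mod != 0 ? Quo * 100 : 0)
    if rem != 0 {
        value += i64::from(quo) * 100;
    }
    // + (Quo != 0 ? 0 : -1)
    if quo == 0 {
        value -= 1;
    }
    value as u32
}
```

Feeding it the counters 1, 2, …, 99, 100, 101 reproduces the start of the example sequence (98, 97, …, 0, 100, 199). A caller walking the node tree would advance the counter twice for saber meshes, as noted above.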

2. Packed u16 Face Vertex Indices

What it is: A tightly packed list of u16 index triplets (exactly 6 bytes per face). Each triplet names the three vertices that form one triangle. The entire block is uploaded directly to the graphics card to render the model.

Relationship to MdlFace: The packed u16 data is identical to MdlFace.vertex_indices for each face, laid out sequentially. It is fully redundant with the face array.

Community tools:

mdledit: Reads from binary into nVertIndices (3 u16 per face, stored alongside face data). Writes from face data.
mdlops: Reads as vertindexes darray. Writes from face data on ASCII import.
xoreos/reone: Read from the pointer at +0xA4 or +0xBC.

Rakata Processing Rule: Always dynamically derive identical copies directly from faces[i].vertex_indices during binary emission. Never map a redundant array inside the Rakata struct.
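As a rough sketch of that rule, the packed block and the `vertex_indices_count` value from section 1.3 can both be derived from the face list at emission time. The `Face` struct here is a hypothetical stand-in for `MdlFace` with only the one field that matters, and little-endian byte order is an assumption.

```rust
/// Hypothetical stand-in for MdlFace; only the field used here.
struct Face {
    vertex_indices: [u16; 3],
}

/// Value of vertex_indices_count: three indices per face.
fn vertex_indices_count(faces: &[Face]) -> u32 {
    (faces.len() * 3) as u32
}

/// Emits the packed u16 triplets (6 bytes per face).
/// Little-endian output is an assumption of this sketch.
fn pack_face_indices(faces: &[Face]) -> Vec<u8> {
    let mut out = Vec::with_capacity(faces.len() * 6);
    for face in faces {
        for index in face.vertex_indices {
            out.extend_from_slice(&index.to_le_bytes());
        }
    }
    out
}
```

Because both values are computed from `faces`, there is no second array in the model that could drift out of sync with the face list.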

3. Face Fields (MdlFace, 32 bytes per face)

3.1 plane_normal ([f32; 3]) – Derivable

What it is: The geometric direction the triangle’s flat surface is facing (a unit normal vector).

Formula:

edge1 = positions[v1] - positions[v0]
edge2 = positions[v2] - positions[v0]
normal = normalize(cross(edge1, edge2))

Community tools: All tools that recompute adjacency also recompute normals.

3.2 plane_distance (f32) – Derivable

What it is: The signed plane constant d in the plane equation dot(plane_normal, p) + d = 0. Its magnitude is the distance from the world origin to the face’s plane, measured along the normal.

Formula: plane_distance = -dot(plane_normal, positions[v0])

Note: some tools negate this differently. Verify against vanilla data.
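The two formulas in 3.1 and 3.2 combine into one small routine. The following is a sketch under the sign convention stated above (`-dot(normal, p0)`); the helper names are hypothetical, and returning `None` for a zero-area triangle is a design choice of this example rather than documented engine behavior.

```rust
type Vec3 = [f32; 3];

fn sub(a: Vec3, b: Vec3) -> Vec3 {
    [a[0] - b[0], a[1] - b[1], a[2] - b[2]]
}

fn cross(a: Vec3, b: Vec3) -> Vec3 {
    [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]
}

fn dot(a: Vec3, b: Vec3) -> f32 {
    a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
}

/// Returns (plane_normal, plane_distance), or None when the triangle
/// is degenerate and has no well-defined normal.
fn face_plane(p0: Vec3, p1: Vec3, p2: Vec3) -> Option<(Vec3, f32)> {
    let n = cross(sub(p1, p0), sub(p2, p0));
    let len = dot(n, n).sqrt();
    if len == 0.0 {
        return None;
    }
    let normal = [n[0] / len, n[1] / len, n[2] / len];
    Some((normal, -dot(normal, p0)))
}
```

For a triangle lying in the plane z = 1 with counter-clockwise winding seen from above, this yields the normal (0, 0, 1) and a plane distance of -1.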

3.3 surface_id (u32) – User-authored

What it is: Material/surface type identifier. Determines footstep sounds, walkability, etc. in walkmeshes; material properties in render meshes.

Not derivable – assigned by the modeller or inherited from the source asset.

3.4 adjacent ([u16; 3]) – Derivable

What it is: For each edge of the triangle, the index of the face sharing that edge. 0xFFFF means no adjacent face (boundary edge).

Edge-to-adjacent mapping:

adjacent[0]: face sharing edge (v0, v1)
adjacent[1]: face sharing edge (v1, v2)
adjacent[2]: face sharing edge (v2, v0)

Rakata Hash-Map Adjacency Algorithm:

1. Build position_key(v) = format!("{:.4e},{:.4e},{:.4e}", pos[0], pos[1], pos[2])

2. Build vertex_group: HashMap<String, Vec<usize>>
   For each vertex index i:
       vertex_group[position_key(i)].push(i)

3. Build vertex_to_faces: HashMap<usize, Vec<usize>>
   For each face f, for each vertex v in face.vertex_indices:
       vertex_to_faces[v].push(f)

4. Build face_set(vertex_index) -> HashSet<usize>:
   Collect all faces touching any vertex in the same position group:
       group = vertex_group[position_key(vertex_index)]
       union of vertex_to_faces[g] for all g in group

5. For each face f:
   For each edge (va, vb) in [(v0,v1), (v1,v2), (v2,v0)]:
       candidates = face_set(va) & face_set(vb) - {f}
       adjacent[edge] = if candidates.is_empty() { 0xFFFF }
                        else { min(candidates) }

Complexity: O(F * V_avg) where V_avg is the average number of faces per vertex group. Effectively O(F) for well-behaved meshes.

No-neighbor sentinel: 0xFFFF (u16::MAX). All tools agree except PyKotor which incorrectly uses 0 (bug – face 0 is a valid index).

Non-manifold edges: When more than 2 faces share an edge, tools differ:

mdledit: First match wins, logs a warning.
mdlops: Arbitrary (hash iteration order).
PyKotor: Smallest face index wins (min(candidates)).

Rakata Processing Rule: Always use min(candidates) internally so evaluation remains deterministic and aligns with PyKotor output. If non-manifold edges are detected, the writer must log a warning.

Important: Vertex matching must be position-based, not index-based. Meshes commonly have duplicate vertices at the same position with different normals/UVs (hard edges, UV seams). Index-based matching would miss adjacency across these seams.
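The adjacency steps above can be sketched compactly in Rust. This is an illustration, not the crate's code: `compute_adjacency` is a hypothetical name, steps 2 to 4 are collapsed into a single map from position key to the set of faces touching that position, and the non-manifold warning is omitted for brevity.

```rust
use std::collections::{BTreeSet, HashMap};

const NO_NEIGHBOR: u16 = 0xFFFF;

/// Step 1: position key with the same 4-digit scientific rounding.
fn position_key(p: [f32; 3]) -> String {
    format!("{:.4e},{:.4e},{:.4e}", p[0], p[1], p[2])
}

/// Returns the adjacent[] triple for every face.
fn compute_adjacency(positions: &[[f32; 3]], faces: &[[u16; 3]]) -> Vec<[u16; 3]> {
    // Steps 2-4 collapsed: every face touching any vertex at a position.
    let mut faces_at: HashMap<String, BTreeSet<usize>> = HashMap::new();
    for (f, tri) in faces.iter().enumerate() {
        for &v in tri {
            faces_at
                .entry(position_key(positions[v as usize]))
                .or_default()
                .insert(f);
        }
    }

    // Step 5: intersect the face sets of both edge endpoints.
    faces
        .iter()
        .enumerate()
        .map(|(f, tri)| {
            let mut adjacent = [NO_NEIGHBOR; 3];
            for edge in 0..3 {
                let a = &faces_at[&position_key(positions[tri[edge] as usize])];
                let b = &faces_at[&position_key(positions[tri[(edge + 1) % 3] as usize])];
                // BTreeSet intersections iterate in ascending order, so the
                // first hit other than f is min(candidates).
                if let Some(&g) = a.intersection(b).find(|&&g| g != f) {
                    adjacent[edge] = g as u16;
                }
            }
            adjacent
        })
        .collect()
}
```

Using a `BTreeSet` for the face sets gives the deterministic min(candidates) choice without an explicit sort. Two triangles that share an edge only through duplicated vertices (a UV seam, for example) still find each other, because the lookup goes through the position key rather than the vertex index.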

3.5 vertex_indices ([u16; 3]) – User-authored

What it is: The three vertex indices forming this triangle.

Not derivable – defines the mesh topology.

4. Mesh Bounding Geometry – Derivable

4.1 bounding_min / bounding_max ([f32; 3])

What it is: The tightest axis-aligned box that encloses every vertex in the mesh (an Axis-Aligned Bounding Box).

Formula:

bounding_min = [min of all positions[i][0], min of [1], min of [2]]
bounding_max = [max of all positions[i][0], max of [1], max of [2]]

4.2 bsphere_center / bsphere_radius ([f32; 3], f32)

What it is: The bounding sphere enclosing all vertices. Used by the engine for frustum culling (PartTriMesh::GetMinimumSphere at 0x00443330).

Engine algorithm (from Ghidra, confirmed in mdl_mdx.md):

center = average of all vertex positions (centroid)
radius = max distance from center to any vertex

This is NOT the true minimum bounding sphere (Welzl’s algorithm), but a simpler centroid-based approximation. Matches what vanilla files contain.
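Both bounding volumes in 4.1 and 4.2 reduce to a single pass over the positions. The sketch below uses hypothetical function names and returns `None` for an empty vertex list, which is a choice of this example rather than documented behavior.

```rust
type Vec3 = [f32; 3];

/// Returns (bounding_min, bounding_max).
fn bounding_box(positions: &[Vec3]) -> Option<(Vec3, Vec3)> {
    let first = *positions.first()?;
    let (mut min, mut max) = (first, first);
    for p in positions {
        for axis in 0..3 {
            min[axis] = min[axis].min(p[axis]);
            max[axis] = max[axis].max(p[axis]);
        }
    }
    Some((min, max))
}

/// Centroid-based sphere: returns (bsphere_center, bsphere_radius).
fn bounding_sphere(positions: &[Vec3]) -> Option<(Vec3, f32)> {
    if positions.is_empty() {
        return None;
    }
    let count = positions.len() as f32;
    let mut center = [0.0f32; 3];
    for p in positions {
        for axis in 0..3 {
            center[axis] += p[axis] / count;
        }
    }
    // Radius is the largest distance from the centroid to any vertex.
    let radius = positions
        .iter()
        .map(|p| {
            let d = [p[0] - center[0], p[1] - center[1], p[2] - center[2]];
            (d[0] * d[0] + d[1] * d[1] + d[2] * d[2]).sqrt()
        })
        .fold(0.0f32, f32::max);
    Some((center, radius))
}
```

Since the centroid is pulled toward dense vertex clusters, the resulting sphere can be noticeably larger than a true minimum sphere, but it matches the approximation described above.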

4.3 total_surface_area (f32)

What it is: Sum of all triangle areas in the mesh.

Formula:

For each face:
    edge1 = positions[v1] - positions[v0]
    edge2 = positions[v2] - positions[v0]
    area += 0.5 * length(cross(edge1, edge2))
total_surface_area = sum of all face areas
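A direct Rust transcription of the area formula might look like this. The function signature is hypothetical; it takes positions and per-face index triplets as plain slices.

```rust
/// Sum of triangle areas: 0.5 * |edge1 x edge2| per face.
fn total_surface_area(positions: &[[f32; 3]], faces: &[[u16; 3]]) -> f32 {
    faces
        .iter()
        .map(|face| {
            let p0 = positions[face[0] as usize];
            let p1 = positions[face[1] as usize];
            let p2 = positions[face[2] as usize];
            let e1 = [p1[0] - p0[0], p1[1] - p0[1], p1[2] - p0[2]];
            let e2 = [p2[0] - p0[0], p2[1] - p0[1], p2[2] - p0[2]];
            let c = [
                e1[1] * e2[2] - e1[2] * e2[1],
                e1[2] * e2[0] - e1[0] * e2[2],
                e1[0] * e2[1] - e1[1] * e2[0],
            ];
            0.5 * (c[0] * c[0] + c[1] * c[1] + c[2] * c[2]).sqrt()
        })
        .sum()
}
```

A unit square split into two triangles sums to an area of 1.0.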

5. AABB Tree – Derivable (complex)

What it is: A collision-detection tree (a bounding volume hierarchy) built over the faces of the mesh. It recursively partitions the collision geometry into smaller and smaller axis-aligned boxes so the engine can quickly determine whether a player bumps into a wall without checking collision against every single polygon.

When needed: Only for MdlNodeData::Aabb nodes (walkmesh-like collision geometry). Regular render meshes don’t have AABB trees.

Node layout: 40 bytes (see mdl_mdx.md for full struct).

Build algorithm: Recursive spatial partition:

1. Compute AABB of all face centroids.
2. Choose split axis (longest AABB dimension).
3. Sort faces by centroid along split axis.
4. Split at median into left/right subsets.
5. Recurse on each subset until single-face leaves.

Community tools generally don’t rebuild AABB trees from scratch – they preserve the existing tree or require external tooling to generate it.
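The median-split recursion described in the build algorithm can be sketched as below. This is a simplified illustration: `AabbNode` is a hypothetical in-memory tree that records only the face partitioning, and it does not model the 40-byte on-disk node layout or the per-node bounding boxes.

```rust
type Vec3 = [f32; 3];

/// Hypothetical in-memory tree; not the on-disk node struct.
enum AabbNode {
    Leaf { face: usize },
    Branch { left: Box<AabbNode>, right: Box<AabbNode> },
}

/// Builds a tree over `faces` (indices into `centroids`).
fn build(centroids: &[Vec3], faces: &mut [usize]) -> Option<AabbNode> {
    match faces.len() {
        0 => return None,
        1 => return Some(AabbNode::Leaf { face: faces[0] }),
        _ => {}
    }
    // AABB of the face centroids in this subset.
    let mut min = [f32::INFINITY; 3];
    let mut max = [f32::NEG_INFINITY; 3];
    for &f in faces.iter() {
        for a in 0..3 {
            min[a] = min[a].min(centroids[f][a]);
            max[a] = max[a].max(centroids[f][a]);
        }
    }
    // Split axis = longest AABB dimension.
    let axis = (0..3)
        .max_by(|&a, &b| (max[a] - min[a]).total_cmp(&(max[b] - min[b])))
        .unwrap();
    // Sort by centroid along the axis, then split at the median.
    faces.sort_by(|&a, &b| centroids[a][axis].total_cmp(&centroids[b][axis]));
    let (left, right) = faces.split_at_mut(faces.len() / 2);
    Some(AabbNode::Branch {
        left: Box::new(build(centroids, left)?),
        right: Box::new(build(centroids, right)?),
    })
}

/// Collects leaf face indices in left-to-right order.
fn leaves(node: &AabbNode, out: &mut Vec<usize>) {
    match node {
        AabbNode::Leaf { face } => out.push(*face),
        AabbNode::Branch { left, right } => {
            leaves(left, out);
            leaves(right, out);
        }
    }
}
```

A real writer would additionally compute each node's bounding box from the vertices of the faces beneath it and flatten the tree into the node array.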

6. Fields That Are NOT Derivable

These fields are user-authored or carried over from tooling. Rakata treats them as opaque payload and never recomputes them:

Field	Source
Vertex positions, normals, UVs, tangent space	3D modeller
Vertex colors	3D modeller or material editor
Texture names (texture_0, texture_1)	Material assignment
Diffuse/ambient colors	Material properties
Transparency hint, light_mapped, beaming, etc.	Material flags
Surface ID per face	Surface type assignment
Vertex indices per face	Mesh topology
Controller keyframes	Animation data
Bone weights, indices, bonemap	Rigging tool
Emitter properties	Particle editor

7. Tool Cross-Reference: CExoArrayList Naming

The naming across tools is wildly inconsistent:

Offset	Engine (Ghidra)	rakata	mdledit	mdlops	PyKotor	xoreos
+0x98	vertex_indices	vertex_indices_array	cTexture3	pntr_to_vert_num	indices_counts	(skip)
+0xA4	left_over_faces	left_over_faces_array	cTexture4	pntr_to_vert_loc	indices_offsets	offOffVerts
+0xB0	vertex_indices_count	vertex_indices_count_array	IndexCounterArray	array3	counters	(skip)
+0xBC	mdx_offsets	mdx_offsets_array	IndexLocationArray	(backpatch only)	(not modeled)	offOffVerts
+0xC8	index_buffer_pools	index_buffer_pools_array	MeshInvertedCounterArray	inv_count	(not modeled)	(skip)

Note: mdledit’s identification of +0x98/+0xA4 as texture name slots is incorrect for KotOR. In NWN, the mesh header has 4 texture name slots (64 bytes each) at this region. KotOR reduced to 2 texture names (32 bytes each at +0x58/+0x78) and repurposed the remaining space as CExoArrayList headers. The +0x98 list is always empty (all zeros) in vanilla KotOR, and mdledit preserves both 12-byte regions verbatim, so its string-based read/write produces byte-identical results.

8. MDL vs BWM Adjacency Encoding

A critical distinction for anyone working with both formats:

Property	MDL Face Adjacency	BWM Walkmesh Adjacency
Storage	u16 per edge	i32 per edge
Encoding	Plain face index	`face_index * 3 + edge_index`
No-neighbor	`0xFFFF`	`-1` (0xFFFFFFFF)
Purpose	GL rendering hints	Pathfinding / collision

BWM’s edge-encoded adjacency tells you not just WHICH face is adjacent, but WHICH EDGE of that face connects – needed for the pathfinding walk algorithm. MDL only needs to know which face, not which edge.
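The difference between the two encodings is easiest to see as a pair of decoders. The function names below are hypothetical and serve only to illustrate the table.

```rust
/// MDL: plain face index, 0xFFFF means no neighbor.
fn decode_mdl_adjacency(raw: u16) -> Option<usize> {
    if raw == 0xFFFF {
        None
    } else {
        Some(raw as usize)
    }
}

/// BWM: face_index * 3 + edge_index, -1 means no neighbor.
/// Returns (face_index, edge_index).
fn decode_bwm_adjacency(raw: i32) -> Option<(usize, usize)> {
    if raw < 0 {
        None
    } else {
        Some(((raw / 3) as usize, (raw % 3) as usize))
    }
}

/// Inverse of decode_bwm_adjacency.
fn encode_bwm_adjacency(neighbor: Option<(usize, usize)>) -> i32 {
    match neighbor {
        Some((face, edge)) => (face * 3 + edge) as i32,
        None => -1,
    }
}
```

For example, a BWM value of 7 decodes to face 2, edge 1, whereas an MDL value of 7 simply means face 7.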

9. Write-Order Dependencies

When writing a mesh node, fields must be emitted in a specific order because some fields are content-relative pointers that must be backpatched. The canonical order (from mdledit binarywrite.cpp) is:

1. Face array (32 bytes per face)
2. vertex_indices_count data (single u32: face_count * 3)
3. Content vertex positions (12 bytes per vertex, only for MDL content blob)
4. mdx_offsets data (single u32: placeholder, backpatched)
5. index_buffer_pools data (single u32: inverted counter value)
6. Packed u16 vertex indices (face_count * 3 u16 values)

After step 6, backpatch the mdx_offsets pointer to point to the start of step 6’s data.

CExoArrayList headers at +0x98..+0xC8 are written as part of the mesh extra header (332 bytes), with pointer values backpatched after the data is written.
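The placeholder-then-backpatch pattern for the six data blocks can be sketched as follows. This is a simplified illustration with a hypothetical function name: it assumes the buffer starts at content offset 0, takes the 32-byte face records as pre-encoded bytes, writes little-endian values, and leaves the CExoArrayList headers in the mesh header out of scope.

```rust
/// Emits the six data blocks in the canonical order and backpatches
/// the mdx_offsets slot. Offsets are relative to the returned buffer.
fn write_mesh_content(
    face_records: &[u8],
    positions: &[[f32; 3]],
    face_indices: &[[u16; 3]],
    inverted_counter: u32,
) -> Vec<u8> {
    let mut out = Vec::new();
    // 1. Face array (pre-encoded, 32 bytes per face).
    out.extend_from_slice(face_records);
    // 2. vertex_indices_count data.
    out.extend_from_slice(&((face_indices.len() * 3) as u32).to_le_bytes());
    // 3. Content vertex positions (12 bytes per vertex).
    for p in positions {
        for c in p {
            out.extend_from_slice(&c.to_le_bytes());
        }
    }
    // 4. mdx_offsets data: placeholder, remembered for backpatching.
    let slot = out.len();
    out.extend_from_slice(&0u32.to_le_bytes());
    // 5. index_buffer_pools data (inverted counter).
    out.extend_from_slice(&inverted_counter.to_le_bytes());
    // 6. Packed u16 vertex indices.
    let packed_start = out.len() as u32;
    for tri in face_indices {
        for i in tri {
            out.extend_from_slice(&i.to_le_bytes());
        }
    }
    // Backpatch: point the placeholder at the packed index data.
    out[slot..slot + 4].copy_from_slice(&packed_start.to_le_bytes());
    out
}
```

For one face and three vertices the buffer is 86 bytes long, the placeholder sits at offset 72, and it is patched to 80, the start of the packed indices.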

Texture Formats

KOTOR handles graphics via multiple tailored texture formats. It uses hardware-accelerated DXT compression techniques natively supported by its OpenGL backend.

Implementation Blueprints

This section details the primary texture architectures parsed natively by rakata-formats.

Format	Name	Layout & Purpose
TPC	Texture Pack Compressed	A proprietary BioWare wrapper around native DXT-compressed OpenGL texture data. This is the primary format used for all base-game environment and character textures.
DDS	DirectDraw Surface	A proprietary BioWare variation of the standard Microsoft DDS format. Rather than utilizing standard headers, the legacy engine requires a bespoke 20-byte prefix.
TGA	Truevision Targa	An uncompressed, lossless visual format. Used for rendering crisp UI elements, visual effects (VFX), etc.
TXI	Texture Extensions	Plaintext routing files that accompany primary textures. They direct the engine how to apply advanced rendering hints, such as procedural animations or bump-mapping.

TPC (Texture Pack Compressed)

TPC is the proprietary bundled texture format created by BioWare. It contains the raw DXT-compressed texture data, pre-computed mipmaps, and potentially appended TXI configuration data all in one blob.

At a Glance

Property	Value
Extension(s)	`.tpc`
Magic Signature	None
Type	Compressed Texture Pack
Rust Reference	View `rakata_formats::Tpc` in Rustdocs

Data Model Structure

The rakata-formats crate provides a formally mapped Tpc container that completely shields you from managing pixel type bitmasks.

Pixel Enum Decoding: Instead of raw integer flag codes, calling known_pixel_format() instantly resolves the byte code into a robust TpcHeaderPixelFormat enumeration (e.g., Dxt1, Dxt5, Rgb, Greyscale).
Footer Management: Trailing TXI text is seamlessly maintained, and can be cleanly updated via .set_txi_text_strict().

Engine Audits & Decompilation

The following documents the engine’s exact load sequence for TPC textures mapped from swkotor.exe.

(Decompilation logic for this section was audited and verified via native Ghidra pipeline against swkotor.exe, explicitly pulling from CAuroraProcessedTexture::ReadProcessedTextureHeader at 0x0070f590.)

Pipeline Event	Ghidra Provenance & Engine Behavior
Format Byte Mapping	The single header format byte acts as a bitmask. The engine checks bit0, bit1, and bit2 to generate internal format codes `1`, `3`, and `4`.
Compression Dispatch	The runtime ignores other variants. It requires Code `3` to process 8-byte blocks (standard S3TC DXT1) or Code `4` to process 16-byte blocks (standard S3TC DXT5).
Mipmap Calculations	Rather than parsing explicit counts, the engine calculates mipmap storage dimensions by right-shifting the base dimensions for each level without clamping the result to 1. As a result, very deep mip levels can compute to 0 bytes.
OpenGL Hardware Binding	When uploading the TPC bytes into OpenGL video memory, the engine maps Code `3` directly to OpenGL constant `0x83F0` (DXT1) and Code `4` to `0x83F3` (DXT5). There is no branching logic to support DXT3 (`0x83F2`) inside the vanilla engine’s parser.
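The dispatch and mip sizing behavior in the table can be sketched roughly as follows. The function names are hypothetical. The block sizes and GL constants come from the table; the rounding of each dimension up to whole 4x4 blocks is standard S3TC math and is an assumption here, since the engine's exact size expression is not reproduced in this section.

```rust
/// Bytes per 4x4 block for engine format code 3 (DXT1) or 4 (DXT5).
fn block_bytes(code: u8) -> Option<u32> {
    match code {
        3 => Some(8),
        4 => Some(16),
        _ => None,
    }
}

/// OpenGL internal format constant for the engine format code.
fn gl_internal_format(code: u8) -> Option<u32> {
    match code {
        3 => Some(0x83F0), // DXT1
        4 => Some(0x83F3), // DXT5
        _ => None,         // no DXT3 (0x83F2) branch
    }
}

/// Storage size of one mip level. Dimensions are right-shifted
/// without clamping to 1, so deep levels can come out as 0 bytes.
/// The 4x4 block rounding is an assumption of this sketch.
fn mip_bytes(width: u32, height: u32, level: u32, code: u8) -> Option<u32> {
    let w = width.checked_shr(level).unwrap_or(0);
    let h = height.checked_shr(level).unwrap_or(0);
    let blocks = ((w + 3) / 4) * ((h + 3) / 4);
    Some(blocks * block_bytes(code)?)
}
```

With a 64x4 DXT1 texture, level 3 shifts the height to 0 and the computed size collapses to 0 bytes, which is the edge case the table describes.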

DDS (DirectDraw Surface)

The .dds extension in KOTOR does not represent a standard Microsoft DirectDraw Surface file. Instead, the engine strictly expects a proprietary format consisting of a bespoke 20-byte configuration prefix followed by raw DXT compression blocks. The vanilla parsing logic completely ignores the standard DDS magic and its 124-byte header.

At a Glance

Property	Value
Extension(s)	`.dds`
Magic Signature	None (Proprietary 20-Byte Prefix)
Type	BioWare DirectDraw Wrapper
Rust Reference	View `rakata_formats::Dds` in Rustdocs

Data Model Structure

rakata-formats is built to natively parse both standard Microsoft DDS architecture and KotOR’s proprietary CResDDS format transparently. When evaluating a .dds file via rakata_formats::Dds:

Bilateral Read Path: If the file begins with the standard Microsoft DDS magic bytes, Rakata leverages a standard pipeline to extract the payload. If those magic bytes are missing, Rakata immediately pivots and parses the data natively as a proprietary K1 CResDDS 20-byte payload.
Strict Serialization: Regardless of which variation is ingested from the disk, Rakata will strictly emit valid 20-byte KotOR-compliant payloads during binary serialization.

Engine Audits & Decompilation

The following documents the engine’s exact load sequence for DDS textures mapped from swkotor.exe.

(Decompilation logic for this section was audited and verified via native Ghidra pipeline against swkotor.exe, explicitly pulling from CResDDS::GetDDSAttrib at 0x00710ee0.)

Pipeline Event	Ghidra Provenance & Engine Behavior
Prefix Stripping	The engine’s parser expects and strips a proprietary 20-byte prefix prepended to the DDS buffer: `width` (+0x00), `height` (+0x04), `byte code` (+0x08), `base-size` (+0x0C), and an `alpha_mean` `FLOAT` (+0x10).
Block Calculation	The runtime mirrors the TPC logic for block sizing. The per-block byte size is computed as `(pixel_type == 4) * 8 + 8`: Code `3` yields 8-byte texture blocks, while Code `4` yields 16-byte blocks.

Tip

Reserved Gaps: The bytes spanning +0x09 to +0x0B in the header prefix are entirely ignored by the GetDDSAttrib read path. We preserve them strictly for round-trip fidelity.
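To make the prefix layout concrete, here is a minimal parsing sketch. The `DdsPrefix` struct and function names are hypothetical and are not the `rakata_formats::Dds` API; little-endian byte order and the u32 width of the integer fields are assumptions.

```rust
/// Hypothetical view of the 20-byte prefix.
struct DdsPrefix {
    width: u32,       // +0x00
    height: u32,      // +0x04
    pixel_type: u8,   // +0x08 (byte code)
    reserved: [u8; 3], // +0x09..+0x0B, kept for round-tripping
    base_size: u32,   // +0x0C
    alpha_mean: f32,  // +0x10
}

fn u32_at(bytes: &[u8], offset: usize) -> u32 {
    u32::from_le_bytes([
        bytes[offset],
        bytes[offset + 1],
        bytes[offset + 2],
        bytes[offset + 3],
    ])
}

/// Parses the prefix, or returns None when fewer than 20 bytes exist.
fn parse_prefix(bytes: &[u8]) -> Option<DdsPrefix> {
    let b = bytes.get(..20)?;
    Some(DdsPrefix {
        width: u32_at(b, 0x00),
        height: u32_at(b, 0x04),
        pixel_type: b[0x08],
        reserved: [b[0x09], b[0x0A], b[0x0B]],
        base_size: u32_at(b, 0x0C),
        alpha_mean: f32::from_bits(u32_at(b, 0x10)),
    })
}

/// Block size formula from the table: (pixel_type == 4) * 8 + 8.
fn block_bytes(pixel_type: u8) -> u32 {
    u32::from(pixel_type == 4) * 8 + 8
}
```

Keeping the three reserved bytes in the struct is what allows a writer to reproduce the original file byte for byte even though the engine never reads them.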

TGA (Truevision Targa)

TGA is the standard uncompressed image format utilized by the engine, typically reserved for UI elements, icons, or high-fidelity models that demand lossless alpha channels.

At a Glance

Property	Value
Extension(s)	`.tga`
Magic Signature	Truevision Standard
Type	Uncompressed RGB/A Raster
Rust Reference	View `rakata_formats::Tga` in Rustdocs

Data Model Structure

rakata-formats natively emulates the engine’s parsing logic. When evaluating a .tga file, Rakata ignores non-essential Truevision header flags (such as image_type and id_len) and strictly validates the payload against the engine’s natively supported pixel_depth thresholds.

Engine Audits & Decompilation

The following documents the engine’s exact load sequence for TGA textures mapped from swkotor.exe.

(Decompilation logic for this section was audited and verified via native Ghidra pipeline against swkotor.exe, explicitly pulling from ImageReadTGAHeader at 0x0045e2e0.)

Pipeline Event	Ghidra Provenance & Engine Behavior
Header Stripping	Function: `ImageReadTGAHeader` (`0x0045e2e0`) The native engine parser is exceptionally loose. Standard Truevision fields such as `image_type` (offset `+0x02`), `image_descriptor` (offset `+0x11` governing the origin bit), and the `id_len` field are completely ignored and never validated during a read sequence.
Depth Validation	Function: `ImageReadTGAHeader` (`0x0045e2e0`) The sole structural validation check performed before memory allocation dictates that the `pixel_depth` must strictly equal `8`, `24`, or `32`. Any other depth integer triggers an immediate process failure.
Write Generation	Function: `ImageWriteTGA` The engine’s in-memory rasterization is strictly top-left, but its canonical on-disk `.tga` format is entirely bottom-left. When saving screenshot files or extracting buffers to disk, the engine forcefully accommodates this by hardcoding `image_type=2`, `id_len=0`, and `image_descriptor=0`, explicitly triggering an `ImageFlipY` vertical inversion on the memory payload before pushing the image to disk.
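The read and write behavior in the table can be sketched in a few lines. The function names are hypothetical. The offsets used (pixel depth at 0x10, descriptor at 0x11, width and height as little-endian u16 at 0x0C and 0x0E) are those of the standard 18-byte Truevision header.

```rust
/// Mirrors the engine's only read-side check: pixel_depth must be
/// 8, 24, or 32. image_type, id_len, and the descriptor are ignored.
fn read_pixel_depth(header: &[u8]) -> Result<u8, String> {
    let depth = *header.get(0x10).ok_or("header shorter than 18 bytes")?;
    match depth {
        8 | 24 | 32 => Ok(depth),
        other => Err(format!("unsupported pixel depth {other}")),
    }
}

/// Header as written by the engine: image_type = 2, id_len = 0,
/// image_descriptor = 0 (bottom-left origin).
fn write_header(width: u16, height: u16, depth: u8) -> [u8; 18] {
    let mut header = [0u8; 18];
    header[0x02] = 2;
    header[0x0C..0x0E].copy_from_slice(&width.to_le_bytes());
    header[0x0E..0x10].copy_from_slice(&height.to_le_bytes());
    header[0x10] = depth;
    header
}

/// Vertical flip applied before writing a top-left buffer to disk.
fn flip_y(pixels: &[u8], row_bytes: usize) -> Vec<u8> {
    pixels.chunks(row_bytes).rev().flatten().copied().collect()
}
```

A header produced by `write_header` passes `read_pixel_depth` for the three supported depths, and `flip_y` reverses row order without touching the bytes inside each row.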

TXI (Texture Extensions)

TXI files (or the TXI text appended to a TPC) are highly forgiving plain-text metadata blocks that accompany textures and control mipmap, bump-map, or animation behavior.

At a Glance

Property	Value
Extension(s)	`.txi`
Magic Signature	None
Type	ASCII Configuration Strings
Rust Reference	View `rakata_formats::Txi` in Rustdocs

Data Model Structure

rakata-formats inherently pairs TXI payload access alongside its target texture. When querying the virtual resolver, textures are natively returned as a combined TextureWithTxiResult object. This architecture guarantees that the raw graphic bytes and their exact applied TXI rule block are inextricably tracked as a coupled pair throughout the virtual environment.

Engine Audits & Decompilation

The following documents the engine’s exact load sequence for TXI text configurations mapped from swkotor.exe.

(Decompilation logic for this section was audited and verified via native Ghidra pipeline against swkotor.exe, explicitly pulling from CAurTextureBasic::ParseField at 0x00422390.)

Pipeline Event	Ghidra Provenance & Engine Behavior
Invalid Commands	Function: `CAurTextureBasic::ParseField` (`0x00422390`) Unknown or unsupported TXI commands are safely bypassed. If the parsed string evaluation fails to match an explicit configuration branch, the subroutine immediately exits without throwing any logger alarms or terminating texture load.
Case Agnosticism	Function: `CAurTextureBasic::ParseField` (`0x00422390`) Field matching acts strictly case-insensitive (e.g. `cMgTxi == cmgtxi`).
Line Normalization	Function: `CAurTextureBasic::ParseField` (`0x00422390`) The native internal engine scanner searches exclusively for `LF` (`\n`) bounds. However, if the read targets an active disk file, the underlying standard C `fgets` call automatically handles `CRLF` normalization before handing strings to the field parser.
Boolean Parsing	Function: `Parse_bool` (`0x00463680`) The native `Parse_bool` validation explicitly performs lowercase scans evaluating against exact variants of `"true"`, `"false"`, `"1"`, or `"0"`.

Note

Boolean Parsing Nuance: Modding documentation often warns against specific formats or keywords (like decal). Decompilation reveals the universal behavior applied to all boolean flags:

Missing Space: Keys merged with their arguments (e.g. "decal1", "mipmap0") silently abort. The firstword() extractor pulls the merged string, completely failing the target evaluation list.

Separated Numbers: Space-separated numbers (e.g. "decal 1") are valid. firstword() pulls "decal" and hands " 1" off to Parse_bool(). An sscanf strips the whitespace and evaluates "1" to true.

Argument-less Flags: Passing just a flag ("decal") triggers the branch, but Parse_bool finds no argument. It fails to match "true", "false", "1", or "0", silently leaving the boolean at its previous value.
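The three cases above can be modeled with a small sketch. The function names are hypothetical and this is an approximation of the described behavior, not a transcription of the decompiled code.

```rust
/// Approximates Parse_bool: first whitespace-delimited token,
/// lowercased, matched against "true", "false", "1", "0".
fn parse_bool(arg: &str) -> Option<bool> {
    let token = arg.split_whitespace().next()?.to_ascii_lowercase();
    match token.as_str() {
        "true" | "1" => Some(true),
        "false" | "0" => Some(false),
        _ => None,
    }
}

/// Applies one TXI line to a boolean flag named `key`.
/// Returns the new value, or `current` when nothing matched.
fn apply_flag(line: &str, key: &str, current: bool) -> bool {
    let line = line.trim_start();
    // firstword(): everything up to the first whitespace.
    let (word, rest) = match line.find(char::is_whitespace) {
        Some(i) => line.split_at(i),
        None => (line, ""),
    };
    if !word.eq_ignore_ascii_case(key) {
        return current; // merged keys like "decal1" end up here
    }
    parse_bool(rest).unwrap_or(current)
}
```

With this model, `"decal 1"` sets the flag, `"decal1"` never reaches the decal branch, and a bare `"decal"` leaves the previous value untouched.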

Text & Data Formats

KOTOR heavily relies on structured text and data layouts to manage everything from stat numbers to map meshes. Engine-native evidence for these varied structures (2DA, TLK, VIS, LYT, LTR) is documented below.

Implementation Blueprints

Specification	Core Focus
2DA (2D Array)	Binary/text relational database format managing core engine rules, constants, and stats.
TLK (Talk Table)	Centralized localized string dictionary managing all in-game dialogue and UI text.
VIS (Visibility Graph)	ASCII topology mapping the rendering culling relationships between area geometry rooms.
LYT (Layout File)	ASCII configuration defining spatial positioning and linking of a module’s room geometry.
LTR (Letter Frequency)	Character-frequency matrices supporting the in-game random name generator algorithms.

2DA (2D Array)

2DAs are data tables defining the engine’s core rules and constraints (such as item costs and Force powers, which the engine internally stores as spells.2da). They bridge the gap between human-readable text for modding and fast-loading binaries for the final game.

At a Glance

Property	Value
Extension(s)	`.2da`
Magic Signature	`2DA` / `V2.b` (Binary) or `V2.0` (Text)
Type	Tabular Data
Rust Reference	View `rakata_formats::TwoDa` in Rustdocs

Data Model Structure

The rakata-formats crate parses 2DAs so that binary and text formats look identical to the rest of the application. The TwoDa container lets developers simply retrieve cells using twoda.cell(row, "Label"), completely hiding the inner offset calculations and padding differences between text and binary structures.

Engine Audits & Decompilation

The following documents the engine’s exact load sequence for 2D Arrays mapped from swkotor.exe.

(Decompilation logic for this section was audited and verified via native Ghidra pipeline against swkotor.exe, explicitly pulling from C2DA::Load2DArray at 0x004143b0.)

Pipeline Event	Ghidra Provenance & Engine Behavior
Magic/Version Gate	The engine first checks for the `"2DA "` signature. It then branches down a binary parsing path for `"V2.b"` or a text parsing path for `"V2.0"`. Any other version string triggers an instant load failure.
Binary Load (`V2.b`)	The parser starts with an 8-byte skip into the file (`data_ptr = raw_data_ptr + 8`), jumping right past the header to the starting newline character. Column headers are a tab-separated, null-terminated block. The cell offsets are then parsed as an array of `u16` integers (`rows × cols`) in row-major order.
Text Load (`V2.0`)	The text parser strips whitespace and newlines, specifically hunting for `"DEFAULT:"` or `"DEFAULT"` blocks. When parsing individual cells, the literal text `"****"` is converted into an empty string `""` to signal the fallback rule. Finally, it runs `_strlwr` on all column headers to immediately convert them to lowercase.

Tip

Orphaned Size Field: In binary row blocks, the 2-byte cell_data_size u16 is completely bypassed. The engine skips it with +2 and performs no reading or validation.
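The magic/version gate and the `"****"` cell rule lend themselves to a short sketch. The names `TwoDaKind`, `detect_kind`, and `text_cell` are hypothetical and are not part of the `rakata_formats::TwoDa` API.

```rust
#[derive(Debug, PartialEq)]
enum TwoDaKind {
    Binary, // "V2.b"
    Text,   // "V2.0"
}

/// Mirrors the gate: "2DA " signature, then a known version tag.
/// Anything else is a load failure (None).
fn detect_kind(data: &[u8]) -> Option<TwoDaKind> {
    if data.get(..4)? != b"2DA " {
        return None;
    }
    match data.get(4..8)? {
        b"V2.b" => Some(TwoDaKind::Binary),
        b"V2.0" => Some(TwoDaKind::Text),
        _ => None,
    }
}

/// Text-path cell rule: the literal "****" becomes an empty string.
fn text_cell(raw: &str) -> &str {
    if raw == "****" {
        ""
    } else {
        raw
    }
}
```

On the binary path, parsing would continue from byte 8 (the newline after the header), as described in the table.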

TLK (Talk Table)

The Talk Table is a massive localized string repository. Every item description, line of dialogue, and UI text in KOTOR references an index (a StrRef) pointing into this master dictionary file.

At a Glance

Property	Value
Extension(s)	`.tlk`
Magic Signature	`TLK` / `V3.0`
Type	Localized String Bundle
Rust Reference	View `rakata_formats::Tlk` in Rustdocs

Data Model Structure

The entire Talk Table format maps to the rakata_formats::Tlk struct. Each entry fuses the separated audio and text flags into a single TlkEntry. The struct safely handles missing text flags natively, preventing out-of-bounds string lookups if an entry contains audio parameters but no valid string text offset.

Engine Audits & Decompilation

The following documents the engine’s exact load sequence for Talk Tables mapped from swkotor.exe.

(Decompilation logic for this section was audited and verified via native Ghidra pipeline against swkotor.exe, explicitly pulling from CTlkFile::ReadHeader at 0x0041d890 and CTlkFile::AddFile.)

Pipeline Event	Ghidra Provenance & Engine Behavior
Magic Check	Function: `CTlkFile::ReadHeader` (`0x0041d890`) The parser requires a `"TLK "` signature. However, strict version validation is entirely absent. The engine accepts essentially any version tag without raising a failure.
Size Dispatching	Function: `CTlkFile::ReadHeader` (`0x0041d890`) While the version isn’t used for rejection, it dynamically determines memory block sizing. A `"V3.0"` tag dictates 40 bytes (`0x28`) per entry, whereas any other version tag automatically falls back to 36 bytes (`0x24`).
Feminine Dialects	Function: `CTlkFile::AddFile` When mounting the primary archive, the engine systematically queries the directory for a secondary `<basename>F.tlk` (e.g., `dialogF.tlk`) specifically to supply overriding feminine vocabulary strings for character-gendered text queries.
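The header behavior in the table reduces to two small helpers. Both names are hypothetical illustrations rather than Rakata APIs.

```rust
/// Entry size chosen by the engine: 40 bytes for "V3.0", 36 for any
/// other version tag. Only the "TLK " signature is mandatory.
fn entry_size(header: &[u8]) -> Option<usize> {
    if header.get(..4)? != b"TLK " {
        return None;
    }
    let version = header.get(4..8)?;
    Some(if version == b"V3.0" { 0x28 } else { 0x24 })
}

/// Companion file probed for feminine strings: <basename>F.tlk.
fn feminine_name(basename: &str) -> String {
    format!("{basename}F.tlk")
}
```

Note that an unrecognized version tag still loads; it only changes the stride used to walk the entry table.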

VIS (Visibility Graph)

VIS is an ASCII graph structure used extensively by the rendering engine to calculate occlusion culling. It plots mathematical relationships defining which room meshes are visible from any given observer room.

At a Glance

Property	Value
Extension(s)	`.vis`
Magic Signature	None
Type	Room Graph
Rust Reference	View `rakata_formats::Vis` in Rustdocs

Data Model Structure

The rakata-formats crate parses raw VIS text blocks into a strongly typed Vis structure. Rather than storing flat arrays of strings, Vis models room visibility as an adjacency list using BTreeMap<String, BTreeSet<String>>. This structural choice guarantees deterministic lookups while automatically mimicking the engine’s internal deduplication algorithms.

Engine Audits & Decompilation

The following documents the engine’s exact load sequence for Visibility graphs mapped from swkotor.exe.

(Decompilation logic for this section was audited and verified via native Ghidra pipeline against swkotor.exe, explicitly pulling from Scene::LoadVisibility at 0x004568d0.)

Pipeline Event	Ghidra Provenance & Engine Behavior
Text Loading	Function: `Scene::LoadVisibility` (`0x004568d0`) The `.vis` file is executed purely as raw text. The engine continuously extracts observer and child string pairs by looping `AurResGetNextLine()` over the file buffer.
Silent Forgiveness	Function: `Scene::LoadVisibility` (`0x004568d0`) If the parser extracts a room reference (either observer or child) that does not exist in the active area layout (which it verifies via a `FindRoom` call), the visibility entry is quietly dropped without crashing or generating logs.
Bidirectional Application	Function: `Scene::SetVisibility` Calling `SetVisibility(room_a, room_b, 1)` inherently maps both visualization paths. The function inserts `room_b` into `room_a`’s visibility list, and immediately mirrors by adding `room_a` to `room_b`’s list while executing native deduplication.
Write Generation	Function: `Scene::SaveVisibility` When generating a `.vis` file natively, the engine relies on an `_sscanf` block structure mapping to `"%s%d"` and uniformly pads a dual-space indent onto all child elements beneath observer headers.
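The load-time behavior (silent drops, mirrored edges, deduplication) maps naturally onto the `BTreeMap<String, BTreeSet<String>>` shape mentioned above. The sketch below models the engine behavior with hypothetical names; it is not the `Vis` type's actual interface.

```rust
use std::collections::{BTreeMap, BTreeSet};

type VisGraph = BTreeMap<String, BTreeSet<String>>;

/// Models SetVisibility(room_a, room_b, 1) preceded by the FindRoom
/// check. Returns false when the pair was silently dropped.
fn set_visibility(
    graph: &mut VisGraph,
    known_rooms: &BTreeSet<String>,
    room_a: &str,
    room_b: &str,
) -> bool {
    // Unknown rooms are dropped without error or logging.
    if !known_rooms.contains(room_a) || !known_rooms.contains(room_b) {
        return false;
    }
    // Insert both directions; the sets deduplicate automatically.
    graph
        .entry(room_a.to_owned())
        .or_default()
        .insert(room_b.to_owned());
    graph
        .entry(room_b.to_owned())
        .or_default()
        .insert(room_a.to_owned());
    true
}
```

Because `BTreeSet` ignores repeated inserts, declaring the same pair from both observer blocks leaves exactly one entry in each direction.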

LYT (Layout File)

LYT files are ASCII configuration arrays that define the spatial 3D placement and orientation of independent room models to construct a complete area map.

At a Glance

Property	Value
Extension(s)	`.lyt`
Magic Signature	None
Type	Plain Text Layout
Rust Reference	View `rakata_formats::Lyt` in Rustdocs

Data Model Structure

The rakata-formats crate parses LYT files into the strongly-typed Lyt container. The parser segregates the raw nested lines into distinct rooms, tracks, obstacles, and doorhooks collections, natively mapping coordinate strings into engine-standard Vec3 and Quaternion structs for immediate mathematical interoperability.

Engine Audits & Decompilation

The following documents the engine’s exact load sequence for Layout configurations mapped from swkotor.exe.

(Decompilation logic for this section was audited and verified via native Ghidra pipeline against swkotor.exe, explicitly pulling from CLYT::LoadLayout at 0x005de900.)

Pipeline Event	Ghidra Provenance & Engine Behavior
Newline Bounds	The parser heavily expects explicit `\r\n` (CRLF) endings. Scanning extracts target strings utilizing `_sscanf("%[^\r\n]", ...)` patterns and frequently relies on blind `+2` byte pointer leaps to manually clear the terminators.
Preamble Skipping	All file lines existing prior to the `beginlayout` execution marker (such as the ubiquitous `#MAXLAYOUT ASCII` header) are deliberately skipped and ignored.
Sequential Parsing	The structure mandates a rigid sequential ingestion. Data collections must appear in exactly this order: `roomcount` → `trackcount` → `obstaclecount` → `doorhookcount` → `donelayout`.

Warning

Boundary Oversight: While the engine systematically verifies the boundaries separating the primary collections, the underlying parse loop neglects to verify the final donelayout signature after the doorhooks segment.
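The ordering rules can be illustrated with a small scanner. This is a sketch under assumptions: `section_counts` is a hypothetical name, each section header is assumed to take the form `<keyword> <count>` followed by that many entry lines, and `str::lines` is used for brevity, which makes it more lenient about line endings than the engine's CRLF-oriented parser.

```rust
const SECTIONS: [&str; 4] = ["roomcount", "trackcount", "obstaclecount", "doorhookcount"];

/// Returns the declared entry count of each section, in file order.
fn section_counts(text: &str) -> Result<Vec<usize>, String> {
    // Preamble: skip everything before the beginlayout marker.
    let mut lines = text
        .lines()
        .map(str::trim)
        .skip_while(|l| *l != "beginlayout");
    lines.next().ok_or("missing beginlayout")?;

    let mut counts = Vec::new();
    for name in SECTIONS {
        let header = lines.next().ok_or(format!("missing {name}"))?;
        let mut words = header.split_whitespace();
        if words.next() != Some(name) {
            return Err(format!("expected {name}, found {header:?}"));
        }
        let count: usize = words
            .next()
            .and_then(|w| w.parse().ok())
            .ok_or(format!("bad count for {name}"))?;
        // Skip the entry lines belonging to this section.
        for _ in 0..count {
            lines.next().ok_or(format!("truncated {name} section"))?;
        }
        counts.push(count);
    }
    // Like the engine, this sketch does not check the final donelayout.
    Ok(counts)
}
```

A file whose sections appear out of order fails at the first unexpected keyword, while a file that simply omits the trailing `donelayout` is still accepted.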

LTR (Letter Frequency)

LTR files contain matrices defining the probabilistic sequence groupings of letters used by the engine’s random name generator.

At a Glance

Property	Value
Extension(s)	`.ltr`
Magic Signature	`LTR` / `V1.0`
Type	Naming State Matrix
Rust Reference	View `rakata_formats::Ltr` in Rustdocs

Data Model Structure

The rakata-formats crate maps character frequency architectures directly into the strongly-typed Ltr container, safely abstracting away the fallible raw string-parsing logic for downstream implementations.

Engine Audits & Decompilation

The following documents the engine’s exact load sequence for Letter Frequency structures mapped from swkotor.exe.

(Decompilation logic for this section was audited and verified via native Ghidra pipeline against swkotor.exe, explicitly pulling from CResLTR::OnResourceServiced at 0x00712410.)

Pipeline Event	Ghidra Provenance & Engine Behavior
Magic Validation	The native parser enforces a mandatory `"LTR "` signature and strictly validates the `"V1.0"` format tag. These parameters collectively structure a rigid 9-byte header block. The sequence natively defines the `letter_count` variable as a single byte resting exactly at offset `+0x08`.
Contiguous Ingestion	Memory buffer extraction initiates immediately at offset `+0x09`. The parser algorithm sequentially extracts natively chained string arrays grouping `start`, `middle`, and `end` blocks to map against procedural probability matrices.
Payload Bounds Check	After the final read, the engine verifies that the ending parse offset exactly equals the total byte length of the buffer.
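As a rough illustration of the 9-byte header described above, the sketch below validates the signature and version and returns the letter count with the remaining payload. `split_ltr_header` is a hypothetical name, not part of `rakata_formats`, and the layout of the payload itself is left untouched.

```rust
/// 4-byte signature + 4-byte version + 1-byte letter count.
const LTR_HEADER_LEN: usize = 9;

/// Validates the header and returns `(letter_count, payload)`.
/// The payload starts at offset +0x09, where the engine begins reading.
fn split_ltr_header(data: &[u8]) -> Option<(u8, &[u8])> {
    if data.len() < LTR_HEADER_LEN {
        return None;
    }
    if &data[0..4] != b"LTR " || &data[4..8] != b"V1.0" {
        return None;
    }
    // letter_count is the single byte at offset +0x08.
    Some((data[8], &data[LTR_HEADER_LEN..]))
}
```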

Audio Formats

KOTOR handles audio via specialized implementations of the Miles Sound System, utilizing specific prefix wrappers for streaming dialogue, sound effects, and lip-syncing animations.

Implementation Blueprints

Specification	Core Focus
WAV (Waveform Audio)	Modified audio streams typically utilizing a proprietary Miles Sound System prefix wrapper.
LIP (Lip Synching)	Timed phonetic animation sequence data mapped explicitly to character speech tracks.
SSF (Sound Set File)	Mapping configuration assigning specific audio events to standard creature interaction triggers (e.g., attacking or dying).

WAV (Waveform Audio)

While standard RIFF WAV files are supported, KOTOR utilizes a multi-tiered routing structure to evaluate audio buffers dynamically based on whether the file encapsulates voice-overs (VO), ambient sound effects (SFX), or unmodified bytes.

At a Glance

Property	Value
Extension(s)	`.wav`
Magic Signature	`RIFF`
Type	Streamed / Buffered Audio
Rust Reference	View `rakata_formats::Wav` in Rustdocs

Engine Audits & Decompilation

The following documents the engine’s exact load sequence for audio formats mapped from swkotor.exe.

(Decompilation logic for this section was audited and verified via native Ghidra pipeline against swkotor.exe, explicitly pulling from CExoSoundInternal::LoadSoundProvider at 0x005d9140.)

Pipeline Event	Ghidra Provenance & Engine Behavior
Standard Audio (`WAV`)	If the payload begins with the exact `"RIFF"` 4-byte signature and evaluates dynamically as a non-MP3 track, the parser initiates at offset `0` and transmits the contiguous buffer to the Miles Sound System without execution modification.
Ambient Audio (`SFX`)	When evaluated as an SFX structure, the `"RIFF"` signature is deliberately absent from offset `0`. The engine interprets a custom proprietary configuration prefix that displaces the standard `"RIFF"` block exactly 470 bytes into the payload buffer (`+0x01d6`). The execution structure calculates `size = file_size - 0x1d6` and strictly extracts the resulting sub-segment.
Voice Audio (`VO`)	For streaming voice-over tracks, the `.wav` file also begins with a `"RIFF"` tag, but the declared RIFF chunk ends before the file does, so the check `riff_size + 8 < file_size` succeeds. The engine seeks to byte offset `riff_size + 8` and pipes the remaining data as a raw `.mp3` stream.
Delegation Hand-off	The main executable natively acts as a dispatch router, executing almost zero internal chunk structural parsing routines. Total specialization for deep RIFF chunk deserialization is deferred unconditionally to the external Miles Sound System layer.
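The three routes in the table can be sketched as a small classifier. This is an illustration only: `AudioRoute` and `route_wav` are hypothetical names, `riff_size` is assumed to be the little-endian `u32` at offset `+0x04` (the standard RIFF chunk-size field), and the engine's "non-MP3" evaluation is approximated here by the size comparison alone.

```rust
/// Length of the proprietary SFX prefix (470 bytes).
const SFX_PREFIX_LEN: usize = 0x1d6;

#[derive(Debug, PartialEq)]
enum AudioRoute<'a> {
    /// Plain RIFF: the whole buffer is handed over unchanged.
    Standard(&'a [u8]),
    /// SFX: the RIFF data that follows the 470-byte prefix.
    Sfx(&'a [u8]),
    /// VO: the MP3 stream that follows the RIFF wrapper.
    VoMp3(&'a [u8]),
}

fn read_u32_le(data: &[u8], off: usize) -> Option<u32> {
    let b = data.get(off..off.checked_add(4)?)?;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

fn route_wav(file: &[u8]) -> Option<AudioRoute<'_>> {
    if file.starts_with(b"RIFF") {
        let riff_size = read_u32_le(file, 4)? as usize;
        let wrapper_end = riff_size.checked_add(8)?;
        if wrapper_end < file.len() {
            // riff_size + 8 < file_size: the rest is treated as MP3.
            return Some(AudioRoute::VoMp3(&file[wrapper_end..]));
        }
        return Some(AudioRoute::Standard(file));
    }
    // No RIFF at offset 0: look for it behind the SFX prefix.
    let inner = file.get(SFX_PREFIX_LEN..)?;
    if inner.starts_with(b"RIFF") {
        Some(AudioRoute::Sfx(inner))
    } else {
        None
    }
}
```

The `Sfx` slice has length `file_size - 0x1d6`, matching the size calculation in the table.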

LIP (Lip Synching)

LIP files provide keyframed facial morph data directly bound to audio streams, instructing character models how to physically animate their mouths to match speech.

At a Glance

Property	Value
Extension(s)	`.lip`
Magic Signature	`LIP V1.0`
Type	Facial Animation Keyframes
Rust Reference	View `rakata_formats::Lip` in Rustdocs

Data Model Structure

The rakata-formats crate maps LIP binaries into the Lip structure. It extracts the raw 5-byte sequential keyframe array and cleanly projects it into a format that pairs each chronological float timestamp directly with its localized mouth shape.

Structural Layout

Offset	Type	Description
`0x00`	CHAR[8]	Signature (`LIP V1.0`)
`0x08`	FLOAT	Animation Length
`0x0C`	DWORD	Entry Count
`0x10`	Struct[]	Keyframe Array (5 bytes per entry)

Engine Audits & Decompilation

The following documents the engine’s exact load sequence for Lip Synching animations mapped from swkotor.exe.

(Decompilation logic for this section was audited and verified via native Ghidra pipeline against swkotor.exe, explicitly pulling from CLIP::LoadLip at 0x0070c590.)

Pipeline Event	Ghidra Provenance & Engine Behavior
Zero-Copy Loading	The engine handles LIP files as completely flat structures. Instead of parsing the variables out individually, it simply verifies the `"LIP V1.0"` signature and pulls the animation length and entry count directly from offsets `+0x08` and `+0x0C`.
Direct Array Assignment	The keyframes are packed into identical 5-byte chunks (a 4-byte float for the timestamp, and a 1-byte integer determining the mouth shape). Because of this flat layout, the engine never loops through the data to read it. It simply points its internal `animations` memory pointer perfectly to file offset `+0x10` and natively runs the animation straight off the raw file buffer.
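The structural layout above is simple enough to decode in a few lines. The sketch below is not the `rakata_formats::Lip` implementation; `LipFile`, `LipKeyframe`, and `parse_lip` are illustrative names, and little-endian byte order is assumed. Unlike the engine, which aliases the raw buffer, this version copies each keyframe into an owned struct.

```rust
#[derive(Debug, PartialEq)]
struct LipKeyframe {
    time: f32,
    shape: u8,
}

#[derive(Debug, PartialEq)]
struct LipFile {
    length: f32,
    keyframes: Vec<LipKeyframe>,
}

fn read4(data: &[u8], off: usize) -> Option<[u8; 4]> {
    let b = data.get(off..off.checked_add(4)?)?;
    Some([b[0], b[1], b[2], b[3]])
}

fn parse_lip(data: &[u8]) -> Option<LipFile> {
    if !data.starts_with(b"LIP V1.0") {
        return None;
    }
    let length = f32::from_le_bytes(read4(data, 0x08)?);
    let count = u32::from_le_bytes(read4(data, 0x0C)?) as usize;
    // The keyframe array starts at +0x10 and holds `count` 5-byte entries.
    let end = 0x10usize.checked_add(count.checked_mul(5)?)?;
    let body = data.get(0x10..end)?;
    let keyframes = body
        .chunks_exact(5)
        .map(|c| LipKeyframe {
            time: f32::from_le_bytes([c[0], c[1], c[2], c[3]]),
            shape: c[4],
        })
        .collect();
    Some(LipFile { length, keyframes })
}
```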

SSF (Sound Set File)

Sound sets map specific generic triggers (e.g. “Battle Cry”, “Agony”, “Selected”) to physical sound references by mapping enum hooks to strings.

At a Glance

Property	Value
Extension(s)	`.ssf`
Magic Signature	None
Type	Enum-String Mapping
Rust Reference	View `rakata_formats::Ssf` in Rustdocs

Data Model Structure

The rakata-formats crate maps SSF files into the Ssf structure. It parses the raw table offset and builds a collection of 28 nullable sound reference integers mapped directly back to their standard gameplay triggers.

Engine Audits & Decompilation

The following documents the engine’s exact load sequence for Sound Set mappings mapped from swkotor.exe.

(Decompilation logic for this section was audited and verified via native Ghidra pipeline against swkotor.exe, explicitly pulling from CSoundSet::GetStrres at 0x00678820.)

Pipeline Event	Ghidra Provenance & Engine Behavior
Finding the Table	The parser reads a single 4-byte integer (`DWORD`) at offset `+0x08`. This number acts as a direct distance pointer, telling the game explicitly where the audio mapping table begins inside the file payload.
Reading the Slots	Starting directly at that pointer, the engine grabs exactly 28 continuous integers. Each position in this span represents a hardcoded character action (e.g. slot 1 is always ‘Battle Cry’, slot 2 is always ‘Agony’).
Handling Blanks	Not every character has recorded audio for every trigger. An empty slot holds the sentinel value `0xFFFFFFFF` (`-1`), which tells the engine to skip playback.

Note

1-Indexed Triggers When modders fire off audio events using gameplay scripts, the event identifiers are natively 1-indexed (1 to 28). To find the matching audio string underneath, the engine simply subtracts 1 behind the scenes to correctly navigate the literal 0-indexed array in memory.
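Putting the table pointer, the 28 slots, the blank sentinel, and the 1-indexed lookup together gives roughly the following. The function names are hypothetical and little-endian integers are assumed; the first 8 bytes of the file are not inspected by this sketch.

```rust
const SLOT_COUNT: usize = 28;
const EMPTY_SLOT: u32 = 0xFFFF_FFFF;

fn read_u32_le(data: &[u8], off: usize) -> Option<u32> {
    let b = data.get(off..off.checked_add(4)?)?;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

/// Reads the table offset at +0x08, then 28 consecutive u32 slots.
/// A slot holding 0xFFFFFFFF becomes `None`.
fn read_sound_slots(data: &[u8]) -> Option<[Option<u32>; SLOT_COUNT]> {
    let table = read_u32_le(data, 0x08)? as usize;
    let mut slots = [None; SLOT_COUNT];
    for (i, slot) in slots.iter_mut().enumerate() {
        let raw = read_u32_le(data, table.checked_add(i * 4)?)?;
        *slot = if raw == EMPTY_SLOT { None } else { Some(raw) };
    }
    Some(slots)
}

/// Looks up a 1-indexed event id (1..=28) in the 0-indexed slot array.
fn sound_for_event(slots: &[Option<u32>; SLOT_COUNT], event_id: usize) -> Option<u32> {
    let index = event_id.checked_sub(1)?;
    slots.get(index).copied().flatten()
}
```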

Resource System & Resolution

The Odyssey Engine’s resource resolution dictates exactly how the game searches for files when it needs to render a texture, load a module, or mount a script – including the exact precedence logic when multiple mods attempt to overwrite the same asset.

ResRef Validation

A ResRef is the engine’s primary resource identifier: a fixed 16-byte buffer used everywhere a resource needs to be named (KEY/BIF entries, RIM keys, GFF resref fields, save-game resource handles, network messages, etc.).

Engine Audits & Decompilation

The following documents how swkotor.exe constructs and stores CResRef instances.

(Decompilation logic for this section was audited and verified via native Ghidra pipeline against swkotor.exe, explicitly pulling from the constructor family CResRef::CResRef at 0x00405ed0, 0x00405ef0, 0x00406d60, 0x00406d80, 0x00406da0, and the network read path CSWMessage::ReadCResRef at 0x004d6180.)

Engine Rule	Runtime Behavior
Verbatim Byte Copy	Every `CResRef` constructor performs a straight memcpy of up to 16 bytes from the source into the internal buffer. There is no character classification, no rejection of “invalid” bytes, and no separate `ValidResRef` helper anywhere in the binary.
No Character Whitelist	The engine recognizes no `[A-Za-z0-9_-]` style restriction. The binary contains no “invalid resref” or “bad resref” error strings. Real vanilla content depends on this freedom: `chitin.key` includes upgrade-modifier resrefs containing `+`, RIM key tables include `!`, and similar punctuation appears in script and texture names across the vanilla corpus.
16-Byte Cap	Inputs longer than 16 bytes are silently truncated by the constructor (`if (0xf < param_2) { param_2 = 0x10; }`). The remaining buffer is zero-padded.
No UTF-8 Awareness	The engine treats the buffer as raw bytes. There is no encoding-aware comparison. Resources stored under a non-ASCII byte sequence in the source data would be looked up by exact byte equality (after the engine’s case-insensitive comparison logic kicks in elsewhere).
Default Construction	The empty constructor (`0x00405ed0`) zeros all 16 bytes. A null-pointer source falls through to the same all-zero state.

How Rakata Models This

rakata_core::ResRef accepts any single ASCII byte (0..=0x7F) up to 16 bytes and lowercases ASCII letters for canonicalization (matching the engine’s case-insensitive lookup). Multi-byte UTF-8 input is rejected because Rust’s &str API forces inputs to be UTF-8 valid, and a multi-byte sequence in a &str cannot represent the same byte the engine would store under Windows-1252 conventions; round-tripping such input would silently corrupt the lookup key. Vanilla content is exclusively single-byte and unaffected.

The 16-byte cap and case-insensitive canonicalization match the engine. The ASCII-only restriction is a Rust-side safety net rather than an engine constraint.
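The difference between the two models can be shown side by side. Both functions below are illustrative stand-ins, not the actual `rakata_core::ResRef` API: `engine_resref` mimics the verbatim copy with silent truncation, while `canonical_resref` applies the ASCII-only, lowercasing rule. Rejecting over-long input (instead of truncating it) is an assumption made for this sketch.

```rust
/// Engine-style construction: copy up to 16 bytes verbatim and
/// zero-pad the rest. No byte is ever rejected.
fn engine_resref(src: &[u8]) -> [u8; 16] {
    let mut buf = [0u8; 16];
    let len = src.len().min(16);
    buf[..len].copy_from_slice(&src[..len]);
    buf
}

/// Rakata-style canonical form: ASCII only, at most 16 bytes,
/// ASCII letters lowercased. Punctuation such as `+` or `!` is kept.
fn canonical_resref(name: &str) -> Option<[u8; 16]> {
    if name.len() > 16 || !name.is_ascii() {
        return None;
    }
    Some(engine_resref(name.to_ascii_lowercase().as_bytes()))
}
```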

TXI Sidecar Lookup

Texture Extensions (TXIs) are independent ASCII text configurations used to override material instructions for specific graphics.

Engine Audits & Decompilation

The following documents the engine’s exact load sequence for TXI sidecar files mapped from swkotor.exe.

(Decompilation logic for this section was audited and verified via native Ghidra pipeline against swkotor.exe, explicitly pulling from AurResGet at 0x0044c740.)

Pipeline Event	Ghidra Provenance & Engine Behavior
Global Callback	When the game needs a TXI file, it always routes through a global helper calling `AurResGet(name, ".txi", ..., true)`. Three different rendering systems use this exact same path to hunt for TXIs: `CAurTextureBasic::Init`, `Gob::EnableRenderBumpedOut`, and `Material::Init`.
Total Independence	Because `AurResGet` only checks the raw filename and the `.txi` extension, it performs a totally fresh, global search through the game’s file systems. It does not know or care where the parent texture actually came from (like a specific BIF archive).

Note

Because it is entirely independent from the parent texture handle, swkotor.exe supports pulling a TXI from the /override folder even if the parent texture was sourced natively from a KEY/BIF package. Rakata maintains this independent sidecar lookup model natively via the rakata_extract::resolver::TextureWithTxiResult logic to guarantee resolver parity.

Key/BIF Resolution Mapping

The engine has a strict hierarchical override order when hunting for identical overlapping resource identifiers across multiple virtual disk mounts.

Engine Audits & Decompilation

The following documents the engine’s exact resource directory search order mapped from swkotor.exe.

(Decompilation logic for this section was audited and verified via native Ghidra pipeline against swkotor.exe, explicitly pulling from CExoKeyTable::FindKey at 0x0040ec50.)

Pipeline Event	Ghidra Provenance & Engine Behavior
First-Match Exit	When hunting for a file, the key table loops through standard folders in a hardcoded order. The second it finds a matching file name, `FindKey` returns success and completely ignores any duplicates hiding deeper in other archives.
Duplicate Checking	During startup, the engine’s `AddKey` function actually scans for duplicates. If it finds one, it ignores it, permanently locking in the file that had the higher resolution priority.

Tip

Resolution Priority:
resource_directory (Override folder) → ERF (Pass 1) → RIM → ERF (Pass 2) → Fixed / Archive
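A first-match lookup over that priority chain might look like the sketch below. `Source`, `PRIORITY`, and `find_key` are hypothetical names, and the real key table is certainly not a list of string slices; the point is only that tiers are visited in a fixed order and the first hit wins.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Source {
    Override,
    ErfPass1,
    Rim,
    ErfPass2,
    Fixed,
}

/// Tiers in the order they are consulted, highest priority first.
const PRIORITY: [Source; 5] = [
    Source::Override,
    Source::ErfPass1,
    Source::Rim,
    Source::ErfPass2,
    Source::Fixed,
];

/// Returns the highest-priority tier that contains `name`.
/// Lower-priority duplicates are never reached.
fn find_key(mounted: &[(Source, &[&str])], name: &str) -> Option<Source> {
    PRIORITY.iter().copied().find(|tier| {
        mounted.iter().any(|(src, names)| {
            src == tier && names.iter().any(|n| n.eq_ignore_ascii_case(name))
        })
    })
}
```

The order in which sources are mounted does not matter here; only the tier order in `PRIORITY` decides the winner.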

Module Loading Priorities

Modules orchestrate KOTOR’s area hubs. They are layered collections of ERF/RIM files functioning as a localized state.

Composition Loading Precedence

Because KOTOR modules are often split across multiple discrete archive files (e.g., separating rigid layouts from variable area dialog), the engine applies the following precedence when constructing a single “virtual” module. The list runs from highest priority to lowest.

1. <root>_dlg.erf (K2 Dialog overrides)
2. <root>_s.rim (Supplemental properties)
3. <root>_a.rim (Base Area) if present, ELSE <root>_adx.rim (Extended Area) if present, ELSE <root>.rim (Main/Vanilla)
4. <root>.mod (Single-file Mod archive)

Tip

Rust Integration The rakata-extract crate natively replicates this exact priority order through the CompositeModule struct. When you pass a directory path to CompositeModule::load_from_directory, it automatically scans the folder and merges the _dlg, _s, _a / _adx, and base .mod files together using the engine’s strict precedence hierarchy.
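The numbered precedence list can be expressed as a small selection routine. This is only a sketch of the listed order, not the `CompositeModule` implementation: `composite_archives` is a hypothetical function that works on a plain list of file names and performs no I/O.

```rust
/// Returns the archives for `root` that exist in `present`,
/// highest priority first, following the precedence list.
fn composite_archives(root: &str, present: &[&str]) -> Vec<String> {
    let has = |name: &str| present.iter().any(|p| p.eq_ignore_ascii_case(name));
    let mut order = Vec::new();

    // 1. dialog overrides, 2. supplemental properties.
    for suffix in &["_dlg.erf", "_s.rim"] {
        let name = format!("{}{}", root, suffix);
        if has(name.as_str()) {
            order.push(name);
        }
    }

    // 3. exactly one area archive: _a.rim, else _adx.rim, else .rim.
    let area = ["_a.rim", "_adx.rim", ".rim"]
        .iter()
        .map(|suffix| format!("{}{}", root, suffix))
        .find(|name| has(name.as_str()));
    order.extend(area);

    // 4. single-file .mod archive.
    let module = format!("{}.mod", root);
    if has(module.as_str()) {
        order.push(module);
    }
    order
}
```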

Engine Audits & Decompilation

The following documents the engine’s exact load sequence for module assemblies mapped from swkotor.exe.

(Decompilation logic for this section was audited and verified via native Ghidra pipeline against swkotor.exe, explicitly pulling from CExoResMan::AsyncLoad at 0x004094a0.)

Pipeline Event	Ghidra Provenance & Engine Behavior
Primary MOD Search	The game natively attempts to load the highest-level package by explicitly targeting the `MODULES:<root>.mod` path first.
RIM Fallback Chains	If the `.mod` file doesn’t exist, the system catches the failure and immediately shifts to look for the `<root>_s.rim` fallback.
Area Extension Probes	Throughout the module loading process, the engine actively probes for the `<root>_a.rim` and `<root>_adx.rim` extension files and merges in their area geometry when they are present.

Save Game

While KotOR Save Games (.sav) are structurally just ERF containers under the hood, the engine employs complex party-synchronization and module loading logic to physically reconstruct the player’s session.

Data Model Structure

A standard Save ERF container packages a specific set of internal GUI and logic files that the game actively requires to reconstruct a valid player state.

`savenfo.res`

The overarching save metadata block, primarily responsible for the main menu UI.

save_name: Display string for the save file.
pc_name: (Optional in K1) Player character name.
area_name: Localized display name for the area.
last_module: The resref of the specific module being loaded.
time_played: Running game time.
cheat_used: Global boolean flag to mark corrupted/cheat sessions.

`globalvars.res`

The universal state trackers running the campaign plot.

Segmented explicitly into numbers and booleans.
Each global uses a strict Symbol Name.

`partytable.res`

The live snapshot of the physical adventuring group and global resources:

Shared credits and shared party_xp.
cheat_used: Independent table-specific cheat flag.
members: Fixed list of party members tracking who is currently active and which member is flagged is_leader.
journal_entries: Currently active quest plot_ids and their numeric stages.

Additional Constituents

Character List: Populated using a mix of sources (leader, pc, and availnpc* resources).
Inventory: Represents all items the player carries (inventory.res), tracking stack sizes, charges, and upgrade bitfields.
Doors: Transient state (locked/open attributes) extracted directly from the module’s GIT.

Tip

Rust Integration The rakata-save crate handles this structure natively via the SaveEditorModel struct. You can use it to directly parse, validate, and write back these internal save components without manually managing the ERF layer.

Engine Audits & Decompilation

The following documents the engine’s exact state restoration logic mapped from swkotor.exe.

(Decompilation logic for this section was audited and verified via native Ghidra pipeline against swkotor.exe, explicitly pulling from CSWPartyTable::SaveTableInfo at 0x005648c0.)

Pipeline Event	Ghidra Provenance & Engine Behavior
Cheat Synchronization	The `CHEATUSED` flag in the `savenfo` header file isn’t tracked in isolation. When saving the game, the engine simply copies the raw `PT_CHEAT_USED` numeric flag directly from the party table to keep the UI in sync.
Character Loading Hierarchy	When the engine pushes a character into the game during a standard load (via `LoadCharacterFromIFO`), it defaults to pulling character data strictly from the active module’s `IFO` roster matrix. It only falls back to reading the save’s dedicated `.pifo` (Party Info) file if the target parameter index explicitly equals `0xffffffff`.
Area Restoration	To rebuild the dynamic state of the room you were standing in (like which doors are open or locked), the engine restores the area’s Game Instance data by targeting the `GIT` resource type (`0x7E7`) and matching it against the module’s core resref string.

Warning

There is zero K1 runtime string evidence for K2 (The Sith Lords) crafting or influence fields (PT_ITEM_COMPONEN, PT_INFLUENCE, UpgradeSlot*). If constructing K1-native party utilities, those fields must be aggressively excluded!

Engine Internals

This section contains notes and breakdowns of the Odyssey engine’s execution pipelines, case studies on community tooling bugs, and other engine-level logic or behaviors that are discovered during clean-room reverse engineering. These notes partially serve as the foundational research powering rakata-lint.

Research Notes

Topic	Description
MDL & MDX Deep Dive	Deep dive into the Ghidra decompilation notes detailing the exact byte-level layout of the binary MDL/MDX format and the engine loading pipeline.
GFF List Corruption	Case study analyzing out-of-bounds GFF list behavior in the Odyssey engine vs. loose community tooling abstractions.

GFF List Index Corruption

Summary

A binary GFF writer can silently corrupt list mapping if it writes list index entries in a way that allows recursive nested-list writes to interleave with the parent list’s index block.

This is a compatibility-critical issue for KOTOR data because many resources depend on stable list ordering and correct struct index mapping.

How GFF Lists Work

In binary GFF, a List field stores:

A relative offset into the list_indices table.
At that offset:
1. count (u32)
2. count struct indices (u32 each), each pointing into the struct table.

If these indices are wrong, the parser will load the wrong list structs.
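Reading one list back out of the table is a two-step lookup, sketched below. The helper names are hypothetical, and the sketch assumes the field's offset is a byte offset into the `list_indices` table with little-endian `u32` values.

```rust
fn read_u32_le(data: &[u8], off: usize) -> Option<u32> {
    let b = data.get(off..off.checked_add(4)?)?;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

/// Reads `count`, then `count` struct indices, starting at `offset`.
/// Returns `None` if the block runs past the end of the table.
fn read_list_indices(list_indices: &[u8], offset: usize) -> Option<Vec<u32>> {
    let count = read_u32_le(list_indices, offset)? as usize;
    (0..count)
        .map(|i| {
            let at = offset.checked_add(4)?.checked_add(i.checked_mul(4)?)?;
            read_u32_le(list_indices, at)
        })
        .collect()
}
```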

Failure Mode

The bug class occurs when a writer:

Starts writing a parent list.
Recursively builds child structs.
Appends list indices directly while recursion is still producing nested list index data.

Because nested lists also write into the same list_indices buffer, parent and child index blocks can interleave and the parent list can point at unintended structs.

Observable Symptoms

Struct IDs in list entries change after roundtrip.
Expected fields are missing from entries after roundtrip.
Mod compatibility breaks for list-heavy resources due to reordered/remapped entries.

Correct Writer Strategy

For each list field:

Write list count.
Reserve contiguous slots for all struct indices up front.
Build each child struct recursively.
Backfill each reserved slot with the final struct index.

This guarantees parent list index layout is stable even when nested lists write their own index blocks.
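The reserve-then-backfill strategy can be demonstrated on a toy model. The code below is not `rakata-formats/src/gff/writer.rs`: `Node` and `Writer` are simplified stand-ins where every struct carries exactly one list field, and offsets are counted in `u32` words rather than bytes.

```rust
struct Node {
    id: u32,
    children: Vec<Node>,
}

#[derive(Default)]
struct Writer {
    struct_ids: Vec<u32>,     // struct table: struct index -> struct id
    list_offsets: Vec<usize>, // per struct: word offset of its list block
    list_indices: Vec<u32>,   // shared list_indices table
}

impl Writer {
    /// Writes `node` and its descendants; returns the node's struct index.
    fn write_struct(&mut self, node: &Node) -> u32 {
        let index = self.struct_ids.len() as u32;
        self.struct_ids.push(node.id);

        // Write the count, then reserve every slot up front.
        let block = self.list_indices.len();
        self.list_offsets.push(block);
        self.list_indices.push(node.children.len() as u32);
        let first_slot = self.list_indices.len();
        self.list_indices
            .resize(first_slot + node.children.len(), u32::MAX);

        // Recurse, then backfill each reserved slot. Nested lists append
        // after the reserved range, so they cannot interleave with it.
        for (i, child) in node.children.iter().enumerate() {
            let child_index = self.write_struct(child);
            self.list_indices[first_slot + i] = child_index;
        }
        index
    }

    fn read_list(&self, struct_index: u32) -> &[u32] {
        let block = self.list_offsets[struct_index as usize];
        let count = self.list_indices[block] as usize;
        &self.list_indices[block + 1..block + 1 + count]
    }
}
```

If the `resize` call were dropped and indices were pushed inside the loop instead, the first child's own list block would land between the parent's entries, which is the failure mode described earlier.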

Implementation Status

In this repository:

rakata-formats/src/gff/writer.rs reserves list index slots and backfills them.
Regression tests cover:
- synthetic list order + struct-id stability
- UTC fixture roundtrip stability on lists like FeatList, ItemList, ClassList.
rakata-generics/src/utc.rs includes a no-op rebuild test to ensure typed conversion does not drift list order/IDs.

The MDL/MDX Format

BioWare’s Aurora/Odyssey engine stores 3D models in a pair of files: .mdl and .mdx. This page documents what’s inside them, how the engine consumes them, and – occasionally – why they look the way they do. Evidence throughout is drawn from Ghidra decompilation of swkotor.exe (K1 GOG build), cross-checked against hex dumps of vanilla assets and community references (kotorblender, mdledit, mdlops, pykotor, reone, xoreos).

Overview

At a glance:

Property	Value
Extensions	`.mdl`, `.mdx`
Magic	Binary: first `u32 == 0`. ASCII: text (`filedependancy`, `newmodel`, …)
Type	Hierarchical scene graph + animation + vertex data
Resource type ID	`2002` (MDL), `3008` (MDX) in KEY/BIF
Rust reference	`rakata_formats::Mdl`

A model is a tree of nodes. Each node carries a transform (position + orientation), an animation track (“controllers”), and – depending on its type – geometry, light parameters, particle-emitter configuration, a skinning skeleton, a lightsaber blade, and so on. One MDL file can carry multiple named animations that operate on that tree.

The surprising shape of the format only makes sense once you understand one design choice, so let’s start there.

The core idea: load-and-fixup

The binary MDL is not a parsed format in the usual sense. The engine does not walk a byte stream field by field, calling read_u32, read_string, read_float. Instead, it does this:

Allocate a buffer exactly the size of the model data.
Copy the whole file into that buffer in one memcpy.
Walk the now-in-memory structure and convert relative offsets into absolute pointers.

That’s it. The “parser” is a pointer rewriter. Every Reset* function you’ll see in the engine (InputBinary::Reset, ResetMdlNode, ResetTriMeshParts, …) takes a buffer base pointer and a struct pointer, and its job is essentially struct->field += base for every relocatable pointer in the struct, recursing into children as it goes.

An analogy: think of IKEA instructions that say “screw part A into the hole next to part B” rather than giving exact millimetre coordinates. The instructions are valid anywhere you choose to assemble the furniture. The MDL blob is identical: every pointer is expressed relative to the blob’s origin, so the engine can drop the blob anywhere in memory and then do a one-time pass to convert those relative offsets to real addresses.

This design choice ripples through everything:

On-disk layout matches in-memory layout exactly. If a MdlNodeTriMesh is 412 bytes in RAM, it’s 412 bytes on disk. Struct field offsets you see in a Ghidra decompilation are the file offsets.
Binary files are architecture-bound. This format is a snapshot of a specific compiler’s struct layout on 32-bit Windows. Field alignment, pointer size (4 bytes), endianness (little), and even padding bytes all match that ABI.
“Parsing” is really validation + relocation. A Rust reader doesn’t need to convert a byte stream into a Rust struct; it needs to interpret a memory image as a struct overlay, following pointers to walk the tree.
The engine never writes binary MDL. The shipping engine only has code to emit ASCII MDL. Binary MDL is produced exclusively by BioWare’s model compiler (a build-time tool). The runtime reads it but never round-trips it.

With that frame in place, the rest of the format falls into shape.
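To make the relocation pass concrete, here is a deliberately tiny model of it. It operates on a byte buffer and a caller-supplied list of pointer field positions, which is an assumption of the sketch: the engine knows those positions from its struct definitions, not from a list. `relocate` performs the `field += base` step, and `unrelocate` is the inverse a writer would need.

```rust
/// Adds `delta` (wrapping) to the little-endian u32 at each field position.
fn shift_pointers(blob: &mut [u8], delta: u32, pointer_fields: &[usize]) {
    for &field in pointer_fields {
        let mut raw = [0u8; 4];
        raw.copy_from_slice(&blob[field..field + 4]);
        let moved = u32::from_le_bytes(raw).wrapping_add(delta);
        blob[field..field + 4].copy_from_slice(&moved.to_le_bytes());
    }
}

/// Load-time pass: relative offset -> absolute 32-bit address.
fn relocate(blob: &mut [u8], base: u32, pointer_fields: &[usize]) {
    shift_pointers(blob, base, pointer_fields);
}

/// Inverse pass: absolute address -> relative offset.
fn unrelocate(blob: &mut [u8], base: u32, pointer_fields: &[usize]) {
    shift_pointers(blob, base.wrapping_neg(), pointer_fields);
}
```

A safe Rust reader would normally skip this rewrite entirely and treat each stored value as an index into the blob slice.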

File structure

The 12-byte wrapper

The file begins with a tiny header:

Offset	Type	Field	Notes
+0x00	u32	zero marker	Always `0`. Used to tell binary from ASCII.
+0x04	u32	MDL content size	Bytes of model data that follow.
+0x08	u32	MDX file size	Size of the accompanying `.mdx` file.

Input::Read at 0x004a14b0 is the dispatcher: it peeks at the first byte, and if it’s \0 the file is binary (the first u32 is always zero). Otherwise the file starts with ASCII tokens like filedependancy or newmodel, and processing hands off to a line-based interpreter.

For binary files, InputBinary::Read at 0x004a1260 does the rest:

Record mdl_content_size and mdx_file_size from the wrapper.
Allocate a heap buffer the size of the MDL content; memcpy the model data into it.
If MDX size is non-zero, allocate a second buffer and memcpy the MDX file into it.
Call Reset(mdl_buf, mdx_buf, resource_handle).

Note: the wrapper is not part of the model data. Byte 12 of the on-disk file is byte 0 of the in-memory MDL blob. All internal offsets are relative to the in-memory origin.
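A reader can peel off the wrapper before touching the model data, as in this sketch. `MdlWrapper` and `split_wrapper` are illustrative names rather than the `rakata_formats::Mdl` API, and the binary check here tests the whole first `u32` instead of peeking at a single byte.

```rust
struct MdlWrapper<'a> {
    mdl_size: u32,
    mdx_size: u32,
    /// The model blob; blob[0] corresponds to byte 12 of the file.
    blob: &'a [u8],
}

fn read_u32_le(data: &[u8], off: usize) -> Option<u32> {
    let b = data.get(off..off.checked_add(4)?)?;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

/// Returns `None` for ASCII models and for truncated binary files.
fn split_wrapper(file: &[u8]) -> Option<MdlWrapper<'_>> {
    if read_u32_le(file, 0x00)? != 0 {
        return None; // non-zero marker: not a binary MDL
    }
    let mdl_size = read_u32_le(file, 0x04)?;
    let mdx_size = read_u32_le(file, 0x08)?;
    let end = 12usize.checked_add(mdl_size as usize)?;
    let blob = file.get(12..end)?;
    Some(MdlWrapper { mdl_size, mdx_size, blob })
}
```

Every internal offset read later should be applied to `blob`, never to the original file slice.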

Three kinds of pointer

Inside the MDL blob you’ll encounter three distinct flavours of “pointer”, which is worth keeping straight:

MDL-relative offsets – the vast majority. Relocated to absolute pointers by Reset* functions. On re-serialization, they must be rewritten back to relative offsets.
MDX-file byte offsets – used by a few fields (e.g. per-mesh mdx_data_offset at +0x144) to locate vertex data in the separate MDX file.
String pointers – themselves MDL-relative, but pointing into a string table at the end of the blob, pointed to by the name-offsets array at model +0xB8.

Confusingly, there are two similarly named fields on each mesh node: mdx_data_offset at +0x144 (an MDX file offset) and vert_array_offset at +0x148 (a content-relative pointer to embedded position data). Conflating these produced one of the nastier bugs in our reader’s history (see War stories below).

Model header

Once the blob is in memory, InputBinary::Reset at 0x004a1030 walks the model header. Here’s the relevant field map:

Offset	Field	Notes
+0x00	`ModelDestructor` vptr	Populated at load time.
+0x04	`ModelParseField` vptr	Populated at load time.
+0x28	root node offset	Relocated. `ResetMdlNode` recurses from here.
+0x48	resource handle	Populated at load time.
+0x4C	type byte	`GetType()`
+0x50	classification	0=Other, 1=Effect, 2=Tile, 4=Character, 8=Door.
+0x54	ref count
+0x58	animations array ptr	Relocated; count at +0x5C.
+0x64	supermodel pointer	Populated via `FindModel(buf+0x88)`.
+0x68..+0x80	bbox min/max	`Vector bmin, bmax`.
+0x80	radius	f32, default 7.0.
+0x84	animation scale	f32, default 1.0. ASCII: `setanimationscale`.
+0x88	supermodel name	`char[36]`, null-terminated. Drives recursive model load.
+0xA8	node array (secondary)	Relocated if non-zero.
+0xAC	MDX vertex pool offset	Source offset into MDX data (consumed into a GL pool).
+0xB0	MDX data size	Size of the vertex-pool copy.
+0xB8	name offsets array ptr	Relocated; count at +0xBC. Array entries also relocated.

Two fields deserve special mention:

+0x50 classification is the model’s high-level category (Character, Door, Tile, …). It’s never read during the Reset pass – it’s carried through as part of the memory-mapped blob and consulted at runtime. Cross-validated against hex dumps:

File	+0x50	Category
c_dewback.mdl	0x04	Character ✓
dor_lhr01.mdl	0x08	Door ✓
m01aa_01a.mdl	0x00	Other ✓

+0x88 supermodel name is a 32-byte (plus 4 padding) ASCII name. Loading a model with a supermodel triggers a recursive FindModel call for that name – think of supermodels as CSS-style inheritance, where animation data and bones defined on the parent are available to the child.

The node tree

The root node sits at model +0x28. From there, children are reached through a standard in-memory array layout: ptr + count_used + count_allocated at offsets +0x2C, +0x30, +0x34. This three-u32 pattern is BioWare’s CExoArrayList and shows up everywhere in the format – any time you see “12 bytes of array header”, this is what it is.

Base node layout (80 bytes)

All node types begin with the same 80-byte header:

Offset	Size	Field	Notes
+0x00	u16	`node_type`	Flag bitmask. Drives type dispatch.
+0x02	u16	`node_id`	Sequential `0..N-1`.
+0x04	u16	`node_id_dup`	Identical copy of `node_id`. Never read.
+0x06	u16	padding	Always zero.
+0x08	u32	name pointer	Relocated. Points into the string table.
+0x0C	u32	parent pointer	Relocated if non-zero.
+0x10	12	position	`Vector{x, y, z}` as 3×f32.
+0x1C	16	orientation	`Quaternion{w, x, y, z}` as 4×f32.
+0x2C	12	children array	CExoArrayList of `MdlNode*`.
+0x38	12	controller keys array	CExoArrayList of `NewController` (16B each).
+0x44	12	controller data array	CExoArrayList of float (packed key data).

The two bytes at +0x04 are a redundant duplicate of node_id – always identical to +0x02 across 209 nodes verified across four vanilla files, zero mismatches. No known engine function reads it. Best guess: legacy field or exporter artifact. It’s preserved for round-trip fidelity but has no semantic meaning.

A few conventions worth noting:

Quaternion order is (w, x, y, z). Confirmed via Gob::GetOrientation at 0x004499a0 which copies fields in that order. Identity quaternion is [1.0, 0.0, 0.0, 0.0]. The Rust API uses the same convention.
Position and orientation are read directly from the blob. They’re not relocated – they’re inline values, not pointers.
Only two fields need relocation in the base header: name pointer at +0x08 and parent pointer at +0x0C.

InputBinary::ResetMdlNodeParts at 0x004a0b60 handles the base relocations and then recurses: for each entry in the children array, relocate the child pointer and call ResetMdlNode on it.
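The 80-byte base header, including its three 12-byte array headers, can be decoded as a plain overlay. The struct and function names below are illustrative, not the `rakata_formats` types. Offsets are kept as raw MDL-relative values and are not followed, and the duplicate id at `+0x04` is skipped for brevity.

```rust
/// 12-byte array header: offset (a pointer once relocated),
/// used count, allocated count.
#[derive(Debug, PartialEq)]
struct ArrayHeader {
    offset: u32,
    used: u32,
    allocated: u32,
}

#[derive(Debug, PartialEq)]
struct BaseNode {
    node_type: u16,
    node_id: u16,
    name_offset: u32,
    parent_offset: u32,
    position: [f32; 3],
    orientation: [f32; 4], // (w, x, y, z)
    children: ArrayHeader,
    controller_keys: ArrayHeader,
    controller_data: ArrayHeader,
}

const BASE_NODE_LEN: usize = 80;

fn u16_at(b: &[u8], o: usize) -> u16 {
    u16::from_le_bytes([b[o], b[o + 1]])
}
fn u32_at(b: &[u8], o: usize) -> u32 {
    u32::from_le_bytes([b[o], b[o + 1], b[o + 2], b[o + 3]])
}
fn f32_at(b: &[u8], o: usize) -> f32 {
    f32::from_bits(u32_at(b, o))
}
fn array_at(b: &[u8], o: usize) -> ArrayHeader {
    ArrayHeader { offset: u32_at(b, o), used: u32_at(b, o + 4), allocated: u32_at(b, o + 8) }
}

/// Decodes the base header of the node starting at `at` within the blob.
fn parse_base_node(blob: &[u8], at: usize) -> Option<BaseNode> {
    let b = blob.get(at..at.checked_add(BASE_NODE_LEN)?)?;
    Some(BaseNode {
        node_type: u16_at(b, 0x00),
        node_id: u16_at(b, 0x02),
        name_offset: u32_at(b, 0x08),
        parent_offset: u32_at(b, 0x0C),
        position: [f32_at(b, 0x10), f32_at(b, 0x14), f32_at(b, 0x18)],
        orientation: [f32_at(b, 0x1C), f32_at(b, 0x20), f32_at(b, 0x24), f32_at(b, 0x28)],
        children: array_at(b, 0x2C),
        controller_keys: array_at(b, 0x38),
        controller_data: array_at(b, 0x44),
    })
}
```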

Type dispatch

InputBinary::ResetMdlNode at 0x004a0900 reads the node_type field and dispatches:

`node_type`	Handler	Kind
`0x0001`	`ResetMdlNodeParts` only	Dummy / base
`0x0003`	`ResetLight`	Light
`0x0005`	`ResetMdlNodeParts` only	Emitter
`0x0009`	`ResetMdlNodeParts` only	Camera
`0x0011`	`ResetMdlNodeParts` only	Reference
`0x0021`	`ResetTriMesh` → `ResetTriMeshParts`	TriMesh
`0x0061`	`ResetSkin`	Skin mesh
`0x00A1`	`ResetAnim`	AnimMesh
`0x0121`	`ResetDangly`	Dangly mesh (cloth)
`0x0221`	`ResetAABBTree` + `ResetTriMeshParts`	Walkmesh with AABB
`0x0401`	(no-op)	Trigger / unused
`0x0821`	`ResetLightsaber`	Saber mesh

The type values are stored as a lookup table in the executable at 0x00740a18 (12 × u32).

Though the type codes are shaped like a bitmask – HEADER=0x01, LIGHT=0x02|HEADER, EMITTER=0x04|HEADER, TRIMESH=0x20|HEADER, SKIN=0x40|TRIMESH, SABER=0x800|TRIMESH, and so on – the dispatch is an exact value match, not individual bit checks. The bitmask structure is meaningful (skin is a superset of trimesh, for instance), it’s just not how the engine branches.
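The distinction between exact-value dispatch and bit testing is easy to show in code. `NodeKind`, `node_kind`, and `has_trimesh_bits` are names invented for this sketch; the values come from the dispatch table above.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum NodeKind {
    Dummy,
    Light,
    Emitter,
    Camera,
    Reference,
    TriMesh,
    Skin,
    AnimMesh,
    Dangly,
    Aabb,
    Trigger,
    Saber,
}

const HEADER: u16 = 0x0001;
const TRIMESH: u16 = 0x0020 | HEADER;

/// Exact-value match, mirroring how the engine branches.
fn node_kind(node_type: u16) -> Option<NodeKind> {
    match node_type {
        0x0001 => Some(NodeKind::Dummy),
        0x0003 => Some(NodeKind::Light),
        0x0005 => Some(NodeKind::Emitter),
        0x0009 => Some(NodeKind::Camera),
        0x0011 => Some(NodeKind::Reference),
        0x0021 => Some(NodeKind::TriMesh),
        0x0061 => Some(NodeKind::Skin),
        0x00A1 => Some(NodeKind::AnimMesh),
        0x0121 => Some(NodeKind::Dangly),
        0x0221 => Some(NodeKind::Aabb),
        0x0401 => Some(NodeKind::Trigger),
        0x0821 => Some(NodeKind::Saber),
        _ => None,
    }
}

/// Bit test: true for every type that is a superset of TriMesh.
fn has_trimesh_bits(node_type: u16) -> bool {
    (node_type & TRIMESH) == TRIMESH
}
```

A value such as `0x0023` carries the TriMesh bits but matches no table entry, so an exact-match dispatcher treats it as unknown.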

Size summary

Every node type has a known fixed size, both on disk and in memory:

Flag	Type	Total	Base	Extra	Extends
0x0001	Base	80	80	0	–
0x0003	Light	172	80	92	MdlNode
0x0005	Emitter	304	80	224	MdlNode
0x0009	Camera	80	80	0	MdlNode
0x0011	Reference	116	80	36	MdlNode
0x0021	TriMesh	412	80	332	MdlNode
0x0061	Skin	512	412	100	TriMesh
0x00A1	AnimMesh	468	412	56	TriMesh
0x0121	Dangly	440	412	28	TriMesh
0x0221	AABB	416	412	4	TriMesh
0x0401	Trigger	80	80	0	MdlNode
0x0821	Saber	432	412	20	TriMesh

Verified via ParseNode’s operator_new(size) calls and Ghidra struct definitions. All mesh subtypes extend MdlNodeTriMesh – their extra data begins at node offset +0x19C, immediately after the TriMesh block.
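The size table can be expressed as base plus extra, which makes the inheritance explicit. This is a sketch under the sizes listed above; the constant and function names are hypothetical.

```rust
/// Size of the shared node header, from the table above.
pub const BASE_NODE_SIZE: usize = 80;
/// Size of the TriMesh-specific block that follows the base header.
pub const TRIMESH_EXTRA_SIZE: usize = 332;

/// Total fixed size of a node for a given `node_type`, or `None` for a
/// value that is not in the engine's dispatch table.
pub fn node_size(node_type: u32) -> Option<usize> {
    // 412 bytes (0x19C): where every mesh subtype's extra data begins.
    let trimesh = BASE_NODE_SIZE + TRIMESH_EXTRA_SIZE;
    Some(match node_type {
        0x0001 | 0x0009 | 0x0401 => BASE_NODE_SIZE, // base, camera, trigger
        0x0003 => BASE_NODE_SIZE + 92,              // light
        0x0005 => BASE_NODE_SIZE + 224,             // emitter
        0x0011 => BASE_NODE_SIZE + 36,              // reference
        0x0021 => trimesh,                          // trimesh
        0x0061 => trimesh + 100,                    // skin
        0x00A1 => trimesh + 56,                     // animmesh
        0x0121 => trimesh + 28,                     // dangly
        0x0221 => trimesh + 4,                      // AABB walkmesh
        0x0821 => trimesh + 20,                     // saber
        _ => return None,
    })
}
```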

Node types in depth

The lightweight types

Camera (0x009) has no extra data. Same 80-byte footprint as the base node. ResetMdlNode dispatches to ResetMdlNodeParts only. There are no camera-specific ASCII fields either – the ASCII parser also falls through to the base handler.

Reference (0x011) carries just two fields in 36 extra bytes: a 32-byte ref_model name and a 4-byte reattachable flag. Both inline (no pointers to relocate).

Trigger (0x401) – the decompiled ResetMdlNode returns immediately for this type, without calling any reset function. In practice it appears to be unused in shipping content.

Light (0x003)

Lights carry 92 bytes of extra data. Most of the scalar fields are straightforward (priority, shadow flag, ambient-only flag, flare radius, etc.), but lights are the most complex non-mesh type because of their array fields:

Extra offset	Field	Layout	Runtime relocation
+0x04	texture SafePointers	12-byte array header	Zeroed on disk
+0x10	`flaresizes`	CExoArrayList	ptr relocated
+0x1C	`flarepositions`	CExoArrayList	ptr relocated
+0x28	`flarecolorshifts`	CExoArrayList	ptr relocated
+0x34	`texturenames`	CExoArrayList<char*> (each ptr too!)	all ptrs relocated

Lights also drive their colour, radius, shadow radius, vertical displacement, and multiplier via controllers (types 0x4C, 0x58, 0x60, 0x64, 0x8C) – these live in the base node’s controller arrays, not in the light-specific block.

Emitter (0x005)

Emitters are 304 bytes and – pleasantly – contain no relocatable pointers. Everything is inline: a fistful of floats and ints, four 32-byte name fields (update, render, blend, texture), and a 16-byte chunk_name. The full field map is in the appendix.

The most important field is update at extra offset +0x20. It’s the emitter type string, a case-sensitive selector against:

"Fountain" → steady particle stream (most common)
"Explosion" → one-shot burst
"Single" → single particle
"Lightning" → lightning-bolt effect

MdlNodeEmitter::InternalCreateInstance at 0x0049d5c0 branches on this string to instantiate the appropriate runtime emitter class.

Known engine-level footgun: controller 502 (detonate) is only valid on "Explosion" emitters. InternalCreateInstance only allocates the detonation memory for that branch, so a detonate controller on a "Fountain" emitter reads unallocated memory at runtime and crashes. This is a known flaw in mdlops-based exporters (KotorMax); rakata-lint will validate this.
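A check for this condition could look roughly like the following. This is a sketch of the idea only, not the rakata-lint implementation: `fixed_name` and `has_unsafe_detonate` are hypothetical helpers, and the controller list is assumed to be already decoded into a slice of type codes.

```rust
/// Controller type code for `detonate`, as given in the text above.
pub const CONTROLLER_DETONATE: u32 = 502;

/// Reads a fixed-size, null-terminated name field such as `update`.
/// Bytes after the first NUL are ignored; invalid UTF-8 yields `None`.
pub fn fixed_name(field: &[u8]) -> Option<&str> {
    let end = field.iter().position(|&b| b == 0).unwrap_or(field.len());
    std::str::from_utf8(&field[..end]).ok()
}

/// Returns true when the emitter carries a detonate controller but its
/// `update` string is not exactly "Explosion" (the match is case-sensitive).
pub fn has_unsafe_detonate(update: &[u8; 32], controller_types: &[u32]) -> bool {
    let is_explosion = fixed_name(update) == Some("Explosion");
    controller_types.contains(&CONTROLLER_DETONATE) && !is_explosion
}
```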

TriMesh (0x021)

This is the big one. 332 bytes of extra data, encoding everything you’d expect in a mesh plus many things you wouldn’t.

Inline fields

At a high level:

Runtime function pointers (+0x00, +0x04): written by the constructor. Zero on disk; never consumed from a file.
Faces array (+0x08): CExoArrayList of MaxFace (32 bytes each). See Face layout below.
Bounding volumes (+0x14..+0x38): bbox min, bbox max, bounding sphere (radius + centre xyz). The sphere is the one actually consumed at runtime – PartTriMesh::GetMinimumSphere hierarchically unions it with children’s spheres for culling. These sphere fields have no ASCII-parser equivalent; they’re exclusively binary-format fields written by the BioWare toolset.
Material (+0x3C..+0x54): diffuse RGB, ambient RGB, transparencyhint.
Textures (+0x58..+0x98): texture_0 (primary/diffuse) and texture_1 (secondary/lightmap), each a 32-byte null-terminated string, plus 32 bytes of padding up to +0xE8.
UV animation (+0xEC..+0xF8): uv_direction_x, uv_direction_y, uv_jitter, uv_jitter_speed. Gated by animate_uv (+0xE8).
MDX vertex layout (+0x100..+0x12F): flags bitmask plus 11 per-attribute byte offsets. Described in the next subsection.
Counts and flags (+0x130..+0x13B): vertex_count (u16), texture_channel_count (u16), six 1-byte booleans (light_mapped, rotate_texture, is_background_geometry, shadow, beaming, render).
Tail (+0x13C..+0x14B): total_surface_area (+0x13C), one unresolved reserved slot (+0x140), mdx_data_offset (+0x144), vert_array_offset (+0x148).

Out of 332 bytes, 61 fields are fully confirmed through Ghidra cross-referencing, 5 are confirmed-unused, 1 is “very likely” (the always-3 indices_per_face), and exactly 1 remains unresolved (the 4 bytes at +0x140, which the constructor initializes to zero and no known function ever touches).

MDX vertex layout

The flags field at extra +0x100 is a bitmask describing what each MDX vertex record contains:

Bit	Component	Size
0x01	position	3×f32 (12B) – always set
0x02	UV1 / `tverts0`	2×f32 (8B)
0x04	UV2 / `tverts1`	2×f32 (8B)
0x08	UV3 / `tverts2`	2×f32 (8B)
0x10	UV4 / `tverts3`	2×f32 (8B)
0x20	normal	3×f32 (12B) – always set
0x80	tangent space	3×3×f32 (36B) – bump-mapped meshes

Common patterns in vanilla K1: 0x21 (pos+norm only, 24B stride), 0x23 (+UV1, 32B), 0x27 (+UV2, 40B), 0xA7 (+tangent, 76B).
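The strides quoted above follow directly from summing the sizes of the flagged components. A small sketch (the function name is hypothetical, and it only counts components that have a flag bit):

```rust
/// Sums the per-component sizes for every bit set in the MDX flags field.
/// Only flagged components are counted; attributes without a flag bit are
/// outside the scope of this sketch.
pub fn stride_from_flags(flags: u32) -> u32 {
    // (flag bit, size in bytes), taken from the table above.
    const COMPONENTS: [(u32, u32); 7] = [
        (0x01, 12), // position, 3 x f32
        (0x02, 8),  // UV1
        (0x04, 8),  // UV2
        (0x08, 8),  // UV3
        (0x10, 8),  // UV4
        (0x20, 12), // normal, 3 x f32
        (0x80, 36), // tangent space, 3 x 3 x f32
    ];
    COMPONENTS
        .iter()
        .filter(|&&(bit, _)| flags & bit != 0)
        .map(|&(_, size)| size)
        .sum()
}
```

For the four common vanilla patterns this gives 24, 32, 40, and 76 bytes respectively.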

Note that vertex colours have no flag bit. Their presence is signalled by the per-attribute offset slot being != -1. The 11 offset slots are:

Slot	Extra offset	Field	Evidence
0	+0x104	position	`LightPartTriMesh` reads 3×f32, world-transforms
1	+0x108	normal	`LightPartTriMesh` reads 3×f32, rotation only
2	+0x10C	vertex color	Checked `!= -1`, reads RGB only. Alpha unused.
3	+0x110	UV1	`PartTriMesh` reads 2×f32
4	+0x114	UV2	Structural: `tverts1` in `InternalGenVertices`
5	+0x118	UV3	Structural: `tverts2`
6	+0x11C	UV4	Structural: `tverts3`
7	+0x120	tangent space	Filled by `CalculateTangentSpaceBasis`
8–10	+0x124..+0x12C	reserved	Always `-1` across 215 surveyed vanilla meshes

Vertex colour alpha is unused (confirmed 2026-04-04). LightPartTriMesh reads only bytes [0], [1], [2] (RGB). Byte [3] is stored but never read. The rendered output hardcodes alpha to 0xFF. The fourth byte exists purely for alignment.

Important subtlety: the engine doesn’t trust any of these values on load. InternalPostProcess at 0x0043cf00 recomputes the flags, stride, per-attribute offsets, and mdx_data_offset from scratch, based on which vertex components are actually present in the node’s arrays. It also recomputes vertex normals via edge cross products, and re-derives the bounding box and sphere. The on-disk values preserve the compiler’s original output, but they’re cosmetic from the engine’s perspective.

This has a consequence for tooling: you can largely get away with wrong values in these fields as long as your mesh is otherwise valid, because the engine will fix them up at load time. But a correct writer should still populate them – community tools (kotorblender, mdledit) depend on them, and the BioWare build pipeline does too.

Skin mesh (0x061)

100 extra bytes beyond TriMesh. Skinning data (bone weights, inverse-bind-pose rotation and translation, bone-index mapping) sits here, along with several padding regions:

Skin offset	Field	Layout	Notes
+0x00	`weights`	CExoArrayList	Always zero in binary files.
+0x14	`bone_weight_data`	ptr	Relocated if count at +0x18 > 0.
+0x1C	`qbone_ref_inv`	CExoArrayList	Inverse-bind rotations.
+0x28	`tbone_ref_inv`	CExoArrayList	Inverse-bind translations.
+0x34	`bone_constant_indices`	CExoArrayList	Bone-index remap.

The weights array deserves a call-out. A 52-byte SkinVertexWeight struct exists and is fully specified by the ASCII parser – 4 bone names, 4 weights, some metadata – but in the binary path, ResetSkin never relocates its pointer, and a corpus scan of all 968 skin nodes across 2832 vanilla models found zero non-empty weights arrays. Binary models store per-vertex bone data exclusively in MDX (via dedicated bone-weight and bone-index offsets), and the weights CExoArray is just a 12-byte zero blob on disk.

AnimMesh (0x0A1)

56 extra bytes. Carries a sample_period scalar and two CExoArrayList fields (anim_verts, anim_t_verts) for time-sampled vertex animation. The remaining six fields (three pointers + three counts + some padding) are runtime-only and zero on disk. Fun fact: no community tool (kotorblender, mdledit, kotormax, reone, xoreos, pykotor) parses AnimMesh nodes – we may have the first structured reader for this type.

Also: ResetAnim is peculiar in that it processes the extra data before calling ResetTriMeshParts, the reverse of every other mesh subtype. There’s no obvious reason for this.

Dangly mesh (0x121)

The simplest mesh subtype, 28 extra bytes. Five fields: a per-vertex constraints CExoArrayList, three inline floats (displacement, tightness, period) that parameterize the soft-body simulation, and a single conditional pointer at the tail that is relocated only when the TriMesh vertex count is non-zero.

Dangly meshes are BioWare’s hack for cloth and hair – rigged to the skeleton like a skin mesh, but with simulation parameters that let parts of the geometry lag and swing.

AABB walkmesh (0x221)

4 extra bytes: a single pointer to the root of an AABB tree stored inline in the MDL blob.

The AABB tree is a flattened binary tree of bounding boxes, written in DFS preorder. Each node is 40 bytes:

Offset	Size	Field	Notes
+0x00	12	`box_min`	3×f32 AABB minimum corner
+0x0C	12	`box_max`	3×f32 AABB maximum corner
+0x18	4	`right_child`	Content-relative offset (0 = no child)
+0x1C	4	`left_child`	Content-relative offset (0 = no child)
+0x20	4	`face_index`	i32. Leaves: ≥ 0. Internal: −1.
+0x24	4	`split_direction_flags`	Axis bitmask: 1=+X, 2=+Y, 4=+Z, 8=−X, 16=−Y, 32=−Z

Note that right_child comes before left_child in the struct – this is the actual field order, not a typo. Matches Ghidra and the mdledit/mdlops implementations.

Leaf nodes have left = 0, right = 0, face_index ≥ 0, split_direction_flags = 0. Internal nodes have both children non-zero, face_index = -1, and flags computed from the child bounding-box separation. The format is the classic spatial subdivision tree used for fast triangle lookups during pathfinding and collision queries.
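As a sketch, the 40-byte record can be decoded as follows. The struct and helper names are hypothetical, and little-endian byte order is assumed, as elsewhere in the binary format.

```rust
/// One 40-byte AABB tree node, fields in on-disk order.
#[derive(Debug, Clone, PartialEq)]
pub struct AabbNode {
    pub box_min: [f32; 3],
    pub box_max: [f32; 3],
    pub right_child: u32, // content-relative offset, 0 = no child
    pub left_child: u32,  // content-relative offset, 0 = no child
    pub face_index: i32,  // >= 0 on leaves, -1 on internal nodes
    pub split_direction_flags: u32,
}

fn le_u32(bytes: &[u8], at: usize) -> u32 {
    u32::from_le_bytes([bytes[at], bytes[at + 1], bytes[at + 2], bytes[at + 3]])
}

fn le_f32x3(bytes: &[u8], at: usize) -> [f32; 3] {
    [0, 4, 8].map(|o| f32::from_bits(le_u32(bytes, at + o)))
}

impl AabbNode {
    pub const SIZE: usize = 40;

    /// Decodes one node; returns `None` if fewer than 40 bytes are available.
    pub fn parse(bytes: &[u8]) -> Option<Self> {
        if bytes.len() < Self::SIZE {
            return None;
        }
        Some(Self {
            box_min: le_f32x3(bytes, 0x00),
            box_max: le_f32x3(bytes, 0x0C),
            right_child: le_u32(bytes, 0x18), // right comes before left
            left_child: le_u32(bytes, 0x1C),
            face_index: le_u32(bytes, 0x20) as i32,
            split_direction_flags: le_u32(bytes, 0x24),
        })
    }

    /// Leaf test following the invariants described above.
    pub fn is_leaf(&self) -> bool {
        self.left_child == 0 && self.right_child == 0 && self.face_index >= 0
    }
}
```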

ResetAABBTree at 0x004a0260 recurses the tree, relocating each child pointer. It manually unrolls to depth 4 before recursing (the engine’s author was clearly worried about stack depth on a modest C++ compiler).

Lightsaber (0x821)

20 extra bytes – small but architecturally notable:

Saber offset	Field	Notes
+0x00	saber vert data	Relocated pointer
+0x04	saber UV data	Relocated pointer
+0x08	saber normal data	Relocated pointer
+0x0C	GL vertex pool ID	Runtime-only (set by `RequestPool`)
+0x10	GL index pool ID	Runtime-only

Three arrays of exactly 176 vertices each (NUM_SABER_VERTS = 176, confirmed by kotorblender): position, UV, normal. The saber blade is a fixed-topology mesh – BioWare pre-baked the geometry as a flexible band that can be animated by swinging the endpoint controllers.

Unlike Skin/Dangly/AnimMesh, the saber uses the base TriMesh gen_vertices and remove_temporary_array callbacks. Its geometry doesn’t morph dynamically at the vertex-processing level – the animation is in the controller track.

Controllers and animation

The controller header

Controllers are the keyframe-animation primitive. Each node has an array of 16-byte NewController headers (at node +0x38) plus a shared pool of float data (at +0x44). Each header describes one animatable property of that node:

Offset	Size	Field	Notes
+0x00	u32	`type_code`	Byte offset of the target property in the Part struct.
+0x04	i16	`supermodel_link`	Additive-blending property offset; `-1` = no blending.
+0x06	u16	`row_count`	Number of keyframes.
+0x08	u16	`time_data_offset`	Float-array index for time values.
+0x0A	u16	`data_offset`	Float-array index for value data.
+0x0C	u8	`value_type_and_flags`	Low nibble: 1=float, 2/4=quaternion, 3=vector. Bit 4=0x10=Bezier.
+0x0D	3	padding	Alignment to 16 bytes. Never read.

The type_code is elegant: it’s literally the byte offset into the Part struct where the animated value lives. NewController::Control dereferences it as *(float*)(part_ptr + type_code). So type_code = 8 means “position” because position sits at Part+0x08; type_code = 20 means “orientation” because orientation sits at Part+0x14 (as a compressed axis-angle quaternion); and so on. This collapses what would otherwise be a switch over property IDs into direct pointer arithmetic.
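Decoding the 16-byte header is mechanical. The sketch below uses a hypothetical `ControllerHeader` struct whose field names follow the table above; little-endian byte order is assumed.

```rust
/// One 16-byte controller header, fields in on-disk order.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ControllerHeader {
    pub type_code: u32,
    pub supermodel_link: i16,
    pub row_count: u16,
    pub time_data_offset: u16,
    pub data_offset: u16,
    pub value_type_and_flags: u8,
}

impl ControllerHeader {
    pub const SIZE: usize = 16;

    pub fn parse(b: &[u8; 16]) -> Self {
        Self {
            type_code: u32::from_le_bytes([b[0], b[1], b[2], b[3]]),
            supermodel_link: i16::from_le_bytes([b[4], b[5]]),
            row_count: u16::from_le_bytes([b[6], b[7]]),
            time_data_offset: u16::from_le_bytes([b[8], b[9]]),
            data_offset: u16::from_le_bytes([b[10], b[11]]),
            value_type_and_flags: b[12],
            // b[13..16] is alignment padding and is never read.
        }
    }
}
```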

The value_type_and_flags byte at +0x0C has a compound encoding that bit us hard early on:

Low nibble (& 0x0F) – value-type discriminator: 1=float, 2 or 4=quaternion, 3=vector. Selects the interpolation path (Lerp/Slerp/VectorLerp).
High nibble (& 0xF0) – flags. 0x10 signals Bezier interpolation, which triples the per-keyframe value count (each keyframe is value + in-tangent + out-tangent).
Special case: for orientation controllers (type code 20) with raw byte value == 2, the keyframe is a compressed quaternion packed into a single u32, not two f32 values.

The low nibble happens to coincide with the “number of floats per keyframe row” for simple cases (1, 3, 4), which is why the earlier interpretation of this byte as column_count mostly worked – until it didn’t. See the controller bug below.
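Put together, the decoding rules for this byte can be sketched as a single function. The name `values_per_row` is hypothetical, and the sketch assumes (as described above) that the low nibble doubles as the per-row float count outside the two special cases.

```rust
/// Type code of the universal orientation controller.
pub const ORIENTATION: u32 = 20;
/// High-nibble flag signalling Bezier interpolation.
pub const BEZIER_FLAG: u8 = 0x10;

/// Number of 32-bit values stored per keyframe row in the data array.
pub fn values_per_row(type_code: u32, raw: u8) -> usize {
    // Compressed quaternion: one packed u32 per row. The check is against
    // the raw byte, not the masked nibble.
    if type_code == ORIENTATION && raw == 2 {
        return 1;
    }
    let base = usize::from(raw & 0x0F);
    if raw & BEZIER_FLAG != 0 {
        base * 3 // value + in-tangent + out-tangent
    } else {
        base
    }
}
```

A Bezier position controller (raw byte `0x13`) therefore yields 9 values per row, where the naive "raw byte is the column count" reading would give 19.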

Self-describing rows

Because value_type_and_flags is inline in each controller header, the binary format is entirely self-describing for animation data. The reader doesn’t need a lookup table mapping type codes to column counts – it reads the flags byte and knows how many floats to consume per row.

This is useful because vanilla K1 contains controller type codes (0x68, 0x188) that aren’t documented in any community reference. Trying to parse these with a closed enum caused 517 of 2832 vanilla MDLs (18.3%) to fail. MdlControllerType is therefore a newtype struct MdlControllerType(u32) with named constants for the three universally-confirmed base types (POSITION = 8, ORIENTATION = 20, SCALE = 36) and accepts any other u32 losslessly.
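The open-newtype shape described above looks roughly like this. Treat it as a sketch: the three constants come from the text, but the derive list, the `is_base` helper, and the `From<u32>` impl are assumptions for illustration; the Rustdocs are the reference for the real definition.

```rust
/// Open set of controller type codes: any u32 is representable.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, PartialOrd, Ord)]
pub struct MdlControllerType(pub u32);

impl MdlControllerType {
    pub const POSITION: Self = Self(8);
    pub const ORIENTATION: Self = Self(20);
    pub const SCALE: Self = Self(36);

    /// True for the three controllers that exist on every node type.
    pub fn is_base(self) -> bool {
        matches!(self, Self::POSITION | Self::ORIENTATION | Self::SCALE)
    }
}

impl From<u32> for MdlControllerType {
    /// Lossless: undocumented codes such as 0x68 or 0x188 are kept as-is.
    fn from(code: u32) -> Self {
        Self(code)
    }
}
```

The design point is that parsing never fails on an unfamiliar code; interpretation is deferred to whoever consumes the controller.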

Base vs type-specific controllers

Three controllers are universal – they exist on every node type:

ASCII name	Code	Columns	Meaning
`position`	8	3	x, y, z
`orientation`	20	4	x, y, z, angle (compressed axis-angle)
`scale`	36	1	uniform scale factor

Type-specific codes live at higher numbers: light controllers start at 76 (color), emitter controllers are at 88+. All three base codes also support a Bezier variant (signalled by the flag bit, not a separate type code).

The MDX file: a mystery

Now for the strangest part of the format.

The MDX file contains interleaved vertex data – positions, normals, UVs, tangent space, colours – packed into records of width given by the mesh vertex_stride field, aligned into per-mesh blocks with sentinel-float terminators separating them. It looks exactly like what you’d expect a GPU vertex buffer to look like.

And the K1 engine never reads it.

Here’s the complete trace through InputBinary::Read:

Read the MDX file into a buffer (pbVar9).
Call Reset(mdl_content, mdx_content, resource).
Reset passes mdx_content as param_3 through a chain of function calls (ResetMdlNode, ResetTriMeshParts, …). Every downstream function has param_3 as a formal parameter.
param_3 is never used. In ResetTriMeshParts, it’s literally overwritten as a loop counter on line 67.
Back in InputBinary::Read, line 78: _free(pbVar9). The MDX buffer is freed.

At no point does any vertex-related code path consume MDX data. InternalGenVertices builds vertex buffers from verts_arrays, which lives in the MDL content blob. ProcessVerts recomputes normals from geometry. LightPartTriMesh reads from the GL pool populated at +0xAC of the model header – which is sourced from the MDL content, not the MDX file.

So where does the vertex data actually come from? From a parallel set of position-only arrays stored inside the MDL content blob, pointed to by vert_array_offset at mesh +0x148 (content-relative), with additional UV/colour/normal data in the MdlNodeTriMeshVertArrays structures.

The MDX file, in short, is a redundant interleaved copy of data that the K1 engine could reconstruct from the MDL alone. Most likely theories for why it exists:

Build-pipeline artifact. BioWare’s Aurora engine (Neverwinter Nights) may have used the MDX format directly, and the K1 pipeline inherited the file-layout convention without the consuming code path.
Toolset requirement. Third-party editors and the BioWare toolset itself may still parse MDX for authoring workflows.
ResetLite path. There’s a separate “lightweight” loader (InputBinary::ResetLite at 0x004a11b0) that may use MDX for a reduced in-memory representation – unverified.

For our purposes, this has two consequences:

Engine-functional MDX is near-trivial. Any MDX file the K1 engine happily ignores is a valid MDX file. You could write all zeros and the game would run.
Round-trip-accurate MDX requires the per-mesh terminator convention (described next), because community tools do read MDX, and byte-identical round-trip is a useful correctness check.

Per-mesh terminators and alignment

Empirically, vanilla MDX files are larger than sum(vertex_count × stride). Across 2832 vanilla K1 models, 2445 have MDX files with excess bytes, totalling 3,278,456 bytes corpus-wide.

The excess has structure. After each mesh’s vertex data, there’s a terminator row of exactly one stride’s worth of bytes, beginning with three sentinel floats and padded with zeros:

Mesh type	Sentinel value	Hex (f32 LE)
Non-skin (`type & 0x40 == 0`)	10,000,000.0	`80 96 18 4B`
Skin (`type & 0x40 != 0`)	1,000,000.0	`00 24 74 49`

Corpus sentinel detection: 6,973 non-skin sentinels, 6 skin sentinels, 0 unknown patterns.
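A writer can generate the terminator row directly from the node type and stride. A minimal sketch, with a hypothetical function name; it returns `None` when the stride cannot hold three floats.

```rust
/// Builds one terminator row: three sentinel floats (little-endian)
/// followed by zero padding up to `stride` bytes.
pub fn terminator_row(node_type: u32, stride: usize) -> Option<Vec<u8>> {
    if stride < 12 {
        return None; // not enough room for three f32 sentinels
    }
    // Skin meshes (bit 0x40 set) use a different sentinel value.
    let sentinel: f32 = if node_type & 0x40 != 0 {
        1_000_000.0
    } else {
        10_000_000.0
    };
    let mut row = vec![0u8; stride];
    for chunk in row[..12].chunks_exact_mut(4) {
        chunk.copy_from_slice(&sentinel.to_le_bytes());
    }
    Some(row)
}
```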

Between meshes, the cursor is padded to the next 16-byte boundary. The last mesh has no trailing alignment:

cursor = 0
for each mesh in MDX order:
    cursor += vertex_count × stride   # vertex data
    cursor += stride                   # terminator row
    if not last mesh:
        cursor = (cursor + 15) & ~15   # 16-byte alignment
mdx_file_size = cursor

For stride-24 meshes, the gap between meshes is either 24 or 32 bytes depending on current alignment. For stride-32 and stride-64 meshes, it’s always exactly stride because the stride is already a multiple of 16.
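The pseudocode above translates to Rust almost line for line. This is a sketch with a hypothetical function name; the input is assumed to be `(vertex_count, stride)` pairs already sorted into MDX order.

```rust
/// Expected MDX file size for meshes given as (vertex_count, stride),
/// in the order they appear in the MDX file.
pub fn mdx_file_size(meshes: &[(u64, u64)]) -> u64 {
    let mut cursor = 0u64;
    for (i, &(vertex_count, stride)) in meshes.iter().enumerate() {
        cursor += vertex_count * stride; // vertex data
        cursor += stride;                // terminator row
        if i + 1 != meshes.len() {
            cursor = (cursor + 15) & !15; // 16-byte alignment between meshes
        }
    }
    cursor
}
```

For example, a 2-vertex stride-24 mesh followed by a 1-vertex stride-32 mesh occupies 72 bytes, is padded to 80, and the second mesh adds 64 more, for a total of 144 bytes.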

Mesh ordering in MDX

Non-skin meshes come first, then skin meshes. Within each group, the order is DFS-traversal-of-the-tree – mostly. About 27% of vanilla models exhibit a compiler-specific permutation that defers “second children” of paired parents until after all their siblings’ first children. This is reproducible for our own output (if we write DFS, we read DFS), but not for byte-identical round-trip of every BioWare file.

Writing in standard DFS order (non-skin first, skin second) produces semantically identical MDX data with the correct total size. 1784 of 2444 models match byte-for-byte; the remaining 660 have the non-standard compiler ordering.

What this means for `mdx_data_offset`

The mesh header has two adjacent u32 fields at +0x144 and +0x148:

+0x144 mdx_data_offset: per-mesh byte offset into the MDX file. Used by community tools to seek directly to that mesh’s vertex block. The engine also uses this after InternalPostProcess overwrites it with a GL-pool offset.
+0x148 vert_array_offset: content-relative pointer to the position-only vertex data embedded in the MDL content blob. Used by the engine during load. Relocated by ResetTriMeshParts via param_1->field60_0x198 = param_2 + param_1->field60_0x198 – where param_2 is the MDL content base, not the MDX base.

These two fields were conflated under a single MDX_OFFSET = 0x148 constant in our implementation for several months, which caused the reader to lose the MDX offset entirely and the writer to overwrite the content pointer with an MDX offset. Full story in War stories.
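Keeping the two fields as separately named constants makes the conflation hard to repeat. A sketch, with hypothetical names, reading both from the 332-byte TriMesh extra block (little-endian assumed):

```rust
/// Offsets relative to the start of the TriMesh extra block.
pub const MDX_DATA_OFFSET: usize = 0x144;
pub const VERT_ARRAY_OFFSET: usize = 0x148;
pub const MESH_EXTRA_SIZE: usize = 0x14C;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct MeshOffsets {
    /// Byte offset into the MDX file.
    pub mdx_data_offset: u32,
    /// Content-relative offset into the MDL blob.
    pub vert_array_offset: u32,
}

/// Reads both tail fields; `None` if the block is shorter than 332 bytes.
pub fn read_mesh_offsets(extra: &[u8]) -> Option<MeshOffsets> {
    if extra.len() < MESH_EXTRA_SIZE {
        return None;
    }
    let u32_at = |at: usize| {
        u32::from_le_bytes([extra[at], extra[at + 1], extra[at + 2], extra[at + 3]])
    };
    Some(MeshOffsets {
        mdx_data_offset: u32_at(MDX_DATA_OFFSET),
        vert_array_offset: u32_at(VERT_ARRAY_OFFSET),
    })
}
```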

Face layout

Faces are 32-byte records (MaxFace) stored in the TriMesh faces CExoArray:

Offset	Size	Field	Type	Notes
+0x00	12	`plane_normal`	3×f32	Face plane normal.
+0x0C	4	`plane_distance`	f32	Plane equation: n·p = d.
+0x10	4	`surface_id`	u32	Walkability / material identifier.
+0x14	6	`adjacent`	3×u16	Indices of adjacent faces (for AABB/pathfinding).
+0x1A	6	`vertex_indices`	3×u16	Triangle vertex indices.

The plane normal and distance are pre-computed by the BioWare toolset. They can be re-derived from the geometry but the binary format preserves them. The adjacency graph is what makes AABB walkmesh lookups fast – each triangle points to its neighbours, enabling constant-time stepping during pathfinding.

An early version of our reader assumed 12-byte faces (just the vertex indices). This led to every 2.67th “face” being interpreted from garbage bytes belonging to the next face’s plane normal. It was masked by synthetic round-trip tests – write wrong, read wrong, match! – and only caught when vanilla-file validation found vertex indices exceeding the mesh’s vertex count.
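The bounds check that caught the bug is cheap to build into the parser itself. The sketch below uses hypothetical names (`MaxFace` mirrors the table above, `parse_faces` is illustrative) and assumes little-endian data.

```rust
use std::convert::TryFrom;

/// One 32-byte face record, fields in on-disk order.
#[derive(Debug, Clone, PartialEq)]
pub struct MaxFace {
    pub plane_normal: [f32; 3],
    pub plane_distance: f32,
    pub surface_id: u32,
    pub adjacent: [u16; 3],
    pub vertex_indices: [u16; 3],
}

fn f32_at(b: &[u8], at: usize) -> f32 {
    f32::from_le_bytes([b[at], b[at + 1], b[at + 2], b[at + 3]])
}

fn u16_at(b: &[u8], at: usize) -> u16 {
    u16::from_le_bytes([b[at], b[at + 1]])
}

impl MaxFace {
    pub const SIZE: usize = 32;

    pub fn parse(b: &[u8; 32]) -> Self {
        Self {
            plane_normal: [f32_at(b, 0x00), f32_at(b, 0x04), f32_at(b, 0x08)],
            plane_distance: f32_at(b, 0x0C),
            surface_id: u32::from_le_bytes([b[0x10], b[0x11], b[0x12], b[0x13]]),
            adjacent: [u16_at(b, 0x14), u16_at(b, 0x16), u16_at(b, 0x18)],
            vertex_indices: [u16_at(b, 0x1A), u16_at(b, 0x1C), u16_at(b, 0x1E)],
        }
    }
}

/// Parses a face array and rejects any vertex index >= `vertex_count`.
pub fn parse_faces(data: &[u8], vertex_count: u16) -> Result<Vec<MaxFace>, String> {
    if data.len() % MaxFace::SIZE != 0 {
        return Err(format!("face data length {} is not a multiple of 32", data.len()));
    }
    let mut faces = Vec::with_capacity(data.len() / MaxFace::SIZE);
    for (i, chunk) in data.chunks_exact(MaxFace::SIZE).enumerate() {
        let record = <&[u8; 32]>::try_from(chunk).map_err(|e| e.to_string())?;
        let face = MaxFace::parse(record);
        if face.vertex_indices.iter().any(|&v| v >= vertex_count) {
            return Err(format!("face {i} references a vertex outside 0..{vertex_count}"));
        }
        faces.push(face);
    }
    Ok(faces)
}
```

A wrong record stride tends to trip the index check within the first few faces of a real mesh, which is why validating against vanilla files is more telling than a synthetic round trip.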

War stories and implementation history

A brief chronicle of the bugs found while building the Rust reader/writer, because the “how we know this” is often as useful as the “what we know”.

The 12-byte face bug

Described above. The MaxFace stride is 32 bytes, not 12. Caught by vertex-index bounds checking against vanilla files.

Mesh header size corrections

The whole mesh extra-header was misunderstood for a long time. A sample of the corrections, all fixed in late February 2026:

VERTEX_COUNT offset was 0x9E → actually 0x130
MDX_OFFSET was 0xB8 → actually two separate fields at 0x144 and 0x148
VERTEX_STRUCT_SIZE was 0xBC → actually 0xFC
MESH_EXTRA_SIZE was 200 bytes → actually 332 (0x14C)
RENDER boolean was missing entirely → added at 0x139
SHADOW boolean was missing entirely → added at 0x137

All of these stemmed from extrapolating offsets from partial hex dumps rather than decompiling the struct. Ghidra’s MdlNodeTriMesh struct definition settled the whole thing – once the Ghidra type was aligned, the field offsets fell out directly.

Controller column-count encoding

Our reader initially used the raw value_type_and_flags byte (at controller +0x0C) directly as a float count per row. This worked for the common case (position=3, orientation=4, scale=1) but broke in two scenarios:

Bezier controllers set bit 0x10, turning raw=3 (Bezier position) into a byte value of 0x13 = 19 columns, not 9.
Integral orientation: ORIENTATION controllers with raw byte == 2 mean “compressed quaternion packed into one u32 per row”, not “2 f32 values per row”.

The integral-orientation case was the more painful bug: a c_dewback scan showed 876 integral-orientation controllers; c_rancor had 1,212. Reading 2 floats instead of 1 consumed double the expected data, desynchronizing every subsequent controller in the data array. Every node’s animation after the first compressed-quaternion keyframe was reading from a shifted window of garbage.

Fix: decode the raw byte with & 0x0F masking plus the two special cases (Bezier multiplies by 3; integral orientation uses 1 u32 per row regardless). The raw byte is preserved in a raw_column_count field for round-trip fidelity.

Animation node_number at +0x02

The 80-byte node header’s first 8 bytes are type_flags (u16), node_number (u16), name_index (u16), padding (u16). Our offset map had NODE_ID = 0x04, which pointed to name_index, not node_number.

For animation nodes specifically, node_number is the engine’s key for matching animation keyframe nodes to their geometry-side skeleton bones. Writing zeros at +0x02 and stuffing the name_index at +0x04 meant every animation node had node_number = 0, so every keyframe targeted the root bone. Visually: characters froze in T-pose with no skeletal motion whatsoever.

Fix: read node_number from +0x02 explicitly; derive name_index from the name map at +0x04.
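For reference, the first 8 bytes of the node header decode as sketched below. `NodePrefix` and `parse_node_prefix` are hypothetical names; little-endian byte order is assumed.

```rust
/// First 8 bytes of the 80-byte node header.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct NodePrefix {
    pub type_flags: u16,
    pub node_number: u16,
    pub name_index: u16,
}

pub fn parse_node_prefix(b: &[u8; 8]) -> NodePrefix {
    let u16_at = |at: usize| u16::from_le_bytes([b[at], b[at + 1]]);
    NodePrefix {
        type_flags: u16_at(0x00),
        node_number: u16_at(0x02), // key for matching animation nodes to bones
        name_index: u16_at(0x04),  // index into the name table, not the node id
        // 0x06..0x08 is padding.
    }
}
```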

MDX per-mesh seeking

Our MDX reader used a cumulative cursor assuming non-skin-first DFS ordering. For the ~51% of vanilla models where MDX layout doesn’t match that assumption, vertex data was assigned to the wrong mesh nodes. Self-round-trip tests couldn’t detect this – we were reading and writing the same wrong assignment, which is a consistency check for the tool’s own output, not for correctness against vanilla.

Fix: seek to info.mdx_data_offset (the +0x144 field) for each mesh, matching kotorblender and mdledit behaviour. The cumulative-cursor logic remains in the writer, which produces its own layout and backpatches the offset field; the reader trusts whatever the file says.
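In code, per-mesh seeking reduces to a bounds-checked slice. A minimal sketch, assuming the stride and vertex count have already been read from the mesh header; the function name is hypothetical.

```rust
use std::convert::TryFrom;

/// Returns the vertex block for one mesh, or `None` if the offset and
/// length computed from the header fall outside the MDX buffer.
pub fn mesh_vertex_block(
    mdx: &[u8],
    mdx_data_offset: u32,
    vertex_count: u16,
    stride: u32,
) -> Option<&[u8]> {
    let start = usize::try_from(mdx_data_offset).ok()?;
    let stride = usize::try_from(stride).ok()?;
    let len = usize::from(vertex_count).checked_mul(stride)?;
    let end = start.checked_add(len)?;
    mdx.get(start..end) // excludes the terminator row that follows
}
```

Because each mesh is located independently, the reader no longer depends on any assumption about the order in which the compiler laid the meshes out.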

Name-table dead entries

220 vanilla K1 models have name tables containing entries that no node references. These turn out to be walkmesh node names (*_wok, *_pwk, *_dwk variants) from BioWare’s build pipeline, which apparently shared a single name table across the MDL and WOK outputs.

The engine only performs indexed lookups via name_index; it never iterates the full table or validates the count. Extra entries are harmless dead weight.

Decision: not preserved. Our writer builds the name table from the node tree (matching kotorblender and mdledit), producing files that are functionally identical but 20–80 bytes shorter. This is a known, benign size delta – not a parity bug.

Emitter controller code verification

All 48 emitter controller type codes were independently verified against the engine binary via Ghidra. For each, we located the __stricmp call for the ASCII field name and traced the controller type value stored on match. Every code matched mdledit’s ReturnControllerName table exactly – no additions, no omissions.

One naming correction: the engine’s canonical string for code 200 is "lightningZigzag" (camelCase Z). mdledit has "lightningzigzag" (all lowercase). Functionally identical because the engine uses __stricmp (case-insensitive), but the engine’s capitalization is now what we emit.

Corpus validation status

As of 2026-02-24: 2832/2832 (100%) structural round-trip success (parse → write → parse → compare). This was achieved after fixing three comparison issues in the test harness:

NaN ≠ NaN (IEEE 754): 1559 false failures – floats containing NaN don’t equal themselves. Fixed with bitwise f32::to_bits() comparison.
Parent index ordering: 135 mismatches from depth-first vs. original node ordering. The binary format preserves node ordering but our parent-index reconstruction uses DFS. Semantically equivalent, numerically different – skipped in comparison.
Face NaN values: exactly one model (w_dblsbr_001) has NaN in its pre-computed plane_normal/plane_distance, because one of its faces is degenerate. Round-trips correctly once NaN-aware comparison is used.
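The NaN-aware comparison is a one-liner with `f32::to_bits`. The helper names below are hypothetical. Note that bitwise equality is stricter than `==` in one respect: it also distinguishes `0.0` from `-0.0` and NaNs with different payloads, which is the desired behaviour for a round-trip check.

```rust
/// Bitwise float equality: NaN equals an identical NaN, 0.0 differs from -0.0.
pub fn f32_bits_eq(a: f32, b: f32) -> bool {
    a.to_bits() == b.to_bits()
}

/// Element-wise bitwise comparison of two float slices.
pub fn f32_slices_bits_eq(a: &[f32], b: &[f32]) -> bool {
    a.len() == b.len() && a.iter().zip(b).all(|(x, y)| x.to_bits() == y.to_bits())
}
```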

Byte-level MDL/MDX equality is a separate target – 1784 of 2444 MDX files match byte-for-byte, with the remaining 660 showing the non-standard BioWare compiler traversal discussed earlier.

Appendix

Emitter field map

304 bytes total (80 base + 224 extra). Emitter-specific data:

Node offset	Extra offset	Field	Type
+0x50	+0x00	`deadspace`	f32
+0x54	+0x04	`blast_radius`	f32
+0x58	+0x08	`blast_length`	f32
+0x5C	+0x0C	`num_branches`	i32
+0x60	+0x10	`control_pt_smoothing`	i32
+0x64	+0x14	`x_grid`	i32
+0x68	+0x18	`y_grid`	i32
+0x6C	+0x1C	`spawn_type`	i32
+0x70	+0x20	`update`	char[32]
+0x90	+0x40	`render`	char[32]
+0xB0	+0x60	`blend`	char[32]
+0xD0	+0x80	`texture`	char[32]
+0xF0	+0xA0	`chunk_name`	char[16]
+0x100	+0xB0	`two_sided_tex`	i32
+0x104	+0xB4	`loop`	i32
+0x108	+0xB8	`render_order`	u16
+0x10A	+0xBA	`frame_blending`	u8
+0x10B	+0xBB	`depth_texture_name`	char[16]
+0x11B	+0xCB	(reserved)	21 bytes

LOD naming convention

When a model has cullWithLOD set, the engine searches for LOD variants by appending suffixes to the model name:

<name>_x – medium LOD
<name>_z – far LOD

Loaded via FindModel(name + "_x") and FindModel(name + "_z") as separate Model instances linked to the primary. Not relevant to format parsing, but useful for model validation and lint rules.

Resource type IDs

Format	Resource type
MDL	2002 (0x7D2)
MDX	3008 (0xBC0)

These map to the KEY/BIF resource type system. CAuroraInterface::RequestModel at 0x0070d8d0 resolves models through a sorted requestedModelList.

Dynamic type casts

The engine exposes As* functions for type-checked downcasts. Caller counts indicate runtime usage frequency:

Function	Callers
`AsModel`	34
`AsMdlNodeTriMesh`	14
`AsMdlNodeEmitter`	11
`AsAnimation`	7
`AsMdlNodeLightsaber`	5
`AsMdlNodeSkin`	4
`AsMdlNodeAABB`	3
`AsMdlNodeDanglyMesh`	3
`AsMdlNodeLight`	3
`AsMdlNodeAnimMesh`	2
`AsMdlNodeCamera`	2
`AsMdlNodeReference`	2

TriMesh (14) and Emitter (11) are the most-queried node types – useful signal for prioritizing implementation completeness.

Binary MDL call graph

For reference when reading Ghidra decompilations:

NewCAurObject (0x00449cc0)
└── FindModel (0x00464110)           [by name; checks cache via BinarySearchModel]
    └── LoadModel (0x00464200)       [on cache miss]
        └── IODispatcher::ReadSync (0x004a15d0)
            └── Input::Read (0x004a14b0)          ← format dispatcher
                ├── InputBinary::Read (0x004a1260)   if first_byte == 0x00
                │   └── Reset / ResetLite                (pointer rewriting)
                │       ├── ResetMdlNode                  (per-node dispatch)
                │       │   ├── ResetMdlNodeParts         (base fields)
                │       │   ├── ResetTriMesh              (mesh subtypes)
                │       │   ├── ResetLight                (light extras)
                │       │   ├── ResetSkin, ResetAnim, ...
                │       │   └── ResetAABBTree             (recursive tree walk)
                │       └── ResetAnimation                (per-animation)
                └── FuncInterp loop                 otherwise (ASCII MDL)
    └── CreateInstanceTreeR (0x00449200)  [builds runtime Part tree from MdlNode tree]

Key Ghidra addresses

For anyone continuing this archaeology, the foundation set of function addresses in swkotor.exe (K1 GOG build):

Function	Address
`Input::Read`	`0x004a14b0`
`InputBinary::Read`	`0x004a1260`
`InputBinary::Reset`	`0x004a1030`
`InputBinary::ResetMdlNode`	`0x004a0900`
`InputBinary::ResetMdlNodeParts`	`0x004a0b60`
`InputBinary::ResetTriMeshParts`	`0x004a0c00`
`InputBinary::ResetAABBTree`	`0x004a0260`
`InputBinary::ResetLight`	`0x004a05e0`
`InputBinary::ResetSkin`	`0x004a01b0`
`InputBinary::ResetDangly`	`0x004a0100`
`InputBinary::ResetAnim`	`0x004a0060`
`InputBinary::ResetLightsaber`	`0x004a0460`
`InputBinary::ResetAnimation`	`0x004a0fb0`
`MdlNodeTriMesh::InternalPostProcess`	`0x0043cf00`
`MdlNodeTriMesh::InternalGenVertices`	`0x00439df0`
`MdlNodeTriMesh::InternalParseField`	`0x004658b0`
`MdlNodeEmitter::InternalParseField`	`0x004658b0`
`MdlNodeEmitter::InternalCreateInstance`	`0x0049d5c0`
`PartTriMesh::GetMinimumSphere`	`0x00443330`
`LightPartTriMesh`	`0x0046a9e0`
`NewController::Control`	`0x00483330`
`NewController::GetFloatValue`	`0x00482bf0`
`Model` constructor	`0x0044aa70`
`MaxTree` constructor	`0x0044a900`
`ParseNode`	`0x004680e0`
Node type flag table	`0x00740a18`
