The MDL/MDX Format
BioWare’s Aurora/Odyssey engine stores 3D models in a pair of files:
.mdl and .mdx. This page documents what’s inside them, how the engine consumes them, and – occasionally – why they look the way they do. Evidence throughout is drawn from Ghidra decompilation of swkotor.exe (K1 GOG build), cross-checked against hex dumps of vanilla assets and community references (kotorblender, mdledit, mdlops, pykotor, reone, xoreos).
Overview
At a glance:
| Property | Value |
|---|---|
| Extensions | .mdl, .mdx |
| Magic | Binary: first u32 == 0. ASCII: text (filedependancy, newmodel, …) |
| Type | Hierarchical scene graph + animation + vertex data |
| Resource type ID | 2002 (MDL), 3008 (MDX) in KEY/BIF |
| Rust reference | rakata_formats::Mdl |
A model is a tree of nodes. Each node carries a transform (position + orientation), an animation track (“controllers”), and – depending on its type – geometry, light parameters, particle-emitter configuration, a skinning skeleton, a lightsaber blade, and so on. One MDL file can carry multiple named animations that operate on that tree.
The surprising shape of the format only makes sense once you understand one design choice, so let’s start there.
The core idea: load-and-fixup
The binary MDL is not a parsed format in the usual sense. The engine does not walk a byte stream field by field, calling read_u32, read_string, read_float. Instead, it does this:
- Allocate a buffer exactly the size of the model data.
- Copy the whole file into that buffer in one memcpy.
- Walk the now-in-memory structure and convert relative offsets into absolute pointers.
That’s it. The “parser” is a pointer rewriter. Every Reset* function you’ll see in the engine (InputBinary::Reset, ResetMdlNode, ResetTriMeshParts, …) takes a buffer base pointer and a struct pointer, and its job is essentially struct->field += base for every relocatable pointer in the struct, recursing into children as it goes.
An analogy: think of IKEA instructions that say “screw part A into the hole next to part B” rather than giving exact millimetre coordinates. The instructions are valid anywhere you choose to assemble the furniture. The MDL blob works the same way: every pointer is expressed relative to the blob’s origin, so the engine can drop the blob anywhere in memory and then do a one-time pass to convert those relative offsets to real addresses.
This design choice ripples through everything:
- On-disk layout matches in-memory layout exactly. If a MdlNodeTriMesh is 412 bytes in RAM, it’s 412 bytes on disk. Struct field offsets you see in a Ghidra decompilation are the file offsets.
- Binary files are architecture-bound. This format is a snapshot of a specific compiler’s struct layout on 32-bit Windows. Field alignment, pointer size (4 bytes), endianness (little), and even padding bytes all match that ABI.
- “Parsing” is really validation + relocation. A Rust reader doesn’t need to convert a byte stream into a Rust struct; it needs to interpret a memory image as a struct overlay, following pointers to walk the tree.
- The engine never writes binary MDL. The shipping engine only has code to emit ASCII MDL. Binary MDL is produced exclusively by BioWare’s model compiler (a build-time tool). The runtime reads it but never round-trips it.
With that frame in place, the rest of the format falls into shape.
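A reader written in safe Rust does not have to rewrite offsets into addresses at all. As a minimal sketch, it can keep the blob as a byte slice and resolve each content-relative offset on demand, with bounds checks. The helper names `read_u32_le` and `resolve` are hypothetical and chosen for illustration; they are not engine functions or part of any existing crate.

```rust
/// Read a little-endian u32 at `off`, or None if it would run past the blob.
fn read_u32_le(blob: &[u8], off: usize) -> Option<u32> {
    let end = off.checked_add(4)?;
    let bytes = blob.get(off..end)?;
    Some(u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]))
}

/// Resolve the content-relative offset stored at `field_off` into a slice of
/// `len` bytes. This is the bounds-checked analogue of the engine's
/// `struct->field += base` relocation step.
fn resolve(blob: &[u8], field_off: usize, len: usize) -> Option<&[u8]> {
    let target = read_u32_le(blob, field_off)? as usize;
    let end = target.checked_add(len)?;
    blob.get(target..end)
}
```

Because nothing is rewritten in place, the same buffer can be written back unchanged, which sidesteps the "convert pointers back to offsets" step a relocating reader would need.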
File structure
The 12-byte wrapper
The file begins with a tiny header:
| Offset | Type | Field | Notes |
|---|---|---|---|
| +0x00 | u32 | zero marker | Always 0. Used to tell binary from ASCII. |
| +0x04 | u32 | MDL content size | Bytes of model data that follow. |
| +0x08 | u32 | MDX file size | Size of the accompanying .mdx file. |
Input::Read at 0x004a14b0 is the dispatcher: it peeks at the first byte, and if it’s \0 the file is binary (the first u32 is always zero). Otherwise the file starts with ASCII tokens like filedependancy or newmodel, and processing hands off to a line-based interpreter.
For binary files, InputBinary::Read at 0x004a1260 does the rest:
- Record mdl_content_size and mdx_file_size from the wrapper.
- Allocate a heap buffer the size of the MDL content; memcpy the model data into it.
- If MDX size is non-zero, allocate a second buffer and memcpy the MDX file into it.
- Call Reset(mdl_buf, mdx_buf, resource_handle).
Note: the wrapper is not part of the model data. Byte 12 of the on-disk file is byte 0 of the in-memory MDL blob. All internal offsets are relative to the in-memory origin.
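The wrapper handling can be sketched as a small Rust function that checks the zero marker, reads the two sizes, and hands back the content blob starting at file byte 12. The `Wrapper` struct and `split_wrapper` function are illustrative names, not the actual reader's API.

```rust
/// The 12-byte wrapper that precedes the model data in a binary .mdl file.
#[derive(Debug, PartialEq)]
struct Wrapper {
    mdl_content_size: u32,
    mdx_file_size: u32,
}

/// Split a binary .mdl file into its wrapper and the MDL content blob.
/// Returns None for ASCII files (non-zero marker) or truncated input.
fn split_wrapper(file: &[u8]) -> Option<(Wrapper, &[u8])> {
    let u32_at = |off: usize| -> Option<u32> {
        let b = file.get(off..off + 4)?;
        Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
    };
    if u32_at(0)? != 0 {
        return None; // ASCII model: handled by a line-based parser instead
    }
    let wrapper = Wrapper {
        mdl_content_size: u32_at(4)?,
        mdx_file_size: u32_at(8)?,
    };
    // Byte 12 of the file is byte 0 of the in-memory blob.
    let end = 12usize.checked_add(wrapper.mdl_content_size as usize)?;
    let content = file.get(12..end)?;
    Some((wrapper, content))
}
```

All later offsets are then interpreted relative to `content`, never relative to the file.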
Three kinds of pointer
Inside the MDL blob you’ll encounter three distinct flavours of “pointer”, which is worth keeping straight:
- MDL-relative offsets – the vast majority. Relocated to absolute pointers by Reset* functions. On re-serialization, they must be rewritten back to relative offsets.
- MDX-file byte offsets – used by a few fields (e.g. per-mesh mdx_data_offset at +0x144) to locate vertex data in the separate MDX file.
- String pointers – themselves MDL-relative, but pointing into a string table at the end of the blob, pointed to by the name-offsets array at model +0xB8.
Confusingly, there are two similarly named fields on each mesh node: mdx_data_offset at +0x144 (an MDX file offset) and vert_array_offset at +0x148 (a content-relative pointer to embedded position data). Conflating these produced one of the nastier bugs in our reader’s history (see War stories below).
Model header
Once the blob is in memory, InputBinary::Reset at 0x004a1030 walks the model header. Here’s the relevant field map:
| Offset | Field | Notes |
|---|---|---|
| +0x00 | ModelDestructor vptr | Populated at load time. |
| +0x04 | ModelParseField vptr | Populated at load time. |
| +0x28 | root node offset | Relocated. ResetMdlNode recurses from here. |
| +0x48 | resource handle | Populated at load time. |
| +0x4C | type byte | `GetType()` |
| +0x50 | classification | 0=Other, 1=Effect, 2=Tile, 4=Character, 8=Door. |
| +0x54 | ref count | |
| +0x58 | animations array ptr | Relocated; count at +0x5C. |
| +0x64 | supermodel pointer | Populated via FindModel(buf+0x88). |
| +0x68..+0x80 | bbox min/max | Vector bmin, bmax. |
| +0x80 | radius | f32, default 7.0. |
| +0x84 | animation scale | f32, default 1.0. ASCII: setanimationscale. |
| +0x88 | supermodel name | char[36], null-terminated. Drives recursive model load. |
| +0xA8 | node array (secondary) | Relocated if non-zero. |
| +0xAC | MDX vertex pool offset | Source offset into MDX data (consumed into a GL pool). |
| +0xB0 | MDX data size | Size of the vertex-pool copy. |
| +0xB8 | name offsets array ptr | Relocated; count at +0xBC. Array entries also relocated. |
Two fields deserve special mention:
- +0x50 classification is the model’s high-level category (Character, Door, Tile, …). It’s never read during the Reset pass – it’s carried through as part of the memory-mapped blob and consulted at runtime. Cross-validated against hex dumps:

  | File | +0x50 | Category |
  |---|---|---|
  | c_dewback.mdl | 0x04 | Character ✓ |
  | dor_lhr01.mdl | 0x08 | Door ✓ |
  | m01aa_01a.mdl | 0x00 | Other ✓ |

- +0x88 supermodel name is a 32-byte (plus 4 padding) ASCII name. Loading a model with a supermodel triggers a recursive FindModel call for that name – think of supermodels as CSS-style inheritance, where animation data and bones defined on the parent are available to the child.
The node tree
The root node sits at model +0x28. From there, children are reached through a standard in-memory array layout: ptr + count_used + count_allocated at offsets +0x2C, +0x30, +0x34. This three-u32 pattern is BioWare’s CExoArrayList and shows up everywhere in the format – any time you see “12 bytes of array header”, this is what it is.
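Since this 12-byte header recurs throughout the format, it is worth reading through a single helper. The sketch below decodes the three u32 values and computes a bounds-checked byte range for the elements. `ArrayHeader`, `read_array_header`, and `element_range` are hypothetical names, and the element size is supplied by the caller because it depends on what the array holds.

```rust
/// The three-u32 array header (ptr, count_used, count_allocated).
#[derive(Debug, PartialEq, Clone, Copy)]
struct ArrayHeader {
    offset: u32, // content-relative on disk; an absolute pointer after relocation
    count_used: u32,
    count_allocated: u32,
}

fn read_array_header(blob: &[u8], at: usize) -> Option<ArrayHeader> {
    let raw = blob.get(at..at.checked_add(12)?)?;
    let word = |i: usize| u32::from_le_bytes([raw[i], raw[i + 1], raw[i + 2], raw[i + 3]]);
    Some(ArrayHeader {
        offset: word(0),
        count_used: word(4),
        count_allocated: word(8),
    })
}

/// Byte range of the array's elements inside the blob, bounds-checked.
fn element_range(
    h: &ArrayHeader,
    elem_size: usize,
    blob_len: usize,
) -> Option<std::ops::Range<usize>> {
    let start = h.offset as usize;
    let len = (h.count_used as usize).checked_mul(elem_size)?;
    let end = start.checked_add(len)?;
    if end <= blob_len { Some(start..end) } else { None }
}
```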
Base node layout (80 bytes)
All node types begin with the same 80-byte header:
| Offset | Size | Field | Notes |
|---|---|---|---|
| +0x00 | u16 | node_type | Flag bitmask. Drives type dispatch. |
| +0x02 | u16 | node_id | Sequential 0..N-1. |
| +0x04 | u16 | node_id_dup | Identical copy of node_id. Never read. |
| +0x06 | u16 | padding | Always zero. |
| +0x08 | u32 | name pointer | Relocated. Points into the string table. |
| +0x0C | u32 | parent pointer | Relocated if non-zero. |
| +0x10 | 12 | position | Vector{x, y, z} as 3×f32. |
| +0x1C | 16 | orientation | Quaternion{w, x, y, z} as 4×f32. |
| +0x2C | 12 | children array | CExoArrayList of MdlNode*. |
| +0x38 | 12 | controller keys array | CExoArrayList of NewController (16B each). |
| +0x44 | 12 | controller data array | CExoArrayList of float (packed key data). |
The two bytes at +0x04 are a redundant duplicate of node_id – always identical to +0x02 across 209 nodes verified across four vanilla files, zero mismatches. No known engine function reads it. Best guess: legacy field or exporter artifact. It’s preserved for round-trip fidelity but has no semantic meaning.
A few conventions worth noting:
- Quaternion order is (w, x, y, z). Confirmed via Gob::GetOrientation at 0x004499a0, which copies fields in that order. Identity quaternion is [1.0, 0.0, 0.0, 0.0]. The Rust API uses the same convention.
- Position and orientation are read directly from the blob. They’re not relocated – they’re inline values, not pointers.
- Only two fields need relocation in the base header: name pointer at +0x08 and parent pointer at +0x0C.
InputBinary::ResetMdlNodeParts at 0x004a0b60 handles the base relocations and then recurses: for each entry in the children array, relocate the child pointer and call ResetMdlNode on it.
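As an illustration of the base-header layout, the following sketch decodes the inline fields of one 80-byte node from the content blob. It leaves the three array headers out for brevity and keeps the name and parent fields as raw content-relative offsets. `BaseNode` and `parse_base_node` are hypothetical names, not the actual Rust API.

```rust
/// Inline fields of the 80-byte base node header (array headers omitted).
#[derive(Debug, PartialEq)]
struct BaseNode {
    node_type: u16,
    node_id: u16,
    name_offset: u32,
    parent_offset: u32,
    position: [f32; 3],
    orientation: [f32; 4], // (w, x, y, z)
}

fn parse_base_node(blob: &[u8], at: usize) -> Option<BaseNode> {
    let raw = blob.get(at..at.checked_add(80)?)?;
    let u16_at = |o: usize| u16::from_le_bytes([raw[o], raw[o + 1]]);
    let u32_at = |o: usize| u32::from_le_bytes([raw[o], raw[o + 1], raw[o + 2], raw[o + 3]]);
    let f32_at = |o: usize| f32::from_le_bytes([raw[o], raw[o + 1], raw[o + 2], raw[o + 3]]);
    Some(BaseNode {
        node_type: u16_at(0x00),
        node_id: u16_at(0x02),
        name_offset: u32_at(0x08),
        parent_offset: u32_at(0x0C),
        position: [f32_at(0x10), f32_at(0x14), f32_at(0x18)],
        orientation: [f32_at(0x1C), f32_at(0x20), f32_at(0x24), f32_at(0x28)],
    })
}
```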
Type dispatch
InputBinary::ResetMdlNode at 0x004a0900 reads the node_type field and dispatches:
| node_type | Handler | Kind |
|---|---|---|
| 0x0001 | ResetMdlNodeParts only | Dummy / base |
| 0x0003 | ResetLight | Light |
| 0x0005 | ResetMdlNodeParts only | Emitter |
| 0x0009 | ResetMdlNodeParts only | Camera |
| 0x0011 | ResetMdlNodeParts only | Reference |
| 0x0021 | ResetTriMesh → ResetTriMeshParts | TriMesh |
| 0x0061 | ResetSkin | Skin mesh |
| 0x00A1 | ResetAnim | AnimMesh |
| 0x0121 | ResetDangly | Dangly mesh (cloth) |
| 0x0221 | ResetAABBTree + ResetTriMeshParts | Walkmesh with AABB |
| 0x0401 | (no-op) | Trigger / unused |
| 0x0821 | ResetLightsaber | Saber mesh |
The type values are stored as a lookup table in the executable at 0x00740a18 (12 × u32).
Though the type codes are shaped like a bitmask – HEADER=0x01, LIGHT=0x02|HEADER, EMITTER=0x04|HEADER, TRIMESH=0x20|HEADER, SKIN=0x40|TRIMESH, SABER=0x800|TRIMESH, and so on – the dispatch is an exact value match, not individual bit checks. The bitmask structure is meaningful (skin is a superset of trimesh, for instance), it’s just not how the engine branches.
Size summary
Every node type has a known fixed size, both on disk and in memory:
| Flag | Type | Total | Base | Extra | Extends |
|---|---|---|---|---|---|
| 0x0001 | Base | 80 | 80 | 0 | – |
| 0x0003 | Light | 172 | 80 | 92 | MdlNode |
| 0x0005 | Emitter | 304 | 80 | 224 | MdlNode |
| 0x0009 | Camera | 80 | 80 | 0 | MdlNode |
| 0x0011 | Reference | 116 | 80 | 36 | MdlNode |
| 0x0021 | TriMesh | 412 | 80 | 332 | MdlNode |
| 0x0061 | Skin | 512 | 412 | 100 | TriMesh |
| 0x00A1 | AnimMesh | 468 | 412 | 56 | TriMesh |
| 0x0121 | Dangly | 440 | 412 | 28 | TriMesh |
| 0x0221 | AABB | 416 | 412 | 4 | TriMesh |
| 0x0401 | Trigger | 80 | 80 | 0 | MdlNode |
| 0x0821 | Saber | 432 | 412 | 20 | TriMesh |
Verified via ParseNode’s operator_new(size) calls and Ghidra struct definitions. All mesh subtypes extend MdlNodeTriMesh – their extra data begins at node offset +0x19C, immediately after the TriMesh block.
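The type codes and sizes lend themselves to a lookup that mirrors the engine's exact-value dispatch. The sketch below composes a few of the type values from their flag bits and maps each known type to its total size. The constant and function names are illustrative, and unknown values are rejected rather than guessed at.

```rust
// Flag bits as described for the node_type field; names are illustrative.
const HEADER: u16 = 0x0001;
const TRIMESH: u16 = 0x0020 | HEADER;
const SKIN: u16 = 0x0040 | TRIMESH;
const SABER: u16 = 0x0800 | TRIMESH;

/// Total node size in bytes, matched on the exact type value
/// (an exact-value match, not individual bit tests).
fn node_size(node_type: u16) -> Option<usize> {
    Some(match node_type {
        0x0001 | 0x0009 | 0x0401 => 80, // base, camera, trigger
        0x0003 => 172,                  // light
        0x0005 => 304,                  // emitter
        0x0011 => 116,                  // reference
        0x0021 => 412,                  // trimesh
        0x0061 => 512,                  // skin
        0x00A1 => 468,                  // animmesh
        0x0121 => 440,                  // dangly
        0x0221 => 416,                  // AABB walkmesh
        0x0821 => 432,                  // saber
        _ => return None,
    })
}
```

For example, `node_size(SKIN)` yields `Some(512)`, while a value such as `0x0023` that merely looks like a valid bit combination yields `None`.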
Node types in depth
The lightweight types
Camera (0x009) has no extra data. Same 80-byte footprint as the base node. ResetMdlNode dispatches to ResetMdlNodeParts only. There are no camera-specific ASCII fields either – the ASCII parser also falls through to the base handler.
Reference (0x011) carries just two fields in 36 extra bytes: a 32-byte ref_model name and a 4-byte reattachable flag. Both inline (no pointers to relocate).
Trigger (0x401) – the decompiled ResetMdlNode explicitly returns void without calling any reset function for this type. In practice it appears to be unused in shipping content.
Light (0x003)
Lights carry 92 bytes of extra data. Most of the scalar fields are straightforward (priority, shadow flag, ambient-only flag, flare radius, etc.), but lights are the most complex non-mesh type because of their array fields:
| Extra offset | Field | Layout | Runtime relocation |
|---|---|---|---|
| +0x04 | texture SafePointers | 12-byte array header | Zeroed on disk |
| +0x10 | flaresizes | CExoArrayList | ptr relocated |
| +0x1C | flarepositions | CExoArrayList | ptr relocated |
| +0x28 | flarecolorshifts | CExoArrayList | ptr relocated |
| +0x34 | texturenames | CExoArrayList<char*> (each ptr too!) | all ptrs relocated |
Lights also drive their colour, radius, shadow radius, vertical displacement, and multiplier via controllers (types 0x4C, 0x58, 0x60, 0x64, 0x8C) – these live in the base node’s controller arrays, not in the light-specific block.
Emitter (0x005)
Emitters are 304 bytes and – pleasantly – contain no relocatable pointers. Everything is inline: a fistful of floats and ints, four 32-byte name fields (update, render, blend, texture), and a 16-byte chunk_name. The full field map is in the appendix.
The most important field is update at extra offset +0x20. It’s the emitter type string, a case-sensitive selector against:
- "Fountain" → steady particle stream (most common)
- "Explosion" → one-shot burst
- "Single" → single particle
- "Lightning" → lightning-bolt effect
MdlNodeEmitter::InternalCreateInstance at 0x0049d5c0 branches on this string to instantiate the appropriate runtime emitter class.
Known engine-level footgun: controller 502 (detonate) is only valid on "Explosion" emitters. InternalCreateInstance only allocates the detonation memory for that branch, so a detonate controller on a "Fountain" emitter reads unallocated memory at runtime and crashes. This is a known flaw in mdlops-based exporters (KotorMax); rakata-lint will validate this.
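A lint for this rule can be very small. The sketch below is one possible shape for such a check, assuming the caller has already extracted the emitter's update string and the list of controller type codes on the node. It is not the actual rakata-lint implementation, and `detonate_is_safe` is a hypothetical name.

```rust
/// Controller type code for `detonate`.
const DETONATE: u32 = 502;

/// Returns true if the emitter's controller set respects the detonate rule:
/// `detonate` may only appear on "Explosion" emitters.
/// The comparison is case-sensitive, like the engine's selector.
fn detonate_is_safe(update: &str, controller_types: &[u32]) -> bool {
    let has_detonate = controller_types.contains(&DETONATE);
    !has_detonate || update == "Explosion"
}
```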
TriMesh (0x021)
This is the big one. 332 bytes of extra data, encoding everything you’d expect in a mesh plus many things you wouldn’t.
Inline fields
At a high level:
- Runtime function pointers (+0x00, +0x04): written by the constructor. Zero on disk; never consumed from a file.
- Faces array (+0x08): CExoArrayList of MaxFace (32 bytes each). See Face layout below.
- Bounding volumes (+0x14..+0x38): bbox min, bbox max, bounding sphere (radius + centre xyz). The sphere is the one actually consumed at runtime – PartTriMesh::GetMinimumSphere hierarchically unions it with children’s spheres for culling. These sphere fields have no ASCII-parser equivalent; they’re exclusively binary-format fields written by the BioWare toolset.
- Material (+0x3C..+0x54): diffuse RGB, ambient RGB, transparency hint.
- Textures (+0x58..+0x98): texture_0 (primary/diffuse) and texture_1 (secondary/lightmap), each a 32-byte null-terminated string, plus 32 bytes of padding up to +0xE8.
- UV animation (+0xEC..+0xF8): uv_direction_x, uv_direction_y, uv_jitter, uv_jitter_speed. Gated by animate_uv (+0xE8).
- MDX vertex layout (+0x100..+0x12F): flags bitmask plus 11 per-attribute byte offsets. Described in the next subsection.
- Counts and flags (+0x130..+0x13B): vertex_count (u16), texture_channel_count (u16), six 1-byte booleans (light_mapped, rotate_texture, is_background_geometry, shadow, beaming, render).
- Tail (+0x13C..+0x14B): total_surface_area, one unresolved reserved slot, mdx_data_offset, vertex_data_ptr.
Out of 332 bytes, 61 fields are fully confirmed through Ghidra cross-referencing, 5 are confirmed-unused, 1 is “very likely” (the always-3 indices_per_face), and exactly 1 remains unresolved (the 4 bytes at +0x140, which the constructor initializes to zero and no known function ever touches).
MDX vertex layout
The flags field at extra +0x100 is a bitmask describing what each MDX vertex record contains:
| Bit | Component | Size |
|---|---|---|
| 0x01 | position | 3×f32 (12B) – always set |
| 0x02 | UV1 / tverts0 | 2×f32 (8B) |
| 0x04 | UV2 / tverts1 | 2×f32 (8B) |
| 0x08 | UV3 / tverts2 | 2×f32 (8B) |
| 0x10 | UV4 / tverts3 | 2×f32 (8B) |
| 0x20 | normal | 3×f32 (12B) – always set |
| 0x80 | tangent space | 3×3×f32 (36B) – bump-mapped meshes |
Common patterns in vanilla K1: 0x21 (pos+norm only, 24B stride), 0x23 (+UV1, 32B), 0x27 (+UV2, 40B), 0xA7 (+tangent, 76B).
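The strides quoted for those patterns follow directly from summing the component sizes in the flag table. A small sketch of that arithmetic, with `stride_from_flags` as a hypothetical helper name:

```rust
/// Bytes per MDX vertex record implied by the flags bitmask.
/// Only flag-controlled components are counted here.
fn stride_from_flags(flags: u32) -> usize {
    const COMPONENTS: [(u32, usize); 7] = [
        (0x01, 12), // position
        (0x02, 8),  // UV1
        (0x04, 8),  // UV2
        (0x08, 8),  // UV3
        (0x10, 8),  // UV4
        (0x20, 12), // normal
        (0x80, 36), // tangent space
    ];
    COMPONENTS
        .iter()
        .filter(|&&(bit, _)| flags & bit != 0)
        .map(|&(_, size)| size)
        .sum()
}
```

For instance, 0xA7 selects position, UV1, UV2, normal, and tangent space: 12 + 8 + 8 + 12 + 36 = 76 bytes.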
Note that vertex colours have no flag bit. Their presence is signalled by the per-attribute offset slot being != -1. The 11 offset slots are:
| Slot | Extra offset | Field | Evidence |
|---|---|---|---|
| 0 | +0x104 | position | LightPartTriMesh reads 3×f32, world-transforms |
| 1 | +0x108 | normal | LightPartTriMesh reads 3×f32, rotation only |
| 2 | +0x10C | vertex color | Checked != -1, reads RGB only. Alpha unused. |
| 3 | +0x110 | UV1 | PartTriMesh reads 2×f32 |
| 4 | +0x114 | UV2 | Structural: tverts1 in InternalGenVertices |
| 5 | +0x118 | UV3 | Structural: tverts2 |
| 6 | +0x11C | UV4 | Structural: tverts3 |
| 7 | +0x120 | tangent space | Filled by CalculateTangentSpaceBasis |
| 8–10 | +0x124..+0x12C | reserved | Always -1 across 215 surveyed vanilla meshes |
Vertex colour alpha is unused (confirmed 2026-04-04). LightPartTriMesh reads only bytes [0], [1], [2] (RGB). Byte [3] is stored but never read. The rendered output hardcodes alpha to 0xFF. The fourth byte exists purely for alignment.
Important subtlety: the engine doesn’t trust any of these values on load. InternalPostProcess at 0x0043cf00 recomputes the flags, stride, per-attribute offsets, and mdx_data_offset from scratch, based on which vertex components are actually present in the node’s arrays. It also recomputes vertex normals via edge cross products, and re-derives the bounding box and sphere. The on-disk values preserve the compiler’s original output, but they’re cosmetic from the engine’s perspective.
This has a consequence for tooling: you can largely get away with wrong values in these fields as long as your mesh is otherwise valid, because the engine will fix them up at load time. But a correct writer should still populate them – community tools (kotorblender, mdledit) depend on them, and the BioWare build pipeline does too.
Skin mesh (0x061)
100 extra bytes beyond TriMesh. Skinning data (bone weights, inverse-bind-pose rotation and translation, bone-index mapping) sits here, along with several padding regions:
| Skin offset | Field | Layout | Notes |
|---|---|---|---|
| +0x00 | weights | CExoArrayList | Always zero in binary files. |
| +0x14 | bone_weight_data | ptr | Relocated if count at +0x18 > 0. |
| +0x1C | qbone_ref_inv | CExoArrayList | Inverse-bind rotations. |
| +0x28 | tbone_ref_inv | CExoArrayList | Inverse-bind translations. |
| +0x34 | bone_constant_indices | CExoArrayList | Bone-index remap. |
The weights array deserves a call-out. A 52-byte SkinVertexWeight struct exists and is fully specified by the ASCII parser – 4 bone names, 4 weights, some metadata – but in the binary path, ResetSkin never relocates its pointer, and a corpus scan of all 968 skin nodes across 2832 vanilla models found zero non-empty weights arrays. Binary models store per-vertex bone data exclusively in MDX (via dedicated bone-weight and bone-index offsets), and the weights CExoArray is just a 12-byte zero blob on disk.
AnimMesh (0x0A1)
56 extra bytes. Carries a sample_period scalar and two CExoArrayList fields (anim_verts, anim_t_verts) for time-sampled vertex animation. The remaining six fields (three pointers + three counts + some padding) are runtime-only and zero on disk. Fun fact: no community tool (kotorblender, mdledit, kotormax, reone, xoreos, pykotor) parses AnimMesh nodes – we may have the first structured reader for this type.
Also: ResetAnim is peculiar in that it processes the extra data before calling ResetTriMeshParts, the reverse of every other mesh subtype. There’s no obvious reason for this.
Dangly mesh (0x121)
The simplest mesh subtype, 28 extra bytes. Four fields: a per-vertex constraints CExoArrayList, and three inline floats (displacement, tightness, period) that parameterize the soft-body simulation. A single conditional pointer at the tail is relocated only when the TriMesh vertex count is non-zero.
Dangly meshes are BioWare’s hack for cloth and hair – rigged to the skeleton like a skin mesh, but with simulation parameters that let parts of the geometry lag and swing.
AABB walkmesh (0x221)
4 extra bytes: a single pointer to the root of an AABB tree stored inline in the MDL blob.
The AABB tree is a flattened binary search tree written in DFS preorder. Each node is 40 bytes:
| Offset | Size | Field | Notes |
|---|---|---|---|
| +0x00 | 12 | box_min | 3×f32 AABB minimum corner |
| +0x0C | 12 | box_max | 3×f32 AABB maximum corner |
| +0x18 | 4 | right_child | Content-relative offset (0 = no child) |
| +0x1C | 4 | left_child | Content-relative offset (0 = no child) |
| +0x20 | 4 | face_index | i32. Leaves: ≥ 0. Internal: −1. |
| +0x24 | 4 | split_direction_flags | Axis bitmask: 1=+X, 2=+Y, 4=+Z, 8=−X, 16=−Y, 32=−Z |
Note that right_child comes before left_child in the struct – this is the actual field order, not a typo. Matches Ghidra and the mdledit/mdlops implementations.
Leaf nodes have left = 0, right = 0, face_index ≥ 0, split_direction_flags = 0. Internal nodes have both children non-zero, face_index = -1, and flags computed from the child bounding-box separation. The format is the classic spatial subdivision tree used for fast triangle lookups during pathfinding and collision queries.
ResetAABBTree at 0x004a0260 recurses the tree, relocating each child pointer. It manually unrolls to depth 4 before recursing (the engine’s author was clearly worried about stack depth on a modest C++ compiler).
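To make the 40-byte record concrete, here is a sketch that decodes one AABB node and applies the leaf rule described above. Child fields are kept as raw content-relative offsets. `AabbNode` and its methods are illustrative names, not the engine's or the reader's.

```rust
#[derive(Debug, PartialEq)]
struct AabbNode {
    box_min: [f32; 3],
    box_max: [f32; 3],
    right_child: u32, // right precedes left on disk
    left_child: u32,
    face_index: i32,
    split_direction_flags: u32,
}

impl AabbNode {
    fn parse(raw: &[u8; 40]) -> AabbNode {
        let word = |o: usize| [raw[o], raw[o + 1], raw[o + 2], raw[o + 3]];
        let f = |o: usize| f32::from_le_bytes(word(o));
        AabbNode {
            box_min: [f(0x00), f(0x04), f(0x08)],
            box_max: [f(0x0C), f(0x10), f(0x14)],
            right_child: u32::from_le_bytes(word(0x18)),
            left_child: u32::from_le_bytes(word(0x1C)),
            face_index: i32::from_le_bytes(word(0x20)),
            split_direction_flags: u32::from_le_bytes(word(0x24)),
        }
    }

    /// Leaves have no children and a non-negative face index.
    fn is_leaf(&self) -> bool {
        self.left_child == 0 && self.right_child == 0 && self.face_index >= 0
    }
}
```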
Lightsaber (0x821)
20 extra bytes – small but architecturally notable:
| Saber offset | Field | Notes |
|---|---|---|
| +0x00 | saber vert data | Relocated pointer |
| +0x04 | saber UV data | Relocated pointer |
| +0x08 | saber normal data | Relocated pointer |
| +0x0C | GL vertex pool ID | Runtime-only (set by RequestPool) |
| +0x10 | GL index pool ID | Runtime-only |
Three arrays of exactly 176 vertices each (NUM_SABER_VERTS = 176, confirmed by kotorblender): position, UV, normal. The saber blade is a fixed-topology mesh – BioWare pre-baked the geometry as a flexible band that can be animated by swinging the endpoint controllers.
Unlike Skin/Dangly/AnimMesh, the saber uses the base TriMesh gen_vertices and remove_temporary_array callbacks. Its geometry doesn’t morph dynamically at the vertex-processing level – the animation is in the controller track.
Controllers and animation
The controller header
Controllers are the keyframe-animation primitive. Each node has an array of 16-byte NewController headers (at node +0x38) plus a shared pool of float data (at +0x44). Each header describes one animatable property of that node:
| Offset | Size | Field | Notes |
|---|---|---|---|
| +0x00 | u32 | type_code | Byte offset of the target property in the Part struct. |
| +0x04 | i16 | supermodel_link | Additive-blending property offset; -1 = no blending. |
| +0x06 | u16 | row_count | Number of keyframes. |
| +0x08 | u16 | time_data_offset | Float-array index for time values. |
| +0x0A | u16 | data_offset | Float-array index for value data. |
| +0x0C | u8 | value_type_and_flags | Low nibble: 1=float, 2/4=quaternion, 3=vector. Bit 4=0x10=Bezier. |
| +0x0D | 3 | padding | Alignment to 16 bytes. Never read. |
The type_code is elegant: it’s literally the byte offset into the Part struct where the animated value lives. NewController::Control dereferences it as *(float*)(part_ptr + type_code). So type_code = 8 means “position” because position sits at Part+0x08; type_code = 20 means “orientation” because orientation sits at Part+0x14 (as a compressed axis-angle quaternion); and so on. This collapses what would otherwise be a switch over property IDs into direct pointer arithmetic.
The value_type_and_flags byte at +0x0C has a compound encoding that bit us hard early on:
- Low nibble (& 0x0F) – value-type discriminator: 1=float, 2 or 4=quaternion, 3=vector. Selects the interpolation path (Lerp/Slerp/VectorLerp).
- High nibble (& 0xF0) – flags. 0x10 signals Bezier interpolation, which triples the per-keyframe value count (each keyframe is value + in-tangent + out-tangent).
- Special case: for orientation controllers (type code 20) with raw byte value == 2, the keyframe is a compressed quaternion packed into a single u32, not two f32 values.
The low nibble happens to coincide with the “number of floats per keyframe row” for simple cases (1, 3, 4), which is why the earlier interpretation of this byte as column_count mostly worked – until it didn’t. See the controller bug below.
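Putting the three rules together, a reader can compute how many 32-bit slots each keyframe row occupies in the controller data array. The sketch below assumes the low nibble equals the per-row float count, which the text only establishes for the simple cases 1, 3, and 4; treat any other nibble value as unverified. `slots_per_row` is a hypothetical helper name.

```rust
const ORIENTATION: u32 = 20;
const BEZIER_FLAG: u8 = 0x10;

/// Number of 32-bit slots consumed per keyframe row in the data array.
/// Assumption: the low nibble is the float count for nibbles 1, 3 and 4.
fn slots_per_row(type_code: u32, raw: u8) -> usize {
    // Special case: compressed quaternion packed into a single u32.
    if type_code == ORIENTATION && raw == 2 {
        return 1;
    }
    let base = (raw & 0x0F) as usize;
    if raw & BEZIER_FLAG != 0 {
        base * 3 // value + in-tangent + out-tangent
    } else {
        base
    }
}
```

With this, a Bezier position controller (raw byte 0x13) yields 9 slots per row rather than 19, and a compressed orientation row yields 1.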
Self-describing rows
Because value_type_and_flags is inline in each controller header, the binary format is entirely self-describing for animation data. The reader doesn’t need a lookup table mapping type codes to column counts – it reads the flags byte and knows how many floats to consume per row.
This is useful because vanilla K1 contains controller type codes (0x68, 0x188) that aren’t documented in any community reference. Trying to parse these with a closed enum caused 517 of 2832 vanilla MDLs (18.3%) to fail. MdlControllerType is therefore a newtype struct MdlControllerType(u32) with named constants for the three universally-confirmed base types (POSITION = 8, ORIENTATION = 20, SCALE = 36) and accepts any other u32 losslessly.
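The open newtype described here might look roughly like the following. This is a sketch of the pattern, not a copy of the real definition: the derive list, the field visibility, and the `base_name` helper are assumptions made for illustration.

```rust
/// Open set of controller type codes: any u32 round-trips losslessly.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct MdlControllerType(pub u32);

impl MdlControllerType {
    pub const POSITION: Self = Self(8);
    pub const ORIENTATION: Self = Self(20);
    pub const SCALE: Self = Self(36);

    /// Name of a universally confirmed base type, if this is one.
    /// (Hypothetical helper, not part of the documented API.)
    pub fn base_name(self) -> Option<&'static str> {
        match self {
            Self::POSITION => Some("position"),
            Self::ORIENTATION => Some("orientation"),
            Self::SCALE => Some("scale"),
            _ => None,
        }
    }
}
```

Unlike a closed enum, `MdlControllerType(0x68)` is a perfectly valid value here; it simply has no name.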
Base vs type-specific controllers
Three controllers are universal – they exist on every node type:
| ASCII name | Code | Columns | Meaning |
|---|---|---|---|
| position | 8 | 3 | x, y, z |
| orientation | 20 | 4 | x, y, z, angle (compressed axis-angle) |
| scale | 36 | 1 | uniform scale factor |
Type-specific codes live at higher numbers: light controllers start at 76 (color), emitter controllers are at 88+. All three base codes also support a Bezier variant (signalled by the flag bit, not a separate type code).
The MDX file: a mystery
Now for the strangest part of the format.
The MDX file contains interleaved vertex data – positions, normals, UVs, tangent space, colours – packed into records of width given by the mesh vertex_stride field, aligned into per-mesh blocks with sentinel-float terminators separating them. It looks exactly like what you’d expect a GPU vertex buffer to look like.
And the K1 engine never reads it.
Here’s the complete trace through InputBinary::Read:
- Read the MDX file into a buffer (pbVar9).
- Call Reset(mdl_content, mdx_content, resource).
- Reset passes mdx_content as param_3 through a chain of function calls (ResetMdlNode, ResetTriMeshParts, …). Every downstream function has param_3 as a formal parameter.
- param_3 is never used. In ResetTriMeshParts, it’s literally overwritten as a loop counter on line 67.
- Back in InputBinary::Read, line 78: _free(pbVar9). The MDX buffer is freed.
At no point does any vertex-related code path consume MDX data. InternalGenVertices builds vertex buffers from verts_arrays, which lives in the MDL content blob. ProcessVerts recomputes normals from geometry. LightPartTriMesh reads from the GL pool populated at +0xAC of the model header – which is sourced from the MDL content, not the MDX file.
So where does the vertex data actually come from? From a parallel set of position-only arrays stored inside the MDL content blob, pointed to by vert_array_offset at mesh +0x148 (content-relative), with additional UV/colour/normal data in the MdlNodeTriMeshVertArrays structures.
The MDX file, in short, is a redundant interleaved copy of data that the K1 engine could reconstruct from the MDL alone. Most likely theories for why it exists:
- Build-pipeline artifact. BioWare’s Aurora engine (Neverwinter Nights) may have used the MDX format directly, and the K1 pipeline inherited the file-layout convention without the consuming code path.
- Toolset requirement. Third-party editors and the BioWare toolset itself may still parse MDX for authoring workflows.
- ResetLite path. There’s a separate “lightweight” loader (InputBinary::ResetLite at 0x004a11b0) that may use MDX for a reduced in-memory representation – unverified.
For our purposes, this has two consequences:
- Engine-functional MDX is near-trivial. Any MDX file the K1 engine happily ignores is a valid MDX file. You could write all zeros and the game would run.
- Round-trip-accurate MDX requires the per-mesh terminator convention (described next), because community tools do read MDX, and byte-identical round-trip is a useful correctness check.
Per-mesh terminators and alignment
Empirically, vanilla MDX files are larger than sum(vertex_count × stride). Across 2832 vanilla K1 models, 2445 have MDX files with excess bytes, totalling 3,278,456 bytes corpus-wide.
The excess has structure. After each mesh’s vertex data, there’s a terminator row of exactly one stride’s worth of bytes, beginning with three sentinel floats and padded with zeros:
| Mesh type | Sentinel value | Hex (f32 LE) |
|---|---|---|
| Non-skin (type & 0x40 == 0) | 10,000,000.0 | 80 96 18 4B |
| Skin (type & 0x40 != 0) | 1,000,000.0 | 00 24 74 49 |
Corpus sentinel detection: 6,973 non-skin sentinels, 6 skin sentinels, 0 unknown patterns.
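A writer that wants to reproduce this convention needs to emit one such row per mesh. The sketch below builds a terminator row from the node type and stride, selecting the sentinel by the skin bit. Function and constant names are hypothetical, and the refusal of strides below 12 bytes is a defensive choice of this sketch rather than documented engine behaviour.

```rust
const NON_SKIN_SENTINEL: f32 = 10_000_000.0;
const SKIN_SENTINEL: f32 = 1_000_000.0;
const SKIN_BIT: u16 = 0x0040;

/// Build one terminator row: three sentinel floats, zero-padded to `stride`.
/// Returns None if the stride cannot hold the three floats.
fn terminator_row(node_type: u16, stride: usize) -> Option<Vec<u8>> {
    if stride < 12 {
        return None;
    }
    let sentinel = if node_type & SKIN_BIT != 0 {
        SKIN_SENTINEL
    } else {
        NON_SKIN_SENTINEL
    };
    let mut row = vec![0u8; stride];
    // Fill the first 12 bytes with three little-endian copies of the sentinel.
    for chunk in row[..12].chunks_exact_mut(4) {
        chunk.copy_from_slice(&sentinel.to_le_bytes());
    }
    Some(row)
}
```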
Between meshes, the cursor is padded to the next 16-byte boundary. The last mesh has no trailing alignment:
    cursor = 0
    for each mesh in MDX order:
        cursor += vertex_count × stride   # vertex data
        cursor += stride                  # terminator row
        if not last mesh:
            cursor = (cursor + 15) & ~15  # 16-byte alignment
    mdx_file_size = cursor
For stride-24 meshes, the gap between meshes is either 24 or 32 bytes depending on current alignment. For stride-32 and stride-64 meshes, it’s always exactly stride because the stride is already a multiple of 16.
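The same size calculation can be expressed as runnable Rust, which is handy as a cross-check when validating a writer against vanilla file sizes. The input here is a list of (vertex_count, stride) pairs in MDX order; `mdx_file_size` is an illustrative name.

```rust
/// Expected MDX file size for meshes given as (vertex_count, stride) pairs
/// in MDX order.
fn mdx_file_size(meshes: &[(usize, usize)]) -> usize {
    let mut cursor = 0usize;
    for (i, &(vertex_count, stride)) in meshes.iter().enumerate() {
        cursor += vertex_count * stride; // vertex data
        cursor += stride; // terminator row
        if i + 1 != meshes.len() {
            cursor = (cursor + 15) & !15; // pad to the next 16-byte boundary
        }
    }
    cursor
}
```

As a worked case, a 2-vertex stride-24 mesh followed by a 1-vertex stride-32 mesh gives 48 + 24 = 72, padded to 80, then 32 + 32 more, for 144 bytes in total.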
Mesh ordering in MDX
Non-skin meshes come first, then skin meshes. Within each group, the order is DFS-traversal-of-the-tree – mostly. About 27% of vanilla models exhibit a compiler-specific permutation that defers “second children” of paired parents until after all their siblings’ first children. This is reproducible for our own output (if we write DFS, we read DFS), but not for byte-identical round-trip of every BioWare file.
Writing in standard DFS order (non-skin first, skin second) produces semantically identical MDX data with the correct total size. 1784 of 2444 models match byte-for-byte; the remaining 660 have the non-standard compiler ordering.
What this means for mdx_data_offset
The mesh header has two adjacent u32 fields at +0x144 and +0x148:
- +0x144 mdx_data_offset: per-mesh byte offset into the MDX file. Used by community tools to seek directly to that mesh’s vertex block. The engine also uses this after InternalPostProcess overwrites it with a GL-pool offset.
- +0x148 vert_array_offset: content-relative pointer to the position-only vertex data embedded in the MDL content blob. Used by the engine during load. Relocated by ResetTriMeshParts via param_1->field60_0x198 = param_2 + param_1->field60_0x198 – where param_2 is the MDL content base, not the MDX base.
These two fields were conflated under a single MDX_OFFSET = 0x148 constant in our implementation for several months, which caused the reader to lose the MDX offset entirely and the writer to overwrite the content pointer with an MDX offset. Full story in War stories.
Face layout
Faces are 32-byte records (MaxFace) stored in the TriMesh faces CExoArray:
| Offset | Size | Field | Type | Notes |
|---|---|---|---|---|
| +0x00 | 12 | plane_normal | 3×f32 | Face plane normal. |
| +0x0C | 4 | plane_distance | f32 | Plane equation: n·p = d. |
| +0x10 | 4 | surface_id | u32 | Walkability / material identifier. |
| +0x14 | 6 | adjacent | 3×u16 | Indices of adjacent faces (for AABB/pathfinding). |
| +0x1A | 6 | vertex_indices | 3×u16 | Triangle vertex indices. |
The plane normal and distance are pre-computed by the BioWare toolset. They can be re-derived from the geometry but the binary format preserves them. The adjacency graph is what makes AABB walkmesh lookups fast – each triangle points to its neighbours, enabling constant-time stepping during pathfinding.
An early version of our reader assumed 12-byte faces (just the vertex indices). This led to every 2.67th “face” being interpreted from garbage bytes belonging to the next face’s plane normal. It was masked by synthetic round-trip tests – write wrong, read wrong, match! – and only caught when vanilla-file validation found vertex indices exceeding the mesh’s vertex count.
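The face table above can be sketched as a fixed-stride decoder with the same bounds check that caught the bug. This is an illustrative reader, not the project’s actual code: it assumes little-endian fields and a byte slice that holds only face records, and the names Face and parse_faces are hypothetical.

```rust
/// Size of one MaxFace record in bytes.
pub const FACE_SIZE: usize = 32;

/// One decoded face record (hypothetical type for illustration).
#[derive(Debug, Clone, PartialEq)]
pub struct Face {
    pub plane_normal: [f32; 3],
    pub plane_distance: f32,
    pub surface_id: u32,
    pub adjacent: [u16; 3],
    pub vertex_indices: [u16; 3],
}

fn u16_at(b: &[u8], at: usize) -> u16 {
    u16::from_le_bytes([b[at], b[at + 1]])
}

fn u32_at(b: &[u8], at: usize) -> u32 {
    u32::from_le_bytes([b[at], b[at + 1], b[at + 2], b[at + 3]])
}

fn f32_at(b: &[u8], at: usize) -> f32 {
    f32::from_bits(u32_at(b, at))
}

/// Decodes 32-byte face records and rejects out-of-range vertex indices.
pub fn parse_faces(data: &[u8], vertex_count: usize) -> Result<Vec<Face>, String> {
    if data.len() % FACE_SIZE != 0 {
        return Err(format!("face data length {} is not a multiple of 32", data.len()));
    }
    let mut faces = Vec::with_capacity(data.len() / FACE_SIZE);
    for (i, rec) in data.chunks_exact(FACE_SIZE).enumerate() {
        let face = Face {
            plane_normal: [f32_at(rec, 0x00), f32_at(rec, 0x04), f32_at(rec, 0x08)],
            plane_distance: f32_at(rec, 0x0C),
            surface_id: u32_at(rec, 0x10),
            adjacent: [u16_at(rec, 0x14), u16_at(rec, 0x16), u16_at(rec, 0x18)],
            vertex_indices: [u16_at(rec, 0x1A), u16_at(rec, 0x1C), u16_at(rec, 0x1E)],
        };
        // The check that exposed the 12-byte stride bug on vanilla files.
        if let Some(&bad) = face.vertex_indices.iter().find(|&&v| v as usize >= vertex_count) {
            return Err(format!("face {}: vertex index {} >= vertex count {}", i, bad, vertex_count));
        }
        faces.push(face);
    }
    Ok(faces)
}
```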
War stories and implementation history
A brief chronicle of the bugs found while building the Rust reader/writer, because the “how we know this” is often as useful as the “what we know”.
The 12-byte face bug
Described above. The MaxFace stride is 32 bytes, not 12. Caught by vertex-index bounds checking against vanilla files.
Mesh header size corrections
The whole mesh extra-header was misunderstood for a long time. A sample of the corrections, all fixed in late February 2026:
- VERTEX_COUNT offset was 0x9E → actually 0x130
- MDX_OFFSET was 0xB8 → actually two separate fields at 0x144 and 0x148
- VERTEX_STRUCT_SIZE was 0xBC → actually 0xFC
- MESH_EXTRA_SIZE was 200 bytes → actually 332 (0x14C)
- RENDER boolean was missing entirely → added at 0x139
- SHADOW boolean was missing entirely → added at 0x137
All of these stemmed from extrapolating offsets from partial hex dumps rather than decompiling the struct. Ghidra’s MdlNodeTriMesh struct definition settled the whole thing – once the Ghidra type was aligned, the field offsets fell out directly.
Controller column-count encoding
Our reader initially used the raw value_type_and_flags byte (at controller +0x0C) directly as a float count per row. This worked for the common case (position=3, orientation=4, scale=1) but broke in two scenarios:
- Bezier controllers set bit 0x10, turning raw=3 (Bezier position) into a byte value of 0x13 = 19 columns, not 9.
- Integral orientation: ORIENTATION controllers with raw byte == 2 mean “compressed quaternion packed into one u32 per row”, not “2 f32 values per row”.
The integral-orientation case was the more painful bug: a c_dewback scan showed 876 integral-orientation controllers; c_rancor had 1,212. Reading 2 floats instead of 1 consumed double the expected data, desynchronizing every subsequent controller in the data array. Every node’s animation after the first compressed-quaternion keyframe was reading from a shifted window of garbage.
Fix: decode the raw byte with & 0x0F masking plus the two special cases (Bezier multiplies by 3; integral orientation uses 1 u32 per row regardless). The raw byte is preserved in a raw_column_count field for round-trip fidelity.
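The decoding rule can be sketched as follows. This is an illustration under stated assumptions, not the project’s actual code: the caller is assumed to already know whether the controller is an ORIENTATION controller (passed as a boolean here, since the type code is not given in this section), and the names RowLayout and decode_row_layout are hypothetical.

```rust
/// Bit set in the raw byte by Bezier controllers.
const BEZIER_FLAG: u8 = 0x10;

/// How one keyframe row is laid out in the controller data array.
/// (Hypothetical type for illustration.)
#[derive(Debug, PartialEq, Eq)]
pub enum RowLayout {
    /// This many f32 values per row.
    Floats(usize),
    /// One u32 per row holding a compressed quaternion.
    PackedQuaternion,
}

/// Decodes the raw column byte found at controller +0x0C.
pub fn decode_row_layout(raw: u8, is_orientation: bool) -> RowLayout {
    // Special case: integral orientation is one packed u32, not two floats.
    if is_orientation && raw == 2 {
        return RowLayout::PackedQuaternion;
    }
    let base = (raw & 0x0F) as usize;
    if raw & BEZIER_FLAG != 0 {
        // Bezier rows carry three values per base column.
        RowLayout::Floats(base * 3)
    } else {
        RowLayout::Floats(base)
    }
}
```

With this rule, the raw byte 0x13 decodes to 9 floats per row rather than 19, and a raw value of 2 on an orientation controller consumes a single u32.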
Animation node_number at +0x02
The 80-byte node header’s first 8 bytes are type_flags (u16), node_number (u16), name_index (u16), padding (u16). Our offset map had NODE_ID = 0x04, which pointed to name_index, not node_number.
For animation nodes specifically, node_number is the engine’s key for matching animation keyframe nodes to their geometry-side skeleton bones. Writing zeros at +0x02 and stuffing the name_index at +0x04 meant every animation node had node_number = 0, so every keyframe targeted the root bone. Visually: characters froze in T-pose with no skeletal motion whatsoever.
Fix: read node_number from +0x02 explicitly, and read name_index from +0x04, resolving it through the name table.
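A small sketch of reading the first 8 bytes of the node header with each field at its own offset. It assumes little-endian u16 fields; the names NodeHeaderPrefix and read_node_prefix are hypothetical.

```rust
/// First 8 bytes of the 80-byte node header (hypothetical type for illustration).
#[derive(Debug, PartialEq, Eq)]
pub struct NodeHeaderPrefix {
    pub type_flags: u16,  // +0x00
    pub node_number: u16, // +0x02, matches animation nodes to geometry nodes
    pub name_index: u16,  // +0x04, index into the name table
}

fn u16_at(b: &[u8], at: usize) -> u16 {
    u16::from_le_bytes([b[at], b[at + 1]])
}

/// Reads the prefix from a slice that starts at the node header.
pub fn read_node_prefix(header: &[u8]) -> Option<NodeHeaderPrefix> {
    if header.len() < 8 {
        return None;
    }
    Some(NodeHeaderPrefix {
        type_flags: u16_at(header, 0x00),
        node_number: u16_at(header, 0x02),
        name_index: u16_at(header, 0x04),
        // +0x06 is padding and is ignored here.
    })
}
```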
MDX per-mesh seeking
Our MDX reader used a cumulative cursor assuming non-skin-first DFS ordering. For the ~51% of vanilla models where MDX layout doesn’t match that assumption, vertex data was assigned to the wrong mesh nodes. Self-round-trip tests couldn’t detect this – we were reading and writing the same wrong assignment, which is a consistency check for the tool’s own output, not for correctness against vanilla.
Fix: seek to info.mdx_data_offset (the +0x144 field) for each mesh, matching kotorblender and mdledit behaviour. The cumulative-cursor logic remains in the writer, which produces its own layout and backpatches the offset field; the reader trusts whatever the file says.
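A sketch of the per-mesh seek on the reader side. It assumes the whole MDX file is already in memory and that a mesh’s block spans vertex count times vertex stride bytes; both that size rule and the function name mesh_vertex_block are assumptions made for illustration.

```rust
/// Returns the byte range of one mesh's vertex block inside the MDX data,
/// located by the offset stored in the file rather than by a running cursor.
/// Returns None if the range falls outside the file or overflows.
pub fn mesh_vertex_block(
    mdx: &[u8],
    mdx_data_offset: u32,
    vertex_count: usize,
    vertex_stride: usize,
) -> Option<&[u8]> {
    let start = mdx_data_offset as usize;
    let len = vertex_count.checked_mul(vertex_stride)?;
    let end = start.checked_add(len)?;
    mdx.get(start..end)
}
```

Because each mesh is located independently, the order in which meshes appear in the MDX file no longer affects which vertex data a node receives.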
Name-table dead entries
220 vanilla K1 models have name tables containing entries that no node references. These turn out to be walkmesh node names (*_wok, *_pwk, *_dwk variants) from BioWare’s build pipeline, which apparently shared a single name table across the MDL and WOK outputs.
The engine only performs indexed lookups via name_index; it never iterates the full table or validates the count. Extra entries are harmless dead weight.
Decision: not preserved. Our writer builds the name table from the node tree (matching kotorblender and mdledit), producing files that are functionally identical but 20–80 bytes shorter. This is a known, benign size delta – not a parity bug.
Emitter controller code verification
All 48 emitter controller type codes were independently verified against the engine binary via Ghidra. For each, we located the __stricmp call for the ASCII field name and traced the controller type value stored on match. Every code matched mdledit’s ReturnControllerName table exactly – no additions, no omissions.
One naming correction: the engine’s canonical string for code 200 is "lightningZigzag" (camelCase Z). mdledit has "lightningzigzag" (all lowercase). Functionally identical because the engine uses __stricmp (case-insensitive), but the engine’s capitalization is now what we emit.
Corpus validation status
As of 2026-02-24: 2832/2832 (100%) structural round-trip success (parse → write → parse → compare). This was achieved after fixing three comparison issues in the test harness:
- NaN ≠ NaN (IEEE 754): 1559 false failures – floats containing NaN don’t equal themselves. Fixed with bitwise f32::to_bits() comparison.
- Parent index ordering: 135 mismatches from depth-first vs. original node ordering. The binary format preserves node ordering but our parent-index reconstruction uses DFS. Semantically equivalent, numerically different – skipped in comparison.
- Face NaN values: exactly one model (w_dblsbr_001) has NaN in its pre-computed plane_normal/plane_distance, because one of its faces is degenerate. Round-trips correctly once NaN-aware comparison is used.
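The NaN-aware comparison can be sketched with f32::to_bits, which is part of the Rust standard library. The helper names below are hypothetical and are not the test harness’s actual functions.

```rust
/// Compares two floats by bit pattern, so a NaN equals an identical NaN.
pub fn f32_bits_eq(a: f32, b: f32) -> bool {
    a.to_bits() == b.to_bits()
}

/// Compares two float slices element by element using bit patterns.
pub fn f32_slices_bits_eq(a: &[f32], b: &[f32]) -> bool {
    a.len() == b.len() && a.iter().zip(b).all(|(x, y)| x.to_bits() == y.to_bits())
}
```

Bitwise comparison is stricter than ==: it also distinguishes +0.0 from -0.0 and NaNs with different payloads, which suits a round-trip check where the written bytes should match the bytes that were read.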
Byte-level MDL/MDX equality is a separate target – 1784 of 2444 MDX files match byte-for-byte, with the remaining 660 showing the non-standard BioWare compiler traversal discussed earlier.
Appendix
Emitter field map
304 bytes total (80 base + 224 extra). Emitter-specific data:
| Node offset | Extra offset | Field | Type |
|---|---|---|---|
| +0x50 | +0x00 | deadspace | f32 |
| +0x54 | +0x04 | blast_radius | f32 |
| +0x58 | +0x08 | blast_length | f32 |
| +0x5C | +0x0C | num_branches | i32 |
| +0x60 | +0x10 | control_pt_smoothing | i32 |
| +0x64 | +0x14 | x_grid | i32 |
| +0x68 | +0x18 | y_grid | i32 |
| +0x6C | +0x1C | spawn_type | i32 |
| +0x70 | +0x20 | update | char[32] |
| +0x90 | +0x40 | render | char[32] |
| +0xB0 | +0x60 | blend | char[32] |
| +0xD0 | +0x80 | texture | char[32] |
| +0xF0 | +0xA0 | chunk_name | char[16] |
| +0x100 | +0xB0 | two_sided_tex | i32 |
| +0x104 | +0xB4 | loop | i32 |
| +0x108 | +0xB8 | render_order | u16 |
| +0x10A | +0xBA | frame_blending | u8 |
| +0x10B | +0xBB | depth_texture_name | char[16] |
| +0x11B | +0xCB | (reserved) | 21 bytes |
LOD naming convention
When a model has cullWithLOD set, the engine searches for LOD variants by appending suffixes to the model name:
- <name>_x – medium LOD
- <name>_z – far LOD

Both are loaded via FindModel(name + "_x") and FindModel(name + "_z") as separate Model instances linked to the primary. This is not relevant to format parsing, but it is useful for model validation and lint rules.
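For a lint rule, the candidate LOD names can be generated from the primary model name. A trivial sketch, with a hypothetical function name:

```rust
/// Returns the medium and far LOD model names for a primary model name,
/// using the "_x" and "_z" suffixes described above.
pub fn lod_variant_names(name: &str) -> [String; 2] {
    [format!("{}_x", name), format!("{}_z", name)]
}
```

A validator could check whether both names resolve whenever a model has cullWithLOD set.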
Resource type IDs
| Format | Resource type |
|---|---|
| MDL | 2002 (0x7D2) |
| MDX | 3008 (0xBC0) |
These map to the KEY/BIF resource type system. CAuroraInterface::RequestModel at 0x0070d8d0 resolves models through a sorted requestedModelList.
Dynamic type casts
The engine exposes As* functions for type-checked downcasts. Caller counts indicate runtime usage frequency:
| Function | Callers |
|---|---|
| AsModel | 34 |
| AsMdlNodeTriMesh | 14 |
| AsMdlNodeEmitter | 11 |
| AsAnimation | 7 |
| AsMdlNodeLightsaber | 5 |
| AsMdlNodeSkin | 4 |
| AsMdlNodeAABB | 3 |
| AsMdlNodeDanglyMesh | 3 |
| AsMdlNodeLight | 3 |
| AsMdlNodeAnimMesh | 2 |
| AsMdlNodeCamera | 2 |
| AsMdlNodeReference | 2 |
TriMesh (14) and Emitter (11) are the most-queried node types – useful signal for prioritizing implementation completeness.
Binary MDL call graph
For reference when reading Ghidra decompilations:
NewCAurObject (0x00449cc0)
└── FindModel (0x00464110) [by name; checks cache via BinarySearchModel]
    └── LoadModel (0x00464200) [on cache miss]
        └── IODispatcher::ReadSync (0x004a15d0)
            └── Input::Read (0x004a14b0) ← format dispatcher
                ├── InputBinary::Read (0x004a1260) if first_byte == 0x00
                │   └── Reset / ResetLite (pointer rewriting)
                │       ├── ResetMdlNode (per-node dispatch)
                │       │   ├── ResetMdlNodeParts (base fields)
                │       │   ├── ResetTriMesh (mesh subtypes)
                │       │   ├── ResetLight (light extras)
                │       │   ├── ResetSkin, ResetAnim, ...
                │       │   └── ResetAABBTree (recursive tree walk)
                │       └── ResetAnimation (per-animation)
                └── FuncInterp loop otherwise (ASCII MDL)
└── CreateInstanceTreeR (0x00449200) [builds runtime Part tree from MdlNode tree]
Key Ghidra addresses
For anyone continuing this archaeology, the foundation set of function addresses in swkotor.exe (K1 GOG build):
| Function | Address |
|---|---|
| Input::Read | 0x004a14b0 |
| InputBinary::Read | 0x004a1260 |
| InputBinary::Reset | 0x004a1030 |
| InputBinary::ResetMdlNode | 0x004a0900 |
| InputBinary::ResetMdlNodeParts | 0x004a0b60 |
| InputBinary::ResetTriMeshParts | 0x004a0c00 |
| InputBinary::ResetAABBTree | 0x004a0260 |
| InputBinary::ResetLight | 0x004a05e0 |
| InputBinary::ResetSkin | 0x004a01b0 |
| InputBinary::ResetDangly | 0x004a0100 |
| InputBinary::ResetAnim | 0x004a0060 |
| InputBinary::ResetLightsaber | 0x004a0460 |
| InputBinary::ResetAnimation | 0x004a0fb0 |
| MdlNodeTriMesh::InternalPostProcess | 0x0043cf00 |
| MdlNodeTriMesh::InternalGenVertices | 0x00439df0 |
| MdlNodeTriMesh::InternalParseField | 0x004658b0 |
| MdlNodeEmitter::InternalParseField | 0x004658b0 |
| MdlNodeEmitter::InternalCreateInstance | 0x0049d5c0 |
| PartTriMesh::GetMinimumSphere | 0x00443330 |
| LightPartTriMesh | 0x0046a9e0 |
| NewController::Control | 0x00483330 |
| NewController::GetFloatValue | 0x00482bf0 |
| Model constructor | 0x0044aa70 |
| MaxTree constructor | 0x0044a900 |
| ParseNode | 0x004680e0 |
| Node type flag table | 0x00740a18 |