The MDL/MDX Format

BioWare’s Aurora/Odyssey engine stores 3D models in a pair of files: .mdl and .mdx. This page documents what’s inside them, how the engine consumes them, and – occasionally – why they look the way they do. Evidence throughout is drawn from Ghidra decompilation of swkotor.exe (K1 GOG build), cross-checked against hex dumps of vanilla assets and community references (kotorblender, mdledit, mdlops, pykotor, reone, xoreos).

Overview

At a glance:

Property	Value
Extensions	`.mdl`, `.mdx`
Magic	Binary: first `u32 == 0`. ASCII: text (`filedependancy`, `newmodel`, …)
Type	Hierarchical scene graph + animation + vertex data
Resource type ID	`2002` (MDL), `3008` (MDX) in KEY/BIF
Rust reference	`rakata_formats::Mdl`

A model is a tree of nodes. Each node carries a transform (position + orientation), an animation track (“controllers”), and – depending on its type – geometry, light parameters, particle-emitter configuration, a skinning skeleton, a lightsaber blade, and so on. One MDL file can carry multiple named animations that operate on that tree.

The surprising shape of the format only makes sense once you understand one design choice, so let’s start there.

The core idea: load-and-fixup

The binary MDL is not a parsed format in the usual sense. The engine does not walk a byte stream field by field, calling read_u32, read_string, read_float. Instead, it does this:

Allocate a buffer exactly the size of the model data.
Copy the whole file into that buffer in one memcpy.
Walk the now-in-memory structure and convert relative offsets into absolute pointers.

That’s it. The “parser” is a pointer rewriter. Every Reset* function you’ll see in the engine (InputBinary::Reset, ResetMdlNode, ResetTriMeshParts, …) takes a buffer base pointer and a struct pointer, and its job is essentially struct->field += base for every relocatable pointer in the struct, recursing into children as it goes.
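
A safe reader on a 64-bit host cannot store real pointers in 4-byte fields, so the closest equivalent of the fixup pass is to resolve each stored offset into a bounds-checked slice of the blob. The sketch below illustrates that idea only; `read_u32` and `resolve` are hypothetical helper names, not engine functions or part of `rakata_formats`.

```rust
/// Read a little-endian u32 at byte position `at`, or None if it runs past the blob.
fn read_u32(blob: &[u8], at: usize) -> Option<u32> {
    let b = blob.get(at..at.checked_add(4)?)?;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

/// Follow the blob-relative offset stored at `field_at`.
/// Where the engine would do `field += base`, this returns the part of the
/// blob that starts at the target, and None if the offset is out of range.
fn resolve(blob: &[u8], field_at: usize) -> Option<&[u8]> {
    let offset = read_u32(blob, field_at)? as usize;
    blob.get(offset..)
}
```

Returning `None` instead of an address is where the validation half of "validation + relocation" comes in: a corrupt offset is rejected rather than dereferenced.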

An analogy: think of IKEA instructions that say “screw part A into the hole next to part B” rather than giving exact millimetre coordinates. The instructions are valid anywhere you choose to assemble the furniture. The MDL blob works the same way: every pointer is expressed relative to the blob’s origin, so the engine can drop the blob anywhere in memory and then do a one-time pass to convert those relative offsets to real addresses.

This design choice ripples through everything:

On-disk layout matches in-memory layout exactly. If a MdlNodeTriMesh is 412 bytes in RAM, it’s 412 bytes on disk. Struct field offsets you see in a Ghidra decompilation are the file offsets.
Binary files are architecture-bound. This format is a snapshot of a specific compiler’s struct layout on 32-bit Windows. Field alignment, pointer size (4 bytes), endianness (little), and even padding bytes all match that ABI.
“Parsing” is really validation + relocation. A Rust reader doesn’t need to convert a byte stream into a Rust struct; it needs to interpret a memory image as a struct overlay, following pointers to walk the tree.
The engine never writes binary MDL. The shipping engine only has code to emit ASCII MDL. Binary MDL is produced exclusively by BioWare’s model compiler (a build-time tool). The runtime reads it but never round-trips it.

With that frame in place, the rest of the format falls into shape.

File structure

The 12-byte wrapper

The file begins with a tiny header:

Offset	Type	Field	Notes
+0x00	u32	zero marker	Always `0`. Used to tell binary from ASCII.
+0x04	u32	MDL content size	Bytes of model data that follow.
+0x08	u32	MDX file size	Size of the accompanying `.mdx` file.

Input::Read at 0x004a14b0 is the dispatcher: it peeks at the first byte, and if it’s \0 the file is binary (the first u32 is always zero). Otherwise the file starts with ASCII tokens like filedependancy or newmodel, and processing hands off to a line-based interpreter.

For binary files, InputBinary::Read at 0x004a1260 does the rest:

Record mdl_content_size and mdx_file_size from the wrapper.
Allocate a heap buffer the size of the MDL content; memcpy the model data into it.
If MDX size is non-zero, allocate a second buffer and memcpy the MDX file into it.
Call Reset(mdl_buf, mdx_buf, resource_handle).

Note: the wrapper is not part of the model data. Byte 12 of the on-disk file is byte 0 of the in-memory MDL blob. All internal offsets are relative to the in-memory origin.
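
As a rough illustration of the wrapper handling described above, the following sketch checks the zero marker, reads the two sizes, and slices out the model data so that index 0 of the slice is byte 12 of the file. The `Wrapper` struct and `parse_wrapper` function are hypothetical names for this example, and the little-endian reads assume the 32-bit Windows layout discussed earlier.

```rust
/// The two size fields of the 12-byte wrapper, plus the model data that follows it.
#[derive(Debug, PartialEq)]
struct Wrapper<'a> {
    mdl_size: u32,
    mdx_size: u32,
    /// Byte 0 of this slice is byte 12 of the on-disk file.
    content: &'a [u8],
}

fn le_u32(b: &[u8], at: usize) -> u32 {
    u32::from_le_bytes([b[at], b[at + 1], b[at + 2], b[at + 3]])
}

/// Returns None for ASCII input (non-zero marker) or a truncated file.
fn parse_wrapper<'a>(file: &'a [u8]) -> Option<Wrapper<'a>> {
    if file.len() < 12 || le_u32(file, 0) != 0 {
        return None;
    }
    let mdl_size = le_u32(file, 4);
    let mdx_size = le_u32(file, 8);
    let end = 12usize.checked_add(mdl_size as usize)?;
    let content = file.get(12..end)?;
    Some(Wrapper { mdl_size, mdx_size, content })
}
```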

Three kinds of pointer

Inside the MDL blob you’ll encounter three distinct flavours of “pointer”, which is worth keeping straight:

MDL-relative offsets – the vast majority. Relocated to absolute pointers by Reset* functions. On re-serialization, they must be rewritten back to relative offsets.
MDX-file byte offsets – used by a few fields (e.g. per-mesh mdx_data_offset at +0x144) to locate vertex data in the separate MDX file.
String pointers – MDL-relative like the first kind, but they point into a string table at the end of the blob. The name-offsets array at model +0xB8 points into the same table.

Confusingly, there are two similarly named fields on each mesh node: mdx_data_offset at +0x144 (an MDX file offset) and vert_array_offset at +0x148 (a content-relative pointer to embedded position data). Conflating these produced one of the nastier bugs in our reader’s history (see War stories below).

Model header

Once the blob is in memory, InputBinary::Reset at 0x004a1030 walks the model header. Here’s the relevant field map:

Offset	Field	Notes
+0x00	`ModelDestructor` vptr	Populated at load time.
+0x04	`ModelParseField` vptr	Populated at load time.
+0x28	root node offset	Relocated. `ResetMdlNode` recurses from here.
+0x48	resource handle	Populated at load time.
+0x4C	type byte	`GetType()`
+0x50	classification	0=Other, 1=Effect, 2=Tile, 4=Character, 8=Door.
+0x54	ref count
+0x58	animations array ptr	Relocated; count at +0x5C.
+0x64	supermodel pointer	Populated via `FindModel(buf+0x88)`.
+0x68..+0x80	bbox min/max	`Vector bmin, bmax`.
+0x80	radius	f32, default 7.0.
+0x84	animation scale	f32, default 1.0. ASCII: `setanimationscale`.
+0x88	supermodel name	`char[36]`, null-terminated. Drives recursive model load.
+0xA8	node array (secondary)	Relocated if non-zero.
+0xAC	MDX vertex pool offset	Source offset into MDX data (consumed into a GL pool).
+0xB0	MDX data size	Size of the vertex-pool copy.
+0xB8	name offsets array ptr	Relocated; count at +0xBC. Array entries also relocated.

Two fields deserve special mention:

+0x50 classification is the model’s high-level category (Character, Door, Tile, …). It’s never read during the Reset pass – it’s carried through as part of the memory-mapped blob and consulted at runtime. Cross-validated against hex dumps:

File	+0x50	Category
`c_dewback.mdl`	0x04	Character ✓
`dor_lhr01.mdl`	0x08	Door ✓
`m01aa_01a.mdl`	0x00	Other ✓

+0x88 supermodel name is a 32-byte (plus 4 padding) ASCII name. Loading a model with a supermodel triggers a recursive FindModel call for that name – think of supermodels as CSS-style inheritance, where animation data and bones defined on the parent are available to the child.

The node tree

The root node is reached through the relocated offset at model +0x28. From there, children are reached through a standard in-memory array layout: ptr + count_used + count_allocated at node offsets +0x2C, +0x30, +0x34. This three-u32 pattern is BioWare’s CExoArrayList and shows up everywhere in the format – any time you see “12 bytes of array header”, this is what it is.
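
A reader will end up decoding this 12-byte header many times, so it is worth a small helper. The sketch below reads the three u32 fields and computes the byte range of the elements, rejecting ranges that fall outside the blob. `ArrayHeader`, `read_array_header`, and `element_range` are hypothetical names chosen for this example, and using `count_used` (rather than `count_allocated`) for the element count is an assumption.

```rust
/// BioWare's 12-byte array header as it appears in the blob.
#[derive(Debug, PartialEq, Clone, Copy)]
struct ArrayHeader {
    offset: u32, // blob-relative on disk; an absolute pointer after relocation
    count_used: u32,
    count_allocated: u32,
}

fn read_array_header(blob: &[u8], at: usize) -> Option<ArrayHeader> {
    let b = blob.get(at..at.checked_add(12)?)?;
    let u = |i: usize| u32::from_le_bytes([b[i], b[i + 1], b[i + 2], b[i + 3]]);
    Some(ArrayHeader { offset: u(0), count_used: u(4), count_allocated: u(8) })
}

/// Byte range covered by the used elements, or None if it leaves the blob.
fn element_range(blob: &[u8], h: ArrayHeader, elem_size: usize) -> Option<std::ops::Range<usize>> {
    let start = h.offset as usize;
    let len = (h.count_used as usize).checked_mul(elem_size)?;
    let end = start.checked_add(len)?;
    if end <= blob.len() { Some(start..end) } else { None }
}
```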

Base node layout (80 bytes)

All node types begin with the same 80-byte header:

Offset	Size	Field	Notes
+0x00	u16	`node_type`	Flag bitmask. Drives type dispatch.
+0x02	u16	`node_id`	Sequential `0..N-1`.
+0x04	u16	`node_id_dup`	Identical copy of `node_id`. Never read.
+0x06	u16	padding	Always zero.
+0x08	u32	name pointer	Relocated. Points into the string table.
+0x0C	u32	parent pointer	Relocated if non-zero.
+0x10	12	position	`Vector{x, y, z}` as 3×f32.
+0x1C	16	orientation	`Quaternion{w, x, y, z}` as 4×f32.
+0x2C	12	children array	CExoArrayList of `MdlNode*`.
+0x38	12	controller keys array	CExoArrayList of `NewController` (16B each).
+0x44	12	controller data array	CExoArrayList of float (packed key data).

The two bytes at +0x04 are a redundant duplicate of node_id: across 209 nodes in four vanilla files they were identical to +0x02, with zero mismatches. No known engine function reads the field. Best guess: legacy field or exporter artifact. It’s preserved for round-trip fidelity but has no semantic meaning.

A few conventions worth noting:

Quaternion order is (w, x, y, z). Confirmed via Gob::GetOrientation at 0x004499a0 which copies fields in that order. Identity quaternion is [1.0, 0.0, 0.0, 0.0]. The Rust API uses the same convention.
Position and orientation are read directly from the blob. They’re not relocated – they’re inline values, not pointers.
Only two fields need relocation in the base header: name pointer at +0x08 and parent pointer at +0x0C.

InputBinary::ResetMdlNodeParts at 0x004a0b60 handles the base relocations and then recurses: for each entry in the children array, relocate the child pointer and call ResetMdlNode on it.
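
The inline part of the base header can be decoded directly from the offsets in the table above. This is a minimal sketch, not the `rakata_formats` reader: `NodeHeader` and `parse_node_header` are hypothetical names, the three array headers at +0x2C..+0x4F are skipped for brevity, and the two pointer fields are kept as raw blob-relative offsets.

```rust
#[derive(Debug, PartialEq)]
struct NodeHeader {
    node_type: u16,
    node_id: u16,
    name_offset: u32,      // +0x08, blob-relative
    parent_offset: u32,    // +0x0C, blob-relative, 0 = no parent
    position: [f32; 3],    // +0x10, x y z
    orientation: [f32; 4], // +0x1C, w x y z
}

fn u16_at(b: &[u8], at: usize) -> u16 {
    u16::from_le_bytes([b[at], b[at + 1]])
}
fn u32_at(b: &[u8], at: usize) -> u32 {
    u32::from_le_bytes([b[at], b[at + 1], b[at + 2], b[at + 3]])
}
fn f32_at(b: &[u8], at: usize) -> f32 {
    f32::from_bits(u32_at(b, at))
}

/// Decode the 80-byte base header that starts at `at`; None if it is truncated.
fn parse_node_header(blob: &[u8], at: usize) -> Option<NodeHeader> {
    let b = blob.get(at..at.checked_add(80)?)?;
    Some(NodeHeader {
        node_type: u16_at(b, 0x00),
        node_id: u16_at(b, 0x02),
        name_offset: u32_at(b, 0x08),
        parent_offset: u32_at(b, 0x0C),
        position: [f32_at(b, 0x10), f32_at(b, 0x14), f32_at(b, 0x18)],
        orientation: [f32_at(b, 0x1C), f32_at(b, 0x20), f32_at(b, 0x24), f32_at(b, 0x28)],
    })
}
```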

Type dispatch

InputBinary::ResetMdlNode at 0x004a0900 reads the node_type field and dispatches:

`node_type`	Handler	Kind
`0x0001`	`ResetMdlNodeParts` only	Dummy / base
`0x0003`	`ResetLight`	Light
`0x0005`	`ResetMdlNodeParts` only	Emitter
`0x0009`	`ResetMdlNodeParts` only	Camera
`0x0011`	`ResetMdlNodeParts` only	Reference
`0x0021`	`ResetTriMesh` → `ResetTriMeshParts`	TriMesh
`0x0061`	`ResetSkin`	Skin mesh
`0x00A1`	`ResetAnim`	AnimMesh
`0x0121`	`ResetDangly`	Dangly mesh (cloth)
`0x0221`	`ResetAABBTree` + `ResetTriMeshParts`	Walkmesh with AABB
`0x0401`	(no-op)	Trigger / unused
`0x0821`	`ResetLightsaber`	Saber mesh

The type values are stored as a lookup table in the executable at 0x00740a18 (12 × u32).

Though the type codes are shaped like a bitmask – HEADER=0x01, LIGHT=0x02|HEADER, EMITTER=0x04|HEADER, TRIMESH=0x20|HEADER, SKIN=0x40|TRIMESH, SABER=0x800|TRIMESH, and so on – the dispatch is an exact value match, not individual bit checks. The bitmask structure is meaningful (skin is a superset of trimesh, for instance), it’s just not how the engine branches.
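
One way to mirror this in a reader is to build the constants from their bit components but match on the full value. In the sketch below, the constant names beyond HEADER, LIGHT, EMITTER, TRIMESH, SKIN, and SABER are assumptions derived from the table values, and returning `None` for an unlisted value is this example's choice rather than documented engine behaviour.

```rust
const HEADER: u16 = 0x0001;
const LIGHT: u16 = 0x0002 | HEADER;
const EMITTER: u16 = 0x0004 | HEADER;
const CAMERA: u16 = 0x0008 | HEADER;
const REFERENCE: u16 = 0x0010 | HEADER;
const TRIMESH: u16 = 0x0020 | HEADER;
const SKIN: u16 = 0x0040 | TRIMESH;
const ANIM: u16 = 0x0080 | TRIMESH;
const DANGLY: u16 = 0x0100 | TRIMESH;
const AABB: u16 = 0x0200 | TRIMESH;
const TRIGGER: u16 = 0x0400 | HEADER;
const SABER: u16 = 0x0800 | TRIMESH;

#[derive(Debug, PartialEq, Clone, Copy)]
enum NodeKind {
    Dummy, Light, Emitter, Camera, Reference, TriMesh,
    Skin, AnimMesh, Dangly, Aabb, Trigger, Saber,
}

/// Exact-value dispatch: a value with extra or missing bits is not recognised.
fn node_kind(node_type: u16) -> Option<NodeKind> {
    Some(match node_type {
        HEADER => NodeKind::Dummy,
        LIGHT => NodeKind::Light,
        EMITTER => NodeKind::Emitter,
        CAMERA => NodeKind::Camera,
        REFERENCE => NodeKind::Reference,
        TRIMESH => NodeKind::TriMesh,
        SKIN => NodeKind::Skin,
        ANIM => NodeKind::AnimMesh,
        DANGLY => NodeKind::Dangly,
        AABB => NodeKind::Aabb,
        TRIGGER => NodeKind::Trigger,
        SABER => NodeKind::Saber,
        _ => return None,
    })
}
```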

Size summary

Every node type has a known fixed size, both on disk and in memory:

Flag	Type	Total	Base	Extra	Extends
0x0001	Base	80	80	0	–
0x0003	Light	172	80	92	MdlNode
0x0005	Emitter	304	80	224	MdlNode
0x0009	Camera	80	80	0	MdlNode
0x0011	Reference	116	80	36	MdlNode
0x0021	TriMesh	412	80	332	MdlNode
0x0061	Skin	512	412	100	TriMesh
0x00A1	AnimMesh	468	412	56	TriMesh
0x0121	Dangly	440	412	28	TriMesh
0x0221	AABB	416	412	4	TriMesh
0x0401	Trigger	80	80	0	MdlNode
0x0821	Saber	432	412	20	TriMesh

Verified via ParseNode’s operator_new(size) calls and Ghidra struct definitions. All mesh subtypes extend MdlNodeTriMesh – their extra data begins at node offset +0x19C, immediately after the TriMesh block.
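
The size table reduces to two base sizes plus a per-type extra, which is easy to encode and sanity-check. This is a sketch using only the figures from the table; `node_size` is a hypothetical helper name.

```rust
const BASE: usize = 80;
const TRIMESH: usize = BASE + 332; // 412 = 0x19C, where mesh-subtype extras begin

/// Total fixed size of a node struct for a given node_type, per the table above.
fn node_size(node_type: u16) -> Option<usize> {
    Some(match node_type {
        0x0001 | 0x0009 | 0x0401 => BASE, // base, camera, trigger: no extra data
        0x0003 => BASE + 92,              // light
        0x0005 => BASE + 224,             // emitter
        0x0011 => BASE + 36,              // reference
        0x0021 => TRIMESH,
        0x0061 => TRIMESH + 100,          // skin
        0x00A1 => TRIMESH + 56,           // animmesh
        0x0121 => TRIMESH + 28,           // dangly
        0x0221 => TRIMESH + 4,            // AABB
        0x0821 => TRIMESH + 20,           // saber
        _ => return None,
    })
}
```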

Node types in depth

The lightweight types

Camera (0x009) has no extra data. Same 80-byte footprint as the base node. ResetMdlNode dispatches to ResetMdlNodeParts only. There are no camera-specific ASCII fields either – the ASCII parser also falls through to the base handler.

Reference (0x011) carries just two fields in 36 extra bytes: a 32-byte ref_model name and a 4-byte reattachable flag. Both inline (no pointers to relocate).

Trigger (0x401) – the decompiled ResetMdlNode returns immediately for this type, without calling any reset function. In practice it appears to be unused in shipping content.

Light (0x003)

Lights carry 92 bytes of extra data. Most of the scalar fields are straightforward (priority, shadow flag, ambient-only flag, flare radius, etc.), but lights are the most complex non-mesh type because of their array fields:

Extra offset	Field	Layout	Runtime relocation
+0x04	texture SafePointers	12-byte array header	Zeroed on disk
+0x10	`flaresizes`	CExoArrayList	ptr relocated
+0x1C	`flarepositions`	CExoArrayList	ptr relocated
+0x28	`flarecolorshifts`	CExoArrayList	ptr relocated
+0x34	`texturenames`	CExoArrayList<char*> (each ptr too!)	all ptrs relocated

Lights also drive their colour, radius, shadow radius, vertical displacement, and multiplier via controllers (types 0x4C, 0x58, 0x60, 0x64, 0x8C) – these live in the base node’s controller arrays, not in the light-specific block.

Emitter (0x005)

Emitters are 304 bytes and – pleasantly – contain no relocatable pointers. Everything is inline: a fistful of floats and ints, four 32-byte name fields (update, render, blend, texture), and a 16-byte chunk_name. The full field map is in the appendix.

The most important field is update at extra offset +0x20. It’s the emitter type string, a case-sensitive selector against:

"Fountain" → steady particle stream (most common)
"Explosion" → one-shot burst
"Single" → single particle
"Lightning" → lightning-bolt effect

MdlNodeEmitter::InternalCreateInstance at 0x0049d5c0 branches on this string to instantiate the appropriate runtime emitter class.

Known engine-level footgun: controller 502 (detonate) is only valid on "Explosion" emitters. InternalCreateInstance only allocates the detonation memory for that branch, so a detonate controller on a "Fountain" emitter reads unallocated memory at runtime and crashes. This is a known flaw in mdlops-based exporters (KotorMax); rakata-lint will validate this.
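
A lint for this footgun needs only the update string and the node’s controller type codes. The sketch below is one hypothetical way to express the check and is not the actual rakata-lint implementation; `EmitterUpdate`, `parse_update`, and `detonate_is_safe` are names invented for the example.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum EmitterUpdate {
    Fountain,
    Explosion,
    Single,
    Lightning,
}

/// Decode the null-terminated `update` field. The comparison is case-sensitive.
fn parse_update(field: &[u8]) -> Option<EmitterUpdate> {
    let end = field.iter().position(|&c| c == 0).unwrap_or(field.len());
    match &field[..end] {
        b"Fountain" => Some(EmitterUpdate::Fountain),
        b"Explosion" => Some(EmitterUpdate::Explosion),
        b"Single" => Some(EmitterUpdate::Single),
        b"Lightning" => Some(EmitterUpdate::Lightning),
        _ => None,
    }
}

const DETONATE: u32 = 502;

/// True unless a detonate controller sits on a non-"Explosion" emitter.
fn detonate_is_safe(update: Option<EmitterUpdate>, controller_types: &[u32]) -> bool {
    !controller_types.contains(&DETONATE) || update == Some(EmitterUpdate::Explosion)
}
```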

TriMesh (0x021)

This is the big one. 332 bytes of extra data, encoding everything you’d expect in a mesh plus many things you wouldn’t.

Inline fields

At a high level:

Runtime function pointers (+0x00, +0x04): written by the constructor. Zero on disk; never consumed from a file.
Faces array (+0x08): CExoArrayList of MaxFace (32 bytes each). See Face layout below.
Bounding volumes (+0x14..+0x38): bbox min, bbox max, bounding sphere (radius + centre xyz). The sphere is the one actually consumed at runtime – PartTriMesh::GetMinimumSphere hierarchically unions it with children’s spheres for culling. These sphere fields have no ASCII-parser equivalent; they’re exclusively binary-format fields written by the BioWare toolset.
Material (+0x3C..+0x54): diffuse RGB, ambient RGB, transparencyhint.
Textures (+0x58..+0x98): texture_0 (primary/diffuse) and texture_1 (secondary/lightmap), each a 32-byte null-terminated string, plus 32 bytes of padding up to +0xE8.
UV animation (+0xEC..+0xF8): uv_direction_x, uv_direction_y, uv_jitter, uv_jitter_speed. Gated by animate_uv (+0xE8).
MDX vertex layout (+0x100..+0x12F): flags bitmask plus 11 per-attribute byte offsets. Described in the next subsection.
Counts and flags (+0x130..+0x13B): vertex_count (u16), texture_channel_count (u16), six 1-byte booleans (light_mapped, rotate_texture, is_background_geometry, shadow, beaming, render).
Tail (+0x13C..+0x14B): total_surface_area, one unresolved reserved slot, mdx_data_offset, vert_array_offset.

Out of 332 bytes, 61 fields are fully confirmed through Ghidra cross-referencing, 5 are confirmed-unused, 1 is “very likely” (the always-3 indices_per_face), and exactly 1 remains unresolved (the 4 bytes at +0x140, which the constructor initializes to zero and no known function ever touches).

MDX vertex layout

The flags field at extra +0x100 is a bitmask describing what each MDX vertex record contains:

Bit	Component	Size
0x01	position	3×f32 (12B) – always set
0x02	UV1 / `tverts0`	2×f32 (8B)
0x04	UV2 / `tverts1`	2×f32 (8B)
0x08	UV3 / `tverts2`	2×f32 (8B)
0x10	UV4 / `tverts3`	2×f32 (8B)
0x20	normal	3×f32 (12B) – always set
0x80	tangent space	3×3×f32 (36B) – bump-mapped meshes

Common patterns in vanilla K1: 0x21 (pos+norm only, 24B stride), 0x23 (+UV1, 32B), 0x27 (+UV2, 40B), 0xA7 (+tangent, 76B).
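
The strides quoted for these patterns follow from summing the component sizes in the flag table. A small sketch, with `mdx_stride` as a hypothetical helper name; vertex colours are not counted here because they have no flag bit.

```rust
/// Sum the sizes of the vertex components selected by the MDX flags bitmask.
fn mdx_stride(flags: u32) -> u32 {
    // (flag bit, component size in bytes), as listed in the table above.
    const COMPONENTS: [(u32, u32); 7] = [
        (0x01, 12), // position
        (0x02, 8),  // UV1
        (0x04, 8),  // UV2
        (0x08, 8),  // UV3
        (0x10, 8),  // UV4
        (0x20, 12), // normal
        (0x80, 36), // tangent space
    ];
    COMPONENTS
        .iter()
        .filter(|&&(bit, _)| (flags & bit) != 0)
        .map(|&(_, size)| size)
        .sum()
}
```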

Note that vertex colours have no flag bit. Their presence is signalled by the per-attribute offset slot being != -1. The 11 offset slots are:

Slot	Extra offset	Field	Evidence
0	+0x104	position	`LightPartTriMesh` reads 3×f32, world-transforms
1	+0x108	normal	`LightPartTriMesh` reads 3×f32, rotation only
2	+0x10C	vertex color	Checked `!= -1`, reads RGB only. Alpha unused.
3	+0x110	UV1	`PartTriMesh` reads 2×f32
4	+0x114	UV2	Structural: `tverts1` in `InternalGenVertices`
5	+0x118	UV3	Structural: `tverts2`
6	+0x11C	UV4	Structural: `tverts3`
7	+0x120	tangent space	Filled by `CalculateTangentSpaceBasis`
8–10	+0x124..+0x12C	reserved	Always `-1` across 215 surveyed vanilla meshes

Vertex colour alpha is unused (confirmed 2026-04-04). LightPartTriMesh reads only bytes [0], [1], [2] (RGB). Byte [3] is stored but never read. The rendered output hardcodes alpha to 0xFF. The fourth byte exists purely for alignment.

Important subtlety: the engine doesn’t trust any of these values on load. InternalPostProcess at 0x0043cf00 recomputes the flags, stride, per-attribute offsets, and mdx_data_offset from scratch, based on which vertex components are actually present in the node’s arrays. It also recomputes vertex normals via edge cross products, and re-derives the bounding box and sphere. The on-disk values preserve the compiler’s original output, but they’re cosmetic from the engine’s perspective.

This has a consequence for tooling: you can largely get away with wrong values in these fields as long as your mesh is otherwise valid, because the engine will fix them up at load time. But a correct writer should still populate them – community tools (kotorblender, mdledit) depend on them, and the BioWare build pipeline does too.

Skin mesh (0x061)

100 extra bytes beyond TriMesh. Skinning data (bone weights, inverse-bind-pose rotation and translation, bone-index mapping) sits here, along with several padding regions:

Skin offset	Field	Layout	Notes
+0x00	`weights`	CExoArrayList	Always zero in binary files.
+0x14	`bone_weight_data`	ptr	Relocated if count at +0x18 > 0.
+0x1C	`qbone_ref_inv`	CExoArrayList	Inverse-bind rotations.
+0x28	`tbone_ref_inv`	CExoArrayList	Inverse-bind translations.
+0x34	`bone_constant_indices`	CExoArrayList	Bone-index remap.

The weights array deserves a call-out. A 52-byte SkinVertexWeight struct exists and is fully specified by the ASCII parser – 4 bone names, 4 weights, some metadata – but in the binary path, ResetSkin never relocates its pointer, and a corpus scan of all 968 skin nodes across 2832 vanilla models found zero non-empty weights arrays. Binary models store per-vertex bone data exclusively in MDX (via dedicated bone-weight and bone-index offsets), and the weights CExoArrayList is just a 12-byte zero blob on disk.

AnimMesh (0x0A1)

56 extra bytes. Carries a sample_period scalar and two CExoArrayList fields (anim_verts, anim_t_verts) for time-sampled vertex animation. The remaining six fields (three pointers + three counts + some padding) are runtime-only and zero on disk. Fun fact: no community tool (kotorblender, mdledit, kotormax, reone, xoreos, pykotor) parses AnimMesh nodes – we may have the first structured reader for this type.

Also: ResetAnim is peculiar in that it processes the extra data before calling ResetTriMeshParts, the reverse of every other mesh subtype. There’s no obvious reason for this.

Dangly mesh (0x121)

One of the simpler mesh subtypes, at 28 extra bytes: a per-vertex constraints CExoArrayList (12 bytes), three inline floats (displacement, tightness, period) that parameterize the soft-body simulation, and a single pointer at the tail that is relocated only when the TriMesh vertex count is non-zero.

Dangly meshes are BioWare’s hack for cloth and hair – rigged to the skeleton like a skin mesh, but with simulation parameters that let parts of the geometry lag and swing.

AABB walkmesh (0x221)

4 extra bytes: a single pointer to the root of an AABB tree stored inline in the MDL blob.

The AABB tree is a flattened binary search tree written in DFS preorder. Each node is 40 bytes:

Offset	Size	Field	Notes
+0x00	12	`box_min`	3×f32 AABB minimum corner
+0x0C	12	`box_max`	3×f32 AABB maximum corner
+0x18	4	`right_child`	Content-relative offset (0 = no child)
+0x1C	4	`left_child`	Content-relative offset (0 = no child)
+0x20	4	`face_index`	i32. Leaves: ≥ 0. Internal: −1.
+0x24	4	`split_direction_flags`	Axis bitmask: 1=+X, 2=+Y, 4=+Z, 8=−X, 16=−Y, 32=−Z

Note that right_child comes before left_child in the struct – this is the actual field order, not a typo. Matches Ghidra and the mdledit/mdlops implementations.

Leaf nodes have left = 0, right = 0, face_index ≥ 0, split_direction_flags = 0. Internal nodes have both children non-zero, face_index = -1, and flags computed from the child bounding-box separation. The format is the classic spatial subdivision tree used for fast triangle lookups during pathfinding and collision queries.
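
The 40-byte layout and the leaf/internal rule are enough to walk the tree without recursion. The sketch below decodes one node and collects leaf face indices with an explicit stack. `AabbNode`, `parse_aabb_node`, and `leaf_faces` are hypothetical names, and the visit budget is an assumption added here to guard against cyclic offsets in corrupt files.

```rust
#[derive(Debug, PartialEq)]
struct AabbNode {
    box_min: [f32; 3],
    box_max: [f32; 3],
    right_child: u32, // +0x18, stored before left_child
    left_child: u32,  // +0x1C
    face_index: i32,  // >= 0 on leaves, -1 on internal nodes
    split_flags: u32,
}

fn u32_at(b: &[u8], at: usize) -> u32 {
    u32::from_le_bytes([b[at], b[at + 1], b[at + 2], b[at + 3]])
}
fn vec3_at(b: &[u8], at: usize) -> [f32; 3] {
    [
        f32::from_bits(u32_at(b, at)),
        f32::from_bits(u32_at(b, at + 4)),
        f32::from_bits(u32_at(b, at + 8)),
    ]
}

fn parse_aabb_node(blob: &[u8], at: usize) -> Option<AabbNode> {
    let b = blob.get(at..at.checked_add(40)?)?;
    Some(AabbNode {
        box_min: vec3_at(b, 0x00),
        box_max: vec3_at(b, 0x0C),
        right_child: u32_at(b, 0x18),
        left_child: u32_at(b, 0x1C),
        face_index: u32_at(b, 0x20) as i32,
        split_flags: u32_at(b, 0x24),
    })
}

/// Collect leaf face indices, visiting left children before right children.
fn leaf_faces(blob: &[u8], root: usize) -> Option<Vec<i32>> {
    let mut out = Vec::new();
    let mut stack = vec![root];
    let mut budget = blob.len() / 40 + 1; // a blob cannot hold more distinct nodes than this
    while let Some(at) = stack.pop() {
        if budget == 0 {
            return None; // more visits than possible nodes: offsets must loop
        }
        budget -= 1;
        let node = parse_aabb_node(blob, at)?;
        if node.face_index >= 0 {
            out.push(node.face_index);
            continue;
        }
        if node.right_child != 0 {
            stack.push(node.right_child as usize);
        }
        if node.left_child != 0 {
            stack.push(node.left_child as usize);
        }
    }
    Some(out)
}
```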

ResetAABBTree at 0x004a0260 recurses the tree, relocating each child pointer. The decompiled body is unrolled to depth 4 before it recurses, presumably to limit stack depth, whether by hand or by the compiler’s inliner.

Lightsaber (0x821)

20 extra bytes – small but architecturally notable:

Saber offset	Field	Notes
+0x00	saber vert data	Relocated pointer
+0x04	saber UV data	Relocated pointer
+0x08	saber normal data	Relocated pointer
+0x0C	GL vertex pool ID	Runtime-only (set by `RequestPool`)
+0x10	GL index pool ID	Runtime-only

Three arrays of exactly 176 vertices each (NUM_SABER_VERTS = 176, confirmed by kotorblender): position, UV, normal. The saber blade is a fixed-topology mesh – BioWare pre-baked the geometry as a flexible band that can be animated by swinging the endpoint controllers.

Unlike Skin/Dangly/AnimMesh, the saber uses the base TriMesh gen_vertices and remove_temporary_array callbacks. Its geometry doesn’t morph dynamically at the vertex-processing level – the animation is in the controller track.

Controllers and animation

The controller header

Controllers are the keyframe-animation primitive. Each node has an array of 16-byte NewController headers (at node +0x38) plus a shared pool of float data (at +0x44). Each header describes one animatable property of that node:

Offset	Size	Field	Notes
+0x00	u32	`type_code`	Byte offset of the target property in the Part struct.
+0x04	i16	`supermodel_link`	Additive-blending property offset; `-1` = no blending.
+0x06	u16	`row_count`	Number of keyframes.
+0x08	u16	`time_data_offset`	Float-array index for time values.
+0x0A	u16	`data_offset`	Float-array index for value data.
+0x0C	u8	`value_type_and_flags`	Low nibble: 1=float, 2/4=quaternion, 3=vector. Bit 4=0x10=Bezier.
+0x0D	3	padding	Alignment to 16 bytes. Never read.

The type_code is elegant: it’s literally the byte offset into the Part struct where the animated value lives. NewController::Control dereferences it as *(float*)(part_ptr + type_code). So type_code = 8 means “position” because position sits at Part+0x08; type_code = 20 means “orientation” because orientation sits at Part+0x14 (as a compressed axis-angle quaternion); and so on. This collapses what would otherwise be a switch over property IDs into direct pointer arithmetic.
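
For a reader, the 16-byte header decodes field by field from the table above, and the two u16 offsets index into the node’s float pool. A minimal sketch, assuming the float pool has already been decoded into a `&[f32]`; `ControllerHeader`, `parse_controller`, and `key_times` are hypothetical names.

```rust
#[derive(Debug, PartialEq)]
struct ControllerHeader {
    type_code: u32,
    supermodel_link: i16,
    row_count: u16,
    time_data_offset: u16, // index into the float pool, not a byte offset
    data_offset: u16,      // index into the float pool, not a byte offset
    value_type_and_flags: u8,
}

/// Decode one 16-byte NewController header. The 3 trailing padding bytes are ignored.
fn parse_controller(b: &[u8]) -> Option<ControllerHeader> {
    if b.len() < 16 {
        return None;
    }
    let u16_at = |at: usize| u16::from_le_bytes([b[at], b[at + 1]]);
    Some(ControllerHeader {
        type_code: u32::from_le_bytes([b[0], b[1], b[2], b[3]]),
        supermodel_link: u16_at(0x04) as i16,
        row_count: u16_at(0x06),
        time_data_offset: u16_at(0x08),
        data_offset: u16_at(0x0A),
        value_type_and_flags: b[0x0C],
    })
}

/// The keyframe times for a controller: `row_count` floats starting at `time_data_offset`.
fn key_times<'a>(floats: &'a [f32], c: &ControllerHeader) -> Option<&'a [f32]> {
    let start = c.time_data_offset as usize;
    floats.get(start..start + c.row_count as usize)
}
```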

The value_type_and_flags byte at +0x0C has a compound encoding that bit us hard early on:

Low nibble (& 0x0F) – value-type discriminator: 1=float, 2 or 4=quaternion, 3=vector. Selects the interpolation path (Lerp/Slerp/VectorLerp).
High nibble (& 0xF0) – flags. 0x10 signals Bezier interpolation, which triples the per-keyframe value count (each keyframe is value + in-tangent + out-tangent).
Special case: for orientation controllers (type code 20) with raw byte value == 2, the keyframe is a compressed quaternion packed into a single u32, not two f32 values.

The low nibble happens to coincide with the “number of floats per keyframe row” for simple cases (1, 3, 4), which is why the earlier interpretation of this byte as column_count mostly worked – until it didn’t. See the controller bug below.
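
Put together, the rules above give the number of 4-byte slots each keyframe row occupies in the data array. The sketch below is one way to write that decode; `slots_per_row` is a hypothetical name, and treating the low nibble as the float count for non-Bezier rows follows the "simple cases" observation rather than a separate engine table.

```rust
const ORIENTATION: u32 = 20;
const BEZIER: u8 = 0x10;

/// Number of 4-byte slots per keyframe row in the controller data array.
fn slots_per_row(type_code: u32, raw: u8) -> usize {
    // Special case: compressed quaternion, one packed u32 per row.
    if type_code == ORIENTATION && raw == 2 {
        return 1;
    }
    let base = (raw & 0x0F) as usize;
    if (raw & BEZIER) != 0 {
        base * 3 // value + in-tangent + out-tangent
    } else {
        base
    }
}
```

Treating the raw byte as a column count gives 19 for a Bezier position controller (0x13), where this decode gives 9.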

Self-describing rows

Because value_type_and_flags is inline in each controller header, the binary format is entirely self-describing for animation data. The reader doesn’t need a lookup table mapping type codes to column counts – it reads the flags byte and knows how many floats to consume per row.

This is useful because vanilla K1 contains controller type codes (0x68, 0x188) that aren’t documented in any community reference. Trying to parse these with a closed enum caused 517 of 2832 vanilla MDLs (18.3%) to fail. MdlControllerType is therefore a newtype struct MdlControllerType(u32) with named constants for the three universally-confirmed base types (POSITION = 8, ORIENTATION = 20, SCALE = 36) and accepts any other u32 losslessly.
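
The open-newtype shape described above looks roughly like this. The struct and the three constants follow the description in the text; the `base_name` helper is hypothetical and added only to show that unknown codes pass through untouched.

```rust
/// A controller type code. Any u32 is representable, so undocumented codes survive a round trip.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct MdlControllerType(pub u32);

impl MdlControllerType {
    pub const POSITION: MdlControllerType = MdlControllerType(8);
    pub const ORIENTATION: MdlControllerType = MdlControllerType(20);
    pub const SCALE: MdlControllerType = MdlControllerType(36);

    /// ASCII name for the three base controllers, None for everything else.
    pub fn base_name(self) -> Option<&'static str> {
        match self {
            MdlControllerType::POSITION => Some("position"),
            MdlControllerType::ORIENTATION => Some("orientation"),
            MdlControllerType::SCALE => Some("scale"),
            _ => None,
        }
    }
}
```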

Base vs type-specific controllers

Three controllers are universal – they exist on every node type:

ASCII name	Code	Columns	Meaning
`position`	8	3	x, y, z
`orientation`	20	4	x, y, z, angle (compressed axis-angle)
`scale`	36	1	uniform scale factor

Type-specific codes live at higher numbers: light controllers start at 76 (color), emitter controllers are at 88+. All three base codes also support a Bezier variant (signalled by the flag bit, not a separate type code).

The MDX file: a mystery

Now for the strangest part of the format.

The MDX file contains interleaved vertex data – positions, normals, UVs, tangent space, colours – packed into records of width given by the mesh vertex_stride field, aligned into per-mesh blocks with sentinel-float terminators separating them. It looks exactly like what you’d expect a GPU vertex buffer to look like.

And the K1 engine never reads it.

Here’s the complete trace through InputBinary::Read:

Read the MDX file into a buffer (pbVar9).
Call Reset(mdl_content, mdx_content, resource).
Reset passes mdx_content as param_3 through a chain of function calls (ResetMdlNode, ResetTriMeshParts, …). Every downstream function has param_3 as a formal parameter.
param_3 is never used. In ResetTriMeshParts, it’s literally overwritten as a loop counter on line 67.
Back in InputBinary::Read, line 78: _free(pbVar9). The MDX buffer is freed.

At no point does any vertex-related code path consume MDX data. InternalGenVertices builds vertex buffers from verts_arrays, which lives in the MDL content blob. ProcessVerts recomputes normals from geometry. LightPartTriMesh reads from the GL pool populated at +0xAC of the model header – which is sourced from the MDL content, not the MDX file.

So where does the vertex data actually come from? From a parallel set of position-only arrays stored inside the MDL content blob, pointed to by vert_array_offset at mesh +0x148 (content-relative), with additional UV/colour/normal data in the MdlNodeTriMeshVertArrays structures.

The MDX file, in short, is a redundant interleaved copy of data that the K1 engine could reconstruct from the MDL alone. Most likely theories for why it exists:

Build-pipeline artifact. BioWare’s Aurora engine (Neverwinter Nights) may have used the MDX format directly, and the K1 pipeline inherited the file-layout convention without the consuming code path.
Toolset requirement. Third-party editors and the BioWare toolset itself may still parse MDX for authoring workflows.
ResetLite path. There’s a separate “lightweight” loader (InputBinary::ResetLite at 0x004a11b0) that may use MDX for a reduced in-memory representation – unverified.

For our purposes, this has two consequences:

Engine-functional MDX is near-trivial. Any MDX file the K1 engine happily ignores is a valid MDX file. You could write all zeros and the game would run.
Round-trip-accurate MDX requires the per-mesh terminator convention (described next), because community tools do read MDX, and byte-identical round-trip is a useful correctness check.

Per-mesh terminators and alignment

Empirically, vanilla MDX files are larger than sum(vertex_count × stride). Across 2832 vanilla K1 models, 2445 have MDX files with excess bytes, totalling 3,278,456 bytes corpus-wide.

The excess has structure. After each mesh’s vertex data, there’s a terminator row of exactly one stride’s worth of bytes, beginning with three sentinel floats and padded with zeros:

Mesh type	Sentinel value	Hex (f32 LE)
Non-skin (`type & 0x40 == 0`)	10,000,000.0	`80 96 18 4B`
Skin (`type & 0x40 != 0`)	1,000,000.0	`00 24 74 49`

Corpus sentinel detection: 6,973 non-skin sentinels, 6 skin sentinels, 0 unknown patterns.
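
A writer can generate the terminator row from the stride and the node type alone. The sketch below builds one row as described: three copies of the sentinel float, then zeros. `terminator_row` is a hypothetical name, and returning `None` for strides under 12 bytes is this example’s own guard.

```rust
/// Build one terminator row: three sentinel floats followed by zero padding.
fn terminator_row(stride: usize, node_type: u16) -> Option<Vec<u8>> {
    if stride < 12 {
        return None; // no room for three f32 sentinels
    }
    // Skin meshes (bit 0x40 set) use a different sentinel from all other meshes.
    let sentinel: f32 = if (node_type & 0x40) != 0 {
        1_000_000.0
    } else {
        10_000_000.0
    };
    let mut row = vec![0u8; stride];
    for chunk in row.chunks_mut(4).take(3) {
        chunk.copy_from_slice(&sentinel.to_le_bytes());
    }
    Some(row)
}
```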

Between meshes, the cursor is padded to the next 16-byte boundary. The last mesh has no trailing alignment:

cursor = 0
for each mesh in MDX order:
    cursor += vertex_count × stride   # vertex data
    cursor += stride                   # terminator row
    if not last mesh:
        cursor = (cursor + 15) & ~15   # 16-byte alignment
mdx_file_size = cursor

For stride-24 meshes, the gap between meshes is either 24 or 32 bytes depending on current alignment. For stride-32 and stride-64 meshes, it’s always exactly stride because the stride is already a multiple of 16.
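
The cursor arithmetic above translates directly into Rust. In this sketch each mesh is a `(vertex_count, stride)` pair in MDX order; `mdx_file_size` is a hypothetical name, and the function assumes every mesh in the list contributes a terminator row.

```rust
/// Expected MDX file size for meshes given as (vertex_count, stride), in MDX order.
fn mdx_file_size(meshes: &[(usize, usize)]) -> usize {
    let mut cursor = 0usize;
    for (i, &(vertex_count, stride)) in meshes.iter().enumerate() {
        cursor += vertex_count * stride; // vertex data
        cursor += stride;                // terminator row
        if i + 1 != meshes.len() {
            cursor = (cursor + 15) & !15; // pad to the next 16-byte boundary
        }
    }
    cursor
}
```

For example, a stride-24 mesh with two vertices followed by a stride-32 mesh with one vertex occupies 72 bytes, is padded to 80, and the second mesh adds 64 more for a total of 144.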

Mesh ordering in MDX

Non-skin meshes come first, then skin meshes. Within each group, the order is DFS-traversal-of-the-tree – mostly. About 27% of vanilla models exhibit a compiler-specific permutation that defers “second children” of paired parents until after all their siblings’ first children. This is reproducible for our own output (if we write DFS, we read DFS), but not for byte-identical round-trip of every BioWare file.

Writing in standard DFS order (non-skin first, skin second) produces semantically identical MDX data with the correct total size. 1784 of 2444 models match byte-for-byte; the remaining 660 have the non-standard compiler ordering.

What this means for `mdx_data_offset`

The mesh header has two adjacent u32 fields at +0x144 and +0x148:

+0x144 mdx_data_offset: per-mesh byte offset into the MDX file. Used by community tools to seek directly to that mesh’s vertex block. The engine also uses this after InternalPostProcess overwrites it with a GL-pool offset.
+0x148 vert_array_offset: content-relative pointer to the position-only vertex data embedded in the MDL content blob. Used by the engine during load. Relocated by ResetTriMeshParts via param_1->field60_0x198 = param_2 + param_1->field60_0x198 – where param_2 is the MDL content base, not the MDX base.

These two fields were conflated under a single MDX_OFFSET = 0x148 constant in our implementation for several months, which caused the reader to lose the MDX offset entirely and the writer to overwrite the content pointer with an MDX offset. Full story in War stories.

Face layout

Faces are 32-byte records (MaxFace) stored in the TriMesh faces CExoArrayList:

Offset	Size	Field	Type	Notes
+0x00	12	`plane_normal`	3×f32	Face plane normal.
+0x0C	4	`plane_distance`	f32	Plane equation: n·p = d.
+0x10	4	`surface_id`	u32	Walkability / material identifier.
+0x14	6	`adjacent`	3×u16	Indices of adjacent faces (for AABB/pathfinding).
+0x1A	6	`vertex_indices`	3×u16	Triangle vertex indices.

The plane normal and distance are pre-computed by the BioWare toolset. They can be re-derived from the geometry but the binary format preserves them. The adjacency graph is what makes AABB walkmesh lookups fast – each triangle points to its neighbours, enabling constant-time stepping during pathfinding.

An early version of our reader assumed 12-byte faces (just the vertex indices). Walking 32-byte records at a 12-byte stride slices each real face into roughly 2.67 bogus ones (32 / 12), so almost every “face” was decoded from plane normals, surface IDs, and adjacency bytes rather than vertex indices. It was masked by synthetic round-trip tests – write wrong, read wrong, match! – and only caught when vanilla-file validation found vertex indices exceeding the mesh’s vertex count.
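
The bounds check that caught the bug is cheap to build into the face decoder itself. The sketch below decodes 32-byte records per the table above and rejects any vertex index at or beyond the mesh’s vertex count. `MaxFace` matches the record name in the text, while `parse_faces` and its error strings are hypothetical.

```rust
#[derive(Debug, PartialEq)]
struct MaxFace {
    plane_normal: [f32; 3],
    plane_distance: f32,
    surface_id: u32,
    adjacent: [u16; 3],
    vertex_indices: [u16; 3],
}

const FACE_SIZE: usize = 32;

/// Decode a faces array and validate every vertex index against `vertex_count`.
fn parse_faces(data: &[u8], vertex_count: u16) -> Result<Vec<MaxFace>, String> {
    if data.len() % FACE_SIZE != 0 {
        return Err(format!("face data length {} is not a multiple of {}", data.len(), FACE_SIZE));
    }
    let mut faces = Vec::with_capacity(data.len() / FACE_SIZE);
    for (i, b) in data.chunks(FACE_SIZE).enumerate() {
        let f32_at = |at: usize| f32::from_le_bytes([b[at], b[at + 1], b[at + 2], b[at + 3]]);
        let u16_at = |at: usize| u16::from_le_bytes([b[at], b[at + 1]]);
        let face = MaxFace {
            plane_normal: [f32_at(0x00), f32_at(0x04), f32_at(0x08)],
            plane_distance: f32_at(0x0C),
            surface_id: u32::from_le_bytes([b[0x10], b[0x11], b[0x12], b[0x13]]),
            adjacent: [u16_at(0x14), u16_at(0x16), u16_at(0x18)],
            vertex_indices: [u16_at(0x1A), u16_at(0x1C), u16_at(0x1E)],
        };
        if face.vertex_indices.iter().any(|&v| v >= vertex_count) {
            return Err(format!("face {} references a vertex outside 0..{}", i, vertex_count));
        }
        faces.push(face);
    }
    Ok(faces)
}
```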

War stories and implementation history

A brief chronicle of the bugs found while building the Rust reader/writer, because the “how we know this” is often as useful as the “what we know”.

The 12-byte face bug

Described above. The MaxFace stride is 32 bytes, not 12. Caught by vertex-index bounds checking against vanilla files.

Mesh header size corrections

The whole mesh extra-header was misunderstood for a long time. A sample of the corrections, all fixed in late February 2026:

VERTEX_COUNT offset was 0x9E → actually 0x130
MDX_OFFSET was 0xB8 → actually two separate fields at 0x144 and 0x148
VERTEX_STRUCT_SIZE was 0xBC → actually 0xFC
MESH_EXTRA_SIZE was 200 bytes → actually 332 (0x14C)
RENDER boolean was missing entirely → added at 0x139
SHADOW boolean was missing entirely → added at 0x137

All of these stemmed from extrapolating offsets from partial hex dumps rather than decompiling the struct. Ghidra’s MdlNodeTriMesh struct definition settled the whole thing – once the Ghidra type was aligned, the field offsets fell out directly.

Controller column-count encoding

Our reader initially used the raw value_type_and_flags byte (at controller +0x0C) directly as a float count per row. This worked for the common case (position=3, orientation=4, scale=1) but broke in two scenarios:

Bezier controllers set bit 0x10, so a Bezier position controller (base count 3) stores the byte 0x13. Read naively, that is 19 columns; the real row width is 3 × 3 = 9 floats.
Integral orientation: ORIENTATION controllers with raw byte == 2 mean “compressed quaternion packed into one u32 per row”, not “2 f32 values per row”.

The integral-orientation case was the more painful bug: a c_dewback scan showed 876 integral-orientation controllers; c_rancor had 1,212. Reading two f32 values per row instead of one packed u32 consumed double the expected data, desynchronizing every subsequent controller in the data array. Every node’s animation after the first compressed-quaternion keyframe was read from a shifted window of garbage.

Fix: decode the raw byte with & 0x0F masking plus the two special cases (Bezier multiplies by 3; integral orientation uses 1 u32 per row regardless). The raw byte is preserved in a raw_column_count field for round-trip fidelity.
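The decoding rule above can be sketched roughly as below. The enum and function names are hypothetical, and the caller is assumed to already know whether the controller is an ORIENTATION controller; only the 0x10 flag, the 0x0F mask, and the two special cases come from the description above.

```rust
/// Bit set in the raw byte for Bezier-keyed controllers.
const BEZIER_FLAG: u8 = 0x10;

/// Hypothetical description of what one data row contains.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RowLayout {
    /// This many f32 values per row.
    Floats(usize),
    /// One u32 per row holding a compressed quaternion.
    PackedQuaternion,
}

/// Decode the raw column byte into a row layout.
fn decode_row_layout(raw: u8, is_orientation: bool) -> RowLayout {
    // Integral orientation: raw byte 2 on an orientation controller.
    if is_orientation && raw == 2 {
        return RowLayout::PackedQuaternion;
    }
    let base = (raw & 0x0F) as usize;
    if raw & BEZIER_FLAG != 0 {
        // Bezier rows carry three values per base column.
        RowLayout::Floats(base * 3)
    } else {
        RowLayout::Floats(base)
    }
}

/// Row width in 4-byte words, which is what keeps the data cursor in sync.
fn words_per_row(layout: RowLayout) -> usize {
    match layout {
        RowLayout::Floats(n) => n,
        RowLayout::PackedQuaternion => 1,
    }
}
```

Returning an enum rather than a bare count keeps the packed-quaternion case from being mistaken for "one float", while the raw byte can still be stored alongside for round-tripping.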

Animation node_number at +0x02

The 80-byte node header’s first 8 bytes are type_flags (u16), node_number (u16), name_index (u16), padding (u16). Our offset map had NODE_ID = 0x04, which pointed to name_index, not node_number.

For animation nodes specifically, node_number is the engine’s key for matching animation keyframe nodes to their geometry-side skeleton bones. Writing zeros at +0x02 and stuffing the name_index at +0x04 meant every animation node had node_number = 0, so every keyframe targeted the root bone. Visually: characters froze in T-pose with no skeletal motion whatsoever.

Fix: handle node_number at +0x02 explicitly, and treat the u16 at +0x04 as name_index (an index into the name table) rather than as a node ID.
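For illustration, the first 8 bytes of the node header could be read like this. The struct and function names are hypothetical and little-endian byte order is assumed; the field order and offsets are the ones listed above.

```rust
/// First 8 bytes of the 80-byte node header (padding at +0x06 is skipped).
#[derive(Debug, PartialEq, Eq)]
struct NodeHeaderPrefix {
    type_flags: u16,  // +0x00
    node_number: u16, // +0x02, the key animation nodes use to find their bone
    name_index: u16,  // +0x04, index into the name table
}

/// Parse the prefix, returning None if fewer than 8 bytes are available.
fn parse_node_header_prefix(node: &[u8]) -> Option<NodeHeaderPrefix> {
    let b = node.get(..8)?;
    let u16_at = |at: usize| u16::from_le_bytes([b[at], b[at + 1]]);
    Some(NodeHeaderPrefix {
        type_flags: u16_at(0x00),
        node_number: u16_at(0x02),
        name_index: u16_at(0x04),
    })
}
```

Naming both fields in one struct makes it obvious that they are distinct values, which is exactly what the single NODE_ID constant obscured.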

MDX per-mesh seeking

Our MDX reader used a cumulative cursor assuming non-skin-first DFS ordering. For the ~51% of vanilla models where MDX layout doesn’t match that assumption, vertex data was assigned to the wrong mesh nodes. Self-round-trip tests couldn’t detect this – we were reading and writing the same wrong assignment, which is a consistency check for the tool’s own output, not for correctness against vanilla.

Fix: seek to info.mdx_data_offset (the +0x144 field) for each mesh, matching kotorblender and mdledit behaviour. The cumulative-cursor logic remains in the writer, which produces its own layout and backpatches the offset field; the reader trusts whatever the file says.
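A sketch of the per-mesh seek on the reader side. The function name is hypothetical, and it assumes a mesh's MDX block is simply vertex_count × vertex_struct_size bytes starting at the stored offset; any trailing padding or terminator rows are outside the scope of this example.

```rust
/// Return the MDX bytes belonging to one mesh, or None if the stored
/// offset and sizes do not fit inside the MDX file.
fn mesh_vertex_bytes(
    mdx: &[u8],
    mdx_data_offset: u32,      // the +0x144 field
    vertex_count: usize,       // from the mesh header
    vertex_struct_size: usize, // bytes per interleaved vertex
) -> Option<&[u8]> {
    let start = mdx_data_offset as usize;
    // Checked arithmetic so a corrupt header cannot overflow or panic.
    let len = vertex_count.checked_mul(vertex_struct_size)?;
    let end = start.checked_add(len)?;
    mdx.get(start..end)
}
```

Because each mesh is located from its own header, the order in which meshes appear in the MDX no longer matters to the reader.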

Name-table dead entries

220 vanilla K1 models have name tables containing entries that no node references. These turn out to be walkmesh node names (*_wok, *_pwk, *_dwk variants) from BioWare’s build pipeline, which apparently shared a single name table across the MDL and WOK outputs.

The engine only performs indexed lookups via name_index; it never iterates the full table or validates the count. Extra entries are harmless dead weight.

Decision: not preserved. Our writer builds the name table from the node tree (matching kotorblender and mdledit), producing files that are functionally identical but 20–80 bytes shorter. This is a known, benign size delta – not a parity bug.

Emitter controller code verification

All 48 emitter controller type codes were independently verified against the engine binary via Ghidra. For each, we located the __stricmp call for the ASCII field name and traced the controller type value stored on match. Every code matched mdledit’s ReturnControllerName table exactly – no additions, no omissions.

One naming correction: the engine’s canonical string for code 200 is "lightningZigzag" (camelCase Z). mdledit has "lightningzigzag" (all lowercase). Functionally identical because the engine uses __stricmp (case-insensitive), but the engine’s capitalization is now what we emit.

Corpus validation status

As of 2026-02-24: 2832/2832 (100%) structural round-trip success (parse → write → parse → compare). This was achieved after fixing three comparison issues in the test harness:

NaN ≠ NaN (IEEE 754): 1559 false failures – a NaN never compares equal to anything, including itself. Fixed with bitwise f32::to_bits() comparison.
Parent index ordering: 135 mismatches from depth-first vs. original node ordering. The binary format preserves node ordering but our parent-index reconstruction uses DFS. Semantically equivalent, numerically different – skipped in comparison.
Face NaN values: exactly one model (w_dblsbr_001) has NaN in its pre-computed plane_normal/plane_distance, because one of its faces is degenerate. Round-trips correctly once NaN-aware comparison is used.
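The NaN-aware comparison amounts to comparing bit patterns instead of float values. A minimal sketch, with hypothetical helper names:

```rust
/// Bitwise equality: identical bit patterns compare equal, including NaNs.
fn f32_bits_eq(a: f32, b: f32) -> bool {
    a.to_bits() == b.to_bits()
}

/// Element-wise bitwise equality for float arrays such as plane normals.
fn f32_slices_bits_eq(a: &[f32], b: &[f32]) -> bool {
    a.len() == b.len() && a.iter().zip(b).all(|(x, y)| x.to_bits() == y.to_bits())
}
```

One side effect worth knowing: bitwise comparison also distinguishes +0.0 from -0.0, which `==` treats as equal. For a round-trip harness that is the stricter, and arguably more appropriate, behaviour.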

Byte-level MDL/MDX equality is a separate target – 1784 of 2444 MDX files match byte-for-byte, with the remaining 660 showing the non-standard BioWare compiler traversal discussed earlier.

Appendix

Emitter field map

304 bytes total (80 base + 224 extra). Emitter-specific data:

Node offset	Extra offset	Field	Type
+0x50	+0x00	`deadspace`	f32
+0x54	+0x04	`blast_radius`	f32
+0x58	+0x08	`blast_length`	f32
+0x5C	+0x0C	`num_branches`	i32
+0x60	+0x10	`control_pt_smoothing`	i32
+0x64	+0x14	`x_grid`	i32
+0x68	+0x18	`y_grid`	i32
+0x6C	+0x1C	`spawn_type`	i32
+0x70	+0x20	`update`	char[32]
+0x90	+0x40	`render`	char[32]
+0xB0	+0x60	`blend`	char[32]
+0xD0	+0x80	`texture`	char[32]
+0xF0	+0xA0	`chunk_name`	char[16]
+0x100	+0xB0	`two_sided_tex`	i32
+0x104	+0xB4	`loop`	i32
+0x108	+0xB8	`render_order`	u16
+0x10A	+0xBA	`frame_blending`	u8
+0x10B	+0xBB	`depth_texture_name`	char[16]
+0x11B	+0xCB	(reserved)	21 bytes
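As an example of consuming the table, here is a sketch that extracts the fixed-width string fields from a raw emitter node. The constant and function names are hypothetical, and it assumes the char[N] fields are NUL-padded byte strings, with anything after the first NUL ignored.

```rust
/// Total emitter node size from the table (80 base + 224 extra).
const EMITTER_NODE_SIZE: usize = 304;

/// (node offset, byte length) pairs copied from the table.
const UPDATE: (usize, usize) = (0x70, 32);
const RENDER: (usize, usize) = (0x90, 32);
const BLEND: (usize, usize) = (0xB0, 32);
const TEXTURE: (usize, usize) = (0xD0, 32);
const CHUNK_NAME: (usize, usize) = (0xF0, 16);

/// Read one fixed-width string field, stopping at the first NUL byte.
/// Returns None if the node buffer is too short for the field.
fn emitter_string(node: &[u8], field: (usize, usize)) -> Option<String> {
    let (offset, len) = field;
    let raw = node.get(offset..offset.checked_add(len)?)?;
    let end = raw.iter().position(|&b| b == 0).unwrap_or(raw.len());
    Some(String::from_utf8_lossy(&raw[..end]).into_owned())
}
```

A field that fills its whole width has no terminator, so the code falls back to the full field length rather than reading past it.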

LOD naming convention

When a model has cullWithLOD set, the engine searches for LOD variants by appending suffixes to the model name:

<name>_x – medium LOD
<name>_z – far LOD

Loaded via FindModel(name + "_x") and FindModel(name + "_z") as separate Model instances linked to the primary. Not relevant to format parsing, but useful for model validation and lint rules.
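For a lint rule, the candidate LOD names can be generated with a one-liner. The function name is hypothetical; only the two suffixes come from the convention above.

```rust
/// Names the engine would look up for the medium and far LOD variants.
fn lod_variant_names(name: &str) -> [String; 2] {
    [format!("{}_x", name), format!("{}_z", name)]
}
```

A validator could check whether each candidate exists in the resource set whenever cullWithLOD is set on the primary model.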

Resource type IDs

Format	Resource type
MDL	2002 (0x7D2)
MDX	3008 (0xBC0)

These map to the KEY/BIF resource type system. CAuroraInterface::RequestModel at 0x0070d8d0 resolves models through a sorted requestedModelList.

Dynamic type casts

The engine exposes As* functions for type-checked downcasts. Caller counts (cross-references found in the decompilation) are a rough proxy for how heavily each type is used:

Function	Callers
`AsModel`	34
`AsMdlNodeTriMesh`	14
`AsMdlNodeEmitter`	11
`AsAnimation`	7
`AsMdlNodeLightsaber`	5
`AsMdlNodeSkin`	4
`AsMdlNodeAABB`	3
`AsMdlNodeDanglyMesh`	3
`AsMdlNodeLight`	3
`AsMdlNodeAnimMesh`	2
`AsMdlNodeCamera`	2
`AsMdlNodeReference`	2

TriMesh (14) and Emitter (11) are the most-queried node types – useful signal for prioritizing implementation completeness.

Binary MDL call graph

For reference when reading Ghidra decompilations:

NewCAurObject (0x00449cc0)
└── FindModel (0x00464110)           [by name; checks cache via BinarySearchModel]
    └── LoadModel (0x00464200)       [on cache miss]
        └── IODispatcher::ReadSync (0x004a15d0)
            └── Input::Read (0x004a14b0)          ← format dispatcher
                ├── InputBinary::Read (0x004a1260)   if first_byte == 0x00
                │   └── Reset / ResetLite                (pointer rewriting)
                │       ├── ResetMdlNode                  (per-node dispatch)
                │       │   ├── ResetMdlNodeParts         (base fields)
                │       │   ├── ResetTriMesh              (mesh subtypes)
                │       │   ├── ResetLight                (light extras)
                │       │   ├── ResetSkin, ResetAnim, ...
                │       │   └── ResetAABBTree             (recursive tree walk)
                │       └── ResetAnimation                (per-animation)
                └── FuncInterp loop                 otherwise (ASCII MDL)
    └── CreateInstanceTreeR (0x00449200)  [builds runtime Part tree from MdlNode tree]

Key Ghidra addresses

For anyone continuing this archaeology, the foundation set of function addresses in swkotor.exe (K1 GOG build):

Function	Address
`Input::Read`	`0x004a14b0`
`InputBinary::Read`	`0x004a1260`
`InputBinary::Reset`	`0x004a1030`
`InputBinary::ResetMdlNode`	`0x004a0900`
`InputBinary::ResetMdlNodeParts`	`0x004a0b60`
`InputBinary::ResetTriMeshParts`	`0x004a0c00`
`InputBinary::ResetAABBTree`	`0x004a0260`
`InputBinary::ResetLight`	`0x004a05e0`
`InputBinary::ResetSkin`	`0x004a01b0`
`InputBinary::ResetDangly`	`0x004a0100`
`InputBinary::ResetAnim`	`0x004a0060`
`InputBinary::ResetLightsaber`	`0x004a0460`
`InputBinary::ResetAnimation`	`0x004a0fb0`
`MdlNodeTriMesh::InternalPostProcess`	`0x0043cf00`
`MdlNodeTriMesh::InternalGenVertices`	`0x00439df0`
`MdlNodeTriMesh::InternalParseField`	`0x004658b0`
`MdlNodeEmitter::InternalParseField`	`0x004658b0`
`MdlNodeEmitter::InternalCreateInstance`	`0x0049d5c0`
`PartTriMesh::GetMinimumSphere`	`0x00443330`
`LightPartTriMesh`	`0x0046a9e0`
`NewController::Control`	`0x00483330`
`NewController::GetFloatValue`	`0x00482bf0`
`Model` constructor	`0x0044aa70`
`MaxTree` constructor	`0x0044a900`
`ParseNode`	`0x004680e0`
Node type flag table	`0x00740a18`
