Model Format

TinyRustLM currently uses a custom, local `.slm` model format. It is the format the runtime can validate and load in the browser today.

Why `.slm` exists

The `.slm` path is a deliberately constrained proof lane. It avoids the complexity of broad external model formats while the runtime stabilizes, and it keeps the following in scope:

header validation;
checksum validation;
tokenizer metadata;
tensor directory validation;
f32, q8_0, and q4_0 storage;
context/token limits;
adapter compatibility;
browser byte budgets;
manifest sidecars.
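
As a rough illustration of how the first few checks in this list could fit together, the sketch below validates a byte budget, a header, and a checksum before anything else is read. Everything concrete in it is an assumption made for the example, not the real `.slm` layout: the `SLM0` magic, the version number, the 17-byte header, the storage tags, and the use of FNV-1a as a stand-in checksum are all hypothetical.

```rust
// Hypothetical sketch: the magic bytes, version, field layout, storage tags,
// and checksum algorithm are assumptions, not the real `.slm` layout.

/// Storage kinds named in the format description.
#[allow(non_camel_case_types)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Storage {
    F32,
    Q8_0,
    Q4_0,
}

#[derive(Debug, PartialEq, Eq)]
pub enum SlmError {
    OverBudget { len: usize, budget: usize },
    TooShort,
    BadMagic,
    UnsupportedVersion(u32),
    UnknownStorage(u8),
    ChecksumMismatch { expected: u64, actual: u64 },
}

pub const MAGIC: [u8; 4] = *b"SLM0"; // assumed magic
pub const VERSION: u32 = 1; // assumed version
pub const HEADER_LEN: usize = 17; // magic(4) + version(4) + storage(1) + checksum(8)

/// FNV-1a 64-bit, used here only as a stand-in for the real checksum.
pub fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325;
    for &b in bytes {
        hash ^= b as u64;
        hash = hash.wrapping_mul(0x100000001b3);
    }
    hash
}

/// Runs the checks in order: byte budget, length, magic, version,
/// storage tag, then checksum over everything after the header.
pub fn validate(file: &[u8], byte_budget: usize) -> Result<Storage, SlmError> {
    if file.len() > byte_budget {
        return Err(SlmError::OverBudget { len: file.len(), budget: byte_budget });
    }
    if file.len() < HEADER_LEN {
        return Err(SlmError::TooShort);
    }
    if !file.starts_with(&MAGIC) {
        return Err(SlmError::BadMagic);
    }
    let mut v = [0u8; 4];
    v.copy_from_slice(&file[4..8]);
    let version = u32::from_le_bytes(v);
    if version != VERSION {
        return Err(SlmError::UnsupportedVersion(version));
    }
    let storage = match file[8] {
        0 => Storage::F32,
        1 => Storage::Q8_0,
        2 => Storage::Q4_0,
        other => return Err(SlmError::UnknownStorage(other)),
    };
    let mut c = [0u8; 8];
    c.copy_from_slice(&file[9..17]);
    let expected = u64::from_le_bytes(c);
    let actual = fnv1a64(&file[HEADER_LEN..]);
    if expected != actual {
        return Err(SlmError::ChecksumMismatch { expected, actual });
    }
    Ok(storage)
}

/// Builds a file in the assumed layout, for experimenting with `validate`.
pub fn build_file(storage_tag: u8, payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(HEADER_LEN + payload.len());
    out.extend_from_slice(&MAGIC);
    out.extend_from_slice(&VERSION.to_le_bytes());
    out.push(storage_tag);
    out.extend_from_slice(&fnv1a64(payload).to_le_bytes());
    out.extend_from_slice(payload);
    out
}
```

Checking the byte budget first means an oversized file is rejected before any of its contents are interpreted, which is one plausible way to keep browser memory use bounded.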

The domain name is GGUF-facing, but the current source does not claim full GGUF import. GGUF support belongs on the roadmap until it has the same style of validation and proof.

Core `.slm` concepts

| Concept | Meaning |
| --- | --- |
| Header | Versioned binary record that defines shape, quantization, tokenizer, tensor count, and checksums. |
| Tensor directory | Explicit map of tensor names, offsets, sizes, types, and layout. |
| Tokenizer section | Byte tokenizer or custom BPE fixture metadata. |
| Quantization metadata | f32, q8_0, or q4_0 route metadata. |
| Manifest sidecar | Human- and agent-readable route that records checksum, shape, source kind, admission status, and quality boundary. |
| Admission status | Whether the artifact is accepted for runtime smoke, pending evaluation, or rejected. |
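
To make the tensor directory concept more concrete, here is a minimal sketch of the kind of validation an explicit map of names, offsets, and sizes allows. The `TensorEntry` struct, the error type, and the specific rules (unique names, entries inside the data region, no overlapping byte ranges) are assumptions for illustration and are not taken from the TinyRustLM source.

```rust
use std::collections::HashSet;

// Hypothetical sketch: these types and rules are assumptions, not the
// actual `.slm` tensor directory.

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct TensorEntry {
    pub name: String,
    pub offset: u64, // byte offset into the tensor data region
    pub size: u64,   // byte length of the tensor
}

#[derive(Debug, PartialEq, Eq)]
pub enum DirectoryError {
    DuplicateName(String),
    OutOfBounds(String),
    Overlap(String, String),
}

/// Checks that names are unique, every entry fits inside `data_len` bytes,
/// and no two entries cover the same bytes.
pub fn validate_directory(entries: &[TensorEntry], data_len: u64) -> Result<(), DirectoryError> {
    let mut names = HashSet::new();
    for e in entries {
        if !names.insert(e.name.as_str()) {
            return Err(DirectoryError::DuplicateName(e.name.clone()));
        }
        // checked_add guards against offset + size wrapping around.
        match e.offset.checked_add(e.size) {
            Some(end) if end <= data_len => {}
            _ => return Err(DirectoryError::OutOfBounds(e.name.clone())),
        }
    }
    // After sorting by offset, any overlap shows up between neighbours.
    let mut sorted: Vec<&TensorEntry> = entries.iter().collect();
    sorted.sort_by_key(|e| e.offset);
    for pair in sorted.windows(2) {
        let (a, b) = (pair[0], pair[1]);
        // The addition cannot overflow: bounds were checked above.
        if a.offset + a.size > b.offset {
            return Err(DirectoryError::Overlap(a.name.clone(), b.name.clone()));
        }
    }
    Ok(())
}
```

A real directory would also check each entry's type and layout against the quantization metadata; that part is left out because the source does not describe those encodings.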

Current model artifacts

The source bundle references deterministic smoke models and TinyLM-16M f32/q8_0/q4_0 artifacts. Those artifacts exist to prove runtime execution, not to claim finished assistant quality.

Quality boundary

A manifest can say `runtime-execution-smoke-only` or `trained_quality_claim=not-claimed`. The public site must preserve that distinction. A model that can load and generate one deterministic smoke output is not automatically a trained, useful assistant.
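
One way to keep this boundary from eroding is to make the check fail closed. The sketch below assumes, purely for illustration, that the manifest is a list of `key=value` lines and that a hypothetical `claimed` value exists alongside `not-claimed`; the source only confirms the `trained_quality_claim=not-claimed` string itself, not the sidecar syntax around it.

```rust
// Hypothetical sketch: the `key=value` manifest syntax and the `claimed`
// value are assumptions; only `trained_quality_claim=not-claimed` appears
// in the source text.

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum QualityClaim {
    NotClaimed,
    Claimed,
}

/// Reads the quality claim from manifest text. A missing key or an
/// unrecognized value is treated as `NotClaimed` (fail closed).
pub fn parse_quality_claim(manifest_text: &str) -> QualityClaim {
    for line in manifest_text.lines() {
        if let Some((key, value)) = line.split_once('=') {
            if key.trim() == "trained_quality_claim" {
                return match value.trim() {
                    "claimed" => QualityClaim::Claimed,
                    _ => QualityClaim::NotClaimed,
                };
            }
        }
    }
    QualityClaim::NotClaimed
}

/// Passing a runtime smoke test is necessary but never sufficient:
/// the manifest must also carry an explicit quality claim.
pub fn may_present_as_trained(smoke_passed: bool, claim: QualityClaim) -> bool {
    smoke_passed && claim == QualityClaim::Claimed
}
```

With this shape, a model whose manifest says nothing about quality is handled exactly like one that says `not-claimed`, so a successful smoke run alone can never be presented as trained quality.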

Future route

Future trained or converted models should enter through validated source manifests, provenance sidecars, runtime-smoke evidence, eval sidecars, promotion gates, selector admission, and browser route proof.
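
The route above reads as an ordered sequence of gates. A minimal sketch of that idea follows, assuming the gates must be passed in the order listed; the `Gate` enum and `next_gate` function are hypothetical names, and the source does not state that the ordering is strict.

```rust
// Hypothetical sketch: gate names mirror the list in the text, but the
// strict ordering and this API are assumptions.

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Gate {
    SourceManifest,
    ProvenanceSidecar,
    RuntimeSmoke,
    EvalSidecar,
    PromotionGate,
    SelectorAdmission,
    BrowserRouteProof,
}

/// Gates in the order the text lists them.
pub const GATES: [Gate; 7] = [
    Gate::SourceManifest,
    Gate::ProvenanceSidecar,
    Gate::RuntimeSmoke,
    Gate::EvalSidecar,
    Gate::PromotionGate,
    Gate::SelectorAdmission,
    Gate::BrowserRouteProof,
];

/// Returns the earliest gate that has no evidence yet, or `None` when
/// every gate has been passed.
pub fn next_gate(passed: &[Gate]) -> Option<Gate> {
    GATES.iter().copied().find(|g| !passed.contains(g))
}
```

Because `next_gate` always reports the earliest missing gate, evidence for a later stage (for example an eval sidecar) does not let a model skip an earlier one such as runtime smoke.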