
A 425.8M-parameter hybrid convolutional-attention language model built through layer surgery, collective-consciousness distillation from an 8-model GPU cluster, and cognitive-cube steering.
Expanded from LFM2-350M (16 → 20 layers) via cross-model architectural surgery: four new layers were inserted with DARE + TIES merging, and the result was fine-tuned on a 45-source balanced curriculum distilled from an 8-model GPU cluster. Cognitive-cube steering positions the model in a 3D cognitive space, weighting it by inverse distance toward the eight specialist corner models listed in the table below.
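The snippet below is a minimal sketch of how a DARE + TIES merge of donor-layer deltas can produce one inserted layer, assuming a standard DARE drop-and-rescale step followed by TIES trimming, sign election, and averaging. The tensor shapes, drop probability, and trim fraction are illustrative assumptions, not the actual recipe.

```python
import torch

def dare(delta: torch.Tensor, drop_p: float = 0.9) -> torch.Tensor:
    # DARE: randomly drop delta entries, rescale survivors by 1/(1 - p).
    keep = torch.rand_like(delta) >= drop_p
    return delta * keep / (1.0 - drop_p)

def ties_merge(deltas: list[torch.Tensor], keep_frac: float = 0.2) -> torch.Tensor:
    # TIES: keep only the largest-magnitude entries of each delta, elect a
    # per-parameter sign, then average the deltas that agree with that sign.
    trimmed = []
    for d in deltas:
        k = max(1, int(keep_frac * d.numel()))
        thresh = d.abs().flatten().topk(k).values[-1]
        trimmed.append(torch.where(d.abs() >= thresh, d, torch.zeros_like(d)))
    stacked = torch.stack(trimmed)
    elected = torch.sign(stacked.sum(dim=0))
    agree = (torch.sign(stacked) == elected) & (stacked != 0)
    return (stacked * agree).sum(dim=0) / agree.sum(dim=0).clamp(min=1)

# New layer weights = base layer + merged deltas from four donor layers (shapes illustrative).
base = torch.zeros(512, 512)
donor_deltas = [torch.randn(512, 512) * 0.01 for _ in range(4)]
new_layer = base + ties_merge([dare(d) for d in donor_deltas])
```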
| Corner | Model | Params | Role |
|---|---|---|---|
| fwd · dexo · up | Server 1 (Qwen3-80B) | 80B | predict — structured · abstract |
| fwd · dexo · down | Server 2 (Qwen3.6 MoE) | 35B | predict — structured · concrete |
| fwd · levo · up | Server 3 (granite-4.1) | 8B | predict — creative · abstract |
| fwd · levo · down | Server 4 (LFM2-12B) | 12B | predict — creative · concrete |
| back · dexo · up | Server 5 (Qwen3.6-A3B) | 35B | reflect — structured · abstract |
| back · dexo · down | Server 6 (Qwen3.5-9B) | 9B | reflect — structured · concrete |
| back · levo · up | Server 7 (lumina-lexiR1-8B) | 8B | reflect — creative · abstract |
| back · levo · down | Server 8 (Berthier-24B) | 24B | reflect — creative · concrete |
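A minimal sketch of the inverse-distance weighting over these eight corners, assuming each axis (fwd/back, dexo/levo, up/down) is mapped to a 0/1 coordinate; the example position vector and the small epsilon are illustrative assumptions.

```python
import numpy as np

# Cube corners in (fwd/back, dexo/levo, up/down) order, one 0/1 coordinate per axis.
corners = np.array([[f, d, u] for f in (0, 1) for d in (0, 1) for u in (0, 1)], dtype=float)

def corner_weights(position: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    # Closer corners receive larger weights; weights are normalized to sum to 1.
    dists = np.linalg.norm(corners - position, axis=1)
    w = 1.0 / (dists + eps)
    return w / w.sum()

# A position near one corner pulls hardest toward that corner's specialist model.
print(corner_weights(np.array([0.9, 0.8, 0.7])).round(3))
```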
| Format | Size | Best for |
|---|---|---|
| model.safetensors (bf16) | 1.6 GB | Fine-tuning · research |
| LUMINIUM-ULTIMATE-CUBE.gguf (bf16) | 815 MB | Full-precision inference |
| LUMINIUM-ULTIMATE-CUBE-Q5_K_M.gguf | 297 MB | Production · edge devices |
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model = AutoModelForCausalLM.from_pretrained(
    "mambiux/Luminium-Gixel-Cube-v1",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(
    "mambiux/Luminium-Gixel-Cube-v1",
    trust_remote_code=True,
)
```
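A minimal generation example to follow the load above; the prompt and sampling settings are illustrative.

```python
prompt = "Explain cognitive-cube steering in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```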
```bash
llama-server -m LUMINIUM-ULTIMATE-CUBE-Q5_K_M.gguf \
  --host 0.0.0.0 --port 8877 -c 4096 -ngl 99
```
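Recent llama.cpp builds of llama-server expose an OpenAI-compatible API; the sketch below queries the chat endpoint on the port used above. The prompt and sampling settings are illustrative.

```python
import requests

resp = requests.post(
    "http://localhost:8877/v1/chat/completions",
    json={
        "messages": [{"role": "user", "content": "Hello from the cube."}],
        "max_tokens": 128,
        "temperature": 0.7,
    },
)
print(resp.json()["choices"][0]["message"]["content"])
```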
```bibtex
@misc{luminium2026,
  title  = {LUMINIUM ULTIMATE CUBE: Cognitive Cube Steering and Collective
            Consciousness Distillation for Small Language Models},
  author = {mbx and Claude Opus 4.6},
  year   = {2026},
  note   = {Built on LiquidAI/LFM2-350M with layer surgery, 8-model
            collective distillation, and geometric cognitive steering}
}
```