🧠 SmolLM2-360M-Think
Think Instillation — DuoNeural, 2026
A 360M-parameter model trained to reason through problems in <think> traces before answering.
No giant teacher model. No distillation. Just GRPO with dead-prompt filtering teaching a small model
to think for itself.
ARC-Easy: 28/100 correct. GRPO raised accuracy from 0.250 after SFT to 0.280 in the final model, a gain of 0.030.
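The page does not define "dead-prompt filtering", so the following is only a minimal sketch under one assumed reading: a prompt is "dead" when every completion sampled for it earns the same reward, which makes all group-relative advantages zero and leaves GRPO nothing to learn from. The function names, the tolerance, and the toy rewards are hypothetical and not taken from the DuoNeural training code.

```python
from statistics import mean, pstdev


def group_advantages(rewards, eps=1e-6):
    """Group-relative advantages as used in GRPO: (r - mean) / (std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def is_dead(rewards, tol=1e-8):
    """ASSUMED definition: every sampled completion received the same reward."""
    return max(rewards) - min(rewards) <= tol


def filter_dead_prompts(batch):
    """Keep only prompts whose reward group has some spread.

    `batch` maps a prompt id to the list of rewards for its sampled completions.
    """
    return {prompt: rewards for prompt, rewards in batch.items() if not is_dead(rewards)}


# Toy rewards (1.0 = correct answer, 0.0 = wrong), four samples per prompt.
batch = {
    "q1": [1.0, 0.0, 0.0, 1.0],  # mixed outcomes: carries a learning signal
    "q2": [0.0, 0.0, 0.0, 0.0],  # always wrong: no signal under this reading
    "q3": [1.0, 1.0, 1.0, 1.0],  # always right: no signal under this reading
}

live = filter_dead_prompts(batch)
advantages = {prompt: group_advantages(rewards) for prompt, rewards in live.items()}
```

Under this reading, only `q1` survives the filter, and its correct samples get positive advantages while its wrong samples get negative ones.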
Type a multiple-choice question below and watch the model reason through it.
Examples — click any row to load it:
💭 Reasoning Trace
🎯 Final Answer
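The two panels above suggest the raw generation is split into a reasoning trace and a final answer. A minimal sketch of such a split follows, assuming the model emits a single <think>...</think> block followed by the answer text, and that the answer names one option letter from A to D. The helper names, the sample output, and the letter pattern are hypothetical; the demo's actual parsing is not shown on this page.

```python
import re

# Non-greedy match so only the first <think>...</think> block is captured.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)
# A standalone option letter; assumes choices are labelled A-D.
CHOICE_RE = re.compile(r"\b([A-D])\b")


def split_think(text):
    """Return (reasoning_trace, final_answer) from raw model output."""
    match = THINK_RE.search(text)
    if match is None:
        # No trace found: treat the whole output as the answer.
        return "", text.strip()
    return match.group(1).strip(), text[match.end():].strip()


def extract_choice(answer):
    """Return the first standalone option letter in the answer, or None."""
    match = CHOICE_RE.search(answer)
    return match.group(1) if match else None


# Hypothetical model output, for illustration only.
raw = "<think>Plants need sunlight to make food, so B fits.</think>\nAnswer: B"
trace, answer = split_think(raw)
choice = extract_choice(answer)
```

Falling back to an empty trace when no <think> block is present keeps the answer panel populated even if the model skips its reasoning step.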
DuoNeural — open research lab · one human, two AIs, shared curiosity
Think Instillation technique by Archon (DuoNeural). GRPO with dead-prompt filtering.
Model: DuoNeural/SmolLM2-360M-Think-R18 · huggingface.co/DuoNeural