Apriel-Nemotron-15b-Thinker is a 15 billion-parameter reasoning model in ServiceNow's Apriel SLM series that achieves competitive performance against similarly sized state-of-the-art models like o1-mini, QWQ-32b, and EXAONE-Deep-32b, while maintaining only half the memory footprint of those alternatives. It builds upon the Apriel-15b-base checkpoint through a three-stage training pipeline (CPT, SFT and GRPO).

Highlights:
- Half the size of SOTA models like QWQ-32b and EXAONE-32b, and hence memory efficient.
- Consumes 40% fewer tokens than QWQ-32b, making it highly efficient in production. 🚀🚀🚀
- On par with or better on tasks like MBPP, BFCL, Enterprise RAG, MT Bench, MixEval, IFEval and Multi-Challenge, making it well suited for agentic / enterprise tasks.
- Competitive performance on academic benchmarks like AIME-24, AIME-25, AMC-23, MATH-500 and GPQA considering model size.
Notes: 6 FLOP / parameter / token * 15 * 10^9 parameters * 100 * 10^9 tokens = 9e+21 FLOP [1 epoch assumed] -> "Likely" confidence
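The estimate above can be reproduced with a few lines of Python; the 6-FLOP/parameter/token rule of thumb and the single-epoch assumption are taken directly from the note, and the variable names are illustrative.

```python
# Back-of-the-envelope training-compute estimate for the CPT stage,
# using the ~6 FLOP per parameter per token rule of thumb.
params = 15e9            # 15B parameters
tokens = 100e9           # ~100B continual-pretraining tokens (1 epoch assumed)
flop_per_param_token = 6

total_flop = flop_per_param_token * params * tokens
print(f"{total_flop:.1e} FLOP")   # -> 9.0e+21 FLOP
```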
Size Notes:
1. Mid training / Continual Pre-training: the model is trained on 100+ billion tokens of carefully curated examples drawn from mathematical reasoning, coding challenges, scientific discourse and logical puzzles.
2. Supervised Fine-Tuning (SFT): the model is then fine-tuned on 200,000 high-quality demonstrations covering mathematical and scientific problem-solving, coding tasks, generic instruction-following scenarios, API/function invocation use cases, etc.
3. Reinforcement learning with GRPO (see the sketch after this list).
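Since the final stage uses GRPO, the sketch below illustrates the group-relative advantage computation at the heart of that algorithm. This is a minimal illustration, not ServiceNow's training code; the function name, the group size, and the reward values are made up for the example.

```python
import statistics

def group_relative_advantages(rewards):
    """GRPO scores each sampled response relative to its group:
    advantage = (reward - group mean) / group std.
    Responses better than their siblings get positive advantages."""
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards) or 1.0   # guard against zero variance
    return [(r - mean) / std for r in rewards]

# Example: 4 responses sampled for one prompt, scored by a reward function.
rewards = [1.0, 0.0, 0.5, 1.0]                # made-up reward values
print(group_relative_advantages(rewards))     # ~[0.90, -1.51, -0.30, 0.90]
```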
Notes: 15B