We've developed a new series of AI models designed to spend more time thinking before they respond. They can reason through complex tasks and solve harder problems than previous models in science, coding, and math. ... We're also releasing OpenAI o1-mini, a faster, cheaper reasoning model that is particularly effective at coding. As a smaller model, o1-mini is 80% cheaper than o1-preview, making it a powerful, cost-effective model for applications that require reasoning but not broad world knowledge.
Notes: We can't make a precise estimate, but training compute seems unlikely to exceed 10^25 FLOP. We think the active parameter count is 10-30B. Using the standard C ≈ 6ND approximation, reaching 10^25 FLOP at the large end of that range (30B active parameters) would require >55T training tokens, i.e. well beyond 10x overtraining relative to Chinchilla-optimal (~20 tokens per parameter, or ~600B tokens for a 30B model).
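As a rough sanity check on that arithmetic, here is a minimal sketch (assuming the standard C ≈ 6ND training-compute approximation and the Chinchilla heuristic of ~20 tokens per parameter; the 30B figure is the top of the active-parameter range above):

```python
# Back-of-envelope: training tokens implied by a 10^25 FLOP budget via C ~= 6*N*D.
C = 1e25   # hypothesized training-compute ceiling, FLOP (assumption from the note)
N = 30e9   # active parameters, top of the 10-30B range above

implied_tokens = C / (6 * N)        # D = C / (6N)
chinchilla_tokens = 20 * N          # Chinchilla heuristic: ~20 tokens per parameter
overtraining = implied_tokens / chinchilla_tokens

print(f"Implied training tokens: {implied_tokens / 1e12:.1f}T")       # ~55.6T
print(f"Chinchilla-optimal tokens: {chinchilla_tokens / 1e12:.1f}T")  # ~0.6T
print(f"Overtraining ratio: {overtraining:.0f}x")                     # ~93x, >> 10x
```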
Notes: Can't get an exact estimate, but we suspect a total parameter count of around 60-120B, with mixture-of-experts active parameters around 10-30B. These models are served at 150-200 tok/s at $4.40/Mtok output, and inference economics (https://epoch.ai/blog/inference-economics-of-language-models) suggests those parameter ranges at that speed and price. MoEs make a given model roughly comparable to a ~50% smaller dense model (https://epoch.ai/gradient-updates/moe-vs-dense-models-inference), which lines up decently with Magistral Small pricing (24B dense, served at a similar speed for the cheaper $1.50/Mtok).
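As an illustration of that cross-check, a minimal sketch (our assumptions, not from the linked posts: serving price at a fixed speed scales roughly linearly with dense-equivalent parameter count; the ~50% MoE-to-dense discount is the figure cited above):

```python
# Rough consistency check of the MoE size estimate against Magistral Small pricing.
total_params_low, total_params_high = 60e9, 120e9  # suspected total parameter count
moe_dense_factor = 0.5  # MoE ~ comparable to a ~50% smaller dense model

dense_equiv_low = total_params_low * moe_dense_factor    # 30B
dense_equiv_high = total_params_high * moe_dense_factor  # 60B

# Magistral Small: 24B dense, ~$1.50/Mtok at a similar serving speed.
# If price scales ~linearly with dense-equivalent size (a strong assumption),
# a 30-60B dense-equivalent model would price at roughly:
price_low = 1.50 * dense_equiv_low / 24e9    # ~$1.88/Mtok
price_high = 1.50 * dense_equiv_high / 24e9  # ~$3.75/Mtok

print(f"Dense-equivalent size: {dense_equiv_low/1e9:.0f}-{dense_equiv_high/1e9:.0f}B")
print(f"Implied price range: ${price_low:.2f}-${price_high:.2f}/Mtok")
# Compare to the $4.40/Mtok output price above: in the right ballpark,
# consistent with the "lines up decently" judgment in the note.
```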