SenseTime hosted a Tech Day event, sharing its strategic plan for advancing AGI (Artificial General Intelligence) development through the combination of “foundation models + large-scale computing” systems. Under this strategy, SenseTime unveiled the “SenseNova” foundation model set, introducing a variety of foundation models and capabilities in natural language processing, content generation, automated data annotation, and custom model training. At the event, SenseTime not only showcased its large language model’s capabilities, but also demonstrated a series of generative AI models and applications, such as text-to-image creation, 2D/3D digital human generation, and complex scenario/detailed object generation. Additionally, it introduced its AGI research and development platform built on the integration of “foundation models + large-scale computing” systems.
Notes: “Over the course of five years, SenseTime has built SenseCore, a leading AI infrastructure with 27,000 GPUs, capable of delivering a total computational power of 5,000 petaflops.” Assuming the entire cluster was used for 30 days of training (a rough average of frontier-model training times since 2016) at a 30% utilization rate: 5000e15 FLOP/s * 0.3 * 30 * 24 * 60 * 60 s = 3.89e24 FLOP. Assuming the model is dense and trained Chinchilla-optimally: 6 * 20 tokens/parameter * (180e9 parameters)**2 = 3.89e24 FLOP. (The two estimates match by coincidence.) The model seems more likely than not to be dense: news coverage of SenseChat 5.0 makes a point of stating its MoE architecture, whereas coverage of SenseChat 1.0 does not mention architecture. Given the uncertainties (the model could be MoE, could have been overtrained or undertrained, and could have trained for a longer or shorter period), the true figure is likely between 1e23 and 3e25 FLOP.
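The two estimates above can be cross-checked with a short sketch. All inputs are the assumptions stated in the notes (5,000 petaFLOP/s cluster, 30-day run, 30% utilization, 180e9 dense parameters, 20 tokens per parameter), not official figures:

```python
# Hardware-based estimate: peak throughput * utilization * training time.
PEAK_FLOPS = 5_000e15              # 5,000 petaFLOP/s cluster throughput (reported)
UTILIZATION = 0.30                 # assumed utilization rate
TRAIN_SECONDS = 30 * 24 * 60 * 60  # assumed 30-day training run

cluster_flop = PEAK_FLOPS * UTILIZATION * TRAIN_SECONDS

# Scaling-law estimate: training FLOP ~ 6 * N * D for a dense model,
# with D = 20 * N tokens (Chinchilla-optimal), i.e. 120 * N**2.
PARAMS = 180e9                     # reported parameter count
TOKENS_PER_PARAM = 20              # Chinchilla-optimal ratio (assumed)

chinchilla_flop = 6 * PARAMS * (TOKENS_PER_PARAM * PARAMS)

print(f"Cluster-based estimate:    {cluster_flop:.2e} FLOP")
print(f"Chinchilla-based estimate: {chinchilla_flop:.2e} FLOP")
```

Both expressions evaluate to about 3.89e24 FLOP, which is the coincidental agreement noted above.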
Notes: https://www.thepaper.cn/newsDetail_forward_22639611 Translation: "SenseTime launched the 'SenseNova' large model system, which includes natural language generation, image generation services, pre-labeling for perception models, and model development. The 'SenseChat' application platform, powered by a 180-billion-parameter Chinese language model, supports ultra-long text comprehension and offers capabilities such as question answering, understanding, and generation in Chinese." The linked article says only "hundreds of billions" of parameters, but the more precise 180-billion figure above seems more credible.