On October 27, 2023, at the 2023 China National Computer Congress (CNCC), Zhipu AI launched its fully self-developed third-generation base large model, ChatGLM3, along with a related series of products.
Notes: Highly speculative. Assume 1 epoch on 1.4T tokens. 6 FLOP/token/param * 1.4e12 tokens * 6e9 params = 50.4 * 10^21 = 5.04 * 10^22 FLOP.
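As a sanity check on the arithmetic, a minimal Python sketch of the same estimate; every input (6 FLOP per token per parameter, ~1.4T tokens, 6B parameters, one epoch) is an assumption stated above, not a confirmed figure:

```python
# Rough training-compute estimate under the stated assumptions
# (1 epoch, ~1.4 trillion tokens, 6 billion parameters, 6 FLOP/token/param).
flop_per_token_per_param = 6
tokens = 1.4e12   # assumed dataset size, carried over from ChatGLM2-6B
params = 6e9      # 6B parameters

training_flop = flop_per_token_per_param * tokens * params
print(f"Estimated training compute: {training_flop:.2e} FLOP")  # ~5.04e+22 FLOP
```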
Size Notes: "ChatGLM-6B was pre-trained on approximately one trillion tokens of Chinese and English corpus." "By further realizing more diverse training datasets, more sufficient training steps, and more optimized training strategies, ChatGLM3-6B topped 42 benchmarks across semantics, mathematics, reasoning, code, and knowledge." The ChatGLM website states that the latest ChatGLM service is based on (and upgraded from) ChatGLM2, which was trained on 1.4T tokens; assume ChatGLM3 was trained on at least the same number of tokens. The Kolors paper (https://github.com/Kwai-Kolors/Kolors/blob/master/imgs/Kolors_paper.pdf) confirms the dataset size: "Consequently, in Kolors, we utilize the open-source ChatGLM3-6B-Base as text encoder, which has been pre-trained with over 1.4 trillion bilingual tokens, resulting in a robust capability for Chinese language understanding." Sources: https://chatglm.cn/ https://github.com/THUDM/ChatGLM2-6B/blob/main/README_EN.md https://www.zhipuai.cn/en/news/76
Notes: 6B parameters, from https://arxiv.org/abs/2406.12793