Qwen1.5 is the beta version of Qwen2, a transformer-based decoder-only language model pretrained on a large amount of data. In comparison with the previously released Qwen, the improvements include: 8 model sizes, comprising 0.5B, 1.8B, 4B, 7B, 14B, 32B, and 72B dense models, plus an MoE model of 14B with 2.7B activated; significant performance improvement in human preference for chat models; multilingual support for both base and chat models; stable support of 32K context length for models of all sizes; and no need for trust_remote_code.
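As a minimal sketch of the last point, Qwen1.5 checkpoints are integrated natively into Hugging Face transformers (reportedly from 4.37 onward), so loading works without trust_remote_code. The model id below is an assumption for the 14B base checkpoint these notes refer to:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed Hugging Face model id for the 14B base checkpoint discussed here.
model_id = "Qwen/Qwen1.5-14B"

# No trust_remote_code flag needed: the Qwen2 architecture ships with transformers.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Qwen1.5 is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```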
Notes: 6 FLOP / parameter / token * 14*10^9 parameters * 4*10^12 tokens = 3.36e+23 FLOP
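A minimal sketch of this arithmetic, assuming the standard 6ND approximation for dense transformer training compute (6 FLOP per parameter per training token); the token count comes from the GitHub response linked below:

```python
# Training-compute estimate: FLOP ≈ 6 * N (parameters) * D (tokens).
params = 14e9                    # Qwen1.5-14B parameter count
tokens = 4e12                    # ~4 trillion pretraining tokens (per QwenLM/Qwen2 issue #97)
flop_per_param_per_token = 6

training_flop = flop_per_param_per_token * params * tokens
print(f"{training_flop:.2e} FLOP")   # -> 3.36e+23 FLOP
```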
Size Notes: 4 trillion training tokens, per this response: https://github.com/QwenLM/Qwen2/issues/97
Notes: 14B parameters