
Alibaba releases Qwen1.5-110B, the first model in the Qwen1.5 series with over 100 billion parameters


Alibaba has released Qwen1.5-110B, the first model in the Qwen1.5 series to exceed 100 billion parameters. Alibaba had previously released Qwen1.5 at several smaller scales: 0.5B, 1.8B, 4B, 7B, 14B, and 72B parameters. According to Alibaba, Qwen1.5-110B is comparable to Meta-Llama3-70B in base-capability evaluations and performs well in chat evaluations, including MT-Bench and AlpacaEval 2.0. Qwen1.5-110B uses the same Transformer decoder architecture as the other Qwen1.5 models and adopts Grouped Query Attention (GQA), which makes inference more efficient. The model supports a context length of 32K tokens and remains multilingual, covering English, Chinese, French, Spanish, German, Russian, Japanese, Korean, Vietnamese, Arabic, and other languages.
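For readers who want to try the model themselves, below is a minimal sketch of loading the chat variant with Hugging Face transformers. The repository name Qwen/Qwen1.5-110B-Chat follows the series' usual Hugging Face naming and is an assumption here, as is having enough GPU memory for the weights to be sharded across devices.

```python
# Minimal sketch: running Qwen1.5-110B-Chat via Hugging Face transformers.
# Assumes the checkpoint is published as "Qwen/Qwen1.5-110B-Chat" and that
# sufficient GPU memory is available for device_map="auto" to shard it.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen1.5-110B-Chat"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the checkpoint's native dtype
    device_map="auto",    # shard the 110B weights across available GPUs
)

# Qwen1.5 chat models ship a chat template with the tokenizer.
messages = [{"role": "user", "content": "Introduce yourself in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Nothing in this snippet is specific to the 110B model: because the whole series shares one architecture, the same code loads any Qwen1.5 checkpoint by swapping the repository name.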

https://qwenlm.github.io/zh/blog/qwen1.5/
https://qwenlm.github.io/zh/blog/qwen1.5-110b/
