
Meta launches the Llama 3 artificial intelligence model, offering a 70B-parameter version with greatly improved performance


Meta's AI research division today launched the Llama 3 model. Trained on 15T (trillion) tokens, the model is offered in both pretrained and instruction-fine-tuned variants, comes in 8B and 70B parameter versions, and can be used in a wide range of environments.

Compared with Llama 2, the new version offers new capabilities and improved reasoning, significantly reduces false refusal rates, supports multiple languages and modalities, provides longer context, and improves overall performance on core tasks such as reasoning and coding.

In several benchmark tests, Llama 3 outperforms Mistral 7B, Mixtral 8x22B, and Google Gemini Pro 1.0, making it the best-performing open AI model currently available.


To maximize Llama 3's performance in chat scenarios, Meta also revamped its instruction fine-tuning approach, combining supervised fine-tuning, rejection sampling, proximal policy optimization (PPO), and direct preference optimization (DPO); the PPO and DPO stages in particular significantly improve Llama 3's performance on reasoning and coding.
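
To make the direct preference optimization step above concrete, here is a minimal sketch of a DPO-style loss in PyTorch. The article does not describe Meta's actual training code, so the function, tensor names, and the beta value are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """DPO-style loss over a batch of preference pairs.

    Each argument is a tensor of per-sequence log-probabilities
    (summed over tokens) for the chosen or rejected response,
    under either the policy being trained or a frozen reference model.
    """
    # Log-ratio of policy vs. reference for each response
    chosen_logratio = policy_chosen_logps - ref_chosen_logps
    rejected_logratio = policy_rejected_logps - ref_rejected_logps

    # DPO pushes the chosen response's log-ratio above the rejected one's
    logits = beta * (chosen_logratio - rejected_logratio)
    return -F.logsigmoid(logits).mean()
```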

For example, Meta notes that when a user asks the model a difficult reasoning question, the model will sometimes produce the correct reasoning trajectory: it knows how to generate the correct answer, but not how to select it. Training on preference rankings teaches the model how to make that selection.
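
As a hedged illustration of such a preference pair, the example below marks the reasoning trajectory that reaches the correct answer as "chosen" and a flawed one as "rejected"; the prompt, responses, and field names are invented for illustration and are not from Meta's training data.

```python
# Hypothetical preference pair for ranking training (illustrative data only).
preference_pair = {
    "prompt": "A train travels 60 km in 45 minutes. What is its average speed in km/h?",
    # Trajectory that reaches the correct answer -> preferred
    "chosen": "45 minutes is 0.75 hours, so speed = 60 / 0.75 = 80 km/h.",
    # Trajectory with a unit-conversion slip -> dispreferred
    "rejected": "45 minutes is 0.45 hours, so speed = 60 / 0.45 is about 133 km/h.",
}

# During training, both responses are scored by the policy and the reference
# model, and a loss such as the DPO sketch above nudges the policy toward
# assigning higher probability to the chosen trajectory.
```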

On the safety side, Meta has released updated versions of Llama Guard 2 and CyberSecEval 2, and also introduces Code Shield, an inference-time guardrail that filters insecure code generated by large language models, improving Llama 3's overall safety.
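
The article does not detail Code Shield's interface, so the following is only a hypothetical sketch of the general idea of an inference-time guard: generated code is scanned before it is returned, and flagged output is withheld. The patterns and function names are invented for illustration.

```python
import re

# Hypothetical insecure-code patterns; a real guard such as Code Shield
# relies on far more sophisticated analysis than simple regexes.
INSECURE_PATTERNS = [
    r"\beval\s*\(",          # arbitrary code execution
    r"\bos\.system\s*\(",    # shell command injection risk
    r"verify\s*=\s*False",   # disabled TLS certificate verification
]

def scan_for_insecure_patterns(code: str) -> list[str]:
    """Return the patterns found in the generated code."""
    return [p for p in INSECURE_PATTERNS if re.search(p, code)]

def guarded_generate(generate_fn, prompt: str) -> str:
    """Wrap a model's generate function with an inference-time code filter."""
    code = generate_fn(prompt)
    if scan_for_insecure_patterns(code):
        # Withhold the response instead of returning potentially unsafe code.
        return "Generated code was withheld: potential security issues detected."
    return code
```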

The Llama 3 models are now available on major cloud computing platforms, including Amazon AWS and Google Cloud, and developers can also download the models for their own deployments.
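
For developers who download the weights, one common route is the Hugging Face transformers library. The sketch below assumes the `meta-llama/Meta-Llama-3-8B-Instruct` repository name and that access to the gated weights has already been granted; it is an illustrative example, not part of the original announcement.

```python
# Minimal sketch of running the 8B Instruct model with Hugging Face
# transformers (assumes gated model access is approved and enough
# GPU memory is available; device_map="auto" requires accelerate).
import torch
from transformers import pipeline

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"  # assumed repo name

generator = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "Summarize what Llama 3 is in one sentence."},
]

output = generator(messages, max_new_tokens=128)
print(output[0]["generated_text"][-1]["content"])
```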

Following the release of Llama 3, Meta is already training the next generation of Llama models, the largest of which has more than 400B parameters and is still in training. Meta hopes to roll out a multimodal version in the coming months and to continue extending context length support.

