Llama 3.1: Meta’s Powerful and Risky New AI Model

Llama 3.1: Meta’s Powerful and Risky New AI Model

The latest iteration of Meta’s Llama AI model, Llama 3.1, promises to make artificial intelligence more accessible and customizable. However, it also raises concerns about the potential dangers of releasing AI without proper safeguards.

Most tech moguls aim to sell AI to the masses, but Mark Zuckerberg is giving away what Meta considers one of the best AI models in the world. On Monday, Meta released the largest and most capable version of its large language model, Llama, for free. While Meta hasn’t disclosed the development costs of Llama 3.1, Zuckerberg recently told investors that the company is spending billions on AI development.

Open Artificial Intelligence

With this latest release, Meta demonstrates that the closed approach favored by most AI companies isn’t the only way to develop AI. However, the company is also positioning itself at the center of the debate about the dangers of releasing AI without controls. Meta trains Llama to avoid producing harmful results by default, but the model can be modified to remove these safeguards.

Meta claims that Llama 3.1 is as intelligent and useful as the best commercial offerings from companies like OpenAI, Google, and Anthropic. According to Meta, the model is the smartest on Earth in certain AI progress metrics.

“It’s very exciting,” says Percy Liang, an associate professor at Stanford University who closely follows open-source AI. Liang notes that if developers find the new model as capable as industry leaders like OpenAI’s GPT-4, many might switch to Meta’s offering: “It will be interesting to see how usage changes.”

In an open letter published with the new model’s release, Zuckerberg, Meta’s CEO, compared Llama to the open-source Linux operating system. When Linux took off in the late ’90s and early 2000s, many major tech companies bet on closed alternatives and criticized open-source software as risky and unreliable. Today, however, Linux is widely used in cloud computing and is the core of the Android operating system.

Unlike OpenAI and Google’s latest models, Llama is not “multimodal,” meaning it isn’t designed to handle images, audio, and video. However, Meta highlights that the model is significantly better at using other software, like a web browser, which many researchers and companies believe could make AI more useful.

Potential Misuse Concerns

Following the launch of OpenAI’s ChatGPT in late 2022, some AI experts called for a moratorium on AI development, fearing the technology could be misused or become too powerful to control. Although existential alarm has since cooled, many experts remain concerned that unrestricted AI models could be misused by hackers or accelerate the development of biological or chemical weapons.

“Cybercriminals worldwide will be thrilled,” says Geoffrey Hinton, Turing Award winner and a pioneer in deep learning, the field of machine learning that underpins large language models.

Hinton joined Google in 2013 but left last year to speak out about the potential risks of advanced AI models. He argues that AI is fundamentally different from open-source software because models cannot be examined in the same way. “People adjust models for their own purposes, and some of those purposes are very bad,” he adds.

Meta has mitigated some fears by cautiously releasing previous versions of Llama. The company asserts that it subjects Llama to rigorous safety testing before release and adds that there is little evidence its models facilitate weapon development. Meta announced new tools to help developers maintain Llama models’ safety by moderating output and blocking attempts to bypass restrictions. Jon Carvill, a Meta spokesperson, said the company will decide case by case whether to release future models.

Dan Hendrycks, a computer scientist and director of the nonprofit Center for AI Safety, dedicated to AI dangers, notes that Meta has generally done a good job testing its models before release. He adds that the new model could help experts understand future risks. “The release of Llama 3 will allow researchers outside major tech companies to conduct much-needed research on AI safety.”

Model Architecture

Training Llama 3.1 405B with over 15 trillion tokens, Meta’s largest model to date, was a significant challenge. To train at this scale and achieve the results in a reasonable time, Meta significantly optimized its training architecture and scaled up to over 16,000 H100 GPUs, making 405B the first Llama model trained at this scale.

Meta’s Goal

“I believe AI will evolve similarly,” writes Zuckerberg in his letter, “Today, several tech companies are developing leading closed models. But open source is quickly closing the gap.”

Meta’s decision to give away its AI isn’t entirely selfless. Previous Llama releases have helped the company secure an influential position among AI researchers, developers, and startups. Liang notes that Llama 3.1 isn’t truly open source because Meta imposes usage restrictions, such as limiting the scale at which the model can be used commercially.

The new Llama version has 405 billion parameters or tunable elements. Meta has also released smaller versions of Llama 3, one with 70 billion parameters and another with 8 billion. Enhanced versions of these models, branded Llama 3.1, were also launched.

Accessibility

Llama 3.1 is too large to run on a standard computer, but Meta asserts that many cloud service providers, such as Databricks, Groq, AWS, and Google Cloud, will offer hosting options for developers to run customized versions of the model. The model can also be accessed on Meta.ai.

Some developers believe the new Llama version could significantly impact AI development. Stella Biderman, executive director of the open-source AI project EleutherAI, notes that Llama 3 isn’t entirely open source. However, Biderman highlights that a change in Meta’s latest license will allow developers to train their own models using Llama 3, something most AI companies currently prohibit: “This is very, very important,” says Biderman.

Instruction and Chat Configuration

With Llama 3.1 405B, Meta aimed to improve the model’s utility, quality, and ability to follow detailed user instructions while ensuring high security levels. The biggest challenges were supporting more functions, the 128K context window, and the model’s increased size.

In post-training, Meta produced final chat models through several rounds of alignment over the pre-trained model. Each round involved supervised fine-tuning (SFT), rejection sampling (RS), and direct preference optimization (DPO). The generation of synthetic data produced the majority of SFT examples, iterating several times to produce higher-quality data across features. Meta invested in multiple data processing techniques to filter this synthetic data for maximum quality, allowing the scaling of fine-tuned data across resources.

The Llama System

Llama templates have always been designed to function as part of a global system that can orchestrate multiple components, including external tools. Meta’s vision extends beyond basic templates to provide developers with access to a broader system, offering the flexibility to design and create customized offerings aligned with their vision. This thinking began last year with the introduction of non-core LLM components.

As part of ongoing efforts to responsibly develop AI beyond the model layer and help others do the same, Meta is launching a complete reference system, including several sample applications and new components like Llama Guard 3, a multilingual security model, and Prompt Guard, an immediate injection filter. These sample applications are open source and can be developed by the community.

Implementing the components of this Llama system vision remains fragmented. Therefore, Meta has begun working with the industry, startups, and the broader community to better define the interfaces of these components. To support this, Meta is launching a request for comments on GitHub for what it calls the “Llama Stack,” a set of standardized, opinionated interfaces on how to build canonical toolset components (fine-tuning, synthetic data generation) and agent applications. Meta hopes these will be adopted across the ecosystem to facilitate interoperability.

Llama 3.1 is a significant step forward in AI development, combining Meta’s commitment to open AI with rigorous safety measures. Its impact on the AI landscape will be closely watched by researchers, developers, and policymakers alike.

Conclusion

Meta’s release of Llama 3.1 marks a significant step in making advanced AI technology accessible to a broader audience. By offering a powerful, customizable AI model for free, Meta challenges the closed approach favored by many tech giants and positions itself at the forefront of the AI development debate. While the open nature of Llama 3.1 raises concerns about potential misuse, Meta’s rigorous safety protocols and commitment to responsible AI deployment provide some assurance. As the AI landscape evolves, the impact of Llama 3.1 will likely influence future AI research and development, promoting innovation while balancing the need for security.

FAQs

Q: What is Llama 3.1?

A: Llama 3.1 is the latest version of Meta’s advanced AI model, designed to be powerful, customizable, and accessible for free.

Q: How does Llama 3.1 compare to other AI models?

A: Meta claims that Llama 3.1 is as intelligent and useful as leading commercial AI models from companies like OpenAI, Google, and Anthropic.

Q: What makes Llama 3.1 unique?

A: Unlike many AI models, Llama 3.1 is designed to be highly customizable and is offered for free, promoting a more open approach to AI development.

Q: What are the potential risks associated with Llama 3.1?

A: The open nature of Llama 3.1 raises concerns about potential misuse, such as by cybercriminals or for developing harmful technologies. Meta has implemented safety protocols to mitigate these risks.

Q: Can Llama 3.1 handle images, audio, and video?

A: No, Llama 3.1 is not multimodal and is not designed to handle images, audio, or video. However, it excels in integrating with other software like web browsers.

Q: How can developers use Llama 3.1?

A: Developers can access Llama 3.1 through cloud service providers like Databricks, Groq, AWS, and Google Cloud, allowing them to run customized versions of the model.

Q: Is Llama 3.1 truly open source?

A: While Llama 3.1 is not entirely open source due to usage restrictions imposed by Meta, it allows developers to train their own models using Llama 3.1, which is a significant step towards more open AI development.

Q: How does Meta ensure the safety of Llama 3.1?

A: Meta conducts rigorous safety testing and has released tools to help developers maintain the security of Llama models, including moderating outputs and blocking attempts to bypass restrictions.

Q: What future developments can we expect for Llama?

A: Meta plans to continue refining Llama models and releasing future versions, balancing innovation with safety considerations.

Q: Where can I access Llama 3.1?

A: Llama 3.1 can be accessed via Meta.ai and through various cloud service providers, offering flexible options for developers.


To view or add a comment, sign in

Insights from the community

Others also viewed

Explore topics