Meta has released Llama 4, a new collection of AI models in its Llama family, announcing it on a Saturday.
The Llama 4 collection comprises three new models: Llama 4 Scout, Llama 4 Maverick, and Llama 4 Behemoth. According to Meta, these models were trained on large amounts of unlabeled text, image, and video data, giving them broad visual understanding.
The success of open models from the Chinese AI lab DeepSeek, which perform comparably to or even surpass Meta’s previous flagship Llama models, reportedly accelerated the development of Llama 4. Meta allegedly established war rooms to analyze how DeepSeek reduced the cost of running and deploying models like R1 and V3.
Scout and Maverick are currently available on Llama.com and through Meta’s partners, including the AI development platform Hugging Face, while Behemoth is still in training. Meta has updated its AI-powered assistant, Meta AI, to utilize Llama 4 in 40 countries, although multimodal features are limited to the U.S. in English for now.
However, some developers may have concerns regarding the Llama 4 license.
Users and companies based in or with a primary place of business in the EU are prohibited from using or distributing the models, likely due to the region’s AI and data privacy laws. Additionally, companies with over 700 million monthly active users must request a special license from Meta, which the company can approve or deny at its discretion.
“The Llama 4 models mark the beginning of a new era for the Llama ecosystem,” Meta stated in a blog post. “This is just the start for the Llama 4 collection.”

According to Meta, Llama 4 is the company’s first cohort of models to utilize a mixture of experts (MoE) architecture, which is more computationally efficient for training and answering queries. MoE architectures divide data processing tasks into subtasks and delegate them to smaller, specialized “expert” models.
For example, Maverick has 400 billion total parameters but only 17 billion active parameters across 128 “experts.” Scout, on the other hand, has 17 billion active parameters, 16 experts, and 109 billion total parameters.
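The gap between total and active parameters comes from the router only invoking a few experts per token. The following is a minimal, illustrative sketch of top-k expert routing in plain NumPy; the dimensions, router design, and expert layers are toy assumptions for clarity, not Meta's actual Llama 4 implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

DIM = 8          # toy hidden dimension (real models use thousands)
NUM_EXPERTS = 4  # Scout uses 16 experts; Maverick uses 128
TOP_K = 1        # only the selected expert(s) run for each token

# Each "expert" is a small feed-forward weight matrix; a learned router
# decides which expert(s) process a given token.
experts = [rng.standard_normal((DIM, DIM)) for _ in range(NUM_EXPERTS)]
router = rng.standard_normal((DIM, NUM_EXPERTS))

def moe_forward(x):
    """Route a token vector to its top-k experts and mix their outputs."""
    logits = x @ router
    top = np.argsort(logits)[-TOP_K:]   # indices of the chosen experts
    weights = np.exp(logits[top])
    weights /= weights.sum()            # softmax over the selected experts only
    out = sum(w * (x @ experts[i]) for w, i in zip(weights, top))
    return out, top

token = rng.standard_normal(DIM)
out, chosen = moe_forward(token)
# Only TOP_K of the NUM_EXPERTS weight matrices were touched for this token,
# so the "active" parameter count is a small fraction of the total.
```

Because each token exercises only a subset of the experts, compute per token scales with the active parameters (17 billion for Maverick) rather than the total (400 billion), which is the efficiency claim behind the MoE design.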
Meta’s internal testing indicates that Maverick, which is best suited for general assistant and chat use cases like creative writing, outperforms models such as OpenAI’s GPT-4o and Google’s Gemini 2.0 on certain coding, reasoning, multilingual, long-context, and image benchmarks. However, Maverick does not quite match the performance of more capable recent models like Google’s Gemini 2.5 Pro, Anthropic’s Claude 3.7 Sonnet, and OpenAI’s GPT-4.5.
Scout’s strengths lie in tasks such as document summarization and reasoning over large codebases. Notably, it has a very large context window of 10 million tokens, allowing it to process and work with extremely lengthy documents.
Scout can run on a single Nvidia H100 GPU, while Maverick requires an Nvidia H100 DGX system or equivalent, according to Meta’s calculations.
The unreleased Behemoth model will require even more powerful hardware, with 288 billion active parameters, 16 experts, and nearly two trillion total parameters. Meta’s internal benchmarking shows Behemoth outperforming GPT-4.5, Claude 3.7 Sonnet, and Gemini 2.0 Pro (but not 2.5 Pro) on several evaluations measuring STEM skills like math problem-solving.
It’s worth noting that none of the Llama 4 models are proper “reasoning” models in the vein of OpenAI’s o1 and o3-mini. Reasoning models fact-check their answers and generally respond to questions more reliably, but take longer than traditional, non-reasoning models to deliver answers.

Interestingly, Meta says it has tuned all its Llama 4 models to refuse to answer “contentious” questions less often. According to the company, Llama 4 responds to debated political and social topics that the previous Llama models wouldn’t. Additionally, the company claims Llama 4 is “dramatically more balanced” in the prompts it entertains.
“You can count on Llama 4 to provide helpful, factual responses without judgment,” a Meta spokesperson told TechCrunch. “We’re continuing to make Llama more responsive so that it answers more questions, can respond to a variety of different viewpoints, and doesn’t favor some views over others.”
These adjustments come as some White House allies accuse AI chatbots of being too politically “woke.”
Many of President Donald Trump’s close confidants, including billionaire Elon Musk and crypto and AI “czar” David Sacks, have alleged that popular AI chatbots censor conservative views. Sacks has historically singled out OpenAI’s ChatGPT as “programmed to be woke” and untruthful about political subject matter.
However, bias in AI is an intractable technical problem. Musk’s own AI company, xAI, has struggled to create a chatbot that doesn’t endorse some political views over others.
That hasn’t stopped companies including OpenAI from adjusting their AI models to answer more questions than they would have previously, particularly those related to controversial subjects.