This week, Google introduced Gemma 3, a family of open AI models that quickly drew praise for its impressive efficiency. But several developers on X warned that Gemma 3’s licensing terms make commercial use a risky proposition.
The problem isn’t unique to Gemma 3. Companies like Meta also apply custom, non-standard licensing terms to their openly available models, and those terms pose legal challenges for adopters, especially smaller operations that worry Google and others could suddenly assert more onerous clauses and pull the rug out from under their business.
“The restrictive and inconsistent licensing of so-called ‘open’ AI models is creating significant uncertainty, particularly for commercial adoption,” Nick Vidal, head of community at the Open Source Initiative, a long-running institution that aims to define and “steward” all things open source, told TechCrunch. “While these models are marketed as open, the actual terms impose various legal and practical hurdles that deter businesses from integrating them into their products or services.”
Open model developers often have valid reasons for releasing their models under proprietary licenses rather than industry-standard options like Apache 2.0 and MIT. AI startup Cohere, for example, has been clear in its FAQs about its intent to support scientific, but not commercial, work on top of its models.
The licenses for Gemma and Meta’s Llama models, however, impose specific restrictions on how companies can use the models, and stepping outside those bounds invites legal reprisal. Meta, for instance, prohibits developers from using the output or results of Llama 3 models to improve any model besides Llama 3 or its derivative works. It also requires companies with over 700 million monthly active users to obtain a special, additional license before deploying Llama models.
Gemma’s license is generally less burdensome, but it does grant Google the right to restrict uses of Gemma that the company believes violate its prohibited use policy or applicable laws and regulations.
These terms apply not only to the original Llama and Gemma models but also to models based on them; in Gemma’s case, that includes models trained on synthetic data generated by Gemma. Florian Brand, a research assistant at the German Research Center for Artificial Intelligence, believes that licenses like Gemma’s and Llama’s “cannot reasonably be called ‘open source.’” Tech giant executives may disagree, he said, but the restrictive terms speak for themselves.
“Most companies have a set of approved licenses, such as Apache 2.0, so any custom license is a lot of trouble and money,” Brand told TechCrunch. “Small companies without legal teams or money for lawyers will stick to models with standard licenses.” He noted that while AI model developers with custom licenses, like Google, haven’t aggressively enforced their terms yet, the threat alone is often enough to deter adoption.
“These restrictions have an impact on the AI ecosystem — even on AI researchers like me,” said Brand. Han-Chung Lee, director of machine learning at Moody’s, and Eric Tramel, a staff applied scientist at AI startup Gretel, agree that custom licenses such as those attached to Gemma and Llama make the models “not usable” in many commercial scenarios.
Tramel explained that such licenses carve out model derivatives and distillation specifically, raising fears of clawbacks down the line. “Imagine a business that is specifically producing model fine-tunes for their customers,” he said. “What license should a Gemma-data fine-tune of Llama have? What would the impact be for all of their downstream customers?”
The scenario deployers most fear, Tramel said, is that the models are a Trojan horse of sorts. “A model foundry can put out [open] models, wait to see what business cases develop using those models, and then strong-arm their way into successful verticals by either extortion or lawfare,” he said. Gemma 3, for example, appears to be a solid release that could have broad impact, but the market can’t adopt it because of its license structure, so businesses will likely stick with weaker, perhaps less reliable, Apache 2.0 models instead.
It’s worth noting that certain models have achieved widespread distribution despite their restrictive licenses. Llama, for example, has been downloaded hundreds of millions of times and built into products from major corporations, including Spotify. However, they could be even more successful if they were permissively licensed, according to Yacine Jernite, head of machine learning and society at AI startup Hugging Face.
Jernite called on providers like Google to move to open license frameworks and “collaborate more directly” with users on broadly accepted terms. “Given the lack of consensus on these terms and the fact that many of the underlying assumptions haven’t yet been tested in courts, it all serves primarily as a declaration of intent from those actors,” Jernite said. “[But if certain clauses] are interpreted too broadly, a lot of good work will find itself on uncertain legal ground, which is particularly scary for organizations building successful commercial products.”
Vidal emphasized that there’s an urgent need for AI models that companies can freely integrate, modify, and share without fearing sudden license changes or legal ambiguity. “The current landscape of AI model licensing is riddled with confusion, restrictive terms, and misleading claims of openness,” Vidal said. “Instead of redefining ‘open’ to suit corporate interests, the AI industry should align with established open source principles to create a truly open ecosystem.”