Wikipedia Secures AI Training Deals with Tech Giants

Ars Technica, 3h ago

Key Facts

  • The Wikimedia Foundation announced licensing agreements with Microsoft, Meta, Amazon, Perplexity, and Mistral AI for AI model training.
  • These deals allow the companies to use Wikipedia's 65 million articles to train AI models such as those behind Microsoft Copilot.
  • The agreements are part of Wikimedia Enterprise, a commercial subsidiary that sells high-speed API access to major companies.
  • Revenue from these partnerships helps offset infrastructure costs for the nonprofit organization.
  • Google previously signed a deal with Wikimedia Enterprise in 2022, establishing the initial framework for these commercial agreements.
  • The foundation did not disclose the financial terms of the deals with Microsoft, Meta, and Amazon.

In This Article

  1. A New Era for Wikipedia
  2. The Partnership Details
  3. Why This Matters
  4. The Enterprise Program
  5. Industry Context
  6. Looking Ahead

A New Era for Wikipedia

The Wikimedia Foundation has announced landmark licensing agreements with some of the world's most powerful technology companies. On Thursday, the nonprofit organization revealed deals with Microsoft, Meta, and Amazon, among others, to formally license Wikipedia content for artificial intelligence training.

This marks a significant departure from the past, when these same companies routinely scraped Wikipedia's vast knowledge base without explicit permission or compensation. The agreements signal a maturing relationship between open knowledge repositories and the commercial AI industry.

The Partnership Details

The newly announced deals encompass five major technology companies: Microsoft, Meta, Amazon, Perplexity, and Mistral AI. These organizations have joined the Wikimedia Enterprise program, a commercial subsidiary specifically created to manage licensing agreements with large-scale commercial users.

Wikimedia Enterprise offers a premium service that provides API access to Wikipedia's 65 million articles at significantly higher speeds and volumes than the free public APIs available to general users. This premium access is essential for companies training large language models that require massive, consistent data streams.

The financial terms of these agreements remain confidential, as the foundation chose not to disclose specific monetary values. However, the revenue generated represents a crucial new income stream for the organization.

These new partners join an existing roster that includes:

  • Google - Signed a deal in 2022
  • Ecosia - Smaller search engine company
  • Nomic - AI research organization
  • Pleias - AI development company
  • ProRata - Technology firm
  • Reef Media - Digital media company

Why This Matters

The move from unpermitted scraping to formal licensing changes how AI companies access training data. Previously, major tech firms extracted Wikipedia's content without compensation, treating it as a freely available resource. The new agreements establish a commercial framework that recognizes the value of curated knowledge.

For the Wikimedia Foundation, these deals provide financial support for maintaining and scaling Wikipedia's infrastructure. The nonprofit has historically relied on small public donations to cover its operational costs, which include server maintenance, software development, and community support, even as its content became a staple of training data for AI models.

The agreements also validate Wikipedia's role as a foundational dataset for modern AI systems. Models like Microsoft Copilot and OpenAI's ChatGPT depend on diverse, accurate information sources, and Wikipedia's structured, multilingual content provides an ideal training resource.

The Enterprise Program

Wikimedia Enterprise is the foundation's strategic response to growing commercial demand for its content. Unlike the free Wikipedia API designed for individual developers and small projects, Enterprise offers features aimed at large commercial users, including higher rate limits, dedicated support, and guaranteed uptime.

The program was specifically designed to accommodate the unique requirements of large-scale AI training, where companies need to process millions of articles repeatedly and rapidly. This technical capability makes Wikipedia's content more accessible for commercial applications while maintaining the nonprofit's commitment to free knowledge.
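A rough calculation shows why rate limits matter at this scale. The sketch below estimates how long a single pass over the 65 million articles cited in this story would take at a given sustained request rate. It assumes one article per request, and the rates are hypothetical round numbers chosen for illustration, not published limits for any Wikimedia API.

```python
# Back-of-envelope estimate: how long does one full pass over the corpus take
# at a given sustained request rate? All rates below are hypothetical.

ARTICLE_COUNT = 65_000_000  # article count cited in this story
SECONDS_PER_DAY = 86_400


def days_for_full_pass(requests_per_second: float, articles: int = ARTICLE_COUNT) -> float:
    """Days needed to fetch every article once, assuming one article per request."""
    if requests_per_second <= 0:
        raise ValueError("requests_per_second must be positive")
    return articles / requests_per_second / SECONDS_PER_DAY


# Hypothetical sustained rates, not published limits for any Wikimedia API.
for rate in (10, 100, 1_000):
    print(f"{rate:>5} req/s -> {days_for_full_pass(rate):6.1f} days")
```

Under these assumptions, a full pass takes about 75 days at 10 requests per second and roughly 18 hours at 1,000 requests per second, which is why repeated bulk processing is impractical without high-volume access.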

The subsidiary model allows the foundation to pursue commercial opportunities without compromising its core mission. Revenue generated through Enterprise directly supports the free, public Wikipedia that millions of users access daily.

Key features of the Enterprise program include:

  • High-speed API access for large-scale data processing
  • Volume-based pricing for enterprise clients
  • Dedicated technical support and service guarantees
  • Compliance with data usage and licensing requirements

Industry Context

The timing of these agreements reflects the rapid evolution of the AI industry and its growing need for high-quality training data. As companies develop increasingly sophisticated language models, the demand for reliable, comprehensive datasets has intensified.

Previously, the relationship between AI developers and content providers was largely unregulated, with companies extracting data from various sources without formal agreements. The Wikimedia Foundation's approach establishes a precedent for how open knowledge projects can engage with commercial AI development.

This development also highlights the economic value of curated knowledge. While Wikipedia's content remains freely licensed and free to read, high-volume commercial access for AI training represents a significant economic opportunity that can help sustain the platform's operations.

The agreements with Microsoft, Meta, and Amazon are particularly notable given their scale and influence in the AI sector. These companies operate some of the world's most widely used AI assistants and language models.

Looking Ahead

The Wikimedia Foundation's successful negotiation of licensing deals with major technology companies marks a significant milestone in the relationship between open knowledge and commercial AI development. This partnership model provides a sustainable path forward for both parties.

As the AI industry continues to expand, the demand for high-quality training data will likely increase. The Wikimedia Enterprise program positions the foundation to meet this demand while maintaining its commitment to free knowledge.

These agreements also set an important precedent for how other content providers might approach licensing with AI companies. The success of this model could influence broader industry practices around data attribution and compensation.

For users of Wikipedia and AI assistants alike, this development represents a step toward more sustainable and ethical AI development practices, where the creators and curators of knowledge receive appropriate recognition and support for their contributions to the digital ecosystem.

