2024 - Artificial Intelligence Companies Continue to Collect Data from the Internet

It has emerged that artificial intelligence companies are bypassing the instructions that websites publish in their robots.txt files.

With the rise of artificial intelligence, companies entering this field need huge amounts of data to develop their own tools. The first source that comes to mind for this data is, of course, the internet. On the other hand, not every piece of data or every article on the internet can be used to train artificial intelligence. Websites indicate whether data can be collected from them with a file called robots.txt.
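As a rough illustration of how a crawler might consult such a file, here is a minimal sketch using Python's standard `urllib.robotparser` module. The robots.txt content, the bot names `ExampleAIBot` and `OtherBot`, and the example.com URLs are all made up for this example and do not come from the report.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, invented for illustration only.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the rules allow this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)

# The hypothetical AI crawler is disallowed everywhere on the site.
print(may_fetch(parser, "ExampleAIBot", "https://example.com/articles/1"))

# Other crawlers fall under the wildcard rules: only /private/ is off limits.
print(may_fetch(parser, "OtherBot", "https://example.com/articles/1"))
print(may_fetch(parser, "OtherBot", "https://example.com/private/page"))
```

The check only tells a crawler what the site has asked for. Nothing in the file itself stops a crawler that skips the check or ignores the answer.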

According to Reuters, many artificial intelligence developers choose to bypass the directives in this file and collect data from these sites. Perplexity, which describes itself as a “free artificial intelligence search engine”, is one of the companies that has drawn the most criticism in this regard, but it is not alone in this practice.

OpenAI, Anthropic…

According to reports, many artificial intelligence developers bypass robots.txt files and continue to take content from sites. Although no names were given in the report, it was learned that OpenAI and Anthropic were among these companies. It also turned out that a server used by Perplexity was not following these guidelines. Perplexity CEO Aravind Srinivas had previously said that the company “is not in a position to first bypass the protocol and then lie about it.”

The robots.txt protocol, for its part, has been in use since the 1990s and is not actually legally binding. Perhaps creating a new, stricter and more detailed protocol on this issue would contribute to solving the problem.

Source:
https://www.engadget.com/ai-companies-are-reportedly-still-scraping-websites-despite-protocols-meant-to-block-them-132308524.html?src=rss
