BlockNews

OpenAI Set to Unleash New Web Crawler to Devour More of the Open Web

by BlockNews Team
August 10, 2023
in Media, Social, Technology
  • OpenAI has introduced GPTBot, a web crawling bot, to gather data for training its upcoming AI systems, possibly named “GPT-5”.
  • GPTBot collects public data from websites much as search engine crawlers do, but web publishers can keep their content out by adding a “disallow” rule.
  • The release of GPTBot raises concerns about consent and copyright, highlighting the ongoing challenges in balancing AI capabilities with ethical considerations.

Leading AI firm OpenAI has released a new web crawling bot, GPTBot, to expand the dataset for training its next generation of AI systems, and the next iteration appears to have an official name. The company has applied to trademark the term “GPT-5,” implying an upcoming release, while also telling web publishers how to keep their content out of its massive corpus.

According to OpenAI, the web crawler will collect publicly available data from websites while avoiding paywalled, sensitive, and prohibited content. However, like the crawlers run by search engines such as Google, Bing, and Yandex, the system is opt-out: by default, GPTBot will assume all accessible information is fair game.

To keep OpenAI’s web crawler from ingesting a website, the site’s owner must add a “disallow” rule for GPTBot to robots.txt, the standard file that crawlers consult on a server.
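As a rough illustration of the opt-out, the sketch below embeds a two-line robots.txt that blocks the `GPTBot` user agent site-wide and then checks it with Python's standard `urllib.robotparser`. The helper name `is_allowed` and the `example.com` URL are assumptions made for this example, not anything from OpenAI, and a real site would serve the rules from `/robots.txt` rather than from a string.

```python
from urllib.robotparser import RobotFileParser

# Rules a publisher might add to robots.txt to block GPTBot from the whole site.
# Crawlers not named here (and with no "*" entry) remain unrestricted.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if `user_agent` may fetch `url` under the given rules.

    Hypothetical helper for illustration; it only wraps the stdlib parser.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/articles/some-post"  # placeholder URL
    print("GPTBot allowed:   ", is_allowed(ROBOTS_TXT, "GPTBot", url))
    print("Googlebot allowed:", is_allowed(ROBOTS_TXT, "Googlebot", url))
```

A rule like this only works if the crawler chooses to honor robots.txt; it is a request to well-behaved bots, not an access control.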

GPTBot, according to OpenAI, will also scan scraped data ahead of time to remove personally identifiable information (PII) and text that violates its policies.

However, some technology ethicists believe the opt-out approach still raises consent challenges.

On Hacker News, some users defended OpenAI’s move, arguing that if people want a capable generative AI tool in the future, the company must gather as much information as possible. “They still need current data, or their GPT models will be stuck in September 2021 forever,” said one user. Another, more privacy-minded user claimed that “OpenAI isn’t even citing in moderation. It’s making a derivative work without citing, thus obscuring it.”

GPTBot’s release follows recent criticism of OpenAI for previously scraping data without permission to train Large Language Models (LLMs) such as ChatGPT. The company updated its privacy policies in April in response to such concerns.

Meanwhile, the recent trademark application for GPT-5 appears to confirm that OpenAI is developing its next model in preparation for a future launch. The new system will likely use large-scale web scraping to update and broaden its training data.

Scraping on that scale could indicate a shift from OpenAI’s early emphasis on transparency and AI safety. Still, it’s not surprising, given that ChatGPT is the most widely used LLM in the world, despite an increasingly crowded and competitive marketplace.

OpenAI’s star product, like any LLM, is only as good as the data used to train it. OpenAI needs more data, and newer data, in large quantities.

ChatGPT now has over 1.5 billion monthly active users. And Microsoft’s $10 billion investment in OpenAI appears to have been farsighted, as ChatGPT integration has enhanced Bing’s capabilities.

For the time being, OpenAI leads the hot AI space, with tech titans racing to catch up. The company’s new web crawler could improve the capabilities of its models. However, expanding internet data collection raises ethical concerns about copyright and consent.

Balancing transparency, ethics, and capabilities will remain complex as AI systems become more sophisticated.

Tags: ChatGPT, OpenAI