23 Major News Sites Block Wayback Machine Over AI Concerns
Original: 23 Major News Sites Have Blocked the Wayback Machine – Digital History In Danger
Why This Matters
The blocks threaten digital preservation and journalistic accountability tools amid concerns about AI development.
Twenty-three major news organizations, including The New York Times and USA Today, have blocked the Internet Archive's Wayback Machine from preserving their content, citing fears that archived material could be used to train AI systems.
The Internet Archive's Wayback Machine, which has preserved over a trillion web pages across three decades, now faces systematic blocking from major media outlets. The New York Times, USA Today's 200+ outlets, and Reddit have all blocked ia_archiverbot, the preservation tool's web crawler. Publishers give two main justifications: preventing AI companies from training on archived content, and general anti-scraping measures. The Times claims that archived content violates copyright law and is being used to 'directly compete with us,' though it has not given specifics. The Guardian takes a different approach: it allows crawling but filters its archived content from public access. The blocking also raises an accountability problem, since USA Today has used the Wayback Machine for its own investigative reporting while preventing the preservation of its own content.
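The article does not say how each publisher implements its block. One common mechanism for turning away a specific crawler is a robots.txt rule keyed to its user-agent string. The sketch below is illustrative only: the robots.txt content, the example.com URL, and the can_fetch helper are assumptions made for the example, not any publisher's actual configuration. The crawler name is the one given above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: refuse the archive crawler named in the article,
# allow every other user agent. Not taken from any real site.
ROBOTS_TXT = """\
User-agent: ia_archiverbot
Disallow: /

User-agent: *
Allow: /
"""


def can_fetch(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/news/story.html"  # placeholder URL
    print("archive crawler allowed:", can_fetch("ia_archiverbot", page))
    print("other client allowed:", can_fetch("ExampleBrowser", page))
```

A rule like this only works if the crawler chooses to honor it. A site that wants a hard block would need to reject requests at the server, which this sketch does not cover.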