A Fight Is Underway for the Internet’s Memory. Who Will Win?

Digital preservation is important for social and political reasons, but archivists face significant hurdles.

Published on Dec. 30, 2025
Reviewed by Seth Wilson | Dec 30, 2025
Summary: Digital content is vanishing, risking historical and political accountability. Organizations face lawsuits, cyberattacks and technical hurdles to preserve valuable social and political evidence against powerful interests. Preserving digital truth requires decentralized, collaborative and supported efforts.

“The internet never forgets” is a saying commonly used to caution against posting sensitive materials online. And although the warning should be taken seriously in such cases, it is not technically true.

The internet is constantly forgetting, sometimes many things all at once. As many as 25 percent of all web pages that existed between 2013 and 2023 have vanished, according to a recent study. And the older the content, the worse the situation. Of all the web pages researched in this study that existed in the year 2013, 38 percent are gone. When web pages vanish, their digital archive often goes with them. 

This digital decay presents an issue for historians and other researchers, of course, but it also affects all of us. When authoritarian regimes delete evidence of atrocities, agencies remove damaging reports and platforms disappear overnight, history can be rewritten and reality reshaped. 

A diverse range of individuals and organizations are fighting back, from investigators and librarians to non-profits and private individuals. Their disparate efforts to archive public information are facing major challenges. Still, they may be our best chance to preserve digital artifacts of immense social, cultural and political value.

What Are the Stakes of Digital Preservation?

Strategic deletions by governments, server shutdowns and platforms disappearing overnight can lead to history being rewritten and reality being reshaped. Digital archivists face hostile environments, including anti-bot markets that hinder public data collection, significant infrastructure requirements for storage and legal attacks.

Moving forward requires a cultural, political and technological shift to recognize digital preservation as essential, support archiving organizations and keep access to information as open as possible.

The Stakes of Digital Preservation

Journalists investigating atrocities, lawyers building criminal cases and future historians of our fleeting present all depend on digital artifacts that can vanish thanks to a server shutdown or a strategic deletion.

The actions of a few people in strategically important positions have a growing effect on everyone’s access to information. Government agencies alter historical data. News organizations delete archives. Social media platforms try to privatize public discourse and sometimes simply shut down, taking years of public conversations with them.

The implications for political accountability are huge. Digital breadcrumbs are increasingly the means by which we track corruption, document war crimes and hold the powerful accountable. They can also be extremely valuable in criminal justice, especially in cases of crimes by officers of the law themselves. Proof posted online can contradict the official narrative and uncover these crimes.

Furthermore, digital preservation is important for cultural and social reasons. When a major platform, such as Vine, MySpace, or Google+, goes under, it takes mountains of irreplaceable records with it. The forums where communities formed, the blogs that gave rise to movements, and the comments that captured discrete cultural moments in their most paradigmatic form all disappear. These are interactions that partly defined certain generations and moments in history.

Finally, what remains becomes increasingly difficult to access. As information is being steadily enclosed, packaged and monetized, what is publicly available today may disappear behind a paywall or similar restrictions tomorrow. 

 

How to Protect Digital Evidence

Digital archivists today confront hostile environments. The rising anti-bot market, initially meant to curb harmful bots, now hinders all public data collection. Access depends on the bot’s ability to mimic organic behavior. Thus, better technology — not better intentions — usually determines who gets to use public data.

Additionally, large-scale archiving of any kind comes with considerable infrastructure requirements. To keep their actual IPs hidden and unblocked by websites, archivists need to rely on proxy servers. The pool of proxy IPs they use must be relatively large and geographically diverse because a lot of online content is geo-restricted. Building such infrastructure takes years as one has to communicate with multiple proxy providers from across the globe and vet their proxy sourcing practices for compliance. Furthermore, such infrastructure requires constant maintenance to replace broken or blocked IPs with new ones.
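The maintenance loop described above, choosing a proxy by region and retiring IPs that keep failing, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular archiving project does it: the names `Proxy` and `ProxyPool`, the `max_failures` threshold, and the addresses (taken from IP ranges reserved for documentation) are all hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class Proxy:
    address: str       # hypothetical "host:port" string
    country: str       # country code of the exit IP, e.g. "DE"
    failures: int = 0  # consecutive failed requests


class ProxyPool:
    """Toy pool that groups proxies by country and drops ones that keep failing."""

    def __init__(self, max_failures=3):
        # max_failures is an assumed threshold, not a recommended value.
        self.max_failures = max_failures
        self._by_country = {}

    def add(self, proxy):
        self._by_country.setdefault(proxy.country, []).append(proxy)

    def size(self, country):
        return len(self._by_country.get(country, []))

    def pick(self, country):
        # Geo-restricted content needs an exit IP in the right country.
        candidates = self._by_country.get(country, [])
        if not candidates:
            raise LookupError(f"no usable proxy for {country}")
        return random.choice(candidates)

    def report_success(self, proxy):
        proxy.failures = 0

    def report_failure(self, proxy):
        # Retire a proxy once it has failed max_failures times in a row.
        proxy.failures += 1
        healthy = self._by_country.get(proxy.country, [])
        if proxy.failures >= self.max_failures and proxy in healthy:
            healthy.remove(proxy)


pool = ProxyPool(max_failures=2)
pool.add(Proxy("198.51.100.7:8080", "DE"))
pool.add(Proxy("203.0.113.9:8080", "DE"))
chosen = pool.pick("DE")
```

A real deployment would also need to replenish retired IPs from vetted providers, which is the slow, relationship-heavy part the paragraph above describes and which no code snippet replaces.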

Storage presents another infrastructure challenge, especially considering the gargantuan amount of content to be preserved. The Internet Archive alone maintains more than 866 billion web pages, consuming petabytes of storage across multiple data centers.

Of course, not everything should be preserved. There are important privacy and sustainability concerns to consider. No centralized authority exists to decide what should be kept or to cover all the associated costs. Instead, the checks and balances of digital preservation come from its polycentric, volunteer-based nature.

 

The Attacks on Archiving Efforts

Digital preservation combines a variety of perspectives, uniting diverse people and groups. With limited resources, they’ve already achieved amazing results. 

The Internet Archive leads perhaps the best-known archiving initiative with its Wayback Machine. The non-profit’s goal is not political, and it does not target anyone. It merely exists to preserve digital artifacts of our culture for future generations.

Yet archiving organizations are facing hostility. Platforms increasingly hide content behind authentication walls, limiting or outright banning archival attempts. The Internet Archive is also facing lawsuits from publishers that could ultimately lead to its financial ruin.

The Internet Archive has also survived several cyberattacks that took its services offline, leaving permanent gaps in the historical record. The British Library’s web archive was likewise severely compromised by hackers and remains only partially accessible two years later. Clearly, some resourceful groups see internet archiving as against their interests.

 

Archiving Truth to Power

Organizations like the investigative journalism collective Bellingcat, which archive digital evidence specifically to expose the truth, run even greater risks. Their tool, the Auto Archiver, which was updated and made more accessible at the end of June, has already preserved more than 150,000 pieces of digital evidence sourced from the web. Anyone who sets out to keep records of human rights abuses, political violence and other facts that some powerful interests would rather have erased is bound to make serious enemies.

However, the Auto Archiver’s case also shows the power of decentralized collaboration in attempts to preserve digital truth. The proxy pool requirements are covered free of charge by the Project 4β initiative, created to address similar technical needs for socially beneficial web data collection. Meanwhile, Bellingcat makes the tool open source and easy to use for journalists and researchers across the globe who need to collect and preserve digital evidence. 

Collaboration among independent actors, while sometimes less stable and weaker than centralized management, endures in the long term. When many remember events and keep records of them, it becomes harder to make the world forget the truth.

The Way Forward: Public Access for Public Effort

The solution to the problem of memory-holed web data must be cultural and political as well as technological. We need to recognize digital preservation as essential and support organizations doing this work. While governmental support is also necessary and welcome, we should be careful not to turn governments into monopolists of digital memory.

To prevent important pieces of information from being accidentally or deliberately lost, we must keep access to all of it as open as possible. Technology can help here. But it all starts with the will of the people.
