It's no hyperbole to say that everything in the modern world runs on data, whether operationally or in the planning and design stages. Sometimes that data gets lost, whether through accidents, tragedies, cyberattacks, or human error, and it can't always be recovered. A 2022 survey by backup provider Arcserve found that 76% of businesses that responded had suffered a loss of critical data, and 45% of those said the loss was permanent.

That's a lot of companies with backup strategies that need improvement, but how about the big names in the industry? Surely they are better prepared and have safeguards in place. Well, data loss is just as common among those big players. Most of the time you never hear about it, because robust backup strategies and distributed data let them recover silently. The other times? Well, you don't always hear about those either, but these incidents of data loss were too large to ignore.

6 GitLab (2017)

Human error wiped 300GB of production data, and there were no working backups

Over half the world's business data is in cloud storage, and that includes providers like GitLab. GitLab provides a web-based Git repository with advanced features like a wiki and issue tracking, so companies can keep development, institutional knowledge, and the list of needed bug fixes all in one place. It's the kind of service that attracts big names like IBM, Sony, and CERN, so it's well aware of the need for data integrity, backups, and disaster recovery plans.

They say no plan survives contact with the enemy, but that might need revising, because it seems some plans don't survive contact with anyone. GitLab's was put to the test in 2017, when the company lost 300GB of production data during what should have been a routine maintenance task designed to test database scaling techniques. The mistake? An engineer ran the task on the primary database instead of on the copy they'd created for it. That exposed a cascade of other failures, several of them silent:

No safeguards against accidental deletion

Backups had been silently failing for weeks

Recovery process was too slow

Secondary database was out of sync, so it couldn't be used to replace the primary

So, while the initial mistake was human error, plenty of other issues compounded it until six hours of new production data could not be recovered. To its credit, GitLab analyzed what went wrong, live-streamed the recovery on YouTube, publicly stated the issues, and put procedures in place as guardrails for the future. And that one engineer? They were still there afterward; the CEO told Business Insider that the blame was shared among the entire team.

5 Samsung (2014)

A data center fire left the company without critical data