How Dropbox Optimized Its Storage System After Ditching AWS
The goal was deceptively simple: build a file storage system and end the company’s reliance on AWS. If executed correctly, Dropbox would slash its monthly AWS bill and gain the ability to expand its data storage capacity through new technology.
Dropbox already had some experience building and scaling storage systems, having managed the metadata of users and files in-house since the company launched in 2007. Efficiently storing metadata on hundreds of millions of files is a big challenge, but it pales in comparison to what Dropbox set out to do with Magic Pocket.
The team wanted to build a system from the ground up capable of storing exabytes of files. For the uninitiated, one exabyte is the equivalent of a billion gigabytes, or about 245 million DVDs’ worth of data.
Wired published a story in March of 2016 about what it took to bring Magic Pocket to life. According to their reporting, the project required the construction of special machines capable of packing in as many hard disk drives as possible. Dropbox hardware engineers custom-built a box measuring 18 inches tall and 44 inches wide, dubbed “Diskotech,” that could store up to a petabyte’s worth of data.
In addition to the hardware hurdle, Dropbox software engineers had to create the server architecture and write the code that controlled how data was written to the disks themselves. According to the company, 90 percent of its user data had been migrated off Amazon’s servers by October of 2015, two and a half years after the project began. (Dropbox still uses AWS for its international storage needs.)
Paperwork filed with the SEC prior to the company’s IPO in March of 2018 revealed that the move cut operating costs by $75 million. But instead of running a victory lap, the Dropbox team went right back to work once Magic Pocket went live.
“The logistics of handling deliveries, ensuring we can do refreshes as data centers come on and off-lease and the data migrations that entails — those are huge operational feats,” said Andrew Fong, Dropbox’s head of infrastructure. “It’s one thing to get the data there. It’s another to run the system for the long term.”
That long-term viability relies in large part upon Dropbox’s ability to make the system more efficient. That goal, according to Fong, is owned by both hardware engineers who keep the company up-to-date on the latest developments in storage technology and software engineers who find inventive ways to store data.
The Switch To Shingles
Facebook recently opened a new, 2.5-million-square-foot data center in New Albany, Ohio. The facility was completed at a cost of $1 billion and brings the social media giant’s data center count to 15 worldwide.
Unlike Facebook, Dropbox doesn’t actually have its own data centers. Instead, it rents space in three colocation centers, or facilities run by data center companies that lease space. That setup gives the company two choices when it wants to scale its storage system: lease more space or optimize its existing hardware.
Cutting the cord to the cloud
- Google stores its own data. The company has a map showing the locations of its 21 data centers worldwide.
- According to a Facebook blog post from January 2019, the social media giant operates 15 data centers around the world.
- In a blog post from 2017, Twitter said it migrated off of third-party data storage platforms in 2010.
According to Fong, Dropbox began looking for ways to pack more data into its Diskotech boxes shortly after Magic Pocket went live. What they landed on was shingled magnetic recording, or SMR, a relatively new development in hard disk drive technology.
“You have to pay attention to hardware trends and help the software team see and take advantage of them,” said Fong. “In some sense, the hardware teams are pushing the boundaries faster than software teams are.”
Unlike standard perpendicular magnetic recording (PMR) drives, which write data onto discrete concentric tracks, SMR drives feature overlapping tracks that resemble roof shingles. By overlapping tracks, the drive can store more data without increasing the size of the disk.
Dropbox opted for host-managed SMR drives, which let its software control every aspect of how data is written and read. To gain that control, developers rewrote the daemon that manages the layout of data on the disk. SMR drives are divided into zones, each containing bands of tracks, and data is written sequentially within a zone. Because the tracks overlap, a random write in the middle of a zone would corrupt data on the adjacent overlapping track, so writes must always proceed in order.
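The sequential-write constraint can be pictured with a toy model. This is an illustrative sketch only, with invented names; real host-managed SMR is driven through zoned-storage interfaces at the kernel and firmware level, not application code like this.

```python
# Toy model of a host-managed SMR zone (names invented for illustration).
# Each zone keeps a write pointer; the only write the zone accepts is a
# sequential append at that pointer, mirroring how overlapping shingled
# tracks forbid in-place random writes.
class SMRZone:
    def __init__(self, size_blocks):
        self.size = size_blocks
        self.write_pointer = 0                # next block that may be written
        self.blocks = [None] * size_blocks

    def append(self, data_blocks):
        """Sequential append: the only write the zone allows."""
        if self.write_pointer + len(data_blocks) > self.size:
            raise IOError("zone full")
        for b in data_blocks:
            self.blocks[self.write_pointer] = b
            self.write_pointer += 1

    def write_at(self, offset, data_blocks):
        """Reject any write that is not exactly at the write pointer:
        overwriting a shingled track would clobber the track layered
        on top of it."""
        if offset != self.write_pointer:
            raise IOError("non-sequential write rejected")
        self.append(data_blocks)

    def reset(self):
        """To change existing data, the whole zone is reset and rewritten."""
        self.write_pointer = 0
        self.blocks = [None] * self.size
```

In this model, rewriting a single old block means resetting and replaying the entire zone, which is why the data-layout daemon, not the drive, has to decide where each piece of data lands.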
According to Dropbox, moving from PMR to SMR technology has enabled it to store 10 to 20 percent more data per drive and cut storage costs by 20 percent. By the end of 2019, nearly 40 percent of all data on Dropbox was being stored on SMR drives.
Putting Old Data On Ice
It’s easy to take the capabilities of cloud storage for granted. A lot goes into ensuring a file is accessible in seconds at all hours of the day. At Dropbox, file copies are stored at each one of the company’s three data centers. Every time an update is made, a new file and corresponding copies are generated.
This setup ensures user data from any version of a file is readily available, but it also creates a problem. A whopping 90 percent of the files users access were uploaded within the past year. Files older than that, along with every stored version of them, sit directly alongside more recent uploads. Instead of building a hardware retirement home for its old data, Dropbox created a separate cold storage tier on its existing SMR disks.
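An age-based tiering rule of the kind described might look like the following sketch. The one-year cutoff mirrors the access pattern above, but the function, field names, and threshold are invented for illustration; Dropbox has not published its exact policy.

```python
from datetime import datetime, timedelta

# Illustrative threshold: files untouched for a year are candidates for
# the cold tier. The real policy and cutoff are not public.
COLD_AGE = timedelta(days=365)

def pick_tier(last_access: datetime, now: datetime) -> str:
    """Return which storage tier a file belongs in, by access age."""
    return "cold" if now - last_access > COLD_AGE else "warm"
```

A background job could periodically run a rule like this over file metadata and queue stale files for migration, keeping the hot path untouched.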
As file data ages and is accessed less frequently, it’s migrated to the cold storage tier. The file is broken into two fragments, and each fragment is sent to a different data center, with the third data center storing two complete copies of both fragments. When a user dusts off an old file, a call is put into the three data centers to collectively provide the two data fragments.
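The placement described above can be sketched in a few lines. This is a simplified illustration with invented names, not Dropbox's implementation: two data centers each hold one fragment, and the third holds both, so the file can be rebuilt from the fragments the data centers collectively return.

```python
# Sketch of the cold-storage layout described above (names invented).
def split(blob: bytes):
    """Break a file into two fragments."""
    mid = len(blob) // 2
    return blob[:mid], blob[mid:]

def place(blob: bytes):
    """Spread the two fragments across three data centers: one fragment
    each in the first two, and both fragments in the third."""
    frag_a, frag_b = split(blob)
    return {
        "dc1": {"a": frag_a},
        "dc2": {"b": frag_b},
        "dc3": {"a": frag_a, "b": frag_b},
    }

def read(placement):
    """Reassemble the file from fragments. A real read would query all
    three data centers and use whichever fragments arrive first; here we
    simply take one fragment from each of the first two."""
    return placement["dc1"]["a"] + placement["dc2"]["b"]
```

The bandwidth-for-storage trade-off Fong mentions falls out of this layout: instead of every data center holding a full copy, a read may need fragments shipped from more than one site.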
“Cold storage allows us to trade off a little bit of network bandwidth for storage costs,” said Fong.
By fragmenting and migrating older data into its cold storage tier, Dropbox said it was able to reduce disk usage by 25 percent and cut storage costs by 10 to 15 percent, all without impacting the user experience.
“We want to stay at the forefront of drive technology and take advantage of the economies of scale that SMR provides,” said Fong. “There’s a fairly high likelihood that SMR will be the only drive technology available above 20 terabytes, and we really want to make sure that the system is future proof.”