When the California Consumer Privacy Act (CCPA) was first announced last year, many tech companies had no idea what to expect. The proposed law could have huge implications for how businesses collect and store user data.
The new regulation, which went into effect on Jan. 1, is intended to give consumers more control over the data that companies collect about them online. If a company is based in California, or has even one user in the state, it must disclose the types of data it collects about those users. Users can then request that their data be deleted or that it not be sold to a third party.
This can pose new challenges, because companies often store user data in multiple places. Telling consumers what data is collected about them is easy enough to do in a privacy policy update. But if someone requests that their data be deleted, how can a company be sure it is locating and deleting every copy that exists?
Segment, a data infrastructure platform headquartered in San Francisco, anticipated this problem several years ago. Segment CEO and co-founder Peter Reinhardt says a core part of the company’s mission is to ask, “How do we help companies get visibility into the data they have?”
The Origin of Segment's Approach to Data Infrastructure
Reinhardt and his co-founders launched Segment in 2011. They originally created the tool for their JavaScript library while working on another project. They realized that instead of centralizing customer data in the library and then sending it out to multiple third parties, they could integrate third-party tools, such as A/B testing, email marketing and analytics platforms, each one chosen by the customer. Segment’s platform can give insights from all these tools at once.
Then, in 2016, the European Parliament approved the General Data Protection Regulation (GDPR). For the first time, European Union citizens would be able to request that companies anywhere in the world delete their data.
“All of this data in all these different systems? Are they serious?” Tido Carriero, chief product development officer at Segment, said when he and his team first heard about GDPR. Carriero knew that complying with this new privacy regulation would mean a lot of responsibility for Segment’s technical teams.
“But then we got excited,” he said. “The thing we’ve always been best at (data infrastructure) had another reason to be done well.”
And just a few years later, California’s own privacy law was approved. The European law and California's law are different, but they do have one major thing in common: Consumers have the right to ask companies to delete their personal data.
“Because we were prepared for GDPR, we already had the bones in place to help our customers become CCPA compliant,” Carriero said.
The Segment Tools Helping Tech Companies Comply with CCPA
Open Source Consent Manager: Segment’s consent management solution helps customers comply with ‘the right to know’ portion of CCPA.
“Our customers use us for data collection, so consent is crucial,” said Segment product manager, Aliya Dossa, who worked on the tool.
Once a user sees the data a company is collecting about them and which tools it’s being used for — like advertising, marketing or analytics tools — they can choose not to have their data collected for any or all of those tools.
Segment customers can load their tools based on an individual user’s preferences and the platform will ensure that the user’s data is only sent to the tools they’ve consented to.
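The consent-based loading described above can be sketched in a few lines of Python. This is an illustrative sketch only, not Segment's consent manager: the `ConsentPreferences` class, the `TOOL_CATEGORIES` mapping and the tool names are all hypothetical, chosen to mirror the advertising, marketing and analytics categories mentioned earlier.

```python
from dataclasses import dataclass, field


@dataclass
class ConsentPreferences:
    """Categories one user has agreed to (hypothetical model)."""
    user_id: str
    allowed_categories: set[str] = field(default_factory=set)


# Hypothetical mapping of downstream tools to consent categories.
TOOL_CATEGORIES = {
    "ad_network": "advertising",
    "email_platform": "marketing",
    "web_analytics": "analytics",
}


def destinations_for(prefs: ConsentPreferences) -> list[str]:
    """Return only the tools whose category the user has consented to."""
    return sorted(
        tool
        for tool, category in TOOL_CATEGORIES.items()
        if category in prefs.allowed_categories
    )


def route_event(event: dict, prefs: ConsentPreferences) -> dict[str, dict]:
    """Build per-tool payloads for one event, skipping unconsented tools."""
    return {tool: event for tool in destinations_for(prefs)}
```

A user who has agreed only to analytics would have events routed to `web_analytics` and nowhere else. A user with no stored preferences gets an empty routing table, so the sketch defaults to sending nothing.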
The consent manager can also help customers comply with ‘the right to opt-out’ portion of CCPA, which applies when a user chooses to opt out of having their data sold to advertisers.
Customers can keep a running list of opt-outs in Segment, which tracks each time a user chooses to opt out of having their data sold.
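A running opt-out list of this kind could look roughly like the following sketch. The `OptOutRegistry` class and its method names are assumptions made for illustration and do not reflect how Segment stores opt-outs. The sketch keeps an append-only log so that every opt-out is recorded, along with a set for quick "may this user's data be sold?" checks.

```python
from datetime import datetime, timezone


class OptOutRegistry:
    """Hypothetical in-memory record of do-not-sell opt-outs."""

    def __init__(self):
        self._log = []           # every opt-out as (user_id, timestamp)
        self._opted_out = set()  # users who have opted out at least once

    def record_opt_out(self, user_id, when=None):
        """Append one opt-out to the log and mark the user as opted out."""
        when = when or datetime.now(timezone.utc)
        self._log.append((user_id, when))
        self._opted_out.add(user_id)

    def may_sell(self, user_id):
        """True only if the user has never opted out."""
        return user_id not in self._opted_out

    def history(self, user_id):
        """Timestamps of each opt-out the user has made, oldest first."""
        return [ts for uid, ts in self._log if uid == user_id]
```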
This part of the tool was unique for CCPA. “We already had the consent manager, but the ability to adapt it to this specific right was powerful,” Dossa said. Privacy laws won’t be static any time soon, she added. The way companies handle data shouldn’t be either. The consent manager “demonstrates the flexibility of Segment to adapt to the law or use-case at hand,” Dossa said.
Data Subject Access Requests: Under California’s law, users have the right to access information collected about them. Segment lets its clients organize data about any user by enabling a raw data integration such as Amazon S3, a file storage service that AWS offers, or another data warehouse. The intent is for companies to easily share that information with their users if they receive a request to do so. Segment can pull a user’s record from up to 365 days in the past.
“We’ve heard from our customers that they’ve already had way more access requests than they expected,” said Dossa. “A lot of them have users based in California.”
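To make the access-request lookup concrete, here is a minimal sketch that filters a raw archive for one user within the 365-day window mentioned above. It assumes each archived record is a dictionary with `user_id` and timezone-aware `timestamp` fields. Those field names and the `collect_user_records` function are hypothetical, and a real archive in S3 or a warehouse would be queried rather than scanned in memory.

```python
from datetime import datetime, timedelta, timezone

# Look-back window taken from the article; everything else is illustrative.
LOOKBACK = timedelta(days=365)


def collect_user_records(records, user_id, now=None):
    """Return one user's records from the last 365 days, oldest first."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - LOOKBACK
    matches = [
        r for r in records
        if r["user_id"] == user_id and cutoff <= r["timestamp"] <= now
    ]
    return sorted(matches, key=lambda r: r["timestamp"])
```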
Deletion Request Management: Segment offers a deletion request management tool to help customers comply with ‘the right to be deleted’ portion of CCPA.
What does this look like in practice?
Let’s say you bought a pair of shoes on Nike.com, so you typed in your name and address. To process your order, Nike had to collect that data. Under CCPA, you can submit a request to Nike asking them to delete your data. But now, Nike has the burden of having to find everywhere your data is stored and then delete it from every location.
Using Segment, Nike can tell the platform to delete all the data about you in one request. The tool will not only delete your data from Segment, it will also send that deletion request downstream to other tools.
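The "one request, many deletions" idea can be sketched as a small coordinator that removes a user locally and then forwards the request to each connected tool. This is not Segment's API: the `DeletionCoordinator` class, its methods and the callback shape are hypothetical, and a production system would also need retries and an audit trail.

```python
from typing import Callable, Dict


class DeletionCoordinator:
    """Hypothetical fan-out of one deletion request to many systems."""

    def __init__(self, local_store: Dict[str, dict]):
        self.local_store = local_store
        self.downstream: Dict[str, Callable[[str], bool]] = {}

    def register(self, name: str, delete_fn: Callable[[str], bool]) -> None:
        """Attach a downstream tool's delete callback under a name."""
        self.downstream[name] = delete_fn

    def delete_user(self, user_id: str) -> Dict[str, bool]:
        """Delete locally, then forward the request to every tool.

        Returns a per-system success map so failures can be retried.
        """
        results = {"local": self.local_store.pop(user_id, None) is not None}
        for name, delete_fn in self.downstream.items():
            try:
                results[name] = bool(delete_fn(user_id))
            except Exception:
                # One failing tool should not block the others.
                results[name] = False
        return results
```

Returning a success map instead of a single flag is a design choice in this sketch: the caller can see which downstream tools still hold the user's data and retry only those.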
The tool was originally developed prior to GDPR, which also gives users the right to request that their information be deleted. When the team at Segment first heard about this component, they knew it would be challenging to address. Why? Because they had no idea how many deletion requests their customers would actually receive.
“This was the first product we ever built where we had no idea what the use case would be,” said Carriero.
To build this tool, they had to put themselves in the customer’s shoes. “We asked ourselves, ‘where are all the places we could store any information about users?’” said Carriero. Essentially, it’s a searching problem, he added. And once you’ve successfully searched for the data, you need to be able to delete it too.
Segment’s integration with Amazon S3 allows customers to store a raw archive of user data in case users need to access it. “But Amazon S3 is not designed to have queries made for large flat files in an efficient way,” Carriero said. “A big part of the challenge was how to efficiently find all potential (queries) in this large archive.”
Their first implementation of the deletion request tool was too expensive to run at the scale their customers demanded. “We were floored by the number of people using the API,” said Carriero. “We realized that the public was serious about this.”
Carriero’s team had to rework the algorithm for deleting from batch archives. They ended up building a more complex indexing system to process the large number of deletion requests efficiently.
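The article does not describe the indexing system itself, so the following is only a guess at the general idea, not Segment's design. The sketch builds an inverted index from user ID to the archive files that mention that user, so a batch of deletion requests rewrites only the affected files, and each file once, instead of scanning the whole archive per request. The `build_index` and `delete_users` functions and the in-memory `archive` dictionary are hypothetical stand-ins for flat files in object storage.

```python
from collections import defaultdict


def build_index(archive):
    """Map each user_id to the set of file keys containing that user.

    archive: dict of file_key -> list of records with a 'user_id' field.
    """
    index = defaultdict(set)
    for file_key, records in archive.items():
        for record in records:
            index[record["user_id"]].add(file_key)
    return index


def delete_users(archive, index, user_ids):
    """Remove a batch of users, rewriting only the files that hold them.

    Returns the set of file keys that were rewritten.
    """
    targets = set(user_ids)
    touched = set()
    for uid in targets:
        # pop() drops the index entry without creating one for unknown users
        touched |= index.pop(uid, set())
    for file_key in touched:
        archive[file_key] = [
            r for r in archive[file_key] if r["user_id"] not in targets
        ]
    return touched
```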
One year after GDPR went into effect, Segment had deleted nearly 7 million users from its platform as well as from other tools.
Because it’s only been a few weeks since CCPA went into effect, it’s difficult to know how many users will make deletion requests. But after GDPR, Segment’s product team is confident their deletion request tool will be well-equipped to handle a high volume of requests.
The Future of Privacy Regulation
GDPR and CCPA are two of the first major data privacy regulations to be enacted. But this is likely the beginning of further legislation aimed at giving consumers more control over their personal information online.
In recent years, a small crop of data privacy startups, many of them based in San Francisco, has emerged in response to the evolution of online privacy laws. InCountry stores data in the country where it was created so companies can comply with local laws. TrustArc, which was started in San Francisco in 1997 under the name TRUSTe, has pivoted from privacy certifications to developing data protection and compliance products for large companies.
In the constantly evolving landscape of privacy, Segment’s technical teams are staying flexible and acting fast. Right now, they’re building a new feature that will give customers more control and choice over where their data is processed and stored.
“We’re definitely watching the regulatory scene and staying up on that in our product development,” Carriero said.