In the last few years encryption, and cryptography in general, has firmly become a part of the mainstream, largely due to privacy conversations centered around technology giants, the meteoric rise in popularity of Bitcoin, and even the success of movies like The Imitation Game. Even most laymen today understand the word encryption to refer to the technique of transforming data so it can be hidden in plain sight — and they understand its importance.
Encryption has, however, been a firmly rooted component of all enterprise software design for many years. Historically, these capabilities were provided by underlying infrastructure and libraries used by IT and developer teams, who merely had to centrally turn on flags in their builds, enable configurations in their servers, and ensure the use of transport layer security (TLS) in their networking infrastructure.
But with the move to microservices-based architecture and infrastructure-as-code paradigms, individual teams are now responsible for the security of their application and infrastructure stack, and it has become important for them to understand how to properly leverage encryption for all the services they develop.
To properly secure data, it needs to be protected at rest, in transit, and in use. Below are various common encryption terms and frameworks, and what developers can do to leverage them properly.
Encryption of Data at Rest
Data at rest refers to how data is stored in persistent storage. An attacker with access to the physical storage infrastructure or your device can gain unauthorized access to the data stored on it unless it is encrypted. Encryption of data at rest can be achieved in multiple ways.
Disk- or File System-Level Encryption
With disk- or file system-level encryption, the encryption is performed by the implementation of the virtual storage layer. This is completely transparent to all application software and can be deployed with any underlying storage layer, regardless of its encryption capabilities.
Responsibility: Today, all cloud vendors provide this capability, and this is not something developers have to worry about — they just need to enable it.
Threats It Protects Against: Stolen disks or other storage media.
Server-side encryption is responsible for encrypting and decrypting data, transparently from its clients. The cryptographic keys used for encryption are known only to the server. In the cloud native world, the server can either be a cloud service with keys typically controlled by the cloud provider or a service built by the developers with keys managed by developers. From the perspective of the clients, encryption is transparent.
Responsibility: Many individual cloud services provide this capability, developers will need to enable the feature if it does exist. For an added layer, developers can build and manage their own server-side encryption mechanisms that can even be combined with a cloud service-based server-side encryption.
Threats It Protects Against: Stolen disks or other storage media, file system-level attacks, and cloud provider internal threats if built by the developers.
Here the client is responsible for encrypting data before sending it to the server for storage. Similarly, during retrieval, the client needs to decrypt the data. This makes the design of application software more difficult. An advantage of client-side encryption is that not every bit of stored data needs to be encrypted, only the sensitive parts can be protected. This is often beneficial when the cost of computation is a concern.
Responsibility: This is solely on the developers to design and make the process as seamless as possible for the client and end user.
Threats It Protects Against: Man-in-the-middle and storage provider internal threats.
What Are the Limits of Encryption of Data at Rest?
Encryption of data at rest is now considered best practice, but is not without its limitations and challenges.
The most critical aspect is how and where the encryption keys are stored, who can gain access to them, and so on. While good solutions are available to secure key storage, it is essential to set them up correctly. Weaknesses in key management are, unfortunately, far too common, and are much likelier to lead to confidentiality breaches, than someone breaking a modern encryption algorithm. Everyone likely knows at least one person who lost access to their data on their smart device because they couldn’t remember their back-up key.
Advice to Developers: If at all possible, utilize the resources of your cloud provider for key management. Many of the services have simple configuration toggles to enable encryption at rest and will handle key management transparently. For the most security, you should choose a customer-managed key where possible.
Suggested Tools: Key management services from major cloud providers including Amazon Web Services (AWS) Key Management Service (KMS), Microsoft Azure Key Vault, and Google Cloud Platform (GCP) Cloud Key Management.
Another challenge with encryption of data at rest is that key rotation (the recommended practice of periodically changing secret keys) can be extremely disruptive and costly since large volumes of data may need to be decrypted and then re-encrypted. This poses a challenge when an employee with access to the key leaves the organization or the key is otherwise considered as compromised.
Advice to Developers: Again, if at all possible, utilize the resources of your cloud provider for automatic key rotation as well. Today, all three major providers support automatic master key rotation, and it is a simple config flag when enabling encryption. In these scenarios, a master key will be a reference to the version of the actual encryption key. That is, when a key is rotated, all new data will be encrypted with the rotated key. Manual rotation is possible, but difficult.
Suggested Tools: AWS KMS, Azure Key Vault, and GCP Cloud Key Management.
Encryption of Data in Transit
As developers run their services in the cloud, integrating with other third-party services, encryption of data in transit becomes a must. For years, there was a great deal of pushback due to concerns about latency in applications and as such many applications never implemented transit-level encryption.
However, HTTPS has made huge performance gains over the past decade, and all services today have come to use it — with HTTPS even being used interchangeably with the terms SSL and TLS.
Advice to Developers: Enabling HTTPS for any public endpoints is a necessity today and is extremely simple to do. There will be some minor configuration needed to be done, but if you are using any of the major cloud providers, you can quickly and seamlessly generate and integrate certificates with your services.
Suggested Tools: Each of the cloud providers offer a way to generate public and even private certificates. So there’s AWS Certificate Manager, Azure App Service, and GCP Certificate Manager. If you want to use another service, LetsEncrypt is free and has tools for automatic generation and rotation.
Encryption of Data in Use
This is an area of increasing interest, which addresses the risk that data ultimately needs to be available in plain-text form while it is being processed by an application. Even with the strongest encryption techniques applied to data at rest and in transit, it is the application itself that often runs at the very boundary of trust of an organization and becomes the biggest threat to the data being stolen.
Below are two relatively recent techniques to address threats to data privacy while it is in use:
- Confidential Computing: This leverages advancements in CPU chipsets, which provide a trusted execution environment within the CPU itself. At a high level, it provides real-time encryption and decryption of data held in the RAM of a computer system even as it is being processed by an application, and ensures the keys are accessible only to authorized application code. With this technique, even someone with administrative access to a VM or its hypervisor cannot maliciously access the sensitive data being processed by an application.
- Homomorphic Encryption: This is a class of encryption algorithm that allows certain limited kinds of computations to be performed on the encrypted data itself. These are usually limited to a small set of arithmetic operations. There have been some recent attempts to derive analytics information or insights from homomorphically encrypted data. This includes several companies claiming capabilities like search through regulated or confidential data, and collaboration between analytics teams on highly sensitive data.
A somewhat related technique, popular among companies trying to avoid these problems altogether, is that of tokenization. It involves substituting sensitive data with a non-sensitive equivalent, with no intrinsic business value. The original sensitive data is kept in a highly secure, often offsite or third-party location.
A common example is an online retailer storing credit card tokens instead of credit card numbers themselves. The original credit card number is kept with a third-party service, which only makes it available to an authorized payment processor when needed. While this protects the data and often offloads compliance burden on the business tasked with securing the data, it could be susceptible to token replay attacks and therefore requires that the tokens be protected, effectively just transferring the problem instead of solving it.
Like with all other security techniques, there is no silver bullet or one approach IT and development teams can use to secure their data from prying eyes. The above framework, however, is a good starting point for organizations embracing digital transformation and taking a collaborative approach to security.