Data encryption for health applications
Why you need record-level encryption and how Chino.io helps
Data Encryption
Overview
Data encryption encodes your data using a key and a cryptographic function. The resulting ciphertext can only be read by someone who holds the correct key. Data can be encrypted at rest (in storage) or in transit (while being transmitted across the Internet).
Data at rest includes all data stored on a user's device and all data in your backend. This data is an attractive target for attackers, so it needs to be secured using suitable encryption.
Disk-level encryption involves encrypting the physical disk drive. This protects your data from being accessed if a thief steals the drive. However, whenever the server is running, the data is accessible.
Database-level encryption involves encrypting the whole database using a single key. Typically, this is what is offered by cloud providers. The problem is that if this single key is compromised, all your data is exposed. Furthermore, we often see people storing the key in a plain text configuration file under the admin account.
Application- or record-level encryption involves encrypting each record within the database individually. In effect, each application instance (user) has a key that is used to encrypt the data related to that user. This requires key management, but is the most secure approach if you need to access or process the data on your backend.
End-to-end (e2e) encryption involves encrypting all data within the application. The backend gets no access to the data at all. This is extremely secure, but it is only practical if you have no need to process the data on the backend.
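The record-level approach described above can be sketched in Python. This is a minimal illustration under stated assumptions, not Chino.io's implementation: the in-memory `user_keys` store, the function names and the record format are all hypothetical, and the keystream built from HMAC-SHA256 is a toy stand-in used only so the example runs with the standard library. A real system would use a vetted AES implementation with authentication and keep the keys in a dedicated key management service.

```python
import hashlib
import hmac
import secrets

# Hypothetical in-memory key store: one random 256-bit key per user.
# A real backend would hold these in a key management service.
user_keys = {}

def key_for(user_id: str) -> bytes:
    """Return the user's key, creating it on first use."""
    if user_id not in user_keys:
        user_keys[user_id] = secrets.token_bytes(32)
    return user_keys[user_id]

def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Toy keystream (HMAC-SHA256 in counter mode), standing in for AES.
    out = b""
    counter = 0
    while len(out) < length:
        block = nonce + counter.to_bytes(8, "big")
        out += hmac.new(key, block, hashlib.sha256).digest()
        counter += 1
    return out[:length]

def encrypt_record(user_id: str, plaintext: bytes) -> bytes:
    """Encrypt one record with the key belonging to that user."""
    nonce = secrets.token_bytes(16)  # fresh per record
    stream = _keystream(key_for(user_id), nonce, len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))

def decrypt_record(user_id: str, record: bytes) -> bytes:
    """Decrypt a record; only the right user's key recovers the data."""
    nonce, body = record[:16], record[16:]
    stream = _keystream(key_for(user_id), nonce, len(body))
    return bytes(c ^ s for c, s in zip(body, stream))

record = encrypt_record("alice", b"blood pressure: 120/80")
print(decrypt_record("alice", record))
```

Because each user has a separate key, a key leaked for one user does not help an attacker read another user's records, which is the main difference from the single-key database-level approach.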
Encryption is the process of converting (encrypting) data into a coded form so that a third party cannot read it without the key. A mathematical algorithm scrambles the data using a key. In symmetric (traditional) encryption, the data can then be recovered (decrypted) using the same key. In asymmetric (public key) encryption, keys come as a pair. One key (the public key) is used for encryption and a different key (the private key) is used for decryption.
What are the standard forms of encryption?
The current de facto standard for symmetric encryption is AES (the Advanced Encryption Standard). This applies multiple rounds of substitution and permutation to each block of data. Because it is so widely used, modern CPUs include dedicated hardware instructions to speed up AES encryption.
The usual form of asymmetric encryption is RSA (Rivest–Shamir–Adleman). This is widely used by tools like SSH. It relies on the fact that multiplying two large prime numbers is easy, but recovering those primes from their product (factorising it) is a hard problem. Public key cryptography tends to be slow, and thus isn't suitable when you are storing large amounts of data in real time. However, it is ideal for e2e encryption of messages.
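To make the RSA idea concrete, here is a toy sketch in Python using the classic textbook values p = 61 and q = 53. The function names are illustrative, and the primes are deliberately tiny so the arithmetic can be followed by hand. Real RSA keys use primes hundreds of digits long together with padding schemes, so this is not usable for actual encryption.

```python
# Toy RSA: multiplying the primes is easy, undoing it is the hard part.
p, q = 61, 53
n = p * q                  # public modulus (3233)
phi = (p - 1) * (q - 1)    # 3120, kept secret
e = 17                     # public exponent, coprime with phi
d = pow(e, -1, phi)        # private exponent (Python 3.8+)

def encrypt(message: int) -> int:
    """Anyone can encrypt with the public key (e, n)."""
    return pow(message, e, n)

def decrypt(ciphertext: int) -> int:
    """Only the holder of the private key d can decrypt."""
    return pow(ciphertext, d, n)

def factorise(modulus: int) -> tuple:
    """Recover the primes by trial division.

    This works here only because n is tiny; for real key sizes
    no known method finishes in a practical amount of time.
    """
    factor = 2
    while modulus % factor:
        factor += 1
    return factor, modulus // factor

ciphertext = encrypt(65)
print(ciphertext, decrypt(ciphertext), factorise(n))
```

An attacker who could run `factorise` on a real modulus would be able to recompute `phi` and `d`, which is why the security of RSA rests on factorisation being hard.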
What does 128-bit or 256-bit mean?
You may have heard people referring to 128-bit security or 256-bit security. This simply refers to the length of the key in bits. For a given algorithm, the longer the key, the harder it is for an attacker to find it by trying every possibility. To give some idea of why this is so, with a 32-bit key there are 2^32 possible keys (4,294,967,296), but for 128-bit that increases to 2^128, which is 340,282,366,920,938,463,463,374,607,431,768,211,456 possible keys.
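The arithmetic above is easy to reproduce. The short Python sketch below counts the possible keys for a given length and estimates how long it would take to try them all. The guessing rate of one trillion keys per second is a hypothetical figure chosen purely for illustration, not a measurement of any real attacker.

```python
def keyspace(bits: int) -> int:
    """Number of distinct keys of the given length in bits."""
    return 2 ** bits

# Hypothetical attacker speed, assumed only for this illustration.
GUESSES_PER_SECOND = 10 ** 12
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

def years_to_try_all(bits: int) -> float:
    """Years needed to try every key at the assumed guessing rate."""
    return keyspace(bits) / GUESSES_PER_SECOND / SECONDS_PER_YEAR

for bits in (32, 128, 256):
    print(f"{bits}-bit: {keyspace(bits):,} possible keys, "
          f"about {years_to_try_all(bits):.3g} years to try them all")
```

Each extra bit doubles the number of keys, so going from 32 to 128 bits does not make the search four times longer but 2^96 times longer.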
While it may seem tempting to be paranoid and encrypt everything with 8192-bit keys, this is unnecessary. Also, the longer the key is, the longer it takes to encrypt and decrypt the data, because every operation requires more computation. Modern CPUs, however, can do fast hardware encryption for the AES block cipher. Here at Chino.io we use AES-256 for our encryption, because it offers a good balance of security versus speed of encryption.