Encryption and cloud-based storage
Encryption in the digital world is akin to a safe in the physical world. Data is locked away and can only be seen by those who have the correct key. Among other things, encryption is what provides an assurance of confidentiality in data security.
Encryption is a large and complex field of ongoing research into algorithms, protocols and how these can be made to fit the situations that benefit from them. It is only logical that it should be put to use in a practical and, above all, secure way.
To simplify this discussion I will make a few assumptions:
– users or online storage customers are security-minded people who won’t leave their keys in the door then complain that the lock is no good;
– when discussing encryption, it is assumed that it is of satisfactory strength and that breaking it by brute force is not the most efficient solution to gain access to the information;
– all involved network protocols are considered secure.
Internet-based storage services are becoming popular and commonplace, perhaps because they’re pretty convenient for storing and sharing files across various hosts and people. But how well do these services tie in with data security and, in particular, confidentiality?
Loosely, it would seem…
You see, data that goes to Internet storage is usually transferred securely, and it is not as if files are kept on public web servers, so the obvious stuff is there. But once the data gets to the storage server, it is beyond our reach and control. It may or may not be stored unencrypted. It may or may not be read by the service’s admins. It may or may not be delivered to powerful third parties. It may be compromised if the servers are broken into, or accessed if the servers are physically hijacked.
If storage encryption is part of the deal that provides access to an Internet storage solution, then this covers some aspects of confidentiality of the stored data: if the provider’s servers are compromised or hijacked, whoever takes them is not guaranteed access to the stored information.
However, any reasonable storage provider that offers encryption will encrypt each customer’s data with a key unique to that customer. The deal is that the provider holds the data and the user holds the key, which is normally derived from the account’s password using a key derivation algorithm (PBKDF2 is a good example).
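As a minimal sketch of what deriving a key from a password looks like, here is PBKDF2 as exposed by Python's standard library. The salt size, iteration count, key length and the function name derive_key are my own assumptions for illustration, not the parameters of any particular storage provider.

```python
import hashlib
import os


def derive_key(password: str, salt: bytes, iterations: int = 200_000) -> bytes:
    """Derive a 32-byte key from a password with PBKDF2-HMAC-SHA256.

    The iteration count is an assumed work factor chosen for illustration.
    """
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=32
    )


# A random per-account salt; it is not secret, but it must be stored
# so that the same key can be derived again later.
salt = os.urandom(16)
key = derive_key("correct horse battery staple", salt)
```

The same password and salt always give the same key, which is why whoever sees the password can also reproduce the key.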
Now this is where things get tricky: encryption of stored data usually occurs on the provider’s servers, which means that at that moment the servers must hold the encryption key. It is a bit like asking someone else to lock or unlock a safe with a copy of the key that you’ve just agreed to make for them. Or like a hotel receptionist or a valet parking service: in all these cases we as customers have no choice but to trust the service provider.
In the digital world this happens by logging in to the service. A web application and a compiled client program are no different in this respect. The password travels within the point-to-point encrypted tunnel (usually SSL/TLS), which means that it exists in plain text in memory on both the client and the server. The key has been copied.
It is clear now that in this encryption model the provider will, at certain moments, hold its users’ encryption keys, which it needs in order to encrypt and decrypt information. The user holds the key and the provider holds the encrypted data. Whenever users need to access their pool of data they must lend the key to the provider, in what I would call a ‘security by proxy’ model.
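To make the lending of the key concrete, here is a small simulation of the login step. Everything in it is hypothetical: the ProxyProvider class, its method names and the work factor are made up for illustration and do not describe any real service. The only point is that a server which receives the password can derive the key, and nothing in the protocol stops it from keeping a copy.

```python
import hashlib
import os

ITERATIONS = 100_000  # assumed work factor, for illustration only


def derive_key(password: str, salt: bytes) -> bytes:
    """Derive a 32-byte key from the account password (PBKDF2-HMAC-SHA256)."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS, dklen=32
    )


class ProxyProvider:
    """Hypothetical provider that encrypts and decrypts on its own servers."""

    def __init__(self):
        self.salts = {}          # per-account salts
        self.retained_keys = {}  # what a dishonest or coerced provider could keep

    def register(self, user: str) -> None:
        self.salts[user] = os.urandom(16)

    def login(self, user: str, password: str) -> bytes:
        # Once the TLS tunnel is terminated, the password is plain text
        # in the server's memory, so the server can derive the key.
        key = derive_key(password, self.salts[user])
        # Nothing visible to the customer prevents this copy being kept.
        self.retained_keys[user] = key
        return key


provider = ProxyProvider()
provider.register("alice")
session_key = provider.login("alice", "correct horse battery staple")

# True when the provider's retained copy matches the key used for the session.
provider_has_copy = provider.retained_keys["alice"] == session_key
```

From the customer's side, a provider that discards the key after each session and one that retains it look exactly the same.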
So if the provider has other, less clear, less honest intentions, or is coerced, it can keep copies of encryption keys and decrypt users’ data. Customers cannot do anything to mitigate this without additional mechanisms (*); it is out of their reach. This is one of the less obvious premises of online storage services, one that does not even appear in the fine print. Online providers of storage, encrypted or not, have to be trusted.
*: Yes, a TrueCrypt volume would solve the problem. However, this solution has a number of downsides and a significant loss of flexibility. Not to mention potential integrity issues, particularly when attempting to share files in the TC volume…
Data security vs trust
This discussion is not about how much trust can be placed in any particular provider, but about the requirements of data security versus the implications of security by proxy.
If the user does not trust the provider, then it must be assumed that the data is or will be compromised. If the user does trust the provider, the provider can still change its ethics, be forced to do so, or be attacked and compromised, and the result is the same: customer data is compromised. In either case the user has no way of telling whether or not the data has been compromised.
If a user requires data security within online storage, then they cannot use a security by proxy model. They must use a method that ensures that the provider never has sufficient information to decrypt, or facilitate decryption of, stored customer data.
Can data security be implemented with an encryption model that ensures the provider never has sufficient information in its systems to decrypt the data? Yes, naturally: there are protocols and methods for doing this, and more waiting to be invented.
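One possible shape for such a method, sketched here under my own assumptions rather than taken from any specific product, is to do all key handling on the client and derive two separate values from the password: an encryption key that never leaves the client, and an authentication token that is the only thing sent to the provider. The function name derive_client_secrets, the labels and the work factor are hypothetical, and the actual encryption of files with an authenticated cipher is left out.

```python
import hashlib
import hmac
import os


def derive_client_secrets(password: str, salt: bytes, iterations: int = 200_000):
    """Derive (encryption_key, auth_token) on the client.

    Both values come from the same password, but each is a one-way
    function of a master secret, so knowing the auth token does not
    reveal the encryption key.
    """
    master = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=32
    )
    # Distinct labels give two independent-looking keys from one master secret.
    encryption_key = hmac.new(master, b"encryption", hashlib.sha256).digest()
    auth_token = hmac.new(master, b"authentication", hashlib.sha256).digest()
    return encryption_key, auth_token


salt = os.urandom(16)
encryption_key, auth_token = derive_client_secrets(
    "correct horse battery staple", salt
)
# In this sketch only auth_token, plus data already encrypted locally with
# encryption_key, would ever be sent to the provider.
```

The provider can still check that the user knows the password, but it never receives the password or the encryption key. A weak password could still be guessed from the token, which is where the earlier assumption about brute force comes in.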
All current cloud storage services that I know of which provide encryption, or add-on encryption services, follow the ‘security by proxy’ model. Therefore in all cases users do not get assurance that the provider cannot access their data. That’s because the provider can, with little or no effort, access its customers’ data if it wants to.