MediaTech Law

By MIRSKY & COMPANY, PLLC

Encrypted Data: Still “Personal Data” under GDPR?

An interesting question is whether encrypted personal data is still “personal data” for purposes of the European Union’s General Data Protection Regulation (GDPR), thereby making processing of that data subject to the GDPR’s library of compliance obligations.  The answer depends on the meaning of encryption: it is not enough to claim that encrypted data is “anonymized,” and it would be inaccurate to conclude on that basis alone that the data no longer relates to an “identified or identifiable natural person” within the meaning of the personal data definition.

If an organization encrypts data in its care, and the encryption thereby renders the data no longer “identified”, is the data still “identifiable”?  Maybe.  If it is neither identified nor identifiable, then it is no longer “personal data”.

First, what is encryption?  Josh Gresham writes on IAPP’s blog that encryption involves a party “tak[ing] data and us[ing] an ‘encryption key’ to encode it so that it appears unintelligible.  The recipient uses the encryption key to make it readable again.  The encryption key itself is a collection of algorithms that are designed to be completely unique, and without the encryption key, the data cannot be accessed.  As long as the key is well designed, the encrypted data is safe.” (emphasis added)
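To make the mechanics concrete, here is a minimal sketch (my own illustration, not from Gresham’s post) using the Python cryptography package’s Fernet recipe as one example of a symmetric scheme: the same key that encodes the data is needed to make it readable again, and without it the ciphertext appears unintelligible, assuming, as Gresham’s definition does, that the scheme is well designed and properly implemented.

```python
# Minimal illustration only; the library choice and sample data are assumptions.
from cryptography.fernet import Fernet

key = Fernet.generate_key()            # the "encryption key"
f = Fernet(key)

plaintext = b"jane.doe@example.com"    # hypothetical personal data
ciphertext = f.encrypt(plaintext)      # appears unintelligible without the key

print(ciphertext)                      # opaque token, unreadable on its own
print(f.decrypt(ciphertext))           # b'jane.doe@example.com' -- readable again with the key
```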

There are several predicates in Gresham’s definition of encryption: he acknowledges “the assumption that the encryption algorithm is well designed and that the encryption is properly carried out”.  Or in other words, how unintelligible is sufficiently “unintelligible”?  What makes an encryption key “well designed”?  Gresham advocates that the European Data Protection Board establish, maintain and update a list of acceptable encryption technologies and standards.

Such a list would not necessarily guarantee the non-identifiability of data, but it would assure it in its most practical sense.  All of which seems consistent with GDPR Recital 26: “To determine whether a natural person is identifiable, account should be taken of all the means reasonably likely to be used, such as singling out, either by the controller or by another person to identify the natural person directly or indirectly.  To ascertain whether means are reasonably likely to be used to identify the natural person, account should be taken of all objective factors, such as the costs of and the amount of time required for identification, taking into consideration the available technology at the time of the processing and technological developments.”

This same point was made by a group of British data scientists last year in Computer Law & Security Review, challenging the presumption that all “pseudonymized” (as opposed to “anonymized”) data relating to an identified or identifiable individual remain “personal data” under GDPR.  As I wrote a few years ago, equally muddying the waters is the common casual use in vendor contracts of the term “anonymized aggregated data” (an example: “[VENDOR] may use such Customer’s Performance Data … to create anonymized aggregated data, industry reports, and/or statistics (“Aggregated Data”) for its own commercial purposes, provided that Aggregated Data will not contain any information that identifies the Advertiser or any of its customers”). 

On the one hand, there is the precision view, requiring that “anonymized” be used only to describe data that cannot be re-identified.  This was the view of the Advocate General in the 2016 case of Breyer v Germany (Case C-582/14 – Patrick Breyer v Germany), who argued that a website user’s dynamic IP address held by a website operator could be personal data if an internet service provider holds further information that could be combined with the IP address to identify the user.

There is, on the other hand, a practical approach recognizing (as does Recital 26) that absolute non-re-identifiability is not the test, but rather that de-identified data is sufficiently removed from personal data by taking “account … of all objective factors” and “all the means reasonably likely to be used” to identify the natural person directly or indirectly.

As the data scientists note in Computer Law & Security Review, in discussing the judgment of the Court of Justice of the European Union (CJEU) in Breyer v Germany (see above):

[I]t was necessary to determine ‘whether the possibility to combine a dynamic IP address with the additional data held by the internet service provider constitutes a means likely reasonably to be used to identify the data subject.’  They concluded that the data were personal, but only because of the existence of legal channels enabling the competent authority to obtain identifying information from the internet service provider in the event of a cyber-attack.

In the authors’ view, the GDPR’s contemplated use of the term “pseudonymization” and the concept of de-identification should not be understood as dispositive of whether data is “personal” – in fact, such data is presumed personal as a default.  Rather, “it is Recital 26 and its requirement of a ‘means reasonably likely to be used’ which remains the relevant test as to whether data are personal”.

Gresham, in his IAPP commentary, writes that encryption is neither pseudonymization nor anonymization.  That is because, as discussed above, de-identified encrypted data may or may not be able to be re-identified, depending on the availability (or security) of the encryption key, but also depending on the design (and implementation) strength of the encryption itself.  Assuming the sufficiency of the latter, Gresham seems to suggest that encrypted personal data processed by parties who directly implemented the encryption should still be viewed as personal data, on the presumption that those parties retain a re-identification key (i.e. pseudonymized data).  That same data in the care of downstream third parties without reasonable access to that key should not be so viewed (i.e. anonymized data).
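A short, hypothetical sketch of that distinction (again my own, using the same assumed library): the party that performed the encryption and retains the key can always re-identify the data subject, while a downstream party holding only the ciphertext, and no key, cannot.

```python
# Hypothetical sketch; names and library are assumptions, not from Gresham's commentary.
from cryptography.fernet import Fernet, InvalidToken

# The party that implemented the encryption retains the key and can re-identify.
key = Fernet.generate_key()
ciphertext = Fernet(key).encrypt(b"jane.doe@example.com")
assert Fernet(key).decrypt(ciphertext) == b"jane.doe@example.com"

# A downstream processor receives only the ciphertext; without the key
# (even a guessed one), decryption fails and the data stays unintelligible.
try:
    Fernet(Fernet.generate_key()).decrypt(ciphertext)
except InvalidToken:
    print("downstream party cannot re-identify the data subject")
```

On Gresham’s framing, the first party handles pseudonymized data and the second arguably anonymized data; the commenters discussed next dispute whether that second situation realistically occurs.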

Several commenters on Gresham’s post took issue with this last idea, that sufficiently designed encryption, properly implemented, might render data no longer “personal data” under GDPR, particularly when processed by parties that did not carry out the encryption and do not hold the key.  Marcus Mueller commented that “It’s not necessary that the cloud provider has insight into the content or meaning of encrypted data. This is just no requirement according to Art. 4 regarding ‘personal data’ (par. 1), ‘processing’ (par. 2), ‘controller’ (par. 7) or ‘processor’ (par. 8). … In spite of the encryption, the full range of the principles in Art. 5 applies for encrypted data, including ‘integrity and confidentiality’ (par. 1 (f)).”  And Michiel Benda simply points out that “downstream third parties without reasonable access to [the encryption] key” (my words) do not practically exist.

In fairness, Gresham does not advocate removing encrypted, anonymized data from data protection governance altogether; anonymization would only reduce governance obligations, not eliminate them.  He argues instead that, since in practical terms a processor of encrypted data “isn’t really processing personal data”, a better practical path to achieving GDPR’s data privacy protections might be to establish and maintain a list of acceptable encryption technologies capable of truly anonymizing data.  Standards for acceptable encryption should be set sufficiently high to justify reduced GDPR obligations, such as the obligation to notify data subjects of data breaches.
