Archive for the ‘Anonymisation’ Category

The legal and practical realities of “personal data”

Posted on September 3rd, 2014 by



Are IP addresses personal data?  It’s a question I’m so frequently asked that I thought I’d pause for a moment to reflect on how the scope of “personal data” has changed since the EU Data Protection Directive’s adoption in 1995.

The Directive itself defines personal data as “any information relating to an identified or identifiable natural person (‘data subject’); an identifiable person is one who can be identified, directly or indirectly, in particular by reference to an identification number or to one or more factors specific to his physical, physiological, mental, economic, cultural or social identity”.

That’s not the beginning and the end of the story though.  Over the years, various regulatory guidance has been published that has further shaped what we understand by the term “personal data”.  This guidance has taken the form of papers published by the Article 29 Working Party (most notably Opinion 4/2007 on the Concept of Personal Data) and by national regulators like the UK’s Information Commissioner’s Office (see here).  Then throw in various case law that has touched on this issue, like the Durant case in the UK and the European Court of Justice rulings in Bodil Lindqvist (Case C-101/01) and the Google Right to Be Forgotten case (C-131/12), and it’s apparent that an awful lot of time has been spent thinking about this issue by an awful lot of very clever people.

The danger, though, is that the debate over what is and isn’t personal data can get so weighed down in academic posturing that the practical realities of managing data often get overlooked.  When I’m asked whether or not data is personal, it’s typically a loaded question: the enquirer wants to know whether the data in question can be retained indefinitely, or whether it can be withheld from disclosures made in response to a subject access request, or whether it can be transferred internationally without restriction.  If the data’s not personal, then the answer is: yes, yes and yes.  If it is personal, then the enquirer needs to start thinking about how to put in place appropriate compliance measures for managing that data.

There are, of course, data types that are so obviously personal that it would be churlish to pretend otherwise: no one could claim that a name, address or telephone number isn’t personal.  But what should you do when confronted with something like an IP address, a global user ID, or a cookie string?  Are these data types “personal”?  If you’re a business trying to operationalise a privacy compliance program, an answer of “maybe” just doesn’t cut it.  Nor does an answer of “err on the side of caution and treat it as personal anyway”, as this can lead to substantial engineering and compliance costs in pursuit of a vague – and possibly even unwarranted – benefit.

So what should you do?  Legal purists might start exploring whether these data types “relate” to an “identified or identifiable person”, as per the Directive.  They might note that the Directive mentions “direct or indirect” identification, including by means of an “identification number” (an obvious hook for arguing an IP address is personal data).  They might explore the content, purpose or result of the data processing, as proposed by the Article 29 Working Party, or point out that these data types “enable data subjects to be ‘singled out’, even if their real names are not known.”  Or they might even make the (by now slightly fatigued) argument that these data types relate to a device, not to a person – an argument that may once have worked in a world where a single computer was shared by a family of four, but that now looks increasingly weak in a world where your average consumer owns multiple devices, each with multiple unique IDs.

There is an alternative, simpler test though: ask yourself why this data is processed in the first place and what the underlying individuals would therefore expect as a consequence.  For example: Is it collected just to prevent online fraud or is it instead being put to use for targeting purposes? Depending on your answer, would individuals therefore expect to receive a bunch of cookie strings in response to a subject access request?  How would they feel about you retaining their IP address indefinitely if it was held separately from other personal identifiers?

The answers to these questions will of course vary depending on the nature of the business you run – it’s difficult to imagine a not-for-profit realistically being expected to disclose IP addresses contained in web server logs in response to a subject access request, but it’s perhaps not a huge stretch, say, for a targeted ad platform.  The point is simply that trying to apply black and white boundaries to what is, and isn’t, personal will, in most cases, prove an unhelpful exercise wholly devoid of context.  That’s why Privacy Impact Assessments are so important as a tool to assess these issues and propose measured, proportionate responses to them.

The debate over the scope of personal data is far from over, particularly as new technologies come online and regulators and courts continue to publish decisions about what they consider to be personal.  But, faced with practical compliance challenges about how to handle data in a day-to-day context, it’s worth stepping back from legal and regulatory guidance alone.  Of course, I wouldn’t for a second advocate making serious compliance decisions in the absence of legal advice; it’s simply that decisions based on legal merit alone risk not giving due consideration to data subject trust.

And what is data protection about, if not about trust?

 

Germany: Federal Court stops disclosure claims against review platforms

Posted on August 1st, 2014 by



In Germany, the Federal Court of Justice pulled the rug from under claims for the disclosure of user data against the providers of online services. The court ruled that statutory law does not permit a service provider to disclose user data to persons and businesses concerned by a negative and potentially unlawful review posted on a review platform (judgement of 1 July 2014, court ref. VI ZR 345/13). Only if the review constitutes a criminal act in itself, such as defamation or slander, may the prosecution request disclosure in the course of a criminal investigation. The judgement finally ended a debate that had been simmering for a long time.

Background

The case concerned a medical practitioner who sued a review platform dedicated to medical services. A user had posted a review on the platform alleging that patients’ files were kept in laundry baskets, that average waiting times were extraordinarily long, that follow-up appointments were not offered in due time, and that a thyroid hyperfunction had gone undiagnosed and been treated contraindicatively. Shortly afterwards, further reviews were posted which were identical in places to the first review. The claimant repeatedly notified the platform provider of these reviews, and the platform provider took the reviews down. In July 2012, another review was posted with the same allegations. The claimant then sued the platform provider for an order to cease and desist and for disclosure of the name and address of the user who posted the reviews. The defendant never disputed that the factual allegations in the reviews were untrue.

The Judgement

The claim was dismissed. The court’s decision is based on Sec. 12 (2) German Telemedia Act (“TMG”), which stipulates that a service provider may only disclose user data where a specific statute exists that permits such a disclosure and expressly references “Telemedia” services, i.e. online services. The court argued that the general civil law claim for disclosure of third-party data, which is based on principles of good faith (Sec. 242 German Civil Code), does not fulfil the requirements of Sec. 12 (2) TMG. Further, the requirements of Sec. 14 (2) TMG, which allows for a disclosure of user data where this is necessary for the purposes of criminal prosecution, protection of the constitution, averting public dangers and national security, or for the enforcement of copyright, do not apply. According to the court, there is also no room for an analogous application of Sec. 14 (2) TMG, because there is no unintentional gap in the statutes as required for an analogy. In this regard, the court highlighted that the question of whether an individual whose personality rights were unlawfully affected by a user posting should have a claim for disclosure of that user’s data was debated during the legislative process but not acted upon.

The court noted, however, that the result of this legal assessment may be regarded as unbalanced when set against the statutory right to disclosure of user data in the event of a copyright infringement, and that it deems an extension of that statutory right desirable. However, the court emphasised that this decision is up to the legislator, not the courts.

Comment

The question of whether a claim for disclosure of user data is supported by German civil law had long been debated in legal literature, and courts of lower instance had issued conflicting decisions in similar cases. The appellate court (Higher Regional Court of Stuttgart) had also decided in favour of the claimant. The Federal Court has now ended this debate, at least for the time being. The judgement is clear and leaves no room for interpretation or loopholes. This is good news both for providers of online platforms, who can safely assure their users that their identity is protected, and for users, who need not fear de-anonymisation, which could otherwise lead to pre-emptive self-censorship when posting comments.

However, the question remains whether the court duly considered constitutional law aspects, as the German-law concept of personality rights is rooted in the German constitution (right to human dignity, right to personal freedoms). This had been the main reason why some courts of lower instance had obvious concerns about the result of their legal assessment and tried to find a way out of the dilemma by applying analogies, or balancing of interests, on dubious legal grounds to overcome a statutory position that they deemed inappropriate in some cases. The Federal Court has not touched on constitutional law issues, from which it can be concluded that it at least did not see a blatant violation of constitutional law. However, the Federal Court articulated concerns about the outcome too, by declaring a revision of the statutes desirable and by emphasising the legislator’s responsibility to consider corresponding amendments to the law. Even though the judgement is final and binding, the claimant may seek additional relief by lodging a constitutional complaint.

The decision does not affect the right of the competent authorities to request disclosure of user data for the purposes of criminal prosecution, i.e. in cases where the content of a user review does not merely constitute a violation of personality rights as protected by civil law, but reaches the threshold of a criminal offence such as defamation or slander.

Anonymisation is great, but don’t undervalue pseudonymisation

Posted on April 26th, 2014 by



Earlier this week, the Article 29 Working Party published its Opinion 05/2014 on Anonymisation Techniques.  The opinion describes (in quite some technical detail) the different anonymisation techniques available to data controllers and their relative strengths, and makes some good practice suggestions – noting that “Once a dataset is truly anonymised and individuals are no longer identifiable, European data protection law no longer applies”.

This is a very significant point – data, once truly anonymised, is no longer subject to European data protection law.  This means that EU rules governing how long that data can be kept for, whether it can be exported internationally and so on, do not apply.  The net effect of this should be to incentivise controllers to anonymise their datasets, shouldn’t it?

Well, not quite.  Because the truth is that many controllers don’t anonymise their data, but use pseudonymisation techniques instead.  

Difference between anonymisation and pseudonymisation

Anonymisation means transforming personal information into data that “can no longer be used to identify a natural person … [taking into account] ‘all the means likely reasonably to be used’ by either the controller or a third party.  An important factor is that the processing must be irreversible.”  Using anonymisation, the resulting data should not be capable of singling out any specific individual, of being linked to other data about an individual, or of being used to deduce an individual’s identity.
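
To make the ‘singling out’ criterion concrete: one measure the Opinion discusses is k-anonymity – the size of the smallest group of records sharing the same combination of quasi-identifiers, where k = 1 means at least one record is unique and its subject can be singled out.  Here is a minimal sketch in Python (my own illustration, with purely hypothetical field names, not an example taken from the Opinion):

    from collections import Counter

    def k_anonymity(records, quasi_identifiers):
        """Return the smallest number of records sharing any one combination
        of quasi-identifier values; k == 1 means someone can be singled out."""
        groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
        return min(groups.values())

    records = [
        {"age_band": "30-39", "postcode_area": "EC4", "diagnosis": "A"},
        {"age_band": "30-39", "postcode_area": "EC4", "diagnosis": "B"},
        {"age_band": "40-49", "postcode_area": "SW1", "diagnosis": "A"},
    ]

    print(k_anonymity(records, ["age_band", "postcode_area"]))  # -> 1: the SW1 record is unique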

Conversely, pseudonymisation means “replacing one attribute (typically a unique attribute) in a record by another.  The natural person is therefore still likely to be identified indirectly.”  In simple terms, pseudonymisation means replacing ‘obviously’ personal details with another unique identifier, typically generated through some kind of hashing, encryption or tokenisation function.  For example, “Phil Lee bought item x” could be pseudonymised to “Visitor 15364 bought item x”.
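
The Opinion doesn’t prescribe any particular function for generating such identifiers, but a keyed hash is one common approach.  A minimal sketch in Python follows – the key handling, names and token format are all illustrative only:

    import hmac
    import hashlib

    # Hypothetical secret key held by the controller; in practice this would
    # live in a key management system, not in source code.
    SECRET_KEY = b"replace-with-a-securely-stored-key"

    def pseudonymise(identifier: str) -> str:
        """Replace a direct identifier with a stable, keyed pseudonym.

        Using a keyed hash (HMAC) rather than a plain hash means that someone
        without the key cannot rebuild the mapping by hashing guessed names.
        """
        digest = hmac.new(SECRET_KEY, identifier.encode("utf-8"), hashlib.sha256)
        return "Visitor " + digest.hexdigest()[:8]

    record = {"name": "Phil Lee", "purchase": "item x"}
    pseudonymised_record = {
        "visitor_id": pseudonymise(record["name"]),  # e.g. "Visitor 4f9c01ab"
        "purchase": record["purchase"],
    }

Note that the same name always yields the same pseudonym, so records about the same visitor can still be linked – which is precisely why the resulting dataset remains personal data.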

The Working Party is at pains to explain that pseudonymisation is not the same thing as anonymisation: “Data controllers often assume that removing or replacing one or more attributes is enough to make the dataset anonymous.  Many examples have shown that this is not the case…” and “pseudonymisation when used alone will not result in an anonymous dataset”.

The value of pseudonymisation

The Working Party lists various “common mistakes” and “shortcomings” of pseudonymisation but curiously, given its prevalence, fails to acknowledge the very important benefits it can deliver, including in terms of:

  • Individuals’ expectations: The average individual sees a very big distinction between data that is directly linked to them (i.e. associated with their name and contact details) and data that is pseudonymised, even if not fully anonymised.  In the context of online targeted advertising, for example, website visitors are very concerned about their web browsing profiles being collected and associated directly with their name and address, but less so with a randomised cookie token that allows them to be recognised, but not directly identified.
  • Data value extraction:  For many businesses, anonymisation is just not an option.  The data they collect typically has a value whose commercialisation, at an individual record level, is fundamental to their business model.  So what they need instead is a solution that enables them to extract value at a record level while still respecting individuals’ privacy by not storing directly identifying details – and pseudonymisation enables exactly this.
  • Reversibility:  In some contexts, reversibility of pseudonymised data can be very important.  For example, in the context of clinical drug trials, it’s important that patients’ pseudonymised trial data can be reversed if needed, say, to contact those patients to alert them to an adverse drug event (see the sketch after this list).  Fully anonymised data in this context would be dangerous and irresponsible.
  • Security:  Finally, pseudonymisation improves the security of data held by controllers.  Should that data be compromised in a data breach scenario, the likelihood that underlying individuals’ identities will be exposed and that they will suffer privacy harm as a result is considerably lower.
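
To illustrate the reversibility point above: a common pattern is a ‘token vault’ that keeps the mapping between pseudonyms and identities separate from the working dataset.  The sketch below (in Python; the class and its design are hypothetical, and a real deployment would use an access-controlled token store rather than in-memory dictionaries) shows the idea:

    import secrets

    class TokenVault:
        """Reversible tokenisation in miniature: the token-to-identity mapping
        is held apart from the research data, so pseudonymised records can be
        re-linked only by whoever controls the vault."""

        def __init__(self):
            self._token_to_identity = {}
            self._identity_to_token = {}

        def tokenise(self, identity: str) -> str:
            # Reuse any existing token so each person keeps a single pseudonym.
            if identity in self._identity_to_token:
                return self._identity_to_token[identity]
            token = "Patient-" + secrets.token_hex(4)  # random, not derived from the name
            self._token_to_identity[token] = identity
            self._identity_to_token[identity] = token
            return token

        def reverse(self, token: str) -> str:
            return self._token_to_identity[token]

    vault = TokenVault()
    trial_record = {"patient_id": vault.tokenise("Jane Doe"), "observation": "adverse reaction"}

    # Later, if an adverse drug event means the patient must be contacted:
    print(vault.reverse(trial_record["patient_id"]))  # -> Jane Doe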

It would be easy to read the Working Party’s Opinion and conclude that pseudonymisation ultimately serves little purpose, but this would be a foolhardy conclusion to draw.  Controllers for whom anonymisation is not possible should never be disincentivised from implementing pseudonymisation as an alternative – not doing so would be to the detriment of their security and to their data subjects’ privacy.

Instead, pseudonymisation should always be encouraged as a minimum measure intended to facilitate data use in a privacy-respectful way.  As such, it should be an essential part of every controller’s privacy toolkit!

The anonymisation challenge

Posted on November 29th, 2012 by



For a while now, it has been suggested that one of the ways of tackling the risks to personal information, beyond protecting it, is to anonymise it.  That means stopping such information from being personal data altogether.  The effect of anonymisation of personal data is quite radical – take personal data, perform some magic on it, and that information is no longer personal data.  As a result, it becomes free from any protective constraints.  Simple.  People’s privacy is no longer threatened and users of that data can run wild with it.  Everybody wins.  However, as we happen to be living in the ‘big data society’, the problem is that with the amount of information we generate as individuals, what used to be pure statistical data is becoming so granular that the real value of that information is typically linked to each of the individuals from whom the information originates.  Is true anonymisation actually possible then?

The UK Information Commissioner believes that given the potential benefits of anonymisation, it is at least worthwhile having a go at it.  With that in mind, the ICO has produced a chunky code of practice aimed at showing how to manage privacy risks through anonymisation.  According to the code itself, this is the first attempt ever made by a data protection regulator to explain how to rely on anonymisation techniques to protect people’s privacy, which is quite telling about the regulators’ faith in anonymisation given that the concept is already mentioned in the 1995 European data protection directive.  Nevertheless, the ICO is relentless in its defence of anonymisation as a tool that can help society meet its information needs in a privacy-friendly way.

The ICO believes that the legal test of whether information qualifies as personal data or not makes anonymisation a realistic proposition.  The reason is that EU data protection law only kicks in when someone is identifiable taking into account all the means ‘likely reasonably’ to be used to identify the individual.  In other words, and as the code puts it, the law is not framed in terms of the mere possibility of an individual being identified.  The definition of personal data is based on the likely identification of an individual.  Therefore, the ICO argues that although it may not be possible to determine with absolute certainty that no individual will ever be identified as a result of the disclosure of anonymous data, that does not mean that personal data has been disclosed.

One of the advantages of anonymisation is that technology itself can help make it even more effective.  As with other privacy-friendly manifestations of technology – such as encryption and anti-malware software – the practice of anonymising data is likely to evolve at the same speed as the chances of identification.  This is so because technological evolution is in itself neutral and anonymisation techniques can and should evolve as the uses of data become more sophisticated.  What is clear is that whilst some anonymisation techniques are weak because reintroducing personal identifiers is as easy as stripping them out, technology can also help bulletproof anonymised data.

What makes anonymisation less viable though is the fact that in reality there will always be a risk of identification of the individuals to whom the data relates.  So the question is how remote that risk must be for anonymisation to work.  The answer is that it depends on the level of identification that turns non-personal data into personal data.  If personal data and personally identifiable information were the same thing, it would be much easier to establish whether a given anonymisation process had been effective.  But they are not, because personal data goes beyond being able to ‘name’ an individual.  Personal data is about being able to single out an individual, so the concept of identification can cover many situations, which makes anonymisation genuinely challenging.

The ICO is optimistic about the benefits and the prospect of anonymisation.  In certain cases – mostly in the context of public sector data uses – it will clearly be possible to derive value from truly anonymised data.  In many other cases however, it is difficult to see how anonymisation in isolation will achieve its end, as data granularity will prevail in order to maximise the value of the information.  In those situations, the gap left by imperfect anonymisation will need to be filled in by a good and fair level of data protection and, in some other cases, by the principle of ‘privacy by default’.  But that’s a different kind of challenge.

 
This article was first published in Data Protection Law & Policy in November 2012.