FEATURE | 28 January 2022

Anonymising data

Following the publication of the data-sharing code of practice, the Information Commissioner’s Office (ICO) is now working on guidance on anonymisation, pseudonymisation, and privacy-enhancing technologies.

The guidance will explore the legal, policy and governance issues around the application of anonymisation and pseudonymisation in the context of data protection law. In doing so, it will clarify when personal data can be considered anonymised, whether data can be anonymised well enough to reduce the risks adequately, and what the benefits of anonymisation and pseudonymisation might be.

The endeavour is commendable. The topic has been debated for a long time: from the ICO’s first attempt, the Anonymisation: managing data protection risk code of practice of 2012, and the former Article 29 working party’s (now the European Data Protection Board) Opinion on anonymisation techniques of 2014, to the Norwegian and Irish data protection authorities’ guidance on anonymisation and pseudonymisation of 2017 and 2019.

The concept is straightforward: anonymisation must be assessed against the possibility of re-identifying the data subject. The test of re-identifiability, however, leaves ample room for discussion. On one side, the EU Article 29 working party aimed for a close-to-zero approach; on the other, the Court of Justice of the European Union held in 2016, in the Patrick Breyer case, that if identification of the data subject is prohibited by law, or practically impossible because it would require a disproportionate effort in terms of time, cost and labour, the risk of identification appears, in reality, to be insignificant.

Legislation followed, which is why the GDPR Recital 26 now reads: ‘To determine whether a natural person is identifiable, account should be taken of all the means reasonably likely to be used, such as singling out, either by the controller or by another person, to identify the natural person directly or indirectly. To ascertain whether means are reasonably likely to be used to identify the natural person, account should be taken of all objective factors, such as the costs of and the amount of time required for identification, taking into consideration the available technology at the time of the processing and technological developments.’

But how can controllers anonymise data? Organisations have worked tirelessly to adapt to and keep up with data protection legislation, and pseudonymisation has largely been understood for what it is – a mere security measure. When it comes to anonymisation, however, the struggle is real.
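To make that distinction concrete, the following is a minimal sketch of pseudonymisation as a security measure: a direct identifier is replaced with a keyed token, and the key is held separately, so the data can still be re-linked by whoever holds the key and therefore remain personal data. The sample record, the field names and the choice of HMAC-SHA256 are assumptions made for illustration, not a method prescribed by the ICO.

```python
import hashlib
import hmac

# Secret key stored away from the dataset (for example, in a key vault).
# Whoever holds it can re-link tokens to individuals, which is why
# pseudonymised data are still personal data under the GDPR.
SECRET_KEY = b"keep-this-key-separate-from-the-data"

def pseudonymise(identifier: str) -> str:
    """Replace a direct identifier (e.g. an email address) with a stable keyed token."""
    return hmac.new(SECRET_KEY, identifier.encode("utf-8"), hashlib.sha256).hexdigest()

record = {"email": "respondent@example.com", "age": 34, "postcode": "SW1A 1AA"}
record["email"] = pseudonymise(record["email"])
print(record)  # the email is now an opaque token; the rest of the record is untouched
```

Deleting or rotating the key does not, by itself, make the data anonymous if the remaining attributes can still single someone out – which is where anonymisation techniques proper come in.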

From an organisation’s operational standpoint, it all starts with terminology: are anonymous, anonymisation and anonymised in the same league as anonymity? Which processes can achieve effective anonymisation? How can anonymisation be upheld over time? More importantly, how effectively can these processes be explained to small organisations, micro enterprises and sole traders that lack the means and tools to investigate randomisation and generalisation, noise addition, permutation, differential privacy, aggregation, and concepts including k-anonymity, l-diversity and t-closeness?
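For readers meeting those terms for the first time, here is a small sketch, on invented data, of two of the simpler ideas: generalisation (coarsening ages into bands and postcodes into areas) and a k-anonymity check, which asks whether every combination of quasi-identifiers is shared by at least k records. The records and the threshold k = 3 are hypothetical.

```python
from collections import Counter

# Toy records: age and postcode are quasi-identifiers that could single someone out.
records = [
    {"age": 34, "postcode": "SW1A 1AA", "answer": "yes"},
    {"age": 36, "postcode": "SW1A 2BB", "answer": "no"},
    {"age": 38, "postcode": "SW1A 3CC", "answer": "yes"},
    {"age": 52, "postcode": "M1 4DD", "answer": "no"},
    {"age": 55, "postcode": "M1 5EE", "answer": "yes"},
    {"age": 58, "postcode": "M1 6FF", "answer": "no"},
]

def generalise(record):
    """Coarsen quasi-identifiers: 10-year age bands, outward postcode only."""
    band_start = record["age"] // 10 * 10
    return {
        "age_band": f"{band_start}-{band_start + 9}",
        "postcode_area": record["postcode"].split()[0],
        "answer": record["answer"],
    }

def is_k_anonymous(rows, k):
    """True if every quasi-identifier combination appears at least k times."""
    groups = Counter((r["age_band"], r["postcode_area"]) for r in rows)
    return min(groups.values()) >= k

generalised = [generalise(r) for r in records]
print(is_k_anonymous(generalised, k=3))  # True: each (age band, area) group holds 3 records
```

l-diversity and t-closeness refine the same idea by also asking how varied the sensitive values (here, ‘answer’) are within each group, so that membership of a group does not itself reveal the answer.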

A particularly clear and practical example came from a leaflet published jointly by the Spanish data protection authority and the European Data Protection Supervisor.

Of the 10 misunderstandings related to anonymisation that it lists, the most notable points are:

  • Anonymisation of data is always possible – in fact, it is not. It is not always possible to lower the re-identification risk below a previously defined threshold while retaining a useful dataset for a specific processing
  • Anonymisation is forever – in fact, there is no cradle-to-grave solution. There is a risk that some anonymisation processes could be reverted in the future
  • Anonymisation always reduces the probability of re-identification of a dataset to zero – it does not. A robust anonymisation process aims to reduce the re-identification risk only below a certain threshold
  • Anonymisation makes the data useless – in fact, a proper anonymisation process keeps the data functional for a given purpose (see the sketch below).

The relevance of the document rests on how directly and effectively it can be understood.
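One of those points – that well-designed anonymisation keeps the data useful for a given purpose – can be made tangible with a short sketch of noise addition in the spirit of differential privacy: random noise is added to an aggregate so that no single respondent’s answer can be inferred, while the published figure stays close enough to be usable. The counting query, the figures and the epsilon value of 1.0 are invented for illustration; this is a teaching sketch, not a production-ready differential privacy implementation.

```python
import math
import random

def laplace_noise(scale: float) -> float:
    """Sample from a Laplace(0, scale) distribution via inverse-CDF sampling."""
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def noisy_count(true_count: int, epsilon: float) -> float:
    """A counting query changes by at most 1 when one respondent is added or removed,
    so Laplace noise with scale 1/epsilon masks any individual's contribution."""
    return true_count + laplace_noise(1.0 / epsilon)

# Publishing how many of 1,000 hypothetical respondents answered 'yes':
print(round(noisy_count(612, epsilon=1.0)))  # close to 612 – useful, but deliberately imprecise
```

Smaller values of epsilon add more noise, trading utility for stronger protection – the same balancing act between residual risk and usefulness that the leaflet describes.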

The ICO guidance is being published in stages. The first chapter has just been released, and those that follow will cover:

  • Identifiability (outlining the spectrum of identifiability and guidance on managing re-identification risk, including concepts such as the ‘reasonably likely’ and ‘motivated intruder’ tests)
  • Guidance on pseudonymisation techniques and best practices
  • Accountability and governance requirements
  • Anonymisation and research
  • Guidance on privacy-enhancing technologies
  • Technological solutions and data-sharing options, and case studies.

The first chapter, introducing the key concepts of anonymisation and pseudonymisation, is clear, direct and easy to understand. The really interesting part will come in the upcoming chapters.

We have two major expectations: the first concerns the ‘reasonably likely’ test; the second is, of course, anonymisation and research. Will ‘research’ be taken to mean only health research? From a market, opinion and social research perspective, the stance is quite straightforward. As Nunan and Di Domenico pointed out in a 2016 IJMR paper, anonymisation is rooted in the defining principle of research ethics: that participants in research should not be harmed as a result of their participation. As they correctly note, it is a quintessential part of trust, a key factor in facilitating the forms of relationship on which market research relies.

Anonymisation has a unique place within market research, as opposed to other uses of commercial data, because it enables respondent trust and the maintenance of a self-regulatory regime. Researchers would do well to remember that trust does not come free.
