'From knowing by name to targeting: the meaning of identification under the GDPR' by Nadezhda Purtova in (2022) International Data Privacy Law comments
Despite its core role in the EU system of data protection, the meaning of identification remains unclear in data protection law and scholarship while the spotlight focuses on the legally relevant chance of identification, ie identifiability. While the Article 29 Working Party interpreted identification broadly, as distinguishing one in a group, this interpretation has been questioned in light of the CJEU decision in Breyer. This article tackles this uncertainty. This article offers an integrated socio-technical typology of identification where, in addition to the known identification types (look-up-, recognition-, session- and classification identification), targeting is added as a new identification type. To identify by way of targeting means to select a particular individual from a group as an object of attention or treatment in a single moment of time. The article clarifies the legal meaning of identification under the GDPR. It proposes a contextual interpretation of Breyer, which negates Breyer’s restrictive potential and brings all identification types within the GDPR. The article concludes with a discussion of the implications of this reading of identification for data protection in terms of the applicability of the GDPR to new data technologies and practices such as facial detection and non-tracking based targeted advertising, effects of certain privacy preserving technologies such as federated learning of cohorts, consequences for invoking data protection rights when identification is not possible, but also in terms of the need to clearly define the objectives of the data protection law.
Purtova argues
Identification, referring both to the process of identifying someone and the fact of being identified, is one of the boundary concepts of data protection law. It separates the data that is personal, i.e. relating to an identified or identifiable natural person, from non-personal, and thus triggers the applicability of the EU General Data Protection Regulation (the GDPR). Yet, despite the high stakes attached to the meaning of this concept, relatively little attention is paid both in law and legal scholarship to what identification is. Therefore the chief issue tackled here is the meaning of identification under the GDPR.
The primary focus of the current scholarly attention lies on the adjacent concept of identifiability which refers to the possibility of identification, ie of being identified, in future. This is not surprising since in practice whether or not a person is identifiable rather than identified is regarded as an easier criterion to meet and is therefore a de facto ‘threshold condition’ when determining the status of data as personal. Some legal scholars discuss the meaning and legally relevant degree of identifiability, pseudonymization, and true meaning and possibility of anonymization. The debates among computer scientists tackle anonymization and reidentification techniques and their (in)effectiveness. These discussions clarify the boundaries of application of data protection law and contribute to practical solutions for at least some of the data protection concerns, and as such are valuable and relevant. Yet, the meaning of identifiability is derived from and hence is secondary in relation to the primary concept of identification. Therefore any identifiability debate is at risk of being hollow when not underpinned with a robust understanding of identification. It makes little sense to argue if a natural person is ‘identifiable’ when it is not clear when a natural person would be ‘identified’ and what it means to identify somebody.
As the technologies to target a person evolve and test the boundaries of data protection, the meaning of identification becomes less clear, and the gap in understanding what it means to identify becomes increasingly obvious and imperative to close. A relatively recent case of such technological development is face detection and analysis used in ‘smart’ advertising boards. Unlike with facial recognition where one’s facial features are compared to pre-existing facial templates to establish if a person is known, face detection and analysis do not recognize people but ‘detect’ them and, in case of smart billboards, classify them into gender-, age-, emotion-, and other groups based on processing of their facial features to display tailored ads. The industry that develops, sells, and employs the technology argues that facial detection does not involve processing personal data, eg because the chance of establishing who a person before the ‘sensor’ is, is close to null. In part this is due to the ‘transient’ nature of the processing, where raw data of an individual processed by the detection ‘sensors’ is discarded immediately. The technology does not allow tracking a person and recognizing him or her over time either. To be clear, as will become apparent from further analysis, these industry arguments do not necessarily withstand legal scrutiny and it is highly likely that personal data will be processed in these contexts, if the proposed interpretation of identification is adopted. Yet, there is no uniform position on the interaction of face detection and data protection across the EU Member States. For instance, the Dutch data protection authority considers face detection in the context of smart billboards as processing of personal data, while its Irish and reportedly Bavarian counterparts are of the opposite view.
More similar debates and uncertainties are likely to emerge in other contexts where facial analysis and sensing can be used, such as healthcare for pain or pulse detection, in the news sector for audience measurement, or in assisted driving, video surveillance with face analytics, but also online in the context of tracking-free advertising, and in other cases of the ‘transient’ data processing. While the applicability of the GDPR would be the focus of debate in these contexts, the discussions will inevitably emerge also where the applicability of the GDPR is not in dispute, eg in the context of invoking data protection rights. Article 11(2) GDPR—under some caveats—exempts data controllers from complying with data subjects’ data access and rectification requests, requests for erasure and restriction of processing, as well as data portability obligations where ‘the controller is able to demonstrate that it is not in a position to identify the data subject’. The question will then be: what does it mean to identify? The definition of biometric data in Article 4(14) GDPR and pseudonymization in Article 4(5) GDPR also hinge on the meaning of identification.
To date, there have been disappointingly few attempts in the data protection legal scholarship, at least in English, at understanding identification beyond identifiability. In 2007 Leenes proposed a four-fold classification of identification. According to Leenes, there is more to identification than simply establishing one’s civil identity, and we need to read identification broadly if we are to address the ‘real privacy concerns’. He distinguished look-up (l-), recognition (r-), classification (c-), and session (s-) identifiability. A recent notable contribution to the debate on the meaning of identification is by Davis who examines the meaning of an ‘identified natural person’ specifically in the context of smart billboards and articulates the importance of looking into the meaning of ‘identified’ as a baseline for establishing the meaning of ‘identifiable’. However, Leenes, while examining the meaning of identification in data protection law, does so with a view to inform the information privacy debate across borders rather than to offer an interpretation of the specific legal concept of the EU data protection law, among others in light of the evolving case law of the Luxemburg Court, and Davis’ analysis is limited to the legal status of data in the context of facial detection. Jasserand addressed the meaning of identification under the GDPR framework, but only when it concerns the definition of biometric data.
In addition, there is a swirling stream of sociological and philosophical literature focusing on the related concepts of identity and anonymity. To name a few, in 1999 Gary Marx presented a sociological typology of what he called ‘identity knowledge’, which is the opposite of anonymity and hence I consider it equal to identification. He specified seven broad types of identity knowledge: legal name, locatability, pseudonyms linked to identity or location, pseudonyms that are not linked to name or location, pattern knowledge, social categorization, and symbols of eligibility/non-eligibility. Helen Nissenbaum discussed the meaning and value of anonymity in the information age as ‘unreachability’. A range of scholars offer many accounts of the meaning and construction of identity, generally and in the context of ambient intelligence and profiling. Against this backdrop the legal scholarly account of the meaning of identification is inadequate.
This lack of academic consideration might be partially explained by the fact that the Article 29 Working Party, an EU advisory authority on data protection under the former 1995 Data Protection Directive, defined what an identified person means in its 2007 opinion on the concept of personal data: ‘[i]n general terms, a natural person can be considered as “identified” when, within a group of persons, he or she is “distinguished” from all other members of the group’. The same explanation arguably holds for the concept of personal data in the GDPR, since there are no fundamental differences between the definitions of personal data under the 1995 Directive and the Regulation. This approach includes identification by name, but also other modes of ‘zoom[ing] in on a flesh and bone individual’. The authority of the Working Party when it comes to the data protection on the ground is undoubted, and its opinion on the concept of personal data is the most comprehensive and influential guideline for the controllers as to how this concept should be used in practice. The general perception of the meaning of identification under the GDPR following from the WP29 interpretation is thus that it is broad, flexible, and generously accommodating to the realities and challenges of the modern data processing practices. Indeed, the meaning of identification as distinguishing a person from a group should bring the cases of targeted advertising, profiling, and others where the name of a person is of no consequence to the protective bosom of the GDPR. Perhaps for this reason the data protection scholarship seems to be comfortably content with the status quo in law and literature.
However, the status quo has been resting on shaky grounds. The position of the Working Party, and hence the ‘distinguished from’ approach to identification, are not formally binding. The Court of Justice of the European Union (CJEU), the only body with authority to issue binding interpretations of the GDPR, was long silent on the meaning of identification. While the Court did follow the Working Party in interpreting the ‘information’ and ‘relating to’ elements of the concept of personal data in Nowak, it also has a record of not following the lines of interpretation chosen by the WP29 earlier. To complicate matters further, the Court in its 2016 Breyer decision appeared to have invalidated the understanding of identification as distinguishing or being distinguished from a group, advanced by the Working Party and granting the GDPR protection a broad reach. Without any detailed consideration about the meaning of identification, the Court in Breyer dismissed a dynamic IP (Internet Protocol) address as an identifier sufficient to identify a person, while one of the core functions of an IP address is exactly to distinguish one web visitor, or at least a location on the network, from another.
This brief consideration seems to restrict the interpretation of identification under the GDPR to the identification by name or a similar unique identifier representing one’s civil identity, the narrowest meaning of identification possible. This effectively takes cookies, IP addresses, and other online trackers, and with them a large part of online tracking and discrimination, but also individual profiling not tied to a name and (real-time) automated decision-making, among others enabled through some of the new technologies such as facial detection, outside of the scope of the data protection law, and deprives people affected by these practices of legal protection that the GDPR would have granted, had identification been interpreted broadly. The very limited scholarly commentary on the Breyer case has largely overlooked this remarkable and consequential departure of the CJEU from the WP29 interpretation. Hence, the question remains: how should identification under the GDPR be understood?
This article will answer this question in two steps. First, it will examine the meaning of identification outside of the legal context (the Section ‘Meaning and Socio-Technical Approaches to Identification outside of the GDPR’). It will offer an integrated typology of identification as a process and result of distinguishing a person in a group. The typology builds on three prominent socio-technical accounts of identification: four identifiability types by Leenes, seven types of identity knowledge by Marx, and anonymity as unreachability by Nissenbaum. In addition to the established types, I will identify targeting as a new identification type, where to identify by way of targeting means to select a particular individual from a group as an object of attention or treatment in a single moment of time. The argument will build, among others, on the literatures on calculated publics, profiling in recommender systems, price, and content personalization. Second, I will focus on the legal meaning of identification under the GDPR. I will build a case that all five identification types, not only civil identity identification, are covered by the GDPR meaning of identification. It is an easy conclusion to draw if one follows the non-binding interpretation of the Article 29 Working Party that to identify means to distinguish one in a group. This approach will be detailed in the section ‘The Article 29 Working Party Interpretation of the GDPR’. In the section ‘Meaning of Identification in CJEU’s case law’ I review the CJEU case law with relevance to the meaning of identification, including Breyer and its potentially restrictive impact. I then propose a contextual interpretation of Breyer in light of the facts of the case, which negates Breyer’s restrictive potential and brings all types of identification, including non-civil identity ones, within the meaning of identification under the GDPR.
The section ‘Conclusion: What This Means for Data Protection’ will conclude with a discussion of the implications of this broad reading of identification for EU data protection law practice and research.