‘How Data Can Be Used Against People: A Classification of Personal Data Misuses’ by Jacob Leon Kröger, Milagros Miceli and Florian Müller
Even after decades of intensive research and public debate, the topic of data privacy remains surrounded by confusion and misinformation. Many people still struggle to grasp the importance of privacy, which has far-reaching consequences for social norms, jurisprudence, and legislation. Discussions of personal data misuse often revolve around a few popular talking points, such as targeted advertising or government surveillance, leading to an overly narrow view of the problem. The literature in the field tends to focus on specific aspects, such as the privacy threats posed by ‘big data’, while overlooking many other possible harms. To help broaden the perspective, this paper proposes a novel classification of the ways in which personal data can be used against people, richly illustrated with real-world examples. Aside from offering a terminology for discussing the broad spectrum of personal data misuse in research and public discourse, our classification provides a foundation for consumer education and privacy impact assessments, helping to shed light on the risks involved in disclosing personal data.
The protection of personal data is a highly controversial issue. While scores of researchers, activists and politicians advocate the right to informational privacy and stress the importance of comprehensive data protection laws, others argue against strong legal restrictions, pointing to the wide-ranging benefits of data collection and use. Many people, asserting they have “nothing to hide”, even dismiss the importance of data protection altogether, believing that privacy only truly matters for those on the wrong side of the law. While the nothing-to-hide argument has long been exposed as misguided, it is not only held by ordinary citizens but also backed by some of the most powerful organizations on earth, including governments and multinational corporations.
During his time as Google’s CEO, Eric Schmidt notoriously stated, “If you have something that you don’t want anyone to know, maybe you shouldn’t be doing it in the first place”. Following the same reasoning, the British government chose the campaign slogan “If you’ve got nothing to hide, you’ve got nothing to fear” to promote a nationwide CCTV surveillance program. As these examples illustrate, the various ways in which privacy invasions can harm law-abiding citizens are often ignored, underestimated or even deliberately concealed. As a result, many people, including members of the judiciary, struggle to articulate why the protection of personal data is important. This state of misinformation has severe consequences for policy and public discourse, with data protection advocates being referred to as “privacy alarmists” and privacy itself being framed as “old-fashioned[,] antiprogressive, overly costly” and “primarily an antiquated roadblock on the path to greater innovation”. This widespread sentiment also helps legitimize and perpetuate current privacy laws, which are riddled with loopholes and fail to consistently safeguard people from harmful, abusive and ethically questionable data practices.
In the face of these challenges, researchers have called for a closer examination and better understanding of the actual harms that can result from the disclosure and processing of personal data. In this vein, Solove argues that “Privacy is far too vague a concept to guide adjudication and lawmaking, as abstract incantations of the importance of ‘privacy’ do not fare well when pitted against more concretely stated countervailing interests”. Along these lines, many theorists have attempted to convey the importance of privacy protection by presenting categories and real-life examples of personal data misuse. Notable examples include the “Data Harm Record” by the Data Justice Lab, an “Inventory of Risks and Harms” provided by the Centre for Information Policy Leadership, and Wolfie Christl’s extensive work on corporate surveillance in everyday life. Furthermore, scholars have written several essays on the societal values related to and protected by informational privacy (e.g., [1, 3, 4]), with Magi, for example, providing a list of “fourteen reasons privacy matters”.
Existing classifications often focus on the data practices of companies, on novel privacy threats posed by big data technologies, and/or on specific categories of harm resulting from personal data use (e.g., bodily harm, loss of liberty, financial loss, reputational harm). What seems to be lacking thus far, however, is a general classification of the possible actions that lead to these harms – in other words, a classification capturing the manifold ways in which personal data can be used against people, by criminal, private, public and governmental organizations, or by other individuals.
In an attempt to fill the identified gap, this paper proposes a classification scheme of personal data misuses. As previous work has noted, taxonomies embody subjective social, technical, and political choices; classifications (ours included) are always normative attempts to “impose order onto an undifferentiated mass”. While we acknowledge the subjective character of our endeavor, we also strive for a holistic overview and see value in a structured classification of the possible ways in which personal data can be weaponized. We argue that without a comprehensive and clear overview, many potential paths of harm can easily be overlooked in privacy impact assessments and public discourse, leading to an overly narrow view of the problem. This paper is based on extensive literature research, covering prior investigations and press articles, as well as many months of discussion among the three authors and, occasionally, advisors and fellow researchers, continued until theoretical saturation was reached.
Our classification scheme comprises the following eleven categories:
1. Consuming data for personal gratification – Section 2.1
2. Generating coercive incentives – Section 2.2
3. Compliance monitoring – Section 2.3
4. Discrediting – Section 2.4
5. Assessment and discrimination – Section 2.5
6. Identification of personal weak spots – Section 2.6
7. Personalized persuasion – Section 2.7
8. Locating and physically accessing the data subject – Section 2.8
9. Contacting the data subject – Section 2.9
10. Accessing protected domains or assets – Section 2.10
11. Reacting strategically to actions or plans of the data subject – Section 2.11
While we acknowledge that a holistic exploration of the topic is particularly important in view of the rapid proliferation of data-based services and the accompanying rise of governmental and corporate mass surveillance, the focus of this paper is not limited to the domain of big data, nor even to the digital domain. The classification is meant to be universally applicable, independent of how the data was obtained (e.g., online or offline, legally or illegally, collected or inferred, with or without the knowledge of the data subject), who causes the threat (e.g., individual person, corporation, organized crime group, intelligence agency) and what motivations lie behind it (e.g., financial gain, political objectives, revenge). These parameters will only be included in examples for illustrative purposes.
As even de-identified data has the potential to cause harm to individuals (cf. Section 3.2), we adopt a very broad understanding of “personal data” for the purpose of this classification. While privacy law usually applies to information relating to an identified or identifiable natural person (e.g., Art. 4 GDPR), our proposed classification may apply to any information that is, or once was, personal data in this sense, including even anonymized data – as long as it still has the potential to cause or facilitate harm against the data subject.
The remainder of this paper is structured as follows. Section 2 presents the eleven identified categories of personal data misuse. Section 3 then explains the utility of the classification scheme and discusses its scope and limitations. Section 4 concludes the paper.