31 July 2019

Animals

'‘Ruff’ Justice: Canine Cases and Judicial Law Making as an Instrument of Change' by Richard Jochelson and James Gacek in (2018) 24(1) Animal Law Review resonates with the current Australian Capital Territory move to recognise dogs and other non-human animals as sentient beings.

The authors comment
 The regulation of animals in North America should be apprised of evolving socialities. As the judiciary encounters situations of contestation between humans and animals in adjudication, it should take notice of the emergence of animal recognition in Western societies. Law is apprised of sociality, can absorb social information, and may, at times, reflect how citizens view issues of justice. What was once innocent behavior can be reconstituted as criminal through the adjudicative exercise (and vice versa). In this Paper, we investigate socio-legal constructions of ‘the animal’ in two recent North American adjudications. In two recent cases, R. v. D.L.W. and State v. Newcomb, the Supreme Court of Canada and the Oregon Supreme Court contested what it means to be an animal in situations of bestiality and animal welfare investigations respectively. We argue that the jurisprudence in Canada and the United States should begin to incrementally shift towards progressive conceptions of animal existence. Such an understanding would (re)consider animals as beings, capable of worth and dignity – as more than expendable property. In light of a relative void of modern animal welfare legislation in North American jurisdictions, let alone animal bills of rights, the judicial decision remains the most likely site of progress for animal advocacy.

Mapping

'A Right Not to be Mapped? Augmented Reality, Real Property, and Zoning' (Ottawa Faculty of Law Working Paper No. 2019-17) by Elizabeth F Judge and Tenille E Brown comments
 The digital mapping applications underlying augmented reality have strong public benefits but can also have unappreciated effects on real property. In recent litigation on Pokémon Go, an enhanced digital mapping application in which players participate in a digital scavenger hunt by visiting real world locations, homeowners alleged that the augmented reality application harmed their residential properties by increasing the number of people in their residential areas. However, neither the existing laws on intellectual property nor those for real property are designed to address these types of harms. On the one hand, real property torts, such as nuisance and trespass, on which the homeowners relied, are ill-suited to address harms from a digital application as they are based on a right to exclude and consent. On the other hand, intellectual property laws have not focused on harms that could result from the intersection of intellectual property rights and real property. If it were to be framed anew, the basis of the homeowners’ claims would be most analogous to asserting “a right not to be mapped.” However, there is not yet a “right not to be mapped” in law, and there are compelling reasons for the law not to create one. We recommend three alternative mechanisms to regulate the relationship between augmented reality and real property. We recommend the application of zoning principles as a legal mechanism designed for location-sensitive regulation, which can balance the concerns of individual real property owners, as well as the larger context of community and city interests, and be adapted to innovative technologies such as augmented reality. Additionally, we suggest that catalogues of augmented reality applications be created to support zoning decisions and to provide public notice. We also consider the possibility of licensing schemes with micropayments for real properties affected by augmented reality.

30 July 2019

US MDPs

'Black Market Law Firms' by Casey Faucon in (2019) Cardozo Law Review (Forthcoming) comments
In business and in competition, value exists in striking first. Accountants, the so-called hawks of the professional world, have made the first move. In September 2017, the global accounting giant PwC opened a law firm in Washington, D.C. called ILC Legal. ILC Legal not only provides legal services on non-domestic matters, but also acts as a multidisciplinary provider (“MDP”) and offers other professional services, such as tax-planning, business consulting, and marketing, throughout its 90-country network. In June 2018, Deloitte quickly followed suit, the second of the Big Four accounting firms to enter the U.S. MDP market, partnering with a U.S. immigration law firm in San Francisco. With accountants now having the “first mover” advantage, the legal profession must respond. 
Restricting any competitive response are the legal profession’s current ethical rules. Two weaknesses in the legal profession’s integrity system—the self-regulatory market monopoly over legal services and the ethical treatment of all lawyering acts under a unified profession of law—have restricted collaborative innovations between lawyers and non-lawyers. No more pronounced are larger impacts of these weaknesses to the overall competitiveness of the legal profession than when viewed through the exemplar of Model Rule of Professional Conduct Rule 5.4, which protects the professional independence of a lawyer through prohibiting non-lawyer ownership of law firms. This rule has not stopped accountants, however, from hiring lawyers en masse to deliver legal services to their business and tax clients; nor has the rule stopped enterprising lawyers from collaborating with non-lawyer professionals in an attempt to keep pace and to provide more holistic and comprehensive legal services to clients. 
This Article calls for recognition and regulation of MDPs because the legal profession must now overcome the accountants’ first mover advantages. Despite this initial competitive setback, the legal profession is also now in a position to leverage its current self-regulatory monopoly over legal services to market higher quality, ABA and state ethics board accredited MDP services to clients. This Article then proposes a regulatory framework for recognizing and regulating MDPs based on a classification scheme which categorizes MDPs based on the potential risk that the ownership and control structure could undermine a lawyer’s independent judgment. This novel classification scheme categorizes MDPs as either white, gray, or black market law firms depending on the percentage of non-lawyer majority ownership and control of the MDP. Based on those categories, this Article argues that we should revise Rule 5.4 to allow for unlimited associational forms between lawyers and non-lawyer professionals but prohibit lawyers from providing legal services in black market MDPs, or MDPs which are majority owned and controlled by non-lawyers.

Pricing Algorithms and Data Trusts

'The Price Is (Not) Right: Data Protection and Discrimination in the Age of Pricing Algorithms' by Laura Drechsler and Juan Carlos Benito Sanchez in (2018) 9(3) European Journal of Law and Technology comments
In the age of the large-scale collection, aggregation, and analysis of personal data (‘Big Data’), merchants can generate complex profiles of consumers. Based on those profiles, algorithms can then try and match customers with the highest price they are willing to pay. But this entails the risk that pricing algorithms rely on certain personal characteristics of individuals that are protected under both data protection and anti-discrimination law. For instance, relying on the user’s ethnic origin to determine pricing may trigger the special protection foreseen for sensitive personal data and the prohibition of discrimination in access to goods and services. Focusing on European Union law, this article seeks to answer the following question: What protection do data protection law and anti-discrimination law provide for individuals against discriminatory pricing decisions taken by algorithms? Its originality resides in an analysis that combines the approaches of these two disciplines, presenting the commonalities, advantages from an integrated approach, and misalignments currently existing at the intersection of EU data protection and anti-discrimination law.
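The pricing mechanism the authors describe is easy to make concrete. Here is a minimal, hypothetical Python sketch (all feature names and data are invented for illustration, not drawn from the article) of how a model fitted to consumer profiles can end up keying prices to a protected attribute:

    import numpy as np
    from sklearn.linear_model import LinearRegression

    # Hypothetical profile features; column 2 stands in for a protected
    # characteristic (e.g. an encoding of ethnic origin).
    rng = np.random.default_rng(0)
    X = rng.random((1000, 3))  # [browsing_intensity, postcode_income, protected_attr]

    # Simulated willingness to pay that happens to correlate with column 2.
    y = 50 + 30 * X[:, 1] + 20 * X[:, 2] + rng.normal(0, 5, 1000)

    model = LinearRegression().fit(X, y)
    print(model.coef_)  # a large weight on column 2 means the quoted price
                        # varies with the protected attribute

Note that simply deleting the protected column is no cure if another feature (here, postcode income) acts as a proxy for it, which is one reason the authors argue the data protection and anti-discrimination regimes are best analysed together.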
'Data Trusts as an AI Governance Mechanism' by Chris Reed and Irene Ng comments 
This paper is a response to the Singapore Personal Data Protection Commission consultation on a draft AI Governance Framework. It analyses the five data trust models proposed by the UK Open Data Institute and identifies that only the contractual and corporate models are likely to be legally suitable for achieving the aims of a data trust. 
The paper further explains how data trusts might be used in the governance of AI, and investigates the barriers which Singapore's data protection law presents to the use of data trusts and how those barriers might be overcome. Its conclusion is that a mixed contractual/corporate model, with an element of regulatory oversight and audit to ensure consumer confidence that data is being used appropriately, could produce a useful AI governance tool. 

Solicitors' Misconduct

'Lawyer Disciplinary Processes: An Empirical Study of Solicitors’ Misconduct Cases in England and Wales in 2015' by Andrew Boon and Avis Whyte in (2019) 39(3) Legal Studies comments
The Legal Services Act 2007 effected major changes in the disciplinary system for solicitors in England and Wales. Both the practice regulator, the Solicitors Regulation Authority, and a disciplinary body, the Solicitors Disciplinary Tribunal, were reconstituted as independent bodies and given new powers. Our concern is the impact of the Act on the disciplinary system for solicitors. Examination of this issue involves consideration of changes to regulatory institutions and the mechanics of practice regulation. Drawing on Foucault’s notion of governmentality, empirical evidence drawn from disciplinary cases handled by the SDT and the SRA in 2015 is used to explore potentially different conceptions of discipline informing the work of the regulatory institutions. The conclusion considers the implications of our findings for the future of the professional disciplinary system.
The authors argue
The influence of regulation on individual behaviour is a theme of Foucault’s work on the transition in criminal sanctions from physical punishment to confinement. He observed that the introduction of the prison provided the opportunity to use novel technologies of surveillance. He argued that the techniques of hierarchical observation, normalising judgment and their combination in assessment procedures, which he called ‘the examination’, pervade social institutions aiming to affect individual subjectivity. Foucault’s theory of governmentality proposes that the self-regulation of the subject using such techniques aims to negate the need for external regulation. Their manifestation in both state regulation and attempts to instil self-government in the population is particularly marked in initiatives across liberal and neoliberal economies. Institutions aim to normalise conduct conducive to enterprise, thereby affecting individual self-identity and subjectivity. Foucault’s argument that three factors determine the character of systems of social discipline (the system’s underlying purposes, its social institutions and the available technology of regulation) is particularly relevant to the regulation of what was, until 2007, a professionalised legal services market. 
The LSA effected fundamental change in the first two of Foucault’s three factors, philosophy and institution. The philosophy was set out in the first section of the Act declaring, inter alia, the regulatory objectives of promoting competition and the consumer interest. The institutional changes made by the LSA were intended to reflect a de-regulatory agenda. The central thrust was abolition of the regulatory role of professional self-regulating organisations. A number of new regulatory institutions were therefore created. These included a Legal Ombudsman (LeO) to receive complaints against all regulated lawyers and a government agency, the Legal Services Board (LSB), answerable to government for achieving the Act’s regulatory objectives. The levers of changes to practice regulation were in the hands of new ‘front line regulators’ constituted independently of the professional bodies. A key mechanism of change was the LSB’s oversight of and influence over these institutions. Those responsible for the three main areas of regulation for solicitors, the largest legal profession in England and Wales, were the Solicitors Regulation Authority (SRA) and the Solicitors Disciplinary Tribunal (SDT). The SDT was constituted as a professional institution in 1974 to hear misconduct allegations against individual practitioners. It changed little after the LSA, being relatively insulated from the LSB’s influence. Its raison d’etre is, however, potentially at odds with the rationale of the LSA and the regulatory direction taken by the SRA. This increasingly reflects a changing logic of regulation. 
Freidson identified three regulatory logics and their complementary mechanisms: professionalism (collegial control of markets in a spirit of public service), perfect competition (a free market with minimal regulation) and corporate bureaucracy (maximising the advantages of effective management). The SRA’s regulatory strategy has gradually moved away from the policies of professionalism towards those promoting competition and corporate bureaucracy. In terms of practice regulation, however, the forum for the ‘modernisation’ of legal services regulation is the modern law firm. The main focus has been on a process of ratcheting up the responsibility of law firms for regulation, exemplified by a rule book that addresses the employing organisation rather than the individual lawyer. This development was anticipated by a strand of the legal ethics literature which advocated building the ‘ethical infrastructure’ of conventional law firms as a way of addressing lawyer misconduct. The legal office is also viewed as a site of control of the individual in the organisational theory literature, where more subtle mechanisms of control are described. 
Brown and Lewis argue that processes of observation, normalisation and examination, processes identified by Foucault as features of modern disciplinary systems, are particularly effective in legal workplaces. Routines such as time recording potentially define the identity of workers even in sectors, such as the professions, characterised by collegiality and relatively autonomous workplaces. Through the process of normalisation the individual accepts subjection to their work role and consequent limitations on their autonomy. In this approach to regulation ‘[d]isciplinary power is not, or not just sporadic and spectacular, but regular and monotonous... the mundane, everyday, repeated patterns of activity which characterize processes of (self) organizing’. The ability of institutions to perform a disciplinary function by affecting the behaviour of the individual employee depends on their capacity to provide more effective surveillance and control of regulated populations. The legal services market in England and Wales comprises different spheres of solicitors’ practice. A broad division between corporate and ‘private plight’ clients, recognised in the literature, is the basis of very different firm structures. Sole practitioners and small firms tend to operate in the private plight sphere and their partners are over-represented in the SDT. This is a challenge to a system of regulation based on theories of governmentality. 
This article explores the development of the regulatory system of solicitors following the LSA. Our account begins by exploring the evolution of the regulatory system following the Act. We argue that the shift in the SRA’s regulatory strategy towards corporate bureaucracy presents different concepts of discipline in the post-LSA regulatory regime. The themes of regulation and governmentality are examined using empirical data on the role of the SRA as a practice regulator and prosecuting authority and that of the SDT as adjudicator. In conclusion we consider how the nature of risk associated with particular activities of the regulated population might determine tools of governance. We also consider whether the SDT, in some ways a surprising survivor of the LSA revolution, performs a necessary function. As a remnant of a professional regime representing the physical, public and ceremonial dimensions of discipline, it arguably sits uneasily in a system based on neoliberal theories of regulation.

ACOLA report on Australia and AI

The Australian Council of Learned Academies (ACOLA) report on Australia and Artificial Intelligence - The effective and ethical development of artificial intelligence: An opportunity to improve our wellbeing - is one of those weighty (250 pages) beige horizon studies from the Great and Good.

There's something there for everyone - MOOCs, digital discrimination, a new agency, community discourse, job loss/creation .... and overall it's very underwhelming, another addition to the pile of weighty reports on AI such as those here and here and here.

Given the responsiveness of the Commonwealth government in the past it is unlikely to have much impact. It will, however, be fun to pull apart in the graduate Law, Innovation and Technologies unit I'm teaching this semester!

It states
Artificial Intelligence (AI) provides us with myriad new opportunities and potential on the one hand and presents global risks on the other. If responsibly developed, AI has the capacity to enhance wellbeing and provide benefits throughout society. There has been significant public and private investment globally, which has been directed toward the development, implementation and adoption of AI technologies. As a response to the advancements in AI, several countries have developed national strategies to guide competitive advantage and leadership in the development and regulation of AI technologies. The rapid advancement of AI technologies and investment has been popularly referred to as the ‘AI race’.
Strategic investment in AI development is considered crucial to future national growth. As with other stages of technological advancement, such as the industrial revolution, developments are likely to be shared and adopted to the benefit of nations around the world.
The promise underpinning predictions of the potential benefits associated with AI technologies may be equally juxtaposed with narratives that anticipate global risks. To a large extent, these divergent views exist as a result of the yet uncertain capacity, application, uptake and associated impact of AI technologies. However, the utility of extreme optimism or pessimism is limited in the capacity to address the wide ranging and, perhaps less obvious, impacts of AI. While discussions of AI inevitably occur within the context of these extreme narratives, this report seeks to give a measured and balanced examination of the emergence of AI as informed by leading experts.
What is known is that the future role of AI will be ultimately determined by decisions taken today. To ensure that AI technologies provide equitable opportunities, foster social inclusion and distribute advantages throughout every sector of society, it will be necessary to develop AI in accordance with broader societal principles centred on improving prosperity, addressing inequity and continued betterment. Partnerships between government, industry and the community will be essential in determining and developing the values underpinning AI for enhanced wellbeing.
Artificial intelligence can be understood as a collection of interrelated technologies used to solve problems that would otherwise require human cognition. Artificial intelligence encompasses a number of methods, including machine learning (ML), natural language processing (NLP), speech recognition, computer vision and automated reasoning. Sufficient developments have already occurred within the field of AI technology that have the capacity to impact Australia. Even if no further advancements are made within the field of AI, it will remain necessary to address aspects of economic, societal and environmental changes.
While AI may cause short-term to medium-term disruption, it has the potential to generate long-term growth and improvement in areas such as agriculture, mining, manufacturing and health, to name a few. Although some of the opportunities for AI remain on the distant horizon, this anticipated disruption will require a measured response from government and industry and our actions today will set a course towards or away from these opportunities and their associated risks.
Development, implementation and collaboration
AI is enabled by data and thus also access to data. Data-driven experimental design, execution and analysis are spreading throughout the sciences, social sciences and industry sectors creating new breakthroughs in research and development. To support successful implementation of the advances of AI, there is a need for effective digital infrastructure to diffuse AI equitably, particularly through rural, remote and ageing populations. A framework for generating, sharing and using data in a way that is accessible, secure and trusted will be critical to support these advances. Data monopolies are already occurring and there will be a need to consider enhanced legal frameworks around the ownership and sharing of data. Frameworks must include appropriate respect and protection for the full range of human rights that apply internationally, such as privacy, equality, indigenous data sovereignty and cultural values. If data considerations such as these are not considered carefully or appropriately, it could inhibit the development of AI and the benefits that may arise. With their strong legal frameworks for data security and intellectual property and their educated workforces, both Australia and New Zealand could make ideal testbeds for AI development.
New techniques of machine learning are spurring unprecedented developments in AI applications. Next-generation robotics promise to transform our manufacturing, infrastructure and agriculture sectors; advances in natural language processing are revolutionising the way clinicians interpret the results of diagnostic tests and treat patients; chatbots and automated assistants are ushering in a new world of communication, analytics and customer service; unmanned autonomous vehicles are changing our capacities for defence, security and emergency response; intelligent financial technologies are establishing a more accountable, transparent and risk-aware financial sector; and autonomous vehicles will revolutionise transport.
While it is important to embrace these applications and the opportunities they afford, it will also be necessary to recognise potential shortcomings in the way AI is developed and used. It is well known, for example, that smart facial recognition technologies have often been inaccurate and can replicate the underlying biases of the human-encoded data they rely upon; that AI relies on data that can and has been exploited for ethically dubious purposes, leading to social injustice and inequality; and that while the impact of AI is often described as ‘revolutionary’ and ‘impending’, there is no guarantee that AI technologies such as autonomous vehicles will have their intended effects, or even that their uptake in society will be inevitable or seamless. Equally, the shortcomings associated with current AI technological developments need not remain permanent limitations. In some cases, these are teething problems of a new technology like that seen of smart facial recognition technologies a few years ago compared to its current and predicted future accuracy. The nefarious and criminal use of AI technologies is also not unique to AI and is a risk associated with all technological developments. In such instances however, AI technologies could in fact be applied to oppose this misuse. For these reasons, there will be a need to be attuned to the economic and technological benefits of AI, and also to identify and address potential shortcomings and challenges.
Interdisciplinary collaboration between industry, academia and government will bolster the development of core AI science and technologies. National, regional and international effort is required across industry, academia and governments to realise the benefits promised by AI. Australia and New Zealand would be prudent to actively promote their interests and invest in their capabilities, lest they let our societies be shaped by decisions abroad. These efforts will need to draw on the skills not only of AI developers, but also legal experts, social scientists, economists, ethicists, industry stakeholders and many other groups. 
Employment, education and access 
While there is much uncertainty regarding the extent to which AI and automation will transform work, it is undeniable that AI will have an impact on most work roles, even those that, on the surface today, seem immune from disruption. As such, there will be a need to prepare for change, even if change does not arrive as rapidly or dramatically as is often forecast.
The excitement relating to the adoption and development of AI technologies has produced a surge in demand for workers in AI research and development. New roles are being created and existing roles augmented to support and extend the development of AI, but demand for skilled workers including data scientists is outstripping supply. Training and education for this sector are subsequently in high demand. Tertiary providers are rapidly growing AI research and learning capabilities. Platform companies such as Amazon (Web Services) and Google are investing heavily in tools for self-directed AI learning and reskilling. A robust framework for AI education – one that draws on the strengths of STEM and HASS perspectives, that cultivates an interest in AI from an early age and that places a premium on encouraging diversity in areas of IT and engineering – can foster a generation of creative and innovative AI designers, practitioners, consultants as well as an informed society. Students from a diverse range of disciplines such as chemistry, politics, history, physics and linguistics could be equipped with the knowledge and knowhow to apply AI techniques such as ML to their disciplines. A general, communitywide understanding of the basic principles of AI – how it operates; what are its main capabilities and limitations – will be necessary as AI becomes increasingly prevalent across all sectors. The demand for AI skills and expertise is leading to an international race to attract AI talent, and Australia and New Zealand can take advantage of this by positioning themselves as world leaders in AI research and development, through strategic investment as well as recognition of areas of AI application where the countries can, and currently do, excel.
Although AI research and development will become an increasingly important strategic national goal, a larger – and perhaps more significant – goal is to ensure that existing workforces feel prepared for the opportunities and challenges associated with the broad uptake of AI. This will mean ensuring workers are equipped with the skills and knowledge necessary to work with and alongside AI, and that their sense of autonomy, productivity and wellbeing in the workplace is not compromised in the process. Education should emphasise not only the technical competencies needed for the development of AI, but also the human skills such as emotional literacy that will become more important as AI becomes better at particular tasks. In the short to medium term, the implementation of AI may require the application of novel approaches. It will be important to ensure that workers are comfortable with this.
To ensure the benefits of AI are equitably dispersed throughout the community, principles of inclusion should underpin the design of AI technologies. Inclusive design and universal access are critical to the successful uptake of AI. Accessible design will facilitate the uptake and use of AI by all members of our community and provide scope to overcome existing societal inequalities. If programmed with inclusion as a major component, we can facilitate beneficial integration between humans and AI in decision making systems. To achieve this, the data used in AI systems must be inclusive. Much of society will need to develop basic literacies in AI systems and technologies – which will involve understanding what AI is capable of, how AI uses data, the potential risks of AI and so on – in order to feel confident engaging in AI in their everyday lives. Massive Open Online Courses (MOOCs) and micro-credentials, as well as free resources provided by platform companies, could help achieve this educational outcome.
Regulation, governance and wellbeing 
Effective regulation and governance of AI technologies will require involvement of, and work by, all thought-leaders and decision makers and will need to include the participation of the public, communities and stakeholders directly impacted by the changes. Political leaders are well placed to guide a national discussion about the future society envisioned for Australia. Policy initiatives must be coordinated in relation to existing domestic and international regulatory frameworks. An independently-led AI body drawing together stakeholders from government, industry and the public and private sectors could provide institutional leadership on the development and deployment of AI. For example, a similar body, the Australian Communications and Media Authority, regulates the communications sector with a view to maximising economic and social benefits for both the community and industry.
Traditional measures of success, such as GDP and the Gini coefficient (a measure of income inequality), will remain relevant in assessing the extent to which the nation is managing the transition to an economy and a society that takes advantage of the opportunities AI makes available. These measures can mask problems, however, and innovative measures of subjective wellbeing may be necessary to better characterise the effect of AI on society. Such measures could include the OECD Better Life Index or other indicators such as the Australian Digital Inclusion Index. Measures like the triple bottom line may need to be adapted to measure success in a way that makes the wellbeing of all citizens central.
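For readers wanting the Gini coefficient made concrete: it can be computed from an income vector as the mean absolute difference over all pairs, normalised by twice the squared population size times the mean, i.e. G = sum_i sum_j |x_i - x_j| / (2 * n^2 * mean). A minimal Python sketch with invented figures:

    def gini(incomes):
        # G = sum_i sum_j |x_i - x_j| / (2 * n^2 * mean(x))
        n = len(incomes)
        mean = sum(incomes) / n
        total_diff = sum(abs(a - b) for a in incomes for b in incomes)
        return total_diff / (2 * n * n * mean)

    print(gini([10, 10, 10, 10]))  # 0.0  (perfect equality)
    print(gini([0, 0, 0, 100]))    # 0.75 (one person holds everything)

As the report observes, a single summary statistic of this kind can mask distributional problems, hence its suggestion of complementary wellbeing indices.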
Ensuring that AI continues to be developed safely and appropriately for the wellbeing of society will be dependent on a responsive regulatory system that encourages innovation and engenders confidence in its development. It is often argued that AI systems and technologies require a new set of legal frameworks and ethical guidelines. However, existing human rights frameworks, as well as national and international regulations on data security and privacy, can provide ample scope through which to regulate and govern much of the use and development of AI systems and technologies. Updated competition policies could account for emerging data monopolies. We should therefore apply existing frameworks to new ethical problems and make modifications only where necessary. Much like the debates occurring on AI’s impact on employment, the governance and regulation of AI are subject to a high degree of uncertainty and disagreement. Our actions in these areas will shape the future of AI, so it is important that decisions made in these contexts are not only carefully considered, but that they align with the nation’s vision for an AI-enabled future that is economically and socially sustainable, equitable and accessible for all, strategic in terms of government and industry interests, and places the wellbeing of society in the centre. The development of regulatory frameworks should facilitate industry-led growth and seek to foster innovation and economic wellbeing. Internationally coordinated policy action will be necessary to ensure the authority and legitimacy of the emerging body of law governing AI.
A national framework
The safe, responsible and strategic implementation of AI will require a clear national framework or strategy that examines the range of ethical, legal and social barriers to, and risks associated with, AI; allows areas of major opportunity to be established; and directs development to maximise the economic and social benefits of AI. The national framework would articulate the interests of society, uphold safe implementation, be transparent and promote wellbeing. It should review the progress of similar international initiatives to determine potential outcomes from their investments to identify the potential opportunities and challenges on the horizon. Key actions could include:
  • Educational platforms and frameworks that are able to foster public understanding and awareness of AI 
  • Guidelines and advice for procurement, especially for public sector and small and medium enterprises, which informs them of the importance of technological systems and how they interact with social systems and legal frameworks 
  • Enhanced and responsive governance and regulatory mechanisms to deal with issues arising from cyber-physical systems and AI through existing arbiters and institutions 
  • Integrated interdisciplinary design and development requirements for AI and cyber‑physical systems that have positive social impacts 
  • Investment in the core science of AI and translational research, as well as in AI skills. 
  • An independent body could be established or tasked to provide leadership in relation to these actions and principles. This central body would support a critical mass of skills and could provide oversight in relation to the design, development and use of AI technologies, promote codes of practice, and foster innovation and collaboration.

29 July 2019

Reidentification and the GDPR

‘Estimating the success of re-identifications in incomplete datasets using generative models’ by Luc Rocher, Julien M. Hendrickx and Yves-Alexandre de Montjoye in (2019) 10 Nature Communications 3069 comments
While rich medical, behavioral, and socio-demographic data are key to modern data-driven research, their collection and use raise legitimate privacy concerns. Anonymizing datasets through de-identification and sampling before sharing them has been the main tool used to address those concerns. We here propose a generative copula-based method that can accurately estimate the likelihood of a specific person to be correctly re-identified, even in a heavily incomplete dataset. On 210 populations, our method obtains AUC scores for predicting individual uniqueness ranging from 0.84 to 0.97, with low false-discovery rate. Using our model, we find that 99.98% of Americans would be correctly re-identified in any dataset using 15 demographic attributes. Our results suggest that even heavily sampled anonymized datasets are unlikely to satisfy the modern standards for anonymization set forth by GDPR and seriously challenge the technical and legal adequacy of the de-identification release-and-forget model.
They argue
In the last decade, the ability to collect and store personal data has exploded. With two thirds of the world population having access to the Internet, electronic medical records becoming the norm, and the rise of the Internet of Things, this is unlikely to stop anytime soon. Collected at scale from financial or medical services, when filling in online surveys or liking pages, this data has an incredible potential for good. It drives scientific advancements in medicine, social science, and AI and promises to revolutionize the way businesses and governments function. 
However, the large-scale collection and use of detailed individual-level data raise legitimate privacy concerns. The recent backlashes against the sharing of NHS [UK National Health Service] medical data with DeepMind and the collection and subsequent sale of Facebook data to Cambridge Analytica are the latest evidences that people are concerned about the confidentiality, privacy, and ethical use of their data. In a recent survey, 72% of U.S. citizens reported being worried about sharing personal information online. In the wrong hands, sensitive data can be exploited for blackmailing, mass surveillance, social engineering, or identity theft. 
De-identification, the process of anonymizing datasets before sharing them, has been the main paradigm used in research and elsewhere to share data while preserving people’s privacy. Data protection laws worldwide consider anonymous data as not personal data anymore allowing it to be freely used, shared, and sold. Academic journals are, e.g., increasingly requiring authors to make anonymous data available to the research community. While standards for anonymous data vary, modern data protection laws, such as the European General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA), consider that each and every person in a dataset has to be protected for the dataset to be considered anonymous. This new higher standard for anonymization is further made clear by the introduction in GDPR of pseudonymous data: data that does not contain obvious identifiers but might be re-identifiable and is therefore within the scope of the law. Yet numerous supposedly anonymous datasets have recently been released and re-identified. In 2016, journalists re-identified politicians in an anonymized browsing history dataset of 3 million German citizens, uncovering their medical information and their sexual preferences. A few months before, the Australian Department of Health publicly released de-identified medical records for 10% of the population only for researchers to re-identify them 6 weeks later. Before that, studies had shown that de-identified hospital discharge data could be re-identified using basic demographic attributes and that diagnostic codes, year of birth, gender, and ethnicity could uniquely identify patients in genomic studies data. Finally, researchers were able to uniquely identify individuals in anonymized taxi trajectories in NYC, bike sharing trips in London, subway data in Riga, and mobile phone and credit card datasets. 
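Most of the re-identifications catalogued above are linkage attacks: a 'de-identified' dataset is joined to an identified one on shared quasi-identifiers. A minimal Python sketch of the basic technique, with entirely invented records:

    import pandas as pd

    # "Anonymized" release: direct identifiers removed, quasi-identifiers kept.
    health = pd.DataFrame({
        "birth_year": [1968, 1975, 1968],
        "gender":     ["M", "F", "M"],
        "zip":        ["94720", "10001", "94705"],
        "diagnosis":  ["breast cancer", "diabetes", "asthma"],
    })

    # Identified auxiliary data, e.g. a voter roll or a public profile.
    known = pd.DataFrame({
        "name":       ["John Doe"],
        "birth_year": [1968],
        "gender":     ["M"],
        "zip":        ["94720"],
    })

    # Joining on the quasi-identifiers re-attaches a name to a diagnosis.
    print(known.merge(health, on=["birth_year", "gender", "zip"]))

The attack succeeds whenever the combination of attributes is unique in the released data, which, as the authors go on to show, it very often is.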
Statistical disclosure control researchers and some companies are disputing the validity of these re-identifications: as datasets are always incomplete, journalists and researchers can never be sure they have re-identified the right person even if they found a match. They argue that this provides strong plausible deniability to participants and reduces the risks, making such de-identified datasets anonymous including according to GDPR. De-identified datasets can be intrinsically incomplete, e.g., because the dataset only covers patients of one of the hospital networks in a country or because they have been subsampled as part of the de-identification process. For example, the U.S. Census Bureau releases only 1% of their decennial census and sampling fractions for international census range from 0.07% in India to 10% in South American countries. Companies are adopting similar approaches with, e.g., the Netflix Prize dataset including less than 10% of their users.
Imagine a health insurance company who decides to run a contest to predict breast cancer and publishes a de-identified dataset of 1000 people, 1% of their 100,000 insureds in California, including people’s birth date, gender, ZIP code, and breast cancer diagnosis. John Doe’s employer downloads the dataset and finds one (and only one) record matching Doe’s information: male living in Berkeley, CA (94720), born on January 2nd 1968, and diagnosed with breast cancer (self-disclosed by John Doe). This record also contains the details of his recent (failed) stage IV treatments. When contacted, the insurance company argues that matching does not equal re-identification: the record could belong to 1 of the 99,000 other people they insure or, if the employer does not know whether Doe is insured by this company or not, to anyone else of the 39.5M people living in California. 
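The insurer's deniability argument can be made precise. Under a simple uniform-sampling model (a sketch of the underlying logic, not the paper's copula method), if k people in the underlying population share Doe's quasi-identifiers, the probability that the single matching record really is Doe's works out to 1/k regardless of the sampling fraction, so everything turns on k, which is exactly the quantity the authors' model estimates:

    def p_match_is_doe(k, p):
        # k people in the population (Doe included) share the attributes;
        # each appears in the released sample independently with probability p.
        # P(exactly one of the k is sampled) = k * p * (1 - p)**(k - 1)
        # P(that one is Doe)                 =     p * (1 - p)**(k - 1)
        return (p * (1 - p) ** (k - 1)) / (k * p * (1 - p) ** (k - 1))  # = 1/k

    for k in (1, 2, 100):
        print(k, p_match_is_doe(k, 0.01))  # 1.0, 0.5, 0.01

If Doe's combination of birth date, gender and ZIP is unique among the insureds (k = 1), the 'it could be anyone' defence collapses entirely.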
Our paper shows how the likelihood of a specific individual to have been correctly re-identified can be estimated with high accuracy even when the anonymized dataset is heavily incomplete. We propose a generative graphical model that can be accurately and efficiently trained on incomplete data. Using socio-demographic, survey, and health datasets, we show that our model exhibits a mean absolute error (MAE) of 0.018 on average in estimating population uniqueness and an MAE of 0.041 in estimating population uniqueness when the model is trained on only a 1% population sample. Once trained, our model allows us to predict whether the re-identification of an individual is correct with an average false-discovery rate of less than 6.7% for a 95% threshold (estimated likelihood of correctness above 0.95) and an error rate 39% lower than the best achievable population-level estimator. With population uniqueness increasing fast with the number of attributes available, our results show that the likelihood of a re-identification to be correct, even in a heavily sampled dataset, can be accurately estimated, and is often high. Our results reject the claims that, first, re-identification is not a practical risk and, second, sampling or releasing partial datasets provide plausible deniability. Moving forward, they question whether current de-identification practices satisfy the anonymization standards of modern data protection laws such as GDPR and CCPA and emphasize the need to move, from a legal and regulatory perspective, beyond the de-identification release-and-forget model.
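Why sampling fails as an anonymization defence can also be seen with a toy Monte Carlo experiment (the distribution and attribute space are entirely invented): most records that look unique in a 1% sample are not unique in the population, so naive sample-based uniqueness estimates mislead, which is the gap the authors' generative model is built to close:

    import random
    from collections import Counter

    random.seed(0)
    # Invented population of 100,000 people with three coarse attributes.
    population = [(random.randint(1920, 2005),   # birth year
                   random.choice("MF"),          # gender
                   random.randint(0, 199))       # coarse location code
                  for _ in range(100_000)]

    sample = random.sample(population, 1_000)    # a 1% "de-identified" release

    pop_counts = Counter(population)
    sample_counts = Counter(sample)

    unique_in_sample = [r for r in sample if sample_counts[r] == 1]
    truly_unique = [r for r in unique_in_sample if pop_counts[r] == 1]
    print(len(truly_unique) / len(unique_in_sample))  # far below 1.0

With only three coarse attributes, almost every sampled record looks unique yet only a small fraction is unique in the population; with the 15 demographic attributes the paper considers, the position reverses, which is why a match in even a heavily sampled dataset is so often correct.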