21 September 2013


'Big Data and Due Process: Toward a Framework to Redress Predictive Privacy Harms' by Kate Crawford and Jason Schultz in (2014) 55(1) Boston College Law Review comments
The rise of “big data” analytics in the private sector poses new challenges for privacy advocates. Unlike previous computational models that exploit personally identifiable information (PII) directly, such as behavioral targeting, big data has exploded the definition of PII to make many more sources of data personally identifiable. By analyzing primarily metadata, such as a set of predictive or aggregated findings without displaying or distributing the originating data, big data approaches often operate outside of current privacy protections (Rubinstein 2013; Tene and Polonetsky 2012), effectively marginalizing regulatory schema. Big data presents substantial privacy concerns – risks of bias or discrimination based on the inappropriate generation of personal data – a risk we call “predictive privacy harm.” Predictive analysis and categorization can pose a genuine threat to individuals, especially when it is performed without their knowledge or consent. While not necessarily a harm that falls within the conventional “invasion of privacy” boundaries, such harms still center on an individual’s relationship with data about her. Big data approaches need not rely on having a person’s PII directly: a combination of techniques from social network analysis, interpreting online behaviors and predictive modeling can create a detailed, intimate picture with a high degree of accuracy. Furthermore, harms can still result when such techniques are done poorly, rendering an inaccurate picture that nonetheless is used to impact on a person’s life and livelihood. 
In considering how to respond to evolving big data practices, we began by examining the existing rights that individuals have to see and review records pertaining to them in areas such as health and credit information. But it is clear that these existing systems are inadequate to meet current big data challenges. Fair Information Privacy Practices and other notice-and-choice regimes fail to protect against predictive privacy risks in part because individuals are rarely aware of how their individual data is being used to their detriment, what determinations are being made about them, and because at various points in big data processes, the relationship between predictive privacy harms and originating PII may be complicated by multiple technical processes and the involvement of third parties. Thus, past privacy regulations and rights are ill equipped to face current and future big data challenges. 
We propose a new approach to mitigating predictive privacy harms – that of a right to procedural data due process. In the Anglo-American legal tradition, procedural due process prohibits the government from depriving an individual’s rights to life, liberty, or property without affording her access to certain basic procedural components of the adjudication process – including the rights to review and contest the evidence at issue, the right to appeal any adverse decision, the right to know the allegations presented and be heard on the issues they raise. Procedural due process also serves as an enforcer of separation of powers, prohibiting those who write laws from also adjudicating them. 
While some current privacy regimes offer nominal due process-like mechanisms in relation to closely defined types of data, these rarely include all of the necessary components to guarantee fair outcomes and arguably do not apply to many kinds of big data systems (Terry 2012). A more rigorous framework is needed, particularly given the inherent analytical assumptions and methodological biases built into many big data systems (boyd and Crawford 2012). Building on previous thinking about due process for public administrative computer systems (Steinbock 2005; Citron 2010), we argue that individuals who are privately and often secretly “judged” by big data should have similar rights to those judged by the courts with respect to how their personal data has been used in such adjudications. Using procedural due process principles, we analogize a system of regulation that would provide such rights against private big data actors.
'Transparent Predictions' by Tal Zarsky in (2013) 4 University of Illinois Law Review asks -
Can human behavior be predicted? A broad variety of governmental initiatives are using computerized processes to try. Vast datasets of personal information enhance the ability to engage in these ventures and the appetite to push them forward. Governments have a distinct interest in automated individualized predictions to foresee unlawful actions. Novel technological tools, especially data-mining applications, are making governmental predictions possible. The growing use of predictive practices is generating serious concerns regarding the lack of transparency. Although echoed across the policy, legal, and academic debate, the nature of transparency, in this context, is unclear. Transparency flows from different, even competing, rationales, as well as very different legal and philosophical backgrounds. This Article sets forth a unique and comprehensive conceptual framework for understanding the role transparency must play as a regulatory concept in the crucial and innovative realm of automated predictive modeling.
Part II begins by briefly describing the predictive modeling process while focusing on initiatives carried out in the context of federal income tax collection and law enforcement. It then draws out the process’s fundamental elements, while distinguishing between the role of technology and humans. Recognizing these elements is crucial for understanding the importance and challenges of transparency. Part III moves to address the flow of information the prediction process generates. In doing so, it addresses various strategies to achieve transparency in this process — some addressed by law, while others are ignored. In doing so, the Article introduces a helpful taxonomy that will be relied upon throughout the analysis. It also establishes the need for an overall theoretical analysis and policy blueprint for transparency in prediction. Part IV shifts to a theoretical analysis seeking the sources of calls for transparency. Here, the analysis addresses transparency as a tool to enhance government efficiency, facilitate crowdsourcing, and promote both privacy and autonomy. 
Part V turns to examine counterarguments which call for limiting transparency. It explains how disclosure can undermine government policy and authority, as well as generate problematic stereotypes. After mapping out the justifications and counterclaims, Part VI moves to provide an innovative and unique policy framework for achieving transparency. The Article concludes, in Part VII, by explaining which concerns and risks of the predictive modeling process transparency cannot mitigate, and calling for other regulatory responses.