18 October 2015


'Big Data Legal Scholarship: Toward a Research Program and Practitioner's Guide' by Frank Fagan in Virginia Journal of Law and Technology (Forthcoming) ambitiously
seeks to take first steps toward developing a research program for big data legal scholarship by sketching its positive and normative components. The application of big data methods to the descriptive questions of law can generate broader consensus. This is because big data methods can provide greater comprehensiveness and less subjectivity than traditional approaches, and can diminish general disagreement over the categorization and theoretical development of law as a result. Positive application can increase the clarity of rules; uncover the relationship between judicial text and outcome; and comprehensively describe judicially-determined facts, the contents of legislation and regulation, or the contents of private agreements. Equipped with a normative framework, big data legal scholarship can lower the costs of judging, litigating, and administrating law; increase comparative justice and predictability; and support the advocacy of better rules and policies.
In addition to sketching theoretical foundations, this Essay seeks to take first steps toward developing the core components of successful praxis. Handling and analyzing big data can be cumbersome, though the newcomer can avoid common pitfalls with care. Accordingly, there exist best practices for preprocessing, converting, and analyzing the data. Current analytical techniques germane to law include algorithmic classification, topic modeling, and multinomial inverse regression.
First steps, by definition, are incomplete. The contours of big data legal scholarship and practice will undoubtedly shift over time to reflect new techniques and prevailing normative questions. This Essay merely aspires to generate a conversation about how big data can enhance our understanding of law — what it is, and what it should be.
Fagan comments
This Essay narrows its focus on the application of big data methods to legal scholarship. The goal is to develop a framework for, or at least take first steps toward, understanding how big data techniques can be used to address the positive and normative questions of conventional legal scholarship, i.e. what the law is and what the law should be. Thus, it is less concerned with evaluating whether big data techniques are particularly well-suited to legal scholarship as a question by itself, which is certainly relevant and central, and is more concerned with identifying potential avenues for application to conventional doctrinal scholarship. Along the way, the analysis considers the value of traversing those avenues from a normative law and economics point of view. In other words, can big data legal studies increase social welfare, generally, through a reduction in the social costs of lawyering and judging, or say, an increase in the awareness of socially undesirable rules (thereby placing pressure on their decline), or an increase in the use of socially desirable rules, and yet still others. ... The Essay is divided into two parts. Part one takes first steps toward establishing a way of thinking about big data legal scholarship. It begins by considering how big data methods can positively describe the law and identifies three types of study: of taxonomy, of realism, and of policy. First steps toward positive application are followed by first steps toward normative application. Three normative goals are identified: to clarify doctrine, to advocate comparative justice and predictability, and to advocate better rules. Part two aims to demystify the big data research process by providing a practitioner’s guide, which details the steps a newcomer can take to develop a big data legal study. 
The guide has two goals: to convince an interested legal scholar to consider carrying out big data study, and to facilitate broader dialog amongst those who study legal doctrine — regardless of whether they are interested in big data methods. By prying open the black box just a little, part two hopes to further the conversation about law — between big data legal scholars and their counterparts using other methods.
Fagan goes on to argue
As emphasized by Joshua Fischman, legal scholars should prioritize normative questions over positive assessments and allow substantive questions of policy to drive their choice of methods. Nonetheless, this essay leads with big data’s positive component in order to place its normative component in sharper relief. Priority is assigned for exposition. Broadly speaking, big data methods can be used to describe legal rules and legal theory. For example, Jonathan Macey and Joshua Mitts have applied big data methods to develop a taxonomy of corporate veil-piercing; I have applied big data methods to develop a taxonomy of corporate successor liability. Daniel Young has applied big data methods to study Bruce Ackerman’s constitutional moments; Lea-Rachel Kosnik has applied big data methods to study Ronald Coase’s transaction costs. What has scientifically emerged thus far, at least in terms of big data’s ability to describe legal rules, is essentially taxonomic. And in terms of its ability to describe legal theory, big data provides just one more empirical method for falsification.
1. Taxonomic Studies
Taxonomic studies are important nonetheless, especially where there exists scholarly dispute over which categorical set of facts leads to judicial application of a particular doctrine, or where there exists ongoing variation in judicial rationale for the same. Both of these discrepancies were present when Macey and Mitts approached veil-piercing with big data methods. Their effort was directed toward resolving (i) a scholarly dispute over whether veil-piercing doctrinal standards exhibit coherence, or whether those standards are “characterized by ambiguity, unpredictability, and even a seeming degree of randomness”; and (ii) why judges appear to apply standards differently. Either of these grounds presents an intellectual opportunity for big data taxonomic studies. With respect to veil-piercing, the intellectual and doctrinal stakes were relatively high considering that Stephen Bainbridge called for the abolishment of the doctrine for its seeming lack of coherence and uniformity. Through application of big data methods, Macey and Mitts were able to find coherence in the morass of 9,000 veil-piercing decisions and establish an authoritative taxonomy. The taxonomy draws its authority from the impressive fact that it parses 9,000 decisions, which were culled from a database of 2.5 million. Size, for better or worse, has always been a hallmark of quality and thoroughness in doctrinal scholarship. Big data methods can leverage that hallmark. The taxonomy also draws authority from its limited subjectivity. Taxonomies based upon hand-classification are inherently biased because an analyst must decide on a classification scheme. While classifications based upon biological characteristics, for example, are relatively straightforward and mildly contestable at worst (think six legs means insect; thus, a centipede is not an insect), classifications based upon legal characteristics are often messy and mildly contestable at best.
As noted by Macey and Mitts, the messiness was particularly acute with veil-piercing doctrine:
Coding schemes, and indeed quantitative analysis more generally, necessarily reflect an imperfect approximation of the qualitative complexity of each case. But using mechanical coding to identify determinants of veil piercing is particularly imprecise because it places substantial discretion in the hands of human coders, whose application of judgment can vary between individuals and even from case to case by the same individual.
Heightened subjectivity can lead to differing classification results, which limits the scope of academic consensus and places downward pressure on producing a testable theory or taxonomy. This is because those who disagree with the initial classification results are more likely to reject a theory grounded in those results. And because the theory will have less appeal, the scholar will have less incentive to produce it and empirically test it (given, of course, that she values wide appeal of her scholarship).
Big data methods, instead, offer classification techniques that severely limit the subjective bias of the analyst and thereby promote consensual advancement. Through a method known as topic modeling, the analyst can use an algorithm to evaluate a practically unlimited number of judicial decisions at once, without specifying qualitative classes or case characteristics ad hoc. The algorithm, from its point of view, performs aimless work. The analyst simply specifies the number of topics to be modeled, and the algorithm simply outputs that number of word lists. The analyst must then create classification categories based upon the contents of the individual lists. For example, the topic model used in Fagan (2015) returned four lengthy lists with one consisting of terms such as: “petition reorganization”, “legal interest”, “transfer purchase”, “liquidation distribution”, and “public auction”. With the aid of that list, I was tasked to create a category (most cleanly related to bankruptcy).
Category creation requires a certain amount of subjectivity, but that amount is severely limited when compared to other methods, and in scientifically meaningful ways. Consider first, that categories are based upon algorithmically generated word lists that can be replicated by other researchers who use the same dataset. Any dispute over category choice is therefore limited to subjective interpretation of the word list. Second, unlike traditional classification methods, topic modeling considers the complete contents of a legal text with equivalent importance. Terms such as “under capitalization” are evaluated with the same systematic rigor as “hot air”. Any pattern or topical structure manifest in the word lists, therefore, is mechanically based upon the texts, irrespective of their contents. This type of frankness with textual data could, possibly, be sacrificed when the data are approached non-algorithmically and with a human touch. Entire topics, or at least subtle classification influences, might be missed or papered over as a result. Finally, topic modeling limits subjectivity through the use of substantially enlarged datasets. As the amount of data increases, a more objective picture is more likely to emerge. On balance then, big data topic modeling presents clear advancement in reducing subjectivity when compared with traditional hand-classification techniques. Combined with its even clearer advantage in comprehensiveness when an analyst mines a sufficient universe of legal texts, it holds immediate potential to generate taxonomies that yield broader acceptance.
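The topic modeling workflow described above (the analyst fixes only the number of topics, the algorithm returns that many word lists) can be illustrated with a minimal sketch. It assumes latent Dirichlet allocation fitted by collapsed Gibbs sampling, which is one common topic model and not necessarily the one used in the studies discussed. The function name `fit_topics`, the hyperparameter values, and the four toy documents are hypothetical.

```python
import random

def fit_topics(docs, num_topics, iters=200, alpha=0.1, beta=0.01, top_n=3, seed=0):
    """Collapsed Gibbs sampling for LDA; returns one word list per topic."""
    rng = random.Random(seed)  # fixed seed so other researchers can replicate the lists
    vocab = sorted({word for doc in docs for word in doc})
    word_id = {word: i for i, word in enumerate(vocab)}
    vocab_size = len(vocab)

    # Count tables: topics per document, words per topic, tokens per topic.
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics

    # Start from a random topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        z_doc = []
        for word in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[word]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, word in enumerate(doc):
                v = word_id[word]
                k = assignments[d][i]
                # Remove the token, then resample its topic given all other tokens.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    # Report the most frequent words assigned to each topic.
    topics = []
    for t in range(num_topics):
        ranked = sorted(range(vocab_size), key=lambda v: -topic_word[t][v])
        topics.append([vocab[v] for v in ranked[:top_n]])
    return topics

# Hypothetical, already-tokenized opinions (not drawn from any real dataset).
docs = [
    "petition reorganization liquidation auction creditor".split(),
    "liquidation distribution auction petition creditor".split(),
    "merger continuity shareholder officer merger".split(),
    "shareholder continuity officer merger director".split(),
]
topics = fit_topics(docs, num_topics=2)
for words in topics:
    print(words)
```

The algorithm never sees a category label. Naming each returned list (for instance, deciding that one list relates to bankruptcy) remains the analyst's interpretive step.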
2. Legal Realist Studies
In his book How Judges Think, Richard Posner notes wryly that if judges did nothing but apply clear rules of law, “[t]hen judges would be well on the road to being superseded by digitized artificial intelligence programs.” For their part, legalists claim that judges merely “apply rules made by legislatures or framers of the Constitution (or follow precedents, made by current or former judges, that are promptly changed if they prove maladapted to current conditions).” Judges state those rules in their opinions and apply them without bias to sets of facts, which themselves are also determined without bias. Realists, instead, claim that the judicial temperament matters. Judicially-made doctrines and decisions depend in part upon the judges' incentives, “which may in turn depend on the judges' cognition and psychology, on how persons are selected (including self-selected) to be judges, and on terms and conditions of judicial employment.” Application and determination of rules and facts can, therefore, partly depend upon judges' “motivations, capacities, mode of selection, professional norms, and psychology.” Current judicial practice does not involve explicitly stating any of these up front, and it very likely will remain this way for the foreseeable future. Thus, if judicial temperament matters in decision making, it must be measured indirectly. Big data analytics provide a powerful research platform for empirically testing the tenets of legal realism. Relying upon a form of regression analysis adapted for lengthy texts, empirical techniques now exist for untangling the (potentially) uneasy relationship between judicial texts and judicial outcomes. The analytical results are able to support the assertion that a given judicial rationale, even when made textually explicit, is not (or conversely, may be) driving a pattern of judicial decision making.
An imposing body of empirical literature supports the claim that judges pursue political objectives, just one of the judicial pursuits identified by legal realists. Many of these studies rely on specific attributes of the judge such as age, race, jurisdiction, party of the appointing executive, etc. Big data analytics, by way of contrast, can rely on the judicially-determined facts of the case to the extent those facts are captured by actual judicial word choice in the opinion texts. For example, Macey and Mitts found that whether a corporation was undercapitalized, an alter ego, a mere instrumentality; or whether its stock was held mostly by its owners; or whether it had failed to issue dividends or had rarely issued them at all, mattered little in judicial application of corporate veil-piercing doctrine, and they reached that finding without looking at characteristics of individual judges or hand-classifying cases categorically by fact pattern. Their result might be interpreted a number of ways, but surely one interpretation is that judges pay lip-service to those reasons and dispose of cases for other reasons.
Not only can big data uncover the posturing of a particular rationale, it can additionally show that other rationales matter, particularly rationales grounded in judicial attributes. For example, judges with particular attributes may tend to hold for plaintiffs when there is commingling of assets by small-business defendants or where plaintiffs are public entities. A number of possibilities exist across many areas of law. Areas where judicial bias is suspected already, e.g. criminal conviction and sentencing with respect to defendant attributes, free speech with respect to speech contents, various areas of administrative law, etc. are especially ripe for study.
Exactly how might this scholarship look? Big data legal scholars can use a method known as multinomial inverse regression (MNIR). Figure 6 depicts a spreadsheet of documents, ordered by rows, with the words contained within those documents ordered by columns. Consider that each document represents a legal text associated with an outcome. For example, each document may represent a judicial opinion where a corporate successor is either found liable or not liable. Or each document may represent a judicial application of a criminal law doctrine where the defendant is found guilty or not guilty. All of the words in a given opinion are assumed to potentially relate to the judicial outcome. The analyst determines a relationship between the words and outcome by regressing the words as independent variables against the dependent variable of outcome. The results are then interpreted to support or disprove a given theory. The key difference between MNIR and traditional regression is the sheer number of covariates. A data set consisting of a 20,000-word vocabulary will consist of 20,000 covariates. Obviously, judges' use of some of those words will contain no underlying theoretical connection to judicial outcome. But word choice, just as obviously, is not meaningless. Otherwise judges could author opinions with randomly chosen words.
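The document-by-word spreadsheet and the link between word choice and outcome can be sketched in a few lines. This is a deliberately simplified stand-in, not the penalized MNIR estimator itself: it assumes a binary outcome and scores each word by the log ratio of its smoothed frequency under the two outcomes, then averages those loadings over a document's words. All function names, the smoothing constant, and the toy opinions are hypothetical.

```python
import math
from collections import Counter

def document_term_matrix(docs):
    """Rows are documents, columns are vocabulary words, cells are word counts."""
    vocab = sorted({word for doc in docs for word in doc})
    rows = []
    for doc in docs:
        counts = Counter(doc)
        rows.append([counts[word] for word in vocab])
    return vocab, rows

def word_loadings(vocab, rows, outcomes, smoothing=1.0):
    """Log ratio of each word's smoothed frequency under outcome 1 vs. outcome 0."""
    totals = {0: [0] * len(vocab), 1: [0] * len(vocab)}
    for row, y in zip(rows, outcomes):
        for j, count in enumerate(row):
            totals[y][j] += count
    # Additive smoothing keeps words unseen under one outcome from producing log(0).
    denom = {y: sum(t) + smoothing * len(vocab) for y, t in totals.items()}
    return {
        word: math.log((totals[1][j] + smoothing) / denom[1])
        - math.log((totals[0][j] + smoothing) / denom[0])
        for j, word in enumerate(vocab)
    }

def reduction_score(doc, loadings):
    """Average loading of a document's words; words outside the vocabulary count as 0."""
    return sum(loadings.get(word, 0.0) for word in doc) / len(doc)

# Hypothetical tokenized opinions; 1 = successor found liable, 0 = not liable.
opinions = [
    ["continuity", "shareholders", "merger"],
    ["merger", "continuity", "officers"],
    ["asset", "purchase", "cash"],
    ["cash", "asset", "auction"],
]
outcomes = [1, 1, 0, 0]

vocab, rows = document_term_matrix(opinions)
loadings = word_loadings(vocab, rows, outcomes)
scores = [reduction_score(doc, loadings) for doc in opinions]

for word in sorted(loadings, key=loadings.get):
    print(f"{word:12s} {loadings[word]:+.2f}")
```

Every word in the vocabulary receives a loading, so no term is excluded in advance. A real study would replace the additive smoothing with a sparsity-inducing penalty so that words unrelated to outcome shrink toward zero.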
MNIR can go a long way toward addressing omitted variables bias when studying the relationship between text and outcome. Because the analyst narrows her inquiry to how a judicial outcome is reached only through judicial word choice, and because every word contained within a judicial opinion is counted as a variable, omitted variables bias is severely reduced. So long as the inquiry is properly narrowed and explicitly acknowledged while reaching conclusions, big data methods can provide an effective and powerful research platform for advancing our understanding of the limits and scope of legalism and realism, at least with respect to the relationship between judicial text and judicial outcome.
In addition to narrow inquiries that focus on text, big data methods can support wider inquiries into judicial behavior. When the words of the opinion texts are used as covariates, they can function as surrogates for the judicially determined facts of a case. One can think of the words as an enormous set of dummy variables, which can be used for holding the facts constant. This technique could represent an important advance in reducing subjectivity from traditional hand-coding of cases as explained above in the section on taxonomy. Once the facts are controlled for, the remaining judicial attribute covariates such as age, appointed party, etc. can be evaluated more accurately.
3. Policy Studies
To appreciate the role of multinomial inverse regression as a tool for the big data legal analyst, consider the recent history of one of the most influential forms of empirical legal scholarship, viz., econometrics applied to legal questions (aka empirical law and economics). In their survey of the field, Jonah Gelbach and Jonathan Klick note immediately that “the central problem in much empirical work is omitted variables bias.” Before the mid-1990s empirical law and economics simply added more variables to address the problem. However, if the variables are unknown, then they cannot be added. A follow-up approach was to admit bias, but speculate about its nature. However, the results of this approach were, naturally, always contestable because bias was always admitted. By the mid-1990s, empirical law and economics began implementing the difference-in-differences approach to address research design problems.
Typically, the analyst studies the effects of a new policy by comparing a jurisdiction that adopts the policy with a jurisdiction that does not. So long as unknown variables are realistically assumed to be the same across those jurisdictions, any difference between the two can be understood as caused by the change in policy. However, jurisdictions do not adopt policies randomly. Something is happening within the jurisdiction that is driving the choice to adopt. This “something” may or may not be happening in the other jurisdictions which are used for comparison. For this reason, state-of-the-art studies use instrumental variables (IV) and similar techniques to purge the jurisdictionally-centered (or more generally, the endogenous or internal) nature of policy choice. Nonetheless, a valid instrumental variable requires an assumption that cannot be tested. Its validity can only be evaluated for reasonableness. Thus, Gelbach and Klick conclude that “[u]ltimately, it is an unavoidable if uncomfortable fact of empirical life that untestable assumptions, and untestably good judgment about them, are indispensable to the measurement of causal effects of real-world policies.” However, the good news is that new empirical techniques like instrumental variables have contributed to the restoration of some consensual level of credibility to empirical law and economics. The new methods are not perfect, but they “have uncovered compelling evidence on [issues] of great policy import.”
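The basic difference-in-differences comparison described above can be made concrete with a small arithmetic sketch. It assumes the simplest two-group, two-period design and leaves out the instrumental-variable refinements. The function name `diff_in_diff` and the settlement figures are hypothetical, invented only to show the calculation.

```python
from statistics import mean

def diff_in_diff(observations):
    """Two-group, two-period difference-in-differences on group means.

    Each observation is (adopted_policy, after_change, outcome).
    """
    cells = {}
    for adopted, after, outcome in observations:
        cells.setdefault((adopted, after), []).append(outcome)
    avg = {key: mean(values) for key, values in cells.items()}

    change_in_adopter = avg[(True, True)] - avg[(True, False)]
    change_in_comparison = avg[(False, True)] - avg[(False, False)]
    # The comparison jurisdiction's change stands in for what would have
    # happened in the adopting jurisdiction without the policy.
    return change_in_adopter - change_in_comparison

# Hypothetical settlements per 100 filed cases (illustrative numbers only).
observations = [
    (True, False, 40), (True, False, 42),    # adopting jurisdiction, before
    (True, True, 50), (True, True, 52),      # adopting jurisdiction, after
    (False, False, 30), (False, False, 32),  # comparison jurisdiction, before
    (False, True, 33), (False, True, 35),    # comparison jurisdiction, after
]
effect = diff_in_diff(observations)
print(effect)  # 7
```

The estimate of 7 is credible only if the two jurisdictions would have moved in parallel absent the policy, which is the untestable assumption the passage emphasizes.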
Difference-in-differences research designs augmented with quasi-experimental approaches such as IV are usually concerned with isolating a crucial effect of policy change. This effect often constitutes the policy's justification. For example, does increased policing reduce crime? Do liberal takeover policies increase market value of firms? Do heightened pleading requirements increase settlement rates? For some of these studies, especially those that involve the analysis of case law, controlling for judicially-determined facts may be useful. Big data methods can reduce subjectivity through classifying case types algorithmically with topic modeling or through using the words of opinion texts as covariates. In other instances, preliminary analysis with big data methods may reveal variation that can later be exploited with quasi-experimental methods.