'Big Data Legal Scholarship: Toward a Research Program and Practitioner's Guide' by Frank Fagan in Virginia Journal of Law and Technology (Forthcoming) ambitiously seeks to take first steps toward developing a research program for big data legal scholarship by sketching its positive and normative components. The application of big data methods to the descriptive questions of law can generate broader consensus. This is because big data methods can provide greater comprehensiveness and less subjectivity than traditional approaches and, as a result, can diminish general disagreement over the categorization and theoretical development of law. Positive application can increase the clarity of rules; uncover the relationship between judicial text and outcome; and comprehensively describe judicially-determined facts, the contents of legislation and regulation, or the contents of private agreements. Equipped with a normative framework, big data legal scholarship can lower the costs of judging, litigating, and administering law; increase comparative justice and predictability; and support the advocacy of better rules and policies.
In addition to sketching theoretical foundations, this Essay seeks to take first steps toward developing the core components of successful praxis. Handling and analyzing big data can be cumbersome, though the newcomer can avoid common pitfalls with care. Accordingly, there exist best practices for preprocessing, converting, and analyzing the data. Current analytical techniques germane to law include algorithmic classification, topic modeling, and multinomial inverse regression.
First steps, by definition, are incomplete. The contours of big data legal scholarship and practice will undoubtedly shift over time to reflect new techniques and prevailing normative questions. This Essay merely aspires to generate a conversation about how big data can enhance our understanding of law — what it is, and what it should be.
Fagan comments
This Essay narrows its focus to the application of big data methods to legal scholarship. The goal is to develop a framework for, or at least take first steps toward, understanding how big data techniques can be used to address the positive and normative questions of conventional legal scholarship, i.e., what the law is and what the law should be. Thus, it is less concerned with evaluating whether big data techniques are particularly well-suited to legal scholarship as a question by itself, which is certainly relevant and central, and is more concerned with identifying potential avenues for application to conventional doctrinal scholarship. Along the way, the analysis considers the value of traversing those avenues from a normative law and economics point of view. In other words, can big data legal studies increase social welfare generally, through a reduction in the social costs of lawyering and judging, or, say, an increase in the awareness of socially undesirable rules (thereby placing pressure on their decline), or an increase in the use of socially desirable rules, among still other possibilities? ...

The Essay is divided into two parts. Part one takes first steps toward establishing a way of thinking about big data legal scholarship. It begins by considering how big data methods can positively describe the law and identifies three types of study: of taxonomy, of realism, and of policy. First steps toward positive application are followed by first steps toward normative application. Three normative goals are identified: to clarify doctrine, to advocate comparative justice and predictability, and to advocate better rules.

Part two aims to demystify the big data research process by providing a practitioner's guide, which details the steps a newcomer can take to develop a big data legal study. The guide has two goals: to convince an interested legal scholar to consider carrying out a big data study, and to facilitate broader dialog amongst those who study legal doctrine — regardless of whether they are interested in big data methods. By prying open the black box just a little, part two hopes to further the conversation about law between big data legal scholars and their counterparts using other methods.
Fagan goes on to argue
As emphasized by Joshua Fischman, legal scholars should prioritize normative questions over positive assessments and allow substantive questions of policy to drive their choice of methods. Nonetheless, this essay leads with big data's positive component in order to place its normative component in sharper relief. Priority is assigned for exposition. Broadly speaking, big data methods can be used to describe legal rules and legal theory. For example, Jonathan Macey and Joshua Mitts have applied big data methods to develop a taxonomy of corporate veil-piercing; I have applied big data methods to develop a taxonomy of corporate successor liability. Daniel Young has applied big data methods to study Bruce Ackerman's constitutional moments; Lea-Rachel Kosnik has applied big data methods to study Ronald Coase's transaction costs. What has scientifically emerged thus far, at least in terms of big data's ability to describe legal rules, is essentially taxonomic. And in terms of its ability to describe legal theory, big data provides just one more empirical method for falsification.
1. Taxonomic Studies
Taxonomic studies are important nonetheless, especially where there exists scholarly dispute over which categorical set of facts leads to judicial application of a particular doctrine, or where there exists ongoing variation in judicial rationale for the same. Both of these discrepancies were present when Macey and Mitts approached veil-piercing with big data methods. Their effort was directed toward resolving (i) a scholarly dispute over whether veil-piercing doctrinal standards exhibit coherence, or whether those standards are "characterized by ambiguity, unpredictability, and even a seeming degree of randomness"; and (ii) why judges appear to apply standards differently. Either of these grounds presents an intellectual opportunity for big data taxonomic studies. With respect to veil-piercing, the intellectual and doctrinal stakes were relatively high, considering that Stephen Bainbridge called for the abolition of the doctrine for its seeming lack of coherence and non-uniformity. Through application of big data methods, Macey and Mitts were able to find coherence in the morass of 9,000 veil-piercing decisions and establish an authoritative taxonomy. The taxonomy draws its authority from the impressive fact that it parses 9,000 decisions, which were culled from a database of 2.5 million. Size, for better or worse, has always been a hallmark of quality and thoroughness in doctrinal scholarship. Big data methods can leverage that hallmark.
The taxonomy also draws authority from its limited subjectivity. Taxonomies based upon hand-classification are inherently biased because an analyst must decide on a classification scheme. While classifications based upon biological characteristics, for example, are relatively straightforward and mildly contestable at worst (think six legs means insect; thus, a centipede is not an insect), classifications based upon legal characteristics are often messy and mildly contestable at best. As noted by Macey and Mitts, the messiness was particularly acute with veil-piercing doctrine:

Coding schemes — indeed, quantitative analysis more generally — necessarily reflect an imperfect approximation of the qualitative complexity of each case. But using mechanical coding to identify determinants of veil piercing is particularly imprecise because it places substantial discretion in the hands of human coders, whose application of judgment can vary between individuals and even from case to case by the same individual.
Heightened subjectivity can lead to differing classification results, which limits the scope of academic consensus and places downward pressure on producing a testable theory or taxonomy. This is because those who disagree with the initial classification results are more likely to reject a theory grounded in those results. And because the theory will have less appeal, the scholar will have less incentive to produce it and empirically test it (given, of course, that she values wide appeal of her scholarship).

Big data methods, instead, offer classification techniques that severely limit the subjective bias of the analyst and thereby promote consensual advancement. Through a method known as topic modeling, the analyst can use an algorithm to evaluate a practically unlimited number of judicial decisions at once — without specifying qualitative classes or case characteristics ad hoc. The algorithm, from its point of view, performs aimless work. The analyst simply specifies the number of topics to be modeled, and the algorithm simply outputs that number of word lists. The analyst must then create classification categories based upon the contents of the individual lists. For example, the topic model used in Fagan (2015) returned four lengthy lists, with one consisting of terms such as "petition reorganization", "legal interest", "transfer purchase", "liquidation distribution", and "public auction". With the aid of that list, I was tasked with creating a category (one most cleanly related to bankruptcy).
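To make that workflow concrete, here is a minimal sketch of this kind of topic-modeling pass. It is illustrative only, not Fagan's or Macey and Mitts's pipeline: the tiny `opinion_texts` list, the two-topic choice, and the preprocessing are all assumptions made for the example.

```python
# A minimal topic-modeling sketch (illustrative, not any cited study's pipeline):
# fit LDA over a few invented opinion snippets and print the top terms per topic.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

opinion_texts = [  # hypothetical, pre-cleaned opinion excerpts
    "petition for reorganization followed by liquidation distribution at public auction",
    "transfer and purchase of assets for legal interest after the petition for reorganization",
    "successor corporation continued the product line and retained the seller's employees",
    "mere continuation where the successor retained the same officers directors and shareholders",
]

# Build a document-term matrix; bigrams help surface phrases like "public auction".
vectorizer = CountVectorizer(stop_words="english", ngram_range=(1, 2))
dtm = vectorizer.fit_transform(opinion_texts)

# The analyst specifies only the number of topics; the algorithm returns word lists.
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(dtm)

terms = vectorizer.get_feature_names_out()
for k, weights in enumerate(lda.components_):
    top_terms = [terms[i] for i in weights.argsort()[::-1][:8]]
    print(f"Topic {k}: {', '.join(top_terms)}")  # the analyst labels these, e.g. "bankruptcy"
```

In a real study the corpus would be the full set of collected opinions and the topic count would be tuned (Fagan reports four lists); the point of the sketch is only that the analyst's judgment enters at the labeling step rather than at the classification step.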
Category creation requires a certain amount of subjectivity, but that amount is severely limited when compared to other methods — and in scientifically meaningful ways. Consider, first, that categories are based upon algorithmically generated word lists that can be replicated by other researchers who use the same dataset. Any dispute over category choice is therefore limited to subjective interpretation of the word list. Second, unlike traditional classification methods, topic modeling considers the complete contents of a legal text with equivalent importance. Terms such as "under capitalization" are evaluated with the same systematic rigor as "hot air". Any pattern or topical structure manifest in the word lists, therefore, is mechanically based upon the texts — irrespective of their contents. This type of frankness with textual data could possibly be sacrificed when the data are approached non-algorithmically and with a human touch. Entire topics, or at least subtle classification influences, might be missed or papered over as a result. Finally, topic modeling limits subjectivity through the use of substantially enlarged datasets. As the amount of data increases, a more objective picture is more likely to emerge.

On balance, then, big data topic modeling presents a clear advance in reducing subjectivity when compared with traditional hand-classification techniques. Combined with its even clearer advantage in comprehensiveness when an analyst mines a sufficient universe of legal texts, it holds immediate potential to generate taxonomies that yield broader acceptance.
2. Legal Realist Studies
In his book How Judges Think, Richard Posner notes wryly that if judges did nothing but apply clear rules of law, "[t]hen judges would be well on the road to being superseded by digitized artificial intelligence programs." For their part, legalists claim that judges merely "apply rules made by legislatures or framers of the Constitution (or follow precedents, made by current or former judges, that are promptly changed if they prove maladapted to current conditions)." Judges state those rules in their opinions and apply them without bias to sets of facts, which themselves are also determined without bias. Realists, instead, claim that the judicial temperament matters. Judicially-made doctrines and decisions depend in part upon the judges' incentives, "which may in turn depend on the judges' cognition and psychology, on how persons are selected (including self-selected) to be judges, and on terms and conditions of judicial employment." Application and determination of rules and facts can, therefore, partly depend upon judges' "motivations, capacities, mode of selection, professional norms, and psychology." Current judicial practice does not involve explicitly stating any of these up front, and it very likely will remain this way for the foreseeable future. Thus, if judicial temperament matters in decision making, it must be measured indirectly.
Big data analytics provide a powerful research platform for empirically testing the tenets of legal realism. Relying upon a form of regression analysis adapted for lengthy texts, empirical techniques now exist for untangling the (potentially) uneasy relationship between judicial texts and judicial outcomes. The analytical results can support the assertion that a given judicial rationale, even when made textually explicit, is not (or, conversely, may be) driving a pattern of judicial decision making.
An imposing body of empirical literature supports the claim that judges pursue political objectives — just one of the judicial pursuits identified by legal realists. Many of these studies rely on specific attributes of the judge such as age, race, jurisdiction, party of the appointing executive, etc. Big data analytics, by way of contrast, can rely on the judicially-determined facts of the case to the extent those facts are captured by actual judicial word choice in the opinion texts. For example, without looking at characteristics of individual judges or hand-classifying cases categorically by fact pattern, Macey and Mitts found that whether a corporation was undercapitalized, an alter ego, or a mere instrumentality; whether its stock was held mostly by its owners; or whether it had failed to issue dividends or had rarely issued them at all, mattered little in judicial application of corporate veil-piercing doctrine. Their result might be interpreted a number of ways, but surely one interpretation is that judges pay lip service to those reasons and dispose of cases for other reasons.
Not only can big data uncover the posturing of a particular rationale; it can additionally show that other rationales matter, particularly rationales grounded in judicial attributes. For example, judges with particular attributes may tend to hold for plaintiffs when there is commingling of assets by small-business defendants or where plaintiffs are public entities. A number of possibilities exist across many areas of law. Areas where judicial bias is already suspected (e.g., criminal conviction and sentencing with respect to defendant attributes, free speech with respect to speech contents, various areas of administrative law) are especially ripe for study.
Exactly how might this scholarship look? Big data legal scholars can use a method known as multinomial inverse regression (MNIR). Figure 6 depicts a spreadsheet of documents, ordered by rows, with the words contained within those documents ordered by columns. Consider that each document represents a legal text associated with an outcome. For example, each document may represent a judicial opinion where a corporate successor is either found liable or not liable. Or each document may represent a judicial application of a criminal law doctrine where the defendant is found guilty or not guilty. All of the words in a given opinion are assumed to potentially relate to the judicial outcome. The analyst determines a relationship between the words and the outcome by regressing the words, as independent variables, against the dependent variable of outcome. The results are then interpreted to support or disprove a given theory. The key difference between MNIR and traditional regression is the sheer number of covariates. A dataset with a 20,000-word vocabulary will have 20,000 covariates. Obviously, judges' use of some of those words will bear no underlying theoretical connection to judicial outcome. But word choice, just as obviously, is not meaningless. Otherwise judges could author opinions with randomly chosen words.
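MNIR itself is typically estimated with specialized routines, so the sketch below is only a rough, hedged stand-in for the setup just described: every vocabulary word enters as a covariate against a binary judicial outcome, with a penalty to keep the model estimable when words outnumber cases. The `opinions` texts and `outcomes` labels are invented for the example.

```python
# A rough stand-in for the text-regression setup described above (not MNIR proper):
# every vocabulary word becomes a covariate, the judicial outcome is the dependent
# variable, and an L1 penalty keeps the model estimable when words outnumber cases.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

opinions = [  # hypothetical opinion texts
    "defendant undercapitalized and used the corporation as a mere instrumentality",
    "no commingling of assets and adequate capitalization at formation",
    "alter ego finding where the owner paid personal debts from corporate accounts",
    "corporate formalities observed and separate books maintained",
]
outcomes = [1, 0, 1, 0]   # hypothetical: 1 = liability found, 0 = no liability

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(opinions)        # one column per vocabulary word

model = LogisticRegression(penalty="l1", solver="liblinear", C=1.0)
model.fit(X, outcomes)

# Words whose coefficients move the predicted outcome most, in either direction.
terms = vectorizer.get_feature_names_out()
ranked = sorted(zip(model.coef_[0], terms), key=lambda pair: abs(pair[0]), reverse=True)
for weight, term in ranked[:10]:
    print(f"{term:>20s}  {weight:+.3f}")
```

Interpreting which words carry weight, and whether they track a stated doctrinal rationale, is where this kind of analysis meets the legal theory being tested.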
MNIR can go a long way toward addressing omitted variables bias when studying the relationship between text and outcome. Because the analyst narrows her inquiry to how a judicial outcome is reached only through judicial word choice, and because every word contained within a judicial opinion is counted as a variable, omitted variables bias is severely reduced. So long as the inquiry is properly narrowed and explicitly acknowledged while reaching conclusions, big data methods can provide an effective and powerful research platform for advancing our understanding of the limits and scope of legalism and realism, at least with respect to the relationship between judicial text and judicial outcome.
In addition to narrow inquiries that focus on text, big data methods can support wider inquiries into judicial behavior. When the words of the opinion texts are used as covariates, they can function as surrogates for the judicially determined facts of a case. One can think of the words as an enormous set of dummy variables, which can be used for holding the facts constant. This technique could represent an important advance over traditional hand-coding of cases in reducing subjectivity, as explained above in the section on taxonomy. Once the facts are controlled for, the remaining judicial-attribute covariates, such as age, party of appointment, and so on, can be evaluated more accurately.
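One simplified way to operationalize that idea, sketched below under stated assumptions (invented texts, invented judge attributes, and a specification that is not drawn from any study cited here), is to stack the document-term matrix next to judge-attribute dummies, fit a single penalized model, and then read off the attribute coefficients once the word covariates hold the facts constant.

```python
# Words as "fact controls" plus judge attributes as covariates of interest
# (hypothetical data; a simplified illustration, not a cited specification).
import numpy as np
from scipy.sparse import csr_matrix, hstack
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

opinions = [  # invented opinion snippets
    "plaintiff alleges commingling of assets by the small business defendant",
    "public entity plaintiff seeks to pierce the corporate veil",
    "adequate capitalization and observance of corporate formalities",
    "mere instrumentality and failure to issue dividends",
]
outcomes = np.array([1, 1, 0, 1])        # hypothetical holdings (1 = veil pierced)
judge_attrs = np.array([                  # hypothetical judge attributes:
    [1, 0],                               # columns: appointed_by_party_a, senior_status
    [0, 1],
    [1, 1],
    [0, 0],
])

words = CountVectorizer().fit_transform(opinions)   # word counts as fact surrogates
X = hstack([words, csr_matrix(judge_attrs)])         # words + judge attributes

model = LogisticRegression(penalty="l2", solver="liblinear").fit(X, outcomes)
attr_coefs = model.coef_[0][-judge_attrs.shape[1]:]  # last two columns are the attributes
print(dict(zip(["appointed_by_party_a", "senior_status"], attr_coefs)))
```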
3. Policy Studies
To appreciate the role of multinomial inverse regression as a tool for the big data legal analyst, consider the recent history of one of the most influential forms of empirical legal scholarship, viz., econometrics applied to legal questions (aka empirical law and economics). In their survey of the field, Jonah Gelbach and Jonathan Klick note immediately that "the central problem in much empirical work is omitted variables bias." Before the mid-1990s, empirical law and economics simply added more variables to address the problem. However, if the variables are unknown, then they cannot be added. A follow-up approach was to admit bias but speculate about its nature. However, the results of this approach were, naturally, always contestable because bias was always admitted.
By the mid-1990s, empirical law and economics began implementing the difference-in-differences approach to address research design problems. Typically, the analyst studies the effects of a new policy by comparing a jurisdiction that adopts the policy with a jurisdiction that does not. So long as unknown variables are realistically assumed to affect both jurisdictions in the same way, any difference in the change of outcomes between the two can be understood as caused by the change in policy.
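For concreteness, here is a minimal difference-in-differences sketch; the numbers and variable names are invented for the example, and the specification is the standard two-way interaction rather than any particular published design.

```python
# A minimal difference-in-differences sketch with hypothetical panel data:
# `treated` marks the adopting jurisdiction, `post` marks periods after adoption,
# and the coefficient on treated:post is the estimated effect of the policy.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.DataFrame({
    "outcome": [10, 11, 12, 15, 10, 10, 11, 11],   # e.g., a settlement rate (invented)
    "treated": [1, 1, 1, 1, 0, 0, 0, 0],            # adopting vs. comparison jurisdiction
    "post":    [0, 0, 1, 1, 0, 0, 1, 1],            # before vs. after the policy change
})

did = smf.ols("outcome ~ treated + post + treated:post", data=df).fit()
print(did.params["treated:post"])   # the difference-in-differences estimate
```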
However, jurisdictions do not adopt policies randomly. Something is happening within the jurisdiction that is driving the choice to adopt. This "something" may or may not be happening in the other jurisdictions used for comparison. For this reason, state-of-the-art studies use instrumental variables (IV) and similar techniques to purge the jurisdictionally-centered (or, more generally, the endogenous or internal) nature of policy choice. Nonetheless, a valid instrumental variable requires an assumption that cannot be tested. Its validity can only be evaluated for reasonableness. Thus, Gelbach and Klick conclude that "[u]ltimately, it is an unavoidable if uncomfortable fact of empirical life that untestable assumptions, and untestably good judgment about them, are indispensable to the measurement of causal effects of real-world policies." However, the good news is that new empirical techniques like instrumental variables have helped restore some consensual level of credibility to empirical law and economics. The new methods are not perfect, but they "have uncovered compelling evidence on [issues] of great policy import."
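To illustrate the logic, here is a hand-rolled two-stage least squares sketch on simulated data. The instrument (`neighbor_adoption`, whether nearby jurisdictions have already adopted) is purely hypothetical, its exclusion from the outcome equation is exactly the kind of untestable assumption the text describes, and the second-stage standard errors are not the corrected IV errors a real study would report.

```python
# Hand-rolled two-stage least squares on simulated data (illustrative only).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 200
neighbor_adoption = rng.binomial(1, 0.5, n)       # hypothetical instrument
confounder = rng.normal(0, 1, n)                  # unobserved driver of adoption
policy = (0.6 * neighbor_adoption + confounder > 0).astype(float)
outcome = 2.0 * policy + 1.5 * confounder + rng.normal(0, 1, n)   # true effect = 2.0

# Stage 1: predict the endogenous policy choice from the instrument.
policy_hat = sm.OLS(policy, sm.add_constant(neighbor_adoption)).fit().fittedvalues

# Stage 2: regress the outcome on the predicted (instrument-driven) part of policy.
two_sls = sm.OLS(outcome, sm.add_constant(policy_hat)).fit()
naive = sm.OLS(outcome, sm.add_constant(policy)).fit()

print("naive OLS slope:", round(naive.params[1], 2))    # biased upward by the confounder
print("2SLS slope:     ", round(two_sls.params[1], 2))  # noisily approximates 2.0
```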
Difference-in-differences research designs augmented with quasi-experimental approaches such as IV are usually concerned with isolating a crucial effect of policy change. This effect often constitutes the policy's justification. For example, does increased policing reduce crime? Do liberal takeover policies increase the market value of firms? Do heightened pleading requirements increase settlement rates? For some of these studies, especially those that involve the analysis of case law, controlling for judicially-determined facts may be useful. Big data methods can reduce subjectivity by classifying case types algorithmically with topic modeling or by using the words of opinion texts as covariates. In other instances, preliminary analysis with big data methods may reveal variation that can later be exploited with quasi-experimental methods.