'Big Data in Small Hands' by Woodrow Hartzog and Evan Selinger in (2013) 66
Stanford Law Review Online 81
comments
“Big data” can be defined as a problem-solving philosophy that leverages massive data-sets and algorithmic analysis to extract “hidden information and surprising correlations.” Not only does big data pose a threat to traditional notions of privacy, but it also compromises socially shared information. This point remains underappreciated because our so-called public disclosures are not nearly as public as courts and policymakers have argued — at least, not yet. That is subject to change once big data becomes user friendly.
Most social disclosures and details of our everyday lives are meant to be known only to a select group of people. Until now, technological constraints have favored that norm, limiting the circle of communication by imposing transaction costs — which can range from effort to money — onto prying eyes. Unfortunately, big data threatens to erode these structural protections, and the common law, which is the traditional legal regime for helping individuals seek redress for privacy harms, has some catching up to do.
To make our case that the legal community is under-theorizing the effect big data will have on an individual’s socialization and day-to-day activities, we will proceed in four steps. First, we explain why big data presents a bigger threat to social relationships than privacy advocates acknowledge, and construct a vivid hypothetical case that illustrates how democratized big data can turn seemingly harmless disclosures into potent privacy problems. Second, we argue that the harm democratized big data can inflict is exacerbated by decreasing privacy protections of a special kind — ever-diminishing “obscurity.” Third, we show how central common law concepts might be threatened by eroding obscurity and the resulting difficulty individuals have gauging whether social disclosures in a big data context will sow the seeds of forthcoming injury. Finally, we suggest that one way to stop big data from causing big, un-redressed privacy problems is to update the common law with obscurity-sensitive considerations.
'Big Data Proxies and Health Privacy Exceptionalism' by Nicolas Terry
argues that
while “small data” rules protect conventional health care data (doing so exceptionally, if not exceptionally well), big data facilitates the creation of health data proxies that are relatively unprotected. As a result, the carefully constructed, appropriate, and necessary model of health data privacy will be eroded. Proxy data created outside the traditional space protected by extant health privacy models will end exceptionalism, reducing data protection to the very low levels applied to most other types of data. The article examines big data and its relationship with health care, including the data pools in play, and pays particular attention to three types of big data that lead to health proxies: “laundered” HIPAA data, patient-curated data, and medically-inflected data. It then reexamines health privacy exceptionalism across legislative and regulatory domains, seeking to understand its level of “stickiness” when faced with big data. Finally, the article examines some of the claims for big data in the health care space, taking the position that while increased data liquidity and big data processing may be good for health care, they are less likely to benefit health privacy.
Terry concludes:
There is little doubt how the big data industry and its customers wish any data privacy debate to proceed. In the words of a recent McKinsey report, the collective mind-set about patient data needs to be shifted from “protect” to “share, with protections.” Yet these “protections” fall far short of what is necessary and what patients have come to expect from our history of health privacy exceptionalism. Indeed, some of the specific recommendations are antithetical to our current approach to health privacy. For example, the report suggests encouraging data sharing and streamlining consents, specifically that “data sharing could be made the default, rather than the exception.” However, McKinsey also noted the privacy-based objections that any such proposals would face:

[A]s data liquidity increases, physicians and manufacturers will be subject to increased scrutiny, which could result in lawsuits or other adverse consequences. We know that these issues are already generating much concern, since many stakeholders have told us that their fears about data release outweigh their hope of using the information to discover new opportunities.

Speaking at a June 2013 conference, FTC Commissioner Julie Brill acknowledged that HIPAA was not the only regulated zone being side-stepped by big data, as “new-fangled lending institutions that forgo traditional credit reports in favor of their own big-data-driven analyses culled from social networks and other online sources.” With specific regard to HIPAA privacy and, likely, data proxies, the Commissioner lamented:

[W]hat damage is done to our individual sense of privacy and autonomy in a society in which information about some of the most sensitive aspects of our lives is available for analysts to examine without our knowledge or consent, and for anyone to buy if they are willing to pay the going price.

Indeed, when faced with the claims for big data, health privacy advocates will not be able to rely on status quo arguments and will need to sharpen their defense of health privacy exceptionalism, while demanding new upstream regulation to constrict the collection of data being used to create proxy health data and sidestep HIPAA. As persuasively argued by Beauchamp and Childress, “We owe respect in the sense of deference to persons’ autonomous wishes not to be observed, touched, intruded on, and the like. The right to authorize access is basic.”

Of course, one approach to the issue is to shift our attention to reducing or removing the incentives for customers of predictive analytics firms to care about the data. Recall how Congress was sufficiently concerned about how health insurers would use genetic information to make individual underwriting decisions that it passed GINA, prohibiting them from acquiring such data. Yet, today, some (but not all) arguments for such genetic privacy exceptionalism seem less urgent given that the ACA broadly requires guaranteed issue and renewability, broadly prohibiting pre-existing condition exclusions or related discrimination. A realistic long-term goal must be to reduce disparities and discrimination and thereby minimize any incentive to segment using data profiling.

A medium-term but realistic prediction is that there is a politically charged regulatory fight on the horizon. After all, as Mayer-Schonberger and Cukier note, “The history of the twentieth century [was] blood-soaked with situations in which data abetted ugly ends.” Disturbingly, however, privacy advocates may not like how that fight likely will turn out. Increasingly, as large swathes of the federal government become embroiled in and enamored with big data-driven decision-making and surveillance, so it may become politically or psychologically difficult for them to contemplate regulating mirroring behavior by private actors. On the other hand, the position that we should not be taken advantage of without our permission could gain traction, resulting in calls such as those expressed herein for increased data protection. Then we will need to enact new upstream data protection of broad applicability (i.e., without the narrow data custodian definitions we see in sector-based privacy models). Defeat of such reform will leave us huddled around downstream HIPAA protection, an exceptional protection, but increasingly one that is (in big data terms) too small to care about and that can be circumvented by proxy data produced by the latest technologies.