03 February 2017


'The Legacy of inBloom' (Data & Society Working Paper, 2 February 2017) by Monica Bulger, Patrick McCormick and Mikaela Pitcan considers a 2014 educational trainwreck.

The authors ask
Why Do We Still Talk About inBloom?
Many people in the area of educational technology still discuss the story of inBloom. InBloom was an ambitious edtech initiative funded in 2011, launched in 2013, and ended in 2014. We asked ourselves why the story of inBloom is important, and conducted a year-long case study to find the answer. For some, inBloom’s story is one of contradiction: the initiative began with unprecedented scope and resources. And yet, its decline was swift and public. What caused a $100 million initiative with technical talent and political support to close in just one year? A key factor was the combination of the public’s low tolerance for risk and uncertainty and the inBloom initiative’s failure to communicate the benefits of its platform and achieve buy-in from key stakeholders. InBloom’s public failure to achieve its ambitions catalyzed discussions of student data privacy across the education ecosystem, resulting in student data privacy legislation, an industry pledge, and improved analysis of the risks and opportunities of student data use. It also surfaced the public’s low tolerance for risk and uncertainty, and the vulnerability of large-scale projects to public backlash. Any future U.S. edtech project will have to contend with the legacy of inBloom, and so this research begins to analyze exactly what that legacy is.
The inBloom Story
InBloom was a $100 million educational technology initiative primarily funded by the Bill and Melinda Gates Foundation that aimed to improve American schools by providing a centralized platform for data sharing, learning apps, and curricula. In a manner that has become a hallmark of the Gates Foundation’s large-scale initiatives, inBloom was incredibly ambitious, well-funded, and expected to deliver high-impact solutions in a short time frame. The initiative aimed to foster a multi-state consortium to co-develop the platform and share best practices. It intended to address the challenge of siloed data storage that prevented the interoperability of existing school datasets by introducing shared standards, an open source platform that would allow local iteration, and district-level user authentication to improve security. By providing a platform for learning applications, founders of inBloom set out to challenge the domination of major education publishers in the education software market and allow smaller vendors to enter the space. Ultimately, the initiative planned to organize existing data into meaningful reporting for teachers and school administrators to inform personalized instruction and improve learning outcomes.
The initiative was initially funded in 2011 and publicly launched in February 2013. What followed was a public backlash over inBloom’s intended use of student data, surfacing concerns over privacy and protection. Barely a year later, inBloom announced its closure. Was this swift failure a result of flying too close to the sun, being too lofty in ambition, or were there deeper structural or external factors?
To examine the factors that contributed to inBloom’s closure, we interviewed 18 key actors who were involved in the inBloom initiative, the Shared Learning Infrastructure (SLI) and the Shared Learning Collaborative (SLC), both elements under the broader inBloom umbrella. Interview participants included administrators from school districts and state-level departments of education, major technology companies, former Gates Foundation and inBloom employees, parent advocates, parents, student data privacy experts, programmers, and engineers.
Co-occurring Events
The inBloom initiative occurred during a historically tumultuous time for the public understanding of data use. It coincided with Edward Snowden’s revelations that the NSA was collecting data on U.S. civilians, which sparked concerns about government overreach; the Occupy Wall Street protests, which surfaced anti-corporation sentiment; and data breaches reported by Target, Kmart, Staples, and other large retailers. The beginnings of a national awareness of the volume of personal data generated by everyday use of credit cards, digital devices, and the internet were coupled with emerging fears and uncertainty. The inBloom initiative also contended with a history of school data being used as a punitive instrument of education reform rather than a constructive resource for teachers and students. InBloom therefore served as an unfortunate test case for emerging concerns about data privacy coupled with entrenched suspicion of education data and reform.
What Went Wrong?
InBloom did not lack talent, resources, or great ideas, but throughout its brief history, the organization and the product seemed to embody contradictory business models, software development approaches, philosophies, and cultures. There was a clash between Silicon Valley-style agile software development methods and the slower moving, more risk-averse approaches of states and school districts. At times, it was as though a team of brilliant thinkers had harvested every “best practice” or innovative idea in technology, business, and education—but failed to whittle them down to a manageable and cohesive strategy. Despite the Gates Foundation’s ongoing national involvement with schools, the inBloom initiative seemed not to anticipate the multiple layers of politics and bureaucracy within the school system. Instead, there were expectations that educational reform would be easily accomplished, with immediate results, or that – worst case – there would be an opportunity to simply fail fast and iterate.
However, the development of inBloom was large-scale and public, leaving little room to iterate or quietly build a base of case studies to communicate its value and vision. Thus, when vocal opposition raised concerns about student data use potentially harming children’s future prospects or being sold to third parties for targeted advertising, the initiative was caught without a strong counter-position. As opposition mounted, participating states succumbed to pressure from advocacy groups and parents and, one by one, dropped out of the consortium.
The Legacy of InBloom
Although inBloom closed in 2014, it ignited a public discussion of student data privacy that resulted in the introduction of over 400 pieces of state-level legislation. The fervor over inBloom showed that policies and procedures were not yet where they needed to be for schools to engage in data-informed instruction. Industry members responded with a student data privacy pledge that detailed responsible practice. A strengthened awareness of the need for transparent data practices among nearly all of the involved actors is one of inBloom’s most obvious legacies.
Instead of a large-scale, open source platform that was a multi-state collaboration, the trend in data-driven educational technologies since inBloom’s closure has been toward closed, proprietary systems, adopted piecemeal. To date, no large-scale educational technology initiative has succeeded in American K-12 schools. This study explores several factors that contributed to the demise of inBloom and a number of important questions: What were the values and plans that drove inBloom to be designed the way it was? What were the concerns and movements that caused inBloom to run into resistance? How has the entire inBloom development impacted the future of edtech and student data?

02 February 2017

Transit Data Access

'Open or Closed? Open Licensing of Real-Time Public Sector Transit Data' by Teresa Scassa and Alexandra Diebel in (2016) 8(2) Journal of e-Democracy 1-19 explores
how real-time data are made available as “open data” using municipal transit data as a case study. Many transit authorities in North America and elsewhere have installed technology to gather GPS data in real-time from transit vehicles. These data are in high demand in app developer communities because of their use in communicating predicted, rather than scheduled, transit vehicle arrival times. While many municipalities have chosen to treat real-time GPS data as “open data,” the particular nature of real-time GPS data requires a different mode of access for developers than what is needed for static data files. This, in turn, has created a conflict between the “openness” of the underlying data and the sometimes restrictive terms of use which govern access to the real-time data through transit authority Application Program Interfaces (APIs). This paper explores the implications of these terms of use and considers whether real-time data require a separate standard for openness. While the focus is on the transit data context, the lessons from this area will have broader implications, particularly for open real-time data in the emerging smart cities environment.

31 January 2017

Genetic Privacy, Big Data and Counselling

'Am I My Family’s Keeper? Disclosure Dilemmas in Next-Generation Sequencing' by Roel H.P. Wouters, Rhodé M. Bijlsma, Margreet G.E.M. Ausems, Johannes J.M. van Delden, Emile E. Voest, and Annelien L. Bredenoord in (2016) Human Mutation comments
Ever since genetic testing became possible for specific mutations, ethical debate has sparked on the question of whether professionals have a duty to warn not only patients but also their relatives that might be at risk for hereditary diseases. As next-generation sequencing (NGS) swiftly finds its way into clinical practice, the question who is responsible for conveying unsolicited findings to family members becomes increasingly urgent. Traditionally, there is a strong emphasis on the duties of the professional in this debate. But what is the role of the patient and her family? In this article, we discuss the question of whose duty it is to convey relevant genetic risk information concerning hereditary diseases that can be cured or prevented to the relatives of patients undergoing NGS. We argue in favor of a shared responsibility for professionals and patients and present a strategy that reconciles these roles: a moral accountability nudge. Incorporated into informed consent and counseling services such as letters and online tools, this nudge aims to create awareness of specific patient responsibilities. Commitment of all parties is needed to ensure adequate dissemination of results in the NGS era.
The authors argue
Single gene testing has been available for a few decades now. Since that time, healthcare professionals have been confronted with dilemmas that arise from the fact that genetic findings have implications not just for individual patients but also for their family members [Chadwick, 1997; Parker, 2001]. This debate has become increasingly urgent in the advent of next-generation sequencing (NGS) technologies such as whole-exome sequencing and whole-genome sequencing. NGS techniques are particularly promising in the context of personalized medicine [Dietel et al., 2015]. In the near future, healthcare professionals will face more dilemmas regarding the disclosure of genetic test results to family members because more people will undergo genetic testing. An example of this development lies within the context of personalized cancer care, where germ line sequencing is an essential component in accurate assessment of actionable mutations in neoplasms. Although the chance of finding an unsolicited but actionable germ line mutation remains relatively low on an individual level [Bijlsma et al., 2016], the absolute number of unsolicited findings is expected to be considerable [Chan et al., 2012; van El et al., 2013]. Consequently, the ethical dilemma of whether or not to communicate genetic results to family members directly will occur more frequently as NGS finds its way into clinical practice.
Current ethical literature focuses primarily on the scenario that a patient explicitly refuses to share potentially life-saving genetic information with relatives [Falk et al., 2003; Offit et al., 2004; Bombard et al., 2012; Shah et al., 2013]. Indeed, a majority of genetic professionals have encountered this dilemma at least once in their careers [McLean et al., 2013]. Empirical research, however, suggests that the refusing patient scenario occurs in less than 1% of the consultations in the genetics clinic [Clarke et al., 2005]. Generally, patients are willing to share relevant results with their family members. Moreover, the possibility to inform relatives about hereditary diseases is an important motivation for patients to undergo whole-exome sequencing. Until now, this has primarily taken place in a research setting rather than within a clinical diagnostics setting [Clarke et al., 2005; Facio et al., 2013; Hitch et al., 2014]. This article, therefore, concentrates on a much more common situation: a patient is not opposed to sharing genetic information but nevertheless fails to inform her relatives. Particularly urgent in this situation is information on hereditary diseases that can be cured or prevented. Although probands know that it is important to inform family members and are generally willing to do so, data suggest that this vital transfer of information often fails to occur [Claes et al., 2003; Sharaf et al., 2013; de Geus et al., 2015]. Uptake of genetic testing tends to be quite low: approximately half of relatives undergo genetic testing after a potentially life-threatening mutation (e.g., HNPCC) has been found [Gaff et al., 2007]. This suggests that index patients often do not adequately inform at-risk people in their families. Reasons for not sharing results include not feeling close to family members, not finding the right time and words, and anticipation of negative reactions [Seymour et al., 2010; Wiseman et al., 2010].
Traditionally, there is a strong emphasis on the duties of the professional in this debate [Godard et al., 2006; Dheensa et al., 2016].
But what is the role of the patient and her family? Family ethics is a domain in the field of bioethics that has not been given much attention, and only a few authors have dealt with the subject of responsibilities that arise within a family [Lindemann, 2014]. Whereas the current literature about family ethics views the family as a community rooted in shared values rather than shared genes [Verkerk et al., 2015], NGS draws the attention toward responsibilities that emerge within a genetic family. In this article, we examine the question of who is responsible for conveying actionable information to relatives of patients undergoing NGS.
The authors cite Bonython and Arnold, 'Disclosure ‘downunder’: misadventures in Australian genetic privacy law' (2014) 40 Journal of Medical Ethics 168–172.

‘Clinical genomics, big data, and electronic medical records: reconciling patient rights with research when privacy and science collide’ by Jennifer Kulynych and Henry T. Greely in (2017) 4(1) Journal of Law and the Biosciences 94 comments
 Widespread use of medical records for research, without consent, attracts little scrutiny compared to biospecimen research, where concerns about genomic privacy prompted recent federal proposals to mandate consent. This paper explores an important consequence of the proliferation of electronic health records (EHRs) in this permissive atmosphere: with the advent of clinical gene sequencing, EHR-based secondary research poses genetic privacy risks akin to those of biospecimen research, yet regulators still permit researchers to call gene sequence data ‘de-identified’, removing such data from the protection of the federal Privacy Rule and federal human subjects regulations. Medical centers and other providers seeking to offer genomic ‘personalized medicine’ now confront the problem of governing the secondary use of clinical genomic data as privacy risks escalate. We argue that regulators should no longer permit HIPAA-covered entities to treat dense genomic data as de-identified health information. Even with this step, the Privacy Rule would still permit disclosure of clinical genomic data for research, without consent, under a data use agreement, so we also urge that providers give patients specific notice before disclosing clinical genomic data for research, permitting (where possible) some degree of choice and control. To aid providers who offer clinical gene sequencing, we suggest both general approaches and specific actions to reconcile patients’ rights and interests with genomic research.
The authors argue
With the broad adoption of electronic medical record (EMR) systems, researchers can mine vast amounts of patient data, searching for the best predictors of health outcomes. Many of these predictors may lie in the genome, the encoded representation of each person’s DNA. As gene sequencing continues to evolve from a complex, expensive research tool to a routine, affordable screening test, most of us are likely to have our DNA fully digitized, vastly expanding the large store of electronic health data already preserved in or linked to our EMRs. In parallel, genomic researchers will, increasingly, seek out EMRs as an inexpensive source of population-wide genome, health, and phenotype data, thus turning patients into the subjects of genomic research. This will often occur without the patients’ knowledge, let alone their consent, in a research climate where the privacy risks are routinely discounted and data security can be uncertain. The implications, both for research and for privacy, are profound, but the prospect has received little attention in the literature.
The widespread re-use of health information in EMRs is already commonplace, but those records typically don’t include detailed genomic information. The landscape is changing, however, as technical advances make sequencing and storing patient genomes increasingly affordable, and as providers and academic medical institutions—along with government, science, and industry—envision using genomic data to enable ‘precision medicine’. As more patients have genomic data linked to their medical records, absent a change in policy or practice we will see the same non-consensual re-use of these data already allowed for other forms of health information.
Advocates of the status quo argue either that there is little real re-identification risk for genomic data (the ‘privacy through obscurity’ theory) or in the alternative, that if the risk is real, the consequences are minor, because relative to other forms of health data, information about genetic variation is less stigmatizing, less valuable, and, therefore, less attractive to hackers and criminals. The net effect of these rationales is a privacy standard for DNA sequences much lower than what currently applies to data elements such as URLs, fingerprints, and zip codes—each enumerated as an identifier under the Privacy Rule and protected when linked to health information. Moreover, even assuming arguendo that genome sequence data don’t constitute particularly sensitive health information, it is becoming difficult to maintain that a gene sequence (or substantial subset thereof) is not an ‘identifier’ that places any associated health or demographic information at risk, when databases of identifiable sequence data are proliferating and researchers are exploring ways to sequence DNA rapidly for use as a biometric identifier.  
And, finally, at the heart of this issue lies an important ethical, and practical, question: Should the scientific and provider communities continue to disregard the accumulating evidence from repeated studies that patients expect to be told about, and to control, research uses of their genomic and health information? 
The prospect of eventual, widespread EMR-based genomic research under current privacy practices drove us to write this paper. The paper proceeds in five parts: setting out the problem, reviewing the current status of records-based biomedical research, noting other secondary uses of medical records, describing the conflict between individual rights and societal interests implicated in genomics-based research, and providing our recommendations for a balanced approach. 
We acknowledge the vigorous debate over almost every aspect of the problem of genomic privacy: whether genomic data are identifiable, whether it is likely that anyone would try to re-identify a subject of genomic research, whether patients have an obligation to participate in such research regardless of personal preference. Our paper builds on the 2008 recommendations of the Personalized Health Care Work Group of the US Department of Health and Human Services (‘DHHS’) American Health Information Community, which advocated special protections for the research use of genomic data in EMRs, arguing that such data are exceptional relative to other sensitive information due to their uniqueness and potential for re-identification. Without engaging the debate over ‘genetic exceptionalism’, we maintain that it is still useful here to draw a line—even if it is in sand—and to insist that if patients have any genuine right to understand and influence the uses of any of their sensitive medical information, such a right must include their genomes. That all bright lines are imperfect does not mean no lines are useful. 
Although we do not call for legal or regulatory changes, we question whether current federal health privacy law, properly interpreted, actually permits health care providers, whether clinicians or academics, to treat whole genome sequence data as ‘de-identified’ information subject to no ethical oversight or security precautions, especially when genomes are combined with health histories and demographic data. We recognize that pending amendments to the federal Common Rule might affect and even further strengthen our argument, especially if, as proposed, IRBs would no longer oversee much secondary research involving medical records (as discussed below in Section II.A.2). We do not discuss those proposed changes in detail. The Common Rule amendments have been pending for half a decade, since the Advance Notice of Proposed Rulemaking (ANPR) was published in July 2011, so we do not assume that relevant regulatory changes are imminent or that their final form is predictable. 
We conclude by offering standards (versus new regulations) for individual providers and provider institutions (eg academic medical centers, HMOs, and large medical practices) to follow in dealing with both patients and researchers interested in genomic data of those patients. In these standards, we propose a model point-of-care notice and disclosure form for EMR-based genomic research. We call for rigorous data security standards and data use agreements (DUAs) in all EMR genomic research, but note that DUAs are relatively toothless without the means to audit compliance and penalize non-compliance. We acknowledge the limitations of any model of permission or consent, recognizing that such models can’t anticipate every legitimate use or disclosure occurring in connection with research. At the same time we do not agree that, at least in American culture, there is popular support for the view that all patients have a legal or ethical obligation to become subjects of all secondary records research, however valuable the science. Finally, we consider how researchers might encourage patient participation by sharing more information about the research, more quickly, with the patients whose data they obtain.
The stakes are high and time is limited. There are compelling reasons why researchers want and need to combine EMRs with genomic data. Without new steps to promote disclosure and awareness, one day the public will discover that medical and genomic information it assumed was confidential is in fact used widely, and at some privacy risk, in research the subjects neither consented to nor even knew about. This discovery could become an ethical, practical, and political landmine—one that we can, and should, avoid.