
28 August 2025

Prediction

'Predictive privacy: Collective data protection in the context of artificial intelligence and big data' by Rainer Mühlhoff in (2023) Big Data & Society comments 

Big data and artificial intelligence pose a new challenge for data protection as these techniques allow predictions to be made about third parties based on the anonymous data of many people. Examples of predicted information include purchasing power, gender, age, health, sexual orientation, ethnicity, etc. The basis for such applications of “predictive analytics” is the comparison between behavioral data (e.g. usage, tracking, or activity data) of the individual in question and the potentially anonymously processed data of many others using machine learning models or simpler statistical methods. The article starts by noting that predictive analytics has a significant potential to be abused, which manifests itself in the form of social inequality, discrimination, and exclusion. These potentials are not regulated by current data protection law in the EU; indeed, the use of anonymized mass data takes place in a largely unregulated space. Under the term “predictive privacy,” a data protection approach is presented that counters the risks of abuse of predictive analytics. A person's predictive privacy is violated when personal information about them is predicted without their knowledge and against their will based on the data of many other people. Predictive privacy is then formulated as a protected good and improvements to data protection with regard to the regulation of predictive analytics are proposed. Finally, the article points out that the goal of data protection in the context of predictive analytics is the regulation of “prediction power,” which is a new manifestation of informational power asymmetry between platform companies and society. 

One of today's most important applications of artificial intelligence (AI) technology is so-called predictive analytics. I use this term to describe data-based predictive models that make predictions about any individual based on available data. These predictions can relate to future behavior (e.g. what is someone likely to buy?), to unknown personal attributes (e.g. sexual identity, ethnicity, wealth, education level), to momentary vulnerabilities (vulnerable conditions such as frustration, depression, loneliness, financial difficulties, pregnancy, etc.), or to personal risk factors (e.g. mental or physical disease predispositions, addictive behavior, or credit risk). Predictive analytics is controversial because, although it has socially beneficial applications, the technology has an enormous potential for abuse and is currently scarcely regulated by law. Predictive analytics makes it possible to automate and, therefore, significantly scale the exploitation of individual vulnerabilities, as well as fostering unequal treatment of individuals in terms of access to economic and social resources such as employment, education, knowledge, healthcare, and law enforcement. Specifically, in the context of data protection and anti-discrimination, the application of predictive AI models needs to be analyzed as a new form of data power that large IT companies wield, one which relates to the stabilization and production of discriminatory structures, patterns of exploitation, and data-based societal inequalities. 

Against the backdrop of the enormous societal impact of predictive analytics, I will argue (as others have argued before me, cf. Hildebrandt, 2009; Hildebrandt and Gutwirth, 2008; Mittelstadt, 2017; Taylor et al., 2016; Taylor, 2016; Vedder, 1999) that we need new approaches to data protection in the context of big data and AI. In my approach, I will use the concept of predictive privacy to normatively capture this novel form of privacy violation through inferred or predicted information. That is, applying predictive models to individuals in order to support decisions is a violation of privacy, yet it is one which does not come about either through “data theft” or a breach of anonymization. Predictive analytics proceeds according to the principle of “pattern matching” by learning algorithms that compare auxiliary data known about a target individual (e.g. usage data on social media, browsing history, geolocation data) against the data of many thousands of other users. This pattern matching is at the core of predictive privacy violations and is possible wherever there is a sufficiently large group of users disclosing their sensitive attributes alongside behavioral and auxiliary data—usually, because they are unaware that this data can be exploited using big data-based methods, or because they think they personally “have nothing to hide.” As I will argue, the problem of predictive privacy denotes a limit to the liberalism inherent in contemporary views of data privacy as the individual's right to control what data is shared about them. The issue of predictive privacy thus strengthens the case for anchoring collectivist protective goods and collectivist defensive rights in data protection. 
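
The pattern-matching mechanism described above can be made concrete with a small sketch. The following code is illustrative only, using synthetic data and hypothetical feature names, not anything from Mühlhoff's article: a model is trained on the behavioral data of many users who disclosed a sensitive attribute, and is then applied to a target individual who shared only behavioral data.

```python
# Minimal sketch of predictive analytics as "pattern matching" (hypothetical data).
# A sensitive attribute disclosed by many users is predicted for a target
# who only shares behavioral/auxiliary data, never the attribute itself.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Behavioral/auxiliary data of 10,000 users (e.g. usage, tracking, or activity
# features), plus a sensitive attribute they disclosed (e.g. a health status).
n_users, n_features = 10_000, 20
X_disclosed = rng.normal(size=(n_users, n_features))
hidden_link = rng.normal(size=n_features)  # synthetic behavior-attribute correlation
y_sensitive = (X_disclosed @ hidden_link + rng.normal(size=n_users) > 0).astype(int)

# The model learns the statistical link between behavior and the attribute.
model = LogisticRegression(max_iter=1000).fit(X_disclosed, y_sensitive)

# A target individual shares only behavioral data; the attribute is inferred.
x_target = rng.normal(size=(1, n_features))
print("Predicted sensitive attribute:", model.predict(x_target)[0])
print("Predicted probability:", round(float(model.predict_proba(x_target)[0, 1]), 2))
```

The point of the sketch is the asymmetry the article highlights: the target never disclosed the attribute, yet it is predicted from the disclosures of thousands of others, and anonymizing the training data would not prevent this.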

In philosophical theories of privacy, collectivist perspectives have long taken into account that one's own data can potentially have negative effects on other people as well, and have therefore posited that individuals should not be free to decide in every respect what data they disclose about themselves to modern data companies (Hildebrandt, 2009; Hildebrandt and Gutwirth, 2008; Loi and Christen, 2020; Mantelero, 2016; Mittelstadt, 2017; cf. Regan, 2002; Taylor et al., 2016). I will also argue that large collections of anonymized data relating to many individuals should not be freely processable by data processors because predictive capacities can be extracted from anonymous data sets. This is in contrast to the current legal situation under the EU General Data Protection Regulation (GDPR), which does not restrict the processing and storage of anonymized data and the predictive models (or “profiles,” to use the terminology of Hildebrandt, 2009) derived from them. Finally, I will call for the rights of data subjects as outlined by the GDPR (right of access, rectification, deletion, and so on) to be reformulated in a collectivist manner, so that affected groups and the community as a whole would be empowered, for the sake of the common good, to exercise such rights against data-processing organizations and thereby prevent the misuse of predictive capacities.

28 August 2024

Moral Economy

'The Moral Economy of High-Tech Modernism' by Henry Farrell and Marion Fourcade in (2023) 152(1) Daedalus 225–235 comments 

Algorithms, especially machine learning algorithms, have become major social institutions. To paraphrase anthropologist Mary Douglas, algorithms “do the classifying.” They assemble and they sort: people, events, things. They distribute material opportunities and social prestige. But do they, like all artifacts, have a particular politics? Technologists defend themselves against the very notion, but a lively literature in philosophy, computer science, and law belies this naive view. Arcane technical debates rage around the translation of concepts such as fairness and democracy into code. For some, it is a matter of legal exposure. For others, it is about designing regulatory rules and verifying compliance. For a third group, it is about crafting hopeful political futures. 

The questions from the social sciences are often different: How do algorithms concretely govern? How do they compare to other modes of governance, like bureaucracy or the market? How does their mediation shape moral intuitions, cultural representations, and political action? In other words, the social sciences worry not only about specific algorithmic outcomes, but also about the broad, society-wide consequences of the deployment of algorithmic regimes: systems of decision-making that rely heavily on computational processes running on large databases. These consequences are not easy to study or apprehend. This is not just because, like bureaucracies, algorithms are simultaneously rule-bound and secretive. Nor is it because, like markets, they are simultaneously empowering and manipulative. It is because they are a bit of both. Algorithms extend both the logic of hierarchy and the logic of competition. They are machines for making categories and applying them, much like traditional bureaucracy. And they are self-adjusting allocative machines, much like canonical markets. 

Understanding this helps highlight both similarities and differences between the historical regime that political scientist James Scott calls “high modernism” and what we dub high-tech modernism. We show that bureaucracy, the typical high modernist institution, and machine learning algorithms, the quintessential high-tech modernist one, share common roots as technologies of hierarchical classification and intervention. But whereas bureaucracy reinforces human sameness and tends toward large, monopolistic (and often state-based) organizations, algorithms encourage human competition, in a process spearheaded by large, near-monopolistic (and often market-based) organizations. High-tech modernism and high modernism are born from the same impulse to exert control, but are articulated in fundamentally different ways, with quite different consequences for the construction of the social and economic order. The contradictions between these two moral economies, and their supporting institutions, generate many of the key struggles of our times. 

Both bureaucracy and computation enable an important form of social power: the power to classify. Bureaucracy deploys filing cabinets and memorandums to organize the world and make it “legible,” in Scott's terminology. Legibility is, in the first instance, a matter of classification. Scott explains how “high modernist” bureaucracies crafted categories and standardized processes, turning rich but ambiguous social relationships into thin but tractable information. The bureaucratic capacity to categorize, organize, and exploit this information revolutionized the state's ability to get things done. It also led the state to reorder society in ways that reflected its categorizations and acted them out. Social, political, and even physical geographies were simplified to make them legible to public officials. Surnames were imposed to tax individuals; the streets of Paris were redesigned to facilitate control. 

Yet high modernism was not just about the state. Markets, too, were standardized, as concrete goods like grain, lumber, and meat were converted into abstract qualities to be traded at scale. The power to categorize made and shaped markets, allowing grain buyers, for example, to create categories that advantaged them at the expense of the farmers they bought from. Businesses created their own bureaucracies to order the world, deciding who could participate in markets and how goods ought to be categorized. 

We use the term high-tech modernism to refer to the body of classifying technologies based on quantitative techniques and digitized information that partly displaces, and partly is layered over, the analog processes used by high modernist organizations. Computational algorithms, especially machine learning algorithms, perform similar functions to the bureaucratic technologies that Scott describes. Both supervised machine learning (which classifies data using a labeled training set) and unsupervised machine learning (which organizes data into self-discovered clusters) make it easier to categorize unstructured data at scale. But unlike their paper-pushing predecessors in bureaucratic institutions, the humans of high-tech modernism disappear behind an algorithmic curtain. The workings of algorithms are much less visible, even though they penetrate deeper into the social fabric than the workings of bureaucracies. The development of smart environments and the Internet of Things has made the collection and processing of information about people too comprehensive, minutely geared, inescapable, and fast-growing for considered consent and resistance. 
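
For readers unfamiliar with the distinction the authors draw, a minimal sketch on synthetic data (illustrative only, not from the article) shows both modes of classification: supervised learning assigns items to categories fixed in advance by a labeled training set, while unsupervised learning discovers its own clusters.

```python
# Sketch of the two classifying technologies named above, on synthetic data.
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans
from sklearn.neighbors import KNeighborsClassifier

# Synthetic "people" described by two behavioral features, in three groups.
X, y = make_blobs(n_samples=300, centers=3, random_state=42)

# Supervised: a labeled training set defines the categories in advance.
clf = KNeighborsClassifier().fit(X[:200], y[:200])
print("Supervised labels:", clf.predict(X[200:205]))

# Unsupervised: the algorithm invents its own clusters from the data alone.
km = KMeans(n_clusters=3, n_init=10, random_state=42).fit(X)
print("Discovered clusters:", km.labels_[200:205])
```

The discovered cluster IDs in the second case need not map onto any humanly meaningful category, which is one reason the text later notes that high-tech modernist classifications can be hard for human minds to grasp.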

In a basic sense, machine learning does not strip away nearly as much information as traditional high modernism. It potentially fits people into categories (“classifiers”) that are narrower, even bespoke. The movie streaming platform Netflix will slot you into one of its two thousand-plus “microcommunities” and match you to a subset of its thousands of subgenres. Your movie choices alter your position in this scheme and might in principle even alter the classificatory grid itself, creating a new category of viewer reflecting your idiosyncratic viewing practices. Many of the crude, broad categories of nineteenth-century bureaucracies have been replaced by new, multidimensional classifications, powered by machine learning, that are often hard for human minds to grasp. People can find themselves grouped around particular behaviors or experiences, sometimes ephemeral, such as followers of a particular YouTuber, subprime borrowers, or fans of action movies with strong female characters. Unlike clunky high modernist categories, high-tech modernist ones can be emergent and technically dynamic, adapting to new behaviors and information as they come in. They incorporate tacit information in ways that are sometimes spookily right, and sometimes disturbing and misguided: music-producing algorithms that imitate a particular artist's style, language models that mimic social context, or empathic AI that supposedly grasps one's state of mind. Generative AI technologies can take a prompt and generate an original picture, video, poem, or essay that seems to casual observers as though it were produced by a human being. 

Taken together, these changes foster a new politics. Traditional high modernism did not just rely on standard-issue bureaucrats. It empowered a wide variety of experts to make decisions in the area of their particular specialist knowledge and authority. Now, many of these experts are embattled, as their authority is nibbled away by algorithms that advocates claim are more accurate, more reliable, and less partial than their human predecessors. 

One key difference between the moral economies of high modernism and high-tech modernism involves feedback. It is tempting to see high modernism as something imposed entirely from above. However, in his earlier book Weapons of the Weak, Scott suggests that those at the receiving end of categorical violence are not passive and powerless. They can sometimes throw sand into the gears of the great machinery. 

As philosopher Ian Hacking explains, certain kinds of classifications, typically those applying to human or social collectives, are “interactive” in that when known by people or those around them, and put to work in institutions, [they] change the ways in which individuals experience themselves, and may even lead people to evolve their feelings and behavior in part because they are so classified. 

People, in short, have agency. They are not submissive dupes of the categories that objectify them. They may respond to being put in a box by conforming to or growing into those descriptions. Or they may contest the definition of the category, its boundaries, or their assignment to it. This creates a feedback loop in which the authors of classifications (state officials, market actors, experts from the professions) may adjust the categories in response. Human society, then, is forever being destructured and restructured by the continuous interactions between classifying institutions and the people and groups they sort. 

But conscious agency is only possible when people know about the classifications: the politics of systems in which classifications are visible to the public, and hence potentially actionable, will differ from the politics of systems in which they are not. 

So how does the change from high modernism to high-tech modernism affect people's relationships with their classifications? At its worst, high modernism stripped out tacit knowledge, ignored public wishes and public complaints, and dislocated messy lived communities with sweeping reforms and grand categorizations, making people more visible and hence more readily acted on. The problem was not that the public did not notice the failures, but that their views were largely ignored. Authoritarian regimes constricted the range of ways in which people could respond to their classification: anything more than passive resistance was liable to meet brutal countermeasures. Democratic regimes were, at least theoretically, more open to feedback, but often ignored it when it was inconvenient and especially when it came from marginalized groups. 

The pathologies of computational algorithms are often more subtle. The shift to high-tech modernism allows the means of ensuring legibility to fade into the background of the ordinary patterns of our life. Information gathering is woven into the warp and woof of our existence, as entities gather ever finer data from our phones, computers, doorbell cameras, purchases, and cars. There is no need for a new Haussmann to transform cramped alleyways into open boulevards, exposing citizens to view. Urban architectures of visibility have been rendered nearly redundant by the invisible torrents of data that move through the air, conveying information about our movements, our tastes, and our actions to be sieved through racks of servers in anonymous, chilled industrial buildings. 

The feedback loops of high-tech modernism are also structurally different. Some kinds of human feedback are now much less common. Digital classification systems may group people in ways that are not always socially comprehensible (in contrast to traditional categories such as female, married, Irish, or Christian). Human feedback, therefore, typically requires the mediation of specialists with significant computing expertise, but even they are often mystified by the operation of systems they have themselves designed.

12 August 2024

Publicness

'The Impoverished Publicness of Algorithmic Decision Making' by Neli Frost in (2024) Oxford Journal of Legal Studies comments 

The increasing use of machine learning (ML) in public administration requires that we think carefully about the political and legal constraints imposed on public decision making. These developments confront us with the following interrelated questions: can algorithmic public decisions be truly ‘public’? And, to what extent does the use of ML models compromise the ‘publicness’ of such decisions? This article is part of a broader inquiry into the myriad ways in which digital and AI technologies transform the fabric of our democratic existence by mutating the ‘public’. Focusing on the site of public administration, the article develops a conception of publicness that is grounded in a view of public administrations as communities of practice. These communities operate through dialogical, critical and synergetic interactions that allow them to track—as faithfully as possible—the public’s heterogeneous view of its interests, and reify these interests in decision making. Building on this theorisation, the article suggests that the use of ML models in public decision making inevitably generates an impoverished publicness, and thus undermines the potential of public administrations to operate as a locus of democratic construction. The article thus advocates for a reconsideration of the ways in which administrative law problematises and addresses the harms of algorithmic decision making. 

The use of algorithmic—including machine learning (ML)—models in public decision making to assist or replace human administrators in their routine decision-making tasks has garnered much attention in recent years. Scandals such as the ‘Robodebt Scheme’ in Australia or the childcare benefits scheme in the Netherlands offer stark examples of why legal scholars are increasingly concerned by this use. From a legal standpoint, such use may certainly contribute to vital features of a properly functioning public administration, such as efficiency, expediency (together referred to as ‘scalability’) and ‘Weberian instrumental rationality’. But it also potentially undermines both ethical and legal principles that are decidedly significant in this arena, such as fairness, due process, the rule of law, a range of human rights and principles of justice, as well as individual autonomy or dignity. Concerns that centre on these principles all meaningfully problematise the use of algorithmic models in public administration. Algorithmic decision making is indeed often biased, is typically opaque and unexplainable, and can result in unjust, rights-infringing decisions at least some of the time. 

The present article joins these concerns, but offers a different, complementary frame to problematise the use of algorithmic models—particularly ML—in public administration. Broadly preoccupied with the tensions between novel technologies and democracy, this frame centres a unit of analysis that is constitutive of the very idea and fabric of democracy: the ‘public’. The article grapples with the following interrelated questions: can ML-driven public administrative decisions be truly ‘public’? And, to what extent does the use of ML models compromise the publicness of public decision making and decisions? These questions are part of a broader inquiry into the myriad ways in which digital and artificial intelligence technologies transform the very fabric of our democratic existence by mutating the ‘public’. 

The main claim I advance in the article is two pronged. First, I offer to view public administrations as important sites of democratic construction insofar as they maintain a quality of publicness. To make this argument, I offer a theory of publicness that is tailored to the arena of public administration, and explain the importance of this attribute for the potential of bureaucracies to function as coherent entities in modern democratic states. Secondly, I argue that the increasing deployment of ML technologies compromises the publicness of administrative decision making and decisions to generate an impoverished publicness and thereby destabilise this site’s democratic potential. 

Together, these two prongs contribute to thinking in the fields of law & technology, public law theory & democratic jurisprudence and administrative law. To the first, the article cautions that the challenges that it identifies are likely to persist even where technological advancement will allow the overcoming of concerns that relate to the bias and opaqueness of ML models, which currently occupy much of the literature. To the second, it offers a view of publicness that complements parallel treatments of this concept, but that is tailored to the specific site of public administration and its unique features, and is also well suited to address the challenges that bureaucracies face today in the advent of technological developments. To the last, it highlights administrative law’s existing limits in fully addressing the plights of algorithmic decision making, and points to novel sites for regulatory intervention. 

The arguments the article puts forward unfold as follows. Section 2 addresses the first prong of the argument. It theorises what publicness means in the context of public administration—as both a norm and an institutional practice. Publicness in this account centres on the web of interactions between public administrators themselves and between them and the public’s elected representatives. It is theorised as an attribute that relates to public administrations’ praxis of decision making and to their decisional outcomes. Briefly, it is grounded in a view of public administrations as ‘communities of practice’ that operate through dialogical, critical and synergetic interactions driven not only by explicit knowledge, but also by tacit, practical forms of knowledge. Publicness is further grounded in the view that this unique feature of the fabric and operations of public administrations potentially allows them to track—as faithfully as possible—the public’s intricate view of its interests and to produce decisions that reify those interests. 

This theory of publicness is grounded in a theory of democracy that explains its normative value. The normativity of publicness lies in the claim that it imbues public administration—that unelected, democratically suspect stratum of state functionaries—with democratic legitimacy by institutionalising the neo-republican ideal of liberty as non-domination within the bureaucracy. Publicness is also grounded in a theory of the state that helps frame its ontology. I draw here on the work of Martin Loughlin, and his account of power, politics and representation, to explain the political and institutional constraints in which public administrations operate and which shape their task of governing. Importantly, my concept of publicness is equally grounded in empirical accounts of how public administrations operate in practice, which demonstrate its plausibility. I draw here on literature in both law and political science that empirically examines the nature of interactions between public administrators themselves and their interactions with the public’s elected representatives. 

The article then proceeds in section 3 to address the second limb of the argument that problematises the use of ML models in public administration. I begin here by situating the claim in the broader literature on algorithmic public decision making, and offer an overview of the types of normative concerns that have attracted legal scholars’ attention in this context. I then move to offer a different problematisation that draws on my analysis of publicness and the necessary conditions for its viability. Here, I suggest that the use of ML models in public administration is malignant to the operations of communities of practice. The claim, in brief, is that the knowledge and operational logic of ML models is largely incompatible with the types of knowledge and interactions that drive communities of practice. ML models thus undermine the quality of publicness, so that public decision making that incorporates these models will feature an impoverished publicness. On this account, publicness is not only deeply political, but also deeply human. I conclude with the observation that if this is the case, the analysis of publicness should inform how we shape the law that regulates ML systems.

01 August 2024

University Management

'Strategic Bureaucracy: The Convergence of Bureaucratic and Strategic Management Logics in the Organizational Restructuring of Universities' by Peter Woelert and Bjørn Stensaker in (2024) Minerva comments 

Over recent decades, the organizational dimensions of universities have taken a center stage in analyses of higher education policy reform and governance change (e.g., Bleiklie, Enders, and Lepori 2015; Fumasoli and Stensaker 2013; Seeber et al. 2015). Research from different parts of the world has documented a changing university where key organizational trends include greater centralization and formalization, more external and internal reporting and accountability pressures, and the growth of an increasingly professionalized and managerial administrative apparatus within universities (e.g., Christensen 2011; Croucher and Woelert 2022; Ramirez and Christensen 2013). 

Across the literature examining the changing organizational governance of universities, one can identify two related but differently accentuated narratives concerning the observed changes. The first narrative is broadly associated with analyses of public sector reform along New Public Management (NPM) lines and the associated policy and governance changes (Ferlie et al. 1996). Key elements in this narrative are, first, the state’s off-loading of responsibilities for organizational governance to universities and increases in universities’ institutional autonomy in operational matters, and second, increases in universities’ accountability to government authorities and other key stakeholders setting the broader policy goals and objectives (e.g., Capano 2011; Christensen 2011; Enders, de Boer, and Weyer 2013). This shift towards increased institutional autonomy and accountability entails new and expanded administrative responsibilities and demands that, so the narrative goes, compel universities to increasingly acquire the characteristics of formalized, centralized, and hierarchical organizations (Bleiklie, Enders, and Lepori 2015; Musselin 2006). In view of these apparent changes, universities thus can be said to have undergone an organizational process of bureaucratization. 

The second narrative is related to the first in that it also sees the environment as the core driver of change within universities. However, in contrast to linking organizational change in the university directly to public sector reform and ‘steering at a distance’, this narrative foregrounds the emergence of dynamic forms of institutional competition including those associated with markets or quasi-markets (see Jungblut and Vukasovic 2018) as a key driver of change. Intensifying institutional competition for domestic and international students and university ranking positions (Brankovic 2018; Espeland and Sauder 2007), the narrative then goes, has made it imperative for universities to become comprehensively managed organizations capable of strategic decision-making and swift internal restructuring to effectively identify and realize opportunities offered by their environment (see, e.g., Krücken and Meier 2006; Thoenig and Paradeise 2016). In short, according to this narrative, an increasingly competitive and uncertain environment has driven universities to transform into strategically managed organizations. 

Despite the ongoing centrality of these two narratives to accounts of university reform and change, the question of how specifically the two associated organizational logics – bureaucratic and strategic – interrelate in the restructuring of universities has received little attention. This is in part because the strategic organizational logic, on a more general level, has been frequently yet simplistically painted as implying a radical departure from bureaucratic forms and processes (see on this point, e.g., Hoggett 2007; Wright, Sturdy and Wylie 2012). Applied to the domain of universities, such a ‘post-bureaucratic’ notion of strategic management thus provides little scope to account for any common ground or convergence between the two logics in processes of organizational restructuring and change. 

This is also an issue because more recent empirical studies from around the world appear to present a mixed picture as to how universities are changing as organizations (see, e.g., Bleiklie, Enders, and Lepori 2017; Ramirez and Christensen 2013; Seeber et al. 2015). There is, for example, a range of evidence suggesting that universities have become more tightly integrated and managed as organizations (Bleiklie, Enders, and Lepori 2015; Seeber et al. 2015). Yet there are also signs of ongoing fragmentation in university organization due to the successive addition of new administrative layers that ultimately appear to have expanded the bureaucratic dimensions of university life (Maassen and Stensaker 2019; Ramirez and Christensen 2013; Woelert 2023). 

In this conceptual paper, we argue that bureaucratic and strategic logics, despite their different emphases and points of departure, converge and combine with respect to key dimensions of universities’ internal governance and organizing, ultimately giving rise to a hybrid form of organizational governance we refer to as ‘strategic bureaucracy’. We suggest that the manifestation of strategic bureaucracy within universities is inter alia characterized by a strong focus on strategic leadership and the associated management techniques alongside intensification of organizational features and dimensions traditionally associated with bureaucratic governance such as formalization and hierarchical authority. 

The key research questions guiding our discussion are: 1. What are the key characteristics of bureaucratic and strategic logics in a university setting? 2. How are the bureaucratic and strategic organizational logics articulating within universities? 3. What are some of the key organizational implications arising from this articulation between both logics? 

Our use of the notion of organizational logic throughout this paper is motivated by the ambition to conceptualize (a) distinctive forms or types of collective rationality that frame, legitimize, and guide organizational activities; and (b) the relationships between these forms. There are affinities to the institutional logics conception that has become widely popular in the social sciences over recent decades, and which assumes that typically there are several such forms, or logics, to be found and interacting within organizations, and which further posits that understanding of the articulation of such different forms is key to understanding organizational change also (Thornton, Ocasio, and Lounsbury 2012). In contrast to the institutional logics perspective and its ambition to integrate macro-, meso-, and micro-levels of analysis (see Thornton, Ocasio, and Lounsbury 2012), our analyses remain, however, more modestly focused on the organizational level and, in particular, do not attempt to integrate individual or micro-level dimensions or foundations.

'Turning universities into data-driven organisations: seven dimensions of change' by Janja Komljenovic, Sam Sellar and Kean Birch in (2024) Higher Education comments 

Universities are striving to become data-driven organisations, benefitting from data collection, analysis, and various data products, such as business intelligence, learning analytics, personalised recommendations, behavioural nudging, and automation. However, datafication of universities is not an easy process. We empirically explore the struggles and challenges of UK universities in making digital and personal data useful and valuable. We structure our analysis along seven dimensions: the aspirational dimension explores university datafication aims and the challenges of achieving them; the technological dimension explores struggles with digital infrastructure supporting datafication and data quality; the legal dimension includes data privacy, security, vendor management, and new legal complexities that datafication brings; the commercial dimension tackles proprietary data products developed using university data and relations between universities and EdTech companies; the organisational dimension discusses data governance and institutional management relevant to datafication; the ideological dimension explores ideas about data value and the paradoxes that emerge between these ideas and university practices; and the existential dimension considers how datafication changes the core functioning of universities as social institutions. 

Universities recognise the potential value of their digital data and strive to become data-driven organisations that collect, analyse, structure, manage, and use data and data products in their strategic and operational activities. As one of the participants in the focus groups we held during our research on the digitalisation of higher education (HE) in the UK noted: I think every university knows that the data they hold is the wealth of the institution, whether that’s data about how people are behaving or what they’ve actually produced. But that is, at the end of the day, that is the most valuable thing you have. (G6P3). 

This imaginary of the value of digital data is supported and encouraged by policymakers and sectorial agencies (Gulson et al., 2022). Jisc, a digital technology and data agency supporting HE in the UK, has recently launched the Data Maturity Framework, which universities can use to assess their ‘data capability’ and guide strategic change. The Higher Education Statistics Agency (HESA) led the Data Futures Project, which aimed at sector-level data collection and analysis to modernise HE data collection and make it more efficient. These initiatives are further driving the marketisation of HE in the UK (Williamson, 2018) and supporting commercial actors to economically benefit from university data (Komljenovic, 2020), including the recent emergence of educational data brokers (Arantes, 2023). 

Datafication refers to the ‘quantification of human life through digital information, very often for economic value’ (Mejias & Couldry, 2019, p.1), which involves representing social and natural worlds in machine readable digital formats (Williamson et al., 2020) with significant social consequences. In education, datafication consists of collecting and processing data at all levels, from individual to institutional, national and beyond, impacting education stakeholders’ discursive and material practices (Jarke & Breiter, 2019). 

We specifically focus on digital data collected by or registered in digital platforms and digital infrastructure. In many industries, data are valuable when aggregated into big data, allowing more sophisticated analyses, such as group analysis and comparison of individuals for targeted advertising (Birch et al., 2021; Pistor, 2020). In HE, policymakers and educational leaders are attempting to improve quality, efficiency, and impact via datafication at the sectoral and institutional levels (Eynon, 2013). Imaginaries of precision education promise to deliver personalisation akin to other sectors, such as medicine and agriculture (Kuch et al., 2020). 

This omnipresent and techno-deterministic belief in the value of data acts as a mythical belief in magic in that it evokes ideas of seamless functionality and an impressive end experience without attention to how it works or the means with which this was achieved, including struggles, efforts, risks, and costs (Elish & boyd, 2018). However, a paradox emerges as this belief in the value of data is not realised in HE, at least not to the extent that stakeholders would wish; yet it continues to drive investment, business models, actions, and strategies (Komljenovic et al., 2024a, 2024b). Currently, data are both valuable and not valuable. Various actors, including EdTech companies and universities, experiment and look for ways to realise economic and social value from data. 

Universities are diverse along many dimensions, including size and resources, which are particularly important for datafication. These differences mean they organise data processes differently. Having thousands of students and staff, universities have to manage petabytes of data, which is a complex task technologically, financially, and legally. The costs of data storage alone have substantially increased, on top of other new costs related to establishing and maintaining the digital ecosystems required for datafication. Universities also deal with legacy software, problems integrating various systems and data flows, ensuring data security, facing cyberattacks, and more. Moreover, diverse actors formally and informally scrutinise universities concerning their data and digital practices (Komljenovic et al., 2024a, 2024b). 

In this article, we focus on the UK as an illustrative case due to the high level of digitalisation and datafication of HE (Williamson, 2019). We aim to recognise UK universities’ needs and aims to become data-driven organisations and analyse the challenges they face as they pursue the datafication journey. We first examine datafication in HE and then elaborate on our methodological approach. We then turn to our analysis, structured around seven interrelated dimensions of change, followed by a brief conclusion calling for democratic and relational datafication in HE.

25 June 2024

edTech

'Edtech in Higher Education: Empirical Findings from the Project ‘Universities and Unicorns: Building Digital Assets in the Higher Education Industry’' by Janja Komljenovic, Morten Hansen, Sam Sellar and Kean Birch (published by the Centre for Global Higher Education, Department of Education, University of Oxford) comments 

Higher education (HE) is by now thoroughly digitalised. Universities use a variety of digital products and services to support their operations. The educational technology (EdTech) industry has been expanding in the past decade, while investors have become important actors in the field. This report offers findings from the ESRC-funded research project ‘Universities and Unicorns: Building Digital Assets in the Higher Education Industry’ (UU), which investigated new forms of value in digitalised HE as the sector engages with EdTech providers. ... 

The project was conducted between 1 January 2021 and 30 June 2023. It investigated new forms of value in digital and digitalised higher education (HE) as the sector engages with educational technology (EdTech) providers. The project was especially interested in digital user data and data operations. We followed three groups of actors: universities, EdTech start-up companies, and investors in EdTech. 

Our study of universities focused on understanding their: digitalisation strategies and practices; digital ecosystems and collaborations with EdTech companies; attitudes towards and experiences with EdTech companies; user data operations and data outputs; and key challenges with digitalisation. 

Our study of EdTech start-up companies focused on understanding: development of products and services; business models and strategies; how products are datafied and their data operations; how user data is made valuable; experiences and relations with universities; experiences and relations with investors; and challenges they are facing in their work and growth. 

Our study of investors focused on understanding: their views of HE and the future of the sector; the role that EdTech should play in this future; their beliefs about the value of user data; their investment theses, strategies and activities; and their experiences and relations with the EdTech and HE sectors. 

Understanding EdTech relationally, and bringing these groups together, allowed us to gain particular insights into the digitalisation of HE and its political economy. We aimed to trace the flow of ideas, strategies, and actions between these actors and to understand how and why the EdTech industry is developing as it is. 

Our conceptual approach centred on rentiership and assetisation. The global economy is increasingly characterised by rentiership: the move from creating value via producing and selling commodities in the market to extracting value via controlling access to assets. In the digital economy, rentiership is often exercised by controlling digital platforms and pursuing revenues associated with platforms, such as collecting and monetising digital data extracted via these platforms. Users become valuable through their engagement with the platform and are made visible through various user metrics. Emerging work on assetisation in education argues that this is a productive way to understand the impact of the privatisation, financialisation, and digitalisation of public education. However, the rise of assetisation does not mean that HE is no longer a public good or subject to commodification. Instead, it adds new complex forms of value creation and governance to the sector. 

We should note that this research project was conducted before the release of ChatGPT into public use. Therefore, this report does not make reference to the turbulent discussions about generative AI and its potential usage and impacts in HE. Finally, we note that this report offers an empirical description of key themes and dynamics identified in our study. More in-depth and theorised analyses of project findings are being published in journal articles and book chapters, all of which are openly accessible. The Appendix includes a list of publications.  ...

In this section, we briefly summarise key overall findings, which are analysed in more detail in academic publications, i.e. journal articles and book chapters (see Appendix). The following findings are relevant to our case studies and might be different in other contexts. 

Takeaway #1: Big Tech and legacy software are prominent in digitalising higher education 

Big Tech infrastructure and platforms, legacy software, and EdTech incumbents dominate university digital ecosystems. It is challenging for the EdTech start-up industry to enter HE markets. Digital products and services offered by new companies represent a small proportion of digitalisation work at universities. EdTech companies primarily target individuals as customers, enterprises for staff development and training, and lower levels of education (i.e. schooling rather than HE). 

Takeaway #2: EdTech in HE is less advanced than imagined 

There is a discrepancy between the promises of the EdTech industry regarding the quality and impact of digital products and services and the perception of university customers. Many university actors, as well as a few EdTech companies, argued that the current quality of EdTech products is generally low compared to other sectors. 

Takeaway #3: Making user data valuable is difficult 

Collecting, cleaning, sorting, processing, and analysing digital user data demands significant human, technological, and financial resources. It is difficult to make user data analysis useful and valuable, such that universities are willing to pay higher fees for data-driven products. Most EdTech companies that we analysed struggle with monetising user data. There is also less user data analysis currently in the sector than imagined by the EdTech industry in its public discourse. The omnipresent belief in the value of user data among all actors is disjunctive with the realities of data practices, which are mostly simple or non-existent. Most university users are sceptical about learning analytics. 

Takeaway #4: User data analytics in HE are not well-developed 

EdTech companies attempt to make their digital products valuable by incorporating user data analytics into their core products. However, currently, these analytics are simple and remain at the level of basic descriptive feedback loops for the user. Nevertheless, there is a clear trend in which EdTech companies are continuing their attempts to construct new metrics, scores, and analytics to monetise data, with efforts to convince customers of the value of these analytics.  

Takeaway #5: Datafication in HE happens at universities 

Universities are in the driving seat of their institutional datafication. Universities are establishing data warehouses, and many aim to collect all user data produced by external digital platforms in order to organise and analyse it for pedagogical and business purposes. However, universities currently lack the capacity to analyse, interpret and act on data. Universities need to establish frameworks for action based on data and acquire the requisite personnel and skills to do so. Universities should ensure that data outputs (e.g. analytics, metrics, scores) are truly representative of what is measured and build confidence in their communities regarding data-driven decision-making. 

Takeaway #6: Digitalisation and datafication create work and costs for universities 

Digitalisation and EdTech promise to bring efficiency and cost savings for universities, but in reality, university actors feel that digitalisation and data operations create more work and higher costs. In addition, new staff profiles and skills are needed, including data scientists, vendor managers, cloud engineers, as well as more learning technologists. 

Takeaway #7: Good EdTech does not challenge core university values and practices 

University actors find technology useful in general and are interested in technological innovation in relation to their work. However, there are two instances where university actors are sceptical towards EdTech. First, when companies' business models are exploitative and extractive. Second, when digital products interfere with the university's core values and practices, such as by challenging professional judgement or academic freedom. Intentions to automate the teaching process or provide behavioural nudges are often received with scepticism. Most university actors feel that user data collection should be limited, and data outputs, including analytics, should be restricted and carefully evaluated. 

Takeaway #8: The aims of EdTech require greater clarity 

The key aims of EdTech are understood to be personalisation, automation, enhanced student engagement, and greater institutional efficiency. However, there are discrepancies between university, EdTech, and investor actors in terms of how they understand these objectives and, consequently, how they will be achieved. Each of these aims needs clarification, including recognising the plurality of dimensions to each objective. 

Takeaway #9: Future imaginaries of tech companies and universities 

The future imaginaries of HE and EdTech are constructed by the EdTech industry and policy actors. There are discrepancies between investors, EdTech companies, and universities in relation to what EdTech should do and how it should shape the future of HE. Universities should drive these discussions and determine their futures and the role of technology in creating these futures. 

Takeaway #10: Democratic data governance 

Universities should do more to inform students and staff about the digital products and services they routinely use. Universities should also continuously provide transparent information to students and staff about user data collected from them and what is being done with this data within their universities and externally. Students and staff should have the choice to participate or not in user data collection and processing. Students and staff should be included in the governance of EdTech and user data at their institutions. 

Takeaway #11: There is a plurality of assetisation processes in EdTech 

EdTech companies establish a variety of processes to control and charge for access to their assets. These include mediating content, organising and mediating teaching interventions, and digitalising and mediating credentials. Typical moats that EdTech companies build are lock-in, network effects, and integration of products into everyday individual practices.

03 June 2024

Kafka

'Kafka in the Age of AI and the Futility of Privacy as Control' by Daniel J. Solove and Woodrow Hartzog in (2024) 104 Boston University Law Review 1021 comments 

Although writing more than a century ago, Franz Kafka captured the core problem of digital technologies – how individuals are rendered powerless and vulnerable. During the past fifty years, and especially in the 21st century, privacy laws have been sprouting up around the world. These laws are often based heavily on an Individual Control Model that aims to empower individuals with rights to help them control the collection, use, and disclosure of their data. 

In this Essay, we argue that although Kafka starkly shows us the plight of the disempowered individual, his work also paradoxically suggests that empowering the individual isn’t the answer to protecting privacy, especially in the age of artificial intelligence. In Kafka’s world, characters readily submit to authority, even when they aren’t forced and even when doing so leads to injury or death. The victims are blamed, and they even blame themselves. 

Although Kafka’s view of human nature is exaggerated for darkly comedic effect, it nevertheless captures many truths that privacy law must reckon with. Even if dark patterns and dirty manipulative practices are cleaned up, people will still make bad decisions about privacy. Despite warnings, people will embrace the technologies that hurt them. When given control over their data, people will give it right back. And when people’s data is used in unexpected and harmful ways, people will often blame themselves. 

Kafka’s work provides key insights for regulating privacy in the age of AI. The law can’t empower individuals when it is the system that renders them powerless. Ultimately, privacy law’s primary goal should not be to give individuals control over their data. Instead, the law should focus on ensuring a societal structure that brings the collection, use, and disclosure of personal data under control. 

29 May 2024

Responsibility

'Humans Outside the Loop' by Charlotte A Tschider in (2024) 26 Yale Journal of Law & Technology 324 comments 

Artificial Intelligence (AI) is not all artificial. Despite the need for high-powered machines that can create complex algorithms and routinely improve them, humans are instrumental in every step used to create AI. From data selection, decisional design, training, testing, and tuning to managing AI’s development as it is used in the human world, humans exert agency and control over the choices and practices underlying AI products. AI is now ubiquitous: it is part of every sector of the economy and many people’s everyday lives. When AI development companies create unsafe products, however, we might be surprised to discover that very few legal options exist to remedy any wrongs. 

This Article introduces the myriad of choices humans make to create safe and effective AI products and explores key issues in existing liability models. Significant issues in negligence and products liability schemes, including contractual limitations on liability, distance the organizations creating AI products from the actual harm they cause, obscure the origin of issues relating to the harm, and reduce the likelihood of plaintiff recovery. Principally, AI offers a unique vantage point for analyzing the limits of tort law, challenging long-held divisions and theoretical constructs. From the perspectives of both businesses licensing AI and AI users, this Article identifies key impediments to realizing tort doctrine’s goals and proposes an alternative regulatory scheme that shifts liability from humans in the loop to humans outside the loop.

25 May 2024

Algorithmic Control

'The Invisible Cage: Workers’ Reactivity to Opaque Algorithmic Evaluations' by Hatim A. Rahman in (2021) 66(4) Administrative Science Quarterly comments 

 Existing research has shown that people experience third-party evaluations as a form of control because they try to align their behavior with evaluations’ criteria to secure more favorable resources, recognition, and opportunities from external audiences. Much of this research has focused on evaluations with transparent criteria, but increasingly, algorithmic evaluation systems are not transparent. Drawing on over three years of interviews, archival data, and observations as a registered user on a labor platform, I studied how freelance workers contend with an opaque third-party evaluation algorithm—and with what consequences. My findings show the platform implemented an opaque evaluation algorithm to meaningfully differentiate between freelancers’ rating scores. Freelancers experienced this evaluation as a form of control but could not align their actions with its criteria because they could not clearly identify those criteria. I found freelancers had divergent responses to this situation: some experimented with ways to improve their rating scores, and others constrained their activity on the platform. Their reactivity differed based not only on their general success on the platform—whether they were high or low performers—but also on how much they depended on the platform for work and whether they experienced setbacks in the form of decreased evaluation scores. These workers experienced what I call an “invisible cage”: a form of control in which the criteria for success and changes to those criteria are unpredictable. For gig workers who rely on labor platforms, this form of control increasingly determines their access to clients and projects while undermining their ability to understand and respond to factors that determine their success. 

Third-party evaluations are a central feature of today’s societal and organizational landscape (Lamont, 2012; Sharkey and Bromley, 2015; Espeland and Sauder, 2016). Studies have shown that third-party rating evaluations of actors such as doctors (RateMDs), professors (RateMyProfessors), hotels (TripAdvisor), restaurants (Yelp), corporations (Forbes), and universities (U.S. News & World Report) provide a sense of transparency and accountability for external audiences (Strathern, 2000; Power, 2010; Orlikowski and Scott, 2014). Audiences also use third-party evaluations to form their perceptions and make decisions about the evaluated actor (Karpik, 2010). As a result, these evaluations influence the resources, recognition, and opportunities actors receive from external audiences (Pope, 2009; Brandtner, 2017). As the prevalence and influence of third-party evaluation systems have increased, researchers have examined how actors subject to such systems react to them (Jin and Leslie, 2003; Espeland and Sauder, 2007; Chatterji and Toffel, 2010). For example, because university admissions, funding, and recognition are influenced by third-party evaluations (e.g., U.S. News & World Report), university administrators and faculty pay close attention to the criteria these evaluations use, such as career placement statistics, and change their behavior to better align with them (Sauder and Espeland, 2009; Espeland and Sauder, 2016). 

Consequently, while prior work has shown that third-party evaluations often provide transparency and accountability for external audiences, it has also suggested that actors subject to third-party evaluations experience them as a form of control (Espeland and Sauder, 2016; Brandtner, 2017; Kornberger, Pflueger, and Mouritsen, 2017). Because third-party evaluations can influence actors’ ability to secure resources and recognition from their primary audiences, actors will likely internalize evaluations’ criteria and change their behavior to conform to those standards (Sauder and Espeland, 2009; Masum and Tovey, 2011). Scholars label the phenomenon of people changing their perceptions and behavior in response to being evaluated as “reactivity” (Espeland and Sauder, 2007). 

Technological advancements have expanded the use of third-party evaluations to new areas of work and organizing, raising new questions in this domain (Fourcade and Healy, 2016; Kuhn and Maleki, 2017; Cameron, 2021). Nowhere is this more evident than in the rise of labor platforms and their use of third-party evaluations to assess workers. While several types of platforms exist (Davis, 2016; Sundararajan, 2016), those most relevant to this study are labor platforms facilitating gig work, such as Upwork, TopCoder, and TaskRabbit. They provide a digital infrastructure to connect clients with freelance job seekers for relatively short-term projects. Labor platforms have attracted increased attention from work and organizational scholars because they differ from intermediaries and exchange systems previously studied (Vallas and Schor, 2020; Rahman and Valentine, 2021; Stark and Pais, 2021), particularly in their use of evaluations (Kornberger, Pflueger, and Mouritsen, 2017). 

Unlike previously studied settings, in which third-party evaluation criteria are relatively transparent to those being evaluated, in labor platforms these criteria are often opaque to workers. This opacity makes it easier for platforms and clients to differentiate among workers by using their evaluation scores, because it is more difficult for workers to game and inflate the evaluation system than in traditional settings (Filippas, Horton, and Golden, 2019; Garg and Johari, 2020). Platforms’ use of opacity in worker evaluations raises an underexplored question: how do opaque third-party evaluations influence workers’ reactivity, and what mechanisms contribute to this form of reactivity? While existing organizational research (Proctor, 2008; Briscoe and Murphy, 2012; Burrell, 2016) has broadly suggested that opacity will make it more difficult for workers to understand evaluation criteria, we lack grounded theory examining how workers contend with such opacity—and with what consequences. 

To address this gap, I studied one of the largest labor platforms focused on higher-level project work, such as software engineering, design, and data analytics. The platform implemented an opaque algorithmic rating evaluation to better differentiate which freelancers should be visible to clients and to prevent freelancers from gaming their scores. Freelancers tried but generally failed to understand the evaluation’s inputs, processing, and output, which led them to experience the opaque evaluation as a system of control characterized by unpredictable updates, fluctuating criteria, and lack of opportunities to improve their scores. These experiences were especially frustrating because the opacity contrasted with workers’ expectations of employee-evaluation systems based on previous experiences in traditional organizations, where such systems’ main purpose is to help workers improve (Cappelli and Conyon, 2018). 

I observed that freelancers responded to evaluation opacity with two types of reactivity: they either tested different tactics to increase their scores, such as working on various types of projects and with different contract lengths, or they tried to preserve their scores by limiting their engagement with the platform, such as by taking platform-based clients off the platform and not working with new clients. This was the case for workers with both higher and lower scores on the platform. Two mechanisms determined their type of reactivity: the extent to which freelancers depended on the platform for work and income, and whether they experienced decreases in their evaluation scores (regardless of whether those scores started out higher or lower). My findings support the argument that opaque third-party evaluations can create an “invisible cage” for workers, because they experience such evaluations as a form of control and yet cannot decipher or learn from the criteria for success and changes to those criteria.
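The dynamic Rahman documents can be made concrete with a toy model. The Python sketch below is purely illustrative and assumes nothing about the platform's actual algorithm: every feature name and weight is hypothetical. It shows why a score computed from hidden inputs whose weights are silently re-drawn cannot be learned from, since identical behavior yields a different score after an unannounced update.

```python
import random

random.seed(7)

# Hypothetical inputs an opaque rating might weigh; the worker never sees them.
FEATURES = ["client_feedback", "response_time", "repeat_clients", "dispute_free"]

def new_weights():
    """Silently (re)draw the hidden weighting of the criteria."""
    raw = [random.random() for _ in FEATURES]
    total = sum(raw)
    return [w / total for w in raw]

def score(behavior, weights):
    """The only thing the worker observes: a single opaque number."""
    return round(100 * sum(w * behavior[f] for w, f in zip(weights, FEATURES)), 1)

worker = {"client_feedback": 0.9, "response_time": 0.7,
          "repeat_clients": 0.4, "dispute_free": 0.95}

weights = new_weights()
print("score today:", score(worker, weights))

weights = new_weights()  # an unannounced "update" changes the criteria...
print("score later:", score(worker, weights))  # ...same behavior, new score
```

In the article's terms, this is the invisible cage: the number is consequential, but its criteria, and changes to those criteria, are unobservable, so neither experimenting nor withdrawing reliably improves it.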

23 January 2024

Insurance and LLMs

'Artificial intelligence for health insurance: A proposed framework for FDA oversight' by Renee Sirbu, Jessica Morley and Luciano Floridi comments 

Despite mounting enthusiasm regarding the introduction of artificial intelligence (AI) software as a medical device (SaMD) to clinical care and, consequently, the development of a new regulatory proposal for the federal oversight of AI/ML medical devices, little attention has been paid to the oversight of AI tools used by large insurers. The U.S. Food and Drug Administration (FDA) has advanced an “Action Plan” for clinical AI (CAI) governance. However, the U.S. healthcare system remains threatened by the unregulated application of insurance AI (IAI). In this article, we use IAI tools in the Medicare Advantage (MA) prior authorization pathway as an illustrative case to argue that these technologies require further regulatory attention by the FDA. Specifically, we propose a redefinition of “medical device” under the 21st Century Cures Act as necessarily inclusive of IAI and advance an actionable framework for FDA oversight in the approval of IAI tools for deployment by large healthcare insurers. 

'AI as Agency Without Intelligence: On ChatGPT, Large Language Models, and Other Generative Models' by Luciano Floridi in (2023) Philosophy and Technology comments 

The article discusses the recent advancements in artificial intelligence (AI) and the development of large language models (LLMs) such as ChatGPT. The article argues that these LLMs can process texts with extraordinary success, often in a way that is indistinguishable from human output, while lacking any intelligence, understanding or cognitive ability. It also highlights the limitations of these LLMs, such as their brittleness (susceptibility to catastrophic failure), unreliability (false or made-up information), and the occasional inability to make elementary logical inferences or deal with simple mathematics. The article concludes that LLMs represent a decoupling of agency and intelligence. While extremely powerful and potentially very useful, they should not be relied upon for complex reasoning or crucial information; they are better used to gain a deeper understanding of a text’s content and context than as a replacement for human input. The best author is neither an LLM nor a human being, but a human being using an LLM proficiently and insightfully.

07 October 2023

Prescribing

"Prescribing Algorithmic Discrimination' by Jennifer D Oliva and Elizabeth Pendo comments 

In response to America’s escalating drug poisoning crisis, the federal government has funded, incentivized, and mandated that states adopt and implement prescription drug monitoring programs (PDMPs) to electronically surveil controlled substances and other “drugs of concern.” State PDMPs utilize proprietary, predictive software platforms that deploy algorithms to determine whether a patient is at risk for drug misuse, drug diversion, doctor shopping, or substance use disorder. PDMPs have never been validated by a federal agency or peer review, yet states have mandated their use throughout the health care delivery system. 
 
Research demonstrates that clinical overreliance on the risk scores generated by PDMP algorithms motivates clinicians to refuse to treat—or to inappropriately treat—marginalized and stigmatized patient populations, including individuals with, or perceived as suffering from, substance use disorder and patients with chronic, complex disabilities. This article provides a framework for challenging such PDMP algorithmic discrimination as disability discrimination. It contends that Section 504 of the Rehabilitation Act, the Americans with Disabilities Act, and Section 1557 of the Affordable Care Act can be invoked to protect vulnerable patients from PDMP-related algorithmic discrimination, and it provides recommendations to strengthen the 2022 Section 1557 proposed rule concerning clinical-decision algorithmic discrimination, to harmonize new and existing antidiscrimination protections, and to improve implementation and enforcement efforts in this context.

28 September 2023

Decisions

'When is a Decision Automated? A Taxonomy for a Fundamental Rights Analysis' by Francesca Palmiotto in German Law Journal (forthcoming) comments 

This paper addresses the pressing issues surrounding the use of automated systems in public decision-making, with a specific focus on the field of migration, asylum, and mobility. Drawing on empirical research conducted for the AFAR project, the paper examines the potential and limitations of the General Data Protection Regulation and the proposed Artificial Intelligence Act in effectively addressing the challenges posed by automated decision making (ADM). The paper argues that the current legal definitions and categorizations of ADM fail to capture the complexity and diversity of real-life applications, where automated systems assist human decision-makers rather than replace them entirely. This discrepancy between the legal framework and practical implementation highlights the need for a fundamental rights approach to legal protection in the automation age. To bridge the gap between ADM in law and practice, the paper proposes a taxonomy that provides theoretical clarity and enables a comprehensive understanding of ADM in public decision-making. This taxonomy not only enhances our understanding of ADM but also identifies the fundamental rights at stake for individuals and the sector-specific legislation applicable to ADM. The paper finally calls for empirical observations and input from experts in other areas of public law to enrich and refine the proposed taxonomy, thus ensuring clearer conceptual frameworks to safeguard individuals in our increasingly algorithmic society.

24 September 2023

Collection

The important 'Monitoring, Streamlining and Reorganizing Work with Digital Technology: A case study on software for process mining, workflow automation, algorithmic management and AI based on rich behavioral data about workers' by Wolfie Christl comments 

Data collection in the workplace has become ubiquitous. Employers use a growing number of information systems to plan, organize and manage workflows and work performed by their employees, most prominently systems for enterprise resource planning (ERP) and customer relationship management (CRM), which are now used by mid- to large-size organizations in most industries. Many systems constantly store digital records about work activities and behaviors of employees. This data is increasingly stored in centralized databases and in the cloud. Employers exploit the data to support managerial decisions, organize work, automate workflows and monitor workers. The technical systems in place are often complex and opaque. Most workers will not be aware of the data flows and decisions that occur in the background while they routinely interact with networked software and devices at work. 

This case study explores, examines and documents software systems and technologies used by employers that utilize extensive personal data about the work activities and behaviors of employees to streamline, reorganize and manage work, expand control over workers, subject them to digital monitoring and make automated decisions about them – with a focus on Europe. To illustrate wider practices, it investigates cloud-based software for enterprise data analytics, workflow automation and algorithmic management provided by the German vendor Celonis, based on a detailed analysis of software documentation and other corporate sources. 

Celonis is considered the global market leader in software for process mining, which utilizes activity log data recorded by enterprise systems from vendors like SAP, Oracle, Salesforce and Microsoft to create a digital representation of how work is actually being performed in an organization, down to granular steps and tasks. Process mining aims to analyze, standardize and optimize workflows in order to make them more productive and efficient while lowering costs. Still considered a “startup”, Celonis has a significant customer base in Europe and the US. It received more than a billion US dollars in venture capital and was listed among the five largest private investments in “AI” technology globally in 2022. SAP started reselling its technology in 2015. Since then, Celonis has added functionality for workflow automation and task management and started to refer to its technology as an “execution management system”, putting the focus on managing rather than merely analyzing work. Several consulting firms like KPMG, Deloitte, Accenture, Capgemini, IBM and the Porsche subsidiary MHP provide Celonis-based applications. 
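For readers unfamiliar with the technique: process mining generally starts from event logs in which each row records at least a case identifier, an activity name and a timestamp. The minimal Python sketch below illustrates the general idea only, not Celonis' proprietary system, and the log data is invented. It reconstructs a 'directly-follows' graph, i.e. how often one work step directly follows another across cases.

```python
from collections import Counter, defaultdict

# Hypothetical event log: one row per recorded work activity.
# Real logs are extracted from ERP/CRM systems, can hold millions of rows,
# and typically link each step to the worker who performed it.
event_log = [
    {"case": "INV-001", "activity": "Receive invoice", "ts": "2023-01-02T09:00", "user": "A"},
    {"case": "INV-001", "activity": "Approve invoice", "ts": "2023-01-02T11:30", "user": "B"},
    {"case": "INV-001", "activity": "Pay invoice",     "ts": "2023-01-05T10:00", "user": "B"},
    {"case": "INV-002", "activity": "Receive invoice", "ts": "2023-01-03T08:15", "user": "A"},
    {"case": "INV-002", "activity": "Reject invoice",  "ts": "2023-01-04T14:00", "user": "C"},
]

# Group events by case and order each trace by timestamp.
traces = defaultdict(list)
for event in sorted(event_log, key=lambda e: e["ts"]):
    traces[event["case"]].append(event["activity"])

# Count how often one activity directly follows another across all cases.
# This "directly-follows graph" is the digital representation of how work
# is actually performed, down to individual steps.
dfg = Counter()
for activities in traces.values():
    for a, b in zip(activities, activities[1:]):
        dfg[(a, b)] += 1

for (a, b), n in dfg.most_common():
    print(f"{a} -> {b}: {n}")
```

Real deployments run the same kind of aggregation at far larger scale and enrich it with durations, costs and, as this case study stresses, the identity of the worker behind each step.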

This case study documents a wide range of data practices, which can affect workers in many fields, from insurance claim handling to manufacturing, from creative work to warehouse picking, from low-wage to knowledge work:

Analyzing extensive personal data. Celonis analyzes large amounts of log data about work activities recorded by ERP, CRM and other enterprise software systems. This can occur in real time and include millions of time-stamped activity records that typically contain personal data about the workers who perform the activities. 

Streamlining, reorganizing and managing work. Based on the data, Celonis evaluates, assesses and monitors workflows in many industries in order to optimize them in line with the employers’ business goals. Metrics about productivity, time, quality, automation and cost are ubiquitous. Several mechanisms help to automate the reorganization and management of work. The “process AI” promises to identify the “root causes” for “undesired activities” and other inefficiencies. The system can also notify managers of deviations from KPI targets and assign them tasks. The “simulation” module forecasts and predicts the impact of process changes. 

Group-level digital control. The analysis of workflows, activities and metrics for groups, such as teams, departments, units, offices, plants or subcontractors, plays a major role. The system lets employers drill down into group data and compare different groups. Group-level analysis can facilitate internal competition and represents a form of performance monitoring for managers, who are expected to pass the pressure on to workers. It can also facilitate peer control, where members of a team or other groups put pressure on each other. 

Granular performance and behavior monitoring. Celonis’ technology can be used to scrutinize work at the level of individual employees and monitor, rate and rank named workers by their productivity, speed, work outcomes and behaviors. Employers can use it for granular performance and behavior monitoring, from rating what call center agents say in conversations to assessing tasks down to the second in manufacturing. 

Analyzing social interactions. Another software module “adds the social aspect of processes” to the system and promises to assess work activities with respect to social interactions and collaboration between workers. 

Workflow automation across enterprise systems. In addition, the system can facilitate workflows and real-time data sharing across Celonis’ process mining software and hundreds of other enterprise systems, such as for ERP, CRM, HRM, task management and communication. It can automatically initiate particular actions in SAP, Salesforce, Workday or Microsoft 365 when certain criteria are met in the process data. It can also trigger actions based on networked access to other enterprise systems, for example, by watching the location of a delivery driver or by monitoring the corporate chat system Slack or an Outlook email inbox. 

Automated task assignment. Celonis’ workflow automation technology can involve processing workers’ personal data and making automated decisions about them. It can automatically prioritize, distribute and assign tasks to workers and provide them with a limited set of recommended actions to perform. 

Apps that combine process analysis and algorithmic management. Employers, consulting firms and other vendors can create Celonis-based applications that address particular processes in an organization. These apps can combine process analysis, management automation, workflow automation and task assignment. 

Recording screen, application, browser, keyboard and mouse activity. Another Celonis technology analyzes interactions, behaviors and activities performed on the desktop computers of employees. The “task mining” system can capture extensive personal data including screen recordings, keystrokes, mouse clicks and clipboard contents from up to 2,500 employees. While employers can customize the captured data, example applications show that the system can be used to scrutinize how workers use programs and applications, websites, keyboard commands and copy/paste functionality. Employers can combine data on desktop interactions with activity data from enterprise systems and use it to detect “inefficiencies”, decrease the time spent on “non-value adding activities” and assess “workforce productivity”, “productive time” and “idle time”.
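To see what a metric like 'idle time' can amount to in practice, consider the following rough Python sketch. It is purely hypothetical, as the case study does not disclose Celonis' actual formula: gaps between consecutive desktop events longer than a threshold are counted as idle.

```python
from datetime import datetime, timedelta

# Hypothetical timestamped desktop events (clicks, keystrokes) for one worker.
events = [
    "2023-01-02T09:00:00", "2023-01-02T09:03:20", "2023-01-02T09:25:00",
    "2023-01-02T09:26:10", "2023-01-02T10:40:00",
]
timestamps = [datetime.fromisoformat(t) for t in events]

# Any gap between consecutive events longer than this counts as "idle".
IDLE_THRESHOLD = timedelta(minutes=5)

idle = timedelta()
for prev, curr in zip(timestamps, timestamps[1:]):
    gap = curr - prev
    if gap > IDLE_THRESHOLD:
        idle += gap

total = timestamps[-1] - timestamps[0]
print(f"observed span: {total}, idle: {idle}, productive: {total - idle}")
```

Even in this toy version the resulting 'productivity' figure hinges on an arbitrary threshold, one reason such metrics are less objective than they may appear.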

Corporate sources suggest that analysis results can be displayed both at the level of teams and for individual workers. 

Employers can customize the systems offered by Celonis and its partners, use only parts of them or use them in less intrusive ways. The last section of the case study summarizes the identified data practices and discusses potential implications for workers. While granular performance monitoring at the individual level is clearly problematic, extracting aggregate knowledge from personal data increases the power imbalance at work and can also have significant effects. Utilizing the data to standardize and unilaterally reorganize workflows can accelerate and intensify work, reduce discretion, make workers easier to replace, facilitate outsourcing, undermine bargaining power and affect wages. Employers may refer to “objective” data to justify arbitrary decisions. Automated task assignment and algorithmic management practices can also have a variety of side effects. The rapid expansion of data flows and functionality potentially undermines purpose limitation, a cornerstone of European data protection law. 

The findings of this case study will be incorporated into the main report of the ongoing project “Surveillance and Digital Control at Work” (2023-2024), led by Cracked Labs, which aims to explore how companies use personal data on workers in Europe. The main report will draw further conclusions.

16 September 2023

Discrimination

‘Setting the Framework for Accountability for Algorithmic Discrimination at Work’ by Alysia Blackham in (2023) 47(1) Melbourne University Law Review comments 

Digital inequalities at work are pervasive yet difficult to challenge. Employers are increasingly using algorithmic tools in recruitment, work allocation, performance management, employee monitoring and dismissal. According to a survey conducted by the Society for Human Resource Management, nearly one in four companies in the United States (‘US’) use artificial intelligence (‘AI’) in some form for human resource management. Of the surveyed organisations that do not yet use automation for such processes, one in five plan to adopt or expand AI tools for performance management over the next five years. 

The elimination of discrimination in employment and occupation is a fundamental obligation of International Labour Organization (‘ILO’) members, and is included in the ILO Declaration on Fundamental Principles and Rights at Work. This obligation invariably extends to the digital sphere. It is critical, then, to create a meaningful framework for accountability for these algorithmic tools. At present, though, it is unclear who is responsible for monitoring the risks of algorithmic decision-making at work: is it the technology companies who develop and market these algorithmic products? The employers using algorithmic tools? Or the individual workers who might experience inequality as a result of algorithmic decision-making? Or, indeed, all three? 

This article considers how we might create a framework for accountability for digital inequality, specifically concerning the use of algorithmic tools in the workplace that disadvantage groups of workers. In Part II, I consider how algorithms and algorithmic management might be deployed in the workplace, and the way this might address or exacerbate inequality at work. I argue that the automated application of machine learning (‘ML’) algorithms in the workplace presents five critical challenges to equality law related to: the scale of data used; their speed and scale of application; lack of transparency; growth in employer control; and the complex supply chain associated with digital technologies. In Part III, I consider principles that emerge from privacy and data protection law, third-party and accessorial liability, and collective solutions to reframe the operation of equality law to respond to these challenges. Focusing on transparency, third-party and accessorial liability, and supply chain regulation, I draw on comparative doctrinal examples from the European Union (‘EU’) General Data Protection Regulation (‘GDPR’), the Australian Privacy Principles (‘APP’) and Fair Work Act 2009 (Cth) (‘Fair Work Act’), and collectively negotiated solutions to identify possible paths forward for equality law. This analysis adopts comparative doctrinal methods, reflecting what Örücü describes as a ‘problem-solving’ or sociological approach to comparative law, examining how different legal systems have responded to similar problems in contrasting ways. The fact that these jurisdictions are facing a similar problem warrants the comparison; differences in national context increase the potential for mutual learning. The GDPR is seen as setting the standard or benchmark for global data protection regulation: it is therefore considered here as an important comparator to Australian provisions. 

Drawing on these principles, I argue that there is a need to develop a meaningful accountability framework for discrimination effected by algorithms and automated processing, with differentiated responsibilities for algorithm developers, data processors and employers. While discrimination law — either via claims of direct or indirect discrimination — might be adequately framed to accommodate algorithmic discrimination, I argue for a need to reframe equality law around proactive positive equality duties that better respond to the risks of algorithmic management. This represents a critical and innovative contribution to Australian legal scholarship, which has rarely considered the implications of technological and algorithmic tools for equality law. Given the critical differences between Australian, US and EU equality law, there is a clear need for jurisdiction-specific consideration of these issues.

29 August 2023

Algorithmics

'The Thought Problem and Judicial Review of Administrative Algorithms' by David Tan in Adelaide Law Review (forthcoming) comments 

The issue of whether algorithms can be characterised as “thinking” or have properties of “thought” has arisen both in judicial decisions like Pintarich and in scholarly discussion of issues like bias. This paper refers to this issue as the Thought Problem and introduces three principles for resolving it: the manifestation, implementation, and equivalent treatment principles. The manifestation principle states that an algorithmic output can be considered a decision where the manifested conduct of the agency supervising the algorithm would be understood by the outside world as the product of a thinking person. The implementation principle states that the humans in the executive who implemented the algorithm bear responsibility for it. The equivalent treatment principle proposes to treat an algorithm and a human who reasoned similarly as equivalent in the eyes of administrative law. The paper does not try to conclusively resolve which principle is best but suggests the equivalent treatment principle is the most complete one for dealing with the Thought Problem. 

23 August 2023

Regulation

'Regulatory Issues of Data and Algorithms for the Data-Driven Economy' by Kung-Chung Liu in (2023) GRUR International comments 

Data and algorithms are the lifeblood of the data-driven economy. Big data and big algo are bringing new challenges and legal issues to the fore. This article deals with some aspects of these issues and tries to put forward an analytical framework for typifying data, securing access to and use of data, and ultimately maximizing the generation and flow of data. The article then deals with algorithms and endeavours to propose six principles for auditing algorithms and a brand new international governance framework for algorithms and their auditing via a treaty and relevant mechanisms. ... 

With the ubiquitous take-up of digital technology and the digitization of products, services, and business processes, we find ourselves in a data-driven economy whose two pillars are data and algorithms. Data, and especially the big data generated and collected by netizens and machines (devices) connected via the internet of things (IoT), are essential elements of the development of new products or services, business models and competition. Another, even more important, pillar of the data-driven economy is algorithms, as they decide the collection, compilation and analysis of data, and shape the final decision-making of artificial intelligence (AI). Controversies are on the rise about whether algorithms are biased or even designed to distort competition and harm consumers, whether algorithm-driven market interactions call traditional economic models of competition law into question (explicit versus tacit collusion), and whether and how new regulations for algorithms must be developed. How can we best regulate data and algorithms to boost the data-driven economy? How can we overcome the intellectual property (IP) hurdles of data and algorithms, copyright and trade secrets in particular, that might hinder the data-driven economy? 

This article is dedicated to clarifying some of the regulatory issues of data and algorithms in the data-driven economy; no single solution can be offered, as the discussion is not targeted at one specific national legal regime. It will first focus on data, discussing its typology (identifying three types of data), protection, access and use. Its second focal point is algorithms and their auditing: it endeavours to propose six principles for auditing algorithms, together with a new treaty and mechanisms to tackle these issues from the perspective of global governance.