01 March 2025

Attention Biopolitics

'Attention as an object of knowledge, intervention and valorisation: exploring data-driven neurotechnologies and imaginaries of intensified learning' by Dimitra Kotouza, Martyn Pickersgill, Jessica Pykett and Ben Williamson in (2025) Critical Studies in Education comments 

Innovations in mobile neuromonitoring and brain–computer interfaces are increasingly used to inform understandings of human brains and behaviours while also catalysing imaginaries of neuroscientifically measured and enhanced economic productivity. In this paper, we focus on neurotechnologies that claim to capture, monitor, measure and train learners’ attention. We analyse a corpus of relevant scientific, governance, and commercial texts to explore how they reconfigure learners’ attention as an object of knowledge, intervention, and valorisation. We demonstrate that outcomes-driven neuroscience research and technological development tends to split attention into optimal and undesirable forms: externally versus internally oriented, and synchronised versus unsynchronised with others, which become the variables of intervention to optimise attention. Commercial wearables, in turn, envelop desirable forms of attention under logics of brain control, social discipline, and valorisation. This process is enacted within an international context of speculation on neurotechnology investments and their anticipated outcome of enhancing future human productivity. Circumscribing desirable forms of learner attention and subjectivity, these technologies provide expanded means to mould and monitor learners’ attention towards performativity regimes of economised education governance while enabling profit-making based on learners’ activity.

They note 

Neurotechnological research and development has received billions of dollars of state funding across the world in recent decades via large-scale brain science projects. This reflects, and further intensifies, a focus on neuroscience as an object of significant social and political investment and concern (Pickersgill, 2023). Today, the increasing portability of neuroimaging and its integration with software allow analysing and visualising ‘millisecond-by-millisecond’ brain data thought to reflect cognitive processes (Davidesco et al., 2021, p. 650). These advances have been hailed as capable of ‘unlocking the secrets of how the human brain works’ (Mathieson et al., 2021, p. 8). 

One focus among many has been the field termed ‘educational neuroscience’. While this has for some years asserted that its epistemic products can provide new insights to teachers (Howard-Jones et al., 2021), well-funded neurotechnological advances enable neuroscience to propose going beyond these founding ambitions. Combined with learning analytics and AI, technologies are anticipated by some proponents to deliver algorithmically personalised education alongside various other tools to enhance learning capacity (Davidesco et al., 2021, p. 650; Edelenbosch et al., 2015). 

Within the US especially, the notion of ‘attention’ in education is emphasised through research, policy, and economic infrastructures that promote neurotechnologies to measure, monitor, and enhance learners’ attentiveness. Research in this area involves tried-and-tested techniques such as electroencephalography (EEG) – which visualises brain electrical activity – and functional magnetic resonance imaging (fMRI), a technique to represent blood flow as a means of generating information about brain activity. Learners’ attentional control is increasingly presented as a datafied object of knowledge and intervention in the broader context of concern around ADHD and digital distraction. Using fMRI, cognitive neuroscience is said to have revealed the ‘attention networks’ in which children with diagnoses of ADHD (Attention Deficit Hyperactivity Disorder) are thought to have a ‘deficit’ (Posner, 2013, p. 15). Further, various attention-targeting neurotechnologies currently in development promise to improve attention and address ADHD symptoms. Indeed, the relevance of cognitive neuroscience to education has often been advocated in the context of rising levels of ADHD diagnoses and parental concerns that ‘typically developing children’ will not be able to resist the temptations of the ‘digital age’ (Posner, 2013, p. 14). Perhaps ironically, then, social anxieties around attention have shaped interest in, and a market for, neurotechnological interventions within education. Such developments resonate with broader trajectories of governance. Since the 1980s, educational governance internationally has been increasingly guided by economic theories of human capital (Roberts-Holmes & Moss, 2021). This ‘economisation of education’ renders it as the process of individuals acquiring skills and personal qualities to form a productive workforce (Connell, 2013). Subsequently, it also involved economists’ interest in ‘the neuroscience of human capability formation’ (Heckman, 2007, p. 13250). Education is repeatedly submitted to what Foucault (2008, p. 247) has called a ‘permanent economic tribunal’, in which governance structures – such as OECD’s international student assessment – metricise, standardise, and compare schools and educational systems, subjectifying through the proliferation of numbers (Grek & Ydesen, 2021).
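
The EEG techniques mentioned here underpin the commercial attention-monitoring wearables analysed later in the article. As a rough illustration of how such devices turn 'brain electrical activity' into an attention score, the following Python sketch (ours, not the authors'; the signal is synthetic) estimates band power with Welch's method and computes a theta/beta power ratio, one attention proxy discussed in the ADHD literature:

```python
# A sketch (not from the paper) of how consumer EEG wearables commonly derive
# an "attention" index: estimate the power spectrum of the signal and compare
# power across frequency bands. The theta/beta ratio used here is one proxy
# discussed in the attention/ADHD literature; the signal below is synthetic.

import numpy as np
from scipy.signal import welch

fs = 256                                   # sampling rate in Hz
t = np.arange(0, 10, 1 / fs)               # 10 seconds of "EEG"
rng = np.random.default_rng(1)
# Synthetic signal: theta (6 Hz) and beta (20 Hz) components plus noise.
eeg = 2.0 * np.sin(2 * np.pi * 6 * t) + 0.8 * np.sin(2 * np.pi * 20 * t)
eeg += rng.normal(0, 0.5, t.size)

def band_power(freqs, psd, lo, hi):
    # Integrate the power spectral density over one frequency band.
    mask = (freqs >= lo) & (freqs < hi)
    return np.trapz(psd[mask], freqs[mask])

freqs, psd = welch(eeg, fs=fs, nperseg=1024)
theta = band_power(freqs, psd, 4, 8)       # theta band, 4-8 Hz
beta = band_power(freqs, psd, 13, 30)      # beta band, 13-30 Hz
print(f"theta/beta ratio: {theta / beta:.2f}")  # higher is read as less attentive
```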

Under this regime, the learning process has become an object not only of economic management but also of the scientific management of learning. This is expressed, for instance, as policy interest in the ‘learning sciences’ and most prominently in educational neuroscience (De Vos, 2016; Pykett & Disney, 2016). Internationally, organisations like the OECD and UNESCO have generated policy-influencing agendas for educational transformation around the promises of neuroscience and neurotechnology. For the OECD, optimised human cognitive performance – figured as ‘brain capital’ – is thought necessary for long-term productivity and economic prosperity, and is to be measured and improved by neuroscientific instruments (OECD, 2021; Smith et al., 2021). Educational neurotechnology, in this policy discourse, is imagined as having an ‘immense potential to improve student learning and cognition’ (UNESCO, 2023, p. 6).

Anticipatory enthusiasm about the possibilities of neurotechnology in education redoubles existing socio-cultural fascination with the neurological and its attendant subjectivities (De Vos, 2016; Pickersgill et al., 2011; Pykett & Disney, 2016; Rose & Abi-Rached, 2013; Vidal & Ortega, 2017). Neuroscience is expected to help ‘stretch’ children and provide tools to improve their learning outcomes, not least by finding therapeutic solutions to increasing rates of ADHD diagnoses (Cortese et al., 2023). Coupled with the growing market in learning analytics – which monitor university students’ data traces and use algorithms to assess engagement, predict performance, and help prevent lapses and drop-outs (Kotouza et al., 2021) – these technologies enable new ways to directly monitor, assess, and train attention in educational settings. This leads us to ask: what might be the implications for the shaping (and reduction) of students into attentive and productive learner subjects? More specifically, what aspects of this subjectivity are targeted via scientific and economic forms of control, and in what ways?

This article is a step towards answering such questions, by concentrating on the production of knowledge and technology in neuroscience that addresses attention in education. It is informed by critical studies of the implications of neurotechnologies for knowledge production, governance and contemporary forms of economic organisation (e.g. Pickersgill, 2023), as we outline conceptually in the following section. Specifically, by building on such theorisations of the epistemic, interventionist and economic dimensions of neuroscience, we explore how ‘attention’ is (re)configured as: (a) an object of knowledge, by examining how concepts of attention and distraction have shifted in educational neuroscience studies; (b) an object of intervention, by looking at how those concepts are operant within emerging neurotechnologies used to measure, monitor, and enhance learners’ attention; and (c) an object of valorisation, by tracing how attention neurotechnologies enable learners’ attentional activity to enter, in different forms, circuits of valorisation.

22 January 2025

ERT

'Training Humans to Detect Children's Lies Through their Facial Expressions' by Alison O'Connor, Kaila Bruer, Jennifer Gongola, Thomas D Lyon and Angela D Evans in Applied Cognitive Psychology (in press) comments 

 The accurate detection of children’s truthful and dishonest reports is essential as children can serve as important providers of information. Research using automated facial coding and machine learning found that children who were asked to lie about an event were more likely to look surprised when hearing the first question during an interview about said event. The present studies explored if humans can be trained to look for surprised expressions to detect children’s deception. Participants made lie-detection judgments after seeing children’s expressions in very brief clips. In Study 1, we compared performance across a training condition and control condition, and in Study 2 we modified the training. With training, adults could detect children’s lies at above chance levels by viewing their facial expression. Detection accuracy was further improved with modified training (Study 2), but participants held a consistent lie bias. Challenges with using facial expressions to detect deceit are discussed.

15 January 2024

Emotion Recognition

'The unbearable (technical) unreliability of automated facial emotion recognition' by Federico Cabitza, Andrea Campagner and Martina Mattioli in (2022) 9(2) Big Data and Society comments 

Emotion recognition, and in particular facial emotion recognition (FER), is among the most controversial applications of machine learning, not least because of its ethical implications for human subjects. In this article, we address the controversial conjecture that machines can read emotions from our facial expressions by asking whether this task can be performed reliably. This means, rather than considering the potential harms or scientific soundness of facial emotion recognition systems, focusing on the reliability of the ground truths used to develop emotion recognition systems, assessing how well different human observers agree on the emotions they detect in subjects’ faces. Additionally, we discuss the extent to which sharing context can help observers agree on the emotions they perceive on subjects’ faces. Briefly, we demonstrate that when large and heterogeneous samples of observers are involved, the task of emotion detection from static images crumbles into inconsistency. We thus reveal that any endeavour to understand human behaviour from large sets of labelled patterns is over-ambitious, even if it were technically feasible. We conclude that we cannot speak of actual accuracy for facial emotion recognition systems for any practical purposes. ... 

Emotional artificial intelligence (AI) (McStay, 2020) is an expression that encompasses all computational systems that leverage ‘affective computing and AI techniques to sense, learn about and interact with human emotional life’. Within the emotional AI domain (but even more broadly, within the entire field of AI based on machine learning (ML) techniques), facial emotion recognition (FER), which denotes applications that attempt to infer the emotions experienced by a person from their facial expression (Paiva-Silva et al., 2016; McStay, 2020; Barrett et al., 2019), is one of the most controversial (Ghotbi et al., 2021) and debated (Stark and Hoey, 2021) applications.

In fact, ‘turning the human face into another object for measurement and categorization by automated processes controlled by powerful companies and governments touches the right to human dignity’ and ‘the ability to extract […physiological and psychological characteristics such as ethnic origin, emotion and wellbeing…] from an image and the fact that a photograph can be taken from some distance without the knowledge of the data subject demonstrates the level of data protection issues which can arise from such technologies’. On the other hand, opinions diverge among the specialist literature. Some authors highlight the accurate performance of FER applications and their potential benefits in a variety of fields; for instance, customer satisfaction (Bouzakraoui et al., 2019), car driver safety (Zepf et al., 2020), or the diagnosis of behavioural disorders (Paiva-Silva et al., 2016; Jiang et al., 2019). Others have raised concerns regarding the potentially harmful uses in sectors such as human resource (HR) selection (Mantello et al., 2021; Bucher, 2022), airport safety controls (Jay, 2017), and mass surveillance settings (Mozur, 2020). In addition, the scientific basis of FER applications has been called into question, either by equating their assumptions with pseudo-scientific theories, such as phrenology or physiognomy (Stark and Hutson, Forthcoming), or by questioning the validity of the reference psychological theories (Barrett et al., 2019), which assume the universality of emotion expressions through facial expressions (Elfenbein and Ambady, 2002). Lastly, others have noted that the use of proxy data (such as still and posed images) to infer emotions should be supported by other contextual information (McStay and Urquhart, 2019), especially if the output of the FER systems is used to make sensitive decisions, so as to avoid misinterpretation of the broader context. According to Stark and Hoey (2021) ‘normative judgements can emerge from conceptual assumptions, themselves grounded in a particular interpretation of empirical data or the choice of what data is serving as a proxy for emotive expression’. 

From a technical point of view, FER is a measurement procedure (Mari, 2003) in which the emotions conveyed in facial expressions are probabilistically gauged to detect the dominant one or a collection of prevalent emotions. As a result, FER can be related to the concepts of validity and reliability. A recognition system is valid if it recognizes what it is designed to recognize (i.e. basic emotions); it is reliable if the outcome of its recognition is consistent when applied to the same objects (i.e. a subject’s expression). However, when FER is achieved by means of a classification system based on ML techniques, its reliability cannot (and should not) be separated from the reliability of its ground truth, i.e. training and test datasets (Cabitza et al., 2019). In this scenario, reliability is defined as the extent to which the categorical data from which the system is expected to develop its statistical model are generated from ‘precise measurements’, i.e. human ‘recognitions’ exhibiting an acceptable agreement. This is because, by definition, no classification model can outperform the quality of the human reference (Cabitza et al., 2020b). 
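
To make the reliability criterion concrete, here is a minimal Python sketch (ours, not the authors' code; the ratings are invented) of Fleiss' kappa, a standard chance-corrected statistic for agreement among multiple annotators assigning categorical labels, the kind of measure a FER ground truth would need to report:

```python
# A minimal sketch of inter-annotator reliability for emotion labels.
# Fleiss' kappa measures how far agreement among several raters on
# categorical labels exceeds the agreement expected by chance.
# All ratings below are invented for illustration.

from collections import Counter

EMOTIONS = ["anger", "disgust", "fear", "happiness", "sadness", "surprise"]

# Each inner list holds the labels four hypothetical annotators gave one image.
ratings = [
    ["happiness", "happiness", "happiness", "surprise"],
    ["anger", "disgust", "anger", "anger"],
    ["fear", "surprise", "surprise", "fear"],
    ["sadness", "sadness", "sadness", "sadness"],
]

def fleiss_kappa(ratings, categories):
    n_items = len(ratings)
    n_raters = len(ratings[0])
    cat_totals = Counter()
    p_items = []  # per-item observed agreement
    for item in ratings:
        counts = Counter(item)
        cat_totals.update(counts)
        pairs_agreeing = sum(c * (c - 1) for c in counts.values())
        p_items.append(pairs_agreeing / (n_raters * (n_raters - 1)))
    p_bar = sum(p_items) / n_items  # mean observed agreement
    total = n_items * n_raters
    p_e = sum((cat_totals[c] / total) ** 2 for c in categories)  # chance agreement
    return (p_bar - p_e) / (1 - p_e)

print(f"Fleiss' kappa = {fleiss_kappa(ratings, EMOTIONS):.3f}")  # ~0.49 here
```

A kappa near 1 indicates near-perfect agreement, while values near 0 indicate agreement no better than chance, the 'inconsistency' into which, the authors argue, emotion labelling crumbles once large and heterogeneous samples of observers are involved.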

In this study, we will not contribute to the vast (and heated) debate still currently going on about the validity of automatic FER systems (Franzoni et al., 2019; Feldman Barrett, 2021; Stark and Hoey, 2021), that is, we do not address the classification task from the conceptual point of view (how to define emotions, if possible at all) nor merely from the technical point of view (how to recognize emotions, whatever they are). For the sake of argument, we assume that the main psychological emotion models make perfect sense and we do not address how robust recognition algorithms are, how well they perform in external settings, and, most importantly, how useful they can be, i.e. whether they provide the benefits that their promoters envision and advocate. 

Instead, we focus on the reliability of their ground truth, which is not a secondary concern from a pragmatic standpoint (Cabitza et al., 2020a, 2020b). To that end, we conducted a survey of the major FER datasets concentrating on their reported reliability as well as a small user study by which we address three related research questions: Do existing FER ground truths have an adequate level of reliability? Are human observers in agreement regarding the emotions they sense in static facial expressions? Do they agree more when the context information is shared before interpreting the expressions? 

The first question is addressed in the ‘Related work and motivations’ section and the answer is in Table 3. The other questions are addressed by means of a user study described in the ‘User study: Methods’ section and whose results are reported in the ‘Results’ section. Finally, in the ‘Discussion’ section, we discuss these findings and their immediate implications, while in the ‘Conclusion’ section we interpret them within the bigger picture of FER reliability and relate them to implications for the use of automated FER systems in sensitive domains and critical human decision making.

'What an International Declaration on Neurotechnologies and Human Rights Could Look like: Ideas, Suggestions, Desiderata' by Jan Christoph Bublitz in (2024) 15(2) AJOB Neuroscience 96 comments 

Ethical and legal worries arising from novel neurotechnological applications have reached the level of international human rights institutions and prompted ongoing deliberations about a new legal instrument that sets international standards for the development, regulation, and use of neurotechnologies. In a recent report on human rights implications of neurotechnologies, the International Bioethics Committee of UNESCO (IBC) considers the idea of a “governance framework set forth in a future UNESCO Universal Declaration on the Human Brain and Human Rights” or a “New Universal Declaration on Human Rights and Neurotechnology” (2021, at 184c). Other human rights agencies have been concerned with the matter, hosted hearings and commissioned reports (especially OECD 2019; see also Ienca 2021; OECD 2017; Sosa et al. 2022). The UN Human Rights Council (2022) mandated its Advisory Committee to prepare a comprehensive study on neurotechnologies and human rights. A novel international instrument will likely emerge from these debates (cf. UNESCO Docs. 216 EX/Dec.9 and EX/50). As the first global instrument specifically tailored to neurotechnologies, it will set the tone for further regulations at domestic, supranational, and international levels. Although some stakeholders have been consulted in previous proceedings, the development has so far largely evaded the broader attention of the neuroscience, neurotech, and neuroethics communities. This is unfortunate, as academic input is vital to identify problems, frame debates and develop solutions, not least because international agencies lack subject matter expertise and have relied on a limited number of experts so far. The timing is critical. Once debates move to the political arena and intergovernmental negotiations, the room for academic and big picture debates narrows as matters tend to become increasingly technical and arguments tend to become interest-based. Accordingly, the time for impactful academic interventions is now. To facilitate it and to widen the perspective of current debates, this target article puts to discussion twenty-five considerations and desiderata for a future instrument. In particular, it wishes to transcend the confines of the debate about so-called neurorights that dominates the current discourse (e.g., Borbón and Borbón 2021; Bublitz 2022c; Genser, Herrmann, and Yuste 2022; Ienca 2021; Ligthart et al. 2023; Rommelfanger, Pustilnik, and Salles 2022; Yuste et al. 2017; Zúñiga-Fajuri et al. 2021). Proceeding on the basis of existing rights, the following remains uncommitted as to whether novel rights are needed. This debate overshadows a broader and richer field of relevant questions, and it is time to turn to them.

Setting the stage, the nature and the limits of a future instrument should be clarified. It will likely be a soft law instrument such as a recommendation by UNESCO or a resolution by the UN General Assembly. Such documents are not legally binding and lack enforcement mechanisms. Whether they qualify as law at all depends on legal theory’s perennial question about the nature of law and may be answered differently with respect to different types of documents (Andorno  2012; Shelton  2008). Suffice it to note here that such documents understand themselves as more than mere ethical statements because they demand compliance by signatory States without creating enforceable legal obligations. Theoretical matters aside, soft law instruments can be practically effective governance tools that draw attention to problems and set standards which are often observed by States and other stakeholders. They may, for instance, affect governmental research funding, decisions by ethics committees, or the regulatory conditions for market approval of devices. Soft law may also turn into hard law in several ways. It may provide guidance for courts in interpreting norms, rendering the content of rights more concrete and resolving normative conflicts. It may inform secondary soft law such as general comments by treaty bodies, and inspire further binding acts at domestic or international levels. Soft law’s greater flexibility is an advantageous feature in fast-moving fields without firm normative underpinnings such as neurotechnologies, and has therefore become the prime legal-regulatory tool for technology governance at both the international and the domestic level (Hagemann and Skees  2018; Marchant and Tournas 2019). At any rate, because of the often insurmountable political hurdles that binding treaties of international law face, especially in the current geopolitical climate, soft law instruments are the best form of international governance of neurotechnologies that is realistically attainable in the near future. 

The nature of an instrument shapes its content. In contrast to the abstract and elegantly worded Universal Declaration of Human Rights and the international covenants that followed it, soft law instruments allow for more aspirational goals and broader scopes but also for more concrete norms and standards. In addition, they are not only directed at States as the protagonists of international law but also at other stakeholders, notably private actors such as businesses that may threaten human rights, individuals whose rights may have been violated, but also other relevant parties such as engineers and developers of neurotechnologies. Moreover, given the aspiration of global applicability and the need for consensus in matters about which countries and cultures may reasonably disagree, instruments must allow for local adaptability, value pluralism, compromises, and gravitate toward smallest common denominators. These conditions are reflected in the texts of such documents, which are often replete with references to general values of the human rights systems, not always entirely coherent, and sometimes even intentionally vague at critical points. But despite and because of these weaknesses, soft law instruments can set norms and standards that are observed and steer the course of the future development of a field. The UNESCO Recommendation on Artificial Intelligence (AI), adopted in 2021, may serve as a model for a future neurotech instrument. It contains recommendations at different levels of abstraction, from broad values, through principles, to actionable policy options. Although not free from textual weaknesses, the Recommendation provides some novel, concrete, and surprisingly far-reaching standards.

It is further worth noting that international norms for the regulation of neurotechnologies already exist. Current debates sometimes evoke the impression that they develop in a legal vacuum, but this is a bit misleading. For instance, placing devices on markets is regulated by domestic and supranational device regulation, such as the EU Medical Device Regulation, which covers neurotechnologies for medical and some non-medical purposes (European Union 2017). It leaves neurodevices for non-medical neuroimaging outside of its scope, but this is not a gap but rather an intentional regulatory decision. At the international human rights level, the Oviedo Convention on Human Rights and Biomedicine (1997), a legally binding international treaty signed by more than 30 States, seeks to safeguard the dignity and integrity of persons “with regard to the application of biology and medicine” (Council of Europe 1997, preamble). Likewise, the non-binding UNESCO Universal Declaration on Bioethics and Human Rights (2005) was adopted in view of the “rapid advances in science and their technological applications” (2005, preamble). Both instruments contain various norms about human rights and informed consent that apply to neurobiological interventions. The same is true for the Recommendation on Responsible Innovation in Neurotechnology (OECD 2019). This leads to the first desideratum: (i) A future instrument should cohere with existing instruments but not merely repeat them; it should neither contradict them without compelling reasons, nor address similar points by different terms, and should strive to go beyond them by suggesting more concrete norms or addressing substantially different aspects.

The following presents further desiderata and considerations for a future instrument. It proceeds from the general to the particular, from meta-considerations to concrete rights and technical suggestions, and at least partially attempts to deduce the latter from the former. The points are thus interwoven rather than distinct; they are sometimes couched in the idiosyncratic style of international documents and should not be understood as conclusive but as an invitation for criticism and additions.

25 April 2023

Biometrics

'Suspect AI: Vibraimage, Emotion Recognition Technology and Algorithmic Opacity' by James Wright in (2021) Science, Technology and Society comments 

Vibraimage is a digital system that quantifies a subject’s mental and emotional state by analysing video footage of the movements of their head. Vibraimage is used by police, nuclear power station operators, airport security and psychiatrists in Russia, China, Japan and South Korea, and has been deployed at two Olympic Games, a FIFA World Cup and a G7 Summit. Yet there is no reliable empirical evidence for its efficacy; indeed, many claims made about its effects seem unprovable. What exactly does vibraimage measure and how has it acquired the power to penetrate the highest profile and most sensitive security infrastructure across Russia and Asia? I first trace the development of the emotion recognition industry, before examining attempts by vibraimage’s developers and affiliates scientifically to legitimate the technology, concluding that the disciplining power and corporate value of vibraimage are generated through its very opacity, in contrast to increasing demands across the social sciences for transparency. I propose the term ‘suspect artificial intelligence (AI)’ to describe the growing number of systems like vibraimage that algorithmically classify suspects/non-suspects, yet are themselves deeply suspect. Popularising this term may help resist such technologies’ reductivist approaches to ‘reading’—and exerting authority over—emotion, intentionality and agency.

Wright states 

As I sat in the meeting room of a nondescript office building in Tokyo, the managing director of a company called ELSYS Japan discussed my emotional and psychological state, referring to a series of charts and tables displayed on a large screen at the front of the room: Aggression … 20-50 is the normal range, but you scored 52.4 … this is a bit too high. Probably you yourself didn’t know this, but you’re a very aggressive person, potentially… Next is stress. Your stress is 29.2, within the range of 20-40, with a statistical deviation of 14—that’s OK… I think you have very good stress… Just tension—your [average] value is within the range, but because your statistical deviation is high—over 20—so you’re a little tense. Mental balance is 64 from a range of 50-100, so it fits correctly in the range… Charm … 74.6 is pretty good. Now, neuroticism is 35.3, this is also in the range, but the statistical deviation is high. But some people have a high score the first time they are measured. There are people who have high scores for neuroticism as well as for tension, yes. People who possess a delicate heart. (Interview, 17 April 2019)

The director’s seemingly authoritative statements were based on an assessment of various measurements produced by ‘vibraimage’, a patented system developed to quantify a subject’s mental and emotional state through an automated analysis of video footage of the physical movements of their face and head. This system, distributed in Japan by ELSYS Japan under the brands ‘Mental Checker’ and ‘Defender-X’, provides numerical values for levels of aggression, tension, balance, energy, inhibition, stress, suspiciousness, charm, self-regulation, neuroticism, extroversion and stability, categorising these automatically into positive and negative ‘emotions’. Mental Checker generates an impressive array of statistical data arranged across tables, pie chart, histogram and line chart, producing an image of mathematical precision and solid scientific legitimacy (see Figure 1). The report also provides a visualisation of what ELSYS Japan terms an ‘aura’—a horizontal colour-coded bar chart, indicating the frequency of micro-vibrations of a subject’s head, superimposed against a still image of their face.

Vibraimage technology has already entered the global security marketplace. It was deployed at the 2014 Sochi Olympics (Herszenhorn, 2014), 2018 PyeongChang Winter Olympics, 2018 FIFA World Cup in Russia and at major Russian airports to detect suspect individuals among crowds (JETRO, 2019). It has been used at the Russian State Atomic Energy Corporation in experiments to monitor the professionalism of workers handling and disposing of spent nuclear fuel and radioactive waste (Bobrov et al., 2019; Shchelkanova et al., 2019), and to diagnose their psychosomatic illnesses (Novikova et al., 2019). In Japan, Mental Checker and Defender-X have been used by one of the largest technology and electronics companies, NEC, to vet staff at nuclear power stations and by a leading security services firm, ALSOK, to detect and potentially deny entry to or detain suspicious individuals at major events, including the G7 Summit in 2016, as well as sporting events and theme parks (Interview with ELSYS Japan, 17 April 2019). Managers at ELSYS Japan expected that the technology would be used at the 2020 Tokyo Olympics (Nonaka 2018, p. 148, Interview with ELSYS Japan, 17 April 2019), an event that spurred significantly increased spending on domestic security services and infrastructure, with estimated market growth of 18% between 2016 and 2019 (Teraoka, 2018). ELSYS Japan’s customers also include Fujitsu and Toshiba, which have considered ‘incorporat[ing] [vibraimage]… into their own recognition technologies to differentiate their original products’ (Nonaka, 2018, p. 147), and managers told me that Mental Checker has been used by an unspecified number of Japanese psychiatrists to confirm diagnoses of depression.

In South Korea, the Korean National Police Agency, Seoul Metropolitan Police Agency and several universities have collaborated on research aiming to establish the use of vibraimage in a video-based ‘contactless’ lie-detection system as an alternative to polygraph testing (Lee & Choi, 2018; Lee et al., 2018), while, in China, it has been deployed in Inner Mongolia, Zhejiang and elsewhere to identify suspects for questioning and detention, and has been officially certified for use by Chinese police (Choi et al., 2018a, 2018b). Other corporate applications of vibraimage are also proposed: an ELSYS Japan brochure suggests using Mental Checker to discover how employees really feel about their company; measure their levels of stress, fatigue and ‘potential ability’; counter employees’ accusations of bullying and abuses of power in the workplace; and even ‘to know the risk of hiring persons who might commit a crime’ (ELSYS Japan Brochure, undated). The brochure provides a screenshot of a suggested employee report, with grades (A+, B−, C, etc.) for qualities that include stability, fulfilment and happiness, social skills, teamwork, communication, ability to take action, aggressiveness, stress tolerance and ability to ‘recognise reality’.

Vibraimage forms one part of the rapid growth in algorithmic security, surveillance, predictive policing and smart city infrastructure across urban East Asia, enabling the ‘active sorting, identification, prioritization and tracking of bodies, behaviours and characteristics of subject populations on a continuous, real-time basis’ (Graham & Wood, 2003, p. 228). Amid an international boom in both surveillance technologies and artificial intelligence (AI) systems designed to extract maximal information from digital photographic and video data relating to the body, companies are developing algorithms that move beyond facial recognition intended to identify individuals and increasingly aim to analyse their behaviour and emotional states (AI Now Institute, 2018, pp. 50–52). The digital emotion recognition industry was worth up to US$12 billion in 2018, and it continues to grow rapidly (AI Now Institute, 2018). 

As the concepts of algorithmic regulation and governance (Goldstein et al., 2013; Introna, 2016) are increasingly becoming a reality, transparency has become a key theme in critiques of black-boxed algorithms and AI, including those used in emotion recognition. This is particularly the case with machine learning, in which algorithms recursively adjust themselves and can quickly become inexplicable even to data science experts. As Maclure puts it, ‘we are delegating tasks and decisions that directly affect the rights, opportunities and wellbeing of humans to opaque systems which cannot explain and justify their outcomes’ (Maclure, 2019, p. 3). Transparency is linked to and overlaps with values of comprehensibility, explicability, accountability and social justice, and it is frequently presented as a vital component of ethical or ‘good’ AI (Floridi et al., 2018; Hayes et al., 2020; Leslie, 2019). ... 

...  This article uses the case of vibraimage to examine issues around opacity and the work it does for companies and governments in the provision of security services, by attempting to shed light on the algorithms of vibraimage and its imagined and actual uses, as far as possible based on publicly available data. What exactly does vibraimage measure and how does the data the system produces, processed through an algorithmic black box, deliver reports that have acquired the power to penetrate corporate and public security systems involved in the highest profile and most sensitive security tasks in Russia, Japan, China and elsewhere? The first section of the article examines emotion detection techniques and their digitalisation. The second section focuses on vibraimage and how its proponents, many of whom have commercial relationships with companies distributing it, have engaged in processes of scientific legitimation of the technology while making claims for its actual and potential uses. The final section considers how the disciplining power and corporate value of vibraimage are generated through its very opacity, in stark contrast to increasingly urgent demands across the social sciences and society, more broadly, for transparency as a prerequisite for ‘good AI’. I propose the term ‘suspect AI’ reflexively to describe the increasing number of algorithmic systems, such as vibraimage, in operation globally across law enforcement and security services, which automatically classify subjects as suspects or non-suspects. Popularising this term may be one way to resist such reductivist approaches to reading and exerting authority over human emotion, intentionality, behaviour and agency. 

Emotion Recognition Based on Facial Expressions 

Psychologist Paul Ekman pioneered research exploring the relationship between emotions and facial expressions from the 1960s onwards, building on Darwin’s (2012[1872]) work on evolutionary connections between the two among animals, including humans. Ekman conducted experiments around the world, aiming to demonstrate the universality of a handful of basic emotions (such as anger, contempt, disgust, fear, happiness, sadness and surprise) across all cultures and societies, and of their articulation through similar facial expressions (Ekman, 1992). This work was highly influential because it seemed to provide overwhelming empirical evidence that individuals of all cultures were able to ‘correctly’ categorise the expressions of people of their own and other cultures provided in photos, matching them to the ‘basic emotions’ they supposedly expressed (Ekman & Friesen, 1971).

Ekman further argued that facial expressions could be used to identify incongruities between professed and ‘real’ emotions, enabling facial expression analysis to be used for lie detection (Ekman & Friesen, 1969). This attracted substantial interest from corporations concerned with ensuring the honesty of employees or gaining covert insights in business negotiations, and from governments and security forces concerned with identifying dissimulating and suspect individuals. Ekman and collaborators in this field like David Matsumoto formed companies, running workshops and holding consultations with corporations and public bodies about how to read subjects’ facial micro-expressions and behavioural cues to evaluate personality, truthfulness and potential danger. In 2001, the American Psychological Association named Ekman one of the most influential psychologists of the twentieth century (APA, 2002). 

The identification of emotions through facial expressions underwent digitalisation via machine learning techniques pioneered since the mid-1990s by Rosalind Picard and Rana el Kaliouby at Massachusetts Institute of Technology (MIT). They commercialised this new field of ‘affective computing’ via their venture capital–backed company Affectiva, founded in 2009, which provides emotional analysis software to businesses based on algorithms trained on large databases of facial expressions (Johnson, 2019). According to Affectiva, this enables a test subject’s emotional responses to, for example, TV commercials, to be tracked in real time. With the recent boom in facial recognition technology, emotion recognition represents a rapidly expanding area of AI development, used across industries, including recruitment and marketing research (Devlin, 2020). A growing number of companies offer emotion recognition services based on analysis of facial expressions, including Microsoft (Emotion application programming interface [API]), Amazon (Rekognition), Apple (Emotient, which Ekman advised on) and Google (Cloud Vision API).

Such systems are increasingly being used in border protection and law enforcement to identify dissimulating and otherwise suspect individuals, despite the lack of substantial evidence of efficacy. From 2007, the Transportation Security Administration (TSA) spent US$900 million on a ‘behaviour-detection programme’ entitled Screening Passengers by Observation Technique (SPOT), until it was ruled ineffective by the Department of Homeland Security and the Government Accountability Office (GAO, 2013). Ekman consulted on SPOT, and the system incorporated his techniques; his company also provided consulting services to US courts (Fischer, 2013). Another system, the Automated Virtual Agent for Truth Assessments in Real-Time (AVATAR), was developed for lie detection targeting migrants on the USA–Mexico border (Daniels, 2018), while the EU trialled the iBorderCtrl system, supplied by the consortium European Dynamics and funded by Horizon 2020, using the interpretation of micro-expressions to detect deceit among migrants in Hungary, Greece, and Latvia (Boffey, 2018; see also AI Now Institute, 2018, pp. 50–52).

Recently, this work on facial expression analysis for emotion recognition has come under increasing scrutiny despite its ongoing popularity among many psychologists. The most basic critique is that one does not necessarily smile when one is happy—common sense suggests that facial expressions do not always, or even often, map to inner feelings, that emotions are often fleeting or momentary, and that facial expressions and their meaning are highly dependent on sociocultural context. Barrett et al. (2019) summarise these and other critiques, arguing that approaches positing a limited number of prototypical basic emotions that can be ‘read’ through universal facial expressions fail to grasp what emotions are and what facial expressions convey. 

In anthropology, the ‘affective turn’ has drawn attention to the distinction between affect and emotion—the former a precognitive sensory response or potential to affect and be affected, and the latter a more culturally mediated expression of feeling. White describes this as the difference between ‘how bodies feel and how subjects make sense of how they feel’ (White, 2017, p. 177). These nuances are overlooked in the field of emotion recognition, which reduces emotion to a simplistic and digitally scalable model. Barrett argues that emotion is: a contingent act of perception that makes sense of the information coming in from the world around you, how your body is feeling in the moment, and everything you’ve ever been taught to understand as emotion. Culture to culture, person to person even, it’s never quite the same. (Fischer, 2013) 

We might, therefore, define the process of interpreting one’s own emotional state as making sense of an inner noise of biological signals and memories, in contextually contingent and socioculturally mediated ways, and placing them into—and in the process co-constructing—socioculturally mediated categories. It may also sometimes involve not definitively categorising or making sense of these affective feelings. As this article will show, it is the very ambiguity or malleability of this process that may help make vibraimage a convincing technology of emotion recognition and provide authority to its analysis. 

Given these growing critiques of Ekmanian theories of universal basic emotions expressed through facial expressions, researchers at the organisation AI Now have concluded that, by extension, the digital emotion detection industry is ‘built on markedly shaky foundations…. There remains little to no evidence that these new affect-recognition products have any scientific validity’ (AI Now Institute, 2018, p. 50). Baesler, similarly, argues that the use of emotion detection software by the TSA was ‘unconfirmed by peer-reviewed research and untested in the field’ (Baesler, 2015, pp. 60–61), while holding significant potential for harm through misuse. In common with broader critiques of AI from critical algorithm studies (e.g., Eubanks, 2018; Lum & Isaac, 2016), machine learning methods involved in emotion recognition systems have been criticised for racial bias, based on their training data sets (Rhue, 2018). Indeed, Ekman’s work not only constructs ethnocentric emotional categories but also racial subject categories, for example in his creation, with Matsumoto, of the Japanese and Caucasian Facial Expressions of Emotion stimulus set of photos showing emotional expressions of archetypal ‘Japanese’ and ‘Caucasian’ subjects (Biehl et al., 1997; https://www.humintell.com), which continues to be used in psychology experiments. For all of these reasons, the increasingly widespread application of this technology has raised growing ethical and civil liberties concerns.

'Automated Video Interviewing as the New Phrenology' by Ifeoma Ajunwa in (2022) 36 Berkeley Technology Law Journal 101 comments 

This Article deploys the new business practice of automated video interviewing as a case study to illuminate the limitations of traditional employment antidiscrimination laws. Employment antidiscrimination laws are inadequate to address unlawful discrimination attributable to emerging workplace technologies that gatekeep equal opportunity in employment. The Article shows how the practice of automated video interviewing is based on shaky or non-proven technological principles that disproportionately impact racial minorities. In this way, the practice of automated video interviewing is analogous to the pseudo-science of phrenology, which enabled societal and economic exclusion through the legitimization of eugenics and racist attitudes. After parsing the limitations of traditional anti-discrimination law to curtail emerging workplace technologies such as video interviewing, this Article argues that ex ante legal regulations, such as those derived from the late Professor Joel Reidenberg’s Lex Informatica framework, may be more effective than ex post remedies derived from the traditional employment antidiscrimination law regime. The Article argues that one major benefit of applying a Lex Informatica framework to video interviewing is developing legislation that considers the capabilities of the technology itself rather than how actors intend to use it. In the case of automated hiring, such an approach would mean actively using the Uniform Guidelines on Employee Selection Procedures to govern the design of automated hiring systems. For example, the guidelines could dictate design features for the collection of personal information and treatment of content. Other frameworks, such as Professor Pamela Samuelson’s “privacy as trade secrecy” approach could govern design features for how information from automated video interviewing systems may be transported and shared. Rather than reifying techno-solutionism, a focus on the technological capabilities of automated decision-making systems offers the opportunity for regulation to start at inception, which in turn could affect the development and design of the technology. This is a preemptive approach that sets standards for how the technology will be used and is a more proactive legal approach than merely addressing the negative consequences of the technology after they have occurred.

13 January 2023

Emotions

'Physiognomic Artificial Intelligence' by Luke Stark and Jevan Hutson in (2022) 32 Fordham Intellectual Property, Media & Entertainment Law Journal 922 comments

The reanimation of the pseudosciences of physiognomy and phrenology at scale through computer vision and machine learning is a matter of urgent concern. This Article, which contributes to critical data studies, consumer protection law, biometric privacy law, and anti-discrimination law, endeavors to conceptualize and problematize physiognomic artificial intelligence (AI) and offer policy recommendations for state and federal lawmakers to forestall its proliferation. 

Physiognomic AI, we contend, is the practice of using computer software and related systems to infer or create hierarchies of an individual’s body composition, protected class status, perceived character, capabilities, and future social outcomes based on their physical or behavioral characteristics. Physiognomic and phrenological logics are intrinsic to the technical mechanism of computer vision applied to humans. In this Article, we observe how computer vision is a central vector for physiognomic AI technologies, unpacking how computer vision reanimates physiognomy in conception, form, and practice and the dangers this trend presents for civil liberties. 

This Article thus argues for legislative action to forestall and roll back the proliferation of physiognomic AI. To that end, we consider a potential menu of safeguards and limitations to significantly limit the deployment of physiognomic AI systems, which we hope can be used to strengthen local, state, and federal legislation. We foreground our policy discussion by proposing the abolition of physiognomic AI. From there, we posit regimes of U.S. consumer protection law, biometric privacy law, and civil rights law as vehicles for rejecting physiognomy’s digital renaissance in artificial intelligence. Specifically, we argue that physiognomic AI should be categorically rejected as oppressive and unjust. Second, we argue that lawmakers should declare physiognomic AI to be unfair and deceptive per se. Third, we argue that lawmakers should enact or expand biometric privacy laws to prohibit physiognomic AI. Fourth, we argue that lawmakers should prohibit physiognomic AI in places of public accommodation. We also observe the paucity of procedural and managerial regimes of fairness, accountability, and transparency in addressing physiognomic AI and attend to potential counterarguments in support of physiognomic AI.

Stark's 'The emotive politics of digital mood tracking' in (2020) 22(11) New Media and Society 2039-2057 comments 

A decade ago, deploying digital tools to track human emotion and mood was something of a novelty. In 2013, the Pew Research Center’s Internet & American Life Project released a report on the subject of “Tracking for Health,” exploring the growing contingent of Americans keeping count of themselves and their activities through technologies ranging from paper and pencil to digital smart phone apps (Fox and Duggan, 2013). These systems generate what Natasha Dow Schüll terms more broadly “data for life” (Schüll, 2016), traces of our everyday doings as recorded in bits and bytes. Mood tracking, about which the survey asked, received so few affirmative responses that it did not rate at even 1% of positive answers.

Yet in the interim, emotion in the world of computational media has become big business (McStay, 2016, 2018; Stark, 2016, 2018b; Stark and Crawford, 2015). Using artificial intelligence (AI) techniques, social networks such as Twitter and Facebook have joined dedicated health-tracking applications in pioneering methods for the analysis of emotive and affective “data for life.” These mood-monitoring and affect-tracking technologies involve both active self-reporting by users (Korosec, 2014; Sundström et al., 2007) and the automated collection of behavioral data (Isomursu et al., 2007)—methods often collectively known as digital phenotyping (Jain et al., 2015), or the practice of measuring human behavior via smart phone sensors, keyboard interactions, and various other features of voice and speech (Insel, 2017). This continuum of technologies allows an analyst to extrapolate a range of information about the physiology, activity, behaviors, habits, and social interactions from everyday digital emanations (Kerr and McGill, 2007). 
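
As a concrete illustration of the passive side of digital phenotyping, the sketch below (ours, not Stark's; the timestamps are invented) derives simple typing-rhythm features from keystroke timestamps, the kind of behavioural proxy such systems feed into mood inference:

```python
# A minimal illustration of the kind of passive signal digital phenotyping
# works from: summary features of typing rhythm derived from keystroke
# timestamps, which some systems treat as behavioural proxies for mood or
# arousal. The timestamps here are invented.

import statistics

# Hypothetical keystroke timestamps in seconds for one typing session.
timestamps = [0.00, 0.21, 0.39, 0.62, 1.40, 1.55, 1.71, 2.60, 2.84, 3.01]

# Inter-keystroke intervals: the gaps between consecutive key presses.
intervals = [b - a for a, b in zip(timestamps, timestamps[1:])]

features = {
    "mean_interval_s": statistics.mean(intervals),
    "interval_sd_s": statistics.stdev(intervals),       # burstiness of typing
    "pause_count": sum(1 for i in intervals if i > 0.5)  # long hesitations
}
print(features)
```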

The past few years have also seen policymakers and the public becoming increasingly attuned to the political impacts of digital media technologies, including AI and machine learning (ML) systems (Barocas and Selbst, 2016; Crawford and Schultz, 2013; Diakopoulos, 2016). Citizens, activists, and elected politicians are eager to address the ways in which technical particularities of such systems influence social and political outcomes via design and deployment (Buolamwini and Gebru, 2018; Dourish, 2016; Johnson, 2018). Yet critical analyses and responses to these tools of what Zuboff (2019) terms “surveillance capitalism” must account for the role of human affect, emotion, and mood in surveillance capitalism’s extraction and contestation. As Raymond Williams observed, working toward an understanding of the barriers to economic and social justice means being first and foremost “concerned with meanings and values as they are actively lived and felt” (Williams, 1977: 132). 

Here, I perform a close reading and comparative values in design (VID) analysis (Flanagan and Nissenbaum, 2014; Friedman et al., 2006) of MoodPanda and Moodscope, two popular consumer applications for tracking mood. Human emotions themselves arise from a tangled nexus of biological, cultural, and contextual factors (Boehner et al., 2005; Sengers et al., 2008). As such, I argue that the design choices in each service shape the particular dynamics of political economy, sociality, and self-fashioning available to their users, and that these design decisions are exemplary of the ties between the computational politics of surveillance capitalism (Tufekci, 2014; Zuboff, 2019), and the quantification and classification of human emotion via digital mechanisms (Stark, 2018a). 

Drawing on Tufekci (2014, 2017), Papacharissi (2014), and others (Ahmed, 2004; Martin, 2007), I articulate how the affordances of mood-tracking services such as Moodscope and MoodPanda are indexical to a broader emerging emotive politics mediated and amplified by digital systems. The dynamics of emotive politics underpin many contemporary digitally mediated sociotechnical controversies, ranging from media manipulation by extremist actors, negative polarization, and “fake news,” to collective action problems around pressing global crises such as climate change. Human passions have always been understood as an element of political life, but the particular technical and social affordances of digital systems configure these responses in particular ways: emotive politics foreground contestations regarding how we as social actors should interact together in what Papacharissi (2014) terms as “affective publics,” and the weights and ways in which we as designers, participants, and citizens should treat human feeling as dispositive features of civic discourse. Mood tracking’s explicit engagement with human emotion as a mediated, embodied state points toward how emotive politics emerge out of designer expertise, technical features, and the social contexts and practices of everyday digital mediation (Dourish, 2004; Kuutti and Bannon, 2014). 

In this analysis, I also seek to highlight the ways in which user interface and experience (UI/UX) design shape political outcomes alongside the structures of algorithms and databases (Dourish, 2016; Montfort and Bogost, 2009: 145)—though the groundbreaking work of scholars such as Johanna Drucker (2014) and Lisa Nakamura (2009) means this insight should surprise no one. “The same quantitative modulations and numerical valuations required by the new information worker,” Alexander R. Galloway likewise observes, come “in a dazzling array of new cultural phenomena…to live today is to know how to use menus” (Galloway, 2006). Analyses taking interface design into account as an aspect of broader conversations around the fairness, ethics, and accountability of digital systems, which I seek to model here, will bolster the interdisciplinary work of interrogating the impact of these automated systems on our collective political future.

'The “Criminality from Face” Illusion' by Kevin W. Bowyer, Michael C. King, Walter Scheirer, and Kushal Vangara comments 

 Criminal or not? Is it possible to create an algorithm that analyzes an image of a person’s face and accurately labels the person as Criminal or Non-Criminal? Recent research tackling this problem has reported accuracy as high as 97% [14] using convolutional neural networks (CNNs). In this paper, we explain why the concept of an algorithm to compute “criminality from face,” and the high accuracies reported in recent publications, are an illusion. 

Facial analytics seek to infer something about an individual other than their identity. Facial analytics can predict, with some reasonable accuracy, things such as age [10], gender [6], race [9], facial expression / emotion [25], body mass index [5], and certain types of health conditions [29]. A few recent papers have attempted to extend facial analytics to infer criminality from face, where the task is to take a face image as input, and predict the status of the person as Criminal / Non-Criminal for output. This concept is illustrated in Figure 1.

One of these papers states that “As expected, the state-of-the-art CNN classifier performs the best, achieving 89.51% accuracy...These highly consistent results are evidences for the validity of automated face-induced inference on criminality, despite the historical controversy surrounding the topic” [40]. Another paper states that, “the test accuracy of 97%, achieved by CNN, exceeds our expectations and is a clear indicator of the possibility to differentiate between criminals and non-criminals using their facial images” [14]. (During the review period of this paper, we were informed by one of the authors of [14] that they had agreed with the journal to retract their paper.) A press release about another paper titled “A Deep Neural Network Model to Predict Criminality Using Image Processing” stated that “With 80 percent accuracy and with no racial bias, the software can predict if someone is a criminal based solely on a picture of their face. The software is intended to help law enforcement prevent crime.” The original press release generated so much controversy that it “was removed from the website at the request of the faculty involved” and replaced by a statement meant to defuse the situation: “The faculty are updating the paper to address concerns raised” [13].

Section II of this paper explains why the concept of an algorithm to compute criminality from face is an illusion. A useful solution to any general version of the problem is impossible. Sections III and IV explain how the impressive reported accuracy levels are readily accounted for by inadequate experimental design that has extraneous factors confounded with the Criminal / Non-Criminal labeling of images. Learning incidental properties of datasets rather than the intended concept is a well-known problem in computer vision. Section V explains how Psychology research on first impressions of a face image has been misinterpreted as suggesting that it is possible to accurately characterize true qualities of a person. Section VI briefly discusses the legacy of the Positivist School of criminology. Lastly, Section VII describes why the belief in the illusion of a criminality-from-face algorithm potentially has large, negative consequences for society.
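
The confound argument in Sections III and IV lends itself to a short demonstration. The sketch below is mine, not the authors': it fabricates synthetic "face" data in which the Criminal / Non-Criminal label is confounded with an incidental source artifact (a small brightness shift, standing in for differences such as ID photos versus web headshots), and it assumes numpy and scikit-learn are available. The classifier scores near-perfectly while learning nothing about any person.

```python
# Minimal sketch (synthetic data, hypothetical scenario) of how a dataset
# confound inflates accuracy. "Criminal" images are simulated as coming from
# one collection source and "non-criminal" images from another; the only
# signal is a small per-source brightness shift, not anything about people.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n, d = 1000, 64 * 64  # 1,000 fake 64x64 grayscale "face" images

labels = rng.integers(0, 2, size=n)        # 0 = non-criminal, 1 = criminal
faces = rng.normal(0.5, 0.1, size=(n, d))  # identical face statistics...
faces += (labels * -0.05)[:, None]         # ...plus a per-source brightness shift

X_train, X_test, y_train, y_test = train_test_split(
    faces, labels, test_size=0.3, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(f"test accuracy: {clf.score(X_test, y_test):.2%}")
# Near-100% accuracy from an artifact that has nothing to do with criminality.
```

Because any held-out split drawn from the same two sources inherits the artifact, within-dataset "test accuracy" cannot validate the concept; only data collected so as to break the confound could.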

16 April 2020

Facial Observation and Emotion Scanning

'The Inconsentability of Facial Surveillance' by Evan Selinger and Woodrow Hartzog in (2019) 66 Loyola Law Review 101 comments
 Governments and companies often use consent to justify the use of facial recognition technologies for surveillance. Many proposals for regulating facial recognition technology incorporate consent rules as a way to protect those faces that are being tagged and tracked. But consent is a broken regulatory mechanism for facial surveillance. The individual risks of facial surveillance are impossibly opaque, and our collective autonomy and obscurity interests aren’t captured or served by individual decisions. 
In this article, we argue that facial recognition technologies have a massive and likely fatal consent problem. We reconstruct some of Nancy Kim’s fundamental claims in Consentability: Consent and Its Limits, emphasizing how her consentability framework grants foundational priority to individual and social autonomy, integrates empirical insights into cognitive limitations that significantly impact the quality of human decision-making when granting consent, and identifies social, psychological, and legal impediments that allow the pace and negative consequences of innovation to outstrip the protections of legal regulation. 
We also expand upon Kim’s analysis by arguing that valid consent cannot be given for face surveillance. Even if valid individual consent to face surveillance were possible, permission for such surveillance is in irresolvable conflict with our collective autonomy and obscurity interests. Additionally, there is good reason to be skeptical of consent as the justification for any use of facial recognition technology, including facial characterization, verification, and identification.

'Emotional Expressions Reconsidered: Challenges to Inferring Emotion From Human Facial Movements' by Lisa Feldman Barrett, Ralph Adolphs, Stacy Marsella, Aleix M. Martinez and Seth D Pollak in (2019) 20(1) Psychological Science in the Public Interest comments 

It is commonly assumed that a person’s emotional state can be readily inferred from his or her facial movements, typically called emotional expressions or facial expressions. This assumption influences legal judgments, policy decisions, national security protocols, and educational practices; guides the diagnosis and treatment of psychiatric illness, as well as the development of commercial applications; and pervades everyday social interactions as well as research in other scientific fields such as artificial intelligence, neuroscience, and computer vision. In this article, we survey examples of this widespread assumption, which we refer to as the common view, and we then examine the scientific evidence that tests this view, focusing on the six most popular emotion categories used by consumers of emotion research: anger, disgust, fear, happiness, sadness, and surprise. The available scientific evidence suggests that people do sometimes smile when happy, frown when sad, scowl when angry, and so on, as proposed by the common view, more than would be expected by chance. Yet how people communicate anger, disgust, fear, happiness, sadness, and surprise varies substantially across cultures, situations, and even across people within a single situation. Furthermore, similar configurations of facial movements variably express instances of more than one emotion category. In fact, a given configuration of facial movements, such as a scowl, often communicates something other than an emotional state. Scientists agree that facial movements convey a range of information and are important for social communication, emotional or otherwise. But our review suggests an urgent need for research that examines how people actually move their faces to express emotions and other social information in the variety of contexts that make up everyday life, as well as careful study of the mechanisms by which people perceive instances of emotion in one another. We make specific research recommendations that will yield a more valid picture of how people move their faces to express emotions and how they infer emotional meaning from facial movements in situations of everyday life. This research is crucial to provide consumers of emotion research with the translational information they require.

The authors argue 

Faces are a ubiquitous part of everyday life for humans. People greet each other with smiles or nods. They have face-to-face conversations on a daily basis, whether in person or via computers. They capture faces with smartphones and tablets, exchanging photos of themselves and of each other on Instagram, Snapchat, and other social-media platforms. The ability to perceive faces is one of the first capacities to emerge after birth: An infant begins to perceive faces within the first few days of life, equipped with a preference for face-like arrangements that allows the brain to wire itself, with experience, to become expert at perceiving faces (Arcaro, Schade, Vincent, Ponce, & Livingstone, 2017; Cassia, Turati, & Simion, 2004; Gandhi, Singh, Swami, Ganesh, & Sinha, 2017; Grossmann, 2015; L. B. Smith, Jayaraman, Clerkin, & Yu, 2018; Turati, 2004; but see Young & Burton, 2018, for a more qualified claim). Faces offer a rich, salient source of information for navigating the social world: They play a role in deciding whom to love, whom to trust, whom to help, and who is found guilty of a crime (Todorov, 2017; Zebrowitz, 1997, 2017; Zhang, Chen, & Yang, 2018). 

Beginning with the ancient Greeks (Aristotle, in the 4th century BCE) and Romans (Cicero), various cultures have viewed the human face as a window on the mind. But to what extent can a raised eyebrow, a curled lip, or a narrowed eye reveal what someone is thinking or feeling, allowing a perceiver’s brain to guess what that someone will do next? The answers to these questions have major consequences for human outcomes as they unfold in the living room, the classroom, the courtroom, and even on the battlefield. They also powerfully shape the direction of research in a broad array of scientific fields, from basic neuroscience to psychiatry. 

Understanding what facial movements might reveal about a person’s emotions is made more urgent by the fact that many people believe they already know. Specific configurations of facial-muscle movements appear as if they summarily broadcast or display a person’s emotions, which is why they are routinely referred to as emotional expressions and facial expressions. A simple Google search for the phrase “emotional facial expressions” (see Box 1 in the Supplemental Material available online) reveals the ubiquity with which, at least in certain parts of the world, people believe that certain emotion categories are reliably signaled or revealed by certain facial-muscle movement configurations—a set of beliefs we refer to as the common view (also called the classical view; L. F. Barrett, 2017b). Likewise, many cultural products testify to the common view. Here are several examples:

  • Technology companies are investing tremendous resources to figure out how to objectively “read” emotions in people by detecting their presumed facial expressions, such as scowling faces, frowning faces, and smiling faces, in an automated fashion. Several companies claim to have already done it (e.g., Affectiva.com, 2018; Microsoft Azure, 2018). For example, Microsoft’s Emotion API promises to take video images of a person’s face to detect what that individual is feeling. Microsoft’s website states that its software “integrates emotion recognition, returning the confidence across a set of emotions . . . such as anger, contempt, disgust, fear, happiness, neutral, sadness, and surprise. These emotions are understood to be cross-culturally and universally communicated with particular facial expressions” (screen 3). 

  • Countless electronic messages are annotated with emojis or emoticons that are schematized versions of the proposed facial expressions for various emotion categories (Emojipedia.org, 2019). 

  • Putative emotional expressions are taught to preschool children by displaying scowling faces, frowning faces, smiling faces, and so on, in posters (e.g., use “feeling chart for children” in a Google image search), games (e.g., Miniland emotion games; Miniland Group, 2019), books (e.g., Cain, 2000; T. Parr, 2005), and episodes of Sesame Street (among many examples, see Morenoff, 2014; Pliskin, 2015; Valentine & Lehmann, 2015). 

  • Television shows (e.g., Lie to Me; Baum & Grazer, 2009), movies (e.g., Inside Out; Docter, Del Carmen, LeFauve, Cooley, & Lasseter, 2015), and documentaries (e.g., The Human Face, produced by the British Broadcasting Corporation; Cleese, Erskine, & Stewart, 2001) customarily depict certain facial configurations as universal expressions of emotions. 

  • Magazine and newspaper articles routinely feature stories in kind: facial configurations depicting a scowl are referred to as “expressions of anger,” facial configurations depicting a smile are referred to as “expressions of happiness,” facial configurations depicting a frown are referred to as “expressions of sadness,” and so on.  

  • Agents of the U.S. Federal Bureau of Investigation (FBI) and the Transportation Security Administration (TSA) were trained to detect emotions and other intentions using these facial configurations, with the goal of identifying and thwarting terrorists (R. Heilig, special agent with the FBI, personal communication, December 15, 2014; L. F. Barrett, 2017c). 

  • The facial configurations that supposedly diagnose emotional states also figure prominently in the diagnosis and treatment of psychiatric disorders. One of the most widely used tasks in autism research, the Reading the Mind in the Eyes Test, asks test takers to match photos of the upper (eye) region of a posed facial configuration with specific mental state words, including emotion words (Baron-Cohen, Wheelwright, Hill, Raste, & Plumb, 2001). Treatment plans for people living with autism and other brain disorders often include learning to recognize these facial configurations as emotional expressions (Baron-Cohen, Golan, Wheelwright, & Hill, 2004; Kouo & Egel, 2016). This training does not generalize well to real-world skills, however (Berggren et al., 2018; Kouo & Egel, 2016). 

  • “Reading” the emotions of a defendant — in the words of Supreme Court Justice Anthony Kennedy, to “know the heart and mind of the offender” (Riggins v. Nevada, 1992, p. 142) — is one pillar of a fair trial in the U.S. legal system and in many legal systems in the Western world. Legal actors such as jurors and judges routinely rely on facial movements to determine the guilt and remorse of a defendant (e.g., Bandes, 2014; Zebrowitz, 1997). For example, defendants who are perceived as untrustworthy receive harsher sentences than they otherwise would (J. P. Wilson & Rule, 2015, 2016), and such perceptions are more likely when a person appears to be angry (i.e., the person’s facial structure looks similar to the hypothesized facial expression of anger, which is a scowl; Todorov, 2017).

An incorrect inference about defendants’ emotional state can cost them their children, their freedom, or even their lives (for recent examples, see L. F. Barrett, 2017b, beginning on page 183). 

But can a person’s emotional state be reasonably inferred from that person’s facial movements? In this article, we offer a systematic review of the evidence, testing the common view that instances of an emotion category are signaled with a distinctive configuration of facial movements that has enough reliability and specificity to serve as a diagnostic marker of those instances. We focus our review on evidence pertaining to six emotion categories that have received the lion’s share of attention in scientific research—anger, disgust, fear, happiness, sadness, and surprise—and that, correspondingly, are the focus of the common view (as evidenced by our Google search, summarized in Box 1 in the Supplemental Material). Our conclusions apply, however, to all emotion categories that have thus far been scientifically studied. We open the article with a brief discussion of its scope, approach, and intended audience. We then summarize evidence on how people actually move their faces during episodes of emotion, referred to as studies of expression production, following which we examine evidence on which emotions are actually inferred from looking at facial movements, referred to as studies of emotion perception. We identify three key shortcomings in the scientific research that have contributed to a general misunderstanding about how emotions are expressed and perceived in facial movements and that limit the translation of this scientific evidence for other uses (a brief numeric sketch of the first two follows the list):

  • Limited reliability (i.e., instances of the same emotion category are neither reliably expressed through nor perceived from a common set of facial movements). 

  • Lack of specificity (i.e., there is no unique mapping between a configuration of facial movements and instances of an emotion category). 

  • Limited generalizability (i.e., the effects of context and culture have not been sufficiently documented and accounted for). 
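
As a gloss on the quoted list: the first two shortcomings can be read as statements about two conditional probabilities, P(configuration | emotion) for reliability and P(emotion | configuration) for specificity. The sketch below uses invented counts (my illustration, not data from the review) to show how a facial configuration can look moderately reliable yet still fail as a diagnostic marker.

```python
# Minimal sketch with invented counts (illustrative only; not data from the
# review). Rows are observed episodes of each state; columns are the facial
# configurations produced during them.
import numpy as np

states = ["anger", "sadness", "concentration"]
configs = ["scowl", "frown", "smile"]
counts = np.array([
    [30, 10, 10],   # 50 anger episodes: only 30 produced a scowl
    [10, 20, 20],   # 50 sadness episodes: frowns are not even a majority
    [25, 15, 10],   # 50 non-emotional concentration episodes
])

# Reliability: P(configuration | state) -- how often the "expected" face
# appears when the state actually occurs.
p_config_given_state = counts / counts.sum(axis=1, keepdims=True)

# Specificity: P(state | configuration) -- how uniquely a given face points
# back to one state.
p_state_given_config = counts / counts.sum(axis=0, keepdims=True)

print("P(scowl | anger) =", p_config_given_state[0, 0])            # 0.6
print("P(anger | scowl) =", round(p_state_given_config[0, 0], 2))  # 0.46
# A scowl appears in most anger episodes here, yet a perceiver who sees a
# scowl and infers anger is wrong more often than right.
```

The third shortcoming, limited generalizability, amounts to the warning that a table like this one estimated in one culture or context need not hold in another.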

We then discuss our conclusions, followed by proposals for consumers on how they might use the existing scientific literature. We also provide recommendations for future research on emotion production and perception with consumers of that research in mind. We have included additional detail on some topics of import or interest in the Supplemental Material.