'The Invisible Cage: Workers’ Reactivity to Opaque Algorithmic Evaluations' by Hatim A. Rahman, (2021) 66(4) Administrative Science Quarterly
Existing research has shown that people experience third-party evaluations as a form of control because they try to align their behavior with evaluations’ criteria to secure more favorable resources, recognition, and opportunities from external audiences. Much of this research has focused on evaluations with transparent criteria, but increasingly, algorithmic evaluation systems are not transparent. Drawing on over three years of interviews, archival data, and observations as a registered user on a labor platform, I studied how freelance workers contend with an opaque third-party evaluation algorithm—and with what consequences. My findings show that the platform implemented an opaque evaluation algorithm to meaningfully differentiate between freelancers’ rating scores. Freelancers experienced this evaluation as a form of control but could not align their actions with its criteria because they could not clearly identify those criteria. I found that freelancers had divergent responses to this situation: some experimented with ways to improve their rating scores, and others constrained their activity on the platform. Their reactivity differed based not only on their general success on the platform—whether they were high or low performers—but also on how much they depended on the platform for work and whether they experienced setbacks in the form of decreased evaluation scores. These workers experienced what I call an “invisible cage”: a form of control in which the criteria for success, and changes to those criteria, are unpredictable. For gig workers who rely on labor platforms, this form of control increasingly determines their access to clients and projects while undermining their ability to understand and respond to the factors that determine their success.
Third-party evaluations are a central feature of today’s societal and organizational landscape (Lamont, 2012; Sharkey and Bromley, 2015; Espeland and Sauder, 2016). Studies have shown that third-party rating evaluations of actors such as doctors (RateMDs), professors (RateMyProfessors), hotels (TripAdvisor), restaurants (Yelp), corporations (Forbes), and universities (U.S. News & World Report) provide a sense of transparency and accountability for external audiences (Strathern, 2000; Power, 2010; Orlikowski and Scott, 2014). Audiences also use third-party evaluations to form their perceptions and make decisions about the evaluated actor (Karpik, 2010). As a result, these evaluations influence the resources, recognition, and opportunities actors receive from external audiences (Pope, 2009; Brandtner, 2017). As the prevalence and influence of third-party evaluation systems have increased, researchers have examined how actors subject to such systems react to them (Jin and Leslie, 2003; Espeland and Sauder, 2007; Chatterji and Toffel, 2010). For example, because university admissions, funding, and recognition are influenced by third-party evaluations (e.g., U.S. News & World Report), university administrators and faculty pay close attention to the criteria these evaluations use, such as career placement statistics, and change their behavior to better align with them (Sauder and Espeland, 2009; Espeland and Sauder, 2016).
Consequently, while prior work has shown that third-party evaluations often provide transparency and accountability for external audiences, it has also suggested that actors subject to third-party evaluations experience them as a form of control (Espeland and Sauder, 2016; Brandtner, 2017; Kornberger, Pflueger, and Mouritsen, 2017). Because third-party evaluations can influence actors’ ability to secure resources and recognition from their primary audiences, actors will likely internalize evaluations’ criteria and change their behavior to conform to those standards (Sauder and Espeland, 2009; Masum and Tovey, 2011). Scholars label the phenomenon of people changing their perceptions and behavior in response to being evaluated as “reactivity” (Espeland and Sauder, 2007).
Technological advancements have expanded the use of third-party evaluations to new areas of work and organizing, raising new questions in this domain (Fourcade and Healy, 2016; Kuhn and Maleki, 2017; Cameron, 2021). Nowhere is this more evident than in the rise of labor platforms and their use of third-party evaluations to assess workers. While several types of platforms exist (Davis, 2016; Sundararajan, 2016), those most relevant to this study are labor platforms facilitating gig work, such as Upwork, TopCoder, and TaskRabbit. They provide a digital infrastructure to connect clients with freelance job seekers for relatively short-term projects. Labor platforms have attracted increased attention from work and organizational scholars because they differ from intermediaries and exchange systems previously studied (Vallas and Schor, 2020; Rahman and Valentine, 2021; Stark and Pais, 2021), particularly in their use of evaluations (Kornberger, Pflueger, and Mouritsen, 2017).
Unlike previously studied settings, in which third-party evaluation criteria are relatively transparent to those being evaluated, in labor platforms these criteria are often opaque to workers. This opacity makes it easier for platforms and clients to differentiate among workers by using their evaluation scores, because it is more difficult for workers to game and inflate the evaluation system than in traditional settings (Filippas, Horton, and Golden, 2019; Garg and Johari, 2020). Platforms’ use of opacity in worker evaluations raises an underexplored question: how do opaque third-party evaluations influence workers’ reactivity, and what mechanisms contribute to this form of reactivity? While existing organizational research (Proctor, 2008; Briscoe and Murphy, 2012; Burrell, 2016) has broadly suggested that opacity will make it more difficult for workers to understand evaluation criteria, we lack grounded theory examining how workers contend with such opacity—and with what consequences.
To address this gap, I studied one of the largest labor platforms focused on higher-level project work, such as software engineering, design, and data analytics. The platform implemented an opaque algorithmic rating evaluation to better differentiate which freelancers should be visible to clients and to prevent freelancers from gaming their scores. Freelancers tried but generally failed to understand the evaluation’s inputs, processing, and output, which led them to experience the opaque evaluation as a system of control characterized by unpredictable updates, fluctuating criteria, and a lack of opportunities to improve their scores. These experiences were especially frustrating because the opacity contrasted with workers’ expectations, formed through previous experience in traditional organizations, that evaluation systems exist mainly to help workers improve (Cappelli and Conyon, 2018).
I observed that freelancers responded to evaluation opacity with two types of reactivity: they either tested different tactics to increase their scores, such as working on various types of projects and with different contract lengths, or they tried to preserve their scores by limiting their engagement with the platform, such as by working with platform-based clients outside the platform and declining new clients. This was the case for workers with both higher and lower scores on the platform. Two mechanisms determined their type of reactivity: the extent to which freelancers depended on the platform for work and income, and whether they experienced decreases in their evaluation scores (regardless of whether those scores started out higher or lower). My findings support the argument that opaque third-party evaluations can create an “invisible cage” for workers, because they experience such evaluations as a form of control yet cannot decipher, or learn from, the criteria for success or the changes to those criteria.