'Lessons from the FDA for AI' (AI Now Institute) by Sarah Myers West and Amba Kak
When we initiated this project at the start of 2023, a growing chorus of voices was mobilizing in favor of stronger and, importantly, ex ante or premarket regulatory scrutiny for artificial intelligence. Industry leaders and regulators alike were calling for stronger standards to bring a sense of order and stability to the sector; the UK AI Summit even had major AI labs commit (in principle) to premarket testing of certain AI models; and there was a general, pervasive sense that some friction was necessary in the frenzied AI industry. The devil would be in the details: Would a licensing scheme, such as that lobbied for by Microsoft, bring stronger scrutiny to the sector, or would it tip the scales in favor of incumbents? How should responsibility be distributed among the many actors involved in development along the AI supply chain, from the base models to end products? Taking time and space to read and think deeply to arrive at viable answers seemed well worth doing.
As we write this executive summary in July 2024, enacting premarket enforcement of any kind seems like a distant prospect: the conviction that “something should be done” fell prey to a lack of political will to move on actual proposals, while those that did gain traction contained worrying carve-outs. Recent case law handed down by the Supreme Court may create further barriers to agency-led regulation, Congress remains sharply divided on the path forward after losing months to industry-centered deliberation, and Silicon Valley’s venture-capital class is rallying around a deregulatory agenda for AI as a central plank of the Republican presidential campaign. Why, amid these headwinds, read up on lessons from the Food and Drug Administration, the agency that oversees one of the most heavily regulated industries in the US?
As the report that follows illustrates, the example of the FDA is most instructive not as a road map for how to approach AI, but as a set of lessons on attuning ex ante regulation to an evolving market and its products. The question at hand is not whether we need an “FDA for AI,” since that crude formulation will inevitably lead to unhelpfully vague answers. Rather, how the FDA transformed the pharmaceutical sector in the United States, from a domain of snake oil salesmen and quack doctors to a market that produces lifesaving drugs that are tested rigorously enough for people around the world to travel to the US just to obtain them, holds key insights for regulatory debates on AI.
Amy Kapczynski, a member of the advisory council for this project, has written that we didn’t know much about how drugs worked until an agency existed to motivate companies to research them, producing the evidence necessary for entry into the market. One could think about the long-critiqued opacity and unreliability of AI systems along the same lines.
Because market entry hinges on approval by a regulator, the structure of the pharmaceutical market also provides strong motivations for compliance with the law. Similar motivations are lacking in AI, where regulatory fines often amount to a budget line that deep-pocketed tech firms build into their financial planning year by year.
Ex ante regulation also provides consistency and reliability for industry players and regulators alike, in contrast to the whack-a-mole approach that constitutes AI regulation at present.
And while many conversations about artificial intelligence center on risk management, from the FDA example we can learn about the necessity of also validating the efficacy of these systems, enabling a meaningful evaluation of the trade-offs between risks and benefits rather than relying on breezy assertions that AI is inherently innovative.
To be clear, this is not meant to portray the FDA as a shining beacon: if anything, it is a clear example of how regulatory hurdles hinder competition between firms, leading to bloat. Market incentives are often tipped against sufficient investment in providing affordable access to drugs for populations not seen as highly valued by corporate shareholders. The revolving door between regulators and industry has been and remains an endemic challenge, and real ethical concerns need to be raised about a funding model that relies on corporate fees to support testing and evaluation.
We do want to emphasize, however, that there is a strong argument for regulation that brings order to an otherwise unruly and unpredictable market: regulation that provides not only incentives for beneficial corporate behavior that the market alone would not induce, but also disincentives for risky behavior by any one entity that would tarnish the market as a whole, can offer significant value to industry and to the public at large.
As we launch this report amid greater uncertainty around imminent regulatory possibilities, we’re left with new questions to wrestle with: What is the landscape of possibility for regulation post-Chevron? How will First Amendment challenges be navigated for premarket regulatory proposals? How should AI regulators interface with the existing jurisdiction and authorities of sectoral regulators? How should we think about appropriate benchmarks for sociotechnical evaluation, both of the risks associated with AI systems and of their efficacy? Which audiences are most relevant to the increased generation of documentation and information about the AI market (for example, consumers, businesses, regulatory authorities, third-party auditors, journalists, and civil society members), and how does this lead to greater accountability? How should regulators ensure the information is meaningful and relevant to that audience? How can regulatory intervention be structured to incentivize compliance? What penalties will be most meaningful to this sector? How do we draw boundaries around the “AI industry” as the focus of regulatory scrutiny? What counts as an “AI firm”?
AI is already a regulated technology, and the companies developing and deploying AI are bound by existing law. As the frequent refrain of many US government officials goes, there is no AI exemption from the laws on the books. Despite this, the current regulatory environment for AI leaves much to be desired: penalties are paltry given the deep pockets of many tech firms; accountability relies almost entirely on ex post responses to harms surfaced by underresourced regulators and investigative journalists; and the haphazard nature of the regulatory approach makes it difficult, if not impossible, to clearly circumscribe the boundaries of the “AI market.”
Given these weaknesses, the idea of creating a novel regulatory agency for AI waxes and wanes alongside waves of attention to the sector, as we’ve outlined in Box 1. This report does not take a position, one way or the other, on whether we need a new regulatory agency for AI. Instead, it seeks to draw lessons from the analogical model most frequently referenced in relation to a stronger regulatory regime for AI: the Food and Drug Administration (FDA).
Delving deeply into this model surfaced the following key insights:
An “FDA for AI” is a blunt metaphor to build from. A more productive starting point would look at FDA-style regulatory interventions and how they may be targeted at different points in the AI supply chain:
Discussions about an “FDA for AI” often operate in a broad analogical manner—a blunt instrument for a conversation deserving of greater nuance. A supply chain approach to understanding AI development, rather than a wholesale porting over to AI of the functions of a large agency whose regulatory toolbox includes many varied approaches, adds useful conceptual clarity to conversations about appropriate regulatory design.
FDA-style interventions might be better suited for certain parts of the AI supply chain than others:
The FDA’s approach translates most directly at the level of the application or eventual use case, where it is most tractable to validate the safety and effectiveness of an AI product.
By contrast, attempting similar interventions at other stages of AI development, such as the base or “foundation model” layer, presents potentially intractable challenges, chief among them identifying in advance the universe of possible harms through empirical evaluation. Here, other regulatory design approaches, such as financial regulation and its treatment of systemic risk, may offer more useful corollaries.
At minimum, mandates for clear documentation of base models, including the data used to train them, will be necessary to enable evaluation at the application layer.
It is important to clearly differentiate between the “users” of AI applications, which are the entities procuring AI systems, and the people or communities the system is used on—the “subjects” of AI’s use. Often there is a significant power differential between “users” and “subjects,” which regulatory interventions must also account for.
The FDA model offers a powerful lesson in optimizing regulatory design for information production, rather than just product safety. This is urgently needed for AI given the lack of clarity on market participants and the structural opacity surrounding AI development and deployment.
The FDA has catalyzed and organized an entire field of expertise, enhancing our understanding of pharmaceuticals and creating and disseminating knowledge across stakeholders in ways that go far beyond understanding incidents in isolation. AI is markedly opaque in contrast: mapping the ecosystem of companies and actors involved in AI development (and thus subject to any accountability or safety interventions) is a challenging task absent regulatory intervention.
This information-production function is particularly important for AI, a domain where the difficulty, and at times impossibility, of interpretability and explainability remains a pressing challenge for the field, and where key players in the market are incentivized against transparency. Over time, the FDA’s interventions have expanded the public’s understanding of how drugs work by ensuring that firms invest in research and documentation to comply with a mandate to do so—prior to the existence of the agency, much of the pharmaceutical industry was opaque in ways that bear similarities to the AI market.
Many specific aspects of information exchange in the FDA model offer lessons for thinking about AI regulation. For example, in the context of pharmaceuticals, there is a focus on multistakeholder communication that requires ongoing information exchange between staff, expert panels, patients, and drug developers. Drug developers are mandated to submit troves of internal documentation, which the FDA then reformats for the public.
The FDA-managed databases of adverse events, clinical trials, and guidance documentation also offer key insights for AI incident reporting (an active field of research). They may motivate shifts in the AI development process, encouraging beneficial infrastructures for increasing transparency of deployment and clearer documentation.
The question of efficacy (as distinct from safety), on which there is currently little consensus, is a powerful entry point for regulating AI. There will always be potential harms from AI; the regulatory question thus must consider whether the benefits outweigh the harms. But to know that, we need clear evidence—which we currently lack—of the specific benefits offered by AI technologies.
A lesson from the FDA is that safety and efficacy of products must be evaluated in parallel. In the context of AI, policymaking has tended to index heavily on safety and harm and not focus as intently on evaluating or challenging the fundamental premise of efficacy, or on presenting a concrete appraisal of risks and benefits.
To serve the public interest, measures of efficacy should be considered carefully so that they are not primarily or solely indexed on profit or growth, but take into account benefits to society more generally. Regulatory approaches in AI should require developers of AI systems to explain how an AI system works, which societal problems it attempts to address, and what benefits it offers—not just to evaluate where it fails.
Efficacy evaluation could present an existential challenge to some domains and applications of AI where we currently lack the necessary methods to validate the ostensible benefits of AI usage, given widespread failures in machine learning research to reproduce the findings published in papers.
Premarket approval is potentially the most powerful stage of regulatory intervention: this is where alignment between regulatory power and companies’ incentives to comply reaches its peak.
Past the point of market entry, the FDA retains some ability to act in the public interest through market surveillance and recalls—but both its ability to act and its track record of doing so successfully drop significantly.
In the context of both the FDA and AI, assuring downstream compliance after a product enters the market is a regulatory challenge. Post-market surveillance is especially difficult for AI given the varied provenance of AI system components, yet it currently characterizes the bulk of ongoing AI regulatory enforcement.
Looking to the FDA analogy, downstream accountability occurs through mechanisms such as recalling products after the fact, though the agency’s ability to enact these remedies is weakened once products are in commercial use. Applied to AI, this is made even more challenging by the difficulty of clearly identifying the chain of provenance for particular components of AI systems.
In the context of the FDA, companies remain liable for harms caused to the public after drugs are made available for wide release, but establishing liability and then demonstrating causation in the AI context present significant barriers. Currently, the bulk of regulatory enforcement of existing law in AI occurs ex post, and is thus subject to these challenges.
To have teeth, any regulatory intervention targeting the AI sector must go far beyond the current standard of penalties to meaningfully challenge some of the biggest companies in the world.
The FDA model hinges on the agency’s ability to prevent pharmaceutical companies from marketing drugs to physicians—without which they cannot sell their products. Controlling this essential gate to market entry is what grants the FDA a big stick, critical to its effectiveness as a regulator; under present conditions, no analogous gates to market entry exist for AI companies.
The power of FDA regulation also comes from other actors in the system, from physicians to insurance companies, who can themselves refuse to recommend or cover a product if they believe it is not helpful. This has acted as an important second line of defense in pharmaceuticals in cases where the regulatory process has failed to be sufficiently rigorous; corollaries exist in other industries such as banking and insurance. This deserves stronger development in the context of AI, where the dependencies and sites of friction remain comparatively immature.
Greater transparency into what constitutes the market itself, and the process through which AI products are sold, will be important to AI governance. Currently, the contours of what constitutes the “AI market” are underspecified and opaque.
FDA regulation of pharmaceuticals is triggered by the “marketing” of a drug, which serves as a critical gate to entry. In other industries, the gates are built around the sale of certain products, an approach that may be preferable to marketing-based triggers given First Amendment concerns. Any attempt at sector-specific AI regulation will run into a thorny set of definitional questions: What constitutes the AI market, and how do products enter into commercial use? Moreover, it is key to be conceptually clear that the entity procuring an AI system is often not the same as the individual the system is used on; AI systems are frequently used by comparatively powerful entities on the less powerful, necessitating interventions that go beyond policing deceptive marketing to protect the interests of the public at large.
The funding model for a regulatory agency matters tremendously to its effectiveness, and can inadvertently make the regulator beholden to industry motives.
The FDA uses fees paid by industry players to fund its review process, which ensures adequate resourcing for reviews. Under the present model, however, the FDA must regularly submit its budgets to the fee-paying companies, making it accountable for its own spending to the very companies it reviews. This significantly weakens the agency’s power and risks handing industry leverage over the regulator.
FDA-style documentation requirements for AI would already be a step change from the current accountability vacuum in AI. Encouraging stronger monitoring and compliance activities within AI firms, such as recordkeeping and documentation practices, would generate organizational reflexivity as well as provide legal hooks for ex post enforcement.
Introducing FDA-style functions into the AI governance process could motivate a restructuring of the development practices, and potentially the operating model, of AI developers. In and of itself, this would create greater internal transparency and accountability within AI firms, conveying societal benefits and aiding the work of enforcement agencies when they need to investigate AI companies.