Research

Our research focuses on the foundations of progress in computing: what the most important trends are, how they underpin economic prosperity, and how we can harness them to sustain and promote productivity growth. In our work, we draw on computer science, economics, and management to identify and understand trends in computing that create opportunities for (or pose risks to) our ability to sustain economic growth.

Featured Research

Beyond AI Exposure: Which Tasks are Cost-Effective to Automate with Computer Vision?
January 2024
Maja S. Svanberg, Wensu Li, Martin Fleming, Brian C. Goehring, Neil C. Thompson

The faster AI automation spreads through the economy, the more profound its potential impacts, both positive (improved productivity) and negative (worker displacement). The previous literature on “AI Exposure” cannot predict this pace of automation since it attempts to measure an overall potential for AI to affect an area, not the technical feasibility and economic attractiveness of building such systems. In this article, we present a new type of AI task automation model that is end-to-end, estimating: the level of technical performance needed to do a task, the characteristics of an AI system capable of that performance, and the economic choice of whether to build and deploy such a system. The result is a first estimate of which tasks are technically feasible and economically attractive to automate, and which are not. We focus on computer vision, where cost modeling is more developed. We find that at today’s costs, U.S. businesses would choose not to automate most vision tasks that have “AI Exposure,” and that only 23% of worker wages being paid for vision tasks would be attractive to automate. This slower roll-out of AI can be accelerated if costs fall rapidly or if AI is deployed via AI-as-a-service platforms that have greater scale than individual firms, both of which we quantify. Overall, our findings suggest that AI job displacement will be substantial, but also gradual, and therefore there is room for policy and retraining to mitigate unemployment impacts.
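For intuition, the economic-attractiveness question reduces to comparing the annualized cost of building and running a computer-vision system against the wage bill it would displace. The sketch below is a minimal illustration of that comparison; the cost figures, lifetimes, and helper functions are hypothetical placeholders, not parameters from the paper.

```python
# Minimal sketch of a "build vs. keep paying wages" comparison, in the spirit of
# an end-to-end task automation model. All numbers are illustrative assumptions,
# not estimates from Svanberg et al. (2024).

def annualized_system_cost(dev_cost, annual_running_cost, lifetime_years):
    """Spread the one-time development cost over the system's lifetime."""
    return dev_cost / lifetime_years + annual_running_cost

def is_attractive_to_automate(task_wage_bill, dev_cost, annual_running_cost,
                              lifetime_years=5):
    """A task is attractive to automate if the annualized system cost is
    below the annual wages currently paid for that task."""
    return annualized_system_cost(dev_cost, annual_running_cost,
                                  lifetime_years) < task_wage_bill

# Hypothetical firm: $120k/year in wages for a vision task; a bespoke system
# costs $450k up front plus $60k/year to run.
print(is_attractive_to_automate(task_wage_bill=120_000,
                                dev_cost=450_000,
                                annual_running_cost=60_000))  # False: keep the workers

# Under an AI-as-a-service platform that amortizes development across many
# firms, the effective up-front cost per firm is far lower, flipping the decision.
print(is_attractive_to_automate(task_wage_bill=120_000,
                                dev_cost=45_000,
                                annual_running_cost=60_000))  # True: automation pays off
```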

All Research


Reconceiving the National Research Enterprise

IEEE
October
2025
Christophe Combemale, Martin Fleming, Yong-Yeol Ahn, Cassidy Sugimoto

U.S. leadership in scientific research is at a transformative moment. Not only is artificial intelligence playing a disruptive, even revolutionary, role, but China has also emerged as a viable competitor. As a source of U.S. competitive advantage, innovation, transformation, and diffusion are vital. Scientific knowledge must spread beyond its original research context and be applied to diverse problems. In these times of industrial revolution, universities play a central role, but funding should prioritize institutions that demonstrate effective diffusion and talent retention at national and regional levels. Given the priorities set by federal leadership, decentralized decision-making freedom allows institutions to experiment with solutions. While technical advantages are temporary, sustained leadership requires stable talent pathways that support continuous and rapid innovation that delivers value to end users faster than national competitors can innovate. U.S. scientific competitiveness depends not only on discovery but also on the diffusion and retention of tacit knowledge, particularly through graduate students who embody and transfer expertise. Talent attraction is a U.S. strength, but retention is equally critical since departing researchers can weaken the nation's competitive position by transferring tacit knowledge abroad. To succeed, the U.S. must compete globally to provide

Quantum Advantage in Computational Chemistry?

PNAS NEXUS
August
2025
Hans Gundlach, Keeper Sharkey, Jayson Lynch, Victoria Hazoglou, Kung-Chuan Hsu, Carl Dukatz, Eleanor Crane, Karin Walczyk, Marcin Bodziak, Johannes Galatsanos-Dueck, Neil Thompson

For decades, computational chemistry has been posited as one of the areas that quantum computing would revolutionize. However, the algorithmic advantages that fault-tolerant quantum computers have for chemistry can be overwhelmed by other disadvantages, such as error-correction overhead and slower processor speeds. To assess when quantum computing will be disruptive to computational chemistry, we compare a wide range of classical methods to quantum computational methods by extending the framework proposed by Choi, Moses, and Thompson. Our approach accounts for the characteristics of classical and quantum algorithms and hardware, both today and as they improve. We find that in many cases, classical computational chemistry methods will likely remain superior to quantum algorithms for at least the next couple of decades. Nevertheless, quantum computers are likely to make important contributions in two areas. First, for simulations with tens or hundreds of atoms, highly accurate methods such as Full Configuration Interaction are likely to be surpassed by quantum phase estimation in the coming decade. Second, in cases where quantum phase estimation is most efficient, less accurate methods like Coupled Cluster and Møller–Plesset could be surpassed in fifteen to twenty years if the technical advancements for quantum computers are favorable. Overall, we find that in the next decade or so, quantum computing will be most impactful for highly accurate computations with small to medium-sized molecules, whereas classical computers will likely remain the typical choice for calculations of larger molecules.
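The underlying comparison is economic: a quantum method wins only when its slower, costlier operations are outweighed by better asymptotic scaling. The sketch below illustrates how such a crossover point can be estimated; the runtime models, constants, and improvement assumptions are illustrative stand-ins, not the calibrated values used in the paper.

```python
# Illustrative crossover estimate between a classical method with exponential
# scaling and a quantum method with polynomial scaling but a large per-operation
# overhead. All constants are assumptions for illustration only.

def classical_runtime(n, ops_per_sec=1e15, scaling_base=2.0):
    """Toy model: classical cost grows exponentially in system size n."""
    return scaling_base ** n / ops_per_sec            # seconds

def quantum_runtime(n, ops_per_sec=1e4, poly_degree=5, overhead=1e6):
    """Toy model: quantum cost grows polynomially in n, but each logical
    operation is slow and carries error-correction overhead."""
    return overhead * n ** poly_degree / ops_per_sec  # seconds

# Find the smallest system size at which the quantum model becomes faster.
crossover = next(n for n in range(10, 200)
                 if quantum_runtime(n) < classical_runtime(n))
print(f"Quantum model wins beyond roughly n = {crossover} (toy parameters)")
```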

Expertise

NBER Working Paper
June
2025
David Autor & Neil Thompson

When job tasks are automated, does this augment or diminish the value of labor in the tasks that remain? We argue the answer depends on whether removing tasks raises or reduces the expertise required for remaining non-automated tasks. Since the same task may be relatively expert in one occupation and inexpert in another, automation can simultaneously replace experts in some occupations while augmenting expertise in others. We propose a conceptual model of occupational task bundling that predicts that changing occupational expertise requirements have countervailing wage and employment effects: automation that decreases expertise requirements reduces wages but permits the entry of less expert workers; automation that raises requirements raises wages but reduces the set of qualified workers. We develop a novel, content-agnostic method for measuring job task expertise, and we use it to quantify changes in occupational expertise demands over four decades attributable to job task removal and addition. We document that automation has raised wages and reduced employment in occupations where it eliminated inexpert tasks, but lowered wages and increased employment in occupations where it eliminated expert tasks. These effects are distinct from—and in the case of employment, opposite to—the effects of changing task quantities. The expertise framework resolves the puzzle of why routine task automation has lowered employment but often raised wages in routine task-intensive occupations. It provides a general tool for analyzing how task automation and new task creation reshape the scarcity value of human expertise within and across occupations.
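The countervailing prediction can be made concrete with a toy bundle of tasks: automating the least-expert task raises the expertise required for what remains (pushing wages up and employment down), while automating the most-expert task does the opposite. The sketch below is a purely hypothetical illustration of that logic; it is not the content-agnostic expertise measure developed in the paper.

```python
# Toy illustration of the task-bundling logic: an occupation is a bundle of
# tasks with expertise scores, and its expertise requirement is summarized by
# the average expertise of the tasks that remain after automation.
# All task names and scores are hypothetical.

occupation = {"file paperwork": 1, "schedule clients": 2,
              "negotiate contracts": 4, "diagnose problems": 5}

def expertise_requirement(tasks):
    """Summarize the bundle by its average expertise score (a simple proxy)."""
    return sum(tasks.values()) / len(tasks)

# Automating the least-expert task raises the requirement of what remains,
# predicting higher wages but a smaller pool of qualified workers.
without_inexpert = {t: s for t, s in occupation.items() if t != "file paperwork"}

# Automating the most-expert task lowers the requirement of what remains,
# predicting lower wages but entry of less expert workers.
without_expert = {t: s for t, s in occupation.items() if t != "diagnose problems"}

print(expertise_requirement(occupation))         # 3.0  (baseline)
print(expertise_requirement(without_inexpert))   # ~3.67 (requirement rises)
print(expertise_requirement(without_expert))     # ~2.33 (requirement falls)
```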

The AI Risk Repository: A Comprehensive Meta-Review, Database, and Taxonomy of Risks From Artificial Intelligence

IEEE
August
2024
Peter Slattery, Alexander K. Saeri, Emily A. C. Grundy, Jess Graham, Michael Noetel, Risto Uuk, James Dao, Soroush Pour, Stephen Casper, Neil Thompson

The risks posed by Artificial Intelligence (AI) are of considerable concern to academics, auditors, policymakers, AI companies, and the public. However, a lack of shared understanding of AI risks can impede our ability to comprehensively discuss, research, and react to them. This paper addresses this gap by creating an AI Risk Repository to serve as a common frame of reference. This comprises a living database of 777 risks extracted from 43 taxonomies, which can be filtered based on two overarching taxonomies and easily accessed, modified, and updated via our website and online spreadsheets. We construct our Repository with a systematic review of taxonomies and other structured classifications of AI risk followed by an expert consultation. We develop our taxonomies of AI risk using a best-fit framework synthesis. Our high-level Causal Taxonomy of AI Risks classifies each risk by its causal factors: (1) Entity: Human, AI; (2) Intentionality: Intentional, Unintentional; and (3) Timing: Pre-deployment, Post-deployment. Our mid-level Domain Taxonomy of AI Risks classifies risks into seven AI risk domains: (1) Discrimination & toxicity, (2) Privacy & security, (3) Misinformation, (4) Malicious actors & misuse, (5) Human-computer interaction, (6) Socioeconomic & environmental, and (7) AI system safety, failures, & limitations. These are further divided into 23 subdomains. The AI Risk Repository is, to our knowledge, the first attempt to rigorously curate, analyze, and extract AI risk frameworks into a publicly accessible, comprehensive, extensible, and categorized risk database. This creates a foundation for a more coordinated, coherent, and complete approach to defining, auditing, and managing the risks posed by AI systems.
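Because the repository is organized as two orthogonal taxonomies over a flat database of risk entries, it lends itself to a simple record-and-filter representation. The sketch below is a hypothetical illustration of that structure using made-up entries and field names; it is not the repository's actual schema or spreadsheet layout.

```python
# Hypothetical record-and-filter sketch of a risk repository organized by a
# causal taxonomy (entity, intentionality, timing) and a domain taxonomy.
# Entries and field names are illustrative, not the actual database schema.
from dataclasses import dataclass

@dataclass
class Risk:
    description: str
    entity: str          # "Human" or "AI"
    intentionality: str  # "Intentional" or "Unintentional"
    timing: str          # "Pre-deployment" or "Post-deployment"
    domain: str          # one of the seven domains, e.g. "Privacy & security"

repository = [
    Risk("Model leaks personal data in responses", "AI", "Unintentional",
         "Post-deployment", "Privacy & security"),
    Risk("Developer ships a model with known unsafe failure modes", "Human",
         "Intentional", "Pre-deployment",
         "AI system safety, failures, & limitations"),
]

def filter_risks(risks, **criteria):
    """Return risks matching every given taxonomy field, e.g. entity='AI'."""
    return [r for r in risks
            if all(getattr(r, k) == v for k, v in criteria.items())]

print(filter_risks(repository, entity="AI", timing="Post-deployment"))
```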

User-generated content shapes judicial reasoning: Evidence from a randomized control trial on Wikipedia

INFORMS
March
2024
Neil C. Thompson, Xueyun Luo, Brian McKenzie, Edana Richardson, Brian Flanagan

Legal professionals have access to many different sources of knowledge, including user-generated Wikipedia articles that summarize previous judicial decisions (i.e., precedents). Although these Wikipedia articles are easily accessible, they have unknown provenance and reliability, and therefore using them in professional settings is problematic. Nevertheless, Wikipedia articles influence legal judgments, as we show using a first-of-its-kind randomized control trial on judicial decision making. We find that the presence of a Wikipedia article about Irish Supreme Court decisions makes it meaningfully more likely that the corresponding case will be cited as a precedent by judges in subsequent decisions. The language used in the Wikipedia article also influences the language used in judgments. These effects are only present for citations by the High Court and not for the higher levels of the judiciary (Court of Appeal and Supreme Court). The High Court faces larger caseloads, so this may indicate that settings with greater time pressures encourage greater reliance on Wikipedia. Our results add to the growing recognition that Wikipedia and other frequently accessed sources of user-generated content have profound effects on important social outcomes and that these effects extend farther than previously seen—into high-stakes settings where norms are supposed to restrict their use.

Beyond AI Exposure: Which Tasks are Cost-Effective to Automate with Computer Vision?

January
2024
Maja S. Svanberg, Wensu Li, Martin Fleming, Brian C. Goehring, Neil C. Thompson

The faster AI automation spreads through the economy, the more profound its potential impacts, both positive (improved productivity) and negative (worker displacement). The previous literature on “AI Exposure” cannot predict this pace of automation since it attempts to measure an overall potential for AI to affect an area, not the technical feasibility and economic attractiveness of building such systems. In this article, we present a new type of AI task automation model that is end-to-end, estimating: the level of technical performance needed to do a task, the characteristics of an AI system capable of that performance, and the economic choice of whether to build and deploy such a system. The result is a first estimate of which tasks are technically feasible and economically attractive to automate, and which are not. We focus on computer vision, where cost modeling is more developed. We find that at today’s costs, U.S. businesses would choose not to automate most vision tasks that have “AI Exposure,” and that only 23% of worker wages being paid for vision tasks would be attractive to automate. This slower roll-out of AI can be accelerated if costs fall rapidly or if AI is deployed via AI-as-a-service platforms that have greater scale than individual firms, both of which we quantify. Overall, our findings suggest that AI job displacement will be substantial, but also gradual, and therefore there is room for policy and retraining to mitigate unemployment impacts.

The Grand Illusion: The Myth of Software Portability and Implications for ML Progress

arXiv.org
September
2023
Fraser Mince, Dzung Dinh, Jonas Kgomo, Neil Thompson, Sara Hooker

Pushing the boundaries of machine learning often requires exploring different hardware and software combinations. However, the freedom to experiment across different tooling stacks can be at odds with the drive for efficiency, which has produced increasingly specialized AI hardware and incentivized consolidation around a narrow set of ML frameworks. Exploratory research can be restricted if software and hardware are co-evolving, making it even harder to stray from mainstream ideas that work well with popular tooling stacks. While this friction increasingly impacts the rate of innovation in machine learning, to our knowledge the lack of portability in tooling has not been quantified. In this work, we ask: How portable are popular ML software frameworks? We conduct a large-scale study of the portability of mainstream ML frameworks across different hardware types. Our findings paint an uncomfortable picture: frameworks can lose more than 40% of their key functions when ported to other hardware. Worse, even when functions are portable, their performance can slow down so severely that using them becomes untenable. Collectively, our results reveal how costly straying from a narrow set of hardware-software combinations can be, and suggest that specialization of hardware impedes innovation in machine learning research.
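One simple way to probe this kind of portability gap is to run the same framework operations on different device backends and record which calls fail. The PyTorch snippet below is a minimal sketch of that idea, assuming a machine with an optional CUDA or Apple MPS backend; it is not the benchmark suite used in the paper, and the chosen operations are arbitrary.

```python
# Minimal sketch: probe whether a few framework operations run on each
# available backend. Illustrative only; not the paper's benchmark suite.
import torch

def available_devices():
    devices = ["cpu"]
    if torch.cuda.is_available():
        devices.append("cuda")
    if torch.backends.mps.is_available():
        devices.append("mps")
    return devices

# A small, arbitrary set of operations to probe on each backend.
OPS = {
    "matmul":  lambda d: torch.randn(64, 64, device=d) @ torch.randn(64, 64, device=d),
    "fft":     lambda d: torch.fft.fft(torch.randn(1024, device=d)),
    "sort":    lambda d: torch.sort(torch.randn(1024, device=d)),
    "float64": lambda d: torch.randn(8, 8, dtype=torch.float64, device=d).inverse(),
}

for device in available_devices():
    failures = []
    for name, op in OPS.items():
        try:
            op(device)
        except Exception:
            failures.append(name)   # op unsupported (or broken) on this backend
    print(f"{device}: {len(OPS) - len(failures)}/{len(OPS)} ops ran; failed: {failures}")
```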

Democratising case law while teaching students: writing Wikipedia articles on legal cases

European Journal of Legal Education
June
2023
Edana Richardson, Brian McKenzie, Brian Flanagan, Neil C. Thompson, Maria Murphy

This article draws on qualitative student feedback and lecturer experience to provide a guide for educators who are interested in creating Wikipedia article-based assignments. Using legal cases as an example, this article details how these assignments can encourage students to deepen their understanding of a topic and consider how knowledge can be communicated effectively. In particular, this article focuses on how educators outside of the United States and Canada can navigate Wikipedia's bureaucracy and how they and their students can contribute information of relevance to smaller jurisdictions on a publicly accessible repository. This article begins by addressing concerns that educators may have with student use of Wikipedia, while highlighting pedagogical benefits for students who write Wikipedia articles. It goes on to provide a guide for educators who want to create a Wikipedia article writing assignment: in particular, the preparatory steps required to make the assignment effective, how to support students in their writing journey, and how to better ensure that student-authored articles remain available on Wikipedia. This article concludes by encouraging educators to consider using Wikipedia as an educational tool, and to teach their students how they can use Wikipedia article writing to contribute to public knowledge.

Intentional and serendipitous diffusion of ideas: Evidence from academic conferences

arXiv.org
September
2022
Misha Teplitskiy, Soya Park, Neil Thompson, David Karger

This paper investigates the effects of seeing ideas presented in-person when they are easily accessible online. Presentations may increase the diffusion of ideas intentionally (when one attends the presentation of an idea of interest) and serendipitously (when one sees other ideas presented in the same session). We measure these effects in the context of 25 computer science conferences using data from the scheduling application Confer, which lets users browse papers, Like those of interest, and receive schedules of their presentations. We address endogeneity concerns in presentation attendance by exploiting scheduling conflicts: when a user Likes multiple papers that are presented at the same time, she cannot see them both, potentially affecting their diffusion. Estimates show that being able to see presentations increases citing of Liked papers within two years by 1.5 percentage points (62.5% boost over the baseline citation rate). Attention to Liked papers also spills over to non-Liked papers in the same session, increasing their citing by 0.5 percentage points (125% boost), and this serendipitous diffusion represents 30.5% of the total effect. Both diffusion types were concentrated among papers semantically close to an attendee's prior work, suggesting that there are inefficiencies in finding related research that conferences help overcome. Overall, even when ideas are easily accessible online, in-person presentations substantially increase diffusion, much of it serendipitous.
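The reported effect sizes imply the baseline citation rates directly: a 1.5 percentage point increase that amounts to a 62.5% boost implies a baseline near 2.4%, and a 0.5 point increase at a 125% boost implies a baseline near 0.4%. The few lines below simply carry out that arithmetic as a sanity check on the abstract's numbers.

```python
# Back out the implied baseline citation rates from the reported effects:
# the absolute increase (in percentage points) divided by the relative boost
# gives the baseline rate.
liked_baseline     = 1.5 / 0.625   # 1.5 pp increase at a 62.5% boost -> 2.4%
non_liked_baseline = 0.5 / 1.25    # 0.5 pp increase at a 125% boost  -> 0.4%
print(liked_baseline, non_liked_baseline)   # 2.4 0.4 (percent)
```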

Trial by Internet: A Randomized Field Experiment on Wikipedia's Influence on Judges' Legal Reasoning

SSRN
August
2022
Neil Thompson, Brian Flanagan, Edana Richardson, Brian McKenzie, Xueyun Luo

In the common law tradition, legal decisions are supposed to be grounded in both statute and precedent, with legal training guiding practitioners on the most important and relevant touchstones. But actors in the legal system are also human, with the failings and foibles seen throughout society. This may lead them to take methodological shortcuts, even to relying on unknown internet users for determinations of a legal source's relevance. In this chapter, we investigate the influence on legal judgments of a pervasive but unauthoritative source of legal knowledge: Wikipedia. Using the first randomized field experiment ever undertaken in this area—the gold standard for identifying causal effects—we show that Wikipedia shapes judicial behavior. Wikipedia articles on decided cases, written by law students, guide both the decisions that judges cite as precedents and the textual content of their written opinions. The information and legal analysis offered on Wikipedia led judges to cite the relevant legal cases more often and to talk about them in ways comparable to how the Wikipedia authors had framed them. Collectively, our study provides clear empirical evidence of a new form of influence on judges' application of the law—easily accessible, user-generated online content. Because such content is not authoritative, our analysis reveals a policy gap: if easily accessible analysis of legal questions is already being relied on, it behooves the legal community to accelerate efforts to ensure that such analysis is both comprehensive and expert.

Tools that shape innovation

Decomposing the “Tacit Knowledge Problem:” Codification of Knowledge and Access in CRISPR Gene-Editing

Pre-print
November
2017
Neil Thompson, Samantha Zyontz

The ability to edit genes with CRISPR has, in a few short years, been transformative to genetics, generating follow-on science such as drought-resistant crops and mosquitos that cannot carry malaria, and is widely expected to win a Nobel prize. The rush of scientists to embrace CRISPR in its early days provides an important and data-rich environment in which to study how tacit information is codified and transferred for others to build upon. In particular, the introduction of CRISPR allows us to explore whether embedding the associated knowledge into an easy-to-distribute tool (a plasmid) and making it available through a biological resource center (Addgene) solves the "tacit information" problem and the physical localization that accompanies it. We show that codification into a scientific tool does solve the problem of access; scientists of equivalent caliber experiment with CRISPR in equal measure regardless of where in the U.S. they are. However, scientists across different geographies have unequal success in converting that experimentation into published science, suggesting that some tacit information problems persist. The remaining tacit information seems to be driven by expertise, with geographies specialized in mammalian CRISPR helping to create publishable science on mammals and geographies specialized in bacterial CRISPR helping to create publishable science on bacteria. Collectively, our case study of the earliest days of CRISPR speaks to the tacit information challenges that are, and are not, solved by distributing embedded materials.

Science Is Shaped by Wikipedia: Evidence From a Randomized Control Trial

Pre-print
September
2017
Neil Thompson, Douglas Hanley

“I sometimes think that general and popular treatises are almost as important for the progress of science as original work.” - Charles Darwin, 1865. As the largest encyclopedia in the world, it is not surprising that Wikipedia reflects the state of scientific knowledge. However, Wikipedia is also one of the most accessed websites in the world, including by scientists, which suggests that it also has the potential to shape science. This paper shows that it does. Incorporating ideas into Wikipedia leads to those ideas being used more in the scientific literature. We provide correlational evidence of this across thousands of Wikipedia articles and causal evidence of it through a randomized control trial where we add new scientific content to Wikipedia. In the months after uploading it, an average new Wikipedia article in Chemistry is read tens of thousands of times and causes changes to hundreds of related scientific journal articles. Patterns in these changes suggest that Wikipedia articles are used as review articles, summarizing an area of science and highlighting the research contributions to it. Consistent with this reference article view, we find causal evidence that when scientific articles are added as references to Wikipedia, those articles accrue more academic citations. Our findings speak not only to the influence of Wikipedia, but more broadly to the influence of repositories of knowledge and the role that they play in science.

University licensing and the flow of scientific knowledge

Research Policy
July
2017
Neil C Thompson, Arvids A Ziedonis, David C Mowery

As university involvement in technology transfer and entrepreneurship has increased, concerns over the patenting and licensing of scientific discoveries have grown. This paper examines the effect that the licensing of academic patents has on journal citations to academic publications covering the same scientific research. We analyze data on invention disclosures, patents, and licenses from the University of California, a leading U.S. academic patenter and licensor, between 1997 and 2007. We also develop a novel “inventor-based” maximum-likelihood matching technique to automate and generalize Murray’s (2002) “patent-paper pairs” methodology. We use this methodology to identify the scientific publications associated with University of California patents and licenses. Based on a “difference-in-differences” analysis, we find that within our sample of patented academic discoveries, citations to licensed patent-linked publications are higher in the three years after the license, although this difference is not statistically significant. But when we disaggregate our sample into (a) patented discoveries that are likely to be used as “research tools” by other researchers (based on the presence of material transfer agreements (MTAs) that cover them) and (b) patented discoveries not covered by MTAs, we find that for the latter subset, citations to publications linked to licensed patents are not only higher than those linked to unlicensed patents, but that this difference is statistically significant. In contrast, licensing of patented discoveries that are also research tools is associated with a reduction in citations to papers linked to these research advances, raising the possibility that licensing may restrict the flow of inputs to “follow-on” scientific research.
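The headline comparison is a difference-in-differences: how citations to patent-linked publications change after a license, relative to publications linked to unlicensed patents over the same period. The sketch below shows what a minimal version of that specification could look like with statsmodels; the dataframe, column names, and toy data are hypothetical stand-ins, not the paper's actual estimation.

```python
# Hypothetical difference-in-differences sketch: citations to patent-linked
# publications, before vs. after a license, for licensed vs. unlicensed patents.
# Column names and the toy data are illustrative, not the paper's dataset.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.DataFrame({
    "citations": [3, 5, 4, 9, 2, 3, 4, 4],
    "licensed":  [1, 1, 1, 1, 0, 0, 0, 0],   # 1 if the linked patent was licensed
    "post":      [0, 0, 1, 1, 0, 0, 1, 1],   # 1 for the years after the license date
})

# The coefficient on the interaction term is the difference-in-differences estimate.
model = smf.ols("citations ~ licensed * post", data=df).fit()
print(model.params["licensed:post"])
```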