Research

Our research focuses on the foundations of progress in computing: what the most important trends are, how they underpin economic prosperity, and how we can harness them to sustain and promote productivity growth. In our work, we draw on computer science, economics, and management to identify and understand trends in computing that create opportunities for (or pose risks to) our ability to sustain economic growth.

Featured Research

Beyond AI Exposure: Which Tasks are Cost-Effective to Automate with Computer Vision?
January 2024
Maja S. Svanberg, Wensu Li, Martin Fleming, Brian C. Goehring, Neil C. Thompson

The faster AI automation spreads through the economy, the more profound its potential impacts, both positive (improved productivity) and negative (worker displacement). The previous literature on “AI Exposure” cannot predict this pace of automation since it attempts to measure an overall potential for AI to affect an area, not the technical feasibility and economic attractiveness of building such systems. In this article, we present a new type of AI task automation model that is end-to-end, estimating: the level of technical performance needed to do a task, the characteristics of an AI system capable of that performance, and the economic choice of whether to build and deploy such a system. The result is a first estimate of which tasks are technically feasible and economically attractive to automate, and which are not. We focus on computer vision, where cost modeling is more developed. We find that at today’s costs, U.S. businesses would choose not to automate most vision tasks that have “AI Exposure,” and that only 23% of worker wages being paid for vision tasks would be attractive to automate. This slower roll-out of AI can be accelerated if costs fall rapidly or if AI is deployed via AI-as-a-service platforms that have greater scale than individual firms, both of which we quantify. Overall, our findings suggest that AI job displacement will be substantial, but also gradual, and therefore there is room for policy and retraining to mitigate unemployment impacts.
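For intuition, the economic-attractiveness question reduces to comparing the annualized cost of building and running a computer-vision system against the wage bill it would displace. The sketch below is a minimal illustration of that comparison; the cost figures, lifetimes, and helper functions are hypothetical placeholders, not parameters from the paper.

```python
# Minimal sketch of a "build vs. keep paying wages" comparison, in the spirit of
# an end-to-end task automation model. All numbers are illustrative assumptions,
# not estimates from Svanberg et al. (2024).

def annualized_system_cost(dev_cost, annual_running_cost, lifetime_years):
    """Spread the one-time development cost over the system's lifetime."""
    return dev_cost / lifetime_years + annual_running_cost

def is_attractive_to_automate(task_wage_bill, dev_cost, annual_running_cost,
                              lifetime_years=5):
    """A task is attractive to automate if the annualized system cost is
    below the annual wages currently paid for that task."""
    return annualized_system_cost(dev_cost, annual_running_cost,
                                  lifetime_years) < task_wage_bill

# Hypothetical firm: $120k/year in wages for a vision task; a bespoke system
# costs $450k up front plus $60k/year to run.
print(is_attractive_to_automate(task_wage_bill=120_000,
                                dev_cost=450_000,
                                annual_running_cost=60_000))  # False: keep the workers

# Under an AI-as-a-service platform that amortizes development across many
# firms, the effective up-front cost per firm is far lower, flipping the decision.
print(is_attractive_to_automate(task_wage_bill=120_000,
                                dev_cost=45_000,
                                annual_running_cost=60_000))  # True: automation pays off
```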

All Research


Reconceiving the National Research Enterprise

IEEE
October
2025
Christophe Combemale, Martin Fleming, Yong-Yeol Ahn, Cassidy Sugimoto

U.S. leadership in scientific research is at a transformative moment. Not only is artificial intelligence playing a disruptive, even revolutionary, role, but China has also emerged as a viable competitor. As a source of U.S. competitive advantage, innovation, transformation, and diffusion are vital. Scientific knowledge must spread beyond its original research context and be applied to diverse problems. In these times of industrial revolution, universities play a central role, but funding should prioritize institutions that demonstrate effective diffusion and talent retention at national and regional levels. Given the priorities set by federal leadership, decentralized decision-making freedom allows institutions to experiment with solutions. While technical advantages are temporary, sustained leadership requires stable talent pathways that support continuous and rapid innovation that delivers value to end users faster than national competitors can innovate. U.S. scientific competitiveness depends not only on discovery but also on the diffusion and retention of tacit knowledge, particularly through graduate students who embody and transfer expertise. Talent attraction is a U.S. strength, but retention is equally critical since departing researchers can weaken the nation's competitive position by transferring tacit knowledge abroad. To succeed, the U.S. must compete globally to provide

Quantum Advantage in Computational Chemistry?

PNAS NEXUS
August
2025
Hans Gundlach, Keeper Sharkey, Jayson Lynch, Victoria Hazoglou, Kung-Chuan Hsu, Carl Dukatz, Eleanor Crane, Karin Walczyk, Marcin Bodziak, Johannes Galatsanos-Dueck, Neil Thompson

For decades, computational chemistry has been posited as one of the areas that quantum computing would revolutionize. However, the algorithmic advantages that fault-tolerant quantum computers have for chemistry can be overwhelmed by other disadvantages, such as error-correction overhead and slower processor speeds. To assess when quantum computing will be disruptive to computational chemistry, we compare a wide range of classical methods to quantum computational methods by extending the framework proposed by Choi, Moses, and Thompson. Our approach accounts for the characteristics of classical and quantum algorithms and hardware, both today and as they improve. We find that in many cases, classical computational chemistry methods will likely remain superior to quantum algorithms for at least the next couple of decades. Nevertheless, quantum computers are likely to make important contributions in two areas. First, for simulations with tens or hundreds of atoms, highly accurate methods such as Full Configuration Interaction are likely to be surpassed by quantum phase estimation in the coming decade. Second, in cases where quantum phase estimation is most efficient, less accurate methods like Coupled Cluster and Møller–Plesset could be surpassed in fifteen to twenty years if the technical advancements for quantum computers are favorable. Overall, we find that in the next decade or so, quantum computing will be most impactful for highly accurate computations with small to medium-sized molecules, whereas classical computers will likely remain the typical choice for calculations of larger molecules.
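The underlying comparison is economic: a quantum method wins only when its slower, costlier operations are outweighed by better asymptotic scaling. The sketch below illustrates how such a crossover point can be estimated; the runtime models, constants, and improvement assumptions are illustrative stand-ins, not the calibrated values used in the paper.

```python
# Illustrative crossover estimate between a classical method with exponential
# scaling and a quantum method with polynomial scaling but a large per-operation
# overhead. All constants are assumptions for illustration only.

def classical_runtime(n, ops_per_sec=1e15, scaling_base=2.0):
    """Toy model: classical cost grows exponentially in system size n."""
    return scaling_base ** n / ops_per_sec            # seconds

def quantum_runtime(n, ops_per_sec=1e4, poly_degree=5, overhead=1e6):
    """Toy model: quantum cost grows polynomially in n, but each logical
    operation is slow and carries error-correction overhead."""
    return overhead * n ** poly_degree / ops_per_sec  # seconds

# Find the smallest system size at which the quantum model becomes faster.
crossover = next(n for n in range(10, 200)
                 if quantum_runtime(n) < classical_runtime(n))
print(f"Quantum model wins beyond roughly n = {crossover} (toy parameters)")
```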

Expertise

NBER Working Paper
June
2025
David Autor & Neil Thompson

When job tasks are automated, does this augment or diminish the value of labor in the tasks that remain? We argue the answer depends on whether removing tasks raises or reduces the expertise required for remaining non-automated tasks. Since the same task may be relatively expert in one occupation and inexpert in another, automation can simultaneously replace experts in some occupations while augmenting expertise in others. We propose a conceptual model of occupational task bundling that predicts that changing occupational expertise requirements have countervailing wage and employment effects: automation that decreases expertise requirements reduces wages but permits the entry of less expert workers; automation that raises requirements raises wages but reduces the set of qualified workers. We develop a novel, content-agnostic method for measuring job task expertise, and we use it to quantify changes in occupational expertise demands over four decades attributable to job task removal and addition. We document that automation has raised wages and reduced employment in occupations where it eliminated inexpert tasks, but lowered wages and increased employment in occupations where it eliminated expert tasks. These effects are distinct from—and in the case of employment, opposite to—the effects of changing task quantities. The expertise framework resolves the puzzle of why routine task automation has lowered employment but often raised wages in routine task-intensive occupations. It provides a general tool for analyzing how task automation and new task creation reshape the scarcity value of human expertise within and across occupations.
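The countervailing prediction can be made concrete with a toy bundle of tasks: automating the least-expert task raises the expertise required for what remains (pushing wages up and employment down), while automating the most-expert task does the opposite. The sketch below is a purely hypothetical illustration of that logic; it is not the content-agnostic expertise measure developed in the paper.

```python
# Toy illustration of the task-bundling logic: an occupation is a bundle of
# tasks with expertise scores, and its expertise requirement is summarized by
# the average expertise of the tasks that remain after automation.
# All task names and scores are hypothetical.

occupation = {"file paperwork": 1, "schedule clients": 2,
              "negotiate contracts": 4, "diagnose problems": 5}

def expertise_requirement(tasks):
    """Summarize the bundle by its average expertise score (a simple proxy)."""
    return sum(tasks.values()) / len(tasks)

# Automating the least-expert task raises the requirement of what remains,
# predicting higher wages but a smaller pool of qualified workers.
without_inexpert = {t: s for t, s in occupation.items() if t != "file paperwork"}

# Automating the most-expert task lowers the requirement of what remains,
# predicting lower wages but entry of less expert workers.
without_expert = {t: s for t, s in occupation.items() if t != "diagnose problems"}

print(expertise_requirement(occupation))         # 3.0  (baseline)
print(expertise_requirement(without_inexpert))   # ~3.67 (requirement rises)
print(expertise_requirement(without_expert))     # ~2.33 (requirement falls)
```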

The AI Risk Repository: A Comprehensive Meta-Review, Database, and Taxonomy of Risks From Artificial Intelligence

IEEE
August
2024
Peter Slattery, Alexander K. Saeri, Emily A. C. Grundy, Jess Graham, Michael Noetel, Risto Uuk, James Dao, Soroush Pour, Stephen Casper, Neil Thompson

The risks posed by Artificial Intelligence (AI) are of considerable concern to academics, auditors, policymakers, AI companies, and the public. However, a lack of shared understanding of AI risks can impede our ability to comprehensively discuss, research, and react to them. This paper addresses this gap by creating an AI Risk Repository to serve as a common frame of reference. This comprises a living database of 777 risks extracted from 43 taxonomies, which can be filtered based on two overarching taxonomies and easily accessed, modified, and updated via our website and online spreadsheets. We construct our Repository with a systematic review of taxonomies and other structured classifications of AI risk followed by an expert consultation. We develop our taxonomies of AI risk using a best-fit framework synthesis. Our high-level Causal Taxonomy of AI Risks classifies each risk by its causal factors: (1) Entity: Human, AI; (2) Intentionality: Intentional, Unintentional; and (3) Timing: Pre-deployment, Post-deployment. Our mid-level Domain Taxonomy of AI Risks classifies risks into seven AI risk domains: (1) Discrimination & toxicity, (2) Privacy & security, (3) Misinformation, (4) Malicious actors & misuse, (5) Human-computer interaction, (6) Socioeconomic & environmental, and (7) AI system safety, failures, & limitations. These are further divided into 23 subdomains. The AI Risk Repository is, to our knowledge, the first attempt to rigorously curate, analyze, and extract AI risk frameworks into a publicly accessible, comprehensive, extensible, and categorized risk database. This creates a foundation for a more coordinated, coherent, and complete approach to defining, auditing, and managing the risks posed by AI systems.
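Because the repository is organized as two orthogonal taxonomies over a flat database of risk entries, it lends itself to a simple record-and-filter representation. The sketch below is a hypothetical illustration of that structure using made-up entries and field names; it is not the repository's actual schema or spreadsheet layout.

```python
# Hypothetical record-and-filter sketch of a risk repository organized by a
# causal taxonomy (entity, intentionality, timing) and a domain taxonomy.
# Entries and field names are illustrative, not the actual database schema.
from dataclasses import dataclass

@dataclass
class Risk:
    description: str
    entity: str          # "Human" or "AI"
    intentionality: str  # "Intentional" or "Unintentional"
    timing: str          # "Pre-deployment" or "Post-deployment"
    domain: str          # one of the seven domains, e.g. "Privacy & security"

repository = [
    Risk("Model leaks personal data in responses", "AI", "Unintentional",
         "Post-deployment", "Privacy & security"),
    Risk("Developer ships a model with known unsafe failure modes", "Human",
         "Intentional", "Pre-deployment",
         "AI system safety, failures, & limitations"),
]

def filter_risks(risks, **criteria):
    """Return risks matching every given taxonomy field, e.g. entity='AI'."""
    return [r for r in risks
            if all(getattr(r, k) == v for k, v in criteria.items())]

print(filter_risks(repository, entity="AI", timing="Post-deployment"))
```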

User-generated content shapes judicial reasoning: Evidence from a randomized control trial on Wikipedia

INFORMS
March
2024
Neil C. Thompson, Xueyun Luo, Brian McKenzie, Edana Richardson, Brian Flanagan

Legal professionals have access to many different sources of knowledge, including user-generated Wikipedia articles that summarize previous judicial decisions (i.e., precedents). Although these Wikipedia articles are easily accessible, they have unknown provenance and reliability, and therefore using them in professional settings is problematic. Nevertheless, Wikipedia articles influence legal judgments, as we show using a first-of-its-kind randomized control trial on judicial decision making. We find that the presence of a Wikipedia article about Irish Supreme Court decisions makes it meaningfully more likely that the corresponding case will be cited as a precedent by judges in subsequent decisions. The language used in the Wikipedia article also influences the language used in judgments. These effects are only present for citations by the High Court and not for the higher levels of the judiciary (Court of Appeal and Supreme Court). The High Court faces larger caseloads, so this may indicate that settings with greater time pressures encourage greater reliance on Wikipedia. Our results add to the growing recognition that Wikipedia and other frequently accessed sources of user-generated content have profound effects on important social outcomes and that these effects extend farther than previously seen—into high-stakes settings where norms are supposed to restrict their use.

Beyond AI Exposure: Which Tasks are Cost-Effective to Automate with Computer Vision?

January
2024
Maja S. Svanberg, Wensu Li, Martin Fleming, Brian C. Goehring, Neil C. Thompson

The faster AI automation spreads through the economy, the more profound its potential impacts, both positive (improved productivity) and negative (worker displacement). The previous literature on “AI Exposure” cannot predict this pace of automation since it attempts to measure an overall potential for AI to affect an area, not the technical feasibility and economic attractiveness of building such systems. In this article, we present a new type of AI task automation model that is end-to-end, estimating: the level of technical performance needed to do a task, the characteristics of an AI system capable of that performance, and the economic choice of whether to build and deploy such a system. The result is a first estimate of which tasks are technically feasible and economically attractive to automate, and which are not. We focus on computer vision, where cost modeling is more developed. We find that at today’s costs, U.S. businesses would choose not to automate most vision tasks that have “AI Exposure,” and that only 23% of worker wages being paid for vision tasks would be attractive to automate. This slower roll-out of AI can be accelerated if costs fall rapidly or if AI is deployed via AI-as-a-service platforms that have greater scale than individual firms, both of which we quantify. Overall, our findings suggest that AI job displacement will be substantial, but also gradual, and therefore there is room for policy and retraining to mitigate unemployment impacts.

The Grand Illusion: The Myth of Software Portability and Implications for ML Progress

arXiv.org
September
2023
Fraser Mince, Dzung Dinh, Jonas Kgomo, Neil Thompson, Sara Hooker

Pushing the boundaries of machine learning often requires exploring different hardware and software combinations. However, the freedom to experiment across different tooling stacks can be at odds with the drive for efficiency, which has produced increasingly specialized AI hardware and incentivized consolidation around a narrow set of ML frameworks. Exploratory research can be restricted if software and hardware are co-evolving, making it even harder to stray from mainstream ideas that work well with popular tooling stacks. While this friction increasingly impacts the rate of innovation in machine learning, to our knowledge the lack of portability in tooling has not been quantified. In this work, we ask: How portable are popular ML software frameworks? We conduct a large-scale study of the portability of mainstream ML frameworks across different hardware types. Our findings paint an uncomfortable picture: frameworks can lose more than 40% of their key functions when ported to other hardware. Worse, even when functions are portable, their performance can slow down so severely that using them becomes untenable. Collectively, our results reveal how costly straying from a narrow set of hardware-software combinations can be, and suggest that specialization of hardware impedes innovation in machine learning research.
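One simple way to probe this kind of portability gap is to run the same framework operations on different device backends and record which calls fail. The PyTorch snippet below is a minimal sketch of that idea, assuming a machine with an optional CUDA or Apple MPS backend; it is not the benchmark suite used in the paper, and the chosen operations are arbitrary.

```python
# Minimal sketch: probe whether a few framework operations run on each
# available backend. Illustrative only; not the paper's benchmark suite.
import torch

def available_devices():
    devices = ["cpu"]
    if torch.cuda.is_available():
        devices.append("cuda")
    if torch.backends.mps.is_available():
        devices.append("mps")
    return devices

# A small, arbitrary set of operations to probe on each backend.
OPS = {
    "matmul":  lambda d: torch.randn(64, 64, device=d) @ torch.randn(64, 64, device=d),
    "fft":     lambda d: torch.fft.fft(torch.randn(1024, device=d)),
    "sort":    lambda d: torch.sort(torch.randn(1024, device=d)),
    "float64": lambda d: torch.randn(8, 8, dtype=torch.float64, device=d).inverse(),
}

for device in available_devices():
    failures = []
    for name, op in OPS.items():
        try:
            op(device)
        except Exception:
            failures.append(name)   # op unsupported (or broken) on this backend
    print(f"{device}: {len(OPS) - len(failures)}/{len(OPS)} ops ran; failed: {failures}")
```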

Democratising case law while teaching students: writing Wikipedia articles on legal cases

European Journal of Legal Education
June
2023
Edana Richardson, Brian McKenzie, Brian Flanagan, Neil C. Thompson, Maria Murphy

This article draws on qualitative student feedback and lecturer experience to provide a guide for educators who are interested in creating Wikipedia article-based assignments. Using legal cases as an example, this article details how these assignments can encourage students to deepen their understanding of a topic and consider how knowledge can be communicated effectively. In particular, this article focuses on how educators outside of the United States and Canada can navigate Wikipedia's bureaucracy and how they and their students can contribute information of relevance to smaller jurisdictions on a publicly accessible repository. This article begins by addressing concerns that educators may have with student use of Wikipedia, while highlighting pedagogical benefits for students who write Wikipedia articles. It goes on to provide a guide for educators who want to create a Wikipedia article writing assignment: in particular, the preparatory steps required to make the assignment effective, how to support students in their writing journey, and how to better ensure that student-authored articles remain available on Wikipedia. This article concludes by encouraging educators to consider using Wikipedia as an educational tool, and to teach their students how they can use Wikipedia article writing to contribute to public knowledge.

Intentional and serendipitous diffusion of ideas: Evidence from academic conferences

arXiv.org
September
2022
Misha Teplitskiy, Soya Park, Neil Thompson, David Karger

This paper investigates the effects of seeing ideas presented in-person when they are easily accessible online. Presentations may increase the diffusion of ideas intentionally (when one attends the presentation of an idea of interest) and serendipitously (when one sees other ideas presented in the same session). We measure these effects in the context of 25 computer science conferences using data from the scheduling application Confer, which lets users browse papers, Like those of interest, and receive schedules of their presentations. We address endogeneity concerns in presentation attendance by exploiting scheduling conflicts: when a user Likes multiple papers that are presented at the same time, she cannot see them both, potentially affecting their diffusion. Estimates show that being able to see presentations increases citing of Liked papers within two years by 1.5 percentage points (62.5% boost over the baseline citation rate). Attention to Liked papers also spills over to non-Liked papers in the same session, increasing their citing by 0.5 percentage points (125% boost), and this serendipitous diffusion represents 30.5% of the total effect. Both diffusion types were concentrated among papers semantically close to an attendee's prior work, suggesting that there are inefficiencies in finding related research that conferences help overcome. Overall, even when ideas are easily accessible online, in-person presentations substantially increase diffusion, much of it serendipitous.
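The reported effect sizes imply the baseline citation rates directly: a 1.5 percentage point increase that amounts to a 62.5% boost implies a baseline near 2.4%, and a 0.5 point increase at a 125% boost implies a baseline near 0.4%. The few lines below simply carry out that arithmetic as a sanity check on the abstract's numbers.

```python
# Back out the implied baseline citation rates from the reported effects:
# the absolute increase (in percentage points) divided by the relative boost
# gives the baseline rate.
liked_baseline     = 1.5 / 0.625   # 1.5 pp increase at a 62.5% boost -> 2.4%
non_liked_baseline = 0.5 / 1.25    # 0.5 pp increase at a 125% boost  -> 0.4%
print(liked_baseline, non_liked_baseline)   # 2.4 0.4 (percent)
```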

Trial by Internet: A Randomized Field Experiment on Wikipedia's Influence on Judges' Legal Reasoning

SSRN
August
2022
Neil Thompson, Brian Flanagan, Edana Richardson, Brian McKenzie, Xueyun Luo

In the common law tradition, legal decisions are supposed to be grounded in both statute and precedent, with legal training guiding practitioners on the most important and relevant touchstones. But actors in the legal system are also human, with the failings and foibles seen throughout society. This may lead them to take methodological shortcuts, even to relying on unknown internet users for determinations of a legal source's relevance. In this chapter, we investigate the influence on legal judgments of a pervasive but unauthoritative source of legal knowledge: Wikipedia. Using the first randomized field experiment ever undertaken in this area—the gold standard for identifying causal effects—we show that Wikipedia shapes judicial behavior. Wikipedia articles on decided cases, written by law students, guide both the decisions that judges cite as precedents and the textual content of their written opinions. The information and legal analysis offered on Wikipedia led judges to cite the relevant legal cases more often and to talk about them in ways comparable to how the Wikipedia authors had framed them. Collectively, our study provides clear empirical evidence of a new form of influence on judges' application of the law—easily accessible, user-generated online content. Because such content is not authoritative, our analysis reveals a policy gap: if easily accessible analysis of legal questions is already being relied on, it behooves the legal community to accelerate efforts to ensure that such analysis is both comprehensive and expert.

Tools that shape innovation

Decomposing the “Tacit Knowledge Problem:” Codification of Knowledge and Access in CRISPR Gene-Editing

Pre-print
November
2017
Neil Thompson, Samantha Zyontz

The ability to edit genes with CRISPR has, in a few short years, been transformative to genetics, generating follow-on science such as drought-resistant crops and mosquitos that cannot carry malaria, and is widely expected to win a Nobel prize. The rush of scientists to embrace CRISPR in its early days provides an important and data-rich environment in which to study how tacit information is codified and transferred for others to build upon. In particular, the introduction of CRISPR allows us to explore whether embedding the associated knowledge into an easy-to-distribute tool (a plasmid) and making it available through a biological resource center (Addgene) solves the "tacit information" problem and the physical localization that accompanies it. We show that codification into a scientific tool does solve the problem of access; scientists of equivalent caliber experiment with CRISPR in equal measure regardless of where in the U.S. they are. However, scientists across different geographies have unequal success in converting that experimentation into published science, suggesting that some tacit information problems persist. The remaining tacit information seems to be driven by expertise, with geographies specialized in mammalian CRISPR helping to create publishable science on mammals and geographies specialized in bacterial CRISPR helping to create publishable science on bacteria. Collectively, our case study of the earliest days of CRISPR speaks to the tacit information challenges that are, and are not, solved by distributing embedded materials.

Science Is Shaped by Wikipedia: Evidence From a Randomized Control Trial

Pre-print
September
2017
Neil Thompson, Douglas Hanley

“I sometimes think that general and popular treatises are almost as important for the progress of science as original work.” - Charles Darwin, 1865. As the largest encyclopedia in the world, it is not surprising that Wikipedia reflects the state of scientific knowledge. However, Wikipedia is also one of the most accessed websites in the world, including by scientists, which suggests that it also has the potential to shape science. This paper shows that it does. Incorporating ideas into Wikipedia leads to those ideas being used more in the scientific literature. We provide correlational evidence of this across thousands of Wikipedia articles and causal evidence of it through a randomized control trial where we add new scientific content to Wikipedia. In the months after uploading it, an average new Wikipedia article in Chemistry is read tens of thousands of times and causes changes to hundreds of related scientific journal articles. Patterns in these changes suggest that Wikipedia articles are used as review articles, summarizing an area of science and highlighting the research contributions to it. Consistent with this reference article view, we find causal evidence that when scientific articles are added as references to Wikipedia, those articles accrue more academic citations. Our findings speak not only to the influence of Wikipedia, but more broadly to the influence of repositories of knowledge and the role that they play in science.

University licensing and the flow of scientific knowledge

Research Policy
July
2017
Neil C Thompson, Arvids A Ziedonis, David C Mowery

As university involvement in technology transfer and entrepreneurship has increased, concerns over the patenting and licensing of scientific discoveries have grown. This paper examines the effect that the licensing of academic patents has on journal citations to academic publications covering the same scientific research. We analyze data on invention disclosures, patents, and licenses from the University of California, a leading U.S. academic patenter and licensor, between 1997 and 2007. We also develop a novel “inventor-based” maximum-likelihood matching technique to automate and generalize Murray’s (2002) “patent-paper pairs” methodology. We use this methodology to identify the scientific publications associated with University of California patents and licenses. Based on a “difference-in-differences” analysis, we find that within our sample of patented academic discoveries, citations to licensed patent-linked publications are higher in the three years after the license, although this difference is not statistically significant. But when we disaggregate our sample into (a) patented discoveries that are likely to be used as “research tools” by other researchers (based on the presence of material transfer agreements (MTAs) that cover them) and (b) patented discoveries not covered by MTAs, we find that for the latter subset, citations to publications linked to licensed patents are not only higher than those linked to unlicensed patents, but that this difference is statistically significant. In contrast, licensing of patented discoveries that are also research tools is associated with a reduction in citations to papers linked to these research advances, raising the possibility that licensing may restrict the flow of inputs to “follow-on” scientific research.
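The headline comparison is a difference-in-differences: how citations to patent-linked publications change after a license, relative to publications linked to unlicensed patents over the same period. The sketch below shows what a minimal version of that specification could look like with statsmodels; the dataframe, column names, and toy data are hypothetical stand-ins, not the paper's actual estimation.

```python
# Hypothetical difference-in-differences sketch: citations to patent-linked
# publications, before vs. after a license, for licensed vs. unlicensed patents.
# Column names and the toy data are illustrative, not the paper's dataset.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.DataFrame({
    "citations": [3, 5, 4, 9, 2, 3, 4, 4],
    "licensed":  [1, 1, 1, 1, 0, 0, 0, 0],   # 1 if the linked patent was licensed
    "post":      [0, 0, 1, 1, 0, 0, 1, 1],   # 1 for the years after the license date
})

# The coefficient on the interaction term is the difference-in-differences estimate.
model = smf.ols("citations ~ licensed * post", data=df).fit()
print(model.params["licensed:post"])
```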