By Tal Roded & Peter Slattery
As artificial intelligence continues to progress, its role in scientific discovery is becoming increasingly central. At the recent MIT FutureTech Workshop on the Role of AI in Science, leading researchers, technologists, and policymakers gathered to explore how AI is rapidly transforming the scientific process, from accelerating research cycles to uncovering novel insights across disciplines. Researchers also presented on potential pitfalls and limitations of AI, such as the over-concentration of AI development among a few entities and the explosion of AI-generated content. The discussions provided a nuanced look at AI’s potential, its current limitations, and the fundamental shifts it is producing in scientific research. In this short article, we summarize a few of the talks.
One of the most prominent themes of the workshop was how AI is reshaping the pace and scale of scientific discovery. Across fields such as materials science, drug discovery, and robotics, AI-driven tools are not just optimizing existing research workflows but also enabling entirely new forms of inquiry. Foundation models for science (similar to large language models, but trained on domain-specific data) are now helping researchers generate hypotheses, design experiments, and even automate aspects of laboratory work.
FutureTech also presented research revealing disparities between industrial and academic usage of these large foundation models, calling into question the resources available to academics. In his talk “Strengthening and democratizing the U.S. Artificial Intelligence Innovation Ecosystem through the National AI Research Resource Pilot”, Varun Chandola shared progress on the establishment of the National Artificial Intelligence Research Resource (NAIRR), an NSF project to open up access to AI infrastructure for all types of researchers, further integrating AI into research pipelines. Such a resource would enable academics to expand the scope of their scientific research and better take advantage of the developments in Generative AI currently being led by industry.
Speakers emphasized AI's increasing ability not only to identify patterns in existing, complex datasets but also to actively contribute to the formulation of new theories and empirical findings. Examples included self-improving AI models that refine their predictions through iterative learning and AI agents that assist in autonomous experimentation, reducing the time required to test and validate new theories. Ross King presented on the potential for a robot scientist in his talk, “The automation of science”, reporting on AI that could originate, develop, execute, and iterate on its own experiments.
With AI taking on a more active role in discovery, the role of human scientists in the research process is evolving and may even cease to exist if AI supplants scientists and human oversight cannot keep pace. For now, however, AI is assisting researchers in interacting with data, designing experiments, and interpreting results. Workshop discussions highlighted the increasing importance of human oversight, particularly in defining meaningful research questions and ensuring AI models align with rigorous scientific standards. As AI systems become more autonomous, the challenge will be to maintain human intuition and creativity as central elements of the scientific process.
Despite AI’s great potential, workshop participants identified several hurdles that must be addressed. Bias and data quality remain critical concerns, as models trained on incomplete or unrepresentative datasets can lead to misleading conclusions. In her talk “The Uneven Access to Knowledge”, Rada Mihalcea shared findings that LLMs trained on unrepresentative datasets perform poorly in lower-resource languages and on questions that depend on cultural context. Additionally, while AI can accelerate many aspects of research, there are limits to its ability to generate new knowledge that extends beyond existing data. Addressing these challenges will require interdisciplinary collaboration between AI developers, domain experts, and policymakers to establish best practices for the responsible use of AI in scientific research.
Another area of concern with AI-generated content is plagiarism, fake news, and related ethical issues, as highlighted by Lei Li in his talk “Watermarking and detecting AI generation”. Dr. Li addressed concerns over the ethical sourcing of the data fed into generative models, as well as the limitations of current detectors in positively identifying AI-generated content. He presented a subtle and novel method of watermarking text generated by LLMs by augmenting the distribution of output tokens. This way, those holding the secret distribution could easily confirm that a large body of text was indeed produced by an LLM, yet reverse engineering the augmentation would be intractable.
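To make the idea of watermarking-by-distribution concrete, here is a minimal illustrative sketch of one well-known family of such schemes, in which a secret key partitions the vocabulary into a "green" list whose tokens the model favors slightly at generation time. This is an assumption-laden toy (the function names `green_list` and `detect`, the hashing scheme, and the toy vocabulary are all hypothetical), not Dr. Li's actual algorithm:

```python
import hashlib
import math

def green_list(vocab, key):
    """Secretly partition the vocabulary with a keyed hash: a token is
    'green' if the first byte of its hash is even, so roughly half the
    vocabulary ends up green. Without the key, the partition looks random."""
    greens = set()
    for tok in vocab:
        digest = hashlib.sha256((key + tok).encode()).digest()
        if digest[0] % 2 == 0:
            greens.add(tok)
    return greens

def detect(tokens, greens):
    """z-score for the observed fraction of green tokens. Unwatermarked
    text should score near 0; watermarked text scores far above it."""
    n = len(tokens)
    hits = sum(1 for t in tokens if t in greens)
    expected, std = 0.5 * n, math.sqrt(0.25 * n)
    return (hits - expected) / std
```

At generation time, the model would add a small bias to the logits of green tokens so they are sampled slightly more often; detection then needs only the key and the text, not the model. The longer the text, the more statistically confident the detector becomes, while an adversary without the key cannot tell which tokens carry the signal.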
The workshop underscored that AI is an increasingly fundamental part of the scientific enterprise. As models become more powerful and embedded in research workflows, the next step will be to develop AI systems that are not only accurate but also interpretable, robust, and aligned with the goals of scientific discovery. Moving forward, the role of AI in science will be shaped by an interplay between technological advancements, ethical considerations, and the evolving needs of the research community.
Session 1: The Role of AI in Advancing Scientific Research
This session explored the relevance of scientific discovery for welfare and how artificial intelligence is transforming scientific discovery across various fields.
Regina Barzilay (MIT Jameel Clinic): How AI contributes to the discovery of new drugs
Lilach M. (Wharton AI & Analytics Initiative): AI and education
Danial Lashkari (Princeton University and MIT FutureTech): Scientific progress as the driver of economic growth
Ross King (University of Cambridge): The automation of science
Session 2: How AI is Changing Models and Data
This session explored the transformative impact of AI on data and models, emphasizing how artificial intelligence is revolutionizing data interpretation, hypothesis generation, and the advancement of scientific theories.
Stephen Wolfram (Wolfram): AI, Computation and the Future of Science
Yuji Roh (Google): The AI Feedback Loop: New Promises and Risks of Large Models
Boris Kozinsky (Harvard University): How AI enables novel computations & theory predictions
Manuela Veloso (JPMorganChase and Carnegie Mellon University): Humans and AI: The Journey
Session 3: De-democratization and Concentration of Power in AI
This session analysed how technological progress can create market power and discussed the growing concerns about the concentration of AI development and control within a few powerful entities.
Varun Chandola (National Science Foundation): Strengthening and democratizing the U.S. AI Innovation Ecosystem through the National AI Research Resource Pilot
Jan Eeckhout (Universitat Pompeu Fabra Barcelona): Market power and technological change
Rada Mihalcea (Michigan AI Laboratory): The uneven access to knowledge
Haydn Belfield (Centre for the Study of Existential Risk, University of Cambridge, and the Leverhulme Centre for the Future of Intelligence): Big Tech Big Science: Challenges and opportunities for frontier AI governance
Session 4: Challenges that AI poses to Science
This session addressed the practical and technological challenges that AI poses to science, focusing on issues like understanding and advancing artificial intelligence, as well as detecting AI-generated content.
Dashun Wang (Northwestern University): AI and the Science of Science
Lei Li (Carnegie Mellon University): Watermarking and detecting AI generation
Eamon Duede (Purdue University and Argonne National Laboratory): Conceptual and Technological Frictions on AI-Infused Science
Keyon Vafa (Harvard Data Science Initiative): The LLM Conundrum