What drives progress in AI? Trends in Algorithms

What drives progress in AI? Trends in Algorithms

May 20, 2024

Zachary Brown, Peter Slattery and Haoran Lyu | MIT FutureTech

In this article, we provide a high level overview of a key trend in AI models: that progress in algorithms leads to performance improvements in AI systems. This is the second in a series of articles on AI trends. You can read our article on data here.

Background

Progress in artificial intelligence is underpinned by advances in three areas: compute, data, and algorithms. Compute refers to the computational resources – including the physical hardware that executes computations – that computer systems employ to run calculations or process data. Algorithms are the procedures or formulas that computer systems use for solving problems or completing tasks. Data refers to the information being processed or produced by a computer, for instance to train and validate AI models. 

This article focuses on progress in algorithms and the impacts on AI development. 

Why progress in algorithms matters

Progress in algorithms is a big deal because it allows firms to train bigger and more capable models by using less computation. As computer scientists invent and improve algorithms, solutions for problems can be calculated more easily. This can produce significant increases in performance. 

For instance, consider a single algorithm family: the family of algorithms designed to solve the maximum subarray problem. The time it takes to solve this problem for n = 10^6 inputs has declined by a factor of about 10^12 (approximately 99.9999999999%) since 1970 (Sherry and Thompson 2021). That’s an incredible speedup! 

Due to this and other type of algorithmic progress, it has been estimated that the level of compute needed to achieve a given level of AI performance has halved roughly every 8 months (Ho et al, 2024).

This is a significant trend, and there are good reasons to want to understand how it may evolve: Understanding the trajectory of algorithmic progress can inform the allocation of resources and development of regulation and policy. Assessing and understanding the key drivers behind AI advancement is essential for making informed decisions about the governance of AI technologies, ensuring their development aligns with societal values and priorities.

What are algorithms?

Algorithms are used to calculate solutions to fundamental problems that often reappear across many applications, including AI systems. One example is a sorting algorithm: an algorithm that puts elements of a list into an order, as visualized in Figure 1 below.

Figure 1: A merge sort algorithm ordering elements [source]

Many different computer applications need to sort lists, so algorithmic advances that improve the efficiency of sorting yield speedups across many applications. 

There are several types of algorithms. Serial algorithms are implemented sequentially by a single processor. In contrast, parallel algorithms use multiple processors running simultaneously, each performing some operations, so that the problem can be solved in less time (although there will be more operations run in total, giving rise to a tradeoff). 

Optimization algorithms find the unique solution(s) to a problem. In contrast, approximation algorithms find a solution within some acceptable error range.

How do we measure algorithmic progress?

One way computer scientists evaluate the quality of an algorithm is by looking at its time complexity: As the number of inputs n to an algorithm increases (think of these inputs as the individual 1s and 0s that an algorithm processes), how much does the time it takes to execute the algorithm increase? 

For example, for some algorithms, the time it takes to run the algorithm increases exponentially as we add more inputs n, while more efficient algorithms might require only linear increases in time with respect to n. Compared to exponential increases, linear increases mean dramatically less compute is needed to process large n

Figure 2 illustrates this with a hypothetical comparison. It compares the time it takes to execute two algorithms with linear and exponential time complexity given different sizes of inputs. It shows that both algorithms take a similar amount of time to execute when the numbers of inputs are small. However, the exponential time complexity algorithm takes far longer to execute than the linear time complexity algorithm as the number of inputs increases. 

Figure 2: Comparing hypothetical algorithms with linear and exponential time complexity 

What progress have we made with algorithms?

There has been impressive but spasmodic progress in algorithms, with substantial variation between algorithm families.

Sherry and Thompson 2021 estimated the rate of improvement for a large set of algorithm families drawn from historical textbooks and research papers. Their findings:

  • For just under half of the algorithm families, there is little to no improvement, even for large problem sizes. 
  • But 14% of the algorithm families see yearly improvement rates of over 1000%.
  • The remaining algorithms fall somewhere in between.

As shown in Figure 3, the pace of improvement for many algorithms has surpassed the pace of hardware improvement. And hardware has improved rapidly over time: hardware’s computational power, measured by the number of transistors per chip, has doubled approximately every two years!

Figure 3: Distribution of average yearly improvement rates for 110 algorithm families, as calculated based on asymptotic time complexity, for problems of size 1 Billion [source]

For some problems, a lack of progress after years of attempts suggests that more progress might not be possible, and in other cases formal proofs have shown that there are lower bounds to the possible complexity an algorithm solving that problem can have. For example, Liu (2021) finds that, of algorithms in the Sherry and Thompson database, over three quarters are near optimal, and only 7% of algorithm families improved in the 2010s. 

Measuring the effects of algorithmic progress on AI performance

For the purposes of this analysis, we’ll lump together all the algorithmic and programmatic advances that lead to progress in AI performance, and consider these advances collectively. What we examine as “AI algorithms” will therefore include improvements to the high level model architecture of AI systems and the specific algorithms used as part of a model.

Specifically, we can look at the change (over time) in performance that AI models are able to achieve at a given compute and data budget. The resultant measure of algorithmic progress will reflect both 

  • AI-relevant improvements in foundational algorithms, and 
  • improvements in AI-specific algorithms and model architectures.

An analysis by a team including researchers at Epoch AI and MIT FutureTech used precisely this strategy for analyzing algorithmic progress (Ho et al, 2024). The below figure shows results for algorithmic progress in various domains including large language models (LLMs such as ChatGPT) and computer vision. 

In the figure, the dots represent central estimates or ranges from specific papers; the triangles correspond to doubling times for problems at different sizes; and the dashed line indicates the 2-year doubling time associated with Moore’s law. 

Figure 4: Estimates of effective compute doubling from algorithmic improvements across different domains [source]

The rate of algorithmic progress is measured here in effective compute doubling time: the amount of time for algorithmic progress to yield performance improvements equivalent to what we’d expect if we saw no algorithmic progress but doubled computing power. For most domains, they record exponential algorithmic progress that exceeds progress in hardware.

Note that there is a great deal of uncertainty for some of these figures, and it is unclear to what extent these different software domains are usefully comparable. 

Looking at LLMs in particular, Ho et al (2024) find that algorithmic progress has contributed roughly half as much as compute scaling to more capable performance. 

Figure 5: A stylized illustration of the relative contribution of compute scaling and algorithmic progress to effective compute [source]

However, caution is warranted here, too: given the unpredictable and spasmodic nature of algorithmic improvements generally, this finding should not form the basis for strongly confident predictions about the future relative contribution of algorithm development to AI progress. As we discussed earlier, there could be diminishing returns to further research on AI-relevant algorithms. Or there could be a sudden brilliant finding that leads to dramatic increases in performance.

Could AI contribute to algorithmic progress?

One possibility is that AI could itself cause algorithmic progress. This has already happened, at least a little. A Google DeepMind paper (Mankowitz el al. 2023) used an AI model to discover improved sorting and hashing algorithms – both foundational algorithms used by millions. And Wang et al. (2017) find a way to discover improved AI architectures using an AI system. 

Perhaps at some point, AI algorithmic progress will be largely driven by AI discoveries, leading to a feedback loop of escalating progress.  The implications of such a feedback loop are far-reaching and deserve comprehensive investigation to better understand the opportunities and challenges that may arise.

Conclusion

Algorithmic progress means that programs can do more computations with a set quantity of computing resources. This can be a big deal for AI progress: When AI models can do computations more efficiently, we can train bigger and more capable models, and explore new approaches. This in turn creates new possibilities and risks.

Our analysis finds that algorithmic progress has been significant but inconsistent, and outpaced progress in hardware in some cases. For AI related progress, we find that algorithms play a vital role in enhancing AI model performance, but also that most performance improvement has come from data and compute scaling. 

Resources

Our dataset exploring the use of algorithms, parameters, and training data can be accessed here

For further information on the types of algorithms developed over time and their applications and uses, check out our Algorithm Wiki.

References

Ho, Anson, et al. 2024. "Algorithmic Progress in Language Models." Preprint. Posted at arXiv:2403.05812. https://doi.org/10.48550/arXiv.2403.05812.

Liu, Emily. 2021. "A Metastudy of Algorithm Lower Bounds." Master of Engineering thesis, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology. https://dspace.mit.edu/bitstream/handle/1721.1/140013/Liu-emiliu-meng-eecs-2021-thesis.pdf

Mankowitz, Daniel J., et al. "Faster Sorting Algorithms Discovered Using Deep Reinforcement Learning." Nature 618, no. 7964 (June 2023): 257–63. https://doi.org/10.1038/s41586-023-06004-9.

Sherry, Yash, and Neil C. Thompson. "How Fast Do Algorithms Improve? [Point of View]." Proceedings of the IEEE, vol. 109, no. 11, 2021, pp. 1768–77. https://doi.org/10.1109/JPROC.2021.3107219.

Wang, Huanting, et al. "Automating Reinforcement Learning Architecture Design for Code Optimization." In Proceedings of the 31st ACM SIGPLAN International Conference on Compiler Construction, 129–43. CC 2022. New York, NY, USA: Association for Computing Machinery, 2022. https://doi.org/10.1145/3497776.3517769.