Intelligence Bandwidth
A metric to track the exponential growth of AI
August 2025
Using AI today is like using an early-1990s computer from the command line: everything is text-based, you type in a command and wait for the response. On modern social platforms, by contrast, people read articles with illustrative images and watch long-form, short-form, and live video. Yet most AI interactions remain predominantly text-based.
With the emergence of multimodal AI, such as image generation[1,2,3] and video generation[4,5,6], it’s reasonable to expect that both human-AI interaction[7] and AI-AI interaction[8] will become multimodal. Images may even appear in a model’s chain of thought, approximating aspects of visual reasoning in the human brain.
Analogy to the internet
This invites a comparison between AI's trajectory and the internet's. Viewed through the lens of network bandwidth, increases in throughput enabled richer content formats[9], evolving from text-forward sites like early Twitter[10] to video-centric platforms like YouTube[11]. This transformation fundamentally changed how people use the internet.
Nielsen's law[12] states that internet bandwidth increases by 50% annually. This predictive framework enabled anticipation of when the internet would become fast enough for users to easily download images or stream videos, thereby forecasting the emergence and popularization of platforms such as Pinterest[13] and YouTube[11].
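Fifty percent annual growth compounds quickly. A minimal sketch of the compounding, using an illustrative (not historical) 1 Mbps starting point, also shows that Nielsen's rate implies a doubling time of roughly 1.7 years:

```python
import math

GROWTH = 1.5  # Nielsen's law: bandwidth grows 50% per year

def projected_bandwidth(b0_mbps: float, years: float) -> float:
    """Bandwidth after `years` of 50% annual growth."""
    return b0_mbps * GROWTH ** years

# Illustrative baseline of 1 Mbps, projected a decade forward.
print(round(projected_bandwidth(1.0, 10), 1))  # 57.7 (Mbps)

# Time to double at 50% annual growth.
print(round(math.log(2) / math.log(GROWTH), 2))  # 1.71 (years)
```

The same compounding logic is reused later for AI output, with a doubling period of one year instead of 1.71.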
If bandwidth served as a progress proxy for the web, can we identify an analogous metric to track the rate of AI output? Moreover, if we can observe a growth pattern in such a metric, we may be able to predict the timeline for new modalities of human-AI interaction in the future.
Intelligence bandwidth
I propose intelligence bandwidth as a metric to measure the output rate of AI services. It quantifies the number of bits in the raw outputs of an AI model, with the unit expressed as KB/s.
Intelligence bandwidth is formally expressed as:

$$B = \frac{S}{T}$$

where $S$ is the size of the model's raw output in kilobytes and $T$ is the time, in seconds, the service took to generate it.
The measurement methodology is straightforward. For any generative AI service, whether it produces text, images, or videos, the output can be saved to disk. Intelligence bandwidth is then calculated by dividing the storage size of the output by the time the service took to generate that content.
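The methodology can be sketched in a few lines. The `generate` callable below is a hypothetical stand-in for any generative service that returns raw output bytes (UTF-8 text, or image/video file contents); the fake service and its numbers are illustrative, not measurements:

```python
import time

def intelligence_bandwidth(generate, prompt: str) -> float:
    """Measure intelligence bandwidth (KB/s) of a generative service.

    `generate` is any callable returning the raw output as bytes.
    """
    start = time.monotonic()
    output: bytes = generate(prompt)
    elapsed = time.monotonic() - start
    return (len(output) / 1024) / elapsed  # KB/s

# Stand-in "service" that emits 3 KB of output in about one second.
def fake_service(prompt: str) -> bytes:
    time.sleep(1.0)
    return b"x" * 3072

print(intelligence_bandwidth(fake_service, "hello"))  # ≈ 3.0 KB/s
```

Because only output size and wall-clock time are needed, the same function applies unchanged to text, image, and video services.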
Measuring models across modalities
With intelligence bandwidth, we can project all generative AI models onto the same axis, KB/s. All historically significant generative AI models are measured and plotted in a single figure, covering large language models, image generators, and video generators. Most of the data presented in this section is collected from Artificial Analysis[14].
The experimental results are shown in Figure 1, where the X-axis is the release date of the models, and the Y-axis is the intelligence bandwidth of the models measured in kilobytes per second. The modality of the models is indicated by different colors as shown in the legend.
Figure 1: Intelligence bandwidth (KB/s) over time.
Key observations from the experimental results include:

- Most language models fall between 0 KB/s and 3 KB/s.
- Image generators exhibit an exponential growth rate.
- The video generator Veo 3[5] currently exhibits an even lower intelligence bandwidth than the state-of-the-art image generators. This is primarily attributable to less mature serving technologies for video models compared to those for large language models and image generators. As serving efficiency for video generators improves, substantial growth in their intelligence bandwidth is anticipated in the near future.
- The Gemini 2.5 Flash[15] image generator is an outlier, primarily because it is optimized for low latency and usability rather than best quality and fidelity.
Jin's law
Effective metrics can help reveal new laws that predict the future, as with transistor counts and Moore's law[16], FLOPS and Huang's law, or internet bandwidth and Nielsen's law[12]. Similarly, a robust metric for AI output rate can facilitate the discovery of macro-level trends in AI development.
We now assess the validity of intelligence bandwidth as such a metric by examining whether it supports the formulation of a predictive law for future AI growth.
The dotted curve in Figure 1 represents the estimated growth of intelligence bandwidth. The prediction of the growth rate is based primarily on Imagen 4, the state-of-the-art high-quality image generator, rather than a model balanced between speed and quality, such as Gemini 2.5 Flash[15]. The growth rate of intelligence bandwidth is summarized in a simple law named after the author's surname, presented as follows.
Jin's law: The intelligence bandwidth (KB/s) of the best generative AI service available to the public doubles every year.
The formal definition of this law is as follows. Let $B(t)$ denote the intelligence bandwidth, in KB/s, of the best publicly available generative AI service at time $t$, measured in years. Then:

$$B(t) = B(t_0) \cdot 2^{\,t - t_0}$$

where $t_0$ is a reference point in time.
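A minimal sketch of the law as stated, doubling annually from a baseline. The 10 KB/s 2025 baseline here is illustrative, not a measured value:

```python
def jins_law(b0_kbps: float, t0: int, t: int) -> float:
    """Projected intelligence bandwidth (KB/s) at year t,
    given baseline b0_kbps at year t0 and annual doubling."""
    return b0_kbps * 2 ** (t - t0)

# Illustrative baseline: 10 KB/s in 2025.
for year in (2025, 2026, 2027, 2028):
    print(year, jins_law(10.0, 2025, year))  # 10, 20, 40, 80 KB/s
```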
In Jin's law, intelligence bandwidth is defined by the modality exhibiting the highest KB/s measurement. Currently, image generators are at the forefront of this growth. As advancements in models and serving technologies for image generation reach a plateau, it is anticipated that video generators will become the primary drivers of further increases in intelligence bandwidth.
We only measure the best generative AI services: any model or service that is not well-received by the public is not considered by Jin's law. Likewise, we only measure publicly available services: any service that is not publicly accessible is not considered by Jin's law.
Impact on human-AI interaction
To use Jin's law to predict the future of human-AI interaction, we first need to recap a few basics on this subject.
Current human-AI interactions are still predominantly text-based. This works because AI output speed has exceeded human reading speed, which is approximately 238 words per minute[17]: state-of-the-art serving technologies can generate around 14,000 words per minute. Similarly, speech generation speed is far beyond human listening speed[18].
It is worth stating the conditions for real-time human-AI interaction in a given modality. For self-paced media formats such as text and images, individuals consume content at their maximum perceptual speed; once the intelligence bandwidth exceeds this perceptual threshold, real-time interaction in that modality becomes feasible. In contrast, for fixed-speed media formats such as audio and video, users generally adhere to the inherent playback speed; as long as AI generation speed surpasses the fixed playback rate, real-time interaction in those modalities is achievable.
Research in multi-modal AI continues to address the bottlenecks of other modalities. With the increase of the intelligence bandwidth of image generation, visual illustrations will become integrated into AI responses. AI may also be able to perform visual reasoning akin to humans on a whiteboard, and iteratively refine graphical designs as designers do on paper. As the intelligence bandwidth of leading AI models continues to increase, it is expected that video illustrations and real-time generated environment interactions, such as those demonstrated by Genie 3[19], will become feasible. In a speculative future, AI could generate entire worlds that users can interact with in real time, an idealistic scenario enabled by the cognitive capabilities of AI.
Achieving such increases in intelligence bandwidth requires not only advances in AI models as static collections of neural architectures and parameters, but also significant improvements in AI hardware and machine learning systems. In this vision, hardware, software, and models are no longer orthogonal, but are deeply integrated to enable superintelligence.
Predictions
Based on Jin's law, two predictions about human-AI interaction in the near future are made as examples of how the law can be used to anticipate the exponential growth of AI:

- Images will soon be used in AI interactions. The latency of the Gemini 2.5 Flash[15] image generator is lower than that of large language model responses. Consequently, large language models may soon incorporate images to provide enhanced illustrations in their outputs. Currently, it takes only 4.6 seconds to generate an image. If this speed doubles within a year, applications are likely to emerge in which images become a primary mode of interaction and illustration.
- Real-time video interaction will be widely available in three years. Given the intelligence bandwidth of models in 2025, it currently takes 50 to 60 seconds to generate 8 seconds of video. If generation speed increases by a factor of 7 to 8, real-time video generation will become feasible. Achieving an 8-fold increase corresponds to approximately $\log_2 8 = 3$ years. While some uncertainty remains in this prediction, video generators are presently below the projected growth curve and possess significant potential for accelerated improvement as serving technologies advance.
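The arithmetic behind both predictions can be checked directly, assuming the annual doubling schedule of Jin's law:

```python
import math

def years_for_speedup(factor: float) -> float:
    """Years needed for an x-fold speedup under annual doubling."""
    return math.log2(factor)

# Prediction 1: image latency halves each year.
latency_2025 = 4.6              # seconds per image today
print(latency_2025 / 2)         # 2.3 seconds one year out

# Prediction 2: real-time video needs a 7-8x speedup.
print(years_for_speedup(8))             # 3.0 years
print(round(years_for_speedup(7), 2))   # 2.81 years
```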
There are many other implications that can be derived from Jin's law. It is hoped that this law will guide AI application developers in identifying optimal time windows to bring products to market, and policymakers in enacting regulations at appropriate times to maximize development while minimizing harm.
Limitations
There are two main limitations to this work.
First, the accuracy of the estimated doubling period is limited. The law is fit to a relatively small number of data points, so the estimated rate may need revision as more models are released and measured.
Second, the exponential growth described by Jin's law represents an idealized scenario. Just as Moore's law is widely expected to end eventually, Jin's law will not hold forever. In practice, growth may be constrained by factors such as energy supply limitations or economic pressures, particularly if the AI sector experiences a market correction.
Conclusions
This article introduces the concept of intelligence bandwidth, a metric that measures the output rate of AI services. It observes an exponential growth trend in this metric across various modalities and formulates Jin's law, which states that the intelligence bandwidth of the best publicly available AI doubles annually. This law provides a predictive framework, forecasting the imminent integration of images into human-AI text interactions and the widespread availability of real-time video generation within three years.
Further reading
Intelligence bandwidth measures only the total size of AI output, whereas the proportion of intelligent information contained within that output can vary significantly. To address this limitation, it is also important to measure the rate at which an AI service produces genuinely intelligent information. I propose another metric to evaluate this aspect, named intelligence goodput, which is discussed in detail in this companion article.
References
- Kingma, D. P., et al. (2014). Auto-encoding variational Bayes. ICLR.
- Goodfellow, I. J., et al. (2014). Generative adversarial nets. NeurIPS.
- Rombach, R., et al. (2022). High-resolution image synthesis with latent diffusion models. CVPR.
- Amershi, S., et al. (2019). Guidelines for human-AI interaction. CHI.
- Coffman, K. G., et al. (2002). Internet growth: Is there a “Moore’s Law” for data traffic? Handbook of Massive Data Sets.
- Murthy, D. (2018). Twitter. Polity Press, Cambridge.
- Gilbert, E., et al. (2013). “I need to try this”? A statistical overview of Pinterest. CHI.
- Moore, G. E., et al. (1965). Cramming more components onto integrated circuits.
- Brysbaert, M. (2019). How many words do we read per minute? A review and meta-analysis of reading rate. Journal of Memory and Language.
- Kuperman, V., et al. (2021). A lingering question addressed: Reading rate and most efficient listening rate are highly similar. Journal of Experimental Psychology: Human Perception and Performance.
@misc{jin2025intelligence-bandwidth,
title={Intelligence Bandwidth},
author={Jin, Haifeng},
journal={Haifeng Jin's Blog},
url={https://haifengjin.com/intelligence-bandwidth/},
year={2025},
note={}
}