Wafer-Scale Computing: Advancements, Challenges, and Future Perspectives [Feature]

Bibliographic Details
Published in IEEE circuits and systems magazine (New York, N.Y. 2001) Vol. 24; no. 1; pp. 52 - 81
Main Authors Hu, Yang; Lin, Xinhan; Wang, Huizheng; He, Zhen; Yu, Xingmao; Zhang, Jiahao; Yang, Qize; Xu, Zheng; Guan, Sihan; Fang, Jiahao; Shang, Haoran; Tang, Xinru; Dai, Xu; Wei, Shaojun; Yin, Shouyi
Format Magazine Article
Language English
Published New York IEEE 2024
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Summary: Artificial intelligence (AI) technology with large models plays an increasingly important role in both academia and industry, and it brings a rapidly increasing demand for hardware computing power. As the computing demand for AI continues to grow, the growth of hardware computing power has failed to keep up, which has become a significant factor restricting the development of AI. The growth of hardware computing power is mainly driven by increases in transistor density and chip area. However, the former is impeded by the end of Moore's Law and Dennard scaling, and the latter is significantly restricted by the difficulty of disrupting legacy fabrication equipment and processes. In recent years, maturing advanced packaging technologies have increasingly been used to implement bigger chips that integrate multiple chiplets while still providing interconnections with chip-level density and bandwidth. This technique opens a new path for continuing to increase computing power while leveraging the current fabrication process without significant disruption. Enabled by this technique, a chip can extend to wafer scale (over 10,000 mm²), provisioning orders of magnitude more computing capability (several POPS within just one monolithic chip) and die-to-die bandwidth density (over 15 GB/s/mm) than regular chips, giving rise to a new Wafer-scale Computing paradigm. Compared to conventional high-performance computing paradigms such as multi-accelerator and datacenter-scale computing, Wafer-scale Computing shows remarkable advantages in communication bandwidth, integration density, and programmability potential.
Not surprisingly, the disruptive Wafer-scale Computing paradigm also brings unprecedented design challenges for hardware architecture, design-system-technology co-optimization, power and cooling systems, and the compiler tool chain. At present, there are no comprehensive surveys summarizing the current state and design insights of Wafer-scale Computing. This article aims to take the first step in helping academia and industry review existing wafer-scale chips and essential technologies in a one-stop manner, so that readers can conveniently grasp the basic knowledge and key points, understand the achievements and shortcomings of existing research, and contribute to this promising research direction.
ISSN: 1531-636X
1558-0830
DOI: 10.1109/MCAS.2024.3349669