The rapid expansion of AI data centers has become one of the most powerful drivers of semiconductor demand. However, the deployment of AI infrastructure is limited by supply-chain bottlenecks in critical semiconductor components, and these bottlenecks now set the pace of AI expansion, as explained below.
At a Glance: Where the Bottlenecks Are
At a high level, the slowdown in AI data-center expansion is not due to lack of demand, but because a few critical parts of the supply chain cannot keep up. The main constraints fall into four key areas:
- High-Bandwidth Memory (HBM) – AI systems need HBM, an advanced memory component that is much faster than normal computer memory, but supply is limited.
- Advanced packaging – Chips need to be assembled in more complex ways than usual computer chips, and capacity is tight.
- Networking infrastructure – Data must move quickly between thousands of machines, creating new bottlenecks.
- Power, energy and thermal management – AI servers consume far more electricity than conventional servers and generate intense heat, so they require better power delivery and advanced cooling systems, both of which are difficult and costly to scale.
Each bottleneck is explained in detail below:
1. High-Bandwidth Memory (HBM)
High-bandwidth memory (HBM) is a specialised type of memory designed for AI systems. Unlike normal computer memory, it is built to move extremely large amounts of data very quickly, something AI chips need to function effectively.
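To give a rough sense of the gap, the back-of-envelope comparison below uses publicly quoted interface figures for HBM3 (about 6.4 Gb/s per pin across a 1024-bit interface per stack) and DDR5-4800 (a 64-bit channel); the exact numbers vary by generation, vendor, and configuration, so treat this as an illustration rather than a specification:

```python
# Back-of-envelope memory bandwidth comparison (illustrative figures only).
# HBM3: ~6.4 Gb/s per pin across a 1024-bit interface per stack.
hbm3_stack_gbs = 6.4 * 1024 / 8        # GB/s per HBM3 stack

# DDR5-4800: 4800 MT/s on a 64-bit (8-byte) channel.
ddr5_channel_gbs = 4800 * 8 / 1000     # GB/s per DDR5 channel

print(f"HBM3 per stack:   {hbm3_stack_gbs:.1f} GB/s")
print(f"DDR5 per channel: {ddr5_channel_gbs:.1f} GB/s")
print(f"Ratio: ~{hbm3_stack_gbs / ddr5_channel_gbs:.0f}x per stack vs channel")
```

A high-end AI accelerator typically carries several HBM stacks, so the effective bandwidth gap over a conventional server memory channel is even larger than the per-stack ratio.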
Globally, the HBM market is highly concentrated. It is dominated by three main players: SK hynix, Samsung, and Micron, which together control almost the entire supply. Among them, SK hynix is the current market leader, supplying more than half of global HBM demand, particularly to companies like NVIDIA.
However, despite aggressive expansion, supply remains tight. Major AI companies (such as Microsoft, Google, and Meta) are locking in long-term supply agreements, and most HBM production for 2025–2026 is already sold out. This means even if demand continues to grow, there is limited additional capacity available in the short term.
The bottleneck exists for several reasons. First, HBM is significantly more complex to manufacture than standard memory. It involves stacking multiple layers of memory chips vertically and integrating them closely with AI processors. This requires advanced manufacturing processes, specialised equipment, and high-quality wafers, all of which take years to scale. In fact, expanding capacity can take 4–5 years due to fabrication and infrastructure constraints.
Second, production is being reallocated toward HBM because it is far more profitable: it sells for significantly more per unit than conventional memory. This shift is creating shortages not only in HBM itself but also across the broader memory market.
Malaysia does not yet manufacture HBM at scale, but it plays an important supporting role in the supply chain. Companies like Micron are reportedly considering expanding HBM-related capabilities in the country.[1] Malaysia’s strength in semiconductor back-end processes positions it to benefit from future HBM growth, although it remains dependent on upstream production from Korea and the US.
Overall, HBM has emerged as one of the most critical bottlenecks in AI infrastructure. Even if AI chips (like GPUs) are available, limited access to this specialised memory can delay or restrict deployment of entire AI systems.
2. Advanced Packaging Capacity
Advanced packaging refers to how semiconductor chips are assembled and connected together. For AI systems, this process has become far more complex. Instead of using a single chip, modern AI processors combine multiple smaller chips (“chiplets”) and stack memory closely together using technologies such as 2.5D and 3D packaging. This allows faster data transfer, better performance, and more efficient use of space.
Globally, advanced packaging is dominated by a small number of players. TSMC is the clear leader, particularly through its CoWoS (Chip-on-Wafer-on-Substrate) technology used in AI chips like NVIDIA’s GPUs. Other key players include Samsung Electronics, Intel, and outsourced semiconductor assembly and test (OSAT) providers such as ASE Technology and Amkor Technology.
Despite rising demand, global capacity remains constrained. For example, TSMC’s CoWoS capacity has been fully booked due to strong demand from AI companies like NVIDIA. Expanding this capacity is not straightforward. It requires highly specialised equipment, cleanroom facilities and skilled engineering talent. Lead times for new capacity can take years and capital expenditure is extremely high, often running into billions of dollars.
The bottleneck is driven by both technical and structural limitations. Advanced packaging requires precise alignment of multiple chips and memory layers, as well as advanced substrates that are themselves in short supply.
In addition, the ecosystem is tightly interdependent: shortages in substrates, equipment, or even skilled labour can slow down the entire process. Unlike traditional chip manufacturing, scaling packaging capacity has only recently become a priority, meaning supply is still catching up with demand.
Malaysia plays a significant role in this segment of the supply chain. The country is one of the world’s key hubs for semiconductor assembly and testing, with major players such as Inari Amertron, Unisem, and Malaysian Pacific Industries (MPI), alongside global firms operating locally. While Malaysia has traditionally focused on conventional packaging, it is now moving up the value chain into more advanced packaging technologies. Government initiatives and foreign investments are aimed at strengthening these capabilities to capture a larger share of AI-driven demand.
Overall, advanced packaging has become a critical chokepoint in AI infrastructure. Even when chips and memory are available, limited packaging capacity can delay final production and deployment, making it one of the key constraints in scaling AI data centers globally.
3. Networking and Interconnect Performance
Large-scale AI systems are not powered by a single machine, but by thousands of processors working together at the same time. For these systems to function efficiently, massive amounts of data must move quickly between servers with minimal delay. In this context, networking becomes just as important as computing power itself.
Globally, demand is rising rapidly for high-performance networking technologies that can keep up with AI workloads. This includes high-speed networking chips, data processing units (DPUs), and optical interconnect solutions that use light to transmit data faster than traditional electrical signals.
Key players in this space include NVIDIA (through its InfiniBand and Spectrum networking platforms), Broadcom, Marvell Technology, and Intel. These companies supply the critical components that connect AI systems into large, coordinated clusters.
However, networking is emerging as a major bottleneck. As AI clusters scale, the volume of data moving between machines increases exponentially. Even small delays (latency) can significantly reduce overall system performance. In many cases, AI systems are unable to operate at full capacity not because of insufficient compute power, but because data cannot be moved fast enough between processors.
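A toy model makes this concrete. Suppose each training step spends a fixed amount of time on computation plus a communication cost that grows with the number of hops in a collective operation (roughly logarithmic in cluster size for ring- or tree-style collectives). All timings below are hypothetical, chosen only to illustrate the shape of the effect:

```python
import math

def step_efficiency(n_gpus, compute_ms=100.0, comm_ms_per_hop=5.0):
    """Fraction of each training step spent on useful compute,
    assuming communication cost grows with log2(n_gpus) hops.
    All parameters are hypothetical, for illustration only."""
    comm_ms = comm_ms_per_hop * math.log2(n_gpus)
    return compute_ms / (compute_ms + comm_ms)

for n in (8, 256, 8192):
    print(f"{n:5d} GPUs -> {step_efficiency(n):.0%} of step is useful compute")
```

In this sketch, the same hardware delivers a smaller fraction of useful work per step as the cluster grows, which is why faster interconnects (and lower latency) matter as much as adding more processors.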
The constraints in this part of the supply chain are both technological and infrastructural. High-speed networking chips are complex to design and manufacture, while optical interconnects require specialised materials and precision engineering. In addition, deploying these systems requires advanced data centre architecture, including high-quality cabling, switching systems and software optimisation. Scaling all of these simultaneously is a significant challenge.
Malaysia is not a leading designer of networking chips, but it plays a supporting role in the global supply chain through semiconductor manufacturing and assembly activities. As AI data centre investments increase in Southeast Asia, Malaysia has the opportunity to strengthen its position by supporting infrastructure deployment and participating in downstream activities such as testing, integration, and system-level manufacturing.
Overall, networking and interconnect performance is becoming a critical constraint in AI infrastructure. Even when sufficient chips and memory are available, limitations in how fast data can move across systems can restrict the performance and scalability of entire AI clusters.
4. Power, Energy and Thermal Management
AI data centers consume significantly more electricity than traditional data centers. High-performance AI chips require large and stable power supplies to operate efficiently, especially when thousands of processors are running simultaneously. As a result, power and energy management has become a critical part of AI infrastructure.
Globally, demand is increasing for power-related semiconductors such as voltage regulators, power management ICs, and power conversion components. Key players in this space include Infineon Technologies, Texas Instruments, STMicroelectronics and ON Semiconductor. These companies supply the components that ensure electricity is delivered efficiently and safely across complex AI systems.
The bottleneck is not just about generating enough power, but managing it effectively. AI servers can consume several times more energy than conventional servers, placing strain on both data centre infrastructure and national power grids. Upgrading power systems requires significant investment in transformers, backup systems, and energy distribution networks, all of which take time to scale.
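To see the scale of the problem, consider a rough comparison of rack-level power draw. Published figures vary widely, but dense AI training racks are commonly cited in the tens of kilowatts versus single digits for a conventional enterprise rack; the specific numbers below are assumptions for illustration:

```python
# Rough annual energy comparison per rack (illustrative figures only).
HOURS_PER_YEAR = 24 * 365  # 8760

conventional_rack_kw = 8   # assumed typical enterprise rack
ai_rack_kw = 80            # assumed dense AI training rack

for name, kw in [("conventional", conventional_rack_kw), ("AI", ai_rack_kw)]:
    mwh_per_year = kw * HOURS_PER_YEAR / 1000
    print(f"{name:>12} rack: {kw:3d} kW -> ~{mwh_per_year:.0f} MWh/year")
```

Even at these assumed figures, a single AI rack consumes an order of magnitude more energy per year than a conventional rack, which is why transformers, distribution networks, and backup systems all need upgrading before AI capacity can scale.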
At the same time, heat has become a major constraint. AI servers generate intense levels of heat due to their high energy usage and traditional air-cooling methods are often no longer sufficient. This is driving the adoption of advanced cooling technologies such as liquid cooling and immersion cooling, which are more efficient but also more complex and expensive to deploy at scale. Cooling systems must be carefully integrated with server design, adding another layer of engineering difficulty.
In Malaysia, power availability and infrastructure readiness are becoming key considerations as the country attracts AI data centre investments. While Malaysia benefits from relatively stable energy supply and growing data centre ecosystems (particularly in Johor and Penang), scaling AI infrastructure will require continued investment in both power capacity and cooling technologies. This includes grid upgrades, renewable energy integration, and more advanced data centre design capabilities.
Overall, power, energy management, and cooling are emerging as critical bottlenecks in AI expansion. Even if chips, memory, and networking are available, AI systems cannot scale without sufficient power delivery and effective heat management, making this one of the most fundamental constraints in the entire ecosystem.
Conclusion
AI data centers are transforming semiconductor demand, but the pace of expansion is increasingly shaped by supply-chain bottlenecks rather than compute demand alone. Companies and countries that successfully expand capabilities in memory, advanced packaging, networking and power electronics will be the winners of the AI data center race. If you’d like to know how we can assist you further on gathering commercial intelligence and risk management, contact us at inquiries@aheaddetect.com.my
References:
[1] https://www.trendforce.com/news/2024/06/20/news-microns-hbm-expansion-at-full-throttle-rumored-to-expand-in-us-malaysia-also-an-option/
