Chapter 01 · Silicon Fundamentals

The Compute Core: Sovereign Silicon and Packaging

The modern artificial intelligence ecosystem is fundamentally anchored to the physical limits of semiconductor fabrication. As transistor scaling approaches atomic boundaries, progress is defined by packaging architectures, memory bandwidth, and numeric precision.

TSMC 2nm Node Wafer Price
$30,000

A 50% increase over N3's ~$20,000 wafer cost, driven by GAA complexity.

Rubin Memory Bandwidth
22 TB/s

2.75× Blackwell's 8.0 TB/s, powered by the industry's first HBM4 integration.

CoWoS CAGR (2022-2027)
>80%

TSMC's projected annual growth rate for chip-on-wafer-on-substrate packaging.

TSMC SoIC Capacity CAGR
>90%

Projected annual system-on-integrated-chips stacking capacity growth.

Transistor Scaling & GAA Nanosheets

Taiwan Semiconductor Manufacturing Company (TSMC) began volume production of its **2nm (N2) node in Q4 2025**, marking its transition from FinFET architectures to **Gate-All-Around (GAA) nanosheet** transistors.

Compared with N3E, the N2 node delivers **10–15% performance gains at iso-power**, or **25–30% power reduction at iso-performance**, alongside a **15% density uplift** for mixed designs (up to 20% for logic-only components). These gains come at a steep price: N2 wafers cost approximately **$30,000 per wafer**, compared with ~$20,000 for 3nm.
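
To see what the figures above imply for cost per transistor, here is a minimal sketch. It assumes equal yield and equal usable wafer area on both nodes, and it treats the ~$20,000 "3nm" wafer price and the N3E density baseline as the same reference point, which the text does not confirm. The function and constant names are illustrative only.

```python
# Figures quoted in the text; everything else is an illustrative assumption.
N3_WAFER_USD = 20_000         # approximate 3nm wafer price
N2_WAFER_USD = 30_000         # approximate N2 wafer price
DENSITY_UPLIFT_MIXED = 0.15   # N2 vs N3E, mixed designs
DENSITY_UPLIFT_LOGIC = 0.20   # N2 vs N3E, logic-only upper bound


def relative_cost_per_transistor(old_price: float,
                                 new_price: float,
                                 density_uplift: float) -> float:
    """Return new/old cost per transistor.

    Assumes identical yield and wafer area, so transistors per wafer
    scale only with the density uplift.
    """
    price_ratio = new_price / old_price
    density_ratio = 1.0 + density_uplift
    return price_ratio / density_ratio


mixed = relative_cost_per_transistor(N3_WAFER_USD, N2_WAFER_USD, DENSITY_UPLIFT_MIXED)
logic = relative_cost_per_transistor(N3_WAFER_USD, N2_WAFER_USD, DENSITY_UPLIFT_LOGIC)

print(f"mixed design: {mixed:.2f}x cost per transistor")
print(f"logic only:   {logic:.2f}x cost per transistor")
```

Under these assumptions, a transistor on N2 costs roughly 1.30× (mixed designs) to 1.25× (logic-only) what it costs on 3nm: the 50% wafer price increase outpaces the density gain.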

Volume production is currently centered at **Fab 22 in Kaohsiung** and **Fab 20 in Hsinchu**, with TSMC planning a **70% compound annual growth rate in 2nm capacity from 2026 to 2028**. In contrast, Intel's rival **18A process** has entered volume manufacturing primarily for internal products and has struggled to win high-volume external foundry customers, leaving TSMC as the dominant manufacturer of leading-edge AI silicon.

The Advanced Packaging Bottleneck

As monolithic dies hit the reticle limit, performance scaling has shifted to multi-die architectures. TSMC's **CoWoS (Chip-on-Wafer-on-Substrate)** wafer-level packaging is the primary physical bottleneck of the AI accelerator supply chain. By placing logic dies and High Bandwidth Memory (HBM) stacks side by side on a silicon interposer, CoWoS enables high-bandwidth, low-latency inter-die connections.

TSMC projects **CoWoS capacity to grow more than 80% annually from 2022 to 2027**, while its **SoIC (System-on-Integrated-Chips)** 3D-stacking capacity is projected to increase **over 90% per year**. Despite aggressive domestic expansions in Taiwan and overseas projects in Arizona, Kumamoto, and Dresden, advanced packaging remains the single biggest chokepoint limiting AI accelerator shipments globally.
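
Compound growth rates this high are easy to underestimate, so the following sketch converts them into cumulative capacity multiples. It assumes the rates hold at exactly 80% and 90% (the text gives them as lower bounds) and that the SoIC figure covers the same 2022 to 2027 window as CoWoS, which the text does not state. `compound_multiple` is a hypothetical helper name.

```python
def compound_multiple(annual_growth: float, years: int) -> float:
    """Capacity multiple after `years` of growth at a fixed annual rate."""
    return (1.0 + annual_growth) ** years


YEARS = 2027 - 2022  # five annual growth steps

cowos = compound_multiple(0.80, YEARS)  # CoWoS at 80% per year
soic = compound_multiple(0.90, YEARS)   # SoIC at 90% per year (window assumed)

print(f"CoWoS capacity multiple: {cowos:.1f}x")
print(f"SoIC capacity multiple:  {soic:.1f}x")
```

At these floor rates, CoWoS capacity in 2027 would be about 18.9× its 2022 level and SoIC about 24.8×, which illustrates how much capacity must be added for packaging to stop being the constraint.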

Memory Hierarchy & Numeric Asymmetries

Training and inference performance is governed by a constant trade-off between computation and data movement:

  • SRAM (Static RAM): Fast on-die memory with sub-nanosecond latency. SRAM keeps weights and activations close to the compute units, but it is expensive per bit and consumes large die area, prompting architectures like Groq's to pool SRAM across many chips over a scale-up interconnect.
  • HBM (High Bandwidth Memory): 3D-stacked DRAM connected to the processor via a silicon interposer. The transition from Blackwell's HBM3e (8.0 TB/s) to next-generation **HBM4**, starting in late 2026, is expected to deliver up to **22 TB/s bandwidth** and **288GB capacity** per GPU on the NVIDIA Rubin R100, easing the Key-Value (KV) cache memory bottleneck in inference.
  • Numeric Precision: While training still relies largely on 16-bit formats (FP16/BF16), inference has shifted aggressively to lower bit-widths. **FP8** and **FP4** (specifically NVIDIA's NVFP4 with micro-block scaling) allow up to **7× GEMM (General Matrix Multiply) speedups** over Hopper and shrink large models with limited loss of accuracy.
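
The interaction between memory bandwidth and numeric precision can be sketched with a simple memory-bound estimate: at batch size 1, each generated token requires streaming all model weights from HBM once, so bandwidth divided by weight bytes gives an upper bound on tokens per second. The 70-billion-parameter model is a hypothetical example, and the estimate ignores KV cache traffic, scale-factor overhead for block-scaled formats, and compute time.

```python
# Bandwidth figures are from the text; the model size is a hypothetical example.
BANDWIDTH_BYTES_PER_S = {
    "Blackwell (HBM3e)": 8.0e12,
    "Rubin (HBM4)": 22.0e12,
}
BYTES_PER_PARAM = {"BF16": 2.0, "FP8": 1.0, "FP4": 0.5}
PARAMS = 70e9  # assumed dense model with 70 billion parameters


def max_tokens_per_second(bandwidth: float, params: float, bytes_per_param: float) -> float:
    """Upper bound on single-stream decode rate when weight reads dominate."""
    weight_bytes = params * bytes_per_param
    return bandwidth / weight_bytes


for gpu, bandwidth in BANDWIDTH_BYTES_PER_S.items():
    for fmt, width in BYTES_PER_PARAM.items():
        rate = max_tokens_per_second(bandwidth, PARAMS, width)
        print(f"{gpu:18s} {fmt:5s} <= {rate:6.1f} tokens/s")
```

Under these assumptions the BF16 ceiling rises from about 57 to about 157 tokens/s when moving from 8 TB/s to 22 TB/s, and halving the bit-width doubles the ceiling again on either part. Real throughput depends on batching, KV cache size, and kernel efficiency.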

Chapter Citations

All sources verified against primary SEC filings, lab blogs, and foundry registries.


AI Accelerator Specifications (2026 Landscape)

| Accelerator | Process Node | Transistors | On-Chip / HBM Memory | Bandwidth | Peak Performance | Status |
| --- | --- | --- | --- | --- | --- | --- |
| NVIDIA B300 | TSMC 4nm (N4P) | 208 Billion | 192GB HBM3e | 8.0 TB/s | ~10-20 PFLOPS FP4 | Shipping (18-week lead times) |
| NVIDIA R100 (Rubin) | TSMC 3nm (N3) | 336 Billion | 288GB HBM4 | 22.0 TB/s | 50 PFLOPS FP4 | Sampling Q4 2026, Volume Q1 2027 |
| Intel Gaudi 3 | TSMC 5nm (N5) | Undisclosed | 128GB HBM2e | 3.7 TB/s | 1,835 TFLOPS BF16 | Shipping (200K-250K units target) |
| Groq 3 LPX Rack | Undisclosed | Undisclosed | 128GB SRAM (Aggregate) | 640.0 TB/s (Scale-up) | Ultra-low latency LPU cluster | Shipping Q3 2026 |