ASUS XA GB721-E2: GB300 NVL72 Rack-Scale AI

Built for trillion-parameter workloads, the XA GB721-E2 fuses Grace and Blackwell inside a liquid-cooled 48U rack for fast deployment and datacenter sanity.

ASUS is rolling out the XA GB721-E2, a rack-scale platform based on NVIDIA GB300 NVL72 to push model training, high-throughput inference, and HPC past the usual bottlenecks. In one NVLink domain, you get 36 Grace CPUs and 72 Blackwell Ultra GPUs wired for low-latency, high-bandwidth GPU-to-GPU traffic, packaged in a serviceable 48U MGX rack with liquid cooling aimed at predictable performance.

Rack-Scale Muscle, One NVLink Domain

At the heart is a full NVLink fabric spanning 72 Blackwell Ultra GPUs and 36 Grace CPUs. That layout targets trillion-parameter jobs where inter-GPU chatter kills throughput. By collapsing communication into a single domain, the system trims hops, flattens latency, and keeps tensor cores fed under sustained load.
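To see why collapsing traffic into one domain matters, here is a minimal back-of-envelope sketch of ring all-reduce time. All figures are hypothetical and chosen only for illustration; they are not ASUS or NVIDIA specifications.

```python
def allreduce_time_s(msg_bytes, n_gpus, link_gbps, hop_latency_us):
    """Estimate ring all-reduce time: each GPU sends 2*(n-1) chunks of
    size msg/n, paying link transfer time plus per-hop latency per step."""
    chunk = msg_bytes / n_gpus
    steps = 2 * (n_gpus - 1)
    bw_bytes_s = link_gbps * 1e9 / 8  # Gbit/s -> bytes/s
    return steps * (chunk / bw_bytes_s + hop_latency_us * 1e-6)

# Hypothetical comparison: a flat 72-GPU NVLink domain vs. the same 72
# GPUs stitched over a slower, higher-latency inter-rack link.
flat = allreduce_time_s(1 << 30, 72, link_gbps=7200, hop_latency_us=2)
stitched = allreduce_time_s(1 << 30, 72, link_gbps=800, hop_latency_us=10)
print(f"flat domain: {flat*1e3:.2f} ms, stitched: {stitched*1e3:.2f} ms")
```

The point of the sketch is structural, not numeric: both the bandwidth term and the latency term scale with the number of steps, so a single fat domain wins on both at once.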

Liquid-Cooled, Modular, and Meant to Be Touched

The 48U MGX rack integrates manifold-based liquid cooling, nine NVLink switch trays, and 18 compute trays. The plumbing is about stability and serviceability: predictable thermals, cleaner maintenance windows, and fewer airflow compromises when you pack dense accelerators into one cabinet.

Balanced I/O and Storage for Real Pipelines

Northbound and east-west traffic rides either the NVIDIA Quantum-X800 InfiniBand stack or Spectrum-X Ethernet with ConnectX-8 SuperNIC, so clusters scale without whiplash. For data, ASUS pairs all-flash hot tiers for AI/HPC with hybrid or unified arrays to handle backup and archival, letting teams tune cost vs. performance without bolting on yet another silo.
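The cost-versus-performance tuning comes down to one check: does the hot tier's aggregate read bandwidth cover the sustained input demand of the GPUs? A trivial sketch, with purely illustrative numbers rather than published specs:

```python
def hot_tier_feeds_gpus(n_gpus, per_gpu_read_gbs, tier_read_gbs):
    """Return (covered?, demand): whether the all-flash tier's aggregate
    read bandwidth meets the GPUs' sustained data-loading demand (GB/s)."""
    demand = n_gpus * per_gpu_read_gbs
    return tier_read_gbs >= demand, demand

# Hypothetical: 72 GPUs each streaming 0.5 GB/s against a 40 GB/s tier.
ok, demand = hot_tier_feeds_gpus(72, per_gpu_read_gbs=0.5, tier_read_gbs=40)
print(ok, demand)  # -> True 36.0
```

Anything below the demand line starves the accelerators; anything far above it is budget better spent on the hybrid backup tier.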

From Rack Arrival to Production in Minutes

On the ops side, ASUS Control Center (ACC) and ASUS Infrastructure Deployment Center (AIDC) promise near zero-touch onboarding, with orchestration that can bring a full rack online in roughly half an hour. It’s a turnkey path from installation to scheduled jobs, with lifecycle management baked in rather than duct-taped on.

For organizations wrestling with validation, cooling loops, and network topologies, ASUS Professional Services covers design through optimization. Pricing is typically on request, and availability rolls out via enterprise channels; ASUS didn’t publish public list prices for this configuration. Learn more at ASUS, and yes, the company is betting hard that a faster time-to-compute beats chasing raw TFLOPS on paper.

Technical Specifications

Platform: NVIDIA GB300 NVL72 rack-scale system
CPU: 36 NVIDIA Grace CPUs in a single NVLink domain
GPU: 72 NVIDIA Blackwell Ultra GPUs with NVLink
Interconnect: NVIDIA NVLink across compute trays; nine NVLink switch trays
Networking: NVIDIA Quantum-X800 InfiniBand or Spectrum-X Ethernet; ConnectX-8 SuperNIC
Compute Trays: 18 trays within 48U MGX-compliant rack
Cooling: Manifold-based liquid cooling for predictable thermals
Management: ASUS Control Center (ACC) and ASUS Infrastructure Deployment Center (AIDC)
Deployment: Turnkey rollout; full-rack online target ~30 minutes
Storage Options: All-flash hot tier for AI/HPC; hybrid and unified arrays for enterprise data
Use Cases: Large-scale model training, high-throughput inference, HPC pipelines
