New Supercomputer Advances Supercomputing Ecosystem
Stampede3, a powerful new supercomputer that will enable groundbreaking open science research projects in the U.S. while building on previous investments in high performance computing, is coming to the Texas Advanced Computing Center (TACC) at The University of Texas at Austin.
For over a decade, the Stampede systems — Stampede (2012) and Stampede2 (2017) — have been flagships in the National Science Foundation’s (NSF) scientific supercomputing ecosystem. The Stampede systems will continue to provide a vital capability for researchers in every field of science.
Made possible by a $10 million award for computer hardware from the NSF, Stampede3 will be the newest strategic resource for the nation’s open science community when it enters full production in early 2024. It will enable thousands of researchers nationwide to investigate questions that require advanced computing power.
“We will continue our long partnership with Dell and Intel and leverage the NSF investments in Stampede2 for this new science resource using the latest technology processors with high bandwidth memory, and making Intel graphics processing units widely available to the NSF open science community,” said Dan Stanzione, executive director of TACC.
Stampede3 will deliver:
A new 4 petaflop capability for high-end simulation: 560 new Intel® Xeon® CPU Max Series processors with high bandwidth memory-enabled nodes, adding nearly 63,000 cores for the largest, most performance-intensive compute jobs.
A new graphics processing unit/AI subsystem including 10 Dell PowerEdge XE9640 servers adding 40 new Intel® Data Center GPU Max Series processors for AI/ML and other GPU-friendly applications.
Reintegration of 224 3rd Gen Intel Xeon Scalable processor nodes for higher memory applications (added to Stampede2 in 2021).
Legacy hardware to support throughput computing: more than 1,000 existing Stampede2 2nd Gen Intel Xeon Scalable processor nodes will be incorporated into the new system for high-throughput computing, interactive workloads, and other smaller workloads.
New Omni-Path Fabric 400 Gb/s technology, offering highly scalable performance through a network interconnect with 24 TB/s backplane bandwidth to enable low latency, excellent scalability for applications, and high connectivity to the I/O subsystem.
1,858 compute nodes with more than 140,000 cores, more than 330 terabytes of RAM, 13 petabytes of new storage, and almost 10 petaflops of peak capability.
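As a rough sanity check on the figures listed above, the short Python sketch below derives a few per-unit ratios from them. The inputs are the rounded numbers quoted in this article ("nearly", "more than", "almost"), so the results are approximations only; the variable names are illustrative, and the conversion of 1 terabyte to 1,000 gigabytes is an assumption.

```python
# Rounded figures as quoted in the article; treat every result as approximate.
hbm_processors = 560      # new Intel Xeon CPU Max Series processors
hbm_cores = 63_000        # "nearly 63,000 cores"
gpu_servers = 10          # Dell PowerEdge XE9640 servers
gpus = 40                 # Intel Data Center GPU Max Series processors
total_nodes = 1_858       # compute nodes in the full system
total_cores = 140_000     # "more than 140,000 cores"
total_ram_tb = 330        # "more than 330 terabytes of RAM"
hbm_pflops = 4            # new high-end simulation capability
peak_pflops = 10          # "almost 10 petaflops of peak capability"

GB_PER_TB = 1000          # assumption: decimal terabytes

derived = {
    "cores_per_hbm_processor": hbm_cores / hbm_processors,
    "gpus_per_server": gpus / gpu_servers,
    "avg_cores_per_node": total_cores / total_nodes,
    "avg_ram_gb_per_node": total_ram_tb * GB_PER_TB / total_nodes,
    "hbm_share_of_peak": hbm_pflops / peak_pflops,
}

for name, value in derived.items():
    print(f"{name}: {value:.1f}")
```

The system-wide averages mix several node types (new high bandwidth memory nodes, GPU servers, and reintegrated Stampede2 nodes), so they describe the system as a whole rather than any single node configuration.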
“Stampede3 will provide the user community access to CPU nodes equipped with high-bandwidth memory for accelerated application performance,” said Katie Antypas, office director for NSF’s Office of Advanced Cyberinfrastructure. “In addition, the transition from Stampede2 to Stampede3 will be transparent to users, easing the shift to a new system. I’m confident it will be a popular platform for the broad science and engineering community.”
The Stampede3 project, as with the previous Stampede systems, encompasses more than just technology. Stampede3 will also include first-class operations, user support and training, education, outreach, documentation, data management, visualization, analytics-driven application support, and research collaboration.
The new system will be delivered in fall 2023 and go into full production in early 2024, with no break in service from Stampede2 to Stampede3. It will serve the open science community from 2024 through 2029.
“The best way to make the case for the science and engineering need and promise of Stampede3 is to look at the success of the current Stampede2, which is nearing the end of its production life,” Stanzione said. “Individual jobs on Stampede2 have been successful even at the extreme scale, ranging to half a million cores, and have come from nearly all fields of science.”
Since its deployment, more than 11,000 users working on more than 3,000 funded projects have run more than 11 million simulations and data analysis jobs on Stampede2.