DIII-D gets supercomputing access through the DOE’s high-speed data network

June 5, 2024, 7:00AM | Nuclear News
The DIII-D Superfacility team. (Photo: General Atomics)

Researchers at the DIII-D National Fusion Facility, the National Energy Research Scientific Computing Center (NERSC) at Lawrence Berkeley National Laboratory (LBNL), and the Energy Sciences Network (ESnet) are teaming up to make NERSC’s high-performance computing (HPC) resources available to DIII-D researchers through ESnet, a high-speed data network. Their collaboration, described in a May 29 news release, boosts the computing power behind DIII-D’s diagnostic tools, making more data from fusion experiments available to researchers at DIII-D in San Diego and to the global fusion research community.

Superfacility breakthrough: A team from DIII-D, NERSC, and ESnet developed a “Superfacility”: a multi-institution scientific environment that combines experimental resources at DIII-D with computing resources at NERSC, interconnected via ESnet6, the latest iteration of ESnet’s dedicated high-speed network for science. This Superfacility model enables near-real-time analysis of massive quantities of data during experiments.

“To achieve the goal of fusion, all available resources, including HPC, need to be applied to the analysis of present fusion experiments to be able to extrapolate to future fusion power plants,” said Sterling Smith, the project lead for the DIII-D team involved in the Superfacility effort.

“We have long recognized that experimental science teams need a better way to connect experiment facilities with high-speed networks and HPC,” said NERSC Data Department head Debbie Bard, who leads Superfacility work at LBNL. “We started the Superfacility project at Berkeley Lab as a broad initiative to develop the tools, infrastructure, and policies to enable these connections. DIII-D and ESnet have been key partners in this work.”

Raffi Nazikian, fusion data science senior director at General Atomics, which hosts the DIII-D National Fusion Facility as an Office of Science User Facility on behalf of the DOE, added, “We see the Superfacility concept, and the emerging Integrated Research Infrastructure under the DOE Office of Advanced Scientific Computing Research, as a transformative capability for fusion research and look forward to exploring its full potential, beginning with the DIII-D/NERSC Superfacility.”

Between-shot processing: At DIII-D, researchers study the behavior of short plasma discharges called shots, typically performed at 10- to 15-minute intervals, capturing data with nearly 100 diagnostic and instrumentation systems. Between shots, they have just those 10 to 15 minutes to address any issues or evaluate how specific parameter settings affect plasma behavior. That is not enough time to run on-site the higher-fidelity modeling that is possible today, nearly 40 years after DIII-D first operated in 1986.

“While DIII-D has automated rapid data processing performed on local computing systems to provide near-real-time feedback to scientists to inform experimental decision-making, over the years, the fidelity of models and understanding of the physics has increased dramatically,” said David Schissel, DIII-D computer systems and science coordinator at General Atomics.

“DIII-D (and fusion more generally) presents a use case where returning results quickly really matters,” said Laurie Stephey, a member of the NERSC team. “Many types of fusion simulation and data analysis are too computationally demanding to be run on local resources between shots, so this often means that these analyses either never get done, or if they do get done, they are often finished too late to be actionable.”

Getting it set up: Turning the collaboration between DIII-D, NERSC, and ESnet into a working Superfacility began with coordinating code. The researchers first needed to ensure that the EFIT code used at DIII-D to calculate the device’s equilibrium magnetic field profile would run well on NERSC’s Perlmutter supercomputer. Then, they used the Consistent Automated Kinetic Equilibria (CAKE) workflow, developed for DIII-D by Princeton University associate professor Egemen Kolemen and his team, to produce full descriptions of plasma behavior in the DIII-D tokamak. By adjusting the CAKE workflow for optimal use on NERSC, the Superfacility team was able to decrease the time to solution by about 80 percent, from 60 minutes to 11 minutes for a benchmark case.
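To put the benchmark figures in context, the short Python sketch below works through the arithmetic using only numbers quoted in this article: the 60-minute and 11-minute times to solution and the 10- to 15-minute interval between shots. The variable and function names are illustrative, and the comparison against the shot interval assumes the benchmark time is representative and ignores data transfer and queueing, which the article does not quantify.

```python
# Figures quoted in the article (benchmark CAKE case and shot spacing).
BASELINE_MIN = 60.0                 # time to solution before tuning, minutes
TUNED_MIN = 11.0                    # time to solution on NERSC after tuning, minutes
SHOT_INTERVAL_MIN = (10.0, 15.0)    # typical spacing between shots, minutes


def percent_reduction(before: float, after: float) -> float:
    """Relative decrease in time to solution, in percent."""
    return 100.0 * (before - after) / before


def speedup(before: float, after: float) -> float:
    """How many times faster the tuned run is than the baseline."""
    return before / after


reduction = percent_reduction(BASELINE_MIN, TUNED_MIN)
factor = speedup(BASELINE_MIN, TUNED_MIN)

# Assumption: the benchmark time is the whole turnaround (no transfer or queue time).
fits_shortest = TUNED_MIN <= SHOT_INTERVAL_MIN[0]
fits_longest = TUNED_MIN <= SHOT_INTERVAL_MIN[1]

print(f"reduction: {reduction:.1f}%")                 # reduction: 81.7%
print(f"speedup: {factor:.2f}x")                      # speedup: 5.45x
print(f"fits a 10-minute interval: {fits_shortest}")  # False
print(f"fits a 15-minute interval: {fits_longest}")   # True
```

Under these assumptions, the tuned benchmark would finish within a 15-minute gap between shots but not within a 10-minute one, whereas the original 60-minute run would span several shots.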

In the first six months of operations with Superfacility computing power, DIII-D produced more than 20,000 automated high-resolution magnetic field profile reconstructions for 555 DIII-D shots. By contrast, between 2008 and 2022 only 4,000 reconstructions were produced. All reconstructions have lasting value as part of a database that DIII-D users can use to inform experimental planning and interpretation.
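As a rough illustration of scale, the sketch below computes two ratios from the counts given above. It treats "more than 20,000" as exactly 20,000, so both results are lower bounds, and the variable names are illustrative rather than taken from any DIII-D tooling.

```python
# Counts quoted in the article.
RECONSTRUCTIONS_SUPERFACILITY = 20_000   # "more than 20,000": treated as a lower bound
SHOTS_COVERED = 555                      # shots analyzed in the first six months
RECONSTRUCTIONS_2008_2022 = 4_000        # produced between 2008 and 2022

# Average number of reconstructions per shot with Superfacility computing.
per_shot = RECONSTRUCTIONS_SUPERFACILITY / SHOTS_COVERED

# Six-month Superfacility output relative to the entire 2008-2022 output.
ratio_to_prior_total = RECONSTRUCTIONS_SUPERFACILITY / RECONSTRUCTIONS_2008_2022

print(f"at least {per_shot:.0f} reconstructions per shot")             # at least 36
print(f"at least {ratio_to_prior_total:.0f}x the 2008-2022 total")     # at least 5x
```

In other words, six months of Superfacility operation yielded at least five times as many reconstructions as the preceding period from 2008 to 2022, averaging roughly 36 per shot.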

Expanding access: The success of the DIII-D/NERSC Superfacility may be a model for similar projects and other Integrated Research Infrastructure collaborations across the DOE lab complex. In fact, the DIII-D team is also working through ESnet with Argonne National Laboratory’s Leadership Computing Facility to analyze plasma pulses quickly, taking a different approach to the workflow.

The Superfacility model can also help make the experimental process more equitable and inclusive for a broader range of researchers. Providing HPC access to all team researchers without requiring them to acquire their own allocation of computing time allows newer groups and researchers to participate in experimental sessions. Additionally, the physically distributed nature of a Superfacility lowers barriers to entry for early-career researchers with smaller professional networks to collaborate with subject matter experts in their research projects.

