DE-SC0024547: Open and FAIR Fusion for Machine Learning Applications

Award Status: Active
  • Institution: Auburn University, Auburn, AL
  • UEI: DMQNDJDHTDG4
  • DUNS: 066470972
  • Most Recent Award Date: 08/12/2024
  • Number of Support Periods: 2
  • PM: Halfmoon, Michael
  • Current Budget Period: 09/01/2024 - 08/31/2025
  • Current Project Period: 09/01/2023 - 08/31/2026
  • PI: Kostadinova, Evdokiya
  • Supplement Budget Period: N/A
 

Public Abstract

Open and FAIR Fusion for Machine Learning Applications

Cristina Rea (Lead PI), Saskia Mordijck, Aleksandar Jelenak,
Stephanie Diem, Evdokiya Kostadinova

Massachusetts Institute of Technology, William & Mary, The HDF Group,
University of Wisconsin - Madison, Auburn University


Initial forays into applying the computing capabilities of Machine Learning (ML) to magnetically confined plasmas have shown tremendous potential. ML thrives on large, well-documented, curated datasets that are easily accessible. Most ML projects in Magnetic Fusion Energy (MFE), however, encounter several challenges: data structures that are not equipped to handle the scale of I/O that ML applications require, limited metadata, and a lack of Open and FAIR workflows and openly available databases. US experimental MFE databases at the various user facilities can be accessed only after signing a user agreement, and using them then involves a steep learning curve, with limited documentation and limited access to existing workflows.


This research aims to reduce these challenges by developing a Fusion Data Platform for Machine Learning, focused on MFE data, that will explicitly adhere to Findable, Accessible, Interoperable, Reusable (FAIR) and Open Science (OS) guidelines. To develop this Data Platform, the scope of work will focus on:


  • Redefining an appropriate metadata structure that matches FAIR/OS principles for MFE data and that is suitable for ML workflows,

  • Developing FAIR/OS workflows to curate and augment labeled data for classification of relevant events from multiple US MFE devices,

  • Making publicly available selected experimental and simulation data,

  • Diversifying workforce skills through interdisciplinary education of students and junior scientists in fusion and ML tasks.


The multi-institutional team will focus on four main research topics to develop a Fusion Data Platform for Machine Learning applications: (1) MDSplusML, (2) FAIR Workflows, (3) Open Databases, and (4) Student Engagement.


The MFE devices participating in this research are Alcator C-Mod, Pegasus-III, CTH, and HBT-EP. An interoperable and publicly available library will be developed using data from these devices. The library will have built-in pipelines for ML application design, so that scientific results can be preserved and reproduced.

The team will also expand interdisciplinary student engagement by designing an intensive two-week summer school on data science and fusion for undergraduate students, followed by funded summer research. Hands-on training will use the databases developed for the physics-based use cases of this work. The summer school will be hosted at William & Mary and will be essential to building a new interdisciplinary workforce for ML and fusion science.


