DE-SC0025416: ZF: A novel framework to design trustworthy lossy compressors for scientific data approaching lossy compressibility limits

Award Status: Active
  • Institution: Virginia Polytechnic Institute and State University, Blacksburg, VA
  • UEI: QDE5UHE5XD16
  • DUNS: 003137015
  • Most Recent Award Date: 09/03/2024
  • Number of Support Periods: 1
  • PM: Rabson, David
  • Current Budget Period: 09/01/2024 - 08/31/2025
  • Current Project Period: 09/01/2024 - 08/31/2027
  • PI: Liu, Lingjia
  • Supplement Budget Period: N/A
 

Public Abstract

Data reduction is necessary in many scientific domains because large-scale numerical simulation codes and instruments produce massive datasets, often at high rates, which makes the data challenging to move and store. Among reduction techniques, lossy compression for scientific data is important because it complements sampling, filtering, and dimensionality reduction: it preserves all data points and reduces dataset sizes only by exploiting redundancies and tuning accuracy. However, current techniques for designing lossy compression methods suffer from three critical limitations: lack of theoretical support, lack of direct error controls for spatial quantities of interest (QoIs), and lack of support for the structured and unstructured grids used by some of these instruments and codes. For example, lossy compressors for scientific data are currently designed without a notion of optimality, so progress has been incremental. As in other research domains that rely on theory to guide algorithm design, a rigorous theoretical framework is needed to guide lossy compressor design.

  

To address these three limitations, we propose developing ZF, a novel framework for reasoning about, designing, and building lossy compressors that approach practical lossy compressibility limits. We will develop ZF through three connected research thrusts: 1) innovative techniques to compute practical compressibility limits for any scientific dataset given QoI preservation constraints, 2) physics-based QoI preservation techniques for structured and unstructured grids and their integration into compressibility limit formulations, and 3) efficient approximation methods to approach practical compressibility limits. We will use ZF to design and implement novel lossy compressors for three DOE applications (ARM observatory, turbulent flows in complex geometry, and high-energy coherent diffraction imaging), representing three domains (climate, combustion, and light sources) and three scientific modi operandi (observation, simulation, and experiment).

  

The ZF project will produce new knowledge and understanding, QoI preservation techniques, decorrelation methods, and high-performance compressor implementations generalizable to a broad range of mission-critical DOE applications that depend on compression. Thrust 1 is the first attempt to adapt fundamental rate-distortion theory to scientific data. The resulting formulations will help assess the effectiveness of existing lossy compressors and provide the community with a solid guide to designing new compression schemes for scientific data. Thrust 2 addresses the unexplored preservation of QoIs with three novel techniques, chosen depending on tractability: symbolic derivation of QoI preservation, co-design of QoI preservation and analysis, and iterative and selective QoI preservation. Thrust 3 develops innovative, effective, and fast compression algorithms that build on multivariate prediction using approximated covariance matrices, randomized low-rank approximations, and graph neural networks.


