DE-SC0021303: High throughput, accurate gene annotation through AI and HPC-enabled structural analysis

Award Status: Inactive
  • Institution: Georgia Tech Research Corporation, Atlanta, GA
  • UEI: EMW9FC8J3HN4
  • DUNS: 097394084
  • Most Recent Award Date: 07/29/2022
  • Number of Support Periods: 3
  • PM: Madupu, Ramana
  • Current Budget Period: 09/15/2022 - 09/14/2023
  • Current Project Period: 09/15/2020 - 09/14/2023
  • PI: Skolnick, Jeffrey
  • Supplement Budget Period: N/A
 

Public Abstract


High throughput, accurate gene annotation through AI and HPC-enabled structural analysis

J. Skolnick, Georgia Institute of Technology (Principal Investigator)

J. Cheng, University of Missouri – Columbia (Co-Investigator)

A. Sedova, Oak Ridge National Laboratory (Co-Investigator)

J. M. Parks, Oak Ridge National Laboratory (Co-Investigator)

Predicting the function of a protein from its sequence is a grand challenge in biology. Proteins are the workhorses of the cell, acting as enzymes that speed up chemical reactions so that the molecules necessary for life can be produced. They also provide the signaling and control needed to regulate cellular behavior and division. Proteins, which are long-chain molecules, fold into specific geometric shapes that enable them to carry out these processes. However, the functions of many proteins in bacteria, fungi and plants are unknown. Moreover, simply knowing the three-dimensional structure of the biologically active form of a protein does not immediately reveal what it does. New tools are needed to address this challenge. Accurate annotation using computational methods will facilitate breakthroughs in the genomic sciences essential to understanding and harnessing life processes in bacteria, fungi and plants. This project will create a computational infrastructure to infer gene function from the amino acid sequence of the protein using high-performance computing (HPC)-based multiscale simulation, informatics and machine-learning pipelines. Incorporating information from state-of-the-art structural modeling and simulations of what happens in a cell, together with evolutionary analysis and systems biology databases, will make this possible. This synergy, together with HPC-enabled bioinformatics and machine learning, will greatly extend the ability to predict protein function. Success of this project will advance one of the primary missions of DOE’s Office of Biological and Environmental Science, to translate nature’s genetic code into predictive models of biological function, and will be facilitated by HPC resources provided by the DOE leadership computing facilities.

 


