GRAPE: Matching natural products with gene clusters

Dejong, Chris

Nonribosomal peptide (NRP) and polyketide (PK) natural products are biosynthesized by large, multi-modular enzymatic assembly lines. The biosynthetic gene clusters of many such natural products remain unknown. Moreover, the exponential growth of available sequence information has revealed a large number of biosynthetic gene clusters that are not linked to their corresponding natural products. The inherently low fidelity of genome-based natural product structure predictions often prevents available software from linking genetic data to compounds. However, PRISM, a software platform developed in the Magarvey Lab, can predict from genetic data, with a high level of accuracy, the individual chemical fragments of the final natural product, including amino acids, malonate-derived monomers, sugars, and other chemical fragments. GRAPE is an algorithm that takes as input the chemical structures of NRPs, NRP-PK hybrids, lantipeptides, bacteriocins, and macrocyclic PKs, and determines the chemical fragments that would have been used to construct each natural product from its parent cluster. GRAPE reveals amino acid and polyketide fragments and their order from final structures, as well as other biosynthetically relevant constituents such as halogens, lactams, or thiazoles. The constituent fragments of the compound are then aligned to the chemical fragments predicted by PRISM. GRAPE tolerates mismatches and variation in fragment order, so partial and incomplete matches can still be found. By leveraging a large database of known natural products, GRAPE can match gene clusters to their products and to related compounds. Applied to large natural product databases, GRAPE will be able to draw new connections between natural products and clusters whose products are unknown.
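
The alignment step described above can be illustrated with a minimal sketch. This is not GRAPE's actual implementation, which the abstract does not specify. It assumes that both the compound and the cluster prediction are reduced to lists of fragment names, and it substitutes a standard global alignment (Needleman-Wunsch) for order-aware matching and a multiset intersection for order-independent matching. The function names, the scoring values, and the example fragment lists are all hypothetical.

```python
from collections import Counter


def ordered_score(observed, predicted, match=2, mismatch=-1, gap=-1):
    """Global alignment score between two fragment lists (Needleman-Wunsch).

    Mismatches and gaps are penalised rather than forbidden, so partial
    and incomplete matches still receive a score. The scoring values are
    illustrative assumptions, not values taken from GRAPE.
    """
    rows, cols = len(observed) + 1, len(predicted) + 1
    table = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against nothing costs one gap per fragment.
    for i in range(1, rows):
        table[i][0] = i * gap
    for j in range(1, cols):
        table[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if observed[i - 1] == predicted[j - 1] else mismatch
            table[i][j] = max(
                table[i - 1][j - 1] + pair,  # align the two fragments
                table[i - 1][j] + gap,       # fragment missing from prediction
                table[i][j - 1] + gap,       # fragment missing from compound
            )
    return table[-1][-1]


def unordered_overlap(observed, predicted):
    """Count fragments shared by both lists, ignoring their order."""
    return sum((Counter(observed) & Counter(predicted)).values())


if __name__ == "__main__":
    # Hypothetical fragment lists, for illustration only.
    compound = ["Val", "Orn", "Leu", "Phe", "Pro"]
    cluster = ["Val", "Orn", "Leu", "Pro"]
    print(ordered_score(compound, cluster))
    print(unordered_overlap(compound, cluster))
```

Ranking every database compound by such a score against the fragments predicted for one cluster would give a shortlist of candidate products and related compounds. How GRAPE itself weights mismatches and order changes is not stated here.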

Natural Product Discovery & Development in the Post Genomic Era

Conference Dates: January 11 - 14, 2015

Location: San Diego, CA