Monday, August 2, 2010
Pacific Concourse (Hyatt Regency San Francisco)
The new era in industrial biotech is increasingly driven by emerging ultra-fast sequencing technologies. The current major bottleneck is a lack of tools for systematically handling the large data volumes produced by next-generation sequencers and for interpreting genotype data across complex production strain ancestries. The design of improved production strains is of major interest to Evonik Degussa, a leading producer of L-amino acids using microbial production strains derived from C. glutamicum and E. coli. Here, we present a fully automated data analysis pipeline that compares dozens of proprietary strain genomes resulting from random mutagenesis campaigns and directed strain engineering strategies. We used the Genedata solution to systematically process and annotate these strains, including automatic identification and categorization of point mutations in their genetic and biological context. This process is assisted by tailored viewers that drill down to the sequencing data and predict the mutations’ influence on gene products (e.g., by modifying an enzyme’s active site) or on gene regulation (e.g., by altering a transcription factor’s DNA binding site). This study demonstrates how next-generation sequencing data can support rational genomic design strategies for developing strains that optimize yields of L-amino acids or other biotech products such as vitamins and enzymes.