We employed high-throughput sequencing technologies to sequence the genome of an inbred female laboratory Chinese hamster at high coverage. The genome sequence data were complemented by ~80,000 long EST sequences from traditional Sanger sequencing as well as transcript contigs obtained from deep-coverage RNA-seq data, enabling an efficient assembly of the Chinese hamster genome. The current assembly (~2.5 Gb), built from over two billion sequence reads, includes more than 25,000 annotated genes across a range of functional classes. This has allowed a global comparative analysis with the mouse, rat and human genomes. Furthermore, the investigation of regulatory features, including promoters, CpG islands and microRNAs, has opened up new avenues for manipulating the expression of individual genes as well as for genome-level interventions.
In addition, this work aims to study the genetic variation underlying economically important productivity traits in CHO cells, including cell line-specific polymorphisms, through a comparative genomics approach that uses diploid hamster DNA as the reference. The genome has also enabled new approaches to interpreting CHO cell transcriptome data from a variety of studies, including methotrexate (MTX)-based amplification experiments. The availability and application of these genomic resources will facilitate fundamental studies employing CHO cells and enable engineering at a genomic scale.