November 15, 2010 (Vol. 30, No. 20)

Solving the Next-Gen Sequencing Data Crunch

Eureka Genomics' Bioinformatics Platform Takes Aim at Computational Bottlenecks

    Eureka Genomics worked with scientists from the University of California, Davis to characterize a previously undefined viral disease that threatens California vineyards. The company isolated RNA from both infected and healthy grapevines, then compared short reads from the infected plants with those from the uninfected plants.

    The foundation for Eureka Genomics was laid several years ago during a meeting sponsored by the U.S. Department of Homeland Security. Researchers were demonstrating their inventions when one presenter, Yuriy Fofanov, Ph.D., director of bioinformatics at the University of Houston, TX, demonstrated a bioinformatics platform. The technology immediately impressed Didier Perez, COO and CFO at Eureka Genomics. “We had never seen any bioinformatics system that could provide the answers we were looking for.”

    Perez secured worldwide exclusive rights to the technology from the University of Houston and launched Eureka Genomics in 2007, along with Dr. Fofanov and Heather Koshinsky, Ph.D., CSO. The company, located in Hercules, CA, and Houston, TX, first offered bioinformatics services based on designing ultraspecific RNA and DNA signatures for companies developing diagnostics. The platform subsequently expanded into its Next Generation Bioinformatics Service.

    The storage, manipulation, and analysis of massive quantities of sequence data cause computational bottlenecks and delay scientific progress. Eureka Genomics’ approach uses novel sets of data structures and algorithms to quickly process data in a nonheuristic manner, according to Perez.

    Most other approaches, including search tools such as BLAST, rely on heuristic analysis to find approximate answers to questions. Heuristic approaches are limited, however, because they identify only simple alignments or the lack of alignments in data and do not guarantee that all possible matches are detected. “In order to detect insertions, deletions, and substitutions at any position in the length of a sequence or portion of a genome, you need a nonheuristic approach.”
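To make the distinction concrete, here is a minimal Python sketch of one well-known exhaustive technique, Sellers' dynamic-programming algorithm for approximate string matching. It is an illustration only and is not Eureka Genomics' platform, whose data structures and algorithms the article does not describe. The function name `find_approximate_matches` and its parameters are hypothetical. Because every cell of the table is computed, no occurrence within the edit budget can be missed, which is the guarantee a seed-based heuristic search gives up in exchange for speed.

```python
def find_approximate_matches(pattern, text, max_edits):
    """Return (end_position, edits) for every place in `text` where
    `pattern` ends with at most `max_edits` substitutions, insertions,
    or deletions. `end_position` is 1-based.

    Illustrative only: exhaustive dynamic programming (Sellers' algorithm),
    not the method used by any particular commercial platform.
    """
    m = len(pattern)
    # prev[i] = fewest edits needed to match pattern[:i] against some
    # substring of the text ending at the previous text position.
    prev = list(range(m + 1))
    matches = []
    for end, base in enumerate(text, start=1):
        # curr[0] stays 0 so that a match may start at any text position.
        curr = [0] * (m + 1)
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == base else 1
            curr[i] = min(
                prev[i - 1] + cost,  # match or substitution
                prev[i] + 1,         # extra base in the text (insertion)
                curr[i - 1] + 1,     # base missing from the text (deletion)
            )
        if curr[m] <= max_edits:
            matches.append((end, curr[m]))
        prev = curr
    return matches


if __name__ == "__main__":
    # Exact occurrence of ACGT ending at position 6.
    print(find_approximate_matches("ACGT", "TTACGTTT", 0))
```

The running time grows with the product of the pattern and text lengths, which hints at why exhaustive analysis of large read sets is computationally demanding.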

    Fifteen years ago, “biologists thought that longer sequence reads were better,” explains Dr. Koshinsky. But Dr. Fofanov, a mathematician, took an early view of sequencing data that was the opposite of the biological one. He wondered why biologists were so obsessed with long reads. It didn’t make sense to the mathematician, who thought shorter reads would be more informative if analyzed correctly.

    Although computationally intensive, the bioinformatics technology provides a thorough understanding of sequence data, including mapping, assembly, and novel sequence discovery. The technology, adopted by Eureka Genomics, allows scientists to ask which genes are more or less expressed, discover and sequence new species, or identify foreign DNA in complex samples. “Those types of experiments are more addressable with our robust analysis of sequence data,” Dr. Koshinsky says.
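The grapevine comparison described in the image caption, in which short reads from infected plants are compared with reads from healthy plants, can be pictured with a simple subtractive filter. The sketch below is an assumption-laden illustration, not the company's pipeline: the function names, the choice of exact k-mer matching, and the value of `k` are all hypothetical. It keeps only those infected-sample reads that share no k-mer with any healthy-sample read, leaving candidates for foreign (for example, viral) sequence.

```python
def kmers(sequence, k):
    """Return the set of all length-k substrings of `sequence`."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def reads_unique_to_infected(infected_reads, healthy_reads, k):
    """Return infected-sample reads that share no k-mer with the healthy sample.

    Illustrative only: a real analysis would also need to handle sequencing
    errors, reverse complements, and read-depth differences.
    """
    healthy_kmers = set()
    for read in healthy_reads:
        healthy_kmers |= kmers(read, k)

    unique_reads = []
    for read in infected_reads:
        read_kmers = kmers(read, k)
        # Skip reads shorter than k, which yield no k-mers to compare.
        if read_kmers and read_kmers.isdisjoint(healthy_kmers):
            unique_reads.append(read)
    return unique_reads


if __name__ == "__main__":
    healthy = ["ACGTACGT"]
    infected = ["ACGTAC", "GGGTTTCC"]
    print(reads_unique_to_infected(infected, healthy, k=4))
```

Reads that survive the filter would then be candidates for assembly and identification, the novel sequence discovery step the article mentions.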
