Geneticist Josep F. Abril takes part in a large-scale assessment of gene expression analysis software

Josep F. Abril, expert from the Department of Genetics and main researcher at the Computational Genomics Lab of the UB.
Research
(07/11/2013)

Researchers from the University of Barcelona (UB) and the Centre for Genomic Regulation (CRG) are part of an international consortium of scientists that has published a systematic assessment of the software used to analyse gene expression from data obtained by RNA-seq sequencing. The results, published recently in the journal Nature Methods, may inspire new computational approaches to handle current and future technologies for gene expression analysis. Josep F. Abril, from the Department of Genetics of the UB, and Roderic Guigó, from the CRG, collaborated in the research; they were the only Spanish experts who participated in the Human Genome Project, a major milestone in the field of human genetics.

Scientists use RNA sequencing to see how genes are expressed across an entire genome. But how can they analyse this information? Is the software they use good enough to do so? In this research, the RNA-seq Genome Annotation Assessment Project (RGASP) consortium, an initiative affiliated with the ENCODE project (ENCyclopedia Of DNA Elements), evaluated the performance of a wide range of RNA-seq computer programs. The experts were able to specify which approaches work well for certain tasks and which areas can be improved.

Improving computational tools for gene prediction

According to Professor Josep F. Abril, member of the Institute of Biomedicine of the University of Barcelona (IBUB), a centre affiliated with the BKC campus of international excellence, “the article results from a project aimed at assessing the reliability of the most modern gene prediction software in the context of the evidence provided by RNA-seq data, and at determining how these data could improve gene annotation and the computational tools that define it”.

“By systematically comparing the existing computational tools for gene prediction, we try to determine whether or not the new RNA-seq data improve the reliability of gene predictions”, explains Josep F. Abril, director of the Computational Genomics Lab of the UB. “Apart from providing excellent new tools for gene prediction”, he adds, “we have also identified the questions we should address in the future”.

Roderic Guigó, coordinator of the Bioinformatics and Genomics programme of the CRG, affirms that they have been working to find new and more sophisticated alternatives to currently available methods. “The conclusions we are presenting in these two papers”, he highlights, “will help to produce better methods and facilitate their application to diverse fields such as medicine and biotechnology”.

RGASP: a consortium devoted to genomics research

Paul Bertone, expert at the European Bioinformatics Institute (EMBL-EBI) and coordinator of the research, says that they found “a striking degree of variability in how these programs handle different aspects of RNA-seq data”. Some methods perform well overall, whereas others have clever design features that excel at solving specific problems. The experts were also able to highlight areas where many of these computational approaches can improve. “This kind of work”, underlines Bertone, “provides an important resource for the genomics community, and the consortium model was a unique platform to deliver that in a large-scale, systematic way”.

In both studies, developers of leading software programs were invited to participate in a detailed evaluation of computational methods for processing and interpreting RNA-seq data. Along the lines of the related ENCODE Genome Annotation Assessment Project (EGASP), all participating groups submitted their results for evaluation. Each of the methods compared in the study performs sequence alignment and transcript reconstruction, which are essential steps in the analysis of RNA-seq experiments.

New frontiers of RNA-seq

The consortium's systematic, meticulous approach to performance assessment resulted in findings that can be used to enhance and expand the range of RNA-seq analysis tools available for different kinds of studies. They can also inform developments that meet the demands of emerging sequencing technologies.

“One of the most relevant features of the study”, highlights Josep F. Abril, “is the great amount of data we have worked with. We evaluated gene annotations and predictions on the genomes of humans, the common fruit fly (Drosophila melanogaster) and the roundworm (Caenorhabditis elegans) and, at the same time, we included RNA-seq data generated by the ENCODE project”, details the expert. Another aspect worth stressing is that the research examined how well different computational tools quantify the level of expression of the structures predicted from RNA-seq data.