Barcelona, 12 January 2023 - The Centro Nacional de Análisis Genómico (CNAG-CRG) is one of the main partners of the Solve-RD project, a pan-European initiative aiming to improve the diagnosis of rare disease patients through a “genetic knowledge web” about genes, genomic variants and phenotypes. A key component in achieving this objective is the RD-Connect Genome-Phenome Analysis Platform (GPAP), which provides clinical scientists with a framework to process, analyse and share integrated sequencing and phenotypic data from patients with a rare disease and their relatives. The platform was developed and is managed by the CNAG-CRG, and it enables authorised users to identify causative genetic variants in undiagnosed patients or discover new gene-disease associations.
The RD-Connect GPAP currently hosts phenotypic and genomic data from more than 26,500 patients and relatives, but it does not store the genomic alignments on its online servers, since these files are very large and other services, such as the European Genome-Phenome Archive (EGA), are better suited for this purpose. However, it is essential for clinical scientists to have access to genomic alignments to visualize regions around candidate disease-causing variants. Although the algorithms identifying genetic variants are generally highly accurate, there are many regions of the genome where it remains challenging to detect variants from sequencing data.
In a study published in Cell Genomics, researchers from the CNAG-CRG Bioinformatics Unit, led by Sergi Beltran, in collaboration with the EGA, describe a new robust and scalable genome browser module implemented in the RD-Connect GPAP that enables users of the platform to visualize genomic alignments from data archived at the EGA. This module is accessible to all registered users of the GPAP and has been tested with 11,750 datasets, enabling more than 120 users to access and visualize the information for a specific alignment region in real time. The described implementation has already contributed to diagnosing a large number of previously undiagnosed patients and is expected to help diagnose many more in the future.
“Having to download and transfer gigabytes of data from a remote source each time a researcher wants to visualize a specific genomic locus is neither efficient nor scalable. For this reason, we are happy to present this new implementation that will allow users to request and receive genomic data in a standardized, secure and real-time streaming,” says Alberto Corvò, Front-end Engineer at the Data platforms and tools development team of the CNAG-CRG and co-first author of the study.
“Our work demonstrates the possibility of connecting and federating systems installed in different countries and under different institutions. It highlights the impact of developing and implementing interoperability standards, which will be essential for the establishment of large federated genomics data networks,” says Sergi Beltran, Head of the Bioinformatics Unit and last author of the study.
“This implementation is part of the analysis strategy used by Solve-RD that has led to the diagnosis of hundreds of patients, ending their ‘diagnosis odyssey’. An example is reported in the study published in the European Journal of Human Genetics where a pathogenic TRIP4 variant causing prenatal spinal muscular atrophy and congenital bone fractures was identified in a case that had been unsolved for several years,” says Leslie Matalonga, Clinical Genomics Specialist at the CNAG-CRG and co-first author of the study.
Work of reference