Knowledge discovery in variant databases using inductive logic programming.
Publication record
Publication date
January 2013
Authors
Identified members of Cancéropôle Est:
Dr POCH Olivier
All authors:
Nguyen H, Luu TD, Poch O, Thompson JD
Abstract
Understanding the effects of genetic variation on the phenotype of an individual is a major goal of biomedical research, especially for the development of diagnostics and effective therapeutic solutions. In this work, we describe the use of a recent knowledge discovery from databases (KDD) approach using inductive logic programming (ILP) to automatically extract knowledge about human monogenic diseases. We extracted background knowledge from MSV3d, a database of all human missense variants mapped to 3D protein structure. In this study, we identified 8,117 mutations in 805 proteins with known three-dimensional structures that were known to be involved in human monogenic disease. Our results help to improve our understanding of the relationships between structural, functional or evolutionary features and deleterious mutations. Our inferred rules can also be applied to predict the impact of any single amino acid replacement on the function of a protein. The interpretable rules are available at http://decrypthon.igbmc.fr/kd4v/.
Reference
Bioinform Biol Insights. 2013 Mar 18;7:119-31