Strategies for Phasing and Imputation in a Population Isolate
In: ISSN: 0741-0395, 2017
Online
academicJournal
In the search for genetic associations with complex traits, population isolates offer the advantage of reduced genetic and environmental heterogeneity. In addition, cost-efficient next-generation association approaches have been proposed for these populations, in which only a subsample of representative individuals is sequenced and genotypes are then imputed into the rest of the population. Gene mapping in such populations thus requires high-quality genetic imputation and preliminary phasing. To identify an effective study design, we compare by simulation a range of phasing and imputation software and strategies. We simulated 1,115,604 variants on chromosome 10 for 477 members of the large complex pedigree of Campora, a village within the established isolate of Cilento in southern Italy. We assessed the phasing performance of the identity-by-descent (IBD)-based software ALPHAPHASE and SLRP; the linkage disequilibrium (LD)-based software SHAPEIT2, SHAPEIT3, and BEAGLE; and the new software EAGLE, which combines both methodologies. For imputation we compared IMPUTE2, IMPUTE4, MINIMAC3, BEAGLE, and the new software PBWT. Genotyping errors and missing genotypes were simulated to observe their effects on the performance of each software. Highly accurate phased data were achieved by all software, with SHAPEIT2, SHAPEIT3, and EAGLE2 providing the most accurate results. MINIMAC3, IMPUTE4, and IMPUTE2 all performed strongly as imputation software, and our study highlights the considerable gain in imputation accuracy provided by a genome-sequenced reference panel specific to the population isolate.
Title: | Strategies for Phasing and Imputation in a Population Isolate |
---|---|
Author / Contributor: | Herzig, Anthony Francis ; Nutile, Teresa ; Babron, Marie-Claude ; Ciullo, Marina ; Bellenguez, Céline ; Leutenegger, Anne-Louise ; Variabilité Génétique et Maladies Humaines (U946) ; Institut Universitaire d'Hématologie (IUH) ; Université Paris Diderot - Paris 7 (UPD7)-Université Paris Diderot - Paris 7 (UPD7)-Institut National de la Santé et de la Recherche Médicale (INSERM) ; Université Sorbonne Paris Cité (USPC) ; Institute of Genetics and Biophysics "A. Buzzati Traverso" Naples, Italy ; National Research Council of Italy ; Istituto Neurologico Mediterraneo (NEUROMED I.R.C.C.S.) ; Università degli Studi di Roma "La Sapienza" = Sapienza University Rome (UNIROMA)-University of Naples Federico II = Università degli studi di Napoli Federico, II ; Facteurs de Risque et Déterminants Moléculaires des Maladies liées au Vieillissement - U 1167 (RID-AGE) ; Institut Pasteur de Lille ; Réseau International des Instituts Pasteur (RIIP)-Réseau International des Instituts Pasteur (RIIP)-Institut National de la Santé et de la Recherche Médicale (INSERM)-Université de Lille-Centre Hospitalier Régional Universitaire CHU Lille (CHRU Lille) ; Excellence Laboratory LabEx, DISTALZ ; ESGI: The research leading to these results has received funding from the Seventh Framework Programme FP7/2007-2013 under grant agreement no. 262055. A.F.H. was funded by an international Ph.D. fellowship from Sorbonne Paris Cité (convention HERZI15RDXMTSPC1LIETUE). ; We address special thanks to the people of Campora for their participation in the study. We kindly thank the European Genome-phenome Archive at the European Bioinformatics Institute for making available the UK10K imputation panel (EGAD00001000776) and HRC imputation panel (EGAD00001002729) for use in our simulation study. We also thank the two anonymous reviewers for their comments, which greatly improved the manuscript. ; European Project: 262055,EC:FP7:INFRA,FP7-INFRASTRUCTURES-2010-1,ESGI(2011) |
Journal: | ISSN: 0741-0395, 2017 |
Publication: | HAL CCSD ; Wiley, 2017 |
Media type: | academicJournal |
DOI: | 10.1002/gepi.22109 |