GBSathon: Benchmarking reproducibility of genotyping-by-sequencing analysis workflows through comparison with SNP chip and pedigree data
The advent of reduced representation genotyping-by-sequencing (GBS) provides a cost-effective, high-throughput genotyping platform for many ‘orphan’ species. This enables downstream analyses including genomic selection, parentage assignment, conservation genetics, population genetics and genome-wide association studies. There are many different workflows available for deriving SNPs from GBS data. Key aspects of any bioinformatic workflow include accuracy, reproducibility and reliability. Few independent studies benchmark multiple workflows against biological ‘gold standards’, such as pedigree or SNP chip data, to assess these key aspects. Here, we benchmark open-source SNP-calling workflows for GBS data to assess their accuracy and reproducibility. To do this, we generated GBS data for a cohort of 333 sheep. These animals have also been genotyped using a 50k or 600k SNP chip. Furthermore, the cohort comprised 125 parent-offspring trios, and all individuals had multigenerational pedigree data. The SNPs called by the GBS workflows were compared against the gold standards to assess the accuracy, reproducibility and reliability of the SNP callers. Focusing on the bigger picture, we derived genomic relationship matrices (GRMs) from all methods to compare the accuracy of the called SNPs for downstream biological applications, including relationship estimates among parents and progeny.
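As a rough illustration of the two kinds of comparison described above, the sketch below computes genotype concordance between GBS and SNP chip calls and builds a GRM from allele counts. It is not the workflow used in this study: the function names, the 0/1/2 allele-count coding, the use of -1 for missing calls, and the choice of VanRaden's first method for the GRM are all assumptions made for the example, and the data are toy values.

```python
import numpy as np

def concordance(gbs, chip):
    """Fraction of matching genotype calls, skipping missing calls (coded -1)."""
    gbs = np.asarray(gbs)
    chip = np.asarray(chip)
    called = (gbs >= 0) & (chip >= 0)  # positions called on both platforms
    if not called.any():
        return float("nan")
    return float((gbs[called] == chip[called]).mean())

def vanraden_grm(genotypes):
    """GRM by VanRaden's first method (an assumed choice for this sketch).

    genotypes: individuals x SNPs matrix of 0/1/2 allele counts, no missing data.
    """
    M = np.asarray(genotypes, dtype=float)
    p = M.mean(axis=0) / 2.0               # allele frequency per SNP
    Z = M - 2.0 * p                        # centre each SNP column
    denom = 2.0 * np.sum(p * (1.0 - p))    # scales G towards a pedigree-like scale
    return Z @ Z.T / denom

# Toy data: 3 individuals x 4 SNPs (hypothetical values, not from the study).
chip = np.array([[0, 1, 2, 1],
                 [2, 1, 0, 1],
                 [1, 1, 1, 2]])
gbs = np.array([[0, 1, 2, -1],
                [2, 2, 0, 1],
                [1, 1, 1, 2]])

print("concordance:", concordance(gbs, chip))
print("chip GRM:\n", vanraden_grm(chip))
```

Low-depth GBS data usually needs a GRM estimator that accounts for read depth and missing calls, so the complete-data formula here is only meant to show the shape of the calculation.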
ABOUT THE AUTHOR
Rachael Ashby is a postdoctoral researcher with the Bioinformatics team at AgResearch and
Genomics Aotearoa. Her research focusses on the use of next-generation sequencing for
applications including genome assembly and genotyping-by-sequencing for genomic
management of highly diverse species.