Genotyping By Sequencing development for Salmo salar: A simulation-based predictive approach using the R package SimRAD.

Bibliographic Details
Main Authors: Lepais, Olivier, Salin, Franck, Boury, Christophe, Guichoux, Erwan, Laizet, Yec'han, Weir, Jason T
Other Authors: Ecologie Comportementale et Biologie des Populations de Poissons (ECOBIOP), Institut National de la Recherche Agronomique (INRA)-Université de Pau et des Pays de l'Adour (UPPA), Biodiversité, Gènes & Communautés (BioGeCo), Université de Bordeaux (UB)-Institut National de la Recherche Agronomique (INRA), University of Toronto, European Union, Marie Curie CIG, European Project: 303526,EC:FP7:PEOPLE,FP7-PEOPLE-2011-CIG,GENEARLY(2012)
Format: Conference Object
Language: English
Published: HAL CCSD 2014
Subjects:
Online Access: https://hal.inrae.fr/hal-02799311
Description
Summary: The application of Next Generation Sequencing (NGS) platforms for genotyping purposes in the fields of biotechnology, ecology and evolutionary biology is developing quickly. Efficient methods for reducing genome complexity make the most of the huge number of sequences generated by allowing several individuals to be analyzed in a single run. As a result, numerous approaches for genome complexity reduction have recently been developed using different combinations of restriction enzymes, library construction protocols and fragment size selection. Choosing a strategy can therefore become cumbersome, because it is difficult to anticipate the number of loci resulting from each method and no tool was available to provide guidance. To fill this methodological gap, we developed the R package SimRAD (available on CRAN at http://cran.r-project.org/web/packages/SimRAD) for simulation-based prediction of the number of loci expected from alternative Genotyping by Sequencing (GBS) or Restriction site Associated DNA (RAD) protocols. The package can be used for non-model species for which no reference genome sequence is available, as well as for species with a released draft or full reference genome sequence. We illustrated the practical use of SimRAD by comparing the number of loci expected under different GBS approaches applied to Atlantic salmon. We based our simulations on a randomly generated DNA sequence with a GC content of 42.6%, characteristic of Atlantic salmon, and on the draft genome sequence (AGKD00000000.1) currently available for this species. Based on these estimates, we selected a GBS protocol that provided a good compromise between the number of loci and the potential for multiplexing individuals in a single run. We then implemented the GBS method on three individuals using the two restriction enzymes PstI and MseI and a fragment size selection step, with sequencing on the Ion Torrent PGM. This preliminary run resulted in a total of 100000 loci, which was within the range of the prediction ...
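
The simulation idea described in the summary can be illustrated with a short, self-contained sketch. It is written in plain Python and is not SimRAD's code or API. It generates a random sequence at 42.6% GC content, digests it in silico with PstI (CTGCA^G) and MseI (T^TAA), keeps fragments flanked by one cut from each enzyme, and counts those inside a size window. The function names, the 1 Mb sequence length and the 100-300 bp window are assumptions chosen for illustration; the record does not state the window or the simulated length actually used.

```python
import random
import re

# Recognition site and cut offset on the top strand (known enzyme properties).
PSTI = ("CTGCAG", 5)  # CTGCA^G
MSEI = ("TTAA", 1)    # T^TAA


def simulate_dna(length, gc_content, seed=0):
    """Return a random DNA string whose expected GC fraction is gc_content."""
    rng = random.Random(seed)
    at = (1.0 - gc_content) / 2.0
    gc = gc_content / 2.0
    # Weights are given in the order of the alphabet "ACGT".
    return "".join(rng.choices("ACGT", weights=[at, gc, gc, at], k=length))


def cut_positions(seq, site, offset):
    """Positions where the enzyme cuts; the lookahead also finds overlapping sites."""
    return [m.start() + offset for m in re.finditer(f"(?={site})", seq)]


def double_digest(seq, enz1, enz2):
    """Return (size, left_enzyme, right_enzyme) for each fragment between adjacent cuts."""
    cuts = [(p, 1) for p in cut_positions(seq, *enz1)]
    cuts += [(p, 2) for p in cut_positions(seq, *enz2)]
    cuts.sort()
    return [(p2 - p1, e1, e2) for (p1, e1), (p2, e2) in zip(cuts, cuts[1:])]


def count_loci(fragments, min_size, max_size):
    """Count fragments with one end from each enzyme and a size inside the window."""
    return sum(
        1
        for size, e1, e2 in fragments
        if e1 != e2 and min_size <= size <= max_size
    )


if __name__ == "__main__":
    # Hypothetical settings: 1 Mb of simulated sequence, 100-300 bp window.
    genome = simulate_dna(1_000_000, 0.426, seed=42)
    fragments = double_digest(genome, PSTI, MSEI)
    print("fragments:", len(fragments))
    print("candidate loci in window:", count_loci(fragments, 100, 300))
```

A count obtained on a short simulated sequence would still have to be scaled to the full genome size of the species, a step this sketch leaves out.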