High-quality carnivoran genomes from roadkill samples enable comparative species delineation in aardwolf and bat-eared fox



Bibliographic Details
Main Authors: Allio, Rémi; Tilak, Marie-Ka; Scornavacca, Céline; Avenant, Nico L.; Kitchener, Andrew C.; Corre, Erwan; Nabholz, Benoit; Delsuc, Frédéric
Format: Dataset
Language: unknown
Published: Zenodo 2021
Online Access: https://dx.doi.org/10.5281/zenodo.4479226
https://zenodo.org/record/4479226
Description
Summary: High-quality carnivore genomes from roadkill samples enable comparative species delineation in aardwolf and bat-eared fox

Rémi Allio 1*, Marie-Ka Tilak 1, Céline Scornavacca 1, Nico L. Avenant 2, Andrew C. Kitchener 3, Erwan Corre 4, Benoit Nabholz 1&5, and Frédéric Delsuc 1*

Affiliations
1 Institut des Sciences de l’Evolution de Montpellier (ISEM), CNRS, IRD, EPHE, Université de Montpellier, France
remi.allio@umontpellier.fr, marie-ka.tilak@umontpellier.fr, celine.scornavacca@umontpellier.fr, benoit.nabholz@umontpellier.fr, frederic.delsuc@umontpellier.fr
2 National Museum and Centre for Environmental Management, University of the Free State, Bloemfontein, South Africa
navenant@nasmus.co.za
3 Department of Natural Sciences, National Museums Scotland, Edinburgh, UK
a.kitchener@nms.ac.uk
4 CNRS, Sorbonne Université, FR2424, ABiMS, Station Biologique de Roscoff, 29680 Roscoff, France
corre@sb-roscoff.fr
5 Institut Universitaire de France (IUF)
*Correspondence: remi.allio@umontpellier.fr, frederic.delsuc@umontpellier.fr

Running head
Genomics from roadkill samples

Abstract
In a context of continuing erosion of biodiversity, obtaining genomic resources from wildlife is becoming essential for conservation. The many thousands of mammals killed on roads annually could potentially provide a useful source of material for genomic surveys. To illustrate the potential of this underexploited resource, we used roadkill samples to sequence reference genomes and study the genomic diversity of the bat-eared fox (Otocyon megalotis) and the aardwolf (Proteles cristatus), for which subspecies have been defined based on similar disjunct distributions in Eastern and Southern Africa. By developing an optimized DNA extraction protocol, we successfully obtained long reads using the Oxford Nanopore Technologies (ONT) MinION device.
For the first time in mammals, we obtained two reference genomes with high contiguity and gene completeness by combining ONT long reads with Illumina short reads using a hybrid assembly approach. Based on re-sequencing data from a few other roadkill samples, the comparison of the genetic differentiation between our two pairs of subspecies to that of pairs of well-defined species across the Carnivora showed that the two subspecies of aardwolf might warrant species status (P. cristatus and P. septentrionalis), whereas the two subspecies of bat-eared fox might not. Moreover, using these data, we conducted demographic analyses that revealed similar trajectories between Eastern and Southern populations of both species, suggesting that their population sizes have been shaped by similar environmental fluctuations. Finally, we obtained a well resolved genome-scale phylogeny for the Carnivora, with evidence for incomplete lineage sorting among the three main arctoid lineages. Overall, our cost-effective strategy opens the way for large-scale population genomic studies and phylogenomics of mammalian wildlife using roadkill.

Figures & Tables

Figure 1. Disjunct distributions of the aardwolf (Proteles cristatus) and the bat-eared fox (Otocyon megalotis) in Eastern and Southern Africa. Within each species, two subspecies have been recognized based on their distributions and morphological differences (Clark, 2005; Koehler and Richardson, 1990). Picture credits: Southern aardwolf (P. cristatus cristatus) copyright Dominik Käuferle; Southern bat-eared fox (O. megalotis megalotis) copyright Derek Keats.

Figure 2. Representation of the mitochondrial genetic diversity within the Carnivora with a) the mitogenomic phylogeny inferred from 142 complete Carnivora mitogenomes, including those of the two populations of aardwolf (Proteles cristatus) and bat-eared fox (Otocyon megalotis) and b) intraspecific (orange) and interspecific (red) genetic diversities observed for the two mitochondrial markers COX1 and CYTB. Silhouettes from http://phylopic.org/.

Figure 3. Genetic differentiation indices obtained from a comparison of intraspecific (orange) and interspecific (red) polymorphisms in four pairs of well-defined Carnivora species and for the subspecies of aardwolf (Proteles cristatus) and bat-eared fox (Otocyon megalotis) (grey). Silhouettes from http://phylopic.org/.

Figure 4. PSMC estimates of changes in effective population size over time for the Eastern (orange) and Southern (blue and purple) populations of a) bat-eared fox and b) aardwolf. mu = mutation rate of 10^-8 mutations per site per generation and g = generation time of 2 years. Vertical red lines indicate 20 kyrs and 40 kyrs. Silhouettes from http://phylopic.org/.

Figure 5. Phylogenomic tree reconstructed from the nucleotide supermatrix composed of 14,307 single-copy orthologous genes for 52 species of Carnivora plus one outgroup (Manis javanica). The family names in the legend are ordered as in the phylogeny. Silhouettes from http://phylopic.org/.

Figure 6. Phenotypic comparisons, highlighting the differences in fur coloration and stripe pattern, between captive individuals of Eastern (P. septentrionalis) and Southern (P. cristatus) aardwolves held at Hamerton Zoo Park (UK). All pictures copyright and used with permission from Rob Cadd.

Table 1. Summary of sequencing and assembly statistics of the genomes generated in this study.
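
The Figure 4 caption gives the two constants (mu and g) needed to express PSMC output in years and in effective population size. The following Python sketch shows how that conversion is conventionally done. It assumes the usual PSMC scaling, N0 = theta0 / (4 * mu * s) with s the bin size in base pairs (100 is the PSMC default); the function name scale_psmc and the input values in the usage example are hypothetical and are not taken from the dataset.

```python
def scale_psmc(theta0, times, lambdas, mu=1e-8, gen_time=2.0, bin_size=100):
    """Convert raw PSMC estimates into years and effective population sizes.

    theta0   : scaled mutation rate per bin reported by PSMC
    times    : interval boundaries, in units of 2*N0 generations
    lambdas  : relative population sizes, in units of N0
    mu       : mutation rate per site per generation (10^-8 in Figure 4)
    gen_time : generation time in years (2 in Figure 4)
    bin_size : bp per PSMC bin (assumed default of 100)
    """
    if mu <= 0 or gen_time <= 0 or bin_size <= 0:
        raise ValueError("mu, gen_time and bin_size must be positive")
    # Reference population size implied by theta0 = 4 * N0 * mu * bin_size.
    n0 = theta0 / (4.0 * mu * bin_size)
    # Time in years: t * 2 * N0 generations, times years per generation.
    years = [2.0 * n0 * t * gen_time for t in times]
    # Effective population size: lambda * N0.
    ne = [n0 * lam for lam in lambdas]
    return years, ne


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    years, ne = scale_psmc(0.04, [0.0, 0.5, 1.0], [1.0, 2.0, 0.5])
    for y, n in zip(years, ne):
        print(f"{y:.0f} years ago: Ne ~ {n:.0f}")
```

Because both axes scale inversely with mu, and time scales linearly with g, changing either constant stretches or shifts the curves without altering their shape.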
Additional files

Figure S1: Graphical representation of the convergence of the PhyloBayes chains run for dating analyses using a) Clock model, b) LN model, and c) UGAM model.

Figure S2: Plot of the quality of Nanopore long reads base-called with either the fast or the high accuracy option of Guppy v3.1.5. The quality of the base-calling step has a large impact on the final quality of the assemblies by reducing the number of contigs and increasing the N50 value.

Figure S3: Comparison of 503 mammalian genome assemblies from 12 taxonomic groups using bean plots of the a) number of scaffolds, and b) scaffold N50 values ranked by median values. Thick black lines show the medians, dashed black lines represent individual data points, and polygons represent the estimated density of the data. Note the log scale on the Y axes. The bat-eared fox (Otocyon megalotis megalotis) and aardwolf (Proteles cristatus cristatus) assemblies produced in this study using SOAPdenovo and MaSuRCA are indicated by asterisks. Bean plots were computed using BoxPlotR (Spitzer et al., 2014).

Figure S4: BUSCO completeness assessment of 67 Carnivora genome assemblies visualized as bar charts representing percentages of complete single-copy (light blue), complete duplicated (dark blue), fragmented (yellow), and missing (red) genes ordered by increasing percentage of total complete genes. The bat-eared fox (Otocyon megalotis megalotis) and aardwolf (Proteles cristatus cristatus) assemblies produced in this study using MaSuRCA and SOAPdenovo are indicated by asterisks.

Figure S5: Genetic differentiation indices obtained from a comparison of intraspecific and interspecific polymorphisms after having homogenized the coverage of all species (at about 15x). The estimates were calculated for four pairs of well-defined Carnivora species and for the subspecies of aardwolf (Proteles cristatus) and bat-eared fox (Otocyon megalotis). Silhouettes from http://phylopic.org/.
Figure S6: Genetic differentiation indices obtained from the comparison of intraspecific and interspecific polymorphisms for the pair Ursus arctos/Ursus maritimus (~10 replicates per species). GDI is estimated for each pair of individuals. This result demonstrates that randomly picking only three individuals (out of 10) is sufficient to accurately estimate the level of genetic differentiation between the two species.

Figure S7: Definition of the genetic differentiation index (GDI) based on the F-statistic (FST). The main difference between these two indexes is the use of heterozygous allele states for GDI rather than real polymorphism for the FST. Green = π within, Orange = π between, Blue = Population A, Red = Population A+B.

Figure S8: Graphical representation (BlobPlot) of the results of contamination analyses performed with BlobTools for a) the aardwolf (Proteles cristatus cristatus) and b) the bat-eared fox (Otocyon megalotis megalotis) genome assemblies.

Table S1: Pairwise patristic distances estimated for the 142 species based on branch lengths of the phylogenetic tree inferred with the 15 mitochondrial loci (2 rRNAs and 13 protein-coding genes).

Table S2: Results of Bayesian dating for the two nodes leading to the Proteles cristatus sspp. and the Otocyon megalotis sspp. Divergence time estimates based on UGAM and LN models are reported with associated 95% credibility intervals for each MCMC chain.

Table S3: Sample details and assembly statistics (number of contigs/scaffolds and associated N50 values) for the 503 mammalian assemblies retrieved from NCBI (https://www.ncbi.nlm.nih.gov/assembly) on August 13th, 2019 with filters: “Exclude derived from surveillance project”, “Exclude anomalous”, “Exclude partial”, and using only the RefSeq assembly for Homo sapiens.
Table S4: Genome completeness assessment of MaSuRCA and SOAPdenovo assemblies obtained for Proteles cristatus cristatus and Otocyon megalotis megalotis together with the 63 carnivoran assemblies available at NCBI on August 13th, 2019 using Benchmarking Universal Single-Copy Orthologs (BUSCO) v3 with the Mammalia OrthoDB 9 BUSCO gene set.

Table S5: Annotation summary and supermatrix composition statistics of the 53 species used to infer the genome-scale Carnivora phylogeny.

Table S6: Sample details and assembly statistics of the 13 newly assembled carnivoran mitochondrial genomes.

Table S7: Node calibrations used for the Bayesian dating inferences based on mitogenomic data.

Table S8: Results of contamination analyses performed with BlobTools for the aardwolf (Proteles cristatus cristatus).

Table S9: Results of contamination analyses performed with BlobTools for the bat-eared fox (Otocyon megalotis megalotis).

Table S10: Summary information for the Carnivora genomes available on GenBank, DNA Zoo, or the OrthoMaM (OMM) database as of February 11th, 2020. The “OMM” column indicates if the genome was available on OMM (yes) or not (no). The “Annotation” column indicates whether the genome was already annotated (yes) or not (no).

Supplementary File 1: Analyses of morphological differences between the two proposed species of aardwolf.

Zenodo supplementary files

MS_Genomes_Roadkill.sh contains the main command lines used for the study.

1- Mitogenomics
Barcoding gap analyses: Barcoding gap analysis.zip contains the COX1 and CYTB matrices used for the barcoding gap analyses + the final trees and their associated IQ-TREE log files.
Mitochondrial phylogeny: Mitochondrial phylogeny.zip contains the mitochondrial supermatrix used to generate the phylogeny + the mitochondrial phylogeny and the associated IQ-TREE log file.
Dating: Mitochondrial dating.zip contains the results of the dating analyses.

2- Genomics
Assemblies.tar.gz contains the hybrid assemblies of Proteles cristatus cristatus and Otocyon megalotis megalotis.
BUSCO analyses: Busco_Carnivora.xlsx contains genome information such as accession numbers, species names, and BUSCO scores.
Genetic differentiation: Genetic differentiation analyses.zip contains the coordinates of the regions used for the estimations for each genome + a summary of the results for each species pair.
PSMC analyses: PSMC.zip contains the results of the PSMC analyses.

3- Phylogenomics
Phylogenomic analyses: Phylogenomic analysis.zip contains the nucleotide supermatrix composed of 14,307 single-copy orthologous genes for 54 species of carnivores plus one outgroup (Manis javanica) + the phylogenomic tree and the associated IQ-TREE log file.
Coalescent analyses: ASTRAL-III.zip contains the results obtained from ASTRAL analyses.
Gene trees: Gene trees.zip contains the gene trees.
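
Figure S7 describes the GDI as an FST-like quantity built from π within and π between. As a rough illustration only, the Python sketch below computes a ratio of that kind. It assumes the common form (π between − π within) / π between; the exact estimator used in the study should be taken from MS_Genomes_Roadkill.sh and the manuscript, not from this sketch. The function name and the input values are hypothetical.

```python
def differentiation_index(pi_within, pi_between):
    """FST-like index: share of between-group diversity not found within groups.

    pi_within  : mean diversity within groups. For a GDI-style estimate this
                 would be derived from heterozygous sites of single
                 individuals (assumption based on the Figure S7 caption).
    pi_between : mean diversity between the two groups being compared.
    """
    if pi_within < 0 or pi_between <= 0:
        raise ValueError("pi_within must be >= 0 and pi_between must be > 0")
    return (pi_between - pi_within) / pi_between


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(differentiation_index(pi_within=0.001, pi_between=0.004))
```

Under this form the index is close to 0 when diversity between groups is no larger than diversity within them, and approaches 1 as the groups share less of their variation.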