A low‐density single nucleotide polymorphism panel for brown trout (Salmo trutta L.) suitable for exploring genetic diversity at a range of spatial scales



Funding information
European Union Interreg France (Channel) England programme; Westcountry Rivers Trust; GW4 NERC Centre for Doctoral Training FRESH programme PhD studentship

Abstract
The rivers of southern England and northern France which drain into the English Channel contain several genetically unique groups of trout (Salmo trutta L.) that have suffered dramatic declines in numbers over the past 40 years. Knowledge of levels and patterns of genetic diversity is essential for effective management of these vulnerable populations. Using restriction site-associated DNA sequencing (RADseq) data, we describe the development and characterisation of a panel of 95 single nucleotide polymorphism (SNP) loci for trout from this region and investigate their applicability and variability in both target (i.e., southern English) and non-target trout populations from northern Britain and Ireland. In addition, we present three case studies which demonstrate the utility and resolution of these genetic markers at three levels of spatial separation: (a) between closely related populations in nearby rivers, (b) within a catchment and (c) when determining parentage and familial relationships between fish sampled from a single site, using both empirical and simulated data. The SNP loci will be useful for population genetic and assignment studies on brown trout within the UK and beyond.
KEYWORDS
assignment, brown trout, population structure, RADseq, sea trout

1 | INTRODUCTION
Central to effective ecological conservation is the understanding of genetic diversity, within and between populations of a species. Such genetic diversity underpins the potential of a species to adapt to its local environment and to adapt to future stressors, including predicted anthropogenic climate change, changes in community structure, novel pathogens and chemical pollutants (Garner et al., 2020). Studies of genetic diversity within a species can also reveal cryptic structure within an apparently homogenous group (Andersson et al., 2017) or reveal a genetic basis for differences in life history between different components of a species that may be of ecological importance (Arostegui et al., 2019), and which can then inform conservation measures to safeguard this diversity.
Early population genetic studies moved from the use of allozymes, proteins of variable structure, to an emphasis on (mostly) selectively neutral loci, e.g., microsatellites, to delineate relationships between populations and to reveal greater genetic polymorphism than previously possible with allozymes (Hughes & Queller, 1993). In recent years, genome sequencing has become the method of choice for the study of genome-wide variation in living organisms. Although initial studies focused on a few so-called model organisms, recent increases in sequencing accuracy coupled with significant reductions in cost have paved the way for the application of such methods to address population-level questions in a range of organisms, including those with little or no pre-existing genetic resources. Nonetheless, except for organisms with very small genomes, whole genome sequencing of many individuals remains relatively expensive, and a cost-effective alternative is to screen genomic data to identify single nucleotide polymorphisms (SNPs) (Fuentes-Pardo & Ruzzante, 2017) that segregate with geographically distinct populations and/or phenotypic traits of interest (e.g., Hohenlohe et al., 2010). SNP markers also overcome a number of the limitations of microsatellite markers, being more frequently and evenly distributed across the genome and overcoming difficulties in standardising genotype calls between different laboratories (LaHood et al., 2002). Moreover, the greater density of SNPs across the genome, alongside their occurrence within coding regions with adaptive potential, has allowed the identification of loci under selection within populations (Johnston et al., 2014). Consequently, SNPs have become the molecular marker system of choice for cost-effective population genetic analysis of an increasingly wide range of animal, plant and microbial taxa.
Brown trout (Salmo trutta L.) is a ubiquitous freshwater fish species found throughout most of Europe and with a native range stretching from Iceland in the west, Norway and Russia to the north, the Atlas mountains of North Africa to the south and Russia, Afghanistan and Pakistan in the east, as well as having readily colonised many freshwater systems around the world after introduction for recreational angling (Elliott, 1989). During the last glacial maximum (LGM) the rivers of southern England and northern continental Europe formed tributaries of a larger Channel river, draining westwards into the Bay of Biscay (Menot et al., 2006). Previous studies have demonstrated that this historic geography has influenced the genetic structure of brown trout across northern Europe, with catchments differentially recolonised from a number of refugial populations in southern Europe (Hamilton et al., 1989). As a result of these differences in the origins of recolonisation, subsequent local adaptation, highly variable life histories and the high fidelity of homing by anadromous individuals to natal rivers, there exists a marked degree of genetic structuring of brown trout populations, both within and between catchments (Griffiths et al., 2009; McKeown et al., 2010).
Understanding relatedness between populations is key to effective management of this ecologically and economically valuable species (Caudron et al., 2006; Waples & Hendry, 2008) and can reveal impacts of anthropogenic activities on populations (Paris et al., 2015).
Here, we present the development of a low-density SNP panel, using loci identified from restriction site-associated DNA sequencing (RADseq) of trout from English Channel rivers. We then examine the effectiveness of this panel to explore patterns of genetic diversity and structure using populations from within the English Channel area as well as populations outside of this region. Finally, we explore three case studies using this panel to examine (a) structure and genetic diversity between small proximate coastal catchments, (b) structure and genetic diversity within a single catchment relating to population fragmentation and (c) the utility of this SNP panel in identifying full-sib family structure within a population, comparing the results to those obtained using 18 microsatellite markers.

| Ethical statement
The work represented here did not require ethics approval.

| RADseq
A pooled RADseq approach (Delord et al., 2018) was used for SNP discovery, using a sample of 264 fish from 61 southern British rivers, 25 French rivers and 3 French hatchery stocks. Briefly, DNA was extracted from fin clips using Qiagen Blood and Tissue kits (Qiagen, Manchester, UK), quantified using Qubit dsDNA HS assays (Life Technologies, Renfrew, UK) and then combined into 20 pools (Supporting Information Table S1) based on existing knowledge of genetic structure between these populations (King et al., 2020; Quéméré et al., 2016). DNA was then digested with SbfI, purified using AMPure XP magnetic beads (Beckman Coulter, High Wycombe, UK), and phased P1 adaptors were ligated onto each fragment. Digested DNA was fragmented to an average size of 400 bp, and blunt ends were repaired and adenylated prior to P2 adaptor ligation. The libraries were PCR amplified for 14 cycles prior to 250 bp paired-end (PE) sequencing on an Illumina HiSeq 2500 (Illumina, San Diego, CA, USA).
RAD loci were built de novo using optimised parameters, following Paris et al. (2017). SNP discovery was carried out using the populations module, filtering for missing data, allele frequency and retaining only RAD loci with a single bi-allelic SNP. Full details of the SNP discovery pipeline are provided in Supporting Information.

| Non-RADseq loci
To the RADseq-derived loci we added sequence from three additional genomic regions: a three base-pair indel polymorphism and two additional SNPs [a non-synonymous substitution in exon 2 of the vestigial-like family member 3 (vgll3) gene and a C/G polymorphism in an intron of the metallothionein B (metB) gene]. Full details for these additional loci are given in Supporting Information. Hereafter, all loci are referred to as SNPs for clarity, including the three base-pair indel.
TABLE 1 Basic population diversity statistics for the brown trout populations sampled from eight target and three non-target rivers

| SNP panel design
A randomly selected sub-set of 1070 filtered RAD loci was aligned to the brown trout reference genome (Hansen et al., 2021) using the NCBI blastn portal (https://blast.ncbi.nlm.nih.gov/Blast.cgi). We retained only loci that aligned with >99% identity to a single linkage group (LG) and used the Genome Browser utility of SalmoBase (https://salmobase.org, Samy et al., 2017) to record the genomic location of each RAD locus, whether it lay within a coding or non-coding region, and whether coding loci spanned introns, exons or both. Loci were then ranked, with preference given to high match accuracy, single full-length hits and high heterozygosity. Flanking sequence data for 159 RADseq-derived and 3 non-RADseq loci were submitted to a commercial assay design platform (Fluidigm D3) for primer design and synthesis. These candidate loci were tested on the Fluidigm EP1 system, using an initial test panel of template DNAs from trout originating from multiple catchments. Loci that failed to amplify reliably or lacked one of the homozygous genotype clusters were exchanged for alternative assays to produce a panel of 95 reliable SNPs.
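The filtering and ranking steps described above can be illustrated with a short Python sketch. It is not the pipeline used in this study: the input layout (one tuple per BLAST hit), the function names and the reading of "a single linkage group" as "all hits for a locus fall on one LG" are assumptions made for illustration only.

```python
from collections import defaultdict


def select_loci(hits, min_identity=99.0):
    """Keep loci whose hits all fall on one linkage group with identity above the threshold.

    hits: iterable of (locus_id, linkage_group, percent_identity) tuples
          (a hypothetical, simplified view of tabular BLAST output).
    Returns a dict mapping locus_id -> (linkage_group, best_identity).
    """
    by_locus = defaultdict(list)
    for locus, linkage_group, identity in hits:
        by_locus[locus].append((linkage_group, identity))

    kept = {}
    for locus, locus_hits in by_locus.items():
        groups = {lg for lg, _ in locus_hits}
        best = max(identity for _, identity in locus_hits)
        # Assumption: a locus hitting more than one LG is treated as ambiguous and dropped.
        if len(groups) == 1 and best > min_identity:
            kept[locus] = (groups.pop(), best)
    return kept


def rank_loci(kept, heterozygosity):
    """Order retained loci by identity first, then by heterozygosity, both descending."""
    return sorted(
        kept,
        key=lambda locus: (kept[locus][1], heterozygosity.get(locus, 0.0)),
        reverse=True,
    )
```

Ranking on a tuple makes the order of preference explicit; a weighted score would be an equally reasonable design if the criteria were to be traded off against one another.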

| Population screening
To understand and validate the effectiveness of the SNP panel, we screened DNA from trout from four English (Taw, Tamar, Frome and Dour) and four French (Bresle, Sée, Touques and Flèche) rivers that flow into the English Channel, together with trout samples from three non-Channel rivers from Britain and Ireland (Wear: Northumbria, northeast England; Burn of Arisdale: Yell, Shetland, northern Scotland, and Avoca: Co. Wicklow, southeast Ireland) (Table 1). Where possible, fish aged 1 or older were sampled to reduce the chances of collecting potentially related individuals. For British and Irish trout, DNA was extracted from adipose fin clips using the Hotshot method of Truett et al. (2000). For French fish, DNA was extracted from adipose fin clips using NucleoSpin 96 Tissue kits (Macherey-Nagel) following the manufacturer's protocol.  Figure S1. Each run included two positive (individuals of known genotype) and two negative controls.
Individuals that failed to yield data at more than 10% of loci were excluded from further analysis. Basic measures of genetic diversity [observed and expected heterozygosity (Ho and He, respectively), inbreeding coefficient (FIS) and percentage of polymorphic loci within each sample] were calculated using GenAlEx v6.502 (Peakall & Smouse, 2012) and GenoDive version 3.03 (Meirmans, 2020).
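As a rough illustration of the quantities named above, the following Python sketch applies the 10% missing-data filter and then computes mean Ho, mean He, FIS and the percentage of polymorphic loci for one sample. It does not reproduce GenAlEx or GenoDive: the genotype layout and function names are hypothetical, He is the uncorrected estimate (1 minus the sum of squared allele frequencies) and FIS is taken as 1 - Ho/He from the means across loci, all of which are simplifying assumptions.

```python
def filter_individuals(genotypes, max_missing=0.10):
    """Drop individuals with missing calls at more than max_missing of loci.

    genotypes: dict mapping individual -> list of calls, one per locus;
               each call is a tuple of two alleles, or None if missing.
    """
    kept = {}
    for name, calls in genotypes.items():
        missing = sum(call is None for call in calls) / len(calls)
        if missing <= max_missing:
            kept[name] = calls
    return kept


def locus_heterozygosity(calls):
    """Return (Ho, He) for one locus, or None if no individual was typed."""
    typed = [call for call in calls if call is not None]
    n = len(typed)
    if n == 0:
        return None
    ho = sum(a != b for a, b in typed) / n
    counts = {}
    for a, b in typed:
        counts[a] = counts.get(a, 0) + 1
        counts[b] = counts.get(b, 0) + 1
    # Uncorrected expected heterozygosity from allele frequencies.
    he = 1.0 - sum((c / (2 * n)) ** 2 for c in counts.values())
    return ho, he


def diversity_summary(genotypes):
    """Mean Ho, mean He, FIS and percentage of polymorphic loci for one sample."""
    individuals = list(genotypes.values())
    n_loci = len(individuals[0])
    stats = []
    for i in range(n_loci):
        result = locus_heterozygosity([ind[i] for ind in individuals])
        if result is not None:
            stats.append(result)
    mean_ho = sum(ho for ho, _ in stats) / len(stats)
    mean_he = sum(he for _, he in stats) / len(stats)
    fis = 1.0 - mean_ho / mean_he if mean_he > 0 else float("nan")
    polymorphic = 100.0 * sum(he > 0 for _, he in stats) / len(stats)
    return {"Ho": mean_ho, "He": mean_he, "Fis": fis, "polymorphic_pct": polymorphic}
```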
Genepop version 4.0 (Rousset, 2017) was used with default parameters to test for pair-wise linkage disequilibrium between loci and for heterozygote deficiency and excess relative to Hardy-Weinberg equilibrium at each locus within each population. P-values for linkage disequilibrium and for Hardy-Weinberg deficiency and excess were corrected for multiple comparisons using the sequential Holm-Bonferroni procedure (Holm, 1979).
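The sequential correction of Holm (1979) cited above can be sketched in a few lines of Python. This is a generic illustration of the step-down procedure, not the code used in the study; the function name and the default significance level of 0.05 are assumptions.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans: True where the null hypothesis is rejected.

    P-values are tested from smallest to largest against alpha / (m - rank),
    where m is the number of tests; testing stops at the first non-rejection.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # all larger p-values are also retained
    return reject
```

With 95 loci there are 4465 locus pairs per population for the linkage disequilibrium tests, so the smallest p-value in each set would be compared against alpha/4465 under this procedure.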
Population pair-wise values of FST were calculated using GenoDive v3.03, with significance assessed by 999 permutations of genotypes among populations. Population inbreeding coefficients (FIS) were calculated using the divBasic function of the R package diveRsity v1.9 (Keenan et al., 2013), and significance was tested by bootstrapping the data 1000 times. Discriminant analysis of principal components (DAPC) was performed for individuals in the adegenet R package (Jombart, 2008). The optim.a.score function of adegenet was used to determine the number of principal components to retain in the DAPC analysis. Isolation by distance (IBD) was assessed in R using Mantel tests of linearised FST against geographic distance, utilising the mantel.rtest function in the ade4 package (Dray & Dufour, 2007), with pair-wise distances between sites measured using an online distance tool (http://www.daftlogic.com/projects-google-maps-distance-calculator.htm).
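For readers unfamiliar with the Mantel test, the logic of the IBD analysis can be sketched in plain Python. This is a stand-in for illustration, not the ade4 implementation: the function names are hypothetical, the transformation FST/(1 - FST) is assumed to be the linearisation intended, and a one-sided permutation test on Pearson's r is assumed.

```python
import random
from itertools import combinations


def linearise(fst):
    """Assumed linearisation of FST: FST / (1 - FST)."""
    return fst / (1.0 - fst)


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def mantel(genetic, geographic, permutations=999, seed=1):
    """One-sided Mantel test between two symmetric distance matrices.

    Population labels of the genetic matrix are permuted; the p-value is the
    proportion of permutations (plus the observed value) with r >= observed r.
    """
    n = len(genetic)
    pairs = list(combinations(range(n), 2))
    geo = [geographic[i][j] for i, j in pairs]
    observed = pearson([genetic[i][j] for i, j in pairs], geo)
    rng = random.Random(seed)
    labels = list(range(n))
    at_least = 0
    for _ in range(permutations):
        rng.shuffle(labels)
        permuted = [genetic[labels[i]][labels[j]] for i, j in pairs]
        if pearson(permuted, geo) >= observed:
            at_least += 1
    return observed, (at_least + 1) / (permutations + 1)
```

Permuting population labels, rather than individual matrix cells, preserves the dependence among pair-wise distances that share a population, which is the reason a Mantel test is used instead of an ordinary correlation test.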
We explored the utility of this SNP panel, designed around trout from multiple Channel/Manche rivers (Supporting Information Table S1), to investigate locally relevant management questions at multiple spatial scales.

| Case Study 1
The rivers of southern Cornwall are typified by relatively small catchments inhabited by trout displaying a mosaic of genetic variation (King et al., 2020), with gene flow between populations maintained by straying of some anadromous individuals. We investigated the power of the SNP panel to delineate population structure between fish in these small coastal catchments. The sample consisted of 97 trout sampled from four rivers from the Mount's Bay region of southwest Cornwall (Table 1; Figure 1).

| Case Study 2
Fragmentation of river systems has been highlighted as a major driver of declines in freshwater migratory fish species (Deinet et al., 2020) and can have significant impacts on the diversity of fragmented populations, including the loss of allelic richness and increased genetic drift in small, isolated populations, and inbreeding depression (Coleman et al., 2018; Pavlova et al., 2017).
The River Camel flows from its headwaters on Bodmin Moor in eastern Cornwall over granite geology approximately 40 km to the sea at Wadebridge. The trout of this catchment appear typical of those of small coastal trout populations in southwest England, showing no obvious local (within-catchment) sub-structuring. Within the catchment, granite has been quarried from the mid-19th century to the present day, with the De Lank tributary having been isolated from the main catchment by 300 m of granite spoil from the De Lank quarry for at least 140 years (Stanier, 1985). How this fragmentation has impacted genetic diversity of trout isolated upstream of this barrier, considered to be impassable to salmonids, was the focus of this investigation. Resident trout were sampled from six sites across the catchment, including from the De Lank upstream of the quarry spoil barrier (Table 1; Figure 1); the other five samples were from sites without any obvious barriers to fish movement.

| Case Study 3
Brown trout populations are often characterised by large numbers of closely related individuals, i.e., full-sibs (Goodwin et al., 2016). We compared the ability of the SNP panel to assign individuals to full-sib families with results from a panel of 18 microsatellite markers using a sample of 30 trout parr from a site on the Great Stour. We used a maximum-likelihood method, implemented in COLONY v 2.0 (Jones & Wang, 2010) to assign sibship based on either multilocus SNP or microsatellite genotypes. Settings for COLONY were high precision, medium length run, assuming both male and female polygamy without inbreeding. To check for consistency, analyses were run twice using different random number seeds.
The ability to recover true familial relationships is dependent on the allelic diversity within the sample of individuals analysed (Hansen & Jensen, 2005). We tested the power of both the microsatellite and SNP panels to recover true full-sib relationships and to establish whether unrelated individuals were falsely classified as full sibs. HYBRIDLAB (Nielsen et al., 2001) was used to simulate genotypes. To test whether the SNP panel would falsely classify unrelated individuals as full sibs, we simulated genotypes for 30 unrelated individuals (the same population genotype data were provided as the input for both "parent 1" and "parent 2" in the HYBRIDLAB interface; Nielsen et al., 2001) and analysed the data in COLONY using the same settings as given above. To test the ability of the SNP panel to correctly elucidate full-sib relationships, the data set for the Great Stour population was arbitrarily split into "male" and "female" groups (15 fish in each). We simulated four full-sib families of known parentage comprising 2, 5, 10 and 15 individuals using single "male" and "female" genotypes as input to HYBRIDLAB (Supporting Information Table S2).
Table S4). Both marker types identified six families with two or more members. The microsatellite analysis identified an additional family but with low probability of inclusion (0.766).
Simulated data showed that the microsatellite and SNP panels could reliably identify full sibship and parentage (Supporting Information Figure S4). Both data sets returned no spurious full-sib relationships among the simulated unrelated genotypes.
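The simulation of full-sib families of known parentage, described in the methods for this case study, amounts to Mendelian sampling of one allele per parent at each locus. The Python sketch below illustrates that idea only; it is not HYBRIDLAB's implementation, and the function name, the genotype layout and the assumption of unlinked loci are ours for illustration.

```python
import random


def simulate_full_sibs(mother, father, n_offspring, rng):
    """Simulate a full-sib family from two parental multilocus genotypes.

    mother, father: lists of (allele1, allele2) tuples, one per locus.
    Each offspring receives one randomly chosen allele from each parent at
    every locus; loci are assumed to be unlinked (independent draws).
    """
    family = []
    for _ in range(n_offspring):
        child = [
            tuple(sorted((rng.choice(m), rng.choice(f))))
            for m, f in zip(mother, father)
        ]
        family.append(child)
    return family


# Hypothetical two-locus parents; family sizes follow those given in the text.
rng = random.Random(42)
mother = [("A", "A"), ("C", "T")]
father = [("G", "G"), ("C", "C")]
families = [simulate_full_sibs(mother, father, size, rng) for size in (2, 5, 10, 15)]
```

The simulated genotypes would then be pooled and passed to the sibship software without pedigree labels, so that the recovered families can be compared against the known ones.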

| Between catchment structures
We tested the ability of the SNP panel to provide informative population genetic statistics and delineate structure between multiple, potentially linked catchments (Case Study 1). In the region studied -Mount's Bay,   (King et al., 2020). The Cober terminates in its lower reaches in a natural lagoon, with connection to the sea blocked by a shingle bar, Loe Bar. The age of this natural bar is disputed, with suggestions ranging from after the last Ice Age, 5000-6000 years ago, through to more contemporary formation in the 13th century (Toy, 1934). Although flood relief channels have been installed in recent years, their routing through underground culverts likely limits the movement of anadromous trout (Vincent & Lawrence, 2020 (Coard et al., 1983), whereas outflow from abandoned mine workings is responsible for current high levels of dissolved copper (Environment Agency, 2019); tin was also mined extensively from the region of West Penwith through which the Penberth flows (Knight & Harrison, 2013). Extraction of these metals has had significant detrimental impacts on fish and invertebrate communities, with sedimentation and chronic dissolved metal toxicity causing marked population declines (Durrant et al., 2011); these declines have resulted in severe population bottlenecks and have driven genetic differentiation in metal-impacted trout populations (Paris et al., 2015).
The 95-locus SNP panel presented here appears well suited to exploring this variation, as seen in several previous microsatellite-based studies, together with additional fine levels of differentiation.

| Within catchment structure
The brown trout populations of southwest England are typified by relatively small coastal catchments, with contemporary gene flow facilitated by the straying of anadromous individuals, with levels of genetic diversity within these small catchments often comparable to larger rivers (King et al., 2020). This region has, however, long been impacted by industrial processes, such as metal mining and milling, that have acted to fragment and impact resident trout populations (Jones et al., 2019;King et al., 2020;Paris et al., 2015). Management of these populations to conserve unique and adaptive genetic diversity must first quantify the potential impact of barriers before appropriate remedial action can be taken, making inexpensive genetic profiling of populations key to the viability of such conservation efforts.
Of the trout sampled from the River Camel, the De Lank population appeared to be the most distinct of the six sites, with pair-wise  (Klütsch et al., 2019) and barriers of chronic metal toxicity (Paris et al., 2015).
The quarry on the Camel has been present since at least 1880 (Stanier, 1985), giving a period of 130 years of isolation before the samples analysed in this study were collected; given an assumed generation time of 3.5 years for brown trout (Jensen et al., 2008), this represents approximately 35 generations. The distinctiveness of the De Lank fish is in contrast to several other studies assessing structure between isolated populations (Hoffman et al., 2017; Landguth et al., 2010) which suggest that such a small number of generations may not be sufficient to produce detectable drift and any genetic distance is likely to be small (Waples, 1998).

| Family relationships
In-river sampling of juvenile salmonid fishes for genetic analysis is often focused on just one or a few sites within a river, where juveniles (fry, parr) may originate from a very limited number of spawning adults (e.g., Goodwin et al., 2016). This can be particularly evident when juvenile fish are sampled early in the season after emerging from their spawning gravels and before they have had adequate time to disperse throughout a catchment. In such cases, samples are frequently composed of numbers of closely related juveniles, i.e., full- and half-sibs (Goodwin et al., 2016; Pritchard et al., 2007), with retention of such individuals, especially full-sibs, leading to potential biases in the estimation of some (but not all) population genetic parameters (Sánchez-Montes et al., 2017) and misleading interpretation of population structure (Rodríguez-Ramilo & Wang, 2012). In agreement with other studies, the panel of 95 SNP loci performed at least as well as microsatellites in assigning individuals to kin groups, even in the absence of parental genotypes (Hauser et al., 2011). Similarly, Hauser et al. (2011) found that a panel of 80 SNPs outperformed a panel of 11 microsatellites for assigning parentage in a wild population of sockeye salmon (Oncorhynchus nerka). In addition, using simulated data, we found no 'spurious' familial relationships, thereby increasing confidence in the ability of our SNP panel to correctly identify full sibs within 'real' data sets.

| CONCLUSION
Here we present a low-density SNP panel of 95 loci as a low-cost tool for use in population genetic studies of trout (Salmo trutta). We find these loci to be highly polymorphic and suitable for defining popula- The case studies also reveal some otherwise-overlooked conservation concerns within fragmented populations, and we anticipate that this panel will be useful in future studies seeking to understand the impacts of potential stressors on genetic structure and health of threatened brown trout populations.