Population Diversity of Leptosphaeria maculans in Australia

The fungal pathogen Leptosphaeria maculans, causal agent of blackleg disease, is a primary cause of canola (Brassica napus) crop loss in Australia. Expanding our knowledge of the occurrence of this pathogen in Australia will provide valuable insights into developing methods of resistance against it. In this study, we examine the population diversity of L. maculans in Australia using single nucleotide polymorphisms (SNPs). An Illumina GoldenGate 384 SNP assay was developed and used to genotype 59 blackleg isolates collected from across Australia, in different years and from different stubble sources. Limited linkage disequilibrium, absence of significant clustering in the principal component analysis and a mixed dendrogram suggest that the Australian L. maculans population as a whole is panmictic. Some evidence of clonality concentrated in each state was also observed. There was a lack of correlation between SNP haplotypes, stubble cultivar and year of collection. These results suggest a high rate of sexual reproduction and evolutionary diversification in the pathogen. These features could enable the pathogen to overcome resistance and continue to cause disease in Brassica crops. Analysis of these fungal population isolates will help shed some light on evolution and pathogenicity questions in this important crop pathogen.


Introduction
The ubiquitous fungal pathogen L. maculans is the causal agent of phoma stem canker (blackleg) in Brassica napus, B. juncea, B. rapa and B. oleracea: canola, vegetable and mustard crops (West, Kharbanda, Barbetti, & Fitt, 2001).This ascomycete was first described in 1791 by Tode and 1849 by Desmaziéres (Gout, Eckert, Rouxel, & Balesdent, 2006).Severe epidemics of blackleg disease occurred in Australia during the 1970s, wiping out the nascent canola industry (Rouxel & Balesdent, 2005).The 1972 epidemic in Australia caused almost 90% crop losses, highlighting the susceptibility of canola to L. maculans.Annually, this pathogen causes an average loss of AUD $100 million to the Australian economy (Zander et al., 2013).Furthermore, L. maculans isolates present in Australia are classified as highly virulent, able to cause disease even in the more resistant Brassica species: B. juncea, B. nigra and B. carinata (Purwantara, Salisbury, Burton, & Howlett, 1998).Daverdin et al. (2012) describe that rapid evolution in pathogens gives rise to new strains to combat crop defences.Maintenance of crop resistance directly depends on the field population size of the pathogen, its evolutionary potential and cropping practices that directly affect its reproductive system (Daverdin et al., 2012).L. maculans can survive as a saprobe in the stubble of infected plants for many years and this is usually favoured by dry hot summers and cold winters (West, Kharbanda, Barbetti, & Fitt, 2001).During this period, it produces sexual inoculum (ascospores), which can travel from several hundred metres to several hundred kilometres (Travadon et al., 2011) and infect plants followed by asexual spore (conidia) production at the site of infection (Rouxel & Balesdent, 2005).
Recent genome sequencing revealed that the L. maculans genome has an isochore-like structure (Rouxel et al., 2011), where the genome is divided into AT and GC-rich blocks, probably caused by the amplification of transposable elements and repeat-induced point (RIP) mutations.The RIP mechanism causes nucleotide substitutions from C to T and G to A and is a premeiotic repeat-inactivation mechanism specific to fungi that creates genetic diversification in the fungal genome (Rouxel et al., 2011).Fudal et al. (2009) reported that RIP affects the AvrLm6 locus, causing gene inactivation and leading to virulence.Furthermore, isolates have been found to undergo continuous deletions and mutations apart from RIP that lead to further genome diversity.Current approaches to establish blackleg resistance in canola have not been successful in fully controlling this pathogen (Hayward, McLanders, Campbell, Edwards, & Batley, 2012).Therefore, understanding how this fungus has evolved, diversified and spread in Australia is important in providing information for the breeding and sowing of improved resistant varieties of Brassica.
Population diversity in L. maculans has been examined using a variety of markers.Genetic differences between eastern and western Australian isolates have previously been found using microsatellites and minisatellites, which were attributed to the presence of arid desert between the coasts (Hayden, Cozijnsen, & Howlett, 2007).Minisatellite markers used to analyse four field populations in France found high levels of gene and genotypic diversity within populations and high gene flow between populations, consistent with randomly mating populations (Gout et al., 2006) Dilmaghani et al. (2012) also used minisatellite markers to show that the L. maculans population in Western Canada comprises two genetically distinct populations.A further study implementing fourteen minisatellite markers also found clonal sub populations of this pathogen on B. oleracea in Mexico (Dilmaghani et al., 2013).However, Travadon et al. (2011) found the French L. maculans population to be panmictic.This study also employed minisatellite and microsatellite markers.Other investigations into blackleg population structure have also been conducted using amplified fragment length polymorphisms (AFLPs) (Purwantara, Barrins, Cozijnsen, Ades, & Howlett, 2000) and restricted fragment length polymorphisms (RFLPs) (Barrins, Ades, Salisbury, & Howlett, 2004).Single Nucleotide Polymorphisms (SNPs) have recently become a popular choice of molecular marker for population diversity studies, and offer significant benefits in terms of abundance in genomes and ease of high-throughput assessment.SNPs are single base-pair differences between two individuals at a particular locus (Appleby, Edwards, & Batley, 2009).SNPs can be classified as transitions (C to T, G to A), transversions (C to G, A to T, T to G or C to A) and insertions/deletions (indels) of a single base pair.Such molecular markers are good tools to analyse the various processes encompassing the population genetics and evolutionary processes of an organism.These include mating systems, patterns of speciation, dispersal, mutation, migration and selection etc. (Giraud, Enjalbert, Fournier, Delmotte, & Dutech, 2008;Gout et al., 2006).
The Illumina GoldenGate genotyping assay can be used to simultaneously analyse 384-3072 SNP loci across multiple individuals (Tindall et al., 2010).Previous studies using the Illumina GoldenGate assay have shown that it can be used to reliably score SNPs for genetic analysis (Durstewitz et al., 2010).Furthermore, it is cost-effective and flexible for analysing large numbers of SNPs (Appleby et al., 2009).We applied 384 previously developed L. maculans SNPs (Zander et al., 2013) in a GoldenGate assay to analyse 59 Australian L. maculans population isolates collected from different years, regions and cultivars, assessing the diversity of this pathogen across Australia.

Fungal Samples
A total of 59 fungal isolates were analysed using the Illumina GoldenGate assay, this comprised of 96 samples including replicates and controls (Table 1).Isolates were carefully selected to cover a wide range of parameters including region of collection, cultivar grown at collection site, year isolated and Avr gene complement (not shown) (Table 1, Appendix A).The isolates received were either stored in liquid form (agar piece in water) or filter form (filter discs in silica beads).The isolates were grown and genomic DNA extracted as in Zander et al. (2013).The extracted DNA was quantified using a Qubit Fluorometer (Life Technologies, 2013).The reference isolate v23.1.3,(for which the genome sequence is available) (Table 1) (Rouxel et al., 2011) was used for data analysis.(Balesdent et al., 2001); Not all data on these isolates was available, "-" denotes an unknown variable.IBCN numbers represent the IDs of "International Blackleg Collection Network" isolates (Marcroft et al., 2012).

Illumina GoldenGate assay
A total of 384 SNPs were selected for the Illumina GoldenGate assay.The SNPs were chosen to cover a range of the 76 supercontigs on which SNPs were predicted, from the list of 21,814 SNPs described in Zander et al. (2013).
A designability assessment conducted using the Illumina Assay Design Tool (ADT) scored the 384 SNPs at 0.4 or above, which is deemed a good score for the Illumina GoldenGate assay (Durstewitz et al., 2010).Sample preparation for the Illumina GoldenGate assay was performed according to the Illumina GoldenGate Genotyping Assay guide (According to manufacturer's instructions).The software "Genome Studio" (Illumina Inc., 2013) was used to manually cluster the SNPs into one of the two possible genotype clusters (A and B) for this haploid organism.SNPs that clustered confidently were selected for future data analyses and monomorphic and non-clustering SNPs (did not clearly separate into either the 'A' group or the 'B' group) were eliminated from further analyses, resulting in 214 high-quality SNPs.A sub-set of 193 SNPs was used for linkage disequilibrium (LD) analysis (SNPs monomorphic in all isolates except Lm-1 and Lm-2, the isolates which were used to identify polymorphic SNPs for the assay, were omitted).

Data Analysis
The data set of 214 SNPs was sorted according to predicted positions on the L. maculans supercontigs, as outlined in Supplementary Figure S1 of Rouxel et al. (2011).In order to look for potential SNP blocks relating to a parameter, the isolates were sorted individually, based on each parameter eg.State, stubble species or stubble cultivar (Table 1).Manhattan plots generated using the R package 'Gapit' (Lipka et al., 2012), were used to visualise any possible association between SNPs and these parameters.
All isolates were considered to be part of one population for the statistical analyses.The SNP positions were given 1 and 0 values (for the dendrogram and PCA analyses) or 'A/A' and 'B/B' (for LD analysis) for each genotype call and 'NA' was assigned to missing values.A binary distance matrix was generated and used to create a phylogenetic dendrogram.The R package "pvclust" (Suzuki & Shimodaira, 2006) was used to generate a dendrogram with 1000 bootstrap iterations, binary distance and complete clustering.Population LD was calculated using the R package 'genetics' (Warnes, Gorjanc, Leisch, & Man, 2012).R 2 LD values were used to generate the heatmap using the R package 'LDHeatmap' (Shin, Blay, McNeney, & Graham, 2006) to visualise LD.PCA was performed using the R packages 'ade4' (Dray & Dufour, 2007) and 'maptools' (Lewin-Koh et al., 2012).SNPs with possible null and private alleles were checked against the reference (Rouxel et al., 2011) using the alignment tool in Geneious Pro version 5.6 (Biomatters Ltd., 2015;Kearse et al., 2012).

Illumina GoldenGate results
The results from the GoldenGate assay supported the SNP prediction of Zander et al. (2013).The data generated was sorted for SNPs that had high confidence clusters.2.6% of SNPs had missing values (NA) for all isolates, 29.9% were monomorphic and 11.7% were non-clustering SNPs.SNPs belonging to these three categories were eliminated from further analyses.No correlation between eliminated SNPs and SNP score or supercontig on which they were positioned could be observed.Filtering for quality polymorphic data resulted in a dataset of 214 SNPs.A subset of 193 SNPs was used for linkage disequilibrium (LD) after elimination of a further 21 SNPs.Reproducibility less than 100% was due only to missing data in one or other of the replicates.

General Marker and Population Statistics
The SNP data was mined for possible private alleles (         2010) also used isolates Lm-6, Lm-17, Lm-18 and Lm-51 for genotyping at the AvrLm1 and AvrLm6 loci.They classified Lm-18 and Lm-51 as haplotype 24, Lm-17 as haplotype 10 and Lm-6 as haplotype 4 based on their Avr genotype.These haplotypes displayed a completely different association in their study as compared to the dendrogram.This highlights the efficacy of using a large number of SNPs chosen from across the genome to classify the relationships between isolates.
Some local groupings based on state, year collected, stubble species, stubble cultivar and/or site of collection were also observed.Isolates in Box 1-8 (Figure 2) all clustered for certain parameters, with each cluster group sharing a single state of collection.Local groupings such as these may indicate the presence of small clonal subpopulations of this pathogen within Australia, such as were observed by Dilmaghani et al. (2013) on B. oleracea in Mexico.
Clusters in our study were spread across the tree, indicating genetic differences between these possible subpopulations.Different conditions particular to each state such as weather, cultivars grown and stubble resistance all cumulatively affect the evolution of this pathogen.Therefore, it may be that conditions particular to each state promote asexual reproduction rather than sexual reproduction leading to less diversity within each subpopulation.Dilmaghani et al. (2013) attributed clonality such as this to moving the pathogen from its native biogeographic range, loss of a mating-type by mutation and culture conditions conducive to large-scale dispersal of conidia.This conclusion was made based on the presence of high linkage disequilibrium.Certain isolates like Lm-16 and Lm-39 also clustered together but were vastly different in the parameters associated with them.It is known that human movement transports and introduces infected seed and plant material from one area to another (Dilmaghani et al., 2012) thereby mixing and changing the population, further attributing to its panmictic nature.
The bootstrapping of the phylogenetic tree also supports the theory of a randomly interacting mixed population.
Bootstrapping values for the main branches were <95%, which indicates low confidence in the hierarchical cluster analysis.On the other hand, most bootstrapping values within each box were >95%, indicating that they were strongly supported by the data (Suzuki & Shimodaira, 2006).An overall analysis of the tree yielded no particular association to any other parameter.The PCA results also displayed a random positioning of isolates.Certain isolates clustered together, such as Lm-36 and Lm-42 isolates in Box 7, validating the dendrogram and also supporting panmixia.These findings could be attributed to the high evolutionary potential of L. maculans.Spore dispersal also plays a role in increasing gene flow and generating a random mix of isolates across the population (Travadon et al., 2011).More samples from each region and year will need to be collected and examined to investigate the possibility of clonal sub-populations of L. maculans within Australia.
We hypothesised that a possible cause of LD in this population could be selection of AvrLm genes, due to their impact on host plant infection.This would lead to loci in the selected region segregating with each other more often than expected by chance and can be visualised as blocks on the LD heatmap.However, we failed to notice any such patterns.As the rate of recombination between loci increases, there is a greater chance of linkage equilibrium in the population, decreasing LD.Populations that are constantly recombining and have a high cross-over rate will show little LD (McVean, 2008).Xu (2006) stated that only 5% of locus-pairs have significant observed association to those expected in a completely panmictic population.Our LD analysis showed 0.83% of p-values associated with pairwise comparisons, to be significant.SNP73, which was seen to be significantly associated with B. napus cultivars and was located near Avr4-7, did not display significant LD.The SNP and the gene on SC12 of the L. maculans genome are 253.6 kb apart and the GC content of the traversing region is 45.2%.Parlange et al. (2009) reported two PCR markers on the border of the AvrLm4-7 locus; the GC content between those markers was 35.2%.It has been concluded by Rouxel et al. (2011) that recombination in the L. maculans genome occurs more frequently within GC-rich regions than between GC-rich regions.Therefore we believe that recombination events in the GC-rich region between the SNP and the gene may have impacted the association between them in B. napus cultivars.The number of samples isolated from B. napus cultivars may also be too small to clearly display this association on the heatmap in the form of LD.Furthermore, the majority of Australian cultivars that have been genotyped contain Rlm4 and therefore there has been strong selection pressure at the AvrLm4-7 locus for a number of years in Australia (Marcroft et al., 2012).However, detailed studies comprising more isolates derived from B. juncea isolates will need to be conducted to confirm the validity of this association to B. napus cultivars.In the future, examining associations between SNPs and Avr genes in AT-rich regions such as these may prove fruitful in analysing the evolution of avirulence genes.
The results from this SNP genotyping assay successfully validated the work conducted by Zander et al. (2013) using the same SNP resource.The SNP prediction supported transferability of SNPs for use in the GoldenGate assay for the chosen SNPs.Stringent clustering criteria (ensuring that all SNPs visually separated into either the 'A' group or the 'B' group) yielded 214 SNPs, which were used for subsequent data analysis.Poor clustering could be a result of additional SNPs in the flanking regions of the predicted SNPs which can be resolved in the future by using more isolates for SNP discovery.Private alleles relating to five particular isolates (Lm-1, Lm-2, Lm-24, Lm-30 and Lm-55) were found.Lm-2 and Lm-55 were isolated from the same year and site of collection in Victoria.Alleles present only in the isolates used for SNP prediction (Lm-1 and Lm-2) are due to ascertainment bias of using these for the SNP prediction.Ascertainment bias is introduced because of the method used for SNP discovery (Albrechtsen, Nielsen, & Nielsen, 2010), which in this case, used two isolates (Lm-1 and Lm-2) to predict SNPs.Ascertainment bias can be corrected in the future by using more isolates for SNP prediction.Being closely related and hence sharing sequence similarity might explain the four alleles that were found only in Lm-2 and Lm-55.The two main clades on the phylogenetic tree separated the isolates Lm-1 and Lm-2.Replicates used in this assay were seen to be genetically identical except for missing values.This was validated by the phylogenetic tree and PCA output, which confirmed the reproducibility of this assay.
The purpose behind conducting a large-scale genotyping assay was to understand the genetic diversity in the Australian L. maculans population.Our cumulative analysis of these results supports our null hypothesis of a panmictic Australian L. maculans population with possible regional clonality.This contrasts with the conclusions of Hayden et al. (2007) who found two genetically distinct eastern and western blackleg populations in Australia using 6 microsatellite and 2 minisatellite markers in 513 isolates collected over two years.The study also found 85% difference within the 13 subpopulations identified and 10% difference between the coasts.However, our results concur with the panmictic conclusion of Travadon et al. (2011).This study analysed 29 field populations of French L. maculans isolates using minisatellite markers and also found low genetic differentiation within populations.
Further study concentrated on sampling from each region will help provide insights into this theory.
Overall, the high rate of sexual reproduction, ability of the pathogen to survive on stubble for long periods of time and random mating between isolates likely assist in maintaining a large blackleg population in Australia.This combined with its high evolutionary potential enables it to overcome host resistance quickly and cause infection leading to wide-spread crop losses.It is therefore imperative to attempt to restrict the population size of this pathogen using existing methods such as stubble management and introgression of resistance genes in Brassica species.Future work involves identifying new disease-associated genes, which will help in developing novel strategies to further control this devastating pathogen.
Figur s: red-Approx tstrap Probab supported by d indicate simila Figure 3. P Note: Axis replicates n Figure 4.

Table 1 .
List of L. maculans population isolates used in this study Note: VIC-Victoria; NSW-New South Wales; WA-Western Australia; SA-South Australia; Isolate v23.1.3 is the result of a series of in vitro crosses between European field isolates

Table 2 .
Percent reproducibility of replicates used in the assay Table3).Private alleles are unique alleles in isolates that denote genetic distinctiveness.Of the 21 private allele SNPs, eight occurred in the isolate Lm-1 and seven occurred in the isolate Lm-2, both of which were used for SNP discovery.Of the other four alleles occurring at low frequency, four were in Lm-2 and Lm-55 only, and two were present in only Lm-2 and Lm-24 and Lm-2 and Lm-30.SNPs private in Lm-2 and Lm-55 were the only ones that consistently occurred in intergenic regions of the genome.The SNP polymorphic information content (PIC) scores (See Appendix B) ranged from 0.3-0.5, indicative of relatively high polymorphism.General statistics can be found in Appendix B and association analysis of SNPs to the state the isolates were collected from can be found in Figures C1-C4 of Appendix C.

4. Discuss
appear to be genetically identical with 78% shared alleles (22% missing values).These isolates have been collected from Moyhall, South Australia in 2003 and Wonwondah, Victoria in 2004 respectively.Van de Wouw et  al. (