PacBio genome sequencing reveals new insights into the genomic organisation of the multi-copy ToxB gene of the wheat fungal pathogen Pyrenophora tritici-repentis
BMC Genomics volume 21, Article number: 645 (2020)
Necrotrophic effector proteins secreted by fungal pathogens are important virulence factors that mediate the development of disease in wheat. Pyrenophora tritici-repentis (Ptr), the causal agent of wheat tan spot, has a race structure dependent on the combination of effectors. In Ptr, ToxA and ToxB are known proteinaceous effectors responsible for necrosis and chlorosis respectively. While Ptr ToxA is encoded by the single gene ToxA, ToxB has multiple loci in the Ptr genome, which is postulated to be directly related to the level of ToxB production and leaf chlorosis. Although previous analysis has indicated that the majority of the ToxB loci lie on a single chromosome, the exact number and chromosomal locations for all the ToxB loci have not been fully identified.
In this study, we have sequenced the genome of a race 5 ToxB-producing isolate (DW5), using PacBio long read technology, and found that ToxB duplications are nested in the complex subtelomeric chromosomal regions. A total of ten identical ToxB gene copies were identified and, based on flanking sequence identity, nine loci appeared to be associated with chromosome 11 and a single copy with chromosome 5. The multiple ToxB gene loci on chromosome 11 were separated by large sequence regions of between 31 and 66 kb within larger segmental duplications, in an alternating pattern related to locus strand, and were flanked by transposable elements.
This work provides for the first time the full complement of ToxB loci and surrounding regions, and identifies the organization and distribution of ten ToxB loci to subtelomeric regions. To our knowledge, this is the first report of an interwoven strand-related duplication pattern event. This study further highlights the importance of resolving the highly complex distal chromosomal regions, which remain difficult to assemble and can harbour important effectors and virulence factors.
The inverse gene-for-gene interactions between host plants and necrotrophic fungal pathogens typically involve pathogen effectors, which interact with a compatible locus in the host, leading to toxin sensitivity and disease susceptibility.
Pyrenophora tritici-repentis (Ptr), a necrotrophic fungal pathogen and the causal agent of wheat (Triticum aestivum L.) tan spot, produces a number of effectors that mediate the development of foliar disease on susceptible wheat genotypes. Tan spot has two distinct leaf symptoms, necrosis and chlorosis. These symptoms are the result of the secreted effectors ToxA, ToxB and ToxC [2,3,4] and other as yet uncharacterised effectors [5, 6]. ToxA and ToxB are characterised as small effector proteins that produce necrosis and chlorosis symptoms, respectively [2, 4]. ToxC, which also causes chlorosis, has not been characterised and may be the product of a secondary metabolite gene cluster.
For the two proteinaceous toxins, ToxA interacts with a corresponding susceptibility gene in wheat (Tsn1), which makes the host sensitive to the effector, while the corresponding host gene for ToxB remains unknown but is associated with the Tsc2 locus on chromosome 2B.
In the Ptr genome, ToxA is a single-locus gene, the result of a horizontal gene transfer from another fungal pathogen species. In contrast, there are multiple identical gene copies of ToxB [10, 11], and the copy number variation has been shown to be associated with virulence. Nine copies of ToxB in race 5 isolates (DW2, DW7, DW13 and DW16) were estimated by phosphoimage analysis, and of these, six copies were individually cloned and sequenced from DW7 (1-3 kb in length). Southern analysis indicated that the ToxB loci reside on two unknown chromosomes, approximately 3.5 and 2.7 Mb in length, with the majority located on the smaller chromosome.
To date, a number of Ptr whole genome sequencing projects involving race 5 (ToxB-producing) isolates have not been able to determine whether the ToxB loci are clustered or dispersed in the genome [12, 13]. We therefore undertook genome sequencing via PacBio long read technology to resolve the number, organization and distribution of ToxB loci within the genome of a race 5 isolate (DW5). A comparative analysis of these ToxB regions with a race 1 isolate (ToxB non-producing), which was previously assembled from PacBio long reads and optical mapping, was undertaken to identify any flanking sequence conservation.
Ptr isolate DW5 whole genome assembly analysis
The Ptr race 5 isolate DW5 was sequenced using long read single molecule PacBio technology and the error corrected reads were assembled and annotated (Table 1). Furthermore, a previously PacBio-sequenced Ptr race 1 isolate (M4), which was scaffolded into chromosomes based on an optical map but not annotated at the time, was also annotated during this study. The DW5 genome assembly size was 40.87 Mb, close to the genome size of M4 at 40.92 Mb; however, DW5 was slightly more fragmented, with 60 contigs compared to the 50 contigs for M4. This fragmentation may be directly related to a slightly higher repeat content in DW5 and the slightly smaller content of protein coding genes compared to M4 (Table 1). Protein coding gene predictions for the DW5 contigs and M4 scaffold assemblies were 14,276 and 15,466, respectively. The DW5 annotated genome has been deposited at DDBJ/ENA/GenBank under the accession MUXC00000000. The version described in this paper is version MUXC02000000. The annotated M4 genome has been deposited in DDBJ/ENA/GenBank under accession NQIK00000000. The version described in this paper is version NQIK02000000.
Whole genome comparative analysis between Ptr races 1 and 5
The genome sequence of DW5 (race 5) was aligned to M4 (race 1) to determine sequence conservation at a chromosome level. Thirteen DW5 contigs showed colinear alignment to the scaffolded M4 chromosomes at greater than 98% sequence identity (Fig. 1), with no large-scale chromosomal rearrangements. DW5 contigs 3, 5, 7 and 8 were sequenced from 5′ telomere to 3′ telomere, as indicated by the presence of the tandem telomere repeat motifs (CCCTAA)n/(TTAGGG)n.
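The telomere-to-telomere criterion described above can be illustrated with a minimal Python sketch. It is not the procedure used in this study; the function names, the minimum of three tandem repeats and the 200 bp end window are assumptions chosen only for illustration. It relies on the fact that CCCTAA is the reverse complement of TTAGGG, so a complete contig is expected to carry (CCCTAA)n near its start and (TTAGGG)n near its end.

```python
import re

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def telomere_ends(seq: str, unit: str = "TTAGGG",
                  min_repeats: int = 3, window: int = 200):
    """Report whether tandem telomere repeats occur near each contig end.

    min_repeats and window are illustrative assumptions, not values
    taken from the paper.
    """
    seq = seq.upper()
    start_unit = reverse_complement(unit)  # CCCTAA for the default unit
    start_pat = re.compile(f"(?:{start_unit}){{{min_repeats},}}")
    end_pat = re.compile(f"(?:{unit}){{{min_repeats},}}")
    has_start = bool(start_pat.search(seq[:window]))
    has_end = bool(end_pat.search(seq[-window:]))
    return has_start, has_end

# Toy sequences (invented) to exercise the check.
complete = "CCCTAA" * 5 + "ACGT" * 100 + "TTAGGG" * 5
one_ended = "CCCTAA" * 4 + "ACGT" * 100
```

A contig for which both values are true would be a candidate complete chromosome; a contig with only one telomeric end, such as `one_ended`, would be flagged as incomplete at the other end.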
Based on M4 chromosomes, thirteen DW5 assembled contigs matched nine chromosomes, which included chromosomes 1–9 (Table 2). A chromosome fusion between chromosomes 10 and 11 (referred to as chromosome 10) in Australian isolate M4, resolved by optical mapping, was not observed for DW5, where DW5 contig 8 possessed both 5′ and 3′ telomere motifs (Table 2), which would represent a complete chromosome (telomere to telomere).
Multiple ToxB loci have alternate strand positions
The DW5 assembly was searched for ToxB homologs and 10 copies were identified across 5 contigs (DW5_contig_0004, DW5_contig_0009, DW5_contig_00015, DW5_contig_00016 and DW5_contig_00018). A single ToxB locus was found on each of the two larger contigs, DW5_contig_0004 (3.65 Mb) and DW5_contig_0009 (2.18 Mb), labelled here as ToxB1 and ToxB2, respectively (Table 3). Multiple ToxB loci were located on the smaller contigs DW5_contig_00015 (ToxB3, ToxB4 and ToxB5), DW5_contig_00016 (ToxB6, ToxB7 and ToxB8) and DW5_contig_00018 (ToxB9 and ToxB10), sized 126, 123 and 99 kb, respectively. ToxB genes were not immediate neighbours, and loci appeared to be located in alternate strand positions separated by relatively large distances that ranged between 31 and 66 kb in size. This pattern was observed across the three contigs (DW5_contig_00015, DW5_contig_00016 and DW5_contig_00018) harbouring multiple ToxB loci (Fig. 2).
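As a hedged illustration of how an alternating strand pattern and the inter-locus distances could be checked from locus coordinates, the Python sketch below uses a small record type and two helper functions. All names are hypothetical. Only the ToxB6 position (4,627–4,887 bp, forward strand) and the strand assignments of ToxB7 and ToxB8 come from this article; the ToxB7 and ToxB8 coordinates are invented placeholders, not values from Table 3.

```python
from typing import List, NamedTuple

class Locus(NamedTuple):
    name: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"

def strands_alternate(loci: List[Locus]) -> bool:
    """True if neighbouring loci (ordered by position) lie on opposite strands."""
    ordered = sorted(loci, key=lambda locus: locus.start)
    return all(a.strand != b.strand for a, b in zip(ordered, ordered[1:]))

def gaps_kb(loci: List[Locus]) -> List[float]:
    """Distances in kb between the end of one locus and the start of the next."""
    ordered = sorted(loci, key=lambda locus: locus.start)
    return [(b.start - a.end - 1) / 1000 for a, b in zip(ordered, ordered[1:])]

# ToxB6 position is from the text; ToxB7/ToxB8 coordinates are INVENTED.
contig16 = [
    Locus("ToxB6", 4_627, 4_887, "+"),
    Locus("ToxB7", 50_000, 50_260, "-"),    # hypothetical coordinates
    Locus("ToxB8", 100_000, 100_260, "+"),  # hypothetical coordinates
]

alternating = strands_alternate(contig16)
distances = gaps_kb(contig16)
```

With real coordinates substituted, `distances` can be compared directly against the 31 to 66 kb range reported above.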
Multiple ToxB loci are associated with subtelomeric chromosomal regions
Based on genome alignments to M4, two contigs (DW5_contig_0004 and DW5_contig_0009) with single ToxB loci were syntenic with the subtelomeric regions of M4 chromosomes 5 and 10, respectively (Fig. 1 and Table 3). No significant alignments to the genome of M4 were identified for the three smaller contigs carrying multiple ToxB loci (DW5_contig_0015, DW5_contig_0016 and DW5_contig_0018). However, a search back against the DW5 genome (self-search) identified alignments for all three contigs to chromosome 10 (DW5_contig_0009) (Fig. 3). Sequence breaks can be seen where regions of paralogous sequence are interspersed with repeat elements. No other alignments to the DW5 genome were found except for self-contig alignments. The alignment of the fragmented ToxB contigs with the 5′ subtelomeric region of chromosome 10 (reverse complemented DW5_contig_0009) and the presence of a 5′ telomere motif (TTAGGG)n in chromosome 5 (reverse complemented DW5_contig_0004) (Table 4) pointed to chromosome 10 as the possible origin of the ToxB3–10 loci and chromosome 5 (DW5_contig_0004) as the only source for the ToxB1 locus. The alignment of the 5′ telomere region of chromosome 10 and the ToxB loci (ToxB3 to ToxB10) thus implied that contigs 15, 16 and 18 could be fragmented regions not assembled into the 5′ telomere region of chromosome 10 (Fig. 4).
All ToxB loci, except ToxB6, which was truncated in the 5′ region upstream of ToxB, were co-located with dimer Tnp-hAT repeat genes. The dimer Tnp-hAT genes were located 10–15 kb upstream of the ToxB loci.
Larger groups of conserved regions are found between the ToxB loci based on strand positions
The ToxB loci and flanking sequence regions of 5 kb upstream and downstream were extracted (including the ToxB mRNA transcript) for a nucleotide multiple sequence alignment to determine sequence conservation between the ten loci. Only ToxB6 was truncated in the 5′ sequence region due to the locus location (contig16:4,627–4,887 bp). The ToxB 10 kb multiple sequence alignment showed a highly conserved region of 3,170 bp, with a large proportion (2.5 kb) highly conserved upstream of ToxB for all ten loci (Fig. 5a). On closer examination, the ToxB 10 kb regions could be grouped by their locus strand (Fig. 5b). The full 10 kb regions were highly conserved for ToxB loci B4, B6 and B8 on the forward strands of contigs 15 and 16 (group 1). Further conservation was found for the reverse strand ToxB loci B5, B7 and B9 (group 2) on contigs 15, 16 and 18, and to a lesser extent for the reverse strand ToxB loci B2 and B3 (group 3) on contigs 9 and 15 (not shown in Fig. 5).
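The ToxB6 truncation follows directly from its position near the start of the contig: fewer than 5 kb of sequence exist upstream of position 4,627. A minimal Python sketch of a flank window clamped to contig boundaries makes this concrete. The function name is hypothetical, the coordinates are treated as 1-based and inclusive, and the contig length of 123,000 bp is an approximation of the 123 kb size reported for the contig carrying ToxB6.

```python
def flank_window(start: int, end: int, contig_len: int, flank: int = 5000):
    """Return (window_start, window_end, left_flank, right_flank).

    Coordinates are 1-based and inclusive. The window is clamped so that
    it never extends beyond the contig.
    """
    win_start = max(1, start - flank)
    win_end = min(contig_len, end + flank)
    return win_start, win_end, start - win_start, win_end - end

# ToxB6 coordinates are from the text; contig length is approximate (123 kb).
toxb6 = flank_window(4_627, 4_887, contig_len=123_000)
```

For ToxB6 the left flank is only 4,626 bp rather than the requested 5,000 bp, while the right flank is complete.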
When the homology between the ten ToxB 10 kb regions was summarized for conserved and distinctive regions (Fig. 6), the 10 kb regions surrounding ToxB1 on chromosome 5 were found to be more divergent than the remaining loci proposed to be from chromosome 10. It was also noted that a small hypothetical protein (128 bp) was conserved 288 bp downstream of the ToxB loci in all forward stranded positions except ToxB1 and only in reverse positioned ToxB2 and ToxB3.
ToxB and promoter region
All ten copies of the 261 bp ToxB protein coding sequence are identical, as found previously for six of the sequenced copies. Based on a DW5 mRNA transcript from a previous study, ToxB has a two-exon gene structure with a transcript 533 bp in length. ToxB exon 1 (94 bp) and exon 2 (439 bp) flank an intron 52 bp in size. The exon 1 5′ UTR and exon 2 3′ UTR have lengths of 99 bp and 172 bp, respectively (Additional file 1). Previously, the ToxB promoter was reported to be greater than 300 bp upstream of the coding sequence. The 2 kb region upstream of ToxB was therefore searched for transcription factor binding site motifs. A DNA binding site was predicted upstream of ToxB, 847 bp from the start codon of ToxB2–9 and 644 bp for ToxB1 and ToxB10, at an expected value of 4.9e-178. The most significant motif profile, MA0320.1 (IME1), was identified with a probability value of 2.20e-06 (Additional file 2).
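The lengths quoted for the ToxB gene structure can be cross-checked with simple arithmetic. The short Python sketch below uses only numbers stated in this article; the variable names are illustrative.

```python
# Lengths in bp, as stated in the text.
exon1 = 94
exon2 = 439
intron = 52
cds = 261

transcript_len = exon1 + exon2        # spliced mRNA: exons only
locus_len = exon1 + intron + exon2    # genomic span: exons plus intron
codons = cds // 3                     # number of triplets in the coding sequence

print(transcript_len, locus_len, codons)
```

The two-exon transcript sums to 533 bp and the genomic locus, including the intron, to 585 bp, which agrees with the locus length given in the discussion; the 261 bp coding sequence is an exact multiple of three.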
Ptr ToxB multiloci analysis
This is the first genome sequence investigation into the distribution of ToxB loci in Ptr using long read sequencing technologies. A previous study of race 5 isolate DW7 found that the six sequenced copies all had identical protein coding sequences. In this study, all the ToxB loci (585 bp) identified have identical sequence, including exon and intron sequences. It was previously suggested that DW7 ToxB loci resided on two unknown chromosomes, approximately 3.35 and 2.7 Mb in size, with the majority of the loci on the smaller chromosome. In this study, the ToxB loci were located on chromosomes 5 and 11, which had assembly sizes of 3.36 and 2.18 Mb respectively, close to the chromosome sizes previously estimated by Martinez et al. (2004). Of the ten ToxB loci, nine appeared to be associated with the smaller chromosome 11, located in the 3′ distal region. This Ptr chromosome was noted for a chromosome fusion event in the race 1 isolate M4. The telomere to telomere support for eleven DW5 chromosomes is similar to the findings for another American isolate, the race 1 isolate Pt-1C-BFP, unlike the 10 chromosome genome of Australian isolate M4 (Fig. 7). Large scale segmental rearrangements have been frequently identified in the subtelomeric regions of fungal chromosomes, where breakage/fusion events frequently occur [12, 14, 15]. During meiosis, the subtelomeric regions show instability, often referred to as plasticity. In these regions, chromosome breakage fusion cycles begin with the loss of telomeres, which causes instability and potential fusion of sister chromatids. During the breakage fusion cycle, breakage of erroneously fused sister chromatids during separation can lead to sequence duplication, deletion and rearrangement.
It is therefore probable that the recent highly conserved duplications of loci in race 5 have occurred through multiple breakage fusion events between the distal chromosome regions and may have at one stage been potentially lost from race 1 isolates.
Genome plasticity in distal chromosome regions can contribute to rapid fungal diversification, especially for Ptr. In this study, the subtelomeric location of the ToxB loci within Ptr DW5 provided a favourable environment for duplication, which may have given this isolate a potential advantage for survival.
Ptr ToxB patterns of duplication
In addition to the positioning of the ToxB duplication within the distal region of chromosome 11, ToxB loci were located equidistant downstream from dimer Tnp-hAT transposases, a familiar gene found coupled to Ptr ToxA and within the horizontally transferred region, also found in Parastagonospora nodorum and Bipolaris sorokiniana [9, 17]. It is therefore possible that the dimer Tnp-hAT transposases observed in DW5 may have played a self-complementing role in the duplication of ToxB, providing regions of homology between flanking regions and resulting in the larger regions of homology observed between the multiple DW5 ToxB copies. Our data show that multiple ToxB gene duplication events involved much larger segmental duplications, flanked by transposable elements, than previously identified. Here, we also identified that larger homologous regions could be grouped by the strand from which the duplicated ToxB is transcribed. Furthermore, we believe this is the first report of a potential interwoven strand-related duplication pattern/event of a necrotrophic effector gene.
ToxB transcription factor binding site analysis
The binding of transcription factors to specific DNA binding sites (identified by a DNA motif) is key for the transcriptional regulation of genes. Here, a transcription factor binding motif matching the IME1 profile was identified upstream of the multiple ToxB loci. The IME1 motif is a conserved regulatory site in Saccharomyces cerevisiae, previously identified from ChIP-chip data. Although the IME1 transcription factor protein (UniProt accession P21190) is required for sporulation and the expression of early sporulation-specific genes, further experimental validation would be required in Ptr race 5 isolates to determine whether the potential transcription factor is indeed involved in the regulation of ToxB.
Our findings provided insights into the unique nature of the multicopy ToxB organisation in the Ptr genome and revealed a potentially complex effector gene regulatory network. This study directly works towards a better understanding of genome plasticity events in fungal adaptation and effector gene evolution.
Material and methods
Ptr race 5 isolate DW5 collection and sequencing
The Ptr race 5 isolate DW5 was collected in 1998 from North Dakota, USA and was kindly provided by Tim Friesen (North Dakota, USA).
Isolate genomic DNA was extracted from 3-day old mycelia grown in Fries 3 medium using the BioSprint 15 automated workstation according to the manufacturer’s instructions (Qiagen, Germany). DNA was then treated with 50 μg/ml of RNase enzyme (Qiagen, Hilden, Germany) for 1 h followed by phenol/chloroform extraction. DNA was precipitated with sodium acetate and ethanol, and resuspended in TE buffer.
The DW5 genome was sequenced using PacBio Sequel technology (https://www.pacb.com) by Novogene (China, https://en.novogene.com/). The PacBio sequence coverage for isolate DW5 was 77x. The DW5 genome was also Illumina sequenced (www.illumina.com) with 150 bp paired-end (PE) reads at 100x coverage by Novogene (China, https://en.novogene.com/). The Illumina data were used for post-genome assembly error correction (polishing).
Ptr isolate DW5 whole genome assembly
The DW5 PacBio sequence data were error corrected and assembled using Canu version 1.9 with the pacbio-raw and 40 Mb genome size parameter settings on a heterogeneous Hewlett Packard Enterprise Linux cluster (Zeus, https://pawsey.org.au). The DW5 assembled PacBio contigs were then indexed using BWA index version 0.7.17-r1188. The DW5 genomic Illumina read data, sequenced in this study, were then aligned to the indexed DW5 assembled PacBio contigs using BWA mem version 0.7.17-r1188 (-t 16). The alignment file (BAM format) was then filtered for concordant read alignments using SAMtools version 1.7 view (-f 0x2) and sorted for further genome error correction (polishing). The DW5 PacBio assembly was then error corrected using Pilon version 1.23 (--changes --tracks --output DW5_pilon --defaultqual 20 --threads 16 --frags ‘DW5 sorted BAM file’).
The DW5 PacBio assembled genome was then masked for low complexity sequence and known fungal repeats using RepeatMasker (RM) version 2.9.0+, Dfam 3.0 [24, 25] and Repbase 20181026 with the taxon fungi parameter, available through a Docker image (https://hub.docker.com/r/taavipall/repeatmasker-image).
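The concordant-read filter relies on the bitwise FLAG field of the SAM/BAM format, in which bit 0x2 marks a read aligned as part of a proper pair. The following Python sketch shows the bit test that a filter requiring 0x2 performs on each alignment record. The function name and example flag values are illustrative and are not part of the SAMtools interface.

```python
PROPER_PAIR = 0x2  # SAM FLAG bit: each segment properly aligned

def has_required_bits(flag: int, required: int = PROPER_PAIR) -> bool:
    """True if every bit in `required` is set in the SAM FLAG value."""
    return (flag & required) == required

# Common FLAG values:
#   99 = 0x1 + 0x2 + 0x20 + 0x40  paired, proper pair, mate reverse, first in pair
#   83 = 0x1 + 0x2 + 0x10 + 0x40  paired, proper pair, read reverse, first in pair
#   65 = 0x1 + 0x40               paired but not a proper pair
#    4 = 0x4                      unmapped
example_flags = [99, 83, 65, 4]
kept_flags = [f for f in example_flags if has_required_bits(f)]
```

Only the records flagged 99 and 83 would pass such a filter; discordant and unmapped reads are excluded before polishing.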
DW5 and M4 gene prediction and annotation
The PacBio DW5 assembled contigs and a previously assembled Ptr race 1 isolate M4 scaffold assembly were indexed using bowtie2-build version 18.104.22.168. Previously sequenced stranded RNA-seq Illumina read data for DW5 and M4 were aligned to the respective indexed genomes DW5 (DDBJ/ENA/GenBank accession MUXC02000000) and M4 (DDBJ/ENA/GenBank accession NQIK02000000) using TopHat2 version 2.1.1 (--no-discordant -N 0 -i 10 -I 5000 -p 16 --library-type fr-firststrand). Based on the accepted TopHat2 alignments (BAM file), mRNA transcripts, in GTF format, were then generated using Cufflinks version 2.2.1 (-p --library-type fr-firststrand). The transcript GTF file was then converted to GFF3 using GenomeTools gtf_to_gff3 version 1.5.10 to provide transcript support (evidence) for the ab initio gene predictions.
Ab initio gene predictions were made with GeneMark-ES v 4.33 (--ES --fungus --cores 16 --evidence) and CodingQuarry v2.0 (-p 16 -t) in pathogen mode (PM). Both ab initio gene predictions were supported by the transcript GFF3 file. Published Ptr protein FASTA sequences were downloaded from NCBI using NCBI txid45151 on the 20th January 2020 and aligned to the genomes using Exonerate v2.2.0 (--showvulgar no --showalignment no --minintron 10 --maxintron 2000 --percent 90) in protein2genome mode. The final gene prediction sets were then merged via EVidenceModeler v1.1.1 using a combination of protein alignments and the two ab initio predictions on the genome, with a minimum intron length of 10 bp and evidence weights CodingQuarry:10, GeneMark.hmm:10, Exonerate:5 and CuffLinks:10.
Gene annotations were assigned from BLASTX (v2.2.26) searches (expected value ≤ 1e-05) against the following databases: Uniref90 (October, 2019) and NCBI Refseq (taxon = Ascomycota) (October, 2019). Sequence domains were assigned by RPS-BLAST (v2.2.26) against Pfam (October, 2019), Smart (October, 2019) and CDD (October, 2019). The BLAST protein and domain searches were then summarised using AutoFACT version 3.4.
The annotated proteins were searched for signal peptides using SignalP version 5.0b (-format short -gff3 -mature -org euk). Those identified with signal peptides were then searched for predicted effectors using EffectorP version 2.0. EffectorP 2.0 has a low false positive rate of 11.2% and a high accuracy of 88.8% for effector prediction.
DW5 ToxB identification and analyses
All published ToxB sequences, 76 in total, were downloaded from the NCBI GenBank nucleotide database (https://www.ncbi.nlm.nih.gov/nuccore) with the text search (ToxB) AND “Pyrenophora tritici-repentis”[porgn:__txid45151] (Additional file 3) and searched against the DW5 genome using BLATX v3.5 (-maxIntron=5000 -minIdentity=70) and ≥ 50% query coverage (to detect any truncated genes).
Sequences flanking the identified ToxB loci, a total length of 10 kb, were then extracted using EMBOSS extractseq version 22.214.171.124 and aligned with the ToxB mRNA and CDS using Muscle (-clwstrict). The multiple sequence alignment was then visualised in JalView version 2.10.5, and figures were created using the alignment overview.
To obtain a better view of sequence regions shared between the ten DW5 ToxB 10 kb regions, each sequence was aligned to each of the others at greater than 70% sequence identity using the BLAT version 3.5 fastMap option. All coordinates were then used to create a BED file for visualisation using GenomeTools (gt) sketch version 1.5.10.
The 2 kb sequence region upstream of ToxB was submitted to MEME Suite 5.1.1 for motif discovery with classic discovery mode, a site distribution of zero or one occurrence, and motif width between 6 and 50 inclusive. The most significant motif was submitted to TOMTOM to identify similar motifs in the published nonredundant database JASPAR CORE 2018 for eukaryotes.
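As an illustration of how identity and query coverage thresholds such as these could be applied to alignment output, the Python sketch below parses tab-separated PSL records and keeps hits with at least 70% identity and at least 50% query coverage. It is not the filtering code used in this study. The identity formula is a simplification based on the match and mismatch columns, and the two example records, including the query and target names, are invented.

```python
from typing import NamedTuple

class Hit(NamedTuple):
    query: str
    target: str
    identity: float  # percent, simplified: matches / (matches + mismatches)
    coverage: float  # percent of the query spanned by the alignment

def parse_psl_line(line: str) -> Hit:
    """Parse one 21-column PSL record (no header lines)."""
    f = line.rstrip("\n").split("\t")
    matches, mismatches, rep_matches = int(f[0]), int(f[1]), int(f[2])
    q_size, q_start, q_end = int(f[10]), int(f[11]), int(f[12])
    aligned = matches + rep_matches + mismatches
    identity = 100.0 * (matches + rep_matches) / aligned if aligned else 0.0
    coverage = 100.0 * (q_end - q_start) / q_size
    return Hit(f[9], f[13], identity, coverage)

def passes(hit: Hit, min_identity: float = 70.0, min_coverage: float = 50.0) -> bool:
    return hit.identity >= min_identity and hit.coverage >= min_coverage

# Two INVENTED records: a full-length hit and a partial hit.
full = "\t".join(["250", "11", "0", "0", "0", "0", "0", "0", "+", "query_full",
                  "261", "0", "261", "contig_x", "123000", "4626", "4887",
                  "1", "261,", "0,", "4626,"])
partial = "\t".join(["100", "5", "0", "0", "0", "0", "0", "0", "+", "query_partial",
                     "261", "0", "105", "contig_x", "123000", "9000", "9105",
                     "1", "105,", "0,", "9000,"])

kept = [h.query for h in map(parse_psl_line, (full, partial)) if passes(h)]
```

A coverage threshold of 50% rather than 100% retains hits that span only part of the query, which is what allows truncated gene copies to be detected.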
Whole genome alignment
DW5 PacBio assembled contigs were aligned to the optically mapped M4 chromosome scaffold reference using NUCmer v3.1 (--maxmatch --coords). The sequence dot plot figure (Fig. 1) was generated using MUMmerplot v3.1 with the option for a colour plot line with a percentage similarity gradient. EMBOSS revseq version 126.96.36.199 was used for the reverse complementation of sequence.
Availability of data and materials
All data generated or analyzed during this study are included and can be accessed in this published article (and in Additional file 3). The annotated genome of DW5 has been deposited at DDBJ/ENA/GenBank repository under accession MUXC00000000. The DW5 version described in this paper is MUXC02000000. The annotated genome of M4 has been deposited at DDBJ/ENA/GenBank repository under accession NQIK00000000. The M4 version described in this paper is version NQIK02000000.
Lamari L, Bernier CC. Evaluation of wheat lines and cultivars to Tan spot [Pyrenophora-Tritici-Repentis] based on lesion type. Can J Plant Pathol. 1989;11(1):49–56.
Strelkov SE, Lamari L, Ballance GM. Characterization of a host-specific protein toxin (Ptr ToxB) from Pyrenophora tritici-repentis. Mol Plant-Microbe Interact. 1999;12(8):728–32.
Effertz RJ, Meinhardt SW, Anderson JA, Jordahl JG, Francl LJ. Identification of a Chlorosis-inducing toxin from Pyrenophora tritici-repentis and the chromosomal location of an insensitivity locus in wheat. Phytopathology. 2002;92(5):527–33.
Ciuffetti LM, Tuori RP, Gaventa JM. A single gene encodes a selective toxin causal to the development of tan spot of wheat. Plant Cell. 1997;9(2):135–44.
Ali S, Gurung S, Adhikari TB. Identification and characterization of novel isolates of Pyrenophora tritici-repentis from Arkansas. Plant Dis. 2010;94(2):229–35.
See PT, Marathamuthu KA, Iagallo EM, Oliver RP, Moffat CS. Evaluating the importance of the tan spot ToxA-Tsn1 interaction in Australian wheat varieties. Plant Pathol. 2018;67(5):1066–75.
Liu Z, Friesen TL, Ling H, Meinhardt SW, Oliver RP, Rasmussen JB, et al. The Tsn1-ToxA interaction in the wheat-Stagonospora nodorum pathosystem parallels that of the wheat-tan spot system. Genome. 2006;49(10):1265–73.
Corsi B, Percival-Alwyn L, Downie RC, Venturini L, Iagallo EM, Campos Mantello C, et al. Genetic analysis of wheat sensitivity to the ToxB fungal effector from Pyrenophora tritici-repentis, the causal agent of tan spot. Theor Appl Genet. 2020;133(3):935–50.
Friesen TL, Stukenbrock EH, Liu Z, Meinhardt S, Ling H, Faris JD, et al. Emergence of a new disease as a result of interspecific virulence gene transfer. Nat Genet. 2006;38(8):953–6.
Martinez JP, Oesch NW, Ciuffetti LM. Characterization of the multiple-copy host-selective toxin gene, ToxB, in pathogenic and nonpathogenic isolates of Pyrenophora tritici-repentis. Mol Plant-Microbe Interact. 2004;17(5):467–74.
Aboukhaddour R, Cloutier S, Ballance GM, Lamari L. Genome characterization of Pyrenophora tritici-repentis isolates reveals high plasticity and independent chromosomal location of ToxA and ToxB. Mol Plant Pathol. 2009;10(2):201–12.
Moolhuijzen P, See PT, Hane JK, Shi G, Liu Z, Oliver RP, et al. Comparative genomics of the wheat fungal pathogen Pyrenophora tritici-repentis reveals chromosomal variations and genome plasticity. BMC Genomics. 2018;19(1):279.
Manning VA, Pandelova I, Dhillon B, Wilhelm LJ, Goodwin SB, Berlin AM, et al. Comparative genomics of a plant-pathogenic fungus, Pyrenophora tritici-repentis, reveals transduplication and the impact of repeat elements on pathogenicity and population divergence. G3 (Bethesda). 2013;3(1):41–63.
Moolhuijzen P, See PT, Moffat CS. A new PacBio genome sequence of an Australian Pyrenophora tritici-repentis race 1 isolate. BMC Res Notes. 2019;12(1):642.
Bertazzoni S, Williams AH, Jones DA, Syme RA, Tan KC, Hane JK. Accessories make the outfit: accessory chromosomes and other dispensable DNA regions in plant-pathogenic fungi. Mol Plant-Microbe Interact. 2018;31(8):779–88.
Chuma I, Hotta Y, Tosa Y. Instability of subtelomeric regions during meiosis in Magnaporthe oryzae. J Gen Plant Pathol. 2011;77:317–25.
McDonald MC, Taranto AP, Hill E, Schwessinger B, Liu Z, Simpfendorfer S, et al. Transposon-Mediated Horizontal Transfer of the Host-Specific Virulence Protein ToxA between Three Fungal Wheat Pathogens. MBio. 2019;10(5):e01515–19.
MacIsaac KD, Wang T, Gordon DB, Gifford DK, Stormo GD, Fraenkel E. An improved map of conserved regulatory sites for Saccharomyces cerevisiae. BMC Bioinformatics. 2006;7:113.
Koren S, Walenz BP, Berlin K, Miller JR, Bergman NH, Phillippy AM. Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Genome Res. 2017;27(5):722–36.
Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25(14):1754–60.
Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The sequence alignment/map format and SAMtools. Bioinformatics. 2009;25(16):2078–9.
Walker BJ, Abeel T, Shea T, Priest M, Abouelliel A, Sakthikumar S, et al. Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS One. 2014;9(11):e112963.
Chen N. Using RepeatMasker to identify repetitive elements in genomic sequences. Curr Protoc Bioinformatics. 2004;Chapter 4:Unit 4 10.
Hubley R, Finn RD, Clements J, Eddy SR, Jones TA, Bao W, et al. The Dfam database of repetitive DNA families. Nucleic Acids Res. 2016;44(D1):D81–9.
Wheeler TJ, Clements J, Eddy SR, Hubley R, Jones TA, Jurka J, et al. Dfam: a database of repetitive DNA based on profile hidden Markov models. Nucleic Acids Res. 2013;41(Database issue):D70–82.
Kohany O, Gentles AJ, Hankus L, Jurka J. Annotation, submission and screening of repetitive elements in Repbase: RepbaseSubmitter and censor. BMC Bioinformatics. 2006;7:474.
Langdon WB. Performance of genetic programming optimised Bowtie2 on genome comparison and analytic testing (GCAT) benchmarks. BioData Min. 2015;8(1):1.
Kim D, Pertea G, Trapnell C, Pimentel H, Kelley R, Salzberg SL. TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol. 2013;14(4):R36.
Trapnell C, Roberts A, Goff L, Pertea G, Kim D, Kelley DR, et al. Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nat Protoc. 2012;7(3):562–78.
Gremme G, Steinbiss S, Kurtz S. GenomeTools: a comprehensive software library for efficient processing of structured genome annotations. IEEE/ACM Trans Comput Biol Bioinform. 2013;10(3):645–56.
Borodovsky M, Lomsadze A. Eukaryotic gene prediction using GeneMark.hmm-E and GeneMark-ES. Curr Protoc Bioinformatics. 2011;Chapter 4:Unit 4 6 1–10.
Testa AC, Hane JK, Ellwood SR, Oliver RP. CodingQuarry: highly accurate hidden Markov model gene prediction in fungal genomes using RNA-seq transcripts. BMC Genomics. 2015;16:170.
Slater GS, Birney E. Automated generation of heuristics for biological sequence comparison. BMC Bioinformatics. 2005;6:31.
Haas BJ, Salzberg SL, Zhu W, Pertea M, Allen JE, Orvis J, et al. Automated eukaryotic gene structure annotation using EVidenceModeler and the program to assemble spliced alignments. Genome Biol. 2008;9(1):R7.
Shiryev SA, Papadopoulos JS, Schaffer AA, Agarwala R. Improved BLAST searches using longer words for protein seeding. Bioinformatics. 2007;23(21):2949–51.
Koski LB, Gray MW, Lang BF, Burger G. AutoFACT: an automatic functional annotation and classification tool. BMC Bioinformatics. 2005;6:151.
Petersen TN, Brunak S, von Heijne G, Nielsen H. SignalP 4.0: discriminating signal peptides from transmembrane regions. Nat Methods. 2011;8(10):785–6.
Sperschneider J, Dodds PN, Gardiner DM, Singh KB, Taylor JM. Improved prediction of fungal effector proteins from secretomes with EffectorP 2.0. Mol Plant Pathol. 2018;19(9):2094–110.
Kent WJ. BLAT--the BLAST-like alignment tool. Genome Res. 2002;12(4):656–64.
Olson SA. EMBOSS opens up sequence analysis. European molecular biology open software suite. Brief Bioinform. 2002;3(1):87–91.
Edgar RC. MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics. 2004;5:113.
Waterhouse AM, Procter JB, Martin DM, Clamp M, Barton GJ. Jalview version 2--a multiple sequence alignment editor and analysis workbench. Bioinformatics. 2009;25(9):1189–91.
Bailey TL, Elkan C. Fitting a mixture model by expectation maximization to discover motifs in biopolymers. Proc Int Conf Intell Syst Mol Biol. 1994;2:28–36.
Gupta S, Stamatoyannopoulos JA, Bailey TL, Noble WS. Quantifying similarity between motifs. Genome Biol. 2007;8(2):R24.
Khan A, Fornes O, Stigliani A, Gheorghe M, Castro-Mondragon JA, van der Lee R, et al. JASPAR 2018: update of the open-access database of transcription factor binding profiles and its web framework. Nucleic Acids Res. 2018;46(D1):D260–D6.
Delcher AL, Salzberg SL, Phillippy AM. Using MUMmer to identify similar regions in large sequence sets. Curr Protoc Bioinformatics. 2003;Chapter 10:Unit 10 3.
Sonnhammer EL, Durbin R. A dot-matrix program with dynamic threshold control suited for genomic DNA and protein sequence analysis. Gene. 1995;167(1–2):GC1–10.
Sullivan MJ, Petty NK, Beatson SA. Easyfig: a genome comparison visualizer. Bioinformatics. 2011;27(7):1009–10.
We thank the Australian grain growers for their continued support of research through the Grains Research and Development Corporation (GRDC) and the Australian Government National Collaborative Research Infrastructure Strategy (NCRIS) for providing access to Pawsey Supercomputing under a National Computational Merit Allocation Scheme (NCMAS), Nectar Research/Pawsey Nimbus Cloud resources. We would also like to thank Prof. Tim Friesen, Department of Plant Pathology, North Dakota State University, Fargo, ND for supplying the isolate DW5.
This work was generously supported through co-investment by Grains Research and Development Corporation (GRDC) and Curtin University (project code CUR00023) as well as Australian Government National Collaborative Research Infrastructure Strategy and Education Investment Fund Super Science Initiative. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.
Ethics approval and consent to participate
Consent for publication
All authors have read the manuscript and declare that they have no competing interests.
The original version of this article was revised: an update on the location of DW5 ToxB cluster relative to the M4 chromosome 10 fusion event was required.
Nucleotide multiple sequence alignment of the ten ToxB loci regions.
Predicted DNA binding site motif.
Cite this article
Moolhuijzen, P., See, P.T. & Moffat, C.S. PacBio genome sequencing reveals new insights into the genomic organisation of the multi-copy ToxB gene of the wheat fungal pathogen Pyrenophora tritici-repentis. BMC Genomics 21, 645 (2020). https://doi.org/10.1186/s12864-020-07029-4