Volume 15 Supplement 1
Genome sequencing of high-penicillin producing industrial strain of Penicillium chrysogenum
- Fu-Qiang Wang†1,
- Jun Zhong†2, 3,
- Ying Zhao1,
- Jingfa Xiao2,
- Jing Liu1,
- Meng Dai1,
- Guizhen Zheng1,
- Li Zhang1,
- Jun Yu2,
- Jiayan Wu2 (corresponding author) and
- Baoling Duan1 (corresponding author)
© Wang et al.; licensee BioMed Central Ltd. 2014
Published: 24 January 2014
Because of the importance of Penicillium chrysogenum in medicine, the genome of the low-penicillin-producing laboratory strain Wisconsin54-1255 has been sequenced and fully annotated. Through classical mutagenesis of Wisconsin54-1255, penicillin titers and productivities have increased dramatically, but the underlying genome structural variations remain poorly understood. Genome sequencing of a high-penicillin-producing industrial strain is therefore highly informative.
To gain insight into the genome structural variations of a high-penicillin-producing strain, we sequenced the industrial strain P. chrysogenum NCPC10086. Whole-genome comparative analysis revealed a large number of mutations, insertions and deletions, and structural variations. There are 69 new genes that do not exist in the genome sequence of Wisconsin54-1255, some of which are involved in energy metabolism, nitrogen metabolism and glutathione metabolism. Most importantly, we discovered a 53.7 Kb "new shift fragment" in the amplified penicillin biosynthesis cluster (seven copies) of NCPC10086, and the arrangement of the amplified region is unique. Moreover, we identified two large-scale translocations in NCPC10086 containing genes involved in energy metabolism, nitrogen metabolism and the peroxisome pathway. Finally, we found non-synonymous mutations in genes that participate in the homogentisate pathway or act as regulators of penicillin biosynthesis.
We provide the first high-quality genome sequence of an industrial high-penicillin strain of P. chrysogenum and a comparative genome analysis against a low-producing laboratory strain. The genomic variations we discovered are related to energy metabolism, nitrogen metabolism and other processes. These findings offer information that may help explain the high-penicillin-yielding mechanism and guide future metabolic engineering.
Penicillin and other β-lactam antibiotics have played a significant role in human medical history [1, 2] since Fleming's discovery of the antibacterial action of the filamentous fungus Penicillium notatum in 1929. The regulation of penicillin biosynthesis has been studied for many years, and many of the proteins and pathways involved have been identified [4–9]. Improving P. chrysogenum strains to obtain higher penicillin yields is a major objective of industrial research [10, 11].
Because of the importance of P. chrysogenum, the genome of the low-penicillin producer Wisconsin54-1255, which is widely used in laboratories, was sequenced, and a number of genes responsible for key steps in penicillin production were identified. Genes related to the precursors for penicillin biosynthesis, as well as genes encoding microbody proteins and transporters, were found, illustrating the potential for future genomics-driven metabolic engineering. Through classical mutagenesis and screening, penicillin titers and productivities have increased dramatically since the Wisconsin54-1255 strain, but how a low-penicillin producer was transformed into an efficient producer remains unclear. For commercial reasons, the improvement of P. chrysogenum strains has never stopped. The productivity of industrial strains is far higher than that of their ancestor, and this progress was obtained mainly by classical mutagenesis and screening. Because the mutations were random, most of the genetic changes in high-yield strains are unknown. Although some significant structural variations (SVs) [8, 9, 13] and differential expression profiles [12, 14, 15] have been found in high-penicillin-producing strains, little is known about the genome-wide changes between the low-producing laboratory strain and high-producing industrial strains.
To gain more insight into the genome structural variations of a high-penicillin-producing strain, we sequenced a Chinese industrial strain, NCPC10086. We also carried out a comprehensive comparative genomics analysis [16–19] to find all mutations and large-scale structural variations between NCPC10086 and the first published genome of a P. chrysogenum strain, Wisconsin54-1255. Selected variations, including mutations, indels and structural variations, were examined for their potential biological impact on penicillin biosynthesis. Our genome sequence data and analyses explore the differences between high- and low-yield P. chrysogenum strains and provide information that may be useful for improving strains with direct genetic engineering tools.
Genome sequencing, assembly and general characteristics
[Table: P. chrysogenum NCPC10086 genome sequencing data. Columns: insert fragment size (Kb); sequencing throughput (Mb). Platforms: Roche 454 GS; Illumina HiSeq 2000.]
[Table: Global statistics of the genome assembly and annotation of P. chrysogenum NCPC10086. Recovered row labels: percentage of the assembly; GC content (%); (1 gene every n bp); genes with intron.]
Genome comparison analysis between P. chrysogenum NCPC10086 and Wisconsin54-1255
[Table: Metabolism or process involving several "new" genes. Column: the metabolism or process. Entries: amino sugar and nucleotide sugar metabolism; nitrogen metabolism, oxidative phosphorylation; glutathione, arachidonic acid, taurine and hypotaurine metabolism; fluorobenzoate, chlorocyclohexane and chlorobenzene, toluene degradation.]
[Table: Single nucleotide variations (SNVs) involved in the homogentisate pathway and the regulators of penicillin biosynthesis. Gene descriptions: a phenylacetate 2-hydroxylase which catalyzes the first step of the homogentisate pathway for PAA catabolism; strongly similar to pahA (three entries); a global regulator of secondary metabolism; an activator and repressor of secondary metabolism.]
We sequenced the whole genome of an industrial high-penicillin-producing strain, NCPC10086, and carried out a whole-genome comparison with Wisconsin54-1255. A total genome size of 32.3 Mb was assembled, with a contig N50 of 661 Kb and a scaffold N50 of 2.8 Mb. The gene structures were predicted with a combined de novo and homology-based approach and annotated with four gene annotation systems.
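The contig and scaffold N50 values quoted here follow the usual definition: the length L such that sequences of length at least L account for at least half of all assembled bases. A minimal sketch of that calculation is given below; the contig lengths are made-up toy values, not the NCPC10086 assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half of the
        # total assembly size is the N50.
        if running >= half_total:
            return length


toy_contigs = [100, 80, 60, 40, 20]  # hypothetical contig lengths (bp)
print(n50(toy_contigs))
```

The same function applies unchanged to scaffold lengths, which is why an assembly reports a contig N50 and a scaffold N50 separately.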
Whole-genome comparative analysis revealed a large number of mutations, insertions and deletions, and structural variations. There are 69 "new" genes that do not exist in the genome sequence of Wisconsin54-1255, some of which are involved in energy metabolism, nitrogen metabolism and glutathione metabolism. As expected, a high-penicillin-producing strain needs more energy for penicillin synthesis, sorting, transport and processing, and some of the new genes participate in energy metabolism. One "new" gene was found in nitrogen metabolism, which is strongly upregulated in the high-producing strain in cultures supplemented with the side-chain precursor PAA (phenylacetic acid). Both cysteine biosynthesis and the reduction of oxidized glutathione require NADPH; if glutathione metabolism is more active, NADPH could be reserved for more cysteine biosynthesis, improving penicillin production. Our "new" gene involved in glutathione metabolism may affect this process.
The penicillin biosynthetic gene cluster (PBC) is the well-known core of penicillin production and exists in all strains; copy number and fragment arrangement are its key features. The high-penicillin-producing strain NCPC10086 has seven copies of the PBC and one 53.7 Kb "new shift fragment" with a unique arrangement type. The TTTACA sequence and its reverse complement, TGTAAA, could be hotspots for site-specific recombination after multiple rounds of mutation. This process may serve to repair damage caused by nitrosoguanidine mutagenesis. We found two large translocations in NCPC10086: one is a 266 Kb fragment moved from a subtelomeric to a centromeric region, which includes genes regulating nitrogen metabolite repression; the other is a 1,202 Kb fragment containing a mitochondrial ADP/ATP carrier gene involved in energy metabolism and the peroxin-2 gene involved in the peroxisome pathway.
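To make the hexanucleotide relationship concrete, the sketch below checks that TGTAAA is the reverse complement of TTTACA and lists where either form occurs in a sequence. It is only an illustration: the function names and the toy sequence are assumptions, and it is not the procedure used to analyse the NCPC10086 cluster.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_motif_sites(seq, motif="TTTACA"):
    """Return sorted (0-based position, strand) pairs for a motif.

    '+' marks the motif itself, '-' marks its reverse complement.
    """
    seq = seq.upper()
    sites = []
    for pattern, strand in ((motif, "+"), (reverse_complement(motif), "-")):
        start = seq.find(pattern)
        while start != -1:
            sites.append((start, strand))
            start = seq.find(pattern, start + 1)
    return sorted(sites)


toy_sequence = "GGTTTACAGGTGTAAACC"  # hypothetical sequence
print(reverse_complement("TTTACA"))
print(find_motif_sites(toy_sequence))
```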
Based on our comparative genomics results, we predict that energy metabolism and nitrogen metabolism play an important role in penicillin production, together with glutathione metabolism and the peroxisome pathway. To further analyze the genes involved in these processes, we examined two groups of genes in more depth: the pahA gene set and the velvet-like complex genes. Translocations, stop-codon mutations, and synonymous and non-synonymous mutations were found in them. These variations may affect the homogentisate pathway for PAA catabolism as well as the global regulation of secondary metabolism, including penicillin biosynthesis, sporulation and pigmentation.
We found many mutations and structural variations, but determining how many of them affect penicillin yield, and how, remains a formidable challenge. Efficient ways to narrow down the possibilities are to sequence more genomes to find shared variations and to carry out systems-biology investigations using "omic" data. Through genome resequencing and functional analysis, the identification of precise mutations in strains with altered phenotypes will add insight into specific gene functions and guide further metabolic engineering efforts.
This is the first high-quality genome of a high-penicillin-producing industrial strain of Penicillium chrysogenum, and it provides abundant genetic information for a broad range of biomedical researchers. Through comparative genomics analysis with a low-producing strain, we found many mutations, insertions and deletions, and structural variations. Moreover, we identified "new" genes, absent from the public genome sequence of Wisconsin54-1255, that are involved in energy metabolism, nitrogen metabolism and glutathione metabolism. Most remarkably, in the penicillin biosynthesis cluster we found a 53.7 Kb "new shift fragment" in our high-producing strain, and the type of fragment arrangement is unique. In addition, we described a 266 Kb translocation including a regulator of nitrogen metabolite repression and a 1,202 Kb translocation including genes involved in energy metabolism and the peroxisome pathway. Our findings lay a foundation for understanding the high-penicillin-producing mechanism and for future metabolic engineering.
Source of sample and culture conditions
P. chrysogenum strain NCPC10086 was selected for genome sequencing because it has been commercialized at North China Pharmaceutical Group Corporation. Spore suspensions of NCPC10086 were inoculated into 40 mL of seed medium (20 g/L sucrose corn steep liquor, 20 g/L sucrose, 5 g/L yeast extract, 5 g/L CaCO3, pH 5.8) in 250 mL flasks and incubated on a rotary shaker (250 r.p.m.) at 26°C for 24 h. Two milliliters of seed culture were transferred to 40 mL of fermentation medium (35 g/L lactose, 30 g/L corn steep liquor, 5 g/L (NH4)2SO4, 1 g/L KH2PO4, 1 g/L K2SO4, 10 g/L CaCO3, 2 g/L phenylacetic acid, 6 mL/L corn oil, pH 6.0) and grown at 26°C with shaking at 250 r.p.m.
Genome sequencing and assembly
The genome of P. chrysogenum strain NCPC10086 was sequenced by the whole-genome random sequencing method [20, 21]. We used the Roche 454 GS FLX system to produce single-end reads at 18× coverage, with an average read length of 410 bp, for contig assembly. In addition, 3-4 Kb and 6-8 Kb mate-pair libraries were produced for contig and scaffold assembly, with 5× coverage sequenced on the ABI 3730 system and 1× coverage sequenced on the Megabace1000 system. The ABI 3730 and Megabace1000 produced average read lengths of 659 bp and 739 bp, respectively. Phred and Phrap [46, 47] were used to process the raw data from the ABI 3730 and Megabace1000. Hybrid assembly of all single-end and mate-pair reads was performed with Newbler using the overlap-layout-consensus strategy. After assembly, we aligned our contigs to the reference sequence to predict the gap size distribution in our genome. Based on this distribution, we designed a mate-pair library with 1-2 Kb inserts for scaffolding. We mapped the high-quality mate-pair reads onto the scaffolds and used them to fill gaps when one read of a pair was located at the edge of a gap. Finally, redundant sequences were removed through self-alignment.
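The coverage figures in this section relate read count, read length and genome size through the standard relation coverage = (number of reads × mean read length) / genome size. The sketch below is a back-of-envelope illustration only: it combines the 18× coverage and 410 bp read length stated here with the 32.3 Mb assembly size reported in this article, and the resulting read count is an estimate derived for illustration, not a number reported by the authors.

```python
import math


def fold_coverage(n_reads, mean_read_len_bp, genome_size_bp):
    """Average sequencing depth implied by a read set."""
    return n_reads * mean_read_len_bp / genome_size_bp


def reads_needed(target_coverage, mean_read_len_bp, genome_size_bp):
    """Smallest whole number of reads that reaches the target depth."""
    return math.ceil(target_coverage * genome_size_bp / mean_read_len_bp)


GENOME_SIZE_BP = 32_300_000  # 32.3 Mb assembly size
approx_reads = reads_needed(18, 410, GENOME_SIZE_BP)
print(approx_reads)
print(round(fold_coverage(approx_reads, 410, GENOME_SIZE_BP), 2))
```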
Gene prediction and annotation
Repeat sequences in the P. chrysogenum NCPC10086 genome were masked using RepeatMasker (v 3.3.0) with the RepBase library (20120418) and Tandem Repeats Finder (v 3.2.1). Gene structures were predicted with a combined de novo and homology-based approach. First, for all repeat-masked scaffolds larger than 1 Kb, Fgenesh (v 3.1.2) and GeneMark-ES (v 2.3e) were run on the whole genomic sequence to provide an initial set of predicted ORFs. Fgenesh was trained on sequences of Penicillium funiculosum, and GeneMark-ES relies on a self-training algorithm for fungal genomes. Preference was given to Fgenesh genes, and all predicted proteins were required to be longer than 10 aa. Second, we used the gene predictions of P. chrysogenum Wisconsin54-1255 to revise and complement our predicted genes by homology searching. Finally, the two sets of results were integrated into the final set of predicted genes. All predicted proteins were searched with blastp against GenBank's non-redundant protein database, InterProScan, Swiss-Prot/UniProtKB and Gene Ontology (GO), and the best alignment for each protein was taken as its annotation. Predicted proteins with no blastp alignment were automatically considered hypothetical proteins. Unique and shared proteins among the four gene annotation systems were displayed as a Venn diagram (http://bioinfogp.cnb.csic.es/tools/venny/index.html). WEGO was used to plot the GO annotation results. Pathway analysis was carried out with KAAS (v 1.67x) using the SBH method.
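The best-hit annotation rule described here can be sketched as follows. This is a simplified illustration, not the authors' pipeline: the text does not say how the "best alignment" was ranked, so ranking by bit score is an assumption, and the function name, the tuple layout of the hits and the toy records are all hypothetical.

```python
def annotate(proteins, hits):
    """Assign each predicted protein the description of its best hit.

    proteins: dict mapping protein id -> amino-acid sequence
    hits:     iterable of (query id, hit description, bit score)
    Proteins of 10 aa or fewer are dropped; proteins without any hit
    are labelled as hypothetical proteins.
    """
    best = {}
    for query, description, bitscore in hits:
        # Assumption: the "best" alignment is the one with the top bit score.
        if query not in best or bitscore > best[query][1]:
            best[query] = (description, bitscore)

    annotation = {}
    for protein_id, sequence in proteins.items():
        if len(sequence) <= 10:
            continue
        if protein_id in best:
            annotation[protein_id] = best[protein_id][0]
        else:
            annotation[protein_id] = "hypothetical protein"
    return annotation


toy_proteins = {"g1": "M" * 50, "g2": "M" * 30, "g3": "M" * 8}
toy_hits = [
    ("g1", "isopenicillin N synthase", 300.0),
    ("g1", "oxidoreductase", 120.0),
]
print(annotate(toy_proteins, toy_hits))
```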
Calling single nucleotide variations (SNVs), insertions and deletions (InDels) and copy number variations (CNVs)
Using the assembled P. chrysogenum Wisconsin54-1255 genome as the reference, we aligned all high-quality reads to it with SOAP (v 2.21) to identify SNVs. In this gap-free alignment, at most two mismatches were allowed between a read and the reference, and the best hits were selected. Reads mapping to multiple locations were filtered out. We used SOAPsnp (v1.03), a statistical model based on Bayesian theory and the Illumina quality system, to calculate a probability for each possible genotype at each position of the reference genome. Five thresholds were applied to filter out unreliable SNVs: (1) at least five reads were required for each SNV; (2) the average quality of each SNV had to be higher than 20; (3) the overall depth had to be less than 150; (4) the approximate copy number of the flanking sequences had to be less than 2 (to avoid false SNVs caused by the alignment of similar reads from repeat units or by copy number variations); (5) at least one pair of mate-pair reads had to support the SNV. For InDel detection, we used Pindel (v0.2.4) to find the breakpoints of large deletions and medium-sized insertions from paired-end or single reads. Short reads were aligned with BWA-backtrack (v0.5.9) and long reads with BWA-SW. SAMtools (v0.1.17) was used to manipulate alignments in the SAM format. Copy number variations were detected with CNV-seq, which is based on a robust statistical model, using 50× high-quality reads from the Illumina HiSeq 2000.
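The five SNV thresholds listed in this section can be written as a single predicate. The sketch below is only an illustration of that filter: the record type and its field names are hypothetical and do not correspond to the SOAPsnp output format, and the toy candidates are invented.

```python
from dataclasses import dataclass


@dataclass
class SnvCandidate:
    # Hypothetical field names chosen for this illustration.
    supporting_reads: int     # reads supporting the SNV
    mean_quality: float       # average quality of the SNV
    depth: int                # overall depth at the site
    flank_copy_number: float  # approximate copy number of flanking sequence
    mate_pair_support: int    # mate-pair read pairs supporting the SNV


def passes_filters(c):
    """Apply the five reliability thresholds to one SNV candidate."""
    return (
        c.supporting_reads >= 5        # (1) at least five reads
        and c.mean_quality > 20        # (2) average quality above 20
        and c.depth < 150              # (3) overall depth below 150
        and c.flank_copy_number < 2    # (4) flanking copy number below 2
        and c.mate_pair_support >= 1   # (5) at least one supporting mate pair
    )


toy_candidates = [
    SnvCandidate(12, 35.0, 48, 1.0, 3),   # passes all five thresholds
    SnvCandidate(4, 35.0, 48, 1.0, 3),    # too few supporting reads
    SnvCandidate(12, 35.0, 210, 3.1, 3),  # likely repeat or amplified region
]
kept = [c for c in toy_candidates if passes_filters(c)]
print(len(kept))
```

Thresholds (3) and (4) both guard against the same failure mode, reads from repeats or amplified regions piling up at one reference position, which is why they are evaluated together with the per-site quality checks.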
Identification of structural variations
We used Blat (v 34) with default parameters to align the scaffolds of P. chrysogenum NCPC10086 to the reference, the whole genome of P. chrysogenum Wisconsin54-1255, and searched for colinearity between them. The alignment results for each scaffold indicate its candidate locations. For scaffolds with multiple hits, the ten hits with the highest sequence similarity were retained as candidate locations. The alignment with the longest matches in a linear orientation between a scaffold and the reference was picked as the 'best hit' of that scaffold. After finding structural variations, we used Blastn with parameter '1e-5' to check the detailed alignment results. We randomly selected 20× mate-pair reads from the 180× high-quality Illumina HiSeq 2000 reads. Reads were mapped with SOAP (v 2.21) and the alignments were visualized with MapView (v 3.4.1). The 5'-3' primers used for PCR verification of the "266 Kb translocation" are ACCTGGCGTGCCTCATGCAGCG and TTGGGGTGGAATGACGTGGGG, located 200 bp before and 300 bp after the breakpoint. The 5'-3' primers used for PCR verification of the "1,202 Kb translocation" are ACCTGTGGGGATCATTAGCCTCC and ACTCGGATAGTCTAGGTTCGGCGG, located 250 bp before and 220 bp after the breakpoint.
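The two-stage choice of a best hit per scaffold (keep the ten most similar hits, then take the one with the longest matches) can be sketched as below. This is a simplified reading of the rule: the hit records and their keys are hypothetical rather than a Blat output format, and the "linear orientation" check is not modelled.

```python
def pick_best_hit(hits, keep=10):
    """Choose the best reference location for one scaffold.

    hits: list of dicts with hypothetical keys
          'identity' (sequence similarity, percent) and
          'matches'  (number of matching bases).
    """
    if not hits:
        return None
    # Stage 1: retain the `keep` hits with the highest sequence similarity.
    candidates = sorted(hits, key=lambda h: h["identity"], reverse=True)[:keep]
    # Stage 2: among those candidates, take the hit with the longest matches.
    return max(candidates, key=lambda h: h["matches"])


# Twelve toy hits with identities 99, 98, ..., 88 percent.
toy_hits = [
    {"location": f"ref:{i}", "identity": 99 - i, "matches": 100}
    for i in range(12)
]
toy_hits[4]["matches"] = 500      # long match inside the top ten
toy_hits[11]["matches"] = 10_000  # longest match, but outside the top ten
print(pick_best_hit(toy_hits)["location"])
```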
Availability of supporting data
P. chrysogenum strain NCPC10086 genome sequences are available via GenBank/EMBL/DDBJ under the accession APKG00000000.
Publication of this article was funded by grants from the Natural Science Foundation of China (31271386 and 31101063) and by grant 2012AA020409 from the National Programs for High Technology Research and Development (863 Program), the Ministry of Science and Technology of the People's Republic of China.
This article has been published as part of BMC Genomics Volume 15 Supplement 1, 2014: Selected articles from the Twelfth Asia Pacific Bioinformatics Conference (APBC 2014): Genomics. The full contents of the supplement are available online at http://www.biomedcentral.com/bmcgenomics/supplements/15/S1.
- Drews J: Drug Discovery: A Historical Perspective. Science. 2000, 287: 1960-1964. 10.1126/science.287.5460.1960.
- Hamed RB, Gomez-Castellanos JR, Henry L, Ducho C, McDonough MA, Schofield CJ: The enzymes of beta-lactam biosynthesis. Nat Prod Rep. 2012, 30: 21-107.
- Fleming A: On the Antibacterial Action of Cultures of a Penicillium, with Special Reference to their Use in the Isolation of B. influenzæ. British Journal of Experimental Pathology. 1929, 10: 226-236.
- Schofield CJ, Baldwin JE, Byford MF, Clifton I, Hajdu J, Hensgens C, Roach P: Proteins of the penicillin biosynthesis pathway. Curr Opin Struct Biol. 1997, 7: 857-864. 10.1016/S0959-440X(97)80158-3.
- Brakhage AA: Molecular regulation of beta-lactam biosynthesis in filamentous fungi. Microbiol Mol Biol Rev. 1998, 62: 547-585.
- Brakhage AA, Sprote P, Al-Abdallah Q, Gehrke A, Plattner H, Tuncher A: Regulation of penicillin biosynthesis in filamentous fungi. Adv Biochem Eng Biotechnol. 2004, 88: 45-90.
- Barredo JL, Diez B, Alvarez E, Martin JF: Large amplification of a 35-kb DNA fragment carrying two penicillin biosynthetic genes in high penicillin producing strains of Penicillium chrysogenum. Curr Genet. 1989, 16: 453-459. 10.1007/BF00340725.
- van den Berg MA, Westerlaken I, Leeflang C, Kerkman R, Bovenberg RAL: Functional characterization of the penicillin biosynthetic gene cluster of Penicillium chrysogenum Wisconsin54-1255. Fungal Genetics and Biology. 2007, 44: 830-844. 10.1016/j.fgb.2007.03.008.
- Fierro F, Garcia-Estrada C, Castillo NI, Rodriguez R, Velasco-Conde T, Martin JF: Transcriptional and bioinformatic analysis of the 56.8 kb DNA region amplified in tandem repeats containing the penicillin gene cluster in Penicillium chrysogenum. Fungal Genetics and Biology. 2006, 43: 618-629. 10.1016/j.fgb.2006.03.001.
- Cullen D: The genome of an industrial workhorse. Nature Biotechnology. 2007, 25: 189-190. 10.1038/nbt0207-189.
- van den Berg MA: Impact of the Penicillium chrysogenum genome on industrial production of metabolites. Appl Microbiol Biotechnol. 2011, 92: 45-53. 10.1007/s00253-011-3476-z.
- van den Berg MA, Albang R, Albermann K, Badger JH, Daran JM, Driessen AJM, Garcia-Estrada C, Fedorova ND, Harris DM, Heijne WHM, et al: Genome sequencing and analysis of the filamentous fungus Penicillium chrysogenum. Nature Biotechnology. 2008, 26: 1161-1168. 10.1038/nbt.1498.
- Fierro F, Barredo JL, Diez B, Gutierrez S, Fernandez FJ, Martin JF: The penicillin gene cluster is amplified in tandem repeats linked by conserved hexanucleotide sequences. Proceedings of the National Academy of Sciences of the United States of America. 1995, 92: 6200-6204. 10.1073/pnas.92.13.6200.
- Kiel JA, van der Klei IJ, van den Berg MA, Bovenberg RA, Veenhuis M: Overproduction of a single protein, Pc-Pex11p, results in 2-fold enhanced penicillin production by Penicillium chrysogenum. Fungal Genetics and Biology. 2005, 42: 154-164. 10.1016/j.fgb.2004.10.010.
- Jami MS, Barreiro C, Garcia-Estrada C, Martin JF: Proteome analysis of the penicillin producer Penicillium chrysogenum: Characterization of protein changes during the industrial strain improvement. Mol Cell Proteomics. 2010.
- Andersen MR, Salazar MP, Schaap PJ, van de Vondervoort PJ, Culley D, Thykaer J, Frisvad JC, Nielsen KF, Albang R, Albermann K, et al: Comparative genomics of citric-acid-producing Aspergillus niger ATCC 1015 versus enzyme-producing CBS 513.88. Genome Res. 2011, 21: 885-897. 10.1101/gr.112169.110.
- Wohlbach DJ, Kuo A, Sato TK, Potts KM, Salamov AA, Labutti KM, Sun H, Clum A, Pangilinan JL, Lindquist EA, et al: Comparative genomics of xylose-fermenting fungi for enhanced biofuel production. Proc Natl Acad Sci USA. 2011, 108: 13212-13217. 10.1073/pnas.1103039108.
- Cornell MJ, Alam I, Soanes DM, Wong HM, Hedeler C, Paton NW, Rattray M, Hubbard SJ, Talbot NJ, Oliver SG: Comparative genome analysis across a kingdom of eukaryotic organisms: specialization and diversification in the fungi. Genome Res. 2007, 17: 1809-1822. 10.1101/gr.6531807.
- Kellis M, Patterson N, Endrizzi M, Birren B, Lander ES: Sequencing and comparison of yeast species to identify genes and regulatory elements. Nature. 2003, 423: 241-254. 10.1038/nature01644.
- Weber JL, Myers EW: Human whole-genome shotgun sequencing. Genome Res. 1997, 7: 401-409.
- Fleischmann RD, Adams MD, White O, Clayton RA, Kirkness EF, Kerlavage AR, Bult CJ, Tomb JF, Dougherty BA, Merrick JM, et al: Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science. 1995, 269: 496-512. 10.1126/science.7542800.
- Shendure J, Ji H: Next-generation DNA sequencing. Nat Biotechnol. 2008, 26: 1135-1145. 10.1038/nbt1486.
- MacLean D, Jones JDG, Studholme DJ: Application of 'next-generation' sequencing technologies to microbial genetics. Nature Reviews Microbiology. 2009.
- Margulies M, Egholm M, Altman WE, Attiya S, Bader JS, Bemben LA, Berka J, Braverman MS, Chen YJ, Chen Z, et al: Genome sequencing in microfabricated high-density picolitre reactors. Nature. 2005, 437: 376-380.
- Sanger F, Nicklen S, Coulson AR: DNA sequencing with chain-terminating inhibitors. Proc Natl Acad Sci USA. 1977, 74: 5463-5467. 10.1073/pnas.74.12.5463.
- Fedurco M, Romieu A, Williams S, Lawrence I, Turcatti G: BTA, a novel reagent for DNA attachment on glass and efficient generation of solid-phase amplified DNA colonies. Nucleic Acids Res. 2006, 34: e22. 10.1093/nar/gnj023.
- Turcatti G, Romieu A, Fedurco M, Tairi AP: A new class of cleavable fluorescent nucleotides: synthesis and optimization as reversible terminators for DNA sequencing by synthesis. Nucleic Acids Res. 2008, 36: e25. 10.1093/nar/gkn320.
- Salamov AA, Solovyev VV: Ab initio gene finding in Drosophila genomic DNA. Genome Res. 2000, 10: 516-522. 10.1101/gr.10.4.516.
- Ter-Hovhannisyan V, Lomsadze A, Chernoff YO, Borodovsky MG-E: Gene prediction in novel fungal genomes using an ab initio algorithm with unsupervised training. Genome Research. 2008, 18: 1979-1990. 10.1101/gr.081612.108.
- Quevillon E, Silventoinen V, Pillai S, Harte N, Mulder N, Apweiler R, Lopez R: InterProScan: protein domains identifier. Nucleic Acids Res. 2005, 33: W116-120. 10.1093/nar/gki442.
- Zdobnov EM, Apweiler R: InterProScan--an integration platform for the signature-recognition methods in InterPro. Bioinformatics. 2001, 17: 847-848. 10.1093/bioinformatics/17.9.847.
- Boeckmann B: The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003. Nucleic Acids Research. 2003, 31: 365-370. 10.1093/nar/gkg095.
- O'Donovan C, Martin MJ, Gattiker A, Gasteiger E, Bairoch A, Apweiler R: High-quality protein knowledge resource: SWISS-PROT and TrEMBL. Brief Bioinform. 2002, 3: 275-284. 10.1093/bib/3.3.275.
- Camon E, Magrane M, Barrell D, Binns D, Fleischmann W, Kersey P, Mulder N, Oinn T, Maslen J, Cox A, Apweiler R: The Gene Ontology Annotation (GOA) project: implementation of GO in SWISS-PROT, TrEMBL, and InterPro. Genome Res. 2003, 13: 662-672. 10.1101/gr.461403.
- Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, et al: Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet. 2000, 25: 25-29. 10.1038/75556.
- Harris DM, van der Krogt ZA, Klaassen P, Raamsdonk LM, Hage S, van den Berg MA, Bovenberg RA, Pronk JT, Daran JM: Exploring and dissecting genome-wide gene expression responses of Penicillium chrysogenum to phenylacetic acid consumption and penicillinG production. BMC Genomics. 2009, 10: 75. 10.1186/1471-2164-10-75.
- Nasution U, van Gulik WM, Ras C, Proell A, Heijnen JJ: A metabolome study of the steady-state relation between central metabolism, amino acid biosynthesis and penicillin production in Penicillium chrysogenum. Metab Eng. 2008, 10: 10-23. 10.1016/j.ymben.2007.07.001.
- Fierro F, Gutierrez S, Diez B, Martin JF: Resolution of four large chromosomes in penicillin-producing filamentous fungi: the penicillin gene cluster is located on chromosome II (9.6 Mb) in Penicillium notatum and chromosome I (10.4 Mb) in Penicillium chrysogenum. Mol Gen Genet. 1993, 241: 573-578.
- Raper KB, Alexander DF, Coghill RD: Penicillin: II. Natural Variation and Penicillin Production in Penicillium notatum and Allied Species. J Bacteriol. 1944, 48: 639-659.
- Šmidák R, Kralovičová M, Ševčíková B, Jakubčová M, Kormanec J, Timko J, Turňa J: Sequence analysis and gene amplification study of the penicillin biosynthesis gene cluster from different strains of Penicillium chrysogenum. Biologia. 2010, 65: 1-6. 10.2478/s11756-009-0216-2.
- Galagan JE, Calvo SE, Cuomo C, Ma LJ, Wortman JR, Batzoglou S, Lee SI, Basturkmen M, Spevak CC, Clutterbuck J, et al: Sequencing of Aspergillus nidulans and comparative analysis with A. fumigatus and A. oryzae. Nature. 2005, 438: 1105-1115. 10.1038/nature04341.
- Rodriguez-Saiz M, Barredo JL, Moreno MA, Fernandez-Canon JM, Penalva MA, Diez B: Reduced function of a phenylacetate-oxidizing cytochrome p450 caused strong genetic improvement in early phylogeny of penicillin-producing strains. J Bacteriol. 2001, 183: 5465-5471. 10.1128/JB.183.19.5465-5471.2001.
- Kosalkova K, Garcia-Estrada C, Ullan RV, Godio RP, Feltrer R, Teijeira F, Mauriz E, Martin JF: The global regulator LaeA controls penicillin biosynthesis, pigmentation and sporulation, but not roquefortine C synthesis in Penicillium chrysogenum. Biochimie. 2009, 91: 214-225. 10.1016/j.biochi.2008.09.004.
- Hoff B, Kamerewerd J, Sigl C, Mitterbauer R, Zadra I, Kurnsteiner H, Kuck U: Two components of a velvet-like complex control hyphal morphogenesis, conidiophore development, and penicillin biosynthesis in Penicillium chrysogenum. Eukaryot Cell. 2010, 9: 1236-1250. 10.1128/EC.00077-10.
- Quo CF, Kaddi C, Phan JH, Zollanvari A, Xu M, Wang MD, Alterovitz G: Reverse engineering biomolecular systems using -omic data: challenges, progress and opportunities. Briefings in Bioinformatics. 2012, 13: 430-445. 10.1093/bib/bbs026.
- Ewing B, Hillier L, Wendl MC, Green P: Base-calling of automated sequencer traces using phred. I. Accuracy assessment. Genome Res. 1998, 8: 175-185. 10.1101/gr.8.3.175.
- Ewing B, Green P: Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res. 1998, 8: 186-194.
- Jurka J, Kapitonov VV, Pavlicek A, Klonowski P, Kohany O, Walichiewicz J: Repbase Update, a database of eukaryotic repetitive elements. Cytogenet Genome Res. 2005, 110: 462-467. 10.1159/000084979.
- Benson G: Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 1999, 27: 573-580. 10.1093/nar/27.2.573.
- Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ: Basic local alignment search tool. J Mol Biol. 1990, 215: 403-410.
- Ye J, Fang L, Zheng H, Zhang Y, Chen J, Zhang Z, Wang J, Li S, Li R, Bolund L: WEGO: a web tool for plotting GO annotations. Nucleic Acids Res. 2006, 34: W293-297. 10.1093/nar/gkl031.
- Moriya Y, Itoh M, Okuda S, Yoshizawa AC, Kanehisa M: KAAS: an automatic genome annotation and pathway reconstruction server. Nucleic Acids Res. 2007, 35: W182-185. 10.1093/nar/gkm321.
- Li R, Li Y, Kristiansen K, Wang J: SOAP: short oligonucleotide alignment program. Bioinformatics. 2008, 24: 713-714. 10.1093/bioinformatics/btn025.
- Li R, Li Y, Fang X, Yang H, Wang J, Kristiansen K: SNP detection for massively parallel whole-genome resequencing. Genome Res. 2009, 19: 1124-1132. 10.1101/gr.088013.108.
- Ye K, Schulz MH, Long Q, Apweiler R, Ning Z: Pindel: a pattern growth approach to detect break points of large deletions and medium sized insertions from paired-end short reads. Bioinformatics. 2009, 25: 2865-2871. 10.1093/bioinformatics/btp394.
- Li H, Durbin R: Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009, 25: 1754-1760. 10.1093/bioinformatics/btp324.
- Li H, Durbin R: Fast and accurate long-read alignment with Burrows-Wheeler transform. Bioinformatics. 2010, 26: 589-595. 10.1093/bioinformatics/btp698.
- Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R: The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009, 25: 2078-2079. 10.1093/bioinformatics/btp352.
- Xie C, Tammi MT: CNV-seq, a new method to detect copy number variation using high-throughput sequencing. BMC Bioinformatics. 2009, 10: 80. 10.1186/1471-2105-10-80.
- Kent WJ: BLAT--the BLAST-like alignment tool. Genome Res. 2002, 12: 656-664.
- Bao H, Guo H, Wang J, Zhou R, Lu X, Shi S: MapView: visualization of short reads alignment on a desktop computer. Bioinformatics. 2009, 25: 1554-1555. 10.1093/bioinformatics/btp255.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.