Whole genome transcription profiling of Anaplasma phagocytophilum in human and tick host cells by tiling array analysis

  • Curtis M Nelson
  • Michael J Herron
  • Roderick F Felsheim
  • Brian R Schloeder
  • Suzanne M Grindle
  • Oliva Adela Chavez
  • Timothy J Kurtti
  • Ulrike G Munderloh (corresponding author)

                  BMC Genomics 2008, 9:364

                  DOI: 10.1186/1471-2164-9-364

                  Received: 07 May 2008

                  Accepted: 31 July 2008

                  Published: 31 July 2008

                  Abstract

                  Background

                  Anaplasma phagocytophilum (Ap) is an obligate intracellular bacterium and the agent of human granulocytic anaplasmosis, an emerging tick-borne disease. Ap alternately infects ticks and mammals and a variety of cell types within each. An understanding of the biology behind such versatile cellular parasitism may be gained by using tiling microarrays to establish high resolution, genome-wide transcription profiles of the organism as it infects cell lines representative of its life cycle (tick; ISE6) and pathogenesis (human; HL-60 and HMEC-1).

                  Results

                  Detailed, host cell specific transcriptional behavior was revealed. There was extensive differential Ap gene transcription between the tick (ISE6) and the human (HL-60 and HMEC-1) cell lines, with far fewer differentially transcribed genes between the human cell lines, and all disproportionately represented by membrane or surface proteins. There were Ap genes exclusively transcribed in each cell line, apparent human- and tick-specific operons and paralogs, and anti-sense transcripts that suggest novel expression regulation processes. Seven virB2 paralogs (of the bacterial type IV secretion system) showed human or tick cell dependent transcription. Previously unrecognized genes and coding sequences were identified, as were the expressed p44/msp2 (major surface proteins) paralogs (of 114 total), through elevated signal produced to the unique hypervariable region of each: 2/114 in HL-60, 3/114 in HMEC-1, and none in ISE6.

                  Conclusion

                  Using these methods, whole genome transcription profiles can likely be generated for Ap, as well as other obligate intracellular organisms, in any host cells and for all stages of the cell infection process. Visual representation of comprehensive transcription data alongside an annotated map of the genome renders complex transcription into discernible patterns.

                  Background

                  Arthropod-borne intracellular organisms that parasitize the cells of mammalian hosts must be able to manipulate a diversity of host cells to support their own growth and life cycle. Revealing how they accomplish this will illuminate not only pathogenesis but also cell biology. Anaplasma phagocytophilum (Ap) is a gram-negative obligate intracellular bacterium, the agent of human granulocytic anaplasmosis (HGA), an emerging tick-borne disease. Ap has a 1.47 million base pair genome with 1411 annotated features [1]. Clinically, membrane-bound Ap colonies, called morulae, are seen in peripheral blood neutrophils. The white-footed mouse (Peromyscus leucopus) is considered to be the primary reservoir for the Ap variant responsible for HGA, but other mammals are also susceptible [1–4]. Ticks do not pass Ap to their offspring, but to mammals they feed upon, which transmit it back to ticks, and so the organism cycles between tick and mammalian hosts.

                  HGA is a potentially severe illness with symptoms, including pancytopenia and limb edema, that suggest other cells or tissues, besides neutrophils, are infected [5–7]. In mice, Ap infects endothelial cells [8], and human bone marrow cells support infection in vivo and in vitro [5,9]. The specific cells infected in ticks have not been unambiguously identified; however, evidence indicates they reside within midgut and salivary gland tissues [10–12]. Tick cell lines have been developed that support Ap replication, including ISE6, which was isolated from Ixodes scapularis, the primary vector of HGA in North America [13]. Susceptible human cell lines include HL-60, a promyelocytic leukemia cell line that serves as a model for neutrophils, and the microvascular endothelial cell line HMEC-1 [14]. Ap produces distinct infection phenotypes and growth kinetics in these cell lines, suggesting, along with its broad host range, that the organism adapts to each host by shifting its gene expression.

                  The obligate intracellular lifestyle of Ap makes direct biochemical, genetic, and observational study approaches inherently difficult. Transformation of Ap with fluorescent reporters has recently been achieved and should improve visualization of live bacteria and open avenues for directed genetic research [15]. Nevertheless, methods for functional genomic analysis, for example, specific gene knockout, are still lacking. Gene transcription and expression analyses in animal models are largely impractical because Ap levels in tick and mammal tissues are too low for recovery of sufficient bacterial RNA or protein. In vitro studies have focused on characterization of the immunodominant p44/msp2 genes, which encode a large family of major surface proteins whose expression varies according to whether the organisms were derived from tick or mammalian host cells [16]. In addition, genes encoding the type IV secretion system of Ap have been identified, transcriptionally analyzed, and described [17,18], but their function and regulation remain undefined. DNA microarrays have been used to measure changes in host cell gene transcription during infection, with an aim to infer the mechanisms and strategies applied by Ap [19–24], but no microarray studies that directly measure Ap transcription have been published.

                  The release of an annotated Ap genome sequence [1], and development of maskless, photolithographic, digital light processor technology (DLP) [25] have made it feasible to characterize global transcript levels in Ap using tiling microarrays [26,27]. With these technologies entire genomes can be probed instead of sampling only selected sequences. The continuous data generated can be plotted in genomic order as a line graph, with transcribed genes appearing as peaks rising from a baseline of non-transcribed or intergenic sequence, and peak height corresponding to relative transcript abundance. A direct alignment of this to a parallel, annotated map of the genome can provide a visually striking and intuitive way to assess the data. Through Affymetrix (Santa Clara, CA) and NimbleGen Systems, Inc. (Madison, WI), we designed a tiling microarray for the entire genome of Ap (1.47 Mbp) and characterized Ap gene transcription in three cell lines representative of its life cycle (ISE6 tick) and pathogenesis in humans (HL-60 and HMEC-1).

                  Methods

                  Cell lines, Ap strain, and growth conditions

                  Sterile and Ap-infected HL-60 cells (American Type Culture Collection, Manassas, VA, USA; ATCC CCL-240) were maintained in RPMI 1640 medium supplemented with 10% fetal bovine serum (FBS) and 25 mM HEPES. Cultures infected with Ap isolate HZ were subcultured weekly by 1:50 (v/v) dilution of > 90% infected cells into sterile HL-60 cultures [28]. The HMEC-1 cell line was received from the Centers for Disease Control (Atlanta, GA), and both sterile and infected cells were likewise cultured in RPMI 1640 medium with 10% FBS and 25 mM HEPES [29]. Infected HMEC-1 cultures were fed daily and Ap subcultured 1:50 bi-weekly when > 80% of cells were infected. HL-60 and HMEC-1 cultures were kept at 37°C in a humidified atmosphere of 5% CO2 in air. ISE6 cells were propagated in L15B300 medium with 5% tryptose phosphate broth (BD, Sparks MD, USA), 5% FBS, and 0.1% lipoprotein concentrate (MPBiomedical, Irvine CA, USA) at 34°C [13]. Ap-infected ISE6 cultures were fed twice weekly with medium buffered to pH 7.6 using 0.25% NaHCO3 and 25 mM HEPES, and subcultured 1:50 bi-weekly [13].

                  Ap strain HZ was cultured from the blood of a New York State patient by co-culture with HL-60 cells [Goodman et al., unpublished; 28]. HZ-Ap-infected HL-60 cells (passage 8) were simultaneously inoculated into the three cell lines. These infected parallel cultures were continuously subcultured and served as the source of infected cell samples for tiling array analysis. All samples from each cell line were from Ap cultures between passages 21 and 34.

                  Tiling array design and manufacture

                  Through consultation with Affymetrix (Santa Clara, CA), a library of 258,480 complementary (perfect match) 25-mer oligonucleotide probes covering both DNA strands of the Ap genome (isolate HZ) [1] was designed. Each probe overlapped its neighbor by 11 bases for a probe resolution of 14 bases, the distance from the center of one probe to the next. Probes were "hard pruned", that is, rid of highly repetitive sequence elements thought to be irrelevant, using an Affymetrix algorithm that identifies, somewhat subjectively, long repeat sequences. Probes for these were not included, though probes for many "shorter" repeating sequences were. Pruned sequences can be viewed easily in the Artemis graphs. They are characterized by successive data points with the same or similar value that together produce large blunt peaks. For examples see additional file 1, coordinates 665858–666184, 1025792–1026289, and 645698–646032. NimbleGen Systems, Inc. (Madison, WI) synthesized the oligonucleotide probes in situ using a photo-mediated, maskless process in which the synthesis of each probe is directed by a digital light processor [25].
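The tiling geometry described above (25-mer probes, 11-base overlap, 14-base step, both strands) can be illustrated with a short sketch. This is not the Affymetrix design software: the function names are hypothetical, the sequence is treated as linear, and the "hard pruning" of repeats is not modeled.

```python
PROBE_LENGTH = 25                  # 25-mer probes, as stated in the text
OVERLAP = 11                       # bases shared by neighboring probes
STEP = PROBE_LENGTH - OVERLAP      # 14-base center-to-center resolution

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def tile_probes(genome):
    """Yield (start, strand, sequence) for every probe window on both strands.

    Assumption: the genome is handled as a linear string and no repeat
    pruning is applied, unlike the real array design.
    """
    for start in range(0, len(genome) - PROBE_LENGTH + 1, STEP):
        window = genome[start:start + PROBE_LENGTH]
        yield start, "+", window
        yield start, "-", reverse_complement(window)


# Toy 80-base sequence: windows start at 0, 14, 28 and 42 on each strand.
toy_probes = list(tile_probes("ACGT" * 20))
toy_starts = sorted({start for start, _, _ in toy_probes})
```

On the toy sequence, consecutive probe starts differ by 14 bases, which is the probe resolution quoted in the text.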

                  Isolation of RNA

                  Ap genomic transcription was measured in each of the three cell lines when cultures were approximately 95% infected. Typically, cells contained hundreds of bacteria (Figure 1: Microscopic images of Giemsa stained cells infected with Ap). RNA was extracted from three Ap-infected and three uninfected samples of each cell line (18 samples total). Each sample was from a separate culture and consisted of approximately 10^7 infected cells or uninfected control cells. Cells were suspended by pipetting (HL-60 and ISE6) or with a cell scraper (HMEC-1) and immediately centrifuged at 300 × g for 2 minutes. The supernatant was aspirated and discarded; cell pellets were loosened by flicking and immediately dissolved in TRI REAGENT™ (Sigma, Saint Louis, MO, USA). All steps were performed at room temperature. Total RNA was then isolated according to the TRI REAGENT™ product instructions. In brief, samples in TRI REAGENT™ were extracted with chloroform and centrifuged at 12,000 × g for 15 minutes at 4°C. RNA in the aqueous, upper phase was precipitated in isopropanol, collected by centrifugation at 12,000 × g for 10 minutes at 4°C, and washed twice in cold 75% ethanol. RNA pellets were dissolved in 100 μL RNase-free water, quantified by spectrophotometry, and processed for array analysis.
                  Figure 1

                  Microscopic images of Giemsa stained cells infected with Ap. (A) Ap-infected HMEC-1. (B) Ap-infected ISE6. (C) Ap-infected HL-60. Cell nuclei are labeled "N" and arrows point to Ap morulae. Scale bar = 10 μm.

                  Preparation of tiling array "target"

                  Total RNA from Ap-infected or sterile control cells was processed for Ap transcript measurement according to the Affymetrix "Prokaryotic Target Preparation" protocol using random priming of total RNA to synthesize a single strand of cDNA. The cDNA was recovered by column purification, fragmented with DNase I, and end labeled with biotin. These biotinylated cDNA fragment "targets" were hybridized to the "probes" contained on the tiling arrays, labeled with a streptavidin-phycoerythrin conjugate, and probe hybridization was quantified by laser scanning. The detailed protocol was as follows.

                  cDNA synthesis

                  In a volume of 30 μL, 10 μg of total RNA (from Ap-infected or sterile control cells) was combined with random primers (25 ng/μL final concentration) (Invitrogen, Carlsbad, CA), and, in a thermocycler, incubated 10 minutes at 70°C followed by 10 minutes at 25°C, then chilled to 4°C. To this reaction mixture was added 30 μL of the following master mix: 12 μL 5× 1st Strand Buffer, 6 μL 100 mM DTT, 3 μL 10 mM dNTPs, 1.5 μL SUPERaseIn™ (20 U/μL) (Ambion, Austin TX, USA), 7.5 μL SuperScript II (200 U/μL) (Invitrogen). Samples (60 μL) were incubated in a thermocycler 10 minutes at 25°C, 60 minutes at 37°C, 60 minutes at 42°C, 10 minutes at 70°C, and chilled to 4°C.
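As a quick arithmetic check of the recipe above, the following sketch sums the master mix volumes and derives the final concentrations in the 60 μL reaction. All volumes and stock concentrations are taken from the text; the variable and function names are illustrative only.

```python
# Master mix components and volumes (uL), as listed in the text.
master_mix_ul = {
    "5x 1st Strand Buffer": 12.0,
    "100 mM DTT": 6.0,
    "10 mM dNTPs": 3.0,
    "SUPERaseIn (20 U/uL)": 1.5,
    "SuperScript II (200 U/uL)": 7.5,
}

primer_rna_mix_ul = 30.0                              # RNA plus random primers
master_mix_total_ul = sum(master_mix_ul.values())     # should equal 30 uL
reaction_total_ul = primer_rna_mix_ul + master_mix_total_ul  # 60 uL


def final_concentration(stock, volume_ul):
    """Dilute a stock concentration into the full reaction volume."""
    return stock * volume_ul / reaction_total_ul


buffer_x = final_concentration(5.0, 12.0)    # 1x buffer
dtt_mM = final_concentration(100.0, 6.0)     # 10 mM DTT
dntp_mM = final_concentration(10.0, 3.0)     # 0.5 mM dNTPs
rt_units = 200.0 * 7.5                       # 1500 U SuperScript II per reaction
rnase_inhibitor_units = 20.0 * 1.5           # 30 U SUPERaseIn per reaction
```

The component volumes do add up to the stated 30 μL, giving 1× buffer, 10 mM DTT and 0.5 mM dNTPs in the final 60 μL reaction.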

                  cDNA isolation and fragmentation

                  To degrade RNA, 20 μL of 1N NaOH was added to each sample, which was then incubated at 65°C for 30 minutes and neutralized by addition of 20 μL 1N HCl. MinElute PCR Purification Columns (Qiagen, Valencia CA, USA) were used according to product instructions to purify cDNA from the samples. Typical cDNA yields were 3–4 μg. cDNA in 10 μL was combined with 2 μL 10× One-Phor-All Buffer (Amersham Biosciences, Piscataway, NJ), 0.6 U DNase I per μg cDNA (Amersham Biosciences), plus sufficient water for 20 μL total volume, and incubated 10 minutes at 37°C. DNase I was inactivated by heating to 98°C for 10 minutes. The cDNA fragments produced were 50–200 bases in length.

                  Biotinylation of 3' termini of cDNA fragments

                  The GeneChip® DNA labeling kit (Affymetrix) was used as follows: 20 μL fragmented cDNA was combined with 10 μL 5× reaction buffer, 2 μL 7.5 mM GeneChip DNA labeling reagent, 2 μL terminal deoxynucleotidyl transferase, and 16 μL water and incubated at 37°C for 60 minutes. The reaction was stopped with 2 μL of 0.5 M EDTA and then frozen at -20°C until it was applied to an array.

                  Tiling array hybridization and scanning

                  Samples were hybridized to tiling arrays and processed at the BioMedical Genomics Center at the University of Minnesota using the Affymetrix Fluidics Station 400. Arrays were then scanned using an Affymetrix GeneChip 3000 scanner according to standard Affymetrix protocols.

                  Tiling array data analysis

                  "Cel" files generated by the University of Minnesota's microarray facility were joined to Affymetrix BPMAP files specific to the tiling array using Affymetrix® Tiling Analysis Software (TAS). TAS generated a list of signal intensities and arranged them in order of genomic location and DNA strand. The data are available at the NCBI Gene Expression Omnibus (GEO) database (study #GSE11487; http://www.ncbi.nlm.nih.gov/projects/geo/query/acc.cgi?acc=GSE11487).

                  Graphical representation of these data along with their annotations was accomplished with the Java-based program "Artemis" (http://www.sanger.ac.uk/). Using a script developed internally, the intensity plots were reformatted and imported into Artemis along with an annotation feature list (http://www.ncbi.nlm.nih.gov). The resulting graphics give a visual overview of transcription as it relates to genomic organization, and provide clues to operon structure (see additional file 1: Artemis transcription graph of the entire, annotated Ap genome during infection of HL-60, HMEC-1, and ISE6 cell lines). The complete genome coverage provided by the overlapping probes on the tiling array translates into 90 spot intensities generated for a 1000 base open reading frame (ORF). This large number of intensities, coupled with the quality of the data, suggested that creating a linear graph and measuring the area under the peaks in regions corresponding to annotated open reading frames (ORF transcription areas) would be a simple and useful method to quantify transcripts for each ORF. To compute these ORF transcription areas, the intensities were normalized via quantiles [30] and imported into the IgorPro data analysis program (WaveMetrics, Lake Oswego OR, USA) along with the ORF and structural RNA annotations available from http://www.ncbi.nlm.nih.gov. A script was written to index a trapezoidal integration algorithm of the intensity list with the start and end genomic positions indicated on the annotation. This script operation generated a list of 1411 transcription areas.
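The two numerical steps described above, quantile normalization across arrays and trapezoidal integration of intensities between an ORF's start and end coordinates, can be sketched in plain Python. This is only an illustration, not the authors' IgorPro script: the function names are hypothetical, ties are broken by position rather than averaged, and probes are assumed to be sorted by genomic position.

```python
def quantile_normalize(arrays):
    """Give every array the same intensity distribution (mean of each rank).

    `arrays` is a list of equal-length intensity lists, one per array.
    Assumption: ties are ranked in positional order, a simplification.
    """
    n = len(arrays[0])
    orders = [sorted(range(n), key=a.__getitem__) for a in arrays]
    rank_means = [
        sum(a[o[rank]] for a, o in zip(arrays, orders)) / len(arrays)
        for rank in range(n)
    ]
    normalized = []
    for order in orders:
        out = [0.0] * n
        for rank, index in enumerate(order):
            out[index] = rank_means[rank]
        normalized.append(out)
    return normalized


def orf_transcription_area(positions, intensities, start, end):
    """Trapezoidal area under the intensity curve for probes within [start, end]."""
    points = [(p, y) for p, y in zip(positions, intensities) if start <= p <= end]
    return sum(
        (p1 - p0) * (y0 + y1) / 2.0
        for (p0, y0), (p1, y1) in zip(points, points[1:])
    )


# Toy example: two arrays of three probes, then one ORF spanning positions 0-28.
normalized = quantile_normalize([[5, 2, 3], [4, 1, 6]])
area = orf_transcription_area([0, 14, 28, 42], [1, 3, 3, 1], start=0, end=28)
```

Applying `orf_transcription_area` once per annotated feature would yield one area per feature, analogous to the list of 1411 transcription areas mentioned in the text.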

                  Statistical evaluation of area differences, T values & Fold change

                  ORF transcription areas were computed from the quantile-normalized data (three each for HL-60, HMEC-1 and ISE6), and paired, two-tailed Student's t-tests were performed on the following comparisons: HL-60 vs. ISE6, HMEC-1 vs. ISE6, and HL-60 vs. HMEC-1. ORF transcription area comparisons with p values ≤ 0.05 were considered significant for determination of the number and identity of genes transcribed. Determination of differentially expressed genes used the additional requirement that the mean ORF transcription area be at least twice the mean ORF transcription area of the same gene in the compared cell line.
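A minimal sketch of the selection rule described above (paired two-tailed t-test at p ≤ 0.05 plus a 2-fold difference in mean area) is given below. It is not the authors' code. With three replicates per cell line the test has 2 degrees of freedom, for which the Student t distribution has a closed-form CDF, so no statistics library is needed; the function names and the example values are hypothetical.

```python
import math
from statistics import mean, stdev


def paired_t_pvalue_3(a, b):
    """Two-tailed paired t-test p-value for exactly three pairs (df = 2).

    For df = 2 the t CDF is F(t) = 1/2 + t / (2 * sqrt(t**2 + 2)),
    so the two-tailed p-value is 1 - |t| / sqrt(t**2 + 2).
    """
    if len(a) != 3 or len(b) != 3:
        raise ValueError("this closed form is only valid for three pairs")
    diffs = [x - y for x, y in zip(a, b)]
    sd = stdev(diffs)
    if sd == 0:
        # Degenerate case: identical differences across all replicates.
        return 1.0 if mean(diffs) == 0 else 0.0
    t = mean(diffs) / (sd / math.sqrt(3))
    return 1.0 - abs(t) / math.sqrt(t * t + 2.0)


def is_differential(a, b, alpha=0.05, min_fold=2.0):
    """Apply both criteria: p <= alpha and at least min_fold difference in means."""
    high, low = max(mean(a), mean(b)), min(mean(a), mean(b))
    fold = high / low if low > 0 else math.inf
    return paired_t_pvalue_3(a, b) <= alpha and fold >= min_fold


# Hypothetical ORF transcription areas for one gene in two cell lines.
passes = is_differential([100, 110, 120], [40, 52, 59])
fails = is_differential([100, 110, 120], [90, 120, 100])
```

In practice a library routine such as a paired t-test from a statistics package would normally be used; the closed form here only keeps the example self-contained.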

                  The number of expressed ORFs was determined by t-test comparison between the ORF transcription areas from infected cell monolayers and those of uninfected control cell monolayers. The signal intensity of these arrays was baseline corrected using the signal intensities of twelve manually selected intergenic regions, devoid of obvious signal, from across the span of the genome. ORF transcription area comparisons with p values ≤ 0.05 were considered significant for determination of the number and identity of genes transcribed.
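The text does not state exactly how the intergenic regions were used for baseline correction. One plausible reading, sketched here purely as an assumption, is to subtract the mean intensity of the selected regions from every probe and clip negative values at zero. The function names and the toy region coordinates are hypothetical.

```python
from statistics import mean


def baseline_from_regions(positions, intensities, regions):
    """Mean intensity of probes falling inside any (start, end) baseline region."""
    values = [
        y for p, y in zip(positions, intensities)
        if any(start <= p <= end for start, end in regions)
    ]
    if not values:
        raise ValueError("no probes fall inside the baseline regions")
    return mean(values)


def baseline_correct(positions, intensities, regions):
    """Subtract the baseline and clip at zero (both steps are assumptions)."""
    base = baseline_from_regions(positions, intensities, regions)
    return [max(y - base, 0.0) for y in intensities]


# Toy example with two "intergenic" regions flanking a transcribed peak.
corrected = baseline_correct(
    positions=[0, 14, 28, 42, 56],
    intensities=[10, 12, 100, 90, 8],
    regions=[(0, 14), (56, 56)],
)
```

Other schemes, such as subtracting a median or fitting a sloped baseline, would fit the description equally well.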

                  Validation of tiling array data by quantitative reverse transcription-PCR (qRT-PCR)

                  Five Ap genes with known products were assayed for relative transcript abundance by qRT-PCR. Tiling data indicated that four of the genes had differential transcription patterns between the human and tick cells: major outer membrane protein (omp-1A; APH_1359), outer membrane efflux protein (APH_1110), major surface protein 4 (msp4; APH_1240), and the 60 kDa chaperonin (APH_0240). The fifth gene, which codes for succinyl-CoA synthetase beta subunit (APH_1052), was transcribed equally in all three cell lines (Figure 2: Artemis transcription profiles for five genes chosen for assay by qRT-PCR).
                  Figure 2

                  Artemis profiles depicting the relative transcription levels of five Ap genes during infection of HL-60 (red), HMEC-1 (green), and ISE6 (blue) cells. Plots were "smoothed" by setting the sliding window average to 5. (A) Major outer membrane protein gene (omp-1A; APH_1359) transcription greater in the human cell lines compared to the tick cell line. (B) Outer membrane efflux protein (APH_1110) greater in the tick cell line compared to the human cell lines. (C) Transcription of the major surface protein 4 gene (msp4; APH_1240) only in the tick cell line. (D) Transcription of the 60 kDa chaperonin gene (groL; APH_0240) was greatest in HL-60, significantly lower in HMEC-1, and least in ISE6. (E) Equal transcription of the succinyl-CoA synthetase beta subunit gene (sucC; APH_1052) in all three cell lines.

                  Total RNA samples (portions of those prepared for array analysis) from three separate cultures of each Ap-infected cell line (9 samples) were assayed in triplicate by qRT-PCR. To eliminate any DNA contamination, samples were DNase I treated using DNA-free™ (Ambion). DNase I was inactivated and RNA purified using RNeasy mini columns (Qiagen, Valencia, CA). mRNA from each of the five genes was reverse transcribed and amplified quantitatively with primers designed using MacVector (Cary, NC) and Netprimer (Palo Alto, CA) (see additional file 2: qRT-PCR primers). The primers were tested by conventional PCR on a Stratagene (La Jolla CA, USA) Robocycler with temperature gradient capability, using Ap strain HZ DNA as target. Formation of appropriate product sizes was verified, and a single annealing temperature (60°C) and primer concentration (150 nM) suitable for all five primer pairs were determined, allowing RNA from each of the cell lines to be qRT-PCR-amplified together for best determination of relative transcript levels. Reverse transcription and subsequent quantitative PCR were performed on 100 ng of each RNA sample in 96-well plates using the Brilliant II SYBR Green 1-step qRT-PCR kit (Stratagene) and Stratagene's Mx3005P thermal cycler. To initiate the qRT-PCR, reverse transcription was allowed to proceed for 30 minutes at 50°C, followed by heat treatment for 10 minutes at 95°C to activate DNA polymerase and deactivate reverse transcriptase. cDNA was then amplified during 40 cycles of 30 seconds at 95°C, 1 minute at 60°C, and 1 minute at 72°C.

                  Results

                  Percentage of Ap genes measured as transcribed in each cell line

                  Of the 1411 annotated features [1] in the Ap genome, 983 (69.6%) were significantly transcribed (p-value ≤ 0.05) in HL-60, 620 (43.9%) in HMEC-1, and 974 (69.0%) in ISE6, compared to negative control samples (RNA from uninfected cells).

                  Differential Ap gene transcription between cell lines

                  Between HL-60 and HMEC-1, 71 Ap ORFs (5%) were differentially (p-value ≤ 0.05) transcribed (see additional file 3: Ap-HL-60 vs. Ap-HMEC-1 differential transcription). Between HL-60 and ISE6, 585 Ap ORFs (41.5%) were differentially transcribed. Between HMEC-1 and ISE6, 304 Ap ORFs (21.5%) were differentially transcribed. When a fold change criterion of 2 or greater was added, only one Ap gene passed in the comparison of Ap from HL-60 (Ap-HL-60) and Ap from HMEC-1 (Ap-HMEC-1): APH_1342, one of the p44/msp2 paralogs. Between Ap-HL-60 and Ap from ISE6 (Ap-ISE6), 117 ORFs (8.5%), and between Ap-HMEC-1 and Ap-ISE6, 61 ORFs (4.3%) were at least 2-fold different (Table 1). The relatively low percentage of ORFs measured as transcribed in Ap-HMEC-1 (43.9%) was probably due to lower average signal intensity from those samples (850 vs. 2120 in HL-60). We determined this to be the result of suboptimal biotin labeling after using a particular batch of terminal transferase. A new aliquot of terminal transferase used in the preparation of one of the samples of Ap-HL-60 produced a particularly bright signal, resulting in a higher signal to noise ratio for the Ap-HL-60 data. Because of this, and because differential transcription was low between the human cell lines (compared to that between the human and tick cell lines), subsequent descriptions of differential transcription in Ap-ISE6 are based on comparisons to Ap-HL-60.
                  Table 1

                  Summary of differential Ap gene transcription between HL-60, HMEC-1, and ISE6

                  Comparison | # ORFs differentially transcribed (p ≤ 0.05) | % of total ORFs | # ORFs ≥ 2-fold differentially transcribed (p ≤ 0.05) | % of total ORFs
                  Ap-HL-60 vs. Ap-HMEC-1 | 71 | 5.0 | 1 | 0.07
                  Ap-HL-60 vs. Ap-ISE6 | 585 | 41.5 | 117 | 8.5
                  Ap-HMEC-1 vs. Ap-ISE6 | 304 | 21.5 | 61 | 4.3

                  Of the 117 Ap ORFs differentially transcribed (p ≤ 0.05, ≥ two-fold difference) between the HL-60 and ISE6 cells, 76 had higher levels in HL-60 and 41 had higher levels in ISE6. The 76 Ap-HL-60 ORFs comprise 35 known and 41 hypothetical proteins (54%). All but three of the ORFs that were up-regulated in Ap-ISE6 are annotated as hypothetical (93%) (see Table 2: Genes differentially transcribed between human (HL-60) and tick (ISE6) cells). By comparison, 40% of all Ap genes are annotated as hypothetical.
                  Table 2

                  Summary of Ap-HL-60 vs. Ap-ISE6 differential gene transcription

                  # | Gene Product | Locus | Predicted Cellular Location | Fold Change Ap-HL-60/Ap-ISE6
                  1 | DNA-binding protein | APH_1100 | Cytoplasmic | 4.9
                  2 | HGE-14 protein | APH_0387 | Extracellular | 4.8
                  3 | hypothetical protein | APH_1412 | Outer Membrane | 4.2
                  4 | hypothetical protein | APH_0915 | Outer Membrane, Extracellular | 4.1
                  5 | hypothetical protein | APH_0906 | Outer Membrane | 4.1
                  6 | major outer membrane protein OMP-1A | APH_1359 | Outer Membrane | 4.0
                  7 | hypothetical protein | APH_1378 | Outer Membrane | 3.8
                  8 | hypothetical protein | APH_0842 | Cytoplasmic | 3.7
                  9 | hypothetical protein | APH_0838 | Outer Membrane | 3.6
                  10 | hypothetical protein | APH_0388 | Cytoplasmic | 3.6
                  11 | hypothetical protein | APH_1145 | Inner Membrane | 3.5
                  12 | OmpA family protein | APH_0338 | Outer Membrane | 3.4
                  13 | DNA-binding response regulator | APH_1099 | Cytoplasmic | 3.4
                  14 | hypothetical protein | APH_1144 | Inner Membrane | 3.4
                  15 | hypothetical protein | APH_0837 | Cytoplasmic | 3.4
                  16 | HGE-14 protein | APH_0382 | Extracellular | 3.3
                  17 | hypothetical protein | APH_0005 | Inner Membrane | 3.3
                  18 | hypothetical protein | APH_0756 | Inner Membrane, Cytoplasmic | 3.2
                  19 | 10 kDa chaperonin | APH_0241 | Periplasmic | 3.2
                  20 | hypothetical protein | APH_0032 | Outer Membrane, Extracellular | 3.1
                  21 | hypothetical protein | APH_0874 | Outer Membrane | 3.1
                  22 | hypothetical protein | APH_0233 | Inner Membrane | 3.1
                  23 | HGE-14 protein | APH_0385 | Cytoplasmic | 3.1
                  24 | signal peptidase II | APH_1160 | Inner Membrane | 3.0
                  25 | hypothetical protein | APH_1156 | Cytoplasmic | 3.0
                  26 | hypothetical protein | APH_0793 | Inner Membrane | 2.9
                  27 | HGE-14 protein | APH_0455 | Extracellular | 2.9
                  28 | Omp-1N | APH_1220 | Outer Membrane | 2.8
                  29 | hypothetical protein | APH_0949 | Inner Membrane, Cytoplasmic | 2.7
                  30 | hypothetical protein | APH_0033 | Cytoplasmic | 2.7
                  31 | hypothetical protein | APH_1307 | Inner Membrane | 2.7
                  32 | hypothetical protein | APH_1157 | Inner Membrane | 2.7
                  33 | hypothetical protein | APH_1151 | Inner Membrane | 2.6
                  34 | antioxidant AhpC/Tsa family | APH_0795 | Cytoplasmic | 2.6
                  35 | RNA polymerase sigma-32 factor | APH_0759 | Cytoplasmic | 2.6
                  36 | 60 kDa chaperonin | APH_0240 | Cytoplasmic | 2.6
                  37 | hypothetical protein | APH_1235 | Cytoplasmic | 2.5
                  38 | hypothetical protein | APH_0922 | Inner Membrane | 2.5
                  39 | hypothetical protein | APH_1262 | Cytoplasmic | 2.5
                  40 | hypothetical protein | APH_0757 | Cytoplasmic | 2.5
                  41 | chaperone protein DnaK | APH_0346 | Cytoplasmic | 2.4
                  42 | hypothetical protein | APH_1236 | Cytoplasmic | 2.4
                  43 | hypothetical protein | APH_0363 | Cytoplasmic | 2.4
                  44 | translation initiation factor IF-3 | APH_1263 | Cytoplasmic | 2.4
                  45 | glyceraldehyde-3-phosphate dehydrogenase type I | APH_1349 | Cytoplasmic | 2.4
                  46 | hypothetical protein | APH_0873 | Cytoplasmic | 2.4
                  47 | hypothetical protein | APH_0919 | Inner Membrane | 2.4
                  48 | hypothetical protein | APH_1072 | Cytoplasmic | 2.4
                  49 | hypothetical protein | APH_1320 | Cytoplasmic | 2.3
                  50 | HGE-14 protein | APH_0453 | Cytoplasmic | 2.3
                  51 | outer membrane protein MSP2 family | APH_1325 | Outer Membrane | 2.2
                  52 | hypothetical protein | APH_0643 | Cytoplasmic | 2.2
                  53 | hypothetical protein | APH_0839 | Outer Membrane | 2.2
                  54 | putative acyl carrier protein | APH_0929 | Cytoplasmic | 2.2
                  55 | Es1 family protein | APH_0006 | Cytoplasmic | 2.2
                  56 | hypothetical protein | APH_0179 | Cytoplasmic | 2.2
                  57 | iron-sulfur cluster assembly accessory protein | APH_0676 | Cytoplasmic | 2.2
                  58 | putative ATP synthase F0 B' subunit | APH_1190 | Cytoplasmic | 2.1
                  59 | hypothetical protein | APH_0719 | Cytoplasmic | 2.1
                  60 | hypothetical protein | APH_0991 | Cytoplasmic | 2.1
                  61 | succinate dehydrogenase cytochrome b556 subunit | APH_0999 | Inner Membrane | 2.1

                  62

                  pyruvate phosphate dikinase

                  APH_0185

                  Cytoplasmic

                  2.1

                  63

                  iron-binding protein

                  APH_0051

                  Cytoplasmic

                  2.1

                  64

                  nucleoside diphosphate kinase

                  APH_1217

                  Cytoplasmic

                  2.1

                  65

                  malonyl CoA-acyl carrier protein transacylase

                  APH_0092

                  Cytoplasmic

                  2.0

                  66

                  hypothetical protein

                  APH_0786

                  Cytoplasmic

                  2.0

                  67

                  co-chaperone GrpE

                  APH_0036

                  Cytoplasmic

                  2.0

                  68

                  hypothetical protein

                  APH_0771

                  Cytoplasmic

                  2.0

                  69

                  hypothetical protein

                  APH_0585

                  Cytoplasmic

                  2.0

                  70

                  hypothetical protein

                  APH_0655

                  Cytoplasmic

                  2.0

                  71

                  ribonucleoside-diphosphate reductase alpha subunit

                  APH_0331

                  Cytoplasmic

                  2.0

                  72

                  P44-45 outer membrane protein

                  APH_0171

                  Outer Membrane

                  2.0

                  73

                  adenylosuccinate lyase

                  APH_0867

                  Cytoplasmic

                  2.0

                  74

                  P44-36 outer membrane protein

                  APH_1168

                  Outer Membrane

                  2.0

                  75

                  aspartate aminotransferase

                  APH_0660

                  Cytoplasmic

                  2.0

                  76

                  cytochrome C membrane-bound

                  APH_0180

                  Periplasmic

                  2.0

                   

                  35 of the 76 genes ≥ 2-fold up in Ap-HL-60 are named (41/76 = 54% hypothetical)

                  33/76 genes membrane associated = 43%

                  77 | hypothetical protein | APH_0197 | Periplasmic | 0.5
                  78 | hypothetical protein | APH_0369 | Cytoplasmic | 0.5
                  79 | hypothetical protein | APH_0497 | Cytoplasmic | 0.5
                  80 | hypothetical protein | APH_0425 | Cytoplasmic | 0.5
                  81 | hypothetical protein | APH_0587 | Cytoplasmic | 0.5
                  82 | hypothetical protein | APH_0963 | Cytoplasmic | 0.5
                  83 | hypothetical protein | APH_1130 | Inner Membrane | 0.5
                  84 | hypothetical protein | APH_0467 | Cytoplasmic | 0.5
                  85 | thiamine biosynthesis protein ThiC truncation | APH_0586 | Cytoplasmic | 0.5
                  86 | hypothetical protein | APH_0806 | Periplasmic | 0.5
                  87 | hypothetical protein | APH_0599 | Cytoplasmic | 0.5
                  88 | hypothetical protein | APH_0827 | Cytoplasmic | 0.4
                  89 | outer membrane efflux protein | APH_1110 | Outer Membrane | 0.4
                  90 | hypothetical protein | APH_1131 | Inner Membrane | 0.4
                  91 | hypothetical protein | APH_0829 | Cytoplasmic | 0.4
                  92 | hypothetical protein | APH_0818 | Cytoplasmic | 0.4
                  93 | hypothetical protein | APH_0841 | Cytoplasmic | 0.4
                  94 | hypothetical protein | APH_1382 | Cytoplasmic | 0.4
                  95 | hypothetical protein | APH_0550 | Cytoplasmic | 0.4
                  96 | hypothetical protein | APH_0485 | Cytoplasmic | 0.4
                  97 | hypothetical protein | APH_0355 | Inner Membrane | 0.4
                  98 | hypothetical protein | APH_1132 | Inner Membrane | 0.3
                  99 | hypothetical protein | APH_0720 | Outer Membrane | 0.3
                  100 | hypothetical protein | APH_1384 | Outer Membrane | 0.3
                  101 | hypothetical protein | APH_1380 | Cytoplasmic | 0.3
                  102 | hypothetical protein | APH_1370 | Cytoplasmic | 0.3
                  103 | hypothetical protein | APH_0320 | Cytoplasmic | 0.3
                  104 | hypothetical protein | APH_0726 | Membrane | 0.3
                  105 | hypothetical protein | APH_1369 | Cytoplasmic | 0.3
                  106 | hypothetical protein | APH_1368 | Cytoplasmic | 0.3
                  107 | hypothetical protein | APH_1385 | Cytoplasmic | 0.2
                  108 | hypothetical protein | APH_0724 | Membrane | 0.2
                  109 | hypothetical protein | APH_0805 | Outer Membrane | 0.2
                  110 | hypothetical protein | APH_0723 | Membrane | 0.2
                  111 | hypothetical protein | APH_0487 | Inner Membrane | 0.2
                  112 | hypothetical protein | APH_1386 | Cytoplasmic | 0.2
                  113 | hypothetical protein | APH_0177 | Extracellular | 0.1
                  114 | hypothetical protein | APH_0546 | Extracellular | 0.1
                  115 | major surface protein 4 | APH_1240 | Outer Membrane | 0.1
                  116 | hypothetical protein | APH_0916 | Inner Membrane | 0.1
                  117 | hypothetical protein | APH_0406 | Outer Membrane | 0.1

                   

                  3 of the 41 genes ≥ 2-fold up in Ap-ISE6 are named (38/41 = 93% hypothetical)

                  19/41 genes membrane associated = 46%

                  Genes differentially transcribed (p ≤ 0.05, ≥ two-fold difference) between human (HL-60) and tick (ISE6) cells. Gene products (117) are listed in descending order of their transcript abundance in HL-60, with their fold change indicated in the right-hand column. Gene products 1–76 were more highly transcribed in HL-60, and 77–117 were more highly transcribed in ISE6. Gene products predicted to be membrane associated (in bold) tended to be those most differentially transcribed, at the top (most abundant in Ap-HL-60) and bottom (most abundant in Ap-ISE6) of the list. The percentages of hypothetical genes and of gene products that are membrane associated are also indicated. (~25% of all Ap genes code for proteins that are membrane associated.)
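The selection criteria stated in the table legend (p ≤ 0.05 and at least a two-fold difference in either direction) can be sketched as a small classifier. This is an illustrative sketch only, not the authors' analysis pipeline: the function name `classify` and the p-values are hypothetical, while the fold changes are taken from the table rows shown above. Working on a log2 scale makes the two directions symmetric, so a ratio of 0.5 is treated like a ratio of 2.0.

```python
import math

def classify(fold_change, p_value, min_fold=2.0, alpha=0.05):
    """Label a gene by its HL-60/ISE6 transcript ratio (hypothetical helper)."""
    if p_value > alpha:
        return "not significant"
    log2_fc = math.log2(fold_change)
    cutoff = math.log2(min_fold)
    if log2_fc >= cutoff:
        return "higher in HL-60"
    if log2_fc <= -cutoff:
        return "higher in ISE6"
    return "below fold cutoff"

# Fold changes come from the table; the p-value of 0.01 is a placeholder,
# because per-gene p-values are not listed in this section.
rows = [("APH_0240", 2.6), ("APH_0180", 2.0), ("APH_0197", 0.5), ("APH_1240", 0.1)]
for locus, fold in rows:
    print(locus, classify(fold, p_value=0.01))
```

Because the tabulated ratios are rounded to one decimal place, genes listed at exactly 2.0 or 0.5 sit on the cutoff; the sketch treats the cutoff as inclusive.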

                  The amino acid sequences derived from the 117 differentially transcribed genes were analyzed using the SecretomeP CBS prediction server [31] and the CELLO subcellular localization predictor [32] to determine the probable cellular location of each gene product: periplasm, inner or outer membrane, extracellular (secreted), or cytoplasmic. While 25% of all Ap gene products are membrane associated (non-cytoplasmic), 43% of the 76 genes more highly transcribed in HL-60 cells and 46% of the 41 genes more highly transcribed in ISE6 cells code for non-cytoplasmic proteins. As illustrated in Table 2, the greater a gene's differential transcription, the more likely it was to encode a membrane associated protein (i.e., membrane associated proteins were over-represented among the differentially transcribed genes; see Table 2: Summary of Ap-HL-60 vs. Ap-ISE6 differential gene transcription).
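The over-representation described above can be checked with simple arithmetic. The sketch below recomputes the two percentages from the counts given in the table (33/76 and 19/41) and, as an added illustration that is not part of the original analysis, asks how likely counts at least this large would be if each gene independently had the genome-wide ~25% chance of being membrane associated. The one-sided binomial test and the function name `binomial_upper_tail` are assumptions of this sketch.

```python
from math import comb

def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed exactly."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

BACKGROUND = 0.25  # approximate genome-wide fraction quoted in the text

# (membrane-associated genes, total genes) as summarized in the table
groups = {"higher in HL-60": (33, 76), "higher in ISE6": (19, 41)}

for label, (k, n) in groups.items():
    tail = binomial_upper_tail(k, n, BACKGROUND)
    print(f"{label}: {k}/{n} = {k / n:.0%}, upper-tail p = {tail:.2g}")
```

The independence assumption is a simplification (genes in one operon are not independent), so the tail probabilities should be read as a rough guide rather than a formal result.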

                  As illustrated in the Artemis transcript level graphs (see additional file 1: materials for graphing transcript level data in Artemis), numerous transcription behaviors are revealed when the data are displayed as linear graphs alongside a map of the annotated genome. Transcribed sequences rise from the overall flat baseline and generally correspond well to annotated ORFs. However, there are examples of transcript signal extending beyond ORF boundaries (APH_0005, APH_0406, APH_0793, APH_0808, APH_0811, APH_0859, APH_0906, and APH_1151), transcription apparently not associated with an ORF (coordinates 46672–46738, 944100–944549, 692299–692983, and 1306128–1306875), and transcribed unannotated ORFs (875684–876751, 1445252–1445797, and 1241148–1241727). The ORF identified between coordinates 1241148 and 1241727 is another p44/msp2 paralog, bringing the total number of p44 loci now identified to 114 (113 were originally annotated [1]). Peaks and plateaus of varying profile representing gene transcription are clearly discernible. Often they slope downward from 5' to 3', but sometimes they are flat (Figure 3: Examples of flat and sloped transcription peaks). There are also numerous ORFs and operons that showed no significant transcription in any of the cell lines (see additional file 4: genes and operons with no detected transcripts).
                  Figure 3

                  Artemis transcription plots showing examples of flat and sloped gene transcription profiles (Red: Ap-HL-60, Green: Ap-HMEC-1, Blue: Ap-ISE6; plots were "smoothed" by setting the sliding window average to 5). (A) Polynucleotide phosphorylase gene (pnp) with an overall flat transcription profile in all three cell lines. (B) Two examples of genes, APH_0756 (hypothetical) and rpoH (heat shock sigma factor sigma-32), with transcription profiles that slope downward from 5' to 3'.
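One way to make the distinction between flat and sloped profiles concrete is to fit a straight line to probe signal versus position along a gene. The sketch below is an illustration, not the method used in the paper: the function names, the 25% relative-drop threshold, and the probe values are all hypothetical, and positions are assumed to be ordered from the 5' to the 3' end of the transcript (for a minus-strand gene the coordinates would need to be reversed first).

```python
def transcript_slope(positions, signals):
    """Ordinary least-squares slope of signal versus position (5' to 3')."""
    n = len(positions)
    mean_x = sum(positions) / n
    mean_y = sum(signals) / n
    sxx = sum((x - mean_x) ** 2 for x in positions)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(positions, signals))
    return sxy / sxx

def profile_shape(positions, signals, rel_drop=0.25):
    """Return 'sloped' if the fitted line falls by more than rel_drop of the
    mean signal across the gene, otherwise 'flat' (threshold is arbitrary)."""
    span = max(positions) - min(positions)
    mean_signal = sum(signals) / len(signals)
    fitted_change = transcript_slope(positions, signals) * span
    return "sloped" if fitted_change < -rel_drop * mean_signal else "flat"

# Hypothetical probe offsets (bases from the 5' end) and signal values
offsets = [0, 100, 200, 300, 400]
print(profile_shape(offsets, [10, 11, 10, 9, 10]))   # plateau-like profile
print(profile_shape(offsets, [20, 16, 12, 8, 4]))    # steadily declining profile
```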

                  Paralogs of the p44/msp2 family of outer membrane proteins form a characteristic hybridization pattern that is somewhat perplexing. Since p44 is abundantly expressed in Ap, transcripts with sequences that correspond to the conserved ends of the gene should bind to all the complementary probes on the array, i.e., those of over 100 genes. Signals associated with the conserved ends of the p44 paralogs do rise sharply, while those that correspond to the hypervariable region (HVR) in between are generally near baseline. This produces a double-horn-shaped signature. Most paralogs are not expressed within a population of bacteria [33]; therefore, those that display bridged horns, representing transcript hybridization to the HVR, are likely to be specifically transcribed. In HL-60, APH_1152 (similar to p44-47) and APH_1351 (similar to p44-35), and in HMEC-1, APH_1253 (similar to p44-39), APH_1342 (similar to p44-31), and APH_1350 (similar to p44-51) had strong signals associated with their HVRs, suggesting those paralogs were expressed. Ap-ISE6 produced no significant hybridization to any of the p44 HVRs; however, like Ap-HL-60 and Ap-HMEC-1, Ap-ISE6 produced strong signals to the conserved p44 sequences. In all three cell lines, signals to the conserved p44 sequences were greater than those from the HVRs of the expressing paralogs noted in Ap-HL-60 and Ap-HMEC-1. In addition, this pattern of excessive hybridization to the conserved ends of the p44 ORFs is "reflected" in the non-coding DNA strand. Probes to sequences opposite conserved p44 sense sequences are hybridized significantly in the human cell samples, and as strongly in the tick cells as the sense probes, such that the horned profile appears reflected in the opposite DNA strand (Figure 4: p44 transcription phenomena: horns, reflecting, and HVR associated signal).
                  Figure 4

                  Artemis transcription plots of characteristic p44 transcription profiles (Red: Ap-HL-60, Green: Ap-HMEC-1, Blue: Ap-ISE6; plots were "smoothed" by setting the sliding window average to 5). Arrows in panel B indicate p44 conserved sequence "horns" on the coding (minus) strand, and "reflected" horns (panel A) in the anti-sense (plus) strand. A strong signal (green) associated with the HVR in APH_1342 (*) likely indicates expression of the corresponding p44 paralog (p44-31) in HMEC-1. The lack of HVR associated signal in APH_1343, together with strong conserved sequence associated signals (horns), is typical of most p44 paralogs. An unannotated segment of p44 conserved sequence lies between APH_1343 and APH_1344 (yellow) on the minus strand. It also showed strong sense (B) and anti-sense (A) signals. APH_1344 and APH_1345 show typical transcription profiles: signal on the sense strand (B) but not on the anti-sense strand (A).

                  Exceptions are p44-70, p44-71, p44-72, and p44-79, which have "conserved" ends that differ significantly from the other p44s; they produced no horns or reflections (see additional file 1, coordinates 680648–684696 and 1418814–1420199). Subtler reflecting was also seen in several non-p44 ORFs, such as APH_1387, which codes for outer membrane protein HGE2 [1], and the hypothetical APH_0546 (Figure 5: Reflecting).
                  Figure 5

                  Artemis transcription plots of two genes showing "reflecting" transcription patterns on the anti-sense strands (Red: Ap-HL-60, Green: Ap-HMEC-1, Blue: Ap-ISE6; plots were "smoothed" by setting the sliding window average to 5). (A) HGE2 protein APH_1387. (B) Hypothetical protein APH_0546. Note that Ap in all three cell lines produced sense and anti-sense transcript for APH_1387 (panel A), while in the case of APH_0546 (panel B) only Ap-ISE6 produced sense and anti-sense transcript.

                  Like conserved p44 sequences, repeat sequences, which are common throughout the genome, generally displayed strong signals on both DNA strands (see additional file 5: Repeat-sequence-based sense and anti-sense signal).
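The p44 "horn" signature described in this section (high signal over both conserved ends, with the hypervariable region either near baseline or "bridged") can be expressed as a simple rule on mean probe signal per region. The following is a minimal sketch under stated assumptions: the function name `p44_signature`, the three-fold-over-baseline cutoff, and all signal values are hypothetical, and the paper itself identified these patterns by inspecting Artemis plots rather than by a threshold.

```python
from statistics import mean

def p44_signature(five_prime, hvr, three_prime, baseline, factor=3.0):
    """Classify a p44 paralog from probe signals over its 5' conserved end,
    hypervariable region (HVR), and 3' conserved end."""
    threshold = factor * baseline  # arbitrary cutoff chosen for illustration
    ends_high = mean(five_prime) > threshold and mean(three_prime) > threshold
    hvr_high = mean(hvr) > threshold
    if ends_high and hvr_high:
        return "bridged horns (paralog likely transcribed)"
    if ends_high:
        return "double horn (conserved-end signal only)"
    return "other"

# Hypothetical probe signals for two paralogs, baseline of 5 signal units
print(p44_signature([90, 80], [5, 4, 6], [85, 95], baseline=5))
print(p44_signature([90, 80], [60, 70, 65], [85, 95], baseline=5))
```

Paralogs whose ends diverge from the conserved sequence, such as those noted as exceptions above, would fall into the "other" branch of this sketch.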

                  At the p44 expression locus (APH_1221), both Ap-HL-60 and Ap-HMEC-1 showed strong transcription beginning near base 1289280, just before the start of the omp-1N gene, and continuing through the p44 expression site, while Ap-ISE6 did not. The p44 "horns" seen in Ap-ISE6 within the expression locus are likely examples of the generalized hybridization to conserved p44 sequence noted above. The tr1 gene (APH_1218) upstream of the p44 expression locus, which encodes a putative transcription regulator [34], is well transcribed by Ap-ISE6 but not by Ap-HL-60 or Ap-HMEC-1. The DNA binding protein ApxR (APH_0515 [34]) was weakly transcribed in the human cell lines but not at all in the tick cell line (Figure 6: Artemis transcription plots of the p44 expression site, and ApxR, a putative p44 transcription regulator).
                  Figure 6

                  Artemis plots illustrating transcription activity at the p44 expression site, and at ApxR, a putative p44 transcription regulator (Red: Ap-HL-60, Green: Ap-HMEC-1, Blue: Ap-ISE6; plots were "smoothed" by setting the sliding window average to 5). (A) In the human cell lines, Ap shows transcription beginning upstream of omp-1N (and p44-18ES, the p44 expression locus) near coordinate 1289280, but there is no specific transcription in the tick cell line. Transcription regulator tr1 (APH_1218) is not transcribed in the human cell lines but is in the tick cell line. (B) ApxR (APH_0515), a putative regulator of p44 transcription through binding to and inhibiting the tr1 promoter, shows low-level transcription in the human cell lines but none in the tick cell line.

                  The type IV secretion system genes identified by Hotopp et al. [1] consistently showed little activity in any of the host cells, while sodB (APH_0371), an iron superoxide dismutase shown to be co-transcribed with components of the type IV secretion system of E. chaffeensis and Ap [18], was moderately transcribed by Ap in all three cell lines. Ank (APH_0740) was strongly transcribed in Ap-HMEC-1, somewhat less so in Ap-HL-60, and only marginally in Ap-ISE6. This Ap gene encodes a protein that is translocated to the nucleus of infected HL-60 cells [35,36] and phosphorylated there within minutes [37], presumably as an effector molecule delivered via the Ap type IV secretion system [38]. Located between genome coordinates 1194300 and 1203600 are eight paralogs of the TrbC/VirB2 gene family (pfam04956), six of which showed measurable transcript levels either only in the tick cell line (APH_1131 - APH_1134) or only in the human cell lines (APH_1144 and APH_1145). The relationship by amino acid sequence of these eight paralogs is illustrated in Figure 7 (Phylogenetic tree of eight virB2 paralogs by amino acid sequence), which indicates that those transcribed in ISE6 are more closely related to each other than are those transcribed in HL-60 and HMEC-1. Amino acid sequence alignments for the eight virB2 paralogs of Ap (see additional file 6) show identities that range from a high of 93% between tick cell expressed paralogs APH_1133 and APH_1134 to a low of 22% between non-expressed APH_1136 and human cell expressed APH_1145. Multiple alignment showed higher identity and similarity between the C termini of paralogs, which contain the functional portion of the proteins.
                  Figure 7

                  Phylogenetic tree showing the relationship, based on amino acid sequence, of eight virB2 paralogs in the Ap genome. Four were transcribed only in ISE6 (APH_1131 - APH_1134), and two only in HL-60 and HMEC-1 (APH_1144 and APH_1145). No transcript from APH_1130 or APH_1136 was measured. The tick cell line associated paralogs are closely related to each other, while those transcribed in the human cell lines form a separate group and are less related to each other. The tree was constructed with PAUP 4.0 using neighbor-joining: absolute variation. Values shown on branches correspond to bootstrap analysis with 2000 replicates.
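Pairwise identities such as the 93% and 22% figures quoted for the virB2 paralogs are computed from aligned sequences. The sketch below shows one common convention, counting identical residues over aligned columns where neither sequence has a gap. The exact convention used for additional file 6 is not stated in this section, so the choice of denominator is an assumption, and the two short peptide strings are invented for illustration (they are not VirB2 sequences).

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identical residues over gap-free columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared

# Invented aligned fragments: 4 identical residues over 6 gap-free columns
print(f"{percent_identity('MKT-LLA', 'MKSALLG'):.1f}% identity")
```

Other conventions divide by the full alignment length or by the shorter sequence length, which gives lower values when gaps are present.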

                  Two apparent tick-cell-specific operons were identified. ORFs between coordinates 1448342 and 1445170, which include locus tags APH_1386 through APH_1382, were transcribed only in the tick cell line (see additional file 7: Tick- and human-specific Ap operons). Locus tag APH_1380 appears to be part of the operon and as such was transcribed in the tick cell line and, at a lower level, in the human cell lines. The functions of the hypothetical proteins of these six ORFs are not known. However, a BLAST homology search produced E values of 9e-18 to 4e-9, indicating the six ORFs are related. The transcription profile around APH_1380 and sequence characteristics just upstream suggest that the ORF actually begins with the methionine at coordinate 1445107. In support of this, there is a ribosomal binding site at coordinate 1445120. This upstream area shows significant amino acid sequence homology with the N-termini of the other ORF members of this putative operon, also suggesting the sequence is part of that ORF. Between coordinates 1445252 and 1445797 an unannotated ORF appears to be transcribed only in the tick cell line, and also shows significant homology to the other ORFs in this putative operon. If this is a true ORF, and the start of APH_1380 is extended to coordinate 1445107, the two putative ORFs APH_1381 and APH_1382 on the positive DNA strand may not be true ORFs, since they are situated opposite coding sequences in the operon and showed no transcription signal (see additional file 7, panel A). The other apparent tick-specific operon includes locus tags APH_0726 through APH_0720 (see additional file 7, panel B). All but the small locus tags APH_0721 and APH_0722 were transcribed. Although these genes are also annotated as encoding hypothetical proteins, searches using the SignalP [39] and TMHMM [40] prediction servers indicated they all have transmembrane domains.

                  There was also a group of Ap genes transcribed only in the human cells: APH_0837, APH_0838, APH_0839, and APH_0842 (see additional file 7, panel C). All encode hypothetical proteins and all are related by amino acid sequence, especially APH_0838, APH_0839, and APH_0842.

                  qRT-PCR

                  Relative transcript levels for the five selected Ap genes, within and between cell lines, confirm those indicated by the array data (Figure 8: Tiling vs. qRT-PCR graphs).
                  Figure 8

                  Tiling array (area under gene curve) vs. qRT-PCR (40 minus threshold cycle) measurements of transcript levels of five Ap genes (key to bars indicated) during growth in HL-60, HMEC-1, and ISE6 cells. Relative transcript levels for the five selected Ap genes, within and between cell lines, confirm those indicated by the array data. qRT-PCR data were converted by subtracting the Ct (threshold cycle) from forty (total PCR cycles), since lower threshold cycles correspond to higher transcript levels.
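The conversion described in the Figure 8 legend is a single subtraction, sketched below so that the direction of the scale is explicit. The function name and the example Ct values are hypothetical; only the rule (forty total cycles minus Ct) comes from the legend.

```python
TOTAL_CYCLES = 40  # total PCR cycles, as stated in the figure legend

def ct_to_relative_level(ct, total_cycles=TOTAL_CYCLES):
    """Convert a threshold cycle (Ct) to '40 minus Ct', so that a larger
    value corresponds to more transcript."""
    if not 0 <= ct <= total_cycles:
        raise ValueError("Ct must lie between 0 and the total cycle count")
    return total_cycles - ct

# Hypothetical Ct values for one gene measured in the three cell lines
for cell_line, ct in [("HL-60", 22.5), ("HMEC-1", 25.0), ("ISE6", 31.0)]:
    print(cell_line, ct_to_relative_level(ct))
```

The converted values are on a log scale: assuming ideal amplification efficiency, a difference of one unit corresponds to roughly a two-fold difference in starting template.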

                  Discussion

                  Total RNA from Ap-infected human and tick cells was used to establish host cell specific Ap transcription profiles by hybridization to complementary oligonucleotides representing the entire genome of Ap on tiling arrays. The high percentages of genes measured as transcribed (69.6% in HL-60, 43.9% in HMEC-1, and 69.0% in ISE6), and the low levels of hybridization produced by the uninfected control samples, demonstrate that the method and array design produced sensitive, consistent, and specific transcription measurements. This is encouraging since efforts to fractionate or amplify RNA samples inevitably skew results. However, the culture samples analyzed were heavily infected and therefore optimal for such a direct approach. The three cell lines (HL-60 human promyelocytic, HMEC-1 human microvascular endothelial, and ISE6 tick) each produced bacteria with distinct transcription profiles, suggesting that Ap gene expression is closely dependent on the phenotype and genotype (species origin) of its host cell. The bacteria assayed were not synchronized; they were the result of 1:50 inoculations, and therefore the transcription profiles generated were an average, perhaps with a "late stage" bias, of the infection process in each cell line.

                  Transcription profiles of the two human cell lines appeared similar. However, with better and more consistent biotin labeling, the percentage of Ap ORFs transcribed in HMEC-1 (43.9%) is predicted to be closer to that seen in HL-60 and ISE6 (~70%), and differences in transcription profiles between Ap-HL-60 and Ap-HMEC-1 would be magnified to reveal additional essential characteristics of Ap transcription in human promyelocytic versus endothelial cells. Transcription differences between the human and tick cells were extensive; there were many genes and apparent operons transcribed in the tick cells but not in the human cells, and vice versa. The fact that the vast majority of tick cell specific transcripts are for hypothetical genes is tantalizing, and likely reflects our ignorance of the molecular patho-physiology of ticks and their associated bacteria.

                  The observation that some Ap genes and operons remained inactive in all three cell lines indicates either that there are genetic capabilities not called for by these in vitro infection conditions (the particular intracellular environments of each cell line and the laboratory growth conditions) or that this method failed to measure the transcription of those genes. Genes and operons that were truly silent may, among other possibilities, encode products specific to earlier stages of infection, to colonization of ticks following blood-meal uptake, or to parasitism of different hosts. Given the distinct transcription profiles produced between the human and tick cells, and the diversity of animal hosts and cell types infected within each, all are possible explanations.

                  The virB2 paralogs of the type IV secretion system (T4SS) identified as differentially transcribed (6 of 8) between the human and tick cells (APH_1144 and APH_1145, and APH_1131 - APH_1134, respectively) represent host cell specific usage of type IV secretion system components. VirB2 is the major protein that makes up the T4SS pilus, and has been shown to be necessary for full virulence in Brucella abortus [41]. In Ap, seven of the eight virB2 paralogs are annotated as TrbC/VirB2 (pfam04956) family members in the Entrez Protein entry for each individual protein. APH_1145, although not annotated as virB2, shares homology with and is located next to the other seven. Several other bacteria within the family Anaplasmataceae also possess multiple paralogs of virB2, which is unusual, as the majority of bacteria with type IV secretion systems have only one or two virB2 genes. A BLAST search done with APH_1133 shows, for example, that Anaplasma marginale, as well as Ehrlichia and Wolbachia species, also have multiple loci annotated as TrbC/VirB2 family members (see additional file 8: Examples of other Anaplasmataceae bacteria with multiple virB2 loci). These bacteria might also express specific virB2 paralogs in a host cell dependent manner.

                  The absence of p44 transcription in ISE6 at the p44 expression locus, and clear transcription in HL-60 and HMEC-1, is consistent with the observation that the tick cell samples produced little or no hybridization to p44 HVRs, while the human samples did, and indicates that in ISE6 little if any transcript was generated from any of the 22 full-length p44 genes. The lack of ApxR transcript in the tick cells is consistent with the findings of Wang et al., who performed quantitative reverse transcription PCR on Ap-infected ISE6 cells and tick salivary glands and found that ApxR is not transcribed [34]. It was suggested that ApxR generally regulates transcription in mammalian host cells and specifically regulates p44 transcription by binding to the tr1 promoter. The strong transcription of tr1 in the tick cells in this study may be due to a lack of suppression by ApxR, which is not transcribed in the tick cells. The function of tr1, therefore, is unclear.

                  The apparent over-representation of transcript from conserved p44 sequences, along with its reflecting behavior in the anti-sense strand, is unexpected. It may be the result of transcriptional "read-through" followed by the formation of stable, double stranded, conserved sequence RNA. Bacteria are known to have poor control over transcription termination, and transcription of anti-sense sequence has been identified in Mycoplasma genitalium [42]. Since p44 paralogs are scattered throughout the genome on both DNA strands, any adjacent gene transcription that continues into sense or anti-sense p44 sequences will create "false transcripts," the conserved sequences of which are complementary. Conserved anti-sense false transcript may anneal to conserved sense "true" and false transcript to form double stranded conserved sequence RNA, which is relatively stable compared to single stranded RNA and thus would accumulate in the bacteria (Figure 9: Diagram of possible mechanism to explain the over-representation of p44 conserved sequence transcripts and their anti-sense counterparts). Sense and anti-sense p44 false transcripts could come from many of the numerous p44 paralogs, but a possible source of anti-sense p44 transcript in the tick cells is read-through from the msp4 gene (see additional file 9: msp4 transcription), which is opposite and just downstream of p44-15b and p44-13, strongly transcribed in the tick cells, not transcribed in the human cells, and has no obvious transcription terminator.
                  Figure 9

                  A proposed model for generation of the observed anomalous p44 conserved sequence transcripts (sense and anti-sense). "Read through" transcription of genes lying just upstream of anti-sense p44 sequence (e.g. "Gene 1" and msp4) may produce anti-sense p44 transcript, which, together with p44 sense transcript, forms double stranded RNA (dsRNA). Because the HVR sequences are not complementary they do not form dsRNA and are therefore degraded. However, the conserved, complementary sequences do form dsRNA so are stabilized, accumulate in the bacteria, and are measured as over-abundant by the arrays.
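
                  The complementarity argument behind this model can be illustrated with a toy calculation. The sketch below is not part of the original analysis: the sequences are invented, the region boundaries are arbitrary, and only Watson-Crick pairing is considered. It assumes a sense transcript from one hypothetical paralog and an anti-sense read-through transcript from another paralog that shares the conserved ends but carries a different HVR.

```python
# Toy model of the Figure 9 hypothesis. All sequences are invented for
# illustration and do not correspond to real p44 paralogs.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Return the reverse complement of an RNA string (Watson-Crick only)."""
    return rna.translate(COMPLEMENT)[::-1]


def paired_fraction(sense, antisense, start, end):
    """Fraction of sense positions in [start, end) that can base-pair.

    The anti-sense strand anneals antiparallel, so its reverse complement is
    what lines up base-for-base with the sense transcript. A position pairs
    when the two aligned bases are identical.
    """
    aligned = reverse_complement(antisense)
    pairs = [s == a for s, a in zip(sense[start:end], aligned[start:end])]
    return sum(pairs) / len(pairs)


# Hypothetical paralogs: shared conserved ends, different hypervariable cores.
CONSERVED_5 = "AUGGCUACGG"
CONSERVED_3 = "GGAUCCGUAA"
HVR_A = "UUUCCCAAAG"
HVR_B = "AAAGGGUUUC"

sense_a = CONSERVED_5 + HVR_A + CONSERVED_3
antisense_b = reverse_complement(CONSERVED_5 + HVR_B + CONSERVED_3)

regions = {"conserved_5": (0, 10), "hvr": (10, 20), "conserved_3": (20, 30)}
pairing = {
    name: paired_fraction(sense_a, antisense_b, start, end)
    for name, (start, end) in regions.items()
}
print(pairing)
```

                  In this toy case the conserved ends pair completely while the HVR does not pair at all, which is the asymmetry the model relies on to explain why conserved sequence signal accumulates and HVR signal does not.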

                  It is possible that the anti-sense transcription noted in some genes, along with the prominent p44 transcription phenomena, functions to regulate gene expression. In prokaryotes, cis- and trans-encoded anti-sense transcripts regulate coding sequence lying directly opposite or elsewhere in the genome, respectively [43]. Although anti-sense mediated expression regulation mechanisms are poorly understood, some possible modes have been discussed and include imprinting through DNA methylation, RNA processing interference, and ribosome interference [44,45]. In the case of p44, anti-sense transcripts may serve to silence leaky expression occurring from any of the 22 identified full-length p44 paralogs [1], which are apparently capable of being expressed independently from the p44 expression locus [46]. Silencing of p44 may be especially important in tick cells and may account for the particular abundance of anomalous p44 conserved sequence transcripts in Ap-ISE6, which showed no p44 HVR transcription. Sense and anti-sense RNA homologous to the conserved ends of the p44 genes may even facilitate the process of non-reciprocal recombination by which p44 paralogs move into and out of the expression locus [47]. If they are not purposeful, it is likely that these gene transcription phenomena are the result of poorly controlled transcription or are artifacts of the tiling arrays. The repeat sequence associated sense and anti-sense "transcripts" do appear to be an artifact of the tiling arrays, as they are consistently seen wherever repeat sequences occur, whether inside or outside of coding sequences. However, the transcription behavior of p44 is unique in the genome, and most genes do not display anti-sense transcription. The over-representation of transcript to conserved p44 sequences and its reflection, and the anti-sense transcription of some genes, are therefore intriguing and merit further investigation.

                  Conclusion

                  Obligate intracellular pathogens like Ap control the cells they parasitize - to prevent immune attacks, divert cellular resources, and prevent host cells from apoptosing. Our understanding of tick genes is poor, so it is not surprising that the up-regulated Ap genes in tick cells are nearly all "hypothetical." Matched with our limited understanding of Ap genes, the tick cell data are particularly difficult to interpret. Conversely, it makes sense that the most differentially active Ap genes in HL-60 cells are better characterized, since human cell lines have mainly been used to study the biology of Ap, and, perhaps, Ap genes evolved to interact in human cells would tend to be related to characterized effectors. It also makes sense that the differentially transcribed Ap genes in HL-60 and ISE6 are over-represented by membrane associated gene products, since survival in such disparate host cells would seem to require substantial specialization at the interface of the organism with its host cell: the bacterial membrane. The fact that a majority of Ap genes have no known function poses the greatest challenge to interpreting these data. However, some things are clear: 1. Genes differentially transcribed between the human and tick cells disproportionately represent surface proteins (~45% compared to ~25% of all proteins) (Table 2). 2. There are genes, paralogs, and operons exclusively transcribed in the tick and the human cells, some of which may encode excellent vaccine candidates. 3. The particular paralogs of the p44 family of membrane proteins (114) expressed in a population of Ap may be identified by the elevated signal produced within the HVR of each as compared to silent paralogs. 4. Whole RNA isolated from Ap infected host cells can be used to reveal details of bacterial gene transcription, including that from anti-sense sequences. 5. Global transcription profiles can likely be generated for Ap in any host cells, and for all aspects of the cell infection cycle - cell binding, entry, growth, and escape - although some enrichment for bacteria or bacterial mRNA may be necessary. Coupling Ap transcription data with that of infected host cells will facilitate the discovery of Ap and host cell gene functions.
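
                  Point 3 above can be sketched as a simple calling rule. This is an illustrative assumption rather than the procedure used in the study: the paralog names and signal values are hypothetical, the median HVR signal is taken as a stand-in for the silent-paralog baseline (reasonable only if most paralogs are silent, as the 2/114 and 3/114 results suggest), and the fold threshold is arbitrary.

```python
from statistics import median


def call_expressed_paralogs(hvr_signal, fold=4.0):
    """Return paralogs whose HVR signal is at least `fold` times the baseline.

    hvr_signal maps a paralog name to its mean HVR probe signal. The median
    across all paralogs serves as the baseline, on the assumption that most
    paralogs are silent. Both that assumption and the default fold threshold
    are illustrative choices, not values taken from the paper.
    """
    baseline = median(hvr_signal.values())
    if baseline <= 0:
        raise ValueError("baseline signal must be positive")
    return sorted(
        name for name, signal in hvr_signal.items() if signal >= fold * baseline
    )


# Hypothetical mean HVR signals for five paralogs in one sample.
example = {
    "paralog_A": 100.0,
    "paralog_B": 110.0,
    "paralog_C": 900.0,
    "paralog_D": 95.0,
    "paralog_E": 105.0,
}
expressed = call_expressed_paralogs(example)
print(expressed)
```

                  A robust baseline such as the median is used here because a few strongly expressed paralogs would inflate a mean-based background and hide themselves.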

                  Having transcription data for all of an organism's DNA sequence allows a line graph display for both DNA strands parallel to an annotated map of the genome. This way one can readily see transcriptional behavior that may be less accessible through other analysis tools. For example, anti-sense transcription, and the variation in transcription profiles of genes - sloped, flat, horned, and reflected - may lead to important insights into gene regulation in Ap, as well as in other intracellular organisms that subvert host cell processes for their own benefit.
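
                  The strand-wise display described above can be sketched as a small data-handling step. The code below is a minimal illustration under stated assumptions: the probe tuples, coordinates, and signal values are hypothetical, and a "reflected" profile is defined here, for illustration only, as a strong positive correlation between sense and anti-sense signal over the same coordinates. The resulting tracks could be passed to any plotting library to draw the two line graphs.

```python
def strand_tracks(probes):
    """Split (position, strand, signal) probes into per-strand tracks.

    Returns {"+": [(position, signal), ...], "-": [...]}, each sorted by
    genome position so it can be drawn as a line graph.
    """
    tracks = {"+": [], "-": []}
    for position, strand, signal in probes:
        tracks[strand].append((position, signal))
    for points in tracks.values():
        points.sort()
    return tracks


def gene_profile(track, start, end):
    """Signals of the probes that fall within a gene's coordinates."""
    return [signal for position, signal in track if start <= position <= end]


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either profile is flat."""
    n = min(len(x), len(y))
    x, y = x[:n], y[:n]
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def is_reflected(sense, antisense, min_r=0.8):
    """Illustrative definition: anti-sense signal tracks the sense signal."""
    return pearson(sense, antisense) >= min_r


# Hypothetical probes over one gene at positions 100-140.
probes = [(100 + 10 * i, "+", s) for i, s in enumerate([1, 5, 9, 5, 1])]
probes += [(100 + 10 * i, "-", s) for i, s in enumerate([2, 6, 10, 6, 2])]
tracks = strand_tracks(probes)
sense = gene_profile(tracks["+"], 100, 140)
antisense = gene_profile(tracks["-"], 100, 140)
print(is_reflected(sense, antisense))
```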

                  Declarations

                  Acknowledgements

                  We thank: Wayne Xu (University of Minnesota, Super Computing Institute, University budget funded), for helping with bioinformatics issues and data interpretation; Arkady Khodursky (Biochemistry, Molecular Biology, and Biophysics, University of Minnesota), for his advice on data normalization and interpretation; and Gerald Baldridge (NIH grant Nr. AIO42792) for his help with editing the manuscript. The work presented herein was funded by a grant from the National Research Fund for Tick-Borne Diseases, Inc. to TJK, and a grant from NIH, Nr. AIO42792, to UGM.

                  Authors’ Affiliations

                  (1) Department of Entomology, University of Minnesota

                  References

                  1. Dunning-Hotopp JC, Lin M, Madupu R, Crabtree J, Angiuoli SV, Eisen J, Seshadri R, Ren Q, Wu M, Utterback TR, Smith S, Lewis M, Khouri H, Zhang C, Niu H, Lin Q, Ohashi N, Zhi N, Nelson W, Brinkac LM, Dodson RJ, Rosovitz MJ, Sundaram J, Daugherty SC, Davidsen T, Durkin AS, Gwinn M, Haft DH, Selengut JD, Sullivan SA, Zafar N, Zhou L, Benahmed F, Forberger H, Halpin R, Mulligan S, Robinson J, White O, Rikihisa Y, Tettelin H: Comparative genomics of emerging human ehrlichiosis agents. PLoS Genet 2006, 2(2):e21.
                  2. Greig B, Asanovich KM, Armstrong PJ, Dumler JS: Geographic, clinical, serologic, and molecular evidence of granulocytic ehrlichiosis, a likely zoonotic disease, in Minnesota and Wisconsin dogs. J Clin Microbiol 1996, 34(1):44–48.
                  3. Walls JJ, Greig B, Neitzel DF, Dumler JS: Natural infection of small mammal species in Minnesota with the agent of human granulocytic ehrlichiosis. J Clin Microbiol 1997, 35(4):853–855.
                  4. Bullock PM, Ames TR, Robinson RA, Greig B, Mellencamp MA, Dumler JS: Ehrlichia equi infection of horses from Minnesota and Wisconsin: detection of seroconversion and acute disease investigation. J Vet Intern Med 2000, 14(3):252–257. http://dx.doi.org/10.1892/0891-6640(2000)014<0252:EIOHFM>2.3.CO;2
                  5. Bayard-McNeeley M, Bansal A, Chowdhury I, Girao G, Small CB, Seiter K, Nelson J, Liveris D, Schwartz I, McNeeley DF, Wormser GP, Aguero-Rosenfeld ME: In vivo and in vitro studies on Anaplasma phagocytophilum infection of the myeloid cells of a patient with chronic myelogenous leukaemia and human granulocytic ehrlichiosis. J Clin Pathol 2004, 57(5):499–503.
                  6. Klein MB, Nelson CM, Goodman JL: Antibiotic susceptibility of the newly cultivated agent of human granulocytic ehrlichiosis: promising activity of quinolones and rifamycins. Antimicrob Agents Chemother 1997, 41(1):76–79.
                  7. Madigan JE, Gribble D: Equine ehrlichiosis in northern California: 49 cases (1968–1981). J Am Vet Med Assoc 1987, 190(4):445–448.
                  8. Herron MJ, Ericson ME, Kurtti TJ, Munderloh UG: The Interactions of Anaplasma phagocytophilum, Endothelial Cells, and Human Neutrophils. Ann N Y Acad Sci 2005, 1063:374–382.
                  9. Klein MB, Miller JS, Nelson CM, Goodman JL: Primary bone marrow progenitors of both granulocytic and monocytic lineages are susceptible to infection with the agent of human granulocytic ehrlichiosis. J Infect Dis 1997, 176(5):1405–1409.
                  10. Felek S, Telford S 3rd, Falco RC, Rikihisa Y: Sequence analysis of p44 homologs expressed by Anaplasma phagocytophilum in infected ticks feeding on naive hosts and in mice infected by tick attachment. Infect Immun 2004, 72(2):659–666.
                  11. Holman MS, Caporale DA, Goldberg J, Lacombe E, Lubelczyk C, Rand PW, Smith RP: Anaplasma phagocytophilum, Babesia microti, and Borrelia burgdorferi in Ixodes scapularis, southern coastal Maine. Emerg Infect Dis 2004, 10(4):744–746.
                  12. Sukumaran B, Narasimhan S, Anderson JF, Deponte K, Marcantonio N, Krishnan MN, Fish D, Telford SR, Kantor FS, Fikrig E: An Ixodes scapularis protein required for survival of Anaplasma phagocytophilum in tick salivary glands. J Exp Med 2006.
                  13. Munderloh UG, Jauron SD, Fingerle V, Leitritz L, Hayes SF, Hautman JM, Nelson CM, Huberty BW, Kurtti TJ, Ahlstrand GG, Greig B, Mellencamp MA, Goodman JL: Invasion and intracellular development of the human granulocytic ehrlichiosis agent in tick cell culture. J Clin Microbiol 1999, 37(8):2518–2524.
                  14. Ades EW, Candal FJ, Swerlick RA, George VG, Summers S, Bosse DC, Lawley TJ: HMEC-1: establishment of an immortalized human microvascular endothelial cell line. J Invest Dermatol 1992, 99(6):683–690.
                  15. Felsheim RF, Herron MJ, Nelson CM, Burkhardt NY, Barbet AF, Kurtti TJ, Munderloh UG: Transformation of Anaplasma phagocytophilum. BMC Biotechnol 2006, 6:42.
                  16. Jauron SD, Nelson CM, Fingerle V, Ravyn MD, Goodman JL, Johnson RC, Lobentanzer R, Wilske B, Munderloh UG: Host cell-specific expression of a p44 epitope by the human granulocytic ehrlichiosis agent. J Infect Dis 2001, 184(11):1445–1450.
                  17. Niu H, Rikihisa Y, Yamaguchi M, Ohashi N: Differential expression of VirB9 and VirB6 during the life cycle of Anaplasma phagocytophilum in human leucocytes is associated with differential binding and avoidance of lysosome pathway. Cell Microbiol 2006, 8(3):523–534.
                  18. Ohashi N, Zhi N, Lin Q, Rikihisa Y: Characterization and transcriptional analysis of gene clusters for a type IV secretion machinery in human granulocytic and monocytic ehrlichiosis agents. Infect Immun 2002, 70(4):2128–2138.
                  19. Lee HC, Goodman JL: Anaplasma phagocytophilum causes global induction of antiapoptosis in human neutrophils. Genomics 2006, 88(4):496–503.
                  20. Carlyon JA, Chan WT, Galan J, Roos D, Fikrig E: Repression of rac2 mRNA expression by Anaplasma phagocytophila is essential to the inhibition of superoxide production and bacterial proliferation. J Immunol 2002, 169(12):7009–7018.
                  21. de la Fuente J, Ayoubi P, Blouin EF, Almazan C, Naranjo V, Kocan KM: Gene expression profiling of human promyelocytic cells in response to infection with Anaplasma phagocytophilum. Cell Microbiol 2005, 7(4):549–559.
                  22. Borjesson DL, Kobayashi SD, Whitney AR, Voyich JM, Argue CM, Deleo FR: Insights into pathogen immune evasion mechanisms: Anaplasma phagocytophilum fails to induce an apoptosis differentiation program in human neutrophils. J Immunol 2005, 174(10):6364–6372.
                  23. Pedra JH, Sukumaran B, Carlyon JA, Berliner N, Fikrig E: Modulation of NB4 promyelocytic leukemic cell machinery by Anaplasma phagocytophilum. Genomics 2005, 86(3):365–377.
                  24. Sukumaran B, Carlyon JA, Cai JL, Berliner N, Fikrig E: Early transcriptional response of human neutrophils to Anaplasma phagocytophilum infection. Infect Immun 2005, 73(12):8089–8099.
                  25. Cerrina F, Blattner F, Huang W, Hue Y, Green R, Singh-Gasson S, Sussman M: Biological lithography: development of a maskless microarray synthesizer for DNA chips. Microelectronic Engineering 2002, 61–62:33–40.
                  26. Liu XS: Getting started in tiling microarray analysis. PLoS Comput Biol 2007, 3(10):1842–1844.
                  27. Mockler TC, Chan S, Sundaresan A, Chen H, Jacobsen SE, Ecker JR: Applications of DNA tiling arrays for whole-genome analysis. Genomics 2005, 85(1):1–15.
                  28. Goodman JL, Nelson C, Vitale B, Madigan JE, Dumler JS, Kurtti TJ, Munderloh UG: Direct cultivation of the causative agent of human granulocytic ehrlichiosis. N Engl J Med 1996, 334(4):209–215.
                  29. Munderloh UG, Lynch MJ, Herron MJ, Palmer AT, Kurtti TJ, Nelson RD, Goodman JL: Infection of endothelial cells with Anaplasma marginale and A. phagocytophilum. Vet Microbiol 2004, 101(1):53–64.
                  30. Royce TE, Rozowsky JS, Luscombe NM, Emanuelsson O, Yu H, Zhu X, Snyder M, Gerstein MB: Extrapolating traditional DNA microarray statistics to tiling and protein microarray technologies. Methods Enzymol 2006, 411:282–311.
                  31. Bendtsen JD, Kiemer L, Fausboll A, Brunak S: Non-classical protein secretion in bacteria. BMC Microbiol 2005, 5:58.
                  32. Yu CS, Chen YC, Lu CH, Hwang JK: Prediction of protein subcellular localization. Proteins 2006, 64(3):643–651.
                  33. Sarkar M, Troese MJ, Kearns SA, Yang T, Reneer DV, Carlyon JA: Anaplasma phagocytophilum MSP2 (P44)-18 predominates and is modified into multiple isoforms in human myeloid cells. Infect Immun 2008.
                  34. Wang X, Cheng Z, Zhang C, Kikuchi T, Rikihisa Y: Anaplasma phagocytophilum p44 mRNA expression is differentially regulated in mammalian and tick host cells: involvement of the DNA binding protein ApxR. J Bacteriol 2007, 189(23):8651–8659.
                  35. Park J, Kim KJ, Choi KS, Grab DJ, Dumler JS: Anaplasma phagocytophilum AnkA binds to granulocyte DNA and nuclear proteins. Cell Microbiol 2004, 6(8):743–751.
                  36. Caturegli P, Asanovich KM, Walls JJ, Bakken JS, Madigan JE, Popov VL, Dumler JS: ankA: an Ehrlichia phagocytophila group gene encoding a cytoplasmic protein antigen with ankyrin repeats. Infect Immun 2000, 68(9):5277–5283.
                  37. Ijdo JW, Carlson AC, Kennedy EL: Anaplasma phagocytophilum AnkA is tyrosine-phosphorylated at EPIYA motifs and recruits SHP-1 during early infection. Cell Microbiol 2007, 9(5):1284–1296.
                  38. Lin M, den Dulk-Ras A, Hooykaas PJ, Rikihisa Y: Anaplasma phagocytophilum AnkA secreted by type IV secretion system is tyrosine phosphorylated by Abl-1 to facilitate infection. Cell Microbiol 2007, 9(11):2644–2657.
                  39. Bendtsen JD, Nielsen H, von Heijne G, Brunak S: Improved prediction of signal peptides: SignalP 3.0. J Mol Biol 2004, 340(4):783–795.
                  40. Sonnhammer EL, von Heijne G, Krogh A: A hidden Markov model for predicting transmembrane helices in protein sequences. Proc Int Conf Intell Syst Mol Biol 1998, 6:175–182.
                  41. den Hartigh AB, Sun YH, Sondervan D, Heuvelmans N, Reinders MO, Ficht TA, Tsolis RM: Differential requirements for VirB1 and VirB2 during Brucella abortus infection. Infect Immun 2004, 72(9):5143–5149.
                  42. Lluch-Senar M, Vallmitjana M, Querol E, Pinol J: A new promoterless reporter vector reveals antisense transcription in Mycoplasma genitalium. Microbiology 2007, 153(Pt 8):2743–2752.
                  43. Brantl S: Regulatory mechanisms employed by cis-encoded antisense RNAs. Curr Opin Microbiol 2007, 10(2):102–109.
                  44. Lapidot M, Pilpel Y: Genome-wide natural antisense transcription: coupling its regulation to its different regulatory mechanisms. EMBO Rep 2006, 7(12):1216–1222.
                  45. Timmons JA, Good L: Does everything now make (anti)sense? Biochem Soc Trans 2006, 34(Pt 6):1148–1150.
                  46. Zhi N, Ohashi N, Rikihisa Y: Multiple p44 genes encoding major outer membrane proteins are expressed in the human granulocytic ehrlichiosis agent. J Biol Chem 1999, 274(25):17828–17836.
                  47. Lin Q, Zhang C, Rikihisa Y: Analysis of involvement of the RecF pathway in p44 recombination in Anaplasma phagocytophilum and in Escherichia coli by using a plasmid carrying the p44 expression and p44 donor loci. Infect Immun 2006, 74(4):2052–2062.

                  Copyright

                  © Nelson et al. 2008

                  This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
