Bacterial strains and culture conditions
Escherichia coli MG1655 (ATCC: 700926) overnight cultures were inoculated into fresh LB medium at 1:50 and grown at 37 °C with shaking (150 rpm). Upon reaching the exponential growth phase, the culture was centrifuged at 3000 × g for 10 min. The medium was removed and the pellet was resuspended in PBS to a concentration of 10⁷ cells per μL. The cells were stored on ice and total RNA extraction was performed immediately.
Trizol (Thermo Fisher Scientific, Cat. # 15596018) RNA extraction was performed following the manufacturer’s protocol. Briefly, 10⁸ cells were added to 750 μL Trizol, mixed, and then combined with 150 μL chloroform. After centrifugation, the clear aqueous layer was recovered and precipitated with 375 μL of isopropanol and 0.67 μL of GlycoBlue (Thermo Fisher Scientific, Cat. # AM9515). The pellet was washed twice with 75% ethanol and, after the final centrifugation, resuspended in RNase-free water.
100 ng of total RNA in 2 μL was combined with 3 μL poly-A mix, consisting of 1 μL 5x first strand buffer [250 mM Tris-HCl (pH 8.3), 375 mM KCl, 15 mM MgCl2, supplied with Superscript II reverse transcriptase, Invitrogen Cat. # 18064–014], 1 μL blocking primer mix (see Primers), 0.8 μL nuclease-free water, 0.1 μL 10 mM ATP, and 0.1 μL E. coli poly-A polymerase (New England Biolabs, Cat. # M0276S). The mixture was incubated at 37 °C for 10 min. In the control group, no blocking primers were added and 1.8 μL of nuclease-free water was added instead. For EMBR-seq with either unmodified or phosphorylated 3′-end blocking primers, the blocking primer mix was prepared by mixing equal volumes of 50 μM blocking primers specific to 5S, 16S and 23S rRNA. For EMBR-seq with hotspot blocking primers, the blocking primer mix was prepared by mixing equal volumes of 100 μM 3′-end blocking primers and 100 μM hotspot blocking primers, such that the final mixture contained 50 μM 3′-end primers (3 primers mixed) and 50 μM hotspot primers (6 primers mixed).
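The blocking primer mix concentrations follow from simple dilution arithmetic, which can be checked with a minimal sketch. The function name `pool_concentrations` and the 1 μL volumes are illustrative assumptions (only the equal-volume ratio comes from the text), and the 100 μM figures are read here as the total primer concentration of each pre-mixed pool.

```python
def pool_concentrations(stocks):
    """Final concentration (uM) of each component after pooling.

    stocks maps a component name to (stock concentration in uM, volume in uL).
    """
    total_volume = sum(volume for _, volume in stocks.values())
    return {
        name: conc * volume / total_volume
        for name, (conc, volume) in stocks.items()
    }

# 3'-end blocking primer mix: equal volumes of three 50 uM primers.
# The 1 uL volumes are arbitrary; only the 1:1:1 ratio matters.
three_prime_mix = pool_concentrations(
    {"5S": (50, 1), "16S": (50, 1), "23S": (50, 1)}
)
total_three_prime = sum(three_prime_mix.values())  # total primer concentration

# Hotspot mix: equal volumes of a 100 uM 3'-end pool and a 100 uM hotspot pool.
hotspot_mix = pool_concentrations(
    {"three_prime_pool": (100, 1), "hotspot_pool": (100, 1)}
)
```

Under these assumptions the total primer concentration in the 3′-end mix stays at 50 μM, and each pool in the hotspot mix is diluted to 50 μM, matching the values stated above.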
The polyadenylation product was mixed with 0.5 μL 10 mM dNTPs (New England Biolabs, Cat. # N0447L), 1 μL reverse transcription primers (25 ng/μL, see Primers), and 1.3 μL blocking primer mix, heated to 65 °C for 5 min and 58 °C for 1 min, and then quenched on ice. In the control samples, the blocking primers were again replaced with nuclease-free water. Next, 3.2 μL RT mix, consisting of 1.2 μL 5x first strand buffer, 1 μL 0.1 M DTT, 0.5 μL RNaseOUT (Thermo Fisher Scientific, Cat. # 10777019), and 0.5 μL Superscript II reverse transcriptase, was added to the solution, followed by 1 h incubation at 42 °C. The temperature was then raised to 70 °C for 10 min to heat-inactivate Superscript II.
Second strand synthesis
49 μL of the second strand mix, containing 33.5 μL water, 12 μL 5x second strand buffer [100 mM Tris-HCl (pH 6.9), 23 mM MgCl2, 450 mM KCl, 0.75 mM β-NAD, 50 mM (NH4)2SO4, Invitrogen, Cat. # 10812–014], 1.2 μL 10 mM dNTPs, 0.4 μL E. coli ligase (Invitrogen, Cat. # 18052–019), 1.5 μL DNA polymerase I (Invitrogen, Cat. # 18010–025), and 0.4 μL RNase H (Invitrogen, Cat. # 18021–071), was added to the product from the previous step. The mixture was incubated at 16 °C for 2 h. cDNA was purified with 1x AMPure XP DNA beads (Beckman Coulter, Cat. # A63881) and eluted in 24 μL nuclease-free water that was subsequently concentrated to 6.4 μL.
In vitro transcription
The concentrated solution was mixed with 9.6 μL of Ambion in vitro transcription mix (1.6 μL of each ribonucleotide, 1.6 μL 10x T7 reaction buffer, 1.6 μL T7 enzyme mix, MEGAscript T7 Transcription Kit, Thermo Fisher Scientific, Cat. # AMB13345) and incubated at 37 °C for 13 h. Next, the aRNA was treated with 6 μL EXO-SAP (ExoSAP-IT™ PCR Product Cleanup Reagent, Thermo Fisher Scientific, Cat. # 78200.200.UL) at 37 °C for 15 min, followed by fragmentation with 5.5 μL fragmentation buffer [200 mM Tris-acetate (pH 8.1), 500 mM KOAc, 150 mM MgOAc] at 94 °C for 3 min. The reaction was then quenched with 2.75 μL stop buffer (0.5 M EDTA) on ice. The fragmented aRNA was size selected with 0.8x AMPure RNA beads (RNAClean XP Kit, Beckman Coulter, Cat. # A63987) and eluted in 15 μL nuclease-free water. Thereafter, Illumina libraries were prepared as described previously.
EMBR-seq with TEX digestion
To test the Terminator™ 5′-phosphate-dependent exonuclease (TEX; Lucigen, Cat. # TER5120), 100 ng of total RNA in 2 μL was combined with 18 μL TEX mix, consisting of 14.5 μL nuclease-free water, 2 μL Terminator 10x buffer A, 0.5 μL RNaseOUT, and 1 μL TEX. The solution was incubated at 30 °C for 1 h and quenched with 1 μL of 100 mM EDTA. The product was purified with 1x AMPure RNA beads, eluted in 10 μL nuclease-free water, and concentrated to 2 μL. This TEX-digested total RNA was then used as starting RNA in the EMBR-seq protocol described above.
EMBR-seq bioinformatic analysis
Paired-end sequencing of the EMBR-seq libraries was performed on an Illumina NextSeq 500. All sequencing data have been deposited in the Gene Expression Omnibus under the accession number GSE149666. In the sequencing libraries, the left mate contains the sample barcode (see Primers), and the right mate is mapped to the bacterial transcriptome. Prior to mapping, only reads containing valid sample barcodes were retained. Subsequently, the reads were mapped to the reference transcriptome (E. coli K12 substr. MG1655 cds ASM584v2) using the Burrows-Wheeler Aligner (BWA) with default parameters.
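The barcode filtering step can be illustrated with a minimal sketch. It is not the pipeline used in this study: the function name `demultiplex`, the barcode offset within the left mate (`barcode_start=0`), and the example barcodes and reads are all hypothetical placeholders. Only the 6-nucleotide barcode length and the left-mate/right-mate roles come from the text.

```python
from collections import defaultdict

def demultiplex(read_pairs, valid_barcodes, barcode_start=0, barcode_len=6):
    """Keep right mates whose left mate carries a valid sample barcode.

    read_pairs: iterable of (left_mate_sequence, right_mate_sequence).
    barcode_start is an assumed offset; adjust it to the actual primer design.
    Returns a dict mapping each observed valid barcode to its right mates.
    """
    valid = set(valid_barcodes)
    by_sample = defaultdict(list)
    for left, right in read_pairs:
        barcode = left[barcode_start:barcode_start + barcode_len]
        if barcode in valid:  # exact match only; no mismatch tolerance
            by_sample[barcode].append(right)
    return dict(by_sample)

# Hypothetical barcodes and reads, for illustration only.
example_barcodes = ["AGTGTC", "TCAGGA"]
pairs = [
    ("AGTGTCTTTT", "GGCATTACGA"),
    ("CCCCCCTTTT", "TTGACCGTAA"),  # barcode not in the valid set: dropped
    ("TCAGGATTTT", "ACGTTGCAAT"),
]
kept = demultiplex(pairs, example_barcodes)
```

The retained right mates for each sample would then be written to FASTQ and passed to BWA for mapping.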
Analysis of detection bias in EMBR-seq
E. coli operons were downloaded from RegulonDB. Operons with at least 2 genes were included in this analysis. The data from EMBR-seq libraries with 100 ng starting material were mapped to the E. coli K12 substr. MG1655 reference genome (ASM584v2). For each read that maps within an operon, the distance of the mapped location from the 3′ end of the operon was calculated, accounting for the read length. Next, the operons were discretized into 50 bins, and all operons with more than 200 unique reads were considered for downstream analysis. The number of reads in each bin was then normalized by the total number of reads in each operon, and the average of the relative reads within each bin was calculated. To compare bacterial data from EMBR-seq to mammalian data from CEL-seq, we downloaded CEL-seq data reported in Grün et al. (GEO Accession: GSM1322290) and performed a similar analysis for the mouse genes.
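The binning procedure described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' code: the function name `binned_profile`, the input layout, and the convention of measuring distance from the read base closest to the operon 3′ end are assumptions, and reads are taken to be already deduplicated.

```python
def binned_profile(operons, reads, n_bins=50, min_reads=200):
    """Average relative read density along operons, measured from the 3' end.

    operons: dict name -> (start, end, strand), 1-based inclusive coordinates.
    reads:   dict name -> list of (leftmost_position, read_length) for unique
             reads mapped within that operon.
    Bin 0 is adjacent to the operon 3' end. Returns a list of n_bins values.
    """
    profiles = []
    for name, (start, end, strand) in operons.items():
        operon_reads = reads.get(name, [])
        if len(operon_reads) <= min_reads:  # keep operons with > min_reads reads
            continue
        length = end - start + 1
        counts = [0] * n_bins
        for pos, read_len in operon_reads:
            if strand == "+":
                # 3' end is at `end`; the closest read base is pos + read_len - 1
                dist = end - (pos + read_len - 1)
            else:
                # 3' end is at `start`; the closest read base is pos
                dist = pos - start
            dist = min(max(dist, 0), length - 1)  # clamp reads overhanging the ends
            counts[dist * n_bins // length] += 1
        total = sum(counts)
        profiles.append([c / total for c in counts])  # normalize per operon
    if not profiles:
        return []
    # Average the relative read counts across operons, bin by bin.
    return [sum(p[b] for p in profiles) / len(profiles) for b in range(n_bins)]

# Toy data (hypothetical coordinates), with small thresholds for illustration.
toy_operons = {"opA": (1, 100, "+"), "opB": (201, 300, "-")}
toy_reads = {
    "opA": [(91, 10), (1, 10)],
    "opB": [(201, 10), (205, 10), (291, 10), (250, 10)],
}
toy_profile = binned_profile(toy_operons, toy_reads, n_bins=10, min_reads=1)
```

Normalizing within each operon before averaging gives every operon equal weight, so highly expressed operons do not dominate the profile.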
Sequence conservation of 16S and 23S rRNA
16S rRNA sequences from 4000 species were obtained from rrnDB, while 23S rRNA sequences from 119 species were selected from NCBI RefSeq. Next, the last 100 bases from the 3′ end of each sequence were aligned using Clustal Omega. Shannon entropy for each aligned base location was then calculated such that the maximal entropy value was 1. Five possibilities were allowed: “A”, “T”, “C”, “G”, and “-”.
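A minimal sketch of the per-position entropy calculation is given below. It assumes the normalization to a maximum of 1 is obtained by taking logarithms in base 5 (one base unit per allowed symbol), which is a natural reading of the description but is not stated explicitly. The function names and the toy alignment are hypothetical.

```python
import math
from collections import Counter

ALPHABET = "ATCG-"  # the five allowed symbols, including the gap character

def column_entropy(column):
    """Shannon entropy of one alignment column, using log base 5.

    Returns 0 for a fully conserved column and 1 when all five symbols
    are equally frequent. Assumes the column contains only ALPHABET symbols.
    """
    counts = Counter(column)
    n = len(column)
    entropy = 0.0
    for symbol in ALPHABET:
        count = counts.get(symbol, 0)
        if count:
            p = count / n
            entropy -= p * math.log(p, len(ALPHABET))
    return entropy

def alignment_entropy(aligned_seqs):
    """Entropy at every position of equal-length aligned sequences."""
    return [column_entropy(col) for col in zip(*aligned_seqs)]

# Toy alignment: the first column is conserved, the last is maximally variable.
toy_alignment = ["ACGT-", "ACGTA", "ACCTT", "ATCTC", "AGCTG"]
toy_entropy = alignment_entropy(toy_alignment)
```

Low values mark conserved positions, which are the ones most relevant when judging whether a blocking primer would anneal across species.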
Primers
Reverse transcription primers are shown below with the 6-nucleotide sample barcodes underlined:
The following five barcodes were used in this study:
In the case of the 3′ phosphorylated primers, all blocking primers have a 3′ phosphorylation modification.
Hotspot blocking primers:
16S primer for hotspot at position 107:
16S primer for hotspot at position 682:
16S primer for hotspot at position 1241:
23S primer for hotspot at position 375:
23S primer for hotspot at position 1421:
23S primer for hotspot at position 1641:
Each primer is designed to anneal approximately 100 bp downstream of the hotspot. The exact position and length of each primer were adjusted to ensure the Tm was above 65 °C.