Open Access

Computational prediction and validation of C/D, H/ACA and Eh_U3 snoRNAs of Entamoeba histolytica

  • Devinder Kaur1,
  • Abhishek Kumar Gupta1,
  • Vandana Kumari1,
  • Rahul Sharma1,
  • Alok Bhattacharya2 and
  • Sudha Bhattacharya1
BMC Genomics 2012, 13:390

DOI: 10.1186/1471-2164-13-390

Received: 16 March 2012

Accepted: 25 July 2012

Published: 14 August 2012

Abstract

Background

Small nucleolar RNAs (snoRNAs) are a highly conserved group of small RNAs found in eukaryotic cells. Genes encoding these RNAs are located at diverse positions throughout the genome. They are functionally conserved, performing post-transcriptional modification (methylation and pseudouridylation) of rRNA and other nuclear RNAs. They belong to two major categories: C/D box and H/ACA box snoRNAs. U3 snoRNA is an exceptional member of the C/D box snoRNAs and is involved in early processing of pre-rRNA. Each snoRNA contains an antisense sequence which guides the modification or processing of the target RNA. However, some snoRNAs lack this sequence and are often called orphan snoRNAs.

Results

We searched the genome sequence of Entamoeba histolytica for snoRNAs using computational programmes (snoscan and snoSeeker) and obtained 99 snoRNAs (C/D and H/ACA box snoRNAs) along with 5 copies of Eh_U3 snoRNA. These are located at diverse positions in the genome, mostly in intergenic regions, while some are found in ORFs of protein-coding genes, introns and UTRs. The computationally predicted snoRNAs were validated by RT-PCR and northern blotting. The expected sizes were in agreement with the observed sizes for all C/D box snoRNAs tested, while for some of the H/ACA box snoRNAs there was indication of processing to generate shorter products.

Conclusion

Our results showed the presence of snoRNAs in E. histolytica, an early branching eukaryote, and the structural features of E. histolytica snoRNAs were well conserved when compared with yeast and human snoRNAs. This study will help in understanding the evolution of these conserved RNAs in diverse phylogenetic groups.

Keywords

U3 snoRNA; Guide/orphan snoRNAs; Entamoeba histolytica

Background

Small nucleolar RNAs (snoRNAs) are a special class of small non-coding RNAs localized to the nucleolus. They belong to two major categories, box C/D and box H/ACA snoRNAs, based on the presence of short consensus sequence motifs[1]. H/ACA box snoRNAs guide pseudouridylation, while C/D box snoRNAs guide site-specific 2'-O-ribose methylation, during post-transcriptional modification of pre-rRNA[2-4]. Such modification is accomplished by complementary base pairing between specific regions of the snoRNA and the target RNA within the small nucleolar ribonucleoprotein complex, which guides the modification of the target RNA. Some snoRNAs are also known to perform functions other than the modification of ribosomal RNAs, e.g. U3, U17, U8, U14, and U22. The U3 snoRNA is an exceptional member of the box C/D class, and is involved in early pre-rRNA cleavage in the 5' external transcribed spacer (ETS) in yeast cells[5], mouse extracts[6], and Xenopus oocyte extracts[7]. Depletion of this snoRNA impairs the formation of mature 18S rRNA[3]. Other exceptions include the C/D snoRNAs U8[8] and U22[9] and the H/ACA snoRNA U17/snR30[10], which are required for pre-rRNA cleavage. They are not involved in modification of rRNA or other nuclear RNAs. Some snoRNAs are involved in both pre-rRNA cleavage and modification, e.g. U14 (C/D)[11] and snR10 (H/ACA)[12]. Several snoRNAs lack any known target site, and are called orphan snoRNAs. These snoRNAs might have undiscovered functions, which may or may not concern rRNAs. Evidence for this is the role of the orphan C/D box snoRNA SNORD115 in the regulation of alternative splicing[13].

Structural motifs are among the important distinguishing features of snoRNAs. The characteristic structural motifs in C/D box snoRNAs are RUGAUGA for the C box and CUGA for the D box. In H/ACA box snoRNAs the H box is ANANNA and the ACA box is ACA, arranged in a hairpin-hinge-hairpin-tail structure[14, 15]. C/D box snoRNAs are about 60–100 bases in size, while H/ACA snoRNAs are 120–160 bases. Vertebrate snoRNAs are typically encoded in introns of protein-coding genes[16], while in plants they are transcribed as polycistronic transcripts[17]. In yeast most of them are transcribed from independent promoters[18]. Amongst protozoan parasites, snoRNAs have been extensively studied in Trypanosoma brucei[19] and Plasmodium falciparum[20-22]. In the latter it was shown for the first time that snoRNA genes may be located in UTRs. Strikingly, both organisms showed a much larger number of methylation sites compared with pseudouridylation sites.
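The motif definitions above translate directly into pattern searches. The following is a minimal sketch, not the method used by Snoscan or snoSeeker: it expands the IUPAC codes (R = A or G, N = any base) into regular expressions and reports spans that start at a C box and end at a D box. The function names and the use of the 60–100 base size range as a window on the C-to-D span are assumptions made for illustration; terminal stems, C'/D' boxes and any scoring are ignored.

```python
import re

# Box motifs as given in the text, with IUPAC codes expanded:
# R = A or G, N = any base.
C_BOX = re.compile(r"[AG]UGAUGA")         # RUGAUGA
D_BOX = re.compile(r"CUGA")               # CUGA
H_BOX = re.compile(r"A[ACGU]A[ACGU]{2}A") # ANANNA
ACA_BOX = re.compile(r"ACA")              # ACA


def to_rna(seq):
    """Upper-case a DNA or RNA string and write it in the RNA alphabet."""
    return seq.upper().replace("T", "U")


def find_cd_candidates(seq, min_len=60, max_len=100):
    """Return (start, end) spans running from a C box to a downstream D box.

    The length window is an illustrative assumption applied to the
    C-box-to-D-box span, not to the full snoRNA.
    """
    rna = to_rna(seq)
    spans = []
    for c in C_BOX.finditer(rna):
        # Only consider D boxes that begin after the C box ends.
        for d in D_BOX.finditer(rna, c.end()):
            length = d.end() - c.start()
            if min_len <= length <= max_len:
                spans.append((c.start(), d.end()))
    return spans


# Toy sequence (not from E. histolytica): a C box, 60 filler bases, a D box.
toy = "AA" + "GTGATGA" + "A" * 60 + "CTGA" + "GG"
candidates = find_cd_candidates(toy)
```

A scan of this kind returns very many hits on a real genome, which is why the dedicated tools add scoring and target-site information before candidates are accepted.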

A number of bioinformatic tools are available for scanning genomic sequences for snoRNAs. These include Snoscan[23] and snoSeeker (CDSeeker and ACASeeker)[24] for the search of C/D and H/ACA box snoRNAs. In this study, we have carried out a genome-wide analysis of the early branching parasitic protist Entamoeba histolytica to identify C/D and H/ACA box snoRNAs in this organism. A computational search for structural motifs gave hits, from which false positives having no identifiable target sites were removed. This was achieved by aligning the rRNA of E. histolytica separately with the rRNAs of five eukaryotic organisms (Arabidopsis thaliana, Caenorhabditis elegans, Drosophila melanogaster, Saccharomyces cerevisiae and Homo sapiens) whose snoRNAs and target sites are already known[25-27]. The computational analysis was combined with experimental validation.

Results and discussion

Computational identification of putative snoRNAs from E. histolytica by snoscan and snoSeeker

Target site modifications by snoRNAs are commonly conserved amongst distant eukaryotes[28]. We therefore selected five eukaryotic organisms (A. thaliana, C. elegans, D. melanogaster, S. cerevisiae and H. sapiens) whose methylation sites and pseudouridylation (psi) sites are known, and used these to find putative sites in E. histolytica rRNA by aligning its 5.8S, 28S and 18S rRNA sequences separately with the rRNAs of the selected organisms (Additional file 1: Figure S1). Each of the mapped methylation and psi sites was picked as a putative modification site in E. histolytica. We identified a total of 173 putative methylation sites and 126 putative psi sites in E. histolytica. A large fraction of these (53%) matched yeast and human sites. In addition, 24 novel methylation sites were found in E. histolytica. The programs snoscan and snoSeeker (CDSeeker) were used to identify putative C/D box snoRNA sequences, and snoSeeker (ACASeeker) to identify putative H/ACA box snoRNA sequences, in the whole genome of E. histolytica.

The initially predicted snoRNAs (41705 C/D box and 661 H/ACA box) were further analyzed to eliminate false positive candidates using the following criteria (Figure 1). Firstly, we selected snoRNAs that could target the putative modification sites obtained by aligning the rRNA of E. histolytica with the five organisms listed above. SnoRNAs that could potentially target 23 predicted methyl sites and 41 psi sites in E. histolytica were thus selected. Secondly, we set a threshold on the final log-odds score, which incorporates information from each of the snoRNA features, and retained the snoRNAs with a final score equal to or greater than the threshold[24, 26]. The threshold values used are given in “Methods”. Thirdly, we examined the genomic location of these snoRNAs and selected those from intergenic regions and introns. We also selected snoRNAs from genic regions for which the log-odds score was well above the threshold (45 bits for H/ACA and 20 bits for C/D box snoRNAs)[24, 26]. Lastly, we performed BLASTn analysis of the predicted snoRNAs against the EST database of E. histolytica, and discarded all snoRNAs that gave hits with ESTs. Finally we obtained a total of 99 snoRNAs, of which 41 were C/D box (34 guide and 7 orphan snoRNAs) and 58 were H/ACA box (43 guide and 15 orphan snoRNAs). We have named the genes encoding the putative snoRNAs so as to indicate firstly the type of snoRNA (Me or ACA), followed by the species name (Eh) and the modification site in rRNA (where predicted) or orphan (where it is not known), e.g. ACA-Eh-SSU-1315 represents an H/ACA type snoRNA of E. histolytica which is predicted to modify SSU rRNA at position 1315 (Tables 1, 2, 3).
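The successive filters for guide snoRNAs described above can be sketched as a simple predicate. This is an illustration under stated assumptions rather than the authors' pipeline: the `Candidate` record and its field names are hypothetical, the base thresholds are placeholders (the actual values are given in Methods, which is not reproduced here), the 20-bit and 45-bit figures quoted in the text are read as a stricter cut-off for candidates outside intergenic regions and introns, and UTR candidates are treated like genic ones.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    kind: str           # "CD" or "HACA"
    score_bits: float   # final log-odds score reported by the prediction tool
    location: str       # "intergenic", "intron", "genic" or "UTR"
    targets_site: bool  # pairs with a putative rRNA modification site
    est_hit: bool       # BLASTn against the EST database gave a hit


# Placeholder base thresholds (hypothetical; see Methods for the real values).
BASE_THRESHOLD = {"CD": 10.0, "HACA": 30.0}
# Figures quoted in the text, interpreted here as the stricter genic cut-off.
GENIC_THRESHOLD = {"CD": 20.0, "HACA": 45.0}


def passes_filters(c):
    """Apply the four filters in the order given in the text."""
    if not c.targets_site:                       # 1. known target site
        return False
    if c.score_bits < BASE_THRESHOLD[c.kind]:    # 2. score threshold
        return False
    if (c.location not in ("intergenic", "intron")
            and c.score_bits < GENIC_THRESHOLD[c.kind]):  # 3. location
        return False
    return not c.est_hit                         # 4. no EST match


def select(candidates):
    """Keep only the candidates that survive every filter."""
    return [c for c in candidates if passes_filters(c)]


# Toy candidates with invented scores.
toy = [
    Candidate("a", "CD", 15.0, "intergenic", True, False),
    Candidate("b", "CD", 15.0, "genic", True, False),
    Candidate("c", "HACA", 50.0, "genic", True, False),
    Candidate("d", "HACA", 50.0, "intergenic", True, True),
]
kept = [c.name for c in select(toy)]
```

Orphan snoRNAs have no target site by definition, so a real implementation would skip the first filter for them.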
Figure 1

Flowchart showing analysis with Snoscan and snoSeeker. (A) C/D box guide snoRNAs predicted by Snoscan and final selection of candidate snoRNAs on the basis of indicated filters. (B) The initial count and the selected orphan C/D box snoRNAs using CDSeeker. (C) Initial count and final selection of H/ACA box snoRNAs using ACASeeker.

Table 1

Box C/D snoRNA genes in E. histolytica

| snoRNA gene | Len. (nt) | Seq. (%) | Modification | Antisense element | Scaffold | Start | End | Location |
| **Me-Eh-SSU-G1296 | 78 | 92% | SSU-G1296 | 12nt(5') 100% | DS571223 | 24176 | 24254 | IR |
| | | | SSU-G1298 | 10nt(5') 100% | | | | |
| | | | SSU-G1195 | 10nt(5') 100% | | | | |
| Me-Eh-SSU-U1024 | 80 | 96% | SSU-U1024 | 14nt(5') 95% | DS571261 | 44605 | 44684 | IR |
| | | | SSU-U1822 | 11nt(5') 98% | | | | |
| **Me-Eh-SSU-A83 | 78 | 100% | SSU-A83 | 16nt(5') 100% | DS571196 | 58225 | 58327 | IR |
| | | | SSU-U87 | 12nt(5') 100% | | | | |
| Me-Eh-SSU-G41 | 68 | 93% | SSU-G41 | 11nt(5') 100% | DS571147 | 177417 | 177350 | IR |
| Me-Eh-SSU-A431 | 68 | 94% | SSU-A431 | 13nt(5') 100% | DS571331 | 10236 | 10303 | IR |
| Me-Eh-SSU-U871 | 80 | 96% | SSU-U871 | 20nt(5') 95% | DS571673 | 2402 | 2481 | NA |
| *Me-Eh-SSU-G1535 | 82 | 93% | SSU-G1535 | 12nt(5') 100% | DS571215 | 31121 | 31040 | IR |
| | | | LSU-G2053 | 9nt(5') 100% | | | | |
| Me-Eh-SSU-A27 | 66 | 100% | SSU-A27 | 11nt(5') 100% | DS571226 | 26372 | 26307 | IR |
| Me-Eh-SSU-A1830 | 83 | 88% | SSU-A1830 | 11nt(5') 100% | DS571152 | 29351 | 29433 | EHI_049420 (+) |
| Me-Eh-SSU-A836 | 103 | -- | SSU-A836 | 13nt(5') | DS571152 | 99242 | 99140 | IR |
| Me-Eh-SSU-G1152 | 60 | 91% | SSU-G1152 | 12nt(3') 100% | DS571335 | 19522 | 19581 | IR |
| Me-Eh-SSU-G628 | 97 | -- | SSU-G628 | 10nt(5') | DS571451 | 15436 | 15532 | IR |
| | | | | | DS571177 | 52928 | 52831 | |
| Me-Eh-SSU-A1183 | 82 | -- | SSU-A1183 | 10nt(5') | DS571164 | 22795 | 22876 | IR |
| | | | SSU-G1836 | 13nt(5') | | | | |
| | | | SSU-A1485 | 9nt(5') | | | | |
| | | | LSU-A520 | 12nt(3') | | | | |
| | | | LSU-U1210 | 12nt(5') | | | | |
| | | | LSU-A145 | 10nt(3') | | | | |
| Me-Eh-SSU-A790 | 68 | 94% | SSU-A790 | 10nt(5') 100% | DS571171 | 51701 | 51634 | IR |
| | | | SSU-A1496 | 11nt(5') 100% | | | | |
| | | | LSU-A801 | 11nt(5') 100% | | | | |
| | | | LSU-A1834 | 10nt(5') 100% | | | | |
| | | | LSU-A2555 | 11nt(5') 100% | | | | |
| Me-Eh-SSU-C1805 | 63 | 96% | SSU-C1805 | 10nt(5') 100% | DS571145 | 496851 | 496789 | IR |
| Me-Eh-LSU-A928a | 69 | 97% | LSU-A928 | 11nt(5') 100% | DS571323 | 13072 | 13140 | IR |
| Me-Eh-LSU-A928b | 66 | 98% | LSU-A782 | 11nt(5') 100% | DS571163 | 50734 | 50669 | IR |
| | | | LSU-A928 | 9nt(5') 100% | | | | |
| | | | LSU-A1034 | 9nt(5') 100% | | | | |
| Me-Eh-LSU-U1868 | 101 | 92% | LSU-U1868 | 13nt(5') 92.3% | DS571175 | 28933 | 28833 | IR |
| Me-Eh-LSU-U3580a | 103 | -- | LSU-U3580 | 19nt(5') | DS571304 | 677 | 575 | IR |
| Me-Eh-LSU-U3580b | 105 | -- | LSU-U3580 | 19nt(5') | DS571305 | 36390 | 36494 | IR |
| Me-Eh-LSU-A785 | 62 | 96% | LSU-A785 | 13nt(5') 100% | DS571416 | 15678 | 15739 | IR |
| Me-Eh-LSU-G2958 | 70 | 97% | LSU-G2958 | 13nt(5') 100% | DS571205 | 22350 | 22419 | IR |
| *Me-Eh-LSU-A3089 | 71 | 92% | LSU-A3089 | 11nt(5') 100% | DS571180 | 41005 | 40935 | IR |
| *Me-Eh-LSU-C2414 | 69 | 97% | LSU-C2414 | 11nt(5') 100% | DS571473 | 957 | 1025 | IR |
| Me-Eh-LSU-G926 | 59 | 98% | LSU-G926 | 13nt(3') 100% | DS571150 | 13447 | 13389 | IR |
| Me-Eh-LSU-U1018 | 69 | -- | LSU-U1018 | 11nt(5') | DS571215 | 62034 | 62102 | IR |
| | | | LSU-U2783 | 14nt(5') | DS571316 | 3067 | 2999 | |
| Me-Eh-LSU-G1028 | 61 | 87% | LSU-G1028 | 14nt(3') 100% | DS571174 | 92482 | 92422 | IR |
| Me-Eh-LSU-U1176a | 109 | 94% | LSU-U1176 | 14nt(5') 100% | DS571307 | 17712 | 17820 | IR ▲ |
| Me-Eh-LSU-U1176b | 109 | 94% | LSU-U1176 | 14nt(5') 100% | DS571419 | 10643 | 10535 | IR |
| Me-Eh-LSU-U1176c | 109 | 93% | LSU-U1176 | 14nt(5') 100% | DS571792 | 3710 | 3820 | IR |
| Me-Eh-LSU-A2333 | 128 | 93% | LSU-A2333 | 12nt(3') 100% | DS571208 | 15564 | 15691 | IR |
| **Me-Eh-LSU-A228 | 72 | 97% | LSU-A228 | 13nt(5') 100% | DS571397 | 17920 | 17991 | EHI_003940, intron of gene (40S ribosomal protein S4, putative) |
| **Me-Eh-5.8S-U84 | 62 | 86% | 5.8S-U84 | 18nt(3') 91% | DS571194 | 27534 | 27595 | 3'UTR |
| *Me-Eh-5.8S-A92 | 115 | 85% | 5.8S-A92 | 11nt(5') 94% | DS571180 | 76405 | 76291 | EHI_118830 (−) ■ |

** snoRNAs validated by RT-PCR and Northern, * validated only by RT-PCR.

Note: “Len.” denotes length of the snoRNA genes, “Seq.” is sequence identity of corresponding snoRNA genes in E. dispar, “Antisense element” denotes length of antisense element in E. histolytica and its sequence identity with E. dispar. “IR”, intergenic region, “NA”, no annotation. snoRNA located close to ribosomal protein genes , downstream to rRNA methyltransferase gene ▲, close to C/D box snoRNP (fibrillarin) ■. (+) and (−) represents snoRNA in sense and antisense orientation with respect to host gene.

Table 2

Box H/ACA snoRNA genes in E. histolytica

| snoRNA gene | Len. (nt) | Seq. (%) | Modification | Antisense element | Scaffold | Start | End | Location |
| **ACA-Eh-SSU1315 | 121 | 96% | SSU1315 | 6 + 7nt (5') 100% | DS571149 | 98793 | 98673 | IR |
| ACA-Eh-SSU631 | 137 | - | SSU631 | 6 + 5nt (3') | DS572405 | 485 | 349 | NA |
| | | - | SSU1114 | 8 + 9nt (5') | DS572405 | 485 | 349 | |
| ACA-Eh-SSU1727 | 135 | 87% | SSU1727 | 9 + 5nt (5') 93% | DS571346 | 12499 | 12633 | IR/5'UTR |
| **ACA-Eh-SSU626 | 127 | 94% | SSU626 | 6 + 6nt (3') 100% | DS571463 | 13091 | 12965 | IR |
| ACA-Eh-SSU461 | 142 | - | SSU461 | 7 + 5nt (3') | DS571171 | 90117 | 90258 | IR |
| ACA-Eh-SSU1675 | 127 | 92% | SSU1675 | 5 + 9nt (3') 93% | DS571182 | 71521 | 71647 | IR |
| *ACA-Eh-SSU526 | 126 | 94% | SSU526 | 7 + 5nt (5') 100% | DS571463 | 12972 | 13097 | IR |
| *ACA-Eh-LSU3008 | 129 | 92% | LSU3008 | 7 + 5nt (5') 100% | DS571272 | 39423 | 39295 | IR |
| ACA-Eh-LSU1172a | 142 | - | LSU1172 | 5 + 4nt (5') | DS571149 | 73439 | 73580 | IR |
| ACA-Eh-LSU1172b | 141 | - | LSU1172 | 5 + 4nt (3') | DS571307 | 22719 | 22859 | IR |
| ACA-Eh-LSU1107b | 155 | - | LSU1107 | 11 + 3nt (5') | DS571159 | 2240 | 2086 | IR |
| | | - | LSU1172 | 6 + 8nt (3') | DS571159 | 2240 | 2086 | |
| | | - | 5.8S52 | 8 + 3nt (3') | DS571159 | 2240 | 2086 | |
| ACA-Eh-LSU1650 | 118 | 89% | LSU1650 | 8 + 5nt (5') 100% | DS571267 | 21025 | 21142 | IR |
| ACA-Eh-LSU3087 | 129 | 92% | LSU3087 | 6 + 4nt (5') 100% | DS571178 | 75373 | 75501 | IR |
| ACA-Eh-LSU2791 | 161 | | LSU2791 | 6 + 7nt (5') | DS571159 | 59530 | 59690 | IR |
| ACA-Eh-LSU3155 | 151 | 88% | LSU3155 | 5 + 6nt (3') 91% | DS571255 | 1114 | 963 | IR |
| ACA-Eh-LSU3221 | 152 | 79% | LSU3221 | 9 + 4nt (5') 91.6 | DS571339 | 14712 | 14561 | IR |
| ACA-Eh-LSU1159a | 154 | - | LSU1159 | 4 + 5nt (5') | DS571589 | 7973 | 8126 | IR |
| | | | | | DS571660 | 2209 | 2056 | |
| ACA-Eh-LSU2700 | 144 | 86% | LSU2700 | 8 + 3nt (3') 100% | DS571160 | 113417 | 113560 | IR |
| | 144 | | LSU1159 | 6 + 4nt (3') 100% | DS571160 | 113417 | 113560 | |
| ACA-Eh-LSU1080 | 123 | - | LSU1080 | 3 + 7nt (5') | DS571228 | 4519 | 4641 | IR |
| **ACA-Eh-LSU1343 | 137 | - | LSU1343 | 5 + 5nt (5') | DS571219 | 12011 | 11875 | IR |
| ACA-Eh-LSU2997b | 129 | 96% | LSU2997 | 5 + 4nt (5') 100% | DS571145 | 384477 | 384605 | IR |
| ACA-Eh-LSU339 | 148 | - | LSU339 | 5 + 4nt (5') | DS571174 | 50939 | 50792 | IR |
| ACA-Eh-LSU1123 | 148 | - | LSU1123 | 4 + 7nt (5') | DS571225 | 51991 | 52138 | IR |
| ACA-Eh-LSU1005 | 148 | - | LSU1005 | 4 + 5nt (3') | DS571402 | 1263 | 1116 | IR |
| ACA-Eh-LSU1236a | 141 | - | LSU1236 | 3 + 6nt (3') | DS571481 | 789 | 649 | IR |
| ACA-Eh-LSU1236b | 141 | - | LSU1236 | 3 + 6nt (3') | DS571159 | 21643 | 21503 | IR |
| ACA-Eh-LSU1107a | 154 | - | LSU1107 | 11 + 4nt (3') | DS571208 | 46788 | 46941 | IR/5'UTR |
| | | - | SSU1114 | 8 + 9nt (5') | DS571208 | 46788 | 46941 | |
| **ACA-Eh-LSU2288 | 126 | 92% | LSU2288 | 4 + 9nt (5') 100% | DS571148 | 172182 | 172057 | IR |
| | | | SSU1431 | 6 + 4nt (3') 90.0% | DS571148 | 172182 | 172057 | |
| ACA-Eh-LSU1159b | 153 | - | LSU1159 | 5 + 5nt (5') | DS572251 | 153 | 1 | NA |
| | | - | LSU3221 | 4 + 6nt (3') | DS572251 | 153 | 1 | |
| | | - | SSU826 | 4 + 6nt (5') | DS572251 | 153 | 1 | |
| ACA-Eh-LSU2997a | 122 | - | LSU2997 | 5 + 6nt (5') | DS572347 | 1128 | 1007 | NA |
| | | | | | DS572347 | 800 | 679 | |
| | | | | | DS572347 | 464 | 343 | |
| | | | | | DS572347 | 132 | 11 | |
| ACA-Eh-5.8S80a | 140 | - | 5.8S80 | 5 + 9nt (5') | DS571346 | 5092 | 4953 | IR |
| ACA-Eh-5.8S80b | 132 | - | 5.8S80 | 5 + 6nt (5') | DS571206 | 1568 | 1437 | IR |
| | | - | LSU3221 | 5 + 5nt (5') | DS571206 | 1568 | 1437 | |
| ACA-Eh-SSU740 | 141 | 92% | SSU740 | 4 + 7nt (3') 91% | DS571156 | 54460 | 54320 | EHI_182810 (+) |
| *ACA-Eh-SSU188 | 135 | 93% | SSU188 | 6 + 3nt (5') 89% | DS571501 | 5129 | 5263 | EHI_172000 (+) |
| *ACA-Eh-SSU1216 | 142 | 77% | SSU1216 | 5 + 4nt (3') 89% | DS571247 | 8141 | 8000 | EHI_016340 (−) |
| ACA-Eh-SSU299 | 169 | 94% | SSU299 | 4 + 6nt (3') 100% | DS571161 | 119527 | 119695 | EHI_142230 (+) |
| ACA-Eh-SSU1212 | 129 | 93% | SSU1212 | 9 + 7nt (3') 100% | DS571169 | 105772 | 105900 | EHI_098580 (−) |
| **ACA-Eh-LSU2809 | 156 | 82% | LSU2809 | 12 + 3nt (3') 86.7% | DS571148 | 116513 | 116668 | EHI_012330 (−) |
| ACA-Eh-LSU2335 | 131 | 93% | LSU2335 | 3 + 6nt (5') 100% | DS571304 | 17766 | 17896 | EHI_161910 (−) |
| ACA-Eh-LSU2493 | 135 | 87% | LSU2493 | 8 + 3nt (5') 82% | DS571228 | 40854 | 40720 | EHI_161000 (−) |
| ACA-Eh-LSU1176 | 157 | 97% | LSU1176 | 5 + 4nt (5') 100% | DS571185 | 32437 | 32593 | EHI_104450 (+) |
| ACA-Eh-LSU2268 | 135 | 97% | LSU2268 | 7 + 3nt (3') 90% | DS571154 | 24191 | 24057 | EHI_178500 (−) |
| *ACA-Eh-5.8S84 | 152 | 82% | 5.8S84 | 7 + 5nt (5') 92% | DS571169 | 105495 | 105646 | EHI_098580 (−) |

** snoRNAs validated by RT-PCR and Northern, * validated only by RT-PCR.

Note: “Len.” denotes length of the snoRNA genes, “Seq.” is sequence identity of corresponding snoRNA genes in E. dispar, “Antisense element” denotes length of antisense element in E. histolytica and its sequence identity with E. dispar. “IR”, intergenic region, “NA”, no annotation. snoRNA located close to ribosomal protein genes . (+) and (−) represents snoRNA in sense and antisense orientation with respect to host gene.

Table 3

Orphan snoRNA genes (C/D and H/ACA) in E. histolytica

| snoRNA gene | Len. (nt) | Seq. (%) | Modification | Antisense element | Scaffold | Start | End | Homology (yeast, human) | Location |
| EhCDOrph1 | 95 | 95% | unknown | unknown | DS571162 | 42554 | 42648 | unknown | EHI_155390 (+) |
| EhCDOrph2 | 87 | 94% | unknown | unknown | DS571301 | 21222 | 21308 | unknown | IR |
| EhCDOrph3 | 107 | 94% | unknown | unknown | DS571358 | 4592 | 4698 | unknown | IR |
| EhCDOrph4 | 91 | 96% | unknown | unknown | DS571422 | 5594 | 5684 | unknown | IR |
| EhCDOrph5 | 84 | 94% | unknown | unknown | DS571468 | 9619 | 9702 | unknown | IR |
| EhCDOrph6 | 94 | -- | unknown | unknown | DS571178 | 12358 | 12451 | unknown | 3'UTR/IR |
| EhCDOrph7 | 94 | -- | unknown | unknown | DS571178 | 13726 | 13819 | unknown | 3'UTR/IR |
| EhACAOrph1 | 115 | 91% | unknown | unknown | DS571172 | 5407 | 5293 | unknown | IR |
| EhACAOrph2 | 135 | 93% | unknown | unknown | DS571155 | 108854 | 108988 | unknown | IR/5'UTR |
| **EhACAOrph3 | 137 | 94% | unknown | unknown | DS571258 | 10028 | 9892 | unknown | IR |
| EhACAOrph4 | 122 | 90% | unknown | unknown | DS571205 | 43143 | 43022 | unknown | IR |
| EhACAOrph5 | 129 | - | unknown | unknown | DS571332 | 15845 | 15717 | unknown | IR |
| **EhACAOrph6 | 158 | - | unknown | unknown | DS571298 | 19208 | 19365 | unknown | IR |
| EhACAOrph7 | 130 | - | unknown | unknown | DS571219 | 6608 | 6737 | unknown | IR |
| EhACAOrph8 | 131 | 88% | unknown | unknown | DS571162 | 44597 | 44467 | unknown | IR |
| EhACAOrph9 | 120 | 89% | unknown | unknown | DS571164 | 102500 | 102619 | unknown | IR |
| EhACAOrph10 | 149 | 87% | unknown | unknown | DS571179 | 6844 | 6696 | unknown | EHI_093690 (−) |
| EhACAOrph11 | 139 | 94% | unknown | unknown | DS571299 | 12352 | 12214 | unknown | EHI_099700 (−) |
| EhACAOrph12 | 134 | 91% | unknown | unknown | DS571402 | 6404 | 6271 | unknown | EHI_067510 (−) |
| **EhACAOrph13 | 137 | 95% | unknown | unknown | DS571501 | 3747 | 3883 | unknown | EHI_171990 (+) |
| **EhACAOrph14 | 153 | 97% | unknown | unknown | DS571295 | 13935 | 14087 | unknown | EHI_082520 (−) |
| *EhACAOrph15 | 148 | 91% | unknown | unknown | DS571166 | 95075 | 95222 | unknown | EHI_127390 (−) |

** snoRNAs validated by RT-PCR and Northern, * validated only by RT-PCR.

Note: “Len.” denotes length of the snoRNA genes, “Seq.” is sequence identity of corresponding snoRNA genes in E. dispar, “IR”, intergenic region, “NA”, no annotation. snoRNA located close to ribosomal protein genes . (+) and (−) represents snoRNA in sense and antisense orientation with respect to host gene.

We compared the predicted E. histolytica snoRNAs with those of S. cerevisiae[29], H. sapiens[30] and the two protozoan parasites (T. brucei and P. falciparum) on the basis of homology with conserved antisense sequences that guide the respective modifications for the two snoRNA classes (Table 4). Of the 34 C/D guide snoRNAs, 9 showed homology with P. falciparum snoRNAs and 10 with T. brucei snoRNAs, while 14 showed homology with yeast and 11 with human snoRNAs. Only 4 of the 43 E. histolytica H/ACA box snoRNAs showed homology with P. falciparum snoRNAs and 2 with T. brucei snoRNAs, while 14 showed homology with yeast and 18 with human snoRNAs. The conservation of modification sites between these organisms was as follows. Of the sites predicted to be modified in E. histolytica rRNAs (47 methylation sites and 41 pseudouridylation sites), 16 methylation sites and 21 pseudouridylation sites were conserved in at least one of the other four organisms (Table 4). Taking the two types of modification site together, 30 sites were conserved between E. histolytica and S. cerevisiae, 31 between E. histolytica and H. sapiens, 13 between E. histolytica and P. falciparum, and 12 between E. histolytica and T. brucei. Seven modification sites of E. histolytica were shared by all four organisms.

We also found 7 and 15 orphan snoRNAs in the C/D and H/ACA categories respectively. Orphan snoRNAs are important as they may act on RNA substrates other than mature rRNAs. As mentioned before, one such role has been reported for the human HBII-52 snoRNA[13], a C/D orphan snoRNA that regulates alternative splicing of the serotonin receptor 2C. Similarly, some orphan H/ACA box snoRNAs may function in other aspects of RNA biogenesis. For example, the human U17 box H/ACA snoRNA and its yeast orthologue, snR30, play an essential role in the nucleolytic processing of 18S rRNA from pre-rRNA. We checked for sequence complementarity of the antisense elements in our predicted orphan snoRNAs with the E. histolytica database. For two C/D orphan snoRNAs (Additional file 2: Figure S2) the possible antisense element (upstream of the D' box and/or D box) showed complementary base pairing with mRNAs of the EHI_192630 and EHI_008070 genes in E. histolytica. Further, we checked whether the predicted orphan snoRNAs were found in the small RNA database of E. histolytica (generated in our lab by next generation sequencing). We found that 14 of the 22 orphan snoRNAs were detected in this database.
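A complementarity check of the kind described for the orphan snoRNAs can be sketched as a reverse-complement search. This is an illustrative simplification, assuming perfect Watson-Crick pairing only; real guide-target duplexes may tolerate G-U pairs and mismatches, and the function names and toy sequences below are hypothetical rather than taken from the study.

```python
# Watson-Crick complements in the RNA alphabet.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def to_rna(seq):
    """Upper-case a DNA or RNA string and write it in the RNA alphabet."""
    return seq.upper().replace("T", "U")


def reverse_complement(rna):
    """Return the reverse complement of an RNA string."""
    return rna.translate(COMPLEMENT)[::-1]


def find_target_sites(antisense, target):
    """Return 0-based start positions in `target` that pair perfectly
    (Watson-Crick only) with the whole antisense element."""
    site = reverse_complement(to_rna(antisense))
    tgt = to_rna(target)
    hits = []
    pos = tgt.find(site)
    while pos != -1:
        hits.append(pos)
        pos = tgt.find(site, pos + 1)  # allow overlapping matches
    return hits


# Toy example: a 10-nt antisense element and an invented target RNA.
antisense = "GAUCCGAUAC"
target = "AAAA" + "GUAUCGGAUC" + "CCC"
sites = find_target_sites(antisense, target)
```

Short elements of 9 to 12 nt will match many transcripts by chance, so hits from such a search are candidates for further inspection rather than evidence of a functional interaction.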
Table 4

Homology of E. histolytica snoRNAs and modification sites with selected organisms

| snoRNA gene of E. histolytica | Modification | Homolog: yeast | Homolog: human | Homolog: P. falciparum | Homolog: T. brucei | Conservation of sites | Site: yeast | Site: human | Site: P. falciparum | Site: T. brucei |
| Me-Eh-SSU-G1296 | SSU-G1296 | snR40 | U232A | - | TB9Cs3C1 | YHT | 18SG1271 | 18SG 1328 | | SSU Gm1676 |
| Me-Eh-SSU-A431 | SSU-A431 | snR87 | U16 | PFS11 | - | YHP | 18SA 436 | 18SA 484 | 18S Am442 | |
| Me-Eh-SSU-G1535 | SSU-G1535 | snR56 | U25 | snoR25 | TB9Cs2C4 | YHPT | 18SG 1428 | 18SG | G1674SSU | SSU Gm1895 |
| Me-Eh-SSU-A27 | SSU-A27 | snR74 | U27 | PFS4 | TB8Cs2C1 | YHPT | 18SA 28 | 27 | 18S Am28 | SSU Am56 |
| Me-Eh-SSU-G1152 | SSU-G1152 | snR41 | - | - | - | Y | 18SG 1126 | | | |
| Me-Eh-SSU-A790 | SSU-A790 | snR53 | - | - | - | Y | 18SA 796 | | | |
| Me-Eh-SSU-C1805 | SSU-C1805 | snR70 | U43 | - | TB10Cs4C3 | YHT | 18SC 1639 | 18SC 1703 | | SSU Um2123 |
| Me-Eh-LSU-A928a | LSU-A928 | snR39 | U32A | - | TB11Cs4C2 | YHT | 28SA 807 | 28SA 1511 | | LSU5 Am1091 |
| Me-Eh-LSU-A785 | LSU-A785 | U18 | U18A | PFS13 | TB10Cs2C2 | YHPT | 28SA 649 | 28SA 1313 | 28S Am728 | LSU Am910 |
| Me-Eh-LSU-G2958 | LSU-G2958 | snR38 | snR38A | PFS7 | TB11Cs1C2 | YHPT | 28SG 2815 | 28SG 4362 | 28S Gm3176 | LSU3Gm1207 |
| Me-Eh-LSU-A3089 | LSU-A3089 | snR71 | U29 | PFS2 | - | YHP | 28SA 2946 | 28SA 4493 | 18S A1129,28SAm3307 | |
| Me-Eh-LSU-C2414 | LSU-C2414 | snR64 | U74 | PFS15, PFS16 | TB10Cs1C1 | YHPT | 28SC 2337 | 28SC 3820 | 28S Cm2632 | LSU3 Cm538 |
| Me-Eh-LSU-G926 | LSU-G926 | snR39b | snR39B | PFS8 | TB9Cs2C3 | YHPT | 28SG805 | 28SG1509 | 18SGm1798,28SGm926 | LSU5Gm1089 |
| Me-Eh-LSU-U1018 | LSU-U1018 | snR40 | - | - | - | Y | 28SU 898 | | | |
| Me-Eh-LSU-G1028 | LSU-G1028 | snR60 | U80 | - | TB9Cs2C5 | YHT | 28SG 908 | 28SG 1612 | | LSU5Gm1192 |
| Me-Eh-LSU-A2333 | LSU-A2333 | - | - | PFS14 | - | P | | | 28S Am2551 | |
| ACA-Eh-SSU1315 | SSU1315 | snR83 | ACA4 | Pfa ACA 40 | - | YHP | 18SU 1290 | 18SU 1347 | SSU1391,1443 | |
| ACA-Eh-SSU626 | SSU626 | snR161 | unknown | - | - | YH | 18SU 632 | 18SU 681 | | |
| ACA-Eh-SSU461 | SSU461 | snR189 | - | - | - | Y | 18SU 466 | | | |
| ACA-Eh-LSU3008 | LSU3008 | snR46 | ACA16 | Pfa ACA 41 | - | YHP | 28SU 2865 | 28SU 4412 | LSU3226,3399 | |
| ACA-Eh-LSU1172a | LSU1172 | snR81 | ACA7 | - | - | YH | 28SU 1052 | 28SU 1779 | | |
| ACA-Eh-LSU1172b | LSU1172 | snR81 | ACA7 | - | - | YH | 28SU 1052 | 28SU 1779 | | |
| ACA-Eh-LSU3087 | LSU3087 | snR37 | ACA10 | Pfa ACA 32 | TB9Cs2H2 | YHPT | 28SU 2499 | 28SU 4491 | LSU3305,3478 | LSU3psi1336 |
| ACA-Eh-LSU1159a | LSU1159 | - | HBI-115 | - | - | H | | 28SU 1766 | | |
| ACA-Eh-LSU2700 | LSU1159 | - | HBI-115 | - | - | H | | 28SU 1766 | | |
| ACA-Eh-LSU1080 | LSU1080 | snR8 | ACA56 | - | - | YH | 28SU 960 | 28SU 1664 | | |
| ACA-Eh-LSU2997b | LSU2997 | - | ACA21 | - | - | H | | 28SU 4401 | | |
| ACA-Eh-LSU1123 | LSU1123 | snR5 | ACA52 | - | - | YH | 28SU 1004 | 28SU 1731 | | |
| ACA-Eh-LSU2288 | LSU2288 | - | ACA27 | - | - | H | | 28SU 3694 | | |
| ACA-Eh-LSU1159b | LSU1159 | - | HBI-115 | - | - | H | | 28SU 1766 | | |
| ACA-Eh-LSU2997a | LSU2997 | - | ACA21 | - | - | H | | 28SU 4401 | | |
| ACA-Eh-5.8S80b | 5.8S80b | Pus7p | U69 | - | - | YH | 5sU 50 | 5.8SU 69 | | |
| ACA-Eh-SSU1216 | SSU1216 | snR35 | ACA13 | - | - | YH | 18SU 1191 | 18SU 1248 | | |
| ACA-Eh-SSU299 | SSU299 | snR49 | - | - | - | Y | 18SU 302 | | | |
| ACA-Eh-SSU1212 | SSU1212 | snR36 | ACA36/36B | - | - | YH | 18SU 1187 | 18SU 1244 | | |
| ACA-Eh-LSU2335 | LSU2335 | snR191 | U19/19-2 | Pfa ACA 35 | | YHP | 28SU 2258 | 28SU 3741 | LSU2553,2676 | |
| ACA-Eh-LSU2268 | LSU2268 | snR32 | unknown | - | TB10Cs3H2 | YHT | 28SU 2191 | 28SU 3674 | | LSU3psi397 |

Note: snoRNAs of E. histolytica and their homologs in yeast (Y), human (H), P. falciparum (P) and T. brucei (T) are shown with their conserved modification sites.

All of the predicted E. histolytica snoRNAs possessed conserved structural motifs characteristic of each class. The secondary structure of the predicted H/ACA snoRNAs was determined by ACASeeker. All 58 predicted H/ACA snoRNAs adopted the consensus folding pattern, as shown using VARNA: Visualization Applet for RNA[31]. A representative H/ACA snoRNA is shown in Additional file 3: Figure S3 A. As expected, the H/ACA box snoRNAs formed a hairpin-hinge-hairpin-tail structure with the H box lying in the hinge region and the ACA box in the 3' tail region. Unlike ACASeeker, the C/D box prediction tools did not provide secondary structure information. Therefore the secondary structure of C/D box snoRNAs was predicted with RNAfold (rna.tbi.univie.ac.at/cgi-bin/RNAfold.cgi) and structures were drawn using VARNA. Secondary structures obtained for C/D box snoRNAs were similar to the published structures for these RNAs (Additional file 3: Figure S3 B).
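The box arrangement described for H/ACA snoRNAs (H box upstream, ACA box in the 3' tail) can be checked at the sequence level before any folding is attempted. The sketch below is a hypothetical helper, not part of ACASeeker. It assumes the ACA box sits exactly three nucleotides from the 3' end, which is the position commonly reported for H/ACA snoRNAs in other species but is not stated in the text, and it does not test the hairpins themselves.

```python
import re

# H box consensus ANANNA with N expanded to any RNA base.
H_BOX = re.compile(r"A[ACGU]A[ACGU]{2}A")


def check_haca_layout(seq, tail_len=3):
    """Return the start of the first H box lying upstream of a
    3'-terminal ACA box, or None if the layout is not found.

    `tail_len` (bases after the ACA box) is an assumed default.
    """
    rna = seq.upper().replace("T", "U")
    aca_start = len(rna) - tail_len - 3
    if aca_start < 0 or rna[aca_start:aca_start + 3] != "ACA":
        return None
    # Search for an H box only in the region before the ACA box.
    match = H_BOX.search(rna, 0, aca_start)
    return match.start() if match else None


# Toy sequence: H box (AGAUUA), filler, ACA box, 3-nt tail.
toy = "GG" + "AGAUUA" + "C" * 20 + "ACA" + "UUU"
h_start = check_haca_layout(toy)
```

A sequence that passes this check still needs a structure prediction to confirm the two hairpins on either side of the hinge.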

Genome sequences of other Entamoeba species are now becoming available. We checked these databases for close matches to the predicted snoRNAs of E. histolytica. Of the 58 predicted H/ACA snoRNAs we found 36 in E. dispar and 47 in E. nuttalli, while of the 41 predicted C/D box snoRNAs we found 33 in E. dispar and 36 in E. nuttalli. There was a high level of sequence similarity (77-100%), which was expected since E. dispar and E. nuttalli are very closely related to E. histolytica[32]. However, when the same analysis was done with a distant species, E. invadens, which infects reptiles, we found only 1 H/ACA and 2 C/D snoRNAs matching those of E. histolytica. Although this result could also reflect the quality of the sequence assembly, it suggests that E. invadens has diverged significantly from E. histolytica. Sequence comparison of conserved genes, e.g. rRNA genes, also shows high divergence between E. histolytica and E. invadens[33, 34].

Validation of computationally predicted snoRNAs by RT-PCR and northern hybridization

To determine whether the predicted snoRNAs are indeed expressed in E. histolytica cells, we selected 24 snoRNAs to represent different categories, namely guide/orphan, and gene location in genic/intergenic regions. Accordingly, 8 C/D box guide and orphan snoRNAs (5 intergenic, 1 intronic, 1 in a UTR and 1 genic) and the U3 snoRNA were selected, together with 15 H/ACA box guide and orphan snoRNAs (8 intergenic, 7 genic). Expression analysis of these snoRNAs was performed by RT-PCR using total RNA from E. histolytica and specific primers for each snoRNA, designed from the ends of the predicted snoRNA sequence (see Additional file 4: Table S1 for primer sequences). RT-PCR products were obtained for all snoRNAs tested (Figure 2). Amplicons of the predicted size (as obtained by genomic PCR with the same primers using total DNA of E. histolytica) were observed for all C/D box snoRNAs and most of the H/ACA box snoRNAs. For three of the H/ACA snoRNAs, somewhat smaller amplicons were observed (Figure 2B, marked by asterisk). A possible explanation for this is provided later. To further validate the RT-PCR results, northern blot analysis was performed with RNA enriched in small RNA species. DNA probes from four C/D box and nine H/ACA box snoRNAs tested by RT-PCR were used. Results showed detectable bands corresponding to all snoRNAs tested (Figure 3), although the intensities of the bands were not the same for all, possibly reflecting differential expression levels. For the four C/D box snoRNAs and the U3 snoRNA tested, the sizes of the observed bands were consistent with the predicted sizes (Figure 3C). However, several of the H/ACA snoRNAs showed bands in addition to those of the predicted sizes. These bands may represent mature snoRNAs obtained after processing, as has been reported in other species[35]. Some of these processing events may involve splicing of internal sequences, resulting in shorter amplicons in RT-PCR. The multiple bands observed for some of the H/ACA snoRNAs indicate that these may be present as both single and double hairpin RNAs, as is known in other species[36]. On the other hand, northern blot analysis of ACA-Eh-SSU626 indicates the existence of the double hairpin H/ACA snoRNA alone in this case, while ACA-Eh-SSU1315, ACA-Eh-SSU1345, ACA-Eh-LSU2809 and EhACAOrph13 seem to exist as single hairpins alone. Thus, the experimental analysis using RT-PCR and northern blotting demonstrates that the snoRNA predictions by computational analysis are indeed valid and correspond to authentic snoRNA genes.
Figure 2

Expression analysis of E. histolytica snoRNAs by Reverse-Transcription PCR (RT-PCR). 5 μg of total RNA was reverse transcribed followed by PCR with primer pairs specific to each snoRNA. RT-PCR of computationally predicted C/D box snoRNAs (A) and H/ACA box snoRNAs (B). Arrows indicate the amplicon obtained by RT-PCR. The snoRNAs which have deviated from the predicted size are marked by asterisk. Lane D is the positive control, containing genomic DNA as template. + and – are the RT-PCR reactions with and without reverse transcriptase respectively. Lane M, Size markers 10–300 bp ladder (Fermentas).

Figure 3

Expression analysis of E. histolytica snoRNAs by northern blotting. 15 μg of total RNA enriched in small RNA was resolved on a 12% denaturing urea PAGE gel. For Eh-U3 snoRNA, 10 μg of total RNA was electrophoresed on 1.2% denaturing agarose. Blots were transferred to nylon membrane and hybridized to a P32-labelled DNA probe specific to each snoRNA. Northern blot analysis of computationally predicted C/D box (A) and H/ACA box (B) snoRNAs. The 70 nt tRNA-Glu(AAA) of E. histolytica was used as a positive control, as indicated in the lower panel of selected samples. Table displaying the predicted (see Tables 1 and 2) and observed sizes of snoRNAs (C). Sizes of bands were determined with an end-labelled P32 decade marker (10–150 nt, Ambion).

Genomic organization of snoRNAs in E. histolytica

The genomic location of all snoRNAs (C/D box, H/ACA box and orphan) was determined (Tables 1, 2, 3). The majority (69%) of snoRNA genes mapped to intergenic regions, while 20% mapped to protein-coding regions, where snoRNAs were encoded either from the opposite strand of the protein-coding gene (12%) or from the same strand (8%). A small number of snoRNA genes were located in other parts of protein-coding genes, e.g. in the 5’-UTR (3%), 3’-UTR (3%) and intron (1%). The remaining 4% of the genes mapped to non-annotated regions (Additional file 5: Figure S4). We checked for proximity of snoRNA genes to protein-coding genes involved in ribosome biogenesis, e.g. ribosomal protein genes and genes encoding nucleolar-localized proteins. A gene was considered proximal if it was found within 1 kb of the snoRNA gene. Of the 68 intergenically located snoRNA genes, 5 were found close to ribosomal protein genes. Of the 20 genically located snoRNA genes, 3 were found close to ribosomal protein genes and 1 was close to the gene for fibrillarin, a component of the C/D box snoRNP, while of the 6 snoRNA genes located in UTRs, 1 was located close to a ribosomal protein gene (Tables 1, 2, 3). Me-Eh-LSU-U1176a was present close to an rRNA methyltransferase gene. Therefore a substantial number of snoRNA genes were physically close to genes of related function. The remaining snoRNA genes were located close to functionally diverse genes, e.g. genes involved in cellular signal transduction, the DNA (cytosine-5)-methyltransferase gene and heat shock genes. When the genomic location of E. histolytica snoRNA genes was compared with that of other organisms, some striking similarities were observed. For example, the H/ACA snoRNA ACA-Eh-SSU1216 is localized to the ORF of a hypothetical protein and encoded from its opposite strand. Interestingly, the yeast H/ACA snoRNA snR35, which is homologous to ACA-Eh-SSU1216, is also located in an ORF for a hypothetical protein and expressed from the opposite strand[37].
As in E. histolytica, several of the Drosophila snoRNA genes are located in the coding strand of a host gene. It was proposed that in such cases alternative splicing may occur, giving rise to two different RNA species with different functions from the same pre-mRNA: an mRNA translated into a protein, and a small non-messenger RNA (snmRNA) functioning as the snoRNA[35]. A striking feature in P. falciparum is that some of the snoRNA genes are located in the 3’-UTRs. This feature was also found in E. histolytica, where 3 snoRNA genes were localized to 3’-UTRs. Additionally, 3 snoRNA genes were found in 5’-UTRs, a feature not reported in any other system so far. Although we have not experimentally validated the assignment of snoRNA genes to UTRs, these assignments are likely to be correct since we found that the snoRNA genes overlapped with the protein-coding region of the gene as well as the UTR. In one case (Me-Eh-5.8 S-U84 snoRNA, which is transcribed from the opposite strand of the UTR of the receptor protein kinase gene EHI_021310), we have validated the presence of this snoRNA by RT-PCR as well as northern blotting.
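The 1 kb proximity criterion used above can be expressed as a simple interval test. The following Python sketch is illustrative only and is not the authors' pipeline: the `Feature` type, the function names and the example coordinates are hypothetical, and coordinates are assumed to be 1-based and inclusive.

```python
from dataclasses import dataclass

PROXIMITY_BP = 1000  # "within 1 kb", as stated in the text


@dataclass(frozen=True)
class Feature:
    """A genomic feature on one scaffold (1-based, inclusive coordinates)."""
    name: str
    scaffold: str
    start: int
    end: int


def gap_between(a: Feature, b: Feature) -> int:
    """Number of bases separating two features (0 if they overlap or abut)."""
    lo_a, hi_a = sorted((a.start, a.end))  # tolerate start > end (reverse strand)
    lo_b, hi_b = sorted((b.start, b.end))
    return max(0, max(lo_a, lo_b) - min(hi_a, hi_b) - 1)


def is_proximal(sno: Feature, gene: Feature, max_gap: int = PROXIMITY_BP) -> bool:
    """True if both features are on the same scaffold and at most max_gap bases apart."""
    return sno.scaffold == gene.scaffold and gap_between(sno, gene) <= max_gap


# Hypothetical coordinates for illustration (not taken from Tables 1, 2, 3).
sno = Feature("snoRNA_x", "scaffold_1", 5000, 5080)
rp_gene = Feature("ribosomal_protein_y", "scaffold_1", 5900, 6800)
far_gene = Feature("gene_z", "scaffold_1", 9000, 9900)

print(is_proximal(sno, rp_gene))   # separated by 819 bases
print(is_proximal(sno, far_gene))  # separated by 3919 bases
```

Whether the 1 kb distance in the study was measured between the nearest ends of the two genes, as assumed here, is not stated in the text.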

snoRNA genes in other organisms are known to be present in both single and multiple copies, and some may also occur in clusters. In E. histolytica we found that 80% of the genes were single copy while the rest were present in multiple copies. Our data show that in at least two instances the snoRNA genes may be present in clusters and may be co-transcribed. 1) The snoRNA genes ACA-Eh-SSU1212 and ACA-Eh-5.8S84 are 126 bp apart and are transcribed from the opposite strand of the EHI_098580 gene. Because of their proximity and their presence in the opposite strand of the same gene, it is likely that these two genes are transcribed together and exist in a cluster. 2) The four identical copies of the ACA-Eh-LSU2997a snoRNA gene (located in scaffold DS572347) are separated from one another by a sequence of 206–214 bp, which is also identical in the four copies. We tried to locate promoters in the 206–214 bp intergenic region of these snoRNA genes using bioinformatic tools (Promoter 2.0 prediction server, Neural Network Promoter Prediction) but did not find any. The region upstream of the first copy may contain a promoter, but this could not be checked computationally as this region lies right at the start of the scaffold. It is possible that these four genes are co-transcribed as a single (polycistronic) unit and constitute a cluster.

Structural features of E. histolytica box H/ACA and box C/D snoRNAs

H/ACA snoRNAs typically fold into a characteristic hairpin-hinge-hairpin-tail structure in which base-paired stems alternate with single-stranded regions (hinge and tail). The H box is located at the hinge and the ACA box is located in the 3' tail, 3 nt away from the 3' end of the snoRNA[15]. The site for guiding uridine modification of the target RNA is always located 14–16 nt upstream of the H box and/or the ACA box[38, 39]. This guide site consists of a stretch of 8–18 bases that is complementary to the target RNA. It is located in an internal bulge or recognition loop in each hairpin and contacts the target RNA containing the unpaired uridine to be modified. Each H/ACA snoRNA can guide the modification of one or two uridines, which may be located in the same or in different target RNAs. Thus the H/ACA snoRNA may contain one or two functional loops. In E. histolytica all the H/ACA snoRNAs (Table 5) adopted the hairpin-hinge-hairpin-tail structure. Some variations were observed, e.g. in some cases the guide sequence may extend into the adjoining P1 and P2 stems flanking the recognition loop (Additional file 3: Figure S3A)[40]. Of the 43 guide H/ACA snoRNAs in E. histolytica, 5 snoRNAs (ACA-Eh-LSU1107a, ACA-Eh-SSU631, ACA-Eh-LSU2288, ACA-Eh-LSU1159b, ACA-Eh-LSU1107b) possessed both functional antisense regions, which can guide modification of either the same or different substrate rRNAs. For example, ACA-Eh-SSU631 is predicted to guide the modification of uridine in 18S rRNA at two different positions, 631 and 1114, whereas ACA-Eh-LSU2288 can guide the modification of uridine at position 1431 in 18S rRNA and at position 2288 in 28S rRNA (Table 2). Three H/ACA snoRNAs show the potential to direct two pseudouridylations with a single guide sequence (Additional file 6: Figure S5), as has been reported in other organisms, e.g. ACA19 in human[41]. It is proposed that such RNAs fold into alternative structures and thereby target multiple sites.
Overall we found 41 psi sites guided by 43 H/ACA guide snoRNAs. We also found some sites which may be subjected to both methylation and pseudouridylation. In human, position U3797 of 28S rRNA is subjected to methylation as well as pseudouridylation[30]. Similarly in E. histolytica, modification of the residue LSU1176 could be guided by the C/D box snoRNAs Me-Eh-LSU-U1176a, Me-Eh-LSU-U1176b and Me-Eh-LSU-U1176c as well as by an H/ACA box snoRNA, ACA-Eh-LSU1176. The target site corresponding to LSU1176 is known to be methylated in Arabidopsis thaliana (SnoR41Y C/D snoRNA modifying 25S:U1064) and pseudouridylated in S. cerevisiae (snR49 H/ACA snoRNA modifying 25S:U990)[25, 29]. Similarly, modification of the 5.8S84 site could be guided by the C/D box snoRNA Me-Eh-5.8 S-U84 as well as the H/ACA box snoRNA ACA-Eh-5.8S84.
Table 5

Sequences of box H/ACA snoRNA genes in E. histolytica

ACA-Eh-SSU1315

TGCAAGTCTCCAC AGATTGACATAAAGAATGTCTTATCTACTAAGA CTTTGCAAGATTA AAACAAGTTTTAAACTCACGAGTAATATTGAATATTCGTGTTAATAGGGCTTGGAAATA ATC

ACA-Eh-SSU631

ATAAAGTGGAAAATTCTA TGGATGCAAATTTTTTTGCATCTTTTTTCTTT TTTGTAAATTA TTTAGATGCATTTT TTCTTTGCTAATTTTCGTACCCATAAGAAGAAAGAATAACAGAAATTTAA TGTATTATA TTT

ACA-Eh-SSU1727

CTGTGTTTAAAGT CCAAAGATCTTCAGTTATTCGAATTGCTTCTTTGGATAAT GAAAGACAGTAAAATGA GATTGATGTGAACTGTGGGACAACATTCTTGATGTCACTTTCACAATTCACACCAGTTGACA GTC

ACA-Eh-SSU626

TCCACTTCACAAAAATGACACTCATACAGAAGAGTGTGTTTTGGTATTTGACGTAGTGGAAGATTA TTTGCTTAGTAATTC TATTGATATGACTATTTCTATCAATCCTACGA ACTATGCAACA TCA

ACA-Eh-SSU461

TGACTGAGTATGTATTTTGTTCATTTTGTCATCAGCTTGGATATTATTTGTTTATCATTCGATTTAAATAA AATAATAAGGTGTTGT GTTATAATTATAGTTAAGATGGATATAATTCATGACTATC ACCTTATTTACA CCT

ACA-Eh-SSU1675

TGCAGTTATCCCCTCGTTTTAATTAGTATTAAAACGAACCATTATTATACTGCAAAATTA ATTTGCTTTATTTTT AAGGTTTATTTTACTATATTATTTACCTTCTATTTTAA AGCAATAAACA ATT

ACA-Eh-SSU526

GCATAGTTCGTAG GATTGATAGAAATAGTCATATCAATAGAATT ACTAAGCAAATAA TCTTCCACTACGTCAAATACCAAAACACACTCTTCTGTATGAGTGTCATTTTTGTGAAGTGGAATA AAT

ACA-Eh-LSU3008

GGATTTATCGAAG CATTAATATACTGAAGATAGTGATTAATGTCAAA TATAATCCAATAACA GTGGGTAAGAACTTATGATAAAAGTTTTATTTCTTTGAATAAAATTTTATTGTATTACTCTACA TTT

ACA-Eh-LSU1172a

TTATTTGTGAAGTGATTATT AATCAGTTTATATAATTGATTTTAGT CATATTTAATAAATAACA TTTTTGTATGTTTCACATATTTATAATTCATTCATTTTAATTCATAAGTTAATTTATAAATACATACAAAATACA TTT

ACA-Eh-LSU1172b

TATATTATATAATGTCATTGGACTTACTTTTAAATTATCAGAGTGGCACAAAGATTTTATATTTATGACATTA GTCAACAAAGATATTGACTTATT TCGTAATTCTATTATTTATGGAATTGTGATTAGT ATCTAACA ACA

ACA-Eh-LSU1107b

AAATAATTTTTTATTAAT ATTGTTTTTATTTAAAAATACATAAAATGTATTTTTAAATTAGAGGAAATA GAAAATATTTAAAAATAA ATGAATAAATTTATC GATAATTTAACATAACAGTTTGTTTTGTTTATTGGTTTGA AATTCAAACA TCA

ACA-Eh-LSU1650

TACACAATCCAAA GGATGTACAATTTTTTATTTTATGTCCATGTT AATTGTGTGAGAGAA TTCTTGAAATATTGTTTAATTCTTATTGAATTGAAATATTATTTTTCAAGGTACA AAA

ACA-Eh-LSU3087

GGTGCTCCAGCT AGGCTAAACTCTTTTAGTTGTAGACCTCGTT TAAGATCACCTAGAGTA AAAGATATTATGAAAAAAAAAAGAAGACATTATTCAATTAATAATGTTTTAAATTCATAATAAATA AAT

ACA-Eh-LSU2791

AAGTTAGAGTGGAAT GTTTGTTAAACAAAAAGTAGTTTAAAACTACTTAAAATAGTCAA TTTTTAATTTAAATTA ATTGTAGGAGTTGTTGGTTATGTGTTTGAGTAAGTTTAAATTGTTAATTTACTAAAACATGACGAAATCATTTTTTCATAACA AAA

ACA-Eh-LSU3155

GTAGTTCAATTGAAATGATGATAATATTCTCTGTATTCTAATCATTTTATAAATGAAGTGCAGACAATA ATGTTCCAAAGATATCTGATCTA TTGAAATAATGAATTGAATATTTTAATTTGAATTTTAATACTTT TCATATTTTAATA AGA

ACA-Eh-LSU3221

TCTTTGTTTGATT CTATTTTACTTCAAATGAGGAAGTGTAATTCATTGAAGTATTGGTATAGAATAA CCATTAAAAGAAGATAA ATAATTTTAATCAGACTGTACATTGTTTGAAATAAGGAACATGTGTATTTAATTGGAATATAACA ACA

ACA-Eh-LSU1159a

AAATAAAAACAACAA TAATGTTTATATAACATTCAATAAAATATTTTGTTG TTTTTCATTTAAATAAAA TATGTTGTGAAGAATATTTCAAAAAAAGTAGAATTATAGTTGTTTATTTCAAATGAATATGAAATGTTTTTAATAATAAATA ACT

ACA-Eh-LSU2700

TGAGACAGTTTGAAGAATGGACAAATAGAAAGGTAGGAGGTGTATTATTTGATTCGTCTGTCTCTGACTGGAATCAGAGAACA TCTGTTTGTGACA AAATGTTAATTGGGAAAGAGCATATTTTGTTTGTTATAGAAGACA CAA

ACA-Eh-LSU1080

GCTTTCCTT ACAACGGCAAAGACATTTTATTCTTTGTGCAGTGGGAATA GAAAGCTATATTA ATATTGGTGCTTTACCCTTGAAAATTTCTTTTAATTTTTAAGTCAAAAACATCATATA ATT

ACA-Eh-LSU1343

AGATGGTCAA AGTTAGTGTTGCACATATGATGATTTTATAAGCAGTCATATGAAGCCGAATGAAT TTATCTAATACATAGA CTATTATGTATCGCAGCTTAACATCAAAGGTGGAGTTGTGTTATTGATAGATA TAA

ACA-Eh-LSU2997b

GGAGTGATAAAGC GGATTGGTAATAGAATAGTGTTAACAATCTCGTC AGAATCTCCTAGATTGA TTATTATGTTGTATTTTCCCATGAAAAATGAATTCATTTTATCATTTAAAAAATACAATATA TTT

ACA-Eh-LSU339

TGCTACATGTGTTTTTC CATCTTTTTTTGAAGAGACAAAGGATGATTTA GTATGTAGTAATACTA GTGACAAAGAAAATATAATAAGAGAAATGATTAGGTTATATCCTTTAATATTTAATGTTTGTTGTTCTTTTTCAATTACA AAA

ACA-Eh-LSU1123

TGATTGATAGT TTGATTTGGTTTATTCTGAAAATAAAATGTAGAATTATTTATCTT GTCAAACATTGA TAATCAACCGTGCTTATTCATTGTTTCATATTGATATTCTTAATTTCACATTATCAACGAATGAAACGTGTTGTACA AAC

ACA-Eh-LSU1005

TTTTGAGAATTGAAAATATTTATTAAATATTTATTTTATTCAATAGTAAAGGTTTTTAATTTTCAAAACAAGAAAA TAATGTTTGTGATAAAACAAAGT CATTTTTCATCCATAAAATGAAAAAGGAGTTTGC TACAAAAAAATA GTC

ACA-Eh-LSU1236a

TATGGTGTTAGTGTTTGATAGAAAATTTCTTATTCAATACTTCAAGAATATTATTGGACATTTATTATAATAGAA GAACGTGTATT TGTAAAGATAATAGTATTTTTACTTGTTCTGATGCAAGTAT ATGTGTTAATA TAG

ACA-Eh-LSU1236b

TATGGTGTTAGTGTTTGATAGAAAATTTCTTATTCAATACTTCAAGAATATTATTGGACATTTATTATAATAGAA GAATGTGTATT TGTAAAGATAATAGTATTTTTACTTGTTCTGATGCAAGTAT ATGTGTTAATA TAG

ACA-Eh-LSU1107a

GCAAAATATAATAAATGGAAAATTCTA TGGATGCAAATTATTTTGCATCTTTTTTCTTT TTTTGCAAATTA TTTAGATGCATTTTTTCTTTGCTAAT TTTCGTACCCAGTGTGATATGTCAATGAAAATGGAGAATGC AAAAGAATGAATA ATT

ACA-Eh-LSU2288

AGCATATACCTTT CTCACTATATTATTGTAGCGAGACATTCAGAG ATGCTAAGAATA AATGAATATTCATTTAACTTCTCTT TTATTACTTTAATGGTTTGAAAGAA TAATGAATATTCGATA TCT

ACA-Eh-LSU1159b

TTATGTGAAATTCGATAA AATATTTTATTTTTTAAAAAATATTTTGTTG TTTTTCATTTAAATAAAA TAATATGAT GAAAATTTTAAAAGTGAAGAAATGAAGTTATTTATTTCAAATGAATATGAAAGGTTTTTTAATAAT ATTAAATA AAT

ACA-Eh-LSU2997a

TGTTCCTGAAAGC GCAGAGACGCACTAGCGTTGTCTGTCGTCGC ATTGGGAACAACAGAGA AGGATGATTCCATAGTGGGTGAGATGGCAATGATGCTGTTTCAATGTGGGATTGTACA GTT

ACA-Eh-5.8S80a

TATAATGTAAATACT GATATGGGGTTTATGAAAACTATAAACAATATCATTTTATT CATTGTGTAAAGTA ATAAAACACTTTTAATAATAGTACTAAAGTGAAGGGTATTATTTTAGAATATTATGAAAACTGTATA AAA

ACA-Eh-5.8S80b

ATTTTTATTGATGCAAAATATT TAGCAACATTTTTATTGATGCAAAATATTTT GAAATAAAAATAAATCA TCATTTGATATTATTAATATATTTGATAATAAATTATATTATATATTAAATACATCATA TTT

ACA-Eh-SSU740

ACCTCCAAGACATTTCATACTTAAATTAAAACTTAAAGGAAGTTATGATTCTGATGAAGGTAAATTTGGAGGTAAAGAA CAAGGAATATTAGAATTT GATTTCTTTATTAAAACAATTCAAAGTAATA TTCCTAGACA TCT

ACA-Eh-SSU188

AGAGATGTACT TAGTATGGATAACATGTATAGTGCATGGAATCCTCAATTAC TTATTTCTATAAAA ACTCCTCAAGCTGTTAATGTTGCCAATGCTTGTACTCTTACCATTCATTGGCAATTTGATGTTACA TTG

ACA-Eh-SSU1216

GTTATGAAGAGGTTCTATTATCTGTTTATCTATTGAATTATAAGGAACTGGTTCAGGACAGAGAA ATGAATCAGTTGGAGGTGTT TGTTTTTGGAATTCATTAAATGAAAAGAAGAAATTGTCA CAACAGTCTGAAATA AGC

ACA-Eh-SSU299

CCATACGTTCTTTATTGGAGGCAGTCCTTATTTTGTAGAAGAAAATAAGATTTTACTTCAATCTCAAAAAACAAATGGAAACGA AGCGGTTTTAATGACAGA AGAAGAGATGAGATCATTTTATGACATTTTTGATTATTTATCTTCTTCTCA ATTACAAACCACA AAA

ACA-Eh-SSU1212

TTATCATCATCAAAGAAATTTGTTATTAATGATTGCTTTGGTTGTGATGGCTGCATAGGA ACAGCGACTTGGTGTTGATT AAGATAAGGGTTAAAAGTACTTTGTTGAAATTGAG GTTGTTGAATA TAT

ACA-Eh-LSU2809

AGTTACTGTGCAATTTTTTGTGGGTTGAACAGTTTTCCAATTCTGATTAATTGTAAGCATAGAACTAAAACAA TCATCAACAAGAGTCATTTCAGAAT AAACAGTAATAGAGTCACAATCAATTTCTGAATATTTG TTATGTGTTCTTGTACA ATC

ACA-Eh-LSU2335

GATTGAATATTTTCA GTTACTTCATGAACATGAGGTTTAGGAGGAATTGTTACT ATTTGGTTTAAAATAA CAGAATTTATACTTTCTGTTATTGGTTCAAAAAATGAATTTGATGGAAAGATAACACA TAA

ACA-Eh-LSU2493

GGGCTTTAGAGTT GTGTATTTTTTCTTTTAACCAATTTCTACAAATGGTGTGAGCATGGTT ATAAAGTTCAAATGA AAGTGGAGTAGAGGTGTTGTATTTAATCTTATCAAAATCTACTGCTTTATTTAATA AGT

ACA-Eh-LSU1176

GGTGGATATATTGTT AAGAATAGTTTTGGATACCAACATGGTCATTCATTAAAATATTGGCTTCA ACAACATTCTATTAAAGAA GAAATGCAAATATGTCCTAATGTACGTTCGTTTGAACAATGGACTCCATTAGATGAGGATTGTATTAATA AAC

ACA-Eh-LSU2268

TACTAAGACAAATTTGTCCATTTGGAAATACATTTGGATGGAAAAATCCTTCTGGTAAATAACA ACGTGGTGGTGATGCTGGGTA TCCAGTAGAAAATTTCATTTCTACTGGATAAT AACCATTTTCCCATA TCG

ACA-Eh-5.8S84

GTTTTGAGTTATTTTTGAA GATGATTGTTTATTTTCATTTTCATCTTCAC TTTCAAAGCCAAAATCA TCGAACACAGAGTTTTTATCTTTTTGTGGTTCTACAAGGGTTGTAGTTTGACTTGAAGGAATAACATTATTTTGACTAGACA ATC

EhACAOrph1

GCATTGCTTTTTTTGATAAATACTTATTTATTTATCTTCGCCGCAATGCAAGAAAA TATTTCAATTAGCAGTGCTTTCTTTAAAGGAGGAAATCACGATATAATTGAAGACA TTA

EhACAOrph2

TTTCAAATAGAATTTCCCGGAGAAATACCACAAAAGGGTGTGAAATGGGTTTTTGAAATAGAATGA ACTGATTTATTAAATCAGATATGTCGCTTTCAAACGATGGACGACGTATGAGGGGAGTGAATGATA GAA

EhACAOrph3

TTGTTTTTTGATTAAACCACAATTTTTATAATATGAAAAGATAATTGTGTTTGGACAATTTAAAACGAAAAGAA ATAGCGATTTAGGGTAGTTCATTCTATGTAAATATAAATGAACACTATTTAATCGCAATA TTA

EhACAOrph4

GCAAAGGGTTAGTATTTTATTTAGTTATTGAAATTAGATAAAAACACCCTGTGCAAGACAA ACGTGTAGATCCTAATAAAGAGAAGTCTTGTCTATTTCTTTTTATCTGTCTACAAACA AAT

EhACAOrph5

TGAGACTCTACGGTTATTAATTTATATGAATTAATAATAACCGAGTTCTCAAACAGAA AATAATCATATAAGGTATATAAAATAATAACAAATAAGATGTTATGATAATTAGATTATATGATAATA ATT

EhACAOrph6

GACATGCCATAAACAATGTTTTGTATAACATTTACGACTATCATCATAAATGTTTTATAAAACACTCCGTGTCACAGTA TTAAAGTGACCGTAATGTTAGGGAAGTTTCCCGAAAAGTAGGGACAACAAATCCCTAACGACAAAGGTGTCACACA AGT

EhACAOrph7

GTCATCCCTTCAGATCATGGAATTACATTCAACACTAATCTGGGAGATGATGACAAAAATAA TGTCATTGAGGAGCATGATTCATTTGAGTCTGTTGAATATCTTTATGATCGTAATCTCGATGATA ATC

EhACAOrph8

CCAAATAACAAAAAGAAGAGCATTAATTAGAAAGAAAAAGAATGACTAAGGTTATTTGGTAAATTA ATAGTGATAAAAGGAAACATAGTTCAAAAGAGGAGTGAGCTATGTGATTGTTTAACACAACA AAG

EhACAOrph9

GCAAATGATATTCGTATATCAATTTTCAAGTTAATTGATTTGTTATTGTTTGCGAGAAAA ATTAAAGATAGAAGTTATTTATATCTTTTGGTATAAATAAAAGAGAATCTTTGAACA TTA

EhACAOrph10

ATTAGAAGTAAAGTGAGGATAACTTAATAACTCTGTTGTTCTTATTTGTATTGAGTTGGTCAACAGATAA CAATGGACAATTATAATATAAACATTTTATTATATTTGGTGTTTCTAATTTAAATAAAATGTTACATTGTTGAACA ATT

EhACAOrph11

TTGGATTTAATTGTACATTATGTCCAGCTTGTTGAGTTAAATCTGGCAGTGGAATAAGTCCAACAGATAA TAAGAGACAATCACACTCAATTTCATATTCTGTTCCTGCAATTGGTGCAAGTGTCTTTGGATCACA TTT

EhACAOrph12

GGTTTATCATCTTCAAATCCAATGGCTGATGCTATTTCTTTGATTTGGTTAAAAGACTCAAAATA TTCTTCTTCGACAAATTTTGATTGTTCATCTAATTGATGTTTTAATTCTAAAATTTGTTGAATATA ACC

EhACAOrph13

ATTATTTTGGATAATGCTAATGTTGATTTACAGGATGTTATTCGTGATAATGTGAAAATA AAAGTTCATGTTGGTCGTGGTATTGTAGTTGGAGGATTTCAGGGATCGGATGCCGCGGATGTTGAAGCTGCATA TAA

EhACAOrph14

ATCATTAGAACATGTAAATGATGATAGTTCTGTGTCAGAAACACCAAACATCCCTTTTACTTTAGCTGATGATAAAACCA ATTCAATAACTAGTGAAATAGCTTTTTGTTGTTTATTATAATAATATTTATCACTAATACCATTGAAACA AAA

EhACAOrph15

GTAGTGGAACAATAAAATGACTATTAGGTAGTGATAGATAGTCATTATCATCAATAATTATTTTCTCTATTACTACAGCA CTATTTAATATTTGTAATTCTACAGAAGTTTCATTTTTCTTAAGAGTATAAAGAAAAGGTGGATA ATG

Note: Box H and box ACA are depicted in bold. Antisense elements are in italics.
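The placement of the ACA box 3 nt from the 3' end can be checked directly on the entries of Table 5. The Python sketch below is a minimal illustration, not part of the published analysis; the function names are hypothetical, and the pattern accepts ANA rather than strictly ACA because several entries in Table 5 (for example ACA-Eh-SSU1315) carry ATA at this position. The spaces in the table entries, which separate sequence elements, are removed before the check.

```python
import re
from typing import Tuple

# Assumption: accept ANA, since Table 5 contains ATA variants of the ACA box.
ACA_BOX = re.compile(r"A[ACGT]A")
TAIL_LEN = 3  # the ACA box lies 3 nt from the 3' end


def split_aca_tail(table_entry: str) -> Tuple[str, str]:
    """Return (candidate ACA box, 3' tail) from a Table 5 sequence entry."""
    seq = "".join(table_entry.split()).upper()  # drop the spaces between elements
    box = seq[-(TAIL_LEN + 3):-TAIL_LEN]
    tail = seq[-TAIL_LEN:]
    return box, tail


def has_aca_box(table_entry: str) -> bool:
    """True if the 3 nt preceding the 3-nt tail match the ANA pattern."""
    box, _ = split_aca_tail(table_entry)
    return ACA_BOX.fullmatch(box) is not None


# Sequences copied from Table 5.
ACA_EH_SSU1727 = (
    "CTGTGTTTAAAGT CCAAAGATCTTCAGTTATTCGAATTGCTTCTTTGGATAAT GAAAGACAGTAAAATGA "
    "GATTGATGTGAACTGTGGGACAACATTCTTGATGTCACTTTCACAATTCACACCAGTTGACA GTC"
)
ACA_EH_SSU1315 = (
    "TGCAAGTCTCCAC AGATTGACATAAAGAATGTCTTATCTACTAAGA CTTTGCAAGATTA "
    "AAACAAGTTTTAAACTCACGAGTAATATTGAATATTCGTGTTAATAGGGCTTGGAAATA ATC"
)

for name, entry in [("ACA-Eh-SSU1727", ACA_EH_SSU1727), ("ACA-Eh-SSU1315", ACA_EH_SSU1315)]:
    print(name, split_aca_tail(entry), has_aca_box(entry))
```

This only tests the primary-sequence position of the box; it says nothing about the hairpin-hinge-hairpin fold, which requires secondary structure prediction.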

The C/D box snoRNAs typically possess the conserved boxes C (RUGAUGA) and D (CUGA) near the 5' and 3' ends, respectively[1]. A short region upstream of the C box and downstream of the D box usually shows base complementarity. Base pairing in this region brings the C and D boxes close together. In addition to the C and D boxes, some snoRNAs of this class also possess C' and D' boxes, which are less conserved; the boxes are arranged in the order 5’-C/D'/C'/D-3’. The 2'-O-ribose methylation of the target RNA is guided by one or two 10–21 nt antisense elements located upstream of the D and/or D' boxes, such that the modified nucleotide is paired with the snoRNA nucleotide located precisely 5 nt upstream of the D or D' box[3, 4]. All 41 C/D box snoRNAs in E. histolytica had the conserved C box and D box motifs. The C box had the consensus sequence RUGA[U/g/c/a]G[A/u]. The sequence of the D box in two of the C/D box snoRNA genes, Me-Eh-LSU-U3580b and Me-Eh-SSU-U871, was AUGA. All of the other snoRNA genes possessed the consensus CUGA sequence in the D box. 71% of these RNAs possessed the D’ box as well (Table 6). The D' box is much less conserved, and it varied from CUGA to CAGA, UUGA, AUGA, ACCA and CCGA. All the C/D box snoRNAs possessed at least one antisense element upstream of either the D’ box or the D box. The Me-Eh-SSU-A1183 snoRNA gene had two antisense elements and was able to guide different target sites of the same or different rRNAs (Additional file 7: Figure S6A), whereas Me-Eh-SSU-G1535 and Me-Eh-SSU-A790 had a single antisense element upstream of the D’ box which could guide methylation at multiple sites in different rRNAs (Additional file 7: Figure S6B (i-ii)). Five C/D box snoRNAs with a single antisense stretch each were predicted to target different sites in the same target RNA (Additional file 7: Figure S6C (i-v)).
From the predicted folding pattern, 60% of the C/D box snoRNAs possessed the terminal stem, while the rest either lacked it or had an external or an internal stem[42].
Table 6

Sequences of C/D box snoRNA genes in E. histolytica

Me-Eh-SSU-G1296

TGTAATGATGA GATTTTACCATGCACCACT CAGA ATTATCTACCCAAAGATAAGTTGTGTTGATTATGGTGTCTGA AC

Me-Eh-SSU-U1024

CACTGTGATGA AGCTTTTTATCCAATCCT CTGA ATATCGTTGATATTTATCTATGTGGATATTAATGTTGACTTCTGA GT

Me-Eh-SSU-A83

GAAGATGATGA CTAGACTTGGCAGTCTCCCTGTTCGCAGTTTCAT ACTGA ATAAATATGAGGATAAAGGGTTCTGA TT

Me-Eh-SSU-G41

AGAAATGATGA CTTGTGTGCTTAATCTTT GTTGA TTCAAAAATGATAACACTTCTTTAAAGTCTGA TT

Me-Eh-SSU-A431

GCAAATGAGGA AATAAAATTTGGGTAATTTACG TCTGA AATTGATGATAACCATCTGTCGTTCTGA TG

Me-Eh-SSU-U871

AACGATCATGA ATTTTCACCTCTCCCGTTTTTT TCTGA ATCACCCCAATTATTCCTTTTAATCCTTCTCTCGAAATGA TT

Me-Eh-SSU-G1535

TCGAGTGACGA TAAACCACAGACCTGTT CTGA CCTTAATGGAGATAACAGAGCTGGCTCCAATTAGCGCTGGGGCTCTGA CG

Me-Eh-SSU-A27

GTCAGTGATGA TCAATAAATCAGCATATA TCTGA ATAAAGTATGATGGTTTAAGACGGGTCTGA GA

Me-Eh-SSU-A1830

CAATATGATGA AAAAGCACCAACTCACCTCTTTA GATGA TATTCCTGATTTTGATTTTGATGAAATGATTAACCAAACTGA GG

Me-Eh-SSU-A836

CTTTTTGATGA ATAAACTCTTTTAATCTTTCT TTTGA ATTTTCTTTTCTCTTTTTCTTTCTTTTGAATTTTCTTCTAACTTTTCTTTTAGAGGCTTGCTGA GG

Me-Eh-SSU-G1152

GGTAATGATGA TAGAAAGTTTTCAGATTATTAATGAAGACATTTTCAGCCTTGT CTGA GC

Me-Eh-SSU-G628

TAAAATGATGA TTATAGTTTTAATACAAC ATTGA TTTAAATGAAACACACAACTTTCACTAATTTTAATAATCTAATTTTTACAATTAACTCTGA CT

Me-Eh-SSU-A1183

AAAAATGATGA AAAAAGAAAAAAGTCCTGGAGTTCC AACCA GGATGAATATCCATGATGATAAACTAATCTTCTCACTGA TT

Me-Eh-SSU-A790

AGAAGTGATGA TATATAAATTCCATGTTAGAA CTGA TATAACGTGTTGATATTTGTATAAGTCTGA TC

Me-Eh-SSU-C1805

GTAGATGATGA CTTATACGTCGGGCGG ACTGA AAGATTATATGTAGATTCGACGTGTCTGA TA

Me-Eh-LSU-A928a

ACCAATGATGA TTTACATTAAACCATCTTTCG TCTGA AAAACTGATGTCAAATATGTCATAATCTGA GG

Me-Eh-LSU-A928b

TAAGATGATGA TTTGATTCCGTGTTTCG TCTGA ATCCTGGTGAAAACTCGACAATCTTATCTGA TT

Me-Eh-LSU-U1868

TTCTATGATGA TATTTAATGAAAGAAGAAAAGAG TATGA ACTTAACTCAAAAAAATATAACGGTGGTGCTTTACCTAAAATCTCTTTTTTTCGTCCTGA AT

Me-Eh-LSU-U3580a

GAATATGATGA AGTATTTTAATAAGAAATATAATAAATAATAATAGAAAGA ATGA AATAAGATAATATGAAAGAATAAGAAAAATAAAAAGATATAACTGA TG

Me-Eh-LSU-U3580b

GAATATGATGA ATTAATTTAATAAGAAATATAATAAATAATAAAAGAAAGA ATGA AATAAGATAATATGAAATAATAAGAAAATAAAATGATATAAATGATGA TA

Me-Eh-LSU-A785

AGAAATGATGA TAATGTGGTCCGTGTTT CTGA ATACTGAAGAGACTATAACCACTTCTGA TT

Me-Eh-LSU-G2958

AGCAATGAAGA TATACGCAGTTATCCCTGT CCGA GAACTGCAAATGTGGATATGTTAACTAAGTCTGA GC

Me-Eh-LSU-A3089

AGAAATGATGA AATAATACTCAGCTCAC TCTGA ATATAAATGAAGAATGAGTTTCTATATGATTTCTGA TT

Me-Eh-LSU-C2414

GTCTGTGAGGA ATTGAAAGATAGGGACA TCTGA TATAACTGATGTTAAAAATCTTTGATTTGACTGA GA

Me-Eh-LSU-G926

TGAAGTGATGA TCCTTTATTTAAGTGATTAACCATGATAATCATCTTTCGGGT CTGA TT

Me-Eh-LSU-U1018

GAATATGATGA ACTTAATCAATATTCAAATA GCTGA ATAATATGATAAAATGAAAGTCTGTTACTGA AA

Me-Eh-LSU-G1028

TATGATGATGA AATGAGTCTCCGAATAATATTGAGGACAAATCTTTCGCTCCTAT CTGA TT

Me-Eh-LSU-U1176a

TATAATGATGT ATATTTTCTTCATTAACAATTTCTTTGTTTATTTA TTGA ATTTAGTTGATAATTCATTATTAACACTACAACAACGTTTTGAATATCTTTTACTGA AG

Me-Eh-LSU-U1176b

TATTATGATGT ATATTTTATTCATTAACAATTTCTTTGTTTATTTA TTGA ATTTAGTTGATAATTCATTATTAACACTACAACAATGGTTTGAATATCTTTTACTGA AG

Me-Eh-LSU-U1176c

TATAATGATGT ATATTTTCATCATTAACAATTTCTTTGTTTATTTA TTGA ATTTAGTTGATAATTCATTATTAACACTACAACAACGTTTTGAATATCTTTTACTGA AG

Me-Eh-LSU-A2333

TGTAATGATGA GAACTTTATGAATAATAGAGAGGATTCTTATAAAAAGAAGTGGTAATATTCTCGTTTTGAAAATGTTACCAGGGATGAATAATCTCCCTTGATGATTCTTTCATAGTTACT CTGA AC

Me-Eh-LSU-A228

ACATATGATGA ATTTCTTGGAGAACTGAATTTAAA TTGA AGACAATTTATATTATGTTGCAAAGAACTGA TG

Me-Eh-5.8 S-U84

TATAATGATGA TATAAAACAATAAATTATGACTTTTCTTCAATTTTTTGATATTCA CTGA AA

Me-Eh-5.8 S-A92

TGTAGTGATGA TGGAAGAATTAATTCAAATTTT AATGA ATTAGTGTTATATACTGAAAGAGAGAGAATAGATGAGTATTGTGAAAGGTCTAACCTTCCTTTAAATACTACTGA AA

EhCDOrph1

CTAAATGATTTTCTAAATGATGA CTCTTGTGGTGGTTTTGGAGAAGACTGATTTGATGAATAAGAAGATGACCATCCTGA AGAACATTCATTTGG

EhCDOrph2

GACTTGATAGAATTAAGTGATGA CATGTGTTGAACAATCTCTGAGTTTTGATGACAACTTACCTTCGTCTGA TATTTCTTTTTCTTC

EhCDOrph3

AATTAAAAAAATAACAGTGATGA CTTTACTGCGTTATCTTAAGTAGGATTCTTTTATAGTTTCCAGTGATTTCAACTTTCACTTGAGTCTGA GTTATTCTTTTTATA

EhCDOrph4

TTTAATCAAATCCACAGTGATGA AATAACTTGTCTGAGAGTCATTTTTAATCATGATGGCATGTTTTTATTTCTGA GTGGGTTATTTAACT

EhCDOrph5

ATAATAAGATGTAAGAATGATGA AGTTTTTATTAAACTATGAATATTACATGATTACTTGATCCTCTGA CTTACATTTAATTTT

EhCDOrph6

TTTGAATTAGAAGACGATGATGA ATTTGAATTAGAAGACGACGAAGAAGAAGATGATGAATAAATCCTTAAATAACTGA GTGCTTATATTCAAA

EhCDOrph7

TTTGAATTAGAAGACGATGATGA ATTTGAATTAGAAGACGACGAAGAAGAAGATGATGAATAAATCCTTAAATAACTGA GTGCTTATATTCAAA

Note: Box C and box D are depicted in bold. Box C’ and D’ represented in bold and italics. Antisense elements are in italics.
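The box layout described for C/D snoRNAs can be illustrated on a single entry of Table 6. The Python sketch below is a simplified illustration and not a reimplementation of snoscan or CDSeeker: it searches only for the canonical motifs (RUGAUGA and CUGA, written with T for U), takes the first CTGA between box C and box D as a candidate D' box, and reports the nucleotide lying 5 nt upstream of each box. The function names are hypothetical.

```python
import re
from typing import Optional

# DNA-sense versions of the canonical motifs given in the text (U written as T).
C_BOX = re.compile(r"[AG]TGATGA")  # box C, RUGAUGA
D_BOX = "CTGA"                     # box D, CUGA
GUIDE_OFFSET = 5                   # guide nucleotide is the 5th nt upstream of box D/D'


def nt_upstream(seq: str, box_start: int, offset: int = GUIDE_OFFSET) -> Optional[str]:
    """Return the nucleotide `offset` positions 5' of a box start, or None if out of range."""
    idx = box_start - offset
    return seq[idx] if idx >= 0 else None


def find_boxes(table_entry: str) -> dict:
    """Locate canonical C and D boxes (0-based indices) in a Table 6 entry."""
    seq = "".join(table_entry.split()).upper()  # drop the spaces between elements
    c_match = C_BOX.search(seq)                 # first C box from the 5' end
    d_start = seq.rfind(D_BOX)                  # last CTGA, closest to the 3' end
    if c_match is None or d_start < c_match.end():
        raise ValueError("no canonical C box / D box pair found")
    # Simplification: the first CTGA between box C and box D is a candidate D' box.
    d_prime_start = seq.find(D_BOX, c_match.end(), d_start)
    has_d_prime = d_prime_start != -1
    return {
        "c_box": c_match.group(),
        "c_start": c_match.start(),
        "d_start": d_start,
        "d_guide_nt": nt_upstream(seq, d_start),
        "d_prime_start": d_prime_start if has_d_prime else None,
        "d_prime_guide_nt": nt_upstream(seq, d_prime_start) if has_d_prime else None,
    }


# Sequence copied from Table 6 (Me-Eh-LSU-A785).
ME_EH_LSU_A785 = "AGAAATGATGA TAATGTGGTCCGTGTTT CTGA ATACTGAAGAGACTATAACCACTTCTGA TT"

print(find_boxes(ME_EH_LSU_A785))
```

Entries with non-canonical boxes, such as the AUGA D box of Me-Eh-SSU-U871 or the divergent D' boxes listed above, are deliberately not handled and would need relaxed patterns.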

Computational identification and validation of multiple copies of U3 snoRNA in E. histolytica

U3 snoRNA belongs to the C/D box snoRNA category and performs the specialized function of site-specific cleavage of rRNA during pre-rRNA processing. It is present in all eukaryotic organisms either as a single copy or in multiple copies[43]. BLASTn analysis of yeast and human U3 snoRNA against the E. histolytica whole genome revealed the presence of 5 copies of U3 snoRNA (Eh_U3a-e) in E. histolytica. These were 97–99% identical to each other and ranged in size from 209 to 225 nt. All copies were located in intergenic regions (Table 7A) and their sequences are given in Table 7B. The characteristic boxes of E. histolytica U3 snoRNA (GAC, A’, A, C’, B, C and D) were conserved (Figure 4) when compared with U3 snoRNAs of selected organisms (H. sapiens, Leishmania major and Leishmania tarentolae). The Eh_U3 snoRNA was well conserved with respect to T. brucei and T. cruzi[43]. However, it showed poor homology with P. falciparum U3 snoRNA[21]. Sequence conservation was greater at the 5’ end up to the central hinge domain, with less conservation in the 3’ hairpin region. We checked for the conservation of U3 snoRNA among Entamoeba species and found 6 copies of U3 snoRNA with 91% identity in E. dispar (Table 7A) and 1 copy with 96% identity in E. nuttalli. No homology was observed in E. invadens. To validate the predicted U3 snoRNA in E. histolytica we performed RT-PCR and northern blotting with total RNA (Figures 2A, 3A). RT-PCR was performed using specific primers for U3 snoRNAs (Additional file 4: Table S1). The predicted size matched the observed sizes obtained by both RT-PCR and northern blotting. Sequencing of one of the clones of the RT-PCR product confirmed the presence of the Eh_U3e copy of U3 snoRNA.
Table 7

U3 snoRNA genes in E. histolytica

U3 snoRNA genes | Len (nt) | Seq (%) | Scaffold | Start | End | Homology Yeast/Human | Location

A. U3 snoRNA genes

Eh_U3a | 209 | 91% | DS571856 | 3136 | 3344 | snR17a/U3 U3 | IR

Eh_U3b | 225 | 92% | DS571750 | 1819 | 1595 | snR17a/U3 U3 | IR

Eh_U3c | 221 | 91% | DS571479 | 13861 | 14081 | snR17a/U3 U3 | IR

Eh_U3d | 221 | 91% | DS571353 | 16563 | 16343 | snR17a/U3 U3 | IR

Eh_U3e | 225 | 91% | DS571336 | 2559 | 2783 | snR17a/U3 U3 | IR

B. Sequence of U3 snoRNA genes

Eh_U3a

TAGACCGTACTCTTAGGATCATTTCTATAGTACAGTCAATCCATTATCCGTCTTAAAAATAACAACAAGACAATAGGATGAAGACTAAATAACCAACAACACCAACGGGAGATAAACAGTTGGAAACAAATGTACAATGAACGGCTTGAAACAATCTAAAGAAAGAAATTTCTAAAGATGGTTCAAGAGGTGAATGTTAGGGTGTCTGA

Eh_U3b

TAGACCGTACTCTTAGGATCATTTCTATAGTACAGTCAATCCATTATCCGTCTTAAAAATAACAACAAGACAATAGGATGAAGACTAAATAACCAACAACACCAACGGGAGATAAACAGTTGGAAACAAATGTACAATGAACGGCTTGAAACAATCTAAAGAAAGAAATTTCCAAAGAAAGTTCAAGAGGTGATGTTAGGGTGTCTGACTATCTTTTTATGAAAT

Eh_U3c

TAGACCGTACTCTTAGGATCATTTCTATAGTACAGTCAATCCATTATCCGTCTTAAAAATAACAACAAGACAATAGGATGAAGACTAAATAACCGACAGCACCAACGGGAGATAAACAGTTGGAAACAAATGTACAATGAACGGCTTGAAACAATCTAAGGAAAGAAATTTCCAAAGAAGGTTCAAGAGGTGATGTTAGGGTGTCTGACTATCTTTTTATG

Eh_U3d

TAGACCGTACTCTTAGGATCATTTCTATAGTACAGTCAATCCATTATCCGTCTTAAAAATAACAACAAGACAATAGGATGAAGACTAAATAACCAACAACACCAACGGGAGATAAACAGTTGGAAACAAATGTACAATGAACGGCTTGAAACAATCTAAAGAAAGAAATTTCTAAAGATGGTTCAAGAGGTGATGTTAGGGTGTCTGACTATCTTTTTATG

Eh_U3e

TAGACCGTACTCTTAGGATCATTTCTATAGTACAGTCAATCCATTATCCGTCTTAAAAATAACAACAAGACAATAGGATGAAGACTAAATAACCGACAGCACCAACGGGAGATAAACAGTTGGAAACAAATGTACAATGAACGGCTTGAAACAATCTAAGGAAAGAAAATTCTAAAGAAGGTTCAAGAGGTGATGTTAGGGTGTCTGACTATATTTTTACGAAAT

Note: “Len.” denotes length of the snoRNA genes; “Seq.” is sequence identity of corresponding snoRNA genes in E. dispar and “IR”, intergenic region.
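The lengths in Table 7A follow from the scaffold coordinates, which can be verified with a few lines of Python. The sketch below is a consistency check only; the function names are hypothetical, and reading a start coordinate larger than the end coordinate as a gene on the reverse strand is an assumption, since the table does not state strand explicitly.

```python
# (name, scaffold, start, end, reported length in nt), as listed in Table 7A
U3_GENES = [
    ("Eh_U3a", "DS571856", 3136, 3344, 209),
    ("Eh_U3b", "DS571750", 1819, 1595, 225),
    ("Eh_U3c", "DS571479", 13861, 14081, 221),
    ("Eh_U3d", "DS571353", 16563, 16343, 221),
    ("Eh_U3e", "DS571336", 2559, 2783, 225),
]


def span_length(start: int, end: int) -> int:
    """Length of an inclusive coordinate span, regardless of orientation."""
    return abs(end - start) + 1


def strand(start: int, end: int) -> str:
    """Assumption: start > end indicates a gene on the reverse strand."""
    return "+" if start <= end else "-"


for name, scaffold, start, end, reported in U3_GENES:
    computed = span_length(start, end)
    status = "ok" if computed == reported else "mismatch"
    print(f"{name}\t{scaffold}\t{strand(start, end)}\t{computed} nt\t{status}")
```

For all five copies the span computed from the Start and End columns agrees with the Len column.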

https://static-content.springer.com/image/art%3A10.1186%2F1471-2164-13-390/MediaObjects/12864_2012_Article_4555_Fig4_HTML.jpg
Figure 4

Sequence alignment of Eh_U3 snoRNA. Alignment of the Eh_U3 snoRNA sequence with U3 snoRNA of L. major [GenBank: NC_007264, complement (226475–226617)], L. tarentolae [GenBank: L20948] and H. sapiens [GenBank: X14945] is shown. The conserved boxes GAC, A', A, C', B, C and D, along with the central hinge domain and the 3’ hairpin, are indicated.

Conclusion

Ribosome biogenesis in eukaryotic cells requires the activity of a highly conserved set of small RNAs, the snoRNAs. In this study we show that the parasitic protist E. histolytica, thought to be an early branching eukaryote, possesses the major classes of snoRNAs, as judged by sequence conservation with yeast and human. These RNAs are expressed at fairly high levels, as they are readily detectable by northern blots. It is relevant to ask whether E. histolytica, being a human parasite, has evolved any snoRNA features uniquely shared with other parasitic protozoa infecting humans. Amongst these organisms, studies on snoRNAs have mainly been reported for P. falciparum and T. brucei. When the features of E. histolytica snoRNAs are compared with those of these organisms, the following points emerge. In both P. falciparum and E. histolytica some snoRNA genes are located in the 3’-UTR, a property not reported in any other organism except Drosophila[35], where an H/ACA-like snoRNA is reported to be present in a 3’-UTR. In addition, some E. histolytica snoRNA genes are found in the 5'-UTR, which is unique to this organism so far. In both P. falciparum and E. histolytica most (80%) snoRNA genes are present in single copy, whereas in T. brucei most of the snoRNA clusters are repeated in the genome, with few clusters carrying single-copy genes[19]. The clustering of snoRNA genes is frequent in P. falciparum and T. brucei. We have reported two instances in E. histolytica where these genes may be clustered. Unlike P. falciparum, where 9 snoRNA genes are found in introns, we could locate only one snoRNA gene in an intron in E. histolytica, the majority being in intergenic regions; no intronic snoRNA has been reported in T. brucei so far. Like T. brucei, E. histolytica possesses single hairpin H/ACA snoRNAs, which are likely to be processed from a double hairpin pre-H/ACA snoRNA, whereas in P. falciparum single hairpin H/ACA snoRNAs have not been reported.
Unlike T. brucei, which possesses the H/AGA box[36], both P. falciparum and E. histolytica contain the highly conserved H/ACA box. In contrast to P. falciparum and T. brucei, where the number of methylation sites is much larger than the number of psi sites, in E. histolytica we find an almost equal number of both kinds of modifications: 47 methylation sites and 41 psi sites. In overall sequence, E. histolytica snoRNAs are much more homologous to those of yeast and human than to those of P. falciparum and T. brucei.

The greater sequence homology of E. histolytica snoRNAs with yeast and human compared with the two parasite species, and the lack of any particular snoRNA features uniquely shared by all three parasite species, suggest that this highly conserved RNA modification machinery is unlikely to be linked to pathogenesis and that each parasite species has evolved its own distinct snoRNA features. This study will help to further understand the evolution of these conserved RNAs in diverse phylogenetic groups and will be useful in future studies on pre-rRNA processing in E. histolytica.

Methods

Extraction of putative methylation and pseudouridylation sites in rRNA of E. histolytica

We used the known methylation and psi sites of five different eukaryotic organisms (A. thaliana, C. elegans, D. melanogaster, S. cerevisiae and H. sapiens) to find putative methylation and psi sites in E. histolytica rRNA (5.8S, 18S and 28S)[25]. The rRNA of E. histolytica was aligned separately with that of each of the five selected organisms using the EMBOSS pairwise alignment tool (Additional file 1: Figure S1). This gave us 173 putative methylation sites and 126 putative psi sites.
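Transferring a known modification site from a reference rRNA to the E. histolytica rRNA amounts to walking along the pairwise alignment and counting ungapped residues in each sequence. The Python sketch below illustrates this step under stated assumptions: the two gapped strings are taken to come from a pairwise alignment such as the EMBOSS output, the function name `map_site` is hypothetical, and the toy sequences are invented for illustration.

```python
from typing import Optional

GAP = "-"


def map_site(aligned_ref: str, aligned_query: str, ref_pos: int) -> Optional[int]:
    """Map a 1-based position in the ungapped reference onto the ungapped query.

    Returns None if the reference residue is aligned to a gap in the query,
    or if ref_pos lies outside the reference sequence.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    ref_i = query_i = 0
    for ref_char, query_char in zip(aligned_ref, aligned_query):
        if ref_char != GAP:
            ref_i += 1
        if query_char != GAP:
            query_i += 1
        if ref_char != GAP and ref_i == ref_pos:
            return query_i if query_char != GAP else None
    return None


# Toy alignment (invented): the query carries a 2-nt insertion after position 2.
ref_aln = "AC--GU"
query_aln = "ACUUGU"

print(map_site(ref_aln, query_aln, 3))  # reference G at position 3 maps to query position 5
```

A mapped position is only a candidate site: the residue found in the query still has to be compatible with the modification (for example, a uridine for a psi site).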

Search for E. histolytica C/D box snoRNAs

Snoscan and CDSeeker were used to score potential guide and orphan C/D box snoRNAs, respectively, from the whole genome sequence (WGS) of E. histolytica. The WGS was downloaded from NCBI [NCBI:AAFB00000000] (updated on April 17, 2008). The tools were initially run on this file, and the results obtained were checked periodically online against the updated genome file. Snoscan is based on a greedy search algorithm. It identifies six features in the genome: box C, box D, a region of sequence complementary to the target RNA, box D' if the rRNA-complementary region is not adjacent to box D, the predicted methylation site based on the complementary region, and the terminal stem, if present[23]. CDSeeker can be used to find both guide and orphan C/D box snoRNAs, but in the present study it was used only to find orphan C/D box snoRNAs in E. histolytica. The CDSeeker program combines a probabilistic model with conserved primary and secondary structure motifs to search for orphan C/D snoRNAs in a whole genome sequence. It searches for the same features described for snoscan, but for orphan C/D box snoRNAs it looks for a predicted conserved functional region next to box D or D' (if D' is present)[24]. Both tools require the genomic DNA sequence and rRNA sequences as input (optional for CDSeeker). All hits that scored higher than 14 bits were selected as positive guide C/D box snoRNAs[26]. For orphan C/D box snoRNAs, the score threshold was set to 18 bits. These thresholds are the value used for S. cerevisiae (for guide snoRNAs) and the default value used in CDSeeker (for orphan snoRNAs). BLASTn analysis of the predicted snoRNAs against the EST database of E. histolytica supported the authenticity of the predicted snoRNAs. To assess homology with the closely related species E. dispar, E. nuttalli and E. invadens, we performed BLASTn analysis of selected snoRNAs against the WGS of E. dispar SAW760 (NCBI: AANV02000000), E. nuttalli P19 (AGBL01000000) and E. invadens IP1 (NCBI: AANW02000000).

Search for E. histolytica H/ACA box snoRNAs

ACASeeker was used to screen for potential guide and orphan H/ACA box snoRNAs in the same manner as described above for CDSeeker. The ACASeeker program combines a probabilistic model with conserved primary and secondary structure motifs to search for orphan and guide H/ACA snoRNAs in a whole genome sequence. It identifies the following features, which are common to both orphan and guide H/ACA box snoRNA genes: box H, box ACA, hairpin 1, hairpin 2, and hairpin-hinge-hairpin[24]. For guide snoRNA genes, one further feature was taken into account: two regions of sequence complementary to the target RNA in a hairpin. This tool takes the WGS and, optionally, a list of putative psi sites as input. We provided the list of putative psi sites (as obtained in method section 1). On this basis, 186 snoRNAs matching putative sites were predicted as guide H/ACA snoRNAs, and 475 snoRNAs with no putative sites were predicted as orphan H/ACA snoRNAs. The threshold values were 40 bits and 27 bits for guide and orphan H/ACA snoRNAs, respectively, which were the cutoffs used to train the snoSeeker software on vertebrate snoRNAs. The snoRNAs were further analyzed for genomic localization in introns, intergenic regions or the ORFs of protein coding genes. BLASTn analysis of the predicted snoRNAs against the EST database of E. histolytica was used to confirm the authenticity of the predictions. To look for homologs in the closely related species E. dispar, E. nuttalli and E. invadens, we performed BLASTn analysis of selected snoRNAs against the WGS of E. dispar SAW760 (NCBI: AANV02000000), E. nuttalli P19 (AGBL01000000) and E. invadens IP1 (NCBI: AANW02000000).
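The genomic localization step mentioned above (intron, intergenic region or ORF) can be sketched as a simple interval-overlap check. This is an illustrative assumption about how such a classification might be done, not the procedure used in the study. The `Feature` record, the contig names and the coordinates are hypothetical, coordinates are taken as 1-based and inclusive, and giving introns priority over ORFs when both overlap is a design choice made here for the example.

```python
from typing import List, NamedTuple


class Feature(NamedTuple):
    """Hypothetical annotated genome feature (1-based, inclusive)."""
    contig: str
    start: int
    end: int
    kind: str  # "intron" or "ORF"


# Assumption: an intron hit is reported in preference to the enclosing ORF.
PRIORITY = ("intron", "ORF")


def overlaps(start_a: int, end_a: int, start_b: int, end_b: int) -> bool:
    """True if two closed intervals share at least one position."""
    return start_a <= end_b and start_b <= end_a


def classify_location(contig: str, start: int, end: int,
                      features: List[Feature]) -> str:
    """Return "intron", "ORF" or "intergenic" for a candidate snoRNA."""
    hit_kinds = {
        f.kind
        for f in features
        if f.contig == contig and overlaps(start, end, f.start, f.end)
    }
    for kind in PRIORITY:
        if kind in hit_kinds:
            return kind
    # No annotated feature overlaps the candidate.
    return "intergenic"


if __name__ == "__main__":
    annotation = [
        Feature("contig_1", 100, 500, "ORF"),
        Feature("contig_1", 200, 260, "intron"),
    ]
    for coords in [(210, 250), (300, 380), (600, 700)]:
        print(coords, classify_location("contig_1", *coords, annotation))
```

A linear scan over all features is adequate for a few hundred candidates; an interval tree would be the usual replacement for larger annotation sets.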

Validation of snoRNAs by RT-PCR and northern hybridization

Total RNA was isolated from mid-log phase trophozoites (~5 × 10^6 cells) using TRIzol reagent (Invitrogen) as per the manufacturer's instructions. A DNase I (Roche)-treated RNA sample (5 μg) was reverse transcribed at 37°C using MMLV reverse transcriptase (USB) with specific reverse primers, following the manufacturer's protocol, followed by PCR with forward primers. PCR with genomic DNA was used as a control. Oligonucleotides used for the RT and RT-PCR reactions are listed in Additional file 4: Table S1. For northern analysis, total RNA and total RNA enriched in small RNA were isolated from ~5 × 10^6 cells using TRIzol (Invitrogen) and the miRNA isolation kit (Ambion), respectively, as per the manufacturers' instructions. 15 μg of total RNA enriched in small RNA was resolved on a 12% denaturing urea PAGE gel. For Eh_U3 snoRNA, 10 μg of total RNA was electrophoresed on a 1.2% denaturing agarose gel and transferred to GeneScreen Plus® membrane (Perkin Elmer). Probes were prepared by the random priming method (NEBlot kit). Hybridization was carried out in buffer (1 M NaCl and 0.5% SDS) at 42°C for 36 h. Post-hybridization washing of the membrane was done as per the manufacturer's instructions. The blot was exposed for 48 h in the imaging plate of a phosphorimager for autoradiography.

Declarations

Acknowledgements

This work was supported by a grant to SB from DST and DBT, by fellowships from DBT to DK and RS, and by fellowships from CSIR to VK and AKG. We gratefully acknowledge helpful discussions with Dr. P. C. Mishra.

Authors’ Affiliations

(1) School of Environmental Sciences, Jawaharlal Nehru University
(2) School of Life Sciences, Jawaharlal Nehru University

References

  1. Balakin AG, Smith L, Fournier MJ: The RNA world of the nucleolus: two major families of small RNAs defined by different box elements with related functions. Cell. 1996, 86: 823-834. 10.1016/S0092-8674(00)80156-7.
  2. Ganot P, Bortolin ML, Kiss T: Site-specific pseudouridine formation in preribosomal RNA is guided by small nucleolar RNAs. Cell. 1997, 89: 799-809. 10.1016/S0092-8674(00)80263-9.
  3. Kiss-László Z, Henry Y, Bachellerie JP, Caizergues-Ferrer M, Kiss T: Site-Specific Ribose Methylation of Preribosomal RNA: A Novel Function for Small Nucleolar RNAs. Cell. 1996, 85: 1077-1088. 10.1016/S0092-8674(00)81308-2.
  4. Cavaillé J, Nicoloso M, Bachellerie JP: Targeted ribose methylation of RNA in vivo directed by tailored antisense RNA guides. Nature. 1996, 383: 732-735. 10.1038/383732a0.
  5. Hughes JM, Ares M: Depletion of U3 small nucleolar RNA inhibits cleavage in the 5’ external transcribed spacer of yeast preribosomal RNA and impairs formation of 18S ribosomal RNA. EMBO J. 1991, 10: 4231-4239.
  6. Kass S, Tyc K, Steitz JA, Sollner-Webb B: The U3 small nucleolar ribonucleoprotein functions in the first step of preribosomal RNA processing. Cell. 1990, 60: 897-908. 10.1016/0092-8674(90)90338-F.
  7. Mougey EB, Pape LK, Sollner-Webb B: A U3 small nuclear ribonucleoprotein-requiring processing event in the 5’ external transcribed spacer of Xenopus precursor rRNA. Mol Cell Biol. 1993, 13: 5990-5998.
  8. Peculis BA, Steitz JA: Disruption of U8 nucleolar snRNA inhibits 5.8S and 28S rRNA processing in the Xenopus oocyte. Cell. 1993, 73: 1233-1245. 10.1016/0092-8674(93)90651-6.
  9. Tycowski KT, Shu MD, Steitz JA: Requirement for intron-encoded U22 small nucleolar RNA in 18S ribosomal RNA maturation. Science. 1994, 266: 1558-1561. 10.1126/science.7985025.
  10. Morrissey JP, Tollervey D: Yeast snR30 is a small nucleolar RNA required for 18S rRNA synthesis. Mol Cell Biol. 1993, 13: 2469-2477.
  11. Dunbar DA, Baserga SJ: The U14 snoRNA is required for 2'-O-methylation of the pre-18S rRNA in Xenopus oocytes. RNA. 1998, 4: 195-204.
  12. King TH, Liu B, McCully RR, Fournier MJ: Ribosome structure and activity are altered in cells lacking snoRNPs that form pseudouridines in the peptidyl transferase center. Mol Cell. 2003, 11: 425-435. 10.1016/S1097-2765(03)00040-6.
  13. Kishore S, Stamm S: The snoRNA HBII-52 Regulates Alternative Splicing of the Serotonin Receptor 2C. Science. 2006, 311: 230-232. 10.1126/science.1118265.
  14. Kiss-László Z, Henry Y, Kiss T: Sequence and structural elements of methylation guide snoRNAs essential for site-specific ribose methylation of pre-rRNA. EMBO J. 1998, 17: 797-807. 10.1093/emboj/17.3.797.
  15. Ganot P, Caizergues-Ferrer M, Kiss T: The family of box ACA small nucleolar RNAs is defined by an evolutionarily conserved secondary structure and ubiquitous sequence elements essential for RNA accumulation. Genes Dev. 1997, 11: 941-956. 10.1101/gad.11.7.941.
  16. Filipowicz W, Pogacić V: Biogenesis of small nucleolar ribonucleoproteins. Curr Opin Cell Biol. 2002, 14: 319-327. 10.1016/S0955-0674(02)00334-4.
  17. Leader DJ, Clark GP, Watters J, Beven AF, Shaw PJ, Brown JW: Clusters of multiple different small nucleolar RNA genes in plants are expressed as and processed from polycistronic pre-snoRNAs. EMBO J. 1997, 16: 5742-5751. 10.1093/emboj/16.18.5742.
  18. Dieci G, Preti M, Montanini B: Eukaryotic snoRNAs: a paradigm for gene expression flexibility. Genomics. 2009, 94: 83-88. 10.1016/j.ygeno.2009.05.002.
  19. Liang XH, Uliel S, Hury A, Barth S, Doniger T, Unger R, Michaeli S: A genome-wide analysis of C/D and H/ACA-like small nucleolar RNAs in Trypanosoma brucei reveals a trypanosome-specific pattern of rRNA modification. RNA. 2005, 11: 619-645. 10.1261/rna.7174805.
  20. Mishra PC, Kumar A, Sharma A: Analysis of small nucleolar RNAs reveals unique genetic features in malaria parasites. BMC Genomics. 2009, 10: 68. 10.1186/1471-2164-10-68.
  21. Chakrabarti K, Pearson M, Grate L, Sterne-Weiler T, Deans J, Donohue JP, Ares M: Structural RNAs of known and unknown function identified in malaria parasites by comparative genomics and RNA analysis. RNA. 2007, 13: 1923-1939. 10.1261/rna.751807.
  22. Raabe CA, Sanchez CP, Randau G, Robeck T, Skryabin BV, Chinni SV, Kube M, Reinhardt R, Ng GH, Manickam R, Kuryshev VY, Lanzer M, Brosius J, Tang TH, Rozhdestvensky TS: A global view of the nonprotein-coding transcriptome in Plasmodium falciparum. Nucleic Acids Res. 2010, 38: 608-617. 10.1093/nar/gkp895.
  23. Schattner P, Brooks AN, Lowe TM: The tRNAscan-SE, snoscan and snoGPS web servers for the detection of tRNAs and snoRNAs. Nucleic Acids Res. 2005, 33: W686-W689. 10.1093/nar/gki366.
  24. Yang JH, Zhang XC, Huang ZP, Zhou H, Huang MB, Zhang S, Chen YQ, Qu LH: snoSeeker: an advanced computational package for screening of guide and orphan snoRNA genes in the human genome. Nucleic Acids Res. 2006, 34: 5112-5123. 10.1093/nar/gkl672.
  25. snoRNA orthological gene database. http://snoopy.med.miyazaki-u.ac.jp/
  26. Lowe TM, Eddy SR: A computational screen for methylation guide snoRNAs in yeast. Science. 1999, 283: 1168-1171. 10.1126/science.283.5405.1168.
  27. Eo HS, Jo KS, Lee SW, Kim CB, Kim W: A combined approach for locating box H/ACA snoRNAs in the human genome. Mol Cells. 2005, 20: 35-42.
  28. Bachellerie JP, Cavaillé J, Hüttenhofer A: The expanding snoRNA world. Biochimie. 2002, 84: 775-790. 10.1016/S0300-9084(02)01402-5.
  29. Piekna-Przybylska D, Decatur WA, Fournier MJ: New bioinformatic tools for analysis of nucleotide modifications in eukaryotic rRNA. RNA. 2007, 13: 305-312. 10.1261/rna.373107.
  30. Lestrade L, Weber MJ: snoRNA-LBME-db, a comprehensive database of human H/ACA and C/D box snoRNAs. Nucleic Acids Res. 2006, 34: D158-D162. 10.1093/nar/gkj002.
  31. Darty K, Denise A, Ponty Y: VARNA: Interactive drawing and editing of the RNA secondary structure. Bioinformatics. 2009, 25: 1974-1975. 10.1093/bioinformatics/btp250.
  32. Takano J, Tachibana H, Kato M, Narita T, Yanagi T, Yasutomi Y, Fujimoto K: DNA characterization of simian Entamoeba histolytica-like strains to differentiate them from Entamoeba histolytica. Parasitol Res. 2009, 105: 929-937. 10.1007/s00436-009-1480-3.
  33. Wang Z, Samuelson J, Clark CG, Eichinger D, Paul J, Van Dellen K, Hall N, Anderson I, Loftus B: Gene discovery in the Entamoeba invadens genome. Mol Biochem Parasitol. 2003, 129: 23-31. 10.1016/S0166-6851(03)00073-2.
  34. Bhattacharya A, Satish S, Bagchi A, Bhattacharya S: The genome of Entamoeba histolytica. Int J Parasitol. 2000, 30: 401-410. 10.1016/S0020-7519(99)00189-7.
  35. Yuan G, Klämbt C, Bachellerie JP, Brosius J, Hüttenhofer A: RNomics in Drosophila melanogaster: Identification of 66 candidates for novel non-messenger RNAs. Nucleic Acids Res. 2003, 31: 2495-2507. 10.1093/nar/gkg361.
  36. Liang XH, Liu L, Michaeli S: Identification of the first trypanosome H/ACA RNA that guides pseudouridine formation on rRNA. J Biol Chem. 2001, 276: 40313-40318.
  37. Li SG, Zhou H, Luo YP, Zhang P, Qu LH: Identification and Functional Analysis of 20 Box H/ACA Small Nucleolar RNAs (snoRNAs) from Schizosaccharomyces pombe. J Biol Chem. 2005, 280: 16446-16455. 10.1074/jbc.M500326200.
  38. Bortolin ML, Ganot P, Kiss T: Elements essential for accumulation and function of small nucleolar RNAs directing site-specific pseudouridylation of ribosomal RNAs. EMBO J. 1999, 18: 457-469. 10.1093/emboj/18.2.457.
  39. Ni J, Tien AL, Fournier MJ: Small nucleolar RNAs direct site-specific synthesis of pseudouridine in ribosomal RNA. Cell. 1997, 89: 565-573. 10.1016/S0092-8674(00)80238-X.
  40. Wu H, Feigon J: H/ACA small nucleolar RNA pseudouridylation pockets bind substrate RNA to form three-way junctions that position the target U for modification. Proc Natl Acad Sci USA. 2007, 104: 6655-6660. 10.1073/pnas.0701534104.
  41. Xiao M, Yang C, Schattner P, Yu YT: Functionality and substrate specificity of human box H/ACA guide RNAs. RNA. 2009, 15: 176-186.
  42. Darzacq X, Kiss T: Processing of intron-encoded box C/D small nucleolar RNAs lacking a 5', 3’-terminal stem structure. Mol Cell Biol. 2000, 20: 4522-4531. 10.1128/MCB.20.13.4522-4531.2000.
  43. Charette JM, Gray MW: Comparative analysis of eukaryotic U3 snoRNA, U3 snoRNA genes are multi-copy and frequently linked to U5 snRNA genes in Euglena gracilis. BMC Genomics. 2009, 10: 528. 10.1186/1471-2164-10-528.
  44. Huang ZP, Chen CJ, Zhou H, Li BB, Qu LH: A combined computational and experimental analysis of two families of snoRNA genes from Caenorhabditis elegans, revealing the expression and evolution pattern of snoRNAs in nematodes. Genomics. 2007, 89: 490-501. 10.1016/j.ygeno.2006.12.002.

Copyright

© Kaur et al.; licensee BioMed Central Ltd. 2012

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.