Design and evaluation of Actichip, a thematic microarray for the study of the actin cytoskeleton
BMC Genomics volume 8, Article number: 294 (2007)
The actin cytoskeleton plays a crucial role in supporting and regulating numerous cellular processes. Mutations or alterations in the expression levels affecting the actin cytoskeleton system or related regulatory mechanisms are often associated with complex diseases such as cancer. Understanding how qualitative or quantitative changes in expression of the set of actin cytoskeleton genes are integrated to control actin dynamics and organisation is currently a challenge and should provide insights in identifying potential targets for drug discovery. Here we report the development of a dedicated microarray, the Actichip, containing 60-mer oligonucleotide probes for 327 genes selected for transcriptome analysis of the human actin cytoskeleton.
Genomic data and sequence analysis features were retrieved from GenBank and stored in an integrative database called Actinome. From these data, probes were designed using a home-made program (CADO4MI) allowing sequence refinement and improved probe specificity by combining the complementary information recovered from the UniGene and RefSeq databases. Actichip performance was analysed by hybridisation with RNAs extracted from epithelial MCF-7 cells and human skeletal muscle. Using thoroughly standardised procedures, we obtained microarray images with excellent quality resulting in high data reproducibility. Actichip displayed a large dynamic range extending over three logs with a limit of sensitivity between one and ten copies of transcript per cell. The array allowed accurate detection of small changes in gene expression and reliable classification of samples based on the expression profiles of tissue-specific genes. When compared to two other oligonucleotide microarray platforms, Actichip showed similar sensitivity and concordant expression ratios. Moreover, Actichip was able to discriminate the highly similar actin isoforms whereas the two other platforms did not.
Our data demonstrate that Actichip is a powerful alternative to commercial high density microarrays for cytoskeleton gene profiling in normal or pathological samples. Actichip is available upon request.
The actin cytoskeleton is a highly dynamic network of protein polymers extending throughout the cytoplasm. It not only provides structural support for the cell, but also plays a central role in key cell processes including cellular morphogenesis, migration, division and cell communication. The actin cytoskeleton generates forces required for membrane extension and remodelling, motor protein-dependent cell contraction or membrane trafficking. Recently, a nuclear function was identified for actin in the organisation of chromatin and gene expression [2, 3]. In cells, the assembly and disassembly of actin filaments and their organisation into higher-order networks is regulated by actin-associated proteins which, in turn, are controlled by specific signalling pathways [1, 4]. The formation of membrane-cytoskeleton specialisations not only depends on the spatio-temporally controlled recruitment of actin-binding proteins to cellular subdomains, but also on the repertoire of specific sets of cytoskeleton and regulatory proteins that cells express at a given state. Accordingly, temporally and spatially regulated expression of cytoskeletal genes is observed during embryonic development or terminal differentiation of cells in adults.
The central role of the actin cytoskeleton in many essential cellular processes makes the system susceptible to mutations and alterations of gene expression level that may cause a wide range of diseases, including muscular dystrophies, amyloidosis, haematological disorders and cancers [5, 6]. Many of these diseases arise from aberrant cell morphogenesis, motility or communication caused by deregulation of actin dynamics or organisation. For example, deregulated cell motility is a typical hallmark of tumour invasion and metastasis characterising cancer malignancy. Recent studies demonstrated that tumour cell progression correlates with alterations of the expression profile of actin cytoskeleton genes and genes of upstream regulatory pathways [6–8]. Similarly, altered expression of genes encoding cytoskeletal proteins of the contractile system of muscle cells is observed in cardio-vascular disorders like heart failure. Therefore, cytoskeleton proteins are potential markers for cell differentiation or disease, and might constitute promising novel targets for therapeutic treatments.
The basic set of structural and signalling protein components of the actin cytoskeleton is now identified and information on their biochemical or biological activities is available. However, gaps and controversies remain on how qualitative or quantitative changes in expression of these proteins are integrated to control actin dynamics and organisation in space and time. Elucidating the intricate interplay between the cytoskeletal components that cells use to build up various cellular structures is hampered by the complexity of the actin cytoskeleton system. In this context, gene expression profiling using microarrays has the potential to yield a global overview of the set of actin cytoskeleton genes expressed by a cell at a given physiological or pathological state. The technique allows global and parallel investigations of cellular activity, and was used successfully to characterise the molecular basis of a variety of complex experimental models and diseases. Results obtained in previous profiling studies with high-density microarrays underline the potential of this approach for detecting changes in the repertoire of expression of the cytoskeleton genes [7, 8].
Using an optimised experimental approach, we developed Actichip, a custom oligonucleotide microarray designed to study the expression of actin cytoskeleton genes in various cell systems. Actichip represents 327 human genes, most of them encoding proteins that bind directly to actin and control actin dynamics or organisation, while the others are involved in signalling, cell-cell or cell-matrix adhesion. In parallel, we developed Actinome, a knowledge database that integrates information on the target genes, including genomic data and sequence analysis features retrieved from GenBank, and biological function, when available. We determined the performance characteristics of Actichip and compared them with those of two other academic or commercial oligonucleotide arrays. Our data indicate that Actichip exhibits solid performance that makes it a valid platform for studying the human transcriptome of the actin cytoskeleton.
To facilitate the setting up of Actichip, an integrative database called Actinome was implemented, cataloguing genomic data and sequence analysis features of the human genes related to the actin cytoskeleton. We also considered some key marker genes including adhesion receptors, metalloproteases or extracellular matrix proteins that are involved in actin-based processes like morphogenesis or cell migration. Gene selection was performed using the GenBank database (release 134), and was based on a combination of biological knowledge, literature data, Gene Ontology (GO) terms and keywords in the NCBI database. Searches were restricted to genes encoding proteins of the major functional groups regulating the dynamics and organisation of the actin cytoskeleton (Table 1). Actinome was built following a robust protocol as described in the "Methods" section. To date, the database compiles a set of 327 non-redundant entries and related data such as mRNA and protein identifiers, gene names and descriptions, cytobands and gene ontology annotations. Actinome is freely available.
Oligonucleotide probe design
We decided to use long oligonucleotide probes to build Actichip because of the numerous advantages they offer when compared to PCR amplicons. Being fully custom-designed, they have more uniform hybridisation characteristics and yield less non-specific hybridisation and misidentification of gene transcripts, while exhibiting similar sensitivity.
Genomic databases are still prone to modifications. While reorganisations and changes in transcript sequences or identifiers may account for erroneous results in microarray studies, using sequence-verified probes was shown to improve microarray measurement accuracy and consistency [15, 16]. Therefore, we decided to mine transcript information for probe design from several databases. Although many programs to design oligonucleotide probes were publicly available at the time of our study [17–23], none of these programs allowed such an application. Therefore, we implemented a new, freely available program named CADO4MI (Computer-Assisted Design of Oligonucleotides for Microarrays). Like most of the existing programs, CADO4MI uses variations of the same algorithm and common criteria to design specific oligonucleotides with optimised hybridisation features through a multistep procedure (see "Methods"). In contrast to these programs, however, CADO4MI can compute probe sets for the same query genes using two or more databases simultaneously in order to select optimal sequences. Visualisation, comparison and integration of the different probe sets are greatly facilitated by a powerful graphical user interface. The program also incorporates several other useful features such as an automatic search for missing sequences in the reference databases and the possibility to compute and display melting temperature (Tm) or GC content curves for an individual gene query or, alternatively, for the entire set of sequences. These curves are helpful in selecting the appropriate parameters for the design of probes. CADO4MI was used successfully in a recent study to select automatically a set of PCR primers designed for the resequencing of Interrupted CoDing Sequences (ICDS).
We designed 60-mer oligonucleotides using the Reference Sequence database (RefSeq) and the UniGene database because of their complementary features. While the former gives access to non-redundant and well-annotated sequences including pseudogenes and splice variants, but is not yet exhaustive, the latter compiles comprehensive gene-oriented clusters of sequences but with more redundancy and incomplete or erroneous annotations. The Actichip probe set was designed to target each of the 327 genes defined in the Actinome database with a single oligonucleotide without discriminating splice variants. This was achieved for 301 entries, while the probes designed for the remaining 26 genes were shown to target more than one sequence in at least one reference database. For these 26 genes we therefore selected several oligonucleotides per gene: 2 for 22 genes, 3 for 3 genes and 5 for one gene, resulting in a total set of 359 oligonucleotides (Table 1). In addition, we used part of a set of viral and bacterial probes described as having no similarity with human transcripts and sequences of human genes reported as being housekeeping genes to generate 41 negative and 32 positive controls, respectively.
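The probe tally above can be checked with a few lines of arithmetic. The sketch below only re-adds the counts quoted in the text; the dictionary layout and variable names are illustrative and not part of CADO4MI or Actinome.

```python
# Genes grouped by the number of probes selected per gene, as quoted in the text.
# Layout: {probes per gene: number of genes}
probes_per_gene = {1: 301, 2: 22, 3: 3, 5: 1}

total_genes = sum(probes_per_gene.values())
multi_probe_genes = total_genes - probes_per_gene[1]
total_probes = sum(n_probes * n_genes for n_probes, n_genes in probes_per_gene.items())

print(total_genes, multi_probe_genes, total_probes)  # prints: 327 26 359
```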
Evaluation of Actichip performance
To evaluate the experimental performance of Actichip microarrays, a series of repeated hybridisation experiments, including dye swaps, were carried out with optimised target labeling, hybridisation, scanning and data analysis protocols (see "Methods"). All the procedures were standardised to limit the impact of experimental bias or biological variations on data. The same series of high quality RNA samples purified from human breast adenocarcinoma MCF-7 cells and from human skeletal muscle was used in our experiments. Epithelial cells and skeletal muscle tissue were chosen because they express well-characterised sets of cytoskeleton genes, and were anticipated to give well-contrasted differential expression data when analysed with Actichip.
Actichip image quality and data reproducibility
Microarray images exhibited good and reproducible quality parameters in both channels (Figure 1). Background values were low and signals showed maximum dynamic range, resulting in signal-to-background and signal-to-noise ratios higher than 20 and 50 in each channel, respectively. Actichip microarray images were quantified using the Genepix Pro 6.0 software. Negative and irrelevant spots were removed from the dataset as described in the "Methods" section, yielding approximately 80 % of positive features of which 55–60 % were found to be relevant.
To assess the reproducibility of Actichip data, we analysed triplicate spots on the Actichip array (intra-array reproducibility) and repeated hybridisations (technical inter-array reproducibility). Focusing on the intrinsic performance of Actichip, we did not investigate the impact of the hybridisation of different samples (biological inter-array reproducibility). We calculated the standard deviation (STD) and coefficient of variation (CV) between the normalised Log2 ratio values, excluding the irrelevant signals. As shown in Table 2, the microarrays gave reproducible results with good intra- and inter-assay STD and CV. Analysis of variance (ANOVA test) showed that the series of normalised signal ratios was highly similar (p < 0.05). The variability in data was also examined in the Acuity 4.0 program using the hierarchical clustering of array experiments with the average linkage and the Pearson correlation coefficient as similarity metric. Two main clusters were identified for correlation coefficients > 0.95 in the dendrogram resulting from this analysis, each corresponding to either normal or dye-swap experiments (Figure 2). The correlation coefficients calculated from microarray hybridisations ranged from 0.95 to 0.99, indicating that the assays were highly comparable. Data from assays performed at different time periods were indistinguishable. Although limited, the greatest source of variability in our dataset was the dye exchange, indicating that a slightly uneven incorporation of the dyes in samples occurred during our experiments. This dye bias was compensated through a ratio-based, global normalisation of the data.
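As an illustration of the reproducibility metrics named above, the following minimal sketch computes the STD and CV for one probe spotted in triplicate. It assumes the sample standard deviation (n - 1 denominator) and a CV expressed as a percentage of the absolute mean; the text does not state which estimator was used, and the input values are hypothetical, not data from this study.

```python
import statistics

def std_and_cv(values):
    """Return (sample standard deviation, coefficient of variation in %)."""
    mean = statistics.mean(values)
    std = statistics.stdev(values)  # sample estimator, n - 1 denominator (assumption)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return std, 100.0 * std / abs(mean)

# Hypothetical normalised log2 ratios for one probe spotted in triplicate
triplicate = [2.0, 2.1, 1.9]
std, cv = std_and_cv(triplicate)
print(f"STD = {std:.3f}, CV = {cv:.1f} %")
```

Because the CV divides by the mean, it becomes unstable for log2 ratios close to zero, so it is best read alongside the STD rather than on its own.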
Signal linearity and detection limit
To investigate the dynamic range of Actichip and to determine the span of this dynamic range, we used seven Arabidopsis thaliana polyadenylated RNA species referred to as spike RNAs that were in vitro synthesised from plasmids provided by the Institute for Genomic Research. The spike RNAs were calibrated and were combined to construct seven complex mixes, each mix containing six of the seven spike RNAs in staggered concentrations ranging from 10⁻¹ to 10⁴ copies per cell (cpc). An eighth sample was prepared, called the reference sample, consisting of the mix of all spike RNAs at a concentration of 10² cpc [see Additional file 1]. Thereby, the comparison of any of the seven RNA samples to the reference sample should theoretically yield signal ratios ranging from 10⁻³-fold to 10²-fold.
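The expected ratios follow directly from the spike design: each concentration is divided by the reference concentration of 10² cpc. A short sketch of that calculation is given below; the function and variable names are illustrative only.

```python
import math

REFERENCE_CPC = 10**2  # all spike RNAs are at 10^2 cpc in the reference sample

def expected_log2_ratio(sample_cpc, reference_cpc=REFERENCE_CPC):
    """Theoretical log2 ratio of a spike RNA against the reference sample."""
    return math.log2(sample_cpc / reference_cpc)

# Six staggered concentrations, 10^-1 ... 10^4 cpc, one per spike RNA in a mix
concentrations = [10.0**e for e in range(-1, 5)]

for cpc in concentrations:
    fold = cpc / REFERENCE_CPC
    print(f"{cpc:g} cpc -> {fold:g}-fold, log2 ratio = {expected_log2_ratio(cpc):.2f}")
```

The theoretical log2 ratios therefore span roughly -10 (10⁻³-fold) to +6.6 (10²-fold).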
The graph shown in Figure 3 is a summary of a complete hybridisation series where each curve represents the signal ratios associated with one of the seven spike RNAs. Actichip arrays displayed a near perfect dynamic range over three logs (10–10⁴ cpc) and the experimental Log2 ratios match well the expected theoretical values. A wider spread of the curves was observed for some spike RNAs indicating that their sequences might favour hybridisation signal accuracy. Actichip arrays accurately yielded ratios for spike RNAs at the highest concentration (10⁴ cpc, 10²-fold ratio) with no saturation effect. The lower limit of linearity of the dynamic range was around 10 cpc, and the signals reached a bottom plateau with values close to background noise, marking the limit of sensitivity between 1 and 10 cpc.
To validate the reliability of Actichip data, we analysed our dataset using the Significance Analysis of Microarrays algorithm (SAM). SAM analysis resulted in a list of 106 and 176 genes found significantly expressed in skeletal muscle and MCF-7 cells, respectively (Δ = 1.85; FDR = 0 %). This list was highly enriched in marker genes characteristic for either epithelial or skeletal muscle cells (Table 3), in good agreement with the expression patterns expected from a priori reasoning based on biological knowledge. Importantly, we obtained similar results through SAM analysis using three randomly chosen experiments over the entire series of assays (data not shown), revealing that a limited number of repeats would be sufficient to obtain reliable data with Actichip.
The major difficulty in designing oligonucleotide probes for Actichip arose from the appreciable number of highly similar genes found in the actin cytoskeleton gene family. This is exemplified by the actin gene family which is composed of six different isoform genes sharing not only high sequence identity at the protein level (> 95% Id.), but also at the nucleic acid level (> 85% Id.). For these genes, sequence identity ranges between 91 and 99% of the Coding Sequence (CDS), and between 57 and 83% of the total mRNA sequence, hampering the design of specific probes.
To evaluate the specificity of the Actichip probes targeting the actin isoforms, PCR fragments corresponding to the target regions of the transcripts were obtained using as template cDNA generated from HeLa cell poly(A) RNA. The purified PCR products were labeled with Alexa dyes through direct covalent linkage and hybridised to Actichip microarrays. As shown in Figure 4, each PCR fragment bound to the corresponding probe. No cross-hybridisation was observed either within the set of actin probes or with the other oligonucleotides included in the chip. These results demonstrated that the oligonucleotide probes were fully specific and indicated that the procedure used to design the probes in CADO4MI was robust.
Cross platform comparison
To further evaluate the performance of Actichip, we compared the microarray with two other well-established oligonucleotide-based platforms (Table 4). We used 25-mer oligonucleotide commercial chips (Affymetrix, human genome U133A 2.0) and academic arrays prepared with a 70-mer oligonucleotide set (Operon, human whole genome version 2.0). PCR-amplicon arrays were not considered to avoid bias generated by this format of probe, as a result of either sequence errors or cross-hybridisations. Hybridisation replicates were carried out under optimised and standardised protocols specific to each platform using the RNA sample sets analysed with Actichip. Microarray image analysis and data extraction were performed using dedicated methods and software (see "Methods").
Results obtained with Actichip, Affymetrix and Operon arrays were markedly comparable (Figure 5). The percentage of probes found to be relevant in Genepix ranged from 53 to 55 % for MCF-7-derived RNAs whereas it varied from 57 to 60 % for the skeletal muscle sample. Compared to the Operon platform, the Actichip and Affymetrix arrays gave the best inter-assay reproducibility with only 5 % of discordant data between the series of experiments.
Cross-platform comparison was achieved considering only the genes represented simultaneously on each of the three arrays. These genes were identified by comparing the target gene accession numbers and/or the sequences used to design the probes as detailed in the "Methods" section. Among the 327 genes represented on Actichip, 304 were also targeted by the Affymetrix U133A 2.0 array and 294 by the Operon array, while 275 genes were represented simultaneously on the three platforms. We calculated the degree of concordance between the expression patterns as the ratio of the number of genes simultaneously found expressed or not expressed in our samples by two or three of the microarray platforms to the total number of genes commonly represented by these platforms. We found 49 % of concordant genes between Actichip and Affymetrix, 35 % between Actichip and Operon, 45 % between Affymetrix and Operon, and 24 % when considering all platforms. Results were comparable for the two RNA samples we analysed. The Pearson correlation coefficients (r) calculated using the median expression Log2 ratios of the set of concordant genes were 0.88 between Actichip and Affymetrix, 0.63 between Actichip and Operon, and 0.67 between Affymetrix and Operon. These data indicated that Actichip microarrays performed as well as the Affymetrix platform, while Operon arrays were less reliable under our experimental conditions.
As a good indicator of platform specificity, we found that the expression patterns obtained with Actichip for the actin isoforms perfectly matched the profiles described in the literature. Cytoplasmic actin isoforms (ACTB and ACTG1) were detected in both samples as anticipated from their ubiquitous expression (Table 5). Two of the four muscle actin isoforms (ACTA1 and ACTC1) were found to be expressed in skeletal muscle but not in MCF-7 cells while the others (ACTA2 and ACTG2) were not detected in any sample. These expression patterns were further confirmed by PCR using cDNAs obtained from our samples (data not shown). Conversely, the aortic smooth muscle (ACTA2) and gamma enteric smooth muscle (ACTG2) actin isoforms were incorrectly identified in the MCF-7 and skeletal muscle samples using the Affymetrix chips. With the Operon arrays, the alpha skeletal muscle (ACTA1) and aortic smooth muscle (ACTA2) actins were inaccurately detected in the MCF-7 or skeletal muscle RNAs, respectively. In addition, Operon arrays were unable to detect the gamma actin (ACTG1) in both samples.
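The two comparison metrics described above can be sketched as follows. This is a simplified reading of the concordance definition (same expressed / not expressed call on both platforms, over the genes they share) together with a plain Pearson coefficient. The gene names and calls are hypothetical placeholders, not results from this study, and the exact rules used to make the calls are not reproduced here.

```python
import math

def concordance(calls_a, calls_b):
    """Fraction of shared genes given the same expressed/not-expressed call."""
    shared = calls_a.keys() & calls_b.keys()
    if not shared:
        raise ValueError("the two platforms share no genes")
    agree = sum(1 for gene in shared if calls_a[gene] == calls_b[gene])
    return agree / len(shared)

def pearson(x, y):
    """Pearson correlation coefficient of two equally long value lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Hypothetical calls (True = expressed) on two hypothetical platforms
platform_a = {"gene1": True, "gene2": True, "gene3": False, "gene4": False}
platform_b = {"gene1": True, "gene2": False, "gene3": False, "gene4": True, "gene5": True}

print(concordance(platform_a, platform_b))  # 2 of the 4 shared genes agree
```

In the study, the correlation was computed only on the concordant genes, so in practice `pearson` would be applied to the median Log2 ratios of the genes for which the calls agree.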
Microarray analysis is a powerful methodology for high throughput gene expression study which contributes to the understanding of complex events or biological systems. In the present paper we describe the design and benchmarking of a custom-made oligonucleotide microarray named Actichip as a tool to study the actin cytoskeleton in normal or pathological situations.
We designed, produced and evaluated Actichip using optimised and standardised experimental procedures and a data evaluation pipeline we established according to the guidelines developed by the Microarray Gene Expression Data (MGED) Society. Actichip hybridisation signals obtained with our optimised experimental settings were of high quality (Figure 1), leading to accurate and highly reproducible quantification of gene expression levels (Table 2, Figure 2). Importantly, our data indicated that two or three replicates would be sufficient for reliable measurements when applying the standardised procedures we established. Consistent with recent studies [37–40], our results show that a thorough standardisation of the array and experiment design, protocols and data analysis procedures can greatly improve microarray data quality and comparability. This is crucial for the generation of a meaningful universal gene expression index based on the exchange and integration of data between microarray platforms and laboratories.
The reliability and sensitivity of gene expression measurements are other important issues when using microarrays. In this study, we analysed two well-contrasted RNA samples, each characterised by a specific organisation of their actin cytoskeleton and by known marker genes. Many of these genes were found significantly expressed using Actichip (Table 3), underlining the reliability of this array as a transcriptome analysis platform and its value for the characterisation and classification of biological samples based on their transcriptome profiles. Our data further showed that Actichip not only detects reliably qualitative gene expression changes, but has also the potential to accurately measure the amplitude of these variations (Figure 3). In addition, we determined that Actichip has the potential to identify transcripts over a biologically meaningful range including high, intermediate and rare abundance classes of RNAs.
The fraction of probes on an array that yield a significant hybridisation signal can be used as a measure of platform sensitivity. We found that the proportion of detectable genes ranged from 53 to 60 % with the Actichip, Affymetrix and Operon microarrays (Figure 5), indicating that the reactivity of the three platforms is similar. These results are in good agreement with data from similar studies [16, 41], and suggest that a significant fraction of cytoskeletal genes were not expressed, or were expressed at very low levels, in our samples, consistent with the concept that only part of the genome is usually expressed in a given differentiated cell line or tissue.
Comparison of the expression profiles obtained from the three platforms revealed a moderate concordance between the datasets, the best score (49 %) being observed between the Actichip and Affymetrix arrays. Nevertheless, we found good correlations between the relative expression data from the different arrays when considering the subset of concordant genes. The correlation in gene expression levels between the Actichip and Affymetrix arrays was particularly strong and was comparable to those reported in similar studies for best performing arrays [16, 43–46]. Identifying the source of variability between the different microarray platforms was not straightforward since many factors could have influenced the expression data. Indeed, microarray platforms differ on numerous technological aspects including array format and fabrication, protocols and instrumentation, as well as computational and statistical tools. It has been shown that these differences could account, at least in part, for discrepancy in the data generated by different array technologies [33, 47–50]. Although we carefully standardised our protocols, we could not avoid some differences in the procedures specific for each platform. Biases in our data may partly result from dissimilarities between the methods we used to generate and label the samples or from differences in sensitivity between the procedures we applied to acquire and analyse the data.
We found that 7.0 % or 10.1 % of the Actichip targets were not represented in the Affymetrix GeneChip or Operon array, respectively [see Additional file 2]. This result is not surprising considering that the three array platforms were implemented using different databases or different releases of the same database (Table 4) harbouring modifications of transcript sequences, identifiers or annotations. However, our data question the reliability of the high throughput design of pangenomic probe libraries. Focusing on a limited, easy-to-handle set of genes constitutes a more careful and robust approach. In line, several focused microarrays were recently described as powerful alternatives to whole genome arrays to study complex biological systems [45, 51, 52].
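These percentages follow from the gene overlaps reported in the cross-platform comparison (304 and 294 of the 327 Actichip genes). A short check, with illustrative variable names:

```python
ACTICHIP_GENES = 327
# Number of Actichip genes also targeted by each platform, as reported in the text
shared_with = {"Affymetrix U133A 2.0": 304, "Operon": 294}

def percent_missing(total, shared):
    """Percentage of Actichip targets absent from another platform."""
    return 100.0 * (total - shared) / total

missing = {name: round(percent_missing(ACTICHIP_GENES, n), 1)
           for name, n in shared_with.items()}
print(missing)  # {'Affymetrix U133A 2.0': 7.0, 'Operon': 10.1}
```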
On the other hand, many of the genes represented on Actichip are highly similar and are not easy to discriminate using long oligonucleotide microarrays. When considering the actin gene family, only very limited regions of the transcript sequences can be used for the design of probes with convenient physical properties and specificity. To design high quality probes, we developed the CADO4MI program which allows a validation of oligonucleotides by cross-comparison of their sequences with data from several reference databases. For 219 of the 327 target genes represented on Actichip, combining information available from the UniGene and RefSeq databases actually allowed us to select probes with an enhanced specificity compared to those obtained using only one database. The fact that Actichip was able to differentiate the highly similar actin isoforms confirms that CADO4MI generates highly specific probes (Figure 4, Table 5). By contrast, some probes specific for the actin isoforms in the Affymetrix GeneChip and in the Operon set target regions having a high degree of similarity with several unrelated transcripts. As a consequence, these probes may generate false positive data due to cross-reactivity. This could explain the erroneous detections of some actin isoforms we observed with the Affymetrix or Operon platforms. Accordingly, probe sequence alignment showed that the ACTA2 Operon probe actually has the potential to cross-hybridise with several transcripts [see Additional file 3]. By using the probe match tool at the NetAffx analysis center, we also found that the ACTA2 and ACTG1 probe sets from the U133A GeneChip both perfectly match the ACTA2 mRNA. However, our data showed that the specificity of a probe cannot be simply inferred from its design characteristics. Although giving false positives in our study, the ACTA1 Operon probe appeared to be specific as judged by sequence alignment [see Additional file 3], and the ACTG2 Affymetrix probe set perfectly matched the corresponding transcript sequence.
It is conceivable that using the latest versions of commercial arrays based on better-quality genome assembly and annotations or on new design concepts may improve measurement accuracy and sensitivity. As an illustration, the GeneChip Exon array recently designed by Affymetrix with over six million probes targeting all annotated and predicted exons in the human genome appears as a promising tool to investigate both gene expression and alternative splicing with a high resolution. Data from the literature show that this chip may provide more accurate gene expression measurements than traditional microarrays [53, 54], but requires a more complex strategy for the analysis of expression data [53, 55]. Complex and time-consuming analysis is a typical trait of high density microarrays and often represents the bottleneck of pangenomic expression studies. In the particular context of studies focusing on a limited number of genes, thematic arrays offer the possibility to overcome these limitations.
Altogether, our data indicate that the tools and procedures we implemented in the course of our study constitute a powerful approach for the design of thematic arrays. Our data show that Actichip displays solid performance characteristics that make it a valid platform for functional genomics studies of the actin cytoskeleton in the context of basic or clinical research. Compared to high density microarrays, Actichip has the potential to facilitate gene expression data analysis because of its reasonable size. With the capacity to screen up to four samples in parallel, Actichip also helps to lower the cost of the analysis.
Implementation of the Actinome database
Genomic data and sequence analysis features of the genes included in the Actichip microarray were compiled in the Actinome database. To ensure robustness of the database, only mRNA sequences or complete coding cDNA sequences (CDS) were retrieved from the NCBI database, excluding expressed sequence tags (ESTs), sequence tagged sites (STS), genome sequence survey (GSS), working drafts and patent sequences. Each protein was annotated with high confidence through association with the corresponding full-length transcript and protein accession number using the RetScope platform, an in-house eukaryotic sequence analysis platform written in Tcl/Tk. Briefly, the protein accession numbers and sequences were first derived from the GenBank entries found by BLASTN homology searches from initial sequences. The protein sequences were also inferred from the initial sequences by BLASTX searches in the UniProt database. A double cross-validation was performed by assessing (i) sequence identity of the BLASTX- and BLASTN-derived proteins and (ii) their association with the same chromosomal localisation on the human genome. As part of the RetScope platform, the GOAnno module was finally used to annotate each protein entry with the corresponding Gene Ontology terms. GO annotations were retrieved using high quality multiple alignment of complete sequences computed for each protein.
Oligonucleotide probe design
Oligonucleotide probes (60-mer) were designed using the program CADO4MI . The program was written in Tcl/Tk and was successfully tested on various operating systems such as Windows (Microsoft), Solaris (Sun), Tru64UNIX (Compaq) and Linux (Fedora core 5, openSUSE 10.2). CADO4MI designed probes corresponding to a set of query sequences using different adjustable parameters through the following 4 steps [see Additional file 4] : (i) Poly(A) tails were masked in the query sequences (>15 consecutive adenosine bases), (ii) potential cross-hybridisations were detected using the entire sequence as a query in the BLASTN program , (iii) CADO4MI performed sequence analysis by moving iteratively (10 bases) over the query sequence a sliding window with a size corresponding to the oligonucleotide length (60-mer). At each iteration, sequences with no stretch of 7 or more contiguous identical bases, 35 %<GC content <70 %, 87°C< Tm < 97°C, and distance relative to 3'end ≤ 3.000 bases were selected. The melting temperature (Tm) was evaluated by the nearest-neighbor model combined with the unified parameters defined by SantaLucia . The oligonucleotide specificity was checked through BLASTN by applying the Kane's rules to the selected sequences . The process resulted in a list of oligonucleotides potentially targetting one or more sequences. (iv) The procedure was completed by selecting the oligonucleotide having the minimum number of non specific target and smallest distance to the 3' end of the query. Being a critical point in the design, the specificity of the probes was assessed automatically by identifying the complementary target sequences referenced in two nucleotide databases, RefSeq (release 1) and UniGene (build 161). Probes with unique targets in both databases were automatically selected by CADO4MI, whereas the others were curated manually.
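The window-scanning part of this procedure (steps i and iii) can be sketched in a few lines. This is not the CADO4MI code, which was written in Tcl/Tk; it is a minimal Python illustration of the filters quoted above. Several points are assumptions: masked bases are written as N, the distance to the 3' end is measured from the end of the window to the end of the query, and the Tm filter is left as an optional user-supplied function (`tm`) rather than re-implementing the nearest-neighbor model. The BLASTN specificity checks of steps ii and iii are not covered.

```python
import re

def mask_poly_a(seq, min_run=16):
    """Step (i): replace runs of more than 15 consecutive A with N."""
    return re.sub(r"A{%d,}" % min_run, lambda m: "N" * len(m.group()), seq)

def gc_percent(seq):
    """GC content of a sequence, in percent."""
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)

def has_homopolymer(seq, run=7):
    """True if the sequence contains `run` or more contiguous identical bases."""
    return re.search(r"(.)\1{%d,}" % (run - 1), seq) is not None

def candidate_windows(seq, length=60, step=10, max_3p_distance=3000, tm=None):
    """Step (iii): yield (start, distance to 3' end, window) for passing windows."""
    seq = mask_poly_a(seq.upper())
    for start in range(0, len(seq) - length + 1, step):
        window = seq[start:start + length]
        distance_3p = len(seq) - (start + length)  # assumed definition
        if "N" in window or has_homopolymer(window):
            continue
        if not 35.0 < gc_percent(window) < 70.0:
            continue
        if distance_3p > max_3p_distance:
            continue
        if tm is not None and not 87.0 < tm(window) < 97.0:
            continue  # tm is a hypothetical callable returning degrees Celsius
        yield start, distance_3p, window

# Toy query sequence (not a real transcript), scanned without a Tm filter
demo = "ACGT" * 30
for start, distance_3p, window in candidate_windows(demo):
    print(start, distance_3p)
```

Step (iv) would then rank the surviving windows by their number of non-specific targets and by `distance_3p`, keeping the best one per gene.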
Oligonucleotides were synthesized, 3'-end amino (C6)-modified and HPLC-purified by Eurogentec (Seraing, Belgium). Microarrays were manufactured by contact printing using a Microgrid II microarrayer equipped with 2500 split pins (Genomic Solutions, Huntingdon, United Kingdom). Oligonucleotides were spotted in triplicate onto epoxide-coated glass slides (ArrayIt, Sunnyvale, CA, USA) at a concentration of 25 μM in microspotting plus solution (ArrayIt). The library was printed with two array patches per slide, each containing 32 human housekeeping genes as positive controls together with 41 sequences corresponding to viral or bacterial genes as negative controls. In addition, 10 spiking controls corresponding to Arabidopsis thaliana genes were incorporated in each array to assess the quality of microarray hybridisations as described below. Sequences of the corresponding 60-mer oligonucleotides were derived from those of 70-mer probes described elsewhere, and had no homology with any known human transcript sequence as assessed by a BLASTN analysis. Spotting was performed at a constant temperature of 22°C with 50% controlled humidity. Following arraying, the slides were dried overnight and were stored desiccated at room temperature. Actichip is available upon request.
Cell culture and RNA sample preparation
Human breast adenocarcinoma MCF-7 cells (ATCC number HTB-22) were grown to 70–80 % confluency in Dulbecco's Modified Eagle's Medium (DMEM), 10 % Fetal Bovine Serum, 4 mM L-Glutamine, 100 units/ml penicillin G sodium and 100 μg/ml streptomycin sulfate. All cell culture reagents and buffers were purchased from Cambrex (Verviers, Belgium). Cells were washed twice with cold phosphate-buffered saline (PBS), and total RNA was extracted using the TRIzol reagent (Invitrogen, Merelbeke, Belgium) according to the manufacturer's instructions. A single batch of poly(A+) RNA was prepared from total RNA using the polyA purist kit from Ambion (Huntingdon, United Kingdom), and was stored frozen at -80°C in DEPC-treated water until use. Commercial poly(A+) RNA purified from human skeletal muscle was obtained from Ambion. RNA integrity and concentration were evaluated by the Agilent Bioanalyser 2100 capillary electrophoresis RNA 6000 nano assay (Agilent Biotechnologies, Diegem, Belgium). High quality RNAs with a ribosomal RNA ratio greater than 1.9 and no evidence of degradation were used in this study.
Gene expression profiling experiments
Gene profiling experiments were performed using procedures specific for either the Actichip microarray or the whole human genome array manufactured by the genomics laboratory at the university medical center of Utrecht (UMCU, The Netherlands) [see Additional file 5]. Briefly, 2 μg of poly(A+) RNA were reverse-transcribed using the Superscript II reverse transcriptase (Invitrogen) and were labeled with Alexa fluor 555 or 647 NHS-ester dyes (Invitrogen). The hybridisation was carried out at 42°C for 20 h in a Slidebooster 800 (Advalytix, Brunnthal, Germany) with a regular microagitation of the sample. Slides were scanned immediately after post-hybridisation washing using a Genepix 4000B microarray fluorescence reader (Molecular Devices, Sunnyvale, CA, USA) at a resolution of 10 μm. A percentage of saturated pixels of 0.1 was tolerated during the image acquisition to allow detection of the lowly expressed transcripts. Images were quantified using the Genepix Pro 6.0 software (Molecular Devices). For each spot, the local median background was subtracted from the median foreground signal. A spot was considered as positive when (i) more than 55 % of its foreground pixels were above the background level + 1 standard deviation (STD), (ii) less than 3 % of its foreground pixels were saturated, (iii) the coefficient of determination of the linear regression between the populations of foreground pixels in the two channels, computed using the least-squares method, was above 0.5, and (iv) the median pixel intensity at each wavelength, after subtraction of the median background pixel intensity at that wavelength, was above 500. A positive feature was rated as relevant when the mean foreground intensity was above the mean foreground signal calculated from all negative control spots in at least one channel. Independence of the Log2 ratio from signal intensity in both colour channels was assessed by displaying M-A plots, i.e. Log2 ratio = f(A) with A = Log2(median signal intensity at 532 nm × median signal intensity at 635 nm)/2.
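The spot acceptance criteria and the M-A coordinates can be summarised in a short Python sketch. The function and parameter names are hypothetical, and the inputs are assumed to be the per-spot values exported by the image analysis software. Two further assumptions are made: the regression value of criterion (iii) is taken to be a single number per spot, and the Log2 ratio is oriented as 635 nm over 532 nm.

```python
import math

def is_positive_spot(pct_above_bg_1sd, pct_saturated, regression_value,
                     bg_subtracted_medians):
    """Apply criteria (i) to (iv) to one spot.

    pct_above_bg_1sd      : % of foreground pixels above background + 1 STD
    pct_saturated         : % of saturated foreground pixels
    regression_value      : regression statistic between the two channels
    bg_subtracted_medians : background-subtracted medians, one per wavelength
    """
    return (pct_above_bg_1sd > 55
            and pct_saturated < 3
            and regression_value > 0.5
            and all(v > 500 for v in bg_subtracted_medians))

def ma_values(i532, i635):
    """Return (M, A) for one spot from its two median signal intensities."""
    m = math.log2(i635 / i532)        # Log2 ratio (assumed orientation: 635/532)
    a = 0.5 * math.log2(i532 * i635)  # mean Log2 intensity
    return m, a
```

For a spot with median intensities of 1024 at 532 nm and 4096 at 635 nm, `ma_values` gives M = 2 and A = 11. A cloud of points centred on M = 0 across the whole range of A indicates that the ratio does not depend on signal intensity.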
To compensate for dye bias, fluorescence data were subjected to a ratio-based, global normalisation considering the median intensity ratios of housekeeping genes. Genes detected in fewer than 2/3 of the arrays or exhibiting absolute Log2 ratios < 1 were filtered out. Clustering and graphical analysis of the remaining gene expression data were performed using the Acuity 4.0 software (Molecular Devices). Microarray data were in compliance with the standards proposed by the Microarray Gene Expression Data Society, and were deposited in the ArrayExpress public repository.
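A minimal Python sketch of this normalisation and filtering step is given below. It is only an illustration and does not reproduce the Acuity implementation. It assumes that the global normalisation amounts to subtracting the median Log2 ratio of the housekeeping genes from every Log2 ratio of the array, and that the Log2 ratio threshold is applied to the mean over the arrays in which the gene was detected. Function names and the data layout (one dictionary per array, `None` for undetected genes) are hypothetical.

```python
from statistics import median

def normalise_log_ratios(log_ratios, housekeeping_ids):
    """Centre one array on the median Log2 ratio of its housekeeping genes."""
    hk = [log_ratios[g] for g in housekeeping_ids
          if log_ratios.get(g) is not None]
    shift = median(hk)
    return {g: (None if v is None else v - shift)
            for g, v in log_ratios.items()}

def filter_genes(arrays, min_fraction=2 / 3, min_abs_log2=1.0):
    """Keep genes detected in at least 2/3 of the arrays with |mean Log2 ratio| >= 1.

    Returns a dict mapping each retained gene to its mean Log2 ratio.
    """
    genes = sorted({g for a in arrays for g in a})
    kept = {}
    for g in genes:
        values = [a[g] for a in arrays if a.get(g) is not None]
        if len(values) / len(arrays) < min_fraction:
            continue  # detected in too few arrays
        mean_ratio = sum(values) / len(values)
        if abs(mean_ratio) < min_abs_log2:
            continue  # change too small
        kept[g] = mean_ratio
    return kept
```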
Single color microarray
Transcriptome analyses were carried out using the GeneChip HG-U133A 2.0 (Affymetrix, Santa Clara, CA, USA). Sample labeling, hybridisation and staining were performed at the Institut de Génétique et de Biologie Moléculaire et Cellulaire (IGBMC, Strasbourg, France) according to the eukaryotic target preparation protocol in the Affymetrix technical manual for GeneChip expression analysis [see Additional file 5]. Briefly, 200 ng of purified poly(A+) RNA were linearly amplified to generate biotin-labeled cRNA. Upon fragmentation, labeled cRNA was hybridised for 16 h at 45°C according to the manufacturer's protocol. The hybridised arrays were washed and stained with streptavidin-phycoerythrin (Invitrogen), and signal was amplified with biotinylated anti-streptavidin antibodies (Sigma, Bornem, Belgium) using a GeneChip fluidics station 400 (Affymetrix) according to the manufacturer's protocol. The arrays were scanned using a GeneChip scanner 3000 (Affymetrix) at a resolution of 1.56 μm. The fluorescent intensity of each probe was quantified using the Microarray Analysis Suite version 5.0 (MAS 5.0) software (Affymetrix) according to the manufacturer's instructions. The expression level of a single mRNA, defined as the signal, was determined by the MAS 5.0 software, which uses a weighted average of the fluorescence intensity differences obtained among the 11 probe pairs that interrogate the expression of each individual gene. This software also makes a detection call (present [P], marginal [M], or absent [A]) for each gene or probe set, based on the consistency of the performance of the individual probe pairs, the hybridization above background, and the signal-to-noise ratio. Data analysis was performed using default parameters (Tau = 0.015).
Significance analysis of microarray data
The significance level achieved with Actichip microarrays was evaluated by analysing the Log2 ratios from replicated experiments using the Significance Analysis of Microarrays algorithm (SAM) in Microsoft Excel (add-in version 2.21). SAM uses a modified t-test to determine, for each gene represented on an array, the relative difference in gene expression d(i), taking into account both the absolute level of expression and the standard deviation of the replicates. SAM then estimates the expected relative difference in expression de(i) for each gene by analysing permutations of the measurements. On a plot of de(i) vs. d(i), genes identified simply by chance are aligned on the d(i) = de(i) line, whereas the potentially significant genes are represented by points displaced from this line by a distance greater than a threshold Δ. SAM also gives access to the percentage of genes found to be significant by chance, the false discovery rate (FDR).
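The principle of the d(i) versus de(i) comparison can be sketched in Python for the one-class case (replicated Log2 ratios tested against zero). This is a simplified illustration and not the SAM add-in itself: the fudge factor `s0` and the threshold `delta` are passed as hypothetical fixed parameters, whereas SAM estimates s0 from the data and derives its cut-offs and the FDR from the permutations. The "permutations" used here are all possible sign flips of the replicates, an assumption suited to small numbers of replicates.

```python
import itertools
import math
from statistics import mean, stdev

def d_stat(values, s0):
    """Relative difference d = mean / (standard error + s0) for one gene."""
    return mean(values) / (stdev(values) / math.sqrt(len(values)) + s0)

def sam_one_class(data, s0=0.1, delta=1.0):
    """data maps each gene to its list of replicate Log2 ratios.

    Returns the observed d values (sorted, with gene names), the expected
    order statistics de, and the genes displaced from d = de by more than delta.
    """
    genes = list(data)
    n_rep = len(data[genes[0]])
    observed = sorted((d_stat(data[g], s0), g) for g in genes)
    # Expected d: average of the sorted d values over all sign flips.
    sorted_perms = []
    for signs in itertools.product((1, -1), repeat=n_rep):
        ds = sorted(d_stat([s * v for s, v in zip(signs, data[g])], s0)
                    for g in genes)
        sorted_perms.append(ds)
    expected = [mean(col) for col in zip(*sorted_perms)]
    called = [g for (d, g), de in zip(observed, expected)
              if abs(d - de) > delta]
    return observed, expected, called
```

With only a handful of genes the expected order statistics are poorly estimated, so the sketch is meaningful only when applied to the full set of genes of an array.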
Actin probe specificity
PCR primer design
PCR primers (18-mer) were designed for the specific amplification of the six actin genes: actin, alpha 1, skeletal muscle (ACTA1, NM_001100); actin, alpha 2, smooth muscle, aorta (ACTA2, NM_001613); actin, beta (ACTB, NM_001101); actin, alpha, cardiac muscle 1 (ACTC1, NM_005159); actin, gamma 1 (ACTG1, NM_001614); and actin, gamma 2, smooth muscle, enteric (ACTG2, NM_001615). For all isoforms except ACTA1, the forward primer was designed in a conserved region whereas the different reverse primers were specific for the respective actins [see Additional file 6]. All primers were verified by using the BLASTN program.
The PCR reaction was performed using as template cDNA generated from poly(A+) RNA extracted from HeLa cells. Reaction mixtures were prepared by using 10× Ex Taq reaction buffer and 2.5 U of Ex Taq polymerase (Takara Biomedicals, Otsu, Shiga, Japan) in accordance with the manufacturer's recommendations in a total volume of 50 μl or 100 μl. Thermal cycling was carried out using a RoboCycler Gradient 96 (Stratagene, La Jolla, CA, USA) with an initial denaturation step of 94°C for 4 min, followed by 30 cycles of denaturation at 94°C for 30 s, annealing at 58°C for 30 s, and elongation at 72°C for 30 s. Cycling was completed by a final elongation step of 72°C for 10 min. The presence and size of the amplification products were determined by agarose (1%) gel electrophoresis of the reaction product.
Labeling and Hybridisation
Specific PCR products were labeled chemically using the ULYSIS labeling kit (Invitrogen) and Alexa Fluor 546 and 647 dyes according to the manufacturer's protocol. The labeled DNA was purified with the Qiagen QIAquick PCR Purification kit, and was then recovered by ethanol precipitation, followed by resuspension in DIG Easy Hyb hybridization solution (Roche Diagnostics, Mannheim, Germany) at a concentration of 12.5 μg/ml. Hybridisations to Actichip microarrays were carried out for 16 h at 42°C as described previously using 20 μl of amplified and labeled DNA solution and a 22×25 mm LifterSlip. After incubation, microarrays were washed and scanned as described previously.
Actichip sensitivity and detection limit
Preparation of spiked RNA samples
Spike poly(A+) RNAs were synthesised from the Arabidopsis thaliana spiking control cRNA vector set originally developed at the Institute for Genomic Research (TIGR, Rockville, MD, USA) by Hind III-Sac I directional cloning of PCR fragments corresponding to ten selected A. thaliana genes into the pSP64 poly(A) vector (Promega, Madison, WI, USA). Seven of these plasmids were linearised by EcoR I digestion, the restriction site being positioned immediately after the poly(A) tail sequence. One μg of each linearised plasmid was used as template for the in vitro synthesis of sense transcripts using the MEGAscript High Yield Transcription kit (Ambion). Following DNase I treatment, the transcribed RNAs were purified by lithium chloride precipitation and resuspended in 10 mM Tris-HCl pH 7.5. The quality and quantity of the RNA samples were assessed with an RNA LabChip (Agilent Biotechnologies) and classical spectrophotometry. RNA solutions were adjusted to a concentration of 3 μg/μl, corresponding to 10^6 spike copies/cell/μl (cpc/μl), and were mixed to prepare seven 10× test samples, each containing a full range of spike RNAs at concentrations ranging from 1 to 10^5 cpc [see Additional file 1]. Transcript copy number calculations were made assuming that a cell contains 1 pg poly(A) RNA corresponding to an average of 360,000 transcripts, and that 0.3 ng spike transcript corresponds to 100 spike copies/cell. Care was taken to use DEPC-treated water containing 1 μg/μl E. coli tRNA (Roche Diagnostics) to prevent the loss of spike RNAs at low concentrations through adsorption on plastic surfaces. An eighth 10× RNA sample was constructed containing the seven RNA spikes at a concentration corresponding to 10^3 cpc.
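The copy number arithmetic can be checked with a few lines of Python. The conversion below uses only the equivalence stated in the text (0.3 ng spike transcript for 100 cpc) and assumes that it scales linearly; the function names are hypothetical. The last function is an inference rather than a statement of the protocol: assuming a spike transcript has the mass of an average transcript, it computes the amount of sample poly(A) RNA for which 0.3 ng of spike would represent 100 copies among 360,000 transcripts.

```python
NG_PER_100_CPC = 0.3           # stated equivalence: 0.3 ng spike = 100 cpc
TRANSCRIPTS_PER_CELL = 360000  # stated: 1 pg poly(A) RNA per cell

def ng_to_cpc(mass_ng):
    """Convert a spike mass (ng) to copies per cell, assuming linear scaling."""
    return 100.0 * mass_ng / NG_PER_100_CPC

def cpc_to_ng(cpc):
    """Convert copies per cell to the corresponding spike mass (ng)."""
    return NG_PER_100_CPC * cpc / 100.0

def implied_sample_mass_ng(spike_ng=NG_PER_100_CPC, cpc=100):
    """Sample poly(A) RNA mass (ng) implied by the equivalence.

    Assumption: a spike transcript weighs as much as an average transcript,
    so its mass fraction equals cpc / TRANSCRIPTS_PER_CELL.
    """
    return spike_ng / (cpc / TRANSCRIPTS_PER_CELL)

# A 3 ug/ul stock (3000 ng per ul) corresponds to 10^6 cpc per ul.
stock_cpc_per_ul = ng_to_cpc(3000.0)
```

Under these assumptions the stock concentration of 3 μg/μl indeed corresponds to 10^6 cpc/μl, and the stated equivalence is consistent with roughly 1 μg (1,080 ng) of sample poly(A) RNA.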
Dose response curves
Dose response curves were determined using a procedure modified from Allemeersch et al. Briefly, the seven test mixes containing the spike RNAs in well-defined concentration and expression ratio were used as template to prepare the Alexa fluor 647-labeled samples, whereas the reference mix was used for the generation of the Alexa fluor 555-labeled sample. Test and reference samples were pooled in equimolar concentration to obtain seven hybridisation mixtures that were incubated for 20 h at 42°C onto distinct Actichip microarrays. Slides were washed and scanned as described previously.
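Because each spike is present at a known concentration in the test mix and at 10^3 cpc in the reference mix, the expected Log2 ratio of every spike is fixed by design. The following Python sketch computes these expected ratios and fits a straight line through a dose response series by ordinary least squares. The function names are hypothetical, and the choice of a linear fit on log-transformed values is an assumption made for illustration; the exact spike concentrations of each mix are listed in Additional file 1 and are not reproduced here.

```python
import math

REFERENCE_CPC = 1e3  # spike concentration in the reference mix (cpc)

def expected_log2_ratio(test_cpc, reference_cpc=REFERENCE_CPC):
    """Expected Log2(test/reference) ratio for a spike at `test_cpc`."""
    return math.log2(test_cpc / reference_cpc)

def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x
```

A spike at 10^3 cpc in the test mix has an expected Log2 ratio of 0. When the logarithm of the signal is regressed on the logarithm of the spike concentration, a slope close to 1 indicates a proportional response over the tested range.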
Microarray platform comparison
Sets of common probes between the different microarray platforms were established on the basis of sequence accession numbers and/or sequence comparison. In the first approach, RefSeq or GenBank accession numbers from Actinome sequences were compared to the accession numbers present in the annotation files provided by the UMCU for the Operon probe set and by Affymetrix for the HG-U133A 2.0 GeneChip. In the second approach, sequences of the Actichip targets (GenBank mRNA sequences) were compared to those provided by Affymetrix ("target sequence") and the UMCU (GenBank and RefSeq mRNA sequences). For this sequence similarity search, we used the BLASTN program and calculated two parameters, a global percent identity (GID) and a percent coverage (pCover). GID was defined as the ratio of the number of identical residues to the total number of residues in all Maximum Segment Pairs (MSPs) of the query [see Additional file 7]. pCover corresponded to the ratio of the number of identical residues to the number of residues that were aligned between the two sequences. To avoid false negative results, similarity searches were performed using permissive cutoffs of 95 % and 70 % for GID and pCover, respectively. All alignments were then curated manually.
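The two parameters can be illustrated with a minimal Python sketch operating on a list of MSPs parsed from a BLASTN report. It is not the program used in the study, and the field names are hypothetical. GID follows the definition given above. For pCover, the sketch assumes that the "residues in coverage" are the query positions spanned from the first to the last aligned position; this is one possible reading of the definition in Additional file 7 and should be treated as an assumption.

```python
def gid_and_pcover(msps):
    """Compute (GID, pCover) in percent from the MSPs of one query/subject pair.

    Each MSP is a dict with hypothetical keys:
      identical : number of identical residues in the alignment
      aln_len   : number of residues in the alignment
      q_start, q_end : 1-based alignment coordinates on the query
    """
    identical = sum(m["identical"] for m in msps)
    aligned = sum(m["aln_len"] for m in msps)
    # Assumed coverage: span between the outermost aligned query positions.
    span = max(m["q_end"] for m in msps) - min(m["q_start"] for m in msps) + 1
    return 100.0 * identical / aligned, 100.0 * identical / span

def is_common_probe(msps, gid_cutoff=95.0, pcover_cutoff=70.0):
    """Apply the permissive cutoffs (95 % GID, 70 % pCover) to one pair."""
    gid, pcover = gid_and_pcover(msps)
    return gid >= gid_cutoff and pcover >= pcover_cutoff
```

For two MSPs with 95 identities over 100 aligned residues (query positions 1 to 100) and 48 identities over 50 residues (positions 151 to 200), this gives a GID of about 95.3 % and a pCover of 71.5 %, so the pair passes both cutoffs and would proceed to manual curation.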
References
Revenu C, Athman R, Robine S, Louvard D: The co-workers of actin filaments: from cell structures to signals. Nat Rev Mol Cell Biol. 2004, 5: 635-646. 10.1038/nrm1437.
Blessing CA, Ugrinova GT, Goodson HV: Actin and ARPs: action in the nucleus. Trends Cell Biol. 2004, 14: 435-442. 10.1016/j.tcb.2004.07.009.
Miralles F, Visa N: Actin in transcription and transcription regulation. Curr Opin Cell Biol. 2006, 18: 261-266. 10.1016/j.ceb.2006.04.009.
Disanza A, Steffen A, Hertzog M, Frittoli E, Rottner K, Scita G: Actin polymerization machinery: the finish line of signaling networks, the starting point of cellular movement. Cell Mol Life Sci. 2005, 62: 955-970. 10.1007/s00018-004-4472-6.
Janmey PA, Chaponnier C: Medical aspects of the actin cytoskeleton. Curr Opin Cell Biol. 1995, 7: 111-117. 10.1016/0955-0674(95)80052-2.
Lambrechts A, Van Troys M, Ampe C: The actin cytoskeleton in normal and pathological cell motility. Int J Biochem Cell Biol. 2004, 36: 1890-1909. 10.1016/j.biocel.2004.01.024.
Condeelis J, Singer RH, Segall JE: The great escape: when cancer cells hijack the genes for chemotaxis and motility. Annu Rev Cell Dev Biol. 2005, 21: 695-718. 10.1146/annurev.cellbio.21.122303.120306.
Jechlinger M, Grunert S, Tamir IH, Janda E, Ludemann S, Waerner T, Seither P, Weith A, Beug H, Kraut N: Expression profiling of epithelial plasticity in tumor progression. Oncogene. 2003, 22: 7155-7169. 10.1038/sj.onc.1206887.
Steenman M, Lamirault G, Le Meur N, Leger JJ: Gene expression profiling in human cardiovascular disease. Clin Chem Lab Med. 2005, 43: 696-701. 10.1515/CCLM.2005.118.
Giganti A, Friederich E: The actin cytoskeleton as a therapeutic target: state of the art and future directions. Prog Cell Cycle Res. 2003, 5: 511-525.
Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Wheeler DL: GenBank. Nucleic Acids Res. 2006, 34: D16-20. 10.1093/nar/gkj157.
Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, Harris MA, Hill DP, Issel-Tarver L, Kasarskis A, Lewis S, Matese JC, Richardson JE, Ringwald M, Rubin GM, Sherlock G: Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet. 2000, 25: 25-29. 10.1038/75556.
Kane MD, Jatkoe TA, Stumpf CR, Lu J, Thomas JD, Madore SJ: Assessment of the sensitivity and specificity of oligonucleotide (50mer) microarrays. Nucleic Acids Res. 2000, 28: 4552-4557. 10.1093/nar/28.22.4552.
Mecham BH, Wetmore DZ, Szallasi Z, Sadovsky Y, Kohane I, Mariani TJ: Increased measurement accuracy for sequence-verified microarray probes. Physiol Genomics. 2004, 18: 308-315. 10.1152/physiolgenomics.00066.2004.
Barczak A, Rodriguez MW, Hanspers K, Koth LL, Tai YC, Bolstad BM, Speed TP, Erle DJ: Spotted long oligonucleotide arrays for human gene expression analysis. Genome Res. 2003, 13: 1775-1785. 10.1101/gr.1048803.
Li F, Stormo GD: Selection of optimal DNA oligos for gene expression arrays. Bioinformatics. 2001, 17: 1067-1076. 10.1093/bioinformatics/17.11.1067.
Rouillard JM, Zuker M, Gulari E: OligoArray 2.0: design of oligonucleotide probes for DNA microarrays using a thermodynamic approach. Nucleic Acids Res. 2003, 31: 3057-3062. 10.1093/nar/gkg426.
Nielsen HB, Wernersson R, Knudsen S: Design of oligonucleotides for microarrays and perspectives for design of multi-transcriptome arrays. Nucleic Acids Res. 2003, 31: 3491-3496. 10.1093/nar/gkg622.
Chou HH, Hsia AP, Mooney DL, Schnable PS: Picky: oligo microarray design for large genomes. Bioinformatics. 2004, 20: 2893-2902. 10.1093/bioinformatics/bth347.
Reymond N, Charles H, Duret L, Calevro F, Beslon G, Fayard JM: ROSO: optimizing oligonucleotide probes for microarrays. Bioinformatics. 2004, 20: 271-273. 10.1093/bioinformatics/btg401.
Nordberg EK: YODA: selecting signature oligonucleotides. Bioinformatics. 2005, 21: 1365-1370. 10.1093/bioinformatics/bti182.
Rimour S, Hill D, Militon C, Peyret P: GoArrays: highly dynamic and efficient microarray probe design. Bioinformatics. 2005, 21: 1094-1103. 10.1093/bioinformatics/bti112.
Perrodou E, Deshayes C, Muller J, Schaeffer C, Van Dorsselaer A, Ripp R, Poch O, Reyrat JM, Lecompte O: ICDS database: interrupted CoDing sequences in prokaryotic genomes. Nucleic Acids Res. 2006, 34: D338-43. 10.1093/nar/gkj060.
Pruitt KD, Tatusova T, Maglott DR: NCBI Reference Sequence (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Res. 2005, 33: D501-4. 10.1093/nar/gki025.
Wheeler DL, Barrett T, Benson DA, Bryant SH, Canese K, Chetvernin V, Church DM, DiCuccio M, Edgar R, Federhen S, Geer LY, Helmberg W, Kapustin Y, Kenton DL, Khovayko O, Lipman DJ, Madden TL, Maglott DR, Ostell J, Pruitt KD, Schuler GD, Schriml LM, Sequeira E, Sherry ST, Sirotkin K, Souvorov A, Starchenko G, Suzek TO, Tatusov R, Tatusova TA, Wagner L, Yaschenko E: Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. 2006, 34: D173-80. 10.1093/nar/gkj158.
Wang D, Coscoy L, Zylberberg M, Avila PC, Boushey HA, Ganem D, DeRisi JL: Microarray-based detection and genotyping of viral pathogens. Proc Natl Acad Sci U S A. 2002, 99: 15687-15692. 10.1073/pnas.242579699.
Lee PD, Sladek R, Greenwood CM, Hudson TJ: Control genes and variability: absence of ubiquitous reference transcripts in diverse mammalian expression studies. Genome Res. 2002, 12: 292-297. 10.1101/gr.217802.
Wang HY, Malek RL, Kwitek AE, Greene AS, Luu TV, Behbahani B, Frank B, Quackenbush J, Lee NH: Assessing unmodified 70-mer oligonucleotide probe performance on glass-slide microarrays. Genome Biol. 2003, 4: R5. 10.1186/gb-2003-4-1-r5.
Tusher VG, Tibshirani R, Chu G: Significance analysis of microarrays applied to the ionizing radiation response. Proc Natl Acad Sci U S A. 2001, 98: 5116-5121. 10.1073/pnas.091062498.
Muller J, Oma Y, Vallar L, Friederich E, Poch O, Winsor B: Sequence and comparative genomic analysis of actin-related proteins. Mol Biol Cell. 2005, 16: 5736-5748. 10.1091/mbc.E05-06-0508.
Kothapalli R, Yoder SJ, Mane S, Loughran TP: Microarray results: how accurate are they?. BMC Bioinformatics. 2002, 3: 22. 10.1186/1471-2105-3-22.
Khaitlina SY: Functional specificity of actin isoforms. Int Rev Cytol. 2001, 202: 35-98.
Hoheisel JD: Microarray technology: beyond transcript profiling and genotype analysis. Nat Rev Genet. 2006, 7: 200-210. 10.1038/nrg1809.
Brazma A, Hingamp P, Quackenbush J, Sherlock G, Spellman P, Stoeckert C, Aach J, Ansorge W, Ball CA, Causton HC, Gaasterland T, Glenisson P, Holstege FC, Kim IF, Markowitz V, Matese JC, Parkinson H, Robinson A, Sarkans U, Schulze-Kremer S, Stewart J, Taylor R, Vilo J, Vingron M: Minimum information about a microarray experiment (MIAME)-toward standards for microarray data. Nat Genet. 2001, 29: 365-371. 10.1038/ng1201-365.
Bammler T, Beyer RP, Bhattacharya S, Boorman GA, Boyles A, Bradford BU, Bumgarner RE, Bushel PR, Chaturvedi K, Choi D, Cunningham ML, Deng S, Dressman HK, Fannin RD, Farin FM, Freedman JH, Fry RC, Harper A, Humble MC, Hurban P, Kavanagh TJ, Kaufmann WK, Kerr KF, Jing L, Lapidus JA, Lasarev MR, Li J, Li YJ, Lobenhofer EK, Lu X, Malek RL, Milton S, Nagalla SR, O'Malley J P, Palmer VS, Pattee P, Paules RS, Perou CM, Phillips K, Qin LX, Qiu Y, Quigley SD, Rodland M, Rusyn I, Samson LD, Schwartz DA, Shi Y, Shin JL, Sieber SO, Slifer S, Speer MC, Spencer PS, Sproles DI, Swenberg JA, Suk WA, Sullivan RC, Tian R, Tennant RW, Todd SA, Tucker CJ, Van Houten B, Weis BK, Xuan S, Zarbl H: Standardizing global gene expression analysis between laboratories and across platforms. Nat Methods. 2005, 2: 351-356. 10.1038/nmeth0605-477a.
Irizarry RA, Warren D, Spencer F, Kim IF, Biswal S, Frank BC, Gabrielson E, Garcia JG, Geoghegan J, Germino G, Griffin C, Hilmer SC, Hoffman E, Jedlicka AE, Kawasaki E, Martinez-Murillo F, Morsberger L, Lee H, Petersen D, Quackenbush J, Scott A, Wilson M, Yang Y, Ye SQ, Yu W: Multiple-laboratory comparison of microarray platforms. Nat Methods. 2005, 2: 345-350. 10.1038/nmeth756.
Larkin JE, Frank BC, Gavras H, Sultana R, Quackenbush J: Independence and reproducibility across microarray platforms. Nat Methods. 2005, 2: 337-344. 10.1038/nmeth757.
Kuo WP, Liu F, Trimarchi J, Punzo C, Lombardi M, Sarang J, Whipple ME, Maysuria M, Serikawa K, Lee SY, McCrann D, Kang J, Shearstone JR, Burke J, Park DJ, Wang X, Rector TL, Ricciardi-Castagnoli P, Perrin S, Choi S, Bumgarner R, Kim JH, Short GF, Freeman MW, Seed B, Jensen R, Church GM, Hovig E, Cepko CL, Park P, Ohno-Machado L, Jenssen TK: A sequence-oriented comparison of gene expression measurements across different hybridization-based technologies. Nat Biotechnol. 2006, 24: 832-840. 10.1038/nbt1217.
Allemeersch J, Durinck S, Vanderhaeghen R, Alard P, Maes R, Seeuws K, Bogaert T, Coddens K, Deschouwer K, Van Hummelen P, Vuylsteke M, Moreau Y, Kwekkeboom J, Wijfjes AH, May S, Beynon J, Hilson P, Kuiper MT: Benchmarking the CATMA microarray. A novel tool for Arabidopsis transcriptome analysis. Plant Physiol. 2005, 137: 588-601. 10.1104/pp.104.051300.
Snijders AM, Meijer GA, Brakenhoff RH, van den Brule AJ, van Diest PJ: Microarray techniques in pathology: tool or toy?. Mol Pathol. 2000, 53: 289-294. 10.1136/mp.53.6.289.
Shippy R, Sendera TJ, Lockner R, Palaniappan C, Kaysser-Kranich T, Watts G, Alsobrook J: Performance evaluation of commercial short-oligonucleotide microarrays and the impact of noise in making cross-platform correlations. BMC Genomics. 2004, 5: 61. 10.1186/1471-2164-5-61.
Schlingemann J, Habtemichael N, Ittrich C, Toedt G, Kramer H, Hambek M, Knecht R, Lichter P, Stauber R, Hahn M: Patient-based cross-platform comparison of oligonucleotide microarray expression profiles. Lab Invest. 2005, 85: 1024-1039. 10.1038/labinvest.3700293.
Magnusson NE, Cardozo AK, Kruhoffer M, Eizirik DL, Orntoft TF, Jensen JL: Construction and validation of the APOCHIP, a spotted oligo-microarray for the study of beta-cell apoptosis. BMC Bioinformatics. 2005, 6: 311. 10.1186/1471-2105-6-311.
Yauk CL, Berndt ML, Williams A, Douglas GR: Comprehensive comparison of six microarray technologies. Nucleic Acids Res. 2004, 32: e124. 10.1093/nar/gnh123.
Kuo WP, Jenssen TK, Butte AJ, Ohno-Machado L, Kohane IS: Analysis of matched mRNA measurements from two different microarray technologies. Bioinformatics. 2002, 18: 405-412. 10.1093/bioinformatics/18.3.405.
Tan PK, Downey TJ, Spitznagel EL, Xu P, Fu D, Dimitrov DS, Lempicki RA, Raaka BM, Cam MC: Evaluation of gene expression measurements from commercial microarray platforms. Nucleic Acids Res. 2003, 31: 5676-5684. 10.1093/nar/gkg763.
Rogojina AT, Orr WE, Song BK, Geisert EE: Comparing the use of Affymetrix to spotted oligonucleotide microarrays using two retinal pigment epithelium cell lines. Mol Vis. 2003, 9: 482-496.
Mah N, Thelin A, Lu T, Nikolaus S, Kuhbacher T, Gurbuz Y, Eickhoff H, Kloppel G, Lehrach H, Mellgard B, Costello CM, Schreiber S: A comparison of oligonucleotide and cDNA-based microarray systems. Physiol Genomics. 2004, 16: 361-370. 10.1152/physiolgenomics.00080.2003.
Jensen K, Talbot R, Paxton E, Waddington D, Glass EJ: Development and validation of a bovine macrophage specific cDNA microarray. BMC Genomics. 2006, 7: 224. 10.1186/1471-2164-7-224.
Glas AM, Floore A, Delahaye LJ, Witteveen AT, Pover RC, Bakx N, Lahti-Domenici JS, Bruinsma TJ, Warmoes MO, Bernards R, Wessels LF, Van't Veer LJ: Converting a breast cancer microarray signature into a high-throughput diagnostic test. BMC Genomics. 2006, 7: 278. 10.1186/1471-2164-7-278.
Kapur K, Xing Y, Ouyang Z, Wong WH: Exon array assessment of gene expression. Genome Biol. 2007, 8: R82. 10.1186/gb-2007-8-5-r82.
Okoniewski MJ, Hey Y, Pepper SD, Miller CJ: High correspondence between Affymetrix exon and standard expression arrays. Biotechniques. 2007, 42: 181-185.
Xing Y, Kapur K, Wong WH: Probe selection and expression index computation of Affymetrix exon arrays. PLoS ONE. 2006, 1: e88. 10.1371/journal.pone.0000088.
Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997, 25: 3389-3402. 10.1093/nar/25.17.3389.
Wu CH, Apweiler R, Bairoch A, Natale DA, Barker WC, Boeckmann B, Ferro S, Gasteiger E, Huang H, Lopez R, Magrane M, Martin MJ, Mazumder R, O'Donovan C, Redaschi N, Suzek B: The Universal Protein Resource (UniProt): an expanding universe of protein information. Nucleic Acids Res. 2006, 34: D187-91. 10.1093/nar/gkj161.
Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, Devon K, Dewar K, Doyle M, FitzHugh W, Funke R, Gage D, Harris K, Heaford A, Howland J, Kann L, Lehoczky J, LeVine R, McEwan P, McKernan K, Meldrim J, Mesirov JP, Miranda C, Morris W, Naylor J, Raymond C, Rosetti M, Santos R, Sheridan A, Sougnez C, Stange-Thomann N, Stojanovic N, Subramanian A, Wyman D, Rogers J, Sulston J, Ainscough R, Beck S, Bentley D, Burton J, Clee C, Carter N, Coulson A, Deadman R, Deloukas P, Dunham A, Dunham I, Durbin R, French L, Grafham D, Gregory S, Hubbard T, Humphray S, Hunt A, Jones M, Lloyd C, McMurray A, Matthews L, Mercer S, Milne S, Mullikin JC, Mungall A, Plumb R, Ross M, Shownkeen R, Sims S, Waterston RH, Wilson RK, Hillier LW, McPherson JD, Marra MA, Mardis ER, Fulton LA, Chinwalla AT, Pepin KH, Gish WR, Chissoe SL, Wendl MC, Delehaunty KD, Miner TL, Delehaunty A, Kramer JB, Cook LL, Fulton RS, Johnson DL, Minx PJ, Clifton SW, Hawkins T, Branscomb E, Predki P, Richardson P, Wenning S, Slezak T, Doggett N, Cheng JF, Olsen A, Lucas S, Elkin C, Uberbacher E, Frazier M, Gibbs RA, Muzny DM, Scherer SE, Bouck JB, Sodergren EJ, Worley KC, Rives CM, Gorrell JH, Metzker ML, Naylor SL, Kucherlapati RS, Nelson DL, Weinstock GM, Sakaki Y, Fujiyama A, Hattori M, Yada T, Toyoda A, Itoh T, Kawagoe C, Watanabe H, Totoki Y, Taylor T, Weissenbach J, Heilig R, Saurin W, Artiguenave F, Brottier P, Bruls T, Pelletier E, Robert C, Wincker P, Smith DR, Doucette-Stamm L, Rubenfield M, Weinstock K, Lee HM, Dubois J, Rosenthal A, Platzer M, Nyakatura G, Taudien S, Rump A, Yang H, Yu J, Wang J, Huang G, Gu J, Hood L, Rowen L, Madan A, Qin S, Davis RW, Federspiel NA, Abola AP, Proctor MJ, Myers RM, Schmutz J, Dickson M, Grimwood J, Cox DR, Olson MV, Kaul R, Shimizu N, Kawasaki K, Minoshima S, Evans GA, Athanasiou M, Schultz R, Roe BA, Chen F, Pan H, Ramser J, Lehrach H, Reinhardt R, McCombie WR, de la Bastide M, Dedhia N, Blocker H, Hornischer K, Nordsiek G, Agarwala R, Aravind L, Bailey JA, 
Bateman A, Batzoglou S, Birney E, Bork P, Brown DG, Burge CB, Cerutti L, Chen HC, Church D, Clamp M, Copley RR, Doerks T, Eddy SR, Eichler EE, Furey TS, Galagan J, Gilbert JG, Harmon C, Hayashizaki Y, Haussler D, Hermjakob H, Hokamp K, Jang W, Johnson LS, Jones TA, Kasif S, Kaspryzk A, Kennedy S, Kent WJ, Kitts P, Koonin EV, Korf I, Kulp D, Lancet D, Lowe TM, McLysaght A, Mikkelsen T, Moran JV, Mulder N, Pollara VJ, Ponting CP, Schuler G, Schultz J, Slater G, Smit AF, Stupka E, Szustakowski J, Thierry-Mieg D, Thierry-Mieg J, Wagner L, Wallis J, Wheeler R, Williams A, Wolf YI, Wolfe KH, Yang SP, Yeh RF, Collins F, Guyer MS, Peterson J, Felsenfeld A, Wetterstrand KA, Patrinos A, Morgan MJ, de Jong P, Catanese JJ, Osoegawa K, Shizuya H, Choi S, Chen YJ: Initial sequencing and analysis of the human genome. Nature. 2001, 409: 860-921. 10.1038/35057062.
Chalmel F, Lardenois A, Thompson JD, Muller J, Sahel JA, Leveillard T, Poch O: GOAnno: GO annotation based on multiple alignment. Bioinformatics. 2005, 21: 2095-2096. 10.1093/bioinformatics/bti252.
SantaLucia J: A unified view of polymer, dumbbell, and oligonucleotide DNA nearest-neighbor thermodynamics. Proc Natl Acad Sci U S A. 1998, 95: 1460-1465. 10.1073/pnas.95.4.1460.
Tibshirani R: A simple method for assessing sample sizes in microarray experiments. BMC Bioinformatics. 2006, 7: 106. 10.1186/1471-2105-7-106.
Acknowledgements
We thank the Institute for Genomic Research for providing the A. thaliana control spiking cRNA vector set. This work was supported, in Luxembourg, by the Fonds National de la Recherche (FNR), the Centre de Recherche Public de la Santé, the "Fondation luxembourgeoise contre le cancer" and the foundation "Kriibskrankkanner", and, in France, by the Centre National de la Recherche Scientifique (CNRS), the Institut National de la Santé et de la Recherche Médicale (INSERM) and the Université Louis Pasteur. J. Muller was supported by a fellowship from the Ministère de la Culture, de l'Enseignement Supérieur et de la Recherche and the FNR, Luxembourg.
Authors' contributions
JM, FC and ArM developed CADO4MI and processed the bioinformatics data. JM also contributed to data evaluation and drafted the manuscript. AnM and GV contributed to protocol optimization and some experiments as well as data evaluation. MY gave statistics support, contributed to data evaluation and helped to draft the manuscript. OP, EF and LV conceived the study, participated in its design and coordination. LV developed optimized protocols, prepared the Actichip microarrays, carried out microarray experiments, contributed to oligonucleotide design and data evaluation and drafted the manuscript. All authors read and approved the final manuscript.
Electronic supplementary material
Additional file 1: Concentration of the spike RNAs in the seven 10× sample mixes and 10× reference mix. Concentration is expressed in copies per cell (cpc). The gene abbreviations correspond to those used in the original description of the A. thaliana control set. (XLS 15 KB)
Additional file 2: Gene coverage of the Actichip, Affymetrix and Operon platforms. The list recapitulates the genes included in the Actichip microarray that were not covered by the Affymetrix HG-U133A 2.0 GeneChip and the Human oligonucleotide set 2.0 from Operon. Data relative to the Affymetrix GeneChip were verified at the NetAffx analysis center. (PDF 32 KB)
Additional file 3: Probe design comparison. The sequences of the probes or probe sets specific for the various actin isoforms in the Actichip, Operon and Affymetrix platforms were aligned with the sequence of the corresponding target. The results are displayed in the graphical user interface of CADO4MI. The top panel shows the evolution of the average percent identity (red curve) and the number of sequence detected by BLASTN (blue curve) along the sequence of the target (red rectangle). The bottom panel displays the position of the probes relative to the sequence of the query. The large dark gray rectangle corresponds to the target sequence used by Affymetrix to design the probe sets. (PDF 1 MB)
Additional file 7: GID and pCover parameters. Sequences of the Actichip targets (GenBank mRNA sequences) were compared to the sequences (GenBank and RefSeq mRNA sequences) provided by Affymetrix ("target sequence") and the genomics laboratory at the university medical center of Utrecht (UMCU, The Netherlands) relative to the Operon probe set. This comparison was performed through sequence alignment using the BLASTN program and calculation of two parameters, a global percent identity (GID) and a percent coverage (pCover). GID corresponds to the percentage of global identity between the query sequence (Qseq; Affymetrix or Operon target sequence) and one sequence (SeqB; Actichip target sequence) in the BLAST output. GID is determined by considering all best scoring alignments (MSP) between QSeq and SeqB and is computed as the ratio of the total number of identical residues (TNAR) to the total number of residues (TNRM) in all these alignments. pCover refers to the percentage of sequence coverage between QSeq and SeqB in the BLAST output. The coverage corresponds to the extended overlapping regions between QSeq and SeqB considering all the regions aligned between the two sequences. pCover is computed as the ratio of the sum of all identical residues (NAR) to the number of residues in coverage (NRC). (PDF 51 KB)
Cite this article
Muller, J., Mehlen, A., Vetter, G. et al. Design and evaluation of Actichip, a thematic microarray for the study of the actin cytoskeleton. BMC Genomics 8, 294 (2007). https://doi.org/10.1186/1471-2164-8-294
- Actin Cytoskeleton
- Microarray Platform
- Foreground Pixel
- Microarray Gene Expression Data
- Microarray Image