Eggs were collected every 24 hours for 6 days on fresh rearing media (95% flour, 5% yeast, by weight) at 30°C from age-matched groups of mated and virgin females from the GA1 strain of T. castaneum. At the time of collection, whole eggs were manually cleaned of excess rearing media and placed into 0.5 ml of Trizol Reagent (Sigma-Aldrich, St. Louis, MO), sheared using a 21 gage needle, and stored at −80°C. Samples from two collection days were combined (days 1 & 4; 2 & 5; 3 & 6) and RNA was isolated using the miRNeasy Mini Kit (Qiagen, Valencia, CA).
NimbleGen T. castaneum whole genome tiled microarray
The microarray consisted of a set of three custom-designed Roche NimbleGen high-density-2 (HD2) whole genome tiling microarray chips, each with 2.1 million isothermal long-oligonucleotide probes (49 – 74 bases range; average 54 bases in length) that sequentially overlap 12 bp, on average. Markov modeled random probes (n=154,000) equivalent in composition to the T. castaneum genome sequence were used to set thresholds that measure significant hybridization signals over the background. All experimental probes were designed from unique regions of the genome sequence using the NimbleGen ArrayScribe software and quality assurance tests of the probes were conducted by Center for Genomics and Bioinformatics at Indiana University in-house algorithms . Signal to background ratios were determined by first calling probes that fluoresced at intensities greater than 99% of the random probes’ signal intensities; therefore, only 1% of fluorescing experimental probes above background should be false positives. The arrays reliably produced high signal to background ratios; log2 ratios in excess of five were observed for signal over background. We conducted two-color competitive hybridizations that measure differential expression from three replicates, including dye-swaps, each using RNA from independent biological extractions of fertilized and unfertilized eggs.
RNA sample processing and analysis of data
Beginning with at least 1.0 μg of total RNA, a single round of amplification using MessageAmp™ II aRNA kit (Ambion, Austin, TX) produced more than 100 μg for all samples. Starting with 10 μg of aRNA, double strand cDNA synthesis was carried out using SuperScript Double-Stranded cDNA Synthesis kit (Invitrogen, Carlsbad, CA) using random hexamer primer (Promega, Madison, WI) followed by Dual-Color Labeling Kit (Roche NimbleGen, Madison, WI) using 1 O.D. CY-labeled random nonomer primer per 1 μg double-stranded cDNA in triplicate. Differentially CY-dye labeled treatment and control replicates (48 μg each) were pooled and resuspended with the Hybridization Kit (Roche NimbleGen, Madison, WI). A single dye-swap was included. Hybridization, post-hybridization washing and scanning were done according to the manufacturer’s instructions. Images were acquired using an Axon GenePix 4200A scanner (Molecular Devices, Sunnyvale CA) with GenePix 6.0 software. Data from the images were extracted using NimbleScan 2.4 software (Roche NimbleGen, Madison, WI).
Transcriptionally active regions (TARs) were identified using the approach described in Colbourne, et al. . TARs were defined by concatenating overlapping probes showing fluorescence above the 1% false positive rate (FPR). First, replicate arrays were quantile-normalized  and to each probe the median value of the replicate probe values was assigned. The fluorescence signal of the random probes, designed to reflect the genome nucleotide composition by Markov modeling, was used to determine a FPR threshold. Probes were considered positive if their fluorescence signal was higher than the 99th percentile of the fluorescence signal of the random probes. TARs were generated using an approach similar to that developed by Kampa, et al. . Positive probes were joined into a single TAR if (1) they were adjacent (maxgap=0 bp, no intermittent non-positive probe) and (2) the length was at least 45 bp (minrun=45, mid-point first positive probe to mid-point last positive probe, resulting in at least 3 adjacent positive probes for a TAR). An exon or gene was considered transcribed only if > 80% of its tiled length was expressed.
The data analysis to measure differential expression of genes and of un-annotated TARs was performed using the statistical software package R  and Bioconductor . Signal distributions across chips, samples and replicates were adjusted to be equal according to the mean fluorescence of the random probes on each array. All probes (including random probes) were quantile-normalized across replicates. Expression-level scores were assigned for each predicted gene based on the median log2 fluorescence over background intensity of probes falling within the exon boundaries. We used the following analysis protocol for estimating differential expression of genes and other genome features from tiled expression data: (1) We created a “tile-expression” table containing normalized log2 expression scores for each oligonucleotide probe, with columns for each treatment and replicate, as well as the designated genome location (or address) of each probe; (2) We next created a “tile-gene mapping” table, in the same sorted order as the tile-expression table, which has columns of gene IDs for each exon, intron, tar-region, in rows matching the address of each probe; and, (3) We calculated the per-tile, per-treatment differential expression (DE) levels with LIMMA R package . This balanced-design DE calculation is of the same type that LIMMA is designed to produce.
The microarray data discussed in this publication have been deposited in NCBI's Gene Expression Omnibus  and are accessible through GEO Series accession number GSE38245.
Expression of maternal Drosophila melanogaster orthologs
Lists of D. melanogaster maternal genes were obtained from Table S12-S17 of Arbietman, et al. Tables two and six of De Renzis, et al.. Additional lists of maternal D. melanogaster genes were generated with FlyMine v32.0 (http://www.flymine.org, ) using the mRNA expression term “maternal” for the Berkeley Drosophila Genome Project in situ database [24, 25] and Fly-Fish database . BioMart  was used to identify T. castaneum orthologs to the D. melanogaster genes (Drosophila melanogaster genes (BDGP 5); ENSEMBL METAZOA 12 (EBI UK) database, Additional file 4). Drosophila melanogaster genes without a T. castaneum ortholog on the chromosomal assemblies were excluded from further analysis. In each comparison, >90% of all D. melanogaster genes with a T. castaneum ortholog had only one ortholog. A maternal D. melanogaster gene was categorized as having maternal T. castaneum orthologs if at least one of the orthologs was expressed in T. castaneum unfertilized eggs (Group A). A maternal D. melanogaster gene was categorized as having only zygotic T. castaneum orthologs if at least one of the orthologs was expressed in T. castaneum fertilized eggs, but none in unfertilized eggs (Group B). A maternal D. melanogaster gene was categorized as having no expressed orthologs if no T. castaneum orthologs were expressed in either fertilized or unfertilized eggs (Group C). The results of the comparisons between the D. melanogaster and T. castaneum maternal genes were examined for gene ontology germ enrichment using AmiGO (version: 1.8, GO database release 2012-09-08) . Only the Berkeley Drosophila Genome Project in situ database [24, 25] had a sufficient number of genes and distribution among gene groupings (i.e. Group A, maternally loaded; Group B, fertilized expressed; Group C, not expressed) for the enrichment test to provide enriched terms for all of the comparison groupings. Complete results are provided in the Additional files 1, 2: Table S1, 3 and 4.
Comparison to previously identified gene groupings in T. castaneum
Female-biased expression genes: We analyzed 1661 of the 1821 female-biased expression genes identified in Supplementary Table one of Prince, et al. for maternal expression. The remaining 160 genes could not be analyzed because they were located in areas not covered by our microarray (i.e., unassigned contig assemblies). Male-linked genes: Gene datasets from Tables two and three of Parthasarathy, et al. , Table one of South, et al. , Supplementary Table one of Prince, et al.  were compared to the expression results in this study. We substituted TC001980 and TC003552 for TC001930 and TC003522, respectively, in Table one of South, et al.  based on primer gene identification (A. Dapper, personal communication). Within each dataset, any genes not covered by our microarray were excluded from analysis.