Prediction of protein-trap targets. A) Estimated number of protein-tagging targets for genes with known expression: genes were ranked by their highest-scoring intron. The sets of top-ranked predicted genes were then compared with the reported GFP-trap target genes to determine the overlap. The graph shows the number of top-ranked genes that must be considered to recover 10%, 20%, etc. of the previously known GFP-trap target genes. B) The predicted numbers of possible targets for P-element- and piggyBac-based GFP-trap screens. The numbers were first derived separately for genes with known expression and genes with unknown expression; the total prediction is the sum of the two. Each prediction is the number of top-ranked genes from the model required to cover 80% of the previously known protein-trap insertion targets. C) Overlap between P-element and piggyBac targets: the inner circles of the Venn diagrams (solid lines) represent the numbers of reported P-element and piggyBac gene hits, and the outer circles (dotted lines) represent the numbers of estimated gene targets derived from our model. The P-element and piggyBac sets overlap for both the reported and the predicted targets. Only reported genes that were successfully predicted by the theoretical model are shown in this diagram.
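The ranking and coverage calculation described for panels A and B can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the input layout (a mapping from gene to a list of intron scores) and the toy values are hypothetical, and it assumes that coverage is measured only over known targets that appear in the ranking.

```python
def rank_genes_by_best_intron(intron_scores):
    """Rank genes by their highest-scoring intron, best first.

    intron_scores: hypothetical mapping {gene: [score of each intron]}.
    """
    best = {gene: max(scores) for gene, scores in intron_scores.items() if scores}
    # Sort by descending best score; break ties by gene name for determinism.
    return sorted(best, key=lambda gene: (-best[gene], gene))


def genes_needed_for_coverage(ranked_genes, known_targets, percents):
    """Return {percent: number of top-ranked genes needed to recover at least
    that percentage of the known targets}.

    Assumption: known targets absent from the ranking are ignored.
    """
    known = set(known_targets) & set(ranked_genes)
    if not known:
        raise ValueError("no known targets appear in the ranking")
    # 1-based ranks at which known targets are encountered, in ascending order.
    hit_ranks = [rank for rank, gene in enumerate(ranked_genes, start=1)
                 if gene in known]
    result = {}
    for pct in percents:
        # Integer ceiling of pct% of the known targets (at least one target).
        needed = max(1, -(-pct * len(known) // 100))
        result[pct] = hit_ranks[needed - 1]
    return result


# Toy data, invented purely for illustration.
toy_scores = {
    "geneA": [0.2, 0.9],
    "geneB": [0.5],
    "geneC": [0.7, 0.1],
    "geneD": [0.3],
    "geneE": [0.8],
}
ranked = rank_genes_by_best_intron(toy_scores)
coverage = genes_needed_for_coverage(ranked, {"geneA", "geneC", "geneD"}, [50, 80, 100])
```

In this sketch, the value at 80 corresponds to the kind of quantity reported in panel B; under the same assumptions, the total in B would be the sum of that value computed separately for genes with known and with unknown expression.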