Experimental design showing all steps of the analysis. We first searched all 20 Drosophila genomes by TBLASTN, using as queries a set of 18 MLE transposases representative of all described subfamilies of the mariner family. All hit sequences longer than 400 bp were clustered at a threshold of 80% identity. The consensus of each cluster was searched by BLAST against a transposase database composed of Tc1 family elements, in order to exclude sequences belonging to that family. For the remaining 36 clusters (bona fide MLEs), the conceptual translations of the consensus sequences were used in a phylogenetic analysis together with the MLE transposases, and the nucleotide consensus sequences were used as queries in a MEGABLAST search of the same 20 genomes to identify all copies. All hits were retrieved together with 250 bp of flanking sequence. Structural and evolutionary analyses were performed on a clean dataset from which duplicated copies (segmental duplications) and incomplete copies (at the ends of contigs) were excluded.
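The length filter and flank retrieval described above can be illustrated with a minimal sketch. The 400 bp and 250 bp values come from the caption; the `Hit` record, the function names, and the rule used to flag a copy as possibly truncated at a contig end are assumptions made for illustration only, not the authors' actual implementation.

```python
from dataclasses import dataclass

MIN_LEN = 400  # hits must be longer than this to be clustered (from the caption)
FLANK = 250    # bp of flanking sequence retrieved on each side (from the caption)


@dataclass
class Hit:
    """Hypothetical record for one BLAST hit (1-based, inclusive coordinates)."""
    contig: str
    start: int
    end: int
    contig_len: int


def hit_length(hit: Hit) -> int:
    """Length of the hit in bp."""
    return hit.end - hit.start + 1


def long_enough(hit: Hit, min_len: int = MIN_LEN) -> bool:
    """Keep only hits strictly longer than min_len."""
    return hit_length(hit) > min_len


def with_flanks(hit: Hit, flank: int = FLANK) -> tuple[int, int]:
    """Extend the hit by `flank` bp on each side, clipped to the contig."""
    return max(1, hit.start - flank), min(hit.contig_len, hit.end + flank)


def near_contig_end(hit: Hit, flank: int = FLANK) -> bool:
    """Assumed rule: flag a copy as possibly incomplete when the full
    flank cannot be retrieved on at least one side."""
    return hit.start - flank < 1 or hit.end + flank > hit.contig_len


hits = [
    Hit("contig_1", 300, 1500, 5000),
    Hit("contig_1", 100, 1400, 5000),
    Hit("contig_2", 10, 200, 3000),
]
kept = [h for h in hits if long_enough(h)]
clean = [h for h in kept if not near_contig_end(h)]
```

In this toy input, the third hit is dropped by the length filter and the second is flagged because its upstream flank runs off the contig. Removing segmental duplications would need sequence comparison of the flanks and is not covered here.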