Distribution of coding sequence lengths in OGSv3.2 and OGSv1.0. Histograms of the number of genes at each coding sequence length, in bins of 20 nt, are plotted as points rather than lines so that both distributions remain visible. Coding sequence lengths extend to 70,263 nt in OGSv3.2 (blue) and 53,649 nt in OGSv1.0 (red), but the figure shows only lengths up to 5,000 nt. Coding sequences longer than 5,000 nt were found in 386 genes in OGSv3.2 and 344 genes in OGSv1.0. The figure shows that the larger gene count in OGSv3.2 is due largely to an increase in the number of short genes. The number of longer genes has not decreased, so gene splitting is unlikely to be a major source of the additional genes.
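The binning described in this caption can be sketched as follows. This is a minimal illustration, not the authors' plotting code: the function name `bin_cds_lengths`, the choice of half-open bins labelled by their lower edge, and the toy lengths are all assumptions made for the example.

```python
from collections import Counter

BIN_WIDTH = 20      # nt per bin, as stated in the caption
MAX_LENGTH = 5000   # nt; upper limit of the zoomed-in view


def bin_cds_lengths(lengths, bin_width=BIN_WIDTH, max_length=MAX_LENGTH):
    """Count genes per length bin (hypothetical helper).

    Returns (counts, n_longer). counts maps each bin's lower edge to the
    number of genes whose coding sequence length falls in
    [edge, edge + bin_width). n_longer is the number of genes longer than
    max_length, which are tallied separately rather than plotted.
    """
    counts = Counter()
    n_longer = 0
    for length in lengths:
        if length > max_length:
            n_longer += 1
        else:
            counts[(length // bin_width) * bin_width] += 1
    return dict(counts), n_longer


# Made-up lengths for illustration only (not data from either gene set).
toy_lengths = [150, 155, 161, 300, 4999, 5000, 5001, 70263]
counts, n_longer = bin_cds_lengths(toy_lengths)
```

Running the same function on the coding sequence lengths of each gene set and plotting `counts` as points, one colour per set, would give a figure of this kind; `n_longer` corresponds to the per-set tally of genes beyond the plotted range.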