Next-generation sequencing (NGS) allows investigation of populations on a genomic scale, thus facilitating the transition from single-marker studies to population genomics.
Sequencing pools of individuals (Pool-seq) is a cost-effective alternative to sequencing individuals separately that still retains a high degree of accuracy. We have developed several software tools that deal efficiently with Pool-seq data, allowing identification of positively selected regions in single populations (PoPoolation), highly differentiated regions between populations (PoPoolation2), and transposable element (TE) frequencies in populations (PoPoolation TE).
By using these technologies to analyze samples from a natural population of Drosophila melanogaster, we show that low-recombining genomic regions harbor more TE insertions and maintain insertions at higher frequencies than do high-recombining regions. We conservatively estimate that there are almost twice as many “novel” sites of TE insertion as sites known from the reference sequence. Different families of transposable elements show large differences in their insertion densities and population frequencies. Our analyses suggest that the history of TE activity significantly contributes to this pattern, with recently active families segregating at lower frequencies than those active in the more distant past.
Finally, using our high-resolution TE abundance measurements, we identified 13 candidate positively selected TE insertions based on their high population frequencies and on patterns of genetic diversity in their neighborhoods.