Incorporating intergenic regions into reversal and transposition distances with indels
Abstract
Problems in the genome rearrangement field are often formulated in terms of pairwise genome comparison: given two genomes, find the minimum number of genome rearrangements that may have occurred during the evolutionary process. This broad definition leaves at least two considerations open: first, which features are extracted from the genomes to create a useful mathematical model, and second, which types of genome rearrangement events should be represented. Regarding the first consideration, seminal works in the field used gene order alone, representing genomes as permutations of integers and neglecting important aspects such as gene duplication, intergenic regions, and complex interactions between genes. Regarding the second, some rearrangement events, such as reversals and transpositions, are widely studied. In this paper, we address the first consideration with a model that takes into account both gene order and the number of nucleotides in intergenic regions. In addition, we consider reversals, transpositions, and indels (insertions and deletions) of genomic material. We present a 4-approximation algorithm for reversals and indels, an approximation algorithm for transpositions and indels, and a 6-approximation algorithm for reversals, transpositions, and indels.
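To make the model concrete, the following is a minimal sketch of a genome that stores both gene order and intergenic region sizes, together with one reversal acting on it. It is an illustration under stated assumptions, not the paper's formalization: the names `Genome` and `intergenic_reversal`, the linear layout with n + 1 intergenic regions for n genes, and the cut parameters `x` and `y` are all hypothetical choices made for this example.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass
class Genome:
    """Hypothetical representation: signed gene order plus intergenic sizes.

    Assumption: a linear genome with n genes has n + 1 intergenic regions,
    where inter[k] is the nucleotide count just before genes[k] and
    inter[n] is the region after the last gene.
    """
    genes: list[int]
    inter: list[int]

    def __post_init__(self) -> None:
        if len(self.inter) != len(self.genes) + 1:
            raise ValueError("expected len(inter) == len(genes) + 1")
        if any(size < 0 for size in self.inter):
            raise ValueError("intergenic sizes must be non-negative")


def intergenic_reversal(g: Genome, i: int, j: int, x: int, y: int) -> Genome:
    """Reverse genes[i:j], cutting inter[i] after x nucleotides and
    inter[j] after y nucleotides (illustrative definition)."""
    if not 0 <= i < j <= len(g.genes):
        raise ValueError("need 0 <= i < j <= number of genes")
    if not (0 <= x <= g.inter[i] and 0 <= y <= g.inter[j]):
        raise ValueError("cut positions must lie inside their regions")

    # Reverse the gene segment and flip each orientation (sign).
    genes = g.genes[:i] + [-a for a in reversed(g.genes[i:j])] + g.genes[j:]

    inter = list(g.inter)
    # The left part of region i now meets the left part of region j.
    inter[i] = x + y
    # Regions strictly inside the segment keep their sizes, in reverse order.
    inter[i + 1:j] = reversed(g.inter[i + 1:j])
    # The right part of region i now meets the right part of region j.
    inter[j] = (g.inter[i] - x) + (g.inter[j] - y)
    return Genome(genes, inter)


source = Genome(genes=[1, 2, 3], inter=[4, 2, 6, 1])
result = intergenic_reversal(source, i=1, j=3, x=2, y=1)
print(result.genes, result.inter)
```

In this sketch a reversal redistributes nucleotides between the two cut regions but leaves the total unchanged, so a model of this kind needs indels to account for differences in total intergenic size between two genomes.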