A symmetry-inclusive algebraic approach to genome rearrangement
Abstract
Of the many modern approaches to calculating evolutionary distance via models of genome rearrangement, most are tied to a particular set of genomic modeling assumptions and to a restricted class of allowed rearrangements. The “position paradigm”, in which genomes are represented as permutations signifying the position (and orientation) of each region, enables a refined model-based approach, where one can select biologically plausible rearrangements and assign to them relative probabilities/costs. Here, one must further incorporate any underlying structural symmetry of the genomes into the calculations and ensure that this symmetry is reflected in the model. In our recently-introduced framework of genome algebras, each genome corresponds to an element that simultaneously incorporates all of its inherent physical symmetries. The representation theory of these algebras then provides a natural model of evolution via rearrangement as a Markov chain. Whilst the implementation of this framework to calculate distances for genomes with “practical” numbers of regions is currently computationally infeasible, we consider it to be a significant theoretical advance: one can incorporate different genomic modeling assumptions, calculate various genomic distances, and compare the results under different rearrangement models. The aim of this paper is to demonstrate some of these features.
References
- 1. ,
An alternative algebraic formalism for genome rearrangements , in Sankoff D, Nadeau JH (eds.), Comparative Genomics Computational Biology, Vol. 1, pp. 213–223, 2000. Crossref, Google Scholar - 2. ,
Extending the algebraic formalism for genome rearrangements to include linear chromosomes , in de Souto MC, Kann MG (eds.), Advances in Bioinformatics and Computational Biology, Springer, Berlin, pp. 13–24, 2012. Crossref, Google Scholar - 3. , An algebraic view of bacterial genome evolution, J Math Biol 69(6–7) :1693–1718, 2014. Crossref, Medline, Google Scholar
- 4. , Group-theoretic models of the inversion process in bacterial genomes, J Math Biol 69(1) :243–265, 2014. Crossref, Medline, Google Scholar
- 5. , Algebraic double cut and join: a group-theoretic approach to the operator on multichromosomal genomes, J Math Biol 71(5) :1149–1178, 2015. Crossref, Medline, Google Scholar
- 6. , Position and content paradigms in genome rearrangements: The wild and crazy world of permutations in genomics, Bull Math Biol 80(12) :3227–3246, 2018. Crossref, Medline, Google Scholar
- 7. ,
A unifying view of genome rearrangements , in Bücher P, Moret BME (eds.), Algorithms in Bioinformatics, Springer, Berlin, Heidelberg, pp. 163–173, 2006. Crossref, Google Scholar - 8. , Double cut and join with insertions and deletions, J Comput Biol 18 :1167–1184, 2011. Crossref, Medline, Google Scholar
- 9. , Computing the rearrangement distance of natural genomes, in Schwartz R (ed.), Research in Computational Molecular Biology RECOMB 2020, Springer International Publishing, Cham, pp. 3–18, 2020. Crossref, Google Scholar
- 10. ,
A general framework for genome rearrangement with biological constraints , in Blanchette M, Ouangraoua A (eds.), Comparative Genomics, Springer International Publishing, New York, pp. 49–71, 2018. Crossref, Google Scholar - 11. , Algorithms for computing the double cut and join distance on both gene order and intergenic sizes, Algorithms Mol Biol 12, 2017. Crossref, Medline, Google Scholar
- 12. ,
Super short reversals on both gene order and intergenic sizes , in Alves R (ed.), Advances in Bioinformatics and Computational Biology, Springer International Publishing, Cham, pp. 14–25, 2018. Crossref, Google Scholar - 13. , Median approximations for genomes modeled as matrices, Bull Math Biol 78(4) :786–814, 2016. Crossref, Medline, Google Scholar
- 14. , Dynamics of genome rearrangement in bacterial populations, PLoS Genetics 4(7) :e1000128, 2008. Crossref, Medline, Google Scholar
- 15. ,
A computational method for the rate estimation of evolutionary transpositions , in Ortuño F, Rojas I (eds.), Bioinformatics and Biomedical Engineering, Springer International Publishing, Cham, pp. 471–480, 2015. Google Scholar - 16. , Maximum likelihood estimates of pairwise rearrangement distances, J Theoret Biol 423 :31–40, 2017. Crossref, Medline, Google Scholar
- 17. , Sorting circular permutations by super short reversals, IEEE/ACM Trans Comput Biol Bioinform 14(3) :620–633, 2017. Crossref, Medline, Google Scholar
- 18. , Maximum likelihood estimates of rearrangement distance: Implementing a representation-theoretic approach, Bull Math Biol 81(2) :535–567, 2019. Crossref, Medline, Google Scholar
- 19. Terauds V, Sumner JG, A new algebraic approach to genome rearrangement models, 2020, arXiv:2012.11665 (under review). Google Scholar
- 20. Stevenson JD, Terauds V, Sumner JG, Rearrangement events on circular genomes, 2021 (in preparation). Google Scholar
- 21. , Sorting by weighted inversions considering length and symmetry, BMC Bioinform 16, 2015. Crossref, Medline, Google Scholar
- 22. , Sorting by restricted-length-weighted reversals, Genomics Proteomics Bioinformatics 3(2) :120–127, 2005. Crossref, Medline, Google Scholar
- 23. , Distance-based genome rearrangement phylogeny, J Mol Evol 63(4) :473–483, 2006. Crossref, Medline, Google Scholar
- 24. , Estimating true evolutionary distances under the DCJ model, Bioinformatics 24(13) :i114–i122, 2008. Crossref, Medline, Google Scholar
- 25. , A representation-theoretic approach to the calculation of evolutionary distance in bacteria, J Phys A 50(33):335601, 14, 2017. Crossref, Google Scholar
- 26. , A mean first passage time genome rearrangement distance, J Math Biol 80(6) :1971–1992, 2020. Crossref, Medline, Google Scholar
- 27. , The ‘butterfly effect’ in Cayley graphs with applications to genomics, J Math Biol 65(6–7) :1267–1284, 2012. Crossref, Medline, Google Scholar
- 28. , Bacterial phylogeny in the Cayley graph, Discrete Math Algorithms Appl 11(5) :1950059, 2019. Link, Google Scholar
- 29. Developers TS, SageMath, the Sage Mathematics Software System (Version 9.1), 2020, http://www.sagemath.org. Google Scholar
- 30. , Exploring network structure, dynamics, and function using NetworkX, Proc. 7th Python in Science Conf.,
USA , pp. 11–15, 2008. Google Scholar - 31. The GAP Group, GAP — Groups, Algorithms, and Programming, Version 4.11.1, 2021, https://www.gap-system.org. Google Scholar
- 32. Dabbaghian V, GAP Team T, Repsn, Constructing representations of finite groups, Version 3.1.0, 2019, refereed GAP package. Google Scholar