World Scientific

The potential of family-free rearrangements towards gene orthology inference

    https://doi.org/10.1142/S021972002140014X
    Cited by: 3 (Source: Crossref)

    Recently, we proposed an efficient ILP formulation [Rubert DP, Martinez FV, Braga MDV, Natural family-free genomic distance, Algorithms Mol Biol 16:4, 2021] for exactly computing the rearrangement distance of two genomes in a family-free setting. In such a setting, neither a prior classification of genes into families nor further restrictions on the genomes are imposed. Given two genomes, this ILP computes an optimal matching of the genes, simultaneously taking into account local mutations, given by gene similarities, and large-scale genome rearrangements. Here, we explore the potential of using this ILP for inferring groups of orthologs across several species. More precisely, given a set of genomes, our method first computes all pairwise optimal gene matchings, which are then integrated into gene families in a second step. Our approach is implemented in a pipeline that incorporates the pre-computation of gene similarities. It can be downloaded from gitlab.ub.uni-bielefeld.de/gi/FFGC. Experiments on both simulated and real data yielded promising results.
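
    The abstract does not say how the pairwise matchings are integrated into families. As a rough illustration only, the sketch below assumes the simplest possible rule: two genes belong to the same family whenever a chain of pairwise matches connects them (connected components, computed with union-find). The function name `families_from_matchings`, the `(genome, gene_id)` gene encoding, and the toy input are hypothetical and are not taken from the FFGC pipeline, which may well use a different integration step.

```python
from collections import defaultdict


def families_from_matchings(pairwise_matchings):
    """Group genes into families as connected components of the match graph.

    pairwise_matchings: iterable of matchings, one per genome pair; each
    matching is an iterable of (gene_a, gene_b) pairs, where a gene is a
    hashable, sortable label such as (genome, gene_id).
    ASSUMPTION: connected components stand in for the paper's integration
    step, which the abstract does not describe.
    """
    parent = {}

    def find(x):
        # Register unseen genes as their own root, then walk up to the root
        # while shortening the path (path halving).
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for matching in pairwise_matchings:
        for gene_a, gene_b in matching:
            union(gene_a, gene_b)

    groups = defaultdict(set)
    for gene in list(parent):
        groups[find(gene)].add(gene)

    # Sort for a deterministic result: genes within a family, then families.
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


# Toy input: three genomes A, B, C and three pairwise matchings (made up).
toy_matchings = [
    [(("A", "a1"), ("B", "b1")), (("A", "a2"), ("B", "b2"))],  # A vs B
    [(("B", "b1"), ("C", "c1"))],                              # B vs C
    [(("A", "a2"), ("C", "c2"))],                              # A vs C
]
toy_families = families_from_matchings(toy_matchings)
```

    With this rule a single spurious match is enough to merge two families, which is one reason an actual pipeline might apply a stricter criterion.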

    References

    • 1. Fitch WM, Distinguishing homologous from analogous proteins, Syst Zool 19:99–113, 1970.
    • 2. Bergeron A, Mixtacki J, Stoye J, A unifying view of genome rearrangements, Proc. Int. Workshop on Algorithms in Bioinformatics (WABI), Lecture Notes in Bioinformatics, Vol. 4175, pp. 163–173, 2006.
    • 3. Hannenhalli S, Pevzner PA, Transforming men into mice (polynomial algorithm for genomic distance problem), Proc. IEEE 36th Annual Foundations of Computer Science, pp. 581–592, 1995.
    • 4. Yancopoulos S, Attie O, Friedberg R, Efficient sorting of genomic permutations by translocation, inversion and block interchange, Bioinformatics 21(16):3340–3346, 2005.
    • 5. Braga MDV, Willing E, Stoye J, Double cut and join with insertions and deletions, J Comput Biol 18(9):1167–1184, 2011.
    • 6. Sankoff D, Genome rearrangement with gene families, Bioinformatics 15(11):909–917, 1999.
    • 7. Bryant D, The complexity of calculating exemplar distances, in Sankoff D, Nadeau JH (eds.), Comparative Genomics, Springer, pp. 207–211, 2000.
    • 8. Angibaud S, Fertin G, Rusu I, Thévenin A, Vialette S, On the approximability of comparing genomes with duplicates, J Graph Algorithms Appl 13(1):19–53, 2009.
    • 9. Shao M, Lin Y, Moret B, An exact algorithm to compute the double-cut-and-join distance for genomes with duplicate genes, J Comput Biol 22(5):425–435, 2015.
    • 10. Bohnenkämper L, Braga MDV, Doerr D, Stoye J, Computing the rearrangement distance of natural genomes, J Comput Biol 28(4):410–431, 2021.
    • 11. Shi G, Zhang L, Jiang T, MSOAR 2.0: Incorporating tandem duplications into ortholog assignment based on genome rearrangement, BMC Bioinf 11:10, 2010.
    • 12. Braga MDV, Chauve C, Doerr D, Jahn K, Stoye J, Thévenin A, Wittler R, The potential of family-free genome comparison, in Chauve C, El-Mabrouk N, Tannier E (eds.), Models and Algorithms for Genome Evolution, Chap. 13, Springer-Verlag, pp. 287–307, 2013.
    • 13. Martinez FV, Feijao P, Braga MDV, Stoye J, On the family-free DCJ distance and similarity, Algorithms Mol Biol 10:13, 2015.
    • 14. Rubert DP, Martinez FV, Braga MDV, Natural family-free genomic distance, Algorithms Mol Biol 16:4, 2021.
    • 15. Dessimoz C, Cannarozzi G, Gil M, Margadant D, Roth ACJ, Schneider A, Gonnet GH, OMA, a comprehensive, automated project for the identification of orthologs from complete genome data: Introduction and first achievements, Proc. RECOMB Workshop on Comparative Genomics (RECOMB-CG), Lecture Notes in Bioinformatics, Vol. 3678, pp. 61–72, 2005.
    • 16. Roth ACJ, Gonnet GH, Dessimoz C, Algorithm of OMA for large-scale orthology inference, BMC Bioinf 9:518, 2008.
    • 17. Lechner M, Findeiß S, Steiner L, Marz M, Stadler PF, Prohaska SJ, Proteinortho: Detection of (co-)orthologs in large-scale analysis, BMC Bioinf 12:124, 2011.
    • 18. Lechner M, Hernandez-Rosales M, Doerr D, Wieseke N, Thévenin A, Stoye J, Hartmann RK, Prohaska SJ, Stadler PF, Orthology detection combining clustering and synteny for very large datasets, PLoS One 9(8):e105015, 2014.
    • 19. Doerr D, Thévenin A, Stoye J, Gene family assignment-free comparative genomics, BMC Bioinf 13(Suppl. 19):S3, 2012.
    • 20. van Dongen S, Graph clustering by flow simulation, PhD Thesis, 2000.
    • 21. Vashist A, Kulikowski CA, Muchnik I, Ortholog clustering on a multipartite graph, IEEE/ACM Trans Comput Biol Bioinf 4(1):17–27, 2007.
    • 22. Doerr D, Feijão P, Stoye J, Family-free genome comparison, in Setubal JC, Stoye J, Stadler PF (eds.), Comparative Genomics: Methods and Protocols, Humana Press, pp. 331–342, 2018.
    • 23. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ, Basic local alignment search tool, J Mol Biol 215(3):403–410, 1990.
    • 24. Shi G, Peng MC, Jiang T, MultiMSOAR 2.0: An accurate tool to identify ortholog groups among multiple genomes, PLoS One 6(6):e20892, 2011.
    • 25. Altenhoff AM et al., OMA standalone: Orthology inference among public and custom genomes and transcriptomes, Genome Res 29(7):1152–1163, 2019.
    • 26. Davín AA, Tricou T, Tannier E, de Vienne DM, Szöllősi GJ, Zombi: A phylogenetic simulator of trees, genomes and sequences that accounts for dead linages, Bioinformatics 36(4):1286–1288, 2019.
    • 27. Whelan S, Goldman N, A general empirical model of protein evolution derived from multiple protein families using a maximum-likelihood approach, Mol Biol Evol 18(5):691–699, 2001.
    • 28. Buchfink B, Xie C, Huson DH, Fast and sensitive protein alignment using DIAMOND, Nat Methods 12:59–60, 2015.