###################################################################################
#
# ANEX_P (Alignment Neighborhood Explorer, type P):
# A package including a program to construct approximate probability distributions of alternative Multiple Sequence Alignments (MSAs)
# by exploring the neighborhoods of an input MSA;
# the MSA probabilities are computed under a given genuine sequence evolution model with realistic insertions/deletions.
# (Written almost exclusively in Perl.)
#
# Version 0.7: Copyright (C) 2020 Kiyoshi Ezawa
#
# This package is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This package is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License, "GNU_GPL.txt",
# along with this package. If not, see .
#
# The author can be contacted by e-mailing to
# (replace " dot " and " at " with "." and "@", respectively).
#
###################################################################################
#
* [ Major modification from version 0.7 to version 0.7.1 ]
  * Information on the references, Ezawa 2020a,b,c, was updated.

* [ This file was created while finishing version 0.7 of the ANEX_P package. (See the bottom.)
]

---------------------------------------------------------------------------------------------

<< README for the archive, "ExOutputs_ANEX.ver0.7.tgz" >>

This archive (or the directory, "ExOutputs_ANEX.ver0.7/," extracted from the archive), which accompanies the "ANEX_P" package, contains some output files (and log files) created in the course of the analyses that were conducted using the files in the "ANALYSES/" directory in the "ANEX_P" package (ver 0.7) (Ezawa 2020b).
You can either simply refer to these files, or compare them with the results you obtain by running the scripts provided in "ANALYSES/" in "ANEX_P."

The "ANALYSES/" sub-directory under this "../ExOutputs_ANEX.ver0.7/" directory has a directory structure nearly identical to that of "ANALYSES/" in "ANEX_P" (ver 0.7).
(For explanations on the latter, refer to the "ANALYSES/README.ANALYSES.txt" file in "ANEX_P" (ver 0.7).)
However, this archive lacks some subdirectories (e.g., the "xxx/Scripts/" subdirectories), as it should.
In addition, some subdirectories (e.g., the "xxx/LogFiles/" subdirectories) may be unique to this archive.

The two main types of output files are tables (TSV, tab-separated values, in text format) and Excel spreadsheets.
Tables usually have names like "tbl_xxx.txt(.gz)," but a few follow different naming patterns (e.g., "table_xxx.txt(.gz)").
Excel spreadsheets share the common suffix, ".xls," and all of them are placed either in the subdirectory, "ANALYSES//Preliminaries/Examine_Purges/PRANK/Best_Match/Excel_Files/," or in the subdirectory, "ANALYSES//Preliminaries/Examine_Complex/PRANK/Best_Match/Excel_Files/."
These Excel spreadsheets give details on the analyses briefly described in (Ezawa 2020c), and/or summarize some other preliminary analyses conducted before we started developing ANEX.

-------------------------------------------------------------------------------------------------------------

[ References ]

* Cartwright RA. 2005. "DNA assembly with gap (Dawg): simulating sequence evolution."
Bioinformatics 21:iii31-iii38.
## * Ezawa K. 2013a. "DENSERM: DEtecting Negative SElection on Recurrent Mutations," in Bioinformatics.org [URL: "http colon slash slash www.bioinformatics.org slash ftp slash pub slash DENSERM" (replace ' colon ' and ' slash ' with ':' and '/', respectively)].
## * Ezawa K. 2013b. "LOLIPOG: LOg-LIkelihood for the Pattern Of Gaps in MSA," in Bioinformatics.org [URL: "http colon slash slash www.bioinformatics.org slash ftp slash pub slash lolipog" (replace ' colon ' and ' slash ' with ':' and '/', respectively)].
* Ezawa K. 2016a. "Characterizing multiple sequence alignment errors using complete-likelihood score and position-shift map." BMC Bioinformatics 17:133; DOI: 10.1186/s12859-016-0945-5.
* Ezawa K. 2016b. "General continuous-time Markov model of sequence evolution via insertions/deletions: Are alignment probabilities factorable?" BMC Bioinformatics 17:304; DOI: 10.1186/s12859-016-1105-7.
* Ezawa K. 2016c. "General continuous-time Markov model of sequence evolution via insertions/deletions: local alignment probability computation." BMC Bioinformatics 17:397; DOI: 10.1186/s12859-016-1167-6.
## * Ezawa K, Landan G, Graur D. 2013. "Detecting negative selection on recurrent mutations using gene genealogy." BMC Genetics. 14:37.
* Ezawa K, Graur D, Landan G. 2015a. "Perturbative formulation of general continuous-time Markov model of sequence evolution via insertions/deletions, Part III: Algorithm for first approximation." bioRxiv doi:10.1101/023614.
## * Ezawa K, Graur D, Landan G. 2015b. "Perturbative formulation of general continuous-time Markov model of sequence evolution via insertions/deletions, Part IV: Incorporation of substitutions and other mutations." bioRxiv doi:10.1101/023622.
## * Ezawa K. 2020a.
"New perturbation method to compute probabilities of mutually adjoining insertion-type and deletion-type gaps in ancestor-descendant pairwise sequence alignment under genuine sequence evolution model with realistic insertions/deletions: the 'last piece of the puzzle'." (preprint "KEZW_BI_ME00005.lastpiece.pdf" available at: https://www.bioinformatics.org/ftp/pub/anex/Documents/Preprints/.) * Ezawa K. 2020b. "Alingment Neighborhood EXplorer (ANEX): First attempt to apply genuine sequence evolution model with realistic insertions/deletions to Multiple Sequence Alignment reconstruction problem." (preprint "KEZW_BI_ME00006.anex.pdf" available at: https://www.bioinformatics.org/ftp/pub/anex/Documents/Preprints/.) * Ezawa K. 2020c. "Substitutional Residue-Difference Map (SRD Map) to help locate mis-alignments in Multiple Sequence Alignment (MSA): toward Artificial-Intelilgence-assisted probability distribution of alternative MSAs." (preprint "KEZW_BI_ME00007.srdmap.pdf" available at: https://www.bioinformatics.org/ftp/pub/anex/Documents/Preprints/.) ## * Guindon S, Dufayard JF, Lefort V, Anisimova M, hordijk W, Gascuel O. 2010. "New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0." Syst Biol. 59:307-321. ## * Katoh K, Toh H. 2008. "Recent developments in the MAFFT multiple sequence alignment program." Brief Bioinformatics. 9:286-298. ## * Loytynoja A, Goldman N. 2008. "Phylogeny-aware gap placement prevents errors in sequence alignment and evolutionary analysis." Science 320:1632-1635. ## * Lunter G, Miklos I, Drummond A, Jensen JL, Hein J. 2005. "Bayesian coestimation of phylogeny and sequence alignment." BMC Bioinformatics 6:83. ## * Pollard KS, Hubisz MJ, Rosenbloom KR, Siepel A. 2010. "Detection of nonneutral substitution rates on mammalian phylogenies." Genome Res. 20:110-121. # First version of this file was created on August 10th (Mon), 2020 by K. Ezawa. # It was rewritten on August 13th (Thu), 2020, by K. 
Ezawa, to update information on (Ezawa 2020a,b,c).