<<< METHODS & RESULTS (summary) of our manual validation analyses on the prototype Perl script, "classify_msa_errors_via_mblks.2val.pl" >>>

<< METHODS >>

1. For each set of reconstructed MSAs (via either MAFFT (E-INS-i) or Prank ("best-fit")),
	 each reconstructed MSA was compared with its true counterpart (via Dawg) by running the prototype Perl script.

2. For each erroneous segment in each pair of reconstructed and true MSAs,
    errors associated with each position-shift block (output by the prototype script) were manually classified
    based on the position-shift map (also output by the prototype script).
    (Both the position-shift map and the blocks are recorded in "log_seg_anal.mm.txt," where "mm" is the segment ID. See README.txt for the file's location.)

3. Then, the results of the manual classification were compared with those of the automatic classification (which are also recorded in "log_seg_anal.mm.txt").
  Each automatic classification was judged "Correct" if its main part matched the manual result, and "WRONG" otherwise.
  When multiple errors interacted with one another,
  "Correct/2" was assigned if the script correctly identified the types of the component errors
   but failed to point out their interacting nature.

4. Erroneous segments with long gaps or "complex" errors were excluded from the manual validation analyses,
   because they are currently beyond the scope of the prototype script.


<< RESULTS (summary) >>

The results of individual manual validations are recorded in "manual_validation_of_classification.xls."
They can be summarized as follows.

< For MAFFT (E-INS-i) >

#{Correct}   = 170,
#{Correct/2} =   2,
#{WRONG}     =   8.

Thus, counting each "Correct/2" case as half a correct classification,

%{Correct classifications} = 100 * (170 + 2/2) /(170 + 2 + 8) = 95.0%.


< For Prank ("best-fit") >

#{Correct}   = 170,
#{Correct/2} =   2,
#{WRONG}     =   5. 

Thus,

%{Correct classifications} = 100 * (170 + 2/2) /(170 + 2 + 5) = 96.6%.
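
The two percentages above can be re-derived with a short script. The following is only a minimal sketch in Python and is not part of the prototype Perl script; the function name "percent_correct" and its argument names are hypothetical, introduced here for illustration. It assumes, as in the formulas above, that each "Correct/2" case counts as half a correct classification.

```python
def percent_correct(n_correct, n_half, n_wrong):
    """Return the percentage of correct classifications.

    Hypothetical helper (not part of the prototype script).
    Each "Correct/2" case is weighted as half a correct classification.
    """
    total = n_correct + n_half + n_wrong
    if total == 0:
        raise ValueError("no validated cases")
    return 100.0 * (n_correct + n_half / 2) / total


# Counts taken from the summary above.
mafft = percent_correct(170, 2, 8)
prank = percent_correct(170, 2, 5)

print(f"MAFFT (E-INS-i):  {mafft:.1f}%")   # 95.0%
print(f"Prank (best-fit): {prank:.1f}%")   # 96.6%
```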



# This file was written by Kiyoshi Ezawa on Tuesday, January 12th, 2016.

