Original submission:
Hi there,
I've got a large FASTA file (62 MB) which I'm clustering using cd-hit-para.pl. On the system I'm using, I've successfully clustered the same file with -c 1.0 and no sequence coverage arguments. However, when I add a sequence coverage argument (-aL 0.8, -aS 0.8 or -AL 49), it runs for about 15 min, then hangs. The full command is:
perl cd-hit-para.pl -i input -o output -n 5 --S 64 --Q 8 -aS 0.8
I've tried varying the --Q and --S arguments and adding a -c argument, but the result is the same: the program runs for about 15 min, then hangs.
When I rank the files chronologically, the last three files are always output.31962.o.sh, output.div-0-o.done and output.div-0.log. The .done file contains the date, and the .log file contains the help pages for cd-hit. At this point cd-hit is still listed by ps. I've left the program running for 3 days, but can't see any further changes.
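In case it helps anyone repeat the check, here is a minimal Python sketch of how the most recently modified files can be listed. The helper name newest_files and the "output" prefix are only illustrative assumptions based on my -o argument; this is not part of cd-hit.

```python
from pathlib import Path


def newest_files(directory, prefix, n=3):
    """Return the names of the n most recently modified files in
    `directory` whose names start with `prefix`, oldest first."""
    paths = [
        p for p in Path(directory).iterdir()
        if p.is_file() and p.name.startswith(prefix)
    ]
    # Sort by modification time so the newest files end up last.
    paths.sort(key=lambda p: p.stat().st_mtime)
    # Keep only the last n entries (an empty list if n is 0).
    return [p.name for p in paths[max(len(paths) - n, 0):]]


if __name__ == "__main__":
    # "output" is the prefix passed to -o in the command above.
    for name in newest_files(".", "output"):
        print(name)
```

Run from the working directory, this prints the last three files written with the given prefix, which is the listing I describe above.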
Am I missing something obvious? Cheers.