[Bio-Linux] tophat with -p > 1 results in missing reads
Tim Booth
tbooth at ceh.ac.uk
Tue Aug 25 06:56:27 EDT 2015
Hi Josh,
This does look like a bug in TopHat. I'm working on pushing the latest
TopHat and Bowtie2 versions out to the repository right now, to see
whether the issue is fixed in them:
TopHat 2.1.0
Bowtie2 2.2.6
But the updated TopHat requires some extra Python libs, so I'm having to
add them too and it takes a bit longer. I'll let you know when the
update is ready.
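
In the meantime, here is a minimal Python sketch (not part of TopHat) for
checking how much of the input a given run actually processed. It assumes
the single-end align_summary.txt layout quoted below and an uncompressed
FASTQ file with exactly four lines per read. The function names are
hypothetical and chosen only for illustration.

```python
import re


def parse_align_summary(text):
    """Return (input_reads, mapped_reads) from align_summary.txt text.

    Assumes the single-end layout with one "Input" and one "Mapped" line.
    """
    input_match = re.search(r"Input\s*:\s*(\d+)", text)
    mapped_match = re.search(r"Mapped\s*:\s*(\d+)", text)
    if input_match is None or mapped_match is None:
        raise ValueError("could not find Input/Mapped lines in summary")
    return int(input_match.group(1)), int(mapped_match.group(1))


def count_fastq_reads(path):
    """Count reads in an uncompressed FASTQ file (assumes 4 lines per read)."""
    with open(path) as handle:
        n_lines = sum(1 for _ in handle)
    return n_lines // 4


def fraction_of_input_seen(summary_text, expected_reads):
    """Fraction of the expected reads that the summary reports as input."""
    input_reads, _mapped = parse_align_summary(summary_text)
    return input_reads / expected_reads


# Example using the -p 8 figures quoted below.
example_summary = "Input : 318640\nMapped: 191949 (60.2% of input)\n"
print("%.2f%% of reads reached the summary"
      % (100 * fraction_of_input_seen(example_summary, 18115321)))
```

Any value well below 100% for the same FASTQ file means reads were
dropped before alignment was summarised, which is what the -p 4 and -p 8
runs below show.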
Cheers,
TIM
On Fri, 2015-08-21 at 13:48 -0400, Josh Thackray wrote:
> Hi All,
>
> I am running tophat (version 2.0.13, from the biolinux distribution). I
> am facing a problem where running tophat with increasing values for -p
> (number of threads) results in more and more reads lost in the final
> output. I'm starting with an uncompressed fastq file containing
> 18,115,321 reads, and running tophat with default parameters except for
> -p and -o.
>
> Running with -p 8 gives the following in align_summary.txt:
> Input : 318640
> Mapped: 191949 (60.2% of input)
> of these: 29316 (15.3%) have multiple alignments (0 have >20)
> 60.2% overall read mapping rate.
>
> Running with -p 4 gives the following in align_summary.txt:
> Input : 1302700
> Mapped: 759998 (58.3% of input)
> of these: 115861 (15.2%) have multiple alignments (1 have >20)
> 58.3% overall read mapping rate.
>
> Running with -p 1 gives the following in align_summary.txt:
> Input : 18115321
> Mapped: 12014534 (66.3% of input)
> of these: 1867188 (15.5%) have multiple alignments (13 have >20)
> 66.3% overall read mapping rate.
>
> I also tried running tophat with the --no-sort-bam option to check if
> samtools was somehow screwing up during the mergesort operation, but I
> get the same result. I also confirmed the numbers reported in the
> align_summary.txt file using the samtools flagstat command. Furthermore,
> using bowtie1 instead of bowtie2 as the alignment engine did not
> stop these reads from going missing.
>
> Any ideas???
>
> Thanks,
>
> Josh
>
--
Tim Booth <tbooth at ceh.ac.uk>
Centre for Ecology and Hydrology
Maclean Bldg, Benson Lane
Crowmarsh Gifford
Wallingford, England
OX10 8BB
http://environmentalomics.org/bio-linux
+44 1491 69 2297