[BiO BB] Comparing sequences from GenBank and RefSeq...

Dan Bolser dan.bolser at gmail.com
Thu Apr 23 11:42:42 EDT 2009


Hi,

I found that the potato chloroplast sequence from GenBank (DQ231562.1)
has several differences (260 SNPs and 30 indels) relative to the same
sequence in RefSeq (NC_008096.1). As far as I am aware, this sequence
has only been obtained once, so why would the two differ? In general,
should I trust the RefSeq sequence?


For your reference, here is the output of dnadiff over the two files
(GenBank as reference, RefSeq as query):

Reference/DQ231562.fasta Query/NC_008096.fasta
NUCMER

                               [REF]                [QRY]
[Sequences]
TotalSeqs                          1                    1
AlignedSeqs               1(100.00%)           1(100.00%)
UnalignedSeqs               0(0.00%)             0(0.00%)

[Bases]
TotalBases                    155312               155298
AlignedBases         155312(100.00%)      155298(100.00%)
UnalignedBases              0(0.00%)             0(0.00%)

[Alignments]
1-to-1                             1                    1
TotalLength                   155312               155298
AvgLength                  155312.00            155298.00
AvgIdentity                    99.81                99.81

M-to-M                             1                    1
TotalLength                   155312               155298
AvgLength                  155312.00            155298.00
AvgIdentity                    99.81                99.81

[Feature Estimates]
Breakpoints                        0                    0
Relocations                        0                    0
Translocations                     0                    0
Inversions                         0                    0

Insertions                         0                    0
InsertionSum                       0                    0
InsertionAvg                    0.00                 0.00

TandemIns                          0                    0
TandemInsSum                       0                    0
TandemInsAvg                    0.00                 0.00

[SNPs]
TotalSNPs                        260                  260
AC                         23(8.85%)            14(5.38%)
AG                         24(9.23%)           30(11.54%)
AT                         15(5.77%)            14(5.38%)
CA                         14(5.38%)            23(8.85%)
CG                         24(9.23%)            18(6.92%)
CT                        32(12.31%)            19(7.31%)
GA                        30(11.54%)            24(9.23%)
GC                         18(6.92%)            24(9.23%)
GT                         13(5.00%)           34(13.08%)
TA                         14(5.38%)            15(5.77%)
TC                         19(7.31%)           32(12.31%)
TG                        34(13.08%)            13(5.00%)

TotalGSNPs                       113                  113
AC                          9(7.96%)             8(7.08%)
AG                        17(15.04%)           17(15.04%)
AT                          5(4.42%)             3(2.65%)
CA                          8(7.08%)             9(7.96%)
CG                          6(5.31%)             7(6.19%)
CT                        15(13.27%)             8(7.08%)
GA                        17(15.04%)           17(15.04%)
GC                          7(6.19%)             6(5.31%)
GT                          6(5.31%)           12(10.62%)
TA                          3(2.65%)             5(4.42%)
TC                          8(7.08%)           15(13.27%)
TG                        12(10.62%)             6(5.31%)

TotalIndels                       30                   30
A.                        14(46.67%)            4(13.33%)
C.                          1(3.33%)             0(0.00%)
G.                          0(0.00%)             0(0.00%)
T.                         7(23.33%)            4(13.33%)

TotalGIndels                      24                   24
A.                        10(41.67%)            4(16.67%)
C.                          1(4.17%)             0(0.00%)
G.                          0(0.00%)             0(0.00%)
T.                         5(20.83%)            4(16.67%)
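
As a quick sanity check, the figures in the report can be cross-checked
with a few lines of Python. This is only a sketch: the values are copied
by hand from the output above, and the reading of the indel rows (the
[REF] column counting bases present only in the reference, the [QRY]
column counting bases present only in the query) is my assumption, not
something taken from the dnadiff documentation. The identity figure is a
crude estimate and not necessarily how nucmer computes AvgIdentity.

```python
# Values copied by hand from the dnadiff report above.
ref_len, qry_len = 155312, 155298
total_snps, total_indels = 260, 30

# SNP rows ("XY" = X in one sequence, Y in the other), one dict per column.
ref_snps = {"AC": 23, "AG": 24, "AT": 15, "CA": 14, "CG": 24, "CT": 32,
            "GA": 30, "GC": 18, "GT": 13, "TA": 14, "TC": 19, "TG": 34}
qry_snps = {"AC": 14, "AG": 30, "AT": 14, "CA": 23, "CG": 18, "CT": 19,
            "GA": 24, "GC": 24, "GT": 34, "TA": 15, "TC": 32, "TG": 13}

# Indel rows ("X." = base X opposite a gap).
# ASSUMPTION: [REF] counts bases found only in the reference,
# [QRY] counts bases found only in the query.
ref_only = {"A": 14, "C": 1, "G": 0, "T": 7}
qry_only = {"A": 4, "C": 0, "G": 0, "T": 4}

# Each SNP column should sum to the reported total.
assert sum(ref_snps.values()) == total_snps
assert sum(qry_snps.values()) == total_snps

# The two columns should mirror each other: A->C seen from the
# reference is C->A seen from the query.
snps_mirrored = all(qry_snps[k[::-1]] == v for k, v in ref_snps.items())

# Under the assumption above, the two indel columns together should
# account for every indel, and their difference for the length gap.
ref_extra = sum(ref_only.values())
qry_extra = sum(qry_only.values())
indels_accounted = (ref_extra + qry_extra == total_indels)
length_gap_explained = (ref_extra - qry_extra == ref_len - qry_len)

# Crude identity estimate: treat every SNP and indel as one differing
# position along the reference.
identity = 100.0 * (ref_len - (total_snps + total_indels)) / ref_len

print("SNP columns mirrored:   ", snps_mirrored)
print("Indels accounted for:   ", indels_accounted)
print("Length gap explained:   ", length_gap_explained)
print(f"Estimated identity:      {identity:.2f}%")
```

Under that reading, all three checks pass and the estimate rounds to the
reported 99.81%, so the report is at least internally consistent. That
says nothing about which of the two records is the correct one.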


Thanks for any pointers,
Dan.



