PredictBias

Identification of genomic and pathogenicity islands in prokaryotic genomes
 
A) Input parameters
Genome: MG37.gb
Threshold dinucleotide bias: 2
Threshold codon bias: 4
Threshold %GC bias: 3
E-value (RPSBlast): 0.05
Genome (non-pathogenic):
 
 
C) Potential GIs and PAIs in NC_000908
S.No  Start       End         Bias  Virulence  Insertion elements  Prediction
1     MG_RS00020  MG_RS00055  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
MG_RS00020  2       16       -2.140178  DNA gyrase subunit A
MG_RS00025  3       17       -4.075062  serine--tRNA ligase
MG_RS00030  4       19       -5.247624  thymidylate kinase
MG_RS00035  4       19       -5.479665  DNA polymerase III subunit delta'
MG_RS00040  3       19       -5.047270  tRNA uridine-5-carboxymethylaminomethyl(34)
MG_RS00045  2       18       -4.161175  deoxyribonuclease
MG_RS00050  1       16       -3.601400  DNA primase-like protein
MG_RS00055  2       14       -3.769152  hypothetical protein
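
The per-gene rows of this report can be split into fields with a regular expression. The following is a minimal sketch, not part of PredictBias itself: the names `GeneRow`, `ROW_RE` and `parse_gene_row` are hypothetical, and the pattern assumes a five-digit `MG_RS` locus tag, a single-digit dinucleotide bias, a codon bias of at most two digits, and a %GC bias printed with exactly six decimals. It accepts rows with or without whitespace between the fields, but a single-digit codon bias followed directly by a positive %GC bias would be ambiguous without separators.

```python
import re
from typing import NamedTuple, Optional


class GeneRow(NamedTuple):
    """One row of a per-gene bias table (hypothetical container)."""
    locus_tag: str
    dn_bias: int      # dinucleotide bias
    cdn_bias: int     # codon bias
    gc_bias: float    # %GC bias
    product: str


# Assumed layout: tag, 1-digit DN bias, 1-2 digit codon bias,
# %GC bias with six decimals, then the free-text product.
ROW_RE = re.compile(
    r"^(MG_RS\d{5})\s*(-?\d)\s*(\d{1,2})\s*(-?\d+\.\d{6})\s*(.*)$"
)


def parse_gene_row(line: str) -> Optional[GeneRow]:
    """Return the parsed row, or None if the line is not a gene row."""
    match = ROW_RE.match(line.strip())
    if match is None:
        return None
    tag, dn, cdn, gc, product = match.groups()
    return GeneRow(tag, int(dn), int(cdn), float(gc), product.strip())


if __name__ == "__main__":
    print(parse_gene_row("MG_RS00020216-2.140178DNA gyrase subunit A"))
    print(parse_gene_row("MG_RS00770  -1  19  3.377666  hypothetical protein"))
```

Anchoring the %GC field on its six decimals is what lets a product name that starts with digits, such as "50S ribosomal protein L21", be separated from the number in front of it.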
ORFs having significant similarity with known virulence factors
LocusTag    Hits          Score  E-value  Comments
MG_RS00025  TYPE4SSCAGX   32     0.007    Type IV secretion system CagX conjugation protein si...
>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein signature.
Length = 522

Score = 31.7 bits (71), Expect = 0.007
Identities = 29/102 (28%), Positives = 52/102 (50%), Gaps = 8/102 (7%)

Query: 1 MLDPNKLRNNYDFFKKKLLERNVNEQLLNQFIQTDKLMRKNLQQLELANQKQSLLAKQVA 60
+ P +++N F+K+ + + + +F++T KL+ EL QK++L ++ A
Sbjct: 96 FIQPKSVKSNL-MFEKEAVNFALMTRDYQEFLKTKKLIVDAPDPKELEEQKKALEKEKEA 154

Query: 61 K---QKDNKKLLAESKELKQK----IENLNNAYKDSQNISQD 95
K QK K + KE + K +ENL NA + QN+S +
Sbjct: 155 KEQAQKAQKDKREKRKEERAKNRANLENLTNAMSNPQNLSNN 196
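
The two statistics lines that precede each alignment in this report ("Score = ..., Expect = ..." and "Identities = ..., Positives = ..., Gaps = ...") can be read programmatically. The sketch below is illustrative only: `parse_hsp_header` and `passes_cutoff` are hypothetical names, and the assumption that a hit is kept when its E-value is at or below the 0.05 RPSBlast parameter is inferred from the reported values, not taken from the tool's documentation.

```python
import re

SCORE_RE = re.compile(
    r"Score\s*=\s*([\d.]+)\s*bits\s*\((\d+)\),\s*Expect\s*=\s*(\S+)"
)
FRACTION_RE = re.compile(r"(Identities|Positives|Gaps)\s*=\s*(\d+)/(\d+)")


def parse_hsp_header(score_line: str, stats_line: str) -> dict:
    """Extract bit score, raw score, E-value and alignment fractions."""
    match = SCORE_RE.search(score_line)
    if match is None:
        raise ValueError(f"not a score line: {score_line!r}")
    evalue_text = match.group(3)
    if evalue_text.startswith("e"):
        # Some BLAST outputs abbreviate "1e-100" as "e-100".
        evalue_text = "1" + evalue_text
    hsp = {
        "bits": float(match.group(1)),
        "raw_score": int(match.group(2)),
        "evalue": float(evalue_text),
    }
    # The Gaps field is absent when the alignment has no gaps.
    for name, num, den in FRACTION_RE.findall(stats_line):
        hsp[name.lower()] = int(num) / int(den)
    return hsp


def passes_cutoff(hsp: dict, max_evalue: float = 0.05) -> bool:
    """Assumed rule: keep hits whose E-value does not exceed the cutoff."""
    return hsp["evalue"] <= max_evalue


if __name__ == "__main__":
    hsp = parse_hsp_header(
        "Score = 31.7 bits (71), Expect = 0.007",
        "Identities = 29/102 (28%), Positives = 52/102 (50%), Gaps = 8/102 (7%)",
    )
    print(hsp, passes_cutoff(hsp))
```

Recomputing the fractions from the counts avoids relying on the rounded percentages printed in parentheses.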


2     MG_RS00745  MG_RS00770  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
MG_RS00745  2       22       3.948361   UDP-galactopyranose mutase
MG_RS00750  3       24       5.356348   elongation factor 4
MG_RS00755  4       29       6.804275   ribonuclease J
MG_RS00760  1       25       5.129969   hypothetical protein
MG_RS00765  0       20       3.711208   adhesin
MG_RS00770  -1      19       3.377666   hypothetical protein
ORFs having significant similarity with known virulence factors
LocusTag    Hits          Score  E-value  Comments
MG_RS00750  TCRTETOQM     170    2e-47    Tetracycline resistance protein TetO/TetQ/TetM family ...
>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.
Length = 639

Score = 170 bits (431), Expect = 2e-47
Identities = 111/438 (25%), Positives = 175/438 (39%), Gaps = 86/438 (19%)

Query: 5 NIRNFSIIAHIDHGKSTLSDRLLEHSLGFEKRL----LQAQMLDTMEIERERGITIKLNA 60
I N ++AH+D GK+TL++ LL +S G L D +ER+RGITI+
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNS-GAITELGSVDKGTTRTDNTLLERQRGITIQTGI 60

Query: 61 VELKINVDNNNYLFHLIDTPGHVDFTYEVSRSLAACEGVLLLVDATQGIQAQ-------- 112
+ N ++IDTPGH+DF EV RSL+ +G +LL+ A G+QAQ
Sbjct: 61 ----TSFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHAL 116

Query: 113 ------TI-------------SNAYLALENNL--EIIP-----------VINKIDMDNAD 140
TI S Y ++ L EI+ V N + + D
Sbjct: 117 RKMGIPTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWD 176

Query: 141 ----------------IETTKDSLHNLL--GVEKNSICLV---SAKANLGIDQLIQTIIA 179
L S+ V SAK N+GID LI+ I
Sbjct: 177 TVIEGNDDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVITN 236

Query: 180 KIPPPKGEINRPLKALLFDSYYDPYKGVVCFIRVFDGCLKVNDKVRFIKSNSVYQIVELG 239
K L +F Y + + +IR++ G L + D VR + + +I E+
Sbjct: 237 KFYSSTHRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEKEKI-KITEMY 295

Query: 240 VKTPFFE-KRDQLQAGDVGWFSAGIKKLRDVGVGDTIVSFDDQFTKPLAGYKKILPMIYC 298
K D+ +G++ KL V +GDT + Q + + LP++
Sbjct: 296 TSINGELCKIDKAYSGEIVILQNEFLKLNSV-LGDTKL--LPQRERI----ENPLPLLQT 348

Query: 299 GLYPVDNSDYQNLKLAMEKIIISDAALEYEY--ETSQALGFGVRCGFLGLLHMDVIKERL 356
+ P + L A+ +I SD L Y T + + FLG + M+V L
Sbjct: 349 TVEPSKPQQREMLLDALLEISDSDPLLRYYVDSATHEII-----LSFLGKVQMEVTCALL 403

Query: 357 EREYNLKLISAPPSVVYK 374
+ +Y++++ P+V+Y
Sbjct: 404 QEKYHVEIEIKEPTVIYM 421



Score = 35.2 bits (81), Expect = 9e-04
Identities = 19/105 (18%), Positives = 35/105 (33%), Gaps = 15/105 (14%)

Query: 400 ISEPFVKVFIDLPDQYLGSVIDLCQNFRGQYESLNEIDINRKRICYLMPLGEIIYSFFDK 459
+ EP++ I P +YL + ++ N + +P I +
Sbjct: 535 LLEPYLSFKIYAPQEYLSRAYTDAPKYCANIVD-TQLKNNEVILSGEIPARC-IQEYRSD 592

Query: 460 LKSISKGYASLNYEFYNYQ-------------HSQLEKVEIMLNK 491
L + G + E Y +S+++KV M NK
Sbjct: 593 LTFFTNGRSVCLTELKGYHVTTGEPVCQPRRPNSRIDKVRYMFNK 637


3     MG_RS01330  MG_RS01410  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
MG_RS01330  3       18       2.375663   amino acid permease
MG_RS01335  1       19       3.373885   thymidylate synthase
MG_RS01340  0       18       3.109852   dihydrofolate reductase
MG_RS01345  0       19       3.439554   ribonucleoside-diphosphate reductase subunit
MG_RS01350  0       20       4.339553   class Ib ribonucleoside-diphosphate reductase
MG_RS01355  0       18       4.092015   ribonucleoside-diphosphate reductase subunit
MG_RS01365  0       22       3.354888   50S ribosomal protein L21
MG_RS01370  -2      19       2.283338   ribosomal-processing cysteine protease Prp
MG_RS01375  0       20       3.253327   50S ribosomal protein L27
MG_RS01380  -1      18       2.458275   endonuclease
MG_RS01385  -2      14       0.877170   transcriptional repressor
MG_RS01390  -1      13       -0.117217  DUF3196 domain-containing protein
MG_RS01395  -2      11       -0.371923  trigger factor
MG_RS01400  -1      10       -0.990734  Lon protease
MG_RS01405  -2      8        -3.035446  nicotinate-nucleotide adenylyltransferase
MG_RS01410  -2      7        -3.067851  hypothetical protein
ORFs having significant similarity with known virulence factors
LocusTag    Hits          Score  E-value  Comments
MG_RS01375  TYPE3IMRPROT  27     0.012    Type III secretion system inner membrane R protein ...
>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.
Length = 261

Score = 27.0 bits (60), Expect = 0.012
Identities = 7/30 (23%), Positives = 11/30 (36%)

Query: 47 VGQIIYRQRGTKIFAGQNVAMGSDNTLFAL 76
G+II Q G + A + + A
Sbjct: 98 AGEIIGLQMGLSFATFVDPASHLNMPVLAR 127


ORFs having significant similarity with known virulence factors
LocusTag    Hits          Score  E-value  Comments
MG_RS01405  LPSBIOSNTHSS  38     1e-05    Lipopolysaccharide core biosynthesis protein signat...
>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein signature.
Length = 166

Score = 38.3 bits (89), Expect = 1e-05
Identities = 21/73 (28%), Positives = 32/73 (43%), Gaps = 10/73 (13%)

Query: 7 IFGGSFDPIHNAHLYIAKHAIKKIKAQKLFFVPTYNGIFKN---NFHASNKDRIAMLKLA 63
I+ GSFDPI HL I + F Y + +N S ++R+ + A
Sbjct: 4 IYPGSFDPITFGHLDIIERGC-------RLFDQVYVAVLRNPNKQPMFSVQERLEQIAKA 56

Query: 64 IKSVNNALVSNFD 76
I + NA V +F+
Sbjct: 57 IAHLPNAQVDSFE 69


4     MG_RS01595  MG_RS01755  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
MG_RS01595  -1      12       3.082428   hypothetical protein
MG_RS01600  0       15       3.887603   hypothetical protein
MG_RS01605  1       15       4.254191   lipoate--protein ligase A
MG_RS01610  1       15       4.656198   dihydrolipoyl dehydrogenase
MG_RS01615  1       14       3.903135   dihydrolipoyllysine-residue acetyltransferase
MG_RS01620  -1      15       2.893314   pyruvate dehydrogenase E1 component subunit
MG_RS01625  -1      14       1.989724   pyruvate dehydrogenase E1 component subunit
MG_RS01630  -1      12       1.336138   NADH oxidase
MG_RS01635  -1      12       1.297133   adenine phosphoribosyltransferase
MG_RS01640  0       12       1.328615   hypothetical protein
MG_RS01645  -1      11       0.779292   guanosine-3',5'-bis(diphosphate)
MG_RS01650  0       12       0.996080   hypothetical protein
MG_RS01655  2       14       1.222864   hypothetical protein
MG_RS01660  1       14       0.837553   IgG-blocking protein M
MG_RS01690  -1      13       -0.452702  *****transcription elongation factor GreA
MG_RS01695  -1      13       0.351737   proline--tRNA ligase
MG_RS01700  -1      19       1.616061   hypothetical protein
MG_RS01705  1       20       3.088060   hypothetical protein
MG_RS01710  3       30       6.596588   hypothetical protein
MG_RS02830  2       20       3.486194   acyl carrier protein
MG_RS01720  2       18       2.435768   *hypothetical protein
MG_RS02835  1       15       0.685537   hypothetical protein
MG_RS01725  -1      13       -1.188308  adhesin
MG_RS01730  -2      13       -2.041755  hypothetical protein
MG_RS01735  1       15       -4.433790  high affinity transport system protein p37
MG_RS01745  0       14       -4.405614  ABC transporter ATP-binding protein
MG_RS01750  -1      12       -3.637942  ABC transporter permease
MG_RS01755  -3      12       -3.474341  Holliday junction resolvase RuvX
ORFs having significant similarity with known virulence factors
LocusTag    Hits          Score  E-value  Comments
MG_RS01595  GPOSANCHOR    36     2e-04    Gram-positive coccus surface protein anchor signature.
>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 35.8 bits (82), Expect = 2e-04
Identities = 28/222 (12%), Positives = 68/222 (30%), Gaps = 1/222 (0%)

Query: 94 LEGEINRQVQQNSELFSQLKQSESEIIQMQQLVEAKEHQIEALNKQLHAIKEANKKLIEE 153
L+ + + N L + E+ ++ + + + ++ ++ L +
Sbjct: 69 LKLKNSDLSFNNKALKDHNDELTEELSNAKEKLRKNDKSLSEKASKIQELEARKADLEKA 128

Query: 154 HESINVEELLKEYEVQCNEAIYKRDQHIQTVFEDKLALKDGEISETQSLLKTAEKEKQAL 213
E +++ EA + E L + + +KT E EK AL
Sbjct: 129 LEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAAL 188

Query: 214 KKAYKLVVNSLQKHQKLTTEIEIDFTKLDEIIATIFDETKNPKTGFTNFIKQFEKTKAKL 273
+ + +L+ +T L+ A + + + + AK+
Sbjct: 189 EARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKI 248

Query: 274 TKKIAEITKLDHSTPTNYQQETPASQQQLDQENEPIKPSKKS 315
AE L+ ++ + ++ IK +
Sbjct: 249 KTLEAEKAALEARQA-ELEKALEGAMNFSTADSAKIKTLEAE 289


ORFs having significant similarity with known virulence factors
LocusTag    Hits          Score  E-value  Comments
MG_RS01615  RTXTOXIND     29     0.046    Gram-negative bacterial RTX secretion protein D signat...
>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.
Length = 478

Score = 28.6 bits (64), Expect = 0.046
Identities = 18/120 (15%), Positives = 40/120 (33%), Gaps = 1/120 (0%)

Query: 50 SPFAGTISAINVKVGDVVSIGQVMAVIGEKTSTPLVEPKPQPTEEVAKVKEAGASVVGEI 109
+ I VK G+ V G V+ + K Q + A++++ ++
Sbjct: 101 PIENSIVKEIIVKEGESVRKGDVLLKL-TALGAEADTLKTQSSLLQARLEQTRYQILSRS 159

Query: 110 KVSDNLFPIFGVKPHATPAVKDTKVASSTNITVETTQKPESKTEQKTIAISTMRKAIAEA 169
+ L + V + +V T++ E +++ QK + + R
Sbjct: 160 IELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTV 219


ORFs having significant similarity with known virulence factors
LocusTag    Hits          Score  E-value  Comments
MG_RS01655  TYPE4SSCAGA   30     0.016    Type IV secretion system CagA exotoxin signature.
>TYPE4SSCAGA#Type IV secretion system CagA exotoxin signature.

Length = 1147

Score = 29.7 bits (66), Expect = 0.016
Identities = 27/85 (31%), Positives = 39/85 (45%), Gaps = 8/85 (9%)

Query: 37 TAANLYVQARNSIDSSF-NSAKAFANALANSANQFSKSSITNNLDQVK---KDLEQSLQK 92
T L Q N + F +S K N + + T N D+VK KDLE+SL+K
Sbjct: 561 TTKGLSPQEANKLIKDFLSSNKELVGKTLNFNKAVADAKNTGNYDEVKKAQKDLEKSLRK 620

Query: 93 VD----EYKKNLESQNNLGNISQEK 113
+ E +K LES++ N + K
Sbjct: 621 REHLEKEVEKKLESKSGNKNKMEAK 645


5     MG_RS02440  MG_RS02530  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
MG_RS02440  3       18       -1.621365  ATP synthase subunit delta
MG_RS02445  1       18       -1.261523  ATP synthase subunit B
MG_RS02450  2       19       -1.701370  ATP synthase subunit C
MG_RS02455  4       18       -2.164683  ATP synthase subunit A
MG_RS02460  2       16       -2.882292  hypothetical protein
MG_RS02465  1       15       -2.815260  enolase
MG_RS02470  1       15       -4.632242  peptide-methionine (S)-S-oxide reductase
MG_RS02475  2       15       -5.568977  phosphate transport system regulatory protein
MG_RS02480  1       15       -5.318308  phosphate ABC transporter ATP-binding protein
MG_RS02485  1       16       -5.427559  phosphate ABC transporter permease
MG_RS02490  2       16       -5.585166  hypothetical protein
MG_RS02495  0       16       -4.579120  hypothetical protein
MG_RS02500  0       15       -4.910759  hypothetical protein
MG_RS02505  -1      15       -3.569395  30S ribosomal protein S9
MG_RS02510  -1      14       -3.685729  50S ribosomal protein L13
MG_RS02515  0       13       -3.486052  DNA polymerase III subunit gamma/tau
MG_RS02520  -1      14       -3.147815  excinuclease ABC subunit A
MG_RS02525  3       13       -3.693605  hypothetical protein
MG_RS02530  2       13       -1.440264  RNase J family beta-CASP ribonuclease
ORFs having significant similarity with known virulence factors
LocusTag    Hits          Score  E-value  Comments
MG_RS02495  SSPAMPROTEIN  30     0.017    Salmonella surface presentation of antigen gene typ...
>SSPAMPROTEIN#Salmonella surface presentation of antigen gene type M signature.
Length = 147

Score = 30.4 bits (68), Expect = 0.017
Identities = 15/44 (34%), Positives = 28/44 (63%), Gaps = 2/44 (4%)

Query: 633 KKLSTIKRTKDGFEYKFKY--RKDFNEQRWIAKDFRIPLNKNVQ 674
+K S +++ ++ F+ K KY RK+ N QRWI + R+ + + +Q
Sbjct: 94 EKRSELEKKREEFQEKSKYWLRKEGNYQRWIIRQKRLYIQREIQ 137


6     MG_RS02585  MG_RS02615  Y     N          N                   Genomic Island
LocusTag    DNBias  CDNBias  %GCBias    Product
MG_RS02585  2       14       -2.821831  ribosome recycling factor
MG_RS02590  1       14       -2.909064  phosphatidate cytidylyltransferase
MG_RS02595  0       15       -3.263123  type I restriction modification protein
MG_RS02600  -1      17       -3.454792  hypothetical protein
MG_RS02605  -1      19       -3.577172  hypothetical protein
MG_RS02610  1       19       -3.420837  hypothetical protein
MG_RS02615  0       18       -3.427214  GTPase
7     MG_RS01275  MG_RS01295  N     Y          Y                   Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
MG_RS01275  1       18       2.600961   ********proline-rich protein
MG_RS01280  2       19       2.541576   Cytadherence high molecular weight protein 2
MG_RS01285  -1      19       1.811091   hypothetical protein
MG_RS01290  -1      19       1.198178   hypothetical protein
MG_RS01295  -1      17       1.270140   hypothetical protein
ORFs having significant similarity with known virulence factors
LocusTag    Hits          Score  E-value  Comments
MG_RS01275  GPOSANCHOR    28     0.046    Gram-positive coccus surface protein anchor signature.
>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 28.5 bits (63), Expect = 0.046
Identities = 16/77 (20%), Positives = 24/77 (31%), Gaps = 6/77 (7%)

Query: 210 DSYNFRLNSLKSKLDNALYSLDKTIQNTNENTANLEAIRHNLEQKIQNQSKQLRTNFDTQ 269
+N L S L DK++ LEA + +LE+ ++
Sbjct: 84 KDHNDELTEELSNAKEKLRKNDKSLSEKASKIQELEARKADLEKALEGAMNFS------T 137

Query: 270 KLDDKINELEIRMQKLT 286
KI LE L
Sbjct: 138 ADSAKIKTLEAEKAALA 154


ORFs having significant similarity with known virulence factors
LocusTag    Hits          Score  E-value  Comments
MG_RS01280  GPOSANCHOR    39     2e-04    Gram-positive coccus surface protein anchor signature.
>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 38.9 bits (90), Expect = 2e-04
Identities = 48/267 (17%), Positives = 83/267 (31%), Gaps = 5/267 (1%)

Query: 376 DSLLKLETEYKALQHKINEFKNESATKSEELLNQERELFE---KRREIDTLLTQASLEYE 432
L KAL+ +E E + E+L ++ L E K +E++ E
Sbjct: 71 LKNSDLSFNNKALKDHNDELTEELSNAKEKLRKNDKSLSEKASKIQELEARKADLEKALE 130

Query: 433 HQRESSQLLKDKQNEVKQHFQNLEYAKKELDKERNLLDQQKKVDSEAIFQLKEKVAQERK 492
S K ++ L K +L+K DS I L+ + A
Sbjct: 131 GAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEA 190

Query: 493 ELEELY--LVKKQKQDQKENELLFFEKQLKQHQADFENELEAKQQELFEAKHALERSFIK 550
EL L ++ + + K A + +LE + A
Sbjct: 191 RQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKT 250

Query: 551 LEDKEKDLNTKAQQIANEFSQLKTDKSKSADFELMLQNEYENLQQEKQKLFQERTYFERN 610
LE ++ L + ++ + + L+ E L+ EK L + N
Sbjct: 251 LEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNAN 310

Query: 611 AAVLSNRLQQKREELLQQKETLDQLTK 637
L L RE Q + +L +
Sbjct: 311 RQSLRRDLDASREAKKQLEAEHQKLEE 337



Score = 36.2 bits (83), Expect = 0.001
Identities = 53/355 (14%), Positives = 105/355 (29%), Gaps = 9/355 (2%)

Query: 550 KLEDKEKDLNTKAQQIANEFSQLKTDKSKSADFELMLQNEYENLQQEKQKLFQERTYFER 609
K E + L K ++ LK + + + + + + + E
Sbjct: 61 KFEIENNTLKLKNSDLSFNNKALKDHNDELTEELSNAKEKLRKNDKSLSEKASKIQELEA 120

Query: 610 NAAVLSNRLQQKREELLQQKETLDQLTKSFEQERLINQREHKELVASVEKQKEILGK--K 667
A L L+ + L K L ++ K
Sbjct: 121 RKADLEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKT 180

Query: 668 LQDFSQTSLNASKNLAEREMAIKFKEKEIEATEKQLLNDVNNAEVIQADLAQLNQSLNQE 727
L+ L + A K L + +ADL + +
Sbjct: 181 LEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNF 240

Query: 728 RSELQNAKQRIADFHNDSLKKLNEYELSLQKRLQELQTLEANQKQHSYQNQAYFEGELDK 787
+ + + + E E + T ++ + + +A E E
Sbjct: 241 STADSAKIKTLEAEKAALEARQAELE-KALEGAMNFSTADSAKIKTLEAEKAALEAEKAD 299

Query: 788 LNREKQAFLNLRKKQTMEVDA---IKQRLSDKHQALNMQQAELDRKTHELNNAFLNHDAD 844
L + Q R+ ++DA K++L +HQ L Q + L
Sbjct: 300 LEHQSQVLNANRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREA 359

Query: 845 QKSLQDQLATVKETQKLIDLERSAL---LEKQREFAENVAGFKRHWSNKTSQLQK 896
+K L+ + ++E K+ + R +L L+ RE + V ++K + L+K
Sbjct: 360 KKQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQVEKALEEANSKLAALEK 414



Score = 35.8 bits (82), Expect = 0.002
Identities = 39/302 (12%), Positives = 102/302 (33%), Gaps = 4/302 (1%)

Query: 1229 LKNLSQTYLANKNKAEYSQQQLQQKYTNLLDLKENLERTKDQLDKKHRSIFARLTKFAND 1288
++ + + N + L L D + L +K R L++ A+
Sbjct: 55 VQERADKFEIENNTLKLKNSDLSFNNKALKDHNDELTEELSNAKEKLRKNDKSLSEKASK 114

Query: 1289 LRFEKKQLLKAQRIVDDKNRLLKENERNLHFLSNETERKRAVLEDQISYFEKQRKQATDA 1348
++ + + ++ ++ + + L E A + + + + A
Sbjct: 115 IQELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALAA-RKADLEKALEGAMNFSTA 173

Query: 1349 ILASHKEVKKKEGELQKLLVELETRKTKLNNDFAKFSRQREEFENQRLKLLELQKTLQTQ 1408
A K ++ ++ L+ ELE N S + + E ++ L + L+
Sbjct: 174 DSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKA 233

Query: 1409 TNSNNFKTKAIQEIENSYKRGMEELNFQKKEFDK---NKSRLYEYFRKMRDEIERKESQV 1465
+ A + + L ++ E +K +E +++ +
Sbjct: 234 LEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAAL 293

Query: 1466 KLVLKETQRKANLLEAQANKLNIEKNTIDFKEKELKAFKDKVDQDIDSTNKQRKELNELL 1525
+ + + ++ +L A L + + +K+L+A K+++ + R+ L L
Sbjct: 294 EAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRDL 353

Query: 1526 NE 1527
+
Sbjct: 354 DA 355



Score = 35.4 bits (81), Expect = 0.002
Identities = 48/359 (13%), Positives = 112/359 (31%), Gaps = 6/359 (1%)

Query: 1200 EKQRQLVAIKTQCEKLSDEKKALNQKLVELKNLSQTYLANKNKAEYSQQQLQQKYTNLLD 1259
+ + + ++ + ++ L E + Q A K E + + T
Sbjct: 82 ALKDHNDELTEELSNAKEKLRKNDKSLSEKASKIQELEARKADLEKALEGAMNFSTADSA 141

Query: 1260 LKENLERTKDQLDKKHRSIFARLTKFAND---LRFEKKQLLKAQRIVDDKNRLLKENERN 1316
+ LE K L + + L N + K L + ++ + L++
Sbjct: 142 KIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEG 201

Query: 1317 LHFLSNETERKRAVLEDQISYFEKQRKQATDAILASHKEVKKKEGELQKLLVELETRKTK 1376
S K LE + + ++ A+ + +++ L E + +
Sbjct: 202 AMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEAR 261

Query: 1377 LNNDFAKFSRQREEFENQRLKLLELQKTLQTQTNSNNFKTKAIQEIENSYKRGMEELNFQ 1436
K+ L+ Q + + + +L+
Sbjct: 262 QAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDAS 321

Query: 1437 KKEFDKNKSRLYEYFRKMR-DEIERKESQVKLVLKETQRKANLLEAQANKLNIEKNTIDF 1495
++ + ++ + + + E R+ + L +K LEA+ KL + +
Sbjct: 322 REAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQ--LEAEHQKLEEQNKISEA 379

Query: 1496 KEKELKAFKDKVDQDIDSTNKQRKELNELLNENKLLQQSLIERERAINSKDSLLNKKIE 1554
+ L+ D + K +E N L + L + L E ++ + + L K+E
Sbjct: 380 SRQSLRRDLDASREAKKQVEKALEEANSKLAALEKLNKELEESKKLTEKEKAELQAKLE 438



Score = 31.6 bits (71), Expect = 0.034
Identities = 51/369 (13%), Positives = 113/369 (30%), Gaps = 3/369 (0%)

Query: 952 EKNNQVKLELDNRFQALQNQKQDTVQAQLELEREQHQLNLEQTAF-NQANESLLKQREQL 1010
N + R Q +K + E+E +L +F N+A + + +
Sbjct: 34 VVNTNEVSAVATRSQTDTLEKVQERADKFEIENNTLKLKNSDLSFNNKALKDHNDELTEE 93

Query: 1011 TKKIQAFHYELKKRNQFLALKGKRLFAKEQDQQRKDQEINWRFKQFEKEYTDFDEAKKRE 1070
+ + K A K + L A++ D ++ + + + K
Sbjct: 94 LSNAKEKLRKNDKSLSEKASKIQELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAAL 153

Query: 1071 LEELEKIRRSLSQSNVELERKREKLATDFTNLNKVQHNTQINRDQLNSQIRQFLLERKNF 1130
+ ++L + K+ T ++ L + +
Sbjct: 154 AARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKI 213

Query: 1131 QRFSNEANAKKAFL--IKRLRSFASNLKLQKEALAIQKLEFDKRDEQQKKELQQATLQLE 1188
+ E A A +++ A N A E ++ EL++A
Sbjct: 214 KTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAM 273

Query: 1189 QFKFEKQNFDIEKQRQLVAIKTQCEKLSDEKKALNQKLVELKNLSQTYLANKNKAEYSQQ 1248
F + + A++ + L + + LN L+ K + E Q
Sbjct: 274 NFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEAEHQ 333

Query: 1249 QLQQKYTNLLDLKENLERTKDQLDKKHRSIFARLTKFANDLRFEKKQLLKAQRIVDDKNR 1308
+L+++ +++L R D + + + A K + + +R +D
Sbjct: 334 KLEEQNKISEASRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASRE 393

Query: 1309 LLKENERNL 1317
K+ E+ L
Sbjct: 394 AKKQVEKAL 402


ORFs having significant similarity with known virulence factors
LocusTag    Hits          Score  E-value  Comments
MG_RS01290  IGASERPTASE   30     0.004    IgA-specific serine endopeptidase (S6) signature.
>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 30.0 bits (67), Expect = 0.004
Identities = 20/92 (21%), Positives = 36/92 (39%), Gaps = 16/92 (17%)

Query: 73 IPTPVVKEIDQPA---------------VIPPVKAKPKATKKKTPVKSKPTSKSTKQTKP 117
I TP + D P+ V PP A P T + SK SK+ ++ +
Sbjct: 997 ITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQ 1056

Query: 118 KQSKPKSKQVQQTK-AKPTQIQTKKSNKKTRS 148
++ ++ + K AK ++N+ +S
Sbjct: 1057 DATETTAQNREVAKEAKSNVKANTQTNEVAQS 1088


ORFs having significant similarity with known virulence factors
LocusTag    Hits          Score  E-value  Comments
MG_RS01295  VACCYTOTOXIN  26     0.035    Helicobacter pylori vacuolating cytotoxin signature.
>VACCYTOTOXIN#Helicobacter pylori vacuolating cytotoxin signature.

Length = 1291

Score = 25.8 bits (56), Expect = 0.035
Identities = 23/69 (33%), Positives = 31/69 (44%), Gaps = 1/69 (1%)

Query: 7 AQAKQVVGGLSFWTFSAGLIMIVNALTGVAHAVNDIFQSTTANANGSDDDNENKNNSYRS 66
A K +G L W SAGL +I G ND +TT N +D ++NNS
Sbjct: 304 ASNKTHIGTLDLWQ-SAGLNIIAPPEGGYKDKPNDKPSNTTQNNAKNDKQESSQNNSNTQ 362

Query: 67 KSNYFNTAR 75
N N+A+
Sbjct: 363 VINPPNSAQ 371
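
The Prediction labels of the seven regions listed in this report are consistent with a simple rule on the Bias and Virulence flags. The sketch below encodes that rule as inferred from these seven rows only; it is an assumption, not the documented PredictBias decision procedure. The function name `classify_island` is hypothetical, the Insertion elements flag is ignored because it does not change the label in any reported row, and the case with neither flag set never appears in the report, so it returns `None`.

```python
from typing import Optional


def classify_island(bias: bool, virulence: bool) -> Optional[str]:
    """Map the Bias and Virulence flags to a Prediction label (inferred rule)."""
    if virulence:
        composition = "biased-composition" if bias else "unbiased-composition"
        return f"Pathogenicity Island ({composition})"
    if bias:
        return "Genomic Island"
    # Neither flag set: no such region is reported, so no label is assumed.
    return None


if __name__ == "__main__":
    # (S.No, Bias, Virulence) as reported for NC_000908.
    rows = [
        (1, True, True), (2, True, True), (3, True, True), (4, True, True),
        (5, True, True), (6, True, False), (7, False, True),
    ]
    for number, bias, virulence in rows:
        print(number, classify_island(bias, virulence))
```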



 
Contact Sachin Pundhir for Bugs/Comments.