PredictBias

Identification of genomic and pathogenicity islands in prokaryotic genomes
 
A) Input parameters
Genome: 1200.gbk
Threshold dinucleotide bias: 2
Threshold codon bias: 4
Threshold %GC bias: 3
E-value (RPSBlast): 0.05
Genome (non-pathogenic):
 
C) Potential GIs and PAIs in NC_000907
S.No  Start   End       Bias  Virulence  Insertion elements  Prediction
1     HI0030  HI0061    Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
HI0030    2       14       0.527842   lipoprotein
HI0031    0       12       0.592218   rod shape-determining protein
HI0032    1       9        0.961343   penicillin-binding protein 2
HI0033    -2      11       0.966176   rRNA large subunit methyltransferase
HI0034    -1      11       0.156696   hypothetical protein
HI0035    -1      10       -0.658704  hypothetical protein
HI0036    -2      12       -1.658650  ABC transporter ATP-binding protein
HI0037    0       13       -1.975782  rod shape-determining protein MreB
HI0038    0       13       -3.153725  rod shape-determining protein MreC
HI0039    -3      13       -3.527347  rod shape-determining protein MreD
HI0040    -2      15       -3.380028  hypothetical protein
HI0041    -1      13       -2.875830  exonuclease III
HI0042    -1      15       -2.917674  pseudouridine synthase-like protein
HI0043    0       15       -3.130893  hypothetical protein
HI0044    1       14       -3.209563  hypothetical protein
HI0045    4       18       -3.639743  *hypothetical protein
HI0046    6       21       -4.190803  alkylphosphonate uptake protein
HI0047    7       23       -5.189131  keto-hydroxyglutarate-aldolase/keto-deoxy-
HI0048    8       26       -5.535003  D-mannonate oxidoreductase
HI0049    9       27       -5.974436  2-dehydro-3-deoxygluconokinase
HI0050m   8       26       -5.324283  integral membrane protein transporter
HI0051    7       23       -4.936005  hypothetical protein
HI0052    4       16       -3.284576  hypothetical protein
HI0053    2       14       -1.627377  zinc-type alcohol dehydrogenase
HI0054    -1      11       -0.791770  uxu operon regulator
HI0055    -1      11       -0.198060  mannonate dehydratase
HI0056    -1      16       -0.220227  hypothetical protein
HI0057    -1      15       -0.041135  excinuclease ABC subunit C
HI0058    0       14       0.346172   3-deoxy-manno-octulosonate cytidylyltransferase
HI0059    1       14       0.204935   tetraacyldisaccharide 4'-kinase
HI0060    2       14       -0.232510  lipid transporter ATP-binding/permease
HI0061    2       12       -1.188867  recombination protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
HI0030    SSPANPROTEIN  32     0.003    Salmonella invasion protein InvJ signature.
		>SSPANPROTEIN#Salmonella invasion protein InvJ signature.

Length = 336

Score = 31.6 bits (71), Expect = 0.003
Identities = 25/95 (26%), Positives = 42/95 (44%), Gaps = 9/95 (9%)

Query: 112 VTNLHNNRKVIVRINDRGPFSDKRLIDLSHAAAKEIGLISRGIGQVRIEALHVAKNGNLS 171
V+ LH+N K +RI ++ L A K +GLIS + AL +KN L
Sbjct: 82 VSGLHHNGKSELRIAEK---------LLKVTAEKSVGLISAEAKVDKSAALLSSKNRPLE 132

Query: 172 GAATKTLAKQAKTQEAADRLVLKSNTLFDNTSKSI 206
+ K L+ K E+ + + + D+ K++
Sbjct: 133 SVSGKKLSADLKAVESVSEVTDNATGISDDNIKAL 167
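
The alignment above is headed by the usual BLAST summary lines (`Score = ... bits (...), Expect = ...` and `Identities = ...`). As a minimal sketch, and not part of PredictBias itself, those lines could be turned into numbers with a few regular expressions. The function name `parse_hsp_summary` and the dictionary keys are hypothetical, chosen only for illustration.

```python
import re

# Patterns for the two summary lines printed before each alignment.
SCORE_RE = re.compile(
    r"Score\s*=\s*([\d.]+) bits \((\d+)\),\s*Expect\s*=\s*(\S+)"
)
IDENT_RE = re.compile(
    r"Identities\s*=\s*(\d+)/(\d+) \(\d+%\),\s*"
    r"Positives\s*=\s*(\d+)/\d+ \(\d+%\)"
    r"(?:,\s*Gaps\s*=\s*(\d+)/\d+ \(\d+%\))?"  # the Gaps field is optional
)


def parse_hsp_summary(text):
    """Return one dict per Score/Identities pair found in `text`.

    Hypothetical helper: the key names are illustrative assumptions.
    """
    hsps = []
    for line in text.splitlines():
        score = SCORE_RE.search(line)
        if score:
            expect = score.group(3)
            # Some BLAST versions print "e-100" without a leading digit.
            if expect.startswith("e"):
                expect = "1" + expect
            hsps.append({
                "bits": float(score.group(1)),
                "raw_score": int(score.group(2)),
                "evalue": float(expect),
            })
            continue
        ident = IDENT_RE.search(line)
        if ident and hsps:
            # Attach the identity counts to the most recent Score line.
            hsps[-1].update({
                "identities": int(ident.group(1)),
                "align_len": int(ident.group(2)),
                "positives": int(ident.group(3)),
                "gaps": int(ident.group(4)) if ident.group(4) else 0,
            })
    return hsps


example = """Score = 31.6 bits (71), Expect = 0.003
Identities = 25/95 (26%), Positives = 42/95 (44%), Gaps = 9/95 (9%)
"""
hsps = parse_hsp_summary(example)
```

With the values parsed this way, a hit can be compared against the E-value cut-off listed under the input parameters (0.05), for example `hsps[0]["evalue"] <= 0.05`.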


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
HI0036    PF05272       36     5e-04    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 36.2 bits (83), Expect = 5e-04
Identities = 11/36 (30%), Positives = 22/36 (61%)

Query: 418 TSLLIQGKSGAGKTTLLRTIAGLWSYAEGEINCPTH 453
S++++G G GK+TL+ T+ GL +++ + T
Sbjct: 597 YSVVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGTG 632


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
HI0037    SHAPEPROTEIN  535    0.0      Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 535 bits (1381), Expect = 0.0
Identities = 279/349 (79%), Positives = 314/349 (89%), Gaps = 2/349 (0%)

Query: 2 LFKKIRGLFSNDLSIDLGTANTLIYVKRQGIVLDEPSVVAIRQDRVGTLKSIAAVGKEAK 61
+ KK RG+FSNDLSIDLGTANTLIYVK QGIVL+EPSVVAIRQDR G+ KS+AAVG +AK
Sbjct: 1 MLKKFRGMFSNDLSIDLGTANTLIYVKGQGIVLNEPSVVAIRQDRAGSPKSVAAVGHDAK 60

Query: 62 LMLGRTPKSIVAIRPMKDGVIADFFVTEKMLQYFIKQVHSGNFMRPSPRVLVCVPAGATQ 121
MLGRTP +I AIRPMKDGVIADFFVTEKMLQ+FIKQVHS +FMRPSPRVLVCVP GATQ
Sbjct: 61 QMLGRTPGNIAAIRPMKDGVIADFFVTEKMLQHFIKQVHSNSFMRPSPRVLVCVPVGATQ 120

Query: 122 VERRAIKESAIGAGAREVYLIEEPMAAAIGAKLPVSTAVGSMVIDIGGGTTEVAVISLNG 181
VERRAI+ESA GAGAREV+LIEEPMAAAIGA LPVS A GSMV+DIGGGTTEVAVISLNG
Sbjct: 121 VERRAIRESAQGAGAREVFLIEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVAVISLNG 180

Query: 182 IVYSSSVRIGGDRFDEAIISYVRRTFGSVIGEPTAERIKQEIGSAYIQEGDEIKEMEVHG 241
+VYSSSVRIGGDRFDEAII+YVRR +GS+IGE TAERIK EIGSAY GDE++E+EV G
Sbjct: 181 VVYSSSVRIGGDRFDEAIINYVRRNYGSLIGEATAERIKHEIGSAYP--GDEVREIEVRG 238

Query: 242 HNLAEGAPRSFTLTSRDVLEAIQQPLNGIVAAVRTALEECQPEHAADIFERGMVLTGGGA 301
NLAEG PR FTL S ++LEA+Q+PL GIV+AV ALE+C PE A+DI ERGMVLTGGGA
Sbjct: 239 RNLAEGVPRGFTLNSNEILEALQEPLTGIVSAVMVALEQCPPELASDISERGMVLTGGGA 298

Query: 302 LLRNIDILLSKESGVPVIIAEDPLTCVARGGGEALEMIDMHGGDIFSDE 350
LLRN+D LL +E+G+PV++AEDPLTCVARGGG+ALEMIDMHGGD+FS+E
Sbjct: 299 LLRNLDRLLMEETGIPVVVAEDPLTCVARGGGKALEMIDMHGGDLFSEE 347


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
HI0048    DHBDHDRGNASE  100    6e-27    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 99.7 bits (248), Expect = 6e-27
Identities = 73/269 (27%), Positives = 122/269 (45%), Gaps = 21/269 (7%)

Query: 13 LENKLIIITGAGGVLCSFLAKQLAYTKANIALLDLNFEAADKVAKEINQSGGKAKAYKTN 72
+E K+ ITGA + +A+ LA A+IA +D N E +KV + A+A+ +
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPAD 65

Query: 73 VLELENIKEVRNQIETDFGTCDILINGAGGNNPKATTDNEFHQFDLNETTRTFFDLDKSG 132
V + I E+ +IE + G DIL+N AG P H L
Sbjct: 66 VRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLI-----HS------------LSDEE 108

Query: 133 IEFVFNLNYLGSLLPTQVFAKDMLGKQGANIINISSMNAFTPLTKIPAYSGAKAAISNFT 192
E F++N G ++ +K M+ ++ +I+ + S A P T + AY+ +KAA FT
Sbjct: 109 WEATFSVNSTGVFNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFT 168

Query: 193 QWLAVYFSKVGIRCNAIAPGFLVSNQNLALLFDTEGKP---TDRANKILTNTPMGRFGES 249
+ L + ++ IRCN ++PG ++ +L D G T P+ + +
Sbjct: 169 KCLGLELAEYNIRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKP 228

Query: 250 EELLGALLFLIDENYSAFVNGVVLPVDGG 278
++ A+LFL+ + + L VDGG
Sbjct: 229 SDIADAVLFLVSGQ-AGHITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
HI0051    PF06580       27     0.041    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 26.8 bits (59), Expect = 0.041
Identities = 22/125 (17%), Positives = 41/125 (32%), Gaps = 8/125 (6%)

Query: 39 SVLRYFFDSGIAFSEEFSRICFVYMISFGIILVAKDKAHLTVDIIIPALPEQYRKIVLIV 98
L F + + S + + F IS +++ A+ + L +I+L V
Sbjct: 23 YTLTGFGFASLYGSPKLHSMIFNIAISLMGLVLTH--AYRSFIKRQGWLKLNMGQIILRV 80

Query: 99 ANICVLIAMIFIAYGALQLMSLTYTQQMPATGISSSFL------YLAAVISAVSYFFIVM 152
CV+I M++ L + P L + + ++ YF
Sbjct: 81 LPACVVIGMVWFVANTSIWRLLAFINTKPVAFTLPLALSIIFNVVVVTFMWSLLYFGWHF 140

Query: 153 FSMIK 157
F K
Sbjct: 141 FKNYK 145


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
HI0053    TYPE3IMSPROT  34     0.001    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 33.6 bits (77), Expect = 0.001
Identities = 29/105 (27%), Positives = 46/105 (43%), Gaps = 22/105 (20%)

Query: 190 AIPILVDILEQRLEYAKSLGIEH--IVNPHKEDD----IK-RIK----EITSGRMAEVVM 238
+ + D + +Y K L + I +KE + IK + + EI S M E V
Sbjct: 196 VVISIADYAFEYYQYIKELKMSKDEIKREYKEMEGSPEIKSKRRQFHQEIQSRNMRENVK 255

Query: 239 EASGANISIKNTLHYASFAGRIALTGWPKTETPLPTNLITFKELN 283
+S + + N H A I + + + ETPLP L+TFK +
Sbjct: 256 RSS---VVVANPTHIA-----IGI-LYKRGETPLP--LVTFKYTD 289


2     HI0154  HI0181    Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
HI0154    2       17       1.363930   acyl carrier protein
HI0155    1       14       1.168706   3-ketoacyl-ACP reductase
HI0156    2       14       0.839045   acyl carrier protein S-malonyltransferase
HI0157    2       15       -0.128508  3-oxoacyl-ACP synthase
HI0158    3       15       -1.103771  50S ribosomal protein L32
HI0159    1       15       -1.198655  hypothetical protein
HI0160    1       15       -0.740215  phosphatidylserine decarboxylase
HI0161    1       15       -0.640321  glutathione reductase
HI0162    2       16       -0.922914  lipoprotein
HI0163    2       20       -0.837615  transcriptional regulator
HI0164    2       21       -0.183729  Na(+)-translocating NADH-quinone reductase
HI0166    2       19       -0.034607  Na(+)-translocating NADH-quinone reductase
HI0167    0       18       -0.913994  Na(+)-translocating NADH-quinone reductase
HI0168    0       14       -0.675461  Na(+)-translocating NADH-quinone reductase
HI0170    -1      12       -0.270330  Na(+)-translocating NADH-quinone reductase
HI0171    -1      8        0.622810   Na(+)-translocating NADH-quinone reductase
HI0172    -3      8        0.058264   lipoprotein
HI0173    -2      9        -0.228174  hypothetical protein
HI0174    0       10       -0.316259  tRNA-specific 2-thiouridylase MnmA
HI0175    2       18       -0.022193  hypothetical protein
HI0176    4       18       -0.080516  23S rRNA pseudouridine synthase D
HI0177    3       18       -0.453149  hypothetical protein
HI0178    4       16       0.341792   hypothetical protein
HI0179    3       14       1.127081   pyruvate formate lyase-activating enzyme 1
HI0180    2       16       2.336271   formate acetyltransferase
HI0181    2       13       2.838636   formate transporter
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
HI0155    DHBDHDRGNASE  142    1e-43    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 142 bits (359), Expect = 1e-43
Identities = 86/254 (33%), Positives = 133/254 (52%), Gaps = 13/254 (5%)

Query: 1 MQGKIALVTGSTRGIGRAIAEELSSKGAFVIGTATSEKGAEAISAYLGDKGK---GLVLN 57
++GKIA +TG+ +GIG A+A L+S+GA + + + E + + L + + +
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPAD 65

Query: 58 VTDKESIETLLEQIKNDFGDIDILVNNAGITRDNLLMRMKDEEWFDIMQTNLTSVYHLSK 117
V D +I+ + +I+ + G IDILVN AG+ R L+ + DEEW N T V++ S+
Sbjct: 66 VRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASR 125

Query: 118 AMLRSMMKKRFGRIINIGSVVGSTGNPGQTNYCAAKAGVVGFSKSLAKEVAARGITVNVV 177
++ + MM +R G I+ +GS Y ++KA V F+K L E+A I N+V
Sbjct: 126 SVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIV 185

Query: 178 APGFIATDMTEVLTDEQKA------GILSN----VPAGRLGEAKDIAKAVAFLASDDAGY 227
+PG TDM L ++ G L +P +L + DIA AV FL S AG+
Sbjct: 186 SPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGH 245

Query: 228 ITGTTLHVNGGLYL 241
IT L V+GG L
Sbjct: 246 ITMHNLCVDGGATL 259


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
HI0164    RTXTOXIND     34     7e-04    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 34.4 bits (79), Expect = 7e-04
Identities = 12/44 (27%), Positives = 21/44 (47%), Gaps = 4/44 (9%)

Query: 16 PAQVIHSGNAVNQVAILGEEYVGMRPSMKVREGDVVKKGQVLFE 59
++ HSG + I + + V+EG+ V+KG VL +
Sbjct: 87 NGKLTHSGRSKEIKPIEN----SIVKEIIVKEGESVRKGDVLLK 126


3     HI0251  HI0256    Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
HI0251    2       18       -2.320716  TonB
HI0252    1       14       -3.290718  biopolymer transport protein
HI0253    2       23       -5.464195  transport protein ExbB
HI0254    2       24       -4.623750  thioredoxin-dependent thiol peroxidase
HI0255    0       19       -3.912657  dihydrodipicolinate synthase
HI0256    1       18       -3.805288  lipoprotein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
HI0251    TONBPROTEIN   147    2e-45    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 147 bits (371), Expect = 2e-45
Identities = 56/215 (26%), Positives = 84/215 (39%), Gaps = 31/215 (14%)

Query: 55 MVLEEPAPEPEDVQKEPEPEPEPGNVQKEPEPEKQEIVEDPTIKPEPKKIKEPEKEKPKP 114
MV P+ VQ PEP EPEPE + I E P E P
Sbjct: 49 MVTPADLEPPQAVQPPPEP-------VVEPEPEPEPIPEPPK-------------EAPVV 88

Query: 115 KGKPKGKPKNKPKKEVKPQKKPINKELPKGDENIDSSANVNDKASTTSAANSNAQVAGSG 174
KPK KPK KPK K Q++P +++ + S A TS+ + A
Sbjct: 89 IEKPKPKPKPKPKPVKKVQEQP-KRDVKPVESRPASPFENTAPARLTSSTATAATSKPVT 147

Query: 175 TDTSEIAAYRSAIRREIESHKRYPTRAKIMRKQGKVSVSFNVGADGSLSGAKVTKSSGDE 234
+ S A +YP RA+ +R +G+V V F+V DG + ++ +
Sbjct: 148 SVASGPRALSRN-------QPQYPARAQALRIEGQVKVKFDVTPDGRVDNVQILSAKPAN 200

Query: 235 SLDKAALDAINVSRSVGTRPAGFPSSLSVQISFTL 269
++ +A+ R +P S + V I F +
Sbjct: 201 MFEREVKNAMRRWRYEPGKPG---SGIVVNILFKI 232


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
HI0256    PF06291       29     0.006    Lambda prophage Bor protein
		>PF06291#Lambda prophage Bor protein

Length = 102

Score = 28.9 bits (64), Expect = 0.006
Identities = 11/41 (26%), Positives = 20/41 (48%)

Query: 1 MKKIILNLVTAIILAGCSSNPETLKATNDSFQKSETSIPHF 41
MKK++ + A+++ GC+ T+ + ET HF
Sbjct: 6 MKKMLFSAALAMLITGCAQQTFTVGNKPTAVTPKETITHHF 46


4     HI0350  HI0357    Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
HI0350    2       18       0.578917   permease
HI0351    2       21       1.153800   UDP-glucose 4-epimerase
HI0352    4       22       0.637120   hypothetical protein
HI0354    4       21       1.899763   ABC transporter ATP-binding protein
HI0355    3       17       0.746919   ABC transporter permease
HI0357    3       12       -0.577327  thiamine biosynthesis protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
HI0351    NUCEPIMERASE  171    4e-53    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 171 bits (436), Expect = 4e-53
Identities = 81/350 (23%), Positives = 151/350 (43%), Gaps = 37/350 (10%)

Query: 1 MAILVTGGAGYIGSHTVVELLNVGKEVVVLDNLCNSSPKSLE--RVKQITGKEAKFYEGD 58
M LVTG AG+IG H LL G +VV +DNL + SL+ R++ + +F++ D
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKID 60

Query: 59 ILDRALLQKIFAENEINSVIHFAGLKAVGESVQKPTEYYMNNVAGTLVLIQEMKKAGVWN 118
+ DR + +FA V AV S++ P Y +N+ G L +++ + + +
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQH 120

Query: 119 FVFSSSATVYGDPKIIPITEDCEVGGTTNPYGTSKYMVEQILRDTAKAEPKFSMTILRYF 178
+++SS++VYG + +P + D V + Y +K E ++ T T LR+F
Sbjct: 121 LLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANE-LMAHTYSHLYGLPATGLRFF 179

Query: 179 NPVGAHESGLIGEDPNGIPNNLLP-YISQVAIGKLAQLSVFGSDYDTHDGTGVRDYIHVV 237
G P G P+ L + + GK + V+ G RD+ ++
Sbjct: 180 TVYG----------PWGRPDMALFKFTKAMLEGK--SIDVYN------YGKMKRDFTYID 221

Query: 238 DLAVGHLKALQ---------------RHENDAGLHIYNLGTGHGYSVLDMVKAFEKANNI 282
D+A ++ + A +YN+G ++D ++A E A I
Sbjct: 222 DIAEAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGI 281

Query: 283 TIAYKLVERRSGDIATCYSDPSLAAKELGWVAERGLEKMMQDTWNWQKNN 332
++ + GD+ +D + +G+ E ++ +++ NW ++
Sbjct: 282 EAKKNMLPLQPGDVLETSADTKALYEVIGFTPETTVKDGVKNFVNWYRDF 331


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
HI0352    PF08280       28     0.040    M protein trans-acting positive regulator
		>PF08280#M protein trans-acting positive regulator

Length = 530

Score = 27.9 bits (62), Expect = 0.040
Identities = 13/49 (26%), Positives = 25/49 (51%), Gaps = 1/49 (2%)

Query: 107 KENIIKLLPSFSQNKSQSDIHSMEYDLNALY-FLQKHYGVNIYCISPES 154
+E +I LL +F S++ I EY + L L +G+ +Y ++ +
Sbjct: 156 REALIPLLRNFELKLSKNKIVGEEYRIRYLIALLYSKFGIKVYDLTQQD 204


5     HI0409  HI0416    Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
HI0409    2       9        0.527141   hypothetical protein
HI0410    2       11       1.840519   transcriptional regulatory protein
HI0411    1       12       3.216339   RNA-binding protein Hfq
HI0412    1       14       3.513038   23S rRNA pseudouridylate synthase C
HI0413    -1      15       4.461897   ribonuclease E
HI0414    0       19       4.829084   hypothetical protein
HI0415    0       19       4.415217   hydroxyethylthiazole kinase
HI0416    1       17       3.355839   phosphomethylpyrimidine kinase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
HI0410    HTHFIS        278    8e-93    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 278 bits (712), Expect = 8e-93
Identities = 96/338 (28%), Positives = 158/338 (46%), Gaps = 37/338 (10%)

Query: 15 FIVQSEAMKSAVENAKRFAMFDAPLLIQGETGSGKDLLAKACHYQSLRRDKKFIAVNCAG 74
+ +S AM+ R D L+I GE+G+GK+L+A+A H RR+ F+A+N A
Sbjct: 139 LVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALHDYGKRRNGPFVAINMAA 198

Query: 75 LPDEDAESEMFGRKVG-----DSETIGFFEYANKGTVLLDGIAELSLSLQAKLLRFLTDG 129
+P + ESE+FG + G + + G FE A GT+ LD I ++ + Q +LLR L G
Sbjct: 199 IPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDMPMDAQTRLLRVLQQG 258

Query: 130 SFRRVGEEKEHYANVRVICTSQVPLHLLVEQGKVRADLFHRLNVLTINVPALRDRMADIE 189
+ VG ++VR++ + L + QG R DL++RLNV+ + +P LRDR DI
Sbjct: 259 EYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRLNVVPLRLPPLRDRAEDIP 318

Query: 190 PLAQGFLQEISEELKIAKPTFDKDFLLYLQKYDWKGNVRELYNTLYRACSLVQDNHLTIE 249
L + F+Q+ +E K FD++ L ++ + W GNVREL N + R +L + +T E
Sbjct: 319 DLVRHFVQQAEKEGLDVK-RFDQEALELMKAHPWPGNVRELENLVRRLTALYPQDVITRE 377

Query: 250 SLNLALPQSAVISLDEF-----ENKTLDEIIGFYEAQVLKLFYAEYPSTRKL-------- 296
+ L S E + ++ + + Q F P +
Sbjct: 378 IIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFASFGDALPPSGLYDRVLAEME 437

Query: 297 ------------------AQRLGVSHTAIANKLKQYGI 316
A LG++ + K+++ G+
Sbjct: 438 YPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGV 475


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
HI0412    RTXTOXIND     29     0.030    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 29.0 bits (65), Expect = 0.030
Identities = 23/128 (17%), Positives = 47/128 (36%), Gaps = 28/128 (21%)

Query: 127 GVIEALRALRPEARFLELVHRLDRDTSGILLIAKKRSALRNLHEQLRVKTVQKDYLALVR 186
I L E +++E V+ L S + + S + + + + V + + +
Sbjct: 247 QAIAKHAVLEQENKYVEAVNELRVYKSQL---EQIESEILSA--KEEYQLVTQLFKNEIL 301

Query: 187 GQWQSHIKVIQASLLKNELSSGERIVRVSEQGKPSETRFSIEERYINATLVKASPVTGRT 246
+ + LL EL+ E E+ + S R +PV+ +
Sbjct: 302 DKLRQTTD--NIGLLTLELAKNE------ERQQASVIR---------------APVSVKV 338

Query: 247 HQIRVHTQ 254
Q++VHT+
Sbjct: 339 QQLKVHTE 346


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
HI0414    OUTRMMBRANEA  33     4e-05    Outer membrane protein A signature.
		>OUTRMMBRANEA#Outer membrane protein A signature.

Length = 346

Score = 33.0 bits (75), Expect = 4e-05
Identities = 23/72 (31%), Positives = 31/72 (43%), Gaps = 5/72 (6%)

Query: 4 GGDVKADQETSGSRSIKRIGFG--FIGGIGYDITPNITLDLGYRY-NDWGRLE--NVRFK 58
G +AD +++ G F GG+ Y ITP I L Y++ N+ G R
Sbjct: 120 GMVWRADTKSNVYGKNHDTGVSPVFAGGVEYAITPEIATRLEYQWTNNIGDAHTIGTRPD 179

Query: 59 THEASFGVRYRF 70
S GV YRF
Sbjct: 180 NGMLSLGVSYRF 191



Score = 28.0 bits (62), Expect = 0.002
Identities = 10/44 (22%), Positives = 18/44 (40%), Gaps = 4/44 (9%)

Query: 31 GYDITPNITLDLGY----RYNDWGRLENVRFKTHEASFGVRYRF 70
GY + P + ++GY R G +EN +K + +
Sbjct: 63 GYQVNPYVGFEMGYDWLGRMPYKGSVENGAYKAQGVQLTAKLGY 106


6     HI0475  HI0485.1  Y     N          N                   Genomic Island
LocusTag  DNBias  CDNBias  %GCBias    Product
HI0475    2       20       0.022056   bifunctional phosphoribosyl-AMP
HI0477    2       19       0.262507   tyrosine-specific transport protein
HI0478    3       21       0.153435   F0F1 ATP synthase subunit epsilon
HI0479    3       24       0.210246   F0F1 ATP synthase subunit beta
HI0480    0       19       -0.675099  F0F1 ATP synthase subunit gamma
HI0481    1       19       -1.331485  F0F1 ATP synthase subunit alpha
HI0482    1       16       -3.540778  F0F1 ATP synthase subunit delta
HI0483    1       16       -4.128174  F0F1 ATP synthase subunit B
HI0484    -1      14       -3.657833  F0F1 ATP synthase subunit C
HI0485    -1      18       -3.598152  F0F1 ATP synthase subunit A
HI0485.1  -1      20       -3.616180  F0F1 ATP synthase subunit I
7     HI0507  HI0515    Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
HI0507    2       13       -1.843409  hypothetical protein
HI0508    6       20       -4.892292  ribonuclease activity regulator protein RraA
HI0509    1       15       -1.843549  1,4-dihydroxy-2-naphthoate
HI0510    4       19       -1.061256  hypothetical protein
HI0511    4       22       -1.253781  potassium-tellurite ethidium and proflavin
HI0512    5       24       -0.982843  type II restriction endonuclease
HI0513    5       25       -0.376996  modification methylase
HI0514    4       30       1.624538   DNA-directed RNA polymerase subunit beta'
HI0515    2       26       1.056063   DNA-directed RNA polymerase subunit beta
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
HI0515    GPOSANCHOR    36     0.001    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 36.2 bits (83), Expect = 0.001
Identities = 21/102 (20%), Positives = 31/102 (30%), Gaps = 4/102 (3%)

Query: 937 RDGVEKDKRALEIEEMQLREAKKDLTEELEILEAGLFARVRNLLISSGADAAQLDKLDRT 996
+ +EK K L E L A A + L GA +
Sbjct: 192 QAELEKALEGAMNFSTADSAKIKTLEAEKAALAARK-ADLEKAL--EGAMNFSTADSAKI 248

Query: 997 KWLEQTIAD-EEKQNQLEQLAEQYEELRKEFEHKLEVKRKKI 1037
K LE A E +Q +LE+ E K++ +
Sbjct: 249 KTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEK 290


8     HI0534  HI0545    Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
HI0534    1       20       3.943546   aspartate ammonia-lyase
HI0535    1       22       5.334672   urease accessory protein
HI0536    1       19       5.250728   urease accessory protein
HI0537    1       19       4.933388   urease accessory protein UreF
HI0538    1       19       3.935674   urease accessory protein UreE
HI0539    1       24       3.876491   urease subunit alpha
HI0540    1       31       3.041077   urease subunit beta
HI0541    2       27       2.461988   urease subunit gamma
HI0542    4       28       2.522684   co-chaperonin GroES
HI0543    3       27       2.423842   chaperonin GroEL
HI0544    3       24       2.025516   50S ribosomal protein L9
HI0545    2       17       -2.200823  30S ribosomal protein S18
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
HI0539    UREASE        1043   0.0      Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 1043 bits (2698), Expect = 0.0
Identities = 362/575 (62%), Positives = 436/575 (75%), Gaps = 8/575 (1%)

Query: 1 MALTISRAQYVATYGPTVGDKVRLGDTNLWATIEQDLLTKGDECKFGGGKSVRDGMAQSG 60
M+ +SRA Y +GPTVGDKVRL DT L+ +E+D T G+E KFGGGK +RDGM QS
Sbjct: 1 MSYRMSRAAYANMFGPTVGDKVRLADTELFIEVEKDFTTHGEEVKFGGGKVIRDGMGQS- 59

Query: 61 TATRDNPNVLDFVITNVMIIDAKLGIIKADIGIRDGRIVGIGQAGNPDTMDNVTPNMIIG 120
TR+ +D VITN +I+D GI+KADIG++DGRI IG+AGNPD V +I+G
Sbjct: 60 QVTREG-GAVDTVITNALILDH-WGIVKADIGLKDGRIAAIGKAGNPDMQPGV--TIIVG 115

Query: 121 ASTEVHNGAHLIATAGGIDTHIHFICPQQAQHAIESGVTTLIGGGTGPADGTHATTCTPG 180
TEV G I TAGG+D+HIHFICPQQ + A+ SG+T ++GGGTGPA GT ATTCTPG
Sbjct: 116 PGTEVIAGEGKIVTAGGMDSHIHFICPQQIEEALMSGLTCMLGGGTGPAHGTLATTCTPG 175

Query: 181 AWYMERMFQAAEALPVNVGFFGKGNCSTLDPLREQIEAGALGLKIHEDWGATPAVIDSAL 240
W++ RM +AA+A P+N+ F GKGN S L E + GA LK+HEDWG TPA ID L
Sbjct: 176 PWHIARMIEAADAFPMNLAFAGKGNASLPGALVEMVLGGATSLKLHEDWGTTPAAIDCCL 235

Query: 241 KVADEMDIQVAIHTDTLNESGFLEDTMKAIDGRVIHTFHTEGAGGGHAPDIIKAAMYSNV 300
VADE D+QV IHTDTLNESGF+EDT+ AI GR IH +HTEGAGGGHAPDII+ NV
Sbjct: 236 SVADEYDVQVMIHTDTLNESGFVEDTIAAIKGRTIHAYHTEGAGGGHAPDIIRICGQPNV 295

Query: 301 LPASTNPTRPFTKNTIDEHLDMLMVCHHLDKRVPEDVAFADSRIRPETIAAEDILHDMGV 360
+P+STNPTRP+T NT+ EHLDMLMVCHHL +PED+AFA+SRIR ETIAAEDILHD+G
Sbjct: 296 IPSSTNPTRPYTVNTLAEHLDMLMVCHHLSPTIPEDIAFAESRIRKETIAAEDILHDIGA 355

Query: 361 FSIMSSDSQAMGRIGEVVIRTWQTADKMKMQRGELGNE--GNDNFRIKRYIAKYTINPAI 418
FSI+SSDSQAMGR+GEV IRTWQTADKMK QRG L E NDNFR+KRYIAKYTINPAI
Sbjct: 356 FSIISSDSQAMGRVGEVAIRTWQTADKMKRQRGRLKEETGDNDNFRVKRYIAKYTINPAI 415

Query: 419 AHGIAEHIGSLEVGKIADIVLWKPMFFGVKPEVVIKKGFISYAKMGDPNASIPTPQPVFY 478
AHG++ IGSLEVGK AD+VLW P FFGVKP++V+ G I+ A MGDPNASIPTPQPV Y
Sbjct: 416 AHGLSHEIGSLEVGKRADLVLWNPAFFGVKPDMVLLGGTIAAAPMGDPNASIPTPQPVHY 475

Query: 479 RPMYGAQGLATAQTAVFFVSQAAEKADIRAKFGLHKETIAVKGCR-NVGKKDLVHNDVTP 537
RPM+GA G + ++V FVSQA+ A + + G+ KE +AV+ R +GK ++HN +TP
Sbjct: 476 RPMFGAYGRSRTNSSVTFVSQASLDAGLAGRLGVAKELVAVQNTRGGIGKASMIHNSLTP 535

Query: 538 NITVDAERYEVRVDGELITCEPVDSVPLGQRYFLF 572
+I VD E YEVR DGEL+TCEP +P+ QRYFLF
Sbjct: 536 HIEVDPETYEVRADGELLTCEPATVLPMAQRYFLF 570


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
HI0542    TYPE3OMOPROT  26     0.034    Type III secretion system outer membrane O protein ...
		>TYPE3OMOPROT#Type III secretion system outer membrane O protein family signature.

Length = 303

Score = 25.7 bits (56), Expect = 0.034
Identities = 12/48 (25%), Positives = 24/48 (50%), Gaps = 9/48 (18%)

Query: 36 TRAKVLAVGKGRILENGT--VQPLDVKV-------GDIVIFNDGYGVK 74
T A++ A+G+ ++L T +++ G++V ND GV+
Sbjct: 244 TLAELEAMGQQQLLSLPTNAELNVEIMANGVLLGNGELVQMNDTLGVE 291


9     HI0626  HI0642a   Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
HI0626    2       16       -3.672745  large-conductance mechanosensitive channel
HI0627    4       17       -1.258423  hypothetical protein
HI0628    3       16       -0.261905  RNA polymerase sigma factor RpoE
HI0629    1       16       0.121511   sigma-E factor negative regulatory protein
HI0630    1       15       -0.958433  negative regulator of sigmaE
HI0631    1       15       -0.713947  pantothenate kinase
HI0632    0       14       -0.004352  ****elongation factor Tu
HI0633    -1      15       -0.948298  hypothetical protein
HI0634    -3      12       -1.209039  tRNA-dihydrouridine synthase A
HI0635    -3      11       -1.278308  hemoglobin-binding protein
HI0636    2       15       0.002340   hypothetical protein
HI0637    1       15       0.201048   tryptophanyl-tRNA synthetase
HI0638    1       16       -0.336287  hypothetical protein
HI0639    0       14       1.409141   adenylosuccinate lyase
HI0640    2       13       1.665438   50S ribosomal protein L10
HI0641    0       11       0.596076   50S ribosomal protein L7/L12
HI0642    1       11       0.671462   bifunctional N-acetylglucosamine-1-phosphate
HI0642a   2       11       0.918581   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
HI0626    MECHCHANNEL   169    1e-57    Bacterial mechano-sensitive ion channel signature.
		>MECHCHANNEL#Bacterial mechano-sensitive ion channel signature.

Length = 136

Score = 169 bits (429), Expect = 1e-57
Identities = 92/132 (69%), Positives = 106/132 (80%), Gaps = 4/132 (3%)

Query: 1 MNFIKEFREFAMRGNVVDMAVGVIIGSAFGKIVSSLVSDIFTPVLGILTGGIDFKDMKFV 60
M+ IKEFREFAMRGNVVD+AVGVIIG+AFGKIVSSLV+DI P LG+L GGIDFK
Sbjct: 1 MSIIKEFREFAMRGNVVDLAVGVIIGAAFGKIVSSLVADIIMPPLGLLIGGIDFKQFAVT 60

Query: 61 LAQAQGDVPAVTLNYGLFIQNVIDFIIIAFAIFMMIKVINKV-RKPEEKKTAP---KAET 116
L AQGD+PAV ++YG+FIQNV DF+I+AFAIFM IK+INK+ RK EE AP K E
Sbjct: 61 LRDAQGDIPAVVMHYGVFIQNVFDFLIVAFAIFMAIKLINKLNRKKEEPAAAPAPTKEEV 120

Query: 117 LLTEIRDLLKNK 128
LLTEIRDLLK +
Sbjct: 121 LLTEIRDLLKEQ 132


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
HI0632    TCRTETOQM     83     2e-19    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 83.4 bits (206), Expect = 2e-19
Identities = 58/194 (29%), Positives = 91/194 (46%), Gaps = 25/194 (12%)

Query: 13 VNVGTIGHVDHGKTTLT-------AAITTVLAKHYGGAARAFDQIDNAPEEKARGITINT 65
+N+G + HVD GKTTLT AIT + + G + DN E+ RGITI T
Sbjct: 4 INIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTT-----RTDNTLLERQRGITIQT 58

Query: 66 SHVEYDTPTRHYAHVDCPGHADYVKNMITGAAQMDGAILVVAATDGPMPQTREHILLGRQ 125
+ +D PGH D++ + + +DGAIL+++A DG QTR R+
Sbjct: 59 GITSFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRK 118

Query: 126 VGVPYIIVFLNKCDMVDDEELLELVEMEVRELLSQ--------YDFPGDDTPIVRGSALQ 177
+G+P I F+NK D + L V +++E LS +P + + + +
Sbjct: 119 MGIP-TIFFINKIDQNGID--LSTVYQDIKEKLSAEIVIKQKVELYP--NMCVTNFTESE 173

Query: 178 ALNGVAEWEEKILE 191
+ V E + +LE
Sbjct: 174 QWDTVIEGNDDLLE 187


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
HI0636    PF06776       26     0.015    Invasion associated locus B
		>PF06776#Invasion associated locus B

Length = 214

Score = 26.4 bits (58), Expect = 0.015
Identities = 9/32 (28%), Positives = 16/32 (50%), Gaps = 4/32 (12%)

Query: 69 KVDDGYLLKADFTFC----CQAELVIFQMRIA 96
K+D+ + +A F C C AE+V+ +
Sbjct: 146 KLDNVDVGRAGFVRCLPNGCVAEVVMDDKLLG 177


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
HI0639    PF07299       33     0.002    Fibronectin-binding protein (FBP)
		>PF07299#Fibronectin-binding protein (FBP)

Length = 219

Score = 32.9 bits (75), Expect = 0.002
Identities = 19/78 (24%), Positives = 35/78 (44%), Gaps = 7/78 (8%)

Query: 373 HLLEELNQNWEVLAEPIQTVMRRYGIEKPYEKLKELTRG-KRVTEQAMREF---IDKLDI 428
H+ E L + L + + TV R E K+ + VT Q +++ KL +
Sbjct: 52 HVFENLTDEQKELIDTVLTVQNREDAESFLLKINPYVIPFQEVTAQTLKKLFPKAKKLKL 111

Query: 429 PQEEKLRLQKLTPATYIG 446
P E+L +++L +Y+
Sbjct: 112 PDMEELDMKEL---SYLS 126


10    HI0857  HI0873    Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
HI0857    3       10       1.406652   hypothetical protein
HI0858    2       10       1.805168   hypothetical protein
HI0859    3       12       1.983220   ATP-dependent Clp protease ATPase subunit
HI0860    1       11       1.391131   23S rRNA (guanosine-2'-O-)-methyltransferase
HI0861    1       10       0.337368   virulence-associated protein
HI0862    0       10       -1.718613  hypothetical protein
HI0863    0       9        -3.053629  pyridoxamine 5'-phosphate oxidase
HI0864    3       13       -4.735697  GTP-binding protein
HI0865    7       21       -7.391669  glutamine synthetase
HI0866    10      27       -7.923255  hypothetical protein
HI0867    10      25       -8.078805  lipopolysaccharide biosynthesis protein
HI0868    9       25       -7.716016  hypothetical protein
HI0871    8       22       -5.305681  N-acetylneuraminic acid synthase-like protein
HI0872    5       18       -4.102429  undecaprenyl-phosphate
HI0873    3       15       -2.492223  dTDP-glucose 4,6-dehydratase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
HI0859    HTHFIS        38     1e-04    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 38.3 bits (89), Expect = 1e-04
Identities = 32/160 (20%), Positives = 57/160 (35%), Gaps = 28/160 (17%)

Query: 570 VIGQEEAVDAVANAIRRSRAGLSDPNRPIGSFLFLGPTGVGKTELCKTLAKFLFDSEDAM 629
++G+ A+ + + R L + + + G +G GK + + L +
Sbjct: 139 LVGRSAAMQEIYRVLAR----LMQTDLTL---MITGESGTGKELVARALHDYGKRRNGPF 191

Query: 630 VRIDMSEFMEKHSVSRLVGAPPGYVGYEEGGYLTEAVRRRPYSV-------ILLDEVEKA 682
V I+M+ S L G +E+G + T A R + LDE+
Sbjct: 192 VAINMAAIPRDLIESELFG-------HEKGAF-TGAQTRSTGRFEQAEGGTLFLDEIGDM 243

Query: 683 HADVFNILLQVLDDG---RLTDGQGRTVDFRNTVVIMTSN 719
D LL+VL G + D R ++ +N
Sbjct: 244 PMDAQTRLLRVLQQGEYTTVGGRTPIRSDVR---IVAATN 280



Score = 36.3 bits (84), Expect = 5e-04
Identities = 14/68 (20%), Positives = 29/68 (42%), Gaps = 3/68 (4%)

Query: 151 DQNAEESRQALEKYTIDLTARAESG-KLDPVIGRDEEIRRAIQVLQRRTKNN-PVLI-GE 207
+ +AL + + + P++GR ++ +VL R + + ++I GE
Sbjct: 109 TELIGIIGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGE 168

Query: 208 PGVGKTAI 215
G GK +
Sbjct: 169 SGTGKELV 176


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
HI0864    TCRTETOQM     173    1e-48    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 173 bits (440), Expect = 1e-48
Identities = 102/443 (23%), Positives = 176/443 (39%), Gaps = 62/443 (13%)

Query: 9 KLRNIAIIAHVDHGKTTLVDKLLQQSGTFESARGDVDE--RVMDSNDLEKERGITILAKN 66
K+ NI ++AHVD GKTTL + LL SG G VD+ D+ LE++RGITI
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITEL-GSVDKGTTRTDNTLLERQRGITIQTGI 60

Query: 67 TAINWNDYRINIVDTPGHADFGGEVERVLSMVDSVLLVVDAFDGPMPQTRFVTQKAFAHG 126
T+ W + ++NI+DTPGH DF EV R LS++D +L++ A DG QTR + G
Sbjct: 61 TSFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMG 120

Query: 127 LKPIVVINKVDRPGARPDWVVDQVFDLF---------VNLGASDEQLDFPII--YASALN 175
+ I INK+D+ G V + + V L + +F + + +
Sbjct: 121 IPTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIE 180

Query: 176 G--------VAG--LEHEDLAEDMT-----------------------PLFEAIVKHVEP 202
G ++G LE +L ++ + L E I
Sbjct: 181 GNDDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVITNKFYS 240

Query: 203 PKVELDAPFQMQISQLDYNNYVGVIGIGRIKRGSIKPNQPVTIINSEGKTRQGRIGQVLG 262
+ ++ +++Y+ + R+ G + V I E +I ++
Sbjct: 241 STHRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEKEKI----KITEMYT 296

Query: 263 HLGLQRYEEDVAYAGDIVAITGLGELNISDTICDINTVEALPSLTVDEPTVTMFFCVNTS 322
+ + + D AY+G+IV + L ++ + D + + P + +
Sbjct: 297 SINGELCKIDKAYSGEIVILQNEF-LKLNSVLGDTKLLPQRERIENPLPLLQTTVEPSKP 355

Query: 323 PFAGQEGKYVTSRQILERLNKELVHNVALRVEETPNPDEFRVSGRGELHLSVLIENMRRE 382
Q L ++ + LR E +S G++ + V ++ +
Sbjct: 356 ---QQREML---LDALLEISDS---DPLLRYYVDSATHEIILSFLGKVQMEVTCALLQEK 406

Query: 383 -GYELAVSRPKVIYRDIDGKKQE 404
E+ + P VIY + KK E
Sbjct: 407 YHVEIEIKEPTVIYMERPLKKAE 429



Score = 32.9 bits (75), Expect = 0.004
Identities = 18/89 (20%), Positives = 30/89 (33%), Gaps = 3/89 (3%)

Query: 404 EPYEQVTIDVEEQHQGSVMEALGIRKGEVRDMLPDGKG-RVRLEYIIPSRGLIGFRGDFM 462
EPY I +++ + D K V L IP+R + +R D
Sbjct: 537 EPYLSFKIYAPQEYLSRAYTDAPKYCANIVD--TQLKNNEVILSGEIPARCIQEYRSDLT 594

Query: 463 TMTSGTGLLYSSFSHYDEIKGGEIGQRKN 491
T+G + + Y G + Q +
Sbjct: 595 FFTNGRSVCLTELKGYHVTTGEPVCQPRR 623


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
HI0873    NUCEPIMERASE  170    1e-52    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 170 bits (433), Expect = 1e-52
Identities = 77/353 (21%), Positives = 141/353 (39%), Gaps = 46/353 (13%)

Query: 1 MNILVTGGSGFIGSALIRYIINHTQDFVINIDKLT--YAANQSALR-EVENNPRYVFEKV 57
M LVTG +GFIG + + ++ V+ ID L Y + R E+ P + F K+
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAG-HQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKI 59

Query: 58 DICDLNVIENIFEKYQPDAVMHLAAESHVDRSISGAADFVQTNIVGTYTLLEVAKNYWHT 117
D+ D + ++F + V V S+ + +N+ G +LE ++
Sbjct: 60 DLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHN--- 116

Query: 118 LDEAKKTTFRFHHISTDEVYGDLSLSEPAFTEQSPYHPSSPYSASKAASNHLVQAWHRTY 177
+ + + S+ VYG L+ P T+ S HP S Y+A+K A+ + + Y
Sbjct: 117 --KIQHLLYA----SSSSVYG-LNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLY 169

Query: 178 GLPVIITNSSNNYGAYQHAEKLIPLMISNAVMGKPLPIYGDGQQIRDWLFVEDHVQASYL 237
GLP YG + + + + GK + +Y G+ RD+ +++D +A
Sbjct: 170 GLPATGLRFFTVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIR 229

Query: 238 VL------------------TKGRVGENYNIGGNCEKTNLEVVKRICQLLEELAPSKPNH 279
+ YNIG + ++ ++ + L +K N
Sbjct: 230 LQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGI--EAKKN- 286

Query: 280 IKYYEDLMTFVKDRPGHDVRYSL-DCSKIHAELGWQPQITFEQGLRQTVKWYL 331
+ +PG DV + D ++ +G+ P+ T + G++ V WY
Sbjct: 287 ---------MLPLQPG-DVLETSADTKALYEVIGFTPETTVKDGVKNFVNWYR 329


11  HI0992  HI1012  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
HI0992    2       14       -1.756176  DNA polymerase III subunit beta
HI0993    3       16       -1.955175  chromosomal replication initiation protein
HI0994    5       18       -2.730950  transferrin-binding protein 1
HI0995    3       16       -1.726769  transferrin-binding protein 2
HI0998    1       16       -2.099775  50S ribosomal protein L34
HI0999    -1      15       -3.355564  ribonuclease P
HI1000    -1      14       -3.374568  hypothetical protein
HI1001    -1      14       -2.788141  inner membrane protein translocase component
HI1002    0       17       -2.574701  tRNA modification GTPase TrmE
HI1004    0       19       -3.539614  peptidyl-prolyl cis-trans isomerase
HI1005    -1      15       -2.578806  hypothetical protein
HI1006    0       10       0.225679   lipoprotein signal peptidase
HI1007    0       10       0.707090   4-hydroxy-3-methylbut-2-enyl diphosphate
HI1008    2       10       0.309844   hypothetical protein
HI1009    3       11       -0.343669  glycerol-3-phosphate regulon repressor
HI1010    4       14       -0.042427  3-hydroxyisobutyrate dehydrogenase
HI1011    4       14       -0.367863  hypothetical protein
HI1012    3       15       -1.062465  aldolase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
HI1001    60KDINNERMP  780    0.0      60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 780 bits (2015), Expect = 0.0
Identities = 304/551 (55%), Positives = 397/551 (72%), Gaps = 16/551 (2%)

Query: 1 MDSRRSLLVLALIFISFLVYQQWQLDKNPPVQTEQTTSITATSDVPASSPSNSQAIADSQ 60
MDS+R+LLV+AL+F+SF+++Q W+ DKNP Q +QTT T T+ A S ++ A Q
Sbjct: 1 MDSQRNLLVIALLFVSFMIWQAWEQDKNPQPQAQQTTQTTTTA---AGSAADQGVPASGQ 57

Query: 61 TRGRIITLENDVFRLKIDTLGGDVISSELLKYDAELDSKTPFELLKDTKEHIYIAQSGLI 120
G++I+++ DV L I+T GGDV + L Y EL+S PF+LL+ + + IY AQSGL
Sbjct: 58 --GKLISVKTDVLDLTINTRGGDVEQALLPAYPKELNSTQPFQLLETSPQFIYQAQSGLT 115

Query: 121 GKNGIDTRSG--RAQYQIEGDNFKLAEGQESLSVPLLF-EKDGVTYQKIFVLKRGSYDLG 177
G++G D + R Y +E D + LAEGQ L VP+ + + G T+ K FVLKRG Y +
Sbjct: 116 GRDGPDNPANGPRPLYNVEKDAYVLAEGQNELQVPMTYTDAAGNTFTKTFVLKRGDYAVN 175

Query: 178 VDYKIDNQSGQAIEVEPYGQLKHSIV------ESSGNVAMPTYTGGAYSSSETNYKKYSF 231
V+Y + N + +E+ +GQLK SI S N A+ T+ G AYS+ + Y+KY F
Sbjct: 176 VNYNVQNAGEKPLEISSFGQLKQSITLPPHLDTGSSNFALHTFRGAAYSTPDEKYEKYKF 235

Query: 232 SDMQDN-NLSIDTKAGWVAVLQHYFVSAWIPNQDVNNQLYTITDSKNNVASIGYRGSVVT 290
+ DN NL+I +K GWVA+LQ YF +AWIP+ D N YT + N +A+IGY+ V
Sbjct: 236 DTIADNENLNISSKGGWVAMLQQYFATAWIPHNDGTNNFYT-ANLGNGIAAIGYKSQPVL 294

Query: 291 IPAGSQETITSSLWTGPKLQNQMATVANNLDLTVDYGWAWFIAKPLFWLLTFIQGIVSNW 350
+ G + S+LW GP++Q++MA VA +LDLTVDYGW WFI++PLF LL +I V NW
Sbjct: 295 VQPGQTGAMNSTLWVGPEIQDKMAAVAPHLDLTVDYGWLWFISQPLFKLLKWIHSFVGNW 354

Query: 351 GLAIICVTIVVKAILYPLTKAQYTSMAKMRILQPKMQEMRERFGDDRQRMSQEMMKLYKE 410
G +II +T +V+ I+YPLTKAQYTSMAKMR+LQPK+Q MRER GDD+QR+SQEMM LYK
Sbjct: 355 GFSIIIITFIVRGIMYPLTKAQYTSMAKMRMLQPKIQAMRERLGDDKQRISQEMMALYKA 414

Query: 411 EKVNPLGGCLPILLQMPIFIALYWTFLEAVELRHAPFFGWIQDLSAQDPYYILPILMGIS 470
EKVNPLGGC P+L+QMPIF+ALY+ + +VELR APF WI DLSAQDPYYILPILMG++
Sbjct: 415 EKVNPLGGCFPLLIQMPIFLALYYMLMGSVELRQAPFALWIHDLSAQDPYYILPILMGVT 474

Query: 471 MFLLQKMSPTPVTDPTQQKVMNFMPLVFMFFFLWFPSGLVLYWLVSNLITIAQQQLIYRG 530
MF +QKMSPT VTDP QQK+M FMP++F FFLWFPSGLVLY++VSNL+TI QQQLIYRG
Sbjct: 475 MFFIQKMSPTTVTDPMQQKIMTFMPVIFTVFFLWFPSGLVLYYIVSNLVTIIQQQLIYRG 534

Query: 531 LEKKGLHSRKK 541
LEK+GLHSR+K
Sbjct: 535 LEKRGLHSREK 545


12  HI1024  HI1052  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
HI1024    4       22       -4.123978  3-keto-L-gulonate-6-phosphate decarboxylase
HI1025    4       21       -4.129619  L-ribulose-5-phosphate 4-epimerase
HI1026    4       20       -3.440938  L-xylulose 5-phosphate 3-epimerase
HI1027    4       20       -3.538675  L-xylulose kinase
HI1028    3       17       -2.736256  hypothetical protein
HI1029    2       17       -2.381453  hypothetical protein
HI1030    2       13       -2.074685  hypothetical protein
HI1031    1       13       -2.329391  2,3-diketo-L-gulonate reductase
HI1032    -1      12       -3.224179  transcriptional regulator
HI1033    -1      11       -2.269404  phosphoserine phosphatase
HI1034    1       13       -3.607238  nucleotide-binding protein
HI1035    3       16       -4.153922  magnesium/nickel/cobalt transporter CorA
HI1036    4       18       -4.191642  hypothetical protein
HI1037    3       17       -4.150142  hypothetical protein
HI1038    3       19       -3.881917  hypothetical protein
HI1040    5       24       -3.936123  urease accessory protein
HI1041    4       18       -1.860384  modification methylase
HI1043    4       14       -0.567866  ferredoxin-type protein
HI1044    5       13       -1.072618  twin-arginine leader-binding protein DmsD
HI1045    4       13       -0.734610  anaerobic dimethyl sulfoxide reductase subunit
HI1046    3       12       -0.096018  anaerobic dimethyl sulfoxide reductase subunit
HI1047    1       11       -0.652908  anaerobic dimethyl sulfoxide reductase subunit
HI1048    0       13       -0.503427  hypothetical protein
HI1049    -1      13       -1.839399  mercuric ion transport protein
HI1050    0       17       -1.337549  mercuric ion scavenger protein
HI1051    -1      17       -0.738906  ABC transporter ATP-binding protein
HI1052    3       19       -0.846382  AraC family transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
HI1027    PF03309  29     0.031    Bvg accessory factor
		>PF03309#Bvg accessory factor

Length = 271

Score = 29.3 bits (66), Expect = 0.031
Identities = 12/74 (16%), Positives = 21/74 (28%), Gaps = 12/74 (16%)

Query: 5 LGIDCGGTFIKAAIFDQNGTLQSIARRNIPIISEKPGYAERDMDELWNLCAQVIQKTIRQ 64
L ID T + +G + + I +E E DEL +I
Sbjct: 3 LAIDVRNTHTVVGLISGSGDHAKV-VQQWRIRTEP----EVTADELALTIDGLIGDD--- 54

Query: 65 SSILPQQIKAIGIS 78
+++
Sbjct: 55 ----AERLTGASGL 64


13  HI1165  HI1174  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias    Product
HI1165    2       12       -3.024427  hypothetical protein
HI1166    2       13       -3.183515  histidinol-phosphate aminotransferase
HI1167    2       13       -2.151983  phosphoserine aminotransferase
HI1168    2       14       -2.838709  hypothetical protein
HI1168a   2       12       -2.315556  hypothetical protein
HI1169    1       11       -1.736994  para-aminobenzoate synthase component I
HI1171    1       11       -0.843472  anthranilate synthase component II
HI1172    2       13       -0.878134  S-adenosylmethionine synthetase
HI1173    3       15       -1.154200  hypothetical protein
HI1174    2       16       -1.076061  opacity protein
14  HI1357  HI1382  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
HI1357    -1      15       3.364516   glycogen branching protein
HI1358    -1      15       3.189698   glycogen operon protein
HI1359    -2      15       2.570767   glucose-1-phosphate adenylyltransferase
HI1360    -3      11       2.117190   glycogen synthase
HI1361    -3      12       1.111665   hypothetical protein
HI1362    -2      13       0.959766   NAD(P) transhydrogenase subunit alpha
HI1363    -2      10       -1.417818  pyridine nucleotide transhydrogenase
HI1364    -1      9        -2.771756  transcriptional regulator
HI1365    -1      9        -2.620965  DNA topoisomerase I
HI1366    -1      11       -3.339508  acyl carrier protein phosphodiesterase
HI1367    -1      10       -3.141848  threonyl-tRNA synthetase
HI1368    -1      9        -3.681635  zinc protease
HI1369    1       8        -1.990946  hypothetical protein
HI1370    3       10       0.694271   molybdenum-pterin binding protein
HI1371    2       9        -0.380973  dissimilatory sulfite reductase, desulfoviridin
HI1371.1  2       9        -0.831740  C32 tRNA thiolase
HI1372    1       10       -1.173227  condensin subunit F
HI1373    1       10       -2.696550  condensin subunit E
HI1374    1       10       -3.169935  cell division protein MukB
HI1375    2       17       -6.163829  hypothetical protein
HI1376    2       17       -5.022747  hypothetical protein
HI1377    2       17       -4.495063  exonuclease I
HI1378    3       20       -5.078796  phosphate regulon sensor protein PhoR
HI1379    4       20       -3.881206  phosphate regulon transcriptional regulatory
HI1381    5       21       -4.042480  phosphate ABC transporter permease
HI1382    1       18       -3.051875  phosphate ABC transporter permease
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
HI1374    GPOSANCHOR  39     2e-04    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 38.9 bits (90), Expect = 2e-04
Identities = 34/292 (11%), Positives = 83/292 (28%), Gaps = 33/292 (11%)

Query: 881 EINRERNEIDRELNQFNSGEQQLRIQLDNAKERLQLLNKLIPQLNVLADEDLIDRIEECR 940
+++ + ++ + +L + L I ++L R +
Sbjct: 75 DLSFNNKALKDHNDELTEELSNAKEKLRKNDKSLSEKASKI--------QELEARKADLE 126

Query: 941 EQLDIAE----QDEYFIRQHGVTLSQLEPIANSLQSDPENYEGLKNELTQAIERQKQVQQ 996
+ L+ A D I+ + L L+ E + I+ + +
Sbjct: 127 KALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKA 186

Query: 997 RVFALADVVQRKPHFGYEDAGQAET------------SELNEKLRQRLEQMQAQRDTQRE 1044
+ A +++ + + L + LE
Sbjct: 187 ALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSA 246

Query: 1045 QVRQKQSQFAEYNRVLIQLQSSYDSKYQLLNELIGEISDLGVRADDGAEERARIR----- 1099
+++ +++ A +L+ + + +I L E+A +
Sbjct: 247 KIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQV 306

Query: 1100 ----RDELHQQLSTSRQRRSYVEKQLTLIESEADNLNRLIRKTERDYKTQRE 1147
R L + L SR+ + +E + +E + + RD RE
Sbjct: 307 LNANRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASRE 358



Score = 33.5 bits (76), Expect = 0.008
Identities = 52/358 (14%), Positives = 113/358 (31%), Gaps = 29/358 (8%)

Query: 414 ANDALEESQAQFEQTEIEIDAVRSQLADYQQALDAQQTRALQYQQAIAALEKAKTLCGLA 473
+ E S ++ V+ + ++ + + + AL+
Sbjct: 34 VVNTNEVSAVATRSQTDTLEKVQERADKFEIENNTLKLKNSDLSFNNKALKD-------- 85

Query: 474 DLSVKNVEDYHAEFDAHAESLTETVLELEHKMSISEAAKSQFDKAYQLVCKIAGEMPRST 533
+ + + +++ E K+ EA K+ +KA + + +
Sbjct: 86 --HNDELTEELSNAKEKLRKNDKSLSEKASKIQELEARKADLEKALEGAMNFSTAD-SAK 142

Query: 534 AWESAKELLREYPSQKLQAQQTPQLRTKLHELEQRYAQQQSAVKLLNDFNQRANLSLQTA 593
E + + + ++ L +L+ A
Sbjct: 143 IKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGA 202

Query: 594 EELEDYHAEQEALIEDISARLSEQVENRSTLRQKRENLTALYDENARKAPAWLTAQAALE 653
A I+ + A + ++ L + E ++ K T +A
Sbjct: 203 MNFST---ADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIK---TLEAEKA 256

Query: 654 RLEQQSGERFEHSQDVMNFMQSQLVKERELTMQRDQLEQKRLHLDE--QISRLSQPDGSE 711
LE + E + + MNF + K + L ++ LE ++ L+ Q+ ++
Sbjct: 257 ALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRR 316

Query: 712 DPRLNMLAERFGGVLLSELYDDVTIEDAPYFSALYGPSRHAIVVRDLNAVREQLAQLE 769
D + A++ +L + I +A SR + + RDL+A RE QLE
Sbjct: 317 DLDASREAKKQLEAEHQKLEEQNKISEA---------SRQS-LRRDLDASREAKKQLE 364



Score = 30.8 bits (69), Expect = 0.047
Identities = 34/268 (12%), Positives = 74/268 (27%), Gaps = 22/268 (8%)

Query: 1021 TSELNEKLRQRLEQMQAQRDTQREQVRQKQSQFAEYNRVLIQLQSSYDSKYQLLNELIGE 1080
E +K ++ + + + E L + + L+E +
Sbjct: 55 VQERADKFEIENNTLKLKNSDLSFNNKALKDHNDELTEELSNAKEKLRKNDKSLSEKASK 114

Query: 1081 ISDLGVRADDG-----AEERARIRRDELHQQLSTSRQRRSYVEKQLTLIESEADNLNRLI 1135
I +L R D + L + + + L A N +
Sbjct: 115 IQELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTAD 174

Query: 1136 RKTERDYKTQRELVVAAKVSWCVVLRLSRNSDMEKRLNRRELAYLSADELRSMSDKALGA 1195
+ + ++ + A R +++EK L + + A
Sbjct: 175 SAKIKTLEAEKAALEA------------RQAELEKALEGAMNFSTADSAKIKTLEAEKAA 222

Query: 1196 LRTAVADNEYLRDSLR-VSEDSRKPENKVRFFIAVYQH----LRERIRQDIIKTDDPIDA 1250
L AD E + S + A + L + + + +
Sbjct: 223 LAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAK 282

Query: 1251 IEQMEIELSRLTAELTGREKKLAISSES 1278
I+ +E E + L AE E + + + +
Sbjct: 283 IKTLEAEKAALEAEKADLEHQSQVLNAN 310


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
HI1379    HTHFIS  88     2e-22    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 88.3 bits (219), Expect = 2e-22
Identities = 36/130 (27%), Positives = 62/130 (47%), Gaps = 4/130 (3%)

Query: 1 MTR-KILIVEDECAIREMIALFLSQKYYDVIEASDFKTAINKI-KENPKLILLDWMLPGR 58
MT IL+ +D+ AIR ++ LS+ YDV S+ T I + L++ D ++P
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 59 SGIQFIQYIKKQESYAAIPIIMLTAKSTEEDCIACLNAGADDYITKPFSPQILLARIEAV 118
+ + IKK +P+++++A++T I GA DY+ KPF L+ I
Sbjct: 61 NAFDLLPRIKKA--RPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRA 118

Query: 119 WRRIYEQQSQ 128
+ S+
Sbjct: 119 LAEPKRRPSK 128


15  HI1473  HI1520  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
HI1473    1       21       3.351250   molybdenum transport protein ModD
HI1474    3       25       4.279998   ABC transporter ATP-binding protein
HI1475    3       29       5.545948   hypothetical protein
HI1476    4       33       6.645099   transcriptional regulatory protein
HI1477    5       33       6.595613   DNA-binding protein
HI1478    5       35       7.194048   transposase
HI1479    5       31       5.656752   hypothetical protein
HI1480    4       32       5.737334   hypothetical protein
HI1481    3       30       4.222462   DNA transposition protein
HI1482    3       29       2.293437   hypothetical protein
HI1483    4       31       3.126348   host-nuclease inhibitor protein
HI1485    2       28       0.420074   hypothetical protein
HI1486    1       29       0.581422   hypothetical protein
HI1487    2       28       1.193608   hypothetical protein
HI1488    3       31       2.661376   hypothetical protein
HI1489    1       29       3.330379   hypothetical protein
HI1492    -1      31       3.838766   hypothetical protein
HI1493    1       38       6.826510   hypothetical protein
HI1495    1       39       6.817739   hypothetical protein
HI1496    -2      38       7.719565   hypothetical protein
HI1497    0       39       8.734616   hypothetical protein
HI1498    -1      37       8.380701   hypothetical protein
HI1498.1  -1      35       8.028472   hypothetical protein
HI1499    0       35       8.234107   hypothetical protein
HI1500    -1      33       8.336135   hypothetical protein
HI1501    -1      32       8.169004   hypothetical protein
HI1502    1       32       8.026616   hypothetical protein
HI1503    1       34       7.988206   *Mu-like prophage FluMu G protein
HI1504    3       35       8.740876   bacteriophage Mu I protein
HI1505    3       36       8.648328   hypothetical protein
HI1508    3       37       8.912567   hypothetical protein
HI1509    4       34       9.302339   hypothetical protein
HI1510    2       36       8.685375   hypothetical protein
HI1511    2       36       9.072502   sheath protein gpL
HI1512    2       37       9.214900   hypothetical protein
HI1513    1       36       9.325172   hypothetical protein
HI1514    1       34       8.636783   monofunctional biosynthetic peptidoglycan
HI1515    1       29       6.360727   64 kDa virion protein
HI1518    2       25       5.754235   hypothetical protein
HI1519    1       24       3.893506   hypothetical protein
HI1520    2       21       2.235318   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
HI1489    cloacin  32     8e-04    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 32.4 bits (73), Expect = 8e-04
Identities = 21/99 (21%), Positives = 42/99 (42%), Gaps = 9/99 (9%)

Query: 91 KKRQQALQTGQKIEPLTNHNYLKSVYETQKPHFAVIRSGKNQSETVKAQ-------QAED 143
+RQQ +E NY ++ E + + V R+ + Q++ V+ A +
Sbjct: 304 NRRQQEWDATHPVEAA-ERNYERARAELNQANEDVARNQERQAKAVQVYNSRKSELDAAN 362

Query: 144 KKVQDAILYVERFVQLGQEEFVKNSPEYQIWLNHKAQKQ 182
K + DAI +++F + + +Q+ KAQ+
Sbjct: 363 KTLADAIAEIKQFNRFAHDPMAGGHRMWQM-AGLKAQRA 400


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
HI1514    BACINVASINB  36     5e-04    Salmonella/Shigella invasin protein B signature.
		>BACINVASINB#Salmonella/Shigella invasin protein B signature.

Length = 593

Score = 35.9 bits (82), Expect = 5e-04
Identities = 22/73 (30%), Positives = 34/73 (46%), Gaps = 3/73 (4%)

Query: 507 VAEGVTVLMDDTTNTKEKSEAIGSIAGATAGAIVGQALIPIPVVGAAVGSYLGGWLGEWL 566
+ + +T ++ K+ +E GSI GA AI A++ + VV A VG LG L
Sbjct: 385 IGKAITKALEGLGVDKKTAEMAGSIVGAIVAAI---AMVAVIVVVAVVGKGAAAKLGNAL 441

Query: 567 GSEVGEYLSDPEP 579
+GE + P
Sbjct: 442 SKMMGETIKKLVP 454


16  HI1693  HI1699  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias    Product
HI1693    0       12       -4.227380  molybdenum ABC transporter substrate-binding
HI1694    1       15       -5.671205  molybdenum transport protein
HI1695    3       18       -7.240588  UDP-galactose--lipooligosaccharide
HI1696    5       18       -7.432566  lipopolysaccharide biosynthesis protein
HI1697    2       15       -4.943339  UDP-GlcNAc--lipooligosaccharide
HI1698    1       15       -4.142492  lipopolysaccharide biosynthesis protein
HI1699    0       12       -3.157543  lipopolysaccharide biosynthesis protein
17  HI0126  HI0140  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
HI0126    -1      13       -0.353633  ferric transporter ATP-binding protein
HI0129    0       11       -0.514768  ferric transport system permease-like protein
HI0131    -1      9        -0.641392  ferric ABC transporter protein
HI0132    0       10       -1.131545  uridine kinase
HI0133    1       11       -1.275913  deoxycytidine triphosphate deaminase
HI0134    -1      10       -1.870725  hypothetical protein
HI0135    -1      10       -0.750526  sugar efflux transporter
HI0136    -1      10       -0.905664  GTP-binding protein EngA
HI0137    -1      15       -1.600057  **DNA polymerase III subunit epsilon
HI0138    -1      14       -1.424163  ribonuclease H
HI0139    -1      12       -0.878159  outer membrane protein P2
HI0140    -2      13       0.107701   N-acetylglucosamine-6-phosphate deacetylase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
HI0126    PF05272  32     0.003    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 32.0 bits (72), Expect = 0.003
Identities = 13/56 (23%), Positives = 22/56 (39%)

Query: 1 MSNNDFLVLKNITKAFGKAVVIDNLDLTIKRGTMVTLLGPSGCGKTTVLRLVAGLE 56
L+ + K V ++ K V L G G GK+T++ + GL+
Sbjct: 565 YKPRRLRYLQLVGKYILMGHVARVMEPGCKFDYSVVLEGTGGIGKSTLINTLVGLD 620


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
HI0134    PREPILNPTASE  30     0.015    Type IV prepilin cysteine protease (C20) family sig...
		>PREPILNPTASE#Type IV prepilin cysteine protease (C20) family signature.

Length = 290

Score = 29.8 bits (67), Expect = 0.015
Identities = 11/57 (19%), Positives = 22/57 (38%), Gaps = 8/57 (14%)

Query: 4 FALIVGIVALAIFSFLYIQLYRV--------QSAINEQLAQQNIAVQSINLSLFSPA 52
+ +V + +L I SFL + ++R+ Q+ + V +L P
Sbjct: 15 YFSLVFLFSLMIGSFLNVVIHRLPIMLEREWQAEYRSYFNPDDEGVDEPPYNLMVPR 71


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
HI0135    TCRTETB  55     2e-10    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 54.9 bits (132), Expect = 2e-10
Identities = 41/185 (22%), Positives = 81/185 (43%), Gaps = 1/185 (0%)

Query: 36 LSDIAQSFDMQTADTGLMMTVYAWTVLIMSLPAMLATGNMERKSLLIKLFIIFIVGHILS 95
L DIA F+ A T + T + T I + + + K LL+ II G ++
Sbjct: 37 LPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIG 96

Query: 96 VIAWNFW-ILLLARMCIALAHSVFWSITASLVMRISPKHKKTQALGMLAIGTALATILGL 154
+ +F+ +L++AR + F ++ +V R PK + +A G++ A+ +G
Sbjct: 97 FVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGP 156

Query: 155 PIGRIVGQLVGWRVTFGIIAVLALSIMFLIIRLLPNLPSKNAGSIASLPLLAKRPLLLWL 214
IG ++ + W I + +++ FL+ L + K I + L++ + L
Sbjct: 157 AIGGMIAHYIHWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDIKGIILMSVGIVFFML 216

Query: 215 YVTTA 219
+ T+
Sbjct: 217 FTTSY 221


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits      Score  E-value  Comments
HI0136    MYCMG045  32     0.006    Hypothetical mycoplasma lipoprotein (MG045) signature.
		>MYCMG045#Hypothetical mycoplasma lipoprotein (MG045) signature.

Length = 483

Score = 32.0 bits (72), Expect = 0.006
Identities = 15/49 (30%), Positives = 24/49 (48%)

Query: 166 EKMENADENDRTSEEEQDEWEQEFDFDSEEDTALIDDALDEELEEEQDK 214
K NA+ + +Q E+EFD+ +E AL++ EL E + K
Sbjct: 388 SKKNNAEMKSKQMSTDQMTSEKEFDYYTETLKALLEKEDSAELNENEKK 436


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
HI0139    ECOLNEIPORIN  44     4e-07    E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 44.4 bits (105), Expect = 4e-07
Identities = 65/347 (18%), Positives = 120/347 (34%), Gaps = 44/347 (12%)

Query: 1 MKKTLAALIVGAFAASAANAAVVYNNEGTNVELGGRLSIIAEQSNSTVDNQKQQHGALRN 60
MKK+L AL + A +A +Y VE ++ Q+ S + +
Sbjct: 1 MKKSLIALTLAALPVAAMADVTLYGTIKAGVETSRSVAHNGAQAASVETGTG-----IVD 55

Query: 61 QGSRFHIKATHNFGDGFYAQGYLETRFVTKASENGSDNFGDITSKYAYVTLGNKAFGEVK 120
GS+ K + G+G A +E KAS G+D ++ +++ L FG+++
Sbjct: 56 LGSKIGFKGQEDLGNGLKAIWQVE----QKASIAGTD--SGWGNRQSFIGLK-GGFGKLR 108

Query: 121 LGRAKTIADGITS--AEDKEYGVLNNSDYIPTSGNTVGYTFKGID--GLVLGANYLLAQK 176
+GR ++ D + L + + + + GL Y L
Sbjct: 109 VGRLNSVLKDTGDINPWDSKSDYLGVNKIAEPEARLISVRYDSPEFAGLSGSVQYALNDN 168

Query: 177 RE-----------------------GAKNTNKQPNDKAGEVRIGEINNGIQVGAKYDAND 213
GA + Q + E ++ + YD +
Sbjct: 169 AGRHNSESYHAGFNYKNGGFFVQYGGAYKRHHQVQENVN----IEKYQIHRLVSGYDNDA 224

Query: 214 IVAKIAYGRTNYKYNEADEHTQQLNGVLATLGYRFSDLGLLVSLDSGYAKTKNYKTKHEK 273
+ A +A + + K E + V ATL YRF ++ VS G+ + + +
Sbjct: 225 LYASVAVQQQDAKLVEENYSHNSQTEVAATLAYRFGNVTPRVSYAHGFKGSFDATNYNND 284

Query: 274 RYFVSPGFQYELMEDTNVYGNFKYERTSVDQGEKTREQAVLFGVDHK 320
V G +Y+ + T+ + + + K A G+ HK
Sbjct: 285 YDQVVVGAEYDFSKRTSALVSAGWLQEG-KGESKFVSTAGGVGLRHK 330


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
HI0140    UREASE  34     9e-04    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 33.9 bits (78), Expect = 9e-04
Identities = 12/29 (41%), Positives = 19/29 (65%)

Query: 339 PARAIGIDDRLGSVEKGKIANLAVFTPNY 367
PA A G+ +GS+E GK A+L ++ P +
Sbjct: 413 PAIAHGLSHEIGSLEVGKRADLVLWNPAF 441


18  HI0264  HI0271  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
HI0264    -1      11       -1.283164  heme-hemopexin utilization protein A
HI0265    -1      11       0.026296   dihydroneopterin aldolase
HI0266    -2      10       0.205829   glycerol-3-phosphate acyltransferase PlsY
HI0267    -1      11       0.398524   nitrate/nitrite sensor protein NarQ
HI0268    -1      12       1.618696   UDP-N-acetylenolpyruvoylglucosamine reductase
HI0269    -1      10       2.247255   RNA polymerase factor sigma-32
HI0270    -1      10       0.954654   hypothetical protein
HI0271    -2      10       0.139009   Dna-J like membrane chaperone protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
HI0264    PF05860  94     4e-25    haemagglutination activity domain.
		>PF05860#haemagglutination activity domain.

Length = 117

Score = 94.5 bits (235), Expect = 4e-25
Identities = 29/97 (29%), Positives = 46/97 (47%), Gaps = 5/97 (5%)

Query: 38 NVSTIGNKMTIDQKTPTTQ---IDWHSFDIGQNKEVEFKQPDANSVAYNRVTGGNASQIQ 94
N++T GN I++ T + F + + F P +RVTGG+ S I
Sbjct: 14 NITTEGNTRIIERGTQAGSNLFHSFQEFSVPTSGTAFFNNPTNIQNIISRVTGGSVSNID 73

Query: 95 GKLTANGK--VYLANPNGVIITQGAEINVAGLFATTK 129
G + AN ++L NPNG+I Q A +++ G F +
Sbjct: 74 GLIRANATANLFLINPNGIIFGQNARLDIGGSFVGST 110


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
HI0266    TCRTETB  28     0.033    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 27.5 bits (61), Expect = 0.033
Identities = 42/171 (24%), Positives = 64/171 (37%), Gaps = 31/171 (18%)

Query: 1 MSLFALFYMLFA-------YLLGSISSAILICRIAGLPDPRQN---GSHNPGATNVLRIG 50
MS+ +F+MLF ++ +S I + I + DP + G + P VL G
Sbjct: 207 MSVGIVFFMLFTTSYSISFLIVSVLSFLIFVKHIRKVTDPFVDPGLGKNIPFMIGVLCGG 266

Query: 51 NRKSALAVLIFDMLKGMIPVWAGYYLGLTQFELGMVALGACLGHIFPIFFQFKGGK---- 106
+A + M+P L+ E+G V + G + I F + GG
Sbjct: 267 IIFGTVAGFVS-----MVPYMMKDVHQLSTAEIGSVII--FPGTMSVIIFGYIGGILVDR 319

Query: 107 ----GVATAFGAIAPISWAVAG------SMFGTWIFVFLVSGYSSLSAVIS 147
V +S+ A S F T I VF++ G S VIS
Sbjct: 320 RGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGLSFTKTVIS 370


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
HI0267    PF06580  44     6e-07    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 44.5 bits (105), Expect = 6e-07
Identities = 31/157 (19%), Positives = 59/157 (37%), Gaps = 21/157 (13%)

Query: 423 LRELLATFRLTIQEANLQL-ALKQVIDSLRSQTTMQ-------MNVNCQLPSQSLNPQQL 474
L L R +++ +N + +L + + S + + Q+ ++ Q
Sbjct: 197 LTSLSELMRYSLRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQINPAIMDVQVP 256

Query: 475 VHVLQIVREATTNAIKH-----SQGTVIEISARINAEGEYEILVEDDGVGIPNLEEPEGH 529
++Q + E N IKH QG I + + G + VE+ G +
Sbjct: 257 PMLVQTLVE---NGIKHGIAQLPQGGKILLKGTKD-NGTVTLEVENTGSLALKNTKESTG 312

Query: 530 YGLNIMAERCRQL---NAQLHIHRREQGGTQVKITLP 563
GL + ER + L AQ+ + +QG + +P
Sbjct: 313 TGLQNVRERLQMLYGTEAQIKL-SEKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
HI0271    TCRTETB  28     0.049    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 27.9 bits (62), Expect = 0.049
Identities = 11/42 (26%), Positives = 18/42 (42%), Gaps = 1/42 (2%)

Query: 4 IGKIIGVFLGWKVGGFFGAIAGLILGSIADKKLYELGSVSSS 45
IG +I +F G FG I G+++ + +G S
Sbjct: 294 IGSVI-IFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLS 334


19  HI0715  HI0721  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
HI0715    -1      12       -0.216795  ATP-dependent protease ATP-binding subunit ClpX
HI0716    -1      13       0.706871   preprotein translocase subunit SecE
HI0717    -1      11       0.782150   transcription antitermination protein NusG
HI0718    -2      14       -0.003799  lipoprotein
HI0719    -1      16       -1.051699  hypothetical protein
HI0720    -1      19       -1.818661  heat shock protein HtpX
HI0721    -1      19       -2.869559  sulfur transfer protein SirA
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
HI0715    HTHFIS  36     2e-04    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 36.3 bits (84), Expect = 2e-04
Identities = 33/192 (17%), Positives = 67/192 (34%), Gaps = 53/192 (27%)

Query: 51 LEESAVENEEKLPTPHEIRAHLDDYVIGQDYAKKVLSVAVYNHYKRLRTNYESNDVELGK 110
+ A+ ++ P+ E + ++G+ A +Y R +
Sbjct: 114 IIGRALAEPKRRPSKLEDDSQDGMPLVGRSAA----MQEIYRVLAR---------LMQTD 160

Query: 111 SNILLIGPTGSGKTLLAQTL---ARRLNVPFAMADATTLTEAGYVGEDVENVLQKLLQNC 167
+++ G +G+GK L+A+ L +R N PF + + ++++ L
Sbjct: 161 LTLMITGESGTGKELVARALHDYGKRRNGPFVAINMAAIP---------RDLIESELFGH 211

Query: 168 EYDT------------EKAEKGIIYIDEIDKISRKSEGASITRDVSGEGVQQALLKLIEG 215
E E+AE G +++DEI + Q LL++++
Sbjct: 212 EKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDMP--------------MDAQTRLLRVLQQ 257

Query: 216 TIASIPPQGGRK 227
GGR
Sbjct: 258 G--EYTTVGGRT 267


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
HI0716    SECETRNLCASE  118    4e-38    Bacterial translocase SecE signature.
		>SECETRNLCASE#Bacterial translocase SecE signature.

Length = 127

Score = 118 bits (298), Expect = 4e-38
Identities = 43/105 (40%), Positives = 63/105 (60%), Gaps = 1/105 (0%)

Query: 2 IFFAAAAIGNIYFQQIYSLPIRVIGMAIALVIAFILAAITNQGTKARAFFNDSRTEARKV 61
A +GN ++ I LP+R + + I + A +A +T +G AF ++RTE RKV
Sbjct: 24 ALLLVAIVGNYLYRDIM-LPLRALAVVILIAAAGGVALLTTKGKATVAFAREARTEVRKV 82

Query: 62 VWPTRAEARQTTLIVIGVTMIASLFFWAVDSIIVTVINFLTDLRF 106
+WPTR E TTLIV VT + SL W +D I+V +++F+T LRF
Sbjct: 83 IWPTRQETLHTTLIVAAVTAVMSLILWGLDGILVRLVSFITGLRF 127


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
HI0718    VACJLIPOPROT  319    e-113    VacJ lipoprotein signature.
		>VACJLIPOPROT#VacJ lipoprotein signature.

Length = 251

Score = 319 bits (818), Expect = e-113
Identities = 106/253 (41%), Positives = 156/253 (61%), Gaps = 11/253 (4%)

Query: 4 KTILTAL-LSAIALTGCANNNDTKQVSERNDSLEDFNRTMWKFNYNVIDRYVLEPAAKGW 62
K L+AL L L GCA++ +Q R+D LE FNRTM+ FN+NV+D Y++ P A W
Sbjct: 2 KLRLSALALGTTLLVGCASSGTDQQ--GRSDPLEGFNRTMYNFNFNVLDPYIVRPVAVAW 59

Query: 63 NNYVPKPISSGLAGIANNLDEPVSFINRLIEGEPKKAFVHFNRFWINTVFGLGGFIDFAS 122
+YVP+P +GL+ NL+EP +N ++G+P + VHF RF++NT+ G+GGFID A
Sbjct: 60 RDYVPQPARNGLSNFTGNLEEPAVMVNYFLQGDPYQGMVHFTRFFLNTILGMGGFIDVAG 119

Query: 123 -ASKELRIDNQRGFGETLGSYGVDAGTYIVLPIYNATTPRQLTGAVVDAAYMYPFWQWVG 181
A+ +L+ FG TLG YGV G Y+ LP Y + T R G + DA +YP W+
Sbjct: 120 MANPKLQRTEPHRFGSTLGHYGVGYGPYVQLPFYGSFTLRDDGGDMADA--LYPVLSWLT 177

Query: 182 GPWALVKYGVQAVDARAKNLNNAELLRQAQDPYITFREAYYQNLQFKVNDGKLVESK--- 238
P ++ K+ ++ ++ RA+ L++ LLRQ+ DPYI REAY+Q F N G+L +
Sbjct: 178 WPMSVGKWTLEGIETRAQLLDSDGLLRQSSDPYIMVREAYFQRHDFIANGGELKPQENPN 237

Query: 239 -ESLPDDILKEID 250
+++ DD LK+ID
Sbjct: 238 AQAIQDD-LKDID 249


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
HI0721    PF01206  92     1e-28    SirA family protein
		>PF01206#SirA family protein

Length = 76

Score = 91.7 bits (228), Expect = 1e-28
Identities = 25/71 (35%), Positives = 42/71 (59%)

Query: 8 QTLDTLGLRCPEPVMLVRKNIRHLNDGEILLIIADDPATTRDIPSFCQFMDHTLLQCEVE 67
Q+LD GL CP P++ +K + +N GE+L ++A DP + +D SF + H LL+ + E
Sbjct: 6 QSLDATGLNCPLPILKAKKTLATMNAGEVLYVMATDPGSVKDFESFSKQTGHELLEQKEE 65

Query: 68 KPPFKYWVKRG 78
+ + +KR
Sbjct: 66 DGTYHFRLKRA 76


20  HI0893  HI0900  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
HI0893    -1      9        -0.377508  transcriptional repressor
HI0894    -1      10       -0.114678  membrane-fusion protein
HI0895    -1      9        -0.036413  acriflavine resistance protein
HI0896    -2      11       0.226303   cell division protein
HI0897    -1      10       0.343550   multidrug resistance protein B
HI0898    -1      12       0.374786   multidrug resistance protein A
HI0899    0       14       0.761254   dihydrofolate reductase
HI0900    -1      15       1.158162   gamma-glutamyl kinase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
HI0893    HTHTETR  57     2e-12    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 56.9 bits (137), Expect = 2e-12
Identities = 18/77 (23%), Positives = 35/77 (45%)

Query: 1 MRQAKTDLAEQIFSATDRLMAREGLNQLSMLKLAKEANVAAGTIYLYFKNKDELLEQFAH 60
+Q + + I RL +++G++ S+ ++AK A V G IY +FK+K +L +
Sbjct: 5 TKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWE 64

Query: 61 RVFSMFMATLEKDFDET 77
S + +
Sbjct: 65 LSESNIGELELEYQAKF 81


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
HI0894    RTXTOXIND  53     1e-09    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 52.5 bits (126), Expect = 1e-09
Identities = 17/55 (30%), Positives = 30/55 (54%)

Query: 74 GAVSQVLVQNGQNVKKGEVLVELDSSVERANLQAAQAQLSALRQTYQRYVGLLNS 128
V +++V+ G++V+KG+VL++L + A+ Q+ L R RY L S
Sbjct: 105 SIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRS 159



Score = 44.0 bits (104), Expect = 7e-07
Identities = 32/216 (14%), Positives = 68/216 (31%), Gaps = 39/216 (18%)

Query: 102 RANLQAAQAQLSALRQTYQRYVGLLNSNAVSRQEMDNAKAAYDAQVASIESLKAAIERRK 161
+ + +A+ + + Q ++ + ++ + + +
Sbjct: 279 ESEILSAKEEYQLVTQLFKNEI---------LDKLRQTTDNIGLLTLELAKNEERQQASV 329

Query: 162 IVAPFDGKAGIVKIN-VGQYVNVGT---EIVRVEDTSSMKVDFALSQNDLDKLHIGQRVT 217
I AP K +K++ G V IV +DT ++V + D+ +++GQ
Sbjct: 330 IRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDT--LEVTALVQNKDIGFINVGQNAI 387

Query: 218 ATTDARLGETFSARITAIEPAINSSTGLVDVQATFDPEDGHKLLSGMFSRLRIALPTETN 277
+A F + +++ A D G + I++
Sbjct: 388 IKVEA-----FPYTRY---GYLVGKVKNINLDAIEDQRLGL------VFNVIISIEENCL 433

Query: 278 QVVVPQVAISYNMYGE----------IAYLLEPLSE 303
+ +S M I+YLL PL E
Sbjct: 434 STGNKNIPLSSGMAVTAEIKTGMRSVISYLLSPLEE 469


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
HI0895    ACRIFLAVINRP  905    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 905 bits (2340), Expect = 0.0
Identities = 324/1044 (31%), Positives = 538/1044 (51%), Gaps = 47/1044 (4%)

Query: 11 DIFIRRPVLAVSISLLMIILGLQAISKLAVREYPKMTTTVITVSTAYPGADANLIQAFVT 70
+ FIRRP+ A +++++++ G AI +L V +YP + ++VS YPGADA +Q VT
Sbjct: 3 NFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTVT 62

Query: 71 SKLEESIAQADNIDYMSSTSAPS-SSTITIKMKLNTDPAGALADVLAKVNAVKSALPNGI 129
+E+++ DN+ YMSSTS + S TIT+ + TDP A V K+ LP +
Sbjct: 63 QVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQEV 122

Query: 130 EDPSVSSS-SGGSGIMYISFRSKKLDSSQ--VTDYINRVVKPQFFTIEGVAEVQVFGAAE 186
+ +S S S +M F S ++Q ++DY+ VK + GV +VQ+FGA +
Sbjct: 123 QQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA-Q 181

Query: 187 YALRIWLDPQKMAAQNLSVPTVMSALSANNVQTAAGNDN------GYYVSYRNKVETTTK 240
YA+RIWLD + L+ V++ L N Q AAG G ++ +T K
Sbjct: 182 YAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFK 241

Query: 241 SVEQLSNLIISSNGD-DLVRLRDIATVELNKENDNSRATANGAESVVLAINPTSTANPLT 299
+ E+ + + N D +VRL+D+A VEL EN N A NG + L I + AN L
Sbjct: 242 NPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANALD 301

Query: 300 VAEKIRPLYESIKTQLPDSMESDILYDRTIAINSSIHEVIKTIGEATLIVLVVILMFIGS 359
A+ I+ ++ P M+ YD T + SIHEV+KT+ EA ++V +V+ +F+ +
Sbjct: 302 TAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQN 361

Query: 360 FRAILIPILAIPISLIGVLMLLQSFNFSINLMTLLALILAIGLVVDDAIVVLENIDRHIK 419
RA LIP +A+P+ L+G +L +F +SIN +T+ ++LAIGL+VDDAIVV+EN++R +
Sbjct: 362 MRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMM 421

Query: 420 AGETPFRAAII-GTREIAVPVISMTIALIAVYSPMALMGGITGTLFKEFALTLAGAVFIS 478
+ P + A +I ++ + + L AV+ PMA GG TG ++++F++T+ A+ +S
Sbjct: 422 EDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALS 481

Query: 479 GVVALTLSPMMSSKLLKSNAKP---------TWMEERVEHTLGKVNRVYEYMLDLVMLNR 529
+VAL L+P + + LLK + W +H+ VN + ++
Sbjct: 482 VLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHS---VNHYTNSVGKILGSTG 538

Query: 530 KSMLAFAVVIFSTLPFLFNSLSSELTPNEDKGAFIAIGNAPSSVNVDYIQNAMQP----Y 585
+ +L +A+++ + LF L S P ED+G F+ + P+ + Q + Y
Sbjct: 539 RYLLIYALIVAGMV-VLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYY 597

Query: 586 MKNVMETPEVSF---GMSIAGAPTSNSSLNIITLKDWKERSRK---QSAIMNEINEKAKS 639
+KN E F G S +G N+ + ++LK W+ER+ A+++ +
Sbjct: 598 LKNEKANVESVFTVNGFSFSG-QAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGK 656

Query: 640 IPEVSVSAFNIPEIDTG--EQGPPVSIVLKTAQDYKSLANTAEKFLS-AMKASGKFIYTN 696
I + V FN+P I G ++ + + +L + L A + +
Sbjct: 657 IRDGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVR 716

Query: 697 LDLTYDTAQMTISVDKEKAGTYGITMQQISNTLGSFLSGATVTRVDVDGRAYKVISQVKR 756
+ DTAQ + VD+EKA G+++ I+ T+ + L G V GR K+ Q
Sbjct: 717 PNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADA 776

Query: 757 DDRLSPESFQNYYLTASNGQSVPLSSVISMKLETQPTSLPRFSQLNSAEISAVPMPGISS 816
R+ PE Y+ ++NG+ VP S+ + L R++ L S EI PG SS
Sbjct: 777 KFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSS 836

Query: 817 GDAIAWLQQQATDNLPQGYTFDFKSEARQLVQEGNALAVTFALAVIIIFLVLAIQFESIR 876
GDA+A ++ A+ LP G +D+ + Q GN A++ +++FL LA +ES
Sbjct: 837 GDAMALMENLAS-KLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWS 895

Query: 877 DPMVIMISVPLAVSGALVSLNILSFFSIAGTTLNIYSQVGLITLVGLITKHGILMCEVAK 936
P+ +M+ VPL + G L++ + ++Y VGL+T +GL K+ IL+ E AK
Sbjct: 896 IPVSVMLVVPLGIVGVLLAAT------LFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAK 949

Query: 937 EEQLNHGKTRIEAITHAAKVRLRPILMTTAAMVAGLIPLLYATGAGAVSRFSIGIVIVAG 996
+ GK +EA A ++RLRPILMT+ A + G++PL + GAG+ ++ ++GI ++ G
Sbjct: 950 DLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGG 1009

Query: 997 LSIGTIFTLFVLPVVYSYVATEHK 1020
+ T+ +F +PV + + K
Sbjct: 1010 MVSATLLAIFFVPVFFVVIRRCFK 1033


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
HI0897    TCRTETB  143    1e-39    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 143 bits (362), Expect = 1e-39
Identities = 91/397 (22%), Positives = 172/397 (43%), Gaps = 17/397 (4%)

Query: 23 LSLATFMQVLDSTIANVAIPTIAGDLGASFSQGTWVITSFGVANAISIPITGWLAKRFGE 82
L + +F VL+ + NV++P IA D + WV T+F + +I + G L+ + G
Sbjct: 19 LCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGI 78

Query: 83 VRLFLVSTFLFVVSSWLCGIADSLEALIIF-RVIQGAVAGPVIPLSQSLLLNNYPPEKRG 141
RL L + S + + S +L+I R IQGA A L ++ P E RG
Sbjct: 79 KRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRG 138

Query: 142 MALAFWSMTIVVAPIFGPILGGWISDNIHWGWIFFINVPIGLSVVLISWKILGSRESEIV 201
A + + GP +GG I+ IHW + + +P+ ++++ + + + ++ +
Sbjct: 139 KAFGLIGSIVAMGEGVGPAIGGMIAHYIHWS--YLLLIPM-ITIITVPFLMKLLKKEVRI 195

Query: 202 HQPIDKVGLVLLVLGVGCLQLMLDQGREQDWFNSNEIIILAVVAVVCLIALVIWELTDDN 261
D G++L+ +G+ L F ++ I +V+V+ + V +
Sbjct: 196 KGHFDIKGIILMSVGIVFFML----------FTTSYSISFLIVSVLSFLIFVKHIRKVTD 245

Query: 262 PVVDISLFHSRNFSVGCLCTSLAFLIYLGSVVLIPLLLQQVFHY-TATWAGLAASPVGLF 320
P VD L + F +G LC + F G V ++P +++ V TA + P +
Sbjct: 246 PFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMS 305

Query: 321 PILLSPIIGRFGYKIDMRILVTISFIVYAITFYWRAVTFEPSMTFVDVALPQLVQGLAVS 380
I+ I G + ++ I +++F + E + F+ + + ++ GL+ +
Sbjct: 306 VIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGLSFT 365

Query: 381 CFFMPLTTITLSGLPAHKMASASSLFNFLRTLAGSVG 417
++TI S L + + SL NF L+ G
Sbjct: 366 K--TVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTG 400


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
HI0898    RTXTOXIND  80     2e-18    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 79.9 bits (197), Expect = 2e-18
Identities = 43/287 (14%), Positives = 87/287 (30%), Gaps = 35/287 (12%)

Query: 91 ELDDTNAKLSFEQAKSNLANAVRQVE----QLGFTVQQLQSAVHANEISLAQAQGNLARR 146
E + ++ S N Q E + + + ++ E + L
Sbjct: 181 EEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDF 240

Query: 147 VQLEKMGAIDKESFQHAKEAVELAKANLNASKNQLAANQALLRNVPLR------------ 194
L AI K + + A L K+QL ++ + +
Sbjct: 241 SSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEI 300

Query: 195 ------EQPQIQNAINSLKQAWLNLQRTKIRSPIDGYVARRNVQ-VGQAVSVGGALMAVV 247
I L + Q + IR+P+ V + V G V+ LM +V
Sbjct: 301 LDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIV 360

Query: 248 -SNEQMWLEANFKETQLTNMRIGQPVKIHFDLYGKNK--EFDGVINGIEMGTGNAFSLLP 304
++ + + A + + + +GQ I + + + G + I +
Sbjct: 361 PEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDA-------I 413

Query: 305 SQNATGNWIKVVQRVPVRIKLDPQQFTETPLRIGLSATAKVRISDSS 351
G V+ + + PL G++ TA+++ S
Sbjct: 414 EDQRLGLVFNVIISIEENCLSTGNK--NIPLSSGMAVTAEIKTGMRS 458


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
HI0900    CARBMTKINASE  36     2e-04    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 36.0 bits (83), Expect = 2e-04
Identities = 33/133 (24%), Positives = 48/133 (36%), Gaps = 22/133 (16%)

Query: 129 IPVINENDAVATAEIKVGDNDNLSALVAILVQAEQLYLLTDQQGLFDSDPRKNPEAKLIP 188
+PVI E+ + E V D D +A V A+ +LTD G E L
Sbjct: 197 VPVILEDGEIKGVE-AVIDKDLAGEKLAEEVNADIFMILTDVNGAA-LYYGTEKEQWLRE 254

Query: 189 V-VEQITDHIRSIAGGSGTNLGTGGMMTKIIAADVATRSGIETIIAPGNRPNVIADL--- 244
V VE++ + + G M K++AA G +IA L
Sbjct: 255 VKVEELRKYYEE------GHFKAGSMGPKVLAAIRFIEW--------GGERAIIAHLEKA 300

Query: 245 --AYEQNIGTKFI 255
A E GT+ +
Sbjct: 301 VEALEGKTGTQVL 313



Score = 33.6 bits (77), Expect = 9e-04
Identities = 16/49 (32%), Positives = 25/49 (51%), Gaps = 4/49 (8%)

Query: 3 KKTIVVKFGTSTLTQGSPKLNSPHMMEIVR----QIAQLHNDGFRIVIV 47
K +V+ G + L Q K + MM+ VR QIA++ G+ +VI
Sbjct: 2 GKRVVIALGGNALQQRGQKGSYEEMMDNVRKTARQIAEIIARGYEVVIT 50


21  HI1114  HI1120  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias   Product
HI1114    -1      14       1.052138  ADP-L-glycero-D-mannoheptose-6-epimerase
HI1115    -1      14       1.288012  thioredoxin
HI1116    -1      14       1.996165  deoxyribose-phosphate aldolase
HI1117    -1      15       2.656176  competence protein
HI1118    1       17       2.238806  ribosome biogenesis GTP-binding protein YsxC
HI1119    -1      19       2.157025  membrane protein
HI1120    -3      19       2.316895  oligopeptide ABC transporter ATP-binding
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
HI1114    NUCEPIMERASE  99     7e-26    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 98.7 bits (246), Expect = 7e-26
Identities = 78/348 (22%), Positives = 131/348 (37%), Gaps = 68/348 (19%)

Query: 2 IIVTGGAGFIGSNIVKALNDLGRKDILVVDNLKD--------------GTKFANLVDLDI 47
+VTG AGFIG ++ K L + G ++ +DNL D +D+
Sbjct: 3 YLVTGAAGFIGFHVSKRLLEAG-HQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 48 ADYCDKEDFIASIIAGDEFGDIDAVFHEGACSATTEWDGKYIMHNNYEYSK-------EL 100
AD + + + A F + VF A +Y + N + Y+ +
Sbjct: 62 ADR----EGMTDLFASGHF---ERVFISPHRLAV-----RYSLENPHAYADSNLTGFLNI 109

Query: 101 LHYCLDREIP-FFYASSAATYGDTKV--FREEREFEGPLNVYGYSKFLFDQYVRNILPEA 157
L C +I YASS++ YG + F + + P+++Y +K +
Sbjct: 110 LEGCRHNKIQHLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLY 169

Query: 158 KSPVCGFRYFNVYGPRENHKGSMASVAFHLNNQILKGENPKLFAGSEHFRRDFVYVGDVA 217
P G R+F VYGP + MA F +L+G++ ++ + +RDF Y+ D+A
Sbjct: 170 GLPATGLRFFTVYGPWG--RPDMA--LFKFTKAMLEGKSIDVYNYGKM-KRDFTYIDDIA 224

Query: 218 AVNI------------WCWQNGISG-------IYNLGTGNAESFRAVADAVVKFHG-KGE 257
I W + G +YN+G + A+ G + +
Sbjct: 225 EAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAK 284

Query: 258 IETIPFPEHLKSRYQEYTQADLTKLRS-TGYDKPFKTVAEGVTEYMAW 304
+P T AD L G+ P TV +GV ++ W
Sbjct: 285 KNMLPLQPG----DVLETSADTKALYEVIGF-TPETTVKDGVKNFVNW 327


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
HI1115    SECA  28     0.028    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 27.5 bits (61), Expect = 0.028
Identities = 14/37 (37%), Positives = 20/37 (54%), Gaps = 11/37 (29%)

Query: 75 TSPA-INSLAKEGYQVVSVALRSGNEADVNDYLSKND 110
T PA +N+L +G VV+V NDYL++ D
Sbjct: 113 TLPAYLNALTGKGVHVVTV----------NDYLAQRD 139


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
HI1117    HTHFIS  38     1e-04    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 37.5 bits (87), Expect = 1e-04
Identities = 39/178 (21%), Positives = 58/178 (32%), Gaps = 53/178 (29%)

Query: 194 DLTDIIGQ----QHAKRALTIAAAGQHNLLFLGPPGTGKTMLASRLTGLLPEMTDLEAIE 249
D ++G+ Q R L L+ G GTGK ++A A+
Sbjct: 135 DGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVA-------------RALH 181

Query: 250 TASVTSLVQNELNFHNWKQRPFRAPHHSASMP------ALVG-------GGTIPKPGEIS 296
+ PF A + A++P L G G G
Sbjct: 182 DYG------------KRRNGPFVAI-NMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFE 228

Query: 297 LATNGVLFLDEL----PEFERKVLDALRQPLESGEIIISRANAKIQFPARFQLVAAMN 350
A G LFLDE+ + + ++L L+ GE I+ R +VAA N
Sbjct: 229 QAEGGTLFLDEIGDMPMDAQTRLLRV----LQQGEYTTVGGRTPIRSDVR--IVAATN 280


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
HI1120    HTHFIS  30     0.011    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 30.2 bits (68), Expect = 0.011
Identities = 9/16 (56%), Positives = 12/16 (75%)

Query: 54 VVGESGCGKSTLARAI 69
+ GESG GK +ARA+
Sbjct: 165 ITGESGTGKELVARAL 180


22  HI1218  HI1225  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
HI1218    2       17       -2.632951  L-lactate permease
HI1219    3       19       -1.903356  cytidylate kinase
HI1220    4       21       -2.337090  30S ribosomal protein S1
HI1221    1       20       -3.505568  integration host factor subunit beta
HI1222    0       14       -1.800846  hypothetical protein
HI1223    -1      13       -0.949705  hypothetical protein
HI1224    -2      14       -0.348006  orotidine 5'-phosphate decarboxylase
HI1225    -2      13       -0.734375  translation initiation factor Sui1
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
HI1218    BCTERIALGSPD  31     0.014    Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D signature.

Length = 660

Score = 31.0 bits (70), Expect = 0.014
Identities = 20/94 (21%), Positives = 37/94 (39%), Gaps = 10/94 (10%)

Query: 421 SGSNWTIFSSFLGAIGSFFSGSNTVSNLTFGSV--QLSTAETTGISVALVLALQSVGGAM 478
+ S I ++ GA G+ + S + S ++ G L+ AL S
Sbjct: 379 TNSGLPISTAIAGANQYNKDGTVSSSLASALSSFNGIAAGFYQGNWAMLLTALSSSTK-- 436

Query: 479 GNMVCINNIVAVSSVLNISNQEGTIIKKTIIPMI 512
N+I+A S++ + N E T +P++
Sbjct: 437 ------NDILATPSIVTLDNMEATFNVGQEVPVL 464


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
HI1220    ACRIFLAVINRP  30     0.024    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 30.2 bits (68), Expect = 0.024
Identities = 25/162 (15%), Positives = 51/162 (31%), Gaps = 22/162 (13%)

Query: 260 PSKVVSLGDTVEVMVLEIDEERRRISLG-LKQCKANPWTQFADTHNKGDKVTGKIKSITD 318
+ T ++ ++ + +I+ G L A P Q N + K+ +
Sbjct: 190 ADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQL----NASIIAQTRFKNPEE 245

Query: 319 FGIFIGLEGGIDGLVHLSDISWSISGEEAVRQYKKGDEVSAVVLAV------------DA 366
FG +V L D++ G E + + A L + A
Sbjct: 246 FGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANALDTAKA 305

Query: 367 VKERISLGIKQLEED-----PFNNFVAINKKGAVVSATVVEA 403
+K +++ + P++ + V T+ EA
Sbjct: 306 IKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEA 347


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
HI1221    DNABINDINGHU  107    2e-34    Prokaryotic integration host factor signature.
		>DNABINDINGHU#Prokaryotic integration host factor signature.

Length = 91

Score = 107 bits (269), Expect = 2e-34
Identities = 34/89 (38%), Positives = 51/89 (57%), Gaps = 1/89 (1%)

Query: 2 TKSELMEKLSAKQPTLPAKEIENMVKGILEFISQSLENGDRVEVRGFGSFSLHHRQPRLG 61
K +L+ K+ A+ L K+ V + +S L G++V++ GFG+F + R R G
Sbjct: 3 NKQDLIAKV-AEATELTKKDSAAAVDAVFSAVSSYLAKGEKVQLIGFGNFEVRERAARKG 61

Query: 62 RNPKTGDSVNLSAKSVPYFKAGKELKARV 90
RNP+TG+ + + A VP FKAGK LK V
Sbjct: 62 RNPQTGEEIKIKASKVPAFKAGKALKDAV 90


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
HI1225    SECA  33     1e-04    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 33.3 bits (76), Expect = 1e-04
Identities = 21/55 (38%), Positives = 29/55 (52%)

Query: 52 DLSDEELKKLAAELKKRCGCGGAVKNGIIEIQGEKRDLLKQLLEQKGFKVKLSGG 106
LSDEELK AE + R G ++N I E R+ K++ + F V+L GG
Sbjct: 37 KLSDEELKGKTAEFRARLEKGEVLENLIPEAFAVVREASKRVFGMRHFDVQLLGG 91



 
Contact Sachin Pundhir for Bugs/Comments.