PredictBias

Identification of genomic and pathogenicity islands in prokaryotic genomes
 
A) Input parameters
Genome: 52.gbk
Threshold dinucleotide bias: 2
Threshold codon bias: 4
Threshold %GC bias: 3
E-value (RPSBlast): 0.05
Genome (non-pathogenic):
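As a rough illustration of how the RPSBlast E-value parameter relates to the hits listed in this report, the sketch below filters a few hits by that cutoff. The E-values are copied from island 1 of this report; the rule "keep a hit when its E-value is at or below the cutoff" and the names E_VALUE_CUTOFF and passes_cutoff are assumptions made for illustration, not PredictBias's documented behaviour.

```python
# Cutoff taken from the "E-value (RPSBlast)" input parameter above.
E_VALUE_CUTOFF = 0.05

# (locus tag, profile hit, E-value) copied from the island 1 hits in this report.
hits = [
    ("HPKB_0063", "ANTHRAXTOXNA", 0.044),
    ("HPKB_0084", "UREASE", 0.0),
    ("HPKB_0094", "BACINVASINB", 0.034),
    ("HPKB_0100", "IGASERPTASE", 1e-04),
]


def passes_cutoff(e_value, cutoff=E_VALUE_CUTOFF):
    """Assumed rule: a hit is kept when its E-value does not exceed the cutoff."""
    return e_value <= cutoff


# Locus tags whose hits survive the assumed filter.
reported = [locus for locus, _hit, e_value in hits if passes_cutoff(e_value)]
print(reported)
```

Under this assumed rule all four hits pass, which is consistent with their appearing in the report; a hit with a larger E-value, such as 0.06, would be dropped.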
 
C) Potential GIs and PAIs in CP001680
S.No  Start      End        Bias  Virulence  Insertion elements  Prediction
1     HPKB_0063  HPKB_0100  Y     Y          Y                   Pathogenicity Island (biased-composition)

LocusTag   DNBias  CDNBias  %GCBias    Product
HPKB_0063  2       13       -0.291655  NAD-dependent aldehyde dehydrogenase
HPKB_0064  6       20       -2.299586  hypothetical protein
HPKB_0065  5       16       -1.513781  hypothetical protein
HPKB_0066  4       15       -0.650104  hypothetical protein
HPKB_0070  3       12       0.579232   hypothetical protein
HPKB_0071  3       13       0.999179   hypothetical protein
HPKB_0072  3       11       1.087736   hypothetical protein
HPKB_0073  2       11       1.032043   hypothetical protein
HPKB_0074  1       12       2.111842   hypothetical protein
HPKB_0079  4       21       3.430070   urease accessory protein (ureH)
HPKB_0080  5       23       3.114723   urease accessory protein UreG
HPKB_0081  4       21       2.277448   urease accessory protein UreF
HPKB_0082  3       17       2.296594   urease accessory protein
HPKB_0083  3       19       2.350559   urease accessory protein / pH-dependent
HPKB_0084  1       16       2.342038   urease B
HPKB_0085  0       9        2.138709   urease subunit alpha
HPKB_0087  2       11       2.108072   *lipoprotein signal peptidase
HPKB_0088  2       12       1.719315   phosphoglucosamine mutase
HPKB_0089  2       13       1.837597   ribosomal protein S20
HPKB_0090  2       12       1.936281   peptide chain release factor RF-1
HPKB_0092  3       13       1.511279   hypothetical protein
HPKB_0093  1       12       0.844267   hypothetical protein
HPKB_0094  -2      12       0.291508   methyl-accepting chemotaxis protein
HPKB_0095  0       13       0.354671   ribosomal protein S9
HPKB_0096  1       11       0.351578   ribosomal protein L13
HPKB_0097  1       11       0.621353   hypothetical protein
HPKB_0098  1       10       -0.309122  malate:quinone oxidoreductase
HPKB_0099  1       10       -0.570557  hypothetical protein
HPKB_0100  3       13       -0.888321  sigma 70 (RpoD)
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
HPKB_0063  ANTHRAXTOXNA  31     0.044    Anthrax toxin LF subunit signature.
		>ANTHRAXTOXNA#Anthrax toxin LF subunit signature.

Length = 800

Score = 30.9 bits (69), Expect = 0.044
Identities = 36/173 (20%), Positives = 70/173 (40%), Gaps = 19/173 (10%)

Query: 121 QEESRLKERILKRKNEKIILNVNFIGEEVLGEEEASARFEKY---SQALKSNYIQYISIK 177
Q+ S ++ + + EK+ F+ E+ + + Y S+ K Y +
Sbjct: 118 QDLSEEEKNSMNSRGEKVPFASRFVFEKKRETPKLIINIKDYAINSEQSKEVYYEIGKGI 177

Query: 178 ITTIFSQINILDFEY-----SKKEIVKRLDALYALALEEEKKQGMPKFINLDMEEFRDLE 232
I S+ LD E+ S + D L++ +E K + K I+++ ++
Sbjct: 178 SLDIISKDKSLDPEFLNLIKSLSDDSDSSDLLFSQKFKE-KLELNNKSIDINF-----IK 231

Query: 233 LTVESFMESIAK-----FDLNAGIVLQAYIPDSYEYLKKLHAFSKERVLKGLK 280
+ F + + F + VL+ Y PD +EY+ KL E++ + LK
Sbjct: 232 ENLTEFQHAFSLAFSYYFAPDHRTVLELYAPDMFEYMNKLEKGGFEKISESLK 284


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
HPKB_0084  UREASE  1047   0.0      Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 1047 bits (2708), Expect = 0.0
Identities = 353/569 (62%), Positives = 441/569 (77%), Gaps = 4/569 (0%)

Query: 3 KISRKEYASMYGPTTGDKVRLGDTDLIAEVEHDYTIYGEELKFGGGKTLREGMSQSN-NP 61
++SR YA+M+GPT GDKVRL DT+L EVE D+T +GEE+KFGGGK +R+GM QS
Sbjct: 4 RMSRAAYANMFGPTVGDKVRLADTELFIEVEKDFTTHGEEVKFGGGKVIRDGMGQSQVTR 63

Query: 62 SKEELDLIITNALIVDYTGIYKADIGIKDGKIAGIGKGGNKDTQDGVKNNLSVGPATEAL 121
+D +ITNALI+D+ GI KADIG+KDG+IA IGK GN D Q GV + VGP TE +
Sbjct: 64 EGGAVDTVITNALILDHWGIVKADIGLKDGRIAAIGKAGNPDMQPGV--TIIVGPGTEVI 121

Query: 122 AGEGLIVTAGGIDTHIHFISPQQIPTAFASGVTTMIGGGTGPADGTNATTITPGRRNLKW 181
AGEG IVTAGG+D+HIHFI PQQI A SG+T M+GGGTGPA GT ATT TPG ++
Sbjct: 122 AGEGKIVTAGGMDSHIHFICPQQIEEALMSGLTCMLGGGTGPAHGTLATTCTPGPWHIAR 181

Query: 182 MLRAAEEYSMNLGFLAKGNTSNDASLADQIEAGAIGFKIHEDWGTTPSAINHALDVADKY 241
M+ AA+ + MNL F KGN S +L + + GA K+HEDWGTTP+AI+ L VAD+Y
Sbjct: 182 MIEAADAFPMNLAFAGKGNASLPGALVEMVLGGATSLKLHEDWGTTPAAIDCCLSVADEY 241

Query: 242 DVQVAIHTDTLNEAGCVEDTMAAIAGRTMHTFHTEGAGGGHAPDIIKVAGEHNILPASTN 301
DVQV IHTDTLNE+G VEDT+AAI GRT+H +HTEGAGGGHAPDII++ G+ N++P+STN
Sbjct: 242 DVQVMIHTDTLNESGFVEDTIAAIKGRTIHAYHTEGAGGGHAPDIIRICGQPNVIPSSTN 301

Query: 302 PTIPFTVNTEAEHMDMLMVCHHLDKNIKEDVQFADSRIRPQTIAAEDTLHDMGIFSITSS 361
PT P+TVNT AEH+DMLMVCHHL I ED+ FA+SRIR +TIAAED LHD+G FSI SS
Sbjct: 302 PTRPYTVNTLAEHLDMLMVCHHLSPTIPEDIAFAESRIRKETIAAEDILHDIGAFSIISS 361

Query: 362 DSQAMGRVGEVITRTWQTADKNKKEFGRLKEEKGDNDNFRIKRYLSKYTINPAIAHGISE 421
DSQAMGRVGEV RTWQTADK K++ GRLKEE GDNDNFR+KRY++KYTINPAIAHG+S
Sbjct: 362 DSQAMGRVGEVAIRTWQTADKMKRQRGRLKEETGDNDNFRVKRYIAKYTINPAIAHGLSH 421

Query: 422 YVGSVEVGKVADLVLWSPAFFGVKPNMIIKGGFIALSQMGDANASIPTPQPVYYREMFAH 481
+GS+EVGK ADLVLW+PAFFGVKP+M++ GG IA + MGD NASIPTPQPV+YR MF
Sbjct: 422 EIGSLEVGKRADLVLWNPAFFGVKPDMVLLGGTIAAAPMGDPNASIPTPQPVHYRPMFGA 481

Query: 482 HGKAKYDANITFVSQAAYDKGIKEELGLERQVLPVKNCR-NITKKDMQFNDTTAHIEVNP 540
+G+++ ++++TFVSQA+ D G+ LG+ ++++ V+N R I K M N T HIEV+P
Sbjct: 482 YGRSRTNSSVTFVSQASLDAGLAGRLGVAKELVAVQNTRGGIGKASMIHNSLTPHIEVDP 541

Query: 541 ETYHVFVDGKEVTSKPANKVSLAQLFSIF 569
ETY V DG+ +T +PA + +AQ + +F
Sbjct: 542 ETYEVRADGELLTCEPATVLPMAQRYFLF 570
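The "Identities / Positives / Gaps" lines in the alignment blocks of this report can be read programmatically. The sketch below is a minimal, hypothetical parser (the names STATS_RE and parse_alignment_stats are illustrative, not part of PredictBias or BLAST); it extracts the counts and recomputes each percentage with integer truncation, which matches the printed values in the lines checked here (for example 441/569 is printed as 77%).

```python
import re

# Matches e.g. "Identities = 353/569 (62%), Positives = 441/569 (77%), Gaps = 4/569 (0%)".
# The Gaps field is optional because some alignments in the report omit it.
STATS_RE = re.compile(
    r"Identities = (\d+)/(\d+) \((\d+)%\), "
    r"Positives = (\d+)/(\d+) \((\d+)%\)"
    r"(?:, Gaps = (\d+)/(\d+) \((\d+)%\))?"
)


def parse_alignment_stats(line):
    """Return counts, totals, printed and recomputed percentages per field."""
    match = STATS_RE.search(line)
    if match is None:
        raise ValueError("not an alignment statistics line: %r" % line)
    fields = {}
    for name, start in (("identities", 1), ("positives", 4), ("gaps", 7)):
        if match.group(start) is None:
            continue  # field absent (no gaps reported for this alignment)
        count, total, printed = (int(match.group(start + i)) for i in range(3))
        fields[name] = {
            "count": count,
            "total": total,
            "printed_pct": printed,
            # Assumption: the percentage is truncated, not rounded.
            "floor_pct": 100 * count // total,
        }
    return fields


stats = parse_alignment_stats(
    "Identities = 353/569 (62%), Positives = 441/569 (77%), Gaps = 4/569 (0%)"
)
print(stats["identities"], stats["positives"], stats["gaps"])
```

Comparing printed_pct with floor_pct gives a quick consistency check when copying alignment statistics out of the report.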


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
HPKB_0094  BACINVASINB  30     0.034    Salmonella/Shigella invasin protein B signature.
		>BACINVASINB#Salmonella/Shigella invasin protein B signature.

Length = 593

Score = 30.1 bits (67), Expect = 0.034
Identities = 38/153 (24%), Positives = 60/153 (39%), Gaps = 27/153 (17%)

Query: 464 MEKALNTLGQEISSMLKASLGFANA------LNHESKDLKTCVDNLTKTAHKQERSLKNT 517
K L ++ S+ A G+A A E+ + K +D T K + +
Sbjct: 160 ATKKLTQAQNKLQSLDPADPGYAQAEAAVEQAGKEATEAKEALDKATDATVK---AGTDA 216

Query: 518 TQSLEEITNIIT----TIDSKSQEMISQGED--------IKSVVDMIREIADQT------ 559
E+ NI+T T ++ SQ +SQGE + ++ M EI +
Sbjct: 217 KAKAEKADNILTKFQGTANAASQNQVSQGEQDNLSNVARLTMLMAMFIEIVGKNTEESLQ 276

Query: 560 NLLALNAAIEAARAGEHGRGFAVVADEVRKLAE 592
N LAL A++ R E + A +E RK E
Sbjct: 277 NDLALFNALQEGRQAEMEKKSAEFQEETRKAEE 309


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
HPKB_0100  IGASERPTASE  38     1e-04    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 38.1 bits (88), Expect = 1e-04
Identities = 19/71 (26%), Positives = 38/71 (53%), Gaps = 7/71 (9%)

Query: 8 EKASKRAKQEAKTEATQENKAKENKIKESKI-KEAK------TKESKIKEAKTKESKIKE 60
E ++ +KQE+KT E A E + ++ KEAK T+ +++ ++ ++ + +
Sbjct: 1038 ETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQT 1097

Query: 61 TKAKEPVPVKK 71
T+ KE V+K
Sbjct: 1098 TETKETATVEK 1108


2     HPKB_0182  HPKB_0228  Y     Y          N                   Pathogenicity Island (biased-composition)

LocusTag   DNBias  CDNBias  %GCBias    Product
HPKB_0182  2       19       -4.016704  Protein maturation protease (Peptidylprolyl
HPKB_0183  2       18       -3.844516  fructose-bisphosphate aldolase
HPKB_0184  2       20       -4.535845  elongation factor P
HPKB_0185  -1      19       -5.054715  hypothetical protein
HPKB_0186  -1      16       -4.518038  Type II restriction-modification system
HPKB_0187  -2      12       -2.470926  hypothetical protein
HPKB_0188  -1      14       -0.054265  sialic acid synthase
HPKB_0189  -2      12       0.143482   ABC transporter, ATP-binding protein
HPKB_0190  -2      11       0.016509   apolipoprotein N-acyltransferase
HPKB_0191  2       12       0.799971   hypothetical protein
HPKB_0192  1       12       0.928192   lysyl-tRNA synthetase
HPKB_0193  1       12       0.581029   serine hydroxymethyltransferase
HPKB_0194  1       14       0.938404   hypothetical protein
HPKB_0195  1       12       2.507268   hypothetical protein
HPKB_0196  0       12       2.628461   hypothetical protein
HPKB_0197  -1      9        2.242969   hypothetical protein
HPKB_0198  -1      9        2.333393   phopholipase D-family protein
HPKB_0199  0       10       3.103515   fumarate reductase iron-sulfur subunit
HPKB_0200  -1      10       3.069438   fumarate reductase flavoprotein subunit
HPKB_0201  -1      13       1.558251   fumarate reductase cytochrome b-556 subunit
HPKB_0202  -2      15       1.537282   triosephosphate isomerase
HPKB_0203  -1      16       2.620614   enoyl-(acyl carrier protein) reductase
HPKB_0204  0       16       2.727857   UDP-3-O-(3-hydroxymyristoyl) glucosamine
HPKB_0205  -1      17       2.594754   S-adenosylmethionine synthetase
HPKB_0206  -1      17       1.263042   nucleoside diphosphate kinase (ndk)
HPKB_0207  1       13       -4.049210  hypothetical protein
HPKB_0208  0       12       -2.617998  50S ribosomal protein L32
HPKB_0211  0       10       -2.829448  3-oxoacyl-(acyl carrier protein) synthase III
HPKB_0212  0       10       -3.314410  hypothetical protein
HPKB_0213  -1      8        -3.287943  hypothetical protein
HPKB_0214  -2      9        -3.002623  ATP-dependent OLD family endonuclease
HPKB_0215  -2      8        -0.163327  ATP-binding protein
HPKB_0216  -1      9        -0.106426  hypothetical protein
HPKB_0217  -1      11       0.877521   hypothetical protein
HPKB_0218  -2      13       1.443286   heat shock protein 90
HPKB_0219  -1      13       2.854214   Sel1 repeat-containing protein
HPKB_0220  0       13       2.845006   succinyl-diaminopimelate desuccinylase
HPKB_0223  -1      12       3.348217   sodium-dependent transporter
HPKB_0224  -2      11       3.546098   phosphatidate cytidylyltransferase
HPKB_0225  -1      9        2.912824   1-deoxy-D-xylulose 5-phosphate reductoisomerase
HPKB_0228  -1      9        3.082163   putative phospholipid binding protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
HPKB_0189  PF05272  30     0.009    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.7 bits (66), Expect = 0.009
Identities = 14/59 (23%), Positives = 23/59 (38%), Gaps = 1/59 (1%)

Query: 23 IKPQESLAILGVSGSGKSTLLSHLATMLKPDSGTISLLEHQDIY-ALNSKKLLELRRLK 80
K S+ + G G GKSTL++ L + + +D Y + EL +
Sbjct: 593 CKFDYSVVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGTGKDSYEQIAGIVAYELSEMT 651


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
HPKB_0195  IGASERPTASE  35     2e-04    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 35.4 bits (81), Expect = 2e-04
Identities = 34/150 (22%), Positives = 59/150 (39%), Gaps = 8/150 (5%)

Query: 50 PKETFLQTDSGMQKIGNTKDEKKDDAFESLNLDPSKQESDLDKVADNVKKQENDAFKMPI 109
P ET ++ T ++ + DA E+ + + V N Q N+ +
Sbjct: 1033 PSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKAN--TQTNEVAQSGS 1090

Query: 110 QTNQTQTEMKTTEETQEAKKELKA-VEHTPMSAQKESQAVAKKETPHKKPKVTPKDKEAH 168
+T +TQT T E +++ K E T + SQ K+E + V P+ + A
Sbjct: 1091 ETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQE---QSETVQPQAEPAR 1147

Query: 169 KDKVKHAAKELKAK--KEAHKEVPKKANSK 196
++ KE +++ A E P K S
Sbjct: 1148 ENDPTVNIKEPQSQTNTTADTEQPAKETSS 1177



Score = 32.3 bits (73), Expect = 0.002
Identities = 30/124 (24%), Positives = 51/124 (41%), Gaps = 10/124 (8%)

Query: 79 LNLDPSKQESDLDKVADNVKKQENDAFKMPIQTNQTQTEMKTTEETQEAKKELKAVEHTP 138
PS+ + VA+N K++ K + + T+T + E +EAK +KA T
Sbjct: 1029 APATPSETT---ETVAENSKQESKTVEKN--EQDATETTAQNREVAKEAKSNVKANTQTN 1083

Query: 139 MSAQ-----KESQAVAKKETPHKKPKVTPKDKEAHKDKVKHAAKELKAKKEAHKEVPKKA 193
AQ KE+Q KET + + K + +V ++ K+E + V +A
Sbjct: 1084 EVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQA 1143

Query: 194 NSKT 197

Sbjct: 1144 EPAR 1147


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
HPKB_0203  DHBDHDRGNASE  60     8e-13    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 60.1 bits (145), Expect = 8e-13
Identities = 61/263 (23%), Positives = 109/263 (41%), Gaps = 29/263 (11%)

Query: 4 LKGKKGLIVGVANNKSIAYGIAQSCFNQGATL-AFTYLNESLEKRVRPIAQELNSPYVYE 62
++GK I G A + I +A++ +QGA + A Y E LEK V + E +
Sbjct: 6 IEGKIAFITGAA--QGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFP 63

Query: 63 LDVSKEEHFKPLYDSVKKDLGSLDFIVHSVAF--------APKEALEGSLLETSKSAFNT 114
DV + +++++G +D +V+ E E + S FN
Sbjct: 64 ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNA 123

Query: 115 AMEISVYSLIELTNTLKPLLNNGASVLTLSYLGSTKYMAHYNVMGLAKAALESAVRYLAV 174
+ +S Y + + ++ + +N A V S MA Y +KAA + L +
Sbjct: 124 SRSVSKYMMDRRSGSIVTVGSNPAGVPRTS-------MAAY---ASSKAAAVMFTKCLGL 173

Query: 175 DLGKHNIRVNALSAGPIRT-----LASSGIADFRMILKWNE---INAPLRKNVSLEEVGN 226
+L ++NIR N +S G T L + ++I E PL+K ++ +
Sbjct: 174 ELAEYNIRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIAD 233

Query: 227 AGMYLLSSLSSGVSGEVHFVDAG 249
A ++L+S + ++ VD G
Sbjct: 234 AVLFLVSGQAGHITMHNLCVDGG 256


3     HPKB_0305  HPKB_0343  Y     Y          Y                   Pathogenicity Island (biased-composition)

LocusTag   DNBias  CDNBias  %GCBias    Product
HPKB_0305  2       12       1.794042   acylamide amidohydrolase
HPKB_0306  1       10       1.627217   flagellar hook-associated protein FlgL
HPKB_0307  2       14       2.851844   50S ribosomal protein L21
HPKB_0308  2       14       2.911552   50S ribosomal protein L27
HPKB_0309  1       13       3.028052   heme-binding lipoprotein
HPKB_0310  1       14       3.431239   dipeptide permease protein
HPKB_0311  -1      13       2.802397   dipeptide permease protein
HPKB_0312  -2      14       2.474667   dipeptide ABC transporter, ATP-binding protein
HPKB_0313  -2      13       2.165677   dipeptide transport system atp-binding protein
HPKB_0314  -1      12       1.682372   GTPase ObgE
HPKB_0315  -1      12       1.334060   hypothetical protein
HPKB_0316  0       16       1.956287   hypothetical protein
HPKB_0317  1       17       2.420998   glutamate-1-semialdehyde aminotransferase
HPKB_0318  4       16       1.817923   hypothetical protein
HPKB_0319  4       15       1.530739   hypothetical protein
HPKB_0320  4       14       1.825336   hypothetical protein
HPKB_0321  1       11       0.678468   hypothetical protein
HPKB_0322  1       12       -0.039355  hypothetical protein
HPKB_0323  0       12       -0.267008  hypothetical protein
HPKB_0324  1       13       -0.999877  major facilitator transporter
HPKB_0325  2       13       -1.262340  hypothetical protein
HPKB_0326  1       13       -1.559536  arginyl-tRNA synthetase
HPKB_0327  1       12       -1.038895  sec-independent protein translocase protein
HPKB_0328  0       12       -1.149537  guanylate kinase
HPKB_0329  0       12       -1.495506  internalin protein
HPKB_0330  -1      12       -1.076844  putative endonuclease
HPKB_0331  1       14       -0.979574  outer membrane protein HorC
HPKB_0332  3       13       -0.466558  flagellar basal body L-ring protein
HPKB_0333  3       15       -0.336057  CMP-N-acetylneuraminic acid synthetase
HPKB_0334  2       15       -0.366733  CMP-N-acetylneuraminic acid synthetase
HPKB_0335  3       14       0.647766   tetraacyldisaccharide 4'-kinase
HPKB_0336  2       15       1.438104   NH(3)-dependent NAD+ synthetase
HPKB_0338  0       15       1.565833   *ketol-acid reductoisomerase
HPKB_0339  1       14       0.617975   cell division inhibitor
HPKB_0340  2       16       0.461814   cell division topological specificity factor
HPKB_0341  1       17       0.948926   hypothetical protein
HPKB_0342  1       19       0.121309   Holliday junction resolvase-like protein
HPKB_0343  2       20       -0.420814  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits       Score  E-value  Comments
HPKB_0306  FLAGELLIN  52     6e-09    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 52.4 bits (125), Expect = 6e-09
Identities = 24/126 (19%), Positives = 55/126 (43%), Gaps = 1/126 (0%)

Query: 14 NYQNALQNKINDTNTQIASGLKIRYGYQNSNINNQNLKFQYEENTLDQGIDVAQNAYTST 73
N N Q+ ++ +++SGL+I ++ +F L Q A + +
Sbjct: 15 NNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGLTQASRNANDGISIA 74

Query: 74 LNTDKALQEFSKTMETFKTKLIQSANDVHSETSRAAIANDLERLREHMMNVAN-TSIGGE 132
T+ AL E + ++ + +Q+ N +S++ +I +++++ E + V+N T G
Sbjct: 75 QTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEEIDRVSNQTQFNGV 134

Query: 133 FLFGGS 138
+
Sbjct: 135 KVLSQD 140


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
HPKB_0324  TCRTETA  46     1e-07    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 46.4 bits (110), Expect = 1e-07
Identities = 56/271 (20%), Positives = 105/271 (38%), Gaps = 16/271 (5%)

Query: 28 LILSGSLTPHQSFQLGIAVLMGYVFGSFLIQFLSPLISLESIAKISFGLIALSFLICYFD 87
L+ S +T H L + LM + L LS + +S A+ + I
Sbjct: 35 LVHSNDVTAHYGILLALYALMQFACAPVLGA-LSDRFGRRPVLLVSLAGAAVDYAI--MA 91

Query: 88 SIPFFW-LWIWRFIAGVASSALMILVAPLSLPYVKENKKALVGGFIFSAVGIGSVFSGFV 146
+ PF W L+I R +AG+ + A + +++A GF+ + G G V +
Sbjct: 92 TAPFLWVLYIGRIVAGI-TGATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVL 150

Query: 147 LPWISSYNIKWAWIFLGGSCLIAFILSLIGLKN-HSLKKKSVKKEESAFKIPFHL----- 200
+ ++ + + F+ L H +++ +++E F
Sbjct: 151 GGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMT 210

Query: 201 ---WLLLISCALNAIGFLPHTLFWVDYLIRHLNISPTTAGTSWALFG-FGATLGSLISGP 256
L+ + + +G +P L WV + + TT G S A FG + ++I+GP
Sbjct: 211 VVAALMAVFFIMQLVGQVPAAL-WVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGP 269

Query: 257 MAQKLGAKNANIFILILKSIACFLPIFFHQI 287
+A +LG + A + +I L F +
Sbjct: 270 VAARLGERRALMLGMIADGTGYILLAFATRG 300


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
HPKB_0328  PF05272  29     0.011    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.011
Identities = 9/18 (50%), Positives = 11/18 (61%)

Query: 8 LILSGPSGAGKSTLTKYL 25
++L G G GKSTL L
Sbjct: 599 VVLEGTGGIGKSTLINTL 616


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
HPKB_0329  IGASERPTASE  56     2e-10    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 56.2 bits (135), Expect = 2e-10
Identities = 57/274 (20%), Positives = 88/274 (32%), Gaps = 20/274 (7%)

Query: 140 ELENLGDLEALVQEEPNNEEQLLPTLNDQEEKEEVKETPQEEKPKDDETQEGETLKNEEV 199
E+E N Q +E + P T T E
Sbjct: 984 EVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAEN 1043

Query: 200 SKELETQEEIKE-ETQEQAKEQEPIKEETQEEVKEETQ-------EEIKEETQEEIKEET 251
SK+ E E + E + + +E + VK TQ +ETQ +ET
Sbjct: 1044 SKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKET 1103

Query: 252 QEI-KEEKQ----EKTQDSPSVQELEAMQELVKEIQENSNGQENKKETQESAETPQETPQ 306
+ KEEK EKTQ+ P V + ++ E + + + + + PQ Q
Sbjct: 1104 ATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQS--Q 1161

Query: 307 EKETQKLETPQEEIQENAEKTQKLETQEDHYESIEDIPEPVMAQAMGEELPFLNESVAKI 366
T E P +E N E+ T + S+ + PE P +N +
Sbjct: 1162 TNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPA---TTQPTVNSESSNK 1218

Query: 367 PNNENDTETPKESVIKTPQEKEESDKTSSPLELR 400
P N + SV + S S + L
Sbjct: 1219 PKNRHRRSV--RSVPHNVEPATTSSNDRSTVALC 1250



Score = 30.0 bits (67), Expect = 0.022
Identities = 33/202 (16%), Positives = 66/202 (32%), Gaps = 28/202 (13%)

Query: 148 EALVQEEPNNEEQLLPTLNDQEEKEEVKETPQEEKPKDDETQEGETL------------- 194
Q EP E PT+N +E + + T E+P + + E
Sbjct: 1138 TVQPQAEPAREND--PTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSV 1195

Query: 195 -KNEEVSKELETQEEIKEETQEQAKEQEPI----KEETQEEVKEETQEEIKEETQEEIKE 249
+N E + TQ + E+ + K + E + + +
Sbjct: 1196 VENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCDLTST 1255

Query: 250 ETQEIKEEKQEKTQDSPSVQELEAMQELVKEIQENSNGQEN------KKETQESAETPQE 303
T + + + K Q ++ +A+ + + +++ N+ GQ N S+ +
Sbjct: 1256 NTNAVLSDARAKAQ-FVALNVGKAVSQHISQLEMNNEGQYNVWVSNTSMNKNYSSSQYRR 1314

Query: 304 TPQEKETQKLETPQEEIQENAE 325
K TQ + I N +
Sbjct: 1315 F-SSKSTQTQLGWDQTISNNVQ 1335


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
HPKB_0332  FLGLRINGFLGH  190    4e-63    Flagellar L-ring protein signature.
		>FLGLRINGFLGH#Flagellar L-ring protein signature.

Length = 232

Score = 190 bits (485), Expect = 4e-63
Identities = 51/172 (29%), Positives = 84/172 (48%), Gaps = 18/172 (10%)

Query: 56 GERPLFADRRAMKPNDLITIIVSEKASANYSSS----KDYKSASGGNSTPPRLTYNGLDE 111
G +PLF DRR D +TI++ E SA+ SSS +D K+ G ++ P L GL
Sbjct: 61 GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYL--QGLFG 118

Query: 112 RKKQEAQYLDDKNNYNFTKSSNNTNFKGGGSQKKSEDLEIVLSARIIKVLENGNYFIYGN 171
+ + + S F G G S L+ + +VL NGN + G
Sbjct: 119 NARADVEA------------SGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGE 166

Query: 172 KEVLVDGEKQILKVSGVIRPYDIERNNTIQSKFLADAKIEYTNLGHLSDSNK 223
K++ ++ + ++ SGV+ P I +NT+ S +ADA+IEY G+++++
Sbjct: 167 KQIAINQGTEFIRFSGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQN 218


4     HPKB_0440  HPKB_0475  Y     Y          Y                   Pathogenicity Island (biased-composition)

LocusTag   DNBias  CDNBias  %GCBias    Product
HPKB_0440  -2      12       -3.049655  dihydroorotate dehydrogenase 2
HPKB_0441  -2      13       -3.300292  polyphosphate kinase
HPKB_0443  -2      14       -3.837351  *type I restriction enzyme specificity subunit
HPKB_0446  -1      14       -3.168658  type I restriction enzyme R protein (hsdR)
HPKB_0447  0       14       -2.618571  hypothetical protein
HPKB_0448  0       14       -2.397847  hypothetical protein
HPKB_0449  0       12       -1.800565  hypothetical protein
HPKB_0450  -1      13       -2.209172  hypothetical protein
HPKB_0451  0       14       -2.321163  hypothetical protein
HPKB_0452  -1      12       -2.038950  hypothetical protein
HPKB_0453  -1      14       -2.368810  glutathione-regulated potassium-efflux system
HPKB_0454  0       13       -3.357799  outer membrane protein HorE
HPKB_0455  1       12       -4.259729  hypothetical protein
HPKB_0456  0       9        -2.172809  putative ABC transporter, ATP-binding protein,
HPKB_0457  -1      10       -2.349289  glutamyl-tRNA synthetase
HPKB_0458  -1      11       -2.629828  hypothetical protein
HPKB_0459  -1      12       -3.083739  type II R-M system methyltransferase
HPKB_0460  -1      12       -1.623257  DD-heptosyltransferase
HPKB_0461  1       13       0.462419   GTP-binding protein TypA
HPKB_0464  5       15       -0.561271  type II restriction endonuclease
HPKB_0465  2       15       -1.070969  type II DNA modification enzyme
HPKB_0466  2       16       -1.036857  hypothetical protein
HPKB_0467  2       17       -0.117783  catalase-like protein
HPKB_0468  3       14       1.112565   outer membrane protein HofC
HPKB_0469  1       12       0.698350   putative Outer membrane protein
HPKB_0470  1       14       0.750275   hypothetical protein
HPKB_0471  2       13       2.310114   hypothetical protein
HPKB_0472  2       13       2.551151   Holliday junction resolvase
HPKB_0473  1       14       2.635074   hypothetical protein
HPKB_0474  2       13       3.187995   catalase
HPKB_0475  1       14       3.076034   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits        Score  E-value  Comments
HPKB_0446  FLGHOOKAP1  31     0.029    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 31.1 bits (70), Expect = 0.029
Identities = 20/165 (12%), Positives = 54/165 (32%), Gaps = 5/165 (3%)

Query: 682 NRPYNNMSFGYLIDFVGIQENFDKTTDDYLKELNRFNQNGANSDSNIKDIFADRENLEKD 741
+ G + G+Q +D + L+ + + I
Sbjct: 45 STLGAGGWVGNGVYVSGVQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSS 104

Query: 742 IKNAYNDLFDYPIDDIEDMTSAIVKMSEINELQKVSHAIKTLKERYNLIRTSNDEKILSL 801
+ D F + + + + I + + + + KT + R + + +++
Sbjct: 105 LATQMQDFFTSLQTLVSNAEDPAARQALIGKSEGLVNQFKTTDQYL---RDQDKQ--VNI 159

Query: 802 KEKIDIEKINKISSTLNQKAKQLYALKNINEPKNPNDLIILEDLI 846
+++IN + + Q+ L + +PN+L+ D +
Sbjct: 160 AIGASVDQINNYAKQIASLNDQISRLTGVGAGASPNNLLDQRDQL 204


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
HPKB_0449  CHANLCOLICIN  28     0.006    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 28.1 bits (62), Expect = 0.006
Identities = 16/48 (33%), Positives = 27/48 (56%)

Query: 46 SFQDPEKREEYIERLKKNHERKMILQDKQKEEQMRLYQAKKERESRQK 93
+FQ+ E+R + IER K ER++ L + +++ L + K E QK
Sbjct: 149 AFQEAEQRRKEIEREKAETERQLKLAEAEEKRLAALSEEAKAVEIAQK 196


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits       Score  E-value  Comments
HPKB_0461  TCRTETOQM  198    8e-58    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 198 bits (506), Expect = 8e-58
Identities = 115/461 (24%), Positives = 191/461 (41%), Gaps = 67/461 (14%)

Query: 3 NIRNIAVIAHVDHGKTTLVDGLLSQSGTFSEREKVDE--RVMDSNDLERERGITILSKNT 60
I NI V+AHVD GKTTL + LL SG +E VD+ D+ LER+RGITI + T
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 61 AIYYKDTKINIIDTPGHADFGGEVERVLKMVDGVLLLVDAQEGVMPQTKFVVKKALSFGI 120
+ +++TK+NIIDTPGH DF EV R L ++DG +LL+ A++GV QT+ + GI
Sbjct: 62 SFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGI 121

Query: 121 CPIVVVNKIDKPAAEPDRVVDEVFDLF---------VAMGASDKQLDFPV-----VYAAA 166
I +NKID+ + V ++ + V + + +F
Sbjct: 122 PTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEG 181

Query: 167 RDGYAMKSLDDE----------------------------KKNL--EPLFETILEHVPSP 196
D K + + K N+ + L E I S
Sbjct: 182 NDDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVITNKFYSS 241

Query: 197 SGSVEEPLQMQIFTLDYDNYVGKIGIARVFNGSVKKNESVLLMKSDGSKENGRITKLIGF 256
+ + L ++F ++Y ++ R+++G + +SV + KE +IT++
Sbjct: 242 THRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRI----SEKEKIKITEMYTS 297

Query: 257 LGLARTEIENAYAGDIVAIAG--FNAMDV-GDSVVDPTNPMPLDPMHLEEPTMSVYFAVN 313
+ +I+ AY+G+IV + V GD+ + P +P P + +
Sbjct: 298 INGELCKIDKAYSGEIVILQNEFLKLNSVLGDTKLLPQRERIENP----LPLLQTTVEPS 353

Query: 314 DSPLAGLEGKHVTANKLKDRLLKEMQTNIAMKCEEMGEGKFKVSGRGELQITILAENLRR 373
+ + D LL+ + + +S G++Q+ + L+
Sbjct: 354 KPQQREMLLDALLEISDSDPLLRYYVDSAT--------HEIILSFLGKVQMEVTCALLQE 405

Query: 374 E-GFEFSISRPEVIIKEENGVKCEPFEHLVIDTPQDFSGAI 413
+ E I P VI E K E H+ + P F +I
Sbjct: 406 KYHVEIEIKEPTVIYMERPLKKAEYTIHIEVP-PNPFWASI 445



Score = 41.8 bits (98), Expect = 8e-06
Identities = 20/80 (25%), Positives = 30/80 (37%), Gaps = 1/80 (1%)

Query: 396 EPFEHLVIDTPQDFSGAIIERLGKRKAEMKAMNPMSDGYTRLEFEIPARGLIGYRSEFLT 455
EP+ I PQ++ K A + + + L EIPAR + YRS+
Sbjct: 537 EPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQ-LKNNEVILSGEIPARCIQEYRSDLTF 595

Query: 456 DTKGEGVMNHSFLEFRPFSG 475
T G V + +G
Sbjct: 596 FTNGRSVCLTELKGYHVTTG 615


5     HPKB_0647  HPKB_0678  Y     Y          Y                   Pathogenicity Island (biased-composition)

LocusTag   DNBias  CDNBias  %GCBias    Product
HPKB_0647  2       11       2.551690   DNA gyrase subunit A
HPKB_0648  2       14       2.448226   hypothetical protein
HPKB_0649  2       14       2.412490   hypothetical protein
HPKB_0650  3       14       3.340132   hypothetical protein
HPKB_0651  3       13       3.572766   N-methylhydantoinase B
HPKB_0652  1       12       3.888547   hydantoin utilization protein A
HPKB_0653  2       11       2.773973   putative outer membrane protein
HPKB_0654  1       13       0.626309   short-chain fatty acids transporter
HPKB_0655  1       16       -1.993383  3-oxoacid CoA-transferase subunit B
HPKB_0656  0       17       -2.855511  3-oxoacid CoA-transferase subunit A
HPKB_0657  -1      15       -2.772455  hypothetical protein
HPKB_0658  2       16       -3.277415  large-conductance mechanosensitive channel
HPKB_0659  2       15       -3.153733  putative type II restriction enzyme
HPKB_0660  1       12       -1.187496  Modification methylase CfrBI
HPKB_0661  1       11       0.360165   3'-5' exonuclease
HPKB_0662  3       12       0.752123   ferrous iron transport protein B
HPKB_0663  2       10       0.927278   TonB-dependent siderophore receptor
HPKB_0664  -1      10       1.090283   flagellar biosynthesis protein FliP
HPKB_0665  1       8        1.081704   glucosamine-1-phosphate N-acetyltransferase
HPKB_0666  1       9        0.358469   hypothetical protein
HPKB_0667  -1      10       -0.034378  hypothetical protein
HPKB_0668  -1      10       -1.753544  ribonucleotide-diphosphate reductase subunit
HPKB_0669  0       11       -1.742641  putative lipopolysaccharide biosynthesis
HPKB_0670  0       13       -1.387821  hypothetical protein
HPKB_0671  0       18       -4.830901  methylated-DNA--protein-cysteine
HPKB_0672  1       18       -4.764827  phage integrase family protein
HPKB_0673  0       16       -4.768809  hypothetical protein
HPKB_0674  -1      13       -3.208735  aspartate aminotransferase
HPKB_0675  -1      15       -4.666758  putative Outer membrane protein
HPKB_0676  0       16       -4.388503  hypothetical protein
HPKB_0677  1       14       -0.573021  hypothetical protein
HPKB_0678  2       15       -0.568194  glycerol-3-phosphatedehydrogenase
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
HPKB_0664  FLGBIOSNFLIP  276    2e-96    Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP signature.

Length = 245

Score = 276 bits (708), Expect = 2e-96
Identities = 113/245 (46%), Positives = 162/245 (66%), Gaps = 2/245 (0%)

Query: 1 MRFFIFLILICPLICPLMSADSALPSVNLSLNAPNDPKQLVTTLNVIALLTLLVLAPSLI 60
MR + + + L A + LP + S P + + + +T L P+++
Sbjct: 1 MRRLLSVAPVL-LWLITPLAFAQLPGIT-SQPLPGGGQSWSLPVQTLVFITSLTFIPAIL 58

Query: 61 LVMTSFTRLIVVFSFLRTALGTQQTPPTQILVSLSLILTFFIMEPSLKKAYDTGIKPYMD 120
L+MTSFTR+I+VF LR ALGT PP Q+L+ L+L LTFFIM P + K Y +P+ +
Sbjct: 59 LMMTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSE 118

Query: 121 KKISYTEAFEKSALPFKEFMLKNTREKDLALFFRIRNLPNPKTPDEVSLSVLIPAFMISE 180
+KIS EA EK A P +EFML+ TRE DL LF R+ N + P+ V + +L+PA++ SE
Sbjct: 119 EKISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSE 178

Query: 181 LKTAFQIGFLLYLPFLVIDMVISSILMAMGMMMLPPVMISLPFKILVFILVDGFNLLTEN 240
LKTAFQIGF +++PFL+ID+VI+S+LMA+GMMM+PP I+LPFK+++F+LVDG+ LL +
Sbjct: 179 LKTAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGS 238

Query: 241 LVASF 245
L SF
Sbjct: 239 LAQSF 243


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
HPKB_0666  PHAGEIV  27     0.026    Gene IV protein signature.
		>PHAGEIV#Gene IV protein signature.

Length = 426

Score = 27.2 bits (60), Expect = 0.026
Identities = 11/36 (30%), Positives = 15/36 (41%)

Query: 49 GKLIGGGVGGFVGDKIGGAIGVPGGPVGIGLGRFVG 84
G G GG D++ + GG GI G +G
Sbjct: 220 GSQRGTVAGGVNTDRLTSVLSSAGGSFGIFNGDVLG 255


6     HPKB_0792  HPKB_0819  Y     Y          N                   Pathogenicity Island (biased-composition)

LocusTag   DNBias  CDNBias  %GCBias    Product
HPKB_0792  3       17       -0.158320  50S ribosomal protein L31
HPKB_0793  4       18       -0.846320  transcription termination factor Rho
HPKB_0794  6       20       -2.264728  glutamate racemase
HPKB_0795  8       22       -3.202899  cytotoxin-associated protein A
HPKB_0796  8       22       -4.256837  cag pathogenicity island protein B
HPKB_0797  7       21       -3.325467  cag pathogenicity island protein C
HPKB_0798  6       20       -3.301139  cag pathogenicity island protein D
HPKB_0799  6       20       -3.030890  cag pathogenicity island protein E
HPKB_0800  7       21       -3.357741  cag pathogenicity island protein F
HPKB_0801  6       19       -3.052413  cag pathogenicity island protein G
HPKB_0802  7       19       -3.319082  cag pathogenicity island protein (cag20)
HPKB_0803  7       20       -4.453167  cag pathogenicity island protein (cag19)
HPKB_0804  9       23       -5.753973  cag pathogenicity island protein L
HPKB_0805  12      23       -5.389853  cag pathogenicity island protein N
HPKB_0806  11      28       -5.584170  cag pathogenicity island protein M
HPKB_0807  12      30       -5.627519  cag island protein
HPKB_0808  10      31       -4.734156  cag pathogenicity island protein Q
HPKB_0809  10      28       -4.799227  cag pathogenicity island protein S
HPKB_0810  10      21       -3.312131  cag pathogenicity island protein T
HPKB_0811  9       21       -3.476442  cag pathogenicity island protein U
HPKB_0812  10      20       -3.222956  cag pathogenicity island protein V
HPKB_0813  9       19       -2.789845  cag pathogenicity island protein W
HPKB_0814  9       17       -2.635638  penicillin-binding protein
HPKB_0815  10      17       -2.083678  conjugation TrbI family protein
HPKB_0816  8       18       -2.300424  cag pathogenicity island protein Z
HPKB_0817  6       17       -2.162670  P-type DNA transfer ATPase VirB11
HPKB_0818  4       14       -1.916327  cag pathogenicity island protein 5
HPKB_0819  2       13       -2.013007  cag pathogenicity island protein 4
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
HPKB_0792  PF01206  27     0.004    SirA family protein
		>PF01206#SirA family protein

Length = 76

Score = 26.6 bits (59), Expect = 0.004
Identities = 7/22 (31%), Positives = 12/22 (54%)

Query: 19 SGKEIEVLSTKPEMRIDISSFC 40
+G+ + V++T P D SF
Sbjct: 31 AGEVLYVMATDPGSVKDFESFS 52


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
HPKB_0795  TYPE4SSCAGA  1711   0.0      Type IV secretion system CagA exotoxin signature.
		>TYPE4SSCAGA#Type IV secretion system CagA exotoxin signature.

Length = 1147

Score = 1711 bits (4432), Expect = 0.0
Identities = 924/1223 (75%), Positives = 1021/1223 (83%), Gaps = 91/1223 (7%)

Query: 1 MTNEIIDQTTTPNQTDFVPQRFINNLQVAFLKVDNAVASFDPDQKPIVDKNDRDNRQAFE 60
MTNE IDQ + F PQ+FINNLQVAFLKVDNAVAS+DPDQKPIVDKNDRDNRQAFE
Sbjct: 1 MTNETIDQQPQ-TEAAFNPQQFINNLQVAFLKVDNAVASYDPDQKPIVDKNDRDNRQAFE 59

Query: 61 KISQLKEEYANKAIKNPTKKNQYFSDFINKSNDLINKDNLIAVDSSVESFRKFGDQRYQI 120
ISQL+EEY+NKAIKNPTKKNQYFSDFINKSNDLINKDNLI V+SS +SF+KFGDQRY+I
Sbjct: 60 GISQLREEYSNKAIKNPTKKNQYFSDFINKSNDLINKDNLIDVESSTKSFQKFGDQRYRI 119

Query: 121 FMNWVSHQKDPSKINTQQIRNFMENIIQPPISDDKEKAEFLRSAKQSFAGIIIGNQIRSD 180
F +WVSHQ DPSKINT+ IRNFMENIIQPPI DDKEKAEFL+SAKQSFAGIIIGNQIR+D
Sbjct: 120 FTSWVSHQNDPSKINTRSIRNFMENIIQPPILDDKEKAEFLKSAKQSFAGIIIGNQIRTD 179

Query: 181 QKFMGVFDESLKERQEAEKNGEPAGGDWLDIFLSFVFNKKQSSDLKETLNQEPRPDFEQN 240
QKFMGVFDESLKERQEAEKNGEP GGDWLDIFLSF+F+KKQSSD+KE +NQEP P + +
Sbjct: 180 QKFMGVFDESLKERQEAEKNGEPTGGDWLDIFLSFIFDKKQSSDVKEAINQEPVPHVQPD 239

Query: 241 LATTTTNIQGLPPESRDLLDERGNFFKFTLGDVEMLDVEGVADKDPNYKFNQLLIHNNAL 300
+ATTTT+IQGLPPE+RDLLDERGNF KFTLGD+EMLDVEGVAD DPNYKFNQLLIHNNAL
Sbjct: 240 IATTTTDIQGLPPEARDLLDERGNFSKFTLGDMEMLDVEGVADIDPNYKFNQLLIHNNAL 299

Query: 301 SSVLMGGHSNIEPEKVSLLYGGNGGPEARHDWNATVGYKNQQGNNVATLINAHLNNGSGL 360
SSVLMG H+ IEPEKVSLLYGGNGGP ARHDWNATVGYK+QQGNNVAT+IN H+ NGSGL
Sbjct: 300 SSVLMGSHNGIEPEKVSLLYGGNGGPGARHDWNATVGYKDQQGNNVATIINVHMKNGSGL 359

Query: 361 IIAGNENGIKNPSFYLYKEDQLTGLKQALSQEEIQNKVDFMEFLVQNNAKLDNLSEKEKE 420
+IAG E GI NPSFYLYKEDQLTG ++ALSQEEIQNK+DFMEFL QNNAKLDNLSEKEKE
Sbjct: 360 VIAGGEKGINNPSFYLYKEDQLTGSQRALSQEEIQNKIDFMEFLAQNNAKLDNLSEKEKE 419

Query: 421 KFQTEIENFQKDRKAYLDALGNDHIAFVSKKDPKHLALVTEFGNGEVSYTLKDYGKKQDK 480
KF+TEI++FQKD KAYLDALGND IAFVSKKD KH AL+TEFGNG++SYTLKDYGKK DK
Sbjct: 420 KFRTEIKDFQKDSKAYLDALGNDRIAFVSKKDTKHSALITEFGNGDLSYTLKDYGKKADK 479

Query: 481 ALDGETKTTLQGNLKHDGVMFVNYSNFKYTNASKSPDKGVGATNGVSRLEANLSKVAVFN 540
ALD E TLQG+LKHDGVMFV+YSNFKYTNASK+P+KGVG TNGVS LE +KVA+FN
Sbjct: 480 ALDREKNVTLQGSLKHDGVMFVDYSNFKYTNASKNPNKGVGVTNGVSHLEVGFNKVAIFN 539

Query: 541 LPNLNNLAITNYIRRDLEDKLWAKGLSPQEANKLIKDFLNSNKELLGKVSNFNKAVAEAK 600
LP+LNNLAIT+++RR+LEDKL KGLSPQEANKLIKDFL+SNKEL+GK NFNKAVA+AK
Sbjct: 540 LPDLNNLAITSFVRRNLEDKLTTKGLSPQEANKLIKDFLSSNKELVGKTLNFNKAVADAK 599

Query: 601 NTGNYDEVKKAQKDLEKSLRKREHLEKEVAKKLESRNDNKNRMEAKAQANSQKDKIFALI 660
NTGNYDEVKKAQKDLEKSLRKREHLEKEV KKLES++ NKN+MEAKAQANSQKD+IFALI
Sbjct: 600 NTGNYDEVKKAQKDLEKSLRKREHLEKEVEKKLESKSGNKNKMEAKAQANSQKDEIFALI 659

Query: 661 NQEASKEARAVAFDPNLKGVRSELSDKLENINKNLKDFGKSFDELKSGKNNDFNKAEETL 720
N+EA+++ARA+A+ NLKG++ ELSDKLEN+NKNLKDF KSFDE K+GKN DF+KAEETL
Sbjct: 660 NKEANRDARAIAYAQNLKGIKRELSDKLENVNKNLKDFDKSFDEFKNGKNKDFSKAEETL 719

Query: 721 KALKDSVKDLGINPEWISKIENLNAALNDFKNGKNKDFSKVTQAKSDFENSIKDVIINQK 780
KALK SVKDLGINPEWISK+ENLNAALN+FKNGKNKDFSKVTQAKSD ENS+KDVIINQK
Sbjct: 720 KALKGSVKDLGINPEWISKVENLNAALNEFKNGKNKDFSKVTQAKSDLENSVKDVIINQK 779

Query: 781 ITDKVDNLNQAVSETKLTGDFSKVEQALAELKNLS-------------LDLGKNSDLQKS 827
+TDKVDNLNQAVS K TGDFS+VEQALA+LKN S L+ K S++ +S
Sbjct: 780 VTDKVDNLNQAVSVAKATGDFSRVEQALADLKNFSKEQLAQQAQKNESLNARKKSEIYQS 839

Query: 828 VKNGVNGTLVGNGLSKTEATTLAKNFSDIRKELNEKLFGNSNNNNNGLKNSTEPIYAKVA 887
VKNGVNGTLVGNGLS+ EATTL+KNFSDI+KELN KL +NNNNNGLKN EPIYAKV
Sbjct: 840 VKNGVNGTLVGNGLSQAEATTLSKNFSDIKKELNAKLGNFNNNNNNGLKN--EPIYAKVN 897

Query: 888 KKVSVKIDQLNEATSAINRKIDRINKIASAGKGVGGFSGAGRSASP-EPIYAQVAKKVSA 946
KK AG++AS EPIYAQVAKKV+A
Sbjct: 898 KK------------------------------------KAGQAASLEEPIYAQVAKKVNA 921

Query: 947 KIDQLNEATSAINRKIDRINKIASAGKGVGGFSGAGRSASPEPIYATIDFDEANQAGFPL 1006
KID+LN+ S G GV G AGFPL
Sbjct: 922 KIDRLNQIAS---------------GLGVVG----------------------QAAGFPL 944

Query: 1007 MRSAAVNDLSKVGLSREQELTRRIGDLNQAVSEAKTGHFGNLEQKIDELKDSTKKNALKL 1066
R V+DLSKVGLSR QEL ++I +LNQAVSEAK G FGNLEQ ID+LKDSTK N + L
Sbjct: 945 KRHDKVDDLSKVGLSRNQELAQKIDNLNQAVSEAKAGFFGNLEQTIDKLKDSTKHNPMNL 1004

Query: 1067 WVESAKQVPTGLQAKLDNYATNSHTRINSNVQSGAINEKATGMLMQKNPEWLKLVNDKIV 1126
WVESAK+VP L AKLDNYATNSH RINSN+++GAINEKATGML QKNPEWLKLVNDKIV
Sbjct: 1005 WVESAKKVPASLSAKLDNYATNSHIRINSNIKNGAINEKATGMLTQKNPEWLKLVNDKIV 1064

Query: 1127 AHNVGSTPLSEYDKIGFNQKNMKDYSDSFKFSTKLNNAVKDVKSDFVQFLTNAFSTGS-Y 1185
AHNVGS PLSEYDKIGFNQKNMKDYSDSFKFSTKLNNAVKD S F QFLTNAFST S Y
Sbjct: 1065 AHNVGSVPLSEYDKIGFNQKNMKDYSDSFKFSTKLNNAVKDTNSGFTQFLTNAFSTASYY 1124

Query: 1186 SLMKANAEHGVKNTNTKGGFQKS 1208
L + NAEHG+KN NTKGGFQKS
Sbjct: 1125 CLARENAEHGIKNVNTKGGFQKS 1147


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits  Score  E-value  Comments
HPKB_0799  ACRIFLAVINRP  33  0.008  Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 32.9 bits (75), Expect = 0.008
Identities = 20/88 (22%), Positives = 32/88 (36%), Gaps = 18/88 (20%)

Query: 19 EVQKRQFQKIEELKADMQKGVNPFFKVLFDGGNRLFGFPETFIYSSI-------FILFVT 71
+ K K+ EL+ +G+ +D F+ SI F +
Sbjct: 301 DTAKAIKAKLAELQPFFPQGMK--VLYPYD--------TTPFVQLSIHEVVKTLFEAIML 350

Query: 72 IVLSVILF-QAYEPVLIVAIVIVLVALG 98
+ L + LF Q LI I + +V LG
Sbjct: 351 VFLVMYLFLQNMRATLIPTIAVPVVLLG 378


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits  Score  E-value  Comments
HPKB_0812  PF04335  118  1e-34  VirB8 type IV secretion protein
		>PF04335#VirB8 type IV secretion protein

Length = 227

Score = 118 bits (296), Expect = 1e-34
Identities = 44/198 (22%), Positives = 73/198 (36%), Gaps = 10/198 (5%)

Query: 27 KLNKANRTFKRAFYL---SMALNIAAVTSIVMMMPLKKTDIFVYGIDRYTGEFKIVKRSD 83
KL A R+ K A+ + + AL A V ++ + PLK + +V +DR TGE I +
Sbjct: 24 KLAAAERSKKLAWVVAGVAGALATAGVVAVAALTPLKTVEPYVITVDRNTGEASIAAKLH 83

Query: 84 A-RQIVNSEAVVDSATSKFVSLLFGYSKNSLRDRKDQLMQYCDVSFQTQAMRMFNENIRQ 142
I EAV + +V G+ + + D +M Q + R + + Q
Sbjct: 84 GDATITYDEAVRKYFLATYVRYREGWIAAAREEYFDAVMVMSARPEQDRWSRFYKTDNPQ 143

Query: 143 FVDKVRA-EAIISSNIQREKVKNSPLTRLTFFITIKITPDTMENYEYITKKQVTIYYDFA 201
+ A + I + +F +T T TI Y
Sbjct: 144 SPQNILANRTDVFVEI-KRVSFLGGNVAQVYFTKESVTGSNS----TKTDAVATIKYKVD 198

Query: 202 RGNSSQENLIINPFGFKV 219
S + + NP G++V
Sbjct: 199 GTPSKEVDRFKNPLGYQV 216


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits  Score  E-value  Comments
HPKB_0814  TYPE4SSCAGX  868  0.0  Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein signature.

Length = 522

Score = 868 bits (2243), Expect = 0.0
Identities = 513/522 (98%), Positives = 515/522 (98%)

Query: 1 MEQAFFKKIVGCFCLGYLFLSSVIEAAAPDIKNFNRGRVKVVNKKIAYLGDEKPITIWTS 60
M QAFFKKIVGCFCLGYLFLSS IEA A DIKNFNRGRVKVVNKKIAYLGDEKPITIWTS
Sbjct: 1 MGQAFFKKIVGCFCLGYLFLSSAIEAVALDIKNFNRGRVKVVNKKIAYLGDEKPITIWTS 60

Query: 61 LDNVTVIQLEKDETISYITTGFNKGWSIVPNSNHIFIQPKSVKSNLMFEKEAVNFALMTR 120
LDNVTVIQLEKDETISYITTGFNKGWSIVPNSNHIFIQPKSVKSNLMFEKEAVNFALMTR
Sbjct: 61 LDNVTVIQLEKDETISYITTGFNKGWSIVPNSNHIFIQPKSVKSNLMFEKEAVNFALMTR 120

Query: 121 DYQEFLKTKKLIVDAPDPKELEEQKKALEKEKEAKEQAQKAQKDKREKRKEERAKNRANL 180
DYQEFLKTKKLIVDAPDPKELEEQKKALEKEKEAKEQAQKAQKDKREKRKEERAKNRANL
Sbjct: 121 DYQEFLKTKKLIVDAPDPKELEEQKKALEKEKEAKEQAQKAQKDKREKRKEERAKNRANL 180

Query: 181 ENLTNAMSNPQNLSNNKNLSEFIKQQRENELDQMERLEDMQEQAQANALKQIEELNKKQA 240
ENLTNAMSNPQNLSNNKNLSE IKQQRENELDQMERLEDMQEQAQANALKQIEELNKKQA
Sbjct: 181 ENLTNAMSNPQNLSNNKNLSELIKQQRENELDQMERLEDMQEQAQANALKQIEELNKKQA 240

Query: 241 EEAVKQRAKDKINIKTDKPQKSPEDNSIELSPSDSAWRTNLVVRTNKALYQFILRIAQKD 300
EEAV+QRAKDKI+IKTDK QKSPEDNSIELSPSDSAWRTNLVVRTNKALYQFILRIAQKD
Sbjct: 241 EEAVRQRAKDKISIKTDKSQKSPEDNSIELSPSDSAWRTNLVVRTNKALYQFILRIAQKD 300

Query: 301 NFASAYLTVKLEYPQRHEVSSVIEEELKKREEAKRQRELIKQENLNTTAYINRVMMASNE 360
NFASAYLTVKLEYPQRHEVSSVIEEELKKREEAKRQRELIKQENLNTTAYINRVMMASNE
Sbjct: 301 NFASAYLTVKLEYPQRHEVSSVIEEELKKREEAKRQRELIKQENLNTTAYINRVMMASNE 360

Query: 361 QIINKEKIREEKQKIILDQAKALETQYVHNALKRNPVPRNYNYYLAPEKRSKHIMPSEIF 420
QIINKEKIREEKQKIILDQAKALETQYVHNALKRNPVPRNYNYY APEKRSKHIMPSEIF
Sbjct: 361 QIINKEKIREEKQKIILDQAKALETQYVHNALKRNPVPRNYNYYQAPEKRSKHIMPSEIF 420

Query: 421 DDGTFTYFGFKNITLQPAIFVVQPDGKLSMTDAAIDPNMTNSGLRWYRVNEIAEKFKLIK 480
DDGTFTYFGFKNITLQPAIFVVQPDGKLSMTDAAIDPNMTNSGLRWYRVNEIAEKFKLIK
Sbjct: 421 DDGTFTYFGFKNITLQPAIFVVQPDGKLSMTDAAIDPNMTNSGLRWYRVNEIAEKFKLIK 480

Query: 481 DKALVTVINKGYGKNPLTKNYNIKNYGELERVIKKLPLVRDK 522
DKALVTVINKGYGKNPLTKNYNIKNYGELERVIKKLPLVRDK
Sbjct: 481 DKALVTVINKGYGKNPLTKNYNIKNYGELERVIKKLPLVRDK 522


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits  Score  E-value  Comments
HPKB_0815  IGASERPTASE  40  9e-05  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 40.4 bits (94), Expect = 9e-05
Identities = 39/253 (15%), Positives = 88/253 (34%), Gaps = 19/253 (7%)

Query: 897 TPEAKKLLEEEAKESVKAYLDCVSQARTEAEKKECEKLLTPEAKKKLEEAKKSVKAYLDC 956
P ++ + + +++T + ++ T + ++ +EAK +VKA
Sbjct: 1023 APVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQT 1082

Query: 957 VSQARTEAEKKECEKLLTPEARKLLEQEVKKSVKAYLDCVSRARNEKERKACEKLLTPEA 1016
A++ +E KE + T E + ++E +A+ E E+ +T +
Sbjct: 1083 NEVAQSGSETKETQTTETKETATVEKEE-------------KAKVETEKTQEVPKVTSQV 1129

Query: 1017 RKKLEEAKKSVKAYLDCVSQARTEAEKKECEKLLTPEARKLLEQEVKKSVKAYLDCVSRA 1076
K E+++ T K+ + T + +E +V+ + +
Sbjct: 1130 SPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTV 1189

Query: 1077 RNEKEKQECEKLLTPEAKKFLEQQALDCLKNAKTETEKKRCVKDLPKDLQKKVLAKE--S 1134
E + TP + + K + +R V+ +P +++ + S
Sbjct: 1190 NTGNSVVENPENTTPATTQPTVNSES----SNKPKNRHRRSVRSVPHNVEPATTSSNDRS 1245

Query: 1135 VKAYLDCVSRARN 1147
A D S N
Sbjct: 1246 TVALCDLTSTNTN 1258



Score = 38.9 bits (90), Expect = 3e-04
Identities = 41/246 (16%), Positives = 92/246 (37%), Gaps = 6/246 (2%)

Query: 806 KNAKTDEERKKCLKDLP-KDLQKKVLAKESVKAYLDCVSQAKTEAEKKECEKLLTPEAKK 864
+ ++ E + + P ++ + + ++KT + ++ T + ++
Sbjct: 1008 PSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNRE 1067

Query: 865 LLEEAKESLKAYKDCVSRARNEKEKKECEKLLTPE-AKKLLEEEAKESVKAYLDCVSQAR 923
+ +EAK ++KA A++ E KE + T E A EE+AK + +
Sbjct: 1068 VAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTS 1127

Query: 924 TEAEKKECEKLLTPEAKKKLEEAKKSVKAYLDCVSQARTEAEKKECEKLLTPEARKLLEQ 983
+ K+E + + P+A+ E + SQ T A+ ++ K + + + +
Sbjct: 1128 QVSPKQEQSETVQPQAEPAREND--PTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTE 1185

Query: 984 EVKKSVKAYLDCVSRARNEKERKACEKLLTPEARKKLEEAKKSVKAYLDCVSQARTEAEK 1043
+V V N + + + K ++SV++ V A T +
Sbjct: 1186 S--TTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSND 1243

Query: 1044 KECEKL 1049
+ L
Sbjct: 1244 RSTVAL 1249



Score = 32.7 bits (74), Expect = 0.020
Identities = 35/239 (14%), Positives = 84/239 (35%), Gaps = 10/239 (4%)

Query: 629 RARNEKEKKECEKLLTPEAKKKLEQQVLDCLKNAKTDEERKKCLKDLPKD--LQSDILAK 686
+ NE+ + E + P A + +N+K + + + + + Q+ +AK
Sbjct: 1011 PSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAK 1070

Query: 687 ESVKAYKDCVSQAKTEAEKKECEKLLTPEAKKLLEEEAKESVKAYLDCVSQARTEAEKKE 746
E+ K + E ++ T E K+ E +E K + + +
Sbjct: 1071 EAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVS 1130

Query: 747 CEKLLTPEAKKKLEEAKKSVKAYL--DCVSQARTEAEKKECEKLLTPEAKKLLEQQVLDC 804
++ + + + E A+++ + SQ T A+ ++ K + ++ + +
Sbjct: 1131 PKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVN 1190

Query: 805 LKNA------KTDEERKKCLKDLPKDLQKKVLAKESVKAYLDCVSQAKTEAEKKECEKL 857
N+ T + + + K + SV++ V A T + + L
Sbjct: 1191 TGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVAL 1249


7  HPKB_0841  HPKB_0864  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias    %GCBias  Product
HPKB_0841       2       13  -1.749524  thioesterase family protein
HPKB_0842       1       11  -1.180273  hypothetical protein
HPKB_0843       0       10  -0.968335  UDP-N-acetylmuramoyl-L-alanyl-D-glutamate
HPKB_0844       2       14  -1.524161  hypothetical protein
HPKB_0845       5       17  -2.166885  hypothetical protein
HPKB_0846       5       17  -1.285754  50S ribosomal protein L28
HPKB_0847       4       14  -1.758299  TrkA domain-containing protein
HPKB_0848       4       12  -2.243278  hypothetical protein
HPKB_0849       3       11  -1.975681  hypothetical protein
HPKB_0850       2       10  -1.266249  hypothetical protein
HPKB_0851       2       11  -0.189912  Holliday junction DNA helicase motor protein
HPKB_0852       2       10  -0.006438  hypothetical protein
HPKB_0853       1       13   0.952717  virulence factor MviN
HPKB_0854       0       15   1.526692  cysteinyl-tRNA synthetase
HPKB_0857       0       17   1.698921  IRON(III) dicitrate transport system ATP-binding
HPKB_0858      -1       15   1.201814  transport system permease protein
HPKB_0859       1       16   2.482694  short-chain oxidoreductase
HPKB_0860       2       19   2.078714  hypothetical protein
HPKB_0861       0       20   2.434782  hypothetical protein
HPKB_0862       0       20   2.836713  hypothetical protein
HPKB_0863       0       18   3.129249  hypothetical protein
HPKB_0864       0       18   3.448554  outer membrane protein - adhesin
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits  Score  E-value  Comments
HPKB_0845  PF05211  265  2e-91  Neuraminyllactose-binding hemagglutinin
		>PF05211#Neuraminyllactose-binding hemagglutinin

Length = 260

Score = 265 bits (679), Expect = 2e-91
Identities = 62/268 (23%), Positives = 117/268 (43%), Gaps = 39/268 (14%)

Query: 18 KTLIALGLSSVLVGCAIKPVAEVKPQNQQEKPIQVNEKIQTTQKVTPFNFNYSLHVAQAP 77
K L+ + ++LVGC+ I+T + N+ +
Sbjct: 14 KCLLGASVVALLVGCS-------------------PHIIETNEVALKLNY-HPASEKVQA 53

Query: 78 QNYRLIGILAPRIQVSDNL-KPYIDKFQNALINQIQTIFEKRGYQTLFF--KDESALTLQ 134
+ + I +L P Q SDN+ K Y +KF+N +++ I + +GY+ + D+ +
Sbjct: 54 LDEK-ILLLRPAFQYSDNIAKEYENKFKNQTTLKVEQILQNQGYKVINVDSSDKDDFSFA 112

Query: 135 DKRKLFAVLDVKGWVGVLEDLKMNLKDPNNPNL--GTLVDQ------SSGSVWFSFYEPE 186
K++ + + + G + + D K ++ + P L T +D+ +G V + EP
Sbjct: 113 QKKEGYLAVAMNGEIVLRPDPKRTIQKKSEPGLLFSTGLDKMEGVLIPAGFVKVTILEPM 172

Query: 187 SNRVVHDFAVEVGTF---QAMTYTYKQSNSGGFNSSNSIIHEDLEKNKEDAIHQILNKIY 243
S + F +++ + T S+SGG S+ N DAI LNKI+
Sbjct: 173 SGESLDSFTMDLSELDIQEKFLKTTHSSHSGGLVSTMV----KGTDNSNDAIKSALNKIF 228

Query: 244 ALIMKKAVTELTEKNISQHKETIDRMKG 271
A IM++ +LT+KN+ +++ +KG
Sbjct: 229 ANIMQEIDKKLTQKNLESYQKDAKELKG 256


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits  Score  E-value  Comments
HPKB_0854  OMS28PORIN  30  0.020  OMS28 porin signature.
		>OMS28PORIN#OMS28 porin signature.

Length = 257

Score = 29.8 bits (66), Expect = 0.020
Identities = 13/37 (35%), Positives = 25/37 (67%)

Query: 309 EEDLLVSKKRLDKIYRLKQRVLGTLGGINPNFKKEIL 345
+E L+ S++ LD+ + Q+VL + G+NP+ K ++L
Sbjct: 188 KETLMASERALDETVQEAQKVLNMVNGLNPSNKDQVL 224


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits  Score  E-value  Comments
HPKB_0859  DHBDHDRGNASE  92  2e-24  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 92.4 bits (229), Expect = 2e-24
Identities = 59/245 (24%), Positives = 109/245 (44%), Gaps = 10/245 (4%)

Query: 1 MGEKKESQKVAIITGASSGIGLECALMLLDQGYKVYALSRHATLCVALNHALC------E 54
M K K+A ITGA+ GIG A L QG + A+ + + +L E
Sbjct: 1 MNAKGIEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAE 60

Query: 55 SIDVDVSDSNALKEVFLNISAKEDHCDVLINSAGYGVFGSVEDTPIDEVKKQFSVNFFAL 114
+ DV DS A+ E+ I + D+L+N AG G + +E + FSVN +
Sbjct: 61 AFPADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGV 120

Query: 115 CEVVQFCLPLLKNKPHSKIFNLSSIAGRVSMLFLGHYSASKHALEAYSDALRLELKPFNI 174
+ + ++ I + S V + Y++SK A ++ L LEL +NI
Sbjct: 121 FNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNI 180

Query: 175 QVCLIEPGPVKSNWEKTAFENDERKDSLYALEVNAAKSFYSGV-YQKALSPKAVAQKIVF 233
+ ++ PG +++ + + + ++ + + + ++F +G+ +K P +A ++F
Sbjct: 181 RCNIVSPGSTETDMQWSLWADENGAEQVIK---GSLETFKTGIPLKKLAKPSDIADAVLF 237

Query: 234 LAMSQ 238
L Q
Sbjct: 238 LVSGQ 242


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits  Score  E-value  Comments
HPKB_0862  BINARYTOXINA  26  0.042  Clostridial binary toxin A signature.
		>BINARYTOXINA#Clostridial binary toxin A signature.

Length = 454

Score = 25.8 bits (56), Expect = 0.042
Identities = 23/84 (27%), Positives = 36/84 (42%), Gaps = 3/84 (3%)

Query: 10 YTKYSEKQLFNFLNSIKTKQKRALEKLKEIQAQKQ-RIKKALQFKALHLTENGYTIEEER 68
Y + EK FN + + + +LEK E++ Q ++ K FK + L E G E+
Sbjct: 134 YFESPEKFAFNKEIRTENQNEISLEKFNELKETIQDKLFKQDGFKDVSLYEPGNGDEKPT 193

Query: 69 EILARAK--DTKNHLCFKNIEDFK 90
+L K L + N D K
Sbjct: 194 PLLIHLKLPKNTGMLPYINSNDVK 217


8  HPKB_1052  HPKB_1073  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias    %GCBias  Product
HPKB_1052       2       15  -0.363508  FlgM protein
HPKB_1053       2       13  -1.293376  hypothetical protein
HPKB_1054       3       14  -1.131474  peptidyl-prolyl cis-trans isomerase
HPKB_1055       3       16  -1.935986  hypothetical protein
HPKB_1056       4       16  -1.659080  hypothetical protein
HPKB_1057       2       14   0.157102  translocation protein TolB
HPKB_1058       2       19  -0.106521  hypothetical protein
HPKB_1059       2       16   2.109937  biopolymer transport protein
HPKB_1060       1       16   1.737694  transport protein
HPKB_1061       0       15   1.325247  ATP synthase F1, epsilon subunit
HPKB_1062       0       14   1.160500  F0F1 ATP synthase subunit beta
HPKB_1063       1       14  -0.474707  F0F1 ATP synthase subunit gamma
HPKB_1064       0       14   0.222693  F0F1 ATP synthase subunit alpha
HPKB_1065       1       14  -1.819351  F0F1 ATP synthase subunit delta
HPKB_1066       0       15  -0.675177  ATP synthase F0 subunit B
HPKB_1067       1       17  -3.318467  F0F1 ATP synthase subunit B'
HPKB_1068       2       17  -3.020174  chromosome partitioning protein
HPKB_1069       2       18  -2.777308  ParA family protein
HPKB_1070       2       17  -2.895187  biotin--protein ligase
HPKB_1071       2       17  -3.138235  methionyl-tRNA formyltransferase
HPKB_1072       3       17  -3.674570  hypothetical protein
HPKB_1073       2       14  -0.007099  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits  Score  E-value  Comments
HPKB_1056  OMPADOMAIN  147  7e-46  OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 147 bits (373), Expect = 7e-46
Identities = 48/169 (28%), Positives = 75/169 (44%), Gaps = 24/169 (14%)

Query: 22 KMDNKTVAGDVSAKTVQTAPV-TTEPAPEKEEPKQEPAPVVEEKPAIESGTIIASIYFDF 80
+ DN ++ VS + Q PAP PAP V+ K T+ + + F+F
Sbjct: 177 RPDNGMLSLGVSYRFGQGEAAPVVAPAPA-------PAPEVQTK----HFTLKSDVLFNF 225

Query: 81 DKYEIKESDQETLDEIVQKAKE---NHMQVLLEGNTDEFGSSEYNQALGVKRTLSVKNAL 137
+K +K Q LD++ + V++ G TD GS YNQ L +R SV + L
Sbjct: 226 NKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAYNQGLSERRAQSVVDYL 285

Query: 138 VIKGVEKDMIKTISFGETKPKC-----AQKTR----ECYKENRRVDVKL 177
+ KG+ D I GE+ P K R +C +RRV++++
Sbjct: 286 ISKGIPADKISARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEV 334


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits  Score  E-value  Comments
HPKB_1064  ECOLIPORIN  30  0.025  E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 29.9 bits (67), Expect = 0.025
Identities = 20/68 (29%), Positives = 35/68 (51%), Gaps = 3/68 (4%)

Query: 196 YVAIGQKESTVAQVVRKLEEYGAMEYSVVINASASDSAAMQYLAPYSGVAMGEYFR-DHA 254
Y+ +G K T Q+ +L YG EY+V N + + A ++G+ G+Y D+
Sbjct: 56 YMRVGFKGET--QINDQLTGYGQWEYNVQANTTEGEGANSWTRLAFAGLKFGDYGSFDYG 113

Query: 255 RHALIIYD 262
R+ ++YD
Sbjct: 114 RNYGVLYD 121


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits  Score  E-value  Comments
HPKB_1067  PF06580  28  0.015  Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 27.9 bits (62), Expect = 0.015
Identities = 28/146 (19%), Positives = 51/146 (34%), Gaps = 22/146 (15%)

Query: 8 YLMAVVFVVFVLLLWAMNVWVYRPLLAFMDNRQAEIKDSLAKIKTDNTQSVEIRHQIE-- 65
L + VV V +W++ +Y F + +QAEI Q + ++ QI
Sbjct: 117 ALSIIFNVVVVTFMWSL---LYFGWHFFKNYKQAEIDQWKMASMAQEAQLMALKAQINPH 173

Query: 66 ----------TLLKEAAEKRREMIAEAIQKAAESYDAVIKQKENE---LNQEFEAFAKQL 112
L+ E K REM+ +E ++ L E L
Sbjct: 174 FMFNALNNIRALILEDPTKAREML----TSLSELMRYSLRYSNARQVSLADELTVVDSYL 229

Query: 113 QNEKQILKEQLQVQMPVFEDELNKRV 138
Q +++LQ + + ++ +V
Sbjct: 230 QLASIQFEDRLQFENQINPAIMDVQV 255


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits  Score  E-value  Comments
HPKB_1069  PF07675  31  0.005  Cleaved Adhesin
		>PF07675#Cleaved Adhesin

Length = 1358

Score = 31.2 bits (70), Expect = 0.005
Identities = 30/105 (28%), Positives = 40/105 (38%), Gaps = 7/105 (6%)

Query: 69 QISQVILKTQMPFLDLVPSNLGLAGFEKTFYDSQDENKRGELMLKNALESVV---GLYDY 125
VI T F SNL A FE + D + ++ VV G+YDY
Sbjct: 414 TFGSVIPATGPLFTGTASSNLYSANFEYLTPANADPVVTTQNIIVTGQGEVVIPGGVYDY 473

Query: 126 IIIDSPPALGPLTINSLSAAHSVIIPIQCEFFALEGTKLLLNTIR 170
I + PA G + I A P + + FA E K T+R
Sbjct: 474 CITNPEPASGKMWI----AGDGGNQPARYDDFAFEAGKKYTFTMR 514


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits  Score  E-value  Comments
HPKB_1071  FERRIBNDNGPP  31  0.003  Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 31.5 bits (71), Expect = 0.003
Identities = 12/33 (36%), Positives = 19/33 (57%)

Query: 70 EPEVQILKALKPDFIVVVAYGKILPKEVLTIAP 102
EP +++L +KP F+V A P+ + IAP
Sbjct: 86 EPNLELLTEMKPSFMVWSAGYGPSPEMLARIAP 118


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits  Score  E-value  Comments
HPKB_1072  PF01540  32  0.012  Adhesin lipoprotein
		>PF01540#Adhesin lipoprotein

Length = 475

Score = 31.6 bits (71), Expect = 0.012
Identities = 68/332 (20%), Positives = 127/332 (38%), Gaps = 49/332 (14%)

Query: 140 KNCKEKVEKRKKKIKDENSAETLSAKQESEIKKYDKEIEKIRKEMTSKTIRITLDEIKIN 199
K+ ++KV++ KKI DEN +IK+ KE+ K+ +++ S I L
Sbjct: 103 KSEQQKVDQANKKIADEN----------LKIKEGAKELLKLSEKIQSFADTIAL------ 146

Query: 200 NICEVSKNKFKVQEDALTNLEKDFDELDEAMKKFDDLKEMELPKDYQTIKDKLESLFSFD 259
I ++ KF++ E L + L++ + + K + + LES F+
Sbjct: 147 TITKLEGKKFQIDETFKKQLISTIELLNKKSAEVKTFATVNTIKKDFLLSE-LESFKEFN 205

Query: 260 IDKEAGQVSE--EIKEHMSKVGREF--------------IEKGIELQKKMPDNACPFCTQ 303
VSE E+K+ SK E I++G + K+ + F
Sbjct: 206 TSWLEKIVSEWEEVKKAWSKELAEIKAEDDKKLAEENQKIKEGAKELLKLSEKIQSF-AD 264

Query: 304 EITNNIIQVYTSY-----FNKRIEQFNQDSLEVSGTLKKILEQWNIKE--ILQSFERFES 356
I I ++ + F K++ + + S +K IK+ +L E F+
Sbjct: 265 TIALTITKLERKFQIDEKFKKQLISTIELLNKKSVEVKTFATVNTIKKDFLLSELESFKE 324

Query: 357 F-------MKKDSSTNKESLKNALEQIKVLLEKLQKEVGKKEGAKNEKEFQETDKKLLEN 409
F + + K++ L +IK +K E +K +E ++ + + E
Sbjct: 325 FNTSWLEKIVSEWEEVKKAWSKELAEIKAEDDKKLAEENQKI-KNGVEELKKINNEAFEL 383

Query: 410 YEKFQKCVDETRNILKQKKEQKEKLEKLKTEL 441
+ K + E K KE+L+ +L
Sbjct: 384 SKTVNKTIAELEKKFKIDVSFKEQLKNFADDL 415


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits  Score  E-value  Comments
HPKB_1073  RTXTOXIND  43  2e-06  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 42.9 bits (101), Expect = 2e-06
Identities = 26/195 (13%), Positives = 70/195 (35%), Gaps = 19/195 (9%)

Query: 27 QIELENQSRF-LAQQKEFEKEVKEKRAQYQSHFKMLEQKEEALKEREREQKAKFDDAVKQ 85
+++L ++ F ++E + + Q+ + QKE L ++ E+ +
Sbjct: 167 ELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRY 226

Query: 86 ASALALQDERAKIIEEARKNAFLEQQKGLELLQKELDEKSKQVQELHQKEAEIERLKREN 145
+ ++ R + + LE + + E EL ++++E+++ E
Sbjct: 227 ENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQ-ENKYVEAV---NELRVYKSQLEQIESEI 282

Query: 146 NEAESRLKAENEKKLNEKLDLERERIEKALHEKNELKFKQQEEQLEMLRNELKNAQRKAE 205
A+ + + + E ++K + +L + + +A
Sbjct: 283 LSAKEEYQLVTQ-------LFKNEILDK--LRQTTDNIGLLTLELAKNEERQQASVIRAP 333

Query: 206 LSSQQFQGEVQELAI 220
+S +VQ+L +
Sbjct: 334 VS-----VKVQQLKV 343


9  HPKB_1166  HPKB_1189  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias    %GCBias  Product
HPKB_1166       2       12  -0.444752  hypothetical protein
HPKB_1167       1       11  -0.798086  DNA polymerase III subunit delta'
HPKB_1168       1       11   0.313438  dihydropteroate synthase
HPKB_1169      -1       11   1.278054  hypothetical protein
HPKB_1170      -1        8   1.327137  hypothetical protein
HPKB_1171      -2        8   1.884928  hypothetical protein
HPKB_1172      -2        9   2.277076  hypothetical protein
HPKB_1173       0       10   3.282955  carbamoyl-phosphate synthetase small chain
HPKB_1174       1       11   3.251409  formamidase
HPKB_1175       0       10   2.958160  Maf-like protein
HPKB_1176       1       12   2.799302  alanyl-tRNA synthetase
HPKB_1177       1       16   1.402926  hypothetical protein
HPKB_1178      -2       11   0.639091  outer membrane protein - adhesin
HPKB_1179       2       13  -1.265810  30S ribosomal protein S18
HPKB_1180       2       13  -1.314214  single-strand DNA-binding protein
HPKB_1181       2       11  -1.326108  30S ribosomal protein S6
HPKB_1182       3       10  -1.179970  DNA polymerase III holoenzyme delta subunit
HPKB_1183       2        8  -0.448680  ribonuclease R
HPKB_1184       1       10  -0.427992  shikimate 5-dehydrogenase
HPKB_1185       1       10  -0.000403  putative cell wall peptidase, NlpC/P60 family
HPKB_1186       0        9   0.124111  oligopeptide ABC transporter permease protein
HPKB_1187       0       11   0.335964  extracellular solute-binding protein
HPKB_1188       2       11   0.480172  tryptophanyl-tRNA synthetase
HPKB_1189       2       12   0.057898  biotin synthesis protein (bioC)
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits  Score  E-value  Comments
HPKB_1177  PF05844  25  0.035  YopD protein
		>PF05844#YopD protein

Length = 295

Score = 25.0 bits (54), Expect = 0.035
Identities = 13/65 (20%), Positives = 28/65 (43%), Gaps = 1/65 (1%)

Query: 10 SVLKANNPHFDKIFEKHNQLDDDIKTAEQQNASDAEVSHMKKQKLKLKDEIHSMIIEYRE 69
L+A F+ + I++ Q + +V + Q ++E+++ I + +
Sbjct: 197 VALRAAGRAFESRNGALQVANTVIQSFVQMANASVQVRQGESQASAREEEVNATIGQ-SQ 255

Query: 70 KQKSE 74
KQK E
Sbjct: 256 KQKVE 260


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits  Score  E-value  Comments
HPKB_1180  INVEPROTEIN  27  0.045  Salmonella/Shigella invasion protein E (InvE) signat...
		>INVEPROTEIN#Salmonella/Shigella invasion protein E (InvE) signature.

Length = 372

Score = 27.0 bits (59), Expect = 0.045
Identities = 14/76 (18%), Positives = 28/76 (36%), Gaps = 7/76 (9%)

Query: 49 CFIDARLFGRTAEIANQYLSKGSSVLIEGRLT----YESWMDQTGKKNSRHTIT--VDSL 102
C + ARLFG+T + L I+ Y W+ G R + ++
Sbjct: 168 CALKARLFGKTLSLKPGLLRASYRQFIQSESHEVEIYSDWIASYG-YQRRLVVLDFIEGS 226

Query: 103 QFMDKKSDNPQANAMQ 118
D +++ + ++
Sbjct: 227 LLTDIDANDASCSRLE 242


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits  Score  E-value  Comments
HPKB_1185  IGASERPTASE  32  0.002  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 31.6 bits (71), Expect = 0.002
Identities = 19/107 (17%), Positives = 38/107 (35%), Gaps = 7/107 (6%)

Query: 38 EKDSTSISQNLEKTEIERPNSALSPKQEEANTTTTIAEENPTKDSPLPLETPTQENEPKQ 97
E +T + + E+ QE T+ ++ + ++ P P +EN+P
Sbjct: 1094 ETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTV 1153

Query: 98 ENKQEQEKETKPKQNSASPVQNHQKTLSTPTMGKKPLEYKVAVNSVN 144
K+ Q + Q T + ++P+ VN+ N
Sbjct: 1154 NIKEPQSQTNTTADTE-------QPAKETSSNVEQPVTESTTVNTGN 1193



Score = 29.6 bits (66), Expect = 0.008
Identities = 14/95 (14%), Positives = 26/95 (27%)

Query: 38 EKDSTSISQNLEKTEIERPNSALSPKQEEANTTTTIAEENPTKDSPLPLETPTQENEPKQ 97
E D T + + ++ K+ +N + E +E P
Sbjct: 1148 ENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATT 1207

Query: 98 ENKQEQEKETKPKQNSASPVQNHQKTLSTPTMGKK 132
+ E KPK V++ + T
Sbjct: 1208 QPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSN 1242


10  HPKB_1333  HPKB_1345  Y  N  N  Genomic Island
LocusTag   DNBias  CDNBias    %GCBias  Product
HPKB_1333       3       20  -0.794339  protease IV, 36K short form
HPKB_1334       5       23  -1.305757  hypothetical protein
HPKB_1335       5       23  -0.535857  hypothetical protein
HPKB_1336       5       22   0.461814  hypothetical protein
HPKB_1338       2       20   0.212295  hypothetical protein
HPKB_1339       2       17  -1.084002  hypothetical protein
HPKB_1341      -1       15  -0.365866  hypothetical protein
HPKB_1342       0       15  -0.851611  peptidyl-prolyl cis-trans isomerase B,
HPKB_1343       1       16  -1.452071  carbon storage regulator
HPKB_1344       1       16  -2.080629  4-diphosphocytidyl-2-C-methyl-D-erythritol
HPKB_1345       3       20  -2.004321  SsrA-binding protein
11  HPKB_1368  HPKB_1386  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias    %GCBias  Product
HPKB_1368       2        8   0.589224  hypothetical protein
HPKB_1369       1       10   0.016000  outer membrane protein
HPKB_1370      -1       11  -1.083672  branched-chain amino acid aminotransferase
HPKB_1371      -1       12  -2.433208  outer membrane protein
HPKB_1372      -1       12  -2.642483  DNA polymerase I (polA)
HPKB_1373      -1       20  -3.185260  N-6 DNA methylase
HPKB_1374       0       18  -3.138924  cytosine-specific methyltransferase
HPKB_1375       1       13  -1.578442  type II site-specific deoxyribonuclease
HPKB_1376       3       18   0.178159  restriction enzyme BcgI alpha chain-like
HPKB_1377       3       14   0.481160  hypothetical protein
HPKB_1378       3       13  -0.001845  thymidylate kinase
HPKB_1379       2       11   0.118490  phosphopantetheine adenylyltransferase
HPKB_1380       2       12   0.199845  3-octaprenyl-4-hydroxybenzoate carboxy-lyase
HPKB_1381       3       12  -0.278097  flagellar basal body P-ring biosynthesis protein
HPKB_1382       2       11  -0.239077  DNA helicase II
HPKB_1383       2       12  -0.123649  hypothetical protein
HPKB_1384       2       11   0.111130  hypothetical protein
HPKB_1385       1       14  -0.305625  hypothetical protein
HPKB_1386       2       13  -0.371387  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits  Score  E-value  Comments
HPKB_1379  LPSBIOSNTHSS  223  4e-78  Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein signature.

Length = 166

Score = 223 bits (569), Expect = 4e-78
Identities = 65/148 (43%), Positives = 94/148 (63%)

Query: 4 IGIYPGTFDPVTNGHIDIIHRSSELFEKLIVAVAHSSAKNPMFSLKERLEMIQLATKSFK 63
IYPG+FDP+T GH+DII R LF+++ VAV + K PMFS++ERLE I A
Sbjct: 2 NAIYPGSFDPITFGHLDIIERGCRLFDQVYVAVLRNPNKQPMFSVQERLEQIAKAIAHLP 61

Query: 64 NVECVAFEGLLANLAKEYHCKVLVRGLRVVSDFEYELQMGYANKSLNHELETLYFMPTLQ 123
N + +FEGL N A++ ++RGLRV+SDFE ELQM NK+L +LET++ + +
Sbjct: 62 NAQVDSFEGLTVNYARQRQAGAILRGLRVLSDFELELQMANTNKTLASDLETVFLTTSTE 121

Query: 124 NAFISSSIVRSIILHKGDASHLVPKEIH 151
+F+SSS+V+ + G+ H VP +
Sbjct: 122 YSFLSSSLVKEVARFGGNVEHFVPSHVA 149


12  HPKB_1396  HPKB_1448  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias    %GCBias  Product
HPKB_1396       2       13  -2.453258  NifU-like protein
HPKB_1397       2       11  -2.339716  hypothetical protein
HPKB_1398       1       12  -2.881543  UDP-N-acetylmuramoylalanyl-D-glutamate--2,
HPKB_1399       0       12  -1.768845  transaldolase
HPKB_1400       2       13  -2.025479  50S ribosomal protein L25/general stress protein
HPKB_1401       1        9  -1.496188  peptidyl-tRNA hydrolase
HPKB_1402       1        9  -1.247232  hypothetical protein
HPKB_1403       1        9  -0.845476  hypothetical protein
HPKB_1404       2       10   0.692101  outer membrane protein HopK
HPKB_1405       1       10   0.515770  hypothetical protein
HPKB_1406       1       10   0.876641  heavy metal translocating P-type ATPase
HPKB_1407       1       10   1.395152  hypothetical protein
HPKB_1408       1       10   1.202624  riboflavin biosynthesis protein RibD
HPKB_1409       2       11   1.138776  sodium/glutamate symport carrier
HPKB_1410       3       13   2.399244  hypothetical protein
HPKB_1411       1       14   1.653619  ferrodoxin-like protein
HPKB_1412       0       12   1.320824  hypothetical protein
HPKB_1413      -2       13  -0.202392  dihydroneopterin aldolase
HPKB_1414      -2       11  -2.382519  hypothetical protein
HPKB_1415      -2        9  -2.616258  iron-regulated outer membrane protein
HPKB_1416       0       10  -4.323945  selenocysteine synthase
HPKB_1417       0        9  -4.572150  transcription elongation factor NusA
HPKB_1418       0       10  -4.583049  type IIS restriction-modification protein
HPKB_1419       1       13  -5.490080  putative cytoplasmic protein
HPKB_1420       1       10  -3.072067  hypothetical protein
HPKB_1422       2        9  -2.762781  hypothetical protein
HPKB_1423       1       10  -2.407242  adenine specific DNA methylase
HPKB_1424       1       12  -1.438489  ATP-dependent DNA helicase RecG
HPKB_1425       0       13  -1.225022  hypothetical protein
HPKB_1426      -1       14  -1.156282  hypothetical protein
HPKB_1427       0       11  -0.469738  exodeoxyribonuclease
HPKB_1429       1       11  -0.115526  *periplasmic competence protein
HPKB_1430       3       12  -1.390411  chromosomal replication initiator protein DnaA
HPKB_1431       2       15  -2.456191  purine nucleoside phosphorylase
HPKB_1432       2       13  -2.379030  hypothetical protein
HPKB_1433       0       11  -2.570937  D-fructose-6-phosphate amidotransferase
HPKB_1434      -1       14  -3.414938  FAD-dependent thymidylate synthase
HPKB_1435      -2       12  -1.433492  restriction modification system S subunit
HPKB_1436      -2       11  -0.776833  hypothetical protein
HPKB_1437      -2        9   0.131014  type I R-M system M protein
HPKB_1438       0       10   1.275217  typeI restriction enzyme R protein
HPKB_1439       1       10   1.934481  hypothetical protein
HPKB_1440       2       10   2.419704  TonB-dependent siderophore receptor
HPKB_1441      -1        9   1.070417  arginase
HPKB_1442      -1        9   0.918359  amino acid permease
HPKB_1443       0       10  -1.012635  alanine dehydrogenase
HPKB_1444       1       10  -2.058696  hypothetical protein
HPKB_1445       1        9  -2.102790  putative outer membrane protein
HPKB_1446       4       10  -3.500116  ATP-NAD kinase
HPKB_1447       5        9  -3.710304  DNA repair protein
HPKB_1448       1       10  -3.676675  fibronectin/fibrinogen-binding protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits  Score  E-value  Comments
HPKB_1408  CARBMTKINASE  29  0.027  Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 29.0 bits (65), Expect = 0.027
Identities = 15/43 (34%), Positives = 21/43 (48%), Gaps = 3/43 (6%)

Query: 246 ILSKHPIDPNSKVFSAPNRLVNAFYDP---KDLPLEKGFNFIE 285
I+++ +D N F P + V FYD K L EKG+ E
Sbjct: 113 IITQTIVDKNDPAFQNPTKPVGPFYDEETAKRLAREKGWIVKE 155


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits  Score  E-value  Comments
HPKB_1430  HTHFIS  35  4e-04  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 35.2 bits (81), Expect = 4e-04
Identities = 9/51 (17%), Positives = 24/51 (47%), Gaps = 4/51 (7%)

Query: 125 TVYEIAKKVAQSDTPPYNPVLFYGGTGLGKTHILNAIGNHALEKHKKVVLV 175
+Y + ++ Q+D ++ G +G GK + A+ ++ ++ V +
Sbjct: 148 EIYRVLARLMQTDLT----LMITGESGTGKELVARALHDYGKRRNGPFVAI 194


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits  Score  E-value  Comments
HPKB_1448  FbpA_PF05833  109  6e-28  Fibronectin-binding protein
		>FbpA_PF05833#Fibronectin-binding protein

Length = 577

Score = 109 bits (274), Expect = 6e-28
Identities = 57/271 (21%), Positives = 113/271 (41%), Gaps = 21/271 (7%)

Query: 180 ELEHKKNHIIKRLNMQKERLKEKLEKLEDPKNLQLEAKELQTQASLLLTYQHLINRHESR 239
L+ K + + K + R +K + L + + + LL + + + S
Sbjct: 296 RLKSKSSDLQKIVMNNINRCTKKDKILNNTLKKCEDKDIFKLYGELLTANIYALKKGLSH 355

Query: 240 VVLKDFED---KECAIEIDKSMPLNAFINKKFTLSKKKKQKSQFLYLEEENLKEKIAFKE 296
+ L ++ I +D++ + + + K K+ + + +E++ +
Sbjct: 356 IELANYYSENYDTVKITLDENKTPSQNVQSYYKKYNKLKKSEEAANEQLLQNEEELNYLY 415

Query: 297 NQINYVKGA---------KEESVLEMFM---PLKNSKIKRPMSGYEVLYYKDFKIGLGKN 344
+ + + A K+E + ++ + SK + + I +GKN
Sbjct: 416 SVLTNINNADNYDEIEEIKKELIETGYIKFKKIYKSKKSKTSKPMHFISKDGIDIYVGKN 475

Query: 345 QKENIKL-LQDARANDLWMHVRDIPGSHLIVFCQKNMPKDEVIMELAKMLIKMQKDVFNS 403
+N L L+ A +D+W H ++IPGSH+IV ++P + ++E A + K +S
Sbjct: 476 NIQNDYLTLKFANKHDIWFHTKNIPGSHVIVKNIMDIP-ESTLLEAANLAAYYSKSQNSS 534

Query: 404 -YEIDYTQRKFVKIIKGAN---VIYSKYRTI 430
+DYT+ K VK GA VIYS +TI
Sbjct: 535 NVPVDYTEVKNVKKPNGAKPGMVIYSTNQTI 565



Score = 35.6 bits (82), Expect = 4e-04
Identities = 20/92 (21%), Positives = 48/92 (52%), Gaps = 5/92 (5%)

Query: 46 NAPYIGLSKKTPESVLKNTLALDFCLNKFTKNAKILQANIIDNDRI--LEIRGAKDLAYK 103
N P I L+ T + +K + L K+ NAKI+ + I+ DRI ++ +L +
Sbjct: 55 NYPRIHLTDLTKPNPIKAPMFCMV-LRKYISNAKIVDIHQINQDRIVVIDFESTDELGFN 113

Query: 104 SETFIFRLEMIPKKANLMILD-QEKCVIEAFR 134
S + +E++ + +N+ ++ ++ ++++ +
Sbjct: 114 SIYSLI-IEIMGRHSNMTLIRKRDNIIMDSIK 144


13  HPKB_0253  HPKB_0260  N        Y        N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
HPKB_0253  -2      13       1.126264   neutrophil activating protein (napA)
HPKB_0254  -3      12       0.916704   sensory histidine kinase AtoS
HPKB_0255  -2      11       1.649376   hypothetical protein
HPKB_0256  -3      11       2.131579   flagellar basal body P-ring protein
HPKB_0257  -2      12       2.052005   DEAD-box ATP dependent DNA helicase
HPKB_0258  -2      10       1.796900   hypothetical protein
HPKB_0259  -3      10       1.771689   hypothetical protein
HPKB_0260  -3      10       1.727977   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
HPKB_0253  HELNAPAPROT  150  1e-49  Helicobacter neutrophil-activating protein A family ...
		>HELNAPAPROT#Helicobacter neutrophil-activating protein A family signature.
Length = 153

Score = 150 bits (380), Expect = 1e-49
Identities = 39/140 (27%), Positives = 75/140 (53%), Gaps = 1/140 (0%)

Query: 5 EILKHLQADAIVLFMKVHNFHWNVKGTDFFNVHKATEEIYEEFADMFDDLAERIVQLGHH 64
L ++ +L+ K+H FHW VKG FF +H+ EE+Y+ A+ D +AER++ +G
Sbjct: 15 NSLNTQLSNWFLLYSKLHRFHWYVKGPHFFTLHEKFEELYDHAAETVDTIAERLLAIGGQ 74

Query: 65 PLVTLTEALKLTRVKEETKTSFHSKDIFKEILEDYKYLEKEFKELSNTAEKEGDKVTVTY 124
P+ T+ E + + + + + ++ + ++ DYK + E K + AE+ D T
Sbjct: 75 PVATVKEYTEHASITDGGNET-SASEMVQALVNDYKQISSESKFVIGLAEENQDNATADL 133

Query: 125 ADDQLAKLQKSIWMLQAHLA 144
+ +++K +WML ++L
Sbjct: 134 FVGLIEEVEKQVWMLSSYLG 153


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
HPKB_0254  PF06580  30  0.015  Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 29.8 bits (67), Expect = 0.015
Identities = 10/71 (14%), Positives = 25/71 (35%), Gaps = 13/71 (18%)

Query: 281 IVLQNFLYNAIDAIEALEESEQ-GQVKIEAFIQNEFIVFTIIDNGKEVENKSALFEPFET 339
+++Q + N I + + Q G++ ++ N + + + G +
Sbjct: 258 MLVQTLVENGI--KHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTK------- 308

Query: 340 TKLKGNGLGLA 350
+ G GL
Sbjct: 309 ---ESTGTGLQ 316


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
HPKB_0256  FLGPRINGFLGI  365  e-128  Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 365 bits (937), Expect = e-128
Identities = 119/345 (34%), Positives = 191/345 (55%), Gaps = 26/345 (7%)

Query: 19 AEKIGDIASVVGVRDNQLIGYGLVIGLNGTGDK-SGSKFTMQSISNMLESVNVKISADDI 77
+I DIAS+ RDNQLIGYGLV+GL GTGD S FT QS+ ML+++ +
Sbjct: 28 TSRIKDIASLQAGRDNQLIGYGLVVGLQGTGDSLRSSPFTEQSMRAMLQNLGITTQGGQS 87

Query: 78 KSKNVAAVMITASLPPFARQGDKIDVQISSIGDAKSIQGGTLVMTPLNAVDGNIYALAQG 137
+KN+AAVM+TA+LPPFA G ++DV +SS+GDA S++GG L+MT L+ DG IYA+AQG
Sbjct: 88 NAKNIAAVMVTANLPPFASPGSRVDVTVSSLGDATSLRGGNLIMTSLSGADGQIYAVAQG 147

Query: 138 AITSGN-----------SNNLLSAHIINGATIEREVSYDLFHKNAMVLSLKNPNFKNAIQ 186
A+ SA + NGA IERE+ +VL L+NP+F A++
Sbjct: 148 ALIVNGFSAQGDAATLTQGVTTSARVPNGAIIERELPSKFKDSVNLVLQLRNPDFSTAVR 207

Query: 187 VQNTLNKV----FGNKVAIALDPKTIQITRPERFSMVEFLALVQEIPINYSAKNKIIVDE 242
V + +N +G+ +A D + I + +P + +A ++ + + K++++E
Sbjct: 208 VADVVNAFARARYGDPIAEPRDSQEIAVQKPRVADLTRLMAEIENLTVETDTPAKVVINE 267

Query: 243 KSGTIVSGVDIMVHPIVVTSQDITLKITKEP--------LSDSKNTQDLDNSMSLDTAHN 294
++GTIV G D+ + + V+ +T+++T+ P Q + M++
Sbjct: 268 RTGTIVIGADVRISRVAVSYGTLTVQVTESPQVIQPAPFSRGQTAVQPQTDIMAMQEGSK 327

Query: 295 TLSSNGKNITIAGVVKALQKIGVSAKGMVSILQALKKSGAISAEM 339
G ++ +V L IG+ A G+++ILQ +K +GA+ AE+
Sbjct: 328 VAIVEGPDLR--TLVAGLNSIGLKADGIIAILQGIKSAGALQAEL 370


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
HPKB_0260  HTHFIS  30  0.026  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 29.8 bits (67), Expect = 0.026
Identities = 8/24 (33%), Positives = 13/24 (54%)

Query: 30 VAVVGESGSGKSSIANIIMRLNPR 53
+ + GESG+GK +A + R
Sbjct: 163 LMITGESGTGKELVARALHDYGKR 186
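
Each alignment above is summarized by a "Score = ..., Expect = ..." line and an "Identities = ..." line. The following is a minimal Python sketch, not part of PredictBias, of how those two lines could be parsed and compared with the RPSBlast E-value threshold of 0.05 given in the input parameters. The names `parse_evalue`, `parse_hsp` and `EVALUE_CUTOFF` are hypothetical, and the sketch assumes the abbreviated form `e-128` stands for `1e-128`, as is usual in BLAST reports.

```python
import re

# Patterns for the two summary lines printed before each alignment.
STATS = re.compile(r"Score\s*=\s*([\d.]+) bits \((\d+)\), Expect\s*=\s*(\S+)")
IDENT = re.compile(r"Identities = (\d+)/(\d+)")

# Assumption: mirrors the "E-value (RPSBlast) 0.05" input parameter.
EVALUE_CUTOFF = 0.05


def parse_evalue(text):
    """Convert an Expect value to float; 'e-128' is read as 1e-128."""
    return float("1" + text if text.startswith("e") else text)


def parse_hsp(score_line, identities_line):
    """Extract bit score, raw score, E-value and percent identity."""
    bits, raw, expect = STATS.search(score_line).groups()
    matched, length = map(int, IDENT.search(identities_line).groups())
    return {
        "bits": float(bits),
        "raw_score": int(raw),
        "evalue": parse_evalue(expect),
        "identity_pct": round(100 * matched / length),
    }


# The HPKB_0260 hit against HTHFIS, copied from the report.
hsp = parse_hsp(
    "Score = 29.8 bits (67), Expect = 0.026",
    "Identities = 8/24 (33%), Positives = 13/24 (54%)",
)
passes_cutoff = hsp["evalue"] <= EVALUE_CUTOFF
print(hsp, passes_cutoff)
```

Under these assumptions the HPKB_0260 hit passes the cutoff, although its E-value of 0.026 is close to the threshold and the aligned region is only 24 residues long.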


14  HPKB_0354  HPKB_0363  N        Y        N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
HPKB_0354  -2      9        0.666179   flagellar M-ring protein FliF
HPKB_0355  -2      9        0.916790   flagellar motor switch protein G
HPKB_0356  -2      10       0.930264   flagellar assembly protein H
HPKB_0357  -1      10       1.369919   1-deoxy-D-xylulose-5-phosphate synthase
HPKB_0358  0       12       0.967220   GTP-binding protein LepA
HPKB_0359  0       13       -0.853368  DNA-cytosine methyltransferase
HPKB_0360  -1      12       0.714991   hypothetical protein
HPKB_0361  0       13       0.430066   flagellar basal-body rod protein FlgG
HPKB_0362  1       12       -0.077924  alpha-ketoglutarate permease
HPKB_0363  0       13       -0.363679  cell division protein FtsK, putative
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
HPKB_0354  FLGMRINGFLIF  553  0.0  Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 553 bits (1426), Expect = 0.0
Identities = 179/582 (30%), Positives = 294/582 (50%), Gaps = 66/582 (11%)

Query: 11 VDFFIKLNKKQKIALIAAGVLITALLVFLLLYPFKEKDYAQGGYGVLFERLDSSDNALIL 70
+++ +L +I LI AG A++V ++L+ K DY LF L D I+
Sbjct: 13 LEWLNRLRANPRIPLIVAGSAAVAIVVAMVLWA-KTPDYR-----TLFSNLSDQDGGAIV 66

Query: 71 QHLQQNQIPYKILKDD-TILIPKDKVYEERITLASQGIPKTSKVGFEIFDTKDFGATDFD 129
L Q IPY+ I +P DKV+E R+ LA QG+PK VGFE+ D + FG + F
Sbjct: 67 AQLTQMNIPYRFANGSGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQFS 126

Query: 130 QNIKLIRAIEGELSRTIESLNPILKANVHIAIPKDSVFVAKEVPPSASVMLKLKPDMKLS 189
+ + RA+EGEL+RTIE+L P+ A VH+A+PK S+FV ++ PSASV + L+P L
Sbjct: 127 EQVNYQRALEGELARTIETLGPVKSARVHLAMPKPSLFVREQKSPSASVTVTLEPGRALD 186

Query: 190 PTQILGIKNLIAAAVPKLTIENVKIVNENGESIGEGDILENSKELALEQLRYKQNFENIL 249
QI + +L+++AV L NV +V+++G + + + + ++L QL++ + E+ +
Sbjct: 187 EGQISAVVHLVSSAVAGLPPGNVTLVDQSGHLLTQSNT--SGRDLNDAQLKFANDVESRI 244

Query: 250 ENKIVNILAPIVGGKNKVVARVNAEFDFSQKKSTKETFDPNN-----VVRSEQNLEEKKE 304
+ +I IL+PIVG N V A+V A+ DF+ K+ T+E + PN +RS Q ++
Sbjct: 245 QRRIEAILSPIVGNGN-VHAQVTAQLDFANKEQTEEHYSPNGDASKATLRSRQLNISEQV 303

Query: 305 GAPKKQVGGVPGVVSN-IGPVQGLKDNKEPEKYEKSQN---------------------- 341
GA GGVPG +SN P P + +QN
Sbjct: 304 GAGYP--GGVPGALSNQPAPPNEAPIATPPTNQQNAQNTPQTSTSTNSNSAGPRSTQRNE 361

Query: 342 TTNYEVGKTISEIKGEFGTLVRLNAAVVVDGKYKIALKDGANTLEYEPLSDESLKKINAL 401
T+NYEV +TI K G + RL+ AVVV+ K L DG + PL+ + +K+I L
Sbjct: 362 TSNYEVDRTIRHTKMNVGDIERLSVAVVVNYK---TLADG----KPLPLTADQMKQIEDL 414

Query: 402 VKQAIGYNQNRGDDVAVSNFEFNPMAPMLDNATLSEKIMHKTQKVLGSFTPLIKYILVFI 461
++A+G++ RGD + V N F+ + T E + Q + +++LV +
Sbjct: 415 TREAMGFSDKRGDTLNVVNSPFSAVDN-----TGGELPFWQQQSFIDQLLAAGRWLLVLV 469

Query: 462 VLFIFYKKVIVPFSERMLEVMPDEDKEVKSMFEEMDEEEDELNKLGDLRKKVEDQLGLNA 521
V +I ++K + P R +E ++ + E + E L+K L+++ +Q
Sbjct: 470 VAWILWRKAVRPQLTRRVEEAKAAQEQAQVRQETEEAVEVRLSKDEQLQQRRANQ----- 524

Query: 522 TFSEEEVRYEIILEKIRGTLKERPDEIAMLFKLLIKDEISSD 563
+ E++ ++IR E D + L+I+ +S+D
Sbjct: 525 -----RLGAEVMSQRIR----EMSDNDPRVVALVIRQWMSND 557


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
HPKB_0355  FLGMOTORFLIG  349  e-122  Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 349 bits (897), Expect = e-122
Identities = 121/338 (35%), Positives = 208/338 (61%), Gaps = 4/338 (1%)

Query: 8 KQKAQLDELSMSEKIAILLIQVGEDTTGEILRHLDIDSITEISKQIVQLNGTDKQIGAAV 67
K+ + L+ +K AILL+ +G + + ++ ++L + I ++ +I +L ++ V
Sbjct: 7 KEILDVSALTGKQKAAILLVSIGSEISSKVFKYLSQEEIESLTFEIAKLETITSELKDNV 66

Query: 68 LEEFFAIFQSNQYINTGGLEYARELLTRTLGSEEARKVMDKLTKSLQTQKNFAYLGKIKP 127
L EF + + ++I GG++YARELL ++LG+++A +++ L +LQ+ + F ++ + P
Sbjct: 67 LLEFKELMMAQEFIQKGGIDYARELLEKSLGTQKAVDIINNLGSALQS-RPFEFVRRADP 125

Query: 128 QQLADFIINEHPQTIALILAHMEAPNAAETLSYFPDEMKAEISIRMANLGEISPQVVKRV 187
+ +FI EHPQTIALIL++++ A+ LS P E++ ++ R+A + SP+VV+ V
Sbjct: 126 ANILNFIQQEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEVVREV 185

Query: 188 STVLENKLESLTSYK-IEVGGLRAVAEIFNRLGQKSTKTTLARIESVDNKLAGAIKEMMF 246
VLE KL SL+S GG+ V EI N +K+ K + +E D +LA IK+ MF
Sbjct: 186 ERVLEKKLASLSSEDYTSAGGVDNVVEIINMADRKTEKFIIESLEEEDPELAEEIKKKMF 245

Query: 247 TFEDISKLDNFAIREILKVADKKDLSLALKTSTQDLTDKFLNNMSSRAAEQFVEEMQYLG 306
FEDI LD+ +I+ +L+ D ++L+ ALK+ + +K NMS RAA E+M++LG
Sbjct: 246 VFEDIVLLDDRSIQRVLREIDGQELAKALKSVDIPVQEKIFKNMSKRAASMLKEDMEFLG 305

Query: 307 AVKIKDVDVAQRKIIEIVQSLQEKG--VIQTGEEEDVI 342
+ KDV+ +Q+KI+ +++ L+E+G VI G EEDV+
Sbjct: 306 PTRRKDVEESQQKIVSLIRKLEEQGEIVISRGGEEDVL 343



Score = 31.3 bits (71), Expect = 0.005
Identities = 20/103 (19%), Positives = 41/103 (39%), Gaps = 3/103 (2%)

Query: 4 KLTPKQKAQLDELSMSEKIAILLIQVGEDTTGEILRHLDIDSITEISKQIVQLNGTDKQI 63
+ P + + IA++L + IL L + T ++++I ++ T ++
Sbjct: 122 RADPANILNFIQQEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEV 181

Query: 64 GAA---VLEEFFAIFQSNQYINTGGLEYARELLTRTLGSEEAR 103
VLE+ A S Y + GG++ E++ E
Sbjct: 182 VREVERVLEKKLASLSSEDYTSAGGVDNVVEIINMADRKTEKF 224


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
HPKB_0356  FLGFLIH  37  3e-05  Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 37.1 bits (85), Expect = 3e-05
Identities = 44/207 (21%), Positives = 92/207 (44%), Gaps = 14/207 (6%)

Query: 50 PLEKKAIENDLIDCLLKKTDELSSHLVKLQMQFEKAQEES-KALIENAKNDGYKIGFKEG 108
E I + + L L +LQMQ A E+ +A I + G+K G++EG
Sbjct: 19 QAEFVPIVEPEETIIEEAEPSLEQQLAQLQMQ---AHEQGYQAGIAEGRQQGHKQGYQEG 75

Query: 109 EEKMRNELTHSVNEEKNQLLHAISALDEKMKKSEDHLMALE----KELSAIAIDIAKEVI 164
+ L + E K+Q + + + + + + L AL+ L +A++ A++VI
Sbjct: 76 ---LAQGLEQGLAEAKSQQAPIHARMQQLVSEFQTTLDALDSVIASRLMQMALEAARQVI 132

Query: 165 LKEVEDNSQKVALALAEELLKNVLDATDIHLKVNPLDYPYLNERLQNASKI---KLESNE 221
+ ++ + + + L + L + L+V+P D +++ L + +L +
Sbjct: 133 GQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGATLSLHGWRLRGDP 192

Query: 222 AISKGGVMITSSNGSLDGNLMERFKTL 248
+ GG +++ G LD ++ R++ L
Sbjct: 193 TLHPGGCKVSADEGDLDASVATRWQEL 219


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
HPKB_0358  TCRTETOQM  140  2e-37  Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.
Length = 639

Score = 140 bits (355), Expect = 2e-37
Identities = 99/437 (22%), Positives = 174/437 (39%), Gaps = 85/437 (19%)

Query: 3 NIRNFSIIAHIDHGKSTLADCLIAECNAIS---NREMTSQVMDTMDIEKERGITIKAQSV 59
I N ++AH+D GK+TL + L+ AI+ + + + D +E++RGITI+
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 60 RLNYTLKGEDYVLNLIDTPGHVDFSYEVSRSLCSCEGALLVVDATQGVEAQTIANTYIAL 119
+ E+ +N+IDTPGH+DF EV RSL +GA+L++ A GV+AQT +
Sbjct: 62 SFQW----ENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALR 117

Query: 120 DNHLEILPVINKIDLPNANVLEVKQDIEDTIGIDCSSANEVSAKARLGIKD--------- 170
+ + INKID ++ V QDI++ + + +V + + +
Sbjct: 118 KMGIPTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDT 177

Query: 171 -------LLEKIITTIPAPSGDFNAPLKALIYD-------------------------SW 198
LLEK ++ + + ++ +
Sbjct: 178 VIEGNDDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVITNK 237

Query: 199 F--------------------DNYLGALALVRIMDGSINTEQEILVMGTGKKHGVLGLYY 238
F LA +R+ G ++ + + K + +Y
Sbjct: 238 FYSSTHRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEKEKI-KITEMYT 296

Query: 239 PNPLKKIPTKSLECGEIGIV---SLGLKSVTDIAVGDTLTDAKNPTPKPIEGFMPAKPFV 295
+ GEI I+ L L SV +GDT P + IE P +
Sbjct: 297 SINGELCKIDKAYSGEIVILQNEFLKLNSV----LGDTKLL---PQRERIEN---PLPLL 346

Query: 296 FAGLYPIETDRFEDLREALLKLQLNDCALNFEPESSVALGFGFRVGFLGLLHMEVIKERL 355
+ P + + E L +ALL++ +D L + +S+ + FLG + MEV L
Sbjct: 347 QTTVEPSKPQQREMLLDALLEISDSDPLLRYYVDSATH---EIILSFLGKVQMEVTCALL 403

Query: 356 EREFGLNLIATAPTVVY 372
+ ++ + + PTV+Y
Sbjct: 404 QEKYHVEIEIKEPTVIY 420



Score = 31.0 bits (70), Expect = 0.015
Identities = 15/75 (20%), Positives = 28/75 (37%), Gaps = 2/75 (2%)

Query: 399 IKEPFVRATIITPSEFLGNLMQLLNNKRGIQEKMEYLNQSRVMLTYSLPSNEIVMDFYDK 458
+ EP++ I P E+L + L + V+L+ +P+ I ++
Sbjct: 535 LLEPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQ-LKNNEVILSGEIPARCI-QEYRSD 592

Query: 459 LKSCTKGYASFDYEP 473
L T G + E
Sbjct: 593 LTFFTNGRSVCLTEL 607


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
HPKB_0361  FLGHOOKAP1  30  0.008  Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 30.3 bits (68), Expect = 0.008
Identities = 9/40 (22%), Positives = 16/40 (40%)

Query: 3 NGYYAATGAMATQFNRLDLTSNNLANLNTNGFKRDDAITG 42
+ A + L+ SNN+++ N G+ R I
Sbjct: 2 SLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMA 41


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
HPKB_0362  TCRTETB  40  1e-05  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 40.2 bits (94), Expect = 1e-05
Identities = 57/308 (18%), Positives = 105/308 (34%), Gaps = 53/308 (17%)

Query: 37 APYFAKEFTHTNDPTLALISAFLVFMLGFFMRPLGSLFFGKLGDKKGRKTSMVYSIILMA 96
P A +F T + +AF++ G+ +GKL D+ G K +++ II+
Sbjct: 37 LPDIANDFNKPPASTNWVNTAFMLTFSI------GTAVYGKLSDQLGIKRLLLFGIIINC 90

Query: 97 LGSFLLALLPTKEIVGEWAFLFLLLARLLQGFSVGGE------YGVVATYLSELGKNGKK 150
GS + VG F L++AR +QG G VVA Y+ + +
Sbjct: 91 FGSVIGF-------VGHSFFSLLIMARFIQG--AGAAAFPALVMVVVARYIPKENRGKAF 141

Query: 151 GFYGSFQYVTLVGGQLLAIFSLFIVENIYTHEQISAFAWRYLFALGGILALLSLFLRNVM 210
G GS + +G + I I+ W YL + I + FL ++
Sbjct: 142 GLIGS---IVAMGEGVGPAIGGMIAHYIH---------WSYLLLIPMITIITVPFLMKLL 189

Query: 211 EETMDSQTTFKTTIKEKTQRGSLKELLNHKKALMIVFGLTMGGSLCFYTFTVYLKIFLTN 270
++ + + F +L + + T + IF+ +
Sbjct: 190 KKEVRIKGHFDI----------KGIILMSVGIVFFMLFTTSYSISFLIVSVLSFLIFVKH 239

Query: 271 SSSFSPK-------ESSFIMLLALSYFILLQPLCG---ILADKIKRTQMLMVFAIAGLIV 320
+ ++ M+ L I+ + G ++ +K L I +I+
Sbjct: 240 IRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVII 299

Query: 321 TPVVFYGI 328
P I
Sbjct: 300 FPGTMSVI 307


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
HPKB_0363  IGASERPTASE  39  8e-05  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 39.3 bits (91), Expect = 8e-05
Identities = 48/231 (20%), Positives = 77/231 (33%), Gaps = 35/231 (15%)

Query: 179 APSDIQKKETK---NDKEKENLKENPI-DENHKTPNEESFLAIPTPYNTTLNALEPQEGL 234
P++IQ N++E + E P+ TP+E + T + +
Sbjct: 999 TPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETT--------ETVAENSKQESKT 1050

Query: 235 VQISSHSPTHYTIYPKR----SRFNDLSNPTNPTLKEVKQETKEREPTPTKET------- 283
V+ + T T + ++ N +N + + ETKE + T TKET
Sbjct: 1051 VEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEE 1110

Query: 284 -----LAPTMSKPATPKPIMPAPIMPAPIMPASAPNTENDNKTENHKAPNHPIKEESPQE 338
T P + P + P + P END T N K P + E
Sbjct: 1111 KAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPAREND-PTVNIKEPQSQTNTTADTE 1169

Query: 339 NAQEERIKEMIKEEEKEVQ-----NAPSFSPITPTSAKKPVMVKELSENKE 384
+E +++ E N+ +P T A V S NK
Sbjct: 1170 QPAKE-TSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKP 1219


15  HPKB_0728  HPKB_0744  N        Y        N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
HPKB_0728  1       10       -1.007240  response regulator PleD
HPKB_0729  1       11       -0.996902  NAD-dependent DNA ligase LigA
HPKB_0730  2       8        -0.289143  hypothetical protein
HPKB_0731  2       8        -0.224121  ABC transporter related protein
HPKB_0732  1       9        0.013562   hypothetical protein
HPKB_0733  1       8        0.487085   putative vacuolating cytotoxin (VacA) paralog
HPKB_0734  -2      12       1.123061   hypothetical protein
HPKB_0735  -3      11       0.761521   acriflavine resistance protein (acrB)
HPKB_0736  -3      12       0.541188   membrane fusion protein of the hefABC efflux
HPKB_0737  -2      11       -0.773728  hypothetical protein
HPKB_0738  -1      13       -2.111526  uroporphyrinogen decarboxylase
HPKB_0739  0       12       -2.342022  hypothetical protein
HPKB_0740  0       13       -2.274478  endonuclease III
HPKB_0741  0       12       -1.351778  flagellin A
HPKB_0742  -1      13       -2.651856  multidrug resistance protein
HPKB_0743  -3      10       -1.467260  multidrug resistance protein
HPKB_0744  -2      11       -0.473210  methyl-accepting chemotaxis sensory transducer
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
HPKB_0728  HTHFIS  56  4e-11  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 56.4 bits (136), Expect = 4e-11
Identities = 25/121 (20%), Positives = 52/121 (42%), Gaps = 7/121 (5%)

Query: 194 ILIAEDSLSALKTLEKIVQTLELRYLAFPNGKELLDYLYEKEHYQQVGVVITDLEMPVIS 253
IL+A+D + L + + N L ++ +V+TD+ MP +
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAA----GDGDLVVTDVVMPDEN 61

Query: 254 GFEVLKTIKADSRTEHLPVIINSSMSSDSNRQLAQSLEADGFVVKS-NILEIHEMLRKTL 312
F++L IK LPV++ S+ ++ A A ++ K ++ E+ ++ + L
Sbjct: 62 AFDLLPRIK--KARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRAL 119

Query: 313 S 313
+
Sbjct: 120 A 120


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
HPKB_0730  LCRVANTIGEN  31  6e-04  Low calcium response V antigen signature.
		>LCRVANTIGEN#Low calcium response V antigen signature.

Length = 326

Score = 31.2 bits (70), Expect = 6e-04
Identities = 16/33 (48%), Positives = 20/33 (60%)

Query: 16 KRKKLLTELAELEAEIKVSSERKSSFNISLSPS 48
R KL ELAEL AE+K+ S ++ N LS S
Sbjct: 149 ARSKLREELAELTAELKIYSVIQAEINKHLSSS 181


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
HPKB_0733  VACCYTOTOXIN  269  8e-75  Helicobacter pylori vacuolating cytotoxin signature.
		>VACCYTOTOXIN#Helicobacter pylori vacuolating cytotoxin signature.

Length = 1291

Score = 269 bits (688), Expect = 8e-75
Identities = 103/394 (26%), Positives = 180/394 (45%), Gaps = 14/394 (3%)

Query: 2798 NAVNWLNALFVAKGGNPLFAPYYLQDNPTEHIVTLMKDITSALGMLTKPSLKNNSTDVLQ 2857
+ L L + + +A + I + T+ L + K + L
Sbjct: 907 QGRDLLQTLLI-DSHDAGYARTMIDATSANEITKQLNTATTTLNNIASLEHKTSGLQTLS 965

Query: 2858 LNTYTQQMGRLAKLSNFAFFDSTDFSERLSSLKNQRFADAIPNAMDVILKYSQRDKLKNN 2917
L+ RL LS F++RL +LK+QRFA + +A +V+ +++ + + N
Sbjct: 966 LSNAMILNSRLVNLSRRHTNHIDSFAKRLQALKDQRFAS-LESAAEVLYQFAPKYEKPTN 1024

Query: 2918 LWATGVGGVSFVGNGTGTLYGVNVGYDRFIKG---VIVGGYAAYGHSGFYER--ITSSKS 2972
+WA +GG S G +LYG + G D ++ G IVGG+ +YG+S F + +S +
Sbjct: 1025 VWANAIGGTSLNSGGNASLYGTSAGVDAYLNGEVEAIVGGFGSYGYSSFSNQANSLNSGA 1084

Query: 2973 DNVNVGLYARAFIKKSELTFSVNETWGANKTQISSNDALLSMINQSYQYSTWTTNARVNY 3032
+N N G+Y+R F + E F G++++ ++ ALL +NQSY Y ++ R +Y
Sbjct: 1085 NNTNFGVYSRIFANQHEFDFEAQGALGSDQSSLNFKSALLRDLNQSYNYLAYSAATRASY 1144

Query: 3033 GYDFMFKNKSVIVKPQIGLRYYYIGMTGLDGVMNNALYNQFKANADPSKKSVLMIDFAFE 3092
GYDF F ++++KP +G+ Y ++G T + S + + E
Sbjct: 1145 GYDFAFFRNALVLKPSVGVSYNHLGSTNFKS----NSNQKVALKNGASSQHLFNASANVE 1200

Query: 3093 NRHYFNKNSYFYAIGGIGRDLLVRSMGDKLVRFIGDNILSYRKGELYNTFANITTGGEIR 3152
R+Y+ SYFY G+ ++ + V + + R NT A + GGE++
Sbjct: 1201 ARYYYGDTSYFYMNAGVLQEFANFGSSNA-VSLNTFKVNATRNP--LNTHARVMMGGELK 1257

Query: 3153 LFKSFYVNAGVGARFGLDYKMINITGNIGMRLAF 3186
L K ++N G L + + N+GMR +F
Sbjct: 1258 LAKEVFLNLGFVYLHNLISNIGHFASNLGMRYSF 1291


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
HPKB_0735  ACRIFLAVINRP  894  0.0  Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 894 bits (2312), Expect = 0.0
Identities = 286/1040 (27%), Positives = 516/1040 (49%), Gaps = 42/1040 (4%)

Query: 1 MYKTAINRPITTLMFALAIVFFGTMGFKKLSVALFPKIDLPTVVVTTTYPGASAEIIESK 60
M I RPI + A+ ++ G + +L VA +P I P V V+ YPGA A+ ++
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VTDKIEEAVMGIDGIKKVTSTSSKNVSIVV-IEFELEKPNEEALNDVVNKISSVR-FDDS 118
VT IE+ + GID + ++STS S+ + + F+ + A V NK+
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 119 NIKKPSINKFDTDSQAIISLFVSSSSVPAT--TLNDYAKKTIKPMLQKIDGVGGVQLNGF 176
+++ I+ + S ++ S + T ++DY +K L +++GVG VQL G
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFG- 179

Query: 177 RERQVRIYADPTLMNKYNLTYADLFSTLKAENVEIDGGRIVNS------QRELSILVNAN 230
+ +RI+ D L+NKY LT D+ + LK +N +I G++ + Q SI+
Sbjct: 180 AQYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 231 SYSVADVEKIQV-----GNHVRLGDIAKIEIGLEEDNTFASFKDKPGVILEIQKIAGANE 285
+ + K+ + G+ VRL D+A++E+G E N A KP L I+ GAN
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANA 299

Query: 286 IEIVDRVYEALKHIQAISP-SYEIRPFLDTTSYIRTSIEDVKFDLILGAILAVLVVFAFL 344
++ + L +Q P ++ DTT +++ SI +V L +L LV++ FL
Sbjct: 300 LDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFL 359

Query: 345 RNGTITLVSAISIPISIMGTFALIQWMGFSLNMLTMVALTLAIGIIIDDAIVVIENIHK- 403
+N TL+ I++P+ ++GTFA++ G+S+N LTM + LAIG+++DDAIVV+EN+ +
Sbjct: 360 QNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERV 419

Query: 404 KLEMGMSKRKASYEGVREIGFALVAISAMLLSVFVPIGNMKGIIGRFFQSLGITVALAIA 463
+E + ++A+ + + +I ALV I+ +L +VF+P+ G G ++ IT+ A+A
Sbjct: 420 MMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMA 479

Query: 464 LSYVVVVTIIPMVSSVVVNPRHS-------RFYVWSEPFFKALESRYTRLLQWVLNHKLI 516
LS +V + + P + + ++ P + F+ W F + YT + +L
Sbjct: 480 LSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 517 IFIAVVLVFVGSLFVASKLGMEFMLKEDRGRFLVWLKAKPGVSIDY----MTQKSKIFQK 572
+ L+ G + + +L F+ +ED+G FL ++ G + + + Q + + K
Sbjct: 540 YLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLK 599

Query: 573 AIEKHAEVEFTTLQVGY-GTTQNPFKAKIFVQLKPLKERKKEHELGQFELMSVLRKELRS 631
+ + E FT + G QN FV LKP +ER E ++ + EL
Sbjct: 600 NEKANVESVFTVNGFSFSGQAQNA--GMAFVSLKPWEERNG-DENSAEAVIHRAKMELGK 656

Query: 632 LPEAKGLDTINLSEVALIGGGGDSSPFQTFVFSHSQEAVDKSVENLRKFLLESPELKGKV 691
+ + + N+ + G ++ F + + D + + L + + +
Sbjct: 657 IRDGFVI-PFNMPAIV---ELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASL 712

Query: 692 ESYHTSTSESQPQLQLKILRQNANKYGVSAQTIGSVVSSAFSGTSQASVFKEDGKEYDMI 751
S + E Q +L++ ++ A GVS I +S+A G + + F + G+ +
Sbjct: 713 VSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGG-TYVNDFIDRGRVKKLY 771

Query: 752 IRVPDDKRVSVEDIKRLQVRNKYDKLMFLDALVEITETKSPSSISRYNRQRSVTVLAEPN 811
++ R+ ED+ +L VR+ +++ A + RYN S+ + E
Sbjct: 772 VQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEA- 830

Query: 812 RNAGTSLGEILTQVSKNTKEWLVEGANYRFAGEADNAKESNGEFLIALATAFVLIYMILA 871
GTS G+ + + +N L G Y + G + + S + +A +FV++++ LA
Sbjct: 831 -APGTSSGDAMALM-ENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLA 888

Query: 872 ALYESILEPFIIMVTMPLSFSGAFFALGLVHQPLSMFSMIGLILLIGMVGKNATLLIDVA 931
ALYES P +M+ +PL G A L +Q ++ M+GL+ IG+ KNA L+++ A
Sbjct: 889 ALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFA 948

Query: 932 NE-ERKKGLNIQEAILFAGKTRLRPILMTTIAMVCGMLPLALASGDGAAMKSPIGIAMSG 990
+ K+G + EA L A + RLRPILMT++A + G+LPLA+++G G+ ++ +GI + G
Sbjct: 949 KDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMG 1008

Query: 991 GLMISMVLSLLIVPVFYRLL 1010
G++ + +L++ VPVF+ ++
Sbjct: 1009 GMVSATLLAIFFVPVFFVVI 1028


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
HPKB_0736  RTXTOXIND  51  1e-09  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.
Length = 478

Score = 51.0 bits (122), Expect = 1e-09
Identities = 22/69 (31%), Positives = 34/69 (49%)

Query: 40 STGIVDSIKVTEGSVVKKGDVLLLLYNQDKQAQSDSTEQQLIFAKKQYQRYSKIGGAVDK 99
IV I V EG V+KGDVLL L +A + T+ L+ A+ + RY + +++
Sbjct: 103 ENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIEL 162

Query: 100 NTLESYEFN 108
N L +
Sbjct: 163 NKLPELKLP 171



Score = 31.3 bits (71), Expect = 0.003
Identities = 21/152 (13%), Positives = 48/152 (31%), Gaps = 25/152 (16%)

Query: 70 QAQSDSTEQQLIFAKKQYQR--YSKIGGAVDKNTLESYEFNYRRLESDYAYSIALLNKTI 127
+++ S +++ + ++ K+ D + + E ++
Sbjct: 279 ESEILSAKEEYQLVTQLFKNEILDKLRQTTDN--IGLLTLELAKNEER-------QQASV 329

Query: 128 LRAPFDGVIASKNIQVGEGVSANNTVLLRLVSHARKLVIE--FDSKYINAVKVG------ 179
+RAP + + GV L+ +V L + +K I + VG
Sbjct: 330 IRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIK 389

Query: 180 -DTYTYSIDGDSNQHEAKITKIYP--TVDENT 208
+ + Y+ G K+ I D+
Sbjct: 390 VEAFPYTRYGYL---VGKVKNINLDAIEDQRL 418


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
HPKB_0737  RTXTOXIND  29  0.048  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.
Length = 478

Score = 29.0 bits (65), Expect = 0.048
Identities = 16/113 (14%), Positives = 41/113 (36%), Gaps = 16/113 (14%)

Query: 204 LARMIALQKKLEQIKTDIKRVTKLYDKGLTTIDDL-----QSLKAQGNLSEY--DILDMQ 256
LAR+ + K+ + + L K + + ++A L Y + ++
Sbjct: 220 LARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIE 279

Query: 257 FALEQNRLTLEYLTNLSVKNLKKTTIDVPNLQLRERQD-LVSLREQISALKYQ 308
+ + + +T K +D +LR+ D + L +++ + +
Sbjct: 280 SEILSAKEEYQLVTQ----LFKNEILD----KLRQTTDNIGLLTLELAKNEER 324


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
HPKB_0740  PF05272  32  0.002  Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 32.0 bits (72), Expect = 0.002
Identities = 13/95 (13%), Positives = 26/95 (27%), Gaps = 20/95 (21%)

Query: 60 ILENDDEINLKKIAYAEFSKLAECVRPSGFYNQKAKRLINLSENILKDFQSFENFKQEAT 119
L + + +A+ E + VR + +KA E+
Sbjct: 458 ALRSAPALA-GCVAFDELREQPVAVRAFPW--RKAPGP-------------LEDADVLRL 501

Query: 120 REWLLDQKGIGKESADAILCYVCAKEVMVVDKYSY 154
+++ G G+ SA + D
Sbjct: 502 ADYVETTYGTGEASAQTTEQAINV----AADMNRV 532


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
HPKB_0741  FLAGELLIN  244  4e-77  Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 244 bits (625), Expect = 4e-77
Identities = 125/518 (24%), Positives = 209/518 (40%), Gaps = 22/518 (4%)

Query: 2 AFQVNTNINAMNAHVQSALTQNALKTSLERLSSGLRINKAADDASGMTVADSLRSQASSL 61
A +NTN ++ +Q++L +++ERLSSGLRIN A DDA+G +A+ S L
Sbjct: 1 AQVINTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGL 60

Query: 62 GQAIANTNDGMGIIQVADKAMDEQLKILDTVKVKATQAAQDGQTTESRKAIQSDIVRLIQ 121
QA N NDG+ I Q + A++E L V+ + QA + K+IQ +I + ++
Sbjct: 61 TQASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLE 120

Query: 122 GLDNIGNTTTYNGQALLSGQFTNKEFQVGAYSNQSIKASIGSTTSDKIGQVRI-ATGALI 180
+D + N T +NG +LS + QVGA ++I + +G G
Sbjct: 121 EIDRVSNQTQFNGVKVLSQDN-QMKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPKE 179

Query: 181 TASGDISLTFKQVDGVNDVTLESVKVSSSAGTGIGVLAEVINKNSNRTGVKAYASVITTS 240
GD+ +FK V G + + + K +G V ++ V A +TT
Sbjct: 180 ATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTTD 239

Query: 241 DVAVQSGSLSNLTLNGIHLGNIADIKKNDSDGRLVAAINAVTSETGVEAYTDQKGRLNLR 300
D N + K A A+ + + + +
Sbjct: 240 DAE-----------NNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTID 288

Query: 301 SIDGRGIEIKTDSVSDGPSALTMVNGGQDLTKGSTNYGRLSLTRLDAKSINV------VS 354
+ G K + +G V S + +N +
Sbjct: 289 TKTGNDGNGKVSTTINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKT 348

Query: 355 ASDSQHLGFTAIGFGESQVAETTVNLRDVTGNFNANVKSASGANYNAVIASGNQSL---G 411
++S L ++ TVN + T N + + +G + S
Sbjct: 349 KNESAKLSDLEANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFIDKTASGVSTLINE 408

Query: 412 SGVTTLRGAMVVIDIAESAMKMLDKVRSDLGSVQNQMISTVNNISITQVNVKAAESQIRD 471
+ + +SA+ +D VRS LG++QN+ S + N+ T N+ +A S+I D
Sbjct: 409 DAAAAKKSTANPLASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARSRIED 468

Query: 472 VDFAEESANFNKNNILAQSGSYAMSQANTVQQNILRLL 509
D+A E +N +K IL Q+G+ ++QAN V QN+L LL
Sbjct: 469 ADYATEVSNMSKAQILQQAGTSVLAQANQVPQNVLSLL 506


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
HPKB_0744  OMS28PORIN  29  0.037  OMS28 porin signature.
		>OMS28PORIN#OMS28 porin signature.

Length = 257

Score = 29.0 bits (64), Expect = 0.037
Identities = 25/102 (24%), Positives = 48/102 (47%), Gaps = 2/102 (1%)

Query: 143 NAAKNGEEHSTEGLGTVNKTGQDIESLYEKMQNATSLADSLNQRS--NEITQVISLIDDI 200
N + ++ + L T+NK +D+ S E ++ ++ N + +SL+ D+
Sbjct: 47 NKKLDQKDQVNQALDTINKVTEDVSSKLEGVRESSLELVESNDAGVVKKFVGSMSLMSDV 106

Query: 201 AEQTNLLALNAAIEAARAGEHGRGFAVVADEVRKLAEKTQKA 242
A+ T + + A I A +G G V + +K ++TQKA
Sbjct: 107 AKGTVVASQEATIVAKCSGMVAEGANKVVEMSKKAVQETQKA 148


16  HPKB_0759  HPKB_0765  N        Y        N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
HPKB_0759  2       14       -0.583863  endonuclease III
HPKB_0760  2       13       -1.390540  flagellar motor switch protein
HPKB_0761  1       13       -1.731224  hypothetical protein
HPKB_0762  0       12       -0.916114  hypothetical protein
HPKB_0763  -1      13       -0.692330  dihydroorotase
HPKB_0764  0       13       -0.711578  hypothetical protein
HPKB_0765  0       11       -0.535219  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
HPKB_0759  OMS28PORIN  29  0.018  OMS28 porin signature.
		>OMS28PORIN#OMS28 porin signature.

Length = 257

Score = 28.6 bits (63), Expect = 0.018
Identities = 29/112 (25%), Positives = 53/112 (47%), Gaps = 11/112 (9%)

Query: 25 NQTTELHHKNPYELLVATILSAQCTDARVNQITPKLFEKYPSVKDLAL-----ASLEEVK 79
N+ E+ K E A ++ + T QI + K P+ K+L L A +E+VK
Sbjct: 132 NKVVEMSKKAVQETQKAVSVAGEATFLIEKQI---MLNKSPNNKELELTKEEFAKVEQVK 188

Query: 80 EIIKSVSYFNNKSKHLISMAQKVVRDFKGVIPSTQKELMNLDGVGQKTANVV 131
E + + +++ + AQKV+ G+ PS + +++ V + +NVV
Sbjct: 189 ETLMASERALDET---VQEAQKVLNMVNGLNPSNKDQVLAKKDVAKAISNVV 237


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
HPKB_0760  FLGMOTORFLIN  99  2e-30  Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 99 bits (249), Expect = 2e-30
Identities = 25/77 (32%), Positives = 47/77 (61%)

Query: 34 LICDYKNLLDMEIVFSAELGSTQIPLLQILRFEKGSVIDLQKPAGESVDTFVNGRVIGKG 93
+ D ++D+ + + ELG T++ + ++LR +GSV+ L AGE +D +NG +I +G
Sbjct: 50 AMQDIDLIMDIPVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQG 109

Query: 94 EVMVFERNLAIRLNEIL 110
EV+V +R+ +I+
Sbjct: 110 EVVVVADKYGVRITDII 126


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
HPKB_0762  TONBPROTEIN  48  1e-08  Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 47.7 bits (113), Expect = 1e-08
Identities = 23/67 (34%), Positives = 27/67 (40%)

Query: 83 APKPTLAGPQKPPTPPTPPTPPTPPTPPKPIEKPKPEPKPKPKPKKPNHKHKALKKVEKV 142
A Q PP P P P P P P E P KPKPKPK K +++ K
Sbjct: 53 ADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKR 112

Query: 143 EEKKVVE 149
+ K V
Sbjct: 113 DVKPVES 119



Score = 41.9 bits (98), Expect = 1e-06
Identities = 22/63 (34%), Positives = 28/63 (44%), Gaps = 1/63 (1%)

Query: 86 PTLAGPQKPPTPPTPPTPPTPPTPPKPIEKPKPEPKPKPKPK-KPNHKHKALKKVEKVEE 144
P P + PP P P P E PK P KPK KP K K +KKV++ +
Sbjct: 52 PADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPK 111

Query: 145 KKV 147
+ V
Sbjct: 112 RDV 114



Score = 40.0 bits (93), Expect = 6e-06
Identities = 19/65 (29%), Positives = 27/65 (41%)

Query: 95 PTPPTPPTPPTPPTPPKPIEKPKPEPKPKPKPKKPNHKHKALKKVEKVEEKKVVEEKKEE 154
P P P +P +P+P P+P + K K K + KKV E+ K +
Sbjct: 54 DLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRD 113

Query: 155 KKIVE 159
K VE
Sbjct: 114 VKPVE 118



Score = 32.3 bits (73), Expect = 0.002
Identities = 13/55 (23%), Positives = 19/55 (34%)

Query: 76 PSKNTPGAPKPTLAGPQKPPTPPTPPTPPTPPTPPKPIEKPKPEPKPKPKPKKPN 130
P + P K P P PKP++K + +PK KP +
Sbjct: 66 EPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVESR 120


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
HPKB_0765  TYPE3IMSPROT  31  0.002  Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.
Length = 354

Score = 30.9 bits (70), Expect = 0.002
Identities = 18/64 (28%), Positives = 30/64 (46%), Gaps = 4/64 (6%)

Query: 88 LQSYSVMLFFNLLLLTDILGFLPFSIYHHFMASLIFSALFCISLFLSSPLLGMIALVALS 147
L Y F L+L+ +LPFS S + + +L PLL + AL+A++
Sbjct: 45 LSDYYFEHFSKLMLIPAEQSYLPFSQ----ALSYVVDNVLLEFFYLCFPLLTVAALMAIA 100

Query: 148 SSLL 151
S ++
Sbjct: 101 SHVV 104


17  HPKB_0820  HPKB_0832  N        Y        N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
HPKB_0820  1       12       -1.768121  cag pathogenicity island protein 3
HPKB_0821  -2      10       -1.444153  cag pathogenicity island protein (cag1)
HPKB_0822  -1      9        -0.838392  hypothetical protein
HPKB_0823  0       11       -1.929379  hypothetical protein
HPKB_0824  0       11       -1.138426  GTP-binding protein Era
HPKB_0825  0       11       -0.561257  ATP-dependent protease ATP-binding subunit
HPKB_0826  -1      12       0.199216   ATP-dependent protease peptidase subunit
HPKB_0827  0       12       -0.508971  50S ribosomal protein L9
HPKB_0828  0       11       -0.884890  hypothetical protein
HPKB_0829  -2      8        0.544665   glutamine synthetase (glnA)
HPKB_0830  -2      10       -0.169210  dihydrodipicolinate reductase
HPKB_0831  -2      9        -0.204891  glycolate oxidase subunit
HPKB_0832  -1      10       -1.367497  TolA family protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
HPKB_0820  PF07201  30     0.019    Hypersensitivity response secretion protein HrpJ
		>PF07201#Hypersensitivity response secretion protein HrpJ

Length = 293

Score = 29.8 bits (67), Expect = 0.019
Identities = 14/76 (18%), Positives = 26/76 (34%), Gaps = 15/76 (19%)

Query: 277 APENSKEKLIEELIANSQLIANEEEREKKLLAEKEKQ--------EAELAKY--KLKDLE 326
S + EE+ E +E L K E ++ +Y K+ +LE
Sbjct: 44 GTLQSIADMAEEVTF-----VFSERKELSLDKRKLSDSQARVSDVEEQVNQYLSKVPELE 98

Query: 327 NQKKLKALEAELKKKN 342
++ + L + L
Sbjct: 99 QKQNVSELLSLLSNSP 114


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
HPKB_0824  PF03944  32     0.004    delta endotoxin
		>PF03944#delta endotoxin

Length = 633

Score = 31.6 bits (71), Expect = 0.004
Identities = 25/94 (26%), Positives = 47/94 (50%), Gaps = 3/94 (3%)

Query: 68 LHHQEKLLNQCMLSQALKAMGDAELCVFLASVHDDLKGYEEFLSLCQKPHILALSKIDTA 127
L E+ LNQ + + + A +AEL A+V + + + FL+ + L+++
Sbjct: 94 LRETERFLNQRLNTDTV-ARVNAELTGLQANVEEFNRQVDNFLNPNRNAVPLSITSSVNT 152

Query: 128 THKQVLQKLQEYQQYDSQFLALVPLSAKKSQNLN 161
+ L +L ++Q Q L L+PL A+ + NL+
Sbjct: 153 MQQLFLNRLPQFQMQGYQLL-LLPLFAQAA-NLH 184


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits    Score  E-value  Comments
HPKB_0825  HTHFIS  29     0.044    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 29.0 bits (65), Expect = 0.044
Identities = 11/36 (30%), Positives = 18/36 (50%), Gaps = 3/36 (8%)

Query: 48 TPKNILMIGSTGVGKTEIARRI---AKIMKLPFVKV 80
T +++ G +G GK +AR + K PFV +
Sbjct: 159 TDLTLMITGESGTGKELVARALHDYGKRRNGPFVAI 194


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
HPKB_0826  PF07520  29     0.010    Virulence protein SrfB
		>PF07520#Virulence protein SrfB

Length = 1041

Score = 29.2 bits (65), Expect = 0.010
Identities = 14/49 (28%), Positives = 23/49 (46%), Gaps = 4/49 (8%)

Query: 121 LEAEDNKIAAIGSGG---NFALSAARALDNFAHLEPRKLVEESLKIAGD 166
E+ ++A I GG + ++ R DN L P + E ++AGD
Sbjct: 590 GESPSLRLACIDVGGGTTDLMVTTYRGEDNRV-LHPEQTFREGFRVAGD 637


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits  Score  E-value  Comments
HPKB_0828  SECA  30     0.050    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 29.8 bits (67), Expect = 0.050
Identities = 17/87 (19%), Positives = 37/87 (42%), Gaps = 1/87 (1%)

Query: 62 FVQDGIYTTSHNELLIIDG-QQRLTTITLLFIALRDHLNDEDEFLEKFSHQKIQNRYLIN 120
+ I S E+ I G Q+RL L + + + L+ E E E+ ++I + +
Sbjct: 687 TIDAYIPPQSLEEMWDIPGLQERLKNDFDLDLPIAEWLDKEPELHEETLRERILAQSIEV 746

Query: 121 SDEKGDKKFKLILSEPDRDILLSLIDK 147
K + ++ ++ ++L +D
Sbjct: 747 YQRKEEVVGAEMMRHFEKGVMLQTLDS 773


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
HPKB_0832  IGASERPTASE  45     4e-07    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 45.4 bits (107), Expect = 4e-07
Identities = 28/202 (13%), Positives = 65/202 (32%), Gaps = 17/202 (8%)

Query: 246 RIKKREGKIDSREI----KREIKQEAIKEPKKANQGTQNAPTLEEKNYQKAEHKLDAKEE 301
++KR +D+ I + ++ + AP +E E
Sbjct: 984 EVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAEN 1043

Query: 302 RRYLRDERKKAKATKKAMEFEEREKEHDERDEKETEGRRKALEMDKGNEKVNTKENEQEI 361
+ ++ + K + A E + +E + + + + E+ + + +E
Sbjct: 1044 SK--QESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQ------SGSETKET 1095

Query: 362 KQEAIKEPSNGNNATQQGEKQNAPKENKAQKEENKPNSKEEKRRLKEEKKKAKAEQRARE 421
+ KE AT + E++ + K Q+ + K+ E + R +
Sbjct: 1096 QTTETKET-----ATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPAREND 1150

Query: 422 FEQRAKEHQERDEKELEERRKA 443
KE Q + + + A
Sbjct: 1151 PTVNIKEPQSQTNTTADTEQPA 1172



Score = 34.3 bits (78), Expect = 0.001
Identities = 21/105 (20%), Positives = 40/105 (38%), Gaps = 1/105 (0%)

Query: 344 EMDKGNEKVNTKENEQEIKQEAIKEPSNGNNATQQGEKQNAPKENKAQKEENKPNSKEEK 403
E++K N+ V+T +A PS +N + AP A ++ +
Sbjct: 984 EVEKRNQTVDTTNITTPNNIQA-DVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAE 1042

Query: 404 RRLKEEKKKAKAEQRAREFEQRAKEHQERDEKELEERRKALEAGK 448
+E K K EQ A E + +E + + ++ + E +
Sbjct: 1043 NSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQ 1087


18  HPKB_1064  HPKB_1073  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
HPKB_1064   0      14        0.222693  F0F1 ATP synthase subunit alpha
HPKB_1065   1      14       -1.819351  F0F1 ATP synthase subunit delta
HPKB_1066   0      15       -0.675177  ATP synthase F0 subunit B
HPKB_1067   1      17       -3.318467  F0F1 ATP synthase subunit B'
HPKB_1068   2      17       -3.020174  chromosome partitioning protein
HPKB_1069   2      18       -2.777308  ParA family protein
HPKB_1070   2      17       -2.895187  biotin--protein ligase
HPKB_1071   2      17       -3.138235  methionyl-tRNA formyltransferase
HPKB_1072   3      17       -3.674570  hypothetical protein
HPKB_1073   2      14       -0.007099  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits        Score  E-value  Comments
HPKB_1064  ECOLIPORIN  30     0.025    E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 29.9 bits (67), Expect = 0.025
Identities = 20/68 (29%), Positives = 35/68 (51%), Gaps = 3/68 (4%)

Query: 196 YVAIGQKESTVAQVVRKLEEYGAMEYSVVINASASDSAAMQYLAPYSGVAMGEYFR-DHA 254
Y+ +G K T Q+ +L YG EY+V N + + A ++G+ G+Y D+
Sbjct: 56 YMRVGFKGET--QINDQLTGYGQWEYNVQANTTEGEGANSWTRLAFAGLKFGDYGSFDYG 113

Query: 255 RHALIIYD 262
R+ ++YD
Sbjct: 114 RNYGVLYD 121


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
HPKB_1067  PF06580  28     0.015    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 27.9 bits (62), Expect = 0.015
Identities = 28/146 (19%), Positives = 51/146 (34%), Gaps = 22/146 (15%)

Query: 8 YLMAVVFVVFVLLLWAMNVWVYRPLLAFMDNRQAEIKDSLAKIKTDNTQSVEIRHQIE-- 65
L + VV V +W++ +Y F + +QAEI Q + ++ QI
Sbjct: 117 ALSIIFNVVVVTFMWSL---LYFGWHFFKNYKQAEIDQWKMASMAQEAQLMALKAQINPH 173

Query: 66 ----------TLLKEAAEKRREMIAEAIQKAAESYDAVIKQKENE---LNQEFEAFAKQL 112
L+ E K REM+ +E ++ L E L
Sbjct: 174 FMFNALNNIRALILEDPTKAREML----TSLSELMRYSLRYSNARQVSLADELTVVDSYL 229

Query: 113 QNEKQILKEQLQVQMPVFEDELNKRV 138
Q +++LQ + + ++ +V
Sbjct: 230 QLASIQFEDRLQFENQINPAIMDVQV 255


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
HPKB_1069  PF07675  31     0.005    Cleaved Adhesin
		>PF07675#Cleaved Adhesin

Length = 1358

Score = 31.2 bits (70), Expect = 0.005
Identities = 30/105 (28%), Positives = 40/105 (38%), Gaps = 7/105 (6%)

Query: 69 QISQVILKTQMPFLDLVPSNLGLAGFEKTFYDSQDENKRGELMLKNALESVV---GLYDY 125
VI T F SNL A FE + D + ++ VV G+YDY
Sbjct: 414 TFGSVIPATGPLFTGTASSNLYSANFEYLTPANADPVVTTQNIIVTGQGEVVIPGGVYDY 473

Query: 126 IIIDSPPALGPLTINSLSAAHSVIIPIQCEFFALEGTKLLLNTIR 170
I + PA G + I A P + + FA E K T+R
Sbjct: 474 CITNPEPASGKMWI----AGDGGNQPARYDDFAFEAGKKYTFTMR 514


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
HPKB_1071  FERRIBNDNGPP  31     0.003    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 31.5 bits (71), Expect = 0.003
Identities = 12/33 (36%), Positives = 19/33 (57%)

Query: 70 EPEVQILKALKPDFIVVVAYGKILPKEVLTIAP 102
EP +++L +KP F+V A P+ + IAP
Sbjct: 86 EPNLELLTEMKPSFMVWSAGYGPSPEMLARIAP 118


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
HPKB_1072  PF01540  32     0.012    Adhesin lipoprotein
		>PF01540#Adhesin lipoprotein

Length = 475

Score = 31.6 bits (71), Expect = 0.012
Identities = 68/332 (20%), Positives = 127/332 (38%), Gaps = 49/332 (14%)

Query: 140 KNCKEKVEKRKKKIKDENSAETLSAKQESEIKKYDKEIEKIRKEMTSKTIRITLDEIKIN 199
K+ ++KV++ KKI DEN +IK+ KE+ K+ +++ S I L
Sbjct: 103 KSEQQKVDQANKKIADEN----------LKIKEGAKELLKLSEKIQSFADTIAL------ 146

Query: 200 NICEVSKNKFKVQEDALTNLEKDFDELDEAMKKFDDLKEMELPKDYQTIKDKLESLFSFD 259
I ++ KF++ E L + L++ + + K + + LES F+
Sbjct: 147 TITKLEGKKFQIDETFKKQLISTIELLNKKSAEVKTFATVNTIKKDFLLSE-LESFKEFN 205

Query: 260 IDKEAGQVSE--EIKEHMSKVGREF--------------IEKGIELQKKMPDNACPFCTQ 303
VSE E+K+ SK E I++G + K+ + F
Sbjct: 206 TSWLEKIVSEWEEVKKAWSKELAEIKAEDDKKLAEENQKIKEGAKELLKLSEKIQSF-AD 264

Query: 304 EITNNIIQVYTSY-----FNKRIEQFNQDSLEVSGTLKKILEQWNIKE--ILQSFERFES 356
I I ++ + F K++ + + S +K IK+ +L E F+
Sbjct: 265 TIALTITKLERKFQIDEKFKKQLISTIELLNKKSVEVKTFATVNTIKKDFLLSELESFKE 324

Query: 357 F-------MKKDSSTNKESLKNALEQIKVLLEKLQKEVGKKEGAKNEKEFQETDKKLLEN 409
F + + K++ L +IK +K E +K +E ++ + + E
Sbjct: 325 FNTSWLEKIVSEWEEVKKAWSKELAEIKAEDDKKLAEENQKI-KNGVEELKKINNEAFEL 383

Query: 410 YEKFQKCVDETRNILKQKKEQKEKLEKLKTEL 441
+ K + E K KE+L+ +L
Sbjct: 384 SKTVNKTIAELEKKFKIDVSFKEQLKNFADDL 415


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits       Score  E-value  Comments
HPKB_1073  RTXTOXIND  43     2e-06    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 42.9 bits (101), Expect = 2e-06
Identities = 26/195 (13%), Positives = 70/195 (35%), Gaps = 19/195 (9%)

Query: 27 QIELENQSRF-LAQQKEFEKEVKEKRAQYQSHFKMLEQKEEALKEREREQKAKFDDAVKQ 85
+++L ++ F ++E + + Q+ + QKE L ++ E+ +
Sbjct: 167 ELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRY 226

Query: 86 ASALALQDERAKIIEEARKNAFLEQQKGLELLQKELDEKSKQVQELHQKEAEIERLKREN 145
+ ++ R + + LE + + E EL ++++E+++ E
Sbjct: 227 ENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQ-ENKYVEAV---NELRVYKSQLEQIESEI 282

Query: 146 NEAESRLKAENEKKLNEKLDLERERIEKALHEKNELKFKQQEEQLEMLRNELKNAQRKAE 205
A+ + + + E ++K + +L + + +A
Sbjct: 283 LSAKEEYQLVTQ-------LFKNEILDK--LRQTTDNIGLLTLELAKNEERQQASVIRAP 333

Query: 206 LSSQQFQGEVQELAI 220
+S +VQ+L +
Sbjct: 334 VS-----VKVQQLKV 343


19  HPKB_1278  HPKB_1283  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
HPKB_1278  -3      14        0.518722  nodulation protein (nolK)
HPKB_1279  -3      13        0.166828  GDP-D-mannose dehydratase (rfbD)
HPKB_1280  -2      11       -0.242364  mannose-6-phosphate isomerase
HPKB_1281  -1      12       -0.381488  trbI protein
HPKB_1282  -1      11       -1.210381  comB9 competence protein
HPKB_1283   0      14       -0.205102  comB8 competence protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
HPKB_1278  NUCEPIMERASE  51     2e-09    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 50.6 bits (121), Expect = 2e-09
Identities = 51/346 (14%), Positives = 106/346 (30%), Gaps = 54/346 (15%)

Query: 5 ILITGAYGMVGQNTALYFKKNKPDV-----------ALLTPKKSELY-----------LL 42
L+TGA G +G + + + V L + EL L
Sbjct: 3 YLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDLA 62

Query: 43 DKDNVQAYLKEHKPTGIIHCAGRVGGIVANMNDLSTYMVENLLMGLYLFSSALDLGVKKA 102
D++ + + R + ++ + Y NL L + ++
Sbjct: 63 DREGMTDLFASGHFERVFISPHR-LAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQHL 121

Query: 103 INLASSCAYPKFAPNPLKESDLLNGSLEPTNEGYALAKLSVMKYCEYVSAEKGVFYKTLV 162
+ +SS Y P D ++ + YA K + S G+ L
Sbjct: 122 LYASSSSVYGLNRKMPFSTDDSVDHPVSL----YAATKKANELMAHTYSHLYGLPATGLR 177

Query: 163 PCNLYGEFDKFEEKIAHMIPGLIARMHTAKLKNEKEFVMWGDGTARREYLNAKDLARFIS 222
+YG + + P + T + K ++ G +R++ D+A I
Sbjct: 178 FFTVYGPWGR---------PDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAII 228

Query: 223 LAYENIASIPS-----------------VMNVGSGVDYSIEEYYKMIAQVLDYKGAFVKD 265
+ I + V N+G+ + +Y + + L +
Sbjct: 229 RLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAKKNML 288

Query: 266 LSKPVGMQQKLMDISK-QKALKWELEIPLEQGIKEAYEYYLKLLEV 310
+P + + D + + + E ++ G+K +Y +V
Sbjct: 289 PLQPGDVLETSADTKALYEVIGFTPETTVKDGVKNFVNWYRDFYKV 334


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
HPKB_1279  NUCEPIMERASE  88     1e-21    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 87.9 bits (218), Expect = 1e-21
Identities = 46/180 (25%), Positives = 72/180 (40%), Gaps = 19/180 (10%)

Query: 7 LITGVTGQDGSYLAEYLLNLGYEVHGLKRRSSSINTSRIDHLYEDLHSEHKRRFFLHYGD 66
L+TG G G ++++ LL G++V G+ + + S E L F H D
Sbjct: 4 LVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQP---GFQFHKID 60

Query: 67 MTDSSNLIHLIATTKPTEIYNLAAQSHVKVSFETPEYTANADGIGTLRILEAMRILGLEK 126
+ D + L A+ ++ + V+ S E P A+++ G L ILE R ++
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQ- 119

Query: 127 KTRFYQASTSELYGEVLETPQNENTPF-------NPRSPYAVAKMYAFYITKNYREAYNL 179
AS+S +YG N PF +P S YA K + Y Y L
Sbjct: 120 --HLLYASSSSVYGL------NRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYGL 171


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
HPKB_1282  TYPE4SSCAGX  32     0.004    Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein signature.

Length = 522

Score = 31.7 bits (71), Expect = 0.004
Identities = 26/72 (36%), Positives = 37/72 (51%), Gaps = 8/72 (11%)

Query: 190 KEETKEEETITIGDNTNAMKIVKKDIQKGYRALKSSQ--RKWYCLGICSKKSKLSLMPEE 247
KE+ +EE+ I D A+ + Q + ALK + R + K+SK +MP E
Sbjct: 365 KEKIREEKQKIILDQAKAL-----ETQYVHNALKRNPVPRNYNYYQAPEKRSK-HIMPSE 418

Query: 248 IFNDKQFTYFKF 259
IF+D FTYF F
Sbjct: 419 IFDDGTFTYFGF 430


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits     Score  E-value  Comments
HPKB_1283  PF04335  133    1e-40    VirB8 type IV secretion protein
		>PF04335#VirB8 type IV secretion protein

Length = 227

Score = 133 bits (336), Expect = 1e-40
Identities = 38/202 (18%), Positives = 73/202 (36%), Gaps = 4/202 (1%)

Query: 40 QSVFRLERNRLKIAYKLLGLMSFIALVLAIVLISVLPLQKTEHHF--VDFLNQDKHYAII 97
+ K+A+ + G+ +A + + ++ PL+ E + VD + A
Sbjct: 22 RDKLAAAERSKKLAWVVAGVAGALATAGVVAVAALTPLKTVEPYVITVDRNTGEASIAAK 81

Query: 98 QRADKSISSNEALARSLIGAYVLNRESINRIDDKSRYELVRLQSSSKVWQRFEDLIKTQN 157
D +I+ +EA+ + + YV RE + ++ V + S+ R+ KT N
Sbjct: 82 LHGDATITYDEAVRKYFLATYVRYREGWIAAAREEYFDAVMVMSARPEQDRWSRFYKTDN 141

Query: 158 SIYAQSHLEREVHI-VNIAIYQQDNNPIASVSIVAKLMNENKLVYEKRYKIVLSYLFDTP 216
Q+ L + V I +A V + + + + + Y D
Sbjct: 142 PQSPQNILANRTDVFVEIKRVSFLGGNVAQVYFTKESVTGSNST-KTDAVATIKYKVDGT 200

Query: 217 DFDYASMPKNPTGFKVTRYSIT 238
KNP G++V Y
Sbjct: 201 PSKEVDRFKNPLGYQVESYRAD 222


20  HPKB_1351  HPKB_1357  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias   Product
HPKB_1351   0      13       0.679394  putative inner membrane protein translocase
HPKB_1352   1      11       0.624777  hypothetical protein
HPKB_1353   1       9       1.092598  tRNA modification GTPase TrmE
HPKB_1354   2      11       1.557831  hypothetical protein
HPKB_1355  -1      12       0.549506  hypothetical protein
HPKB_1356  -2      11       1.384154  hypothetical protein
HPKB_1357  -2      12       0.926290  membrane-associated lipoprotein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
HPKB_1351  60KDINNERMP  430    e-148    60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 430 bits (1108), Expect = e-148
Identities = 168/566 (29%), Positives = 280/566 (49%), Gaps = 53/566 (9%)

Query: 10 RLILAIALSFLFIALYSYFFQKPNQTTTTKQETTNNHTATNSNTLNAFSATQAIPQENLL 69
R +L IAL F+ ++ + Q N +Q T T T + A A Q L+
Sbjct: 5 RNLLVIALLFVSFMIWQAWEQDKNPQPQAQQTTQ---TTTTAAGSAADQGVPASGQGKLI 61

Query: 70 STISFEHARIEIDSLGR--IKQVYLKDKKYLTPKQKGFLEHVSHLFNPKANPQTPLKELP 127
++ + + I++ G + + K L Q L S F +A ++ P
Sbjct: 62 -SVKTDVLDLTINTRGGDVEQALLPAYPKELNSTQPFQLLETSPQFIYQAQSGLTGRDGP 120

Query: 128 LLAADKLKPLEVRFLDPTLNNKAFNTPYSASKTTLGPNEQLV--LTQDLGALTIIKTLTF 185
A+ +PL +N A G NE V D T KT
Sbjct: 121 DNPANGPRPL-------------YNVEKDAYVLAEGQNELQVPMTYTDAAGNTFTKTFVL 167

Query: 186 YDDLHYDLQIAFKSPN--------NIIPSYVITNGYRPVADLDS-----YTFSGVLLENN 232
Y + + + N + + P D S +TF G
Sbjct: 168 KRG-DYAVNVNYNVQNAGEKPLEISSFGQLKQSITLPPHLDTGSSNFALHTFRGAAYSTP 226

Query: 233 DKKIEKIE---DKDAKEIKRFSNTLFLSSVDRYFTTLLFTKDSQGFEALIDSEIGTKNPL 289
D+K EK + D + + S +++ + +YF T + G + +G N +
Sbjct: 227 DEKYEKYKFDTIADNENLNISSKGGWVAMLQQYFATAWIPHN-DGTNNFYTANLG--NGI 283

Query: 290 GFISLKNEA-----------NLHGYIGPKDYRSLKAISPMLTDVIEYGLITFFAKGVFVL 338
I K++ N ++GP+ + A++P L ++YG + F ++ +F L
Sbjct: 284 AAIGYKSQPVLVQPGQTGAMNSTLWVGPEIQDKMAAVAPHLDLTVDYGWLWFISQPLFKL 343

Query: 339 LDYLYQFVGNWGWAIILLTIIVRLILYPLSYKGMVSMQKLKEIAPKMKELQEKYKGEPQK 398
L +++ FVGNWG++II++T IVR I+YPL+ SM K++ + PK++ ++E+ + Q+
Sbjct: 344 LKWIHSFVGNWGFSIIIITFIVRGIMYPLTKAQYTSMAKMRMLQPKIQAMRERLGDDKQR 403

Query: 399 LQAHMMQLYKKHGANPLGGCLPLILQIPVFFAIYRVLYNAVELKSSEWILWIHDLSIMDP 458
+ MM LYK NPLGGC PL++Q+P+F A+Y +L +VEL+ + + LWIHDLS DP
Sbjct: 404 ISQEMMALYKAEKVNPLGGCFPLLIQMPIFLALYYMLMGSVELRQAPFALWIHDLSAQDP 463

Query: 459 YFILPLLMGASMYWHQSVTPNTMTDPMQAKIFKFLPLLFTIFLITFPAGLVLYWTTNNIL 518
Y+ILP+LMG +M++ Q ++P T+TDPMQ KI F+P++FT+F + FP+GLVLY+ +N++
Sbjct: 464 YYILPILMGVTMFFIQKMSPTTVTDPMQQKIMTFMPVIFTVFFLWFPSGLVLYYIVSNLV 523

Query: 519 SVLQQLIINKVLENKKRAHAQNKKES 544
+++QQ +I + LE K+ H++ KK+S
Sbjct: 524 TIIQQQLIYRGLE-KRGLHSREKKKS 548


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
HPKB_1352  IGASERPTASE  30     0.014    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 29.6 bits (66), Expect = 0.014
Identities = 15/48 (31%), Positives = 25/48 (52%), Gaps = 1/48 (2%)

Query: 64 KEESVKETNTKEIHQSAEEKKQKLETETPQEE-TITPKPPKKNLKEES 110
+ + + T TKE +E+K K+ETE QE +T + K + E+
Sbjct: 1091 ETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSET 1138


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits       Score  E-value  Comments
HPKB_1353  TCRTETOQM  34     0.001    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 34.4 bits (79), Expect = 0.001
Identities = 32/134 (23%), Positives = 53/134 (39%), Gaps = 25/134 (18%)

Query: 216 LSIVGKPNAGKSSLLNAMLLEERA---LVSDIKGTTR-DTIEE-------------VIEL 258
+ ++ +AGK++L ++L A L S KGTTR D +
Sbjct: 6 IGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITSFQW 65

Query: 259 QGHKVRLIDTAGIRESTDKIERLGIEKSLKSLENCDIVLGVFDLSKPLEQEDFNLIDTLN 318
+ KV +IDT G + ++ R SL L D + + ++ + L L
Sbjct: 66 ENTKVNIIDTPGHMDFLAEVYR-----SLSVL---DGAILLISAKDGVQAQTRILFHALR 117

Query: 319 RTKKPCIVVLNKND 332
+ P I +NK D
Sbjct: 118 KMGIPTIFFINKID 131


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits       Score  E-value  Comments
HPKB_1357  LIPOLPP20  292    e-105    LPP20 lipoprotein precursor signature.
		>LIPOLPP20#LPP20 lipoprotein precursor signature.

Length = 175

Score = 292 bits (748), Expect = e-105
Identities = 171/175 (97%), Positives = 173/175 (98%)

Query: 1 MKSQVKKILGMSVIATMVIVGCSHAPKSGISKSNKAYKEATKGAPDWVVGDLEKVAKYEK 60
MK+QVKKILGMSV+A MVIVGCSHAPKSGISKSNKAYKEATKGAPDWVVGDLEKVAKYEK
Sbjct: 1 MKNQVKKILGMSVVAAMVIVGCSHAPKSGISKSNKAYKEATKGAPDWVVGDLEKVAKYEK 60

Query: 61 YSGVFLGRAEDLITNNDVDYSTNQATAKARANLAANLKSTLQKDLENEKTRTVDASGKRF 120
YSGVFLGRAEDLITNNDVDYSTNQATAKARANLAANLKSTLQKDLENEKTRTVDASGKR
Sbjct: 61 YSGVFLGRAEDLITNNDVDYSTNQATAKARANLAANLKSTLQKDLENEKTRTVDASGKRS 120

Query: 121 ISGTDTEKISQLVDKELIASKMLARYVGKDRVFVLVGLDKQIVDKVREELGMVKK 175
ISGTDTEKISQLVDKELIASKMLARYVGKDRVFVLVGLDKQIVDKVREELGMVKK
Sbjct: 121 ISGTDTEKISQLVDKELIASKMLARYVGKDRVFVLVGLDKQIVDKVREELGMVKK 175


21  HPKB_1480  HPKB_1488  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
HPKB_1480  -2      14        1.886797  flagellar hook-basal body protein FliE
HPKB_1481  -2      13        1.690730  flagellar basal body rod protein FlgC
HPKB_1482  -2      14        1.328498  flagellar basal body rod protein FlgB
HPKB_1483   0      12        1.658619  putative rod shape-determining protein
HPKB_1484  -1      13        0.349507  iron(III) ABC transporter, periplasmic
HPKB_1485   0      14        0.047652  hypothetical protein
HPKB_1486   2      13        0.316006  putative peroxidase
HPKB_1487   1      12       -0.233166  hypothetical protein
HPKB_1488   1      13       -0.404942  penicillin-binding protein 2 (pbp2)
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits         Score  E-value  Comments
HPKB_1480  FLGHOOKFLIE  77     6e-22    Flagellar hook-basal body complex protein FliE signa...
		>FLGHOOKFLIE#Flagellar hook-basal body complex protein FliE signature.

Length = 103

Score = 77.0 bits (189), Expect = 6e-22
Identities = 19/77 (24%), Positives = 40/77 (51%), Gaps = 1/77 (1%)

Query: 34 EQKGGEFSKLLKQSINELNNTQEQSDKALADMATGQIK-DLHQAAIAIGKAETSMKLMLE 92
Q F+ L +++ +++TQ + G+ L+ + KA SM++ ++
Sbjct: 27 PQPTISFAGQLHAALDRISDTQTAARTQAEKFTLGEPGVALNDVMTDMQKASVSMQMGIQ 86

Query: 93 VRNKAISAYKELLRTQI 109
VRNK ++AY+E++ Q+
Sbjct: 87 VRNKLVAAYQEVMSMQV 103


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits        Score  E-value  Comments
HPKB_1481  FLGHOOKAP1  29     0.011    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 28.8 bits (64), Expect = 0.011
Identities = 10/38 (26%), Positives = 15/38 (39%)

Query: 121 NVNAVVEMADLVEATRAYQANVAAFQSAKNMAQNAIGM 158
VN E +L + Y AN Q+A + I +
Sbjct: 508 GVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
HPKB_1484  FERRIBNDNGPP  35     4e-04    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 34.5 bits (79), Expect = 4e-04
Identities = 28/183 (15%), Positives = 77/183 (42%), Gaps = 10/183 (5%)

Query: 108 NVELLKKLSPDLVVTFVG-NPKAVEHAKKFGISFLSFQETT--IAEAMQAMQ--AQAAVL 162
N+ELL ++ P +V G P A+ +F + +A A +++ A L
Sbjct: 88 NLELLTEMKPSFMVWSAGYGPSPEMLARIAPGRGFNFSDGKQPLAMARKSLTEMADLLNL 147

Query: 163 EIDASKKFAKMQETLDFIADRL-KDVKKKKGVELFHKAN--KISGHQAISSDILEKGGID 219
+ A A+ ++ + + R K + + + G ++ +IL++ GI
Sbjct: 148 QSAAETHLAQYEDFIRSMKPRFVKRGARPLLLTTLIDPRHMLVFGPNSLFQEILDEYGIP 207

Query: 220 N-FGLKYVKFGRADISVEKIVK-ENPEIIFIWWVSPLTPEDVLNNPKFSTIKAIKNKQVY 277
N + + +G +S++++ ++ +++ + + ++ P + + ++ +
Sbjct: 208 NAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDHDNSKDMDALMATPLWQAMPFVRAGRFQ 267

Query: 278 KLP 280
++P
Sbjct: 268 RVP 270


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
HPKB_1485  FERRIBNDNGPP  34     5e-04    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 34.2 bits (78), Expect = 5e-04
Identities = 29/184 (15%), Positives = 75/184 (40%), Gaps = 12/184 (6%)

Query: 106 NVELLKKLSPDLVVTFVGNPKAVEHAKKF--GISFLSFQEKTIAEVMEDID---AQAKAL 160
N+ELL ++ P +V G + E + G F K + A L
Sbjct: 88 NLELLTEMKPSFMVWSAGYGPSPEMLARIAPGRGFNFSDGKQPLAMARKSLTEMADLLNL 147

Query: 161 EVDASKKLAKMQETLDFIADRLKGVKKKKGVELFHKAN----KISGHQALDSDILEKGGI 216
+ A LA+ ++ + + R + + + L + + G +L +IL++ GI
Sbjct: 148 QSAAETHLAQYEDFIRSMKPRFVK-RGARPLLLTTLIDPRHMLVFGPNSLFQEILDEYGI 206

Query: 217 DN-FGLKYVKFGRADISVEKIVK-ENPEIIFIWWISPLSPEDILNNPKFATIKAIKNKQV 274
N + + +G +S++++ ++ +++ + + ++ P + + ++ +
Sbjct: 207 PNAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDHDNSKDMDALMATPLWQAMPFVRAGRF 266

Query: 275 YKLP 278
++P
Sbjct: 267 QRVP 270


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
HPKB_1488  TYPE3IMPPROT  29     0.029    Type III secretion system inner membrane P protein ...
		>TYPE3IMPPROT#Type III secretion system inner membrane P protein family signature.

Length = 224

Score = 29.4 bits (66), Expect = 0.029
Identities = 9/23 (39%), Positives = 12/23 (52%)

Query: 4 LRYKLLLFVFIGFWGLLVLNLFI 26
KL+LFV + W LL L +
Sbjct: 195 TPIKLVLFVALDGWTLLSKGLIL 217



 
Contact Sachin Pundhir for Bugs/Comments.