PredictBias

identification of genomic and pathogenicity islands in prokaryotic genome
 
A) Input parameters
Genome: 2017.gbk
Threshold dinucleotide bias: 2
Threshold codon bias: 4
Threshold %GC bias: 3
E-value (RPSBlast): 0.05
Genome (non-pathogenic):
 
B) Compare a potential GI or PAI in related non-pathogenic sp. (phylogenetic tree)
 
C) Potential GIs and PAIs in CP002571
S.No  Start        End          Bias  Virulence  Insertion elements  Prediction
1     hp2017_0050  hp2017_0086  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
hp2017_0050   -1      13       -3.116075  hydrogenase maturation protein
hp2017_0051   -1      14       -3.359127  hypothetical protein
hp2017_0052   -2      9        -2.154608  Adenine-specific methyltransferase
hp2017_0053   -1      9        -2.207737  putative type II DNA modification enzyme
hp2017_0054   -1      10       -2.280351  hypothetical protein
hp2017_0055   -1      9        -1.117003  adenine/cytosine specific DNA methyltransferase
hp2017_0056   1       12       1.169575   Proline/sodium symporter/Propionate/sodium
hp2017_0057   2       14       0.371669   Proline dehydrogenase/Proline
hp2017_0058   3       15       -0.398567  hypothetical protein
hp2017_0059   3       15       -0.099178  hypothetical protein
hp2017_0060   2       15       0.291524   hypothetical protein
hp2017_0061   2       15       0.482613   hypothetical protein
hp2017_00621  1       17       0.335523   hypothetical protein
hp2017_00622  1       18       0.241689   hypothetical protein
hp2017_00623  2       19       -0.236749  hypothetical protein
hp2017_0063   0       16       -0.176554  hypothetical protein
hp2017_0065   1       14       0.463180   hypothetical protein
hp2017_0066   1       15       1.153005   hypothetical protein
hp2017_0067   0       13       1.291770   hypothetical protein
hp2017_0068   0       15       1.467596   ATP-binding cell division protein
hp2017_0069   2       14       2.185200   ATP-binding cell division protein
hp2017_0070   4       20       3.702453   Urease accessory protein
hp2017_0071   5       22       3.444778   Urease accessory protein
hp2017_0072   4       20       2.663413   Urease accessory protein
hp2017_0073   3       17       2.668203   Urease accessory protein
hp2017_0074   3       20       2.621437   Urease channel protein
hp2017_0075   2       20       2.205503   Urease alpha subunit
hp2017_0076   -1      12       1.143870   Urease beta/gamma subunit
hp2017_0077   0       13       0.775033   *Lipoprotein signal peptidase
hp2017_0078   0       14       1.348509   Phosphoglucosamine mutase
hp2017_0079   4       20       3.126265   30S ribosomal protein S20
hp2017_0080   4       19       2.922958   hypothetical protein
hp2017_0081   4       18       3.044961   Peptide chain release factor 1
hp2017_0082   6       21       2.839412   hypothetical protein
hp2017_0083   6       19       2.797275   hypothetical protein
hp2017_0084   5       16       1.764156   outer membrane protein
hp2017_0085   4       14       0.443304   hypothetical protein
hp2017_0086   4       16       0.904282   hypothetical protein
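
One way to read the %GCBias column against the "Threshold %GC bias" input is sketched below. This is a hedged illustration only: the page does not say how PredictBias applies its thresholds, so comparing the absolute value of the bias with the threshold is an assumption, and the names `OrfBias`, `ORFS` and `exceeds_gc_threshold` are hypothetical. The field values are transcribed from a few rows of the island 1 table above.

```python
from typing import NamedTuple


class OrfBias(NamedTuple):
    """One ORF row: locus tag plus the three bias columns of the table."""
    locus_tag: str
    dn_bias: int      # DNBias column
    cdn_bias: int     # CDNBias column
    gc_bias: float    # %GCBias column


# "Threshold %GC bias" as listed under the input parameters.
GC_BIAS_THRESHOLD = 3.0

# A handful of island 1 rows, transcribed by hand (not the full table).
ORFS = [
    OrfBias("hp2017_0050", -1, 13, -3.116075),
    OrfBias("hp2017_0051", -1, 14, -3.359127),
    OrfBias("hp2017_0052", -2, 9, -2.154608),
    OrfBias("hp2017_0070", 4, 20, 3.702453),
    OrfBias("hp2017_0075", 2, 20, 2.205503),
]


def exceeds_gc_threshold(orf, threshold=GC_BIAS_THRESHOLD):
    # Assumption: the sign is ignored and only the magnitude is compared.
    return abs(orf.gc_bias) >= threshold


flagged = [orf.locus_tag for orf in ORFS if exceeds_gc_threshold(orf)]
print(flagged)
```

Under that assumption, three of the five sample ORFs reach the threshold in magnitude; the other two bias columns could be screened the same way against their own thresholds.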
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
hp2017_0057   ANTHRAXTOXNA  31     0.036    Anthrax toxin LF subunit signature.
		>ANTHRAXTOXNA#Anthrax toxin LF subunit signature.

Length = 800

Score = 30.9 bits (69), Expect = 0.036
Identities = 36/173 (20%), Positives = 71/173 (41%), Gaps = 19/173 (10%)

Query: 121 QEESQLKERILKRKNEKIILNVNFIGEEVLGEEEANARFEKY---SQALKSNYIQYISIK 177
Q+ S+ ++ + + EK+ F+ E+ + + Y S+ K Y +
Sbjct: 118 QDLSEEEKNSMNSRGEKVPFASRFVFEKKRETPKLIINIKDYAINSEQSKEVYYEIGKGI 177

Query: 178 ITTIFSQINILDFEY-----SKKEIVKRLDALYALALEEEKKQGMPKFINLDMEEFRDLE 232
I S+ LD E+ S + D L++ +E K + K I+++ ++
Sbjct: 178 SLDIISKDKSLDPEFLNLIKSLSDDSDSSDLLFSQKFKE-KLELNNKSIDINF-----IK 231

Query: 233 LTVESFMESIAK-----FDLNAGIVLQAYIPDSYEYLKKLHAFSKERVLKGLK 280
+ F + + F + VL+ Y PD +EY+ KL E++ + LK
Sbjct: 232 ENLTEFQHAFSLAFSYYFAPDHRTVLELYAPDMFEYMNKLEKGGFEKISESLK 284
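
The statistics line printed above each alignment can be read mechanically. The sketch below parses such a line with a regular expression and recomputes the percentages; the helper name `parse_alignment_stats` is hypothetical, and the observation that integer division reproduces the printed percentages is checked only against the sample line used here.

```python
import re

# Matches each "Name = count/length (percent%)" group on a statistics line,
# e.g. "Identities = 36/173 (20%), Positives = 71/173 (41%), Gaps = 19/173 (10%)".
STATS_RE = re.compile(r"(Identities|Positives|Gaps) = (\d+)/(\d+) \((\d+)%\)")


def parse_alignment_stats(line):
    """Return {field: (count, alignment_length, reported_percent)}."""
    return {
        name: (int(count), int(length), int(percent))
        for name, count, length, percent in STATS_RE.findall(line)
    }


line = "Identities = 36/173 (20%), Positives = 71/173 (41%), Gaps = 19/173 (10%)"
stats = parse_alignment_stats(line)

for name, (count, length, reported) in stats.items():
    # Integer division truncates, which matches the figures on this line
    # (36/173 is about 20.8% but is printed as 20%).
    recomputed = count * 100 // length
    print(name, count, length, reported, recomputed)
```

Lines without a Gaps field (for example the short hits that report only Identities and Positives) are handled too, because the pattern collects whichever groups are present.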


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
hp2017_0061   GPOSANCHOR    36     1e-04    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 35.8 bits (82), Expect = 1e-04
Identities = 37/206 (17%), Positives = 75/206 (36%), Gaps = 2/206 (0%)

Query: 17 ELEARIGELEDENTELFTTKDKLTKENTELASQNTALTEKNKTLTTENDKLNHQVIALTN 76
+ E L+ +N++L L N EL + + EK + + ++ L
Sbjct: 61 KFEIENNTLKLKNSDLSFNNKALKDHNDELTEELSNAKEKLRKNDKSLSEKASKIQELEA 120

Query: 77 EQNSLEQERAELQDEHGFLEKSCANLEKENQRLTDKLKQLESAQKNLENSNTQLRQALEN 136
+ LE+ + LE E L + LE A + N +T ++
Sbjct: 121 RKADLEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKT 180

Query: 137 SNAQLAQAEEKIAEEKTELEREIARLKSLEGMEAKSDLDLANRRLASANQDLKRQKRKLE 196
A+ A E + AE + LE + + L+ LA+ DL++
Sbjct: 181 LEAEKAALEARQAELEKALEGAMNFSTADS--AKIKTLEAEKAALAARKADLEKALEGAM 238

Query: 197 EENIALKERVDSLKEQLFTLQPQKPQ 222
+ A ++ +L+ + L+ ++ +
Sbjct: 239 NFSTADSAKIKTLEAEKAALEARQAE 264



Score = 35.0 bits (80), Expect = 2e-04
Identities = 43/207 (20%), Positives = 78/207 (37%), Gaps = 2/207 (0%)

Query: 16 EELEARIGELEDENTELFTTKDKLTKENTELASQNTALTEKNKTLTTENDKLNHQVIALT 75
+ L+ EL +E + K K +E AS+ L + L + + A +
Sbjct: 81 KALKDHNDELTEELSNAKEKLRKNDKSLSEKASKIQELEARKADLEKALEGAMNFSTADS 140

Query: 76 NEQNSLEQERAELQDEHGFLEKSCANLEKENQRLTDKLKQLESAQKNLENSNTQLRQALE 135
+ +LE E+A L LEK+ + + K+K LE+ + LE +L +ALE
Sbjct: 141 AKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALE 200

Query: 136 NSNAQLAQAEEKIAEEKTELEREIARLKSLEGMEAKSDLDLANRRLASANQDLKRQKRKL 195
+ KI + E AR LE +A + ++ + L+ +K L
Sbjct: 201 GAMNFSTADSAKIKTLEAEKAALAARKADLE--KALEGAMNFSTADSAKIKTLEAEKAAL 258

Query: 196 EEENIALKERVDSLKEQLFTLQPQKPQ 222
E L++ ++ +
Sbjct: 259 EARQAELEKALEGAMNFSTADSAKIKT 285



Score = 34.7 bits (79), Expect = 3e-04
Identities = 41/219 (18%), Positives = 72/219 (32%), Gaps = 9/219 (4%)

Query: 4 LIEKWFGFSQIREELEARIGELEDENTELFTTKDKLTKENTELASQNTALTEKNKTLTTE 63
L + + E + L K L EL + + +
Sbjct: 153 LAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAK 212

Query: 64 NDKLNHQVIALTNEQNSLEQERAELQDEHGFLEKSCANLEKENQRLTDKLKQLESAQKNL 123
L + AL + LE+ + LE E L + +LE A +
Sbjct: 213 IKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGA 272

Query: 124 ENSNTQLRQALENSNAQLAQAEEKIAEEKTELEREIARLKSLEGMEAKSDLDLANRRLAS 183
N +T ++ A+ A E + A+ + + + A +SL S
Sbjct: 273 MNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDAS---------RE 323

Query: 184 ANQDLKRQKRKLEEENIALKERVDSLKEQLFTLQPQKPQ 222
A + L+ + +KLEE+N + SL+ L + K Q
Sbjct: 324 AKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQ 362


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
hp2017_00622  SHAPEPROTEIN  29     0.023    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 28.6 bits (64), Expect = 0.023
Identities = 17/58 (29%), Positives = 24/58 (41%), Gaps = 9/58 (15%)

Query: 39 RHVFDDEKTAKTFKVELRASEPCAYAISALKSYGFFKSEKLDKPVYYGVFDFGGGTTD 96
R + + + A +V L EP A AI A + + V D GGGTT+
Sbjct: 124 RAIRESAQGAGAREVFL-IEEPMAAAIGA--------GLPVSEATGSMVVDIGGGTTE 172


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
hp2017_0075   UREASE        1044   0.0      Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 1044 bits (2701), Expect = 0.0
Identities = 353/569 (62%), Positives = 442/569 (77%), Gaps = 4/569 (0%)

Query: 3 KISRKEYVSMYGPTTGDKVRLGDTDLIAEVEHDYTIYGEELKFGGGKTLREGMSQSN-NP 61
++SR Y +M+GPT GDKVRL DT+L EVE D+T +GEE+KFGGGK +R+GM QS
Sbjct: 4 RMSRAAYANMFGPTVGDKVRLADTELFIEVEKDFTTHGEEVKFGGGKVIRDGMGQSQVTR 63

Query: 62 SKEELDLIITNALIVDYTGIYKADIGIKDGKIAGIGKGGNKDMQDGVKNNLSVGPATEAL 121
+D +ITNALI+D+ GI KADIG+KDG+IA IGK GN DMQ GV + VGP TE +
Sbjct: 64 EGGAVDTVITNALILDHWGIVKADIGLKDGRIAAIGKAGNPDMQPGV--TIIVGPGTEVI 121

Query: 122 AGEGLIVTAGGIDTHIHFISPQQIPTAFASGVTTMIGGGTGPADGTNATTITPGRRNLKW 181
AGEG IVTAGG+D+HIHFI PQQI A SG+T M+GGGTGPA GT ATT TPG ++
Sbjct: 122 AGEGKIVTAGGMDSHIHFICPQQIEEALMSGLTCMLGGGTGPAHGTLATTCTPGPWHIAR 181

Query: 182 MLRAAEEYSMNLGFLAKGNTSNDASLADQIEAGAIGFKIHEDWGTTPSAINHALDVADKY 241
M+ AA+ + MNL F KGN S +L + + GA K+HEDWGTTP+AI+ L VAD+Y
Sbjct: 182 MIEAADAFPMNLAFAGKGNASLPGALVEMVLGGATSLKLHEDWGTTPAAIDCCLSVADEY 241

Query: 242 DVQVAIHTDTLNEAGCVEDTMAAIAGRTMHTFHTEGAGGGHAPDIIKVAGEHNILPASTN 301
DVQV IHTDTLNE+G VEDT+AAI GRT+H +HTEGAGGGHAPDII++ G+ N++P+STN
Sbjct: 242 DVQVMIHTDTLNESGFVEDTIAAIKGRTIHAYHTEGAGGGHAPDIIRICGQPNVIPSSTN 301

Query: 302 PTIPFTVNTEAEHMDMLMVCHHLDKSIKEDVQFADSRIRPQTIAAEDTLHDMGIFSITSS 361
PT P+TVNT AEH+DMLMVCHHL +I ED+ FA+SRIR +TIAAED LHD+G FSI SS
Sbjct: 302 PTRPYTVNTLAEHLDMLMVCHHLSPTIPEDIAFAESRIRKETIAAEDILHDIGAFSIISS 361

Query: 362 DSQAMGRVGEVITRTWQTADKNKKEFGRLKEEKGDNDNFRIKRYLSKYTINPAIAHGISE 421
DSQAMGRVGEV RTWQTADK K++ GRLKEE GDNDNFR+KRY++KYTINPAIAHG+S
Sbjct: 362 DSQAMGRVGEVAIRTWQTADKMKRQRGRLKEETGDNDNFRVKRYIAKYTINPAIAHGLSH 421

Query: 422 YVGSVEVGKVADLVLWSPAFFGVKPNMIIKGGFIALSQMGDANASIPTPQPVYYREMFAH 481
+GS+EVGK ADLVLW+PAFFGVKP+M++ GG IA + MGD NASIPTPQPV+YR MF
Sbjct: 422 EIGSLEVGKRADLVLWNPAFFGVKPDMVLLGGTIAAAPMGDPNASIPTPQPVHYRPMFGA 481

Query: 482 HGKAKYDANITFVSQAAYDKGIKEELGLERQVLPVKNCR-NITKKDMQFNDTTAHIEVNP 540
+G+++ ++++TFVSQA+ D G+ LG+ ++++ V+N R I K M N T HIEV+P
Sbjct: 482 YGRSRTNSSVTFVSQASLDAGLAGRLGVAKELVAVQNTRGGIGKASMIHNSLTPHIEVDP 541

Query: 541 ETYHVFVDGKEVTSKPANKVSLAQLFSIF 569
ETY V DG+ +T +PA + +AQ + +F
Sbjct: 542 ETYEVRADGELLTCEPATVLPMAQRYFLF 570


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
hp2017_0084   FLAGELLIN     33     0.002    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 33.5 bits (76), Expect = 0.002
Identities = 32/285 (11%), Positives = 80/285 (28%), Gaps = 6/285 (2%)

Query: 17 SVLLGSMNATDLETYAALQKPSHVFSNYAKKSNKGSELSSDSLTQQQAQNTAQSDTTQAT 76
++ L ++ L + KS+ + D+ + ++
Sbjct: 156 TIDLQKIDVKSLGLDGFNVNGPKEATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVV 215

Query: 77 TLENTASSGTP----DSSTLPTKETPPATSGGTGGDKHTASSGTPPASSTPPAKKDETSG 132
T + ++ T + + +++GT A + A K G
Sbjct: 216 TDTTAPTVPDKVYVNAANGQLTTDDAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEG 275

Query: 133 SGDKDQHTASGTGGTPSSSGGTGGDKHTASSGTPPASSTPPTPTPPTSGGNTITSQLTKD 192
D + T T + + G G T + + T T+ S
Sbjct: 276 D-TFDYKGVTFTIDTKTGNDGNGKVSTTINGEKVTLTVADITAGAANVDAATLQSSKNVY 334

Query: 193 TTTVNNLKSVSVSAMNTTLSGVTQLSQQTATISTLLNGSPNLGSVISNAQGLSSAFSALE 252
T+ VN + N + + + + + + + ++ A +
Sbjct: 335 TSVVNGQFTFDDKTKNESAKLSDLEANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMF 394

Query: 253 SAQNTLKGYLDSSSATIGQLTNGSNAVVGALDKAINQVDMALADL 297
+ + + + ++D A+++VD + L
Sbjct: 395 IDKTASGVSTLINEDAAAAKK-STANPLASIDSALSKVDAVRSSL 438



Score = 31.2 bits (70), Expect = 0.013
Identities = 38/306 (12%), Positives = 85/306 (27%), Gaps = 10/306 (3%)

Query: 55 SSDSLTQQQAQNTAQSDTTQATTLENTASSGTPDSSTLPTKETPPATSGGTGGDKHTASS 114
SL + T + + D+ + + + G TA +
Sbjct: 163 DVKSLGLDGFNVNGPKEATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPT 222

Query: 115 GTPPASSTPPAKKDETSGSGDKDQHTASGTGGTPSSSGGTGGDKHTASSGTPPASSTPPT 174
A T+ + + ++ A G +
Sbjct: 223 VPDKVYVNA-ANGQLTTDDAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYK 281

Query: 175 PTPPTSGGNTITSQLTKDTTTVNNLKSVSVSAMNTTLSGVTQLSQQTA---TISTLLNGS 231
T T K +TT+N K A T + + + ++++NG
Sbjct: 282 GVTFTIDTKTGNDGNGKVSTTINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQ 341

Query: 232 PNLGSVISNAQGLSSAFSALESAQNTLKGYLDSSSATIGQLTNGSNAVVGALDKAINQVD 291
N S A + + K ++ + T + +
Sbjct: 342 FTFDDKTKNESAKLSDLEANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFIDKTASG 401

Query: 292 MALADLATADTQKTQAVALVAASNSATTTTDAINFLNALKANLTAQKDAFMSVHKNIQTA 351
++ A A + +N + A++ ++A++++L A ++ F S N+
Sbjct: 402 VSTLINEDAAA------AKKSTANPLASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNT 455

Query: 352 VAQAQA 357
V +
Sbjct: 456 VTNLNS 461


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
hp2017_0086   CABNDNGRPT    32     0.005    NodO calcium binding signature.
		>CABNDNGRPT#NodO calcium binding signature.

Length = 479

Score = 32.2 bits (73), Expect = 0.005
Identities = 27/144 (18%), Positives = 54/144 (37%), Gaps = 10/144 (6%)

Query: 433 INSMDNTHANDSKDQGGNALINPNSTTNDDHNDDHMDTNTTDTGNANDTPTDDKDAGGNN 492
I ++ + + + G+++ NS T+ D T T ++ DAGG +
Sbjct: 251 IAAIQRLYGANMTTRTGDSVYGFNSNTDRDF--------YTATDSSKALIFSVWDAGGTD 302

Query: 493 TGDTGDMNNTDTGNTDTGNTDDMSNMNNGNGDTGNANDDMGNSNDMGDDMNNANDMNDDM 552
T D +N N + G+ D+ + + G+D+ N ++ +
Sbjct: 303 TFDFSGYSNNQRINLNEGSFSDVGGLKGNVSIAHGVTIENAIGGS-GNDILVGNSADNIL 361

Query: 553 -GNSNDDMGDMGDMNDDMGGDMGD 575
G + +D+ G D + G G
Sbjct: 362 QGGAGNDVLYGGAGADTLYGGAGR 385


2     hp2017_0106  hp2017_0112  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
hp2017_0106   3       13       0.087046   hypothetical protein
hp2017_0107   2       13       2.266146   hypothetical protein
hp2017_01081  1       14       3.347523   Methyl-accepting chemotaxis protein
hp2017_01082  1       13       3.394829   Methyl-accepting chemotaxis protein
hp2017_0110   -1      11       3.646984   2',3'-cyclic-nucleotide 2'-phosphodiesterase
hp2017_0111   -2      12       4.753741   LuxS family protein
hp2017_0112   -1      13       4.023387   Cystathionine gamma-synthase/ cystathione
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
hp2017_0111   LUXSPROTEIN   225    6e-79    Bacterial autoinducer-2 (AI-2) production protein Lu...
		>LUXSPROTEIN#Bacterial autoinducer-2 (AI-2) production protein LuxS signature.

Length = 171

Score = 225 bits (575), Expect = 6e-79
Identities = 57/145 (39%), Positives = 91/145 (62%), Gaps = 7/145 (4%)

Query: 5 VESFNLDHTKVKAPYVRVADRKKGVNGDLIVKYDVRFKQPNQDHMDMPSLHSLEHLVAEI 64
++SF +DHT++ AP VRVA + GD I +D+RF PN+D + +H+LEHL A
Sbjct: 3 LDSFTVDHTRMNAPAVRVAKTMQTPKGDTITVFDLRFTAPNKDILSEKGIHTLEHLYAGF 62

Query: 65 IRNHA----SYVVDWSPMGCQTGFYLTVLNHDNYTEILEVLEKTMQDVLKAK---EVPAS 117
+RNH ++D SPMGC+TGFY++++ + ++ + M+DVLK + ++P
Sbjct: 63 MRNHLNGDSVEIIDISPMGCRTGFYMSLIGTPSEQQVADAWIAAMEDVLKVENQNKIPEL 122

Query: 118 NEKQCGWAANHTLEGAQNLARAFLD 142
NE QCG AA H+L+ A+ +A+ L+
Sbjct: 123 NEYQCGTAAMHSLDEAKQIAKNILE 147
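
Every virulence-factor hit listed on this page has an E-value at or below the "E-value (RPSBlast)" input of 0.05. A minimal sketch of that kind of cutoff is shown below, using four hits transcribed from the page. The function name `hits_within`, the use of `<=` rather than `<`, and the stricter 1e-5 cutoff are illustrative assumptions, not documented PredictBias behaviour.

```python
# "E-value (RPSBlast)" as listed under the input parameters.
EVALUE_CUTOFF = 0.05

# (locus tag, profile name, E-value) for four hits reported on this page.
HITS = [
    ("hp2017_0057", "ANTHRAXTOXNA", 0.036),
    ("hp2017_0061", "GPOSANCHOR", 1e-04),
    ("hp2017_0075", "UREASE", 0.0),
    ("hp2017_0111", "LUXSPROTEIN", 6e-79),
]


def hits_within(hits, cutoff):
    """Keep (locus, profile) pairs whose E-value does not exceed the cutoff."""
    return [(locus, profile) for locus, profile, evalue in hits if evalue <= cutoff]


reported = hits_within(HITS, EVALUE_CUTOFF)  # cutoff taken from the page
strict = hits_within(HITS, 1e-5)             # hypothetical stricter cutoff

print(len(reported), "hits pass the page cutoff")
print(strict)
```

With the page's cutoff all four sample hits are kept, while the hypothetical stricter cutoff keeps only the urease and LuxS matches, which are also the two with high-identity alignments.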


3     hp2017_0188  hp2017_0209  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
hp2017_0188   2       12       0.393436   Lysyl-tRNA synthetase class II
hp2017_0189   3       14       0.591791   Serine hydroxymethyltransferase
hp2017_0190   1       15       -0.205459  hypothetical protein
hp2017_0191   2       15       0.403068   hypothetical protein
hp2017_01921  1       13       2.693806   hypothetical protein
hp2017_01922  0       11       2.893711   hypothetical protein
hp2017_0193   -1      10       2.206219   hypothetical protein
hp2017_0194   -1      8        2.347503   putative cardiolipin synthetase
hp2017_0195   -1      11       3.230463   Fumarate reductase iron-sulfur protein
hp2017_0196   0       11       3.268956   Fumarate reductase flavoprotein subunit
hp2017_0197   -1      14       1.765995   Fumarate reductase cytochrome b subunit
hp2017_0198   -2      17       1.599533   Triose-phosphate isomerase
hp2017_0199   -2      17       2.567922   Enoyl-acyl-carrier-protein reductase
hp2017_0200   -2      18       2.491840   UDP-3-O-3-hydroxymyristoyl glucosamine
hp2017_0201   -2      16       2.818565   S-adenosylmethionine synthetase
hp2017_0202   -2      17       1.980179   Nucleoside diphosphate kinase
hp2017_0203   -3      17       1.437171   hypothetical protein
hp2017_0204   0       13       -3.624008  50S ribosomal protein L32
hp2017_0205   0       12       -2.515952  putative glycerol-3-phosphate acyltransferase
hp2017_0206   0       13       -3.304142  3-oxoacyl-acyl-carrier-protein synthase III
hp2017_0207   2       14       -4.501106  hypothetical protein
hp2017_0208   2       11       -3.764141  hypothetical protein
hp2017_0209   0       9        -3.446065  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
hp2017_01922  FRAGILYSIN    28     0.029    Fragilysin metallopeptidase (M10C) enterotoxin signat...
		>FRAGILYSIN#Fragilysin metallopeptidase (M10C) enterotoxin signature.

Length = 405

Score = 28.5 bits (63), Expect = 0.029
Identities = 22/103 (21%), Positives = 44/103 (42%), Gaps = 2/103 (1%)

Query: 7 EDNKKLYDIIDGQQRTTTIFMLLHVLASKQNEKDKQETRKYLYQKGELKLEVAPQNQSFF 66
DN+ + + +G+ + +T F+L A + ++ + Y++ ++ E+A + F
Sbjct: 93 LDNENVR-LFNGRDKDSTSFILGDEFAVLRFYRNGESISYIAYKEAQMMNEIAEFYAAPF 151

Query: 67 KTLLEAAEKENISHCEKDADTEGKQNLFEVLKAILDKVSKLSG 109
K EKE C D+ T +K +DK K+
Sbjct: 152 KKTRAINEKE-AFECIYDSRTRSAGKDIVSVKINIDKAKKILN 193


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
hp2017_0199   DHBDHDRGNASE  60     8e-13    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 60.1 bits (145), Expect = 8e-13
Identities = 61/263 (23%), Positives = 108/263 (41%), Gaps = 29/263 (11%)

Query: 4 LKGKKGLIVGVANNKSIAYGIAQSCFNQGATL-AFTYLNESLEKRVRPIAQELNSPYVYE 62
++GK I G A + I +A++ +QGA + A Y E LEK V + E +
Sbjct: 6 IEGKIAFITGAA--QGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFP 63

Query: 63 LDVSKEEHFKSLYNNIKKDLGSLDFIVHSVAF--------APKEALEGSLLETSKSAFNT 114
DV + I++++G +D +V+ E E + S FN
Sbjct: 64 ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNA 123

Query: 115 AMEISVYSLIELTNTLKPLLNNGASVLTLSYLGSTKYMAHYNVMGLAKAALESAVRYLAV 174
+ +S Y + + ++ + +N A V S MA Y +KAA + L +
Sbjct: 124 SRSVSKYMMDRRSGSIVTVGSNPAGVPRTS-------MAAY---ASSKAAAVMFTKCLGL 173

Query: 175 DLGRHHIRVNALSAGPIRT-----LASSGIADFRMILKWNE---INAPLRKNVSLEEVGN 226
+L ++IR N +S G T L + ++I E PL+K ++ +
Sbjct: 174 ELAEYNIRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIAD 233

Query: 227 AGMYLLSSLSNGVSGEVHFVDAG 249
A ++L+S + ++ VD G
Sbjct: 234 AVLFLVSGQAGHITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
hp2017_0209   PF01540       34     0.003    Adhesin lipoprotein
		>PF01540#Adhesin lipoprotein

Length = 475

Score = 33.9 bits (77), Expect = 0.003
Identities = 32/127 (25%), Positives = 60/127 (47%), Gaps = 10/127 (7%)

Query: 209 IGKGKQKQLSKIYSHF-KKLSEGEIKPQNEGILKKLKSLDEIFKTTDFTRFTPKTEIKDI 267
+ K K+L++I + KKL+E K +N G+ + K +E F+ + +
Sbjct: 340 VKKAWSKELAEIKAEDDKKLAEENQKIKN-GVEELKKINNEAFELSK--------TVNKT 390

Query: 268 IKEIDEKYPINENFKRQFRTFRSSIGNLKKKINSLKYLEKTREDFERKKESWIKEIGNDC 327
I E+++K+ I+ +FK Q + F + + ++I+ + T+E F + KEI
Sbjct: 391 IAELEKKFKIDVSFKEQLKNFADDLLDKSRQIDEFTTVTSTQEGFTLAELESFKEITTTW 450

Query: 328 KNECNSE 334
N SE
Sbjct: 451 FNGMKSE 457



Score = 33.6 bits (76), Expect = 0.004
Identities = 26/124 (20%), Positives = 50/124 (40%), Gaps = 8/124 (6%)

Query: 211 KGKQKQLSKIYSHFKKLSEGEIKPQNEGILKKLKSLDEIFKTTDFTRFTPKTEIKDIIKE 270
K K+L++I + K E + EG + LK ++I D I I +
Sbjct: 221 KAWSKELAEIKAEDDKKLAEENQKIKEGAKELLKLSEKIQSFAD--------TIALTITK 272

Query: 271 IDEKYPINENFKRQFRTFRSSIGNLKKKINSLKYLEKTREDFERKKESWIKEIGNDCKNE 330
++ K+ I+E FK+Q + + ++ + + ++DF + KE +
Sbjct: 273 LERKFQIDEKFKKQLISTIELLNKKSVEVKTFATVNTIKKDFLLSELESFKEFNTSWLEK 332

Query: 331 CNSE 334
SE
Sbjct: 333 IVSE 336


4     hp2017_0304  hp2017_0343  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
hp2017_0304   1       14       3.659354   50S ribosomal protein L21
hp2017_0305   1       15       3.757127   50S ribosomal protein L27
hp2017_0306   1       15       3.743862   Periplasmic dipeptide transport substrate
hp2017_0307   0       15       4.175269   Dipeptide transport system permease protein
hp2017_0308   -1      14       3.548963   Dipeptide transport system permease protein
hp2017_0309   -3      14       3.025253   Peptide ABC transporter ATP-binding protein
hp2017_0310   -2      14       2.624793   Dipeptide transport ATP-binding protein
hp2017_0311   -2      13       2.157081   GTPase
hp2017_0312   -1      13       1.574967   hypothetical protein
hp2017_0313   0       15       2.214159   hypothetical protein
hp2017_0314   1       16       2.745965   Glutamate-1-semialdehyde aminotransferase
hp2017_0315   4       18       2.050998   hypothetical protein
hp2017_0316   4       16       1.844378   hypothetical protein
hp2017_0317   2       17       0.410072   Putative amylohydrolase
hp2017_0318   1       15       -0.811262  Putative polysaccharide deacetylase
hp2017_0319   1       19       -2.472246  hypothetical protein
hp2017_0320   1       17       -2.593060  putative ATP-binding protein
hp2017_0321   1       20       -4.034313  putative nitrite exclusion protein
hp2017_0322   1       21       -4.217783  hypothetical protein
hp2017_0323   0       18       -2.643383  ATP-binding ABC transporter protein
hp2017_0324   1       15       -1.750405  hypothetical protein
hp2017_0325   4       14       -2.003015  Putative heme iron utilization protein
hp2017_0326   5       14       -2.391808  hypothetical protein
hp2017_0327   1       14       -1.860756  Arginyl-tRNA synthetase
hp2017_0328   2       13       -1.526579  putative twin-arginine translocation protein
hp2017_0329   1       13       -1.707577  Guanylate kinase
hp2017_0330   1       14       -1.814612  poly E-rich protein
hp2017_0331   -1      14       -2.080089  membrane bound endonuclease
hp2017_0332   1       14       -2.036589  putative outer membrane protein
hp2017_0333   2       16       -2.016042  Flagellar basal body L-ring protein
hp2017_03341  2       13       -1.763913  CMP-N-Acetylneuraminate cytidylyltransferase
hp2017_03342  2       12       -1.066409  CMP-N-Acetylneuraminate cytidylyltransferase
hp2017_0335   2       12       -0.893390  putative flagellar biosynthesis protein
hp2017_0336   1       13       0.471028   Tetraacyldisaccharide 4'-kinase
hp2017_0337   1       15       1.365064   NAD synthetase
hp2017_0338   0       16       1.624156   *Ketol-acid reductoisomerase
hp2017_0339   0       16       0.614973   Septum site-determining protein
hp2017_0340   1       15       -0.938648  Cell division topological specificity factor
hp2017_0341   0       12       -0.117833  putative DNA processing chain A
hp2017_0342   1       14       -0.659175  Putative holliday junction resolvase
hp2017_0343   2       13       -0.162833  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
hp2017_0321   TCRTETB       31     0.006    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 31.4 bits (71), Expect = 0.006
Identities = 36/193 (18%), Positives = 77/193 (39%), Gaps = 1/193 (0%)

Query: 23 VLIPLLILSGSLTPHQSFQLGIAVLMGYVFGSFLIQFLSPLMSLESIAKISFGLIALSFL 82
V +P + + P + + A ++ + G+ + LS + ++ + + +
Sbjct: 35 VSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSV 94

Query: 83 VCYFDSIPFFWLWIWRFIAGVASSALMILVAPLSLPYVKEHKKALVGGLIFSAVGIGSVF 142
+ + F L + RFI G ++A LV + Y+ + + GLI S V +G
Sbjct: 95 IGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGV 154

Query: 143 SGFVLPWISSYNIKWAWIFLGGSCLIAFILSLVGLKTRSLRKKSVKKEESAFKIPFHLWL 202
+ I+ Y I W+++ L I + L+ L + +R K + + +
Sbjct: 155 GPAIGGMIAHY-IHWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDIKGIILMSVGIVF 213

Query: 203 LLISCALNAIGFL 215
++ +I FL
Sbjct: 214 FMLFTTSYSISFL 226


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
hp2017_0329   PF05272       29     0.011    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.011
Identities = 9/18 (50%), Positives = 11/18 (61%)

Query: 8 LILSGPSGAGKSTLTKYL 25
++L G G GKSTL L
Sbjct: 599 VVLEGTGGIGKSTLINTL 616


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
hp2017_0330   IGASERPTASE   59     5e-11    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 58.5 bits (141), Expect = 5e-11
Identities = 65/301 (21%), Positives = 110/301 (36%), Gaps = 36/301 (11%)

Query: 185 EVKEMQEEVKEMQEEVKEKQKQEVAENPQD-EEKPKDDETQGSVEPPKDEEVSKELET-Q 242
EV++ + V + +V P + EE + DE V PP S+ ET
Sbjct: 984 EVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEA--PVPPPAPATPSETTETVA 1041

Query: 243 EQEPIKEETQEIKEEKQEKTQDSPNVQELEAMQELVKEIQENSNDQENKKETQETQEN-T 301
E + +T E E+ +T EA + Q N Q ET+ETQ T
Sbjct: 1042 ENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQS-GSETKETQTTET 1100

Query: 302 ETPQDIETQELEIPKEEETQEIAEKTQAQGLEKEEIAETPQEKEIQETQDETPQELEVQD 361
+ +E KEE+ + EKTQ E P+ + E + ++ Q
Sbjct: 1101 KETATVE-------KEEKAKVETEKTQ----------EVPKVTSQVSPKQEQSETVQPQA 1143

Query: 362 EKLQENETPKDENMQESAQNLQEKETQELETPQTQEDHYENIEDIPEPVMTKAMGEELPF 421
E +EN+ N++E ++Q T T++ E ++ +PV
Sbjct: 1144 EPARENDP---------TVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGN- 1193

Query: 422 LNENDTETPKENDTETPKESVIKTPQEKEESDKTSSPLELRLNLQDLLKSLNQESLKSLL 481
+ E P+ T + +V K ++ S + N++ S N S +L
Sbjct: 1194 ---SVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALC 1250

Query: 482 E 482
+
Sbjct: 1251 D 1251



Score = 57.4 bits (138), Expect = 9e-11
Identities = 57/258 (22%), Positives = 82/258 (31%), Gaps = 36/258 (13%)

Query: 148 EALAKEEPNNEEQLLPTLNEQEGETPKEEAQEEVKKEEVKEMQ-EEVKEMQEEVKEKQKQ 206
E +A+ + P + ET E +++E K E E E EV ++ K
Sbjct: 1015 EEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKS 1074

Query: 207 EVAENPQDEEKPKDDETQGSVEPPKDEEVSKELETQEQEPIKEETQEIKEEKQ----EKT 262
V N Q E + S+ ETQ E + T E KEEK EKT
Sbjct: 1075 NVKANTQTNEV--------------AQSGSETKETQTTETKETATVE-KEEKAKVETEKT 1119

Query: 263 QDSPNVQ-ELEAMQELVKEIQENSNDQENKKET---QETQENTETPQDIETQELEIPKEE 318
Q+ P V ++ QE + +Q + T +E Q T T D E E
Sbjct: 1120 QEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNV 1179

Query: 319 ETQEIAEKTQAQGLEKEEIAETPQEKEIQETQDETPQELEVQDEKLQENETPKDENMQES 378
E T G + E P+ TQ E PK+ + +
Sbjct: 1180 EQPVTESTTVNTG---NSVVENPENTTPATTQPTVNSESS---------NKPKNRHRRSV 1227

Query: 379 AQNLQEKETQELETPQTQ 396
E +
Sbjct: 1228 RSVPHNVEPATTSSNDRS 1245



Score = 47.8 bits (113), Expect = 9e-08
Identities = 39/230 (16%), Positives = 68/230 (29%), Gaps = 27/230 (11%)

Query: 148 EALAKEEPNNEEQLLPTLNEQEGETPK-------EEAQEEVKKEEVKEMQEEVKEMQEEV 200
E + +E NEQ+ +EA+ VK + +E
Sbjct: 1036 TTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKET 1095

Query: 201 KEKQKQEVAENPQDEEKPKDDETQGSVEPPKDEEVSKELETQEQEPIKEETQEIKEEKQE 260
+ + +E E K+E+ E E ++ P K+E+ E
Sbjct: 1096 QTTETKE----TATVE--------------KEEKAKVETEKTQEVPKVTSQVSPKQEQSE 1137

Query: 261 KTQDSPNVQELEAMQELVKEIQENSNDQ-ENKKETQETQENTETPQDIETQELEIPKEEE 319
Q +KE Q +N + ++ +ET N E P T E
Sbjct: 1138 TVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVE 1197

Query: 320 TQEIAEKTQAQGLEKEEIAETPQEKEIQETQDETPQELEVQDEKLQENET 369
E Q E + P+ + + + P +E + T
Sbjct: 1198 NPENTTPATTQPTVNSESSNKPKNRHRRSVRSV-PHNVEPATTSSNDRST 1246



Score = 42.7 bits (100), Expect = 3e-06
Identities = 28/160 (17%), Positives = 59/160 (36%), Gaps = 7/160 (4%)

Query: 151 AKEEPNNEEQ-LLPTLNEQEGETPKEEAQEEVKKEEVKEMQEEVKEMQEE---VKEKQKQ 206
KE E++ E+ E PK +Q K+E+ + +Q + + +E V K+ Q
Sbjct: 1100 TKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQ 1159

Query: 207 EVAENPQDEEKPKDDETQGSVEPPKDEEVSKELETQEQEPIKEETQEIKEEKQEKTQDSP 266
D E+P + + +P + + + P + ++ + P
Sbjct: 1160 SQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKP 1219

Query: 267 ---NVQELEAMQELVKEIQENSNDQENKKETQETQENTET 303
+ + + ++ V+ +SND+ T NT
Sbjct: 1220 KNRHRRSVRSVPHNVEPATTSSNDRSTVALCDLTSTNTNA 1259


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
hp2017_0333   FLGLRINGFLGH  194    1e-64    Flagellar L-ring protein signature.
		>FLGLRINGFLGH#Flagellar L-ring protein signature.

Length = 232

Score = 194 bits (495), Expect = 1e-64
Identities = 52/172 (30%), Positives = 84/172 (48%), Gaps = 18/172 (10%)

Query: 56 GERPLFADRRAMKPNDLITIIVSEKASANYSSS----KDYKSASGGNSTPPRLTYNGLDE 111
G +PLF DRR D +TI++ E SA+ SSS +D K+ G ++ P L GL
Sbjct: 61 GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYL--QGLFG 118

Query: 112 RKKKEAEYLDDKNNYNFTKSSNNTNFKGGGSQKKSEDLEIVLSARIIKVLENGNYFIYGN 171
+ + E S F G G S L+ + +VL NGN + G
Sbjct: 119 NARADVEA------------SGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGE 166

Query: 172 KEVLVDGEKQILKVSGVIRPYDIERNNTIQSKFLADAKIEYTNLGHLSDSNK 223
K++ ++ + ++ SGV+ P I +NT+ S +ADA+IEY G+++++
Sbjct: 167 KQIAINQGTEFIRFSGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQN 218


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
hp2017_0343   SYCDCHAPRONE  28     0.003    Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 28.4 bits (63), Expect = 0.003
Identities = 12/36 (33%), Positives = 20/36 (55%)

Query: 23 MASQTPKELYDLGVESYKAKDYIKAKKYFEKACGLN 58
++S T ++LY L Y++ Y A K F+ C L+
Sbjct: 31 ISSDTLEQLYSLAFNQYQSGKYEDAHKVFQALCVLD 66


5     hp2017_0455  hp2017_0466  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
hp2017_0455   1       13       -3.158020  Molybdenum ABC transporter periplasmic binding
hp2017_0456   1       12       -3.468724  Molybdenum transport system permease protein
hp2017_0457   0       10       -1.699505  Molybdenum transport ATP-binding protein
hp2017_0458   -1      11       -2.113277  Glutamyl-tRNA synthetase
hp2017_04591  -2      13       -2.720502  adenine specific DNA methyltransferase
hp2017_04592  -2      12       -1.754424  adenine specific DNA methyltransferase
hp2017_04593  -1      12       -1.225990  adenine specific DNA methyltransferase
hp2017_0460   -1      16       -0.038253  GTP-binding protein
hp2017_0461   1       24       -3.624542  Type II adenine specific DNA methyltransferase
hp2017_04621  0       25       -3.579983  hypothetical protein
hp2017_04622  -2      21       -2.375347  hypothetical protein
hp2017_04631  5       19       0.293454   Type II DNA modification enzyme
hp2017_04632  6       18       0.755314   Type II DNA modification enzyme
hp2017_04641  6       19       0.619321   hypothetical protein
hp2017_04642  5       17       1.262052   hypothetical protein
hp2017_0465   5       18       1.889515   Catalase like protein
hp2017_0466   5       18       1.249850   Outer membrane protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
hp2017_0457   PF05272       30     0.009    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.4 bits (68), Expect = 0.009
Identities = 12/32 (37%), Positives = 17/32 (53%)

Query: 30 VVALLGESGAGKSTILRILAGLEAVSSGYIEA 61
V L G G GKST++ L GL+ S + +
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGLDFFSDTHFDI 629


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
hp2017_0460   TCRTETOQM     198    1e-57    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 198 bits (505), Expect = 1e-57
Identities = 116/461 (25%), Positives = 190/461 (41%), Gaps = 67/461 (14%)

Query: 3 NIRNIAVIAHVDHGKTTLVDGLLSQSGTFSEREKVDE--RVMDSNDLERERGITILSKNT 60
I NI V+AHVD GKTTL + LL SG +E VD+ D+ LER+RGITI + T
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 61 AIYYKDTKINIIDTPGHADFGGEVERVLKMVDGVLLLVDAQEGVMPQTKFVVKKALSFGI 120
+ +++TK+NIIDTPGH DF EV R L ++DG +LL+ A++GV QT+ + GI
Sbjct: 62 SFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGI 121

Query: 121 CPIVVVNKIDKPAAEPDRVVDEVFDLF---------VAMGASDKQLDFPV-----VYAAA 166
I +NKID+ + V ++ + V + + +F
Sbjct: 122 PTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEG 181

Query: 167 RDGYAMKSLDDE----------------------------KKNL--EPLFETILEHVPSP 196
D K + + K N+ + L E I S
Sbjct: 182 NDDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVITNKFYSS 241

Query: 197 SGSVDEPLQMQIFTLDYDNYVGKIGIARVFNGSVKKNESVLLMKSDGSKENGRITKLIGF 256
+ L ++F ++Y ++ R+++G + +SV + KE +IT++
Sbjct: 242 THRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRI----SEKEKIKITEMYTS 297

Query: 257 LGLARTEIENAYAGDIVALAG--FNAMDV-GDSVVDPTNPMPLDPMHLEEPTMSVYFAVN 313
+ +I+ AY+G+IV L V GD+ + P +P P + +
Sbjct: 298 INGELCKIDKAYSGEIVILQNEFLKLNSVLGDTKLLPQRERIENP----LPLLQTTVEPS 353

Query: 314 DSPLAGLEGKHVTANKLKDRLLKEMQTNIAMKCEEMGEGKFKVSGRGELQITILAENLRR 373
+ + D LL+ + + +S G++Q+ + L+
Sbjct: 354 KPQQREMLLDALLEISDSDPLLRYYVDSAT--------HEIILSFLGKVQMEVTCALLQE 405

Query: 374 E-RFEFSISRPEVIIKEENGVKCEPFEHLVIDTPQDFSGAI 413
+ E I P VI E K E H+ + P F +I
Sbjct: 406 KYHVEIEIKEPTVIYMERPLKKAEYTIHIEVP-PNPFWASI 445



Score = 41.8 bits (98), Expect = 8e-06
Identities = 20/80 (25%), Positives = 30/80 (37%), Gaps = 1/80 (1%)

Query: 396 EPFEHLVIDTPQDFSGAIIERLGKRKAEMKAMNPMSDGYTRLEFEIPARGLIGYRSEFLT 455
EP+ I PQ++ K A + + + L EIPAR + YRS+
Sbjct: 537 EPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQ-LKNNEVILSGEIPARCIQEYRSDLTF 595

Query: 456 DTKGEGVMNHSFLEFRPFSG 475
T G V + +G
Sbjct: 596 FTNGRSVCLTELKGYHVTTG 615


6     hp2017_0488  hp2017_0529  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
hp2017_0488   2       16       -2.717344  hypothetical protein
hp2017_0489   1       14       -2.440351  hypothetical protein
hp2017_0490   1       13       -2.115869  hypothetical protein
hp2017_0491   1       12       -1.840546  Glutamine synthetase type I
hp2017_04921  0       12       -3.097597  RloF like protein
hp2017_04922  -1      11       -2.553757  RloF like protein
hp2017_0493   -3      10       -1.736373  50S ribosomal protein L9
hp2017_0494   -2      11       -2.039039  ATP-dependent protease peptidase subunit
hp2017_0495   -2      12       -2.698254  ATP-dependent protease ATP binding subunit
hp2017_0496   -1      18       -3.185459  GTP-binding protein
hp2017_0497   0       20       -4.596111  putative periplasmic protein
hp2017_0498   6       22       -4.122656  hypothetical protein
hp2017_0499   10      23       -3.819364  putative IS606 transposase
hp2017_0500   10      22       -3.497222  hypothetical protein
hp2017_0501   7       19       -2.458185  cag pathogenicity island protein
hp2017_0502   8       18       -2.491062  cag island protein
hp2017_0503   9       18       -2.246279  cag pathogenicity island protein
hp2017_0504   7       18       -1.667719  cag island protein
hp2017_0505   9       18       -2.389702  hypothetical protein
hp2017_0506   10      19       -2.719472  Type IV secretion system protein
hp2017_0507   10      22       -3.267686  ATPase
hp2017_0508   10      22       -3.538764  cag pathogenicity island protein Z
hp2017_05091  9       23       -3.428640  cag pathogenicity island protein
hp2017_05092  12      23       -4.227122  cag pathogenicity island protein
hp2017_0510   10      25       -4.356965  cag island protein
hp2017_0511   10      27       -4.476635  cag pathogenicity island protein
hp2017_0512   12      27       -5.576745  Inner membrane protein
hp2017_0513   11      24       -5.631356  cag pathogenicity island protein
hp2017_0514   11      20       -5.674618  cag pathogenicity island protein
hp2017_0515   9       20       -5.845262  cag pathogenicity island protein
hp2017_0516   7       19       -4.167982  cag pathogenicity island protein
hp2017_0517   7       17       -3.085822  hypothetical protein
hp2017_0518   6       17       -2.801703  cag island protein
hp2017_0519   7       19       -2.999541  cag pathogenicity island protein N
hp2017_0520   5       19       -2.896376  cag pathogenicity island protein
hp2017_0521   5       20       -3.286548  cag pathogenicity island protein
hp2017_0522   6       19       -3.362909  cag island protein
hp2017_0523   6       20       -4.164881  cag island protein
hp2017_0524   6       21       -3.230766  cag pathogenicity island protein
hp2017_0525   5       22       -2.709889  CAG pathogenicity island protein
hp2017_0526   6       26       -2.244613  cag island protein
hp2017_0527   4       23       -0.922977  cag pathogenicity island protein C
hp2017_0528   2       18       -0.166893  hypothetical protein
hp2017_0529   2       18       -0.062076  cag island protein
ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
hp2017_0496 PF03944 31 0.005 delta endotoxin
		>PF03944#delta endotoxin

Length = 633

Score = 31.2 bits (70), Expect = 0.005
Identities = 25/94 (26%), Positives = 47/94 (50%), Gaps = 3/94 (3%)

Query: 68 LHHQEKLLNQCMLSQALKAMGDAELRVFLASVHDDLKGYEEFLSLCQKPHILALSKIDTA 127
L E+ LNQ + + + A +AEL A+V + + + FL+ + L+++
Sbjct: 94 LRETERFLNQRLNTDTV-ARVNAELTGLQANVEEFNRQVDNFLNPNRNAVPLSITSSVNT 152

Query: 128 THKQVLQKLQEYQKYASQFLDLVPLSAKKSQNLN 161
+ L +L ++Q Q L L+PL A+ + NL+
Sbjct: 153 MQQLFLNRLPQFQMQGYQLL-LLPLFAQAA-NLH 184
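
The Score, Expect and Identities lines that accompany each hit can be read programmatically. The following is a minimal Python sketch, not part of PredictBias: the function name parse_hit_stats and the "evalue <= cutoff" comparison are assumptions made for illustration, and the default cutoff of 0.05 is taken from the "E-value (RPSBlast)" input parameter shown at the top of this page.

```python
import re

# Patterns for the two statistics lines printed under each hit header.
SCORE_RE = re.compile(
    r"Score\s*=\s*([\d.]+)\s*bits\s*\((\d+)\),\s*Expect\s*=\s*(\S+)"
)
IDENT_RE = re.compile(r"Identities\s*=\s*(\d+)/(\d+)")


def parse_hit_stats(text, evalue_cutoff=0.05):
    """Extract bit score, raw score, E-value and identity fraction.

    Hypothetical helper; the cutoff comparison (<=) is an assumption.
    """
    score = SCORE_RE.search(text)
    ident = IDENT_RE.search(text)
    if score is None or ident is None:
        raise ValueError("no Score/Identities lines found")
    raw_evalue = score.group(3)
    # Some BLAST versions print values such as "e-100" without a leading digit.
    if raw_evalue.startswith("e"):
        raw_evalue = "1" + raw_evalue
    evalue = float(raw_evalue)
    matched, aligned = int(ident.group(1)), int(ident.group(2))
    return {
        "bits": float(score.group(1)),
        "raw_score": int(score.group(2)),
        "evalue": evalue,
        "identity": matched / aligned,
        "passes_cutoff": evalue <= evalue_cutoff,
    }


# Statistics lines of the hp2017_0496 / PF03944 hit above.
example = """Score = 31.2 bits (70), Expect = 0.005
Identities = 25/94 (26%), Positives = 47/94 (50%), Gaps = 3/94 (3%)"""
stats = parse_hit_stats(example)
print(stats)
```

The identity is kept as an unrounded fraction (matched/aligned) because the percentage printed on the page is already rounded.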


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
hp2017_0510 TYPE4SSCAGX 790 0.0 Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein

signature.
Length = 522

Score = 790 bits (2042), Expect = 0.0
Identities = 480/482 (99%), Positives = 482/482 (100%)

Query: 1 MVNKKIAYLGDEKPITIWTSLDNVTVIQLEKDETISYITTGFNKGWNIVPNSNHIFIQPK 60
+VNKKIAYLGDEKPITIWTSLDNVTVIQLEKDETISYITTGFNKGW+IVPNSNHIFIQPK
Sbjct: 41 VVNKKIAYLGDEKPITIWTSLDNVTVIQLEKDETISYITTGFNKGWSIVPNSNHIFIQPK 100

Query: 61 SVKSNLMFEKEAVNFALMTRDYQEFLKTKKLIVDAPDPKELEEQKKALEKEKEAKEQAQK 120
SVKSNLMFEKEAVNFALMTRDYQEFLKTKKLIVDAPDPKELEEQKKALEKEKEAKEQAQK
Sbjct: 101 SVKSNLMFEKEAVNFALMTRDYQEFLKTKKLIVDAPDPKELEEQKKALEKEKEAKEQAQK 160

Query: 121 AQKDKREKRKEERAKNRANLENLTNAMSNPQNLSNNKNLSELIKQQRENELDQMERLEDM 180
AQKDKREKRKEERAKNRANLENLTNAMSNPQNLSNNKNLSELIKQQRENELDQMERLEDM
Sbjct: 161 AQKDKREKRKEERAKNRANLENLTNAMSNPQNLSNNKNLSELIKQQRENELDQMERLEDM 220

Query: 181 QEQAQANALKQIEELNKKQAEEAVRQRAKDKISIKTDKSQKSPEDNSIELSPSDSAWRTN 240
QEQAQANALKQIEELNKKQAEEAVRQRAKDKISIKTDKSQKSPEDNSIELSPSDSAWRTN
Sbjct: 221 QEQAQANALKQIEELNKKQAEEAVRQRAKDKISIKTDKSQKSPEDNSIELSPSDSAWRTN 280

Query: 241 LVVRTNKALYQFILRIAQKDNFASAYLTVKLEYPQRHEVSSVIEEELKKREEAKRQRELI 300
LVVRTNKALYQFILRIAQKDNFASAYLTVKLEYPQRHEVSSVIEEELKKREEAKRQRELI
Sbjct: 281 LVVRTNKALYQFILRIAQKDNFASAYLTVKLEYPQRHEVSSVIEEELKKREEAKRQRELI 340

Query: 301 KQENLNTTAYINRVMMASNEQIINKEKIREEKQKIILDQAKALETQYVHNALKRNPVPRN 360
KQENLNTTAYINRVMMASNEQIINKEKIREEKQKIILDQAKALETQYVHNALKRNPVPRN
Sbjct: 341 KQENLNTTAYINRVMMASNEQIINKEKIREEKQKIILDQAKALETQYVHNALKRNPVPRN 400

Query: 361 YNYYQAPEKRSKHIMPSEIFDDGTFTYFGFKNITLQPAIFVVQPDGKLSMTDAAIDPNMT 420
YNYYQAPEKRSKHIMPSEIFDDGTFTYFGFKNITLQPAIFVVQPDGKLSMTDAAIDPNMT
Sbjct: 401 YNYYQAPEKRSKHIMPSEIFDDGTFTYFGFKNITLQPAIFVVQPDGKLSMTDAAIDPNMT 460

Query: 421 NSGLRWYRVNEIAEKFKLIKDKALVTVINKGYGKNPLTKNYNIKNYGELERVIKKLPLVR 480
NSGLRWYRVNEIAEKFKLIKDKALVTVINKGYGKNPLTKNYNIKNYGELERVIKKLPLVR
Sbjct: 461 NSGLRWYRVNEIAEKFKLIKDKALVTVINKGYGKNPLTKNYNIKNYGELERVIKKLPLVR 520

Query: 481 DK 482
DK
Sbjct: 521 DK 522


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
hp2017_0512 PF04335 119 5e-35 VirB8 type IV secretion protein
		>PF04335#VirB8 type IV secretion protein

Length = 227

Score = 119 bits (299), Expect = 5e-35
Identities = 44/205 (21%), Positives = 74/205 (36%), Gaps = 10/205 (4%)

Query: 27 KLNKANRTFKRAFYL---SMALNVAAVTSIVMMMPLKKTDIFVYGIDRYTGEFKIVKRSD 83
KL A R+ K A+ + + AL A V ++ + PLK + +V +DR TGE I +
Sbjct: 24 KLAAAERSKKLAWVVAGVAGALATAGVVAVAALTPLKTVEPYVITVDRNTGEASIAAKLH 83

Query: 84 A-RQIVNSEAVVDSATSKFVSLLFGYSKNSLRDRKDQLMQYCDVSFQTQAMRMFNENIRQ 142
I EAV + +V G+ + + D +M Q + R + + Q
Sbjct: 84 GDATITYDEAVRKYFLATYVRYREGWIAAAREEYFDAVMVMSARPEQDRWSRFYKTDNPQ 143

Query: 143 FVDKVRA-EAIISSNIQREKVKNSPLTRLTFFITIKITPDTMENYEYITKKQVTIYYDFA 201
+ A + I + +F +T T TI Y
Sbjct: 144 SPQNILANRTDVFVEI-KRVSFLGGNVAQVYFTKESVTGSNS----TKTDAVATIKYKVD 198

Query: 202 RGNSSQENLIINPFGFKVFDIQITD 226
S + + NP G++V +
Sbjct: 199 GTPSKEVDRFKNPLGYQVESYRADV 223


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
hp2017_0519 TYPE4SSCAGX 30 0.015 Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein

signature.
Length = 522

Score = 29.8 bits (66), Expect = 0.015
Identities = 29/119 (24%), Positives = 54/119 (45%), Gaps = 16/119 (13%)

Query: 24 AINTALLPSEYKELVALGFKKIKTFYQRHDDKEITKEEKEFATNALREKLRNDRARAEQI 83
A+N AL+ +Y+E + K K D KE+ +++K EK + + +A++
Sbjct: 112 AVNFALMTRDYQEFL----KTKKLIVDAPDPKELEEQKKAL------EKEKEAKEQAQKA 161

Query: 84 QKNIEAFEKKNNSSVQKKAAKHRGLQELNETNANPLNDNPNGNSPTETKSNKDDNFDEM 142
QK+ K +++A L+ L +NP N + N N K +++ D+M
Sbjct: 162 QKD------KREKRKEERAKNRANLENLTNAMSNPQNLSNNKNLSELIKQQRENELDQM 214


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
hp2017_0525 ACRIFLAVINRP 32 0.015 Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 32.1 bits (73), Expect = 0.015
Identities = 20/88 (22%), Positives = 32/88 (36%), Gaps = 18/88 (20%)

Query: 19 EVQKRQFQKIEELKADMQKGINPFFKVLFDGGNRLFGFPETFIYSSI-------FILFVT 71
+ K K+ EL+ +G+ +D F+ SI F +
Sbjct: 301 DTAKAIKAKLAELQPFFPQGMK--VLYPYD--------TTPFVQLSIHEVVKTLFEAIML 350

Query: 72 IVLSVILF-QAYEPVLIVAIVIVLVALG 98
+ L + LF Q LI I + +V LG
Sbjct: 351 VFLVMYLFLQNMRATLIPTIAVPVVLLG 378


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
hp2017_0529 TYPE4SSCAGA 1868 0.0 Type IV secretion system CagA exotoxin signature.
		>TYPE4SSCAGA#Type IV secretion system CagA exotoxin signature.

Length = 1147

Score = 1868 bits (4840), Expect = 0.0
Identities = 1041/1187 (87%), Positives = 1083/1187 (91%), Gaps = 43/1187 (3%)

Query: 1 MTNETIDQQPQTEAAFNPQQFINNLQVAFLKVDNAIASFDPDQKPIVDKNDRDNRQAFDG 60
MTNETIDQQPQTEAAFNPQQFINNLQVAFLKVDNA+AS+DPDQKPIVDKNDRDNRQAF+G
Sbjct: 1 MTNETIDQQPQTEAAFNPQQFINNLQVAFLKVDNAVASYDPDQKPIVDKNDRDNRQAFEG 60

Query: 61 ISQLREEYSNKAIKNPTKKNQYFSDFINKSNDLINKDNLIDVESSTKSFQKFGDQRYRIF 120
ISQLREEYSNKAIKNPTKKNQYFSDFINKSNDLINKDNLIDVESSTKSFQKFGDQRYRIF
Sbjct: 61 ISQLREEYSNKAIKNPTKKNQYFSDFINKSNDLINKDNLIDVESSTKSFQKFGDQRYRIF 120

Query: 121 TSWVSHQNDPSKINTRSIRNFMEHAIQPPIPDDKEKAEFLKSAKQSFAGIIIGNQIRTDQ 180
TSWVSHQNDPSKINTRSIRNFME+ IQPPI DDKEKAEFLKSAKQSFAGIIIGNQIRTDQ
Sbjct: 121 TSWVSHQNDPSKINTRSIRNFMENIIQPPILDDKEKAEFLKSAKQSFAGIIIGNQIRTDQ 180

Query: 181 KFMGVFDESLKERQEAEKNGGSTGGDWLDIFLSFIFDKKQSSDVKEAINQEPVPHVQPDI 240
KFMGVFDESLKERQEAEKNG TGGDWLDIFLSFIFDKKQSSDVKEAINQEPVPHVQPDI
Sbjct: 181 KFMGVFDESLKERQEAEKNGEPTGGDWLDIFLSFIFDKKQSSDVKEAINQEPVPHVQPDI 240

Query: 241 ATSTTHIQGLPPESRDLLDERGNFSKFTLGDMEMLDVEGVADMDPNYKFNQLLIHNNALS 300
AT+TT IQGLPPE+RDLLDERGNFSKFTLGDMEMLDVEGVAD+DPNYKFNQLLIHNNALS
Sbjct: 241 ATTTTDIQGLPPEARDLLDERGNFSKFTLGDMEMLDVEGVADIDPNYKFNQLLIHNNALS 300

Query: 301 SVLMGSHDGIEPEKVSLLYAGNGGFGDKHDWNATVGYKDQQGNNVATIINVHMKNGSGLI 360
SVLMGSH+GIEPEKVSLLY GNGG G +HDWNATVGYKDQQGNNVATIINVHMKNGSGL+
Sbjct: 301 SVLMGSHNGIEPEKVSLLYGGNGGPGARHDWNATVGYKDQQGNNVATIINVHMKNGSGLV 360

Query: 361 IAGGEKGINNPSFYLYKEDQLTGSQRALSQEEIQNKIDFMEFLAQNNAKLDSLSEKEKEK 420
IAGGEKGINNPSFYLYKEDQLTGSQRALSQEEIQNKIDFMEFLAQNNAKLD+LSEKEKEK
Sbjct: 361 IAGGEKGINNPSFYLYKEDQLTGSQRALSQEEIQNKIDFMEFLAQNNAKLDNLSEKEKEK 420

Query: 421 FKNEIKDFQKDSKPYLDALGNDRIAFVSKKDPKHSALITEFNKGDLSYTLKDYGKKADKA 480
F+ EIKDFQKDSK YLDALGNDRIAFVSKKD KHSALITEF GDLSYTLKDYGKKADKA
Sbjct: 421 FRTEIKDFQKDSKAYLDALGNDRIAFVSKKDTKHSALITEFGNGDLSYTLKDYGKKADKA 480

Query: 481 LDREKNVTLQGNLKHDGVMFVNYSNFKYTNASKSPNKGVGVTNGVSHLEAGFSKVAVFNL 540
LDREKNVTLQG+LKHDGVMFV+YSNFKYTNASK+PNKGVGVTNGVSHLE GF+KVA+FNL
Sbjct: 481 LDREKNVTLQGSLKHDGVMFVDYSNFKYTNASKNPNKGVGVTNGVSHLEVGFNKVAIFNL 540

Query: 541 PNLNNLAITSVVRRDLEDKLIAKGLSPQEANKLVKDFLSSNKELVGKALNFNKAVAEAKN 600
P+LNNLAITS VRR+LEDKL KGLSPQEANKL+KDFLSSNKELVGK LNFNKAVA+AKN
Sbjct: 541 PDLNNLAITSFVRRNLEDKLTTKGLSPQEANKLIKDFLSSNKELVGKTLNFNKAVADAKN 600

Query: 601 TGNYDEVKRAQKDLEKSLKKREHLEKDVAKNLESKSGNKNKMEVKSQANSQKDEIFALIN 660
TGNYDEVK+AQKDLEKSL+KREHLEK+V K LESKSGNKNKME K+QANSQKDEIFALIN
Sbjct: 601 TGNYDEVKKAQKDLEKSLRKREHLEKEVEKKLESKSGNKNKMEAKAQANSQKDEIFALIN 660

Query: 661 KEANRDARAIAYAQNLKDIKRELSDKLENISKDLKDFSKSFDEFKNGKSKDFSKVEETLK 720
KEANRDARAIAYAQNLK IKRELSDKLEN++K+LKDF KSFDEFKNGK+KDFSK EETLK
Sbjct: 661 KEANRDARAIAYAQNLKGIKRELSDKLENVNKNLKDFDKSFDEFKNGKNKDFSKAEETLK 720

Query: 721 ALKGSVKDLGINPEWISKVENLNAALNEFKNGKNKDFSKVTQAKSDLENSIKDVIINQKI 780
ALKGSVKDLGINPEWISKVENLNAALNEFKNGKNKDFSKVTQAKSDLENS+KDVIINQK+
Sbjct: 721 ALKGSVKDLGINPEWISKVENLNAALNEFKNGKNKDFSKVTQAKSDLENSVKDVIINQKV 780

Query: 781 TDKVDNLNQAVSMAKIAGNFSGVEQALADLKNFSKEQLAQQAQKNESFNVGK-SEIYQSV 839
TDKVDNLNQAVS+AK G+FS VEQALADLKNFSKEQLAQQAQKNES N K SEIYQSV
Sbjct: 781 TDKVDNLNQAVSVAKATGDFSRVEQALADLKNFSKEQLAQQAQKNESLNARKKSEIYQSV 840

Query: 840 KNGVNGTLVGNGLSGIEATALAKNFSDIKKELNEKFKNFNNNNNGLKNGKDKGPEEPIYA 899
KNGVNGTLVGNGLS EAT L+KNFSDIKKELN K NFNNNNN EPIYA
Sbjct: 841 KNGVNGTLVGNGLSQAEATTLSKNFSDIKKELNAKLGNFNNNNNNGLKN------EPIYA 894

Query: 900 QVNKKKTGQVASPEEPIYAQVAKKVTQKIDQLNQAASGFGGVGQ-AGFPLKRHDKVEDLS 958
+VNKKK GQ AS EEPIYAQVAKKV KID+LNQ ASG G VGQ AGFPLKRHDK
Sbjct: 895 KVNKKKAGQAASLEEPIYAQVAKKVNAKIDRLNQIASGLGVVGQAAGFPLKRHDK----- 949

Query: 959 KVGRSVSPEPIYATIDDLGGSFPLRRSAAVDDLSKVGRSREQELTQKIDNLSQAVSEAKA 1018
VDDLSKVG SR QEL QKIDNL+QAVSEAKA
Sbjct: 950 -----------------------------VDDLSKVGLSRNQELAQKIDNLNQAVSEAKA 980

Query: 1019 GFFGNLERTIDKLKDSTKNNPVNLWAENAKKVPASLSAKLDNYATNSHTRINSNIQNGAI 1078
GFFGNLE+TIDKLKDSTK+NP+NLW E+AKKVPASLSAKLDNYATNSH RINSNI+NGAI
Sbjct: 981 GFFGNLEQTIDKLKDSTKHNPMNLWVESAKKVPASLSAKLDNYATNSHIRINSNIKNGAI 1040

Query: 1079 NEKATGMLTQKNPEWLKLVNDKIVAHNVGSVPLSEYDKIGFNQKNMKDYSDSFKFSTKLN 1138
NEKATGMLTQKNPEWLKLVNDKIVAHNVGSVPLSEYDKIGFNQKNMKDYSDSFKFSTKLN
Sbjct: 1041 NEKATGMLTQKNPEWLKLVNDKIVAHNVGSVPLSEYDKIGFNQKNMKDYSDSFKFSTKLN 1100

Query: 1139 NTVKDVKSGFTQFLANAFSTG-YYSLARENAEHGIKNANTKGGFQKS 1184
N VKD SGFTQFL NAFST YY LARENAEHGIKN NTKGGFQKS
Sbjct: 1101 NAVKDTNSGFTQFLTNAFSTASYYCLARENAEHGIKNVNTKGGFQKS 1147


7 hp2017_0667 hp2017_06782 Y Y N Pathogenicity Island (biased-composition)
LocusTag DNBias CDNBias %GCBias Product
hp2017_0667 2 11 0.599926 hypothetical protein
hp2017_0668 1 9 0.367018 hypothetical protein
hp2017_0669 0 10 -0.914006 bifunctional N-acetylglucosamine-1-phosphate
hp2017_0670 1 11 -1.582243 Flagellar biosynthesis protein
hp2017_0671 2 14 -3.095549 Iron III dicitrate transport protein
hp2017_0672 1 15 -4.136988 Ferrous iron transport protein B
hp2017_0673 3 18 -3.173254 hypothetical protein
hp2017_0674 6 21 -2.485490 putative type II DNA modification enzyme/ methyl
hp2017_0675 7 18 -0.421731 hypothetical protein
hp2017_0676 5 16 0.064347 putative type II restriction enzyme
hp2017_0677 5 15 2.552901 Acetone carboxylase gamma subunit
hp2017_06781 4 14 2.717224 Acetone carboxylase, alpha subunit
hp2017_06782 2 14 3.155632 Acetone carboxylase, alpha subunit
ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
hp2017_0670 FLGBIOSNFLIP 275 9e-96 Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP

signature.
Length = 245

Score = 275 bits (704), Expect = 9e-96
Identities = 114/247 (46%), Positives = 162/247 (65%), Gaps = 4/247 (1%)

Query: 1 MRFFIFLMLALICPLICPLMSADSALPSVNLSLNAPSDPKQLVTTLNVIALLTLLVLAPS 60
MR + + L L A + LP + S P + + + +T L P+
Sbjct: 1 MRRLLSVAPVL---LWLITPLAFAQLPGIT-SQPLPGGGQSWSLPVQTLVFITSLTFIPA 56

Query: 61 LILVMTSFTRLIVVFSFLRTALGTQQTPPTQILVSLSLILTFFIMEPSLKKAYDTGIKPY 120
++L+MTSFTR+I+VF LR ALGT PP Q+L+ L+L LTFFIM P + K Y +P+
Sbjct: 57 ILLMMTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPF 116

Query: 121 MDKKISYTEAFEKSALPFKEFMLKNTREKDLALFFRIRNLPNPKTPDDVSLSVLIPAFMI 180
++KIS EA EK A P +EFML+ TRE DL LF R+ N + P+ V + +L+PA++
Sbjct: 117 SEEKISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVT 176

Query: 181 SELKTAFQIGFLLYLPFLVIDMVISSILMAMGMMMLPPVMISLPFKILVFILVDGFNLLT 240
SELKTAFQIGF +++PFL+ID+VI+S+LMA+GMMM+PP I+LPFK+++F+LVDG+ LL
Sbjct: 177 SELKTAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLV 236

Query: 241 ENLVASF 247
+L SF
Sbjct: 237 GSLAQSF 243


8 hp2017_0708 hp2017_07123 Y N N Genomic Island
LocusTag DNBias CDNBias %GCBias Product
hp2017_0708 2 16 -4.024534 tRNA Ile-lysidine synthase
hp2017_0709 3 19 -4.110177 hypothetical protein
hp2017_0710 6 22 -3.067484 hypothetical protein
hp2017_0711 5 25 -4.171275 hypothetical protein
hp2017_07121 5 24 -4.270400 hypothetical protein
hp2017_07122 4 23 -4.879078 hypothetical protein
hp2017_07123 3 21 -4.517330 hypothetical protein
9 hp2017_0851 hp2017_0862 Y N N Genomic Island
LocusTag DNBias CDNBias %GCBias Product
hp2017_0851 3 20 0.027754 hypothetical protein
hp2017_0852 1 15 2.288015 putative alkylphosphonate uptake protein
hp2017_0853 1 14 2.720958 hypothetical protein
hp2017_0854 2 13 2.469901 hypothetical protein
hp2017_0855 2 13 2.426707 hypothetical protein
hp2017_0856 3 14 2.475526 Catalase
hp2017_0857 2 14 2.642524 putative iron-regulated outer membrane protein
hp2017_0858 0 18 0.320794 Crossover junction endodeoxyribonuclease
hp2017_0859 -1 15 -1.378716 hypothetical protein
hp2017_0860 2 13 -1.316784 hypothetical protein
hp2017_0861 2 12 -1.032278 putative JHP1044-like protein
hp2017_0862 3 12 -1.237426 hypothetical protein
10 hp2017_0958 hp2017_0978 Y Y Y Pathogenicity Island (biased-composition)
LocusTag DNBias CDNBias %GCBias Product
hp2017_0958 3 15 1.197804 Cell division protein
hp2017_0959 3 18 0.606141 Cell division protein
hp2017_09601 3 18 -4.036058 hypothetical protein
hp2017_09602 3 20 -5.023067 hypothetical protein
hp2017_09603 6 17 -5.272792 hypothetical protein
hp2017_0961 5 16 -5.145314 Mechanosensitive channel protein
hp2017_0962 7 19 -6.313709 hypothetical protein
hp2017_09641 7 21 -6.957384 DNA topoisomerase I
hp2017_09642 7 18 -6.979822 DNA topoisomerase I
hp2017_09661 6 18 -7.087846 conjugal plasmid transfer system protein
hp2017_09662 4 24 -7.509235 conjugal plasmid transfer system protein
hp2017_0968 4 28 -8.660476 hypothetical protein
hp2017_0969 4 30 -9.701976 hypothetical protein
hp2017_0970 5 30 -10.099533 hypothetical protein
hp2017_09711 3 28 -8.205756 Type IV secretion system ATPase
hp2017_09712 3 28 -7.904735 Type IV secretion system ATPase
hp2017_0972 3 26 -8.063644 hypothetical protein
hp2017_0973 4 24 -8.014821 hypothetical protein
hp2017_0974 3 23 -7.936561 hypothetical protein
hp2017_0975 5 22 -7.308623 putative conjugal transfer protein
hp2017_0976 4 24 -6.271459 hypothetical protein
hp2017_09771 3 23 -5.471779 XERCD family protein/integrase/ recombinase
hp2017_09772 0 17 -4.309516 XERCD family protein/integrase/ recombinase
hp2017_0978 -1 17 -3.220109 hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
hp2017_0958 SHAPEPROTEIN 40 2e-05 Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein

signature.
Length = 347

Score = 39.7 bits (93), Expect = 2e-05
Identities = 38/176 (21%), Positives = 66/176 (37%), Gaps = 12/176 (6%)

Query: 211 AASIATLSNDERELGVACVDMGGETCNLTIYSGNSIRYNKYLPVGSHHLTTDL------S 264
AA+I G VD+GG T + + S N + Y+ + +G + +
Sbjct: 146 AAAIGAGLPVSEATGSMVVDIGGGTTEVAVISLNGVVYSSSVRIGGDRFDEAIINYVRRN 205

Query: 265 HMLNTPFPYAEEVKIKYGDLSFESGTETPSQSVQIPTTGSDGNESHIVPLSEIQTIMRER 324
+ AE +K + G S G E V+ + +EI ++E
Sbjct: 206 YGSLIGEATAERIKHEIG--SAYPGDEVREIEVRGRNLAEGVPRGFTLNSNEILEALQEP 263

Query: 325 ALETFKIIHRSIQDSGLE---EHLGGGVVLTGGMALMKGIKELARTHFTNYPVRLA 377
+ +++ E + G+VLTGG AL++ + L T PV +A
Sbjct: 264 LTGIVSAVMVALEQCPPELASDISERGMVLTGGGALLRNLDRLLM-EETGIPVVVA 318


11 hp2017_1053 hp2017_1059 Y N N Genomic Island
LocusTag DNBias CDNBias %GCBias Product
hp2017_1053 2 11 0.339825 6-phosphogluconolactonase
hp2017_1054 3 9 0.451089 Glucokinase
hp2017_1055 4 12 -0.707184 Alcohol dehydrogenase
hp2017_1056 3 12 -0.852910 putative lipopolysaccharide biosynthesis
hp2017_1057 2 11 0.936845 putative lipopolysaccharide biosynthesis
hp2017_1058 3 13 2.744676 hypothetical protein
hp2017_1059 0 14 3.186885 putative outer membrane protein
12 hp2017_1074 hp2017_1081 Y Y N Pathogenicity Island (biased-composition)
LocusTag DNBias CDNBias %GCBias Product
hp2017_1074 2 17 -0.900965 Type II DNA modification enzyme/
hp2017_1075 3 13 -0.603024 FlgM like protein
hp2017_1076 2 11 -1.599063 hypothetical protein
hp2017_1077 4 11 -1.371036 FKBP-type peptidyl-prolyl cis-trans isomerase
hp2017_1078 3 12 -2.176619 hypothetical protein
hp2017_1079 4 13 -1.791842 putative peptidoglycan associated lipoprotein
hp2017_1080 2 14 0.181632 Translocation protein
hp2017_1081 2 18 0.310202 TonB like protein
ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
hp2017_1079 OMPADOMAIN 136 4e-42 OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 136 bits (345), Expect = 4e-42
Identities = 46/162 (28%), Positives = 71/162 (43%), Gaps = 24/162 (14%)

Query: 2 AGDVSAKTVQTAPV-TTEPAPEKEEPKQEPAPVVEEKPAIESGTIIASIYFDFDKYEIKE 60
+ VS + Q PAP PAP V+ K T+ + + F+F+K +K
Sbjct: 184 SLGVSYRFGQGEAAPVVAPAPA-------PAPEVQTK----HFTLKSDVLFNFNKATLKP 232

Query: 61 SDQETLDEIVQKAKE---NHMQVLLEGNTDEFGSSEYNQALGVKRTLSVKNALVIKGVEK 117
Q LD++ + V++ G TD GS YNQ L +R SV + L+ KG+
Sbjct: 233 EGQAALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAYNQGLSERRAQSVVDYLISKGIPA 292

Query: 118 DMIKTISFGETKPKCVQ-----KTR----ECYRENRRVDVKL 150
D I GE+ P K R +C +RRV++++
Sbjct: 293 DKISARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEV 334


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
hp2017_1081 TYPE4SSCAGA 32 0.002 Type IV secretion system CagA exotoxin signature.
		>TYPE4SSCAGA#Type IV secretion system CagA exotoxin signature.

Length = 1147

Score = 32.4 bits (73), Expect = 0.002
Identities = 36/139 (25%), Positives = 64/139 (46%), Gaps = 12/139 (8%)

Query: 32 KEAEKILLDLNKKDEQAID--LNLEDLPSEKKNE-KIEKVTEKQGDF---LEPKEEPKEE 85
+EA K++ D +++ + LN ++ KN ++V + Q D L +E ++E
Sbjct: 568 QEANKLIKDFLSSNKELVGKTLNFNKAVADAKNTGNYDEVKKAQKDLEKSLRKREHLEKE 627

Query: 86 PEESLEDIFSSLNDFQEKTDKNAQKDE-----QKNEQEEQRRLREQQRLKQ-NQENQEML 139
E+ LE + N + K N+QKDE K + R + Q LK +E + L
Sbjct: 628 VEKKLESKSGNKNKMEAKAQANSQKDEIFALINKEANRDARAIAYAQNLKGIKRELSDKL 687

Query: 140 KGLQQNLNQFTQKLESVKN 158
+ + +NL F + + KN
Sbjct: 688 ENVNKNLKDFDKSFDEFKN 706


13 hp2017_1090 hp2017_1096 Y Y N Pathogenicity Island (biased-composition)
LocusTag DNBias CDNBias %GCBias Product
hp2017_1090 0 18 -4.347670 ATP synthase B chain
hp2017_1091 1 17 -3.750613 Plasmid replication partition related protein
hp2017_1092 2 17 -4.067201 Chromosome/ plasmid partitioning protein
hp2017_1093 2 18 -4.975941 Biotin-protein ligase
hp2017_1094 2 19 -5.245046 Methionyl-tRNA formyltransferase
hp2017_1095 2 20 -5.450838 hypothetical protein
hp2017_1096 2 19 -0.138221 hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
hp2017_1094 FERRIBNDNGPP 33 0.001 Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 33.0 bits (75), Expect = 0.001
Identities = 12/33 (36%), Positives = 20/33 (60%)

Query: 72 EPEVQILKDLKPDFIVVVAYGKILPKEVLAIAP 104
EP +++L ++KP F+V A P+ + IAP
Sbjct: 86 EPNLELLTEMKPSFMVWSAGYGPSPEMLARIAP 118


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
hp2017_1096 CHANLCOLICIN 36 3e-04 Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 35.8 bits (82), Expect = 3e-04
Identities = 34/214 (15%), Positives = 86/214 (40%), Gaps = 10/214 (4%)

Query: 4 NQTIPFKCPKCQEPINVSEALYKQIELENQSRFLAQQKAFEKEVNEKRAQYHTHLKMLEQ 63
N+ + + ++ A ++ E++ LA+ E++ ++ + EQ
Sbjct: 99 NEALRHNASRTPSATELAHANNAAMQAEDERLRLAKA---EEKARKEAEAAEKAFQEAEQ 155

Query: 64 KEEALKERAKEQQAQFDEAVKQASMLALQDERAKIIEEARKNAFLEQQKGLELLQKELDE 123
+ + ++ E + Q A + LA E AK +E A+K Q + +++ +
Sbjct: 156 RRKEIEREKAETERQLKLAEAEEKRLAALSEEAKAVEIAQKKLSAAQSEVVKMDGEIKTL 215

Query: 124 KSKQVQELHQKEAEIERLKRENNE-------VESRLKAENEKKLNEKLDLEREKIEKALH 176
S+ +H ++AE++ L + NE + + + L+ +A
Sbjct: 216 NSRLSSSIHARDAEMKTLAGKRNELAQASAKYKELDELVKKLSPRANDPLQNRPFFEATR 275

Query: 177 EKNELKFKQQEEQLEMLRNELKNAQRKAELSSQQ 210
+ ++E+Q ++ +E + + A+++ Q
Sbjct: 276 RRVGAGKIREEKQKQVTASETRINRINADITQIQ 309


14 hp2017_11862 hp2017_1226 Y Y N Pathogenicity Island (biased-composition)
LocusTag DNBias CDNBias %GCBias Product
hp2017_11862 3 12 0.155652 Dihydropteroate synthase
hp2017_1187 3 15 -0.697142 hypothetical protein
hp2017_1188 2 13 0.107119 hypothetical protein
hp2017_1189 0 11 1.526840 hypothetical protein
hp2017_1190 -1 12 1.758790 hypothetical protein
hp2017_1191 -2 11 1.226980 hypothetical protein
hp2017_1192 0 14 1.977867 Hypothetical protein
hp2017_1193 -1 10 2.940017 Carbamoyl-phosphate synthase small subunit
hp2017_1194 -1 12 2.365827 Formamidase
hp2017_1195 1 12 0.983549 hypothetical protein
hp2017_1196 2 12 0.576308 hypothetical protein
hp2017_1197 2 13 0.933297 Maf like protein
hp2017_1198 2 11 1.282656 Alanyl-tRNA synthetase
hp2017_1199 3 20 -1.609673 hypothetical protein
hp2017_1200 1 15 -1.842750 hypothetical protein
hp2017_1201 1 12 -1.605278 hypothetical protein
hp2017_1202 1 11 -1.140217 30S ribosomal protein S18
hp2017_1203 2 11 -1.383289 Single-stranded DNA-binding protein
hp2017_1204 3 12 -1.764999 30S ribosomal protein S6
hp2017_1205 3 12 -1.223885 putative DNA polymerase III delta subunit
hp2017_1206 2 11 -0.460610 3'-to-5' exoribonuclease RNase R
hp2017_1207 0 11 0.431794 Shikimate 5-dehydrogenase I alpha
hp2017_12081 -1 11 0.075007 hypothetical protein
hp2017_12082 -1 11 0.667521 hypothetical protein
hp2017_1209 -1 10 0.696180 Oligopeptide ABC transporter permease protein
hp2017_1210 -1 11 1.122338 putative oligopeptide ABC transporter
hp2017_1211 1 13 1.012197 Tryptophanyl-tRNA synthetase
hp2017_1212 2 16 1.708195 putative biotin synthesis protein
hp2017_1213 3 17 2.548153 Preprotein translocase subunit
hp2017_1214 1 13 2.733159 Ribosome recycling factor
hp2017_1215 1 12 2.399158 Orotate phosphoribosyltransferase
hp2017_1216 1 12 2.659980 hypothetical protein
hp2017_1217 0 13 3.019140 NAD-dependent protein deacetylase
hp2017_1218 -2 10 1.585250 NADH ubiquinone oxidoreductase chain A
hp2017_1219 -2 9 1.829728 NADH-ubiquinone oxidoreductase chain B
hp2017_1220 -2 10 1.871099 NADH-ubiquinone oxidoreductase chain C
hp2017_1221 -2 10 2.449265 NADH-ubiquinone oxidoreductase chain D
hp2017_1222 -1 11 2.079293 NADH-ubiquinone oxidoreductase chain E
hp2017_1223 -1 10 2.124315 NADH-ubiquinone oxidoreductase chain G
hp2017_1224 0 11 3.135197 NADH-ubiquinone oxidoreductase chain G
hp2017_1225 1 13 3.389984 NADH-ubiquinone oxidoreductase chain H
hp2017_1226 1 13 3.201012 NADH-ubiquinone oxidoreductase chain I
ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
hp2017_1189 TYPE3IMSPROT 24 0.032 Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein

family signature.
Length = 354

Score = 23.6 bits (51), Expect = 0.032
Identities = 7/13 (53%), Positives = 10/13 (76%)

Query: 5 FYKELKMDKQKVK 17
+ KELKM K ++K
Sbjct: 210 YIKELKMSKDEIK 222


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
hp2017_1196 adhesinmafb 30 0.002 Neisseria meningitidis: adhesin MafB signature.
		>adhesinmafb#Neisseria meningitidis: adhesin MafB signature.

Length = 467

Score = 30.4 bits (68), Expect = 0.002
Identities = 17/50 (34%), Positives = 22/50 (44%), Gaps = 2/50 (4%)

Query: 32 MEEIENSDPNQNNPFITA--AMGIGGAAISIFFPNTKPIVDGIKPLAEKG 79
ME I NPFI+A A+GIG + K + I PL +G
Sbjct: 225 MEFINGVAAGALNPFISAGEALGIGDILYGTRYAIDKAAMRNIAPLPAEG 274


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
hp2017_1199 PF05844 25 0.039 YopD protein
		>PF05844#YopD protein

Length = 295

Score = 24.6 bits (53), Expect = 0.039
Identities = 12/65 (18%), Positives = 28/65 (43%), Gaps = 1/65 (1%)

Query: 10 SVLKANNPHFDKIFEKHNQLDDDIKTAEQQNASDAEVSHMKKQKLKLKDEIHSMIIEYRE 69
L+A F+ + I++ Q + +V + Q ++E+++ I + +
Sbjct: 197 VALRAAGRAFESRNGALQVANTVIQSFVQMANASVQVRQGESQASAREEEVNATIGQ-SQ 255

Query: 70 KQKSD 74
KQK +
Sbjct: 256 KQKVE 260


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
hp2017_1213 SECGEXPORT 49 4e-10 Protein-export SecG membrane protein signature.
		>SECGEXPORT#Protein-export SecG membrane protein signature.

Length = 110

Score = 48.8 bits (116), Expect = 4e-10
Identities = 25/84 (29%), Positives = 47/84 (55%), Gaps = 3/84 (3%)

Query: 1 MTSALLGLQIVLAVLIVVVVLLQ--KSSSIGLGAYSGSNDSLFGAKGPASFMAKLTMFLG 58
M ALL + +++A+ +V +++LQ K + +G +G++ +LFG+ G +FM ++T L
Sbjct: 1 MYEALLVVFLIVAIGLVGLIMLQQGKGADMGASFGAGASATLFGSSGSGNFMTRMTALLA 60

Query: 59 LLFVINTIALGYFYNKEYGKSVLD 82
LF I ++ LG N +
Sbjct: 61 TLFFIISLVLGNI-NSNKTNKGSE 83


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
hp2017_1224 TYPE4SSCAGX 34 0.004 Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein

signature.
Length = 522

Score = 33.6 bits (76), Expect = 0.004
Identities = 26/71 (36%), Positives = 37/71 (52%), Gaps = 5/71 (7%)

Query: 447 SKQSIVDEAALKALEEERKKALEQ----AEQGCSIGENKEENKEEAVAPKENKEENKTEA 502
+K+ IVD K LEE+ KKALE+ EQ ++K E ++E A EN T A
Sbjct: 128 TKKLIVDAPDPKELEEQ-KKALEKEKEAKEQAQKAQKDKREKRKEERAKNRANLENLTNA 186

Query: 503 ATPKENQTENK 513
+ +N + NK
Sbjct: 187 MSNPQNLSNNK 197


15 hp2017_1340 hp2017_1352 Y Y N Pathogenicity Island (biased-composition)
LocusTag DNBias CDNBias %GCBias Product
hp2017_1340 -2 10 -3.061818 Prephenate dehydrogenase
hp2017_1341 -2 10 -3.741609 putative endonuclease
hp2017_1342 -2 11 -3.971699 Type III restriction-modification system
hp2017_13431 -2 12 -2.816045 putative type III restriction enzyme
hp2017_13432 -1 12 -2.389074 putative cAMP-induced cell filamentation
hp2017_1344 -1 12 -2.228211 Biotin synthase
hp2017_1345 1 14 -4.087609 putative Ribonuclease N
hp2017_13461 2 15 -4.036113 hypothetical protein
hp2017_13462 4 15 -3.379653 hypothetical protein
hp2017_13471 3 15 -3.197418 hypothetical protein
hp2017_13472 4 15 -2.839797 hypothetical protein
hp2017_13473 2 15 -2.534267 hypothetical protein
hp2017_1348 1 13 -2.360021 hypothetical protein
hp2017_1349 0 15 -3.514631 hypothetical protein
hp2017_1350 0 18 -4.805469 NADPH dependent 7-cyano-7-deazaguanine
hp2017_1351 -1 17 -4.210929 hypothetical protein
hp2017_1352 -1 18 -4.214670 tRNA delta 2-isopentenylpyrophosphate
ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
hp2017_1340 SHIGARICIN 29 0.024 Ribosome inactivating protein family signature.
		>SHIGARICIN#Ribosome inactivating protein family signature.

Length = 289

Score = 28.6 bits (64), Expect = 0.024
Identities = 11/74 (14%), Positives = 19/74 (25%), Gaps = 5/74 (6%)

Query: 83 TPIKKSTTIIDLGGAKAQILHNIPKSIRKNFIAAHPMCGTEFYGPKASVKGLYENALVIL 142
P + L GA + ++RK + Y L
Sbjct: 18 APAVEGDVSFRLSGATSSSYGVFISNLRKALPYERKLYDIPLLRSTLPGSQRY-----AL 72

Query: 143 CDLEDSGTEQVEIA 156
L + E + +A
Sbjct: 73 IHLTNYADETISVA 86


16 hp2017_1401 hp2017_1411 Y Y N Pathogenicity Island (biased-composition)
LocusTag DNBias CDNBias %GCBias Product
hp2017_1401 3 11 0.471359 putative outer membrane protein
hp2017_1402 2 11 0.392071 Branched-chain amino acid aminotransferase
hp2017_1403 1 10 -1.226193 putative outer membrane protein
hp2017_1404 1 11 -1.346434 DNA polymerase I
hp2017_14051 1 16 -1.487814 putative type II restriction enzyme
hp2017_14052 1 16 -0.963174 putative type II restriction enzyme
hp2017_14061 0 16 0.123333 putative type II restriction enzyme/
hp2017_14062 2 17 -0.006811 putative type II restriction enzyme/
hp2017_1407 2 15 1.114933 hypothetical protein
hp2017_1408 2 12 0.220398 Thymidylate kinase
hp2017_1409 2 12 0.631156 Phosphopantetheine adenylyltransferase
hp2017_1410 2 12 0.738763 3-polyprenyl-4-hydroxybenzoate carboxy-lyase
hp2017_1411 2 11 0.244858 putative P-ring biosynthesis protein
ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
hp2017_1409 LPSBIOSNTHSS 225 9e-79 Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein

signature.
Length = 166

Score = 225 bits (574), Expect = 9e-79
Identities = 63/147 (42%), Positives = 93/147 (63%)

Query: 4 IGIYPGTFDPVTNGHIDIIHRSSELFEELIVAVAHSSAKNPMFSLDERLKMMQLATKSFK 63
IYPG+FDP+T GH+DII R LF+++ VAV + K PMFS+ ERL+ + A
Sbjct: 2 NAIYPGSFDPITFGHLDIIERGCRLFDQVYVAVLRNPNKQPMFSVQERLEQIAKAIAHLP 61

Query: 64 NVECVAFEGLLANLAKEYHCKVLVRGLRVVSDFEYELQMGYANKSLNHELETLYFMPTLQ 123
N + +FEGL N A++ ++RGLRV+SDFE ELQM NK+L +LET++ + +
Sbjct: 62 NAQVDSFEGLTVNYARQRQAGAILRGLRVLSDFELELQMANTNKTLASDLETVFLTTSTE 121

Query: 124 NAFISSSIVRSIIAHKGDASHLVPEEI 150
+F+SSS+V+ + G+ H VP +
Sbjct: 122 YSFLSSSLVKEVARFGGNVEHFVPSHV 148


17 hp2017_1453 hp2017_1478 Y Y Y Pathogenicity Island (biased-composition)
LocusTag DNBias CDNBias %GCBias Product
hp2017_1453 2 17 -0.969251 hypothetical protein
hp2017_1454 0 14 -0.785555 hypothetical protein
hp2017_1455 0 12 -1.116892 Exodeoxyribonuclease III
hp2017_1456 1 13 -0.146348 *hypothetical protein
hp2017_1457 2 16 -0.317074 hypothetical protein
hp2017_1458 2 14 -1.317405 Chromosomal replication initiator protein
hp2017_1459 3 15 -2.249634 purine nucleoside phosphorylase
hp2017_1460 3 13 -1.674374 hypothetical protein
hp2017_1461 1 12 -1.949608 Glucosamine-fructose-6-phosphate
hp2017_1462 0 14 -2.901459 Thymidylate synthase
hp2017_1463 -2 12 -0.367136 Type I restriction-modification specificity
hp2017_14641 -2 13 0.322174 Type I restriction enzyme modification subunit
hp2017_14642 -2 12 0.960649 Type I restriction enzyme modification subunit
hp2017_1465 -1 12 1.434004 Type I restriction enzyme restriction subunit
hp2017_1466 2 15 3.741605 hypothetical protein
hp2017_1467 1 13 2.621079 putative iron III dicitrate transport protein
hp2017_1468 0 9 0.605209 hypothetical protein
hp2017_1469 -1 9 0.926424 Arginase
hp2017_1470 -3 10 0.948936 Alanine dehydrogenase
hp2017_1471 -1 9 -1.085659 zinc-dependent alcohol dehydrogenase
hp2017_1472 1 9 -1.823753 hypothetical protein
hp2017_1473 1 11 -1.021399 putative outer membrane protein
hp2017_1474 3 14 -1.105669 putative NAD kinase
hp2017_1475 3 12 -3.114497 DNA repair protein
hp2017_1476 3 15 -4.694215 Fibronectin/fibrinogen-binding protein
hp2017_1477 2 15 -2.077870 hypothetical protein
hp2017_1478 3 13 -1.689325 hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
hp2017_1458 HTHFIS 35 5e-04 FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 35.2 bits (81), Expect = 5e-04
Identities = 9/51 (17%), Positives = 24/51 (47%), Gaps = 4/51 (7%)

Query: 127 TVYEIAKKVAQSDTPPYNPVLFYGGTGLGKTHILNAIGNHALEKHKKVVLV 177
+Y + ++ Q+D ++ G +G GK + A+ ++ ++ V +
Sbjct: 148 EIYRVLARLMQTDLT----LMITGESGTGKELVARALHDYGKRRNGPFVAI 194


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
hp2017_1476 FbpA_PF05833 112 4e-29 Fibronectin-binding protein
		>FbpA_PF05833#Fibronectin-binding protein

Length = 577

Score = 112 bits (282), Expect = 4e-29
Identities = 72/358 (20%), Positives = 141/358 (39%), Gaps = 25/358 (6%)

Query: 74 AKDLAYKSETFILRLEMIPKKANLMILDQEKCVIEA--FRFNDRVAKNDILGALPPNIYE 131
+ ++ ++ + + L + K + + I++ F FN N +G N+
Sbjct: 209 SSEICFRLKNNSIDLSLSNLKEIVEVCKDLFKEIQSNKFEFNCYTKNNSFVGFYCLNLMS 268

Query: 132 HQEEDLDFKDLLDILEKDFLSYQ--HKELEHKKNQIIKRLNAQKERLKEKLEKLEDPKNL 189
++ D L ++F + L+ K + + K + R +K + L +
Sbjct: 269 KEDYKKIQYDSSSKLLENFYYAKDKSDRLKSKSSDLQKIVMNNINRCTKKDKILNNTLKK 328

Query: 190 QLEAKELQTQASLLLTYQHLIHKHESCVILKDFED---KECMIEIDKSMPLNAFINKKFT 246
+ + LL + + K S + L ++ I +D++ + + +
Sbjct: 329 CEDKDIFKLYGELLTANIYALKKGLSHIELANYYSENYDTVKITLDENKTPSQNVQSYYK 388

Query: 247 LSKKKKQKSQFLYLEEENLKEKIAFKENQINYVRDAAEESVLE------------MFMPV 294
K K+ + + +E++ + + + + +A +E F +
Sbjct: 389 KYNKLKKSEEAANEQLLQNEEELNYLYSVLTNINNADNYDEIEEIKKELIETGYIKFKKI 448

Query: 295 KNSKIKRPMNGYEVLYYKDFKIGLGKNQKENIKL-LQDARANDLWMHVRNIPGSHLIVFC 353
SK + + I +GKN +N L L+ A +D+W H +NIPGSH+IV
Sbjct: 449 YKSKKSKTSKPMHFISKDGIDIYVGKNNIQNDYLTLKFANKHDIWFHTKNIPGSHVIVKN 508

Query: 354 QKNTPKDEIILELAKMLIKMQKDAFNS-YEIDYTQRKFVKIIKGAN---VIYSKYRTI 407
+ P + +LE A + K +S +DYT+ K VK GA VIYS +TI
Sbjct: 509 IMDIP-ESTLLEAANLAAYYSKSQNSSNVPVDYTEVKNVKKPNGAKPGMVIYSTNQTI 565



Score = 35.2 bits (81), Expect = 4e-04
Identities = 19/92 (20%), Positives = 48/92 (52%), Gaps = 5/92 (5%)

Query: 23 SAPYIGLSKKPPESVLKNTLALDFCLNKFTKNAKILQANVIDNDRI--LEIKGAKDLAYK 80
+ P I L+ + +K + L K+ NAKI+ + I+ DRI ++ + +L +
Sbjct: 55 NYPRIHLTDLTKPNPIKAPMFCMV-LRKYISNAKIVDIHQINQDRIVVIDFESTDELGFN 113

Query: 81 SETFILRLEMIPKKANLMILD-QEKCVIEAFR 111
S L +E++ + +N+ ++ ++ ++++ +
Sbjct: 114 SIY-SLIIEIMGRHSNMTLIRKRDNIIMDSIK 144


18  hp2017_0040  hp2017_0045  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
hp2017_0040  -2      13       0.632098   DNA transformation competancy
hp2017_0041  -2      13       0.576321   comB9 competence protein
hp2017_0042  -1      13       0.923276   inner membrane protein
hp2017_0043  0       15       0.548200   Mannose-6-phosphate isomerase
hp2017_0044  -1      12       1.231841   GDP-D-mannose dehydratase
hp2017_0045  -1      11       1.525216   Putative fucose synthetase
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
hp2017_0040  PF04335  133    2e-40    VirB8 type IV secretion protein
		>PF04335#VirB8 type IV secretion protein

Length = 227

Score = 133 bits (335), Expect = 2e-40
Identities = 36/202 (17%), Positives = 72/202 (35%), Gaps = 4/202 (1%)

Query: 40 QSVFRLERNRLKIAYRLLGLMSFIALVLAIVLISILPLQKTEHHF--VDFLNQDKHYAII 97
+ K+A+ + G+ +A + + ++ PL+ E + VD + A
Sbjct: 22 RDKLAAAERSKKLAWVVAGVAGALATAGVVAVAALTPLKTVEPYVITVDRNTGEASIAAK 81

Query: 98 QRADKSISSNEALARSLIGAYVLNRESINRIDDKSRYELVRLQSSSKVWQRFEDLIKAQN 157
D +I+ +EA+ + + YV RE + ++ V + S+ R+ K N
Sbjct: 82 LHGDATITYDEAVRKYFLATYVRYREGWIAAAREEYFDAVMVMSARPEQDRWSRFYKTDN 141

Query: 158 SIYAQSHLEREVHI-VNIAIYQQDNNPIASVSIAAKLLNENKLVYEKRYKIVLSYLFDTP 216
Q+ L + V I +A V + + + + + Y D
Sbjct: 142 PQSPQNILANRTDVFVEIKRVSFLGGNVAQVYFTKESVTGSNST-KTDAVATIKYKVDGT 200

Query: 217 DFDYASMPKNPTGFKITRYSIT 238
KNP G+++ Y
Sbjct: 201 PSKEVDRFKNPLGYQVESYRAD 222


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits         Score  E-value  Comments
hp2017_0041  TYPE4SSCAGX  32     0.003    Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein signature.

Length = 522

Score = 32.1 bits (72), Expect = 0.003
Identities = 27/70 (38%), Positives = 37/70 (52%), Gaps = 8/70 (11%)

Query: 200 KEKEEETIIIGDNTNAMKIIKKDIQKGYKALKSSQ--RKWYCLWACSKKSKLSLMPKEIF 257
K +EE+ II D A+ + Q + ALK + R + A K+SK +MP EIF
Sbjct: 367 KIREEKQKIILDQAKAL-----ETQYVHNALKRNPVPRNYNYYQAPEKRSK-HIMPSEIF 420

Query: 258 NDKQFTYFKF 267
+D FTYF F
Sbjct: 421 DDGTFTYFGF 430


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
hp2017_0044  NUCEPIMERASE  88     2e-21    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 87.5 bits (217), Expect = 2e-21
Identities = 46/180 (25%), Positives = 72/180 (40%), Gaps = 19/180 (10%)

Query: 7 LITGVTGQDGSYLAEYLLNLGYEVHGLKRRSSSINTSRIDHLYEDLHSEHKRRFFLHYGD 66
L+TG G G ++++ LL G++V G+ + + S E L F H D
Sbjct: 4 LVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQP---GFQFHKID 60

Query: 67 MTDSSNLIHLIATTKPTEIYNLAAQSHVKVSFETPEYTANADGIGTLRILEAMRILGLEK 126
+ D + L A+ ++ + V+ S E P A+++ G L ILE R ++
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQ- 119

Query: 127 KTRFYQASTSELYGEVLETPQNENTPF-------NPRSPYAVAKMYAFYITKNYREAYNL 179
AS+S +YG N PF +P S YA K + Y Y L
Sbjct: 120 --HLLYASSSSVYGL------NRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYGL 171


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
hp2017_0045  NUCEPIMERASE  47     4e-08    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 47.1 bits (112), Expect = 4e-08
Identities = 52/353 (14%), Positives = 107/353 (30%), Gaps = 68/353 (19%)

Query: 5 ILITGAYGMVGQNTALYFKKNKPDV-----------TLLTPKKSELY-----------LL 42
L+TGA G +G + + + V L + EL L
Sbjct: 3 YLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDLA 62

Query: 43 DKDNVQAYLKEYKPTGIIHCAGRVGGIVANMNDLSTYMVENLLMGLYLFSSALDLGVKKA 102
D++ + + R + ++ + Y NL L + ++
Sbjct: 63 DREGMTDLFASGHFERVFISPHR-LAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQHL 121

Query: 103 INLASSCAYPKYAPNPLKESDLLNGSLEPTNEGYALAKLSVMKYCEYMSTEKGVFYKTLV 162
+ +SS Y P D ++ + YA K + S G+ L
Sbjct: 122 LYASSSSVYGLNRKMPFSTDDSVDHPVSL----YAATKKANELMAHTYSHLYGLPATGLR 177

Query: 163 PCNLYGE-------FDKFEEKIAHMIPGLIARMHTAKLKGEKNFAMWGDGTARREYLNAK 215
+YG KF + + L+G+ ++ G +R++
Sbjct: 178 FFTVYGPWGRPDMALFKFTKAM---------------LEGKS-IDVYNYGKMKRDFTYID 221

Query: 216 DLARFIALAYENIAQIPS-----------------VMNVGSGVDYSIEEYYEKVAQVLDY 258
D+A I + I + V N+G+ + +Y + + L
Sbjct: 222 DIAEAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGI 281

Query: 259 KGVFVKDLSKPVGMQQKLMDISK-QKALKWELEIPLEQGIKEAYEYYLKLLEV 310
+ +P + + D + + + E ++ G+K +Y +V
Sbjct: 282 EAKKNMLPLQPGDVLETSADTKALYEVIGFTPETTVKDGVKNFVNWYRDFYKV 334


19  hp2017_0250  hp2017_0257  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
hp2017_0250  -2      15       0.035845   Neutrophill activating protein A
hp2017_0251  -1      12       0.390790   putative histidine kinase sensor protein
hp2017_0252  -3      11       1.445979   hypothetical protein
hp2017_0253  -3      9        1.489096   Flagellar basal body P-ring protein
hp2017_0254  -3      8        1.526545   ATP dependent RNA helicase
hp2017_0256  -2      8        1.534831   hypothetical protein
hp2017_0257  -3      9        1.493575   Oligopeptide transport ATP-binding protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits         Score  E-value  Comments
hp2017_0250  HELNAPAPROT  149    3e-49    Helicobacter neutrophil-activating protein A family ...
		>HELNAPAPROT#Helicobacter neutrophil-activating protein A family signature.

Length = 153

Score = 149 bits (377), Expect = 3e-49
Identities = 38/140 (27%), Positives = 75/140 (53%), Gaps = 1/140 (0%)

Query: 5 EILKHLQADAIVLFMKVHNFHWNVKGTDFFNVHKATEEIYEEFADMFDDLAERIVQLGHH 64
L ++ +L+ K+H FHW VKG FF +H+ EE+Y+ A+ D +AER++ +G
Sbjct: 15 NSLNTQLSNWFLLYSKLHRFHWYVKGPHFFTLHEKFEELYDHAAETVDTIAERLLAIGGQ 74

Query: 65 PLVTLSEALKLTRVKEETKTSFHSKDIFKEILGDYKYLEKEFEELSNTAEKEGDKVTVTY 124
P+ T+ E + + + + + ++ + ++ DYK + E + + AE+ D T
Sbjct: 75 PVATVKEYTEHASITDGGNET-SASEMVQALVNDYKQISSESKFVIGLAEENQDNATADL 133

Query: 125 ADDQLAKLQKSIWMLEAHLA 144
+ +++K +WML ++L
Sbjct: 134 FVGLIEEVEKQVWMLSSYLG 153


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
hp2017_0251  PF06580  30     0.015    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 30.2 bits (68), Expect = 0.015
Identities = 10/71 (14%), Positives = 25/71 (35%), Gaps = 13/71 (18%)

Query: 281 IVLQNFLYNAIDAIEALEESEQ-GQVKIEAFIQNEFIVFTIIDNGKEVENKSALFEPFET 339
+++Q + N I + + Q G++ ++ N + + + G +
Sbjct: 258 MLVQTLVENGI--KHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTK------- 308

Query: 340 TKLKGNGLGLA 350
+ G GL
Sbjct: 309 ---ESTGTGLQ 316


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
hp2017_0253  FLGPRINGFLGI  363    e-127    Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 363 bits (933), Expect = e-127
Identities = 118/345 (34%), Positives = 191/345 (55%), Gaps = 26/345 (7%)

Query: 19 AEKIGDIASVVGVRDNQLIGYGLVIGLNGTGDK-SGSKFTMQSISNMLESVNVKISADDI 77
+I DIAS+ RDNQLIGYGLV+GL GTGD S FT QS+ ML+++ +
Sbjct: 28 TSRIKDIASLQAGRDNQLIGYGLVVGLQGTGDSLRSSPFTEQSMRAMLQNLGITTQGGQS 87

Query: 78 KSKNVAAVMITASLPPFARQGDKIDIHISSIGDAKSIQGGTLVMTPLNAVDGNIYALAQG 137
+KN+AAVM+TA+LPPFA G ++D+ +SS+GDA S++GG L+MT L+ DG IYA+AQG
Sbjct: 88 NAKNIAAVMVTANLPPFASPGSRVDVTVSSLGDATSLRGGNLIMTSLSGADGQIYAVAQG 147

Query: 138 AIVSGNSS-----------NLLSANIINGATIEREVSYDLFHKNAMTLSLKNPKFKNAIQ 186
A++ S SA + NGA IERE+ + L L+NP F A++
Sbjct: 148 ALIVNGFSAQGDAATLTQGVTTSARVPNGAIIERELPSKFKDSVNLVLQLRNPDFSTAVR 207

Query: 187 VQNTLNKV----FGNKVAIALDPKTIQITRPERLSMVEFLALVQEIPINYSAKNKIIVDE 242
V + +N +G+ +A D + I + +P + +A ++ + + K++++E
Sbjct: 208 VADVVNAFARARYGDPIAEPRDSQEIAVQKPRVADLTRLMAEIENLTVETDTPAKVVINE 267

Query: 243 KSGTIVSGVDIIVHPIVVTSQDITLKITKEP--------LNDSKNTQDLDNNMSLDTAHN 294
++GTIV G D+ + + V+ +T+++T+ P Q + M++
Sbjct: 268 RTGTIVIGADVRISRVAVSYGTLTVQVTESPQVIQPAPFSRGQTAVQPQTDIMAMQEGSK 327

Query: 295 TLSSNGKNITIAGVVKALQKIGVSAKGMVSILQALKKSGAISAEM 339
G ++ +V L IG+ A G+++ILQ +K +GA+ AE+
Sbjct: 328 VAIVEGPDLR--TLVAGLNSIGLKADGIIAILQGIKSAGALQAEL 370


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits    Score  E-value  Comments
hp2017_0257  HTHFIS  32     0.006    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 32.1 bits (73), Expect = 0.006
Identities = 16/50 (32%), Positives = 21/50 (42%), Gaps = 7/50 (14%)

Query: 30 VAIVGESGSGKSSIANLIMRLNPR----FKPHNGEILFETTNLLKESEAF 75
+ I GESG+GK +A + R F N + L ESE F
Sbjct: 163 LMITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPRD---LIESELF 209


20  hp2017_0348  hp2017_0361  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
hp2017_0348  -1      10       0.234472   CTP synthase
hp2017_0349  -1      10       0.557319   hypothetical protein
hp2017_0350  -2      9        1.195189   hypothetical protein
hp2017_0351  -2      9        0.948341   Flagellar MS-ring protein
hp2017_0352  -2      10       1.201485   Flagellar motor switch protein
hp2017_0353  -2      11       0.934392   Flagellar assembly protein
hp2017_0354  -2      10       1.828397   1-deoxy-D-xylulose 5-phosphate synthase
hp2017_0355  0       10       1.187865   putative GTP binding protein
hp2017_0356  1       14       0.380451   hypothetical protein
hp2017_0357  0       13       -0.675859  hypothetical protein
hp2017_0358  -1      11       0.266402   hypothetical protein
hp2017_0359  0       11       -0.029717  Flagellar basal-body rod protein
hp2017_0360  1       11       -0.540593  Alpha-ketoglutarate permease
hp2017_0361  0       12       -0.777116  Cell division protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
hp2017_0348  ACETATEKNASE  29     0.047    Acetate kinase family signature.
		>ACETATEKNASE#Acetate kinase family signature.

Length = 400

Score = 29.0 bits (65), Expect = 0.047
Identities = 14/38 (36%), Positives = 18/38 (47%), Gaps = 5/38 (13%)

Query: 301 LEGVDAILVPGGFGERGIEGKICAIQRARLEKLPFLGI 338
+ GVD I+ G GE G I+ L+ L FLG
Sbjct: 320 MGGVDVIVFTAGIGENG-----PEIREFILDGLEFLGF 352


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
hp2017_0351  FLGMRINGFLIF  551    0.0      Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 551 bits (1421), Expect = 0.0
Identities = 177/582 (30%), Positives = 290/582 (49%), Gaps = 66/582 (11%)

Query: 11 VDFFIKLNKKQKIALIAAGVLITALLVFLLLYPFKEKDYAQGGYGVLFERLDSSDNALIL 70
+++ +L +I LI AG A++V ++L+ K DY LF L D I+
Sbjct: 13 LEWLNRLRANPRIPLIVAGSAAVAIVVAMVLWA-KTPDYR-----TLFSNLSDQDGGAIV 66

Query: 71 QHLQQNQIPYKVLKDD-TILVPKDKVYEERITLASQGIPKTSKVGFEIFDTKDFGATDFD 129
L Q IPY+ I VP DKV+E R+ LA QG+PK VGFE+ D + FG + F
Sbjct: 67 AQLTQMNIPYRFANGSGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQFS 126

Query: 130 QNIKLIRAIEGELSRTIESLNPILKANVHIAIPKDSVFVAKEVPPSASVMLKLKPDMKLS 189
+ + RA+EGEL+RTIE+L P+ A VH+A+PK S+FV ++ PSASV + L+P L
Sbjct: 127 EQVNYQRALEGELARTIETLGPVKSARVHLAMPKPSLFVREQKSPSASVTVTLEPGRALD 186

Query: 190 PTQILGIKNLIAAAVPKLTIENVKIVNENGESIGEGDILENSKELALEQLRYKQNFENIL 249
QI + +L+++AV L NV +V+++G + + + + ++L QL++ + E+ +
Sbjct: 187 EGQISAVVHLVSSAVAGLPPGNVTLVDQSGHLLTQSNT--SGRDLNDAQLKFANDVESRI 244

Query: 250 ENKIVNILAPIVGGKNKVVARVNAEFDFSQKKSTKETFDPNN-----VVRSEQNLEEKKE 304
+ +I IL+PIVG N V A+V A+ DF+ K+ T+E + PN +RS Q ++
Sbjct: 245 QRRIEAILSPIVGNGN-VHAQVTAQLDFANKEQTEEHYSPNGDASKATLRSRQLNISEQV 303

Query: 305 GAPKKQVGGVPGVVSN-IGPVQGLKDNKEPEKYEKSQN---------------------- 341
GA GGVPG +SN P P + +QN
Sbjct: 304 GAGYP--GGVPGALSNQPAPPNEAPIATPPTNQQNAQNTPQTSTSTNSNSAGPRSTQRNE 361

Query: 342 TTNYEVGKTISEIKGEFGTLVRLNAAVVVDGKYKIAFKDGANALEYEPLSDESLKKINAL 401
T+NYEV +TI K G + RL+ AVVV+ K K PL+ + +K+I L
Sbjct: 362 TSNYEVDRTIRHTKMNVGDIERLSVAVVVNYKTLADGKP-------LPLTADQMKQIEDL 414

Query: 402 VKQAIGYNQNRGDDVAVSNFEFNPMAPMIDNATFSEKIMHKTQKILGSFTPLIKYILVFI 461
++A+G++ RGD + V N F+ + E + Q + +++LV +
Sbjct: 415 TREAMGFSDKRGDTLNVVNSPFSAVDNTGG-----ELPFWQQQSFIDQLLAAGRWLLVLV 469

Query: 462 VLFIFYKKVIVPFSERMLEVVPDEDKEVKSMFEEMDEEEDELNKLGDLRKKVEDQLGLNA 521
V +I ++K + P R +E ++ + E + E L+K L+++ +Q
Sbjct: 470 VAWILWRKAVRPQLTRRVEEAKAAQEQAQVRQETEEAVEVRLSKDEQLQQRRANQ----- 524

Query: 522 SFSEEEVRYEIILEKIRGTLKERPDEIATLFKLLIKDEISSD 563
+ E++ ++IR E D + L+I+ +S+D
Sbjct: 525 -----RLGAEVMSQRIR----EMSDNDPRVVALVIRQWMSND 557


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
hp2017_0352  FLGMOTORFLIG  348    e-122    Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 348 bits (895), Expect = e-122
Identities = 121/338 (35%), Positives = 208/338 (61%), Gaps = 4/338 (1%)

Query: 8 KQKAQLDELSMSEKIAILLIQVGEDTTGEILRHLDIDSITEISKQIVQLNGTDKQIGAAV 67
K+ + L+ +K AILL+ +G + + ++ ++L + I ++ +I +L ++ V
Sbjct: 7 KEILDVSALTGKQKAAILLVSIGSEISSKVFKYLSQEEIESLTFEIAKLETITSELKDNV 66

Query: 68 LEEFFAIFQSNQYINTGGLEYARELLTRTLGSEEARKVMDKLTKSLQTQKNFAYLGKIKP 127
L EF + + ++I GG++YARELL ++LG+++A +++ L +LQ+ + F ++ + P
Sbjct: 67 LLEFKELMMAQEFIQKGGIDYARELLEKSLGTQKAVDIINNLGSALQS-RPFEFVRRADP 125

Query: 128 QQLADFIINEHPQTIALILAHMEAPNAAETLSYFPDEMKAEISIRMANLGEISPQVVKRV 187
+ +FI EHPQTIALIL++++ A+ LS P E++ ++ R+A + SP+VV+ V
Sbjct: 126 ANILNFIQQEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEVVREV 185

Query: 188 STVLENKLESLTSYK-IEVGGLRAVAEIFNRLGQKSAKTTLARIESVDNKLAGAIKEMMF 246
VLE KL SL+S GG+ V EI N +K+ K + +E D +LA IK+ MF
Sbjct: 186 ERVLEKKLASLSSEDYTSAGGVDNVVEIINMADRKTEKFIIESLEEEDPELAEEIKKKMF 245

Query: 247 TFEDIAKLDNFAIREILKVADKKDLSLALKTSTQDLTDKFLNNMSSRAAEQFVEEMQYLG 306
FEDI LD+ +I+ +L+ D ++L+ ALK+ + +K NMS RAA E+M++LG
Sbjct: 246 VFEDIVLLDDRSIQRVLREIDGQELAKALKSVDIPVQEKIFKNMSKRAASMLKEDMEFLG 305

Query: 307 AVKIKDVDVAQRKIIEIVQSLQEKG--VIQTGEEEDVI 342
+ KDV+ +Q+KI+ +++ L+E+G VI G EEDV+
Sbjct: 306 PTRRKDVEESQQKIVSLIRKLEEQGEIVISRGGEEDVL 343



Score = 31.3 bits (71), Expect = 0.006
Identities = 20/103 (19%), Positives = 41/103 (39%), Gaps = 3/103 (2%)

Query: 4 KLTPKQKAQLDELSMSEKIAILLIQVGEDTTGEILRHLDIDSITEISKQIVQLNGTDKQI 63
+ P + + IA++L + IL L + T ++++I ++ T ++
Sbjct: 122 RADPANILNFIQQEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEV 181

Query: 64 GAA---VLEEFFAIFQSNQYINTGGLEYARELLTRTLGSEEAR 103
VLE+ A S Y + GG++ E++ E
Sbjct: 182 VREVERVLEKKLASLSSEDYTSAGGVDNVVEIINMADRKTEKF 224


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits       Score  E-value  Comments
hp2017_0355  TCRTETOQM  113    2e-28    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 113 bits (284), Expect = 2e-28
Identities = 53/162 (32%), Positives = 87/162 (53%), Gaps = 7/162 (4%)

Query: 3 NIRNFSIIAHIDHGKSTLADCLIAECNAIS---NREMTSQVMDTMDIEKERGITIKAQSV 59
I N ++AH+D GK+TL + L+ AI+ + + + D +E++RGITI+
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 60 RLNYTLKGEDYVLNLIDTPGHVDFSYEVSRSLCSCEGALLVVDATQGVEAQTIANTYIAL 119
+ E+ +N+IDTPGH+DF EV RSL +GA+L++ A GV+AQT +
Sbjct: 62 SFQW----ENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALR 117

Query: 120 DNNLEILPVINKIDLPNANVLEVKQDIEDTIGIDCSSANEVS 161
+ + INKID ++ V QDI++ + + +V
Sbjct: 118 KMGIPTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVE 159



Score = 83.8 bits (207), Expect = 5e-19
Identities = 50/215 (23%), Positives = 90/215 (41%), Gaps = 17/215 (7%)

Query: 161 SAKAKLGIKDLLEKIITTIPAPSGDPNAPLKALIYDSWFDNYLGALALVRIMDGSINTEQ 220
SAK +GI +L+E I + + + L ++ + LA +R+ G ++
Sbjct: 220 SAKNNIGIDNLIEVITNKFYSSTHRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRD 279

Query: 221 EILVMGTGKKHGVLGLYYPNPLKKIPTKSLECGEIGIV---SLGLKSVTDIAVGDTLTDA 277
+ + K + +Y + GEI I+ L L SV +GDT
Sbjct: 280 SVRISEKEKI-KITEMYTSINGELCKIDKAYSGEIVILQNEFLKLNSV----LGDTKLL- 333

Query: 278 KNPTSKPIEGFMPAKPFVFAGLYPIETDRFEDLREALLKLQLNDCALNFEPESSVALGFG 337
P + IE P + + P + + E L +ALL++ +D L + +S+
Sbjct: 334 --PQRERIEN---PLPLLQTTVEPSKPQQREMLLDALLEISDSDPLLRYYVDSATH---E 385

Query: 338 FRVGFLGLLHMEVIKERLEREFGLNLIATAPTVVY 372
+ FLG + MEV L+ ++ + + PTV+Y
Sbjct: 386 IILSFLGKVQMEVTCALLQEKYHVEIEIKEPTVIY 420



Score = 31.0 bits (70), Expect = 0.015
Identities = 15/75 (20%), Positives = 28/75 (37%), Gaps = 2/75 (2%)

Query: 399 IKEPFVRATIITPSEFLGNLMQLLNNKRGIQEKMEYLNQSRVMLTYSLPSNEIVMDFYDK 458
+ EP++ I P E+L + L + V+L+ +P+ I ++
Sbjct: 535 LLEPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQ-LKNNEVILSGEIPARCI-QEYRSD 592

Query: 459 LKSCTKGYASFDYEP 473
L T G + E
Sbjct: 593 LTFFTNGRSVCLTEL 607


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits        Score  E-value  Comments
hp2017_0359  FLGHOOKAP1  30     0.010    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 29.9 bits (67), Expect = 0.010
Identities = 9/40 (22%), Positives = 16/40 (40%)

Query: 3 NGYYAATGAMATQFNRLDLTSNNLANLNTNGFKRDDAITG 42
+ A + L+ SNN+++ N G+ R I
Sbjct: 2 SLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMA 41


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
hp2017_0360  TCRTETB  39     2e-05    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 39.5 bits (92), Expect = 2e-05
Identities = 42/182 (23%), Positives = 71/182 (39%), Gaps = 33/182 (18%)

Query: 37 APYFAKEFTHTNDPTLALISAFLVFMLGFFMRPLGSLFFGKLGDKKGRKTSMVYSIILMA 96
P A +F T + +AF++ G+ +GKL D+ G K +++ II+
Sbjct: 37 LPDIANDFNKPPASTNWVNTAFMLTFSI------GTAVYGKLSDQLGIKRLLLFGIIINC 90

Query: 97 LGSFMLALLPTKEIVGEWAFLFLLLARLLQGFSVGGE------YGVVATYLSELGKNGKK 150
GS + VG F L++AR +QG G VVA Y+ + +
Sbjct: 91 FGSVIGF-------VGHSFFSLLIMARFIQG--AGAAAFPALVMVVVARYIPKENRGKAF 141

Query: 151 GFYGSFQYVTLVGGQLLAIFSLFIVENIYTHEQISAFAWRYLFALGGILALLSLFLRNIM 210
G GS + +G + I I+ W YL + I + FL ++
Sbjct: 142 GLIGS---IVAMGEGVGPAIGGMIAHYIH---------WSYLLLIPMITIITVPFLMKLL 189

Query: 211 EE 212
++
Sbjct: 190 KK 191


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits         Score  E-value  Comments
hp2017_0361  IGASERPTASE  35     0.001    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 35.0 bits (80), Expect = 0.001
Identities = 31/172 (18%), Positives = 59/172 (34%), Gaps = 4/172 (2%)

Query: 198 KENPIDESHKPPNEESFLAIPTPYNTTLNDSEPQEGLVQISPHPPTHYTIYPKKNRFNDL 257
N + + P E+ + T TT N+ + V + P
Sbjct: 973 NVNGRYDLYNPEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPAT 1032

Query: 258 TNPTNPT--LEPQQETKEREPTLKKETPTTL--KPIMPISAPNTENDNKTENHKTPNHPI 313
+ T T +QE+K E + T TT + + + N + + +T
Sbjct: 1033 PSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSET 1092

Query: 314 KKDDLQENAQEENIEEKENLKEEKRETQNAPNFSPLTPTSAKKPVMVKELSE 365
K+ E + +E++E K E +TQ P + ++ V+ +E
Sbjct: 1093 KETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAE 1144


21  hp2017_0587  hp2017_0601  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
hp2017_0587  -3      12       -1.225948  methyl-accepting chemotaxis protein
hp2017_0588  -2      11       -1.163872  putative secretion/ efflux ABC transporter
hp2017_0589  -2      10       0.433318   Flagellin A
hp2017_0590  -3      10       0.687241   3-methyladenine DNA glycosylase
hp2017_0591  -1      11       0.974754   hypothetical protein
hp2017_0592  1       9        0.365477   Uroporphyrinogen III decarboxylase
hp2017_0593  2       10       0.139457   hypothetical protein
hp2017_0594  2       10       0.232584   putative efflux transporter
hp2017_0595  2       9        0.110041   putative efflux transporter
hp2017_0596  3       9        -0.620895  hypothetical protein
hp2017_0597  1       9        -0.391412  putative vacuolating cytotoxin like protein
hp2017_0598  0       13       -0.802337  hypothetical protein
hp2017_0599  -2      10       0.280773   ATP-binding ABC transporter protein
hp2017_0600  -1      11       0.299647   hypothetical protein
hp2017_0601  -1      10       -0.292612  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits        Score  E-value  Comments
hp2017_0587  OMS28PORIN  30     0.014    OMS28 porin signature.
		>OMS28PORIN#OMS28 porin signature.

Length = 257

Score = 30.1 bits (67), Expect = 0.014
Identities = 26/102 (25%), Positives = 49/102 (48%), Gaps = 2/102 (1%)

Query: 143 NAAKNGEEHSNEGLITVNKTGQDIESLYEKMQNATSLADSLNQRS--NEITQVISLIDDI 200
N + ++ N+ L T+NK +D+ S E ++ ++ N + +SL+ D+
Sbjct: 47 NKKLDQKDQVNQALDTINKVTEDVSSKLEGVRESSLELVESNDAGVVKKFVGSMSLMSDV 106

Query: 201 AEQTNLLALNAAIEAARAGEHGRGFAVVADEVRKLAEKTQKA 242
A+ T + + A I A +G G V + +K ++TQKA
Sbjct: 107 AKGTVVASQEATIVAKCSGMVAEGANKVVEMSKKAVQETQKA 148


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits       Score  E-value  Comments
hp2017_0589  FLAGELLIN  244    6e-77    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 244 bits (624), Expect = 6e-77
Identities = 126/518 (24%), Positives = 209/518 (40%), Gaps = 22/518 (4%)

Query: 2 AFQVNTNINAMNAHVQSALTQNALKTSLERLSSGLRINKAADDASGMTVADSLRSQASSL 61
A +NTN ++ +Q++L +++ERLSSGLRIN A DDA+G +A+ S L
Sbjct: 1 AQVINTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGL 60

Query: 62 GQAIANTNDGMGIIQVADKAMDEQLKILDTVKVKATQAAQDGQTTESRKAIQSDIVRLIQ 121
QA N NDG+ I Q + A++E L V+ + QA + K+IQ +I + ++
Sbjct: 61 TQASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLE 120

Query: 122 GLDNIGNTTTYNGQALLSGQFTNKEFQVGAYSNQSIKASIGSTTSDKIGQVRI-ATGALI 180
+D + N T +NG +LS + QVGA ++I + +G G
Sbjct: 121 EIDRVSNQTQFNGVKVLSQDN-QMKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPKE 179

Query: 181 TASGDISLTFKQVDGVNDVTLESVKVSSSAGTGIGVLAEVINKNSNRTGVKAYASVITTS 240
GD+ +FK V G + + + K +G V ++ V A +TT
Sbjct: 180 ATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTTD 239

Query: 241 DVAVQSGSLSNLTLNGIHLGNIADIKKNDSDGRLVAAINAVTSETGVEAYTDQKGRLNLR 300
D N + K A A+ + + + +
Sbjct: 240 DAE-----------NNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTID 288

Query: 301 SIDGRGIEIKTDSVSNGPSALTMVNGGQDLTKGSTNYGRLSLTRLDAKSINV------VS 354
+ G K + NG V S + +N +
Sbjct: 289 TKTGNDGNGKVSTTINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKT 348

Query: 355 ASDSQHLGFTAIGFGESQVAETTVNLRDVTGNFNANVKSASGANYNAVIASGNQSL---G 411
++S L ++ TVN + T N + + +G + S
Sbjct: 349 KNESAKLSDLEANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFIDKTASGVSTLINE 408

Query: 412 SGVTTLRGAMVVIDIAESAMKMLDKVRSDLGSVQNQMISTVNNISITQVNVKAAESQIRD 471
+ + +SA+ +D VRS LG++QN+ S + N+ T N+ +A S+I D
Sbjct: 409 DAAAAKKSTANPLASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARSRIED 468

Query: 472 VDFAEESANFNKNNILAQSGSYAMSQANTVQQNILRLL 509
D+A E +N +K IL Q+G+ ++QAN V QN+L LL
Sbjct: 469 ADYATEVSNMSKAQILQQAGTSVLAQANQVPQNVLSLL 506


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits       Score  E-value  Comments
hp2017_0593  RTXTOXIND  30     0.020    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 30.2 bits (68), Expect = 0.020
Identities = 22/167 (13%), Positives = 61/167 (36%), Gaps = 18/167 (10%)

Query: 151 AQVKLNVFNGFSDVNNVKEKSAT--YRSNVATLEYSRQSIFLQVVQQYYEYFNNLARMIA 208
++KL F +V+ + T + +T + + L + ++ E LAR+
Sbjct: 166 PELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINR 225

Query: 209 LQKKLEQIQTDIKRVTKLYDKGLTTIDDL-----QSLKAQGNLSEY--DILDMQFALEQN 261
+ ++ + + L K + + ++A L Y + ++ +
Sbjct: 226 YENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSA 285

Query: 262 RLTLEYLTNLNVKNLKKTTIDVPNLQLRE-RKDLVSLREQISALKYQ 307
+ + +T K +D +LR+ ++ L +++ + +
Sbjct: 286 KEEYQLVTQ----LFKNEILD----KLRQTTDNIGLLTLELAKNEER 324


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits       Score  E-value  Comments
hp2017_0594  RTXTOXIND  49     4e-09    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 49.4 bits (118), Expect = 4e-09
Identities = 22/69 (31%), Positives = 34/69 (49%)

Query: 40 STGIVDSIKVTEGSVVKKGDVLLLLYNQDKQAQSDSTEQQLIFAKKQYQRYNKIGGAVDK 99
IV I V EG V+KGDVLL L +A + T+ L+ A+ + RY + +++
Sbjct: 103 ENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIEL 162

Query: 100 NTLEGYEFT 108
N L +
Sbjct: 163 NKLPELKLP 171



Score = 30.2 bits (68), Expect = 0.008
Identities = 21/150 (14%), Positives = 51/150 (34%), Gaps = 21/150 (14%)

Query: 70 QAQSDSTEQQLIFAKKQYQRYNKIGGAVDKNTLEGYEFTYRRLESDYAYSIAVLNKTILR 129
+++ S +++ + ++ +I + + T T +++ +++R
Sbjct: 279 ESEILSAKEEYQLVTQLFKN--EILDKLRQTTDNIGLLTLELAKNE-----ERQQASVIR 331

Query: 130 APFDGVIASKNIQVGEGVSANNTVLLRLVSHARKLVIE--FDSKYINAVKVG-------D 180
AP + + GV L+ +V L + +K I + VG +
Sbjct: 332 APVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVE 391

Query: 181 TYTYSIDGDSNQHEAKITKIYP--TVDENT 208
+ Y+ G K+ I D+
Sbjct: 392 AFPYTRYGYL---VGKVKNINLDAIEDQRL 418


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
hp2017_0595  ACRIFLAVINRP  898    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 898 bits (2323), Expect = 0.0
Identities = 286/1040 (27%), Positives = 518/1040 (49%), Gaps = 42/1040 (4%)

Query: 1 MYKTAINRPITTLMFALAIVFFGVMGFKKLSVALFPKIDLPTVVVTTTYPGASAEIIESK 60
M I RPI + A+ ++ G + +L VA +P I P V V+ YPGA A+ ++
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VTDKIEEAVMGIDGIKKVTSTSSKNVSIVV-IEFELEKPNEEALNDVMNKISSVR-FDDS 118
VT IE+ + GID + ++STS S+ + + F+ + A V NK+
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 119 NIKKPSINKFDTDSQPIISLFVSSSSVPAT--TLNDYAKNTIKPMLQKINGVGGVQLNGF 176
+++ I+ + S ++ S + T ++DY + +K L ++NGVG VQL G
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFG- 179

Query: 177 RERQIRIYADPTLMNKYNLTYADLFSTLKAENVEIDGGRIVNS------QRELSILINAN 230
+ +RI+ D L+NKY LT D+ + LK +N +I G++ + Q SI+
Sbjct: 180 AQYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 231 SYSVADVEKIQV-----GNHVRLGDIAKIEIGLEEDNTFASFKDKPGVILEIQKIAGANE 285
+ + K+ + G+ VRL D+A++E+G E N A KP L I+ GAN
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANA 299

Query: 286 IEIVDRVYEALKRIQAISP-NYEIRPFLDTTSYIRTSIEDVKFDLILGAILAVLVVFAFL 344
++ + L +Q P ++ DTT +++ SI +V L +L LV++ FL
Sbjct: 300 LDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFL 359

Query: 345 RNGTITLVSAISIPISIMGTFALIQWMGFSLNMLTMVALTLAIGIIIDDAIVVIENIHK- 403
+N TL+ I++P+ ++GTFA++ G+S+N LTM + LAIG+++DDAIVV+EN+ +
Sbjct: 360 QNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERV 419

Query: 404 KLEMGMSKRKASYEGVREIGFALVAISAMLLSVFVPIGNMKGIIGRFFQSFGITVALAIA 463
+E + ++A+ + + +I ALV I+ +L +VF+P+ G G ++ F IT+ A+A
Sbjct: 420 MMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMA 479

Query: 464 LSYVVVVTIIPMVSSVVVNPRHS-------RFYVWSEPFFKALESRYTRLLQWVLNHKLI 516
LS +V + + P + + ++ P + F+ W F + YT + +L
Sbjct: 480 LSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 517 IFITVVLVFVGSLFVASKLGMDFMLKEDRGRFLVWLKAKPGVSIDY----MTQKSKIFQK 572
+ L+ G + + +L F+ +ED+G FL ++ G + + + Q + + K
Sbjct: 540 YLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLK 599

Query: 573 AIEKHDEVEFTTLQVGY-GTSQNPFKAKIFVQLKPLKERKKEHELGQFELMSALRKELRS 631
+ + E FT + G +QN FV LKP +ER E ++ + EL
Sbjct: 600 NEKANVESVFTVNGFSFSGQAQNA--GMAFVSLKPWEERNG-DENSAEAVIHRAKMELGK 656

Query: 632 LPEAKDLENINLSEVSLIGGGGDSSPFQTFVFSHSQEAVDKSVENLRKFLLESPELKGKV 691
+ + + N+ + G ++ F + + D + + L + + +
Sbjct: 657 IRDGF-VIPFNMPAIV---ELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASL 712

Query: 692 EGYHTSTSESQPQLQLKILRQNANKYGVSAQTIGSVVSSAFSGTSQASVFKEDGKEYDMI 751
+ E Q +L++ ++ A GVS I +S+A G + + F + G+ +
Sbjct: 713 VSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGG-TYVNDFIDRGRVKKLY 771

Query: 752 IRVPDDKRVSVEDIKRLQVRNKYDKLMFLDALVEITETKSPSSISRYNRQRSVTVLAQPN 811
++ R+ ED+ +L VR+ +++ A + RYN S+ + +
Sbjct: 772 VQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEA- 830

Query: 812 RNAGVSLGEILTQVSKNTKEWLVEGANYRFTGEADNAKESNGEFLVALATAFVLIYMILA 871
G S G+ + + +N L G Y +TG + + S + +A +FV++++ LA
Sbjct: 831 -APGTSSGDAMALM-ENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLA 888

Query: 872 ALYESILEPFIIMVTMPLSFSGAFFALGLVHQPLSMFSMIGLILLIGMVGKNATLLIDVA 931
ALYES P +M+ +PL G A L +Q ++ M+GL+ IG+ KNA L+++ A
Sbjct: 889 ALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFA 948

Query: 932 NE-ERKKGLNIQEAILFAGKTRLRPILMTTIAMVCGMLPLALASGDGAAMKSPIGIAMSG 990
+ K+G + EA L A + RLRPILMT++A + G+LPLA+++G G+ ++ +GI + G
Sbjct: 949 KDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMG 1008

Query: 991 GLMISMVLSLLIVPVFYRLL 1010
G++ + +L++ VPVF+ ++
Sbjct: 1009 GMVSATLLAIFFVPVFFVVI 1028


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
hp2017_0597  VACCYTOTOXIN  270    4e-75    Helicobacter pylori vacuolating cytotoxin signature.
		>VACCYTOTOXIN#Helicobacter pylori vacuolating cytotoxin signature.

Length = 1291

Score = 270 bits (691), Expect = 4e-75
Identities = 105/397 (26%), Positives = 180/397 (45%), Gaps = 14/397 (3%)

Query: 2804 AGNNSIMWLNELFAAKGGNPLFAPYYLQDNPTEHIVTLMKDIASALGMLSNSNLKNNSTD 2863
+G L L + +A + I + + L +++ K +
Sbjct: 904 SGAQGRDLLQTLLI-DSHDAGYARTMIDATSANEITKQLNTATTTLNNIASLEHKTSGLQ 962

Query: 2864 VLQLNTYTQQMSRLAKLSNFASFDSTDFSERLSSLKNQRFADATPNAMDVILKYSQRDKL 2923
L L+ SRL LS + F++RL +LK+QRFA +A +V+ +++ + +
Sbjct: 963 TLSLSNAMILNSRLVNLSRRHTNHIDSFAKRLQALKDQRFAS-LESAAEVLYQFAPKYEK 1021

Query: 2924 KNNLWATGVGGVSFVENGTGTLYGVNVGYDRFVRG---VIVGGYAAYGYSGFYER--ITN 2978
N+WA +GG S G +LYG + G D ++ G IVGG+ +YGYS F + N
Sbjct: 1022 PTNVWANAIGGTSLNSGGNASLYGTSAGVDAYLNGEVEAIVGGFGSYGYSSFSNQANSLN 1081

Query: 2979 SKSDNVDVGLYARAFIKKSELTFSVNETWGANKTQISSNDTLLSMINQSYKYSTWTTNAK 3038
S ++N + G+Y+R F + E F G++++ ++ LL +NQSY Y ++ +
Sbjct: 1082 SGANNTNFGVYSRIFANQHEFDFEAQGALGSDQSSLNFKSALLRDLNQSYNYLAYSAATR 1141

Query: 3039 VNYGYDFMFKNKSIILKPQIGLRYYYIGMSGLEGVMNNALYNQFKANADPSKKSVLTIDF 3098
+YGYDF F +++LKP +G+ Y ++G + + + S + +
Sbjct: 1142 ASYGYDFAFFRNALVLKPSVGVSYNHLGSTNFKS----NSNQKVALKNGASSQHLFNASA 1197

Query: 3099 ALENRHYFNTNSYFYAIGGVGRDLLVNSMGDKLVRFIGNNTLSYRKGDLYNTFANITTGG 3158
+E R+Y+ SYFY GV ++ N V + R NT A + GG
Sbjct: 1198 NVEARYYYGDTSYFYMNAGVLQEFA-NFGSSNAVSLNTFKVNATRNP--LNTHARVMMGG 1254

Query: 3159 EVRLFKSFYANAGVGARFGLDYKMIDIIGNIGMRLAF 3195
E++L K + N G L + N+GMR +F
Sbjct: 1255 ELKLAKEVFLNLGFVYLHNLISNIGHFASNLGMRYSF 1291



Score = 40.0 bits (93), Expect = 2e-04
Identities = 44/224 (19%), Positives = 78/224 (34%), Gaps = 19/224 (8%)

Query: 251 SSGATTISGV-TFNNNGALTYKGGNGIGGSITFTNSNINHYKLNLNANSVTFNNSTLGSM 309
+ G T+ + N N T + G G S+T ++++ K +N ++ S L
Sbjct: 386 AGGKNTVVNINRINTNADGTIRVG-GFKASLTTNAAHLHIGKGGINLSNQASGRSLLVEN 444

Query: 310 PNGNANTIGNAYILNANNITFNNLTFNGGWFVFDRTNANVNFQGTTTINNPTSPFVNMTG 369
GN G L NN G + ++AN F+ T N T+ F N
Sbjct: 445 LTGNITVDGP---LRVNNQV--------GGYALAGSSANFEFKAGTDTKNGTATFNNDIS 493

Query: 370 KVNINANAIFNIQNYTPSIGSAYTLFSMKNGNITYNDVNNLWNIIRLKNTQATKDNSKNT 429
+ I + F+ + ++ V N NI +L A+ + +
Sbjct: 494 LGRFVNLKVDAHTANFKGIDTGNGGFNT----LDFSGVTNKVNINKL--ITASTNVAVKN 547

Query: 430 TSNNNTHTYYVTYNLGGMLYHFRQIFSPNSIVLQSVYYGANNIY 473
+ N ++G + I S + I + G +IY
Sbjct: 548 FNINELVVKTNGVSVGEYTHFSEDIGSQSRINTVRLETGTRSIY 591



Score = 33.9 bits (77), Expect = 0.016
Identities = 15/100 (15%), Positives = 31/100 (31%), Gaps = 5/100 (5%)

Query: 702 SYTFDGINNTFNEDKFNGGSFNFNHAEQTNAFNNNSFNGGSFSFNAKQVNFNHNSFNGGV 761
SY+ + E FN + ++A Q +N + G+ + N + G
Sbjct: 272 SYSTINTSKVTGEVNFNHLTVGDHNAAQAGIIASNKTHIGTLDLW-QSAGLNIIAPPEGG 330

Query: 762 FNF---NNTPKASFTNDTFNVNNQFKING-TQTDFTFSKG 797
+ + + N + + N TQ +
Sbjct: 331 YKDKPNDKPSNTTQNNAKNDKQESSQNNSNTQVINPPNSA 370


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits         Score  E-value  Comments
hp2017_0601  LCRVANTIGEN  30     0.001    Low calcium response V antigen signature.
		>LCRVANTIGEN#Low calcium response V antigen signature.

Length = 326

Score = 30.4 bits (68), Expect = 0.001
Identities = 15/33 (45%), Positives = 20/33 (60%)

Query: 16 KRKRLLTELAELEAEIKVGSERRSSFNVSLSPS 48
R +L ELAEL AE+K+ S ++ N LS S
Sbjct: 149 ARSKLREELAELTAELKIYSVIQAEINKHLSSS 181


22  hp2017_0948  hp2017_0955  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
hp2017_0948  -1      10       0.043477   hypothetical protein
hp2017_0949  -3      10       0.501115   Cobalt-zinc-cadmium resistance protein/Cation
hp2017_0950  -2      11       -0.352183  putative cobalt-zinc-cadmium resistance protein
hp2017_0951  -1      11       -0.273953  hypothetical protein
hp2017_0952  -1      13       -0.487657  Glycyl-tRNA synthetase beta chain
hp2017_0953  -2      11       0.518033   hypothetical protein
hp2017_0954  -1      13       1.219825   2,3-bisphosphoglycerate-independent
hp2017_0955  -1      13       0.935311   Aspartyl-tRNA-Asn- amidotransferase subunit
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
hp2017_0948  LPSBIOSNTHSS  25     0.035    Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein signature.

Length = 166

Score = 25.2 bits (55), Expect = 0.035
Identities = 16/69 (23%), Positives = 27/69 (39%), Gaps = 12/69 (17%)

Query: 12 LKDALIDYLFEKGFDDFFYV--ECYKYAASSLLLSQKEQVSGRKDYAKFKLFLSEEVALP 69
L+ A + + F Y + +SSL+ K+ A+F + V
Sbjct: 98 LQMANTNKTLASDLETVFLTTSTEYSFLSSSLV----------KEVARFGGNVEHFVPSH 147

Query: 70 LAQALKNQF 78
+A AL +QF
Sbjct: 148 VAAALYDQF 156


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
hp2017_0949  ACRIFLAVINRP  752    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 752 bits (1942), Expect = 0.0
Identities = 225/1044 (21%), Positives = 460/1044 (44%), Gaps = 42/1044 (4%)

Query: 6 IIEFSLRQRVIVIVGAILILFFGTYSFIHTPVDAFPDISPTQVKIILKLPGSSPEEMENN 65
+ F +R+ + V AI+++ G + + PV +P I+P V + PG+ + +++
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 66 IVRPLELELLGLKGQKSLRSVSKYSIS-DITIDFDDSVDIYLARNIVNERLSSVMKDLPM 124
+ + +E + G+ + S S + S IT+ F D +A+ V +L LP
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 125 GVEGGMAPIVTPLSDIFMF----TIDGNITEIEKRQLLDFVIRPQLRMISGVADVNSIGG 180
V+ + S M + + T+ + + ++ L ++GV DV G
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 181 FSKAFVIVPDFNDMARLGVSISDLESAVRVNLRNSGAGRVDR----DGETFLVKI--QTA 234
A I D + + + ++ D+ + ++V AG++ G+ I QT
Sbjct: 181 -QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 235 SLSLEDIGKITV--STNLGHLHIKDFAKVISQSRTRLGFVTKDGVGETTEGLVLSLKEAN 292
+ E+ GK+T+ +++ + +KD A+V +G + AN
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGK-PAAGLGIKLATGAN 298

Query: 293 TKKIITQVYQKLEELKPLLPSGVSLNVFYDRSEFTQKAIATVSKTLIEAVVLIIITLFLF 352
+ KL EL+P P G+ + YD + F Q +I V KTL EA++L+ + ++LF
Sbjct: 299 ALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLF 358

Query: 353 LGNLRASVAVGVILPLSLSVAFIFIKLNNLTLNLMSLGGLIIAIGMLIDSAVVVVENAFE 412
L N+RA++ + +P+ L F + ++N +++ G+++AIG+L+D A+VVVEN E
Sbjct: 359 LQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENV-E 417

Query: 413 KLSANTKTTKLHAIYRSCKEIAVSVVSGVVIIIVFFVPILTLQGLEGKMFRPLAQSIVYA 472
++ K A +S +I ++V +++ F+P+ G G ++R + +IV A
Sbjct: 418 RVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSA 477

Query: 473 LLGTLVLSITIIPVVSSLVLK--ATPHSET---FLTRFLNRIYGPLLEFFVRNPKKVI-- 525
+ ++++++ + P + + +LK + H E F F N + + + + K++
Sbjct: 478 MALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWF-NTTFDHSVNHYTNSVGKILGS 536

Query: 526 ----LGAFVFLIA-SLSLFPFVGKNFMPTLDEGDVVLSVETTPSISLDQSKDLILNIESA 580
L + ++A + LF + +F+P D+G + ++ + ++++ ++ +
Sbjct: 537 TGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDY 596

Query: 581 IKKHVKEVKTIVARTGSDELGLDLGGLNQTDTFISFIPKKEWSVKTKDELL-EKIMDSLK 639
K+ K V G G Q + ++F+ K W + DE E ++ K
Sbjct: 597 YLKNEKANVESVFTVN----GFSFSGQAQ-NAGMAFVSLKPWEERNGDENSAEAVIHRAK 651

Query: 640 -DFKGINFSFTQPIEM-RISEMLTGVRGDLA-VKIFGDDISELNGLSFQIA-QALKGIKG 695
+ I F P M I E+ T D + G L Q+ A +
Sbjct: 652 MELGKIRDGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPAS 711

Query: 696 SSEVLTTLNEGVNYLYVTPNKEAMANVGITSDEFSKFLKSALEGLIVDVIPTGISRTPVM 755
V E + ++E +G++ + ++ + +AL G V+ +
Sbjct: 712 LVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLY 771

Query: 756 IRQEIDFASSITKIKSLALTSKYGVLVPITSIAKIEEVDGPVSIVREDSRRMSVVRSNVV 815
++ + F + L + S G +VP ++ V G + R + ++
Sbjct: 772 VQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAA 831

Query: 816 GRDLNSFVEEAKKVIAQNVKLPPSYYITYGGQFENQQRANKRLSTVIPLSILAIFFILFF 875
+ + +A KLP + G ++ + + ++ +S + +F L
Sbjct: 832 PGTSSGDAMALMENLA--SKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAA 889

Query: 876 TFKSIPLALLILLNIPFAVTGGLIALFAVGEYISVPASVGFIALFGIAVLNGVVMIGYFK 935
++S + + ++L +P + G L+A + V VG + G++ N ++++ + K
Sbjct: 890 LYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAK 949

Query: 936 ELLL-QGKSVEECVLLGAKRRLRPVLMTACIAGLGLIPLLFSHSVGSEVQKPLAIVVLGG 994
+L+ +GK V E L+ + RLRP+LMT+ LG++PL S+ GS Q + I V+GG
Sbjct: 950 DLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGG 1009

Query: 995 LVTSSALTLLLLPPMFMLIAKKIK 1018
+V+++ L + +P F++I + K
Sbjct: 1010 MVSATLLAIFFVPVFFVVIRRCFK 1033


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
hp2017_0950  RTXTOXIND     29     0.026    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 29.4 bits (66), Expect = 0.026
Identities = 25/159 (15%), Positives = 59/159 (37%), Gaps = 31/159 (19%)

Query: 7 WLMLMGVFLMGVFLGAKEYPEIVLEEKNLQPMGLKVIKLDKEIFSKGLPFNAYIDFDSKS 66
L+ F+MG + A +L + +++ N + +S
Sbjct: 56 RPRLVAYFIMGFLVIA-----FIL---------SVLGQVEI-----VATANGKLTHSGRS 96

Query: 67 SVVQSLSFDASVVAVYKREGEQVKAGDAICEVSSID-------LSNLYFELQNNQNKLKI 119
++ + ++ V + +EGE V+ GD + +++++ + + + Q + +I
Sbjct: 97 KEIKPIE-NSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQI 155

Query: 120 AKDITKKDLELYKAGVIPKREYQTSFLASEEMGLKVNQL 158
+EL K + + SEE L++ L
Sbjct: 156 LSRS----IELNKLPELKLPDEPYFQNVSEEEVLRLTSL 190


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
hp2017_0955  TYPE3IMSPROT  25     0.042    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 25.1 bits (55), Expect = 0.042
Identities = 10/36 (27%), Positives = 16/36 (44%), Gaps = 10/36 (27%)

Query: 5 DTLLQR---LEKLSM--LEIKDEHKES-----VKGH 30
D + +++L M EIK E+KE +K
Sbjct: 202 DYAFEYYQYIKELKMSKDEIKREYKEMEGSPEIKSK 237
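
The Score, Expect and Identities lines in the listings above follow a fixed text pattern, so the per-hit statistics can be extracted and re-checked programmatically. The following is a minimal Python sketch, not part of PredictBias itself: the names `parse_hsp_stats` and `significant` are hypothetical, and the default cutoff of 0.05 is taken from the E-value (RPSBlast) input parameter shown on this page. Percentages are floored because that matches the values printed in this report (for example, 20/33 is shown as 60%).

```python
import re

# Matches lines such as "Score = 30.4 bits (68), Expect = 0.001".
SCORE_RE = re.compile(
    r"Score\s*=\s*([\d.]+)\s*bits\s*\((\d+)\),\s*Expect\s*=\s*([\d.eE+-]+)"
)
# Matches "Identities = 10/36 (27%), Positives = 16/36 (44%), Gaps = 10/36 (27%)".
# The Gaps field is absent when an alignment has no gaps.
IDENT_RE = re.compile(
    r"Identities\s*=\s*(\d+)/(\d+).*?Positives\s*=\s*(\d+)/\d+"
    r"(?:.*?Gaps\s*=\s*(\d+)/\d+)?"
)


def parse_hsp_stats(text):
    """Return one dict per Score line, enriched by the Identities line after it."""
    hits = []
    for line in text.splitlines():
        score = SCORE_RE.search(line)
        if score:
            hits.append({
                "bits": float(score.group(1)),
                "raw_score": int(score.group(2)),
                "evalue": float(score.group(3)),
            })
            continue
        ident = IDENT_RE.search(line)
        if ident and hits:
            identical, length, positives, gaps = ident.groups()
            length = int(length)
            hits[-1].update(
                length=length,
                # Floor division reproduces the truncated percentages in the report.
                pct_identity=100 * int(identical) // length,
                pct_positives=100 * int(positives) // length,
                gaps=int(gaps) if gaps is not None else 0,
            )
    return hits


def significant(hits, cutoff=0.05):
    """Keep hits whose E-value does not exceed the cutoff (assumed default: 0.05)."""
    return [h for h in hits if h["evalue"] <= cutoff]


# Two statistics blocks copied from the listings above.
SAMPLE = """\
Score = 30.4 bits (68), Expect = 0.001
Identities = 15/33 (45%), Positives = 20/33 (60%)

Score = 25.1 bits (55), Expect = 0.042
Identities = 10/36 (27%), Positives = 16/36 (44%), Gaps = 10/36 (27%)
"""

hits = parse_hsp_stats(SAMPLE)
```

Calling `significant(hits, cutoff=0.01)` on this sample would keep only the first hit, which illustrates how sensitive the weaker matches are to the chosen E-value threshold.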



 
Contact Sachin Pundhir for Bugs/Comments.