PredictBias

Identification of genomic and pathogenicity islands in prokaryotic genomes
 
A) Input parameters
Genome: 2018.gbk
Threshold dinucleotide bias: 2
Threshold codon bias: 4
Threshold %GC bias: 3
E-value (RPSBlast): 0.05
Genome (non-pathogenic):
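
The parameter values above can be kept in one place, and the E-value cutoff can be applied to a list of RPS-BLAST hits. The sketch below is illustrative only: the dictionary keys, the function name `significant_hits`, the tuple layout and the inclusive `<=` comparison are assumptions rather than PredictBias internals, and the third hit is invented to show a rejected case.

```python
# Threshold values copied from the "Input parameters" panel.
# The key names are hypothetical; only the numbers come from the report.
PARAMS = {
    "dinucleotide_bias": 2,
    "codon_bias": 4,
    "gc_bias": 3,
    "rpsblast_evalue": 0.05,
}


def significant_hits(hits, cutoff=PARAMS["rpsblast_evalue"]):
    """Keep (locus_tag, profile, evalue) tuples whose E-value is <= cutoff.

    Whether the real tool treats the cutoff as inclusive is an assumption.
    """
    return [hit for hit in hits if hit[2] <= cutoff]


hits = [
    ("hp2018_0060", "ANTHRAXTOXNA", 0.036),  # reported in this page
    ("hp2018_0078", "UREASE", 0.0),          # reported in this page
    ("example_orf", "EXAMPLE", 0.2),         # hypothetical, above the cutoff
]

kept = significant_hits(hits)
for locus_tag, profile, evalue in kept:
    print(locus_tag, profile, evalue)
```

Every hit listed in this report has an E-value at or below 0.05, which is consistent with a filter of this kind.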
 
B) Compare a potential GI or PAI in related non-pathogenic sp. (phylogenetic tree)
 
C) Potential GIs and PAIs in CP002572
S.No  Start         End           Bias  Virulence  Insertion elements  Prediction
1     hp2018_0005   hp2018_00112  Y     N          Y                   Genomic Island
LocusTag      DNBias  CDNBias  %GCBias    Product
hp2018_0005   3       16       -1.483375  Orotidine 5'-phosphate decarboxylase
hp2018_0006   3       18       -1.911623  Pantoate-beta-alanine ligase
hp2018_0007   1       18       1.204512   *****hypothetical protein
hp2018_0008   1       19       1.268637   hypothetical protein
hp2018_0009   0       21       3.783549   hypothetical protein
hp2018_0010   0       22       3.891605   hypothetical protein
hp2018_00111  -1      16       3.042070   outer membrane protein - adhesin
hp2018_00112  -2      14       3.002443   outer membrane protein - adhesin
2     hp2018_0053   hp2018_0089   Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
hp2018_0053   -1      13       -3.130278  NiFe hydrogenase metallocenter assembly protein
hp2018_0054   -2      13       -3.374569  Agmatine deiminase
hp2018_0055   -2      9        -2.166628  Adenine-specific methyltransferase
hp2018_0056   -1      9        -2.220268  hypothetical protein
hp2018_0057   -1      10       -2.293142  hypothetical protein
hp2018_0058   -1      9        -1.130933  adenine/cytosine DNA methyltransferase
hp2018_0059   1       12       1.167520   Proline/sodium/Propionate symporter
hp2018_0060   2       14       0.369614   Proline dehydrogenase /oxidase/
hp2018_0061   3       15       -0.400622  hypothetical protein
hp2018_0062   3       15       -0.101233  hypothetical protein
hp2018_0063   2       15       0.289469   hypothetical protein
hp2018_0064   2       15       0.480558   hypothetical protein
hp2018_00651  1       17       0.333468   Dihydrolipoamide acetyltransferase
hp2018_00652  1       18       0.239634   Dihydrolipoamide acetyltransferase
hp2018_00653  2       19       -0.238804  Dihydrolipoamide acetyltransferase
hp2018_0066   0       16       -0.178609  hypothetical protein
hp2018_0068   1       14       0.461125   hypothetical protein
hp2018_0069   1       15       1.150950   hypothetical protein
hp2018_0070   0       14       1.452035   hypothetical protein
hp2018_0071   0       14       1.549634   Cell division protein
hp2018_0072   1       15       1.961907   Cell division protein
hp2018_0073   1       15       2.970296   Urease accessory protein
hp2018_0074   5       22       3.600717   Urease accessory protein
hp2018_00751  5       22       3.228679   Urease accessory protein
hp2018_00752  4       20       2.634591   Urease accessory protein
hp2018_0076   3       17       2.647422   Urease accessory protein
hp2018_0077   3       20       2.599762   Urea transporter
hp2018_0078   2       20       2.181929   Urease alpha subunit
hp2018_0079   -1      12       1.141815   Urease beta /gamma subunit
hp2018_0080   0       13       0.772978   *Lipoprotein signal peptidase
hp2018_0081   0       14       1.346454   Phosphoglucosamine mutase
hp2018_0082   4       20       3.124210   30S ribosomal protein S20
hp2018_0083   4       19       2.920903   hypothetical protein
hp2018_0084   4       18       3.042906   Peptide chain release factor 1
hp2018_0085   6       21       2.837357   hypothetical protein
hp2018_0086   6       19       2.795220   putative Outer membrane protein
hp2018_0087   5       16       1.744667   outer membrane protein
hp2018_0088   4       14       0.418907   hypothetical protein
hp2018_0089   4       16       0.881156   dentin sialophosphoprotein preproprotein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
hp2018_0060   ANTHRAXTOXNA  31     0.036    Anthrax toxin LF subunit signature.
		>ANTHRAXTOXNA#Anthrax toxin LF subunit signature.

Length = 800

Score = 30.9 bits (69), Expect = 0.036
Identities = 36/173 (20%), Positives = 71/173 (41%), Gaps = 19/173 (10%)

Query: 121 QEESQLKERILKRKNEKIILNVNFIGEEVLGEEEANARFEKY---SQALKSNYIQYISIK 177
Q+ S+ ++ + + EK+ F+ E+ + + Y S+ K Y +
Sbjct: 118 QDLSEEEKNSMNSRGEKVPFASRFVFEKKRETPKLIINIKDYAINSEQSKEVYYEIGKGI 177

Query: 178 ITTIFSQINILDFEY-----SKKEIVKRLDALYALALEEEKKQGMPKFINLDMEEFRDLE 232
I S+ LD E+ S + D L++ +E K + K I+++ ++
Sbjct: 178 SLDIISKDKSLDPEFLNLIKSLSDDSDSSDLLFSQKFKE-KLELNNKSIDINF-----IK 231

Query: 233 LTVESFMESIAK-----FDLNAGIVLQAYIPDSYEYLKKLHAFSKERVLKGLK 280
+ F + + F + VL+ Y PD +EY+ KL E++ + LK
Sbjct: 232 ENLTEFQHAFSLAFSYYFAPDHRTVLELYAPDMFEYMNKLEKGGFEKISESLK 284
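
Each alignment in this report is preceded by a "Score = ..., Expect = ..." line and an "Identities = ..., Positives = ..., Gaps = ..." line. A minimal sketch for extracting those figures follows; the function names and the dictionary layout are assumptions made for illustration. The percentages printed in the report appear to be truncated rather than rounded (19/173 is shown as 10%), so the helper uses integer division.

```python
import re

# Patterns for the two statistics lines printed above each alignment.
SCORE_RE = re.compile(r"Score = ([\d.]+) bits \((\d+)\), Expect = (\S+)")
FRACTION_RE = re.compile(r"(Identities|Positives|Gaps) = (\d+)/(\d+)")


def parse_alignment_stats(score_line, identities_line):
    """Return bit score, raw score, E-value and the three count ratios."""
    match = SCORE_RE.search(score_line)
    if match is None:
        raise ValueError("not a Score/Expect line: %r" % score_line)
    stats = {
        "bits": float(match.group(1)),
        "raw_score": int(match.group(2)),
        "evalue": float(match.group(3)),
    }
    # Each ratio is stored as a (count, alignment_length) tuple.
    for name, count, total in FRACTION_RE.findall(identities_line):
        stats[name.lower()] = (int(count), int(total))
    return stats


def percent(count, total):
    """Integer percentage, truncated to match the figures in the report."""
    return count * 100 // total


stats = parse_alignment_stats(
    "Score = 30.9 bits (69), Expect = 0.036",
    "Identities = 36/173 (20%), Positives = 71/173 (41%), Gaps = 19/173 (10%)",
)
print(stats["bits"], stats["evalue"], percent(*stats["identities"]))
```

Keeping the raw counts rather than the printed percentages makes it possible to compare hits of different alignment lengths, such as the 18-residue PF05272 match against the 569-residue urease match.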


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
hp2018_0064   GPOSANCHOR    36     1e-04    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 35.8 bits (82), Expect = 1e-04
Identities = 37/206 (17%), Positives = 75/206 (36%), Gaps = 2/206 (0%)

Query: 17 ELEARIGELEDENTELFTTKDKLTKENTELASQNTALTEKNKTLTTENDKLNHQVIALTN 76
+ E L+ +N++L L N EL + + EK + + ++ L
Sbjct: 61 KFEIENNTLKLKNSDLSFNNKALKDHNDELTEELSNAKEKLRKNDKSLSEKASKIQELEA 120

Query: 77 EQNSLEQERAELQDEHGFLEKSCANLEKENQRLTDKLKQLESAQKNLENSNTQLRQALEN 136
+ LE+ + LE E L + LE A + N +T ++
Sbjct: 121 RKADLEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKT 180

Query: 137 SNAQLAQAEEKIAEEKTELEREIARLKSLEGMEAKSDLDLANRRLASANQDLKRQKRKLE 196
A+ A E + AE + LE + + L+ LA+ DL++
Sbjct: 181 LEAEKAALEARQAELEKALEGAMNFSTADS--AKIKTLEAEKAALAARKADLEKALEGAM 238

Query: 197 EENIALKERVDSLKEQLFTLQPQKPQ 222
+ A ++ +L+ + L+ ++ +
Sbjct: 239 NFSTADSAKIKTLEAEKAALEARQAE 264



Score = 35.0 bits (80), Expect = 2e-04
Identities = 43/207 (20%), Positives = 78/207 (37%), Gaps = 2/207 (0%)

Query: 16 EELEARIGELEDENTELFTTKDKLTKENTELASQNTALTEKNKTLTTENDKLNHQVIALT 75
+ L+ EL +E + K K +E AS+ L + L + + A +
Sbjct: 81 KALKDHNDELTEELSNAKEKLRKNDKSLSEKASKIQELEARKADLEKALEGAMNFSTADS 140

Query: 76 NEQNSLEQERAELQDEHGFLEKSCANLEKENQRLTDKLKQLESAQKNLENSNTQLRQALE 135
+ +LE E+A L LEK+ + + K+K LE+ + LE +L +ALE
Sbjct: 141 AKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALE 200

Query: 136 NSNAQLAQAEEKIAEEKTELEREIARLKSLEGMEAKSDLDLANRRLASANQDLKRQKRKL 195
+ KI + E AR LE +A + ++ + L+ +K L
Sbjct: 201 GAMNFSTADSAKIKTLEAEKAALAARKADLE--KALEGAMNFSTADSAKIKTLEAEKAAL 258

Query: 196 EEENIALKERVDSLKEQLFTLQPQKPQ 222
E L++ ++ +
Sbjct: 259 EARQAELEKALEGAMNFSTADSAKIKT 285



Score = 34.7 bits (79), Expect = 3e-04
Identities = 41/219 (18%), Positives = 72/219 (32%), Gaps = 9/219 (4%)

Query: 4 LIEKWFGFSQIREELEARIGELEDENTELFTTKDKLTKENTELASQNTALTEKNKTLTTE 63
L + + E + L K L EL + + +
Sbjct: 153 LAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAK 212

Query: 64 NDKLNHQVIALTNEQNSLEQERAELQDEHGFLEKSCANLEKENQRLTDKLKQLESAQKNL 123
L + AL + LE+ + LE E L + +LE A +
Sbjct: 213 IKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGA 272

Query: 124 ENSNTQLRQALENSNAQLAQAEEKIAEEKTELEREIARLKSLEGMEAKSDLDLANRRLAS 183
N +T ++ A+ A E + A+ + + + A +SL S
Sbjct: 273 MNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDAS---------RE 323

Query: 184 ANQDLKRQKRKLEEENIALKERVDSLKEQLFTLQPQKPQ 222
A + L+ + +KLEE+N + SL+ L + K Q
Sbjct: 324 AKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQ 362


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
hp2018_00652  SHAPEPROTEIN  29     0.023    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 28.6 bits (64), Expect = 0.023
Identities = 17/58 (29%), Positives = 24/58 (41%), Gaps = 9/58 (15%)

Query: 39 RHVFDDEKTAKTFKVELRASEPCAYAISALKSYGFFKSEKLDKPVYYGVFDFGGGTTD 96
R + + + A +V L EP A AI A + + V D GGGTT+
Sbjct: 124 RAIRESAQGAGAREVFL-IEEPMAAAIGA--------GLPVSEATGSMVVDIGGGTTE 172


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
hp2018_0078   UREASE        1044   0.0      Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 1044 bits (2701), Expect = 0.0
Identities = 353/569 (62%), Positives = 442/569 (77%), Gaps = 4/569 (0%)

Query: 3 KISRKEYVSMYGPTTGDKVRLGDTDLIAEVEHDYTIYGEELKFGGGKTLREGMSQSN-NP 61
++SR Y +M+GPT GDKVRL DT+L EVE D+T +GEE+KFGGGK +R+GM QS
Sbjct: 4 RMSRAAYANMFGPTVGDKVRLADTELFIEVEKDFTTHGEEVKFGGGKVIRDGMGQSQVTR 63

Query: 62 SKEELDLIITNALIVDYTGIYKADIGIKDGKIAGIGKGGNKDMQDGVKNNLSVGPATEAL 121
+D +ITNALI+D+ GI KADIG+KDG+IA IGK GN DMQ GV + VGP TE +
Sbjct: 64 EGGAVDTVITNALILDHWGIVKADIGLKDGRIAAIGKAGNPDMQPGV--TIIVGPGTEVI 121

Query: 122 AGEGLIVTAGGIDTHIHFISPQQIPTAFASGVTTMIGGGTGPADGTNATTITPGRRNLKW 181
AGEG IVTAGG+D+HIHFI PQQI A SG+T M+GGGTGPA GT ATT TPG ++
Sbjct: 122 AGEGKIVTAGGMDSHIHFICPQQIEEALMSGLTCMLGGGTGPAHGTLATTCTPGPWHIAR 181

Query: 182 MLRAAEEYSMNLGFLAKGNTSNDASLADQIEAGAIGFKIHEDWGTTPSAINHALDVADKY 241
M+ AA+ + MNL F KGN S +L + + GA K+HEDWGTTP+AI+ L VAD+Y
Sbjct: 182 MIEAADAFPMNLAFAGKGNASLPGALVEMVLGGATSLKLHEDWGTTPAAIDCCLSVADEY 241

Query: 242 DVQVAIHTDTLNEAGCVEDTMAAIAGRTMHTFHTEGAGGGHAPDIIKVAGEHNILPASTN 301
DVQV IHTDTLNE+G VEDT+AAI GRT+H +HTEGAGGGHAPDII++ G+ N++P+STN
Sbjct: 242 DVQVMIHTDTLNESGFVEDTIAAIKGRTIHAYHTEGAGGGHAPDIIRICGQPNVIPSSTN 301

Query: 302 PTIPFTVNTEAEHMDMLMVCHHLDKSIKEDVQFADSRIRPQTIAAEDTLHDMGIFSITSS 361
PT P+TVNT AEH+DMLMVCHHL +I ED+ FA+SRIR +TIAAED LHD+G FSI SS
Sbjct: 302 PTRPYTVNTLAEHLDMLMVCHHLSPTIPEDIAFAESRIRKETIAAEDILHDIGAFSIISS 361

Query: 362 DSQAMGRVGEVITRTWQTADKNKKEFGRLKEEKGDNDNFRIKRYLSKYTINPAIAHGISE 421
DSQAMGRVGEV RTWQTADK K++ GRLKEE GDNDNFR+KRY++KYTINPAIAHG+S
Sbjct: 362 DSQAMGRVGEVAIRTWQTADKMKRQRGRLKEETGDNDNFRVKRYIAKYTINPAIAHGLSH 421

Query: 422 YVGSVEVGKVADLVLWSPAFFGVKPNMIIKGGFIALSQMGDANASIPTPQPVYYREMFAH 481
+GS+EVGK ADLVLW+PAFFGVKP+M++ GG IA + MGD NASIPTPQPV+YR MF
Sbjct: 422 EIGSLEVGKRADLVLWNPAFFGVKPDMVLLGGTIAAAPMGDPNASIPTPQPVHYRPMFGA 481

Query: 482 HGKAKYDANITFVSQAAYDKGIKEELGLERQVLPVKNCR-NITKKDMQFNDTTAHIEVNP 540
+G+++ ++++TFVSQA+ D G+ LG+ ++++ V+N R I K M N T HIEV+P
Sbjct: 482 YGRSRTNSSVTFVSQASLDAGLAGRLGVAKELVAVQNTRGGIGKASMIHNSLTPHIEVDP 541

Query: 541 ETYHVFVDGKEVTSKPANKVSLAQLFSIF 569
ETY V DG+ +T +PA + +AQ + +F
Sbjct: 542 ETYEVRADGELLTCEPATVLPMAQRYFLF 570


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
hp2018_0087   FLAGELLIN     33     0.002    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 33.5 bits (76), Expect = 0.002
Identities = 32/285 (11%), Positives = 80/285 (28%), Gaps = 6/285 (2%)

Query: 17 SVLLGSMNATDLETYAALQKPSHVFSNYAKKSNKGSELSSDSLTQQQAQNTAQSDTTQAT 76
++ L ++ L + KS+ + D+ + ++
Sbjct: 156 TIDLQKIDVKSLGLDGFNVNGPKEATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVV 215

Query: 77 TLENTASSGTP----DSSTLPTKETPPATSGGTGGDKHTASSGTPPASSTPPAKKDETSG 132
T + ++ T + + +++GT A + A K G
Sbjct: 216 TDTTAPTVPDKVYVNAANGQLTTDDAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEG 275

Query: 133 SGDKDQHTASGTGGTPSSSGGTGGDKHTASSGTPPASSTPPTPTPPTSGGNTITSQLTKD 192
D + T T + + G G T + + T T+ S
Sbjct: 276 D-TFDYKGVTFTIDTKTGNDGNGKVSTTINGEKVTLTVADITAGAANVDAATLQSSKNVY 334

Query: 193 TTTVNNLKSVSVSAMNTTLSGVTQLSQQTATISTLLNGSPNLGSVISNAQGLSSAFSALE 252
T+ VN + N + + + + + + + ++ A +
Sbjct: 335 TSVVNGQFTFDDKTKNESAKLSDLEANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMF 394

Query: 253 SAQNTLKGYLDSSSATIGQLTNGSNAVVGALDKAINQVDMALADL 297
+ + + + ++D A+++VD + L
Sbjct: 395 IDKTASGVSTLINEDAAAAKK-STANPLASIDSALSKVDAVRSSL 438



Score = 31.2 bits (70), Expect = 0.013
Identities = 38/306 (12%), Positives = 85/306 (27%), Gaps = 10/306 (3%)

Query: 55 SSDSLTQQQAQNTAQSDTTQATTLENTASSGTPDSSTLPTKETPPATSGGTGGDKHTASS 114
SL + T + + D+ + + + G TA +
Sbjct: 163 DVKSLGLDGFNVNGPKEATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPT 222

Query: 115 GTPPASSTPPAKKDETSGSGDKDQHTASGTGGTPSSSGGTGGDKHTASSGTPPASSTPPT 174
A T+ + + ++ A G +
Sbjct: 223 VPDKVYVNA-ANGQLTTDDAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYK 281

Query: 175 PTPPTSGGNTITSQLTKDTTTVNNLKSVSVSAMNTTLSGVTQLSQQTA---TISTLLNGS 231
T T K +TT+N K A T + + + ++++NG
Sbjct: 282 GVTFTIDTKTGNDGNGKVSTTINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQ 341

Query: 232 PNLGSVISNAQGLSSAFSALESAQNTLKGYLDSSSATIGQLTNGSNAVVGALDKAINQVD 291
N S A + + K ++ + T + +
Sbjct: 342 FTFDDKTKNESAKLSDLEANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFIDKTASG 401

Query: 292 MALADLATADTQKTQAVALVAASNSATTTTDAINFLNALKANLTAQKDAFMSVHKNIQTA 351
++ A A + +N + A++ ++A++++L A ++ F S N+
Sbjct: 402 VSTLINEDAAA------AKKSTANPLASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNT 455

Query: 352 VAQAQA 357
V +
Sbjct: 456 VTNLNS 461


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
hp2018_0089   CABNDNGRPT    32     0.005    NodO calcium binding signature.
		>CABNDNGRPT#NodO calcium binding signature.

Length = 479

Score = 32.2 bits (73), Expect = 0.005
Identities = 27/144 (18%), Positives = 54/144 (37%), Gaps = 10/144 (6%)

Query: 433 INSMDNTHANDSKDQGGNALINPNSTTNDDHNDDHMDTNTTDTGNANDTPTDDKDAGGNN 492
I ++ + + + G+++ NS T+ D T T ++ DAGG +
Sbjct: 251 IAAIQRLYGANMTTRTGDSVYGFNSNTDRDF--------YTATDSSKALIFSVWDAGGTD 302

Query: 493 TGDTGDMNNTDTGNTDTGNTDDMSNMNNGNGDTGNANDDMGNSNDMGDDMNNANDMNDDM 552
T D +N N + G+ D+ + + G+D+ N ++ +
Sbjct: 303 TFDFSGYSNNQRINLNEGSFSDVGGLKGNVSIAHGVTIENAIGGS-GNDILVGNSADNIL 361

Query: 553 -GNSNDDMGDMGDMNDDMGGDMGD 575
G + +D+ G D + G G
Sbjct: 362 QGGAGNDVLYGGAGADTLYGGAGR 385


3     hp2018_0109   hp2018_0115   Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
hp2018_0109   3       12       0.267402   hypothetical protein
hp2018_0110   2       13       2.380827   Beta-1,3-galactosyltransferase
hp2018_0111   1       14       3.433474   Methyl-accepting chemotaxis protein
hp2018_0112   2       14       3.719094   hypothetical protein
hp2018_0113   -1      11       3.644929   2',3'-cyclic-nucleotide 2'-phosphodiesterase
hp2018_0114   -2      12       4.751686   S-ribosylhomocysteine lyase
hp2018_0115   -1      13       4.021332   Cystathionine gamma-lyase
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
hp2018_0114   LUXSPROTEIN   225    6e-79    Bacterial autoinducer-2 (AI-2) production protein Lu...
		>LUXSPROTEIN#Bacterial autoinducer-2 (AI-2) production protein LuxS signature.

Length = 171

Score = 225 bits (575), Expect = 6e-79
Identities = 57/145 (39%), Positives = 91/145 (62%), Gaps = 7/145 (4%)

Query: 5 VESFNLDHTKVKAPYVRVADRKKGVNGDLIVKYDVRFKQPNQDHMDMPSLHSLEHLVAEI 64
++SF +DHT++ AP VRVA + GD I +D+RF PN+D + +H+LEHL A
Sbjct: 3 LDSFTVDHTRMNAPAVRVAKTMQTPKGDTITVFDLRFTAPNKDILSEKGIHTLEHLYAGF 62

Query: 65 IRNHA----SYVVDWSPMGCQTGFYLTVLNHDNYTEILEVLEKTMQDVLKAK---EVPAS 117
+RNH ++D SPMGC+TGFY++++ + ++ + M+DVLK + ++P
Sbjct: 63 MRNHLNGDSVEIIDISPMGCRTGFYMSLIGTPSEQQVADAWIAAMEDVLKVENQNKIPEL 122

Query: 118 NEKQCGWAANHTLEGAQNLARAFLD 142
NE QCG AA H+L+ A+ +A+ L+
Sbjct: 123 NEYQCGTAAMHSLDEAKQIAKNILE 147


4     hp2018_0191   hp2018_0212   Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
hp2018_0191   2       12       0.391381   Lysyl-tRNA synthetase (class II)
hp2018_0192   3       14       0.589736   Serine hydroxymethyltransferase
hp2018_0193   1       15       -0.187409  hypothetical protein
hp2018_0194   3       15       0.420359   hypothetical protein
hp2018_01951  1       13       2.707126   hypothetical protein
hp2018_01952  0       11       2.907081   hypothetical protein
hp2018_0196   -1      10       2.219792   Putative inner membrane protein
hp2018_0197   -1      9        2.360376   Cardiolipin synthetase
hp2018_0198   -1      11       3.228408   Succinate dehydrogenase iron-sulfur protein
hp2018_0199   0       11       3.266901   Succinate dehydrogenase flavoprotein subunit
hp2018_0200   -1      14       1.763940   Fumarate reductase cytochrome b subunit
hp2018_0201   -2      17       1.597478   Triosephosphate isomerase
hp2018_0202   -2      17       2.565867   Enoyl-acyl-carrier-protein reductase
hp2018_0203   -2      18       2.489785   UDP-3-O-3-hydroxymyristoyl glucosamine
hp2018_0204   -2      16       2.816510   S-adenosylmethionine synthetase
hp2018_0205   -2      17       1.978124   Nucleoside diphosphate kinase
hp2018_0206   -3      17       1.435116   hypothetical protein
hp2018_0207   0       13       -3.626063  50S ribosomal protein L32
hp2018_0208   0       12       -2.518007  Phosphate:acyl-ACP acyltransferase
hp2018_0209   0       13       -3.282921  3-oxoacyl-acyl-carrier-protein synthase
hp2018_0210   1       14       -4.270746  hypothetical protein
hp2018_0211   2       14       -4.540417  hypothetical protein
hp2018_0212   1       11       -3.734077  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
hp2018_01952  FRAGILYSIN    28     0.029    Fragilysin metallopeptidase (M10C) enterotoxin signat...
		>FRAGILYSIN#Fragilysin metallopeptidase (M10C) enterotoxin signature.

Length = 405

Score = 28.5 bits (63), Expect = 0.029
Identities = 22/103 (21%), Positives = 44/103 (42%), Gaps = 2/103 (1%)

Query: 7 EDNKKLYDIIDGQQRTTTIFMLLHVLASKQNEKDKQETRKYLYQKGELKLEVAPQNQSFF 66
DN+ + + +G+ + +T F+L A + ++ + Y++ ++ E+A + F
Sbjct: 93 LDNENVR-LFNGRDKDSTSFILGDEFAVLRFYRNGESISYIAYKEAQMMNEIAEFYAAPF 151

Query: 67 KTLLEAAEKENISHCEKDADTEGKQNLFEVLKAILDKVSKLSG 109
K EKE C D+ T +K +DK K+
Sbjct: 152 KKTRAINEKE-AFECIYDSRTRSAGKDIVSVKINIDKAKKILN 193


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
hp2018_0202   DHBDHDRGNASE  60     8e-13    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 60.1 bits (145), Expect = 8e-13
Identities = 61/263 (23%), Positives = 108/263 (41%), Gaps = 29/263 (11%)

Query: 4 LKGKKGLIVGVANNKSIAYGIAQSCFNQGATL-AFTYLNESLEKRVRPIAQELNSPYVYE 62
++GK I G A + I +A++ +QGA + A Y E LEK V + E +
Sbjct: 6 IEGKIAFITGAA--QGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFP 63

Query: 63 LDVSKEEHFKSLYNNIKKDLGSLDFIVHSVAF--------APKEALEGSLLETSKSAFNT 114
DV + I++++G +D +V+ E E + S FN
Sbjct: 64 ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNA 123

Query: 115 AMEISVYSLIELTNTLKPLLNNGASVLTLSYLGSTKYMAHYNVMGLAKAALESAVRYLAV 174
+ +S Y + + ++ + +N A V S MA Y +KAA + L +
Sbjct: 124 SRSVSKYMMDRRSGSIVTVGSNPAGVPRTS-------MAAY---ASSKAAAVMFTKCLGL 173

Query: 175 DLGRHHIRVNALSAGPIRT-----LASSGIADFRMILKWNE---INAPLRKNVSLEEVGN 226
+L ++IR N +S G T L + ++I E PL+K ++ +
Sbjct: 174 ELAEYNIRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIAD 233

Query: 227 AGMYLLSSLSNGVSGEVHFVDAG 249
A ++L+S + ++ VD G
Sbjct: 234 AVLFLVSGQAGHITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
hp2018_0212   PF01540       34     0.003    Adhesin lipoprotein
		>PF01540#Adhesin lipoprotein

Length = 475

Score = 33.9 bits (77), Expect = 0.003
Identities = 32/127 (25%), Positives = 60/127 (47%), Gaps = 10/127 (7%)

Query: 209 IGKGKQKQLSKIYSHF-KKLSEGEIKPQNEGILKKLKSLDEIFKTTDFTRFTPKTEIKDI 267
+ K K+L++I + KKL+E K +N G+ + K +E F+ + +
Sbjct: 340 VKKAWSKELAEIKAEDDKKLAEENQKIKN-GVEELKKINNEAFELSK--------TVNKT 390

Query: 268 IKEIDEKYPINENFKRQFRTFRSSIGNLKKKINSLKYLEKTREDFERKKESWIKEIGNDC 327
I E+++K+ I+ +FK Q + F + + ++I+ + T+E F + KEI
Sbjct: 391 IAELEKKFKIDVSFKEQLKNFADDLLDKSRQIDEFTTVTSTQEGFTLAELESFKEITTTW 450

Query: 328 KNECNSE 334
N SE
Sbjct: 451 FNGMKSE 457



Score = 33.6 bits (76), Expect = 0.004
Identities = 26/124 (20%), Positives = 50/124 (40%), Gaps = 8/124 (6%)

Query: 211 KGKQKQLSKIYSHFKKLSEGEIKPQNEGILKKLKSLDEIFKTTDFTRFTPKTEIKDIIKE 270
K K+L++I + K E + EG + LK ++I D I I +
Sbjct: 221 KAWSKELAEIKAEDDKKLAEENQKIKEGAKELLKLSEKIQSFAD--------TIALTITK 272

Query: 271 IDEKYPINENFKRQFRTFRSSIGNLKKKINSLKYLEKTREDFERKKESWIKEIGNDCKNE 330
++ K+ I+E FK+Q + + ++ + + ++DF + KE +
Sbjct: 273 LERKFQIDEKFKKQLISTIELLNKKSVEVKTFATVNTIKKDFLLSELESFKEFNTSWLEK 332

Query: 331 CNSE 334
SE
Sbjct: 333 IVSE 336


5     hp2018_0307   hp2018_0345   Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
hp2018_0307   1       14       3.657299   50S ribosomal protein L21
hp2018_0308   1       15       3.755072   50S ribosomal protein L27
hp2018_0309   1       15       3.741807   Dipeptide-binding ABC transporter
hp2018_0310   0       15       4.173214   Dipeptide transport system permease protein
hp2018_0311   -1      14       3.546908   Dipeptide transport system permease protein
hp2018_0312   -3      14       3.041062   Dipeptide transport ATP-binding protein
hp2018_0313   -2      14       2.642746   Dipeptide transport ATP-binding protein
hp2018_0314   -2      13       2.176798   GTP-binding protein
hp2018_0315   -1      13       1.595665   hypothetical protein
hp2018_0316   0       16       2.235463   putative periplasmic protein
hp2018_0317   1       16       2.768223   Glutamate-1-semialdehyde aminotransferase
hp2018_0318   4       18       2.075356   hypothetical protein
hp2018_0319   4       16       1.863745   hypothetical protein
hp2018_0320   2       17       0.533559   amidohydrolase like protein
hp2018_0321   1       17       -0.058335  Putative polysaccharide deacetylase
hp2018_0322   1       18       -2.169110  hypothetical protein
hp2018_0323   1       17       -2.321719  ATP/GTP binding protein
hp2018_0324   1       20       -3.804710  nitrite extrusion protein
hp2018_0325   1       21       -4.102506  hypothetical protein
hp2018_03261  0       18       -2.525026  ABC transporter
hp2018_03262  -1      16       -2.189622  ABC transporter
hp2018_0327   3       14       -2.026217  Putative heme oxygenase
hp2018_0328   5       13       -2.417740  hypothetical protein
hp2018_0329   1       14       -1.846316  Arginyl-tRNA synthetase
hp2018_0330   2       13       -1.436249  Twin-arginine translocation protein
hp2018_0331   1       13       -1.627332  Guanylate kinase
hp2018_0332   1       14       -1.739007  poly E-rich protein
hp2018_0333   0       14       -2.037324  membrane bound endonuclease
hp2018_0334   1       14       -2.020061  putative Outer membrane protein
hp2018_0335   2       16       -2.040216  Flagellar basal body L-ring protein
hp2018_03361  2       14       -1.786801  CMP-N-Acetylneuraminate cytidylyltransferase
hp2018_03362  2       12       -1.088802  CMP-N-Acetylneuraminate cytidylyltransferase
hp2018_0337   2       12       -0.918707  putative flagellar biosynthesis protein
hp2018_0338   1       13       0.447029   Tetraacyldisaccharide 4'-kinase
hp2018_0339   1       15       1.363009   NAD synthetase
hp2018_0340   0       16       1.622101   *Ketol-acid reductoisomerase
hp2018_0341   0       16       0.612918   Septum site-determining protein
hp2018_0342   1       15       -0.940703  Cell division topological specificity factor
hp2018_0343   0       12       -0.119888  DNA processing chain A
hp2018_0344   1       14       -0.661230  Holliday junction resolvase like protein
hp2018_0345   2       13       -0.164888  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
hp2018_0324   TCRTETB       31     0.006    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 31.4 bits (71), Expect = 0.006
Identities = 36/193 (18%), Positives = 77/193 (39%), Gaps = 1/193 (0%)

Query: 23 VLIPLLILSGSLTPHQSFQLGIAVLMGYVFGSFLIQFLSPLMSLESIAKISFGLIALSFL 82
V +P + + P + + A ++ + G+ + LS + ++ + + +
Sbjct: 35 VSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSV 94

Query: 83 VCYFDSIPFFWLWIWRFIAGVASSALMILVAPLSLPYVKEHKKALVGGLIFSAVGIGSVF 142
+ + F L + RFI G ++A LV + Y+ + + GLI S V +G
Sbjct: 95 IGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGV 154

Query: 143 SGFVLPWISSYNIKWAWIFLGGSCLIAFILSLVGLKTRSLRKKSVKKEESAFKIPFHLWL 202
+ I+ Y I W+++ L I + L+ L + +R K + + +
Sbjct: 155 GPAIGGMIAHY-IHWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDIKGIILMSVGIVF 213

Query: 203 LLISCALNAIGFL 215
++ +I FL
Sbjct: 214 FMLFTTSYSISFL 226


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
hp2018_0331   PF05272       29     0.011    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.011
Identities = 9/18 (50%), Positives = 11/18 (61%)

Query: 8 LILSGPSGAGKSTLTKYL 25
++L G G GKSTL L
Sbjct: 599 VVLEGTGGIGKSTLINTL 616


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
hp2018_0332   IGASERPTASE   62     2e-12    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 62.4 bits (151), Expect = 2e-12
Identities = 39/247 (15%), Positives = 82/247 (33%), Gaps = 21/247 (8%)

Query: 148 EALAKEEPNNEEQLLPTLNEQEGETPKEEAQEEVKKEEVKEMQEEVKEKQKQEVAEN--- 204
E +A+ + P + ET E +++E K E E Q +EVA+
Sbjct: 1015 EEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKS 1074

Query: 205 --PQDEEKPKDDETQGSVEPPKDEEVSKELETQEQEPIKEETQEIKEEKQEKTQDSPNVQ 262
+ + + ++ + + E + +++E K ET++ +E + +Q SP +
Sbjct: 1075 NVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQE 1134

Query: 263 ELEAMQELVKEIQENSNDQENKKETQETQENTETPQDIETQELEIPKEEETQEIAEKTQA 322
+ E +Q + +EN K+ +T +T Q + + + +
Sbjct: 1135 QSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNS 1194

Query: 323 QGLEKEEIAETPQEKEIQETQDETPQELEVQDEKLQENETPKDENMQESAQNLQEKETQE 382
+ E P+ TQ E PK+ + + E
Sbjct: 1195 -------VVENPENTTPATTQPTVNSESS---------NKPKNRHRRSVRSVPHNVEPAT 1238

Query: 383 LETPQTQ 389
+
Sbjct: 1239 TSSNDRS 1245



Score = 59.3 bits (143), Expect = 2e-11
Identities = 66/315 (20%), Positives = 111/315 (35%), Gaps = 36/315 (11%)

Query: 164 TLNEQEGETPKEEAQEEVKKEEVKEMQEEVKEKQKQEVAENPQD-EEKPKDDETQGSVEP 222
L G + E + + V + +V P + EE + DE V P
Sbjct: 970 KLRNVNGRYDLYNPEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEA--PVPP 1027

Query: 223 PKDEEVSKELET-QEQEPIKEETQEIKEEKQEKTQDSPNVQELEAMQELVKEIQENSNDQ 281
P S+ ET E + +T E E+ +T EA + Q N Q
Sbjct: 1028 PAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQ 1087

Query: 282 ENKKETQETQEN-TETPQDIETQELEIPKEEETQEIAEKTQAQGLEKEEIAETPQEKEIQ 340
ET+ETQ T+ +E KEE+ + EKTQ E P+
Sbjct: 1088 S-GSETKETQTTETKETATVE-------KEEKAKVETEKTQ----------EVPKVTSQV 1129

Query: 341 ETQDETPQELEVQDEKLQENETPKDENMQESAQNLQEKETQELETPQTQEDHYENIEDIP 400
+ E + ++ Q E +EN+ N++E ++Q T T++ E ++
Sbjct: 1130 SPKQEQSETVQPQAEPARENDP---------TVNIKEPQSQTNTTADTEQPAKETSSNVE 1180

Query: 401 EPVMTKAMGEELPFLNENDTETPKENDTETPKESVIKTPQEKEESDKTSSPLELRLNLQD 460
+PV + E P+ T + +V K ++ S + N++
Sbjct: 1181 QPVTESTTVNTGN----SVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEP 1236

Query: 461 LLKSLNQESLKSLLE 475
S N S +L +
Sbjct: 1237 ATTSSNDRSTVALCD 1251



Score = 55.5 bits (133), Expect = 4e-10
Identities = 43/217 (19%), Positives = 76/217 (35%), Gaps = 8/217 (3%)

Query: 148 EALAKEEPNNEEQLLPTLNEQEGETPKEEAQEEVKKEEVKEMQEEVK-EKQKQEVAENPQ 206
E + +E NEQ+ + +E KE + VK Q EVA++
Sbjct: 1036 TTETVAENSKQESKTVEKNEQDATETTAQN-----REVAKEAKSNVKANTQTNEVAQSGS 1090

Query: 207 DEEKPKDDETQGSVEPPKDEEVSKELETQEQEPIKEETQEIKEEKQEKTQDSPNVQELEA 266
+ ++ + ET+ + K+E+ E E ++ P K+E+ E Q
Sbjct: 1091 ETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPAREND 1150

Query: 267 MQELVKEIQENSNDQ-ENKKETQETQENTETPQDIETQELEIPKEEETQEIAEKTQAQGL 325
+KE Q +N + ++ +ET N E P T E E Q
Sbjct: 1151 PTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPT 1210

Query: 326 EKEEIAETPQEKEIQETQDETPQELEVQDEKLQENET 362
E + P+ + + + P +E + T
Sbjct: 1211 VNSESSNKPKNRHRRSVRSV-PHNVEPATTSSNDRST 1246



Score = 44.7 bits (105), Expect = 9e-07
Identities = 30/213 (14%), Positives = 69/213 (32%), Gaps = 4/213 (1%)

Query: 142 ENLGDLEALAKEEPNNEEQLLPTLNEQEGETPKEEAQEEVKKEEVKEMQEEVKEKQKQEV 201
E +AKE +N + T + + +E Q KE +EE + + ++
Sbjct: 1060 ETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKT 1119

Query: 202 AENPQDEEKPKDDETQGSVEPPKDEEVSKELETQEQEPIKEETQEIKEEKQEKTQDSPNV 261
E P+ + + Q P+ E + T + + +T + +Q + S NV
Sbjct: 1120 QEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNV 1179

Query: 262 QELEAMQELVK---EIQENS-NDQENKKETQETQENTETPQDIETQELEIPKEEETQEIA 317
++ V + EN N + E++ P++ + +
Sbjct: 1180 EQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATT 1239

Query: 318 EKTQAQGLEKEEIAETPQEKEIQETQDETPQEL 350
+ ++ T + + + +
Sbjct: 1240 SSNDRSTVALCDLTSTNTNAVLSDARAKAQFVA 1272


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
hp2018_0335   FLGLRINGFLGH  194    1e-64    Flagellar L-ring protein signature.
		>FLGLRINGFLGH#Flagellar L-ring protein signature.

Length = 232

Score = 194 bits (495), Expect = 1e-64
Identities = 52/172 (30%), Positives = 84/172 (48%), Gaps = 18/172 (10%)

Query: 56 GERPLFADRRAMKPNDLITIIVSEKASANYSSS----KDYKSASGGNSTPPRLTYNGLDE 111
G +PLF DRR D +TI++ E SA+ SSS +D K+ G ++ P L GL
Sbjct: 61 GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYL--QGLFG 118

Query: 112 RKKKEAEYLDDKNNYNFTKSSNNTNFKGGGSQKKSEDLEIVLSARIIKVLENGNYFIYGN 171
+ + E S F G G S L+ + +VL NGN + G
Sbjct: 119 NARADVEA------------SGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGE 166

Query: 172 KEVLVDGEKQILKVSGVIRPYDIERNNTIQSKFLADAKIEYTNLGHLSDSNK 223
K++ ++ + ++ SGV+ P I +NT+ S +ADA+IEY G+++++
Sbjct: 167 KQIAINQGTEFIRFSGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQN 218


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
hp2018_0345   SYCDCHAPRONE  28     0.003    Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 28.4 bits (63), Expect = 0.003
Identities = 12/36 (33%), Positives = 20/36 (55%)

Query: 23 MASQTPKELYDLGVESYKAKDYIKAKKYFEKACGLN 58
++S T ++LY L Y++ Y A K F+ C L+
Sbjct: 31 ISSDTLEQLYSLAFNQYQSGKYEDAHKVFQALCVLD 66


6     hp2018_0457   hp2018_0468   Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
hp2018_0457   1       13       -3.160075  Molybdenum ABC transporter/periplasmic
hp2018_0458   1       12       -3.470779  Molybdenum transport system permease protein
hp2018_0459   0       10       -1.701560  Molybdenum transport ATP-binding protein
hp2018_0460   -1      11       -2.115332  Glutamyl-tRNA synthetase
hp2018_04611  -2      13       -2.722557  adenine specific DNA methyltransferase
hp2018_04612  -2      12       -1.756479  adenine specific DNA methyltransferase
hp2018_04613  -1      12       -1.231517  adenine specific DNA methyltransferase
hp2018_0462   -1      16       -0.044972  GTP-binding protein
hp2018_0463   1       24       -3.629763  DNA adenine methylase
hp2018_04641  0       25       -3.586161  hypothetical protein
hp2018_04642  -2      21       -2.381904  hypothetical protein
hp2018_04651  5       19       0.286380   DNA-cytosine methyltransferase
hp2018_04652  6       19       0.753259   DNA-cytosine methyltransferase
hp2018_04661  6       19       0.617266   hypothetical protein
hp2018_04662  5       17       1.278202   hypothetical protein
hp2018_0467   5       18       1.906041   Catalase
hp2018_0468   5       19       1.266713   Outer membrane protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
hp2018_0459   PF05272       30     0.009    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.4 bits (68), Expect = 0.009
Identities = 12/32 (37%), Positives = 17/32 (53%)

Query: 30 VVALLGESGAGKSTILRILAGLEAVSSGYIEA 61
V L G G GKST++ L GL+ S + +
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGLDFFSDTHFDI 629


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
hp2018_0462   TCRTETOQM     198    1e-57    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 198 bits (505), Expect = 1e-57
Identities = 116/461 (25%), Positives = 190/461 (41%), Gaps = 67/461 (14%)

Query: 3 NIRNIAVIAHVDHGKTTLVDGLLSQSGTFSEREKVDE--RVMDSNDLERERGITILSKNT 60
I NI V+AHVD GKTTL + LL SG +E VD+ D+ LER+RGITI + T
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 61 AIYYKDTKINIIDTPGHADFGGEVERVLKMVDGVLLLVDAQEGVMPQTKFVVKKALSFGI 120
+ +++TK+NIIDTPGH DF EV R L ++DG +LL+ A++GV QT+ + GI
Sbjct: 62 SFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGI 121

Query: 121 CPIVVVNKIDKPAAEPDRVVDEVFDLF---------VAMGASDKQLDFPV-----VYAAA 166
I +NKID+ + V ++ + V + + +F
Sbjct: 122 PTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEG 181

Query: 167 RDGYAMKSLDDE----------------------------KKNL--EPLFETILEHVPSP 196
D K + + K N+ + L E I S
Sbjct: 182 NDDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVITNKFYSS 241

Query: 197 SGSVDEPLQMQIFTLDYDNYVGKIGIARVFNGSVKKNESVLLMKSDGSKENGRITKLIGF 256
+ L ++F ++Y ++ R+++G + +SV + KE +IT++
Sbjct: 242 THRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRI----SEKEKIKITEMYTS 297

Query: 257 LGLARTEIENAYAGDIVALAG--FNAMDV-GDSVVDPTNPMPLDPMHLEEPTMSVYFAVN 313
+ +I+ AY+G+IV L V GD+ + P +P P + +
Sbjct: 298 INGELCKIDKAYSGEIVILQNEFLKLNSVLGDTKLLPQRERIENP----LPLLQTTVEPS 353

Query: 314 DSPLAGLEGKHVTANKLKDRLLKEMQTNIAMKCEEMGEGKFKVSGRGELQITILAENLRR 373
+ + D LL+ + + +S G++Q+ + L+
Sbjct: 354 KPQQREMLLDALLEISDSDPLLRYYVDSAT--------HEIILSFLGKVQMEVTCALLQE 405

Query: 374 E-RFEFSISRPEVIIKEENGVKCEPFEHLVIDTPQDFSGAI 413
+ E I P VI E K E H+ + P F +I
Sbjct: 406 KYHVEIEIKEPTVIYMERPLKKAEYTIHIEVP-PNPFWASI 445



Score = 41.8 bits (98), Expect = 8e-06
Identities = 20/80 (25%), Positives = 30/80 (37%), Gaps = 1/80 (1%)

Query: 396 EPFEHLVIDTPQDFSGAIIERLGKRKAEMKAMNPMSDGYTRLEFEIPARGLIGYRSEFLT 455
EP+ I PQ++ K A + + + L EIPAR + YRS+
Sbjct: 537 EPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQ-LKNNEVILSGEIPARCIQEYRSDLTF 595

Query: 456 DTKGEGVMNHSFLEFRPFSG 475
T G V + +G
Sbjct: 596 FTNGRSVCLTELKGYHVTTG 615


7     hp2018_0490   hp2018_0531   Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
hp2018_0490   2       16       -2.719399  hypothetical protein
hp2018_0491   1       14       -2.442406  hypothetical protein
hp2018_0492   1       13       -2.117924  hypothetical protein
hp2018_0493   1       12       -1.842601  Glutamine synthetase type I
hp2018_04941  0       12       -3.099652  RloF
hp2018_04942  -1      11       -2.555812  RloF
hp2018_0495   -3      10       -1.738428  50S ribosomal protein L9
hp2018_0496   -2      11       -2.041094  ATP-dependent protease
hp2018_0497   -2      12       -2.700309  ATP-dependent hsl protease ATP-binding subunit
hp2018_0498   -1      18       -3.187514  GTP-binding protein
hp2018_0499   0       20       -4.565741  putative periplasmic protein
hp2018_0500   6       22       -4.096438  hypothetical protein
hp2018_0501   10      23       -3.790090  hypothetical protein
hp2018_0502   10      22       -3.467861  hypothetical protein
hp2018_0503   7       19       -2.460240  cag pathogenicity island protein
hp2018_0504   8       19       -2.493117  cag pathogenicity island protein
hp2018_0505   9       18       -2.265229  cag pathogenicity island protein
hp2018_0506   7       19       -1.686235  cagGamma protein
hp2018_0507   8       18       -2.115209  hypothetical protein
hp2018_0508   9       19       -2.206788  Type IV secretion system protein
hp2018_0509   12      22       -2.894635  ATPase
hp2018_0510   12      21       -3.285126  cag pathogenicity island protein Z
hp2018_05111  10      23       -2.875064  cag pathogenicity island protein
hp2018_05112  11      25       -4.226082  cag pathogenicity island protein
hp2018_05113  10      26       -4.201943  cag pathogenicity island protein
hp2018_05114  10      25       -4.058680  cag pathogenicity island protein
hp2018_0512   10      25       -4.359020  cag island protein
hp2018_0513   10      27       -4.478690  cag pathogenicity island protein
hp2018_0514   12      27       -5.578800  inner membrane protein
hp2018_0515   11      24       -5.633411  cag pathogenicity island protein
hp2018_0516   11      21       -5.676673  cag pathogenicity island protein
hp2018_0517   9       20       -5.847317  cag pathogenicity island protein
hp2018_0518   7       19       -4.170037  cag pathogenicity island protein
hp2018_0519   6       17       -3.081887  hypothetical protein
hp2018_0520   6       17       -2.800722  cag island protein
hp2018_0521   7       19       -2.996414  cag pathogenicity island protein
hp2018_0522   5       19       -2.895487  cag pathogenicity island protein
hp2018_0523   5       20       -3.282921  cag island protein
hp2018_0524   6       19       -3.358016  cag island protein
hp2018_0525   6       20       -4.166936  cag island protein
hp2018_0526   6       21       -3.232821  cag pathogenicity island protein
hp2018_0527   5       22       -2.711944  cag pathogenicity island protein 23
hp2018_0528   6       26       -2.246668  cag island protein
hp2018_0529   4       23       -0.925032  cag pathogenicity island protein C
hp2018_0530   2       18       -0.168948  hypothetical protein
hp2018_0531   2       18       -0.064131  cag island protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
hp2018_0498    PF03944       31     0.005    delta endotoxin
		>PF03944#delta endotoxin

Length = 633

Score = 31.2 bits (70), Expect = 0.005
Identities = 25/94 (26%), Positives = 47/94 (50%), Gaps = 3/94 (3%)

Query: 68 LHHQEKLLNQCMLSQALKAMGDAELRVFLASVHDDLKGYEEFLSLCQKPHILALSKIDTA 127
L E+ LNQ + + + A +AEL A+V + + + FL+ + L+++
Sbjct: 94 LRETERFLNQRLNTDTV-ARVNAELTGLQANVEEFNRQVDNFLNPNRNAVPLSITSSVNT 152

Query: 128 THKQVLQKLQEYQKYASQFLDLVPLSAKKSQNLN 161
+ L +L ++Q Q L L+PL A+ + NL+
Sbjct: 153 MQQLFLNRLPQFQMQGYQLL-LLPLFAQAA-NLH 184
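
The percentages on the statistics line printed above each alignment (Identities, Positives, Gaps) can be re-derived from the counts on the same line. The Python sketch below parses one such line copied from this report and recomputes the percentages. The helper names parse_stats and floor_percent are hypothetical, and rounding down is an assumption inferred from the figures shown in this report (for example, 30/80 is reported as 37%), not documented behaviour of the tool.

```python
import re

# Matches each "Name = count/length (percent%)" field on a statistics line.
STATS_RE = re.compile(r"(Identities|Positives|Gaps) = (\d+)/(\d+) \((\d+)%\)")


def parse_stats(line):
    """Return {name: (count, length, reported_percent)} for one statistics line."""
    return {
        name: (int(count), int(length), int(percent))
        for name, count, length, percent in STATS_RE.findall(line)
    }


def floor_percent(count, length):
    """Integer percentage, rounded down (assumed from the values in this report)."""
    return 100 * count // length


# Statistics line of the hp2018_0498 / PF03944 alignment above.
line = "Identities = 25/94 (26%), Positives = 47/94 (50%), Gaps = 3/94 (3%)"
stats = parse_stats(line)

for name, (count, length, reported) in stats.items():
    # Print the recomputed value next to the value reported by the tool.
    print(name, floor_percent(count, length), reported)
```

The same check can be applied to any other statistics line in this report by replacing the value of line.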


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
hp2018_0512    TYPE4SSCAGX   790    0.0      Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein signature.

Length = 522

Score = 790 bits (2042), Expect = 0.0
Identities = 480/482 (99%), Positives = 482/482 (100%)

Query: 1 MVNKKIAYLGDEKPITIWTSLDNVTVIQLEKDETISYITTGFNKGWNIVPNSNHIFIQPK 60
+VNKKIAYLGDEKPITIWTSLDNVTVIQLEKDETISYITTGFNKGW+IVPNSNHIFIQPK
Sbjct: 41 VVNKKIAYLGDEKPITIWTSLDNVTVIQLEKDETISYITTGFNKGWSIVPNSNHIFIQPK 100

Query: 61 SVKSNLMFEKEAVNFALMTRDYQEFLKTKKLIVDAPDPKELEEQKKALEKEKEAKEQAQK 120
SVKSNLMFEKEAVNFALMTRDYQEFLKTKKLIVDAPDPKELEEQKKALEKEKEAKEQAQK
Sbjct: 101 SVKSNLMFEKEAVNFALMTRDYQEFLKTKKLIVDAPDPKELEEQKKALEKEKEAKEQAQK 160

Query: 121 AQKDKREKRKEERAKNRANLENLTNAMSNPQNLSNNKNLSELIKQQRENELDQMERLEDM 180
AQKDKREKRKEERAKNRANLENLTNAMSNPQNLSNNKNLSELIKQQRENELDQMERLEDM
Sbjct: 161 AQKDKREKRKEERAKNRANLENLTNAMSNPQNLSNNKNLSELIKQQRENELDQMERLEDM 220

Query: 181 QEQAQANALKQIEELNKKQAEEAVRQRAKDKISIKTDKSQKSPEDNSIELSPSDSAWRTN 240
QEQAQANALKQIEELNKKQAEEAVRQRAKDKISIKTDKSQKSPEDNSIELSPSDSAWRTN
Sbjct: 221 QEQAQANALKQIEELNKKQAEEAVRQRAKDKISIKTDKSQKSPEDNSIELSPSDSAWRTN 280

Query: 241 LVVRTNKALYQFILRIAQKDNFASAYLTVKLEYPQRHEVSSVIEEELKKREEAKRQRELI 300
LVVRTNKALYQFILRIAQKDNFASAYLTVKLEYPQRHEVSSVIEEELKKREEAKRQRELI
Sbjct: 281 LVVRTNKALYQFILRIAQKDNFASAYLTVKLEYPQRHEVSSVIEEELKKREEAKRQRELI 340

Query: 301 KQENLNTTAYINRVMMASNEQIINKEKIREEKQKIILDQAKALETQYVHNALKRNPVPRN 360
KQENLNTTAYINRVMMASNEQIINKEKIREEKQKIILDQAKALETQYVHNALKRNPVPRN
Sbjct: 341 KQENLNTTAYINRVMMASNEQIINKEKIREEKQKIILDQAKALETQYVHNALKRNPVPRN 400

Query: 361 YNYYQAPEKRSKHIMPSEIFDDGTFTYFGFKNITLQPAIFVVQPDGKLSMTDAAIDPNMT 420
YNYYQAPEKRSKHIMPSEIFDDGTFTYFGFKNITLQPAIFVVQPDGKLSMTDAAIDPNMT
Sbjct: 401 YNYYQAPEKRSKHIMPSEIFDDGTFTYFGFKNITLQPAIFVVQPDGKLSMTDAAIDPNMT 460

Query: 421 NSGLRWYRVNEIAEKFKLIKDKALVTVINKGYGKNPLTKNYNIKNYGELERVIKKLPLVR 480
NSGLRWYRVNEIAEKFKLIKDKALVTVINKGYGKNPLTKNYNIKNYGELERVIKKLPLVR
Sbjct: 461 NSGLRWYRVNEIAEKFKLIKDKALVTVINKGYGKNPLTKNYNIKNYGELERVIKKLPLVR 520

Query: 481 DK 482
DK
Sbjct: 521 DK 522


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
hp2018_0514    PF04335       119    5e-35    VirB8 type IV secretion protein
		>PF04335#VirB8 type IV secretion protein

Length = 227

Score = 119 bits (299), Expect = 5e-35
Identities = 44/205 (21%), Positives = 74/205 (36%), Gaps = 10/205 (4%)

Query: 27 KLNKANRTFKRAFYL---SMALNVAAVTSIVMMMPLKKTDIFVYGIDRYTGEFKIVKRSD 83
KL A R+ K A+ + + AL A V ++ + PLK + +V +DR TGE I +
Sbjct: 24 KLAAAERSKKLAWVVAGVAGALATAGVVAVAALTPLKTVEPYVITVDRNTGEASIAAKLH 83

Query: 84 A-RQIVNSEAVVDSATSKFVSLLFGYSKNSLRDRKDQLMQYCDVSFQTQAMRMFNENIRQ 142
I EAV + +V G+ + + D +M Q + R + + Q
Sbjct: 84 GDATITYDEAVRKYFLATYVRYREGWIAAAREEYFDAVMVMSARPEQDRWSRFYKTDNPQ 143

Query: 143 FVDKVRA-EAIISSNIQREKVKNSPLTRLTFFITIKITPDTMENYEYITKKQVTIYYDFA 201
+ A + I + +F +T T TI Y
Sbjct: 144 SPQNILANRTDVFVEI-KRVSFLGGNVAQVYFTKESVTGSNS----TKTDAVATIKYKVD 198

Query: 202 RGNSSQENLIINPFGFKVFDIQITD 226
S + + NP G++V +
Sbjct: 199 GTPSKEVDRFKNPLGYQVESYRADV 223


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
hp2018_0521    TYPE4SSCAGX   30     0.015    Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein signature.

Length = 522

Score = 29.8 bits (66), Expect = 0.015
Identities = 29/119 (24%), Positives = 54/119 (45%), Gaps = 16/119 (13%)

Query: 24 AINTALLPSEYKELVALGFKKIKTFYQRHDDKEITKEEKEFATNALREKLRNDRARAEQI 83
A+N AL+ +Y+E + K K D KE+ +++K EK + + +A++
Sbjct: 112 AVNFALMTRDYQEFL----KTKKLIVDAPDPKELEEQKKAL------EKEKEAKEQAQKA 161

Query: 84 QKNIEAFEKKNNSSVQKKAAKHRGLQELNETNANPLNDNPNGNSPTETKSNKDDNFDEM 142
QK+ K +++A L+ L +NP N + N N K +++ D+M
Sbjct: 162 QKD------KREKRKEERAKNRANLENLTNAMSNPQNLSNNKNLSELIKQQRENELDQM 214


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
hp2018_0527    ACRIFLAVINRP  32     0.015    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 32.1 bits (73), Expect = 0.015
Identities = 20/88 (22%), Positives = 32/88 (36%), Gaps = 18/88 (20%)

Query: 19 EVQKRQFQKIEELKADMQKGINPFFKVLFDGGNRLFGFPETFIYSSI-------FILFVT 71
+ K K+ EL+ +G+ +D F+ SI F +
Sbjct: 301 DTAKAIKAKLAELQPFFPQGMK--VLYPYD--------TTPFVQLSIHEVVKTLFEAIML 350

Query: 72 IVLSVILF-QAYEPVLIVAIVIVLVALG 98
+ L + LF Q LI I + +V LG
Sbjct: 351 VFLVMYLFLQNMRATLIPTIAVPVVLLG 378


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
hp2018_0531    TYPE4SSCAGA   1868   0.0      Type IV secretion system CagA exotoxin signature.
		>TYPE4SSCAGA#Type IV secretion system CagA exotoxin signature.

Length = 1147

Score = 1868 bits (4840), Expect = 0.0
Identities = 1041/1187 (87%), Positives = 1083/1187 (91%), Gaps = 43/1187 (3%)

Query: 1 MTNETIDQQPQTEAAFNPQQFINNLQVAFLKVDNAIASFDPDQKPIVDKNDRDNRQAFDG 60
MTNETIDQQPQTEAAFNPQQFINNLQVAFLKVDNA+AS+DPDQKPIVDKNDRDNRQAF+G
Sbjct: 1 MTNETIDQQPQTEAAFNPQQFINNLQVAFLKVDNAVASYDPDQKPIVDKNDRDNRQAFEG 60

Query: 61 ISQLREEYSNKAIKNPTKKNQYFSDFINKSNDLINKDNLIDVESSTKSFQKFGDQRYRIF 120
ISQLREEYSNKAIKNPTKKNQYFSDFINKSNDLINKDNLIDVESSTKSFQKFGDQRYRIF
Sbjct: 61 ISQLREEYSNKAIKNPTKKNQYFSDFINKSNDLINKDNLIDVESSTKSFQKFGDQRYRIF 120

Query: 121 TSWVSHQNDPSKINTRSIRNFMEHAIQPPIPDDKEKAEFLKSAKQSFAGIIIGNQIRTDQ 180
TSWVSHQNDPSKINTRSIRNFME+ IQPPI DDKEKAEFLKSAKQSFAGIIIGNQIRTDQ
Sbjct: 121 TSWVSHQNDPSKINTRSIRNFMENIIQPPILDDKEKAEFLKSAKQSFAGIIIGNQIRTDQ 180

Query: 181 KFMGVFDESLKERQEAEKNGGSTGGDWLDIFLSFIFDKKQSSDVKEAINQEPVPHVQPDI 240
KFMGVFDESLKERQEAEKNG TGGDWLDIFLSFIFDKKQSSDVKEAINQEPVPHVQPDI
Sbjct: 181 KFMGVFDESLKERQEAEKNGEPTGGDWLDIFLSFIFDKKQSSDVKEAINQEPVPHVQPDI 240

Query: 241 ATSTTHIQGLPPESRDLLDERGNFSKFTLGDMEMLDVEGVADMDPNYKFNQLLIHNNALS 300
AT+TT IQGLPPE+RDLLDERGNFSKFTLGDMEMLDVEGVAD+DPNYKFNQLLIHNNALS
Sbjct: 241 ATTTTDIQGLPPEARDLLDERGNFSKFTLGDMEMLDVEGVADIDPNYKFNQLLIHNNALS 300

Query: 301 SVLMGSHDGIEPEKVSLLYAGNGGFGDKHDWNATVGYKDQQGNNVATIINVHMKNGSGLI 360
SVLMGSH+GIEPEKVSLLY GNGG G +HDWNATVGYKDQQGNNVATIINVHMKNGSGL+
Sbjct: 301 SVLMGSHNGIEPEKVSLLYGGNGGPGARHDWNATVGYKDQQGNNVATIINVHMKNGSGLV 360

Query: 361 IAGGEKGINNPSFYLYKEDQLTGSQRALSQEEIQNKIDFMEFLAQNNAKLDSLSEKEKEK 420
IAGGEKGINNPSFYLYKEDQLTGSQRALSQEEIQNKIDFMEFLAQNNAKLD+LSEKEKEK
Sbjct: 361 IAGGEKGINNPSFYLYKEDQLTGSQRALSQEEIQNKIDFMEFLAQNNAKLDNLSEKEKEK 420

Query: 421 FKNEIKDFQKDSKPYLDALGNDRIAFVSKKDPKHSALITEFNKGDLSYTLKDYGKKADKA 480
F+ EIKDFQKDSK YLDALGNDRIAFVSKKD KHSALITEF GDLSYTLKDYGKKADKA
Sbjct: 421 FRTEIKDFQKDSKAYLDALGNDRIAFVSKKDTKHSALITEFGNGDLSYTLKDYGKKADKA 480

Query: 481 LDREKNVTLQGNLKHDGVMFVNYSNFKYTNASKSPNKGVGVTNGVSHLEAGFSKVAVFNL 540
LDREKNVTLQG+LKHDGVMFV+YSNFKYTNASK+PNKGVGVTNGVSHLE GF+KVA+FNL
Sbjct: 481 LDREKNVTLQGSLKHDGVMFVDYSNFKYTNASKNPNKGVGVTNGVSHLEVGFNKVAIFNL 540

Query: 541 PNLNNLAITSVVRRDLEDKLIAKGLSPQEANKLVKDFLSSNKELVGKALNFNKAVAEAKN 600
P+LNNLAITS VRR+LEDKL KGLSPQEANKL+KDFLSSNKELVGK LNFNKAVA+AKN
Sbjct: 541 PDLNNLAITSFVRRNLEDKLTTKGLSPQEANKLIKDFLSSNKELVGKTLNFNKAVADAKN 600

Query: 601 TGNYDEVKRAQKDLEKSLKKREHLEKDVAKNLESKSGNKNKMEVKSQANSQKDEIFALIN 660
TGNYDEVK+AQKDLEKSL+KREHLEK+V K LESKSGNKNKME K+QANSQKDEIFALIN
Sbjct: 601 TGNYDEVKKAQKDLEKSLRKREHLEKEVEKKLESKSGNKNKMEAKAQANSQKDEIFALIN 660

Query: 661 KEANRDARAIAYAQNLKDIKRELSDKLENISKDLKDFSKSFDEFKNGKSKDFSKVEETLK 720
KEANRDARAIAYAQNLK IKRELSDKLEN++K+LKDF KSFDEFKNGK+KDFSK EETLK
Sbjct: 661 KEANRDARAIAYAQNLKGIKRELSDKLENVNKNLKDFDKSFDEFKNGKNKDFSKAEETLK 720

Query: 721 ALKGSVKDLGINPEWISKVENLNAALNEFKNGKNKDFSKVTQAKSDLENSIKDVIINQKI 780
ALKGSVKDLGINPEWISKVENLNAALNEFKNGKNKDFSKVTQAKSDLENS+KDVIINQK+
Sbjct: 721 ALKGSVKDLGINPEWISKVENLNAALNEFKNGKNKDFSKVTQAKSDLENSVKDVIINQKV 780

Query: 781 TDKVDNLNQAVSMAKIAGNFSGVEQALADLKNFSKEQLAQQAQKNESFNVGK-SEIYQSV 839
TDKVDNLNQAVS+AK G+FS VEQALADLKNFSKEQLAQQAQKNES N K SEIYQSV
Sbjct: 781 TDKVDNLNQAVSVAKATGDFSRVEQALADLKNFSKEQLAQQAQKNESLNARKKSEIYQSV 840

Query: 840 KNGVNGTLVGNGLSGIEATALAKNFSDIKKELNEKFKNFNNNNNGLKNGKDKGPEEPIYA 899
KNGVNGTLVGNGLS EAT L+KNFSDIKKELN K NFNNNNN EPIYA
Sbjct: 841 KNGVNGTLVGNGLSQAEATTLSKNFSDIKKELNAKLGNFNNNNNNGLKN------EPIYA 894

Query: 900 QVNKKKTGQVASPEEPIYAQVAKKVTQKIDQLNQAASGFGGVGQ-AGFPLKRHDKVEDLS 958
+VNKKK GQ AS EEPIYAQVAKKV KID+LNQ ASG G VGQ AGFPLKRHDK
Sbjct: 895 KVNKKKAGQAASLEEPIYAQVAKKVNAKIDRLNQIASGLGVVGQAAGFPLKRHDK----- 949

Query: 959 KVGRSVSPEPIYATIDDLGGSFPLRRSAAVDDLSKVGRSREQELTQKIDNLSQAVSEAKA 1018
VDDLSKVG SR QEL QKIDNL+QAVSEAKA
Sbjct: 950 -----------------------------VDDLSKVGLSRNQELAQKIDNLNQAVSEAKA 980

Query: 1019 GFFGNLERTIDKLKDSTKNNPVNLWAENAKKVPASLSAKLDNYATNSHTRINSNIQNGAI 1078
GFFGNLE+TIDKLKDSTK+NP+NLW E+AKKVPASLSAKLDNYATNSH RINSNI+NGAI
Sbjct: 981 GFFGNLEQTIDKLKDSTKHNPMNLWVESAKKVPASLSAKLDNYATNSHIRINSNIKNGAI 1040

Query: 1079 NEKATGMLTQKNPEWLKLVNDKIVAHNVGSVPLSEYDKIGFNQKNMKDYSDSFKFSTKLN 1138
NEKATGMLTQKNPEWLKLVNDKIVAHNVGSVPLSEYDKIGFNQKNMKDYSDSFKFSTKLN
Sbjct: 1041 NEKATGMLTQKNPEWLKLVNDKIVAHNVGSVPLSEYDKIGFNQKNMKDYSDSFKFSTKLN 1100

Query: 1139 NTVKDVKSGFTQFLANAFSTG-YYSLARENAEHGIKNANTKGGFQKS 1184
N VKD SGFTQFL NAFST YY LARENAEHGIKN NTKGGFQKS
Sbjct: 1101 NAVKDTNSGFTQFLTNAFSTASYYCLARENAEHGIKNVNTKGGFQKS 1147


8  hp2018_0668  hp2018_06792  Y  N  N  Genomic Island
LocusTag       DNBias  CDNBias  %GCBias     Product
hp2018_0668    2       10       0.421058    hypothetical protein
hp2018_0669    1       11       -0.734981   hypothetical protein
hp2018_0670    1       10       -0.894769   N-acetylglucosamine-1-phosphate
hp2018_0672    2       14       -3.083317   Iron III dicitrate transport protein
hp2018_0673    1       15       -4.139043   Ferrous iron transport protein B
hp2018_0674    3       18       -3.112297   Polysaccharide biosynthesis protein
hp2018_0675    6       21       -2.423973   putative type II DNA modification enzyme
hp2018_0676    7       18       -0.360774   hypothetical protein
hp2018_0677    5       16       0.112193    putative type II restriction enzyme
hp2018_0678    5       15       2.602876    Acetone carboxylase gamma subunit
hp2018_06791   4       14       2.765471    Acetone carboxylase alpha subunit /
hp2018_06792   2       14       3.153577    Acetone carboxylase alpha subunit /
9  hp2018_0709  hp2018_07133  Y  N  N  Genomic Island
LocusTag       DNBias  CDNBias  %GCBias     Product
hp2018_0709    2       16       -4.026589   tRNA Ile -lysidine synthetase
hp2018_0710    3       19       -4.112232   hypothetical protein
hp2018_0711    6       22       -3.069539   hypothetical protein
hp2018_0712    5       25       -4.173330   hypothetical protein
hp2018_07131   5       24       -4.272455   hypothetical protein
hp2018_07132   4       23       -4.881133   hypothetical protein
hp2018_07133   3       21       -4.519385   hypothetical protein
10  hp2018_0852  hp2018_0864  Y  N  N  Genomic Island
LocusTag       DNBias  CDNBias  %GCBias     Product
hp2018_0852    2       19       -0.508302   CDP-diacylglycerol pyrophosphatase
hp2018_0853    3       20       0.025699    hypothetical protein
hp2018_0854    1       15       2.285960    Alkylphosphonate utilization operon protein
hp2018_0855    1       14       2.718903    hypothetical protein
hp2018_0856    2       13       2.467846    hypothetical protein
hp2018_0857    2       13       2.424652    hypothetical protein
hp2018_0858    3       14       2.473471    Catalase
hp2018_0859    2       14       2.640469    putative iron regulated outer membrane protein
hp2018_0860    0       18       0.318739    Crossover junction endodeoxyribonuclease
hp2018_0861    -1      15       -1.380771   hypothetical protein
hp2018_0862    2       13       -1.318839   hypothetical protein
hp2018_0863    2       12       -1.034333   hypothetical protein
hp2018_0864    3       12       -1.239481   hypothetical protein
11  hp2018_0875  hp2018_08812  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias     Product
hp2018_0875    2       18       -4.699759   Acyl-CoA hydrolase
hp2018_0876    6       22       -7.139543   hypothetical protein
hp2018_0877    1       16       1.490011    hypothetical protein
hp2018_0878    2       17       2.410029    hypothetical protein
hp2018_0879    0       17       3.563114    hypothetical protein
hp2018_0880    1       19       4.014123    hypothetical protein
hp2018_08811   1       18       4.382087    outer membrane protein - adhesin
hp2018_08812   1       18       4.093903    outer membrane protein - adhesin
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
hp2018_0878    PF01206       28     0.002    SirA family protein
		>PF01206#SirA family protein

Length = 76

Score = 27.8 bits (62), Expect = 0.002
Identities = 9/43 (20%), Positives = 21/43 (48%), Gaps = 7/43 (16%)

Query: 44 IPNLETQQAMKEALNGENLEVI-------EDFSAWANEIKKEV 79
+P L+ ++ + GE L V+ +DF +++ + E+
Sbjct: 17 LPILKAKKTLATMNAGEVLYVMATDPGSVKDFESFSKQTGHEL 59


12  hp2018_0962  hp2018_0982  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias     Product
hp2018_0962    3       15       1.165077    Cell division protein
hp2018_0963    3       18       0.594879    Cell division protein
hp2018_09641   3       18       -4.038113   hypothetical protein
hp2018_09642   3       20       -5.025122   hypothetical protein
hp2018_09643   6       18       -5.274847   hypothetical protein
hp2018_0965    5       16       -5.147369   Mechanosensitive channel
hp2018_0966    7       19       -6.315764   hypothetical protein
hp2018_09681   7       21       -6.959439   DNA topoisomerase I
hp2018_09682   7       18       -6.906119   DNA topoisomerase I
hp2018_09701   6       19       -7.015079   Conjugal plasmid transfer system protein
hp2018_09702   3       24       -7.424257   Conjugal plasmid transfer system protein
hp2018_0972    4       27       -8.572684   hypothetical protein
hp2018_0973    4       30       -9.613941   hypothetical protein
hp2018_0974    5       30       -10.022034  hypothetical protein
hp2018_09751   3       28       -8.207811   P-type DNA transfer ATPase
hp2018_09752   3       28       -7.906790   P-type DNA transfer ATPase
hp2018_0976    3       26       -8.065699   hypothetical protein
hp2018_0977    4       24       -8.016876   hypothetical protein
hp2018_0978    3       23       -7.864459   hypothetical protein
hp2018_0979    4       21       -7.223076   conjugal transfer protein
hp2018_0980    3       24       -6.163261   hypothetical protein
hp2018_09811   2       22       -5.367983   integrase/recombinase
hp2018_09812   0       16       -4.229339   integrase/recombinase
hp2018_0982    -1      16       -3.115703   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
hp2018_0962    SHAPEPROTEIN  40     2e-05    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 39.7 bits (93), Expect = 2e-05
Identities = 38/176 (21%), Positives = 66/176 (37%), Gaps = 12/176 (6%)

Query: 211 AASIATLSNDERELGVACVDMGGETCNLTIYSGNSIRYNKYLPVGSHHLTTDL------S 264
AA+I G VD+GG T + + S N + Y+ + +G + +
Sbjct: 146 AAAIGAGLPVSEATGSMVVDIGGGTTEVAVISLNGVVYSSSVRIGGDRFDEAIINYVRRN 205

Query: 265 HMLNTPFPYAEEVKIKYGDLSFESGTETPSQSVQIPTTGSDGNESHIVPLSEIQTIMRER 324
+ AE +K + G S G E V+ + +EI ++E
Sbjct: 206 YGSLIGEATAERIKHEIG--SAYPGDEVREIEVRGRNLAEGVPRGFTLNSNEILEALQEP 263

Query: 325 ALETFKIIHRSIQDSGLE---EHLGGGVVLTGGMALMKGIKELARTHFTNYPVRLA 377
+ +++ E + G+VLTGG AL++ + L T PV +A
Sbjct: 264 LTGIVSAVMVALEQCPPELASDISERGMVLTGGGALLRNLDRLLM-EETGIPVVVA 318


13  hp2018_1057  hp2018_1063  Y  N  N  Genomic Island
LocusTag       DNBias  CDNBias  %GCBias     Product
hp2018_1057    2       11       0.337770    6-phosphogluconolactonase
hp2018_1058    3       9        0.449034    Glucokinase
hp2018_1059    4       12       -0.709239   Alcohol dehydrogenase
hp2018_1060    3       12       -0.854965   putative lipopolysaccharide biosynthesis
hp2018_1061    2       11       0.934790    putative lipopolysaccharide biosynthesis
hp2018_1062    3       13       2.722990    hypothetical protein
hp2018_1063    0       14       3.145683    putative Outer membrane protein
14  hp2018_1078  hp2018_1085  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias     Product
hp2018_1078    2       17       -0.873443   DNA-cytosine methyltransferase
hp2018_1079    3       13       -0.577890   flgM protein
hp2018_1080    2       11       -1.601118   hypothetical protein
hp2018_1081    4       11       -1.373091   FKBP-type peptidyl-prolyl cis-trans isomerase
hp2018_1082    3       12       -2.178674   hypothetical protein
hp2018_1083    4       13       -1.793897   peptidoglycan associated lipoprotein precursor
hp2018_1084    2       14       0.179577    tolB protein precursor
hp2018_1085    2       18       0.308147    TonB-like protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
hp2018_1083    OMPADOMAIN    136    4e-42    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 136 bits (345), Expect = 4e-42
Identities = 46/162 (28%), Positives = 71/162 (43%), Gaps = 24/162 (14%)

Query: 2 AGDVSAKTVQTAPV-TTEPAPEKEEPKQEPAPVVEEKPAIESGTIIASIYFDFDKYEIKE 60
+ VS + Q PAP PAP V+ K T+ + + F+F+K +K
Sbjct: 184 SLGVSYRFGQGEAAPVVAPAPA-------PAPEVQTK----HFTLKSDVLFNFNKATLKP 232

Query: 61 SDQETLDEIVQKAKE---NHMQVLLEGNTDEFGSSEYNQALGVKRTLSVKNALVIKGVEK 117
Q LD++ + V++ G TD GS YNQ L +R SV + L+ KG+
Sbjct: 233 EGQAALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAYNQGLSERRAQSVVDYLISKGIPA 292

Query: 118 DMIKTISFGETKPKCVQ-----KTR----ECYRENRRVDVKL 150
D I GE+ P K R +C +RRV++++
Sbjct: 293 DKISARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEV 334


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
hp2018_1085    TYPE4SSCAGA   32     0.002    Type IV secretion system CagA exotoxin signature.
		>TYPE4SSCAGA#Type IV secretion system CagA exotoxin signature.

Length = 1147

Score = 32.4 bits (73), Expect = 0.002
Identities = 36/139 (25%), Positives = 64/139 (46%), Gaps = 12/139 (8%)

Query: 32 KEAEKILLDLNKKDEQAID--LNLEDLPSEKKNE-KIEKVTEKQGDF---LEPKEEPKEE 85
+EA K++ D +++ + LN ++ KN ++V + Q D L +E ++E
Sbjct: 568 QEANKLIKDFLSSNKELVGKTLNFNKAVADAKNTGNYDEVKKAQKDLEKSLRKREHLEKE 627

Query: 86 PEESLEDIFSSLNDFQEKTDKNAQKDE-----QKNEQEEQRRLREQQRLKQ-NQENQEML 139
E+ LE + N + K N+QKDE K + R + Q LK +E + L
Sbjct: 628 VEKKLESKSGNKNKMEAKAQANSQKDEIFALINKEANRDARAIAYAQNLKGIKRELSDKL 687

Query: 140 KGLQQNLNQFTQKLESVKN 158
+ + +NL F + + KN
Sbjct: 688 ENVNKNLKDFDKSFDEFKN 706


15  hp2018_1094  hp2018_1100  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias     Product
hp2018_1094    0       19       -4.348558   ATP synthase B' chain
hp2018_1095    1       17       -3.751391   Chromosome partitioning protein / Stage 0
hp2018_1096    2       18       -4.051491   Chromosome partitioning protein
hp2018_1097    2       19       -4.959519   Biotin-protein ligase
hp2018_1098    2       19       -5.227188   Methionyl-tRNA formyltransferase
hp2018_1099    2       20       -5.430477   hypothetical protein
hp2018_1100    2       19       -0.140276   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
hp2018_1098    FERRIBNDNGPP  33     0.001    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 33.0 bits (75), Expect = 0.001
Identities = 12/33 (36%), Positives = 20/33 (60%)

Query: 72 EPEVQILKDLKPDFIVVVAYGKILPKEVLAIAP 104
EP +++L ++KP F+V A P+ + IAP
Sbjct: 86 EPNLELLTEMKPSFMVWSAGYGPSPEMLARIAP 118


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
hp2018_1100    CHANLCOLICIN  36     3e-04    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 35.8 bits (82), Expect = 3e-04
Identities = 34/214 (15%), Positives = 86/214 (40%), Gaps = 10/214 (4%)

Query: 4 NQTIPFKCPKCQEPINVSEALYKQIELENQSRFLAQQKAFEKEVNEKRAQYHTHLKMLEQ 63
N+ + + ++ A ++ E++ LA+ E++ ++ + EQ
Sbjct: 99 NEALRHNASRTPSATELAHANNAAMQAEDERLRLAKA---EEKARKEAEAAEKAFQEAEQ 155

Query: 64 KEEALKERAKEQQAQFDEAVKQASMLALQDERAKIIEEARKNAFLEQQKGLELLQKELDE 123
+ + ++ E + Q A + LA E AK +E A+K Q + +++ +
Sbjct: 156 RRKEIEREKAETERQLKLAEAEEKRLAALSEEAKAVEIAQKKLSAAQSEVVKMDGEIKTL 215

Query: 124 KSKQVQELHQKEAEIERLKRENNE-------VESRLKAENEKKLNEKLDLEREKIEKALH 176
S+ +H ++AE++ L + NE + + + L+ +A
Sbjct: 216 NSRLSSSIHARDAEMKTLAGKRNELAQASAKYKELDELVKKLSPRANDPLQNRPFFEATR 275

Query: 177 EKNELKFKQQEEQLEMLRNELKNAQRKAELSSQQ 210
+ ++E+Q ++ +E + + A+++ Q
Sbjct: 276 RRVGAGKIREEKQKQVTASETRINRINADITQIQ 309


16  hp2018_1202  hp2018_1218  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias     Product
hp2018_1202    3       14       0.641253    Maf-like protein
hp2018_12031   3       14       0.469743    Alanyl-tRNA synthetase
hp2018_12032   3       15       -0.051323   Alanyl-tRNA synthetase
hp2018_1204    4       20       -2.154880   hypothetical protein
hp2018_1205    2       15       -2.253972   hypothetical protein
hp2018_1206    1       12       -1.849588   hypothetical protein
hp2018_1207    1       11       -1.142272   30S ribosomal protein S18
hp2018_1208    2       12       -1.132573   Single-stranded DNA-binding protein
hp2018_1209    2       11       -1.211805   30S ribosomal protein S6
hp2018_1210    2       10       -0.845688   hypothetical protein
hp2018_1211    1       10       -0.305808   3'-to-5' exoribonuclease RNase R
hp2018_1212    0       10       -0.089419   Shikimate 5-dehydrogenase I alpha
hp2018_1213    0       10       0.460558    hypothetical protein
hp2018_1214    0       10       0.658944    Oligopeptide transport system permease protein
hp2018_1215    0       11       1.082144    Oligopeptide ABC transporter/periplasmic
hp2018_1216    1       13       0.984638    Tryptophanyl-tRNA synthetase
hp2018_1217    2       16       1.706140    biotin synthesis protein
hp2018_1218    3       17       2.546098    Preprotein translocase subunit
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
hp2018_1204    PF05844       26     0.014    YopD protein
		>PF05844#YopD protein

Length = 295

Score = 26.1 bits (57), Expect = 0.014
Identities = 12/65 (18%), Positives = 28/65 (43%), Gaps = 1/65 (1%)

Query: 10 SVLKANNPHFDKIFEKHNQLDADIKTAEQQNASDAEVSHMKKQKLKLKDEIHSMIIEYRE 69
L+A F+ + I++ Q + +V + Q ++E+++ I + +
Sbjct: 197 VALRAAGRAFESRNGALQVANTVIQSFVQMANASVQVRQGESQASAREEEVNATIGQ-SQ 255

Query: 70 KQKSD 74
KQK +
Sbjct: 256 KQKVE 260


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
hp2018_1218    SECGEXPORT    49     4e-10    Protein-export SecG membrane protein signature.
		>SECGEXPORT#Protein-export SecG membrane protein signature.

Length = 110

Score = 48.8 bits (116), Expect = 4e-10
Identities = 25/84 (29%), Positives = 47/84 (55%), Gaps = 3/84 (3%)

Query: 1 MTSALLGLQIVLAVLIVVVVLLQ--KSSSIGLGAYSGSNDSLFGAKGPASFMAKLTMFLG 58
M ALL + +++A+ +V +++LQ K + +G +G++ +LFG+ G +FM ++T L
Sbjct: 1 MYEALLVVFLIVAIGLVGLIMLQQGKGADMGASFGAGASATLFGSSGSGNFMTRMTALLA 60

Query: 59 LLFVINTIALGYFYNKEYGKSVLD 82
LF I ++ LG N +
Sbjct: 61 TLFFIISLVLGNI-NSNKTNKGSE 83


17  hp2018_1344  hp2018_1355  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias     Product
hp2018_1344    -2      10       -3.052554   Prephenate and/or arogenate dehydrogenase
hp2018_1345    -2      10       -3.731825   putative endonuclease G
hp2018_1346    -2      11       -3.962274   Type III restriction-modification system
hp2018_13471   -2      12       -2.804783   Type III restriction-modification system DNA
hp2018_013472  -1      12       -2.375158   Type III restriction-modification system DNA
hp2018_1348    -1      12       -2.230266   Biotin synthase
hp2018_1349    1       15       -4.089664   Ribonuclease BN
hp2018_13501   2       15       -4.038168   hypothetical protein
hp2018_13502   1       16       -3.522519   hypothetical protein
hp2018_13511   2       16       -3.503397   hypothetical protein
hp2018_13512   1       16       -2.917697   hypothetical protein
hp2018_13513   0       15       -2.772655   hypothetical protein
hp2018_1352    -1      14       -3.813344   hypothetical protein
hp2018_1353    0       18       -4.785434   NADPH-dependent 7-cyano-7-deazaguanine
hp2018_1354    -1      17       -4.192408   Iojap like protein
hp2018_1355    -1      18       -4.195827   tRNA delta 2-isopentenylpyrophosphate
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
hp2018_1344    SHIGARICIN    29     0.024    Ribosome inactivating protein family signature.
		>SHIGARICIN#Ribosome inactivating protein family signature.

Length = 289

Score = 28.6 bits (64), Expect = 0.024
Identities = 11/74 (14%), Positives = 19/74 (25%), Gaps = 5/74 (6%)

Query: 83 TPIKKSTTIIDLGGAKAQILHNIPKSIRKNFIAAHPMCGTEFYGPKASVKGLYENALVIL 142
P + L GA + ++RK + Y L
Sbjct: 18 APAVEGDVSFRLSGATSSSYGVFISNLRKALPYERKLYDIPLLRSTLPGSQRY-----AL 72

Query: 143 CDLEDSGTEQVEIA 156
L + E + +A
Sbjct: 73 IHLTNYADETISVA 86


18  hp2018_1403  hp2018_1418  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias     Product
hp2018_1403    3       12       0.469304    Outer membrane protein
hp2018_1404    2       11       0.390016    Branched-chain amino acid aminotransferase
hp2018_1405    1       10       -1.228248   putative Outer membrane protein
hp2018_1406    1       11       -1.348489   DNA polymerase I
hp2018_14071   1       16       -1.489869   type IIS restriction enzyme R protein
hp2018_14072   1       16       -0.965229   type IIS restriction enzyme R protein
hp2018_14081   0       16       0.121278    restriction enzyme BcgI alpha chain-like
hp2018_14082   2       17       -0.008866   restriction enzyme BcgI alpha chain-like
hp2018_1409    2       15       1.112878    hypothetical protein
hp2018_1410    2       12       0.218343    Thymidylate kinase
hp2018_1411    2       12       0.629101    Phosphopantetheine adenylyltransferase
hp2018_1412    2       12       0.736708    3-polyprenyl-4-hydroxybenzoate carboxy-lyase
hp2018_1413    2       11       0.242803    Flagellar basal-body P-ring formation protein
hp2018_1414    1       11       0.284929    ATP-dependent DNA helicase /epsilon
hp2018_1415    0       12       0.048882    putative transmembrane protein
hp2018_1416    0       14       0.668328    Seryl-tRNA synthetase
hp2018_1417    -1      17       0.000446    hypothetical protein
hp2018_1418    2       15       -0.396836   Probable exodeoxyribonuclease VII small subunit
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
hp2018_1411    LPSBIOSNTHSS  225    9e-79    Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein signature.

Length = 166

Score = 225 bits (574), Expect = 9e-79
Identities = 63/147 (42%), Positives = 93/147 (63%)

Query: 4 IGIYPGTFDPVTNGHIDIIHRSSELFEELIVAVAHSSAKNPMFSLDERLKMMQLATKSFK 63
IYPG+FDP+T GH+DII R LF+++ VAV + K PMFS+ ERL+ + A
Sbjct: 2 NAIYPGSFDPITFGHLDIIERGCRLFDQVYVAVLRNPNKQPMFSVQERLEQIAKAIAHLP 61

Query: 64 NVECVAFEGLLANLAKEYHCKVLVRGLRVVSDFEYELQMGYANKSLNHELETLYFMPTLQ 123
N + +FEGL N A++ ++RGLRV+SDFE ELQM NK+L +LET++ + +
Sbjct: 62 NAQVDSFEGLTVNYARQRQAGAILRGLRVLSDFELELQMANTNKTLASDLETVFLTTSTE 121

Query: 124 NAFISSSIVRSIIAHKGDASHLVPEEI 150
+F+SSS+V+ + G+ H VP +
Sbjct: 122 YSFLSSSLVKEVARFGGNVEHFVPSHV 148


19  hp2018_1444  hp2018_1483  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias     Product
hp2018_1444    2       13       2.893774    Saccharopine dehydrogenase
hp2018_1445    1       11       2.150241    ferrodoxin-like protein
hp2018_1446    -1      9        1.908459    Acyl-phosphate:glycerol-3-phosphate
hp2018_1447    -2      12       0.106584    Dihydroneopterin aldolase
hp2018_1448    -2      12       0.014981    hypothetical protein
hp2018_1449    -2      11       0.075605    iron-regulated outer membrane protein
hp2018_1450    0       12       -2.837673   Selenocysteine synthase
hp2018_1451    -1      10       -2.833352   Transcription termination protein
hp2018_1453    0       11       -3.349393   putative type IIS restriction/modification
hp2018_1454    1       12       -3.097816   hypothetical protein
hp2018_14561   2       11       -2.707671   Type III restriction-modification system
hp2018_14562   1       11       -2.527492   Type III restriction-modification system
hp2018_1457    1       12       -1.647715   ATP-dependent DNA helicase
hp2018_1458    2       17       -0.971306   hypothetical protein
hp2018_1459    0       14       -0.787610   Outer membrane protein
hp2018_1460    0       12       -1.118947   Exodeoxyribonuclease III
hp2018_1461    1       13       -0.130268   *hypothetical protein
hp2018_1462    2       16       -0.276557   hypothetical protein
hp2018_1463    2       14       -1.282505   Chromosomal replication initiator protein
hp2018_1464    3       15       -2.089857   purine nucleoside phosphorylase
hp2018_1465    3       13       -1.530747   hypothetical protein
hp2018_1466    2       12       -1.849588   Glucosamine--fructose-6-phosphate
hp2018_1467    0       14       -2.791792   Thymidylate synthase
hp2018_1468    -2      12       -0.318225   Type I restriction-modification system
hp2018_14691   -2      13       0.370019    Type I restriction-modification system
hp2018_14692   -2      12       0.946415    Type I restriction-modification system
hp2018_1470    -1      12       1.420112    Type I restriction-modification system
hp2018_1471    2       14       3.724329    Putative predicted metal-dependent hydrolase
hp2018_1472    1       13       2.605834    Iron(III) dicitrate transport protein
hp2018_1473    0       9        0.603154    hypothetical protein
hp2018_1474    -1      9        0.924369    Arginase
hp2018_1475    -3      8        -0.465437   Alanine dehydrogenase
hp2018_1476    -3      9        -1.455414   Alcohol dehydrogenase
hp2018_1477    -2      10       -2.689179   hypothetical protein
hp2018_1478    -2      12       -1.521054   outer membrane protein
hp2018_1479    0       13       -3.424812   NAD kinase
hp2018_1481    3       14       -4.656841   Fibronectin/fibrinogen-binding protein
hp2018_1482    2       15       -1.803397   hypothetical protein
hp2018_1483    3       13       -1.435574   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
hp2018_1463    HTHFIS        35     5e-04    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 35.2 bits (81), Expect = 5e-04
Identities = 9/51 (17%), Positives = 24/51 (47%), Gaps = 4/51 (7%)

Query: 127 TVYEIAKKVAQSDTPPYNPVLFYGGTGLGKTHILNAIGNHALEKHKKVVLV 177
+Y + ++ Q+D ++ G +G GK + A+ ++ ++ V +
Sbjct: 148 EIYRVLARLMQTDLT----LMITGESGTGKELVARALHDYGKRRNGPFVAI 194


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
hp2018_1481    FbpA_PF05833  112    6e-29    Fibronectin-binding protein
		>FbpA_PF05833#Fibronectin-binding protein

Length = 577

Score = 112 bits (282), Expect = 6e-29
Identities = 72/358 (20%), Positives = 141/358 (39%), Gaps = 25/358 (6%)

Query: 97 AKDLAYKSETFILRLEMIPKKANLMILDQEKCVIEA--FRFNDRVAKNDILGALPPNIYE 154
+ ++ ++ + + L + K + + I++ F FN N +G N+
Sbjct: 209 SSEICFRLKNNSIDLSLSNLKEIVEVCKDLFKEIQSNKFEFNCYTKNNSFVGFYCLNLMS 268

Query: 155 HQEEDLDFKDLLDILEKDFLSYQ--HKELEHKKNQIIKRLNAQKERLKEKLEKLEDPKNL 212
++ D L ++F + L+ K + + K + R +K + L +
Sbjct: 269 KEDYKKIQYDSSSKLLENFYYAKDKSDRLKSKSSDLQKIVMNNINRCTKKDKILNNTLKK 328

Query: 213 QLEAKELQTQASLLLTYQHLIHKHESCVILKDFED---KECMIEIDKSMPLNAFINKKFT 269
+ + LL + + K S + L ++ I +D++ + + +
Sbjct: 329 CEDKDIFKLYGELLTANIYALKKGLSHIELANYYSENYDTVKITLDENKTPSQNVQSYYK 388

Query: 270 LSKKKKQKSQFLYLEEENLKEKIAFKENQINYVRDAAEESVLE------------MFMPV 317
K K+ + + +E++ + + + + +A +E F +
Sbjct: 389 KYNKLKKSEEAANEQLLQNEEELNYLYSVLTNINNADNYDEIEEIKKELIETGYIKFKKI 448

Query: 318 KNSKIKRPMNGYEVLYYKDFKIGLGKNQKENIKL-LQDARANDLWMHVRNIPGSHLIVFC 376
SK + + I +GKN +N L L+ A +D+W H +NIPGSH+IV
Sbjct: 449 YKSKKSKTSKPMHFISKDGIDIYVGKNNIQNDYLTLKFANKHDIWFHTKNIPGSHVIVKN 508

Query: 377 QKNTPKDEIILELAKMLIKMQKDAFNS-YEIDYTQRKFVKIIKGAN---VIYSKYRTI 430
+ P + +LE A + K +S +DYT+ K VK GA VIYS +TI
Sbjct: 509 IMDIP-ESTLLEAANLAAYYSKSQNSSNVPVDYTEVKNVKKPNGAKPGMVIYSTNQTI 565



Score = 35.2 bits (81), Expect = 5e-04
Identities = 19/92 (20%), Positives = 48/92 (52%), Gaps = 5/92 (5%)

Query: 46 SAPYIGLSKKPPESVLKNTLALDFCLNKFTKNAKILQANVIDNDRI--LEIKGAKDLAYK 103
+ P I L+ + +K + L K+ NAKI+ + I+ DRI ++ + +L +
Sbjct: 55 NYPRIHLTDLTKPNPIKAPMFCMV-LRKYISNAKIVDIHQINQDRIVVIDFESTDELGFN 113

Query: 104 SETFILRLEMIPKKANLMILD-QEKCVIEAFR 134
S L +E++ + +N+ ++ ++ ++++ +
Sbjct: 114 SIY-SLIIEIMGRHSNMTLIRKRDNIIMDSIK 144


20  hp2018_0043  hp2018_0048  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias     Product
hp2018_0043    -2      13       0.630043    DNA transformation competence protein
hp2018_0044    -2      13       0.574266    competence protein
hp2018_0045    -1      13       0.921221    inner membrane protein
hp2018_0046    0       15       0.546145    Mannose-6-phosphate
hp2018_0047    -1      12       1.229786    GDP-D-mannose dehydratase
hp2018_0048    -1      11       1.523161    Putative fucose synthetase
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
hp2018_0043    PF04335       133    2e-40    VirB8 type IV secretion protein
		>PF04335#VirB8 type IV secretion protein

Length = 227

Score = 133 bits (335), Expect = 2e-40
Identities = 36/202 (17%), Positives = 72/202 (35%), Gaps = 4/202 (1%)

Query: 40 QSVFRLERNRLKIAYRLLGLMSFIALVLAIVLISILPLQKTEHHF--VDFLNQDKHYAII 97
+ K+A+ + G+ +A + + ++ PL+ E + VD + A
Sbjct: 22 RDKLAAAERSKKLAWVVAGVAGALATAGVVAVAALTPLKTVEPYVITVDRNTGEASIAAK 81

Query: 98 QRADKSISSNEALARSLIGAYVLNRESINRIDDKSRYELVRLQSSSKVWQRFEDLIKAQN 157
D +I+ +EA+ + + YV RE + ++ V + S+ R+ K N
Sbjct: 82 LHGDATITYDEAVRKYFLATYVRYREGWIAAAREEYFDAVMVMSARPEQDRWSRFYKTDN 141

Query: 158 SIYAQSHLEREVHI-VNIAIYQQDNNPIASVSIAAKLLNENKLVYEKRYKIVLSYLFDTP 216
Q+ L + V I +A V + + + + + Y D
Sbjct: 142 PQSPQNILANRTDVFVEIKRVSFLGGNVAQVYFTKESVTGSNST-KTDAVATIKYKVDGT 200

Query: 217 DFDYASMPKNPTGFKITRYSIT 238
KNP G+++ Y
Sbjct: 201 PSKEVDRFKNPLGYQVESYRAD 222


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score   E-value   Comments
hp2018_0044    TYPE4SSCAGX   32      0.003     Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein signature.

Length = 522

Score = 32.1 bits (72), Expect = 0.003
Identities = 27/70 (38%), Positives = 37/70 (52%), Gaps = 8/70 (11%)

Query: 200 KEKEEETIIIGDNTNAMKIIKKDIQKGYKALKSSQ--RKWYCLWACSKKSKLSLMPKEIF 257
K +EE+ II D A+ + Q + ALK + R + A K+SK +MP EIF
Sbjct: 367 KIREEKQKIILDQAKAL-----ETQYVHNALKRNPVPRNYNYYQAPEKRSK-HIMPSEIF 420

Query: 258 NDKQFTYFKF 267
+D FTYF F
Sbjct: 421 DDGTFTYFGF 430


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits           Score   E-value   Comments
hp2018_0047    NUCEPIMERASE   88      2e-21     Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 87.5 bits (217), Expect = 2e-21
Identities = 46/180 (25%), Positives = 72/180 (40%), Gaps = 19/180 (10%)

Query: 7 LITGVTGQDGSYLAEYLLNLGYEVHGLKRRSSSINTSRIDHLYEDLHSEHKRRFFLHYGD 66
L+TG G G ++++ LL G++V G+ + + S E L F H D
Sbjct: 4 LVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQP---GFQFHKID 60

Query: 67 MTDSSNLIHLIATTKPTEIYNLAAQSHVKVSFETPEYTANADGIGTLRILEAMRILGLEK 126
+ D + L A+ ++ + V+ S E P A+++ G L ILE R ++
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQ- 119

Query: 127 KTRFYQASTSELYGEVLETPQNENTPF-------NPRSPYAVAKMYAFYITKNYREAYNL 179
AS+S +YG N PF +P S YA K + Y Y L
Sbjct: 120 --HLLYASSSSVYGL------NRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYGL 171


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits           Score   E-value   Comments
hp2018_0048    NUCEPIMERASE   47      4e-08     Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 47.1 bits (112), Expect = 4e-08
Identities = 52/353 (14%), Positives = 107/353 (30%), Gaps = 68/353 (19%)

Query: 5 ILITGAYGMVGQNTALYFKKNKPDV-----------TLLTPKKSELY-----------LL 42
L+TGA G +G + + + V L + EL L
Sbjct: 3 YLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDLA 62

Query: 43 DKDNVQAYLKEYKPTGIIHCAGRVGGIVANMNDLSTYMVENLLMGLYLFSSALDLGVKKA 102
D++ + + R + ++ + Y NL L + ++
Sbjct: 63 DREGMTDLFASGHFERVFISPHR-LAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQHL 121

Query: 103 INLASSCAYPKYAPNPLKESDLLNGSLEPTNEGYALAKLSVMKYCEYMSTEKGVFYKTLV 162
+ +SS Y P D ++ + YA K + S G+ L
Sbjct: 122 LYASSSSVYGLNRKMPFSTDDSVDHPVSL----YAATKKANELMAHTYSHLYGLPATGLR 177

Query: 163 PCNLYGE-------FDKFEEKIAHMIPGLIARMHTAKLKGEKNFAMWGDGTARREYLNAK 215
+YG KF + + L+G+ ++ G +R++
Sbjct: 178 FFTVYGPWGRPDMALFKFTKAM---------------LEGKS-IDVYNYGKMKRDFTYID 221

Query: 216 DLARFIALAYENIAQIPS-----------------VMNVGSGVDYSIEEYYEKVAQVLDY 258
D+A I + I + V N+G+ + +Y + + L
Sbjct: 222 DIAEAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGI 281

Query: 259 KGVFVKDLSKPVGMQQKLMDISK-QKALKWELEIPLEQGIKEAYEYYLKLLEV 310
+ +P + + D + + + E ++ G+K +Y +V
Sbjct: 282 EAKKNMLPLQPGDVLETSADTKALYEVIGFTPETTVKDGVKNFVNWYRDFYKV 334


21   hp2018_01251   hp2018_0129   N   Y   N   Pathogenicity Island (unbiased-composition)
LocusTag       DNBias   CDNBias   %GCBias     Product
hp2018_01251   -3       11        2.176502    Flagellin
hp2018_01252   -3       12        1.073594    Flagellin
hp2018_0126    -2       10        1.043392    DNA topoisomerase I
hp2018_0127    -2       11        1.112868    Radical SAM domain protein
hp2018_0128    -1       12        1.367931    hypothetical protein
hp2018_0129    0        12        1.647321    Phosphoenolpyruvate synthase
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score   E-value   Comments
hp2018_01251   FLAGELLIN   185     7e-56     Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 185 bits (471), Expect = 7e-56
Identities = 85/401 (21%), Positives = 152/401 (37%), Gaps = 18/401 (4%)

Query: 2 LEELDNIANTTSFNGQQMLSGSFSNKEFQIGAYSNTTVKASIGSTSSDKIGHVRMETSSF 61
LEE+D ++N T FNG ++LS + Q+GA T+ + +G +
Sbjct: 119 LEEIDRVSNQTQFNGVKVLSQDN-QMKIQVGANDGETITIDLQKIDVKSLGLDGFNVNG- 176

Query: 62 SGEGMLASAAAQNLTEVGLNFKQVNGVNDYKIETVRISTSAGTGIGALSEIINRFSNTLG 121
+ ++ +FK V G + Y + + +G + +
Sbjct: 177 --------PKEATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVY 228

Query: 122 VRASYNVMATG----GTPVQSGTVRELTINGVEIGTVNDVHKNDADGRLTNAINSVKDRT 177
V A+ + T T V + T E + K +G T V
Sbjct: 229 VNAANGQLTTDDAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGD-TFDYKGVTFTI 287

Query: 178 GVEASMDIQGRINLHSIDGRAISVHAASASGQVFGGGNFAGISGTQHAVIGRLTLTRTDA 237
+ D G+++ +I+G +++ A + S D
Sbjct: 288 DTKTGNDGNGKVST-TINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDD 346

Query: 238 RDIIVSGVNFSHVGFHSAQGVAEYTVNLRAVRGIFDANVASAAGANANGAQAETNSQGIG 297
+ S ++ +G ++ TVN + + AG + + +
Sbjct: 347 KTKNESAKLSDLEANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFIDKTASGVSTLI 406

Query: 298 AG--VTSLKGAMIVMDMADSARTQLDKIRSDMGSVQMELVTTINNISVTQVNVKAAESQI 355
+ K + DSA +++D +RS +G++Q + I N+ T N+ +A S+I
Sbjct: 407 NEDAAAAKKSTANPLASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARSRI 466

Query: 356 RDVDFAEESANFSKYNILAQSGSFAMAQANAVQQNVLRLLQ 396
D D+A E +N SK IL Q+G+ +AQAN V QNVL LL+
Sbjct: 467 EDADYATEVSNMSKAQILQQAGTSVLAQANQVPQNVLSLLR 507


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score   E-value   Comments
hp2018_01252   FLAGELLIN   92      3e-25     Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 91.6 bits (227), Expect = 3e-25
Identities = 40/97 (41%), Positives = 59/97 (60%)

Query: 2 SFRINTNIAALTSHAVGVQNNRDLSSSLEKLSSGLRINKAADDSSGMAIADSLRSQSANL 61
+ INTN +L + ++ LSS++E+LSSGLRIN A DD++G AIA+ S L
Sbjct: 1 AQVINTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGL 60

Query: 62 GQAIRNANDAIGMVQTADKAMDEQIKILDTIKTKAVK 98
QA RNAND I + QT + A++E L ++ +V+
Sbjct: 61 TQASRNANDGISIAQTTEGALNEINNNLQRVRELSVQ 97


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits           Score   E-value   Comments
hp2018_0126    FbpA_PF05833   30      0.045     Fibronectin-binding protein
		>FbpA_PF05833#Fibronectin-binding protein

Length = 577

Score = 29.8 bits (67), Expect = 0.045
Identities = 14/29 (48%), Positives = 17/29 (58%)

Query: 225 QEIKNELEKESYIISSIVKKSKKSPTPPP 253
+EIK EL + YI + KSKKS T P
Sbjct: 431 EEIKKELIETGYIKFKKIYKSKKSKTSKP 459


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits           Score   E-value   Comments
hp2018_0129    PHPHTRNFRASE   293     9e-92     Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase signature.

Length = 572

Score = 293 bits (752), Expect = 9e-92
Identities = 104/441 (23%), Positives = 184/441 (41%), Gaps = 67/441 (15%)

Query: 388 DLEHMNSFKEGEILVTDN-TDPDWEPCMKK-ASAVITNRGGRTCHAAIVAREIGVPAIVG 445
+ + + E +++ ++ T D K+ T+ GGRT H+AI++R + +PA+VG
Sbjct: 146 ETGSLATIAEETVIIAEDLTPSDTAQLNKQFVKGFATDIGGRTSHSAIMSRSLEIPAVVG 205

Query: 446 VSGATDSLYTGMEITVSCAEGE---------EGYVYAGIYEHEIERVELSNMQETQT--- 493
T+ + G + V EG E ++ E + + +
Sbjct: 206 TKEVTEKIQHGDMVIVDGIEGIVIVNPTEEEVKAYEEKRAAFEKQKQEWAKLVGEPSTTK 265

Query: 494 -----KIYINIGNPEKAFSFSQLPNHGVGLARMEMIILNQIKAHPLALVDLHHKKSVKEK 548
++ NIG P+ G+GL R E + +++ + P
Sbjct: 266 DGAHVELAANIGTPKDVDGVLANGGEGIGLYRTEFLYMDRDQ-LPTE------------- 311

Query: 549 NEIENLMAGYANPKDFFVKKIAEGIGMISAAFYPKPVIVRTSDFKSNEYMRMLGGSSYEP 608
E Y K++ + KPV++RT D ++ + L P
Sbjct: 312 ---EEQFEAY--------KEVVQ-------RMDGKPVVIRTLDIGGDKELSYL----QLP 349

Query: 609 NEENPMLGYRGASRYYSESYNEAFSWECEALALVREEMGLTNMKVMIPFLRTIEEGKKVL 668
E NP LG+R + F + AL N+KVM P + T+EE ++
Sbjct: 350 KELNPFLGFRAIRLCLE--KQDIFRTQLRALL---RASTYGNLKVMFPMIATLEELRQAK 404

Query: 669 EILRKNNLESGKNG------LEIYIMCELPVNVILADDFLSLFDGFSIGSNDLTQLTLGV 722
I+++ + G +E+ IM E+P + A+ F D FSIG+NDL Q T+
Sbjct: 405 AIMQEEKDKLLSEGVDVSDSIEVGIMVEIPSTAVAANLFAKEVDFFSIGTNDLIQYTMAA 464

Query: 723 DRDSELVSHVFDERNEAMLKMFKKAIEACKRHNKYCGICGQAPSDYPEVTEFLVKEGITS 782
DR +E VS+++ + A+L++ I+A K+ G+CG+ D L+ G+
Sbjct: 465 DRMNERVSYLYQPYHPAILRLVDMVIKAAHSEGKWVGMCGEMAGD-EVAIPLLLGLGLDE 523

Query: 783 ISLNPDSVIPTWNAVAKLEKE 803
S++ S++P + + KL KE
Sbjct: 524 FSMSATSILPARSQLLKLSKE 544


22   hp2018_0350   hp2018_0363   N   Y   N   Pathogenicity Island (unbiased-composition)
LocusTag       DNBias   CDNBias   %GCBias     Product
hp2018_0350    -1       10        0.232417    CTP synthase
hp2018_0351    -1       10        0.555264    hypothetical protein
hp2018_0352    -2       9         1.193134    hypothetical protein
hp2018_0353    -2       9         0.946286    Flagellar M-ring protein
hp2018_0354    -2       10        1.199430    Flagellar motor switch protein
hp2018_0355    -2       11        0.932337    Flagellar assembly protein
hp2018_0356    -2       10        1.844206    1-deoxy-D-xylulose 5-phosphate synthase
hp2018_0357    0        10        1.205675    Translation elongation factor
hp2018_0358    1        14        0.395685    hypothetical protein
hp2018_0359    0        13        -0.664331   hypothetical protein
hp2018_0360    -1       12        0.264347    hypothetical protein
hp2018_0361    0        11        -0.031772   Flagellar basal-body rod protein
hp2018_0362    1        11        -0.553019   Alpha-ketoglutarate permease
hp2018_0363    0        12        -0.790443   Cell division protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits           Score   E-value   Comments
hp2018_0350    ACETATEKNASE   29      0.047     Acetate kinase family signature.
		>ACETATEKNASE#Acetate kinase family signature.

Length = 400

Score = 29.0 bits (65), Expect = 0.047
Identities = 14/38 (36%), Positives = 18/38 (47%), Gaps = 5/38 (13%)

Query: 301 LEGVDAILVPGGFGERGIEGKICAIQRARLEKLPFLGI 338
+ GVD I+ G GE G I+ L+ L FLG
Sbjct: 320 MGGVDVIVFTAGIGENG-----PEIREFILDGLEFLGF 352


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits           Score   E-value   Comments
hp2018_0353    FLGMRINGFLIF   551     0.0       Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 551 bits (1421), Expect = 0.0
Identities = 177/582 (30%), Positives = 290/582 (49%), Gaps = 66/582 (11%)

Query: 11 VDFFIKLNKKQKIALIAAGVLITALLVFLLLYPFKEKDYAQGGYGVLFERLDSSDNALIL 70
+++ +L +I LI AG A++V ++L+ K DY LF L D I+
Sbjct: 13 LEWLNRLRANPRIPLIVAGSAAVAIVVAMVLWA-KTPDYR-----TLFSNLSDQDGGAIV 66

Query: 71 QHLQQNQIPYKVLKDD-TILVPKDKVYEERITLASQGIPKTSKVGFEIFDTKDFGATDFD 129
L Q IPY+ I VP DKV+E R+ LA QG+PK VGFE+ D + FG + F
Sbjct: 67 AQLTQMNIPYRFANGSGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQFS 126

Query: 130 QNIKLIRAIEGELSRTIESLNPILKANVHIAIPKDSVFVAKEVPPSASVMLKLKPDMKLS 189
+ + RA+EGEL+RTIE+L P+ A VH+A+PK S+FV ++ PSASV + L+P L
Sbjct: 127 EQVNYQRALEGELARTIETLGPVKSARVHLAMPKPSLFVREQKSPSASVTVTLEPGRALD 186

Query: 190 PTQILGIKNLIAAAVPKLTIENVKIVNENGESIGEGDILENSKELALEQLRYKQNFENIL 249
QI + +L+++AV L NV +V+++G + + + + ++L QL++ + E+ +
Sbjct: 187 EGQISAVVHLVSSAVAGLPPGNVTLVDQSGHLLTQSNT--SGRDLNDAQLKFANDVESRI 244

Query: 250 ENKIVNILAPIVGGKNKVVARVNAEFDFSQKKSTKETFDPNN-----VVRSEQNLEEKKE 304
+ +I IL+PIVG N V A+V A+ DF+ K+ T+E + PN +RS Q ++
Sbjct: 245 QRRIEAILSPIVGNGN-VHAQVTAQLDFANKEQTEEHYSPNGDASKATLRSRQLNISEQV 303

Query: 305 GAPKKQVGGVPGVVSN-IGPVQGLKDNKEPEKYEKSQN---------------------- 341
GA GGVPG +SN P P + +QN
Sbjct: 304 GAGYP--GGVPGALSNQPAPPNEAPIATPPTNQQNAQNTPQTSTSTNSNSAGPRSTQRNE 361

Query: 342 TTNYEVGKTISEIKGEFGTLVRLNAAVVVDGKYKIAFKDGANALEYEPLSDESLKKINAL 401
T+NYEV +TI K G + RL+ AVVV+ K K PL+ + +K+I L
Sbjct: 362 TSNYEVDRTIRHTKMNVGDIERLSVAVVVNYKTLADGKP-------LPLTADQMKQIEDL 414

Query: 402 VKQAIGYNQNRGDDVAVSNFEFNPMAPMIDNATFSEKIMHKTQKILGSFTPLIKYILVFI 461
++A+G++ RGD + V N F+ + E + Q + +++LV +
Sbjct: 415 TREAMGFSDKRGDTLNVVNSPFSAVDNTGG-----ELPFWQQQSFIDQLLAAGRWLLVLV 469

Query: 462 VLFIFYKKVIVPFSERMLEVVPDEDKEVKSMFEEMDEEEDELNKLGDLRKKVEDQLGLNA 521
V +I ++K + P R +E ++ + E + E L+K L+++ +Q
Sbjct: 470 VAWILWRKAVRPQLTRRVEEAKAAQEQAQVRQETEEAVEVRLSKDEQLQQRRANQ----- 524

Query: 522 SFSEEEVRYEIILEKIRGTLKERPDEIATLFKLLIKDEISSD 563
+ E++ ++IR E D + L+I+ +S+D
Sbjct: 525 -----RLGAEVMSQRIR----EMSDNDPRVVALVIRQWMSND 557


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits           Score   E-value   Comments
hp2018_0354    FLGMOTORFLIG   348     e-122     Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 348 bits (895), Expect = e-122
Identities = 121/338 (35%), Positives = 208/338 (61%), Gaps = 4/338 (1%)

Query: 8 KQKAQLDELSMSEKIAILLIQVGEDTTGEILRHLDIDSITEISKQIVQLNGTDKQIGAAV 67
K+ + L+ +K AILL+ +G + + ++ ++L + I ++ +I +L ++ V
Sbjct: 7 KEILDVSALTGKQKAAILLVSIGSEISSKVFKYLSQEEIESLTFEIAKLETITSELKDNV 66

Query: 68 LEEFFAIFQSNQYINTGGLEYARELLTRTLGSEEARKVMDKLTKSLQTQKNFAYLGKIKP 127
L EF + + ++I GG++YARELL ++LG+++A +++ L +LQ+ + F ++ + P
Sbjct: 67 LLEFKELMMAQEFIQKGGIDYARELLEKSLGTQKAVDIINNLGSALQS-RPFEFVRRADP 125

Query: 128 QQLADFIINEHPQTIALILAHMEAPNAAETLSYFPDEMKAEISIRMANLGEISPQVVKRV 187
+ +FI EHPQTIALIL++++ A+ LS P E++ ++ R+A + SP+VV+ V
Sbjct: 126 ANILNFIQQEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEVVREV 185

Query: 188 STVLENKLESLTSYK-IEVGGLRAVAEIFNRLGQKSAKTTLARIESVDNKLAGAIKEMMF 246
VLE KL SL+S GG+ V EI N +K+ K + +E D +LA IK+ MF
Sbjct: 186 ERVLEKKLASLSSEDYTSAGGVDNVVEIINMADRKTEKFIIESLEEEDPELAEEIKKKMF 245

Query: 247 TFEDIAKLDNFAIREILKVADKKDLSLALKTSTQDLTDKFLNNMSSRAAEQFVEEMQYLG 306
FEDI LD+ +I+ +L+ D ++L+ ALK+ + +K NMS RAA E+M++LG
Sbjct: 246 VFEDIVLLDDRSIQRVLREIDGQELAKALKSVDIPVQEKIFKNMSKRAASMLKEDMEFLG 305

Query: 307 AVKIKDVDVAQRKIIEIVQSLQEKG--VIQTGEEEDVI 342
+ KDV+ +Q+KI+ +++ L+E+G VI G EEDV+
Sbjct: 306 PTRRKDVEESQQKIVSLIRKLEEQGEIVISRGGEEDVL 343



Score = 31.3 bits (71), Expect = 0.006
Identities = 20/103 (19%), Positives = 41/103 (39%), Gaps = 3/103 (2%)

Query: 4 KLTPKQKAQLDELSMSEKIAILLIQVGEDTTGEILRHLDIDSITEISKQIVQLNGTDKQI 63
+ P + + IA++L + IL L + T ++++I ++ T ++
Sbjct: 122 RADPANILNFIQQEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEV 181

Query: 64 GAA---VLEEFFAIFQSNQYINTGGLEYARELLTRTLGSEEAR 103
VLE+ A S Y + GG++ E++ E
Sbjct: 182 VREVERVLEKKLASLSSEDYTSAGGVDNVVEIINMADRKTEKF 224


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score   E-value   Comments
hp2018_0357    TCRTETOQM   113     2e-28     Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 113 bits (284), Expect = 2e-28
Identities = 53/162 (32%), Positives = 87/162 (53%), Gaps = 7/162 (4%)

Query: 3 NIRNFSIIAHIDHGKSTLADCLIAECNAIS---NREMTSQVMDTMDIEKERGITIKAQSV 59
I N ++AH+D GK+TL + L+ AI+ + + + D +E++RGITI+
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 60 RLNYTLKGEDYVLNLIDTPGHVDFSYEVSRSLCSCEGALLVVDATQGVEAQTIANTYIAL 119
+ E+ +N+IDTPGH+DF EV RSL +GA+L++ A GV+AQT +
Sbjct: 62 SFQW----ENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALR 117

Query: 120 DNNLEILPVINKIDLPNANVLEVKQDIEDTIGIDCSSANEVS 161
+ + INKID ++ V QDI++ + + +V
Sbjct: 118 KMGIPTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVE 159



Score = 83.8 bits (207), Expect = 5e-19
Identities = 50/215 (23%), Positives = 90/215 (41%), Gaps = 17/215 (7%)

Query: 161 SAKAKLGIKDLLEKIITTIPAPSGDPNAPLKALIYDSWFDNYLGALALVRIMDGSINTEQ 220
SAK +GI +L+E I + + + L ++ + LA +R+ G ++
Sbjct: 220 SAKNNIGIDNLIEVITNKFYSSTHRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRD 279

Query: 221 EILVMGTGKKHGVLGLYYPNPLKKIPTKSLECGEIGIV---SLGLKSVTDIAVGDTLTDA 277
+ + K + +Y + GEI I+ L L SV +GDT
Sbjct: 280 SVRISEKEKI-KITEMYTSINGELCKIDKAYSGEIVILQNEFLKLNSV----LGDTKLL- 333

Query: 278 KNPTSKPIEGFMPAKPFVFAGLYPIETDRFEDLREALLKLQLNDCALNFEPESSVALGFG 337
P + IE P + + P + + E L +ALL++ +D L + +S+
Sbjct: 334 --PQRERIEN---PLPLLQTTVEPSKPQQREMLLDALLEISDSDPLLRYYVDSATH---E 385

Query: 338 FRVGFLGLLHMEVIKERLEREFGLNLIATAPTVVY 372
+ FLG + MEV L+ ++ + + PTV+Y
Sbjct: 386 IILSFLGKVQMEVTCALLQEKYHVEIEIKEPTVIY 420



Score = 31.0 bits (70), Expect = 0.015
Identities = 15/75 (20%), Positives = 28/75 (37%), Gaps = 2/75 (2%)

Query: 399 IKEPFVRATIITPSEFLGNLMQLLNNKRGIQEKMEYLNQSRVMLTYSLPSNEIVMDFYDK 458
+ EP++ I P E+L + L + V+L+ +P+ I ++
Sbjct: 535 LLEPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQ-LKNNEVILSGEIPARCI-QEYRSD 592

Query: 459 LKSCTKGYASFDYEP 473
L T G + E
Sbjct: 593 LTFFTNGRSVCLTEL 607


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits         Score   E-value   Comments
hp2018_0361    FLGHOOKAP1   30      0.010     Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 29.9 bits (67), Expect = 0.010
Identities = 9/40 (22%), Positives = 16/40 (40%)

Query: 3 NGYYAATGAMATQFNRLDLTSNNLANLNTNGFKRDDAITG 42
+ A + L+ SNN+++ N G+ R I
Sbjct: 2 SLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMA 41


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits      Score   E-value   Comments
hp2018_0362    TCRTETB   39      2e-05     Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 39.5 bits (92), Expect = 2e-05
Identities = 42/182 (23%), Positives = 71/182 (39%), Gaps = 33/182 (18%)

Query: 37 APYFAKEFTHTNDPTLALISAFLVFMLGFFMRPLGSLFFGKLGDKKGRKTSMVYSIILMA 96
P A +F T + +AF++ G+ +GKL D+ G K +++ II+
Sbjct: 37 LPDIANDFNKPPASTNWVNTAFMLTFSI------GTAVYGKLSDQLGIKRLLLFGIIINC 90

Query: 97 LGSFMLALLPTKEIVGEWAFLFLLLARLLQGFSVGGE------YGVVATYLSELGKNGKK 150
GS + VG F L++AR +QG G VVA Y+ + +
Sbjct: 91 FGSVIGF-------VGHSFFSLLIMARFIQG--AGAAAFPALVMVVVARYIPKENRGKAF 141

Query: 151 GFYGSFQYVTLVGGQLLAIFSLFIVENIYTHEQISAFAWRYLFALGGILALLSLFLRNIM 210
G GS + +G + I I+ W YL + I + FL ++
Sbjct: 142 GLIGS---IVAMGEGVGPAIGGMIAHYIH---------WSYLLLIPMITIITVPFLMKLL 189

Query: 211 EE 212
++
Sbjct: 190 KK 191


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score   E-value   Comments
hp2018_0363    IGASERPTASE   35      0.001     IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 35.0 bits (80), Expect = 0.001
Identities = 31/172 (18%), Positives = 59/172 (34%), Gaps = 4/172 (2%)

Query: 198 KENPIDESHKPPNEESFLAIPTPYNTTLNDSEPQEGLVQISPHPPTHYTIYPKKNRFNDL 257
N + + P E+ + T TT N+ + V + P
Sbjct: 973 NVNGRYDLYNPEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPAT 1032

Query: 258 TNPTNPT--LEPQQETKEREPTLKKETPTTL--KPIMPISAPNTENDNKTENHKTPNHPI 313
+ T T +QE+K E + T TT + + + N + + +T
Sbjct: 1033 PSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSET 1092

Query: 314 KKDDLQENAQEENIEEKENLKEEKRETQNAPNFSPLTPTSAKKPVMVKELSE 365
K+ E + +E++E K E +TQ P + ++ V+ +E
Sbjct: 1093 KETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAE 1144


23   hp2018_0589   hp2018_0602   N   Y   N   Pathogenicity Island (unbiased-composition)
LocusTag       DNBias   CDNBias   %GCBias     Product
hp2018_0589    -3       12        -1.228003   methyl-accepting chemotaxis protein
hp2018_0590    -2       11        -1.165927   multidrug resistance protein
hp2018_0591    -2       10        0.431263    Flagellin
hp2018_0592    -3       10        0.685186    Endonuclease III
hp2018_0593    -1       11        0.972699    hypothetical protein
hp2018_0594    0        12        0.713983    Uroporphyrinogen III decarboxylase
hp2018_0595    2        10        0.290114    putative outer membrane component of multidrug
hp2018_0596    2        10        0.245155    membrane fusion protein
hp2018_0597    2        9         0.121147    Acriflavin resistance protein / Multidrug efflux
hp2018_0598    3        10        -0.622194   hypothetical protein
hp2018_05991   1        9         -0.388420   putative vacuolating cytotoxin protein
hp2018_05992   0        9         -0.228142   putative vacuolating cytotoxin protein
hp2018_0600    -2       10        -0.120821   hypothetical protein
hp2018_0601    -1       11        0.297592    hypothetical protein
hp2018_0602    -2       10        -0.279412   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits         Score   E-value   Comments
hp2018_0589    OMS28PORIN   30      0.014     OMS28 porin signature.
		>OMS28PORIN#OMS28 porin signature.

Length = 257

Score = 30.1 bits (67), Expect = 0.014
Identities = 26/102 (25%), Positives = 49/102 (48%), Gaps = 2/102 (1%)

Query: 143 NAAKNGEEHSNEGLITVNKTGQDIESLYEKMQNATSLADSLNQRS--NEITQVISLIDDI 200
N + ++ N+ L T+NK +D+ S E ++ ++ N + +SL+ D+
Sbjct: 47 NKKLDQKDQVNQALDTINKVTEDVSSKLEGVRESSLELVESNDAGVVKKFVGSMSLMSDV 106

Query: 201 AEQTNLLALNAAIEAARAGEHGRGFAVVADEVRKLAEKTQKA 242
A+ T + + A I A +G G V + +K ++TQKA
Sbjct: 107 AKGTVVASQEATIVAKCSGMVAEGANKVVEMSKKAVQETQKA 148


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score   E-value   Comments
hp2018_0591    FLAGELLIN   244     6e-77     Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 244 bits (624), Expect = 6e-77
Identities = 126/518 (24%), Positives = 209/518 (40%), Gaps = 22/518 (4%)

Query: 2 AFQVNTNINAMNAHVQSALTQNALKTSLERLSSGLRINKAADDASGMTVADSLRSQASSL 61
A +NTN ++ +Q++L +++ERLSSGLRIN A DDA+G +A+ S L
Sbjct: 1 AQVINTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGL 60

Query: 62 GQAIANTNDGMGIIQVADKAMDEQLKILDTVKVKATQAAQDGQTTESRKAIQSDIVRLIQ 121
QA N NDG+ I Q + A++E L V+ + QA + K+IQ +I + ++
Sbjct: 61 TQASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLE 120

Query: 122 GLDNIGNTTTYNGQALLSGQFTNKEFQVGAYSNQSIKASIGSTTSDKIGQVRI-ATGALI 180
+D + N T +NG +LS + QVGA ++I + +G G
Sbjct: 121 EIDRVSNQTQFNGVKVLSQDN-QMKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPKE 179

Query: 181 TASGDISLTFKQVDGVNDVTLESVKVSSSAGTGIGVLAEVINKNSNRTGVKAYASVITTS 240
GD+ +FK V G + + + K +G V ++ V A +TT
Sbjct: 180 ATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTTD 239

Query: 241 DVAVQSGSLSNLTLNGIHLGNIADIKKNDSDGRLVAAINAVTSETGVEAYTDQKGRLNLR 300
D N + K A A+ + + + +
Sbjct: 240 DAE-----------NNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTID 288

Query: 301 SIDGRGIEIKTDSVSNGPSALTMVNGGQDLTKGSTNYGRLSLTRLDAKSINV------VS 354
+ G K + NG V S + +N +
Sbjct: 289 TKTGNDGNGKVSTTINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKT 348

Query: 355 ASDSQHLGFTAIGFGESQVAETTVNLRDVTGNFNANVKSASGANYNAVIASGNQSL---G 411
++S L ++ TVN + T N + + +G + S
Sbjct: 349 KNESAKLSDLEANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFIDKTASGVSTLINE 408

Query: 412 SGVTTLRGAMVVIDIAESAMKMLDKVRSDLGSVQNQMISTVNNISITQVNVKAAESQIRD 471
+ + +SA+ +D VRS LG++QN+ S + N+ T N+ +A S+I D
Sbjct: 409 DAAAAKKSTANPLASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARSRIED 468

Query: 472 VDFAEESANFNKNNILAQSGSYAMSQANTVQQNILRLL 509
D+A E +N +K IL Q+G+ ++QAN V QN+L LL
Sbjct: 469 ADYATEVSNMSKAQILQQAGTSVLAQANQVPQNVLSLL 506


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score   E-value   Comments
hp2018_0595    RTXTOXIND   30      0.020     Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 30.2 bits (68), Expect = 0.020
Identities = 22/167 (13%), Positives = 61/167 (36%), Gaps = 18/167 (10%)

Query: 151 AQVKLNVFNGFSDVNNVKEKSAT--YRSNVATLEYSRQSIFLQVVQQYYEYFNNLARMIA 208
++KL F +V+ + T + +T + + L + ++ E LAR+
Sbjct: 166 PELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINR 225

Query: 209 LQKKLEQIQTDIKRVTKLYDKGLTTIDDL-----QSLKAQGNLSEY--DILDMQFALEQN 261
+ ++ + + L K + + ++A L Y + ++ +
Sbjct: 226 YENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSA 285

Query: 262 RLTLEYLTNLNVKNLKKTTIDVPNLQLRE-RKDLVSLREQISALKYQ 307
+ + +T K +D +LR+ ++ L +++ + +
Sbjct: 286 KEEYQLVTQ----LFKNEILD----KLRQTTDNIGLLTLELAKNEER 324


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score   E-value   Comments
hp2018_0596    RTXTOXIND   49      4e-09     Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 49.4 bits (118), Expect = 4e-09
Identities = 22/69 (31%), Positives = 34/69 (49%)

Query: 40 STGIVDSIKVTEGSVVKKGDVLLLLYNQDKQAQSDSTEQQLIFAKKQYQRYNKIGGAVDK 99
IV I V EG V+KGDVLL L +A + T+ L+ A+ + RY + +++
Sbjct: 103 ENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIEL 162

Query: 100 NTLEGYEFT 108
N L +
Sbjct: 163 NKLPELKLP 171



Score = 30.2 bits (68), Expect = 0.008
Identities = 21/150 (14%), Positives = 51/150 (34%), Gaps = 21/150 (14%)

Query: 70 QAQSDSTEQQLIFAKKQYQRYNKIGGAVDKNTLEGYEFTYRRLESDYAYSIAVLNKTILR 129
+++ S +++ + ++ +I + + T T +++ +++R
Sbjct: 279 ESEILSAKEEYQLVTQLFKN--EILDKLRQTTDNIGLLTLELAKNE-----ERQQASVIR 331

Query: 130 APFDGVIASKNIQVGEGVSANNTVLLRLVSHARKLVIE--FDSKYINAVKVG-------D 180
AP + + GV L+ +V L + +K I + VG +
Sbjct: 332 APVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVE 391

Query: 181 TYTYSIDGDSNQHEAKITKIYP--TVDENT 208
+ Y+ G K+ I D+
Sbjct: 392 AFPYTRYGYL---VGKVKNINLDAIEDQRL 418


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits           Score   E-value   Comments
hp2018_0597    ACRIFLAVINRP   898     0.0       Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 898 bits (2323), Expect = 0.0
Identities = 286/1040 (27%), Positives = 518/1040 (49%), Gaps = 42/1040 (4%)

Query: 1 MYKTAINRPITTLMFALAIVFFGVMGFKKLSVALFPKIDLPTVVVTTTYPGASAEIIESK 60
M I RPI + A+ ++ G + +L VA +P I P V V+ YPGA A+ ++
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VTDKIEEAVMGIDGIKKVTSTSSKNVSIVV-IEFELEKPNEEALNDVMNKISSVR-FDDS 118
VT IE+ + GID + ++STS S+ + + F+ + A V NK+
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 119 NIKKPSINKFDTDSQPIISLFVSSSSVPAT--TLNDYAKNTIKPMLQKINGVGGVQLNGF 176
+++ I+ + S ++ S + T ++DY + +K L ++NGVG VQL G
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFG- 179

Query: 177 RERQIRIYADPTLMNKYNLTYADLFSTLKAENVEIDGGRIVNS------QRELSILINAN 230
+ +RI+ D L+NKY LT D+ + LK +N +I G++ + Q SI+
Sbjct: 180 AQYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 231 SYSVADVEKIQV-----GNHVRLGDIAKIEIGLEEDNTFASFKDKPGVILEIQKIAGANE 285
+ + K+ + G+ VRL D+A++E+G E N A KP L I+ GAN
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANA 299

Query: 286 IEIVDRVYEALKRIQAISP-NYEIRPFLDTTSYIRTSIEDVKFDLILGAILAVLVVFAFL 344
++ + L +Q P ++ DTT +++ SI +V L +L LV++ FL
Sbjct: 300 LDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFL 359

Query: 345 RNGTITLVSAISIPISIMGTFALIQWMGFSLNMLTMVALTLAIGIIIDDAIVVIENIHK- 403
+N TL+ I++P+ ++GTFA++ G+S+N LTM + LAIG+++DDAIVV+EN+ +
Sbjct: 360 QNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERV 419

Query: 404 KLEMGMSKRKASYEGVREIGFALVAISAMLLSVFVPIGNMKGIIGRFFQSFGITVALAIA 463
+E + ++A+ + + +I ALV I+ +L +VF+P+ G G ++ F IT+ A+A
Sbjct: 420 MMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMA 479

Query: 464 LSYVVVVTIIPMVSSVVVNPRHS-------RFYVWSEPFFKALESRYTRLLQWVLNHKLI 516
LS +V + + P + + ++ P + F+ W F + YT + +L
Sbjct: 480 LSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 517 IFITVVLVFVGSLFVASKLGMDFMLKEDRGRFLVWLKAKPGVSIDY----MTQKSKIFQK 572
+ L+ G + + +L F+ +ED+G FL ++ G + + + Q + + K
Sbjct: 540 YLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLK 599

Query: 573 AIEKHDEVEFTTLQVGY-GTSQNPFKAKIFVQLKPLKERKKEHELGQFELMSALRKELRS 631
+ + E FT + G +QN FV LKP +ER E ++ + EL
Sbjct: 600 NEKANVESVFTVNGFSFSGQAQNA--GMAFVSLKPWEERNG-DENSAEAVIHRAKMELGK 656

Query: 632 LPEAKDLENINLSEVSLIGGGGDSSPFQTFVFSHSQEAVDKSVENLRKFLLESPELKGKV 691
+ + + N+ + G ++ F + + D + + L + + +
Sbjct: 657 IRDGF-VIPFNMPAIV---ELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASL 712

Query: 692 EGYHTSTSESQPQLQLKILRQNANKYGVSAQTIGSVVSSAFSGTSQASVFKEDGKEYDMI 751
+ E Q +L++ ++ A GVS I +S+A G + + F + G+ +
Sbjct: 713 VSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGG-TYVNDFIDRGRVKKLY 771

Query: 752 IRVPDDKRVSVEDIKRLQVRNKYDKLMFLDALVEITETKSPSSISRYNRQRSVTVLAQPN 811
++ R+ ED+ +L VR+ +++ A + RYN S+ + +
Sbjct: 772 VQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEA- 830

Query: 812 RNAGVSLGEILTQVSKNTKEWLVEGANYRFTGEADNAKESNGEFLVALATAFVLIYMILA 871
G S G+ + + +N L G Y +TG + + S + +A +FV++++ LA
Sbjct: 831 -APGTSSGDAMALM-ENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLA 888

Query: 872 ALYESILEPFIIMVTMPLSFSGAFFALGLVHQPLSMFSMIGLILLIGMVGKNATLLIDVA 931
ALYES P +M+ +PL G A L +Q ++ M+GL+ IG+ KNA L+++ A
Sbjct: 889 ALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFA 948

Query: 932 NE-ERKKGLNIQEAILFAGKTRLRPILMTTIAMVCGMLPLALASGDGAAMKSPIGIAMSG 990
+ K+G + EA L A + RLRPILMT++A + G+LPLA+++G G+ ++ +GI + G
Sbjct: 949 KDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMG 1008

Query: 991 GLMISMVLSLLIVPVFYRLL 1010
G++ + +L++ VPVF+ ++
Sbjct: 1009 GMVSATLLAIFFVPVFFVVI 1028


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits           Score   E-value   Comments
hp2018_05992   VACCYTOTOXIN   273     3e-76     Helicobacter pylori vacuolating cytotoxin signature.
		>VACCYTOTOXIN#Helicobacter pylori vacuolating cytotoxin signature.

Length = 1291

Score = 273 bits (700), Expect = 3e-76
Identities = 105/397 (26%), Positives = 180/397 (45%), Gaps = 14/397 (3%)

Query: 2358 AGNNSIMWLNELFAAKGGNPLFAPYYLQDNPTEHIVTLMKDIASALGMLSNSNLKNNSTD 2417
+G L L + +A + I + + L +++ K +
Sbjct: 904 SGAQGRDLLQTLLI-DSHDAGYARTMIDATSANEITKQLNTATTTLNNIASLEHKTSGLQ 962

Query: 2418 VLQLNTYTQQMSRLAKLSNFASFDSTDFSERLSSLKNQRFADATPNAMDVILKYSQRDKL 2477
L L+ SRL LS + F++RL +LK+QRFA +A +V+ +++ + +
Sbjct: 963 TLSLSNAMILNSRLVNLSRRHTNHIDSFAKRLQALKDQRFAS-LESAAEVLYQFAPKYEK 1021

Query: 2478 KNNLWATGVGGVSFVENGTGTLYGVNVGYDRFVRG---VIVGGYAAYGYSGFYER--ITN 2532
N+WA +GG S G +LYG + G D ++ G IVGG+ +YGYS F + N
Sbjct: 1022 PTNVWANAIGGTSLNSGGNASLYGTSAGVDAYLNGEVEAIVGGFGSYGYSSFSNQANSLN 1081

Query: 2533 SKSDNVDVGLYARAFIKKSELTFSVNETWGANKTQISSNDTLLSMINQSYKYSTWTTNAK 2592
S ++N + G+Y+R F + E F G++++ ++ LL +NQSY Y ++ +
Sbjct: 1082 SGANNTNFGVYSRIFANQHEFDFEAQGALGSDQSSLNFKSALLRDLNQSYNYLAYSAATR 1141

Query: 2593 VNYGYDFMFKNKSIILKPQIGLRYYYIGMSGLEGVMNNALYNQFKANADPSKKSVLTIDF 2652
+YGYDF F +++LKP +G+ Y ++G + + + S + +
Sbjct: 1142 ASYGYDFAFFRNALVLKPSVGVSYNHLGSTNFKS----NSNQKVALKNGASSQHLFNASA 1197

Query: 2653 ALENRHYFNTNSYFYAIGGVGRDLLVNSMGDKLVRFIGNNTLSYRKGDLYNTFANITTGG 2712
+E R+Y+ SYFY GV ++ N V + R NT A + GG
Sbjct: 1198 NVEARYYYGDTSYFYMNAGVLQEFA-NFGSSNAVSLNTFKVNATRNP--LNTHARVMMGG 1254

Query: 2713 EVRLFKSFYANAGVGARFGLDYKMIDIIGNIGMRLAF 2749
E++L K + N G L + N+GMR +F
Sbjct: 1255 ELKLAKEVFLNLGFVYLHNLISNIGHFASNLGMRYSF 1291



Score = 34.2 bits (78), Expect = 0.010
Identities = 15/100 (15%), Positives = 31/100 (31%), Gaps = 5/100 (5%)

Query: 256 SYTFDGINNTFNEDKFNGGSFNFNHAEQTNAFNNNSFNGGSFSFNAKQVNFNHNSFNGGV 315
SY+ + E FN + ++A Q +N + G+ + N + G
Sbjct: 272 SYSTINTSKVTGEVNFNHLTVGDHNAAQAGIIASNKTHIGTLDLW-QSAGLNIIAPPEGG 330

Query: 316 FNF---NNTPKASFTNDTFNVNNQFKING-TQTDFTFSKG 351
+ + + N + + N TQ +
Sbjct: 331 YKDKPNDKPSNTTQNNAKNDKQESSQNNSNTQVINPPNSA 370


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score   E-value   Comments
hp2018_0602    LCRVANTIGEN   30      0.001     Low calcium response V antigen signature.
		>LCRVANTIGEN#Low calcium response V antigen signature.

Length = 326

Score = 30.4 bits (68), Expect = 0.001
Identities = 15/33 (45%), Positives = 20/33 (60%)

Query: 16 KRKRLLTELAELEAEIKVGSERRSSFNVSLSPS 48
R +L ELAEL AE+K+ S ++ N LS S
Sbjct: 149 ARSKLREELAELTAELKIYSVIQAEINKHLSSS 181


24   hp2018_0952   hp2018_0959   N   Y   N   Pathogenicity Island (unbiased-composition)
LocusTag       DNBias   CDNBias   %GCBias     Product
hp2018_0952    -1       10        0.030191    hypothetical protein
hp2018_0953    -3       10        0.489157    Cobalt-zinc-cadmium resistance protein/Cation
hp2018_0954    -2       11        -0.367905   nickel-cobalt-cadmium resistance protein
hp2018_0955    -1       11        -0.289241   hypothetical protein
hp2018_0956    -1       13        -0.502662   Glycyl-tRNA synthetase beta chain
hp2018_0957    -2       11        0.501889    hypothetical protein
hp2018_0958    -1       13        1.198074    2,3-bisphosphoglycerate-independent
hp2018_0959    -1       13        0.910327    Aspartyl-tRNA amidotransferase subunit C
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits           Score   E-value   Comments
hp2018_0952    LPSBIOSNTHSS   25      0.035     Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein signature.

Length = 166

Score = 25.2 bits (55), Expect = 0.035
Identities = 16/69 (23%), Positives = 27/69 (39%), Gaps = 12/69 (17%)

Query: 12 LKDALIDYLFEKGFDDFFYV--ECYKYAASSLLLSQKEQVSGRKDYAKFKLFLSEEVALP 69
L+ A + + F Y + +SSL+ K+ A+F + V
Sbjct: 98 LQMANTNKTLASDLETVFLTTSTEYSFLSSSLV----------KEVARFGGNVEHFVPSH 147

Query: 70 LAQALKNQF 78
+A AL +QF
Sbjct: 148 VAAALYDQF 156


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits           Score   E-value   Comments
hp2018_0953    ACRIFLAVINRP   752     0.0       Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 752 bits (1942), Expect = 0.0
Identities = 225/1044 (21%), Positives = 460/1044 (44%), Gaps = 42/1044 (4%)

Query: 6 IIEFSLRQRVIVIVGAILILFFGTYSFIHTPVDAFPDISPTQVKIILKLPGSSPEEMENN 65
+ F +R+ + V AI+++ G + + PV +P I+P V + PG+ + +++
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 66 IVRPLELELLGLKGQKSLRSVSKYSIS-DITIDFDDSVDIYLARNIVNERLSSVMKDLPM 124
+ + +E + G+ + S S + S IT+ F D +A+ V +L LP
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 125 GVEGGMAPIVTPLSDIFMF----TIDGNITEIEKRQLLDFVIRPQLRMISGVADVNSIGG 180
V+ + S M + + T+ + + ++ L ++GV DV G
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 181 FSKAFVIVPDFNDMARLGVSISDLESAVRVNLRNSGAGRVDR----DGETFLVKI--QTA 234
A I D + + + ++ D+ + ++V AG++ G+ I QT
Sbjct: 181 -QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 235 SLSLEDIGKITV--STNLGHLHIKDFAKVISQSRTRLGFVTKDGVGETTEGLVLSLKEAN 292
+ E+ GK+T+ +++ + +KD A+V +G + AN
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGK-PAAGLGIKLATGAN 298

Query: 293 TKKIITQVYQKLEELKPLLPSGVSLNVFYDRSEFTQKAIATVSKTLIEAVVLIIITLFLF 352
+ KL EL+P P G+ + YD + F Q +I V KTL EA++L+ + ++LF
Sbjct: 299 ALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLF 358

Query: 353 LGNLRASVAVGVILPLSLSVAFIFIKLNNLTLNLMSLGGLIIAIGMLIDSAVVVVENAFE 412
L N+RA++ + +P+ L F + ++N +++ G+++AIG+L+D A+VVVEN E
Sbjct: 359 LQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENV-E 417

Query: 413 KLSANTKTTKLHAIYRSCKEIAVSVVSGVVIIIVFFVPILTLQGLEGKMFRPLAQSIVYA 472
++ K A +S +I ++V +++ F+P+ G G ++R + +IV A
Sbjct: 418 RVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSA 477

Query: 473 LLGTLVLSITIIPVVSSLVLK--ATPHSET---FLTRFLNRIYGPLLEFFVRNPKKVI-- 525
+ ++++++ + P + + +LK + H E F F N + + + + K++
Sbjct: 478 MALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWF-NTTFDHSVNHYTNSVGKILGS 536

Query: 526 ----LGAFVFLIA-SLSLFPFVGKNFMPTLDEGDVVLSVETTPSISLDQSKDLILNIESA 580
L + ++A + LF + +F+P D+G + ++ + ++++ ++ +
Sbjct: 537 TGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDY 596

Query: 581 IKKHVKEVKTIVARTGSDELGLDLGGLNQTDTFISFIPKKEWSVKTKDELL-EKIMDSLK 639
K+ K V G G Q + ++F+ K W + DE E ++ K
Sbjct: 597 YLKNEKANVESVFTVN----GFSFSGQAQ-NAGMAFVSLKPWEERNGDENSAEAVIHRAK 651

Query: 640 -DFKGINFSFTQPIEM-RISEMLTGVRGDLA-VKIFGDDISELNGLSFQIA-QALKGIKG 695
+ I F P M I E+ T D + G L Q+ A +
Sbjct: 652 MELGKIRDGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPAS 711

Query: 696 SSEVLTTLNEGVNYLYVTPNKEAMANVGITSDEFSKFLKSALEGLIVDVIPTGISRTPVM 755
V E + ++E +G++ + ++ + +AL G V+ +
Sbjct: 712 LVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLY 771

Query: 756 IRQEIDFASSITKIKSLALTSKYGVLVPITSIAKIEEVDGPVSIVREDSRRMSVVRSNVV 815
++ + F + L + S G +VP ++ V G + R + ++
Sbjct: 772 VQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAA 831

Query: 816 GRDLNSFVEEAKKVIAQNVKLPPSYYITYGGQFENQQRANKRLSTVIPLSILAIFFILFF 875
+ + +A KLP + G ++ + + ++ +S + +F L
Sbjct: 832 PGTSSGDAMALMENLA--SKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAA 889

Query: 876 TFKSIPLALLILLNIPFAVTGGLIALFAVGEYISVPASVGFIALFGIAVLNGVVMIGYFK 935
++S + + ++L +P + G L+A + V VG + G++ N ++++ + K
Sbjct: 890 LYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAK 949

Query: 936 ELLL-QGKSVEECVLLGAKRRLRPVLMTACIAGLGLIPLLFSHSVGSEVQKPLAIVVLGG 994
+L+ +GK V E L+ + RLRP+LMT+ LG++PL S+ GS Q + I V+GG
Sbjct: 950 DLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGG 1009

Query: 995 LVTSSALTLLLLPPMFMLIAKKIK 1018
+V+++ L + +P F++I + K
Sbjct: 1010 MVSATLLAIFFVPVFFVVIRRCFK 1033


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits            Score    E-value    Comments
hp2018_0954    RTXTOXIND       29       0.026      Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 29.4 bits (66), Expect = 0.026
Identities = 25/159 (15%), Positives = 59/159 (37%), Gaps = 31/159 (19%)

Query: 7 WLMLMGVFLMGVFLGAKEYPEIVLEEKNLQPMGLKVIKLDKEIFSKGLPFNAYIDFDSKS 66
L+ F+MG + A +L + +++ N + +S
Sbjct: 56 RPRLVAYFIMGFLVIA-----FIL---------SVLGQVEI-----VATANGKLTHSGRS 96

Query: 67 SVVQSLSFDASVVAVYKREGEQVKAGDAICEVSSID-------LSNLYFELQNNQNKLKI 119
++ + ++ V + +EGE V+ GD + +++++ + + + Q + +I
Sbjct: 97 KEIKPIE-NSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQI 155

Query: 120 AKDITKKDLELYKAGVIPKREYQTSFLASEEMGLKVNQL 158
+EL K + + SEE L++ L
Sbjct: 156 LSRS----IELNKLPELKLPDEPYFQNVSEEEVLRLTSL 190


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits            Score    E-value    Comments
hp2018_0959    TYPE3IMSPROT    25       0.042      Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 25.1 bits (55), Expect = 0.042
Identities = 10/36 (27%), Positives = 16/36 (44%), Gaps = 10/36 (27%)

Query: 5 DTLLQR---LEKLSM--LEIKDEHKES-----VKGH 30
D + +++L M EIK E+KE +K
Sbjct: 202 DYAFEYYQYIKELKMSKDEIKREYKEMEGSPEIKSK 237


25    hp2018_1385    hp2018_1391    N    Y    N    Pathogenicity Island (unbiased-composition)
LocusTag       DNBias    CDNBias    %GCBias      Product
hp2018_1385    0         14         0.601796     Inner membrane protein translocase component
hp2018_1386    0         11         0.631760     hypothetical protein
hp2018_1387    1         10         1.092634     tRNA modification GTPase
hp2018_1388    3         11         1.083900     Outer membrane protein
hp2018_1389    2         17         -0.107775    hypothetical protein
hp2018_1390    -1        18         0.811308     hypothetical protein
hp2018_1391    -1        13         1.551123     membrane-associated lipoprotein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits            Score    E-value    Comments
hp2018_1385    60KDINNERMP     422      e-145      60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 422 bits (1086), Expect = e-145
Identities = 159/576 (27%), Positives = 276/576 (47%), Gaps = 71/576 (12%)

Query: 10 RLILAIALSFLFIALYSYFFQKPNKP--TTETTKQETTNNHTTISPNAPNAQHFSVTQTI 67
R +L IAL F+ ++ + Q N +TT+ TT + P +
Sbjct: 5 RNLLVIALLFVSFMIWQAWEQDKNPQPQAQQTTQTTTTAAGSAADQGVPASG-------- 56

Query: 68 PQESLLSTISFEHARIEIDSLGR--IKQVYLKDKKYLTPKEKGFLEHVGHLFSSKENSQP 125
+ L ++ + + I++ G + + K L + L F + S
Sbjct: 57 --QGKLISVKTDVLDLTINTRGGDVEQALLPAYPKELNSTQPFQLLETSPQFIYQAQSGL 114

Query: 126 SLKELPLLAADKLKPLEVRFLDPTLNNKAFNTPYSASKTTLGPNEQLV--LTQDLGVLSI 183
+ ++ P A+ +PL +N A G NE V D +
Sbjct: 115 TGRDGPDNPANGPRPL-------------YNVEKDAYVLAEGQNELQVPMTYTDAAGNTF 161

Query: 184 IKTLTFYDDLHYDLKIAFKSSNN------------------LIPSYVITNGYRPVADLDS 225
KT Y + + + N L P + +
Sbjct: 162 TKTFVLKRG-DYAVNVNYNVQNAGEKPLEISSFGQLKQSITLPPHLDTGSSNFAL----- 215

Query: 226 YTFSGVLLENNDKKIEKIE---DKDAKEIKRFSNTLFLSSVDRYFTTLLFTKDPQGFEAL 282
+TF G D+K EK + D + + S +++ + +YF T + G
Sbjct: 216 HTFRGAAYSTPDEKYEKYKFDTIADNENLNISSKGGWVAMLQQYFATAWIPHN-DGTNNF 274

Query: 283 IDSEIGTKNPLGFISLKNEA-----------NLHGYIGPKDYRSLKAISPMLTDVIEYGL 331
+ +G N + I K++ N ++GP+ + A++P L ++YG
Sbjct: 275 YTANLG--NGIAAIGYKSQPVLVQPGQTGAMNSTLWVGPEIQDKMAAVAPHLDLTVDYGW 332

Query: 332 ITFFAKGVFVLLDYLYQFVGNWGWAIIFLTIIVRIVLYPLSYKGMVSMQKLKELAPKMKE 391
+ F ++ +F LL +++ FVGNWG++II +T IVR ++YPL+ SM K++ L PK++
Sbjct: 333 LWFISQPLFKLLKWIHSFVGNWGFSIIIITFIVRGIMYPLTKAQYTSMAKMRMLQPKIQA 392

Query: 392 LQEKYKGEPQKLQAHMMQLYKKHGANPLGGCLPLILQIPVFFAIYRVLYNAVELKSSEWV 451
++E+ + Q++ MM LYK NPLGGC PL++Q+P+F A+Y +L +VEL+ + +
Sbjct: 393 MRERLGDDKQRISQEMMALYKAEKVNPLGGCFPLLIQMPIFLALYYMLMGSVELRQAPFA 452

Query: 452 LWIHDLSIMDPYFILPLLMGASMYWHQSVTPNTMTDPMQAKIFKLLPLLFTIFLITFPAG 511
LWIHDLS DPY+ILP+LMG +M++ Q ++P T+TDPMQ KI +P++FT+F + FP+G
Sbjct: 453 LWIHDLSAQDPYYILPILMGVTMFFIQKMSPTTVTDPMQQKIMTFMPVIFTVFFLWFPSG 512

Query: 512 LVLYWTTNNILSVLQQLIINKVLENKKRVHAQNIKE 547
LVLY+ +N+++++QQ +I + LE K+ +H++ K+
Sbjct: 513 LVLYYIVSNLVTIIQQQLIYRGLE-KRGLHSREKKK 547


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits            Score    E-value    Comments
hp2018_1387    TCRTETOQM       34       0.001      Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 34.1 bits (78), Expect = 0.001
Identities = 33/134 (24%), Positives = 53/134 (39%), Gaps = 25/134 (18%)

Query: 216 LSIVGKPNAGKSSLLNAMLLEERA---LVSDIKGTTR-DTIEE-------------VIEL 258
+ ++ +AGK++L ++L A L S KGTTR D +
Sbjct: 6 IGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITSFQW 65

Query: 259 KGHKVRLIDTAGIRESADEIERLGIEKSLKSLENCDIILGVFDLSKPLEKEDFNLIDTLN 318
+ KV +IDT G + E+ R SL L D + + ++ + L L
Sbjct: 66 ENTKVNIIDTPGHMDFLAEVYR-----SLSVL---DGAILLISAKDGVQAQTRILFHALR 117

Query: 319 RAKKPCIVVLNKND 332
+ P I +NK D
Sbjct: 118 KMGIPTIFFINKID 131


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits            Score    E-value    Comments
hp2018_1389    BINARYTOXINB    29       0.026      Binary toxin B family signature.
		>BINARYTOXINB#Binary toxin B family signature.

Length = 764

Score = 29.3 bits (65), Expect = 0.026
Identities = 22/97 (22%), Positives = 36/97 (37%), Gaps = 6/97 (6%)

Query: 155 SKSMGDLLAKAAPIERILKAYSVPVSPLENYEKIYYQNAFKPKVRITFDNNSDTEIKNAL 214
+ + D L P + +A + E + YQ + FD + IKN L
Sbjct: 536 AVNPSDPLETTKPDMTLKEALKIAFGFNEPNGNLQYQGKDITEFDFNFDQQTSQNIKNQL 595

Query: 215 MSAYAR-VLTPSDEEKLYQ-----IKNEVFTENTNGI 245
A + T D+ KL I+++ F + N I
Sbjct: 596 AELNATNIYTVLDKIKLNAKMNILIRDKRFHYDRNNI 632


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits            Score    E-value    Comments
hp2018_1391    LIPOLPP20       293      e-105      LPP20 lipoprotein precursor signature.
		>LIPOLPP20#LPP20 lipoprotein precursor signature.

Length = 175

Score = 293 bits (752), Expect = e-105
Identities = 174/175 (99%), Positives = 175/175 (100%)

Query: 1 MKNQVKKILGMSVIAAMVIVGCSHAPKSGISKSNKAYKEATKGAPDWVVGDLEKVAKYEK 60
MKNQVKKILGMSV+AAMVIVGCSHAPKSGISKSNKAYKEATKGAPDWVVGDLEKVAKYEK
Sbjct: 1 MKNQVKKILGMSVVAAMVIVGCSHAPKSGISKSNKAYKEATKGAPDWVVGDLEKVAKYEK 60

Query: 61 YSGVFLGRAEDLITNNDVDYSTNQATAKARANLAANLKSTLQKDLENEKTRTVDASGKRS 120
YSGVFLGRAEDLITNNDVDYSTNQATAKARANLAANLKSTLQKDLENEKTRTVDASGKRS
Sbjct: 61 YSGVFLGRAEDLITNNDVDYSTNQATAKARANLAANLKSTLQKDLENEKTRTVDASGKRS 120

Query: 121 ISGTDTEKISQLVDKELIASKMLARYVGKDRVFVLVGLDKQIVDKVREELGMVKK 175
ISGTDTEKISQLVDKELIASKMLARYVGKDRVFVLVGLDKQIVDKVREELGMVKK
Sbjct: 121 ISGTDTEKISQLVDKELIASKMLARYVGKDRVFVLVGLDKQIVDKVREELGMVKK 175
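
The per-hit summary lines in the alignments above ("Score = ...", "Identities = ...") can be read programmatically. The following is a minimal sketch in Python, not part of PredictBias: the names parse_expect, parse_hit and E_VALUE_CUTOFF are hypothetical, and the "less than or equal" comparison against the 0.05 RPSBlast E-value from the input parameters is an assumption about how the threshold is applied.

```python
import re

# Hypothetical helpers for illustration only; PredictBias does not ship them.
SCORE_RE = re.compile(r"Score = ([\d.]+) bits \((\d+)\), Expect = (\S+)")
IDENT_RE = re.compile(r"Identities = (\d+)/(\d+)")

# E-value (RPSBlast) listed under "Input parameters"; using <= is an assumption.
E_VALUE_CUTOFF = 0.05


def parse_expect(text):
    """Convert a BLAST E-value string to float ("e-145" has no mantissa)."""
    return float("1" + text if text.startswith("e") else text)


def parse_hit(block):
    """Extract bit score, E-value and fractional identity from one hit."""
    score = SCORE_RE.search(block)
    ident = IDENT_RE.search(block)
    matched, aligned = int(ident.group(1)), int(ident.group(2))
    expect = parse_expect(score.group(3))
    return {
        "bits": float(score.group(1)),
        "expect": expect,
        "identity": matched / aligned,
        "passes_cutoff": expect <= E_VALUE_CUTOFF,
    }


# Summary lines copied from the BINARYTOXINB hit for hp2018_1389.
hit = parse_hit(
    "Score = 29.3 bits (65), Expect = 0.026\n"
    "Identities = 22/97 (22%), Positives = 36/97 (37%), Gaps = 6/97 (6%)"
)
print(hit)
```

Parsing the identity as a fraction rather than reading the rounded percentage keeps weak hits (such as 22/97) distinguishable from near-identical ones (such as 174/175 for LIPOLPP20).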



 
Contact Sachin Pundhir for Bugs/Comments.