PredictBias

Identification of genomic and pathogenicity islands in prokaryotic genomes
 
A) Input parameters
Genome: 908.gbk
Threshold dinucleotide bias: 2
Threshold codon bias: 4
Threshold %GC bias: 3
E-value (RPSBlast): 0.05
Genome (non-pathogenic): (blank)
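
The E-value parameter above can be read as a cutoff on the virulence-factor hits reported in section C. The following is a minimal Python sketch under that assumption; the rule "keep a hit when its E-value is at most the cutoff" and the helper name `passes_cutoff` are illustrative and are not taken from PredictBias itself. The sample hits are copied from the hit tables on this page.

```python
# E-value cutoff taken from the input parameters (E-value (RPSBlast) = 0.05).
E_VALUE_CUTOFF = 0.05

# (locus tag, signature, E-value) triples copied from the hit tables in section C.
hits = [
    ("hp908_0059", "ANTHRAXTOXNA", 0.036),
    ("hp908_0063", "GPOSANCHOR", 1e-04),
    ("hp908_0080", "UREASE", 0.0),
    ("hp908_0115", "LUXSPROTEIN", 6e-79),
]


def passes_cutoff(e_value, cutoff=E_VALUE_CUTOFF):
    """Assumed rule (hypothetical): keep a hit when its E-value is at most the cutoff."""
    return e_value <= cutoff


# Locus tags whose hits survive the assumed filter.
kept = [tag for tag, _signature, e_value in hits if passes_cutoff(e_value)]
print(kept)
```

Under this reading, every hit listed on the page passes, which is consistent with all reported E-values being below 0.05.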
 
C) Potential GIs and PAIs in CP002184
S.No  Start       End         Bias  Virulence  Insertion elements  Prediction
1     hp908_0052  hp908_0090  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
hp908_0052  -1      14       -3.170911  NiFe-hydrogenase metallocenter assembly protein
hp908_0053  -1      14       -3.418919  agmatine deiminase
hp908_0054  -2      10       -2.210678  Adenine-specific methyltransferase
hp908_0055  -1      10       -2.266360  cytosine specific DNA methyltransferase
hp908_0056  -2      10       -2.329538  hypothetical protein
hp908_0057  -1      9        -1.146998  adenine/cytosine DNA methyltransferase
hp908_0058  1       12       1.148227   Proline/sodium symporter Propionate/sodium
hp908_0059  2       14       0.348890   Proline dehydrogenase/Proline
hp908_0060  3       15       -0.435704  hypothetical protein
hp908_0061  3       15       -0.105423  hypothetical protein
hp908_0062  2       15       0.285279   hypothetical protein
hp908_0063  3       15       0.574759   hypothetical protein
hp908_0064  1       17       0.526806   hypothetical protein
hp908_0065  0       18       0.562050   hypothetical protein
hp908_0066  3       17       0.294673   hypothetical protein
hp908_0067  2       22       -0.704564  hypothetical protein
hp908_0069  1       16       -0.232944  hypothetical protein
hp908_0070  1       14       0.528661   hypothetical protein
hp908_0071  1       15       1.146760   hypothetical protein
hp908_0072  0       13       1.285525   hypothetical protein
hp908_0073  0       15       1.461351   cell division protein
hp908_0074  2       14       2.178955   cell division protein
hp908_0075  5       20       3.656047   urease accessory protein
hp908_0076  5       22       3.397708   urease accessory protein
hp908_0077  4       20       2.615266   urease accessory protein
hp908_0078  3       17       2.624505   urease accessory protein
hp908_0079  3       20       2.575953   urease channel
hp908_0080  2       20       2.156220   urease alpha subunit
hp908_0081  -1      12       1.137625   urease beta subunit/urease gamma subunit
hp908_0082  0       13       0.768788   *lipoprotein signal peptidase
hp908_0083  0       14       1.342264   Phosphoglucosamine mutase
hp908_0084  4       20       3.120020   SSU ribosomal protein s20p
hp908_0085  4       19       2.916713   hypothetical protein
hp908_0086  4       18       3.038716   peptide chain release factor 1
hp908_0087  5       16       1.776248   hypothetical protein
hp908_0088  4       17       1.860700   hypothetical protein
hp908_0089  4       17       1.867807   outer membrane protein
hp908_0090  3       15       0.519384   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
hp908_0059  ANTHRAXTOXNA  31     0.036    Anthrax toxin LF subunit signature.
		>ANTHRAXTOXNA#Anthrax toxin LF subunit signature.

Length = 800

Score = 30.9 bits (69), Expect = 0.036
Identities = 36/173 (20%), Positives = 71/173 (41%), Gaps = 19/173 (10%)

Query: 121 QEESQLKERILKRKNEKIILNVNFIGEEVLGEEEANARFEKY---SQALKSNYIQYISIK 177
Q+ S+ ++ + + EK+ F+ E+ + + Y S+ K Y +
Sbjct: 118 QDLSEEEKNSMNSRGEKVPFASRFVFEKKRETPKLIINIKDYAINSEQSKEVYYEIGKGI 177

Query: 178 ITTIFSQINILDFEY-----SKKEIVKRLDALYALALEEEKKQGMPKFINLDMEEFRDLE 232
I S+ LD E+ S + D L++ +E K + K I+++ ++
Sbjct: 178 SLDIISKDKSLDPEFLNLIKSLSDDSDSSDLLFSQKFKE-KLELNNKSIDINF-----IK 231

Query: 233 LTVESFMESIAK-----FDLNAGIVLQAYIPDSYEYLKKLHAFSKERVLKGLK 280
+ F + + F + VL+ Y PD +EY+ KL E++ + LK
Sbjct: 232 ENLTEFQHAFSLAFSYYFAPDHRTVLELYAPDMFEYMNKLEKGGFEKISESLK 284
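
Each alignment on this page carries a statistics line of the form `Identities = 36/173 (20%), Positives = 71/173 (41%), Gaps = 19/173 (10%)`. The following is a minimal Python sketch, assuming that line format, which parses the counts and checks that the printed percentages match integer division of the counts (36/173 is about 20.8%, printed as 20%). The helper name `parse_stats` is hypothetical.

```python
import re

# Matches one "Name = count/length (percent%)" item of a statistics line.
STATS_RE = re.compile(r"(Identities|Positives|Gaps) = (\d+)/(\d+) \((\d+)%\)")


def parse_stats(line):
    """Return {name: (count, alignment_length, printed_percent)} for one line."""
    return {
        name: (int(count), int(length), int(percent))
        for name, count, length, percent in STATS_RE.findall(line)
    }


# Statistics line copied from the hp908_0059 / ANTHRAXTOXNA alignment.
line = "Identities = 36/173 (20%), Positives = 71/173 (41%), Gaps = 19/173 (10%)"
stats = parse_stats(line)

for name, (count, length, percent) in stats.items():
    # Integer division (truncation) reproduces the printed percentage.
    truncated = 100 * count // length
    print(name, f"{count}/{length}", truncated, percent)
```

Lines without a `Gaps` item (for example `Identities = 9/18 (50%), Positives = 11/18 (61%)`) simply yield two entries.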


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
hp908_0063  GPOSANCHOR  36     1e-04    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 35.8 bits (82), Expect = 1e-04
Identities = 37/206 (17%), Positives = 75/206 (36%), Gaps = 2/206 (0%)

Query: 17 ELEARIGELEDENTELFTTKDKLTKENTELASQNTALTEKNKTLTTENDKLNHQVIALTN 76
+ E L+ +N++L L N EL + + EK + + ++ L
Sbjct: 61 KFEIENNTLKLKNSDLSFNNKALKDHNDELTEELSNAKEKLRKNDKSLSEKASKIQELEA 120

Query: 77 EQNSLEQERAELQDEHGFLEKSCANLEKENQRLTDKLKQLESAQKNLENSNTQLRQALEN 136
+ LE+ + LE E L + LE A + N +T ++
Sbjct: 121 RKADLEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKT 180

Query: 137 SNAQLAQAEEKIAEEKTELEREIARLKSLEGMEAKSDLDLANRRLASANQDLKRQKRKLE 196
A+ A E + AE + LE + + L+ LA+ DL++
Sbjct: 181 LEAEKAALEARQAELEKALEGAMNFSTADS--AKIKTLEAEKAALAARKADLEKALEGAM 238

Query: 197 EENIALKERVDSLKEQLFTLQPQKPQ 222
+ A ++ +L+ + L+ ++ +
Sbjct: 239 NFSTADSAKIKTLEAEKAALEARQAE 264



Score = 35.0 bits (80), Expect = 2e-04
Identities = 43/207 (20%), Positives = 78/207 (37%), Gaps = 2/207 (0%)

Query: 16 EELEARIGELEDENTELFTTKDKLTKENTELASQNTALTEKNKTLTTENDKLNHQVIALT 75
+ L+ EL +E + K K +E AS+ L + L + + A +
Sbjct: 81 KALKDHNDELTEELSNAKEKLRKNDKSLSEKASKIQELEARKADLEKALEGAMNFSTADS 140

Query: 76 NEQNSLEQERAELQDEHGFLEKSCANLEKENQRLTDKLKQLESAQKNLENSNTQLRQALE 135
+ +LE E+A L LEK+ + + K+K LE+ + LE +L +ALE
Sbjct: 141 AKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALE 200

Query: 136 NSNAQLAQAEEKIAEEKTELEREIARLKSLEGMEAKSDLDLANRRLASANQDLKRQKRKL 195
+ KI + E AR LE +A + ++ + L+ +K L
Sbjct: 201 GAMNFSTADSAKIKTLEAEKAALAARKADLE--KALEGAMNFSTADSAKIKTLEAEKAAL 258

Query: 196 EEENIALKERVDSLKEQLFTLQPQKPQ 222
E L++ ++ +
Sbjct: 259 EARQAELEKALEGAMNFSTADSAKIKT 285



Score = 34.7 bits (79), Expect = 3e-04
Identities = 41/219 (18%), Positives = 72/219 (32%), Gaps = 9/219 (4%)

Query: 4 LIEKWFGFSQIREELEARIGELEDENTELFTTKDKLTKENTELASQNTALTEKNKTLTTE 63
L + + E + L K L EL + + +
Sbjct: 153 LAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAK 212

Query: 64 NDKLNHQVIALTNEQNSLEQERAELQDEHGFLEKSCANLEKENQRLTDKLKQLESAQKNL 123
L + AL + LE+ + LE E L + +LE A +
Sbjct: 213 IKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGA 272

Query: 124 ENSNTQLRQALENSNAQLAQAEEKIAEEKTELEREIARLKSLEGMEAKSDLDLANRRLAS 183
N +T ++ A+ A E + A+ + + + A +SL S
Sbjct: 273 MNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDAS---------RE 323

Query: 184 ANQDLKRQKRKLEEENIALKERVDSLKEQLFTLQPQKPQ 222
A + L+ + +KLEE+N + SL+ L + K Q
Sbjct: 324 AKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQ 362


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
hp908_0065  SHAPEPROTEIN  29     0.023    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 28.6 bits (64), Expect = 0.023
Identities = 17/58 (29%), Positives = 24/58 (41%), Gaps = 9/58 (15%)

Query: 39 RHVFDDEKTAKTFKVELRASEPCAYAISALKSYGFFKSEKLDKPVYYGVFDFGGGTTD 96
R + + + A +V L EP A AI A + + V D GGGTT+
Sbjct: 124 RAIRESAQGAGAREVFL-IEEPMAAAIGA--------GLPVSEATGSMVVDIGGGTTE 172


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
hp908_0080  UREASE  1044   0.0      Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 1044 bits (2701), Expect = 0.0
Identities = 353/569 (62%), Positives = 442/569 (77%), Gaps = 4/569 (0%)

Query: 3 KISRKEYVSMYGPTTGDKVRLGDTDLIAEVEHDYTIYGEELKFGGGKTLREGMSQSN-NP 61
++SR Y +M+GPT GDKVRL DT+L EVE D+T +GEE+KFGGGK +R+GM QS
Sbjct: 4 RMSRAAYANMFGPTVGDKVRLADTELFIEVEKDFTTHGEEVKFGGGKVIRDGMGQSQVTR 63

Query: 62 SKEELDLIITNALIVDYTGIYKADIGIKDGKIAGIGKGGNKDMQDGVKNNLSVGPATEAL 121
+D +ITNALI+D+ GI KADIG+KDG+IA IGK GN DMQ GV + VGP TE +
Sbjct: 64 EGGAVDTVITNALILDHWGIVKADIGLKDGRIAAIGKAGNPDMQPGV--TIIVGPGTEVI 121

Query: 122 AGEGLIVTAGGIDTHIHFISPQQIPTAFASGVTTMIGGGTGPADGTNATTITPGRRNLKW 181
AGEG IVTAGG+D+HIHFI PQQI A SG+T M+GGGTGPA GT ATT TPG ++
Sbjct: 122 AGEGKIVTAGGMDSHIHFICPQQIEEALMSGLTCMLGGGTGPAHGTLATTCTPGPWHIAR 181

Query: 182 MLRAAEEYSMNLGFLAKGNTSNDASLADQIEAGAIGFKIHEDWGTTPSAINHALDVADKY 241
M+ AA+ + MNL F KGN S +L + + GA K+HEDWGTTP+AI+ L VAD+Y
Sbjct: 182 MIEAADAFPMNLAFAGKGNASLPGALVEMVLGGATSLKLHEDWGTTPAAIDCCLSVADEY 241

Query: 242 DVQVAIHTDTLNEAGCVEDTMAAIAGRTMHTFHTEGAGGGHAPDIIKVAGEHNILPASTN 301
DVQV IHTDTLNE+G VEDT+AAI GRT+H +HTEGAGGGHAPDII++ G+ N++P+STN
Sbjct: 242 DVQVMIHTDTLNESGFVEDTIAAIKGRTIHAYHTEGAGGGHAPDIIRICGQPNVIPSSTN 301

Query: 302 PTIPFTVNTEAEHMDMLMVCHHLDKSIKEDVQFADSRIRPQTIAAEDTLHDMGIFSITSS 361
PT P+TVNT AEH+DMLMVCHHL +I ED+ FA+SRIR +TIAAED LHD+G FSI SS
Sbjct: 302 PTRPYTVNTLAEHLDMLMVCHHLSPTIPEDIAFAESRIRKETIAAEDILHDIGAFSIISS 361

Query: 362 DSQAMGRVGEVITRTWQTADKNKKEFGRLKEEKGDNDNFRIKRYLSKYTINPAIAHGISE 421
DSQAMGRVGEV RTWQTADK K++ GRLKEE GDNDNFR+KRY++KYTINPAIAHG+S
Sbjct: 362 DSQAMGRVGEVAIRTWQTADKMKRQRGRLKEETGDNDNFRVKRYIAKYTINPAIAHGLSH 421

Query: 422 YVGSVEVGKVADLVLWSPAFFGVKPNMIIKGGFIALSQMGDANASIPTPQPVYYREMFAH 481
+GS+EVGK ADLVLW+PAFFGVKP+M++ GG IA + MGD NASIPTPQPV+YR MF
Sbjct: 422 EIGSLEVGKRADLVLWNPAFFGVKPDMVLLGGTIAAAPMGDPNASIPTPQPVHYRPMFGA 481

Query: 482 HGKAKYDANITFVSQAAYDKGIKEELGLERQVLPVKNCR-NITKKDMQFNDTTAHIEVNP 540
+G+++ ++++TFVSQA+ D G+ LG+ ++++ V+N R I K M N T HIEV+P
Sbjct: 482 YGRSRTNSSVTFVSQASLDAGLAGRLGVAKELVAVQNTRGGIGKASMIHNSLTPHIEVDP 541

Query: 541 ETYHVFVDGKEVTSKPANKVSLAQLFSIF 569
ETY V DG+ +T +PA + +AQ + +F
Sbjct: 542 ETYEVRADGELLTCEPATVLPMAQRYFLF 570


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
hp908_0089  FLAGELLIN  33     0.002    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 33.5 bits (76), Expect = 0.002
Identities = 32/285 (11%), Positives = 80/285 (28%), Gaps = 6/285 (2%)

Query: 17 SVLLGSMNATDLETYAALQKPSHVFSNYAKKSNKGSELSSDSLTQQQAQNTAQSDTTQAT 76
++ L ++ L + KS+ + D+ + ++
Sbjct: 156 TIDLQKIDVKSLGLDGFNVNGPKEATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVV 215

Query: 77 TLENTASSGTP----DSSTLPTKETPPATSGGTGGDKHTASSGTPPASSTPPAKKDETSG 132
T + ++ T + + +++GT A + A K G
Sbjct: 216 TDTTAPTVPDKVYVNAANGQLTTDDAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEG 275

Query: 133 SGDKDQHTASGTGGTPSSSGGTGGDKHTASSGTPPASSTPPTPTPPTSGGNTITSQLTKD 192
D + T T + + G G T + + T T+ S
Sbjct: 276 D-TFDYKGVTFTIDTKTGNDGNGKVSTTINGEKVTLTVADITAGAANVDAATLQSSKNVY 334

Query: 193 TTTVNNLKSVSVSAMNTTLSGVTQLSQQTATISTLLNGSPNLGSVISNAQGLSSAFSALE 252
T+ VN + N + + + + + + + ++ A +
Sbjct: 335 TSVVNGQFTFDDKTKNESAKLSDLEANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMF 394

Query: 253 SAQNTLKGYLDSSSATIGQLTNGSNAVVGALDKAINQVDMALADL 297
+ + + + ++D A+++VD + L
Sbjct: 395 IDKTASGVSTLINEDAAAAKK-STANPLASIDSALSKVDAVRSSL 438



Score = 31.2 bits (70), Expect = 0.013
Identities = 38/306 (12%), Positives = 85/306 (27%), Gaps = 10/306 (3%)

Query: 55 SSDSLTQQQAQNTAQSDTTQATTLENTASSGTPDSSTLPTKETPPATSGGTGGDKHTASS 114
SL + T + + D+ + + + G TA +
Sbjct: 163 DVKSLGLDGFNVNGPKEATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPT 222

Query: 115 GTPPASSTPPAKKDETSGSGDKDQHTASGTGGTPSSSGGTGGDKHTASSGTPPASSTPPT 174
A T+ + + ++ A G +
Sbjct: 223 VPDKVYVNA-ANGQLTTDDAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYK 281

Query: 175 PTPPTSGGNTITSQLTKDTTTVNNLKSVSVSAMNTTLSGVTQLSQQTA---TISTLLNGS 231
T T K +TT+N K A T + + + ++++NG
Sbjct: 282 GVTFTIDTKTGNDGNGKVSTTINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQ 341

Query: 232 PNLGSVISNAQGLSSAFSALESAQNTLKGYLDSSSATIGQLTNGSNAVVGALDKAINQVD 291
N S A + + K ++ + T + +
Sbjct: 342 FTFDDKTKNESAKLSDLEANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFIDKTASG 401

Query: 292 MALADLATADTQKTQAVALVAASNSATTTTDAINFLNALKANLTAQKDAFMSVHKNIQTA 351
++ A A + +N + A++ ++A++++L A ++ F S N+
Sbjct: 402 VSTLINEDAAA------AKKSTANPLASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNT 455

Query: 352 VAQAQA 357
V +
Sbjct: 456 VTNLNS 461


2     hp908_0110  hp908_0116  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
hp908_0110  3       12       0.473365   hypothetical protein
hp908_0111  2       14       2.561832   Beta-1,3-galactosyltransferase/Beta-1,
hp908_0112  1       14       3.604925   methyl-accepting chemotaxis protein
hp908_0113  3       15       3.852542   hypothetical protein
hp908_0114  2       14       3.548201   2',3'-cyclic-nucleotide 2'-phosphodiesterase
hp908_0115  -2      12       4.759971   S-ribosylhomocysteine lyase/Autoinducer-2
hp908_0116  -2      12       4.432567   Cystathionine gamma-lyase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
hp908_0115  LUXSPROTEIN  225    6e-79    Bacterial autoinducer-2 (AI-2) production protein Lu...
		>LUXSPROTEIN#Bacterial autoinducer-2 (AI-2) production protein LuxS signature.

Length = 171

Score = 225 bits (575), Expect = 6e-79
Identities = 57/145 (39%), Positives = 91/145 (62%), Gaps = 7/145 (4%)

Query: 5 VESFNLDHTKVKAPYVRVADRKKGVNGDLIVKYDVRFKQPNQDHMDMPSLHSLEHLVAEI 64
++SF +DHT++ AP VRVA + GD I +D+RF PN+D + +H+LEHL A
Sbjct: 3 LDSFTVDHTRMNAPAVRVAKTMQTPKGDTITVFDLRFTAPNKDILSEKGIHTLEHLYAGF 62

Query: 65 IRNHA----SYVVDWSPMGCQTGFYLTVLNHDNYTEILEVLEKTMQDVLKAK---EVPAS 117
+RNH ++D SPMGC+TGFY++++ + ++ + M+DVLK + ++P
Sbjct: 63 MRNHLNGDSVEIIDISPMGCRTGFYMSLIGTPSEQQVADAWIAAMEDVLKVENQNKIPEL 122

Query: 118 NEKQCGWAANHTLEGAQNLARAFLD 142
NE QCG AA H+L+ A+ +A+ L+
Sbjct: 123 NEYQCGTAAMHSLDEAKQIAKNILE 147


3     hp908_0192  hp908_0214  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
hp908_0192  2       12       0.369610   lysyl t-rna synthetase:classII
hp908_0193  3       14       0.585546   serine hydroxymethyl transferase
hp908_0194  1       16       -0.171495  hypothetical protein
hp908_0195  2       15       0.435516   hypothetical protein
hp908_0196  1       14       2.702936   hypothetical protein
hp908_0197  0       12       2.902891   hypothetical protein
hp908_0198  -1      10       2.215602   putative inner membrane protein
hp908_0199  -1      9        2.356186   cardiolipin synthetase
hp908_0200  -1      11       3.208092   succinate dehydrogenase iron-sulfur protein
hp908_0201  0       11       3.232504   succinate dehydrogenase flavoprotein subunit
hp908_0202  -1      14       1.739300   fumarate reductase cytochrome B subunit
hp908_0203  -2      17       1.570962   triosephosphate isomerase
hp908_0204  -2      17       2.536174   enoyl-acyl-carrier-protein NADH
hp908_0205  -2      18       2.461246   UDP-3-O-3-hydroxy myristoyl glucosamine
hp908_0206  -2      16       2.787882   S-adenosylmethionine synthetase
hp908_0207  -2      17       1.973934   nucleoside diphosphate kinase
hp908_0208  -3      17       1.430926   hypothetical protein
hp908_0209  0       14       -3.630253  LSU ribosomal protein L32p
hp908_0210  0       12       -2.522197  Phosphate:acyl-ACP acyltransferase
hp908_0211  0       13       -3.287111  3-oxoacyl-acyl-carrier-protein synthase
hp908_0212  1       14       -4.274936  hypothetical protein
hp908_0213  1       11       -3.588978  hypothetical protein
hp908_0214  1       11       -3.720644  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
hp908_0197  FRAGILYSIN  28     0.029    Fragilysin metallopeptidase (M10C) enterotoxin signat...
		>FRAGILYSIN#Fragilysin metallopeptidase (M10C) enterotoxin signature.

Length = 405

Score = 28.5 bits (63), Expect = 0.029
Identities = 22/103 (21%), Positives = 44/103 (42%), Gaps = 2/103 (1%)

Query: 7 EDNKKLYDIIDGQQRTTTIFMLLHVLASKQNEKDKQETRKYLYQKGELKLEVAPQNQSFF 66
DN+ + + +G+ + +T F+L A + ++ + Y++ ++ E+A + F
Sbjct: 93 LDNENVR-LFNGRDKDSTSFILGDEFAVLRFYRNGESISYIAYKEAQMMNEIAEFYAAPF 151

Query: 67 KTLLEAAEKENISHCEKDADTEGKQNLFEVLKAILDKVSKLSG 109
K EKE C D+ T +K +DK K+
Sbjct: 152 KKTRAINEKE-AFECIYDSRTRSAGKDIVSVKINIDKAKKILN 193


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
hp908_0204  DHBDHDRGNASE  60     8e-13    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 60.1 bits (145), Expect = 8e-13
Identities = 61/263 (23%), Positives = 108/263 (41%), Gaps = 29/263 (11%)

Query: 4 LKGKKGLIVGVANNKSIAYGIAQSCFNQGATL-AFTYLNESLEKRVRPIAQELNSPYVYE 62
++GK I G A + I +A++ +QGA + A Y E LEK V + E +
Sbjct: 6 IEGKIAFITGAA--QGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFP 63

Query: 63 LDVSKEEHFKSLYNNIKKDLGSLDFIVHSVAF--------APKEALEGSLLETSKSAFNT 114
DV + I++++G +D +V+ E E + S FN
Sbjct: 64 ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNA 123

Query: 115 AMEISVYSLIELTNTLKPLLNNGASVLTLSYLGSTKYMAHYNVMGLAKAALESAVRYLAV 174
+ +S Y + + ++ + +N A V S MA Y +KAA + L +
Sbjct: 124 SRSVSKYMMDRRSGSIVTVGSNPAGVPRTS-------MAAY---ASSKAAAVMFTKCLGL 173

Query: 175 DLGRHHIRVNALSAGPIRT-----LASSGIADFRMILKWNE---INAPLRKNVSLEEVGN 226
+L ++IR N +S G T L + ++I E PL+K ++ +
Sbjct: 174 ELAEYNIRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIAD 233

Query: 227 AGMYLLSSLSNGVSGEVHFVDAG 249
A ++L+S + ++ VD G
Sbjct: 234 AVLFLVSGQAGHITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
hp908_0214  PF01540  34     0.003    Adhesin lipoprotein
		>PF01540#Adhesin lipoprotein

Length = 475

Score = 33.9 bits (77), Expect = 0.003
Identities = 32/127 (25%), Positives = 60/127 (47%), Gaps = 10/127 (7%)

Query: 209 IGKGKQKQLSKIYSHF-KKLSEGEIKPQNEGILKKLKSLDEIFKTTDFTRFTPKTEIKDI 267
+ K K+L++I + KKL+E K +N G+ + K +E F+ + +
Sbjct: 340 VKKAWSKELAEIKAEDDKKLAEENQKIKN-GVEELKKINNEAFELSK--------TVNKT 390

Query: 268 IKEIDEKYPINENFKRQFRTFRSSIGNLKKKINSLKYLEKTREDFERKKESWIKEIGNDC 327
I E+++K+ I+ +FK Q + F + + ++I+ + T+E F + KEI
Sbjct: 391 IAELEKKFKIDVSFKEQLKNFADDLLDKSRQIDEFTTVTSTQEGFTLAELESFKEITTTW 450

Query: 328 KNECNSE 334
N SE
Sbjct: 451 FNGMKSE 457



Score = 33.6 bits (76), Expect = 0.004
Identities = 26/124 (20%), Positives = 50/124 (40%), Gaps = 8/124 (6%)

Query: 211 KGKQKQLSKIYSHFKKLSEGEIKPQNEGILKKLKSLDEIFKTTDFTRFTPKTEIKDIIKE 270
K K+L++I + K E + EG + LK ++I D I I +
Sbjct: 221 KAWSKELAEIKAEDDKKLAEENQKIKEGAKELLKLSEKIQSFAD--------TIALTITK 272

Query: 271 IDEKYPINENFKRQFRTFRSSIGNLKKKINSLKYLEKTREDFERKKESWIKEIGNDCKNE 330
++ K+ I+E FK+Q + + ++ + + ++DF + KE +
Sbjct: 273 LERKFQIDEKFKKQLISTIELLNKKSVEVKTFATVNTIKKDFLLSELESFKEFNTSWLEK 332

Query: 331 CNSE 334
SE
Sbjct: 333 IVSE 336


4     hp908_0311  hp908_0351  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
hp908_0311  1       14       3.653109   LSU ribosomal protein L21p
hp908_0312  1       15       3.750882   LSU ribosomal protein L27p
hp908_0313  1       14       3.737617   Dipeptide-binding ABCtransporter
hp908_0314  0       17       4.008361   Dipeptide transport system permease protein
hp908_0315  -2      15       3.684513   Dipeptide transport system permease protein
hp908_0316  -3      16       2.952047   Dipeptide transport system permease protein
hp908_0317  -2      15       2.210428   Dipeptide transport system permease protein
hp908_0318  0       14       1.967069   GTP binding protein
hp908_0320  0       15       2.184555   putative periplasmic protein
hp908_0321  1       16       2.715406   Glutamate-1-semialdehyde aminotransferase
hp908_0322  4       18       1.991926   hypothetical protein
hp908_0323  4       16       1.795288   hypothetical protein
hp908_0324  2       17       0.364002   predicted aminohydrolase
hp908_0325  1       16       -0.838173  putative polysaccharide acetylase
hp908_0326  1       15       -1.828693  hypothetical protein
hp908_0327  1       17       -2.826649  putative GTPase
hp908_0328  -1      18       -2.987098  nitrite extrusion protein
hp908_0329  1       20       -3.724930  hypothetical protein
hp908_0330  -1      16       -2.328258  ABC transporter, ATP-binding protein
hp908_0331  1       15       -1.614710  putative heme oxygenase
hp908_0332  5       13       -2.354729  hypothetical protein
hp908_0333  4       13       -1.870135  arginyl-tRNA synthetase
hp908_0334  1       13       -2.060383  twin-arginine translocation protein:TatA
hp908_0335  2       12       -1.589263  guanylate kinase
hp908_0336  0       13       -1.551525  poly E-rich protein
hp908_0337  0       14       -1.639546  poly E-rich protein
hp908_0338  -1      14       -2.086334  membrane bound endonuclease
hp908_0339  1       14       -2.065176  putative outer membrane protein
hp908_0340  1       16       -2.044406  flagellar L0 ring protein
hp908_0341  2       14       -1.790991  N-Acetylneuraminate cytidylyltransferase
hp908_0342  2       12       -1.092992  conserved hypothetical protein
hp908_0343  2       12       -0.922897  putative flagellar biosynthesis protein
hp908_0344  1       13       0.442839   Tetraacyldisaccharide 4'-kinase
hp908_0345  1       15       1.358819   NAD synthetase
hp908_0346  0       16       1.730334   *ketol-acid reductoisomerase
hp908_0347  0       16       0.746898   septum site-determining protein
hp908_0348  1       15       -0.793893  cell division topology specificity factor
hp908_0349  0       12       0.070477   DNA processing chain A
hp908_0350  2       13       -0.722789  hypothetical protein
hp908_0351  2       15       -1.048450  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
hp908_0328  TCRTETB  31     0.006    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 31.4 bits (71), Expect = 0.006
Identities = 36/193 (18%), Positives = 77/193 (39%), Gaps = 1/193 (0%)

Query: 23 VLIPLLILSGSLTPHQSFQLGIAVLMGYVFGSFLIQFLSPLMSLESIAKISFGLIALSFL 82
V +P + + P + + A ++ + G+ + LS + ++ + + +
Sbjct: 35 VSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSV 94

Query: 83 VCYFDSIPFFWLWIWRFIAGVASSALMILVAPLSLPYVKEHKKALVGGLIFSAVGIGSVF 142
+ + F L + RFI G ++A LV + Y+ + + GLI S V +G
Sbjct: 95 IGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGV 154

Query: 143 SGFVLPWISSYNIKWAWIFLGGSCLIAFILSLVGLKTRSLRKKSVKKEESAFKIPFHLWL 202
+ I+ Y I W+++ L I + L+ L + +R K + + +
Sbjct: 155 GPAIGGMIAHY-IHWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDIKGIILMSVGIVF 213

Query: 203 LLISCALNAIGFL 215
++ +I FL
Sbjct: 214 FMLFTTSYSISFL 226


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
hp908_0335  PF05272  29     0.011    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.011
Identities = 9/18 (50%), Positives = 11/18 (61%)

Query: 8 LILSGPSGAGKSTLTKYL 25
++L G G GKSTL L
Sbjct: 599 VVLEGTGGIGKSTLINTL 616


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
hp908_0337  IGASERPTASE  54     5e-10    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 53.5 bits (128), Expect = 5e-10
Identities = 34/154 (22%), Positives = 63/154 (40%), Gaps = 4/154 (2%)

Query: 148 EALAKEEPNNEEQLLPTLNEQEGETPKEEAQEEVKKEEVKEMQ-EEVKEMQEEVKEKQKQ 206
E +A+ + P + ET E +++E K E E E EV ++ K
Sbjct: 1015 EEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKS 1074

Query: 207 EVAENPQDEEKPKDDETQGSVEPPKDEEVSKELETQEQEPIKEETQEIKEEKQEKTQDSP 266
V N Q E ++ + + E + +++E K ET++ +E + +Q SP
Sbjct: 1075 NVKANTQTNEV---AQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSP 1131

Query: 267 NVQELEAMQELVKEIQENSNDQENKKPKKPKKTQ 300
++ E +Q + +EN K+P+ T
Sbjct: 1132 KQEQSETVQPQAEPARENDPTVNIKEPQSQTNTT 1165



Score = 45.4 bits (107), Expect = 2e-07
Identities = 30/128 (23%), Positives = 46/128 (35%), Gaps = 4/128 (3%)

Query: 167 EQEGETPKEEAQEEVKKEEVKEMQEEVKEMQEEVKEKQKQEVAENPQDEEKPKDDETQGS 226
E + +EE K E ++ QE K + +V KQ+Q PQ E ++D T
Sbjct: 1097 TTETKETATVEKEEKAKVETEKTQEVPK-VTSQVSPKQEQSETVQPQAEPARENDPTVNI 1155

Query: 227 VEPPKDEEVSKELETQEQEPIKEETQEIKEEKQEKTQDSPNVQELEAMQELVKEIQENSN 286
EP + + E +E Q + E T +S Q N
Sbjct: 1156 KEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPE---NTTPATTQPTVN 1212

Query: 287 DQENKKPK 294
+ + KPK
Sbjct: 1213 SESSNKPK 1220



Score = 44.3 bits (104), Expect = 5e-07
Identities = 21/147 (14%), Positives = 54/147 (36%), Gaps = 4/147 (2%)

Query: 151 AKEEPNNEEQ-LLPTLNEQEGETPKEEAQEEVKKEEVKEMQEEVKEMQEE---VKEKQKQ 206
KE E++ E+ E PK +Q K+E+ + +Q + + +E V K+ Q
Sbjct: 1100 TKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQ 1159

Query: 207 EVAENPQDEEKPKDDETQGSVEPPKDEEVSKELETQEQEPIKEETQEIKEEKQEKTQDSP 266
D E+P + + +P + + + P + ++ + P
Sbjct: 1160 SQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKP 1219

Query: 267 NVQELEAMQELVKEIQENSNDQENKKP 293
+ +++ + ++ + ++
Sbjct: 1220 KNRHRRSVRSVPHNVEPATTSSNDRST 1246



Score = 31.2 bits (70), Expect = 0.005
Identities = 25/122 (20%), Positives = 41/122 (33%), Gaps = 3/122 (2%)

Query: 111 QKKLGSNMSELEPSQNSDPTQEILETNWDELENLGDLEALAKEEPNNEEQLLPTLNEQEG 170
Q++ + + EP++ +DPT I E + D E AKE +N EQ +
Sbjct: 1133 QEQSETVQPQAEPARENDPTVNIKEPQ-SQTNTTADTEQPAKETSSNVEQPVTESTTVNT 1191

Query: 171 ETPKEEAQEEVKKEEVKEMQEEVKEMQEEVKEKQKQEVAENPQDEEKPKDDETQGSVEPP 230
E E + E + K + ++ V P + E S
Sbjct: 1192 GNSVVENPENTTPATTQP--TVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVAL 1249

Query: 231 KD 232
D
Sbjct: 1250 CD 1251


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
hp908_0340  FLGLRINGFLGH  194    1e-64    Flagellar L-ring protein signature.
		>FLGLRINGFLGH#Flagellar L-ring protein signature.

Length = 232

Score = 194 bits (495), Expect = 1e-64
Identities = 52/172 (30%), Positives = 84/172 (48%), Gaps = 18/172 (10%)

Query: 56 GERPLFADRRAMKPNDLITIIVSEKASANYSSS----KDYKSASGGNSTPPRLTYNGLDE 111
G +PLF DRR D +TI++ E SA+ SSS +D K+ G ++ P L GL
Sbjct: 61 GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYL--QGLFG 118

Query: 112 RKKKEAEYLDDKNNYNFTKSSNNTNFKGGGSQKKSEDLEIVLSARIIKVLENGNYFIYGN 171
+ + E S F G G S L+ + +VL NGN + G
Sbjct: 119 NARADVEA------------SGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGE 166

Query: 172 KEVLVDGEKQILKVSGVIRPYDIERNNTIQSKFLADAKIEYTNLGHLSDSNK 223
K++ ++ + ++ SGV+ P I +NT+ S +ADA+IEY G+++++
Sbjct: 167 KQIAINQGTEFIRFSGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQN 218


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
hp908_0351  SYCDCHAPRONE  30     0.001    Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 29.9 bits (67), Expect = 0.001
Identities = 12/37 (32%), Positives = 20/37 (54%)

Query: 22 GVASQTPKELYDLGVESYKAKDYIKAKKYFEKACGLN 58
++S T ++LY L Y++ Y A K F+ C L+
Sbjct: 30 EISSDTLEQLYSLAFNQYQSGKYEDAHKVFQALCVLD 66


5     hp908_0467  hp908_0483  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
hp908_0467  1       13       -3.145347  Molybdenum ABC transporter
hp908_0468  1       12       -3.456429  Molybdenum transport system permease protein
hp908_0469  0       10       -1.690410  Molybdenum transport ATP-binding protein
hp908_0470  -1      11       -2.104487  Glutamyl-tRNA synthetase
hp908_0471  -2      13       -2.726747  adenine specific DNA-methyltransferase
hp908_0472  -2      12       -1.760669  adenine specific DNA-methyltransferase
hp908_0473  -1      12       -1.189062  hypothetical protein
hp908_0474  -1      16       -0.001965  GTP-binding protein
hp908_0475  1       24       -3.545192  DNA adenine methylase
hp908_0476  0       25       -3.476929  hypothetical protein
hp908_0477  -2      21       -2.338435  hypothetical protein
hp908_0478  5       19       0.305299   DNA cytosine methyltransferase
hp908_0479  6       19       0.729299   DNA cytosine methyltransferase
hp908_0480  6       20       0.845479   hypothetical protein
hp908_0481  7       21       1.494634   hypothetical protein
hp908_0482  5       19       1.753499   catalase
hp908_0483  7       21       2.285742   outer membrane protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
hp908_0469  PF05272  30     0.009    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.4 bits (68), Expect = 0.009
Identities = 12/32 (37%), Positives = 17/32 (53%)

Query: 30 VVALLGESGAGKSTILRILAGLEAVSSGYIEA 61
V L G G GKST++ L GL+ S + +
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGLDFFSDTHFDI 629


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
hp908_0474  TCRTETOQM  198    1e-57    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 198 bits (505), Expect = 1e-57
Identities = 116/461 (25%), Positives = 190/461 (41%), Gaps = 67/461 (14%)

Query: 3 NIRNIAVIAHVDHGKTTLVDGLLSQSGTFSEREKVDE--RVMDSNDLERERGITILSKNT 60
I NI V+AHVD GKTTL + LL SG +E VD+ D+ LER+RGITI + T
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 61 AIYYKDTKINIIDTPGHADFGGEVERVLKMVDGVLLLVDAQEGVMPQTKFVVKKALSFGI 120
+ +++TK+NIIDTPGH DF EV R L ++DG +LL+ A++GV QT+ + GI
Sbjct: 62 SFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGI 121

Query: 121 CPIVVVNKIDKPAAEPDRVVDEVFDLF---------VAMGASDKQLDFPV-----VYAAA 166
I +NKID+ + V ++ + V + + +F
Sbjct: 122 PTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEG 181

Query: 167 RDGYAMKSLDDE----------------------------KKNL--EPLFETILEHVPSP 196
D K + + K N+ + L E I S
Sbjct: 182 NDDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVITNKFYSS 241

Query: 197 SGSVDEPLQMQIFTLDYDNYVGKIGIARVFNGSVKKNESVLLMKSDGSKENGRITKLIGF 256
+ L ++F ++Y ++ R+++G + +SV + KE +IT++
Sbjct: 242 THRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRI----SEKEKIKITEMYTS 297

Query: 257 LGLARTEIENAYAGDIVALAG--FNAMDV-GDSVVDPTNPMPLDPMHLEEPTMSVYFAVN 313
+ +I+ AY+G+IV L V GD+ + P +P P + +
Sbjct: 298 INGELCKIDKAYSGEIVILQNEFLKLNSVLGDTKLLPQRERIENP----LPLLQTTVEPS 353

Query: 314 DSPLAGLEGKHVTANKLKDRLLKEMQTNIAMKCEEMGEGKFKVSGRGELQITILAENLRR 373
+ + D LL+ + + +S G++Q+ + L+
Sbjct: 354 KPQQREMLLDALLEISDSDPLLRYYVDSAT--------HEIILSFLGKVQMEVTCALLQE 405

Query: 374 E-RFEFSISRPEVIIKEENGVKCEPFEHLVIDTPQDFSGAI 413
+ E I P VI E K E H+ + P F +I
Sbjct: 406 KYHVEIEIKEPTVIYMERPLKKAEYTIHIEVP-PNPFWASI 445



Score = 41.8 bits (98), Expect = 8e-06
Identities = 20/80 (25%), Positives = 30/80 (37%), Gaps = 1/80 (1%)

Query: 396 EPFEHLVIDTPQDFSGAIIERLGKRKAEMKAMNPMSDGYTRLEFEIPARGLIGYRSEFLT 455
EP+ I PQ++ K A + + + L EIPAR + YRS+
Sbjct: 537 EPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQ-LKNNEVILSGEIPARCIQEYRSDLTF 595

Query: 456 DTKGEGVMNHSFLEFRPFSG 475
T G V + +G
Sbjct: 596 FTNGRSVCLTELKGYHVTTG 615


6     hp908_0507  hp908_0550  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
hp908_0507  2       16       -2.723589  hypothetical protein
hp908_0508  1       14       -2.446596  hypothetical protein
hp908_0509  1       13       -2.122114  hypothetical protein
hp908_0510  1       12       -1.846791  glutamine synthetase typeI
hp908_0511  0       12       -3.103842  RloF
hp908_0512  -1      11       -2.560002  RloF
hp908_0513  -3      10       -1.742618  LSU ribosomal protein L9p
hp908_0514  -3      11       -1.923438  ATP-dependent protease
hp908_0515  -2      14       -3.249005  ATP-dependent hsl protease
hp908_0516  2       15       -3.342546  GTP-binding protein
hp908_0517  3       15       -3.320137  putative periplasmic protein
hp908_0518  6       20       -3.474178  hypothetical protein
hp908_0519  8       19       -2.483178  cag pathogenicity island protein
hp908_0520  8       19       -2.514032  cag island protein
hp908_0521  9       18       -2.286313  cag pathogenicity island protein
hp908_0522  7       19       -1.691297  cag island protein
hp908_0523  8       19       -1.618566  hypothetical protein
hp908_0524  8       19       -2.078303  type IV secretion protein
hp908_0525  10      21       -2.674767  ATPase like protein
hp908_0526  11      21       -3.104254  cag pathogenicity island protein
hp908_0527  10      22       -2.784232  cag pathogenicity island protein
hp908_0528  10      24       -3.971240  cag pathogenicity island protein
hp908_0529  10      24       -4.389669  cag pathogenicity island protein
hp908_0530  9       25       -4.285630  cag pathogenicity island protein
hp908_0531  10      25       -4.104784  cag island protein
hp908_0532  10      28       -4.500156  cag pathogenicity island protein
hp908_0533  10      27       -4.637126  cag pathogenicity island protein
hp908_0534  12      23       -5.333760  cag pathogenicity island protein
hp908_0535  11      22       -5.359575  cag pathogenicity island protein
hp908_0536  12      20       -5.039285  cag pathogenicity island protein
hp908_0537  8       19       -4.335789  cag pathogenicity island protein
hp908_0538  7       18       -2.936951  cag pathogenicity island protein
hp908_0539  6       17       -2.827328  cag island protein
hp908_0540  7       20       -3.026465  cag island protein
hp908_0541  5       19       -2.931746  cag pathogenicity island protein
hp908_0542  5       20       -3.336711  cag island protein
hp908_0543  6       19       -3.401914  cag island protein
hp908_0544  6       20       -4.190011  cag island protein
hp908_0545  6       21       -3.237011  cag pathogenicity island protein
hp908_0546  5       22       -2.716134  cag pathogenicity island protein
hp908_0547  6       26       -2.231064  cag island protein
hp908_0548  4       23       -0.890731  cag island protein
hp908_0549  2       18       -0.140697  hypothetical protein
hp908_0550  2       18       -0.036239  cag island protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
hp908_0516  PF03944  31     0.005    delta endotoxin
		>PF03944#delta endotoxin

Length = 633

Score = 31.2 bits (70), Expect = 0.005
Identities = 25/94 (26%), Positives = 47/94 (50%), Gaps = 3/94 (3%)

Query: 68 LHHQEKLLNQCMLSQALKAMGDAELRVFLASVHDDLKGYEEFLSLCQKPHILALSKIDTA 127
L E+ LNQ + + + A +AEL A+V + + + FL+ + L+++
Sbjct: 94 LRETERFLNQRLNTDTV-ARVNAELTGLQANVEEFNRQVDNFLNPNRNAVPLSITSSVNT 152

Query: 128 THKQVLQKLQEYQKYASQFLDLVPLSAKKSQNLN 161
+ L +L ++Q Q L L+PL A+ + NL+
Sbjct: 153 MQQLFLNRLPQFQMQGYQLL-LLPLFAQAA-NLH 184


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
hp908_0531  TYPE4SSCAGX  789    0.0      Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein signature.

Length = 522

Score = 789 bits (2038), Expect = 0.0
Identities = 479/482 (99%), Positives = 482/482 (100%)

Query: 1 MVNKKIAYLGDEKPITIWTSLDNVTVIQLEKDETISYITTGFNKGWNIVPNSNHIFIQPK 60
+VNKKIAYLGDEKPITIWTSLDNVTVIQLEKDETISYITTGFNKGW+IVPNSNHIFIQPK
Sbjct: 41 VVNKKIAYLGDEKPITIWTSLDNVTVIQLEKDETISYITTGFNKGWSIVPNSNHIFIQPK 100

Query: 61 SVKSNLMFEKEAVNFALMTRDYQEFLKTKKLIVDAPDPKELEEQKKALEKEKEAKEQAQK 120
SVKSNLMFEKEAVNFALMTRDYQEFLKTKKLIVDAPDPKELEEQKKALEKEKEAKEQAQK
Sbjct: 101 SVKSNLMFEKEAVNFALMTRDYQEFLKTKKLIVDAPDPKELEEQKKALEKEKEAKEQAQK 160

Query: 121 AQKDKREKRKEERAKNRANLENLTNAMSNPQNLSNNKNLSELIKQQRENELDQMERLEDM 180
AQKDKREKRKEERAKNRANLENLTNAMSNPQNLSNNKNLSELIKQQRENELDQMERLEDM
Sbjct: 161 AQKDKREKRKEERAKNRANLENLTNAMSNPQNLSNNKNLSELIKQQRENELDQMERLEDM 220

Query: 181 QEQAQANALKQIEELNKKQAEEAVKQRAKDKISIKTDKSQKSPEDNSIELSPSDSAWRTN 240
QEQAQANALKQIEELNKKQAEEAV+QRAKDKISIKTDKSQKSPEDNSIELSPSDSAWRTN
Sbjct: 221 QEQAQANALKQIEELNKKQAEEAVRQRAKDKISIKTDKSQKSPEDNSIELSPSDSAWRTN 280

Query: 241 LVVRTNKALYQFILRIAQKDNFASAYLTVKLEYPQRHEVSSVIEEELKKREEAKRQRELI 300
LVVRTNKALYQFILRIAQKDNFASAYLTVKLEYPQRHEVSSVIEEELKKREEAKRQRELI
Sbjct: 281 LVVRTNKALYQFILRIAQKDNFASAYLTVKLEYPQRHEVSSVIEEELKKREEAKRQRELI 340

Query: 301 KQENLNTTAYINRVMMASNEQIINKEKIREEKQKIILDQAKALETQYVHNALKRNPVPRN 360
KQENLNTTAYINRVMMASNEQIINKEKIREEKQKIILDQAKALETQYVHNALKRNPVPRN
Sbjct: 341 KQENLNTTAYINRVMMASNEQIINKEKIREEKQKIILDQAKALETQYVHNALKRNPVPRN 400

Query: 361 YNYYQAPEKRSKHIMPSEIFDDGTFTYFGFKNITLQPAIFVVQPDGKLSMTDAAIDPNMT 420
YNYYQAPEKRSKHIMPSEIFDDGTFTYFGFKNITLQPAIFVVQPDGKLSMTDAAIDPNMT
Sbjct: 401 YNYYQAPEKRSKHIMPSEIFDDGTFTYFGFKNITLQPAIFVVQPDGKLSMTDAAIDPNMT 460

Query: 421 NSGLRWYRVNEIAEKFKLIKDKALVTVINKGYGKNPLTKNYNIKNYGELERVIKKLPLVR 480
NSGLRWYRVNEIAEKFKLIKDKALVTVINKGYGKNPLTKNYNIKNYGELERVIKKLPLVR
Sbjct: 461 NSGLRWYRVNEIAEKFKLIKDKALVTVINKGYGKNPLTKNYNIKNYGELERVIKKLPLVR 520

Query: 481 DK 482
DK
Sbjct: 521 DK 522


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
hp908_0534  PF04335       119    5e-35    VirB8 type IV secretion protein
		>PF04335#VirB8 type IV secretion protein

Length = 227

Score = 119 bits (299), Expect = 5e-35
Identities = 44/205 (21%), Positives = 74/205 (36%), Gaps = 10/205 (4%)

Query: 27 KLNKANRTFKRAFYL---SMALNVAAVTSIVMMMPLKKTDIFVYGIDRYTGEFKIVKRSD 83
KL A R+ K A+ + + AL A V ++ + PLK + +V +DR TGE I +
Sbjct: 24 KLAAAERSKKLAWVVAGVAGALATAGVVAVAALTPLKTVEPYVITVDRNTGEASIAAKLH 83

Query: 84 A-RQIVNSEAVVDSATSKFVSLLFGYSKNSLRDRKDQLMQYCDVSFQTQAMRMFNENIRQ 142
I EAV + +V G+ + + D +M Q + R + + Q
Sbjct: 84 GDATITYDEAVRKYFLATYVRYREGWIAAAREEYFDAVMVMSARPEQDRWSRFYKTDNPQ 143

Query: 143 FVDKVRA-EAIISSNIQREKVKNSPLTRLTFFITIKITPDTMENYEYITKKQVTIYYDFA 201
+ A + I + +F +T T TI Y
Sbjct: 144 SPQNILANRTDVFVEI-KRVSFLGGNVAQVYFTKESVTGSNS----TKTDAVATIKYKVD 198

Query: 202 RGNSSQENLIINPFGFKVFDIQITD 226
S + + NP G++V +
Sbjct: 199 GTPSKEVDRFKNPLGYQVESYRADV 223


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
hp908_0540  TYPE4SSCAGX   30     0.015    Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein

signature.
Length = 522

Score = 29.8 bits (66), Expect = 0.015
Identities = 29/119 (24%), Positives = 54/119 (45%), Gaps = 16/119 (13%)

Query: 24 AINTALLPSEYKELVALGFKKIKTFYQRHDDKEITKEEKEFATNALREKLRNDRARAEQI 83
A+N AL+ +Y+E + K K D KE+ +++K EK + + +A++
Sbjct: 112 AVNFALMTRDYQEFL----KTKKLIVDAPDPKELEEQKKAL------EKEKEAKEQAQKA 161

Query: 84 QKNIEAFEKKNNSSVQKKAAKHRGLQELNETNANPLNDNPNGNSPTETKSNKDDNFDEM 142
QK+ K +++A L+ L +NP N + N N K +++ D+M
Sbjct: 162 QKD------KREKRKEERAKNRANLENLTNAMSNPQNLSNNKNLSELIKQQRENELDQM 214


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
hp908_0546  ACRIFLAVINRP  32     0.015    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 32.1 bits (73), Expect = 0.015
Identities = 20/88 (22%), Positives = 32/88 (36%), Gaps = 18/88 (20%)

Query: 19 EVQKRQFQKIEELKADMQKGINPFFKVLFDGGNRLFGFPETFIYSSI-------FILFVT 71
+ K K+ EL+ +G+ +D F+ SI F +
Sbjct: 301 DTAKAIKAKLAELQPFFPQGMK--VLYPYD--------TTPFVQLSIHEVVKTLFEAIML 350

Query: 72 IVLSVILF-QAYEPVLIVAIVIVLVALG 98
+ L + LF Q LI I + +V LG
Sbjct: 351 VFLVMYLFLQNMRATLIPTIAVPVVLLG 378


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
hp908_0550  TYPE4SSCAGA   1868   0.0      Type IV secretion system CagA exotoxin signature.
		>TYPE4SSCAGA#Type IV secretion system CagA exotoxin signature.

Length = 1147

Score = 1868 bits (4841), Expect = 0.0
Identities = 1041/1187 (87%), Positives = 1084/1187 (91%), Gaps = 43/1187 (3%)

Query: 1 MTNETIDQQPQTEAAFNPQQFINNLQVAFLKVDNAIASFDPDQKPIVDKNDRDNRQAFDG 60
MTNETIDQQPQTEAAFNPQQFINNLQVAFLKVDNA+AS+DPDQKPIVDKNDRDNRQAF+G
Sbjct: 1 MTNETIDQQPQTEAAFNPQQFINNLQVAFLKVDNAVASYDPDQKPIVDKNDRDNRQAFEG 60

Query: 61 ISQLREEYSNKAIKNPTKKNQYFSDFINKSSDLINKDNLIDVESSTKSFQKFGDQRYRIF 120
ISQLREEYSNKAIKNPTKKNQYFSDFINKS+DLINKDNLIDVESSTKSFQKFGDQRYRIF
Sbjct: 61 ISQLREEYSNKAIKNPTKKNQYFSDFINKSNDLINKDNLIDVESSTKSFQKFGDQRYRIF 120

Query: 121 TSWVSHQNDPSKINTRSIRNFMEHAIQPPIPDDKEKAEFLKSAKQSFAGIIIGNQIRTDQ 180
TSWVSHQNDPSKINTRSIRNFME+ IQPPI DDKEKAEFLKSAKQSFAGIIIGNQIRTDQ
Sbjct: 121 TSWVSHQNDPSKINTRSIRNFMENIIQPPILDDKEKAEFLKSAKQSFAGIIIGNQIRTDQ 180

Query: 181 KFMGVFDESLKERQEAEKNGGSTGGDWLDIFLSFIFDKKQSSDVKEAINQEPVPHVQPDI 240
KFMGVFDESLKERQEAEKNG TGGDWLDIFLSFIFDKKQSSDVKEAINQEPVPHVQPDI
Sbjct: 181 KFMGVFDESLKERQEAEKNGEPTGGDWLDIFLSFIFDKKQSSDVKEAINQEPVPHVQPDI 240

Query: 241 ATSTTHIQGLPPESRDLLDERGNFSKFTLGDMEMLDVEGVADMDPNYKFNQLLIHNNALS 300
AT+TT IQGLPPE+RDLLDERGNFSKFTLGDMEMLDVEGVAD+DPNYKFNQLLIHNNALS
Sbjct: 241 ATTTTDIQGLPPEARDLLDERGNFSKFTLGDMEMLDVEGVADIDPNYKFNQLLIHNNALS 300

Query: 301 SVLMGSHDGIEPEKVSLLYAGNGGFGDKHDWNATVGYKDQQGNNVATIINVHMKNGSGLI 360
SVLMGSH+GIEPEKVSLLY GNGG G +HDWNATVGYKDQQGNNVATIINVHMKNGSGL+
Sbjct: 301 SVLMGSHNGIEPEKVSLLYGGNGGPGARHDWNATVGYKDQQGNNVATIINVHMKNGSGLV 360

Query: 361 IAGGEKGINNPSFYLYKEDQLTGSQRALSQEEIQNKIDFMEFLAQNNAKLDSLSEKEKEK 420
IAGGEKGINNPSFYLYKEDQLTGSQRALSQEEIQNKIDFMEFLAQNNAKLD+LSEKEKEK
Sbjct: 361 IAGGEKGINNPSFYLYKEDQLTGSQRALSQEEIQNKIDFMEFLAQNNAKLDNLSEKEKEK 420

Query: 421 FKNEIKDFQKDSKPYLDALGNDRIAFVSKKDPKHSALITEFNKGDLSYTLKDYGKKADKA 480
F+ EIKDFQKDSK YLDALGNDRIAFVSKKD KHSALITEF GDLSYTLKDYGKKADKA
Sbjct: 421 FRTEIKDFQKDSKAYLDALGNDRIAFVSKKDTKHSALITEFGNGDLSYTLKDYGKKADKA 480

Query: 481 LDREKNVTLQGNLKHDGVMFVNYSNFKYTNASKSPNKGVGVTNGVSHLEAGFSKVAVFNL 540
LDREKNVTLQG+LKHDGVMFV+YSNFKYTNASK+PNKGVGVTNGVSHLE GF+KVA+FNL
Sbjct: 481 LDREKNVTLQGSLKHDGVMFVDYSNFKYTNASKNPNKGVGVTNGVSHLEVGFNKVAIFNL 540

Query: 541 PNLNNLAITSVVRRDLEDKLIAKGLSPQEANKLVKDFLSSNKELVGKALNFNKAVAEAKN 600
P+LNNLAITS VRR+LEDKL KGLSPQEANKL+KDFLSSNKELVGK LNFNKAVA+AKN
Sbjct: 541 PDLNNLAITSFVRRNLEDKLTTKGLSPQEANKLIKDFLSSNKELVGKTLNFNKAVADAKN 600

Query: 601 TGNYDEVKRAQKDLEKSLKKREHLEKDVAKNLESKSGNKNKMEVKSQANSQKDEIFALIN 660
TGNYDEVK+AQKDLEKSL+KREHLEK+V K LESKSGNKNKME K+QANSQKDEIFALIN
Sbjct: 601 TGNYDEVKKAQKDLEKSLRKREHLEKEVEKKLESKSGNKNKMEAKAQANSQKDEIFALIN 660

Query: 661 KEANRDARAIAYAQNLKDIKRELSDKLENISKDLKDFSKSFDEFKNGKSKDFSKVEETLK 720
KEANRDARAIAYAQNLK IKRELSDKLEN++K+LKDF KSFDEFKNGK+KDFSK EETLK
Sbjct: 661 KEANRDARAIAYAQNLKGIKRELSDKLENVNKNLKDFDKSFDEFKNGKNKDFSKAEETLK 720

Query: 721 ALKGSVKDLGINPEWISKVENLNAALNEFKNGKNKDFSKVTQAKSDLENSIKDVIINQKI 780
ALKGSVKDLGINPEWISKVENLNAALNEFKNGKNKDFSKVTQAKSDLENS+KDVIINQK+
Sbjct: 721 ALKGSVKDLGINPEWISKVENLNAALNEFKNGKNKDFSKVTQAKSDLENSVKDVIINQKV 780

Query: 781 TDKVDNLNQAVSMAKIAGNFSGVEQALADLKNFSKEQLAQQAQKNESFNVGK-SEIYQSV 839
TDKVDNLNQAVS+AK G+FS VEQALADLKNFSKEQLAQQAQKNES N K SEIYQSV
Sbjct: 781 TDKVDNLNQAVSVAKATGDFSRVEQALADLKNFSKEQLAQQAQKNESLNARKKSEIYQSV 840

Query: 840 KNGVNGTLVGNGLSGIEATALAKNFSDIKKELNEKFKNFNNNNNGLKNGKDKGPEEPIYA 899
KNGVNGTLVGNGLS EAT L+KNFSDIKKELN K NFNNNNN EPIYA
Sbjct: 841 KNGVNGTLVGNGLSQAEATTLSKNFSDIKKELNAKLGNFNNNNNNGLKN------EPIYA 894

Query: 900 QVNKKKTGQVASPEEPIYAQVAKKVTQKIDQLNQAASGFGGVGQ-AGFPLKRHDKVEDLS 958
+VNKKK GQ AS EEPIYAQVAKKV KID+LNQ ASG G VGQ AGFPLKRHDK
Sbjct: 895 KVNKKKAGQAASLEEPIYAQVAKKVNAKIDRLNQIASGLGVVGQAAGFPLKRHDK----- 949

Query: 959 KVGRSVSPEPIYATIDDLGGSFPLRRSAAVDDLSKVGRSREQELTQKIDNLSQAVSEAKA 1018
VDDLSKVG SR QEL QKIDNL+QAVSEAKA
Sbjct: 950 -----------------------------VDDLSKVGLSRNQELAQKIDNLNQAVSEAKA 980

Query: 1019 GFFGNLERTIDKLKDSTKNNPVNLWAENAKKVPASLSAKLDNYATNSHTRINSNIQNGAI 1078
GFFGNLE+TIDKLKDSTK+NP+NLW E+AKKVPASLSAKLDNYATNSH RINSNI+NGAI
Sbjct: 981 GFFGNLEQTIDKLKDSTKHNPMNLWVESAKKVPASLSAKLDNYATNSHIRINSNIKNGAI 1040

Query: 1079 NEKATGMLTQKNPEWLKLVNDKIVAHNVGSVPLSEYDKIGFNQKNMKDYSDSFKFSTKLN 1138
NEKATGMLTQKNPEWLKLVNDKIVAHNVGSVPLSEYDKIGFNQKNMKDYSDSFKFSTKLN
Sbjct: 1041 NEKATGMLTQKNPEWLKLVNDKIVAHNVGSVPLSEYDKIGFNQKNMKDYSDSFKFSTKLN 1100

Query: 1139 NAVKDVKSGFTQFLANAFSTG-YYSLARENAEHGIKNANTKGGFQKS 1184
NAVKD SGFTQFL NAFST YY LARENAEHGIKN NTKGGFQKS
Sbjct: 1101 NAVKDTNSGFTQFLTNAFSTASYYCLARENAEHGIKNVNTKGGFQKS 1147


7    hp908_0693    hp908_0703    Y        N        N        Genomic Island
LocusTag    DNBias  CDNBias  %GCBias     Product
hp908_0693  2       14       -3.101794   Iron III di citrate transport protein
hp908_0694  1       15       -4.143233   Ferrous iron transport protein B
hp908_0695  3       18       -3.179499   Poly saccharide biosynthesis protein
hp908_0696  6       21       -2.491735   putative type II DNA modification enzyme/methyl
hp908_0697  7       18       -0.427976   hypothetical protein
hp908_0698  6       19       0.065580    hypothetical protein
hp908_0699  5       17       2.263055    Acetone carboxylase gamma subunit
hp908_0700  4       16       2.622491    Acetone carboxylase alpha subunit/N-methyl
hp908_0701  3       16       2.171187    Acetone carboxylase alpha subunit/N-methyl
hp908_0703  1       14       3.358917    putative Outer membrane protein
8    hp908_0722    hp908_0740    Y        Y        N        Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
hp908_0722  3       13       -1.051155   RNA polymerase sigma-54 factor
hp908_0723  1       13       -0.014698   putative ATP-binding abc transporter
hp908_0724  1       13       -0.413926   ATPase
hp908_0725  0       14       -0.384068   DNA polymerase III subunits gamma and tau
hp908_0726  2       15       0.282947    LysE family Transporter
hp908_0727  0       15       0.589757    hypothetical protein
hp908_0728  1       14       -0.307832   hypothetical protein
hp908_0729  0       14       -1.443597   putative Outer membrane protein
hp908_0730  -1      13       -1.687212   outer membrane protein
hp908_0731  0       13       -1.906596   hypothetical protein
hp908_0732  1       12       -2.733312   tRNA di hydro uridine synthase B
hp908_0733  2       16       -4.061902   tRNA Ile-lysidine synthetase
hp908_0734  3       19       -4.154736   hypothetical protein
hp908_0735  6       23       -3.073729   hypothetical protein
hp908_0736  5       23       -3.049883   hypothetical protein
hp908_0737  6       25       -4.258426   hypothetical protein
hp908_0738  5       25       -4.178876   hypothetical protein
hp908_0739  5       22       -4.417110   hypothetical protein
hp908_0740  3       22       -4.392905   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
hp908_0727  SECA          28     0.014    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 27.9 bits (62), Expect = 0.014
Identities = 12/43 (27%), Positives = 23/43 (53%), Gaps = 2/43 (4%)

Query: 71 RIARKNLSKMSEEDFKKMREEVRK--ELEEKTKGLSDEEIKAK 111
++ K ++ ++MR+ V +E + + LSDEE+K K
Sbjct: 4 KLLTKVFGSRNDRTLRRMRKVVNIINAMEPEMEKLSDEELKGK 46
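
Each alignment in this report is headed by a "Score = ..., Expect = ..." line and an "Identities = ..., Positives = ..., Gaps = ..." line. The sketch below is one possible way to pull those numbers out of the text for downstream filtering; it is not part of PredictBias. The function names `parse_hit_summary` and `passes_threshold` are hypothetical, and treating the 0.05 RPSBlast E-value from the input parameters as an inclusive upper bound is an assumption.

```python
import re

# Patterns for the two summary lines that precede each alignment in this report.
SCORE_RE = re.compile(
    r"Score\s*=\s*([\d.]+)\s*bits\s*\((\d+)\),\s*Expect\s*=\s*(\S+)"
)
FRACTION_RE = re.compile(r"(Identities|Positives|Gaps)\s*=\s*(\d+)/(\d+)")


def parse_hit_summary(text):
    """Return bit score, raw score, E-value and any identity/positive/gap counts."""
    match = SCORE_RE.search(text)
    if match is None:
        raise ValueError("no 'Score = ...' line found")
    e_text = match.group(3)
    # Defensive: some BLAST versions print E-values like 'e-105' with no leading digit.
    if e_text.startswith("e"):
        e_text = "1" + e_text
    summary = {
        "bits": float(match.group(1)),
        "raw_score": int(match.group(2)),
        "evalue": float(e_text),
    }
    # Each fraction is stored as (count, alignment length); Gaps may be absent.
    for name, num, den in FRACTION_RE.findall(text):
        summary[name.lower()] = (int(num), int(den))
    return summary


def passes_threshold(summary, max_evalue=0.05):
    """Assumed rule: keep a hit when its E-value does not exceed the threshold."""
    return summary["evalue"] <= max_evalue


# Summary lines copied from the hp908_0727 / SECA hit above.
example = """Score = 27.9 bits (62), Expect = 0.014
Identities = 12/43 (27%), Positives = 23/43 (53%), Gaps = 2/43 (4%)"""
hit = parse_hit_summary(example)
print(hit, passes_threshold(hit))
```

Keeping the counts as (numerator, denominator) pairs rather than the printed percentages avoids depending on how the report rounds them.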


9    hp908_0882    hp908_0894    Y        N        N        Genomic Island
LocusTag    DNBias  CDNBias  %GCBias     Product
hp908_0882  2       19       -0.554211   CDP-diacylglycerol pyrophosphatase
hp908_0883  3       20       -0.041384   hypothetical protein
hp908_0884  2       15       2.189740    Alkylphosphonate utilization operon protein
hp908_0885  1       14       2.607129    hypothetical protein
hp908_0886  2       13       2.365182    hypothetical protein
hp908_0887  2       13       2.322278    hypothetical protein
hp908_0888  4       14       2.376843    Catalase
hp908_0889  3       14       2.536479    putative IRON-regulated outer membrane protein
hp908_0890  0       18       0.268762    Crossover junction endo deoxyribonuclease
hp908_0891  -1      15       -1.384961   hypothetical protein
hp908_0892  2       13       -1.346371   hypothetical protein
hp908_0893  2       12       -1.056551   putative JHP1044-like protein
hp908_0894  3       12       -1.261797   hypothetical protein
10   hp908_0990    hp908_1015    Y        Y        Y        Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
hp908_0990  3       15       1.340724    Cell division protein
hp908_0991  3       18       0.580100    Cell division protein
hp908_0992  3       19       -4.048843   hypothetical protein
hp908_0993  3       20       -5.033093   hypothetical protein
hp908_0994  5       18       -5.282085   hypothetical protein
hp908_0995  5       16       -5.154739   hypothetical protein
hp908_0996  7       20       -6.320746   hypothetical protein
hp908_0998  7       21       -6.963629   DNA topoisomerase I
hp908_0999  7       18       -6.948188   DNA topoisomerase I
hp908_1001  6       18       -7.056680   hypothetical protein
hp908_1002  3       24       -7.471964   conjugal plasmid transfer system protein
hp908_1004  4       28       -8.621797   hypothetical protein
hp908_1005  4       30       -9.663176   hypothetical protein
hp908_1006  5       30       -10.105778  hypothetical protein
hp908_1007  3       28       -8.237273   type II/IV secretion system ATP hydrolase
hp908_1008  3       28       -7.939110   type IV secretion ATPase
hp908_1009  3       26       -8.096771   hypothetical protein
hp908_1010  4       24       -8.047027   hypothetical protein
hp908_1011  2       21       -6.594189   hypothetical protein
hp908_1012  3       20       -5.173783   hypothetical protein
hp908_1013  1       19       -4.269054   hypothetical protein
hp908_1014  0       16       -3.921080   hypothetical protein
hp908_1015  0       19       -3.479783   integrase/recombinase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
hp908_0990  SHAPEPROTEIN  39     2e-05    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein

signature.
Length = 347

Score = 39.4 bits (92), Expect = 2e-05
Identities = 38/176 (21%), Positives = 66/176 (37%), Gaps = 12/176 (6%)

Query: 211 AASIATLSNDERELGVACVDMGGETCNLTIYSGNSIRYNKYLPVGSHHLTTDL------S 264
AA+I G VD+GG T + + S N + Y+ + +G + +
Sbjct: 146 AAAIGAGLPVSEATGSMVVDIGGGTTEVAVISLNGVVYSSSVRIGGDRFDEAIINYVRRN 205

Query: 265 HMLNTPFPYAEEVKIKYGDLSFESGTETPSQSVQIPTTGSDGNESHIVPLSEIQTIMRER 324
+ AE +K + G S G E V+ + +EI ++E
Sbjct: 206 YGSLIGEATAERIKHEIG--SAYPGDEVREIEVRGRNLAEGVPRGFTLNSNEILEALQEP 263

Query: 325 ALETFKIIHRSIQDSGLE---EHLGGGVVLTGGMALMKGIKELARTHFTNYPVRLA 377
+ +++ E + G+VLTGG AL++ + L T PV +A
Sbjct: 264 LTGIVSAVMVALEQCPPELASDISERGMVLTGGGALLRNLDRLLM-EETGIPVVVA 318


11   hp908_1095    hp908_1102    Y        N        N        Genomic Island
LocusTag    DNBias  CDNBias  %GCBias     Product
hp908_1095  2       11       0.270097    Gluco kinase
hp908_1096  5       12       -1.265644   Alcohol dehydrogenase
hp908_1097  5       12       -1.863848   putative lipo polysaccharide biosynthesis
hp908_1098  4       14       -1.348952   putative lipo polysaccharide biosynthesis
hp908_1099  5       14       0.781805    hypothetical protein
hp908_1100  3       12       2.673558    hypothetical protein
hp908_1101  2       14       3.145524    hypothetical protein
hp908_1102  0       14       3.141493    putative Outer membrane protein
12   hp908_1117    hp908_1124    Y        Y        N        Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
hp908_1117  2       17       -0.848056   DNA-cytosine methyl transferase
hp908_1118  3       14       -0.582080   FlgM protein
hp908_1119  3       11       -1.605308   hypothetical protein
hp908_1120  4       11       -1.377281   FKBP-type peptidyl-prolyl cis-trans isomerase
hp908_1121  4       12       -2.205176   hypothetical protein
hp908_1122  4       13       -1.824007   peptidoglycan associated lipoprotein precursor
hp908_1123  2       14       0.154593    tolB protein precursor
hp908_1124  2       18       0.303957    TonB-like putative protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
hp908_1121  IGASERPTASE   30     0.022    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 29.6 bits (66), Expect = 0.022
Identities = 28/150 (18%), Positives = 51/150 (34%), Gaps = 26/150 (17%)

Query: 52 SQVEANTQAQEGLRSVYEGQANKIKDLNNAILSQEESLRALKASQEVQANTLKQQSQTLE 111
+ + Q + SV + + + + + E A KQ+S+T+E
Sbjct: 995 TNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPS--ETTETVAENSKQESKTVE 1052

Query: 112 DLRNEIRANQQAIQQLDKQNKEMSELLTKLSQDLVSQIALIQKALKEQEEKAEKPLKSSA 171
+NE A + Q + + S + + V+Q E +E
Sbjct: 1053 --KNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGS------ETKETQTT------ 1098

Query: 172 PANKNPPLKAETTKNQEKKTQEKAKVEFDK 201
ET + + +EKAKVE +K
Sbjct: 1099 ----------ETKETATVEKEEKAKVETEK 1118


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
hp908_1122  OMPADOMAIN    136    4e-42    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 136 bits (345), Expect = 4e-42
Identities = 46/162 (28%), Positives = 71/162 (43%), Gaps = 24/162 (14%)

Query: 2 AGDVSAKTVQTAPV-TTEPAPEKEEPKQEPAPVVEEKPAIESGTIIASIYFDFDKYEIKE 60
+ VS + Q PAP PAP V+ K T+ + + F+F+K +K
Sbjct: 184 SLGVSYRFGQGEAAPVVAPAPA-------PAPEVQTK----HFTLKSDVLFNFNKATLKP 232

Query: 61 SDQETLDEIVQKAKE---NHMQVLLEGNTDEFGSSEYNQALGVKRTLSVKNALVIKGVEK 117
Q LD++ + V++ G TD GS YNQ L +R SV + L+ KG+
Sbjct: 233 EGQAALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAYNQGLSERRAQSVVDYLISKGIPA 292

Query: 118 DMIKTISFGETKPKCVQ-----KTR----ECYRENRRVDVKL 150
D I GE+ P K R +C +RRV++++
Sbjct: 293 DKISARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEV 334


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
hp908_1124  TYPE4SSCAGA   32     0.002    Type IV secretion system CagA exotoxin signature.
		>TYPE4SSCAGA#Type IV secretion system CagA exotoxin signature.

Length = 1147

Score = 32.4 bits (73), Expect = 0.002
Identities = 36/139 (25%), Positives = 64/139 (46%), Gaps = 12/139 (8%)

Query: 32 KEAEKILLDLNKKDEQAID--LNLEDLPSEKKNE-KIEKVTEKQGDF---LEPKEEPKEE 85
+EA K++ D +++ + LN ++ KN ++V + Q D L +E ++E
Sbjct: 568 QEANKLIKDFLSSNKELVGKTLNFNKAVADAKNTGNYDEVKKAQKDLEKSLRKREHLEKE 627

Query: 86 PEESLEDIFSSLNDFQEKTDKNAQKDE-----QKNEQEEQRRLREQQRLKQ-NQENQEML 139
E+ LE + N + K N+QKDE K + R + Q LK +E + L
Sbjct: 628 VEKKLESKSGNKNKMEAKAQANSQKDEIFALINKEANRDARAIAYAQNLKGIKRELSDKL 687

Query: 140 KGLQQNLNQFTQKLESVKN 158
+ + +NL F + + KN
Sbjct: 688 ENVNKNLKDFDKSFDEFKN 706


13   hp908_1133    hp908_1138    Y        Y        N        Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
hp908_1133  0       19       -4.369729   ATP synthase B chain
hp908_1134  1       17       -3.770324   Chromosome/plasmid partitioning protein
hp908_1135  1       17       -4.141928   Chromosome/plasmid partitioning protein
hp908_1136  1       18       -4.700263   Biotin-protein ligase
hp908_1137  2       19       -4.940722   Methionyl-tRNA formyl transferase
hp908_1138  2       20       -5.102350   trans-Golgi membrane protein p230-like protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
hp908_1137  FERRIBNDNGPP  33     0.001    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 33.0 bits (75), Expect = 0.001
Identities = 12/33 (36%), Positives = 20/33 (60%)

Query: 72 EPEVQILKDLKPDFIVVVAYGKILPKEVLAIAP 104
EP +++L ++KP F+V A P+ + IAP
Sbjct: 86 EPNLELLTEMKPSFMVWSAGYGPSPEMLARIAP 118


14   hp908_1231    hp908_1255    Y        Y        N        Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
hp908_1231  2       12       0.288176    DNA polymerase III subunit delta
hp908_1232  1       12       0.089768    Alternative dihydrofolate reductase 2
hp908_1233  2       14       0.683609    hypothetical protein
hp908_1234  1       12       0.323233    hypothetical protein
hp908_1235  -1      11       0.412099    Integral membrane protein
hp908_1236  0       11       0.168707    integral membrane protein
hp908_1237  -1      10       1.633709    hypothetical protein
hp908_1238  0       10       1.972540    Carbamoyl-phosphate synthase small chain
hp908_1240  1       12       0.977304    hypothetical protein
hp908_1241  1       12       0.223529    hypothetical protein
hp908_1242  2       11       -0.324124   Maf-like protein
hp908_1243  3       11       -0.570883   Alanyl-tRNA synthetase
hp908_1244  5       14       -2.727680   hypothetical protein
hp908_1245  5       14       -2.257379   hypothetical protein
hp908_1246  3       12       -1.260251   hypothetical protein
hp908_1247  2       11       -0.501756   3'-to-5' exo ribonuclease RNase R
hp908_1248  0       11       0.375491    Shikimate 5-dehydrogenase I alpha
hp908_1249  -1      11       0.013535    hypothetical protein
hp908_1250  -1      11       0.618137    hypothetical protein
hp908_1251  0       10       0.637164    Oligo peptide transport system permease protein
hp908_1252  0       11       1.058885    Oligopeptide ABC transporter
hp908_1253  1       13       0.954944    Tryptophanyl-tRNA synthetase
hp908_1254  2       15       1.674402    biotin synthesis protein
hp908_1255  3       17       2.511687    Preprotein translocase subunit
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
hp908_1234  TYPE3IMSPROT  24     0.032    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein

family signature.
Length = 354

Score = 23.6 bits (51), Expect = 0.032
Identities = 7/13 (53%), Positives = 10/13 (76%)

Query: 5 FYKELKMDKQKVK 17
+ KELKM K ++K
Sbjct: 210 YIKELKMSKDEIK 222


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
hp908_1241  adhesinmafb   30     0.002    Neisseria meningitidis: adhesin MafB signature.
		>adhesinmafb#Neisseria meningitidis: adhesin MafB signature.

Length = 467

Score = 30.4 bits (68), Expect = 0.002
Identities = 17/50 (34%), Positives = 22/50 (44%), Gaps = 2/50 (4%)

Query: 32 MEEIENSDPNQNNPFITA--AMGIGGAAISIFFPNTKPIVDGIKPLAEKG 79
ME I NPFI+A A+GIG + K + I PL +G
Sbjct: 225 MEFINGVAAGALNPFISAGEALGIGDILYGTRYAIDKAAMRNIAPLPAEG 274


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
hp908_1244  PF05844       25     0.039    YopD protein
		>PF05844#YopD protein

Length = 295

Score = 24.6 bits (53), Expect = 0.039
Identities = 12/65 (18%), Positives = 28/65 (43%), Gaps = 1/65 (1%)

Query: 10 SVLKANNPHFDKIFEKHNQLDDDIKTAEQQNASDAEVSHMKKQKLKLKDEIHSMIIEYRE 69
L+A F+ + I++ Q + +V + Q ++E+++ I + +
Sbjct: 197 VALRAAGRAFESRNGALQVANTVIQSFVQMANASVQVRQGESQASAREEEVNATIGQ-SQ 255

Query: 70 KQKSD 74
KQK +
Sbjct: 256 KQKVE 260


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
hp908_1255  SECGEXPORT    49     4e-10    Protein-export SecG membrane protein signature.
		>SECGEXPORT#Protein-export SecG membrane protein signature.

Length = 110

Score = 48.8 bits (116), Expect = 4e-10
Identities = 25/84 (29%), Positives = 47/84 (55%), Gaps = 3/84 (3%)

Query: 1 MTSALLGLQIVLAVLIVVVVLLQ--KSSSIGLGAYSGSNDSLFGAKGPASFMAKLTMFLG 58
M ALL + +++A+ +V +++LQ K + +G +G++ +LFG+ G +FM ++T L
Sbjct: 1 MYEALLVVFLIVAIGLVGLIMLQQGKGADMGASFGAGASATLFGSSGSGNFMTRMTALLA 60

Query: 59 LLFVINTIALGYFYNKEYGKSVLD 82
LF I ++ LG N +
Sbjct: 61 TLFFIISLVLGNI-NSNKTNKGSE 83


15   hp908_1381    hp908_1398    Y        Y        N        Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
hp908_1381  -2      10       -3.085567   Prephenate and/or arogenate dehydrogenase
hp908_1382  -2      10       -3.764676   putative endonuclease
hp908_1383  -1      12       -4.223505   type III restriction-modification system
hp908_1384  1       12       -3.203008   type III restriction-modification system DNA
hp908_1385  -1      12       -2.960816   type III restriction-modification system DNA
hp908_1386  -1      13       -2.680756   Biotin synthase
hp908_1387  1       14       -4.777090   Ribonuclease BN
hp908_1388  1       14       -5.272821   hypothetical protein
hp908_1389  1       15       -4.729449   hypothetical protein
hp908_1390  2       15       -4.020486   hypothetical protein
hp908_1391  1       16       -3.500504   hypothetical protein
hp908_1392  2       16       -3.480770   hypothetical protein
hp908_1393  1       16       -2.896730   hypothetical protein
hp908_1394  0       15       -2.735693   hypothetical protein
hp908_1395  -1      14       -3.781282   hypothetical protein
hp908_1396  0       18       -4.767535   NADPH-dependent 7-cyano-7-deazaguanine
hp908_1397  -1      17       -4.176022   Iojap like protein
hp908_1398  -1      18       -4.179118   tRNA delta 2-isopentenyl pyrophosphate
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
hp908_1381  SHIGARICIN    29     0.024    Ribosome inactivating protein family signature.
		>SHIGARICIN#Ribosome inactivating protein family signature.

Length = 289

Score = 28.6 bits (64), Expect = 0.024
Identities = 11/74 (14%), Positives = 19/74 (25%), Gaps = 5/74 (6%)

Query: 83 TPIKKSTTIIDLGGAKAQILHNIPKSIRKNFIAAHPMCGTEFYGPKASVKGLYENALVIL 142
P + L GA + ++RK + Y L
Sbjct: 18 APAVEGDVSFRLSGATSSSYGVFISNLRKALPYERKLYDIPLLRSTLPGSQRY-----AL 72

Query: 143 CDLEDSGTEQVEIA 156
L + E + +A
Sbjct: 73 IHLTNYADETISVA 86


16   hp908_1449    hp908_1461    Y        Y        N        Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
hp908_1449  2       12       0.406428    Outer membrane protein
hp908_1450  2       11       0.322055    Branched-chain aminoacid amino transferase
hp908_1451  1       10       -1.308206   putative Outer membrane protein
hp908_1452  1       11       -1.431005   DNA polymerase I
hp908_1453  1       16       -1.587686   type II S restriction enzyme R protein
hp908_1454  0       16       -1.062350   type II S restriction enzyme R protein
hp908_1455  0       16       0.093194    type II S restriction enzyme M protein
hp908_1456  2       17       -0.034659   type II S restriction enzyme M protein
hp908_1457  2       15       1.108688    hypothetical protein
hp908_1458  2       12       0.214153    Thymidylate kinase
hp908_1459  2       12       0.624911    Phospho pantetheine adenylyltransferase
hp908_1460  2       12       0.732518    3-polyprenyl-4-hydroxy benzoate carboxy-lyase
hp908_1461  2       11       0.238613    Flagellar basal-body P ring formation protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
hp908_1459  LPSBIOSNTHSS  225    9e-79    Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein

signature.
Length = 166

Score = 225 bits (574), Expect = 9e-79
Identities = 63/147 (42%), Positives = 93/147 (63%)

Query: 4 IGIYPGTFDPVTNGHIDIIHRSSELFEELIVAVAHSSAKNPMFSLDERLKMMQLATKSFK 63
IYPG+FDP+T GH+DII R LF+++ VAV + K PMFS+ ERL+ + A
Sbjct: 2 NAIYPGSFDPITFGHLDIIERGCRLFDQVYVAVLRNPNKQPMFSVQERLEQIAKAIAHLP 61

Query: 64 NVECVAFEGLLANLAKEYHCKVLVRGLRVVSDFEYELQMGYANKSLNHELETLYFMPTLQ 123
N + +FEGL N A++ ++RGLRV+SDFE ELQM NK+L +LET++ + +
Sbjct: 62 NAQVDSFEGLTVNYARQRQAGAILRGLRVLSDFELELQMANTNKTLASDLETVFLTTSTE 121

Query: 124 NAFISSSIVRSIIAHKGDASHLVPEEI 150
+F+SSS+V+ + G+ H VP +
Sbjct: 122 YSFLSSSLVKEVARFGGNVEHFVPSHV 148


17   hp908_1493    hp908_1538    Y        Y        Y        Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
hp908_1493  2       13       2.889584    Saccharopine dehydrogenase
hp908_1494  1       11       2.146051    ferredoxin-like protein
hp908_1495  -1      9        1.919985    Acyl-phosphate:glycerol-3-phosphate O-acyl
hp908_1496  -1      11       1.925777    Dihydroneopterin aldolase
hp908_1497  -1      10       2.119210    hypothetical protein
hp908_1498  -2      11       0.034891    iron-regulated outer membrane protein
hp908_1499  -1      15       -2.548408   Selenocysteine synthase
hp908_1500  -1      10       -3.023788   Transcription termination protein
hp908_1502  0       11       -3.336287   hypothetical protein
hp908_1503  0       11       -3.758283   hypothetical protein
hp908_1504  0       9        -3.417540   Putative type IIS restriction/modification
hp908_1505  1       11       -3.594959   putative type I DNA modification enzyme
hp908_1506  1       10       -3.119893   type III restriction-modification system
hp908_1507  1       10       -2.789911   adenine specific DNA methyl transferase
hp908_1508  1       11       -2.506033   type III restriction-modification system
hp908_1509  1       12       -1.651905   ATP-dependent DNA helicase
hp908_1510  2       17       -0.975496   hypothetical protein
hp908_1511  0       14       -0.791800   Outer membrane protein
hp908_1512  0       12       -1.123137   Exo deoxyribonuclease III
hp908_1513  1       13       -0.134458   *hypothetical protein
hp908_1514  2       16       -0.280747   hypothetical protein
hp908_1515  2       17       -0.150291   Chromosomal replication initiator protein
hp908_1516  2       20       -1.991248   purine nucleoside phosphorylase
hp908_1517  3       14       -2.461700   hypothetical protein
hp908_1518  2       14       -1.676636   Glucosamine-fructose-6-phosphate amino
hp908_1519  0       14       -2.933584   Thymidylate synthase
hp908_1520  0       14       -3.073729   cagY like protein
hp908_1521  -2      12       -0.439370   type I restriction-modification system
hp908_1522  -2      13       0.410803    type I restriction-modification system
hp908_1523  -2      12       1.038646    type I restriction-modification system DNA-methyl
hp908_1524  -1      12       0.766817    type I restriction-modification system
hp908_1525  3       16       3.636584    Putative predicted metal-dependent hydrolase
hp908_1526  3       15       4.557028    Iron III dicitrate transport protein
hp908_1527  -1      11       0.541740    Iron III dicitrate transport protein
hp908_1528  0       9        0.547097    hypothetical protein
hp908_1529  -1      9        0.874329    Arginase
hp908_1530  -3      9        0.900691    Alanine dehydrogenase
hp908_1531  -1      9        -1.226438   Alcohol dehydrogenase
hp908_1532  1       9        -1.957898   hypothetical protein
hp908_1533  1       11       -1.208383   outer membrane protein
hp908_1534  3       14       -1.261725   NAD kinase
hp908_1535  3       12       -3.057714   DNA repair protein
hp908_1536  3       15       -4.623099   Fibronectin/fibrinogen-binding protein
hp908_1537  2       15       -1.751541   hypothetical protein
hp908_1538  3       13       -1.388626   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
hp908_1515  HTHFIS        35     5e-04    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 35.2 bits (81), Expect = 5e-04
Identities = 9/51 (17%), Positives = 24/51 (47%), Gaps = 4/51 (7%)

Query: 127 TVYEIAKKVAQSDTPPYNPVLFYGGTGLGKTHILNAIGNHALEKHKKVVLV 177
+Y + ++ Q+D ++ G +G GK + A+ ++ ++ V +
Sbjct: 148 EIYRVLARLMQTDLT----LMITGESGTGKELVARALHDYGKRRNGPFVAI 194


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
hp908_1536  FbpA_PF05833  112    6e-29    Fibronectin-binding protein
		>FbpA_PF05833#Fibronectin-binding protein

Length = 577

Score = 112 bits (282), Expect = 6e-29
Identities = 72/358 (20%), Positives = 141/358 (39%), Gaps = 25/358 (6%)

Query: 97 AKDLAYKSETFILRLEMIPKKANLMILDQEKCVIEA--FRFNDRVAKNDILGALPPNIYE 154
+ ++ ++ + + L + K + + I++ F FN N +G N+
Sbjct: 209 SSEICFRLKNNSIDLSLSNLKEIVEVCKDLFKEIQSNKFEFNCYTKNNSFVGFYCLNLMS 268

Query: 155 HQEEDLDFKDLLDILEKDFLSYQ--HKELEHKKNQIIKRLNAQKERLKEKLEKLEDPKNL 212
++ D L ++F + L+ K + + K + R +K + L +
Sbjct: 269 KEDYKKIQYDSSSKLLENFYYAKDKSDRLKSKSSDLQKIVMNNINRCTKKDKILNNTLKK 328

Query: 213 QLEAKELQTQASLLLTYQHLIHKHESCVILKDFED---KECMIEIDKSMPLNAFINKKFT 269
+ + LL + + K S + L ++ I +D++ + + +
Sbjct: 329 CEDKDIFKLYGELLTANIYALKKGLSHIELANYYSENYDTVKITLDENKTPSQNVQSYYK 388

Query: 270 LSKKKKQKSQFLYLEEENLKEKIAFKENQINYVRDAAEESVLE------------MFMPV 317
K K+ + + +E++ + + + + +A +E F +
Sbjct: 389 KYNKLKKSEEAANEQLLQNEEELNYLYSVLTNINNADNYDEIEEIKKELIETGYIKFKKI 448

Query: 318 KNSKIKRPMNGYEVLYYKDFKIGLGKNQKENIKL-LQDARANDLWMHVRNIPGSHLIVFC 376
SK + + I +GKN +N L L+ A +D+W H +NIPGSH+IV
Sbjct: 449 YKSKKSKTSKPMHFISKDGIDIYVGKNNIQNDYLTLKFANKHDIWFHTKNIPGSHVIVKN 508

Query: 377 QKNTPKDEIILELAKMLIKMQKDAFNS-YEIDYTQRKFVKIIKGAN---VIYSKYRTI 430
+ P + +LE A + K +S +DYT+ K VK GA VIYS +TI
Sbjct: 509 IMDIP-ESTLLEAANLAAYYSKSQNSSNVPVDYTEVKNVKKPNGAKPGMVIYSTNQTI 565



Score = 35.2 bits (81), Expect = 5e-04
Identities = 19/92 (20%), Positives = 48/92 (52%), Gaps = 5/92 (5%)

Query: 46 SAPYIGLSKKPPESVLKNTLALDFCLNKFTKNAKILQANVIDNDRI--LEIKGAKDLAYK 103
+ P I L+ + +K + L K+ NAKI+ + I+ DRI ++ + +L +
Sbjct: 55 NYPRIHLTDLTKPNPIKAPMFCMV-LRKYISNAKIVDIHQINQDRIVVIDFESTDELGFN 113

Query: 104 SETFILRLEMIPKKANLMILD-QEKCVIEAFR 134
S L +E++ + +N+ ++ ++ ++++ +
Sbjct: 114 SIY-SLIIEIMGRHSNMTLIRKRDNIIMDSIK 144


18   hp908_0042    hp908_0048    N        Y        N        Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
hp908_0042  -2      13       0.435148    DNA transformation competency
hp908_0043  -2      14       0.482272    DNA transformation competency
hp908_0044  -1      14       0.808003    inner membrane protein
hp908_0045  -1      16       0.388719    Mannose-6-phosphate
hp908_0046  -1      13       1.116209    GDP-mannose 4,6 dehydratase
hp908_0047  -1      11       1.242883    GDP-mannose 4,6 dehydratase
hp908_0048  0       13       1.309185    putative fucose synthetase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
hp908_0042  PF04335       133    2e-40    VirB8 type IV secretion protein
		>PF04335#VirB8 type IV secretion protein

Length = 227

Score = 133 bits (335), Expect = 2e-40
Identities = 36/202 (17%), Positives = 72/202 (35%), Gaps = 4/202 (1%)

Query: 40 QSVFRLERNRLKIAYRLLGLMSFIALVLAIVLISILPLQKTEHHF--VDFLNQDKHYAII 97
+ K+A+ + G+ +A + + ++ PL+ E + VD + A
Sbjct: 22 RDKLAAAERSKKLAWVVAGVAGALATAGVVAVAALTPLKTVEPYVITVDRNTGEASIAAK 81

Query: 98 QRADKSISSNEALARSLIGAYVLNRESINRIDDKSRYELVRLQSSSKVWQRFEDLIKAQN 157
D +I+ +EA+ + + YV RE + ++ V + S+ R+ K N
Sbjct: 82 LHGDATITYDEAVRKYFLATYVRYREGWIAAAREEYFDAVMVMSARPEQDRWSRFYKTDN 141

Query: 158 SIYAQSHLEREVHI-VNIAIYQQDNNPIASVSIAAKLLNENKLVYEKRYKIVLSYLFDTP 216
Q+ L + V I +A V + + + + + Y D
Sbjct: 142 PQSPQNILANRTDVFVEIKRVSFLGGNVAQVYFTKESVTGSNST-KTDAVATIKYKVDGT 200

Query: 217 DFDYASMPKNPTGFKITRYSIT 238
KNP G+++ Y
Sbjct: 201 PSKEVDRFKNPLGYQVESYRAD 222


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
hp908_0043  TYPE4SSCAGX   32     0.003    Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein

signature.
Length = 522

Score = 32.1 bits (72), Expect = 0.003
Identities = 27/70 (38%), Positives = 37/70 (52%), Gaps = 8/70 (11%)

Query: 200 KEKEEETIIIGDNTNAMKIIKKDIQKGYKALKSSQ--RKWYCLWACSKKSKLSLMPKEIF 257
K +EE+ II D A+ + Q + ALK + R + A K+SK +MP EIF
Sbjct: 367 KIREEKQKIILDQAKAL-----ETQYVHNALKRNPVPRNYNYYQAPEKRSK-HIMPSEIF 420

Query: 258 NDKQFTYFKF 267
+D FTYF F
Sbjct: 421 DDGTFTYFGF 430


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
hp908_0046  NUCEPIMERASE  87     1e-22    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 86.8 bits (215), Expect = 1e-22
Identities = 46/180 (25%), Positives = 72/180 (40%), Gaps = 19/180 (10%)

Query: 7 LITGVTGQDGSYLAEYLLNLGYEVHGLKRRSSSINTSRIDHLYEDLHSEHKRRFFLHYGD 66
L+TG G G ++++ LL G++V G+ + + S E L F H D
Sbjct: 4 LVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQP---GFQFHKID 60

Query: 67 MTDSSNLIHLIATTKPTEIYNLAAQSHVKVSFETPEYTANADGIGTLRILEAMRILGLEK 126
+ D + L A+ ++ + V+ S E P A+++ G L ILE R ++
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQ- 119

Query: 127 KTRFYQASTSELYGEVLETPQNENTPF-------NPRSPYAVAKMYAFYITKNYREAYNL 179
AS+S +YG N PF +P S YA K + Y Y L
Sbjct: 120 --HLLYASSSSVYGL------NRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYGL 171


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
hp908_0048  NUCEPIMERASE  47     4e-08    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 47.1 bits (112), Expect = 4e-08
Identities = 52/353 (14%), Positives = 107/353 (30%), Gaps = 68/353 (19%)

Query: 5 ILITGAYGMVGQNTALYFKKNKPDV-----------TLLTPKKSELY-----------LL 42
L+TGA G +G + + + V L + EL L
Sbjct: 3 YLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDLA 62

Query: 43 DKDNVQAYLKEYKPTGIIHCAGRVGGIVANMNDLSTYMVENLLMGLYLFSSALDLGVKKA 102
D++ + + R + ++ + Y NL L + ++
Sbjct: 63 DREGMTDLFASGHFERVFISPHR-LAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQHL 121

Query: 103 INLASSCAYPKYAPNPLKESDLLNGSLEPTNEGYALAKLSVMKYCEYMSTEKGVFYKTLV 162
+ +SS Y P D ++ + YA K + S G+ L
Sbjct: 122 LYASSSSVYGLNRKMPFSTDDSVDHPVSL----YAATKKANELMAHTYSHLYGLPATGLR 177

Query: 163 PCNLYGE-------FDKFEEKIAHMIPGLIARMHTAKLKGEKNFAMWGDGTARREYLNAK 215
+YG KF + + L+G+ ++ G +R++
Sbjct: 178 FFTVYGPWGRPDMALFKFTKAM---------------LEGKS-IDVYNYGKMKRDFTYID 221

Query: 216 DLARFIALAYENIAQIPS-----------------VMNVGSGVDYSIEEYYEKVAQVLDY 258
D+A I + I + V N+G+ + +Y + + L
Sbjct: 222 DIAEAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGI 281

Query: 259 KGVFVKDLSKPVGMQQKLMDISK-QKALKWELEIPLEQGIKEAYEYYLKLLEV 310
+ +P + + D + + + E ++ G+K +Y +V
Sbjct: 282 EAKKNMLPLQPGDVLETSADTKALYEVIGFTPETTVKDGVKNFVNWYRDFYKV 334


19  hp908_0256  hp908_0263  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
hp908_0256  -2      14       1.201192   non-specific DNA binding protein/Iron-binding
hp908_0257  -2      13       1.107611   flagellar sensory histidine kinase
hp908_0258  -2      12       1.783961   hypothetical protein
hp908_0259  -3      11       2.165218   flagellar P-ring protein
hp908_0260  -2      9        1.919852   cold-shock DEAD-box protein A
hp908_0261  -2      9        1.613803   membrane protease subunit, stomatin/prohibitin
hp908_0262  -2      8        1.486954   hypothetical protein
hp908_0263  -3      10       1.445047   oligo peptide transport ATP-binding protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
hp908_0256  HELNAPAPROT   149    3e-49    Helicobacter neutrophil-activating protein A family ...
		>HELNAPAPROT#Helicobacter neutrophil-activating protein A family signature.

Length = 153

Score = 149 bits (377), Expect = 3e-49
Identities = 38/140 (27%), Positives = 75/140 (53%), Gaps = 1/140 (0%)

Query: 5 EILKHLQADAIVLFMKVHNFHWNVKGTDFFNVHKATEEIYEEFADMFDDLAERIVQLGHH 64
L ++ +L+ K+H FHW VKG FF +H+ EE+Y+ A+ D +AER++ +G
Sbjct: 15 NSLNTQLSNWFLLYSKLHRFHWYVKGPHFFTLHEKFEELYDHAAETVDTIAERLLAIGGQ 74

Query: 65 PLVTLSEALKLTRVKEETKTSFHSKDIFKEILGDYKYLEKEFEELSNTAEKEGDKVTVTY 124
P+ T+ E + + + + + ++ + ++ DYK + E + + AE+ D T
Sbjct: 75 PVATVKEYTEHASITDGGNET-SASEMVQALVNDYKQISSESKFVIGLAEENQDNATADL 133

Query: 125 ADDQLAKLQKSIWMLEAHLA 144
+ +++K +WML ++L
Sbjct: 134 FVGLIEEVEKQVWMLSSYLG 153


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
hp908_0257  PF06580       30     0.015    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 30.2 bits (68), Expect = 0.015
Identities = 10/71 (14%), Positives = 25/71 (35%), Gaps = 13/71 (18%)

Query: 281 IVLQNFLYNAIDAIEALEESEQ-GQVKIEAFIQNEFIVFTIIDNGKEVENKSALFEPFET 339
+++Q + N I + + Q G++ ++ N + + + G +
Sbjct: 258 MLVQTLVENGI--KHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTK------- 308

Query: 340 TKLKGNGLGLA 350
+ G GL
Sbjct: 309 ---ESTGTGLQ 316


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
hp908_0259  FLGPRINGFLGI  363    e-127    Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 363 bits (934), Expect = e-127
Identities = 118/345 (34%), Positives = 192/345 (55%), Gaps = 26/345 (7%)

Query: 19 AEKIGDIASVVGVRDNQLIGYGLVIGLNGTGDK-SGSKFTMQSISNMLESVNVKISADDI 77
+I DIAS+ RDNQLIGYGLV+GL GTGD S FT QS+ ML+++ +
Sbjct: 28 TSRIKDIASLQAGRDNQLIGYGLVVGLQGTGDSLRSSPFTEQSMRAMLQNLGITTQGGQS 87

Query: 78 KSKNVAAVMITASLPPFARQGDKIDIHISSIGDAKSIQGGTLVMTPLNAVDGNIYALAQG 137
+KN+AAVM+TA+LPPFA G ++D+ +SS+GDA S++GG L+MT L+ DG IYA+AQG
Sbjct: 88 NAKNIAAVMVTANLPPFASPGSRVDVTVSSLGDATSLRGGNLIMTSLSGADGQIYAVAQG 147

Query: 138 AIVSGNSS-----------NLLSANIINGATIEREVSYDLFHKNAMTLSLKNPNFKNAIQ 186
A++ S SA + NGA IERE+ + L L+NP+F A++
Sbjct: 148 ALIVNGFSAQGDAATLTQGVTTSARVPNGAIIERELPSKFKDSVNLVLQLRNPDFSTAVR 207

Query: 187 VQNTLNKV----FGNKVAIALDPKTIQITRPERLSMVEFLALVQEIPINYSAKNKIIVDE 242
V + +N +G+ +A D + I + +P + +A ++ + + K++++E
Sbjct: 208 VADVVNAFARARYGDPIAEPRDSQEIAVQKPRVADLTRLMAEIENLTVETDTPAKVVINE 267

Query: 243 KSGTIVSGVDIIVHPIVVTSQDITLKITKEP--------LNDSKNTQDLDNNMSLDTAHN 294
++GTIV G D+ + + V+ +T+++T+ P Q + M++
Sbjct: 268 RTGTIVIGADVRISRVAVSYGTLTVQVTESPQVIQPAPFSRGQTAVQPQTDIMAMQEGSK 327

Query: 295 TLSSNGKNITIAGVVKALQKIGVSAKGMVSILQALKKSGAISAEM 339
G ++ +V L IG+ A G+++ILQ +K +GA+ AE+
Sbjct: 328 VAIVEGPDLR--TLVAGLNSIGLKADGIIAILQGIKSAGALQAEL 370


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
hp908_0260  SECA          29     0.050    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 29.1 bits (65), Expect = 0.050
Identities = 16/63 (25%), Positives = 31/63 (49%), Gaps = 2/63 (3%)

Query: 261 IVFTRTKKEADELHQFLASKNYKSTALHGDMDQRDRRSSIMAFKKNDADVLVATDVASRG 320
+V T + ++++ + L K L+ + ++I+A A V +AT++A RG
Sbjct: 453 LVGTISIEKSELVSNELTKAGIKHNVLNAKFHANE--AAIVAQAGYPAAVTIATNMAGRG 510

Query: 321 LDI 323
DI
Sbjct: 511 TDI 513


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
hp908_0263  HTHFIS        32     0.006    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 32.1 bits (73), Expect = 0.006
Identities = 16/50 (32%), Positives = 21/50 (42%), Gaps = 7/50 (14%)

Query: 30 VAIVGESGSGKSSIANLIMRLNPR----FKPHNGEILFETTNLLKESEAF 75
+ I GESG+GK +A + R F N + L ESE F
Sbjct: 163 LMITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPRD---LIESELF 209


20  hp908_0357  hp908_0370  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
hp908_0357  -2      10       0.193774   CTP synthase
hp908_0358  -1      10       0.518585   hypothetical protein
hp908_0359  -2      9        1.163342   hypothetical protein
hp908_0360  -2      9        0.916805   flagellar M-ring protein
hp908_0361  -2      10       1.179714   flagellar motor switch protein
hp908_0362  -2      11       0.892169   flagellar assembly protein
hp908_0363  -2      10       1.786425   1-deoxy-D-xylulose 5-phosphate synthase
hp908_0364  0       10       1.141890   translation elongation factor
hp908_0365  1       14       0.339628   hypothetical protein
hp908_0366  0       13       -0.709271  hypothetical protein
hp908_0367  -1      11       0.215963   hypothetical protein
hp908_0368  0       11       -0.064724  flagellar basal-body rod protein
hp908_0369  1       11       -0.576063  alpha-ketoglutarate permease
hp908_0370  0       12       -0.815207  cell division protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
hp908_0357  ACETATEKNASE  29     0.047    Acetate kinase family signature.
		>ACETATEKNASE#Acetate kinase family signature.

Length = 400

Score = 29.0 bits (65), Expect = 0.047
Identities = 14/38 (36%), Positives = 18/38 (47%), Gaps = 5/38 (13%)

Query: 301 LEGVDAILVPGGFGERGIEGKICAIQRARLEKLPFLGI 338
+ GVD I+ G GE G I+ L+ L FLG
Sbjct: 320 MGGVDVIVFTAGIGENG-----PEIREFILDGLEFLGF 352


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
hp908_0360  FLGMRINGFLIF  548    0.0      Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 548 bits (1413), Expect = 0.0
Identities = 176/582 (30%), Positives = 289/582 (49%), Gaps = 66/582 (11%)

Query: 11 VDFFIKLNKKQKIALIAAEVLITALLVFLLLYPFKEKDYAQGGYGVLFERLDSSDNALIL 70
+++ +L +I LI A A++V ++L+ K DY LF L D I+
Sbjct: 13 LEWLNRLRANPRIPLIVAGSAAVAIVVAMVLWA-KTPDYR-----TLFSNLSDQDGGAIV 66

Query: 71 QHLQQNQIPYKVLKDD-TILVPKDKVYEERITLASQGIPKTSKVGFEIFDTKDFGATDFD 129
L Q IPY+ I VP DKV+E R+ LA QG+PK VGFE+ D + FG + F
Sbjct: 67 AQLTQMNIPYRFANGSGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQFS 126

Query: 130 QNIKLIRAIEGELSRTIESLNPILKANVHIAIPKDSVFVAKEVPPSASVMLKLKPDMKLS 189
+ + RA+EGEL+RTIE+L P+ A VH+A+PK S+FV ++ PSASV + L+P L
Sbjct: 127 EQVNYQRALEGELARTIETLGPVKSARVHLAMPKPSLFVREQKSPSASVTVTLEPGRALD 186

Query: 190 PTQILGIKNLIAAAVPKLTIENVKIVNENGESIGEGDILENSKELALEQLRYKQNFENIL 249
QI + +L+++AV L NV +V+++G + + + + ++L QL++ + E+ +
Sbjct: 187 EGQISAVVHLVSSAVAGLPPGNVTLVDQSGHLLTQSNT--SGRDLNDAQLKFANDVESRI 244

Query: 250 ENKIVNILAPIVGGKNKVVARVNAEFDFSQKKSTKETFDPNN-----VVRSEQNLEEKKE 304
+ +I IL+PIVG N V A+V A+ DF+ K+ T+E + PN +RS Q ++
Sbjct: 245 QRRIEAILSPIVGNGN-VHAQVTAQLDFANKEQTEEHYSPNGDASKATLRSRQLNISEQV 303

Query: 305 GAPKKQVGGVPGVVSN-IGPVQGLKDNKEPEKYEKSQN---------------------- 341
GA GGVPG +SN P P + +QN
Sbjct: 304 GAGYP--GGVPGALSNQPAPPNEAPIATPPTNQQNAQNTPQTSTSTNSNSAGPRSTQRNE 361

Query: 342 TTNYEVGKTISEIKGEFGTLVRLNAAVVVDGKYKIAFKDGANALEYEPLSDESLKKINAL 401
T+NYEV +TI K G + RL+ AVVV+ K K PL+ + +K+I L
Sbjct: 362 TSNYEVDRTIRHTKMNVGDIERLSVAVVVNYKTLADGKP-------LPLTADQMKQIEDL 414

Query: 402 VKQAIGYNQNRGDDVAVSNFEFNPMAPMIDNATFSEKIMHKTQKILGSFTPLIKYILVFI 461
++A+G++ RGD + V N F+ + E + Q + +++LV +
Sbjct: 415 TREAMGFSDKRGDTLNVVNSPFSAVDNTGG-----ELPFWQQQSFIDQLLAAGRWLLVLV 469

Query: 462 VLFIFYKKVIVPFSERMLEVVPDEDKEVKSMFEEMDEEEDELNKLGDLRKKVEDQLGLNA 521
V +I ++K + P R +E ++ + E + E L+K L+++ +Q
Sbjct: 470 VAWILWRKAVRPQLTRRVEEAKAAQEQAQVRQETEEAVEVRLSKDEQLQQRRANQ----- 524

Query: 522 SFSEEEVRYEIILEKIRGTLKERPDEIATLFKLLIKDEISSD 563
+ E++ ++IR E D + L+I+ +S+D
Sbjct: 525 -----RLGAEVMSQRIR----EMSDNDPRVVALVIRQWMSND 557


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
hp908_0361  FLGMOTORFLIG  348    e-122    Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 348 bits (895), Expect = e-122
Identities = 121/338 (35%), Positives = 208/338 (61%), Gaps = 4/338 (1%)

Query: 8 KQKAQLDELSMSEKIAILLIQVGEDTTGEILRHLDIDSITEISKQIVQLNGTDKQIGAAV 67
K+ + L+ +K AILL+ +G + + ++ ++L + I ++ +I +L ++ V
Sbjct: 7 KEILDVSALTGKQKAAILLVSIGSEISSKVFKYLSQEEIESLTFEIAKLETITSELKDNV 66

Query: 68 LEEFFAIFQSNQYINTGGLEYARELLTRTLGSEEARKVMDKLTKSLQTQKNFAYLGKIKP 127
L EF + + ++I GG++YARELL ++LG+++A +++ L +LQ+ + F ++ + P
Sbjct: 67 LLEFKELMMAQEFIQKGGIDYARELLEKSLGTQKAVDIINNLGSALQS-RPFEFVRRADP 125

Query: 128 QQLADFIINEHPQTIALILAHMEAPNAAETLSYFPDEMKAEISIRMANLGEISPQVVKRV 187
+ +FI EHPQTIALIL++++ A+ LS P E++ ++ R+A + SP+VV+ V
Sbjct: 126 ANILNFIQQEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEVVREV 185

Query: 188 STVLENKLESLTSYK-IEVGGLRAVAEIFNRLGQKSAKTTLARIESVDNKLAGAIKEMMF 246
VLE KL SL+S GG+ V EI N +K+ K + +E D +LA IK+ MF
Sbjct: 186 ERVLEKKLASLSSEDYTSAGGVDNVVEIINMADRKTEKFIIESLEEEDPELAEEIKKKMF 245

Query: 247 TFEDIAKLDNFAIREILKVADKKDLSLALKTSTQDLTDKFLNNMSSRAAEQFVEEMQYLG 306
FEDI LD+ +I+ +L+ D ++L+ ALK+ + +K NMS RAA E+M++LG
Sbjct: 246 VFEDIVLLDDRSIQRVLREIDGQELAKALKSVDIPVQEKIFKNMSKRAASMLKEDMEFLG 305

Query: 307 AVKIKDVDVAQRKIIEIVQSLQEKG--VIQTGEEEDVI 342
+ KDV+ +Q+KI+ +++ L+E+G VI G EEDV+
Sbjct: 306 PTRRKDVEESQQKIVSLIRKLEEQGEIVISRGGEEDVL 343



Score = 31.3 bits (71), Expect = 0.006
Identities = 20/103 (19%), Positives = 41/103 (39%), Gaps = 3/103 (2%)

Query: 4 KLTPKQKAQLDELSMSEKIAILLIQVGEDTTGEILRHLDIDSITEISKQIVQLNGTDKQI 63
+ P + + IA++L + IL L + T ++++I ++ T ++
Sbjct: 122 RADPANILNFIQQEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEV 181

Query: 64 GAA---VLEEFFAIFQSNQYINTGGLEYARELLTRTLGSEEAR 103
VLE+ A S Y + GG++ E++ E
Sbjct: 182 VREVERVLEKKLASLSSEDYTSAGGVDNVVEIINMADRKTEKF 224


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
hp908_0364  TCRTETOQM     113    2e-28    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 113 bits (284), Expect = 2e-28
Identities = 53/162 (32%), Positives = 87/162 (53%), Gaps = 7/162 (4%)

Query: 3 NIRNFSIIAHIDHGKSTLADCLIAECNAIS---NREMTSQVMDTMDIEKERGITIKAQSV 59
I N ++AH+D GK+TL + L+ AI+ + + + D +E++RGITI+
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 60 RLNYTLKGEDYVLNLIDTPGHVDFSYEVSRSLCSCEGALLVVDATQGVEAQTIANTYIAL 119
+ E+ +N+IDTPGH+DF EV RSL +GA+L++ A GV+AQT +
Sbjct: 62 SFQW----ENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALR 117

Query: 120 DNNLEILPVINKIDLPNANVLEVKQDIEDTIGIDCSSANEVS 161
+ + INKID ++ V QDI++ + + +V
Sbjct: 118 KMGIPTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVE 159



Score = 83.8 bits (207), Expect = 5e-19
Identities = 50/215 (23%), Positives = 90/215 (41%), Gaps = 17/215 (7%)

Query: 161 SAKAKLGIKDLLEKIITTIPAPSGDPNAPLKALIYDSWFDNYLGALALVRIMDGSINTEQ 220
SAK +GI +L+E I + + + L ++ + LA +R+ G ++
Sbjct: 220 SAKNNIGIDNLIEVITNKFYSSTHRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRD 279

Query: 221 EILVMGTGKKHGVLGLYYPNPLKKIPTKSLECGEIGIV---SLGLKSVTDIAVGDTLTDA 277
+ + K + +Y + GEI I+ L L SV +GDT
Sbjct: 280 SVRISEKEKI-KITEMYTSINGELCKIDKAYSGEIVILQNEFLKLNSV----LGDTKLL- 333

Query: 278 KNPTSKPIEGFMPAKPFVFAGLYPIETDRFEDLREALLKLQLNDCALNFEPESSVALGFG 337
P + IE P + + P + + E L +ALL++ +D L + +S+
Sbjct: 334 --PQRERIEN---PLPLLQTTVEPSKPQQREMLLDALLEISDSDPLLRYYVDSATH---E 385

Query: 338 FRVGFLGLLHMEVIKERLEREFGLNLIATAPTVVY 372
+ FLG + MEV L+ ++ + + PTV+Y
Sbjct: 386 IILSFLGKVQMEVTCALLQEKYHVEIEIKEPTVIY 420



Score = 31.0 bits (70), Expect = 0.015
Identities = 15/75 (20%), Positives = 28/75 (37%), Gaps = 2/75 (2%)

Query: 399 IKEPFVRATIITPSEFLGNLMQLLNNKRGIQEKMEYLNQSRVMLTYSLPSNEIVMDFYDK 458
+ EP++ I P E+L + L + V+L+ +P+ I ++
Sbjct: 535 LLEPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQ-LKNNEVILSGEIPARCI-QEYRSD 592

Query: 459 LKSCTKGYASFDYEP 473
L T G + E
Sbjct: 593 LTFFTNGRSVCLTEL 607


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
hp908_0368  FLGHOOKAP1    30     0.010    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 29.9 bits (67), Expect = 0.010
Identities = 9/40 (22%), Positives = 16/40 (40%)

Query: 3 NGYYAATGAMATQFNRLDLTSNNLANLNTNGFKRDDAITG 42
+ A + L+ SNN+++ N G+ R I
Sbjct: 2 SLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMA 41


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
hp908_0369  TCRTETB       39     2e-05    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 39.5 bits (92), Expect = 2e-05
Identities = 42/182 (23%), Positives = 71/182 (39%), Gaps = 33/182 (18%)

Query: 37 APYFAKEFTHTNDPTLALISAFLVFMLGFFMRPLGSLFFGKLGDKKGRKTSMVYSIILMA 96
P A +F T + +AF++ G+ +GKL D+ G K +++ II+
Sbjct: 37 LPDIANDFNKPPASTNWVNTAFMLTFSI------GTAVYGKLSDQLGIKRLLLFGIIINC 90

Query: 97 LGSFMLALLPTKEIVGEWAFLFLLLARLLQGFSVGGE------YGVVATYLSELGKNGKK 150
GS + VG F L++AR +QG G VVA Y+ + +
Sbjct: 91 FGSVIGF-------VGHSFFSLLIMARFIQG--AGAAAFPALVMVVVARYIPKENRGKAF 141

Query: 151 GFYGSFQYVTLVGGQLLAIFSLFIVENIYTHEQISAFAWRYLFALGGILALLSLFLRNIM 210
G GS + +G + I I+ W YL + I + FL ++
Sbjct: 142 GLIGS---IVAMGEGVGPAIGGMIAHYIH---------WSYLLLIPMITIITVPFLMKLL 189

Query: 211 EE 212
++
Sbjct: 190 KK 191


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
hp908_0370  IGASERPTASE   35     0.001    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 35.0 bits (80), Expect = 0.001
Identities = 31/172 (18%), Positives = 59/172 (34%), Gaps = 4/172 (2%)

Query: 198 KENPIDESHKPPNEESFLAIPTPYNTTLNDSEPQEGLVQISPHPPTHYTIYPKKNRFNDL 257
N + + P E+ + T TT N+ + V + P
Sbjct: 973 NVNGRYDLYNPEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPAT 1032

Query: 258 TNPTNPT--LEPQQETKEREPTLKKETPTTL--KPIMPISAPNTENDNKTENHKTPNHPI 313
+ T T +QE+K E + T TT + + + N + + +T
Sbjct: 1033 PSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSET 1092

Query: 314 KKDDLQENAQEENIEEKENLKEEKRETQNAPNFSPLTPTSAKKPVMVKELSE 365
K+ E + +E++E K E +TQ P + ++ V+ +E
Sbjct: 1093 KETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAE 1144


21  hp908_0610  hp908_0625  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
hp908_0610  -3      12       -1.248646  methyl-accepting chemotaxis protein
hp908_0611  -2      10       -1.186220  multi drug resistance protein
hp908_0612  -2      10       0.410314   flagellin
hp908_0613  -3      10       0.680996   endonuclease III
hp908_0614  -1      11       0.981961   hypothetical protein
hp908_0615  2       9        0.340943   Uroporphyrinogen III decarboxylase
hp908_0616  2       10       0.107825   outer membrane component of multidrug efflux
hp908_0617  2       9        0.111599   membrane fusion protein
hp908_0618  2       9        0.069880   Acriflavin resistance protein/ Multidrug efflux
hp908_0619  2       9        -0.479186  hypothetical protein
hp908_0620  1       9        -0.454466  putative vacuolating cytotoxin like protein
hp908_0621  -2      10       -0.155482  hypothetical protein
hp908_0622  -1      11       0.284480   hypothetical protein
hp908_0623  -2      10       -0.294490  hypothetical protein
hp908_0624  -2      11       -0.546503  NAD-dependent DNA ligase
hp908_0625  -2      12       -1.175260  Chemotaxis protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
hp908_0610  OMS28PORIN    30     0.014    OMS28 porin signature.
		>OMS28PORIN#OMS28 porin signature.

Length = 257

Score = 30.1 bits (67), Expect = 0.014
Identities = 26/102 (25%), Positives = 49/102 (48%), Gaps = 2/102 (1%)

Query: 143 NAAKNGEEHSNEGLITVNKTGQDIESLYEKMQNATSLADSLNQRS--NEITQVISLIDDI 200
N + ++ N+ L T+NK +D+ S E ++ ++ N + +SL+ D+
Sbjct: 47 NKKLDQKDQVNQALDTINKVTEDVSSKLEGVRESSLELVESNDAGVVKKFVGSMSLMSDV 106

Query: 201 AEQTNLLALNAAIEAARAGEHGRGFAVVADEVRKLAEKTQKA 242
A+ T + + A I A +G G V + +K ++TQKA
Sbjct: 107 AKGTVVASQEATIVAKCSGMVAEGANKVVEMSKKAVQETQKA 148


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
hp908_0612  FLAGELLIN     245    2e-77    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 245 bits (627), Expect = 2e-77
Identities = 126/518 (24%), Positives = 210/518 (40%), Gaps = 22/518 (4%)

Query: 2 AFQVNTNINAMNAHVQSALTQNALKTSLERLSSGLRINKAADDASGMTVADSLRSQVSSL 61
A +NTN ++ +Q++L +++ERLSSGLRIN A DDA+G +A+ S + L
Sbjct: 1 AQVINTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGL 60

Query: 62 GQAIANTNDGMGIIQVADKAMDEQLKILDTVKVKATQAAQDGQTTESRKAIQSDIVRLIQ 121
QA N NDG+ I Q + A++E L V+ + QA + K+IQ +I + ++
Sbjct: 61 TQASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLE 120

Query: 122 GLDNIGNTTTYNGQALLSGQFTNKEFQVGAYSNQSIKASIGSTTSDKIGQVRI-ATGALI 180
+D + N T +NG +LS + QVGA ++I + +G G
Sbjct: 121 EIDRVSNQTQFNGVKVLSQDN-QMKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPKE 179

Query: 181 TASGDISLTFKQVDGVNDVTLESVKVSSSAGTGIGVLAEVINKNSNRTGVKAYASVITTS 240
GD+ +FK V G + + + K +G V ++ V A +TT
Sbjct: 180 ATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTTD 239

Query: 241 DVAVQSGSLSNLTLNGIHLGNIADIKKNDSDGRLVAAINAVTSETGVEAYTDQKGRLNLR 300
D N + K A A+ + + + +
Sbjct: 240 DAE-----------NNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTID 288

Query: 301 SIDGRGIEIKTDSVSNGPSALTMVNGGQDLTKGSTNYGRLSLTRLDAKSINV------VS 354
+ G K + NG V S + +N +
Sbjct: 289 TKTGNDGNGKVSTTINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKT 348

Query: 355 ASDSQHLGFTAIGFGESQVAETTVNLRDVTGNFNANVKSASGANYNAVIASGNQSL---G 411
++S L ++ TVN + T N + + +G + S
Sbjct: 349 KNESAKLSDLEANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFIDKTASGVSTLINE 408

Query: 412 SGVTTLRGAMVVIDIAESAMKMLDKVRSDLGSVQNQMISTVNNISITQVNVKAAESQIRD 471
+ + +SA+ +D VRS LG++QN+ S + N+ T N+ +A S+I D
Sbjct: 409 DAAAAKKSTANPLASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARSRIED 468

Query: 472 VDFAEESANFNKNNILAQSGSYAMSQANTVQQNILRLL 509
D+A E +N +K IL Q+G+ ++QAN V QN+L LL
Sbjct: 469 ADYATEVSNMSKAQILQQAGTSVLAQANQVPQNVLSLL 506


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
hp908_0616  RTXTOXIND     30     0.020    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 30.2 bits (68), Expect = 0.020
Identities = 22/167 (13%), Positives = 61/167 (36%), Gaps = 18/167 (10%)

Query: 151 AQVKLNVFNGFSDVNNVKEKSAT--YRSNVATLEYSRQSIFLQVVQQYYEYFNNLARMIA 208
++KL F +V+ + T + +T + + L + ++ E LAR+
Sbjct: 166 PELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINR 225

Query: 209 LQKKLEQIQTDIKRVTKLYDKGLTTIDDL-----QSLKAQGNLSEY--DILDMQFALEQN 261
+ ++ + + L K + + ++A L Y + ++ +
Sbjct: 226 YENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSA 285

Query: 262 RLTLEYLTNLNVKNLKKTTIDVPNLQLRE-RKDLVSLREQISALKYQ 307
+ + +T K +D +LR+ ++ L +++ + +
Sbjct: 286 KEEYQLVTQ----LFKNEILD----KLRQTTDNIGLLTLELAKNEER 324


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
hp908_0617  RTXTOXIND     49     4e-09    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 49.4 bits (118), Expect = 4e-09
Identities = 22/69 (31%), Positives = 34/69 (49%)

Query: 40 STGIVDSIKVTEGSVVKKGDVLLLLYNQDKQAQSDSTEQQLIFAKKQYQRYNKIGGAVDK 99
IV I V EG V+KGDVLL L +A + T+ L+ A+ + RY + +++
Sbjct: 103 ENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIEL 162

Query: 100 NTLEGYEFT 108
N L +
Sbjct: 163 NKLPELKLP 171



Score = 30.2 bits (68), Expect = 0.008
Identities = 21/150 (14%), Positives = 51/150 (34%), Gaps = 21/150 (14%)

Query: 70 QAQSDSTEQQLIFAKKQYQRYNKIGGAVDKNTLEGYEFTYRRLESDYAYSIAVLNKTILR 129
+++ S +++ + ++ +I + + T T +++ +++R
Sbjct: 279 ESEILSAKEEYQLVTQLFKN--EILDKLRQTTDNIGLLTLELAKNE-----ERQQASVIR 331

Query: 130 APFDGVIASKNIQVGEGVSANNTVLLRLVSHARKLVIE--FDSKYINAVKVG-------D 180
AP + + GV L+ +V L + +K I + VG +
Sbjct: 332 APVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVE 391

Query: 181 TYTYSIDGDSNQHEAKITKIYP--TVDENT 208
+ Y+ G K+ I D+
Sbjct: 392 AFPYTRYGYL---VGKVKNINLDAIEDQRL 418


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
hp908_0618  ACRIFLAVINRP  898    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 898 bits (2323), Expect = 0.0
Identities = 286/1040 (27%), Positives = 518/1040 (49%), Gaps = 42/1040 (4%)

Query: 1 MYKTAINRPITTLMFALAIVFFGVMGFKKLSVALFPKIDLPTVVVTTTYPGASAEIIESK 60
M I RPI + A+ ++ G + +L VA +P I P V V+ YPGA A+ ++
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VTDKIEEAVMGIDGIKKVTSTSSKNVSIVV-IEFELEKPNEEALNDVMNKISSVR-FDDS 118
VT IE+ + GID + ++STS S+ + + F+ + A V NK+
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 119 NIKKPSINKFDTDSQPIISLFVSSSSVPAT--TLNDYAKNTIKPMLQKINGVGGVQLNGF 176
+++ I+ + S ++ S + T ++DY + +K L ++NGVG VQL G
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFG- 179

Query: 177 RERQIRIYADPTLMNKYNLTYADLFSTLKAENVEIDGGRIVNS------QRELSILINAN 230
+ +RI+ D L+NKY LT D+ + LK +N +I G++ + Q SI+
Sbjct: 180 AQYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 231 SYSVADVEKIQV-----GNHVRLGDIAKIEIGLEEDNTFASFKDKPGVILEIQKIAGANE 285
+ + K+ + G+ VRL D+A++E+G E N A KP L I+ GAN
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANA 299

Query: 286 IEIVDRVYEALKRIQAISP-NYEIRPFLDTTSYIRTSIEDVKFDLILGAILAVLVVFAFL 344
++ + L +Q P ++ DTT +++ SI +V L +L LV++ FL
Sbjct: 300 LDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFL 359

Query: 345 RNGTITLVSAISIPISIMGTFALIQWMGFSLNMLTMVALTLAIGIIIDDAIVVIENIHK- 403
+N TL+ I++P+ ++GTFA++ G+S+N LTM + LAIG+++DDAIVV+EN+ +
Sbjct: 360 QNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERV 419

Query: 404 KLEMGMSKRKASYEGVREIGFALVAISAMLLSVFVPIGNMKGIIGRFFQSFGITVALAIA 463
+E + ++A+ + + +I ALV I+ +L +VF+P+ G G ++ F IT+ A+A
Sbjct: 420 MMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMA 479

Query: 464 LSYVVVVTIIPMVSSVVVNPRHS-------RFYVWSEPFFKALESRYTRLLQWVLNHKLI 516
LS +V + + P + + ++ P + F+ W F + YT + +L
Sbjct: 480 LSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 517 IFITVVLVFVGSLFVASKLGMDFMLKEDRGRFLVWLKAKPGVSIDY----MTQKSKIFQK 572
+ L+ G + + +L F+ +ED+G FL ++ G + + + Q + + K
Sbjct: 540 YLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLK 599

Query: 573 AIEKHDEVEFTTLQVGY-GTSQNPFKAKIFVQLKPLKERKKEHELGQFELMSALRKELRS 631
+ + E FT + G +QN FV LKP +ER E ++ + EL
Sbjct: 600 NEKANVESVFTVNGFSFSGQAQNA--GMAFVSLKPWEERNG-DENSAEAVIHRAKMELGK 656

Query: 632 LPEAKDLENINLSEVSLIGGGGDSSPFQTFVFSHSQEAVDKSVENLRKFLLESPELKGKV 691
+ + + N+ + G ++ F + + D + + L + + +
Sbjct: 657 IRDGF-VIPFNMPAIV---ELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASL 712

Query: 692 EGYHTSTSESQPQLQLKILRQNANKYGVSAQTIGSVVSSAFSGTSQASVFKEDGKEYDMI 751
+ E Q +L++ ++ A GVS I +S+A G + + F + G+ +
Sbjct: 713 VSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGG-TYVNDFIDRGRVKKLY 771

Query: 752 IRVPDDKRVSVEDIKRLQVRNKYDKLMFLDALVEITETKSPSSISRYNRQRSVTVLAQPN 811
++ R+ ED+ +L VR+ +++ A + RYN S+ + +
Sbjct: 772 VQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEA- 830

Query: 812 RNAGVSLGEILTQVSKNTKEWLVEGANYRFTGEADNAKESNGEFLVALATAFVLIYMILA 871
G S G+ + + +N L G Y +TG + + S + +A +FV++++ LA
Sbjct: 831 -APGTSSGDAMALM-ENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLA 888

Query: 872 ALYESILEPFIIMVTMPLSFSGAFFALGLVHQPLSMFSMIGLILLIGMVGKNATLLIDVA 931
ALYES P +M+ +PL G A L +Q ++ M+GL+ IG+ KNA L+++ A
Sbjct: 889 ALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFA 948

Query: 932 NE-ERKKGLNIQEAILFAGKTRLRPILMTTIAMVCGMLPLALASGDGAAMKSPIGIAMSG 990
+ K+G + EA L A + RLRPILMT++A + G+LPLA+++G G+ ++ +GI + G
Sbjct: 949 KDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMG 1008

Query: 991 GLMISMVLSLLIVPVFYRLL 1010
G++ + +L++ VPVF+ ++
Sbjct: 1009 GMVSATLLAIFFVPVFFVVI 1028


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
hp908_0620  VACCYTOTOXIN  269    4e-75    Helicobacter pylori vacuolating cytotoxin signature.
		>VACCYTOTOXIN#Helicobacter pylori vacuolating cytotoxin signature.

Length = 1291

Score = 269 bits (690), Expect = 4e-75
Identities = 105/397 (26%), Positives = 180/397 (45%), Gaps = 14/397 (3%)

Query: 2804 AGNNSIMWLNELFAAKGGNPLFAPYYLQDNPTEHIVTLMKDIASALGMLSNSNLKNNSTD 2863
+G L L + +A + I + + L +++ K +
Sbjct: 904 SGAQGRDLLQTLLI-DSHDAGYARTMIDATSANEITKQLNTATTTLNNIASLEHKTSGLQ 962

Query: 2864 VLQLNTYTQQMSRLAKLSNFASFDSTDFSERLSSLKNQRFADATPNAMDVILKYSQRDKL 2923
L L+ SRL LS + F++RL +LK+QRFA +A +V+ +++ + +
Sbjct: 963 TLSLSNAMILNSRLVNLSRRHTNHIDSFAKRLQALKDQRFAS-LESAAEVLYQFAPKYEK 1021

Query: 2924 KNNLWATGVGGVSFVENGTGTLYGVNVGYDRFVRG---VIVGGYAAYGYSGFYER--ITN 2978
N+WA +GG S G +LYG + G D ++ G IVGG+ +YGYS F + N
Sbjct: 1022 PTNVWANAIGGTSLNSGGNASLYGTSAGVDAYLNGEVEAIVGGFGSYGYSSFSNQANSLN 1081

Query: 2979 SKSDNVDVGLYARAFIKKSELTFSVNETWGANKTQISSNDTLLSMINQSYKYSTWTTNAK 3038
S ++N + G+Y+R F + E F G++++ ++ LL +NQSY Y ++ +
Sbjct: 1082 SGANNTNFGVYSRIFANQHEFDFEAQGALGSDQSSLNFKSALLRDLNQSYNYLAYSAATR 1141

Query: 3039 VNYGYDFMFKNKSIILKPQIGLRYYYIGMSGLEGVMNNALYNQFKANADPSKKSVLTIDF 3098
+YGYDF F +++LKP +G+ Y ++G + + + S + +
Sbjct: 1142 ASYGYDFAFFRNALVLKPSVGVSYNHLGSTNFKS----NSNQKVALKNGASSQHLFNASA 1197

Query: 3099 ALENRHYFNTNSYFYAIGGVGRDLLVNSMGDKLVRFIGNNTLSYRKGDLYNTFANITTGG 3158
+E R+Y+ SYFY GV ++ N V + R NT A + GG
Sbjct: 1198 NVEARYYYGDTSYFYMNAGVLQEFA-NFGSSNAVSLNTFKVNATRNP--LNTHARVMMGG 1254

Query: 3159 EVRLFKSFYANAGVGARFGLDYKMIDIIGNIGMRLAF 3195
E++L K + N G L + N+GMR +F
Sbjct: 1255 ELKLAKEVFLNLGFVYLHNLISNIGHFASNLGMRYSF 1291



Score = 40.0 bits (93), Expect = 2e-04
Identities = 44/224 (19%), Positives = 78/224 (34%), Gaps = 19/224 (8%)

Query: 251 SSGATTISGV-TFNNNGALTYKGGNGIGGSITFTNSNINHYKLNLNANSVTFNNSTLGSM 309
+ G T+ + N N T + G G S+T ++++ K +N ++ S L
Sbjct: 386 AGGKNTVVNINRINTNADGTIRVG-GFKASLTTNAAHLHIGKGGINLSNQASGRSLLVEN 444

Query: 310 PNGNANTIGNAYILNANNITFNNLTFNGGWFVFDRTNANVNFQGTTTINNPTSPFVNMTG 369
GN G L NN G + ++AN F+ T N T+ F N
Sbjct: 445 LTGNITVDGP---LRVNNQV--------GGYALAGSSANFEFKAGTDTKNGTATFNNDIS 493

Query: 370 KVNINANAIFNIQNYTPSIGSAYTLFSMKNGNITYNDVNNLWNIIRLKNTKATKDNSKNT 429
+ I + F+ + ++ V N NI +L A+ + +
Sbjct: 494 LGRFVNLKVDAHTANFKGIDTGNGGFNT----LDFSGVTNKVNINKL--ITASTNVAVKN 547

Query: 430 TSNNNTHTYYVTYNLGGMLYHFRQIFSPNSIVLQSVYYGANNIY 473
+ N ++G + I S + I + G +IY
Sbjct: 548 FNINELVVKTNGVSVGEYTHFSEDIGSQSRINTVRLETGTRSIY 591



Score = 33.9 bits (77), Expect = 0.016
Identities = 15/100 (15%), Positives = 31/100 (31%), Gaps = 5/100 (5%)

Query: 702 SYTFDGINNTFNEDKFNGGSFNFNHAEQTNAFNNNSFNGGSFSFNAKQVNFNHNSFNGGV 761
SY+ + E FN + ++A Q +N + G+ + N + G
Sbjct: 272 SYSTINTSKVTGEVNFNHLTVGDHNAAQAGIIASNKTHIGTLDLW-QSAGLNIIAPPEGG 330

Query: 762 FNF---NNTPKASFTNDTFNVNNQFKING-TQTDFTFSKG 797
+ + + N + + N TQ +
Sbjct: 331 YKDKPNDKPSNTTQNNAKNDKQESSQNNSNTQVINPPNSA 370


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
hp908_0623  LCRVANTIGEN   30     0.001    Low calcium response V antigen signature.
		>LCRVANTIGEN#Low calcium response V antigen signature.

Length = 326

Score = 30.4 bits (68), Expect = 0.001
Identities = 15/33 (45%), Positives = 20/33 (60%)

Query: 16 KRKRLLTELAELEAEIKVGSERRSSFNVSLSPS 48
R +L ELAEL AE+K+ S ++ N LS S
Sbjct: 149 ARSKLREELAELTAELKIYSVIQAEINKHLSSS 181


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
hp908_0625  HTHFIS        56     6e-11    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 55.6 bits (134), Expect = 6e-11
Identities = 24/110 (21%), Positives = 45/110 (40%), Gaps = 6/110 (5%)

Query: 182 ILIAEDSLSALKTLEKIVQTLELRYLAFPNGRELLDYLYEKEHYQQVGVVITDLEMPNIS 241
IL+A+D + L + + N L ++ +V+TD+ MP+ +
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAA----GDGDLVVTDVVMPDEN 61

Query: 242 GFEVLKTIKADHRTEHLPVIINSSMSSDSNRQLAQSLEADGFVVKSNILE 291
F++L IK LPV++ S+ ++ A A ++ K L
Sbjct: 62 AFDLLPRIKKAR--PDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLT 109


22  hp908_0898  hp908_0905  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
hp908_0898  -1      11       0.872643   Cysteinyl-tRNA synthetase
hp908_0899  0       12       2.159538   hypothetical protein
hp908_0900  0       12       2.569320   vacuolating cytotoxin
hp908_0901  1       13       1.837652   vacuolating cytotoxin
hp908_0902  0       15       0.947696   putative lipopolysaccharide biosynthesis
hp908_0903  1       19       1.156261   Hemin uptake system ATP-binding protein
hp908_0904  -1      19       1.168390   iron III dicitrate transport system permease
hp908_0905  2       13       2.786594   Dehydrogenase like protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
hp908_0898  TYPE4SSCAGX   31     0.015    Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein signature.

Length = 522

Score = 30.5 bits (68), Expect = 0.015
Identities = 35/161 (21%), Positives = 72/161 (44%), Gaps = 12/161 (7%)

Query: 316 KKRLDKIYRLK----QRVLGTLGGINPNFKKEILECMQDDLNISKALSVLESMLSSTNEK 371
+ LD++ RL+ Q L I KK+ E ++ ++ +S S +
Sbjct: 208 ENELDQMERLEDMQEQAQANALKQIEELNKKQAEEAVRQRAKDKISIKTDKSQKSPEDNS 267

Query: 372 LDQNPKNKALKGEIL--ANLKFIEELLGIGFKD--PSAYFQLGVSESEKQEIESKIEE-- 425
++ +P + A + ++ N + +L I KD SAY + + ++ E+ S IEE
Sbjct: 268 IELSPSDSAWRTNLVVRTNKALYQFILRIAQKDNFASAYLTVKLEYPQRHEVSSVIEEEL 327

Query: 426 --RKRAKEQKDFLKADSIREELLNHKIALMDTPQGTIWEKL 464
R+ AK Q++ +K +++ +++ + Q EK+
Sbjct: 328 KKREEAKRQRELIKQENLNTTAYINRVMMASNEQIINKEKI 368


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
hp908_0900  VACCYTOTOXIN  771    0.0      Helicobacter pylori vacuolating cytotoxin signature.
		>VACCYTOTOXIN#Helicobacter pylori vacuolating cytotoxin signature.

Length = 1291

Score = 771 bits (1993), Expect = 0.0
Identities = 431/482 (89%), Positives = 452/482 (93%), Gaps = 4/482 (0%)

Query: 1 MELQQTHRKMNRPLVSLVLAGALVSAIPQESHAAFFTTVIIPAIVGGIATGTAVGTVSGL 60
ME+QQTHRK+NRPLVSL L GALVS PQ+SHAAFFTTVIIPAIVGGIATG AVGTVSGL
Sbjct: 1 MEIQQTHRKINRPLVSLALVGALVSITPQQSHAAFFTTVIIPAIVGGIATGAAVGTVSGL 60

Query: 61 LSWGLKQAEEANKTPDKPDKVWRIQAGRGFNNFPNKEYDLYQSLLSSKIDGGWDWGNAAR 120
L WGLKQAEEANKTPDKPDKVWRIQAG+GFN FPNKEYDLY+SLLSSKIDGGWDWGNAAR
Sbjct: 61 LGWGLKQAEEANKTPDKPDKVWRIQAGKGFNEFPNKEYDLYKSLLSSKIDGGWDWGNAAR 120

Query: 121 HYWVKGGQWNKLEVDMKDAVGTYKLSGLRNFTGGDLDVNMQKATLRLGQFNGNSFTSYKD 180
HYWVK GQWNKLEVDM++AVGTY LSGL NFTGGDLDVNMQKATLRLGQFNGNSFTSYKD
Sbjct: 121 HYWVKDGQWNKLEVDMQNAVGTYNLSGLINFTGGDLDVNMQKATLRLGQFNGNSFTSYKD 180

Query: 181 SADRTTRVNFNAKNISIDNFVEINNRVGSGAGRKASSTVLTLQASEGITSSKNAEISLYD 240
SADRTTRV+FNAKNI IDNF+EINNRVGSGAGRKASSTVLTLQASEGITS +NAEISLYD
Sbjct: 181 SADRTTRVDFNAKNILIDNFLEINNRVGSGAGRKASSTVLTLQASEGITSRENAEISLYD 240

Query: 241 GATLNLASNSVKLNGNVWMGRLQYVGAYLAPSYSTINTSKVTGEVDFNHLTVGDQNAAQA 300
GATLNLASNSVKL GNVWMGRLQYVGAYLAPSYSTINTSKVTGEV+FNHLTVGD NAAQA
Sbjct: 241 GATLNLASNSVKLMGNVWMGRLQYVGAYLAPSYSTINTSKVTGEVNFNHLTVGDHNAAQA 300

Query: 301 GIIASNKTHIGTLDLCHSAGLNIIAPPEGGYKDKPNST---TSQSGTKNDKQEISQNNNS 357
GIIASNKTHIGTLDL SAGLNIIAPPEGGYKDKPN T+Q+ KNDKQE SQ NNS
Sbjct: 301 GIIASNKTHIGTLDLWQSAGLNIIAPPEGGYKDKPNDKPSNTTQNNAKNDKQESSQ-NNS 359

Query: 358 NTEVINPPNNTQKTETEPTQVIDGPFAGGKDTVVNIDRINTKADGTIRVGGFKASLTTNA 417
NT+VINPPN+ QKTE +PTQVIDGPFAGGK+TVVNI+RINT ADGTIRVGGFKASLTTNA
Sbjct: 360 NTQVINPPNSAQKTEIQPTQVIDGPFAGGKNTVVNINRINTNADGTIRVGGFKASLTTNA 419

Query: 418 AHLNIGKGGVNLSNQASGRTLLVENLTGNITVDGPLRVNNQVGGYALAGSSANFEFKAGV 477
AHL+IGKGG+NLSNQASGR+LLVENLTGNITVDGPLRVNNQVGGYALAGSSANFEFKAG
Sbjct: 420 AHLHIGKGGINLSNQASGRSLLVENLTGNITVDGPLRVNNQVGGYALAGSSANFEFKAGT 479

Query: 478 DA 479
D
Sbjct: 480 DT 481


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
hp908_0901  VACCYTOTOXIN  620    0.0      Helicobacter pylori vacuolating cytotoxin signature.
		>VACCYTOTOXIN#Helicobacter pylori vacuolating cytotoxin signature.

Length = 1291

Score = 620 bits (1599), Expect = 0.0
Identities = 359/405 (88%), Positives = 379/405 (93%)

Query: 1 MFELANRSKDIDTLYTHSGAKGRDLLQTLLIDSHDAGYARQMIDNTSTGEITKQLNAATD 60
+FELANRS DIDTLY +SGA+GRDLLQTLLIDSHDAGYAR MID TS EITKQLN AT
Sbjct: 887 VFELANRSNDIDTLYANSGAQGRDLLQTLLIDSHDAGYARTMIDATSANEITKQLNTATT 946

Query: 61 ALNNVASLEHKTSGLQTLSLSNAMILNSRLVNLSRKHTNHINSFAQRLQALKGQRFASLE 120
LNN+ASLEHKTSGLQTLSLSNAMILNSRLVNLSR+HTNHI+SFA+RLQALK QRFASLE
Sbjct: 947 TLNNIASLEHKTSGLQTLSLSNAMILNSRLVNLSRRHTNHIDSFAKRLQALKDQRFASLE 1006

Query: 121 SAAEVLYQFAPKYEKPTNVWANAIGGASLNSGSNASLYGTSAGADAYLNGEVEAIVGGFG 180
SAAEVLYQFAPKYEKPTNVWANAIGG SLNSG NASLYGTSAG DAYLNGEVEAIVGGFG
Sbjct: 1007 SAAEVLYQFAPKYEKPTNVWANAIGGTSLNSGGNASLYGTSAGVDAYLNGEVEAIVGGFG 1066

Query: 181 SYGYSSFSNQANSLNSGANNANFGVYSRFFANQHEFDFEAQGALGSDQSSLNFKSALLQD 240
SYGYSSFSNQANSLNSGANN NFGVYSR FANQHEFDFEAQGALGSDQSSLNFKSALL+D
Sbjct: 1067 SYGYSSFSNQANSLNSGANNTNFGVYSRIFANQHEFDFEAQGALGSDQSSLNFKSALLRD 1126

Query: 241 LNQSYNYLAYSATARASYGYDFAFFRNALVLKPSVGVSYNHLGSTNFKSNSQSQVALKNG 300
LNQSYNYLAYSA RASYGYDFAFFRNALVLKPSVGVSYNHLGSTNFKSNS +VALKNG
Sbjct: 1127 LNQSYNYLAYSAATRASYGYDFAFFRNALVLKPSVGVSYNHLGSTNFKSNSNQKVALKNG 1186

Query: 301 ASSQHLFNANANVEARYYYGDTSYFYLHAGVLQEFAHFGSNDVASLNTFKINAARSPLST 360
ASSQHLFNA+ANVEARYYYGDTSYFY++AGVLQEFA+FGS++ SLNTFK+NA R+PL+T
Sbjct: 1187 ASSQHLFNASANVEARYYYGDTSYFYMNAGVLQEFANFGSSNAVSLNTFKVNATRNPLNT 1246

Query: 361 YARAMMGGELRLAKEVFLNLGVVYLHNLISNASHFASNLGMRYSF 405
+AR MMGGEL+LAKEVFLNLG VYLHNLISN HFASNLGMRYSF
Sbjct: 1247 HARVMMGGELKLAKEVFLNLGFVYLHNLISNIGHFASNLGMRYSF 1291


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
hp908_0905  DHBDHDRGNASE  90     2e-23    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 89.7 bits (222), Expect = 2e-23
Identities = 56/233 (24%), Positives = 102/233 (43%), Gaps = 10/233 (4%)

Query: 11 KVAVITGASSGIGLECALMLLDQGYKVYALSRRATLCVALNHALC------ECVDIDVSD 64
K+A ITGA+ GIG A L QG + A+ + +L E DV D
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRD 68

Query: 65 SNALKEVFLNISAKEDHCDVLINSAGYGVFGSVEDTPIEEVKKQFSVNFFALCEVVQLCL 124
S A+ E+ I + D+L+N AG G + EE + FSVN + +
Sbjct: 69 SAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVS 128

Query: 125 PLLKNKPHSKIFNLSSIAGRVSMLFLGHYSASKHALEAYSDALRLELKPFNVQVCLIEPG 184
+ ++ I + S V + Y++SK A ++ L LEL +N++ ++ PG
Sbjct: 129 KYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSPG 188

Query: 185 PVKSNWEKTAFENDERKDSVYALEVNAAKNFYSGVYQNAL-NAKEVAQKIVFL 236
+++ + + + ++ + V + + F +G+ L ++A ++FL
Sbjct: 189 STETDMQWSLWADENGAEQVIK---GSLETFKTGIPLKKLAKPSDIADAVLFL 238


23  hp908_0980  hp908_0987  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
hp908_0980  -2      10       0.014770   hypothetical protein
hp908_0981  -3      10       0.484967   Cobalt-zinc-cadmium resistance protein /Cation
hp908_0982  -2      11       -0.372095  nickel-cobalt-cadmium resistance protein
hp908_0983  -1      11       -0.293431  hypothetical protein
hp908_0984  -1      13       -0.506852  Glycyl-tRNA synthetase beta chain
hp908_0985  -2      11       0.603069   hypothetical protein
hp908_0986  -1      13       1.322802   2,3-bisphospho glycerate-independent phospho
hp908_0987  -1      12       1.041917   Aspartyl-tRNA-Asn-amido transferase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
hp908_0980  LPSBIOSNTHSS  25     0.035    Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein signature.

Length = 166

Score = 25.2 bits (55), Expect = 0.035
Identities = 16/69 (23%), Positives = 27/69 (39%), Gaps = 12/69 (17%)

Query: 12 LKDALIDYLFEKGFDDFFYV--ECYKYAASSLLLSQKEQVSGRKDYAKFKLFLSEEVALP 69
L+ A + + F Y + +SSL+ K+ A+F + V
Sbjct: 98 LQMANTNKTLASDLETVFLTTSTEYSFLSSSLV----------KEVARFGGNVEHFVPSH 147

Query: 70 LAQALKNQF 78
+A AL +QF
Sbjct: 148 VAAALYDQF 156


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
hp908_0981  ACRIFLAVINRP  752    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 752 bits (1942), Expect = 0.0
Identities = 225/1044 (21%), Positives = 460/1044 (44%), Gaps = 42/1044 (4%)

Query: 6 IIEFSLRQRVIVIVGAILILFFGTYSFIHTPVDAFPDISPTQVKIILKLPGSSPEEMENN 65
+ F +R+ + V AI+++ G + + PV +P I+P V + PG+ + +++
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 66 IVRPLELELLGLKGQKSLRSVSKYSIS-DITIDFDDSVDIYLARNIVNERLSSVMKDLPM 124
+ + +E + G+ + S S + S IT+ F D +A+ V +L LP
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 125 GVEGGMAPIVTPLSDIFMF----TIDGNITEIEKRQLLDFVIRPQLRMISGVADVNSIGG 180
V+ + S M + + T+ + + ++ L ++GV DV G
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 181 FSKAFVIVPDFNDMARLGVSISDLESAVRVNLRNSGAGRVDR----DGETFLVKI--QTA 234
A I D + + + ++ D+ + ++V AG++ G+ I QT
Sbjct: 181 -QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 235 SLSLEDIGKITV--STNLGHLHIKDFAKVISQSRTRLGFVTKDGVGETTEGLVLSLKEAN 292
+ E+ GK+T+ +++ + +KD A+V +G + AN
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGK-PAAGLGIKLATGAN 298

Query: 293 TKKIITQVYQKLEELKPLLPSGVSLNVFYDRSEFTQKAIATVSKTLIEAVVLIIITLFLF 352
+ KL EL+P P G+ + YD + F Q +I V KTL EA++L+ + ++LF
Sbjct: 299 ALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLF 358

Query: 353 LGNLRASVAVGVILPLSLSVAFIFIKLNNLTLNLMSLGGLIIAIGMLIDSAVVVVENAFE 412
L N+RA++ + +P+ L F + ++N +++ G+++AIG+L+D A+VVVEN E
Sbjct: 359 LQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENV-E 417

Query: 413 KLSANTKTTKLHAIYRSCKEIAVSVVSGVVIIIVFFVPILTLQGLEGKMFRPLAQSIVYA 472
++ K A +S +I ++V +++ F+P+ G G ++R + +IV A
Sbjct: 418 RVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSA 477

Query: 473 LLGTLVLSITIIPVVSSLVLK--ATPHSET---FLTRFLNRIYGPLLEFFVRNPKKVI-- 525
+ ++++++ + P + + +LK + H E F F N + + + + K++
Sbjct: 478 MALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWF-NTTFDHSVNHYTNSVGKILGS 536

Query: 526 ----LGAFVFLIA-SLSLFPFVGKNFMPTLDEGDVVLSVETTPSISLDQSKDLILNIESA 580
L + ++A + LF + +F+P D+G + ++ + ++++ ++ +
Sbjct: 537 TGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDY 596

Query: 581 IKKHVKEVKTIVARTGSDELGLDLGGLNQTDTFISFIPKKEWSVKTKDELL-EKIMDSLK 639
K+ K V G G Q + ++F+ K W + DE E ++ K
Sbjct: 597 YLKNEKANVESVFTVN----GFSFSGQAQ-NAGMAFVSLKPWEERNGDENSAEAVIHRAK 651

Query: 640 -DFKGINFSFTQPIEM-RISEMLTGVRGDLA-VKIFGDDISELNGLSFQIA-QALKGIKG 695
+ I F P M I E+ T D + G L Q+ A +
Sbjct: 652 MELGKIRDGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPAS 711

Query: 696 SSEVLTTLNEGVNYLYVTPNKEAMANVGITSDEFSKFLKSALEGLIVDVIPTGISRTPVM 755
V E + ++E +G++ + ++ + +AL G V+ +
Sbjct: 712 LVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLY 771

Query: 756 IRQEIDFASSITKIKSLALTSKYGVLVPITSIAKIEEVDGPVSIVREDSRRMSVVRSNVV 815
++ + F + L + S G +VP ++ V G + R + ++
Sbjct: 772 VQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAA 831

Query: 816 GRDLNSFVEEAKKVIAQNVKLPPSYYITYGGQFENQQRANKRLSTVIPLSILAIFFILFF 875
+ + +A KLP + G ++ + + ++ +S + +F L
Sbjct: 832 PGTSSGDAMALMENLA--SKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAA 889

Query: 876 TFKSIPLALLILLNIPFAVTGGLIALFAVGEYISVPASVGFIALFGIAVLNGVVMIGYFK 935
++S + + ++L +P + G L+A + V VG + G++ N ++++ + K
Sbjct: 890 LYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAK 949

Query: 936 ELLL-QGKSVEECVLLGAKRRLRPVLMTACIAGLGLIPLLFSHSVGSEVQKPLAIVVLGG 994
+L+ +GK V E L+ + RLRP+LMT+ LG++PL S+ GS Q + I V+GG
Sbjct: 950 DLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGG 1009

Query: 995 LVTSSALTLLLLPPMFMLIAKKIK 1018
+V+++ L + +P F++I + K
Sbjct: 1010 MVSATLLAIFFVPVFFVVIRRCFK 1033


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
hp908_0982  RTXTOXIND  29     0.026    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 29.4 bits (66), Expect = 0.026
Identities = 25/159 (15%), Positives = 59/159 (37%), Gaps = 31/159 (19%)

Query: 7 WLMLMGVFLMGVFLGAKEYPEIVLEEKNLQPMGLKVIKLDKEIFSKGLPFNAYIDFDSKS 66
L+ F+MG + A +L + +++ N + +S
Sbjct: 56 RPRLVAYFIMGFLVIA-----FIL---------SVLGQVEI-----VATANGKLTHSGRS 96

Query: 67 SVVQSLSFDASVVAVYKREGEQVKAGDAICEVSSID-------LSNLYFELQNNQNKLKI 119
++ + ++ V + +EGE V+ GD + +++++ + + + Q + +I
Sbjct: 97 KEIKPIE-NSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQI 155

Query: 120 AKDITKKDLELYKAGVIPKREYQTSFLASEEMGLKVNQL 158
+EL K + + SEE L++ L
Sbjct: 156 LSRS----IELNKLPELKLPDEPYFQNVSEEEVLRLTSL 190


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
hp908_0987  TYPE3IMSPROT  25     0.042    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 25.1 bits (55), Expect = 0.042
Identities = 10/36 (27%), Positives = 16/36 (44%), Gaps = 10/36 (27%)

Query: 5 DTLLQR---LEKLSM--LEIKDEHKES-----VKGH 30
D + +++L M EIK E+KE +K
Sbjct: 202 DYAFEYYQYIKELKMSKDEIKREYKEMEGSPEIKSK 237


24  hp908_1045  hp908_1052  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
hp908_1045  -1      12       -0.025884  hypothetical protein
hp908_1046  -2      11       0.995646   D-3-phospho glycerate dehydrogenase
hp908_1047  -1      12       1.025254   3-poly prenyl-4-hydroxy benzoate carboxy-lyase
hp908_1048  0       14       0.758074   hypothetical protein
hp908_1049  0       14       0.941183   hypothetical protein
hp908_1050  0       14       1.276224   chemotaxis protein
hp908_1051  1       13       1.098114   Signal transduction histidine kinase
hp908_1052  -2      12       -0.521031  Signal transduction histidine kinase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
hp908_1045  V8PROTEASE  32     8e-04    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 32.3 bits (73), Expect = 8e-04
Identities = 11/35 (31%), Positives = 19/35 (54%)

Query: 47 TNEKSHEINLEENSPNNPNTPNDEKAPHNEEDHNN 81
N+ ++PNNP+ PN+ P+N ++ NN
Sbjct: 284 ANDDQPNNPDNPDNPNNPDNPNNPDEPNNPDNPNN 318



Score = 31.1 bits (70), Expect = 0.002
Identities = 10/34 (29%), Positives = 20/34 (58%)

Query: 48 NEKSHEINLEENSPNNPNTPNDEKAPHNEEDHNN 81
+ + + N+P+NP+ PN+ P+N ++ NN
Sbjct: 279 EDIHFANDDQPNNPDNPDNPNNPDNPNNPDEPNN 312



Score = 27.3 bits (60), Expect = 0.034
Identities = 14/49 (28%), Positives = 23/49 (46%), Gaps = 1/49 (2%)

Query: 40 TNESPSQTNEKSHEINLEENSPNNPNTPNDEKAPHNEEDHNNTLFQNLD 88
N+ + N +N PNNP+ PN+ P+N ++ +N N D
Sbjct: 284 ANDDQPNNPDNPDNPNNPDN-PNNPDEPNNPDNPNNPDNPDNGDNNNSD 331



Score = 26.9 bits (59), Expect = 0.048
Identities = 13/61 (21%), Positives = 26/61 (42%), Gaps = 5/61 (8%)

Query: 29 EMPTTSHAISQTNESPSQTNEKSHEINLEENSPNNPNTPNDEKAPHNEEDHNNTLFQNLD 88
++ + ++P N + N+P+ PN P++ P N ++ +N N D
Sbjct: 280 DIHFANDDQPNNPDNPDNPNNPDNP-----NNPDEPNNPDNPNNPDNPDNGDNNNSDNPD 334

Query: 89 A 89
A
Sbjct: 335 A 335


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
hp908_1050  HTHFIS  60     2e-12    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 60.2 bits (146), Expect = 2e-12
Identities = 29/129 (22%), Positives = 50/129 (38%), Gaps = 13/129 (10%)

Query: 182 GEVLFLDDSKTARKTLKNHLSKLGFSITEAVDGEDGLNKLEMLFKKYGDDLRKHLKFIIS 241
+L DD R L LS+ G+ + + + +++
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIA----------AGDGDLVVT 53

Query: 242 DVEMPKMDGYHFLFKLQKDPRFAYIPVIFNSSICDNYSAERAKEMGAVAYLVK-FDAEKF 300
DV MP + + L +++K +PV+ S+ +A +A E GA YL K FD +
Sbjct: 54 DVVMPDENAFDLLPRIKK--ARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTEL 111

Query: 301 TEEISKILD 309
I + L
Sbjct: 112 IGIIGRALA 120


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
hp908_1051  IGASERPTASE  41     2e-05    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 40.8 bits (95), Expect = 2e-05
Identities = 29/168 (17%), Positives = 60/168 (35%), Gaps = 2/168 (1%)

Query: 103 TIRDTGSDTNNGRENEIEEVVKKLQAITSQNLEGAKETSGTKEAPEKEVKKENKEKAKEE 162
T+ T T N + ++ V + I + + + E EN ++ +
Sbjct: 991 TVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKT 1050

Query: 163 VKTNKTPTTENLTSDNPLADEPDLDY-ANMSAEEVEAEIERLLNKRQEADKERRAQKKQE 221
V+ N+ TE + +A E + AN EV + KE +K+E
Sbjct: 1051 VEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEE 1110

Query: 222 DQAKPKQEVAPAKETPKTETPKAPKTETKAKAKADTEENKAPSIGVEQ 269
++ + +PK ++ET A+ P++ +++
Sbjct: 1111 KAKVETEKTQEVPKVTSQVSPKQEQSETVQPQ-AEPARENDPTVNIKE 1157


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
hp908_1052  HTHFIS  58     3e-12    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 58.3 bits (141), Expect = 3e-12
Identities = 24/121 (19%), Positives = 55/121 (45%), Gaps = 4/121 (3%)

Query: 86 VLAIDDSSTDRAIIRKCLKPLGITLLEATNGLEGLEMLKNGDKIPDAILVDIEMPKMDGY 145
+L DD + R ++ + L G + +N + GD D ++ D+ MP + +
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGD--GDLVVTDVVMPDENAF 63

Query: 146 TFASEVRKYNKFKNLPLIAVTSRVTKTDRMRGVESGMTEYITKPYSGEYLTTVVKRSIKL 205
++K +LP++ ++++ T ++ E G +Y+ KP+ L ++ R++
Sbjct: 64 DLLPRIKKAR--PDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121

Query: 206 E 206

Sbjct: 122 P 122


25  hp908_1430  hp908_1437  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
hp908_1430  0       14       0.594196   Inner membrane protein translocase component
hp908_1431  0       12       0.725939   RNA-binding protein
hp908_1432  1       11       1.055926   GTPase and tRNA-U345-formylation enzyme
hp908_1433  4       12       1.506418   Outer membrane protein
hp908_1434  2       15       -0.705885  Outer membrane protein
hp908_1435  2       16       -0.147090  hypothetical protein
hp908_1436  -1      17       0.737640   hypothetical protein
hp908_1437  -1      12       1.534086   membrane-associated lipoprotein:lpp20
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
hp908_1430  60KDINNERMP  422    e-145    60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 422 bits (1086), Expect = e-145
Identities = 159/576 (27%), Positives = 276/576 (47%), Gaps = 71/576 (12%)

Query: 10 RLILAIALSFLFIALYSYFFQKPNKP--TTETTKQETTNNHTTISPNAPNAQHFSVTQTI 67
R +L IAL F+ ++ + Q N +TT+ TT + P +
Sbjct: 5 RNLLVIALLFVSFMIWQAWEQDKNPQPQAQQTTQTTTTAAGSAADQGVPASG-------- 56

Query: 68 PQESLLSTISFEHARIEIDSLGR--IKQVYLKDKKYLTPKEKGFLEHVGHLFSSKENSQP 125
+ L ++ + + I++ G + + K L + L F + S
Sbjct: 57 --QGKLISVKTDVLDLTINTRGGDVEQALLPAYPKELNSTQPFQLLETSPQFIYQAQSGL 114

Query: 126 SLKELPLLAADKLKPLEVRFLDPTLNNKAFNTPYSASKTTLGPNEQLV--LTQDLGVLSI 183
+ ++ P A+ +PL +N A G NE V D +
Sbjct: 115 TGRDGPDNPANGPRPL-------------YNVEKDAYVLAEGQNELQVPMTYTDAAGNTF 161

Query: 184 IKTLTFYDDLHYDLKIAFKSSNN------------------LIPSYVITNGYRPVADLDS 225
KT Y + + + N L P + +
Sbjct: 162 TKTFVLKRG-DYAVNVNYNVQNAGEKPLEISSFGQLKQSITLPPHLDTGSSNFAL----- 215

Query: 226 YTFSGVLLENNDKKIEKIE---DKDAKEIKRFSNTLFLSSVDRYFTTLLFTKDPQGFEAL 282
+TF G D+K EK + D + + S +++ + +YF T + G
Sbjct: 216 HTFRGAAYSTPDEKYEKYKFDTIADNENLNISSKGGWVAMLQQYFATAWIPHN-DGTNNF 274

Query: 283 IDSEIGTKNPLGFISLKNEA-----------NLHGYIGPKDYRSLKAISPMLTDVIEYGL 331
+ +G N + I K++ N ++GP+ + A++P L ++YG
Sbjct: 275 YTANLG--NGIAAIGYKSQPVLVQPGQTGAMNSTLWVGPEIQDKMAAVAPHLDLTVDYGW 332

Query: 332 ITFFAKGVFVLLDYLYQFVGNWGWAIIFLTIIVRIVLYPLSYKGMVSMQKLKELAPKMKE 391
+ F ++ +F LL +++ FVGNWG++II +T IVR ++YPL+ SM K++ L PK++
Sbjct: 333 LWFISQPLFKLLKWIHSFVGNWGFSIIIITFIVRGIMYPLTKAQYTSMAKMRMLQPKIQA 392

Query: 392 LQEKYKGEPQKLQAHMMQLYKKHGANPLGGCLPLILQIPVFFAIYRVLYNAVELKSSEWV 451
++E+ + Q++ MM LYK NPLGGC PL++Q+P+F A+Y +L +VEL+ + +
Sbjct: 393 MRERLGDDKQRISQEMMALYKAEKVNPLGGCFPLLIQMPIFLALYYMLMGSVELRQAPFA 452

Query: 452 LWIHDLSIMDPYFILPLLMGASMYWHQSVTPNTMTDPMQAKIFKLLPLLFTIFLITFPAG 511
LWIHDLS DPY+ILP+LMG +M++ Q ++P T+TDPMQ KI +P++FT+F + FP+G
Sbjct: 453 LWIHDLSAQDPYYILPILMGVTMFFIQKMSPTTVTDPMQQKIMTFMPVIFTVFFLWFPSG 512

Query: 512 LVLYWTTNNILSVLQQLIINKVLENKKRVHAQNIKE 547
LVLY+ +N+++++QQ +I + LE K+ +H++ K+
Sbjct: 513 LVLYYIVSNLVTIIQQQLIYRGLE-KRGLHSREKKK 547


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
hp908_1432  TCRTETOQM  34     0.001    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 34.1 bits (78), Expect = 0.001
Identities = 33/134 (24%), Positives = 53/134 (39%), Gaps = 25/134 (18%)

Query: 216 LSIVGKPNAGKSSLLNAMLLEERA---LVSDIKGTTR-DTIEE-------------VIEL 258
+ ++ +AGK++L ++L A L S KGTTR D +
Sbjct: 6 IGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITSFQW 65

Query: 259 KGHKVRLIDTAGIRESADEIERLGIEKSLKSLENCDIILGVFDLSKPLEKEDFNLIDTLN 318
+ KV +IDT G + E+ R SL L D + + ++ + L L
Sbjct: 66 ENTKVNIIDTPGHMDFLAEVYR-----SLSVL---DGAILLISAKDGVQAQTRILFHALR 117

Query: 319 RAKKPCIVVLNKND 332
+ P I +NK D
Sbjct: 118 KMGIPTIFFINKID 131


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
hp908_1435  BINARYTOXINB  29     0.027    Binary toxin B family signature.
		>BINARYTOXINB#Binary toxin B family signature.

Length = 764

Score = 28.9 bits (64), Expect = 0.027
Identities = 22/97 (22%), Positives = 36/97 (37%), Gaps = 6/97 (6%)

Query: 155 SKSMGDLLAKAAPIERILKAYSVPVSPLENYEKIYYQNAFKPKVRITFDNNSDTEIKNAL 214
+ + D L P + +A + E + YQ + FD + IKN L
Sbjct: 536 AVNPSDPLETTKPDMTLKEALKIAFGFNEPNGNLQYQGKDITEFDFNFDQQTSQNIKNQL 595

Query: 215 MSAYAR-VLTPSDEEKLYQ-----IKNEVFTENTNGI 245
A + T D+ KL I+++ F + N I
Sbjct: 596 AELNATNIYTVLDKIKLNAKMNILIRDKRFHYDRNNI 632


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
hp908_1437  LIPOLPP20  293    e-105    LPP20 lipoprotein precursor signature.
		>LIPOLPP20#LPP20 lipoprotein precursor signature.

Length = 175

Score = 293 bits (752), Expect = e-105
Identities = 174/175 (99%), Positives = 175/175 (100%)

Query: 1 MKNQVKKILGMSVIAAMVIVGCSHAPKSGISKSNKAYKEATKGAPDWVVGDLEKVAKYEK 60
MKNQVKKILGMSV+AAMVIVGCSHAPKSGISKSNKAYKEATKGAPDWVVGDLEKVAKYEK
Sbjct: 1 MKNQVKKILGMSVVAAMVIVGCSHAPKSGISKSNKAYKEATKGAPDWVVGDLEKVAKYEK 60

Query: 61 YSGVFLGRAEDLITNNDVDYSTNQATAKARANLAANLKSTLQKDLENEKTRTVDASGKRS 120
YSGVFLGRAEDLITNNDVDYSTNQATAKARANLAANLKSTLQKDLENEKTRTVDASGKRS
Sbjct: 61 YSGVFLGRAEDLITNNDVDYSTNQATAKARANLAANLKSTLQKDLENEKTRTVDASGKRS 120

Query: 121 ISGTDTEKISQLVDKELIASKMLARYVGKDRVFVLVGLDKQIVDKVREELGMVKK 175
ISGTDTEKISQLVDKELIASKMLARYVGKDRVFVLVGLDKQIVDKVREELGMVKK
Sbjct: 121 ISGTDTEKISQLVDKELIASKMLARYVGKDRVFVLVGLDKQIVDKVREELGMVKK 175
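
The "Score", "Expect" and "Identities" lines printed with each alignment in this report can be read programmatically. The following is a minimal Python sketch, not part of PredictBias itself: the helper names (`parse_expect`, `parse_hsp`) are hypothetical, the 0.05 cutoff is taken from the "E-value (RPSBlast)" input parameter shown on this page, and treating the cutoff as inclusive is an assumption. Very small E-values are printed without a mantissa (for example "e-105"), so the parser prepends "1" before converting.

```python
import re

# Assumption: cutoff taken from the "E-value (RPSBlast)" input parameter on this page.
E_VALUE_CUTOFF = 0.05

SCORE_RE = re.compile(r"Score\s*=\s*([\d.]+)\s*bits\s*\((\d+)\),\s*Expect\s*=\s*(\S+)")
IDENT_RE = re.compile(r"Identities\s*=\s*(\d+)/(\d+)")


def parse_expect(text):
    """Convert a BLAST-style E-value string to float ("e-105" becomes 1e-105)."""
    text = text.rstrip(",")
    if text.lower().startswith("e"):
        text = "1" + text
    return float(text)


def parse_hsp(score_line, identities_line):
    """Extract bit score, raw score, E-value and fractional identity (hypothetical helper)."""
    s = SCORE_RE.search(score_line)
    i = IDENT_RE.search(identities_line)
    if s is None or i is None:
        raise ValueError("unrecognized BLAST statistics lines")
    matches, length = int(i.group(1)), int(i.group(2))
    return {
        "bits": float(s.group(1)),
        "raw_score": int(s.group(2)),
        "expect": parse_expect(s.group(3)),
        "identity": matches / length,
    }


# Statistics lines copied from two hits in this report.
weak = parse_hsp(
    "Score = 25.1 bits (55), Expect = 0.042",
    "Identities = 10/36 (27%), Positives = 16/36 (44%), Gaps = 10/36 (27%)",
)
strong = parse_hsp(
    "Score = 293 bits (752), Expect = e-105",
    "Identities = 174/175 (99%), Positives = 175/175 (100%)",
)

for locus, hsp in (("hp908_0987", weak), ("hp908_1437", strong)):
    # Assumption: a hit is reported when its E-value does not exceed the cutoff.
    passes = hsp["expect"] <= E_VALUE_CUTOFF
    print(locus, hsp["bits"], hsp["expect"], round(hsp["identity"], 3), passes)
```

Both hits fall under the 0.05 cutoff, but the identity fraction separates a borderline match (hp908_0987, 10 of 36 positions) from a near-exact one (hp908_1437, 174 of 175 positions), which is worth checking before treating a hit as a virulence factor.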



 
Contact Sachin Pundhir for Bugs/Comments.