PredictBias

Identification of genomic and pathogenicity islands in prokaryotic genomes
 
A) Input parameters
Genome: P12.gbk
Threshold dinucleotide bias: 2
Threshold codon bias: 4
Threshold %GC bias: 3
E-value (RPSBlast): 0.05
Genome (non-pathogenic):
 
 
C) Potential GIs and PAIs in CP001217
S.No  Start       End         Bias  Virulence  Insertion elements  Prediction
1     HPP12_0043  HPP12_0083  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPP12_0043  -1      14       -4.542620  peptidylarginine deiminase domain-containing
HPP12_0044  -2      13       -3.775917  adenine specific DNA methyltransferase
HPP12_0045  -2      9        -2.422282  cytosine specific DNA methyltransferase
HPP12_0046  -2      9        -2.598576  restriction endonuclease
HPP12_0047  -1      9        -2.489208  type II R-M system restriction endonuclease
HPP12_0048  -1      9        -1.418253  type II R-M system methyltransferase
HPP12_0049  0       10       0.901915   sodium/proline symporter
HPP12_0050  4       14       -0.391135  proline/delta 1-pyrroline-5-carboxylate
HPP12_0051  9       23       -2.743824  hypothetical protein
HPP12_0052  9       21       -2.862116  hypothetical protein
HPP12_0053  7       19       -2.435857  hypothetical protein
HPP12_0054  6       19       -2.081960  hypothetical protein
HPP12_0055  6       20       -1.118801  hypothetical protein
HPP12_0056  2       17       -0.322046  hypothetical protein
HPP12_0057  4       16       -0.084415  hypothetical protein
HPP12_0058  3       15       0.079720   hypothetical protein
HPP12_0059  2       14       -0.099119  hypothetical protein
HPP12_0060  3       15       -0.078405  hypothetical protein
HPP12_0061  3       18       -0.466852  hypothetical protein
HPP12_0062  5       19       -0.339917  hypothetical protein
HPP12_0063  4       19       -0.830617  hypothetical protein
HPP12_0064  4       19       -0.705259  hypothetical protein
HPP12_0065  3       20       0.666824   hypothetical protein
HPP12_0066  2       17       1.740144   hypothetical protein
HPP12_0067  1       17       2.583112   ATP-binding protein
HPP12_0068  1       16       2.318593   ATP-binding protein
HPP12_0069  1       16       2.604917   ATP-binding protein
HPP12_0070  1       14       2.682863   ATP-binding protein
HPP12_0071  4       21       3.521240   urease accessory protein UreH
HPP12_0072  4       23       3.149953   urease accessory protein UreG
HPP12_0073  4       20       2.472729   urease accessory protein UreF
HPP12_0074  3       16       2.708632   urease accessory protein UreE
HPP12_0075  3       18       2.738535   urease accessory protein UreI
HPP12_0076  1       16       2.866759   urease B subunit
HPP12_0077  -2      10       2.135697   urease A subunit
HPP12_0078  -1      12       2.924797   *lipoprotein signal peptidase
HPP12_0079  1       13       3.231106   phosphoglucosamine mutase
HPP12_0080  1       15       2.401419   ribosomal protein S20
HPP12_0081  1       15       2.500557   peptide chain release factor RF-1
HPP12_0082  3       15       2.172122   hypothetical protein
HPP12_0083  2       15       2.020246   outer membrane protein HorA
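For downstream analysis, the per-ORF rows above can be split back into columns. The following is a minimal sketch, not part of PredictBias itself: the names `OrfRow`, `ROW_RE` and `parse_orf_row` are hypothetical, and the pattern assumes that DNBias is a single (optionally negative) digit, CDNBias is a one- or two-digit integer, and %GCBias always carries six decimals, which is what the rows on this page suggest. It accepts rows with or without whitespace between the fields.

```python
import re
from typing import NamedTuple, Optional


class OrfRow(NamedTuple):
    """One row of the per-ORF bias table (hypothetical container)."""
    locus_tag: str
    dn_bias: int
    cdn_bias: int
    gc_bias: float
    product: str


# Assumed layout: LocusTag, DNBias (one digit, optional sign),
# CDNBias (one or two digits), %GCBias (six decimals), Product.
ROW_RE = re.compile(
    r"^(?P<locus>[A-Za-z0-9]+_\d{4})\s*"
    r"(?P<dn>-?\d)\s*"
    r"(?P<cdn>\d{1,2})\s*"
    r"(?P<gc>-?\d+\.\d{6})\s*"
    r"(?P<product>.*)$"
)


def parse_orf_row(line: str) -> Optional[OrfRow]:
    """Return the parsed row, or None for header and other non-data lines."""
    match = ROW_RE.match(line.strip())
    if match is None:
        return None
    return OrfRow(
        locus_tag=match["locus"],
        dn_bias=int(match["dn"]),
        cdn_bias=int(match["cdn"]),
        gc_bias=float(match["gc"]),
        # A leading '*' in the product field is kept as printed on the page.
        product=match["product"].strip(),
    )
```

Calling `parse_orf_row` on each table line and discarding `None` results yields a list of records that can be filtered or plotted; if a genome produces bias values outside the assumed digit widths, the pattern would need adjusting.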
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_0050  ANTHRAXTOXNA  31     0.034    Anthrax toxin LF subunit signature.
		>ANTHRAXTOXNA#Anthrax toxin LF subunit signature.

Length = 800

Score = 30.9 bits (69), Expect = 0.034
Identities = 36/173 (20%), Positives = 71/173 (41%), Gaps = 19/173 (10%)

Query: 121 QEESQLKERILKRKNEKIILNVNFIGEEVLGEEEANARFEKY---SQALKSNYIQYISIK 177
Q+ S+ ++ + + EK+ F+ E+ + + Y S+ K Y +
Sbjct: 118 QDLSEEEKNSMNSRGEKVPFASRFVFEKKRETPKLIINIKDYAINSEQSKEVYYEIGKGI 177

Query: 178 ITTIFSQINILDFEY-----SKKEIVKRLDALYALALEEEKKQGMPKFINLDMEEFRDLE 232
I S+ LD E+ S + D L++ +E K + K I+++ ++
Sbjct: 178 SLDIISKDKSLDPEFLNLIKSLSDDSDSSDLLFSQKFKE-KLELNNKSIDINF-----IK 231

Query: 233 LTVESFMESIAK-----FDLNAGIVLQAYIPDSYEYLKKLHAFSKERVLKGLK 280
+ F + + F + VL+ Y PD +EY+ KL E++ + LK
Sbjct: 232 ENLTEFQHAFSLAFSYYFAPDHRTVLELYAPDMFEYMNKLEKGGFEKISESLK 284
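
The statistics lines that head each alignment ("Score = ... Expect = ..." and "Identities = ...") can be read programmatically. The sketch below is an illustration only; `parse_hsp_header` and `percent` are hypothetical helper names, and the regular expressions assume the line layout shown on this page.

```python
import re

SCORE_RE = re.compile(
    r"Score\s*=\s*([\d.]+)\s*bits\s*\((\d+)\),\s*Expect\s*=\s*(\S+)"
)
FRACTION_RE = re.compile(r"(Identities|Positives|Gaps)\s*=\s*(\d+)/(\d+)")


def parse_hsp_header(score_line: str, stats_line: str) -> dict:
    """Extract bit score, raw score, E-value and match fractions."""
    match = SCORE_RE.search(score_line)
    if match is None:
        raise ValueError(f"not a score line: {score_line!r}")
    evalue_text = match.group(3)
    # Legacy BLAST may print very small E-values as e.g. 'e-100'.
    if evalue_text.startswith("e"):
        evalue_text = "1" + evalue_text
    hsp = {
        "bit_score": float(match.group(1)),
        "raw_score": int(match.group(2)),
        "evalue": float(evalue_text),
    }
    # 'Gaps' is absent for ungapped alignments, so only add what is present.
    for name, num, den in FRACTION_RE.findall(stats_line):
        hsp[name.lower()] = (int(num), int(den))
    return hsp


def percent(fraction: tuple) -> float:
    """Convert a (matches, alignment length) pair to a percentage."""
    matches, length = fraction
    return 100.0 * matches / length
```

For the alignment above, `percent(hsp["identities"])` recomputes the identity from 36/173, which is handy when the rounded percentage printed in parentheses is too coarse.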


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_0055  GPOSANCHOR    48     2e-08    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 48.1 bits (114), Expect = 2e-08
Identities = 49/263 (18%), Positives = 90/263 (34%), Gaps = 3/263 (1%)

Query: 53 LRQKNDKLFTTKEKLTKANTDLENKNDKLSKENENLAVKISGLENSNDQLCQAKEKLTKE 112
++++ DK L N+DL N L N+ L ++S + + ++ + +
Sbjct: 55 VQERADKFEIENNTLKLKNSDLSFNNKALKDHNDELTEELSNAKEKLRKNDKSLSEKASK 114

Query: 113 KAELLRDKDNLTKANTELTTKNTELQKQVNRLKNSRQVLENEKAELSKDKENLTKANAEL 172
EL K +L KA +T ++ L+ + L KA+L K E +
Sbjct: 115 IQELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTAD 174

Query: 173 KTENDKLNHQVIVLTKEQDSLKQERAQLQDAHGFLEKLCADLEKENQHLTDKLKKLESAQ 232
+ L + L Q L++ + LE E L + LE A
Sbjct: 175 SAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKAL 234

Query: 233 KNLENSNDQLLQAKENIAEEKTELEREMVRLKSLEATDKSDLDLQNWRFKSA---IEDLK 289
+ N + + + EK LE L+ + + + K+ L+
Sbjct: 235 EGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALE 294

Query: 290 RQNRKLEEENIVLKERVDGLNEQ 312
+ LE ++ VL L
Sbjct: 295 AEKADLEHQSQVLNANRQSLRRD 317



Score = 46.6 bits (110), Expect = 7e-08
Identities = 46/249 (18%), Positives = 84/249 (33%)

Query: 16 KELEVRIGELENENAELLREKECLAAETSELKDANNQLRQKNDKLFTTKEKLTKANTDLE 75
EL + + + + + A++ EL+ L + + + LE
Sbjct: 88 DELTEELSNAKEKLRKNDKSLSEKASKIQELEARKADLEKALEGAMNFSTADSAKIKTLE 147

Query: 76 NKNDKLSKENENLAVKISGLENSNDQLCQAKEKLTKEKAELLRDKDNLTKANTELTTKNT 135
+ L+ +L + G N + + L EKA L + L KA +T
Sbjct: 148 AEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFST 207

Query: 136 ELQKQVNRLKNSRQVLENEKAELSKDKENLTKANAELKTENDKLNHQVIVLTKEQDSLKQ 195
++ L+ + L KA+L K E + + L + L Q L++
Sbjct: 208 ADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEK 267

Query: 196 ERAQLQDAHGFLEKLCADLEKENQHLTDKLKKLESAQKNLENSNDQLLQAKENIAEEKTE 255
+ LE E L + LE + L + L + + E K +
Sbjct: 268 ALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQ 327

Query: 256 LEREMVRLK 264
LE E +L+
Sbjct: 328 LEAEHQKLE 336



Score = 45.1 bits (106), Expect = 2e-07
Identities = 55/297 (18%), Positives = 105/297 (35%), Gaps = 10/297 (3%)

Query: 25 LENENAELLREKECLAAETSELKDANNQLRQKNDKLFTTKEKLTKANTDLENKNDKLSKE 84
++ + E L + S+L N L+ ND+L + + + + +
Sbjct: 55 VQERADKFEIENNTLKLKNSDLSFNNKALKDHNDELTEELSNAKEKLRKNDKSLSEKASK 114

Query: 85 NENLAVKISGLENSNDQLCQ-------AKEKLTKEKAELLRDKDNLTKANTELTTKNTEL 137
+ L + + LE + + + L EKA L K +L KA +T
Sbjct: 115 IQELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTAD 174

Query: 138 QKQVNRLKNSRQVLENEKAELSKDKENLTKANAELKTENDKLNHQVIVLTKEQDSLKQER 197
++ L+ + LE +AEL K E + + L + L + L++
Sbjct: 175 SAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKAL 234

Query: 198 AQLQDAHGFLEKLCADLEKENQHLTDKLKKLESAQKNLENSNDQLLQAKENIAEEKTELE 257
+ LE E L + +LE A + N + + + EK LE
Sbjct: 235 EGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALE 294

Query: 258 REMVRLKSLEATDKSDLDLQNWRF---KSAIEDLKRQNRKLEEENIVLKERVDGLNE 311
E L+ ++ + A + L+ +++KLEE+N + + L
Sbjct: 295 AEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRR 351



Score = 41.2 bits (96), Expect = 3e-06
Identities = 45/226 (19%), Positives = 78/226 (34%)

Query: 15 RKELEVRIGELENENAELLREKECLAAETSELKDANNQLRQKNDKLFTTKEKLTKANTDL 74
+ELE R +LE + +A+ L+ L + L E +T
Sbjct: 115 IQELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTAD 174

Query: 75 ENKNDKLSKENENLAVKISGLENSNDQLCQAKEKLTKEKAELLRDKDNLTKANTELTTKN 134
K L E L + + LE + + + + L +K L +L
Sbjct: 175 SAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKAL 234

Query: 135 TELQKQVNRLKNSRQVLENEKAELSKDKENLTKANAELKTENDKLNHQVIVLTKEQDSLK 194
+ LE EKA L + L KA + + ++ L E+ +L+
Sbjct: 235 EGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALE 294

Query: 195 QERAQLQDAHGFLEKLCADLEKENQHLTDKLKKLESAQKNLENSND 240
E+A L+ L L ++ + K+LE+ + LE N
Sbjct: 295 AEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEAEHQKLEEQNK 340



Score = 35.4 bits (81), Expect = 2e-04
Identities = 55/285 (19%), Positives = 103/285 (36%)

Query: 14 VRKELEVRIGELENENAELLREKECLAAETSELKDANNQLRQKNDKLFTTKEKLTKANTD 73
+I LE E A L + L + + K L K L D
Sbjct: 170 FSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKAD 229

Query: 74 LENKNDKLSKENENLAVKISGLENSNDQLCQAKEKLTKEKAELLRDKDNLTKANTELTTK 133
LE + + + KI LE L + +L K + + L +
Sbjct: 230 LEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAE 289

Query: 134 NTELQKQVNRLKNSRQVLENEKAELSKDKENLTKANAELKTENDKLNHQVIVLTKEQDSL 193
L+ + L++ QVL + L +D + +A +L+ E+ KL Q + + SL
Sbjct: 290 KAALEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSL 349

Query: 194 KQERAQLQDAHGFLEKLCADLEKENQHLTDKLKKLESAQKNLENSNDQLLQAKENIAEEK 253
+++ ++A LE LE++N+ + L + Q+ +A E +
Sbjct: 350 RRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQVEKALEEANSKL 409

Query: 254 TELEREMVRLKSLEATDKSDLDLQNWRFKSAIEDLKRQNRKLEEE 298
LE+ L+ + + + + ++ + LK + K EE
Sbjct: 410 AALEKLNKELEESKKLTEKEKAELQAKLEAEAKALKEKLAKQAEE 454



Score = 29.3 bits (65), Expect = 0.020
Identities = 25/201 (12%), Positives = 63/201 (31%)

Query: 104 QAKEKLTKEKAELLRDKDNLTKANTELTTKNTELQKQVNRLKNSRQVLENEKAELSKDKE 163
E + + + + +N L+ + + L + + L++ EL+++
Sbjct: 36 NTNEVSAVATRSQTDTLEKVQERADKFEIENNTLKLKNSDLSFNNKALKDHNDELTEELS 95

Query: 164 NLTKANAELKTENDKLNHQVIVLTKEQDSLKQERAQLQDAHGFLEKLCADLEKENQHLTD 223
N + + + ++ L + L++ + LE E L
Sbjct: 96 NAKEKLRKNDKSLSEKASKIQELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALAA 155

Query: 224 KLKKLESAQKNLENSNDQLLQAKENIAEEKTELEREMVRLKSLEATDKSDLDLQNWRFKS 283
+ LE A + N + + + EK LE L+ + + + K+
Sbjct: 156 RKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKT 215

Query: 284 AIEDLKRQNRKLEEENIVLKE 304
+ + + L+
Sbjct: 216 LEAEKAALAARKADLEKALEG 236


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_0076  UREASE        1044   0.0      Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 1044 bits (2701), Expect = 0.0
Identities = 353/569 (62%), Positives = 442/569 (77%), Gaps = 4/569 (0%)

Query: 3 KISRKEYVSMYGPTTGDKVRLGDTDLIAEVEHDYTIYGEELKFGGGKTLREGMSQSN-NP 61
++SR Y +M+GPT GDKVRL DT+L EVE D+T +GEE+KFGGGK +R+GM QS
Sbjct: 4 RMSRAAYANMFGPTVGDKVRLADTELFIEVEKDFTTHGEEVKFGGGKVIRDGMGQSQVTR 63

Query: 62 SKEELDLIITNALIVDYTGIYKADIGIKDGKIAGIGKGGNKDMQDGVKNNLSVGPATEAL 121
+D +ITNALI+D+ GI KADIG+KDG+IA IGK GN DMQ GV + VGP TE +
Sbjct: 64 EGGAVDTVITNALILDHWGIVKADIGLKDGRIAAIGKAGNPDMQPGV--TIIVGPGTEVI 121

Query: 122 AGEGLIVTAGGIDTHIHFISPQQIPTAFASGVTTMIGGGTGPADGTNATTITPGRRNLKW 181
AGEG IVTAGG+D+HIHFI PQQI A SG+T M+GGGTGPA GT ATT TPG ++
Sbjct: 122 AGEGKIVTAGGMDSHIHFICPQQIEEALMSGLTCMLGGGTGPAHGTLATTCTPGPWHIAR 181

Query: 182 MLRAAEEYSMNLGFLAKGNTSNDASLADQIEAGAIGFKIHEDWGTTPSAINHALDVADKY 241
M+ AA+ + MNL F KGN S +L + + GA K+HEDWGTTP+AI+ L VAD+Y
Sbjct: 182 MIEAADAFPMNLAFAGKGNASLPGALVEMVLGGATSLKLHEDWGTTPAAIDCCLSVADEY 241

Query: 242 DVQVAIHTDTLNEAGCVEDTMAAIAGRTMHTFHTEGAGGGHAPDIIKVAGEHNILPASTN 301
DVQV IHTDTLNE+G VEDT+AAI GRT+H +HTEGAGGGHAPDII++ G+ N++P+STN
Sbjct: 242 DVQVMIHTDTLNESGFVEDTIAAIKGRTIHAYHTEGAGGGHAPDIIRICGQPNVIPSSTN 301

Query: 302 PTIPFTVNTEAEHMDMLMVCHHLDKSIKEDVQFADSRIRPQTIAAEDTLHDMGIFSITSS 361
PT P+TVNT AEH+DMLMVCHHL +I ED+ FA+SRIR +TIAAED LHD+G FSI SS
Sbjct: 302 PTRPYTVNTLAEHLDMLMVCHHLSPTIPEDIAFAESRIRKETIAAEDILHDIGAFSIISS 361

Query: 362 DSQAMGRVGEVITRTWQTADKNKKEFGRLKEEKGDNDNFRIKRYLSKYTINPAIAHGISE 421
DSQAMGRVGEV RTWQTADK K++ GRLKEE GDNDNFR+KRY++KYTINPAIAHG+S
Sbjct: 362 DSQAMGRVGEVAIRTWQTADKMKRQRGRLKEETGDNDNFRVKRYIAKYTINPAIAHGLSH 421

Query: 422 YVGSVEVGKVADLVLWSPAFFGVKPNMIIKGGFIALSQMGDANASIPTPQPVYYREMFAH 481
+GS+EVGK ADLVLW+PAFFGVKP+M++ GG IA + MGD NASIPTPQPV+YR MF
Sbjct: 422 EIGSLEVGKRADLVLWNPAFFGVKPDMVLLGGTIAAAPMGDPNASIPTPQPVHYRPMFGA 481

Query: 482 HGKAKYDANITFVSQAAYDKGIKEELGLERQVLPVKNCR-NITKKDMQFNDTTAHIEVNP 540
+G+++ ++++TFVSQA+ D G+ LG+ ++++ V+N R I K M N T HIEV+P
Sbjct: 482 YGRSRTNSSVTFVSQASLDAGLAGRLGVAKELVAVQNTRGGIGKASMIHNSLTPHIEVDP 541

Query: 541 ETYHVFVDGKEVTSKPANKVSLAQLFSIF 569
ETY V DG+ +T +PA + +AQ + +F
Sbjct: 542 ETYEVRADGELLTCEPATVLPMAQRYFLF 570


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_0083  FLAGELLIN     32     0.004    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 32.3 bits (73), Expect = 0.004
Identities = 17/168 (10%), Positives = 49/168 (29%), Gaps = 1/168 (0%)

Query: 45 AEKDKDSKLTSDSPTQQQAQTQAQNTASSGTPTPPTKEEPKHTASSGTPSTSGSSVASQL 104
++ + T N S T ++G + +++ S
Sbjct: 272 GKEGDTFDYKGVTFTIDTKTGNDGNGKVSTTINGEKVTLTVADITAGAANVDAATLQSSK 331

Query: 105 TKDTTMVNNLKSVSVSGMNTTLSGVETMSKQTATISNLLSGNPNLGSVIPNAQGLSSAFS 164
T++VN + N + + + + ++ N + ++ A
Sbjct: 332 NVYTSVVNGQFTFDDKTKNESAKLSDLEANNAVKGESKITVNGAEYTANAAGDKVTLAGK 391

Query: 165 ALESAQNTLKGYLNSSSATIGQLTNGSNAVVGALDKAINQVDMALADL 212
+ + + + + ++D A+++VD + L
Sbjct: 392 TMFIDKTASGVSTLINEDAAAAKK-STANPLASIDSALSKVDAVRSSL 438


2     HPP12_0104  HPP12_0110  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPP12_0104  0       12       3.192591   glycosyl transferase
HPP12_0105  1       14       3.545386   methyl-accepting chemotaxis transmembrane
HPP12_0106  0       12       3.980785   2',3'-cyclic-nucleotide 2'-phosphodiesterase
HPP12_0107  -3      12       4.742043   autoinducer-2 synthase
HPP12_0108  -2      13       3.804539   cystathionine gamma-synthase
HPP12_0109  -1      15       2.584250   cysteine synthetase
HPP12_0110  2       17       1.587527   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_0107  LUXSPROTEIN   227    8e-80    Bacterial autoinducer-2 (AI-2) production protein Lu...
		>LUXSPROTEIN#Bacterial autoinducer-2 (AI-2) production protein LuxS signature.
Length = 171

Score = 227 bits (581), Expect = 8e-80
Identities = 60/145 (41%), Positives = 91/145 (62%), Gaps = 7/145 (4%)

Query: 8 VESFNLDHTKVKAPYVRVADRKKGVNGDVIVKYDVRFKQPNQDHMDMPSLHSLEHLVAEI 67
++SF +DHT++ AP VRVA + GD I +D+RF PN+D + +H+LEHL A
Sbjct: 3 LDSFTVDHTRMNAPAVRVAKTMQTPKGDTITVFDLRFTAPNKDILSEKGIHTLEHLYAGF 62

Query: 68 IRNHAN----YVVDWSPMGCQTGFYLTVLNHDNYTEVLEVLEKTMQDVLKA---TEVPAS 120
+RNH N ++D SPMGC+TGFY++++ + +V + M+DVLK ++P
Sbjct: 63 MRNHLNGDSVEIIDISPMGCRTGFYMSLIGTPSEQQVADAWIAAMEDVLKVENQNKIPEL 122

Query: 121 NEKQCGWAANHTLEGAKNLARAFLD 145
NE QCG AA H+L+ AK +A+ L+
Sbjct: 123 NEYQCGTAAMHSLDEAKQIAKNILE 147


3     HPP12_0140  HPP12_0146  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPP12_0140  2       12       1.715226   hypothetical protein
HPP12_0141  2       11       1.911137   A/G-specific adenine glycosylase
HPP12_0142  3       13       2.184288   sodium:sulfate symporter transmembrane region
HPP12_0143  2       13       1.291211   cytochrome C oxidase heme B and copper-binding
HPP12_0144  2       15       0.089140   cytochrome C oxidase monoheme subunit
HPP12_0145  3       15       -0.906986  cytochrome C oxidase subunit Q
HPP12_0146  2       14       -0.722091  cytochrome C oxidase diheme subunit
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_0144  PF07201       29     0.019    Hypersensitivity response secretion protein HrpJ
		>PF07201#Hypersensitivity response secretion protein HrpJ

Length = 293

Score = 28.7 bits (64), Expect = 0.019
Identities = 13/77 (16%), Positives = 29/77 (37%), Gaps = 6/77 (7%)

Query: 146 FDTAYAEALTQKKVFGVPYDTENGVKLGSVEEAKKAYLEEAKKITADMKDKRVLDAIQRG 205
F +L ++K+ +++ ++ VEE YL + ++ +L +
Sbjct: 60 FSERKELSLDKRKL------SDSQARVSDVEEQVNQYLSKVPELEQKQNVSELLSLLSNS 113

Query: 206 EVLEIVALIAYLNSLGN 222
+ + L AYL
Sbjct: 114 PNISLSQLKAYLEGKSE 130


4     HPP12_0175  HPP12_0204  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPP12_0175  2       17       -2.848716  hypothetical protein
HPP12_0176  3       16       -2.071284  hypothetical protein
HPP12_0177  -1      14       -0.691493  hypothetical protein
HPP12_0178  0       13       0.140850   sialic acid synthase
HPP12_0179  -1      12       0.160899   ABC transporter ATP-binding protein
HPP12_0180  -1      12       -0.061662  apolipoprotein N-acyltransferase
HPP12_0181  -1      11       0.297920   hypothetical protein
HPP12_0182  0       11       0.937122   lysyl-tRNA synthetase
HPP12_0183  2       15       1.012887   serine hydroxymethyltransferase
HPP12_0184  4       19       0.484780   hypothetical protein
HPP12_0185  5       19       0.537384   hypothetical protein
HPP12_0186  2       20       0.196336   hypothetical protein
HPP12_0187  1       16       1.604751   hypothetical protein
HPP12_0188  0       11       3.085686   hypothetical protein
HPP12_0189  -1      11       2.895410   hypothetical protein
HPP12_0190  -2      10       2.164753   hypothetical protein
HPP12_0191  -2      9        2.314904   phospholipase D-family protein
HPP12_0192  -1      11       3.075273   fumarate reductase
HPP12_0193  -1      11       3.033857   fumarate reductase flavoprotein subunit
HPP12_0194  -2      14       1.744615   fumarate reductase cytochrome B subunit
HPP12_0195  -2      14       1.687525   triosephosphate isomerase
HPP12_0196  -2      15       2.930271   enoyl-(acyl-carrier-protein) reductase
HPP12_0197  -2      16       3.006055   UDP-3-O-(3-hydroxymyristoyl) glucosamine
HPP12_0198  -2      17       3.413194   S-adenosylmethionine synthetase
HPP12_0199  -1      19       2.818178   nucleoside diphosphate kinase
HPP12_0200  -2      21       2.496398   hypothetical protein
HPP12_0201  -1      13       -2.786090  50S ribosomal protein L32
HPP12_0202  -1      12       -3.051026  fatty acid/phospholipid synthesis protein
HPP12_0203  0       12       -2.829059  beta-ketoacyl-acyl carrier protein synthase III
HPP12_0204  2       11       -3.284708  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_0179  PF05272       30     0.006    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.4 bits (68), Expect = 0.006
Identities = 12/53 (22%), Positives = 22/53 (41%), Gaps = 1/53 (1%)

Query: 29 LAILGVSGSGKSTLLSHLATMLKPNSGTISLLEHQDIY-ALNSKKLLELRRLK 80
+ + G G GKSTL++ L + + + +D Y + EL +
Sbjct: 599 VVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGTGKDSYEQIAGIVAYELSEMT 651


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_0181  TYPE4SSCAGA   35     3e-04    Type IV secretion system CagA exotoxin signature.
		>TYPE4SSCAGA#Type IV secretion system CagA exotoxin signature.

Length = 1147

Score = 34.7 bits (79), Expect = 3e-04
Identities = 32/100 (32%), Positives = 52/100 (52%), Gaps = 12/100 (12%)

Query: 125 SKMEVMKDANAYLQEKSAFFSTMKSVASKIMRLDGVKHVEQNLKGNLEEMSDEV----KN 180
+KME AN+ +K F+ + A++ R QNLKG E+SD++ KN
Sbjct: 640 NKMEAKAQANS---QKDEIFALINKEANRDAR---AIAYAQNLKGIKRELSDKLENVNKN 693

Query: 181 KESFNKNKESFNKNK-QSFDKAMDKGVESLKEKAKDLPKN 219
+ F+K+ + F K + F KA ++ +++LK KDL N
Sbjct: 694 LKDFDKSFDEFKNGKNKDFSKA-EETLKALKGSVKDLGIN 732


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_0185  IGASERPTASE   35     4e-04    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 34.7 bits (79), Expect = 4e-04
Identities = 32/150 (21%), Positives = 57/150 (38%), Gaps = 8/150 (5%)

Query: 50 PKETFLQTDSGMQKIGNTKDEKKDDEFESLNLDPSKQEDKLDKVADNVKKQENDAFNMPI 109
P ET ++ T ++ + D E+ + ++ V N Q N+
Sbjct: 1033 PSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKAN--TQTNEVAQSGS 1090

Query: 110 QTNQTQTEMKTTEEKQEAQKELKA-VESIPMSAQKESQAVAKKETPHKKPKVAPKDKEAH 168
+T +TQT E +++ K E + SQ K+E + V P+ + A
Sbjct: 1091 ETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQE---QSETVQPQAEPAR 1147

Query: 169 KDKAKHAAKEPKAK--KEAHKEVPKKANSK 196
++ KEP+++ A E P K S
Sbjct: 1148 ENDPTVNIKEPQSQTNTTADTEQPAKETSS 1177



Score = 32.7 bits (74), Expect = 0.002
Identities = 17/117 (14%), Positives = 42/117 (35%)

Query: 85 KQEDKLDKVADNVKKQENDAFNMPIQTNQTQTEMKTTEEKQEAQKELKAVESIPMSAQKE 144
++D + A N + + N+ T + +E K+ E K ++ + +
Sbjct: 1054 NEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAK 1113

Query: 145 SQAVAKKETPHKKPKVAPKDKEAHKDKAKHAAKEPKAKKEAHKEVPKKANSKTTLTK 201
+ +E P +V+PK +++ + + KE + N+ +
Sbjct: 1114 VETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQ 1170


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_0189  RTXTOXINA     25     0.017    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.
Length = 1024

Score = 24.9 bits (54), Expect = 0.017
Identities = 11/51 (21%), Positives = 25/51 (49%)

Query: 2 KTTIKEIFQEEGYSIPNYQRDYAWKEKNFKDLWEDLEEAIEYNKKGQGAFY 52
T + F++E I N++ + + + + L++A+EY ++ A Y
Sbjct: 908 GITFRNWFEKESGDISNHEIEQIFDKSGRIITPDSLKKALEYQQRNNKASY 958


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_0196  DHBDHDRGNASE  60     1e-12    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.
Length = 261

Score = 59.7 bits (144), Expect = 1e-12
Identities = 61/263 (23%), Positives = 109/263 (41%), Gaps = 29/263 (11%)

Query: 4 LKGKKGLIVGVANNKSIAYGIAQSCFNQGATL-AFTYLNESLEKRVRPIAQELNSPYVYE 62
++GK I G A + I +A++ +QGA + A Y E LEK V + E +
Sbjct: 6 IEGKIAFITGAA--QGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFP 63

Query: 63 LDVSKEEHFKSLYDNIKQDLGSLDFIVHSVAF--------APKEALEGSLLETSKSAFNT 114
DV + I++++G +D +V+ E E + S FN
Sbjct: 64 ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNA 123

Query: 115 AMEISVYSLIELTNTLKPLLNNGASVLTLSYLGSTKYMAHYNVMGLAKAALESAVRYLAV 174
+ +S Y + + ++ + +N A V S MA Y +KAA + L +
Sbjct: 124 SRSVSKYMMDRRSGSIVTVGSNPAGVPRTS-------MAAY---ASSKAAAVMFTKCLGL 173

Query: 175 DLGKHHIRVNALSAGPIRT-----LASSGIADFRMILKWNE---INAPLRKNVSLEEVGN 226
+L +++IR N +S G T L + ++I E PL+K ++ +
Sbjct: 174 ELAEYNIRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIAD 233

Query: 227 AGMYLLSSLSSGVSGEVHFVDAG 249
A ++L+S + ++ VD G
Sbjct: 234 AVLFLVSGQAGHITMHNLCVDGG 256


5     HPP12_0295  HPP12_0324  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPP12_0295  1       15       3.705852   50S ribosomal protein L21
HPP12_0296  1       14       3.788615   50S ribosomal protein L27
HPP12_0297  1       14       3.946648   periplasmic dipeptide-binding protein
HPP12_0298  0       14       4.397899   dipeptide transport system permease protein
HPP12_0299  -1      14       3.632089   dipeptide transport system permease protein
HPP12_0300  -3      13       3.211440   dipeptide ABC transporter
HPP12_0301  -3      14       3.014620   dipeptide ABC transporter
HPP12_0302  -2      13       2.488900   GTP-binding protein obgE
HPP12_0303  -1      12       1.968280   hypothetical protein
HPP12_0304  1       16       2.394372   hypothetical protein
HPP12_0305  1       17       2.875166   glutamate-1-semialdehyde 2,1-aminomutase
HPP12_0306  3       17       2.463395   hypothetical protein
HPP12_0307  3       17       2.184288   hypothetical protein
HPP12_0308  3       17       2.411354   N-carbamoyl-D-amino acid amidohydrolase
HPP12_0309  2       15       1.991305   polysaccharide deacetylase
HPP12_0310  0       13       0.505666   hypothetical protein
HPP12_0311  0       15       0.444137   ATP/GTP binding protein
HPP12_0312  -2      15       -0.329985  nitrite extrusion protein
HPP12_0313  3       15       -0.882665  hypothetical protein
HPP12_0315  2       15       -0.880901  heme iron utilization protein
HPP12_0316  1       13       -0.889870  arginyl-tRNA synthetase
HPP12_0317  1       13       -0.524903  sec-independent protein translocase protein
HPP12_0318  1       13       -0.883706  5'-guanylate kinase
HPP12_0319  1       13       -1.184507  poly E-rich protein
HPP12_0320  -1      13       -1.909705  membrane bound endonuclease
HPP12_0321  0       12       -1.706075  outer membrane protein HorC
HPP12_0322  2       14       -2.097967  flagellar basal-body L-ring protein
HPP12_0323  2       13       -1.876073  CMP-N-acetylneuraminic acid synthetase
HPP12_0324  2       10       -1.066016  CMP-N-acetylneuraminic acid synthetase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_0312  TCRTETA       45     3e-07    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 45.2 bits (107), Expect = 3e-07
Identities = 53/271 (19%), Positives = 101/271 (37%), Gaps = 16/271 (5%)

Query: 28 LILSGSLTPHQSFQLGIAVLMGYVFGSFLIQFLSPLMSLESIAKISFGLIALSFLVCYFD 87
L+ S +T H L + LM + L LS + +S A+ + +
Sbjct: 35 LVHSNDVTAHYGILLALYALMQFACAPVLGA-LSDRFGRRPVLLVSLAGAAVDYAI--MA 91

Query: 88 SIPFFW-LWIWRFIAGVASSALMILVAPLSLPYVKEHKKALVGGLIFSAVGIGSVFSGFV 146
+ PF W L+I R +AG+ + A + ++A G + + G G V +
Sbjct: 92 TAPFLWVLYIGRIVAGI-TGATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVL 150

Query: 147 LPWISSYNIKWAWIFLGGSCLIAFILSLVGLK-----TRSLRKKSVKKEESAFKIPFHL- 200
+ ++ + + F+ L R ++ ++F+ +
Sbjct: 151 GGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMT 210

Query: 201 ---WLLLVSCALNAIGFLPHTLFWVDYLIRHLNISPTIAGTSWAFFG-FGATLGSLISGP 256
L+ V + +G +P L WV + + T G S A FG + ++I+GP
Sbjct: 211 VVAALMAVFFIMQLVGQVPAAL-WVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGP 269

Query: 257 MAQKLGAKNANIFILILKSIACFLPIFFHQI 287
+A +LG + A + +I L F +
Sbjct: 270 VAARLGERRALMLGMIADGTGYILLAFATRG 300


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_0318  PF05272       29     0.011    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.011
Identities = 9/18 (50%), Positives = 11/18 (61%)

Query: 8 LILSGPSGAGKSTLTKYL 25
++L G G GKSTL L
Sbjct: 599 VVLEGTGGIGKSTLINTL 616


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_0319  IGASERPTASE   74     7e-16    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 73.6 bits (180), Expect = 7e-16
Identities = 47/232 (20%), Positives = 81/232 (34%), Gaps = 7/232 (3%)

Query: 158 EEQLLPTLNAQEEKEEVKETPQEEKQEIKETPQKEKQEVKETPQKEKQEVKETPQEEKPK 217
+E +P E + + KQE K T +K +Q+ ET + ++ KE K
Sbjct: 1021 DEAPVPPPAPATPSETTETVAENSKQESK-TVEKNEQDATETTAQNREVAKEAKSNVK-- 1077

Query: 218 DDETQESETPKDEEVSKELETQEKLEIPKEETQEEVKEEIKEEAQEEVKEETQEIKEEKQ 277
TQ +E + +KE +T E E E +E+ K E ++ + K+E+
Sbjct: 1078 -ANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQS 1136

Query: 278 EETQDSPSVQELEAMQELVKEIQENSNGQEDKEETQENAEIPQDKEIQEVVTEKTQVQEL 337
E Q +KE Q +N D E+ + ++ + E T T +
Sbjct: 1137 ETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVV 1196

Query: 338 EIPKEKTQESAEALQETQAQELEKQENAETPQEKEKQEDTETPQDVETPQEE 389
E P+ T + Q T E + + P +
Sbjct: 1197 ENPENTTPATT---QPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRS 1245



Score = 72.8 bits (178), Expect = 2e-15
Identities = 52/262 (19%), Positives = 95/262 (36%), Gaps = 11/262 (4%)

Query: 140 ELENLGDLEALAKEEPNNEEQLLPTLNAQEEKEEVKETPQEEKQEIKETPQKEKQEVKET 199
E+E N Q +E + TP + + V E
Sbjct: 984 EVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAEN 1043

Query: 200 PQKEKQEVKETPQEEKPKDDETQESETPKDEEVSKELETQE----KLEIPKEETQE-EVK 254
++E + V++ E+ + Q E K+ + + + TQ + +ETQ E K
Sbjct: 1044 SKQESKTVEK--NEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETK 1101

Query: 255 EEIKEEAQEEVKEETQEIKEEKQEETQDSPSVQELEAMQELVKEIQENS---NGQEDKEE 311
E E +E+ K ET++ +E + +Q SP ++ E +Q + +EN N +E + +
Sbjct: 1102 ETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQ 1161

Query: 312 TQENAEIPQ-DKEIQEVVTEKTQVQELEIPKEKTQESAEALQETQAQELEKQENAETPQE 370
T A+ Q KE V + E+ E Q E++ P+
Sbjct: 1162 TNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKN 1221

Query: 371 KEKQEDTETPQDVETPQEEKTQ 392
+ ++ P +VE
Sbjct: 1222 RHRRSVRSVPHNVEPATTSSND 1243



Score = 67.4 bits (164), Expect = 8e-14
Identities = 51/277 (18%), Positives = 87/277 (31%), Gaps = 24/277 (8%)

Query: 163 PTLNAQEEKEEVKETPQEEKQEIKETPQKEKQEVKETPQKEKQEVKETPQEEKPKDDETQ 222
T N + + EE + E P + E + + QE K + Q
Sbjct: 998 TTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVA-ENSKQESKTVEKNEQ 1056

Query: 223 ESETPKDEEVSKELETQEKLEIPKEETQEEV---KEEIKEEAQEEVKEETQEIKEEKQ-- 277
++ ++E+ + K + EV E KE E KE KEEK
Sbjct: 1057 DATETT--AQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKV 1114

Query: 278 --EETQDSPSVQELEAMQELVKEIQENSNGQEDKEETQENAEIPQDKEIQEVVTEKTQVQ 335
E+TQ+ P V + ++ E + + + N + PQ + TE+
Sbjct: 1115 ETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQ---- 1170

Query: 336 ELEIPKEKTQESAEALQETQAQELEKQENAETPQEKEKQEDTETPQDVETPQEEKTQEDH 395
P ++T + E E P + T P K + H
Sbjct: 1171 ----PAKETSSNVEQPVTESTTVNTGNSVVENP--ENTTPATTQPTVNSESS-NKPKNRH 1223

Query: 396 YESIEDIP---EPVMAKAMGEELPFLNEAVAETPNSE 429
S+ +P EP + L + + N+
Sbjct: 1224 RRSVRSVPHNVEPATTSSNDRSTVALCDLTSTNTNAV 1260



Score = 52.0 bits (124), Expect = 5e-09
Identities = 45/262 (17%), Positives = 85/262 (32%), Gaps = 25/262 (9%)

Query: 239 QEKLEIPKEETQEEVKEEIKEE-------------AQEEVKEETQEIKEEKQEET--QDS 283
EK + T I+ + E + ET ++S
Sbjct: 985 VEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENS 1044

Query: 284 PSVQELEAMQELVKEIQENSNGQEDKEETQENAEIPQDKEIQEVVTEKTQVQELEIPKEK 343
+ E N + KE Q E+ + +E + Q E +
Sbjct: 1045 KQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETA 1104

Query: 344 TQESAEALQETQAQELEKQENAE-TPQEKEKQEDTETPQDVETPQEEKTQEDHYESIEDI 402
T E E + + + + QE + T Q KQE +ET Q P + D +I++
Sbjct: 1105 TVEKEE---KAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPA---RENDPTVNIKEP 1158

Query: 403 PEPVMAKAMGEELPFLNEAVAETPNSENDTETPKESDIKTPQEKEESDKTSSPLELRLNL 462
A E+ + E P +E+ T S ++ P+ + +
Sbjct: 1159 QSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQP---TVNSES 1215

Query: 463 QDLLKSLNQESLKSLLENKTLS 484
+ K+ ++ S++S+ N +
Sbjct: 1216 SNKPKNRHRRSVRSVPHNVEPA 1237


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_0322  FLGLRINGFLGH  191    3e-63    Flagellar L-ring protein signature.
		>FLGLRINGFLGH#Flagellar L-ring protein signature.

Length = 232

Score = 191 bits (486), Expect = 3e-63
Identities = 52/172 (30%), Positives = 84/172 (48%), Gaps = 18/172 (10%)

Query: 56 GERPLFADRRAMKPNDLITIIVSEKASANYSSS----KDYKSASGGNSTPPRLTYNGLDE 111
G +PLF DRR D +TI++ E SA+ SSS +D K+ G ++ P L GL
Sbjct: 61 GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYL--QGLFG 118

Query: 112 RKRQEAQYLDDKNNYNFTKSSNNTNFKGGGSQKKSEDLEIVLSARIIKVLENGNYFIYGN 171
R + + S F G G S L+ + +VL NGN + G
Sbjct: 119 NARADVEA------------SGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGE 166

Query: 172 KEVLVDGEKQILKVSGVIRPYDIERNNTIQSKFLADAKIEYTNLGHLSDSNK 223
K++ ++ + ++ SGV+ P I +NT+ S +ADA+IEY G+++++
Sbjct: 167 KQIAINQGTEFIRFSGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQN 218


6     HPP12_0432  HPP12_0473  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPP12_0432  -3      11       -3.732640  dihydroorotate dehydrogenase
HPP12_0433  -2      11       -4.009534  polyphosphate kinase
HPP12_0434  0       16       -5.498639  *type I R-M system S protein
HPP12_0435  1       15       -3.966599  type I R-M system M protein
HPP12_0437  4       20       -4.744269  integrase/recombinase XercD family
HPP12_0438  3       21       -3.255334  hypothetical protein
HPP12_0439  3       22       -3.498318  VirB6 type IV secretion protein
HPP12_0440  2       21       -3.832832  hypothetical protein
HPP12_0441  3       20       -3.942004  hypothetical protein
HPP12_0442  3       15       -3.290781  hypothetical protein
HPP12_0443  3       15       -3.083474  hypothetical protein
HPP12_0444  4       15       -3.378939  hypothetical protein
HPP12_0445  4       14       -3.181293  hypothetical protein
HPP12_0446  4       15       -3.983677  hypothetical protein
HPP12_0447  3       15       -3.902822  DNA methylase
HPP12_0448  5       16       -6.149046  chromosome partitioning protein
HPP12_0449  4       19       -6.447134  hypothetical protein
HPP12_0450  3       20       -6.941492  hypothetical protein
HPP12_0451  3       23       -7.775062  relaxase
HPP12_0452  1       22       -6.762714  hypothetical protein
HPP12_0453  3       24       -7.122455  hypothetical protein
HPP12_0454  3       25       -7.685689  VirD4 coupling protein
HPP12_0455  4       28       -9.655915  hypothetical protein
HPP12_0456  5       28       -9.015163  hypothetical protein
HPP12_0457  3       26       -6.745298  hypothetical protein
HPP12_0458  4       19       -5.692425  VirB11 type IV secretion ATPase
HPP12_0459  4       16       -5.506825  hypothetical protein
HPP12_0460  4       16       -5.158520  hypothetical protein
HPP12_0461  5       17       -4.965486  hypothetical protein
HPP12_0462  4       18       -4.924819  VirB10 type IV secretion protein
HPP12_0463  6       17       -5.523195  VirB9 type IV secretion protein
HPP12_0464  5       24       -6.751274  VirB8 type IV secretion protein
HPP12_0465  6       27       -6.680743  VirB7 type IV secretion protein
HPP12_0468  5       26       -7.017772  VirB3 type IV secretion protein
HPP12_0469  0       17       -5.367961  VirB2 type IV secretion protein
HPP12_0470  -1      15       -4.957196  hypothetical protein
HPP12_0471  -2      13       -4.744338  hypothetical protein
HPP12_0472  0       13       -3.746320  hypothetical protein
HPP12_0473  0       13       -3.359718  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_0443  CABNDNGRPT    37     1e-04    NodO calcium binding signature.
		>CABNDNGRPT#NodO calcium binding signature.

Length = 479

Score = 37.3 bits (86), Expect = 1e-04
Identities = 31/133 (23%), Positives = 46/133 (34%), Gaps = 11/133 (8%)

Query: 146 TNDPMYANTPFNNNPNSPNDNAINGKDGAN-----GSNGYGINGNDGINGSSGSNGNNSN 200
N + P + AI GAN G + YG N N + + ++ + +
Sbjct: 232 NETGADYNGHYGGAPMIDDIAAIQRLYGANMTTRTGDSVYGFNSNTDRDFYTATDSSKAL 291

Query: 201 NNAV--GSGIDTDGVLGVDG---VNGSNSSSGGSIGGYENNFTNHGSTNNN-TGGYDNFN 254
+V G DT G +N + S G N HG T N GG N
Sbjct: 292 IFSVWDAGGTDTFDFSGYSNNQRINLNEGSFSDVGGLKGNVSIAHGVTIENAIGGSGNDI 351

Query: 255 NNSSSGGSLGNGG 267
+S ++ GG
Sbjct: 352 LVGNSADNILQGG 364


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_0447  FbpA_PF05833  34     0.008    Fibronectin-binding protein
		>FbpA_PF05833#Fibronectin-binding protein

Length = 577

Score = 34.5 bits (79), Expect = 0.008
Identities = 47/235 (20%), Positives = 90/235 (38%), Gaps = 27/235 (11%)

Query: 1238 EQDYEIIKDFMDKVGENNIHLNEQTLNEYFIH-HPENILGHLSLEKTRY---SSEVNGEQ 1293
++ E+ KD ++ N N T N F+ + N++ +K +Y S +
Sbjct: 229 KEIVEVCKDLFKEIQSNKFEFNCYTKNNSFVGFYCLNLMSKEDYKKIQYDSSSKLLENFY 288

Query: 1294 IYKYELQALEDKSLDLSQALNQAIEKLPKGVYQYHKTTLKTDTLIIDTNNERYQEVQKLI 1353
K + L+ KS DL + + I + K + T K + + + ++ +L+
Sbjct: 289 YAKDKSDRLKSKSSDLQKIVMNNINRCTKKDKILNNTLKKCE------DKDIFKLYGELL 342

Query: 1354 K----NLERG-ELVKWDDLYFQLEQNNERGIFLRPTKINSKAQDSRLKAYFKIKDALNDL 1408
L++G ++ + Y E + I L K S+ S K Y K+K +
Sbjct: 343 TANIYALKKGLSHIELANYY--SENYDTVKITLDENKTPSQNVQSYYKKYNKLKKSEEAA 400

Query: 1409 ------TSAELNPLSS---DLELENKRAKLNLVYDEFVKKFGYLNENKNRKDIKQ 1454
ELN L S ++ + ++ + E ++ GY+ K K K
Sbjct: 401 NEQLLQNEEELNYLYSVLTNINNADNYDEIEEIKKELIET-GYIKFKKIYKSKKS 454


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
HPP12_0464  PF04335  95     1e-24    VirB8 type IV secretion protein
		>PF04335#VirB8 type IV secretion protein

Length = 227

Score = 94.5 bits (235), Expect = 1e-24
Identities = 34/213 (15%), Positives = 70/213 (32%), Gaps = 13/213 (6%)

Query: 144 FEEVRD-ASVIYHLEKKLGDYIFYVACFFFGTTALLIILLIVLLPLKQKVPYLVQFSNNK 202
FEE ++ + VA ++ + L PLK PY++ N
Sbjct: 14 FEEAASWERDKLAAAERSKKLAWVVAGVAGALATAGVVAVAALTPLKTVEPYVITVDRNT 73

Query: 203 ENFALVQ--KADSTITANKALIRSLVGAYVLNRESITHIEQHEKMRQNTIKEQSSNEVWY 260
++ D+TIT ++A+ + + YV RE + + + + S+
Sbjct: 74 GEASIAAKLHGDATITYDEAVRKYFLATYVRYREGWIAAAR--EEYFDAVMVMSARPEQD 131

Query: 261 EFEKLIA-----HYDSIYTNPLLTRKVKIANI-YLDKDLAYIDIEVSLYHSGELESLKRY 314
+ + +I N V+I + +L ++A + +G +
Sbjct: 132 RWSRFYKTDNPQSPQNILAN-RTDVFVEIKRVSFLGGNVAQVYFTKESV-TGSNSTKTDA 189

Query: 315 KVVMSFEFKKQEINFDSMSLNPTGFIVTGYDVT 347
+ ++ NP G+ V Y
Sbjct: 190 VATIKYKVDGTPSKEVDRFKNPLGYQVESYRAD 222


7  HPP12_0483  HPP12_0495  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPP12_0483  -1      13       -3.158803  molybdenum ABC transporter ModA
HPP12_0484  -1      12       -3.531326  molybdenum ABC transporter ModB
HPP12_0485  -1      10       -1.927967  molybdenum ABC transporter ModD
HPP12_0486  -1      9        -1.882599  glutamyl-tRNA synthetase
HPP12_0487  -1      10       -1.940217  outer membrane protein HopJ
HPP12_0488  0       10       -0.406173  type II R-M system methyltransferase
HPP12_0489  3       12       0.959626   DD-heptosyltransferase
HPP12_0490  1       14       0.965316   GTP-binding protein of the TypA subfamily
HPP12_0491  1       19       0.061105   type II R-M system methyltransferase
HPP12_0492  2       19       0.189358   catalase like protein
HPP12_0493  2       18       0.304246   outer membrane protein HofC
HPP12_0494  2       17       -0.757824  outer membrane protein HofD
HPP12_0495  4       20       -1.324253  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
HPP12_0485  PF05272  30     0.008    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.4 bits (68), Expect = 0.008
Identities = 11/23 (47%), Positives = 14/23 (60%)

Query: 30 VVALLGESGAGKSTILRILAGLE 52
V L G G GKST++ L GL+
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGLD 620


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
HPP12_0490  TCRTETOQM  199    4e-58    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family

signature.
Length = 639

Score = 199 bits (508), Expect = 4e-58
Identities = 116/461 (25%), Positives = 190/461 (41%), Gaps = 67/461 (14%)

Query: 3 NIRNIAVIAHVDHGKTTLVDGLLSQSGTFSEREKVDE--RVMDSNDLEKERGITILSKNT 60
I NI V+AHVD GKTTL + LL SG +E VD+ D+ LE++RGITI + T
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 61 AIYYKDTKINIIDTPGHADFGGEVERVLKMVDGVLLLVDAQEGVMPQTKFVVKKALSFGI 120
+ +++TK+NIIDTPGH DF EV R L ++DG +LL+ A++GV QT+ + GI
Sbjct: 62 SFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGI 121

Query: 121 CPIVVVNKIDKPAAEPDRVVDEVFDLF---------VAMGASDKQLDFPV-----VYAAA 166
I +NKID+ + V ++ + V + + +F
Sbjct: 122 PTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEG 181

Query: 167 RDGYAMKSLDDE----------------------------KKNL--EPLFETILEHVPSP 196
D K + + K N+ + L E I S
Sbjct: 182 NDDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVITNKFYSS 241

Query: 197 SGSVDEPLQMQIFTLDYDNYVGKIGIARVFNGSVKKNESVLLMKSDGSKENGRITKLIGF 256
+ L ++F ++Y ++ R+++G + +SV + KE +IT++
Sbjct: 242 THRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRI----SEKEKIKITEMYTS 297

Query: 257 LGLARTEIENAYAGDIVALAG--FNAMDV-GDSVVDPNNPMPLDPMHLEEPTMSVYFAVN 313
+ +I+ AY+G+IV L V GD+ + P +P P + +
Sbjct: 298 INGELCKIDKAYSGEIVILQNEFLKLNSVLGDTKLLPQRERIENP----LPLLQTTVEPS 353

Query: 314 DSPLAGLEGKHVTANKLKDRLLKEMQTNIAMKCEEMGEGKFKVSGRGELQITILAENLRR 373
+ + D LL+ + + +S G++Q+ + L+
Sbjct: 354 KPQQREMLLDALLEISDSDPLLRYYVDSAT--------HEIILSFLGKVQMEVTCALLQE 405

Query: 374 E-GFEFSISRPEVIIKEENGVKCEPFEHLVVDTPQDFSGAI 413
+ E I P VI E K E H+ V P F +I
Sbjct: 406 KYHVEIEIKEPTVIYMERPLKKAEYTIHIEVP-PNPFWASI 445



Score = 41.0 bits (96), Expect = 1e-05
Identities = 19/80 (23%), Positives = 30/80 (37%), Gaps = 1/80 (1%)

Query: 396 EPFEHLVVDTPQDFSGAIIERLGKRKAEMKAMNPMSDGYTRLEFEIPARGLIGYRSEFLT 455
EP+ + PQ++ K A + + + L EIPAR + YRS+
Sbjct: 537 EPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQ-LKNNEVILSGEIPARCIQEYRSDLTF 595

Query: 456 DTKGEGVMNHSFLEFRPFSG 475
T G V + +G
Sbjct: 596 FTNGRSVCLTELKGYHVTTG 615


8  HPP12_0524  HPP12_0553  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPP12_0524  3       15       -2.577617  hypothetical protein
HPP12_0525  6       19       -2.759215  hypothetical protein
HPP12_0526  8       21       -1.979932  hypothetical protein
HPP12_0527  8       20       -1.902007  cag pathogenicity island protein Zeta
HPP12_0528  8       20       -2.156835  cag pathogenicity island protein Epsilon
HPP12_0529  9       17       -1.828058  cag pathogenicity island protein Delta
HPP12_0530  9       18       -2.295282  cag pathogenicity island protein Gamma
HPP12_0531  8       19       -2.423509  cag pathogenicity island protein Beta
HPP12_0532  9       20       -2.900270  cag pathogenicity island protein Alpha
HPP12_0533  9       20       -3.119577  cag pathogenicity island protein Z
HPP12_0534  9       20       -2.952207  cag pathogenicity island protein Y VirB10-like
HPP12_0535  9       26       -4.324833  cag pathogenicity island protein X VirB9-like
HPP12_0536  9       28       -4.310962  cag pathogenicity island protein W
HPP12_0537  12      29       -5.102961  cag pathogenicity island protein V
HPP12_0538  11      30       -5.542405  cag pathogenicity island protein U
HPP12_0539  11      26       -5.512279  cag pathogenicity island protein T
HPP12_0540  9       24       -6.040645  cag pathogenicity island protein S
HPP12_0541  7       23       -5.491843  cag pathogenicity island protein Q
HPP12_0542  6       20       -4.485662  cag pathogenicity island protein R
HPP12_0543  6       19       -3.137333  cag pathogenicity island protein P
HPP12_0544  6       18       -2.720369  cag pathogenicity island protein M
HPP12_0545  6       20       -2.971658  cag pathogenicity island protein N
HPP12_0546  5       19       -2.895310  cag pathogenicity island protein L
HPP12_0547  5       20       -3.249880  cag pathogenicity island protein I
HPP12_0548  6       20       -3.215840  cag pathogenicity island protein H
HPP12_0549  7       21       -4.274510  cag pathogenicity island protein G
HPP12_0550  6       21       -3.324778  cag pathogenicity island protein F
HPP12_0551  4       19       -2.424586  cag pathogenicity island protein E VirB4-like
HPP12_0552  2       17       -0.882090  cag pathogenicity island protein D
HPP12_0553  2       17       -0.173623  cag pathogenicity island protein C
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_0527  TYPE3IMSPROT  29     0.004    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein

family signature.
Length = 354

Score = 29.0 bits (65), Expect = 0.004
Identities = 13/68 (19%), Positives = 24/68 (35%), Gaps = 1/68 (1%)

Query: 18 NAFVNFFKNSLADKRYDSLGLIGAGVLCCVLSGAMGIVGIIFVAIGIFLSFSNINLVKLI 77
A N L + Y L+ L + S + G + I IN ++
Sbjct: 70 QALSYVVDNVLLEFFYLCFPLLTVAALMAIASHVV-QYGFLISGEAIKPDIKKINPIEGA 128

Query: 78 EKLFKKQS 85
+++F +S
Sbjct: 129 KRIFSIKS 136


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
HPP12_0529  PF07201  30     0.023    Hypersensitivity response secretion protein HrpJ
		>PF07201#Hypersensitivity response secretion protein HrpJ

Length = 293

Score = 29.8 bits (67), Expect = 0.023
Identities = 14/76 (18%), Positives = 26/76 (34%), Gaps = 15/76 (19%)

Query: 277 APENSKEKLIEELIANSQLIANEEEREKKLLAEKEKQ--------EAELAKY--KLKDLE 326
S + EE+ E +E L K E ++ +Y K+ +LE
Sbjct: 44 GTLQSIADMAEEVTF-----VFSERKELSLDKRKLSDSQARVSDVEEQVNQYLSKVPELE 98

Query: 327 NQKKLKALEAELKKKN 342
++ + L + L
Sbjct: 99 QKQNVSELLSLLSNSP 114


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
HPP12_0530  TACYTOLYSIN  27     0.039    Bacterial thiol-activated pore-forming cytolysin sig...
		>TACYTOLYSIN#Bacterial thiol-activated pore-forming cytolysin

signature.
Length = 574

Score = 27.3 bits (60), Expect = 0.039
Identities = 12/43 (27%), Positives = 22/43 (51%), Gaps = 3/43 (6%)

Query: 128 NKSVYQLVEMAIGAYNGG-MKHDPNGAYVKKFRCIYSQVRYNE 169
N+S Y VE Y G + GAYV ++ ++ ++ Y++
Sbjct: 451 NRSEY--VETTSTEYTSGKINLSHQGAYVAQYEILWDEINYDD 491


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
HPP12_0534  IGASERPTASE  39     2e-04    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 39.3 bits (91), Expect = 2e-04
Identities = 36/223 (16%), Positives = 85/223 (38%), Gaps = 25/223 (11%)

Query: 579 TPEAKKLLEEEAKESVKAYLDCVSQAKTEAEKKECEKLLTPEAKKLLEEAKKSVKAYLDC 638
P ++ + + ++KT + ++ T + +++ +EAK +VKA
Sbjct: 1023 APVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQT 1082

Query: 639 VSQAKNEAEKKECEKLLTPEAKKLLEQQALDCLKNAKTEADKKRCVKDLPKDLQKKVLAK 698
A++ +E KE + T E + +++ AK E +K + V KV ++
Sbjct: 1083 NEVAQSGSETKETQTTETKETATVEKEE------KAKVETEKTQEV--------PKVTSQ 1128

Query: 699 ESVKAYLDCVSQAKNEAEKKECEKLLTPEARKLLEE--AKESVKAYKDCVSKARNEKEKK 756
S K Q ++E + + E + ++E ++ + A + +K + ++
Sbjct: 1129 VSPK-------QEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQ 1181

Query: 757 ECEKLLTPEAKKLLEEEAKESVKAYLDCVSQAKTEAEKKECEK 799
+ T + E + + A +E+ K +
Sbjct: 1182 PVTESTTVNTGNSVVENPENTTPA--TTQPTVNSESSNKPKNR 1222



Score = 38.9 bits (90), Expect = 3e-04
Identities = 36/229 (15%), Positives = 83/229 (36%), Gaps = 10/229 (4%)

Query: 763 TPEAKKLLEEEAKESVKAYLDCVSQAKTEAEKKECEKLLTPEARKLLEEAKESLKAYKDC 822
P ++ + + ++KT + ++ T + R++ +EAK ++KA
Sbjct: 1023 APVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQT 1082

Query: 823 LSQARNEEERRACEKLLTPEARKLLEQQALDCLKNAKTDEERKKCLKDLPKDLQSDILAK 882
A++ E + + T E + +++ KT E K + PK QS+ +
Sbjct: 1083 NEVAQSGSETKETQTTETKETATVEKEEKAKVET-EKTQEVPKVTSQVSPKQEQSETVQP 1141

Query: 883 --------ESVKAYKDCVSQAKTEAEKKECEKLLTPEAKKLLEEAKKSVKAYLDCVSQAK 934
+ K+ SQ T A+ ++ K + ++ + E+ +
Sbjct: 1142 QAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPEN 1201

Query: 935 TEAEKKECEKLLTPEAKKLLEEAKKSVKAYLDCVSQAKNEAEKKECEKL 983
T + + + + K ++SV++ V A + + L
Sbjct: 1202 TTPATTQ-PTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVAL 1249



Score = 37.4 bits (86), Expect = 7e-04
Identities = 34/196 (17%), Positives = 82/196 (41%), Gaps = 6/196 (3%)

Query: 881 AKESVKAYKDCVSQAKTEAEKKECEKLLTPEAKKLLEEAKKSVKAYLDCVSQAKTEAEKK 940
++ + ++ ++KT + ++ T + +++ +EAK +VKA A++ +E K
Sbjct: 1034 SETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETK 1093

Query: 941 ECEKLLTPEAKKLLEEAKKSVKA--YLDCVSQAKNEAEKKECEKLLTPEAKKLLEQQALD 998
E + T E + +E K V+ + + K+E + + P+A+ E
Sbjct: 1094 ETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTV 1153

Query: 999 CLKNAK----TEADKKRCVKDLPKDLQKKVLAKESVKAYLDCVSQAKNEAEKKECEKLLT 1054
+K + T AD ++ K+ ++++ V +V V +N + +
Sbjct: 1154 NIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNS 1213

Query: 1055 PEARKLLEEAKESLKA 1070
+ K + S+++
Sbjct: 1214 ESSNKPKNRHRRSVRS 1229



Score = 36.6 bits (84), Expect = 0.001
Identities = 32/173 (18%), Positives = 70/173 (40%), Gaps = 23/173 (13%)

Query: 894 QAKTEAEKKECEKLLTPEAKKLLEEAKKSVKAYLDCVSQAKTEAEKKECEKLLTPEAKKL 953
+ E + E + P A ++ + + ++KT + ++ T + +++
Sbjct: 1011 PSNNEEIARVDEAPVPPPAPAT--PSETTETVAENSKQESKTVEKNEQDATETTAQNREV 1068

Query: 954 LEEAKKSVKAYLDCVSQAKNEAEKKECEKLLTPEAKKLLEQQALDCLKNAKTEADKKRCV 1013
+EAK +VKA A++ +E KE + T E + +++ AK E +K + V
Sbjct: 1069 AKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEE------KAKVETEKTQEV 1122

Query: 1014 KDLPKDLQKKVLAKESVKAYLDCVSQAKNEAEKKECEKLLTPEARKLLEEAKE 1066
KV ++ S K Q ++E + + E + ++E +
Sbjct: 1123 --------PKVTSQVSPK-------QEQSETVQPQAEPARENDPTVNIKEPQS 1160



Score = 34.7 bits (79), Expect = 0.005
Identities = 30/169 (17%), Positives = 62/169 (36%), Gaps = 7/169 (4%)

Query: 708 VSQAKNEAEKKECEKLLTPEARKLLEEAKESVKAYKDCVSKARNEKEKKECEKLLTPEAK 767
S+ + E+ E T + R++ +EAK +VKA A++ E KE + T E K
Sbjct: 1047 ESKTVEKNEQDATET--TAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQ---TTETK 1101

Query: 768 KLLEEEAKESVKAYLDCVSQAKTEAEKKECEKLLTPEARKLLEEAKESLKAYKDCLSQAR 827
+ E +E K + + + ++ + + E A+E+ + + +
Sbjct: 1102 ETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVN--IKEPQ 1159

Query: 828 NEEERRACEKLLTPEARKLLEQQALDCLKNAKTDEERKKCLKDLPKDLQ 876
++ A + E +EQ + + + P Q
Sbjct: 1160 SQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQ 1208



Score = 32.7 bits (74), Expect = 0.016
Identities = 41/246 (16%), Positives = 84/246 (34%), Gaps = 17/246 (6%)

Query: 710 QAKNEAEKKECEKLLTPEARKLLEEAKESVKAYKDCVSKARNEKEKKECEKLLTPEAKKL 769
+ NE + E + P A E E+V SK + E+ E +
Sbjct: 1011 PSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQ---NRE 1067

Query: 770 LEEEAKESVKAYLDC--VSQAKTEA------EKKECEKLLTPEARKLL--EEAKESLKAY 819
+ +EAK +VKA V+Q+ +E E KE + E K+ + +
Sbjct: 1068 VAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTS 1127

Query: 820 KDCLSQARNEEERRACEKLLTPEARKLLEQQALDCLKNAKTDEERKKCLKDLPKDLQSDI 879
+ Q ++E + E + +++ A T++ K+ ++ + +
Sbjct: 1128 QVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTES- 1186

Query: 880 LAKESVKAYKDCVSQAKTEAEKKECEKLLTPEAKKLLEEAKKSVKAYLDCVSQAKTEAEK 939
+V V + + + + K ++SV++ V A T +
Sbjct: 1187 ---TTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSND 1243

Query: 940 KECEKL 945
+ L
Sbjct: 1244 RSTVAL 1249



Score = 32.3 bits (73), Expect = 0.022
Identities = 27/187 (14%), Positives = 67/187 (35%), Gaps = 4/187 (2%)

Query: 495 KARNEKEKKECEKLLTPEAKKKLEQQVLDCLKNAKTDEERKKCLKDLPKD--LQSDILAK 552
+ NE+ + E + P A + +N+K + + + + + Q+ +AK
Sbjct: 1011 PSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAK 1070

Query: 553 ESVKAYKDCVSQAKTEAEKKECEKLLTPEAKKLLEEEAKESVKAYLDCVSQAKTEAEKKE 612
E+ K + E ++ T E K+ E +E K + + +
Sbjct: 1071 EAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVS 1130

Query: 613 CEKLLTPEAKKLLEEAKKSVKAYL--DCVSQAKNEAEKKECEKLLTPEAKKLLEQQALDC 670
++ + + E A+++ + SQ A+ ++ K + ++ + +
Sbjct: 1131 PKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVN 1190

Query: 671 LKNAKTE 677
N+ E
Sbjct: 1191 TGNSVVE 1197



Score = 32.0 bits (72), Expect = 0.029
Identities = 30/178 (16%), Positives = 61/178 (34%), Gaps = 6/178 (3%)

Query: 1037 VSQAKNEAEKKECEKLLTPEARKLLEEAKESLKAYKDCLSQARNEEERRACEKLLTPEAR 1096
S+ + E+ E T + R++ +EAK ++KA A++ E + + T E
Sbjct: 1047 ESKTVEKNEQDATET--TAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETA 1104

Query: 1097 KLLEQEVKKSVKAYLDCVSR-ARNEKEKQECEKLLTPEARKFLAKQVLNCLEKAGNEEER 1155
+ ++E K V + KQE + + P+A +++ ++
Sbjct: 1105 TVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNT 1164

Query: 1156 KACLKNLPKDLQENVLAKESLKAYKDCLSQ-ARNEEERRACEKLLTPEARKLLEQEVK 1212
A + K+ NV + + + N E P + K
Sbjct: 1165 TADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATT--QPTVNSESSNKPK 1220


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
HPP12_0535  TYPE4SSCAGX  870    0.0      Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein

signature.
Length = 522

Score = 870 bits (2249), Expect = 0.0
Identities = 511/522 (97%), Positives = 514/522 (98%)

Query: 1 MEKAFFKKIVGCFCLGYLFLSSVIEAAAPDIKNFNRGRVKVVNKKIAYLGDEKPITIWTS 60
M +AFFKKIVGCFCLGYLFLSS IEA A DIKNFNRGRVKVVNKKIAYLGDEKPITIWTS
Sbjct: 1 MGQAFFKKIVGCFCLGYLFLSSAIEAVALDIKNFNRGRVKVVNKKIAYLGDEKPITIWTS 60

Query: 61 LDNVTVIQLEKDETISYITTGFNKGWSIVPNSNHIFIQPKSVKSNLMFEKEAVNFALMTR 120
LDNVTVIQLEKDETISYITTGFNKGWSIVPNSNHIFIQPKSVKSNLMFEKEAVNFALMTR
Sbjct: 61 LDNVTVIQLEKDETISYITTGFNKGWSIVPNSNHIFIQPKSVKSNLMFEKEAVNFALMTR 120

Query: 121 DYQEFLKTKKLIVDAPDPKELEEQKKALEKEKEAKEQAQKAQKDKREKRKEERAKNRANL 180
DYQEFLKTKKLIVDAPDPKELEEQKKALEKEKEAKEQAQKAQKDKREKRKEERAKNRANL
Sbjct: 121 DYQEFLKTKKLIVDAPDPKELEEQKKALEKEKEAKEQAQKAQKDKREKRKEERAKNRANL 180

Query: 181 ENLTNAMSNPQNLSNNKNLSEFIKQQRENELDQMERLEDMQEQAQANALKQIEELNKKQA 240
ENLTNAMSNPQNLSNNKNLSE IKQQRENELDQMERLEDMQEQAQANALKQIEELNKKQA
Sbjct: 181 ENLTNAMSNPQNLSNNKNLSELIKQQRENELDQMERLEDMQEQAQANALKQIEELNKKQA 240

Query: 241 EETIKQRAKDKISIKTDKPQKSPEDNSIELSPSDSAWRTNLVVRTNKALYQFILRIAQKD 300
EE ++QRAKDKISIKTDK QKSPEDNSIELSPSDSAWRTNLVVRTNKALYQFILRIAQKD
Sbjct: 241 EEAVRQRAKDKISIKTDKSQKSPEDNSIELSPSDSAWRTNLVVRTNKALYQFILRIAQKD 300

Query: 301 NFASAYLTVKLEYPQRHEVSSVIEEELKKREEAKRQRELIKQENLNTTAYINRVMMASNE 360
NFASAYLTVKLEYPQRHEVSSVIEEELKKREEAKRQRELIKQENLNTTAYINRVMMASNE
Sbjct: 301 NFASAYLTVKLEYPQRHEVSSVIEEELKKREEAKRQRELIKQENLNTTAYINRVMMASNE 360

Query: 361 QIINKEKIREEKQKIILDQAKALETQYVHNALKRNPVPRNYNYYQAPEKRSKHIMPSEIF 420
QIINKEKIREEKQKIILDQAKALETQYVHNALKRNPVPRNYNYYQAPEKRSKHIMPSEIF
Sbjct: 361 QIINKEKIREEKQKIILDQAKALETQYVHNALKRNPVPRNYNYYQAPEKRSKHIMPSEIF 420

Query: 421 DDGTFTYFGFKNITLQPAIFVVQPDGKLSMTDAAIDPNMTNLGLRWYRVNEIAEKFKLIK 480
DDGTFTYFGFKNITLQPAIFVVQPDGKLSMTDAAIDPNMTN GLRWYRVNEIAEKFKLIK
Sbjct: 421 DDGTFTYFGFKNITLQPAIFVVQPDGKLSMTDAAIDPNMTNSGLRWYRVNEIAEKFKLIK 480

Query: 481 DKALVTVINKGYGKNPLTKNYNIKNYGELERVIKKLPLVRDK 522
DKALVTVINKGYGKNPLTKNYNIKNYGELERVIKKLPLVRDK
Sbjct: 481 DKALVTVINKGYGKNPLTKNYNIKNYGELERVIKKLPLVRDK 522


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
HPP12_0537  PF04335  118    8e-35    VirB8 type IV secretion protein
		>PF04335#VirB8 type IV secretion protein

Length = 227

Score = 118 bits (297), Expect = 8e-35
Identities = 44/205 (21%), Positives = 74/205 (36%), Gaps = 10/205 (4%)

Query: 27 KLNKANRTFKRAFYL---SMALNVAAVTSIVMMMPLKKTDIFVYGIDRYTGEFKIVKRSD 83
KL A R+ K A+ + + AL A V ++ + PLK + +V +DR TGE I +
Sbjct: 24 KLAAAERSKKLAWVVAGVAGALATAGVVAVAALTPLKTVEPYVITVDRNTGEASIAAKLH 83

Query: 84 A-RQIVNSEAVVDSATSKFVSLLFGYSKNSLRDRKDQLMQYCDVSFQTQAMRMFNENIRQ 142
I EAV + +V G+ + + D +M Q + R + + Q
Sbjct: 84 GDATITYDEAVRKYFLATYVRYREGWIAAAREEYFDAVMVMSARPEQDRWSRFYKTDNPQ 143

Query: 143 FVDKVRA-EAIISSNIQREKVKNSPLTRLTFFITIKITPDTMENYEYITKKQITIYYDFA 201
+ A + I + +F +T T TI Y
Sbjct: 144 SPQNILANRTDVFVEI-KRVSFLGGNVAQVYFTKESVTGSNS----TKTDAVATIKYKVD 198

Query: 202 RGNSSQENLIINPFGFKVFDIQITD 226
S + + NP G++V +
Sbjct: 199 GTPSKEVDRFKNPLGYQVESYRADV 223


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
HPP12_0545  TYPE4SSCAGX  30     0.013    Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein

signature.
Length = 522

Score = 30.1 bits (67), Expect = 0.013
Identities = 33/119 (27%), Positives = 52/119 (43%), Gaps = 16/119 (13%)

Query: 24 AINTALLPSEYKELAALGFKKIKTLHQRHGDEEITEEEKKFATNALREKLRNDRARAEQI 83
A+N AL+ +Y+E F K K L D + EE+KK L ++ EQ
Sbjct: 112 AVNFALMTRDYQE-----FLKTKKLIVDAPDPKELEEQKK--------ALEKEKEAKEQA 158

Query: 84 QKNIEAFEKKNNSSVQKKAAKHKGLQELNEINATPLNDNPNSNSSAETKSNKDDNFDEM 142
QK A + K +++A L+ L + P N + N N S K +++ D+M
Sbjct: 159 QK---AQKDKREKRKEERAKNRANLENLTNAMSNPQNLSNNKNLSELIKQQRENELDQM 214


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_0551  ACRIFLAVINRP  33     0.008    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 32.9 bits (75), Expect = 0.008
Identities = 20/88 (22%), Positives = 32/88 (36%), Gaps = 18/88 (20%)

Query: 19 EVQKRQFQKIEELKADMQKGVNPFFKVLFDGGNRLFGFPETFIYSSI-------FILFVT 71
+ K K+ EL+ +G+ +D F+ SI F +
Sbjct: 301 DTAKAIKAKLAELQPFFPQGMK--VLYPYD--------TTPFVQLSIHEVVKTLFEAIML 350

Query: 72 IVLSVILF-QAYEPVLIVAIVIVLVALG 98
+ L + LF Q LI I + +V LG
Sbjct: 351 VFLVMYLFLQNMRATLIPTIAVPVVLLG 378


9  HPP12_0627  HPP12_0647  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPP12_0627  1       13       -3.470901  adenylate kinase
HPP12_0628  2       15       -3.670912  lipopolysaccharide biosynthesis protein
HPP12_0629  -1      12       -3.004025  inorganic pyrophosphatase
HPP12_0630  0       13       -3.293044  hypothetical protein
HPP12_0631  0       12       -2.523981  hypothetical protein
HPP12_0632  1       13       -1.018278  hypothetical protein
HPP12_0633  0       12       -0.148150  hypothetical protein
HPP12_0634  0       11       0.179555   DNA mismatch repair protein
HPP12_0635  -1      10       1.511009   hypothetical protein
HPP12_0636  -2      9        1.371280   UDP-N-acetylmuramate-alanine ligase
HPP12_0637  -2      11       2.265194   N-succinyldiaminopimelate aminotransferase
HPP12_0638  -1      12       3.256617   1-hydroxy-2-methyl-2-(e)-butenyl 4-diphosphate
HPP12_0639  -1      13       3.246411   tetrahydrodipicolinate (THDP)
HPP12_0640  0       13       3.203571   cysteine-rich protein F
HPP12_0641  -1      12       2.993133   hypothetical protein
HPP12_0642  -1      14       2.649927   modulator of drug activity B
HPP12_0643  -1      13       1.940768   quinone-reactive Ni/Fe hydrogenase HydA
HPP12_0644  -1      13       0.928452   quinone-reactive Ni/Fe hydrogenase HydB
HPP12_0645  0       16       -0.553099  quinone-reactive Ni/Fe hydrogenase HydC
HPP12_0646  1       14       -0.414042  quinone-reactive Ni/fe hydrogenase HydD
HPP12_0647  3       12       -1.879469  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
HPP12_0628  BACINVASINB  31     0.009    Salmonella/Shigella invasin protein B signature.
		>BACINVASINB#Salmonella/Shigella invasin protein B signature.

Length = 593

Score = 31.3 bits (70), Expect = 0.009
Identities = 13/51 (25%), Positives = 24/51 (47%)

Query: 207 KIKRKLNRFIGSILARTEVYKNVVAKYDDLTKKYDDLTKKYDELTGKYESL 257
++ ++ +G T++Y+ + K D YD TKK + K +SL
Sbjct: 124 QVSKEFQTALGEAQEATDLYEASIKKTDTAKSVYDAATKKLTQAQNKLQSL 174


10  HPP12_0692  HPP12_0707  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPP12_0692  3       20       1.930141   hypothetical protein
HPP12_0693  5       15       1.568170   hypothetical protein
HPP12_0694  2       11       0.758113   hypothetical protein
HPP12_0695  1       11       0.539066   hypothetical protein
HPP12_0696  0       9        1.170083   UDP-N-acetylglucosamine pyrophosphorylase
HPP12_0697  0       10       1.171827   flagellar biosynthetic protein
HPP12_0698  1       10       1.750225   iron(III) dicitrate transport protein
HPP12_0699  -2      12       2.131306   iron(II) transport protein
HPP12_0700  1       14       2.514451   hypothetical protein
HPP12_0701  3       12       3.552239   acetyl coenzyme A acetyltransferase
HPP12_0702  4       13       2.943264   succinyl-coa-transferase subunit A
HPP12_0703  3       13       2.092632   succinyl-coa-transferase subunit B
HPP12_0704  2       13       1.660131   short-chain fatty acids transporter
HPP12_0705  1       10       1.671167   outer membrane protein
HPP12_0707  2       10       1.619434   N-methylhydantoinase B
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
HPP12_0693  PF07132  33     1e-04    Harpin protein (HrpN)
		>PF07132#Harpin protein (HrpN)

Length = 356

Score = 33.1 bits (75), Expect = 1e-04
Identities = 19/45 (42%), Positives = 31/45 (68%)

Query: 37 IGEGVGAGMGGAMGGMIGALGGPWGTVFGAGIGGGIGAYSGAEIG 81
+G +G G+GG +GG+ +LGG G + G G+GGG+G+ G+ +G
Sbjct: 61 MGSMMGGGLGGGLGGLGSSLGGLGGGLLGGGLGGGLGSSLGSGLG 105



Score = 30.4 bits (68), Expect = 0.001
Identities = 17/50 (34%), Positives = 27/50 (54%)

Query: 33 LGRDIGEGVGAGMGGAMGGMIGALGGPWGTVFGAGIGGGIGAYSGAEIGD 82
+G +G G+G G+GG + G GG G G G+G +G+ G+ +G
Sbjct: 61 MGSMMGGGLGGGLGGLGSSLGGLGGGLLGGGLGGGLGSSLGSGLGSALGG 110


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_0697  FLGBIOSNFLIP  279    2e-97    Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP

signature.
Length = 245

Score = 279 bits (716), Expect = 2e-97
Identities = 115/246 (46%), Positives = 164/246 (66%), Gaps = 3/246 (1%)

Query: 12 ILRFFIFLILICPLICPLMSADSALPSVNLSLNAPNDPKQLVTTLNVIALLTLLVLAPSL 71
+ R ++ LI PL A + LP + S P + + + +T L P++
Sbjct: 1 MRRLLSVAPVLLWLITPL--AFAQLPGIT-SQPLPGGGQSWSLPVQTLVFITSLTFIPAI 57

Query: 72 ILVMTSFTRLIVVFSFLRTALGTQQTPPTQILVSLSLILTFFIMEPSLKKAYDTGIKPYM 131
+L+MTSFTR+I+VF LR ALGT PP Q+L+ L+L LTFFIM P + K Y +P+
Sbjct: 58 LLMMTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFS 117

Query: 132 DKKISYTEAFEKSALPFKEFMLKNTREKDLALFFRIRNLPNPKTPDEVSLSVLIPAFMIS 191
++KIS EA EK A P +EFML+ TRE DL LF R+ N + P+ V + +L+PA++ S
Sbjct: 118 EEKISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTS 177

Query: 192 ELKTAFQIGFLLYLPFLVIDMVISSILMAMGMMMLPPVMISLPFKILVFILVDGFNLLTE 251
ELKTAFQIGF +++PFL+ID+VI+S+LMA+GMMM+PP I+LPFK+++F+LVDG+ LL
Sbjct: 178 ELKTAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVG 237

Query: 252 NLVASF 257
+L SF
Sbjct: 238 SLAQSF 243


11  HPP12_0723  HPP12_0739  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPP12_0723  3       14       -0.622577  RNA polymerase sigma-54 factor
HPP12_0724  1       13       0.275280   ABC transporter ATP-binding protein
HPP12_0725  0       11       0.351045   hypothetical protein
HPP12_0726  0       10       1.220235   DNA polymerase III gamma and tau subunits
HPP12_0727  2       12       2.782959   integral membrane protein
HPP12_0728  2       15       2.715897   hypothetical protein
HPP12_0729  2       14       2.529693   hypothetical protein
HPP12_0730  1       13       2.409852   outer membrane protein SabB/HopO
HPP12_0731  0       12       1.991305   L-asparaginase type II
HPP12_0732  -1      11       0.772472   anaerobic C4-dicarboxylate transport protein
HPP12_0733  -2      10       -0.706523  outer membrane protein SabB/HopO
HPP12_0734  0       13       -1.532240  outer membrane protein
HPP12_0735  1       13       -2.662951  transcriptional regulator
HPP12_0736  2       16       -3.944714  tRNA(Ile)-lysidine synthetase
HPP12_0737  3       19       -4.243100  ATP/GTP binding protein
HPP12_0738  2       12       -2.944276  ATP/GTP binding protein
HPP12_0739  2       12       -2.479654  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
HPP12_0726  TYPE4SSCAGX  33     0.004    Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein

signature.
Length = 522

Score = 32.8 bits (74), Expect = 0.004
Identities = 33/113 (29%), Positives = 50/113 (44%), Gaps = 6/113 (5%)

Query: 459 KSIVDGVFGKGENIKIALKNQNKSALEVVKELKFPYSKPKPTTETTAEM-KEKETKEVAE 517
KS+ + + E + AL ++ K+L PK E + KEKE KE A+
Sbjct: 100 KSVKSNLMFEKEAVNFALMTRDYQEFLKTKKLIVDAPDPKELEEQKKALEKEKEAKEQAQ 159

Query: 518 KETKEKETKEKEVQENDTKEVQE-TQPKEAPTALQEFMANHSNLIEEIKSEFE 569
K K+K K KE + + ++ T P L +N+ NL E IK + E
Sbjct: 160 KAQKDKREKRKEERAKNRANLENLTNAMSNPQNL----SNNKNLSELIKQQRE 208


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits  Score  E-value  Comments
HPP12_0728  SECA  28     0.013    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 28.3 bits (63), Expect = 0.013
Identities = 12/43 (27%), Positives = 23/43 (53%), Gaps = 2/43 (4%)

Query: 71 RIARKNLSKMSEEDFKKMREEVRK--ELEEKTKGLSDEEIKAK 111
++ K ++ ++MR+ V +E + + LSDEE+K K
Sbjct: 4 KLLTKVFGSRNDRTLRRMRKVVNIINAMEPEMEKLSDEELKGK 46


12  HPP12_0884  HPP12_0896  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPP12_0884  1       11       3.008011   vacuolating cytotoxin VacA
HPP12_0885  -1      17       2.263906   iron(III) dicitrate ABC transporter FecE
HPP12_0886  1       15       3.764994   iron(III) dicitrate ABC transporter FecD
HPP12_0887  1       12       2.275640   short-chain oxidoreductase
HPP12_0888  2       16       2.276369   acyl coenzyme A thioesterase
HPP12_0889  1       17       2.589803   hypothetical protein
HPP12_0890  0       18       3.004879   hypothetical protein
HPP12_0891  0       18       3.244375   outer membrane protein BabA
HPP12_0892  1       20       1.526184   *hypothetical protein
HPP12_0893  1       18       2.049444   *hypothetical protein
HPP12_0894  0       16       1.809070   hydrogenase expression/formation protein HypD
HPP12_0895  1       16       1.305423   hydrogenase expression/formation protein HypC
HPP12_0896  2       19       0.914151   hydrogenase expression/formation protein HypB
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_0884  VACCYTOTOXIN  2049   0.0      Helicobacter pylori vacuolating cytotoxin signature.
		>VACCYTOTOXIN#Helicobacter pylori vacuolating cytotoxin signature.

Length = 1291

Score = 2049 bits (5311), Expect = 0.0
Identities = 1221/1296 (94%), Positives = 1258/1296 (97%), Gaps = 5/1296 (0%)

Query: 1 MEIQQTHRKINRPLVSLALVGALVSITPQQSHAAFFTTVIIPAIVGGIASGAAVGTVSGL 60
MEIQQTHRKINRPLVSLALVGALVSITPQQSHAAFFTTVIIPAIVGGIA+GAAVGTVSGL
Sbjct: 1 MEIQQTHRKINRPLVSLALVGALVSITPQQSHAAFFTTVIIPAIVGGIATGAAVGTVSGL 60

Query: 61 LGWGLKQAEEANKTPDKPDKVWRIQAGKGFNEFPNKEYDLYRSLLSSKIDGGWDWGNAAT 120
LGWGLKQAEEANKTPDKPDKVWRIQAGKGFNEFPNKEYDLY+SLLSSKIDGGWDWGNAA
Sbjct: 61 LGWGLKQAEEANKTPDKPDKVWRIQAGKGFNEFPNKEYDLYKSLLSSKIDGGWDWGNAAR 120

Query: 121 HYWVKGGQWNKLEVDMKDAVGTYNLSGLRNFTGGDLDVNMQKATLRLGQFNGNSFTSYKD 180
HYWVK GQWNKLEVDM++AVGTYNLSGL NFTGGDLDVNMQKATLRLGQFNGNSFTSYKD
Sbjct: 121 HYWVKDGQWNKLEVDMQNAVGTYNLSGLINFTGGDLDVNMQKATLRLGQFNGNSFTSYKD 180

Query: 181 SADRTTRVDFNAKNISIDNFLEINNRVGSGAGRKASSTVLTLQASEGITSSKNAEISLYD 240
SADRTTRVDFNAKNI IDNFLEINNRVGSGAGRKASSTVLTLQASEGITS +NAEISLYD
Sbjct: 181 SADRTTRVDFNAKNILIDNFLEINNRVGSGAGRKASSTVLTLQASEGITSRENAEISLYD 240

Query: 241 GATLNLASSSVKLMGNVWMGRLQYVGAYLAPSYSTINTSKVTGEVNFNHLTVGDRNAAQA 300
GATLNLAS+SVKLMGNVWMGRLQYVGAYLAPSYSTINTSKVTGEVNFNHLTVGD NAAQA
Sbjct: 241 GATLNLASNSVKLMGNVWMGRLQYVGAYLAPSYSTINTSKVTGEVNFNHLTVGDHNAAQA 300

Query: 301 GIIASNKTHIGTLDLWQSAGLNIIAPPEGGYKDKPNNTPSQSGAKNDKNESAKNDKQESS 360
GIIASNKTHIGTLDLWQSAGLNIIAPPEGGYKDKPN+ PS + N AKNDKQESS
Sbjct: 301 GIIASNKTHIGTLDLWQSAGLNIIAPPEGGYKDKPNDKPSNTTQNN-----AKNDKQESS 355

Query: 361 QNNSNTQVINPPNSAQKTEVQPTQVIDGPFAGGKDTVVNINRINTNADGTIRVGGYKASL 420
QNNSNTQVINPPNSAQKTE+QPTQVIDGPFAGGK+TVVNINRINTNADGTIRVGG+KASL
Sbjct: 356 QNNSNTQVINPPNSAQKTEIQPTQVIDGPFAGGKNTVVNINRINTNADGTIRVGGFKASL 415

Query: 421 TTNAAHLHIGKGGVNLSNQASGRTLLVENLTGNITVDGPLRVNNQVGGYALAGSSANFEF 480
TTNAAHLHIGKGG+NLSNQASGR+LLVENLTGNITVDGPLRVNNQVGGYALAGSSANFEF
Sbjct: 416 TTNAAHLHIGKGGINLSNQASGRSLLVENLTGNITVDGPLRVNNQVGGYALAGSSANFEF 475

Query: 481 KAGTDTKNGTATFNNDISLGRFVNLKVDAHTANFKGIDTGNGGFNTLDFSGVTDKVNINK 540
KAGTDTKNGTATFNNDISLGRFVNLKVDAHTANFKGIDTGNGGFNTLDFSGVT+KVNINK
Sbjct: 476 KAGTDTKNGTATFNNDISLGRFVNLKVDAHTANFKGIDTGNGGFNTLDFSGVTNKVNINK 535

Query: 541 LITASTNVAIKNFNINELLVKTNGVSVGEYTHFSEDIGSQSRINTVRLETGTRSIFSGGV 600
LITASTNVA+KNFNINEL+VKTNGVSVGEYTHFSEDIGSQSRINTVRLETGTRSI+SGGV
Sbjct: 536 LITASTNVAVKNFNINELVVKTNGVSVGEYTHFSEDIGSQSRINTVRLETGTRSIYSGGV 595

Query: 601 KFKGGEKLVINDFYYAPWNYFDARNIKNVEITNKLAFGPQGSPWGTSKLMFNNLTLGQNA 660
KFKGGEKLVINDFYYAPWNYFDARNIKNVEITNKLAFGPQGSPWGT+KLMFNNLTLGQNA
Sbjct: 596 KFKGGEKLVINDFYYAPWNYFDARNIKNVEITNKLAFGPQGSPWGTAKLMFNNLTLGQNA 655

Query: 661 VMDYSQFSNLTIQGDFVNNQGTINYLVRGGQVATLNVGNAAAMFFNNNVDSATGFYQPLM 720
VMDYSQFSNLTIQGDFVNNQGTINYLVRGGQVATLNVGNAAAMFF+NNVDSATGFYQPLM
Sbjct: 656 VMDYSQFSNLTIQGDFVNNQGTINYLVRGGQVATLNVGNAAAMFFSNNVDSATGFYQPLM 715

Query: 721 KINSAQDLIKNKEHVLLKAKIIGYGNVSAGTNSISNVNLIEQFKERLALYEHNNRMDICV 780
KINSAQDLIKNKEHVLLKAKIIGYGNVSAGT+SI+NVNLIEQFKERLALY +NNRMDICV
Sbjct: 716 KINSAQDLIKNKEHVLLKAKIIGYGNVSAGTDSIANVNLIEQFKERLALYNNNNRMDICV 775

Query: 781 VRNTDDIKACGTAIGNQSMVNNPDNYKYLIGKAWKNIGISKTANGSKISVHYLGNSTPTE 840
VRNTDDIKACGTAIGNQSMVNNP+NYKYL GKAWKNIGISKTANGSKISVHYLGNSTPTE
Sbjct: 776 VRNTDDIKACGTAIGNQSMVNNPENYKYLEGKAWKNIGISKTANGSKISVHYLGNSTPTE 835

Query: 841 NSGNTTNLPTNTTSNARSAKNALAQNAPFAQPSATPSLVAINQHDFGTIESVFELANRSK 900
N GNTTNLPTNTT+ R A AL +NAPFA+ SATP+LVAINQHDFGTIESVFELANRS
Sbjct: 836 NGGNTTNLPTNTTNKVRFASYALIKNAPFARYSATPNLVAINQHDFGTIESVFELANRSN 895

Query: 901 DIDTLYTHSGAQGRNLLQTLLIDSHDAGYARQMIDNTSTGEIIKQLNAATTTLNNVASLE 960
DIDTLY +SGAQGR+LLQTLLIDSHDAGYAR MID TS EI KQLN ATTTLNN+ASLE
Sbjct: 896 DIDTLYANSGAQGRDLLQTLLIDSHDAGYARTMIDATSANEITKQLNTATTTLNNIASLE 955

Query: 961 HKQSGLQTLSLSNAMILNSRLVNLSRRHTNNIDSFAQRLQALKDQKFASLESAAEVLYQF 1020
HK SGLQTLSLSNAMILNSRLVNLSRRHTN+IDSFA+RLQALKDQ+FASLESAAEVLYQF
Sbjct: 956 HKTSGLQTLSLSNAMILNSRLVNLSRRHTNHIDSFAKRLQALKDQRFASLESAAEVLYQF 1015

Query: 1021 APKYEKPTNVWANAIGGTSLNNGGNASLYGTSAGVDAYLNGEVEAIVGGFGSYGYSSFSN 1080
APKYEKPTNVWANAIGGTSLN+GGNASLYGTSAGVDAYLNGEVEAIVGGFGSYGYSSFSN
Sbjct: 1016 APKYEKPTNVWANAIGGTSLNSGGNASLYGTSAGVDAYLNGEVEAIVGGFGSYGYSSFSN 1075

Query: 1081 QANSLNSGANNTNFGVYSRLFANQHEFDFEAQGALGSDQSSLNFKSALLRDLNQSYNYLA 1140
QANSLNSGANNTNFGVYSR+FANQHEFDFEAQGALGSDQSSLNFKSALLRDLNQSYNYLA
Sbjct: 1076 QANSLNSGANNTNFGVYSRIFANQHEFDFEAQGALGSDQSSLNFKSALLRDLNQSYNYLA 1135

Query: 1141 YSAATRASYGYDFAFFRNALVLKPSVGVSYNHLGSTNFKSNSTNQVALKNGSSSQHLFNA 1200
YSAATRASYGYDFAFFRNALVLKPSVGVSYNHLGSTNFKSNS +VALKNG+SSQHLFNA
Sbjct: 1136 YSAATRASYGYDFAFFRNALVLKPSVGVSYNHLGSTNFKSNSNQKVALKNGASSQHLFNA 1195

Query: 1201 SANVEARYYYGDTSYFYMNAGVLQEFANFGSSNAVSLNTFKVNAARNPLNTHARVMMGGE 1260
SANVEARYYYGDTSYFYMNAGVLQEFANFGSSNAVSLNTFKVNA RNPLNTHARVMMGGE
Sbjct: 1196 SANVEARYYYGDTSYFYMNAGVLQEFANFGSSNAVSLNTFKVNATRNPLNTHARVMMGGE 1255

Query: 1261 LKLAKEVFLNLGFVYLHNLISNIGHFASNLGMRYSF 1296
LKLAKEVFLNLGFVYLHNLISNIGHFASNLGMRYSF
Sbjct: 1256 LKLAKEVFLNLGFVYLHNLISNIGHFASNLGMRYSF 1291


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_0887  DHBDHDRGNASE  91     9e-24    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.
Length = 261

Score = 90.9 bits (225), Expect = 9e-24
Identities = 57/235 (24%), Positives = 107/235 (45%), Gaps = 12/235 (5%)

Query: 11 KVAIITGASSGIGLECALMLLDQGYKVYALSRHATLCVALNHALC------ESVDIDVSD 64
K+A ITGA+ GIG A L QG + A+ + + +L E+ DV D
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRD 68

Query: 65 SNALKEVFSNISAKEDHCDVLINSAGYGVFGSVEDTPIEEVKKQFGVNFFALCEVVQFCL 124
S A+ E+ + I + D+L+N AG G + EE + F VN + +
Sbjct: 69 SAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVS 128

Query: 125 PLLKNKPYSKIFNLSSIAGRVSMLFLGHYSASKHALEAYSDALRLELKPFNVQVCLIEPG 184
+ ++ I + S V + Y++SK A ++ L LEL +N++ ++ PG
Sbjct: 129 KYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSPG 188

Query: 185 PVKSNWEKTAFSVENFESEDSLYALEVNAAKSFYSGVYQNALS-PKAVAQKIVFL 238
+++ + + ++ EN + + + ++F +G+ L+ P +A ++FL
Sbjct: 189 STETDMQWSLWADENGAEQ-----VIKGSLETFKTGIPLKKLAKPSDIADAVLFL 238
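
The statistics lines of each hit ("Score = ...", "Identities = ...") follow the usual legacy BLAST report layout, so they can be read back programmatically. The following is a minimal Python sketch under that assumption; `parse_hit_stats` is a hypothetical helper name and is not part of PredictBias. The Expect value is kept as a string because some hits in this report print it without a mantissa (for example `e-127`).

```python
import re

# Patterns for the two statistics lines of a legacy BLAST hit block.
STATS_RE = re.compile(
    r"Score\s*=\s*([\d.]+)\s*bits\s*\((\d+)\),\s*Expect\s*=\s*(\S+)"
)
FRACTION_RE = re.compile(
    r"(Identities|Positives|Gaps)\s*=\s*(\d+)/(\d+)\s*\((\d+)%\)"
)


def parse_hit_stats(text):
    """Return bit score, raw score, Expect (as text) and the count fractions."""
    match = STATS_RE.search(text)
    if match is None:
        raise ValueError("no 'Score = ... Expect = ...' line found")
    stats = {
        "bits": float(match.group(1)),
        "raw": int(match.group(2)),
        "expect": match.group(3),
    }
    # Each fraction is stored as (numerator, denominator, printed percent).
    for name, num, den, pct in FRACTION_RE.findall(text):
        stats[name.lower()] = (int(num), int(den), int(pct))
    return stats


block = """Score = 90.9 bits (225), Expect = 9e-24
Identities = 57/235 (24%), Positives = 107/235 (45%), Gaps = 12/235 (5%)"""
stats = parse_hit_stats(block)
for key in ("identities", "positives", "gaps"):
    num, den, printed = stats[key]
    # In this report the printed percentages agree with integer truncation.
    print(key, num * 100 // den, printed)
```

For this hit the recomputed values agree with the printed ones (107/235 is printed as 45%, which matches truncation rather than rounding).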


13  HPP12_0973  HPP12_0994  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPP12_0973  2       15       0.605727   peptidyl-prolyl cis-trans isomerase D
HPP12_0974  4       17       1.170563   cell division protein FtsA
HPP12_0975  3       19       -1.569921  cell division protein FtsZ
HPP12_0976  3       20       -3.802021  hypothetical protein
HPP12_0977  4       19       -4.503878  hypothetical protein
HPP12_0978  1       19       -4.949341  hypothetical protein
HPP12_0979  0       17       -4.829071  mechanosensitive ion channel domain-containing
HPP12_0980  0       20       -5.520773  hypothetical protein
HPP12_0981  2       18       -5.194925  hypothetical protein
HPP12_0982  2       16       -5.622334  hypothetical protein
HPP12_0983  2       18       -5.785206  hypothetical protein
HPP12_0984  3       19       -5.623652  hypothetical protein
HPP12_0985  6       27       -6.999592  hypothetical protein
HPP12_0986  4       26       -6.038670  hypothetical protein
HPP12_0987  2       26       -5.733324  hypothetical protein
HPP12_0988  4       27       -4.469015  hypothetical protein
HPP12_0989  3       27       -4.190520  serine/threonine kinase C-like protein
HPP12_0990  3       27       -4.078915  serine/threonine kinase C-like protein
HPP12_0991  2       17       -3.574409  serine/threonine phosphatase 2C-like protein
HPP12_0992  0       14       -3.916191  hypothetical protein
HPP12_0993  0       15       -3.901855  phage/colicin/tellurite resistance cluster
HPP12_0994  -1      13       -3.396752  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_0974  SHAPEPROTEIN  43     1e-06    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.
Length = 347

Score = 43.2 bits (102), Expect = 1e-06
Identities = 39/181 (21%), Positives = 69/181 (38%), Gaps = 13/181 (7%)

Query: 210 AASIATLSNDERELGVACVDMGGETCNLTIYSGNSIRYNKYLPVGSHHLSTDL------S 263
AA+I G VD+GG T + + S N + Y+ + +G + +
Sbjct: 146 AAAIGAGLPVSEATGSMVVDIGGGTTEVAVISLNGVVYSSSVRIGGDRFDEAIINYVRRN 205

Query: 264 HMLNTPFPYAEEVKIKYGDLSFESGEETPSQNVQMPTTGSDGHESHIVPLNKIQTIMRER 323
+ AE +K + G S G+E V+ + N+I ++E
Sbjct: 206 YGSLIGEATAERIKHEIG--SAYPGDEVREIEVRGRNLAEGVPRGFTLNSNEILEALQEP 263

Query: 324 ALETFEIIHRSIQDSGLE---EHLGGGVVLTGGMALMKGIKELAKAHFTNYPVRLAA-PM 379
+ +++ E + G+VLTGG AL++ + L T PV +A P+
Sbjct: 264 LTGIVSAVMVALEQCPPELASDISERGMVLTGGGALLRNLDRLL-MEETGIPVVVAEDPL 322

Query: 380 E 380

Sbjct: 323 T 323


14  HPP12_1084  HPP12_1092  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPP12_1084  1       14       -3.997031  flagellar hook-associated protein 1
HPP12_1085  1       22       -4.731976  hypothetical protein
HPP12_1086  2       21       -4.185052  type II R-M system restriction endonuclease
HPP12_1087  1       18       -2.210442  type II R-M system methyltransferase
HPP12_1088  3       13       -0.934893  FlgM protein
HPP12_1089  2       12       -1.640849  hypothetical protein
HPP12_1090  3       12       -1.549089  fkbp-type peptidyl-prolyl cis-trans isomerase
HPP12_1091  3       14       -2.191664  hypothetical protein
HPP12_1092  3       15       -1.595587  peptidoglycan-associated lipoprotein precursor
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_1084  FLGHOOKAP1    565    0.0      Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 565 bits (1458), Expect = 0.0
Identities = 128/610 (20%), Positives = 229/610 (37%), Gaps = 75/610 (12%)

Query: 6 SSLNTSYTGLQAHQSMVDVTGNNISNASDEFYSRQRVIAKPQAAYMYGTKNVNMGVDVEA 65
S +N + +GL A Q+ ++ NNIS+ + Y+RQ I + + V GV V
Sbjct: 2 SLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVSG 61

Query: 66 IERVHDEFVFARYTKANYENTYYDTEFSHLKEASAYFPDIDEASLFTDLQDYFNSWKELS 125
++R +D F+ + A +++ + + + +SL T +QD+F S + L
Sbjct: 62 VQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNML-STSTSSLATQMQDFFTSLQTLV 120

Query: 126 KNAKDSAQKQALAQKTEALTHNIKDTRERLTTLQHKASEELKSVIKEVNSLGSQIAQINK 185
NA+D A +QAL K+E L + K T + L + + + + + ++N+ QIA +N
Sbjct: 121 SNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLND 180

Query: 186 RIKEVENNKSLKHANELRDKRDELEFHLRELLGGNVFKSSIKTHSLTDKDSADFDESYNL 245
+I + + N L D+RD+L L +++G V S +YN+
Sbjct: 181 QISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEV--------------SVQDGGTYNI 226

Query: 246 NIGHGFNIIDGSIFHPLVVKESENKGGLNQIYFQSDDFKLTNITDK-LNQGKVGALLNVY 304
+ +G++++ GS L S + + I +K LN G +G +L
Sbjct: 227 TMANGYSLVQGSTARQLAAVPSSADPSRTTVAYVDGTAGNIEIPEKLLNTGSLGGILTFR 286

Query: 305 NDGSNGTLKGKLQDYIDLLDSFARGLIESTNAIYAQSASHHIEGEPVEFNSDEAFKDTNY 364
+ L + L A E+ N + +A D N
Sbjct: 287 SQ--------DLDQTRNTLGQLALAFAEAFNTQH------------------KAGFDANG 320

Query: 365 NIKNGSFDL----IAYNTDGKEIARKTIAITPITTMNDIIQAINANTDDNQ-----DNNT 415
+ F + + NT K +T + + I+ + + Q N T
Sbjct: 321 DAGEDFFAIGKPAVLQNTKNKGDVAIGATVTDASAVLATDYKISFDNNQWQVTRLASNTT 380

Query: 416 ENDFDDYFTAGFNNETKKFVIQPKNASQGLFVSMKDNGTNFMGALKLNPFFQGDDASNIS 475
D + + + + + M L D + I+
Sbjct: 381 FTVTPDANGKVAFDGLELTFTGTPAVNDSFTLKPVSDAIVNMDVLI-------TDEAKIA 433

Query: 476 LNKEYKKEPTTIRPWLAPINGNFDVANMMQQLQYDSVDFYNDKFDIKPMKISEFYQFLTG 535
+ E E G+ D N L S N K ++ Y L
Sbjct: 434 MASE---EDA----------GDSDNRNGQALLDLQS----NSKTVGGAKSFNDAYASLVS 476

Query: 536 KINTDAEKSGRILDTKKSMLETIKKEQLSISQVSVDEEMLNLIKFQSGYAANAKVITAID 595
I T+ +++ + +Q SIS V++DEE NL +FQ Y ANA+V+ +
Sbjct: 477 DIGNKTATLKTSSATQGNVVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTAN 536

Query: 596 RMIDTLLGIK 605
+ D L+ I+
Sbjct: 537 AIFDALINIR 546


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_1092  OMPADOMAIN    146    3e-45    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 146 bits (369), Expect = 3e-45
Identities = 47/169 (27%), Positives = 73/169 (43%), Gaps = 24/169 (14%)

Query: 22 NMDKETVAGDVSAKTVQTAPV-TTEPAPEKEEPKQEPAPVVEEKPAVESGTIIASIYFDF 80
D ++ VS + Q PAP PAP V+ K T+ + + F+F
Sbjct: 177 RPDNGMLSLGVSYRFGQGEAAPVVAPAPA-------PAPEVQTK----HFTLKSDVLFNF 225

Query: 81 DKYEIKESDQETLDEIVQKAKE---NHMQVLLEGNTDEFGSSEYNQALGVKRTLSVKNAL 137
+K +K Q LD++ + V++ G TD GS YNQ L +R SV + L
Sbjct: 226 NKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAYNQGLSERRAQSVVDYL 285

Query: 138 VIKGVEKDMIKTISFGETKPKCTQ-----KTR----ECYKENRRVDVKL 177
+ KG+ D I GE+ P K R +C +RRV++++
Sbjct: 286 ISKGIPADKISARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEV 334


15  HPP12_1103  HPP12_1112  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPP12_1103  0       19       -3.729434  ATP synthase F0 subunit b'
HPP12_1104  1       18       -3.099668  chromosome partitioning protein
HPP12_1105  1       18       -3.376237  chromosome partitioning protein
HPP12_1106  1       17       -3.149487  bifunctional biotin operon repressor / biotin
HPP12_1107  2       17       -3.630034  methionyl-tRNA formyltransferase
HPP12_1108  2       18       -3.592968  ATPase
HPP12_1109  2       15       0.397171   hypothetical protein
HPP12_1110  1       18       -0.517535  hypothetical protein
HPP12_1111  3       17       0.142788   hypothetical protein
HPP12_1112  2       16       -0.271740  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_1105  PF07675       31     0.004    Cleaved Adhesin
		>PF07675#Cleaved Adhesin

Length = 1358

Score = 31.2 bits (70), Expect = 0.004
Identities = 30/105 (28%), Positives = 40/105 (38%), Gaps = 7/105 (6%)

Query: 70 QISQVILKTQMPFLDLVPSNLGLAGFEKTFYDSQDENKRGELMLKNALESVV---GLYDY 126
VI T F SNL A FE + D + ++ VV G+YDY
Sbjct: 414 TFGSVIPATGPLFTGTASSNLYSANFEYLTPANADPVVTTQNIIVTGQGEVVIPGGVYDY 473

Query: 127 IIIDSPPALGPLTINSLSAAHSVIIPIQCEFFALEGTKLLLNTIR 171
I + PA G + I A P + + FA E K T+R
Sbjct: 474 CITNPEPASGKMWI----AGDGGNQPARYDDFAFEAGKKYTFTMR 514


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_1107  FERRIBNDNGPP  33     0.001    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 33.0 bits (75), Expect = 0.001
Identities = 12/33 (36%), Positives = 20/33 (60%)

Query: 72 EPEVQILKDLKPDFIVVVAYGKILPKEILAIAP 104
EP +++L ++KP F+V A P+ + IAP
Sbjct: 86 EPNLELLTEMKPSFMVWSAGYGPSPEMLARIAP 118


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_1109  RTXTOXIND     43     2e-06    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.
Length = 478

Score = 42.9 bits (101), Expect = 2e-06
Identities = 22/170 (12%), Positives = 59/170 (34%), Gaps = 18/170 (10%)

Query: 51 RAQYQSHSKALKQKEEALKEREREQKAQFDDAVKQASALALQDERAKIIEEARKNAFLEQ 110
+ Q+ + QKE L ++ E+ + + ++ R + +
Sbjct: 192 KEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAK 251

Query: 111 QKGLELLQKELDEKSKQVQQLHQKEAEIERLKRENNEAESRLKAENEKKLNEKLDLERER 170
LE K ++ +L ++++E+++ E A+ + + + E
Sbjct: 252 HAVLEQENKYVEAV----NELRVYKSQLEQIESEILSAKEEYQLVTQ-------LFKNEI 300

Query: 171 IEKALHEKNELKFKQQEEQLEMLRNELKNAQRKAELSSQQFQGEVQELAI 220
++K + +L + + +A +S +VQ+L +
Sbjct: 301 LDK--LRQTTDNIGLLTLELAKNEERQQASVIRAPVS-----VKVQQLKV 343


16  HPP12_1314  HPP12_1353  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPP12_1314  4       25       -7.811754  type II R-M system restriction endonuclease
HPP12_1315  5       29       -8.362754  hypothetical protein
HPP12_1316  5       32       -8.156026  hypothetical protein
HPP12_1317  5       31       -8.028131  hypothetical protein
HPP12_1318  6       31       -8.079700  hypothetical protein
HPP12_1320  12      32       -8.806998  hypothetical protein
HPP12_1321  8       25       -7.912455  hypothetical protein
HPP12_1322  7       24       -7.576544  hypothetical protein
HPP12_1323  7       23       -7.291041  VirB2 type IV secretion protein
HPP12_1324  6       22       -6.880915  VirB3 type IV secretion protein
HPP12_1325  5       22       -6.005923  hypothetical protein
HPP12_1326  7       23       -4.918817  VirB4 type IV secretion ATPase
HPP12_1327  8       24       -3.692420  VirB7 type IV secretion protein
HPP12_1328  8       25       -4.076789  VirB8 type IV secretion protein
HPP12_1329  8       26       -3.662175  VirB9 type IV secretion protein
HPP12_1330  8       26       -3.338010  VirB10 type IV secretion protein
HPP12_1331  7       27       -4.344564  hypothetical protein
HPP12_1332  6       24       -5.076708  hypothetical protein
HPP12_1333  7       26       -5.318481  hypothetical protein
HPP12_1334  7       26       -5.065189  hypothetical protein
HPP12_1335  8       25       -5.600276  VirB11 type IV secretion ATPase
HPP12_1336  9       25       -5.687476  hypothetical protein
HPP12_1337  10      27       -5.005255  VirD4 coupling protein
HPP12_1339  12      29       -6.312284  hypothetical protein
HPP12_1340  10      27       -7.080144  hypothetical protein
HPP12_1341  10      27       -7.556453  hypothetical protein
HPP12_1342  9       26       -6.809112  hypothetical protein
HPP12_1343  8       28       -7.052166  hypothetical protein
HPP12_1344  7       29       -7.700464  hypothetical protein
HPP12_1345  9       32       -8.186083  hypothetical protein
HPP12_1346  10      32       -8.650183  hypothetical protein
HPP12_1347  12      27       -7.667649  hypothetical protein
HPP12_1348  7       24       -5.852445  hypothetical protein
HPP12_1349  5       18       -4.522551  hypothetical protein
HPP12_1350  3       15       -3.779461  hypothetical protein
HPP12_1352  2       14       -2.926886  hypothetical protein
HPP12_1353  2       14       -2.693548  relaxase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_1328  PF04335       89     9e-23    VirB8 type IV secretion protein
		>PF04335#VirB8 type IV secretion protein

Length = 227

Score = 89.1 bits (221), Expect = 9e-23
Identities = 35/224 (15%), Positives = 74/224 (33%), Gaps = 29/224 (12%)

Query: 121 ESFKKDELDLSSVFEIQRKNTQIAYRLAIGGLIGIIALSIAIFIMMPLKENTPYFIDFAN 180
F++ ++ ++A+ +A A +A+ + PLK PY I
Sbjct: 12 AYFEEAASWERDKLAAAERSKKLAWVVAGVAGALATAGVVAVAALTPLKTVEPYVITVDR 71

Query: 181 SDKHFAVVQRADTK--VDYGEAFLRNLVGSYITARETINHIDDKIRLNETIREQSSEEVW 238
+ ++ + + Y EA + + +Y+ RE + + + S+
Sbjct: 72 NTGEASIAAKLHGDATITYDEAVRKYFLATYVRYREGWIAAAREEYF-DAVMVMSARPEQ 130

Query: 239 KTLEQLVSGKG-----SIYSNSNMDREIKIINISIYKQGKQQNIAVADIVAKVFDKGYLI 293
+ +I +N D ++I +S + VA+V+ +
Sbjct: 131 DRWSRFYKTDNPQSPQNILANRT-DVFVEIKRVSF----------LGGNVAQVYFTKESV 179

Query: 294 SEKRYRVSLIYHFKPLIQFDYSSMP-------KNPTGFIVDKYS 330
+ S I++ P KNP G+ V+ Y
Sbjct: 180 TGSN---STKTDAVATIKYKVDGTPSKEVDRFKNPLGYQVESYR 220


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_1331  cloacin       39     5e-05    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 38.9 bits (90), Expect = 5e-05
Identities = 29/90 (32%), Positives = 39/90 (43%), Gaps = 10/90 (11%)

Query: 148 VSGYGGTSNNAGSNGTSANGVNGTSGNNGAKGENGSSGANGANGTSGYQGVGSNPFPPIA 207
+SG G +N G++ TS N +NG G G GA+ SG+ +NP+ +
Sbjct: 1 MSGGDGRGHNTGAHSTSGN-INGGPTGLGVGG--------GASDGSGW-SSENNPWGGGS 50

Query: 208 GSGNGSSGSSNSGYTPFMSGGGGIGGMGGG 237
GSG G S G GG G GG
Sbjct: 51 GSGIHWGGGSGHGNGGGNGNSGGGSGTGGN 80



Score = 37.8 bits (87), Expect = 1e-04
Identities = 29/118 (24%), Positives = 48/118 (40%), Gaps = 8/118 (6%)

Query: 129 VSSKNGKGFSGSGASGMGYVSGYGGTSNNAGSNGTSANGVNGTSGNNGAKGENGSSGANG 188
+S +G+G + S G ++G G G +++G +S NN G +GS G
Sbjct: 1 MSGGDGRGHNTGAHSTSGNING---GPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWG 57

Query: 189 ANGTSGYQGVGSNPFPPIAGSGNGSSGSSNSGYTPFMSGGGGIGGMGGGFIPFPYSPG 246
G G N +G G+G+ G+ ++ P G + G G + S G
Sbjct: 58 GGSGHGNGGGNGN-----SGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLAVSISAG 110



Score = 36.2 bits (83), Expect = 4e-04
Identities = 28/103 (27%), Positives = 37/103 (35%), Gaps = 5/103 (4%)

Query: 122 GVSKSAYVSSKNGKGFSGSGASGMGYVSGYGGTSNN----AGSNGTSANGVNGTSGNNGA 177
G + A+ +S N G G G G G +S N GS G GN G
Sbjct: 8 GHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGHGNGGG 67

Query: 178 KGENGSSGANGANGTSGYQGVGSNPFPPIAGSGNGSSGSSNSG 220
G +G G N ++ V FP ++ G G S S
Sbjct: 68 NGNSGGGSGTGGNLSAVAAPVAFG-FPALSTPGAGGLAVSISA 109



Score = 33.9 bits (77), Expect = 0.002
Identities = 26/91 (28%), Positives = 35/91 (38%), Gaps = 7/91 (7%)

Query: 184 SGANGANGTSGYQGVGSNPFPPIAGSGNGSSGSSNSGYTPFMSGGGGIGGMGGGFIPFPY 243
SG +G +G N G G G S SG++ + GG G G +
Sbjct: 2 SGGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHW----- 56

Query: 244 SPGLQNGSGTNGINGTNGIDGTSGANGSNSA 274
G +G G G NG +G +G N S A
Sbjct: 57 --GGGSGHGNGGGNGNSGGGSGTGGNLSAVA 85



Score = 33.1 bits (75), Expect = 0.003
Identities = 26/90 (28%), Positives = 35/90 (38%), Gaps = 7/90 (7%)

Query: 193 SGYQGVGSNPFPPIAGSGNGSSGSSNSGYTPFMSGGGGIGGMGGGFIPFPYSPGLQNGSG 252
SG G G N + +SG+ N G T GGG G G P+ G +G
Sbjct: 2 SGGDGRGHN------TGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIH 55

Query: 253 TNGINGT-NGIDGTSGANGSNSANGGTASA 281
G +G NG + GS + +A A
Sbjct: 56 WGGGSGHGNGGGNGNSGGGSGTGGNLSAVA 85


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_1337  CHANLCOLICIN  35     7e-04    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 35.4 bits (81), Expect = 7e-04
Identities = 35/216 (16%), Positives = 79/216 (36%), Gaps = 26/216 (12%)

Query: 527 EDAEIVSKEVGEFTRQSKNYSTEKSQLV---------FGGSSSYSHEGRNLLTAQDIMNI 577
++AE K E ++ K EK++ ++ S E + + AQ ++
Sbjct: 141 KEAEAAEKAFQEAEQRRKEIEREKAETERQLKLAEAEEKRLAALSEEAKAVEIAQKKLSA 200

Query: 578 NSDEVIVIVTGAKATPLKLKANYWFKDKELLKRANLPIDLEVERQRVEE----------- 626
EV+ + K +L ++ +D E+ A +L + +E
Sbjct: 201 AQSEVVKMDGEIKTLNSRLSSSIHARDAEMKTLAGKRNELAQASAKYKELDELVKKLSPR 260

Query: 627 ---PIQPTTEIETTPNQNKADLGPSNKGEKVENESNERNTNENNPTTPQELENSNLKESE 683
P+Q E T + + G + ++ + ++E N N Q + + + S
Sbjct: 261 ANDPLQNRPFFEAT--RRRVGAGKIREEKQKQVTASETRINRINADITQI-QKAISQVSN 317

Query: 684 KDDESTITLENANENIEQGNHNEIDEILKKPLSEIS 719
+ + A EN+++ +N ++ +K +
Sbjct: 318 NRNAGIARVHEAEENLKKAQNNLLNSQIKDAVDATV 353


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_1343  TACYTOLYSIN   28     0.014    Bacterial thiol-activated pore-forming cytolysin sig...
		>TACYTOLYSIN#Bacterial thiol-activated pore-forming cytolysin signature.
Length = 574

Score = 28.0 bits (62), Expect = 0.014
Identities = 27/88 (30%), Positives = 43/88 (48%), Gaps = 6/88 (6%)

Query: 5 TQNANKHEIQNSSENEVELTEEKSFLEMSEDEYYEYQQEQYIQMNENDGIARVSAKLEQQ 64
T NA+ ++ QN++ E T E+ E SE + Q+ +N ND I KL +
Sbjct: 31 TANADSNK-QNTANTETTTTNEQPKPESSELTTEKAGQKMDDMLNSNDMI-----KLAPK 84

Query: 65 ELEREEAELESKALQDYEENQIQRAGEI 92
E+ E AE E K +D ++++ EI
Sbjct: 85 EMPLESAEKEEKKSEDNKKSEEDHTEEI 112


17  HPP12_1444  HPP12_1455  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPP12_1444  2       10       0.800915   ABC transporter permease protein
HPP12_1445  0       9        0.271671   outer membrane protein
HPP12_1446  0       10       0.376083   branched-chain amino acid aminotransferase IlvE
HPP12_1447  0       11       -0.344229  outer membrane protein HorJ
HPP12_1448  1       12       -0.503779  DNA polymerase I
HPP12_1449  0       15       0.067813   type IIS R-M system restriction enzyme
HPP12_1450  1       17       0.505336   type IIS R-M system methyltransferase
HPP12_1451  3       14       1.351465   amidophosphoribosyltransferase
HPP12_1452  3       14       0.681077   thymidylate kinase
HPP12_1453  2       14       0.948243   lipopolysaccharide core biosynthesis protein
HPP12_1454  2       14       0.884593   aromatic acid decarboxylase
HPP12_1455  2       14       0.406156   flagella basal body P-ring formation protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_1453  LPSBIOSNTHSS  223    5e-78    Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein signature.
Length = 166

Score = 223 bits (569), Expect = 5e-78
Identities = 63/147 (42%), Positives = 94/147 (63%)

Query: 4 IGIYPGTFDPVTNGHIDIIHRSSELFEKLIVAVAHSSAKNPMFSLKERLKMMQLATKSFK 63
IYPG+FDP+T GH+DII R LF+++ VAV + K PMFS++ERL+ + A
Sbjct: 2 NAIYPGSFDPITFGHLDIIERGCRLFDQVYVAVLRNPNKQPMFSVQERLEQIAKAIAHLP 61

Query: 64 NVECVAFEGLLANLAKEYHCKVLVRGLRVVSDFEYELQMGYANKSLNHELETLYFMPTLQ 123
N + +FEGL N A++ ++RGLRV+SDFE ELQM NK+L +LET++ + +
Sbjct: 62 NAQVDSFEGLTVNYARQRQAGAILRGLRVLSDFELELQMANTNKTLASDLETVFLTTSTE 121

Query: 124 NAFISSSIVRSIIAHKGDASHLVPKEI 150
+F+SSS+V+ + G+ H VP +
Sbjct: 122 YSFLSSSLVKEVARFGGNVEHFVPSHV 148


18  HPP12_1485  HPP12_1496  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPP12_1485  2       12       2.592161   saccharopine dehydrogenase
HPP12_1486  1       12       2.026488   ferredoxin-like protein
HPP12_1487  -1      11       1.832393   integral membrane protein
HPP12_1488  -1      10       1.402405   dihydroneopterin aldolase
HPP12_1489  -2      12       -0.144522  frpb-like protein
HPP12_1490  -2      9        -2.307951  iron-regulated outer membrane protein
HPP12_1491  0       10       -4.453695  selenocysteine synthase
HPP12_1492  0       9        -4.690762  transcription termination factor A
HPP12_1493  0       10       -4.758582  hypothetical protein
HPP12_1494  0       10       -4.635978  type IIS R-M system restriction/modification
HPP12_1495  0       12       -5.017893  hypothetical protein
HPP12_1496  1       11       -3.007256  type III R-M system restriction enzyme
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_1488  ARGDEIMINASE  26     0.042    Bacterial arginine deiminase signature.
		>ARGDEIMINASE#Bacterial arginine deiminase signature.

Length = 409

Score = 25.9 bits (57), Expect = 0.042
Identities = 12/60 (20%), Positives = 25/60 (41%), Gaps = 8/60 (13%)

Query: 7 VHIHNLVFETILGILEFERLKPQKISVDLDLFYTELPN-----KAYLDYMEIQELIQKMM 61
+I +L+ E ++ + L+ + IS + + K Y + I +I KM+
Sbjct: 70 EYIEDLISEVLVSSVA---LENKFISQFILEAEIKTDFTINLLKDYFSSLTIDNMISKMI 126


19  HPP12_0035  HPP12_0040  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPP12_0035  -3      13       0.456986   ComB8 competence protein
HPP12_0036  -2      13       0.720013   ComB9 competence protein
HPP12_0037  -2      12       1.470368   ComB10 competence protein
HPP12_0038  -2      11       1.222122   bifunctional mannose-6-phosphate isomerase /
HPP12_0039  -2      12       1.409920   GDP-D-mannose dehydratase
HPP12_0040  -1      14       1.207276   GDP-fucose synthetase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_0035  PF04335       131    6e-40    VirB8 type IV secretion protein
		>PF04335#VirB8 type IV secretion protein

Length = 227

Score = 131 bits (331), Expect = 6e-40
Identities = 37/202 (18%), Positives = 73/202 (36%), Gaps = 4/202 (1%)

Query: 40 QSVFRLERNRLKIAYKLLGLMSFIALILAIVLISVLPLQKTEHHF--VDFLNQDKHYAII 97
+ K+A+ + G+ +A + + ++ PL+ E + VD + A
Sbjct: 22 RDKLAAAERSKKLAWVVAGVAGALATAGVVAVAALTPLKTVEPYVITVDRNTGEASIAAK 81

Query: 98 QRADKSISSNEALARSLIGAYVLNRESINRIDDKSRYELVRLQSSSKVWQRFEDLIKTQN 157
D +I+ +EA+ + + YV RE + ++ V + S+ R+ KT N
Sbjct: 82 LHGDATITYDEAVRKYFLATYVRYREGWIAAAREEYFDAVMVMSARPEQDRWSRFYKTDN 141

Query: 158 SIYAQSHLEREVHI-VNIAIYQQDNNPIASVSIAAKLLNENKLVYEKRYKIVLSYLFDTP 216
Q+ L + V I +A V + + + + + Y D
Sbjct: 142 PQSPQNILANRTDVFVEIKRVSFLGGNVAQVYFTKESVTGSNST-KTDAVATIKYKVDGT 200

Query: 217 DFDYASMPKNPTGFKITRYSIT 238
KNP G+++ Y
Sbjct: 201 PSKEVDRFKNPLGYQVESYRAD 222


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_0036  TYPE4SSCAGX   32     0.002    Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein signature.
Length = 522

Score = 32.4 bits (73), Expect = 0.002
Identities = 32/103 (31%), Positives = 48/103 (46%), Gaps = 12/103 (11%)

Query: 171 KENKENVLENALENTPTNNKPLKEEKEE----AKEKEEETIIIGDNTNAMKIVKKDIQKG 226
K +E + + L T N+ + E+ K +EE+ II D A+ + Q
Sbjct: 334 KRQRELIKQENLNTTAYINRVMMASNEQIINKEKIREEKQKIILDQAKAL-----ETQYV 388

Query: 227 YKALKSSQ--RKWYCLGICSKKSKLSLMPEEIFNDKQFTYFKF 267
+ ALK + R + K+SK +MP EIF+D FTYF F
Sbjct: 389 HNALKRNPVPRNYNYYQAPEKRSK-HIMPSEIFDDGTFTYFGF 430


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_0038  FLGMRINGFLIF  31     0.012    Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 30.7 bits (69), Expect = 0.012
Identities = 16/70 (22%), Positives = 26/70 (37%), Gaps = 3/70 (4%)

Query: 272 ALFEEAANEPKENVSLNQTPVFAKESANNLVFSHKVSAL---LGVENLAVIDTKDALLVA 328
+LF P +V++ P A + H VS+ L N+ ++D LL
Sbjct: 162 SLFVREQKSPSASVTVTLEPGRALDEGQISAVVHLVSSAVAGLPPGNVTLVDQSGHLLTQ 221

Query: 329 HKDKAKDLKA 338
+DL
Sbjct: 222 SNTSGRDLND 231


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_0039  NUCEPIMERASE  88     1e-21    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 87.9 bits (218), Expect = 1e-21
Identities = 46/180 (25%), Positives = 72/180 (40%), Gaps = 19/180 (10%)

Query: 7 LITGVTGQDGSYLAEYLLNLGYEVHGLKRRSSSINTSRIDHLYEDLHSDHKRRFFLHYGD 66
L+TG G G ++++ LL G++V G+ + + S E L F H D
Sbjct: 4 LVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQP---GFQFHKID 60

Query: 67 MTDSSNLIHLIATTKPTEIYNLAAQSHVKVSFETPEYTANADGIGTLRILEAMRILGLEK 126
+ D + L A+ ++ + V+ S E P A+++ G L ILE R ++
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQ- 119

Query: 127 KTRFYQASTSELYGEVLETPQNENTPF-------NPRSPYAVAKMYAFYITKNYREAYNL 179
AS+S +YG N PF +P S YA K + Y Y L
Sbjct: 120 --HLLYASSSSVYGL------NRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYGL 171


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_0040  NUCEPIMERASE  49     1e-08    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 48.6 bits (116), Expect = 1e-08
Identities = 51/346 (14%), Positives = 107/346 (30%), Gaps = 54/346 (15%)

Query: 5 ILITGAYGMVGQNTALYFKKNRPDV-----------TLLTPKKSELY-----------LL 42
L+TGA G +G + + + V L + EL L
Sbjct: 3 YLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDLA 62

Query: 43 DKDNIQAYLKEYKPTGIIHCAGRVGGIVANMNDLSTYMVENLLMGLYLFSSALDLGVKKA 102
D++ + + R + ++ + Y NL L + ++
Sbjct: 63 DREGMTDLFASGHFERVFISPHR-LAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQHL 121

Query: 103 INLASSCAYPKFAPNPLKESDLLNGSLEPTNEGYALAKLSVMKYCEYVSAEKGVFYKTLV 162
+ +SS Y P D ++ + YA K + S G+ L
Sbjct: 122 LYASSSSVYGLNRKMPFSTDDSVDHPVSL----YAATKKANELMAHTYSHLYGLPATGLR 177

Query: 163 PCNLYGEFDKFEEKIAHMIPGLIARMHTAKLKNEKNFAMWGDGTARREYLNAKDLARFIA 222
+YG + + P + T + K+ ++ G +R++ D+A I
Sbjct: 178 FFTVYGPWGR---------PDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAII 228

Query: 223 LAYESIAQIPS-----------------VMNVGSGVDYSIEEYYEKVAQVLDYKGVFVKD 265
+ I + V N+G+ + +Y + + L +
Sbjct: 229 RLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAKKNML 288

Query: 266 LSKPVGMQQKLMDISK-QKALKWELEIPLEQGIKEAYEYYLKLLEV 310
+P + + D + + + E ++ G+K +Y +V
Sbjct: 289 PLQPGDVLETSADTKALYEVIGFTPETTVKDGVKNFVNWYRDFYKV 334


20  HPP12_0243  HPP12_0250  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPP12_0243  -2      14       1.401900   neutrophil-activating protein A
HPP12_0244  -3      13       1.171076   histidine kinase sensor protein
HPP12_0245  -3      12       1.959512   hypothetical protein
HPP12_0246  -4      10       2.184288   flagellar basal-body P-ring protein
HPP12_0247  -3      9        2.277175   ATP-dependent RNA helicase
HPP12_0248  -3      8        2.324833   spfH domain-containing protein
HPP12_0249  -2      8        1.373945   hypothetical protein
HPP12_0250  -2      8        1.641050   oligopeptide permease
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_0243  HELNAPAPROT   149    3e-49    Helicobacter neutrophil-activating protein A family ...
		>HELNAPAPROT#Helicobacter neutrophil-activating protein A family signature.
Length = 153

Score = 149 bits (377), Expect = 3e-49
Identities = 39/140 (27%), Positives = 74/140 (52%), Gaps = 1/140 (0%)

Query: 5 EILKHLQADAIVLFMKVHNFHWNVKGTDFFNVHKATEEIYEEFADMFDDLAERIAQLGHH 64
L ++ +L+ K+H FHW VKG FF +H+ EE+Y+ A+ D +AER+ +G
Sbjct: 15 NSLNTQLSNWFLLYSKLHRFHWYVKGPHFFTLHEKFEELYDHAAETVDTIAERLLAIGGQ 74

Query: 65 PLVTLSEALKLTRVKEETKTSFHSKDIFKEILGDYKHLEKEFKELSNTAEKEGDKVTVTY 124
P+ T+ E + + + + + ++ + ++ DYK + E K + AE+ D T
Sbjct: 75 PVATVKEYTEHASITDGGNET-SASEMVQALVNDYKQISSESKFVIGLAEENQDNATADL 133

Query: 125 ADDQLAKLQKSIWMLEAHLA 144
+ +++K +WML ++L
Sbjct: 134 FVGLIEEVEKQVWMLSSYLG 153


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_0244  PF06580       30     0.014    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 30.2 bits (68), Expect = 0.014
Identities = 10/71 (14%), Positives = 25/71 (35%), Gaps = 13/71 (18%)

Query: 281 IVLQNFLYNAIDAIEALEESEQ-GQVKIEAFIQNEFIVFTIIDNGKEVENKSALFEPFET 339
+++Q + N I + + Q G++ ++ N + + + G +
Sbjct: 258 MLVQTLVENGI--KHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTK------- 308

Query: 340 TKLKGNGLGLA 350
+ G GL
Sbjct: 309 ---ESTGTGLQ 316


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_0246  FLGPRINGFLGI  362    e-127    Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 362 bits (930), Expect = e-127
Identities = 116/345 (33%), Positives = 191/345 (55%), Gaps = 26/345 (7%)

Query: 19 AEKIGDIASVVGVRDNQLIGYGLVIGLNGTGDK-SGSKFTMQSISNMLESVNVKISADDI 77
+I DIAS+ RDNQLIGYGLV+GL GTGD S FT QS+ ML+++ +
Sbjct: 28 TSRIKDIASLQAGRDNQLIGYGLVVGLQGTGDSLRSSPFTEQSMRAMLQNLGITTQGGQS 87

Query: 78 KSKNVAAVMITASLPPFARQGDKIDIHISSIGDAKSIQGGTLVMTPLNAVDGNIYALAQG 137
+KN+AAVM+TA+LPPFA G ++D+ +SS+GDA S++GG L+MT L+ DG IYA+AQG
Sbjct: 88 NAKNIAAVMVTANLPPFASPGSRVDVTVSSLGDATSLRGGNLIMTSLSGADGQIYAVAQG 147

Query: 138 AITSGN-----------SNNLLSANIINGATIEREVSYDLFHKNAMVLSLKSPNFKNAIQ 186
A+ SA + NGA IERE+ +VL L++P+F A++
Sbjct: 148 ALIVNGFSAQGDAATLTQGVTTSARVPNGAIIERELPSKFKDSVNLVLQLRNPDFSTAVR 207

Query: 187 VQNTLNKV----FGNKVAIALDPKTIQITRPERFSMVEFLALVQEIPINYSAKNKIIVDE 242
V + +N +G+ +A D + I + +P + +A ++ + + K++++E
Sbjct: 208 VADVVNAFARARYGDPIAEPRDSQEIAVQKPRVADLTRLMAEIENLTVETDTPAKVVINE 267

Query: 243 KSGTIISGVDIMVHPIVVTSQDITLKITKEPLNN--------SKNTQDLDNNMSLDTAHN 294
++GTI+ G D+ + + V+ +T+++T+ P Q + M++
Sbjct: 268 RTGTIVIGADVRISRVAVSYGTLTVQVTESPQVIQPAPFSRGQTAVQPQTDIMAMQEGSK 327

Query: 295 TLSSNGKNITIAGVVKALQKIGVSAKGMVSILQALKKSGAISAEM 339
G ++ +V L IG+ A G+++ILQ +K +GA+ AE+
Sbjct: 328 VAIVEGPDLR--TLVAGLNSIGLKADGIIAILQGIKSAGALQAEL 370
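
The Expect value for this hit is printed as `e-127`, with no leading mantissa, which Python's `float()` rejects. A minimal sketch of one way to normalize such values before sorting or filtering hits by E-value; `expect_to_float` is a hypothetical helper name, and treating a bare exponent as `1e-127` is an assumption about how the report abbreviates the number.

```python
def expect_to_float(text):
    """Convert a BLAST-style Expect string to a float.

    A bare exponent such as 'e-127' is read as '1e-127' (assumption);
    ordinary forms such as '9e-24', '0.004' or '0.0' pass through unchanged.
    """
    text = text.strip()
    if text.lower().startswith("e"):
        text = "1" + text
    return float(text)


# Expect strings copied from hits in this report.
values = ["e-127", "9e-24", "0.004", "0.0"]
for value in sorted(values, key=expect_to_float):
    print(value)
```

Sorting on the converted value rather than on the raw string keeps `0.0` and `e-127` ahead of weaker hits such as `0.004`.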


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_0247  SECA          30     0.024    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 30.2 bits (68), Expect = 0.024
Identities = 24/95 (25%), Positives = 46/95 (48%), Gaps = 5/95 (5%)

Query: 231 DITQRFYVINEHERAEAIM-HLLDTQAPKKSI-VFTRTKKEADELHQFLASKNYKSTALH 288
D+ Y + E E+ +AI+ + + A + + V T + ++++ + L K L+
Sbjct: 422 DLPDLVY-MTEAEKIQAIIEDIKERTAKGQPVLVGTISIEKSELVSNELTKAGIKHNVLN 480

Query: 289 GDMDQRDRRASIMAFKKNDADVLVATDVASRGLDI 323
+ A+I+A A V +AT++A RG DI
Sbjct: 481 AKFHANE--AAIVAQAGYPAAVTIATNMAGRGTDI 513


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_0249  BINARYTOXINA  27     0.040    Clostridial binary toxin A signature.
		>BINARYTOXINA#Clostridial binary toxin A signature.

Length = 454

Score = 27.3 bits (60), Expect = 0.040
Identities = 19/80 (23%), Positives = 35/80 (43%), Gaps = 9/80 (11%)

Query: 91 LSYSNAFSLQVGVKNISRFSLNKCVLRLEVLKN-PHNFVEEHAFKWFVKKSYEKTFKEKI 149
++Y N S +G N+S F+ K +LR+ + K+ P ++ A + + E +
Sbjct: 372 ITYPNFISTSIGSVNMSAFAKRKIILRINIPKDSPGAYL--SAIPGYAGE------YEVL 423

Query: 150 LPKESKVFSFFIDNYSHSKT 169
L SK +D+Y
Sbjct: 424 LNHGSKFKINKVDSYKDGTV 443


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_0250  HTHFIS        32     0.007    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 31.7 bits (72), Expect = 0.007
Identities = 16/50 (32%), Positives = 21/50 (42%), Gaps = 7/50 (14%)

Query: 30 VAIVGESGSGKSSIANLIMRLNPR----FKPHNGEVLFETTNLLKESEEF 75
+ I GESG+GK +A + R F N + L ESE F
Sbjct: 163 LMITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPRD---LIESELF 209


21  HPP12_0345  HPP12_0353  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPP12_0345  -3      11       1.021865   flagellar basal-body M-ring protein
HPP12_0346  -3      11       1.097331   flagellar motor switch protein
HPP12_0347  -2      10       1.178541   flagellar export protein
HPP12_0348  -1      8        1.630700   1-deoxyxylulose-5-phosphate synthase
HPP12_0349  1       10       1.023276   GTP-binding membrane protein
HPP12_0350  0       12       -1.184507  hypothetical protein
HPP12_0351  -1      12       0.240834   hypothetical protein
HPP12_0352  0       13       0.017366   flagellar basal-body rod protein
HPP12_0353  1       13       -0.433581  alpha-ketoglutarate permease
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_0345  FLGMRINGFLIF  557    0.0      Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 557 bits (1436), Expect = 0.0
Identities = 181/582 (31%), Positives = 295/582 (50%), Gaps = 66/582 (11%)

Query: 11 VDFFIKLNKKQKIALIAAGVLITALLVFLLLYPFKEKDYAQGGYGVLFEGLDSSDNALIL 70
+++ +L +I LI AG A++V ++L+ K DY LF L D I+
Sbjct: 13 LEWLNRLRANPRIPLIVAGSAAVAIVVAMVLWA-KTPDYR-----TLFSNLSDQDGGAIV 66

Query: 71 QHLQQNQIPYKVSRDD-TILIPKDKVYEERITLASQGIPKTSKVGFEIFDTKDFGATDFD 129
L Q IPY+ + I +P DKV+E R+ LA QG+PK VGFE+ D + FG + F
Sbjct: 67 AQLTQMNIPYRFANGSGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQFS 126

Query: 130 QNIKLIRAIEGELSRTIESLNPILKANVHIAIPKDSVFVAKEVPPSASVMLKLNPDMKLS 189
+ + RA+EGEL+RTIE+L P+ A VH+A+PK S+FV ++ PSASV + L P L
Sbjct: 127 EQVNYQRALEGELARTIETLGPVKSARVHLAMPKPSLFVREQKSPSASVTVTLEPGRALD 186

Query: 190 PTQILGIKNLIAAAVPKLTIDNVKIVNENGESIGEGDILENSKELALEQLHYKQNFENIL 249
QI + +L+++AV L NV +V+++G + + + + ++L QL + + E+ +
Sbjct: 187 EGQISAVVHLVSSAVAGLPPGNVTLVDQSGHLLTQSNT--SGRDLNDAQLKFANDVESRI 244

Query: 250 ENKIVNILAPIVGGKNKVVARVNAEFDFSQKKSTKETFDPNN-----VVRSEQNLEEKKE 304
+ +I IL+PIVG N V A+V A+ DF+ K+ T+E + PN +RS Q ++
Sbjct: 245 QRRIEAILSPIVGNGN-VHAQVTAQLDFANKEQTEEHYSPNGDASKATLRSRQLNISEQV 303

Query: 305 GAPKKQVGGVPGVVSN-IGPVQGLKDNKEPEKYEKSQN---------------------- 341
GA GGVPG +SN P P + +QN
Sbjct: 304 GAGYP--GGVPGALSNQPAPPNEAPIATPPTNQQNAQNTPQTSTSTNSNSAGPRSTQRNE 361

Query: 342 TTNYEVGKTISEIKGEFGTLVRLNAAVVVDGKYKIALKDGANALEYEPLSDESLKKINAL 401
T+NYEV +TI K G + RL+ AVVV+ K L DG + PL+ + +K+I L
Sbjct: 362 TSNYEVDRTIRHTKMNVGDIERLSVAVVVNYK---TLADG----KPLPLTADQMKQIEDL 414

Query: 402 VKQAIGYNQNRGDDVAVSNFEFNPMVPVIDNATLSEKIMHKTQKILGSFTPLIKYVLVFI 461
++A+G++ RGD + V N F+ +DN T E + Q + +++LV +
Sbjct: 415 TREAMGFSDKRGDTLNVVNSPFS----AVDN-TGGELPFWQQQSFIDQLLAAGRWLLVLV 469

Query: 462 VLFIFYKKVIVPFSERMLEVVPDEDKEVKSMFEEMDEEEDELNKLGDLRKKVEDQLGLNA 521
V +I ++K + P R +E ++ + E + E L+K L+++ +Q
Sbjct: 470 VAWILWRKAVRPQLTRRVEEAKAAQEQAQVRQETEEAVEVRLSKDEQLQQRRANQ----- 524

Query: 522 TFSEEEVRYEIILEKIRGTLKERPDEIAMLFKLLIKDEISSD 563
+ E++ ++IR E D + L+I+ +S+D
Sbjct: 525 -----RLGAEVMSQRIR----EMSDNDPRVVALVIRQWMSND 557


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_0346  FLGMOTORFLIG  350    e-122    Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 350 bits (900), Expect = e-122
Identities = 122/338 (36%), Positives = 209/338 (61%), Gaps = 4/338 (1%)

Query: 8 KQKAQLDELSMSEKIAILLIQVGEDTTGEILRHLDIDSITEISKQIVQLNGTDKQIGAAV 67
K+ + L+ +K AILL+ +G + + ++ ++L + I ++ +I +L ++ V
Sbjct: 7 KEILDVSALTGKQKAAILLVSIGSEISSKVFKYLSQEEIESLTFEIAKLETITSELKDNV 66

Query: 68 LEEFFAIFQSNQYINTGGLEYARELLTRTLGSEEARKVMDKLTKSLQTQKNFAYLGKIKP 127
L EF + + ++I GG++YARELL ++LG+++A +++ L +LQ+ + F ++ + P
Sbjct: 67 LLEFKELMMAQEFIQKGGIDYARELLEKSLGTQKAVDIINNLGSALQS-RPFEFVRRADP 125

Query: 128 QQLADFIINEHPQTIALILAHMEAPNAAETLSYFPDEMKAEISIRMANLGEISPQVVKRV 187
+ +FI EHPQTIALIL++++ A+ LS P E++ ++ R+A + SP+VV+ V
Sbjct: 126 ANILNFIQQEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEVVREV 185

Query: 188 STVLENKLESLTSYK-IEVGGLRAVAEIFNRLGQKSAKTTLARIESVDNKLAGAIKEMMF 246
VLE KL SL+S GG+ V EI N +K+ K + +E D +LA IK+ MF
Sbjct: 186 ERVLEKKLASLSSEDYTSAGGVDNVVEIINMADRKTEKFIIESLEEEDPELAEEIKKKMF 245

Query: 247 TFEDIVKLDNFAIREILKVADKKDLSLALKTSTKDLTDKFLNNMSSRAAEQFVEEMQYLG 306
FEDIV LD+ +I+ +L+ D ++L+ ALK+ + +K NMS RAA E+M++LG
Sbjct: 246 VFEDIVLLDDRSIQRVLREIDGQELAKALKSVDIPVQEKIFKNMSKRAASMLKEDMEFLG 305

Query: 307 AVKIKDVDVAQRKIIEIVQSLQEKG--VIQTGEEEDVI 342
+ KDV+ +Q+KI+ +++ L+E+G VI G EEDV+
Sbjct: 306 PTRRKDVEESQQKIVSLIRKLEEQGEIVISRGGEEDVL 343



Score = 31.3 bits (71), Expect = 0.006
Identities = 20/103 (19%), Positives = 41/103 (39%), Gaps = 3/103 (2%)

Query: 4 KLTPKQKAQLDELSMSEKIAILLIQVGEDTTGEILRHLDIDSITEISKQIVQLNGTDKQI 63
+ P + + IA++L + IL L + T ++++I ++ T ++
Sbjct: 122 RADPANILNFIQQEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEV 181

Query: 64 GAA---VLEEFFAIFQSNQYINTGGLEYARELLTRTLGSEEAR 103
VLE+ A S Y + GG++ E++ E
Sbjct: 182 VREVERVLEKKLASLSSEDYTSAGGVDNVVEIINMADRKTEKF 224


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
HPP12_0347  FLGFLIH  37     5e-05    Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 36.7 bits (84), Expect = 5e-05
Identities = 44/207 (21%), Positives = 90/207 (43%), Gaps = 14/207 (6%)

Query: 50 PLEKKAIENDLIDCLLKKTDELSSHLVKLQMQFEKAQEES-KALIENAKNDGYKIGFKEG 108
E I + + L L +LQMQ A E+ +A I + G+K G++EG
Sbjct: 19 QAEFVPIVEPEETIIEEAEPSLEQQLAQLQMQ---AHEQGYQAGIAEGRQQGHKQGYQEG 75

Query: 109 EEKMRNELTHSVNEEKNQLLHAITTLDEKMKKSEDHLMALE----KELSAIAIDIAKEVI 164
+ L + E K+Q + + + + + L AL+ L +A++ A++VI
Sbjct: 76 ---LAQGLEQGLAEAKSQQAPIHARMQQLVSEFQTTLDALDSVIASRLMQMALEAARQVI 132

Query: 165 LKEVEDSSQKVALALTEELLKNVLDATDIHLKVNPLDYPYLNERLQNASKI---KLESNE 221
+ + + + + L + L + L+V+P D +++ L + +L +
Sbjct: 133 GQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGATLSLHGWRLRGDP 192

Query: 222 AISKGGVMITSSNGSLDGNLMERFKTL 248
+ GG +++ G LD ++ R++ L
Sbjct: 193 TLHPGGCKVSADEGDLDASVATRWQEL 219


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
HPP12_0349  TCRTETOQM  113    2e-28    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 113 bits (283), Expect = 2e-28
Identities = 53/162 (32%), Positives = 89/162 (54%), Gaps = 7/162 (4%)

Query: 11 NIRNFSIIAHIDHGKSTLADCLIAECNAIS---NREMTSQVMDTMDIEKERGITIKAQSV 67
I N ++AH+D GK+TL + L+ AI+ + + + D +E++RGITI+
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 68 RLNYTFKGEDYVLNLIDTPGHVDFSYEVSRSLCSCEGALLVVDATQGVEAQTIANTYIAL 127
+F+ E+ +N+IDTPGH+DF EV RSL +GA+L++ A GV+AQT +
Sbjct: 62 ----SFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALR 117

Query: 128 DNNLEILPVINKIDLPNANVLEIKQDIEDTIGIDCSNANEVS 169
+ + INKID ++ + QDI++ + + +V
Sbjct: 118 KMGIPTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVE 159



Score = 83.4 bits (206), Expect = 7e-19
Identities = 50/215 (23%), Positives = 90/215 (41%), Gaps = 17/215 (7%)

Query: 169 SAKAKLGIKDLLEKIITTIPAPSGDPNAPLKALIYDSWFDNYLGALALVRIMDGSINTEQ 228
SAK +GI +L+E I + + + L ++ + LA +R+ G ++
Sbjct: 220 SAKNNIGIDNLIEVITNKFYSSTHRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRD 279

Query: 229 EILVMGTGKKHGVLGLYYPNPLKKIPTKSLECGEIGIV---SLGLKSVTDIAVGDTLTDA 285
+ + K + +Y + GEI I+ L L SV +GDT
Sbjct: 280 SVRISEKEKI-KITEMYTSINGELCKIDKAYSGEIVILQNEFLKLNSV----LGDTKLL- 333

Query: 286 KNPTPKPIEGFMPAKPFVFAGLYPIETDRFEDLREALLKLQLNDCALNFEPESSVALGFG 345
P + IE P + + P + + E L +ALL++ +D L + +S+
Sbjct: 334 --PQRERIEN---PLPLLQTTVEPSKPQQREMLLDALLEISDSDPLLRYYVDSATH---E 385

Query: 346 FRVGFLGLLHMEVIKERLEREFGLNLIATAPTVVY 380
+ FLG + MEV L+ ++ + + PTV+Y
Sbjct: 386 IILSFLGKVQMEVTCALLQEKYHVEIEIKEPTVIY 420



Score = 31.0 bits (70), Expect = 0.016
Identities = 15/75 (20%), Positives = 28/75 (37%), Gaps = 2/75 (2%)

Query: 407 IKEPFVRATIITPSEFLGNLMQLLNNKRGIQEKMEYLNQSRVMLTYSLPSNEIVMDFYDK 466
+ EP++ I P E+L + L + V+L+ +P+ I ++
Sbjct: 535 LLEPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQ-LKNNEVILSGEIPARCI-QEYRSD 592

Query: 467 LKSCTKGYASFDYEP 481
L T G + E
Sbjct: 593 LTFFTNGRSVCLTEL 607


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
HPP12_0352  FLGHOOKAP1  30     0.008    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 30.3 bits (68), Expect = 0.008
Identities = 9/40 (22%), Positives = 16/40 (40%)

Query: 3 NGYYAATGAMATQFNRLDLTSNNLANLNTNGFKRDDAITG 42
+ A + L+ SNN+++ N G+ R I
Sbjct: 2 SLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMA 41


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
HPP12_0353  TCRTETB  41     9e-06    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 40.6 bits (95), Expect = 9e-06
Identities = 58/315 (18%), Positives = 104/315 (33%), Gaps = 67/315 (21%)

Query: 37 APYFAKEFTHTNDPTLALISAFLVFMLGFFMRPLGSLFFGKLGDKKGRKTSMVYSIILMA 96
P A +F T + +AF++ G+ +GKL D+ G K +++ II+
Sbjct: 37 LPDIANDFNKPPASTNWVNTAFMLTFSI------GTAVYGKLSDQLGIKRLLLFGIIINC 90

Query: 97 LGSFLLALLPTKEIVGEWAFLFLLLARLLQGFSVGGE------YGVVATYLSELGKNGKK 150
GS + VG F L++AR +QG G VVA Y+ + +
Sbjct: 91 FGSVIGF-------VGHSFFSLLIMARFIQG--AGAAAFPALVMVVVARYIPKENRGKAF 141

Query: 151 GFYGSFQYVTLVGGQLLAIFSLFIVENIYTHEQISAFAWRYLFALGGILALLSLFLRNIM 210
G GS + +G + I I+ W YL + I + FL ++
Sbjct: 142 GLIGS---IVAMGEGVGPAIGGMIAHYIH---------WSYLLLIPMITIITVPFLMKLL 189

Query: 211 EETMDRKTTPKTTIKEETQRGSLKELLNHKKALM-------IVFGLTMGGSLCFYTFTVY 263
+ + +K + K ++ + T +
Sbjct: 190 K-----------------KEVRIKGHFDIKGIILMSVGIVFFMLFTTSYSISFLIVSVLS 232

Query: 264 LKIFLTNSSSFSPK-------ESSFIMLLALSYFIFLQPLCG---MLADKIKRTQMLMVF 313
IF+ + + ++ M+ L I + G M+ +K L
Sbjct: 233 FLIFVKHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTA 292

Query: 314 AIAGLIVTPVVFYGI 328
I +I+ P I
Sbjct: 293 EIGSVIIFPGTMSVI 307


22  HPP12_0522  HPP12_0537  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPP12_0522  -1      11       -1.760805  heat shock protein ATP-binding subunit
HPP12_0523  0       16       -2.897680  GTP-binding protein Era-like protein
HPP12_0524  3       15       -2.577617  hypothetical protein
HPP12_0525  6       19       -2.759215  hypothetical protein
HPP12_0526  8       21       -1.979932  hypothetical protein
HPP12_0527  8       20       -1.902007  cag pathogenicity island protein Zeta
HPP12_0528  8       20       -2.156835  cag pathogenicity island protein Epsilon
HPP12_0529  9       17       -1.828058  cag pathogenicity island protein Delta
HPP12_0530  9       18       -2.295282  cag pathogenicity island protein Gamma
HPP12_0531  8       19       -2.423509  cag pathogenicity island protein Beta
HPP12_0532  9       20       -2.900270  cag pathogenicity island protein Alpha
HPP12_0533  9       20       -3.119577  cag pathogenicity island protein Z
HPP12_0534  9       20       -2.952207  cag pathogenicity island protein Y VirB10-like
HPP12_0535  9       26       -4.324833  cag pathogenicity island protein X VirB9-like
HPP12_0536  9       28       -4.310962  cag pathogenicity island protein W
HPP12_0537  12      29       -5.102961  cag pathogenicity island protein V
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
HPP12_0522  HTHFIS  29     0.047    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 29.0 bits (65), Expect = 0.047
Identities = 11/36 (30%), Positives = 18/36 (50%), Gaps = 3/36 (8%)

Query: 51 TPKNILMIGSTGVGKTEIARRI---AKIMELPFVKV 83
T +++ G +G GK +AR + K PFV +
Sbjct: 159 TDLTLMITGESGTGKELVARALHDYGKRRNGPFVAI 194


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
HPP12_0523  PF03944  32     0.002    delta endotoxin
		>PF03944#delta endotoxin

Length = 633

Score = 32.3 bits (73), Expect = 0.002
Identities = 28/110 (25%), Positives = 53/110 (48%), Gaps = 6/110 (5%)

Query: 68 LHHQEKLLNQCMLSQALKAMGDAELCVFLASVHDDLKGYEEFLSLCQKPHILALSKIDTA 127
L E+ LNQ + + + A +AEL A+V + + + FL+ + L+++
Sbjct: 94 LRETERFLNQRLNTDTV-ARVNAELTGLQANVEEFNRQVDNFLNPNRNAVPLSITSSVNT 152

Query: 128 THKQVLQKLQEYQQYDSQFLALVPLSAKKSQ-NLNALLECISKHLNPSAW 176
+ L +L ++Q Q L L+PL A+ + +L+ + + I LN W
Sbjct: 153 MQQLFLNRLPQFQMQGYQLL-LLPLFAQAANLHLSFIRDVI---LNADEW 198


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_0527  TYPE3IMSPROT  29     0.004    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 29.0 bits (65), Expect = 0.004
Identities = 13/68 (19%), Positives = 24/68 (35%), Gaps = 1/68 (1%)

Query: 18 NAFVNFFKNSLADKRYDSLGLIGAGVLCCVLSGAMGIVGIIFVAIGIFLSFSNINLVKLI 77
A N L + Y L+ L + S + G + I IN ++
Sbjct: 70 QALSYVVDNVLLEFFYLCFPLLTVAALMAIASHVV-QYGFLISGEAIKPDIKKINPIEGA 128

Query: 78 EKLFKKQS 85
+++F +S
Sbjct: 129 KRIFSIKS 136


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
HPP12_0529  PF07201  30     0.023    Hypersensitivity response secretion protein HrpJ
		>PF07201#Hypersensitivity response secretion protein HrpJ

Length = 293

Score = 29.8 bits (67), Expect = 0.023
Identities = 14/76 (18%), Positives = 26/76 (34%), Gaps = 15/76 (19%)

Query: 277 APENSKEKLIEELIANSQLIANEEEREKKLLAEKEKQ--------EAELAKY--KLKDLE 326
S + EE+ E +E L K E ++ +Y K+ +LE
Sbjct: 44 GTLQSIADMAEEVTF-----VFSERKELSLDKRKLSDSQARVSDVEEQVNQYLSKVPELE 98

Query: 327 NQKKLKALEAELKKKN 342
++ + L + L
Sbjct: 99 QKQNVSELLSLLSNSP 114


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
HPP12_0530  TACYTOLYSIN  27     0.039    Bacterial thiol-activated pore-forming cytolysin sig...
		>TACYTOLYSIN#Bacterial thiol-activated pore-forming cytolysin signature.

Length = 574

Score = 27.3 bits (60), Expect = 0.039
Identities = 12/43 (27%), Positives = 22/43 (51%), Gaps = 3/43 (6%)

Query: 128 NKSVYQLVEMAIGAYNGG-MKHDPNGAYVKKFRCIYSQVRYNE 169
N+S Y VE Y G + GAYV ++ ++ ++ Y++
Sbjct: 451 NRSEY--VETTSTEYTSGKINLSHQGAYVAQYEILWDEINYDD 491


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
HPP12_0534  IGASERPTASE  39     2e-04    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 39.3 bits (91), Expect = 2e-04
Identities = 36/223 (16%), Positives = 85/223 (38%), Gaps = 25/223 (11%)

Query: 579 TPEAKKLLEEEAKESVKAYLDCVSQAKTEAEKKECEKLLTPEAKKLLEEAKKSVKAYLDC 638
P ++ + + ++KT + ++ T + +++ +EAK +VKA
Sbjct: 1023 APVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQT 1082

Query: 639 VSQAKNEAEKKECEKLLTPEAKKLLEQQALDCLKNAKTEADKKRCVKDLPKDLQKKVLAK 698
A++ +E KE + T E + +++ AK E +K + V KV ++
Sbjct: 1083 NEVAQSGSETKETQTTETKETATVEKEE------KAKVETEKTQEV--------PKVTSQ 1128

Query: 699 ESVKAYLDCVSQAKNEAEKKECEKLLTPEARKLLEE--AKESVKAYKDCVSKARNEKEKK 756
S K Q ++E + + E + ++E ++ + A + +K + ++
Sbjct: 1129 VSPK-------QEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQ 1181

Query: 757 ECEKLLTPEAKKLLEEEAKESVKAYLDCVSQAKTEAEKKECEK 799
+ T + E + + A +E+ K +
Sbjct: 1182 PVTESTTVNTGNSVVENPENTTPA--TTQPTVNSESSNKPKNR 1222



Score = 38.9 bits (90), Expect = 3e-04
Identities = 36/229 (15%), Positives = 83/229 (36%), Gaps = 10/229 (4%)

Query: 763 TPEAKKLLEEEAKESVKAYLDCVSQAKTEAEKKECEKLLTPEARKLLEEAKESLKAYKDC 822
P ++ + + ++KT + ++ T + R++ +EAK ++KA
Sbjct: 1023 APVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQT 1082

Query: 823 LSQARNEEERRACEKLLTPEARKLLEQQALDCLKNAKTDEERKKCLKDLPKDLQSDILAK 882
A++ E + + T E + +++ KT E K + PK QS+ +
Sbjct: 1083 NEVAQSGSETKETQTTETKETATVEKEEKAKVET-EKTQEVPKVTSQVSPKQEQSETVQP 1141

Query: 883 --------ESVKAYKDCVSQAKTEAEKKECEKLLTPEAKKLLEEAKKSVKAYLDCVSQAK 934
+ K+ SQ T A+ ++ K + ++ + E+ +
Sbjct: 1142 QAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPEN 1201

Query: 935 TEAEKKECEKLLTPEAKKLLEEAKKSVKAYLDCVSQAKNEAEKKECEKL 983
T + + + + K ++SV++ V A + + L
Sbjct: 1202 TTPATTQ-PTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVAL 1249



Score = 37.4 bits (86), Expect = 7e-04
Identities = 34/196 (17%), Positives = 82/196 (41%), Gaps = 6/196 (3%)

Query: 881 AKESVKAYKDCVSQAKTEAEKKECEKLLTPEAKKLLEEAKKSVKAYLDCVSQAKTEAEKK 940
++ + ++ ++KT + ++ T + +++ +EAK +VKA A++ +E K
Sbjct: 1034 SETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETK 1093

Query: 941 ECEKLLTPEAKKLLEEAKKSVKA--YLDCVSQAKNEAEKKECEKLLTPEAKKLLEQQALD 998
E + T E + +E K V+ + + K+E + + P+A+ E
Sbjct: 1094 ETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTV 1153

Query: 999 CLKNAK----TEADKKRCVKDLPKDLQKKVLAKESVKAYLDCVSQAKNEAEKKECEKLLT 1054
+K + T AD ++ K+ ++++ V +V V +N + +
Sbjct: 1154 NIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNS 1213

Query: 1055 PEARKLLEEAKESLKA 1070
+ K + S+++
Sbjct: 1214 ESSNKPKNRHRRSVRS 1229



Score = 36.6 bits (84), Expect = 0.001
Identities = 32/173 (18%), Positives = 70/173 (40%), Gaps = 23/173 (13%)

Query: 894 QAKTEAEKKECEKLLTPEAKKLLEEAKKSVKAYLDCVSQAKTEAEKKECEKLLTPEAKKL 953
+ E + E + P A ++ + + ++KT + ++ T + +++
Sbjct: 1011 PSNNEEIARVDEAPVPPPAPAT--PSETTETVAENSKQESKTVEKNEQDATETTAQNREV 1068

Query: 954 LEEAKKSVKAYLDCVSQAKNEAEKKECEKLLTPEAKKLLEQQALDCLKNAKTEADKKRCV 1013
+EAK +VKA A++ +E KE + T E + +++ AK E +K + V
Sbjct: 1069 AKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEE------KAKVETEKTQEV 1122

Query: 1014 KDLPKDLQKKVLAKESVKAYLDCVSQAKNEAEKKECEKLLTPEARKLLEEAKE 1066
KV ++ S K Q ++E + + E + ++E +
Sbjct: 1123 --------PKVTSQVSPK-------QEQSETVQPQAEPARENDPTVNIKEPQS 1160



Score = 34.7 bits (79), Expect = 0.005
Identities = 30/169 (17%), Positives = 62/169 (36%), Gaps = 7/169 (4%)

Query: 708 VSQAKNEAEKKECEKLLTPEARKLLEEAKESVKAYKDCVSKARNEKEKKECEKLLTPEAK 767
S+ + E+ E T + R++ +EAK +VKA A++ E KE + T E K
Sbjct: 1047 ESKTVEKNEQDATET--TAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQ---TTETK 1101

Query: 768 KLLEEEAKESVKAYLDCVSQAKTEAEKKECEKLLTPEARKLLEEAKESLKAYKDCLSQAR 827
+ E +E K + + + ++ + + E A+E+ + + +
Sbjct: 1102 ETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVN--IKEPQ 1159

Query: 828 NEEERRACEKLLTPEARKLLEQQALDCLKNAKTDEERKKCLKDLPKDLQ 876
++ A + E +EQ + + + P Q
Sbjct: 1160 SQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQ 1208



Score = 32.7 bits (74), Expect = 0.016
Identities = 41/246 (16%), Positives = 84/246 (34%), Gaps = 17/246 (6%)

Query: 710 QAKNEAEKKECEKLLTPEARKLLEEAKESVKAYKDCVSKARNEKEKKECEKLLTPEAKKL 769
+ NE + E + P A E E+V SK + E+ E +
Sbjct: 1011 PSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQ---NRE 1067

Query: 770 LEEEAKESVKAYLDC--VSQAKTEA------EKKECEKLLTPEARKLL--EEAKESLKAY 819
+ +EAK +VKA V+Q+ +E E KE + E K+ + +
Sbjct: 1068 VAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTS 1127

Query: 820 KDCLSQARNEEERRACEKLLTPEARKLLEQQALDCLKNAKTDEERKKCLKDLPKDLQSDI 879
+ Q ++E + E + +++ A T++ K+ ++ + +
Sbjct: 1128 QVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTES- 1186

Query: 880 LAKESVKAYKDCVSQAKTEAEKKECEKLLTPEAKKLLEEAKKSVKAYLDCVSQAKTEAEK 939
+V V + + + + K ++SV++ V A T +
Sbjct: 1187 ---TTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSND 1243

Query: 940 KECEKL 945
+ L
Sbjct: 1244 RSTVAL 1249



Score = 32.3 bits (73), Expect = 0.022
Identities = 27/187 (14%), Positives = 67/187 (35%), Gaps = 4/187 (2%)

Query: 495 KARNEKEKKECEKLLTPEAKKKLEQQVLDCLKNAKTDEERKKCLKDLPKD--LQSDILAK 552
+ NE+ + E + P A + +N+K + + + + + Q+ +AK
Sbjct: 1011 PSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAK 1070

Query: 553 ESVKAYKDCVSQAKTEAEKKECEKLLTPEAKKLLEEEAKESVKAYLDCVSQAKTEAEKKE 612
E+ K + E ++ T E K+ E +E K + + +
Sbjct: 1071 EAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVS 1130

Query: 613 CEKLLTPEAKKLLEEAKKSVKAYL--DCVSQAKNEAEKKECEKLLTPEAKKLLEQQALDC 670
++ + + E A+++ + SQ A+ ++ K + ++ + +
Sbjct: 1131 PKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVN 1190

Query: 671 LKNAKTE 677
N+ E
Sbjct: 1191 TGNSVVE 1197



Score = 32.0 bits (72), Expect = 0.029
Identities = 30/178 (16%), Positives = 61/178 (34%), Gaps = 6/178 (3%)

Query: 1037 VSQAKNEAEKKECEKLLTPEARKLLEEAKESLKAYKDCLSQARNEEERRACEKLLTPEAR 1096
S+ + E+ E T + R++ +EAK ++KA A++ E + + T E
Sbjct: 1047 ESKTVEKNEQDATET--TAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETA 1104

Query: 1097 KLLEQEVKKSVKAYLDCVSR-ARNEKEKQECEKLLTPEARKFLAKQVLNCLEKAGNEEER 1155
+ ++E K V + KQE + + P+A +++ ++
Sbjct: 1105 TVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNT 1164

Query: 1156 KACLKNLPKDLQENVLAKESLKAYKDCLSQ-ARNEEERRACEKLLTPEARKLLEQEVK 1212
A + K+ NV + + + N E P + K
Sbjct: 1165 TADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATT--QPTVNSESSNKPK 1220


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
HPP12_0535  TYPE4SSCAGX  870    0.0      Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein signature.

Length = 522

Score = 870 bits (2249), Expect = 0.0
Identities = 511/522 (97%), Positives = 514/522 (98%)

Query: 1 MEKAFFKKIVGCFCLGYLFLSSVIEAAAPDIKNFNRGRVKVVNKKIAYLGDEKPITIWTS 60
M +AFFKKIVGCFCLGYLFLSS IEA A DIKNFNRGRVKVVNKKIAYLGDEKPITIWTS
Sbjct: 1 MGQAFFKKIVGCFCLGYLFLSSAIEAVALDIKNFNRGRVKVVNKKIAYLGDEKPITIWTS 60

Query: 61 LDNVTVIQLEKDETISYITTGFNKGWSIVPNSNHIFIQPKSVKSNLMFEKEAVNFALMTR 120
LDNVTVIQLEKDETISYITTGFNKGWSIVPNSNHIFIQPKSVKSNLMFEKEAVNFALMTR
Sbjct: 61 LDNVTVIQLEKDETISYITTGFNKGWSIVPNSNHIFIQPKSVKSNLMFEKEAVNFALMTR 120

Query: 121 DYQEFLKTKKLIVDAPDPKELEEQKKALEKEKEAKEQAQKAQKDKREKRKEERAKNRANL 180
DYQEFLKTKKLIVDAPDPKELEEQKKALEKEKEAKEQAQKAQKDKREKRKEERAKNRANL
Sbjct: 121 DYQEFLKTKKLIVDAPDPKELEEQKKALEKEKEAKEQAQKAQKDKREKRKEERAKNRANL 180

Query: 181 ENLTNAMSNPQNLSNNKNLSEFIKQQRENELDQMERLEDMQEQAQANALKQIEELNKKQA 240
ENLTNAMSNPQNLSNNKNLSE IKQQRENELDQMERLEDMQEQAQANALKQIEELNKKQA
Sbjct: 181 ENLTNAMSNPQNLSNNKNLSELIKQQRENELDQMERLEDMQEQAQANALKQIEELNKKQA 240

Query: 241 EETIKQRAKDKISIKTDKPQKSPEDNSIELSPSDSAWRTNLVVRTNKALYQFILRIAQKD 300
EE ++QRAKDKISIKTDK QKSPEDNSIELSPSDSAWRTNLVVRTNKALYQFILRIAQKD
Sbjct: 241 EEAVRQRAKDKISIKTDKSQKSPEDNSIELSPSDSAWRTNLVVRTNKALYQFILRIAQKD 300

Query: 301 NFASAYLTVKLEYPQRHEVSSVIEEELKKREEAKRQRELIKQENLNTTAYINRVMMASNE 360
NFASAYLTVKLEYPQRHEVSSVIEEELKKREEAKRQRELIKQENLNTTAYINRVMMASNE
Sbjct: 301 NFASAYLTVKLEYPQRHEVSSVIEEELKKREEAKRQRELIKQENLNTTAYINRVMMASNE 360

Query: 361 QIINKEKIREEKQKIILDQAKALETQYVHNALKRNPVPRNYNYYQAPEKRSKHIMPSEIF 420
QIINKEKIREEKQKIILDQAKALETQYVHNALKRNPVPRNYNYYQAPEKRSKHIMPSEIF
Sbjct: 361 QIINKEKIREEKQKIILDQAKALETQYVHNALKRNPVPRNYNYYQAPEKRSKHIMPSEIF 420

Query: 421 DDGTFTYFGFKNITLQPAIFVVQPDGKLSMTDAAIDPNMTNLGLRWYRVNEIAEKFKLIK 480
DDGTFTYFGFKNITLQPAIFVVQPDGKLSMTDAAIDPNMTN GLRWYRVNEIAEKFKLIK
Sbjct: 421 DDGTFTYFGFKNITLQPAIFVVQPDGKLSMTDAAIDPNMTNSGLRWYRVNEIAEKFKLIK 480

Query: 481 DKALVTVINKGYGKNPLTKNYNIKNYGELERVIKKLPLVRDK 522
DKALVTVINKGYGKNPLTKNYNIKNYGELERVIKKLPLVRDK
Sbjct: 481 DKALVTVINKGYGKNPLTKNYNIKNYGELERVIKKLPLVRDK 522


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
HPP12_0537  PF04335  118    8e-35    VirB8 type IV secretion protein
		>PF04335#VirB8 type IV secretion protein

Length = 227

Score = 118 bits (297), Expect = 8e-35
Identities = 44/205 (21%), Positives = 74/205 (36%), Gaps = 10/205 (4%)

Query: 27 KLNKANRTFKRAFYL---SMALNVAAVTSIVMMMPLKKTDIFVYGIDRYTGEFKIVKRSD 83
KL A R+ K A+ + + AL A V ++ + PLK + +V +DR TGE I +
Sbjct: 24 KLAAAERSKKLAWVVAGVAGALATAGVVAVAALTPLKTVEPYVITVDRNTGEASIAAKLH 83

Query: 84 A-RQIVNSEAVVDSATSKFVSLLFGYSKNSLRDRKDQLMQYCDVSFQTQAMRMFNENIRQ 142
I EAV + +V G+ + + D +M Q + R + + Q
Sbjct: 84 GDATITYDEAVRKYFLATYVRYREGWIAAAREEYFDAVMVMSARPEQDRWSRFYKTDNPQ 143

Query: 143 FVDKVRA-EAIISSNIQREKVKNSPLTRLTFFITIKITPDTMENYEYITKKQITIYYDFA 201
+ A + I + +F +T T TI Y
Sbjct: 144 SPQNILANRTDVFVEI-KRVSFLGGNVAQVYFTKESVTGSNS----TKTDAVATIKYKVD 198

Query: 202 RGNSSQENLIINPFGFKVFDIQITD 226
S + + NP G++V +
Sbjct: 199 GTPSKEVDRFKNPLGYQVESYRADV 223


23  HPP12_0585  HPP12_0591  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPP12_0585  2       14       -1.111467  hypothetical protein
HPP12_0586  2       15       -0.397802  neuraminidase
HPP12_0587  1       17       -0.226880  dihydroorotase
HPP12_0588  0       17       -2.366423  siderophore-mediated iron transport protein
HPP12_0589  -2      14       -2.721792  hypothetical protein
HPP12_0590  -2      14       -2.079594  flagellar motor switch protein
HPP12_0591  -1      12       -0.818895  endonuclease III
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_0585  TYPE3IMSPROT  30     0.006    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 29.7 bits (67), Expect = 0.006
Identities = 19/64 (29%), Positives = 30/64 (46%), Gaps = 4/64 (6%)

Query: 88 LQSYSVMLFFNLLLLTDILGFLPFSIYHHFMASLIFSALFCGSLFLSSPLLGVIALVALS 147
L Y F L+L+ +LPFS S + + +L PLL V AL+A++
Sbjct: 45 LSDYYFEHFSKLMLIPAEQSYLPFSQ----ALSYVVDNVLLEFFYLCFPLLTVAALMAIA 100

Query: 148 SSLL 151
S ++
Sbjct: 101 SHVV 104


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
HPP12_0588  TONBPROTEIN  52     5e-10    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 52.3 bits (125), Expect = 5e-10
Identities = 24/52 (46%), Positives = 27/52 (51%)

Query: 91 PQKPPTPPTPPTPPTPPTPPKPIEKPKPEPKPKPKPEPKKPNHKHKALKKVE 142
P P P P P P P IEKPKP+PKPKPKP K + +K VE
Sbjct: 67 PVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVE 118



Score = 46.1 bits (109), Expect = 6e-08
Identities = 27/74 (36%), Positives = 32/74 (43%), Gaps = 8/74 (10%)

Query: 83 APKPTLAGPQKPPTPPTPPTPPTPPTPPKPIEKPKPEPKPKPKPEPKKPNHKHKALKKVE 142
A Q PP P P P P P P E P KPKPKP+PK K V+
Sbjct: 53 ADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKP--------KPVK 104

Query: 143 KVEEKKVVEEKKEE 156
KV+E+ + K E
Sbjct: 105 KVQEQPKRDVKPVE 118



Score = 45.0 bits (106), Expect = 1e-07
Identities = 21/65 (32%), Positives = 27/65 (41%), Gaps = 1/65 (1%)

Query: 83 APKPTLAGPQKPPTPPTPPTPPTPPTPPKPIEKPKPEPKPKPKPEPK-KPNHKHKALKKV 141
P P P P P P PPK +PKPKPKP+PK + + + V
Sbjct: 55 LEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDV 114

Query: 142 EKVEE 146
+ VE
Sbjct: 115 KPVES 119



Score = 38.8 bits (90), Expect = 2e-05
Identities = 43/218 (19%), Positives = 80/218 (36%), Gaps = 38/218 (17%)

Query: 101 PTPPTPPTPPKPIEKPKPEPKPKPKPEPKKPNHKHKALKKVEKVEEKKVVEEKKEEKKIV 160
P PP +P +P EP+P+P+P P+ P +E VV EK + K
Sbjct: 52 PADLEPPQAVQPPPEPVVEPEPEPEPIPEPP-------------KEAPVVIEKPKPKPKP 98

Query: 161 EQKVEQKVEQKKVEEKKPVKKEFDPNQLSFLPKEVAPPRKENNKGLDNQTRRDIDELYGE 220
+ K +KV+++ + KPV E P N T +
Sbjct: 99 KPKPVKKVQEQPKRDVKPV--------------ESRPASPFENTAPARLTSSTATAATSK 144

Query: 221 EFGDLGTAEKDFIRNNLRDIGRITQKYLEYPQVAAYLGQDGTNAVEFYLHPNGDITDLKI 280
+ + + RN + YP A L +G V+F + P+G + +++I
Sbjct: 145 PVTSVASGPRALSRNQPQ-----------YPARAQALRIEGQVKVKFDVTPDGRVDNVQI 193

Query: 281 IIGSEYKMLDDNTLKTIQIAYKDYPRPKTKTLIRIRVR 318
+ M + ++ + +P + ++ I +
Sbjct: 194 LSAKPANMFEREVKNAMRRWRYEPGKPGSGIVVNILFK 231



Score = 34.6 bits (79), Expect = 3e-04
Identities = 12/42 (28%), Positives = 17/42 (40%)

Query: 91 PQKPPTPPTPPTPPTPPTPPKPIEKPKPEPKPKPKPEPKKPN 132
P+ P P P P PKP K + +PK +P +
Sbjct: 79 PEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVESR 120



Score = 31.1 bits (70), Expect = 0.005
Identities = 14/56 (25%), Positives = 20/56 (35%)

Query: 76 PSKNTPGAPKPTLAGPQKPPTPPTPPTPPTPPTPPKPIEKPKPEPKPKPKPEPKKP 131
P + P K P P PKP++K + +PK KP +P
Sbjct: 66 EPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVESRP 121


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_0590  FLGMOTORFLIN  100    1e-30    Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 100 bits (250), Expect = 1e-30
Identities = 25/77 (32%), Positives = 47/77 (61%)

Query: 34 LICDYKNLLDMEIVFSAELGSTQIPLLQILRFEKGSVIDLQKPAGESVDTFVNGRVIGKG 93
+ D ++D+ + + ELG T++ + ++LR +GSV+ L AGE +D +NG +I +G
Sbjct: 50 AMQDIDLIMDIPVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQG 109

Query: 94 EVMVFERNLAIRLNEIL 110
EV+V +R+ +I+
Sbjct: 110 EVVVVADKYGVRITDII 126


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
HPP12_0591  OMS28PORIN  27     0.036    OMS28 porin signature.
		>OMS28PORIN#OMS28 porin signature.

Length = 257

Score = 27.4 bits (60), Expect = 0.036
Identities = 28/112 (25%), Positives = 53/112 (47%), Gaps = 11/112 (9%)

Query: 27 NQTTELHHKNPYELLVATILSAQCTDARVNQITPKLFEKYPSVNDLAL-----ASLEEVK 81
N+ E+ K E A ++ + T QI + K P+ +L L A +E+VK
Sbjct: 132 NKVVEMSKKAVQETQKAVSVAGEATFLIEKQI---MLNKSPNNKELELTKEEFAKVEQVK 188

Query: 82 EIIKSVSYFNNKSKHLISMAQKVVRDFKGVIPSTQKELMSLDGVGQKTANVV 133
E + + +++ + AQKV+ G+ PS + ++++ V + +NVV
Sbjct: 189 ETLMASERALDET---VQEAQKVLNMVNGLNPSNKDQVLAKKDVAKAISNVV 237


24  HPP12_0614  HPP12_0618  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPP12_0614  1       9        0.334160   membrane fusion protein of the HefABC efflux
HPP12_0615  1       9        0.092938   cytoplasmic pump protein of the HefABC efflux
HPP12_0616  2       10       -0.923205  hypothetical protein
HPP12_0617  2       10       -1.012522  vacuolating cytotoxin VacA-like protein
HPP12_0618  2       21       -3.274302  vacuolating cytotoxin VacA-like protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
HPP12_0614  RTXTOXIND  50     2e-09    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 49.8 bits (119), Expect = 2e-09
Identities = 22/69 (31%), Positives = 34/69 (49%)

Query: 40 STGIVDSIKVTEGSVVKKGDVLLLLYNQDKQAQSDSTEQQLIFAKKQYQRYSKIGGAVDK 99
IV I V EG V+KGDVLL L +A + T+ L+ A+ + RY + +++
Sbjct: 103 ENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIEL 162

Query: 100 NTLEGYEFT 108
N L +
Sbjct: 163 NKLPELKLP 171



Score = 30.6 bits (69), Expect = 0.005
Identities = 23/152 (15%), Positives = 50/152 (32%), Gaps = 25/152 (16%)

Query: 70 QAQSDSTEQQLIFAKKQYQR--YSKIGGAVDKNTLEGYEFTYRRLESDYAYSIAVLNKTI 127
+++ S +++ + ++ K+ D L L + A + ++
Sbjct: 279 ESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGL---------LTLELAKNEERQQASV 329

Query: 128 LRAPFDGVIASKNIQVGEGVSANNTVLLRLVSHARKLVIE--FDSKYINAVKVG------ 179
+RAP + + GV L+ +V L + +K I + VG
Sbjct: 330 IRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIK 389

Query: 180 -DTYTYSIDGDSNQHEAKITKIYP--TVDENT 208
+ + Y+ G K+ I D+
Sbjct: 390 VEAFPYTRYGYL---VGKVKNINLDAIEDQRL 418


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_0615  ACRIFLAVINRP  894    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 894 bits (2313), Expect = 0.0
Identities = 288/1040 (27%), Positives = 519/1040 (49%), Gaps = 42/1040 (4%)

Query: 1 MYKTAINRPITTLMFALAIVFFGVMGFKKLSVALFPKIDLPTVVVTTTYPGASAEIIESK 60
M I RPI + A+ ++ G + +L VA +P I P V V+ YPGA A+ ++
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VTDKIEEAVMGIDGIKKVTSTSSKNVSIVV-IEFELEKPNEEALNDVVNKISSVR-FDDS 118
VT IE+ + GID + ++STS S+ + + F+ + A V NK+
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 119 NIKKPSINKFDTDSQAIISLFVSSSSVPAT--TLNDYAKNTIKPMLQKINGVGGVQLNGF 176
+++ I+ + S ++ S + T ++DY + +K L ++NGVG VQL G
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFG- 179

Query: 177 RERQIRIYADPTLMNKYNLTYADLFSTLKAENVEIDGGRIVNS------QRELSILINAN 230
+ +RI+ D L+NKY LT D+ + LK +N +I G++ + Q SI+
Sbjct: 180 AQYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 231 SYSVADVEKIQV-----GNHVRLGDIAKIEIGLEEDNTFASFKDKPGVILEIQKIAGANE 285
+ + K+ + G+ VRL D+A++E+G E N A KP L I+ GAN
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANA 299

Query: 286 IEIVDRVYEALKHIQAISP-SYEIRPFLDTTSYIRTSIEDVKFDLVLGAILAVLVVFAFL 344
++ + L +Q P ++ DTT +++ SI +V L +L LV++ FL
Sbjct: 300 LDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFL 359

Query: 345 RNGTITLVSAISIPISIMGTFALIQWMGFSLNMLTMVALTLAIGIIIDDAIVVIENIHK- 403
+N TL+ I++P+ ++GTFA++ G+S+N LTM + LAIG+++DDAIVV+EN+ +
Sbjct: 360 QNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERV 419

Query: 404 KLEMGMSKRKASYEGVREIGFALVAISAMLLSVFVPIGNMKGIIGRFFQSFGITVALAIA 463
+E + ++A+ + + +I ALV I+ +L +VF+P+ G G ++ F IT+ A+A
Sbjct: 420 MMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMA 479

Query: 464 LSYVVVVTIIPMVSSVVVNPRHS-------RFYVWSEPFFKALESRYTKLLQWVLNHKLI 516
LS +V + + P + + ++ P + F+ W F + YT + +L
Sbjct: 480 LSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 517 ISIAVVLVFVGSLFVASKLGMDFMLKEDRGRFLVWLKAKPGVSIDY----MTQKSKIFQK 572
+ L+ G + + +L F+ +ED+G FL ++ G + + + Q + + K
Sbjct: 540 YLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLK 599

Query: 573 AIEKHAEVEFTTLQVGY-GTTQNPFKAKIFVQLKPLKERKKEGELGQFELMHALRKELRS 631
+ + E FT + G QN FV LKP +ER + ++H + EL
Sbjct: 600 NEKANVESVFTVNGFSFSGQAQNA--GMAFVSLKPWEERNGDENS-AEAVIHRAKMELGK 656

Query: 632 LPEAKGLDTINLSEVALIGGGGDSSPFQTFVFSHSQEAVDKSVENLRKFLLESPELKGKV 691
+ + + N+ + G ++ F + + D + + L + + +
Sbjct: 657 IRDGFVI-PFNMPAIV---ELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASL 712

Query: 692 ESYHTSTSESQPQLQLKILRQNANKYGVSAQTIGSVVSSAFSGTSQASVFKEDGKEYDMI 751
S + E Q +L++ ++ A GVS I +S+A G + + F + G+ +
Sbjct: 713 VSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGG-TYVNDFIDRGRVKKLY 771

Query: 752 IRVPDDKRVSVEDIKRLQVRNKYDKLMFLDALVEITETKSPSSISRYNRQRSVTVLAEPN 811
++ R+ ED+ +L VR+ +++ A + RYN S+ + E
Sbjct: 772 VQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEA- 830

Query: 812 RNAGVSLGEILTQVSKNTKEWLVEGANYRFTGEADNAKESNGEFLIALATAFVLIYMILA 871
G S G+ + + +N L G Y +TG + + S + +A +FV++++ LA
Sbjct: 831 -APGTSSGDAMALM-ENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLA 888

Query: 872 ALYESILEPFIIMVTMPLSFSGAFFALGLVHQPLSMFSMIGLILLIGMVGKNATLLIDVA 931
ALYES P +M+ +PL G A L +Q ++ M+GL+ IG+ KNA L+++ A
Sbjct: 889 ALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFA 948

Query: 932 NE-ERKKGLNIQEAILFAGKTRLRPILMTTIAMVCGMLPLALASGDGAAMKSPIGIAMSG 990
+ K+G + EA L A + RLRPILMT++A + G+LPLA+++G G+ ++ +GI + G
Sbjct: 949 KDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMG 1008

Query: 991 GLMISMVLSLLIVPVFYRLL 1010
G++ + +L++ VPVF+ ++
Sbjct: 1009 GMVSATLLAIFFVPVFFVVI 1028


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_0617  VACCYTOTOXIN  34     0.014    Helicobacter pylori vacuolating cytotoxin signature.
		>VACCYTOTOXIN#Helicobacter pylori vacuolating cytotoxin signature.

Length = 1291

Score = 33.9 bits (77), Expect = 0.014
Identities = 16/90 (17%), Positives = 29/90 (32%), Gaps = 3/90 (3%)

Query: 702 SYSFDGINNTFNEDKFNGGSFNFNHAEQTDAFNNNSFNGGSFSFNAKQVDFNHNSFNGGV 761
SYS + E FN + ++A Q +N + G+ + N + G
Sbjct: 272 SYSTINTSKVTGEVNFNHLTVGDHNAAQAGIIASNKTHIGTLDLW-QSAGLNIIAPPEGG 330

Query: 762 FNFNKTSKVSFTGDTFNVNSQFKINGAQTD 791
+ + T N K +Q +
Sbjct: 331 YKDKPND--KPSNTTQNNAKNDKQESSQNN 358
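The summary lines in these alignment reports can be checked mechanically. The following is a minimal sketch, not part of the PredictBias output: `parse_summary` and `floor_percent` are hypothetical helper names, and the observation that the reported percentages match a truncated (not rounded) integer percentage is inferred only from the numbers printed in this report.

```python
import re

# Matches fields such as "Identities = 16/90 (17%)" in a BLAST-style summary line.
FIELD = re.compile(r"(Identities|Positives|Gaps) = (\d+)/(\d+) \((\d+)%\)")

def parse_summary(line):
    """Return {field: (count, length, reported_percent)} for one summary line."""
    return {name: (int(count), int(length), int(pct))
            for name, count, length, pct in FIELD.findall(line)}

def floor_percent(count, length):
    """Integer percentage, truncated rather than rounded (an assumption)."""
    return count * 100 // length

# Summary line copied from the HPP12_0617 hit above.
line = "Identities = 16/90 (17%), Positives = 29/90 (32%), Gaps = 3/90 (3%)"
stats = parse_summary(line)
for name, (count, length, reported) in stats.items():
    print(name, count, length, reported, floor_percent(count, length))
```

For this hit, 16 identical positions over an aligned length of 90 is a weak match, which is consistent with its low bit score and its E-value of 0.014.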


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_0618  VACCYTOTOXIN  304    6e-96    Helicobacter pylori vacuolating cytotoxin signature.
		>VACCYTOTOXIN#Helicobacter pylori vacuolating cytotoxin signature.

Length = 1291

Score = 304 bits (780), Expect = 6e-96
Identities = 105/376 (27%), Positives = 178/376 (47%), Gaps = 13/376 (3%)

Query: 2 FAPYYLQDNPTEHIVTLMKDITSALGMLSKPNLKNNSTDALQLNTYTQQMSRLAKLSNFA 61
+A + I + T+ L ++ K + L L+ SRL LS
Sbjct: 924 YARTMIDATSANEITKQLNTATTTLNNIASLEHKTSGLQTLSLSNAMILNSRLVNLSRRH 983

Query: 62 SFDSTDFSERLSSLKNQRFADATPNAMDVILKYSQRDKLKNNLWATGVGGVSFVGNGTGT 121
+ F++RL +LK+QRFA +A +V+ +++ + + N+WA +GG S G +
Sbjct: 984 TNHIDSFAKRLQALKDQRFAS-LESAAEVLYQFAPKYEKPTNVWANAIGGTSLNSGGNAS 1042

Query: 122 LYGVNVGYDRFIKG---VIVGGYAAYGYSGFYER--ITSSRSNNVDMGLYARAFIKKSEL 176
LYG + G D ++ G IVGG+ +YGYS F + +S +NN + G+Y+R F + E
Sbjct: 1043 LYGTSAGVDAYLNGEVEAIVGGFGSYGYSSFSNQANSLNSGANNTNFGVYSRIFANQHEF 1102

Query: 177 TFSVNETWGANKTQISSNDTLLSMINQSYKYSTWTTNARVNYGYDFMFKNKSIILKPQIG 236
F G++++ ++ LL +NQSY Y ++ R +YGYDF F +++LKP +G
Sbjct: 1103 DFEAQGALGSDQSSLNFKSALLRDLNQSYNYLAYSAATRASYGYDFAFFRNALVLKPSVG 1162

Query: 237 LRYYYIGMTGLEGVMDNALYNQFKANADPSKKSVLTIDLALENRHYFNTNSYFYAIGGVG 296
+ Y ++G T + N+ + S + + +E R+Y+ SYFY GV
Sbjct: 1163 VSYNHLGSTNFKS---NS-NQKVALKNGASSQHLFNASANVEARYYYGDTSYFYMNAGVL 1218

Query: 297 RDLLVRSMGDKLVRFIGNNTLSYRKGELYNTFASITTGGEVRLFKSFYANAGVGARFGLD 356
++ + V + R NT A + GGE++L K + N G L
Sbjct: 1219 QEFANFGSSNA-VSLNTFKVNATRNP--LNTHARVMMGGELKLAKEVFLNLGFVYLHNLI 1275

Query: 357 YKMINITGNIGMRLAF 372
+ + N+GMR +F
Sbjct: 1276 SNIGHFASNLGMRYSF 1291


25  HPP12_0899  HPP12_0906  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias   Product
HPP12_0899  0       14       2.370507  acetate kinase
HPP12_0900  -1      12       2.017923  acetate kinase
HPP12_0901  0       14       2.265351  acetate kinase
HPP12_0902  0       15       1.556381  phosphotransacetylase
HPP12_0903  1       14       0.704537  phosphotransacetylase
HPP12_0904  0       16       0.527476  hypothetical protein
HPP12_0905  -2      15       1.295153  flagellar hook assembly protein
HPP12_0906  -1      14       1.966498  flagellar hook protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_0899  ACETATEKNASE  126    4e-38    Acetate kinase family signature.
		>ACETATEKNASE#Acetate kinase family signature.

Length = 400

Score = 126 bits (317), Expect = 4e-38
Identities = 52/117 (44%), Positives = 73/117 (62%), Gaps = 2/117 (1%)

Query: 1 MRNIEARK-EKGDKEAKLAFEMCAYRIKKYIGAYMVALGRVDAIIFTGGMGENYSALRES 59
R++E + GDK A+LA + AYR+KK IG+Y A+G VD I+FT G+GEN +RE
Sbjct: 283 FRDLEDAAFKNGDKRAQLALNVFAYRVKKTIGSYAAAMGGVDVIVFTAGIGENGPEIREF 342

Query: 60 VCEGLENLGIALCKPTNDNPGNGLVDLSQPDTKVQVLRIPTDEELEIALQTKKVLEK 116
+ +GLE LG L K N G +S D+KV V+ +PT+EE IA T+K++E
Sbjct: 343 ILDGLEFLGFKLDKEKNKVRGEE-AIISTADSKVNVMVVPTNEEYMIAKDTEKIVES 398


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_0900  ACETATEKNASE  264    3e-90    Acetate kinase family signature.
		>ACETATEKNASE#Acetate kinase family signature.

Length = 400

Score = 264 bits (676), Expect = 3e-90
Identities = 102/185 (55%), Positives = 133/185 (71%)

Query: 5 GGDKFHAPVLVNELVMQEIGNLSILAPLHNPANLAGIEFVQKAHPHIPQIAVFDTAFHAT 64
GG+ F + VL+ + V++ I + LAPLHNPAN+ GI+ + P +P +AVFDTAFH T
Sbjct: 94 GGEYFTSSVLITDDVLKAITDCIELAPLHNPANIEGIKACTQIMPDVPMVAVFDTAFHQT 153

Query: 65 MPSYAYMYALPYELYEKYQIRRYGFHGTSHHYVAKEAAKFLNIAYEEFNAISLHLGNGSN 124
MP YAY+Y +PYE Y KY+IR+YGFHGTSH YV++ AA+ LN E I+ HLGNGS+
Sbjct: 154 MPDYAYLYPIPYEYYTKYKIRKYGFHGTSHKYVSQRAAEILNKPIESLKIITCHLGNGSS 213

Query: 125 AAAIQKGKSVDTSMGLTPLEGLIMGTRCGDIDPTVVEYTAQCADKRLEEVVKILNYESGL 184
AA++ GKS+DTSMG TPLEGL MGTR G IDP+++ Y + + EEVV ILN +SG+
Sbjct: 214 IAAVKNGKSIDTSMGFTPLEGLAMGTRSGSIDPSIISYLMEKENISAEEVVNILNKKSGV 273

Query: 185 KGICG 189
GI G
Sbjct: 274 YGISG 278


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_0901  ACETATEKNASE  96     3e-27    Acetate kinase family signature.
		>ACETATEKNASE#Acetate kinase family signature.

Length = 400

Score = 96.0 bits (239), Expect = 3e-27
Identities = 42/99 (42%), Positives = 62/99 (62%), Gaps = 6/99 (6%)

Query: 1 MEILVLNLGSSSIKFKLFDMKENKPLASGLAEKIGEEIGQLKIKSHLHHNDQELKEKLVI 60
M+ILV+N GSSS+K++L + K+ LA GLAE+IG L N +++K K +
Sbjct: 1 MKILVINCGSSSLKYQLIESKDGNVLAKGLAERIGINDSLLTHN----ANGEKIKIKKDM 56

Query: 61 KDHASGLLMIRENLT--KMGIIKDFNQIDAIGHRVVQGG 97
KDH + ++ + L G+IKD ++IDA+GHRVV GG
Sbjct: 57 KDHKDAIKLVLDALVNSDYGVIKDMSEIDAVGHRVVHGG 95


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
HPP12_0904  IGASERPTASE  53     3e-09    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 52.8 bits (126), Expect = 3e-09
Identities = 48/230 (20%), Positives = 82/230 (35%), Gaps = 9/230 (3%)

Query: 270 KRDKTLSKKKSKKTPTKAQTTAPSIAPENAPKIPLKTPPLMPLIGANPPNDNPPTPLEKE 329
KR++T+ + TP Q PS+ N + P+ P P TP E
Sbjct: 987 KRNQTV-DTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPA--------PATPSETT 1037

Query: 330 ETTKEASDNKEKTKEANNSAQSAQNAQASDKTSENKSVTPKETIKHFTQQLKQEIQEYKP 389
ET E S + KT E N + AQ + E KS T + Q E +E +
Sbjct: 1038 ETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQT 1097

Query: 390 PMSRISMDLFPKELGKVEVIIQKVGKNLKVSVISHNNSLQTFLDNQQDLKNNLNALGFEG 449
++ + + +E KVE + + V +T + + N + +
Sbjct: 1098 TETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKE 1157

Query: 450 VDLSFSQDSSKEQPKEQLRELFKEQESSPLKENALKSYQENTDNEHKETS 499
+ + EQP ++ ++ + N S EN +N T+
Sbjct: 1158 PQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATT 1207


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
HPP12_0906  FLGHOOKAP1  35     7e-04    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 35.3 bits (81), Expect = 7e-04
Identities = 12/33 (36%), Positives = 20/33 (60%)

Query: 2 NDTLLNAYSGIKTHQFGIDSLSNNIANVNTLGY 34
+ + NA SG+ Q +++ SNNI++ N GY
Sbjct: 1 SSLINNAMSGLNAAQAALNTASNNISSYNVAGY 33



Score = 33.0 bits (75), Expect = 0.004
Identities = 10/48 (20%), Positives = 20/48 (41%)

Query: 557 IRHKYLETSNVNAGNALTNLILMQRGYSMNARAFGAGDDMIKEAISLK 604
+ ++ S VN NL Q+ Y NA+ + + I+++
Sbjct: 499 LSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINIR 546


26  HPP12_1549  HPP12_1556  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPP12_1549  -2      13       2.614999   flagellar hook-basal body protein FliE
HPP12_1550  -2      13       2.512722   flagellar basal-body rod protein FlgC
HPP12_1551  0       13       1.993712   flagellar basal-body rod protein FlgB
HPP12_1552  0       14       1.466046   cell division protein FtsW
HPP12_1553  0       16       0.380164   iron(III) ABC transporter periplasmic
HPP12_1554  0       15       0.463387   alkyl hydroperoxide reductase
HPP12_1555  0       13       0.029205   outer membrane lipoprotein
HPP12_1556  0       13       -0.086406  penicillin-binding protein 2
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
HPP12_1549  FLGHOOKFLIE  77     6e-22    Flagellar hook-basal body complex protein FliE signa...
		>FLGHOOKFLIE#Flagellar hook-basal body complex protein FliE signature.

Length = 103

Score = 77.0 bits (189), Expect = 6e-22
Identities = 19/77 (24%), Positives = 40/77 (51%), Gaps = 1/77 (1%)

Query: 34 EQKGGEFSKLLKQSINELNNTQEQSDKALADMATGQIK-DLHQAAIAIGKAETSMKLMLE 92
Q F+ L +++ +++TQ + G+ L+ + KA SM++ ++
Sbjct: 27 PQPTISFAGQLHAALDRISDTQTAARTQAEKFTLGEPGVALNDVMTDMQKASVSMQMGIQ 86

Query: 93 VRNKAISAYKELLRTQI 109
VRNK ++AY+E++ Q+
Sbjct: 87 VRNKLVAAYQEVMSMQV 103


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
HPP12_1550  FLGHOOKAP1  28     0.013    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 28.4 bits (63), Expect = 0.013
Identities = 10/38 (26%), Positives = 15/38 (39%)

Query: 121 NVNAVVEMADLVEATRAYQANVAAFQSAKNMAQNAIGM 158
VN E +L + Y AN Q+A + I +
Sbjct: 508 GVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_1553  FERRIBNDNGPP  34     8e-04    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 33.8 bits (77), Expect = 8e-04
Identities = 29/184 (15%), Positives = 75/184 (40%), Gaps = 12/184 (6%)

Query: 106 NVELLKKLSPDLVVTFVGNPKAVEHAKKF--GILFLSFQEKTIAEVMEDID---AQAKAL 160
N+ELL ++ P +V G + E + G F K + A L
Sbjct: 88 NLELLTEMKPSFMVWSAGYGPSPEMLARIAPGRGFNFSDGKQPLAMARKSLTEMADLLNL 147

Query: 161 EIDASKKLAKMQETLDFIAERLKGVKKKKGVELFHKAN----KISGHQALDSDILEKGGI 216
+ A LA+ ++ + + R + + + L + + G +L +IL++ GI
Sbjct: 148 QSAAETHLAQYEDFIRSMKPRFVK-RGARPLLLTTLIDPRHMLVFGPNSLFQEILDEYGI 206

Query: 217 DN-FGLKYVKFGRADISVEKIVK-ENPEIIFIWWISPLSPEDVLNNPKFSTIKAIKNKQV 274
N + + +G +S++++ ++ +++ + + ++ P + + ++ +
Sbjct: 207 PNAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDHDNSKDMDALMATPLWQAMPFVRAGRF 266

Query: 275 YKLP 278
++P
Sbjct: 267 QRVP 270


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPP12_1556  TYPE3IMPPROT  29     0.030    Type III secretion system inner membrane P protein ...
		>TYPE3IMPPROT#Type III secretion system inner membrane P protein family signature.

Length = 224

Score = 29.4 bits (66), Expect = 0.030
Identities = 9/23 (39%), Positives = 12/23 (52%)

Query: 4 LRYKLLLFVFIGFWGLLVLNLFI 26
KL+LFV + W LL L +
Sbjct: 195 TPIKLVLFVALDGWTLLSKGLIL 217
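The virulence hits for island 26 can be compared against the "E-value (RPSBlast)" input parameter of 0.05 listed at the top of this report. This is a minimal sketch under stated assumptions: the tuples are copied from the hit tables above, `passing` is a hypothetical helper, and the rule "keep a hit when its E-value is at or below the threshold" is assumed, since the report does not state how the cutoff is applied.

```python
E_VALUE_THRESHOLD = 0.05  # "E-value (RPSBlast)" input parameter from this report

# (locus tag, signature hit, E-value) copied from the island 26 hit tables.
hits = [
    ("HPP12_1549", "FLGHOOKFLIE", 6e-22),
    ("HPP12_1550", "FLGHOOKAP1", 0.013),
    ("HPP12_1553", "FERRIBNDNGPP", 8e-04),
    ("HPP12_1556", "TYPE3IMPPROT", 0.030),
]

def passing(hits, threshold):
    """Keep hits whose E-value is at or below the threshold (assumed rule)."""
    return [h for h in hits if h[2] <= threshold]

kept = passing(hits, E_VALUE_THRESHOLD)
strongest = min(hits, key=lambda h: h[2])  # smallest E-value = most significant

print(len(kept), "of", len(hits), "hits pass at", E_VALUE_THRESHOLD)
print("strongest hit:", strongest[0], strongest[1], strongest[2])
```

Under that assumed rule all four hits pass at 0.05, while a stricter cutoff of 0.01 would leave only the FliE and ferrichrome-binding signatures.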



 
Contact Sachin Pundhir for Bugs/Comments.