PredictBias

Identification of genomic and pathogenicity islands in prokaryotic genomes
 
A) Input parameters
Genome: HPAG1.gbk
Threshold dinucleotide bias: 2
Threshold codon bias: 4
Threshold %GC bias: 3
E-value (RPSBlast): 0.05
Genome (non-pathogenic):
 
 
C) Potential GIs and PAIs in CP000241
S.No  Start       End         Bias  Virulence  Insertion elements  Prediction
1     HPAG1_0045  HPAG1_0089  Y     Y          Y                   Pathogenicity Island (biased-composition)

LocusTag    DNBias  CDNBias  %GCBias    Product
HPAG1_0045  0       17       -3.171482  hypothetical protein
HPAG1_0046  1       20       -4.097823  adenine specific DNA methyltransferase
HPAG1_0047  0       16       -2.428708  cytosine specific DNA methyltransferase
HPAG1_0048  -2      10       -0.989112  type II DNA modification enzyme
HPAG1_0049  -1      10       0.019209   type II restriction enzyme
HPAG1_0050  -1      10       0.470635   type II restriction enzyme
HPAG1_0051  0       10       0.445328   adenine/cytosine DNA methyltransferase
HPAG1_0052  1       11       0.511581   sodium/proline symporter
HPAG1_0053  2       12       -0.229947  proline/delta 1-pyrroline-5-carboxylate
HPAG1_0054  4       16       -1.057431  hypothetical protein
HPAG1_0055  4       16       -1.010061  M-protein
HPAG1_0056  3       16       -0.507909  hypothetical protein
HPAG1_0057  3       16       0.171585   hypothetical protein
HPAG1_0058  3       13       -0.244487  hypothetical protein
HPAG1_0059  3       13       -0.258247  hypothetical protein
HPAG1_0060  4       15       -0.999919  hypothetical protein
HPAG1_0061  3       15       -1.096912  hypothetical protein
HPAG1_0062  3       12       -0.289841  hypothetical protein
HPAG1_0063  2       11       0.281676   hypothetical protein
HPAG1_0064  2       14       1.425571   hypothetical protein
HPAG1_0065  1       13       1.723919   hypothetical protein
HPAG1_0066  0       15       2.078002   conserved ATP-binding protein
HPAG1_0067  2       14       2.694061   conserved ATP-binding protein
HPAG1_0068  4       20       3.563625   urease accessory protein UreH
HPAG1_0069  4       22       3.197456   urease accessory protein UreG
HPAG1_0070  4       20       2.360885   urease accessory protein UreF
HPAG1_0071  3       17       2.268194   urease accessory protein UreE
HPAG1_0072  4       19       2.191073   urease accessory protein/pH-dependent
HPAG1_0073  2       16       2.272351   urease B
HPAG1_0074  -2      10       1.445207   urease A
HPAG1_0075  1       12       1.965409   *lipoprotein signal peptidase
HPAG1_0076  2       13       2.378807   phosphoglucosamine mutase
HPAG1_0077  3       15       2.451645   ribosomal protein S20
HPAG1_0078  3       13       1.680789   peptide chain release factor RF-1
HPAG1_0079  4       15       1.489136   reminiscent outer membrane protein HorA
HPAG1_0080  3       14       1.532055   outer membrane protein HorA
HPAG1_0081  3       15       0.736244   hypothetical protein
HPAG1_0082  -1      15       0.344609   hypothetical protein
HPAG1_0083  -2      14       0.427814   methyl-accepting chemotaxis protein
HPAG1_0084  0       13       0.291983   ribosomal protein S9
HPAG1_0085  1       11       0.431197   ribosomal protein L13
HPAG1_0086  2       11       0.711511   hypothetical protein
HPAG1_0087  1       9        0.299438   malate:quinone oxidoreductase
HPAG1_0088  1       10       0.061550   hypothetical protein
HPAG1_0089  2       11       0.065444   RNA polymerase sigma-80 factor
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
HPAG1_0057  GPOSANCHOR  32     0.002    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 32.0 bits (72), Expect = 0.002
Identities = 34/195 (17%), Positives = 63/195 (32%), Gaps = 2/195 (1%)

Query: 55 KENEKISGLENANDQLWQAKDKLTKENTELTHKNAALTEKTAELKTENDKLNHLVIALNN 114
K + + L+ N L L N ELT + + EK + + + L
Sbjct: 61 KFEIENNTLKLKNSDLSFNNKALKDHNDELTEELSNAKEKLRKNDKSLSEKASKIQELEA 120

Query: 115 EQGSLKQERAQLQDERGFLEELCTNLEKENQHLTEKLKKLESAQKNLENTNNQLRQAKEK 174
+ L++ + LE E L + LE A + N + +
Sbjct: 121 RKADLEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKT 180

Query: 175 IAEEKTELERKIARLKSLESMEAKSELDLHNRR--LASANQDLKRQNRKLEEENIALKER 232
+ EK LE + A L+ + L + L + LE+
Sbjct: 181 LEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNF 240

Query: 233 AYGLNEQLFTLQPQK 247
+ + ++ TL+ +K
Sbjct: 241 STADSAKIKTLEAEK 255



Score = 32.0 bits (72), Expect = 0.002
Identities = 40/238 (16%), Positives = 65/238 (27%), Gaps = 13/238 (5%)

Query: 16 EELEARIGELENENAELFTTKEKLTKENTELAYKNNKLFKENE-----------KISGLE 64
+ L+ EL E + K K +E A K +L +
Sbjct: 81 KALKDHNDELTEELSNAKEKLRKNDKSLSEKASKIQELEARKADLEKALEGAMNFSTADS 140

Query: 65 NANDQLWQAKDKLTKENTELTHKNAALTEKTAELKTENDKLNHLVIALNNEQGSLKQERA 124
L K L +L + + L AL Q L++
Sbjct: 141 AKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALE 200

Query: 125 QLQDERGFLEELCTNLEKENQHLTEKLKKLESAQKNLENTNNQLRQAKEKIAEEKTELER 184
+ LE E L + LE A + N + + + EK LE
Sbjct: 201 GAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEA 260

Query: 185 KIARLKSLESMEAKSELDLHN--RRLASANQDLKRQNRKLEEENIALKERAYGLNEQL 240
+ A L+ + L + L+ + LE ++ L L L
Sbjct: 261 RQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDL 318



Score = 31.6 bits (71), Expect = 0.003
Identities = 45/232 (19%), Positives = 76/232 (32%), Gaps = 6/232 (2%)

Query: 15 REELEARIGELENENAELFTTKEKLTKENTELAYKNNKLFKENEKISGLENANDQLWQA- 73
+ +LE + N + + L E L + +L K E A+ +
Sbjct: 157 KADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTL 216

Query: 74 ---KDKLTKENTELTHKNAALTEKTAELKTENDKLNHLVIALNNEQGSLKQERAQLQDER 130
K L +L + + L AL Q L++ +
Sbjct: 217 EAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFS 276

Query: 131 GFLEELCTNLEKENQHLTEKLKKLESAQKNLENTNNQLRQAKEKIAEEKTELERKIARLK 190
LE E L + LE + L LR+ + E K +LE + +L+
Sbjct: 277 TADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEAEHQKLE 336

Query: 191 SLESMEAKSELDLHNRRLAS--ANQDLKRQNRKLEEENIALKERAYGLNEQL 240
+ S L AS A + L+ +++KLEE+N + L L
Sbjct: 337 EQNKISEASRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRDL 388



Score = 30.8 bits (69), Expect = 0.006
Identities = 38/210 (18%), Positives = 70/210 (33%), Gaps = 4/210 (1%)

Query: 15 REELEARIGELENENAELFTTKEKLTKENTELAYKNNKLFKENEKISGLENANDQLWQAK 74
+ +LE + N + + L E LA + L K E G N +
Sbjct: 122 KADLEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALE---GAMNFSTADSAKI 178

Query: 75 DKLTKENTELTHKNAALTEKTAELKTENDKLNHLVIALNNEQGSLKQERAQLQDERGFLE 134
L E L + A L + + + + L E+ +L +A L+
Sbjct: 179 KTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAM 238

Query: 135 ELCTNLEKENQHLTEKLKKLESAQKNLENTNNQLRQAKEKIAEEKTELERKIARLKSLES 194
T + + L + LE+ Q LE + + LE + A L++ E
Sbjct: 239 NFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEA-EK 297

Query: 195 MEAKSELDLHNRRLASANQDLKRQNRKLEE 224
+ + + + N S +DL ++
Sbjct: 298 ADLEHQSQVLNANRQSLRRDLDASREAKKQ 327
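
The "Score = ..." and "Identities = ..." lines that precede each alignment block in this report follow the usual BLAST text layout, so they can be read programmatically. What follows is a minimal Python sketch under stated assumptions, not part of PredictBias: the helper names parse_expect and parse_hsp are hypothetical, the regular expressions assume the two summary lines look exactly like those printed in this report, and a bare exponent such as "Expect = e-105" is assumed to mean 1e-105.

```python
import re

# Hypothetical helpers for illustration; they are not part of PredictBias or BLAST.
SCORE_RE = re.compile(
    r"Score\s*=\s*([\d.]+)\s*bits\s*\((\d+)\),\s*Expect\s*=\s*(\S+)"
)
FRACTION_RE = re.compile(r"(Identities|Positives|Gaps)\s*=\s*(\d+)/(\d+)")


def parse_expect(text):
    """Convert an Expect string to a float.

    Assumption: a bare exponent such as 'e-105' stands for 1e-105.
    """
    if text.startswith("e"):
        text = "1" + text
    return float(text)


def parse_hsp(score_line, identities_line):
    """Parse the two summary lines that precede each alignment block."""
    match = SCORE_RE.search(score_line)
    if match is None:
        raise ValueError("not a Score/Expect line: %r" % score_line)
    hsp = {
        "bits": float(match.group(1)),
        "raw_score": int(match.group(2)),
        "expect": parse_expect(match.group(3)),
    }
    # The Gaps field may be absent, so only the fields actually present are stored.
    for name, count, length in FRACTION_RE.findall(identities_line):
        hsp[name.lower()] = (int(count), int(length))
    return hsp


hsp = parse_hsp(
    "Score = 32.0 bits (72), Expect = 0.002",
    "Identities = 34/195 (17%), Positives = 63/195 (32%), Gaps = 2/195 (1%)",
)
count, length = hsp["identities"]
# Recompute the identity percentage from the raw counts.
print(hsp["expect"], round(100 * count / length))
```

Storing the counts as (matches, alignment length) pairs rather than the printed percentages keeps the exact values, so hits can be re-filtered later, for example against the E-value threshold listed under the input parameters.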


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
HPAG1_0073  UREASE  1041   0.0      Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 1041 bits (2694), Expect = 0.0
Identities = 352/569 (61%), Positives = 441/569 (77%), Gaps = 4/569 (0%)

Query: 3 KISRKEYASMYGPTTGDKVRLGDTDLIAEVEHDYTIYGEELKFGGGKTLREGMSQSN-NP 61
++SR YA+M+GPT GDKVRL DT+L EVE D+T +GEE+KFGGGK +R+GM QS
Sbjct: 4 RMSRAAYANMFGPTVGDKVRLADTELFIEVEKDFTTHGEEVKFGGGKVIRDGMGQSQVTR 63

Query: 62 SKEELDLIITNALIVDYTGIYKADIGIKDGKIAGIGKGGNKDMQDGVKNNLSVGPATEAL 121
+D +ITNALI+D+ GI KADIG+KDG+IA IGK GN DMQ GV + VGP TE +
Sbjct: 64 EGGAVDTVITNALILDHWGIVKADIGLKDGRIAAIGKAGNPDMQPGV--TIIVGPGTEVI 121

Query: 122 AGEGLIVTAGGIDTHIHFISPQQIPTAFASGVTTMIGGGTGPADGTNATTITPGRRNLKF 181
AGEG IVTAGG+D+HIHFI PQQI A SG+T M+GGGTGPA GT ATT TPG ++
Sbjct: 122 AGEGKIVTAGGMDSHIHFICPQQIEEALMSGLTCMLGGGTGPAHGTLATTCTPGPWHIAR 181

Query: 182 MLRAAEEYSMNFGFLAKGNVSNDASLADQIEAGAIGFKIHEDWGTTPSAINHALDVADKY 241
M+ AA+ + MN F KGN S +L + + GA K+HEDWGTTP+AI+ L VAD+Y
Sbjct: 182 MIEAADAFPMNLAFAGKGNASLPGALVEMVLGGATSLKLHEDWGTTPAAIDCCLSVADEY 241

Query: 242 DVQVAIHTDTLNEAGCVEDTMAAIAGRTMHTFHTEGAGGGHAPDIIKVAGEHNILPASTN 301
DVQV IHTDTLNE+G VEDT+AAI GRT+H +HTEGAGGGHAPDII++ G+ N++P+STN
Sbjct: 242 DVQVMIHTDTLNESGFVEDTIAAIKGRTIHAYHTEGAGGGHAPDIIRICGQPNVIPSSTN 301

Query: 302 PTIPFTVNTEAEHMDMLMVCHHLDKSIKEDVQFADSRIRPQTIAAEDTLHDMGIFSITSS 361
PT P+TVNT AEH+DMLMVCHHL +I ED+ FA+SRIR +TIAAED LHD+G FSI SS
Sbjct: 302 PTRPYTVNTLAEHLDMLMVCHHLSPTIPEDIAFAESRIRKETIAAEDILHDIGAFSIISS 361

Query: 362 DSQAMGRVGEVITRTWQTADKNKKEFGRLKEEKGDNDNFRIKRYLSKYTINPAIAHGISE 421
DSQAMGRVGEV RTWQTADK K++ GRLKEE GDNDNFR+KRY++KYTINPAIAHG+S
Sbjct: 362 DSQAMGRVGEVAIRTWQTADKMKRQRGRLKEETGDNDNFRVKRYIAKYTINPAIAHGLSH 421

Query: 422 YVGSVEVGKVADLVLWSPAFFGVKPNMIIKGGFIALSQMGDANASIPTPQPVYYREMFAH 481
+GS+EVGK ADLVLW+PAFFGVKP+M++ GG IA + MGD NASIPTPQPV+YR MF
Sbjct: 422 EIGSLEVGKRADLVLWNPAFFGVKPDMVLLGGTIAAAPMGDPNASIPTPQPVHYRPMFGA 481

Query: 482 HGKAKYDANITFVSQAAYDKGIKEELGLERQVLPVKNCR-NITKKDMQFNDTTAHIEVNS 540
+G+++ ++++TFVSQA+ D G+ LG+ ++++ V+N R I K M N T HIEV+
Sbjct: 482 YGRSRTNSSVTFVSQASLDAGLAGRLGVAKELVAVQNTRGGIGKASMIHNSLTPHIEVDP 541

Query: 541 ETYHVFVDGKEVTSKPANKVSLAQLFSIF 569
ETY V DG+ +T +PA + +AQ + +F
Sbjct: 542 ETYEVRADGELLTCEPATVLPMAQRYFLF 570


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPAG1_0080  CHANLCOLICIN  37     1e-04    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 37.4 bits (86), Expect = 1e-04
Identities = 48/318 (15%), Positives = 108/318 (33%), Gaps = 23/318 (7%)

Query: 42 GNYAEKDKDSKLTSDSLTQQQAQTQAQNTASSDSKEATTLENTAATDSQTATTDETYTKS 101
G+ +E T+ T Q +TQA+ A + + + A D+ T + ++
Sbjct: 42 GSKSESSAAIHATAKWSTAQLKKTQAEQAARAKAAAEAQAKAKANRDALTQRLKDIVNEA 101

Query: 102 TDATVADAAKQVE---TDNTAVQNDEKTLK----TDVAKVQADASTKDFDETTFTKDQAA 154
+ E +N A+Q +++ L+ + A+ +A+A+ K F E + +
Sbjct: 102 LRHNASRTPSATELAHANNAAMQAEDERLRLAKAEEKARKEAEAAEKAFQEAEQRRKEIE 161

Query: 155 EQTAETNLQ-NAEEQLTNDQNALNTALKDQTPSTPSTPPATSGGASGSTVASQLTKDTTM 213
+ AET Q E AL+ K + A S L +
Sbjct: 162 REKAETERQLKLAEAEEKRLAALSEEAKAVEIAQKKLSAAQSEVVKMDGEIKTLNSRLSS 221

Query: 214 VNNLKSVSVSAMNTTLSGVTQLSQQTATIGNLLNSSTDLSS-------VIPNAQGLSSAF 266
+ + + + + + Q S + + L+ + ++ + A
Sbjct: 222 SIHARDAEMKTLAGKRNELAQASAKYKELDELVKKLSPRANDPLQNRPFFEATRRRVGAG 281

Query: 267 SALESAQNTLKGYLNSSSATIGQLTNGSNAVVGALDKAINQVDMALADLSATDTQKTQAV 326
E Q + +S I ++ + A+ + N + +A + + + +
Sbjct: 282 KIREEKQKQVT----ASETRINRINADITQIQKAISQVSNNRNAGIARVH----EAEENL 333

Query: 327 ALAAASDSATTTTDAINF 344
A + + DA++
Sbjct: 334 KKAQNNLLNSQIKDAVDA 351


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
HPAG1_0081  TACYTOLYSIN  32     0.008    Bacterial thiol-activated pore-forming cytolysin sig...
		>TACYTOLYSIN#Bacterial thiol-activated pore-forming cytolysin

signature.
Length = 574

Score = 31.9 bits (72), Expect = 0.008
Identities = 20/52 (38%), Positives = 25/52 (48%), Gaps = 3/52 (5%)

Query: 490 GNNTGDTGDMNNTDTGNTDTGNTDDMSNMNSG---NDDAGNANDDMGNSNDM 538
GN D N +T NT+T T++ S + AG DDM NSNDM
Sbjct: 27 GNLVTANADSNKQNTANTETTTTNEQPKPESSELTTEKAGQKMDDMLNSNDM 78


2     HPAG1_0101  HPAG1_0106  Y     Y          N                   Pathogenicity Island (biased-composition)

LocusTag    DNBias  CDNBias  %GCBias    Product
HPAG1_0101  2       10       1.374044   hypothetical protein
HPAG1_0102  2       11       2.774810   glycosyl transferase
HPAG1_0103  2       11       3.031454   methyl-accepting chemotaxis protein
HPAG1_0104  1       11       3.488222   2',3'-cyclic-nucleotide 2'-phosphodiesterase
HPAG1_0105  -2      11       4.375135   autoinducer-2 synthase
HPAG1_0106  -2      12       3.587536   cystathionine gamma-synthase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
HPAG1_0103  OMS28PORIN  29     0.037    OMS28 porin signature.
		>OMS28PORIN#OMS28 porin signature.

Length = 257

Score = 29.4 bits (65), Expect = 0.037
Identities = 35/163 (21%), Positives = 76/163 (46%), Gaps = 11/163 (6%)

Query: 376 HTEEELSSKVEQLSRNADDVKSILDIINDIADQTNLLALNAAIEAARAGEHGRGFAVVAD 435
H++++ + K++Q D V LD IN + + +++ +E R + A
Sbjct: 41 HSDQKDNKKLDQ----KDQVNQALDTINKVTED-----VSSKLEGVRESSLELVESNDAG 91

Query: 436 EVRNLAGRTQKSLAEINSTIMVIVQEINDVSSQMNLNSQKMERLSDMSK-SVQETYEKMS 494
V+ G + ++++ +V QE V+ + ++ ++ +MSK +VQET + +S
Sbjct: 92 VVKKFVG-SMSLMSDVAKGTVVASQEATIVAKCSGMVAEGANKVVEMSKKAVQETQKAVS 150

Query: 495 SNLSSVVSDSNQTMDDYAKSGHQIEAMVSDFAEVEKVASKTLA 537
+ Q M + + + ++E +FA+VE+V +A
Sbjct: 151 VAGEATFLIEKQIMLNKSPNNKELELTKEEFAKVEQVKETLMA 193


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
HPAG1_0105  LUXSPROTEIN  223    4e-78    Bacterial autoinducer-2 (AI-2) production protein Lu...
		>LUXSPROTEIN#Bacterial autoinducer-2 (AI-2) production protein LuxS

signature.
Length = 171

Score = 223 bits (570), Expect = 4e-78
Identities = 57/145 (39%), Positives = 91/145 (62%), Gaps = 7/145 (4%)

Query: 8 VESFNLDHTKVKAPYVRVADRKKGANGDVIVKYDVRFKQPNQDHMDMPSLHSLEHLVAEI 67
++SF +DHT++ AP VRVA + GD I +D+RF PN+D + +H+LEHL A
Sbjct: 3 LDSFTVDHTRMNAPAVRVAKTMQTPKGDTITVFDLRFTAPNKDILSEKGIHTLEHLYAGF 62

Query: 68 IRNHA----SYVVDWSPMGCQTGFYLTVLNHDNYTEILEVLEKTMQDVLKAK---EVPAS 120
+RNH ++D SPMGC+TGFY++++ + ++ + M+DVLK + ++P
Sbjct: 63 MRNHLNGDSVEIIDISPMGCRTGFYMSLIGTPSEQQVADAWIAAMEDVLKVENQNKIPEL 122

Query: 121 NEKQCGWAANHTLEGAQNLARTFLD 145
NE QCG AA H+L+ A+ +A+ L+
Sbjct: 123 NEYQCGTAAMHSLDEAKQIAKNILE 147


3     HPAG1_0140  HPAG1_0145  Y     N          N                   Genomic Island

LocusTag    DNBias  CDNBias  %GCBias    Product
HPAG1_0140  3       11       1.365624   A/G-specific adenine glycosylase
HPAG1_0141  4       12       2.007289   C(4)-dicarboxylates and
HPAG1_0142  2       12       1.210495   cytochrome c oxidase heme b and copper-binding
HPAG1_0143  2       16       0.006830   cytochrome c oxidase monoheme subunit
HPAG1_0144  3       13       -1.543669  cytochrome c oxidase subunit Q
HPAG1_0145  2       13       -1.189069  cytochrome c oxidase diheme subunit
4     HPAG1_0185  HPAG1_0202  Y     Y          Y                   Pathogenicity Island (biased-composition)

LocusTag    DNBias  CDNBias  %GCBias    Product
HPAG1_0185  0       12       3.063997   fumarate reductase, iron-sulfur subunit
HPAG1_0186  0       11       3.309306   fumarate reductase, flavoprotein subunit
HPAG1_0187  -1      13       1.982240   fumarate reductase, cytochrome b subunit
HPAG1_0188  -1      14       2.014840   triosephosphate isomerase
HPAG1_0189  -1      15       3.340195   enoyl-(acyl-carrier-protein) reductase
HPAG1_0190  -1      14       3.427466   UDP-3-0-(3-hydroxymyristoyl) glucosamine
HPAG1_0191  -2      15       3.495143   S-adenosylmethionine synthetase
HPAG1_0192  -2      17       2.470465   nucleoside diphosphate kinase
HPAG1_0193  -2      19       2.061444   hypothetical protein
HPAG1_0194  -1      11       -1.011679  ribosomal protein L32
HPAG1_0195  0       12       -2.366705  fatty acid/phospholipid synthesis protein
HPAG1_0196  -1      14       -3.283876  beta-ketoacyl-acyl carrier protein synthase III
HPAG1_0197  2       19       -5.007256  hypothetical protein
HPAG1_0198  3       20       -5.281769  hypothetical protein
HPAG1_0199  4       22       -5.642611  hypothetical protein
HPAG1_0200  5       25       -5.602446  hypothetical protein
HPAG1_0201  3       24       -4.433714  putative transposase OrfB
HPAG1_0202  1       17       -3.297643  putative transposase OrfA
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPAG1_0189  DHBDHDRGNASE  60     6e-13    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase

signature.
Length = 261

Score = 60.4 bits (146), Expect = 6e-13
Identities = 61/263 (23%), Positives = 109/263 (41%), Gaps = 29/263 (11%)

Query: 4 LKGKKGLIVGVANNKSIAYGIAQSCFNQGATL-AFTYLNESLEKRVRPIAQELNSPYVYE 62
++GK I G A + I +A++ +QGA + A Y E LEK V + E +
Sbjct: 6 IEGKIAFITGAA--QGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFP 63

Query: 63 LDVSKEEHFKVLYNSVKKDLGSLDFIVHSVAF--------APKEALEGSLLETSKSAFNT 114
DV + +++++G +D +V+ E E + S FN
Sbjct: 64 ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNA 123

Query: 115 AMEISVYSLIELTNTLKPLLNNGASVLTLSYLGSTKYMAHYNVMGLAKAALESAVRYLAV 174
+ +S Y + + ++ + +N A V S MA Y +KAA + L +
Sbjct: 124 SRSVSKYMMDRRSGSIVTVGSNPAGVPRTS-------MAAY---ASSKAAAVMFTKCLGL 173

Query: 175 DLGKHNIRVNALSAGPIRT-----LASSGIADFRMILKWNE---INAPLRKNVSLEEVGN 226
+L ++NIR N +S G T L + ++I E PL+K ++ +
Sbjct: 174 ELAEYNIRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIAD 233

Query: 227 AGMYLLSSLSNGVSGEVHFVDAG 249
A ++L+S + ++ VD G
Sbjct: 234 AVLFLVSGQAGHITMHNLCVDGG 256


5     HPAG1_0298  HPAG1_0311  Y     N          N                   Genomic Island

LocusTag    DNBias  CDNBias  %GCBias    Product
HPAG1_0298  1       14       3.640130   ribosomal protein L21
HPAG1_0299  1       14       3.595948   ribosomal protein L27
HPAG1_0300  0       13       3.730222   periplasmic dipeptide-binding protein
HPAG1_0301  0       14       4.020016   dipeptide permease protein
HPAG1_0302  -1      14       3.204600   dipeptide permease protein
HPAG1_0303  -3      13       2.833224   dipeptide ABC transporter
HPAG1_0304  -2      13       2.581376   dipeptide ABC transporter
HPAG1_0305  -1      12       2.105180   GTP-binding protein
HPAG1_0306  -1      13       1.407436   hypothetical protein
HPAG1_0307  0       16       2.036305   hypothetical protein
HPAG1_0308  0       16       0.912094   glutamate-1-semialdehyde 2,1-aminomutase
HPAG1_0309  1       17       -1.275830  hypothetical protein
HPAG1_0310  2       16       -0.869619  hypothetical protein
HPAG1_0311  3       16       -0.117898  putative N-carbamoyl-D-amino acid
6     HPAG1_0358  HPAG1_0368  Y     Y          N                   Pathogenicity Island (biased-composition)

LocusTag    DNBias  CDNBias  %GCBias    Product
HPAG1_0358  2       14       -0.502025  hypothetical protein
HPAG1_0359  0       13       1.585204   transketolase A
HPAG1_0360  -1      12       -0.060834  riboflavin kinase
HPAG1_0361  -1      11       0.022459   hemolysin
HPAG1_0362  -1      11       0.264449   hypothetical protein
HPAG1_0363  -2      10       -1.233891  aspartate transcarbamoylase
HPAG1_0364  -2      12       -3.215098  outer membrane protein HofB
HPAG1_0365  -2      8        -3.779499  multidrug resistance protein
HPAG1_0366  -2      11       -4.827589  hypothetical protein
HPAG1_0367  -1      11       -3.864732  conserved hypothetical integral membrane
HPAG1_0368  -1      11       -4.108182  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
HPAG1_0364  OMPADOMAIN  30     0.022    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 29.9 bits (67), Expect = 0.022
Identities = 11/59 (18%), Positives = 23/59 (38%), Gaps = 1/59 (1%)

Query: 392 WVFGGGVHKKWLWGTLWRWTSGTLA-NEASAAVNVGYKISKSLTASVKLEYFGVMTHSG 449
W G + T + +G N+ A GY+++ + + ++ G M + G
Sbjct: 28 WYTGAKLGWSQYHDTGFINNNGPTHENQLGAGAFGGYQVNPYVGFEMGYDWLGRMPYKG 86


7     HPAG1_0450  HPAG1_0470  Y     Y          N                   Pathogenicity Island (biased-composition)

LocusTag    DNBias  CDNBias  %GCBias    Product
HPAG1_0450  0       13       -3.520782  molybdenum ABC transporter ModA
HPAG1_0451  0       11       -3.736791  molybdenum ABC transporter ModB
HPAG1_0452  0       10       -2.166272  molybdenum ABC transporter ModD
HPAG1_0453  -1      9        -2.368495  glutamyl-tRNA synthetase
HPAG1_0454  -1      11       -2.838550  outer membrane protein HopJ
HPAG1_0455  -1      11       -2.962179  type II adenine specific methyltransferase
HPAG1_0456  0       12       -1.461294  DD-heptosyltransferase
HPAG1_0457  4       14       0.880563   GTP-binding protein
HPAG1_0458  8       18       0.059165   type II adenine specific DNA methyltransferase
HPAG1_0459  3       18       -0.432441  type II restriction endonuclease
HPAG1_0460  2       19       -0.437431  type II DNA modification enzyme
HPAG1_0461  2       18       -0.137639  catalase like protein
HPAG1_0462  3       19       -0.048392  outer membrane protein HofC
HPAG1_0463  3       16       -1.586958  outer membrane protein HofD
HPAG1_0464  3       17       -1.476385  hypothetical protein
HPAG1_0465  -1      10       -1.181779  hypothetical protein
HPAG1_0466  -1      10       -1.081503  putative potassium channel protein
HPAG1_0467  -1      10       -1.501720  ribosomal protein L28
HPAG1_0468  0       10       -1.925473  neuraminyllactose-binding hemagglutinin
HPAG1_0469  1       11       -1.518990  phospho-N-acetylmuramoyl-pentapeptide-
HPAG1_0470  2       13       -1.736258  UDP-N-acetylmuramoylalanine-D-glutamate ligase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
HPAG1_0452  PF05272  32     0.002    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 32.4 bits (73), Expect = 0.002
Identities = 12/36 (33%), Positives = 20/36 (55%), Gaps = 1/36 (2%)

Query: 27 KAEVVALL-GESGAGKSTILRILAGLEAVSSGYIEV 61
K + +L G G GKST++ L GL+ S + ++
Sbjct: 594 KFDYSVVLEGTGGIGKSTLINTLVGLDFFSDTHFDI 629


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
HPAG1_0457  TCRTETOQM  198    1e-57    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family

signature.
Length = 639

Score = 198 bits (504), Expect = 1e-57
Identities = 116/461 (25%), Positives = 190/461 (41%), Gaps = 67/461 (14%)

Query: 3 NIRNIAVIAHVDHGKTTLVDGLLSQSGTFSEREKVDE--RVMDSNDLERERGITILSKNT 60
I NI V+AHVD GKTTL + LL SG +E VD+ D+ LER+RGITI + T
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 61 AIYYKDTKINIIDTPGHADFGGEVERVLKMVDGVLLLVDAQEGVMPQTKFVVKKALSFGI 120
+ +++TK+NIIDTPGH DF EV R L ++DG +LL+ A++GV QT+ + GI
Sbjct: 62 SFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGI 121

Query: 121 CPIVVVNKIDKPAAEPDRVVDEVFDLF---------VAMGASDKQLDFPV-----VYAAA 166
I +NKID+ + V ++ + V + + +F
Sbjct: 122 PTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEG 181

Query: 167 RDGYAMKSLDDE----------------------------KKNL--EPLFETILEHVPSP 196
D K + + K N+ + L E I S
Sbjct: 182 NDDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVITNKFYSS 241

Query: 197 SGSVDEPLQMQIFTLDYDNYVGKIGIARVFNGSVKKNESVLLMKSDGSKENGRITKLIGF 256
+ L ++F ++Y ++ R+++G + +SV + KE +IT++
Sbjct: 242 THRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRI----SEKEKIKITEMYTS 297

Query: 257 LGLARTEIENAYAGDIVALAG--FNAMDV-GDSVVDPANPMPLDPMHLEEPTMSVYFAVN 313
+ +I+ AY+G+IV L V GD+ + P +P P + +
Sbjct: 298 INGELCKIDKAYSGEIVILQNEFLKLNSVLGDTKLLPQRERIENP----LPLLQTTVEPS 353

Query: 314 DSPLAGLEGKHVTANKLKDRLLKEMQTNIAMKCEEMGEGKFKVSGRGELQITILAENLRR 373
+ + D LL+ + + +S G++Q+ + L+
Sbjct: 354 KPQQREMLLDALLEISDSDPLLRYYVDSAT--------HEIILSFLGKVQMEVTCALLQE 405

Query: 374 E-GFEFSISRPEVIIKEENGVKCEPFEHLVIDTPQDFSGAI 413
+ E I P VI E K E H+ + P F +I
Sbjct: 406 KYHVEIEIKEPTVIYMERPLKKAEYTIHIEVP-PNPFWASI 445



Score = 41.8 bits (98), Expect = 8e-06
Identities = 20/80 (25%), Positives = 30/80 (37%), Gaps = 1/80 (1%)

Query: 396 EPFEHLVIDTPQDFSGAIIERLGKRKAEMKAMNPMSDGYTRLEFEIPARGLIGYRSEFLT 455
EP+ I PQ++ K A + + + L EIPAR + YRS+
Sbjct: 537 EPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQ-LKNNEVILSGEIPARCIQEYRSDLTF 595

Query: 456 DTKGEGVMNHSFLEFRPFSG 475
T G V + +G
Sbjct: 596 FTNGRSVCLTELKGYHVTTG 615


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
HPAG1_0468  PF05211  301    e-105    Neuraminyllactose-binding hemagglutinin
		>PF05211#Neuraminyllactose-binding hemagglutinin

Length = 260

Score = 301 bits (772), Expect = e-105
Identities = 66/283 (23%), Positives = 125/283 (44%), Gaps = 41/283 (14%)

Query: 1 MERSLIFKKVRVYSKMLVALGLSSVLIGCAMNPSAETKKPNDAKNQVQTHERIQTSSEHV 60
M+ + FK + K L+ + ++L+GC S + N+ ++ H +SE V
Sbjct: 1 MKANNHFKDF-AWKKCLLGASVVALLVGC----SPHIIETNEVALKLNYH----PASEKV 51

Query: 61 TPLDFNYPIHIVQAPQNHHVVGILMPRIQVSDNL-KPYIDKFQDALANQIQTIFEKRGYQ 119
LD + +L P Q SDN+ K Y +KF++ +++ I + +GY+
Sbjct: 52 QALD--------------EKILLLRPAFQYSDNIAKEYENKFKNQTTLKVEQILQNQGYK 97

Query: 120 VLRF--QDEKALSTQDKRKIFSVLDLKGWVGILEDLKMNVKDPNNPNL--DTLVDQ---- 171
V+ D+ S K++ + + + G + + D K ++ + P L T +D+
Sbjct: 98 VINVDSSDKDDFSFAQKKEGYLAVAMNGEIVLRPDPKRTIQKKSEPGLLFSTGLDKMEGV 157

Query: 172 --SSGSVWFNFYEPESNRVVHDFAVEVGTF---QAMTYTYKHSNSGGFDSSNSIIHENLE 226
+G V EP S + F +++ + T S+SGG S+
Sbjct: 158 LIPAGFVKVTILEPMSGESLDSFTMDLSELDIQEKFLKTTHSSHSGGLVSTMV----KGT 213

Query: 227 KNKEDAIHKILNRMYAVVMKKAVTELTKENIAKYRDTIDRMKG 269
N DAI LN+++A +M++ +LT++N+ Y+ +KG
Sbjct: 214 DNSNDAIKSALNKIFANIMQEIDKKLTQKNLESYQKDAKELKG 256


8     HPAG1_0491  HPAG1_0524  Y     Y          Y                   Pathogenicity Island (biased-composition)

LocusTag    DNBias  CDNBias  %GCBias    Product
HPAG1_0491  2       15       -3.451718  GTP-binding protein era-like protein
HPAG1_0492  5       16       -2.825402  conserved hypothetical secreted protein
HPAG1_0493  7       20       -3.059264  hypothetical protein
HPAG1_0494  9       20       -2.546362  IS606 transposase
HPAG1_0495  9       20       -2.435852  cag pathogenicity island protein 1
HPAG1_0496  9       20       -2.710379  HP0521B-like protein
HPAG1_0497  7       17       -1.982556  cag pathogenicity island protein 3
HPAG1_0498  8       16       -2.647451  cag pathogenicity island protein 4
HPAG1_0499  9       17       -3.062684  cag pathogenicity island protein 5
HPAG1_0500  9       20       -3.341185  cag pathogenicity island encoded protein/ATPase
HPAG1_0501  9       20       -3.531634  cag pathogenicity island protein Z
HPAG1_0502  8       21       -3.507405  cag pathogenicity island protein Y
HPAG1_0503  11      22       -4.475513  cag pathogenicity island protein Y
HPAG1_0504  10      28       -4.555395  cag pathogenicity island protein X
HPAG1_0505  11      30       -4.600416  cag pathogenicity island protein W
HPAG1_0506  14      31       -5.138680  cag pathogenicity island protein V
HPAG1_0507  12      29       -5.184685  cag pathogenicity island protein U
HPAG1_0508  12      24       -5.242359  cag pathogenicity island protein T
HPAG1_0509  10      22       -5.646419  hypothetical protein
HPAG1_0510  7       21       -4.501543  cag pathogenicity island protein S
HPAG1_0511  6       21       -3.109580  cag pathogenicity island protein Q
HPAG1_0512  5       19       -3.010212  cag pathogenicity island protein M
HPAG1_0513  6       21       -3.322296  cag pathogenicity island protein N
HPAG1_0514  6       23       -2.529615  cag pathogenicity island protein L
HPAG1_0515  5       21       -3.051742  cag pathogenicity island protein I
HPAG1_0516  6       20       -3.919948  cag pathogenicity island protein H
HPAG1_0517  6       20       -4.443715  cag pathogenicity island protein G
HPAG1_0518  7       21       -4.586237  cag pathogenicity island protein F
HPAG1_0519  6       22       -3.138544  cag pathogenicity island protein E
HPAG1_0520  7       21       -3.501686  cag pathogenicity island protein E
HPAG1_0521  8       25       -2.878074  cag pathogenicity island protein D
HPAG1_0522  6       22       -1.719179  cag pathogenicity island protein C
HPAG1_0523  3       18       -1.018097  cag pathogenicity island protein B
HPAG1_0524  2       17       -0.803637  cytotoxin-associated protein A
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
HPAG1_0491  PF03944  31     0.008    delta endotoxin
		>PF03944#delta endotoxin

Length = 633

Score = 30.8 bits (69), Expect = 0.008
Identities = 25/94 (26%), Positives = 47/94 (50%), Gaps = 3/94 (3%)

Query: 68 LHHQEKLLNQCMLSQALKAMGDAELCVFLASVHDDLKGYEEFLSLCQKPHILALSKIDMA 127
L E+ LNQ + + + A +AEL A+V + + + FL+ + L+++
Sbjct: 94 LRETERFLNQRLNTDTV-ARVNAELTGLQANVEEFNRQVDNFLNPNRNAVPLSITSSVNT 152

Query: 128 THKQVLQKLQEYQQYNSQFLALVPLSAKKSQNLN 161
+ L +L ++Q Q L L+PL A+ + NL+
Sbjct: 153 MQQLFLNRLPQFQMQGYQLL-LLPLFAQAA-NLH 184


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
HPAG1_0497  PF07201  30     0.020    Hypersensitivity response secretion protein HrpJ
		>PF07201#Hypersensitivity response secretion protein HrpJ

Length = 293

Score = 29.8 bits (67), Expect = 0.020
Identities = 14/76 (18%), Positives = 26/76 (34%), Gaps = 15/76 (19%)

Query: 277 APENSKEKLIEELIANSQLIANEEEREKKLLAEKEKQ--------EAELAKY--KLKDLE 326
S + EE+ E +E L K E ++ +Y K+ +LE
Sbjct: 44 GTLQSIADMAEEVTF-----VFSERKELSLDKRKLSDSQARVSDVEEQVNQYLSKVPELE 98

Query: 327 NQKKLKALKAELKKKN 342
++ + L + L
Sbjct: 99 QKQNVSELLSLLSNSP 114


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
HPAG1_0502  IGASERPTASE  40     7e-05    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 40.0 bits (93), Expect = 7e-05
Identities = 38/214 (17%), Positives = 79/214 (36%), Gaps = 5/214 (2%)

Query: 65 KARNEEERRACEKLLTPEAKKLLEEEAKESVKAYLDCVSQAKTEAEKKECEKLLTPEAKK 124
+ NEE R E + P A ++ + + ++KT + ++ T + ++
Sbjct: 1011 PSNNEEIARVDEAPVPPPAPAT---PSETTETVAENSKQESKTVEKNEQDATETTAQNRE 1067

Query: 125 KLEEAKKSVKAYLDCVSQAKNEAEKKECEKLLTPEAKKLLEQQALDCLKNAKTEADKKRC 184
+EAK +VKA A++ +E KE + T E + +++ E K
Sbjct: 1068 VAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTS 1127

Query: 185 VKDLPKDLQKKVLAKESLKAYKDCVSKARNEKEKKECEKLLTPEAKKLLEEEAKESVKAY 244
+ PK Q + + ++ A ++ + E + + T + K ++ V
Sbjct: 1128 -QVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTES 1186

Query: 245 LDCVSQAKNEAEKKECEKLLTPEAKKKLEEAKKS 278
V+ + E E T + E + K
Sbjct: 1187 -TTVNTGNSVVENPENTTPATTQPTVNSESSNKP 1219



Score = 38.1 bits (88), Expect = 3e-04
Identities = 33/155 (21%), Positives = 63/155 (40%), Gaps = 5/155 (3%)

Query: 355 VSKARNEKEKKECEKLLTPEARKLLEEAKESLKAYKDCVSKARNEEERRACEKLLTPE-A 413
SK + E+ E T + R++ +EAK ++KA A++ E + + T E A
Sbjct: 1047 ESKTVEKNEQDATET--TAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETA 1104

Query: 414 KKLLEEEAKESVKAYLDCVSQAKTEAEKKECEKLLTPEAKKKLEEAKKSVKAYLDCVSQA 473
EE+AK + + + K+E + + P+A+ E + SQ
Sbjct: 1105 TVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPAREND--PTVNIKEPQSQT 1162

Query: 474 KTEADKKECEKLLTPEAKKLLEQQALDCLKNAKTE 508
T AD ++ K + ++ + + N+ E
Sbjct: 1163 NTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVE 1197



Score = 37.4 bits (86), Expect = 5e-04
Identities = 40/242 (16%), Positives = 81/242 (33%), Gaps = 4/242 (1%)

Query: 395 KARNEEERRACEKLLTPEAKKLLEEEAKESVKAYLDCVSQAKTEAEKKECEKLLTPEAKK 454
+ NEE R E + P A ++ + + ++KT + ++ T + ++
Sbjct: 1011 PSNNEEIARVDEAPVPPPAPAT---PSETTETVAENSKQESKTVEKNEQDATETTAQNRE 1067

Query: 455 KLEEAKKSVKAYLDCVSQAKTEADKKECEKLLTPEAKKLLEQQALDCLKNAKTEADKKRC 514
+EAK +VKA A++ ++ KE + T E + +++ E K
Sbjct: 1068 VAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTS 1127

Query: 515 VKDLPKDLQKKVLAKKSVKAYLDCVSKARNEKEKKECEKLLTPEARKLLEEAKESLKAYK 574
+ PK Q + + ++ A + + E + + T + K E
Sbjct: 1128 -QVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTES 1186

Query: 575 DCVSKARNEKEKKECEKLLTPEARKLLEQEVKKSVKAYLDCVSRARNEKEKKECEKLLTP 634
V+ + E E T + E K + S N + +
Sbjct: 1187 TTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRST 1246

Query: 635 EA 636
A
Sbjct: 1247 VA 1248



Score = 36.2 bits (83), Expect = 0.001
Identities = 38/239 (15%), Positives = 80/239 (33%), Gaps = 36/239 (15%)

Query: 25 VSKARNEKEKKECEKLLTPEARKLLEEAKESLKAYKDCVSKARNEEERRACEKLLTPEAK 84
SK + E+ E T + R++ +EAK ++KA A++ E + + T E
Sbjct: 1047 ESKTVEKNEQDATET--TAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETA 1104

Query: 85 KLLEEEAKESVKAYLDCVSQAKTEAEKKECEKLLTPEAKKKLEEAKKSVKAYLDCVSQAK 144
+ +EE +AK E EK + +T + K Q +
Sbjct: 1105 TVEKEE-------------KAKVETEKTQEVPKVTSQVSPK----------------QEQ 1135

Query: 145 NEAEKKECEKLLTPEAKKLLEQQALDCLKNAKTEADKKRCVKDLPKDLQKKVLAKESLKA 204
+E + + E + +++ A TE K ++ + + +
Sbjct: 1136 SETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTG--- 1192

Query: 205 YKDCVSKARNEKEKKECEKLLTPEAKKLLEEEAKESVKAYLDCVSQAKNEAEKKECEKL 263
+ V + + + E+ + + SV++ V A + + L
Sbjct: 1193 --NSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVAL 1249


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
HPAG1_0503  IGASERPTASE  30     0.047    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 29.6 bits (66), Expect = 0.047
Identities = 32/214 (14%), Positives = 73/214 (34%), Gaps = 5/214 (2%)

Query: 408 RKELELQKELQEYKDCIKNAKTEAEKNECLKGLSKEAIERLK--QQALDCLKNAKTDEER 465
E + QE K KN + E + ++KEA +K Q + ++ +E
Sbjct: 1036 TTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKET 1095

Query: 466 NECLKNIPQDLQKELLADMSVKAYKDCVSKARNEKEKQECEKLLTPEARKKLEQQVLDCL 525
++KE A + + ++ KQE + + P+A E +
Sbjct: 1096 QTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNI 1155

Query: 526 KNAKTDEERKKCLKDLPKDLQSDI---LAKESVKAYKDCVSQAKTESEKKECEKLLTPEA 582
K ++ + K+ S++ + + + + V + + + + E+
Sbjct: 1156 KEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSES 1215

Query: 583 KKLLEEEAKESVKAYLDCVSQAKNEAEKKECEKL 616
+ + SV++ V A + + L
Sbjct: 1216 SNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVAL 1249


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
HPAG1_0504  TYPE4SSCAGX  880    0.0      Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein

signature.
Length = 522

Score = 880 bits (2274), Expect = 0.0
Identities = 514/522 (98%), Positives = 516/522 (98%)

Query: 2 MGQAFFKKIVGCFCLGYLFLSSTIEAAALDIKNFNRGRVKVVNKKIAYLGDEKPITIWTS 61
MGQAFFKKIVGCFCLGYLFLSS IEA ALDIKNFNRGRVKVVNKKIAYLGDEKPITIWTS
Sbjct: 1 MGQAFFKKIVGCFCLGYLFLSSAIEAVALDIKNFNRGRVKVVNKKIAYLGDEKPITIWTS 60

Query: 62 LDNVTVIQLEKDETISYITTGFNKGWSIVPNSNHIFIQPKSVKSNLMFEKEAVNFALMTR 121
LDNVTVIQLEKDETISYITTGFNKGWSIVPNSNHIFIQPKSVKSNLMFEKEAVNFALMTR
Sbjct: 61 LDNVTVIQLEKDETISYITTGFNKGWSIVPNSNHIFIQPKSVKSNLMFEKEAVNFALMTR 120

Query: 122 DYQEFLKTKKLIVDAPDPKELEEQKKALEKEKEAKEQAQKAQKDKREKRKEERAKNRANL 181
DYQEFLKTKKLIVDAPDPKELEEQKKALEKEKEAKEQAQKAQKDKREKRKEERAKNRANL
Sbjct: 121 DYQEFLKTKKLIVDAPDPKELEEQKKALEKEKEAKEQAQKAQKDKREKRKEERAKNRANL 180

Query: 182 ENLTNAMSNPQNLSNNKNLSEFIKQQRENELDQMERLEDMQEQAQANALKQIEELNKKQA 241
ENLTNAMSNPQNLSNNKNLSE IKQQRENELDQMERLEDMQEQAQANALKQIEELNKKQA
Sbjct: 181 ENLTNAMSNPQNLSNNKNLSELIKQQRENELDQMERLEDMQEQAQANALKQIEELNKKQA 240

Query: 242 EETIKQRAKDKISIKTDKPQKSPEDNSIELSPSDSAWRTNLVVRTNKALYQFILRIAQKD 301
EE ++QRAKDKISIKTDK QKSPEDNSIELSPSDSAWRTNLVVRTNKALYQFILRIAQKD
Sbjct: 241 EEAVRQRAKDKISIKTDKSQKSPEDNSIELSPSDSAWRTNLVVRTNKALYQFILRIAQKD 300

Query: 302 NFASAYLTVKLEYPQRHEVSSVIEEELKKREEAKRQRELIKQENLNTTAYINRVMMASNE 361
NFASAYLTVKLEYPQRHEVSSVIEEELKKREEAKRQRELIKQENLNTTAYINRVMMASNE
Sbjct: 301 NFASAYLTVKLEYPQRHEVSSVIEEELKKREEAKRQRELIKQENLNTTAYINRVMMASNE 360

Query: 362 QIINKEKIREEKQKIILDQAKALETQYVHNALKRNPVPRNYNYYQAPEKRSKHIMPSEIF 421
QIINKEKIREEKQKIILDQAKALETQYVHNALKRNPVPRNYNYYQAPEKRSKHIMPSEIF
Sbjct: 361 QIINKEKIREEKQKIILDQAKALETQYVHNALKRNPVPRNYNYYQAPEKRSKHIMPSEIF 420

Query: 422 DDGTFTYFGFKNITLQPAIFVVQPDGKLSMTDAAIDPNMTNLGLRWYRVNEIAEKFKLIK 481
DDGTFTYFGFKNITLQPAIFVVQPDGKLSMTDAAIDPNMTN GLRWYRVNEIAEKFKLIK
Sbjct: 421 DDGTFTYFGFKNITLQPAIFVVQPDGKLSMTDAAIDPNMTNSGLRWYRVNEIAEKFKLIK 480

Query: 482 DKALVTVINKGYGKNPLTKNYNIKNYGELERVIKKLPLVRDK 523
DKALVTVINKGYGKNPLTKNYNIKNYGELERVIKKLPLVRDK
Sbjct: 481 DKALVTVINKGYGKNPLTKNYNIKNYGELERVIKKLPLVRDK 522


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPAG1_0506  PF04335       118    6e-35    VirB8 type IV secretion protein
		>PF04335#VirB8 type IV secretion protein

Length = 227

Score = 118 bits (298), Expect = 6e-35
Identities = 44/205 (21%), Positives = 74/205 (36%), Gaps = 10/205 (4%)

Query: 27 KLNKANRTFKRAFYL---SMALNVAAVTSIVMMMPLKKTDIFVYGIDRYTGEFKIVKRSD 83
KL A R+ K A+ + + AL A V ++ + PLK + +V +DR TGE I +
Sbjct: 24 KLAAAERSKKLAWVVAGVAGALATAGVVAVAALTPLKTVEPYVITVDRNTGEASIAAKLH 83

Query: 84 A-RQIVNSEAVVDSATSKFVSLLFGYSKNSLRDRKDQLMQYCDVSFQTQAMRMFNENIRQ 142
I EAV + +V G+ + + D +M Q + R + + Q
Sbjct: 84 GDATITYDEAVRKYFLATYVRYREGWIAAAREEYFDAVMVMSARPEQDRWSRFYKTDNPQ 143

Query: 143 FVDKVRA-EAIISSNIQREKVKNSPLTRLTFFITIKITPDTMENYEYITKKQVTIYYDFA 201
+ A + I + +F +T T TI Y
Sbjct: 144 SPQNILANRTDVFVEI-KRVSFLGGNVAQVYFTKESVTGSNS----TKTDAVATIKYKVD 198

Query: 202 RGNSSQENLIINPFGFKVFDIQITD 226
S + + NP G++V +
Sbjct: 199 GTPSKEVDRFKNPLGYQVESYRADV 223


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPAG1_0520  SECETRNLCASE  29     0.013    Bacterial translocase SecE signature.
		>SECETRNLCASE#Bacterial translocase SecE signature.

Length = 127

Score = 29.5 bits (66), Expect = 0.013
Identities = 12/72 (16%), Positives = 23/72 (31%), Gaps = 12/72 (16%)

Query: 48 DGGNRLFGFPETFIYSSIFILFVTIVLSVILFQAYEPVLIVAIVIVLVALGF-------- 99
G L E + + L + ++ L++ L V++L+A
Sbjct: 9 GSGRGL----EAMKWVVVVALLLVAIVGNYLYRDIMLPLRALAVVILIAAAGGVALLTTK 64

Query: 100 KKDYRLYQRMER 111
K + R R
Sbjct: 65 GKATVAFAREAR 76


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPAG1_0524  TYPE4SSCAGA   1895   0.0      Type IV secretion system CagA exotoxin signature.
		>TYPE4SSCAGA#Type IV secretion system CagA exotoxin signature.

Length = 1147

Score = 1895 bits (4909), Expect = 0.0
Identities = 1070/1200 (89%), Positives = 1103/1200 (91%), Gaps = 53/1200 (4%)

Query: 1 MTNETINQQPQTEAAFNPQQFINNLQVAFLKVDNAVVSYDPDQKPIVDKNDRDNRQAFDG 60
MTNETI+QQPQTEAAFNPQQFINNLQVAFLKVDNAV SYDPDQKPIVDKNDRDNRQAF+G
Sbjct: 1 MTNETIDQQPQTEAAFNPQQFINNLQVAFLKVDNAVASYDPDQKPIVDKNDRDNRQAFEG 60

Query: 61 ISQLREEYSNKAIKNPTKKNQYFSDFINKSNDLINKDNLIDVESSTKSFQKFGDQRYRIF 120
ISQLREEYSNKAIKNPTKKNQYFSDFINKSNDLINKDNLIDVESSTKSFQKFGDQRYRIF
Sbjct: 61 ISQLREEYSNKAIKNPTKKNQYFSDFINKSNDLINKDNLIDVESSTKSFQKFGDQRYRIF 120

Query: 121 TSWVSHQNDPSKINTRSIRNFMENIIQPPIPDDKEKAEFLKSAKQSFAGIIIGNQIRTDQ 180
TSWVSHQNDPSKINTRSIRNFMENIIQPPI DDKEKAEFLKSAKQSFAGIIIGNQIRTDQ
Sbjct: 121 TSWVSHQNDPSKINTRSIRNFMENIIQPPILDDKEKAEFLKSAKQSFAGIIIGNQIRTDQ 180

Query: 181 KFMGVFDEFLKERQEAEKNGGPTGGDWLDIFLSFIFDKKQSSDVKEAINQEPVPHVQPDI 240
KFMGVFDE LKERQEAEKNG PTGGDWLDIFLSFIFDKKQSSDVKEAINQEPVPHVQPDI
Sbjct: 181 KFMGVFDESLKERQEAEKNGEPTGGDWLDIFLSFIFDKKQSSDVKEAINQEPVPHVQPDI 240

Query: 241 ATTTTDIQGLPPESRDLLDERGNFSKFTLGDMEMLDVEGVADIDPNYKFNQLLIHNNALS 300
ATTTTDIQGLPPE+RDLLDERGNFSKFTLGDMEMLDVEGVADIDPNYKFNQLLIHNNALS
Sbjct: 241 ATTTTDIQGLPPEARDLLDERGNFSKFTLGDMEMLDVEGVADIDPNYKFNQLLIHNNALS 300

Query: 301 SVLMGSHDGIEPEKVSLLYGNNGGPEARHDWNATVGYKNQQGDNVATLINVHMKNGSGLV 360
SVLMGSH+GIEPEKVSLLYG NGGP ARHDWNATVGYK+QQG+NVAT+INVHMKNGSGLV
Sbjct: 301 SVLMGSHNGIEPEKVSLLYGGNGGPGARHDWNATVGYKDQQGNNVATIINVHMKNGSGLV 360

Query: 361 IAGGEKGVNNPSFYLYKEDQLTGLKQALSQEEIRNKVDFMEFLAQNNAKLDNLSEKEKEK 420
IAGGEKG+NNPSFYLYKEDQLTG ++ALSQEEI+NK+DFMEFLAQNNAKLDNLSEKEKEK
Sbjct: 361 IAGGEKGINNPSFYLYKEDQLTGSQRALSQEEIQNKIDFMEFLAQNNAKLDNLSEKEKEK 420

Query: 421 FQTEIKDFQKDSKAYLDALGNDRIAFVSKKDTKHSALLTEFGNGDLSYTLKDYGKKADKA 480
F+TEIKDFQKDSKAYLDALGNDRIAFVSKKDTKHSAL+TEFGNGDLSYTLKDYGKKADKA
Sbjct: 421 FRTEIKDFQKDSKAYLDALGNDRIAFVSKKDTKHSALITEFGNGDLSYTLKDYGKKADKA 480

Query: 481 LDREKNVTLQGNLKHDGVMFVDYSNFKYTNASKNPNKGVGVTNGVSHLEANLSKVAVFNL 540
LDREKNVTLQG+LKHDGVMFVDYSNFKYTNASKNPNKGVGVTNGVSHLE +KVA+FNL
Sbjct: 481 LDREKNVTLQGSLKHDGVMFVDYSNFKYTNASKNPNKGVGVTNGVSHLEVGFNKVAIFNL 540

Query: 541 PNLNNLAITSYVRRDLEDKLWAKGLSPQEANKLIKDFLNSNKELVGKTLNFNKAVAEAKN 600
P+LNNLAITS+VRR+LEDKL KGLSPQEANKLIKDFL+SNKELVGKTLNFNKAVA+AKN
Sbjct: 541 PDLNNLAITSFVRRNLEDKLTTKGLSPQEANKLIKDFLSSNKELVGKTLNFNKAVADAKN 600

Query: 601 TGDYDEVKKAHKDLEKSLRKREHLEKEVAKKLESKSGNKNKMEAKSQANSQKDEIFALIN 660
TG+YDEVKKA KDLEKSLRKREHLEKEV KKLESKSGNKNKMEAK+QANSQKDEIFALIN
Sbjct: 601 TGNYDEVKKAQKDLEKSLRKREHLEKEVEKKLESKSGNKNKMEAKAQANSQKDEIFALIN 660

Query: 661 KEANRDARAIAYAQNLKGIKRELSDKLENINKDLKDFSKSFDEFKNGKNKDFSKTEETLK 720
KEANRDARAIAYAQNLKGIKRELSDKLEN+NK+LKDF KSFDEFKNGKNKDFSK EETLK
Sbjct: 661 KEANRDARAIAYAQNLKGIKRELSDKLENVNKNLKDFDKSFDEFKNGKNKDFSKAEETLK 720

Query: 721 ALKGSVKDLGINPEWISKVENLNAALNEFKNGKNKDFSKVTQAKSDLENSIKDVIINQKI 780
ALKGSVKDLGINPEWISKVENLNAALNEFKNGKNKDFSKVTQAKSDLENS+KDVIINQK+
Sbjct: 721 ALKGSVKDLGINPEWISKVENLNAALNEFKNGKNKDFSKVTQAKSDLENSVKDVIINQKV 780

Query: 781 TDKADNLNQAVSVAKTTGDFSRVEQALADLKNFSKGQLAQQAQKNEDFNTGKNSELYQSV 840
TDK DNLNQAVSVAK TGDFSRVEQALADLKNFSK QLAQQAQKNE N K SE+YQSV
Sbjct: 781 TDKVDNLNQAVSVAKATGDFSRVEQALADLKNFSKEQLAQQAQKNESLNARKKSEIYQSV 840

Query: 841 KNGVNKTLVGNGLSGIEATALAKNFSDIKKELNEKFKNFNNNNNNGLKNEPIYAKVNKKK 900
KNGVN TLVGNGLS EAT L+KNFSDIKKELN K NFNNNNNNGLKNEPIYAKVNKKK
Sbjct: 841 KNGVNGTLVGNGLSQAEATTLSKNFSDIKKELNAKLGNFNNNNNNGLKNEPIYAKVNKKK 900

Query: 901 TGQVASPEEPIYAKVNKKKTGQVASPEEPIYTQVAKKVNAKIDRLNQIASGLGGVGQAAG 960
GQ AS EEPIYA QVAKKVNAKIDRLNQIASGLG VGQAAG
Sbjct: 901 AGQAASLEEPIYA-------------------QVAKKVNAKIDRLNQIASGLGVVGQAAG 941

Query: 961 FPLKRHDKVDDLSKVGRSVSPEPIYATIDDLGGPFPLKRHDKVDDLSKVGLSRNQELAQK 1020
FPLKRHDKVDDLSKVGLSRNQELAQK
Sbjct: 942 ----------------------------------FPLKRHDKVDDLSKVGLSRNQELAQK 967

Query: 1021 IDNLNQAVSEAKAGFFGNLEQAIDKLKDSTKHNPMNLWVESAKKVPASLSAKLDNYATNS 1080
IDNLNQAVSEAKAGFFGNLEQ IDKLKDSTKHNPMNLWVESAKKVPASLSAKLDNYATNS
Sbjct: 968 IDNLNQAVSEAKAGFFGNLEQTIDKLKDSTKHNPMNLWVESAKKVPASLSAKLDNYATNS 1027

Query: 1081 HTRINSNIQNGAINEKVTGMLTQKNPEWLKLVNDKIVAHNVGSVPLSEYDRIGFNQKNMK 1140
H RINSNI+NGAINEK TGMLTQKNPEWLKLVNDKIVAHNVGSVPLSEYD+IGFNQKNMK
Sbjct: 1028 HIRINSNIKNGAINEKATGMLTQKNPEWLKLVNDKIVAHNVGSVPLSEYDKIGFNQKNMK 1087

Query: 1141 DYSDSFKFSTKLNNAVKDIKSGFVQFLTNTFSTASYYCLAEENAKHGIKNANTKGGFQKS 1200
DYSDSFKFSTKLNNAVKD SGF QFLTN FSTASYYCLA ENA+HGIKN NTKGGFQKS
Sbjct: 1088 DYSDSFKFSTKLNNAVKDTNSGFTQFLTNAFSTASYYCLARENAEHGIKNVNTKGGFQKS 1147


9   HPAG1_0654  HPAG1_0681  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPAG1_0654  -1      15       -3.224960  outer membrane protein HorF
HPAG1_0655  0       15       -4.060078  aspartate aminotransferase
HPAG1_0656  1       16       -4.876396  hypothetical protein
HPAG1_0657  -1      10       -1.867904  hypothetical protein
HPAG1_0658  0       9        -0.148157  integrase-recombinase protein
HPAG1_0659  3       10       0.365788   methylated-DNA--protein-cysteine
HPAG1_0660  3       12       0.673425   conserved hypothetical integral membrane
HPAG1_0661  2       14       1.245185   putative lipopolysaccharide biosynthesis
HPAG1_0662  1       13       1.735412   ribonucleoside-diphosphate reductase 1 alpha
HPAG1_0663  3       18       1.679373   hypothetical protein
HPAG1_0664  4       16       1.007145   hypothetical protein
HPAG1_0665  2       11       0.676935   hypothetical protein
HPAG1_0666  0       11       0.464449   hypothetical protein
HPAG1_0667  1       11       -1.098642  UDP-N-acetylglucosamine pyrophosphorylase
HPAG1_0668  2       14       -3.049909  flagellar biosynthetic protein
HPAG1_0669  2       15       -3.249784  iron(III) dicitrate transport protein
HPAG1_0670  -1      17       -2.755895  iron(II) transport protein
HPAG1_0671  0       18       -3.026199  hypothetical protein
HPAG1_0672  2       17       -2.258292  putative type II cytosine specific
HPAG1_0673  1       14       0.307612   putative type II restriction enzyme
HPAG1_0674  2       13       2.457331   hypothetical protein
HPAG1_0675  1       13       3.729061   acetyl coenzyme A acetyltransferase
HPAG1_0676  3       12       3.487081   succinyl-CoA-transferase subunit A
HPAG1_0677  3       13       3.360947   succinyl-CoA-transferase subunit B
HPAG1_0678  4       13       2.897010   short-chain fatty acids transporter
HPAG1_0679  3       13       2.468075   putative outer membrane protein
HPAG1_0680  2       13       2.796867   hydantoin utilization protein A
HPAG1_0681  2       11       1.412774   N-methylhydantoinase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPAG1_0664  PHAGEIV       29     0.008    Gene IV protein signature.
		>PHAGEIV#Gene IV protein signature.

Length = 426

Score = 28.7 bits (64), Expect = 0.008
Identities = 12/36 (33%), Positives = 16/36 (44%)

Query: 54 GKLIGGGVGGFVGDKIGGAIGAPGGPVGIGLGRFLG 89
G G GG D++ + + GG GI G LG
Sbjct: 220 GSQRGTVAGGVNTDRLTSVLSSAGGSFGIFNGDVLG 255


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPAG1_0668  FLGBIOSNFLIP  276    2e-96    Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP signature.
Length = 245

Score = 276 bits (708), Expect = 2e-96
Identities = 113/245 (46%), Positives = 162/245 (66%), Gaps = 2/245 (0%)

Query: 1 MRFFIFLILICPLICPLMSADSALPSVNLSLNAPNDPKQLVTTLNVIALLTLLVLAPSLI 60
MR + + + L A + LP + S P + + + +T L P+++
Sbjct: 1 MRRLLSVAPVL-LWLITPLAFAQLPGIT-SQPLPGGGQSWSLPVQTLVFITSLTFIPAIL 58

Query: 61 LVMTSFTRLIVVFSFLRTALGTQQTPPTQILVSLSLILTFFIMEPSLKKAYDTGIKPYMD 120
L+MTSFTR+I+VF LR ALGT PP Q+L+ L+L LTFFIM P + K Y +P+ +
Sbjct: 59 LMMTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSE 118

Query: 121 KKISYTEAFEKSALPFKEFMLKNTREKDLALFFRIRNLPNPKTPDEVSLSVLIPAFMISE 180
+KIS EA EK A P +EFML+ TRE DL LF R+ N + P+ V + +L+PA++ SE
Sbjct: 119 EKISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSE 178

Query: 181 LKTAFQIGFLLYLPFLVIDMVISSILMAMGMMMLPPVMISLPFKILVFILVDGFNLLTEN 240
LKTAFQIGF +++PFL+ID+VI+S+LMA+GMMM+PP I+LPFK+++F+LVDG+ LL +
Sbjct: 179 LKTAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGS 238

Query: 241 LVASF 245
L SF
Sbjct: 239 LAQSF 243


10  HPAG1_0697  HPAG1_0715  Y  N  N  Genomic Island
LocusTag    DNBias  CDNBias  %GCBias    Product
HPAG1_0697  2       13       -2.016253  hypothetical protein
HPAG1_0698  2       12       -1.684507  hypothetical protein
HPAG1_0699  4       12       -1.217886  RNA polymerase sigma-54 factor
HPAG1_0700  2       11       -0.300523  ABC transporter, ATP-binding protein
HPAG1_0701  0       11       -0.210713  hypothetical protein
HPAG1_0702  0       11       0.936666   DNA polymerase III gamma and tau subunits
HPAG1_0703  2       12       2.494765   conserved hypothetical integral membrane
HPAG1_0704  2       14       2.668173   hypothetical protein
HPAG1_0705  3       13       2.572653   hypothetical protein
HPAG1_0706  2       12       2.533900   outer membrane protein SabB
HPAG1_0707  1       11       2.295244   anaerobic C4-dicarboxylate transport protein
HPAG1_0708  -1      9        0.494831   L-asparaginase II
HPAG1_0709  -1      11       -0.676050  outer membrane protein SabA
HPAG1_0710  1       11       -2.046440  outer membrane protein
HPAG1_0711  2       12       -3.156019  transcriptional regulator
HPAG1_0712  2       16       -4.208125  hypothetical protein
HPAG1_0713  2       13       -3.287253  hypothetical protein
HPAG1_0714  3       13       -2.613530  hypothetical protein
HPAG1_0715  2       13       -1.546942  labile enterotoxin output A
11  HPAG1_0959  HPAG1_0966  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPAG1_0959  3       15       1.829799   cell division protein
HPAG1_0960  4       20       1.095142   cell division protein
HPAG1_0961  5       25       -0.963096  hypothetical protein
HPAG1_0962  7       26       -2.200925  hypothetical protein
HPAG1_0963  5       24       -1.232656  hypothetical protein
HPAG1_0964  3       21       -1.844812  hypothetical protein
HPAG1_0965  2       19       -2.203196  hypothetical protein
HPAG1_0966  3       18       -1.917080  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPAG1_0959  SHAPEPROTEIN  41     8e-06    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.
Length = 347

Score = 40.9 bits (96), Expect = 8e-06
Identities = 39/181 (21%), Positives = 68/181 (37%), Gaps = 13/181 (7%)

Query: 210 AASIATLSNDERELGVACVDMGGETCNLTIYSGNSIRYNKYLPVGSHHLTTDL------S 263
AA+I G VD+GG T + + S N + Y+ + +G + +
Sbjct: 146 AAAIGAGLPVSEATGSMVVDIGGGTTEVAVISLNGVVYSSSVRIGGDRFDEAIINYVRRN 205

Query: 264 HMLNTPFPYAEEVKIKYGDLSFEGGTETPSQNVQIPTTGSDGHESHIVPLSEIQTIMRER 323
+ AE +K + G S G E V+ + +EI ++E
Sbjct: 206 YGSLIGEATAERIKHEIG--SAYPGDEVREIEVRGRNLAEGVPRGFTLNSNEILEALQEP 263

Query: 324 ALETFEIIHRSIQDSGLE---EHLGGGVVLTGGMALMKGIKELAKAHFTNYPVRLAA-PM 379
+ +++ E + G+VLTGG AL++ + L T PV +A P+
Sbjct: 264 LTGIVSAVMVALEQCPPELASDISERGMVLTGGGALLRNLDRLL-MEETGIPVVVAEDPL 322

Query: 380 E 380

Sbjct: 323 T 323


12  HPAG1_1059  HPAG1_1064  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPAG1_1059  2       19       -1.767790  type II DNA modification enzyme
HPAG1_1060  4       15       -1.568620  flgM protein
HPAG1_1061  4       14       -2.178473  hypothetical protein
HPAG1_1062  5       15       -1.986917  peptidyl-prolyl cis-trans isomerase
HPAG1_1063  4       16       -2.717892  periplasmic protein
HPAG1_1064  4       16       -2.188238  peptidoglycan-associated lipoprotein precursor
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPAG1_1064  OMPADOMAIN    147    9e-46    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 147 bits (373), Expect = 9e-46
Identities = 48/169 (28%), Positives = 75/169 (44%), Gaps = 24/169 (14%)

Query: 22 KMDNKTVAGDVSAKTVQTAPV-TTEPAPEKEEPKQEPAPVVEEKPAVESGTIIASIYFDF 80
+ DN ++ VS + Q PAP PAP V+ K T+ + + F+F
Sbjct: 177 RPDNGMLSLGVSYRFGQGEAAPVVAPAPA-------PAPEVQTK----HFTLKSDVLFNF 225

Query: 81 DKYEIKESDQETLDEIVQKAKE---NHMQVLLEGNTDEFGSSEYNQALGVKRTLSVKNAL 137
+K +K Q LD++ + V++ G TD GS YNQ L +R SV + L
Sbjct: 226 NKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAYNQGLSERRAQSVVDYL 285

Query: 138 VIKGVEKDMIKTISFGETKPKCTQ-----KTR----ECYKENRRVDVKL 177
+ KG+ D I GE+ P K R +C +RRV++++
Sbjct: 286 ISKGIPADKISARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEV 334


13  HPAG1_1075  HPAG1_1083  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPAG1_1075  0       18       -4.142886  ATP synthase F0, subunit b'
HPAG1_1076  1       16       -3.605316  plasmid replication-partition related protein
HPAG1_1077  2       18       -3.754921  spoOJ regulator
HPAG1_1078  2       20       -4.543415  biotin acetyl coenzyme A carboxylase synthetase
HPAG1_1079  3       21       -4.596349  methionyl-tRNA formyltransferase
HPAG1_1080  4       22       -5.388194  hypothetical protein
HPAG1_1081  6       21       0.594638   hypothetical protein
HPAG1_1082  2       20       -0.045629  hypothetical protein
HPAG1_1083  2       19       -0.721349  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPAG1_1077  PF07675       31     0.004    Cleaved Adhesin
		>PF07675#Cleaved Adhesin

Length = 1358

Score = 31.2 bits (70), Expect = 0.004
Identities = 30/105 (28%), Positives = 40/105 (38%), Gaps = 7/105 (6%)

Query: 70 QISQVILKTQMPFLDLVPSNLGLAGFEKTFYDSQDENKRGELMLKNALESVV---GLYDY 126
VI T F SNL A FE + D + ++ VV G+YDY
Sbjct: 414 TFGSVIPATGPLFTGTASSNLYSANFEYLTPANADPVVTTQNIIVTGQGEVVIPGGVYDY 473

Query: 127 IIIDSPPALGPLTINSLSAAHSVIIPIQCEFFALEGTKLLLNTIR 171
I + PA G + I A P + + FA E K T+R
Sbjct: 474 CITNPEPASGKMWI----AGDGGNQPARYDDFAFEAGKKYTFTMR 514


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPAG1_1079  FERRIBNDNGPP  32     0.003    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 31.8 bits (72), Expect = 0.003
Identities = 12/33 (36%), Positives = 20/33 (60%)

Query: 72 EPEVQILKDLKPDFIVVVAYGKILPKEVLTIAP 104
EP +++L ++KP F+V A P+ + IAP
Sbjct: 86 EPNLELLTEMKPSFMVWSAGYGPSPEMLARIAP 118


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPAG1_1080  PF01540       32     0.010    Adhesin lipoprotein
		>PF01540#Adhesin lipoprotein

Length = 475

Score = 32.0 bits (72), Expect = 0.010
Identities = 72/311 (23%), Positives = 131/311 (42%), Gaps = 28/311 (9%)

Query: 150 QKKIEDENSAETLIAKQESEIKKYNEEIEKIRKKVTS--RTIQITLDEIEINDFCKVSKN 207
Q+K++ N IA + +IK+ +E+ K+ +K+ S TI +T+ ++E F ++ +
Sbjct: 106 QQKVDQANKK---IADENLKIKEGAKELLKLSEKIQSFADTIALTITKLEGKKF-QIDET 161

Query: 208 HFKYQEDTLMNLEKDFNELDE-----AIKKFDDLKEMELPKDYQTI-KDKLESLFSFDID 261
K T+ L K E+ IKK L E+E K++ T +K+ S +
Sbjct: 162 FKKQLISTIELLNKKSAEVKTFATVNTIKKDFLLSELESFKEFNTSWLEKIVSEWEEVKK 221

Query: 262 KKAGQVSEKIKEHISKVGREF--IEKGIKLQKEMPNNACPFCTQKITNNIIQAYTSY--- 316
+ +++E E K+ E I++G K ++ F I I + +
Sbjct: 222 AWSKELAEIKAEDDKKLAEENQKIKEGAKELLKLSEKIQSF-ADTIALTITKLERKFQID 280

Query: 317 --FNKSIEQFNQDSLEISGTLKNILNQWNIKE--ILQSFERFEPFMEDFLKEKKS-LENA 371
F K + + + S +K IK+ +L E F+ F +L++ S E
Sbjct: 281 EKFKKQLISTIELLNKKSVEVKTFATVNTIKKDFLLSELESFKEFNTSWLEKIVSEWEEV 340

Query: 372 LEQIKALLEELQKEVDKKEGVKNKEKFQETDKELLEIQENIQQHVDETRNILNQKKEQEE 431
+ L E++ E DKK +N +K + +EL +I E +N+ + E
Sbjct: 341 KKAWSKELAEIKAEDDKKLAEEN-QKIKNGVEELKKINNEAF----ELSKTVNKTIAELE 395

Query: 432 KLKKLKTKLKE 442
K K+ KE
Sbjct: 396 KKFKIDVSFKE 406


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPAG1_1081  RTXTOXIND     42     3e-06    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.
Length = 478

Score = 42.1 bits (99), Expect = 3e-06
Identities = 23/170 (13%), Positives = 61/170 (35%), Gaps = 18/170 (10%)

Query: 51 RAQYQSHFKALEQKEEALKERAKEQQAKFDEAVKHASVLALQDERAKIIEEARKNAFLEQ 110
+ Q+ + QKE L ++ E+ ++ ++ ++ R + +
Sbjct: 192 KEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAK 251

Query: 111 QKGLELLQKELDEKSKQVQELHQKEAEIERLKRENNEAESRLKAENEKKLNEKLDLEREK 170
LE + + E EL ++++E+++ E A+ + + + E
Sbjct: 252 HAVLEQ-ENKYVEAV---NELRVYKSQLEQIESEILSAKEEYQLVTQ-------LFKNEI 300

Query: 171 IEKALHEKNELKFKQQEEQLEMLRNELKNAQRKAELSSQQFQGEVQELAI 220
++K + +L + + +A +S +VQ+L +
Sbjct: 301 LDK--LRQTTDNIGLLTLELAKNEERQQASVIRAPVS-----VKVQQLKV 343


14  HPAG1_1326  HPAG1_1334  Y  N  N  Genomic Island
LocusTag    DNBias  CDNBias  %GCBias    Product
HPAG1_1326  -2      10       -3.010857  prephenate dehydrogenase
HPAG1_1327  -2      10       -4.044323  putative endonuclease G
HPAG1_1328  -1      12       -4.431392  putative type III restriction enzyme M protein
HPAG1_1329  -2      12       -4.042690  putative type III restriction enzyme R protein
HPAG1_1330  1       19       -1.983695  biotin synthetase
HPAG1_1331  2       19       -3.386220  putative ribonuclease N
HPAG1_1332  3       19       -3.729262  hypothetical protein
HPAG1_1333  3       19       -2.252502  hypothetical protein
HPAG1_1334  3       17       -1.719859  hypothetical protein
15  HPAG1_1358  HPAG1_1400  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPAG1_1358  -2      13       -3.072584  hypothetical protein
HPAG1_1359  -1      14       -3.744875  hypothetical protein
HPAG1_1360  -1      16       -3.471304  formyltetrahydrofolate hydrolase
HPAG1_1361  -1      19       -3.523322  signal peptide protease IV
HPAG1_1362  1       20       -3.545824  hypothetical protein
HPAG1_1363  1       19       -3.364711  hypothetical protein
HPAG1_1364  0       17       -1.842712  conserved hypothetical lipoprotein
HPAG1_1365  0       15       -0.841485  hypothetical protein
HPAG1_1366  -1      15       -0.333210  hypothetical protein
HPAG1_1367  1       17       -0.776781  peptidyl-prolyl cis-trans isomerase B,
HPAG1_1368  2       20       -1.731559  carbon storage regulator
HPAG1_1369  1       20       -2.402218  4-diphosphocytidyl-2-C-methyl-D-erythritol
HPAG1_1370  2       19       -1.980891  SsrA-binding protein
HPAG1_1371  2       15       -0.221960  biopolymer transport protein
HPAG1_1372  2       17       -0.972236  biopolymer transport protein
HPAG1_1373  0       17       -0.484568  ribosomal protein L34
HPAG1_1374  0       16       0.500794   ribonuclease P, protein component
HPAG1_1375  0       16       0.446681   hypothetical protein
HPAG1_1376  1       16       -0.183082  60 kDa inner-membrane protein
HPAG1_1377  1       17       -0.604306  hypothetical protein
HPAG1_1378  1       15       -0.287249  putative thiophene/furan oxidation protein
HPAG1_1379  0       14       0.160729   outer membrane protein HomD
HPAG1_1380  2       17       -1.974482  hypothetical protein
HPAG1_1381  2       18       -1.335752  MobC-like protein
HPAG1_1382  2       14       -0.386149  cagY like protein
HPAG1_1383  1       12       0.001641   thymidylate synthase
HPAG1_1384  0       11       -0.162400  glutamine fructose-6-phosphate aminotransferase
HPAG1_1385  -1      13       -0.679326  hypothetical protein
HPAG1_1386  0       13       -0.702532  purine nucleoside phosphorylase
HPAG1_1387  1       14       -0.885621  chromosomal replication initiator protein
HPAG1_1388  1       10       -2.495745  periplasmic competence protein
HPAG1_1389  1       11       -3.275374  *exodeoxyribonuclease
HPAG1_1390  0       8        -3.040141  hypothetical protein
HPAG1_1391  0       8        -3.248076  hypothetical protein
HPAG1_1392  0       8        -3.475129  DNA recombinase
HPAG1_1393  0       7        -3.418271  type III R-M system modification enzyme
HPAG1_1394  1       9        -2.764368  type III R-M system restriction enzyme
HPAG1_1395  -1      11       -0.398208  type IIS restriction-modification protein
HPAG1_1396  0       11       0.760727   hypothetical protein
HPAG1_1397  -1      11       0.654554   hypothetical protein
HPAG1_1398  0       12       1.412476   transcription termination factor
HPAG1_1399  2       14       1.643237   selenocysteine synthase
HPAG1_1400  3       14       2.424413   iron-regulated outer membrane protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPAG1_1376  60KDINNERMP   427    e-146    60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 427 bits (1099), Expect = e-146
Identities = 163/569 (28%), Positives = 272/569 (47%), Gaps = 57/569 (10%)

Query: 10 RLILAIALSFLFIAIYSYFFQKPNKTTTQTTKQETTNNHTATSPNAPNAQHFSVTQTIPQ 69
R +L IAL F+ I+ Q T T+ A + Q
Sbjct: 5 RNLLVIALLFVSFMIW-------QAWEQDKNPQPQAQQTTQTTTTAAGSAADQGVPASGQ 57

Query: 70 ENLLSTISFEHARIEIDSLGR--IKQVYLKDKKYLTPKQKGFLEHVGHLFSPKANPQNLL 127
L+ ++ + + I++ G + + K L Q L F +A
Sbjct: 58 GKLI-SVKTDVLDLTINTRGGDVEQALLPAYPKELNSTQPFQLLETSPQFIYQAQSGLTG 116

Query: 128 KELPLLAADKLKPLEVRFLDPTLNNKAFNTPYSASKTTLGPNEQLV--LTQDLGTLTIIK 185
++ P A+ +PL +N A G NE V D T K
Sbjct: 117 RDGPDNPANGPRPL-------------YNVEKDAYVLAEGQNELQVPMTYTDAAGNTFTK 163

Query: 186 TLTFYDDLHYDLQIAFKSPN--------NIIPSYVITNGYRPVADLDS-----YTFSGVL 232
T Y + + + N + + P D S +TF G
Sbjct: 164 TFVLKRG-DYAVNVNYNVQNAGEKPLEISSFGQLKQSITLPPHLDTGSSNFALHTFRGAA 222

Query: 233 LENNDKKIEKIE---DKDAKEIKRFSNTLFLSSVDRYFTTLLFTKDPQGFEALIDSEIGT 289
D+K EK + D + + S +++ + +YF T + G + +G
Sbjct: 223 YSTPDEKYEKYKFDTIADNENLNISSKGGWVAMLQQYFATAWIPHN-DGTNNFYTANLG- 280

Query: 290 KNPLGFISLKNEA-----------NLHGYIGPKDYRSLKAISPMLTDVIEYGLITFFAKG 338
N + I K++ N ++GP+ + A++P L ++YG + F ++
Sbjct: 281 -NGIAAIGYKSQPVLVQPGQTGAMNSTLWVGPEIQDKMAAVAPHLDLTVDYGWLWFISQP 339

Query: 339 VFVLLDYLYQFVGNWGWAIIFLTIIVRLILYPLSYKGMVSMQKLKELSPKMKELQEKYKG 398
+F LL +++ FVGNWG++II +T IVR I+YPL+ SM K++ L PK++ ++E+
Sbjct: 340 LFKLLKWIHSFVGNWGFSIIIITFIVRGIMYPLTKAQYTSMAKMRMLQPKIQAMRERLGD 399

Query: 399 EPQKLQAHMMQLYKKHGANPLGGCLPLILQIPVFFAIYRVLYNAVELKSSEWILWIHDLS 458
+ Q++ MM LYK NPLGGC PL++Q+P+F A+Y +L +VEL+ + + LWIHDLS
Sbjct: 400 DKQRISQEMMALYKAEKVNPLGGCFPLLIQMPIFLALYYMLMGSVELRQAPFALWIHDLS 459

Query: 459 IMDPYFILPLLMGASMYWHQSVTPNTMTDPMQAKIFKLLPLLFTIFLITFPAGLVLYWTT 518
DPY+ILP+LMG +M++ Q ++P T+TDPMQ KI +P++FT+F + FP+GLVLY+
Sbjct: 460 AQDPYYILPILMGVTMFFIQKMSPTTVTDPMQQKIMTFMPVIFTVFFLWFPSGLVLYYIV 519

Query: 519 NNILSVLQQLIINKVLENKKRMHAQNKKE 547
+N+++++QQ +I + LE K+ +H++ KK+
Sbjct: 520 SNLVTIIQQQLIYRGLE-KRGLHSREKKK 547


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPAG1_1377  IGASERPTASE   28     0.043    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 28.1 bits (62), Expect = 0.043
Identities = 19/62 (30%), Positives = 26/62 (41%), Gaps = 11/62 (17%)

Query: 54 AGVKESVKEVKEESVKETNTKENHQNHQNNMEEKKQKLETETPQEE--IITPKPPKKNPK 111
A KE + KET T E +E+K K+ETE QE + + PK+
Sbjct: 1086 AQSGSETKETQTTETKETATVE---------KEEKAKVETEKTQEVPKVTSQVSPKQEQS 1136

Query: 112 EE 113
E
Sbjct: 1137 ET 1138


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPAG1_1378  TCRTETOQM     32     0.005    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.
Length = 639

Score = 32.1 bits (73), Expect = 0.005
Identities = 32/134 (23%), Positives = 54/134 (40%), Gaps = 25/134 (18%)

Query: 227 LSIVGKPNAGKSSLLNAMLLEERA---LVSDIKGTTR-DTIEE-------------VIEL 269
+ ++ +AGK++L ++L A L S KGTTR D +
Sbjct: 6 IGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITSFQW 65

Query: 270 KGHKVRLIDTAGIRESADKIERLGIEKSLKSLENCDIILGVFDLSKPLEKEDFTIIDALN 329
+ KV +IDT G + ++ R SL L D + + ++ + + AL
Sbjct: 66 ENTKVNIIDTPGHMDFLAEVYR-----SLSVL---DGAILLISAKDGVQAQTRILFHALR 117

Query: 330 RAKKPCIVVLNKND 343
+ P I +NK D
Sbjct: 118 KMGIPTIFFINKID 131


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPAG1_1387  HTHFIS        35     4e-04    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 35.2 bits (81), Expect = 4e-04
Identities = 9/51 (17%), Positives = 24/51 (47%), Gaps = 4/51 (7%)

Query: 125 TVYEIAKKVAQSDTPPYNPVLFYGGTGLGKTHILNAIGNHALEKHKKVVLV 175
+Y + ++ Q+D ++ G +G GK + A+ ++ ++ V +
Sbjct: 148 EIYRVLARLMQTDLT----LMITGESGTGKELVARALHDYGKRRNGPFVAI 194


16  HPAG1_1426  HPAG1_1445  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPAG1_1426  2       13       -0.013722  conserved hypothetical integral membrane
HPAG1_1427  0       16       -0.085627  conserved hypothetical integral membrane
HPAG1_1428  1       13       0.643853   hypothetical protein
HPAG1_1429  0       11       0.207980   conserved hypothetical integral membrane
HPAG1_1430  2       11       0.366771   methyltransferase
HPAG1_1431  2       12       0.155811   exodeoxyribonuclease VII small subunit
HPAG1_1432  2       13       0.603259   hypothetical protein
HPAG1_1433  1       12       0.625815   seryl-tRNA synthetase
HPAG1_1434  2       12       0.497855   hypothetical protein
HPAG1_1435  2       13       0.909874   DNA helicase II
HPAG1_1436  3       18       0.077211   hypothetical protein
HPAG1_1437  2       18       0.243435   aromatic acid decarboxylase
HPAG1_1438  4       17       -0.628797  lipopolysaccharide core biosynthesis protein
HPAG1_1439  2       18       -1.526934  thymidylate kinase
HPAG1_1440  3       12       -1.979200  hypothetical protein
HPAG1_1441  2       11       -1.584917  restriction enzyme BcgI alpha chain-like
HPAG1_1442  2       10       -0.386257  restriction enzyme BcgI alpha chain-like
HPAG1_1443  3       10       -0.232099  restriction enzyme BcgI alpha chain-like
HPAG1_1444  2       11       -0.145562  type II restriction enzyme
HPAG1_1445  2       14       0.710399   DNA polymerase I
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPAG1_1427  ABC2TRNSPORT  33     0.001    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.
Length = 262

Score = 33.0 bits (75), Expect = 0.001
Identities = 19/63 (30%), Positives = 31/63 (49%), Gaps = 1/63 (1%)

Query: 297 PLIFMMGFVWPFESLPSYLQVFVQIVPAYHGISLLGRLNQMHAEFIDVSVHFYALIAIFI 356
P++F+ G V+P + LP Q + +P H I L+ R + +DV H AL +
Sbjct: 188 PILFLSGAVFPVDQLPIVFQTAARFLPLSHSIDLI-RPIMLGHPVVDVCQHVGALCIYIV 246

Query: 357 VSF 359
+ F
Sbjct: 247 IPF 249


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPAG1_1438  LPSBIOSNTHSS  223    5e-78    Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein signature.
Length = 166

Score = 223 bits (569), Expect = 5e-78
Identities = 63/147 (42%), Positives = 94/147 (63%)

Query: 4 IGIYPGTFDPVTNGHIDIIHRSSELFEKLIVAVAHSSAKNPMFSLKERLKMMQLATKSFK 63
IYPG+FDP+T GH+DII R LF+++ VAV + K PMFS++ERL+ + A
Sbjct: 2 NAIYPGSFDPITFGHLDIIERGCRLFDQVYVAVLRNPNKQPMFSVQERLEQIAKAIAHLP 61

Query: 64 NVECVAFEGLLANLAKEYHCKVLVRGLRVVSDFEYELQMGYANKSLNHELETLYFMPTLQ 123
N + +FEGL N A++ ++RGLRV+SDFE ELQM NK+L +LET++ + +
Sbjct: 62 NAQVDSFEGLTVNYARQRQAGAILRGLRVLSDFELELQMANTNKTLASDLETVFLTTSTE 121

Query: 124 NAFISSSIVRSIIAHKGDASHLVPKEI 150
+F+SSS+V+ + G+ H VP +
Sbjct: 122 YSFLSSSLVKEVARFGGNVEHFVPSHV 148


17  HPAG1_1458  HPAG1_1478  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPAG1_1458  -1      15       -3.184625  hypothetical protein
HPAG1_1459  -1      13       -3.560112  membrane-associated lipoprotein
HPAG1_1460  -1      13       -3.446817  hypothetical protein
HPAG1_1461  -1      12       -3.385186  hypothetical protein
HPAG1_1462  -1      15       -3.603438  type I R-M system specificity subunit
HPAG1_1463  0       13       -2.978250  hypothetical protein
HPAG1_1464  -2      11       0.089797   type I restriction enzyme M protein
HPAG1_1465  -2      11       0.726650   type I restriction enzyme M protein
HPAG1_1466  -2      11       1.101240   type I restriction enzyme R protein
HPAG1_1467  2       13       3.349244   hypothetical protein
HPAG1_1468  1       11       2.026056   hypothetical protein
HPAG1_1469  1       12       2.353543   iron(III) dicitrate transport protein
HPAG1_1470  0       10       0.650344   arginase
HPAG1_1471  1       10       0.820570   amino acid permease
HPAG1_1472  -1      10       -0.182201  alanine dehydrogenase
HPAG1_1473  1       11       -2.521380  hypothetical protein
HPAG1_1474  2       13       -1.758567  hypothetical protein
HPAG1_1475  1       11       -2.203819  outer membrane protein HorL
HPAG1_1476  2       11       -2.394015  inorganic polyphosphate/ATP-NAD kinase
HPAG1_1477  2       10       -2.392921  DNA repair protein
HPAG1_1478  -1      15       -3.129409  fibronectin/fibrinogen-binding protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPAG1_1459  LIPOLPP20     293    e-105    LPP20 lipoprotein precursor signature.
		>LIPOLPP20#LPP20 lipoprotein precursor signature.

Length = 175

Score = 293 bits (752), Expect = e-105
Identities = 174/175 (99%), Positives = 175/175 (100%)

Query: 1 MKNQVKKILGMSVIAAMVIVGCSHAPKSGISKSNKAYKEATKGAPDWVVGDLEKVAKYEK 60
MKNQVKKILGMSV+AAMVIVGCSHAPKSGISKSNKAYKEATKGAPDWVVGDLEKVAKYEK
Sbjct: 1 MKNQVKKILGMSVVAAMVIVGCSHAPKSGISKSNKAYKEATKGAPDWVVGDLEKVAKYEK 60

Query: 61 YSGVFLGRAEDLITNNDVDYSTNQATAKARANLAANLKSTLQKDLENEKTRTVDASGKRS 120
YSGVFLGRAEDLITNNDVDYSTNQATAKARANLAANLKSTLQKDLENEKTRTVDASGKRS
Sbjct: 61 YSGVFLGRAEDLITNNDVDYSTNQATAKARANLAANLKSTLQKDLENEKTRTVDASGKRS 120

Query: 121 ISGTDTEKISQLVDKELIASKMLARYVGKDRVFVLVGLDKQIVDKVREELGMVKK 175
ISGTDTEKISQLVDKELIASKMLARYVGKDRVFVLVGLDKQIVDKVREELGMVKK
Sbjct: 121 ISGTDTEKISQLVDKELIASKMLARYVGKDRVFVLVGLDKQIVDKVREELGMVKK 175


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPAG1_1478  FbpA_PF05833  113    4e-29    Fibronectin-binding protein
		>FbpA_PF05833#Fibronectin-binding protein

Length = 577

Score = 113 bits (283), Expect = 4e-29
Identities = 73/361 (20%), Positives = 142/361 (39%), Gaps = 31/361 (8%)

Query: 97 AKDLAYKSETFILRLEMIPKKANLMILDKEKCVIEA--FRFNDRVAKNDILGALPPN-IY 153
+ ++ ++ + + L + K + + I++ F FN N +G N +
Sbjct: 209 SSEICFRLKNNSIDLSLSNLKEIVEVCKDLFKEIQSNKFEFNCYTKNNSFVGFYCLNLMS 268

Query: 154 EHQEEDLDFKDLLDILEKDFLFYQHKE----LEHKKNQIIKRLNIQKERLKEKLEKLEDP 209
+ + + + +LE FY K+ L+ K + + K + R +K + L +
Sbjct: 269 KEDYKKIQYDSSSKLLEN---FYYAKDKSDRLKSKSSDLQKIVMNNINRCTKKDKILNNT 325

Query: 210 KNLQLEAKELQTQASLLLTYQHLIHKHESRVVLKDFED---KECAIEIDKSMPLNAFINK 266
+ + LL + + K S + L ++ I +D++ + +
Sbjct: 326 LKKCEDKDIFKLYGELLTANIYALKKGLSHIELANYYSENYDTVKITLDENKTPSQNVQS 385

Query: 267 KFTLSKKKKQKSQFLYLEEENLKEKIAFKENQINYVKGAQEESVLE------------MF 314
+ K K+ + + +E++ + + + + A +E F
Sbjct: 386 YYKKYNKLKKSEEAANEQLLQNEEELNYLYSVLTNINNADNYDEIEEIKKELIETGYIKF 445

Query: 315 MPFKNSKIKRPMNGYEVLYYKDFKIGLGKNQKENIKL-LQDARANDLWMHVRNIPGSHLI 373
SK + + I +GKN +N L L+ A +D+W H +NIPGSH+I
Sbjct: 446 KKIYKSKKSKTSKPMHFISKDGIDIYVGKNNIQNDYLTLKFANKHDIWFHTKNIPGSHVI 505

Query: 374 VFCQKNTPKDEVIMELAKMLIKMQKDAFNS-YEIDYTQRKFVKIIKGAN---VIYSKYRT 429
V + P + ++E A + K +S +DYT+ K VK GA VIYS +T
Sbjct: 506 VKNIMDIP-ESTLLEAANLAAYYSKSQNSSNVPVDYTEVKNVKKPNGAKPGMVIYSTNQT 564

Query: 430 I 430
I
Sbjct: 565 I 565



Score = 34.8 bits (80), Expect = 6e-04
Identities = 20/92 (21%), Positives = 48/92 (52%), Gaps = 5/92 (5%)

Query: 46 SAPYIGLSKKPPESVLKNTLALDFCLNKFTKNAKILQANVIDNDRI--LEIKGAKDLAYK 103
+ P I L+ + +K + L K+ NAKI+ + I+ DRI ++ + +L +
Sbjct: 55 NYPRIHLTDLTKPNPIKAPMFCMV-LRKYISNAKIVDIHQINQDRIVVIDFESTDELGFN 113

Query: 104 SETFILRLEMIPKKANLMILDK-EKCVIEAFR 134
S L +E++ + +N+ ++ K + ++++ +
Sbjct: 114 SIY-SLIIEIMGRHSNMTLIRKRDNIIMDSIK 144


18  HPAG1_0037  HPAG1_0042  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPAG1_0037  -2      13       0.466087   comB8 competence protein
HPAG1_0038  -1      13       0.594602   comB9 competence protein
HPAG1_0039  -2      12       1.237722   comB10 competence protein
HPAG1_0040  -1      12       0.846948   mannose-6-phosphate isomerase
HPAG1_0041  -1      12       0.991053   GDP-D-mannose dehydratase
HPAG1_0042  -2      14       0.730430   GDP-fucose synthetase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPAG1_0037  PF04335       132    4e-40    VirB8 type IV secretion protein
		>PF04335#VirB8 type IV secretion protein

Length = 227

Score = 132 bits (333), Expect = 4e-40
Identities = 37/202 (18%), Positives = 73/202 (36%), Gaps = 4/202 (1%)

Query: 40 QSVFRLERNRLKIAYKLLGLMSFIALVLAIVLISVLPLQKTEHHF--VDFLNQDKHYAII 97
+ K+A+ + G+ +A + + ++ PL+ E + VD + A
Sbjct: 22 RDKLAAAERSKKLAWVVAGVAGALATAGVVAVAALTPLKTVEPYVITVDRNTGEASIAAK 81

Query: 98 QRADKSISSNEALARSLIGAYVLNRESINRIDDKSRYELVRLQSSSKVWQRFEDLIKTQN 157
D +I+ +EA+ + + YV RE + ++ V + S+ R+ KT N
Sbjct: 82 LHGDATITYDEAVRKYFLATYVRYREGWIAAAREEYFDAVMVMSARPEQDRWSRFYKTDN 141

Query: 158 SIYAQSHLEREVHI-VNIAIYQQDNNPIASVSIAAKLMNENKLVYEKRYKIVLSYLFDTP 216
Q+ L + V I +A V + + + + + Y D
Sbjct: 142 PQSPQNILANRTDVFVEIKRVSFLGGNVAQVYFTKESVTGSNST-KTDAVATIKYKVDGT 200

Query: 217 DFDYASMPKNPTGFKITRYSIT 238
KNP G+++ Y
Sbjct: 201 PSKEVDRFKNPLGYQVESYRAD 222


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
HPAG1_0038  TYPE4SSCAGX  31     0.009    Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein signature.

Length = 522

Score = 30.5 bits (68), Expect = 0.009
Identities = 28/110 (25%), Positives = 51/110 (46%), Gaps = 18/110 (16%)

Query: 156 FIEDKNYYSNAFIKPQKENMAENAPKDAPTNNKPLKEEKEETKEKEEETITIGDNTNAMK 215
I+ +N + A+I N A + N + ++EEK++ + + + NA+K
Sbjct: 339 LIKQENLNTTAYI-----NRVMMASNEQIINKEKIREEKQKIILDQAKALETQYVHNALK 393

Query: 216 IVKKDIQKGYKALKSSQRKWYCLGICSKKSKLSLMPKEIFNDKQFTYFKF 265
+ + + Y ++ + K+SK +MP EIF+D FTYF F
Sbjct: 394 --RNPVPRNYNYYQAPE----------KRSK-HIMPSEIFDDGTFTYFGF 430


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPAG1_0041  NUCEPIMERASE  85     2e-20    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 84.8 bits (210), Expect = 2e-20
Identities = 45/180 (25%), Positives = 71/180 (39%), Gaps = 19/180 (10%)

Query: 7 LIAGVTGQDGSYLAEYLLNLGYEVHGLKRRSSSINTSRIDHLYEDLHSDHKRRFFLHYGD 66
L+ G G G ++++ LL G++V G+ + + S E L F H D
Sbjct: 4 LVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQP---GFQFHKID 60

Query: 67 MTDSSNLIHLIATTKPTEIYNLAAQSHVKVSFETPEYTANADGIGTLRILEAMRILGLEN 126
+ D + L A+ ++ + V+ S E P A+++ G L ILE R ++
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQ- 119

Query: 127 KTRFYQASTSELYGEVLETPQNENTPF-------NPRSPYAVAKMYAFYITKNYREAYNL 179
AS+S +YG N PF +P S YA K + Y Y L
Sbjct: 120 --HLLYASSSSVYGL------NRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYGL 171


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPAG1_0042  NUCEPIMERASE  51     2e-09    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 50.6 bits (121), Expect = 2e-09
Identities = 51/346 (14%), Positives = 106/346 (30%), Gaps = 54/346 (15%)

Query: 5 ILITGAYGMVGQNTALYFKKNKPDV-----------TLLTPKKSELY-----------LL 42
L+TGA G +G + + + V L + EL L
Sbjct: 3 YLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDLA 62

Query: 43 DKDNIQAYLKEYKPTGIIHCAGRVGGIVANMNDLSTYMVENLLMGLYLFSSALDSGVKKA 102
D++ + + R + ++ + Y NL L + + ++
Sbjct: 63 DREGMTDLFASGHFERVFISPHR-LAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQHL 121

Query: 103 INLASSCAYPKFAPNPLKESDLLNGSLEPTNEGYALAKLSVMKYCEYVSAEKGVFYKTLV 162
+ +SS Y P D ++ + YA K + S G+ L
Sbjct: 122 LYASSSSVYGLNRKMPFSTDDSVDHPVSL----YAATKKANELMAHTYSHLYGLPATGLR 177

Query: 163 PCNLYGEFDKFEEKIAHMIPGLIARMHAAKLKNEKEFVMWGDGTARREYLNAKDLARFIA 222
+YG + + P + + K ++ G +R++ D+A I
Sbjct: 178 FFTVYGPWGR---------PDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAII 228

Query: 223 LAYDNITSIPS-----------------VMNVGSGVDYSIEEYYEMVAQVLDYKGVFVKD 265
D I + V N+G+ + +Y + + L +
Sbjct: 229 RLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAKKNML 288

Query: 266 LSKPVGMQQKLMDISK-QKALKWELEIPLEQGIKEAYEYYLKLLEV 310
+P + + D + + + E ++ G+K +Y +V
Sbjct: 289 PLQPGDVLETSADTKALYEVIGFTPETTVKDGVKNFVNWYRDFYKV 334


19  HPAG1_0115  HPAG1_0120  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPAG1_0115  -2      11       0.231022   flagellin B
HPAG1_0116  -1      12       -1.014023  DNA topoisomerase I
HPAG1_0117  -1      11       -0.667037  hypothetical protein
HPAG1_0118  0       11       -0.866718  hypothetical protein
HPAG1_0119  -1      10       0.258253   hypothetical protein
HPAG1_0120  0       13       1.599218   phosphoenolpyruvate synthase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
HPAG1_0115  FLAGELLIN  286    9e-93    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 286 bits (732), Expect = 9e-93
Identities = 130/519 (25%), Positives = 221/519 (42%), Gaps = 18/519 (3%)

Query: 2 SFRINTNIAALTSHAVGVQNNRDLSSSLEKLSSGLRINKAADDSSGMAIADSLRSQSANL 61
+ INTN +L + ++ LSS++E+LSSGLRIN A DD++G AIA+ S L
Sbjct: 1 AQVINTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGL 60

Query: 62 GQAIRNANDAIGMVQTADKAMDEQIKILDTIKTKAVQAAQDGQTLESRRALQSDIQRLLE 121
QA RNAND I + QT + A++E L ++ +VQA + +++Q +IQ+ LE
Sbjct: 61 TQASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLE 120

Query: 122 ELDNIANTTSFNGQQMLSGSFSNKEFQIGAYSNTTVKASIGSTSSDKIGHVRMETSSFSG 181
E+D ++N T FNG ++LS + Q+GA T+ + +G +
Sbjct: 121 EIDRVSNQTQFNGVKVLSQDN-QMKIQVGANDGETITIDLQKIDVKSLGLDGFNVN---- 175

Query: 182 EGMLASAAAQNLTEVGLNFKQVNGVNDYKIETVRISTSAGTGIGALSEIINRFSNTLGVR 241
+ ++ +FK V G + Y + + +G + + V
Sbjct: 176 -----GPKEATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVN 230

Query: 242 ASYNVMATG----GTPVQSGTVRELTINGVEIGTVNDVHKNDADGRLTNAINSVKDRTGV 297
A+ + T T V + T E + K +G T V
Sbjct: 231 AANGQLTTDDAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGD-TFDYKGVTFTIDT 289

Query: 298 EASLDIQGRINLHSIDGRAISVHAASASGQVFGGGNFAGISGTQHAVIGRLTLTRTDARD 357
+ D G+++ +I+G +++ A + S D +
Sbjct: 290 KTGNDGNGKVST-TINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKT 348

Query: 358 IIVSGVNFSHVGFHSAQGVAEYTVNLRAVRGIFDANVASAAGANANGAQAETNSQGIGAG 417
S ++ +G ++ TVN + + AG + + +
Sbjct: 349 KNESAKLSDLEANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFIDKTASGVSTLINE 408

Query: 418 --VTSLKGAMIVMDMADSARTQLDKIRSDMGSVQMELITTINNISVTQVNVKAAESQIRD 475
+ K + DSA +++D +RS +G++Q + I N+ T N+ +A S+I D
Sbjct: 409 DAAAAKKSTANPLASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARSRIED 468

Query: 476 VDFAEESANFSKYNILAQSGSFAMAQANAVQQNVLRLLQ 514
D+A E +N SK IL Q+G+ +AQAN V QNVL LL+
Sbjct: 469 ADYATEVSNMSKAQILQQAGTSVLAQANQVPQNVLSLLR 507


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPAG1_0116  FbpA_PF05833  30     0.047    Fibronectin-binding protein
		>FbpA_PF05833#Fibronectin-binding protein

Length = 577

Score = 29.8 bits (67), Expect = 0.047
Identities = 14/29 (48%), Positives = 17/29 (58%)

Query: 225 QEIKNELEKESYIISSIVKKSKKSPTPPP 253
+EIK EL + YI + KSKKS T P
Sbjct: 431 EEIKKELIETGYIKFKKIYKSKKSKTSKP 459


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
HPAG1_0118  IGASERPTASE  41     1e-05    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 40.8 bits (95), Expect = 1e-05
Identities = 42/246 (17%), Positives = 83/246 (33%), Gaps = 9/246 (3%)

Query: 95 ADDQSKKEVAETQKEAENARDRANKSGIELANSQIKAEQEQQKTSNIETNNQIKVEQEQQ 154
AD S E + A ++ AE +Q++ +E N EQ
Sbjct: 1005 ADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKN-------EQD 1057

Query: 155 KTEQEKQKTEQEKQKTSNIETNNQ-IKVEQEQQKTSNIETNNQIKVEQEQQKTEQEKQKT 213
TE Q E K+ SN++ N Q +V Q +T +T + K +K E+ K +T
Sbjct: 1058 ATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQT-TETKETATVEKEEKAKVET 1116

Query: 214 NNTQKDLIKKAEQNCQENHNQFFIKKVGIKGGIAIEVEAECKTPKPTKTNQTPIQPKHLP 273
TQ+ ++ + ++ ++ + V + + T T K
Sbjct: 1117 EKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETS 1176

Query: 274 NSKQPRSQRGSKTQELIAYLQKELESLPYSQKAIAKQVDFYKPSSIAYLELDPRDFNVTE 333
++ + + + ++ + P + + KP + + NV
Sbjct: 1177 SNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEP 1236

Query: 334 EWQNEN 339
+ N
Sbjct: 1237 ATTSSN 1242


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPAG1_0120  PHPHTRNFRASE  294    7e-92    Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase signature.

Length = 572

Score = 294 bits (753), Expect = 7e-92
Identities = 105/446 (23%), Positives = 186/446 (41%), Gaps = 68/446 (15%)

Query: 388 DLEHMNSFKEGEILVTDN-TDPDWEPCMKK-ASAVITNRGGRTCHAAIVAREIGVPAIVG 445
+ + + E +++ ++ T D K+ T+ GGRT H+AI++R + +PA+VG
Sbjct: 146 ETGSLATIAEETVIIAEDLTPSDTAQLNKQFVKGFATDIGGRTSHSAIMSRSLEIPAVVG 205

Query: 446 VSGATDSLYTGMEITVSCAEGE---------EGYVYAGIYEHEIERVELSNMQETQT--- 493
T+ + G + V EG E ++ E + + +
Sbjct: 206 TKEVTEKIQHGDMVIVDGIEGIVIVNPTEEEVKAYEEKRAAFEKQKQEWAKLVGEPSTTK 265

Query: 494 -----KIYINIGNPEKAFSFSQLPNHGVGLARMEMIILNQIKAHPLALVDLHHKKSVKEK 548
++ NIG P+ G+GL R E + +++ + P
Sbjct: 266 DGAHVELAANIGTPKDVDGVLANGGEGIGLYRTEFLYMDRDQ-LPTE------------- 311

Query: 549 NEIENLMAGYANPKDFFVKKIAEGIGMISAAFYPKPVIVRTSDFKSNEYMRMLGGSSYEP 608
E Y K++ + KPV++RT D ++ + L P
Sbjct: 312 ---EEQFEAY--------KEVVQ-------RMDGKPVVIRTLDIGGDKELSYL----QLP 349

Query: 609 NEENPMLGYRGASRYYSESYNEAFSWECEALALVREEMGLTNMKVMIPFLRTIEEGKKVL 668
E NP LG+R + F + AL N+KVM P + T+EE ++
Sbjct: 350 KELNPFLGFRAIRLCLE--KQDIFRTQLRALL---RASTYGNLKVMFPMIATLEELRQAK 404

Query: 669 EILRKNNLESGKNG------LEIYIMCELPVNVILADDFLSLFDGFSIGSNDLTQLTLGV 722
I+++ + G +E+ IM E+P + A+ F D FSIG+NDL Q T+
Sbjct: 405 AIMQEEKDKLLSEGVDVSDSIEVGIMVEIPSTAVAANLFAKEVDFFSIGTNDLIQYTMAA 464

Query: 723 DRDSELVSHVFDERNEAMLKMFKKAIEACKRHNKYCGICGQAPSDYPEVTEFLVREGITS 782
DR +E VS+++ + A+L++ I+A K+ G+CG+ D L+ G+
Sbjct: 465 DRMNERVSYLYQPYHPAILRLVDMVIKAAHSEGKWVGMCGEMAGD-EVAIPLLLGLGLDE 523

Query: 783 ISLNPDSVIPTWNAVAKLE-KELKEH 807
S++ S++P + + KL +ELK
Sbjct: 524 FSMSATSILPARSQLLKLSKEELKPF 549


20  HPAG1_0246  HPAG1_0253  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPAG1_0246  -2      13       1.199164   neutrophil activating protein
HPAG1_0247  -3      13       1.025027   histidine kinase sensor protein
HPAG1_0248  -3      12       1.756290   hypothetical protein
HPAG1_0249  -3      11       2.240310   flagellar basal-body P-ring protein
HPAG1_0250  -2      10       2.267250   ATP-dependent RNA helicase
HPAG1_0251  -2      8        1.927586   hypothetical protein
HPAG1_0252  -2      8        2.034850   hypothetical protein
HPAG1_0253  -2      9        2.039979   oligopeptide permease ATPase protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
HPAG1_0246  HELNAPAPROT  150    2e-49    Helicobacter neutrophil-activating protein A family ...
		>HELNAPAPROT#Helicobacter neutrophil-activating protein A family signature.

Length = 153

Score = 150 bits (379), Expect = 2e-49
Identities = 39/140 (27%), Positives = 75/140 (53%), Gaps = 1/140 (0%)

Query: 6 EILKHLQADAIVLFMKVHNFHWNVKGTDFFNVHKATEEIYEEFADMFDDLAERIVQLGHH 65
L ++ +L+ K+H FHW VKG FF +H+ EE+Y+ A+ D +AER++ +G
Sbjct: 15 NSLNTQLSNWFLLYSKLHRFHWYVKGPHFFTLHEKFEELYDHAAETVDTIAERLLAIGGQ 74

Query: 66 PLVTLSEALKLTRVKEETKTSFHSKDIFKEILGDYKHLEKEFKELSNTAEKEGDKVTVTY 125
P+ T+ E + + + + + ++ + ++ DYK + E K + AE+ D T
Sbjct: 75 PVATVKEYTEHASITDGGNET-SASEMVQALVNDYKQISSESKFVIGLAEENQDNATADL 133

Query: 126 ADDQLAKLQKSIWMLEAHLA 145
+ +++K +WML ++L
Sbjct: 134 FVGLIEEVEKQVWMLSSYLG 153


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
HPAG1_0247  PF06580  30     0.014    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 30.2 bits (68), Expect = 0.014
Identities = 10/71 (14%), Positives = 25/71 (35%), Gaps = 13/71 (18%)

Query: 281 IVLQNFLYNAIDAIEALEESEQ-GQVKIEAFIQNEFIVFTIIDNGKEVENKSALFEPFET 339
+++Q + N I + + Q G++ ++ N + + + G +
Sbjct: 258 MLVQTLVENGI--KHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTK------- 308

Query: 340 TKLKGNGLGLA 350
+ G GL
Sbjct: 309 ---ESTGTGLQ 316


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPAG1_0249  FLGPRINGFLGI  364    e-127    Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 364 bits (936), Expect = e-127
Identities = 117/345 (33%), Positives = 190/345 (55%), Gaps = 26/345 (7%)

Query: 19 AEKIGDIASVVGVRDNQLIGYGLVIGLNGTGDK-SGSKFTMQSISNMLESVNVKISADDI 77
+I DIAS+ RDNQLIGYGLV+GL GTGD S FT QS+ ML+++ +
Sbjct: 28 TSRIKDIASLQAGRDNQLIGYGLVVGLQGTGDSLRSSPFTEQSMRAMLQNLGITTQGGQS 87

Query: 78 KSKNVAAVMITASLPPFARQGDKIDIHISSIGDAKSIQGGTLVMTPLNAVDGNIYALAQG 137
+KN+AAVM+TA+LPPFA G ++D+ +SS+GDA S++GG L+MT L+ DG IYA+AQG
Sbjct: 88 NAKNIAAVMVTANLPPFASPGSRVDVTVSSLGDATSLRGGNLIMTSLSGADGQIYAVAQG 147

Query: 138 AITSGN-----------SNNLLSANIINGATIEREVSYDLFHKNAMVLSLKSPNFKNAIQ 186
A+ SA + NGA IERE+ +VL L++P+F A++
Sbjct: 148 ALIVNGFSAQGDAATLTQGVTTSARVPNGAIIERELPSKFKDSVNLVLQLRNPDFSTAVR 207

Query: 187 VQNTLNKV----FGNKVAIALDPKTIQITRPERFSMVEFLALVQEIPINYSAKNKIIVDE 242
V + +N +G+ +A D + I + +P + +A ++ + + K++++E
Sbjct: 208 VADVVNAFARARYGDPIAEPRDSQEIAVQKPRVADLTRLMAEIENLTVETDTPAKVVINE 267

Query: 243 KSGTIVSGVDIMVHPIVVTSQDITLKITKEP--------LNDSKNTQDLDNNMSLDTAHN 294
++GTIV G D+ + + V+ +T+++T+ P Q + M++
Sbjct: 268 RTGTIVIGADVRISRVAVSYGTLTVQVTESPQVIQPAPFSRGQTAVQPQTDIMAMQEGSK 327

Query: 295 TLSSNGKSITIAGVVKALQKIGVSAKGMVSILQALKKSGAISAEM 339
G + +V L IG+ A G+++ILQ +K +GA+ AE+
Sbjct: 328 VAIVEGPDLR--TLVAGLNSIGLKADGIIAILQGIKSAGALQAEL 370


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits  Score  E-value  Comments
HPAG1_0250  SECA  30     0.028    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 29.8 bits (67), Expect = 0.028
Identities = 17/63 (26%), Positives = 31/63 (49%), Gaps = 2/63 (3%)

Query: 261 IVFTRTKKEADELHQFLASKNYKSTALHGDMDQRDRRASIMAFKKNDADVLVATDVASRG 320
+V T + ++++ + L K L+ + A+I+A A V +AT++A RG
Sbjct: 453 LVGTISIEKSELVSNELTKAGIKHNVLNAKFHANE--AAIVAQAGYPAAVTIATNMAGRG 510

Query: 321 LDI 323
DI
Sbjct: 511 TDI 513


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
HPAG1_0253  HTHFIS  31     0.009    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 31.3 bits (71), Expect = 0.009
Identities = 16/50 (32%), Positives = 21/50 (42%), Gaps = 7/50 (14%)

Query: 30 VAIVGESGSGKSSIANIIMRLNPR----FKPHNGEVLFETTNLLKESEEF 75
+ I GESG+GK +A + R F N + L ESE F
Sbjct: 163 LMITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPRD---LIESELF 209


21  HPAG1_0346  HPAG1_0350  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPAG1_0346  -2      11       0.828688   flagellar basal-body M-ring protein
HPAG1_0347  -2      11       1.072248   flagellar motor switch protein
HPAG1_0348  -2      12       0.900235   flagellar export protein
HPAG1_0349  -1      11       1.308772   1-deoxyxylulose-5-phosphate synthase
HPAG1_0350  -2      12       0.377443   GTP-binding membrane protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPAG1_0346  FLGMRINGFLIF  560    0.0      Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 560 bits (1444), Expect = 0.0
Identities = 178/582 (30%), Positives = 294/582 (50%), Gaps = 66/582 (11%)

Query: 11 VDFFIKLNKKQKIALIAAGVLITALLVFLLLYPFKEKDYTQGGYGVLFEGLDSSDNALIL 70
+++ +L +I LI AG A++V ++L+ K DY LF L D I+
Sbjct: 13 LEWLNRLRANPRIPLIVAGSAAVAIVVAMVLWA-KTPDYR-----TLFSNLSDQDGGAIV 66

Query: 71 QHLQQNQIPYKVSKDD-TILIPKDKVYEERITLASQGIPKTSKVGFEIFDTKDFGATDFD 129
L Q IPY+ + I +P DKV+E R+ LA QG+PK VGFE+ D + FG + F
Sbjct: 67 AQLTQMNIPYRFANGSGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQFS 126

Query: 130 QNIKLIRAIEGELSRTIESLNPILKANVHIAIPKDSVFVAKEVPPSASVMLKLKPDMKLS 189
+ + RA+EGEL+RTIE+L P+ A VH+A+PK S+FV ++ PSASV + L+P L
Sbjct: 127 EQVNYQRALEGELARTIETLGPVKSARVHLAMPKPSLFVREQKSPSASVTVTLEPGRALD 186

Query: 190 PTQILGIKNLIAAAVPKLTIENVKIVNENGESIGEGDILENSKELALEQLHYKQNFENIL 249
QI + +L+++AV L NV +V+++G + + + + ++L QL + + E+ +
Sbjct: 187 EGQISAVVHLVSSAVAGLPPGNVTLVDQSGHLLTQSNT--SGRDLNDAQLKFANDVESRI 244

Query: 250 ENKIVNILAPIVGGKNKVVARVNAEFDFSQKKSTKETFDPNN-----VVRSEQNLEEKKE 304
+ +I IL+PIVG N V A+V A+ DF+ K+ T+E + PN +RS Q ++
Sbjct: 245 QRRIEAILSPIVGNGN-VHAQVTAQLDFANKEQTEEHYSPNGDASKATLRSRQLNISEQV 303

Query: 305 GAPKKQVGGVPGVVSN-IGPVQGLKDNKEPEKYEKSQN---------------------- 341
GA GGVPG +SN P P + +QN
Sbjct: 304 GAGYP--GGVPGALSNQPAPPNEAPIATPPTNQQNAQNTPQTSTSTNSNSAGPRSTQRNE 361

Query: 342 TTNYEVGKTISEIKGEFGTLVRLNAAVVVDGKYKIALKDGANTLEYEPLSDESLQKINAL 401
T+NYEV +TI K G + RL+ AVVV+ K L DG + PL+ + +++I L
Sbjct: 362 TSNYEVDRTIRHTKMNVGDIERLSVAVVVNYK---TLADG----KPLPLTADQMKQIEDL 414

Query: 402 VKQAIGYNQNRGDDVAVSNFEFNPMAPMLDNATLSEKIMHKTQKILGSFTPLIKYVLVFI 461
++A+G++ RGD + V N F+ + T E + Q + +++LV +
Sbjct: 415 TREAMGFSDKRGDTLNVVNSPFSAVDN-----TGGELPFWQQQSFIDQLLAAGRWLLVLV 469

Query: 462 VLFIFYKKVIVPFSERMLEVVPDEDKEVKSMFEEMDEEEDELNKLGDLRKKVEDQLGLNA 521
V +I ++K + P R +E ++ + E + E L+K L+++ +Q
Sbjct: 470 VAWILWRKAVRPQLTRRVEEAKAAQEQAQVRQETEEAVEVRLSKDEQLQQRRANQ----- 524

Query: 522 TFSEEEVRYEIILEKIRGTLKERPDEIAMLFKLLIKDEISSD 563
+ E++ ++IR E D + L+I+ +S+D
Sbjct: 525 -----RLGAEVMSQRIR----EMSDNDPRVVALVIRQWMSND 557


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPAG1_0347  FLGMOTORFLIG  351    e-123    Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 351 bits (902), Expect = e-123
Identities = 122/338 (36%), Positives = 209/338 (61%), Gaps = 4/338 (1%)

Query: 8 KQKAQLDELSMSEKIAILLIQVGEDTTGEILRHLDIDSITEISKQIVQLNGTDKQIGAAV 67
K+ + L+ +K AILL+ +G + + ++ ++L + I ++ +I +L ++ V
Sbjct: 7 KEILDVSALTGKQKAAILLVSIGSEISSKVFKYLSQEEIESLTFEIAKLETITSELKDNV 66

Query: 68 LEEFFAIFQSNQYINTGGLEYARELLTRTLGSEEAKKVMDKLTKSLQTQKNFAYLGKIKP 127
L EF + + ++I GG++YARELL ++LG+++A +++ L +LQ+ + F ++ + P
Sbjct: 67 LLEFKELMMAQEFIQKGGIDYARELLEKSLGTQKAVDIINNLGSALQS-RPFEFVRRADP 125

Query: 128 QQLADFIINEHPQTIALILAHMEAPNAAETLSYFPDEMKAEISIRMANLGEISPQVVKRV 187
+ +FI EHPQTIALIL++++ A+ LS P E++ ++ R+A + SP+VV+ V
Sbjct: 126 ANILNFIQQEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEVVREV 185

Query: 188 STVLENKLESLTSYK-IEVGGLRAVAEIFNRLGQKSAKTTLARIESVDNKLAGAIKEMMF 246
VLE KL SL+S GG+ V EI N +K+ K + +E D +LA IK+ MF
Sbjct: 186 ERVLEKKLASLSSEDYTSAGGVDNVVEIINMADRKTEKFIIESLEEEDPELAEEIKKKMF 245

Query: 247 TFEDIVKLDNFAIREILKVADKKDLSLALKTSTKDLTDKFLNNMSSRAAEQFVEEMQYLG 306
FEDIV LD+ +I+ +L+ D ++L+ ALK+ + +K NMS RAA E+M++LG
Sbjct: 246 VFEDIVLLDDRSIQRVLREIDGQELAKALKSVDIPVQEKIFKNMSKRAASMLKEDMEFLG 305

Query: 307 AVKIKDVDVAQRKIIEIVQSLQEKG--VIQTGEEEDVI 342
+ KDV+ +Q+KI+ +++ L+E+G VI G EEDV+
Sbjct: 306 PTRRKDVEESQQKIVSLIRKLEEQGEIVISRGGEEDVL 343



Score = 30.2 bits (68), Expect = 0.010
Identities = 20/102 (19%), Positives = 41/102 (40%), Gaps = 3/102 (2%)

Query: 4 KLTPKQKAQLDELSMSEKIAILLIQVGEDTTGEILRHLDIDSITEISKQIVQLNGTDKQI 63
+ P + + IA++L + IL L + T ++++I ++ T ++
Sbjct: 122 RADPANILNFIQQEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEV 181

Query: 64 GAA---VLEEFFAIFQSNQYINTGGLEYARELLTRTLGSEEA 102
VLE+ A S Y + GG++ E++ E
Sbjct: 182 VREVERVLEKKLASLSSEDYTSAGGVDNVVEIINMADRKTEK 223


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
HPAG1_0348  FLGFLIH  38     2e-05    Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 37.9 bits (87), Expect = 2e-05
Identities = 45/207 (21%), Positives = 91/207 (43%), Gaps = 14/207 (6%)

Query: 50 PLEKKAIENDLIDCLLKKTDELSSHLVKLQMQFEKAQEES-KALIENAKNDGYKIGFKEG 108
E I + + L L +LQMQ A E+ +A I + G+K G++EG
Sbjct: 19 QAEFVPIVEPEETIIEEAEPSLEQQLAQLQMQ---AHEQGYQAGIAEGRQQGHKQGYQEG 75

Query: 109 EEKMRNELTHSVNEEKNQLLHAITALDEKMKKSQDHLMALE----KELSAIAIDIAKEVI 164
+ L + E K+Q + + + + Q L AL+ L +A++ A++VI
Sbjct: 76 ---LAQGLEQGLAEAKSQQAPIHARMQQLVSEFQTTLDALDSVIASRLMQMALEAARQVI 132

Query: 165 LKEVEDNSQKVALALAEELLKNVLDATDIHLKVNPLDYPYLNERLQNASKI---KLESNE 221
+ ++ + + + L + L + L+V+P D +++ L + +L +
Sbjct: 133 GQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGATLSLHGWRLRGDP 192

Query: 222 AISKGGVMITSSNGSLDGNLMERFKTL 248
+ GG +++ G LD ++ R++ L
Sbjct: 193 TLHPGGCKVSADEGDLDASVATRWQEL 219


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
HPAG1_0350  TCRTETOQM  113    3e-28    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 113 bits (283), Expect = 3e-28
Identities = 54/162 (33%), Positives = 89/162 (54%), Gaps = 7/162 (4%)

Query: 9 NIRNFSIIAHIDHGKSTLADCLISECNAIS---NREMKSQVMDTMDIEKERGITIKAQSV 65
I N ++AH+D GK+TL + L+ AI+ + + + D +E++RGITI+
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 66 RLNYTFKGEDYVLNLIDTPGHVDFSYEVSRSLCSCEGALLVVDATQGVEAQTIANTYIAL 125
+F+ E+ +N+IDTPGH+DF EV RSL +GA+L++ A GV+AQT +
Sbjct: 62 ----SFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALR 117

Query: 126 DNHLEILPVINKIDLPNANVLEVKQDIEDTIGIDCSNANEVS 167
+ + INKID ++ V QDI++ + + +V
Sbjct: 118 KMGIPTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVE 159



Score = 84.1 bits (208), Expect = 5e-19
Identities = 50/215 (23%), Positives = 90/215 (41%), Gaps = 17/215 (7%)

Query: 167 SAKAKLGIKDLLEKIITTIPAPSGDPNAPLKALIYDSWFDNYLGALALVRIMDGSINTEQ 226
SAK +GI +L+E I + + + L ++ + LA +R+ G ++
Sbjct: 220 SAKNNIGIDNLIEVITNKFYSSTHRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRD 279

Query: 227 EILVMGTGKKHGVLGLYYPNPLKKIPTKSLECGEIGIV---SLGLKSVTDIAVGDTLTDA 283
+ + K + +Y + GEI I+ L L SV +GDT
Sbjct: 280 SVRISEKEKI-KITEMYTSINGELCKIDKAYSGEIVILQNEFLKLNSV----LGDTKLL- 333

Query: 284 KNPTSKPIEGFMPAKPFVFAGLYPIETDRFEDLREALLKLQLNDCALNFEPESSVALGFG 343
P + IE P + + P + + E L +ALL++ +D L + +S+
Sbjct: 334 --PQRERIEN---PLPLLQTTVEPSKPQQREMLLDALLEISDSDPLLRYYVDSATH---E 385

Query: 344 FRVGFLGLLHMEVIKERLEREFGLNLIATAPTVVY 378
+ FLG + MEV L+ ++ + + PTV+Y
Sbjct: 386 IILSFLGKVQMEVTCALLQEKYHVEIEIKEPTVIY 420



Score = 31.0 bits (70), Expect = 0.015
Identities = 15/75 (20%), Positives = 28/75 (37%), Gaps = 2/75 (2%)

Query: 405 IKEPFVRATIITPSEFLGNLMQLLNNKRGIQEKMEYLNQSRVMLTYSLPSNEIVMDFYDK 464
+ EP++ I P E+L + L + V+L+ +P+ I ++
Sbjct: 535 LLEPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQ-LKNNEVILSGEIPARCI-QEYRSD 592

Query: 465 LKSCTKGYASFDYEP 479
L T G + E
Sbjct: 593 LTFFTNGRSVCLTEL 607


22  HPAG1_0497  HPAG1_0506  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPAG1_0497  7       17       -1.982556  cag pathogenicity island protein 3
HPAG1_0498  8       16       -2.647451  cag pathogenicity island protein 4
HPAG1_0499  9       17       -3.062684  cag pathogenicity island protein 5
HPAG1_0500  9       20       -3.341185  cag pathogenicity island encoded protein/ATPase
HPAG1_0501  9       20       -3.531634  cag pathogenicity island protein Z
HPAG1_0502  8       21       -3.507405  cag pathogenicity island protein Y
HPAG1_0503  11      22       -4.475513  cag pathogenicity island protein Y
HPAG1_0504  10      28       -4.555395  cag pathogenicity island protein X
HPAG1_0505  11      30       -4.600416  cag pathogenicity island protein W
HPAG1_0506  14      31       -5.138680  cag pathogenicity island protein V
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
HPAG1_0497  PF07201  30     0.020    Hypersensitivity response secretion protein HrpJ
		>PF07201#Hypersensitivity response secretion protein HrpJ

Length = 293

Score = 29.8 bits (67), Expect = 0.020
Identities = 14/76 (18%), Positives = 26/76 (34%), Gaps = 15/76 (19%)

Query: 277 APENSKEKLIEELIANSQLIANEEEREKKLLAEKEKQ--------EAELAKY--KLKDLE 326
S + EE+ E +E L K E ++ +Y K+ +LE
Sbjct: 44 GTLQSIADMAEEVTF-----VFSERKELSLDKRKLSDSQARVSDVEEQVNQYLSKVPELE 98

Query: 327 NQKKLKALKAELKKKN 342
++ + L + L
Sbjct: 99 QKQNVSELLSLLSNSP 114


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
HPAG1_0502  IGASERPTASE  40     7e-05    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 40.0 bits (93), Expect = 7e-05
Identities = 38/214 (17%), Positives = 79/214 (36%), Gaps = 5/214 (2%)

Query: 65 KARNEEERRACEKLLTPEAKKLLEEEAKESVKAYLDCVSQAKTEAEKKECEKLLTPEAKK 124
+ NEE R E + P A ++ + + ++KT + ++ T + ++
Sbjct: 1011 PSNNEEIARVDEAPVPPPAPAT---PSETTETVAENSKQESKTVEKNEQDATETTAQNRE 1067

Query: 125 KLEEAKKSVKAYLDCVSQAKNEAEKKECEKLLTPEAKKLLEQQALDCLKNAKTEADKKRC 184
+EAK +VKA A++ +E KE + T E + +++ E K
Sbjct: 1068 VAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTS 1127

Query: 185 VKDLPKDLQKKVLAKESLKAYKDCVSKARNEKEKKECEKLLTPEAKKLLEEEAKESVKAY 244
+ PK Q + + ++ A ++ + E + + T + K ++ V
Sbjct: 1128 -QVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTES 1186

Query: 245 LDCVSQAKNEAEKKECEKLLTPEAKKKLEEAKKS 278
V+ + E E T + E + K
Sbjct: 1187 -TTVNTGNSVVENPENTTPATTQPTVNSESSNKP 1219



Score = 38.1 bits (88), Expect = 3e-04
Identities = 33/155 (21%), Positives = 63/155 (40%), Gaps = 5/155 (3%)

Query: 355 VSKARNEKEKKECEKLLTPEARKLLEEAKESLKAYKDCVSKARNEEERRACEKLLTPE-A 413
SK + E+ E T + R++ +EAK ++KA A++ E + + T E A
Sbjct: 1047 ESKTVEKNEQDATET--TAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETA 1104

Query: 414 KKLLEEEAKESVKAYLDCVSQAKTEAEKKECEKLLTPEAKKKLEEAKKSVKAYLDCVSQA 473
EE+AK + + + K+E + + P+A+ E + SQ
Sbjct: 1105 TVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPAREND--PTVNIKEPQSQT 1162

Query: 474 KTEADKKECEKLLTPEAKKLLEQQALDCLKNAKTE 508
T AD ++ K + ++ + + N+ E
Sbjct: 1163 NTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVE 1197



Score = 37.4 bits (86), Expect = 5e-04
Identities = 40/242 (16%), Positives = 81/242 (33%), Gaps = 4/242 (1%)

Query: 395 KARNEEERRACEKLLTPEAKKLLEEEAKESVKAYLDCVSQAKTEAEKKECEKLLTPEAKK 454
+ NEE R E + P A ++ + + ++KT + ++ T + ++
Sbjct: 1011 PSNNEEIARVDEAPVPPPAPAT---PSETTETVAENSKQESKTVEKNEQDATETTAQNRE 1067

Query: 455 KLEEAKKSVKAYLDCVSQAKTEADKKECEKLLTPEAKKLLEQQALDCLKNAKTEADKKRC 514
+EAK +VKA A++ ++ KE + T E + +++ E K
Sbjct: 1068 VAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTS 1127

Query: 515 VKDLPKDLQKKVLAKKSVKAYLDCVSKARNEKEKKECEKLLTPEARKLLEEAKESLKAYK 574
+ PK Q + + ++ A + + E + + T + K E
Sbjct: 1128 -QVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTES 1186

Query: 575 DCVSKARNEKEKKECEKLLTPEARKLLEQEVKKSVKAYLDCVSRARNEKEKKECEKLLTP 634
V+ + E E T + E K + S N + +
Sbjct: 1187 TTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRST 1246

Query: 635 EA 636
A
Sbjct: 1247 VA 1248



Score = 36.2 bits (83), Expect = 0.001
Identities = 38/239 (15%), Positives = 80/239 (33%), Gaps = 36/239 (15%)

Query: 25 VSKARNEKEKKECEKLLTPEARKLLEEAKESLKAYKDCVSKARNEEERRACEKLLTPEAK 84
SK + E+ E T + R++ +EAK ++KA A++ E + + T E
Sbjct: 1047 ESKTVEKNEQDATET--TAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETA 1104

Query: 85 KLLEEEAKESVKAYLDCVSQAKTEAEKKECEKLLTPEAKKKLEEAKKSVKAYLDCVSQAK 144
+ +EE +AK E EK + +T + K Q +
Sbjct: 1105 TVEKEE-------------KAKVETEKTQEVPKVTSQVSPK----------------QEQ 1135

Query: 145 NEAEKKECEKLLTPEAKKLLEQQALDCLKNAKTEADKKRCVKDLPKDLQKKVLAKESLKA 204
+E + + E + +++ A TE K ++ + + +
Sbjct: 1136 SETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTG--- 1192

Query: 205 YKDCVSKARNEKEKKECEKLLTPEAKKLLEEEAKESVKAYLDCVSQAKNEAEKKECEKL 263
+ V + + + E+ + + SV++ V A + + L
Sbjct: 1193 --NSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVAL 1249


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
HPAG1_0503  IGASERPTASE  30     0.047    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 29.6 bits (66), Expect = 0.047
Identities = 32/214 (14%), Positives = 73/214 (34%), Gaps = 5/214 (2%)

Query: 408 RKELELQKELQEYKDCIKNAKTEAEKNECLKGLSKEAIERLK--QQALDCLKNAKTDEER 465
E + QE K KN + E + ++KEA +K Q + ++ +E
Sbjct: 1036 TTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKET 1095

Query: 466 NECLKNIPQDLQKELLADMSVKAYKDCVSKARNEKEKQECEKLLTPEARKKLEQQVLDCL 525
++KE A + + ++ KQE + + P+A E +
Sbjct: 1096 QTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNI 1155

Query: 526 KNAKTDEERKKCLKDLPKDLQSDI---LAKESVKAYKDCVSQAKTESEKKECEKLLTPEA 582
K ++ + K+ S++ + + + + V + + + + E+
Sbjct: 1156 KEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSES 1215

Query: 583 KKLLEEEAKESVKAYLDCVSQAKNEAEKKECEKL 616
+ + SV++ V A + + L
Sbjct: 1216 SNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVAL 1249


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
HPAG1_0504  TYPE4SSCAGX  880    0.0      Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein signature.

Length = 522

Score = 880 bits (2274), Expect = 0.0
Identities = 514/522 (98%), Positives = 516/522 (98%)

Query: 2 MGQAFFKKIVGCFCLGYLFLSSTIEAAALDIKNFNRGRVKVVNKKIAYLGDEKPITIWTS 61
MGQAFFKKIVGCFCLGYLFLSS IEA ALDIKNFNRGRVKVVNKKIAYLGDEKPITIWTS
Sbjct: 1 MGQAFFKKIVGCFCLGYLFLSSAIEAVALDIKNFNRGRVKVVNKKIAYLGDEKPITIWTS 60

Query: 62 LDNVTVIQLEKDETISYITTGFNKGWSIVPNSNHIFIQPKSVKSNLMFEKEAVNFALMTR 121
LDNVTVIQLEKDETISYITTGFNKGWSIVPNSNHIFIQPKSVKSNLMFEKEAVNFALMTR
Sbjct: 61 LDNVTVIQLEKDETISYITTGFNKGWSIVPNSNHIFIQPKSVKSNLMFEKEAVNFALMTR 120

Query: 122 DYQEFLKTKKLIVDAPDPKELEEQKKALEKEKEAKEQAQKAQKDKREKRKEERAKNRANL 181
DYQEFLKTKKLIVDAPDPKELEEQKKALEKEKEAKEQAQKAQKDKREKRKEERAKNRANL
Sbjct: 121 DYQEFLKTKKLIVDAPDPKELEEQKKALEKEKEAKEQAQKAQKDKREKRKEERAKNRANL 180

Query: 182 ENLTNAMSNPQNLSNNKNLSEFIKQQRENELDQMERLEDMQEQAQANALKQIEELNKKQA 241
ENLTNAMSNPQNLSNNKNLSE IKQQRENELDQMERLEDMQEQAQANALKQIEELNKKQA
Sbjct: 181 ENLTNAMSNPQNLSNNKNLSELIKQQRENELDQMERLEDMQEQAQANALKQIEELNKKQA 240

Query: 242 EETIKQRAKDKISIKTDKPQKSPEDNSIELSPSDSAWRTNLVVRTNKALYQFILRIAQKD 301
EE ++QRAKDKISIKTDK QKSPEDNSIELSPSDSAWRTNLVVRTNKALYQFILRIAQKD
Sbjct: 241 EEAVRQRAKDKISIKTDKSQKSPEDNSIELSPSDSAWRTNLVVRTNKALYQFILRIAQKD 300

Query: 302 NFASAYLTVKLEYPQRHEVSSVIEEELKKREEAKRQRELIKQENLNTTAYINRVMMASNE 361
NFASAYLTVKLEYPQRHEVSSVIEEELKKREEAKRQRELIKQENLNTTAYINRVMMASNE
Sbjct: 301 NFASAYLTVKLEYPQRHEVSSVIEEELKKREEAKRQRELIKQENLNTTAYINRVMMASNE 360

Query: 362 QIINKEKIREEKQKIILDQAKALETQYVHNALKRNPVPRNYNYYQAPEKRSKHIMPSEIF 421
QIINKEKIREEKQKIILDQAKALETQYVHNALKRNPVPRNYNYYQAPEKRSKHIMPSEIF
Sbjct: 361 QIINKEKIREEKQKIILDQAKALETQYVHNALKRNPVPRNYNYYQAPEKRSKHIMPSEIF 420

Query: 422 DDGTFTYFGFKNITLQPAIFVVQPDGKLSMTDAAIDPNMTNLGLRWYRVNEIAEKFKLIK 481
DDGTFTYFGFKNITLQPAIFVVQPDGKLSMTDAAIDPNMTN GLRWYRVNEIAEKFKLIK
Sbjct: 421 DDGTFTYFGFKNITLQPAIFVVQPDGKLSMTDAAIDPNMTNSGLRWYRVNEIAEKFKLIK 480

Query: 482 DKALVTVINKGYGKNPLTKNYNIKNYGELERVIKKLPLVRDK 523
DKALVTVINKGYGKNPLTKNYNIKNYGELERVIKKLPLVRDK
Sbjct: 481 DKALVTVINKGYGKNPLTKNYNIKNYGELERVIKKLPLVRDK 522


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
HPAG1_0506  PF04335  118    6e-35    VirB8 type IV secretion protein
		>PF04335#VirB8 type IV secretion protein

Length = 227

Score = 118 bits (298), Expect = 6e-35
Identities = 44/205 (21%), Positives = 74/205 (36%), Gaps = 10/205 (4%)

Query: 27 KLNKANRTFKRAFYL---SMALNVAAVTSIVMMMPLKKTDIFVYGIDRYTGEFKIVKRSD 83
KL A R+ K A+ + + AL A V ++ + PLK + +V +DR TGE I +
Sbjct: 24 KLAAAERSKKLAWVVAGVAGALATAGVVAVAALTPLKTVEPYVITVDRNTGEASIAAKLH 83

Query: 84 A-RQIVNSEAVVDSATSKFVSLLFGYSKNSLRDRKDQLMQYCDVSFQTQAMRMFNENIRQ 142
I EAV + +V G+ + + D +M Q + R + + Q
Sbjct: 84 GDATITYDEAVRKYFLATYVRYREGWIAAAREEYFDAVMVMSARPEQDRWSRFYKTDNPQ 143

Query: 143 FVDKVRA-EAIISSNIQREKVKNSPLTRLTFFITIKITPDTMENYEYITKKQVTIYYDFA 201
+ A + I + +F +T T TI Y
Sbjct: 144 SPQNILANRTDVFVEI-KRVSFLGGNVAQVYFTKESVTGSNS----TKTDAVATIKYKVD 198

Query: 202 RGNSSQENLIINPFGFKVFDIQITD 226
S + + NP G++V +
Sbjct: 199 GTPSKEVDRFKNPLGYQVESYRADV 223


23  HPAG1_0558  HPAG1_0564  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPAG1_0558  2       14       -1.155116  hypothetical protein
HPAG1_0559  1       13       -0.552126  hypothetical protein
HPAG1_0560  0       14       -0.537333  dihydroorotase
HPAG1_0561  1       17       -2.657465  putative siderophore-mediated iron transport
HPAG1_0562  -1      14       -2.923375  hypothetical protein
HPAG1_0563  -1      14       -2.375742  flagellar switch protein
HPAG1_0564  0       12       -1.123767  endonuclease III
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPAG1_0558  TYPE3IMSPROT  30     0.007    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 29.7 bits (67), Expect = 0.007
Identities = 19/64 (29%), Positives = 30/64 (46%), Gaps = 4/64 (6%)

Query: 89 LQSYSVMLFFNLLLLTDILGFLPFSIYHHFMASLIFSALFCGSLFLSSPLLGVIALVALS 148
L Y F L+L+ +LPFS S + + +L PLL V AL+A++
Sbjct: 45 LSDYYFEHFSKLMLIPAEQSYLPFSQ----ALSYVVDNVLLEFFYLCFPLLTVAALMAIA 100

Query: 149 SSLL 152
S ++
Sbjct: 101 SHVV 104


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
HPAG1_0561  TONBPROTEIN  49     5e-09    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 49.2 bits (117), Expect = 5e-09
Identities = 24/57 (42%), Positives = 28/57 (49%)

Query: 83 APKPTLAGPQKPPTPPTPPTPPTPPKPIEKPKPEPKPKPKPEPKKPNHKHKALKKVE 139
P P +P P P P P IEKPKP+PKPKPKP K + +K VE
Sbjct: 62 QPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVE 118



Score = 46.1 bits (109), Expect = 5e-08
Identities = 25/70 (35%), Positives = 32/70 (45%), Gaps = 8/70 (11%)

Query: 84 PKPTLAGPQKPPTPPTPPTPPTPPKPIEKPKPEPKPKPKPEPKKPNHKHKALKKVEKVEE 143
P + P +P P P P P P E P KPKPKP+PK K V+KV+E
Sbjct: 57 PPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKP--------KPVKKVQE 108

Query: 144 KKAVEEKKEE 153
+ + K E
Sbjct: 109 QPKRDVKPVE 118



Score = 38.4 bits (89), Expect = 2e-05
Identities = 16/54 (29%), Positives = 21/54 (38%)

Query: 74 QDPSKNNPGAPKPTLAGPQKPPTPPTPPTPPTPPKPIEKPKPEPKPKPKPEPKK 127
Q +P P P P PKP KPKP+P K + +PK+
Sbjct: 59 QAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKR 112



Score = 35.7 bits (82), Expect = 1e-04
Identities = 41/218 (18%), Positives = 73/218 (33%), Gaps = 40/218 (18%)

Query: 98 PTPPTPPTPPKPIEKPKPEPKPKPKPEPKKPNHKHKALKKVEKVEEKKAVEEKKEEKKIV 157
P P P +PEP+P+P PEP K V EK + K
Sbjct: 54 DLEPPQAVQPPPEPVVEPEPEPEPIPEPPKE---------------APVVIEKPKPKPKP 98

Query: 158 EQKVEQKVEHKKVEEKKPVKKEFDPNQLSFLPKEVAPPRQENNKGLDNQTRRDIDELYGE 217
+ K +KV+ + + KPV E P N T +
Sbjct: 99 KPKPVKKVQEQPKRDVKPV--------------ESRPASPFENTAPARLTSSTATAATSK 144

Query: 218 EFGDLGTAEKDFIRNNLRDIGRITQKYLEYPQVAAYLGQDGTNAVEFYLHPNGDISDLKI 277
+ + + RN + YP A L +G V+F + P+G + +++I
Sbjct: 145 PVTSVASGPRALSRNQPQ-----------YPARAQALRIEGQVKVKFDVTPDGRVDNVQI 193

Query: 278 IIGSEYKMLDDNTLKTIQIAYKDYPRPKTKTLIRIRVR 315
+ M + ++ + +P + ++ I +
Sbjct: 194 LSAKPANMFEREVKNAMRRWRYEPGKPGSGIVVNILFK 231



Score = 33.8 bits (77), Expect = 6e-04
Identities = 14/56 (25%), Positives = 22/56 (39%)

Query: 74 QDPSKNNPGAPKPTLAGPQKPPTPPTPPTPPTPPKPIEKPKPEPKPKPKPEPKKPN 129
+P P+P P++ P P P PKP K + +PK +P +
Sbjct: 65 PEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVESR 120



Score = 31.1 bits (70), Expect = 0.004
Identities = 12/52 (23%), Positives = 16/52 (30%)

Query: 75 DPSKNNPGAPKPTLAGPQKPPTPPTPPTPPTPPKPIEKPKPEPKPKPKPEPK 126
+P P + P P P P K E+PK + KP
Sbjct: 72 EPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVESRPAS 123


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPAG1_0563  FLGMOTORFLIN  99     2e-30    Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 99 bits (249), Expect = 2e-30
Identities = 25/77 (32%), Positives = 47/77 (61%)

Query: 34 LICDYKNLLDMEIVFSAELGSTQIPLLQILRFEKGSVIDLQKPAGESVDTFVNGRVIGKG 93
+ D ++D+ + + ELG T++ + ++LR +GSV+ L AGE +D +NG +I +G
Sbjct: 50 AMQDIDLIMDIPVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQG 109

Query: 94 EVMVFERNLAIRLNEIL 110
EV+V +R+ +I+
Sbjct: 110 EVVVVADKYGVRITDII 126


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPAG1_0564  OMS28PORIN    27     0.030    OMS28 porin signature.
		>OMS28PORIN#OMS28 porin signature.

Length = 257

Score = 27.4 bits (60), Expect = 0.030
Identities = 18/74 (24%), Positives = 38/74 (51%), Gaps = 2/74 (2%)

Query: 31 LFEKYPSVNDLALASLE--EVKEIIKSVSYSNNKSKHLINMAQKVVRDFKGVIPSTQKEL 88
+ K P+ +L L E +V+++ +++ S + AQKV+ G+ PS + ++
Sbjct: 164 MLNKSPNNKELELTKEEFAKVEQVKETLMASERALDETVQEAQKVLNMVNGLNPSNKDQV 223

Query: 89 MSLDGVGQKTANVV 102
++ V + +NVV
Sbjct: 224 LAKKDVAKAISNVV 237


24  HPAG1_0582  HPAG1_0590  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias    %GCBias  Product
HPAG1_0582      -2       12   0.716710  flagellin A
HPAG1_0583      -3       11   0.708362  3-methyladenine DNA glycosylase
HPAG1_0584      -2       12   1.060469  hypothetical protein
HPAG1_0585       0       10   0.320625  uroporphyrinogen decarboxylase
HPAG1_0586       1        9  -0.035601  outer-membrane protein of the hefABC efflux
HPAG1_0587       1        9  -0.137732  membrane fusion protein of the hefABC efflux
HPAG1_0588       1        9  -0.336576  cytoplasmic pump protein of the hefABC efflux
HPAG1_0589       2       10  -1.303874  hypothetical protein
HPAG1_0590       2       10  -1.329562  putative vacuolating cytotoxin (VacA)-like
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPAG1_0582  FLAGELLIN     243    1e-76    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 243 bits (622), Expect = 1e-76
Identities = 126/518 (24%), Positives = 209/518 (40%), Gaps = 22/518 (4%)

Query: 2 AFQVNTNINAMNAHVQSALTQNALKTSLERLSSGLRINKAADDASGMTVADSLHSQASSL 61
A +NTN ++ +Q++L +++ERLSSGLRIN A DDA+G +A+ S L
Sbjct: 1 AQVINTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGL 60

Query: 62 GQAIANTNDGMGIIQVADKAMDEQLKILDTVKVKATQAAQDGQTTESRKAIQSDIVRLIQ 121
QA N NDG+ I Q + A++E L V+ + QA + K+IQ +I + ++
Sbjct: 61 TQASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLE 120

Query: 122 GLDNIGNTTTYNGQALLSGQFTNKEFQVGAYSNQSIKASIGSTTSDKIGQVRI-ATGALI 180
+D + N T +NG +LS + QVGA ++I + +G G
Sbjct: 121 EIDRVSNQTQFNGVKVLSQDN-QMKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPKE 179

Query: 181 TASGDISLTFKQVDGVNDVTLESVKVSSSAGTGIGVLAEVINKNSNRTGVKAYASVITTS 240
GD+ +FK V G + + + K +G V ++ V A +TT
Sbjct: 180 ATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTTD 239

Query: 241 DVAVQSGSLSNLTLNGIHLGNIADIKKNDSDGRLVAAINAVTSETGVEAYTDQKGRLNLR 300
D N + K A A+ + + + +
Sbjct: 240 DAE-----------NNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTID 288

Query: 301 SIDGRGIEIKTDSVSNGPSALTMVNGGQDLTKGSTNYGRLSLTRLDAKSINV------VS 354
+ G K + NG V S + +N +
Sbjct: 289 TKTGNDGNGKVSTTINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKT 348

Query: 355 ASDSQHLGFTAIGFGESQVAETTVNLRDVTGNFNANVKSASGANYNAVIASGNQSL---G 411
++S L ++ TVN + T N + + +G + S
Sbjct: 349 KNESAKLSDLEANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFIDKTASGVSTLINE 408

Query: 412 SGVTTLRGAMVVIDIAESAMKMLDKVRSDLGSVQNQMISTVNNISITQVNVKAAESQIRD 471
+ + +SA+ +D VRS LG++QN+ S + N+ T N+ +A S+I D
Sbjct: 409 DAAAAKKSTANPLASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARSRIED 468

Query: 472 VDFAEESANFNKNNILAQSGSYAMSQANTVQQNILRLL 509
D+A E +N +K IL Q+G+ ++QAN V QN+L LL
Sbjct: 469 ADYATEVSNMSKAQILQQAGTSVLAQANQVPQNVLSLL 506


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPAG1_0583  PF05272       30     0.009    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.0 bits (67), Expect = 0.009
Identities = 13/95 (13%), Positives = 26/95 (27%), Gaps = 20/95 (21%)

Query: 60 ILENDDEINLKKIAYIEFSKLAECVRPSGFYNQKAKRLIDLSKNIVKDFQSFENFKQEVT 119
L + + +A+ E + VR + +KA E+
Sbjct: 458 ALRSAPALA-GCVAFDELREQPVAVRAFPW--RKAPGP-------------LEDADVLRL 501

Query: 120 REWLLDQKGIGKESADAILCYVCAKEVMVVDKYSY 154
+++ G G+ SA + D
Sbjct: 502 ADYVETTYGTGEASAQTTEQAINV----AADMNRV 532


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPAG1_0587  RTXTOXIND     51     8e-10    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 51.4 bits (123), Expect = 8e-10
Identities = 22/69 (31%), Positives = 34/69 (49%)

Query: 40 STGIVDSIKVTEGSVVKKGDVLLLLYNQDKQAQSDSTEQQLIFAKKQYQRYSKIGGAVDK 99
IV I V EG V+KGDVLL L +A + T+ L+ A+ + RY + +++
Sbjct: 103 ENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIEL 162

Query: 100 NTLESYEFN 108
N L +
Sbjct: 163 NKLPELKLP 171



Score = 32.1 bits (73), Expect = 0.002
Identities = 21/152 (13%), Positives = 48/152 (31%), Gaps = 25/152 (16%)

Query: 70 QAQSDSTEQQLIFAKKQYQR--YSKIGGAVDKNTLESYEFNYRRLESDYAYSIAVLNKTI 127
+++ S +++ + ++ K+ D + + E ++
Sbjct: 279 ESEILSAKEEYQLVTQLFKNEILDKLRQTTDN--IGLLTLELAKNEER-------QQASV 329

Query: 128 LRAPFDGVIASKNIQVGEGVSANNTVLLRLVSHARKLVIE--FDSKYINAVKVG------ 179
+RAP + + GV L+ +V L + +K I + VG
Sbjct: 330 IRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIK 389

Query: 180 -DTYTYSIDGDSNQHEAKITKIYP--TVDENT 208
+ + Y+ G K+ I D+
Sbjct: 390 VEAFPYTRYGYL---VGKVKNINLDAIEDQRL 418


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPAG1_0588  ACRIFLAVINRP  894    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 894 bits (2313), Expect = 0.0
Identities = 286/1040 (27%), Positives = 518/1040 (49%), Gaps = 42/1040 (4%)

Query: 1 MYKTAINRPITTLMFALAIVFFGTMGFKKLSVALFPKIDLPTVVVTTTYPGASAEIIESK 60
M I RPI + A+ ++ G + +L VA +P I P V V+ YPGA A+ ++
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VTDKIEEAVMGIDGIKKVTSTSSKNVSIVV-IEFELEKPNEEALNDVVNKISSVR-FDDS 118
VT IE+ + GID + ++STS S+ + + F+ + A V NK+
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 119 NIKKPSINKFDTDSQAIISLFVSSSSVPAT--TLNDYAKNTIKPMLQKINGVGGVQLNGF 176
+++ I+ + S ++ S + T ++DY + +K L ++NGVG VQL G
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFG- 179

Query: 177 RERQIRIYADPTLMNKYNLTYADLFSTLKAENVEIDGGRIVNS------QRELSILINAN 230
+ +RI+ D L+NKY LT D+ + LK +N +I G++ + Q SI+
Sbjct: 180 AQYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 231 SYSVADVEKIQV-----GNHVRLGDIAKIEIGLEEDNTFASFKDKPGVILEIQKIAGANE 285
+ + K+ + G+ VRL D+A++E+G E N A KP L I+ GAN
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANA 299

Query: 286 IEIVDRVYEALKHIQAISP-SYEIRPFLDTTSYIRTSIEDVKFDLILGAILAVLVVFAFL 344
++ + L +Q P ++ DTT +++ SI +V L +L LV++ FL
Sbjct: 300 LDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFL 359

Query: 345 RNGTITLVSAISIPISIMGTFALIQWMGFSLNMLTMVALTLAIGIIIDDAIVVIENIHK- 403
+N TL+ I++P+ ++GTFA++ G+S+N LTM + LAIG+++DDAIVV+EN+ +
Sbjct: 360 QNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERV 419

Query: 404 KLEMGMSKRKASYEGVREIGFALVAISAMLLSVFVPIGNMKGIIGRFFQSFGITVALAIA 463
+E + ++A+ + + +I ALV I+ +L +VF+P+ G G ++ F IT+ A+A
Sbjct: 420 MMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMA 479

Query: 464 LSYVVVVTIIPMVSSVVVNPRHS-------RFYVWSEPFFKALESRYTKLLQWVLNHKLI 516
LS +V + + P + + ++ P + F+ W F + YT + +L
Sbjct: 480 LSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 517 ISIAVVLVFVGSLFVASKIGMEFMLKEDRGRFLVWLKAKPGVSIDY----MTQKSKIFQK 572
+ L+ G + + ++ F+ +ED+G FL ++ G + + + Q + + K
Sbjct: 540 YLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLK 599

Query: 573 AIEKHAEVEFTTLQVGY-GTTQNPFKAKIFVQLKPLKERKKEGELGQFELMSVLRKELRS 631
+ + E FT + G QN FV LKP +ER + + + + EL
Sbjct: 600 NEKANVESVFTVNGFSFSGQAQNA--GMAFVSLKPWEERNGDENSAEAVIHR-AKMELGK 656

Query: 632 LPEAKGLDTINLSEVTLIGGGGDSSPFQTFVFSHSQEAVDKSVENLKKFLLESPELKGKV 691
+ + + N+ + G ++ F + + D + + L + + +
Sbjct: 657 IRDGFVI-PFNMPAIV---ELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASL 712

Query: 692 ESYHTSTSESQPQLQLKILRQNANKYGVSAQTIGSVVSSAFSGTSQASVFKEDGKEYDMI 751
S + E Q +L++ ++ A GVS I +S+A G + + F + G+ +
Sbjct: 713 VSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGG-TYVNDFIDRGRVKKLY 771

Query: 752 IRVPDDKRVSVEDIKRLQVRNKYDKLMFLDALVEITETKSPSSISRYNRQRSVTVLAEPN 811
++ R+ ED+ +L VR+ +++ A + RYN S+ + E
Sbjct: 772 VQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEA- 830

Query: 812 RNAGVSLGEILTQVSKNTKEWLVEGANYRFTGEADNAKESNGEFLVALATAFVLIYMILA 871
G S G+ + + +N L G Y +TG + + S + +A +FV++++ LA
Sbjct: 831 -APGTSSGDAMALM-ENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLA 888

Query: 872 ALYESILEPFIIMVTMPLSFSGAFFALGLVHQPLSMFSMIGLILLIGMVGKNATLLIDVA 931
ALYES P +M+ +PL G A L +Q ++ M+GL+ IG+ KNA L+++ A
Sbjct: 889 ALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFA 948

Query: 932 NE-ERKKGLNIQEAILFAGKTRLRPILMTTIAMVCGMLPLALASGDGAAMKSPIGIAMSG 990
+ K+G + EA L A + RLRPILMT++A + G+LPLA+++G G+ ++ +GI + G
Sbjct: 949 KDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMG 1008

Query: 991 GLMISMVLSLLIVPVFYRLL 1010
G++ + +L++ VPVF+ ++
Sbjct: 1009 GMVSATLLAIFFVPVFFVVI 1028


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPAG1_0590  VACCYTOTOXIN  282    4e-79    Helicobacter pylori vacuolating cytotoxin signature.
		>VACCYTOTOXIN#Helicobacter pylori vacuolating cytotoxin signature.

Length = 1291

Score = 282 bits (723), Expect = 4e-79
Identities = 106/397 (26%), Positives = 182/397 (45%), Gaps = 14/397 (3%)

Query: 2794 AGNNSILWLNELFVAKGGNPLFAPYYLQDNPTEHIVTLMKDITSALGMLSKPNLKNNSTD 2853
+G L L + + +A + I + T+ L ++ K +
Sbjct: 904 SGAQGRDLLQTLLI-DSHDAGYARTMIDATSANEITKQLNTATTTLNNIASLEHKTSGLQ 962

Query: 2854 VLQLNTYTQQMGRLAKLSNFASFDSTDFSERLSSLKNQRFADAIPNAMDVILKYSQRDKL 2913
L L+ RL LS + F++RL +LK+QRFA + +A +V+ +++ + +
Sbjct: 963 TLSLSNAMILNSRLVNLSRRHTNHIDSFAKRLQALKDQRFAS-LESAAEVLYQFAPKYEK 1021

Query: 2914 KNNLWATGVGGVSFVENGTGTLYGVNVGYDRFIKG---VIVGGYAAYGYSGFYER--ITN 2968
N+WA +GG S G +LYG + G D ++ G IVGG+ +YGYS F + N
Sbjct: 1022 PTNVWANAIGGTSLNSGGNASLYGTSAGVDAYLNGEVEAIVGGFGSYGYSSFSNQANSLN 1081

Query: 2969 SKSDNVDVGLYARAFIKKSELTFSVNETWGANKTQISSADTLLSMINQSYNYNTWTTNAR 3028
S ++N + G+Y+R F + E F G++++ ++ LL +NQSYNY ++ R
Sbjct: 1082 SGANNTNFGVYSRIFANQHEFDFEAQGALGSDQSSLNFKSALLRDLNQSYNYLAYSAATR 1141

Query: 3029 VNYGYDFMFKNKSVIIKPQIGLRYYYIGMTGLEGVMHNALYNQFKANADPSKKSVLTIDF 3088
+YGYDF F ++++KP +G+ Y ++G T + + S + +
Sbjct: 1142 ASYGYDFAFFRNALVLKPSVGVSYNHLGSTNFKS----NSNQKVALKNGASSQHLFNASA 1197

Query: 3089 AFENRHYFNKNSYFYAIGGVGRDLLVRSMGDKLVRFIGNNTLSYRKGELYNTFASITTGG 3148
E R+Y+ SYFY GV ++ + V + R NT A + GG
Sbjct: 1198 NVEARYYYGDTSYFYMNAGVLQEFANFGSSNA-VSLNTFKVNATRNP--LNTHARVMMGG 1254

Query: 3149 EVRLFKSFYANAGVGARFGLDYKMINITGNIGMRLAF 3185
E++L K + N G L + + N+GMR +F
Sbjct: 1255 ELKLAKEVFLNLGFVYLHNLISNIGHFASNLGMRYSF 1291


25  HPAG1_0883  HPAG1_0890  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias    %GCBias  Product
HPAG1_0883      -1       15   1.834591  acetate kinase
HPAG1_0884       0       15   1.898521  acetate kinase
HPAG1_0885       0       14   2.295646  phosphotransacetylase
HPAG1_0886       0       14   1.165895  phosphotransacetylase
HPAG1_0887       2       16   0.629658  phosphotransacetylase
HPAG1_0888       0       15   0.485443  hypothetical protein
HPAG1_0889       1       13  -0.143895  hook assembly protein, flagella
HPAG1_0890       0       12   0.924839  flagellar hook protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPAG1_0883  ACETATEKNASE  92     7e-26    Acetate kinase family signature.
		>ACETATEKNASE#Acetate kinase family signature.

Length = 400

Score = 92.2 bits (229), Expect = 7e-26
Identities = 33/72 (45%), Positives = 47/72 (65%), Gaps = 1/72 (1%)

Query: 4 MRNIEARK-EKGDKQAKLAFEMCAYRIKKYIGAYMVVLKKVDAIIFTGGLGENYSALRES 62
R++E + GDK+A+LA + AYR+KK IG+Y + VD I+FT G+GEN +RE
Sbjct: 283 FRDLEDAAFKNGDKRAQLALNVFAYRVKKTIGSYAAAMGGVDVIVFTAGIGENGPEIREF 342

Query: 63 VCEGLENLGIAL 74
+ +GLE LG L
Sbjct: 343 ILDGLEFLGFKL 354


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPAG1_0884  ACETATEKNASE  168    2e-53    Acetate kinase family signature.
		>ACETATEKNASE#Acetate kinase family signature.

Length = 400

Score = 168 bits (428), Expect = 2e-53
Identities = 67/155 (43%), Positives = 99/155 (63%), Gaps = 6/155 (3%)

Query: 2 MEILVLNLGSSSIKFKLFDMKENKPLASGLAEKIGEEIGQLKIKSHLHHNDQELKEKLVI 61
M+ILV+N GSSS+K++L + K+ LA GLAE+IG L N +++K K +
Sbjct: 1 MKILVINCGSSSLKYQLIESKDGNVLAKGLAERIGINDSLLTHN----ANGEKIKIKKDM 56

Query: 62 KDHASGLLMIRENLT--KMGIIKDFNQIDAIGHRVVQGGDKFHAPVLVDEKVMQEIGNLS 119
KDH + ++ + L G+IKD ++IDA+GHRVV GG+ F + VL+ + V++ I +
Sbjct: 57 KDHKDAIKLVLDALVNSDYGVIKDMSEIDAVGHRVVHGGEYFTSSVLITDDVLKAITDCI 116

Query: 120 ILAPLHNPANLAGIEFVQKAHPHILQIAVFDTAFH 154
LAPLHNPAN+ GI+ + P + +AVFDTAFH
Sbjct: 117 ELAPLHNPANIEGIKACTQIMPDVPMVAVFDTAFH 151


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPAG1_0888  IGASERPTASE   45     5e-07    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 45.4 bits (107), Expect = 5e-07
Identities = 45/231 (19%), Positives = 79/231 (34%), Gaps = 10/231 (4%)

Query: 287 KRDKTLSKKKSEKTPTKAQTTAPSITPENAPKIPLKTPPLMPLIGANPPLNNNAPTPLEK 346
KR++T+ TP Q PS+ N + P+ P A TP E
Sbjct: 987 KRNQTVDTTNIT-TPNNIQADVPSVPSNNEEIARVDEAPVPPPAPA---------TPSET 1036

Query: 347 EETTKEISDNKEKAKETNNSAQNAQNAQASDKTNENKSIAPKETIKHFTQQLKQEIQEYK 406
ET E S + K E N AQ + E KS T + Q E +E +
Sbjct: 1037 TETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQ 1096

Query: 407 PPMSRISMDLFPKELGKVEVIIQKVGKNLKVSVISHNNSLQTFLDNQQDLKNSLNALGFE 466
++ + + +E KVE + + V +T + + + + +
Sbjct: 1097 TTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIK 1156

Query: 467 GVDLSFSQDSSKEQPKEQLRELFKEQESSPLKENALKSYQENTDHENQETS 517
+ + EQP ++ ++ + N S EN ++ T+
Sbjct: 1157 EPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATT 1207



Score = 37.7 bits (87), Expect = 1e-04
Identities = 47/267 (17%), Positives = 86/267 (32%), Gaps = 9/267 (3%)

Query: 28 DTKNAPKSASKDFSKILNQKISKDKTAPKENPNA--LKATPKDAKEGAKEDAKTLEKTPT 85
DT N + +++ E P ATP + E E++K KT
Sbjct: 993 DTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVE 1052

Query: 86 PHHQHAQNLAKDQQAPTLKDWLNHKKTTASHEAQHEIHENHETNPKTPNETLNKNEKKSN 145
+ Q A + + N K T ++E E ET ET +++
Sbjct: 1053 KNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKA 1112

Query: 146 GVTSN------AHQANLTNKNPLTPTNHANHAIKTPTTPTHNAKEPKTLKDIQTLSQKHD 199
V + + ++ K + T PT N KEP++ + +
Sbjct: 1113 KVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTA-DTEQP 1171

Query: 200 LNANNIQATTTPENKTPLNAGDQFALKTTQTPTNHTLAKNDAKNTANLSSVLQSLEKKES 259
+ T +N G+ T T +++++ + + +
Sbjct: 1172 AKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVP 1231

Query: 260 HNKEHANLSNNEKKTPPLKEALQMNAI 286
HN E A S+N++ T L + N
Sbjct: 1232 HNVEPATTSSNDRSTVALCDLTSTNTN 1258


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPAG1_0890  FLGHOOKAP1    35     7e-04    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 35.3 bits (81), Expect = 7e-04
Identities = 12/33 (36%), Positives = 20/33 (60%)

Query: 2 NDTLLNAYSGIKTHQFGIDSLSNNIANVNTLGY 34
+ + NA SG+ Q +++ SNNI++ N GY
Sbjct: 1 SSLINNAMSGLNAAQAALNTASNNISSYNVAGY 33



Score = 33.0 bits (75), Expect = 0.004
Identities = 10/48 (20%), Positives = 20/48 (41%)

Query: 557 IRHKYLETSNVNAGNALTNLILMQRGYSMNARAFGAGDDMIKEAISLK 604
+ ++ S VN NL Q+ Y NA+ + + I+++
Sbjct: 499 LSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINIR 546


26  HPAG1_1077  HPAG1_1081  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias    %GCBias  Product
HPAG1_1077       2       18  -3.754921  spoOJ regulator
HPAG1_1078       2       20  -4.543415  biotin acetyl coenzyme A carboxylase synthetase
HPAG1_1079       3       21  -4.596349  methionyl-tRNA formyltransferase
HPAG1_1080       4       22  -5.388194  hypothetical protein
HPAG1_1081       6       21   0.594638  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPAG1_1077  PF07675       31     0.004    Cleaved Adhesin
		>PF07675#Cleaved Adhesin

Length = 1358

Score = 31.2 bits (70), Expect = 0.004
Identities = 30/105 (28%), Positives = 40/105 (38%), Gaps = 7/105 (6%)

Query: 70 QISQVILKTQMPFLDLVPSNLGLAGFEKTFYDSQDENKRGELMLKNALESVV---GLYDY 126
VI T F SNL A FE + D + ++ VV G+YDY
Sbjct: 414 TFGSVIPATGPLFTGTASSNLYSANFEYLTPANADPVVTTQNIIVTGQGEVVIPGGVYDY 473

Query: 127 IIIDSPPALGPLTINSLSAAHSVIIPIQCEFFALEGTKLLLNTIR 171
I + PA G + I A P + + FA E K T+R
Sbjct: 474 CITNPEPASGKMWI----AGDGGNQPARYDDFAFEAGKKYTFTMR 514


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPAG1_1079  FERRIBNDNGPP  32     0.003    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 31.8 bits (72), Expect = 0.003
Identities = 12/33 (36%), Positives = 20/33 (60%)

Query: 72 EPEVQILKDLKPDFIVVVAYGKILPKEVLTIAP 104
EP +++L ++KP F+V A P+ + IAP
Sbjct: 86 EPNLELLTEMKPSFMVWSAGYGPSPEMLARIAP 118


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPAG1_1080  PF01540       32     0.010    Adhesin lipoprotein
		>PF01540#Adhesin lipoprotein

Length = 475

Score = 32.0 bits (72), Expect = 0.010
Identities = 72/311 (23%), Positives = 131/311 (42%), Gaps = 28/311 (9%)

Query: 150 QKKIEDENSAETLIAKQESEIKKYNEEIEKIRKKVTS--RTIQITLDEIEINDFCKVSKN 207
Q+K++ N IA + +IK+ +E+ K+ +K+ S TI +T+ ++E F ++ +
Sbjct: 106 QQKVDQANKK---IADENLKIKEGAKELLKLSEKIQSFADTIALTITKLEGKKF-QIDET 161

Query: 208 HFKYQEDTLMNLEKDFNELDE-----AIKKFDDLKEMELPKDYQTI-KDKLESLFSFDID 261
K T+ L K E+ IKK L E+E K++ T +K+ S +
Sbjct: 162 FKKQLISTIELLNKKSAEVKTFATVNTIKKDFLLSELESFKEFNTSWLEKIVSEWEEVKK 221

Query: 262 KKAGQVSEKIKEHISKVGREF--IEKGIKLQKEMPNNACPFCTQKITNNIIQAYTSY--- 316
+ +++E E K+ E I++G K ++ F I I + +
Sbjct: 222 AWSKELAEIKAEDDKKLAEENQKIKEGAKELLKLSEKIQSF-ADTIALTITKLERKFQID 280

Query: 317 --FNKSIEQFNQDSLEISGTLKNILNQWNIKE--ILQSFERFEPFMEDFLKEKKS-LENA 371
F K + + + S +K IK+ +L E F+ F +L++ S E
Sbjct: 281 EKFKKQLISTIELLNKKSVEVKTFATVNTIKKDFLLSELESFKEFNTSWLEKIVSEWEEV 340

Query: 372 LEQIKALLEELQKEVDKKEGVKNKEKFQETDKELLEIQENIQQHVDETRNILNQKKEQEE 431
+ L E++ E DKK +N +K + +EL +I E +N+ + E
Sbjct: 341 KKAWSKELAEIKAEDDKKLAEEN-QKIKNGVEELKKINNEAF----ELSKTVNKTIAELE 395

Query: 432 KLKKLKTKLKE 442
K K+ KE
Sbjct: 396 KKFKIDVSFKE 406


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPAG1_1081  RTXTOXIND     42     3e-06    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 42.1 bits (99), Expect = 3e-06
Identities = 23/170 (13%), Positives = 61/170 (35%), Gaps = 18/170 (10%)

Query: 51 RAQYQSHFKALEQKEEALKERAKEQQAKFDEAVKHASVLALQDERAKIIEEARKNAFLEQ 110
+ Q+ + QKE L ++ E+ ++ ++ ++ R + +
Sbjct: 192 KEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAK 251

Query: 111 QKGLELLQKELDEKSKQVQELHQKEAEIERLKRENNEAESRLKAENEKKLNEKLDLEREK 170
LE + + E EL ++++E+++ E A+ + + + E
Sbjct: 252 HAVLEQ-ENKYVEAV---NELRVYKSQLEQIESEILSAKEEYQLVTQ-------LFKNEI 300

Query: 171 IEKALHEKNELKFKQQEEQLEMLRNELKNAQRKAELSSQQFQGEVQELAI 220
++K + +L + + +A +S +VQ+L +
Sbjct: 301 LDK--LRQTTDNIGLLTLELAKNEERQQASVIRAPVS-----VKVQQLKV 343


27  HPAG1_1506  HPAG1_1514  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias    %GCBias  Product
HPAG1_1506      -1       15   2.010481  flagellar basal-body protein
HPAG1_1507      -1       14   1.829555  flagellar basal-body rod protein
HPAG1_1508      -1       15   1.526682  flagellar basal-body rod protein
HPAG1_1509       1       13   1.612268  cell division protein
HPAG1_1510       0       14   0.118749  iron(III) ABC transporter, periplasmic
HPAG1_1511       2       14  -0.151027  iron(III) ABC transporter, periplasmic
HPAG1_1512       2       14  -0.122326  alkyl hydroperoxide reductase
HPAG1_1513       1       11  -0.659396  putative outer membrane lipoprotein
HPAG1_1514       2       12  -0.850378  penicillin-binding protein 2
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPAG1_1506  FLGHOOKFLIE   77     6e-22    Flagellar hook-basal body complex protein FliE signa...
		>FLGHOOKFLIE#Flagellar hook-basal body complex protein FliE signature.

Length = 103

Score = 77.0 bits (189), Expect = 6e-22
Identities = 19/77 (24%), Positives = 40/77 (51%), Gaps = 1/77 (1%)

Query: 34 EQKGGEFSKLLKQSINELNNTQEQSDKALADMATGQIK-DLHQAAIAIGKAETSMKLMLE 92
Q F+ L +++ +++TQ + G+ L+ + KA SM++ ++
Sbjct: 27 PQPTISFAGQLHAALDRISDTQTAARTQAEKFTLGEPGVALNDVMTDMQKASVSMQMGIQ 86

Query: 93 VRNKAISAYKELLRTQI 109
VRNK ++AY+E++ Q+
Sbjct: 87 VRNKLVAAYQEVMSMQV 103


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPAG1_1507  FLGHOOKAP1    29     0.011    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 28.8 bits (64), Expect = 0.011
Identities = 10/38 (26%), Positives = 15/38 (39%)

Query: 121 NVNAVVEMADLVEATRAYQANVAAFQSAKNMAQNAIGM 158
VN E +L + Y AN Q+A + I +
Sbjct: 508 GVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPAG1_1510  FERRIBNDNGPP  35     3e-04    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 34.9 bits (80), Expect = 3e-04
Identities = 28/183 (15%), Positives = 76/183 (41%), Gaps = 10/183 (5%)

Query: 108 NVELLKKLSPDLVVTFVG-NPKAVEHAKKFGISFLSFQETT--IAEAMQAMQ--AQAAVL 162
N+ELL ++ P +V G P A+ +F + +A A +++ A L
Sbjct: 88 NLELLTEMKPSFMVWSAGYGPSPEMLARIAPGRGFNFSDGKQPLAMARKSLTEMADLLNL 147

Query: 163 EIDASKKFAKMQETLDFIAERL-KNVKKKKGVELFHKAN--KISGHQAISSDILEKGGID 219
+ A A+ ++ + + R K + + + G ++ +IL++ GI
Sbjct: 148 QSAAETHLAQYEDFIRSMKPRFVKRGARPLLLTTLIDPRHMLVFGPNSLFQEILDEYGIP 207

Query: 220 N-FGLKYVKFGRADISVEKIVK-ENPEIIFIWWVSPLTPKDVLNNPKFSTIKAIKNKQVY 277
N + + +G +S++++ ++ +++ + ++ P + + ++ +
Sbjct: 208 NAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDHDNSKDMDALMATPLWQAMPFVRAGRFQ 267

Query: 278 KLP 280
++P
Sbjct: 268 RVP 270


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPAG1_1511  FERRIBNDNGPP  32     0.003    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 31.8 bits (72), Expect = 0.003
Identities = 30/183 (16%), Positives = 74/183 (40%), Gaps = 10/183 (5%)

Query: 104 NVELLKKLGPDLVVTFVGNPKAVEHAKKF--GILFLSFQEKTIAEVMEDID---AQAKAL 158
N+ELL ++ P +V G + E + G F K + A L
Sbjct: 88 NLELLTEMKPSFMVWSAGYGPSPEMLARIAPGRGFNFSDGKQPLAMARKSLTEMADLLNL 147

Query: 159 EIDASKKLAKMQETLDFIKERL-KNVKKKKGVELFHKAN--KISGHQALDSDILEKGGID 215
+ A LA+ ++ + +K R K + + + G +L +IL++ GI
Sbjct: 148 QSAAETHLAQYEDFIRSMKPRFVKRGARPLLLTTLIDPRHMLVFGPNSLFQEILDEYGIP 207

Query: 216 N-FGLKYVKFGRADISVEKIVK-ENPEIIFIWWISPLSPEDILNNPKFATIKAIKNKQVY 273
N + + +G +S++++ ++ +++ + + ++ P + + ++ +
Sbjct: 208 NAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDHDNSKDMDALMATPLWQAMPFVRAGRFQ 267

Query: 274 KLP 276
++P
Sbjct: 268 RVP 270


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPAG1_1514  TYPE3IMPPROT  29     0.029    Type III secretion system inner membrane P protein ...
		>TYPE3IMPPROT#Type III secretion system inner membrane P protein family signature.

Length = 224

Score = 29.4 bits (66), Expect = 0.029
Identities = 9/23 (39%), Positives = 12/23 (52%)

Query: 4 LRYKLLLFVFIGFWGLLVLNLFI 26
KL+LFV + W LL L +
Sbjct: 195 TPIKLVLFVALDGWTLLSKGLIL 217



 
Contact Sachin Pundhir for Bugs/Comments.