PredictBias

identification of genomic and pathogenicity islands in prokaryotic genome
A) Input parameters
Genome: F57.gbk
Threshold dinucleotide bias: 2
Threshold codon bias: 4
Threshold %GC bias: 3
E-value (RPSBlast): 0.05
Genome (non-pathogenic):
 
C) Potential GIs and PAIs in AP011945
S.No  Start       End         Bias  Virulence  Insertion elements  Prediction
1     HPF57_0064  HPF57_0108  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
HPF57_0064  2       19       -1.741119   hypothetical protein
HPF57_0065  3       20       -1.788521   hypothetical protein
HPF57_0066  2       17       -0.354364   hypothetical protein
HPF57_0067  5       19       -0.154117   hypothetical protein
HPF57_0068  3       18       0.811738    hypothetical protein
HPF57_0069  1       18       0.281492    hypothetical protein
HPF57_0070  4       18       -0.667014   hypothetical protein
HPF57_0071  3       17       -0.795281   hypothetical protein
HPF57_0072  2       14       0.179728    hypothetical protein
HPF57_0073  1       14       0.829150    hypothetical protein
HPF57_0074  2       13       1.172234    hypothetical protein
HPF57_0075  1       13       1.394482    hypothetical protein
HPF57_0076  0       14       1.628984    hypothetical protein
HPF57_0077  2       13       2.439068    hypothetical protein
HPF57_0078  5       22       3.610625    urease accessory protein
HPF57_0079  5       23       3.280531    urease accessory protein
HPF57_0080  4       20       2.626147    urease accessory protein
HPF57_0081  2       17       2.403498    urease accessory protein UreE
HPF57_0082  3       19       2.511857    urea transporter
HPF57_0083  1       16       2.452777    urease subunit alpha
HPF57_0084  -2      10       1.392208    bifunctional urease subunit gamma/beta
HPF57_0085  -1      12       1.050600    *lipoprotein signal peptidase
HPF57_0086  2       13       2.025924    urease protein
HPF57_0087  2       16       2.630742    30S ribosomal protein S20
HPF57_0088  4       16       2.629519    peptide chain release factor 1
HPF57_0089  4       15       1.401974    possible part of outer membrane protein HorA
HPF57_0090  4       15       1.024314    possible part of outer membrane protein HorA
HPF57_0091  3       15       1.371533    possible part of outer membrane protein HorA
HPF57_0092  3       13       1.096550    outer membrane protein HorA
HPF57_0093  2       12       0.725038    hypothetical protein
HPF57_0094  -2      14       0.239328    methyl-accepting chemotaxis transducer
HPF57_0095  -3      13       0.614928    methyl-accepting chemotaxis transducer
HPF57_0096  0       13       0.536983    30S ribosomal protein S9
HPF57_0097  0       12       -0.831807   50S ribosomal protein L13
HPF57_0098  0       11       -1.253888   hypothetical protein
HPF57_0099  0       10       -2.268351   hypothetical protein
HPF57_0100  0       11       -3.207019   hypothetical protein
HPF57_0101  3       14       -4.331796   RNA polymerase sigma factor RpoD
HPF57_0102  5       23       -7.474132   hypothetical protein
HPF57_0103  5       24       -7.113293   hypothetical protein
HPF57_0104  6       23       -6.992670   hypothetical protein
HPF57_0105  7       23       -7.028261   VirB2 type IV secretion protein
HPF57_0106  5       23       -5.601835   VirB3 type IV secretion protein
HPF57_0107  5       22       -4.968142   VirB4-like protein
HPF57_0108  4       21       -2.178849   topoisomerase I
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
HPF57_0083  UREASE  1045   0.0      Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 1045 bits (2703), Expect = 0.0
Identities = 354/569 (62%), Positives = 443/569 (77%), Gaps = 4/569 (0%)

Query: 3 KISRKEYVSMYGPTTGDKVRLGDTDLIAEVEHDYTIYGEELKFGGGKTLREGMSQSN-NP 61
++SR Y +M+GPT GDKVRL DT+L EVE D+T +GEE+KFGGGK +R+GM QS
Sbjct: 4 RMSRAAYANMFGPTVGDKVRLADTELFIEVEKDFTTHGEEVKFGGGKVIRDGMGQSQVTR 63

Query: 62 SKEELDLIITNALIVDYTGIYKADIGIKDGKIAGIGKGGNKDMQDGVKNNLSVGPATEAL 121
+D +ITNALI+D+ GI KADIG+KDG+IA IGK GN DMQ GV + VGP TE +
Sbjct: 64 EGGAVDTVITNALILDHWGIVKADIGLKDGRIAAIGKAGNPDMQPGV--TIIVGPGTEVI 121

Query: 122 AGEGLIVTAGGIDTHIHFISPQQIPTAFASGVTTMIGGGTGPADGTNATTITPGRRNLKW 181
AGEG IVTAGG+D+HIHFI PQQI A SG+T M+GGGTGPA GT ATT TPG ++
Sbjct: 122 AGEGKIVTAGGMDSHIHFICPQQIEEALMSGLTCMLGGGTGPAHGTLATTCTPGPWHIAR 181

Query: 182 MLRAAEEYSMNLGFLAKGNASNDASLADQIEAGAIGFKIHEDWGTTPSAINHALDVADKY 241
M+ AA+ + MNL F KGNAS +L + + GA K+HEDWGTTP+AI+ L VAD+Y
Sbjct: 182 MIEAADAFPMNLAFAGKGNASLPGALVEMVLGGATSLKLHEDWGTTPAAIDCCLSVADEY 241

Query: 242 DVQVAIHTDTLNEAGCVEDTMAAIAGRTMHTFHTEGAGGGHAPDIIKVAGEHNILPASTN 301
DVQV IHTDTLNE+G VEDT+AAI GRT+H +HTEGAGGGHAPDII++ G+ N++P+STN
Sbjct: 242 DVQVMIHTDTLNESGFVEDTIAAIKGRTIHAYHTEGAGGGHAPDIIRICGQPNVIPSSTN 301

Query: 302 PTIPFTVNTEAEHMDMLMVCHHLDKSIKEDVQFADSRIRPQTIAAEDTLHDMGIFSITSS 361
PT P+TVNT AEH+DMLMVCHHL +I ED+ FA+SRIR +TIAAED LHD+G FSI SS
Sbjct: 302 PTRPYTVNTLAEHLDMLMVCHHLSPTIPEDIAFAESRIRKETIAAEDILHDIGAFSIISS 361

Query: 362 DSQAMGRVGEVITRTWQTADKNKKEFGRLKEEKGDNDNFRIKRYLSKYTINPAIAHGISE 421
DSQAMGRVGEV RTWQTADK K++ GRLKEE GDNDNFR+KRY++KYTINPAIAHG+S
Sbjct: 362 DSQAMGRVGEVAIRTWQTADKMKRQRGRLKEETGDNDNFRVKRYIAKYTINPAIAHGLSH 421

Query: 422 YVGSVEVGKVADLVLWSPAFFGVKPNMIIKGGFIALSQMGDANASIPTPQPVYYREMFAH 481
+GS+EVGK ADLVLW+PAFFGVKP+M++ GG IA + MGD NASIPTPQPV+YR MF
Sbjct: 422 EIGSLEVGKRADLVLWNPAFFGVKPDMVLLGGTIAAAPMGDPNASIPTPQPVHYRPMFGA 481

Query: 482 HGKAKYDANITFVSQAAYDKGIKEELGLERQVLPVKNCR-NITKKDMQFNDTTAHIEVNP 540
+G+++ ++++TFVSQA+ D G+ LG+ ++++ V+N R I K M N T HIEV+P
Sbjct: 482 YGRSRTNSSVTFVSQASLDAGLAGRLGVAKELVAVQNTRGGIGKASMIHNSLTPHIEVDP 541

Query: 541 ETYHVFVDGKEVTSKPANKVSLAQLFSIF 569
ETY V DG+ +T +PA + +AQ + +F
Sbjct: 542 ETYEVRADGELLTCEPATVLPMAQRYFLF 570
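
Each hit on this page carries a statistics line of the form "Identities = a/n (p%), Positives = b/n (q%), Gaps = g/n (r%)". The following is a minimal sketch, not part of PredictBias, of how those counts could be read back out and the percentages recomputed. The function name parse_alignment_stats and the returned dictionary keys are illustrative assumptions; the Gaps field is treated as optional because some hits on this page omit it.

```python
import re

# Pattern for the BLAST-style statistics line shown in the hits on this page.
# The Gaps field is optional: ungapped alignments omit it.
_STATS = re.compile(
    r"Identities = (\d+)/(\d+) \(\d+%\), "
    r"Positives = (\d+)/\d+ \(\d+%\)"
    r"(?:, Gaps = (\d+)/\d+ \(\d+%\))?"
)


def parse_alignment_stats(line):
    """Return counts and recomputed percentages for one statistics line."""
    match = _STATS.search(line)
    if match is None:
        raise ValueError(f"not an alignment statistics line: {line!r}")
    identical, length, positives = (int(match.group(i)) for i in (1, 2, 3))
    gaps = int(match.group(4)) if match.group(4) else 0
    return {
        "identical": identical,
        "positives": positives,
        "gaps": gaps,
        "length": length,
        "pct_identity": 100.0 * identical / length,
        "pct_positives": 100.0 * positives / length,
    }


# Statistics line of the HPF57_0083 / UREASE hit, copied from this page.
urease = parse_alignment_stats(
    "Identities = 354/569 (62%), Positives = 443/569 (77%), Gaps = 4/569 (0%)"
)
print(f"{urease['pct_identity']:.1f}% identity over {urease['length']} columns")
```

For the UREASE hit this recomputes about 62.2% identity, which the page reports as the whole number 62%.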


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPF57_0092  ACRIFLAVINRP  28     0.023    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 28.3 bits (63), Expect = 0.023
Identities = 16/92 (17%), Positives = 32/92 (34%), Gaps = 4/92 (4%)

Query: 25 TNNYGQMYGVDAMAGYKWFFG--QTKRFGFRTYGYYSYNHANLSFVGSKLGIMEGASQVN 82
+ G+M A W +G + +R+ A + G + +ME + +
Sbjct: 791 RSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDAMALME--NLAS 848

Query: 83 NFAYGVGFDALYSFYESKDGYNTAGLFLGFGL 114
G+G+D Y+ + N A +
Sbjct: 849 KLPAGIGYDWTGMSYQERLSGNQAPALVAISF 880


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
HPF57_0101  IGASERPTASE  30     0.041    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 30.0 bits (67), Expect = 0.041
Identities = 21/143 (14%), Positives = 48/143 (33%), Gaps = 3/143 (2%)

Query: 4 KANEEKAPKRTKQETKTEAAQENKTKESKVKESKIKETKAKEPVPVKKLSFNEALEELF- 62
+ NE +ET+T +E T E + K E + P ++S + E
Sbjct: 1081 QTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQ 1140

Query: 63 --ANSLSDCVSYESIIQISAKVPTLAQVKKIKELCQKYQKKLVSSSEYAKKLNAIDKIKK 120
A + +I + ++ T A ++ + ++ V+ S N++ + +
Sbjct: 1141 PQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPE 1200

Query: 121 TEEKQKVLDEELEDGYDFLKEKD 143
+ + K +
Sbjct: 1201 NTTPATTQPTVNSESSNKPKNRH 1223
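
The three hits listed for island 1 all fall below the E-value (RPSBlast) setting of 0.05 given in the input parameters. As a rough sketch of how such a cutoff separates strong from marginal hits, the snippet below filters those three hits. It assumes a hit is reported when its E-value is at most the cutoff (the page does not state the exact rule), and the names Hit and filter_hits as well as the stricter cutoff of 1e-5 are hypothetical, chosen only for illustration.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    locus_tag: str
    profile: str
    evalue: float


# Locus tags, profile names and E-values copied from the island 1 hits above.
ISLAND_1_HITS = [
    Hit("HPF57_0083", "UREASE", 0.0),
    Hit("HPF57_0092", "ACRIFLAVINRP", 0.023),
    Hit("HPF57_0101", "IGASERPTASE", 0.041),
]


def filter_hits(hits: List[Hit], cutoff: float) -> List[Hit]:
    """Keep hits whose E-value does not exceed the cutoff (assumed rule)."""
    return [hit for hit in hits if hit.evalue <= cutoff]


# Cutoff from the input parameters on this page.
reported = filter_hits(ISLAND_1_HITS, 0.05)
# Hypothetical stricter cutoff, not a PredictBias setting.
strong = filter_hits(ISLAND_1_HITS, 1e-5)

print([hit.locus_tag for hit in reported])
print([hit.locus_tag for hit in strong])
```

Under the stricter hypothetical cutoff only the urease subunit alpha hit (HPF57_0083) remains, which matches the impression from the alignments: the other two hits cover short, low-identity regions.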


2     HPF57_0202  HPF57_0222  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
HPF57_0202  2       15       1.125561    serine hydroxymethyltransferase
HPF57_0203  1       15       0.475957    hypothetical protein
HPF57_0204  2       14       1.076584    hypothetical protein
HPF57_0205  1       13       2.910455    hypothetical protein
HPF57_0206  -1      10       2.997927    hypothetical protein
HPF57_0207  -2      9        2.224722    hypothetical protein
HPF57_0208  -2      9        2.384520    conserved hypothetical secreted protein
HPF57_0209  -1      10       3.255626    fumarate reductase iron-sulfur subunit
HPF57_0210  -1      11       3.360678    fumarate reductase flavoprotein subunit
HPF57_0211  -2      14       2.045438    fumarate reductase cytochrome b-556 subunit
HPF57_0212  -2      16       2.117877    triosephosphate isomerase
HPF57_0213  -2      17       3.204293    enoyl-(acyl carrier protein) reductase
HPF57_0214  -2      16       3.332044    UDP-3-O-[3-hydroxymyristoyl] glucosamine
HPF57_0215  -2      15       3.251031    S-adenosylmethionine synthetase
HPF57_0216  -2      17       2.337521    nucleoside diphosphate kinase
HPF57_0217  -3      18       1.729341    hypothetical protein
HPF57_0218  -1      11       -2.062683   50S ribosomal protein L32
HPF57_0219  0       12       -2.827117   fatty acid/phospholipid synthesis protein
HPF57_0220  2       14       -4.276475   3-oxoacyl-(acyl carrier protein) synthase III
HPF57_0221  2       14       -3.562161   hypothetical protein
HPF57_0222  2       13       -3.538021   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
HPF57_0204  IGASERPTASE  32     0.004    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 31.6 bits (71), Expect = 0.004
Identities = 25/152 (16%), Positives = 54/152 (35%), Gaps = 14/152 (9%)

Query: 50 PKETFLQTDSGMQKIGNTKDEKKDDAFESLNLDPSKQESDLDKVADNVKKQESDAFKMPI 109
P ET ++ T ++ + DA E+ + + V N Q ++ +
Sbjct: 1033 PSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKAN--TQTNEVAQSGS 1090

Query: 110 QTNQTQTEMKTTEETQEAKKELKAVEHTPMSAQKETQAVAKKETPHKKPKVTPKDKEAHK 169
+T +TQT T E +++ K + +E P +V+PK +++
Sbjct: 1091 ETKETQTTETKETATVEKEEKAKV------------ETEKTQEVPKVTSQVSPKQEQSET 1138

Query: 170 DKAKHAAKEPKAKKEAHKEVPKKANSKTNLTK 201
+ + KE + N+ + +
Sbjct: 1139 VQPQAEPARENDPTVNIKEPQSQTNTTADTEQ 1170


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPF57_0213  DHBDHDRGNASE  60     8e-13    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 60.1 bits (145), Expect = 8e-13
Identities = 61/263 (23%), Positives = 109/263 (41%), Gaps = 29/263 (11%)

Query: 4 LKGKKGLIVGVANNKSIAYGIAQSCFNQGATL-AFTYLNESLEKRVRPIAQELNSPYVYE 62
++GK I G A + I +A++ +QGA + A Y E LEK V + E +
Sbjct: 6 IEGKIAFITGAA--QGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFP 63

Query: 63 LDVSKEEHFKPLYDSVKKDLGSLDFIVHSVAF--------APKEALEGSLLETSKSAFNT 114
DV + +++++G +D +V+ E E + S FN
Sbjct: 64 ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNA 123

Query: 115 AMEISVYSLIELTNTLKPLLNNGASVLTLSYLGSTKYMAHYNVMGLAKAALESAVRYLAV 174
+ +S Y + + ++ + +N A V S MA Y +KAA + L +
Sbjct: 124 SRSVSKYMMDRRSGSIVTVGSNPAGVPRTS-------MAAY---ASSKAAAVMFTKCLGL 173

Query: 175 DLGKHNIRVNALSAGPIRT-----LASSGIADFRMILKWNE---INAPLRKNVSLEEVGN 226
+L ++NIR N +S G T L + ++I E PL+K ++ +
Sbjct: 174 ELAEYNIRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIAD 233

Query: 227 AGMYLLSSLSSGVSGEVHFVDAG 249
A ++L+S + ++ VD G
Sbjct: 234 AVLFLVSGQAGHITMHNLCVDGG 256


3     HPF57_0273  HPF57_0311  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
HPF57_0273  2       13       -0.106122   adenylosuccinate synthetase
HPF57_0274  2       14       -2.761108   hypothetical protein
HPF57_0275  0       17       -4.814712   conserved hypothetical secreted protein
HPF57_0276  1       19       -5.332256   hypothetical protein
HPF57_0277  2       21       -6.842820   exodeoxyribonuclease VII large subunit
HPF57_0278  6       28       -9.471485   Type II DNA modification enzyme
HPF57_0279  8       29       -9.280587   hypothetical protein
HPF57_0280  8       29       -9.344349   hypothetical protein
HPF57_0281  8       30       -8.833576   virB2 type IV secretion protein
HPF57_0282  7       25       -8.078855   virB3 type IV secretion protein
HPF57_0283  6       24       -7.445072   DNA transfer protein
HPF57_0284  5       22       -6.467866   topoisomerase I
HPF57_0285  4       20       -6.042195   hypothetical protein
HPF57_0286  5       20       -6.266964   hypothetical protein
HPF57_0287  5       19       -6.314674   Component of conjugal plasmid transfer system
HPF57_0288  4       26       -7.518988   ComB3 protein
HPF57_0289  6       31       -9.750581   hypothetical protein
HPF57_0290  5       32       -10.584237  hypothetical protein
HPF57_0291  4       28       -8.447197   hypothetical protein
HPF57_0292  4       27       -7.908302   VIRB11
HPF57_0293  2       26       -7.818151   hypothetical protein
HPF57_0294  5       26       -8.544638   hypothetical protein
HPF57_0295  4       20       -7.461700   hypothetical protein
HPF57_0296  5       19       -6.849212   conjugal transfer protein
HPF57_0297  6       16       -6.477072   hypothetical protein
HPF57_0298  5       15       -4.304290   hypothetical protein
HPF57_0299  5       15       -4.604838   hypothetical protein
HPF57_0300  5       14       -3.862220   hypothetical protein
HPF57_0301  5       16       -4.109437   hypothetical protein
HPF57_0302  5       16       -4.525027   PARA protein
HPF57_0303  4       17       -4.039367   hypothetical protein
HPF57_0304  4       26       -6.342495   hypothetical protein
HPF57_0305  4       29       -6.625785   hypothetical protein
HPF57_0306  4       26       -6.973965   hypothetical protein
HPF57_0307  4       27       -6.788909   hypothetical protein
HPF57_0308  4       26       -6.353444   hypothetical protein
HPF57_0309  5       30       -8.950753   hypothetical protein
HPF57_0310  3       27       -8.647503   hypothetical protein
HPF57_0311  0       25       -6.697395   integrase/recombinase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
HPF57_0286  PF04335  99     1e-26    VirB8 type IV secretion protein
		>PF04335#VirB8 type IV secretion protein

Length = 227

Score = 98.7 bits (246), Expect = 1e-26
Identities = 35/202 (17%), Positives = 75/202 (37%), Gaps = 18/202 (8%)

Query: 94 TERKIGDWIFSSAVFFFALALIEAIIIICLLPLKEKVPYLVTFSNATQNFAIVQR--ADK 151
+K+ + A ALA + + L PLK PY++T T +I + D
Sbjct: 30 RSKKLAWVV---AGVAGALATAGVVAVAALTPLKTVEPYVITVDRNTGEASIAAKLHGDA 86

Query: 152 SIRANQALVRQLVASYVNNRE--NISNIKEQNEIAHETIRLQSAFEVWDFFEKLVSYEH- 208
+I ++A+ + +A+YV RE + +E + + + SA D + + ++
Sbjct: 87 TITYDEAVRKYFLATYVRYREGWIAAAREEY----FDAVMVMSARPEQDRWSRFYKTDNP 142

Query: 209 ----SIYTNINLTRKISIINIALISKTQANIEISAQLFNKEKLESEKRYRIIMTFEFEPI 264
+I N + I ++ + A + + + ++ + ++ +
Sbjct: 143 QSPQNILAN-RTDVFVEIKRVSFLGGNVAQVYFT-KESVTGSNSTKTDAVATIKYKVDGT 200

Query: 265 EIDTKSVPLNPTGFIVTGYDVT 286
NP G+ V Y
Sbjct: 201 PSKEVDRFKNPLGYQVESYRAD 222


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
HPF57_0308  cloacin  37     3e-04    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 36.6 bits (84), Expect = 3e-04
Identities = 24/95 (25%), Positives = 36/95 (37%), Gaps = 20/95 (21%)

Query: 166 INGKDGANGNNSNNNAVGSGIDTDGVLGVDGVNGSSSSSGGSVGGYENNFTNHGSTNNNT 225
++G DG N ++ G+ +NG + G G ++ + S NN
Sbjct: 1 MSGGDGRGHNTGAHSTSGN------------INGGPTGLGVGGGA--SDGSGWSSENNPW 46

Query: 226 GGYDNFNNGSSSGGGLGNGGLFPIPFGNGDANNSS 260
GG G G GNGG GNG++ S
Sbjct: 47 GGGSGSGIHWGGGSGHGNGG------GNGNSGGGS 75


4     HPF57_0363  HPF57_0392  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
HPF57_0363  2       16       -0.262789   hypothetical protein
HPF57_0364  1       14       -0.650890   hypothetical protein
HPF57_0365  0       13       -1.506643   Type II restriction enzyme
HPF57_0366  0       11       -0.809831   DNA methylase
HPF57_0367  0       13       0.119777    hypothetical protein
HPF57_0368  0       14       -0.253212   conserved hypothetical ATP-binding protein
HPF57_0369  1       15       -0.995306   nitrite extrusion protein
HPF57_0370  3       15       -1.272145   hypothetical protein
HPF57_0371  2       15       -1.503674   arginyl-tRNA synthetase
HPF57_0372  2       12       -0.963709   sec-independent protein translocase protein
HPF57_0373  1       12       -1.247387   guanylate kinase
HPF57_0374  1       11       -1.348479   poly E-rich protein
HPF57_0375  -2      11       -1.841950   membrane bound endonuclease
HPF57_0376  0       11       -1.878311   outer membrane protein HorC
HPF57_0377  3       13       -1.787345   flagellar basal body L-ring protein
HPF57_0378  4       13       -1.597750   CMP-N-acetylneuraminic acid synthetase
HPF57_0379  3       12       -0.918663   CMP-N-acetylneuraminic acid synthetase
HPF57_0380  3       12       -0.808419   flagellar biosynthesis protein G
HPF57_0381  2       13       0.760526    tetraacyldisaccharide 4'-kinase
HPF57_0382  2       15       1.738069    NH(3)-dependent NAD+ synthetase
HPF57_0383  0       14       1.514741    *ketol-acid reductoisomerase
HPF57_0384  1       17       1.216372    cell division inhibitor
HPF57_0385  1       17       1.466223    cell division topological specificity factor
HPF57_0386  2       19       1.925285    hypothetical protein
HPF57_0387  3       20       1.419281    Holliday junction resolvase-like protein
HPF57_0388  4       23       0.452903    cysteine-rich protein C
HPF57_0389  5       27       1.276101    hypothetical protein
HPF57_0390  5       26       1.138656    hypothetical protein
HPF57_0391  5       22       -1.321788   lysozyme-like protein
HPF57_0392  2       17       -0.825331   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
HPF57_0369  TCRTETA  48     4e-08    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 47.5 bits (113), Expect = 4e-08
Identities = 57/271 (21%), Positives = 106/271 (39%), Gaps = 16/271 (5%)

Query: 28 LILSGSLTPHQSFQLGIAVLMGYVFGSFLIQFLSPLMSLESIAKISFALIALSFLICYFD 87
L+ S +T H L + LM + L LS + +S A A+ + I
Sbjct: 35 LVHSNDVTAHYGILLALYALMQFACAPVLGA-LSDRFGRRPVLLVSLAGAAVDYAI--MA 91

Query: 88 SIPFFW-LWIWRFIAGVASSALMILVAPLSLPYVKENKKALVGGFIFSAVGIGSVFSGFV 146
+ PF W L+I R +AG+ + A + +++A GF+ + G G V +
Sbjct: 92 TAPFLWVLYIGRIVAGI-TGATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVL 150

Query: 147 LPWISSYNIKWAWIFLGGSCLIAFILSLIGLKN-HSLKKKSVKKEESAFKIPFHL----- 200
+ ++ + + F+ L H +++ +++E F
Sbjct: 151 GGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMT 210

Query: 201 ---WLLLISCALNAIGFLPHTLFWVDYLIRHLNISPTTAGTSWALFG-FGATLGSLISGP 256
L+ + + +G +P L WV + + TT G S A FG + ++I+GP
Sbjct: 211 VVAALMAVFFIMQLVGQVPAAL-WVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGP 269

Query: 257 MAQKLGAKNANIFILILKSIACFLPIFFHQI 287
+A +LG + A + +I L F +
Sbjct: 270 VAARLGERRALMLGMIADGTGYILLAFATRG 300


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
HPF57_0373  PF05272  29     0.011    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.011
Identities = 9/18 (50%), Positives = 11/18 (61%)

Query: 8 LILSGPSGAGKSTLTKYL 25
++L G G GKSTL L
Sbjct: 599 VVLEGTGGIGKSTLINTL 616


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
HPF57_0374  IGASERPTASE  67     1e-13    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 66.6 bits (162), Expect = 1e-13
Identities = 59/293 (20%), Positives = 92/293 (31%), Gaps = 25/293 (8%)

Query: 140 ELENLGDLEALVQEEPNNEEQLLPTLNDQEEKEEVKEEEKEEVKETPQEEKPKDDETQES 199
E+E N Q EE V E P P E+
Sbjct: 984 EVEKRNQTVDTTNITTPNNIQA------DVPSVPSNNEEIARVDEAP-VPPPAPATPSET 1036

Query: 200 ETLKDEEVSKELETQEELEIPKEETQEQAKEQEPIKEETQEIKEEKQEKTQEEVK---EE 256
E +E +T E+ E ET Q +E KE +K Q + +E
Sbjct: 1037 TETVAENSKQESKTVEKNEQDATETTAQNREVA--KEAKSNVKANTQTNEVAQSGSETKE 1094

Query: 257 TQEQVKEQEPIKKETQEIKEEKQEKTQDSPSVQELEAMQELVKEIQENSNGQEDKKETQE 316
TQ ++ ++ ++ K E EKTQ+ P V + ++ E + + +
Sbjct: 1095 TQTTETKETATVEKEEKAKVET-EKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTV 1153

Query: 317 NTETPQETPQDIEVQESAETPQEIPQEKEIPQEKEIPQEKEIPQEKETQKLETPQETSQE 376
N + PQ Q + E P KE E P + +E P+ T+
Sbjct: 1154 NIKEPQS-------QTNTTADTEQPA-KETSSNVEQPVTESTTVNTGNSVVENPENTTPA 1205

Query: 377 SAEKTQKLETQE----DHYESIEDIPEPVMAQAMGEELPFLNESVAKTSNNEN 425
+ + T E+ H S+ +P V TS N N
Sbjct: 1206 TTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCDLTSTNTN 1258



Score = 55.8 bits (134), Expect = 3e-10
Identities = 51/252 (20%), Positives = 76/252 (30%), Gaps = 17/252 (6%)

Query: 202 LKDEEVSKELETQEELEIPKEETQEQAKEQEPIK-EETQEIKEEKQEKTQEEVKEETQEQ 260
L + EV K +T + I + P EE + E ET E
Sbjct: 980 LYNPEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTET 1039

Query: 261 VKEQEPIKKETQEIKEEKQEKTQDSPSVQELEAMQELVKEIQENSNGQEDKKETQENTET 320
V E + +T E E+ +T EA + Q N Q ET+E T
Sbjct: 1040 VAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQS-GSETKETQTT 1098

Query: 321 PQETPQDIEVQESA----ETPQEIPQEKEIPQEKEIPQEKEIPQEKETQKLETPQETSQE 376
+ +E +E A E QE+P+ K+ E PQ E + P +E
Sbjct: 1099 ETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQ-AEPARENDPTVNIKE 1157

Query: 377 SAEKTQKLETQEDHYESIEDIPEPVMAQAMGEELPFLNESVAKTSNNENDTETSKESVIK 436
+T E + E P + T N+ + +
Sbjct: 1158 PQSQTNTTADTEQPAKETSSNVEQ----------PVTESTTVNTGNSVVENPENTTPATT 1207

Query: 437 TPQEKEESDKTS 448
P ES
Sbjct: 1208 QPTVNSESSNKP 1219


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPF57_0377  FLGLRINGFLGH  190    4e-63    Flagellar L-ring protein signature.
		>FLGLRINGFLGH#Flagellar L-ring protein signature.

Length = 232

Score = 190 bits (485), Expect = 4e-63
Identities = 51/172 (29%), Positives = 84/172 (48%), Gaps = 18/172 (10%)

Query: 56 GERPLFADRRAMKPNDLITIIVSEKASANYSSS----KDYKSASGGNSTPPRLTYNGLDE 111
G +PLF DRR D +TI++ E SA+ SSS +D K+ G ++ P L GL
Sbjct: 61 GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYL--QGLFG 118

Query: 112 RKKQEAQYLDDKNNYNFTKSSNNTNFKGGGSQKKSEDLEIVLSARIIKVLENGNYFIYGN 171
+ + + S F G G S L+ + +VL NGN + G
Sbjct: 119 NARADVEA------------SGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGE 166

Query: 172 KEVLVDGEKQILKVSGVIRPYDIERNNTIQSKFLADAKIEYTNLGHLSDSNK 223
K++ ++ + ++ SGV+ P I +NT+ S +ADA+IEY G+++++
Sbjct: 167 KQIAINQGTEFIRFSGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQN 218


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPF57_0380  SACTRNSFRASE  28     0.015    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 28.0 bits (62), Expect = 0.015
Identities = 15/49 (30%), Positives = 21/49 (42%), Gaps = 3/49 (6%)

Query: 102 RGETILKALERIAFE---EFQLNSLHLEVMENNFKAIAFYEKNHYELEG 147
R + + AL A E E L LE + N A FY K+H+ +
Sbjct: 102 RKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFIIGA 150


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPF57_0390  PREPILNPTASE  29     0.014    Type IV prepilin cysteine protease (C20) family sig...
		>PREPILNPTASE#Type IV prepilin cysteine protease (C20) family signature.

Length = 290

Score = 28.6 bits (64), Expect = 0.014
Identities = 18/59 (30%), Positives = 25/59 (42%), Gaps = 8/59 (13%)

Query: 32 FVIVAWLFRF--KSIAFSILITLLVILVDIWVYSDVRQFLL-DTASSPILLLAALLIKW 87
V VA ++A +L +LV L I D+ + LL D + P LL LL
Sbjct: 121 SVAVAMTLAPGWGTLAALLLTWVLVALTFI----DLDKMLLPDQLTLP-LLWGGLLFNL 174


5     HPF57_0499  HPF57_0526  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
HPF57_0499  2       24       -2.160642   outer membrane protein HorE
HPF57_0500  2       17       -2.228381   molybdenum ABC transporter ModA fragment
HPF57_0501  0       15       -2.404925   molybdenum ABC transporter ModB, split, N
HPF57_0502  -1      11       -3.563950   molybdenum ABC transporter ModB, split, C
HPF57_0503  0       10       -4.026691   molybdenum ABC transporter ModD, split, N
HPF57_0504  -1      8        -2.231070   molybdenum ABC transporter ModD, split, C
HPF57_0505  -1      9        -2.233858   glutamyl-tRNA synthetase
HPF57_0506  -1      12       -2.833035   outer membrane protein HopK
HPF57_0507  -1      13       -2.906794   Type II adenine specific methyltransferase
HPF57_0508  -1      15       -1.556140   Type II adenine specific methyltransferase
HPF57_0509  0       18       -1.273953   hypothetical protein
HPF57_0510  2       23       -4.742288   Type II adenine specific DNA methyltransferase
HPF57_0511  0       20       -3.802195   Type II adenine specific DNA methyltransferase
HPF57_0512  5       16       -0.805538   Type II restriction endonuclease
HPF57_0513  6       16       0.075289    Type II DNA modification enzyme
HPF57_0514  2       17       -0.394920   Type II restriction enzyme
HPF57_0515  2       17       -0.219442   Type II restriction enzyme
HPF57_0516  2       16       0.234553    catalase-like protein
HPF57_0517  3       17       0.423323    outer membrane protein HofC
HPF57_0518  1       15       -1.177975   outer membrane protein HofD
HPF57_0519  2       16       -1.208310   hypothetical protein
HPF57_0520  0       11       -1.240658   hypothetical protein
HPF57_0521  0       11       -1.274012   putative potassium channel protein
HPF57_0522  0       11       -1.692090   50S ribosomal protein L28
HPF57_0523  1       12       -1.986505   adhesion HpaA homologue HpaA2
HPF57_0524  1       13       -1.689753   phospho-N-acetylmuramoyl-pentapeptide-
HPF57_0525  1       14       -1.954495   UDP-N-acetylmuramoyl-L-alanyl-D-glutamate
HPF57_0526  2       16       -1.315827   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
HPF57_0503  PF05272  30     0.004    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.0 bits (67), Expect = 0.004
Identities = 11/23 (47%), Positives = 14/23 (60%)

Query: 30 VVALLGETGAGKSTILRILAGLE 52
V L G G GKST++ L GL+
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGLD 620


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
HPF57_0509  TCRTETOQM  198    2e-57    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 198 bits (504), Expect = 2e-57
Identities = 115/461 (24%), Positives = 190/461 (41%), Gaps = 67/461 (14%)

Query: 3 NIRNIAVIAHVDHGKTTLVDGLLSQSGTFSEREKVDE--RVMDSNDLERERGITILSKNT 60
I NI V+AHVD GKTTL + LL SG +E VD+ D+ LER+RGITI + T
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 61 AIYYKDTKINIIDTPGHADFGGEVERVLKMVDGVLLLVDAQEGVMPQTKFVVKKALSFGI 120
+ +++TK+NIIDTPGH DF EV R L ++DG +LL+ A++GV QT+ + GI
Sbjct: 62 SFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGI 121

Query: 121 CPIVVVNKIDKPAAEPDRVVDEVFDLF---------VAMGASDKQLDFPV-----VYAAA 166
I +NKID+ + V ++ + V + + +F
Sbjct: 122 PTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEG 181

Query: 167 RDGYAMKSLDDE----------------------------KKNL--EPLFETILEHVPSP 196
D K + + K N+ + L E I S
Sbjct: 182 NDDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVITNKFYSS 241

Query: 197 SGSVDEPLQMQIFTLDYDNYVGKIGIARVFNGSVKKNESVLLMKSDGSKENGRITKLIGF 256
+ L ++F ++Y ++ R+++G + +SV + KE +IT++
Sbjct: 242 THRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRI----SEKEKIKITEMYTS 297

Query: 257 LGLARTEIENAYAGDIVAIAG--FSAMDV-GDSVVDPTNPMPLDPMHLEEPTMSVYFAVN 313
+ +I+ AY+G+IV + V GD+ + P +P P + +
Sbjct: 298 INGELCKIDKAYSGEIVILQNEFLKLNSVLGDTKLLPQRERIENP----LPLLQTTVEPS 353

Query: 314 DSPLAGLEGKHVTANKLKDRLLKEMQTNIAMKCEEMGEGKFKVSGRGELQITILAENLRR 373
+ + D LL+ + + +S G++Q+ + L+
Sbjct: 354 KPQQREMLLDALLEISDSDPLLRYYVDSAT--------HEIILSFLGKVQMEVTCALLQE 405

Query: 374 E-GFEFSISRPEVIIKEENGVKCEPFEHLVIDTPQDFSGAI 413
+ E I P VI E K E H+ + P F +I
Sbjct: 406 KYHVEIEIKEPTVIYMERPLKKAEYTIHIEVP-PNPFWASI 445



Score = 41.8 bits (98), Expect = 8e-06
Identities = 20/80 (25%), Positives = 30/80 (37%), Gaps = 1/80 (1%)

Query: 396 EPFEHLVIDTPQDFSGAIIERLGKRKAEMKAMNPMSDGYTRLEFEIPARGLIGYRSEFLT 455
EP+ I PQ++ K A + + + L EIPAR + YRS+
Sbjct: 537 EPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQ-LKNNEVILSGEIPARCIQEYRSDLTF 595

Query: 456 DTKGEGVMNHSFLEFRPFSG 475
T G V + +G
Sbjct: 596 FTNGRSVCLTELKGYHVTTG 615


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
HPF57_0523  PF05211  242    2e-83    Neuraminyllactose-binding hemagglutinin
		>PF05211#Neuraminyllactose-binding hemagglutinin

Length = 260

Score = 242 bits (618), Expect = 2e-83
Identities = 53/214 (24%), Positives = 99/214 (46%), Gaps = 19/214 (8%)

Query: 2 HVAQAPQNYRLIGILAPRIQVSDNL-KPYIDKFQDALINQIQTIFEKRGYQTLFF--KDE 58
+ + I +L P Q SDN+ K Y +KF++ +++ I + +GY+ + D+
Sbjct: 48 SEKVQALDEK-ILLLRPAFQYSDNIAKEYENKFKNQTTLKVEQILQNQGYKVINVDSSDK 106

Query: 59 SALTLQDKRKLFAVLDVKGWVGVLEDLKMNLKDPNNPNL--GTLVDQ------SSGSVWF 110
+ K++ + + + G + + D K ++ + P L T +D+ +G V
Sbjct: 107 DDFSFAQKKEGYLAVAMNGEIVLRPDPKRTIQKKSEPGLLFSTGLDKMEGVLIPAGFVKV 166

Query: 111 SFYEPESNRVVHDFAVEVGTF---QAITYTYKQSNSGGFNSSNSIIHEDLEKNKEDAIHQ 167
+ EP S + F +++ + T S+SGG S+ N DAI
Sbjct: 167 TILEPMSGESLDSFTMDLSELDIQEKFLKTTHSSHSGGLVSTMV----KGTDNSNDAIKS 222

Query: 168 ILNKIYALIMKKAVTELTEKNISQYKEAIDRMKG 201
LNKI+A IM++ +LT+KN+ Y++ +KG
Sbjct: 223 ALNKIFANIMQEIDKKLTQKNLESYQKDAKELKG 256


6     HPF57_0547  HPF57_0573  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias     Product
HPF57_0547  2       15       -1.746553   GTP-binding protein Era
HPF57_0548  3       13       -1.815225   conserved hypothetical secreted protein
HPF57_0549  5       18       -1.761831   hypothetical protein
HPF57_0550  9       19       -1.727650   cag pathogenicity island protein
HPF57_0551  9       19       -2.092195   cag pathogenicity island protein
HPF57_0552  9       15       -2.092239   cag pathogenicity island protein
HPF57_0553  9       16       -2.700906   cag island protein
HPF57_0554  8       18       -2.787018   cag pathogenicity island protein
HPF57_0555  9       19       -3.213302   cag pathogenicity island protein
HPF57_0556  8       20       -3.414099   cag pathogenicity island protein
HPF57_0557  9       20       -3.260566   cag pathogenicity island protein Y VirB10-like
HPF57_0558  9       26       -4.565506   cag pathogenicity island protein
HPF57_0559  8       25       -4.718165   cag pathogenicity island protein
HPF57_0560  11      21       -5.307801   cag pathogenicity island protein
HPF57_0561  11      22       -5.095108   cag pathogenicity island protein
HPF57_0562  10      20       -4.230803   cag pathogenicity island protein
HPF57_0563  7       18       -3.322195   cag pathogenicity island protein
HPF57_0564  6       17       -2.890280   cag pathogenicity island protein
HPF57_0565  6       19       -3.139409   cag pathogenicity island protein
HPF57_0566  6       20       -2.901491   cag pathogenicity island protein
HPF57_0567  6       20       -3.200804   cag pathogenicity island protein
HPF57_0568  6       20       -3.142258   cag island protein
HPF57_0569  7       21       -3.068474   cag pathogenicity island protein
HPF57_0570  7       21       -3.005496   cag pathogenicity island protein
HPF57_0571  5       19       -2.072819   DNA transfer protein
HPF57_0572  3       16       -0.586323   cag pathogenicity island protein
HPF57_0573  2       15       0.111651    cag pathogenicity island protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
HPF57_0547  PF03944  32     0.003    delta endotoxin
		>PF03944#delta endotoxin

Length = 633

Score = 32.0 bits (72), Expect = 0.003
Identities = 25/94 (26%), Positives = 47/94 (50%), Gaps = 3/94 (3%)

Query: 68 LHHQEKLLNQCMLSQALKAMGDAELCVFLASVHDDLKGYEEFLSLYQKPHILALSKIDTA 127
L E+ LNQ + + + A +AEL A+V + + + FL+ + L+++
Sbjct: 94 LRETERFLNQRLNTDTV-ARVNAELTGLQANVEEFNRQVDNFLNPNRNAVPLSITSSVNT 152

Query: 128 THKQVLQKLQEYQQYASQFLALVPLSAKKSQNLN 161
+ L +L ++Q Q L L+PL A+ + NL+
Sbjct: 153 MQQLFLNRLPQFQMQGYQLL-LLPLFAQAA-NLH 184


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
HPF57_0552  PF07201  29     0.025    Hypersensitivity response secretion protein HrpJ
		>PF07201#Hypersensitivity response secretion protein HrpJ

Length = 293

Score = 29.4 bits (66), Expect = 0.025
Identities = 14/76 (18%), Positives = 26/76 (34%), Gaps = 15/76 (19%)

Query: 277 APENSKEKLIEELIANSQLIANEEEREKKLLAEKEKQ--------EAELAKY--KLKDLE 326
S + EE+ E +E L K E ++ +Y K+ +LE
Sbjct: 44 GTLQSIADMAEEVTF-----VFSERKELSLDKRKLSDSQARVSDVEEQVNQYLSKVPELE 98

Query: 327 NQKKLKALEAELKKKN 342
++ + L + L
Sbjct: 99 QKQNVSELLSLLSNSP 114


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
HPF57_0557  IGASERPTASE  40     2e-04    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 39.7 bits (92), Expect = 2e-04
Identities = 38/248 (15%), Positives = 86/248 (34%), Gaps = 11/248 (4%)

Query: 567 SQAKTEAEKQECEKLL----TPEAKKLLEEEAKESVKAYLDCVSQAKTEAEKKECEKLLT 622
QA + E++ P ++ + + ++KT + ++ T
Sbjct: 1003 IQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETT 1062

Query: 623 PEAKKKLEEAKKSVKAYLDCVSQAKTEAEKKECEKLLTPEAKKLLEQQALDCLKNAKTEA 682
+ ++ +EAK +VKA A++ +E KE + T E + +++ KT+
Sbjct: 1063 AQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVET-EKTQE 1121

Query: 683 EKKRCVKDLPKDLQKKVLAKESLKAYKDCVSRARNEKEKQECEKLLTPEAKKLL------ 736
K + PK Q + + ++ A ++ + E + Q T + K
Sbjct: 1122 VPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQ 1181

Query: 737 EEAKKSLKAYKDCVSRARNEKEKQECEKLLTPEAKKLLEEEAKESVKAYLDCVSQAKTEA 796
+ + + V + + E+ + + SV++ V A T +
Sbjct: 1182 PVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSS 1241

Query: 797 EKKECEKL 804
+ L
Sbjct: 1242 NDRSTVAL 1249



Score = 38.5 bits (89), Expect = 3e-04
Identities = 33/198 (16%), Positives = 71/198 (35%), Gaps = 1/198 (0%)

Query: 767 TPEAKKLLEEEAKESVKAYLDCVSQAKTEAEKKECEKLLTPEAKKKLEEAKKSVKAYLDC 826
P ++ + + ++KT + ++ T + ++ +EAK +VKA
Sbjct: 1023 APVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQT 1082

Query: 827 VSQAKTEAEKKECEKLLTPEAKKLLEQQALDCLKNAKTEAEKKRCVKDLPKDLQKKVLAK 886
A++ +E KE + T E + +++ KT+ K + PK Q + +
Sbjct: 1083 NEVAQSGSETKETQTTETKETATVEKEEKAKVET-EKTQEVPKVTSQVSPKQEQSETVQP 1141

Query: 887 ESLKAYKDCVSRARNEKEKQECEKLLTPEAKKLLEEAKKSLKAYKDCVSRARNEKEKQEC 946
++ A ++ + E + Q T + K + V+ + E E
Sbjct: 1142 QAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPEN 1201

Query: 947 EKLLTPEAKKLLEEAKKS 964
T + E + K
Sbjct: 1202 TTPATTQPTVNSESSNKP 1219



Score = 34.3 bits (78), Expect = 0.005
Identities = 35/269 (13%), Positives = 77/269 (28%), Gaps = 18/269 (6%)

Query: 898 RARNEKEKQECEKLLTPEAKKLLEEAKKSLKAYKDCVSRARNEKEKQECEKLLTPEAKKL 957
+ + + TP + + S +R + + +
Sbjct: 984 EVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEI---ARVDEAPVPPPAPATPSETTETV 1040

Query: 958 LEEAKKSLKAYKDCVSRARNEKEKQECEKLLTPEAKKLLEQQALDCLKNAKTEAEKKRCV 1017
E +K+ K + A E Q E ++ Q + ++ E +
Sbjct: 1041 AENSKQESKTVEKNEQDAT-ETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTE 1099

Query: 1018 KDLPKDLQKKVLAKESVKAYLDCVSRARNEKEKQECEKLLTPEAKKLLEEAKESLKAYKD 1077
++K+ AK + + KQE + + P+A+ E
Sbjct: 1100 TKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVN----- 1154

Query: 1078 CLSQARNEEERRACEKLLTPEAKKLLEQEVKKSVKAYLDCVSKARNEREKQECEKLLTPE 1137
+ + +++ A + E +EQ V +S + E + TP
Sbjct: 1155 -IKEPQSQTNTTADTEQPAKETSSNVEQPVTES--------TTVNTGNSVVENPENTTPA 1205

Query: 1138 ARKFLAKQVLSCLEKARNEEERKACLKNI 1166
+ S K R+ ++ N+
Sbjct: 1206 TTQPTVNSESSNKPKNRHRRSVRSVPHNV 1234



Score = 33.9 bits (77), Expect = 0.008
Identities = 33/197 (16%), Positives = 82/197 (41%), Gaps = 7/197 (3%)

Query: 555 AKESLKAYKDCVSQAKTEAEKQECEKLLTPEAKKLLEEEAKESVKAYLDC--VSQAKTEA 612
++ + ++ ++KT EK E + T + + +EAK +VKA V+Q+ +E
Sbjct: 1034 SETTETVAENSKQESKTV-EKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSET 1092

Query: 613 EKKECEKLLTPEAKKKLEEAKKSVKAYLDCVSQAKTEAEKKECEKLLTPEAKKLLEQQAL 672
++ + + +K E+AK + + + K+E + + P+A+ E
Sbjct: 1093 KETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPT 1152

Query: 673 DCLKNAK----TEAEKKRCVKDLPKDLQKKVLAKESLKAYKDCVSRARNEKEKQECEKLL 728
+K + T A+ ++ K+ ++++ V ++ V N +
Sbjct: 1153 VNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVN 1212

Query: 729 TPEAKKLLEEAKKSLKA 745
+ + K ++S+++
Sbjct: 1213 SESSNKPKNRHRRSVRS 1229



Score = 32.7 bits (74), Expect = 0.016
Identities = 25/193 (12%), Positives = 73/193 (37%), Gaps = 6/193 (3%)

Query: 676 KNAKTEAEKKRCVKDLP-KDLQKKVLAKESLKAYKDCVSRARNEKEKQECEKLLTPEAKK 734
+ + E+ V + P ++ + ++ ++ ++ ++ T + ++
Sbjct: 1008 PSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNRE 1067

Query: 735 LLEEAKKSLKAYKDCVSRARNEKEKQECEKLLTPEAKKLLEEEAKESVKAYLDCVSQAKT 794
+ +EAK ++KA + E ++ T E K+ E +E K + +
Sbjct: 1068 VAKEAKSNVKANT---QTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPK 1124

Query: 795 EAEKKECEKLLTPEAKKKLEEAKKSVKAYL--DCVSQAKTEAEKKECEKLLTPEAKKLLE 852
+ ++ + + + E A+++ + SQ T A+ ++ K + ++ +
Sbjct: 1125 VTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVT 1184

Query: 853 QQALDCLKNAKTE 865
+ N+ E
Sbjct: 1185 ESTTVNTGNSVVE 1197



Score = 32.0 bits (72), Expect = 0.032
Identities = 31/189 (16%), Positives = 77/189 (40%), Gaps = 12/189 (6%)

Query: 753 ARNEKEKQECEKLLTPEAK------KLLEEEAKESVKAYLDC--VSQAKTEAEKKECEKL 804
A N K++ + + +A + + +EAK +VKA V+Q+ +E ++ + +
Sbjct: 1041 AENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTET 1100

Query: 805 LTPEAKKKLEEAKKSVKAYLDCVSQAKTEAEKKECEKLLTPEAKKLLEQQALDCLKNAK- 863
+K E+AK + + + K+E + + P+A+ E +K +
Sbjct: 1101 KETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQS 1160

Query: 864 ---TEAEKKRCVKDLPKDLQKKVLAKESLKAYKDCVSRARNEKEKQECEKLLTPEAKKLL 920
T A+ ++ K+ ++++ V ++ V N + + + K
Sbjct: 1161 QTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPK 1220

Query: 921 EEAKKSLKA 929
++S+++
Sbjct: 1221 NRHRRSVRS 1229


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPF57_0558  TYPE4SSCAGX   872    0.0      Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein signature.

Length = 522

Score = 872 bits (2253), Expect = 0.0
Identities = 516/522 (98%), Positives = 518/522 (99%)

Query: 1 MEQAFFKKIVGCFCLGYLFLSSVIEAAALDIKNFNRGRVKVVNKKIAYLGDEKPITIWTS 60
M QAFFKKIVGCFCLGYLFLSS IEA ALDIKNFNRGRVKVVNKKIAYLGDEKPITIWTS
Sbjct: 1 MGQAFFKKIVGCFCLGYLFLSSAIEAVALDIKNFNRGRVKVVNKKIAYLGDEKPITIWTS 60

Query: 61 LDNVTVIQLEKDETISYITTGFNKGWSIVPNSNHIFIQPKSVKSNLMFEKEAVNFALMTR 120
LDNVTVIQLEKDETISYITTGFNKGWSIVPNSNHIFIQPKSVKSNLMFEKEAVNFALMTR
Sbjct: 61 LDNVTVIQLEKDETISYITTGFNKGWSIVPNSNHIFIQPKSVKSNLMFEKEAVNFALMTR 120

Query: 121 DYQEFLKTKKLIVDAPDPKELEEQKKALEKEKEAKEQAQKAQKDKREKRKEERAKNRANL 180
DYQEFLKTKKLIVDAPDPKELEEQKKALEKEKEAKEQAQKAQKDKREKRKEERAKNRANL
Sbjct: 121 DYQEFLKTKKLIVDAPDPKELEEQKKALEKEKEAKEQAQKAQKDKREKRKEERAKNRANL 180

Query: 181 ENLTNAMSNPQNLSNNKNLSEFIKQQRENELDQMERLEDMQEQAQANALKQIEELNKKQA 240
ENLTNAMSNPQNLSNNKNLSE IKQQRENELDQMERLEDMQEQAQANALKQIEELNKKQA
Sbjct: 181 ENLTNAMSNPQNLSNNKNLSELIKQQRENELDQMERLEDMQEQAQANALKQIEELNKKQA 240

Query: 241 EEAVKQRAKDKINIKTDKSQKSPEDNSIELSPSDSAWRTNLVVRTNKALYQFILRIAQKD 300
EEAV+QRAKDKI+IKTDKSQKSPEDNSIELSPSDSAWRTNLVVRTNKALYQFILRIAQKD
Sbjct: 241 EEAVRQRAKDKISIKTDKSQKSPEDNSIELSPSDSAWRTNLVVRTNKALYQFILRIAQKD 300

Query: 301 NFASAYLTVKLEYPQRHEVSSVIEEELKKREEAKRQRELIKQENLNTTAYINRVMMASNE 360
NFASAYLTVKLEYPQRHEVSSVIEEELKKREEAKRQRELIKQENLNTTAYINRVMMASNE
Sbjct: 301 NFASAYLTVKLEYPQRHEVSSVIEEELKKREEAKRQRELIKQENLNTTAYINRVMMASNE 360

Query: 361 QIINKEKIREEKQKIILDQAKALETQYVHNALKRNPVPRNYNYYQAPEKRSKHIMPSEIF 420
QIINKEKIREEKQKIILDQAKALETQYVHNALKRNPVPRNYNYYQAPEKRSKHIMPSEIF
Sbjct: 361 QIINKEKIREEKQKIILDQAKALETQYVHNALKRNPVPRNYNYYQAPEKRSKHIMPSEIF 420

Query: 421 DDGTFTYFGFKNITLQPAIFVVQPDGKLSMTDAAIDPNMTNSGLRWYRVNEIAEKFKLIK 480
DDGTFTYFGFKNITLQPAIFVVQPDGKLSMTDAAIDPNMTNSGLRWYRVNEIAEKFKLIK
Sbjct: 421 DDGTFTYFGFKNITLQPAIFVVQPDGKLSMTDAAIDPNMTNSGLRWYRVNEIAEKFKLIK 480

Query: 481 DKALVTVINKGYGKNPLTKNYNIKNYGELERVIKKLPLVRDK 522
DKALVTVINKGYGKNPLTKNYNIKNYGELERVIKKLPLVRDK
Sbjct: 481 DKALVTVINKGYGKNPLTKNYNIKNYGELERVIKKLPLVRDK 522


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPF57_0560  PF04335       119    3e-35    VirB8 type IV secretion protein
		>PF04335#VirB8 type IV secretion protein

Length = 227

Score = 119 bits (300), Expect = 3e-35
Identities = 44/205 (21%), Positives = 74/205 (36%), Gaps = 10/205 (4%)

Query: 27 KLNKANRTFKRAFYL---SMALNIAAVASIVMMMPLKKTDIFVYGIDRYTGEFKIVKRSD 83
KL A R+ K A+ + + AL A V ++ + PLK + +V +DR TGE I +
Sbjct: 24 KLAAAERSKKLAWVVAGVAGALATAGVVAVAALTPLKTVEPYVITVDRNTGEASIAAKLH 83

Query: 84 A-RQIVNSEAVVDSATSKFVSLLFGYSKNSLRDRKDQLMQYCDVSFQTQAMRMFNENIRQ 142
I EAV + +V G+ + + D +M Q + R + + Q
Sbjct: 84 GDATITYDEAVRKYFLATYVRYREGWIAAAREEYFDAVMVMSARPEQDRWSRFYKTDNPQ 143

Query: 143 FVDKVRA-EAIISSNIQREKVKNSPLTRLTFFITIKITPDTMENYEYITKKQVTIYYDFA 201
+ A + I + +F +T T TI Y
Sbjct: 144 SPQNILANRTDVFVEI-KRVSFLGGNVAQVYFTKESVTGSNS----TKTDAVATIKYKVD 198

Query: 202 RGNSSQENLIINPFGFKVFDIQITD 226
S + + NP G++V +
Sbjct: 199 GTPSKEVDRFKNPLGYQVESYRADV 223


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPF57_0565  TYPE4SSCAGX   31     0.005    Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein signature.

Length = 522

Score = 31.3 bits (70), Expect = 0.005
Identities = 30/119 (25%), Positives = 55/119 (46%), Gaps = 16/119 (13%)

Query: 24 AINTALLPSEYKKLVALGFKKIKTFHQRHDEKEVTEEEKEFATNALREKLRNDRARAEQI 83
A+N AL+ +Y++ + K K D KE+ E++K EK + + +A++
Sbjct: 112 AVNFALMTRDYQEFL----KTKKLIVDAPDPKELEEQKKAL------EKEKEAKEQAQKA 161

Query: 84 QKNIEAFEKKNNSSIQKKAAKHKGLQELNETNANPLNDNPNSNSSTETKSNKDDNFDEM 142
QK+ K +++A L+ L +NP N + N N S K +++ D+M
Sbjct: 162 QKD------KREKRKEERAKNRANLENLTNAMSNPQNLSNNKNLSELIKQQRENELDQM 214


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPF57_0571  ACRIFLAVINRP  32     0.014    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 32.1 bits (73), Expect = 0.014
Identities = 20/88 (22%), Positives = 32/88 (36%), Gaps = 18/88 (20%)

Query: 19 EVQKRQFQKIEELKTDMQKGVNPFFKVLFDGGNRLFGFPETFIYSSI-------FILFVT 71
+ K K+ EL+ +G+ +D F+ SI F +
Sbjct: 301 DTAKAIKAKLAELQPFFPQGMK--VLYPYD--------TTPFVQLSIHEVVKTLFEAIML 350

Query: 72 IVLSVILF-QAYEPVLIVAIVIVLVALG 98
+ L + LF Q LI I + +V LG
Sbjct: 351 VFLVMYLFLQNMRATLIPTIAVPVVLLG 378


7  HPF57_0698  HPF57_0721  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPF57_0698  2       11       1.590773   hypothetical protein
HPF57_0699  1       13       1.628822   hypothetical protein
HPF57_0700  3       14       1.878203   putative lipopolysaccharide biosynthesis
HPF57_0701  2       13       2.100583   ribonucleotide-diphosphate reductase subunit
HPF57_0702  4       18       1.699226   hypothetical protein
HPF57_0703  6       16       0.932054   hypothetical protein
HPF57_0704  3       12       0.408967   hypothetical protein
HPF57_0705  1       11       0.157483   hypothetical protein
HPF57_0706  0       11       -0.819735  UDP-N-acetylglucosamine pyrophosphorylase
HPF57_0707  0       13       -1.670389  flagellar biosynthesis protein FliP
HPF57_0708  0       15       -2.263289  iron(III) dicitrate transport protein FecA1
HPF57_0709  -1      16       -3.325432  iron(II) transport protein
HPF57_0710  -2      18       -2.386135  hypothetical protein
HPF57_0711  -2      17       -1.177705  cytosine specific DNA methyltransferase
HPF57_0712  1       16       1.658575   hypothetical protein
HPF57_0713  0       14       2.457056   Type II restriction enzyme NlaIV
HPF57_0714  1       12       3.031458   hypothetical protein
HPF57_0715  1       12       4.117623   acetyl coenzyme A acetyltransferase
HPF57_0716  2       13       3.884005   3-oxoacid Coa-transferase, subunit A
HPF57_0717  3       14       3.690534   succinyl-CoA-transferase subunit B
HPF57_0718  1       14       2.773926   short-chain fatty acids transporter
HPF57_0719  1       14       2.844882   putative outer membrane protein
HPF57_0720  2       12       2.811265   hydantoin utilization protein A
HPF57_0721  2       12       1.664048   N-methylhydantoinase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPF57_0703  PF07132       35     5e-05    Harpin protein (HrpN)
		>PF07132#Harpin protein (HrpN)

Length = 356

Score = 34.7 bits (79), Expect = 5e-05
Identities = 19/46 (41%), Positives = 29/46 (63%)

Query: 51 FWGEAVGAGMGGAMGGMIGSLVGPWSTVFGAGIGGGIGAYSGAEIG 96
F G +G G+GG +GG+ SL G + G G+GGG+G+ G+ +G
Sbjct: 60 FMGSMMGGGLGGGLGGLGSSLGGLGGGLLGGGLGGGLGSSLGSGLG 105


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPF57_0707  FLGBIOSNFLIP  276    2e-96    Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP signature.

Length = 245

Score = 276 bits (708), Expect = 2e-96
Identities = 113/245 (46%), Positives = 162/245 (66%), Gaps = 2/245 (0%)

Query: 1 MRFFIFLILICPLICPLMSADSALPSVNLSLNAPNDPKQLVTTLNVIALLTLLVLAPSLI 60
MR + + + L A + LP + S P + + + +T L P+++
Sbjct: 1 MRRLLSVAPVL-LWLITPLAFAQLPGIT-SQPLPGGGQSWSLPVQTLVFITSLTFIPAIL 58

Query: 61 LVMTSFTRLIVVFSFLRTALGTQQTPPTQILVSLSLILTFFIMEPSLKKAYDTGIKPYMD 120
L+MTSFTR+I+VF LR ALGT PP Q+L+ L+L LTFFIM P + K Y +P+ +
Sbjct: 59 LMMTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSE 118

Query: 121 KKISYTEAFEKSALPFKEFMLKNTREKDLALFFRIRNLPNPKTPDEVSLSVLIPAFMISE 180
+KIS EA EK A P +EFML+ TRE DL LF R+ N + P+ V + +L+PA++ SE
Sbjct: 119 EKISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSE 178

Query: 181 LKTAFQIGFLLYLPFLVIDMVISSILMAMGMMMLPPVMISLPFKILVFILVDGFNLLTEN 240
LKTAFQIGF +++PFL+ID+VI+S+LMA+GMMM+PP I+LPFK+++F+LVDG+ LL +
Sbjct: 179 LKTAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGS 238

Query: 241 LVASF 245
L SF
Sbjct: 239 LAQSF 243


8  HPF57_0887  HPF57_0908  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPF57_0887  2       15       3.297994   hydrogenase nickel incorporation protein
HPF57_0888  2       13       3.354239   flagellar hook protein FlgE
HPF57_0889  1       13       2.662794   CDP-diacylglycerol pyrophosphatase
HPF57_0890  1       12       2.709747   alkylphosphonate uptake protein
HPF57_0891  1       13       2.456755   hypothetical protein
HPF57_0892  2       13       1.402210   hypothetical protein
HPF57_0893  2       13       1.354218   catalase
HPF57_0894  1       12       0.305880   iron-regulated outer membrane protein
HPF57_0895  2       11       -1.425168  Holliday junction resolvase
HPF57_0896  2       10       -1.575576  hypothetical protein
HPF57_0897  0       10       -0.335556  hypothetical protein
HPF57_0898  -2      9        -0.031100  Holliday junction DNA helicase motor protein
HPF57_0899  -1      8        0.541332   hypothetical protein
HPF57_0900  -1      9        1.354508   virulence factor MviN
HPF57_0901  1       9        1.987784   cysteinyl-tRNA synthetase
HPF57_0902  1       10       2.140865   vacuolating cytotoxin A
HPF57_0903  0       15       3.685913   iron(III) dicitrate transporter, ATP-binding
HPF57_0904  0       15       3.795384   iron(III) dicitrate ABC transporter, permease
HPF57_0905  0       16       3.633481   short-chain oxidoreductase
HPF57_0906  0       18       3.585637   hypothetical protein
HPF57_0907  0       20       3.359515   hypothetical protein
HPF57_0908  0       19       3.654067   outer membrane protein BabA
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPF57_0888  FLGHOOKAP1    42     7e-06    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 42.3 bits (99), Expect = 7e-06
Identities = 13/49 (26%), Positives = 27/49 (55%)

Query: 669 SISGSKLESSNVDLSRSLTNLIVVQRGFQANSKAVTTSDQILNTLLNLK 717
+S + S V+L NL Q+ + AN++ + T++ I + L+N++
Sbjct: 498 QLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINIR 546



Score = 39.2 bits (91), Expect = 5e-05
Identities = 11/35 (31%), Positives = 20/35 (57%)

Query: 4 SLWSGVNGMQAHQIALDIESNNIANVNTTGFKYSR 38
+ + ++G+ A Q AL+ SNNI++ N G+
Sbjct: 3 LINNAMSGLNAAQAALNTASNNISSYNVAGYTRQT 37


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPF57_0901  OMS28PORIN    30     0.020    OMS28 porin signature.
		>OMS28PORIN#OMS28 porin signature.

Length = 257

Score = 29.8 bits (66), Expect = 0.020
Identities = 13/37 (35%), Positives = 25/37 (67%)

Query: 309 EEDLLVSKKRLDKIYRLKQRVLGTLGGINPNFKKEIL 345
+E L+ S++ LD+ + Q+VL + G+NP+ K ++L
Sbjct: 188 KETLMASERALDETVQEAQKVLNMVNGLNPSNKDQVL 224


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPF57_0902  VACCYTOTOXIN  2011   0.0      Helicobacter pylori vacuolating cytotoxin signature.
		>VACCYTOTOXIN#Helicobacter pylori vacuolating cytotoxin signature.

Length = 1291

Score = 2011 bits (5210), Expect = 0.0
Identities = 1182/1296 (91%), Positives = 1232/1296 (95%), Gaps = 5/1296 (0%)

Query: 1 MEIQQTHRKINRPLVSLALVGALVSITPQQSHAAFFTTVIIPAIVGGIATGAAVGTVSGL 60
MEIQQTHRKINRPLVSLALVGALVSITPQQSHAAFFTTVIIPAIVGGIATGAAVGTVSGL
Sbjct: 1 MEIQQTHRKINRPLVSLALVGALVSITPQQSHAAFFTTVIIPAIVGGIATGAAVGTVSGL 60

Query: 61 LGWGLKQAEEANKTPDKPDKVWRIQAGRGFNEFPNKEYDLYKSLLSSKIDGGWDWGNAAR 120
LGWGLKQAEEANKTPDKPDKVWRIQAG+GFNEFPNKEYDLYKSLLSSKIDGGWDWGNAAR
Sbjct: 61 LGWGLKQAEEANKTPDKPDKVWRIQAGKGFNEFPNKEYDLYKSLLSSKIDGGWDWGNAAR 120

Query: 121 HYWVKGGQWNKLEVDMKDAVGTYKLSGLINYTGGDLDVNMQKATLRLGQFNGNSFTSFKD 180
HYWVK GQWNKLEVDM++AVGTY LSGLIN+TGGDLDVNMQKATLRLGQFNGNSFTS+KD
Sbjct: 121 HYWVKDGQWNKLEVDMQNAVGTYNLSGLINFTGGDLDVNMQKATLRLGQFNGNSFTSYKD 180

Query: 181 NADRTTRVDFNAKNISIDNFIEINNRVGSGAGRKASSTVLTLLASEGITSGKNAEISLYD 240
+ADRTTRVDFNAKNI IDNF+EINNRVGSGAGRKASSTVLTL ASEGITS +NAEISLYD
Sbjct: 181 SADRTTRVDFNAKNILIDNFLEINNRVGSGAGRKASSTVLTLQASEGITSRENAEISLYD 240

Query: 241 GATLNLASNSVKLMGNVWMGRLQYVGAYLAPSYSTINTSKVTGEVNFNHLTVGDRNAAQA 300
GATLNLASNSVKLMGNVWMGRLQYVGAYLAPSYSTINTSKVTGEVNFNHLTVGD NAAQA
Sbjct: 241 GATLNLASNSVKLMGNVWMGRLQYVGAYLAPSYSTINTSKVTGEVNFNHLTVGDHNAAQA 300

Query: 301 GIIASKKTYIGTLDLWQSAGLNIITPPEGGYKDKPNNTNSQSGAKNDKNESAKNDKQESS 360
GIIAS KT+IGTLDLWQSAGLNII PPEGGYKDKPN+ S + N AKNDKQESS
Sbjct: 301 GIIASNKTHIGTLDLWQSAGLNIIAPPEGGYKDKPNDKPSNTTQNN-----AKNDKQESS 355

Query: 361 QNNSNTQVINPPNSGQKTEIQPTQVIDGPFAGAKDTVVNINRINTNADGTIKVGGYTASL 420
QNNSNTQVINPPNS QKTEIQPTQVIDGPFAG K+TVVNINRINTNADGTI+VGG+ ASL
Sbjct: 356 QNNSNTQVINPPNSAQKTEIQPTQVIDGPFAGGKNTVVNINRINTNADGTIRVGGFKASL 415

Query: 421 TTNAAHLNIGKGGVNLSNQASGRSLLVENLTGNITVDGALMVNNQVGGYALAGSSANFEF 480
TTNAAHL+IGKGG+NLSNQASGRSLLVENLTGNITVDG L VNNQVGGYALAGSSANFEF
Sbjct: 416 TTNAAHLHIGKGGINLSNQASGRSLLVENLTGNITVDGPLRVNNQVGGYALAGSSANFEF 475

Query: 481 KAGVDTKNGTIAFNNNISLGRFVNLKASAHTVNFKDIDTGNGGFNTLDFSGVTNKVNINK 540
KAG DTKNGT FNN+ISLGRFVNLK AHT NFK IDTGNGGFNTLDFSGVTNKVNINK
Sbjct: 476 KAGTDTKNGTATFNNDISLGRFVNLKVDAHTANFKGIDTGNGGFNTLDFSGVTNKVNINK 535

Query: 541 LITASTNVAIKNFNINELLVKTNGISVGEYTNFSEDIGNQSRINTVRLETGTRSIYSGGV 600
LITASTNVA+KNFNINEL+VKTNG+SVGEYT+FSEDIG+QSRINTVRLETGTRSIYSGGV
Sbjct: 536 LITASTNVAVKNFNINELVVKTNGVSVGEYTHFSEDIGSQSRINTVRLETGTRSIYSGGV 595

Query: 601 KFKSGEKLVINDFYYAPWNYFDARNIKNVEITNKLAFGPQGSPWGTAKLMFNNLTLGPNA 660
KFK GEKLVINDFYYAPWNYFDARNIKNVEITNKLAFGPQGSPWGTAKLMFNNLTLG NA
Sbjct: 596 KFKGGEKLVINDFYYAPWNYFDARNIKNVEITNKLAFGPQGSPWGTAKLMFNNLTLGQNA 655

Query: 661 VMDYSQFSNVTIQGNFVNNQGTINYLVRGGNIETLSVGNAAVMSFNNDIDSATGFYKPLI 720
VMDYSQFSN+TIQG+FVNNQGTINYLVRGG + TL+VGNAA M F+N++DSATGFY+PL+
Sbjct: 656 VMDYSQFSNLTIQGDFVNNQGTINYLVRGGQVATLNVGNAAAMFFSNNVDSATGFYQPLM 715

Query: 721 KINSAQDLIKNKEHVLLKAKIIGYENASLGTNSISNANLIEQFNERLALYNNNNRMDTCV 780
KINSAQDLIKNKEHVLLKAKIIGY N S GT+SI+N NLIEQF ERLALYNNNNRMD CV
Sbjct: 716 KINSAQDLIKNKEHVLLKAKIIGYGNVSAGTDSIANVNLIEQFKERLALYNNNNRMDICV 775

Query: 781 VRNTDDIKACGMAIGDQAMVNNPDNYKYLIGKAWKNIGISKTANGSKISVRYLGNATPAE 840
VRNTDDIKACG AIG+Q+MVNNP+NYKYL GKAWKNIGISKTANGSKISV YLGN+TP E
Sbjct: 776 VRNTDDIKACGTAIGNQSMVNNPENYKYLEGKAWKNIGISKTANGSKISVHYLGNSTPTE 835

Query: 841 NGGNTTNLPTNTTNNARFASYALIKNAPFAQTSATPSLVAINKHNFGTIESVFELANRSE 900
NGGNTTNLPTNTTN RFASYALIKNAPFA+ SATP+LVAIN+H+FGTIESVFELANRS
Sbjct: 836 NGGNTTNLPTNTTNKVRFASYALIKNAPFARYSATPNLVAINQHDFGTIESVFELANRSN 895

Query: 901 DIDTLYANSGAQGRDLLQTLLIDSHDAGYARTMIDATSANEITKQLNTATDALNNIASLE 960
DIDTLYANSGAQGRDLLQTLLIDSHDAGYARTMIDATSANEITKQLNTAT LNNIASLE
Sbjct: 896 DIDTLYANSGAQGRDLLQTLLIDSHDAGYARTMIDATSANEITKQLNTATTTLNNIASLE 955

Query: 961 HKTSGLQTLSLSNAMILNSRLVNLSRRHTNNIDSFAKRLQALKDQRFASLESAAEVLYQF 1020
HKTSGLQTLSLSNAMILNSRLVNLSRRHTN+IDSFAKRLQALKDQRFASLESAAEVLYQF
Sbjct: 956 HKTSGLQTLSLSNAMILNSRLVNLSRRHTNHIDSFAKRLQALKDQRFASLESAAEVLYQF 1015

Query: 1021 APKYEKPTNVWANAIGGASLNSGGNTSLYGTSAGVDAYLNGEVEAIVGGFGSYGYSSFNN 1080
APKYEKPTNVWANAIGG SLNSGGN SLYGTSAGVDAYLNGEVEAIVGGFGSYGYSSF+N
Sbjct: 1016 APKYEKPTNVWANAIGGTSLNSGGNASLYGTSAGVDAYLNGEVEAIVGGFGSYGYSSFSN 1075

Query: 1081 QANSLNSGANNANFGAYSRIFANRHEFDFEAQGAVGSDQSSLNFKSALLRDLNQSYNYLA 1140
QANSLNSGANN NFG YSRIFAN+HEFDFEAQGA+GSDQSSLNFKSALLRDLNQSYNYLA
Sbjct: 1076 QANSLNSGANNTNFGVYSRIFANQHEFDFEAQGALGSDQSSLNFKSALLRDLNQSYNYLA 1135

Query: 1141 YGASTRASYGYDFAFFRNALVLKPSVGVSYNHLGSTNFESNSTHKTALKNGASSQHLFNA 1200
Y A+TRASYGYDFAFFRNALVLKPSVGVSYNHLGSTNF+SNS K ALKNGASSQHLFNA
Sbjct: 1136 YSAATRASYGYDFAFFRNALVLKPSVGVSYNHLGSTNFKSNSNQKVALKNGASSQHLFNA 1195

Query: 1201 SANVEARYYYGDTSYFYMNAGVLQEFANFGSSNALSLNTFKVNTARNPLNTHARVMMGGE 1260
SANVEARYYYGDTSYFYMNAGVLQEFANFGSSNA+SLNTFKVN RNPLNTHARVMMGGE
Sbjct: 1196 SANVEARYYYGDTSYFYMNAGVLQEFANFGSSNAVSLNTFKVNATRNPLNTHARVMMGGE 1255

Query: 1261 LKLAKEVFLNLGFIYLHNLISNAGYFASNLGMRYSF 1296
LKLAKEVFLNLGF+YLHNLISN G+FASNLGMRYSF
Sbjct: 1256 LKLAKEVFLNLGFVYLHNLISNIGHFASNLGMRYSF 1291


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPF57_0905  DHBDHDRGNASE  86     5e-22    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 86.3 bits (213), Expect = 5e-22
Identities = 55/241 (22%), Positives = 105/241 (43%), Gaps = 10/241 (4%)

Query: 1 MGEKKESQKVAIITGASSGIGLECTLMLLDQGYKVYALSRHATLCAALNHALC------E 54
M K K+A ITGA+ GIG L QG + A+ + + +L E
Sbjct: 1 MNAKGIEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAE 60

Query: 55 SIDIDVSDSNALKEAFLNISAKEDHCDVLINSAGYGVFGSVEDTPIDEVKKQFGVNFFAL 114
+ DV DS A+ E I + D+L+N AG G + +E + F VN +
Sbjct: 61 AFPADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGV 120

Query: 115 CEVVQFCLPLLKNKPHSKIFNLSSIAGRMSMIFLGHYSASKHALEAYSDALRLELKPFNI 174
+ + ++ I + S + + Y++SK A ++ L LEL +NI
Sbjct: 121 FNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNI 180

Query: 175 QVCLIEPGPVKSNWEKTAFENDERKDSLYALEVNAAKSFYSGV-YQKALSPKAVAQKIVF 233
+ ++ PG +++ + + + ++ + + + ++F +G+ +K P +A ++F
Sbjct: 181 RCNIVSPGSTETDMQWSLWADENGAEQVIK---GSLETFKTGIPLKKLAKPSDIADAVLF 237

Query: 234 L 234
L
Sbjct: 238 L 238


9  HPF57_1085  HPF57_1090  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPF57_1085  3       14       -0.801439  flgM protein
HPF57_1086  4       13       -1.715272  hypothetical protein
HPF57_1087  5       13       -1.659113  peptidyl-prolyl cis-trans isomerase
HPF57_1088  4       14       -2.457495  hypothetical protein
HPF57_1089  5       15       -2.220377  peptidoglycan-associated lipoprotein precursor
HPF57_1090  2       15       -0.213528  translocation protein TolB
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPF57_1089  OMPADOMAIN    143    2e-44    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 143 bits (363), Expect = 2e-44
Identities = 45/168 (26%), Positives = 71/168 (42%), Gaps = 22/168 (13%)

Query: 22 NMDKETVAGDVSAKAVQSAPVSTETVQEKQEPKQEPAPVVEEKPAVESGTIIASIYFDFD 81
D ++ VS + Q P PAP V+ K T+ + + F+F+
Sbjct: 177 RPDNGMLSLGVSYRFGQGEAAPVV------APAPAPAPEVQTK----HFTLKSDVLFNFN 226

Query: 82 KYEIKESDQETLDEIVQKAKE---NHMQVLLEGNTDEFGSSEYNQALGVKRTLSVKNALV 138
K +K Q LD++ + V++ G TD GS YNQ L +R SV + L+
Sbjct: 227 KATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAYNQGLSERRAQSVVDYLI 286

Query: 139 IKGVEKDMIKTISFGETKPKCTQ-----KTR----ECYKENRRVDVKL 177
KG+ D I GE+ P K R +C +RRV++++
Sbjct: 287 SKGIPADKISARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEV 334


10  HPF57_1100  HPF57_1108  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPF57_1100  0       18       -3.588809  F0F1 ATP synthase subunit B'
HPF57_1101  2       18       -3.219036  plasmid replication-partition related protein
HPF57_1102  3       19       -3.153201  SpoOJ regulator
HPF57_1103  3       19       -3.311311  biotin--protein ligase
HPF57_1104  2       19       -3.090806  methionyl-tRNA formyltransferase
HPF57_1105  2       18       -3.351693  hypothetical protein
HPF57_1106  3       16       0.206085   hypothetical protein
HPF57_1107  3       17       0.221787   hypothetical protein
HPF57_1108  3       14       -0.115430  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPF57_1102  PF07675       31     0.005    Cleaved Adhesin
		>PF07675#Cleaved Adhesin

Length = 1358

Score = 31.2 bits (70), Expect = 0.005
Identities = 30/105 (28%), Positives = 40/105 (38%), Gaps = 7/105 (6%)

Query: 70 QISQVILKTQMPFLDLVPSNLGLAGFEKTFYDSQDENKRGELMLKNALESVV---GLYDY 126
VI T F SNL A FE + D + ++ VV G+YDY
Sbjct: 414 TFGSVIPATGPLFTGTASSNLYSANFEYLTPANADPVVTTQNIIVTGQGEVVIPGGVYDY 473

Query: 127 IIIDSPPALGPLTINSLSAAHSVIIPIQCEFFALEGTKLLLNTIR 171
I + PA G + I A P + + FA E K T+R
Sbjct: 474 CITNPEPASGKMWI----AGDGGNQPARYDDFAFEAGKKYTFTMR 514


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPF57_1104  FERRIBNDNGPP  32     0.003    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 31.8 bits (72), Expect = 0.003
Identities = 12/33 (36%), Positives = 19/33 (57%)

Query: 70 EPEVQILKALKPDFIVVVAYGKILPKEVLSIAP 102
EP +++L +KP F+V A P+ + IAP
Sbjct: 86 EPNLELLTEMKPSFMVWSAGYGPSPEMLARIAP 118


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPF57_1105  PF01540       30     0.028    Adhesin lipoprotein
		>PF01540#Adhesin lipoprotein

Length = 475

Score = 30.5 bits (68), Expect = 0.028
Identities = 62/315 (19%), Positives = 129/315 (40%), Gaps = 32/315 (10%)

Query: 140 KNCKEKVEKRKKKIKDENSAETLSAKQESEIKKYDKEIEKIRKEMTSKTIQITLDEIKIN 199
K+ ++KV++ KKI DEN +IK+ KE+ K+ +++ S I L
Sbjct: 103 KSEQQKVDQANKKIADENL----------KIKEGAKELLKLSEKIQSFADTIAL------ 146

Query: 200 NICEVSKNKFKVQEDALTNLEKDFDELDEAMKKFDDLKEMELPKDYQTIKDKLESLFSFD 259
I ++ KF++ E L + L++ + + K + + LES F+
Sbjct: 147 TITKLEGKKFQIDETFKKQLISTIELLNKKSAEVKTFATVNTIKKDFLLSE-LESFKEFN 205

Query: 260 IDKEAGQVSE----------EIKEHMSKVGREFIEKGIELQKKMPDNACPFCTQEITNNI 309
VSE E+ E ++ ++ E+ ++++ + + +
Sbjct: 206 TSWLEKIVSEWEEVKKAWSKELAEIKAEDDKKLAEENQKIKEGAKELLKLSEKIQSFADT 265

Query: 310 IQVYTSYFNKRIEQFNQDSLEVSGTLKKILEQWN-IKEILQSFERFEPFMKKDSSTNKES 368
I + + ++ + + ++ T++ + ++ +K + F+ + + KE
Sbjct: 266 IALTITKLERKFQIDEKFKKQLISTIELLNKKSVEVKTFATVNTIKKDFLLSELESFKEF 325

Query: 369 LKNALEQIKVLLEKLQKEVDKKWGVKNKEKFQETDKKLLENYEKFQKCADETRNILKQKK 428
+ LE+I E EV K W + E E DKKL E +K + +E + I +
Sbjct: 326 NTSWLEKIVSEWE----EVKKAWSKELAEIKAEDDKKLAEENQKIKNGVEELKKINNEAF 381

Query: 429 EQKEKLEKLKTELKE 443
E + + K EL++
Sbjct: 382 ELSKTVNKTIAELEK 396


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPF57_1106  RTXTOXIND     43     2e-06    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 42.9 bits (101), Expect = 2e-06
Identities = 26/195 (13%), Positives = 70/195 (35%), Gaps = 19/195 (9%)

Query: 27 QIELENQSRF-LAQQKEFEKEVKEKRAQYQSHFKMLEQKEEALKEQEREQKAKFDDAVKQ 85
+++L ++ F ++E + + Q+ + QKE L ++ E+ +
Sbjct: 167 ELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRY 226

Query: 86 ASALALQDERAKIIEEARKNAFLEQQKGLELLQKELDEKSKQVQELHQKEAEIERLKREN 145
+ ++ R + + LE + + E EL ++++E+++ E
Sbjct: 227 ENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQ-ENKYVEAV---NELRVYKSQLEQIESEI 282

Query: 146 NEAESRLKAENEKKLNEKLDLERERIEKALHEKNELKFKQQEEQLEMLRNELKNAQRKAE 205
A+ + + + E ++K + +L + + +A
Sbjct: 283 LSAKEEYQLVTQ-------LFKNEILDK--LRQTTDNIGLLTLELAKNEERQQASVIRAP 333

Query: 206 LSSQQFQGEVQELAI 220
+S +VQ+L +
Sbjct: 334 VS-----VKVQQLKV 343


11  HPF57_1189  HPF57_1214  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPF57_1189  2       12       0.201450   hypothetical protein
HPF57_1190  1       12       -0.295068  DNA polymerase III subunit delta'
HPF57_1191  0       13       0.863883   dihydropteroate synthase
HPF57_1192  -1      12       1.686452   hypothetical protein
HPF57_1193  -1      9        1.363119   hypothetical protein
HPF57_1194  -2      10       1.403119   hypothetical protein
HPF57_1195  -3      10       2.669384   hypothetical protein
HPF57_1196  -2      10       3.004465   carbamoyl phosphate synthase small subunit
HPF57_1197  0       10       3.531197   formamidase
HPF57_1198  -1      10       3.150412   hypothetical protein
HPF57_1199  -1      11       3.529715   Maf-like protein
HPF57_1200  0       12       3.345627   alanyl-tRNA synthetase
HPF57_1201  1       15       1.700587   hypothetical protein
HPF57_1202  -1      10       1.032375   outer membrane protein
HPF57_1203  2       12       -0.683977  30S ribosomal protein S18
HPF57_1204  2       12       -0.937090  single-strand DNA-binding protein
HPF57_1205  3       11       -1.032759  30S ribosomal protein S6
HPF57_1206  3       10       -0.723342  DNA polymerase III subunit delta
HPF57_1207  3       10       -0.173847  3'-5' exoribonuclease R
HPF57_1208  1       11       -0.289833  shikimate 5-dehydrogenase
HPF57_1209  1       9        -0.100393  hypothetical protein
HPF57_1210  0       9        0.122189   putative peptide ABC transporter, ATP-binding
HPF57_1211  0       10       0.373187   hypothetical protein
HPF57_1212  1       12       0.141011   tryptophanyl-tRNA synthetase
HPF57_1213  2       13       0.613239   biotin synthesis protein
HPF57_1214  2       15       1.426941   preprotein translocase subunit SecG
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPF57_1198  adhesinmafb   31     0.002    Neisseria meningitidis: adhesin MafB signature.
		>adhesinmafb#Neisseria meningitidis: adhesin MafB signature.

Length = 467

Score = 31.2 bits (70), Expect = 0.002
Identities = 17/50 (34%), Positives = 22/50 (44%), Gaps = 2/50 (4%)

Query: 32 MEGIENSDPNQNNPFISA--AVGIGGAAISIFFPNTKPIVDGVKPLAEKG 79
ME I NPFISA A+GIG + K + + PL +G
Sbjct: 225 MEFINGVAAGALNPFISAGEALGIGDILYGTRYAIDKAAMRNIAPLPAEG 274


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPF57_1201  PF05844       25     0.035    YopD protein
		>PF05844#YopD protein

Length = 295

Score = 25.0 bits (54), Expect = 0.035
Identities = 13/65 (20%), Positives = 28/65 (43%), Gaps = 1/65 (1%)

Query: 10 SVLKANNPHFDKIFEKHNQLDDDIKTAEQQNASDAEVSHMKKQKLKLKDEIHSMIIEYRE 69
L+A F+ + I++ Q + +V + Q ++E+++ I + +
Sbjct: 197 VALRAAGRAFESRNGALQVANTVIQSFVQMANASVQVRQGESQASAREEEVNATIGQ-SQ 255

Query: 70 KQKSE 74
KQK E
Sbjct: 256 KQKVE 260


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPF57_1209  IGASERPTASE   30     0.009    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 29.6 bits (66), Expect = 0.009
Identities = 24/80 (30%), Positives = 37/80 (46%), Gaps = 6/80 (7%)

Query: 46 QNLEKTKIERQNSTLSPKQEETNTTTTATEENPTKDSPLPLETPTQEKENKQENKQETKQ 105
QN E K + N + + E + + T+E T ++ ET T EKE K + + E Q
Sbjct: 1064 QNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETK---ETATVEKEEKAKVETEKTQ 1120

Query: 106 EQEK---ENEPKQNSASPIQ 122
E K + PKQ + +Q
Sbjct: 1121 EVPKVTSQVSPKQEQSETVQ 1140



Score = 28.5 bits (63), Expect = 0.022
Identities = 17/91 (18%), Positives = 33/91 (36%), Gaps = 7/91 (7%)

Query: 49 EKTKIERQNSTLSPKQEETNTTTTATEENPTKDSPLPLETPTQEKENKQENKQETKQEQE 108
+ ++ + S +SPKQE++ T E P PT + Q T ++
Sbjct: 1118 KTQEVPKVTSQVSPKQEQSETVQPQAE-------PARENDPTVNIKEPQSQTNTTADTEQ 1170

Query: 109 KENEPKQNSASPIQNHQKTLSTSTIGKKPLE 139
E N P+ + +++ + P
Sbjct: 1171 PAKETSSNVEQPVTESTTVNTGNSVVENPEN 1201


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPF57_1214  SECGEXPORT    50     2e-10    Protein-export SecG membrane protein signature.
		>SECGEXPORT#Protein-export SecG membrane protein signature.

Length = 110

Score = 49.9 bits (119), Expect = 2e-10
Identities = 24/84 (28%), Positives = 47/84 (55%), Gaps = 3/84 (3%)

Query: 1 MTSALLGLQIVLAVLIVVVVLLQ--KSSSIGLGAYSGSNESLFGAKGPASFMAKLTMFLG 58
M ALL + +++A+ +V +++LQ K + +G +G++ +LFG+ G +FM ++T L
Sbjct: 1 MYEALLVVFLIVAIGLVGLIMLQQGKGADMGASFGAGASATLFGSSGSGNFMTRMTALLA 60

Query: 59 LLFVVNTIALGYFYNKEYGKSILD 82
LF + ++ LG N +
Sbjct: 61 TLFFIISLVLGNI-NSNKTNKGSE 83


12  HPF57_1348  HPF57_1354  Y  N  N  Genomic Island
LocusTag    DNBias  CDNBias  %GCBias    Product
HPF57_1348  3       17       -0.815936  protease IV
HPF57_1349  7       25       -2.008709  hypothetical protein
HPF57_1350  5       21       -1.505333  hypothetical protein
HPF57_1351  5       21       0.692848   hypothetical protein
HPF57_1352  4       19       0.624472   lipoprotein
HPF57_1353  2       16       0.381064   hypothetical protein
HPF57_1354  2       14       -0.804099  hypothetical protein
13  HPF57_1380  HPF57_1393  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPF57_1380  2       12       1.198648   ABC transporter ATP-binding protein
HPF57_1381  2       9        0.109796   hypothetical protein
HPF57_1382  3       9        -0.496599  outer membrane protein
HPF57_1383  3       11       -0.367404  branched-chain amino acid aminotransferase
HPF57_1384  2       12       -1.003733  outer membrane protein HorJ
HPF57_1385  2       13       -1.232842  DNA polymerase I
HPF57_1386  1       18       -0.606809  Type II restriction-modification specificity
HPF57_1387  3       20       -0.005772  Type IIG restriction-modification enzyme
HPF57_1388  3       15       1.049569   hypothetical protein
HPF57_1389  3       12       0.359089   thymidylate kinase
HPF57_1390  3       11       0.372148   phosphopantetheine adenylyltransferase
HPF57_1391  2       12       0.387650   3-octaprenyl-4-hydroxybenzoate carboxy-lyase
HPF57_1392  3       12       -0.250232  flagellar basal body P-ring biosynthesis protein
HPF57_1393  2       11       -0.131569  DNA helicase II
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPF57_1390  LPSBIOSNTHSS  221    2e-77    Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein signature.

Length = 166

Score = 221 bits (565), Expect = 2e-77
Identities = 65/148 (43%), Positives = 94/148 (63%)

Query: 4 IGIYPGTFDPVTNGHIDIIHRSSELFEKLIVAVAHSSTKNPMFSLKERLEMIQLATKSFK 63
IYPG+FDP+T GH+DII R LF+++ VAV + K PMFS++ERLE I A
Sbjct: 2 NAIYPGSFDPITFGHLDIIERGCRLFDQVYVAVLRNPNKQPMFSVQERLEQIAKAIAHLP 61

Query: 64 NVECVAFEGLLANLAKEYHCKVLVRGLRVVSDFEYELQMGYANKSLNHELETLYFMPTLQ 123
N + +FEGL N A++ ++RGLRV+SDFE ELQM NK+L +LET++ + +
Sbjct: 62 NAQVDSFEGLTVNYARQRQAGAILRGLRVLSDFELELQMANTNKTLASDLETVFLTTSTE 121

Query: 124 NAFISSSIVRSIIVHKGDASHLVPKEIH 151
+F+SSS+V+ + G+ H VP +
Sbjct: 122 YSFLSSSLVKEVARFGGNVEHFVPSHVA 149


14  HPF57_1412  HPF57_1459  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPF57_1412  2       13       -1.574841  peptidyl-tRNA hydrolase
HPF57_1413  0       11       -1.078741  hypothetical protein
HPF57_1414  1       10       -0.854198  putative methyltransferase
HPF57_1415  1       10       -0.213678  putative methyltransferase
HPF57_1416  2       12       0.954763   putative outer membrane protein
HPF57_1417  1       11       0.842019   hypothetical protein
HPF57_1418  0       11       1.136710   putative cation transporting P-type ATPase
HPF57_1419  1       11       1.617380   hypothetical protein
HPF57_1420  1       11       1.413506   riboflavin biosynthesis protein
HPF57_1421  1       12       1.295689   sodium/glutamate symport carrier
HPF57_1422  2       13       2.362686   hypothetical protein
HPF57_1423  1       13       1.518411   ferrodoxin-like protein
HPF57_1424  -1      12       1.335749   hypothetical protein
HPF57_1425  -1      12       1.393246   dihydroneopterin aldolase
HPF57_1426  -2      12       0.155019   iron-regulated outer membrane protein FrpB4
HPF57_1427  -2      10       -2.304987  putative IRON-regulated outer membrane protein
HPF57_1428  0       9        -4.356004  selenocysteine synthase
HPF57_1429  0       8        -4.372730  transcription elongation factor NusA
HPF57_1430  0       10       -4.834933  Type IIG restriction-modification enzyme
HPF57_1431  -1      10       -5.042281  Type IIG restriction-modification enzyme
HPF57_1432  0       13       -5.369889  hypothetical protein
HPF57_1433  1       11       -3.004811  Type III restriction enzyme
HPF57_1434  2       11       -2.649305  Type III DNA modification enzyme
HPF57_1435  1       10       -2.350709  Type III DNA modification enzyme
HPF57_1436  0       12       -1.491782  DNA recombinase
HPF57_1437  0       14       -1.075567  hypothetical protein
HPF57_1438  -1      13       -1.102436  hypothetical protein
HPF57_1439  0       11       -0.372595  exodeoxyribonuclease
HPF57_1440  1       11       0.012363   *periplasmic competence protein
HPF57_1441  2       14       0.144674   chromosomal replication initiation protein
HPF57_1442  2       18       -0.822830  purine nucleoside phosphorylase
HPF57_1443  1       14       -0.860407  hypothetical protein
HPF57_1444  0       12       -1.530480  D-fructose-6-phosphate amidotransferase
HPF57_1445  -1      12       -2.353235  FAD-dependent thymidylate synthase
HPF57_1446  -2      11       -0.297627  hypothetical protein
HPF57_1447  -3      11       -0.212774  Type I R-M system specificity subunit
HPF57_1448  -3      11       0.250824   Type I restriction enzyme M protein
HPF57_1449  -1      10       1.204685   Type I restriction enzyme R protein
HPF57_1450  2       13       3.827288   hypothetical protein
HPF57_1451  2       12       3.081286   iron(III) dicitrate transport protein
HPF57_1452  0       9        1.751092   arginase
HPF57_1453  1       8        1.952385   amino acid permease
HPF57_1454  -2      9        1.510675   amino acid permease
HPF57_1455  -2      9        -0.586903  alanine dehydrogenase
HPF57_1456  -1      9        -1.589504  hypothetical protein
HPF57_1457  1       11       -1.820609  outer membrane protein HorL
HPF57_1458  2       11       -2.112958  inorganic polyphosphate/ATP-NAD kinase
HPF57_1459  2       10       -2.211667  DNA repair protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPF57_1420  CARBMTKINASE  29     0.028    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 29.0 bits (65), Expect = 0.028
Identities = 15/43 (34%), Positives = 21/43 (48%), Gaps = 3/43 (6%)

Query: 246 ILSKHPIDPNSKVFSAPNRLVNAFYDP---KDLPLEKGFNFIE 285
I+++ +D N F P + V FYD K L EKG+ E
Sbjct: 113 IITQTIVDKNDPAFQNPTKPVGPFYDEETAKRLAREKGWIVKE 155
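
The "Score" and "Identities" lines of each hit can be read programmatically. The following is a minimal Python sketch, not part of PredictBias itself: the function name `parse_hit_summary` and the constant `MAX_EVALUE` are hypothetical, and the 0.05 cutoff is assumed from the "E-value (RPSBlast)" input parameter shown at the top of this report.

```python
import re

# Hypothetical cutoff, assumed from the "E-value (RPSBlast) 0.05" input parameter.
MAX_EVALUE = 0.05

SCORE_RE = re.compile(
    r"Score\s*=\s*([\d.]+)\s*bits\s*\((\d+)\),\s*Expect\s*=\s*(\S+)"
)
IDENT_RE = re.compile(r"Identities\s*=\s*(\d+)/(\d+)\s*\((\d+)%\)")


def parse_hit_summary(text):
    """Extract bit score, raw score, E-value and identity from one hit block."""
    score = SCORE_RE.search(text)
    ident = IDENT_RE.search(text)
    if score is None or ident is None:
        raise ValueError("no Score/Identities lines found")

    expect = score.group(3)
    # The report abbreviates values such as 1e-126 to "e-126".
    if expect.startswith("e"):
        expect = "1" + expect

    matches, length = int(ident.group(1)), int(ident.group(2))
    evalue = float(expect)
    return {
        "bits": float(score.group(1)),
        "raw_score": int(score.group(2)),
        "evalue": evalue,
        "identity": matches / length,          # fraction of identical positions
        "reported_percent": int(ident.group(3)),
        "passes_cutoff": evalue <= MAX_EVALUE,  # assumed cutoff, see above
    }


if __name__ == "__main__":
    block = (
        "Score = 29.0 bits (65), Expect = 0.028\n"
        "Identities = 15/43 (34%), Positives = 21/43 (48%), Gaps = 3/43 (6%)\n"
    )
    print(parse_hit_summary(block))
```

Recomputing the identity fraction from the match counts gives a value that can be compared against the rounded percentage printed in the report.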


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
HPF57_1441  HTHFIS  35     5e-04    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 35.2 bits (81), Expect = 5e-04
Identities = 9/51 (17%), Positives = 24/51 (47%), Gaps = 4/51 (7%)

Query: 125 TVYEIAKKVAQSDTPPYNPVLFYGGTGLGKTHILNAIGNHALEKHKKVVLV 175
+Y + ++ Q+D ++ G +G GK + A+ ++ ++ V +
Sbjct: 148 EIYRVLARLMQTDLT----LMITGESGTGKELVARALHDYGKRRNGPFVAI 194


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
HPF57_1454  PF06580  29     0.027    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 28.7 bits (64), Expect = 0.027
Identities = 21/121 (17%), Positives = 46/121 (38%), Gaps = 10/121 (8%)

Query: 72 TTGSFGDYASRFINPSTGYMVF--WMYWLSWVLTVAVEYIAIGLLMQRWFPTIPVYVWVI 129
T FG +AS + +P M+F + + VLT A + + +
Sbjct: 24 TLTGFG-FASLYGSPKLHSMIFNIAISLMGLVLTHAYRSFIKRQGWLKLNMGQIILRVLP 82

Query: 130 ACIALLFLLNFFSVKIFATGEFLLSTIKVLAVFVFIMLGCIGIVYSFYLHGFEGVFANFY 189
AC+ + + + I+ F+ + F + + I+++ + F +++ Y
Sbjct: 83 ACVVIGMVWFVANTSIWRLLAFINT-----KPVAFTLPLALSIIFNVVVVTF--MWSLLY 135

Query: 190 F 190
F
Sbjct: 136 F 136


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
HPF57_1459  GPOSANCHOR  29     0.041    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 29.3 bits (65), Expect = 0.041
Identities = 28/158 (17%), Positives = 52/158 (32%), Gaps = 4/158 (2%)

Query: 142 LDGYIQNKNKAFNPLLGALEERFTRLENLEKE-RRLLEDKKRFQKDLEERLNFEKMKLEK 200
L+ + KA + +++ LE E L K +K LE +NF K
Sbjct: 118 LEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAK 177

Query: 201 LDLKEDEYERLLEQKKLLSSKEKLNDKIALALEVLENTHKITHALESMGHSAEFLKSALL 260
+ E E L ++ L + + A + ++ L+ AL
Sbjct: 178 IKTLEAEKAALEARQAELEKALEGAMNFSTADS--AKIKTLEAEKAALAARKADLEKALE 235

Query: 261 EASALLEKEQAKLEECERLDIEKVLERLGALSGIIKDY 298
A + AK++ E + + R L ++
Sbjct: 236 GAMNFSTADSAKIKTLEA-EKAALEARQAELEKALEGA 272


15  HPF57_0262  HPF57_0269  N        Y        N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias    %GCBias  Product
HPF57_0262      -2       14   0.893906  neutrophil activating protein (napA)
HPF57_0263      -3       12   0.656626  histidine kinase sensor protein
HPF57_0264      -2       11   1.583513  hypothetical protein
HPF57_0265      -2       11   2.011497  flagellar basal body P-ring protein
HPF57_0266      -3       11   2.045421  ATP-dependent RNA helicase
HPF57_0267      -3       10   1.930051  hypothetical protein
HPF57_0268      -3        9   2.031047  hypothetical protein
HPF57_0269      -3       10   2.016413  oligopeptide permease ATPase protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
HPF57_0262  HELNAPAPROT  148    7e-49    Helicobacter neutrophil-activating protein A family ...
		>HELNAPAPROT#Helicobacter neutrophil-activating protein A family signature.

Length = 153

Score = 148 bits (375), Expect = 7e-49
Identities = 39/140 (27%), Positives = 75/140 (53%), Gaps = 1/140 (0%)

Query: 5 EILKHLQADAIVLFMKVHNFHWNVKGTDFFNVHKATEEIYEGFADMFDDLAERIVQLGHH 64
L ++ +L+ K+H FHW VKG FF +H+ EE+Y+ A+ D +AER++ +G
Sbjct: 15 NSLNTQLSNWFLLYSKLHRFHWYVKGPHFFTLHEKFEELYDHAAETVDTIAERLLAIGGQ 74

Query: 65 PLVTLSEAIKLTRVKEETKTSFHSKDIFKEILEDYKHLEKEFKELSNTAEKEGDKVTVTY 124
P+ T+ E + + + + + ++ + ++ DYK + E K + AE+ D T
Sbjct: 75 PVATVKEYTEHASITDGGNET-SASEMVQALVNDYKQISSESKFVIGLAEENQDNATADL 133

Query: 125 ADDQLVKLQKSIWMLQAHLA 144
+ +++K +WML ++L
Sbjct: 134 FVGLIEEVEKQVWMLSSYLG 153


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
HPF57_0263  PF06580  30     0.015    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 29.8 bits (67), Expect = 0.015
Identities = 10/71 (14%), Positives = 25/71 (35%), Gaps = 13/71 (18%)

Query: 280 IVLQNFLYNAIDAIEALEESEQ-GQVKIEAFIQNEFIVFTIIDNGKEVENKSALFEPFET 338
+++Q + N I + + Q G++ ++ N + + + G +
Sbjct: 258 MLVQTLVENGI--KHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTK------- 308

Query: 339 TKLKGNGLGLA 349
+ G GL
Sbjct: 309 ---ESTGTGLQ 316


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPF57_0265  FLGPRINGFLGI  360    e-126    Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 360 bits (924), Expect = e-126
Identities = 120/345 (34%), Positives = 192/345 (55%), Gaps = 26/345 (7%)

Query: 19 AEKIGDIASVVGVRDNQLIGYGLVIGLNGTGDK-SGSKFTMQSISNMLESVNVKISADDI 77
+I DIAS+ RDNQLIGYGLV+GL GTGD S FT QS+ ML+++ +
Sbjct: 28 TSRIKDIASLQAGRDNQLIGYGLVVGLQGTGDSLRSSPFTEQSMRAMLQNLGITTQGGQS 87

Query: 78 KSKNVAAVMITASLPPFARQGDKIDVQISSIGDAKSIQGGTLVMTPLNAVDGNIYALAQG 137
+KN+AAVM+TA+LPPFA G ++DV +SS+GDA S++GG L+MT L+ DG IYA+AQG
Sbjct: 88 NAKNIAAVMVTANLPPFASPGSRVDVTVSSLGDATSLRGGNLIMTSLSGADGQIYAVAQG 147

Query: 138 AITSGNSS-----------NLLSANIINGATIEREVSYDLFHKNAMVLSLKNPNFKNAIQ 186
A+ S SA + NGA IERE+ +VL L+NP+F A++
Sbjct: 148 ALIVNGFSAQGDAATLTQGVTTSARVPNGAIIERELPSKFKDSVNLVLQLRNPDFSTAVR 207

Query: 187 VQNTLNKV----FGNKVAIALDPKTIQITRPERFSMVEFLALVQEIPINYSAKNKIIVDE 242
V + +N +G+ +A D + I + +P + +A ++ + + K++++E
Sbjct: 208 VADVVNAFARARYGDPIAEPRDSQEIAVQKPRVADLTRLMAEIENLTVETDTPAKVVINE 267

Query: 243 KSGTIVSGVDIMVHPIVVTSQDITLKITKEPLDN--------SKNAQDLDNNMSLDTAHN 294
++GTIV G D+ + + V+ +T+++T+ P Q + M++
Sbjct: 268 RTGTIVIGADVRISRVAVSYGTLTVQVTESPQVIQPAPFSRGQTAVQPQTDIMAMQEGSK 327

Query: 295 TLSSNGKNITIAGVVKALQKIGVSAKGMVSILQALKKSGAISAEM 339
G ++ +V L IG+ A G+++ILQ +K +GA+ AE+
Sbjct: 328 VAIVEGPDLR--TLVAGLNSIGLKADGIIAILQGIKSAGALQAEL 370


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
HPF57_0269  HTHFIS  32     0.006    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 32.1 bits (73), Expect = 0.006
Identities = 16/50 (32%), Positives = 21/50 (42%), Gaps = 7/50 (14%)

Query: 30 VAIVGESGSGKSSIANIIMRLNPR----FKPHNGEVLFETANLLKESEAF 75
+ I GESG+GK +A + R F N + L ESE F
Sbjct: 163 LMITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPRD---LIESELF 209


16  HPF57_0373  HPF57_0380  N        Y        N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias    %GCBias  Product
HPF57_0373       1       12  -1.247387  guanylate kinase
HPF57_0374       1       11  -1.348479  poly E-rich protein
HPF57_0375      -2       11  -1.841950  membrane bound endonuclease
HPF57_0376       0       11  -1.878311  outer membrane protein HorC
HPF57_0377       3       13  -1.787345  flagellar basal body L-ring protein
HPF57_0378       4       13  -1.597750  CMP-N-acetylneuraminic acid synthetase
HPF57_0379       3       12  -0.918663  CMP-N-acetylneuraminic acid synthetase
HPF57_0380       3       12  -0.808419  flagellar biosynthesis protein G
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
HPF57_0373  PF05272  29     0.011    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.011
Identities = 9/18 (50%), Positives = 11/18 (61%)

Query: 8 LILSGPSGAGKSTLTKYL 25
++L G G GKSTL L
Sbjct: 599 VVLEGTGGIGKSTLINTL 616


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
HPF57_0374  IGASERPTASE  67     1e-13    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 66.6 bits (162), Expect = 1e-13
Identities = 59/293 (20%), Positives = 92/293 (31%), Gaps = 25/293 (8%)

Query: 140 ELENLGDLEALVQEEPNNEEQLLPTLNDQEEKEEVKEEEKEEVKETPQEEKPKDDETQES 199
E+E N Q EE V E P P E+
Sbjct: 984 EVEKRNQTVDTTNITTPNNIQA------DVPSVPSNNEEIARVDEAP-VPPPAPATPSET 1036

Query: 200 ETLKDEEVSKELETQEELEIPKEETQEQAKEQEPIKEETQEIKEEKQEKTQEEVK---EE 256
E +E +T E+ E ET Q +E KE +K Q + +E
Sbjct: 1037 TETVAENSKQESKTVEKNEQDATETTAQNREVA--KEAKSNVKANTQTNEVAQSGSETKE 1094

Query: 257 TQEQVKEQEPIKKETQEIKEEKQEKTQDSPSVQELEAMQELVKEIQENSNGQEDKKETQE 316
TQ ++ ++ ++ K E EKTQ+ P V + ++ E + + +
Sbjct: 1095 TQTTETKETATVEKEEKAKVET-EKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTV 1153

Query: 317 NTETPQETPQDIEVQESAETPQEIPQEKEIPQEKEIPQEKEIPQEKETQKLETPQETSQE 376
N + PQ Q + E P KE E P + +E P+ T+
Sbjct: 1154 NIKEPQS-------QTNTTADTEQPA-KETSSNVEQPVTESTTVNTGNSVVENPENTTPA 1205

Query: 377 SAEKTQKLETQE----DHYESIEDIPEPVMAQAMGEELPFLNESVAKTSNNEN 425
+ + T E+ H S+ +P V TS N N
Sbjct: 1206 TTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCDLTSTNTN 1258



Score = 55.8 bits (134), Expect = 3e-10
Identities = 51/252 (20%), Positives = 76/252 (30%), Gaps = 17/252 (6%)

Query: 202 LKDEEVSKELETQEELEIPKEETQEQAKEQEPIK-EETQEIKEEKQEKTQEEVKEETQEQ 260
L + EV K +T + I + P EE + E ET E
Sbjct: 980 LYNPEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTET 1039

Query: 261 VKEQEPIKKETQEIKEEKQEKTQDSPSVQELEAMQELVKEIQENSNGQEDKKETQENTET 320
V E + +T E E+ +T EA + Q N Q ET+E T
Sbjct: 1040 VAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQS-GSETKETQTT 1098

Query: 321 PQETPQDIEVQESA----ETPQEIPQEKEIPQEKEIPQEKEIPQEKETQKLETPQETSQE 376
+ +E +E A E QE+P+ K+ E PQ E + P +E
Sbjct: 1099 ETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQ-AEPARENDPTVNIKE 1157

Query: 377 SAEKTQKLETQEDHYESIEDIPEPVMAQAMGEELPFLNESVAKTSNNENDTETSKESVIK 436
+T E + E P + T N+ + +
Sbjct: 1158 PQSQTNTTADTEQPAKETSSNVEQ----------PVTESTTVNTGNSVVENPENTTPATT 1207

Query: 437 TPQEKEESDKTS 448
P ES
Sbjct: 1208 QPTVNSESSNKP 1219


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPF57_0377  FLGLRINGFLGH  190    4e-63    Flagellar L-ring protein signature.
		>FLGLRINGFLGH#Flagellar L-ring protein signature.

Length = 232

Score = 190 bits (485), Expect = 4e-63
Identities = 51/172 (29%), Positives = 84/172 (48%), Gaps = 18/172 (10%)

Query: 56 GERPLFADRRAMKPNDLITIIVSEKASANYSSS----KDYKSASGGNSTPPRLTYNGLDE 111
G +PLF DRR D +TI++ E SA+ SSS +D K+ G ++ P L GL
Sbjct: 61 GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYL--QGLFG 118

Query: 112 RKKQEAQYLDDKNNYNFTKSSNNTNFKGGGSQKKSEDLEIVLSARIIKVLENGNYFIYGN 171
+ + + S F G G S L+ + +VL NGN + G
Sbjct: 119 NARADVEA------------SGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGE 166

Query: 172 KEVLVDGEKQILKVSGVIRPYDIERNNTIQSKFLADAKIEYTNLGHLSDSNK 223
K++ ++ + ++ SGV+ P I +NT+ S +ADA+IEY G+++++
Sbjct: 167 KQIAINQGTEFIRFSGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQN 218


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPF57_0380  SACTRNSFRASE  28     0.015    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 28.0 bits (62), Expect = 0.015
Identities = 15/49 (30%), Positives = 21/49 (42%), Gaps = 3/49 (6%)

Query: 102 RGETILKALERIAFE---EFQLNSLHLEVMENNFKAIAFYEKNHYELEG 147
R + + AL A E E L LE + N A FY K+H+ +
Sbjct: 102 RKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFIIGA 150


17  HPF57_0401  HPF57_0409  N        Y        N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias    %GCBias  Product
HPF57_0401      -3       10   0.923348  flagellar MS-ring protein
HPF57_0402      -3       10   1.256756  flagellar motor switch protein G
HPF57_0403      -2       10   1.244767  flagellar assembly protein H
HPF57_0404       0        9   1.571664  1-deoxy-D-xylulose-5-phosphate synthase
HPF57_0405       0       11   1.160870  GTP-binding protein LepA
HPF57_0406       0       13  -0.873494  DNA-cytosine methyltransferase
HPF57_0407      -1       11   0.831870  hypothetical protein
HPF57_0408       0       13   0.496979  flagellar basal-body rod protein
HPF57_0409       1       11  -0.036376  alpha-ketoglutarate permease
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPF57_0401  FLGMRINGFLIF  548    0.0      Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 548 bits (1413), Expect = 0.0
Identities = 177/582 (30%), Positives = 292/582 (50%), Gaps = 66/582 (11%)

Query: 11 VDFFIKLNKKQKIALIAAGVLITALLVFLLLYPFKEKDYAQGGYGVLFERLDSSDNALIL 70
+++ +L +I LI AG A++V ++L+ K DY LF L D I+
Sbjct: 13 LEWLNRLRANPRIPLIVAGSAAVAIVVAMVLWA-KTPDYR-----TLFSNLSDQDGGAIV 66

Query: 71 QHLQQNQIPYKILKDD-TILIPKDKVYEERITLASQGIPKTSKVGFEIFDTKDFGATDFD 129
L Q IPY+ I +P DKV+E R+ LA QG+PK VGFE+ D + FG + F
Sbjct: 67 AQLTQMNIPYRFANGSGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQFS 126

Query: 130 QNIKLIRAIEGELSRTIESLNPILKANVHIAIPKDSVFVAKEVPPSASVMLKLKPDMKLS 189
+ + RA+EGEL+RTIE+L P+ A VH+A+PK S+FV ++ PSASV + L+P L
Sbjct: 127 EQVNYQRALEGELARTIETLGPVKSARVHLAMPKPSLFVREQKSPSASVTVTLEPGRALD 186

Query: 190 PTQILGIKNLIAAAVPKLTTENVKIVNENGESIGEGDILENSKELALEQLRYKQNFENIL 249
QI + +L+++AV L NV +V+++G + + + + ++L QL++ + E+ +
Sbjct: 187 EGQISAVVHLVSSAVAGLPPGNVTLVDQSGHLLTQSNT--SGRDLNDAQLKFANDVESRI 244

Query: 250 ENKIVNILAPIVGGKNKVVARVNAEFDFSQKKSTKETFDPNN-----VVRSEQNLEEKKE 304
+ +I IL+PIVG N V A+V A+ DF+ K+ T+E + PN +RS Q ++
Sbjct: 245 QRRIEAILSPIVGNGN-VHAQVTAQLDFANKEQTEEHYSPNGDASKATLRSRQLNISEQV 303

Query: 305 GTPKKQVGGVPGVVSN-IGPVQGLKDNKEPEKYEKSQN---------------------- 341
G GGVPG +SN P P + +QN
Sbjct: 304 GAGYP--GGVPGALSNQPAPPNEAPIATPPTNQQNAQNTPQTSTSTNSNSAGPRSTQRNE 361

Query: 342 TTNYEVGKTISEIKGEFGTLVRLNAAVVVDGKYKIALKDGANTLEYEPLSDESLKKINAL 401
T+NYEV +TI K G + RL+ AVVV+ K L DG + PL+ + +K+I L
Sbjct: 362 TSNYEVDRTIRHTKMNVGDIERLSVAVVVNYK---TLADG----KPLPLTADQMKQIEDL 414

Query: 402 VKQAIGYNQNRGDDVAVSNFEFNPMAPMLDNATLSEKIMHKTQKVLGSFTPLIKYILAFI 461
++A+G++ RGD + V N F+ + T E + Q + +++L +
Sbjct: 415 TREAMGFSDKRGDTLNVVNSPFSAVDN-----TGGELPFWQQQSFIDQLLAAGRWLLVLV 469

Query: 462 VLFIFYKKVIVPFSERMLEVVPDEDKEVKSMFEEMDEEEDELNKLGDLRKKVEDQLGLNA 521
V +I ++K + P R +E ++ + E + E L+K L+++ +Q
Sbjct: 470 VAWILWRKAVRPQLTRRVEEAKAAQEQAQVRQETEEAVEVRLSKDEQLQQRRANQ----- 524

Query: 522 TFSEEEVRYEIVLEKIRGTLKERPDEIAMLFKLLIKDEISSD 563
+ E++ ++IR E D + L+I+ +S+D
Sbjct: 525 -----RLGAEVMSQRIR----EMSDNDPRVVALVIRQWMSND 557


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPF57_0402  FLGMOTORFLIG  347    e-121    Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 347 bits (893), Expect = e-121
Identities = 121/338 (35%), Positives = 208/338 (61%), Gaps = 4/338 (1%)

Query: 8 KQKAQLDELSMSEKIAILLIQVGEDATGEILRHLDIDSITEISKQIVQLNGTDKQIGAAV 67
K+ + L+ +K AILL+ +G + + ++ ++L + I ++ +I +L ++ V
Sbjct: 7 KEILDVSALTGKQKAAILLVSIGSEISSKVFKYLSQEEIESLTFEIAKLETITSELKDNV 66

Query: 68 LEEFFAIFQSNQYINTGGLEYARELLTRTLGSEEARKVMDKLTKSLQTQKNFAYLGKIKP 127
L EF + + ++I GG++YARELL ++LG+++A +++ L +LQ+ + F ++ + P
Sbjct: 67 LLEFKELMMAQEFIQKGGIDYARELLEKSLGTQKAVDIINNLGSALQS-RPFEFVRRADP 125

Query: 128 QQLADFIINEHPQTIALILAHMEAPNAAETLSYFPDEMKAEISIRMANLGEISPQVVKRV 187
+ +FI EHPQTIALIL++++ A+ LS P E++ ++ R+A + SP+VV+ V
Sbjct: 126 ANILNFIQQEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEVVREV 185

Query: 188 STVLENKLESLTSYK-IEVGGLRAVAEIFNRLGQKSAKTTLARIESVDNKLAGAIKEMMF 246
VLE KL SL+S GG+ V EI N +K+ K + +E D +LA IK+ MF
Sbjct: 186 ERVLEKKLASLSSEDYTSAGGVDNVVEIINMADRKTEKFIIESLEEEDPELAEEIKKKMF 245

Query: 247 TFEDISKLDNFAIREILKVADKKDLSLALKTSTQDLTDKFLNNMSSRAAEQFVEEMQYLG 306
FEDI LD+ +I+ +L+ D ++L+ ALK+ + +K NMS RAA E+M++LG
Sbjct: 246 VFEDIVLLDDRSIQRVLREIDGQELAKALKSVDIPVQEKIFKNMSKRAASMLKEDMEFLG 305

Query: 307 AVKIKDVDVAQRKIIEIVQSLQEKG--VIQTGEEEDVI 342
+ KDV+ +Q+KI+ +++ L+E+G VI G EEDV+
Sbjct: 306 PTRRKDVEESQQKIVSLIRKLEEQGEIVISRGGEEDVL 343



Score = 30.9 bits (70), Expect = 0.006
Identities = 20/103 (19%), Positives = 41/103 (39%), Gaps = 3/103 (2%)

Query: 4 KLTPKQKAQLDELSMSEKIAILLIQVGEDATGEILRHLDIDSITEISKQIVQLNGTDKQI 63
+ P + + IA++L + IL L + T ++++I ++ T ++
Sbjct: 122 RADPANILNFIQQEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEV 181

Query: 64 GAA---VLEEFFAIFQSNQYINTGGLEYARELLTRTLGSEEAR 103
VLE+ A S Y + GG++ E++ E
Sbjct: 182 VREVERVLEKKLASLSSEDYTSAGGVDNVVEIINMADRKTEKF 224


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
HPF57_0403  FLGFLIH  37     5e-05    Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 36.7 bits (84), Expect = 5e-05
Identities = 44/207 (21%), Positives = 91/207 (43%), Gaps = 14/207 (6%)

Query: 48 PLEKKAIENDLIDCLLKKTDELSSHLVKLQMQFEKAQEES-KALIENAKNDGYKIGFKEG 106
E I + + L L +LQMQ A E+ +A I + G+K G++EG
Sbjct: 19 QAEFVPIVEPEETIIEEAEPSLEQQLAQLQMQ---AHEQGYQAGIAEGRQQGHKQGYQEG 75

Query: 107 EEKMRNELTHSVNEEKNQLLHAITALDEKMKKSEDHLMALE----KELSAIAIDIAKEVI 162
+ L + E K+Q + + + + + L AL+ L +A++ A++VI
Sbjct: 76 ---LAQGLEQGLAEAKSQQAPIHARMQQLVSEFQTTLDALDSVIASRLMQMALEAARQVI 132

Query: 163 LKEVEDNSQKVALALAEELLKNVLDATDIHLKVNPLDYPYLNERLQNASKI---KLESNE 219
+ ++ + + + L + L + L+V+P D +++ L + +L +
Sbjct: 133 GQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGATLSLHGWRLRGDP 192

Query: 220 AISKGGVMITSSNGSLDGNLMERFKTL 246
+ GG +++ G LD ++ R++ L
Sbjct: 193 TLHPGGCKVSADEGDLDASVATRWQEL 219


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
HPF57_0405  TCRTETOQM  140    2e-37    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 140 bits (354), Expect = 2e-37
Identities = 99/437 (22%), Positives = 174/437 (39%), Gaps = 85/437 (19%)

Query: 9 NIRNFSIIAHIDHGKSTLADCLIAECNAIS---NREMTSQVMDTMDIEKERGITIKAQSV 65
I N ++AH+D GK+TL + L+ AI+ + + + D +E++RGITI+
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 66 RLNYTLKGEDYVLNLIDTPGHVDFSYEVSRSLCSCEGALLVVDATQGVEAQTIANTYIAL 125
+ E+ +N+IDTPGH+DF EV RSL +GA+L++ A GV+AQT +
Sbjct: 62 SFQW----ENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALR 117

Query: 126 DNHLEILPVINKIDLPNANVVEVKQDIEDTIGIDCSSANEVSAKARLGIKD--------- 176
+ + INKID ++ V QDI++ + + +V + + +
Sbjct: 118 KMGIPTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDT 177

Query: 177 -------LLEKIITTIPAPSGDFNAPLKALIYD-------------------------SW 204
LLEK ++ + + ++ +
Sbjct: 178 VIEGNDDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVITNK 237

Query: 205 F--------------------DNYLGALALVRIMDGSINTEQEILVMGTGKKHGVLGLYY 244
F LA +R+ G ++ + + K + +Y
Sbjct: 238 FYSSTHRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEKEKI-KITEMYT 296

Query: 245 PNPLKKIPTKSLECGEIGIV---SLGLKSVTDIAVGDTLTDAKNPTPKPIEGFMPAKPFV 301
+ GEI I+ L L SV +GDT P + IE P +
Sbjct: 297 SINGELCKIDKAYSGEIVILQNEFLKLNSV----LGDTKLL---PQRERIEN---PLPLL 346

Query: 302 FAGLYPIETDRFEDLREALLKLQLNDCALNFEPESSVALGFGFRVGFLGLLHMEVIKERL 361
+ P + + E L +ALL++ +D L + +S+ + FLG + MEV L
Sbjct: 347 QTTVEPSKPQQREMLLDALLEISDSDPLLRYYVDSATH---EIILSFLGKVQMEVTCALL 403

Query: 362 EREFGLNLIATAPTVVY 378
+ ++ + + PTV+Y
Sbjct: 404 QEKYHVEIEIKEPTVIY 420



Score = 31.0 bits (70), Expect = 0.015
Identities = 15/75 (20%), Positives = 28/75 (37%), Gaps = 2/75 (2%)

Query: 405 IKEPFVRATIITPSEFLGNLMQLLNNKRGIQEKMEYLNQSRVMLTYSLPSNEIVMDFYDK 464
+ EP++ I P E+L + L + V+L+ +P+ I ++
Sbjct: 535 LLEPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQ-LKNNEVILSGEIPARCI-QEYRSD 592

Query: 465 LKSCTKGYASFDYEP 479
L T G + E
Sbjct: 593 LTFFTNGRSVCLTEL 607


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
HPF57_0408  FLGHOOKAP1  30     0.008    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 30.3 bits (68), Expect = 0.008
Identities = 9/40 (22%), Positives = 16/40 (40%)

Query: 3 NGYYAATGAMATQFNRLDLTSNNLANLNTNGFKRDDAITG 42
+ A + L+ SNN+++ N G+ R I
Sbjct: 2 SLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMA 41


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
HPF57_0409  TCRTETB  40     2e-05    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 39.9 bits (93), Expect = 2e-05
Identities = 57/315 (18%), Positives = 104/315 (33%), Gaps = 67/315 (21%)

Query: 37 APYFAKEFTHTNDPTLALISAFLVFMLGFFMRPLGSLFFGKLGDKKGRKTSMVYSIILMA 96
P A +F T + +AF++ G+ +GKL D+ G K +++ II+
Sbjct: 37 LPDIANDFNKPPASTNWVNTAFMLTFSI------GTAVYGKLSDQLGIKRLLLFGIIINC 90

Query: 97 LGSFLLALLPTKEIVGEWAFLFLLLARLLQGFSVGGE------YGVVATYLSELGKNGKK 150
GS + VG F L++AR +QG G VVA Y+ + +
Sbjct: 91 FGSVIGF-------VGHSFFSLLIMARFIQG--AGAAAFPALVMVVVARYIPKENRGKAF 141

Query: 151 GFYGSFQYVTLVGGQLLAIFSLFIVENIYTHEQISAFAWRYLFALGGVLALLSLFLRNIM 210
G GS + +G + I I+ W YL + + + FL ++
Sbjct: 142 GLIGS---IVAMGEGVGPAIGGMIAHYIH---------WSYLLLIPMITIITVPFLMKLL 189

Query: 211 EETMDSGITSKTTIKEKTQRGSLKELLNHKKALM-------IVFGLTMGGSLCFYTFTVY 263
+ + +K + K ++ + T +
Sbjct: 190 K-----------------KEVRIKGHFDIKGIILMSVGIVFFMLFTTSYSISFLIVSVLS 232

Query: 264 LKIFLTNSSSFSPK-------ESSFIMLLALSYFIFLQPLCG---MLADKIKRTQMLMVF 313
IF+ + + ++ M+ L I + G M+ +K L
Sbjct: 233 FLIFVKHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTA 292

Query: 314 AIAGLIVTPVVFYGI 328
I +I+ P I
Sbjct: 293 EIGSVIIFPGTMSVI 307


18  HPF57_0541  HPF57_0552  N        Y        N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias    %GCBias  Product
HPF57_0541       1        9  -1.361486  urease-enhancing factor
HPF57_0542       0       10  -1.419539  glutamine synthetase
HPF57_0543       0       11  -2.164385  hypothetical protein
HPF57_0544      -2        8  -0.801848  50S ribosomal protein L9
HPF57_0545      -2       10  -0.975850  ATP-dependent protease peptidase subunit
HPF57_0546      -2       12  -1.727200  ATP-dependent protease ATP-binding subunit
HPF57_0547       2       15  -1.746553  GTP-binding protein Era
HPF57_0548       3       13  -1.815225  conserved hypothetical secreted protein
HPF57_0549       5       18  -1.761831  hypothetical protein
HPF57_0550       9       19  -1.727650  cag pathogenicity island protein
HPF57_0551       9       19  -2.092195  cag pathogenicity island protein
HPF57_0552       9       15  -2.092239  cag pathogenicity island protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPF57_0541  VACJLIPOPROT  26     0.003    VacJ lipoprotein signature.
		>VACJLIPOPROT#VacJ lipoprotein signature.

Length = 251

Score = 26.0 bits (57), Expect = 0.003
Identities = 10/23 (43%), Positives = 15/23 (65%)

Query: 1 MKTIRNSVFIGASLLGGCASVET 23
MK +++ +G +LL GCAS T
Sbjct: 1 MKLRLSALALGTTLLVGCASSGT 23


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
HPF57_0545  PF07520  29     0.010    Virulence protein SrfB
		>PF07520#Virulence protein SrfB

Length = 1041

Score = 29.2 bits (65), Expect = 0.010
Identities = 14/49 (28%), Positives = 23/49 (46%), Gaps = 4/49 (8%)

Query: 121 LEAEDNKIAAIGSGG---NFALSAARALDNFAHLEPRKLVEESLKIAGD 166
E+ ++A I GG + ++ R DN L P + E ++AGD
Sbjct: 590 GESPSLRLACIDVGGGTTDLMVTTYRGEDNRV-LHPEQTFREGFRVAGD 637


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
HPF57_0546  HTHFIS  29     0.047    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 29.0 bits (65), Expect = 0.047
Identities = 11/36 (30%), Positives = 18/36 (50%), Gaps = 3/36 (8%)

Query: 51 TPKNILMIGSTGVGKTEIARRI---AKIMKLPFVKV 83
T +++ G +G GK +AR + K PFV +
Sbjct: 159 TDLTLMITGESGTGKELVARALHDYGKRRNGPFVAI 194


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
HPF57_0547  PF03944  32     0.003    delta endotoxin
		>PF03944#delta endotoxin

Length = 633

Score = 32.0 bits (72), Expect = 0.003
Identities = 25/94 (26%), Positives = 47/94 (50%), Gaps = 3/94 (3%)

Query: 68 LHHQEKLLNQCMLSQALKAMGDAELCVFLASVHDDLKGYEEFLSLYQKPHILALSKIDTA 127
L E+ LNQ + + + A +AEL A+V + + + FL+ + L+++
Sbjct: 94 LRETERFLNQRLNTDTV-ARVNAELTGLQANVEEFNRQVDNFLNPNRNAVPLSITSSVNT 152

Query: 128 THKQVLQKLQEYQQYASQFLALVPLSAKKSQNLN 161
+ L +L ++Q Q L L+PL A+ + NL+
Sbjct: 153 MQQLFLNRLPQFQMQGYQLL-LLPLFAQAA-NLH 184


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
HPF57_0552  PF07201  29     0.025    Hypersensitivity response secretion protein HrpJ
		>PF07201#Hypersensitivity response secretion protein HrpJ

Length = 293

Score = 29.4 bits (66), Expect = 0.025
Identities = 14/76 (18%), Positives = 26/76 (34%), Gaps = 15/76 (19%)

Query: 277 APENSKEKLIEELIANSQLIANEEEREKKLLAEKEKQ--------EAELAKY--KLKDLE 326
S + EE+ E +E L K E ++ +Y K+ +LE
Sbjct: 44 GTLQSIADMAEEVTF-----VFSERKELSLDKRKLSDSQARVSDVEEQVNQYLSKVPELE 98

Query: 327 NQKKLKALEAELKKKN 342
++ + L + L
Sbjct: 99 QKQNVSELLSLLSNSP 114


19  HPF57_0604  HPF57_0610  N        Y        N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias    %GCBias  Product
HPF57_0604       2       14  -1.242030  hypothetical protein
HPF57_0605       1       14  -0.486639  hypothetical protein
HPF57_0606       0       16  -0.232249  dihydroorotase
HPF57_0607       1       17  -2.448079  hypothetical protein
HPF57_0608      -1       16  -2.519159  hypothetical protein
HPF57_0609      -1       15  -1.806161  flagellar motor switch protein
HPF57_0610      -1       14  -0.670562  endonuclease III
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPF57_0604  TYPE3IMSPROT  31     0.002    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 30.9 bits (70), Expect = 0.002
Identities = 19/64 (29%), Positives = 30/64 (46%), Gaps = 4/64 (6%)

Query: 88 LQSYSVMLFFNLLLLIDILGFLPFSIYHHFMASLIFSALFCSSLFLSSPLLGMIALMALS 147
L Y F L+L+ +LPFS S + + +L PLL + ALMA++
Sbjct: 45 LSDYYFEHFSKLMLIPAEQSYLPFSQ----ALSYVVDNVLLEFFYLCFPLLTVAALMAIA 100

Query: 148 SSLL 151
S ++
Sbjct: 101 SHVV 104


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
HPF57_0607  PF03544  48     9e-09    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 48.4 bits (115), Expect = 9e-09
Identities = 38/229 (16%), Positives = 74/229 (32%), Gaps = 37/229 (16%)

Query: 81 PGAPKPTLAGPQKPPTPPTPPTPPTPPKPIEKPKPKPKPKPEPKKPNHKHKALKKVEKVE 140
P +P P P PP+P+ +P+P+P+P PEP K
Sbjct: 44 PAPAQPISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKE-------------- 89

Query: 141 EKKVVEEKKEEKKIVEQKVDQKKIEEKKPVKKEFDPNQLSFLPKEVAPPRQENNKGLDNQ 200
V +K + K + KK+E+ K +V P +N
Sbjct: 90 --APVVIEKPKPKPKPKPKPVKKVEQPKR---------------DVKPVESRPASPFENT 132

Query: 201 TRRDIDELYGEEFGDLGTAEKDFIRNNLRDIGRITQKYLEYPQVAAYLGQDGTNAVEFYL 260
+++ +YP A L +G V+F +
Sbjct: 133 APARPTSSTATAATSKPVTSVA------SGPRALSRNQPQYPARAQALRIEGQVKVKFDV 186

Query: 261 HPNGDISDLKIIIGSEYKMLDDNTLKTIQIAYKDYPRPKTKTLIRIRVR 309
P+G + +++I+ M + ++ + +P + ++ I +
Sbjct: 187 TPDGRVDNVQILSAKPANMFEREVKNAMRRWRYEPGKPGSGIVVNILFK 235


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPF57_0609  FLGMOTORFLIN  99     2e-30    Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 99 bits (249), Expect = 2e-30
Identities = 25/77 (32%), Positives = 47/77 (61%)

Query: 34 LICDYKNLLDMEIVFSAELGSTQIPLLQILRFEKGSVIDLQKPAGESVDTFVNGRVIGKG 93
+ D ++D+ + + ELG T++ + ++LR +GSV+ L AGE +D +NG +I +G
Sbjct: 50 AMQDIDLIMDIPVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQG 109

Query: 94 EVMVFERNLAIRLNEIL 110
EV+V +R+ +I+
Sbjct: 110 EVVVVADKYGVRITDII 126


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
HPF57_0610  OMS28PORIN  29     0.010    OMS28 porin signature.
		>OMS28PORIN#OMS28 porin signature.

Length = 257

Score = 29.4 bits (65), Expect = 0.010
Identities = 26/109 (23%), Positives = 52/109 (47%), Gaps = 5/109 (4%)

Query: 25 NQTTELHHKNPYELLVATILSAQCTDARVNKITPKLFEKYPSVKDLALTSLE--EVKEII 82
N+ E+ K E A ++ + T +I + K P+ K+L LT E +V+++
Sbjct: 132 NKVVEMSKKAVQETQKAVSVAGEATFLIEKQI---MLNKSPNNKELELTKEEFAKVEQVK 188

Query: 83 KSVSYFNNKSKHLISMAQKVVRDFKGVIPSTQKELMSLDGVGQKTANVV 131
+++ + AQKV+ G+ PS + ++++ V + +NVV
Sbjct: 189 ETLMASERALDETVQEAQKVLNMVNGLNPSNKDQVLAKKDVAKAISNVV 237


20  HPF57_0626  HPF57_0644  N        Y        N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias    %GCBias  Product
HPF57_0626      -3       10   0.544030  flagellin A
HPF57_0627      -3       11   0.890606  3-methyladenine DNA glycosylase
HPF57_0628      -1       12   1.267983  hypothetical protein
HPF57_0629       0        9   0.451153  uroporphyrinogen decarboxylase
HPF57_0630       1        8  -0.044180  outer membrane protein HefA
HPF57_0631       2        8  -0.138596  membrane fusion protein of the hefABC efflux
HPF57_0632       1        8  -0.260019  cytoplasmic pump protein of the hefABC efflux
HPF57_0633       1       10  -1.131675  hypothetical protein
HPF57_0634       0       10  -1.126484  putative vacuolating cytotoxin (VacA)-like
HPF57_0635      -2       13  -1.915362  ABC transporter, permease
HPF57_0636      -2       11  -0.743638  putative abc transporter, ATP-binding protein
HPF57_0637      -2       11  -1.007870  hypothetical protein
HPF57_0638      -1       11  -0.972934  NAD-dependent DNA ligase LigA
HPF57_0639      -1       12  -1.430149  chemotaxis protein
HPF57_0640      -1       12  -1.285882  aspartyl-tRNA synthetase
HPF57_0641      -1       12  -2.148861  adenylate kinase
HPF57_0642       0       13  -2.385685  putative lipopolysaccharide biosynthesis
HPF57_0643       0       11  -1.637061  putative lipopolysaccharide biosynthesis
HPF57_0644       1       11  -1.271263  putative lipopolysaccharide biosynthesis
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
HPF57_0626  FLAGELLIN  245    3e-77    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 245 bits (626), Expect = 3e-77
Identities = 126/518 (24%), Positives = 209/518 (40%), Gaps = 22/518 (4%)

Query: 2 AFRVNTNINAMNAHVQSALTQNALKTSLERLSSGLRINKAADDASGMTVADSLRSQANSL 61
A +NTN ++ +Q++L +++ERLSSGLRIN A DDA+G +A+ S L
Sbjct: 1 AQVINTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGL 60

Query: 62 GQAIANTNDGMGIIQVADKAMDEQLKILDTVKVKATQAAQDGQTTESRKAIQSDIVRLIQ 121
QA N NDG+ I Q + A++E L V+ + QA + K+IQ +I + ++
Sbjct: 61 TQASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLE 120

Query: 122 GLDNIGNTTTYNGQALLSGQFTNKEFQVGAYSNQSIKASIGSTTSDKIGQVRI-ATGALI 180
+D + N T +NG +LS + QVGA ++I + +G G
Sbjct: 121 EIDRVSNQTQFNGVKVLSQDN-QMKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPKE 179

Query: 181 TASGDISLTFKQVDGVNDVTLESVKVSSSAGTGIGVLAEVINKNSNQTGVKAYASVITTS 240
GD+ +FK V G + + + K +G V ++ V A +TT
Sbjct: 180 ATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTTD 239

Query: 241 DVAVQSGSLSNLTLNGIHLGNIADIKKNDSDGRLVAAINAVTSETGVEAYTDQNGRLNLR 300
D N + K A A+ + + + +
Sbjct: 240 DAE-----------NNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTID 288

Query: 301 SLDGRGIEIKTDSVSNGPSALTMVNGGQDLTKGSTNYGRLSLTRLDAKSINV------VS 354
+ G K + NG V S + +N +
Sbjct: 289 TKTGNDGNGKVSTTINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKT 348

Query: 355 ASDSQHLGFTAIGFGESQVAETTVNLRDVTGNFNANVKSASGANYNAVIASGNQSL---G 411
++S L ++ TVN + T N + + +G + S
Sbjct: 349 KNESAKLSDLEANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFIDKTASGVSTLINE 408

Query: 412 AGVTTLRGAMVVIDIAESAMKMLDKVRSDLGSVQNQMISTVNNISITQVNVKAAESQIRD 471
+ + +SA+ +D VRS LG++QN+ S + N+ T N+ +A S+I D
Sbjct: 409 DAAAAKKSTANPLASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARSRIED 468

Query: 472 VDFAEESANFNKNNILAQSGSYAMSQANTVQQNILRLL 509
D+A E +N +K IL Q+G+ ++QAN V QN+L LL
Sbjct: 469 ADYATEVSNMSKAQILQQAGTSVLAQANQVPQNVLSLL 506


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
HPF57_0630  RTXTOXIND  29     0.047    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 29.0 bits (65), Expect = 0.047
Identities = 16/113 (14%), Positives = 41/113 (36%), Gaps = 16/113 (14%)

Query: 203 LARMIALQKKLEQIKTDIKRVTKLYDKGLTTIDDL-----QSLKAQGNLSEY--DILDMQ 255
LAR+ + K+ + + L K + + ++A L Y + ++
Sbjct: 220 LARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIE 279

Query: 256 FALEQNRLTLEYLTNLNVKNLKKTTIDVPNLQLRERQD-LVSLREQISALKYQ 307
+ + + +T K +D +LR+ D + L +++ + +
Sbjct: 280 SEILSAKEEYQLVTQ----LFKNEILD----KLRQTTDNIGLLTLELAKNEER 324


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
HPF57_0631  RTXTOXIND  51     1e-09    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 51.0 bits (122), Expect = 1e-09
Identities = 22/69 (31%), Positives = 34/69 (49%)

Query: 40 STGIVDSIKVTEGSVVKKGDVLLLLYNQDKQAQSDSTEQQLIFAKKQYQRYSKIGGAVDK 99
IV I V EG V+KGDVLL L +A + T+ L+ A+ + RY + +++
Sbjct: 103 ENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIEL 162

Query: 100 NTLESYEFN 108
N L +
Sbjct: 163 NKLPELKLP 171



Score = 31.3 bits (71), Expect = 0.003
Identities = 21/152 (13%), Positives = 48/152 (31%), Gaps = 25/152 (16%)

Query: 70 QAQSDSTEQQLIFAKKQYQR--YSKIGGAVDKNTLESYEFNYRRLESDYAYSIAVLNKTI 127
+++ S +++ + ++ K+ D + + E ++
Sbjct: 279 ESEILSAKEEYQLVTQLFKNEILDKLRQTTDN--IGLLTLELAKNEER-------QQASV 329

Query: 128 LRAPFDGVIASKNIQVGEGVSANNTVLLRLVSHARKLVIE--FDSKYINAVKVG------ 179
+RAP + + GV L+ +V L + +K I + VG
Sbjct: 330 IRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIK 389

Query: 180 -DTYTYSIDGDSNQHEAKITKIYP--TVDENT 208
+ + Y+ G K+ I D+
Sbjct: 390 VEAFPYTRYGYL---VGKVKNINLDAIEDQRL 418


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPF57_0632  ACRIFLAVINRP  900    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 900 bits (2327), Expect = 0.0
Identities = 283/1038 (27%), Positives = 517/1038 (49%), Gaps = 40/1038 (3%)

Query: 1 MYKTAINRPITTLMFALAIVFFGTMGFKKLSVALFPKIDLPTVVVTTTYPGASAEIIESK 60
M I RPI + A+ ++ G + +L VA +P I P V V+ YPGA A+ ++
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VTDKIEEAVMGIDGIKKVTSTSSKNVSIVV-IEFELEKPNEEALNDVVNKISSVR-FDDS 118
VT IE+ + GID + ++STS S+ + + F+ + A V NK+
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 119 NIKKPSINKFDTDSQAIISLFVSSSSVPAT--TLNDYAKKTIKPMLQKIDGVGGVQLNGF 176
+++ I+ + S ++ S + T ++DY +K L +++GVG VQL G
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFG- 179

Query: 177 RERQIRIYADPTLMNKYNLTYADLFSTLKAENVEIDGGRIVNS------QRELSILVNAN 230
+ +RI+ D L+NKY LT D+ + LK +N +I G++ + Q SI+
Sbjct: 180 AQYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 231 SYSVADVEKIQV-----GNHVRLGDIAKIEIGLEEDNTFASFKDKPGVILEIQKIAGANE 285
+ + K+ + G+ VRL D+A++E+G E N A KP L I+ GAN
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANA 299

Query: 286 IEIVDRVYEALKHIQAISP-NYEIRPFLDTTGYIRTSIEDVKFDLVLGAILAVLVVFAFL 344
++ + L +Q P ++ DTT +++ SI +V L +L LV++ FL
Sbjct: 300 LDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFL 359

Query: 345 RNGTITLVSAISIPISIMGTFALIQWMGFSLNMLTMVALTLAIGIIIDDAIVVIENIHK- 403
+N TL+ I++P+ ++GTFA++ G+S+N LTM + LAIG+++DDAIVV+EN+ +
Sbjct: 360 QNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERV 419

Query: 404 KLEMGMNKRKASYEGVREIGFALVAISAMLLSVFVPIGNMKGIIGRFFQSFGITVALAIA 463
+E + ++A+ + + +I ALV I+ +L +VF+P+ G G ++ F IT+ A+A
Sbjct: 420 MMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMA 479

Query: 464 LSYVVVVTIIPMVSSVVVNPRHS-------RFYVWSEPFFKALESYYTRLLQWVLNHKLI 516
LS +V + + P + + ++ P + F+ W F ++YT + +L
Sbjct: 480 LSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 517 IFIAVVLVFVGSLFVASKIGMEFMLKEDRGRFLVWLKAKPGVSIDY----MTQKSKIFQK 572
+ L+ G + + ++ F+ +ED+G FL ++ G + + + Q + + K
Sbjct: 540 YLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLK 599

Query: 573 AIEKHAEVEFTTLQVGY-GTTQNPFKAKIFVQLKPLKERKKERKLGQFELMSALRKELRS 631
+ + E FT + G QN FV LKP +ER + ++ + EL
Sbjct: 600 NEKANVESVFTVNGFSFSGQAQNA--GMAFVSLKPWEERNGDENS-AEAVIHRAKMELGK 656

Query: 632 MPEAKGLESINLSEVSLIGGGGDSSPFQTFVFSHSQEAVDKSVANLKKFLLESPELKGKV 691
+ + + N+ + G ++ F + + D + L + + +
Sbjct: 657 IRDGF-VIPFNMPAIV---ELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASL 712

Query: 692 ESYHTSTSESQPQLQLKILRQNANKYGVSAQTIGSVVSSAFSGTSQASVFKEDGKEYDMI 751
S + E Q +L++ ++ A GVS I +S+A G + + F + G+ +
Sbjct: 713 VSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGG-TYVNDFIDRGRVKKLY 771

Query: 752 IRVPDNKRVSVEDIKRLQVRNKYDKLMFLDALVEITETKSPSSISRYNRQRSVTVLAQPK 811
++ R+ ED+ +L VR+ +++ A + RYN S+ + +
Sbjct: 772 VQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAA 831

Query: 812 AGISLGEILTQVSKNTKEWLVEGANYRFTGEADNAKETNGEFLIALATAFVLIYMILAAL 871
G S G+ + + +N L G Y +TG + + + + +A +FV++++ LAAL
Sbjct: 832 PGTSSGDAMALM-ENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAAL 890

Query: 872 YESILEPFIIMVTMPLSFSGAFFALGLVHQPLSMFSMIGLILLIGMVGKNATLLIDVANE 931
YES P +M+ +PL G A L +Q ++ M+GL+ IG+ KNA L+++ A +
Sbjct: 891 YESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKD 950

Query: 932 -ERKKGLNIQEAILFAGKTRLRPILMTTIAMVCGMLPLALASGDGAAMKSPIGIAMSGGL 990
K+G + EA L A + RLRPILMT++A + G+LPLA+++G G+ ++ +GI + GG+
Sbjct: 951 LMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGM 1010

Query: 991 MISMVLSLLIVPVFYRLL 1008
+ + +L++ VPVF+ ++
Sbjct: 1011 VSATLLAIFFVPVFFVVI 1028


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPF57_0634  VACCYTOTOXIN  274    2e-76    Helicobacter pylori vacuolating cytotoxin signature.
		>VACCYTOTOXIN#Helicobacter pylori vacuolating cytotoxin signature.

Length = 1291

Score = 274 bits (701), Expect = 2e-76
Identities = 106/397 (26%), Positives = 185/397 (46%), Gaps = 14/397 (3%)

Query: 2800 AGNNSLMWLNALFMAKGGNPLFAPYYLQDNSTEHIVTLMKDITSTLGMLSKSNLKNNSTD 2859
+G L L + + +A + S I + T+TL ++ K +
Sbjct: 904 SGAQGRDLLQTLLI-DSHDAGYARTMIDATSANEITKQLNTATTTLNNIASLEHKTSGLQ 962

Query: 2860 VLQLNTYTQQMGRLAKLSSFASFDSTDFSERLSSLKNQRFADAIPNAMDVILKYSQRDKL 2919
L L+ RL LS + F++RL +LK+QRFA + +A +V+ +++ + +
Sbjct: 963 TLSLSNAMILNSRLVNLSRRHTNHIDSFAKRLQALKDQRFAS-LESAAEVLYQFAPKYEK 1021

Query: 2920 KNNLWATGVGGVSFVENGTGTLYGINVGYDRFIKG---VIVGGYAAYGYSGFYER--ITS 2974
N+WA +GG S G +LYG + G D ++ G IVGG+ +YGYS F + +
Sbjct: 1022 PTNVWANAIGGTSLNSGGNASLYGTSAGVDAYLNGEVEAIVGGFGSYGYSSFSNQANSLN 1081

Query: 2975 SKSDNVDVGLYARAFIKKSELTFSVNEAWGANKTQISSNDALLSMINQSYQYSTWTTNAR 3034
S ++N + G+Y+R F + E F A G++++ ++ ALL +NQSY Y ++ R
Sbjct: 1082 SGANNTNFGVYSRIFANQHEFDFEAQGALGSDQSSLNFKSALLRDLNQSYNYLAYSAATR 1141

Query: 3035 VNYGYDFMFKNKSVIVKPQISLRYYYIGMTGLDGVMNNALYNQFKANADPSKKSVLMIDF 3094
+YGYDF F ++++KP + + Y ++G T + S + +
Sbjct: 1142 ASYGYDFAFFRNALVLKPSVGVSYNHLGSTNFKS----NSNQKVALKNGASSQHLFNASA 1197

Query: 3095 AFENRHYFNKNSYFYAIGGIGRDLLVRSMGDKLVRFIGDNILSYRKGELYNTFANITTGG 3154
E R+Y+ SYFY G+ ++ + V + + R NT A + GG
Sbjct: 1198 NVEARYYYGDTSYFYMNAGVLQEFANFGSSNA-VSLNTFKVNATRNP--LNTHARVMMGG 1254

Query: 3155 EIRLFKSFYVNAGVGARFGLDYKMINITGNIGMRLAF 3191
E++L K ++N G L + + N+GMR +F
Sbjct: 1255 ELKLAKEVFLNLGFVYLHNLISNIGHFASNLGMRYSF 1291



Score = 32.7 bits (74), Expect = 0.031
Identities = 17/110 (15%), Positives = 29/110 (26%), Gaps = 7/110 (6%)

Query: 643 WTGGGYDFTGNS-AFDSVNFNKAYYKFQGAENTYTFKNTNFLAGNFKFQGKTTIEKSVLN 701
+ Y S VNFN A + G +
Sbjct: 268 YLAPSYSTINTSKVTGEVNFNHLTVGDHNAAQAGIIASNKTHIGTLDLWQSAGLNIIAPP 327

Query: 702 DASYTFDGTNNAFNEDKFNGGSFNFNHSEQTDAFNNNSFNGGSFNFNANQ 751
+ Y DK + + N +++ ++ NNS N+ Q
Sbjct: 328 EGGYKDKPN------DKPSNTTQNNAKNDKQESSQNNSNTQVINPPNSAQ 371


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
HPF57_0637  LCRVANTIGEN  31     6e-04    Low calcium response V antigen signature.
		>LCRVANTIGEN#Low calcium response V antigen signature.

Length = 326

Score = 31.2 bits (70), Expect = 6e-04
Identities = 16/33 (48%), Positives = 20/33 (60%)

Query: 16 KRKKLLTELAELEAEIKVSSERKSSFNISLSPS 48
R KL ELAEL AE+K+ S ++ N LS S
Sbjct: 149 ARSKLREELAELTAELKIYSVIQAEINKHLSSS 181


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score  E-value  Comments
HPF57_0639  HTHFIS  55     1e-10    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 55.2 bits (133), Expect = 1e-10
Identities = 24/110 (21%), Positives = 44/110 (40%), Gaps = 6/110 (5%)

Query: 194 ILIAEDSLSALKTLEKIVQTLELRYLAFPNGKELLDYLYEKEHYQQVGVVITDLEMPVIS 253
IL+A+D + L + + N L ++ +V+TD+ MP +
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAA----GDGDLVVTDVVMPDEN 61

Query: 254 GFEVLKTIKADSRTEHLPVIINSSMSSDSNRQLAQSLEADGFVVKSNILE 303
F++L IK LPV++ S+ ++ A A ++ K L
Sbjct: 62 AFDLLPRIK--KARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLT 109


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
HPF57_0641  MALTOSEBP  29     0.011    Maltose binding protein signature.
		>MALTOSEBP#Maltose binding protein signature.

Length = 396

Score = 28.9 bits (64), Expect = 0.011
Identities = 20/52 (38%), Positives = 30/52 (57%), Gaps = 5/52 (9%)

Query: 60 GELVPLEIVVETILSAIKSSDKGIILIDGYPRSMEQMQALDKELNAQNEVVL 111
G+L+ I VE LS I + D L+ P++ E++ ALDKEL A+ + L
Sbjct: 127 GKLIAYPIAVEA-LSLIYNKD----LLPNPPKTWEEIPALDKELKAKGKSAL 173


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
HPF57_0644  FLGHOOKFLIE  30     0.004    Flagellar hook-basal body complex protein FliE signa...
		>FLGHOOKFLIE#Flagellar hook-basal body complex protein FliE signature.

Length = 103

Score = 29.6 bits (66), Expect = 0.004
Identities = 15/60 (25%), Positives = 27/60 (45%), Gaps = 5/60 (8%)

Query: 2 INETQQKPKEPSNPCKIAPQKVSFNQVVFKKIKRKLNHFIGNILARTEVYKKLVAKYDEL 61
I++TQ + + + V+ N V+ K ++ +G +V KLVA Y E+
Sbjct: 44 ISDTQTAARTQAEKFTLGEPGVALNDVMTDMQKASVSMQMG-----IQVRNKLVAAYQEV 98


21  HPF57_1102  HPF57_1106  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPF57_1102  3       19       -3.153201  SpoOJ regulator
HPF57_1103  3       19       -3.311311  biotin--protein ligase
HPF57_1104  2       19       -3.090806  methionyl-tRNA formyltransferase
HPF57_1105  2       18       -3.351693  hypothetical protein
HPF57_1106  3       16       0.206085   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
HPF57_1102  PF07675  31     0.005    Cleaved Adhesin
		>PF07675#Cleaved Adhesin

Length = 1358

Score = 31.2 bits (70), Expect = 0.005
Identities = 30/105 (28%), Positives = 40/105 (38%), Gaps = 7/105 (6%)

Query: 70 QISQVILKTQMPFLDLVPSNLGLAGFEKTFYDSQDENKRGELMLKNALESVV---GLYDY 126
VI T F SNL A FE + D + ++ VV G+YDY
Sbjct: 414 TFGSVIPATGPLFTGTASSNLYSANFEYLTPANADPVVTTQNIIVTGQGEVVIPGGVYDY 473

Query: 127 IIIDSPPALGPLTINSLSAAHSVIIPIQCEFFALEGTKLLLNTIR 171
I + PA G + I A P + + FA E K T+R
Sbjct: 474 CITNPEPASGKMWI----AGDGGNQPARYDDFAFEAGKKYTFTMR 514


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPF57_1104  FERRIBNDNGPP  32     0.003    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 31.8 bits (72), Expect = 0.003
Identities = 12/33 (36%), Positives = 19/33 (57%)

Query: 70 EPEVQILKALKPDFIVVVAYGKILPKEVLSIAP 102
EP +++L +KP F+V A P+ + IAP
Sbjct: 86 EPNLELLTEMKPSFMVWSAGYGPSPEMLARIAP 118


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
HPF57_1105  PF01540  30     0.028    Adhesin lipoprotein
		>PF01540#Adhesin lipoprotein

Length = 475

Score = 30.5 bits (68), Expect = 0.028
Identities = 62/315 (19%), Positives = 129/315 (40%), Gaps = 32/315 (10%)

Query: 140 KNCKEKVEKRKKKIKDENSAETLSAKQESEIKKYDKEIEKIRKEMTSKTIQITLDEIKIN 199
K+ ++KV++ KKI DEN +IK+ KE+ K+ +++ S I L
Sbjct: 103 KSEQQKVDQANKKIADENL----------KIKEGAKELLKLSEKIQSFADTIAL------ 146

Query: 200 NICEVSKNKFKVQEDALTNLEKDFDELDEAMKKFDDLKEMELPKDYQTIKDKLESLFSFD 259
I ++ KF++ E L + L++ + + K + + LES F+
Sbjct: 147 TITKLEGKKFQIDETFKKQLISTIELLNKKSAEVKTFATVNTIKKDFLLSE-LESFKEFN 205

Query: 260 IDKEAGQVSE----------EIKEHMSKVGREFIEKGIELQKKMPDNACPFCTQEITNNI 309
VSE E+ E ++ ++ E+ ++++ + + +
Sbjct: 206 TSWLEKIVSEWEEVKKAWSKELAEIKAEDDKKLAEENQKIKEGAKELLKLSEKIQSFADT 265

Query: 310 IQVYTSYFNKRIEQFNQDSLEVSGTLKKILEQWN-IKEILQSFERFEPFMKKDSSTNKES 368
I + + ++ + + ++ T++ + ++ +K + F+ + + KE
Sbjct: 266 IALTITKLERKFQIDEKFKKQLISTIELLNKKSVEVKTFATVNTIKKDFLLSELESFKEF 325

Query: 369 LKNALEQIKVLLEKLQKEVDKKWGVKNKEKFQETDKKLLENYEKFQKCADETRNILKQKK 428
+ LE+I E EV K W + E E DKKL E +K + +E + I +
Sbjct: 326 NTSWLEKIVSEWE----EVKKAWSKELAEIKAEDDKKLAEENQKIKNGVEELKKINNEAF 381

Query: 429 EQKEKLEKLKTELKE 443
E + + K EL++
Sbjct: 382 ELSKTVNKTIAELEK 396


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
HPF57_1106  RTXTOXIND  43     2e-06    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 42.9 bits (101), Expect = 2e-06
Identities = 26/195 (13%), Positives = 70/195 (35%), Gaps = 19/195 (9%)

Query: 27 QIELENQSRF-LAQQKEFEKEVKEKRAQYQSHFKMLEQKEEALKEQEREQKAKFDDAVKQ 85
+++L ++ F ++E + + Q+ + QKE L ++ E+ +
Sbjct: 167 ELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRY 226

Query: 86 ASALALQDERAKIIEEARKNAFLEQQKGLELLQKELDEKSKQVQELHQKEAEIERLKREN 145
+ ++ R + + LE + + E EL ++++E+++ E
Sbjct: 227 ENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQ-ENKYVEAV---NELRVYKSQLEQIESEI 282

Query: 146 NEAESRLKAENEKKLNEKLDLERERIEKALHEKNELKFKQQEEQLEMLRNELKNAQRKAE 205
A+ + + + E ++K + +L + + +A
Sbjct: 283 LSAKEEYQLVTQ-------LFKNEILDK--LRQTTDNIGLLTLELAKNEERQQASVIRAP 333

Query: 206 LSSQQFQGEVQELAI 220
+S +VQ+L +
Sbjct: 334 VS-----VKVQQLKV 343


22  HPF57_1301  HPF57_1306  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPF57_1301  -3      13       0.192023   nodulation protein
HPF57_1302  -2      12       -0.236342  GDP-D-mannose dehydratase
HPF57_1303  -1      11       -0.272659  mannose-6-phosphate isomerase
HPF57_1304  -1      13       -0.165509  comB10 competence protein
HPF57_1305  0       13       -0.569897  comB9 competence protein
HPF57_1306  -1      12       0.578568   comB8 competence protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPF57_1301  NUCEPIMERASE  48     2e-08    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 47.9 bits (114), Expect = 2e-08
Identities = 51/346 (14%), Positives = 106/346 (30%), Gaps = 54/346 (15%)

Query: 5 ILITGAYGMVGQNTALYFKKNKPDV-----------ALLTPKKSELY-----------LL 42
L+TGA G +G + + + V L + EL L
Sbjct: 3 YLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDLA 62

Query: 43 DKDNVQAYLKEHKPTGIIHCAGRVGGIVANMNDLSTYMVENLLMGLYLFSSALDLGVKKA 102
D++ + + R + ++ + Y NL L + ++
Sbjct: 63 DREGMTDLFASGHFERVFISPHR-LAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQHL 121

Query: 103 INLASSCAYPKFAPNPLKESDLLNGSLEPTNEGYALAKLSVMKYCEYVSAEKGVFYKTLV 162
+ +SS Y P D ++ + YA K + S G+ L
Sbjct: 122 LYASSSSVYGLNRKMPFSTDDSVDHPVSL----YAATKKANELMAHTYSHLYGLPATGLR 177

Query: 163 PCNLYGEFDKFEEKIAHMIPGLIARMHTAKLKNEKEFAMWGDGTARREYLNAKDLARFIS 222
+YG + + P + T + K ++ G +R++ D+A I
Sbjct: 178 FFTVYGPWGR---------PDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAII 228

Query: 223 LAYENIASIPS-----------------VMNVGSGVDYSIEEYYKKVAQVLDYKGSFVKD 265
+ I + V N+G+ + +Y + + L +
Sbjct: 229 RLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAKKNML 288

Query: 266 LSKPVGMQQKLMDISK-QKALKWELEIPLEQGIKEAYEYYLKLLEV 310
+P + + D + + + E ++ G+K +Y +V
Sbjct: 289 PLQPGDVLETSADTKALYEVIGFTPETTVKDGVKNFVNWYRDFYKV 334


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPF57_1302  NUCEPIMERASE  88     2e-21    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 87.9 bits (218), Expect = 2e-21
Identities = 46/180 (25%), Positives = 72/180 (40%), Gaps = 19/180 (10%)

Query: 7 LITGVTGQDGSYLAEYLLNLGYEVHGLKRRSSSINTSRIDHLYEDLHSEHKRRFFLHYGD 66
L+TG G G ++++ LL G++V G+ + + S E L F H D
Sbjct: 4 LVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQP---GFQFHKID 60

Query: 67 MTDSSNLIHLIATTKPTEIYNLAAQSHVKVSFETPEYTANADGIGTLRILEAMRILGLEK 126
+ D + L A+ ++ + V+ S E P A+++ G L ILE R ++
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQ- 119

Query: 127 KTRFYQASTSELYGEVLETPQNENTPF-------NPRSPYAVAKMYAFYITKNYREAYNL 179
AS+S +YG N PF +P S YA K + Y Y L
Sbjct: 120 --HLLYASSSSVYGL------NRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYGL 171


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
HPF57_1305  TYPE4SSCAGX  32     0.003    Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein signature.

Length = 522

Score = 31.7 bits (71), Expect = 0.003
Identities = 26/72 (36%), Positives = 37/72 (51%), Gaps = 8/72 (11%)

Query: 190 KEETKEEETITIGDNTNAMKIVKKDIQKGYRALKSSQ--RKWYCLGICSKKSKLSLMPEE 247
KE+ +EE+ I D A+ + Q + ALK + R + K+SK +MP E
Sbjct: 365 KEKIREEKQKIILDQAKAL-----ETQYVHNALKRNPVPRNYNYYQAPEKRSK-HIMPSE 418

Query: 248 IFNDKQFTYFKF 259
IF+D FTYF F
Sbjct: 419 IFDDGTFTYFGF 430


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits     Score  E-value  Comments
HPF57_1306  PF04335  131    9e-40    VirB8 type IV secretion protein
		>PF04335#VirB8 type IV secretion protein

Length = 227

Score = 131 bits (330), Expect = 9e-40
Identities = 38/202 (18%), Positives = 73/202 (36%), Gaps = 4/202 (1%)

Query: 40 QSVFRLERNRLKIAYKLLGLMSFIALVLAIVLISVLPLQKTEHHF--VDFLNQDKHYAII 97
+ K+A+ + G+ +A + + ++ PL+ E + VD + A
Sbjct: 22 RDKLAAAERSKKLAWVVAGVAGALATAGVVAVAALTPLKTVEPYVITVDRNTGEASIAAK 81

Query: 98 QRADKSISSNEALARSLIGAYVLNRESINRIDDKSRYELVRLQSSSKVWQRFEYLIKTQN 157
D +I+ +EA+ + + YV RE + ++ V + S+ R+ KT N
Sbjct: 82 LHGDATITYDEAVRKYFLATYVRYREGWIAAAREEYFDAVMVMSARPEQDRWSRFYKTDN 141

Query: 158 SIYAQSHLEREVHI-VNIAIYQQDNNPIASVSIAAKLMNENKLVYEKRYKIVLSYLFDTP 216
Q+ L + V I +A V + + + + + Y D
Sbjct: 142 PQSPQNILANRTDVFVEIKRVSFLGGNVAQVYFTKESVTGSNST-KTDAVATIKYKVDGT 200

Query: 217 DFDYASMPKNPTGFKVTRYSIT 238
KNP G++V Y
Sbjct: 201 PSKEVDRFKNPLGYQVESYRAD 222


23  HPF57_1365  HPF57_1371  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias   Product
HPF57_1365  -1      13       0.577679  putative inner membrane protein translocase
HPF57_1366  0       11       0.393690  hypothetical protein
HPF57_1367  0       8        1.018396  tRNA modification GTPase TrmE
HPF57_1368  2       10       1.528974  outer membrane protein HomD
HPF57_1369  -1      14       0.778582  hypothetical protein
HPF57_1370  -2      12       1.697047  hypothetical protein
HPF57_1371  -2      11       1.956076  membrane-associated lipoprotein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
HPF57_1365  60KDINNERMP  429    e-147    60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 429 bits (1105), Expect = e-147
Identities = 167/570 (29%), Positives = 281/570 (49%), Gaps = 59/570 (10%)

Query: 10 RLILAIALSFLFIALYSYFFQKPNKT--TTPTTKQETTNNHTATNSNTPNAFSATQTIPQ 67
R +L IAL F+ ++ + Q N TT+ TT +A + P + +
Sbjct: 5 RNLLVIALLFVSFMIWQAWEQDKNPQPQAQQTTQTTTTAAGSAADQGVPASGQGKLISVK 64

Query: 68 ENLLSAISFEHARIEIDSLGR--IKQVYLKDKKYLTPKQKGFLEHVSHLFNPKANPQTPL 125
++L + I++ G + + K L Q L S F +A
Sbjct: 65 TDVL--------DLTINTRGGDVEQALLPAYPKELNSTQPFQLLETSPQFIYQAQSGLTG 116

Query: 126 KELPLLAADKLKPLEVRFLDPTLNNKAFNTPYSASKTTLGPNEQLV--LTQDLGALTIIK 183
++ P A+ +PL +N A G NE V D T K
Sbjct: 117 RDGPDNPANGPRPL-------------YNVEKDAYVLAEGQNELQVPMTYTDAAGNTFTK 163

Query: 184 TLTFYDDLHYDLQIAFKSPN--------NIIPSYVITNGYKPVADLDS-----YTFSGVL 230
T Y + + + N + + P D S +TF G
Sbjct: 164 TFVLKRG-DYAVNVNYNVQNAGEKPLEISSFGQLKQSITLPPHLDTGSSNFALHTFRGAA 222

Query: 231 LENNDKKIEKIE---DKDAKEIKRFSNTLFLSSVDRYFTTLLFTKDPQGFEALIDSEIGT 287
D+K EK + D + + S +++ + +YF T + G + +G
Sbjct: 223 YSTPDEKYEKYKFDTIADNENLNISSKGGWVAMLQQYFATAWIPHN-DGTNNFYTANLG- 280

Query: 288 KNPLGFISLKNEA-----------NLHGYIGPKDYRSLKAISPMLTDVIEYGLITFFAKG 336
N + I K++ N ++GP+ + A++P L ++YG + F ++
Sbjct: 281 -NGIAAIGYKSQPVLVQPGQTGAMNSTLWVGPEIQDKMAAVAPHLDLTVDYGWLWFISQP 339

Query: 337 VFVLLDYLYQFVGNWGWAIILLTIIVRLILYPLSYKGMVSMQKLKEIAPKMKELQEKYKG 396
+F LL +++ FVGNWG++II++T IVR I+YPL+ SM K++ + PK++ ++E+
Sbjct: 340 LFKLLKWIHSFVGNWGFSIIIITFIVRGIMYPLTKAQYTSMAKMRMLQPKIQAMRERLGD 399

Query: 397 EPQKLQAHMMQLYKKHGANPLGGCLPLILQIPVFFAIYRVLYNAVELKSSEWILWIHDLS 456
+ Q++ MM LYK NPLGGC PL++Q+P+F A+Y +L +VEL+ + + LWIHDLS
Sbjct: 400 DKQRISQEMMALYKAEKVNPLGGCFPLLIQMPIFLALYYMLMGSVELRQAPFALWIHDLS 459

Query: 457 IMDPYFILPLLMGASMYWHQSVTPSSVTDPMQAKIFKFLPLLFTIFLITFPAGLVLYWTT 516
DPY+ILP+LMG +M++ Q ++P++VTDPMQ KI F+P++FT+F + FP+GLVLY+
Sbjct: 460 AQDPYYILPILMGVTMFFIQKMSPTTVTDPMQQKIMTFMPVIFTVFFLWFPSGLVLYYIV 519

Query: 517 NNILSVLQQLIINKILENKKRAHAQNKKES 546
+N+++++QQ +I + LE K+ H++ KK+S
Sbjct: 520 SNLVTIIQQQLIYRGLE-KRGLHSREKKKS 548


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPF57_1366  DPTHRIATOXIN  31     0.004    Diphtheria toxin signature.
		>DPTHRIATOXIN#Diphtheria toxin signature.

Length = 567

Score = 31.3 bits (70), Expect = 0.004
Identities = 20/75 (26%), Positives = 38/75 (50%)

Query: 17 IQASIALNCPIINLQYEVIQTPSKGFLNIGKKEAIILASVKESVKEIKEESVKETNTKEI 76
++ S+ + INL ++VI+ +K + K+ I + ES + E + +E
Sbjct: 223 VRRSVGSSLSCINLDWDVIRDKTKTKIESLKEHGPIKNKMSESPNKTVSEEKAKQYLEEF 282

Query: 77 HQSAEEKKQNSEIET 91
HQ+A E + SE++T
Sbjct: 283 HQTALEHPELSELKT 297


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
HPF57_1367  TCRTETOQM  31     0.007    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 31.4 bits (71), Expect = 0.007
Identities = 32/134 (23%), Positives = 53/134 (39%), Gaps = 25/134 (18%)

Query: 216 LSIVGKPNAGKSSLLNAMLLEERA---LVSDIKGTTR-DTIEE-------------VIEL 258
+ ++ +AGK++L ++L A L S KGTTR D +
Sbjct: 6 IGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITSFQW 65

Query: 259 KGHKVRLIDTAGIRESADKIERLGIEKSLKSLENCDIILGVFDLSKPLEQEDFNLIDTLN 318
+ KV +IDT G + ++ R SL L D + + ++ + L L
Sbjct: 66 ENTKVNIIDTPGHMDFLAEVYR-----SLSVL---DGAILLISAKDGVQAQTRILFHALR 117

Query: 319 RAKKPCIVVLNKND 332
+ P I +NK D
Sbjct: 118 KMGIPTIFFINKID 131


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits       Score  E-value  Comments
HPF57_1371  LIPOLPP20  293    e-105    LPP20 lipoprotein precursor signature.
		>LIPOLPP20#LPP20 lipoprotein precursor signature.

Length = 175

Score = 293 bits (750), Expect = e-105
Identities = 173/175 (98%), Positives = 174/175 (99%)

Query: 1 MKNQVKKILGMSVIVAMVIVGCSHAPKSGISKSNKAYKEATKGAPDWVVGDLEKVAKYEK 60
MKNQVKKILGMSV+ AMVIVGCSHAPKSGISKSNKAYKEATKGAPDWVVGDLEKVAKYEK
Sbjct: 1 MKNQVKKILGMSVVAAMVIVGCSHAPKSGISKSNKAYKEATKGAPDWVVGDLEKVAKYEK 60

Query: 61 YSGVFLGRAEDLITNNDVDYSTNQATAKARANLAANLKSTLQKDLENEKTRTVDASGKRS 120
YSGVFLGRAEDLITNNDVDYSTNQATAKARANLAANLKSTLQKDLENEKTRTVDASGKRS
Sbjct: 61 YSGVFLGRAEDLITNNDVDYSTNQATAKARANLAANLKSTLQKDLENEKTRTVDASGKRS 120

Query: 121 ISGTDTEKISQLVDKELIASKMLARYVGKDRVFVLVGLDKQIVDKVREELGMVKK 175
ISGTDTEKISQLVDKELIASKMLARYVGKDRVFVLVGLDKQIVDKVREELGMVKK
Sbjct: 121 ISGTDTEKISQLVDKELIASKMLARYVGKDRVFVLVGLDKQIVDKVREELGMVKK 175


24  HPF57_1493  HPF57_1500  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag    DNBias  CDNBias  %GCBias    Product
HPF57_1493  -1      13       2.097631   flagellar hook-basal body protein FliE
HPF57_1494  -1      12       1.951571   flagellar basal body rod protein FlgC
HPF57_1495  1       12       1.638137   flagellar basal body rod protein FlgB
HPF57_1496  1       12       1.127747   putative rod shape-determining protein
HPF57_1497  0       13       0.115160   hypothetical protein
HPF57_1498  1       13       0.392010   putative peroxidase
HPF57_1499  0       11       -0.310107  outer membrane protein
HPF57_1500  0       13       -0.285836  penicillin-binding protein 2
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits         Score  E-value  Comments
HPF57_1493  FLGHOOKFLIE  77     6e-22    Flagellar hook-basal body complex protein FliE signa...
		>FLGHOOKFLIE#Flagellar hook-basal body complex protein FliE signature.

Length = 103

Score = 77.0 bits (189), Expect = 6e-22
Identities = 19/77 (24%), Positives = 40/77 (51%), Gaps = 1/77 (1%)

Query: 34 EQKGGEFSKLLKQSINELNNTQEQSDKALADMATGQIK-DLHQAAIAIGKAETSMKLMLE 92
Q F+ L +++ +++TQ + G+ L+ + KA SM++ ++
Sbjct: 27 PQPTISFAGQLHAALDRISDTQTAARTQAEKFTLGEPGVALNDVMTDMQKASVSMQMGIQ 86

Query: 93 VRNKAISAYKELLRTQI 109
VRNK ++AY+E++ Q+
Sbjct: 87 VRNKLVAAYQEVMSMQV 103


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits        Score  E-value  Comments
HPF57_1494  FLGHOOKAP1  29     0.012    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 28.8 bits (64), Expect = 0.012
Identities = 10/38 (26%), Positives = 15/38 (39%)

Query: 121 NVNAVVEMADLVEATRAYQANVAAFQSAKNMAQNAIGM 158
VN E +L + Y AN Q+A + I +
Sbjct: 508 GVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPF57_1497  FERRIBNDNGPP  36     2e-04    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 35.7 bits (82), Expect = 2e-04
Identities = 33/186 (17%), Positives = 75/186 (40%), Gaps = 16/186 (8%)

Query: 106 NVELLKKLSPDLVVTFVGNPKAVEHAKKF--GISFLSFQEKTIAEVMEDID---AQAKAL 160
N+ELL ++ P +V G + E + G F K + A L
Sbjct: 88 NLELLTEMKPSFMVWSAGYGPSPEMLARIAPGRGFNFSDGKQPLAMARKSLTEMADLLNL 147

Query: 161 EIDASKKLAKMQETLDFIKERL-KDVKKKKGVELFHKAN--KISGHQALDSDILEKGGID 217
+ A LA+ ++ + +K R K + + + G +L +IL++ GI
Sbjct: 148 QSAAETHLAQYEDFIRSMKPRFVKRGARPLLLTTLIDPRHMLVFGPNSLFQEILDEYGIP 207

Query: 218 N-FGLKYVKFGRADVSVEKIVK-ENPEIIFIWWISPLNPED---VLNNPKFATIKAIKNK 272
N + + +G VS++++ ++ +++ N +D ++ P + + ++
Sbjct: 208 NAWQGETNFWGSTAVSIDRLAAYKDVDVLC---FDHDNSKDMDALMATPLWQAMPFVRAG 264

Query: 273 QVYKLP 278
+ ++P
Sbjct: 265 RFQRVP 270


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits          Score  E-value  Comments
HPF57_1500  TYPE3IMPPROT  29     0.029    Type III secretion system inner membrane P protein ...
		>TYPE3IMPPROT#Type III secretion system inner membrane P protein family signature.

Length = 224

Score = 29.4 bits (66), Expect = 0.029
Identities = 9/23 (39%), Positives = 12/23 (52%)

Query: 4 LRYKLLLFVFIGFWGLLVLNLFI 26
KL+LFV + W LL L +
Sbjct: 195 TPIKLVLFVALDGWTLLSKGLIL 217



 
Contact Sachin Pundhir for Bugs/Comments.