PredictBias

identification of genomic and pathogenicity islands in prokaryotic genome
A) Input parameters
Genome: Santal49.gbk
Threshold dinucleotide bias: 2
Threshold codon bias: 4
Threshold %GC bias: 3
E-value (RPSBlast): 0.05
Genome (non-pathogenic):
 
C) Potential GIs and PAIs in CP002983
S.No  Start        End          Bias  Virulence  Insertion elements  Prediction
1     HPSNT_00250  HPSNT_00610  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias     Product
HPSNT_00250  4       28       -2.034869   hypothetical protein
HPSNT_00255  6       23       -1.224454   hypothetical protein
HPSNT_00260  5       17       -0.333003   hypothetical protein
HPSNT_00265  4       16       -0.678900   hypothetical protein
HPSNT_00290  2       18       -1.210016   hypothetical protein
HPSNT_00295  3       19       -4.013477   hypothetical protein
HPSNT_00300  4       23       -5.722539   hypothetical protein
HPSNT_00305  4       25       -7.675814   hypothetical protein
HPSNT_00310  7       31       -9.727056   hypothetical protein
HPSNT_00325  5       20       -5.389728   integrase-recombinase protein
HPSNT_00330  4       20       -5.204455   hypothetical protein
HPSNT_00335  4       17       -4.783050   hypothetical protein
HPSNT_00350  4       16       -4.397308   hypothetical protein
HPSNT_00355  4       15       -4.316527   hypothetical protein
HPSNT_00360  4       15       -4.021438   DNA methylase
HPSNT_00365  6       17       -6.654513   putative chromosome partitioning protein
HPSNT_00370  5       18       -6.930693   hypothetical protein
HPSNT_00375  4       20       -7.455969   hypothetical protein
HPSNT_00380  5       25       -8.456887   relaxase
HPSNT_00385  3       26       -7.785166   hypothetical protein
HPSNT_00390  4       27       -7.832574   hypothetical protein
HPSNT_00395  4       27       -8.314817   conjugal transfer protein (traG)
HPSNT_00400  5       30       -10.222023  hypothetical protein
HPSNT_00405  5       29       -9.486227   hypothetical protein
HPSNT_00410  3       27       -7.553345   hypothetical protein
HPSNT_00415  4       20       -6.505364   VirB11 type IV secretion ATPase
HPSNT_00420  4       21       -6.417467   hypothetical protein
HPSNT_00425  3       22       -6.060655   hypothetical protein
HPSNT_00430  5       24       -7.309740   hypothetical protein
HPSNT_00435  5       24       -7.524886   competence protein
HPSNT_00440  6       26       -7.943290   conjugal plasmid transfer system protein
HPSNT_00445  7       30       -9.485861   hypothetical protein
HPSNT_00450  8       30       -9.910711   hypothetical protein
HPSNT_00465  8       29       -9.684351   DNA transfer protein
HPSNT_00470  5       22       -7.012950   hypothetical protein
HPSNT_00475  3       19       -5.190114   hypothetical protein
HPSNT_00480  2       15       -4.272936   hypothetical protein
HPSNT_00485  0       12       -0.493593   hypothetical protein
HPSNT_00490  1       15       2.174473    hypothetical protein
HPSNT_00510  4       21       3.478463    urease accessory protein
HPSNT_00515  5       22       3.212026    Urease accessory protein UreG
HPSNT_00520  4       20       2.311071    urease accessory protein UreF
HPSNT_00525  3       16       2.355207    urease accessory protein UreE
HPSNT_00530  3       18       2.338728    urease accessory protein / pH-dependent
HPSNT_00535  1       16       2.357830    urease subunit beta
HPSNT_00540  -2      8        1.494766    urease subunit alpha
HPSNT_00545  -1      11       2.221160    *lipoprotein signal peptidase
HPSNT_00550  2       13       2.847276    phosphoglucosamine mutase
HPSNT_00555  3       14       2.109902    30S ribosomal protein S20
HPSNT_00560  2       14       2.310815    peptide chain release factor 1
HPSNT_00565  3       14       2.132480    hypothetical protein
HPSNT_00570  3       14       1.920083    outer membrane protein HorA
HPSNT_00575  1       13       1.002431    hypothetical protein
HPSNT_00580  -2      12       0.373097    methyl-accepting chemotaxis protein
HPSNT_00585  0       13       0.505479    30S ribosomal protein S9
HPSNT_00590  0       11       0.522238    50S ribosomal protein L13
HPSNT_00595  1       11       0.832696    hypothetical protein
HPSNT_00600  0       9        0.238220    hypothetical protein
HPSNT_00605  1       11       0.255784    hypothetical protein
HPSNT_00610  2       13       -0.080882   RNA polymerase sigma factor RpoD
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPSNT_00350  SECA          30     0.021    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 30.2 bits (68), Expect = 0.021
Identities = 32/169 (18%), Positives = 66/169 (39%), Gaps = 26/169 (15%)

Query: 72 ELEELQQTITTDKTQQQLLEQDNIDFELQSALQNDLKDLEHLSDNQNKDDKEQTIQKSFE 131
++ ++ +TI + ++ + + ID + ++ D+ L + K
Sbjct: 668 DVSDVSETI---NSIREDVFKATIDAYIPPQSLEEMWDIPGLQERL----KNDFDLDLPI 720

Query: 132 QDLEDLQNDKLNLEIKESINKQDDKNYQNKEQLNTETKENIRENSKN-----------SH 180
+ D + + ++E I Q + YQ KE++ E +R K H
Sbjct: 721 AEWLDKEPELHEETLRERILAQSIEVYQRKEEVVGA--EMMRHFEKGVMLQTLDSLWKEH 778

Query: 181 LIPITNLKNFLHNRRENFKATQQDLPSEKQKKYSDKLFKKELLEYAKHN 229
L + L+ +H R Q+D P ++ K+ S +F +LE K+
Sbjct: 779 LAAMDYLRQGIHLR----GYAQKD-PKQEYKRESFSMF-AAMLESLKYE 821
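
Each hit in this report carries a statistics line of the form "Score = ... bits (...), Expect = ...", and the input parameters list an RPSBlast E-value of 0.05. A minimal sketch of reading such a line and comparing it with that value is given below. The function names are hypothetical, and the "keep when E-value <= cutoff" comparison is an assumption; the page does not state how PredictBias applies the threshold.

```python
import re

# Matches lines such as "Score = 30.2 bits (68), Expect = 0.021".
STATS_RE = re.compile(
    r"Score\s*=\s*([\d.]+)\s+bits\s+\((\d+)\),\s*Expect\s*=\s*([-+\d.eE]+)"
)

E_VALUE_CUTOFF = 0.05  # "E-value (RPSBlast)" from the input parameters


def parse_stats(line):
    """Return (bit_score, raw_score, e_value) from one statistics line."""
    match = STATS_RE.search(line)
    if match is None:
        raise ValueError(f"not a statistics line: {line!r}")
    bits, raw, expect = match.groups()
    return float(bits), int(raw), float(expect)


def passes_cutoff(line, cutoff=E_VALUE_CUTOFF):
    """Assumption: a hit is kept when its E-value does not exceed the cutoff."""
    return parse_stats(line)[2] <= cutoff
```

Calling `passes_cutoff("Score = 30.2 bits (68), Expect = 0.021")` would accept the SecA hit above under this assumed rule.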


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPSNT_00360  TRNSINTIMINR  32     0.034    Translocated intimin receptor (Tir) signature.
		>TRNSINTIMINR#Translocated intimin receptor (Tir) signature.

Length = 549

Score = 32.4 bits (73), Expect = 0.034
Identities = 19/73 (26%), Positives = 36/73 (49%), Gaps = 8/73 (10%)

Query: 799 TLNAFNAPDSQAIDLNAISNSVGLNPTQESK--ITDNSVELNNAQEQTAQEQDTQENAQT 856
T AF P++Q ++++A N++ P+ E K I + + + A++Q + NAQ
Sbjct: 287 TQEAFKNPENQKVNIDANGNAI---PSGELKDDIVEQIAQQAKEAGEVARQQAVESNAQA 343

Query: 857 TLK---QETKEQE 866
+ Q + QE
Sbjct: 344 QQRYEDQHARRQE 356


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPSNT_00445  PF04335       99     5e-27    VirB8 type IV secretion protein
		>PF04335#VirB8 type IV secretion protein

Length = 227

Score = 99 bits (249), Expect = 5e-27
Identities = 36/202 (17%), Positives = 75/202 (37%), Gaps = 18/202 (8%)

Query: 94 AERKIGDWIFSSAVFFFALAFIEAIIIICLLPLKEKVPYLVTFSNATQNFAIVQR--ADK 151
+K+ + A ALA + + L PLK PY++T T +I + D
Sbjct: 30 RSKKLAWVV---AGVAGALATAGVVAVAALTPLKTVEPYVITVDRNTGEASIAAKLHGDA 86

Query: 152 SIRANQALVRQLVASYVNNRE--NISNIKEQNEIAHETIRLQSAFEVWDFFEKLVSYEH- 208
+I ++A+ + +A+YV RE + +E + + + SA D + + ++
Sbjct: 87 TITYDEAVRKYFLATYVRYREGWIAAAREEY----FDAVMVMSARPEQDRWSRFYKTDNP 142

Query: 209 ----SIYTNINLTRKVSIINIALISKTQANIEISAQLFNKEKLESEKRYRIIMTFEFEPI 264
+I N V I ++ + A + + + ++ + ++ +
Sbjct: 143 QSPQNILAN-RTDVFVEIKRVSFLGGNVAQVYFT-KESVTGSNSTKTDAVATIKYKVDGT 200

Query: 265 EIDTKSVPLNPTGFIVTGYDVT 286
NP G+ V Y
Sbjct: 201 PSKEVDRFKNPLGYQVESYRAD 222


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPSNT_00535  UREASE        1044   0.0      Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 1044 bits (2701), Expect = 0.0
Identities = 353/569 (62%), Positives = 442/569 (77%), Gaps = 4/569 (0%)

Query: 3 KISRKEYVSMYGPTTGDKVRLGDTDLIAEVEHDYTIYGEELKFGGGKTLREGMSQSN-NP 61
++SR Y +M+GPT GDKVRL DT+L EVE D+T +GEE+KFGGGK +R+GM QS
Sbjct: 4 RMSRAAYANMFGPTVGDKVRLADTELFIEVEKDFTTHGEEVKFGGGKVIRDGMGQSQVTR 63

Query: 62 SKEELDLIITNALIVDYTGIYKADIGIKDGKIAGIGKGGNKDMQDGVKNNLSVGPATEAL 121
+D +ITNALI+D+ GI KADIG+KDG+IA IGK GN DMQ GV + VGP TE +
Sbjct: 64 EGGAVDTVITNALILDHWGIVKADIGLKDGRIAAIGKAGNPDMQPGV--TIIVGPGTEVI 121

Query: 122 AGEGLIVTAGGIDTHIHFISPQQIPTAFASGVTTMIGGGTGPADGTNATTITPGRRNLKW 181
AGEG IVTAGG+D+HIHFI PQQI A SG+T M+GGGTGPA GT ATT TPG ++
Sbjct: 122 AGEGKIVTAGGMDSHIHFICPQQIEEALMSGLTCMLGGGTGPAHGTLATTCTPGPWHIAR 181

Query: 182 MLRAAEEYSMNLGFLAKGNTSNDASLADQIEAGAIGFKIHEDWGTTPSAINHALDVADKY 241
M+ AA+ + MNL F KGN S +L + + GA K+HEDWGTTP+AI+ L VAD+Y
Sbjct: 182 MIEAADAFPMNLAFAGKGNASLPGALVEMVLGGATSLKLHEDWGTTPAAIDCCLSVADEY 241

Query: 242 DVQVAIHTDTLNEAGCVEDTMAAIAGRTMHTFHTEGAGGGHAPDIIKVAGEHNILPASTN 301
DVQV IHTDTLNE+G VEDT+AAI GRT+H +HTEGAGGGHAPDII++ G+ N++P+STN
Sbjct: 242 DVQVMIHTDTLNESGFVEDTIAAIKGRTIHAYHTEGAGGGHAPDIIRICGQPNVIPSSTN 301

Query: 302 PTIPFTVNTEAEHMDMLMVCHHLDKSIKEDVQFADSRIRPQTIAAEDTLHDMGIFSITSS 361
PT P+TVNT AEH+DMLMVCHHL +I ED+ FA+SRIR +TIAAED LHD+G FSI SS
Sbjct: 302 PTRPYTVNTLAEHLDMLMVCHHLSPTIPEDIAFAESRIRKETIAAEDILHDIGAFSIISS 361

Query: 362 DSQAMGRVGEVITRTWQTADKNKKEFGRLKEEKGDNDNFRIKRYLSKYTINPAIAHGISE 421
DSQAMGRVGEV RTWQTADK K++ GRLKEE GDNDNFR+KRY++KYTINPAIAHG+S
Sbjct: 362 DSQAMGRVGEVAIRTWQTADKMKRQRGRLKEETGDNDNFRVKRYIAKYTINPAIAHGLSH 421

Query: 422 YVGSVEVGKVADLVLWSPAFFGVKPNMIIKGGFIALSQMGDANASIPTPQPVYYREMFAH 481
+GS+EVGK ADLVLW+PAFFGVKP+M++ GG IA + MGD NASIPTPQPV+YR MF
Sbjct: 422 EIGSLEVGKRADLVLWNPAFFGVKPDMVLLGGTIAAAPMGDPNASIPTPQPVHYRPMFGA 481

Query: 482 HGKAKYDANITFVSQAAYDKGIKEELGLERQVLPVKNCR-NITKKDMQFNDTTAHIEVNP 540
+G+++ ++++TFVSQA+ D G+ LG+ ++++ V+N R I K M N T HIEV+P
Sbjct: 482 YGRSRTNSSVTFVSQASLDAGLAGRLGVAKELVAVQNTRGGIGKASMIHNSLTPHIEVDP 541

Query: 541 ETYHVFVDGKEVTSKPANKVSLAQLFSIF 569
ETY V DG+ +T +PA + +AQ + +F
Sbjct: 542 ETYEVRADGELLTCEPATVLPMAQRYFLF 570
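
The percentages printed on the "Identities / Positives / Gaps" lines in this report appear to be truncated rather than rounded: for the urease hit, 442/569 is about 77.7% but is shown as 77%. The sketch below recomputes the integer percentages from the counts so a line can be checked against its printed values. The helper name is hypothetical, and truncation is an observation from the lines checked here, not a documented guarantee.

```python
import re

# Captures label, matched count, alignment length, and printed percentage.
FRACTION_RE = re.compile(r"(Identities|Positives|Gaps)\s*=\s*(\d+)/(\d+)\s*\((\d+)%\)")


def check_percentages(line):
    """Map each label to (count, length, printed_pct, truncated_pct)."""
    result = {}
    for label, count, length, printed in FRACTION_RE.findall(line):
        count, length, printed = int(count), int(length), int(printed)
        # Integer division drops the fractional part (truncation, not rounding).
        result[label] = (count, length, printed, count * 100 // length)
    return result


line = "Identities = 353/569 (62%), Positives = 442/569 (77%), Gaps = 4/569 (0%)"
for label, (count, length, printed, truncated) in check_percentages(line).items():
    print(label, count, length, printed, truncated)
```

When the last two numbers in each printed row agree, the line is internally consistent under the truncation assumption.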


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPSNT_00560  TYPE4SSCAGX   31     0.006    Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein signature.

Length = 522

Score = 31.3 bits (70), Expect = 0.006
Identities = 26/92 (28%), Positives = 49/92 (53%), Gaps = 6/92 (6%)

Query: 16 DELTALLSNAEVISDIKKLTELSKEQSSIEEISIASKEYLSVLEDIKENKELLEDKELSE 75
+ LT +SN + +S+ K L+EL K+Q E + + LED++E + K++ E
Sbjct: 181 ENLTNAMSNPQNLSNNKNLSELIKQQRENE------LDQMERLEDMQEQAQANALKQIEE 234

Query: 76 LAKEELKILEIQKSELETAIKQLLIPKDPNDD 107
L K++ + Q+++ + +IK K P D+
Sbjct: 235 LNKKQAEEAVRQRAKDKISIKTDKSQKSPEDN 266


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPSNT_00570  FLAGELLIN     33     0.002    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 33.5 bits (76), Expect = 0.002
Identities = 30/245 (12%), Positives = 68/245 (27%), Gaps = 21/245 (8%)

Query: 3 KSFKKLGFVSLAASSVLLGSMNATDLETYAALQKPSHVFDNYAKKDSGNSGTTHTTKNET 62
+ + + A A++ K + T
Sbjct: 238 TDDAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTIDTKTGNDGNG 297

Query: 63 PDSSAPPTKKDETSGSGDKNQHTVSSGTPTPSTSGVASQLVKDTTTVNNLKSVSVSGMNT 122
S+ +K + + S+ V + +V T ++ K+ + S +
Sbjct: 298 KVSTTINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDD-KTKNESAKLS 356

Query: 123 TLSGVETMSQQSATIGTLLNSSTDLSSAIPNAQGLSSAFGALESAQNTLKGYLDSSSATI 182
L + +S + + + G + S +TL +++
Sbjct: 357 DLEANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFIDKTASGVSTLINEDAAAAKK- 415

Query: 183 GQLTNGSNAVVGALDKAINQVDMALADLSATDTQKTQAVTLATTDSSATTDAINFLNALK 242
+ + ++D A+++VD + L A + AI L
Sbjct: 416 -----STANPLASIDSALSKVDAVRSSLGAIQNR--------------FDSAITNLGNTV 456

Query: 243 TNLTA 247
TNL +
Sbjct: 457 TNLNS 461


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPSNT_00610  TYPE4SSCAGX   30     0.033    Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein signature.

Length = 522

Score = 30.1 bits (67), Expect = 0.033
Identities = 30/130 (23%), Positives = 55/130 (42%)

Query: 3 KKANEEKAPKRAKQEAKAEAAQENKAKESKVKKAKTKESKFKEAKAKELVPVKKLSFNEA 62
K+ E+K ++EAK +A + K K K K+ + K E + + LS N+
Sbjct: 139 KELEEQKKALEKEKEAKEQAQKAQKDKREKRKEERAKNRANLENLTNAMSNPQNLSNNKN 198

Query: 63 LEELFANSLSDCVSYESIIQISTKVPTLAQIKKIKELCQKYQKKLVSSSEYAKKLNAIDK 122
L EL + + ++ + +K+I+EL +K ++ V K DK
Sbjct: 199 LSELIKQQRENELDQMERLEDMQEQAQANALKQIEELNKKQAEEAVRQRAKDKISIKTDK 258

Query: 123 IKKTEEKQKV 132
+K+ E +
Sbjct: 259 SQKSPEDNSI 268


2     HPSNT_01730  HPSNT_01850  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias     Product
HPSNT_01730  2       16       0.149399    hypothetical protein
HPSNT_01735  0       14       -0.181989   hypothetical protein
HPSNT_01740  0       11       -0.887193   type II restriction enzyme
HPSNT_01745  0       9        -0.252441   DNA methylase
HPSNT_01750  0       12       0.749558    hypothetical protein
HPSNT_01755  0       14       0.299060    cobalamin synthesis protein, P47K family; ATP/GTP
HPSNT_01760  3       16       -1.182251   nitrite extrusion protein (narK)
HPSNT_01765  4       17       -1.493087   putative heme iron utilization protein
HPSNT_01770  3       15       -1.662262   arginyl-tRNA synthetase
HPSNT_01775  4       13       -1.354890   Sec-independent protein translocase protein
HPSNT_01780  3       12       -1.561753   guanylate kinase
HPSNT_01785  3       11       -1.910789   poly E-rich protein
HPSNT_01790  -1      12       -1.883794   nuclease NucT
HPSNT_01795  1       12       -1.943841   outer membrane protein (omp10)
HPSNT_01800  3       14       -1.921442   flagellar basal body L-ring protein
HPSNT_01805  3       14       -1.617456   CMP-N-acetylneuraminic acid synthetase
HPSNT_01810  2       11       -0.986114   CMP-N-acetylneuraminic acid synthetase (neuA)
HPSNT_01815  2       11       -0.537353   flagellar biosynthesis protein G
HPSNT_01820  1       13       0.647561    tetraacyldisaccharide 4'-kinase
HPSNT_01825  0       15       1.278462    NAD synthetase
HPSNT_01830  0       18       1.280704    *ketol-acid reductoisomerase
HPSNT_01835  1       18       -0.070323   cell division inhibitor
HPSNT_01840  3       19       -1.121515   cell division topological specificity factor
HPSNT_01845  1       17       0.069949    DNA processing chain A
HPSNT_01850  2       18       -0.063839   Holliday junction resolvase-like protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPSNT_01760  TCRTETA       44     4e-07    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 44.4 bits (105), Expect = 4e-07
Identities = 53/271 (19%), Positives = 102/271 (37%), Gaps = 16/271 (5%)

Query: 28 LILSGSLTPHQSFQLGIAVLMGYVFGSFLIQFLSPLMSLESIAKISFGLIALSFLICYFD 87
L+ S +T H L + LM + L LS + +S A+ + I
Sbjct: 35 LVHSNDVTAHYGILLALYALMQFACAPVLGA-LSDRFGRRPVLLVSLAGAAVDYAI--MA 91

Query: 88 SIPFFW-LWIWRFIAGVASSALMILVAPLSLPYVKENKKALVGGLIFSAVGIGSVFSGFV 146
+ PF W L+I R +AG+ + A + +++A G + + G G V +
Sbjct: 92 TAPFLWVLYIGRIVAGI-TGATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVL 150

Query: 147 LPWISSYNIKWAWIFLGGSCLIAFILSLIGLK-----NRSLRKKSVKKEESAFKIPFHL- 200
+ ++ + + F+ L R ++ ++F+ +
Sbjct: 151 GGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMT 210

Query: 201 ---WLLLISCALNAIGFLPHTLFWVDYLIRHLNISPTIAGTSWALFG-FGATLGSLISGP 256
L+ + + +G +P L WV + + T G S A FG + ++I+GP
Sbjct: 211 VVAALMAVFFIMQLVGQVPAAL-WVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGP 269

Query: 257 MAQKLGAKNANIFILILKSIACFLPIFFHQI 287
+A +LG + A + +I L F +
Sbjct: 270 VAARLGERRALMLGMIADGTGYILLAFATRG 300


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPSNT_01780  PF05272       29     0.011    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.011
Identities = 9/18 (50%), Positives = 11/18 (61%)

Query: 8 LILSGPSGAGKSTLTKYL 25
++L G G GKSTL L
Sbjct: 599 VVLEGTGGIGKSTLINTL 616


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPSNT_01785  IGASERPTASE   76     2e-16    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 75.9 bits (186), Expect = 2e-16
Identities = 55/260 (21%), Positives = 98/260 (37%), Gaps = 17/260 (6%)

Query: 192 EEEKEEIKEEEKEEIKEEEKEEIKEEEKEEIKEEEKEEVKEEIKETPKEEEEPKEET-QE 250
+ EE + E E E E K+E K K E++ E T Q
Sbjct: 1006 DVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQN 1065

Query: 251 GEALKDEEVSKELETQ-------GEIKKETQEEQTKEQEPVKEQEPIKEETQEIKEEKQE 303
E K+ + + + TQ G KETQ +TKE V+++E K ET++ +E +
Sbjct: 1066 REVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKV 1125

Query: 304 KTQDSPSAQELEAMQELVKEIQE-----NSNGQEDKKETQEKTETPQEKETQELEIPKEK 358
+Q SP ++ E +Q + +E N + + T TE P ++ + +E P +
Sbjct: 1126 TSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTE 1185

Query: 359 TQESAETPQEKETQELEIPKEK----TQESAETPQEKETQENAETPQELETPQEKTQEKT 414
+ E E P ES+ P+ + + P +E + +++
Sbjct: 1186 STTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRS 1245

Query: 415 QEDHYESIEDIPEPVMAKAM 434
+ V++ A
Sbjct: 1246 TVALCDLTSTNTNAVLSDAR 1265



Score = 63.9 bits (155), Expect = 9e-13
Identities = 54/283 (19%), Positives = 98/283 (34%), Gaps = 20/283 (7%)

Query: 226 EKEEVKEEIKETPKEEEEPKEETQEGEALKDEEVSKELETQGEIKKETQEEQTKEQEPVK 285
E E+ + + T + +EE+++ E + ++ E V
Sbjct: 984 EVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEA--PVPPPAPATPSETTETVA 1041

Query: 286 EQEPIKEETQEIKEEKQEKTQDSPSAQELEAMQELVKEIQENSNGQEDKKETQEKTETPQ 345
E + +T E E+ +T EA + Q N Q ++T+T +
Sbjct: 1042 ENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQS--GSETKETQTTE 1099

Query: 346 EKETQELEIPKEKTQESAETPQEKETQELEIPK----EKTQESAETPQEKETQENAETPQ 401
KET +E ++ E+ +T + + PK E Q AE +E + N + PQ
Sbjct: 1100 TKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQ 1159

Query: 402 ELETPQEKTQEKTQEDHYESIEDIPEPVMAKAMGEELPFLNEAVAKTPNNENDTETPKES 461
Q T T++ E+ ++ +PV +V + P N T
Sbjct: 1160 S----QTNTTADTEQPAKETSSNVEQPVTESTTVNTGN----SVVENPENTTPATTQPTV 1211

Query: 462 VIKTPQETPQESVETPQESVETPQESVETPKESDKTSSPLELR 504
E+ + + SV + +VE S S + L
Sbjct: 1212 N----SESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALC 1250



Score = 63.5 bits (154), Expect = 1e-12
Identities = 40/235 (17%), Positives = 80/235 (34%), Gaps = 12/235 (5%)

Query: 148 EALVQEEPNNEEQLLPTLNDQEEKEEIKEEEKEEIKEEEK--------EEIKEEEKEEIK 199
EA V + K+E K EK E E +E K K +
Sbjct: 1022 EAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQ 1081

Query: 200 EEEKEEIKEEEKEEIKEEEKEEIKEEEKEEVKEEIKETPKEEEEPKEETQEGEALKDEEV 259
E + E KE E KE E++E+ K E ++T + + + + + E + +
Sbjct: 1082 TNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQP 1141

Query: 260 SKELETQGEIKKETQEEQTKEQEPVKEQEPIKEETQEIKEEKQEKTQDSPSAQELEAMQE 319
E + + +E Q++ ++P KE + +++ E T + ++ E
Sbjct: 1142 QAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTEST----TVNTGNSVVE 1197

Query: 320 LVKEIQENSNGQEDKKETQEKTETPQEKETQELEIPKEKTQESAETPQEKETQEL 374
+ + E+ K + + + + E S+ +L
Sbjct: 1198 NPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCDL 1252



Score = 58.2 bits (140), Expect = 6e-11
Identities = 51/271 (18%), Positives = 99/271 (36%), Gaps = 35/271 (12%)

Query: 274 QEEQTKEQEPVKEQEPIKEETQEIKEEKQEKTQ-----DSPSAQELEAMQ-ELVKEIQEN 327
+ QT + + I+ + + +E + P A + E V E +
Sbjct: 987 KRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQ 1046

Query: 328 SNGQEDKKETQEKTETPQEKETQELEIPK--------EKTQESAETPQEKETQELEIPKE 379
+ +K E T Q +E + E Q +ET +E +T E +
Sbjct: 1047 ESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSET-KETQTTETKETAT 1105

Query: 380 KTQESAETPQEKETQENAETPQELETPQEKTQEKTQEDHYESIEDIPEPVMAKAMGEELP 439
EKE + ET + E P+ +Q +++ E+++ EP E P
Sbjct: 1106 ---------VEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPAR-----ENDP 1151

Query: 440 FLNEAVAKTPNN-ENDTETPKESVIKTPQETPQESVETPQESVETPQESVETPKESDKTS 498
+N ++ N DTE P + +T Q T +V T VE P+ + +
Sbjct: 1152 TVNIKEPQSQTNTTADTEQPAK---ETSSNVEQP--VTESTTVNTGNSVVENPENTTPAT 1206

Query: 499 SPLELRLNLQDLLKSLNQESLKNLLENKTLS 529
+ + + K+ ++ S++++ N +
Sbjct: 1207 TQPTVNSESSNKPKNRHRRSVRSVPHNVEPA 1237



Score = 54.7 bits (131), Expect = 8e-10
Identities = 31/194 (15%), Positives = 69/194 (35%), Gaps = 8/194 (4%)

Query: 142 ENLGDLEALVQEEPNNEEQLLPTLNDQEEKEEIKEEEKEEIKEEEKEEIKEEEKEEIKEE 201
E + +E +N + T + E KE + E KE E +E+ K E E+
Sbjct: 1060 ETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVE-TEK 1118

Query: 202 EKEEIKEEEKEEIKEEEKEEIKEEEKEEVKEEIKETPKEEEEPKEETQEGEALKDEEVSK 261
+E K + K+E+ E ++ + + + + KE + T + E
Sbjct: 1119 TQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAK----- 1173

Query: 262 ELETQGEIKKETQEEQTKEQEPVKEQEPIKEETQEIKEEKQEKTQDSPSAQELEAMQELV 321
ET +++ E T + P + ++ + P + +++ +
Sbjct: 1174 --ETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVP 1231

Query: 322 KEIQENSNGQEDKK 335
++ + D+
Sbjct: 1232 HNVEPATTSSNDRS 1245


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPSNT_01800  FLGLRINGFLGH  190    8e-63    Flagellar L-ring protein signature.
		>FLGLRINGFLGH#Flagellar L-ring protein signature.

Length = 232

Score = 190 bits (483), Expect = 8e-63
Identities = 51/172 (29%), Positives = 84/172 (48%), Gaps = 18/172 (10%)

Query: 56 GERPLFADRRAMKPNDLITIIVSEKASANYSSS----KDYKSASGGNSTPPRLTYNGLDA 111
G +PLF DRR D +TI++ E SA+ SSS +D K+ G ++ P L GL
Sbjct: 61 GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYL--QGLFG 118

Query: 112 RKKQEAQYLDDKNNYNFTKSSNNTNFKGGGSQKKSEDLEIVLSARIIKVLENGNYFIYGN 171
+ + + S F G G S L+ + +VL NGN + G
Sbjct: 119 NARADVEA------------SGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGE 166

Query: 172 KEVLVDGEKQILKVSGVIRPYDIERNNTIQSKFLADAKIEYTNLGHLSDSNK 223
K++ ++ + ++ SGV+ P I +NT+ S +ADA+IEY G+++++
Sbjct: 167 KQIAINQGTEFIRFSGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQN 218


3     HPSNT_01935  HPSNT_01990  Y     N          N                   Genomic Island
LocusTag     DNBias  CDNBias  %GCBias     Product
HPSNT_01935  2       12       -0.438680   hypothetical protein
HPSNT_01940  0       13       0.904302    transketolase
HPSNT_01945  0       14       -0.755998   bifunctional riboflavin kinase/FMN
HPSNT_01950  1       11       -0.560767   hemolysin
HPSNT_01955  -1      11       -1.656273   hypothetical protein
HPSNT_01960  -1      10       -3.202632   aspartate carbamoyltransferase catalytic
HPSNT_01975  -2      9        -3.066769   ABC-type transport system, ATP binding protein
HPSNT_01980  -3      12       -4.265205   hypothetical protein
HPSNT_01985  -1      12       -3.263414   Holliday junction resolvase-like protein
HPSNT_01990  0       11       -3.602869   hypothetical protein
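
The five summary rows in section C are consistent with a simple reading of the flag columns: every listed region has Bias = Y, regions that also have Virulence = Y are labelled "Pathogenicity Island (biased-composition)", and the one region with Virulence = N is labelled "Genomic Island". The sketch below encodes only that observed pattern. The function name is hypothetical, and the rule is inferred from these rows, not taken from the PredictBias documentation; the insertion-element flag does not change the label in any row shown, so it is accepted but unused.

```python
def label_island(bias, virulence, insertion_elements):
    """Reproduce the Prediction column from the Y/N flags seen in this report.

    Assumption: inferred from the five rows in this section only.
    """
    if bias != "Y":
        # No unbiased region appears in this report, so no label is inferred.
        return None
    if virulence == "Y":
        return "Pathogenicity Island (biased-composition)"
    return "Genomic Island"


# Flags copied from the summary rows (S.No 1 to 5).
rows = [
    ("Y", "Y", "Y"),
    ("Y", "Y", "Y"),
    ("Y", "N", "N"),
    ("Y", "Y", "N"),
    ("Y", "Y", "N"),
]
labels = [label_island(*flags) for flags in rows]
```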
4     HPSNT_02430  HPSNT_02510  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias     Product
HPSNT_02430  -1      11       -3.258794   molybdenum ABC transporter ModB
HPSNT_02435  -1      9        -3.859020   molybdenum ABC transporter ModD
HPSNT_02440  -1      9        -2.249425   glutamyl-tRNA synthetase
HPSNT_02445  -1      11       -3.011555   hypothetical protein
HPSNT_02450  -2      12       -2.914304   outer membrane protein HopK
HPSNT_02455  -1      13       -2.771391   type II adenine specific methyltransferase
HPSNT_02460  -1      14       -1.979634   type II adenine specific methyltransferase
HPSNT_02465  -1      16       -1.289880   GTP-binding protein TypA
HPSNT_02470  1       18       -2.874824   type II adenine specific DNA methyltransferase
HPSNT_02475  5       18       -0.150127   type II restriction endonuclease
HPSNT_02480  7       17       0.484483    DNA (cytosine-5-)-methyltransferase
HPSNT_02485  2       16       -0.553761   hypothetical protein
HPSNT_02490  2       16       -0.489239   hypothetical protein
HPSNT_02495  1       15       -0.284283   catalase-like protein
HPSNT_02500  1       15       -0.559686   outer membrane protein HofC
HPSNT_02505  1       14       -1.782091   outer membrane protein HofD
HPSNT_02510  4       18       -2.772002   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPSNT_02435  PF05272       30     0.010    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.0 bits (67), Expect = 0.010
Identities = 11/23 (47%), Positives = 14/23 (60%)

Query: 30 VVALLGESGAGKSTILRILAGLE 52
V L G G GKST++ L GL+
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGLD 620


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPSNT_02465  TCRTETOQM     198    2e-57    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 198 bits (504), Expect = 2e-57
Identities = 115/461 (24%), Positives = 190/461 (41%), Gaps = 67/461 (14%)

Query: 3 NIRNIAVIAHVDHGKTTLVDGLLSQSGTFSEREKVDE--RVMDSNDLERERGITILSKNT 60
I NI V+AHVD GKTTL + LL SG +E VD+ D+ LER+RGITI + T
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 61 AIYYKDTKINIIDTPGHADFGGEVERVLKMVDGVLLLVDAQEGVMPQTKFVVKKALSFGI 120
+ +++TK+NIIDTPGH DF EV R L ++DG +LL+ A++GV QT+ + GI
Sbjct: 62 SFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGI 121

Query: 121 CPIVVVNKIDKPAAEPDRVVDEVFDLF---------VAMGASDKQLDFPV-----VYAAA 166
I +NKID+ + V ++ + V + + +F
Sbjct: 122 PTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEG 181

Query: 167 RDGYAMKSLDDE----------------------------KKNL--EPLFETILEHVPSP 196
D K + + K N+ + L E I S
Sbjct: 182 NDDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVITNKFYSS 241

Query: 197 SGSVDEPLQMQIFTLDYDNYVGKIGIARVFNGSVKKNESVLLMKSDGSKENGRITKLIGF 256
+ L ++F ++Y ++ R+++G + +SV + KE +IT++
Sbjct: 242 THRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRI----SEKEKIKITEMYTS 297

Query: 257 LGLARTEIENAYAGDIVAIAG--FNAMDV-GDSVVDPTNPMPLDPMHLEEPTMSVYFAVN 313
+ +I+ AY+G+IV + V GD+ + P +P P + +
Sbjct: 298 INGELCKIDKAYSGEIVILQNEFLKLNSVLGDTKLLPQRERIENP----LPLLQTTVEPS 353

Query: 314 DSPLAGLEGKHVTANKLKDRLLKEMQTNIAMKCEEMGEGKFKVSGRGELQITILAENLRR 373
+ + D LL+ + + +S G++Q+ + L+
Sbjct: 354 KPQQREMLLDALLEISDSDPLLRYYVDSAT--------HEIILSFLGKVQMEVTCALLQE 405

Query: 374 E-GFEFSISRPEVIIKEENGVKCEPFEHLVIDTPQDFSGAI 413
+ E I P VI E K E H+ + P F +I
Sbjct: 406 KYHVEIEIKEPTVIYMERPLKKAEYTIHIEVP-PNPFWASI 445



Score = 41.8 bits (98), Expect = 8e-06
Identities = 20/80 (25%), Positives = 30/80 (37%), Gaps = 1/80 (1%)

Query: 396 EPFEHLVIDTPQDFSGAIIERLGKRKAEMKAMNPMSDGYTRLEFEIPARGLIGYRSEFLT 455
EP+ I PQ++ K A + + + L EIPAR + YRS+
Sbjct: 537 EPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQ-LKNNEVILSGEIPARCIQEYRSDLTF 595

Query: 456 DTKGEGVMNHSFLEFRPFSG 475
T G V + +G
Sbjct: 596 FTNGRSVCLTELKGYHVTTG 615


5     HPSNT_02665  HPSNT_02810  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias     Product
HPSNT_02665  5       15       -2.357399   conserved hypothetical secreted protein
HPSNT_02680  8       19       -2.560548   cag pathogenicity island protein (cag1)
HPSNT_02685  8       18       -2.764414   cag pathogenicity island protein Epsilon
HPSNT_02690  9       15       -2.215036   cag pathogenicity island protein (cag3)
HPSNT_02695  9       17       -2.621447   cag pathogenicity island protein Gamma
HPSNT_02700  8       18       -2.548661   CAG pathogenicity island protein 5
HPSNT_02705  9       20       -2.964380   cag pathogenicity island protein alpha
HPSNT_02710  8       20       -3.214069   cag pathogenicity island protein Z
HPSNT_02715  9       20       -3.044277   cag pathogenicity island protein (cag7)
HPSNT_02720  10      27       -4.311842   cag pathogenicity island protein X
HPSNT_02725  10      28       -4.552016   cag pathogenicity island protein W
HPSNT_02730  13      29       -5.845205   cag island protein
HPSNT_02735  13      33       -6.273199   cag pathogenicity island protein U
HPSNT_02740  13      27       -5.818791   CAG pathogenicity island protein T
HPSNT_02745  9       24       -6.287618   CAG pathogenicity island protein S
HPSNT_02750  8       21       -5.631429   hypothetical protein
HPSNT_02755  6       17       -4.174009   hypothetical protein
HPSNT_02760  7       17       -3.011555   hypothetical protein
HPSNT_02765  6       18       -2.795636   cag pathogenicity island protein M
HPSNT_02770  7       19       -3.136284   cag pathogenicity island protein (cagN, cag17)
HPSNT_02775  6       19       -3.084959   cag pathogenicity island protein (cag18)
HPSNT_02780  6       20       -3.524204   cag pathogenicity island protein I
HPSNT_02785  7       20       -3.467277   cag pathogenicity island protein H
HPSNT_02790  7       21       -4.441003   cag pathogenicity island protein G
HPSNT_02795  7       21       -3.405612   cag pathogenicity island protein (cagF, cag22)
HPSNT_02800  5       19       -2.472026   cag pathogenicity island protein E
HPSNT_02805  3       18       -1.018741   cag pathogenicity island protein (cagD, cag24)
HPSNT_02810  3       18       -0.234023   cag pathogenicity island protein C
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPSNT_02695  TACYTOLYSIN   28     0.027    Bacterial thiol-activated pore-forming cytolysin sig...
		>TACYTOLYSIN#Bacterial thiol-activated pore-forming cytolysin signature.

Length = 574

Score = 27.6 bits (61), Expect = 0.027
Identities = 12/43 (27%), Positives = 22/43 (51%), Gaps = 3/43 (6%)

Query: 128 NKSVYQLVEMAIGAYNGG-MKHDPNGAYVKKFRCIYSQVRYNE 169
N+S Y VE Y G + GAYV ++ ++ ++ Y++
Sbjct: 451 NRSEY--VETTSTEYTSGKINLSHQGAYVAQYEILWDEINYDD 491


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPSNT_02715  IGASERPTASE   40     9e-05    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 40.0 bits (93), Expect = 9e-05
Identities = 32/166 (19%), Positives = 68/166 (40%), Gaps = 3/166 (1%)

Query: 701 AKESVKAYLDCVSQAKTEAEKKECEKLLTPEARKLLEEAKESVKAYKDCVSRARNEKEKK 760
++ + + ++KT + ++ T + R++ +EAK +VKA A++ E K
Sbjct: 1034 SETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETK 1093

Query: 761 ECEKLLTPEARKLLEEE-AKESVKAYLDCVSQAKTEAEKKECEKLLTPEAKKKLEEAKKS 819
E + T E + +EE AK + + + K+E + + P+A+ E
Sbjct: 1094 ETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTV 1153

Query: 820 AKAYLDCVSQAKTEAEKKECEKLLTPEARKLLEQQALDCLKNAKTE 865
+ SQ T A+ ++ K + + + + N+ E
Sbjct: 1154 NIK--EPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVE 1197



Score = 37.0 bits (85), Expect = 0.001
Identities = 29/194 (14%), Positives = 68/194 (35%), Gaps = 31/194 (15%)

Query: 767 TPEARKLLEEEAKESVKAYLDCVSQAKTEAEKKECEKLLTPEAKKKLEEAKKSAKAYLDC 826
P ++ + + ++KT + ++ T + ++ +EAK + KA
Sbjct: 1023 APVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQT 1082

Query: 827 VSQAKTEAEKKECEKLLTPEARKLLEQQALDCLKNAKTEAEKKRCVKDLPKDLQKKVLAK 886
A++ +E KE + T E + +++ AK E EK +
Sbjct: 1083 NEVAQSGSETKETQTTETKETATVEKEE------KAKVETEKT---------------QE 1121

Query: 887 ESLKAYKDCVSRARNEKEKKECEKLLTPEAKKLLEEAKKSVKAYLDCVSQAKTEAEKKEC 946
+ + ++E + + E + ++E + SQ T A+ ++
Sbjct: 1122 VPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQ----------SQTNTTADTEQP 1171

Query: 947 EKLLTPEARKLLEE 960
K + + + E
Sbjct: 1172 AKETSSNVEQPVTE 1185



Score = 36.6 bits (84), Expect = 0.001
Identities = 41/265 (15%), Positives = 87/265 (32%), Gaps = 25/265 (9%)

Query: 898 RARNEKEKKECEKLLTPEAKKLLEEAKKSVKAYLDCVSQAKTEAEKKECEKLLTPEARKL 957
+ NE+ + E + P A ++ + + ++KT + ++ T + R++
Sbjct: 1011 PSNNEEIARVDEAPVPPPAPAT--PSETTETVAENSKQESKTVEKNEQDATETTAQNREV 1068

Query: 958 LEEAKESLKAYKDCVSRARNEKEKKECEKLLTPEAKKLLEQQALDCLKNAKTDAEKKRCA 1017
+EAK ++KA A++ E KE + T E + +++ AK + EK +
Sbjct: 1069 AKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEE------KAKVETEKTQEV 1122

Query: 1018 KDLPKDLQKKVLAKESVKAYLDCVSRARNEKERKACEKLLTPEAKKLLEEAKESLKAYKD 1077
KV ++ S K + ++E + E + ++E + D
Sbjct: 1123 --------PKVTSQVSPK-------QEQSETVQPQAEPARENDPTVNIKEPQSQTNTTAD 1167

Query: 1078 CLSQARNEEERRACEKLLTPEARKLLEQEVKNSVKAYLDCVSRARNEKEKQECEKLLTPE 1137
Q E + + V+N N E K
Sbjct: 1168 -TEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNS-ESSNKPKNRHRR 1225

Query: 1138 ARKFLAKQVLSCLEKAKNEEERKAC 1162
+ + + V + + C
Sbjct: 1226 SVRSVPHNVEPATTSSNDRSTVALC 1250



Score = 34.7 bits (79), Expect = 0.005
Identities = 28/187 (14%), Positives = 69/187 (36%), Gaps = 4/187 (2%)

Query: 499 KARNEKEKKECEKLLTPEAKKKLEQQVLDCLKNAKTDEERKKCLKDLPKD--LQSDILAK 556
+ NE+ + E + P A + +N+K + + + + + Q+ +AK
Sbjct: 1011 PSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAK 1070

Query: 557 ESLKAYKDCVSQAKTEAEKKECEKLLTPEAKKLLEEEAKESVKAYLDCVSQAKNEAEKKE 616
E+ K + E ++ T E K+ E +E K + + +
Sbjct: 1071 EAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVS 1130

Query: 617 CEKLLTPEAKKKLEEAKKSVKAYL--DCVSQARTEAEKKECEKLLTPEAKKLLEQQALDC 674
++ + + + E A+++ + SQ T A+ ++ K + ++ + +
Sbjct: 1131 PKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVN 1190

Query: 675 LKNAKTE 681
N+ E
Sbjct: 1191 TGNSVVE 1197



Score = 33.1 bits (75), Expect = 0.015
Identities = 38/217 (17%), Positives = 85/217 (39%), Gaps = 7/217 (3%)

Query: 555 AKESLKAYKDCVSQAKTEAEKKECEKLLTPEAKKLLEEEAKESVKAYLDC--VSQAKNEA 612
++ + ++ ++KT EK E + T + + +EAK +VKA V+Q+ +E
Sbjct: 1034 SETTETVAENSKQESKTV-EKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSET 1092

Query: 613 EKKECEKLLTPEAKKKLEEAKKSVKAYLDCVSQARTEAEKKECEKLLTPEAKKLLEQQAL 672
++ + + +K E+AK + + + K+E + + P+A+ E
Sbjct: 1093 KETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPT 1152

Query: 673 DCLKNAK----TEADKKRCVKDLPKDLQKKVLAKESVKAYLDCVSQAKTEAEKKECEKLL 728
+K + T AD ++ K+ ++++ V +V V + +
Sbjct: 1153 VNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVN 1212

Query: 729 TPEARKLLEEAKESVKAYKDCVSRARNEKEKKECEKL 765
+ + K + SV++ V A + L
Sbjct: 1213 SESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVAL 1249



Score = 32.0 bits (72), Expect = 0.030
Identities = 32/243 (13%), Positives = 77/243 (31%), Gaps = 11/243 (4%)

Query: 607 QAKNEAEKKECEKLLTP-------EAKKKLEEAKKSVKAYLDCVSQARTEAEKKECEKLL 659
+ NE + E + P E + + E K ++ Q TE + E
Sbjct: 1011 PSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAK 1070

Query: 660 TPEAKKLLEQQALDCLKNAKTEADKKRCVKDLPKDLQKKVLAKESVKAYLDCVSQAKTEA 719
++ Q + ++ + + ++K+ AK + + +
Sbjct: 1071 EAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVS 1130

Query: 720 EKKECEKLLTPEARKLLEEAKESVKAYKDCVSRARNEKEKKECEKLLTPEARKLLEEEAK 779
K+E + + P+A E + K+ S+ + ++ K + + + E
Sbjct: 1131 PKQEQSETVQPQAEPAREN--DPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTT 1188

Query: 780 ESVKAYLDCVSQAKTEAEKKECEKLLTPEAKKKLEEAKKSAKAYLDCVSQAKTEAEKKEC 839
+ + + T A + + K ++S ++ V A T + +
Sbjct: 1189 VNTGNSVVENPENTTPATTQPTVNSESSNKPK--NRHRRSVRSVPHNVEPATTSSNDRST 1246

Query: 840 EKL 842
L
Sbjct: 1247 VAL 1249


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPSNT_02720  TYPE4SSCAGX   867    0.0      Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein signature.

Length = 522

Score = 867 bits (2242), Expect = 0.0
Identities = 512/522 (98%), Positives = 516/522 (98%), Gaps = 1/522 (0%)

Query: 1 MGQAFFKKIVGCFCLGYLFLSSVIEAAP-DIKNFNRGRVKVVNKKIAYLGDEKPITIWTS 59
MGQAFFKKIVGCFCLGYLFLSS IEA DIKNFNRGRVKVVNKKIAYLGDEKPITIWTS
Sbjct: 1 MGQAFFKKIVGCFCLGYLFLSSAIEAVALDIKNFNRGRVKVVNKKIAYLGDEKPITIWTS 60

Query: 60 LDNVTVIQLEKDETISYITTGFNKGWSIVPNSNHIFIQPKSVKSNLMFEKEAVNFALMTR 119
LDNVTVIQLEKDETISYITTGFNKGWSIVPNSNHIFIQPKSVKSNLMFEKEAVNFALMTR
Sbjct: 61 LDNVTVIQLEKDETISYITTGFNKGWSIVPNSNHIFIQPKSVKSNLMFEKEAVNFALMTR 120

Query: 120 DYQEFLKTKKLIVDAPDPKELEEQKKALEKEKEAKEQAQKAQKDKKEKRKEERAKNRANL 179
DYQEFLKTKKLIVDAPDPKELEEQKKALEKEKEAKEQAQKAQKDK+EKRKEERAKNRANL
Sbjct: 121 DYQEFLKTKKLIVDAPDPKELEEQKKALEKEKEAKEQAQKAQKDKREKRKEERAKNRANL 180

Query: 180 ENLTNAMSNPQNLSNNKNLSELIKQQRENELDQMERLEDMQEQAQANALKQIEELNKKQA 239
ENLTNAMSNPQNLSNNKNLSELIKQQRENELDQMERLEDMQEQAQANALKQIEELNKKQA
Sbjct: 181 ENLTNAMSNPQNLSNNKNLSELIKQQRENELDQMERLEDMQEQAQANALKQIEELNKKQA 240

Query: 240 EETIKQRAKDKISIKTDKPQKSPEDNSIELSPSDSAWRTNLVVRTNKALYQFILRIAQKD 299
EE ++QRAKDKISIKTDK QKSPEDNSIELSPSDSAWRTNLVVRTNKALYQFILRIAQKD
Sbjct: 241 EEAVRQRAKDKISIKTDKSQKSPEDNSIELSPSDSAWRTNLVVRTNKALYQFILRIAQKD 300

Query: 300 NFASAYLTVKLEYPQRHEVSSVIEEELKKREEAKRQRELIKQENLNTTAYINRVMMASNE 359
NFASAYLTVKLEYPQRHEVSSVIEEELKKREEAKRQRELIKQENLNTTAYINRVMMASNE
Sbjct: 301 NFASAYLTVKLEYPQRHEVSSVIEEELKKREEAKRQRELIKQENLNTTAYINRVMMASNE 360

Query: 360 QIINKEKIREEKQKIILDQAKALETQYVHNALKRNPVPRNYNYYQAPEKRSKHIMPSEIF 419
QIINKEKIREEKQKIILDQAKALETQYVHNALKRNPVPRNYNYYQAPEKRSKHIMPSEIF
Sbjct: 361 QIINKEKIREEKQKIILDQAKALETQYVHNALKRNPVPRNYNYYQAPEKRSKHIMPSEIF 420

Query: 420 DDGTFTYFGFKNITLQPAIFVVQPDGKLSMTDAAIDPNMTNSGLRWYRVNEIAEKFKLIK 479
DDGTFTYFGFKNITLQPAIFVVQPDGKLSMTDAAIDPNMTNSGLRWYRVNEIAEKFKLIK
Sbjct: 421 DDGTFTYFGFKNITLQPAIFVVQPDGKLSMTDAAIDPNMTNSGLRWYRVNEIAEKFKLIK 480

Query: 480 DKALVTVINKGYGKNPLTRNYNIKNYGELERVIKKLPLVRDK 521
DKALVTVINKGYGKNPLT+NYNIKNYGELERVIKKLPLVRDK
Sbjct: 481 DKALVTVINKGYGKNPLTKNYNIKNYGELERVIKKLPLVRDK 522


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPSNT_02730  PF04335       119    3e-35    VirB8 type IV secretion protein
		>PF04335#VirB8 type IV secretion protein

Length = 227

Score = 119 bits (300), Expect = 3e-35
Identities = 43/205 (20%), Positives = 73/205 (35%), Gaps = 10/205 (4%)

Query: 27 KLNKANRTFKRAFYL---SMVLNIAAVTSIVMMMPLKKTDIFVYGIDRYTGEFKIVKRSD 83
KL A R+ K A+ + + L A V ++ + PLK + +V +DR TGE I +
Sbjct: 24 KLAAAERSKKLAWVVAGVAGALATAGVVAVAALTPLKTVEPYVITVDRNTGEASIAAKLH 83

Query: 84 A-RQIVNSEAVVDSATSKFVSLLFGYSKNSLRDRKDQLMQYCDVSFQTQAMRMFNENIRQ 142
I EAV + +V G+ + + D +M Q + R + + Q
Sbjct: 84 GDATITYDEAVRKYFLATYVRYREGWIAAAREEYFDAVMVMSARPEQDRWSRFYKTDNPQ 143

Query: 143 FVDKVRA-EAIISSNIQREKVKNSPLTRLTFFITIKITPDTMENYEYITKKQVTIYYDFA 201
+ A + I + +F +T T TI Y
Sbjct: 144 SPQNILANRTDVFVEI-KRVSFLGGNVAQVYFTKESVTGSNS----TKTDAVATIKYKVD 198

Query: 202 RGNSSQENLIINPFGFKVFDIQITD 226
S + + NP G++V +
Sbjct: 199 GTPSKEVDRFKNPLGYQVESYRADV 223


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPSNT_02770  TYPE4SSCAGX   30     0.011    Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein signature.

Length = 522

Score = 30.1 bits (67), Expect = 0.011
Identities = 29/119 (24%), Positives = 56/119 (47%), Gaps = 16/119 (13%)

Query: 24 AINTALLPSEYKELVALGFKKIKTLHQRHDDEEVTKEEKEFATNALREKLRNDRARAEQI 83
A+N AL+ +Y+E + K K + D +E+ +++K EK + + +A++
Sbjct: 112 AVNFALMTRDYQEFL----KTKKLIVDAPDPKELEEQKKAL------EKEKEAKEQAQKA 161

Query: 84 QKNIEAFEKKNNSSVQKKAAKHKGLQELNEINANPLNDNPNSNSSAETKSNKDDNFDEM 142
QK+ K +++A L+ L +NP N + N N S K +++ D+M
Sbjct: 162 QKD------KREKRKEERAKNRANLENLTNAMSNPQNLSNNKNLSELIKQQRENELDQM 214


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPSNT_02800  ACRIFLAVINRP  33     0.007    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 32.9 bits (75), Expect = 0.007
Identities = 20/88 (22%), Positives = 32/88 (36%), Gaps = 18/88 (20%)

Query: 19 EVQKRQFQKIEELKADMQKGVNPFFKVLFDGGNRLFGFPETFIYSSI-------FILFVT 71
+ K K+ EL+ +G+ +D F+ SI F +
Sbjct: 301 DTAKAIKAKLAELQPFFPQGMK--VLYPYD--------TTPFVQLSIHEVVKTLFEAIML 350

Query: 72 IVLSVILF-QAYEPVLIVAIVIVLVALG 98
+ L + LF Q LI I + +V LG
Sbjct: 351 VFLVMYLFLQNMRATLIPTIAVPVVLLG 378


6  HPSNT_03555  HPSNT_03580  Y  N  N  Genomic Island
LocusTag     DNBias  CDNBias  %GCBias    Product
HPSNT_03555  1       12       3.678734   acetyl-CoA acetyltransferase
HPSNT_03560  3       13       3.438239   succinyl-CoA-transferase subunit A
HPSNT_03565  4       14       3.317034   succinyl-CoA-transferase subunit B
HPSNT_03570  4       14       2.794008   short-chain fatty acids transporter
HPSNT_03575  3       14       2.050468   outer membrane protein
HPSNT_03580  3       15       2.335691   hydantoin utilization protein A (hyuA)
7  HPSNT_03685  HPSNT_03775  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
HPSNT_03685  2       14       -0.957775  hypothetical protein
HPSNT_03690  2       15       -1.025984  hypothetical protein
HPSNT_03695  3       14       -0.575922  RNA polymerase factor sigma-54
HPSNT_03700  2       15       0.382974   ABC transporter ATP-binding protein
HPSNT_03705  1       14       0.405332   hypothetical protein
HPSNT_03710  1       13       1.394920   DNA polymerase III subunits gamma and tau
HPSNT_03715  2       14       2.695211   L-lysine exporter; membrane protein
HPSNT_03720  2       17       2.843732   hypothetical protein
HPSNT_03725  3       16       2.689610   hypothetical protein
HPSNT_03730  2       15       2.743938   outer membrane protein
HPSNT_03735  1       12       2.203268   anaerobic C4-dicarboxylate transporter
HPSNT_03740  0       12       0.475694   L-asparaginase II
HPSNT_03745  0       13       -0.674122  outer membrane protein HopP
HPSNT_03750  1       12       -1.951408  outer membrane protein
HPSNT_03755  2       14       -2.710086  tRNA-dihydrouridine synthase B
HPSNT_03760  2       17       -4.027532  tRNA(Ile)-lysidine synthase
HPSNT_03765  3       20       -4.511636  hypothetical protein
HPSNT_03770  2       15       -2.676339  hypothetical protein
HPSNT_03775  2       14       -2.392460  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPSNT_03710  IGASERPTASE   35     0.001    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 34.7 bits (79), Expect = 0.001
Identities = 27/93 (29%), Positives = 39/93 (41%), Gaps = 4/93 (4%)

Query: 470 ENIKIALKNSSENNSALEVVKELKFPSLK---PKPTTETTAELKEKETKEKEVQENDTKE 526
NI+ + + NN + V E P P TTET AE ++E+K E E D E
Sbjct: 1001 NNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATE 1060

Query: 527 IQETQPKEAPTALQEFMAN-HSNLIEEIKSEFE 558
+ A A AN +N + + SE +
Sbjct: 1061 TTAQNREVAKEAKSNVKANTQTNEVAQSGSETK 1093


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPSNT_03720  SECA          28     0.013    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 27.9 bits (62), Expect = 0.013
Identities = 12/43 (27%), Positives = 23/43 (53%), Gaps = 2/43 (4%)

Query: 71 RIARKNLSKMSEEDFKKMREEVRK--ELEEKTKGLSDEEIKAK 111
++ K ++ ++MR+ V +E + + LSDEE+K K
Sbjct: 4 KLLTKVFGSRNDRTLRRMRKVVNIINAMEPEMEKLSDEELKGK 46


8  HPSNT_04495  HPSNT_04535  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
HPSNT_04495  1       14       3.186743   hydrogenase nickel incorporation protein
HPSNT_04500  1       12       3.325322   flagellar hook protein FlgE
HPSNT_04505  1       14       2.634598   CDP-diacylglycerol pyrophosphatase
HPSNT_04510  1       14       2.607538   alkylphosphonate uptake protein
HPSNT_04515  2       14       2.416721   hypothetical protein
HPSNT_04520  3       14       2.501508   hypothetical protein
HPSNT_04525  2       15       1.714238   catalase
HPSNT_04530  2       13       0.678437   iron-regulated outer membrane protein
HPSNT_04535  3       16       -2.073589  Holliday junction resolvase
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPSNT_04500  FLGHOOKAP1    42     7e-06    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 42.3 bits (99), Expect = 7e-06
Identities = 18/75 (24%), Positives = 36/75 (48%), Gaps = 2/75 (2%)

Query: 645 GNVFSQTGNSGQALIGAANTGR--RGSISGSKLESSNVDLSRSLTNLIVVQRGFQANSKA 702
++ S GN L ++ T +S + S V+L NL Q+ + AN++
Sbjct: 472 ASLVSDIGNKTATLKTSSATQGNVVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQV 531

Query: 703 VTTSDQILNTLLNLK 717
+ T++ I + L+N++
Sbjct: 532 LQTANAIFDALINIR 546



Score = 39.2 bits (91), Expect = 5e-05
Identities = 11/35 (31%), Positives = 20/35 (57%)

Query: 4 SLWSGVNGMQAHQIALDIESNNIANVNTTGFKYSR 38
+ + ++G+ A Q AL+ SNNI++ N G+
Sbjct: 3 LINNAMSGLNAAQAALNTASNNISSYNVAGYTRQT 37


9  HPSNT_04620  HPSNT_04665  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
HPSNT_04620  2       24       2.513256   hypothetical protein
HPSNT_04625  1       21       2.538136   virulence associated protein D (vapD)
HPSNT_04630  1       21       3.494775   hypothetical protein
HPSNT_04635  1       22       3.813783   hypothetical protein
HPSNT_04640  0       20       3.547541   outer membrane protein BabA
HPSNT_04645  0       16       1.199693   hypothetical protein
HPSNT_04650  1       16       1.275546   **hypothetical protein
HPSNT_04655  0       15       1.980675   hydrogenase isoenzymes formation protein HypD
HPSNT_04660  2       15       1.155081   hydrogenase assembly chaperone
HPSNT_04665  2       16       1.838655   hydrogenase/urease nickel incorporation protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPSNT_04620  BINARYTOXINA  26     0.033    Clostridial binary toxin A signature.
		>BINARYTOXINA#Clostridial binary toxin A signature.

Length = 454

Score = 25.8 bits (56), Expect = 0.033
Identities = 21/71 (29%), Positives = 33/71 (46%), Gaps = 1/71 (1%)

Query: 10 YTQYSEKQLFNFLNSIKTKQKRALEKLKEIQAQKQ-RIKKALQFKALHLTENGYTIEEER 68
Y + EK FN + + + +LEK E++ Q ++ K FK + L E G E+
Sbjct: 134 YFESPEKFAFNKEIRTENQNEISLEKFNELKETIQDKLFKQDGFKDVSLYEPGNGDEKPT 193

Query: 69 EILARAKDTKN 79
+L K KN
Sbjct: 194 PLLIHLKLPKN 204


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPSNT_04625  PF04605       100    3e-31    Virulence-associated protein D (VapD)
		>PF04605#Virulence-associated protein D (VapD)

Length = 125

Score = 100 bits (251), Expect = 3e-31
Identities = 21/93 (22%), Positives = 44/93 (47%), Gaps = 3/93 (3%)

Query: 3 ALAFDLKIEILKKEYGEPYNKAYDDLRQELELLGFEWTQGSVYVNYSKENTLAQVYKAIN 62
A+ FDL + L+K + + + Y +++ + GFE Q S Y + N +V + +N
Sbjct: 7 AINFDLSTKSLEKYF-KDTREPYSLIKKFMLENGFEHRQYSGYTSKEPIN-ERRVIRIVN 64

Query: 63 KLS-QIEWFKKSVGDIRAFKVEDFSDFTEIVKS 94
KL+ + W + V + ++ + E ++
Sbjct: 65 KLTKKFTWLGECVKEFDITEIGEQYSLKETIQD 97


10  HPSNT_05470  HPSNT_05535  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
HPSNT_05470  3       9        0.246326   glucokinase
HPSNT_05475  3       11       -0.827910  mannitol dehydrogenase
HPSNT_05480  3       13       -1.133204  putative lipopolysaccharide biosynthesis
HPSNT_05485  2       14       1.638855   hypothetical protein
HPSNT_05490  2       13       2.343847   hypothetical protein
HPSNT_05495  0       15       2.866367   outer membrane protein (omp23)
HPSNT_05500  0       13       2.548856   pyruvate flavodoxin oxidoreductase subunit
HPSNT_05505  -1      10       1.995609   pyruvate flavodoxin oxidoreductase subunit
HPSNT_05510  -1      8        1.305375   pyruvate flavodoxin oxidoreductase subunit
HPSNT_05515  0       11       -0.430493  pyruvate ferredoxin oxidoreductase, beta
HPSNT_05520  2       13       -1.027597  adenylosuccinate lyase
HPSNT_05525  4       16       -1.532099  putative outer membrane protein
HPSNT_05530  5       18       -1.531532  excinuclease ABC subunit B
HPSNT_05535  3       19       -0.647530  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPSNT_05500  YERSSTKINASE  28     0.031    Yersinia serine/threonine protein kinase signature.
		>YERSSTKINASE#Yersinia serine/threonine protein kinase signature.

Length = 732

Score = 27.8 bits (61), Expect = 0.031
Identities = 12/33 (36%), Positives = 21/33 (63%)

Query: 80 IENIFANEKEDTTYIITSYFNKEELFEKKPELK 112
+ N+ A+EK D ++++ + E FEK PE+K
Sbjct: 314 VGNLGASEKSDVFLVVSTLLHCIEGFEKNPEIK 346


11  HPSNT_05580  HPSNT_05610  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
HPSNT_05580  3       14       -0.942953  hypothetical protein
HPSNT_05585  4       13       -1.844462  hypothetical protein
HPSNT_05590  5       13       -1.730902  FKBP-type peptidyl-prolyl cis-trans isomerase
HPSNT_05595  5       16       -2.351789  hypothetical protein
HPSNT_05600  5       15       -2.000457  peptidoglycan-associated lipoprotein precursor
HPSNT_05605  2       15       -0.071564  translocation protein TolB
HPSNT_05610  2       18       -0.030049  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPSNT_05600  OMPADOMAIN    147    8e-46    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 147 bits (373), Expect = 8e-46
Identities = 46/172 (26%), Positives = 72/172 (41%), Gaps = 26/172 (15%)

Query: 22 NMDKETVAGDVSAKTVQSAPVSTETVQEKQEPKEEPKQEPAPVVEEKPAVESGTIIASIY 81
D ++ VS + Q P P PAP V+ K T+ + +
Sbjct: 177 RPDNGMLSLGVSYRFGQGEAA----------PVVAPAPAPAPEVQTK----HFTLKSDVL 222

Query: 82 FDFDKYEIKESDQETLDEIVQKAKE---NHMQVLLEGNTDEFGSSEYNQALGVKRTLSVK 138
F+F+K +K Q LD++ + V++ G TD GS YNQ L +R SV
Sbjct: 223 FNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAYNQGLSERRAQSVV 282

Query: 139 NALVIKGVEKDMIKTISFGETKPKCTQ-----KTR----ECYRENRRVDVKL 181
+ L+ KG+ D I GE+ P K R +C +RRV++++
Sbjct: 283 DYLISKGIPADKISARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEV 334


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPSNT_05610  GPOSANCHOR    29     0.027    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 28.9 bits (64), Expect = 0.027
Identities = 20/148 (13%), Positives = 37/148 (25%), Gaps = 9/148 (6%)

Query: 27 LERHNKEAEKILLDLGKKNEQVIDLNLEDLPSDEKAVEKAVEKVEEKKDEKVAEKNATDK 86
LE+ + A K + L E + + E
Sbjct: 160 LEKALEGAMNFSTADSAKIKT---LEAEKAALEARQAELEKALEGAMNFSTADSAKIKTL 216

Query: 87 EGDFIDPKEQEESLENIFSSL-NDFQEKTDTNAQKDEQKNEQEEEQRRLKEQQRLRKNQK 145
E + ++ LE N + + +K E Q L++ N
Sbjct: 217 EAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFS 276

Query: 146 NQEM-----LKGLQQNLDQFAQKLESVK 168
+ L+ + L+ LE
Sbjct: 277 TADSAKIKTLEAEKAALEAEKADLEHQS 304


12  HPSNT_05665  HPSNT_05720  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
HPSNT_05665  2       17       -0.518094  chromosome partitioning protein
HPSNT_05670  3       18       -1.108214  biotin--protein ligase
HPSNT_05675  4       21       -1.511140  methionyl-tRNA formyltransferase
HPSNT_05690  5       19       -1.511565  hypothetical protein
HPSNT_05695  2       22       -1.647443  hypothetical protein
HPSNT_05700  0       18       -0.969079  hypothetical protein
HPSNT_05705  1       15       -0.674791  hypothetical protein
HPSNT_05710  1       14       -0.067348  hypothetical protein
HPSNT_05715  2       14       0.341330   hypothetical protein
HPSNT_05720  2       15       0.790198   50S ribosomal protein L19
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPSNT_05665  PF07675       31     0.004    Cleaved Adhesin
		>PF07675#Cleaved Adhesin

Length = 1358

Score = 31.2 bits (70), Expect = 0.004
Identities = 30/105 (28%), Positives = 40/105 (38%), Gaps = 7/105 (6%)

Query: 69 QISQVILKTQMPFLDLVPSNLGLAGFEKTFYDSQDENKRGELMLKNALESVV---GLYDY 125
VI T F SNL A FE + D + ++ VV G+YDY
Sbjct: 414 TFGSVIPATGPLFTGTASSNLYSANFEYLTPANADPVVTTQNIIVTGQGEVVIPGGVYDY 473

Query: 126 IIIDSPPALGPLTINSLSAAHSVIIPIQCEFFALEGTKLLLNTIR 170
I + PA G + I A P + + FA E K T+R
Sbjct: 474 CITNPEPASGKMWI----AGDGGNQPARYDDFAFEAGKKYTFTMR 514


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPSNT_05675  FERRIBNDNGPP  33     0.001    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 33.0 bits (75), Expect = 0.001
Identities = 12/33 (36%), Positives = 20/33 (60%)

Query: 70 EPEVQILKDLKPDFIVVVAYGKILPKEVLAIAP 102
EP +++L ++KP F+V A P+ + IAP
Sbjct: 86 EPNLELLTEMKPSFMVWSAGYGPSPEMLARIAP 118


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPSNT_05690  RTXTOXIND     44     9e-07    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 43.7 bits (103), Expect = 9e-07
Identities = 26/170 (15%), Positives = 59/170 (34%), Gaps = 18/170 (10%)

Query: 51 RAQYQSHFKMLEQKEEALKEREREQKAQFDDAVKQASALALQDERTKIIEEARKNAFLEQ 110
+ Q+ + QKE L ++ E+ + + ++ R + +
Sbjct: 192 KEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAK 251

Query: 111 QKGLELLQKELEEKSKQVQELHQKEAEIERLKRENNEAESRLKAENEKKLNEKLETEREK 170
LE K +E EL ++++E+++ E A+ + + NE L+ R+
Sbjct: 252 HAVLEQENKYVEAV----NELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQ- 306

Query: 171 IEKALHEKNELKFKQQEEQLEMLRNELKNAQRKAELSSQQLQGEVQELAI 220
+L + + +A +S +VQ+L +
Sbjct: 307 --------TTDNIGLLTLELAKNEERQQASVIRAPVS-----VKVQQLKV 343


13  HPSNT_07040  HPSNT_07180  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
HPSNT_07040  2       26       -3.457922  16S ribosomal RNA methyltransferase KsgA/Dim1
HPSNT_07045  1       22       -3.155728  hypothetical protein
HPSNT_07050  0       16       -1.461178  hypothetical protein
HPSNT_07055  -1      13       -2.299481  hypothetical protein
HPSNT_07060  -1      13       -2.686298  putative type I restriction enzyme specificity
HPSNT_07065  -1      13       -2.715155  type I R-M system specificity subunit
HPSNT_07070  0       10       -1.128778  formyltetrahydrofolate deformylase
HPSNT_07075  0       15       -0.366120  protease IV (PspA)
HPSNT_07080  2       17       -0.697486  hypothetical protein
HPSNT_07105  2       15       -0.601743  hypothetical protein
HPSNT_07110  1       15       -1.089501  hypothetical protein
HPSNT_07115  1       16       -0.506888  hypothetical protein
HPSNT_07120  3       15       -0.996774  peptidyl-prolyl cis-trans isomerase B
HPSNT_07125  2       17       -1.642514  carbon storage regulator
HPSNT_07130  2       17       -2.008007  4-diphosphocytidyl-2-C-methyl-D-erythritol
HPSNT_07135  2       20       -1.852877  SsrA-binding protein
HPSNT_07140  2       17       0.089677   biopolymer transport protein
HPSNT_07145  1       16       -0.775282  biopolymer transport protein
HPSNT_07150  0       16       -0.524566  50S ribosomal protein L34
HPSNT_07155  0       15       0.409994   ribonuclease P protein component
HPSNT_07160  -1      15       0.609075   hypothetical protein
HPSNT_07165  0       13       0.271545   membrane protein insertase
HPSNT_07170  -1      11       0.157758   hypothetical protein
HPSNT_07175  0       10       0.656262   tRNA modification GTPase TrmE
HPSNT_07180  2       11       1.399279   outer membrane protein HomD
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPSNT_07065  ENTSNTHTASED  31     0.003    Enterobactin synthetase component D signature.
		>ENTSNTHTASED#Enterobactin synthetase component D signature.

Length = 234

Score = 31.2 bits (70), Expect = 0.003
Identities = 21/60 (35%), Positives = 25/60 (41%), Gaps = 6/60 (10%)

Query: 115 TLIPIPPLNEQSAIADIL-SALDNYLHALRALILKKESVKKALSFELLSQRKRLKGFNQA 173
T + P S IL ++L + AL KESV KA S R L GFN A
Sbjct: 116 TATELAPSIIDSDERQILQASLLPFPLALTLAFSAKESVYKA-----FSDRVTLPGFNSA 170


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPSNT_07165  60KDINNERMP   425    e-146    60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 425 bits (1095), Expect = e-146
Identities = 165/572 (28%), Positives = 278/572 (48%), Gaps = 61/572 (10%)

Query: 10 RLILAIALSFLFITLYSYFFQKPNKT--TTQTTKQETANNHTATNPNTPNAQNFSVTQTI 67
R +L IAL F+ ++ + Q N QTT+ T +A + P +
Sbjct: 5 RNLLVIALLFVSFMIWQAWEQDKNPQPQAQQTTQTTTTAAGSAADQGVPASGQGK----- 59

Query: 68 PQENLLSTISFEHARIEIDSLGR--IKQVYLKDKKYLTPKQKGFLEHVGHLFNPKANPQT 125
L ++ + + I++ G + + K L Q L F +A
Sbjct: 60 -----LISVKTDVLDLTINTRGGDVEQALLPAYPKELNSTQPFQLLETSPQFIYQAQSGL 114

Query: 126 PLKELPLLAADKLKPLEVRFLDPTLNNKAFNTPYSASKTTLGPNEQLV--LTQDLGALTI 183
++ P A+ +PL +N A G NE V D T
Sbjct: 115 TGRDGPDNPANGPRPL-------------YNVEKDAYVLAEGQNELQVPMTYTDAAGNTF 161

Query: 184 IKTLTFYDDLHYDLQIAFKSSN--------NIIPSYVITNGYRPVADLDS-----YTFSG 230
KT Y + + + N + + P D S +TF G
Sbjct: 162 TKTFVLKRG-DYAVNVNYNVQNAGEKPLEISSFGQLKQSITLPPHLDTGSSNFALHTFRG 220

Query: 231 VLLENNDKKIEKIE---DKDAKEIKRFSNTLFLSSVDRYFTTLLFTKDPQGFEALIDSEI 287
D+K EK + D + + S +++ + +YF T + G + +
Sbjct: 221 AAYSTPDEKYEKYKFDTIADNENLNISSKGGWVAMLQQYFATAWIPHN-DGTNNFYTANL 279

Query: 288 GTKNPLGFISLKNEA-----------DLHGYIGPKDYRSLKAISPMLTDVIEYGLITFFA 336
G N + I K++ + ++GP+ + A++P L ++YG + F +
Sbjct: 280 G--NGIAAIGYKSQPVLVQPGQTGAMNSTLWVGPEIQDKMAAVAPHLDLTVDYGWLWFIS 337

Query: 337 KGVFVLLDYLYQFVGNWGWAIILLTIIVRLILYPLSYKGMVSMQKLKELTPKMKELQEKY 396
+ +F LL +++ FVGNWG++II++T IVR I+YPL+ SM K++ L PK++ ++E+
Sbjct: 338 QPLFKLLKWIHSFVGNWGFSIIIITFIVRGIMYPLTKAQYTSMAKMRMLQPKIQAMRERL 397

Query: 397 KGEPQKLQAHMMQLYKKHGANPLGGCLPLILQIPVFFAIYRVLYNAVELKSSEWILWIHD 456
+ Q++ MM LYK NPLGGC PL++Q+P+F A+Y +L +VEL+ + + LWIHD
Sbjct: 398 GDDKQRISQEMMALYKAEKVNPLGGCFPLLIQMPIFLALYYMLMGSVELRQAPFALWIHD 457

Query: 457 LSIMDPYFILPLLMGASMYWHQSVTPNTMTDPMQAKIFKLLPLLFTIFLITFPAGLVLYW 516
LS DPY+ILP+LMG +M++ Q ++P T+TDPMQ KI +P++FT+F + FP+GLVLY+
Sbjct: 458 LSAQDPYYILPILMGVTMFFIQKMSPTTVTDPMQQKIMTFMPVIFTVFFLWFPSGLVLYY 517

Query: 517 TTNNILSVLQQLIINKVLENKKRAHAQNKKES 548
+N+++++QQ +I + LE K+ H++ KK+S
Sbjct: 518 IVSNLVTIIQQQLIYRGLE-KRGLHSREKKKS 548


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPSNT_07170  DPTHRIATOXIN  31     0.004    Diphtheria toxin signature.
		>DPTHRIATOXIN#Diphtheria toxin signature.

Length = 567

Score = 31.3 bits (70), Expect = 0.004
Identities = 24/76 (31%), Positives = 40/76 (52%), Gaps = 2/76 (2%)

Query: 17 IQASIALNCPIINLQYEVIQTPSKGFLNIGKKEAIILAGVKESV-KAVKEESVKEAHTKE 75
++ S+ + INL ++VI+ +K + K+ I + ES K V EE K+ + +E
Sbjct: 223 VRRSVGSSLSCINLDWDVIRDKTKTKIESLKEHGPIKNKMSESPNKTVSEEKAKQ-YLEE 281

Query: 76 IHQSAEEKKQKLETKT 91
HQ+A E + E KT
Sbjct: 282 FHQTALEHPELSELKT 297


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPSNT_07175  TCRTETOQM     32     0.006    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 31.7 bits (72), Expect = 0.006
Identities = 32/134 (23%), Positives = 53/134 (39%), Gaps = 25/134 (18%)

Query: 216 LSIVGKPNAGKSSLLNAMLLEERA---LVSDIKGTTR-DTIEE-------------VIEL 258
+ ++ +AGK++L ++L A L S KGTTR D +
Sbjct: 6 IGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITSFQW 65

Query: 259 QGHKVRLIDTAGIRESADKIERLGIEKSLKSLENCDIILGVFDLSKPLEKEDFNLIDTLN 318
+ KV +IDT G + ++ R SL L D + + ++ + L L
Sbjct: 66 ENTKVNIIDTPGHMDFLAEVYR-----SLSVL---DGAILLISAKDGVQAQTRILFHALR 117

Query: 319 RAKKPCIVVLNKND 332
+ P I +NK D
Sbjct: 118 KMGIPTIFFINKID 131


14  HPSNT_07260  HPSNT_07325  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
HPSNT_07260  2       11       0.489364   outer membrane protein
HPSNT_07265  2       11       -0.079214  branched-chain amino acid aminotransferase
HPSNT_07270  2       12       -0.668968  outer membrane protein HorJ
HPSNT_07275  2       13       -1.164974  DNA polymerase I
HPSNT_07280  2       18       -0.888511  hypothetical protein
HPSNT_07285  1       18       0.219533   type II restriction modification enzyme
HPSNT_07290  2       18       -0.032013  restriction enzyme BcgI alpha chain-like
HPSNT_07295  3       18       0.795038   hypothetical protein
HPSNT_07300  3       14       0.499444   thymidylate kinase
HPSNT_07305  3       11       0.332796   phosphopantetheine adenylyltransferase
HPSNT_07310  3       12       0.462032   3-octaprenyl-4-hydroxybenzoate carboxy-lyase
HPSNT_07315  3       13       0.341053   hypothetical protein
HPSNT_07320  3       13       0.401414   flagellar basal body P-ring biosynthesis protein
HPSNT_07325  2       12       0.557910   DNA helicase II (uvrD)
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPSNT_07305  LPSBIOSNTHSS  223    4e-78    Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein signature.

Length = 166

Score = 223 bits (570), Expect = 4e-78
Identities = 63/148 (42%), Positives = 94/148 (63%)

Query: 4 IGIYPGTFDPVTNGHIDIIHRSSELFEKLIVAVAHSSAKNPMFSLKERLKMMQLATKSFK 63
IYPG+FDP+T GH+DII R LF+++ VAV + K PMFS++ERL+ + A
Sbjct: 2 NAIYPGSFDPITFGHLDIIERGCRLFDQVYVAVLRNPNKQPMFSVQERLEQIAKAIAHLP 61

Query: 64 NVECVAFEGLLANLAKEYHCKVLVRGLRVVSDFEYELQMGYANKSLNHELETLYFMPTLQ 123
N + +FEGL N A++ ++RGLRV+SDFE ELQM NK+L +LET++ + +
Sbjct: 62 NAQVDSFEGLTVNYARQRQAGAILRGLRVLSDFELELQMANTNKTLASDLETVFLTTSTE 121

Query: 124 NAFISSSIVRSIIAHKGDASHLVPKEIH 151
+F+SSS+V+ + G+ H VP +
Sbjct: 122 YSFLSSSLVKEVARFGGNVEHFVPSHVA 149


15  HPSNT_07475  HPSNT_07705  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
HPSNT_07475  3       12       2.815915   saccharopine dehydrogenase
HPSNT_07480  1       12       2.271729   ferrodoxin-like protein
HPSNT_07485  -1      10       2.008282   putative glycerol-3-phosphate acyltransferase
HPSNT_07490  -1      11       1.005832   dihydroneopterin aldolase
HPSNT_07495  -1      10       0.008198   FrpB-like protein
HPSNT_07500  -2      11       -0.878680  iron-regulated outer membrane protein
HPSNT_07505  0       10       -3.177122  selenocysteine synthase
HPSNT_07510  0       9        -3.275096  transcription elongation factor NusA
HPSNT_07515  -1      9        -3.965090  hypothetical protein
HPSNT_07520  0       8        -3.418225  hypothetical protein
HPSNT_07525  0       8        -2.966149  type IIS restriction-modification protein
HPSNT_07530  1       11       -3.031235  type III R-M system restriction enzyme
HPSNT_07535  1       10       -2.815095  type III R-M system modification enzyme
HPSNT_07540  0       10       -2.393044  type III R-M system modification enzyme
HPSNT_07545  0       12       -1.274777  ATP-dependent DNA helicase RecG
HPSNT_07550  0       14       -0.892042  hypothetical protein
HPSNT_07555  -1      13       -0.813827  hypothetical protein
HPSNT_07560  0       10       -0.132173  exodeoxyribonuclease III
HPSNT_07565  1       11       0.147757   *hypothetical protein
HPSNT_07570  3       14       0.142518   chromosomal replication initiation protein
HPSNT_07575  4       20       -0.132001  purine nucleoside phosphorylase
HPSNT_07580  3       19       -0.657215  hypothetical protein
HPSNT_07585  2       18       -0.516422  glucosamine--fructose-6-phosphate
HPSNT_07590  3       15       2.723012   FAD-dependent thymidylate synthase
HPSNT_07595  4       16       2.994642   hypothetical protein
HPSNT_07600  2       15       2.649298   type I R-M system specificity subunit
HPSNT_07605  3       14       3.414701   Restriction endonuclease S subunits-like
HPSNT_07620  2       15       4.022617   hypothetical protein
HPSNT_07625  2       12       3.966291   iron(III) dicitrate transport protein (fecA)
HPSNT_07630  -1      10       1.764880   hypothetical protein
HPSNT_07635  0       9        0.031290   arginase
HPSNT_07640  0       10       -0.021188  alanine dehydrogenase
HPSNT_07655  1       10       -1.385100  outer membrane protein HorL
HPSNT_07660  3       12       -2.209269  probable inorganic polyphosphate/ATP-NAD kinase
HPSNT_07665  4       10       -2.357304  DNA repair protein
HPSNT_07670  -1      11       -2.422399  fibronectin/fibrinogen-binding protein
HPSNT_07675  1       14       -0.079346  hypothetical protein
HPSNT_07680  1       16       -0.260222  hypothetical protein
HPSNT_07685  -1      15       -1.275950  hypothetical protein
HPSNT_07690  0       14       -1.878227  DNA polymerase III subunit epsilon
HPSNT_07695  0       12       -1.231748  ribulose-phosphate 3-epimerase
HPSNT_07700  1       11       -0.460508  fructose-1,6-bisphosphatase
HPSNT_07705  2       11       0.548559   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPSNT_07570  HTHFIS        36     4e-04    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 35.6 bits (82), Expect = 4e-04
Identities = 11/47 (23%), Positives = 22/47 (46%), Gaps = 6/47 (12%)

Query: 125 TVYEIAKKVAQSDTPPYNPVLFYGGTGLGKTHILNAIGNHALEKRKK 171
+Y + ++ Q+D ++ G +G GK + A+ H KR+
Sbjct: 148 EIYRVLARLMQTDLT----LMITGESGTGKELVARAL--HDYGKRRN 188


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPSNT_07670  FbpA_PF05833  113    2e-29    Fibronectin-binding protein
		>FbpA_PF05833#Fibronectin-binding protein

Length = 577

Score = 113 bits (285), Expect = 2e-29
Identities = 75/401 (18%), Positives = 150/401 (37%), Gaps = 29/401 (7%)

Query: 54 KKPPESVLKNTLALDFCLNKFTKNAKILQANIVDNDRILEITGAKDLAYKSENFILRLEM 113
K P + + N N I + L + ++ ++ +N + L +
Sbjct: 170 KLNPFDFSYDMIENFTKENSLQLNDNIFSKIFTGVSKTL----SSEICFRLKNNSIDLSL 225

Query: 114 IPKKTNLMILDKEKCVIEA--FRFNDRVAKNDILGALPFN-AYEHQEEDLDFKGLLDILE 170
K + + I++ F FN N +G N + + + + +LE
Sbjct: 226 SNLKEIVEVCKDLFKEIQSNKFEFNCYTKNNSFVGFYCLNLMSKEDYKKIQYDSSSKLLE 285

Query: 171 KDFLSYQHKE-LEHKKNQIIKRLNIQKERLKEKLEKLEDPKNLQLEAKELQTQASLLLTY 229
+ + + L+ K + + K + R +K + L + + + LL
Sbjct: 286 NFYYAKDKSDRLKSKSSDLQKIVMNNINRCTKKDKILNNTLKKCEDKDIFKLYGELLTAN 345

Query: 230 QHLINRHESRVVLKDFED---KECAIEIDKSMPLNAFINKKFTLSKKKKQKSQFLYLEEE 286
+ + + S + L ++ I +D++ + + + K K+ + +
Sbjct: 346 IYALKKGLSHIELANYYSENYDTVKITLDENKTPSQNVQSYYKKYNKLKKSEEAANEQLL 405

Query: 287 NLKEKIAFKENQINYVKGAQEESVLE------------MFMPFKNSKIKRPMSGYEVLYY 334
+E++ + + + + A +E F SK + +
Sbjct: 406 QNEEELNYLYSVLTNINNADNYDEIEEIKKELIETGYIKFKKIYKSKKSKTSKPMHFISK 465

Query: 335 KDFKIGLGKNQKENIKL-LQDARANDLWMHVRDIPGSHLIVFCQKNTPKDEVIMELAKML 393
I +GKN +N L L+ A +D+W H ++IPGSH+IV + P + ++E A +
Sbjct: 466 DGIDIYVGKNNIQNDYLTLKFANKHDIWFHTKNIPGSHVIVKNIMDIP-ESTLLEAANLA 524

Query: 394 VKMQKDAFNS-YEIDYTQRKFVKIIKGAN---VIYSKYRTI 430
K +S +DYT+ K VK GA VIYS +TI
Sbjct: 525 AYYSKSQNSSNVPVDYTEVKNVKKPNGAKPGMVIYSTNQTI 565



Score = 34.1 bits (78), Expect = 0.001
Identities = 19/104 (18%), Positives = 51/104 (49%), Gaps = 7/104 (6%)

Query: 36 KEKHAFILDLN--VPYIGLSKKPPESVLKNTLALDFCLNKFTKNAKILQANIVDNDRI-- 91
+ ++ + P I L+ + +K + L K+ NAKI+ + ++ DRI
Sbjct: 43 RLSFKLLISSSSNYPRIHLTDLTKPNPIKAPMFCMV-LRKYISNAKIVDIHQINQDRIVV 101

Query: 92 LEITGAKDLAYKSENFILRLEMIPKKTNLMILDK-EKCVIEAFR 134
++ +L + S + L +E++ + +N+ ++ K + ++++ +
Sbjct: 102 IDFESTDELGFNSI-YSLIIEIMGRHSNMTLIRKRDNIIMDSIK 144


16  HPSNT_00185  HPSNT_00210  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
HPSNT_00185  -3      14       0.572724   comB8 competence protein
HPSNT_00190  -2      13       0.744703   comB9 competence protein
HPSNT_00195  -2      12       1.596548   comB10 competence protein
HPSNT_00200  -2      12       1.311811   mannose-1-phosphate guanyltransferase
HPSNT_00205  -1      12       1.393319   GDP-D-mannose dehydratase
HPSNT_00210  -1      13       1.499680   nodulation protein (nolK)
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPSNT_00185  PF04335       131    8e-40    VirB8 type IV secretion protein
		>PF04335#VirB8 type IV secretion protein

Length = 227

Score = 131 bits (331), Expect = 8e-40
Identities = 37/202 (18%), Positives = 73/202 (36%), Gaps = 4/202 (1%)

Query: 40 QSVFRLERNRLKIAYKLLGLMSFIALVLAIVLISVLPLQKTEHHF--VDFLNQDKHYAII 97
+ K+A+ + G+ +A + + ++ PL+ E + VD + A
Sbjct: 22 RDKLAAAERSKKLAWVVAGVAGALATAGVVAVAALTPLKTVEPYVITVDRNTGEASIAAK 81

Query: 98 QRADKSISGNEALARSLIGAYVLNRESINRIDDKSRYELVRLQSSSKVWQRFEDLIKTQN 157
D +I+ +EA+ + + YV RE + ++ V + S+ R+ KT N
Sbjct: 82 LHGDATITYDEAVRKYFLATYVRYREGWIAAAREEYFDAVMVMSARPEQDRWSRFYKTDN 141

Query: 158 SIYAQSHLEREVHI-VNIAIYQQDNNPIASVSIAAKLMNENKLVYEKRYKIVLSYLFDTP 216
Q+ L + V I +A V + + + + + Y D
Sbjct: 142 PQSPQNILANRTDVFVEIKRVSFLGGNVAQVYFTKESVTGSNST-KTDAVATIKYKVDGT 200

Query: 217 DFDYASMPKNPTGFKITRYSIT 238
KNP G+++ Y
Sbjct: 201 PSKEVDRFKNPLGYQVESYRAD 222


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPSNT_00190  TYPE4SSCAGX   32     0.004    Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein signature.

Length = 522

Score = 31.7 bits (71), Expect = 0.004
Identities = 26/72 (36%), Positives = 37/72 (51%), Gaps = 8/72 (11%)

Query: 193 KEETKEEETITIGDSTNAMKIVKKDIQKGYRALKSSQ--RKWYCLGICSKKSKPSLMPEE 250
KE+ +EE+ I D A+ + Q + ALK + R + K+SK +MP E
Sbjct: 365 KEKIREEKQKIILDQAKAL-----ETQYVHNALKRNPVPRNYNYYQAPEKRSK-HIMPSE 418

Query: 251 IFNDKQFTYFKF 262
IF+D FTYF F
Sbjct: 419 IFDDGTFTYFGF 430


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPSNT_00205  NUCEPIMERASE  88     1e-21    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 87.9 bits (218), Expect = 1e-21
Identities = 46/180 (25%), Positives = 72/180 (40%), Gaps = 19/180 (10%)

Query: 7 LITGVTGQDGSYLAEYLLNLGYEVHGLKRRSSSINTSRIDHLYEDLHSDHQRRFFLHYGD 66
L+TG G G ++++ LL G++V G+ + + S E L F H D
Sbjct: 4 LVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQP---GFQFHKID 60

Query: 67 MTDSSNLIHLIATTKPTEIYNLAAQSHVKVSFETPEYTANADGIGTLRILEAMRILGLEK 126
+ D + L A+ ++ + V+ S E P A+++ G L ILE R ++
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQ- 119

Query: 127 KTRFYQASTSELYGEVLETPQNENTPF-------NPRSPYAVAKMYAFYITKNYREAYNL 179
AS+S +YG N PF +P S YA K + Y Y L
Sbjct: 120 --HLLYASSSSVYGL------NRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYGL 171


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPSNT_00210  NUCEPIMERASE  49     6e-09    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 49.4 bits (118), Expect = 6e-09
Identities = 52/346 (15%), Positives = 106/346 (30%), Gaps = 54/346 (15%)

Query: 5 VLITGAYGMVGQNTALYFKKNKPDV-----------TLLTPKKSELY-----------LL 42
L+TGA G +G + + + V L + EL L
Sbjct: 3 YLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDLA 62

Query: 43 DKDNIQAYLKEYKPTGIIHCAGRVGGIVANMNDLSTYMVENLLMGLYLFSSALDLGVKKA 102
D++ + + R + ++ + Y NL L + ++
Sbjct: 63 DREGMTDLFASGHFERVFISPHR-LAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQHL 121

Query: 103 INLASSCAYPKFAPNPLKESDLLNGSLEPTNEGYALAKLSVMKYCEYVSAEKGVFYKTLV 162
+ +SS Y P D ++ + YA K + S G+ L
Sbjct: 122 LYASSSSVYGLNRKMPFSTDDSVDHPVSL----YAATKKANELMAHTYSHLYGLPATGLR 177

Query: 163 PCNLYGEFDKFEEKIAHMIPGLISRMHTAKLKNEKEFAMWGDGTARREYLNAKDLARFIS 222
+YG + + P + T + K ++ G +R++ D+A I
Sbjct: 178 FFTVYGPWGR---------PDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAII 228

Query: 223 LAYDNIASIPS-----------------VMNVGSGVDYSIEEYYEKVAQVLDYKGAFVKD 265
D I + V N+G+ + +Y + + L +
Sbjct: 229 RLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAKKNML 288

Query: 266 LSKPVGMQQKLMDISK-QKALKWELEIPLEQGIKEAYEYYLKLLEV 310
+P + + D + + + E ++ G+K +Y +V
Sbjct: 289 PLQPGDVLETSADTKALYEVIGFTPETTVKDGVKNFVNWYRDFYKV 334
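
Each hit in this report carries two BLAST-style summary lines ("Score = ... bits (...), Expect = ..." and "Identities = ..., Positives = ..., Gaps = ..."). A minimal sketch of how those two lines could be parsed is given below. It is not part of PredictBias; the function names are hypothetical, and it assumes the line layout seen in this report, including E-values printed without a mantissa (for example "e-126") and summary lines that omit the Gaps field.

```python
import re

# Patterns for the two summary lines that precede each alignment in this report.
SCORE_RE = re.compile(
    r"Score\s*=\s*([\d.]+)\s*bits\s*\((\d+)\),\s*Expect\s*=\s*(\S+)"
)
FRACTION_RE = re.compile(r"(Identities|Positives|Gaps)\s*=\s*(\d+)/(\d+)")


def parse_expect(text):
    """Convert an Expect string to float; 'e-126' is read as 1e-126."""
    return float("1" + text if text.startswith("e") else text)


def parse_hsp_summary(score_line, stats_line):
    """Return bit score, raw score, E-value and match counts for one alignment.

    Hypothetical helper: field names in the returned dict are our own choice.
    """
    match = SCORE_RE.search(score_line)
    if match is None:
        raise ValueError("not a Score/Expect line: %r" % score_line)
    hsp = {
        "bits": float(match.group(1)),
        "raw_score": int(match.group(2)),
        "expect": parse_expect(match.group(3)),
    }
    # Each field is stored as (count, alignment_length).
    for name, count, length in FRACTION_RE.findall(stats_line):
        hsp[name.lower()] = (int(count), int(length))
    # Ungapped alignments omit the Gaps field; record zero gaps explicitly.
    if "gaps" not in hsp and "identities" in hsp:
        hsp["gaps"] = (0, hsp["identities"][1])
    return hsp


if __name__ == "__main__":
    summary = parse_hsp_summary(
        "Score = 87.9 bits (218), Expect = 1e-21",
        "Identities = 46/180 (25%), Positives = 72/180 (40%), Gaps = 19/180 (10%)",
    )
    print(summary)
```

Counts are kept as (count, alignment length) pairs rather than percentages, so any rounding convention can be applied later when comparing hits.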


17  HPSNT_01410  HPSNT_01455  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
HPSNT_01410  -3      12       0.925362   Neutrophil activating protein NapA
HPSNT_01415  -3      12       0.788923   histidine kinase sensor protein
HPSNT_01420  -3      11       1.506149   hypothetical protein
HPSNT_01425  -3      11       2.021652   flagellar basal body P-ring protein
HPSNT_01430  -3      11       1.977163   ATP-dependent RNA helicase
HPSNT_01435  -3      10       1.552871   hypothetical protein
HPSNT_01440  -3      10       1.599683   hypothetical protein
HPSNT_01445  -3      10       1.484685   oligopeptide permease ATPase protein
HPSNT_01450  -2      11       1.595318   oligopeptide permease integral membrane protein
HPSNT_01455  -2      12       1.353177   *Outer membrane protein HopF
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits         Score  E-value  Comments
HPSNT_01410  HELNAPAPROT  148    4e-49    Helicobacter neutrophil-activating protein A family ...
		>HELNAPAPROT#Helicobacter neutrophil-activating protein A family signature.

Length = 153

Score = 148 bits (376), Expect = 4e-49
Identities = 39/140 (27%), Positives = 74/140 (52%), Gaps = 1/140 (0%)

Query: 5 EILKHLQADAIVLFMKVHNFHWNVKGTDFFNVHKATEEIYEEFADMFDDLAERIAQLGHH 64
L ++ +L+ K+H FHW VKG FF +H+ EE+Y+ A+ D +AER+ +G
Sbjct: 15 NSLNTQLSNWFLLYSKLHRFHWYVKGPHFFTLHEKFEELYDHAAETVDTIAERLLAIGGQ 74

Query: 65 PLVTLSEAIKLTRVKEETKTSFHSKDIFKEILGDYKHLEKEFKELSNTAEKEGDKVTVTY 124
P+ T+ E + + + + + ++ + ++ DYK + E K + AE+ D T
Sbjct: 75 PVATVKEYTEHASITDGGNET-SASEMVQALVNDYKQISSESKFVIGLAEENQDNATADL 133

Query: 125 ADDQLAKLQKSIWMLEAHLA 144
+ +++K +WML ++L
Sbjct: 134 FVGLIEEVEKQVWMLSSYLG 153


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
HPSNT_01415  PF06580  30     0.016    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 29.8 bits (67), Expect = 0.016
Identities = 10/71 (14%), Positives = 25/71 (35%), Gaps = 13/71 (18%)

Query: 281 IVLQNFLYNAIDAIEALEESKQ-GQVKIEAFIQNEFIVFTIIDNGKEVENKSALFEPFET 339
+++Q + N I + + Q G++ ++ N + + + G +
Sbjct: 258 MLVQTLVENGI--KHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTK------- 308

Query: 340 TKLKGNGLGLA 350
+ G GL
Sbjct: 309 ---ESTGTGLQ 316


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPSNT_01425  FLGPRINGFLGI  360    e-126    Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 360 bits (926), Expect = e-126
Identities = 119/345 (34%), Positives = 190/345 (55%), Gaps = 26/345 (7%)

Query: 19 AEKIGDIASVVGVRDNQLIGYGLVIGLNGTGDK-SGSKFTMQSISNMLESVNVKISADDI 77
+I DIAS+ RDNQLIGYGLV+GL GTGD S FT QS+ ML+++ +
Sbjct: 28 TSRIKDIASLQAGRDNQLIGYGLVVGLQGTGDSLRSSPFTEQSMRAMLQNLGITTQGGQS 87

Query: 78 KSKNVAAVMITASLPPFARQGDKIDVQISSIGDAKSIQGGTLVMTPLNAVDGNIYALAQG 137
+KN+AAVM+TA+LPPFA G ++DV +SS+GDA S++GG L+MT L+ DG IYA+AQG
Sbjct: 88 NAKNIAAVMVTANLPPFASPGSRVDVTVSSLGDATSLRGGNLIMTSLSGADGQIYAVAQG 147

Query: 138 AITSGN-----------SNNLLSATIINGATIEREVSYDLFHKNAMVLSLKNPNFKNAIQ 186
A+ SA + NGA IERE+ +VL L+NP+F A++
Sbjct: 148 ALIVNGFSAQGDAATLTQGVTTSARVPNGAIIERELPSKFKDSVNLVLQLRNPDFSTAVR 207

Query: 187 VQNTLNKV----FGNKVALALDPKTIQITRPERFSMVEFLALVQEIPINYSAKNKIIVDE 242
V + +N +G+ +A D + I + +P + +A ++ + + K++++E
Sbjct: 208 VADVVNAFARARYGDPIAEPRDSQEIAVQKPRVADLTRLMAEIENLTVETDTPAKVVINE 267

Query: 243 KSGTIVSGVDIIVHPIVVTSQDITLKITKEP--------LNDSKNTQDLDHNMSLDTAHN 294
++GTIV G D+ + + V+ +T+++T+ P Q M++
Sbjct: 268 RTGTIVIGADVRISRVAVSYGTLTVQVTESPQVIQPAPFSRGQTAVQPQTDIMAMQEGSK 327

Query: 295 TLSSNGKNITIAGVVKALQKIGVSAKGMVSILQALKKSGAISAEM 339
G ++ +V L IG+ A G+++ILQ +K +GA+ AE+
Sbjct: 328 VAIVEGPDLR--TLVAGLNSIGLKADGIIAILQGIKSAGALQAEL 370


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits  Score  E-value  Comments
HPSNT_01430  SECA  30     0.024    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 30.2 bits (68), Expect = 0.024
Identities = 16/63 (25%), Positives = 29/63 (46%), Gaps = 2/63 (3%)

Query: 261 IVFTRTKKEADELHQFLASKNYKSTALHGDMDQRDRRTSIIAFKKNDADVLVATDVASRG 320
+V T + ++++ + L K L+ + I+A A V +AT++A RG
Sbjct: 453 LVGTISIEKSELVSNELTKAGIKHNVLNAKFHANEAA--IVAQAGYPAAVTIATNMAGRG 510

Query: 321 LDI 323
DI
Sbjct: 511 TDI 513


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits    Score  E-value  Comments
HPSNT_01445  HTHFIS  32     0.007    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 31.7 bits (72), Expect = 0.007
Identities = 16/50 (32%), Positives = 21/50 (42%), Gaps = 7/50 (14%)

Query: 30 VAIVGESGSGKSSIANIIMCLNPR----FKPHNGGVLFETTNLLKESEEF 75
+ I GESG+GK +A + R F N + L ESE F
Sbjct: 163 LMITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPRD---LIESELF 209


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPSNT_01455  VACCYTOTOXIN  31     0.010    Helicobacter pylori vacuolating cytotoxin signature.
		>VACCYTOTOXIN#Helicobacter pylori vacuolating cytotoxin signature.

Length = 1291

Score = 31.2 bits (70), Expect = 0.010
Identities = 19/74 (25%), Positives = 32/74 (43%), Gaps = 4/74 (5%)

Query: 88 SESLKNNAQQQ----NGPSNQALFNLEQSLEILGKLLDLSQQYANQGVIKPLVVDVGKEQ 143
S + K+N+ Q+ NG S+Q LFN ++E D S Y N GV++
Sbjct: 1170 STNFKSNSNQKVALKNGASSQHLFNASANVEARYYYGDTSYFYMNAGVLQEFANFGSSNA 1229

Query: 144 IGITDSMLLVAQNI 157
+ + + +N
Sbjct: 1230 VSLNTFKVNATRNP 1243


18  HPSNT_01890  HPSNT_01930  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
HPSNT_01890  -3      9        0.776655   flagellar MS-ring protein
HPSNT_01895  -2      10       1.114920   flagellar motor switch protein G
HPSNT_01900  -2      8        1.051690   flagellar assembly protein H
HPSNT_01905  0       10       1.504168   1-deoxy-D-xylulose-5-phosphate synthase
HPSNT_01910  -1      10       -0.295876  GTP-binding protein LepA
HPSNT_01915  0       11       0.231063   hypothetical protein
HPSNT_01920  0       12       0.490745   flagellar basal-body rod protein
HPSNT_01925  1       11       0.168629   alpha-ketoglutarate permease (kgtP)
HPSNT_01930  0       11       -0.035682  DNA segregation ATPase FtsK/SpoIIIE
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPSNT_01890  FLGMRINGFLIF  557    0.0      Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 557 bits (1437), Expect = 0.0
Identities = 177/582 (30%), Positives = 294/582 (50%), Gaps = 66/582 (11%)

Query: 11 VDFFIKLNKKQKIALIAAGVLITALLVFLLLYPFKEKDYAQGGYGVLFEGLDSSDNALIL 70
+++ +L +I LI AG A++V ++L+ K DY LF L D I+
Sbjct: 13 LEWLNRLRANPRIPLIVAGSAAVAIVVAMVLWA-KTPDYR-----TLFSNLSDQDGGAIV 66

Query: 71 QHLQQNQIPYKVSKDD-TILIPKDKVYEERITLASQGIPKTSKVGFEIFDTKDFGATDFD 129
L Q IPY+ + I +P DKV+E R+ LA QG+PK VGFE+ D + FG + F
Sbjct: 67 AQLTQMNIPYRFANGSGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQFS 126

Query: 130 QNIKLIRAIEGELSRTIESLNPILKANVHIAIPKDSVFVAKEVPPSASVMLKIKPDMKLS 189
+ + RA+EGEL+RTIE+L P+ A VH+A+PK S+FV ++ PSASV + ++P L
Sbjct: 127 EQVNYQRALEGELARTIETLGPVKSARVHLAMPKPSLFVREQKSPSASVTVTLEPGRALD 186

Query: 190 PTQILGIKNLIAAAVPKLTIDNVKIVNENGESIGEGDILENSKELALEQLHYKQNFENIL 249
QI + +L+++AV L NV +V+++G + + + + ++L QL + + E+ +
Sbjct: 187 EGQISAVVHLVSSAVAGLPPGNVTLVDQSGHLLTQSNT--SGRDLNDAQLKFANDVESRI 244

Query: 250 ENKIVNILAPIVGGKNKVVARVNAEFDFSQKKSTKETFDPNN-----VVRSEQNLEEKKE 304
+ +I IL+PIVG N V A+V A+ DF+ K+ T+E + PN +RS Q ++
Sbjct: 245 QRRIEAILSPIVGNGN-VHAQVTAQLDFANKEQTEEHYSPNGDASKATLRSRQLNISEQV 303

Query: 305 GASKKQVGGVPGVVSN-IGPVQGLKDNKEPEKYEKSQN---------------------- 341
GA GGVPG +SN P P + +QN
Sbjct: 304 GAGYP--GGVPGALSNQPAPPNEAPIATPPTNQQNAQNTPQTSTSTNSNSAGPRSTQRNE 361

Query: 342 TTNYEVGKTISEIKGEFGTLVRLNAAVVVDGKYKIALKDGANTLEYEPLSDESLQKINAL 401
T+NYEV +TI K G + RL+ AVVV+ K L DG + PL+ + +++I L
Sbjct: 362 TSNYEVDRTIRHTKMNVGDIERLSVAVVVNYK---TLADG----KPLPLTADQMKQIEDL 414

Query: 402 VKQAIGYNQNRGDDVAVSNFEFNPMAPMIDNATLSEKIMHKTQKILGSFTPLIKYVLVFI 461
++A+G++ RGD + V N F+ + T E + Q + +++LV +
Sbjct: 415 TREAMGFSDKRGDTLNVVNSPFSAVDN-----TGGELPFWQQQSFIDQLLAAGRWLLVLV 469

Query: 462 VLFIFYKKVIVPFSERMLEVVPDEDKEVKSMFEEMDEEEDELNKLGDLRKKVEDQLGLNA 521
V +I ++K + P R +E ++ + E + E L+K L+++ +Q
Sbjct: 470 VAWILWRKAVRPQLTRRVEEAKAAQEQAQVRQETEEAVEVRLSKDEQLQQRRANQ----- 524

Query: 522 TFSEEEVRYEIILEKIRGTLKERPDEIATLFKLLIKDEISSD 563
+ E++ ++IR E D + L+I+ +S+D
Sbjct: 525 -----RLGAEVMSQRIR----EMSDNDPRVVALVIRQWMSND 557


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPSNT_01895  FLGMOTORFLIG  351    e-123    Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 351 bits (902), Expect = e-123
Identities = 122/338 (36%), Positives = 209/338 (61%), Gaps = 4/338 (1%)

Query: 8 KQKAQLDELSMSEKIAILLIQVGEDTTGEILRHLDIDSITEISKQIVQLNGTDKQIGAAV 67
K+ + L+ +K AILL+ +G + + ++ ++L + I ++ +I +L ++ V
Sbjct: 7 KEILDVSALTGKQKAAILLVSIGSEISSKVFKYLSQEEIESLTFEIAKLETITSELKDNV 66

Query: 68 LEEFFAIFQSNQYINTGGLEYARELLTRTLGSEEAKKVMDKLTKSLQTQKNFAYLGKIKP 127
L EF + + ++I GG++YARELL ++LG+++A +++ L +LQ+ + F ++ + P
Sbjct: 67 LLEFKELMMAQEFIQKGGIDYARELLEKSLGTQKAVDIINNLGSALQS-RPFEFVRRADP 125

Query: 128 QQLADFIINEHPQTIALILAHMEAPNAAETLSYFPDEMKAEISIRMANLGEISPQVVKRV 187
+ +FI EHPQTIALIL++++ A+ LS P E++ ++ R+A + SP+VV+ V
Sbjct: 126 ANILNFIQQEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEVVREV 185

Query: 188 STVLENKLESLTSYK-IEVGGLRAVAEIFNRLGQKSAKTTLARIESVDNKLAGAIKEMMF 246
VLE KL SL+S GG+ V EI N +K+ K + +E D +LA IK+ MF
Sbjct: 186 ERVLEKKLASLSSEDYTSAGGVDNVVEIINMADRKTEKFIIESLEEEDPELAEEIKKKMF 245

Query: 247 TFEDIVKLDNFAIREILKVADKKDLSLALKTSTKDLTDKFLNNMSSRAAEQFVEEMQYLG 306
FEDIV LD+ +I+ +L+ D ++L+ ALK+ + +K NMS RAA E+M++LG
Sbjct: 246 VFEDIVLLDDRSIQRVLREIDGQELAKALKSVDIPVQEKIFKNMSKRAASMLKEDMEFLG 305

Query: 307 AVKIKDVDVAQRKIIEIVQSLQEKG--VIQTGEEEDVI 342
+ KDV+ +Q+KI+ +++ L+E+G VI G EEDV+
Sbjct: 306 PTRRKDVEESQQKIVSLIRKLEEQGEIVISRGGEEDVL 343



Score = 30.2 bits (68), Expect = 0.010
Identities = 20/102 (19%), Positives = 41/102 (40%), Gaps = 3/102 (2%)

Query: 4 KLTPKQKAQLDELSMSEKIAILLIQVGEDTTGEILRHLDIDSITEISKQIVQLNGTDKQI 63
+ P + + IA++L + IL L + T ++++I ++ T ++
Sbjct: 122 RADPANILNFIQQEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEV 181

Query: 64 GAA---VLEEFFAIFQSNQYINTGGLEYARELLTRTLGSEEA 102
VLE+ A S Y + GG++ E++ E
Sbjct: 182 VREVERVLEKKLASLSSEDYTSAGGVDNVVEIINMADRKTEK 223


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
HPSNT_01900  FLGFLIH  36     5e-05    Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 36.3 bits (83), Expect = 5e-05
Identities = 44/207 (21%), Positives = 91/207 (43%), Gaps = 14/207 (6%)

Query: 50 PLEKKAIENDLIDCLLKKTDELSSHLVKLQMQFEKAQEES-KALIENAKNDGYKIGFKEG 108
E I + + L L +LQMQ A E+ +A I + G+K G++EG
Sbjct: 19 QAEFVPIVEPEETIIEEAEPSLEQQLAQLQMQ---AHEQGYQAGIAEGRQQGHKQGYQEG 75

Query: 109 EEKMRNELTHSVNEEKNQLLHAITALDEKMKKSEDHLMALE----KELSAIAIDIAKEVI 164
+ L + E K+Q + + + + + L AL+ L +A++ A++VI
Sbjct: 76 ---LAQGLEQGLAEAKSQQAPIHARMQQLVSEFQTTLDALDSVIASRLMQMALEAARQVI 132

Query: 165 LKEVEDNSQKVALALAEELLKNVLDATDIHLKVNPLDYPYLNERLQNASKI---KLESNE 221
+ ++ + + + L + L + L+V+P D +++ L + +L +
Sbjct: 133 GQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGATLSLHGWRLRGDP 192

Query: 222 AISKGGVMITSSNGSLDGNLMERFKTL 248
+ GG +++ G LD ++ R++ L
Sbjct: 193 TLHPGGCKVSADEGDLDASVATRWQEL 219


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits       Score  E-value  Comments
HPSNT_01910  TCRTETOQM  114    8e-29    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 114 bits (287), Expect = 8e-29
Identities = 55/162 (33%), Positives = 89/162 (54%), Gaps = 7/162 (4%)

Query: 3 NIRNFSIIAHIDHGKSTLADCLIAECNAIS---NREMTSQVMDTMDIEKERGITIKAQSV 59
I N ++AH+D GK+TL + L+ AI+ + + + D +E++RGITI+
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 60 RLNYTFKGENYILNLIDTPGHVDFSYEVSRSLCSCEGALLVVDATQGVEAQTIANTYIAL 119
+F+ EN +N+IDTPGH+DF EV RSL +GA+L++ A GV+AQT +
Sbjct: 62 ----SFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALR 117

Query: 120 DNNLEILPVINKIDLPNANVLEVKQDIEDTIGIDCSNANEVS 161
+ + INKID ++ V QDI++ + + +V
Sbjct: 118 KMGIPTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVE 159



Score = 83.4 bits (206), Expect = 7e-19
Identities = 50/215 (23%), Positives = 90/215 (41%), Gaps = 17/215 (7%)

Query: 161 SAKAKLGIKDLLEKIITTIPAPSGDFNAPLKALIYDSWFDNYLGALALVRIMDGSINTEQ 220
SAK +GI +L+E I + + + L ++ + LA +R+ G ++
Sbjct: 220 SAKNNIGIDNLIEVITNKFYSSTHRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRD 279

Query: 221 EILVMGTGKKHGVLGLYYPNPLKKIPTKSLECGEIGIV---SLGLKSVTDIAVGDTLTDA 277
+ + K + +Y + GEI I+ L L SV +GDT
Sbjct: 280 SVRISEKEKI-KITEMYTSINGELCKIDKAYSGEIVILQNEFLKLNSV----LGDTKLL- 333

Query: 278 KNPTPKPIEGFMPAKPFVFAGLYPIETDRFEDLREALLKLQLNDCALNFEPESSVALGFG 337
P + IE P + + P + + E L +ALL++ +D L + +S+
Sbjct: 334 --PQRERIEN---PLPLLQTTVEPSKPQQREMLLDALLEISDSDPLLRYYVDSATH---E 385

Query: 338 FRVGFLGLLHMEVIKERLEREFGLNLIATAPTVVY 372
+ FLG + MEV L+ ++ + + PTV+Y
Sbjct: 386 IILSFLGKVQMEVTCALLQEKYHVEIEIKEPTVIY 420



Score = 31.0 bits (70), Expect = 0.015
Identities = 15/75 (20%), Positives = 28/75 (37%), Gaps = 2/75 (2%)

Query: 399 IKEPFVRATIITPSEFLGNLMQLLNNKRGIQEKMEYLNQSRVMLTYSLPSNEIVMDFYDK 458
+ EP++ I P E+L + L + V+L+ +P+ I ++
Sbjct: 535 LLEPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQ-LKNNEVILSGEIPARCI-QEYRSD 592

Query: 459 LKSCTKGYASFDYEP 473
L T G + E
Sbjct: 593 LTFFTNGRSVCLTEL 607


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits        Score  E-value  Comments
HPSNT_01920  FLGHOOKAP1  29     0.021    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 29.2 bits (65), Expect = 0.021
Identities = 8/40 (20%), Positives = 16/40 (40%)

Query: 3 NGYYAATGAMATQFNRLDLTSNNLANLNTNGFKRDDAVTG 42
+ A + L+ SNN+++ N G+ R +
Sbjct: 2 SLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMA 41


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
HPSNT_01925  TCRTETB  39     2e-05    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 39.5 bits (92), Expect = 2e-05
Identities = 42/182 (23%), Positives = 71/182 (39%), Gaps = 33/182 (18%)

Query: 35 APYFAKEFTHTNDPTLALISAFLVFMLGFFMRPLGSLFFGKLGDKKGRKTSMVYSIILMA 94
P A +F T + +AF++ G+ +GKL D+ G K +++ II+
Sbjct: 37 LPDIANDFNKPPASTNWVNTAFMLTFSI------GTAVYGKLSDQLGIKRLLLFGIIINC 90

Query: 95 LGSFLLALLPTKEIVGEWAFLFLLLARLLQGFSVGGE------YGVVATYLSELGKNGKK 148
GS + VG F L++AR +QG G VVA Y+ + +
Sbjct: 91 FGSVIGF-------VGHSFFSLLIMARFIQG--AGAAAFPALVMVVVARYIPKENRGKAF 141

Query: 149 GFYGSFQYVTLVGGQLLAIFSLFIVENIYTHEQISAFAWRYLFALGGILALLSLFLRNIM 208
G GS + +G + I I+ W YL + I + FL ++
Sbjct: 142 GLIGS---IVAMGEGVGPAIGGMIAHYIH---------WSYLLLIPMITIITVPFLMKLL 189

Query: 209 EE 210
++
Sbjct: 190 KK 191


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
HPSNT_01930  PF06580  30     0.042    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 29.8 bits (67), Expect = 0.042
Identities = 20/144 (13%), Positives = 44/144 (30%), Gaps = 3/144 (2%)

Query: 17 AFLTLSSWLGNSGLVGRFGVWFAAINKKYFGYLSLINLPYLAWALFLLYRTKNPFTEIVL 76
+ + FG + K + I + + L YR+ +
Sbjct: 11 YYWYCQGIGWGVYTLTGFGFASLYGSPKLHSMIFNIAISLMGLVLTHAYRSFIKRQGWL- 69

Query: 77 EKTLGHLLGILSLLFLQSSLLNQGEIGNSTRLFLRPFIGDFGFYALIMLMVVISYLILFK 136
L IL +L + + N++ L FI + L + I + ++
Sbjct: 70 --KLNMGQIILRVLPACVVIGMVWFVANTSIWRLLAFINTKPVAFTLPLALSIIFNVVVV 127

Query: 137 LPPKSVFYPYINKTQSLLKEIYEQ 160
S+ Y + ++ + +Q
Sbjct: 128 TFMWSLLYFGWHFFKNYKQAEIDQ 151


19  HPSNT_02650  HPSNT_02730  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
HPSNT_02650  -2      9        -1.392873  ATP-dependent protease subunit HslV
HPSNT_02655  0       11       -1.983817  ATP-dependent protease ATP-binding subunit HslU
HPSNT_02660  1       14       -2.519982  GTPase Era
HPSNT_02665  5       15       -2.357399  conserved hypothetical secreted protein
HPSNT_02680  8       19       -2.560548  cag pathogenicity island protein (cag1)
HPSNT_02685  8       18       -2.764414  cag pathogenicity island protein Epsilon
HPSNT_02690  9       15       -2.215036  cag pathogenicity island protein (cag3)
HPSNT_02695  9       17       -2.621447  cag pathogenicity island protein Gamma
HPSNT_02700  8       18       -2.548661  CAG pathogenicity island protein 5
HPSNT_02705  9       20       -2.964380  cag pathogenicity island protein alpha
HPSNT_02710  8       20       -3.214069  cag pathogenicity island protein Z
HPSNT_02715  9       20       -3.044277  cag pathogenicity island protein (cag7)
HPSNT_02720  10      27       -4.311842  cag pathogenicity island protein X
HPSNT_02725  10      28       -4.552016  cag pathogenicity island protein W
HPSNT_02730  13      29       -5.845205  cag island protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
HPSNT_02650  PF07520  29     0.015    Virulence protein SrfB
		>PF07520#Virulence protein SrfB

Length = 1041

Score = 28.8 bits (64), Expect = 0.015
Identities = 14/49 (28%), Positives = 23/49 (46%), Gaps = 4/49 (8%)

Query: 121 LEAEDNKIAAIGSGG---NYALSAARALDNFAHLEPRKLVEESLKIAGD 166
E+ ++A I GG + ++ R DN L P + E ++AGD
Sbjct: 590 GESPSLRLACIDVGGGTTDLMVTTYRGEDNRV-LHPEQTFREGFRVAGD 637


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits    Score  E-value  Comments
HPSNT_02655  HTHFIS  29     0.045    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 29.0 bits (65), Expect = 0.045
Identities = 11/36 (30%), Positives = 18/36 (50%), Gaps = 3/36 (8%)

Query: 48 TPKNILMIGSTGVGKTEIARRI---AKIMELPFVKV 80
T +++ G +G GK +AR + K PFV +
Sbjct: 159 TDLTLMITGESGTGKELVARALHDYGKRRNGPFVAI 194


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
HPSNT_02660  PF03944  30     0.013    delta endotoxin
		>PF03944#delta endotoxin

Length = 633

Score = 30.0 bits (67), Expect = 0.013
Identities = 24/94 (25%), Positives = 47/94 (50%), Gaps = 3/94 (3%)

Query: 68 LHHQEKLLNQCMLSQALKAMGDAELRVFLASVHDDLKGYEEFLSLCQKPHILALSKIDTA 127
L E+ LNQ + + + A +AEL A+V + + + FL+ + L+++
Sbjct: 94 LRETERFLNQRLNTDTV-ARVNAELTGLQANVEEFNRQVDNFLNPNRNAVPLSITSSVNT 152

Query: 128 THKQVLQKIQEYQQYNSQFLALVPLSAKKSQNLN 161
+ L ++ ++Q Q L L+PL A+ + NL+
Sbjct: 153 MQQLFLNRLPQFQMQGYQLL-LLPLFAQAA-NLH 184


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits         Score  E-value  Comments
HPSNT_02695  TACYTOLYSIN  28     0.027    Bacterial thiol-activated pore-forming cytolysin sig...
		>TACYTOLYSIN#Bacterial thiol-activated pore-forming cytolysin signature.

Length = 574

Score = 27.6 bits (61), Expect = 0.027
Identities = 12/43 (27%), Positives = 22/43 (51%), Gaps = 3/43 (6%)

Query: 128 NKSVYQLVEMAIGAYNGG-MKHDPNGAYVKKFRCIYSQVRYNE 169
N+S Y VE Y G + GAYV ++ ++ ++ Y++
Sbjct: 451 NRSEY--VETTSTEYTSGKINLSHQGAYVAQYEILWDEINYDD 491


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits         Score  E-value  Comments
HPSNT_02715  IGASERPTASE  40     9e-05    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 40.0 bits (93), Expect = 9e-05
Identities = 32/166 (19%), Positives = 68/166 (40%), Gaps = 3/166 (1%)

Query: 701 AKESVKAYLDCVSQAKTEAEKKECEKLLTPEARKLLEEAKESVKAYKDCVSRARNEKEKK 760
++ + + ++KT + ++ T + R++ +EAK +VKA A++ E K
Sbjct: 1034 SETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETK 1093

Query: 761 ECEKLLTPEARKLLEEE-AKESVKAYLDCVSQAKTEAEKKECEKLLTPEAKKKLEEAKKS 819
E + T E + +EE AK + + + K+E + + P+A+ E
Sbjct: 1094 ETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTV 1153

Query: 820 AKAYLDCVSQAKTEAEKKECEKLLTPEARKLLEQQALDCLKNAKTE 865
+ SQ T A+ ++ K + + + + N+ E
Sbjct: 1154 NIK--EPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVE 1197



Score = 37.0 bits (85), Expect = 0.001
Identities = 29/194 (14%), Positives = 68/194 (35%), Gaps = 31/194 (15%)

Query: 767 TPEARKLLEEEAKESVKAYLDCVSQAKTEAEKKECEKLLTPEAKKKLEEAKKSAKAYLDC 826
P ++ + + ++KT + ++ T + ++ +EAK + KA
Sbjct: 1023 APVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQT 1082

Query: 827 VSQAKTEAEKKECEKLLTPEARKLLEQQALDCLKNAKTEAEKKRCVKDLPKDLQKKVLAK 886
A++ +E KE + T E + +++ AK E EK +
Sbjct: 1083 NEVAQSGSETKETQTTETKETATVEKEE------KAKVETEKT---------------QE 1121

Query: 887 ESLKAYKDCVSRARNEKEKKECEKLLTPEAKKLLEEAKKSVKAYLDCVSQAKTEAEKKEC 946
+ + ++E + + E + ++E + SQ T A+ ++
Sbjct: 1122 VPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQ----------SQTNTTADTEQP 1171

Query: 947 EKLLTPEARKLLEE 960
K + + + E
Sbjct: 1172 AKETSSNVEQPVTE 1185



Score = 36.6 bits (84), Expect = 0.001
Identities = 41/265 (15%), Positives = 87/265 (32%), Gaps = 25/265 (9%)

Query: 898 RARNEKEKKECEKLLTPEAKKLLEEAKKSVKAYLDCVSQAKTEAEKKECEKLLTPEARKL 957
+ NE+ + E + P A ++ + + ++KT + ++ T + R++
Sbjct: 1011 PSNNEEIARVDEAPVPPPAPAT--PSETTETVAENSKQESKTVEKNEQDATETTAQNREV 1068

Query: 958 LEEAKESLKAYKDCVSRARNEKEKKECEKLLTPEAKKLLEQQALDCLKNAKTDAEKKRCA 1017
+EAK ++KA A++ E KE + T E + +++ AK + EK +
Sbjct: 1069 AKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEE------KAKVETEKTQEV 1122

Query: 1018 KDLPKDLQKKVLAKESVKAYLDCVSRARNEKERKACEKLLTPEAKKLLEEAKESLKAYKD 1077
KV ++ S K + ++E + E + ++E + D
Sbjct: 1123 --------PKVTSQVSPK-------QEQSETVQPQAEPARENDPTVNIKEPQSQTNTTAD 1167

Query: 1078 CLSQARNEEERRACEKLLTPEARKLLEQEVKNSVKAYLDCVSRARNEKEKQECEKLLTPE 1137
Q E + + V+N N E K
Sbjct: 1168 -TEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNS-ESSNKPKNRHRR 1225

Query: 1138 ARKFLAKQVLSCLEKAKNEEERKAC 1162
+ + + V + + C
Sbjct: 1226 SVRSVPHNVEPATTSSNDRSTVALC 1250



Score = 34.7 bits (79), Expect = 0.005
Identities = 28/187 (14%), Positives = 69/187 (36%), Gaps = 4/187 (2%)

Query: 499 KARNEKEKKECEKLLTPEAKKKLEQQVLDCLKNAKTDEERKKCLKDLPKD--LQSDILAK 556
+ NE+ + E + P A + +N+K + + + + + Q+ +AK
Sbjct: 1011 PSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAK 1070

Query: 557 ESLKAYKDCVSQAKTEAEKKECEKLLTPEAKKLLEEEAKESVKAYLDCVSQAKNEAEKKE 616
E+ K + E ++ T E K+ E +E K + + +
Sbjct: 1071 EAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVS 1130

Query: 617 CEKLLTPEAKKKLEEAKKSVKAYL--DCVSQARTEAEKKECEKLLTPEAKKLLEQQALDC 674
++ + + + E A+++ + SQ T A+ ++ K + ++ + +
Sbjct: 1131 PKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVN 1190

Query: 675 LKNAKTE 681
N+ E
Sbjct: 1191 TGNSVVE 1197



Score = 33.1 bits (75), Expect = 0.015
Identities = 38/217 (17%), Positives = 85/217 (39%), Gaps = 7/217 (3%)

Query: 555 AKESLKAYKDCVSQAKTEAEKKECEKLLTPEAKKLLEEEAKESVKAYLDC--VSQAKNEA 612
++ + ++ ++KT EK E + T + + +EAK +VKA V+Q+ +E
Sbjct: 1034 SETTETVAENSKQESKTV-EKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSET 1092

Query: 613 EKKECEKLLTPEAKKKLEEAKKSVKAYLDCVSQARTEAEKKECEKLLTPEAKKLLEQQAL 672
++ + + +K E+AK + + + K+E + + P+A+ E
Sbjct: 1093 KETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPT 1152

Query: 673 DCLKNAK----TEADKKRCVKDLPKDLQKKVLAKESVKAYLDCVSQAKTEAEKKECEKLL 728
+K + T AD ++ K+ ++++ V +V V + +
Sbjct: 1153 VNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVN 1212

Query: 729 TPEARKLLEEAKESVKAYKDCVSRARNEKEKKECEKL 765
+ + K + SV++ V A + L
Sbjct: 1213 SESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVAL 1249



Score = 32.0 bits (72), Expect = 0.030
Identities = 32/243 (13%), Positives = 77/243 (31%), Gaps = 11/243 (4%)

Query: 607 QAKNEAEKKECEKLLTP-------EAKKKLEEAKKSVKAYLDCVSQARTEAEKKECEKLL 659
+ NE + E + P E + + E K ++ Q TE + E
Sbjct: 1011 PSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAK 1070

Query: 660 TPEAKKLLEQQALDCLKNAKTEADKKRCVKDLPKDLQKKVLAKESVKAYLDCVSQAKTEA 719
++ Q + ++ + + ++K+ AK + + +
Sbjct: 1071 EAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVS 1130

Query: 720 EKKECEKLLTPEARKLLEEAKESVKAYKDCVSRARNEKEKKECEKLLTPEARKLLEEEAK 779
K+E + + P+A E + K+ S+ + ++ K + + + E
Sbjct: 1131 PKQEQSETVQPQAEPAREN--DPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTT 1188

Query: 780 ESVKAYLDCVSQAKTEAEKKECEKLLTPEAKKKLEEAKKSAKAYLDCVSQAKTEAEKKEC 839
+ + + T A + + K ++S ++ V A T + +
Sbjct: 1189 VNTGNSVVENPENTTPATTQPTVNSESSNKPK--NRHRRSVRSVPHNVEPATTSSNDRST 1246

Query: 840 EKL 842
L
Sbjct: 1247 VAL 1249


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits         Score  E-value  Comments
HPSNT_02720  TYPE4SSCAGX  867    0.0      Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein signature.

Length = 522

Score = 867 bits (2242), Expect = 0.0
Identities = 512/522 (98%), Positives = 516/522 (98%), Gaps = 1/522 (0%)

Query: 1 MGQAFFKKIVGCFCLGYLFLSSVIEAAP-DIKNFNRGRVKVVNKKIAYLGDEKPITIWTS 59
MGQAFFKKIVGCFCLGYLFLSS IEA DIKNFNRGRVKVVNKKIAYLGDEKPITIWTS
Sbjct: 1 MGQAFFKKIVGCFCLGYLFLSSAIEAVALDIKNFNRGRVKVVNKKIAYLGDEKPITIWTS 60

Query: 60 LDNVTVIQLEKDETISYITTGFNKGWSIVPNSNHIFIQPKSVKSNLMFEKEAVNFALMTR 119
LDNVTVIQLEKDETISYITTGFNKGWSIVPNSNHIFIQPKSVKSNLMFEKEAVNFALMTR
Sbjct: 61 LDNVTVIQLEKDETISYITTGFNKGWSIVPNSNHIFIQPKSVKSNLMFEKEAVNFALMTR 120

Query: 120 DYQEFLKTKKLIVDAPDPKELEEQKKALEKEKEAKEQAQKAQKDKKEKRKEERAKNRANL 179
DYQEFLKTKKLIVDAPDPKELEEQKKALEKEKEAKEQAQKAQKDK+EKRKEERAKNRANL
Sbjct: 121 DYQEFLKTKKLIVDAPDPKELEEQKKALEKEKEAKEQAQKAQKDKREKRKEERAKNRANL 180

Query: 180 ENLTNAMSNPQNLSNNKNLSELIKQQRENELDQMERLEDMQEQAQANALKQIEELNKKQA 239
ENLTNAMSNPQNLSNNKNLSELIKQQRENELDQMERLEDMQEQAQANALKQIEELNKKQA
Sbjct: 181 ENLTNAMSNPQNLSNNKNLSELIKQQRENELDQMERLEDMQEQAQANALKQIEELNKKQA 240

Query: 240 EETIKQRAKDKISIKTDKPQKSPEDNSIELSPSDSAWRTNLVVRTNKALYQFILRIAQKD 299
EE ++QRAKDKISIKTDK QKSPEDNSIELSPSDSAWRTNLVVRTNKALYQFILRIAQKD
Sbjct: 241 EEAVRQRAKDKISIKTDKSQKSPEDNSIELSPSDSAWRTNLVVRTNKALYQFILRIAQKD 300

Query: 300 NFASAYLTVKLEYPQRHEVSSVIEEELKKREEAKRQRELIKQENLNTTAYINRVMMASNE 359
NFASAYLTVKLEYPQRHEVSSVIEEELKKREEAKRQRELIKQENLNTTAYINRVMMASNE
Sbjct: 301 NFASAYLTVKLEYPQRHEVSSVIEEELKKREEAKRQRELIKQENLNTTAYINRVMMASNE 360

Query: 360 QIINKEKIREEKQKIILDQAKALETQYVHNALKRNPVPRNYNYYQAPEKRSKHIMPSEIF 419
QIINKEKIREEKQKIILDQAKALETQYVHNALKRNPVPRNYNYYQAPEKRSKHIMPSEIF
Sbjct: 361 QIINKEKIREEKQKIILDQAKALETQYVHNALKRNPVPRNYNYYQAPEKRSKHIMPSEIF 420

Query: 420 DDGTFTYFGFKNITLQPAIFVVQPDGKLSMTDAAIDPNMTNSGLRWYRVNEIAEKFKLIK 479
DDGTFTYFGFKNITLQPAIFVVQPDGKLSMTDAAIDPNMTNSGLRWYRVNEIAEKFKLIK
Sbjct: 421 DDGTFTYFGFKNITLQPAIFVVQPDGKLSMTDAAIDPNMTNSGLRWYRVNEIAEKFKLIK 480

Query: 480 DKALVTVINKGYGKNPLTRNYNIKNYGELERVIKKLPLVRDK 521
DKALVTVINKGYGKNPLT+NYNIKNYGELERVIKKLPLVRDK
Sbjct: 481 DKALVTVINKGYGKNPLTKNYNIKNYGELERVIKKLPLVRDK 522


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
HPSNT_02730  PF04335  119    3e-35    VirB8 type IV secretion protein
		>PF04335#VirB8 type IV secretion protein

Length = 227

Score = 119 bits (300), Expect = 3e-35
Identities = 43/205 (20%), Positives = 73/205 (35%), Gaps = 10/205 (4%)

Query: 27 KLNKANRTFKRAFYL---SMVLNIAAVTSIVMMMPLKKTDIFVYGIDRYTGEFKIVKRSD 83
KL A R+ K A+ + + L A V ++ + PLK + +V +DR TGE I +
Sbjct: 24 KLAAAERSKKLAWVVAGVAGALATAGVVAVAALTPLKTVEPYVITVDRNTGEASIAAKLH 83

Query: 84 A-RQIVNSEAVVDSATSKFVSLLFGYSKNSLRDRKDQLMQYCDVSFQTQAMRMFNENIRQ 142
I EAV + +V G+ + + D +M Q + R + + Q
Sbjct: 84 GDATITYDEAVRKYFLATYVRYREGWIAAAREEYFDAVMVMSARPEQDRWSRFYKTDNPQ 143

Query: 143 FVDKVRA-EAIISSNIQREKVKNSPLTRLTFFITIKITPDTMENYEYITKKQVTIYYDFA 201
+ A + I + +F +T T TI Y
Sbjct: 144 SPQNILANRTDVFVEI-KRVSFLGGNVAQVYFTKESVTGSNS----TKTDAVATIKYKVD 198

Query: 202 RGNSSQENLIINPFGFKVFDIQITD 226
S + + NP G++V +
Sbjct: 199 GTPSKEVDRFKNPLGYQVESYRADV 223


20  HPSNT_02975  HPSNT_03005  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
HPSNT_02975  2       14       -1.235431  hypothetical protein
HPSNT_02980  2       14       -0.613806  hypothetical protein
HPSNT_02985  1       16       -0.671668  dihydroorotase
HPSNT_02990  0       16       -2.838652  hypothetical protein
HPSNT_02995  -1      14       -3.295313  hypothetical protein
HPSNT_03000  -1      14       -2.697212  flagellar motor switch protein
HPSNT_03005  0       13       -1.433843  endonuclease III
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPSNT_02975  TYPE3IMSPROT  31     0.003    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 30.9 bits (70), Expect = 0.003
Identities = 18/64 (28%), Positives = 30/64 (46%), Gaps = 4/64 (6%)

Query: 88 LQSYSVMLFFNLLLLTDILGFLPFSIYHHFMASLIFSALFCSSLFLSSPLLGMIALVALS 147
L Y F L+L+ +LPFS S + + +L PLL + AL+A++
Sbjct: 45 LSDYYFEHFSKLMLIPAEQSYLPFSQ----ALSYVVDNVLLEFFYLCFPLLTVAALMAIA 100

Query: 148 SSLL 151
S ++
Sbjct: 101 SHVV 104


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits         Score  E-value  Comments
HPSNT_02990  TONBPROTEIN  51     1e-09    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 51.1 bits (122), Expect = 1e-09
Identities = 23/52 (44%), Positives = 27/52 (51%), Gaps = 1/52 (1%)

Query: 91 PQKPPTPPTPPIPPTPP-KPIEKPKPEPKPKPKPKPEPKKPNHKHKALKKVE 141
P P P PIP P P+ KP+PKPKPKPKP K + +K VE
Sbjct: 67 PVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVE 118



Score = 46.9 bits (111), Expect = 3e-08
Identities = 23/63 (36%), Positives = 30/63 (47%), Gaps = 1/63 (1%)

Query: 84 PKPTLAGPQKPPTPPTPPIPPTPPKPIEKPKPEPKPKPKPKPEPK-KPNHKHKALKKVEK 142
P + P +P P P P P P E P KPKPKPKP+PK + + + V+
Sbjct: 57 PPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKP 116

Query: 143 VEE 145
VE
Sbjct: 117 VES 119



Score = 43.1 bits (101), Expect = 5e-07
Identities = 19/60 (31%), Positives = 25/60 (41%)

Query: 91 PQKPPTPPTPPIPPTPPKPIEKPKPEPKPKPKPKPEPKKPNHKHKALKKVEKVEEKKVVE 150
+PP P P P E PK P KPKP+PK K +++ K + K V
Sbjct: 60 AVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVES 119



Score = 40.0 bits (93), Expect = 6e-06
Identities = 17/54 (31%), Positives = 22/54 (40%)

Query: 76 PSKNTPGAPKPTLAGPQKPPTPPTPPIPPTPPKPIEKPKPEPKPKPKPKPEPKK 129
P P+P +PP I PKP KPKP K + +PK + K
Sbjct: 63 PPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKP 116



Score = 39.2 bits (91), Expect = 9e-06
Identities = 16/52 (30%), Positives = 20/52 (38%), Gaps = 3/52 (5%)

Query: 81 PGAPKPTLAGPQKPPTPPTPPIPPTP---PKPIEKPKPEPKPKPKPKPEPKK 129
P + P P P PP KPKP+PKPKP K + +
Sbjct: 59 QAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQP 110



Score = 36.9 bits (85), Expect = 6e-05
Identities = 44/220 (20%), Positives = 79/220 (35%), Gaps = 40/220 (18%)

Query: 98 PTPPIPPTPPKPIEKPKPEPKPKPKPKPEPKKPNHKHKALKKVEKVEEKKVVEEKKEEKK 157
P PP +P +P EP+P+P+P PEP K VV EK + K
Sbjct: 52 PADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKE---------------APVVIEKPKPKP 96

Query: 158 IVEQKVEQKVEQKKVEEKKPVKKEFDPNQLSFLPKEVAPPRQENNKGLDNQTRRDIDELY 217
+ K +KV+++ + KPV E P N T
Sbjct: 97 KPKPKPVKKVQEQPKRDVKPV--------------ESRPASPFENTAPARLTSSTATAAT 142

Query: 218 GEEFGDLGTAEKDFIRNNLRDIGRITQKYLEYPQVAAYLGQDGTNAVEFYLHPNGDITDL 277
+ + + + RN + YP A L +G V+F + P+G + ++
Sbjct: 143 SKPVTSVASGPRALSRNQPQ-----------YPARAQALRIEGQVKVKFDVTPDGRVDNV 191

Query: 278 KIIIGSEYKMLDDNTLKTIQIAYKDYPRPKTKTLIRIRVR 317
+I+ M + ++ + +P + ++ I +
Sbjct: 192 QILSAKPANMFEREVKNAMRRWRYEPGKPGSGIVVNILFK 231



Score = 33.8 bits (77), Expect = 5e-04
Identities = 16/57 (28%), Positives = 23/57 (40%)

Query: 74 QDPSKNTPGAPKPTLAGPQKPPTPPTPPIPPTPPKPIEKPKPEPKPKPKPKPEPKKP 130
+P P+P P++ P P P PKP K + +PK KP +P
Sbjct: 65 PEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVESRP 121



Score = 31.9 bits (72), Expect = 0.003
Identities = 13/53 (24%), Positives = 17/53 (32%)

Query: 75 DPSKNTPGAPKPTLAGPQKPPTPPTPPIPPTPPKPIEKPKPEPKPKPKPKPEP 127
+P P + P P P P K E+PK + KP P
Sbjct: 72 EPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVESRPASP 124


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPSNT_03000  FLGMOTORFLIN  99     2e-30    Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 99 bits (249), Expect = 2e-30
Identities = 25/77 (32%), Positives = 47/77 (61%)

Query: 34 LICDYKNLLDMEIVFSAELGSTQIPLLQILRFEKGSVIDLQKPAGESVDTFVNGRVIGKG 93
+ D ++D+ + + ELG T++ + ++LR +GSV+ L AGE +D +NG +I +G
Sbjct: 50 AMQDIDLIMDIPVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQG 109

Query: 94 EVMVFERNLAIRLNEIL 110
EV+V +R+ +I+
Sbjct: 110 EVVVVADKYGVRITDII 126


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits        Score  E-value  Comments
HPSNT_03005  OMS28PORIN  29     0.015    OMS28 porin signature.
		>OMS28PORIN#OMS28 porin signature.

Length = 257

Score = 28.6 bits (63), Expect = 0.015
Identities = 21/77 (27%), Positives = 41/77 (53%), Gaps = 8/77 (10%)

Query: 60 LFEKYPSVNDLAL-----ASLEEVKEIIQSVSYFNNKSKHLISMAQKVVRDFNGVIPSRQ 114
+ K P+ +L L A +E+VKE + + +++ + AQKV+ NG+ PS +
Sbjct: 164 MLNKSPNNKELELTKEEFAKVEQVKETLMASERALDET---VQEAQKVLNMVNGLNPSNK 220

Query: 115 KELMSLNGVGQKTANVV 131
++++ V + +NVV
Sbjct: 221 DQVLAKKDVAKAISNVV 237


21  HPSNT_03100  HPSNT_03160  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
HPSNT_03100  -3      12       0.517953   flagellin A
HPSNT_03105  -3      11       0.566332   3-methyladenine DNA glycosylase
HPSNT_03110  -2      12       1.077141   hypothetical protein
HPSNT_03115  1       9        0.586192   uroporphyrinogen decarboxylase
HPSNT_03120  1       9        0.160181   outer-membrane protein of the hefABC efflux
HPSNT_03125  2       7        -0.043367  membrane fusion protein of the hefABC efflux
HPSNT_03130  2       7        -0.192508  cytoplasmic pump protein of the HefABC efflux
HPSNT_03135  2       9        -0.948144  hypothetical protein
HPSNT_03140  1       10       -0.968214  vacuolating cytotoxin VacA-like protein
HPSNT_03145  0       16       -2.988744  putative ABC transporter permease
HPSNT_03150  -2      13       -1.151231  ABC transporter ATP-binding protein
HPSNT_03155  -1      13       -0.377324  hypothetical protein
HPSNT_03160  -1      11       -0.873207  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits       Score  E-value  Comments
HPSNT_03100  FLAGELLIN  244    6e-77    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 244 bits (624), Expect = 6e-77
Identities = 126/518 (24%), Positives = 209/518 (40%), Gaps = 22/518 (4%)

Query: 2 AFQVNTNINAMNAHVQSALTQNALKTSLERLSSGLRINKAADDASGMTVADSLRSQASSL 61
A +NTN ++ +Q++L +++ERLSSGLRIN A DDA+G +A+ S L
Sbjct: 1 AQVINTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGL 60

Query: 62 GQAIANTNDGMGIIQVADKAMDEQLKILDTVKVKATQAAQDGQTTESRKAIQSDIVRLIQ 121
QA N NDG+ I Q + A++E L V+ + QA + K+IQ +I + ++
Sbjct: 61 TQASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLE 120

Query: 122 GLDNIGNTTTYNGQALLSGQFTNKEFQVGAYSNQSIKASIGSTTSDKIGQVRI-ATGALI 180
+D + N T +NG +LS + QVGA ++I + +G G
Sbjct: 121 EIDRVSNQTQFNGVKVLSQDN-QMKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPKE 179

Query: 181 TASGDISLTFKQVDGVNDVTLESVKVSSSAGTGIGVLAEVINKNSNRTGVKAYASVITTS 240
GD+ +FK V G + + + K +G V ++ V A +TT
Sbjct: 180 ATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTTD 239

Query: 241 DVAVQSGSLSNLTLNGIHLGNIADIKKNDSDGRLVAAINAVTSETGVEAYTDQKGRLNLR 300
D N + K A A+ + + + +
Sbjct: 240 DAE-----------NNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTID 288

Query: 301 SIDGRGIEIKTDSVSNGPSALTMVNGGQDLTKGSTNYGRLSLTRLDAKSINV------VS 354
+ G K + NG V S + +N +
Sbjct: 289 TKTGNDGNGKVSTTINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKT 348

Query: 355 ASDSQHLGFTAIGFGESQVAETTVNLRDVTGNFNANVKSASGANYNAVIASGNQSL---G 411
++S L ++ TVN + T N + + +G + S
Sbjct: 349 KNESAKLSDLEANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFIDKTASGVSTLINE 408

Query: 412 SGVTTLRGAMVVIDIAESAMKMLDKVRSDLGSVQNQMISTVNNISITQVNVKAAESQIRD 471
+ + +SA+ +D VRS LG++QN+ S + N+ T N+ +A S+I D
Sbjct: 409 DAAAAKKSTANPLASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARSRIED 468

Query: 472 VDFAEESANFNKNNILAQSGSYAMSQANTVQQNILRLL 509
D+A E +N +K IL Q+G+ ++QAN V QN+L LL
Sbjct: 469 ADYATEVSNMSKAQILQQAGTSVLAQANQVPQNVLSLL 506


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
HPSNT_03105  PF05272  30     0.011    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.7 bits (66), Expect = 0.011
Identities = 13/95 (13%), Positives = 26/95 (27%), Gaps = 20/95 (21%)

Query: 60 ILENDDEINLKKIAYIEFSKLAECVRPSGFYNQKAKRLIDLSKNILKDFQSFENFKQEVT 119
L + + +A+ E + VR + +KA E+
Sbjct: 458 ALRSAPALA-GCVAFDELREQPVAVRAFPW--RKAPGP-------------LEDADVLRL 501

Query: 120 REWLLDQKGIGKESADAILCYVCAKEVMVVDKYSY 154
+++ G G+ SA + D
Sbjct: 502 ADYVETTYGTGEASAQTTEQAINV----AADMNRV 532


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits       Score  E-value  Comments
HPSNT_03120  RTXTOXIND  29     0.038    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 29.4 bits (66), Expect = 0.038
Identities = 16/113 (14%), Positives = 40/113 (35%), Gaps = 16/113 (14%)

Query: 203 LARMIALQKKLEQIKTDIKRVTKLYDKGLTTIDDL-----QSLKAQGNLSEY--DILDMQ 255
LAR+ + K+ + + L K + + ++A L Y + ++
Sbjct: 220 LARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIE 279

Query: 256 FALEQNRLTLEYLTNLNVKNLKKTTIDAPNLQLRERQD-LVSLREQISALRYQ 307
+ + + +T K +D +LR+ D + L +++ +
Sbjct: 280 SEILSAKEEYQLVTQ----LFKNEILD----KLRQTTDNIGLLTLELAKNEER 324


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits       Score  E-value  Comments
HPSNT_03125  RTXTOXIND  52     4e-10    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 52.1 bits (125), Expect = 4e-10
Identities = 24/82 (29%), Positives = 37/82 (45%), Gaps = 5/82 (6%)

Query: 27 NVKAIQDSKLTLDSTGIVDSIKVTEGSVVKKGDVLLLLYNQEKQAQSDSTEQQLIFAKKQ 86
K I+ IV I V EG V+KGDVLL L +A + T+ L+ A+ +
Sbjct: 95 RSKEIKPI-----ENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLE 149

Query: 87 YQRYSKIGGAVDKNTLESYEFN 108
RY + +++ N L +
Sbjct: 150 QTRYQILSRSIELNKLPELKLP 171



Score = 31.7 bits (72), Expect = 0.002
Identities = 21/152 (13%), Positives = 48/152 (31%), Gaps = 25/152 (16%)

Query: 70 QAQSDSTEQQLIFAKKQYQR--YSKIGGAVDKNTLESYEFNYRRLESDYAYSIAVLNKTI 127
+++ S +++ + ++ K+ D + + E ++
Sbjct: 279 ESEILSAKEEYQLVTQLFKNEILDKLRQTTDN--IGLLTLELAKNEER-------QQASV 329

Query: 128 LRAPFDGVIASKNIQVGEGVSANNTVLLRLVSHARKLVIE--FDSKYINAVKVG------ 179
+RAP + + GV L+ +V L + +K I + VG
Sbjct: 330 IRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIK 389

Query: 180 -DTYTYSIDGDSNQHEAKITKIYP--TVDENT 208
+ + Y+ G K+ I D+
Sbjct: 390 VEAFPYTRYGYL---VGKVKNINLDAIEDQRL 418


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPSNT_03130  ACRIFLAVINRP  900    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 900 bits (2327), Expect = 0.0
Identities = 286/1038 (27%), Positives = 518/1038 (49%), Gaps = 40/1038 (3%)

Query: 1 MYKTAINRPITTLMFALAIVFFGTMGFKKLSVALFPKIDLPTVVVTTTYPGASAEIIESK 60
M I RPI + A+ ++ G + +L VA +P I P V V+ YPGA A+ ++
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VTDKIEEAVMGIDGIKKVTSTSSKNVSIVV-IEFELEKPNEEALNDVVNKISSVR-FDDS 118
VT IE+ + GID + ++STS S+ + + F+ + A V NK+
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 119 NIKKPSINKFDTDSQAIISLFVSSSSVPAT--TLNDYAKNTIKPMLQKINGVGGVQLNGF 176
+++ I+ + S ++ S + T ++DY + +K L ++NGVG VQL G
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFG- 179

Query: 177 RERQIRIYADPTLMNKYSLTYADLFSTLKAENVEIDGGRIVNS------QRELSILINAN 230
+ +RI+ D L+NKY LT D+ + LK +N +I G++ + Q SI+
Sbjct: 180 AQYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 231 SYSVADVEKIQV-----GNHVRLGDIAKIEIGLEEDNTFASFKDKPGVILEIQKIAGANE 285
+ + K+ + G+ VRL D+A++E+G E N A KP L I+ GAN
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANA 299

Query: 286 IEIVDRVYEALKHIQAISP-SYEIKPFLDTTSYIRTSIEDVKFDLILGAILAVLVVFAFL 344
++ + L +Q P ++ DTT +++ SI +V L +L LV++ FL
Sbjct: 300 LDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFL 359

Query: 345 RNGTITLVSAISIPISIMGTFALIQWMGFSLNMLTMVALTLAIGIIIDDAIVVIENIHK- 403
+N TL+ I++P+ ++GTFA++ G+S+N LTM + LAIG+++DDAIVV+EN+ +
Sbjct: 360 QNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERV 419

Query: 404 KLEMGMSKRKASYEGVREIGFALVAISAMLLSVFVPIGNMKGIIGRFFQSFGITVALAIA 463
+E + ++A+ + + +I ALV I+ +L +VF+P+ G G ++ F IT+ A+A
Sbjct: 420 MMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMA 479

Query: 464 LSYVVVVTIIPMVSSVVVNPRHS-------RFYVWSEPFFKALESRYTKLLQWVLNHKLI 516
LS +V + + P + + ++ P + F+ W F + YT + +L
Sbjct: 480 LSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 517 IFIAVVLVFVGSLFVASKLGMDFMLKEDRGRFLVWLKAKPGVSIDY----MTQKSKIFQK 572
+ L+ G + + +L F+ +ED+G FL ++ G + + + Q + + K
Sbjct: 540 YLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLK 599

Query: 573 AIEKHDEVEFTTLQVGY-GTTQNPFKAKIFVQLKPLKERKKEHQLGQFELMSALRKELRS 631
+ + E FT + G QN FV LKP +ER + ++ + EL
Sbjct: 600 NEKANVESVFTVNGFSFSGQAQNA--GMAFVSLKPWEERNGDENS-AEAVIHRAKMELGK 656

Query: 632 MPEAKGLDTINLSEVALLGGGGDSSPFQTFVFSHSQEAVDKSVANLKKFLLESPELKGKV 691
+ + + N+ + L G ++ F + + D + L + + +
Sbjct: 657 IRDGFVI-PFNMPAIVEL---GTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASL 712

Query: 692 ESYHTSTSESQPQLQLKILRQNANKYGVSAQTIGSVVSSAFSGTSQASVFKEDGKEYDMI 751
S + E Q +L++ ++ A GVS I +S+A G + + F + G+ +
Sbjct: 713 VSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGG-TYVNDFIDRGRVKKLY 771

Query: 752 IRVPDDKRVSVEDIKRLQVRNKYDKLMFLDALVEITETKSPSSISRYNRQRSVTVLAQPK 811
++ R+ ED+ +L VR+ +++ A + RYN S+ + +
Sbjct: 772 VQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAA 831

Query: 812 AGISLGEILTQVSKNTKEWLVEGANYRFTGEADNAKETNGEFLIALATAFVLIYMILAAL 871
G S G+ + + +N L G Y +TG + + + + +A +FV++++ LAAL
Sbjct: 832 PGTSSGDAMALM-ENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAAL 890

Query: 872 YESILEPFIIMVTMPLSFSGAFFALGLVHQPLSMFSMIGLILLIGMVGKNATLLIDVANE 931
YES P +M+ +PL G A L +Q ++ M+GL+ IG+ KNA L+++ A +
Sbjct: 891 YESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKD 950

Query: 932 -ERKKGLNIQEAILFAGKTRLRPILMTTIAMVCGMLPLALASGDGAAMKSPIGIAMSGGL 990
K+G + EA L A + RLRPILMT++A + G+LPLA+++G G+ ++ +GI + GG+
Sbjct: 951 LMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGM 1010

Query: 991 MISMVLSLLIVPVFYRLL 1008
+ + +L++ VPVF+ ++
Sbjct: 1011 VSATLLAIFFVPVFFVVI 1028


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPSNT_03140  VACCYTOTOXIN  279    6e-78    Helicobacter pylori vacuolating cytotoxin signature.
		>VACCYTOTOXIN#Helicobacter pylori vacuolating cytotoxin signature.

Length = 1291

Score = 279 bits (714), Expect = 6e-78
Identities = 110/407 (27%), Positives = 191/407 (46%), Gaps = 15/407 (3%)

Query: 2791 SAGLNTIKS-ANHNAVNWLNALFVAKGGNPLFAPYYLQDTPTKHIVTLMEDVSSALGMLT 2849
S ++T+ + + + L L + + +A + T I + ++ L +
Sbjct: 894 SNDIDTLYANSGAQGRDLLQTLLI-DSHDAGYARTMIDATSANEITKQLNTATTTLNNIA 952

Query: 2850 KPSLKNNSTDVLQLNTYTQQMGRLAKLSNFASFDSTDFSERLSSLKNQRFADAIPNAMDV 2909
K + L L+ RL LS + F++RL +LK+QRFA + +A +V
Sbjct: 953 SLEHKTSGLQTLSLSNAMILNSRLVNLSRRHTNHIDSFAKRLQALKDQRFAS-LESAAEV 1011

Query: 2910 ILKYSQRNKLKNNLWATGVGGVSFVGNGTGTLYGVNVGYDRFIKG---VIVGGYAAYGYS 2966
+ +++ + + N+WA +GG S G +LYG + G D ++ G IVGG+ +YGYS
Sbjct: 1012 LYQFAPKYEKPTNVWANAIGGTSLNSGGNASLYGTSAGVDAYLNGEVEAIVGGFGSYGYS 1071

Query: 2967 GFYER--ITSSKSNNVNVGLYARAFIKKSELTFSVNETWGANKTQISSADTLLSVINQSY 3024
F + +S +NN N G+Y+R F + E F G++++ ++ LL +NQSY
Sbjct: 1072 SFSNQANSLNSGANNTNFGVYSRIFANQHEFDFEAQGALGSDQSSLNFKSALLRDLNQSY 1131

Query: 3025 NYNAWTTNARVNYGYDFMFKNKSIILKPQIGLRYYYIGMTGLEGVMNNVLYSQFKANADP 3084
NY A++ R +YGYDF F +++LKP +G+ Y ++G T + N +
Sbjct: 1132 NYLAYSAATRASYGYDFAFFRNALVLKPSVGVSYNHLGSTNFKSNSNQ----KVALKNGA 1187

Query: 3085 SKKSVLTIDFAFENRHYFNTNSYFYAIGGIGRDLLVRSMGDKLVRFIGDNTLSYRKGELY 3144
S + + E R+Y+ SYFY G+ ++ + V + R
Sbjct: 1188 SSQHLFNASANVEARYYYGDTSYFYMNAGVLQEFANFGSSNA-VSLNTFKVNATRNP--L 1244

Query: 3145 NTFASITTGGEVRLFKSFYANVGVGARFGLDYKMINITGNIGMRLAF 3191
NT A + GGE++L K + N+G L + + N+GMR +F
Sbjct: 1245 NTHARVMMGGELKLAKEVFLNLGFVYLHNLISNIGHFASNLGMRYSF 1291



Score = 38.9 bits (90), Expect = 5e-04
Identities = 74/427 (17%), Positives = 142/427 (33%), Gaps = 50/427 (11%)

Query: 119 YIGTKNASSTLNHSSIWFGEKGYVGFITGVFKAKDIFITGAVGSGNEWKTGGGAILVFES 178
I T + +N + + G+ G+ + I G+ + W++ G I+
Sbjct: 275 TINTSKVTGEVNFNHLTVGDHNAAQ--AGIIASNKTHI----GTLDLWQSAGLNIIAP-- 326

Query: 179 SNELNTNGAYFQNNRAGTQTSWINLISNNSVNLTNTDFGNQTPNGGFNAMGRKITYNGGI 238
G + N + T+ N ++ + N Q N N+ + +
Sbjct: 327 ----PEGGYKDKPNDKPSNTTQNNAKNDKQESSQNNSN-TQVINP-PNSAQKTEIQPTQV 380

Query: 239 VNGGNFGFDNVDSNGTTTISGVTFNNNGVLTYKGGNGIGGSITFTNSNINHYKLNLNANS 298
++G G T ++ N N T + G G S+T ++++ K +N ++
Sbjct: 381 IDGPFAG------GKNTVVNINRINTNADGTIRVG-GFKASLTTNAAHLHIGKGGINLSN 433

Query: 299 VTFNNSALGSMPNGNINTIGNAYILNASNITFNNLTFNGGWFVFNRSDAHVNFQGATTIN 358
S L GNI G + NN G + S A+ F+ T
Sbjct: 434 QASGRSLLVENLTGNITVDGPLRV--------NNQV---GGYALAGSSANFEFKAGTDTK 482

Query: 359 NPTSPFVNMSAKVTINPNAIFNIQNYTPSIGSAYTLFSMKNGSIAYNDVSNLWNIIRLKN 418
N T+ F N + + I + F+ + ++ V+N NI +L
Sbjct: 483 NGTATFNNDISLGRFVNLKVDAHTANFKGIDTGNGGFNT----LDFSGVTNKVNINKL-- 536

Query: 419 TQATKDNSKNATSNNNTHTYYVTYNLGGTLYNFRQIFSPNSIVLQSIYYGANNLYYTNSV 478
A+ + + + N ++G + I S + I + G ++Y
Sbjct: 537 ITASTNVAVKNFNINELVVKTNGVSVGEYTHFSEDIGSQSRINTVRLETGTRSIYSGG-- 594

Query: 479 NIHDNVFNLKDIKDDRADAIFYLNGLNTWNYTNARFTQTYSGKNSALVFNATTPWANGSI 538
K + + +Y WNY +AR + N +PW +
Sbjct: 595 ------VKFKGGEKLVINDFYY----APWNYFDARNIKNVEITNKLAFGPQGSPWGTAKL 644

Query: 539 PKSNSTV 545
+N T+
Sbjct: 645 MFNNLTL 651



Score = 35.4 bits (81), Expect = 0.005
Identities = 16/100 (16%), Positives = 31/100 (31%), Gaps = 5/100 (5%)

Query: 702 SYSFDGANNIFNEDKFNGGSFNFNHAEQTNAFNNNSFNGGSFSFNAKQVDFNHNLFNGGV 761
SYS + + E FN + ++A Q +N + G+ + N G
Sbjct: 272 SYSTINTSKVTGEVNFNHLTVGDHNAAQAGIIASNKTHIGTLDLW-QSAGLNIIAPPEGG 330

Query: 762 FNF---NNTPKASFTDDAFNVNNQFKING-TQTDFTFNKG 797
+ + + + + + N TQ N
Sbjct: 331 YKDKPNDKPSNTTQNNAKNDKQESSQNNSNTQVINPPNSA 370


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits         Score  E-value  Comments
HPSNT_03160  LCRVANTIGEN  31     9e-04    Low calcium response V antigen signature.
		>LCRVANTIGEN#Low calcium response V antigen signature.

Length = 326

Score = 30.8 bits (69), Expect = 9e-04
Identities = 16/33 (48%), Positives = 20/33 (60%)

Query: 16 KRKKLLTELAELEAEIKVSSEQKSGFNISLSPS 48
R KL ELAEL AE+K+ S ++ N LS S
Sbjct: 149 ARSKLREELAELTAELKIYSVIQAEINKHLSSS 181


22  HPSNT_07165  HPSNT_07200  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias   Product
HPSNT_07165  0       13       0.271545  membrane protein insertase
HPSNT_07170  -1      11       0.157758  hypothetical protein
HPSNT_07175  0       10       0.656262  tRNA modification GTPase TrmE
HPSNT_07180  2       11       1.399279  outer membrane protein HomD
HPSNT_07185  1       14       0.461532  hypothetical protein
HPSNT_07190  -1      13       0.724967  hypothetical protein
HPSNT_07195  -2      13       1.771940  hypothetical protein
HPSNT_07200  -2      12       1.977072  membrane-associated lipoprotein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits         Score  E-value  Comments
HPSNT_07165  60KDINNERMP  425    e-146    60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 425 bits (1095), Expect = e-146
Identities = 165/572 (28%), Positives = 278/572 (48%), Gaps = 61/572 (10%)

Query: 10 RLILAIALSFLFITLYSYFFQKPNKT--TTQTTKQETANNHTATNPNTPNAQNFSVTQTI 67
R +L IAL F+ ++ + Q N QTT+ T +A + P +
Sbjct: 5 RNLLVIALLFVSFMIWQAWEQDKNPQPQAQQTTQTTTTAAGSAADQGVPASGQGK----- 59

Query: 68 PQENLLSTISFEHARIEIDSLGR--IKQVYLKDKKYLTPKQKGFLEHVGHLFNPKANPQT 125
L ++ + + I++ G + + K L Q L F +A
Sbjct: 60 -----LISVKTDVLDLTINTRGGDVEQALLPAYPKELNSTQPFQLLETSPQFIYQAQSGL 114

Query: 126 PLKELPLLAADKLKPLEVRFLDPTLNNKAFNTPYSASKTTLGPNEQLV--LTQDLGALTI 183
++ P A+ +PL +N A G NE V D T
Sbjct: 115 TGRDGPDNPANGPRPL-------------YNVEKDAYVLAEGQNELQVPMTYTDAAGNTF 161

Query: 184 IKTLTFYDDLHYDLQIAFKSSN--------NIIPSYVITNGYRPVADLDS-----YTFSG 230
KT Y + + + N + + P D S +TF G
Sbjct: 162 TKTFVLKRG-DYAVNVNYNVQNAGEKPLEISSFGQLKQSITLPPHLDTGSSNFALHTFRG 220

Query: 231 VLLENNDKKIEKIE---DKDAKEIKRFSNTLFLSSVDRYFTTLLFTKDPQGFEALIDSEI 287
D+K EK + D + + S +++ + +YF T + G + +
Sbjct: 221 AAYSTPDEKYEKYKFDTIADNENLNISSKGGWVAMLQQYFATAWIPHN-DGTNNFYTANL 279

Query: 288 GTKNPLGFISLKNEA-----------DLHGYIGPKDYRSLKAISPMLTDVIEYGLITFFA 336
G N + I K++ + ++GP+ + A++P L ++YG + F +
Sbjct: 280 G--NGIAAIGYKSQPVLVQPGQTGAMNSTLWVGPEIQDKMAAVAPHLDLTVDYGWLWFIS 337

Query: 337 KGVFVLLDYLYQFVGNWGWAIILLTIIVRLILYPLSYKGMVSMQKLKELTPKMKELQEKY 396
+ +F LL +++ FVGNWG++II++T IVR I+YPL+ SM K++ L PK++ ++E+
Sbjct: 338 QPLFKLLKWIHSFVGNWGFSIIIITFIVRGIMYPLTKAQYTSMAKMRMLQPKIQAMRERL 397

Query: 397 KGEPQKLQAHMMQLYKKHGANPLGGCLPLILQIPVFFAIYRVLYNAVELKSSEWILWIHD 456
+ Q++ MM LYK NPLGGC PL++Q+P+F A+Y +L +VEL+ + + LWIHD
Sbjct: 398 GDDKQRISQEMMALYKAEKVNPLGGCFPLLIQMPIFLALYYMLMGSVELRQAPFALWIHD 457

Query: 457 LSIMDPYFILPLLMGASMYWHQSVTPNTMTDPMQAKIFKLLPLLFTIFLITFPAGLVLYW 516
LS DPY+ILP+LMG +M++ Q ++P T+TDPMQ KI +P++FT+F + FP+GLVLY+
Sbjct: 458 LSAQDPYYILPILMGVTMFFIQKMSPTTVTDPMQQKIMTFMPVIFTVFFLWFPSGLVLYY 517

Query: 517 TTNNILSVLQQLIINKVLENKKRAHAQNKKES 548
+N+++++QQ +I + LE K+ H++ KK+S
Sbjct: 518 IVSNLVTIIQQQLIYRGLE-KRGLHSREKKKS 548


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
HPSNT_07170  DPTHRIATOXIN  31     0.004    Diphtheria toxin signature.
		>DPTHRIATOXIN#Diphtheria toxin signature.

Length = 567

Score = 31.3 bits (70), Expect = 0.004
Identities = 24/76 (31%), Positives = 40/76 (52%), Gaps = 2/76 (2%)

Query: 17 IQASIALNCPIINLQYEVIQTPSKGFLNIGKKEAIILAGVKESV-KAVKEESVKEAHTKE 75
++ S+ + INL ++VI+ +K + K+ I + ES K V EE K+ + +E
Sbjct: 223 VRRSVGSSLSCINLDWDVIRDKTKTKIESLKEHGPIKNKMSESPNKTVSEEKAKQ-YLEE 281

Query: 76 IHQSAEEKKQKLETKT 91
HQ+A E + E KT
Sbjct: 282 FHQTALEHPELSELKT 297


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits       Score  E-value  Comments
HPSNT_07175  TCRTETOQM  32     0.006    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 31.7 bits (72), Expect = 0.006
Identities = 32/134 (23%), Positives = 53/134 (39%), Gaps = 25/134 (18%)

Query: 216 LSIVGKPNAGKSSLLNAMLLEERA---LVSDIKGTTR-DTIEE-------------VIEL 258
+ ++ +AGK++L ++L A L S KGTTR D +
Sbjct: 6 IGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITSFQW 65

Query: 259 QGHKVRLIDTAGIRESADKIERLGIEKSLKSLENCDIILGVFDLSKPLEKEDFNLIDTLN 318
+ KV +IDT G + ++ R SL L D + + ++ + L L
Sbjct: 66 ENTKVNIIDTPGHMDFLAEVYR-----SLSVL---DGAILLISAKDGVQAQTRILFHALR 117

Query: 319 RAKKPCIVVLNKND 332
+ P I +NK D
Sbjct: 118 KMGIPTIFFINKID 131


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits       Score  E-value  Comments
HPSNT_07200  LIPOLPP20  292    e-105    LPP20 lipoprotein precursor signature.
		>LIPOLPP20#LPP20 lipoprotein precursor signature.

Length = 175

Score = 292 bits (748), Expect = e-105
Identities = 173/175 (98%), Positives = 174/175 (99%)

Query: 1 MKNQVKKILGMSVIAAMVIAGCSHAPKSGISKSNKAYKEATKGAPDWVVGDLEKVAKYEK 60
MKNQVKKILGMSV+AAMVI GCSHAPKSGISKSNKAYKEATKGAPDWVVGDLEKVAKYEK
Sbjct: 1 MKNQVKKILGMSVVAAMVIVGCSHAPKSGISKSNKAYKEATKGAPDWVVGDLEKVAKYEK 60

Query: 61 YSGVFLGRAEDLITNNDVDYSTNQATAKARANLAANLKSTLQKDLENEKTRTVDASGKRS 120
YSGVFLGRAEDLITNNDVDYSTNQATAKARANLAANLKSTLQKDLENEKTRTVDASGKRS
Sbjct: 61 YSGVFLGRAEDLITNNDVDYSTNQATAKARANLAANLKSTLQKDLENEKTRTVDASGKRS 120

Query: 121 ISGTDTEKISQLVDKELIASKMLARYVGKDRVFVLVGLDKQIVDKVREELGMVKK 175
ISGTDTEKISQLVDKELIASKMLARYVGKDRVFVLVGLDKQIVDKVREELGMVKK
Sbjct: 121 ISGTDTEKISQLVDKELIASKMLARYVGKDRVFVLVGLDKQIVDKVREELGMVKK 175
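
Note on reading the alignment statistics: each alignment in this report is preceded by a "Score = ..., Expect = ..." line and an "Identities = ..." line with a fixed layout. The Python sketch below is not part of PredictBias; the function name parse_hsp and the dictionary keys are hypothetical and chosen only for illustration. It assumes the two lines are passed in exactly as printed above, and it recomputes each percentage by integer truncation, which appears to match the values shown in this report (for example 173/175 is printed as 98%).

```python
import re

# Layout of the two statistics lines that precede each alignment in this report.
SCORE_RE = re.compile(r"Score = ([\d.]+) bits \((\d+)\), Expect = (\S+)")
FRACTION_RE = re.compile(r"(Identities|Positives|Gaps) = (\d+)/(\d+)")


def parse_hsp(score_line, stats_line):
    """Parse one alignment's statistics into a dictionary (hypothetical helper)."""
    match = SCORE_RE.search(score_line)
    if match is None:
        raise ValueError("not a Score/Expect line: %r" % score_line)

    expect = match.group(3)
    # Very small E-values are printed without a mantissa, e.g. "e-146".
    if expect.startswith("e"):
        expect = "1" + expect

    hsp = {
        "bits": float(match.group(1)),
        "raw_score": int(match.group(2)),
        "expect": float(expect),
    }
    # Each fraction becomes (matches, alignment length, truncated percentage).
    for name, num, den in FRACTION_RE.findall(stats_line):
        num, den = int(num), int(den)
        hsp[name.lower()] = (num, den, 100 * num // den)
    return hsp


if __name__ == "__main__":
    hit = parse_hsp(
        "Score = 292 bits (748), Expect = e-105",
        "Identities = 173/175 (98%), Positives = 174/175 (99%)",
    )
    print(hit)
```

A parsed E-value can then be compared with the RPSBlast threshold listed under the input parameters (0.05) to decide whether a hit would count as significant.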



 
Contact Sachin Pundhir for Bugs/Comments.