PredictBias

Identification of genomic and pathogenicity islands in prokaryotic genomes
 
A) Input parameters
Genome: 2818.gbk
Threshold dinucleotide bias: 2
Threshold codon bias: 4
Threshold %GC bias: 3
E-value (RPSBlast): 0.05
Genome (non-pathogenic):
 
C) Potential GIs and PAIs in NC_009708
S.No  Start            End              Bias  Virulence  Insertion elements  Prediction
1     YpsIP31758_0165  YpsIP31758_0171  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias     Product
YpsIP31758_0165  2       15       -2.211307   colicin/pyocin immunity family protein
YpsIP31758_0166  3       15       -1.990706   hypothetical protein
YpsIP31758_0167  2       16       -3.453110   pili assembly chaperone
YpsIP31758_0168  2       16       -3.781809   hypothetical protein
YpsIP31758_0169  1       18       -4.948020   fimbrial usher protein
YpsIP31758_0170  0       14       -4.654227   pili assembly chaperone
YpsIP31758_0171  0       15       -4.291691   fimbrial protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0169  PF00577       775    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 775 bits (2003), Expect = 0.0
Identities = 256/900 (28%), Positives = 401/900 (44%), Gaps = 71/900 (7%)

Query: 15 RRKALTLCITLILHIDTAFGQEEP---QNFEFDESLFLGTKYASG-LTQLNKKNSITAGN 70
RK + L + AF + P F+ A L++ + G
Sbjct: 18 IRKHRLAGFFVRLFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPGT 77

Query: 71 YDAVDVLVNNKLFKRMSVQFIKDANSSEVYPCLSDELLTAAGVELGRENSTPPKEPHVTE 130
Y VD+ +NN V F + + PCL+ L + G+ +
Sbjct: 78 Y-RVDIYLNNGYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLNTASVSG---------- 126

Query: 131 ANTPITETHAPTNQCLPLSTRVKGASFRFDQAKLRLELSIPQALLQKRPRGYIERAEWQE 190
+ C+PL++ + A+ + D + RL L+IPQA + R RGYI W
Sbjct: 127 ------MNLLADDACVPLTSMIHDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDP 180

Query: 191 GEKLAFINYSANAYRSDTRGQQKRTSDFGFIGLKSGINLGLWQVRQQSNVRYASN--DSG 248
G +NY+ + R S + ++ L+SG+N+G W++R + Y S+ SG
Sbjct: 181 GINAGLLNYNFSGNSVQNRIGG--NSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSG 238

Query: 249 SDTQWNSIRTYVQRPIPQLDSQLTLGETFTDSTLFGSMSFLGAKMATDQRMWPVSMRGFS 308
S +W I T+++R I L S+LTLG+ +T +F ++F GA++A+D M P S RGF+
Sbjct: 239 SKNKWQHINTWLERDIIPLRSRLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFA 298

Query: 309 PEVRGVASTNARVIIRQNGREIYETNVAPGPFVINDLFSTSSQGDLNVEVIEANGSRSTF 368
P + G+A A+V I+QNG +IY + V PGPF IND+++ + GDL V + EA+GS F
Sbjct: 299 PVIHGIARGTAQVTIKQNGYDIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIF 358

Query: 369 TVPFSAVPDSMRPGVSRYNAVIGESRDFTN--IDNYFTDFTYERGLTNQLTANSGVRLAK 426
TVP+S+VP R G +RY+ GE R F T GL T G +LA
Sbjct: 359 TVPYSSVPLLQREGHTRYSITAGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLAD 418

Query: 427 DYTALLAGGVLGT-PVGALGLNATYSHAKVENDKTQDGWRMQATYSQTFNQTGTTFSLAG 485
Y A G +GAL ++ T +++ + +D DG ++ Y+++ N++GT L G
Sbjct: 419 RYRAFNFGIGKNMGALGALSVDMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVG 478

Query: 486 YRYSTKGYRDLNDVFGVRSMQKNGGTWD-------------SSTYKQRSQFTTTINQDLG 532
YRYST GY + D R N T D + Y +R + T+ Q LG
Sbjct: 479 YRYSTSGYFNFADTTYSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLG 538

Query: 533 NWGQLYASASTSDYYNDTARDTQLQLGYSNSYQQISYNLAVSRQRSVYTSTLYNWDSPDT 592
LY S S Y+ + D Q Q G + +++ I++ L+ S ++ +
Sbjct: 539 RTSTLYLSGSHQTYWGTSNVDEQFQAGLNTAFEDINWTLSYSLTKNAWQ----------- 587

Query: 593 DETATTTRYGNTENIATFTVSIPL--------NIGSNNQYLSMSASRNPKSGNNYQTSLS 644
+ + V+IP + S S S + +
Sbjct: 588 ---------KGRDQMLALNVNIPFSHWLRSDSKSQWRHASASYSMSHDLNGRMTNLAGVY 638

Query: 645 GTAGERNSFNYALNAGYDDSNFGSSSNNWGANVQKQFPNATVNGSYSRGNNYTQYGAGAR 704
GT E N+ +Y++ GY G+S + A + + N YS ++ Q G
Sbjct: 639 GTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGYGNANIGYSHSDDIKQLYYGVS 698

Query: 705 GAAVIHRQGVTLGPYLGETFGLIEANGAQGATVRNAQGARIDSNGFALVPALTPYNYNTI 764
G + H GVTLG L +T L++A GA+ A V N G R D G+A++P T Y N +
Sbjct: 699 GGVLAHANGVTLGQPLNDTVVLVKAPGAKDAKVENQTGVRTDWRGYAVLPYATEYRENRV 758

Query: 765 GLDTKGINRNTELKENQGRVVPYAGAAVKVKFETLTGYAVLI--QAEGEGLPLGADVYNS 822
LDT + N +L VVP GA V+ +F+ G +L+ + LP GA V +
Sbjct: 759 ALDTNTLADNVDLDNAVANVVPTRGAIVRAEFKARVGIKLLMTLTHNNKPLPFGAMVTSE 818

Query: 823 KDELVGMVGQGNQIYARIADNKGTLDVRWGESSGDQCQLPYAFNRQDTEQDIIHITASCR 882
+ G+V Q+Y G + V+WGE C Y + +Q + ++A CR
Sbjct: 819 SSQSSGIVADNGQVYLSGMPLAGKVQVKWGEEENAHCVANYQLPPESQQQLLTQLSAECR 878
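
The Score, Expect and Identities lines in the alignment blocks of this report follow the usual BLAST text layout, so they can be extracted with a couple of regular expressions. The Python sketch below is only an illustration and is not part of PredictBias: the function names `parse_expect` and `parse_hit_stats` are hypothetical, and it assumes that very small E-values are abbreviated the way they appear in this report (for example `e-106`).

```python
import re

# Patterns for the statistics lines as they appear in this report (assumed layout).
SCORE_RE = re.compile(
    r"Score\s*=\s*([\d.]+)\s*bits\s*\((\d+)\),\s*Expect\s*=\s*(\S+)"
)
IDENT_RE = re.compile(r"Identities\s*=\s*(\d+)/(\d+)")


def parse_expect(text):
    """Convert an E-value string to float; 'e-106' is read as 1e-106."""
    return float("1" + text if text.startswith("e") else text)


def parse_hit_stats(score_line, identities_line):
    """Return bit score, raw score, E-value and percent identity for one hit."""
    score = SCORE_RE.search(score_line)
    ident = IDENT_RE.search(identities_line)
    if score is None or ident is None:
        raise ValueError("line does not look like a BLAST statistics line")
    matches, length = int(ident.group(1)), int(ident.group(2))
    return {
        "bits": float(score.group(1)),
        "raw_score": int(score.group(2)),
        "expect": parse_expect(score.group(3)),
        "identity_pct": 100.0 * matches / length,
    }
```

Applied to the two statistics lines of the PF00577 hit above, this gives a bit score of 775.0, a raw score of 2003, an E-value of 0.0 and an identity of about 28.4%, which agrees with the rounded 28% printed in the report.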


2     YpsIP31758_0197  YpsIP31758_0209  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias     Product
YpsIP31758_0197  -2      15       -3.303434   uroporphyrinogen-III synthase
YpsIP31758_0198  -2      16       -3.855528   porphobilinogen deaminase
YpsIP31758_0199  -2      18       -5.341900   hypothetical protein
YpsIP31758_0200  -2      15       -3.864742   hypothetical protein
YpsIP31758_0201  -3      15       -2.727038   adenylate cyclase
YpsIP31758_0202  -2      19       -2.928527   hypothetical protein
YpsIP31758_0203  0       18       3.756202    frataxin-like protein
YpsIP31758_0204  -1      19       4.726265    lipoprotein
YpsIP31758_0205  -1      17       4.277440    diaminopimelate epimerase
YpsIP31758_0206  -1      14       3.940713    hypothetical protein
YpsIP31758_0207  -1      13       3.642530    site-specific tyrosine recombinase XerC
YpsIP31758_0208  -1      13       3.284698    flavin mononucleotide phosphatase
YpsIP31758_0209  -1      15       3.102939    DNA-dependent helicase II
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0203  MALTOSEBP     26     0.029    Maltose binding protein signature.
		>MALTOSEBP#Maltose binding protein signature.

Length = 396

Score = 26.2 bits (57), Expect = 0.029
Identities = 13/38 (34%), Positives = 18/38 (47%)

Query: 41 LTFENGSKIVINRQEPLHQVWLATKAGGYHFNYRDGHW 78
L + S ++ N QEP L GGY F Y +G +
Sbjct: 165 LKAKGKSALMFNLQEPYFTWPLIAADGGYAFKYENGKY 202


3     YpsIP31758_0311  YpsIP31758_0354  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias     Product
YpsIP31758_0311  2       19       -0.373888   hypothetical protein
YpsIP31758_0312  2       17       0.160264    hypothetical protein
YpsIP31758_0313  1       18       0.620498    hypothetical protein
YpsIP31758_0314  1       19       1.046472    hypothetical protein
YpsIP31758_0315  0       18       1.435255    hypothetical protein
YpsIP31758_0316  -1      18       1.577926    hypothetical protein
YpsIP31758_0317  -1      16       2.767538    hypothetical protein
YpsIP31758_0318  0       18       2.776444    FHA domain-containing protein
YpsIP31758_0319  -1      19       3.056480    lipoprotein
YpsIP31758_0320  -1      18       3.542307    hypothetical protein
YpsIP31758_0321  -1      22       4.425487    hypothetical protein
YpsIP31758_0322  0       24       5.291399    type VI secretion ATPase
YpsIP31758_0323  -1      31       6.378082    Fis family transcriptional regulator
YpsIP31758_0324  -1      31       6.617911    hypothetical protein
YpsIP31758_0325  -1      29       6.216816    ImpA domain-containing protein
YpsIP31758_0326  -1      27       4.669582    lipoprotein
YpsIP31758_0327  1       27       4.410380    ImpA domain-containing protein
YpsIP31758_0328  2       26       4.124576    Rhs element Vgr protein
YpsIP31758_0329  3       25       2.115975    hypothetical protein
YpsIP31758_0330  3       24       1.811915    RHS/YD repeat-containing protein
YpsIP31758_0331  3       23       -0.454214   hypothetical protein
YpsIP31758_0332  3       24       1.938805    hypothetical protein
YpsIP31758_0333  3       24       1.870620    RHS/YD repeat-containing protein
YpsIP31758_0334  1       27       2.276149    hypothetical protein
YpsIP31758_0335  0       31       4.743984    hypothetical protein
YpsIP31758_0336  1       29       6.219598    hypothetical protein
YpsIP31758_0337  1       30       6.275442    DNA-binding protein
YpsIP31758_0338  1       29       5.716533    hypothetical protein
YpsIP31758_0339  1       24       4.732531    VgrG protein
YpsIP31758_0340  1       20       2.812706    hypothetical protein
YpsIP31758_0341  1       19       2.792544    YD repeat-/RHS repeat-containing protein
YpsIP31758_0342  2       18       -1.659529   hypothetical protein
YpsIP31758_0344  2       16       -0.280078   AP endonuclease
YpsIP31758_0345  2       17       0.029414    oxidoreductase, NAD-binding
YpsIP31758_0347  2       19       0.676725    AraC family transcriptional regulator
YpsIP31758_0346  2       19       2.293737    hypothetical protein
YpsIP31758_0348  0       16       2.415614    RbsD/FucU family transport protein
YpsIP31758_0349  1       14       3.075596    ribokinase
YpsIP31758_0351  0       13       2.987574    hypothetical protein
YpsIP31758_0350  0       14       4.088871    deoxyribose-phosphate aldolase
YpsIP31758_0352  0       13       4.593023    NAD(P)H-dependent FMN reductase
YpsIP31758_0353  -1      15       4.588907    alkanesulfonate transporter substrate-binding
YpsIP31758_0354  0       18       4.740485    alkanesulfonate monooxygenase
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0323  HTHFIS        35     3e-04    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 34.8 bits (80), Expect = 3e-04
Identities = 7/47 (14%), Positives = 16/47 (34%)

Query: 215 DSLTTAVETFECAVLTQRQRLYGNDKSRIAASLGLSLRALTYKLAKY 261
+ E ++ ++ + A LGL+ L K+ +
Sbjct: 427 GLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIREL 473


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0325  adhesinmafb   37     2e-04    Neisseria meningitidis: adhesin MafB signature.
		>adhesinmafb#Neisseria meningitidis: adhesin MafB signature.

Length = 467

Score = 36.6 bits (84), Expect = 2e-04
Identities = 14/32 (43%), Positives = 18/32 (56%)

Query: 112 NPLHKRRFAQQILKRFDSASSSFSQRADEAQR 143
NP R Q+I + + S+FS RADEA R
Sbjct: 178 NPTDTRSIRQRISDNYSNLGSNFSDRADEANR 209


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0330  cloacin       32     0.019    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 32.0 bits (72), Expect = 0.019
Identities = 23/85 (27%), Positives = 37/85 (43%), Gaps = 3/85 (3%)

Query: 945 RDYDAMGRRLWQSAGSDAPTVAADLLPRQG--DIWRKFSFDTAGELSMATDFIRGEQQYR 1002
D A G R+WQ AG A D+ +Q D K D LS A + + ++ +
Sbjct: 380 HDPMAGGHRMWQMAGLKAQRAQTDVNNKQAAFDAAAKEKSDADAALSSAMESRKKKEDKK 439

Query: 1003 YDAEGRLTDSRERHQLSVAEDFAYD 1027
AE L D + + + +D+ +D
Sbjct: 440 RSAENNLNDEKNKPRKGF-KDYGHD 463


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0333  cloacin       32     0.022    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 32.0 bits (72), Expect = 0.022
Identities = 23/85 (27%), Positives = 37/85 (43%), Gaps = 3/85 (3%)

Query: 945 RDYDAMGRRLWQSAGSDAPTVAADLLPRQG--DIWRKFSFDTAGELSMATDFIRGEQQYR 1002
D A G R+WQ AG A D+ +Q D K D LS A + + ++ +
Sbjct: 380 HDPMAGGHRMWQMAGLKAQRAQTDVNNKQAAFDAAAKEKSDADAALSSAMESRKKKEDKK 439

Query: 1003 YDAEGRLTDSRERHQLSVAEDFAYD 1027
AE L D + + + +D+ +D
Sbjct: 440 RSAENNLNDEKNKPRKGF-KDYGHD 463


4     YpsIP31758_0411  YpsIP31758_0416  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias     Product
YpsIP31758_0411  0       18       -3.756946   hypothetical protein
YpsIP31758_0412  0       19       -4.903384   M15 family peptidase
YpsIP31758_0413  0       18       -4.785744   hypothetical protein
YpsIP31758_0414  0       17       -4.890834   insecticidal toxin complex protein
YpsIP31758_0415  1       18       -7.601634   insecticidal toxin complex protein
YpsIP31758_0416  -1      13       -4.787435   insecticidal toxin complex protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0414  SALSPVBPROT   344    e-106    Salmonella virulence plasmid 65kDa B protein signature.
		>SALSPVBPROT#Salmonella virulence plasmid 65kDa B protein signature.

Length = 591

Score = 344 bits (882), Expect = e-106
Identities = 171/378 (45%), Positives = 214/378 (56%), Gaps = 31/378 (8%)

Query: 10 VAPLSLPKGGGAITGMGDSLGAIGPSGMATLTLPLPISAGRGYAPPLALNYSSGNGNGPF 69
+ P LPKGG A L GP G+A++TLPLPISA RG+AP LAL+YSSG GNGPF
Sbjct: 15 ITPPFLPKGGKA-------LSQSGPDGLASITLPLPISAERGFAPALALHYSSGGGNGPF 67

Query: 70 GLGWQLNTMAICRRTSKRVPHYDEHDEFLAPSGEVLVVAIDQQGNIERTEQSLKG----- 124
G+GW TM+I R TS VP Y++ DEFL P GEVLV + G
Sbjct: 68 GVGWSCATMSIARSTSHGVPQYNDSDEFLGPDGEVLVQTLSTGDAPNPVTCFAYGDVSFP 127

Query: 125 EQFSVIRYLPRIEGSFNRIEYWQPRVDNSQA-PFWVVHGSDGQKHCLGYSASARIADPQH 183
+ ++V RY PR E SF R+EYW V NS FW++H S+G H LG +A+AR++DPQ
Sbjct: 128 QSYTVTRYQPRTESSFYRLEYW---VGNSNGDDFWLLHDSNGILHLLGKTAAARLSDPQA 184

Query: 184 PEHIAEWLLEESVSLSGEHICYLYQAEDEQGIDEDEKQNHPAASAQRYLSTVVYGNREVA 243
H A+WL+EESV+ +GEHI Y Y AE+ +D + + SA RYLS V YGN A
Sbjct: 185 ASHTAQWLVEESVTPAGEHIYYSYLAENGDNVDLNGNEAGRDRSAMRYLSKVQYGNATPA 244

Query: 244 HELYCLTQQPTEKSWLFSLIFDHGEYSNIADQVPIAEEGKSWTYRQDAFSHFNYGFEIRT 303
+LY T WLF+L+FD+GE P SW RQD FS +NYGFEIR
Sbjct: 245 ADLYLWTSATPAVQWLFTLVFDYGERGVDPQVPPAFTAQNSWLARQDPFSLYNYGFEIRL 304

Query: 304 RRLCQQVLMYHNLSALAGNEPDQQPTLVSRLRLNYQHDVYATQLVGCQRLAHEPKGKTCS 363
RLC+QVLM+H+ G + TLVSRL L Y + TQL + LA+E G
Sbjct: 305 HRLCRQVLMFHHFPDELG----EADTLVSRLLLEYDENPILTQLCAARTLAYEGDG---- 356

Query: 364 LPPLEFDYQTFPDNNEKP 381
Y+ P NN P
Sbjct: 357 -------YRRAPVNNMMP 367


5     YpsIP31758_0502  YpsIP31758_0551  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias     Product
YpsIP31758_0502  2       19       0.943446    acid-resistance membrane protein
YpsIP31758_0503  2       19       1.115847    protein kinase
YpsIP31758_0504  0       17       2.489089    hypothetical protein
YpsIP31758_0505  -2      23       6.465459    tellurium resistance protein
YpsIP31758_0506  -2      23       6.726345    tellurium resistance protein
YpsIP31758_0507  -2      22       6.490448    tellurium resistance protein
YpsIP31758_0508  -1      22       6.630184    tellurium resistance protein
YpsIP31758_0509  -1      25       7.554274    ShlB/FhaC/HecB family hemolysin
YpsIP31758_0510  -1      24       7.061742    adhesin/hemagglutinin
YpsIP31758_0511  0       24       3.747664    hypothetical protein
YpsIP31758_0512  0       22       2.178712    hypothetical protein
YpsIP31758_0513  0       22       2.231564    hypothetical protein
YpsIP31758_0514  1       24       3.463625    hypothetical protein
YpsIP31758_0515  5       25       -3.953819   hypothetical protein
YpsIP31758_0516  2       23       0.851734    hypothetical protein
YpsIP31758_0517  1       18       2.828415    hypothetical protein
YpsIP31758_0518  1       19       3.585580    hypothetical protein
YpsIP31758_0519  2       20       3.310788    hypothetical protein
YpsIP31758_0520  3       21       2.358941    hypothetical protein
YpsIP31758_0521  3       26       3.764651    hypothetical protein
YpsIP31758_0522  4       25       2.484399    autotransporter protein
YpsIP31758_0523  4       27       1.755410    hypothetical protein
YpsIP31758_0524  3       27       1.628546    hypothetical protein
YpsIP31758_0525  3       28       1.710400    ABC transporter ATP-binding protein
YpsIP31758_0526  3       30       2.325992    hypothetical protein
YpsIP31758_0527  3       27       3.488662    LacI family sugar-binding transcriptional
YpsIP31758_0528  2       22       4.001432    sugar ABC transporter periplasmic protein
YpsIP31758_0529  2       19       4.303128    sugar ABC transporter permease
YpsIP31758_0530  1       18       5.117560    sugar ABC transporter permease
YpsIP31758_0531  2       17       5.883827    hypothetical protein
YpsIP31758_0532  1       17       5.574113    beta-glucosidase
YpsIP31758_0533  -1      15       2.014010    outer membrane efflux protein
YpsIP31758_0534  0       15       -0.016075   fusaric acid resistance domain-containing
YpsIP31758_0535  0       20       -3.218449   multidrug resistance protein MdtN
YpsIP31758_0536  2       25       -6.806163   hypothetical protein
YpsIP31758_0537  3       29       -8.211633   hypothetical protein
YpsIP31758_0538  2       29       -7.499199   aspartate aminotransferase
YpsIP31758_0539  1       31       -7.627044   Na+/H+ antiporter family protein
YpsIP31758_0540  4       30       -7.210139   hypothetical protein
YpsIP31758_0541  4       29       -5.506918   hypothetical protein
YpsIP31758_0542  3       25       0.429853    endoribonuclease L-PSP
YpsIP31758_0543  2       22       2.458016    L-PSP family endoribonuclease
YpsIP31758_0544  2       24       1.285143    hypothetical protein
YpsIP31758_0545  1       22       -0.135787   hypothetical protein
YpsIP31758_0546  1       21       -1.448967   hypothetical protein
YpsIP31758_0547  1       20       -1.123257   LysR family substrate binding transcriptional
YpsIP31758_0548  3       19       -2.250657   Fic family protein
YpsIP31758_0549  2       17       -2.141796   HNH endonuclease domain-containing protein
YpsIP31758_0550  3       17       -2.350151   hypothetical protein
YpsIP31758_0551  2       15       -0.455823   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0510  PF05860       89     2e-22    haemagglutination activity domain.
		>PF05860#haemagglutination activity domain.

Length = 117

Score = 88.7 bits (220), Expect = 2e-22
Identities = 23/141 (16%), Positives = 46/141 (32%), Gaps = 24/141 (17%)

Query: 68 AAIVADGSAPGNQQPTIISSANGTPQVNIQTPSSGGVSRNAYRQFDVDNRGVILNNGRGV 127
A I D + P N + I++ T + T + + + +++F V G N
Sbjct: 1 AQITPDTTLPIN---SNITTEGNTRIIERGTQAGSNLFHS-FQEFSVPTSGTAFFN---- 52

Query: 128 NQTQIAGLVDGNPWLARGEASVILNEVNSRDPSQLNGYIEVAGRKAQVVIANPAGITCEG 187
I++ V S ++G I A + + NP GI
Sbjct: 53 ---------------NPTNIQNIISRVTGGSVSNIDGLIRANAT-ANLFLINPNGIIFGQ 96

Query: 188 CGFINANRATLTTGQAQLNNG 208
++ + + + +L
Sbjct: 97 NARLDIGGSFVGSTANRLKFA 117


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0514  PYOCINKILLER  37     5e-04    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 37.1 bits (85), Expect = 5e-04
Identities = 40/191 (20%), Positives = 65/191 (34%), Gaps = 1/191 (0%)

Query: 767 ANGSIGPIFDKEKEQNRLKEVQLIGEIGGQALDIASTQGKIIATHAANDKMKAVKPEDIA 826
A+ ++GP + + + ++G Q K I + A + + E
Sbjct: 100 ADAALGPAKNLAPLDVINRSLTIVGNALQQKNQKLLLNQKKITSLGAKNFLTRTAEEIGE 159

Query: 827 AAEKQWEKAHPGKAATAEDINQQIYQTAYNQAFNESGFGTGGPVQRGMQAATAAVQGLAG 886
A ++ P D + AYN + + AA A+++ A
Sbjct: 160 QAVREGNINGPEAYMRFLDREMEGLTAAYNVKLFTEAISSLQIRMNTLTAAKASIEAAAA 219

Query: 887 GNLGAALTGASAPYLAGVIKQSTGDNPAANTMAHAVLGAVTAYASGNHALAGAAGAATAE 946
N A A A + AANT A G+V A A+G + A GAA+
Sbjct: 220 -NKAREQAAAEAKRKAEEQARQQAAIRAANTYAMPANGSVVATAAGRGLIQVAQGAASLA 278

Query: 947 LMAPTIISALG 957
I+ LG
Sbjct: 279 QAISDAIAVLG 289


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0519  PYOCINKILLER  37     2e-04    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 37.5 bits (86), Expect = 2e-04
Identities = 40/191 (20%), Positives = 65/191 (34%), Gaps = 1/191 (0%)

Query: 140 ANGSIGPIFDKEKEQNRLKEVQLIGEIGGQALDIASTQGKIIATHAANDKMKAVKPEDIA 199
A+ ++GP + + + ++G Q K I + A + + E
Sbjct: 100 ADAALGPAKNLAPLDVINRSLTIVGNALQQKNQKLLLNQKKITSLGAKNFLTRTAEEIGE 159

Query: 200 AAEKQWEKAHPGKAATAEDINQQIYQTAYNQAFNESGFGTGGPVQRGMQAATAAVQGLAG 259
A ++ P D + AYN + + AA A+++ A
Sbjct: 160 QAVREGNINGPEAYMRFLDREMEGLTAAYNVKLFTEAISSLQIRMNTLTAAKASIEAAAA 219

Query: 260 GNLGAALTGASAPYLAGVIKQSTGDNPAANTMAHAVLGAVTAYASGNHALAGAAGAATAE 319
N A A A + AANT A G+V A A+G + A GAA+
Sbjct: 220 -NKAREQAAAEAKRKAEEQARQQAAIRAANTYAMPANGSVVATAAGRGLIQVAQGAASLA 278

Query: 320 LMAPTIISALG 330
I+ LG
Sbjct: 279 QAISDAIAVLG 289


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0525  PF05272       31     0.007    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 31.2 bits (70), Expect = 0.007
Identities = 11/35 (31%), Positives = 17/35 (48%)

Query: 33 MVIVGPSGCAKSTMLRMIAGLEEISSGELTIADRK 67
+V+ G G KST++ + GL+ S I K
Sbjct: 599 VVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGTGK 633


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0535  RTXTOXIND     52     2e-09    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 51.8 bits (124), Expect = 2e-09
Identities = 63/417 (15%), Positives = 120/417 (28%), Gaps = 96/417 (23%)

Query: 7 SGRKRQLALIVAGVIIIAAAISGWLSVRQTTLNPLSEDAELGASVVH------IASSVPG 60
S R R +A + G ++IA +S + A + H I
Sbjct: 54 SRRPRLVAYFIMGFLVIAFILSVL--------GQVEIVATANGKLTHSGRSKEIKPIENS 105

Query: 61 RIISINVEENSKVRRGDLLFSIEP-----DLYRLQ--VEQAQAELKMAEAAHDTQQR--- 110
+ I V+E VR+GD+L + D + Q + QA+ E + + +
Sbjct: 106 IVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKL 165

Query: 111 ---TVVAERSNAAITNEQIVR----------------AQANLKLATQT------------ 139
+ E ++ E+++R Q L L +
Sbjct: 166 PELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINR 225

Query: 140 -----------LARLQPLRPKGYVTAQQVDDAATAKHDAEVSLKQALKQSVAAEALVSST 188
L L K + V + +A L+ Q E+ + S
Sbjct: 226 YENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSA 285

Query: 189 -------------------ASSEALVVARRAALAIAERELANTQIHAPNDGRVVGLTV-S 228
+ + LA E + I AP +V L V +
Sbjct: 286 KEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHT 345

Query: 229 AGEFVAPDQAIFTLINTEH-WHASAFFRETELKHIKVGDCATVYVMADRQRAIQGRVEGI 287
G V + + ++ + +A + ++ I VG A + V A G + G
Sbjct: 346 EGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRY-GYLVGK 404

Query: 288 GWGVSSEDMLNIPRGLPYVPKSLNWVRVVQRFPVRISLEKPPEDLMRIGATAVVIVR 344
++ + + + GL + V+ + G ++
Sbjct: 405 VKNINLDAIEDQRLGLVF--------NVIISIEENCLSTGNKNIPLSSGMAVTAEIK 453


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0549  PYOCINKILLER  34     0.001    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 33.6 bits (76), Expect = 0.001
Identities = 18/73 (24%), Positives = 35/73 (47%), Gaps = 12/73 (16%)

Query: 293 FRSVRTKFVKSIANNPDVAKRFTLEQIDGLSNGITP-----------SGWVVHHKLPL-D 340
+R R +F ++AN+P+++K+F + + +G P +HHK+ + D
Sbjct: 532 WRDFREQFWIAVANDPELSKQFNPGSLAVMRDGGAPYVRESEQAGGRIKIEIHHKVRVAD 591

Query: 341 DSGTNALDNLVLI 353
G + NLV +
Sbjct: 592 GGGVYNMGNLVAV 604


6     YpsIP31758_0588  YpsIP31758_0675  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias     Product
YpsIP31758_0588  0       15       -3.747857   AraC family transcriptional regulator
YpsIP31758_0589  1       16       -4.318571   hypothetical protein
YpsIP31758_0590  1       19       -5.581014   cystathionine beta-lyase
YpsIP31758_0591  3       24       -8.202189   tonB-system energizer ExbB
YpsIP31758_0592  4       30       -11.494811  biopolymer transport protein ExbD
YpsIP31758_0593  5       32       -13.100764  hypothetical protein
YpsIP31758_0594  3       28       -11.077168  hypothetical protein
YpsIP31758_0595  4       31       -12.816240  TadE-like family protein
YpsIP31758_0596  4       29       -12.405411  lipoprotein
YpsIP31758_0597  4       30       -12.621764  type II secretion system protein
YpsIP31758_0598  4       28       -11.844069  type II secretion system protein F domain
YpsIP31758_0599  4       27       -11.057693  type II/IV secretion system protein
YpsIP31758_0600  5       28       -11.690995  hypothetical protein
YpsIP31758_0601  5       27       -10.078304  type II/III secretion system protein
YpsIP31758_0602  5       32       4.793643    hypothetical protein
YpsIP31758_0603  7       35       6.311829    type IV prepilin peptidase family protein
YpsIP31758_0604  6       33       6.019034    hypothetical protein
YpsIP31758_0605  4       30       4.741234    hypothetical protein
YpsIP31758_0606  4       31       4.973042    hypothetical protein
YpsIP31758_0608  4       29       4.742246    invasin
YpsIP31758_0609  1       13       -2.079806   hypothetical protein
YpsIP31758_0610  0       12       -1.934820   fibronectin type III domain-containing protein/
YpsIP31758_0611  0       14       -1.757642   glycosyl hydrolase family protein
YpsIP31758_0612  0       13       -0.638083   hypothetical protein
YpsIP31758_0613  0       13       -0.729896   fimbrial protein
YpsIP31758_0614  0       13       -1.443921   fimbrial usher protein
YpsIP31758_0615  1       17       -2.623175   chaperone protein PapD
YpsIP31758_0616  3       19       -3.329503   fimbrial family protein
YpsIP31758_0617  1       15       -4.237753   iron-enterobactin transporter periplasmic
YpsIP31758_0618  0       15       -4.811970   hypothetical protein
YpsIP31758_0619  -1      13       -3.924252   hypothetical protein
YpsIP31758_0620  -1      12       -3.433926   lipoprotein
YpsIP31758_0621  -1      13       -2.817027   hypothetical protein
YpsIP31758_0622  0       12       -2.685176   flagellar biosynthesis protein FlhA
YpsIP31758_0623  0       13       -3.989718   flagellar biosynthesis protein FlhB
YpsIP31758_0624  0       13       -3.819527   flagellar biosynthetic protein FliR
YpsIP31758_0625  -1      13       -3.559183   flagellar biosynthetic protein FliQ
YpsIP31758_0626  -1      11       -1.796350   flagellar biosynthesis protein FliP
YpsIP31758_0627  -1      15       -1.839173   flagellar motor switch protein FliN
YpsIP31758_0628  -2      14       -0.895179   lateral flagellar export/assembly protein
YpsIP31758_0629  -1      18       2.740464    sigma-54 dependent transcriptional regulator
YpsIP31758_0630  1       20       3.955437    flagellar hook-basal body complex protein FliE
YpsIP31758_0631  0       19       3.621250    flagellar MS-ring protein
YpsIP31758_0632  0       20       4.240300    flagellar motor switch protein G
YpsIP31758_0633  -2      19       4.248540    flagellar assembly protein H
YpsIP31758_0634  -1      19       3.487920    flagellum-specific ATP synthase
YpsIP31758_0635  2       19       0.766991    flagellar export protein FliJ
YpsIP31758_0636  2       18       1.052652    flagellar protein FlgN
YpsIP31758_0637  1       18       1.471722    anti-sigma-28 factor FlgM
YpsIP31758_0638  1       18       1.881748    flagellar basal body P-ring biosynthesis protein
YpsIP31758_0639  1       16       2.306161    flagellar basal-body rod protein FlgB
YpsIP31758_0640  1       18       2.991938    flagellar basal body rod protein FlgC
YpsIP31758_0641  2       17       3.419757    flagellar basal body rod modification protein
YpsIP31758_0642  2       19       3.347782    flagellar hook protein FlgE
YpsIP31758_0643  0       16       1.073553    flagellar basal body rod protein FlgF
YpsIP31758_0644  0       15       -1.487306   flagellar basal body rod protein FlgG
YpsIP31758_0645  0       17       -1.521968   flagellar basal body L-ring protein
YpsIP31758_0646  0       18       -2.656451   flagellar basal body P-ring protein
YpsIP31758_0647  0       17       -4.619851   peptidoglycan hydrolase
YpsIP31758_0648  0       14       -3.743550   flagellar hook-associated protein FlgK
YpsIP31758_0649  2       14       -2.266518   flagellar hook-associated protein FlgL
YpsIP31758_0650  3       11       -1.613642   flagellar hook associated protein lafW
YpsIP31758_0651  2       13       -2.799701   hypothetical protein
YpsIP31758_0652  2       13       -2.150829   transcriptional regulator
YpsIP31758_0653  3       16       -0.758842   flagellin
YpsIP31758_0654  2       17       -2.036113   flagellin
YpsIP31758_0655  1       19       -2.982564   flagellar hook-associated protein 2
YpsIP31758_0656  3       20       -2.217178   flagellar protein FliS
YpsIP31758_0657  2       18       -1.277642   flagellar protein lafD
YpsIP31758_0658  3       18       -1.760460   flagellar hook-length control protein
YpsIP31758_0659  4       18       -4.501587   flagellar protein
YpsIP31758_0660  2       15       -2.230487   flagellar biosynthesis sigma factor
YpsIP31758_0661  0       14       -1.429870   flagellar motor protein MotA
YpsIP31758_0662  -1      13       0.079563    hypothetical protein
YpsIP31758_0663  -1      16       2.482805    hypothetical protein
YpsIP31758_0664  -1      16       2.447844    hypothetical protein
YpsIP31758_0665  -2      15       2.454628    outer membrane lipoprotein PcP
YpsIP31758_0666  -1      12       -0.925530   sensory box-containing diguanylate cyclase
YpsIP31758_0667  -1      13       -0.841904   RND family efflux transporter MFP subunit
YpsIP31758_0668  2       17       2.888183    RND efflux transporter
YpsIP31758_0670  2       17       2.268991    hypothetical protein
YpsIP31758_0669  2       15       1.998845    hypothetical protein
YpsIP31758_0671  2       14       2.137171    enterotoxin
YpsIP31758_0672  3       16       3.482222    hypothetical protein
YpsIP31758_0673  3       16       3.628059    autotransporter protein
YpsIP31758_0674  1       14       -0.827392   hypothetical protein
YpsIP31758_0675  0       13       -4.360730   outer membrane protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0595  MICOLLPTASE   28     0.024    Microbial collagenase metalloprotease (M9) signature.
		>MICOLLPTASE#Microbial collagenase metalloprotease (M9) signature.

Length = 1104

Score = 27.8 bits (61), Expect = 0.024
Identities = 14/54 (25%), Positives = 21/54 (38%), Gaps = 2/54 (3%)

Query: 62 DAENVLSYQQLFEHNFNRQVTVLGSLINTAPSAELTVNFSHSVADLINGNSEEN 115
D Y +F H N T +N P A + + S V + IN + E+
Sbjct: 747 DGNGNYVYDVVF-HGMN-TDTNTDVHVNKEPKAVIKSDSSVIVEEEINFDGTES 798


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0601  BCTERIALGSPD  125    4e-33    Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D signature.

Length = 660

Score = 125 bits (315), Expect = 4e-33
Identities = 62/265 (23%), Positives = 120/265 (45%), Gaps = 43/265 (16%)

Query: 170 EYRGVINKIKLPQANQVNVKLTIVEITKDFTENIGLDW---------------NSIKSAA 214
+ VI ++ + + QV V+ I E+ N+G+ W + A
Sbjct: 332 DLERVIAQLDIRRP-QVLVEAIIAEVQDADGLNLGIQWANKNAGMTQFTNSGLPISTAIA 390

Query: 215 GAFQF---------------------LNFNAQSISTLVHAINDEAIAKVLAEPNLSVLSG 253
GA Q+ F + + L+ A++ +LA P++ L
Sbjct: 391 GANQYNKDGTVSSSLASALSSFNGIAAGFYQGNWAMLLTALSSSTKNDILATPSIVTLDN 450

Query: 254 EYASFLVGGEIPIVSTNQNG------ISVEYKEFGIKLNIGAKVNEKKRIRVMLGEEVSS 307
A+F VG E+P+++ +Q +VE K GIKL + ++NE + + + +EVSS
Sbjct: 451 MEATFNVGQEVPVLTGSQTTSGDNIFNTVERKTVGIKLKVKPQINEGDSVLLEIEQEVSS 510

Query: 308 IDKVFNLRGGDSYPSLRIRKANTTVELGDGESFILGGLISSTERESLKKIPFIGDVPLLG 367
+ + D + R N V +G GE+ ++GGL+ + ++ K+P +GD+P++G
Sbjct: 511 VADAASSTSSDLGATFNTRTVNNAVLVGSGETVVVGGLLDKSVSDTADKVPLLGDIPVIG 570

Query: 368 ALFRNAQTQRNQSELVVVATVNLVK 392
ALFR+ + ++ L++ +++
Sbjct: 571 ALFRSTSKKVSKRNLMLFIRPTVIR 595


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0603  PREPILNPTASE  39     2e-06    Type IV prepilin cysteine protease (C20) family sig...
		>PREPILNPTASE#Type IV prepilin cysteine protease (C20) family signature.

Length = 290

Score = 38.6 bits (90), Expect = 2e-06
Identities = 19/141 (13%), Positives = 56/141 (39%), Gaps = 11/141 (7%)

Query: 1 MVLIVSQLLFVCYSDIRHRIISNKFVISIAFNAIILIL----------VTHHTVSIIIPI 50
+L+ L+ + + D+ ++ ++ + + + ++ L V ++
Sbjct: 137 ALLLTWVLVALTFIDLDKMLLPDQLTLPLLWGGLLFNLLGGFVSLGDAVIGAMAGYLVLW 196

Query: 51 VALFFGYIIFHFNVMGGGDVKLITVLLLALTAEQSLNFIIYTAIMGGVVMLVGLLINRAD 110
+ ++ MG GD KL+ L L + ++ ++++G + + +L+
Sbjct: 197 SLYWAFKLLTGKEGMGYGDFKLLAALGAWLGWQALPIVLLLSSLVGAFMGIGLILLRNHH 256

Query: 111 IQKRGIPYAIAITAGFLLSVL 131
K IP+ + +++L
Sbjct: 257 QSK-PIPFGPYLAIAGWIALL 276


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0608  INTIMIN       459    e-141    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 459 bits (1183), Expect = e-141
Identities = 267/852 (31%), Positives = 391/852 (45%), Gaps = 87/852 (10%)

Query: 61 SKADTMVSYSSTEPYVLGSGETVAMVAKKYGITVDELKKIN--IYRTFSRPFTALTTGDE 118
SK T SY + Y L +GETVA ++K I + + +N +Y + S A G +
Sbjct: 51 SKLLTHNSYQNRLFYTLKTGETVADLSKSQDINLSTIWSLNKHLYSSESEMMKA-EPGQQ 109

Query: 119 IDIPRKASPF-----------------------------SVDNNKDNRLSVENTLAGHAV 149
I +P K PF S D K N ++ +A
Sbjct: 110 IILPLKKLPFEYSALPLLGSAPLVAAGGVAGHTNKLTKMSPDVTKSN--MTDDKALNYAA 167

Query: 150 AGATALS--------NGDVAKSGERMVRSAASNEFNNSAQQWLSQFGTARIQLNINDDFH 201
A +L NGD AK A N+ ++ Q WL +GTA + L ++F
Sbjct: 168 QQAASLGSQLQSRSLNGDYAKD---TALGIAGNQASSQLQAWLQHYGTAEVNLQSGNNF- 223

Query: 202 LDGSAADVLIPLYDNEKSILFTQLGARNKDSRNTVNMGAGVRTFQGNWMYGANTFFDNDL 261
DGS+ D L+P YD+EK + F Q+GAR DSR T N+GAG R F M G N F D D
Sbjct: 224 -DGSSLDFLLPFYDSEKMLAFGQVGARYIDSRFTANLGAGQRFFLPENMLGYNVFIDQDF 282

Query: 262 TGKNRRIGVGAEAWTDYLKLSANNYFGITDWHQSRDFIDYNERPANGYDLRAEAYLPSYP 321
+G N R+G+G E W DY K S N YF ++ WH+S + DY+ERPANG+D+R YLPSYP
Sbjct: 283 SGDNTRLGIGGEYWRDYFKSSVNGYFRMSGWHESYNKKDYDERPANGFDIRFNGYLPSYP 342

Query: 322 QLGGKAMYEKYRGDDVALFGKDNRQKNPHAITAGVNYTPIPLVTIGAEHRAGKGGQNDSN 381
LG K MYE+Y GD+VALF D Q NP A T GVNYTPIPLVT+G ++R G G +ND
Sbjct: 343 ALGAKLMYEQYYGDNVALFNSDKLQSNPGAATVGVNYTPIPLVTMGIDYRHGTGNENDLL 402

Query: 382 INFQLNYRLGETWQSHIDPSAVAASRTLAGSRYDLVERNNHIVLDYQKQNLVRLSLPDSL 441
+ Q Y+ + W I+P V RTL+GSRYDLV+RNN+I+L+Y+KQ+++ L++P +
Sbjct: 403 YSMQFRYQFDKPWSQQIEPQYVNELRTLSGSRYDLVQRNNNIILEYKKQDILSLNIPHDI 462

Query: 442 AGDPFSQLSVTAQVTATHGLERIDWQSAELMAAGGVLKQT---SKNGLEITLPEYQMNRT 498
G S + V + +GL+RI W + L + GG ++ + S + LP Y +
Sbjct: 463 NGTERSTQKIQLIVKSKYGLDRIVWDDSALRSQGGQIQHSGSQSAQDYQAILPAYV--QG 520

Query: 499 GGNSYILNAIAYDTQGNASSQASMLITV--NAQKINIANST-LVAAPVNIEANNSDTSVV 555
G N Y + A AYD GN+S+ + ITV N Q ++ T A + +A+ ++
Sbjct: 521 GSNVYKVTARAYDRNGNSSNNVLLTITVLSNGQVVDQVGVTDFTADKTSAKADGTEAITY 580

Query: 556 TLTLKDGN----NIPVTGQNVTFLSPLGTLSAMTDSGNGVYTATLTAGTVSGTTAVSSNI 611
T T+K N+PV+ V+ + L SA T+ G+G T TL + +
Sbjct: 581 TATVKKNGVAQANVPVSFNIVSGTAVLSANSANTN-GSGKATVTLKSDKPGQVVVSAKTA 639

Query: 612 NGIALDMTPATVTLNGNSGELSTTNSTLVAAPVNIEANGSDTSVVTLTLR-DNNNNPVTG 670
+ A + ++ T S A + +T T++ + PV+
Sbjct: 640 EMTSALNANAVIFVD------QTKASITEIKADKTTAVANGQDAITYTVKVMKGDKPVSN 693

Query: 671 QTVNFAGTLGTLG--AVTEGSSGVYTATLTAGIIVGTSSITASVSSTALGVTPATVILNG 728
Q V F TLG L ++G TLT+ G S ++A VS A+ V V
Sbjct: 694 QEVTFTTTLGKLSNSTEKTDTNGYAKVTLTST-TPGKSLVSARVSDVAVDVKAPEVEF-- 750

Query: 729 DSSNLSTTNSTLVAAPVNIEANGSDTSVVTLTLR-DNNNNPVTGQTVNFAGTLGTLG-AV 786
T T+ + I G + T+ L+ N +G + A
Sbjct: 751 ------FTTLTIDDGNIEIVGTGVKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPAIAS 804

Query: 787 TEGSSGVYTATLTAGIIVGTSSITASVNSTALGVTPATVILN-------GDSGNLSTTNS 839
+ SSG T + S + + + ++ N D+ N
Sbjct: 805 VDASSGQVTLKEKGTTTISVISSDNQTATYTIATPNSLIVPNMSKRVTYNDAVNTCKNFG 864

Query: 840 TLVAAPVNIEAN 851
+ + N N
Sbjct: 865 GKLPSSQNELEN 876



Score = 88.6 bits (219), Expect = 2e-19
Identities = 81/416 (19%), Positives = 140/416 (33%), Gaps = 51/416 (12%)

Query: 1467 TLRDNNNNPVTGQTVAFTSTLGTLGNVTEQASGVYTATLTA----GTVSGVASLSVSVGG 1522
LR + + L + S VY T A G S L+++V
Sbjct: 491 ALRSQGGQIQHSGSQSAQDYQAILPAYVQGGSNVYKVTARAYDRNGNSSNNVLLTITVLS 550

Query: 1523 NALGVTPATVTLNGDSGNLSTTNSMLVAAPVNIEANGSDTSVVTLTLRDSN----NNPVT 1578
N V VT A + +A+G++ T T++ + N PV+
Sbjct: 551 NGQVVDQVGVT-------------DFTADKTSAKADGTEAITYTATVKKNGVAQANVPVS 597

Query: 1579 GQTVTFTSTLGTLDNVTEQASGVYTATLTAGTVSGVASLSASVGGNALGVTPATVTLNGD 1638
V+ T+ L T SG T TL + V + + + A + ++
Sbjct: 598 FNIVSGTAVLSANSANTN-GSGKATVTLKSDKPGQVVVSAKTAEMTSALNANAVIFVD-- 654

Query: 1639 SGNLSTTNSTLVAAPVNIEANGSDTSVVTLTLR-DNNNNPVTGQTVAFTSTLGTLGN--V 1695
T S A + +T T++ + PV+ Q V FT+TLG L N
Sbjct: 655 ----QTKASITEIKADKTTAVANGQDAITYTVKVMKGDKPVSNQEVTFTTTLGKLSNSTE 710

Query: 1696 TEQASGFYTATLTAGTVSGVASLSVSVGGSALGVTPATVTLNGDSGNLSTTNSTLVAAPV 1755
+G+ TLT+ T G + +S V A+ V V T T+ +
Sbjct: 711 KTDTNGYAKVTLTSTTP-GKSLVSARVSDVAVDVKAPEVEF--------FTTLTIDDGNI 761

Query: 1756 NIEANGSDTSVVTLTLR-DSNNNPVTGQTVAFTSTL---------GTLGNVTEQASGLYT 1805
I G + T+ L+ N +G +T + G VT + G T
Sbjct: 762 EIVGTGVKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPAIASVDASSGQVTLKEKGTTT 821

Query: 1806 ATLTAGAVAGVASLSVSVGGNALGVTPATVTLNGDSGNLSTTNSTLVAAPVNIEAN 1861
++ + A+ +++ + + + D+ N + + N N
Sbjct: 822 ISVISSD-NQTATYTIATPNSLIVPNMSKRVTYNDAVNTCKNFGGKLPSSQNELEN 876



Score = 85.9 bits (212), Expect = 1e-18
Identities = 78/412 (18%), Positives = 145/412 (35%), Gaps = 43/412 (10%)

Query: 962 TLRDNNNNPVTGQTVAFTSTLGTLGTVSEGSSGVYTATLTAETVAGVASLSVSVGGSALG 1021
LR + + L +G S VY T A G +S +V + +
Sbjct: 491 ALRSQGGQIQHSGSQSAQDYQAILPAYVQGGSNVYKVTARAYDRNGNSSNNVLL--TITV 548

Query: 1022 VTPATVTLNGDSGNLSTTNSTLVAAPINIEANGSDTSVVTLTLRDNN----NNPVTGQTV 1077
++ V + + + + +A+G++ T T++ N N PV+ V
Sbjct: 549 LSNGQVVDQVGVTDFTADKT-------SAKADGTEAITYTATVKKNGVAQANVPVSFNIV 601

Query: 1078 AFTSTLGTLGTVSEGSSGVYTATLTAETVAGVASLSVSVNSTALGVTPATVILNGDSGNL 1137
+ T+ L + + + SG T TL ++ V + + T+ A + ++
Sbjct: 602 SGTAVL-SANSANTNGSGKATVTLKSDKPGQVVVSAKTAEMTSALNANAVIFVD------ 654

Query: 1138 STTNSTLVAAPVNIEANGSDTSVVTLTLRDSNN-NPVTGQTVAFTSTLGTLDN--VTEQA 1194
T S A + +T T++ PV+ Q V FT+TLG L N
Sbjct: 655 QTKASITEIKADKTTAVANGQDAITYTVKVMKGDKPVSNQEVTFTTTLGKLSNSTEKTDT 714

Query: 1195 SGVYTATLTAGTVSGVASLSASVGGSALGVTPATVILNGDSGNLSTTNSTLVAAPVNIEA 1254
+G TLT+ T G + +SA V A+ V V T T+ + I
Sbjct: 715 NGYAKVTLTSTTP-GKSLVSARVSDVAVDVKAPEVEF--------FTTLTIDDGNIEIVG 765

Query: 1255 NSSDTSVVTLTLR-DNNNNPVTGQTVAFTSTL---------GTLGNVTEQASGLYTATLT 1304
+ T+ L+ N +G +T + G VT + G T ++
Sbjct: 766 TGVKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPAIASVDASSGQVTLKEKGTTTISVI 825

Query: 1305 AGTVSGVASLSVSVGGNALGVTPATVTLNGDSGNLSTTNSTLVAAPVNIEAN 1356
+ + A+ +++ + + + D+ N + + N N
Sbjct: 826 SSD-NQTATYTIATPNSLIVPNMSKRVTYNDAVNTCKNFGGKLPSSQNELEN 876



Score = 84.4 bits (208), Expect = 4e-18
Identities = 80/416 (19%), Positives = 139/416 (33%), Gaps = 51/416 (12%)

Query: 2073 TLRDNNNNPVTGQTVAFTSTLGTLGNVTEQASGVYTATLTAGTVAGVAS----LSVSVGG 2128
LR + + L + S VY T A G +S L+++V
Sbjct: 491 ALRSQGGQIQHSGSQSAQDYQAILPAYVQGGSNVYKVTARAYDRNGNSSNNVLLTITVLS 550

Query: 2129 NALGVTPATVILNGDSGNLSTTNSTLVAAPVNIEANGSDTSVVTLTLRDNN----NNPVT 2184
N V V + A + +A+G++ T T++ N N PV+
Sbjct: 551 NGQVVDQVGV-------------TDFTADKTSAKADGTEAITYTATVKKNGVAQANVPVS 597

Query: 2185 GQTVAFTSTLGTLGNVTEQASGVYTATLTAGTVAGVASLSINVGGNALGVTPATVTLNGD 2244
V+ T+ L T SG T TL + V + + A + ++
Sbjct: 598 FNIVSGTAVLSANSANTN-GSGKATVTLKSDKPGQVVVSAKTAEMTSALNANAVIFVD-- 654

Query: 2245 SGNLSVTNSTLVAAPVNIEANSSDTSVVTLTLR-DNNNNPVTGQTVAFTSTLGTLGN--V 2301
S A ++ +T T++ + PV+ Q V FT+TLG L N
Sbjct: 655 ----QTKASITEIKADKTTAVANGQDAITYTVKVMKGDKPVSNQEVTFTTTLGKLSNSTE 710

Query: 2302 TEQASGVYTATLTAGTVSGVASLSVSVGGSALGVTPATVTLNGDSGNLSTTNSTLVAAPV 2361
+G TLT+ T G + +S V A+ V V T T+ +
Sbjct: 711 KTDTNGYAKVTLTSTTP-GKSLVSARVSDVAVDVKAPEVEF--------FTTLTIDDGNI 761

Query: 2362 NIEANGSDTSVVTLTLR-DNNNNPVTGQTVAFTSTL---------GTLGNVTEQASGVYT 2411
I G + T+ L+ N +G +T + G VT + G T
Sbjct: 762 EIVGTGVKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPAIASVDASSGQVTLKEKGTTT 821

Query: 2412 ATLTAGTVAGVASLSVSVGGNALGVTPATVTLNGDSGNLSTTNSTLVAAPVNIEAN 2467
++ + A+ +++ + + + D+ N + + N N
Sbjct: 822 ISVISSD-NQTATYTIATPNSLIVPNMSKRVTYNDAVNTCKNFGGKLPSSQNELEN 876



Score = 83.6 bits (206), Expect = 9e-18
Identities = 89/473 (18%), Positives = 156/473 (32%), Gaps = 50/473 (10%)

Query: 1164 TLRDSNNNPVTGQTVAFTSTLGTLDNVTEQASGVYTATLTAGTVSGVASLSASVGGSALG 1223
LR + + L + S VY T A +G + S +V +
Sbjct: 491 ALRSQGGQIQHSGSQSAQDYQAILPAYVQGGSNVYKVTARAYDRNG--NSSNNVLLTITV 548

Query: 1224 VTPATVILNGDSGNLSTTNSTLVAAPVNIEANSSDTSVVTLTLRDNN----NNPVTGQTV 1279
++ V+ + + + + +A+ ++ T T++ N N PV+ V
Sbjct: 549 LSNGQVVDQVGVTDFTADKT-------SAKADGTEAITYTATVKKNGVAQANVPVSFNIV 601

Query: 1280 AFTSTLGTLGNVTEQASGLYTATLTAGTVSGVASLSVSVGGNALGVTPATVTLNGDSGNL 1339
+ T+ L T SG T TL + V + + + A + ++
Sbjct: 602 SGTAVLSANSANTN-GSGKATVTLKSDKPGQVVVSAKTAEMTSALNANAVIFVD------ 654

Query: 1340 STTNSTLVAAPVNIEANGSDTSVVTLTLR-DNNNNPVTGQTVAFTSTLGTLGN--VTEQA 1396
T S A + +T T++ + PV+ Q V FT+TLG L N
Sbjct: 655 QTKASITEIKADKTTAVANGQDAITYTVKVMKGDKPVSNQEVTFTTTLGKLSNSTEKTDT 714

Query: 1397 SGVYTATLTAGTVSGVASLSVSVGGNALGVTPATVILNGDSGNLSTTNSTLVAAPVNIEA 1456
+G TLT+ T G + +S V A+ V V T T+ + I
Sbjct: 715 NGYAKVTLTSTTP-GKSLVSARVSDVAVDVKAPEVEF--------FTTLTIDDGNIEIVG 765

Query: 1457 NGSDTSVVTLTLR-DNNNNPVTGQTVAFTSTLGTLGNVTEQASGVYTATLTAGTVSGVAS 1515
G + T+ L+ N +G +T + AS +G V+
Sbjct: 766 TGVKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPAIASVDAS--------SGQVTLKEK 817

Query: 1516 LSVSVGGNALGVTPATVTLNGDSGNLSTTNSMLVAAPVN--IEANGSDTSVVTLTLRDSN 1573
+ ++ + AT T+ T NS++V + +T S+
Sbjct: 818 GTTTISVISSDNQTATYTIA-------TPNSLIVPNMSKRVTYNDAVNTCKNFGGKLPSS 870

Query: 1574 NNPVTGQTVTFTSTLGTLDNVTEQASGVYTATLTAGTVSGVASLSASVGGNAL 1626
N + + + + Q + SGVAS V N L
Sbjct: 871 QNELENVFKAWGAANKYEYYKSSQTIISWVQQTAQDAKSGVASTYDLVKQNPL 923



Score = 83.2 bits (205), Expect = 1e-17
Identities = 78/383 (20%), Positives = 134/383 (34%), Gaps = 39/383 (10%)

Query: 624 TLNGNSGELSTTNSTLVAAPVNIEANGSDTSVVTLTLRDNNNNPVTGQTVNFAGTLGTLG 683
+NG + + D+++ + + ++ + Q L
Sbjct: 461 DINGTERSTQKIQLIVKSKYGLDRIVWDDSALRSQGGQIQHSGSQSAQDYQA-----ILP 515

Query: 684 AVTEGSSGVYTATLTAGIIVGTSSITASVSSTALGVTPATVILNGDSSNLSTTNSTLVAA 743
A +G S VY T A G SS ++ T ++ V+ ++ + +
Sbjct: 516 AYVQGGSNVYKVTARAYDRNGNSSNNVLLTITV--LSNGQVVDQVGVTDFTADKT----- 568

Query: 744 PVNIEANGSDTSVVTLTLRDNN----NNPVTGQTVNFAGTLGTLGAVTEGSSGVYTATLT 799
+ +A+G++ T T++ N N PV+ V+ L A T GS G T TL
Sbjct: 569 --SAKADGTEAITYTATVKKNGVAQANVPVSFNIVSGTAVLSANSANTNGS-GKATVTLK 625

Query: 800 AGIIVGTSSITASVNSTALGVTPATVILNGDSGNLSTTNSTLVAAPVNIEANGSDTSVVT 859
+ + T+ A + ++ T S A + +T
Sbjct: 626 SDKPGQVVVSAKTAEMTSALNANAVIFVD------QTKASITEIKADKTTAVANGQDAIT 679

Query: 860 LTLR-DNNNNPVTGQTVAFTSTLGTLGN--VTEQASGLYTATLTAGTVSGVASLSVSVGG 916
T++ + PV+ Q V FT+TLG L N +G TLT+ T G + +S V
Sbjct: 680 YTVKVMKGDKPVSNQEVTFTTTLGKLSNSTEKTDTNGYAKVTLTSTTP-GKSLVSARVSD 738

Query: 917 NALGVTPATVTLNGDSGNLSTTNSTLVAAPVNIEANGSDTSVVTLTLR-DNNNNPVTGQT 975
A+ V V T T+ + I G + T+ L+ N +G
Sbjct: 739 VAVDVKAPEVEF--------FTTLTIDDGNIEIVGTGVKGKLPTVWLQYGQVNLKASGGN 790

Query: 976 VAFTSTLGTLGTVS-EGSSGVYT 997
+T S + SSG T
Sbjct: 791 GKYTWRSANPAIASVDASSGQVT 813



Score = 80.5 bits (198), Expect = 7e-17
Identities = 79/416 (18%), Positives = 138/416 (33%), Gaps = 51/416 (12%)

Query: 1669 TLRDNNNNPVTGQTVAFTSTLGTLGNVTEQASGFYTATLTA----GTVSGVASLSVSVGG 1724
LR + + L + S Y T A G S L+++V
Sbjct: 491 ALRSQGGQIQHSGSQSAQDYQAILPAYVQGGSNVYKVTARAYDRNGNSSNNVLLTITVLS 550

Query: 1725 SALGVTPATVTLNGDSGNLSTTNSTLVAAPVNIEANGSDTSVVTLTLRDSN----NNPVT 1780
+ V VT A + +A+G++ T T++ + N PV+
Sbjct: 551 NGQVVDQVGVT-------------DFTADKTSAKADGTEAITYTATVKKNGVAQANVPVS 597

Query: 1781 GQTVAFTSTLGTLGNVTEQASGLYTATLTAGAVAGVASLSVSVGGNALGVTPATVTLNGD 1840
V+ T+ L T SG T TL + V + + + A + ++
Sbjct: 598 FNIVSGTAVLSANSANTN-GSGKATVTLKSDKPGQVVVSAKTAEMTSALNANAVIFVD-- 654

Query: 1841 SGNLSTTNSTLVAAPVNIEANGSDTSVVTLTLR-DNNNNPVTGQTVAFTSTLGTLGN--V 1897
T S A + +T T++ + PV+ Q V FT+TLG L N
Sbjct: 655 ----QTKASITEIKADKTTAVANGQDAITYTVKVMKGDKPVSNQEVTFTTTLGKLSNSTE 710

Query: 1898 TEQASGVYTATLTAGTVAGVASLSVSVGGNALGVTPATVILNGDSGNLSTTNSTLVAAPV 1957
+G TLT+ T G + +S V A+ V V T T+ +
Sbjct: 711 KTDTNGYAKVTLTSTTP-GKSLVSARVSDVAVDVKAPEVEF--------FTTLTIDDGNI 761

Query: 1958 NIEANGSDTSVVTLTLR-DSNNNPVTGQTVAFTSTL---------GTLGNVTEQASGLYT 2007
I G + T+ L+ N +G +T + G VT + G T
Sbjct: 762 EIVGTGVKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPAIASVDASSGQVTLKEKGTTT 821

Query: 2008 ATLTAGAVAGVASLSVSVGGNALGVTPATVTLNGDSGNLSTTNSTLVAAPVNIEAN 2063
++ + A+ +++ + + + D+ N + + N N
Sbjct: 822 ISVISSD-NQTATYTIATPNSLIVPNMSKRVTYNDAVNTCKNFGGKLPSSQNELEN 876



Score = 79.7 bits (196), Expect = 1e-16
Identities = 81/420 (19%), Positives = 139/420 (33%), Gaps = 51/420 (12%)

Query: 1871 TLRDNNNNPVTGQTVAFTSTLGTLGNVTEQASGVYTATLTAGTVAGVAS----LSVSVGG 1926
LR + + L + S VY T A G +S L+++V
Sbjct: 491 ALRSQGGQIQHSGSQSAQDYQAILPAYVQGGSNVYKVTARAYDRNGNSSNNVLLTITVLS 550

Query: 1927 NALGVTPATVILNGDSGNLSTTNSTLVAAPVNIEANGSDTSVVTLTLRDSN----NNPVT 1982
N V V + A + +A+G++ T T++ + N PV+
Sbjct: 551 NGQVVDQVGV-------------TDFTADKTSAKADGTEAITYTATVKKNGVAQANVPVS 597

Query: 1983 GQTVAFTSTLGTLGNVTEQASGLYTATLTAGAVAGVASLSVSVGGNALGVTPATVTLNGD 2042
V+ T+ L T SG T TL + V + + + A + ++
Sbjct: 598 FNIVSGTAVLSANSANTN-GSGKATVTLKSDKPGQVVVSAKTAEMTSALNANAVIFVD-- 654

Query: 2043 SGNLSTTNSTLVAAPVNIEANGSDTSVVTLTLR-DNNNNPVTGQTVAFTSTLGTLGN--V 2099
T S A + +T T++ + PV+ Q V FT+TLG L N
Sbjct: 655 ----QTKASITEIKADKTTAVANGQDAITYTVKVMKGDKPVSNQEVTFTTTLGKLSNSTE 710

Query: 2100 TEQASGVYTATLTAGTVAGVASLSVSVGGNALGVTPATVILNGDSGNLSTTNSTLVAAPV 2159
+G TLT+ T G + +S V A+ V V T T+ +
Sbjct: 711 KTDTNGYAKVTLTSTTP-GKSLVSARVSDVAVDVKAPEVEF--------FTTLTIDDGNI 761

Query: 2160 NIEANGSDTSVVTLTLR-DNNNNPVTGQTVAFTSTL---------GTLGNVTEQASGVYT 2209
I G + T+ L+ N +G +T + G VT + G T
Sbjct: 762 EIVGTGVKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPAIASVDASSGQVTLKEKGTTT 821

Query: 2210 ATLTAGTVAGVASLSINVGGNALGVTPATVTLNGDSGNLSVTNSTLVAAPVNIEANSSDT 2269
++ + A+ +I + + + D+ N + + N N
Sbjct: 822 ISVISSD-NQTATYTIATPNSLIVPNMSKRVTYNDAVNTCKNFGGKLPSSQNELENVFKA 880



Score = 75.5 bits (185), Expect = 3e-15
Identities = 76/407 (18%), Positives = 135/407 (33%), Gaps = 39/407 (9%)

Query: 2275 TLRDNNNNPVTGQTVAFTSTLGTLGNVTEQASGVYTATLTA----GTVSGVASLSVSVGG 2330
LR + + L + S VY T A G S L+++V
Sbjct: 491 ALRSQGGQIQHSGSQSAQDYQAILPAYVQGGSNVYKVTARAYDRNGNSSNNVLLTITVLS 550

Query: 2331 SALGVTPATVTLNGDSGNLSTTNSTLVAAPVNIEANGSDTSVVTLTLRDNN----NNPVT 2386
+ V VT A + +A+G++ T T++ N N PV+
Sbjct: 551 NGQVVDQVGVT-------------DFTADKTSAKADGTEAITYTATVKKNGVAQANVPVS 597

Query: 2387 GQTVAFTSTLGTLGNVTEQASGVYTATLTAGTVAGVASLSVSVGGNALGVTPATVTLNGD 2446
V+ T+ L T SG T TL + V + + + A + ++
Sbjct: 598 FNIVSGTAVLSANSANTN-GSGKATVTLKSDKPGQVVVSAKTAEMTSALNANAVIFVD-- 654

Query: 2447 SGNLSTTNSTLVAAPVNIEANGSDTSVVTLTLRDSNN-NPVTGQTVAFTSTLGTLDN--V 2503
T S A + +T T++ PV+ Q V FT+TLG L N
Sbjct: 655 ----QTKASITEIKADKTTAVANGQDAITYTVKVMKGDKPVSNQEVTFTTTLGKLSNSTE 710

Query: 2504 TEQASGLYTATLTAGTLTGTASLSVNVDGNNLGTTPATINVIPAPVDLTVLTDNARKNIG 2563
+G TLT+ T G + +S V + + + D + +G
Sbjct: 711 KTDTNGYAKVTLTSTTP-GKSLVSARVSDVAVDVKAPEVEFFTTL----TIDDGNIEIVG 765

Query: 2564 QAIS---LTVIAKYKSTDVVAPNVKMTFEQVAVVNRQNSPVSSSGVVQIADANYDAFTGM 2620
+ TV +Y ++ A + + S +SSG V + + + +
Sbjct: 766 TGVKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPAIASVDASSGQVTLKEKGTTTISVI 825

Query: 2621 TDANGQLTVSVTDPNGIGVQTTLRAKAESGDMENTNVTFNVITSPDS 2667
+ N T ++ PN + V + + + + S +
Sbjct: 826 SSDNQTATYTIATPNSLIVPNMSKRVTYNDAVNTCKNFGGKLPSSQN 872



Score = 42.4 bits (99), Expect = 3e-05
Identities = 58/309 (18%), Positives = 95/309 (30%), Gaps = 42/309 (13%)

Query: 2376 TLRDNNNNPVTGQTVAFTSTLGTLGNVTEQASGVYTATLTAGTVAGVAS----LSVSVGG 2431
LR + + L + S VY T A G +S L+++V
Sbjct: 491 ALRSQGGQIQHSGSQSAQDYQAILPAYVQGGSNVYKVTARAYDRNGNSSNNVLLTITVLS 550

Query: 2432 NALGVTPATVTLNGDSGNLSTTNSTLVAAPVNIEANGSDTSVVTLTLRDSN----NNPVT 2487
N V VT A + +A+G++ T T++ + N PV+
Sbjct: 551 NGQVVDQVGVT-------------DFTADKTSAKADGTEAITYTATVKKNGVAQANVPVS 597

Query: 2488 GQTVAFTSTLGTLDNVTEQASGLYTATLTAGTLTGTASLSVNVDGNNLGTTPATINVIPA 2547
V+ T+ L T SG T TL + + + + A I V
Sbjct: 598 FNIVSGTAVLSANSANTN-GSGKATVTLKSDKPGQVVVSAKTAEMTSALNANAVIFVDQT 656

Query: 2548 PVDLT-VLTDNARKNIGQAISLTVIAKYKSTDVVAPNVKMTFEQVAVVNRQNSPVSSSGV 2606
+T + D ++T K D N ++TF + NS
Sbjct: 657 KASITEIKADKTTAVANGQDAITYTVKVMKGDKPVSNQEVTFTT-TLGKLSNSTEK---- 711

Query: 2607 VQIADANYDAFTGMTDANGQLTVSVTDPNGIGVQTTLRAKAESGDMENTNVTFNVITSPD 2666
TD NG V++T + R + D++ V F + D
Sbjct: 712 --------------TDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFFTTLTID 757

Query: 2667 SAQASMWGN 2675
+ G
Sbjct: 758 DGNIEIVGT 766


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0610  PF07675       31     0.016    Cleaved Adhesin
		>PF07675#Cleaved Adhesin

Length = 1358

Score = 30.8 bits (69), Expect = 0.016
Identities = 20/60 (33%), Positives = 31/60 (51%), Gaps = 3/60 (5%)

Query: 326 NSLELMWGTSTSSTGLKSYLVYREGEKIAEVLVPQYTFQETGLSPDTTYRYFVAAQDTQG 385
+ ++ G S + T +Y VYR+G KI E L + TF+E G+ + Y V + T G
Sbjct: 978 DDIQFTMGGSPTPTDY-TYTVYRDGTKIKEGLT-ETTFEEDGV-ATGNHEYCVEVKYTAG 1034



Score = 29.7 bits (66), Expect = 0.030
Identities = 17/43 (39%), Positives = 23/43 (53%), Gaps = 2/43 (4%)

Query: 343 SYLVYREGEKIAEVLVPQYTFQETGLSPDTTYRYFVAAQDTQG 385
+Y VYR+G KI E L TF+E G+ + Y V + T G
Sbjct: 535 TYTVYRDGTKIKEGLT-ATTFEEDGV-AAGNHEYCVEVKYTAG 575


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0614  PF00577       678    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 678 bits (1752), Expect = 0.0
Identities = 226/848 (26%), Positives = 369/848 (43%), Gaps = 59/848 (6%)

Query: 21 TASAIDFNTDAMDANDKQNIDLSHFTNVGYIMPGEYRLEINVNNHRIPEQVIAFYTRDDE 80
+++ + FN + + + DLS F N + PG YR++I +NN + + + F T D E
Sbjct: 43 SSAELYFNPRFLADDPQAVADLSRFENGQELPPGTYRVDIYLNNGYMATRDVTFNTGDSE 102

Query: 81 PDSSEVCLPEAVMEQFGLKPDVLQKITFWHEGQCADLREL-AGLTTEVDLATSTLAINVP 139
CL A + GL + + + C L + T ++D+ L + +P
Sbjct: 103 -QGIVPCLTRAQLASMGLNTASVSGMNLLADDACVPLTSMIHDATAQLDVGQQRLNLTIP 161

Query: 140 QAWMEYSDSNWVPSSQWDEGIPGFLLDYNVNSLFSKPKESGSTRNISLNGTSGLNAGPWR 199
QA+M ++P WD GI LL+YN + + + G++ LN SGLN G WR
Sbjct: 162 QAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSVQNRIGGNSHYAYLNLQSGLNIGAWR 221

Query: 200 LRGDYQGNYSHSSGEYNSSTSTFDWSRIYMYRAIKSLAATLSVGENYFASSLFDTFRYAG 259
LR + +Y+ S S + ++ R I L + L++G+ Y +FD + G
Sbjct: 222 LRDNTTWSYNSSDSSSGSKNK-WQHINTWLERDIIPLRSRLTLGDGYTQGDIFDGINFRG 280

Query: 260 ASLSSDERMLPPNLRGYAPEVSGIARTNAKVTVSQQGRILYQTTVASGPFRIQELSD-SV 318
A L+SD+ MLP + RG+AP + GIAR A+VT+ Q G +Y +TV GPF I ++
Sbjct: 281 AQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIYNSTVPPGPFTINDIYAAGN 340

Query: 319 SGRLDVSVEEQDGTVQTFQVDTAAVPYLTRPGAIRYKTSVGQPSTLNHGTEGPVFASGEF 378
SG L V+++E DG+ Q F V ++VP L R G RY + G+ + N E P F
Sbjct: 341 SGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITAGEYRSGNAQQEKPRFFQSTL 400

Query: 379 SWGVTNSWSLFGGAIGSGDYNAVSVGVGRDLYAFGAISTDITQTRASGLPNQDTQSGKSL 438
G+ W+++GG + Y A + G+G+++ A GA+S D+TQ ++ LP+ G+S+
Sbjct: 401 LHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALSVDMTQANST-LPDDSQHDGQSV 459

Query: 439 RVRYAKRFDELNSDISFAGYRFFEREFMSMNQYLNTRYFDNDL----------------- 481
R Y K +E ++I GYR+ + + +R ++
Sbjct: 460 RFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGYNIETQDGVIQVKPKFTDYY 519

Query: 482 ---GRNKEMYTVTASKNFPDIQTNVNFSYSYQNYWDQP-TSNSYSATISHAFDAFSLKDM 537
+ +T ++ T + S S+Q YW + A ++ AF +D+
Sbjct: 520 NLAYNKRGKLQLTVTQQLGRTST-LYLSGSHQTYWGTSNVDEQFQAGLNTAF-----EDI 573

Query: 538 TVNLSASRSKNNGV--NDDVLYLSFSVPLGNQ-----------QTLSYSGQH-NGQGNNQ 583
LS S +KN D +L L+ ++P + + SYS H
Sbjct: 574 NWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRSDSKSQWRHASASYSMSHDLNGRMTN 633

Query: 584 TVNYSNSSAIDS--SYRLSAGVNNSNDNGARSQFSGFYIHRSSIAETSLNVAYAQDDFTS 641
+ D+ SY + G D + S +R ++ DD
Sbjct: 634 LAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGYGNANIG-YSHSDDIKQ 692

Query: 642 TGVSMRGGATVTAKGAALHGPGMSGGTRLMVNTDDIAGVPLEERNI-RSNRFGIAVLNNI 700
+ GG A G L P T ++V +E + R++ G AVL
Sbjct: 693 LYYGVSGGVLAHANGVTLGQP--LNDTVVLVKAPGAKDAKVENQTGVRTDWRGYAVLPYA 750

Query: 701 NSYYRTDTRIDINQLADDVEVKQSAVEFALTEGAIGYRRFAMMKGEKVLATISLTDSSHP 760
Y +D N LAD+V++ + T GAI F G K+L T++ ++
Sbjct: 751 TEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAEFKARVGIKLLMTLT-HNNKPL 809

Query: 761 PFGSLVMSAKGQELGIVSDDGFTYLSGVEPGETLNVLWSGAKQCQVAIPAALQPQA---- 816
PFG++V S Q GIV+D+G YLSG+ + V W + L P++
Sbjct: 810 PFGAMVTSESSQSSGIVADNGQVYLSGMPLAGKVQVKWGEEENAHCVANYQLPPESQQQL 869

Query: 817 --QILLPC 822
Q+ C
Sbjct: 870 LTQLSAEC 877


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0617  FERRIBNDNGPP  50     7e-09    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 49.6 bits (118), Expect = 7e-09
Identities = 21/97 (21%), Positives = 40/97 (41%), Gaps = 12/97 (12%)

Query: 129 QTEPNIKAVAKMRPDLIIISATGDDSTLELYDQLSAIAPTLVINYDDKS-----WQELTL 183
+TEPN++ + +M+P ++ SA S + L+ IAP N+ D ++
Sbjct: 84 RTEPNLELLTEMKPSFMVWSAGYGPS----PEMLARIAPGRGFNFSDGKQPLAMARKSLT 139

Query: 184 QLGQATGHEGDAEQVI---DKFARRLNEVKQKITLPP 217
++ + AE + + F R + K P
Sbjct: 140 EMADLLNLQSAAETHLAQYEDFIRSMKPRFVKRGARP 176


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0623  TYPE3IMSPROT  296    e-101    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 296 bits (760), Expect = e-101
Identities = 98/344 (28%), Positives = 175/344 (50%)

Query: 5 SGEKSEKPTTGKLSKARKKGDIPRSKDVTMAAGLVTSFILLSLFLPYYKELISQSFVSVS 64
SGEK+E+PT K+ ARKKG + +SK+V A +V +L YY E S+ + +
Sbjct: 2 SGEKTEQPTPKKIRDARKKGQVAKSKEVVSTALIVALSAMLMGLSDYYFEHFSKLMLIPA 61

Query: 65 QLASQLNDQGALEQFLLANLFIFAKFLATLVPIPLFSMLATLIPGGWNFTPVKLIPDLKK 124
+ + Q L F L L ++ + ++ G+ + + PD+KK
Sbjct: 62 EQSYLPFSQALSYVVDNVLLEFFYLCFPLLTVAALMAIASHVVQYGFLISGEAIKPDIKK 121

Query: 125 LSPLAGIKRIFSASNGTEVLKMLAKCSIVLYTLYLVVHSSLDDLLHLQTLPLEEAITQGF 184
++P+ G KRIFS + E LK + K ++ +++++ +L LL L T +E
Sbjct: 122 INPIEGAKRIFSIKSLVEFLKSILKVVLLSILIWIIIKGNLVTLLQLPTCGIECITPLLG 181

Query: 185 AQYHHILLYFIAIVVVFAIIDIPLSHHLFTKKMKMTKQEVKQEHKNNDGNPEIKSRVRQL 244
+++ VV +I D ++ + K++KM+K E+K+E+K +G+PEIKS+ RQ
Sbjct: 182 QILRQLMVICTVGFVVISIADYAFEYYQYIKELKMSKDEIKREYKEMEGSPEIKSKRRQF 241

Query: 245 QRQYAIGQINKTVPSADVIITNPTHFSVALKYAPEKASAPYIVAKGKDDIALYIRSIAQK 304
++ + + V + V++ NPTH ++ + Y + P + K D +R IA++
Sbjct: 242 HQEIQSRNMRENVKRSSVVVANPTHIAIGILYKRGETPLPLVTFKYTDAQVQTVRKIAEE 301

Query: 305 HKIEIVEFPPLARAIYHTTKVNQQIPAQLYRAIAQVLTYVMQIK 348
+ I++ PLARA+Y V+ IPA+ A A+VL ++ +
Sbjct: 302 EGVPILQRIPLARALYWDALVDHYIPAEQIEATAEVLRWLERQN 345


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0624  TYPE3IMRPROT  105    2e-29    Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 105 bits (263), Expect = 2e-29
Identities = 70/237 (29%), Positives = 129/237 (54%), Gaps = 3/237 (1%)

Query: 17 LPFVRILSFLHFCPVIRHKAFTRKAKIGTALLLAILITPMISQPVVSRELLSIESLLLAG 76
P +R+L+ + P++ ++ ++ K+G A+++ I P + V + S +L LA
Sbjct: 18 WPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPANDV--PVFSFFALWLAV 75

Query: 77 EQILWGWLFGSMLHLVLAALEAAGQILSMNMGLGMAMMNDPTSGASTAVISQIIFTFSVL 136
+QIL G G + AA+ AG+I+ + MGL A DP S + V+++I+ ++L
Sbjct: 76 QQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHLNMPVLARIMDMLALL 135

Query: 137 IFFTLDGHLLFVTIVLKSFSSWPIG-EAINDFSLRSLALSLGWIISSATLLALPTTFIML 195
+F T +GHL +++++ +F + PIG E +N + +L + I + +LALP ++L
Sbjct: 136 LFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIFLNGLMLALPLITLLL 195

Query: 196 IVQGSFGLLNRISPTLNLFSLGFPIGMLFGLLCLLLLAINIPDHYLHLTNEILTQFE 252
+ + GLLNR++P L++F +GFP+ + G+ + L I HL +EI
Sbjct: 196 TLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFCEHLFSEIFNLLA 252


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0625  TYPE3IMQPROT  47     6e-11    Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.

Length = 86

Score = 47.5 bits (113), Expect = 6e-11
Identities = 25/74 (33%), Positives = 37/74 (50%)

Query: 14 GLHLVLMISIVAIVPSLLIGLLVSIFQATTQINEQTLSFLPRLVMTMLVLIFAGKWMMTK 73
L+LVL++S + + +IGLLV +FQ TQ+ EQTL F +L+ L L W
Sbjct: 11 ALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLFLLSGWYGEV 70

Query: 74 LSDFTVSIFQQAAQ 87
L + + A
Sbjct: 71 LLSYGRQVIFLALA 84


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0626  FLGBIOSNFLIP  219    6e-74    Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP signature.

Length = 245

Score = 219 bits (560), Expect = 6e-74
Identities = 112/236 (47%), Positives = 155/236 (65%), Gaps = 4/236 (1%)

Query: 19 LVGGLLYSPLLLAQEGGITLFNTVQTATGQDYNVKIEILILMTLLGLLPIMMLMMTCFTR 78
V L +PL AQ GIT + GQ +++ ++ L+ +T L +P ++LMMT FTR
Sbjct: 9 PVLLWLITPLAFAQLPGIT--SQPLPGGGQSWSLPVQTLVFITSLTFIPAILLMMTSFTR 66

Query: 79 FIIVLAILRQALGLQQSPPNKVLTGIALALTLLVMRPVWTKIHQDAVIPFQQDEITLSQA 138
IIV +LR ALG +PPN+VL G+AL LT +M PV KI+ DA PF +++I++ +A
Sbjct: 67 IIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEKISMQEA 126

Query: 139 LGRAEVPLKNYMLAQTSTKSLDQMMAIA--QVSGEPQQQDLSVVTPAYVLSELKTAFQIG 196
L + PL+ +ML QT L +A P+ + ++ PAYV SELKTAFQIG
Sbjct: 127 LEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELKTAFQIG 186

Query: 197 FMIYIPFLVIDLIVASILMAMGMMMLSPLIVSLPFKLMLFVLCDGWTLMVGTLTAS 252
F I+IPFL+IDL++AS+LMA+GMMM+ P ++LPFKLMLFVL DGW L+VG+L S
Sbjct: 187 FTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLAQS 242


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0627  FLGMOTORFLIN  72     3e-19    Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 71.9 bits (176), Expect = 3e-19
Identities = 35/77 (45%), Positives = 50/77 (64%)

Query: 54 RKMSLFSRIPVTLTLEVASVELPLSELLTVNNDSVIELDKLAGEPLDIRVNGIMFGQAEV 113
+ + L IPV LT+E+ + + ELL + SV+ LD LAGEPLDI +NG + Q EV
Sbjct: 52 QDIDLIMDIPVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEV 111

Query: 114 VVINEKYGLRIININSQ 130
VV+ +KYG+RI +I +
Sbjct: 112 VVVADKYGVRITDIITP 128


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0628  TYPE3OMOPROT  35     2e-04    Type III secretion system outer membrane O protein ...
		>TYPE3OMOPROT#Type III secretion system outer membrane O protein family signature.

Length = 303

Score = 35.0 bits (80), Expect = 2e-04
Identities = 25/103 (24%), Positives = 46/103 (44%), Gaps = 16/103 (15%)

Query: 147 GEHLIINNSTAALIACWSYRIDFFLKDYHKSGFSIFIDAPHIDRFINTIKTKSEKSVEKN 206
G+ L+I S A + C++ ++ F + ++ I ++ E N
Sbjct: 172 GDVLLIRTSRA-EVYCYAKKLGHFNR----------VEGGIIVETLDI----QHIEEENN 216

Query: 207 VSLSEKQLEHLVKKLPVTLTSQLSNINLTLAELMALKEGDIIS 249
+ + + L L +LPV L L N+TLAEL A+ + ++S
Sbjct: 217 TTETAETLPGL-NQLPVKLEFVLYRKNVTLAELEAMGQQQLLS 258


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0629  HTHFIS        373    e-129    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 373 bits (958), Expect = e-129
Identities = 125/345 (36%), Positives = 184/345 (53%), Gaps = 22/345 (6%)

Query: 14 HGFVANAPSSVSVFSLARRVAEFNVPVLVTGETGTGKECVAKYIHQKAMGDASPYIAVNC 73
V + + ++ + R+ + ++ +++TGE+GTGKE VA+ +H P++A+N
Sbjct: 137 MPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALHDYGKRRNGPFVAINM 196

Query: 74 AAIPESMLEAILFGYEKGAFTGAIASVAGKFEQANGGTLLLDEIGDMPLALQVKLLRVLQ 133
AAIP ++E+ LFG+EKGAFTGA G+FEQA GGTL LDEIGDMP+ Q +LLRVLQ
Sbjct: 197 AAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDMPMDAQTRLLRVLQ 256

Query: 134 EQEVERLGSHKAIPLDIRIIASTNKDLSVEIAEGRFRQDLYYRLSVVPIHILPLRERPED 193
+ E +G I D+RI+A+TNKDL I +G FR+DLYYRL+VVP+ + PLR+R ED
Sbjct: 257 QGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRLNVVPLRLPPLRDRAED 316

Query: 194 ILPLVKAFINKYQSFLNVKIDITAEAQCELYKYTWPGNVRELENVIQRGIIMSNNGVI-- 251
I LV+ F+ + + EA + + WPGNVRELEN+++R + VI
Sbjct: 317 IPDLVRHFVQQAEKEGLDVKRFDQEALELMKAHPWPGNVRELENLVRRLTALYPQDVITR 376

Query: 252 ---------ELPSLGLPMAQGISSPVGETSLPF--------STIQPPDGENNIKLRGRLA 294
E+P + A S + + S
Sbjct: 377 EIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFASFGDALPPSGLYDRVLAEM 436

Query: 295 QYQYIVDLLQRHQGNKSKTAAFLGITPRALRYRLANMREDGIDIE 339
+Y I+ L +GN+ K A LG+ LR + +RE G+ +
Sbjct: 437 EYPLILAALTATRGNQIKAADLLGLNRNTLRKK---IRELGVSVY 478


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0630  FLGHOOKFLIE   45     4e-09    Flagellar hook-basal body complex protein FliE signa...
		>FLGHOOKFLIE#Flagellar hook-basal body complex protein FliE signature.

Length = 103

Score = 44.7 bits (105), Expect = 4e-09
Identities = 23/73 (31%), Positives = 35/73 (47%), Gaps = 1/73 (1%)

Query: 53 NNLSFSQVLNGAIKSVDQLQHVASEKQTAMDMGISD-DLTGTMLASQKASVAFSAMVQVR 111
+SF+ L+ A+ + Q A + +G L M QKASV+ +QVR
Sbjct: 29 PTISFAGQLHAALDRISDTQTAARTQAEKFTLGEPGVALNDVMTDMQKASVSMQMGIQVR 88

Query: 112 NKLTSALDDVMNT 124
NKL +A +VM+
Sbjct: 89 NKLVAAYQEVMSM 101


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0631  FLGMRINGFLIF  283    1e-90    Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 283 bits (724), Expect = 1e-90
Identities = 152/565 (26%), Positives = 255/565 (45%), Gaps = 62/565 (10%)

Query: 12 GQLGENTKTILMSAAALLVTAAIIFSLWRSSQGYTALFGSQENIPVTQVVEVLEGEAIAY 71
+L N + L+ A + V + LW + Y LF + + +V L I Y
Sbjct: 17 NRLRANPRIPLIVAGSAAVAIVVAMVLWAKTPDYRTLFSNLSDQDGGAIVAQLTQMNIPY 76

Query: 72 RINPDNGQVLVAENQLGKARILLAAKGITATLPIGYELMDKESMLGSSQFIQNVRYKRSL 131
R +G + V +++ + R+ LA +G+ +G+EL+D+E G SQF + V Y+R+L
Sbjct: 77 RFANGSGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEK-FGISQFSEQVNYQRAL 135

Query: 132 EGELAQSMMALSAVEYARVHLGMSEASSFAISNRSDNSASVVLRLRYGQTLSTEQVGAIV 191
EGELA+++ L V+ ARVHL M + S F + + SASV + L G+ L Q+ A+V
Sbjct: 136 EGELARTIETLGPVKSARVHLAMPKPSLF-VREQKSPSASVTVTLEPGRALDEGQISAVV 194

Query: 192 QLVAGSIPGMKPANVRVVDQHGELLSQAYQANSEGVPSVKSGTELAHYLQSTTEKNIANL 251
LV+ ++ G+ P NV +VDQ G LL+ Q+N+ G + + A+ ++S ++ I +
Sbjct: 195 HLVSSAVAGLPPGNVTLVDQSGHLLT---QSNTSGRDLNDAQLKFANDVESRIQRRIEAI 251

Query: 252 LNSVIGANNYRISVSTQLDMSRIEETAERYGPDPRIN------DENIQQENSNDDMAMGI 305
L+ ++G N V+ QLD + E+T E Y P+ + + E G+
Sbjct: 252 LSPIVGNGNVHAQVTAQLDFANKEQTEEHYSPNGDASKATLRSRQLNISEQVGAGYPGGV 311

Query: 306 PGSLSNQPIPQSQAGQTPAAVSRSQAQ------------------------RKYIYDRNI 341
PG+LSNQP P ++A ++ AQ Y DR I
Sbjct: 312 PGALSNQPAPPNEAPIATPPTNQQNAQNTPQTSTSTNSNSAGPRSTQRNETSNYEVDRTI 371

Query: 342 RHVRYPGYKLEKMTVAVVLN-KSLPVL--EQWTPEQQEELKRLIEDAAGIDVKRGDSLTI 398
RH + +E+++VAVV+N K+L T +Q ++++ L +A G KRGD+L +
Sbjct: 372 RHTKMNVGDIERLSVAVVVNYKTLADGKPLPLTADQMKQIEDLTREAMGFSDKRGDTLNV 431

Query: 399 NVMAFAVP-TLIDEPVMPWWQEPSTFRWAELLGIGLLSLLVLW----FGVRPLMKRYSRK 453
F+ E +P+WQ+ S G LL L+V W VRP + R
Sbjct: 432 VNSPFSAVDNTGGE--LPFWQQQSFIDQLLAAGRWLLVLVVAWILWRKAVRPQLTR---- 485

Query: 454 GSENLPLAISSASADEALDHVDTGVDGVESSPRAETAFSASSLWESDDLPEQGSGLETKI 513
E + + A + Q G E
Sbjct: 486 -------------RVEEAKAAQEQAQVRQETEEAVEVRLSKDEQLQQRRANQRLGAEVMS 532

Query: 514 AHLQQLAQSETERTAEVIKQWINSN 538
+++++ ++ A VI+QW++++
Sbjct: 533 QRIREMSDNDPRVVALVIRQWMSND 557


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0632  FLGMOTORFLIG  173    3e-53    Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 173 bits (439), Expect = 3e-53
Identities = 85/334 (25%), Positives = 165/334 (49%), Gaps = 2/334 (0%)

Query: 15 KSDTKGRSRLEQASILLLSIGEEAAAMVMQQLSREEVVCVSQMMSRLHNIKLDQARQALD 74
D + ++A+ILL+SIG E ++ V + LS+EE+ ++ +++L I + L
Sbjct: 9 ILDVSALTGKQKAAILLVSIGSEISSKVFKYLSQEEIESLTFEIAKLETITSELKDNVLL 68

Query: 75 DFFRDYREQSGINGASRSYLQAILNKALGSDIAKSVINGIYGDEIRHRMTRLQWVDTPQL 134
+F Q I Y + +L K+LG+ A +IN + ++ D +
Sbjct: 69 EFKELMMAQEFIQKGGIDYARELLEKSLGTQKAVDIINNLGSALQSRPFEFVRRADPANI 128

Query: 135 VALIDQEHLQLQAVFLAFLPPDVAAAVLAYLDKDHQDDILYRIAKLDDVNRDVVDEL-DR 193
+ I QEH Q A+ L++L P A+ +L+ L + Q ++ RIA +D + +VV E+
Sbjct: 129 LNFIQQEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEVVREVERV 188

Query: 194 LIERGVAVLSEHGSKVIGIKQAANIVNRIPGNQQQ-LLDQLGERDEEVLNELKDEMYEFF 252
L ++ ++ SE + G+ I+N ++ +++ L E D E+ E+K +M+ F
Sbjct: 189 LEKKLASLSSEDYTSAGGVDNVVEIINMADRKTEKFIIESLEEEDPELAEEIKKKMFVFE 248

Query: 253 ILSRQSEATLQRLMDLIPMSDWAIALKGTEPALRQAIYDVLPKRQIQQLQNATQRTGAVP 312
+ + ++QR++ I + A ALK + +++ I+ + KR L+ + G
Sbjct: 249 DIVLLDDRSIQRVLREIDGQELAKALKSVDIPVQEKIFKNMSKRAASMLKEDMEFLGPTR 308

Query: 313 VSRVEHIRKVIMAQVRELAEAGEIQVQLFAEQTM 346
VE ++ I++ +R+L E GEI + E+ +
Sbjct: 309 RKDVEESQQKIVSLIRKLEEQGEIVISRGGEEDV 342


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0633  FLGFLIH       59     1e-12    Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 59.0 bits (142), Expect = 1e-12
Identities = 46/204 (22%), Positives = 100/204 (49%), Gaps = 11/204 (5%)

Query: 18 QFPPLRKVRQVALSAADQTLDPAEYQKQLMAGFQEGISQGFDKGLAEGKEEGYQEGVRLG 77
+F P+ + + + A+ +L+ Q Q+ A QG+ G+AEG+++G+++G + G
Sbjct: 21 EFVPIVEPEETIIEEAEPSLEQQLAQLQMQAH-----EQGYQAGIAEGRQQGHKQGYQEG 75

Query: 78 HDDGLKKGRIEGRQSELASFNDVIKPFSGYVTQLHTYLETYEQRRRDELLQLVEKVTRQV 137
GL++G E +S+ A + ++ V++ T L+ + L+Q+ + RQV
Sbjct: 76 LAQGLEQGLAEA-KSQQAPIHARMQQL---VSEFQTTLDALDSVIASRLMQMALEAARQV 131

Query: 138 IRCELALQPAQLLTLVEEALAALPKVPQQLKVYLNPAEFGRINDV--APEKVQAWGLAAD 195
I + + L+ +++ L P + ++ ++P + R++D+ A + W L D
Sbjct: 132 IGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGATLSLHGWRLRGD 191

Query: 196 PEMVGGECRIVTETTEIDVGCQHR 219
P + G C++ + ++D R
Sbjct: 192 PTLHPGGCKVSADEGDLDASVATR 215


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0640  FLGHOOKAP1    30     0.004    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 29.5 bits (66), Expect = 0.004
Identities = 6/37 (16%), Positives = 19/37 (51%)

Query: 102 VNVVSEMADMMSASRSFETNVEVLNSVKSMQQSVLKL 138
VN+ E ++ + + N +VL + ++ +++ +
Sbjct: 509 VNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0642  FLGHOOKAP1    38     4e-05    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 38.4 bits (89), Expect = 4e-05
Identities = 19/60 (31%), Positives = 27/60 (45%), Gaps = 5/60 (8%)

Query: 2 SFSIANTALNAHTEQLNTISNNIANSATKGFKASR----TEFASMYAQSQ-PLGVTVSGV 56
+ A + LNA LNT SNNI++ G+ +++ A GV VSGV
Sbjct: 3 LINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVSGV 62



Score = 33.4 bits (76), Expect = 0.001
Identities = 10/42 (23%), Positives = 22/42 (52%)

Query: 360 LENSNVDITAELVGLMTAQRNYQASTKIISTNDSMMNALFQV 401
S V++ E L Q+ Y A+ +++ T +++ +AL +
Sbjct: 504 QSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0644  FLGHOOKAP1    42     2e-06    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 41.9 bits (98), Expect = 2e-06
Identities = 11/42 (26%), Positives = 20/42 (47%)

Query: 213 QLEQGALEGSNVQVVEEMVDMITVQRAYEMNAKMVSAADDML 254
QL S V + EE ++ Q+ Y NA+++ A+ +
Sbjct: 498 QLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIF 539



Score = 40.7 bits (95), Expect = 3e-06
Identities = 20/78 (25%), Positives = 35/78 (44%), Gaps = 14/78 (17%)

Query: 2 NSALWVSKTGLAAQDAKMGAISNNLANVNTDGFKRDRVVFADLFYQNQRTPGAPLDQNNT 61
+S + + +GL A A + SNN+++ N G+ R + A N+T
Sbjct: 1 SSLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMA--------------QANST 46

Query: 62 TPSGIQFGSGVQIVGTQK 79
+G G+GV + G Q+
Sbjct: 47 LGAGGWVGNGVYVSGVQR 64


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0645  FLGLRINGFLGH  153    1e-48    Flagellar L-ring protein signature.
		>FLGLRINGFLGH#Flagellar L-ring protein signature.

Length = 232

Score = 153 bits (387), Expect = 1e-48
Identities = 76/228 (33%), Positives = 113/228 (49%), Gaps = 14/228 (6%)

Query: 4 FLILTPMVLALCGCESPALLVQKDDAEFAPPANLVQPATVTEGGGLF---QPAY--NWSL 58
+ I + +VL+L GC A A P P G +F QP L
Sbjct: 9 YAISSLLVLSLTGCAWIPSTPLVQGATSAQPVPGPTPVA---NGSIFQSAQPINYGYQPL 65

Query: 59 LQDRRAYRIGDILTVILDESTQSSKQAKTNFGKKNDMSLGVPEVLGKKLNKFGGSI---- 114
+DRR IGD LT++L E+ +SK + N + + G V FG +
Sbjct: 66 FEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYLQGLFGNARADVE 125

Query: 115 -SGKRDFDGSATSAQQNMLRGSITVAVHQVLPNGVLVIRGEKWLTLNQGDEYMRVTGLVR 173
SG F+G + N G++TV V QVL NG L + GEK + +NQG E++R +G+V
Sbjct: 126 ASGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRFSGVVN 185

Query: 174 ADDVARDNSVSSQRIANARISYAGRGALSDANSAGWLTRFFNHPLFPI 221
++ N+V S ++A+ARI Y G G +++A + GWL RFF + L P+
Sbjct: 186 PRTISGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLN-LSPM 232


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
YpsIP31758_0646    FLGPRINGFLGI    319    e-109    Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 319 bits (819), Expect = e-109
Identities = 148/369 (40%), Positives = 211/369 (57%), Gaps = 12/369 (3%)

Query: 8 LVLSAAFVLPMALALPTATAQPLGSLVDIQGVRGNQLVGYSLVVGLDGSGDK-NQVKFTG 66
LV SA L A A + + +Q R NQL+GY LVVGL G+GD FT
Sbjct: 11 LVFSALPFLSTPPAQ--ADTSRIKDIASLQAGRDNQLIGYGLVVGLQGTGDSLRSSPFTE 68

Query: 67 QSMANMLRQFGVQLPEKMDPKVKNVAAVAISATLPPGYGRGQSIDITVSSIGDAKSLRGG 126
QSM ML+ G+ KN+AAV ++A LPP G +D+TVSS+GDA SLRGG
Sbjct: 69 QSMRAMLQNLGITTQGG-QSNAKNIAAVMVTANLPPFASPGSRVDVTVSSLGDATSLRGG 127

Query: 127 TLLLTQLRGADGEVYALAQGNVVVGGIKAEGDSGSSVTVNTPTVGRIPNGASIERQVPSD 186
L++T L GADG++YA+AQG ++V G A+GD +++T T R+PNGA IER++PS
Sbjct: 128 NLIMTSLSGADGQIYAVAQGALIVNGFSAQGD-AATLTQGVTTSARVPNGAIIERELPSK 186

Query: 187 FQTNNQVVLNLKRPSFKSANNVALALNR----AFGANTATAQSATNVMVNAPQDAGARVA 242
F+ + +VL L+ P F +A VA +N +G A + + + V P+ A
Sbjct: 187 FKDSVNLVLQLRNPDFSTAVRVADVVNAFARARYGDPIAEPRDSQEIAVQKPRVA-DLTR 245

Query: 243 FMSLLEDVQINAGQQSPRVVFNARTGTVVIGEGVIVRAAAVSHGNLTVSIRERKNVSQPN 302
M+ +E++ + +VV N RTGT+VIG V + AVS+G LTV + E V QP
Sbjct: 246 LMAEIENLTVET-DTPAKVVINERTGTIVIGADVRISRVAVSYGTLTVQVTESPQVIQPA 304

Query: 303 TLGGGKTVTTPESDIEVTKGKNQMVMVPAGTRLRSIVNTINSLGASPDDIMAILQALHEA 362
G+T P++DI + +++ +V G LR++V +NS+G D I+AILQ + A
Sbjct: 305 PFSRGQTAVQPQTDIMAMQEGSKVAIVE-GPDLRTLVAGLNSIGLKADGIIAILQGIKSA 363

Query: 363 GALDAELVV 371
GAL AELV+
Sbjct: 364 GALQAELVL 372
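
The statistics lines that open each alignment block can be read back programmatically. The following is a minimal sketch, not part of PredictBias itself: the names `parse_expect`, `parse_hsp` and `E_VALUE_CUTOFF` are hypothetical, the 0.05 cutoff is taken from the "E-value (RPSBlast)" input parameter shown in this report, and comparing with `<=` is an assumption about how that cutoff is applied. It also shows that an Expect value printed without a mantissa, such as `e-109`, has to be read as `1e-109`.

```python
import re

# Patterns for the two statistic lines that open each alignment block.
SCORE_RE = re.compile(r"Score\s*=\s*([\d.]+)\s*bits\s*\((\d+)\),\s*Expect\s*=\s*(\S+)")
COUNT_RE = re.compile(r"(Identities|Positives|Gaps)\s*=\s*(\d+)/(\d+)")

# Assumption: mirrors the "E-value (RPSBlast)" input parameter of this report.
E_VALUE_CUTOFF = 0.05


def parse_expect(text):
    """Convert an Expect field to float; a bare exponent like 'e-109' is read as 1e-109."""
    return float("1" + text if text.startswith("e") else text)


def parse_hsp(score_line, stats_line):
    """Return bit score, raw score, E-value and match fractions for one block."""
    match = SCORE_RE.search(score_line)
    if match is None:
        raise ValueError("not a Score/Expect line: " + score_line)
    hsp = {
        "bits": float(match.group(1)),
        "raw": int(match.group(2)),
        "expect": parse_expect(match.group(3)),
    }
    # Identities, Positives and Gaps are each reported as count/alignment length.
    for name, count, length in COUNT_RE.findall(stats_line):
        hsp[name.lower()] = int(count) / int(length)
    return hsp


hsp = parse_hsp(
    "Score = 319 bits (819), Expect = e-109",
    "Identities = 148/369 (40%), Positives = 211/369 (57%), Gaps = 12/369 (3%)",
)
print(hsp["expect"] <= E_VALUE_CUTOFF, round(100 * hsp["identities"]))
```

The percentages printed in the report are the same count/length ratios rounded to whole numbers, so the fractions returned here can be checked against them directly.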


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
YpsIP31758_0647    FLGFLGJ    45    6e-09    Flagellar protein FlgJ signature.
		>FLGFLGJ#Flagellar protein FlgJ signature.

Length = 313

Score = 45.1 bits (106), Expect = 6e-09
Identities = 19/77 (24%), Positives = 41/77 (53%), Gaps = 4/77 (5%)

Query: 18 GDLQPQDLEQAAVQFEAVFMRTLLQQMRKAAEVLAADDDPFNSKQQRMMRDFYDDKLAST 77
G+ ++ A Q E +F++ +L+ MR A D F+S+ R+ YD ++A
Sbjct: 26 GEDPAANIRPVARQVEGMFVQMMLKSMRDAL----PKDGLFSSEHTRLYTSMYDQQIAQQ 81

Query: 78 LASQRSSGIANLLIQQL 94
+ + + G+A ++++Q+
Sbjct: 82 MTAGKGLGLAEMMVKQM 98


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
YpsIP31758_0648    FLGHOOKAP1    152    5e-43    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 152 bits (386), Expect = 5e-43
Identities = 90/324 (27%), Positives = 151/324 (46%), Gaps = 8/324 (2%)

Query: 4 IKTAFSGMQATQAHLNATSMNIANIHTPGYSRQRAEQSAIGADGQGGINAGNGVNVDAIR 63
I A SG+ A QA LN S NI++ + GY+RQ + + G GNGV V ++
Sbjct: 4 INNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVSGVQ 63

Query: 64 RLSKQYVVMQEWQANSQQQYYEAGEQYLKAVELMVSNESTSLATGLNNFFSSLSAATQMP 123
R ++ Q A +Q A + + ++ M+S ++SLAT + +FF+SL
Sbjct: 64 REYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTLVSNA 123

Query: 124 DSTPMRQQIIESANAMALRFNNVNNFIAQQKNSIGQQRDISIKEINSLTRSIADYNQQIL 183
+ RQ +I + + +F + ++ Q + S+ +IN+ + IA N QI
Sbjct: 124 EDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLNDQIS 183

Query: 184 K--NRSDGNNINDLLDKQELQIKKLSGLIETQVNQAEDGTYRVSVKQGQPLVNGAVAAEL 241
+ G + N+LLD+++ + +L+ ++ +V+ + GTY +++ G LV G+ A +L
Sbjct: 184 RLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGSTARQL 243

Query: 242 AVDTRSADTKINIHFSGTTQGMNMSC------GGQLGGINDYEFTTLKKLQASTQGMAKT 295
A SAD N+ G LGGI + L + + + +A
Sbjct: 244 AAVPSSADPSRTTVAYVDGTAGNIEIPEKLLNTGSLGGILTFRSQDLDQTRNTLGQLALA 303

Query: 296 VADEFNTQLKLGSDFTGANGRDLF 319
A+ FNTQ K G D G G D F
Sbjct: 304 FAEAFNTQHKAGFDANGDAGEDFF 327



Score = 61.5 bits (149), Expect = 2e-12
Identities = 36/145 (24%), Positives = 67/145 (46%), Gaps = 4/145 (2%)

Query: 311 TGANGRDLFVFNPGDPNGMLQLSAITAEQLALAGRG-EPSGDS--SNLFKLIDIKQKNVV 367
D F P + ++ + + ++ +A E +GDS N L+D++ +
Sbjct: 402 GTPAVNDSFTLKPVS-DAIVNMDVLITDEAKIAMASEEDAGDSDNRNGQALLDLQSNSKT 460

Query: 368 GMNSVRLDDAATTLVGYIAITSNRNYSELENAENTLNQATRYRESISGVNNDEEAINLME 427
+ +DA +LV I + + N + Q + ++SISGVN DEE NL
Sbjct: 461 VGGAKSFNDAYASLVSDIGNKTATLKTSSATQGNVVTQLSNQQQSISGVNLDEEYGNLQR 520

Query: 428 YQRAYQSNMKVIATGDKLFSDLLAL 452
+Q+ Y +N +V+ T + +F L+ +
Sbjct: 521 FQQYYLANAQVLQTANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
YpsIP31758_0653    FLAGELLIN    100    3e-25    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 100 bits (250), Expect = 3e-25
Identities = 68/294 (23%), Positives = 114/294 (38%), Gaps = 7/294 (2%)

Query: 2 ALSIHTNASAKTAINSLSNAGLANAKSSQRLSTGFRINSPADNAAGLQITNRMEKFLNGA 61
A I+TN+ + N+L+ + + + + +RLS+G RINS D+AAG I NR + G
Sbjct: 1 AQVINTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGL 60

Query: 62 GQAKQNIQESIAMLQIADGGLAESVKTLNAMKKLATQAANDTNSAADREAIQKEFTELGK 121
QA +N + I++ Q +G L E L +++L+ QA N TNS +D ++IQ E + +
Sbjct: 61 TQASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLE 120

Query: 122 ELQNALNNTEYNSEKLFADGGKMRKELNFQSGTDAGSSLKLNLNDVIAELTESVTKPGTA 181
E+ N T++N K+ + +M Q G + G ++ ++L + +
Sbjct: 121 EIDRVSNQTQFNGVKVLSQDNQM----KIQVGANDGETITIDLQKIDVKSLGLDGFNVNG 176

Query: 182 ITADASGTPAQKELARLNQVTADALREKELAKKAKTDLGAVQAGANATANIDIPEYKDAN 241
G + N D + + GAV A D AN
Sbjct: 177 PKEATVGDLK---SSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAAN 233

Query: 242 GQTVLGKRIASGATVSAGDIAQIDAAVTALTQVHTDADKASTDYANNNLVGGGV 295
GQ + A A D + V +
Sbjct: 234 GQLTTDDAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTI 287



Score = 63.1 bits (153), Expect = 6e-13
Identities = 50/333 (15%), Positives = 103/333 (30%), Gaps = 6/333 (1%)

Query: 64 AKQNIQESIAMLQIADGGLAESVKTLNAMKKLATQAANDTNSAADREAIQKEFTELGKEL 123
+++ S + D + K + A + D+ + +L +
Sbjct: 181 TVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTTDD 240

Query: 124 QNALNNTEYNSEKLFADGGKMRKELNFQSGTDAGSSLKLNLNDVIAELTESVTKPGTAIT 183
+ G K + T++ ++
Sbjct: 241 AENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTIDTKTGNDGNGKVS 300

Query: 184 ADASGTPAQKELARLNQVTADALREKELAKKAKTDLGAVQAGANATANIDIPEYKDANGQ 243
+G +A + A+ + V + K+ + +
Sbjct: 301 TTINGEKVTLTVADITAGAAN------VDAATLQSSKNVYTSVVNGQFTFDDKTKNESAK 354

Query: 244 TVLGKRIASGATVSAGDIAQIDAAVTALTQVHTDADKASTDYANNNLVGGGVMNMRLADK 303
+ + S + + A T A K + V + A K
Sbjct: 355 LSDLEANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFIDKTASGVSTLINEDAAAAK 414

Query: 304 DLAMEADKKLSEVIDAYGAFRATLGANQNRLQSSSNNLDNMISNTAQALGSIKDTDFADE 363
+ + A R++LGA QNR S+ NL N ++N A I+D D+A E
Sbjct: 415 KSTANPLASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARSRIEDADYATE 474

Query: 364 MKNHAQSEMLMQSSVMMLKKANAATQLISTLLQ 396
+ N +++++L Q+ +L +AN Q + +LL+
Sbjct: 475 VSNMSKAQILQQAGTSVLAQANQVPQNVLSLLR 507


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
YpsIP31758_0654    FLAGELLIN    96    9e-24    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 96.3 bits (239), Expect = 9e-24
Identities = 72/291 (24%), Positives = 119/291 (40%), Gaps = 6/291 (2%)

Query: 2 ALSIHTNASAKTAINSLSNAGLANAKSSQRLSTGFRINSPADNAAGLQITNRMEKFLNGA 61
A I+TN+ + N+L+ + + + + +RLS+G RINS D+AAG I NR + G
Sbjct: 1 AQVINTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGL 60

Query: 62 GQAKQNIQESIAMLQIADGGLAESVKTLNAMKKLATQAANDTNSAADREAIQKEFTELGK 121
QA +N + I++ Q +G L E L +++L+ QA N TNS +D ++IQ E + +
Sbjct: 61 TQASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLE 120

Query: 122 ELQNALNNTEYNSEKLFADGGKMRKELNFQSGTDAESSLKLNLNSVIAELTESVTTKATP 181
E+ N T++N K+ + +M Q G + ++ ++L + +
Sbjct: 121 EIDRVSNQTQFNGVKVLSQDNQM----KIQVGANDGETITIDLQKIDVKSLGLDGFNVNG 176

Query: 182 VKADDAGSTLEKEADVLDKATKAAKAAKEAAEAAQLAIAGKKDGDAITATAIPEYKDATG 241
K G +V T A A K + A+ + A G
Sbjct: 177 PKEATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVY--VNAANG 234

Query: 242 QVIAAKTTGTTLSTADINQINDAATALTKAHAAAEKAENLFQAKNSTGGGV 292
Q+ T + A TA KA A A K + G
Sbjct: 235 QLTTDDAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTF 285



Score = 61.6 bits (149), Expect = 2e-12
Identities = 53/330 (16%), Positives = 99/330 (30%), Gaps = 3/330 (0%)

Query: 64 AKQNIQESIAMLQIADGGLAESVKTLNAMKKLATQAANDTNSAADREAIQKEFTELGKEL 123
+++ S + D + K + A + D+ + +L +
Sbjct: 181 TVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTTDD 240

Query: 124 QNALNNTEYNSEKLFADGGKMRKELNFQSGTDAESSLKLNLNSVIAELTESVTTKATPVK 183
+ G K + E T++ V
Sbjct: 241 AENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTIDTKTGNDGNGKVS 300

Query: 184 ADDAGSTLEKEADVLDKATKAAKAAKEAAEAAQLAIAGKKDGDAITATAIPEYKDATGQV 243
G + + AA + T + A
Sbjct: 301 TTINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKTK---NESAKLSD 357

Query: 244 IAAKTTGTTLSTADINQINDAATALTKAHAAAEKAENLFQAKNSTGGGVMEMQLLDKDLA 303
+ A S +N A A A K + + + + E K
Sbjct: 358 LEANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFIDKTASGVSTLINEDAAAAKKST 417

Query: 304 MMADKKLSDVIDAYGAFRATLGANQNRLQSSSNNLDNMISNTAQALGSIKDTDFADEMKN 363
+ + A R++LGA QNR S+ NL N ++N A I+D D+A E+ N
Sbjct: 418 ANPLASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARSRIEDADYATEVSN 477

Query: 364 HAQSEMLMQSSVMMLKKANAATQLISTLLQ 393
+++++L Q+ +L +AN Q + +LL+
Sbjct: 478 MSKAQILQQAGTSVLAQANQVPQNVLSLLR 507


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
YpsIP31758_0658    FLGHOOKFLIK    29    0.031    Flagellar hook-length control protein signature.
		>FLGHOOKFLIK#Flagellar hook-length control protein signature.

Length = 375

Score = 29.0 bits (64), Expect = 0.031
Identities = 28/88 (31%), Positives = 45/88 (51%), Gaps = 9/88 (10%)

Query: 258 QHATIRLDPPDMGKIDISIHFEGGKLQVNINTNQGEVYRALQQSSAELRQTL------IG 311
Q A +RL P D+G++ IS+ + + Q+ + + V AL+ + LR L +G
Sbjct: 257 QSAELRLHPQDLGEVQISLKVDDNQAQIQMVSPHQHVRAALEAALPVLRTQLAESGIQLG 316

Query: 312 QNS---AEVNVQVSANSQQQQQQRHPSH 336
Q++ + Q A SQQQQ QR +H
Sbjct: 317 QSNISGESFSGQQQAASQQQQSQRTANH 344


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
YpsIP31758_0662    OMPADOMAIN    38    3e-05    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 38.4 bits (89), Expect = 3e-05
Identities = 26/116 (22%), Positives = 41/116 (35%), Gaps = 17/116 (14%)

Query: 171 FQRSSAVLTPFFSRLLGELAPAFNEM---DNKIIITGHTDASRYRDQLLYNNWNLSGERA 227
F + A L P L +L + + D +++ G+TD D N LS RA
Sbjct: 223 FNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTD-RIGSDAY---NQGLSERRA 278

Query: 228 LMAHKALVSGGLDKGRVLQI----------NAMADQMLLDPTDPLAAKNRRIEIMV 273
L+S G+ ++ N + A +RR+EI V
Sbjct: 279 QSVVDYLISKGIPADKISARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEV 334


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
YpsIP31758_0665    FLGPRINGFLGI    34    2e-04    Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 33.8 bits (77), Expect = 2e-04
Identities = 28/126 (22%), Positives = 49/126 (38%), Gaps = 7/126 (5%)

Query: 32 SQAGQTQSVTHGTLVSVRPVTIQGGDGNNVAGAVGGAVIGGFLGNTIGGGTGRRLGTAAG 91
S G S+ G L+ ++ G DG A A G ++ GF + + T+A
Sbjct: 116 SSLGDATSLRGGNLIMT---SLSGADGQIYAVAQGALIVNGFSAQGDAATLTQGVTTSAR 172

Query: 92 VVAGGVVGQQVQSMMNRSSAVELEVRRDDGSTFLVVQAQGVTQFHP---GQRVTIATSGS 148
V G ++ +++ S S + L++R D ST + V V F G +
Sbjct: 173 VPNGAIIERELPSKFKDSVNLVLQLRNPDFSTAVRVADV-VNAFARARYGDPIAEPRDSQ 231

Query: 149 TVTITP 154
+ +
Sbjct: 232 EIAVQK 237


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
YpsIP31758_0667    RTXTOXIND    47    6e-08    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 47.1 bits (112), Expect = 6e-08
Identities = 27/177 (15%), Positives = 55/177 (31%), Gaps = 17/177 (9%)

Query: 10 RHSLLSHALFLLILGAGTVSAAPAPLPAVTVAVVASITPDNAVQYLGRIEAIQAVDVTTR 69
R L + L + + + V +T GR + I+ +
Sbjct: 55 RRPRLVAYFIMGFLVIAFILSVLGQVEIV-ATANGKLTHS------GRSKEIKPI----- 102

Query: 70 TEGFIARRLFTEGKMVKQGELLYEIDPALHQASVAQAQAQLDSATASANHAQVNLTRLQR 129
+ + EG+ V++G++L ++ +A + Q+ L A Q+ ++
Sbjct: 103 ENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIEL 162

Query: 130 LGNNRSVSQAE-----VDEAQAQRDISRAAVAQAQANLQIQQLQLSFTQIHAPISGQ 181
E V E + R S + Q Q +L+ + A
Sbjct: 163 NKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTV 219



Score = 39.4 bits (92), Expect = 2e-05
Identities = 24/175 (13%), Positives = 52/175 (29%), Gaps = 46/175 (26%)

Query: 75 ARRLFTEGKMVKQGELL-YEIDPALHQASVAQAQAQLDSATASANHAQVNLTRLQRLGNN 133
L E Q + E++ +A A+++ + + L L +
Sbjct: 187 LTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHK 246

Query: 134 RSVS-------QAEVDEAQAQRDISRAAVAQ---------------------------AQ 159
++++ + + EA + + ++ + Q Q
Sbjct: 247 QAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQ 306

Query: 160 ANLQIQQL---------QLSFTQIHAPISGQ-MGHSRFNVGSLINPASGTLVNIV 204
I L + + I AP+S + G ++ A TL+ IV
Sbjct: 307 TTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAE-TLMVIV 360


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
YpsIP31758_0668    ACRIFLAVINRP    894    0.0    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 894 bits (2312), Expect = 0.0
Identities = 421/1036 (40%), Positives = 611/1036 (58%), Gaps = 17/1036 (1%)

Query: 1 MLHFFIRRPKFAIVIALVITLVGWVSLYVIPVEQYPDITPPVVSVSAVYPGASARDVAQA 60
M +FFIRRP FA V+A+++ + G +++ +PV QYP I PP VSVSA YPGA A+ V
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VASPLEAQVNGVSHMLYMESTSANNGSYQLSITFASGTDPDMAAVEVQNRISQVSAQLPA 120
V +E +NG+ +++YM STS + GS +++TF SGTDPD+A V+VQN++ + LP
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 121 EVNENGISVRKRASNLLLGVSVFSPQQTHDALFVSNYTSIQLRDAIARISGVGDVQVFGA 180
EV + GISV K +S+ L+ S +S+Y + ++D ++R++GVGDVQ+FGA
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 181 RDYSMRVWLDPQRMESLNVSVQDIVAALQQQNVQAAAGQIGSSPSMPNQQQTLTISGQGR 240
+ Y+MR+WLD + ++ D++ L+ QN Q AAGQ+G +P++P QQ +I Q R
Sbjct: 181 Q-YAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 241 LTDARQFADVIIRSNPQGGMIRLGDVARVALGAQNYQVSAAQNQTESAFLVVYPVPGANA 300
+ +F V +R N G ++RL DVARV LG +NY V A N +A L + GANA
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANA 299

Query: 301 LNVANGVRDEMARLSAAFPADLTYEINYDSTLPVTATLHEIAVSLTLTLIVVLAVVYLFL 360
L+ A ++ ++A L FP + YD+T V ++HE+ +L +++V V+YLFL
Sbjct: 300 LDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFL 359

Query: 361 QSLRATFIVALTVPVSLLGTFAVLYVFGYSANTLSLFAIILALTIVVDDAIVVVENVERL 420
Q++RAT I + VPV LLGTFA+L FGYS NTL++F ++LA+ ++VDDAIVVVENVER+
Sbjct: 360 QNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERV 419

Query: 421 LSNDPHLSPAEATRQAMSQIAGPIIATTLVLMAVFVPIAILPGIIGELYRQFAVTLSAAV 480
+ D L P EAT ++MSQI G ++ +VL AVF+P+A G G +YRQF++T+ +A+
Sbjct: 420 MMED-KLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAM 478

Query: 481 ILSSINALTLSPALCAVLLKRRTL----ATTGMFGTINKGLDRARDGYVGLTGRINRRAV 536
LS + AL L+PALCA LLK + G FG N D + + Y G+I
Sbjct: 479 ALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTG 538

Query: 537 FSIAALLLVGLATWWGYSRLPTSFLPEEDQGYFFVSLQLPDGASLNRTQTVMDQMYQQVS 596
+ L+ + RLP+SFLPEEDQG F +QLP GA+ RTQ V+DQ+
Sbjct: 539 RYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYL 598

Query: 597 TNDA--VEDVIKITGFSLLSGNNAPNAGFAIVMLKPWGQRP----HIDRVLASIQANLAA 650
N+ VE V + GFS A NAG A V LKPW +R + V+ + L
Sbjct: 599 KNEKANVESVFTVNGFSF--SGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGK 656

Query: 651 IPSAMIMAVNPPAIAGLGSASGFDLRIQALLGQSPQELAQVSQGIIFAANQDP-TLSRVF 709
I ++ N PAI LG+A+GFD + G L Q ++ A Q P +L V
Sbjct: 657 IRDGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVR 716

Query: 710 TTFSASVPETNLSIDRDRAALLQVPVSRIFQTLQTSLGGMNAGDFTLNNRMFRVQLQNDM 769
+ L +D+++A L V +S I QT+ T+LGG DF R+ ++ +Q D
Sbjct: 717 PNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADA 776

Query: 770 NFRQRTAQINNLNVRSDNGALVSLANLVTLTPSVGAPFISNFNQFPSVAISGSAADGASS 829
FR ++ L VRS NG +V + T G+P + +N PS+ I G AA G SS
Sbjct: 777 KFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSS 836

Query: 830 GQAMAAMEALLAQNLPQGYSYSWSGMSWQEQQTGGQVAFIYLAALVFAYLFLVAQYESWS 889
G AMA ME L ++ LP G Y W+GMS+QE+ +G Q + + V +L L A YESWS
Sbjct: 837 GDAMALMENLASK-LPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWS 895

Query: 890 IPLVVVLSVVFAVGGAVAGLSAMGFANDVYAQIGLVLLIGLAAKNAILIVEFSK-ARREE 948
IP+ V+L V + G + + NDVY +GL+ IGL+AKNAILIVEF+K +E
Sbjct: 896 IPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKE 955

Query: 949 GASIAEAAQDGAKQRFRAVMMTAISFILGVMPLVFASGAGAMSRQIIGITVFGGMLMATA 1008
G + EA + R R ++MT+++FILGV+PL ++GAG+ ++ +GI V GGM+ AT
Sbjct: 956 GKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATL 1015

Query: 1009 VGILFIPALYLHIQRL 1024
+ I F+P ++ I+R
Sbjct: 1016 LAIFFVPVFFVVIRRC 1031



Score = 83.0 bits (205), Expect = 3e-18
Identities = 105/513 (20%), Positives = 193/513 (37%), Gaps = 37/513 (7%)

Query: 533 RRAVFSIAALLLVGLATWWGYSRLPTSFLPEEDQGYFFVSLQLPDGASLNRTQTVMDQMY 592
RR +F+ +++ +A +LP + P VS P GA QTV D +
Sbjct: 7 RRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYP-GAD---AQTVQDTVT 62

Query: 593 QQVSTNDAVEDVIKITGFSLLSGNNAPNAGFAIVMLKPWGQRPHIDRVLASIQANLAAIP 652
Q + N + I +S + I + G P I +V +Q L
Sbjct: 63 QVIEQN-----MNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQ--VQNKLQLAT 115

Query: 653 SAMIMAVNPPAIAGLGSASGFDLRIQALLGQSPQELAQVSQGIIFAANQDPTLSRV---- 708
+ V I+ S+S + + A Q A+N TLSR+
Sbjct: 116 PLLPQEVQQQGISVEKSSSSY--LMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVG 173

Query: 709 -FTTFSASVPETNLSIDRDRAALLQVPVSRIF-----QTLQTSLGGMNAGDFTL-NNRMF 761
F A + +D D ++ + Q Q + G +
Sbjct: 174 DVQLFGAQY-AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNA 232

Query: 762 RVQLQNDMNFRQRTAQINNLNVRSD-NGALVSLANLVTLTPSVGA-PFISNFNQFPSVAI 819
+ Q + + + +R + +G++V L ++ + I+ N P+ +
Sbjct: 233 SIIAQTRF---KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGL 289

Query: 820 SGSAADGASSGQAMAAMEALLAQ---NLPQGYSYSW-SGMSWQEQQTGGQVAFIYLAALV 875
A GA++ A++A LA+ PQG + + Q + +V A++
Sbjct: 290 GIKLATGANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIM 349

Query: 876 FAYLFLVAQYESWSIPLVVVLSVVFAVGGAVAGLSAMGFANDVYAQIGLVLLIGLAAKNA 935
+L + ++ L+ ++V + G A L+A G++ + G+VL IGL +A
Sbjct: 350 LVFLVMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDA 409

Query: 936 ILIVE-FSKARREEGASIAEAAQDGAKQRFRAVMMTAISFILGVMPLVFASG-AGAMSRQ 993
I++VE + E+ EA + Q A++ A+ +P+ F G GA+ RQ
Sbjct: 410 IVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQ 469

Query: 994 IIGITVFGGMLMATAVGILFIPALYLHIQRLRE 1026
IT+ M ++ V ++ PAL + +
Sbjct: 470 F-SITIVSAMALSVLVALILTPALCATLLKPVS 501


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
YpsIP31758_0673    PRTACTNFAMLY    80    2e-16    Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 79.7 bits (196), Expect = 2e-16
Identities = 175/846 (20%), Positives = 282/846 (33%), Gaps = 122/846 (14%)

Query: 3608 ATLANNGTQSNDLSAQITGSGDLAFASANDGSTAS-----LSNSTNSYTGTTWVSSGNLR 3662
ATLAN G +D + +G+ A AS D + + N + + G L
Sbjct: 125 ATLANVGDTWDDDGIALYVAGEQAQASIADSTLQGAGGVQIERGANVTVQRSAIVDGGLH 184

Query: 3663 LDADSALGQTSL------LAMSTATHVDINGTQQVVGELATEGGSTLDLNDGKLTVTGGG 3716
+ A +L L L + T V +G V + G S L L+ G +T GG
Sbjct: 185 IGALQSLQPEDLPPSRVVLRDTNVTAVPASGAPAAV---SVLGASELTLDGGHIT---GG 238

Query: 3717 QIDGALTGSGELVLSGGLLNVSYDNAGFTGSTDIANGAVAHLSQAQGLGNGTINNNGTLH 3776
+ G G +V L + + GAV + G G G
Sbjct: 239 RAAGVAAMQGAVV---HLQRATIRRGDAPAGGAVPGGAVPGGAVPGGFGPGGFG------ 289

Query: 3777 LDNTIGTLFNALTGSDGEVLLSNNASVQLAGDNSGYSGLFTNQAGSILIANSAEHLGGSS 3836
+ + + S V L A + G + A GGS
Sbjct: 290 ---PVLDGWYGVDVSGSSVEL---AQSIVEAPELGAAIRVGRGA-------RVTVSGGSL 336

Query: 3837 IANSGALILDTGSVWEL--TNTISGTGTLVKRGSGTVKIEGDTVSAGLTTIEEGLLQLGS 3894
A G +I G+ +S T G + T+ G G
Sbjct: 337 SAPHGNVIETGGARRFAPQAAPLSITLQAGAHAQGKALLYRVLPEPVKLTLTGGADAQGD 396

Query: 3895 SAVTQTLSLEESLQERALRVSFASNMANLTSNVLITANGSLGGYGQVTGN-------VEN 3947
T+ L L V+ AS + + + +T N + +
Sbjct: 397 IVATE-LPSIPGTSIGPLDVALASQARWTGATRAVDSLSIDNATWVMTDNSNVGALRLAS 455

Query: 3948 HGNLIMPNALTGGDFGTFTIDGNYTGDEGMITFNTILAGDTSVTDRLVITGDTAGQSYVT 4007
G++ G F T+ N G+ N D ++D+LV+ D +GQ +
Sbjct: 456 DGSVDFQQPAEAGRFKVLTV--NTLAGSGLFRMNVFA--DLGLSDKLVVMQDASGQHRLW 511

Query: 4008 VNNIGGVGARTFEGIKIIDVGGDSAGQFTL---NGRAVGGAYEYFLYQGG---------- 4054
V N G + + ++ SA FTL +G+ G Y Y L G
Sbjct: 512 VRNSGS-EPASANTLLLVQTPLGSAATFTLANKDGKVDIGTYRYRLAANGNGQWSLVGAK 570

Query: 4055 -------ASTPDDGDWYLRTQADDRRPEPASYTANLAAANNMFVTS-------------- 4093
A P + L+AA N V +
Sbjct: 571 APPAPKPAPQPGPQPPQPPQPQPEAPAPQPPAGRELSAAANAAVNTGGVGLASTLWYAES 630

Query: 4094 --LSDRMGETLYTDVFTGEQKTTSLWLRNEGSHNRSRDDSGELHTQDNR-YVMQLGGDVA 4150
LS R+GE G W R G R + D+ D + +LG D A
Sbjct: 631 NALSKRLGELRLNPDAGG------AWGR--GFAQRQQLDNRAGRRFDQKVAGFELGADHA 682

Query: 4151 QWSRNAQDLWRVGVMAGYANSSSSTVAKVAGYRSTGSVDGYSVGIYGSWLADNADDTGAY 4210
A W +G +AGY G D VG Y +++AD+ G Y
Sbjct: 683 --VAVAGGRWHLGGLAGYTRGDRGFTGD-----GGGHTDSVHVGGYATYIADS----GFY 731

Query: 4211 VDSWVQYSWFDN--NVSGQDLAA--EKYDSKGFTASVEGGYAFKVGESVNQSYFIQPKAQ 4266
+D+ ++ S +N V+G D A KY + G AS+E G F + +F++P+A+
Sbjct: 732 LDATLRASRLENDFKVAGSDGYAVKGKYRTHGVGASLEAGRRF----THADGWFLEPQAE 787

Query: 4267 MVWMGVKADDHTETNGTVISGDGNGNIQTRLGAKAFINPSDKAKASGPAFKPFVEANWIH 4326
+ + NG + +G ++ RLG + A G +P+++A+ +
Sbjct: 788 LAVFRAGGGAYRAANGLRVRDEGGSSVLGRLGLEV---GKRIELAGGRQVQPYIKASVLQ 844

Query: 4327 NTKDFGTT-LDGVTVKQAGTANIAELKLGVDGQINNQLNLWGNIGQQVGNKGYSETSVVL 4385
GT +G+ + AEL LG+ + +L+ + G K +
Sbjct: 845 EFDGAGTVHTNGIAHRTELRGTRAELGLGMAAALGRGHSLYASYEYSKGPKLAMPWTFHA 904

Query: 4386 GVKYNF 4391
G +Y++
Sbjct: 905 GYRYSW 910


7        YpsIP31758_0690        YpsIP31758_0807        Y        Y        Y        Pathogenicity Island (biased-composition)
LocusTag    DNBias    CDNBias    %GCBias    Product
YpsIP31758_0690    2    16    2.381641    ABC transporter ATP-binding protein/permease
YpsIP31758_0691    2    18    3.320710    hypothetical protein
YpsIP31758_0692    1    16    2.803974    hypothetical protein
YpsIP31758_0693    1    19    3.972068    fructuronate transporter
YpsIP31758_0694    1    18    3.574831    autotransporter protein
YpsIP31758_0695    0    16    2.387220    autotransporter protein
YpsIP31758_0696    1    15    0.788097    VgrG protein
YpsIP31758_0697    2    16    -0.717939    hypothetical protein
YpsIP31758_0698    1    17    -1.163355    YD repeat-/RHS repeat-containing protein
YpsIP31758_0699    10    37    -14.243485    hypothetical protein
YpsIP31758_0701    2    22    -7.402505    hypothetical protein
YpsIP31758_0702    1    17    -4.969586    hypothetical protein
YpsIP31758_0703    1    15    -3.694869    hypothetical protein
YpsIP31758_0704    0    11    -2.600869    hypothetical protein
YpsIP31758_0705    -2    10    -0.723083    hypothetical protein
YpsIP31758_0706    -2    13    0.184950    divalent anion:Na+ symporter (DASS) family
YpsIP31758_0707    -2    16    0.850504    LacI family sugar-binding transcriptional
YpsIP31758_0708    -2    18    1.830417    hypothetical protein
YpsIP31758_0709    -1    21    2.777757    sugar (glycoside-Pentoside-hexuronide)
YpsIP31758_0710    1    23    4.348975    TonB-dependent siderophore receptor
YpsIP31758_0711    0    18    3.609208    aerobactin siderophore biosynthesis protein
YpsIP31758_0712    0    15    2.677426    aerobactin siderophore biosynthesis protein
YpsIP31758_0713    -1    13    1.555841    aerobactin siderophore biosynthesis protein
YpsIP31758_0714    0    14    -0.652727    aerobactin siderophore biosynthesis protein
YpsIP31758_0715    0    16    -3.385038    major facilitator transporter
YpsIP31758_0716    2    19    -5.741727    dioxygenase family protein
YpsIP31758_0717    1    19    -4.980375    hypothetical protein
YpsIP31758_0718    3    21    -4.690457    hypothetical protein
YpsIP31758_0720    5    25    -6.846219    LuxR family transcriptional regulator
YpsIP31758_0719    3    25    -5.719143    N-acylhomoserine lactone synthase YtbI
YpsIP31758_0721    2    21    -3.173698    CAAX amino terminal protease family protein
YpsIP31758_0722    0    26    1.187776    hypothetical protein
YpsIP31758_0723    -1    28    4.995758    lipoprotein
YpsIP31758_0724    0    35    7.503402    IS1, transposase orfA
YpsIP31758_0725    1    37    9.714195    hypothetical protein
YpsIP31758_0726    2    38    9.470725    hypothetical protein
YpsIP31758_0727    2    40    10.243476    hypothetical protein
YpsIP31758_0728    2    39    10.274446    hypothetical protein
YpsIP31758_0729    0    28    6.664823    hypothetical protein
YpsIP31758_0730    -1    27    6.160417    OmpA domain-containing protein
YpsIP31758_0731    -2    20    2.366288    hypothetical protein
YpsIP31758_0732    -2    19    1.086290    type VI secretion ATPase
YpsIP31758_0733    -2    17    -1.933345    type VI secretion system family protein
YpsIP31758_0734    0    21    -5.574760    M23 peptidase domain-containing protein
YpsIP31758_0735    -2    17    -5.452728    IS1 family transposase orfA
YpsIP31758_0738    -2    18    -5.666310    M23 peptidase domain-containing protein
YpsIP31758_0739    0    24    -6.507454    hypothetical protein
YpsIP31758_0740    1    21    -1.323116    IS1 family transposase orfA
YpsIP31758_0741    1    21    -0.772773    IS1 family transposase orfB
YpsIP31758_0743    1    23    0.150289    phage integrase family site specific
YpsIP31758_0744    1    24    1.043985    DNA-binding protein
YpsIP31758_0745    2    25    1.455884    hypothetical protein
YpsIP31758_0746    3    25    3.848576    hypothetical protein
YpsIP31758_0747    4    25    5.123419    hypothetical protein
YpsIP31758_0748    5    27    5.189079    hypothetical protein
YpsIP31758_0749    4    26    5.029440    hypothetical protein
YpsIP31758_0751    2    24    3.913432    hypothetical protein
YpsIP31758_0750    1    24    4.311973    RadC family DNA repair protein
YpsIP31758_0752    1    24    3.194440    antirestriction ArdB family protein
YpsIP31758_0753    2    25    3.975077    hypothetical protein
YpsIP31758_0754    3    24    3.787837    hypothetical protein
YpsIP31758_0755    2    26    2.133269    hypothetical protein
YpsIP31758_0756    2    26    1.986914    hypothetical protein
YpsIP31758_0757    3    28    1.566947    hypothetical protein
YpsIP31758_0758    2    29    1.682473    transcriptional regulator
YpsIP31758_0759    2    29    0.844916    GTPase
YpsIP31758_0760    0    29    -0.360750    hypothetical protein
YpsIP31758_0761    4    33    2.122707    hypothetical protein
YpsIP31758_0762    4    30    1.740534    hypothetical protein
YpsIP31758_0763    6    26    1.338820    AlpA family protein
YpsIP31758_0764    3    26    2.744119    hypothetical protein
YpsIP31758_0765    4    22    -4.066498    hypothetical protein
YpsIP31758_0766    5    24    -5.928478    hypothetical protein
YpsIP31758_0767    5    24    -7.886281    hypothetical protein
YpsIP31758_0768    5    25    -8.240164    hypothetical protein
YpsIP31758_0769    5    27    -9.327623    IS630 family transposase
YpsIP31758_0770    7    32    -11.246639    resolvase family site-specific recombinase
YpsIP31758_0771    3    27    -8.893894    ParB family protein
YpsIP31758_0772    1    17    -3.083398    RepB plasmid partitioning protein
YpsIP31758_0773    2    18    3.120337    hypothetical protein
YpsIP31758_0774    2    18    3.876053    hypothetical protein
YpsIP31758_0776    2    18    5.426850    PerC family transcriptional regulator
YpsIP31758_0777    3    19    5.317800    hypothetical protein
YpsIP31758_0778    3    20    5.946080    hypothetical protein
YpsIP31758_0779    1    21    6.130522    hypothetical protein
YpsIP31758_0780    -1    15    1.858944    hypothetical protein
YpsIP31758_0781    -1    15    0.720534    OmpA domain-containing protein
YpsIP31758_0782    -1    19    0.056124    hypothetical protein
YpsIP31758_0783    0    19    -0.623242    hypothetical protein
YpsIP31758_0785    0    18    -1.889110    Rhs element Vgr protein
YpsIP31758_0786    2    22    -5.814573    hypothetical protein
YpsIP31758_0787    2    20    -5.071037    lipoprotein
YpsIP31758_0788    1    18    -4.849184    Rhs element Vgr protein
YpsIP31758_0789    5    29    -7.373425    hypothetical protein
YpsIP31758_0790    -1    18    0.333548    hypothetical protein
YpsIP31758_0791    -1    20    3.261920    hypothetical protein
YpsIP31758_0792    0    20    3.240204    hypothetical protein
YpsIP31758_0793    0    19    3.502608    hypothetical protein
YpsIP31758_0794    0    19    3.364770    hypothetical protein
YpsIP31758_0795    -1    23    4.089243    hypothetical protein
YpsIP31758_0796    -1    21    2.642118    ImpA domain-containing protein
YpsIP31758_0797    0    19    0.572716    PAAR domain-containing protein
YpsIP31758_0798    -1    25    3.191160    hypothetical protein
YpsIP31758_0799    -1    28    5.589318    hypothetical protein
YpsIP31758_0800    -1    26    6.741696    hypothetical protein
YpsIP31758_0801    1    21    4.664189    hypothetical protein
YpsIP31758_0802    -1    16    -0.074373    hypothetical protein
YpsIP31758_0803    1    21    -2.515699    hypothetical protein
YpsIP31758_0804    1    20    -2.274516    hypothetical protein
YpsIP31758_0805    1    20    -2.276035    ImpA domain-containing protein
YpsIP31758_0806    0    19    -3.711501    hypothetical protein
YpsIP31758_0807    0    18    -3.572815    hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
YpsIP31758_0690    ACRIFLAVINRP    30    0.026    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 30.2 bits (68), Expect = 0.026
Identities = 9/39 (23%), Positives = 22/39 (56%)

Query: 138 IIATASVLCFFSLGLLLKDWRMALAMLSTLPLAVCAYIL 176
++A + V+ F L L + W + ++++ +PL + +L
Sbjct: 875 LVAISFVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLL 913


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
YpsIP31758_0694    PERTACTIN    61    2e-11    Pertactin signature.
		>PERTACTIN#Pertactin signature.

Length = 922

Score = 60.9 bits (147), Expect = 2e-11
Identities = 104/458 (22%), Positives = 159/458 (34%), Gaps = 66/458 (14%)

Query: 820 SGEGAWLTLDTVLGD---------DDSATDRLVINGDATGTTSVRVNNAGGLGDKTLNGI 870
+G L +DT+ G D +D+LV+ DA+G + V N+G + N +
Sbjct: 463 AGRFKVLMVDTLAGSGLFRMNVFADLGLSDKLVVMRDASGQHRLWVRNSGSEPA-SGNTM 521

Query: 871 NLITVDGLAQDDTFLLAGDYVTTDGYQAVVGGAYAYTLQADGEA--------ATAGRNWY 922
L+ + TF LA DG V G Y Y L A+G A
Sbjct: 522 LLVQTPRGSAA-TFTLA----NKDG--KVDIGTYRYRLAANGNGQWSLVGAKAPPAPKPA 574

Query: 923 LSSELMLTEGVRYQVGVPLYEQYPQVLAALNTLPTLQQRVGNRYGAPGALA----DLNFD 978
P Q PQ P Q G A A +
Sbjct: 575 PQPGPQPGPQPPQPPQPPQPPQPPQPPQRQPEAPAPQPPAGRELSAAANAAVNTGGVGLA 634

Query: 979 DNQW----------------------AWGRIEGSHQVTDPARSTSGSQREIDVWKLQTGI 1016
W AWGR Q D + +G + + V + G
Sbjct: 635 STLWYAESNALSKRLGELRLNPDAGGAWGRGFAQRQQLD---NRAGRRFDQKVAGFELGA 691

Query: 1017 DVPLYQSQDGSLLTGGVNFSYGKAKADIHSFFGDGRINSAGYGLGTSLTWYGNNGVYVDG 1076
D + + L G ++ G F GDG ++ +G T+ N+G Y+D
Sbjct: 692 DHAVAVAGGRWHLGGLAGYTRGD-----RGFTGDGGGHTDSVHVGGYATYIANSGFYLDA 746

Query: 1077 QLQTMWFDSDLS-SRTAGHAVASGNNGRGYTSAIEAGKGYALGNGLSLTPQMQVTYSRVD 1135
L+ ++D + + G+AV G ++EAG+ +A +G L PQ ++ RV
Sbjct: 747 TLRASRLENDFKVAGSDGYAVKGKYRTHGVGVSLEAGRRFAHADGWFLEPQAELAVFRVG 806

Query: 1136 FDTFRDPFDSEVSLQEGDSLRGRLGVSLDKETTWSAKDGTTRRSHIYSHFDLHNEFLNGS 1195
+R V + G S+ GRLG+ + K R+ Y + EF
Sbjct: 807 GGAYRAANGLRVRDEGGSSVLGRLGLEVGKRIEL----AGGRQVQPYIKASVLQEFDGAG 862

Query: 1196 KVQVSGVEFAT--RDERQSVGLGAGGTYEWQNGRYAVY 1231
V+ +G+ T R R +GLG + YA Y
Sbjct: 863 TVRTNGIAHRTELRGTRAELGLGMAAALGRGHSLYASY 900


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
YpsIP31758_0695    PERTACTIN    67    3e-13    Pertactin signature.
		>PERTACTIN#Pertactin signature.

Length = 922

Score = 67.4 bits (164), Expect = 3e-13
Identities = 101/434 (23%), Positives = 152/434 (35%), Gaps = 57/434 (13%)

Query: 835 DDSATDRLVINGDATGTTSVRVNNAGGLGDKTLNGINLITVDGLAQDDTFLLAGDYVTTD 894
D +D+LV+ DA+G + V N+G + N + L+ + TF LA D
Sbjct: 487 DLGLSDKLVVMRDASGQHRLWVRNSGSEPA-SGNTMLLVQTPRGSAA-TFTLA----NKD 540

Query: 895 GYQAVVAGAYAYTLQADGEA--------ATAGRNWYLSSELMLTEGVRYQVGVPLYEQYP 946
G V G Y Y L A+G A P Q P
Sbjct: 541 G--KVDIGTYRYRLAANGNGQWSLVGAKAPPAPKPAPQPGPQPGPQPPQPPQPPQPPQPP 598

Query: 947 QVLAALNTLPTLQQRVGNRYGAPGALA----DLNFDDNQW-------------------- 982
Q P Q G A A + W
Sbjct: 599 QPPQRQPEAPAPQPPAGRELSAAANAAVNTGGVGLASTLWYAESNALSKRLGELRLNPDA 658

Query: 983 --AWGRIEGSHQVTDPARSTSGSQREIDVWKLQTGIDVPLYQSQGGSLLTGGVNFTYGKA 1040
AWGR Q D + +G + + V + G D + + G L G +T G
Sbjct: 659 GGAWGRGFAQRQQLD---NRAGRRFDQKVAGFELGADHAVAVAGGRWHLGGLAGYTRGD- 714

Query: 1041 KADIHSFFGDGRINSAGYGLGTSLTWYGNNGVYVDGQLQTMWFDSDLS-SRTAGHAVASG 1099
F GDG ++ +G T+ N+G Y+D L+ ++D + + G+AV
Sbjct: 715 ----RGFTGDGGGHTDSVHVGGYATYIANSGFYLDATLRASRLENDFKVAGSDGYAVKGK 770

Query: 1100 NNGRGYTSAIEAGKGYALGNGLSLTPQMQVTYSRVDFDTFRDPFDSEVSLQEGDSLRGRL 1159
G ++EAG+ +A +G L PQ ++ RV +R V + G S+ GRL
Sbjct: 771 YRTHGVGVSLEAGRRFAHADGWFLEPQAELAVFRVGGGAYRAANGLRVRDEGGSSVLGRL 830

Query: 1160 GVSLDKETTWSAKDGTTRRSHIYSHLDLHNEFLNGSKVQVSGVEFAT--RDERQSVGLGA 1217
G+ + K R+ Y + EF V+ +G+ T R R +GLG
Sbjct: 831 GLEVGKRIEL----AGGRQVQPYIKASVLQEFDGAGTVRTNGIAHRTELRGTRAELGLGM 886

Query: 1218 GGTYEWQNGRYAVY 1231
+ YA Y
Sbjct: 887 AAALGRGHSLYASY 900


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
YpsIP31758_0696    ICENUCLEATIN    37    4e-04    Ice nucleation protein signature.
		>ICENUCLEATIN#Ice nucleation protein signature.

Length = 1258

Score = 36.7 bits (84), Expect = 4e-04
Identities = 43/189 (22%), Positives = 65/189 (34%), Gaps = 1/189 (0%)

Query: 532 NTTVLNDRSTTVSGNHTETVTKDQAVTVSGNQTMDITQDQTITVTGTQRIDVTQDRIIDV 591
+T ++S +G + + + ++G + +I G Q+R
Sbjct: 758 STQTAREQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYHSILTAGYGSTQTAQERSDLT 817

Query: 592 TAEQQTTVKADDRLLISGKQKTKIDLDQEYEVVGSQKKTIGANQTLKVGGYQKNTLEGYK 651
T T+ D LI+G T+ G + GY + GY
Sbjct: 818 TGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYD 877

Query: 652 KSKIGG-DNTTTVGGHDKLTVGDTITITAGTSITLQCGASSIVMDEAGNIKITGVNITST 710
S I G +T T G + LT G T TA + L G S + I G T T
Sbjct: 878 SSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYESSLIAGYGSTQT 937

Query: 711 ASTTHTIKA 719
AS T+ A
Sbjct: 938 ASFKSTLMA 946



Score = 34.7 bits (79), Expect = 0.002
Identities = 32/181 (17%), Positives = 67/181 (37%), Gaps = 9/181 (4%)

Query: 532 NTTVLNDRSTTVSGNHTETVTKDQAVTVSGNQTMDITQDQTITVTGTQRIDVTQDRIIDV 591
+T + S +G + + ++ ++G + ++ + G +++
Sbjct: 902 STQTAQENSDLTTGYGSTSTAGYESSLIAGYGSTQTASFKSTLMAGYGSSQTAREQSSLT 961

Query: 592 TAEQQTTVKADDRLLISGKQKTKIDLDQEYEVVGSQKKTIGANQTLKVGGYQKNTLEGYK 651
T++ D LI+G T+ G Q + + + GY
Sbjct: 962 AGYGSTSMAGYDSSLIAGYGSTQT--------AGYQSTLTAGYGSTQTAEHSSTLTAGYG 1013

Query: 652 KSKIGGDNTTTVGGH-DKLTVGDTITITAGTSITLQCGASSIVMDEAGNIKITGVNITST 710
+ G +++ + G+ LT G +TAG TL G S++ G+ I+G + T
Sbjct: 1014 STATAGADSSLIAGYGSSLTSGIRSFLTAGYGSTLISGLRSVLTAGYGSSLISGRRSSLT 1073

Query: 711 A 711
A
Sbjct: 1074 A 1074



Score = 34.0 bits (77), Expect = 0.003
Identities = 39/188 (20%), Positives = 59/188 (31%), Gaps = 15/188 (7%)

Query: 532 NTTVLNDRSTTVSGNHTETVTKDQAVTVSGNQTMDITQDQTITVTGTQRIDVTQDRIIDV 591
+T +RS +G + + + ++G + +I G Q+
Sbjct: 806 STQTAQERSDLTTGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLT 865

Query: 592 TAEQQTTVKADDRLLISGKQKTKIDLDQEYEVVGSQKKTIGANQTLKVGGYQKNTLEGYK 651
T T+ D LI+G T+ T G N L G T +
Sbjct: 866 TGYGSTSTAGYDSSLIAGYGSTQ---------------TAGYNSILTAGYGSTQTAQENS 910

Query: 652 KSKIGGDNTTTVGGHDKLTVGDTITITAGTSITLQCGASSIVMDEAGNIKITGVNITSTA 711
G +T+T G L G T TA TL G S + G TS A
Sbjct: 911 DLTTGYGSTSTAGYESSLIAGYGSTQTASFKSTLMAGYGSSQTAREQSSLTAGYGSTSMA 970

Query: 712 STTHTIKA 719
++ A
Sbjct: 971 GYDSSLIA 978



Score = 32.4 bits (73), Expect = 0.008
Identities = 37/133 (27%), Positives = 54/133 (40%), Gaps = 5/133 (3%)

Query: 606 LISGKQKTKIDLDQEYEVVGS-QKKTIGANQTLKVGGYQKNTLEGYKKSKIGGDNTTTVG 664
LI+G + T+I ++ + G +T G TL G K G D+T T G
Sbjct: 1088 LIAGPESTQITGNRSMLIAGKGSSQTAGYRSTLISGADSVQMAGERGKLIAGADSTQTAG 1147

Query: 665 GHDKLTVGDTITITAGTSITLQCGASSIVMDEAGNIKITGVNITSTASTTHTIKAKTVTS 724
KL G+ +TAG L G I+M + G+N TA ++K + S
Sbjct: 1148 DRSKLLAGNNSYLTAGDRSKLTAGNDCILMAGDRSKLTAGINSILTAGC----RSKLIGS 1203

Query: 725 CGDTENIVEGGIL 737
G T E +L
Sbjct: 1204 NGSTLTAGENSVL 1216


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0698  ACRIFLAVINRP  31     0.043    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 31.0 bits (70), Expect = 0.043
Identities = 12/37 (32%), Positives = 20/37 (54%), Gaps = 3/37 (8%)

Query: 218 KSLSLPTSVMLPIPMGRPVVVGGMPVLNLLALMMGLF 254
+S S+P SVML +P+G +VG + L ++
Sbjct: 892 ESWSIPVSVMLVVPLG---IVGVLLAATLFNQKNDVY 925


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits         Score  E-value  Comments
YpsIP31758_0711  INVEPROTEIN  29     0.046    Salmonella/Shigella invasion protein E (InvE) signat...
		>INVEPROTEIN#Salmonella/Shigella invasion protein E (InvE) signature.

Length = 372

Score = 28.9 bits (64), Expect = 0.046
Identities = 13/44 (29%), Positives = 25/44 (56%)

Query: 221 NALDEAAFANEYFMPEYVESFYTLNDSAKQHMLAEQRMTSDGIT 264
A+ + F EY+ E + + ++ D A +H +AEQR T + ++
Sbjct: 329 KAIPSSLFYEEYWQEELLMALRSMTDIAYKHEMAEQRRTIEKLS 372


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
YpsIP31758_0712  PF04183  736    0.0      IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 736 bits (1902), Expect = 0.0
Identities = 381/576 (66%), Positives = 447/576 (77%), Gaps = 1/576 (0%)

Query: 5 NYANWQQVNRHMIAKILSELEYERTLHAELHGETG-RITLPGAVYTFNGKRGIWGWLHID 63
N+ +W VNR ++AK+LSELEYE+ HAE G+ I LPGA + F +RGIWGWL ID
Sbjct: 2 NHKDWDLVNRRLVAKMLSELEYEQVFHAESQGDDRYCINLPGAQWRFIAERGIWGWLWID 61

Query: 64 PATLRCEGVPLAADHMLRQLALVLKMDDSQVAEHLEDLYATLRGDMQLLSARHGMSAEAL 123
TLRC P+ A +L QL VL M D+ VAEH++DLYATL GD+QLL AR G+SA L
Sbjct: 62 AQTLRCADEPVLAQTLLMQLKQVLSMSDATVAEHMQDLYATLLGDLQLLKARRGLSASDL 121

Query: 124 IALNDDALQCLLAGHPKFIFNKGRRGWGLTALQHYAPEYQGQFRLHWVAAKRGSFIWCVD 183
I LN D LQCLL+GHPKF+FNKGRRGWG AL+ YAPEY FRLHW+A KR IW D
Sbjct: 122 INLNADRLQCLLSGHPKFVFNKGRRGWGKEALERYAPEYANTFRLHWLAVKREHMIWRCD 181

Query: 184 AEYPLDNLLNSAMDPAERQRFDRRWRECQLNDDWVPVPLHPWQWQQKIALHFLPQLAEGE 243
E + LL +AMDP E RF + W+E L+ +W+P+P+HPWQWQQKIA F+ AEG
Sbjct: 182 NEMDIHQLLTAAMDPQEFARFSQVWQENGLDHNWLPLPVHPWQWQQKIATDFIADFAEGR 241

Query: 244 LIELGEFGDHYLAQQSLRTLTNVSRRVPFDIKLPLTIYNTSCYRGIPGKYISAGPAASRW 303
++ LGEFGD +LAQQSLRTLTN SRR DIKLPLTIYNTSCYRGIPG+YI+AGP ASRW
Sbjct: 242 MVSLGEFGDQWLAQQSLRTLTNASRRGGLDIKLPLTIYNTSCYRGIPGRYIAAGPLASRW 301

Query: 304 LQQVFAQDRTLHESGAEILGEPAAGYMLHQTYATLAKAPYRCQEMLGVIWRENPSCYLRE 363
LQQVFA D TL +SGA ILGEPAAGY+ H+ YA LA+APYR QEMLGVIWRENP +L+
Sbjct: 302 LQQVFATDATLVQSGAVILGEPAAGYVSHEGYAALARAPYRYQEMLGVIWRENPCRWLKP 361

Query: 364 GEHAILMATLMETNNQGHPLIAAYIARSGLSAEAWLEQMFRVVVVPMYHLMCCYGVALIA 423
E +LMATLME + PL AYI RSGL AE WL Q+FRVVVVP+YHL+C YGVALIA
Sbjct: 362 DESPVLMATLMECDENNQPLAGAYIDRSGLDAETWLTQLFRVVVVPLYHLLCRYGVALIA 421

Query: 424 HGQNITLVMKDHAPQRILLKDFQGDMRLVDKDFPQAASLPNVVKEVTVRLSADYLIHDLQ 483
HGQNITL MK+ PQR+LLKDFQGDMRLV ++FP+ SLP V++VT RLSADYLIHDLQ
Sbjct: 422 HGQNITLAMKEGVPQRVLLKDFQGDMRLVKEEFPEMDSLPQEVRDVTSRLSADYLIHDLQ 481

Query: 484 TGHFVTVLRFISPLMQACNLSEYRFYQLLAQVLERYMAQHPDLADRFTLFNLFKPQIIRV 543
TGHFVTVLRFISPLM + E RFYQLLA VL YM +HP +++RF LF+LF+PQIIRV
Sbjct: 482 TGHFVTVLRFISPLMVRLGVPERRFYQLLAAVLSDYMKKHPQMSERFALFSLFRPQIIRV 541

Query: 544 VLNPVKLTYSEQDGGSRMLPDYLQDLDNPLYLVTKE 579
VLNPVKLT+ + DGGSRMLP+YL+DL NPL+LVT+E
Sbjct: 542 VLNPVKLTWPDLDGGSRMLPNYLEDLQNPLWLVTQE 577


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
YpsIP31758_0714  PF04183  320    e-104    IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 320 bits (821), Expect = e-104
Identities = 101/457 (22%), Positives = 170/457 (37%), Gaps = 37/457 (8%)

Query: 62 TQHHHYLFPAYLHQQGNDRQDDDTPVKLGIEQLVTLLLEKPTVKGELSDDVVARFRQRVL 121
+ F A G D T L LL + +SD VA Q +
Sbjct: 41 LPGAQWRFIAERGIWGWLWIDAQTLRCADEPVLAQTLLMQLKQVLSMSDATVAEHMQDLY 100

Query: 122 ESHDNTQQAINIRLDWPSLRDKPLNFAQAEQGLLAGHAFHPAPKSHQPFNEKQAQRYLPD 181
+ Q + R + LN Q LL+GH K + + ++ +RY P+
Sbjct: 101 ATLLGDLQLLKARRGLSASDLINLNA-DRLQCLLSGHPKFVFNKGRRGWGKEALERYAPE 159

Query: 182 FASRFPLRWFAVDKRYLCGDSLKLTLQHRLQRFASESAPQLLAYFT--------DDVW-L 232
+A+ F L W AV + ++ H Q + PQ A F+ D W
Sbjct: 160 YANTFRLHWLAVKREHMIWRCDNEMDIH--QLLTAAMDPQEFARFSQVWQENGLDHNWLP 217

Query: 233 LPMHPWQADHLLKQDWCQQLVQQNALHDLGEAGERWLPTSSSRSLYSPSNRD--MVKFSL 290
LP+HPWQ + D+ + + LGE G++WL S R+L + S R +K L
Sbjct: 218 LPVHPWQWQQKIATDFIADFAEGR-MVSLGEFGDQWLAQQSLRTLTNASRRGGLDIKLPL 276

Query: 291 SVRLTNSVRTLSVKEAKRGMRLARLAQTPRWQELQARY--------PTFRVMQEDGWAGL 342
++ T+ R + + G +R Q + P + +G+A L
Sbjct: 277 TIYNTSCYRGIPGRYIAAGPLASRWLQQVFATDATLVQSGAVILGEPAAGYVSHEGYAAL 336

Query: 343 RSADFTLQEESLLVLRDNLLFSQPDSQTNVLVTLTQAAPDGGDSLLASAVRRLAARLNLP 402
A + QE ++ R+N ++ VL+ + L + + R
Sbjct: 337 ARAPYRYQEMLGVIWRENPCRWLKPDESPVLMATLMECDENNQPLAGAYIDRSG------ 390

Query: 403 LQQAAFCWLDAYCQHVLLPLFSTEADYGLVLLAHQQNILVEMQQDLPVGMLYRDCQGSGF 462
A WL + V++PL+ YG+ L+AH QNI + M++ +P +L +D QG
Sbjct: 391 --LDAETWLTQLFRVVVVPLYHLLCRYGVALIAHGQNITLAMKEGVPQRVLLKDFQGD-- 446

Query: 463 TQSALPWLAEIGEAEAENSFSEQQLLRYFPYYLLVNS 499
+ E+ E + + L++
Sbjct: 447 MRLVKEEFPEMDSLPQE----VRDVTSRLSADYLIHD 479


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
YpsIP31758_0715  TCRTETA  43     2e-06    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 42.5 bits (100), Expect = 2e-06
Identities = 42/180 (23%), Positives = 73/180 (40%), Gaps = 16/180 (8%)

Query: 24 FCVGLLGIGQNGLLVVLPVLVSRTHLSLSVWAG---LLTLGSMLFLVGSAWWGRQSEIRG 80
V L +G ++ VLP L+ S V A LL L +++ + G S+ G
Sbjct: 12 STVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFG 71

Query: 81 CKFVVIMALAGYLLSFVLLALAVWGLSAGWLSEMAGLGWLIVARIIYGLTVSGMVPASQT 140
+ V++++LAG + + ++A A W+ L + RI+ G+T + A
Sbjct: 72 RRPVLLVSLAGAAVDYAIMATA----PFLWV--------LYIGRIVAGITGATGAVAGAY 119

Query: 141 WALQRAGYEQRMAALATISSGLSCGRLLGPLCAALALSIHPIAPLWLMAITPLIALLVVY 200
A G ++R +S+ G + GP+ L P AP + A + L
Sbjct: 120 IADITDG-DERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGC 178


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0719  AUTOINDCRSYN  320    e-114    Autoinducer synthesis protein signature.
		>AUTOINDCRSYN#Autoinducer synthesis protein signature.

Length = 216

Score = 320 bits (821), Expect = e-114
Identities = 114/216 (52%), Positives = 154/216 (71%)

Query: 1 MLEIFDVRYDELTDIRSEDLYKLRKKTFKDRLNWEVNCSNGMEFDEYDNSDTRYLLGIYQ 60
MLEIFDV + L++ +S +L+ LRK+TFKDRLNW V C++GMEFD+YDN++T YL GI
Sbjct: 1 MLEIFDVNHTLLSETKSGELFTLRKETFKDRLNWAVQCTDGMEFDQYDNNNTTYLFGIKD 60

Query: 61 GQLICSVRFIELHLPNMITHTFNALFDDVALPKRGYIESSRFFVDKTRAKLLFGNHYPIS 120
+ICS+RFIE PNMIT TF F ++ +P+ Y+ESSRFFVDK+RAK + GN YPIS
Sbjct: 61 NTVICSLRFIETKYPNMITGTFFPYFKEINIPEGNYLESSRFFVDKSRAKDILGNEYPIS 120

Query: 121 YLFFLSIINYSRHNGYTGIYTIVSRAMLTILKRSGWQVEVIKEAHITEKERIYLLHLPID 180
+ FLS+INYS+ GY GIYTIVS MLTILKRSGW + V+++ ++ER+YL+ LP+D
Sbjct: 121 SMLFLSMINYSKDKGYDGIYTIVSHPMLTILKRSGWGIRVVEQGLSEKEERVYLVFLPVD 180

Query: 181 RDNQARLLLQVNQRLQDPCSVLSTWPISLPVMPESA 216
+NQ L ++N+ + L WP+ +P A
Sbjct: 181 DENQEALARRINRSGTFMSNELKQWPLRVPAAIAQA 216


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits        Score  E-value  Comments
YpsIP31758_0730  OMPADOMAIN  84     9e-20    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 84.2 bits (208), Expect = 9e-20
Identities = 41/146 (28%), Positives = 60/146 (41%), Gaps = 14/146 (9%)

Query: 426 PPPPPPPAPPAPKTVRLDSLSLFDVGKFTLNAGSTKML---VTALIDIKAKPGWLIVVAG 482
P P P K L S LF+ K TL L + L ++ K G +VV G
Sbjct: 201 APAPAPAPEVQTKHFTLKSDVLFNFNKATLKPEGQAALDQLYSQLSNLDPKDG-SVVVLG 259

Query: 483 HTDITGDAQANHILSLKRAEALRDWMLSTSDVSPTCFAVQGYGATRPIADNDT------- 535
+TD G N LS +RA+++ D+ L + + + +G G + P+ N
Sbjct: 260 YTDRIGSDAYNQGLSERRAQSVVDY-LISKGIPADKISARGMGESNPVTGNTCDNVKQRA 318

Query: 536 --PDGRALNRRVEISLVPQADACQVP 559
D A +RRVEI + D P
Sbjct: 319 ALIDCLAPDRRVEIEVKGIKDVVTQP 344


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0732  DPTHRIATOXIN  30     0.040    Diphtheria toxin signature.
		>DPTHRIATOXIN#Diphtheria toxin signature.

Length = 567

Score = 30.1 bits (67), Expect = 0.040
Identities = 19/54 (35%), Positives = 24/54 (44%), Gaps = 9/54 (16%)

Query: 622 GIGKTETALALADSLFGGEKSLITINMSEYQEAHTVSQLKGSPPGYVGYGQGGV 675
GIG +A A AD + KS + N S Y G+ PGYV Q G+
Sbjct: 23 GIGAPPSAHAGADDVVDSSKSFVMENFSSYH---------GTKPGYVDSIQKGI 67


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits       Score  E-value  Comments
YpsIP31758_0741  MALTOSEBP  29     0.005    Maltose binding protein signature.
		>MALTOSEBP#Maltose binding protein signature.

Length = 396

Score = 28.9 bits (64), Expect = 0.005
Identities = 16/48 (33%), Positives = 23/48 (47%), Gaps = 8/48 (16%)

Query: 35 KKTFRQLLGLLSGFNIVFWCTDNFSAY-------EMLPDEKHIRSKLY 75
++ F Q+ G +I+FW D F Y E+ PD K + KLY
Sbjct: 70 EEKFPQVAATGDGPDIIFWAHDRFGGYAQSGLLAEITPD-KAFQDKLY 116


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits        Score  E-value  Comments
YpsIP31758_0781  OMPADOMAIN  86     2e-20    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 86.1 bits (213), Expect = 2e-20
Identities = 57/186 (30%), Positives = 84/186 (45%), Gaps = 21/186 (11%)

Query: 386 GDAEELDNEFRNGEPLRLGLGLYQGWRLRLPLLAAVKTYVPPPPAIDKETPTTVRLDSLS 445
GDA + NG L LG+ +R A V P PA + +T L S
Sbjct: 169 GDAHTIGTRPDNGM---LSLGV--SYRFGQGEAAPVVA-PAPAPAPEVQTKH-FTLKSDV 221

Query: 446 LFDVGKFQLKP---GSIKMLVDALMNIRAKPGWLIVVAGHTDITGDAQANQILSLKRAEA 502
LF+ K LKP ++ L L N+ K G +VV G+TD G NQ LS +RA++
Sbjct: 222 LFNFNKATLKPEGQAALDQLYSQLSNLDPKDG-SVVVLGYTDRIGSDAYNQGLSERRAQS 280

Query: 503 LRDWMLSTSDVSPTCFAVQGYGATRPVATN--ESAEGRAA-------NRRVEISLVPQAD 553
+ D+ L + + + +G G + PV N ++ + RAA +RRVEI + D
Sbjct: 281 VVDY-LISKGIPADKISARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEVKGIKD 339

Query: 554 ACQVPE 559
P+
Sbjct: 340 VVTQPQ 345


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits         Score  E-value  Comments
YpsIP31758_0791  TONBPROTEIN  33     8e-04    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 32.7 bits (74), Expect = 8e-04
Identities = 12/45 (26%), Positives = 17/45 (37%)

Query: 4 PVPPYPSKKDIELKRTPKSEVKPKVETQSKPEAKPKPAAQKSQKL 48
V P P + I V K + + KP+ KP Q+ K
Sbjct: 68 VVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKR 112


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits         Score  E-value  Comments
YpsIP31758_0798  MICOLLPTASE  30     0.003    Microbial collagenase metalloprotease (M9) signature.
		>MICOLLPTASE#Microbial collagenase metalloprotease (M9) signature.

Length = 1104

Score = 30.1 bits (67), Expect = 0.003
Identities = 16/69 (23%), Positives = 26/69 (37%)

Query: 35 YVYSSESTYGVEPNEKEVEEIIKMKPDVIDPGETLKLAPSILSLLKKNIRKDTGWRIGGR 94
Y+S G + + VE + ++ + E + +LS K NI K G
Sbjct: 199 IQYNSNFRLGTKAQDGVVEALGRLIGNASADPEVINNCIYVLSDFKDNIDKYGSNYSKGN 258

Query: 95 YSFNSVGGG 103
FN + G
Sbjct: 259 AVFNLMKGI 267


8  YpsIP31758_0819  YpsIP31758_0824  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias     Product
YpsIP31758_0819  3       26       -8.087609   adenine DNA glycosylase
YpsIP31758_0820  4       27       -8.156832   tRNA (guanine-N(7)-)-methyltransferase
YpsIP31758_0821  4       27       -8.193637   hypothetical protein
YpsIP31758_0822  4       27       -8.034963   glutaminase
YpsIP31758_0823  4       27       -8.346913   hypothetical protein
YpsIP31758_0824  4       26       -8.167421   phage minor structural protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits         Score  E-value  Comments
YpsIP31758_0822  BLACTAMASEA  30     0.011    Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 30.1 bits (68), Expect = 0.011
Identities = 15/65 (23%), Positives = 26/65 (40%), Gaps = 1/65 (1%)

Query: 22 GQGKVADYIPALAEVPANKLGI-AVCTLDGQIFQAGDADERFSIQSISKVLSLTLALSRY 80
+ + I + ++G+ + G+ A ADERF + S KV+ L+R
Sbjct: 21 ASPQPLEQIKLSESQLSGRVGMIEMDLASGRTLTAWRADERFPMMSTFKVVLCGAVLARV 80

Query: 81 SEQDI 85
D
Sbjct: 81 DAGDE 85


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits        Score  E-value  Comments
YpsIP31758_0824  CABNDNGRPT  52     3e-08    NodO calcium binding signature.
		>CABNDNGRPT#NodO calcium binding signature.

Length = 479

Score = 52.3 bits (125), Expect = 3e-08
Identities = 39/162 (24%), Positives = 60/162 (37%), Gaps = 24/162 (14%)

Query: 2138 DVAALFDLGGGDDVAEGYHKKKNIFTIGSGFKQYQGGENADTFILTSATASKSHIL--SG 2195
D+AA+ L G + + + Y +++ I + A + SG
Sbjct: 250 DIAAIQRLYGANMTTRTGDSVYGFNS-NTDRDFYTATDSSKALIFSVWDAGGTDTFDFSG 308

Query: 2196 GEGNDTVALGEVLGNEIDSIIDISKGYYSQVNGGVEKQVALLYDFENILGHENVNDTIIG 2255
N + L S + KG S + GV EN +G ND ++G
Sbjct: 309 YSNNQRI----NLNEGSFSDVGGLKGNVS-IAHGVTI--------ENAIGGSG-NDILVG 354

Query: 2256 NDVDNYLNGMGGDDKIWGNGGNDLLALQSGLAQGGTGLDSYH 2297
N DN L G G+D ++G G D L GG G D++
Sbjct: 355 NSADNILQGGAGNDVLYGGAGADTLY-------GGAGRDTFV 389



Score = 42.6 bits (100), Expect = 3e-05
Identities = 31/137 (22%), Positives = 47/137 (34%), Gaps = 21/137 (15%)

Query: 2631 SSGNDEVVITSATFLPGNYIDTGDGNDAIIYIRGQEGT-MLKGGGGDDTYYYSAGSGAIN 2689
SGND +V SA N + G GND + G G L GG G DT+ Y +G +
Sbjct: 346 GSGNDILVGNSA----DNILQGGAGNDVLY---GGAGADTLYGGAGRDTFVYGSGQDSTV 398

Query: 2690 IADTSGLDHLY-----------LDKHILLHTLSAERRENNLVLNIADNTSGRIIFVDWYL 2738
A D + + + ++L S I + +
Sbjct: 399 AAYDWIADFQKGIDKIDLSAFRNEGQLSFVQDQFTGKGQEVMLQWDAANS--ITNLWLHE 456

Query: 2739 ADENKVEFIWVEDSQIT 2755
A + V+F+ Q
Sbjct: 457 AGHSSVDFLVRIVGQAA 473


9  YpsIP31758_0857  YpsIP31758_0867  Y  N  N  Genomic Island
LocusTag         DNBias  CDNBias  %GCBias     Product
YpsIP31758_0857  1       12       3.369379    5,10-methenyltetrahydrofolate synthetase
YpsIP31758_0858  1       11       3.083670    Z-ring-associated protein
YpsIP31758_0859  1       12       3.471848    hypothetical protein
YpsIP31758_0860  1       13       3.550215    proline aminopeptidase P II
YpsIP31758_0861  1       14       3.370353    2-octaprenyl-6-methoxyphenyl hydroxylase
YpsIP31758_0862  0       14       2.080967    hypothetical protein
YpsIP31758_0863  1       12       1.782744    glycine cleavage system aminomethyltransferase
YpsIP31758_0864  -1      12       0.641089    glycine cleavage system protein H
YpsIP31758_0865  0       11       0.725740    glycine dehydrogenase
YpsIP31758_0866  2       15       -0.852617   hypothetical protein
YpsIP31758_0867  2       13       0.117302    hypothetical protein
10  YpsIP31758_0885  YpsIP31758_0907  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias     Product
YpsIP31758_0885  4       17       2.635536    hypothetical protein
YpsIP31758_0886  4       18       2.123263    hypothetical protein
YpsIP31758_0887  4       18       1.817086    helix-turn-helix DNA binding domain-containing
YpsIP31758_0888  6       19       1.871295    hypothetical protein
YpsIP31758_0889  7       17       1.710426    transcriptional regulator
YpsIP31758_0890  6       20       2.299155    D5 family nucleoside triphosphatase
YpsIP31758_0891  4       19       -3.062178   hypothetical protein
YpsIP31758_0892  5       17       -1.383059   hypothetical protein
YpsIP31758_0893  4       17       -1.469963   AlpA family protein
YpsIP31758_0894  2       18       -2.003735   hypothetical protein
YpsIP31758_0895  1       18       -2.015471   hypothetical protein
YpsIP31758_0896  0       18       -1.785131   Fic family protein
YpsIP31758_0897  0       18       -3.126772   PAAR/S-type pyocin domain-containing protein
YpsIP31758_0898  1       20       -4.418132   hypothetical protein
YpsIP31758_0899  1       19       -4.094023   hypothetical protein
YpsIP31758_0900  0       16       -2.713431   hypothetical protein
YpsIP31758_0901  -1      14       -2.053315   hypothetical protein
YpsIP31758_0902  0       17       -5.303309   hypothetical protein
YpsIP31758_0903  0       15       -2.756828   PAAR domain-containing protein
YpsIP31758_0904  -1      16       -2.597545   hypothetical protein
YpsIP31758_0905  -1      18       -3.866519   PadR family transcriptional regulator
YpsIP31758_0906  2       19       -3.985731   hypothetical protein
YpsIP31758_0907  1       18       -5.025033   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0892  PHPHTRNFRASE  23     0.048    Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase signature.

Length = 572

Score = 23.2 bits (50), Expect = 0.048
Identities = 11/34 (32%), Positives = 20/34 (58%), Gaps = 3/34 (8%)

Query: 11 SHEQVVARMLKKPAV---RAEYERLERQDFAIID 41
SH +++R L+ PAV + E+++ D I+D
Sbjct: 189 SHSAIMSRSLEIPAVVGTKEVTEKIQHGDMVIVD 222


11  YpsIP31758_0946  YpsIP31758_0971  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias     Product
YpsIP31758_0946  -1      17       -3.844058   polysaccharide lyase family protein 8
YpsIP31758_0947  1       23       -8.336582   autotransporter protein
YpsIP31758_0948  5       34       -11.965685  hypothetical protein
YpsIP31758_0949  5       38       -12.819879  hypothetical protein
YpsIP31758_0950  4       40       -13.618203  carbonic anhydrase
YpsIP31758_0951  6       42       -14.094688  hypothetical protein
YpsIP31758_0952  6       42       -13.824908  general secretion pathway protein C-like
YpsIP31758_0953  6       42       -13.567722  general secretion pathway protein D
YpsIP31758_0954  4       41       -13.700360  general secretory pathway protein E
YpsIP31758_0955  6       44       -15.801741  general secretion pathway protein F
YpsIP31758_0956  6       45       -16.621021  general secretion pathway protein G
YpsIP31758_0957  5       44       -16.653333  general secretion pathway protein I
YpsIP31758_0958  6       42       -16.784428  general secretion pathway protein J
YpsIP31758_0959  5       42       -16.743225  general secretion pathway protein K
YpsIP31758_0960  6       41       -16.052691  general secretion pathway protein L
YpsIP31758_0961  4       32       -11.204792  hypothetical protein
YpsIP31758_0963  3       30       -8.817158   type IV prepilin peptidase
YpsIP31758_0962  1       22       -7.300971   lipoprotein
YpsIP31758_0964  0       20       -5.089955   transcriptional regulator
YpsIP31758_0965  -1      14       -1.731378   hypothetical protein
YpsIP31758_0967  -2      15       -0.336798   methyl-accepting chemotaxis protein
YpsIP31758_0966  -1      13       1.316716    hypothetical protein
YpsIP31758_0968  -2      12       1.883580    metallo-beta-lactamase family protein
YpsIP31758_0969  0       12       3.096162    LysR family substrate binding transcriptional
YpsIP31758_0971  0       11       3.089278    major facilitator transporter
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0947  VACCYTOTOXIN  33     0.005    Helicobacter pylori vacuolating cytotoxin signature.
		>VACCYTOTOXIN#Helicobacter pylori vacuolating cytotoxin signature.

Length = 1291

Score = 33.1 bits (75), Expect = 0.005
Identities = 40/172 (23%), Positives = 63/172 (36%), Gaps = 26/172 (15%)

Query: 437 LYLRNQSAATPWNFWAQTLYAHSRQSSGTYTPGYQTNGYGINVGVDRRFND--ESLFG-- 492
LY P N WA + S S G + YG + GVD N E++ G
Sbjct: 1012 LYQFAPKYEKPTNVWANAIGGTSLNSGG------NASLYGTSAGVDAYLNGEVEAIVGGF 1065

Query: 493 VSLGYQNANIN---IHSYGNEKDVDSYELMAYTGWFDDRYFFNGNVNMGYNSNSSTRNIG 549
S GY + + ++S N + Y + F +++ F+ S+ S+ N
Sbjct: 1066 GSYGYSSFSNQANSLNSGANNTNFGVYSRI-----FANQHEFDFEAQGALGSDQSSLNFK 1120

Query: 550 ENTGYQGNTKATADYNSLQMGYQVKAGMTFDL----DVVKLQPSVAYNYQWL 597
N YN L +A +D + + L+PSV +Y L
Sbjct: 1121 SALLRDLNQS----YNYLAYSAATRASYGYDFAFFRNALVLKPSVGVSYNHL 1168


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0952  BCTERIALGSPC  45     4e-08    Bacterial general secretion pathway protein C signa...
		>BCTERIALGSPC#Bacterial general secretion pathway protein C signature.

Length = 272

Score = 44.6 bits (105), Expect = 4e-08
Identities = 19/62 (30%), Positives = 31/62 (50%)

Query: 105 IKLVGVIEHSAPSESIAILEVKGKQTTHLTRENINYEDIVIVKIFTDRVIIKRNGKYYSL 164
+ L GV+ S SIAI+ +Q + E + + IV I DRV+++ G+Y L
Sbjct: 95 LSLTGVMAGDDDSRSIAIISKDNEQFSRGVNEEVPGYNAKIVSIRPDRVVLQYQGRYEVL 154

Query: 165 II 166
+
Sbjct: 155 GL 156


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0953  BCTERIALGSPD  543    0.0      Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D signature.

Length = 660

Score = 543 bits (1401), Expect = 0.0
Identities = 310/610 (50%), Positives = 432/610 (70%), Gaps = 15/610 (2%)

Query: 3 ISGKGIKSIHGMIFLFTLIMPLDIISANFSVSFKDVDIKEFINSVSKNINKTIIIDPTVQ 62
I I+S + +F ++ + FS SFK DI+EFIN+VSKN+NKT+IIDP+V+
Sbjct: 2 IIANVIRSFSLTLLIFAALLFRPAAAEEFSASFKGTDIQEFINTVSKNLNKTVIIDPSVR 61

Query: 63 GLISIRSYENLDKDTYYQLFLNVLDVYGYAAIEMPHNVLKVISSKRAKGVVAPLPKEGVT 122
G I++RSY+ L+++ YYQ FL+VLDVYG+A I M + VLKV+ SK AK P+ +
Sbjct: 62 GTITVRSYDMLNEEQYYQFFLSVLDVYGFAVINMNNGVLKVVRSKDAKTAAVPVASDAAP 121

Query: 123 FDGDELINRVIPLRYISAKKITPLLRQLNDNTESGSIINYDPSNILLITGRAAVVNRLHS 182
GDE++ RV+PL ++A+ + PLLRQLNDN GS+++Y+PSN+LL+TGRAAV+ RL +
Sbjct: 122 GIGDEVVTRVVPLTNVAARDLAPLLRQLNDNAGVGSVVHYEPSNVLLMTGRAAVIKRLLT 181

Query: 183 IVTDLDQAGDNEIELYKLNYAIAADVVKIVNEAINPINNLKQEVSIVGKVIADERTNSIL 242
IV +D AGD + L++A AADVVK+V E + S+V V+ADERTN++L
Sbjct: 182 IVERVDNAGDRSVVTVPLSWASAADVVKLVTELNKDTSKSALPGSMVANVVADERTNAVL 241

Query: 243 ISGDTYIRKKSILMIKKLDKRQSSDGNTKVVYMKYAQASKLLDVLNGISEGFHNEKKTKQ 302
+SG+ R++ I MIK+LD++Q++ GNTKV+Y+KYA+AS L++VL GIS +EK+ +
Sbjct: 242 VSGEPNSRQRIIAMIKQLDRQQATQGNTKVIYLKYAKASDLVEVLTGISSTMQSEKQAAK 301

Query: 303 SNQWNQRPVAIKAYDQTNALVITADPDMMLALGEVIEKLDIRRAQVLVEAIIVETQNGEG 362
+ + IKA+ QTNAL++TA PD+M L VI +LDIRR QVLVEAII E Q+ +G
Sbjct: 302 PVAALDKNIIIKAHGQTNALIVTAAPDVMNDLERVIAQLDIRRPQVLVEAIIAEVQDADG 361

Query: 363 INLGVKWENKRSDDINF----IKNSDGLLNNNGWGIATTIT-----------GLTAGFYK 407
+NLG++W NK + F + S + N + T++ G+ AGFY+
Sbjct: 362 LNLGIQWANKNAGMTQFTNSGLPISTAIAGANQYNKDGTVSSSLASALSSFNGIAAGFYQ 421

Query: 408 GNWDVLLSALSTNTNNNILATPSIVTLDNMEAEFNVGQEVPVLISTQTTTTDKVYNSISR 467
GNW +LL+ALS++T N+ILATPSIVTLDNMEA FNVGQEVPVL +QTT+ D ++N++ R
Sbjct: 422 GNWAMLLTALSSSTKNDILATPSIVTLDNMEATFNVGQEVPVLTGSQTTSGDNIFNTVER 481

Query: 468 QSIGVMLKVKPQINKGDSVLLEIRQEVSSIADSSTVNTHNLGSVFNKRVVNNAVLVKSGE 527
+++G+ LKVKPQIN+GDSVLLEI QEVSS+AD+++ + +LG+ FN R VNNAVLV SGE
Sbjct: 482 KTVGIKLKVKPQINEGDSVLLEIEQEVSSVADAASSTSSDLGATFNTRTVNNAVLVGSGE 541

Query: 528 TVVVGGLLDKKSSTIVNKVPFLGDLPLIGWLFRQTKEKVEKSNLILFIKPTILRESDDYS 587
TVVVGGLLDK S +KVP LGD+P+IG LFR T +KV K NL+LFI+PT++R+ D+Y
Sbjct: 542 TVVVGGLLDKSVSDTADKVPLLGDIPVIGALFRSTSKKVSKRNLMLFIRPTVIRDRDEYR 601

Query: 588 VVTSKEYNKY 597
+S +Y +
Sbjct: 602 QASSGQYTAF 611


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0955  BCTERIALGSPF  354    e-122    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 354 bits (911), Expect = e-122
Identities = 169/406 (41%), Positives = 264/406 (65%), Gaps = 7/406 (1%)

Query: 1 MAVFKYVAISRSGTKITGDIDAENIRIARYLLYKKNMHVLSI-------KKRILLFNKYV 53
MA + Y A+ G K G +A++ R AR LL ++ + LS+ +K
Sbjct: 1 MAQYHYQALDAQGKKCRGTQEADSARQARQLLRERGLVPLSVDENRGDQQKSGSTGLSLR 60

Query: 54 VKKNSNKTDLVLITRQIATLVNASMPLDEVLDIVGKQNSKSKMIEIIQRIRVNIQEGHSF 113
K + +DL L+TRQ+ATLV ASMPL+E LD V KQ+ K + +++ +R + EGHS
Sbjct: 61 RKIRLSTSDLALLTRQLATLVAASMPLEEALDAVAKQSEKPHLSQLMAAVRSKVMEGHSL 120

Query: 114 ADALSPFPAVFSPLYKTMVTAGEVSGHLGLVLVRLAEHIEQTQKIQRKIIQALIYPCVLV 173
ADA+ FP F LY MV AGE SGHL VL RLA++ EQ Q+++ +I QA+IYPCVL
Sbjct: 121 ADAMKCFPGSFERLYCAMVAAGETSGHLDAVLNRLADYTEQRQQMRSRIQQAMIYPCVLT 180

Query: 174 LISLSVIVILLTAVVPNIVEQFSFSETALPLSTKVLMILSYSIKENVIFIIAIGVSAVIF 233
+++++V+ ILL+ VVP +VEQF + ALPLST+VLM +S +++ +++ ++ +
Sbjct: 181 VVAIAVVSILLSVVVPKVVEQFIHMKQALPLSTRVLMGMSDAVRTFGPWMLLALLAGFMA 240

Query: 234 LNRLLKINKINIFFHRHYLSLPMLGNMFVRINTSRYLRTLTTLHSNGVTIVQAMSISNAV 293
+L+ K + FHR L LP++G + +NT+RY RTL+ L+++ V ++QAM IS V
Sbjct: 241 FRVMLRQEKRRVSFHRRLLHLPLIGRIARGLNTARYARTLSILNASAVPLLQAMRISGDV 300

Query: 294 LTNVYIKNKLNISVKLVSEGCSLSSSLVDSGVFPPIILHMIISGERSGKLDHMLETVAGV 353
++N Y +++L+++ V EG SL +L + +FPP++ HMI SGERSG+LD MLE A
Sbjct: 301 MSNDYARHRLSLATDAVREGVSLHKALEQTALFPPMMRHMIASGERSGELDSMLERAADN 360

Query: 354 QEEELMNQISIVMSLLEPTIIIVMAAFISFVILSILQPILEINSLV 399
Q+ E +Q+++ + L EP +++ MAA + F++L+ILQPIL++N+L+
Sbjct: 361 QDREFSSQMTLALGLFEPLLVVSMAAVVLFIVLAILQPILQLNTLM 406


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0956  BCTERIALGSPG  207    2e-72    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 207 bits (529), Expect = 2e-72
Identities = 87/136 (63%), Positives = 103/136 (75%)

Query: 2 ANKKTKGFTLLEIMVVIVILGLLASLTIPSLMSNKNRADQQKAVSDISALENALDMYRLD 61
A K +GFTLLEIMVVIVI+G+LASL +P+LM NK +AD+QKAVSDI ALENALDMY+LD
Sbjct: 3 ATDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYKLD 62

Query: 62 NGDYPTEQQGIAALVTKPNVPPLPQRYPSDGYIRRLPTDPWGNSYQMNNPGKHGQIDIFS 121
N YPT QG+ +LV P +PPL Y +GYI+RLP DPWGN Y + NPG+HG D+ S
Sbjct: 63 NHHYPTTNQGLESLVEAPTLPPLAANYNKEGYIKRLPADPWGNDYVLVNPGEHGAYDLLS 122

Query: 122 IGPDRLPETEDDIGNW 137
GPD TEDDI NW
Sbjct: 123 AGPDGEMGTEDDITNW 138


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0958  BCTERIALGSPG  30     0.003    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 30.2 bits (68), Expect = 0.003
Identities = 13/44 (29%), Positives = 24/44 (54%), Gaps = 9/44 (20%)

Query: 4 RPDCGFTLLEMLLAVVIFSMISFIIYSSLRITIKSNNVMGNKAQ 47
GFTLLE+++ +VI +++ ++ N+MGNK +
Sbjct: 5 DKQRGFTLLEIMVVIVIIGVLASLVVP---------NLMGNKEK 39


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0963  PREPILNPTASE  231    1e-77    Type IV prepilin cysteine protease (C20) family sig...
		>PREPILNPTASE#Type IV prepilin cysteine protease (C20) family signature.

Length = 290

Score = 231 bits (591), Expect = 1e-77
Identities = 115/275 (41%), Positives = 150/275 (54%), Gaps = 4/275 (1%)

Query: 3 FFFVGYLILGAMVGSFLNVLIYRLPIMLANLSSR-SESHGEEIKMRSHLRNINLFQPGSF 61
+F + M+GSFLNV+I+RLPIML S+ NL P S
Sbjct: 14 LYFSLVFLFSLMIGSFLNVVIHRLPIMLEREWQAEYRSYFNPDDEGVDEPPYNLMVPRSC 73

Query: 62 CHHCNESIPIKYNIPILGWIFLRGASRCCNKKISTRYLFIEVLAVIQTLLVLMIFKEDLL 121
C HCN I NIP+L W++LRG R C IS RY +E+L + ++ V M
Sbjct: 74 CPHCNHPITALENIPLLSWLWLRGRCRGCQAPISARYPLVELLTALLSVAVAMTLAPGWG 133

Query: 122 ICTSLVLIWSLTALAFIDFDTYLLPDCMTIPLLWLGLLINIDTVFAPLTSAVLGAVSGYL 181
+L+L W L AL FID D LLPD +T+PLLW GLL N+ F L AV+GA++GYL
Sbjct: 134 TLAALLLTWVLVALTFIDLDKMLLPDQLTLPLLWGGLLFNLLGGFVSLGDAVIGAMAGYL 193

Query: 182 FLWLSYWLFKIVRGVDGMGYGDFKLMAALGAWFGVSAVPFLILFSSFFGLVAYAIFYFFD 241
LW YW FK++ G +GMGYGDFKL+AALGAW G A+P ++L SS G
Sbjct: 194 VLWSLYWAFKLLTGKEGMGYGDFKLLAALGAWLGWQALPIVLLLSSLVGAFMGIGLILLR 253

Query: 242 KKDNGKEINYIAFGPYISLAGVLYLFLGSHVTNLF 276
K I FGPY+++AG + L G +T +
Sbjct: 254 NHHQSKP---IPFGPYLAIAGWIALLWGDSITRWY 285


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0968  FERRIBNDNGPP  29     0.019    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 29.1 bits (65), Expect = 0.019
Identities = 22/104 (21%), Positives = 45/104 (43%), Gaps = 13/104 (12%)

Query: 186 GVAVSGNIHLWVADTQTPESRENWLT----TLEKIKALKPAIVVPGHFLDNAPQTLESVT 241
GVA + N LWV++ P+S + LE + +KP+ +V +P+ L +
Sbjct: 58 GVADTINYRLWVSEPPLPDSVIDVGLRTEPNLELLTEMKPSFMVWSAGYGPSPEMLARIA 117

Query: 242 FTQNYLTTLNAEIPKAKDSAELIAVMKNHYPELKDESSLELSAK 285
+ + D + +A+ + E+ D +L+ +A+
Sbjct: 118 PGRGF---------NFSDGKQPLAMARKSLTEMADLLNLQSAAE 152


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
YpsIP31758_0971  TCRTETB  58     2e-11    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 58.4 bits (141), Expect = 2e-11
Identities = 33/149 (22%), Positives = 67/149 (44%), Gaps = 3/149 (2%)

Query: 26 LPQVAGDLHISIPTAGWLISGYALGVAIGAPIMAVLTAKLPRKKTLLLLMVIFIIGNLMC 85
LP +A D + + W+ + + L +IG + L+ +L K+ LL ++I G+++
Sbjct: 37 LPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIG 96

Query: 86 ALAYSYDF-LMFARVITALCHGAFFGIGAVVAANLVAPNRRASAVALMFTGLTLANVLGV 144
+ +S+ L+ AR I AF + VV A + R A L+ + + + +G
Sbjct: 97 FVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGP 156

Query: 145 PLGTALGQAFGWRSTFW--VVSVIGLFSL 171
+G + W ++++I + L
Sbjct: 157 AIGGMIAHYIHWSYLLLIPMITIITVPFL 185


12  YpsIP31758_1056  YpsIP31758_1066  Y  N  Y  Genomic Island
LocusTag         DNBias  CDNBias  %GCBias     Product
YpsIP31758_1056  2       15       0.674593    *cytosine/purines uracil thiamine allantoin
YpsIP31758_1057  1       16       -0.471247   hypothetical protein
YpsIP31758_1058  1       15       -0.944177   polysaccharide deacetylase family protein
YpsIP31758_1059  1       16       -1.041749   glycine betaine transporter periplasmic subunit
YpsIP31758_1060  1       15       -1.339403   glycine betaine transporter membrane protein
YpsIP31758_1061  0       15       -3.006589   glycine betaine/L-proline ABC transporter
YpsIP31758_1062  0       17       -3.821116   ribonucleotide-diphosphate reductase subunit
YpsIP31758_1063  0       18       -3.370784   ribonucleotide-diphosphate reductase subunit
YpsIP31758_1064  2       25       -4.679104   ribonucleotide reductase stimulatory protein
YpsIP31758_1065  6       29       -6.549107   glutaredoxin-like protein NrdH
YpsIP31758_1066  2       21       -3.853264   acid shock protein
13  YpsIP31758_1075  YpsIP31758_1115  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias     Product
YpsIP31758_1075  2       15       0.866028    nickel ABC transporter ATP-binding protein
YpsIP31758_1076  3       16       1.005027    hypothetical protein
YpsIP31758_1077  3       15       1.593258    urease subunit gamma
YpsIP31758_1078  2       14       1.398201    urease subunit beta
YpsIP31758_1079  1       13       1.274509    urease subunit alpha
YpsIP31758_1080  0       11       -1.283524   urease accessory protein UreE
YpsIP31758_1081  -1      10       -2.094648   urease accessory protein UreF
YpsIP31758_1082  -1      13       -3.624753   urease accessory protein UreG
YpsIP31758_1083  -1      15       -4.486181   urease accessory protein UreD
YpsIP31758_1084  1       16       -5.152284   urea transporter
YpsIP31758_1085  0       18       -7.520109   high-affinity nickel transport protein
YpsIP31758_1086  0       18       -7.200855   acid-resistance protein
YpsIP31758_1087  1       14       -4.789063   voltage-gated potassium channel
YpsIP31758_1088  0       15       -3.298060   camphor resistance protein CrcB
YpsIP31758_1089  -1      17       -3.880616   CrcB-like protein
YpsIP31758_1090  -1      16       -3.070563   hypothetical protein
YpsIP31758_1091  -1      16       -2.718489   PTS system N,N'-diacetylchitobiose-specific
YpsIP31758_1092  0       17       -2.951613   PTS system N,N'-diacetylchitobiose-specific
YpsIP31758_1093  2       17       -3.156574   PTS system N,N'-diacetylchitobiose-specific
YpsIP31758_1094  -1      13       -1.468444   DNA-binding transcriptional regulator ChbR
YpsIP31758_1095  -1      14       -0.740057   hypothetical protein
YpsIP31758_1096  -2      16       -0.866755   hypothetical protein
YpsIP31758_1097  -3      16       0.216045    hypothetical protein
YpsIP31758_1098  -1      17       2.760829    replication initiation regulator SeqA
YpsIP31758_1099  0       18       2.877668    phosphoglucomutase
YpsIP31758_1100  1       23       3.206314    hypothetical protein
YpsIP31758_1101  0       25       5.085907    integral membrane protein
YpsIP31758_1102  0       24       4.994320    DNA-binding transcriptional activator KdpE
YpsIP31758_1103  0       23       5.354232    sensor protein KdpD
YpsIP31758_1104  1       23       5.158020    potassium-transporting ATPase subunit C
YpsIP31758_1105  1       25       5.232406    hypothetical protein
YpsIP31758_1106  0       22       4.716192    potassium-transporting ATPase subunit B
YpsIP31758_1107  -1      15       2.542828    hypothetical protein
YpsIP31758_1108  -1      15       1.921474    potassium-transporting ATPase subunit A
YpsIP31758_1109  -2      15       0.297765    hypothetical protein
YpsIP31758_1110  -1      16       0.939727    hypothetical protein
YpsIP31758_1111  -1      14       1.745269    hypothetical protein
YpsIP31758_1112  -1      14       2.411062    deoxyribodipyrimidine photolyase
YpsIP31758_1113  -1      15       2.366462    3',5'-cyclic-nucleotide phosphodiesterase
YpsIP31758_1114  1       15       3.113813    hydrolase-oxidase
YpsIP31758_1115  1       14       3.185243    hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
YpsIP31758_1079	UREASE	977	0.0	Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 977 bits (2528), Expect = 0.0
Identities = 328/570 (57%), Positives = 417/570 (73%), Gaps = 5/570 (0%)

Query: 3 QISRQEYAGLFGPTTGDKIRLGDTNLFIEIEKDLRGYGEESVYGGGKSLRDGMGANNNLT 62
++SR YA +FGPT GDK+RL DT LFIE+EKD +GEE +GGGK +RDGMG + +T
Sbjct: 4 RMSRAAYANMFGPTVGDKVRLADTELFIEVEKDFTTHGEEVKFGGGKVIRDGMG-QSQVT 62

Query: 63 RDNGVLDLVITNVTIVDARLGVIKADVGIRDGKIAGIGKSGNPGVMDGVTQGMVVGVSTD 122
R+ G +D VITN I+D G++KAD+G++DG+IA IGK+GNP + GVT ++VG T+
Sbjct: 63 REGGAVDTVITNALILDH-WGIVKADIGLKDGRIAAIGKAGNPDMQPGVT--IIVGPGTE 119

Query: 123 AISGEHLILTAAGIDSHIHLISPQQAYHALSNGVATFFGGGIGPTDGTNGTTVTPGPWNI 182
I+GE I+TA G+DSHIH I PQQ AL +G+ GGG GP GT TT TPGPW+I
Sbjct: 120 VIAGEGKIVTAGGMDSHIHFICPQQIEEALMSGLTCMLGGGTGPAHGTLATTCTPGPWHI 179

Query: 183 RQMLRSIEGLPVNVGILGKGNSYGRGPLLEQAIAGVVGYKVHEDWGATANALRHALRMAD 242
+M+ + + P+N+ GKGN+ G L+E + G K+HEDWG T A+ L +AD
Sbjct: 180 ARMIEAADAFPMNLAFAGKGNASLPGALVEMVLGGATSLKLHEDWGTTPAAIDCCLSVAD 239

Query: 243 EVDIQVSVHTDSLNECGYVEDTIDAFEGRTIHTFHTEGAGGGHAPDIIRVASQTNVLPSS 302
E D+QV +HTD+LNE G+VEDTI A +GRTIH +HTEGAGGGHAPDIIR+ Q NV+PSS
Sbjct: 240 EYDVQVMIHTDTLNESGFVEDTIAAIKGRTIHAYHTEGAGGGHAPDIIRICGQPNVIPSS 299

Query: 303 TNPTLPYGVNSQAELFDMIMVCHNLNPNVPADVSFAESRVRPETIAAENVLHDMGVISMF 362
TNPT PY VN+ AE DM+MVCH+L+P +P D++FAESR+R ETIAAE++LHD+G S+
Sbjct: 300 TNPTRPYTVNTLAEHLDMLMVCHHLSPTIPEDIAFAESRIRKETIAAEDILHDIGAFSII 359

Query: 363 SSDSQAMGRVGENWLRILQTADAMKAARGKLPEDAAGNDNFRVLRYVAKITINPAITQGV 422
SSDSQAMGRVGE +R QTAD MK RG+L E+ NDNFRV RY+AK TINPAI G+
Sbjct: 360 SSDSQAMGRVGEVAIRTWQTADKMKRQRGRLKEETGDNDNFRVKRYIAKYTINPAIAHGL 419

Query: 423 SHVIGSVEVGKMADLVLWDPRFFGAKPKMVIKGGMINWAAMGDPNASLPTPQPVFYRPMF 482
SH IGS+EVGK ADLVLW+P FFG KP MV+ GG I A MGDPNAS+PTPQPV YRPMF
Sbjct: 420 SHEIGSLEVGKRADLVLWNPAFFGVKPDMVLLGGTIAAAPMGDPNASIPTPQPVHYRPMF 479

Query: 483 GAMGKTLQDTCVTFVSQAALDDGVKEKAGLDRQVIAVKNCR-TISKRDLVRNDQTPNIEV 541
GA G++ ++ VTFVSQA+LD G+ + G+ ++++AV+N R I K ++ N TP+IEV
Sbjct: 480 GAYGRSRTNSSVTFVSQASLDAGLAGRLGVAKELVAVQNTRGGIGKASMIHNSLTPHIEV 539

Query: 542 DPETFAVKVDGVHATCEPIATASMNQRYFF 571
DPET+ V+ DG TCEP M QRYF
Sbjct: 540 DPETYEVRADGELLTCEPATVLPMAQRYFL 569


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
YpsIP31758_1102	HTHFIS	74	9e-18	FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 74.1 bits (182), Expect = 9e-18
Identities = 27/112 (24%), Positives = 52/112 (46%), Gaps = 1/112 (0%)

Query: 1 MRIALESEGWRVFESETLQRGLIEAGTRKPDLIILDLGLPDGDGLNYIQDLRQWSA-IPI 59
+ AL G+ V + DL++ D+ +PD + + + +++ +P+
Sbjct: 19 LNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDLLPRIKKARPDLPV 78

Query: 60 IVLSARNNEEDKVAALDAGADDYLSKPFGISELLARVRVALRRHSGASQESP 111
+V+SA+N + A + GA DYL KPF ++EL+ + AL +
Sbjct: 79 LVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKRRPSKLE 130

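The statistics lines printed above each alignment (Score, Expect, Identities, Positives, Gaps) can be checked mechanically. The following is a minimal Python sketch, not part of PredictBias itself; the function names parse_stats, parse_expect and truncated_percent are hypothetical. It parses the two statistics lines of the UREASE hit for YpsIP31758_1079 and recomputes the percentages. In the hits listed in this report the printed percentages match integer truncation rather than rounding (for example, 105/263 is printed as 39%), which is what the sketch assumes.

```python
import re

# Patterns for the two statistics lines printed above each alignment.
SCORE_RE = re.compile(r"Score = ([\d.]+) bits \((\d+)\), Expect = (\S+)")
COUNT_RE = re.compile(r"(Identities|Positives|Gaps) = (\d+)/(\d+) \((\d+)%\)")


def parse_expect(token):
    """Convert an Expect token to float.

    Assumption: some BLAST versions abbreviate very small values
    as 'e-105'; a leading '1' is added in that case.
    """
    token = token.rstrip(",")
    return float("1" + token if token.startswith("e") else token)


def parse_stats(text):
    """Return bit score, raw score, E-value and the count triples."""
    m = SCORE_RE.search(text)
    if m is None:
        raise ValueError("no Score line found")
    stats = {
        "bits": float(m.group(1)),
        "raw": int(m.group(2)),
        "expect": parse_expect(m.group(3)),
    }
    # Each count is stored as (numerator, denominator, printed percent).
    for name, num, den, pct in COUNT_RE.findall(text):
        stats[name.lower()] = (int(num), int(den), int(pct))
    return stats


def truncated_percent(num, den):
    """Integer percentage with the fractional part dropped."""
    return 100 * num // den


example = """Score = 977 bits (2528), Expect = 0.0
Identities = 328/570 (57%), Positives = 417/570 (73%), Gaps = 5/570 (0%)"""

stats = parse_stats(example)
for key in ("identities", "positives", "gaps"):
    num, den, shown = stats[key]
    # Compare the printed percentage with the recomputed one.
    print(key, f"{num}/{den}", "printed:", shown,
          "recomputed:", truncated_percent(num, den))
```

The same two functions can be applied to any other hit in this report by pasting its Score and Identities lines into the example string.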

14	YpsIP31758_1128	YpsIP31758_1133	Y	N	N	Genomic Island
LocusTag	DNBias	CDNBias	%GCBias	Product
YpsIP31758_1128	-1	18	-3.704249	L-aspartate oxidase
YpsIP31758_1129	-1	21	-4.615029	RNA polymerase sigma factor RpoE
YpsIP31758_1130	-2	20	-3.844289	anti-RNA polymerase sigma factor SigE
YpsIP31758_1131	-1	19	-3.423069	periplasmic negative regulator of sigmaE
YpsIP31758_1132	-2	17	-3.713836	SoxR reducing system protein RseC
YpsIP31758_1133	-2	14	-3.005986	hypothetical protein
15	YpsIP31758_1167	YpsIP31758_1221	Y	Y	N	Pathogenicity Island (biased-composition)
LocusTag	DNBias	CDNBias	%GCBias	Product
YpsIP31758_1167	2	13	2.532242	IscR family transcriptional regulator
YpsIP31758_1168	2	13	2.439587	cysteine desulfurase
YpsIP31758_1169	1	15	3.119407	scaffold protein
YpsIP31758_1170	1	16	4.083416	iron-sulfur cluster assembly protein
YpsIP31758_1171	1	18	4.158214	co-chaperone HscB
YpsIP31758_1172	0	15	2.901756	chaperone protein HscA
YpsIP31758_1173	1	18	1.460792	ferredoxin, 2Fe-2S type, ISC system
YpsIP31758_1174	1	18	1.113734	hypothetical protein
YpsIP31758_1175	4	19	0.860985	aminopeptidase B
YpsIP31758_1176	2	15	-1.856245	enhanced serine sensitivity protein SseB
YpsIP31758_1177	2	17	-2.130506	autotransporter protein
YpsIP31758_1178	0	16	-2.962039	hypothetical protein
YpsIP31758_1180	1	15	-2.486296	hypothetical protein
YpsIP31758_1179	1	14	-2.014583	autotransporter protein
YpsIP31758_1181	-2	19	-3.847389	hypothetical protein
YpsIP31758_1182	0	16	-0.115420	nucleoside diphosphate kinase
YpsIP31758_1183	0	17	-0.276122	ribosomal RNA large subunit methyltransferase N
YpsIP31758_1184	2	19	0.008850	type IV pilus biogenesis/stability protein PilW
YpsIP31758_1185	1	19	-0.199577	cytoskeletal protein RodZ
YpsIP31758_1186	0	16	-0.055498	4-hydroxy-3-methylbut-2-en-1-yl diphosphate
YpsIP31758_1187	1	15	0.588629	histidyl-tRNA synthetase
YpsIP31758_1188	0	11	0.974449	hypothetical protein
YpsIP31758_1189	0	13	1.555741	outer membrane protein assembly complex subunit
YpsIP31758_1190	0	13	1.461857	GTP-binding protein EngA
YpsIP31758_1191	1	13	1.841960	auxin efflux carrier family protein
YpsIP31758_1192	2	13	1.742918	hypothetical protein
YpsIP31758_1193	0	13	0.548004	exodeoxyribonuclease VII large subunit
YpsIP31758_1194	-1	18	-2.604896	inosine 5'-monophosphate dehydrogenase
YpsIP31758_1195	-2	18	-5.843465	hypothetical protein
YpsIP31758_1196	-1	19	-7.229715	GMP synthase
YpsIP31758_1197	4	29	-11.840869	hypothetical protein
YpsIP31758_1198	2	24	-8.475702	hypothetical protein
YpsIP31758_1199	1	19	-6.133335	hypothetical protein
YpsIP31758_1200	0	17	-4.310346	hypothetical protein
YpsIP31758_1201	0	15	-2.583859	entero membrane protein
YpsIP31758_1202	0	13	0.362877	hypothetical protein
YpsIP31758_1203	-1	14	2.373846	phosphomethylpyrimidine kinase
YpsIP31758_1204	-1	17	0.898169	hypothetical protein
YpsIP31758_1205	-1	18	1.538704	hypothetical protein
YpsIP31758_1206	-1	20	1.939221	hypothetical protein
YpsIP31758_1207	-2	20	2.567004	lipid kinase
YpsIP31758_1208	-1	21	3.568396	U32 family peptidase
YpsIP31758_1209	-1	23	4.335520	hypothetical protein
YpsIP31758_1210	-1	21	4.785410	BaeR family transcriptional regulator
YpsIP31758_1211	-1	20	4.391095	signal transduction histidine-protein kinase
YpsIP31758_1212	-1	21	4.755973	multidrug efflux system protein MdtE
YpsIP31758_1213	-1	20	4.614811	multidrug efflux system subunit MdtC
YpsIP31758_1214	-1	18	3.751146	multidrug efflux system subunit MdtB
YpsIP31758_1215	-1	14	3.281305	multidrug efflux system subunit MdtA
YpsIP31758_1216	-1	15	2.983734	spermidine/putrescine ABC transporter
YpsIP31758_1217	0	16	3.597327	GntR family transcriptional regulator
YpsIP31758_1218	1	19	3.633778	4-aminobutyrate aminotransferase
YpsIP31758_1219	0	17	3.200048	spermidine/putrescine ABC transporter
YpsIP31758_1220	-1	18	3.828260	spermidine/putrescine ABC transporter permease
YpsIP31758_1221	0	19	3.457847	spermidine/putrescine ABC transporter permease
ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
YpsIP31758_1172	SHAPEPROTEIN	100	2e-25	Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 100 bits (252), Expect = 2e-25
Identities = 54/263 (20%), Positives = 105/263 (39%), Gaps = 22/263 (8%)

Query: 150 GLVNPVQVSAEILKTLAQRAQ-AALAGELDGVVITVPAYFDDAQRQGTKDAARLAGLHVL 208
G++ V+ ++L+ ++ + V++ VP +R+ +++A+ AG +
Sbjct: 79 GVIADFFVTEKMLQHFIKQVHSNSFMRPSPRVLVCVPVGATQVERRAIRESAQGAGAREV 138

Query: 209 RLLNEPTAAAIAYGLDSGQEGVIAVYDLGGGTFDISILRLSRGVFEVLATGGDSALGGDD 268
L+ EP AAAI GL + V D+GGGT +++++ L+ V +GGD
Sbjct: 139 FLIEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVAVISLNGVV-----YSSSVRIGGDR 193

Query: 269 FDHLLADWLREQAGVATRDDHGIQRQLLDAAIAAKIALSEAETAVVSVAG---WQG---- 321
FD + +++R G + A E + V G +G
Sbjct: 194 FDEAIINYVRRNYGSLIG-----EATAERIKHEIGSAYPGDEVREIEVRGRNLAEGVPRG 248

Query: 322 -EVTREQLESLIAPLVKRTLMACRRALKD-AGVTADEILE--VVMVGGSTRVPLVREQVG 377
+ ++ + + + A AL+ A +I E +V+ GG + + +
Sbjct: 249 FTLNSNEILEALQEPLTGIVSAVMVALEQCPPELASDISERGMVLTGGGALLRNLDRLLM 308

Query: 378 QFFGRTPLTSIDPDKVVAIGAAI 400
+ G + + DP VA G
Sbjct: 309 EETGIPVVVAEDPLTCVARGGGK 331


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
YpsIP31758_1177	PRTACTNFAMLY	159	6e-42	Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 159 bits (404), Expect = 6e-42
Identities = 114/438 (26%), Positives = 190/438 (43%), Gaps = 46/438 (10%)

Query: 724 LVMDSLAGNGTFKLGSMLQQDASAPVNVTGNADGDFTLQIDGSGIDPTNLN----VVSTG 779
L +++LAG+G F++ S + V +A G L + SG +P + N V +
Sbjct: 473 LTVNTLAGSGLFRMNVFADLGLSDKLVVMQDASGQHRLWVRNSGSEPASANTLLLVQTPL 532

Query: 780 GGDARFTLT--DGPIGLGNRVYNLVKDASGKVTLVANESTVTPG---------------- 821
G A FTL DG + +G Y L + +G+ +LV ++ P
Sbjct: 533 GSAATFTLANKDGKVDIGTYRYRLAANGNGQWSLVGAKAPPAPKPAPQPGPQPPQPPQPQ 592

Query: 822 ----------TASILAVANT---------TPVIFNAELSSVQQRLDKQSTEANESGIWGT 862
+ A AN ++ AE +++ +RL + + G WG
Sbjct: 593 PEAPAPQPPAGRELSAAANAAVNTGGVGLASTLWYAESNALSKRLGELRLNPDAGGAWGR 652

Query: 863 YLHNNFAVKGRAAN-FDQTLNGITLGGDKATALADGVLSVGGFASASTSSIKTDYQSKGN 921
+ RA FDQ + G LG D A A+A G +GG A + G+
Sbjct: 653 GFAQRQQLDNRAGRRFDQKVAGFELGADHAVAVAGGRWHLGGLAGYTRGDRGFTGDGGGH 712

Query: 922 VDSHSFGAYAQYLANNGYYVNGVVKANKFNQDIHVTSADNSA-SGNTNFSGMGVAVKAGK 980
DS G YA Y+A++G+Y++ ++A++ D V +D A G G+G +++AG+
Sbjct: 713 TDSVHVGGYATYIADSGFYLDATLRASRLENDFKVAGSDGYAVKGKYRTHGVGASLEAGR 772

Query: 981 HINH-NHLYVSPYVAMSAFSSGKSAVKLSNGMAAQSSSTRSMIGTLGVNAGYRFVLKNGV 1039
H + ++ P ++ F +G A + +NG+ + S++G LG+ G R L G
Sbjct: 773 RFTHADGWFLEPQAELAVFRAGGGAYRAANGLRVRDEGGSSVLGRLGLEVGKRIELAGGR 832

Query: 1040 EMKPYVSASVDHEFAANNKFRVNQEMFDNNLNGTRISTGAGLNVNITPNLSVGSEVKLSN 1099
+++PY+ ASV EF N L GTR G G+ + S+ + + S
Sbjct: 833 QVQPYIKASVLQEFDGAGTVHTNGIAHRTELRGTRAELGLGMAAALGRGHSLYASYEYSK 892

Query: 1100 GKNIKTPVTFNLNVGYRF 1117
G + P TF+ GYR+
Sbjct: 893 GPKLAMPWTFHA--GYRY 908


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
YpsIP31758_1179	PRTACTNFAMLY	151	4e-39	Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 151 bits (383), Expect = 4e-39
Identities = 118/438 (26%), Positives = 184/438 (42%), Gaps = 44/438 (10%)

Query: 1148 LTMASLNGTGNFNLGSVMQSDSVAPLNVSGDANGDFIIAINSSGQAPTNLN----VVNTH 1203
LT+ +L G+G F + L V DA+G + + +SG P + N V
Sbjct: 473 LTVNTLAGSGLFRMNVFADLGLSDKLVVMQDASGQHRLWVRNSGSEPASANTLLLVQTPL 532

Query: 1204 GGDARFALAN--GPVALGNYMTNLAKDANGNFVLTADKSAMTPGTAGIL----------- 1250
G A F LAN G V +G Y LA + NG + L K+ P A
Sbjct: 533 GSAATFTLANKDGKVDIGTYRYRLAANGNGQWSLVGAKAPPAPKPAPQPGPQPPQPPQPQ 592

Query: 1251 -------------------AVANTTPV-----IFNAELSSIQQRLDKQSTEANQSGMWGS 1286
A NT V ++ AE +++ +RL + + G WG
Sbjct: 593 PEAPAPQPPAGRELSAAANAAVNTGGVGLASTLWYAESNALSKRLGELRLNPDAGGAWGR 652

Query: 1287 YLNNNFEVKGRAAN-FDQKLNGITLGGDKATSLADGVLSIGGFASYSSSDIKTDYQSKGK 1345
++ RA FDQK+ G LG D A ++A G +GG A Y+ D G
Sbjct: 653 GFAQRQQLDNRAGRRFDQKVAGFELGADHAVAVAGGRWHLGGLAGYTRGDRGFTGDGGGH 712

Query: 1346 VDSHSFGAYAQYLANSGYYMNAVVKNNQFSQDVNITSINGSA-SGVSNFSGMGIALKAGK 1404
DS G YA Y+A+SG+Y++A ++ ++ D + +G A G G+G +L+AG+
Sbjct: 713 TDSVHVGGYATYIADSGFYLDATLRASRLENDFKVAGSDGYAVKGKYRTHGVGASLEAGR 772

Query: 1405 HFNFNEA-YVSPYVAMSAFSSGKSNISLSNGMEAQSSSTRSAIGTLGVNAGYRFVMNNGA 1463
F + ++ P ++ F +G +NG+ + S +G LG+ G R + G
Sbjct: 773 RFTHADGWFLEPQAELAVFRAGGGAYRAANGLRVRDEGGSSVLGRLGLEVGKRIELAGGR 832

Query: 1464 ELKPYAIFAVDHEFAKNNQVTVNQEVFDNNLSGTRVNTGAGMNVNITPNLSVGSEVKLSS 1523
+++PY +V EF V N L GTR G GM + S+ + + S
Sbjct: 833 QVQPYIKASVLQEFDGAGTVHTNGIAHRTELRGTRAELGLGMAAALGRGHSLYASYEYSK 892

Query: 1524 GKDIKTPVTINLNVGYSF 1541
G + P T + YS+
Sbjct: 893 GPKLAMPWTFHAGYRYSW 910


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
YpsIP31758_1184	SYCDCHAPRONE	30	0.008	Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 29.5 bits (66), Expect = 0.008
Identities = 17/89 (19%), Positives = 25/89 (28%)

Query: 39 LGLAYLAQGDLTAARKNLEKAVEADPQDYRTQLGMAFYAQRIGENSAAEQRYQQAMKLAP 98
L G A K + D D R LG+ Q +G+ A Y +
Sbjct: 42 LAFNQYQSGKYEDAHKVFQALCVLDHYDSRFFLGLGACRQAMGQYDLAIHSYSYGAIMDI 101

Query: 99 GNGTVLNNYGAFLCSLGQYVSAQQQFSAA 127
+ L G+ A+ A
Sbjct: 102 KEPRFPFHAAECLLQKGELAEAESGLFLA 130


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
YpsIP31758_1193	RTXTOXIND	29	0.043	Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 29.0 bits (65), Expect = 0.043
Identities = 12/123 (9%), Positives = 42/123 (34%), Gaps = 2/123 (1%)

Query: 254 PTPSAAAELVSRNQIELVRQIQGQQQRMEMAMDYYLAQRNQQFTRLEHRLQQQHPHLRLA 313
P +E L+++ Q + + L ++ + + R+ + R+
Sbjct: 174 PYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVE 233

Query: 314 RQQTLLLKLQRRLEESAQTQIRLLSKRTERLQQRLQQVQPQGQIHRYNQRVQQQEYRLRQ 373
+ + L L + A + +L + + ++ + + Q+ + + + +
Sbjct: 234 KSR--LDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQL 291

Query: 374 AVE 376
+
Sbjct: 292 VTQ 294


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
YpsIP31758_1210	HTHFIS	78	9e-19	FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 77.9 bits (192), Expect = 9e-19
Identities = 33/150 (22%), Positives = 70/150 (46%), Gaps = 5/150 (3%)

Query: 10 QSGSVLIVEDEPKLGQLLVDYLQAAGYRTQWLTNGAEVVATVRQTPPAIILLDLMLPGSD 69
++L+ +D+ + +L L AGY + +N A + + +++ D+++P +
Sbjct: 2 TGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 70 GITLCREIR-RFSDIPIVMVTAKTEEIDRLLGLEIGADDYICKPYSPREVVARVKTIL-- 126
L I+ D+P+++++A+ + + E GA DY+ KP+ E++ + L
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121

Query: 127 --RRCSQQRHQPTDDAPLLINESRFQASYQ 154
RR S+ D PL+ + Q Y+
Sbjct: 122 PKRRPSKLEDDSQDGMPLVGRSAAMQEIYR 151


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
YpsIP31758_1211	BCTERIALGSPF	34	0.001	Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 34.0 bits (78), Expect = 0.001
Identities = 24/90 (26%), Positives = 38/90 (42%), Gaps = 21/90 (23%)

Query: 170 LSTLLAAAVTWVLS-------------RGMLAPVKRLVEGTHRLAA------GDFST--R 208
L+TL+AA++ + ++A V+ V H LA G F
Sbjct: 77 LATLVAASMPLEEALDAVAKQSEKPHLSQLMAAVRSKVMEGHSLADAMKCFPGSFERLYC 136

Query: 209 VAVSSRDELGHLAQDFNQLASSLEKNEQMR 238
V++ + GHL N+LA E+ +QMR
Sbjct: 137 AMVAAGETSGHLDAVLNRLADYTEQRQQMR 166


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
YpsIP31758_1212	TCRTETB	126	5e-34	Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 126 bits (318), Expect = 5e-34
Identities = 97/435 (22%), Positives = 182/435 (41%), Gaps = 17/435 (3%)

Query: 20 FMQTLDTTIVNTALPSIAASLGENPLRMQSVIVSYVLTVAVMLPASGWLADRIGVKWVFF 79
F L+ ++N +LP IA + P V +++LT ++ G L+D++G+K +
Sbjct: 24 FFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLL 83

Query: 80 SAIILFTFGSLMCAQSATLNE-LILSRVLQGVGGAMMVPVGRLTVMKIVPREQYMAAMAF 138
II+ FGS++ + LI++R +QG G A + + V + +P+E A
Sbjct: 84 FGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGL 143

Query: 139 VTLPGQIGPLVGPALGGFLVEFASWHWIFLINLP-VGVIGALATLLLMPNHKMSTRRFDI 197
+ +G VGPA+GG + + HW +L+ +P + +I + L+ FDI
Sbjct: 144 IGSIVAMGEGVGPAIGGMIAHY--IHWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDI 201

Query: 198 SGFIMLAIGMATLTLALDGHTGLGLSPLAIAGLILCGVIALGSYWWHALGNRFALFSLHL 257
G I++++G+ L + ++ V++ + H L
Sbjct: 202 KGIILMSVGIVFFMLF---------TTSYSISFLIVSVLSFLIFVKHIRKVTDPFVDPGL 252

Query: 258 FKNKIYTLGLVGSMSARIGSGMLPFMTPIFLQIGLGFSPFHAG-LMMIPMIIGSMGMKRI 316
KN + +G++ M P ++ S G +++ P + + I
Sbjct: 253 GKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYI 312

Query: 317 IVQVVNRFGYRRVLVNATLLLAVVSLSLPLVAIMGWTLLMPVVLFFQGMLNALRFSTMNT 376
+V+R G VL L+V L+ + + +++F G L+ + ++T
Sbjct: 313 GGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGLSFTKTV-IST 371

Query: 377 LTLKTLPDRLASSGNSLLSMAMQLSMSIGVSTAGILLGTFAHHQVATNTPATHSAFLYS- 435
+ +L + A +G SLL+ LS G++ G LL Q S +LYS
Sbjct: 372 IVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLSIPLLDQRLLPMEVDQSTYLYSN 431

Query: 436 -YLCMAIIIALPALI 449
L + II + L+
Sbjct: 432 LLLLFSGIIVISWLV 446


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
YpsIP31758_1213	ACRIFLAVINRP	862	0.0	Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 862 bits (2229), Expect = 0.0
Identities = 286/1035 (27%), Positives = 504/1035 (48%), Gaps = 36/1035 (3%)

Query: 6 LFIQRPVATTLLTLAITLSGIIGFSLLPVSPLPQVDYPVIMVSASMPGADPETMASSVAT 65
FI+RP+ +L + + ++G + LPV+ P + P + VSA+ PGAD +T+ +V
Sbjct: 4 FFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTVTQ 63

Query: 66 PLERALGRIAGVNEMTSTS-SLGSTRIILQFDLNRDINGAARDVQAALNAAQSLLPSGMP 124
+E+ + I + M+STS S GS I L F D + A VQ L A LLP +
Sbjct: 64 VIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQEVQ 123

Query: 125 SRPTYRKMNPSDAPIMIMTLTSDT--FSQGQLYDYASTKLAQKIAQTEGVSDVTVGGSSL 182
+ S + +M+ SD +Q + DY ++ + +++ GV DV + G+
Sbjct: 124 -QQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGAQY 182

Query: 183 PAVRVELNPSALFNQGVSLDAVRQAISAANVRRPQGSVDAAET------HWQVQANDEIK 236
A+R+ L+ L ++ V + N + G + + + A K
Sbjct: 183 -AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFK 241

Query: 237 TAEGYRPLIVHYN-NGSPVRLQDVANVIDSVQDVRNAGMSAGQPAVLLVISREPGANIIA 295
E + + + N +GS VRL+DVA V ++ G+PA L I GAN +
Sbjct: 242 NPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANALD 301

Query: 296 TVDRIRAELPALRASIPASIQLNIAQDRSPTIRASLDEVERSLVIAVALVILVVFIFLRS 355
T I+A+L L+ P +++ D +P ++ S+ EV ++L A+ LV LV+++FL++
Sbjct: 302 TAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQN 361

Query: 356 GRATLIPAVAVPVSLIGTFAAMYLCGFSLNNLSLMALTIATGFVVDDAIVVLENISRHL- 414
RATLIP +AVPV L+GTFA + G+S+N L++ + +A G +VDDAIVV+EN+ R +
Sbjct: 362 MRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMM 421

Query: 415 EAGVKPMVAALRGVREVGFTVLSMSISLVAVFIPLLLMAGLPGRLFREFAVTLSVAIGIS 474
E + P A + + ++ ++ +++ L AVFIP+ G G ++R+F++T+ A+ +S
Sbjct: 422 EDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALS 481

Query: 475 LVISLTLTPMMCAWLLRSHPKGQQQRIRGFG----KVLLAIQQGYGRSLNWALGHTRWVM 530
++++L LTP +CA LL+ + GF Y S+ LG T +
Sbjct: 482 VLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRYL 541

Query: 531 VVLLSTIALNVWLYISIPKTFFPEQDTGRMMGFIQADQSISFQSMQQKLKDFMQIVGADP 590
++ +A V L++ +P +F PE+D G + IQ + + Q+ L +
Sbjct: 542 LIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKNE 601

Query: 591 -----AVDSVTGFT-GGSRTNSGSMFISLKPLSER---QETAQQVITRLRGKLAKEPGAN 641
+V +V GF+ G N+G F+SLKP ER + +A+ VI R + +L K
Sbjct: 602 KANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDGF 661

Query: 642 LFLSSVQDIRVGGRHSNAAYQFTLLADDLAALREWEPKVRAALAKL-----PQLADVNSD 696
+ ++ I G + ++ L D + + R L + L V +
Sbjct: 662 VIPFNMPAIVELGTATGFDFE---LIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPN 718

Query: 697 QQDKGAEMALTYDRETMARLGIDVSEANALLNNAFGQRQISTIYQPLNQYKVVMEVAPEY 756
+ A+ L D+E LG+ +S+ N ++ A G ++ K+ ++ ++
Sbjct: 719 GLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKF 778

Query: 757 TQDVSSLDKMFVINSNGQSIPLSYFAKWQPANAPLAVNHQGLSAASTISFNLPDGGSLSE 816
+DK++V ++NG+ +P S F + + I G S +
Sbjct: 779 RMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGD 838

Query: 817 ATAAVERAMTELGVPSTVRGAFAGTAQVFQETLKSQLWLIMAAIATVYIVLGILYESYVH 876
A A +E ++L P+ + + G + + + L+ + V++ L LYES+
Sbjct: 839 AMALMENLASKL--PAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSI 896

Query: 877 PLTILSTLPSAGVGALLALELFDAPFSLIALIGIMLLIGIVKKNAIMMVDFALDAQRNGN 936
P++++ +P VG LLA LF+ + ++G++ IG+ KNAI++V+FA D
Sbjct: 897 PVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEG 956

Query: 937 ISAREAIFQASLLRFRPIIMTTLAALFGALPLVLSSGDGAELRQPLGITIVGGLVVSQLL 996
EA A +R RPI+MT+LA + G LPL +S+G G+ + +GI ++GG+V + LL
Sbjct: 957 KGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLL 1016

Query: 997 TLYTTPVIYLYFDRL 1011
++ PV ++ R
Sbjct: 1017 AIFFVPVFFVVIRRC 1031



Score = 77.6 bits (191), Expect = 2e-16
Identities = 59/350 (16%), Positives = 130/350 (37%), Gaps = 12/350 (3%)

Query: 680 VRAALAKLPQLADVNSDQQDKGAEMALTYDRETMARLGI---DVSEANALLNNAFGQRQI 736
V+ L++L + DV M + D + + + + DV + N+ Q+
Sbjct: 162 VKDTLSRLNGVGDVQLFGAQY--AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQL 219

Query: 737 STIYQPLNQYKVVMEVAPEYTQDVSSLDKMFV-INSNGQSIPLSYFAK--WQPANAPLAV 793
Q +A ++ K+ + +NS+G + L A+ N +
Sbjct: 220 GGTPALPGQQLNASIIAQTRFKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIA 279

Query: 794 NHQGLSAASTISFNLPDGGSLSEATAAVERAMTEL--GVPSTVRGAFA-GTAQVFQETLK 850
G AA +L + A++ + EL P ++ + T Q ++
Sbjct: 280 RINGKPAAGLGIKLATGANAL-DTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIH 338

Query: 851 SQLWLIMAAIATVYIVLGILYESYVHPLTILSTLPSAGVGALLALELFDAPFSLIALIGI 910
+ + AI V++V+ + ++ L +P +G L F + + + G+
Sbjct: 339 EVVKTLFEAIMLVFLVMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGM 398

Query: 911 MLLIGIVKKNAIMMVDFALDAQRNGNISAREAIFQASLLRFRPIIMTTLAALFGALPLVL 970
+L IG++ +AI++V+ + +EA ++ ++ + +P+
Sbjct: 399 VLAIGLLVDDAIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAF 458

Query: 971 SSGDGAELRQPLGITIVGGLVVSQLLTLYTTPVIYLYFDRLRNRFSKQPL 1020
G + + ITIV + +S L+ L TP + + + +
Sbjct: 459 FGGSTGAIYRQFSITIVSAMALSVLVALILTPALCATLLKPVSAEHHENK 508


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
YpsIP31758_1214	ACRIFLAVINRP	873	0.0	Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 873 bits (2256), Expect = 0.0
Identities = 288/1036 (27%), Positives = 502/1036 (48%), Gaps = 29/1036 (2%)

Query: 13 SRLFILRPVATTLFMIAILLAGIIGYRALPVSALPEVDYPTIQVVTLYPGASPDVVTSSI 72
+ FI RP+ + I +++AG + LPV+ P + P + V YPGA V ++
Sbjct: 2 ANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTV 61

Query: 73 TAPLERQFGQMSGLKQMASQS-SGGASVITLQFQLTLPLDVAEQEVQAAINAATNLLPSD 131
T +E+ + L M+S S S G+ ITL FQ D+A+ +VQ + AT LLP +
Sbjct: 62 TQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQE 121

Query: 132 LPYPPIYNKVNPADPPILTLAVTATAIPMTQVE--DMVETRIAQKISQVTGVGLVTLSGG 189
+ I + ++ + TQ + D V + + +S++ GVG V L G
Sbjct: 122 VQQQGIS-VEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 190 QRPAVRVKLNAPAVAALGLDSETIRTAISNANVNSAKGSLDGP------TRSVTLSANDQ 243
Q A+R+ L+A + L + + N A G L G + ++ A +
Sbjct: 181 QY-AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 244 MKSAEEYRDLII-AYQNGAPIRLQDVATIEQGAENNKLAAWANTQSAIVLNIQRQPGVNV 302
K+ EE+ + + +G+ +RL+DVA +E G EN + A N + A L I+ G N
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANA 299

Query: 303 IATADSIREMLPELIKSLPKSVDVKVLTDRTSTIRASVNDVQFELLLAIALVVMVIYLFL 362
+ TA +I+ L EL P+ + V D T ++ S+++V L AI LV +V+YLFL
Sbjct: 300 LDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFL 359

Query: 363 RNAAATIIPSIAVPLSLVGTFAAMYFLGFSINNLTLMALTIATGFVVDDAIVVIENISRY 422
+N AT+IP+IAVP+ L+GTFA + G+SIN LT+ + +A G +VDDAIVV+EN+ R
Sbjct: 360 QNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERV 419

Query: 423 I-EKGEKPLDAALKGAGEIGFTIISLTFSLIAVLIPLLFMEDIVGRLFREFAVTLAVAIL 481
+ E P +A K +I ++ + L AV IP+ F G ++R+F++T+ A+
Sbjct: 420 MMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMA 479

Query: 482 ISAVVSLTLTPMMCARML---SYESLRKQNRLSRASEKFFDWVIAHYAVALKKVLNHPWL 538
+S +V+L LTP +CA +L S E + FD + HY ++ K+L
Sbjct: 480 LSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 539 TLSVAFSTLVLTVILYLLIPKGFFPLQDNGLIQGTLEAPQSVSFSNMAERQQQVAAIILK 598
L + + V+L+L +P F P +D G+ ++ P + + QV LK
Sbjct: 540 YLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLK 599

Query: 599 DPA--VESLTSFVGVDGTNATLNNGRLQINLKPLSERDDRIP---QIITRLQESVSGVPG 653
+ VES+ + G + N G ++LKP ER+ +I R + + +
Sbjct: 600 NEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRD 659

Query: 654 IKLYLQPVQDLTIDTQLSRTQYQFTLQ---ATSLEELSTWVPKLVNELQQK-APFQDVTS 709
++ P I + T + F L + L+ +L+ Q A V
Sbjct: 660 --GFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRP 717

Query: 710 DWQDQGLVAFVNVDRDSASRLGITMAAIDNALYNAFGQRLISTIYTQSNQYRVVLEHDVQ 769
+ + + VD++ A LG++++ I+ + A G ++ + ++ ++ D +
Sbjct: 718 NGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAK 777

Query: 770 ATPGLAAFNDIRLTGSDGKGVPLNSIATIEERFGPLSINHLNQFPSATVSFNLAQGYSLG 829
+ + + ++G+ VP ++ T +G + N PS + A G S G
Sbjct: 778 FRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSG 837

Query: 830 EAVAAVTLAEKEIQLPADITTRFQGSTLAFQAALGSTLWLIIAAIVAMYIVLGVLYESFI 889
+A+A + +LPA I + G + + + L+ + V +++ L LYES+
Sbjct: 838 DAMALMENLAS--KLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWS 895

Query: 890 HPITILSTLPTAGVGALLALMLTGNELDVIAIIGIILLIGIVKKNAIMMIDFALAAERDQ 949
P++++ +P VG LLA L + DV ++G++ IG+ KNAI++++FA +
Sbjct: 896 IPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKE 955

Query: 950 GMTPYDAIYQACLLRFRPILMTTLAALFGALPLMLSTGVGAELRQPLGVCMVGGLIVSQV 1009
G +A A +R RPILMT+LA + G LPL +S G G+ + +G+ ++GG++ + +
Sbjct: 956 GKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATL 1015

Query: 1010 LTLFTTPVIYLLFDKL 1025
L +F PV +++ +
Sbjct: 1016 LAIFFVPVFFVVIRRC 1031



Score = 84.1 bits (208), Expect = 2e-18
Identities = 78/517 (15%), Positives = 191/517 (36%), Gaps = 25/517 (4%)

Query: 533 LNHPWLTLSVAFSTLVLTVILYLLIPKGFFPLQDNGLIQGTLEAPQSVSFSNMAERQQQV 592
+ P +A ++ + L +P +P + + P + + Q V
Sbjct: 6 IRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGA----DAQTVQDTV 61

Query: 593 AAIILKDPAVESLTSFVGVDGTNAT-LNNGRLQINL--KPLSERDDRIPQIITRLQESVS 649
+I +++ + ++T + G + I L + ++ D Q+ +LQ +
Sbjct: 62 TQVI-----EQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATP 116

Query: 650 GVP-GIKLYLQPVQDLTIDTQLSRTQYQFTLQATSLEELSTWVPK-LVNELQQKAPFQDV 707
+P ++ V+ + + L + T+ +++S +V + + L + DV
Sbjct: 117 LLPQEVQQQGISVEKSS-SSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDV 175

Query: 708 TSDWQDQGLVAFVNVDRDSASRLGITMAAIDNALYNAFGQRLISTIYTQSNQYRVVLEHD 767
+ + +D D ++ +T + N L Q + L
Sbjct: 176 QLFGAQYAMR--IWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNAS 233

Query: 768 VQATPGLAAFNDIR----LTGSDGKGVPLNSIATIEERFGPLSIN-HLNQFPSATVSFNL 822
+ A + SDG V L +A +E ++ +N P+A + L
Sbjct: 234 IIAQTRFKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKL 293

Query: 823 AQGYSLGEAVAAV--TLAEKEIQLPADI-TTRFQGSTLAFQAALGSTLWLIIAAIVAMYI 879
A G + + A+ LAE + P + +T Q ++ + + AI+ +++
Sbjct: 294 ATGANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFL 353

Query: 880 VLGVLYESFIHPITILSTLPTAGVGALLALMLTGNELDVIAIIGIILLIGIVKKNAIMMI 939
V+ + ++ + +P +G L G ++ + + G++L IG++ +AI+++
Sbjct: 354 VMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVV 413

Query: 940 DFALAAERDQGMTPYDAIYQACLLRFRPILMTTLAALFGALPLMLSTGVGAELRQPLGVC 999
+ + + P +A ++ ++ + +P+ G + + +
Sbjct: 414 ENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSIT 473

Query: 1000 MVGGLIVSQVLTLFTTPVIYLLFDKLARNTRGKNRHR 1036
+V + +S ++ L TP + K +N+
Sbjct: 474 IVSAMALSVLVALILTPALCATLLKPVSAEHHENKGG 510


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
YpsIP31758_1215	RTXTOXIND	43	1e-06	Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 43.3 bits (102), Expect = 1e-06
Identities = 22/115 (19%), Positives = 42/115 (36%), Gaps = 10/115 (8%)

Query: 84 VIAANTVTVTSRVDGELMALHFTEGQQVKAGDLLAEIDPRPYEVQLTQAQGQLAKDQATL 143
+ + + + + + EG+ V+ GD+L ++ A+ K Q++L
Sbjct: 91 THSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTA-------LGAEADTLKTQSSL 143

Query: 144 DNARRDLARYQKLSK---TGLISQQELDTQSSLVRQSEGSVKADQGAIDSAKLQL 195
AR + RYQ LS+ + + +L + SE V I
Sbjct: 144 LQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTW 198



Score = 42.5 bits (100), Expect = 3e-06
Identities = 23/124 (18%), Positives = 54/124 (43%), Gaps = 4/124 (3%)

Query: 125 YEVQLTQAQGQLAKDQATLDNARRDLARYQKLSKTGLISQQELDTQSSLVRQSEGSVKAD 184
E + +A +L ++ L+ ++ + + L++Q + +RQ+ ++
Sbjct: 257 QENKYVEAVNELRVYKSQLEQIESEILSAK--EEYQLVTQLFKNEILDKLRQTTDNIGLL 314

Query: 185 QGAIDSAKLQLTYSRITAPISGRV-GLKQVDVGNYITSGTATPIVVITQTHPVDVVFTLP 243
+ + + S I AP+S +V LK G +T+ T +V++ + ++V +
Sbjct: 315 TLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAE-TLMVIVPEDDTLEVTALVQ 373

Query: 244 ESDI 247
DI
Sbjct: 374 NKDI 377


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
YpsIP31758_1216	PF05272	31	0.007	Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 31.2 bits (70), Expect = 0.007
Identities = 8/31 (25%), Positives = 14/31 (45%)

Query: 34 LTLLGPSGSGKTTSLMMLAGFETPTQGEITL 64
+ L G G GK+T + L G + + +
Sbjct: 599 VVLEGTGGIGKSTLINTLVGLDFFSDTHFDI 629


16	YpsIP31758_1247	YpsIP31758_1263	Y	Y	N	Pathogenicity Island (biased-composition)
LocusTag	DNBias	CDNBias	%GCBias	Product
YpsIP31758_1247	-3	14	3.194678	dihydrodipicolinate synthase
YpsIP31758_1248	-2	14	3.235322	lipoprotein
YpsIP31758_1249	-1	15	3.199791	phosphoribosylaminoimidazole-succinocarboxamide
YpsIP31758_1250	0	15	4.162217	metalloprotease-like protein
YpsIP31758_1251	2	16	4.252538	hypothetical protein
YpsIP31758_1252	1	16	3.904587	acetyltransferase
YpsIP31758_1253	0	14	1.217322	hypothetical protein
YpsIP31758_1254	0	12	2.292767	D-alanyl-D-alanine carboxypeptidase
YpsIP31758_1255	-1	13	2.313482	succinyl-diaminopimelate desuccinylase
YpsIP31758_1256	0	14	1.806313	hypothetical protein
YpsIP31758_1257	1	14	1.332179	hypothetical protein
YpsIP31758_1258	1	13	-0.791944	ABC transporter periplasmic protein
YpsIP31758_1259	0	12	-2.562610	ABC transporter permease
YpsIP31758_1260	1	17	-5.478254	ABC transporter ATP-binding protein
YpsIP31758_1261	1	20	-6.977190	hypothetical protein
YpsIP31758_1262	2	21	-7.506132	hypothetical protein
YpsIP31758_1263	0	13	-3.038555	sulfatase
ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
YpsIP31758_1252	SACTRNSFRASE	30	0.026	Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 29.5 bits (66), Expect = 0.026
Identities = 10/57 (17%), Positives = 24/57 (42%), Gaps = 1/57 (1%)

Query: 468 ISRVAVTAAWRQQGIARRMIAAEQAHARQQQ-CDFLSVSFGYTAELAHFWHRCGFRL 523
I +AV +R++G+ ++ A++ C + + HF+ + F +
Sbjct: 92 IEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFII 148


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
YpsIP31758_1257	SYCDCHAPRONE	28	0.016	Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 27.6 bits (61), Expect = 0.016
Identities = 16/74 (21%), Positives = 29/74 (39%), Gaps = 3/74 (4%)

Query: 22 QDLLSRSPDNASLLYKIASLYDVQGLELQAVPFYRAAIEHNLVGTELQAAYLGLGSTYRT 81
L S D LY +A G A ++A + + +LGLG+ +
Sbjct: 26 AMLNEISSDTLEQLYSLAFNQYQSGKYEDAHKVFQALCVLDHYDSRF---FLGLGACRQA 82

Query: 82 LGLYQAALETFDHA 95
+G Y A+ ++ +
Sbjct: 83 MGQYDLAIHSYSYG 96


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_1260  PF05272       31     0.007    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 31.2 bits (70), Expect = 0.007
Identities = 14/35 (40%), Positives = 18/35 (51%)

Query: 31 VVSLLGPSGSGKTTLLRAVAGLEKPSQGHIIIGEK 65
V L G G GK+TL+ + GL+ S H IG
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGTG 632


17  YpsIP31758_1279  YpsIP31758_1287  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias     Product
YpsIP31758_1279  1       17       -4.240412   hypothetical protein
YpsIP31758_1280  1       20       -4.938623   coproporphyrinogen III oxidase
YpsIP31758_1281  3       29       -9.334352   acetyltransferase
YpsIP31758_1283  1       23       -7.777468   hypothetical protein
YpsIP31758_1284  0       20       -7.484105   hypothetical protein
YpsIP31758_1285  0       15       -5.737109   hypothetical protein
YpsIP31758_1286  0       13       -4.549340   acetyltransferase YpeA, truncation
YpsIP31758_1287  0       13       -3.875137   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_1281  SACTRNSFRASE  28     0.013    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 27.6 bits (61), Expect = 0.013
Identities = 18/99 (18%), Positives = 36/99 (36%), Gaps = 4/99 (4%)

Query: 24 LRPWNDPEMDIERKLNHDPELFLVAEVNGTIVG--SVMGGYDGHRGSAYYLGVHPDYRGR 81
+ + D +MD+ FL + +G + ++G + V DYR +
Sbjct: 47 FKQYEDDDMDVSYVEEEGKAAFL-YYLENNCIGRIKIRSNWNG-YALIEDIAVAKDYRKK 104

Query: 82 GFANALISRLEKKLIARGCPKLNIMVREDNDAVIGMYEK 120
G AL+ + + L + ++ N + Y K
Sbjct: 105 GVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAK 143


18  YpsIP31758_1298  YpsIP31758_1315  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias     Product
YpsIP31758_1298  2       17       3.044039    sialic acid transporter
YpsIP31758_1299  1       17       3.277596    thiosulfate transporter subunit
YpsIP31758_1300  -2      19       3.327985    sulfate/thiosulfate transporter subunit
YpsIP31758_1301  -2      18       2.153263    sulfate/thiosulfate ABC transporter permease
YpsIP31758_1302  -1      13       -0.098951   sulfate/thiosulfate transporter subunit
YpsIP31758_1303  1       13       -2.423588   cysteine synthase B
YpsIP31758_1304  2       13       -4.688705   DNA-binding response regulator
YpsIP31758_1305  2       13       -5.697849   sensor histidine kinase
YpsIP31758_1306  1       15       -5.317598   von Willebrand factor type A domain-containing
YpsIP31758_1307  0       18       -5.089637   aminotransferase, classes I and II
YpsIP31758_1308  1       16       -4.161167   peptidase
YpsIP31758_1309  0       14       -2.665378   solute/sodium symporter (SSS) family protein
YpsIP31758_1310  -1      12       -1.526069   pyridine nucleotide-disulfide oxidoreductase
YpsIP31758_1311  -1      11       -0.452059   efflux ABC transporter ATP-binding
YpsIP31758_1312  0       14       -1.114061   RND efflux transporter
YpsIP31758_1313  1       17       -1.204901   CpxR family transcriptional regulator
YpsIP31758_1314  1       17       -1.155732   sensor histidine kinase CpxA
YpsIP31758_1315  2       19       -1.071607   PTS system glucose-specific transporter
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_1298  TCRTETB       71     2e-15    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 71.4 bits (175), Expect = 2e-15
Identities = 68/399 (17%), Positives = 140/399 (35%), Gaps = 36/399 (9%)

Query: 48 DFVLITLVLTDIKQEFGLTLIQATSLISAAFISRWFGGLVLGAMGDRYGRKLAMITSIVL 107
+ +++ + L DI +F + +A ++ G V G + D+ G K ++ I++
Sbjct: 29 NEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIII 88

Query: 108 FSFGTLACGLAPGYTTLFI-ARLIIGIGMAGEYGSSSTYVMESWPKNMRNKASGFLISGF 166
FG++ + + +L I AR I G G A V PK R KA G + S
Sbjct: 89 NCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIV 148

Query: 167 SIGAVLAAQAYSYVVPAFGWRMLFYIGLLPIIFALWLRKNLPEAEDWEKAQSKQKKGKQV 226
++G + + W L I ++ II +L K L +
Sbjct: 149 AMGEGVGPAIGGMIAHYIHWSYLLLIPMITIITVPFLMKLLKKEVR-------------- 194

Query: 227 TDRNMVDILYRSHLSYLNIGLTIFAAVSLYLCFTGMVSTLLVVVLGILCAAIFIYFMVQT 286
+ H I L + + ++ FT S +++ +L IF+ + +
Sbjct: 195 ---------IKGHFDIKGIIL-MSVGIVFFMLFTTSYSISF-LIVSVLSFLIFVKHIRKV 243

Query: 287 SGD----RWPTGVMLMVVVFCAFLYSWPIQA---LLPTYLKMDLGYDPHTVGNILFFSG- 338
+ + M+ V C + + ++P +K +G+++ F G
Sbjct: 244 TDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGT 303

Query: 339 FGAAVGCCVGGFLGDWLGTRK-AYVTSLLISQLLIIPLFAIQGSSILFLGGLLFLQQMLG 397
+ +GG L D G + +S + F ++ +S ++F+
Sbjct: 304 MSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFV-LGGL 362

Query: 398 QGIAGLLPKLLGGYFDTEQRAAGLGFTYNVGALGGALAP 436
++ ++ ++ AG+ L
Sbjct: 363 SFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGI 401



Score = 34.5 bits (79), Expect = 9e-04
Identities = 35/173 (20%), Positives = 67/173 (38%), Gaps = 11/173 (6%)

Query: 297 LMVVVFCAFLYSWPIQALLPTYLKMDLGYDPHTVGNILFFSGFGAAVGCCVGGFLGDWLG 356
L ++ F + L + LP + D P + + ++G V G L D LG
Sbjct: 19 LCILSFFSVLNEMVLNVSLPD-IANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLG 77

Query: 357 TRKAYVTSLLISQL-LIIPLFAIQGSSILFLGGLLFLQQMLGQGIAGLLPKLLGGYFDTE 415
++ + ++I+ +I S+L + F+Q L+ ++ Y E
Sbjct: 78 IKRLLLFGIIINCFGSVIGFVGHSFFSLLIMA--RFIQGAGAAAFPALVMVVVARYIPKE 135

Query: 416 QRAAGLGFTYNVGALGGALAPILGASIAQHLSLGTALGSLSFSLTFVVILLIG 468
R G ++ A+G + P +G IA ++ S+ L +I +I
Sbjct: 136 NRGKAFGLIGSIVAMGEGVGPAIGGMIAHYI-------HWSYLLLIPMITIIT 181


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_1302  PF05272       29     0.041    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 28.9 bits (64), Expect = 0.041
Identities = 10/23 (43%), Positives = 14/23 (60%)

Query: 30 MVALLGPSGSGKTTLLRIIAGLE 52
V L G G GK+TL+ + GL+
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGLD 620


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_1304  HTHFIS        93     8e-24    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 92.6 bits (230), Expect = 8e-24
Identities = 33/134 (24%), Positives = 62/134 (46%)

Query: 2 KILIAEDNAHIRNGLMEVLAHEGYRPIAAENGVQALALYRQQQPDFIILDIMMPELDGYK 61
IL+A+D+A IR L + L+ GY N D ++ D++MP+ + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 VCREIRKHDWQTPIIFLSAKDEEIDRVIGLELGADDYISKPFGIHEMRARIKTIVRRCLR 121
+ I+K P++ +SA++ + + E GA DY+ KPF + E+ I + R
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 122 KVPESAEDAGFPFG 135
+ + +D+
Sbjct: 125 RPSKLEDDSQDGMP 138


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_1312  RTXTOXIND     59     8e-12    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 59.5 bits (144), Expect = 8e-12
Identities = 40/259 (15%), Positives = 87/259 (33%), Gaps = 25/259 (9%)

Query: 27 FRWISPPDKPSYITAVAEIRDLEQTVLADGTIKAQKQVTVGAQVSGQIKALHVTLGQQVE 86
F+ +S + + + E Q + K+ V +I +
Sbjct: 176 FQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKS 235

Query: 87 KNQLVAEI--DDLAQQNALKDAEEALKNVQAQRAAKIA--TQKNNQLTYQRQQQILAKGV 142
+ + + ++A+ + E + + Q +++ +++ L
Sbjct: 236 RLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQL---- 291

Query: 143 GVRADFDS-IKAILEATQAEISALDAQIAQAEIAVSTAKLNLGYTKISSPIAGTVVAIPV 201
V F + I L T I L ++A+ E + I +P++ V + V
Sbjct: 292 -VTQLFKNEILDKLRQTTDNIGLLTLELAKNE-------ERQQASVIRAPVSVKVQQLKV 343

Query: 202 -EEGQTVNAVQSAPTIIKVAQLDTMTVEAQISEADVVKVKTGMPVYFTILGEPEKRF--- 257
EG V + ++ V + DT+ V A + D+ + G + P R+
Sbjct: 344 HTEGGVVTTAE--TLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYL 401

Query: 258 SATLRAIEPAPDSINDETT 276
++ I D+I D+
Sbjct: 402 VGKVKNI--NLDAIEDQRL 418



Score = 49.1 bits (117), Expect = 2e-08
Identities = 17/167 (10%), Positives = 58/167 (34%), Gaps = 17/167 (10%)

Query: 10 RLIGWVVLLLIIGGLLFFRWISPPDKPSYITAVAEIRDLEQTVLADGTIKAQKQV-TVGA 68
RL+ + ++ ++ + + + +E A+G + + +
Sbjct: 58 RLVAYFIMGFLVIAFI---L-------------SVLGQVEIVATANGKLTHSGRSKEIKP 101

Query: 69 QVSGQIKALHVTLGQQVEKNQLVAEIDDLAQQNALKDAEEALKNVQAQRAAKIATQKNNQ 128
+ +K + V G+ V K ++ ++ L + + +L + ++ ++ +
Sbjct: 102 IENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIE 161

Query: 129 LTYQRQQQILAKGVGVRADFDSIKAILEATQAEISALDAQIAQAEIA 175
L + ++ + + + + + + S Q Q E+
Sbjct: 162 LNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELN 208


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_1313  HTHFIS        89     1e-22    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 89.1 bits (221), Expect = 1e-22
Identities = 32/122 (26%), Positives = 63/122 (51%), Gaps = 1/122 (0%)

Query: 2 KILLVDDDLELGTMLKEYLGGEGFTAKHVLTGKAGIDGALSGDYTALILDIMLPDMSGID 61
IL+ DDD + T+L + L G+ + +GD ++ D+++PD + D
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 VLRQVRK-KSRLPIIMLTAKGDNIDRVIGLEMGADDYMPKPCYPRELVARLRAVLRRFEE 120
+L +++K + LP+++++A+ + + E GA DY+PKP EL+ + L +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 121 QP 122
+P
Sbjct: 125 RP 126


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_1314  PF06580       38     7e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 37.9 bits (88), Expect = 7e-05
Identities = 41/231 (17%), Positives = 83/231 (35%), Gaps = 59/231 (25%)

Query: 239 ELRSPLARLQLAIGLAHQNPGNVDNAL----QRIEHESERLDKMIGEL-------LALSR 287
++ S QL A NP + NAL I + + +M+ L L S
Sbjct: 153 KMASMAQEAQLMALKAQINPHFMFNALNNIRALILEDPTKAREMLTSLSELMRYSLRYSN 212

Query: 288 AENHSLADD----DEYFDLQEL-------VKVVVNDARYEAQLPGVEIQLEVAAQSEYTV 336
A SLAD+ D Y L + + +N A + Q+P + +Q V
Sbjct: 213 ARQVSLADELTVVDSYLQLASIQFEDRLQFENQINPAIMDVQVPPMLVQTLV-------- 264

Query: 337 KGNAELMRRAIENIVRNALRFSASGQQVKVTLSALDKRYQIQVIDQGPGVEENKLSSIFD 396
EN +++ + G ++ + + + ++V + G +N
Sbjct: 265 -----------ENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKN------- 306

Query: 397 PFVRVKSAMSGKGYGLGLAITHK-VILAHGGQVEAR-NGEQGGLVITLRVP 445
+ + G GL + + + +G + + + + +QG + + +P
Sbjct: 307 ---------TKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMVLIP 348


19  YpsIP31758_1375  YpsIP31758_1385  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias     Product
YpsIP31758_1375  0       19       -3.712607   hypothetical protein
YpsIP31758_1376  0       15       -2.499076   hypothetical protein
YpsIP31758_1377  0       16       -2.279615   ImpE family protein
YpsIP31758_1378  1       17       -1.798723   hypothetical protein
YpsIP31758_1379  2       20       -3.997950   hypothetical protein
YpsIP31758_1380  2       23       -6.573150   alkylphosphonate utilization operon protein
YpsIP31758_1381  3       26       -7.459632   hypothetical protein
YpsIP31758_1382  3       31       -10.129799  LemA family protein
YpsIP31758_1383  2       29       -8.466808   *hypothetical protein
YpsIP31758_1384  2       27       -7.244125   hypothetical protein
YpsIP31758_1385  0       23       -4.982039   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_1377  PF07201       28     0.029    Hypersensitivity response secretion protein HrpJ
		>PF07201#Hypersensitivity response secretion protein HrpJ

Length = 293

Score = 28.3 bits (63), Expect = 0.029
Identities = 15/99 (15%), Positives = 29/99 (29%), Gaps = 16/99 (16%)

Query: 55 QLETLTQLLPEFTKQAELYKNLILSEKMRDEVLAGKRSPGTL--------GNDLPEWVAL 106
Q+ +PE ++ + ++ + + + E +
Sbjct: 86 QVNQYLSKVPELEQKQNV-------SELLSLLSNSPNISLSQLKAYLEGKSEEPSEQFKM 138

Query: 107 LQQA-NQLHHDGDHQQSEALREQALQQAPESIGESAATG 144
L + L + L EQAL E GE+ G
Sbjct: 139 LCGLRDALKGRPELAHLSHLVEQALVSMAEEQGETIVLG 177


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_1381  cloacin       36     6e-04    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 35.8 bits (82), Expect = 6e-04
Identities = 16/42 (38%), Positives = 21/42 (50%)

Query: 619 SSSATATPSRSSDSSSSSSSSGSGSSGGGSSGGGSGGGGGGG 660
A+ SS+++ SGSG GG SG G+GGG G
Sbjct: 30 GGGASDGSGWSSENNPWGGGSGSGIHWGGGSGHGNGGGNGNS 71



Score = 32.0 bits (72), Expect = 0.008
Identities = 14/40 (35%), Positives = 16/40 (40%)

Query: 621 SATATPSRSSDSSSSSSSSGSGSSGGGSSGGGSGGGGGGG 660
S+ P S GSG GG +G GG G GG
Sbjct: 40 SSENNPWGGGSGSGIHWGGGSGHGNGGGNGNSGGGSGTGG 79



Score = 31.6 bits (71), Expect = 0.012
Identities = 10/22 (45%), Positives = 12/22 (54%)

Query: 639 SGSGSSGGGSSGGGSGGGGGGG 660
SG G+ GG + GG G GG
Sbjct: 60 SGHGNGGGNGNSGGGSGTGGNL 81


20  YpsIP31758_1448  YpsIP31758_1467  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias     Product
YpsIP31758_1448  0       14       -3.309905   hypothetical protein
YpsIP31758_1449  -1      15       -2.403619   hypothetical protein
YpsIP31758_1450  -1      16       -1.391687   aminotransferase AlaT
YpsIP31758_1451  0       17       -0.140233   hypothetical protein
YpsIP31758_1452  -1      14       1.526734    LrhA family transcriptional regulator
YpsIP31758_1453  0       21       3.467321    hypothetical protein
YpsIP31758_1454  0       23       3.806284    NADH dehydrogenase subunit A
YpsIP31758_1455  1       24       3.928100    NADH dehydrogenase subunit B
YpsIP31758_1456  1       26       4.112696    bifunctional NADH:ubiquinone oxidoreductase
YpsIP31758_1458  0       26       4.581032    NADH dehydrogenase I subunit F
YpsIP31758_1459  0       27       4.652904    NADH dehydrogenase subunit G
YpsIP31758_1460  -1      28       3.883821    NADH dehydrogenase subunit H
YpsIP31758_1461  1       26       4.332636    NADH dehydrogenase subunit I
YpsIP31758_1462  -1      21       1.357133    NADH dehydrogenase subunit J
YpsIP31758_1463  -1      20       1.211973    NADH dehydrogenase subunit K
YpsIP31758_1464  -1      18       1.083907    NADH dehydrogenase subunit L
YpsIP31758_1465  -1      14       -0.917499   NADH dehydrogenase subunit M
YpsIP31758_1466  -1      14       -2.560233   NADH dehydrogenase subunit N
YpsIP31758_1467  1       17       -3.586752   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_1462  TYPE3OMBPROT  29     0.011    Type III secretion system outer membrane B protein ...
		>TYPE3OMBPROT#Type III secretion system outer membrane B protein family signature.

Length = 538

Score = 28.9 bits (64), Expect = 0.011
Identities = 17/45 (37%), Positives = 23/45 (51%)

Query: 102 LLSVLIYAISSVSDQGISGEMVDAKAVGISLFGPYVLAVELASML 146
L+S +Y+ + Q +SG+ VD K V SL P L SML
Sbjct: 251 LVSAALYSRPELLSQALSGKTVDLKIVSTSLLTPTSLTGGEESML 295


21  YpsIP31758_1478  YpsIP31758_1486  Y  N  N  Genomic Island
LocusTag         DNBias  CDNBias  %GCBias     Product
YpsIP31758_1478  -1      14       3.329104    hypothetical protein
YpsIP31758_1479  -1      13       3.826055    hypothetical protein
YpsIP31758_1480  0       14       4.754738    protein RhiA
YpsIP31758_1481  1       14       6.241758    menaquinone-specific isochorismate synthase
YpsIP31758_1482  0       16       6.547175    2-succinyl-5-enolpyruvyl-6-hydroxy-3-
YpsIP31758_1483  0       16       5.544906    acyl-CoA thioester hydrolase YfbB
YpsIP31758_1484  0       15       5.052445    naphthoate synthase
YpsIP31758_1485  0       16       5.068438    O-succinylbenzoate synthase
YpsIP31758_1486  0       15       3.073315    O-succinylbenzoic acid--CoA ligase
22  YpsIP31758_1500  YpsIP31758_1528  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias     Product
YpsIP31758_1500  -1      21       -4.978168   threonine and homoserine efflux system
YpsIP31758_1501  0       17       -5.533485   outer membrane protein X
YpsIP31758_1502  -1      18       -5.578840   cation diffusion facilitator family transporter
YpsIP31758_1503  0       17       -4.630704   hypothetical protein
YpsIP31758_1504  1       15       -2.722363   hypothetical protein
YpsIP31758_1505  1       14       -3.329277   hypothetical protein
YpsIP31758_1506  1       13       -1.045980   oxidoreductase, zinc-binding dehydrogenase
YpsIP31758_1507  2       14       -0.590244   ribose ABC transporter periplasmic protein
YpsIP31758_1508  2       15       -0.128705   ribose ABC transporter ATP-binding protein
YpsIP31758_1509  1       16       -0.690349   ribose ABC transporter permease
YpsIP31758_1510  0       14       -1.139432   LacI family sugar-binding transcriptional
YpsIP31758_1511  -1      13       0.176300    LysR family substrate binding transcriptional
YpsIP31758_1512  -2      12       -0.958369   tartrate dehydrogenase
YpsIP31758_1513  -3      13       0.538653    hypothetical protein
YpsIP31758_1514  -2      23       5.166469    hypothetical protein
YpsIP31758_1515  -2      22       4.914484    transporter
YpsIP31758_1516  -1      25       5.573708    Rieske (2Fe-2S) domain-containing protein
YpsIP31758_1517  -1      26       5.589330    oxidoreductase
YpsIP31758_1518  -1      28       6.162597    ShlB/FhaC/HecB family hemolysin
YpsIP31758_1519  -2      27       5.336224    hemagglutinin/adhesin repeat-containing protein
YpsIP31758_1520  -2      17       -2.295693   hypothetical protein
YpsIP31758_1521  -1      16       -1.434805   hypothetical protein
YpsIP31758_1522  1       15       -3.324158   hypothetical protein
YpsIP31758_1523  2       14       -3.459969   hypothetical protein
YpsIP31758_1524  2       15       -4.027751   carbohydrate ABC transporter ATP-binding
YpsIP31758_1525  2       13       -2.904115   phosphomannomutase
YpsIP31758_1526  3       15       -2.580233   LacI family transcriptional regulator
YpsIP31758_1527  3       13       -3.197733   carbohydrate ABC transporter periplasmic-binding
YpsIP31758_1528  3       13       -3.157127   carbohydrate ABC transporter permease
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_1501  ENTEROVIROMP  203    8e-70    Enterobacterial virulence outer membrane protein si...
		>ENTEROVIROMP#Enterobacterial virulence outer membrane protein signature.

Length = 171

Score = 203 bits (517), Expect = 8e-70
Identities = 122/174 (70%), Positives = 135/174 (77%), Gaps = 3/174 (1%)

Query: 1 MKKIACLSAVAACVLAVTAGSAFAGQSTVSGGYAQSDYQGVANKSSGFNLKYRYEWSDSQ 60
MKKIACLSA+AA LA TAG++ A STV+GGYAQSD QG NK GFNLKYRYE +S
Sbjct: 1 MKKIACLSALAAV-LAFTAGTSVAATSTVTGGYAQSDAQGQMNKMGGFNLKYRYEEDNSP 59

Query: 61 LGYITSFTHTEKSGFGDEAVYNKAQYNAITGGPAYRINDWASIYGLVGVGHGRFTQNESA 120
LG I SFT+TEKS YNK QY IT GPAYRINDWASIYG+VGVG+G+F E
Sbjct: 60 LGVIGSFTYTEKSRTASSGDYNKNQYYGITAGPAYRINDWASIYGVVGVGYGKFQTTE-- 117

Query: 121 FVGDKHSTSDYGFTYGAGLQFNPAENVALDVSYEQSRIRNVDVGTWVAGVGYTF 174
+ KH TSDYGF+YGAGLQFNP ENVALD SYEQSRIR+VDVGTW+AGVGY F
Sbjct: 118 YPTYKHDTSDYGFSYGAGLQFNPMENVALDFSYEQSRIRSVDVGTWIAGVGYRF 171


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_1507  FLGHOOKAP1    29     0.024    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 29.2 bits (65), Expect = 0.024
Identities = 11/60 (18%), Positives = 23/60 (38%), Gaps = 4/60 (6%)

Query: 40 NDYFVSMKEALEQAANDIGAKVYIADAGHDVSKQINDVED---MLQKKIDILLINPTDSV 96
D+F S++ L A D A+ + + Q + K+++I + D +
Sbjct: 110 QDFFTSLQT-LVSNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQI 168


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_1519  PF05860       83     2e-20    haemagglutination activity domain.
		>PF05860#haemagglutination activity domain.

Length = 117

Score = 82.5 bits (204), Expect = 2e-20
Identities = 21/141 (14%), Positives = 41/141 (29%), Gaps = 24/141 (17%)

Query: 68 AAIVADASAPGNQQPTIINSANGTPQVNIQAPSSGGVSRNVYSQFDVDGRGVILNNGHGV 127
A I D + P N + I + T + + + + + +F V G N
Sbjct: 1 AQITPDTTLPIN---SNITTEGNTRIIERGTQAGSNLFHS-FQEFSVPTSGTAFFN---- 52

Query: 128 NQTELGGFIDGNPWLARGEASIILNEVNSRDPSKLNGYIEVAGRKAQVVIANSAGITCEG 187
I++ V S ++G I A + + N GI
Sbjct: 53 ---------------NPTNIQNIISRVTGGSVSNIDGLIRANAT-ANLFLINPNGIIFGQ 96

Query: 188 CGFINANRVTLTTGQAQLNNG 208
++ + + +L
Sbjct: 97 NARLDIGGSFVGSTANRLKFA 117


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_1524  PF05272       35     7e-04    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 34.7 bits (79), Expect = 7e-04
Identities = 16/49 (32%), Positives = 23/49 (46%), Gaps = 1/49 (2%)

Query: 32 IVLVGPSGCGKSTLLRMIAGLEDVNSGEIKI-EDKDVTQTNAGARGVSM 79
+VL G G GKSTL+ + GL+ + I KD + AG +
Sbjct: 599 VVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGTGKDSYEQIAGIVAYEL 647


23  YpsIP31758_1557  YpsIP31758_1574  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias     Product
YpsIP31758_1557  0       17       -3.070208   hypothetical protein
YpsIP31758_1558  0       16       -4.598943   hypothetical protein
YpsIP31758_1559  -1      12       -3.671324   hypothetical protein
YpsIP31758_1560  0       14       -3.977645   hypothetical protein
YpsIP31758_1561  0       18       -4.410187   transporter
YpsIP31758_1562  -1      18       -3.330014   hypothetical protein
YpsIP31758_1563  -1      16       -3.304843   LuxR family transcriptional regulator
YpsIP31758_1564  2       19       0.409535    N-methyltryptophan oxidase
YpsIP31758_1565  2       19       0.739512    biofilm formation regulatory protein BssS
YpsIP31758_1566  2       17       0.969057    DNA damage-inducible protein I
YpsIP31758_1567  2       17       0.920367    dihydroorotase
YpsIP31758_1568  3       17       0.652859    antibiotic biosynthesis monooxygenase
YpsIP31758_1569  4       19       1.171526    ribonuclease E
YpsIP31758_1570  1       17       0.795988    23S rRNA pseudouridylate synthase C
YpsIP31758_1571  0       15       0.911716    Maf-like protein
YpsIP31758_1572  1       17       1.644980    hypothetical protein
YpsIP31758_1573  2       16       1.642013    hypothetical protein
YpsIP31758_1574  2       19       1.796344    50S ribosomal protein L32
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_1569  IGASERPTASE   43     9e-06    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 43.1 bits (101), Expect = 9e-06
Identities = 39/287 (13%), Positives = 83/287 (28%), Gaps = 21/287 (7%)

Query: 504 QLHEAEMAQPLEEATIERKRPEQPALATFSLPTEVPPEEAPTVAKAKPAVATPAAVSTDV 563
L+ E+ + T++ P +P+ P +A+ A P A +T
Sbjct: 979 DLYNPEVEK--RNQTVDTTNITTPNNIQADVPSV--PSNNEEIARVDEAPVPPPAPATPS 1034

Query: 564 EQPGFFSRLFSGLKNMFGASAEAEVQPAEVVKTDASENRRNDRR-----NPRRQNNGRKE 618
E +++ E + E + DA+E +R + N +
Sbjct: 1035 ET-----------TETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTN 1083

Query: 619 RNDRTPREGRDNSSRDNTNRDNTSRDNANRDGANRDNSNRDNSGRDNVSREGREDQRRNN 678
++ E ++ + + ++ + + + + + +E E +
Sbjct: 1084 EVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQA 1143

Query: 679 RRPAQPTTTSQGQTEVVEADKAQREEQPQRRGDRQRRRQDEKRQAPQEIKADVAEAPVIE 738
+ T + + + EQP + Q V E P
Sbjct: 1144 EPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVE-QPVTESTTVNTGNSVVENPENT 1202

Query: 739 EVQPEQEERQQVMQRRQRRQLNQKVRIQSANDELNTLESPVSAPVAQ 785
Q + + + + VR N E T S + VA
Sbjct: 1203 TPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVAL 1249



Score = 39.3 bits (91), Expect = 1e-04
Identities = 39/288 (13%), Positives = 78/288 (27%), Gaps = 28/288 (9%)

Query: 671 REDQRRNNRRPAQPTTTSQGQTEVVEADKAQREEQPQRRGDR-QRRRQDEKRQAPQEIKA 729
++R TT + Q + P + + R DE P
Sbjct: 984 EVEKRNQTVDTTNITTPNNIQAD-----------VPSVPSNNEEIARVDEAPVPPPAPAT 1032

Query: 730 DVAEAPVIEEVQPEQEERQQVMQRRQRRQLNQKVRIQSANDELNTLESPVSAPVAQVVVA 789
+ E ++ + + ++ Q R + + N + + VAQ
Sbjct: 1033 PSETTETVAENSKQESKTVEKNEQDATETTAQN-REVAKEAKSNVKANTQTNEVAQS--- 1088

Query: 790 EVQEEVKLLPQITAQTDDDSANERTTNNENGMPRRSRRSPRHLRVSGQRRRRYRDERYPA 849
E + T +T E+ +++ P+ ++ + + A
Sbjct: 1089 -GSETKETQTTETKETATVEKEEK----AKVETEKTQEVPKVTSQVSPKQEQSETVQPQA 1143

Query: 850 QSAMPLAGAFASPEMASGKVWVRYPVTPVVEQVVVEQVVVEQIAIEQTTTVEQTAIVEQV 909
+ A E S + T EQ E + + ++TTV V +
Sbjct: 1144 EPARENDPTVNIKEPQS-----QTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVEN 1198

Query: 910 SVANIVTAQLP--VEQVQNTVAEQESSATPSVMTTPTVAVTLAPQHKP 955
P + N + + SV A T +
Sbjct: 1199 PENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRST 1246


24  YpsIP31758_1645  YpsIP31758_1674  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias     Product
YpsIP31758_1645  1       19       3.707746    chemotaxis-specific methylesterase
YpsIP31758_1646  1       18       1.648710    chemotaxis regulatory protein CheY
YpsIP31758_1647  1       16       0.882149    chemotaxis regulator CheZ
YpsIP31758_1648  1       16       0.878504    hypothetical protein
YpsIP31758_1649  1       15       0.860379    N-acetylmuramoyl-L-alanine amidase
YpsIP31758_1650  0       13       1.460561    hemagglutination repeat-containing protein
YpsIP31758_1651  -1      15       -4.019633   hypothetical protein
YpsIP31758_1652  -1      14       -3.227332   alanine racemase
YpsIP31758_1653  0       17       -2.985316   hypothetical protein
YpsIP31758_1654  0       17       -2.590752   hypothetical protein
YpsIP31758_1656  0       18       -2.874047   patatin family phospholipase
YpsIP31758_1655  2       19       -5.871722   hypothetical protein
YpsIP31758_1657  3       21       -5.619885   hypothetical protein
YpsIP31758_1658  3       22       -3.832002   hypothetical protein
YpsIP31758_1659  3       22       -3.838833   hypothetical protein
YpsIP31758_1660  3       24       -3.871218   hypothetical protein
YpsIP31758_1661  4       22       -3.114986   spore coat U domain-containing protein
YpsIP31758_1662  5       21       -2.314227   hypothetical protein
YpsIP31758_1663  3       16       -1.046541   fimbrial usher protein
YpsIP31758_1664  3       15       0.162464    pilus assembly chaperone
YpsIP31758_1665  3       15       0.358857    hypothetical protein
YpsIP31758_1666  1       13       1.017909    hypothetical protein
YpsIP31758_1667  -2      10       0.261476    hypothetical protein
YpsIP31758_1668  -2      11       0.223471    mce family protein
YpsIP31758_1669  -2      11       -0.424064   PqiA family integral membrane protein
YpsIP31758_1670  -1      16       -1.668935   hypothetical protein
YpsIP31758_1671  0       15       -1.853980   solute/DNA competence effector
YpsIP31758_1672  0       12       -1.533516   carboxy-terminal protease
YpsIP31758_1673  2       14       -1.741108   IS1541, transposase
YpsIP31758_1674  2       15       -1.583035   heat shock protein HtpX
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_1645  HTHFIS        63     6e-13    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 62.5 bits (152), Expect = 6e-13
Identities = 27/109 (24%), Positives = 52/109 (47%), Gaps = 5/109 (4%)

Query: 1 MSKIRVLCVDDSALMRQLMTEIINSHPDMEMVAAAQDPLVARDLIKKFNPQVLTLDVEMP 60
M+ +L DD A +R ++ + ++ V + I + ++ DV MP
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAG--YDVRITSNAATLWRWIAAGDGDLVVTDVVMP 58

Query: 61 RMDGLDFLEKLMRLRPMPVVMVSSLTGKNSEITM-RALELGAIDFVTKP 108
+ D L ++ + RP V+V ++ +N+ +T +A E GA D++ KP
Sbjct: 59 DENAFDLLPRIKKARPDLPVLV--MSAQNTFMTAIKASEKGAYDYLPKP 105


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_1646  HTHFIS        89     6e-24    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 88.7 bits (220), Expect = 6e-24
Identities = 34/105 (32%), Positives = 53/105 (50%), Gaps = 3/105 (2%)

Query: 7 RFLVVDDFSTMRRIVRNLLKELGFHNVEEAEDGVDALNKLRAGGFDFVVSDWNMPNMDGL 66
LV DD + +R ++ L G+ +V + + AG D VV+D MP+ +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGY-DVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 67 DLLKTIRTDGALATLPVLMVTAEAKKENIIAAAQAGASGYVVKPF 111
DLL I+ A LPVL+++A+ I A++ GA Y+ KPF
Sbjct: 64 DLLPRIKK--ARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPF 106


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_1650  PF03895       64     1e-14    Serum resistance protein DsrA.
		>PF03895#Serum resistance protein DsrA.

Length = 79

Score = 63.7 bits (155), Expect = 1e-14
Identities = 22/78 (28%), Positives = 34/78 (43%)

Query: 803 DSTLSAGIAGAMAMASLTQPYTPGASMATIGAASYRGQSALSVGVSSISDSGRWVSKLQA 862
L G+A A++ L QP G + + YR ++AL++GV S A
Sbjct: 2 SKELQTGLANQSALSMLVQPNGVGKTSVSAAVGGYRDKTALAIGVGSRITDRFTAKAGVA 61

Query: 863 SSNTQGDMGVGVGVGYQW 880
+ G M G VGY++
Sbjct: 62 FNTYNGGMSYGASVGYEF 79


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_1652  ALARACEMASE   198    2e-62    Alanine racemase signature.
		>ALARACEMASE#Alanine racemase signature.

Length = 356

Score = 198 bits (506), Expect = 2e-62
Identities = 85/354 (24%), Positives = 160/354 (45%), Gaps = 30/354 (8%)

Query: 45 AWLEISQGALDFNTKKMLTLLDNKSTLCAILKGDAYGHDLTLVTPVMLKNNVQCIGVASN 104
+ AL N ++ + + +++K +AYGH + + + + + +
Sbjct: 5 IQASLDLQALKQNLS-IVRQAATHARVWSVVKANAYGHGIERIWSAIGATD--GFALLNL 61

Query: 105 QELKTVRDLGFTGQLIRVRSAT-LKEMQQAMAYDVEELIGDKTVAEQLNNIAKLNGKVLR 163
+E T+R+ G+ G ++ + ++++ + + + + L N L
Sbjct: 62 EEAITLRERGWKGPILMLEGFFHAQDLEIYDQHRLTTCVHSNWQLKALQNARL--KAPLD 119

Query: 164 IHLALNSAGMSRNGLEVSKARGLNDAKTIAGLKNLTIVGIMSHYPVEDASE-IKADLARF 222
I+L +NS GM+R G + + L + + + N+ + +MSH+ + + I +AR
Sbjct: 120 IYLKVNS-GMNRLGFQPDRV--LTVWQQLRAMANVGEMTLMSHFAEAEHPDGISGAMARI 176

Query: 223 QQQAKDVIAVTGLKREKIKLHVANTFATLAVPDSWLDMVRVGGVFYG-------DTIAST 275
+Q A GL+ + ++N+ ATL P++ D VR G + YG IA+T
Sbjct: 177 EQ------AAEGLECRR---SLSNSAATLWHPEAHFDWVRPGIILYGASPSGQWRDIANT 227

Query: 276 EYKRVMTFKSNIASLNNYPKGGTVGYDRTYTLKRDSLLANIPVGYADGYRRVFSNAGHVI 335
+ VMT S I + G VGY YT + + + + GYADGY R V+
Sbjct: 228 GLRPVMTLSSEIIGVQTLKAGERVGYGGRYTARDEQRIGIVAAGYADGYPRHAPTGTPVL 287

Query: 336 IQGQRLPVLGKTSMNTVIVDVTDLKKVSLGDEVVLFGKQGNAEIQAEEIEDLSG 389
+ G R +G SM+ + VD+T + +G V L+GK EI+ +++ +G
Sbjct: 288 VDGVRTMTVGTVSMDMLAVDLTPCPQAGIGTPVELWGK----EIKIDDVAAAAG 337


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_1663  PF00577       458    e-151    Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 458 bits (1180), Expect = e-151
Identities = 141/825 (17%), Positives = 293/825 (35%), Gaps = 78/825 (9%)

Query: 46 TLYLELVVNDRNFGNA-VPISYRNNRYY----LSQSQLRAIGLPISEPLAPEIAIDN--- 97
T +++ +N+ V + ++ L+++QL ++GL ++ + +
Sbjct: 77 TYRVDIYLNNGYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLNT-ASVSGMNLLADDAC 135

Query: 98 ------MAGVNVKYDGENQRLLINVPSEWLPKQQIEVTEQDDFNLAQSSLGALFNYDIYA 151
+ + D QRL + +P ++ + + ++ L NY+
Sbjct: 136 VPLTSMIHDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWD--PGINAGLLNYNFSG 193

Query: 152 TQGYPYSSLTHFSAWTEQRIFDRFGLLSNTGVYRTHFPSNNNTDDAKGYIRFDTQWQKND 211
A+ + G + S++++ +K + W + D
Sbjct: 194 NSVQNRIGGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERD 253

Query: 212 EEHLL-RYSAGDLITGALPWCSAIRLGGIQIARHFAIRPDLITYPLPQFSGQAAVPSTVD 270
L R + GD T + I G Q+A + PD P G A + V
Sbjct: 254 IIPLRSRLTLGDGYTQGDIF-DGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVT 312

Query: 271 LYIDNFRTQSANINPGPFVINNAPRINGAGQATIVTTDALGRQISTSVPFYVASTLLKPG 330
+ + + ++ + PGPF IN+ +G + +A G +VP+ L + G
Sbjct: 313 IKQNGYDIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREG 372

Query: 331 VWDFSLSGGALRRNYAIRSADYGEMVASGVVRYGTTPWLTLEGRGDIAKEMHVIGGGVNF 390
+S++ G R A + + +G T+ G +A G+
Sbjct: 373 HTRYSITAGEYRSGNAQQEKPR---FFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGK 429

Query: 391 RMGLLGVLNSAYSISNTSNGAFNNVAEPLNTNNATPNRLPSPAASRRGRGNQRSLGYSYS 450
MG LG L+ + +N++ + + + + L + + + G N + +GY YS
Sbjct: 430 NMGALGALSVDMTQANST------LPDDSQHDGQSVRFLYNKSLNESGT-NIQLVGYRYS 482

Query: 451 NA-FFNL--------NAQHIISSDEYSD----LANYKTPSLLSRRMTQLTGSLSLGSYGT 497
+ +FN N +I + D +Y + R QLT + LG T
Sbjct: 483 TSGYFNFADTTYSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRTST 542

Query: 498 V----------GSGYFDVRDALGEQTRLINISYSTSLLRNSNFYSALNRELGRKGYNVQL 547
+ G+ D + G T +I+++ S N + + + L
Sbjct: 543 LYLSGSHQTYWGTSNVDEQFQAGLNTAFEDINWTLSYSLTKNAWQ------KGRDQMLAL 596

Query: 548 VWSIPLGPR-----------GSSSISATRTNDNQWIQQLNYSRSAPSNGGLGWNL--AYA 594
+IP S+S S + + + + + L +++ YA
Sbjct: 597 NVNIPFSHWLRSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYA 656

Query: 595 NSTNNNNQ-YQQADIVWRTSMMESRMGLYGNSNNYNYWGGLTGSLVVMNRSVYASNMIND 653
+ N+ A + +R + +G + + + G++G ++ V +ND
Sbjct: 657 GGGDGNSGSTGYATLNYRGGYGNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQPLND 716

Query: 654 AFALVSTNGFSNIPVSYENQLIGTTNAKGYLLIPTVASYYQAKFQIDPMNLPADVMLPNV 713
LV G + V ENQ T+ +GY ++P Y + + +D L +V L N
Sbjct: 717 TVVLVKAPGAKDAKV--ENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNA 774

Query: 714 ERRLAIGERSGYLINFPIKRISAVNIRITDASGQDLPKGSAIYTTGNIPISYVGWDGMVY 773
+ + F + + + +T + + LP G+ + + + V +G VY
Sbjct: 775 VANVVPTRGAIVRAEFKARVGIKLLMTLTH-NNKPLPFGAMVTSESSQSSGIVADNGQVY 833

Query: 774 IEQVAQLNNLRI-IRADNGTQCYSQFKLKTTEGIQDAG--TTVCR 815
+ + +++ + C + ++L Q + CR
Sbjct: 834 LSGMPLAGKVQVKWGEEENAHCVANYQLPPESQQQLLTQLSAECR 878


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_1665  IGASERPTASE   29     0.016    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 28.9 bits (64), Expect = 0.016
Identities = 17/67 (25%), Positives = 26/67 (38%), Gaps = 7/67 (10%)

Query: 62 DSSNF-GSINFGNITSLATAINATSGLNAGTITIQCNGNPSVTLALNSGANMTGNISAGR 120
+ +N G++N + G TIQ GN V L NS ++TGN +
Sbjct: 811 NPTNLRGNVNLTESANFVLGKANLFG------TIQSRGNSQVRLTENSHWHLTGNSDVHQ 864

Query: 121 HLLNSST 127
L +
Sbjct: 865 LDLANGH 871


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_1671  IGASERPTASE   33     0.001    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 33.1 bits (75), Expect = 0.001
Identities = 17/90 (18%), Positives = 26/90 (28%), Gaps = 11/90 (12%)

Query: 92 EEQHVEHARKQLEEAKARVQAQRAEQQAKKREAAIAAGETPEPRRPRPAGKKPAPRREAG 151
EE+ K E K V +Q + +Q + A EP R P +
Sbjct: 1109 EEKAKVETEKTQEVPK--VTSQVSPKQEQSETVQPQA----EPAREN----DPTVNIKEP 1158

Query: 152 AAPENRKPRQS-PRPQQVRPPRPQVEENQP 180
+ N P + V E+
Sbjct: 1159 QSQTNTTADTEQPAKETSSNVEQPVTESTT 1188


25  YpsIP31758_1683  YpsIP31758_1688  Y  N  N  Genomic Island
LocusTag         DNBias  CDNBias  %GCBias     Product
YpsIP31758_1683  -1      19       -3.993082   transcriptional regulator
YpsIP31758_1684  0       18       -3.748958   N-acetylmuramoyl-L-alanine amidase
YpsIP31758_1685  -1      20       -4.167411   sodium:dicarboxylate symporter family protein
YpsIP31758_1686  -1      20       -5.085195   hypothetical protein
YpsIP31758_1687  -1      18       -4.482002   oligogalacturonate-specific porin protein KdgM
YpsIP31758_1688  -2      17       -3.575501   oligogalacturonide ABC transporter periplasmic
26  YpsIP31758_1704  YpsIP31758_1713  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
YpsIP31758_1704  -2      15       -3.340827  chelated iron ABC transporter ATP-binding
YpsIP31758_1705  0       17       -3.805810  chelated iron ABC transporter periplasmic
YpsIP31758_1706  0       20       -3.932518  transglycosylase slt family protein
YpsIP31758_1707  0       15       -2.994733  multiple drug resistance protein MarC
YpsIP31758_1708  4       16       -2.346294  hypothetical protein
YpsIP31758_1709  3       18       -2.387054  hypothetical protein
YpsIP31758_1710  3       20       -1.720904  hypothetical protein
YpsIP31758_1711  4       20       -1.476685  hypothetical protein
YpsIP31758_1712  2       20       -0.933325  threonyl-tRNA synthetase
YpsIP31758_1713  3       23       -0.700732  translation initiation factor IF-3
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_1705  ADHESNFAMILY  389  e-139  Adhesin family signature.
		>ADHESNFAMILY#Adhesin family signature.

Length = 309

Score = 389 bits (1002), Expect = e-139
Identities = 105/309 (33%), Positives = 179/309 (57%), Gaps = 7/309 (2%)

Query: 10 LRAAALFTIVAFSSLISTAALAENNPSDTAKKFKVVTTFTIIQDIAQNIAGDVAVVESIT 69
++ ++ S++I A + + + +K KVV T +II DI +NIAGD + SI
Sbjct: 1 MKKLGTLLVLFLSAIILVACASGKKDTTSGQKLKVVATNSIIADITKNIAGDKIDLHSIV 60

Query: 70 KPGAEIHDYQPTPRDIVKAQSADLILWNGMNLER----WFEKFFESIK---DVPSAVVTA 122
G + H+Y+P P D+ K ADLI +NG+NLE WF K E+ K + V+
Sbjct: 61 PIGQDPHEYEPLPEDVKKTSEADLIFYNGINLETGGNAWFTKLVENAKKTENKDYFAVSD 120

Query: 123 GITPLPIREGPYSGIANPHAWMSPSNALIYIENIRKALVEHDPAHAETYNRNAQAYAEKI 182
G+ + + G +PHAW++ N +I+ +NI K L DP + E Y +N + Y +K+
Sbjct: 121 GVDVIYLEGQNEKGKEDPHAWLNLENGIIFAKNIAKQLSAKDPNNKEFYEKNLKEYTDKL 180

Query: 183 KALDAPLRERLSRIPAEQRWLVTSEGAFSYLAKDYGFKEVYLWPINAEQQGIPQQVRHVI 242
LD +++ ++IPAE++ +VTSEGAF Y +K YG Y+W IN E++G P+Q++ ++
Sbjct: 181 DKLDKESKDKFNKIPAEKKLIVTSEGAFKYFSKAYGVPSAYIWEINTEEEGTPEQIKTLV 240

Query: 243 DIIRENKIPVVFSESTISDKPAKQVSKETGAQYGGVLYVDSLSGEKGPVPTYISLINMTV 302
+ +R+ K+P +F ES++ D+P K VS++T ++ DS++ + +Y S++ +
Sbjct: 241 EKLRQTKVPSLFVESSVDDRPMKTVSQDTNIPIYAQIFTDSIAEQGKEGDSYYSMMKYNL 300

Query: 303 DTIAKGFGQ 311
D IA+G +
Sbjct: 301 DKIAEGLAK 309


27  YpsIP31758_1872  YpsIP31758_1877  Y  N  N  Genomic Island
LocusTag         DNBias  CDNBias  %GCBias    Product
YpsIP31758_1872  1       17       -4.484408  DNA replication terminus site-binding protein
YpsIP31758_1873  0       16       -3.562162  fumarate hydratase
YpsIP31758_1874  0       17       -4.378581  mannose-6-phosphate isomerase
YpsIP31758_1875  -2      16       -4.679726  hypothetical protein
YpsIP31758_1876  -2      17       -3.614314  hypothetical protein
YpsIP31758_1877  0       17       -3.843627  hypothetical protein
28  YpsIP31758_1937  YpsIP31758_1974  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
YpsIP31758_1937  -1      18       -3.130432  tryptophan synthase subunit alpha
YpsIP31758_1938  0       18       -2.896730  hypothetical protein
YpsIP31758_1939  0       18       -3.190065  phospholipid-binding domain-containing protein
YpsIP31758_1940  1       16       -2.973517  outer membrane protein W
YpsIP31758_1941  -1      19       -2.525054  hypothetical protein
YpsIP31758_1942  0       18       -2.032467  hypothetical protein
YpsIP31758_1943  0       19       -2.118821  sulfate permease
YpsIP31758_1944  3       24       -3.832323  intracellular septation protein A
YpsIP31758_1945  2       26       -3.408260  acyl-CoA thioester hydrolase
YpsIP31758_1946  2       28       -4.469604  transporter
YpsIP31758_1947  3       28       -5.876386  YciI-like protein
YpsIP31758_1948  -1      20       -4.422844  hypothetical protein
YpsIP31758_1949  -1      21       -4.129405  hypothetical protein
YpsIP31758_1950  -2      21       -4.505084  hypothetical protein
YpsIP31758_1951  -2      19       -3.979954  Ail/Lom family protein
YpsIP31758_1952  -2      18       -3.305743  hypothetical protein
YpsIP31758_1953  -1      17       -3.102031  cardiolipin synthetase
YpsIP31758_1954  0       18       -3.817106  dsDNA-mimic protein
YpsIP31758_1955  0       18       -4.380244  hypothetical protein
YpsIP31758_1956  0       17       -4.168004  oligopeptide ABC transporter ATP-binding protein
YpsIP31758_1957  0       16       -4.180860  oligopeptide ABC transporter ATP-binding
YpsIP31758_1958  -1      18       -3.575734  oligopeptide ABC transporter permease
YpsIP31758_1959  0       19       -3.794620  oligopeptide transporter permease
YpsIP31758_1960  1       21       -3.839680  oligopeptide ABC transporter periplasmic
YpsIP31758_1961  3       28       -3.150698  hypothetical protein
YpsIP31758_1962  2       24       -2.881523  hypothetical protein
YpsIP31758_1963  1       22       -3.574913  bifunctional acetaldehyde-CoA/alcohol
YpsIP31758_1964  0       25       -5.140423  thymidine kinase
YpsIP31758_1965  0       25       -5.037328  hypothetical protein
YpsIP31758_1966  0       24       -4.952900  global DNA-binding transcriptional dual
YpsIP31758_1967  -1      27       -7.460454  IS1541, transposase
YpsIP31758_1968  -1      29       -8.499742  UDP-glucose/GDP-mannose dehydrogenase family
YpsIP31758_1969  -1      26       -7.879674  response regulator of RpoS
YpsIP31758_1970  -1      23       -7.171587  hypothetical protein
YpsIP31758_1971  -1      24       -7.790559  formyltetrahydrofolate deformylase
YpsIP31758_1974  0       19       -4.464538  **LysR family transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_1938  BCTERIALGSPF  25  0.011  Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 25.2 bits (55), Expect = 0.011
Identities = 8/29 (27%), Positives = 17/29 (58%)

Query: 18 LTAVITALQRLFVYMKQALFSAISWVISV 46
L+ V+ + F++MKQAL + ++ +
Sbjct: 191 LSVVVPKVVEQFIHMKQALPLSTRVLMGM 219


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_1943  PF03944  30  0.024  delta endotoxin
		>PF03944#delta endotoxin

Length = 633

Score = 30.4 bits (68), Expect = 0.024
Identities = 29/137 (21%), Positives = 52/137 (37%), Gaps = 29/137 (21%)

Query: 367 SLLSFKSLWQLRKRNTQAFYLA--IFTFVSVVLVGVISGIGLAVLL---------GLLQF 415
SL + + W K+N + YL + T S +L V S +G +L G
Sbjct: 30 SLDTVQKEWTEWKKNNHSLYLDPIVGTVASFLLKKVGSLVGKRILSELRNLIFPSGSTNL 89

Query: 416 LRTVFRPTEQLLG--VNADGMIHSMGNGNGIKAVPGVMIYRFNSPLTYFNVAYFKRRILN 473
++ + R TE+ L +N D + G++A NV F R++ N
Sbjct: 90 MQDILRETERFLNQRLNTDTVARVNAELTGLQA----------------NVEEFNRQVDN 133

Query: 474 LVDSTPHPADWVVIDAV 490
++ + + +V
Sbjct: 134 FLNPNRNAVPLSITSSV 150


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_1946  TONBPROTEIN  161  4e-51  Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 161 bits (408), Expect = 4e-51
Identities = 93/248 (37%), Positives = 130/248 (52%), Gaps = 16/248 (6%)

Query: 10 RRLTWSLIFSIGLHGSVVAALLYVSVEQMKIQPEIEDTPLAVTMVNIAEFAAPQPAAAAP 69
RR W + S+ +HG+VVA LLY SV Q+ P P++VTMV A P
Sbjct: 7 RRFPWPTLLSVCIHGAVVAGLLYTSVHQVIELPAPAQ-PISVTMVTPA-------DLEPP 58

Query: 70 EPVQETPAVPEETPPVLEETPPEPEELPEPVPEPVKPKPKPVKKEVKKEVKKEVKKPEVK 129
+ VQ P E P E P P+E P + +P K K + E K +VK
Sbjct: 59 QAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQ---EQPKRDVK 115

Query: 130 KTQAPPDDKPFKSDEAALVANNAPVKSAPVASTPGLSTSAGPKALSKAKPSYPARALALG 189
++ P PF++ A + ++ + P S ++GP+ALS+ +P YPARA AL
Sbjct: 116 PVESRPA-SPFENTAPARLTSSTATAATS---KPVTSVASGPRALSRNQPQYPARAQALR 171

Query: 190 IEGQVKVQYDIDESGRVTNVRVLEATPRNTFEREVKQVMRKWRFEA-VAAKNYVTTIVFK 248
IEGQVKV++D+ GRV NV++L A P N FEREVK MR+WR+E V I+FK
Sbjct: 172 IEGQVKVKFDVTPDGRVDNVQILSAKPANMFEREVKNAMRRWRYEPGKPGSGIVVNILFK 231

Query: 249 LDGKMEMN 256
++G E+
Sbjct: 232 INGTTEIQ 239


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_1951  ENTEROVIROMP  161  2e-53  Enterobacterial virulence outer membrane protein si...
		>ENTEROVIROMP#Enterobacterial virulence outer membrane protein signature.

Length = 171

Score = 161 bits (410), Expect = 2e-53
Identities = 72/180 (40%), Positives = 106/180 (58%), Gaps = 10/180 (5%)

Query: 1 MKWITTLAPLSLALSLGISVANAASDASNTVSFGYAQSTLKIDGEKIGKDNKGFNLKYRH 60
MK I L+ L+ L+ + AA+ +TV+ GYAQS + K+ GFNLKYR+
Sbjct: 1 MKKIACLSALAAVLAFTAGTSVAAT---STVTGGYAQSDAQGQMNKM----GGFNLKYRY 53

Query: 61 ELD-SVLGIVASFTHTKQNYGMPGDSDGKRKVEYYSLMVGPSWRFNEFVSAYALIGATQG 119
E D S LG++ SFT+T+++ K +YY + GP++R N++ S Y ++G G
Sbjct: 54 EEDNSPLGVIGSFTYTEKSRTASSGDYNK--NQYYGITAGPAYRINDWASIYGVVGVGYG 111

Query: 120 KSTHTKPRMVSNTVSKTSMGYGAGLQFNPVKHVAIDTAYEYAKIEDVKIGTWIVGVGYRF 179
K T+ + S YGAGLQFNP+++VA+D +YE ++I V +GTWI GVGYRF
Sbjct: 112 KFQTTEYPTYKHDTSDYGFSYGAGLQFNPMENVALDFSYEQSRIRSVDVGTWIAGVGYRF 171


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_1956  HTHFIS  31  0.007  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 31.0 bits (70), Expect = 0.007
Identities = 9/16 (56%), Positives = 11/16 (68%)

Query: 54 VVGESGCGKSTFARAI 69
+ GESG GK ARA+
Sbjct: 165 ITGESGTGKELVARAL 180


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_1968  NUCEPIMERASE  29  0.032  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 29.4 bits (66), Expect = 0.032
Identities = 20/88 (22%), Positives = 33/88 (37%), Gaps = 14/88 (15%)

Query: 1 MKVTVFGI-GYVGLVQATVLAEVGHDVLCID-IDANKVADLKKGRIAIFEPGLAPLVK-- 56
MK V G G++G + L E GH V+ ID ++ LK+ R+ + K
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKID 60

Query: 57 -ENYEAGRLQFSTD---------AQAGV 74
+ E F++ + V
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAV 88


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_1969  HTHFIS  84  4e-20  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 84.1 bits (208), Expect = 4e-20
Identities = 31/115 (26%), Positives = 48/115 (41%), Gaps = 1/115 (0%)

Query: 10 ILVVEDEVVFRTVLAEYLGSLGATIHQAENGLAALYQLKGHSPDLILCDLAMPKMGGIEF 69
ILV +D+ RTVL + L G + N + DL++ D+ MP +
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 70 VEQLLLKGIKIPVLVISATDKMADIAQVLRLGVKDVLLKPIVDLNRLREAVLACL 124
+ ++ +PVLV+SA + + G D L KP DL L + L
Sbjct: 66 LPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKP-FDLTELIGIIGRAL 119


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_1970  SECA  46  1e-08  SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 46.4 bits (110), Expect = 1e-08
Identities = 15/23 (65%), Positives = 18/23 (78%)

Query: 132 PSLGRNDTCLCGSGKKHKKCCGR 154
+GRND C CGSGKK+K+C GR
Sbjct: 877 RKVGRNDPCPCGSGKKYKQCHGR 899



Score = 27.9 bits (62), Expect = 0.019
Identities = 8/14 (57%), Positives = 9/14 (64%)

Query: 5 CPCGSILNYHECCG 18
CPCGS Y +C G
Sbjct: 885 CPCGSGKKYKQCHG 898


29  YpsIP31758_2039  YpsIP31758_2054  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
YpsIP31758_2039  -1      16       -3.889042  methyltransferase
YpsIP31758_2040  -1      20       -5.828007  copper homeostasis protein CutC
YpsIP31758_2041  0       17       -5.106895  hypothetical protein
YpsIP31758_2042  0       16       -4.609108  hypothetical protein
YpsIP31758_2043  -1      15       -4.469037  arginyl-tRNA synthetase
YpsIP31758_2044  0       19       -5.198803  adhesin/hemagglutinin
YpsIP31758_2045  0       15       -1.402663  ShlB/FhaC/HecB family hemolysin
YpsIP31758_2046  -1      13       1.105044   integral membrane protein MviN
YpsIP31758_2047  -1      13       0.940776   oxidoreductase
YpsIP31758_2048  -2      13       0.676621   hypothetical protein
YpsIP31758_2049  -1      14       2.473770   ribosomal-protein-S5-alanine
YpsIP31758_2050  -1      15       3.049830   multidrug resistance protein MdtH
YpsIP31758_2051  -1      19       3.573859   hypothetical protein
YpsIP31758_2052  0       21       4.545686   TorD family cytoplasmic chaperone
YpsIP31758_2053  -1      25       4.614969   hydrolase
YpsIP31758_2054  -2      27       4.506921   *ROK family protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_2044  PF05860  60  1e-12  haemagglutination activity domain.
		>PF05860#haemagglutination activity domain.

Length = 117

Score = 59.8 bits (145), Expect = 1e-12
Identities = 20/100 (20%), Positives = 37/100 (37%), Gaps = 20/100 (20%)

Query: 56 GISVINITSPSEQGLSHNQYMEFNVNEHGVVFNNSLETVAKNGITYQDNRNLRGSTARII 115
+I + + L H+ + EF+V G F N+ + + I
Sbjct: 20 NTRIIERGTQAGSNLFHS-FQEFSVPTSGTAFFNN------------------PTNIQNI 60

Query: 116 LNEVVGSNISILNGHQDIIGMPADYILANANGISCQGCSF 155
++ V G ++S ++G A+ L N NGI +
Sbjct: 61 ISRVTGGSVSNIDGLIRANA-TANLFLINPNGIIFGQNAR 99


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_2050  TCRTETA  63  7e-13  Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 62.5 bits (152), Expect = 7e-13
Identities = 66/356 (18%), Positives = 127/356 (35%), Gaps = 15/356 (4%)

Query: 14 FLLFDNLLVVLGFFVVFPLISIRFVDQLGWAALVV---GLALGLRQLVQQGLGIFGGAIA 70
+L L +G ++ P++ + L + V G+ L L L+Q GA++
Sbjct: 9 VILSTVALDAVGIGLIMPVLPG-LLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALS 67

Query: 71 DRFGAKPMIVTGMLMRAAGFALMAMADEPWILWLACALSGLGGTLFDPPRTALVIKLTRP 130
DRFG +P+++ + A +A+MA A W+L++ ++G+ G A + +T
Sbjct: 68 DRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVA-GAYIADITDG 126

Query: 131 HERGRFYSLLMMQDSAGAVIGALIGSWLLQYDFHFVCWTGAAIFVLAAGWNAWLLPAYRI 190
ER R + + G V G ++G + + H + AA+ L +LLP
Sbjct: 127 DERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPESHK 186

Query: 191 STVRAPMKEGLMRVLRDRRFVTYVLTLTGYYMLAVQVMLMLPI--------VVNELAGSP 242
R P++ + L R+ + + + + L+ + +
Sbjct: 187 GE-RRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWDA 245

Query: 243 AAVKWMYAIEAALSLTLLYPLARWSEKRFSLEQRLMAGLLIMTLSLFPIGMITHLQTLFM 302
+ A L + R + LM G++ + T F
Sbjct: 246 TTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGWMAFP 305

Query: 303 FICFFYMGSILAEPARETLGASLADSRARGSYMGFSRLGLALGGALGYTGGGWMYD 358
+ G I PA + + + D +G G +L +G +Y
Sbjct: 306 IMVLLASGGI-GMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYA 360


30  YpsIP31758_2086  YpsIP31758_2106  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
YpsIP31758_2086  1       19       -3.444762  carboxymuconolactone decarboxylase family
YpsIP31758_2087  2       21       -5.717415  cupin domain-containing protein
YpsIP31758_2088  2       22       -7.182193  hypothetical protein
YpsIP31758_2089  2       22       -6.797293  hypothetical protein
YpsIP31758_2090  2       24       -7.810047  hypothetical protein
YpsIP31758_2091  -1      17       -4.652132  hypothetical protein
YpsIP31758_2092  0       17       -6.039335  hypothetical protein
YpsIP31758_2093  0       17       -6.219500  short chain dehydrogenase/reductase family
YpsIP31758_2094  0       16       -6.219966  hypothetical protein
YpsIP31758_2095  0       15       -5.167566  hypothetical protein
YpsIP31758_2096  1       16       -5.112184  hypothetical protein
YpsIP31758_2097  2       17       -6.169942  hypothetical protein
YpsIP31758_2098  1       16       -3.379734  hypothetical protein
YpsIP31758_2099  2       15       -0.491044  hypothetical protein
YpsIP31758_2100  -1      14       1.166921   UDP-glycosyltransferase family protein
YpsIP31758_2101  -1      18       1.478851   hypothetical protein
YpsIP31758_2102  -3      19       2.577177   hypothetical protein
YpsIP31758_2103  -2      17       3.374367   NAD dependent epimerase/dehydratase family
YpsIP31758_2105  0       12       1.877705   hypothetical protein
YpsIP31758_2104  0       12       1.825235   hypothetical protein
YpsIP31758_2106  2       13       1.359701   metallo-beta-lactamase family protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_2093  DHBDHDRGNASE  67  3e-15  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 66.6 bits (162), Expect = 3e-15
Identities = 56/231 (24%), Positives = 90/231 (38%), Gaps = 25/231 (10%)

Query: 7 IILTGASGLIGSAIADALYKSGMNLVLACKRSQKLQDRYLSDDKSKRAYFWY-GDLTNEK 65
+TGA+ IG A+A L G ++ +KL+ S R + D+ +
Sbjct: 11 AFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRDSA 70

Query: 66 ACRELVEYAVQQMGGVDVLINCAGVFNFSALEEMTYSRITDTISTNLLAPIYLTHLVLPY 125
A E+ ++MG +D+L+N AGV + ++ T S N + V Y
Sbjct: 71 AIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVSKY 130

Query: 126 IKTSACPIIVNISSIAGFSSLPEGACYAASKWGLNGFIHSIREELRKKSIHICNI-SPCQ 184
+ IV + S A YA+SK F + EL + +I CNI SP
Sbjct: 131 MMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIR-CNIVSPGS 189

Query: 185 VKT-----LSHHSDTAIRTIA-----------------PENIANAVILVLS 213
+T L + A + I P +IA+AV+ ++S
Sbjct: 190 TETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVS 240


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_2103  NUCEPIMERASE  80  3e-19  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 80.2 bits (198), Expect = 3e-19
Identities = 67/365 (18%), Positives = 122/365 (33%), Gaps = 85/365 (23%)

Query: 3 NILITGASGFIGGAFMRRFACHDGIRLCGI-------------GRRSVEGFP--TSVRYQ 47
L+TGA+GFIG +R G ++ GI R + P +
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEA-GHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKID 60

Query: 48 ALDLARLATL--DFTPDVVIHAAGRAG---PWGTRREYYRDNVVTTEQVIKFCQSRGNPR 102
D + L + V + R Y N+ +++ C+
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQH 120

Query: 103 LIYLSTAAVYYRYCHQLALTEQSEIGPEFANDYALTKHQGEALIEAYQG----EKTILRP 158
L+Y S+++V Y ++ + + + YA TK E + Y T LR
Sbjct: 121 LLYASSSSV-YGLNRKMPFSTDDSV-DHPVSLYAATKKANELMAHTYSHLYGLPATGLRF 178

Query: 159 CAVFGP-GDQLLFPPLLDAASRHGLPLLISEVPARGELM----HIDVLCDYLLKAAIKPE 213
V+GP G + A G + +V G++ +ID + + +++
Sbjct: 179 FTVYGPWGRPDMALFKFTKAMLEGKSI---DVYNYGKMKRDFTYIDDIAEAIIRLQDVIP 235

Query: 214 LR----------------PF--YNLSNAEPIEINEFLIDVLSK-LGLPAPKREVRVATAM 254
P+ YN+ N+ P+E+ ++ I L LG+ A K + +
Sbjct: 236 HADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDY-IQALEDALGIEAKKNMLPLQ--- 291

Query: 255 LIAGIIEGTYRLLRIKSEPSITRFGVGVLGYSKTLDVSAAIHDFG-SPSRSLSQGLDAFI 313
G + T D A G +P ++ G+ F+
Sbjct: 292 --PGDVLETS------------------------ADTKALYEVIGFTPETTVKDGVKNFV 325

Query: 314 RWYKE 318
WY++
Sbjct: 326 NWYRD 330


31  YpsIP31758_2170  YpsIP31758_2255  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
YpsIP31758_2170  1       24       3.295261   carbohydrate ABC transporter permease
YpsIP31758_2171  2       22       3.911784   carbohydrate ABC transporter permease
YpsIP31758_2172  2       20       2.116086   carbohydrate ABC transporter periplasmic-binding
YpsIP31758_2173  2       16       -1.199065  oxidoreductase, NAD-binding
YpsIP31758_2174  3       25       -6.941439  hypothetical protein
YpsIP31758_2175  3       23       -6.107391  hypothetical protein
YpsIP31758_2176  2       22       -6.025872  GntR family transcriptional regulator
YpsIP31758_2177  2       27       -8.289908  hypothetical protein
YpsIP31758_2178  0       24       -6.148842  hypothetical protein
YpsIP31758_2180  -1      23       -3.630537  hypothetical protein
YpsIP31758_2181  2       21       2.958082   hypothetical protein
YpsIP31758_2183  2       24       3.074707   baseplate assembly protein V, truncation
YpsIP31758_2184  1       22       2.844233   baseplate assembly protein W
YpsIP31758_2185  0       21       2.824061   hypothetical protein
YpsIP31758_2186  0       23       2.080145   hypothetical protein
YpsIP31758_2187  1       21       1.971375   tail sheath protein
YpsIP31758_2188  1       20       0.933184   tail tube protein
YpsIP31758_2189  1       18       -1.890077  tail protein
YpsIP31758_2190  1       18       -4.683802  tail protein
YpsIP31758_2192  1       18       -3.827367  phage P2 GpU
YpsIP31758_2193  2       20       -3.290829  late control gene D protein
YpsIP31758_2194  3       23       -4.797516  phage transcriptional activator, Ogr/Delta
YpsIP31758_2195  2       21       -4.541767  hypothetical protein
YpsIP31758_2196  1       19       -3.652885  hypothetical protein
YpsIP31758_2197  2       17       -1.706764  hypothetical protein
YpsIP31758_2198  2       18       -2.462240  tail collar domain-containing protein
YpsIP31758_2199  3       18       -3.390100  hypothetical protein
YpsIP31758_2200  1       18       -2.211977  hypothetical protein
YpsIP31758_2201  1       18       -2.417650  hypothetical protein
YpsIP31758_2202  1       20       -2.435723  hypothetical protein
YpsIP31758_2203  1       21       -3.074651  hypothetical protein
YpsIP31758_2204  0       18       -3.136689  hypothetical protein
YpsIP31758_2205  -1      17       -3.057168  hypothetical protein
YpsIP31758_2206  1       21       -4.299959  lysozyme
YpsIP31758_2207  4       31       -6.629767  hypothetical protein
YpsIP31758_2208  2       30       -6.605536  hypothetical protein
YpsIP31758_2209  3       30       -6.717644  hypothetical protein
YpsIP31758_2211  1       28       -6.434261  hypothetical protein
YpsIP31758_2212  0       26       -4.564603  hypothetical protein
YpsIP31758_2213  0       15       -3.027897  transcriptional regulator
YpsIP31758_2214  0       16       -1.421831  hypothetical protein
YpsIP31758_2215  2       15       -0.107708  hypothetical protein
YpsIP31758_2216  2       15       0.940078   hypothetical protein
YpsIP31758_2217  3       13       0.579770   hypothetical protein
YpsIP31758_2218  3       13       0.096254   hypothetical protein
YpsIP31758_2219  4       16       0.537961   hypothetical protein
YpsIP31758_2220  4       14       0.986287   hypothetical protein
YpsIP31758_2221  4       13       0.696549   hypothetical protein
YpsIP31758_2222  3       14       0.112460   hypothetical protein
YpsIP31758_2223  2       13       -0.502420  hypothetical protein
YpsIP31758_2224  1       14       -0.667915  phage-associated protein
YpsIP31758_2225  0       15       -1.232595  phage protein
YpsIP31758_2226  1       20       -3.918323  bifunctional antitoxin/transcriptional repressor
YpsIP31758_2227  1       21       -4.339157  addiction module antitoxin
YpsIP31758_2228  2       21       -4.760665  hypothetical protein
YpsIP31758_2229  3       22       -5.457011  hypothetical protein
YpsIP31758_2230  3       21       -6.351981  hypothetical protein
YpsIP31758_2231  4       28       -6.981473  ParB family protein
YpsIP31758_2232  7       33       -7.764299  hypothetical protein
YpsIP31758_2233  6       33       -6.913453  hypothetical protein
YpsIP31758_2234  6       30       -6.346613  hypothetical protein
YpsIP31758_2235  3       25       -5.770010  hypothetical protein
YpsIP31758_2236  3       31       -8.226618  hypothetical protein
YpsIP31758_2237  1       32       -9.125951  prevent-host-death family protein
YpsIP31758_2238  0       30       -8.380662  RelE/ParE family plasmid stabilization system
YpsIP31758_2239  0       25       -5.702531  phage antitermination protein Q
YpsIP31758_2240  1       25       -5.852645  hypothetical protein
YpsIP31758_2241  0       26       -5.595570  hypothetical protein
YpsIP31758_2242  0       23       -3.325287  hypothetical protein
YpsIP31758_2243  1       25       -3.839882  hypothetical protein
YpsIP31758_2244  2       24       -4.070261  replicative DNA helicase
YpsIP31758_2245  4       29       -5.410422  hypothetical protein
YpsIP31758_2246  2       29       -4.853125  hypothetical protein
YpsIP31758_2247  1       27       -4.855978  hypothetical protein
YpsIP31758_2248  0       25       -4.490816  repressor protein C2
YpsIP31758_2249  0       23       -3.930387  hypothetical protein
YpsIP31758_2250  -1      23       -4.322086  phage integrase family site specific
YpsIP31758_2251  -1      25       -4.713528  ***phosphatidylglycerophosphate synthetase
YpsIP31758_2252  -1      27       -5.938229  excinuclease ABC subunit C
YpsIP31758_2253  0       30       -7.941147  response regulator
YpsIP31758_2254  -1      19       -6.193244  hypothetical protein
YpsIP31758_2255  0       16       -3.799251  GlpM protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_2172  MALTOSEBP  49  3e-08  Maltose binding protein signature.
		>MALTOSEBP#Maltose binding protein signature.

Length = 396

Score = 48.6 bits (115), Expect = 3e-08
Identities = 101/420 (24%), Positives = 171/420 (40%), Gaps = 55/420 (13%)

Query: 14 TLLMASNASA---QETLRVLLEGHSTSDSIKALLPEFEKQTGIKVQAEIVPYSDLTSKAL 70
T++ +++A A + L + + G + + + +FEK TGIKV E + D +
Sbjct: 17 TMMFSASALAKIEEGKLVIWINGDKGYNGLAEVGKKFEKDTGIKVTVE---HPDKLEEKF 73

Query: 71 LAFSSHSGRYDVVMDDWVHAV--GYASAGYITPVDQWMESDTAFYDGADFVKSYA---DT 125
++ D++ W H GYA +G + + D AF D K Y D
Sbjct: 74 PQVAATGDGPDIIF--WAHDRFGGYAQSGLLAEI----TPDKAFQD-----KLYPFTWDA 122

Query: 126 LRYKDGYYGLPVYGESTFLMYRKDLFEQYGIAVPKTFDELTAAAKTIKEKTEGKVAGITL 185
+RY P+ E+ L+Y KDL PKT++E+ A K +K K GK A +
Sbjct: 123 VRYNGKLIAYPIAVEALSLIYNKDLLPN----PPKTWEEIPALDKELKAK--GKSA--LM 174

Query: 186 RGAQGIQNTFAWASFLWGYGGQWIDDNGK-----SAIASPQAVEATKSFVNILKNYGPIG 240
Q + F W G + +NGK + + A V+++KN
Sbjct: 175 FNLQ--EPYFTWPLIAADGGYAFKYENGKYDIKDVGVDNAGAKAGLTFLVDLIKNKHMNA 232

Query: 241 AANFGWQENRLVFQQGKAAMTIDSTVNGGFNEDPKESTVVGKVGYAPVPVQPGDHPGNSG 300
++ E F +G+ AMTI NG + +++ KV Y + +
Sbjct: 233 DTDYSIAE--AAFNKGETAMTI----NGPWAWSNIDTS---KVNYGVTVLPTFKGQPSKP 283

Query: 301 ALQVHGLYISSDSKKQDAAWKFISWATDKQTQMKSVELNPNAGVSSLSAINSDAFTKRYG 360
+ V I++ S ++ A +F+ +++V + L A+ ++ +
Sbjct: 284 FVGVLSAGINAASPNKELAKEFLENYLLTDEGLEAVNKD-----KPLGAVALKSYEEELA 338

Query: 361 AFKDGMLAALQNGNAK--YLPTIPQSTQIINITGIALSEALAGTQTVENALQQANTRNDK 418
KD +AA K +P IPQ + A+ A +G QTV+ AL+ A TR K
Sbjct: 339 --KDPRIAATMENAQKGEIMPNIPQMSAFWYAVRTAVINAASGRQTVDEALKDAQTRITK 396


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_2192  BONTOXILYSIN  29  0.006  Bontoxilysin signature.
		>BONTOXILYSIN#Bontoxilysin signature.

Length = 1196

Score = 29.5 bits (66), Expect = 0.006
Identities = 11/51 (21%), Positives = 24/51 (47%), Gaps = 4/51 (7%)

Query: 80 WSL-IEGNGAIHGMFVIESLERTKSIFFSDGSARKIEF-TLSLKRTDESLK 128
W + E NG + + I+S +S++ S+ + ++S+ R + L
Sbjct: 929 WEIYFEDNGLVFEI--IDSNGNQESVYLSNIINDNWYYISISVDRLKDQLL 977


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_2200  SURFACELAYER  32  0.004  Lactobacillus surface layer protein signature.
		>SURFACELAYER#Lactobacillus surface layer protein signature.

Length = 439

Score = 31.9 bits (72), Expect = 0.004
Identities = 19/84 (22%), Positives = 35/84 (41%)

Query: 101 IAFTPVRAGTLIRNRVTNTLWATDGDVVTDAAGNATVNATCTLAGAQGANSDNLTIIATP 160
+A P+ A + N T + + T+A + V + + A + I +
Sbjct: 16 LAVAPIAATAMPVNAATTINADSAINANTNAKYDVDVTPSISAIAAVAKSDTMPAIPGSL 75

Query: 161 IGGITAVTNGATASMGLDKETNNA 184
G I+A NG + + L K++ NA
Sbjct: 76 TGSISASYNGKSYTANLPKDSGNA 99


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_2222  CHANLCOLICIN  29  0.042  Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 28.9 bits (64), Expect = 0.042
Identities = 16/39 (41%), Positives = 24/39 (61%)

Query: 244 AEKKREEAEAKADAKDEEIAQLKEKASDAAIAKRLSDVI 282
A+ K+ +AE A AK AQ K KA+ A+ +RL D++
Sbjct: 60 AQLKKTQAEQAARAKAAAEAQAKAKANRDALTQRLKDIV 98


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_2253  HTHFIS  72  7e-17  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 71.8 bits (176), Expect = 7e-17
Identities = 25/115 (21%), Positives = 46/115 (40%), Gaps = 2/115 (1%)

Query: 2 ISVLLVDDHELVRAGIRRILDDIKGIKVAGEMQCGEDAVKWCRSHVVDIVLMDMNMPGIG 61
++L+ DD +R + + L G V +W + D+V+ D+ MP
Sbjct: 4 ATILVADDDAAIRTVLNQALSR-AGYDVRIT-SNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 62 GLEATRKILRFSPDTKVIMLTIHTENPLPAKVMQAGAGGYLSKGAAPQDVITAIR 116
+ +I + PD V++++ K + GA YL K ++I I
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIG 116


32  YpsIP31758_2277  YpsIP31758_2302  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
YpsIP31758_2277  3       21       -1.807420  hypothetical protein
YpsIP31758_2278  3       17       -1.859647  flagella biosynthesis protein FliZ
YpsIP31758_2279  3       16       -1.190965  hypothetical protein
YpsIP31758_2280  2       17       -1.579228  flagellar biosynthesis sigma factor
YpsIP31758_2281  1       19       -3.366399  flagellin
YpsIP31758_2282  1       17       -4.743662  hypothetical protein
YpsIP31758_2283  1       17       -4.229826  flagellar capping protein
YpsIP31758_2284  2       17       -4.399210  flagellar protein FliS
YpsIP31758_2285  -1      13       -2.283744  flagellar biosynthesis protein FliT
YpsIP31758_2286  0       12       -1.346597  AraC family transcriptional regulator
YpsIP31758_2287  0       12       -0.122685  HAMP domain-containing protein
YpsIP31758_2288  -1      16       4.011281   base excision DNA repair protein
YpsIP31758_2289  0       17       4.013617   flagellar hook-basal body protein FliE
YpsIP31758_2290  0       17       4.709441   flagellar MS-ring protein
YpsIP31758_2291  1       16       4.486558   flagellar motor switch protein G
YpsIP31758_2292  1       15       4.695781   flagellar assembly protein H
YpsIP31758_2293  0       15       3.796091   flagellum-specific ATP synthase
YpsIP31758_2294  -1      17       2.696529   flagellar biosynthesis chaperone
YpsIP31758_2295  -1      17       2.938649   flagellar hook-length control protein
YpsIP31758_2296  0       22       2.310740   hypothetical protein
YpsIP31758_2297  0       23       2.644224   hypothetical protein
YpsIP31758_2298  -1      22       2.709455   flagellar basal body-associated protein FliL
YpsIP31758_2299  1       19       2.795656   flagellar motor switch protein FliM
YpsIP31758_2300  1       18       2.311639   flagellar motor switch protein FliN
YpsIP31758_2301  2       20       2.877882   flagellar protein fliO
YpsIP31758_2302  2       19       2.033874   flagellar biosynthesis protein FliP
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_2281  FLAGELLIN  165  9e-49  Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 165 bits (419), Expect = 9e-49
Identities = 164/358 (45%), Positives = 191/358 (53%), Gaps = 3/358 (0%)

Query: 3 VINTNSLSLLTQNNLNKSQSSLGTAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGLTQ 62
VINTNSLSLLTQNNLNKSQSSL +AIERLSSGLRINSAKDDAAGQAIANRFTSNIKGLTQ
Sbjct: 3 VINTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGLTQ 62

Query: 63 AARNANDGISIAQTTEGSLNEINNNLQRVRELTVQAQNGSNSSSDLDSIQDEISLRLAEI 122
A+RNANDGISIAQTTEG+LNEINNNLQRVREL+VQA NG+NS SDL SIQDEI RL EI
Sbjct: 63 ASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEEI 122

Query: 123 DRVSDQTQFNGKKVLAENTTMSIQVGANDGETIDINLQKIDSKSLGLGSYSVSGVSGALT 182
DRVS+QTQFNG KVL+++ M IQVGANDGETI I+LQKID KSLGL ++ V+G
Sbjct: 123 DRVSNQTQFNGVKVLSQDNQMKIQVGANDGETITIDLQKIDVKSLGLDGFN---VNGPKE 179

Query: 183 SLTDTSVTGVTTTTALDFSDISTFAKGATVHGIGDVGTDGAYADGYVIRTTDGKQYKGEV 242
+ + T D + V+ V A +
Sbjct: 180 ATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTTD 239

Query: 243 DATNGKVTFADDANGDPIDDATKLEAAAQFSPAGKATASPLETLDDAIKQVDGLRSSLGA 302
DA N A A + + + I G +
Sbjct: 240 DAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTIDTKTGNDGNGKV 299

Query: 303 VQNRFESAVTNLNNTVTNLTSARSRIEDADYATEVSNMSRAQILQQAGTSVLSQANQV 360
VT +T + +++ Q T S
Sbjct: 300 STTINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKTKNESAKLSD 357



Score = 101 bits (252), Expect = 1e-25
Identities = 82/241 (34%), Positives = 112/241 (46%), Gaps = 2/241 (0%)

Query: 129 TQFNGKKVLAENTTMSIQVGANDGETIDINLQKIDSKSLGLGSYSVSGVSGALTSLTDTS 188
G K + + D N + + + + +V+ ++ ++ +
Sbjct: 267 GAIKGGKEGDTFDYKGVTFTIDTKTGNDGNGKVSTTINGEKVTLTVADITAGAANVDAAT 326

Query: 189 VTGVTTTTALDFSDISTFAKG-ATVHGIGDVGTDGAYADGYVIRTTDGKQYKGEVDATNG 247
+ + TF G T +G +Y
Sbjct: 327 LQSSKNVYTSVVNGQFTFDDKTKNESAKLSDLEANNAVKGESKITVNGAEYTANAAGDKV 386

Query: 248 KVTFADDANGDPIDDATKLEAAAQFSPAGKATASPLETLDDAIKQVDGLRSSLGAVQNRF 307
+ + L + PL ++D A+ +VD +RSSLGA+QNRF
Sbjct: 387 TLAGKTMFIDKTASGVSTLINEDAAAAKKSTAN-PLASIDSALSKVDAVRSSLGAIQNRF 445

Query: 308 ESAVTNLNNTVTNLTSARSRIEDADYATEVSNMSRAQILQQAGTSVLSQANQVPQTVLSL 367
+SA+TNL NTVTNL SARSRIEDADYATEVSNMS+AQILQQAGTSVL+QANQVPQ VLSL
Sbjct: 446 DSAITNLGNTVTNLNSARSRIEDADYATEVSNMSKAQILQQAGTSVLAQANQVPQNVLSL 505

Query: 368 L 368
L
Sbjct: 506 L 506


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_2283  ACRIFLAVINRP  30  0.029  Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 29.8 bits (67), Expect = 0.029
Identities = 21/121 (17%), Positives = 39/121 (32%), Gaps = 11/121 (9%)

Query: 32 PLTTQQTSYKGKLTAYGVLQSALAKLETASTALKKADTLNSTAVSGSNSAFSATTDSAAS 91
P + +Y G A V + +E + ++ST+ S + + T S
Sbjct: 41 PAVSVSANYPG-ADAQTVQDTVTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSG-- 97

Query: 92 AGTYSIEVTNLAKAQSLLSTDVPSATDKLGSSDATRTITIAQPGQKEPMKISLTSEQTSL 151
T+ AQ + + AT L + I++ + M S+
Sbjct: 98 --------TDPDIAQVQVQNKLQLATPLLPQEVQQQGISVEKSSSSYLMVAGFVSDNPGT 149

Query: 152 T 152
T
Sbjct: 150 T 150


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_2289  FLGHOOKFLIE  80  3e-23  Flagellar hook-basal body complex protein FliE signa...
		>FLGHOOKFLIE#Flagellar hook-basal body complex protein FliE signature.

Length = 103

Score = 80.1 bits (197), Expect = 3e-23
Identities = 59/102 (57%), Positives = 73/102 (71%)

Query: 4 SVQGIEGVLQQLQVTALQASGSAKTLPAEAGFASELKAAIGKISENQQVARTSAQNFELG 63
++QGIEGV+ QLQ TA+ A FA +L AA+ +IS+ Q ART A+ F LG
Sbjct: 2 AIQGIEGVISQLQATAMSARAQESLPQPTISFAGQLHAALDRISDTQTAARTQAEKFTLG 61

Query: 64 VPGVGLNDVMVNAQKSSVSLQLGIQVRNKLVAAYQEVMNMGV 105
PGV LNDVM + QK+SVS+Q+GIQVRNKLVAAYQEVM+M V
Sbjct: 62 EPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV 103


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_2290  FLGMRINGFLIF  577  0.0  Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 577 bits (1488), Expect = 0.0
Identities = 354/552 (64%), Positives = 443/552 (80%), Gaps = 9/552 (1%)

Query: 19 LARLRANPKIPLLIAAAAAIAIIVALMLWAKSPDYRVLYSNLSDRDGGDIVTQLTQLNIP 78
L RLRANP+IPL++A +AA+AI+VA++LWAK+PDYR L+SNLSD+DGG IV QLTQ+NIP
Sbjct: 16 LNRLRANPRIPLIVAGSAAVAIVVAMVLWAKTPDYRTLFSNLSDQDGGAIVAQLTQMNIP 75

Query: 79 YRFADNGGALLIPAEKVHETRLRLAQQGLPKGGAVGFELLDQEKFGISQFSEQINYQRAL 138
YRFA+ GA+ +PA+KVHE RLRLAQQGLPKGGAVGFELLDQEKFGISQFSEQ+NYQRAL
Sbjct: 76 YRFANGSGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQFSEQVNYQRAL 135

Query: 139 EGELSRTIGTLGPVLNVRVHLAMPKPSLFVREQKSPTASVTLALQPGRALDDGQINAIVY 198
EGEL+RTI TLGPV + RVHLAMPKPSLFVREQKSP+ASVT+ L+PGRALD+GQI+A+V+
Sbjct: 136 EGELARTIETLGPVKSARVHLAMPKPSLFVREQKSPSASVTVTLEPGRALDEGQISAVVH 195

Query: 199 MVSSSVAGLPPGNVTVVDQTGRLLTQSDSAGRDLNASQLKFTSEVENRYQRRIENILAPM 258
+VSS+VAGLPPGNVT+VDQ+G LLTQS+++GRDLN +QLKF ++VE+R QRRIE IL+P+
Sbjct: 196 LVSSAVAGLPPGNVTLVDQSGHLLTQSNTSGRDLNDAQLKFANDVESRIQRRIEAILSPI 255

Query: 259 VGNGNVHAQVTAQVDFASREQTDEEYKPNQAANQGAVRSQQVSTSEQLGGTNVGGVPGAL 318
VGNGNVHAQVTAQ+DFA++EQT+E Y PN A++ +RS+Q++ SEQ+G GGVPGAL
Sbjct: 256 VGNGNVHAQVTAQLDFANKEQTEEHYSPNGDASKATLRSRQLNISEQVGAGYPGGVPGAL 315

Query: 319 SNQPPVAPIAPIEIPQPAGAAANNAAPANTAATANANTTATAAKASSSNSRHDQTTNFEV 378
SNQP API P A N +T+ +N+ A +++ ++T+N+EV
Sbjct: 316 SNQPAPPNEAPIATPPTNQQNAQNTPQTSTSTNSNS--------AGPRSTQRNETSNYEV 367

Query: 379 DRTIRHTQQQAGMVQRLSVAVVVNYTSDKAGKPIALSKDQLAQVESLTREAMGFSTVRGD 438
DRTIRHT+ G ++RLSVAVVVNY + GKP+ L+ DQ+ Q+E LTREAMGFS RGD
Sbjct: 368 DRTIRHTKMNVGDIERLSVAVVVNYKTLADGKPLPLTADQMKQIEDLTREAMGFSDKRGD 427

Query: 439 TLNVVNTPFTASDDTRGSSLPFWQQQSFFDQLLNAGRYLLILLVAWILWRKLLRPMLAKK 498
TLNVVN+PF+A D+T G LPFWQQQSF DQLL AGR+LL+L+VAWILWRK +RP L ++
Sbjct: 428 TLNVVNSPFSAVDNT-GGELPFWQQQSFIDQLLAAGRWLLVLVVAWILWRKAVRPQLTRR 486

Query: 499 QVADKAAASVNNIVQTAQAAETVKQSKEELALRKKNQQRVSAEVQAQRIRELADKDPRVV 558
KAA + Q + A V+ SK+E +++ QR+ AEV +QRIRE++D DPRVV
Sbjct: 487 VEEAKAAQEQAQVRQETEEAVEVRLSKDEQLQQRRANQRLGAEVMSQRIREMSDNDPRVV 546

Query: 559 ALVIRQWMSNDQ 570
ALVIRQWMSND
Sbjct: 547 ALVIRQWMSNDH 558


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_2291  FLGMOTORFLIG  314  e-108  Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 314 bits (806), Expect = e-108
Identities = 113/327 (34%), Positives = 192/327 (58%), Gaps = 2/327 (0%)

Query: 2 SLTGTEKSAIMLMTLGEDHAAEVFKHLSSREVQQLSTTMASMRQVSHQQLVDVLAEFEDD 61
+LTG +K+AI+L+++G + +++VFK+LS E++ L+ +A + ++ + +VL EF++
Sbjct: 14 ALTGKQKAAILLVSIGSEISSKVFKYLSQEEIESLTFEIAKLETITSELKDNVLLEFKEL 73

Query: 62 AEQYAALSVNASDYLRSVLIKALGEERASSLLEDILESRETTSGMETLNFMEPQMAADLI 121
+ DY R +L K+LG ++A ++ + L S + E + +P + I
Sbjct: 74 MMAQEFIQKGGIDYARELLEKSLGTQKAVDIINN-LGSALQSRPFEFVRRADPANILNFI 132

Query: 122 RDEHPQIIATILVHLKRAQAADILALFDERLRNDVMLRIATFGGVQPAALAELTEVLNNL 181
+ EHPQ IA IL +L +A+ IL+ ++ +V RIA P + E+ VL
Sbjct: 133 QQEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEVVREVERVLEKK 192

Query: 182 LDGQ-NLKRSKMGGIRTAAEIINLMKTQQEETVMDAVREYDGELAQKIIDEMFLFENLVS 240
L + + GG+ EIIN+ + E+ +++++ E D ELA++I +MF+FE++V
Sbjct: 193 LASLSSEDYTSAGGVDNVVEIINMADRKTEKFIIESLEEEDPELAEEIKKKMFVFEDIVL 252

Query: 241 VDDRSIQRLLQEIDNESLLIALKGADQALRERFLSNMSLRAAEILRDDLATRGPVRMSLV 300
+DDRSIQR+L+EID + L ALK D ++E+ NMS RAA +L++D+ GP R V
Sbjct: 253 LDDRSIQRVLREIDGQELAKALKSVDIPVQEKIFKNMSKRAASMLKEDMEFLGPTRRKDV 312

Query: 301 ENEQKSILLIVRRLAESGEIVIGGGED 327
E Q+ I+ ++R+L E GEIVI G +
Sbjct: 313 EESQQKIVSLIRKLEEQGEIVISRGGE 339


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_2292  FLGFLIH  221  5e-75  Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 221 bits (563), Expect = 5e-75
Identities = 128/233 (54%), Positives = 167/233 (71%), Gaps = 7/233 (3%)

Query: 6 NALPWQPWSLKDFASQSEAPLSESMPDISLLFPNEPMEATAAVDEQQVLVNLQLEAEKQG 65
+ LPW+ W+ D A P +E +P + P E + A +Q L LQ++A +QG
Sbjct: 3 DNLPWKTWTPDDLAP----PQAEFVPIVE---PEETIIEEAEPSLEQQLAQLQMQAHEQG 55

Query: 66 RQQGFAKGLQEGLDKGYQTGLEEGHQQALADAQQQLAPMTAHWQVMVTDFQNTLDTLDSV 125
Q G A+G Q+G +GYQ GL +G +Q LA+A+ Q AP+ A Q +V++FQ TLD LDSV
Sbjct: 56 YQAGIAEGRQQGHKQGYQEGLAQGLEQGLAEAKSQQAPIHARMQQLVSEFQTTLDALDSV 115

Query: 126 IASRLVQIALAAAKQIIGQPAICDGTALLAQIQQMIQQEPMFAGKTQLRVNPDDLAIVEQ 185
IASRL+Q+AL AA+Q+IGQ D +AL+ QIQQ++QQEP+F+GK QLRV+PDDL V+
Sbjct: 116 IASRLMQMALEAARQVIGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDD 175

Query: 186 RLGSTLSLHGWRLLGDSQIHAGGCKVSAEEGDLDASLATRWHELCRLAAPGEL 238
LG+TLSLHGWRL GD +H GGCKVSA+EGDLDAS+ATRW ELCRLAAPG +
Sbjct: 176 MLGATLSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATRWQELCRLAAPGVV 228


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_2294  FLGFLIJ  112  9e-35  Flagellar FliJ protein signature.
		>FLGFLIJ#Flagellar FliJ protein signature.

Length = 147

Score = 112 bits (281), Expect = 9e-35
Identities = 82/144 (56%), Positives = 102/144 (70%)

Query: 1 MKSQSPLVTLCDLAQKAVEQASTQLGHVRQSYQNAEQQLTMLLTYQDEYRERLNDTLCNG 60
M L TL DLA+K VE A+ LG +R+ Q AE+QL ML+ YQ+EYR LN + G
Sbjct: 1 MAEHGALATLKDLAEKEVEDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRNNLNSDMSAG 60

Query: 61 MASSSWQNYQQFIQTLEQAIDQHRKQLAQWSIKVEQAVKYWQEKQQRLNAFETLQERAET 120
+ S+ W NYQQFIQTLE+AI QHR+QL QW+ KV+ A+ W+EK+QRL A++TLQER T
Sbjct: 61 ITSNRWINYQQFIQTLEKAITQHRQQLNQWTQKVDIALNSWREKKQRLQAWQTLQERQST 120

Query: 121 TQRQQENRLDQKLMDEFAQRASQR 144
ENRLDQK MDEFAQRA+ R
Sbjct: 121 AALLAENRLDQKKMDEFAQRAAMR 144


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_2295  FLGHOOKFLIK  138  8e-39  Flagellar hook-length control protein signature.
		>FLGHOOKFLIK#Flagellar hook-length control protein signature.

Length = 375

Score = 138 bits (347), Expect = 8e-39
Identities = 102/226 (45%), Positives = 127/226 (56%), Gaps = 9/226 (3%)

Query: 229 TPAPDKLTHLAAQDGESVLNTKTPPLVAQSEVSLSFASSDKTQLNSTP--VTAALSSPMN 286
T P T L ++ + P AQ L + K ++ STP VTAA S +
Sbjct: 157 TEKPTLFTKLTSEQLTTAQPDDAPGTPAQPLTPLVAEAQSKAEVISTPSPVTAAASPLIT 216

Query: 287 TAAASSLASAPANGYLSAPLGSQEWQQSLGQQVIMFSRNGQQSAELRLHPQELGALQISL 346
L + A LSAPLGS EWQQSL Q + +F+R GQQSAELRLHPQ+LG +QISL
Sbjct: 217 PHQTQPLPTVAAP-VLSAPLGSHEWQQSLSQHISLFTRQGQQSAELRLHPQDLGEVQISL 275

Query: 347 KMEDNQAQLHFASAHSQVRAALEAAMPSLRHALAESGVQLGQSSVGSEGQWQQAQQQSQQ 406
K++DNQAQ+ S H VRAALEAA+P LR LAESG+QLGQS++ E Q Q SQQ
Sbjct: 276 KVDDNQAQIQMVSPHQHVRAALEAALPVLRTQLAESGIQLGQSNISGESFSGQQQAASQQ 335

Query: 407 NQQDVVARGQPTYGDVVAGPLTETPLAAPTALQSLANGQGGVDVFA 452
Q A +P G+ + L P +LQ G GVD+FA
Sbjct: 336 QQSQRTANHEPLAGE------DDDTLPVPVSLQGRVTGNSGVDIFA 375


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_2298  PF04335  27  0.031  VirB8 type IV secretion protein
		>PF04335#VirB8 type IV secretion protein

Length = 227

Score = 27.1 bits (60), Expect = 0.031
Identities = 26/156 (16%), Positives = 44/156 (28%), Gaps = 27/156 (17%)

Query: 8 AKRKSSIWLILLVLVAIAASAGGGYSWWLLHKSKPTNTQIVAAIPVFMPLETFTVNLITP 67
A+R + ++ + A+AG V A+ PL+T +IT
Sbjct: 28 AERSKKLAWVVAGVAGALATAG------------------VVAVAALTPLKTVEPYVITV 69

Query: 68 DNNLDRVLYIGLTLRLPDDTTRTKLNDYLPE--VRSR-----LLLLLSRQSADSLSNEEG 120
D N T + Y VR R + +S
Sbjct: 70 DRNTGEASIAAKLHGDATITYDEAVRKYFLATYVRYREGWIAAAREEYFDAVMVMSARPE 129

Query: 121 KQRLVN--DIKNILSPPMVKGQPNQVISDVLFTAFI 154
+ R N SP + V ++ +F+
Sbjct: 130 QDRWSRFYKTDNPQSPQNILANRTDVFVEIKRVSFL 165


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_2299  FLGMOTORFLIM  334  e-116  Flagellar motor switch protein FliM signature.
		>FLGMOTORFLIM#Flagellar motor switch protein FliM signature.

Length = 344

Score = 334 bits (859), Expect = e-116
Identities = 78/288 (27%), Positives = 138/288 (47%), Gaps = 8/288 (2%)

Query: 5 ILSQAEIDALLNGDS---GSEEPVVITANETDVKPYDPTTQRRVVRERLHALEIINERFA 61
+LSQ EID LL S S E ++ + YD + +E++ L +++E FA
Sbjct: 4 VLSQDEIDQLLTAISSGDASIEDARPISDTRKITLYDFRRPDKFSKEQMRTLSLMHETFA 63

Query: 62 RQFRMGLFNLLRRSPDITVGPIKIQPYHDFARNLPVPTNLNLVHLKPLRGTALFVFAPSL 121
R L LR + V + Y +F R++P P+ L ++ + PL+G A+ PS+
Sbjct: 64 RLTTTSLSAQLRSMVHVHVASVDQLTYEEFIRSIPTPSTLAVITMDPLKGNAVLEVDPSI 123

Query: 122 VFIAVDNLFGGDGRFPTKVEGREFTHTEQRVITRMLRLALDAYRDAWAAIYKIDVEYVRS 181
F +D LFGG G+ KV+ R+ T E V+ ++ L R++W + + +
Sbjct: 124 TFSIIDRLFGGTGQ-AAKVQ-RDLTDIENSVMEGVIVRILANVRESWTQVIDLRPRLGQI 181

Query: 182 EIQVKFTNITTSPNDIVVSTPFQVEIGTLSGEFNICIPFAMIEPLRELLTNPPLENS--R 239
E +F I P+++VV + ++G G N CIP+ IEP+ L++ +S R
Sbjct: 182 ETNPQFAQI-VPPSEMVVLVTLETKVGEEEGMMNFCIPYITIEPIISKLSSQFWFSSVRR 240

Query: 240 QEDNYWRETLVKQVQHSELELVANFVDIPLRLSQILKLQPGDVLPIEK 287
+ L ++ ++++VA + L + IL L+ GD++ +
Sbjct: 241 SSTTQYMGVLRDKLSTVDMDVVAEVGSLRLSVRDILGLRVGDIIRLHD 288


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_2300  FLGMOTORFLIN  161  1e-54  Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 161 bits (410), Expect = 1e-54
Identities = 103/138 (74%), Positives = 117/138 (84%), Gaps = 1/138 (0%)

Query: 1 MSDPKFPSADGKESVDDLWADAFNEQQATEKPTATTEGVFKSLEAPEGLGNLQDIDLILD 60
MSD PS + ++DDLWADA NEQ+AT +A + VF+ L + G +QDIDLI+D
Sbjct: 1 MSDMNNPSDENTGALDDLWADALNEQKATTTKSAA-DAVFQQLGGGDVSGAMQDIDLIMD 59

Query: 61 IPVKLSVELGRTKMTIKELLRLSQGSVVSLDGLAGEPLDILINGYLIAQGEVVVVADKYG 120
IPVKL+VELGRT+MTIKELLRL+QGSVV+LDGLAGEPLDILINGYLIAQGEVVVVADKYG
Sbjct: 60 IPVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYG 119

Query: 121 VRITDIITPSERMRRLSR 138
VRITDIITPSERMRRLSR
Sbjct: 120 VRITDIITPSERMRRLSR 137


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_2302  FLGBIOSNFLIP  306  e-108  Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP signature.

Length = 245

Score = 306 bits (786), Expect = e-108
Identities = 196/240 (81%), Positives = 215/240 (89%), Gaps = 1/240 (0%)

Query: 35 TTLGLLTLFCSPSVLAQLPGIISQPLANGGQSWSLPVQTLVFITTLSFLPAALLMMTSFT 94
LL L P AQLPGI SQPL GGQSWSLPVQTLVFIT+L+F+PA LLMMTSFT
Sbjct: 7 VAPVLLWLIT-PLAFAQLPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILLMMTSFT 65

Query: 95 RIIIVLGLLRNAMGTPSAPPNQVMLGLALFLTFFIMSPVFDKVYQEAYLPFSQDKISMDV 154
RIIIV GLLRNA+GTPSAPPNQV+LGLALFLTFFIMSPV DK+Y +AY PFS++KISM
Sbjct: 66 RIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEKISMQE 125

Query: 155 ALDKGSQPLREFMLRQTRESDLALYARLANLPPLEGPEMVPMRILLPAYVTSELKTAFQI 214
AL+KG+QPLREFMLRQTRE+DL L+ARLAN PL+GPE VPMRILLPAYVTSELKTAFQI
Sbjct: 126 ALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELKTAFQI 185

Query: 215 GFTVFIPFLIIDLVVASVLMALGMMMVPPASISLPFKLMLFVLVDGWQLLLGSLAQSFYS 274
GFT+FIPFLIIDLV+ASVLMALGMMMVPPA+I+LPFKLMLFVLVDGWQLL+GSLAQSFYS
Sbjct: 186 GFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLAQSFYS 245


33  YpsIP31758_2314  YpsIP31758_2323  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
YpsIP31758_2314  1  20  3.305468  flagellar hook-associated protein FlgK
YpsIP31758_2315  3  21  4.848353  hypothetical protein
YpsIP31758_2316  2  20  4.438193  flagellar rod assembly protein/muramidase FlgJ
YpsIP31758_2317  2  21  4.183153  flagellar basal body P-ring protein
YpsIP31758_2318  2  21  3.914305  flagellar basal body L-ring protein
YpsIP31758_2319  2  20  3.674619  flagellar basal body rod protein FlgG
YpsIP31758_2320  1  18  3.770541  flagellar basal body rod protein FlgF
YpsIP31758_2321  2  16  2.310279  flagellar hook protein FlgE
YpsIP31758_2322  3  16  1.969839  flagellar basal body rod modification protein
YpsIP31758_2323  2  16  1.184077  flagellar basal body rod protein FlgC
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_2314  FLGHOOKAP1  436  e-150  Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 436 bits (1123), Expect = e-150
Identities = 315/552 (57%), Positives = 398/552 (72%), Gaps = 9/552 (1%)

Query: 3 NSLMNTAMSGLNAAQYALSTVSNNITNFQVAGYNRQNTVFAQNGGTITSAGFIGNGVTVT 62
+SL+N AMSGLNAAQ AL+T SNNI+++ VAGY RQ T+ AQ T+ + G++GNGV V+
Sbjct: 1 SSLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVS 60

Query: 63 GVNREYNAFITNQLRASQTQSSGLATYYQQISQIDNLLSNASNNLSTTMQDFFSNLQNLV 122
GV REY+AFITNQLRA+QTQSSGL Y+Q+S+IDN+LS ++++L+T MQDFF++LQ LV
Sbjct: 61 GVQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTLV 120

Query: 123 SNADDDAARKTVLGKAEGLVNQFQNADKYLRDMDDGVNQKITDSATQINNYAEQIAKLND 182
SNA+D AAR+ ++GK+EGLVNQF+ D+YLRD D VN I S QINNYA+QIA LND
Sbjct: 121 SNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLND 180

Query: 183 QITRLRG-SSGSEPNALLDQRDQLVTELNQIMAVTVTQQDGDAYNVSFAGGLSLVQGPNA 241
QI+RL G +G+ PN LLDQRDQLV+ELNQI+ V V+ QDG YN++ A G SLVQG A
Sbjct: 181 QISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGSTA 240

Query: 242 YKVEAIPSSADATRLTLGYKRGNGEATEVDESRITTGSLGGTLKFRSEALDSARNQLGQL 301
++ A+PSSAD +R T+ Y G E+ E + TGSLGG L FRS+ LD RN LGQL
Sbjct: 241 RQLAAVPSSADPSRTTVAYVDGTAGNIEIPEKLLNTGSLGGILTFRSQDLDQTRNTLGQL 300

Query: 302 ALVMADSFNTQHNAGFDINGDEGEDFFSFADPTVLKNAKNQGNASITVEYKDTSKVKASD 361
AL A++FNTQH AGFD NGD GEDFF+ P VL+N KN+G+ +I D S V A+D
Sbjct: 301 ALAFAEAFNTQHKAGFDANGDAGEDFFAIGKPAVLQNTKNKGDVAIGATVTDASAVLATD 360

Query: 362 YTVEFDGTDWQVTRLSDNTKVQTTPGVNADGDPTLEFEGVAIKIDNGTPGPQAKDKFTIK 421
Y + FD WQVTRL+ NT TP D + + F+G+ + P D FT+K
Sbjct: 361 YKISFDNNQWQVTRLASNTTFTVTP----DANGKVAFDGLELTFTG---TPAVNDSFTLK 413

Query: 422 TVSNVAANLQVAITDSSKIAAAGSADGGISDNTNAQALLDLQSKKLVEGK-TTLSGAYAG 480
VS+ N+ V ITD +KIA A D G SDN N QALLDLQS G + + AYA
Sbjct: 414 PVSDAIVNMDVLITDEAKIAMASEEDAGDSDNRNGQALLDLQSNSKTVGGAKSFNDAYAS 473

Query: 481 LVSNVGNQTATAKTNSTAQANIVTQLTTEQQSISGVNLDEEYGDLQRFQQYYLANAQVLQ 540
LVS++GN+TAT KT+S Q N+VTQL+ +QQSISGVNLDEEYG+LQRFQQYYLANAQVLQ
Sbjct: 474 LVSDIGNKTATLKTSSATQGNVVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQ 533

Query: 541 AASTLFNALLSI 552
A+ +F+AL++I
Sbjct: 534 TANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_2316  FLGFLGJ  314  e-109  Flagellar protein FlgJ signature.
		>FLGFLGJ#Flagellar protein FlgJ signature.

Length = 313

Score = 314 bits (805), Expect = e-109
Identities = 181/316 (57%), Positives = 233/316 (73%), Gaps = 6/316 (1%)

Query: 1 MSDLLAMSGAAYDAQSLEALKRDAARDPEGNLKQVAQQVEGMFVQMMLKSMRAALPQDGV 60
+SD ++ AA+DAQSL LK A DP N++ VA+QVEGMFVQMMLKSMR ALP+DG+
Sbjct: 2 ISDSKLLASAAWDAQSLNELKAKAGEDPAANIRPVARQVEGMFVQMMLKSMRDALPKDGL 61

Query: 61 MNSEQTKLYTSLYDQQIAQQMSA-KGLGLADMMVEQLS-GSTSASETAGTVPMMLDNEVL 118
+SE T+LYTS+YDQQIAQQM+A KGLGLA+MMV+Q++ E+ PM E +
Sbjct: 62 FSSEHTRLYTSMYDQQIAQQMTAGKGLGLAEMMVKQMTPEQPLPEESTPAAPMKFPLETV 121

Query: 119 QSMPAQALAQVMRRAIPTPPSSSMAAISPGNGNFVARMSIPAQIASQQSGIPHQLIMAQA 178
QAL+Q++++A+P S+ S F+A++S+PAQ+ASQQSG+PH LI+AQA
Sbjct: 122 VRYQNQALSQLVQKAVPRNYDDSLPGDSK---AFLAQLSLPAQLASQQSGVPHHLILAQA 178

Query: 179 ALESGWGQREIPTADGKSSYNVFGIKAGSSWNGPVSEITTTEYEQGVAKKTKARFRVYGS 238
ALESGWGQR+I +G+ SYN+FG+KA +W GPV+EITTTEYE G AKK KA+FRVY S
Sbjct: 179 ALESGWGQRQIRRENGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKKVKAKFRVYSS 238

Query: 239 YVEAVSDYVKLLTQNPRYAHVAAAQSPEQGAHALQKAGYATDPQYAQKLVSVIQQMRSTG 298
Y+EA+SDYV LLT+NPRYA V A S EQGA ALQ AGYATDP YA+KL ++IQQM+S
Sbjct: 239 YLEALSDYVGLLTRNPRYAAVTTAASAEQGAQALQDAGYATDPHYARKLTNMIQQMKSIS 298

Query: 299 EQAVKAYGGSDLSQLF 314
++ K Y ++ LF
Sbjct: 299 DKVSKTY-SMNIDNLF 313


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_2317  FLGPRINGFLGI  391  e-138  Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 391 bits (1007), Expect = e-138
Identities = 155/366 (42%), Positives = 217/366 (59%), Gaps = 9/366 (2%)

Query: 5 SLVTLLMVLLSLVWLPASAERIRDLVTVQGVRDNALIGYGLVVGLDGSGDQTMQTPFTTQ 64
+LV + LS A RI+D+ ++Q RDN LIGYGLVVGL G+GD +PFT Q
Sbjct: 10 ALVFSALPFLSTPPAQADTSRIKDIASLQAGRDNQLIGYGLVVGLQGTGDSLRSSPFTEQ 69

Query: 65 SLSNMLSQLGITVPPGTNMQLKNVAAVMVTAKLPAFSRAGQTIDVVVSSMGNAKSIRGGT 124
S+ ML LGIT G + KN+AAVMVTA LP F+ G +DV VSS+G+A S+RGG
Sbjct: 70 SMRAMLQNLGITTQGGQS-NAKNIAAVMVTANLPPFASPGSRVDVTVSSLGDATSLRGGN 128

Query: 125 LLMTPLKGVDNQVYALAQGNVLVGGAGAAAGGSSVQVNQLAGGRISNGATIERELPTTFG 184
L+MT L G D Q+YA+AQG ++V G A +++ R+ NGA IERELP+ F
Sbjct: 129 LIMTSLSGADGQIYAVAQGALIVNGFSAQGDAATLTQGVTTSARVPNGAIIERELPSKFK 188

Query: 185 TDGIINLQLNSEDFTLAQQVSDAINR----QRGFGSATAIDARTIQVLVPRGGSSQVRFL 240
+ LQL + DF+ A +V+D +N + G A D++ I V PR + R +
Sbjct: 189 DSVNLVLQLRNPDFSTAVRVADVVNAFARARYGDPIAEPRDSQEIAVQKPR-VADLTRLM 247

Query: 241 ADIQNIPINVDPGDAKVIINSRTGSVVMNRNVVLDSCAVAQGNLSVVVDKQNIVSQPDTP 300
A+I+N+ + D AKV+IN RTG++V+ +V + AV+ G L+V V + V QP P
Sbjct: 248 AEIENLTVETD-TPAKVVINERTGTIVIGADVRISRVAVSYGTLTVQVTESPQVIQP-AP 305

Query: 301 FGGGQTVVTPNTQISVQQQGGVLQRVNASPNLNNVVRALNSLGATPIDLMSILQAMESAG 360
F GQT V P T I Q+G + V P+L +V LNS+G +++ILQ ++SAG
Sbjct: 306 FSRGQTAVQPQTDIMAMQEGSKVAIVE-GPDLRTLVAGLNSIGLKADGIIAILQGIKSAG 364

Query: 361 CLRAKL 366
L+A+L
Sbjct: 365 ALQAEL 370


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_2318  FLGLRINGFLGH  283  4e-99  Flagellar L-ring protein signature.
		>FLGLRINGFLGH#Flagellar L-ring protein signature.

Length = 232

Score = 283 bits (724), Expect = 4e-99
Identities = 176/222 (79%), Positives = 193/222 (86%), Gaps = 2/222 (0%)

Query: 23 PLMTMLL--LNGCAYIPHKPLVDGTTSAQPAPASAPLPNGSIFQTVQPMNYGYQPLFEDR 80
+ ++L+ L GCA+IP PLV G TSAQP P P+ NGSIFQ+ QP+NYGYQPLFEDR
Sbjct: 10 AISSLLVLSLTGCAWIPSTPLVQGATSAQPVPGPTPVANGSIFQSAQPINYGYQPLFEDR 69

Query: 81 RPRNIGDTLTITLQENVSASKSSSANASRNGTSSFGVTTAPRYLDGLLGNGRADMEITGD 140
RPRNIGDTLTI LQENVSASKSSSANASR+G ++FG T PRYL GL GN RAD+E +G
Sbjct: 70 RPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYLQGLFGNARADVEASGG 129

Query: 141 NTFGGKGGANANNTFSGTITVTVDQVLANGNLHVVGEKQIAINQGTEFIRFSGVVNPRTI 200
NTF GKGGANA+NTFSGT+TVTVDQVL NGNLHVVGEKQIAINQGTEFIRFSGVVNPRTI
Sbjct: 130 NTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRFSGVVNPRTI 189

Query: 201 SGSNSVTSTQVADARIEYVGNGYINEAQTMGWLQRFFLNVSP 242
SGSN+V STQVADARIEYVGNGYINEAQ MGWLQRFFLN+SP
Sbjct: 190 SGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSP 231


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_2319  FLGHOOKAP1  42  2e-06  Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 41.9 bits (98), Expect = 2e-06
Identities = 17/80 (21%), Positives = 35/80 (43%), Gaps = 14/80 (17%)

Query: 4 SLWIAKTGLDAQQTNMDVIANNLANVSTNGFKRQRAVFEDLLYQTLRQPGAQSSEQTTLP 63
+ A +GL+A Q ++ +NN+++ + G+ RQ + + +TL
Sbjct: 3 LINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTI--------------MAQANSTLG 48

Query: 64 SGLQIGTGVRPVATERLHSQ 83
+G +G GV +R +
Sbjct: 49 AGGWVGNGVYVSGVQREYDA 68



Score = 41.1 bits (96), Expect = 3e-06
Identities = 11/41 (26%), Positives = 22/41 (53%)

Query: 220 ETSNVNVAEELVNMIQTQRAYEINSKAVSTSDQMLQKLAQL 260
S VN+ EE N+ + Q+ Y N++ + T++ + L +
Sbjct: 505 SISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_2321  FLGHOOKAP1  45  3e-07  Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 45.3 bits (107), Expect = 3e-07
Identities = 22/87 (25%), Positives = 42/87 (48%), Gaps = 8/87 (9%)

Query: 6 AVSGMNAASSNLDVIGNNIANSATSGFKAGSVSFAD----MFAGSQTGMGVKVAGITQDF 61
A+SG+NAA + L+ NNI++ +G+ + A + AG G GV V+G+ +++
Sbjct: 7 AMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVSGVQREY 66

Query: 62 NDGTATTTNRRLDLAISQNGFFRMQDS 88
+ +L A +Q+ +
Sbjct: 67 DA----FITNQLRAAQTQSSGLTARYE 89



Score = 40.7 bits (95), Expect = 9e-06
Identities = 15/49 (30%), Positives = 28/49 (57%)

Query: 380 TLTSGALESSNVDLSKELVNMIVAQRNYQSNAQTIKTQDQILQTLVSLR 428
L++ S V+L +E N+ Q+ Y +NAQ ++T + I L+++R
Sbjct: 498 QLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINIR 546


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_2322  SYCECHAPRONE  29  0.008  Gram-negative bacterial type III secretion SycE cha...
		>SYCECHAPRONE#Gram-negative bacterial type III secretion SycE chaperone signature.

Length = 130

Score = 28.9 bits (64), Expect = 0.008
Identities = 15/34 (44%), Positives = 21/34 (61%), Gaps = 2/34 (5%)

Query: 43 LKNQDPTNPMENNELTTQLAQINTVSGIEKLNTT 76
L N+ P N ++NN L TQL + V G E+L T+
Sbjct: 89 LWNRQPLNSLDNNSLYTQLEML--VQGAERLQTS 120


34  YpsIP31758_2335  YpsIP31758_2341  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias  Product
YpsIP31758_2335  2  21  -2.957108  hypothetical protein
YpsIP31758_2336  4  22  -4.286478  hypothetical protein
YpsIP31758_2337  5  21  -3.931640  hypothetical protein
YpsIP31758_2338  3  22  -3.984466  copper resistance protein D
YpsIP31758_2339  1  21  -3.923577  copper resistance protein CopC
YpsIP31758_2340  2  21  -4.383180  hypothetical protein
YpsIP31758_2341  -1  13  -3.087541  ferritin family protein
35  YpsIP31758_2368  YpsIP31758_2384  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias  Product
YpsIP31758_2368  2  14  -0.753203  PTS system mannose/fructose/sorbose family
YpsIP31758_2369  2  10  -0.454274  PTS system mannose/fructose/sorbose family
YpsIP31758_2370  1  8  -1.080564  PTS system mannose-specific transporter subunit
YpsIP31758_2371  0  9  -1.561656  hypothetical protein
YpsIP31758_2372  0  8  -0.499336  hypothetical protein
YpsIP31758_2373  0  8  -0.624923  ferrichrome receptor FcuA
YpsIP31758_2374  -1  14  -1.548399  sensory box-containing diguanylate cyclase
YpsIP31758_2375  -1  18  -3.266917  hypothetical protein
YpsIP31758_2376  1  17  -2.099498  23S rRNA methyltransferase A
YpsIP31758_2377  2  17  -2.599222  hypothetical protein
YpsIP31758_5001  1  17  -3.440181  cold shock protein
YpsIP31758_2378  2  19  -3.398547  hypothetical protein
YpsIP31758_2379  2  19  -2.889240  hypothetical protein
YpsIP31758_2380  2  16  -2.476764  palmitoyl transferase
YpsIP31758_2381  -1  15  -3.214848  aromatic amino acid transport protein AroP
YpsIP31758_2382  2  24  -4.945105  hypothetical protein
YpsIP31758_2383  1  26  -5.599367  hypothetical protein
YpsIP31758_2384  0  16  -3.545455  hypothetical protein
36  YpsIP31758_2396  YpsIP31758_2415  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
YpsIP31758_2396  2  15  -2.731459  hypothetical protein
YpsIP31758_2397  1  16  -2.226063  hypothetical protein
YpsIP31758_2398  0  16  -2.324284  4'-phosphopantetheinyl transferase
YpsIP31758_2399  1  16  -4.000352  TRAP transporter solute receptor DctP family
YpsIP31758_2400  1  13  -2.533279  TRAP dicarboxylate transporter subunit DctQ
YpsIP31758_2401  -1  12  -1.043607  TRAP transporter subunit DctM
YpsIP31758_2402  0  11  -0.588850  LuxR family transcriptional regulator
YpsIP31758_2403  -1  11  -0.153633*  hypothetical protein
YpsIP31758_2404  -1  12  1.496003  hypothetical protein
YpsIP31758_2405  -1  14  2.937102  polysaccharide deacetylase family protein
YpsIP31758_2406  -1  14  2.952974  major facilitator transporter
YpsIP31758_2407  0  14  3.077700  hypothetical protein
YpsIP31758_2408  2  15  3.265328  argininosuccinate synthase
YpsIP31758_2409  0  13  2.913136  L-lactate dehydrogenase
YpsIP31758_2410  0  14  1.863762  mandelate racemase/muconate lactonizing enzyme
YpsIP31758_2411  -1  13  -0.541545  fumarylacetoacetate hydrolase family protein
YpsIP31758_2412  -1  13  -1.438828  short chain dehydrogenase/reductase family
YpsIP31758_2413  -1  14  -2.413910  hypothetical protein
YpsIP31758_2414  -1  15  -3.472202  major facilitator transporter
YpsIP31758_2415  0  15  -3.255904  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_2398  ENTSNTHTASED  99  2e-27  Enterobactin synthetase component D signature.
		>ENTSNTHTASED#Enterobactin synthetase component D signature.

Length = 234

Score = 99.3 bits (247), Expect = 2e-27
Identities = 70/196 (35%), Positives = 96/196 (48%), Gaps = 22/196 (11%)

Query: 24 PFHG-LLAKCDFEVNEYR--DELFAAYGIPFPGSLNKAVIKRRAEYLAGRFVARQVLNLL 80
PF G L DF+ + +R D L+ +P L A KR+AE+LAGR A L +
Sbjct: 9 PFAGHRLHIVDFDASSFREHDLLW----LPHHDRLRSAGRKRKAEHLAGRIAAVHALREV 64

Query: 81 DIRDYPLATGMDRAPQWPTNLIGSISHNNQRALCAAQMIEPRGVESSTLHGIGLDIESHI 140
+R P G R P WP L GSISH CA + + IG+DIE +
Sbjct: 65 GVRTVP-GMGDKRQPLWPDGLFGSISH------CAT-----TALAVISRQRIGIDIEKIM 112

Query: 141 AEEKAQEIWSGIISDEEYSLLQQGPLPFNQALTLVFSAKESLFKAVYPQSGRYFDFIEAR 200
++ A E+ II +E +LQ LPF ALTL FSAKES++KA + F A+
Sbjct: 113 SQHTATELAPSIIDSDERQILQASLLPFPLALTLAFSAKESVYKA-FSDRVTLPGFNSAK 171

Query: 201 LLSYSLVSGNFELQLL 216
+ S + + + L LL
Sbjct: 172 VTSLT--ATHISLHLL 185


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_2402  HTHFIS  70  2e-16  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 70.2 bits (172), Expect = 2e-16
Identities = 42/180 (23%), Positives = 71/180 (39%), Gaps = 28/180 (15%)

Query: 4 VALIDDHIVVRSGFAQLLSLE-EDIRIIGEYGCAAEAWENLPKTDVHVAVIDISMPDESG 62
+ + DD +R+ Q LS D+RI AA W + D + V D+ MPDE+
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSN---AATLWRWIAAGDGDLVVTDVVMPDENA 62

Query: 63 LSLLKRLRQQIPHFRAIILSIYDTTAFVQSAMDAGASGYLTKRCDPAALVQALRTVNGGG 122
LL R+++ P +++S +T A + GA YL K D L+ +
Sbjct: 63 FDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR----- 117

Query: 123 LYLCTDALRALRQMPQQPSKLTILTPREKEIFQLLVKGIS-----VKALAEQLSLSHKTV 177
AL ++ P + + + LV G S + + +L + T+
Sbjct: 118 ------ALAEPKRRPSK-------LEDDSQDGMPLV-GRSAAMQEIYRVLARLMQTDLTL 163


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_2403  TRNSINTIMINR  29  0.010  Translocated intimin receptor (Tir) signature.
		>TRNSINTIMINR#Translocated intimin receptor (Tir) signature.

Length = 549

Score = 28.9 bits (64), Expect = 0.010
Identities = 14/44 (31%), Positives = 27/44 (61%), Gaps = 1/44 (2%)

Query: 24 VQRTAKKSRAQTREAR-EAVEENKKAQLERDKQLSEQQKQAALS 66
V++ A++++ AR +AVE N +AQ + Q + +Q++ LS
Sbjct: 318 VEQIAQQAKEAGEVARQQAVESNAQAQQRYEDQHARRQEELQLS 361


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_2406  TCRTETA  36  3e-04  Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 36.0 bits (83), Expect = 3e-04
Identities = 72/362 (19%), Positives = 126/362 (34%), Gaps = 53/362 (14%)

Query: 74 VVALTVGPIVDRLGRRLGIVFTVTGAAISSLLTAIGGAWGKGVLVGIRSIAGLGFAEQTV 133
A +G + DR GRR ++ ++ GAA+ + A VL R +AG+ A V
Sbjct: 58 ACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFL--WVLYIGRIVAGITGATGAV 115

Query: 134 NATYLTEMYAAINDPILNRHKGFIYSLVQGGWPVGALVASGLSALLMPIIGWQGSFIFAA 193
Y+ ++ R + F G+ + A G+ A P++G
Sbjct: 116 AGAYIADI-----TDGDERARHF-------GF-MSACFGFGMVA--GPVLGGLMGGFSPH 160

Query: 194 VPSFIIALLALKLKETPQFQIHQHIRRLTEAGQPEKARQVAKDYHFEYADHQRTGLSAIF 253
P F A L T F + + + +P + + +
Sbjct: 161 APFFAAAALNGLNFLTGCFLLPESHKG---ERRPLRREAL----------NPLASFRWAR 207

Query: 254 RGTSLRTTLVLGGALMINWFAIQIFGVLGTT--VITKVHDVSFESSLWVLVLSNLVGYCG 311
T + ALM +F +Q+ G + VI ++++ + L+ G
Sbjct: 208 GMTVV-------AALMAVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGISLA-AFGILH 259

Query: 312 YLAHGWLGD----RFGRRNIIALGW---MLGGLSLATMLYGPADFSLIVILYSAGLFFLI 364
LA + R G R + LG G + LA G F ++V+L S G+
Sbjct: 260 SLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLASGGIGM-- 317

Query: 365 GPYSAALFFFSESFPTAIRATAGAFIIAMGPIGAIIASMGATSILS-SGGNWQTAALLFG 423
A S + + A+ + +I+ + T+I + S W A + G
Sbjct: 318 ---PALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYAASITTWNGWAWIAG 374

Query: 424 AI 425
A
Sbjct: 375 AA 376


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_2412  DHBDHDRGNASE  124  6e-37  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 124 bits (313), Expect = 6e-37
Identities = 80/255 (31%), Positives = 126/255 (49%), Gaps = 18/255 (7%)

Query: 3 LKGKKALVTAAGQGIGFNTATLFARKGAEVIASDINIAALSNIPGIT--------AVSLD 54
++GK A +T A QGIG A A +GA + A D N L + A D
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPAD 65

Query: 55 VTDTLAINDVAQAI----GPIDVLFNCAGVVHSGDILTCSEQEWQFALDLNVTAMFHMIR 110
V D+ AI+++ I GPID+L N AGV+ G I + S++EW+ +N T +F+ R
Sbjct: 66 VRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASR 125

Query: 111 AFLPGMIACQQGSIINMSSVASSIKGVP--NRFAYSTSKAAVIGLTRSVAADYVTQGIRC 168
+ M+ + GSI+ +V S+ GVP + AY++SKAA + T+ + + IRC
Sbjct: 126 SVSKYMMDRRSGSIV---TVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRC 182

Query: 169 NAICPGTVESPSLRQRIAVQAHAEGRSEADVFQAFAARQPIGRIGKAEEIAQLALYLASD 228
N + PG+ E+ A + AE + + F P+ ++ K +IA L+L S
Sbjct: 183 NIVSPGSTETDMQWSLWADENGAEQVIKGSLET-FKTGIPLKKLAKPSDIADAVLFLVSG 241

Query: 229 ASAYTTGTIHIIDGG 243
+ + T +DGG
Sbjct: 242 QAGHITMHNLCVDGG 256


37  YpsIP31758_2444  YpsIP31758_2522  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
YpsIP31758_2444  -2  15  -4.355780  ferric enterobactin ABC transporter ATP-binding
YpsIP31758_2445  -2  14  -4.228732  IucA/IucC family siderophore biosynthesis
YpsIP31758_2446  -1  15  -3.984914  hypothetical protein
YpsIP31758_2447  -1  15  -3.701945  lysine N6-hydroxylase/L-ornithine N5-oxygenase
YpsIP31758_2448  -1  15  -3.412296  pyridoxal-dependent decarboxylase
YpsIP31758_2449  -1  13  -2.079932  ferric iron reductase
YpsIP31758_2450  -1  9  -0.697024  hypothetical protein
YpsIP31758_2451  -3  9  -1.836631  assembly protein
YpsIP31758_2452  -2  11  -2.023493  deoxycytidine triphosphate deaminase
YpsIP31758_2453  -2  12  -2.171902  uridine kinase
YpsIP31758_2454  -1  13  -2.061905  ATPase
YpsIP31758_2455  -1  14  -2.497437  methionyl-tRNA synthetase
YpsIP31758_2456  1  17  -3.053815  hypothetical protein
YpsIP31758_2457  1  16  -0.921871  hypothetical protein
YpsIP31758_2458  0  15  -0.779693  hypothetical protein
YpsIP31758_2459  0  15  -0.962306  major facilitator transporter
YpsIP31758_2460  0  14  -1.205232  hypothetical protein
YpsIP31758_2461  -1  14  -2.437149  hypothetical protein
YpsIP31758_2462  -2  14  -2.929173  cytidine deaminase
YpsIP31758_2463  -2  15  -3.677198  malate dehydrogenase
YpsIP31758_2464  -1  16  -3.524137  hypothetical protein
YpsIP31758_2465  -2  14  -3.778190  beta-methylgalactoside transporter inner
YpsIP31758_2466  -1  13  -1.864155  galactose/methyl galactoside transporter
YpsIP31758_2467  0  10  0.836029  galactose ABC transporter periplasmic protein
YpsIP31758_2468  0  10  2.269282  hypothetical protein
YpsIP31758_2469  -1  9  2.515682  hypothetical protein
YpsIP31758_2470  0  12  1.712821  GTP cyclohydrolase I
YpsIP31758_2471  0  15  -2.822063  major facilitator transporter
YpsIP31758_2472  0  22  -6.940168  LysR family transcriptional regulator
YpsIP31758_2473  0  26  -9.006913  S-(hydroxymethyl)glutathione dehydrogenase/class
YpsIP31758_2474  1  32  -11.325675  S-formylglutathione hydrolase
YpsIP31758_2475  0  31  -10.655440  hypothetical protein
YpsIP31758_2476  0  29  -9.122169  hypothetical protein
YpsIP31758_2477  1  29  -8.303541  RND family efflux transporter MFP subunit
YpsIP31758_2478  -1  17  -4.808131  ABC transporter ATP-binding protein
YpsIP31758_2479  -1  15  -5.213043  radical SAM domain-containing protein
YpsIP31758_2480  -1  10  -1.135391  molybdopterin biosynthesis protein MoeA
YpsIP31758_2481  -2  10  -1.048119  molybdopterin biosynthesis protein MoeB
YpsIP31758_2482  -1  14  -0.739975  hypothetical protein
YpsIP31758_2483  -1  20  2.632199  ABC transporter ATP-binding protein
YpsIP31758_2484  -1  28  5.586115  hypothetical protein
YpsIP31758_2485  -1  30  6.124096  ImpA domain-containing protein
YpsIP31758_2486  -1  26  2.431734  hypothetical protein
YpsIP31758_2487  -1  26  4.314024  hypothetical protein
YpsIP31758_2488  -1  23  3.635640  hypothetical protein
YpsIP31758_2489  -1  22  3.459029  hypothetical protein
YpsIP31758_2490  -2  23  4.807457  hypothetical protein
YpsIP31758_2491  -2  24  5.929668  hypothetical protein
YpsIP31758_2492  -1  27  7.200741  ImpA domain-containing protein
YpsIP31758_2493  -1  27  5.873539  hypothetical protein
YpsIP31758_2494  -1  24  5.389558  hypothetical protein
YpsIP31758_2495  -1  23  5.211392  hypothetical protein
YpsIP31758_2496  1  19  -2.962039  hypothetical protein
YpsIP31758_2498  1  19  -6.477364  hypothetical protein
YpsIP31758_2499  -1  15  -1.761187  hypothetical protein
YpsIP31758_2500  -2  20  2.049670  M23 peptidase domain-containing protein
YpsIP31758_2501  -1  24  2.621862  hypothetical protein
YpsIP31758_2502  -1  25  4.565156  hypothetical protein
YpsIP31758_2503  0  30  6.694882  M23 peptidase domain-containing protein
YpsIP31758_2504  2  41  10.377503  Rhs element Vgr protein
YpsIP31758_2505  2  41  10.344413  AAA ATPase
YpsIP31758_2506  1  38  9.375803  hypothetical protein
YpsIP31758_2507  0  31  7.628707  OmpA domain-containing protein
YpsIP31758_2508  -1  27  4.518694  hypothetical protein
YpsIP31758_2509  -1  23  1.993463  hypothetical protein
YpsIP31758_2510  0  18  -2.904961  hypothetical protein
YpsIP31758_2511  -1  21  -8.091854  hypothetical protein
YpsIP31758_2512  2  26  -9.700397  acyltransferase
YpsIP31758_2513  2  21  -8.399285  acyl carrier protein
YpsIP31758_2514  2  22  -7.620158  hypothetical protein
YpsIP31758_2515  1  20  -6.825428  hypothetical protein
YpsIP31758_2516  1  21  -6.981339  polyketide biosynthesis enoyl-CoA hydratase
YpsIP31758_2517  2  23  -6.781699  enoyl-CoA hydratase/isomerase family protein
YpsIP31758_2518  1  23  -6.397350  hydroxymethylglutaryl-coenzyme A synthase
YpsIP31758_2519  0  25  -7.041551  short-chain dehydrogenase/reductase family
YpsIP31758_2520  1  26  -7.408347  beta-ketoacyl synthase family protein
YpsIP31758_2521  0  25  -7.462039  thioester dehydratase family protein
YpsIP31758_2522  -1  25  -4.575569  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2445  PF04183       608    0.0      IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 608 bits (1568), Expect = 0.0
Identities = 171/593 (28%), Positives = 293/593 (49%), Gaps = 28/593 (4%)

Query: 20 LQPVLWQKVNRLHLCKAISEFSHECLLAPQRMTDHPDSEGYDYYQLVAAAADKPANYVFR 79
+ W VNR + K +SE +E + H +S+G D Y + A + F
Sbjct: 1 MNHKDWDLVNRRLVAKMLSELEYEQVF-------HAESQGDDRYCINLPGA----QWRFI 49

Query: 80 ARRLALDHWLIDPDSLHKTVNEKSTDLDLLLFIIEFKRQLSISERVLPTYLEEITSTLYS 139
A R ID +L + +E + +++ K+ LS+S+ + +++++ +TL
Sbjct: 50 AERGIWGWLWIDAQTL-RCADEP---VLAQTLLMQLKQVLSMSDATVAEHMQDLYATLLG 105

Query: 140 SA-FKHCRTGISATALVNASFQIIEKEMMEGHPSFVANNGRIGFDAQDFQRFSPEAASDV 198
R G+SA+ L+N + ++ ++ GHP FV N GR G+ + +R++PE A+
Sbjct: 106 DLQLLKARRGLSASDLINLNADRLQC-LLSGHPKFVFNKGRRGWGKEALERYAPEYANTF 164

Query: 199 HLVWLAAHKSKAHFACIEQLDYAKLMEQELGAEVLAEFEQQLIARDCHPQDYVLMPVHPW 258
L WLA + + C ++D +L+ + + A F Q +++ +PVHPW
Sbjct: 165 RLHWLAVKREHMIWRCDNEMDIHQLLTAAMDPQEFARFSQVWQENGL-DHNWLPLPVHPW 223

Query: 259 QWQNKLTSIFAADIANQRLVFLGQGKDVYQAQQSIRTFFNRSHPQRYYVKMALSILNMGF 318
QWQ K+ + F AD A R+V LG+ D + AQQS+RT N S +K+ L+I N
Sbjct: 224 QWQQKIATDFIADFAEGRMVSLGEFGDQWLAQQSLRTLTNASRRGGLDIKLPLTIYNTSC 283

Query: 319 MRGLSPYYMATTPGINEWLFDLVEGDEILQAYGFKILREVASIGFRNSYYEQAITGDSAY 378
RG+ Y+A P + WL + D L G IL E A+ + Y Y
Sbjct: 284 YRGIPGRYIAAGPLASRWLQQVFATDATLVQSGAVILGEPAAGYVSHEGYAALARAPYRY 343

Query: 379 KKMAAALWRENPLQLIQPGQQLMTMAALLHLDQEGKALLPALIDASGLSTQQWLERYLRS 438
++M +WRENP + ++P + + MA L+ D+ + L A ID SGL + WL + R
Sbjct: 344 QEMLGVIWRENPCRWLKPDESPVLMATLMECDENNQPLAGAYIDRSGLDAETWLTQLFRV 403

Query: 439 YLSPLLHCLYVYDLMFMPHGENIILVLQNAVPQHIFMKDLAEEILVLNTE----ADLPEK 494
+ PL H L Y + + HG+NI L ++ VPQ + +KD ++ ++ E LP++
Sbjct: 404 VVVPLYHLLCRYGVALIAHGQNITLAMKEGVPQRVLLKDFQGDMRLVKEEFPEMDSLPQE 463

Query: 495 ARRIIVDMPDEMKTLTILSDVFDGVFRYLAAVLDQQGDYPERYFWADVAACIRQYQQQYP 554
R + + + + + F V R+++ ++ + G PER F+ +AA + Y +++P
Sbjct: 464 VRDVTSRLSADYLIHDLQTGHFVTVLRFISPLMVRLG-VPERRFYQLLAAVLSDYMKKHP 522

Query: 555 ELAAKFASYPLFTPKIKRCCLNRLQLNNNQHMLDLADPVQSF-QFADDLINPL 606
+++ +FA + LF P+I R LN ++L DL + + +DL NPL
Sbjct: 523 QMSERFALFSLFRPQIIRVVLNPVKLT----WPDLDGGSRMLPNYLEDLQNPL 571


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2449  2FE2SRDCTASE  365    e-131    Ferric iron reductase signature.
		>2FE2SRDCTASE#Ferric iron reductase signature.

Length = 262

Score = 365 bits (938), Expect = e-131
Identities = 82/212 (38%), Positives = 130/212 (61%), Gaps = 2/212 (0%)

Query: 42 PEETMSFHTWSSIDNFPTLIQKYRDEYYGDNDLK-PNDKALYSLWSQWYFGLIIPPMMLL 100
P M+ WSS + +L+ Y D Y + + +K L SLW+QWY GL++PP+ML
Sbjct: 51 PLNAMTLAQWSSPNVLSSLLAVYSDHIYRNQPMMIRENKPLISLWAQWYIGLMVPPLMLA 110

Query: 101 LIEYPQTIDTHHKNFKVLFHHSGRPEVVYYQL-KWQSQDPGTLLERYYLLLNHHVIPIAE 159
L+ + +D ++F FH +GR + + + ++ P + R L++ ++P+ +
Sbjct: 111 LLTQEKALDVSPEHFHAEFHETGRVACFWVDVCEDKNATPHSPQHRMETLISQALVPVVQ 170

Query: 160 KIESYQGINGRLLWNNIGYLMFWYLGEFKGRLGDDLYQSIINGLFMELSLPNGQDNPLYR 219
+E+ ING+L+W+N GYL+ WYL E K LG+ +S+ + LF E +L NG+DNPL+R
Sbjct: 171 ALEATGEINGKLIWSNTGYLINWYLTEMKQLLGEATVESLRHALFFEKTLTNGEDNPLWR 230

Query: 220 TVILRNGTLQRRSCCQRNKLPGVRSCHDCPLE 251
TV+LR+G L RR+CCQR +LP V+ C DC L+
Sbjct: 231 TVVLRDGLLVRRTCCQRYRLPDVQQCGDCTLK 262


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2459  TCRTETA       68     1e-14    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 67.5 bits (165), Expect = 1e-14
Identities = 78/386 (20%), Positives = 146/386 (37%), Gaps = 26/386 (6%)

Query: 9 TNLLIASLVLTIGRAVTLPFITIYLVEHFQLAPDTVGLLLGASLALGIFTSLYGGYLVDK 68
+ + + ++ + + V LP + LV + G+LL + + G L D+
Sbjct: 12 STVALDAVGIGLIMPV-LPGLLRDLVHSNDVTAHY-GILLALYALMQFACAPVLGALSDR 69

Query: 69 FNKKRLILLTIILFSATFFALPWIEHPAWVILTLALLHSAYSVYTIAVKACF-ADWLPVN 127
F ++ ++L+++ + + + WV L + + + + T AV + AD +
Sbjct: 70 FGRRPVLLVSLAGAAVDYAIMA-TAPFLWV-LYIGRIVAGITGATGAVAGAYIADITDGD 127

Query: 128 ERIKAFSANYTMVNVGWAIGPVMGVLVVGFGPQLPFIISGALALLVAIVLKFRINDADML 187
ER + F G GPV+G L+ GF P PF + AL L + F L
Sbjct: 128 ERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCF-------L 180

Query: 188 ATERVSSESVPDLRQTFNILRHDKRLIYFTLGSLLGAMVF-AQFSGYLSQYLITVF-DAK 245
E E P R+ N L + T+ + L A+ F Q G + L +F + +
Sbjct: 181 LPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGEDR 240

Query: 246 FAYQ--VIGAVMTVNATIVIALQYL----LSRRMNQQNLMRWLMIGTLFFIIGLLGFMVA 299
F + IG + + Q + ++ R+ ++ + MI G + +
Sbjct: 241 FHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADG---TGYI-LLAF 296

Query: 300 QDSIPLWMLAMAIFSLGEIIVIPAEYLFIDFIAPANLKGSYYGV-QNLGQLGGAINPVLC 358
+ M + + G I +PA + +G G L L + P+L
Sbjct: 297 ATRGWMAFPIMVLLASGGIG-MPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLF 355

Query: 359 GFLLAYTVPEMMFYMLIIAATLGLVL 384
+ A ++ + I A L L+
Sbjct: 356 TAIYAASITTWNGWAWIAGAALYLLC 381


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2466  PF05272       32     0.007    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 32.0 bits (72), Expect = 0.007
Identities = 21/74 (28%), Positives = 29/74 (39%), Gaps = 17/74 (22%)

Query: 24 PGVKALDNVNLKVRPYSIHALMGENGAGKSTLLKCLFGIYKKDSGSIIFQGQEIEFKSSK 83
PG K D + L G G GKSTL+ L G+ F + + K
Sbjct: 591 PGCKF-DYSVV---------LEGTGGIGKSTLINTLVGLD-------FFSDTHFDIGTGK 633

Query: 84 EALEQGVSMVHQEL 97
++ EQ +V EL
Sbjct: 634 DSYEQIAGIVAYEL 647


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2476  TYPE3IMSPROT  31     0.013    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 31.3 bits (71), Expect = 0.013
Identities = 23/122 (18%), Positives = 48/122 (39%), Gaps = 5/122 (4%)

Query: 669 TISLVTLFSVILLLISTMIIGMAESKRISKILKIMESVGGSLYTHIIFFIQQNVTPVLVA 728
+ L+ L S +++ AE + + V L L+A
Sbjct: 39 SAMLMGLSDYYFEHFSKLMLIPAEQSYLPFSQALSYVVDNVLLEFFYLCFPLLTVAALMA 98

Query: 729 VAIAF-PIGFIL----LQKWLSKYNFINNLSYLYAFGSLLLFMVSLVSVMTLSLILSHTK 783
+A GF++ ++ + K N I +++ SL+ F+ S++ V+ LS+++
Sbjct: 99 IASHVVQYGFLISGEAIKPDIKKINPIEGAKRIFSIKSLVEFLKSILKVVLLSILIWIII 158

Query: 784 KN 785
K
Sbjct: 159 KG 160


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2480  DPTHRIATOXIN  35     5e-04    Diphtheria toxin signature.
		>DPTHRIATOXIN#Diphtheria toxin signature.

Length = 567

Score = 35.1 bits (80), Expect = 5e-04
Identities = 40/135 (29%), Positives = 55/135 (40%), Gaps = 41/135 (30%)

Query: 3 HCNTSDLLSLEQALTK-MLSQATPLPATEVIPLSEAAGRITASAIT----------SPIA 51
H NT ++++ AL+ M++QA PL E++ + AA S I P
Sbjct: 354 HHNTEEIVAQSIALSSLMVAQAIPL-VGELVDIGFAAYNFVESIINLFQVVHNSYNRPAY 412

Query: 52 VP-----PFANSAMDGYAVRWHELSDEI--------------------PLPVAGVAFAGA 86
P PF + DGYAV W+ + D I PLP+AGV
Sbjct: 413 SPGHKTQPFLH---DGYAVSWNTVEDSIIRTGFQGESGHDIKITAENTPLPIAGVLLPTI 469

Query: 87 PFK-DVWPEKTCIRI 100
P K DV KT I +
Sbjct: 470 PGKLDVNKSKTHISV 484


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2483  PF05272       32     0.009    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 31.6 bits (71), Expect = 0.009
Identities = 11/18 (61%), Positives = 13/18 (72%)

Query: 352 GPNGIGKSTLLKTLLGEY 369
G GIGKSTL+ TL+G
Sbjct: 603 GTGGIGKSTLINTLVGLD 620


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2493  TCRTETB       28     0.044    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 27.9 bits (62), Expect = 0.044
Identities = 16/72 (22%), Positives = 27/72 (37%), Gaps = 11/72 (15%)

Query: 81 LFSFVFLIPFILSMM--FLQLPGFYSKSPTEMKFFFGISNFSL---------LSLIFFCF 129
L + + +L F + GF S P MK +S + +S+I F +
Sbjct: 252 LGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGY 311

Query: 130 RMALFVPRGTPV 141
+ V R P+
Sbjct: 312 IGGILVDRRGPL 323


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2494  CHANLCOLICIN  31     0.032    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 30.8 bits (69), Expect = 0.032
Identities = 16/50 (32%), Positives = 22/50 (44%)

Query: 800 VVGGGVSGIGALLGSFALLGPAGIGALLILTGALFAFEASQLRSTPFEVW 849
GVS + ALL S GI + I+TG L ++ +T EV
Sbjct: 471 AADAGVSYVVALLFSLLAGTTLGIWGIAIVTGILCSYIDKNKLNTINEVL 520


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2505  DPTHRIATOXIN  30     0.038    Diphtheria toxin signature.
		>DPTHRIATOXIN#Diphtheria toxin signature.

Length = 567

Score = 30.5 bits (68), Expect = 0.038
Identities = 19/54 (35%), Positives = 24/54 (44%), Gaps = 9/54 (16%)

Query: 622 GIGKTETALALADSLFGGEKSLITINMSEYQEAHTVSQLKGSPPGYVGYGQGGV 675
GIG +A A AD + KS + N S Y G+ PGYV Q G+
Sbjct: 23 GIGAPPSAHAGADDVVDSSKSFVMENFSSYH---------GTKPGYVDSIQKGI 67


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2507  OMPADOMAIN    85     4e-20    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 85.4 bits (211), Expect = 4e-20
Identities = 42/147 (28%), Positives = 60/147 (40%), Gaps = 14/147 (9%)

Query: 426 PPPPPPPAPPAPKTVRLDSLSLFDVGKFTLNAGSTKML---VDALMNIRAKPGWLIVVAG 482
P P P K L S LF+ K TL L L N+ K G +VV G
Sbjct: 201 APAPAPAPEVQTKHFTLKSDVLFNFNKATLKPEGQAALDQLYSQLSNLDPKDG-SVVVLG 259

Query: 483 HTDITGDAQANHILSLKRAEALRDWMLSTSDVSPTCFAVQGYGATRPIADNDT------- 535
+TD G N LS +RA+++ D+ L + + + +G G + P+ N
Sbjct: 260 YTDRIGSDAYNQGLSERRAQSVVDY-LISKGIPADKISARGMGESNPVTGNTCDNVKQRA 318

Query: 536 --PDGRALNRRVEISLVPQADACQVPE 560
D A +RRVEI + D P+
Sbjct: 319 ALIDCLAPDRRVEIEVKGIKDVVTQPQ 345


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2515  DHBDHDRGNASE  73     2e-17    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 73.2 bits (179), Expect = 2e-17
Identities = 55/255 (21%), Positives = 92/255 (36%), Gaps = 32/255 (12%)

Query: 10 VLVTGGTKGIGRATVESFVKAGAKVYGTYFWGDNLDELENHFSQYLNRPVFLQADISDEE 69
+TG +GIG A + GA + + + L+++ + AD+ D
Sbjct: 11 AFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRDSA 70

Query: 70 ITTQLIEKIAQENKKIDILILNAAFAPQFKDTYKFRGLLDSIEHNSWPLITYIDC----- 124
++ +I +E IDIL+ N A + GL+ S+ W ++
Sbjct: 71 AIDEITARIEREMGPIDILV-NVAGVLRP-------GLIHSLSDEEWEATFSVNSTGVFN 122

Query: 125 -----IKQHFGQYPGYVVAITSEGHRSCHITGYDYVAASKAVLETLTKYIG---ARENII 176
K + G +V + S + Y A+SKA TK +G A NI
Sbjct: 123 ASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAY-ASSKAAAVMFTKCLGLELAEYNIR 181

Query: 177 INCISPGVVDTE----------AFELVFGKKAQTFIRKFDPDFIVSPEAVGNVSVALCSG 226
N +SPG +T+ E V +TF + P + + + L SG
Sbjct: 182 CNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSG 241

Query: 227 LMDAVRGQVITVDNG 241
+ + VD G
Sbjct: 242 QAGHITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2519  DHBDHDRGNASE  120    2e-35    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 120 bits (303), Expect = 2e-35
Identities = 75/251 (29%), Positives = 117/251 (46%), Gaps = 16/251 (6%)

Query: 2 NLFISGGASGIGRSVVIAALSKGWNV-GFSYHNNKEGAQQLLDIAVAEFPRQLCRAYQLD 60
FI+G A GIG +V S+G ++ Y+ K A A + D
Sbjct: 10 IAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEA----FPAD 65

Query: 61 VIDSGAVEYVGDRLLVDFSNIDAVVCNAGIDLPGNLVSMTDEDWALVLNTNLTGTFYLIR 120
V DS A++ + R+ + ID +V AG+ PG + S++DE+W + N TG F R
Sbjct: 66 VRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASR 125

Query: 121 YFLPLFLANKYGRIVTL-SSLAKDGSSGQAAYAASKAGLVGLTKTTAKEYGHFGITANVV 179
+ + G IVT+ S+ A + AAYA+SKA V TK E + I N+V
Sbjct: 126 SVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIV 185

Query: 180 VPGLINTEI-----IGDD-----IKGIKNFFAQYAPVGRLGSPSEVAEVILFLVAKESSY 229
PG T++ ++ IKG F P+ +L PS++A+ +LFLV+ ++ +
Sbjct: 186 SPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGH 245

Query: 230 VNGAVFNVTGG 240
+ V GG
Sbjct: 246 ITMHNLCVDGG 256


38  YpsIP31758_2580  YpsIP31758_2585  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
YpsIP31758_2580  1       20       -3.331361  30S ribosomal protein S1
YpsIP31758_2581  1       17       -3.698168  cytidylate kinase
YpsIP31758_2582  -1      14       -3.209853  3-phosphoshikimate 1-carboxyvinyltransferase
YpsIP31758_2583  0       16       -4.242590  phosphoserine aminotransferase
YpsIP31758_2584  0       16       -3.705936  hypothetical protein
YpsIP31758_2585  -1      19       -3.033319  hemagglutinin/invasin
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2584  OMADHESIN     73     7e-16    Yersinia outer membrane adhesin signature.
		>OMADHESIN#Yersinia outer membrane adhesin signature.

Length = 455

Score = 73.4 bits (179), Expect = 7e-16
Identities = 92/355 (25%), Positives = 161/355 (45%), Gaps = 32/355 (9%)

Query: 269 SILGGSGNMGVGDSVTAITNS-VVFGGNTSGNSTGSTLTDSVSVSGNGTSGNNVVNIGGA 327
SI G ++ +G A+ +S V +G ++ G + S S G + V
Sbjct: 93 SIATGVNSVAIGPLSKALGDSAVTYGAASTAQKDGVAIGARASTSDTGVA----VGFNSK 148

Query: 328 ANGNNSASLGTGS---VSSEGGIALGSGSIATRNDELNIG----DRQITSVKKGVENTDT 380
A+ NS ++G S + IA+G S R + ++IG +RQ+T + G ++TD
Sbjct: 149 ADAKNSVAIGHSSHVAANHGYSIAIGDRSKTDRENSVSIGHESLNRQLTHLAAGTKDTDA 208

Query: 381 INVSQL-----------NDSFDDVLNLSSEYSDNSFSTVTENINNYTDA-SLDTVLNTTG 428
+NV+QL N ++L ++ Y+DN S+V NNYTD+ S +T+ N
Sbjct: 209 VNVAQLKKEIEKTQENTNKRSAELLANANAYADNKSSSVLGIANNYTDSKSAETLENARK 268

Query: 429 EYTDNS---ILLVTNESNNYTDNGMESVSNYANIYADESLLAIYNEEANYMSNLIDMTLN 485
E S + + SN+ +E+ +AN A +L E AN S L
Sbjct: 269 EAFAQSKDVLNMAKAHSNSVARTTLETAEEHANSVARTTL-ETAEEHANKKSA---EALA 324

Query: 486 NANNYTDLSVNTIIYTGKQYTDSRINEYQRTFKNEFLTYSNGKFGGFDKDINQKQKQLNA 545
+AN Y D + + T YTD ++ + E Y++ KF D +++ +++
Sbjct: 325 SANVYADSKSSHTLKTANSYTDVTVSNSTKKAIRESNQYTDHKFRQLDNRLDKLDTRVDK 384

Query: 546 GIAATMAAAVIPQKSG-SKVSIGVGLAGYSDQGAGSVGAIWHVNQRITMNTTMTY 599
G+A++ A + Q G KV+ G+ GY A ++G+ + VN+ + + + Y
Sbjct: 385 GLASSAALNSLFQPYGVGKVNFTAGVGGYRSSQALAIGSGYRVNENVALKAGVAY 439


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2585  OMADHESIN     113    6e-30    Yersinia outer membrane adhesin signature.
		>OMADHESIN#Yersinia outer membrane adhesin signature.

Length = 455

Score = 113 bits (282), Expect = 6e-30
Identities = 109/368 (29%), Positives = 185/368 (50%), Gaps = 28/368 (7%)

Query: 16 AIFSGSVFADVN---IGDLNTGVIGNGTAVGNNNSLGGSTNGVVVGNGGSLSNSTNGVVI 72
A+ +GS+ VN IG L+ + +AV + +GV +G S S++ GV +
Sbjct: 88 AVGAGSIATGVNSVAIGPLSKAL--GDSAVTYGAASTAQKDGVAIGARASTSDT--GVAV 143

Query: 73 G-NGSVSDGDGVSIGGGTSTNG----GIAIGSGSNATQSDEINIG----DRQITGVKAGV 123
G N + V+IG + IAIG S + + ++IG +RQ+T + AG
Sbjct: 144 GFNSKADAKNSVAIGHSSHVAANHGYSIAIGDRSKTDRENSVSIGHESLNRQLTHLAAGT 203

Query: 124 ADTDAANVGQL-----------VAKAGETLNSANIYVDNQATETLNNANLYTDNKATETI 172
DTDA NV QL ++ E L +AN Y DN+++ L AN YTD+K+ ET+
Sbjct: 204 KDTDAVNVAQLKKEIEKTQENTNKRSAELLANANAYADNKSSSVLGIANNYTDSKSAETL 263

Query: 173 NNANTYTDNKSSETLNSANSYTDNKSSETLNSANTYTDSKTAEIFNTNKTYMDEKSKETL 232
NA +S + LN A +++++ + TL +A + +S T + + ++KS E L
Sbjct: 264 ENARKEAFAQSKDVLNMAKAHSNSVARTTLETAEEHANSVARTTLETAEEHANKKSAEAL 323

Query: 233 NNTYDYVDSKVSSIVYDVNSYTDKTVNTAFETSLSDAKSYVDDKYNQLSDKVNKNFNKTN 292
+ Y DSK S + NSYTD TV+ + + ++ ++ Y D K+ QL ++++K + +
Sbjct: 324 ASANVYADSKSSHTLKTANSYTDVTVSNSTKKAIRESNQYTDHKFRQLDNRLDKLDTRVD 383

Query: 293 AGISGAMAMSGIPQKFGYEK-SFGMAIGAYRGQSALAVGGDWNINHKTITRVNVSADTEG 351
G++ + A++ + Q +G K +F +G YR ALA+G + +N + V+
Sbjct: 384 KGLASSAALNSLFQPYGVGKVNFTAGVGGYRSSQALAIGSGYRVNENVALKAGVAYAGSS 443

Query: 352 GVGVAAGF 359
V A F
Sbjct: 444 DVMYNASF 451


39  YpsIP31758_2724  YpsIP31758_2732  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
YpsIP31758_2724  0       13       -3.368543  *hypothetical protein
YpsIP31758_2725  1       12       -5.515614  HNH endonuclease domain-containing protein
YpsIP31758_2726  2       18       -6.338695  hypothetical protein
YpsIP31758_2727  0       15       -4.144555  hypothetical protein
YpsIP31758_2728  1       13       -1.729575  hypothetical protein
YpsIP31758_2729  1       12       -0.308760  6-phospho-beta-glucosidase
YpsIP31758_2730  2       14       0.810562   RpiR family transcriptional regulator
YpsIP31758_2731  2       15       3.418929   tail collar domain-containing protein
YpsIP31758_2732  1       19       4.952725   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2725  PYOCINKILLER  42     3e-06    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 42.1 bits (98), Expect = 3e-06
Identities = 24/80 (30%), Positives = 44/80 (55%), Gaps = 14/80 (17%)

Query: 283 FRGMRSKFLKSISDNPEVKKRFDSATLADLANGKAPKG-----------WDVHHKLPL-D 330
+R R +F +++++PE+ K+F+ +LA + +G AP ++HHK+ + D
Sbjct: 532 WRDFREQFWIAVANDPELSKQFNPGSLAVMRDGGAPYVRESEQAGGRIKIEIHHKVRVAD 591

Query: 331 DSGTNDVGNLVLI--KRDFE 348
G ++GNLV + KR E
Sbjct: 592 GGGVYNMGNLVAVTPKRHIE 611


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2728  THERMOLYSIN   26     0.046    Thermolysin metalloprotease (M4) family signature.
		>THERMOLYSIN#Thermolysin metalloprotease (M4) family signature.

Length = 544

Score = 26.1 bits (57), Expect = 0.046
Identities = 11/85 (12%), Positives = 28/85 (32%), Gaps = 10/85 (11%)

Query: 34 DLIWRYIENNDKIEELSSNPFSKGRTAVLVKAKFLSSELKEFKLKTGIIGYPFDMKDISL 93
+L++RY++ +L + L+ K + + I +
Sbjct: 53 ELVYRYLDQEKNTFQLGGQARERLS---LIGNKLDELGHTVMRFEQAIAASLCMGAVLVA 109

Query: 94 YLTSQNIKITLCTEFKRNGTLVNSL 118
++ + +GTL+ +L
Sbjct: 110 HVNDGELS-------SLSGTLIPNL 127


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2729  PHPHTRNFRASE  31     0.009    Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase signature.

Length = 572

Score = 31.3 bits (71), Expect = 0.009
Identities = 23/122 (18%), Positives = 45/122 (36%), Gaps = 22/122 (18%)

Query: 349 GWQIDPVGLRYSLSVLYERYQKPLFIVENGFGAIDKVAADG-------MVHDDYRIAYLK 401
G++ +R L ++ +F + A+ + + G M+ + K
Sbjct: 357 GFR----AIRLCLE------KQDIFRTQ--LRALLRASTYGNLKVMFPMIATLEELRQAK 404

Query: 402 AHIEQMKKAVFEDGVDLMGYTPWGC---IDCVSFTTGEYSKRYGFIYVDKNDDGTGTMAR 458
A +++ K + +GVD+ G I + ++K F + ND TMA
Sbjct: 405 AIMQEEKDKLLSEGVDVSDSIEVGIMVEIPSTAVAANLFAKEVDFFSIGTNDLIQYTMAA 464

Query: 459 SR 460
R
Sbjct: 465 DR 466


40  YpsIP31758_2757  YpsIP31758_2763  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
YpsIP31758_2757  -2      17       -4.257695  hypothetical protein
YpsIP31758_2759  -2      17       -4.899935  **hypothetical protein
YpsIP31758_2760  -2      15       -5.146693  TetR family transcriptional regulator
YpsIP31758_2761  -1      15       -4.957344  outer membrane porin protein C
YpsIP31758_2762  0       10       -3.449759  major facilitator transporter
YpsIP31758_2763  1       9        -3.696116  phosphotransfer intermediate protein in
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2760  HTHTETR       65     4e-15    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 64.6 bits (157), Expect = 4e-15
Identities = 25/104 (24%), Positives = 41/104 (39%), Gaps = 4/104 (3%)

Query: 15 PAQQRILLTAHRLFYQEGIRATGIDKIIKESGVTKVTFYRHFPSKNDLISAFLEYRHQRW 74
+Q IL A RLF Q+G+ +T + +I K +GVT+ Y HF K+DL S E
Sbjct: 11 ETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNI 70

Query: 75 INWFIEELKQQTLHHA----NLALALTKCMASWFEHPSFRGCAF 114
+E + + + + + + F
Sbjct: 71 GELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIF 114


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2761  ECOLIPORIN    502    0.0      E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 502 bits (1294), Expect = 0.0
Identities = 242/388 (62%), Positives = 287/388 (73%), Gaps = 22/388 (5%)

Query: 1 MKLRVLSFIIPALLVAGSASAAEIYNKDGNKLDLYGKIDGLHYFSDNKNLDGDQSYMRFG 60
MK +VL+ +IPALL AG+A AAEIYNKDGNKLDLYGK+DGLHYFSD+ + DGDQ+YMR G
Sbjct: 1 MKRKVLALVIPALLAAGAAHAAEIYNKDGNKLDLYGKVDGLHYFSDDSSKDGDQTYMRVG 60

Query: 61 LKGETQITDQLTGYGQWEYQVNLNKAENEDGNHDSFTRVGFAGLKFADYGSLDYGRNYGV 120
KGETQI DQLTGYGQWEY V N E E N S+TR+ FAGLKF DYGS DYGRNYGV
Sbjct: 61 FKGETQINDQLTGYGQWEYNVQANTTEGEGAN--SWTRLAFAGLKFGDYGSFDYGRNYGV 118

Query: 121 LYDVTSWTDVLPEFGGDTYG-ADNFLSQRGNGMLTYRNTNFFGLVDGLNFALQYQGKNGS 179
LYDV WTD+LPEFGGD+Y ADN+++ R NG+ TYRNT+FFGLVDGLNFALQYQGKN S
Sbjct: 119 LYDVEGWTDMLPEFGGDSYTYADNYMTGRANGVATYRNTDFFGLVDGLNFALQYQGKNES 178

Query: 180 SS---------ETNNGRGVADQNGDGYGMSLSYDLGWGVSASAAMASSLRTTAQNDLQ-- 228
S NNG + NGDG+G+S +YD+G G SA AA +S RT Q +
Sbjct: 179 QSADDVNIGTNNRNNGDDIRYDNGDGFGISTTYDIGMGFSAGAAYTTSDRTNEQVNAGGT 238

Query: 229 YGQGKRANAYTGGLKYDANNVYLAANYTQTYNLTRFGDFSNRSSDAAFGFADKAHNIEVV 288
G +A+A+T GLKYDANN+YLA Y++T N+T +G G A+K N EV
Sbjct: 239 IAGGDKADAWTAGLKYDANNIYLATMYSETRNMTPYG---KTDKGYDGGVANKTQNFEVT 295

Query: 289 AQYQFDFGLRPSVAYLQSKGKDIGI----YGDQDLLKYVDIGATYFFNKNMSTYVDYKIN 344
AQYQFDFGLRP+V++L SKGKD+ D+DL+KY D+GATY+FNKN STYVDYKIN
Sbjct: 296 AQYQFDFGLRPAVSFLMSKGKDLTYNNVNGDDKDLVKYADVGATYYFNKNFSTYVDYKIN 355

Query: 345 LLDKND-FTKNARINTDDIVAVGMVYQF 371
LLD +D F K+A I+TDDIVA+GMVYQF
Sbjct: 356 LLDDDDPFYKDAGISTDDIVALGMVYQF 383


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2762  TCRTETA       34     6e-04    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 34.4 bits (79), Expect = 6e-04
Identities = 56/301 (18%), Positives = 102/301 (33%), Gaps = 15/301 (4%)

Query: 25 FIAGLGMAAWAPLVPFAKARIGLND---ASLGLLLLCIGIGSMLAMPLTGVLTAKWGCRA 81
+ +G+ P++P + ++ A G+LL + P+ G L+ ++G R
Sbjct: 15 ALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRP 74

Query: 82 VILLAGAVLCLDLPLLVLMNTPATMAIALLVFGAAMGIIDVAMNIQAVIVEKASGRAMMS 141
V+L++ A +D ++ + I +V G VA A I RA
Sbjct: 75 VLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIADIT-DGDERARHF 133

Query: 142 GFHG-LFSVGGIVG------AGGVSALLWLGLNPLTAIMATVVLMIILLLAAN---KNLL 191
GF F G + G GG S + + +L + + L
Sbjct: 134 GFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPESHKGERRPLR 193

Query: 192 RGSGEPHDGPLFVFPRGWVMFIGFLCFVMFLAEGSMLDWSAVFLTTLRGMSPSQAGMGYA 251
R + P + V + + F+M L +F + G+ A
Sbjct: 194 REALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGISLA 253

Query: 252 VFAIAMTLGR-LNGDRIVNGLGRYKVLLGGSLCSAIGIIIAISIDSSMAAIIGFMLVGFG 310
F I +L + + + LG + L+ G + G I+ A +L+ G
Sbjct: 254 AFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLASG 313

Query: 311 A 311

Sbjct: 314 G 314


41  YpsIP31758_2792  YpsIP31758_2803  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
YpsIP31758_2792  2       13       -1.339964  hydrophobic amino acid ABC transporter permease
YpsIP31758_2793  1       14       -2.735681  hydrophobic amino acid ABC transporter permease
YpsIP31758_2794  1       17       -4.801464  urea ABC transporter substrate-binding protein
YpsIP31758_2795  0       22       -6.555826  hypothetical protein
YpsIP31758_2796  2       21       -5.344827  phosphonate ABC transporter permease
YpsIP31758_2797  3       20       -5.177838  phosphonate ABC transporter permease
YpsIP31758_2798  3       22       -5.387720  phosphonate ABC transporter substrate-binding
YpsIP31758_2799  4       23       -4.792905  phosphonate ABC transporter ATP-binding protein
YpsIP31758_2800  1       16       -3.565947  hypothetical protein
YpsIP31758_2801  1       15       -3.331369  hypothetical protein
YpsIP31758_2802  0       17       -4.333754  hypothetical protein
YpsIP31758_2803  0       16       -3.311501  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2800  HTHTETR       27     0.010    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 26.5 bits (58), Expect = 0.010
Identities = 5/41 (12%), Positives = 18/41 (43%), Gaps = 6/41 (14%)

Query: 4 LSWIIFGLIAGILAKWIMP------GEDGGGFIMTIILGII 38
+ I+ G I+G++ W+ ++ ++ ++ +
Sbjct: 163 AAIIMRGYISGLMENWLFAPQSFDLKKEARDYVAILLEMYL 203


42  YpsIP31758_2816  YpsIP31758_2848  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
YpsIP31758_2816  2       17       -1.479061  hypothetical protein
YpsIP31758_2817  2       17       -1.420711  hypothetical protein
YpsIP31758_2818  1       16       -2.278713  LysR family substrate binding transcriptional
YpsIP31758_2819  1       14       -3.479468  FAD/FMN-binding oxidoreductase
YpsIP31758_2820  0       14       -4.390986  ZraR family transcriptional regulator
YpsIP31758_2821  2       14       -4.776620  sensory box histidine kinase
YpsIP31758_2822  1       15       -4.885330  zinc resistance protein
YpsIP31758_2823  -1      12       -3.246411  xanthosine transporter XapB
YpsIP31758_2824  -1      11       -2.272989  purine nucleoside phosphorylase
YpsIP31758_2825  -2      12       -0.266694  integral membrane protein
YpsIP31758_2826  -1      12       1.314325   DNA-binding transcriptional activator XapR
YpsIP31758_2827  -1      13       2.001141   hypothetical protein
YpsIP31758_2828  0       13       2.535100   choline transport protein BetT
YpsIP31758_2829  1       13       3.412070   BetI family transcriptional regulator
YpsIP31758_2830  1       15       3.768731   betaine aldehyde dehydrogenase
YpsIP31758_2831  1       14       2.646306   choline dehydrogenase
YpsIP31758_2832  0       16       1.573243   hypothetical protein
YpsIP31758_2833  0       18       1.939702   molybdopterin guanine dinucleotide biosynthesis
YpsIP31758_2834  0       16       1.357908   molybdopterin synthase small subunit
YpsIP31758_2835  0       16       1.204628   molybdenum cofactor biosynthesis protein MoaC
YpsIP31758_2836  -1      16       1.445656   molybdenum cofactor biosynthesis protein A
YpsIP31758_2837  -2      15       2.144893   hypothetical protein
YpsIP31758_2838  -1      15       3.734706   hypothetical protein
YpsIP31758_2839  -2      15       3.749631   excinuclease ABC subunit B
YpsIP31758_2840  0       18       4.874272   amino acid ABC transporter ATP-binding protein
YpsIP31758_2841  0       16       4.994400   dithiobiotin synthetase
YpsIP31758_2842  0       17       4.695287   biotin biosynthesis protein BioC
YpsIP31758_2843  0       18       4.763737   8-amino-7-oxononanoate synthase
YpsIP31758_2844  -1      18       3.973963   biotin synthase
YpsIP31758_2845  0       18       3.976521   adenosylmethionine-8-amino-7-oxononanoate
YpsIP31758_2846  0       19       3.519443   6-phosphogluconolactonase
YpsIP31758_2847  1       19       3.529402   phosphotransferase
YpsIP31758_2848  1       16       3.468250   molybdate transporter ATP-binding protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2820  HTHFIS        518    0.0      FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 518 bits (1336), Expect = 0.0
Identities = 171/473 (36%), Positives = 254/473 (53%), Gaps = 28/473 (5%)

Query: 5 KAHILVVDDDLSHCTIIQALMKGWGYQTTPAHNGLEAIELAKEIPFDLILTDVRMSEMDG 64
A ILV DDD + T++ + GY N DL++TDV M + +
Sbjct: 3 GATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENA 62

Query: 65 IEALKAIKAYNPAIPILIMTAYSNVESAVEAIKAGAYDYLTKPLDFDMLQLTLERALEHT 124
+ L IK P +P+L+M+A + +A++A + GAYDYL KP D L + RAL
Sbjct: 63 FDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE- 121

Query: 125 HLKNENKTLKQQLISNQNIIGRSPQMRYLMDMVGMIAPSEATVLICGESGTGKEIIARSV 184
K L+ ++GRS M+ + ++ + ++ T++I GESGTGKE++AR++
Sbjct: 122 -PKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARAL 180

Query: 185 HANSSRKDQPLVIVNCAALSESLLESELFGHEKGAFTGADKRREGRFMEAHKATLFLDEI 244
H R++ P V +N AA+ L+ESELFGHEKGAFTGA R GRF +A TLFLDEI
Sbjct: 181 HDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEI 240

Query: 245 GEISGLMQAKLLRAIQEREIQRVGSNQTLAIDVRLIAATNRNLKADVDSGKFRQDLYYRL 304
G++ Q +LLR +Q+ E VG + DVR++AATN++LK ++ G FR+DLYYRL
Sbjct: 241 GDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRL 300

Query: 305 NVVTIDTPALRERSEDIPPLSMHFLEKFALKNRKSIKGFTPQAMNMLLKYNWPGNVRELE 364
NVV + P LR+R+EDIP L HF+++ K +K F +A+ ++ + WPGNVRELE
Sbjct: 301 NVVPLRLPPLRDRAEDIPDLVRHFVQQAE-KEGLDVKRFDQEALELMKAHPWPGNVRELE 359

Query: 365 NTVERAVILLTGDFISEKELPLNINHYIQENAGSENIGYEDAEKP--------------- 409
N V R L D I+ + + + I ++ + +
Sbjct: 360 NLVRRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFASF 419

Query: 410 ----------IQSLDRVEIDAILTALEKTGGNKTEAAKHLGITRKTLQAKLQK 452
+ L +E IL AL T GN+ +AA LG+ R TL+ K+++
Sbjct: 420 GDALPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRE 472


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2821  PF06580       30     0.026    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 30.2 bits (68), Expect = 0.026
Identities = 45/281 (16%), Positives = 95/281 (33%), Gaps = 59/281 (20%)

Query: 311 VIEHEIDYKLKEGLIIPLSISV-ANIVNHNGSFLGNIFIFRDMREVRQLQEEIRRKEKLA 369
+ I+ K +PL++S+ N+V + F + + +Q + + + +A
Sbjct: 100 RLLAFINTK-PVAFTLPLALSIIFNVVVVTFMWSLLYFGWHFFKNYKQAEIDQWKMASMA 158

Query: 370 AIGNLAAGVA----HEIRNPLSSIKGFAKYFEGHSPQGSEEQELAKVMIKEVDRLNRAVT 425
L A A H + N L++I+ E+ A+ M+ + L R
Sbjct: 159 QEAQLMALKAQINPHFMFNALNNIRALIL----------EDPTKAREMLTSLSELMRYSL 208

Query: 426 ELLGLVRPSDLRIQLVNINEIIAH-----SLHLIRQDADSKKITIQFISNENLPRVEIDP 480
+ V++ + + L I+ ++ + N + V++ P
Sbjct: 209 R--------YSNARQVSLADELTVVDSYLQLASIQF---EDRLQFENQINPAIMDVQV-P 256

Query: 481 DRFTQALL-NLYLNAIQAMGRAGALEIALALVEESKLRISVIDTGKGIRAEDLENIFNPY 539
Q L+ N + I + + G + + + + + V +TG E
Sbjct: 257 PMLVQTLVENGIKHGIAQLPQGGKILLK-GTKDNGTVTLEVENTGSLALKNTKE------ 309

Query: 540 FTTKASGTGLGLAIVQK------------VIEEHQGRITVT 568
TG GL V++ + E QG++
Sbjct: 310 ------STGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAM 344


ORFs having significant similarity with Known Virulence factors
LocusTagHitsScoreE-valueComments
YpsIP31758_2829HTHTETR631e-14 TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 63.5 bits (154), Expect = 1e-14
Identities = 36/182 (19%), Positives = 71/182 (39%), Gaps = 12/182 (6%)

Query: 10 RRQQLIEATMAAVNEVGMHEASIAQIAKRAGVSNGIISHYFRDKNGLLEATMRYLIRHLG 69
RQ +++ + ++ G+ S+ +IAK AGV+ G I +F+DK+ L ++G
Sbjct: 12 TRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIG 71

Query: 70 EAVKQHLAALSVNDPRARLRAIAEGNFDDSQINSAAMKTWLAFWASSMHS----PQLYRL 125
E ++ A DP + LR I + + + + + + +
Sbjct: 72 ELELEYQAKFP-GDPLSVLREILIHVLEST-VTEERRRLLMEIIFHKCEFVGEMAVVQQA 129

Query: 126 QQVNNRRLYSNLCAEFKRCLPREQ------AQLAAKGMAGLIDGLWLRSALSGEHFNRQE 179
Q+ Y + K C+ + + AA M G I GL + + F+ ++
Sbjct: 130 QRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLFAPQSFDLKK 189

Query: 180 AL 181

Sbjct: 190 EA 191


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2833  PF06057       27     0.021    Type IV secretory pathway VirJ component
		>PF06057#Type IV secretory pathway VirJ component

Length = 243

Score = 27.5 bits (61), Expect = 0.021
Identities = 6/30 (20%), Positives = 13/30 (43%)

Query: 54 EHYPGMTEKALTEIIADARSRWSLQRVSVI 83
+ P + II ++ + Q+V +I
Sbjct: 93 QKDPKDVTQDTLAIIDKYQAEFGTQKVILI 122


43  YpsIP31758_2864  YpsIP31758_2870  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
YpsIP31758_2864  5       14       0.333110   nicotinamide mononucleotide transporter PnuC
YpsIP31758_2865  6       14       0.578204   quinolinate synthetase
YpsIP31758_2866  6       19       0.152104   ********tol-pal system protein YbgF
YpsIP31758_2867  6       21       0.301618   peptidoglycan-associated outer membrane
YpsIP31758_2868  5       18       0.524834   translocation protein TolB
YpsIP31758_2869  9       19       0.561127   cell envelope integrity inner membrane protein
YpsIP31758_2870  3       21       -0.401993  colicin uptake protein TolR
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2866  SYCDCHAPRONE  30     0.005    Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 30.3 bits (68), Expect = 0.005
Identities = 21/112 (18%), Positives = 40/112 (35%), Gaps = 10/112 (8%)

Query: 152 YNVAVSLALEKKQYDQAITAFQSFVKQYPKSTYQPNANYWLGQLYYNKGKKDDAAYYYAV 211
Y++A + + +Y+ A FQ+ Y LG G+ D A + Y+
Sbjct: 40 YSLAFN-QYQSGKYEDAHKVFQALCVLDH---YDSRFFLGLGACRQAMGQYDLAIHSYSY 95

Query: 212 VVKNYPKSPKSSEAMFKVGVIMQDKGQSDKAKA---VYQQVIKQYPNTDAAK 260
K P+ F + KG+ +A++ + Q++I
Sbjct: 96 GAIMDIKEPRFP---FHAAECLLQKGELAEAESGLFLAQELIADKTEFKELS 144


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2867  OMPADOMAIN    116    6e-34    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 116 bits (291), Expect = 6e-34
Identities = 37/119 (31%), Positives = 54/119 (45%), Gaps = 4/119 (3%)

Query: 50 EEQARLQMQELQKNNIVYFGFDKYDIGSDFAQMLDAHAAFLRSN--PSYKVVVEGHADER 107
+Q + + V F F+K + + LD + L + VVV G+ D
Sbjct: 205 APAPEVQTKHFTLKSDVLFNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRI 264

Query: 108 GTPEYNIALGERRASAVKMYLQGKGVSADQISIVSYGKEKPAVLGHDEAAFAKNRRAVL 166
G+ YN L ERRA +V YL KG+ AD+IS G+ P V G+ K R A++
Sbjct: 265 GSDAYNQGLSERRAQSVVDYLISKGIPADKISARGMGESNP-VTGNTCDN-VKQRAALI 321


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2869  IGASERPTASE   60     7e-12    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 60.1 bits (145), Expect = 7e-12
Identities = 29/193 (15%), Positives = 63/193 (32%), Gaps = 4/193 (2%)

Query: 64 YNRQQQQQTDAKRAEQQRQKKAEQQAEELQQKQAAEQQRLKELEKERLQAQEDAK---LA 120
YN + +++ Q E R+ E ++
Sbjct: 981 YNPEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETV 1040

Query: 121 AEEQKKQVAEQQKQIAEQQKQAAEQQKIAAAAVAKAKEEQKQAETAAAQAKAEADKIVKA 180
AE K++ +K + + A+ +++A A + K + E A + ++ + + +
Sbjct: 1041 AENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTET 1100

Query: 181 QAEAQKKAEAEAKKEAA-VAAAAKKQADADAKKAVEVAEKAAADAAEKKAAADAEKKAAA 239
+ A + E +AK E K + K+ + A+ A + K+ +
Sbjct: 1101 KETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQS 1160

Query: 240 AKKVAAAAEAKKK 252
A E K
Sbjct: 1161 QTNTTADTEQPAK 1173



Score = 52.4 bits (125), Expect = 2e-09
Identities = 22/199 (11%), Positives = 68/199 (34%), Gaps = 5/199 (2%)

Query: 67 QQQQQTDAKRAEQQRQKKAEQQAEELQQKQAAEQQRLKELEKERLQAQ-EDAKLAAEEQK 125
+Q+ ++ EQ + Q E ++ ++ + + E + ++ ++ + ++
Sbjct: 1044 SKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKET 1103

Query: 126 KQVAEQQKQIAEQQKQAAEQQKIAAAAVAKAKEEQKQAETAAAQAKAEADKIVKAQAEAQ 185
V +++K E +K + + + + + E Q + A+ I + Q++
Sbjct: 1104 ATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTN 1163

Query: 186 KKAEAEAKKE----AAVAAAAKKQADADAKKAVEVAEKAAADAAEKKAAADAEKKAAAAK 241
A+ E + + VE E + +++ K
Sbjct: 1164 TTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRH 1223

Query: 242 KVAAAAEAKKKAAAEAAAS 260
+ + + A +++
Sbjct: 1224 RRSVRSVPHNVEPATTSSN 1242



Score = 44.7 bits (105), Expect = 5e-07
Identities = 32/218 (14%), Positives = 65/218 (29%), Gaps = 10/218 (4%)

Query: 47 GEVIDAVMVDPGAVTEQYNRQQQQQTDAKRAEQQRQKKAEQQAEELQQKQAAEQQRLKEL 106
EV + A T+ Q + + ++ A + EE + + + Q
Sbjct: 1066 REVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQ----- 1120

Query: 107 EKERLQAQEDAKLAAEEQKKQVAEQQKQ--IAEQQKQAAEQQKIAAAAVAKAKEEQKQAE 164
E ++ +Q K E + AE ++ K+ Q A AKE E
Sbjct: 1121 EVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVE 1180

Query: 165 TAAAQAKAEADKIVKAQAEAQKKAEAEAKKEAAVAAAAKKQADADAKKAVEVAEKA-AAD 223
++ + E + + + ++ K + + V A
Sbjct: 1181 QPVTESTTV--NTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPAT 1238

Query: 224 AAEKKAAADAEKKAAAAKKVAAAAEAKKKAAAEAAAST 261
+ + A + A ++A+ KA A
Sbjct: 1239 TSSNDRSTVALCDLTSTNTNAVLSDARAKAQFVALNVG 1276


44  YpsIP31758_2890  YpsIP31758_2896  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias     Product
YpsIP31758_2890  2       18       -3.026141   hypothetical protein
YpsIP31758_2891  2       20       -4.037497   hypothetical protein
YpsIP31758_2892  1       19       -3.163188   hypothetical protein
YpsIP31758_2893  3       19       -2.388551   SsrA-binding protein
YpsIP31758_2895  2       22       -2.182541   phage integrase family protein
YpsIP31758_2896  2       17       -2.074797   transposase
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2895  FLGPRINGFLGI  27     0.036    Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 27.2 bits (60), Expect = 0.036
Identities = 11/24 (45%), Positives = 18/24 (75%)

Query: 150 LKARTLIQVLEPIKARGALETDLL 173
LKA +I +L+ IK+ GAL+ +L+
Sbjct: 348 LKADGIIAILQGIKSAGALQAELV 371


45  YpsIP31758_2927  YpsIP31758_2943  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias     Product
YpsIP31758_2927  2       22       -3.801261   LPS-assembly lipoprotein RlpB
YpsIP31758_2928  2       24       -3.722082   DNA polymerase III subunit delta
YpsIP31758_2929  3       27       -4.465475   nicotinic acid mononucleotide
YpsIP31758_2930  4       28       -4.162098   Ig-like domain-containing protein
YpsIP31758_2931  3       27       -4.605292   hypothetical protein
YpsIP31758_2932  3       26       -3.606491   hypothetical protein
YpsIP31758_2933  3       24       -2.558813   hypothetical protein
YpsIP31758_2934  2       24       -3.265418   hypothetical protein
YpsIP31758_2935  2       24       -2.341770   hypothetical protein
YpsIP31758_2936  2       22       -2.297001   hypothetical protein
YpsIP31758_2937  2       20       -2.314051   hypothetical protein
YpsIP31758_2938  2       22       -1.997763   hypothetical protein
YpsIP31758_2939  2       22       0.021280    hypothetical protein
YpsIP31758_2940  2       24       -0.431851   hypothetical protein
YpsIP31758_2941  3       24       -2.075981   hypothetical protein
YpsIP31758_2942  3       26       -3.136536   hypothetical protein
YpsIP31758_2943  3       22       -2.135851   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2928  HTHFIS        29     0.041    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 28.6 bits (64), Expect = 0.041
Identities = 14/56 (25%), Positives = 26/56 (46%), Gaps = 1/56 (1%)

Query: 160 LNVDDAAIQLLC-YCYEGNLLALSQALERLSLLYPDGKLTLPKVEQAVNDAAHFTP 214
D A++L+ + + GN+ L + RL+ LYP +T +E + +P
Sbjct: 336 KRFDQEALELMKAHPWPGNVRELENLVRRLTALYPQDVITREIIENELRSEIPDSP 391


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2929  LPSBIOSNTHSS  42     5e-07    Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein signature.

Length = 166

Score = 41.7 bits (98), Expect = 5e-07
Identities = 23/73 (31%), Positives = 37/73 (50%), Gaps = 4/73 (5%)

Query: 12 ALFGGTFDPIHYGHLKPVEALAQQVGLQHIILLPNHVPPHRPQPEANAQQRLKMVELAVA 71
A++ G+FDPI +GHL +E + + + P +P + Q+RL+ + A+A
Sbjct: 3 AIYPGSFDPITFGHLDIIERGCRL--FDQVYVAVLRNPNKQPMF--SVQERLEQIAKAIA 58

Query: 72 GNPLFSVDSRELL 84
P VDS E L
Sbjct: 59 HLPNAQVDSFEGL 71


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2932  PF04647       29     0.013    Accessory gene regulator B
		>PF04647#Accessory gene regulator B

Length = 212

Score = 29.4 bits (66), Expect = 0.013
Identities = 14/93 (15%), Positives = 28/93 (30%), Gaps = 5/93 (5%)

Query: 7 FLYTLILINSGVVAAAQSSPTSGSPYSSLVVLIALIALVIYV-----FRKIKTNKNKDVD 61
L +L++ N A P + + +L+AL+ V I + +
Sbjct: 81 TLTSLLVFNVLAYIAHLIDPAYFQLLILIAFITSLLALLFLVPVDNPRNLISNTEQRKTL 140

Query: 62 SNKKGKLTTVLLWILFVFMLIFSFSTFAIESYG 94
K + VL +++ G
Sbjct: 141 KLKTSMVLMVLFGGSIGAYRLYTHQIALAILLG 173


46  YpsIP31758_2955  YpsIP31758_2982  Y  N  N  Genomic Island
LocusTag         DNBias  CDNBias  %GCBias     Product
YpsIP31758_2955  0       16       -5.088595   lipoyl synthase
YpsIP31758_2956  2       19       -9.065082   twin arginine translocase protein A
YpsIP31758_2957  2       21       -9.247349   camphor resistance protein CrcB
YpsIP31758_2958  3       22       -10.505550  cold shock protein CspE
YpsIP31758_2959  2       22       -10.684383  cold-shock DNA-binding domain-containing
YpsIP31758_2960  1       15       -6.452359   LuxR family transcriptional regulator
YpsIP31758_2961  1       16       -5.143942   hypothetical protein
YpsIP31758_2962  1       19       -1.566188   hypothetical protein
YpsIP31758_2963  0       19       0.254234    hypothetical protein
YpsIP31758_2964  1       20       2.782537    antibiotic biosynthesis monooxygenase
YpsIP31758_2965  0       20       4.032911    heavy metal ABC transporter (HMT) family
YpsIP31758_2966  0       24       5.922527    hypothetical protein
YpsIP31758_2967  -1      24       6.229584    myo-inositol catabolism protein IolB
YpsIP31758_2968  0       24       6.180050    AP endonuclease
YpsIP31758_2969  0       25       6.228993    PfkB family kinase
YpsIP31758_2970  1       26       5.514078    Gfo/Idh/MocA family oxidoreductase
YpsIP31758_2971  0       24       5.885253    ribose ABC transporter permease
YpsIP31758_2972  1       24       6.431035    ribose ABC transporter ATP-binding protein
YpsIP31758_2973  0       25       6.676798    ribose ABC transporter periplasmic protein
YpsIP31758_2974  -2      18       3.878448    oxidoreductase, NAD-binding
YpsIP31758_2975  -3      16       3.595594    hypothetical protein
YpsIP31758_2976  -3      17       3.755648    thiamine pyrophosphate-dependent enzyme
YpsIP31758_2977  -2      13       1.816287    methylmalonate-semialdehyde dehydrogenase
YpsIP31758_2978  -1      11       0.260708    RpiR family transcriptional regulator
YpsIP31758_2979  0       11       -0.633067   alpha-2-macroglobulin domain-containing protein
YpsIP31758_2980  1       16       -2.461875   penicillin-binding protein 1C
YpsIP31758_2981  0       18       -4.029955   sugar fermentation stimulation protein B
YpsIP31758_2982  0       19       -4.253887   hypothetical protein
47  YpsIP31758_3013  YpsIP31758_3019  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias     Product
YpsIP31758_3013  2       14       1.510959    hypothetical protein
YpsIP31758_3014  0       12       3.227575    cysteinyl-tRNA synthetase
YpsIP31758_3015  -1      12       3.761590    peptidyl-prolyl cis-trans isomerase B
YpsIP31758_3016  0       11       4.171987    UDP-2,3-diacylglucosamine hydrolase
YpsIP31758_3017  -1      13       4.799399    phosphoribosylaminoimidazole carboxylase
YpsIP31758_3018  0       13       4.228603    phosphoribosylaminoimidazole carboxylase ATPase
YpsIP31758_3019  0       14       3.397819    efflux ABC transporter permease
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_3013  PF07299       25     0.024    Fibronectin-binding protein (FBP)
		>PF07299#Fibronectin-binding protein (FBP)

Length = 219

Score = 25.2 bits (55), Expect = 0.024
Identities = 12/30 (40%), Positives = 16/30 (53%), Gaps = 2/30 (6%)

Query: 8 KHPHVELCDLLKLQ--GWNDSGASAKAAIA 35
K P +E D+ +L W D G+S K IA
Sbjct: 110 KLPDMEELDMKELSYLSWIDKGSSRKFIIA 139


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_3014  RTXTOXIND     33     0.003    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 32.9 bits (75), Expect = 0.003
Identities = 18/153 (11%), Positives = 48/153 (31%), Gaps = 13/153 (8%)

Query: 299 RSQLNYSEENLKQARASLERLYTALRGTDANATPAGGAEFEARFRTAMDDDFNTPEAY-- 356
+ ++ +L QAR R R + N P E F+ +++ +
Sbjct: 133 EADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIK 192

Query: 357 SVLFDIAREVNRLK---NEDMAAANGLAAELRKLAQVLGLLEQDPELFLQGGAQ-ADDDE 412
+ + + ++ A + A + + + + + L +
Sbjct: 193 EQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSR----LDDFSSLLHKQA 248

Query: 413 VAKIEALIKQRNDARSSKNWALADAARDQLNEL 445
+AK L ++ + + + QL ++
Sbjct: 249 IAKHAVLEQEN---KYVEAVNELRVYKSQLEQI 278


48  YpsIP31758_3031  YpsIP31758_3071  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias     Product
YpsIP31758_3031  0       15       -3.347049   hypothetical protein
YpsIP31758_3032  1       17       -4.215910   bifunctional UDP-sugar hydrolase/5'-nucleotidase
YpsIP31758_3033  2       20       -5.892405   fosmidomycin resistance protein
YpsIP31758_3034  2       24       -6.353360   cation:proton antiport protein
YpsIP31758_3035  2       31       -8.639932   inosine-guanosine kinase
YpsIP31758_3036  2       37       -9.682290   ferric enterobactin ABC transporter
YpsIP31758_3037  2       37       -9.667562   phosphomannomutase
YpsIP31758_3038  2       40       -11.863027  glycosyl transferase family protein
YpsIP31758_3039  3       42       -12.799446  mannose-1-phosphate guanylyltransferase
YpsIP31758_3040  4       45       -15.415527  GDP-L-fucose synthetase
YpsIP31758_3041  7       48       -17.014470  GDP-mannose 4,6-dehydratase
YpsIP31758_3042  9       48       -17.894084  glycosyl transferase WbyK, group 1 family
YpsIP31758_3043  8       48       -18.081295  O-antigen biosynthesis protein Wxy
YpsIP31758_3044  7       43       -15.202186  mannosyltransferase WbyJ
YpsIP31758_3045  5       36       -12.229280  glycosyl transferase family protein
YpsIP31758_3046  3       31       -10.069446  O-unit flippase protein Wzx
YpsIP31758_3047  1       29       -8.660913   O-antigen synthesis protein WbyH
YpsIP31758_3048  1       26       -6.671915   CDP-paratose synthetase
YpsIP31758_3049  0       23       -5.311919   CDP-4-keto-6-deoxy-D-glucose-3-dehydrase
YpsIP31758_3050  -1      18       -4.273010   CDP-glucose 4,6-dehydratase
YpsIP31758_3051  1       19       -4.795372   glucose-1-phosphate cytidylyltransferase
YpsIP31758_3052  0       19       -3.879470   CDP-6-deoxy-delta-3,4-glucoseen reductase
YpsIP31758_3053  1       16       -2.172421   ferrochelatase
YpsIP31758_3054  1       12       -0.110455   adenylate kinase
YpsIP31758_3055  1       11       0.224236    heat shock protein 90
YpsIP31758_3056  1       12       1.465861    hypothetical protein
YpsIP31758_3057  2       12       1.061975    recombination protein RecR
YpsIP31758_3058  3       13       0.512402    hypothetical protein
YpsIP31758_3059  1       12       -0.593191   DNA polymerase III subunits gamma and tau
YpsIP31758_3060  0       14       -2.078347   adenine phosphoribosyltransferase
YpsIP31758_3061  2       16       -2.752754   hypothetical protein
YpsIP31758_3062  1       14       -2.240616   primosomal replication protein n''
YpsIP31758_3063  -1      12       -1.078091   hypothetical protein
YpsIP31758_3064  -1      12       -1.034677   potassium efflux protein KefA
YpsIP31758_3065  -1      16       -0.454222   DsrE family protein
YpsIP31758_3066  0       16       -1.426951   DNA-binding transcriptional repressor AcrR
YpsIP31758_3067  0       17       -1.364397   acriflavine resistance protein A
YpsIP31758_3068  0       19       -2.284781   acriflavine resistance protein B
YpsIP31758_3069  0       25       -6.337566   50S ribosomal protein L31
YpsIP31758_3070  0       27       -4.103671   50S ribosomal protein L36
YpsIP31758_3071  -2      20       -3.040327   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_3033  TCRTETA       38     5e-05    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 37.9 bits (88), Expect = 5e-05
Identities = 41/200 (20%), Positives = 75/200 (37%), Gaps = 5/200 (2%)

Query: 11 QPVNVSVKRTSFSILGAISVSHLLNDMIQSLILAIYPLLQAE-FSLSFAQIGLITLSYQL 69
P+ +++ A+ + ++ + A++ + + F IG+ ++ +
Sbjct: 198 NPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGISLAAFGI 257

Query: 70 TASLLQPLI-GLYTDKHPQPYSLPIGMGFTLSGILLLAVATTFPVVLLAAALVGTGSSVF 128
SL Q +I G + + +L +GM +G +LLA AT + L+ +G
Sbjct: 258 LHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLASGGIGM 317

Query: 129 HPESSRVARMASGGRHGLAQSVFQVGGNFGSALGPLLAAIIIA---PYGKGNVGWFSLAA 185
+ ++R R G Q + S +GPLL I A G A
Sbjct: 318 PALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYAASITTWNGWAWIAGAAL 377

Query: 186 LLAIVVLLQVSKWYKLQQRA 205
L + L+ W QRA
Sbjct: 378 YLLCLPALRRGLWSGAGQRA 397



Score = 30.2 bits (68), Expect = 0.014
Identities = 26/118 (22%), Positives = 41/118 (34%)

Query: 281 IGGPLGDKIGRKYVIWGSILGVAPFTLALPYASLYWTGILTVFIGVILASAFSAILVYAQ 340
+ G L D+ GR+ V+ S+ G A + A W + + I + + Y
Sbjct: 62 VLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIA 121

Query: 341 ELIPGKVGMVSGLFFGFAFGMGGIGAAVLGYVADLTSIELVYQICAFLPLLGIFTALL 398
++ G F FG G + VLG + S + A L L T
Sbjct: 122 DITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCF 179


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_3040  NUCEPIMERASE  83     4e-20    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 82.5 bits (204), Expect = 4e-20
Identities = 59/352 (16%), Positives = 122/352 (34%), Gaps = 59/352 (16%)

Query: 5 RVFIAGHRGMVGSAIVRQLENRND--------------------IELIIRDR---TELDL 41
+ + G G +G + ++L +EL+ + ++DL
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 42 MSQSAVQKFFATEKIDEIYLAAAKVGGIQANNNYPAEFIYQNLMIECNIIHAAHLAGIQK 101
+ + FA+ + ++++ ++ + N P + NL NI+ IQ
Sbjct: 62 ADREGMTDLFASGHFERVFISPHRLAVRYSLEN-PHAYADSNLTGFLNILEGCRHNKIQH 120

Query: 102 LLFLGSSCIYPKLAAQPMTEEALLTGVLEPTNEP---YAIAKIAGIKLCESYNRQYGRDY 158
LL+ SS +Y P + + + + P YA K A + +Y+ YG
Sbjct: 121 LLYASSSSVYGLNRKMPFSTD-------DSVDHPVSLYAATKKANELMAHTYSHLYGLPA 173

Query: 159 RSVMPTNLYGENDNFHPENSHVIPALLRRFHEAKIRNDKEMVVWGTGKPMREFLHVDDMA 218
+ +YG P + + K + V+ GK R+F ++DD+A
Sbjct: 174 TGLRFFTVYGPWGR---------PDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIA 224

Query: 219 AASVHVMELSDQIYQTNTQPMLSH------------INVGTGVDCTIRELAETMAKVVGF 266
A ++ L D I +TQ + N+G + + + + +G
Sbjct: 225 EA---IIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGI 281

Query: 267 TGNLVFDSTKPDGTPRKLMDVSRLAK-LGWCYQISLEVGLTMTYQWFLAHQN 317
+P D L + +G+ + +++ G+ W+
Sbjct: 282 EAKKNMLPLQPGDVLETSADTKALYEVIGFTPETTVKDGVKNFVNWYRDFYK 333


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_3041  NUCEPIMERASE  103    5e-27    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 103 bits (258), Expect = 5e-27
Identities = 79/364 (21%), Positives = 128/364 (35%), Gaps = 64/364 (17%)

Query: 6 LITGITGQDGSYLAEFLLEKGYEVHGIKRRASSFNTSRIDHIYQDRHET--NPRFFLHYG 63
L+TG G G ++++ LLE G++V GI + N + Q R E P F H
Sbjct: 4 LVTGAAGFIGFHVSKRLLEAGHQVVGI----DNLNDYYDVSLKQARLELLAQPGFQFHKI 59

Query: 64 DLTDTSNLIRLVQEIQPDEIYNLGAQSHVAVSFESPEYTADVDAMGTLRLLEAIRINGLE 123
DL D + L + ++ + V S E+P AD + G L +LE R N ++
Sbjct: 60 DLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQ 119

Query: 124 KKTRFYQASTSELYGLVQETPQRETTPF-YPRSPYAVAKMYAYWITVNYRESYGMYACNG 182
AS+S +YGL ++ P +P S YA K + Y YG+ A
Sbjct: 120 ---HLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYGLPATGL 176

Query: 183 ILFNHESPRRGETFVTRKITRAVANIALGLEKCLYLGNIDSLRDWGHAKDYV----RMQW 238
F P K T+A+ G +Y RD+ + D R+Q
Sbjct: 177 RFFTVYGPWGRPDMALFKFTKAMLE---GKSIDVY-NYGKMKRDFTYIDDIAEAIIRLQD 232

Query: 239 MMLQQDKPED---------------FVIATGKQITVREFVRMSAREAGIELEFSGEGVEE 283
++ D + I + + ++++ GIE
Sbjct: 233 VIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIE---------- 282

Query: 284 VATVVAINGNHISSVNIGDVIVRVDPRYFRPAEVETLLGDPTKAKKVLGWVPEITVEEMC 343
N + +P +V D +V+G+ PE TV++
Sbjct: 283 ------AKKNMLP---------------LQPGDVLETSADTKALYEVIGFTPETTVKDGV 321

Query: 344 AEMV 347
V
Sbjct: 322 KNFV 325


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_3048  NUCEPIMERASE  61     8e-13    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 61.0 bits (148), Expect = 8e-13
Identities = 58/321 (18%), Positives = 114/321 (35%), Gaps = 70/321 (21%)

Query: 1 MKILITGVSGYLGSQLANALMLE-HEVAGTVRAGSVCNRITDIGNVNL------------ 47
MK L+TG +G++G ++ L+ H+V G + + D +V+L
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGI-------DNLNDYYDVSLKQARLELLAQPG 53

Query: 48 -----INVTDSGWIDKVL-SFSPDVVINTVALYGRKGELLS--ELVDANIQFPLRILE-- 97
I++ D + + S + V + + L + D+N+ L ILE
Sbjct: 54 FQFHKIDLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGC 113

Query: 98 --------MLVST----GKGLFFQCGTSLPAD--VSQYALTKNQFTELAREYCNKFSGKF 143
+ S+ G T D VS YA TK +A Y + +
Sbjct: 114 RHNKIQHLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYGLPA 173

Query: 144 IELKLEHFFGPFDDST----KFTTYVINSCRSHSDLKL-TAGLQRRDFIYINDLINA--- 195
L+ +GP+ KFT + + + G +RDF YI+D+ A
Sbjct: 174 TGLRFFTVYGPWGRPDMALFKFT----KAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIR 229

Query: 196 ------------FKIMISKSESLISGESISIGSGHAVTIKEFVETVAKMTSYQGNLQFGA 243
+ + S+ +IG+ V + ++++ + +
Sbjct: 230 LQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAKKNM-- 287

Query: 244 IPTRENELMYSCASLARIQEL 264
+P + +++ + A + E+
Sbjct: 288 LPLQPGDVLETSADTKALYEV 308


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_3050  NUCEPIMERASE  73     2e-16    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 72.9 bits (179), Expect = 2e-16
Identities = 64/352 (18%), Positives = 118/352 (33%), Gaps = 48/352 (13%)

Query: 45 RVFVTGHTGFKGGWLSLWLQTMGATVKGYSLTAPTVPSLFETARVA----DGMQSEIGDI 100
+ VTG GF G +S L G V G + AR+ G Q D+
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 101 RDQNKLLESIREFQPEIVFHMAAQPLVRLSYSEPVETYSTNVMGTVYLLEAIRHVGGVKA 160
D+ + + E VF + VR S P +N+ G + +LE RH ++
Sbjct: 62 ADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHN-KIQH 120

Query: 161 VVNITSDKCYDNKEWIWGYRENEAMGGYDPYSNSKGCAELVTSSYRNSFFNPAN------ 214
++ +S Y + ++ Y+ +K EL+ +Y + + PA
Sbjct: 121 LLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYGLPATGLRFFT 180

Query: 215 -YGQHG----------TAVATVRAGNVIGGGDWA-----LDRIVPDILRAFEQSQPVIIR 258
YG G A+ ++ +V G +D I I+R +
Sbjct: 181 VYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIRLQDVI------ 234

Query: 259 NPHAIRPWQHVLEPLSGYLLLAQKLYTDGAEYAEGWNFGPNDADATPVKNIVEQMVKYWG 318
PHA W + T A A + ++ + + ++ + G
Sbjct: 235 -PHADTQW-------------TVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALG 280

Query: 319 EGASWQLDGNAHPHEAHYLKLDCSKAKMQLGWHPRWNLNTTLEYIVGWHKNW 370
A + P + D +G+ P + ++ V W++++
Sbjct: 281 IEAKKNML-PLQPGDVLETSADTKALYEVIGFTPETTVKDGVKNFVNWYRDF 331


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_3064  GPOSANCHOR    40     4e-05    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 40.4 bits (94), Expect = 4e-05
Identities = 28/235 (11%), Positives = 58/235 (24%), Gaps = 21/235 (8%)

Query: 35 SEVQSQLDLLSKQKILSPAEKLAQQDLTQTLE-YLDTIERTKQEANQLKQQLAQAPAKLR 93
S + +L K ++ + LE L+ + + L A L
Sbjct: 95 SNAKEKLRKNDKSLSEKASKIQELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALA 154

Query: 94 QATEGLE-ALKSSSADTMTKESLANYSLRQLESRLNETLDNLQSAQEDLSAYNSQLIALQ 152
LE AL+ + + +++ + + + L
Sbjct: 155 ARKADLEKALEGAMNFS-----------TADSAKIKTLEAEKAALEARQAELEKALEGAM 203

Query: 153 TQPERVQSAMYSASMRLMQIRNQLNGLTPNQESLRPTQQ--QELLAEQVMLNGQLDLERK 210
+ + + + + L E + L+ +
Sbjct: 204 NFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQA 263

Query: 211 NLEANTTLQDLLQKQRDYTTAHINQLERYVQLLQEVVSGKRLILSEKTVKEAQAQ 265
LE + +A I LE L+ K + + V A Q
Sbjct: 264 ELEK---ALEGAMNFSTADSAKIKTLEAEKAALEAE---KADLEHQSQVLNANRQ 312



Score = 32.0 bits (72), Expect = 0.016
Identities = 36/201 (17%), Positives = 72/201 (35%), Gaps = 33/201 (16%)

Query: 37 VQSQLDLLSKQKILSPAEKLAQQDLTQTLEYLDTIERTKQEANQLKQQ------------ 84
+ L ++Q L A + A T + T+E K K
Sbjct: 252 EAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANR 311

Query: 85 --LAQAPAKLRQATEGLEA-----------LKSSSADTMTKESLANYSLRQLESRLNETL 131
L + R+A + LEA ++S + + +QLE+ +
Sbjct: 312 QSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQLEAEHQKLE 371

Query: 132 DNLQSAQEDLSAYNSQLIALQTQPERVQSAMYSASMRLMQIRNQLNGLTPNQESLRPTQQ 191
+ + ++ + L A + ++V+ A+ A+ +L + L +ES + T++
Sbjct: 372 EQNKISEASRQSLRRDLDASREAKKQVEKALEEANSKLAALEKLNKEL---EESKKLTEK 428

Query: 192 QELLAEQVMLNGQLDLERKNL 212
E+ L +L+ E K L
Sbjct: 429 -----EKAELQAKLEAEAKAL 444


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_3065  ADHESNFAMILY  26     0.034    Adhesin family signature.
		>ADHESNFAMILY#Adhesin family signature.

Length = 309

Score = 26.0 bits (57), Expect = 0.034
Identities = 9/71 (12%), Positives = 27/71 (38%)

Query: 47 IAGLNGQQPREGYNLQQMLEILTAQNVPIKLCKTCADARGIAGLTLVDGVEIGTLVELAQ 106
I +N ++ ++ ++E L VP ++ D R + ++ + I +
Sbjct: 222 IWEINTEEEGTPEQIKTLVEKLRQTKVPSLFVESSVDDRPMKTVSQDTNIPIYAQIFTDS 281

Query: 107 WTLAAEKVLTF 117
++ ++
Sbjct: 282 IAEQGKEGDSY 292


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_3066  HTHTETR       165    7e-54    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 165 bits (420), Expect = 7e-54
Identities = 135/210 (64%), Positives = 164/210 (78%)

Query: 1 MARKTKQKAEETRQQILDAAVREFSAHGVSRTSLTDIAIAAGVTRGAIYWHFKNKVDLFN 60
MARKTKQ+A+ETRQ ILD A+R FS GVS TSL +IA AAGVTRGAIYWHFK+K DLF+
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60

Query: 61 EVWELSESKIDQLEIEYQAKYPDNPLRILRELLIYILVSTREDRRRRALMEIVFHKCEFV 120
E+WELSES I +LE+EYQAK+P +PL +LRE+LI++L ST + RRR LMEI+FHKCEFV
Sbjct: 61 EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV 120

Query: 121 GEMTSVHDARKVLDLASYERIESVLQGCIDANQLPVNLNTHRAAIIMRAYITGLMENWLF 180
GEM V A++ L L SY+RIE L+ CI+A LP +L T RAAIIMR YI+GLMENWLF
Sbjct: 121 GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF 180

Query: 181 MPESFDIKQEAPVLIDAYLEMLGQSFSLRN 210
P+SFD+K+EA + LEM +LRN
Sbjct: 181 APQSFDLKKEARDYVAILLEMYLLCPTLRN 210


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_3067  RTXTOXIND     40     1e-05    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 39.8 bits (93), Expect = 1e-05
Identities = 22/166 (13%), Positives = 52/166 (31%), Gaps = 45/166 (27%)

Query: 96 QIDPATYQAAYDSAKGDLAKAQASAQIAHLTVNRYKPLLGTNYISKQ---EYDQALSDAQ 152
+++ +A + + + + +++ ++ + LL I+K E + +A
Sbjct: 206 ELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAV 265

Query: 153 QADATVLAAKAALES----------------------------------------ARINL 172
+ +ES
Sbjct: 266 NELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQ 325

Query: 173 AYTQVRSPISGRTGKSAV-TEGALVTSGQASAMTTVQQLDPMYVDV 217
+ +R+P+S + + V TEG +VT+ + M V + D + V
Sbjct: 326 QASVIRAPVSVKVQQLKVHTEGGVVTTAET-LMVIVPEDDTLEVTA 370


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_3068  ACRIFLAVINRP  1344   0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 1344 bits (3479), Expect = 0.0
Identities = 807/1032 (78%), Positives = 919/1032 (89%)

Query: 1 MAKFFIDRPIFAWVIAIIIMLAGALAIMKLPVAQYPTIAPPAITIAANYPGADATTVQNT 60
MA FFI RPIFAWV+AII+M+AGALAI++LPVAQYPTIAPPA++++ANYPGADA TVQ+T
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VTQVIEQNMNGIDNLLYMSSSSDSSGNVQLTLTFNSGTDPDIAQVQVQNKLQLAMPLLPQ 120
VTQVIEQNMNGIDNL+YMSS+SDS+G+V +TLTF SGTDPDIAQVQVQNKLQLA PLLPQ
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 121 EVQQQGVSVEKSSSSFLMVAGFISEDGTMQQEDIADYVGSNIKDPISRTPGVGDVQLFGS 180
EVQQQG+SVEKSSSS+LMVAGF+S++ Q+DI+DYV SN+KD +SR GVGDVQLFG+
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 181 QYAMRIWMDPHKLNNYKLTPVDVINAIKIQNNQVAAGQLGGTPPVPGQELNSSIIAQTRL 240
QYAMRIW+D LN YKLTPVDVIN +K+QN+Q+AAGQLGGTP +PGQ+LN+SIIAQTR
Sbjct: 181 QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRF 240

Query: 241 TNAEEFSQILLKVNTDGSQVRLKDVAIVKLGAESYNIIARYNGKPAAGIGIKLATGANAL 300
N EEF ++ L+VN+DGS VRLKDVA V+LG E+YN+IAR NGKPAAG+GIKLATGANAL
Sbjct: 241 KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL 300

Query: 301 NTSAAVKAELAKLQPFFPSGLTVVYPYDTTPFVKISINEVVKTLIEAIILVFLVMYLFLQ 360
+T+ A+KA+LA+LQPFFP G+ V+YPYDTTPFV++SI+EVVKTL EAI+LVFLVMYLFLQ
Sbjct: 301 DTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQ 360

Query: 361 NFRATLIPTIAVPVVLLGTFAILSAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420
N RATLIPTIAVPVVLLGTFAIL+AFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM
Sbjct: 361 NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420

Query: 421 QEEGLPPKEATKKSMEQIQGALVGIALVLSAVFVPMAFFGGATGAIYRQFSITIVSAMVL 480
E+ LPPKEAT+KSM QIQGALVGIA+VLSAVF+PMAFFGG+TGAIYRQFSITIVSAM L
Sbjct: 421 MEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL 480

Query: 481 SVLVALILTPALCATMLKPIKKGDHGPKTGFFGWFNNMFEKSTHHYTDSVANILRSTGRY 540
SVLVALILTPALCAT+LKP+ H K GFFGWFN F+ S +HYT+SV IL STGRY
Sbjct: 481 SVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRY 540

Query: 541 LVIYLAIVIGMAVLFMRLPSSFLPEEDQGVFLTMVQLPAGATQERTQKVLNQVTDYYLDK 600
L+IY IV GM VLF+RLPSSFLPEEDQGVFLTM+QLPAGATQERTQKVL+QVTDYYL
Sbjct: 541 LLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKN 600

Query: 601 EKNVVNSVFTVNGFGFSGQGQNTGLAFVSLKNWDERKGEQNKVPAIVSRASAAFSKIKDG 660
EK V SVFTVNGF FSGQ QN G+AFVSLK W+ER G++N A++ RA KI+DG
Sbjct: 601 EKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDG 660

Query: 661 MVFAFNLPAIVELGTATGFDFQLIDQGNLGHQQLTDARNQLLGMAAQHPDMLVGVRPNGL 720
V FN+PAIVELGTATGFDF+LIDQ LGH LT ARNQLLGMAAQHP LV VRPNGL
Sbjct: 661 FVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNGL 720

Query: 721 EDTPQFKVEVDQEKAQALGVAISDINTTLGSAMGGSYVNDFIDRGRVKKVYVQADAPFRM 780
EDT QFK+EVDQEKAQALGV++SDIN T+ +A+GG+YVNDFIDRGRVKK+YVQADA FRM
Sbjct: 721 EDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFRM 780

Query: 781 LPDDIDKWYVRNNMGQMVSFATFSTAKWEYGSPRLERYNGLPSMEILGQAAPGKSTGEAM 840
LP+D+DK YVR+ G+MV F+ F+T+ W YGSPRLERYNGLPSMEI G+AAPG S+G+AM
Sbjct: 781 LPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDAM 840

Query: 841 DLMQELAAKLPSGVGYDWTGMSYQERLSGNQAPALYAISLIVVFLCLAALYESWSIPFSV 900
LM+ LA+KLP+G+GYDWTGMSYQERLSGNQAPAL AIS +VVFLCLAALYESWSIP SV
Sbjct: 841 ALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVSV 900

Query: 901 MLVVPLGVVGALLAATLRGLENDVYFQVGLLTTIGLSAKNAILIVEFAKDLMDKEGKGLV 960
MLVVPLG+VG LLAATL +NDVYF VGLLTTIGLSAKNAILIVEFAKDLM+KEGKG+V
Sbjct: 901 MLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGVV 960

Query: 961 ESTLESVRMRLRPILMTSLAFILGVMPLVISSGAGSGAQNAVGTGVMGGMITATVLAIFF 1020
E+TL +VRMRLRPILMTSLAFILGV+PL IS+GAGSGAQNAVG GVMGGM++AT+LAIFF
Sbjct: 961 EATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIFF 1020

Query: 1021 VPLFFVVVRRRF 1032
VP+FFVV+RR F
Sbjct: 1021 VPVFFVVIRRCF 1032


49  YpsIP31758_3134  YpsIP31758_3144  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias     Product
YpsIP31758_3134  5       14       -0.552679   phosphate-binding protein
YpsIP31758_3135  5       15       -0.393717   phosphate regulon sensor protein
YpsIP31758_3136  5       13       -0.815974   PhoB family transcriptional regulator
YpsIP31758_3137  6       13       -1.082680   hypothetical protein
YpsIP31758_3138  6       14       -0.782171   exonuclease subunit SbcD
YpsIP31758_3139  7       15       -1.365058   nuclease SbcCD, C subunit
YpsIP31758_3140  -1      21       -2.847300   fructokinase
YpsIP31758_3141  1       23       -4.273920   recombination associated protein
YpsIP31758_3142  0       26       -5.095414   hypothetical protein
YpsIP31758_3143  0       23       -5.232762   shikimate kinase
YpsIP31758_3144  -1      18       -3.501254   methyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_3136  HTHFIS        90     4e-23    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 90.3 bits (224), Expect = 4e-23
Identities = 31/119 (26%), Positives = 58/119 (48%), Gaps = 2/119 (1%)

Query: 4 RILVVEDEAPIREMVCFVLEQNGYQPLEAEDYDSAVARLSEPFPDLVLLDWMLPGGSGIQ 63
ILV +D+A IR ++ L + GY + + ++ DLV+ D ++P +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 64 FIKHMKREALTRDIPVMMLTARGEEEDRVRGLEVGADDYITKPFSPKELVARIKAVMRR 122
+ +K+ D+PV++++A+ ++ E GA DY+ KPF EL+ I +
Sbjct: 65 LLPRIKKARP--DLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_3139  RTXTOXIND     42     2e-05    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 41.7 bits (98), Expect = 2e-05
Identities = 32/222 (14%), Positives = 74/222 (33%), Gaps = 8/222 (3%)

Query: 321 QYLAQLTPLT--QAVEQATAARQQQQLNQHEQETLIEQRIVPLDNLITQQQQTLSQLAGQ 378
L +LT L + ++ Q +L Q + L + + + Q +
Sbjct: 122 DVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSE 181

Query: 379 IQQLRAKEQQNSQQLALNEQKLLQTHQRLQQLADYANLHAHHQHWEKHLPLWHEQFRQLQ 438
+ LR Q QK + ++ A+ + A +E + +
Sbjct: 182 EEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFS 241

Query: 439 LQQQQSAQSEQQLHQQTTLLATLQQQATTLSAQEKQQQVALAEARAQASYLQQKL--LVL 496
+ A ++ + +Q + +Q +Q + + A+ + + Q +L
Sbjct: 242 SLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEIL 301

Query: 497 EQ----QQPSAQLRQQLNEFNEQRQICQQLAALSPLAQQIQA 534
++ L +L + E++Q A +S QQ++
Sbjct: 302 DKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKV 343



Score = 39.4 bits (92), Expect = 8e-05
Identities = 30/236 (12%), Positives = 65/236 (27%), Gaps = 42/236 (17%)

Query: 458 LATLQQQATTLSAQEKQQQVALAEARAQASYLQQKLLVLEQQ----QPSAQLRQQLNEFN 513
L L +A TL Q Q L + R Q +L L + +P Q +
Sbjct: 127 LTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLR 186

Query: 514 EQRQICQQLAALSPLAQQIQALYDKQQQQFTAQQQQLKQLEQQ---LTEKRQLYQQ-QKQ 569
I +Q + Q + DK++ + ++ + E + + +
Sbjct: 187 LTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHK 246

Query: 570 HLVDLEALLEREKQLVTLEAERAKLQPGDACPLCGAVEHPAIAAYQAVKPSETAVRVAKL 629
+ A+LE+E + V E ++ ++
Sbjct: 247 QAIAKHAVLEQENKYVEAVNELR----------------------------VYKSQLEQI 278

Query: 630 RLQVEQLDTEGTELRTQVASMQQHQQRIEQELQDHRQQLAAYQQRWQTLAQPLSLA 685
++ E + Q + I +L+ + + +
Sbjct: 279 ESEILSAKEEYQLVT------QLFKNEILDKLRQTTDNIGLLTLELAKNEERQQAS 328



Score = 37.9 bits (88), Expect = 3e-04
Identities = 28/206 (13%), Positives = 71/206 (34%), Gaps = 19/206 (9%)

Query: 658 EQELQDHRQQLAAYQQRWQTLAQPLSL----AFTLNEPDALALWLEQHEQQEQACQLKLV 713
+ Q Q Q R+Q L++ + L L + E+ + +
Sbjct: 136 TLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSL----- 190

Query: 714 EYERLTQQYQQAKDILTQLEQRQQEHQQQLALITERQKNAQQTYQQLQSQYQHQQEALIA 773
+ +Q+ ++ Q E + + + + R + + +S+ +L+
Sbjct: 191 ----IKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFS-SLLH 245

Query: 774 QQQVLNHTLTELSLSVPDADQQQNWLAQREEECQRWQQHQQEQQRLTIEQKTLETRIENE 833
+Q + H + E +A + + E+ + +E+ +L + E +
Sbjct: 246 KQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDK-- 303

Query: 834 RRHLQECIDQLSALSQQRQQAETLLQ 859
L++ D + L+ + + E Q
Sbjct: 304 ---LRQTTDNIGLLTLELAKNEERQQ 326



Score = 32.9 bits (75), Expect = 0.009
Identities = 26/180 (14%), Positives = 71/180 (39%), Gaps = 13/180 (7%)

Query: 844 LSALSQQRQQAETLLQQQIQQRRALFGEDIVAE-------VRQRLRLQQQQAELAQQNAE 896
+ L Q R Q + + + ++ + +R +++Q + Q +
Sbjct: 145 QARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQ 204

Query: 897 K--ALQQVQSQLNRLSGELTGLEQQCQQYQQRATTTQAEL-QQALSTSEFADETALTAAL 953
K L + +++ + + E + + R + L +QA++ ++
Sbjct: 205 KELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEA 264

Query: 954 LSE--EERQHLQQLQQQLNERRQQAQIRLQQAR-EILDQHLQLCPQGVDKSSELTLLQQQ 1010
++E + L+Q++ ++ +++ Q+ Q + EILD+ Q + EL +++
Sbjct: 265 VNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEER 324


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_3140  BCTERIALGSPF  28  0.045  Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F

signature.
Length = 408

Score = 28.3 bits (63), Expect = 0.045
Identities = 11/37 (29%), Positives = 21/37 (56%)

Query: 218 DVIAEQAMNNYERRFAKSLAHVINLFDPDVVVLGGGM 254
D + E+A +N +R F+ + + LF+P +VV +
Sbjct: 351 DSMLERAADNQDREFSSQMTLALGLFEPLLVVSMAAV 387


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_3143  PF05272  28  0.014  Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 28.5 bits (63), Expect = 0.014
Identities = 16/68 (23%), Positives = 27/68 (39%), Gaps = 12/68 (17%)

Query: 7 MVGARGAGKTTIGKALAQALGYRFVDTDL-------FMQQTSQMTVAEVVESEGWDGFRL 59
+ G G GK+T+ L F DT +Q + + E+ E FR
Sbjct: 601 LEGTGGIGKSTLINTLVGL--DFFSDTHFDIGTGKDSYEQIAGIVAYELSE---MTAFRR 655

Query: 60 RESMALQA 67
++ A++A
Sbjct: 656 ADAEAVKA 663


50  YpsIP31758_3168  YpsIP31758_3201  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias    %GCBias  Product
YpsIP31758_3168       2       12  -1.407005  class II glutamine amidotransferases
YpsIP31758_3170       1       13  -1.794232  phosphoheptose isomerase
YpsIP31758_3171       2       14  -1.889090  acyl-CoA dehydrogenase
YpsIP31758_3172       2       19  -3.166086  hypothetical protein
YpsIP31758_3173       2       18  -2.987431  hypothetical protein
YpsIP31758_3174       2       16  -1.855170  hemagglutination domain-containing protein
YpsIP31758_3175       2       17  -0.069455  lipoprotein
YpsIP31758_3176       0       17   1.392497  hypothetical protein
YpsIP31758_3177       0       16   2.337541  hypothetical protein
YpsIP31758_3178       0       13   3.212886  methylthioribose kinase
YpsIP31758_3179       0       14   3.758474  eIF-2B alpha/beta/delta family protein
YpsIP31758_3180      -1       15   4.076844  ARD/ARD' family dioxygenase
YpsIP31758_3181       0       16   4.247960  2,3-diketo-5-methylthio-1-phosphopentane
YpsIP31758_3182      -1       19   4.728342  methylthioribulose-1-phosphate dehydratase
YpsIP31758_3183       0       20   4.661852  aminotransferase
YpsIP31758_3184       0       23   4.880964  hypothetical protein
YpsIP31758_3185       0       24   4.541598  allantoate amidohydrolase
YpsIP31758_3186       0       23   3.355791  class V aminotransferase
YpsIP31758_3187      -1       23   2.983583  amino acid ABC transporter ATP-binding protein
YpsIP31758_3188       0       22   2.812815  amino acid ABC transporter permease
YpsIP31758_3189       0       18   3.822303  amino acid ABC transporter permease
YpsIP31758_3190       0       17   3.084517  amino acid ABC transporter periplasmic protein
YpsIP31758_3191       1       17   3.189110  RpiR family transcriptional regulator
YpsIP31758_3192       0       17   3.811395  hypothetical protein
YpsIP31758_3193       0       17   3.869873  hypothetical protein
YpsIP31758_3194       0       17   3.234910  amidase
YpsIP31758_3195      -1       16   1.147057  hypothetical protein
YpsIP31758_3196      -1       19   1.216593  hypothetical protein
YpsIP31758_3197      -2       19   2.504736  major facilitator transporter
YpsIP31758_3198      -3       16   2.525656  azaleucine resistance protein AzlC
YpsIP31758_3199      -3       17   2.399233  hypothetical protein
YpsIP31758_3200      -2       16   3.116025  transcriptional repressor MprA
YpsIP31758_3201      -2       14   3.611460  multidrug resistance protein A
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_3172  FIMREGULATRY  58  3e-13  Escherichia coli: P pili regulatory PapB protein si...
		>FIMREGULATRY#Escherichia coli: P pili regulatory PapB protein

signature.
Length = 104

Score = 58.0 bits (140), Expect = 3e-13
Identities = 25/83 (30%), Positives = 45/83 (54%)

Query: 147 QGRISPGEVDEVQLTLLMDIAKVTKISLRAALHRHLVEGATEEWVCSVYKMNQEDFWQNM 206
+ + PG + E+ LL+ I+ + + A+ +LV G + + VC Y+MN F +
Sbjct: 20 ESVLLPGSMSEMHFFLLIGISSIHSDRVILAMKDYLVGGHSRKEVCEKYQMNNGYFSTTL 79

Query: 207 RKLHRLNERVVQLLPFYTRQTSS 229
+L RLN +L P+YT ++S+
Sbjct: 80 GRLIRLNALAARLAPYYTDESSA 102


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_3174  PF05860  68  1e-15  haemagglutination activity domain.
		>PF05860#haemagglutination activity domain.

Length = 117

Score = 68.3 bits (167), Expect = 1e-15
Identities = 27/115 (23%), Positives = 51/115 (44%), Gaps = 6/115 (5%)

Query: 67 VLAHPVLPVNGHVVIGQGMLDQQSNTLTVTQQTDKLAINWASFDIAHGHSVIYAQPGSQS 126
+ LP+N ++ TQ L ++ F + + + P +
Sbjct: 3 ITPDTTLPINSNITTEGN----TRIIERGTQAGSNLFHSFQEFSVPTSGTAFFNNPTNIQ 58

Query: 127 IALNQVQGQSASQIYGRLQANG--QVFLLNPRGILFGKEAQVNVGGLVASTKYMS 179
+++V G S S I G ++AN +FL+NP GI+FG+ A++++GG +
Sbjct: 59 NIISRVTGGSVSNIDGLIRANATANLFLINPNGIIFGQNARLDIGGSFVGSTANR 113


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_3187  PF05272  30  0.007  Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.4 bits (68), Expect = 0.007
Identities = 14/42 (33%), Positives = 22/42 (52%), Gaps = 4/42 (9%)

Query: 31 VISIIGRSGSGKSTLLRCMNGLEDYQDGSIKLGGMTVTNRDS 72
+ + G G GKSTL+ + GL+ + D +G T +DS
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGLDFFSDTHFDIG----TGKDS 635


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_3197  TCRTETB  46  1e-07  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 46.0 bits (109), Expect = 1e-07
Identities = 34/163 (20%), Positives = 65/163 (39%), Gaps = 5/163 (3%)

Query: 35 LETIATNFSLSVNQAGFIVTAAQLGYAVGLMFLVPLGDMFE-RRGLIVGMTLLAAGGMLI 93
L IA +F+ ++ TA L +++G L D +R L+ G+ + G +I
Sbjct: 37 LPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGS-VI 95

Query: 94 TAMSQNLTMMIIGTALTGLFSVVA--QLLVPLAATLAAPEKRGKVVGIIMSGLLLGILLA 151
+ + ++I A L++ + A E RGK G+I S + +G +
Sbjct: 96 GFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVG 155

Query: 152 RTVAGALATLGGWRTIYWVASALMFIMALVLWRCLPRYKQHTG 194
+ G +A W + + + I L + L + + G
Sbjct: 156 PAIGGMIAHYIHWSYLLLIPMITI-ITVPFLMKLLKKEVRIKG 197


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_3200  PF05272  28  0.017  Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 28.5 bits (63), Expect = 0.017
Identities = 23/105 (21%), Positives = 37/105 (35%), Gaps = 12/105 (11%)

Query: 12 LNSRAKRQKDFPYQEILLTRLSMHMHSKLLENRNKMLKAQGINETLFMALITLDAQESRS 71
+ + P QE+ L + L R A+G + + T
Sbjct: 745 PSPEDEEIYFRPEQELRLVETGVQGRLWALLTREGAPAAEGAAQKGYSVNTTF------- 797

Query: 72 IQPSELSAALG-----SSRTNATRIADELEKKGWIERRESHNDRR 111
+ ++L ALG SS ++ D L + GW RE+ RR
Sbjct: 798 VTIADLVQALGADPGKSSPMLEGQVRDWLNENGWEYLRETSGQRR 842


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_3201  RTXTOXIND  68  1e-14  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 68.3 bits (167), Expect = 1e-14
Identities = 63/410 (15%), Positives = 118/410 (28%), Gaps = 99/410 (24%)

Query: 25 LLLTAIFIMIGVAYLIYWFLVLRHHQ---ETDNAYISGNQVQIMSQVPGSVVSVHFENTD 81
L A FIM + ++ + SG +I V + + +
Sbjct: 57 PRLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGE 116

Query: 82 FVKSGDVLVTLDPTD-------AEQAFEQAK----------------------------- 105
V+ GDVL+ L + + QA+
Sbjct: 117 SVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYF 176

Query: 106 ----------------TALANSVRQTHQLIINSKQYQ-------ANIALKKTELSQAQND 142
+ Q +Q +N + + A I + ++
Sbjct: 177 QNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSR 236

Query: 143 LKRRVVLGAAAVIGREELQHARDAVEAAQASLDMAVQQYNANQALVLNTPLE-------- 194
L L I + + + A L + Q ++ +L+ E
Sbjct: 237 LDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLF 296

Query: 195 -KQPAIEQAAAKMRDAWLT---------LQRTKVVSPISGYVSRRSVQ-VGAEISSGTPL 243
+ + LT Q + + +P+S V + V G +++ L
Sbjct: 297 KNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETL 356

Query: 244 MAVVPADQ-LWIDANFKETQLANMRIGQPATI-VTDF----YGDDVVYQGKVVGLDMGTG 297
M +VP D L + A + + + +GQ A I V F YG GKV +
Sbjct: 357 MVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGY---LVGKVKNI----- 408

Query: 298 SAFSLLPAQNATGNWIKVVQRLPVRIALDEKQLKEHPLRIGLSSLVKVDT 347
+ ++ G V+ + K PL G++ ++ T
Sbjct: 409 NLDAIE--DQRLGLVFNVIISIEENCLSTG--NKNIPLSSGMAVTAEIKT 454


51  YpsIP31758_3243  YpsIP31758_3275  Y  N  N  Genomic Island
LocusTag         DNBias  CDNBias    %GCBias  Product
YpsIP31758_3243       1       12   3.010674  hypothetical protein
YpsIP31758_3245       1       12   3.264164  sugar-binding domain-containing protein
YpsIP31758_3244       1       15   4.307154  hypothetical protein
YpsIP31758_3246       1       15   3.997127  fucose isomerase domain-containing protein
YpsIP31758_3247       2       16   4.562381  carbohydrate kinase
YpsIP31758_3248       2       18   3.419080  transketolase, C-terminal subunit
YpsIP31758_3249       1       14   1.376449  transketolase
YpsIP31758_3250       2       12   0.545577  ribose ABC transporter permease
YpsIP31758_3251       0       14  -0.493424  ribose ABC transporter ATP-binding protein
YpsIP31758_3252       0       16  -1.449963  lipoprotein
YpsIP31758_3253       0       15  -1.713580  ribose ABC transporter periplasmic protein
YpsIP31758_3254       0       17  -1.188266  catalase/peroxidase HPI
YpsIP31758_3255       2       17   0.561793  cytochrome b562
YpsIP31758_3256       1       16   2.955009  cytochrome b561
YpsIP31758_3257       1       16   3.261444  hypothetical protein
YpsIP31758_3258       1       15   3.003189  twin-argninine leader-binding protein DmsD
YpsIP31758_3259       1       15   1.907331  anaerobic dimethyl sulfoxide reductase subunit
YpsIP31758_3260       0       15   1.230958  anaerobic dimethylsulfoxide reductase subunit B
YpsIP31758_3261       0       15   1.083306  anaerobic dimethyl sulfoxide reductase, A
YpsIP31758_3262      -2       18   0.376652  L-ribulose-5-phosphate 4-epimerase
YpsIP31758_3263      -3       20   0.665815  hypothetical protein
YpsIP31758_3264      -3       21   1.709193  DeoR family transcriptional regulator
YpsIP31758_3265      -3       21   2.921800  carbohydrate ABC transporter periplasmic-binding
YpsIP31758_3266      -2       24   3.028577  carbohydrate ABC transporter ATP-binding
YpsIP31758_3267      -2       24   3.433725  carbohydrate ABC transporter permease
YpsIP31758_3268      -2       22   2.983608  carbohydrate ABC transporter permease
YpsIP31758_3269      -1       20   2.655563  L-xylulose 5-phosphate 3-epimerase
YpsIP31758_3270      -1       11  -0.603883  cryptic L-xylulose kinase
YpsIP31758_3271      -2       12  -1.776141  fumarate hydratase, class I
YpsIP31758_3272      -1       15  -3.254575  hypothetical protein
YpsIP31758_3273      -1       13  -3.079140  methionine aminopeptidase
YpsIP31758_3274      -2       14  -4.054935  hypothetical protein
YpsIP31758_3275      -2       13  -4.413254  enterotoxin
52  YpsIP31758_3326  YpsIP31758_3340  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias    %GCBias  Product
YpsIP31758_3326       1       11   3.148307  vitamin B12-transporter protein BtuF
YpsIP31758_3327       1       11   2.772097  hypothetical protein
YpsIP31758_3328       0       11   3.187526  iron-sulfur cluster insertion protein ErpA
YpsIP31758_3329      -1       10   3.878695  chloride channel protein
YpsIP31758_3330      -1       12   5.204828  glutamate-1-semialdehyde aminotransferase
YpsIP31758_3331      -1       14   5.804930  iron-hydroxamate transporter permease subunit
YpsIP31758_3332      -1       13   4.961637  iron-hydroxamate transporter substrate-binding
YpsIP31758_3333      -1       13   5.095499  iron-hydroxamate transporter ATP-binding
YpsIP31758_3334      -1       11   4.693000  penicillin-binding protein 1b
YpsIP31758_3335       0       10   4.537961  ATP-dependent RNA helicase HrpB
YpsIP31758_3336       0       12   2.775753  2'-5' RNA ligase
YpsIP31758_3337       0       13   1.696201  sugar fermentation stimulation protein A
YpsIP31758_3338      -2       16   2.804072  RNA polymerase-binding transcription factor
YpsIP31758_3339      -2       16   3.188502  glutamyl-Q tRNA(Asp) synthetase
YpsIP31758_3340      -2       18   3.069938  poly(A) polymerase I
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_3326  RTXTOXINA  32  0.003  Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family

signature.
Length = 1024

Score = 32.2 bits (73), Expect = 0.003
Identities = 15/47 (31%), Positives = 22/47 (46%), Gaps = 5/47 (10%)

Query: 27 AAERVISL-----SPSTTELAYAAGLGDKLVAVSAYSDYPESAKKLE 68
+ ER + + ELA GDK ++ +Y DY E K+LE
Sbjct: 468 SVERSVLITQQHWDTLIGELAGVTRNGDKTLSGKSYIDYYEEGKRLE 514


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_3332  FERRIBNDNGPP  400  e-143  Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 400 bits (1030), Expect = e-143
Identities = 143/262 (54%), Positives = 182/262 (69%), Gaps = 1/262 (0%)

Query: 39 IDTKRVVALEWLPVELLLALGVTPFGVADIHNYRLWVGEPALPADVINVGQRTEPNLELL 98
ID R+VALEWLPVELLLALG+ P+GVAD NYRLWV EP LP VI+VG RTEPNLELL
Sbjct: 33 IDPNRIVALEWLPVELLLALGIVPYGVADTINYRLWVSEPPLPDSVIDVGLRTEPNLELL 92

Query: 99 QQMAPSLILLSQGYGPSPEKLAPIAPTMSFAFNEQGSSPLAVGKNSLQTLGQRLGLETAA 158
+M PS ++ S GYGPSPE LA IAP F F++ G PLA+ + SL + L L++AA
Sbjct: 93 TEMKPSFMVWSAGYGPSPEMLARIAPGRGFNFSD-GKQPLAMARKSLTEMADLLNLQSAA 151

Query: 159 QQHLADFDHFMLAARARLSGDTQTPLLMFSLLDPRHALIIGNGSLFQDVLSTLNIENAWQ 218
+ HLA ++ F+ + + R PLL+ +L+DPRH L+ G SLFQ++L I NAWQ
Sbjct: 152 ETHLAQYEDFIRSMKPRFVKRGARPLLLTTLIDPRHMLVFGPNSLFQEILDEYGIPNAWQ 211

Query: 219 GETNFWGSAVVGIERLATIKTARAVCFGHGNNEMLQQVARTPLWQSLSFVRENQLRLLPP 278
GETNFWGS V I+RLA K +CF H N++ + + TPLWQ++ FVR + + +P
Sbjct: 212 GETNFWGSTAVSIDRLAAYKDVDVLCFDHDNSKDMDALMATPLWQAMPFVRAGRFQRVPA 271

Query: 279 VWFYGATLSAMRFVRLLEQAWG 300
VWFYGATLSAM FVR+L+ A G
Sbjct: 272 VWFYGATLSAMHFVRVLDNAIG 293


53  YpsIP31758_3358  YpsIP31758_3392  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias    %GCBias  Product
YpsIP31758_3358       2       28   1.602808  hypothetical protein
YpsIP31758_3359       2       28   1.829968  bifunctional aconitate hydratase
YpsIP31758_3360       4       30   1.003645  dihydrolipoamide dehydrogenase
YpsIP31758_3361       2       25   0.920622  hypothetical protein
YpsIP31758_3362       1       23   0.899874  dihydrolipoamide acetyltransferase
YpsIP31758_3363       1       19   0.150174  pyruvate dehydrogenase subunit E1
YpsIP31758_3364       0       12  -0.284826  PdhR family transcriptional regulator
YpsIP31758_3365       0       14  -1.156386  hypothetical protein
YpsIP31758_3366       0       16  -1.261269  aromatic amino acid transport protein AroP
YpsIP31758_3367       1       21  -1.875082  regulatory protein AmpE
YpsIP31758_3368       1       20  -1.992079  N-acetyl-anhydromuranmyl-L-alanine amidase
YpsIP31758_3369       2       22  -2.170242  quinolinate phosphoribosyltransferase
YpsIP31758_3370       1       22  -2.690783  major pilin subunit
YpsIP31758_3371       1       21  -2.561218  hypothetical protein
YpsIP31758_3372       1       22  -2.641526  type IV pilin biogenesis protein
YpsIP31758_3373       2       17  -1.388353  guanosine 5'-monophosphate oxidoreductase
YpsIP31758_3374       1       16  -1.047624  dephospho-CoA kinase
YpsIP31758_3375       2       16  -0.855517  hypothetical protein
YpsIP31758_3376       2       17  -0.827892  zinc-binding protein
YpsIP31758_3377       2       19   0.049671  nucleoside triphosphate pyrophosphohydrolase
YpsIP31758_3378       2       17   0.303727  preprotein translocase subunit SecA
YpsIP31758_3379       0       14   0.965027  SecA regulator SecM
YpsIP31758_3380      -1       13   0.990495  hypothetical protein
YpsIP31758_3381      -1       13   1.462908  UDP-3-O-[3-hydroxymyristoyl] N-acetylglucosamine
YpsIP31758_3382       0       14   2.942596  cell division protein FtsZ
YpsIP31758_3383       0       12   2.612560  cell division protein FtsA
YpsIP31758_3384       1       13   3.506250  cell division protein FtsQ
YpsIP31758_3385       1       15   3.162926  D-alanine--D-alanine ligase
YpsIP31758_3386       2       15   3.388901  UDP-N-acetylmuramate--L-alanine ligase
YpsIP31758_3387       2       14   3.824339  undecaprenyldiphospho-muramoylpentapeptide
YpsIP31758_3388       1       14   3.488399  cell division protein FtsW
YpsIP31758_3389       1       14   3.579747  UDP-N-acetylmuramoyl-L-alanyl-D-glutamate
YpsIP31758_3390       0       13   3.228080  phospho-N-acetylmuramoyl-pentapeptide-
YpsIP31758_3391       0       13   3.504832  UDP-N-acetylmuramoyl-tripeptide--D-alanyl-D-
YpsIP31758_3392      -1       13   3.238495  UDP-N-acetylmuramoylalanyl-D-glutamate--2,
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_3362  RTXTOXIND  35  7e-04  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 35.2 bits (81), Expect = 7e-04
Identities = 18/83 (21%), Positives = 34/83 (40%), Gaps = 2/83 (2%)

Query: 26 DTVEAEQSLITVEGDKASMEVPSPQAGVVKEIKIAVGDKVATGSLIMVFDATGAAAAPVK 85
+ V +T G S E+ + +VKEI + G+ V G +++ A GA A +K
Sbjct: 81 EIVATANGKLTHSGR--SKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLK 138

Query: 86 AEEKPAAPAQAAAPAASAAKNVE 108
+ ++++E
Sbjct: 139 TQSSLLQARLEQTRYQILSRSIE 161



Score = 30.2 bits (68), Expect = 0.020
Identities = 10/49 (20%), Positives = 21/49 (42%), Gaps = 1/49 (2%)

Query: 26 DTVEAEQSLITVEGDKASMEVPSPQAGVVKEIKI-AVGDKVATGSLIMV 73
+ L E + + + +P + V+++K+ G V T +MV
Sbjct: 310 NIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMV 358



Score = 29.8 bits (67), Expect = 0.031
Identities = 12/44 (27%), Positives = 21/44 (47%), Gaps = 1/44 (2%)

Query: 133 EQSLITVEGDKASMEVPAPFAGIVKEIKIST-GDKVKTGSLIMV 175
L E + + + AP + V+++K+ T G V T +MV
Sbjct: 315 TLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMV 358


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_3370  BCTERIALGSPG  41  3e-07  Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G

signature.
Length = 145

Score = 41.0 bits (96), Expect = 3e-07
Identities = 21/66 (31%), Positives = 38/66 (57%)

Query: 10 QKGFTLIELMVAVAIIAVLSGIGIPSYQRYIQKAALTDMLQAIVPYKMAVELCALEQSNL 69
Q+GFTL+E+MV + II VL+ + +P+ +KA + IV + A+++ L+ +
Sbjct: 7 QRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYKLDNHHY 66

Query: 70 DSCNAG 75
+ N G
Sbjct: 67 PTTNQG 72


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_3372  BCTERIALGSPF  276  6e-92  Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F

signature.
Length = 408

Score = 276 bits (708), Expect = 6e-92
Identities = 108/405 (26%), Positives = 205/405 (50%), Gaps = 13/405 (3%)

Query: 6 LFNWTALNKTGELQTGMLLATERNSVYEHIIQHGLQPLGV-----KGGRRLSARYWQGER 60
+++ AL+ G+ G A + + + GL PL V + S +
Sbjct: 3 QYHYQALDAQGKKCRGTQEADSARQARQLLRERGLVPLSVDENRGDQQKSGSTGLSLRRK 62

Query: 61 -------LVAMTRQLATLLQAGLPLVNSLQLLAKEADDSAWRCLLDEISQQVAQGQSLSE 113
L +TRQLATL+ A +PL +L +AK+++ L+ + +V +G SL++
Sbjct: 63 IRLSTSDLALLTRQLATLVAASMPLEEALDAVAKQSEKPHLSQLMAAVRSKVMEGHSLAD 122

Query: 114 VMEQYPHVFPRLYPPVVAVGELTGNLEQCCTQLVHHQERQQNLHKKVIKALKYPVVVCIV 173
M+ +P F RLY +VA GE +G+L+ +L + E++Q + ++ +A+ YP V+ +V
Sbjct: 123 AMKCFPGSFERLYCAMVAAGETSGHLDAVLNRLADYTEQRQQMRSRIQQAMIYPCVLTVV 182

Query: 174 ALVVSVIMLVMVLPEFAQIYQSFDTPLPGLTASLLWLSTFLTFYGPYLALIIAIVCIGYF 233
A+ V I+L +V+P+ + + LP T L+ +S + +GP++ L + + +
Sbjct: 183 AIAVVSILLSVVVPKVVEQFIHMKQALPLSTRVLMGMSDAVRTFGPWMLLALLAGFMAFR 242

Query: 234 YTLRKKSRWQQWEQTILLSIPLVSTLIRGSCLSQIFQTLAITQQAGLPLSAGLDAAARSI 293
LR++ R + + LL +PL+ + RG ++ +TL+I + +PL + + +
Sbjct: 243 VMLRQEKRRVSFHRR-LLHLPLIGRIARGLNTARYARTLSILNASAVPLLQAMRISGDVM 301

Query: 294 HNYNYQQALRCIQKQISQGIPLYTTLNQHPLFPAICQQLIRVGEESGSLDVLLEKLACWH 353
N + L + +G+ L+ L Q LFP + + +I GE SG LD +LE+ A
Sbjct: 302 SNDYARHRLSLATDAVREGVSLHKALEQTALFPPMMRHMIASGERSGELDSMLERAADNQ 361

Query: 354 QQQTQNLADNVTQMLEPLLMLIIGSIVGVLVIAMYLPIFQLGDVI 398
++ + + EPLL++ + ++V +V+A+ PI QL ++
Sbjct: 362 DREFSSQMTLALGLFEPLLVVSMAAVVLFIVLAILQPILQLNTLM 406


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_3378  SECA  1373  0.0  SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 1373 bits (3556), Expect = 0.0
Identities = 805/904 (89%), Positives = 852/904 (94%), Gaps = 3/904 (0%)

Query: 1 MLIKLLTKVFGSRNDRTLRRMQKVVDVINRMEPDIEKLTDTELRAKTDEFRERLAKGEVL 60
MLIKLLTKVFGSRNDRTLRRM+KVV++IN MEP++EKL+D EL+ KT EFR RL KGEVL
Sbjct: 1 MLIKLLTKVFGSRNDRTLRRMRKVVNIINAMEPEMEKLSDEELKGKTAEFRARLEKGEVL 60

Query: 61 ENLIPEAFAVVREASKRVFGMRHFDVQLLGGMVLNERCIAEMRTGEGKTLTATLPAYLNA 120
ENLIPEAFAVVREASKRVFGMRHFDVQLLGGMVLNERCIAEMRTGEGKTLTATLPAYLNA
Sbjct: 61 ENLIPEAFAVVREASKRVFGMRHFDVQLLGGMVLNERCIAEMRTGEGKTLTATLPAYLNA 120

Query: 121 LSGRGVHVVTVNDYLAQRDAENNRPLFEFLGLSIGINLPNMTAPAKRAAYAADITYGTNN 180
L+G+GVHVVTVNDYLAQRDAENNRPLFEFLGL++GINLP M APAKR AYAADITYGTNN
Sbjct: 121 LTGKGVHVVTVNDYLAQRDAENNRPLFEFLGLTVGINLPGMPAPAKREAYAADITYGTNN 180

Query: 181 EFGFDYLRDNMAFSPEERVQRQLHYALVDEVDSILIDEARTPLIISGPAEDSSEMYIRVN 240
E+GFDYLRDNMAFSPEERVQR+LHYALVDEVDSILIDEARTPLIISGPAEDSSEMY RVN
Sbjct: 181 EYGFDYLRDNMAFSPEERVQRKLHYALVDEVDSILIDEARTPLIISGPAEDSSEMYKRVN 240

Query: 241 KLIPKLIRQEKEDSDSFQGEGHFSVDEKSRQVHLTERGLILIEQMLVEAGIMDEGESLYS 300
K+IP LIRQEKEDS++FQGEGHFSVDEKSRQV+LTERGL+LIE++LV+ GIMDEGESLYS
Sbjct: 241 KIIPHLIRQEKEDSETFQGEGHFSVDEKSRQVNLTERGLVLIEELLVKEGIMDEGESLYS 300

Query: 301 PANIMLMHHVTAALRAHVLFTRDVDYIVKDGEVIIVDEHTGRTMQGRRWSDGLHQAVEAK 360
PANIMLMHHVTAALRAH LFTRDVDYIVKDGEVIIVDEHTGRTMQGRRWSDGLHQAVEAK
Sbjct: 301 PANIMLMHHVTAALRAHALFTRDVDYIVKDGEVIIVDEHTGRTMQGRRWSDGLHQAVEAK 360

Query: 361 EGVEIQNENQTLASITFQNYFRLYEKLAGMTGTADTEAFEFSSIYKLDTIVVPTNRPMIR 420
EGV+IQNENQTLASITFQNYFRLYEKLAGMTGTADTEAFEFSSIYKLDT+VVPTNRPMIR
Sbjct: 361 EGVQIQNENQTLASITFQNYFRLYEKLAGMTGTADTEAFEFSSIYKLDTVVVPTNRPMIR 420

Query: 421 KDLADLVYMTEQEKIGAIIEDIRERTANGQPVLVGTISIEKSEVVSAELTKAGIEHKVLN 480
KDL DLVYMTE EKI AIIEDI+ERTA GQPVLVGTISIEKSE+VS ELTKAGI+H VLN
Sbjct: 421 KDLPDLVYMTEAEKIQAIIEDIKERTAKGQPVLVGTISIEKSELVSNELTKAGIKHNVLN 480

Query: 481 AKFHAMEAEIVSQAGQPGAVTIATNMAGRGTDIVLGGSWQSEIAALEDPTEEQIAAIKAA 540
AKFHA EA IV+QAG P AVTIATNMAGRGTDIVLGGSWQ+E+AALE+PT EQI IKA
Sbjct: 481 AKFHANEAAIVAQAGYPAAVTIATNMAGRGTDIVLGGSWQAEVAALENPTAEQIEKIKAD 540

Query: 541 WQIRHDAVLASGGLHIIGTERHESRRIDNQLRGRAGRQGDAGSSRFYLSMEDALMRIFAS 600
WQ+RHDAVL +GGLHIIGTERHESRRIDNQLRGR+GRQGDAGSSRFYLSMEDALMRIFAS
Sbjct: 541 WQVRHDAVLEAGGLHIIGTERHESRRIDNQLRGRSGRQGDAGSSRFYLSMEDALMRIFAS 600

Query: 601 DRVSGMMRKLGMKPGEAIEHPWVTKAIANAQRKVESRNFDIRKQLLEYDDVANDQRRAIY 660
DRVSGMMRKLGMKPGEAIEHPWVTKAIANAQRKVESRNFDIRKQLLEYDDVANDQRRAIY
Sbjct: 601 DRVSGMMRKLGMKPGEAIEHPWVTKAIANAQRKVESRNFDIRKQLLEYDDVANDQRRAIY 660

Query: 661 SQRNELLDVSDVSETINSIREDVFKTTIDSYIPTQSLEEMWDIEGLEQRLKNDFDLDMPI 720
SQRNELLDVSDVSETINSIREDVFK TID+YIP QSLEEMWDI GL++RLKNDFDLD+PI
Sbjct: 661 SQRNELLDVSDVSETINSIREDVFKATIDAYIPPQSLEEMWDIPGLQERLKNDFDLDLPI 720

Query: 721 AKWLEDEPQLHEETLRERILQQAIETYQRKEEVVGIEMMRNFEKGVMLQTLDSLWKEHLA 780
A+WL+ EP+LHEETLRERIL Q+IE YQRKEEVVG EMMR+FEKGVMLQTLDSLWKEHLA
Sbjct: 721 AEWLDKEPELHEETLRERILAQSIEVYQRKEEVVGAEMMRHFEKGVMLQTLDSLWKEHLA 780

Query: 781 AMDYLRQGIHLRGYAQKDPKQEYKRESFAMFAAMLESLKYEVISVLSKVQVRMPEEVEAL 840
AMDYLRQGIHLRGYAQKDPKQEYKRESF+MFAAMLESLKYEVIS LSKVQVRMPEEVE L
Sbjct: 781 AMDYLRQGIHLRGYAQKDPKQEYKRESFSMFAAMLESLKYEVISTLSKVQVRMPEEVEEL 840

Query: 841 EVQRREEAERLARQQQLSHQTDNSALMSEEEVKVANSLERKVGRNDPCPCGSGKKYKQCH 900
E QRR EAERLA+ QQLSHQ D+SA + + ERKVGRNDPCPCGSGKKYKQCH
Sbjct: 841 EQQRRMEAERLAQMQQLSHQDDDSAAAAALAAQTG---ERKVGRNDPCPCGSGKKYKQCH 897

Query: 901 GRLQ 904
GRLQ
Sbjct: 898 GRLQ 901


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_3383  SHAPEPROTEIN  53  7e-10  Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein

signature.
Length = 347

Score = 53.2 bits (128), Expect = 7e-10
Identities = 47/201 (23%), Positives = 72/201 (35%), Gaps = 18/201 (8%)

Query: 171 IVKAVERCGLKVDQLIFAGLAASYAVLTEDERELGVCVVDIGGGTMDMAVYTGGALRHTK 230
I ++ + G + LI +AA+ G VVDIGGGT ++AV + + ++
Sbjct: 126 IRESAQGAGAREVFLIEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVAVISLNGVVYSS 185

Query: 231 VIPYAGNVVTSDI------AYAFGTPPTDAEAIKVRHGCALGSIVSKDESVEVPSVGGRP 284
+ G+ I Y AE IK G A + V V GR
Sbjct: 186 SVRIGGDRFDEAIINYVRRNYGSLIGEATAERIKHEIGSAY-----PGDEVREIEVRGRN 240

Query: 285 -----PRSLQRQTLAEVIEPRYTELLNLVNDEILQLQEQLRQQGVKHHLAAGIVLTGGAA 339
PR E++E E L + ++ EQ + G+VLTGG A
Sbjct: 241 LAEGVPRGF-TLNSNEILEA-LQEPLTGIVSAVMVALEQCPPELASDISERGMVLTGGGA 298

Query: 340 QIDGLAECAQRVFHAQVRIGQ 360
+ L V + +
Sbjct: 299 LLRNLDRLLMEETGIPVVVAE 319


54  YpsIP31758_3424  YpsIP31758_3446  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias    %GCBias  Product
YpsIP31758_3424       2       18   1.952984  lipoprotein
YpsIP31758_3425       1       18   2.613763  hypothetical protein
YpsIP31758_3426       0       17   2.636959  hypothetical protein
YpsIP31758_3427       0       17   2.699316  pentapeptide repeat-containing protein
YpsIP31758_3428       0       17   2.793753  pentapeptide repeat-containing protein
YpsIP31758_3429       0       17   3.131443  Rhs element Vgr protein
YpsIP31758_3430      -1       16   2.502261  AAA ATPase
YpsIP31758_3431       0       15   0.811868  hypothetical protein
YpsIP31758_3432       0       15  -1.462718  hypothetical protein
YpsIP31758_3433       2       19  -4.782833  hypothetical protein
YpsIP31758_3434       2       18  -4.599483  hypothetical protein
YpsIP31758_3435       2       20  -4.091942  hypothetical protein
YpsIP31758_3436       0       14  -3.773302  hypothetical protein
YpsIP31758_3437      -1       12  -2.883697  ImpA domain-containing protein
YpsIP31758_3438      -1       10  -1.140835  hypothetical protein
YpsIP31758_3439      -2       10   0.048224  ribosomal large subunit pseudouridine synthase
YpsIP31758_3440      -1       11  -0.001365  Dna-J like membrane chaperone protein
YpsIP31758_3441      -1       13  -0.397220  organic solvent tolerance protein
YpsIP31758_3442      -1       14  -1.267956  peptidyl-prolyl cis-trans isomerase SurA
YpsIP31758_3443       1       16  -2.276732  4-hydroxythreonine-4-phosphate dehydrogenase
YpsIP31758_3444       0       20  -3.181337  dimethyladenosine transferase
YpsIP31758_3445       1       24  -3.549588  ApaG protein
YpsIP31758_3446       2       20  -2.689453  diadenosine tetraphosphatase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_3429  ICENUCLEATIN  37  3e-04  Ice nucleation protein signature.
		>ICENUCLEATIN#Ice nucleation protein signature.

Length = 1258

Score = 37.4 bits (86), Expect = 3e-04
Identities = 51/236 (21%), Positives = 89/236 (37%), Gaps = 8/236 (3%)

Query: 545 TGMSVSATGISVSTTGTSLSVTGMSTSVTGVSVGFTLIGTS--FTGVSASFTGVSTSFTG 602
+G + I ++T G++LS T S + G T +S G ++ T + S
Sbjct: 150 SGSTQPTQTIEIATYGSTLSGTHQSQLIAGYGSTETAGDSSTLIAGYGSTGTAGADSTLV 209

Query: 603 ASNSLTGVSNSMTGCSSSFTGTSNSMTGSSHSMTGMSTSITGHSMSQ-TGSSSSITGDST 661
A T + + + + T M GS + ST G S G S+ T
Sbjct: 210 AGYGSTQTAGEESSQMAGYGSTQTGMKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGED 269

Query: 662 SFTGSSVSSTGSSVSTTGVSTSTTGSSTSTTGCSVSTTGSSTSTTGNSVSMTG----NST 717
S + ST ++ + ++ + T+ S+ ST T G + T T
Sbjct: 270 SSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQT 329

Query: 718 STTGCSISTTGSSIGTVGSSISTTGSSVSTTGSSISTTGLSVSYTGAQYSDVGVDL 773
+ G ++ S GT G S+ + +T ++ + L+ Y Q + G DL
Sbjct: 330 AQKGSDLTAGYGSTGTAGDD-SSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDL 384



Score = 34.7 bits (79), Expect = 0.001
Identities = 50/227 (22%), Positives = 93/227 (40%), Gaps = 10/227 (4%)

Query: 546 GMSVSATGISVSTTGTSLSVTGMSTSVTGVSVGFTLIGTSFTGVSASFTGVSTSFTGASN 605
G + +A+ SV T G + T S ++ G+ GT+ + S ST +
Sbjct: 549 GSTQTASYNSVLTAGYGSTQTAREGS--DLTAGYGSTGTAGSDSSIIAGYGSTQTASYHS 606

Query: 606 SLTGVSNSMTGCSSSFTGTSNSMTGSSHSMTGMSTSITGHSMSQTGSSSSITGDSTSFTG 665
SLT S T+ GS+ + S+ I G+ +QT +SI T+ G
Sbjct: 607 SLTAGYGSTQTAREQSVLTTG--YGSTSTAGADSSLIAGYGSTQTAGYNSIL---TAGYG 661

Query: 666 SSVSSTGSSVSTTGVSTSTTGSSTSTTGCSVSTTGSSTSTTGNSVSMTGNSTSTTGCSIS 725
S+ ++ S T G +++T + S+ +T ++ + + T+ G ++
Sbjct: 662 STQTAQEGSDLTAGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLT 721

Query: 726 TTGSSIGTVGSS---ISTTGSSVSTTGSSISTTGLSVSYTGAQYSDV 769
+ S T G+ I+ GS+ + + S T G + T + S +
Sbjct: 722 SGYGSTSTAGADSSLIAGYGSTQTASYHSSLTAGYGSTQTAREQSVL 768



Score = 34.3 bits (78), Expect = 0.002
Identities = 51/231 (22%), Positives = 88/231 (38%), Gaps = 20/231 (8%)

Query: 551 ATGISVSTTGTSLSVTGMSTSVTGVSVGFTLIGTSFTGVSASFTGVSTSFTGASNSLTGV 610
A + G+ + + + V + T +G + + + G++ S T
Sbjct: 114 ACTEMQAGPGSPDVTSEVKVGNRSLPVTDDIDATIESGSTQPTQTIEIATYGSTLSGTHQ 173

Query: 611 SNSMTGCSSSFTGTSNSMTGSSHSMTGM----STSITGHSMSQT-GSSSSITGDSTSFTG 665
S + G S+ T +S + + TG ST + G+ +QT G SS +
Sbjct: 174 SQLIAGYGSTETAGDSSTLIAGYGSTGTAGADSTLVAGYGSTQTAGEESSQMA---GYGS 230

Query: 666 SSVSSTGSSVSTTGVSTSTTGSSTSTTGCSVSTTGSSTSTTGNSVSMTG----NSTSTTG 721
+ GS ++ ST T G +S + ST T G S+T T+ G
Sbjct: 231 TQTGMKGSDLTAGYGSTGTAGDDSS-----LIAGYGSTQTAGEDSSLTAGYGSTQTAQKG 285

Query: 722 CSISTTGSSIGTVGSS---ISTTGSSVSTTGSSISTTGLSVSYTGAQYSDV 769
++ S GT G+ I+ GS+ + S T G + T + SD+
Sbjct: 286 SDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDL 336



Score = 34.0 bits (77), Expect = 0.003
Identities = 56/233 (24%), Positives = 96/233 (41%), Gaps = 16/233 (6%)

Query: 544 VTGMSVSATGISVSTTGTSLSVTGMSTSVTGVSVGFTLIGTSFTGVSASFTGVSTSFTGA 603
+ G + T ST T + + ++ G+ GT+ S ST G
Sbjct: 401 IAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGE 460

Query: 604 SNSLTGVSNSMTGCSSSFTGTSNSMTGSSHSMTGMSTSITGHSMSQT-GSSSSITGDSTS 662
+SLT S T+ GS+ + S+ I G+ +QT G S++T
Sbjct: 461 DSSLTAGYGSTQTAQKGSDLTAG--YGSTSTAGYESSLIAGYGSTQTAGYGSTLTA---G 515

Query: 663 FTGSSVSSTGSSVSTTGVSTSTTGSSTSTTGCSVSTTGSSTSTTGNSVSMTGNSTSTT-- 720
+ + + S + T STST G+++S ++ GS+ + + NSV G ++ T
Sbjct: 516 YGSTQTAQNESDLITGYGSTSTAGANSSL----IAGYGSTQTASYNSVLTAGYGSTQTAR 571

Query: 721 -GCSISTTGSSIGTVGSS---ISTTGSSVSTTGSSISTTGLSVSYTGAQYSDV 769
G ++ S GT GS I+ GS+ + + S T G + T + S +
Sbjct: 572 EGSDLTAGYGSTGTAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVL 624



Score = 31.6 bits (71), Expect = 0.017
Identities = 54/227 (23%), Positives = 90/227 (39%), Gaps = 16/227 (7%)

Query: 555 SVSTTGTSLSVTGMSTSVTGVSVGFTLI-GTSFTGVSASFTGVSTSFTGASNSLTGVSNS 613
S T G S+T S G L G TG + + + + + G++ + S
Sbjct: 262 STQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGY-GSTQTAGEESTQ 320

Query: 614 MTGCSSSFTGTSNSMTGSSHSMTGM----STSITGHSMSQT-GSSSSITGDSTSFTGSSV 668
G S+ T S + + TG S+ I G+ +QT G SS+T + +
Sbjct: 321 TAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTA---GYGSTQT 377

Query: 669 SSTGSSVSTTGVSTSTTGSSTSTTGC--SVSTTGSSTSTTGNSVSMTGNSTSTTGCSIST 726
+ GS ++ ST T G+ +S S T G ++ T S T+ G ++
Sbjct: 378 AQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGS---TQTAQKGSDLTA 434

Query: 727 TGSSIGTVGSSISTTGSSVSTTGSSISTTGLSVSYTGAQYSDVGVDL 773
S GT G S+ + +T ++ + L+ Y Q + G DL
Sbjct: 435 GYGSTGTAGDD-SSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDL 480



Score = 31.3 bits (70), Expect = 0.021
Identities = 54/224 (24%), Positives = 92/224 (41%), Gaps = 14/224 (6%)

Query: 555 SVSTTGTSLSVTGMSTSVTGVSVGFTLIGTSFTGVSASFTGVSTSFTGASNSLTGVSNS- 613
S T G + +T S G L T+ G +++ S+ G ++ T NS
Sbjct: 646 STQTAGYNSILTAGYGSTQTAQEGSDL--TAGYGSTSTAGADSSLIAGYGSTQTAGYNSI 703

Query: 614 -MTGCSSSFTGTSNSMTGSSHSMTGM----STSITGHSMSQTGSSSSITGDSTSFTGSSV 668
G S+ T S S + T S+ I G+ +QT S S T+ GS+
Sbjct: 704 LTAGYGSTQTAQEGSDLTSGYGSTSTAGADSSLIAGYGSTQTASYHSSL---TAGYGSTQ 760

Query: 669 SSTGSSVSTTGVSTSTTGSSTSTTGCSVSTTGSSTSTTGNSVSMTGNSTSTTGCSISTTG 728
++ SV TTG +++T + S+ +T ++ + + T+ ++T
Sbjct: 761 TAREQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYHSILTAGYGSTQTAQERSDLTTGY 820

Query: 729 SSIGTVG---SSISTTGSSVSTTGSSISTTGLSVSYTGAQYSDV 769
S T G S I+ GS+ + +SI T G + T + SD+
Sbjct: 821 GSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDL 864


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_3432  PERTACTIN  30  0.026  Pertactin signature.
		>PERTACTIN#Pertactin signature.

Length = 922

Score = 30.5 bits (68), Expect = 0.026
Identities = 35/109 (32%), Positives = 47/109 (43%), Gaps = 22/109 (20%)

Query: 102 SPQWHSRVVLPKGSRVTLSDSSLNNRLANFSTGRTLKIQPLVIENAECAST-PPAYLPLS 160
+PQ + + +G+RVT+S SL+ N VIE A PP PLS
Sbjct: 309 APQLGAAIRAGRGARVTVSGGSLSAPHGN------------VIETGGGARRFPPPASPLS 356

Query: 161 VASQLQAGQAHLRLRLTTQGVASLSELDFAPMNLTLAGGIIQSNQLITT 209
+ LQAG QG A L + P+ LTLAGG ++ T
Sbjct: 357 I--TLQAGA-------RAQGRALLYRVLPEPVKLTLAGGAQGQGDIVAT 396


55  YpsIP31758_3463  YpsIP31758_3468  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias     Product
YpsIP31758_3463  0       21       -5.042322   transcriptional activator NhaR
YpsIP31758_3464  -1      21       -4.846548   pH-dependent sodium/proton antiporter
YpsIP31758_3465  -1      23       -5.250379   chaperone protein DnaJ
YpsIP31758_3466  0       24       -6.346795   molecular chaperone DnaK
YpsIP31758_3467  -1      23       -7.108350   hypothetical protein
YpsIP31758_3468  0       22       -7.326929   acetyltransferase domain-containing protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_3466  SHAPEPROTEIN  143  4e-40  Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 143 bits (363), Expect = 4e-40
Identities = 81/387 (20%), Positives = 149/387 (38%), Gaps = 84/387 (21%)

Query: 5 IGIDLGTTNSCVAIMDGTKARVLENSEGDRTTPSIIAYTQDGET------LVGQPAKRQA 58
+ IDLGT N+ + + + VL PS++A QD VG AK+
Sbjct: 13 LSIDLGTANTLIYVKG--QGIVLNE-------PSVVAIRQDRAGSPKSVAAVGHDAKQML 63

Query: 59 VTNPQNTLFAIKRLIGRRFQDEEAQRDKDIMPYKIIAADNGDAWLEVKGQKMAPPQISAE 118
P N + AI+ + +D I + + +
Sbjct: 64 GRTPGN-IAAIRPM-----------KDGVIADFFVTEK------------------MLQH 93

Query: 119 VLKKMKKTAEDYLGEPVTEAVITVPAYFNDAQRQATKDAGRIAGLEVKRIINEPTAAALA 178
+K++ + P ++ VP +R+A +++ + AG +I EP AAA+
Sbjct: 94 FIKQVHS---NSFMRPSPRVLVCVPVGATQVERRAIRESAQGAGAREVFLIEEPMAAAIG 150

Query: 179 YGL--DKEVGNRTIAVYDLGGGTFDISIIEIDEVDGEKTFEVLATNGDTHLGGEDFDSRL 236
GL + G+ V D+GGGT ++++I ++ V + +GG+ FD +
Sbjct: 151 AGLPVSEATGS---MVVDIGGGTTEVAVISLNGV---------VYSSSVRIGGDRFDEAI 198

Query: 237 INYLVEEFKKDQGMDLRTDPLAMQRLKEAAEKAKIELSSA----QQTDVNLPYITADGSG 292
INY+ + G + AE+ K E+ SA + ++ +
Sbjct: 199 INYVRRNYGSLIG-------------EATAERIKHEIGSAYPGDEVREIEVRGRNLAEGV 245

Query: 293 PKHMNIKVTRAKLESLVEDLVNRSIEPLKVALQD-AGLSVSDIQD--VILVGGQTRMPMV 349
P+ + + LE+L E + + + VAL+ SDI + ++L GG + +
Sbjct: 246 PRGFTLN-SNEILEALQEP-LTGIVSAVMVALEQCPPELASDISERGMVLTGGGALLRNL 303

Query: 350 QKKVADFFGKEPRKDVNPDEAVAIGAA 376
+ + + G +P VA G
Sbjct: 304 DRLLMEETGIPVVVAEDPLTCVARGGG 330


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_3468  TRNSINTIMINR  32  0.010  Translocated intimin receptor (Tir) signature.
		>TRNSINTIMINR#Translocated intimin receptor (Tir) signature.

Length = 549

Score = 32.0 bits (72), Expect = 0.010
Identities = 24/79 (30%), Positives = 40/79 (50%), Gaps = 7/79 (8%)

Query: 647 AKLSDTEIEHAQTLRRKISETIAQWGQQYQGTHTNDDVDEEIDSGVNANLS-KLLHAGHI 705
+++ E + R++ E+ AQ Q+Y+ H + ++ SG+ LS L+ AG I
Sbjct: 319 EQIAQQAKEAGEVARQQAVESNAQAQQRYEDQHARRQEELQLSSGIGYGLSSALIVAGGI 378

Query: 706 ---VTGIIHRRLVINRPGE 721
VT +HRR N+P E
Sbjct: 379 GAGVTTALHRR---NQPAE 394


56  YpsIP31758_3516  YpsIP31758_3543  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias     Product
YpsIP31758_3516  0       14       3.135759    lipopolysaccharide heptosyltransferase III
YpsIP31758_3517  1       19       4.537961    hypothetical protein
YpsIP31758_3518  0       20       4.395821    autoinducer-2 (AI-2) kinase
YpsIP31758_3519  1       21       3.983696    Crp/Fnr family sugar-binding transcriptional
YpsIP31758_3521  1       21       3.683612    ABC transporter ATP-binding protein
YpsIP31758_3522  1       22       2.925126    ABC transporter permease
YpsIP31758_3523  0       18       3.042276    autoinducer AI-2 ABC transporter permease LsrD
YpsIP31758_3524  -1      16       2.868928    ABC transporter periplasmic protein
YpsIP31758_3525  -1      15       3.030869    aldolase
YpsIP31758_3526  -1      14       2.447533    autoinducer-2 (AI-2) modifying protein LsrG
YpsIP31758_3527  -1      12       1.064129    hypothetical protein
YpsIP31758_3528  0       14       0.516355    HPr family phosphocarrier protein
YpsIP31758_3529  -1      15       -2.844466   fructose-like permease EIIC subunit 2
YpsIP31758_3530  0       14       -3.294562   PTS system fructose-like transporter subunit
YpsIP31758_3532  0       18       -3.435954   fructose-like phosphotransferase EIIB subunit 3
YpsIP31758_3531  0       18       -4.741570   AraC family transcriptional regulator
YpsIP31758_3533  1       20       -4.340808   hypothetical protein
YpsIP31758_3534  2       21       -4.224185   hypothetical protein
YpsIP31758_3535  2       22       -3.555023   M48 family peptidase
YpsIP31758_3536  2       21       -3.739275   type I restriction-modification system, M
YpsIP31758_3537  0       16       -3.484427   type I restriction-modification system subunit
YpsIP31758_3538  0       15       -1.038498   HsdR family type I site-specific
YpsIP31758_3539  1       16       1.796290    hypothetical protein
YpsIP31758_3540  0       19       2.474584    phage integrase family site specific
YpsIP31758_3541  0       22       2.796985    *permease
YpsIP31758_3542  0       23       2.899414    permease
YpsIP31758_3543  1       26       3.283736    leucyl aminopeptidase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_3528  PHPHTRNFRASE  583  0.0  Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase signature.

Length = 572

Score = 583 bits (1505), Expect = 0.0
Identities = 194/593 (32%), Positives = 310/593 (52%), Gaps = 28/593 (4%)

Query: 116 PTLLRARSVSPGTACGKLLSLIRADLNA--LGDLPVAQGIEREQQMLADGVAQLGKAWES 173
+ + S G A K + +++ V+ IE+ L +L
Sbjct: 2 HHKITGIAASSGVAIAKAFIHLEPNVDIEKTSITDVSTEIEKLTAALEKSKEELRAI--- 58

Query: 174 LLVANSSTAANSSTTENSSTTENNSTTENNSTTRAIREVHRSLLRDGTFRQRLLSHIIAG 233
++ + + I H +L D + I
Sbjct: 59 ---------------------KDQTEASMGADKAEIFAAHLLVLDDPELVDGIKGKIENE 97

Query: 234 ESCATAIVATAA-YFSQQLALAANTYLRERELDIRDVSFQLLQQIYGEQRFPSQQALSED 292
+ A + + F N Y++ER DIRDVS ++L + G + S ++E+
Sbjct: 98 QMNAEYALKEVSDMFVSMFESMDNEYMKERAADIRDVSKRVLGHLIGVET-GSLATIAEE 156

Query: 293 SLCIADELTPSQFLALDKRYLKGLLLGRGGSTSHTVILARSFNIPTLVGVDATALQPYLN 352
++ IA++LTPS L+K+++KG GG TSH+ I++RS IP +VG +
Sbjct: 157 TVIIAEDLTPSDTAQLNKQFVKGFATDIGGRTSHSAIMSRSLEIPAVVGTKEVTEKIQHG 216

Query: 353 QSLQIDGELGLVVCLLDEPVRRYYRQEQWLHDQLREQQSRYQNMPGRTLDGVRMVVAANI 412
+ +DG G+V+ E + Y +++ ++ +++ ++ P T DG + +AANI
Sbjct: 217 DMVIVDGIEGIVIVNPTEEEVKAYEEKRAAFEKQKQEWAKLVGEPSTTKDGAHVELAANI 276

Query: 413 THAVEVEGAFNQGAESIGLFRTEMLYMDRAAAPSEEELYTLYAQALGAAKGKPMIIRTID 472
+V+G G E IGL+RTE LYMDR P+EEE + Y + + GKP++IRT+D
Sbjct: 277 GTPKDVDGVLANGGEGIGLYRTEFLYMDRDQLPTEEEQFEAYKEVVQRMDGKPVVIRTLD 336

Query: 473 IGGDKPVSYLNIPAESNPFLGYRAVRIYHEFLSLFHTQLRAILRASMHGPLKIMIPMISS 532
IGGDK +SYL +P E NPFLG+RA+R+ E +F TQLRA+LRAS +G LK+M PMI++
Sbjct: 337 IGGDKELSYLQLPKELNPFLGFRAIRLCLEKQDIFRTQLRALLRASTYGNLKVMFPMIAT 396

Query: 533 MEEILWVKDQLAEVKQSLRINHLQFDETVPLGMMLEVPSVMFIIDQCCEEMDFLSIGSND 592
+EE+ K + E K L + +++ +G+M+E+PS + +E+DF SIG+ND
Sbjct: 397 LEELRQAKAIMQEEKDKLLSEGVDVSDSIEVGIMVEIPSTAVAANLFAKEVDFFSIGTND 456

Query: 593 LTQYLLAVDRDNAKVSEHYHCLSPALLRALDYAVCEVHRHGKWIGLCGELAAKDSVLPLL 652
L QY +A DR N +VS Y PA+LR +D + H GKW+G+CGE+A + +PLL
Sbjct: 457 LIQYTMAADRMNERVSYLYQPYHPAILRLVDMVIKAAHSEGKWVGMCGEMAGDEVAIPLL 516

Query: 653 VAMGLDEISMSASFIGATKARLAKLDRGECRLLLNRAMACRTSREVEHLLVQY 705
+ +GLDE SMSA+ I +++L KL + E + +A+ T+ EVE L+ +
Sbjct: 517 LGLGLDEFSMSATSILPARSQLLKLSKEELKPFAQKALMLDTAEEVEQLVKKT 569


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_3536  GPOSANCHOR  34  0.002  Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 34.3 bits (78), Expect = 0.002
Identities = 26/193 (13%), Positives = 55/193 (28%), Gaps = 15/193 (7%)

Query: 663 LSDDLQALAAKETRLAEIASMLEEILESLTEEEKEQDTVKESQDGFANAELSKAAKVFLK 722
S ++ L A++ LA + LE+ LE ++ + A ++ A++
Sbjct: 139 DSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKA 198

Query: 723 EQKDAKTKFAEDSYEAKIIRANKLIDEEKTLKKTVKDAATALHLKTKTTIEALTDEQVNN 782
+ A+ + + + K + + A I+ L E
Sbjct: 199 LEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAE---- 254

Query: 783 LLHLKWIAPLSTELAAMPNAVISQLTSQVQALADKYAVTYSQVANEIKNTEQELAQMMSE 842
A L A + + + + + E E E A + +
Sbjct: 255 ------KAALEAR-----QAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQ 303

Query: 843 LTGNEFDMQGLAE 855
+ Q L
Sbjct: 304 SQVLNANRQSLRR 316


57  YpsIP31758_3557  YpsIP31758_3569  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias     Product
YpsIP31758_3557  -2      22       5.627543    anaerobic ribonucleotide reductase-activating
YpsIP31758_3558  0       21       5.620796    hypothetical protein
YpsIP31758_3559  0       23       6.245383    phosphonate metabolism transcriptional regulator
YpsIP31758_3560  0       23       5.886759    phosphonate metabolism protein PhnG
YpsIP31758_3561  1       20       6.095523    carbon-phosphorus lyase complex subunit
YpsIP31758_3562  0       19       5.938075    phosphonate metabolism protein PhnI
YpsIP31758_3563  1       16       5.245344    phosphonate metabolism protein PhnJ
YpsIP31758_3564  0       16       5.750083    phosphonate C-P lyase system protein PhnK
YpsIP31758_3565  0       18       5.333283    phosphonate C-P lyase system protein PhnL
YpsIP31758_3566  1       17       4.825975    phosphonate metabolism protein PhnM
YpsIP31758_3567  0       20       4.153346    ribose 1,5-bisphosphokinase
YpsIP31758_3568  0       21       3.655013    carbon-phosphorus lyase complex accessory
YpsIP31758_3569  0       25       3.650387    hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_3565  PF05272  28  0.033  Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 28.1 bits (62), Expect = 0.033
Identities = 13/47 (27%), Positives = 17/47 (36%), Gaps = 8/47 (17%)

Query: 40 CVVLHGHSGSGKSTLLRSLYANYLPDSGHI--------WIKHQGEWI 78
VVL G G GKSTL+ +L H + + G
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGTGKDSYEQIAGIVA 644


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_3566  UREASE  30  0.024  Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 29.7 bits (67), Expect = 0.024
Identities = 15/27 (55%), Positives = 18/27 (66%), Gaps = 1/27 (3%)

Query: 332 TRNPARALGLNDR-GVIAEGKRADLIL 357
T NPA A GL+ G + GKRADL+L
Sbjct: 410 TINPAIAHGLSHEIGSLEVGKRADLVL 436


58  YpsIP31758_3578  YpsIP31758_3596  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias     Product
YpsIP31758_3578  1       13       3.029666    GIY-YIG nuclease superfamily protein
YpsIP31758_3579  0       10       2.805891    acetyltransferase
YpsIP31758_3580  -1      11       2.777757    SCP-2 sterol transfer family protein
YpsIP31758_3581  0       11       2.926525    hypothetical protein
YpsIP31758_3582  -1      10       3.308976    U32 family peptidase
YpsIP31758_3583  -1      11       2.999112    U32 family peptidase
YpsIP31758_3584  -1      12       2.724805    outer membrane protein oprM
YpsIP31758_3585  0       12       3.313524    RND efflux transporter
YpsIP31758_3586  2       16       2.503213    RND family efflux transporter MFP subunit
YpsIP31758_3587  3       26       1.862120    hypothetical protein
YpsIP31758_3588  5       32       1.308903    DNA-binding protein
YpsIP31758_3589  4       31       1.446174    hypothetical protein
YpsIP31758_3590  5       32       1.505199    ATP-dependent RNA helicase DeaD
YpsIP31758_3591  5       29       0.400594    lipoprotein NlpI
YpsIP31758_3592  6       30       0.543502    polynucleotide phosphorylase/polyadenylase
YpsIP31758_3593  6       26       0.121135    30S ribosomal protein S15
YpsIP31758_3594  6       25       0.100875    tRNA pseudouridine synthase B
YpsIP31758_3595  4       25       -0.226499   ribosome-binding factor A
YpsIP31758_3596  4       22       -0.092525   translation initiation factor IF-2
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_3584  RTXTOXIND  38  8e-05  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 37.9 bits (88), Expect = 8e-05
Identities = 25/177 (14%), Positives = 54/177 (30%), Gaps = 19/177 (10%)

Query: 72 DLRQAIADIEAARAQYGVQRAAQLPTVNAGVNGSRGRGLSDTSDGNNNTAISQSYGAQAS 131
A AD ++ R Q R + LS + + N + +
Sbjct: 128 TALGAEADTLKTQSSLLQARLEQT----------RYQILSRSIELNKLPELKLP--DEPY 175

Query: 132 VSAFELDLFGKKSSLSHAEFETYLATEEAAKTTRITLIADTATAWVTLAADQNQLLLAEE 191
+ + +SL +F T+ + + A+ T + +N + +
Sbjct: 176 FQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKS 235

Query: 192 TLKSAEQSLKLAQLRQKNGIASRIDVAAMETLYQSARADVAQYKTTVAQDKNALDLL 248
L + L ++ V E Y A ++ YK+ + Q ++ +
Sbjct: 236 RLDD------FSSLL-HKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSA 285


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_3585  ACRIFLAVINRP  1153  0.0  Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 1153 bits (2984), Expect = 0.0
Identities = 586/1033 (56%), Positives = 763/1033 (73%), Gaps = 5/1033 (0%)

Query: 3 ARFFIYRPVFAWVIAIVIMLGGVVALETLPIAQYPDVAPPSISIKATYTGASAETLENSV 62
A FFI RP+FAWV+AI++M+ G +A+ LP+AQYP +APP++S+ A Y GA A+T++++V
Sbjct: 2 ANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTV 61

Query: 63 TQVIEQELTGLDGLLYFSSSSGSDGNAKIVATFKQGTNADTAQVQVQNKVQQALTRLPTE 122
TQVIEQ + G+D L+Y SS+S S G+ I TF+ GT+ D AQVQVQNK+Q A LP E
Sbjct: 62 TQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQE 121

Query: 123 VQAQGVTVTKSQTNFLLIMSLYDEKDKHTGTDIADYLVSNLQDPLARLEGVGSVQVFGSQ 182
VQ QG++V KS +++L++ + T DI+DY+ SN++D L+RL GVG VQ+FG+Q
Sbjct: 122 VQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGAQ 181

Query: 183 YAMRIWLNPTKLAAYNLMPSDVQSAITAQNTQVSAGKIGALPSGKEQQLTATVMAQSRLK 242
YAMRIWL+ L Y L P DV + + QN Q++AG++G P+ QQL A+++AQ+R K
Sbjct: 182 YAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFK 241

Query: 243 TPEQFNNIIVKSDSTGAVVRLRDVARVELGNEDYSVTTRLNGHPAAGIAVMLAPGANALA 302
PE+F + ++ +S G+VVRL+DVARVELG E+Y+V R+NG PAAG+ + LA GANAL
Sbjct: 242 NPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANALD 301

Query: 303 TAERVKAKAAEFELNLPDGYKIAYPKDSTDFIKVSVEEVVKTLIEAILLVVIVMYIFLQN 362
TA+ +KAK AE + P G K+ YP D+T F+++S+ EVVKTL EAI+LV +VMY+FLQN
Sbjct: 302 TAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQN 361

Query: 363 IRATLIPAIAVPVVLLGTFGVLAIFGYSINTLTLFGMVLSIGLLVDDAIVVVENVERVMR 422
+RATLIP IAVPVVLLGTF +LA FGYSINTLT+FGMVL+IGLLVDDAIVVVENVERVM
Sbjct: 362 MRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMM 421

Query: 423 EDNLPPREATEKSMSEIASALIGIALVLSAVFLPMAFFGGATGVIYRQFSITIVSAMALS 482
ED LPP+EATEKSMS+I AL+GIA+VLSAVF+PMAFFGG+TG IYRQFSITIVSAMALS
Sbjct: 422 EDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALS 481

Query: 483 VLVALTLTPALCATFLKPNHKPPSEH--GFFGGFNRRYDRMQTRYESLVGHVIHRSLRYL 540
VLVAL LTPALCAT LKP E+ GFFG FN +D Y + VG ++ + RYL
Sbjct: 482 VLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRYL 541

Query: 541 LIYAVLIGVMCVLFIRLPTGFLPTEDQGDVMVQYTLPAGATSGRTMEVSKAVENYFMTQE 600
LIYA+++ M VLF+RLP+ FLP EDQG + LPAGAT RT +V V +Y++ E
Sbjct: 542 LIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKNE 601

Query: 601 KDNTKAVFTISGFGFSGSGQNAGMAFIALKHWRDRPGSENTATAIADRAMKALSSIRDAQ 660
K N ++VFT++GF FSG QNAGMAF++LK W +R G EN+A A+ RA L IRD
Sbjct: 602 KANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDGF 661

Query: 661 IFSMTPPAVDGLGQSNGFTFELQATGDTSREQLLTLRDQLISKANKDPI-LASVRANTLQ 719
+ PA+ LG + GF FEL + L R+QL+ A + P L SVR N L+
Sbjct: 662 VIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNGLE 721

Query: 720 QMPQLQVDIDNDKAAALGLSISDVNATLSAAWGGTYINDFIDRGRVKKVYMQGDVDTRSK 779
Q ++++D +KA ALG+S+SD+N T+S A GGTY+NDFIDRGRVKK+Y+Q D R
Sbjct: 722 DTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFRML 781

Query: 780 PEDLNQWFVRGSSDAMTSFSAFATTRWIYGPETLSRYNGQTSYEIQGQAASGSSSGTAMD 839
PED+++ +VR ++ M FSAF T+ W+YG L RYNG S EIQG+AA G+SSG AM
Sbjct: 782 PEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDAMA 841

Query: 840 QMEKLAAELP-GTSYAWSGLSYQERLASGQALSLYAISILVVFLCLAALYESWSVPFSVM 898
ME LA++LP G Y W+G+SYQERL+ QA +L AIS +VVFLCLAALYESWS+P SVM
Sbjct: 842 LMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVSVM 901

Query: 899 MVIPLGIIGAVAAATLRGLENDIYFQVALLTTLGLASKNAILIVEFAEAAYLR-GEPLVA 957
+V+PLGI+G + AATL +ND+YF V LLTT+GL++KNAILIVEFA+ + G+ +V
Sbjct: 902 LVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGVVE 961

Query: 958 AALQGAATRLRPILMTSLAFIAGVMPLAMSTGAGANSRISIGSGIIGGTLTATVLAVFFV 1017
A L RLRPILMTSLAFI GV+PLA+S GAG+ ++ ++G G++GG ++AT+LA+FFV
Sbjct: 962 ATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIFFV 1021

Query: 1018 PLFFVLIRRVFSG 1030
P+FFV+IRR F G
Sbjct: 1022 PVFFVVIRRCFKG 1034


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_3586  RTXTOXIND  44  8e-07  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 43.7 bits (103), Expect = 8e-07
Identities = 21/103 (20%), Positives = 38/103 (36%), Gaps = 10/103 (9%)

Query: 91 ASYQAAYDTAKAALQNVQVSVKSAKLKAQRYAALAKENGVSQQDADDAQTSYQQALANVA 150
K+ L+ ++ + SAK + Q L K + +Q N+
Sbjct: 262 VEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKN---------EILDKLRQTTDNIG 312

Query: 151 EKTAALETARINLAYTQVRAPISGRI-GISSVTPGALVTANQT 192
T L + +RAP+S ++ + T G +VT +T
Sbjct: 313 LLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAET 355



Score = 43.7 bits (103), Expect = 1e-06
Identities = 36/210 (17%), Positives = 75/210 (35%), Gaps = 16/210 (7%)

Query: 52 TVAAMTSEVRPQVDGIIKKRLFTEGSEVTAGQVLYQIDPASYQAAYDTAKAALQNVQVSV 111
T + + E++P + I+K+ + EG V G VL ++ A+A Q S+
Sbjct: 91 THSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTAL-------GAEADTLKTQSSL 143

Query: 112 KSAKLKAQRYAALAKENGVSQQDADDAQTSYQQALANVAEKTAALETARINLAYTQVRAP 171
A+L+ RY L++ +++ + NV+E+ T+ I ++ +
Sbjct: 144 LQARLEQTRYQILSRSIELNKLPELKL--PDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQ 201

Query: 172 -ISGRIGISSVTPGALVTANQTTALATIRNLDPIYVDLTQSSAQLLALRKQQQAGNDTVA 230
+ + A + T LA I + + +L +Q V
Sbjct: 202 KYQKELNL------DKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVL 255

Query: 231 NAPVQLTLEDGSVYAHEGSLQLTEVAVDEA 260
+ + ++ L+ E + A
Sbjct: 256 EQENKYVEAVNELRVYKSQLEQIESEILSA 285


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_3596  TCRTETOQM  71  1e-14  Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 70.7 bits (173), Expect = 1e-14
Identities = 38/133 (28%), Positives = 59/133 (44%), Gaps = 18/133 (13%)

Query: 398 IMGHVDHGKTSLLDYI-----RSTKVASGEAG-------------GITQHIGAYHVETEN 439
++ HVD GKT+L + + T++ S + G GIT G + EN
Sbjct: 8 VLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITSFQWEN 67

Query: 440 GMITFLDTPGHAAFTSMRARGAQATDIVVLVVAADDGVMPQTIEAIQHAKAANVPVVVAV 499
+ +DTPGH F + R D +L+++A DGV QT + +P + +
Sbjct: 68 TKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGIPTIFFI 127

Query: 500 NKIDKPEADPDRV 512
NKID+ D V
Sbjct: 128 NKIDQNGIDLSTV 140


59  YpsIP31758_3633  YpsIP31758_3649  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias     Product
YpsIP31758_3633  3       21       0.796392    peptidyl-prolyl cis-trans isomerase
YpsIP31758_3634  4       26       0.335567    opacity-associated protein A
YpsIP31758_3635  1       22       -0.142562   hypothetical protein
YpsIP31758_3636  3       24       0.566143    50S ribosomal protein L9
YpsIP31758_3637  0       21       0.688196    30S ribosomal protein S18
YpsIP31758_3638  -1      18       2.530065    primosomal replication protein N
YpsIP31758_3639  -2      15       3.333628    30S ribosomal protein S6
YpsIP31758_3640  -3      17       3.032585    esterase
YpsIP31758_3641  0       14       3.538870    hypothetical protein
YpsIP31758_3642  -1      13       2.997113    biofilm stress and motility protein A
YpsIP31758_3643  0       15       2.876450    isovaleryl CoA dehydrogenase
YpsIP31758_3644  0       17       2.304738    23S rRNA (guanosine-2'-O-)-methyltransferase
YpsIP31758_3645  2       19       1.746776    hypothetical protein
YpsIP31758_3646  2       19       1.884751    exoribonuclease R
YpsIP31758_3647  3       20       0.784159    NsrR family transcriptional regulator
YpsIP31758_3648  3       22       1.012637    adenylosuccinate synthetase
YpsIP31758_3649  2       19       0.989448    hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_3633  INFPOTNTIATR  162  2e-52  Macrophage infectivity potentiator signature.
		>INFPOTNTIATR#Macrophage infectivity potentiator signature.

Length = 233

Score = 162 bits (412), Expect = 2e-52
Identities = 74/207 (35%), Positives = 115/207 (55%), Gaps = 5/207 (2%)

Query: 3 TPSFDSVEAQASYGIGLQIGQQLQESGLQGLLPEALLAGLRDAMEGN----TPTVPVDVI 58
S + + + SY IG +G+ + G+ + P+ L G++D M G T DV+
Sbjct: 24 ATSLTTDKDKLSYSIGADLGKNFKNQGID-INPDVLAKGMQDGMSGAQLILTEEQMKDVL 82

Query: 59 HRALQEVHEKADKVRVERQQALVDEGKTFLEENAKRDDVTTTESGLQFSVLQAGDGPIPS 118
+ +++ K ++ + +G FL N + + SGLQ+ ++ AG G P
Sbjct: 83 SKFQKDLMAKRSAEFNKKAEENKAKGDAFLSANKSKPGIVVLPSGLQYKIIDAGTGAKPG 142

Query: 119 RQDRVRVHYTGRLVDGTVFDSSVERGQPADFPVSGVIPGWIEALSMMPVGSKWKLYIPHN 178
+ D V V YTG L+DGTVFDS+ + G+PA F VS VIPGW EAL +MP GS W++++P +
Sbjct: 143 KSDTVTVEYTGTLIDGTVFDSTEKAGKPATFQVSQVIPGWTEALQLMPAGSTWEVFVPAD 202

Query: 179 LAYGERGAGATIPPFSALMFEVELLEI 205
LAYG R G I P L+F++ L+ +
Sbjct: 203 LAYGPRSVGGPIGPNETLIFKIHLISV 229


60  YpsIP31758_3674  YpsIP31758_3715  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias     Product
YpsIP31758_3674  2       31       0.225274    hypothetical protein
YpsIP31758_3675  2       31       0.440591    chaperonin GroEL
YpsIP31758_3676  0       17       0.866614    co-chaperonin GroES
YpsIP31758_3677  -1      14       1.932688    FxsA protein
YpsIP31758_3678  -1      15       2.778446    aspartate ammonia-lyase
YpsIP31758_3679  0       14       3.847800    anaerobic C4-dicarboxylate transporter
YpsIP31758_3680  0       13       3.917494    divalent-cation tolerance protein CutA
YpsIP31758_3681  -1      13       2.874398    thiol:disulfide interchange protein
YpsIP31758_3682  -1      13       2.437211    anaerobic formate dehydrogenase subunit alpha,
YpsIP31758_3683  0       14       1.305232    iron-sulfur cluster-binding protein
YpsIP31758_3684  1       15       0.602600    oxidoreductase Fe-S binding subunit
YpsIP31758_3685  2       19       -0.750573   transcriptional regulator
YpsIP31758_3686  2       20       -0.727456   *phage integrase
YpsIP31758_3687  1       16       3.135703    helicase
YpsIP31758_3688  1       16       2.493551    hypothetical protein
YpsIP31758_3689  1       16       1.519341    hypothetical protein
YpsIP31758_3690  0       16       1.445178    antirestriction protein
YpsIP31758_3691  2       16       0.692033    hypothetical protein
YpsIP31758_3692  2       16       -0.284325   RHS/YD repeat-containing protein
YpsIP31758_3693  6       33       -8.422327   hypothetical protein
YpsIP31758_3694  1       20       -4.918303   hypothetical protein
YpsIP31758_3695  -1      15       -1.113125   hypothetical protein
YpsIP31758_3696  -1      14       -0.081546   hypothetical protein
YpsIP31758_3697  -1      14       1.022080    hypothetical protein
YpsIP31758_3698  2       16       0.237366    hypothetical protein
YpsIP31758_3699  2       16       1.099365    hypothetical protein
YpsIP31758_3700  1       14       1.997715    hypothetical protein
YpsIP31758_3701  1       14       1.574103    hypothetical protein
YpsIP31758_3702  1       12       1.348252    hypothetical protein
YpsIP31758_3703  1       13       1.611996    hypothetical protein
YpsIP31758_3704  1       13       3.536531    lipoprotein
YpsIP31758_3705  0       13       3.193222    hypothetical protein
YpsIP31758_3706  -1      15       2.512523    lipoprotein
YpsIP31758_3707  -1      16       2.413803    hypothetical protein
YpsIP31758_3708  -1      19       2.667869    hypothetical protein
YpsIP31758_3709  2       19       -3.807770   hypothetical protein
YpsIP31758_3710  4       32       -8.141284   hypothetical protein
YpsIP31758_3711  5       36       -10.626329  hypothetical protein
YpsIP31758_3712  4       33       -9.470816   hypothetical protein
YpsIP31758_3713  1       22       -5.934214   hypothetical protein
YpsIP31758_3714  1       22       -5.815429   hypothetical protein
YpsIP31758_3715  0       19       -3.710626   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_3676  TYPE3OMOPROT  27  0.018  Type III secretion system outer membrane O protein ...
		>TYPE3OMOPROT#Type III secretion system outer membrane O protein family signature.

Length = 303

Score = 26.5 bits (58), Expect = 0.018
Identities = 19/73 (26%), Positives = 29/73 (39%), Gaps = 14/73 (19%)

Query: 9 RVIVKRKEVESKSAGGIVLTGTAAGKSTRGEVLAVGNGRILDNGEIKPLDVKVGDVVIFN 68
R V E+E+ ++ T A V + NG +L NGE+ V N
Sbjct: 240 RKNVTLAELEAMGQQQLLSLPTNAE----LNVEIMANGVLLGNGEL----------VQMN 285

Query: 69 DGYGVKAEKIDNE 81
D GV+ + +E
Sbjct: 286 DTLGVEIHEWLSE 298


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_3680  AUTOINDCRSYN  28  0.007  Autoinducer synthesis protein signature.
		>AUTOINDCRSYN#Autoinducer synthesis protein signature.

Length = 216

Score = 27.9 bits (62), Expect = 0.007
Identities = 15/46 (32%), Positives = 23/46 (50%), Gaps = 1/46 (2%)

Query: 60 EGKLEQEYEVQLLFKSNTDH-QQALLTYIKQHHPYQTPELLVLPVR 104
+G E+E V L+F D Q+AL I + + + EL P+R
Sbjct: 163 QGLSEKEERVYLVFLPVDDENQEALARRINRSGTFMSNELKQWPLR 208


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_3685  HTHTETR  48  3e-09  TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 48.1 bits (114), Expect = 3e-09
Identities = 34/173 (19%), Positives = 60/173 (34%), Gaps = 11/173 (6%)

Query: 3 REQVLSNALNLLEQQGLANTTLEMLAKALSVEVSDLTRFWPDREALLYDCLRYHSQQIDT 62
R+ +L AL L QQG+++T+L +AKA V + + D+ L + I
Sbjct: 13 RQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIGE 72

Query: 63 WRRQLQLDETLSPQQKLLARY-QTLSEQVQNQRYPGCLFIAACSFYPDTEH----PIHQL 117
+ Q P L L V +R + F+ + Q
Sbjct: 73 LELEYQAKFPGDPLSVLREILIHVLESTVTEERRRL---LMEIIFHKCEFVGEMAVVQQA 129

Query: 118 AEQQKQASLHYTKALLQEMDAD---DADMVAQQMELILEGCLSKLLIKRQLAD 167
S + L+ AD++ ++ +I+ G +S L+ A
Sbjct: 130 QRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLFAP 182


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_3703  PF01540  31  0.020  Adhesin lipoprotein
		>PF01540#Adhesin lipoprotein

Length = 475

Score = 30.9 bits (69), Expect = 0.020
Identities = 27/97 (27%), Positives = 42/97 (43%), Gaps = 16/97 (16%)

Query: 88 ISRLSVEIQTNKSEHQKKTEMVTRVKEQCENAIKGCVAEIKKTALWDLMEGAKQGPRLYQ 147
IS+LS ++ KSE QK + ++ + EN + EGAK+ +L +
Sbjct: 92 ISKLSAAVENAKSEQQKVDQANKKIAD--ENL--------------KIKEGAKELLKLSE 135

Query: 148 TIISHTDTINTSTAELEAAFRQLLSSQGNHLSPLAEL 184
I S DTI + +LE Q+ + L EL
Sbjct: 136 KIQSFADTIALTITKLEGKKFQIDETFKKQLISTIEL 172


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_3706  PERTACTIN  29  0.004  Pertactin signature.
		>PERTACTIN#Pertactin signature.

Length = 922

Score = 29.3 bits (65), Expect = 0.004
Identities = 16/60 (26%), Positives = 25/60 (41%), Gaps = 5/60 (8%)

Query: 30 LPPGGSTMLTLWNDSSSATHATREGRTTLRRTLMEPLDLVTETGSRIQENEIQDRFPRLP 89
PP S + + A +GR L R L EP+ L G++ Q + + P +P
Sbjct: 348 FPPPASPLSITLQAGARA-----QGRALLYRVLPEPVKLTLAGGAQGQGDIVATELPPIP 402


61  YpsIP31758_3736  YpsIP31758_3764  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias     Product
YpsIP31758_3736  2       23       -0.714999   pilus biogenesis protein
YpsIP31758_3737  4       25       -2.532746   PilT domain-containing protein
YpsIP31758_3738  2       19       -1.015567   prevent-host-death family protein
YpsIP31758_3739  1       19       -1.034178   hypothetical protein
YpsIP31758_3740  2       17       -0.459068   hypothetical protein
YpsIP31758_3741  3       16       0.645344    hypothetical protein
YpsIP31758_3742  3       16       0.766953    hypothetical protein
YpsIP31758_3743  3       14       0.750472    hypothetical protein
YpsIP31758_3744  2       20       -4.790376   hypothetical protein
YpsIP31758_3745  2       21       -5.298422   hypothetical protein
YpsIP31758_3746  2       22       -6.136642   hypothetical protein
YpsIP31758_3747  2       21       -7.035595   replicative DNA helicase
YpsIP31758_3748  2       22       -7.754230   hypothetical protein
YpsIP31758_3749  1       23       -7.743633   viral enhancin protein
YpsIP31758_3750  1       13       -0.319825   hypothetical protein
YpsIP31758_3751  0       13       1.288296    hypothetical protein
YpsIP31758_3752  0       14       2.341883    rhamnose-proton symporter
YpsIP31758_3753  -1      13       2.707171    transcriptional activator RhaR
YpsIP31758_3754  0       14       2.983417    transcriptional activator RhaS
YpsIP31758_3755  0       15       4.032482    rhamnulokinase
YpsIP31758_3756  -2      15       3.618520    L-rhamnose isomerase
YpsIP31758_3757  -2      13       3.157624    rhamnulose-1-phosphate aldolase
YpsIP31758_3758  -3      10       2.756648    lactaldehyde reductase
YpsIP31758_3759  -3      13       3.396366    L-rhamnose 1-epimerase
YpsIP31758_3760  -1      13       3.464515    single-stranded DNA-binding protein
YpsIP31758_3761  -1      14       3.384066    excinuclease ABC subunit A
YpsIP31758_3762  -1      16       3.187847    hypothetical protein
YpsIP31758_3763  -1      15       3.484263    aromatic amino acid aminotransferase
YpsIP31758_3764  -1      17       3.522888    alanine racemase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_3736  FLGHOOKFLIK  31  0.008  Flagellar hook-length control protein signature.
		>FLGHOOKFLIK#Flagellar hook-length control protein signature.

Length = 375

Score = 31.0 bits (69), Expect = 0.008
Identities = 32/115 (27%), Positives = 46/115 (40%), Gaps = 16/115 (13%)

Query: 152 DDVNREVCFTLRDAYQNLVSAKPVVPANVIIPTKNLAQQSGNPFSGSSASALTPVVP--- 208
DD+N +V +L A ++ P P+ L + F+ ++ LT P
Sbjct: 121 DDLNEDVTASL-SALFAMLPGFDNTPKVTDAPSTVLPTEKPTLFTKLTSEQLTTAQPDDA 179

Query: 209 --TLLVPLKPTI-----KPESISIPSPVLTEASKVLAPATAVFSTEAPGKTVSGP 256
T PL P + K E IS PSPV AS ++ P P TV+ P
Sbjct: 180 PGTPAQPLTPLVAEAQSKAEVISTPSPVTAAASPLITPHQT-----QPLPTVAAP 229


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_3740  BCTERIALGSPH  30  0.004  Bacterial general secretion pathway protein H signa...
		>BCTERIALGSPH#Bacterial general secretion pathway protein H signature.

Length = 170

Score = 29.9 bits (67), Expect = 0.004
Identities = 16/80 (20%), Positives = 29/80 (36%), Gaps = 13/80 (16%)

Query: 108 VLIGFVLSGLVADVFTLTKGDHAGESRPALKARL-IAVEWVKISGQLV-----------Y 155
+L+G V +G+V F ++ D A ++ +A+L + +GQ
Sbjct: 16 LLMG-VSAGMVLLAFPASRDDSAAQTLARFEAQLRFVQQRGLQTGQFFGVSVHPDRWQFL 74

Query: 156 KSEKPTSMPPAQNEPQHQSY 175
E PA + Y
Sbjct: 75 VLEARDGADPAPADDGWSGY 94


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_3742  PF05775  31  0.003  Enterobacteria AfaD invasin protein
		>PF05775#Enterobacteria AfaD invasin protein

Length = 142

Score = 30.6 bits (69), Expect = 0.003
Identities = 16/77 (20%), Positives = 28/77 (36%), Gaps = 17/77 (22%)

Query: 146 VLQAAHYGLISHQQRYDRLSDGSRLVREIYGSVMSYRSVAVSRLDAIENNT---VWQQAI 202
QAA L++H+ + L DG + +A R+ + ++ VW A
Sbjct: 20 FSQAADITLMNHKYMGNLLHDGVK--------------LATGRIICQDTHSGFRVWINAR 65

Query: 203 SELGEPNAAVLLGTRRS 219
E G ++ T
Sbjct: 66 QEGGGAGKYIVQSTEGP 82


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_3758  PF07520  30  0.027  Virulence protein SrfB
		>PF07520#Virulence protein SrfB

Length = 1041

Score = 29.6 bits (66), Expect = 0.027
Identities = 16/83 (19%), Positives = 31/83 (37%), Gaps = 5/83 (6%)

Query: 282 ILLPVIEEYNRP---QATRRFARIAQAMGVDTQDMSDE-QASHQAIAAIRQLSLQVGIPA 337
++ VI P + + A + D Q + RQ S++V +P
Sbjct: 639 LVHRVISAIVLPRLQDSIAQAGGQFVAERMRELFGGDIGGQEQQTVQRRRQFSIRVLVPL 698

Query: 338 GFSAL-GIEESDIEGWLDKALAD 359
+ L E+++ +D +AD
Sbjct: 699 AEAILSACEDAEEADRIDIPVAD 721


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
YpsIP31758_3764  ALARACEMASE  446  e-160  Alanine racemase signature.
		>ALARACEMASE#Alanine racemase signature.

Length = 356

Score = 446 bits (1150), Expect = e-160
Identities = 148/357 (41%), Positives = 217/357 (60%), Gaps = 4/357 (1%)

Query: 2 KAATAVIDRHALRHNLQQIRRLAPQSRLVAVVKANAYGHGLLAAAHTLQDADCYGVARIS 61
+ A +D AL+ NL +R+ A +R+ +VVKANAYGHG+ + D + + +
Sbjct: 3 RPIQASLDLQALKQNLSIVRQAATHARVWSVVKANAYGHGIERIWSAIGATDGFALLNLE 62

Query: 62 EALMLRAGGIVKPILLLEGFFDAEDLPVLVANHIETAVHSLEQLVALEAARLSAPINAWM 121
EA+ LR G PIL+LEGFF A+DL + + + T VHS QL AL+ ARL AP++ ++
Sbjct: 63 EAITLRERGWKGPILMLEGFFHAQDLEIYDQHRLTTCVHSNWQLKALQNARLKAPLDIYL 122

Query: 122 KLDTGMHRLGVRPDQAEAFYQRLSACRNVIQPVNIMSHFSRADEPEVAATQQQLACFDAF 181
K+++GM+RLG +PD+ +Q+L A NV + + +MSHF+ A+ P+ +A +
Sbjct: 123 KVNSGMNRLGFQPDRVLTVWQQLRAMANVGE-MTLMSHFAEAEHPD--GISGAMARIEQA 179

Query: 182 AAGKPGKQSIAASGGILRWPQAHRDWVRPGIVLYGVSPF-DAPYGRDFGLLPAMTLKSSL 240
A G ++S++ S L P+AH DWVRPGI+LYG SP + GL P MTL S +
Sbjct: 180 AEGLECRRSLSNSAATLWHPEAHFDWVRPGIILYGASPSGQWRDIANTGLRPVMTLSSEI 239

Query: 241 IAVREHKAGESVGYGGTWVSERDTRLGVIAIGYGDGYPRSAPSGTPVWLNGREVSIVGRV 300
I V+ KAGE VGYGG + + + R+G++A GY DGYPR AP+GTPV ++G VG V
Sbjct: 240 IGVQTLKAGERVGYGGRYTARDEQRIGIVAAGYADGYPRHAPTGTPVLVDGVRTMTVGTV 299

Query: 301 SMDMISIDLGPESTDKVGDEALMWGAELPVERVAACTGISAYELITNLTSRVAMEYL 357
SMDM+++DL P +G +WG E+ ++ VAA G YEL+ L RV + +
Sbjct: 300 SMDMLAVDLTPCPQAGIGTPVELWGKEIKIDDVAAAAGTVGYELMCALALRVPVVTV 356


62  YpsIP31758_3774  YpsIP31758_3785  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias     Product
YpsIP31758_3774  2       13       1.577717    chorismate pyruvate lyase
YpsIP31758_3775  2       13       1.433986    autotransporter protein
YpsIP31758_3776  0       18       -1.453924   oxidoreductase, FAD-binding
YpsIP31758_3777  2       17       -2.352526   hypothetical protein
YpsIP31758_3778  2       17       -1.875082   hypothetical protein
YpsIP31758_3779  1       15       -1.268272   hypothetical protein
YpsIP31758_3780  1       16       -2.586841   hypothetical protein
YpsIP31758_3781  -1      14       -2.149031   periplasmic chaperone protein
YpsIP31758_3782  -1      14       -1.804019   fimbrial usher protein
YpsIP31758_3783  3       19       -0.036150   hypothetical protein
YpsIP31758_3784  4       19       0.323869    tellurium resistance protein
YpsIP31758_3785  3       18       0.634827    tellurium resistance protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits       Score  E-value  Comments
YpsIP31758_3775  PERTACTIN  68     2e-13    Pertactin signature.
		>PERTACTIN#Pertactin signature.

Length = 922

Score = 67.8 bits (165), Expect = 2e-13
Identities = 106/458 (23%), Positives = 160/458 (34%), Gaps = 66/458 (14%)

Query: 846 SGEGAWLTLDTVLGD---------DDSATDRLVINGDATGTTSVRVNNAGGLGDKTLNGI 896
+G L +DT+ G D +D+LV+ DA+G + V N+G + N +
Sbjct: 463 AGRFKVLMVDTLAGSGLFRMNVFADLGLSDKLVVMRDASGQHRLWVRNSGSEPA-SGNTM 521

Query: 897 NLITVDGLAQDDTFLLAGDYVTTDGYQAVVGGAYAYTLQADGEA--------ATAGRNWY 948
L+ + TF LA DG V G Y Y L A+G A
Sbjct: 522 LLVQTPRGSAA-TFTLA----NKDG--KVDIGTYRYRLAANGNGQWSLVGAKAPPAPKPA 574

Query: 949 LSSELMLTEGVRYQAGVPLYEQYPQVLAALNTLPTLQQRVGNRYGAPGALA----DLNFD 1004
P Q PQ P Q G A A +
Sbjct: 575 PQPGPQPGPQPPQPPQPPQPPQPPQPPQRQPEAPAPQPPAGRELSAAANAAVNTGGVGLA 634

Query: 1005 DNQW----------------------AWGRIEGSHQVTDPARSTSGSQREIDVWKLQTGI 1042
W AWGR Q D + +G + + V + G
Sbjct: 635 STLWYAESNALSKRLGELRLNPDAGGAWGRGFAQRQQLD---NRAGRRFDQKVAGFELGA 691

Query: 1043 DVPLYQSQGGSLLTGGVNFTYGKAKADIHSFFGDGRINSAGYGLGTSLTWYGNNGVYVDG 1102
D + + G L G +T G F GDG ++ +G T+ N+G Y+D
Sbjct: 692 DHAVAVAGGRWHLGGLAGYTRGD-----RGFTGDGGGHTDSVHVGGYATYIANSGFYLDA 746

Query: 1103 QLQTMWFDSDLS-SRTAGHAVASGNNGRGYTSAIEAGKGYALGNGLSLTPQMQVTYSRVD 1161
L+ ++D + + G+AV G ++EAG+ +A +G L PQ ++ RV
Sbjct: 747 TLRASRLENDFKVAGSDGYAVKGKYRTHGVGVSLEAGRRFAHADGWFLEPQAELAVFRVG 806

Query: 1162 FDTFRDPFDSEVSLQEGDSLRGRLGVSLDKETTWSAKDGTTRRSHIYSHLDLHNEFLNGS 1221
+R V + G S+ GRLG+ + K R+ Y + EF
Sbjct: 807 GGAYRAANGLRVRDEGGSSVLGRLGLEVGKRIEL----AGGRQVQPYIKASVLQEFDGAG 862

Query: 1222 KVQVSGVEFAT--RDERQSVGLGAGGTYEWQNGRYAVY 1257
V+ +G+ T R R +GLG + YA Y
Sbjct: 863 TVRTNGIAHRTELRGTRAELGLGMAAALGRGHSLYASY 900


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
YpsIP31758_3782  PF00577  360    e-114    Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 360 bits (925), Expect = e-114
Identities = 180/876 (20%), Positives = 320/876 (36%), Gaps = 97/876 (11%)

Query: 1 MVARCINIQCIAFLFSFFPTLAFPVTEEG-EVVFDIETLERLGYSAELAKFFSGQDRFLP 59
+ R + A E+ F+ L + F P
Sbjct: 16 LHIRKHRLAGFFVRLFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPP 75

Query: 60 GQHDVTIIINASKTYRIAATFDSE-----GKLCMDKALLMALKLR-------NTESDGSC 107
G + V I +N TF++ C+ +A L ++ L N +D +C
Sbjct: 76 GTYRVDIYLNNGYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDAC 135

Query: 108 ENMEARWPGMVVKLFPGQFRVEITLPQEAFDPEMEG----SEYQQGGHALLLNYNIFGQR 163
+ + +L GQ R+ +T+PQ G + G +A LLNYN G
Sbjct: 136 VPLTSMIHDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNS 195

Query: 164 VESNNS-RFNLVQGQFEPGINFKNWVLRNRGSYSYNQGVSQ------YYNQETSALRAVE 216
V++ + + G+N W LR+ ++SYN S + + T R +
Sbjct: 196 VQNRIGGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDII 255

Query: 217 SLKSVVQLGEFGLVGNTFSGLPVTGIQLYSDNAQRDDTQ--LIVPIEGIANTNATIEIRQ 274
L+S + LG+ G+ F G+ G QL SD+ D+Q I GIA A + I+Q
Sbjct: 256 PLRSRLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQ 315

Query: 275 RGRVIYRTIVAPGPFSLSNISNFSSGVNTDVSIIEEDGTQQNFTV-TSALDINAEQQASI 333
G IY + V PGPF++++I + + V+I E DG+ Q FTV S++ + + +
Sbjct: 316 NGYDIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTR 375

Query: 334 YQLAVGRYRDMFTGEDRPSPLLLSGEMS--FNPAATFYMTSAGLLSSGYQNIRVQNLYSG 391
Y + G YR + P + T Y L+ Y+ N G
Sbjct: 376 YSITAGEYRS--GNAQQEKPRFFQSTLLHGLPAGWTIY--GGTQLADRYRAF---NFGIG 428

Query: 392 WDQAWF---SAAASYANTKDAGQGYQFSVQNQMTINGNFGVSWSSV------YGSANYWS 442
+ S + AN+ + N + S +++ Y ++ Y++
Sbjct: 429 KNMGALGALSVDMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFN 488

Query: 443 PDDALSSSNNLNDL------------------MFGKLKNATSVAVSWVHPRWGAFSYALS 484
D S N ++ + + + V+ R + S
Sbjct: 489 FADTTYSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRTSTLYLSGS 548

Query: 485 NNMYYQASGR-TYHIFSISEQFGRATTILS-----SQLSSQGQNSLYVGINMPLG----- 533
+ Y+ S ++ F LS + L + +N+P
Sbjct: 549 HQTYWGTSNVDEQFQAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRS 608

Query: 534 -------NGTLSGRVQR-NNGNVALGSTYQGRWGDNKDYSVGISGD-------NRQRRIN 578
+ + S + NG + + G ++ + S + N
Sbjct: 609 DSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGY 668

Query: 579 GSMNIRTAYSQLTGGVSQATNNSRSAYLSSRGSVAYVNNTFATSSSSVGDTFAVVNIPNQ 638
++N R Y G S + ++ + Y G V + T + DT +V P
Sbjct: 669 ATLNYRGGYGNANIGYSHS-DDIKQLYYGVSGGVL-AHANGVTLGQPLNDTVVLVKAPGA 726

Query: 639 PGLRVSSPSSGIAITDYAGIALLPLVRPYTASKVQISTQTLPLNIRLNNTSADLLMTRGS 698
+V + + TD+ G A+LP Y ++V + T TL N+ L+N A+++ TRG+
Sbjct: 727 KDAKVENQT--GVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPTRGA 784

Query: 699 VATHHFETTETRQLLLTIRGSDGEMLPIGANVLDEKGNFLGTIIGDGNFMLENKAIGVTL 758
+ F+ +LL+T+ + + LP GA V E G + +G L + +
Sbjct: 785 IVRAEFKARVGIKLLMTLT-HNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLAGKV 843

Query: 759 RVKANNRDE--CRVNYREPEKFDPDVLYEVADAVCQ 792
+VK + C NY+ P + +L ++ A C+
Sbjct: 844 QVKWGEEENAHCVANYQLPPESQQQLLTQL-SAECR 878


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
YpsIP31758_3785  PF07824  33     3e-04    Type III secretion chaperone
		>PF07824#Type III secretion chaperone

Length = 120

Score = 32.6 bits (74), Expect = 3e-04
Identities = 15/85 (17%), Positives = 34/85 (40%), Gaps = 13/85 (15%)

Query: 85 EGDDESLKIKLPLI--PADVDKIVFVVTIHDAQARRQSFGQVANAFIRLVNDDNGVEIAR 142
E + +S+ + P P +++ +++ ++++ + +D+ G IAR
Sbjct: 35 EKEGDSINLLCPFCALPENINDLIYALSLN-----------YSEKICLATDDEGGSLIAR 83

Query: 143 YDLSEDASTETAMLFGELYRHNAEW 167
DL+ E + E Y W
Sbjct: 84 LDLTGINEFEDIYVNTEYYISRVRW 108


63  YpsIP31758_3799  YpsIP31758_3834  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
YpsIP31758_3799  1       19       3.228251   NmrA family protein
YpsIP31758_3800  1       20       3.329226   hemin uptake protein
YpsIP31758_3801  0       15       2.328305   TonB-dependent hemin receptor HmuR
YpsIP31758_3802  -1      12       -0.054920  hemin transport protein HmuS
YpsIP31758_3803  0       13       -2.017212  hemin ABC transporter periplasmic protein
YpsIP31758_3804  2       16       -2.584484  hemin ABC transporter permease
YpsIP31758_3805  1       15       -2.968573  hemin importer ATP-binding subunit
YpsIP31758_3806  1       15       -3.513295  cystathionine beta-lyase
YpsIP31758_3807  1       12       -2.619634  serine transporter
YpsIP31758_3808  1       14       -0.560588  LysR family substrate binding transcriptional
YpsIP31758_3809  0       18       1.414712   hypothetical protein
YpsIP31758_3810  0       17       1.978186   inner membrane protein
YpsIP31758_3811  -1      17       1.505037   secretion system apparatus protein SsaU
YpsIP31758_3812  -1      20       2.751020   type III secretion apparatus protein
YpsIP31758_3813  0       21       4.884925   HrpO family type III secretion protein
YpsIP31758_3814  0       20       4.582773   type III secretion system protein
YpsIP31758_3815  0       19       5.576071   type III secretion system protein
YpsIP31758_3816  -1      18       5.569269   hypothetical protein
YpsIP31758_3817  -1      19       5.925245   type III secretion system protein
YpsIP31758_3818  -1      19       5.099556   type III secretion system ATPase
YpsIP31758_3819  -1      19       3.666236   secretion system apparatus protein SsaV
YpsIP31758_3820  0       21       4.248768   HrpE/YscL family type III secretion apparatus
YpsIP31758_3821  0       23       2.616286   hypothetical protein
YpsIP31758_3823  -1      26       -1.176324  hypothetical protein
YpsIP31758_3822  -1      27       -2.398406  YscJ/HrcJ family type III secretion apparatus
YpsIP31758_3824  -1      25       -2.962039  type III secretion apparatus protein, YscI/HrpB
YpsIP31758_3825  0       24       -3.979481  type III secretion system protein SsaH family
YpsIP31758_3826  0       24       -5.117211  type III secretion apparatus needle protein
YpsIP31758_3827  -1      19       -3.851382  AraC family transcriptional regulator
YpsIP31758_3828  -1      18       -3.488375  type III secretion system protein YseE family
YpsIP31758_3829  -1      15       -3.647195  YscD/HrpQ family type III secretion apparatus
YpsIP31758_3830  -1      13       -3.299554  YscC/HrcC family type III secretion outer
YpsIP31758_3831  -1      14       -3.108605  hypothetical protein
YpsIP31758_3832  -1      14       -2.699017  sensor histidine kinase/response regulator EsrA
YpsIP31758_3833  -2      14       -3.134452  DNA-binding response regulator EsrB
YpsIP31758_3834  -1      15       -3.041018  glutamate/aspartate:proton symporter
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
YpsIP31758_3805  PF05272  28     0.049    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 28.1 bits (62), Expect = 0.049
Identities = 10/21 (47%), Positives = 12/21 (57%)

Query: 39 MVAIIGPNGAGKSTLLRLLTG 59
V + G G GKSTL+ L G
Sbjct: 598 SVVLEGTGGIGKSTLINTLVG 618


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
YpsIP31758_3809  PF01206  92     1e-28    SirA family protein
		>PF01206#SirA family protein

Length = 76

Score = 92.1 bits (229), Expect = 1e-28
Identities = 17/71 (23%), Positives = 37/71 (52%)

Query: 19 DYRLDMVGEPCPYPAVATLEAMPQLKPGEILEVISDCPQSINNIPLDARNYGYTVLDIQQ 78
D LD G CP P + + + + GE+L V++ P S+ + ++ G+ +L+ ++
Sbjct: 5 DQSLDATGLNCPLPILKAKKTLATMNAGEVLYVMATDPGSVKDFESFSKQTGHELLEQKE 64

Query: 79 DGPTIRYLIQR 89
+ T + ++R
Sbjct: 65 EDGTYHFRLKR 75


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_3811  TYPE3IMSPROT  345    e-120    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 345 bits (887), Expect = e-120
Identities = 123/351 (35%), Positives = 198/351 (56%), Gaps = 2/351 (0%)

Query: 2 MSTEKNEKPTPKRLKEAKEKGQVVKSVEITSGVQLVALVIYFLLTGYSLVEQAKALIRSS 61
MS EK E+PTPK++++A++KGQV KS E+ S +VAL + E L+
Sbjct: 1 MSGEKTEQPTPKKIRDARKKGQVAKSKEVVSTALIVALSAMLMGLSDYYFEHFSKLMLIP 60

Query: 62 IIQLQQPLTLALARIGAECMTVLMHIVVVLGGALIVVTIIAGIAQVGPLLATKAVSFKGE 121
Q P + AL+ + + ++ L ++ I + + Q G L++ +A+ +
Sbjct: 61 AEQSYLPFSQALSYVVDNVLLEFFYLCFPLLTVAALMAIASHVVQYGFLISGEAIKPDIK 120

Query: 122 RINPIQNAKQLFSLRSVFELMKSLLKVGVLTLIFGYLLMQYAPSFGYLTHCGSRCALPVF 181
+INPI+ AK++FS++S+ E +KS+LKV +L+++ ++ + L CG C P+
Sbjct: 121 KINPIEGAKRIFSIKSLVEFLKSILKVVLLSILIWIIIKGNLVTLLQLPTCGIECITPLL 180

Query: 182 STLMGWLLGSLIACYLVFSLMDYTFQRYTIMKQLKMSHDEVKREHKDSNGDPHIKQKRRQ 241
++ L+ ++V S+ DY F+ Y +K+LKMS DE+KRE+K+ G P IK KRRQ
Sbjct: 181 GQILRQLMVICTVGFVVISIADYAFEYYQYIKELKMSKDEIKREYKEMEGSPEIKSKRRQ 240

Query: 242 LQHEVQSGSFATNVRRSTAVVRNPTHFAVCLVYHPEETPLPIVIEKGHDEQAALIVSLAE 301
E+QS + NV+RS+ VV NPTH A+ ++Y ETPLP+V K D Q + +AE
Sbjct: 241 FHQEIQSRNMRENVKRSSVVVANPTHIAIGILYKRGETPLPLVTFKYTDAQVQTVRKIAE 300

Query: 302 QSGIPVVENIALARALHRDVACGDTIPEQFFEPVAALLRM--ALELDYQPS 350
+ G+P+++ I LARAL+ D IP + E A +LR ++ Q S
Sbjct: 301 EEGVPILQRIPLARALYWDALVDHYIPAEQIEATAEVLRWLERQNIEKQHS 351


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_3812  TYPE3IMRPROT  141    5e-43    Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 141 bits (356), Expect = 5e-43
Identities = 52/230 (22%), Positives = 105/230 (45%), Gaps = 4/230 (1%)

Query: 5 LPGLTALALAMMRPYGILLILPLFTARSLGSSLLRNGLIVAIALPVTPLFLSAPIITNSS 64
L L ++R ++ P+ + RS+ + + GL + I + P + + S
Sbjct: 10 LSWLNLYFWPLLRVLALISTAPILSERSVPKRV-KLGLAMMITFAIAPSLPANDVPVFS- 67

Query: 65 PVTWIGVLCTELLIGVVMGFVAALPFWAMNMAGFLIDTLRGATMSTLFNPGMGVESSLFG 124
+ + ++LIG+ +GF F A+ AG +I G + +T +P + +
Sbjct: 68 -FFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHLNMPVLA 126

Query: 125 VLFTQILTVLFLISGGFNQVLAALYGSYDSLPIGQGIQPAADLLLFLQTEWQMMFELCLC 184
+ + +LFL G +++ L ++ +LPIG + L + +F L
Sbjct: 127 RIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSL-IFLNGLM 185

Query: 185 FALPALLVMVLADLSLGLINRSARQLNVFFLAMPIKSALALFLLLISLPY 234
ALP + +++ +L+LGL+NR A QL++F + P+ + + L+ +P
Sbjct: 186 LALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPL 235


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_3813  TYPE3IMQPROT  69     3e-19    Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.

Length = 86

Score = 68.6 bits (168), Expect = 3e-19
Identities = 32/79 (40%), Positives = 47/79 (59%)

Query: 10 IVHLATELLWLVLLLSLPVVVVASTVGLVISLVQALTQIQDQTLQFLIKLLAVSATLLMT 69
+V + L+LVL+LS +VA+ +GL++ L Q +TQ+Q+QTL F IKLL V L +
Sbjct: 4 LVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLFLL 63

Query: 70 YHWMGATLLNYTQQSFLQI 88
W G LL+Y +Q
Sbjct: 64 SGWYGEVLLSYGRQVIFLA 82


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_3814  TYPE3IMPPROT  224    1e-76    Type III secretion system inner membrane P protein ...
		>TYPE3IMPPROT#Type III secretion system inner membrane P protein family signature.

Length = 224

Score = 224 bits (572), Expect = 1e-76
Identities = 86/220 (39%), Positives = 143/220 (65%), Gaps = 7/220 (3%)

Query: 4 LNSSYQLIALLFMLSVLPLLVVMGTAFLKLSVVFSLLRNALGVQQVPPNIAIYGLALVLT 63
+ + LIALL ++LP ++ GT F+K S+VF ++RNALG+QQ+P N+ + G+AL+L+
Sbjct: 1 MGNDISLIALLAFSTLLPFIIASGTCFVKFSIVFVMVRNALGLQQIPSNMTLNGVALLLS 60

Query: 64 IFIMAPVGLDVQARLQNEELSNDIGALAHQIDQNALVPYRDFLQRNTDIEQVTFFNDIVQ 123
+F+M P+ D ++E+++ + + + L YRD+L + +D E V FF +
Sbjct: 61 MFVMWPIMHDAYVYFEDEDVTFNDISSLSKHVDEGLDGYRDYLIKYSDRELVQFFENAQL 120

Query: 124 NKWPE-------RYRDSVKPDSLLILMPAFTLSQLNEAFKIGLLLFLPFVAIDLIVSNIL 176
+ R +D ++ S+ L+PA+ LS++ AFKIG L+LPFV +DL+VS++L
Sbjct: 121 KRQYGEETETVKRDKDEIEKPSIFALLPAYALSEIKSAFKIGFYLYLPFVVVDLVVSSVL 180

Query: 177 LAMGMMMVSPMTLSLPFKLLVFVLVDGWSLVLGQLVGSYL 216
LA+GMMM+SP+T+S P KL++FV +DGW+L+ L+ Y+
Sbjct: 181 LALGMMMMSPVTISTPIKLVLFVALDGWTLLSKGLILQYM 220


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_3815  TYPE3OMOPROT  50     3e-09    Type III secretion system outer membrane O protein ...
		>TYPE3OMOPROT#Type III secretion system outer membrane O protein family signature.

Length = 303

Score = 50.4 bits (120), Expect = 3e-09
Identities = 29/111 (26%), Positives = 50/111 (45%), Gaps = 4/111 (3%)

Query: 205 YIKLEGGNRMTIQQINEASDPLACGSRAESLPLAAVQFEDLPQTLVMEIGRLTLPLGEIK 264
+ ++EGG + I + AE+LP LP L + R + L E++
Sbjct: 194 FNRVEGGIIVETLDIQHIEEENNTTETAETLP----GLNQLPVKLEFVLYRKNVTLAELE 249

Query: 265 QLAVGQTLACQTHCYGEVNICLNGQSVGRGSLLRCDEQLVVRIAQWGLQNG 315
+ Q L+ T+ V I NG +G G L++ ++ L V I +W ++G
Sbjct: 250 AMGQQQLLSLPTNAELNVEIMANGVLLGNGELVQMNDTLGVEIHEWLSESG 300


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits       Score  E-value  Comments
YpsIP31758_3817  RTXTOXIND  32     5e-04    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 32.1 bits (73), Expect = 5e-04
Identities = 15/118 (12%), Positives = 38/118 (32%), Gaps = 11/118 (9%)

Query: 5 QQRTLQRLLALRQRQERRLRQQLGQLRREQQQQEQQLENGRRRHQQLCQQLQQLAQWCGI 64
++ + Q Q+ + L + R E+ ++ + +L +
Sbjct: 187 LTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSL--- 243

Query: 65 LTPREADEQKVLRQAVYQAERQAKKQLNAWVAQGRQQVSAIERQ--QARLRRNQREQE 120
+Q + + AV + E + + + + Q+ IE + A+ Q
Sbjct: 244 -----LHKQAIAKHAVLEQENKY-VEAVNELRVYKSQLEQIESEILSAKEEYQLVTQL 295


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_3822  FLGMRINGFLIF  63     1e-13    Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 63.1 bits (153), Expect = 1e-13
Identities = 44/188 (23%), Positives = 70/188 (37%), Gaps = 7/188 (3%)

Query: 7 MLAIVLMTLSLSGCDME-LYSGLSEGEANQMLALLMLHQINAEKQIEKSGMVGLTVDKRQ 65
+ +V M L D L+S LS+ + ++A L I + V +
Sbjct: 35 VAIVVAMVLWAKTPDYRTLFSNLSDQDGGAIVAQLTQMNIPYR--FANGSGA-IEVPADK 91

Query: 66 FINAVELLRQNGFPRQRFITVDELFPANQLVTSPTQEQAKMVFLKEQQLENMLSHMDGVI 125
L Q G P+ + EL + S EQ E +L + + V
Sbjct: 92 VHELRLRLAQQGLPKGGAVGF-ELLDQEKFGISQFSEQVNYQRALEGELARTIETLGPVK 150

Query: 126 HADVTVAMPM-SVDGKNPLPHTASVFIKYSPEVNLQSYQ-SQIKGLVRDAVPGIDYAKIS 183
A V +AMP S+ + +ASV + P L Q S + LV AV G+ ++
Sbjct: 151 SARVHLAMPKPSLFVREQKSPSASVTVTLEPGRALDEGQISAVVHLVSSAVAGLPPGNVT 210

Query: 184 VVMQPANY 191
+V Q +
Sbjct: 211 LVDQSGHL 218


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_3830  TYPE3OMGPROT  478    e-166    Type III secretion system outer membrane G protein ...
		>TYPE3OMGPROT#Type III secretion system outer membrane G protein family signature.

Length = 607

Score = 478 bits (1231), Expect = e-166
Identities = 160/514 (31%), Positives = 269/514 (52%), Gaps = 21/514 (4%)

Query: 4 IYIMRKITGLILLFFATLLPYGKFSYGKAIPWQGEPFFIYSRGMTVSELLKDLGMNYGIP 63
+ R +TG +LL + S+ + + W P+ ++G ++ +LL D G NY
Sbjct: 7 SFFKRVLTGTLLLLSSY-------SWAQELDWLPIPYVYVAKGESLRDLLTDFGANYDAT 59

Query: 64 VVISSEINEHFTGKIRDKTPEKILSELAGRYNITWYYDGETLYFYPVQSIKREFISPDGL 123
VV+S +IN+ +G+ P+ L +A YN+ WYYDG LY + + I
Sbjct: 60 VVVSDKINDKVSGQFEHDNPQDFLQHIASLYNLVWYYDGNVLYIFKNSEVASRLIRLQES 119

Query: 124 AANTLVKYLQRGDVLAGKNCAIKAIPHLDTLEVKGVPICIERVKSVSKMLS--EQVRHQN 181
A L + LQR + + + V G P +E V+ + L Q+R +
Sbjct: 120 EAAELKQALQRSGIWE-PRFGWRPDASNRLVYVSGPPRYLELVEQTAAALEQQTQIRSEK 178

Query: 182 QNKETVKVFPLKYASAADSDYQYRDQNVRLPGLVSVLRELNQGNNLPLAGGNQPDGNQAS 241
+++FPLKYASA+D YRD V PG+ ++L+ + + + QA+
Sbjct: 179 TGALAIEIFPLKYASASDRTIHYRDDEVAAPGVATILQRVLSDATIQQVTVDNQRIPQAA 238

Query: 242 S-----PVFSADPRQNAVIIRDRQANMPIYRSLITQLDQRPIQIEISVTIIDVDAGDISQ 296
+ ADP NA+I+RD MP+Y+ LI LD+ +IE++++I+D++A +++
Sbjct: 239 TRASAQARVEADPSLNAIIVRDSPERMPMYQRLIHALDKPSARIEVALSIVDINADQLTE 298

Query: 297 LGVDWSASASIGGTGV------SFNSTFAKNNAEGFSTVIGDTGNFMVRLNALQKNSRAR 350
LGVDW G S A N A G + R+N L+ A+
Sbjct: 299 LGVDWRVGIRTGNNHQVVIKTTGDQSNIASNGALGSLVDARGLDYLLARVNLLENEGSAQ 358

Query: 351 ILSQPSVVTLNNIQAVLDKNVTFYTKLQGEKVAKLESVTSGSLLRVTPRMIETEGVQEVL 410
++S+P+++T N QAV+D + T+Y K+ G++VA+L+ +T G++LR+TPR++ E+
Sbjct: 359 VVSRPTLLTQENAQAVIDHSETYYVKVTGKEVAELKGITYGTMLRMTPRVLTQGDKSEIS 418

Query: 411 LNLNIQDGQQQASTNSNEPLPEIRNSDISTQATLQVGQSLLLGGFIQDTQIESQNKIPLL 470
LNL+I+DG Q+ +++ E +P I + + T A + GQSL++GG +D + +K+PLL
Sbjct: 419 LNLHIEDGNQKPNSSGIEGIPTISRTVVDTVARVGHGQSLIIGGIYRDELSVALSKVPLL 478

Query: 471 GDIPLLGGLFRSTDKQSHSVVRLFLIKAVPVNAG 504
GDIP +G LFR + + VRLF+I+ ++ G
Sbjct: 479 GDIPYIGALFRRKSELTRRTVRLFIIEPRIIDEG 512


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits    Score  E-value  Comments
YpsIP31758_3832  HTHFIS  80     1e-17    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 79.9 bits (197), Expect = 1e-17
Identities = 35/173 (20%), Positives = 64/173 (36%), Gaps = 14/173 (8%)

Query: 695 HILLVDDSETNRDITGMMLQQLGHQVTRADSGTTALAIGRQHRFDLVLMDIRMPVLDGLA 754
IL+ DD R + L + G+ V + T DLV+ D+ MP +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 755 TTARWRHDPANIDSHCMITALSANASPDEQIKTNQAGMNHYLSKPVTLGQLAEMLDLTAQ 814
R + + +SA + IK ++ G YL KP L E++ + +
Sbjct: 65 LLPRIK----KARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPF---DLTELIGIIGR 117

Query: 815 FQLERGVDLSPQLSEPQPLLDL-ADSALSLKLYQSLQVLIQQAKDAIENLPVL 866
E S + Q + L SA ++Y+ ++ + +L ++
Sbjct: 118 ALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYR----VLARLMQT--DLTLM 164


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits    Score  E-value  Comments
YpsIP31758_3833  HTHFIS  59     2e-12    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 58.7 bits (142), Expect = 2e-12
Identities = 25/127 (19%), Positives = 53/127 (41%), Gaps = 3/127 (2%)

Query: 3 TKLLIVDDHELIIHGIKNMLAAYPRYLIVGQADNGLEVYNLCRQTEPDMVILDLGLPGMD 62
+L+ DD I + L+ V N ++ + D+V+ D+ +P +
Sbjct: 4 ATILVADDDAAIRTVLNQALS--RAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 63 GLDVIIQLLRRWPAMKILTLTARNEEHYASRTFNSGALGYVLKKSPQQILMAAIQTVAIG 122
D++ ++ + P + +L ++A+N A + GA Y+ K L+ I A+
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR-ALA 120

Query: 123 KRYIDPA 129
+ P+
Sbjct: 121 EPKRRPS 127


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits        Score  E-value  Comments
YpsIP31758_3834  V8PROTEASE  31     0.008    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 31.1 bits (70), Expect = 0.008
Identities = 7/43 (16%), Positives = 18/43 (41%)

Query: 293 AYGAPKAITSFVVPTGYSFNLDGSTLYQSIAAIFIAQLYGIEL 335
+ A + + TGY + +T+++S I + ++
Sbjct: 186 SNNAETQVNQNITVTGYPGDKPVATMWESKGKITYLKGEAMQY 228


64  YpsIP31758_4003  YpsIP31758_4015  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
YpsIP31758_4003  6       16       3.803664   glycogen synthase
YpsIP31758_4004  6       16       3.638732   glucose-1-phosphate adenylyltransferase
YpsIP31758_4005  7       16       3.623756   glycogen debranching protein
YpsIP31758_4006  7       17       3.322053   glycogen branching protein
YpsIP31758_4007  7       18       3.441870   sensor histidine kinase
YpsIP31758_4008  8       19       3.726595   invasin
YpsIP31758_4010  -2      15       -1.426951  hypothetical protein
YpsIP31758_4011  -2      15       -1.525900  hypothetical protein
YpsIP31758_4012  -3      14       -1.459127  hypothetical protein
YpsIP31758_4013  -1      19       -2.295372  aspartate-semialdehyde dehydrogenase
YpsIP31758_4014  1       21       -4.086125  hypothetical protein
YpsIP31758_4015  0       18       -3.392582  dITP- and XTP- hydrolase
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
YpsIP31758_4007  PF06580  226    2e-71    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 226 bits (578), Expect = 2e-71
Identities = 65/213 (30%), Positives = 115/213 (53%), Gaps = 2/213 (0%)

Query: 345 LGEGIAHLLSAQILAGEFEQQKQLLAQSEIKLLHAQVNPHFLFNALNTLSVVIRRNPDHA 404
L G + + + + + ++++ L AQ+NPHF+FNALN + +I +P A
Sbjct: 134 LYFGWHFFKNYKQAEIDQWKMASMAQEAQLMALKAQINPHFMFNALNNIRALILEDPTKA 193

Query: 405 RKLVLSLSTFFRKNLKRS-HDVVTLSDEIEHVNAYLEIEKARFADRLAVTVSLPNELMEA 463
R+++ SLS R +L+ S V+L+DE+ V++YL++ +F DRL + +M+
Sbjct: 194 REMLTSLSELMRYSLRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQINPAIMDV 253

Query: 464 RLPAFSLQPVVENAIKHGISQMFSNGRVTLHGKLDDNTLILEVEDNAGL-YQPQPDGDGL 522
++P +Q +VEN IKHGI+Q+ G++ L G D+ T+ LEVE+ L + + G
Sbjct: 254 QVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKESTGT 313

Query: 523 GMSLVDRRIKARYGNEYGITVVSDAEVFTRIII 555
G+ V R++ YG E I + +++
Sbjct: 314 GLQNVRERLQMLYGTEAQIKLSEKQGKVNAMVL 346


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
YpsIP31758_4008  INTIMIN  455    e-139    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 455 bits (1171), Expect = e-139
Identities = 260/843 (30%), Positives = 385/843 (45%), Gaps = 64/843 (7%)

Query: 11 YTLGPGDSIQSIAKKYNITVDELKKLNAYRTFSKP-FASLTTGDEIEVPRKESSF----- 64
YTL G+++ ++K +I + + LN + S+ G +I +P K+ F
Sbjct: 65 YTLKTGETVADLSKSQDINLSTIWSLNKHLYSSESEMMKAEPGQQIILPLKKLPFEYSAL 124

Query: 65 ---------------------FSNNPNENNKKDVDDLLARNAMGAG-----KLLSNDNTS 98
+P+ DD A +L S
Sbjct: 125 PLLGSAPLVAAGGVAGHTNKLTKMSPDVTKSNMTDDKALNYAAQQAASLGSQLQSRSLNG 184

Query: 99 DAASNMARSAVTNEINASSQQWLNQFGTARVQLNVDSDFKLDNSALDLLVPLKDSESSLL 158
D A + A N+ ++ Q WL +GTA V L ++F D S+LD L+P DSE L
Sbjct: 185 DYAKDTALGIAGNQASSQLQAWLQHYGTAEVNLQSGNNF--DGSSLDFLLPFYDSEKMLA 242

Query: 159 FTQLGVRNKDSRNTVNIGAGIRQYQGDWMYGANTFFDNDLTGKNRRVGVGAEVATDYLKF 218
F Q+G R DSR T N+GAG R + + M G N F D D +G N R+G+G E DY K
Sbjct: 243 FGQVGARYIDSRFTANLGAGQRFFLPENMLGYNVFIDQDFSGDNTRLGIGGEYWRDYFKS 302

Query: 219 SANTYFGLTGWHQSRDFSSYDERPADGFDIRTEAYLPAYPQLGGKLMYEKYRGDEVALFG 278
S N YF ++GWH+S + YDERPA+GFDIR YLP+YP LG KLMYE+Y GD VALF
Sbjct: 303 SVNGYFRMSGWHESYNKKDYDERPANGFDIRFNGYLPSYPALGAKLMYEQYYGDNVALFN 362

Query: 279 KDDRQKDPHAVTLGVNYTPVPLVTIGAEHREGKGNNNNTSVNVQLNYRMGQPWNDQIDQS 338
D Q +P A T+GVNYTP+PLVT+G ++R G GN N+ ++Q Y+ +PW+ QI+
Sbjct: 363 SDKLQSNPGAATVGVNYTPIPLVTMGIDYRHGTGNENDLLYSMQFRYQFDKPWSQQIEPQ 422

Query: 339 AVAANRTLAGSRYDLVERNNNIVLDYKKQELIHLVLPDRISGSGGGAITLTAQVRAKYGF 398
V RTL+GSRYDLV+RNNNI+L+YKKQ+++ L +P I+G+ + V++KYG
Sbjct: 423 YVNELRTLSGSRYDLVQRNNNIILEYKKQDILSLNIPHDINGTERSTQKIQLIVKSKYGL 482

Query: 399 SRIEWDATPLENAGG---STSPLTQSSLSVTLPFYQHILRTSNTHTISAVAYDAQGNASN 455
RI WD + L + GG + + LP Y SN + ++A AYD GN+SN
Sbjct: 483 DRIVWDDSALRSQGGQIQHSGSQSAQDYQAILPAYVQ--GGSNVYKVTARAYDRNGNSSN 540

Query: 456 RAVTSIEVTRPETMV----ISHLATTVDNATANGIAANTVQATVTDGDGQPIIGQIINFA 511
+ +I V +V ++ +A A+G A T ATV +
Sbjct: 541 NVLLTITVLSNGQVVDQVGVTDFTADKTSAKADGTEAITYTATVKKNGVAQANVPVSFNI 600

Query: 512 VNTQATLSTTEARTGANGIASTTLTHTVAGVSAVSATLGSSSRSVNTTFVADESTAEITA 571
V+ A LS A T +G A+ TL G VSA + ++N V + +
Sbjct: 601 VSGTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAKTAEMTSALNANAVIFVDQTKASI 660

Query: 572 ANLTVTTNDSVANGSDTNAVRAKVTDAYTNAVANQSVIFSASNGATVIDQTVITNAEGIA 631
+ +VANG D KV V+NQ V F+ + + + T T+ G A
Sbjct: 661 TEIKADKTTAVANGQDAITYTVKVMKG-DKPVSNQEVTFT-TTLGKLSNSTEKTDTNGYA 718

Query: 632 DSTLTNTTAGVSAVTATLGSQS---QQVDTTFKPGSTAAISLMKLADRAVADGIDQNEIQ 688
TLT+TT G S V+A + + + + F +++ V G+
Sbjct: 719 KVTLTSTTPGKSLVSARVSDVAVDVKAPEVEF----FTTLTIDDGNIEIVGTGVKGKLPT 774

Query: 689 VVLRDGTGNAVPNVPMSIQADNGAIVVASTPNTGVDGTINATFTNLRA-GESVVSVTSPA 747
V L+ G N + NG S ++ L+ G + +SV S
Sbjct: 775 VWLQYGQVNLKAS------GGNGKYTWRSANPAIASVDASSGQVTLKEKGTTTISVIS-- 826

Query: 748 LVGMTMTMTFSADQRTAVVSTLAAIDNNAKADGTDTNVVRAWVVDANGNSVPGVSVTFDA 807
T T++ +++ D +T + ++ N + V + A
Sbjct: 827 --SDNQTATYTIATPNSLI-VPNMSKRVTYNDAVNTCKNFGGKLPSSQNELENVFKAWGA 883

Query: 808 GNG 810
N
Sbjct: 884 ANK 886



Score = 84.0 bits (207), Expect = 1e-17
Identities = 73/340 (21%), Positives = 116/340 (34%), Gaps = 29/340 (8%)

Query: 4572 NALADGVARNQVRAHVVDSTGNSVADMAVTFTANRGAQLSKVTVLTDNNGDAVNTLTNSL 4631
+A ADG A V + + A LS + T+ +G A TL +
Sbjct: 569 SAKADGTEAITYTATVKKNGVAQANVPVSFNIVSGTAVLSANSANTNGSGKATVTLKSDK 628

Query: 4632 AGVTVVTAKLGTAGTPLTVDTVFTAGPLATLTLVTTV--DNAFADNSATNTVRATLKDT- 4688
G VV+AK TA ++ T +T + D A + + + T+K
Sbjct: 629 PGQVVVSAK--TAEMTSALNANAVIFVDQTKASITEIKADKTTAVANGQDAITYTVKVMK 686

Query: 4689 TGNPVVGEVVAFAASNGATITATDGGVSNANGIVLATLTNGSAGVSTVTATIE----TLT 4744
PV + V F + G +T+ ++ NG TLT+ + G S V+A + +
Sbjct: 687 GDKPVSNQEVTFTTTLGKLSNSTE--KTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVK 744

Query: 4745 ETTDTTFIVMKNLDVTVNGTTFNGDAGFPTTGFVGATFKVNSGGDNSLYDWSSSAPALVS 4804
F + D + PT + + G N Y W S+ PA+ S
Sbjct: 745 APEVEFFTTLTIDDGNIEIVGTGVKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPAIAS 804

Query: 4805 VSGD-GVVTFNAVFPTGTPAITISATPKGGGSPLSYSFRVNQWFINNSGATLDRVSAIAH 4863
V G VT TIS + N + N + A+
Sbjct: 805 VDASSGQVTLK-----EKGTTTISVISSDNQTATYTIATPNSLIVPNMSKRVTYNDAVNT 859

Query: 4864 CENVGYVMPISTQVTNAATWMSGRRAVGNLWSEWGDFSAY 4903
C+N G +P S + N++ WG + Y
Sbjct: 860 CKNFGGKLPSSQNE------------LENVFKAWGAANKY 887



Score = 79.0 bits (194), Expect = 4e-16
Identities = 74/392 (18%), Positives = 122/392 (31%), Gaps = 21/392 (5%)

Query: 955 VAGAVATITLTTLVNGAVADGANSNSVQAVVSDSEGNPVAGAAVVFSSANATAQITTVIG 1014
V V T A ADG + + A V G A V F+ + TA ++
Sbjct: 554 VVDQVGVTDFTADKTSAKADGTEAITYTATVKK-NGVAQANVPVSFNIVSGTAVLSANSA 612

Query: 1015 TTDADGIATATLTNTVAGTSNVVATIGSITNNIDTT---FVAGAVATITLTTPVNGAVAD 1071
T+ G AT TL + G V A +T+ ++ FV A+IT
Sbjct: 613 NTNGSGKATVTLKSDKPGQVVVSAKTAEMTSALNANAVIFVDQTKASITEIKADKTTAVA 672

Query: 1072 GANSNSVQAVVSDSGGNPVTGATVVFSSTNATAQVTTVIGTTGVDGIATATLTNTVAGTS 1131
V G PV+ V F++T +T T +G A TLT+T G S
Sbjct: 673 NGQDAITYTVKVMKGDKPVSNQEVTFTTTLGKLSNSTE--KTDTNGYAKVTLTSTTPGKS 730

Query: 1132 NVVATIGSITNNI---DTTFVAGAVATITLTTLVNGAVADGANSNSVQAVVSDSGGNPVT 1188
V A + + ++ + F +V V + +Q +
Sbjct: 731 LVSARVSDVAVDVKAPEVEFFTTLTIDDGNIEIVGTGVKGKLPTVWLQ---YGQVNLKAS 787

Query: 1189 GATVVFSSTNATAQVTTVIGTTGVDGIATATLTNTVAGTSNVVATIGSITNNIDTTFVAG 1248
G ++ +A + +V ++G +T GT+ + N T +A
Sbjct: 788 GGNGKYTWRSANPAIASVDASSGQ-------VTLKEKGTTTISVISSD--NQTATYTIAT 838

Query: 1249 AVATITLTTPVNGAVADGANSNSVQAVVSDSGGNSVTGATVVFSSTNATAQVTTVIGTTG 1308
+ I D N+ S N + + + N +
Sbjct: 839 PNSLIVPNMSKRVTYNDAVNTCKNFGGKLPSSQNELENVFKAWGAANKYEYYKSSQTIIS 898

Query: 1309 ADGIATATLTNTVAGTSNVVATIDTVNANIDT 1340
+ VA T ++V N
Sbjct: 899 WVQQTAQDAKSGVASTYDLVKQNPLNNIKASE 930



Score = 74.3 bits (182), Expect = 1e-14
Identities = 80/421 (19%), Positives = 139/421 (33%), Gaps = 41/421 (9%)

Query: 821 DRNGYAENTLTNLAIGTTTVKATTVTDPVGQTVNTHFVAGAVDTITLTTPVNGAVADGAN 880
DRNG + N N+ + T + V D VG T T A ADG
Sbjct: 533 DRNGNSSN---NVLLTITVLSNGQVVDQVGVT-------------DFTADKTSAKADGTE 576

Query: 881 SNSVQAVVSDSGGNPVTGATVVFSSTNATAQVTTVIGTTGVDGIATATLTNTVAGTSNVV 940
+ + A V +G V F+ + TA ++ T G AT TL + G V
Sbjct: 577 AITYTATVKKNGVAQA-NVPVSFNIVSGTAVLSANSANTNGSGKATVTLKSDKPGQVVVS 635

Query: 941 ATIGSITNNIDTT---FVAGAVATITLTTLVNGAVADGANSNSVQAVVSDSE-GNPVAGA 996
A +T+ ++ FV A+IT + A +++ V + PV+
Sbjct: 636 AKTAEMTSALNANAVIFVDQTKASIT-EIKADKTTAVANGQDAITYTVKVMKGDKPVSNQ 694

Query: 997 AVVFSSANATAQITTVIGTTDADGIATATLTNTVAGTSNVVATIGSITNNI---DTTFVA 1053
V F++ +T TD +G A TLT+T G S V A + + ++ + F
Sbjct: 695 EVTFTTTLGKLSNSTE--KTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFFT 752

Query: 1054 GAVATITLTTPVNGAVADGANSNSVQAVVSDSGGNPVTGATVVFSSTNATAQVTTVIGTT 1113
V V + +Q +G ++ +A + +V ++
Sbjct: 753 TLTIDDGNIEIVGTGVKGKLPTVWLQ---YGQVNLKASGGNGKYTWRSANPAIASVDASS 809

Query: 1114 GVDGIATATLTNTVAGTSNVVATIGSITNNIDTTFVAGAVATITLTTLVNGAV-ADGANS 1172
G +T GT+ + N T+ ++ + + D N+
Sbjct: 810 GQ-------VTLKEKGTTTISVISSD---NQTATYTIATPNSLIVPNMSKRVTYNDAVNT 859

Query: 1173 NSVQAVVSDSGGNPVTGATVVFSSTNATAQVTTVIGTTGVDGIATATLTNTVAGTSNVVA 1232
S N + + + N + + VA T ++V
Sbjct: 860 CKNFGGKLPSSQNELENVFKAWGAANKYEYYKSSQTIISWVQQTAQDAKSGVASTYDLVK 919

Query: 1233 T 1233

Sbjct: 920 Q 920



Score = 68.9 bits (168), Expect = 4e-13
Identities = 72/373 (19%), Positives = 127/373 (34%), Gaps = 30/373 (8%)

Query: 3940 SNQVQSKDTIFIADRTTATIRASDLTITRNNALADGVATNAARVIVTDANGNPVPSMFVG 3999
SN V T+ + + +D T + +A ADG V
Sbjct: 539 SNNVLLTITVLSNGQVVDQVGVTDFTADKTSAKADGTEAITYTATVKKNGVAQANVPVSF 598

Query: 4000 YTSDNGALLTPASGMTDSSGTFSTTFTHTTAGISKVTAAIITMGISQAKDAVFIADSSTA 4059
A+L+ S T+ SG + T G V+A M + +AV D + A
Sbjct: 599 NIVSGTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAKTAEMTSALNANAVIFVDQTKA 658

Query: 4060 RVSELIIVKNDSLANNSDRNIVQAHIKDAHGNVITGMNVNFSATENVTLTANTVTTNDQG 4119
++E+ K ++AN D + ++ V F+ T L+ +T T+ G
Sbjct: 659 SITEIKADKTTAVANGQDAITYTVKVMK-GDKPVSNQEVTFT-TTLGKLSNSTEKTDTNG 716

Query: 4120 YAENTLRHNVPVTSAVTATVA----TDLVGLTEDVRFVAGDGARIELFRLNDGAVADGIQ 4175
YA+ TL P S V+A V+ E + D IE+ V +
Sbjct: 717 YAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFFTTLTIDDGNIEIVGTG---VKGKLP 773

Query: 4176 TNRVEARVYDVSDHLVPNSNVVF---SASNGGQLVQEDVQTDASGSAYVTVSNTTSGVTR 4232
T ++ Y + N + SA+ V S VT+ +
Sbjct: 774 TVWLQ---YGQVNLKASGGNGKYTWRSANPAIASVDAS-------SGQVTLKEKGTTT-- 821

Query: 4233 VSVTADGVSASTTTTFIADKDTATLDANLFLITNDNAIANGVIENRVLLQLVDANGNKVS 4292
+SV + S + T T+ + + N ++ + V + + ++ N++
Sbjct: 822 ISVIS---SDNQTATYTIATPNSLIVPN---MSKRVTYNDAVNTCKNFGGKLPSSQNELE 875

Query: 4293 GVEVNFSATNGAS 4305
V + A N
Sbjct: 876 NVFKAWGAANKYE 888



Score = 65.1 bits (158), Expect = 7e-12
Identities = 75/392 (19%), Positives = 120/392 (30%), Gaps = 23/392 (5%)

Query: 1727 VAGAVAAITLTTPVDGAVADGTDSNSVQAVVSDSDGNPVTGATVVFSSTNATAQITTVIG 1786
V V T A ADGT++ + A V +G V F+ + TA ++
Sbjct: 554 VVDQVGVTDFTADKTSAKADGTEAITYTATVKK-NGVAQANVPVSFNIVSGTAVLSANSA 612

Query: 1787 TTGADGIATATLTNTVAGTSNVVATI----DTVNANIDTTFVAGAVATITLSVPVNDATA 1842
T G AT TL + G V A +NAN FV A+IT + + TA
Sbjct: 613 NTNGSGKATVTLKSDKPGQVVVSAKTAEMTSALNANA-VIFVDQTKASIT-EIKADKTTA 670

Query: 1843 DGADTNQVDALVQDANGNAITGAAVVFSSTNGADIIVPTMNTGVNGVASTLLTHTMAGTS 1902
+ + V+ G+ V +T + T T NG A LT T G S
Sbjct: 671 VANGQDAITYTVKVMKGDKPVSNQEVTFTTTLGKLSNSTEKTDTNGYAKVTLTSTTPGKS 730

Query: 1903 NVIATIDTVNANI---DTTFVAGAVATITLSVPVNDATADGADTNQVDALVQDANGNAIT 1959
V A + V ++ + F V T + + +
Sbjct: 731 LVSARVSDVAVDVKAPEVEFFTTLTIDDGNIEIVGTGVKGKLPTVWLQYGQVNLKASGGN 790

Query: 1960 GAAVVFSSANG-ATILSSTMNTGVNGVASTLLTHTQSGVSNVVATIDTVNANIDTTFVAG 2018
G S+ A++ +S+ + +T ++ S TI T N
Sbjct: 791 GKYTWRSANPAIASVDASSGQVTLKEKGTTTISVISSDNQTATYTIATPN---------- 840

Query: 2019 AVAAITLTTPVNGAVADGANSNSVQAVVSDSEGNAVAGAAVVFSSANATAQLTTVIGTTG 2078
+ I D N+ S N + + +AN +
Sbjct: 841 --SLIVPNMSKRVTYNDAVNTCKNFGGKLPSSQNELENVFKAWGAANKYEYYKSSQTIIS 898

Query: 2079 ADGIATATLTNTVAGTSNVIATIDTVNANIDT 2110
+ VA T +++ N
Sbjct: 899 WVQQTAQDAKSGVASTYDLVKQNPLNNIKASE 930



Score = 64.7 bits (157), Expect = 9e-12
Identities = 79/388 (20%), Positives = 130/388 (33%), Gaps = 29/388 (7%)

Query: 2209 VAGAVATITLSVLVNDATADGADTNQVDALVQDANGNAITGAAVVFSSANG-ATILSSTV 2267
V V + A ADG + A V+ NG A V F+ +G A + +++
Sbjct: 554 VVDQVGVTDFTADKTSAKADGTEAITYTATVKK-NGVAQANVPVSFNIVSGTAVLSANSA 612

Query: 2268 NTGADGIASTTLTHTQSGVSNVVATV----DTVNANIDTAFVAGAVATITLSVPVNDATA 2323
NT G A+ TL + G V A +NAN FV A+IT + + TA
Sbjct: 613 NTNGSGKATVTLKSDKPGQVVVSAKTAEMTSALNANA-VIFVDQTKASIT-EIKADKTTA 670

Query: 2324 DGADTNQVDALVQDANGNAITGAAVVFSSTNGATILSSTVNTGADGIASTTLTHTQSGVS 2383
+ + V+ G+ V +T + +ST T +G A TLT T G S
Sbjct: 671 VANGQDAITYTVKVMKGDKPVSNQEVTFTTTLGKLSNSTEKTDTNGYAKVTLTSTTPGKS 730

Query: 2384 NVVATIDTVNANI---DTTFVPGAVATITLSVPVNDATADGADTNQVDALVQDANGNAIT 2440
V A + V ++ + F V T + + +
Sbjct: 731 LVSARVSDVAVDVKAPEVEFFTTLTIDDGNIEIVGTGVKGKLPTVWLQYGQVNLKASGGN 790

Query: 2441 GAAVVFSSANG-ADIIAPTMNTGVNGVASTLLTHTQSGVSNVVATIDTVNANIDTTFVAG 2499
G S+ A + A + + +T ++ S TI T N
Sbjct: 791 GKYTWRSANPAIASVDASSGQVTLKEKGTTTISVISSDNQTATYTIATPN---------- 840

Query: 2500 AVATITLSVPVNDATADGADTNQVDALVQDANGNAITGAAVVFSSANGADIIAPTMNTGV 2559
+ I ++ D +T + ++ N + + +AN +
Sbjct: 841 --SLIVPNMSKRVTYNDAVNTCKNFGGKLPSSQNELENVFKAWGAANKYEYY-----KSS 893

Query: 2560 NGVASTLLTHTQSGVSNVVATIDTVNAN 2587
+ S + Q S V +T D V N
Sbjct: 894 QTIISWVQQTAQDAKSGVASTYDLVKQN 921



Score = 60.1 bits (145), Expect = 2e-10
Identities = 80/397 (20%), Positives = 129/397 (32%), Gaps = 33/397 (8%)

Query: 2016 VAGAVAAITLTTPVNGAVADGANSNSVQAVVSDSEGNAVAGAAVVFSSANATAQLTTVIG 2075
V V T A ADG + + A V G A A V F+ + TA L+
Sbjct: 554 VVDQVGVTDFTADKTSAKADGTEAITYTATVKK-NGVAQANVPVSFNIVSGTAVLSANSA 612

Query: 2076 TTGADGIATATLTNTVAGTSNVIATI----DTVNANIDTTFVAGAVATITLSVPVNDATA 2131
T G AT TL + G V A +NAN FV A+IT + + TA
Sbjct: 613 NTNGSGKATVTLKSDKPGQVVVSAKTAEMTSALNANA-VIFVDQTKASIT-EIKADKTTA 670

Query: 2132 DGADTNQVDALVQDANGNAITGAAVVFSSANGADIIAPTMNTGVNGVASTLLTHTQSGVS 2191
+ + V+ G+ V + + T T NG A LT T G S
Sbjct: 671 VANGQDAITYTVKVMKGDKPVSNQEVTFTTTLGKLSNSTEKTDTNGYAKVTLTSTTPGKS 730

Query: 2192 NVVATIDTVNANI---DTAFVAGAVATITLSVLVNDATADGADTNQVDALVQDANGNAIT 2248
V A + V ++ + F +V T + + +
Sbjct: 731 LVSARVSDVAVDVKAPEVEFFTTLTIDDGNIEIVGTGVKGKLPTVWLQYGQVNLKASGGN 790

Query: 2249 GAAVVFSSANG-ATILSSTVNTGADGIASTTLTHTQSGVSNVVATVDTVNANIDTAFVAG 2307
G S+ A++ +S+ +TT++ S T+ T N
Sbjct: 791 GKYTWRSANPAIASVDASSGQVTLKEKGTTTISVISSDNQTATYTIATPN---------- 840

Query: 2308 AVATITLSVPVNDATADGADTNQVDALVQDANGNAITGAAVVFSSTNGATILSS------ 2361
+ I ++ D +T + ++ N + + + N S
Sbjct: 841 --SLIVPNMSKRVTYNDAVNTCKNFGGKLPSSQNELENVFKAWGAANKYEYYKSSQTIIS 898

Query: 2362 ----TVNTGADGIASTTLTHTQSGVSNVVATIDTVNA 2394
T G+AST Q+ ++N+ A+ A
Sbjct: 899 WVQQTAQDAKSGVASTYDLVKQNPLNNIKASESNAYA 935



Score = 59.7 bits (144), Expect = 3e-10
Identities = 77/388 (19%), Positives = 126/388 (32%), Gaps = 29/388 (7%)

Query: 2305 VAGAVATITLSVPVNDATADGADTNQVDALVQDANGNAITGAAVVFSSTNG-ATILSSTV 2363
V V + A ADG + A V+ NG A V F+ +G A + +++
Sbjct: 554 VVDQVGVTDFTADKTSAKADGTEAITYTATVKK-NGVAQANVPVSFNIVSGTAVLSANSA 612

Query: 2364 NTGADGIASTTLTHTQSGVSNVVATI----DTVNANIDTTFVPGAVATITLSVPVNDATA 2419
NT G A+ TL + G V A +NAN FV A+IT + + TA
Sbjct: 613 NTNGSGKATVTLKSDKPGQVVVSAKTAEMTSALNANA-VIFVDQTKASIT-EIKADKTTA 670

Query: 2420 DGADTNQVDALVQDANGNAITGAAVVFSSANGADIIAPTMNTGVNGVASTLLTHTQSGVS 2479
+ + V+ G+ V + + T T NG A LT T G S
Sbjct: 671 VANGQDAITYTVKVMKGDKPVSNQEVTFTTTLGKLSNSTEKTDTNGYAKVTLTSTTPGKS 730

Query: 2480 NVVATIDTVNANI---DTTFVAGAVATITLSVPVNDATADGADTNQVDALVQDANGNAIT 2536
V A + V ++ + F V T + + +
Sbjct: 731 LVSARVSDVAVDVKAPEVEFFTTLTIDDGNIEIVGTGVKGKLPTVWLQYGQVNLKASGGN 790

Query: 2537 GAAVVFSSANG-ADIIAPTMNTGVNGVASTLLTHTQSGVSNVVATIDTVNANIDTTFVPG 2595
G S+ A + A + + +T ++ S TI T N
Sbjct: 791 GKYTWRSANPAIASVDASSGQVTLKEKGTTTISVISSDNQTATYTIATPN---------- 840

Query: 2596 AVATITLSVPVNDATADGADTNQVDALVQDANGNAITGAAVVFSSANGADIIAPTMNTGV 2655
+ I ++ D +T + ++ N + + +AN +
Sbjct: 841 --SLIVPNMSKRVTYNDAVNTCKNFGGKLPSSQNELENVFKAWGAANKYEYY-----KSS 893

Query: 2656 NGVASTLLTHTQSGVSNVVATIDTVNAN 2683
+ S + Q S V +T D V N
Sbjct: 894 QTIISWVQQTAQDAKSGVASTYDLVKQN 921



Score = 59.7 bits (144), Expect = 3e-10
Identities = 79/397 (19%), Positives = 128/397 (32%), Gaps = 33/397 (8%)

Query: 1246 VAGAVATITLTTPVNGAVADGANSNSVQAVVSDSGGNSVTGATVVFSSTNATAQVTTVIG 1305
V V T A ADG + + A V +G V F+ + TA ++
Sbjct: 554 VVDQVGVTDFTADKTSAKADGTEAITYTATVKKNGVAQA-NVPVSFNIVSGTAVLSANSA 612

Query: 1306 TTGADGIATATLTNTVAGTSNVVATI----DTVNANIDTTFVAGAVATITLSVLVNDATA 1361
T G AT TL + G V A +NAN FV A+IT + + TA
Sbjct: 613 NTNGSGKATVTLKSDKPGQVVVSAKTAEMTSALNANA-VIFVDQTKASIT-EIKADKTTA 670

Query: 1362 DGADTNQVDALVQDANGNAITGAAVVFSSANGADIIAPTMNTGVNGVASTLLTHTVAGTS 1421
+ + V+ G+ V + + T T NG A LT T G S
Sbjct: 671 VANGQDAITYTVKVMKGDKPVSNQEVTFTTTLGKLSNSTEKTDTNGYAKVTLTSTTPGKS 730

Query: 1422 NVIATIDTVNANI---DTTFVAGAVATITLSVPVNDATADGADTNQVDALVQDASGNAIT 1478
V A + V ++ + F V T + + +
Sbjct: 731 LVSARVSDVAVDVKAPEVEFFTTLTIDDGNIEIVGTGVKGKLPTVWLQYGQVNLKASGGN 790

Query: 1479 GAAVVFSSANG-ATILSSTVNTGADGIASTTLTHTQSGVSNVVATIDTVNANIDTAFVAG 1537
G S+ A++ +S+ +TT++ S TI T N
Sbjct: 791 GKYTWRSANPAIASVDASSGQVTLKEKGTTTISVISSDNQTATYTIATPN---------- 840

Query: 1538 AVATITLSVLVNDATADGADTNQVDALVQDANGNAITGAAVVFSSANGATILSS------ 1591
+ I ++ D +T + ++ N + + +AN S
Sbjct: 841 --SLIVPNMSKRVTYNDAVNTCKNFGGKLPSSQNELENVFKAWGAANKYEYYKSSQTIIS 898

Query: 1592 ----TVNTGADGIASTTLTHTQSGVSNVVATIDTVNA 1624
T G+AST Q+ ++N+ A+ A
Sbjct: 899 WVQQTAQDAKSGVASTYDLVKQNPLNNIKASESNAYA 935



Score = 59.3 bits (143), Expect = 4e-10
Identities = 74/397 (18%), Positives = 128/397 (32%), Gaps = 46/397 (11%)

Query: 4209 EDVQTDASGSAYVTVSNTTSGVTRVSVTADGVSASTTTTFIADKDTATLDANLFLITNDN 4268
D ++S + +T++ ++G V GV+ T A D + +
Sbjct: 532 YDRNGNSSNNVLLTITVLSNGQV---VDQVGVTDFTADKTSAKADGTEAITYTATVKKNG 588

Query: 4269 AIANGVIENRVLLQLVDANGNKVSGVEVNFSATNGASINASAITEANGFAFGTLTNTLSG 4328
V VS V+ +A A+ SA T +G A TL +
Sbjct: 589 ---------------VAQANVPVSFNIVSGTAVLSAN---SANTNGSGKATVTLKSD--K 628

Query: 4329 PSDVTVTLVTAGGTESLTVTPQFIADKNTAHIATGDFVIIDDGAVANSVAFNEVRAKVTD 4388
P V V+ TA T +L D+ A I AVAN KV
Sbjct: 629 PGQVVVSAKTAEMTSALNANAVIFVDQTKASITE--IKADKTTAVANGQDAITYTVKVMK 686

Query: 4389 DLGNAIAGYSVIFASQNGATITTSGITGVDGWASARLTHTQAGESGISARVARPATTTHS 4448
++ V F + G ++ T +G+A LT T G+S +SARV+ A +
Sbjct: 687 G-DKPVSNQEVTFTTTLGKLSNSTEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKA 745

Query: 4449 LMPYFIADVSTATLKLFNFNTMPVIADGVTQFFVLGTV-FDANQNPVGGQQVAFSATNEV 4507
F ++ + ++ GV + + G ++ +
Sbjct: 746 PEVEFFTTLTIDDGNI------EIVGTGVKGKLPTVWLQYGQVNLKASGGNGKYTWRSAN 799

Query: 4508 TLIESNGSISAPEGGVLLSVTSTQAGIHPITGTLVSNNYTDTLGAEFIADKNTAQLSTLI 4567
I S+ A G VT + G I+ N A + + +
Sbjct: 800 PAI---ASVDASSG----QVTLKEKGTTTISVISSDNQT-----ATYTIATPNSLI-VPN 846

Query: 4568 VVDNNALADGVARNQVRAHVVDSTGNSVADMAVTFTA 4604
+ D V + + S+ N + ++ + A
Sbjct: 847 MSKRVTYNDAVNTCKNFGGKLPSSQNELENVFKAWGA 883



Score = 58.9 bits (142), Expect = 5e-10
Identities = 68/391 (17%), Positives = 116/391 (29%), Gaps = 22/391 (5%)

Query: 1439 VAGAVATITLSVPVNDATADGADTNQVDALVQDASGNAITGAAVVFSSANGATILSSTVN 1498
V V + A ADG + A V+ + A + +++ N
Sbjct: 554 VVDQVGVTDFTADKTSAKADGTEAITYTATVKKNGVAQANVPVSFNIVSGTAVLSANSAN 613

Query: 1499 TGADGIASTTLTHTQSGVSNVVATI----DTVNANIDTAFVAGAVATITLSVLVNDATAD 1554
T G A+ TL + G V A +NAN FV A+IT + + TA
Sbjct: 614 TNGSGKATVTLKSDKPGQVVVSAKTAEMTSALNANA-VIFVDQTKASIT-EIKADKTTAV 671

Query: 1555 GADTNQVDALVQDANGNAITGAAVVFSSANGATILSSTVNTGADGIASTTLTHTQSGVSN 1614
+ + V+ G+ V + + +ST T +G A TLT T G S
Sbjct: 672 ANGQDAITYTVKVMKGDKPVSNQEVTFTTTLGKLSNSTEKTDTNGYAKVTLTSTTPGKSL 731

Query: 1615 VVATIDTVNANI---DTTFVAGAVATITLSVPVNDATADGADTNQVDALVQDANGNAITG 1671
V A + V ++ + F V T + + + G
Sbjct: 732 VSARVSDVAVDVKAPEVEFFTTLTIDDGNIEIVGTGVKGKLPTVWLQYGQVNLKASGGNG 791

Query: 1672 AAVVFSSANG-ATILSSTMNTGVNGVASTLLTHTQSGVSNVVATIDTVNANIDTTFVAGA 1730
S+ A++ +S+ + +T ++ S TI T N
Sbjct: 792 KYTWRSANPAIASVDASSGQVTLKEKGTTTISVISSDNQTATYTIATPN----------- 840

Query: 1731 VAAITLTTPVDGAVADGTDSNSVQAVVSDSDGNPVTGATVVFSSTNATAQITTVIGTTGA 1790
+ I D ++ S N + + + N +
Sbjct: 841 -SLIVPNMSKRVTYNDAVNTCKNFGGKLPSSQNELENVFKAWGAANKYEYYKSSQTIISW 899

Query: 1791 DGIATATLTNTVAGTSNVVATIDTVNANIDT 1821
+ VA T ++V N
Sbjct: 900 VQQTAQDAKSGVASTYDLVKQNPLNNIKASE 930



Score = 57.0 bits (137), Expect = 2e-09
Identities = 68/344 (19%), Positives = 114/344 (33%), Gaps = 22/344 (6%)

Query: 2692 AVATITLSVPVNDATADGADTNQVDALVQDANGNAITGAAVVFSSTNG-ATILSSTVNTG 2750
V + A ADG + A V+ NG A V F+ +G A + +++ NT
Sbjct: 557 QVGVTDFTADKTSAKADGTEAITYTATVKK-NGVAQANVPVSFNIVSGTAVLSANSANTN 615

Query: 2751 ADGIASTTLTHTQSGVSNVVATI----DTVNANIDTTFVPGAVATITLSVPVNDATADGA 2806
G A+ TL + G V A +NAN FV A+IT + + TA
Sbjct: 616 GSGKATVTLKSDKPGQVVVSAKTAEMTSALNANA-VIFVDQTKASIT-EIKADKTTAVAN 673

Query: 2807 DTNQVDALVQDANGNAITGAAVVFSSANGADIIAPTMNTGVNGVASTLLTHTVAGTSNVV 2866
+ + V+ G+ V + + T T NG A LT T G S V
Sbjct: 674 GQDAITYTVKVMKGDKPVSNQEVTFTTTLGKLSNSTEKTDTNGYAKVTLTSTTPGKSLVS 733

Query: 2867 ATIDTVNANI---DTTFVAGAVATITLSVPVNDATADGADTNQVDALVQDANGNAITGAA 2923
A + V ++ + F V T + + + G
Sbjct: 734 ARVSDVAVDVKAPEVEFFTTLTIDDGNIEIVGTGVKGKLPTVWLQYGQVNLKASGGNGKY 793

Query: 2924 VVFSSANGATILSSTMNTGVNGVASTLLTHTVAGTSNVVATIGSITDNIDTVFVAGAVAT 2983
S+ I S ++G +T GT+ + + T +A +
Sbjct: 794 TWRSANPA--IASVDASSGQ-------VTLKEKGTTTISVISSD--NQTATYTIATPNSL 842

Query: 2984 ITLSVPVNDATADGADTNQVDALVEDANGNAITGAAVVFSSANG 3027
I ++ D +T + ++ N + + +AN
Sbjct: 843 IVPNMSKRVTYNDAVNTCKNFGGKLPSSQNELENVFKAWGAANK 886



Score = 56.6 bits (136), Expect = 2e-09
Identities = 48/185 (25%), Positives = 73/185 (39%), Gaps = 8/185 (4%)

Query: 3364 AVATITLSVPVNDATADGADTNQVDALVQDANGNAITGAAVVFSSANG-ATILASTVNTG 3422
V + A ADG + A V+ NG A V F+ +G A + A++ NT
Sbjct: 557 QVGVTDFTADKTSAKADGTEAITYTATVKK-NGVAQANVPVSFNIVSGTAVLSANSANTN 615

Query: 3423 VNGVASMLLTHTVAGASNVVATIGSITDNIDTT---FVAGAMANIV-VSIIDDNALANGA 3478
+G A++ L G V A +T ++ FV A+I + A+ANG
Sbjct: 616 GSGKATVTLKSDKPGQVVVSAKTAEMTSALNANAVIFVDQTKASITEIKADKTTAVANGQ 675

Query: 3479 DTNIVEAFVTDRFGNGVANQSLIFGTNGASIVGPSTVTTNLDGRVRASATHTVAGSSNTV 3538
D V V+NQ + F T + ST T+ +G + + T T G S
Sbjct: 676 DAITYTVKVMKG-DKPVSNQEVTFTTTL-GKLSNSTEKTDTNGYAKVTLTSTTPGKSLVS 733

Query: 3539 VAMSG 3543
+S
Sbjct: 734 ARVSD 738



Score = 55.1 bits (132), Expect = 7e-09
Identities = 59/326 (18%), Positives = 112/326 (34%), Gaps = 22/326 (6%)

Query: 3653 VAGKAASIELTMTKDNAVANNIDTNEIQVLVTDTGGNAINGAVVNLTSNSGMNITPNSVT 3712
V + + T K +A A+ + V G N V + ++ NS
Sbjct: 554 VVDQVGVTDFTADKTSAKADGTEAITYTATVKKNGVAQANVPVSFNIVSGTAVLSANSAN 613

Query: 3713 TGSDGTATATLTHTLAGNLPINARIDQVSKTINATFI-----ADASTAQIIASDMFIIAN 3767
T G AT TL G + ++A+ +++ +NA + AS +I A +AN
Sbjct: 614 TNGSGKATVTLKSDKPGQVVVSAKTAEMTSALNANAVIFVDQTKASITEIKADKTTAVAN 673

Query: 3768 NQVANGEAVNAIQARVTDSY-GNPIKDQTVEFVLSNNGTIKYNLDVTSAEGGVMVTFTNT 3826
Q +AI V P+ +Q V F + G + + + T G VT T+T
Sbjct: 674 GQ-------DAITYTVKVMKGDKPVSNQEVTFT-TTLGKLSNSTEKTDTNGYAKVTLTST 725

Query: 3827 LAGITNVTATVVSTGGS-RNIDTTFIADVTTAHIAASDLMVIVDNAVANNSDENEVHARV 3885
G + V+A V + + F +T IV V +
Sbjct: 726 TPGKSLVSARVSDVAVDVKAPEVEFFTTLTIDDGNIE----IVGTGVKGKLPTVWLQYGQ 781

Query: 3886 TDAKGNVLSGQTVVFTSGNGAAITTVNGISDSDGLTKATLTHTLAGTSVVTARVSNQVQS 3945
+ K + +G+ ++ A + K T T++ S + + +
Sbjct: 782 VNLKASGGNGKYTWRSANPAIASVDAS---SGQVTLKEKGTTTISVISSDNQTATYTIAT 838

Query: 3946 KDTIFIADRTTATIRASDLTITRNNA 3971
+++ + + + + +N
Sbjct: 839 PNSLIVPNMSKRVTYNDAVNTCKNFG 864



Score = 55.1 bits (132), Expect = 7e-09
Identities = 70/352 (19%), Positives = 122/352 (34%), Gaps = 27/352 (7%)

Query: 696 GNAVPNVPMSIQADNG-AIVVASTPNTGVDGTINATFTNLRAGESVVSVTSPALVGMTMT 754
G A NVP+S +G A++ A++ NT G T + + G+ VVS + MT
Sbjct: 588 GVAQANVPVSFNIVSGTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAKT---AEMTSA 644

Query: 755 MTFSA----DQRTAVVSTLAAIDNNAKADGTDTNVVRAWVVDANGNSVPGVSVTFDAGNG 810
+ +A DQ A ++ + A A A+G D + V V VTF
Sbjct: 645 LNANAVIFVDQTKASITEIKADKTTAVANGQDA-ITYTVKVMKGDKPVSNQEVTFTT-TL 702

Query: 811 AVLAQNPVVTDRNGYAENTLTNLAIG--TTTVKATTVTDPVGQTVNTHFVAGAVDTITLT 868
L+ + TD NGYA+ TLT+ G + + + V V F +D +
Sbjct: 703 GKLSNSTEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFFTTLTIDDGNIE 762

Query: 869 TPVNGAVADGANSNSVQAVVSDSGGNPVTGATVVFSSTNATAQVTTVIGTTGVDGIATAT 928
G V + +Q +G ++ +A + +V ++G
Sbjct: 763 IVGTG-VKGKLPTVWLQ---YGQVNLKASGGNGKYTWRSANPAIASVDASSGQ------- 811

Query: 929 LTNTVAGTSNVVATIGSITNNIDTTFVAGAVATITLTTLVNGAV-ADGANSNSVQAVVSD 987
+T GT+ + N T+ ++ + + D N+
Sbjct: 812 VTLKEKGTTTISVISSD---NQTATYTIATPNSLIVPNMSKRVTYNDAVNTCKNFGGKLP 868

Query: 988 SEGNPVAGAAVVFSSANATAQITTVIGTTDADGIATATLTNTVAGTSNVVAT 1039
S N + + +AN + + VA T ++V
Sbjct: 869 SSQNELENVFKAWGAANKYEYYKSSQTIISWVQQTAQDAKSGVASTYDLVKQ 920



Score = 54.3 bits (130), Expect = 1e-08
Identities = 74/388 (19%), Positives = 126/388 (32%), Gaps = 29/388 (7%)

Query: 3073 VAGAVATITLSVPVNDATADGADTNQVDALVEDANGNAITGAAVVFSSANG-ATILSSTV 3131
V V + A ADG + A V+ NG A V F+ +G A + +++
Sbjct: 554 VVDQVGVTDFTADKTSAKADGTEAITYTATVKK-NGVAQANVPVSFNIVSGTAVLSANSA 612

Query: 3132 NTGADGIASTTLTHTQSGVSNVVATI----DTVNANIDTTFVAGAVATITLSVPVNDATA 3187
NT G A+ TL + G V A +NAN FV A+IT + + TA
Sbjct: 613 NTNGSGKATVTLKSDKPGQVVVSAKTAEMTSALNANA-VIFVDQTKASIT-EIKADKTTA 670

Query: 3188 DGADTNQVDALVQDANGNAITGAAVVFSSANGADIIAPTMNTGVNGVASTLLTHTVAGTS 3247
+ + V+ G+ V + + T T NG A LT T G S
Sbjct: 671 VANGQDAITYTVKVMKGDKPVSNQEVTFTTTLGKLSNSTEKTDTNGYAKVTLTSTTPGKS 730

Query: 3248 NVVATIGSITNNI---DTAFVAGAVATITLTTPVNGAVADGADTNQVDALVQDANGNAIT 3304
V A + + ++ + F V V T + + +
Sbjct: 731 LVSARVSDVAVDVKAPEVEFFTTLTIDDGNIEIVGTGVKGKLPTVWLQYGQVNLKASGGN 790

Query: 3305 GAAVVFSSANG-ADIIAPTMNTGVNGVASTLLTHTVAGTSNVVATIDTVNANIDTTFVPG 3363
G S+ A + A + + +T ++ + TI T N
Sbjct: 791 GKYTWRSANPAIASVDASSGQVTLKEKGTTTISVISSDNQTATYTIATPN---------- 840

Query: 3364 AVATITLSVPVNDATADGADTNQVDALVQDANGNAITGAAVVFSSANGATILASTVNTGV 3423
+ I ++ D +T + ++ N + + +AN S+
Sbjct: 841 --SLIVPNMSKRVTYNDAVNTCKNFGGKLPSSQNELENVFKAWGAANKYEYYKSS----- 893

Query: 3424 NGVASMLLTHTVAGASNVVATIGSITDN 3451
+ S + S V +T + N
Sbjct: 894 QTIISWVQQTAQDAKSGVASTYDLVKQN 921



Score = 52.0 bits (124), Expect = 7e-08
Identities = 90/434 (20%), Positives = 152/434 (35%), Gaps = 56/434 (12%)

Query: 3920 LTKATLTHTLAGTSVVTARVSNQVQSKDTIFIADRTTATIRASDLTITRNNALADGVATN 3979
+ + H + GT T ++ V+SK + + +R+ I + + +
Sbjct: 453 ILSLNIPHDINGTERSTQKIQLIVKSKYGLDRIVWDDSALRSQGGQIQHSGSQS------ 506

Query: 3980 AARVIVTDANGNPVPSMFVGYTSDNGALLTPASGMTDSSGTFSTTFTHTTAGIS-KVTAA 4038
D A+L + + S V
Sbjct: 507 ---------------------AQDYQAILPAYVQGGSNVYKVTARAYDRNGNSSNNVLLT 545

Query: 4039 IITMGISQAKDAVFIADSSTARVSELIIVKNDSLANNSDRNIVQAHIKDAHGNVITGMNV 4098
I + Q D V + D + + S + A+ ++ A +K +G + V
Sbjct: 546 ITVLSNGQVVDQVGVTDFTADKTS--------AKADGTEAITYTATVKK-NGVAQANVPV 596

Query: 4099 NFSATENV-TLTANTVTTNDQGYAENTLRHNVPVTSAVTATVATDLVGL-TEDVRFVAGD 4156
+F+ L+AN+ TN G A TL+ + P V+A A L V FV
Sbjct: 597 SFNIVSGTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAKTAEMTSALNANAVIFVDQT 656

Query: 4157 GARI-ELFRLNDGAVADGIQTNRVEARVYDVSDHLVPNSNVVFSASNGGQLVQEDVQTDA 4215
A I E+ AVA+G +V D V N V F+ + G+L +TD
Sbjct: 657 KASITEIKADKTTAVANGQDAITYTVKV-MKGDKPVSNQEVTFT-TTLGKLSNSTEKTDT 714

Query: 4216 SGSAYVTVSNTTSGVTRVSVTADGVSASTTTTFIADKDTATLDANLFLITNDNAIANGVI 4275
+G A VT+++TT G + VS V+ + T T+D N + GV
Sbjct: 715 NGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFFTTLTIDDG-----NIEIVGTGVK 769

Query: 4276 ENRVLLQLVDANGN-KVSGVEVNFSATNGASINASAITEANGFAFGTLTNTLSGPSDVTV 4334
+ L N K SG ++ ++ A A +A+ TL T+
Sbjct: 770 GKLPTVWLQYGQVNLKASGGNGKYTWR--SANPAIASVDASSGQV-----TLKEKGTTTI 822

Query: 4335 TLVTAGGTESLTVT 4348
+ V + ++ T T
Sbjct: 823 S-VISSDNQTATYT 835


65  YpsIP31758_4024  YpsIP31758_4042  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias     Product
YpsIP31758_4024  -1      21       3.183974    DNA-binding response regulator
YpsIP31758_4025  -1      21       3.773207    PfkB family kinase
YpsIP31758_4026  -1      22       4.437472    hypothetical protein
YpsIP31758_4027  -1      23       4.477194    hypothetical protein
YpsIP31758_4028  -1      19       4.166061    ribose ABC transporter periplasmic protein
YpsIP31758_4029  -1      18       4.110959    ribose ABC transporter permease
YpsIP31758_4030  0       18       3.699004    ribose ABC transporter ATP-binding protein
YpsIP31758_4031  2       13       2.892306    sensory box histidine kinase/response regulator
YpsIP31758_4032  -2      13       -0.437648   hypothetical protein
YpsIP31758_4033  -1      16       -0.018293   low-affinity inorganic phosphate transporter
YpsIP31758_4034  -1      17       -1.406634   universal stress protein UspB
YpsIP31758_4035  -1      16       -1.398434   universal stress protein A
YpsIP31758_4036  0       16       -2.970811   glutamate dehydrogenase
YpsIP31758_4037  3       19       -6.044049   metalloprotease
YpsIP31758_4038  2       23       -6.061249   methyltransferase
YpsIP31758_4039  0       28       -10.357630  hypothetical protein
YpsIP31758_4041  -2      15       -3.887341   hypothetical protein
YpsIP31758_4042  -1      13       -3.147441   entero membrane protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits    Score  E-value  Comments
YpsIP31758_4024  HTHFIS  100    2e-26    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 100 bits (250), Expect = 2e-26
Identities = 35/122 (28%), Positives = 61/122 (50%), Gaps = 1/122 (0%)

Query: 2 KPAILVVDDDTAICEVLRDVLNEHVFDVLLCHSGNEALQITATQPSIALILLDMMLPDIN 61
ILV DDD AI VL L+ +DV + + + A L++ D+++PD N
Sbjct: 3 GATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDG-DLVVTDVVMPDEN 61

Query: 62 GLLVLQQVQKLRPSLPVVMLTGMGSESDMVVGLEMGADDYIAKPFNARVVVARVKAVLRR 121
+L +++K RP LPV++++ + + E GA DY+ KPF+ ++ + L
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121

Query: 122 SE 123
+
Sbjct: 122 PK 123


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits        Score  E-value  Comments
YpsIP31758_4028  SUBTILISIN  29     0.019    Subtilisin serine protease family (S8) signature.
		>SUBTILISIN#Subtilisin serine protease family (S8) signature.

Length = 326

Score = 29.4 bits (66), Expect = 0.019
Identities = 16/65 (24%), Positives = 25/65 (38%), Gaps = 5/65 (7%)

Query: 55 KLAGDKVKVTLVSSGYDLGQQVAQIDNFIAAKVDMIIL---NAADSKGIGPAVKRAKDAG 111
L +KV + I I KVD+I + D + AVK+A +
Sbjct: 111 DLLI--IKVLNKQGSGQYDWIIQGIYYAIEQKVDIISMSLGGPEDVPELHEAVKKAVASQ 168

Query: 112 IVVVA 116
I+V+
Sbjct: 169 ILVMC 173
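
As a cross-check on the summary statistics above, the percentages in an Identities/Positives/Gaps line can be recomputed from the raw counts. The following is a minimal Python sketch and not part of PredictBias: the helper names parse_summary and truncated_percent are hypothetical, and the assumption that the reported percentages are truncated rather than rounded is inferred only from the values shown on this page.

```python
import re

# Matches each field of a BLAST-style summary line, e.g.
# "Identities = 16/65 (24%), Positives = 25/65 (38%), Gaps = 5/65 (7%)".
FIELD = re.compile(r"(Identities|Positives|Gaps) = (\d+)/(\d+) \((\d+)%\)")


def parse_summary(line):
    """Return {field: (count, alignment_length, reported_percent)}."""
    return {
        name: (int(count), int(length), int(percent))
        for name, count, length, percent in FIELD.findall(line)
    }


def truncated_percent(count, length):
    """Integer percentage with the fractional part dropped (assumed convention)."""
    return (100 * count) // length


# Summary line copied from the SUBTILISIN hit for YpsIP31758_4028.
line = "Identities = 16/65 (24%), Positives = 25/65 (38%), Gaps = 5/65 (7%)"
stats = parse_summary(line)

for name, (count, length, reported) in stats.items():
    recomputed = truncated_percent(count, length)
    print(f"{name}: {count}/{length} reported {reported}% recomputed {recomputed}%")
```

Under that truncation assumption, the recomputed values agree with the reported ones for this hit; a mismatch on another line would suggest either a different rounding convention or a damaged line.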


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits    Score  E-value  Comments
YpsIP31758_4031  HTHFIS  62     4e-12    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 62.2 bits (151), Expect = 4e-12
Identities = 26/125 (20%), Positives = 59/125 (47%), Gaps = 17/125 (13%)

Query: 735 MADQLVLVLEDEPDVRQTLCEQLHQLGYLTLETGDSRQALALMADVPDISIVISDLMLPG 794
M +LV +D+ +R L + L + GY T ++ +A +V++D+++P
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGD-GDLVVTDVVMPD 59

Query: 795 DLTGAEVLQQARSVYPHLKLLLISGQD---------LRRSKNFMPEVELLRKPFNQQQLV 845
+ ++L + + P L +L++S Q+ + + +++P KPF+ +L+
Sbjct: 60 E-NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLP------KPFDLTELI 112

Query: 846 QALQR 850
+ R
Sbjct: 113 GIIGR 117


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits        Score  E-value  Comments
YpsIP31758_4037  CABNDNGRPT  255    4e-82    NodO calcium binding signature.
		>CABNDNGRPT#NodO calcium binding signature.

Length = 479

Score = 255 bits (652), Expect = 4e-82
Identities = 188/433 (43%), Positives = 245/433 (56%), Gaps = 43/433 (9%)

Query: 36 ISSHQSWKENTIHNKNTNLTYSF-SRAYTLWDYDRTFQQNAYVSLFNPAQIHQAKIAMQS 94
+ SW + K+ NLT+ F ++ D F + FN QI QAK+++QS
Sbjct: 58 TRENVSWNGTNVFGKSANLTFKFLQSVSSIPSGDTGFVK------FNAEQIEQAKLSLQS 111

Query: 95 WADVANISFTEASADSSANILFLNFQR-PGN-----VAGYAYHPNLGSFS-PIWINYSFS 147
W+DVAN++FTE + + SANI F N+ R YAY+P + W NY+ S
Sbjct: 112 WSDVANLTFTEVTGNKSANITFGNYTRDASGNLDYGTQAYAYYPGNYQGAGSSWYNYNQS 171

Query: 148 DNQHPSRLNYGGGVLTHEIGHALGLGHS---HAPHGY-----------TQQMSVMSYLSE 193
+ ++P YG THEIGHALGL H +A G + Q S+MSY E
Sbjct: 172 NIRNPGSEEYGRQTFTHEIGHALGLAHPGEYNAGEGDPSYNDAVYAEDSYQFSIMSYWGE 231

Query: 194 QDSGANYGQHYLSTPQMYDIAAIQYLYGANLHTRTGDTVYGFNSTSYRDHFTATHASDAL 253
++GA+Y HY P + DIAAIQ LYGAN+ TRTGD+VYGFNS + RD +TAT +S AL
Sbjct: 232 NETGADYNGHYGGAPMIDDIAAIQRLYGANMTTRTGDSVYGFNSNTDRDFYTATDSSKAL 291

Query: 254 IFCVWDAGGNDTFDFSGYKQNQMINLNELCFSDVGGLKGNVSIAADVTIENAIGGSGHDD 313
IF VWDAGG DTFDFSGY NQ INLNE FSDVGGLKGNVSIA VTIENAIGGSG+D
Sbjct: 292 IFSVWDAGGTDTFDFSGYSNNQRINLNEGSFSDVGGLKGNVSIAHGVTIENAIGGSGNDI 351

Query: 314 IIGNHTNNILTGN---------GGSDQLWGNGGNNTFRYASARDSMTTSPDTIHDFKSGR 364
++GN +NIL G G+D L+G G +TF Y S +DS + D I DF+ G
Sbjct: 352 LVGNSADNILQGGAGNDVLYGGAGADTLYGGAGRDTFVYGSGQDSTVAAYDWIADFQKGI 411

Query: 365 DKIDLSQLMPSTDRVIFVDRLSFNGQ-TEMGQQYNEVADITYLMIDFDAQVSECDMMIKF 423
DKIDLS + + F G+ E+ Q++ IT L + S D +++
Sbjct: 412 DKIDLSAFRNEGQ--LSFVQDQFTGKGQEVMLQWDAANSITNLWLHEAGH-SSVDFLVRI 468

Query: 424 TGRHHFTANDFIL 436
G+ +D I+
Sbjct: 469 VGQ--AAQSDIIV 479


66  YpsIP31758_4076  YpsIP31758_4089  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
YpsIP31758_4076  0       16       -3.376269  ShlB/FhaC/HecB family hemolysin
YpsIP31758_4077  -1      15       -3.188573  hemagglutinin/hemolysin
YpsIP31758_4078  -1      12       -1.409175  *regulatory protein UhpC
YpsIP31758_4079  0       13       -1.076686  sensory histidine kinase UhpB
YpsIP31758_4080  -1      12       -0.719845  UhpA family transcriptional regulator
YpsIP31758_4081  -1      11       -0.260436  phosphoethanolamine transferase
YpsIP31758_4082  -1      12       -0.694806  hypothetical protein
YpsIP31758_4083  0       14       -1.712433  hypothetical protein
YpsIP31758_4084  1       13       -2.536003  amino acid permease family protein
YpsIP31758_4085  0       15       -3.222455  histidine ammonia-lyase
YpsIP31758_4086  -1      17       -4.124600  urocanate hydratase
YpsIP31758_4087  0       20       -5.178161  pyridoxal-phosphate dependent protein
YpsIP31758_4088  -2      18       -3.738875  adenine phosphoribosyltransferase
YpsIP31758_4089  -2      16       -3.030345  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
YpsIP31758_4077  PF05860  59     6e-13    haemagglutination activity domain.
		>PF05860#haemagglutination activity domain.

Length = 117

Score = 58.7 bits (142), Expect = 6e-13
Identities = 17/115 (14%), Positives = 35/115 (30%), Gaps = 18/115 (15%)

Query: 66 VGMTETVVNIQAPDENGLSHNKYSKFDVVANGLFDVTTLNNRLAQEVDGNSFLQDKLATI 125
++ + L H+ + +F V +G F
Sbjct: 17 TEGNTRIIERGTQAGSNLFHS-FQEFSVPTSGTA----------------FFNNPTNIQN 59

Query: 126 ILNEVNSSQASLLDGNLHVGGQDAHVIIANPAGINCRGCSFTNTSHVTLTTGAPS 180
I++ V S +DG + A++ + NP GI + + + + A
Sbjct: 60 IISRVTGGSVSNIDGLIRANAT-ANLFLINPNGIIFGQNARLDIGGSFVGSTANR 113


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
YpsIP31758_4078  TCRTETB  45     5e-07    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 44.9 bits (106), Expect = 5e-07
Identities = 33/157 (21%), Positives = 68/157 (43%), Gaps = 7/157 (4%)

Query: 49 FNFIMPAMLTDLGLSMSDVGILGTLFYITYGCSKFVSGMISDRSNPRYFMGIGLVMTGII 108
N +P + D + + T F +T+ V G +SD+ + + G+++
Sbjct: 33 LNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFG 92

Query: 109 NILFGMSSSLLVLGALWILNAFFQGWG---WPPCSKILTSWY-SRSERGGWWAIWNTSHN 164
+++ + S +L I+ F QG G +P ++ + Y + RG + + +
Sbjct: 93 SVIGFVGHSFF---SLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVA 149

Query: 165 FGGALIPLLVGVITLHFSWRYGMIIPGIIGVVIGLLM 201
G + P + G+I + W Y ++IP I + + LM
Sbjct: 150 MGEGVGPAIGGMIAHYIHWSYLLLIPMITIITVPFLM 186


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
YpsIP31758_4079  PF06580  37     2e-04    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 36.8 bits (85), Expect = 2e-04
Identities = 17/85 (20%), Positives = 33/85 (38%), Gaps = 10/85 (11%)

Query: 426 VTNAYRHGAASR-----IEINARQDNQQIYLTISDNGK-GIDLASITPGYGLRGIQSRVS 479
V N +HG A I + +DN + L + + G + + G GL+ ++ R+
Sbjct: 264 VENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKESTGTGLQNVRERLQ 323

Query: 480 A-FGGNVSLSV---DNGTCLNVTLP 500
+G + + V +P
Sbjct: 324 MLYGTEAQIKLSEKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits    Score  E-value  Comments
YpsIP31758_4080  HTHFIS  61     3e-13    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 60.6 bits (147), Expect = 3e-13
Identities = 34/173 (19%), Positives = 63/173 (36%), Gaps = 20/173 (11%)

Query: 4 RVVFIDDHDIVRSGFAQLLSLEEDIQVVGEFSSAKQARAGLPGLQANICICDISMPDENG 63
++ DD +R+ Q LS V S+A + ++ + D+ MPDEN
Sbjct: 5 TILVADDDAAIRTVLNQALSRA-GYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDENA 62

Query: 64 LDLLKGLPS---GMGVIMLSMHDSPALVETALERGARGFLSKRCKPEDLISAVRTVGSGG 120
DLL + + V+++S ++ A E+GA +L K +LI +
Sbjct: 63 FDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR----- 117

Query: 121 VYLMPEIAQQLARVAVDPLTRREREIAVLLAEG---MEVREIAESLGLSPKTV 170
A + L ++ L+ E+ + L + T+
Sbjct: 118 -------ALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTL 163


67  YpsIP31758_4099  YpsIP31758_4116  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
YpsIP31758_4099  2       12       -0.601150  xylose ABC transporter system permease
YpsIP31758_4100  1       13       -1.827612  xylose ABC transporter ATP-binding protein
YpsIP31758_4101  1       11       -2.451165  D-xylose transporter subunit XylF
YpsIP31758_4102  0       10       -1.869549  xylose isomerase
YpsIP31758_4103  0       12       -2.254491  xylulokinase
YpsIP31758_4104  0       12       -3.966616  hypothetical protein
YpsIP31758_4105  -1      9        -1.962609  pili assembly chaperone protein
YpsIP31758_4106  0       10       -2.885549  fimbrial usher protein
YpsIP31758_4107  0       13       -3.955866  fimbrial protein
YpsIP31758_4109  -1      12       -0.258965  hypothetical protein
YpsIP31758_4108  -2      12       1.041375   acyltransferase
YpsIP31758_4110  -2      12       1.586650   multidrug translocase MdfA
YpsIP31758_4111  -2      14       1.253648   hypothetical protein
YpsIP31758_4112  -2      16       3.651623   hypothetical protein
YpsIP31758_4114  -1      16       4.175256   selenocysteinyl-tRNA-specific translation
YpsIP31758_4115  -2      15       3.660298   selenocysteine synthase
YpsIP31758_4116  -1      19       3.155434   formate dehydrogenase accessory protein FdhE
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
YpsIP31758_4106  PF00577  738    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 738 bits (1906), Expect = 0.0
Identities = 223/874 (25%), Positives = 376/874 (43%), Gaps = 62/874 (7%)

Query: 5 SKRKKTIFLMVKVLTIILVWLFLPESTAVVKFNTNIIDAKDRSNIDLSRFEVDDYTPPGN 64
K + F + ++ P S+A + FN + ++ DLSRFE PPG
Sbjct: 19 RKHRLAGFFV-RLFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPGT 77

Query: 65 YLLDILIDDRLLPERYLVTYLAVDEGKSTKLCLTPDLVNLFGLSTEVRESMTLWNNDKCV 124
Y +DI +++ + R VT+ D + CLT + GL+T M L +D CV
Sbjct: 78 YRVDIYLNNGYMATRD-VTFNTGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDACV 136

Query: 125 AIDEK-KEIKIQYDKEKQSLIISIPQAWLAYNDPNWVPPSQWGNGVAGTLLDYNLFGYHY 183
+ + Q D +Q L ++IPQA+++ ++PP W G+ LL+YN G
Sbjct: 137 PLTSMIHDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSV 196

Query: 184 SPNMGGSTTNFSSYGTTGANMGPWRIRADYQYINTETAGE--HYRNFDWSQVYAFRAIPS 241
+GG++ +G N+G WR+R + + + + + R I
Sbjct: 197 QNRIGGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIP 256

Query: 242 IGAKFVGGQTYLNSSIFDSFRFLGTSLSSDERMLPPTLRGYAPQVMGIAHTNARVVLSQN 301
+ ++ G Y IFD F G L+SD+ MLP + RG+AP + GIA A+V + QN
Sbjct: 257 LRSRLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQN 316

Query: 302 GRVLYQTNVAPGPFVIQDIS-EAVQGNIDVRVEEEDGRVTVFQVNAASVPFLTRKGAVRY 360
G +Y + V PGPF I DI G++ V ++E DG +F V +SVP L R+G RY
Sbjct: 317 GYDIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRY 376

Query: 361 KAALGRPMLGNSA-SNPTFFSGEFSWGAFNHVSLYGGLMTTSQDYTSAALGIGQNLYDFG 419
G GN+ P FF G ++YGG + Y + GIG+N+ G
Sbjct: 377 SITAGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQ-LADRYRAFNFGIGKNMGALG 435

Query: 420 ALSIDITHSRAQLPNEEQQNGESYRVNYSKRFEQTDSQISFAGYRFSKKNFMSMSQYLD- 478
ALS+D+T + + LP++ Q +G+S R Y+K ++ + I GYR+S + + +
Sbjct: 436 ALSVDMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYS 495

Query: 479 WLNGNTALQYD-------------------KQAYTVAANQYLAWPDITMYLSVTRRTYWN 519
+NG D + + Q L T+YLS + +TYW
Sbjct: 496 RMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRTS-TLYLSGSHQTYWG 554

Query: 520 A-ASSNNYSLSMSKIFDIGTFKGISATISANKVNNQYANENQMFFSLSVPIGIGQQASYD 578
+ ++ F+ I+ T+S + N + +L+V I D
Sbjct: 555 TSNVDEQFQAGLNT-----AFEDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRSD 609

Query: 579 AQRG-RNTGYTQNISYFNNQNPKNI--------------WRISAGGGNPELQKGNGVFRG 623
++ R+ + ++S+ N N+ + + G
Sbjct: 610 SKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYA 669

Query: 624 GYQHSSPYGEFGLDGSHKNNEYNSINTNWYGSITATAYGVAAHQNKAGNEPRIMVDTGDV 683
+ YG + SH +++ + G + A A GV Q N+ ++V
Sbjct: 670 TLNYRGGYGNANIGYSH-SDDIKQLYYGVSGGVLAHANGVTLGQP--LNDTVVLVKAPGA 726

Query: 684 AGVSLNNNSAV-TNRFGVAVVSGATSYQQSDIRVDVQNLPDDIEVYNTVIQKTLTEGAIG 742
+ N + V T+ G AV+ AT Y+++ + +D L D++++ N V T GAI
Sbjct: 727 KDAKVENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIV 786

Query: 743 YREIRAVKGRQMMAIIRLKDGSSPPLGASVITDKTGAEVGIVGDDGLTYLAGLQDTERLT 802
E +A G +++ + + P GA V T ++ GIV D+G YL+G+ ++
Sbjct: 787 RAEFKARVGIKLLMTLT-HNNKPLPFGAMV-TSESSQSSGIVADNGQVYLSGMPLAGKVQ 844

Query: 803 VQWGKK---QCTL--ILPKDKGM-NSGKVLLPCQ 830
V+WG++ C LP + ++ C+
Sbjct: 845 VKWGEEENAHCVANYQLPPESQQQLLTQLSAECR 878


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
YpsIP31758_4110  TCRTETB  39     2e-05    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 39.5 bits (92), Expect = 2e-05
Identities = 41/202 (20%), Positives = 80/202 (39%), Gaps = 5/202 (2%)

Query: 1 MQTSFSPATRLGRRALLFPLCLVLFEFAAYIANDMIQPGMLAVVAEFNASVEWVPTSMTA 60
M TS+S + + L++ L F + ++ P + + AS WV T+
Sbjct: 1 MNTSYSQSNLRHNQILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFML 60

Query: 61 YLAGGMFLQWLLGPLSDRRGRRPVMLAGVAFFVVTCLAILLVNS-IEQFIAMRFLQGIGL 119
+ G + G LSD+ G + ++L G+ + + +S I RF+QG G
Sbjct: 61 TFSIGTAV---YGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGA 117

Query: 120 CFIGAVGYATIQESFEEAVCIKITALMANVALIAPLLGPLAGAALIHVAPWQTMFVLFAV 179
A+ + + K L+ ++ + +GP G + H W + L +
Sbjct: 118 AAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLL-LIPM 176

Query: 180 LGAISFAGLWRAMPETASLKGE 201
+ I+ L + + + +KG
Sbjct: 177 ITIITVPFLMKLLKKEVRIKGH 198


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits       Score  E-value  Comments
YpsIP31758_4114  TCRTETOQM  61     9e-12    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 61.0 bits (148), Expect = 9e-12
Identities = 50/175 (28%), Positives = 84/175 (48%), Gaps = 18/175 (10%)

Query: 8 HVDHGKTTLLQAI---TGV------------NADRLPEEKQRGMTIDLGYAYWPLPDGRI 52
HVD GKTTL +++ +G D E+QRG+TI G + + ++
Sbjct: 11 HVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITSFQWENTKV 70

Query: 53 MGFIDVPGHEKFLANMLAGVGGIDHALLVVACDDGVMAQTREHLAILRLSGRPALTVALT 112
ID PGH FLA + + +D A+L+++ DGV AQTR LR G P + +
Sbjct: 71 -NIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGIPTI-FFIN 128

Query: 113 KADRVDDERIAQVHQQILQELVAQGWSAEQISLFVTAAVTERGIGELREHLAQCH 167
K D+ ++ V+Q I ++L A+ +++ L+ VT E + + + +
Sbjct: 129 KIDQN-GIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEGN 182
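
The E-values reported for these hits can be compared against the RPSBlast E-value threshold listed under the input parameters (0.05). The following is a minimal Python sketch and not the tool's own code: the helper names parse_evalue and expect_from_score_line are hypothetical, the shorthand "e-120" is assumed to mean 1e-120, and treating a hit as passing when its E-value is at or below the cutoff is an assumption, since the page does not show how PredictBias applies the threshold.

```python
import re

# "E-value (RPSBlast)" input parameter shown at the top of the report.
E_VALUE_CUTOFF = 0.05


def parse_evalue(text):
    """Convert an E-value string to float; a bare exponent such as
    'e-120' is read as 1e-120 (assumed meaning of the shorthand)."""
    text = text.strip()
    if text.lower().startswith("e"):
        text = "1" + text
    return float(text)


def expect_from_score_line(line):
    """Extract the E-value from a 'Score = ..., Expect = ...' line."""
    match = re.search(r"Expect = (\S+)", line)
    if match is None:
        raise ValueError(f"no Expect field in line: {line!r}")
    return parse_evalue(match.group(1))


# Score lines copied from hits on this page.
score_lines = [
    "Score = 352 bits (905), Expect = e-120",
    "Score = 29.4 bits (66), Expect = 0.019",
    "Score = 738 bits (1906), Expect = 0.0",
]

evalues = [expect_from_score_line(line) for line in score_lines]
for line, evalue in zip(score_lines, evalues):
    status = "passes" if evalue <= E_VALUE_CUTOFF else "fails"
    print(f"{line} -> E = {evalue:g} ({status} cutoff {E_VALUE_CUTOFF})")
```

Prefixing the bare exponent with "1" is needed because Python's float() does not accept a string such as "e-120" on its own.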


68  YpsIP31758_0128  YpsIP31758_0134  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
YpsIP31758_0128  0       13       -1.366801  acetylglutamate kinase
YpsIP31758_0129  0       15       -1.744227  argininosuccinate lyase
YpsIP31758_0130  1       15       -1.548689  TonB-dependent heme receptor HasR
YpsIP31758_0131  -1      13       -0.527166  hemophore HasA
YpsIP31758_0132  -2      12       0.099186   type I secretion ATP-binding protein
YpsIP31758_0133  -2      11       -0.028826  HlyD family hemolysin secretion protein
YpsIP31758_0134  -2      12       0.339809   TonB domain-containing protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0128  CARBMTKINASE  42     1e-06    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 42.1 bits (99), Expect = 1e-06
Identities = 36/138 (26%), Positives = 58/138 (42%), Gaps = 18/138 (13%)

Query: 133 VQTLLAAGYMPIISSIG----ITVEGQLMNVNA----DQAATALAATLGAD-LILLSDVS 183
++ L+ G + I S G I +G++ V A D A LA + AD ++L+DV+
Sbjct: 179 IKKLVERGVIVIASGGGGVPVILEDGEIKGVEAVIDKDLAGEKLAEEVNADIFMILTDVN 238

Query: 184 GILDGKG----QRIAEMTAQKAEQLIAQGIITDG-MVVKVNAALDAARSLGRPVDIASWR 238
G G Q + E+ ++ + +G G M KV AA+ G IA
Sbjct: 239 GAALYYGTEKEQWLREVKVEELRKYYEEGHFKAGSMGPKVLAAIRFIEWGGERAIIAHL- 297

Query: 239 HSEQLPALFNGVPIGTRI 256
E+ G GT++
Sbjct: 298 --EKAVEALEG-KTGTQV 312


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
YpsIP31758_0131  PF06438  212    3e-72    Heme acquisition protein HasAp
		>PF06438#Heme acquisition protein HasAp

Length = 205

Score = 212 bits (540), Expect = 3e-72
Identities = 61/214 (28%), Positives = 104/214 (48%), Gaps = 18/214 (8%)

Query: 1 MSTTIQYNSNYADYSISSYLREWANNFGDIDQAPAETKDRGSFSG-SSTLFSGTQYAIGS 59
MS +I Y++ Y+ ++++ YL +W+ FGD++ P + D + G + F G+QYA+ S
Sbjct: 1 MSISISYSTTYSGWTVADYLADWSAYFGDVNHRPGQVVDGSNTGGFNPGPFDGSQYALKS 60

Query: 60 SHGNPEGMIAEGNLKYSFM--PQHTFYGQIDTLQFGKDLATNAGGPSAGKHLEKIDITFN 117
+ + IA G+L Y+ P HT +G++D++ G L G S G L+ +++F+
Sbjct: 61 TASDA-AFIAGGDLHYTLFSNPSHTLWGKLDSIALGDTL--TGGASSGGYALDSQEVSFS 117

Query: 118 ELDLSGEFDSGKSMTENHQGDMHKAILGLRKGNADPMLEVMKAKGFDVDTAFKDLSIASQ 177
L L G+ G +HK + GL G++ + + A VD + S Q
Sbjct: 118 NLGLDSPIAQGRD------GTVHKVVYGLMSGDSSALQGQIDALLKAVDPSLSINSTFDQ 171

Query: 178 YPDSGYMSDAPM-----VDTVGVMD-SNDMLLAA 205
+G P V VGV + +D+ LAA
Sbjct: 172 LAAAGVAHATPAAAAAEVGVVGVQELPHDLALAA 205


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits       Score  E-value  Comments
YpsIP31758_0133  RTXTOXIND  352    e-120    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 352 bits (905), Expect = e-120
Identities = 91/424 (21%), Positives = 172/424 (40%), Gaps = 8/424 (1%)

Query: 25 RYLNIGGGLVVIGFIGFLLWAGLAPLDKGVAVTGLLVVAENRKVIQPLQGGRIQQLHVTE 84
R + ++ + + + L ++ G L + K I+P++ ++++ V E
Sbjct: 55 RRPRLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKE 114

Query: 85 GDEIVSGQLLVTLDDTAIRNQRDNLQHQYLSALAQEARLTAEQNDLDVITFPQALLEH-- 142
G+ + G +L+ L Q L A ++ R +++ P+ L
Sbjct: 115 GESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEP 174

Query: 143 ATQPAVERNIILQQQLLHHRRQAHLSEIARLSTQLTRHQARLDGLQAMRSNHQRQSNLFQ 202
Q E ++ L+ + ++ + L + +A + A + ++ S + +
Sbjct: 175 YFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEK 234

Query: 203 QQLDSVQLLAKDGHIAKNKLLEMESQLTSLQARVEQGTSDIAEAHKLIDETEQHVLQRRE 262
+LD L IAK+ +LE E++ + S + + I ++ +
Sbjct: 235 SRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQ 294

Query: 263 QYQSENSEQLAKAQQNTQELVQRLNIAEYELSHTRIFAPVSGSVIALAQHTVGGVVSSGQ 322
+++E ++L + N L L E + I APVS V L HT GGVV++ +
Sbjct: 295 LFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAE 354

Query: 323 ALMEIVPSGQPLFVEAQLPVELIDKVTVGLPVDLNFSAFNQSNTPRLQGSVWRIGADRIQ 382
LM IVP L V A + + I + VG + AF + L G V I D I+
Sbjct: 355 TLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDAIE 414

Query: 383 PPPTSPPYYPLTVAID-----IDPTELAIRPGMAVDVFIRTGERSLLSYLFKPFTDRLHL 437
+ + ++I+ + + GMAV I+TG RS++SYL P + +
Sbjct: 415 DQRLGLVFNVI-ISIEENCLSTGNKNIPLSSGMAVTAEIKTGMRSVISYLLSPLEESVTE 473

Query: 438 ALAE 441
+L E
Sbjct: 474 SLRE 477


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
YpsIP31758_0134  PF03544  65     9e-15    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 65.4 bits (159), Expect = 9e-15
Identities = 34/198 (17%), Positives = 72/198 (36%), Gaps = 14/198 (7%)

Query: 70 ITQNIIEPAVEQRINQPDDIVDLPTLPEQPEGQRE---ITRKEPIKVKRPAENRATSRKP 126
I+ ++ PA + P + P +PE + E KE V + KP
Sbjct: 50 ISVTMVAPAD---LEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPK-PKPKPKP 105

Query: 127 VNKETQESDSKQSSPAAAASAMLSGTSQQVAAAVNSDSSHRQQAQVSWKSRLQGHLMGFK 186
+ E + P + A + ++ ++ + S S +
Sbjct: 106 KPVKKVEQPKRDVKPVESRPASPFENTAPARPTSSTATAATSKPVTSVASGPRALSRNQP 165

Query: 187 RYPSSARKQQQQGTAMIRFVVDKNGYVSSVQLSHSSGTSALDREALAIIKRAQPLPKPPA 246
+YP+ A+ + +G ++F V +G V +VQ+ + + +RE ++R + P P
Sbjct: 166 QYPARAQALRIEGQVKVKFDVTPDGRVDNVQILSAKPANMFEREVKNAMRRWRYEPGKPG 225

Query: 247 ELLSQGQITLSLPVDFNL 264
+ + + F +
Sbjct: 226 S-------GIVVNILFKI 236


69  YpsIP31758_0298  YpsIP31758_0308  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
YpsIP31758_0298      -1       11  -0.769056  B12-dependent methionine synthase
YpsIP31758_0299      -1       11  -1.645161  hemolysin
YpsIP31758_0300      -1       16  -0.012637  hemolysin activator protein
YpsIP31758_0301       0       21   0.488537  aspartate kinase III
YpsIP31758_0302       0       20  -0.105856  glucose-6-phosphate isomerase
YpsIP31758_0303      -1       21   0.992255  phosphate-starvation-inducible protein PsiE
YpsIP31758_0304      -2       23   1.451483  maltose transporter permease
YpsIP31758_0305      -2       19   1.283993  maltose transporter membrane protein
YpsIP31758_0306      -2       21   1.607604  maltose ABC transporter periplasmic protein
YpsIP31758_0307       0       20   2.000576  hypothetical protein
YpsIP31758_0308       0       20   1.001376  maltose/maltodextrin transporter ATP-binding
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0298  BCTERIALGSPD  31     0.042    Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D signature.

Length = 660

Score = 30.7 bits (69), Expect = 0.042
Identities = 18/83 (21%), Positives = 33/83 (39%), Gaps = 9/83 (10%)

Query: 347 AGLEPLTIDANTLFVNVGERTN---VTGSARFKRLIKEEKYGEALDVARQQVESGAQIID 403
+P+ + + +TN VT + + E+ LD+ R QV A I +
Sbjct: 298 QAAKPVAALDKNIIIKAHGQTNALIVTAAP--DVMNDLERVIAQLDIRRPQVLVEAIIAE 355

Query: 404 INMDEGMLDAEAAMVRFLNLIAG 426
+ D L+ +++ N AG
Sbjct: 356 VQ-DADGLNLG---IQWANKNAG 374


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
YpsIP31758_0299  PF05860  79     2e-19    haemagglutination activity domain.
		>PF05860#haemagglutination activity domain.

Length = 117

Score = 79.1 bits (195), Expect = 2e-19
Identities = 24/124 (19%), Positives = 42/124 (33%), Gaps = 21/124 (16%)

Query: 45 VSSVNGTSVINIVQPSASGLSHNQFQDFNVGEKGAVLNNATSAGNSILAGQLAANQNLNG 104
+++ T +I + S L H+ FQ+F+V G N N
Sbjct: 15 ITTEGNTRIIERGTQAGSNLFHS-FQEFSVPTSGTAFFN-------------------NP 54

Query: 105 QAASIILNEVISRNPSLLLGQQEIFGMTADYILANPNGITCNGCGFMNTNRESLVVGNPL 164
I++ V + S + G TA+ L NPNGI ++ +
Sbjct: 55 TNIQNIISRVTGGSVSNIDGLIRANA-TANLFLINPNGIIFGQNARLDIGGSFVGSTANR 113

Query: 165 IEQG 168
++
Sbjct: 114 LKFA 117


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0301  CARBMTKINASE  29     0.036    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 29.0 bits (65), Expect = 0.036
Identities = 18/89 (20%), Positives = 29/89 (32%), Gaps = 5/89 (5%)

Query: 214 DYTAALLGEALNVSRIDIWTDVPGIYTTDPRVVPAAKRIDKIAFEEAAEMATFGAKILHP 273
D L E +N I TDV G + + ++ EE + G
Sbjct: 216 DLAGEKLAEEVNADIFMILTDVNGAALYYGT--EKEQWLREVKVEELRKYYEEGH--FKA 271

Query: 274 ATLLPAVRSDIPVFVGSSKDPAAGGTLVC 302
++ P V + I F+ + A L
Sbjct: 272 GSMGPKVLAAIR-FIEWGGERAIIAHLEK 299


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0302  BCTERIALGSPD  33     0.005    Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D signature.

Length = 660

Score = 32.6 bits (74), Expect = 0.005
Identities = 15/66 (22%), Positives = 30/66 (45%), Gaps = 8/66 (12%)

Query: 69 LAKETDLAGAIKSMFSGEKINR-------TEDRAVLHIALRNRSNTPIVVDGKDVMPEVN 121
AK +DL + + S + + D+ ++ I ++N IV DVM ++
Sbjct: 276 YAKASDLVEVLTGISSTMQSEKQAAKPVAALDKNII-IKAHGQTNALIVTAAPDVMNDLE 334

Query: 122 AVLAKM 127
V+A++
Sbjct: 335 RVIAQL 340


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits       Score  E-value  Comments
YpsIP31758_0306  MALTOSEBP  679    0.0      Maltose binding protein signature.
		>MALTOSEBP#Maltose binding protein signature.

Length = 396

Score = 679 bits (1752), Expect = 0.0
Identities = 331/394 (84%), Positives = 367/394 (93%)

Query: 10 IGKTARVLALSALTTLVLSSSAFAKIEEGKLVIWINGDKGYNGLAEVGKKFEKDTGIKVT 69
I AR+LALSALTT++ S+SA AKIEEGKLVIWINGDKGYNGLAEVGKKFEKDTGIKVT
Sbjct: 3 IKTGARILALSALTTMMFSASALAKIEEGKLVIWINGDKGYNGLAEVGKKFEKDTGIKVT 62

Query: 70 IEHPDKLEEKFPQVAATGDGPDIIFWAHDRFGGYAQSGLLAELTPSKAFQEKLFPFTWDA 129
+EHPDKLEEKFPQVAATGDGPDIIFWAHDRFGGYAQSGLLAE+TP KAFQ+KL+PFTWDA
Sbjct: 63 VEHPDKLEEKFPQVAATGDGPDIIFWAHDRFGGYAQSGLLAEITPDKAFQDKLYPFTWDA 122

Query: 130 VRFNGKLIGYPVAVEALSLIYNKDLVKEAPKTWEEIPALDKTLRANGKSAIMWNLQEPYF 189
VR+NGKLI YP+AVEALSLIYNKDL+ PKTWEEIPALDK L+A GKSA+M+NLQEPYF
Sbjct: 123 VRYNGKLIAYPIAVEALSLIYNKDLLPNPPKTWEEIPALDKELKAKGKSALMFNLQEPYF 182

Query: 190 TWPVIAADGGYAFKFENGVYDAKNVGVNNAGAQAGLQFIVDLVKNKHINADTDYSIAEAA 249
TWP+IAADGGYAFK+ENG YD K+VGV+NAGA+AGL F+VDL+KNKH+NADTDYSIAEAA
Sbjct: 183 TWPLIAADGGYAFKYENGKYDIKDVGVDNAGAKAGLTFLVDLIKNKHMNADTDYSIAEAA 242

Query: 250 FNKGETAMTINGPWAWSNIDKSKINYGVTLLPTFHGQPSKPFVGVLTAGINAASPNKELA 309
FNKGETAMTINGPWAWSNID SK+NYGVT+LPTF GQPSKPFVGVL+AGINAASPNKELA
Sbjct: 243 FNKGETAMTINGPWAWSNIDTSKVNYGVTVLPTFKGQPSKPFVGVLSAGINAASPNKELA 302

Query: 310 TEFLENYLITDQGLAEVNKDKPLGAVALKSFQEQLAKDPRIAATMDNATNGEIMPNIPQM 369
EFLENYL+TD+GL VNKDKPLGAVALKS++E+LAKDPRIAATM+NA GEIMPNIPQM
Sbjct: 303 KEFLENYLLTDEGLEAVNKDKPLGAVALKSYEEELAKDPRIAATMENAQKGEIMPNIPQM 362

Query: 370 AAFWYATRSAVLNAITGRQTVEAALNDAATRITK 403
+AFWYA R+AV+NA +GRQTV+ AL DA TRITK
Sbjct: 363 SAFWYAVRTAVINAASGRQTVDEALKDAQTRITK 396


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
YpsIP31758_0308  PF05272  34     0.001    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 33.5 bits (76), Expect = 0.001
Identities = 13/32 (40%), Positives = 17/32 (53%)

Query: 32 VVFVGPSGCGKSTLLRMIAGLEDITSGELLIG 63
VV G G GKSTL+ + GL+ + IG
Sbjct: 599 VVLEGTGGIGKSTLINTLVGLDFFSDTHFDIG 630


70  YpsIP31758_0623  YpsIP31758_0633  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
YpsIP31758_0623       0       13  -3.989718  flagellar biosynthesis protein FlhB
YpsIP31758_0624       0       13  -3.819527  flagellar biosynthetic protein FliR
YpsIP31758_0625      -1       13  -3.559183  flagellar biosynthetic protein FliQ
YpsIP31758_0626      -1       11  -1.796350  flagellar biosynthesis protein FliP
YpsIP31758_0627      -1       15  -1.839173  flagellar motor switch protein FliN
YpsIP31758_0628      -2       14  -0.895179  lateral flagellar export/assembly protein
YpsIP31758_0629      -1       18   2.740464  sigma-54 dependent transcriptional regulator
YpsIP31758_0630       1       20   3.955437  flagellar hook-basal body complex protein FliE
YpsIP31758_0631       0       19   3.621250  flagellar MS-ring protein
YpsIP31758_0632       0       20   4.240300  flagellar motor switch protein G
YpsIP31758_0633      -2       19   4.248540  flagellar assembly protein H
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0623  TYPE3IMSPROT  296    e-101    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 296 bits (760), Expect = e-101
Identities = 98/344 (28%), Positives = 175/344 (50%)

Query: 5 SGEKSEKPTTGKLSKARKKGDIPRSKDVTMAAGLVTSFILLSLFLPYYKELISQSFVSVS 64
SGEK+E+PT K+ ARKKG + +SK+V A +V +L YY E S+ + +
Sbjct: 2 SGEKTEQPTPKKIRDARKKGQVAKSKEVVSTALIVALSAMLMGLSDYYFEHFSKLMLIPA 61

Query: 65 QLASQLNDQGALEQFLLANLFIFAKFLATLVPIPLFSMLATLIPGGWNFTPVKLIPDLKK 124
+ + Q L F L L ++ + ++ G+ + + PD+KK
Sbjct: 62 EQSYLPFSQALSYVVDNVLLEFFYLCFPLLTVAALMAIASHVVQYGFLISGEAIKPDIKK 121

Query: 125 LSPLAGIKRIFSASNGTEVLKMLAKCSIVLYTLYLVVHSSLDDLLHLQTLPLEEAITQGF 184
++P+ G KRIFS + E LK + K ++ +++++ +L LL L T +E
Sbjct: 122 INPIEGAKRIFSIKSLVEFLKSILKVVLLSILIWIIIKGNLVTLLQLPTCGIECITPLLG 181

Query: 185 AQYHHILLYFIAIVVVFAIIDIPLSHHLFTKKMKMTKQEVKQEHKNNDGNPEIKSRVRQL 244
+++ VV +I D ++ + K++KM+K E+K+E+K +G+PEIKS+ RQ
Sbjct: 182 QILRQLMVICTVGFVVISIADYAFEYYQYIKELKMSKDEIKREYKEMEGSPEIKSKRRQF 241

Query: 245 QRQYAIGQINKTVPSADVIITNPTHFSVALKYAPEKASAPYIVAKGKDDIALYIRSIAQK 304
++ + + V + V++ NPTH ++ + Y + P + K D +R IA++
Sbjct: 242 HQEIQSRNMRENVKRSSVVVANPTHIAIGILYKRGETPLPLVTFKYTDAQVQTVRKIAEE 301

Query: 305 HKIEIVEFPPLARAIYHTTKVNQQIPAQLYRAIAQVLTYVMQIK 348
+ I++ PLARA+Y V+ IPA+ A A+VL ++ +
Sbjct: 302 EGVPILQRIPLARALYWDALVDHYIPAEQIEATAEVLRWLERQN 345


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0624  TYPE3IMRPROT  105    2e-29    Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 105 bits (263), Expect = 2e-29
Identities = 70/237 (29%), Positives = 129/237 (54%), Gaps = 3/237 (1%)

Query: 17 LPFVRILSFLHFCPVIRHKAFTRKAKIGTALLLAILITPMISQPVVSRELLSIESLLLAG 76
P +R+L+ + P++ ++ ++ K+G A+++ I P + V + S +L LA
Sbjct: 18 WPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPANDV--PVFSFFALWLAV 75

Query: 77 EQILWGWLFGSMLHLVLAALEAAGQILSMNMGLGMAMMNDPTSGASTAVISQIIFTFSVL 136
+QIL G G + AA+ AG+I+ + MGL A DP S + V+++I+ ++L
Sbjct: 76 QQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHLNMPVLARIMDMLALL 135

Query: 137 IFFTLDGHLLFVTIVLKSFSSWPIG-EAINDFSLRSLALSLGWIISSATLLALPTTFIML 195
+F T +GHL +++++ +F + PIG E +N + +L + I + +LALP ++L
Sbjct: 136 LFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIFLNGLMLALPLITLLL 195

Query: 196 IVQGSFGLLNRISPTLNLFSLGFPIGMLFGLLCLLLLAINIPDHYLHLTNEILTQFE 252
+ + GLLNR++P L++F +GFP+ + G+ + L I HL +EI
Sbjct: 196 TLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFCEHLFSEIFNLLA 252


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0625  TYPE3IMQPROT  47     6e-11    Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.

Length = 86

Score = 47.5 bits (113), Expect = 6e-11
Identities = 25/74 (33%), Positives = 37/74 (50%)

Query: 14 GLHLVLMISIVAIVPSLLIGLLVSIFQATTQINEQTLSFLPRLVMTMLVLIFAGKWMMTK 73
L+LVL++S + + +IGLLV +FQ TQ+ EQTL F +L+ L L W
Sbjct: 11 ALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLFLLSGWYGEV 70

Query: 74 LSDFTVSIFQQAAQ 87
L + + A
Sbjct: 71 LLSYGRQVIFLALA 84


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0626  FLGBIOSNFLIP  219    6e-74    Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP signature.

Length = 245

Score = 219 bits (560), Expect = 6e-74
Identities = 112/236 (47%), Positives = 155/236 (65%), Gaps = 4/236 (1%)

Query: 19 LVGGLLYSPLLLAQEGGITLFNTVQTATGQDYNVKIEILILMTLLGLLPIMMLMMTCFTR 78
V L +PL AQ GIT + GQ +++ ++ L+ +T L +P ++LMMT FTR
Sbjct: 9 PVLLWLITPLAFAQLPGIT--SQPLPGGGQSWSLPVQTLVFITSLTFIPAILLMMTSFTR 66

Query: 79 FIIVLAILRQALGLQQSPPNKVLTGIALALTLLVMRPVWTKIHQDAVIPFQQDEITLSQA 138
IIV +LR ALG +PPN+VL G+AL LT +M PV KI+ DA PF +++I++ +A
Sbjct: 67 IIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEKISMQEA 126

Query: 139 LGRAEVPLKNYMLAQTSTKSLDQMMAIA--QVSGEPQQQDLSVVTPAYVLSELKTAFQIG 196
L + PL+ +ML QT L +A P+ + ++ PAYV SELKTAFQIG
Sbjct: 127 LEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELKTAFQIG 186

Query: 197 FMIYIPFLVIDLIVASILMAMGMMMLSPLIVSLPFKLMLFVLCDGWTLMVGTLTAS 252
F I+IPFL+IDL++AS+LMA+GMMM+ P ++LPFKLMLFVL DGW L+VG+L S
Sbjct: 187 FTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLAQS 242


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0627  FLGMOTORFLIN  72     3e-19    Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 71.9 bits (176), Expect = 3e-19
Identities = 35/77 (45%), Positives = 50/77 (64%)

Query: 54 RKMSLFSRIPVTLTLEVASVELPLSELLTVNNDSVIELDKLAGEPLDIRVNGIMFGQAEV 113
+ + L IPV LT+E+ + + ELL + SV+ LD LAGEPLDI +NG + Q EV
Sbjct: 52 QDIDLIMDIPVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEV 111

Query: 114 VVINEKYGLRIININSQ 130
VV+ +KYG+RI +I +
Sbjct: 112 VVVADKYGVRITDIITP 128


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0628  TYPE3OMOPROT  35     2e-04    Type III secretion system outer membrane O protein ...
		>TYPE3OMOPROT#Type III secretion system outer membrane O protein family signature.

Length = 303

Score = 35.0 bits (80), Expect = 2e-04
Identities = 25/103 (24%), Positives = 46/103 (44%), Gaps = 16/103 (15%)

Query: 147 GEHLIINNSTAALIACWSYRIDFFLKDYHKSGFSIFIDAPHIDRFINTIKTKSEKSVEKN 206
G+ L+I S A + C++ ++ F + ++ I ++ E N
Sbjct: 172 GDVLLIRTSRA-EVYCYAKKLGHFNR----------VEGGIIVETLDI----QHIEEENN 216

Query: 207 VSLSEKQLEHLVKKLPVTLTSQLSNINLTLAELMALKEGDIIS 249
+ + + L L +LPV L L N+TLAEL A+ + ++S
Sbjct: 217 TTETAETLPGL-NQLPVKLEFVLYRKNVTLAELEAMGQQQLLS 258


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits    Score  E-value  Comments
YpsIP31758_0629  HTHFIS  373    e-129    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 373 bits (958), Expect = e-129
Identities = 125/345 (36%), Positives = 184/345 (53%), Gaps = 22/345 (6%)

Query: 14 HGFVANAPSSVSVFSLARRVAEFNVPVLVTGETGTGKECVAKYIHQKAMGDASPYIAVNC 73
V + + ++ + R+ + ++ +++TGE+GTGKE VA+ +H P++A+N
Sbjct: 137 MPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALHDYGKRRNGPFVAINM 196

Query: 74 AAIPESMLEAILFGYEKGAFTGAIASVAGKFEQANGGTLLLDEIGDMPLALQVKLLRVLQ 133
AAIP ++E+ LFG+EKGAFTGA G+FEQA GGTL LDEIGDMP+ Q +LLRVLQ
Sbjct: 197 AAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDMPMDAQTRLLRVLQ 256

Query: 134 EQEVERLGSHKAIPLDIRIIASTNKDLSVEIAEGRFRQDLYYRLSVVPIHILPLRERPED 193
+ E +G I D+RI+A+TNKDL I +G FR+DLYYRL+VVP+ + PLR+R ED
Sbjct: 257 QGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRLNVVPLRLPPLRDRAED 316

Query: 194 ILPLVKAFINKYQSFLNVKIDITAEAQCELYKYTWPGNVRELENVIQRGIIMSNNGVI-- 251
I LV+ F+ + + EA + + WPGNVRELEN+++R + VI
Sbjct: 317 IPDLVRHFVQQAEKEGLDVKRFDQEALELMKAHPWPGNVRELENLVRRLTALYPQDVITR 376

Query: 252 ---------ELPSLGLPMAQGISSPVGETSLPF--------STIQPPDGENNIKLRGRLA 294
E+P + A S + + S
Sbjct: 377 EIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFASFGDALPPSGLYDRVLAEM 436

Query: 295 QYQYIVDLLQRHQGNKSKTAAFLGITPRALRYRLANMREDGIDIE 339
+Y I+ L +GN+ K A LG+ LR + +RE G+ +
Sbjct: 437 EYPLILAALTATRGNQIKAADLLGLNRNTLRKK---IRELGVSVY 478


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits         Score  E-value  Comments
YpsIP31758_0630  FLGHOOKFLIE  45     4e-09    Flagellar hook-basal body complex protein FliE signa...
		>FLGHOOKFLIE#Flagellar hook-basal body complex protein FliE signature.

Length = 103

Score = 44.7 bits (105), Expect = 4e-09
Identities = 23/73 (31%), Positives = 35/73 (47%), Gaps = 1/73 (1%)

Query: 53 NNLSFSQVLNGAIKSVDQLQHVASEKQTAMDMGISD-DLTGTMLASQKASVAFSAMVQVR 111
+SF+ L+ A+ + Q A + +G L M QKASV+ +QVR
Sbjct: 29 PTISFAGQLHAALDRISDTQTAARTQAEKFTLGEPGVALNDVMTDMQKASVSMQMGIQVR 88

Query: 112 NKLTSALDDVMNT 124
NKL +A +VM+
Sbjct: 89 NKLVAAYQEVMSM 101


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0631  FLGMRINGFLIF  283    1e-90    Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 283 bits (724), Expect = 1e-90
Identities = 152/565 (26%), Positives = 255/565 (45%), Gaps = 62/565 (10%)

Query: 12 GQLGENTKTILMSAAALLVTAAIIFSLWRSSQGYTALFGSQENIPVTQVVEVLEGEAIAY 71
+L N + L+ A + V + LW + Y LF + + +V L I Y
Sbjct: 17 NRLRANPRIPLIVAGSAAVAIVVAMVLWAKTPDYRTLFSNLSDQDGGAIVAQLTQMNIPY 76

Query: 72 RINPDNGQVLVAENQLGKARILLAAKGITATLPIGYELMDKESMLGSSQFIQNVRYKRSL 131
R +G + V +++ + R+ LA +G+ +G+EL+D+E G SQF + V Y+R+L
Sbjct: 77 RFANGSGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEK-FGISQFSEQVNYQRAL 135

Query: 132 EGELAQSMMALSAVEYARVHLGMSEASSFAISNRSDNSASVVLRLRYGQTLSTEQVGAIV 191
EGELA+++ L V+ ARVHL M + S F + + SASV + L G+ L Q+ A+V
Sbjct: 136 EGELARTIETLGPVKSARVHLAMPKPSLF-VREQKSPSASVTVTLEPGRALDEGQISAVV 194

Query: 192 QLVAGSIPGMKPANVRVVDQHGELLSQAYQANSEGVPSVKSGTELAHYLQSTTEKNIANL 251
LV+ ++ G+ P NV +VDQ G LL+ Q+N+ G + + A+ ++S ++ I +
Sbjct: 195 HLVSSAVAGLPPGNVTLVDQSGHLLT---QSNTSGRDLNDAQLKFANDVESRIQRRIEAI 251

Query: 252 LNSVIGANNYRISVSTQLDMSRIEETAERYGPDPRIN------DENIQQENSNDDMAMGI 305
L+ ++G N V+ QLD + E+T E Y P+ + + E G+
Sbjct: 252 LSPIVGNGNVHAQVTAQLDFANKEQTEEHYSPNGDASKATLRSRQLNISEQVGAGYPGGV 311

Query: 306 PGSLSNQPIPQSQAGQTPAAVSRSQAQ------------------------RKYIYDRNI 341
PG+LSNQP P ++A ++ AQ Y DR I
Sbjct: 312 PGALSNQPAPPNEAPIATPPTNQQNAQNTPQTSTSTNSNSAGPRSTQRNETSNYEVDRTI 371

Query: 342 RHVRYPGYKLEKMTVAVVLN-KSLPVL--EQWTPEQQEELKRLIEDAAGIDVKRGDSLTI 398
RH + +E+++VAVV+N K+L T +Q ++++ L +A G KRGD+L +
Sbjct: 372 RHTKMNVGDIERLSVAVVVNYKTLADGKPLPLTADQMKQIEDLTREAMGFSDKRGDTLNV 431

Query: 399 NVMAFAVP-TLIDEPVMPWWQEPSTFRWAELLGIGLLSLLVLW----FGVRPLMKRYSRK 453
F+ E +P+WQ+ S G LL L+V W VRP + R
Sbjct: 432 VNSPFSAVDNTGGE--LPFWQQQSFIDQLLAAGRWLLVLVVAWILWRKAVRPQLTR---- 485

Query: 454 GSENLPLAISSASADEALDHVDTGVDGVESSPRAETAFSASSLWESDDLPEQGSGLETKI 513
E + + A + Q G E
Sbjct: 486 -------------RVEEAKAAQEQAQVRQETEEAVEVRLSKDEQLQQRRANQRLGAEVMS 532

Query: 514 AHLQQLAQSETERTAEVIKQWINSN 538
+++++ ++ A VI+QW++++
Sbjct: 533 QRIREMSDNDPRVVALVIRQWMSND 557


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0632  FLGMOTORFLIG  173    3e-53    Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 173 bits (439), Expect = 3e-53
Identities = 85/334 (25%), Positives = 165/334 (49%), Gaps = 2/334 (0%)

Query: 15 KSDTKGRSRLEQASILLLSIGEEAAAMVMQQLSREEVVCVSQMMSRLHNIKLDQARQALD 74
D + ++A+ILL+SIG E ++ V + LS+EE+ ++ +++L I + L
Sbjct: 9 ILDVSALTGKQKAAILLVSIGSEISSKVFKYLSQEEIESLTFEIAKLETITSELKDNVLL 68

Query: 75 DFFRDYREQSGINGASRSYLQAILNKALGSDIAKSVINGIYGDEIRHRMTRLQWVDTPQL 134
+F Q I Y + +L K+LG+ A +IN + ++ D +
Sbjct: 69 EFKELMMAQEFIQKGGIDYARELLEKSLGTQKAVDIINNLGSALQSRPFEFVRRADPANI 128

Query: 135 VALIDQEHLQLQAVFLAFLPPDVAAAVLAYLDKDHQDDILYRIAKLDDVNRDVVDEL-DR 193
+ I QEH Q A+ L++L P A+ +L+ L + Q ++ RIA +D + +VV E+
Sbjct: 129 LNFIQQEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEVVREVERV 188

Query: 194 LIERGVAVLSEHGSKVIGIKQAANIVNRIPGNQQQ-LLDQLGERDEEVLNELKDEMYEFF 252
L ++ ++ SE + G+ I+N ++ +++ L E D E+ E+K +M+ F
Sbjct: 189 LEKKLASLSSEDYTSAGGVDNVVEIINMADRKTEKFIIESLEEEDPELAEEIKKKMFVFE 248

Query: 253 ILSRQSEATLQRLMDLIPMSDWAIALKGTEPALRQAIYDVLPKRQIQQLQNATQRTGAVP 312
+ + ++QR++ I + A ALK + +++ I+ + KR L+ + G
Sbjct: 249 DIVLLDDRSIQRVLREIDGQELAKALKSVDIPVQEKIFKNMSKRAASMLKEDMEFLGPTR 308

Query: 313 VSRVEHIRKVIMAQVRELAEAGEIQVQLFAEQTM 346
VE ++ I++ +R+L E GEI + E+ +
Sbjct: 309 RKDVEESQQKIVSLIRKLEEQGEIVISRGGEEDV 342


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
YpsIP31758_0633  FLGFLIH  59     1e-12    Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 59.0 bits (142), Expect = 1e-12
Identities = 46/204 (22%), Positives = 100/204 (49%), Gaps = 11/204 (5%)

Query: 18 QFPPLRKVRQVALSAADQTLDPAEYQKQLMAGFQEGISQGFDKGLAEGKEEGYQEGVRLG 77
+F P+ + + + A+ +L+ Q Q+ A QG+ G+AEG+++G+++G + G
Sbjct: 21 EFVPIVEPEETIIEEAEPSLEQQLAQLQMQAH-----EQGYQAGIAEGRQQGHKQGYQEG 75

Query: 78 HDDGLKKGRIEGRQSELASFNDVIKPFSGYVTQLHTYLETYEQRRRDELLQLVEKVTRQV 137
GL++G E +S+ A + ++ V++ T L+ + L+Q+ + RQV
Sbjct: 76 LAQGLEQGLAEA-KSQQAPIHARMQQL---VSEFQTTLDALDSVIASRLMQMALEAARQV 131

Query: 138 IRCELALQPAQLLTLVEEALAALPKVPQQLKVYLNPAEFGRINDV--APEKVQAWGLAAD 195
I + + L+ +++ L P + ++ ++P + R++D+ A + W L D
Sbjct: 132 IGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGATLSLHGWRLRGD 191

Query: 196 PEMVGGECRIVTETTEIDVGCQHR 219
P + G C++ + ++D R
Sbjct: 192 PTLHPGGCKVSADEGDLDASVATR 215


71  YpsIP31758_0640  YpsIP31758_0654  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
YpsIP31758_0640       1       18   2.991938  flagellar basal body rod protein FlgC
YpsIP31758_0641       2       17   3.419757  flagellar basal body rod modification protein
YpsIP31758_0642       2       19   3.347782  flagellar hook protein FlgE
YpsIP31758_0643       0       16   1.073553  flagellar basal body rod protein FlgF
YpsIP31758_0644       0       15  -1.487306  flagellar basal body rod protein FlgG
YpsIP31758_0645       0       17  -1.521968  flagellar basal body L-ring protein
YpsIP31758_0646       0       18  -2.656451  flagellar basal body P-ring protein
YpsIP31758_0647       0       17  -4.619851  peptidoglycan hydrolase
YpsIP31758_0648       0       14  -3.743550  flagellar hook-associated protein FlgK
YpsIP31758_0649       2       14  -2.266518  flagellar hook-associated protein FlgL
YpsIP31758_0650       3       11  -1.613642  flagellar hook associated protein lafW
YpsIP31758_0651       2       13  -2.799701  hypothetical protein
YpsIP31758_0652       2       13  -2.150829  transcriptional regulator
YpsIP31758_0653       3       16  -0.758842  flagellin
YpsIP31758_0654       2       17  -2.036113  flagellin
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits        Score  E-value  Comments
YpsIP31758_0640  FLGHOOKAP1  30     0.004    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 29.5 bits (66), Expect = 0.004
Identities = 6/37 (16%), Positives = 19/37 (51%)

Query: 102 VNVVSEMADMMSASRSFETNVEVLNSVKSMQQSVLKL 138
VN+ E ++ + + N +VL + ++ +++ +
Sbjct: 509 VNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits        Score  E-value  Comments
YpsIP31758_0642  FLGHOOKAP1  38     4e-05    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 38.4 bits (89), Expect = 4e-05
Identities = 19/60 (31%), Positives = 27/60 (45%), Gaps = 5/60 (8%)

Query: 2 SFSIANTALNAHTEQLNTISNNIANSATKGFKASR----TEFASMYAQSQ-PLGVTVSGV 56
+ A + LNA LNT SNNI++ G+ +++ A GV VSGV
Sbjct: 3 LINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVSGV 62



Score = 33.4 bits (76), Expect = 0.001
Identities = 10/42 (23%), Positives = 22/42 (52%)

Query: 360 LENSNVDITAELVGLMTAQRNYQASTKIISTNDSMMNALFQV 401
S V++ E L Q+ Y A+ +++ T +++ +AL +
Sbjct: 504 QSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits        Score  E-value  Comments
YpsIP31758_0644  FLGHOOKAP1  42     2e-06    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 41.9 bits (98), Expect = 2e-06
Identities = 11/42 (26%), Positives = 20/42 (47%)

Query: 213 QLEQGALEGSNVQVVEEMVDMITVQRAYEMNAKMVSAADDML 254
QL S V + EE ++ Q+ Y NA+++ A+ +
Sbjct: 498 QLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIF 539



Score = 40.7 bits (95), Expect = 3e-06
Identities = 20/78 (25%), Positives = 35/78 (44%), Gaps = 14/78 (17%)

Query: 2 NSALWVSKTGLAAQDAKMGAISNNLANVNTDGFKRDRVVFADLFYQNQRTPGAPLDQNNT 61
+S + + +GL A A + SNN+++ N G+ R + A N+T
Sbjct: 1 SSLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMA--------------QANST 46

Query: 62 TPSGIQFGSGVQIVGTQK 79
+G G+GV + G Q+
Sbjct: 47 LGAGGWVGNGVYVSGVQR 64


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0645  FLGLRINGFLGH  153    1e-48    Flagellar L-ring protein signature.
		>FLGLRINGFLGH#Flagellar L-ring protein signature.

Length = 232

Score = 153 bits (387), Expect = 1e-48
Identities = 76/228 (33%), Positives = 113/228 (49%), Gaps = 14/228 (6%)

Query: 4 FLILTPMVLALCGCESPALLVQKDDAEFAPPANLVQPATVTEGGGLF---QPAY--NWSL 58
+ I + +VL+L GC A A P P G +F QP L
Sbjct: 9 YAISSLLVLSLTGCAWIPSTPLVQGATSAQPVPGPTPVA---NGSIFQSAQPINYGYQPL 65

Query: 59 LQDRRAYRIGDILTVILDESTQSSKQAKTNFGKKNDMSLGVPEVLGKKLNKFGGSI---- 114
+DRR IGD LT++L E+ +SK + N + + G V FG +
Sbjct: 66 FEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYLQGLFGNARADVE 125

Query: 115 -SGKRDFDGSATSAQQNMLRGSITVAVHQVLPNGVLVIRGEKWLTLNQGDEYMRVTGLVR 173
SG F+G + N G++TV V QVL NG L + GEK + +NQG E++R +G+V
Sbjct: 126 ASGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRFSGVVN 185

Query: 174 ADDVARDNSVSSQRIANARISYAGRGALSDANSAGWLTRFFNHPLFPI 221
++ N+V S ++A+ARI Y G G +++A + GWL RFF + L P+
Sbjct: 186 PRTISGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLN-LSPM 232


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0646  FLGPRINGFLGI  319    e-109    Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 319 bits (819), Expect = e-109
Identities = 148/369 (40%), Positives = 211/369 (57%), Gaps = 12/369 (3%)

Query: 8 LVLSAAFVLPMALALPTATAQPLGSLVDIQGVRGNQLVGYSLVVGLDGSGDK-NQVKFTG 66
LV SA L A A + + +Q R NQL+GY LVVGL G+GD FT
Sbjct: 11 LVFSALPFLSTPPAQ--ADTSRIKDIASLQAGRDNQLIGYGLVVGLQGTGDSLRSSPFTE 68

Query: 67 QSMANMLRQFGVQLPEKMDPKVKNVAAVAISATLPPGYGRGQSIDITVSSIGDAKSLRGG 126
QSM ML+ G+ KN+AAV ++A LPP G +D+TVSS+GDA SLRGG
Sbjct: 69 QSMRAMLQNLGITTQGG-QSNAKNIAAVMVTANLPPFASPGSRVDVTVSSLGDATSLRGG 127

Query: 127 TLLLTQLRGADGEVYALAQGNVVVGGIKAEGDSGSSVTVNTPTVGRIPNGASIERQVPSD 186
L++T L GADG++YA+AQG ++V G A+GD +++T T R+PNGA IER++PS
Sbjct: 128 NLIMTSLSGADGQIYAVAQGALIVNGFSAQGD-AATLTQGVTTSARVPNGAIIERELPSK 186

Query: 187 FQTNNQVVLNLKRPSFKSANNVALALNR----AFGANTATAQSATNVMVNAPQDAGARVA 242
F+ + +VL L+ P F +A VA +N +G A + + + V P+ A
Sbjct: 187 FKDSVNLVLQLRNPDFSTAVRVADVVNAFARARYGDPIAEPRDSQEIAVQKPRVA-DLTR 245

Query: 243 FMSLLEDVQINAGQQSPRVVFNARTGTVVIGEGVIVRAAAVSHGNLTVSIRERKNVSQPN 302
M+ +E++ + +VV N RTGT+VIG V + AVS+G LTV + E V QP
Sbjct: 246 LMAEIENLTVET-DTPAKVVINERTGTIVIGADVRISRVAVSYGTLTVQVTESPQVIQPA 304

Query: 303 TLGGGKTVTTPESDIEVTKGKNQMVMVPAGTRLRSIVNTINSLGASPDDIMAILQALHEA 362
G+T P++DI + +++ +V G LR++V +NS+G D I+AILQ + A
Sbjct: 305 PFSRGQTAVQPQTDIMAMQEGSKVAIVE-GPDLRTLVAGLNSIGLKADGIIAILQGIKSA 363

Query: 363 GALDAELVV 371
GAL AELV+
Sbjct: 364 GALQAELVL 372


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
YpsIP31758_0647  FLGFLGJ  45     6e-09    Flagellar protein FlgJ signature.
		>FLGFLGJ#Flagellar protein FlgJ signature.

Length = 313

Score = 45.1 bits (106), Expect = 6e-09
Identities = 19/77 (24%), Positives = 41/77 (53%), Gaps = 4/77 (5%)

Query: 18 GDLQPQDLEQAAVQFEAVFMRTLLQQMRKAAEVLAADDDPFNSKQQRMMRDFYDDKLAST 77
G+ ++ A Q E +F++ +L+ MR A D F+S+ R+ YD ++A
Sbjct: 26 GEDPAANIRPVARQVEGMFVQMMLKSMRDAL----PKDGLFSSEHTRLYTSMYDQQIAQQ 81

Query: 78 LASQRSSGIANLLIQQL 94
+ + + G+A ++++Q+
Sbjct: 82 MTAGKGLGLAEMMVKQM 98


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits        Score  E-value  Comments
YpsIP31758_0648  FLGHOOKAP1  152    5e-43    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 152 bits (386), Expect = 5e-43
Identities = 90/324 (27%), Positives = 151/324 (46%), Gaps = 8/324 (2%)

Query: 4 IKTAFSGMQATQAHLNATSMNIANIHTPGYSRQRAEQSAIGADGQGGINAGNGVNVDAIR 63
I A SG+ A QA LN S NI++ + GY+RQ + + G GNGV V ++
Sbjct: 4 INNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVSGVQ 63

Query: 64 RLSKQYVVMQEWQANSQQQYYEAGEQYLKAVELMVSNESTSLATGLNNFFSSLSAATQMP 123
R ++ Q A +Q A + + ++ M+S ++SLAT + +FF+SL
Sbjct: 64 REYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTLVSNA 123

Query: 124 DSTPMRQQIIESANAMALRFNNVNNFIAQQKNSIGQQRDISIKEINSLTRSIADYNQQIL 183
+ RQ +I + + +F + ++ Q + S+ +IN+ + IA N QI
Sbjct: 124 EDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLNDQIS 183

Query: 184 K--NRSDGNNINDLLDKQELQIKKLSGLIETQVNQAEDGTYRVSVKQGQPLVNGAVAAEL 241
+ G + N+LLD+++ + +L+ ++ +V+ + GTY +++ G LV G+ A +L
Sbjct: 184 RLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGSTARQL 243

Query: 242 AVDTRSADTKINIHFSGTTQGMNMSC------GGQLGGINDYEFTTLKKLQASTQGMAKT 295
A SAD N+ G LGGI + L + + + +A
Sbjct: 244 AAVPSSADPSRTTVAYVDGTAGNIEIPEKLLNTGSLGGILTFRSQDLDQTRNTLGQLALA 303

Query: 296 VADEFNTQLKLGSDFTGANGRDLF 319
A+ FNTQ K G D G G D F
Sbjct: 304 FAEAFNTQHKAGFDANGDAGEDFF 327



Score = 61.5 bits (149), Expect = 2e-12
Identities = 36/145 (24%), Positives = 67/145 (46%), Gaps = 4/145 (2%)

Query: 311 TGANGRDLFVFNPGDPNGMLQLSAITAEQLALAGRG-EPSGDS--SNLFKLIDIKQKNVV 367
D F P + ++ + + ++ +A E +GDS N L+D++ +
Sbjct: 402 GTPAVNDSFTLKPVS-DAIVNMDVLITDEAKIAMASEEDAGDSDNRNGQALLDLQSNSKT 460

Query: 368 GMNSVRLDDAATTLVGYIAITSNRNYSELENAENTLNQATRYRESISGVNNDEEAINLME 427
+ +DA +LV I + + N + Q + ++SISGVN DEE NL
Sbjct: 461 VGGAKSFNDAYASLVSDIGNKTATLKTSSATQGNVVTQLSNQQQSISGVNLDEEYGNLQR 520

Query: 428 YQRAYQSNMKVIATGDKLFSDLLAL 452
+Q+ Y +N +V+ T + +F L+ +
Sbjct: 521 FQQYYLANAQVLQTANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits       Score  E-value  Comments
YpsIP31758_0653  FLAGELLIN  100    3e-25    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 100 bits (250), Expect = 3e-25
Identities = 68/294 (23%), Positives = 114/294 (38%), Gaps = 7/294 (2%)

Query: 2 ALSIHTNASAKTAINSLSNAGLANAKSSQRLSTGFRINSPADNAAGLQITNRMEKFLNGA 61
A I+TN+ + N+L+ + + + + +RLS+G RINS D+AAG I NR + G
Sbjct: 1 AQVINTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGL 60

Query: 62 GQAKQNIQESIAMLQIADGGLAESVKTLNAMKKLATQAANDTNSAADREAIQKEFTELGK 121
QA +N + I++ Q +G L E L +++L+ QA N TNS +D ++IQ E + +
Sbjct: 61 TQASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLE 120

Query: 122 ELQNALNNTEYNSEKLFADGGKMRKELNFQSGTDAGSSLKLNLNDVIAELTESVTKPGTA 181
E+ N T++N K+ + +M Q G + G ++ ++L + +
Sbjct: 121 EIDRVSNQTQFNGVKVLSQDNQM----KIQVGANDGETITIDLQKIDVKSLGLDGFNVNG 176

Query: 182 ITADASGTPAQKELARLNQVTADALREKELAKKAKTDLGAVQAGANATANIDIPEYKDAN 241
G + N D + + GAV A D AN
Sbjct: 177 PKEATVGDLK---SSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAAN 233

Query: 242 GQTVLGKRIASGATVSAGDIAQIDAAVTALTQVHTDADKASTDYANNNLVGGGV 295
GQ + A A D + V +
Sbjct: 234 GQLTTDDAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTI 287



Score = 63.1 bits (153), Expect = 6e-13
Identities = 50/333 (15%), Positives = 103/333 (30%), Gaps = 6/333 (1%)

Query: 64 AKQNIQESIAMLQIADGGLAESVKTLNAMKKLATQAANDTNSAADREAIQKEFTELGKEL 123
+++ S + D + K + A + D+ + +L +
Sbjct: 181 TVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTTDD 240

Query: 124 QNALNNTEYNSEKLFADGGKMRKELNFQSGTDAGSSLKLNLNDVIAELTESVTKPGTAIT 183
+ G K + T++ ++
Sbjct: 241 AENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTIDTKTGNDGNGKVS 300

Query: 184 ADASGTPAQKELARLNQVTADALREKELAKKAKTDLGAVQAGANATANIDIPEYKDANGQ 243
+G +A + A+ + V + K+ + +
Sbjct: 301 TTINGEKVTLTVADITAGAAN------VDAATLQSSKNVYTSVVNGQFTFDDKTKNESAK 354

Query: 244 TVLGKRIASGATVSAGDIAQIDAAVTALTQVHTDADKASTDYANNNLVGGGVMNMRLADK 303
+ + S + + A T A K + V + A K
Sbjct: 355 LSDLEANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFIDKTASGVSTLINEDAAAAK 414

Query: 304 DLAMEADKKLSEVIDAYGAFRATLGANQNRLQSSSNNLDNMISNTAQALGSIKDTDFADE 363
+ + A R++LGA QNR S+ NL N ++N A I+D D+A E
Sbjct: 415 KSTANPLASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARSRIEDADYATE 474

Query: 364 MKNHAQSEMLMQSSVMMLKKANAATQLISTLLQ 396
+ N +++++L Q+ +L +AN Q + +LL+
Sbjct: 475 VSNMSKAQILQQAGTSVLAQANQVPQNVLSLLR 507


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits       Score  E-value  Comments
YpsIP31758_0654  FLAGELLIN  96     9e-24    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 96.3 bits (239), Expect = 9e-24
Identities = 72/291 (24%), Positives = 119/291 (40%), Gaps = 6/291 (2%)

Query: 2 ALSIHTNASAKTAINSLSNAGLANAKSSQRLSTGFRINSPADNAAGLQITNRMEKFLNGA 61
A I+TN+ + N+L+ + + + + +RLS+G RINS D+AAG I NR + G
Sbjct: 1 AQVINTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGL 60

Query: 62 GQAKQNIQESIAMLQIADGGLAESVKTLNAMKKLATQAANDTNSAADREAIQKEFTELGK 121
QA +N + I++ Q +G L E L +++L+ QA N TNS +D ++IQ E + +
Sbjct: 61 TQASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLE 120

Query: 122 ELQNALNNTEYNSEKLFADGGKMRKELNFQSGTDAESSLKLNLNSVIAELTESVTTKATP 181
E+ N T++N K+ + +M Q G + ++ ++L + +
Sbjct: 121 EIDRVSNQTQFNGVKVLSQDNQM----KIQVGANDGETITIDLQKIDVKSLGLDGFNVNG 176

Query: 182 VKADDAGSTLEKEADVLDKATKAAKAAKEAAEAAQLAIAGKKDGDAITATAIPEYKDATG 241
K G +V T A A K + A+ + A G
Sbjct: 177 PKEATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVY--VNAANG 234

Query: 242 QVIAAKTTGTTLSTADINQINDAATALTKAHAAAEKAENLFQAKNSTGGGV 292
Q+ T + A TA KA A A K + G
Sbjct: 235 QLTTDDAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTF 285



Score = 61.6 bits (149), Expect = 2e-12
Identities = 53/330 (16%), Positives = 99/330 (30%), Gaps = 3/330 (0%)

Query: 64 AKQNIQESIAMLQIADGGLAESVKTLNAMKKLATQAANDTNSAADREAIQKEFTELGKEL 123
+++ S + D + K + A + D+ + +L +
Sbjct: 181 TVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTTDD 240

Query: 124 QNALNNTEYNSEKLFADGGKMRKELNFQSGTDAESSLKLNLNSVIAELTESVTTKATPVK 183
+ G K + E T++ V
Sbjct: 241 AENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTIDTKTGNDGNGKVS 300

Query: 184 ADDAGSTLEKEADVLDKATKAAKAAKEAAEAAQLAIAGKKDGDAITATAIPEYKDATGQV 243
G + + AA + T + A
Sbjct: 301 TTINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKTK---NESAKLSD 357

Query: 244 IAAKTTGTTLSTADINQINDAATALTKAHAAAEKAENLFQAKNSTGGGVMEMQLLDKDLA 303
+ A S +N A A A K + + + + E K
Sbjct: 358 LEANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFIDKTASGVSTLINEDAAAAKKST 417

Query: 304 MMADKKLSDVIDAYGAFRATLGANQNRLQSSSNNLDNMISNTAQALGSIKDTDFADEMKN 363
+ + A R++LGA QNR S+ NL N ++N A I+D D+A E+ N
Sbjct: 418 ANPLASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARSRIEDADYATEVSN 477

Query: 364 HAQSEMLMQSSVMMLKKANAATQLISTLLQ 393
+++++L Q+ +L +AN Q + +LL+
Sbjct: 478 MSKAQILQQAGTSVLAQANQVPQNVLSLLR 507


72  YpsIP31758_0662  YpsIP31758_0668  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
YpsIP31758_0662  -1      13       0.079563   hypothetical protein
YpsIP31758_0663  -1      16       2.482805   hypothetical protein
YpsIP31758_0664  -1      16       2.447844   hypothetical protein
YpsIP31758_0665  -2      15       2.454628   outer membrane lipoprotein PcP
YpsIP31758_0666  -1      12       -0.925530  sensory box-containing diguanylate cyclase
YpsIP31758_0667  -1      13       -0.841904  RND family efflux transporter MFP subunit
YpsIP31758_0668  2       17       2.888183   RND efflux transporter
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits        Score  E-value  Comments
YpsIP31758_0662  OMPADOMAIN  38     3e-05    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 38.4 bits (89), Expect = 3e-05
Identities = 26/116 (22%), Positives = 41/116 (35%), Gaps = 17/116 (14%)

Query: 171 FQRSSAVLTPFFSRLLGELAPAFNEM---DNKIIITGHTDASRYRDQLLYNNWNLSGERA 227
F + A L P L +L + + D +++ G+TD D N LS RA
Sbjct: 223 FNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTD-RIGSDAY---NQGLSERRA 278

Query: 228 LMAHKALVSGGLDKGRVLQI----------NAMADQMLLDPTDPLAAKNRRIEIMV 273
L+S G+ ++ N + A +RR+EI V
Sbjct: 279 QSVVDYLISKGIPADKISARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEV 334


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0665  FLGPRINGFLGI  34     2e-04    Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 33.8 bits (77), Expect = 2e-04
Identities = 28/126 (22%), Positives = 49/126 (38%), Gaps = 7/126 (5%)

Query: 32 SQAGQTQSVTHGTLVSVRPVTIQGGDGNNVAGAVGGAVIGGFLGNTIGGGTGRRLGTAAG 91
S G S+ G L+ ++ G DG A A G ++ GF + + T+A
Sbjct: 116 SSLGDATSLRGGNLIMT---SLSGADGQIYAVAQGALIVNGFSAQGDAATLTQGVTTSAR 172

Query: 92 VVAGGVVGQQVQSMMNRSSAVELEVRRDDGSTFLVVQAQGVTQFHP---GQRVTIATSGS 148
V G ++ +++ S S + L++R D ST + V V F G +
Sbjct: 173 VPNGAIIERELPSKFKDSVNLVLQLRNPDFSTAVRVADV-VNAFARARYGDPIAEPRDSQ 231

Query: 149 TVTITP 154
+ +
Sbjct: 232 EIAVQK 237


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits       Score  E-value  Comments
YpsIP31758_0667  RTXTOXIND  47     6e-08    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 47.1 bits (112), Expect = 6e-08
Identities = 27/177 (15%), Positives = 55/177 (31%), Gaps = 17/177 (9%)

Query: 10 RHSLLSHALFLLILGAGTVSAAPAPLPAVTVAVVASITPDNAVQYLGRIEAIQAVDVTTR 69
R L + L + + + V +T GR + I+ +
Sbjct: 55 RRPRLVAYFIMGFLVIAFILSVLGQVEIV-ATANGKLTHS------GRSKEIKPI----- 102

Query: 70 TEGFIARRLFTEGKMVKQGELLYEIDPALHQASVAQAQAQLDSATASANHAQVNLTRLQR 129
+ + EG+ V++G++L ++ +A + Q+ L A Q+ ++
Sbjct: 103 ENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIEL 162

Query: 130 LGNNRSVSQAE-----VDEAQAQRDISRAAVAQAQANLQIQQLQLSFTQIHAPISGQ 181
E V E + R S + Q Q +L+ + A
Sbjct: 163 NKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTV 219



Score = 39.4 bits (92), Expect = 2e-05
Identities = 24/175 (13%), Positives = 52/175 (29%), Gaps = 46/175 (26%)

Query: 75 ARRLFTEGKMVKQGELL-YEIDPALHQASVAQAQAQLDSATASANHAQVNLTRLQRLGNN 133
L E Q + E++ +A A+++ + + L L +
Sbjct: 187 LTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHK 246

Query: 134 RSVS-------QAEVDEAQAQRDISRAAVAQ---------------------------AQ 159
++++ + + EA + + ++ + Q Q
Sbjct: 247 QAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQ 306

Query: 160 ANLQIQQL---------QLSFTQIHAPISGQ-MGHSRFNVGSLINPASGTLVNIV 204
I L + + I AP+S + G ++ A TL+ IV
Sbjct: 307 TTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAE-TLMVIV 360


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0668  ACRIFLAVINRP  894    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 894 bits (2312), Expect = 0.0
Identities = 421/1036 (40%), Positives = 611/1036 (58%), Gaps = 17/1036 (1%)

Query: 1 MLHFFIRRPKFAIVIALVITLVGWVSLYVIPVEQYPDITPPVVSVSAVYPGASARDVAQA 60
M +FFIRRP FA V+A+++ + G +++ +PV QYP I PP VSVSA YPGA A+ V
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VASPLEAQVNGVSHMLYMESTSANNGSYQLSITFASGTDPDMAAVEVQNRISQVSAQLPA 120
V +E +NG+ +++YM STS + GS +++TF SGTDPD+A V+VQN++ + LP
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 121 EVNENGISVRKRASNLLLGVSVFSPQQTHDALFVSNYTSIQLRDAIARISGVGDVQVFGA 180
EV + GISV K +S+ L+ S +S+Y + ++D ++R++GVGDVQ+FGA
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 181 RDYSMRVWLDPQRMESLNVSVQDIVAALQQQNVQAAAGQIGSSPSMPNQQQTLTISGQGR 240
+ Y+MR+WLD + ++ D++ L+ QN Q AAGQ+G +P++P QQ +I Q R
Sbjct: 181 Q-YAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 241 LTDARQFADVIIRSNPQGGMIRLGDVARVALGAQNYQVSAAQNQTESAFLVVYPVPGANA 300
+ +F V +R N G ++RL DVARV LG +NY V A N +A L + GANA
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANA 299

Query: 301 LNVANGVRDEMARLSAAFPADLTYEINYDSTLPVTATLHEIAVSLTLTLIVVLAVVYLFL 360
L+ A ++ ++A L FP + YD+T V ++HE+ +L +++V V+YLFL
Sbjct: 300 LDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFL 359

Query: 361 QSLRATFIVALTVPVSLLGTFAVLYVFGYSANTLSLFAIILALTIVVDDAIVVVENVERL 420
Q++RAT I + VPV LLGTFA+L FGYS NTL++F ++LA+ ++VDDAIVVVENVER+
Sbjct: 360 QNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERV 419

Query: 421 LSNDPHLSPAEATRQAMSQIAGPIIATTLVLMAVFVPIAILPGIIGELYRQFAVTLSAAV 480
+ D L P EAT ++MSQI G ++ +VL AVF+P+A G G +YRQF++T+ +A+
Sbjct: 420 MMED-KLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAM 478

Query: 481 ILSSINALTLSPALCAVLLKRRTL----ATTGMFGTINKGLDRARDGYVGLTGRINRRAV 536
LS + AL L+PALCA LLK + G FG N D + + Y G+I
Sbjct: 479 ALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTG 538

Query: 537 FSIAALLLVGLATWWGYSRLPTSFLPEEDQGYFFVSLQLPDGASLNRTQTVMDQMYQQVS 596
+ L+ + RLP+SFLPEEDQG F +QLP GA+ RTQ V+DQ+
Sbjct: 539 RYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYL 598

Query: 597 TNDA--VEDVIKITGFSLLSGNNAPNAGFAIVMLKPWGQRP----HIDRVLASIQANLAA 650
N+ VE V + GFS A NAG A V LKPW +R + V+ + L
Sbjct: 599 KNEKANVESVFTVNGFSF--SGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGK 656

Query: 651 IPSAMIMAVNPPAIAGLGSASGFDLRIQALLGQSPQELAQVSQGIIFAANQDP-TLSRVF 709
I ++ N PAI LG+A+GFD + G L Q ++ A Q P +L V
Sbjct: 657 IRDGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVR 716

Query: 710 TTFSASVPETNLSIDRDRAALLQVPVSRIFQTLQTSLGGMNAGDFTLNNRMFRVQLQNDM 769
+ L +D+++A L V +S I QT+ T+LGG DF R+ ++ +Q D
Sbjct: 717 PNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADA 776

Query: 770 NFRQRTAQINNLNVRSDNGALVSLANLVTLTPSVGAPFISNFNQFPSVAISGSAADGASS 829
FR ++ L VRS NG +V + T G+P + +N PS+ I G AA G SS
Sbjct: 777 KFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSS 836

Query: 830 GQAMAAMEALLAQNLPQGYSYSWSGMSWQEQQTGGQVAFIYLAALVFAYLFLVAQYESWS 889
G AMA ME L ++ LP G Y W+GMS+QE+ +G Q + + V +L L A YESWS
Sbjct: 837 GDAMALMENLASK-LPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWS 895

Query: 890 IPLVVVLSVVFAVGGAVAGLSAMGFANDVYAQIGLVLLIGLAAKNAILIVEFSK-ARREE 948
IP+ V+L V + G + + NDVY +GL+ IGL+AKNAILIVEF+K +E
Sbjct: 896 IPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKE 955

Query: 949 GASIAEAAQDGAKQRFRAVMMTAISFILGVMPLVFASGAGAMSRQIIGITVFGGMLMATA 1008
G + EA + R R ++MT+++FILGV+PL ++GAG+ ++ +GI V GGM+ AT
Sbjct: 956 GKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATL 1015

Query: 1009 VGILFIPALYLHIQRL 1024
+ I F+P ++ I+R
Sbjct: 1016 LAIFFVPVFFVVIRRC 1031



Score = 83.0 bits (205), Expect = 3e-18
Identities = 105/513 (20%), Positives = 193/513 (37%), Gaps = 37/513 (7%)

Query: 533 RRAVFSIAALLLVGLATWWGYSRLPTSFLPEEDQGYFFVSLQLPDGASLNRTQTVMDQMY 592
RR +F+ +++ +A +LP + P VS P GA QTV D +
Sbjct: 7 RRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYP-GAD---AQTVQDTVT 62

Query: 593 QQVSTNDAVEDVIKITGFSLLSGNNAPNAGFAIVMLKPWGQRPHIDRVLASIQANLAAIP 652
Q + N + I +S + I + G P I +V +Q L
Sbjct: 63 QVIEQN-----MNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQ--VQNKLQLAT 115

Query: 653 SAMIMAVNPPAIAGLGSASGFDLRIQALLGQSPQELAQVSQGIIFAANQDPTLSRV---- 708
+ V I+ S+S + + A Q A+N TLSR+
Sbjct: 116 PLLPQEVQQQGISVEKSSSSY--LMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVG 173

Query: 709 -FTTFSASVPETNLSIDRDRAALLQVPVSRIF-----QTLQTSLGGMNAGDFTL-NNRMF 761
F A + +D D ++ + Q Q + G +
Sbjct: 174 DVQLFGAQY-AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNA 232

Query: 762 RVQLQNDMNFRQRTAQINNLNVRSD-NGALVSLANLVTLTPSVGA-PFISNFNQFPSVAI 819
+ Q + + + +R + +G++V L ++ + I+ N P+ +
Sbjct: 233 SIIAQTRF---KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGL 289

Query: 820 SGSAADGASSGQAMAAMEALLAQ---NLPQGYSYSW-SGMSWQEQQTGGQVAFIYLAALV 875
A GA++ A++A LA+ PQG + + Q + +V A++
Sbjct: 290 GIKLATGANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIM 349

Query: 876 FAYLFLVAQYESWSIPLVVVLSVVFAVGGAVAGLSAMGFANDVYAQIGLVLLIGLAAKNA 935
+L + ++ L+ ++V + G A L+A G++ + G+VL IGL +A
Sbjct: 350 LVFLVMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDA 409

Query: 936 ILIVE-FSKARREEGASIAEAAQDGAKQRFRAVMMTAISFILGVMPLVFASG-AGAMSRQ 993
I++VE + E+ EA + Q A++ A+ +P+ F G GA+ RQ
Sbjct: 410 IVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQ 469

Query: 994 IIGITVFGGMLMATAVGILFIPALYLHIQRLRE 1026
IT+ M ++ V ++ PAL + +
Sbjct: 470 F-SITIVSAMALSVLVALILTPALCATLLKPVS 501


73  YpsIP31758_0690  YpsIP31758_0698  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
YpsIP31758_0690  2       16       2.381641   ABC transporter ATP-binding protein/permease
YpsIP31758_0691  2       18       3.320710   hypothetical protein
YpsIP31758_0692  1       16       2.803974   hypothetical protein
YpsIP31758_0693  1       19       3.972068   fructuronate transporter
YpsIP31758_0694  1       18       3.574831   autotransporter protein
YpsIP31758_0695  0       16       2.387220   autotransporter protein
YpsIP31758_0696  1       15       0.788097   VgrG protein
YpsIP31758_0697  2       16       -0.717939  hypothetical protein
YpsIP31758_0698  1       17       -1.163355  YD repeat-/RHS repeat-containing protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0690  ACRIFLAVINRP  30     0.026    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 30.2 bits (68), Expect = 0.026
Identities = 9/39 (23%), Positives = 22/39 (56%)

Query: 138 IIATASVLCFFSLGLLLKDWRMALAMLSTLPLAVCAYIL 176
++A + V+ F L L + W + ++++ +PL + +L
Sbjct: 875 LVAISFVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLL 913


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits       Score  E-value  Comments
YpsIP31758_0694  PERTACTIN  61     2e-11    Pertactin signature.
		>PERTACTIN#Pertactin signature.

Length = 922

Score = 60.9 bits (147), Expect = 2e-11
Identities = 104/458 (22%), Positives = 159/458 (34%), Gaps = 66/458 (14%)

Query: 820 SGEGAWLTLDTVLGD---------DDSATDRLVINGDATGTTSVRVNNAGGLGDKTLNGI 870
+G L +DT+ G D +D+LV+ DA+G + V N+G + N +
Sbjct: 463 AGRFKVLMVDTLAGSGLFRMNVFADLGLSDKLVVMRDASGQHRLWVRNSGSEPA-SGNTM 521

Query: 871 NLITVDGLAQDDTFLLAGDYVTTDGYQAVVGGAYAYTLQADGEA--------ATAGRNWY 922
L+ + TF LA DG V G Y Y L A+G A
Sbjct: 522 LLVQTPRGSAA-TFTLA----NKDG--KVDIGTYRYRLAANGNGQWSLVGAKAPPAPKPA 574

Query: 923 LSSELMLTEGVRYQVGVPLYEQYPQVLAALNTLPTLQQRVGNRYGAPGALA----DLNFD 978
P Q PQ P Q G A A +
Sbjct: 575 PQPGPQPGPQPPQPPQPPQPPQPPQPPQRQPEAPAPQPPAGRELSAAANAAVNTGGVGLA 634

Query: 979 DNQW----------------------AWGRIEGSHQVTDPARSTSGSQREIDVWKLQTGI 1016
W AWGR Q D + +G + + V + G
Sbjct: 635 STLWYAESNALSKRLGELRLNPDAGGAWGRGFAQRQQLD---NRAGRRFDQKVAGFELGA 691

Query: 1017 DVPLYQSQDGSLLTGGVNFSYGKAKADIHSFFGDGRINSAGYGLGTSLTWYGNNGVYVDG 1076
D + + L G ++ G F GDG ++ +G T+ N+G Y+D
Sbjct: 692 DHAVAVAGGRWHLGGLAGYTRGD-----RGFTGDGGGHTDSVHVGGYATYIANSGFYLDA 746

Query: 1077 QLQTMWFDSDLS-SRTAGHAVASGNNGRGYTSAIEAGKGYALGNGLSLTPQMQVTYSRVD 1135
L+ ++D + + G+AV G ++EAG+ +A +G L PQ ++ RV
Sbjct: 747 TLRASRLENDFKVAGSDGYAVKGKYRTHGVGVSLEAGRRFAHADGWFLEPQAELAVFRVG 806

Query: 1136 FDTFRDPFDSEVSLQEGDSLRGRLGVSLDKETTWSAKDGTTRRSHIYSHFDLHNEFLNGS 1195
+R V + G S+ GRLG+ + K R+ Y + EF
Sbjct: 807 GGAYRAANGLRVRDEGGSSVLGRLGLEVGKRIEL----AGGRQVQPYIKASVLQEFDGAG 862

Query: 1196 KVQVSGVEFAT--RDERQSVGLGAGGTYEWQNGRYAVY 1231
V+ +G+ T R R +GLG + YA Y
Sbjct: 863 TVRTNGIAHRTELRGTRAELGLGMAAALGRGHSLYASY 900


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits       Score  E-value  Comments
YpsIP31758_0695  PERTACTIN  67     3e-13    Pertactin signature.
		>PERTACTIN#Pertactin signature.

Length = 922

Score = 67.4 bits (164), Expect = 3e-13
Identities = 101/434 (23%), Positives = 152/434 (35%), Gaps = 57/434 (13%)

Query: 835 DDSATDRLVINGDATGTTSVRVNNAGGLGDKTLNGINLITVDGLAQDDTFLLAGDYVTTD 894
D +D+LV+ DA+G + V N+G + N + L+ + TF LA D
Sbjct: 487 DLGLSDKLVVMRDASGQHRLWVRNSGSEPA-SGNTMLLVQTPRGSAA-TFTLA----NKD 540

Query: 895 GYQAVVAGAYAYTLQADGEA--------ATAGRNWYLSSELMLTEGVRYQVGVPLYEQYP 946
G V G Y Y L A+G A P Q P
Sbjct: 541 G--KVDIGTYRYRLAANGNGQWSLVGAKAPPAPKPAPQPGPQPGPQPPQPPQPPQPPQPP 598

Query: 947 QVLAALNTLPTLQQRVGNRYGAPGALA----DLNFDDNQW-------------------- 982
Q P Q G A A + W
Sbjct: 599 QPPQRQPEAPAPQPPAGRELSAAANAAVNTGGVGLASTLWYAESNALSKRLGELRLNPDA 658

Query: 983 --AWGRIEGSHQVTDPARSTSGSQREIDVWKLQTGIDVPLYQSQGGSLLTGGVNFTYGKA 1040
AWGR Q D + +G + + V + G D + + G L G +T G
Sbjct: 659 GGAWGRGFAQRQQLD---NRAGRRFDQKVAGFELGADHAVAVAGGRWHLGGLAGYTRGD- 714

Query: 1041 KADIHSFFGDGRINSAGYGLGTSLTWYGNNGVYVDGQLQTMWFDSDLS-SRTAGHAVASG 1099
F GDG ++ +G T+ N+G Y+D L+ ++D + + G+AV
Sbjct: 715 ----RGFTGDGGGHTDSVHVGGYATYIANSGFYLDATLRASRLENDFKVAGSDGYAVKGK 770

Query: 1100 NNGRGYTSAIEAGKGYALGNGLSLTPQMQVTYSRVDFDTFRDPFDSEVSLQEGDSLRGRL 1159
G ++EAG+ +A +G L PQ ++ RV +R V + G S+ GRL
Sbjct: 771 YRTHGVGVSLEAGRRFAHADGWFLEPQAELAVFRVGGGAYRAANGLRVRDEGGSSVLGRL 830

Query: 1160 GVSLDKETTWSAKDGTTRRSHIYSHLDLHNEFLNGSKVQVSGVEFAT--RDERQSVGLGA 1217
G+ + K R+ Y + EF V+ +G+ T R R +GLG
Sbjct: 831 GLEVGKRIEL----AGGRQVQPYIKASVLQEFDGAGTVRTNGIAHRTELRGTRAELGLGM 886

Query: 1218 GGTYEWQNGRYAVY 1231
+ YA Y
Sbjct: 887 AAALGRGHSLYASY 900


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0696  ICENUCLEATIN  37     4e-04    Ice nucleation protein signature.
		>ICENUCLEATIN#Ice nucleation protein signature.

Length = 1258

Score = 36.7 bits (84), Expect = 4e-04
Identities = 43/189 (22%), Positives = 65/189 (34%), Gaps = 1/189 (0%)

Query: 532 NTTVLNDRSTTVSGNHTETVTKDQAVTVSGNQTMDITQDQTITVTGTQRIDVTQDRIIDV 591
+T ++S +G + + + ++G + +I G Q+R
Sbjct: 758 STQTAREQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYHSILTAGYGSTQTAQERSDLT 817

Query: 592 TAEQQTTVKADDRLLISGKQKTKIDLDQEYEVVGSQKKTIGANQTLKVGGYQKNTLEGYK 651
T T+ D LI+G T+ G + GY + GY
Sbjct: 818 TGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYD 877

Query: 652 KSKIGG-DNTTTVGGHDKLTVGDTITITAGTSITLQCGASSIVMDEAGNIKITGVNITST 710
S I G +T T G + LT G T TA + L G S + I G T T
Sbjct: 878 SSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYESSLIAGYGSTQT 937

Query: 711 ASTTHTIKA 719
AS T+ A
Sbjct: 938 ASFKSTLMA 946



Score = 34.7 bits (79), Expect = 0.002
Identities = 32/181 (17%), Positives = 67/181 (37%), Gaps = 9/181 (4%)

Query: 532 NTTVLNDRSTTVSGNHTETVTKDQAVTVSGNQTMDITQDQTITVTGTQRIDVTQDRIIDV 591
+T + S +G + + ++ ++G + ++ + G +++
Sbjct: 902 STQTAQENSDLTTGYGSTSTAGYESSLIAGYGSTQTASFKSTLMAGYGSSQTAREQSSLT 961

Query: 592 TAEQQTTVKADDRLLISGKQKTKIDLDQEYEVVGSQKKTIGANQTLKVGGYQKNTLEGYK 651
T++ D LI+G T+ G Q + + + GY
Sbjct: 962 AGYGSTSMAGYDSSLIAGYGSTQT--------AGYQSTLTAGYGSTQTAEHSSTLTAGYG 1013

Query: 652 KSKIGGDNTTTVGGH-DKLTVGDTITITAGTSITLQCGASSIVMDEAGNIKITGVNITST 710
+ G +++ + G+ LT G +TAG TL G S++ G+ I+G + T
Sbjct: 1014 STATAGADSSLIAGYGSSLTSGIRSFLTAGYGSTLISGLRSVLTAGYGSSLISGRRSSLT 1073

Query: 711 A 711
A
Sbjct: 1074 A 1074



Score = 34.0 bits (77), Expect = 0.003
Identities = 39/188 (20%), Positives = 59/188 (31%), Gaps = 15/188 (7%)

Query: 532 NTTVLNDRSTTVSGNHTETVTKDQAVTVSGNQTMDITQDQTITVTGTQRIDVTQDRIIDV 591
+T +RS +G + + + ++G + +I G Q+
Sbjct: 806 STQTAQERSDLTTGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLT 865

Query: 592 TAEQQTTVKADDRLLISGKQKTKIDLDQEYEVVGSQKKTIGANQTLKVGGYQKNTLEGYK 651
T T+ D LI+G T+ T G N L G T +
Sbjct: 866 TGYGSTSTAGYDSSLIAGYGSTQ---------------TAGYNSILTAGYGSTQTAQENS 910

Query: 652 KSKIGGDNTTTVGGHDKLTVGDTITITAGTSITLQCGASSIVMDEAGNIKITGVNITSTA 711
G +T+T G L G T TA TL G S + G TS A
Sbjct: 911 DLTTGYGSTSTAGYESSLIAGYGSTQTASFKSTLMAGYGSSQTAREQSSLTAGYGSTSMA 970

Query: 712 STTHTIKA 719
++ A
Sbjct: 971 GYDSSLIA 978



Score = 32.4 bits (73), Expect = 0.008
Identities = 37/133 (27%), Positives = 54/133 (40%), Gaps = 5/133 (3%)

Query: 606 LISGKQKTKIDLDQEYEVVGS-QKKTIGANQTLKVGGYQKNTLEGYKKSKIGGDNTTTVG 664
LI+G + T+I ++ + G +T G TL G K G D+T T G
Sbjct: 1088 LIAGPESTQITGNRSMLIAGKGSSQTAGYRSTLISGADSVQMAGERGKLIAGADSTQTAG 1147

Query: 665 GHDKLTVGDTITITAGTSITLQCGASSIVMDEAGNIKITGVNITSTASTTHTIKAKTVTS 724
KL G+ +TAG L G I+M + G+N TA ++K + S
Sbjct: 1148 DRSKLLAGNNSYLTAGDRSKLTAGNDCILMAGDRSKLTAGINSILTAGC----RSKLIGS 1203

Query: 725 CGDTENIVEGGIL 737
G T E +L
Sbjct: 1204 NGSTLTAGENSVL 1216


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0698  ACRIFLAVINRP  31     0.043    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 31.0 bits (70), Expect = 0.043
Identities = 12/37 (32%), Positives = 20/37 (54%), Gaps = 3/37 (8%)

Query: 218 KSLSLPTSVMLPIPMGRPVVVGGMPVLNLLALMMGLF 254
+S S+P SVML +P+G +VG + L ++
Sbjct: 892 ESWSIPVSVMLVVPLG---IVGVLLAATLFNQKNDVY 925


74  YpsIP31758_0711  YpsIP31758_0715  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
YpsIP31758_0711  0       18       3.609208   aerobactin siderophore biosynthesis protein
YpsIP31758_0712  0       15       2.677426   aerobactin siderophore biosynthesis protein
YpsIP31758_0713  -1      13       1.555841   aerobactin siderophore biosynthesis protein
YpsIP31758_0714  0       14       -0.652727  aerobactin siderophore biosynthesis protein
YpsIP31758_0715  0       16       -3.385038  major facilitator transporter
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits         Score  E-value  Comments
YpsIP31758_0711  INVEPROTEIN  29     0.046    Salmonella/Shigella invasion protein E (InvE) signat...
		>INVEPROTEIN#Salmonella/Shigella invasion protein E (InvE) signature.

Length = 372

Score = 28.9 bits (64), Expect = 0.046
Identities = 13/44 (29%), Positives = 25/44 (56%)

Query: 221 NALDEAAFANEYFMPEYVESFYTLNDSAKQHMLAEQRMTSDGIT 264
A+ + F EY+ E + + ++ D A +H +AEQR T + ++
Sbjct: 329 KAIPSSLFYEEYWQEELLMALRSMTDIAYKHEMAEQRRTIEKLS 372


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
YpsIP31758_0712  PF04183  736    0.0      IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 736 bits (1902), Expect = 0.0
Identities = 381/576 (66%), Positives = 447/576 (77%), Gaps = 1/576 (0%)

Query: 5 NYANWQQVNRHMIAKILSELEYERTLHAELHGETG-RITLPGAVYTFNGKRGIWGWLHID 63
N+ +W VNR ++AK+LSELEYE+ HAE G+ I LPGA + F +RGIWGWL ID
Sbjct: 2 NHKDWDLVNRRLVAKMLSELEYEQVFHAESQGDDRYCINLPGAQWRFIAERGIWGWLWID 61

Query: 64 PATLRCEGVPLAADHMLRQLALVLKMDDSQVAEHLEDLYATLRGDMQLLSARHGMSAEAL 123
TLRC P+ A +L QL VL M D+ VAEH++DLYATL GD+QLL AR G+SA L
Sbjct: 62 AQTLRCADEPVLAQTLLMQLKQVLSMSDATVAEHMQDLYATLLGDLQLLKARRGLSASDL 121

Query: 124 IALNDDALQCLLAGHPKFIFNKGRRGWGLTALQHYAPEYQGQFRLHWVAAKRGSFIWCVD 183
I LN D LQCLL+GHPKF+FNKGRRGWG AL+ YAPEY FRLHW+A KR IW D
Sbjct: 122 INLNADRLQCLLSGHPKFVFNKGRRGWGKEALERYAPEYANTFRLHWLAVKREHMIWRCD 181

Query: 184 AEYPLDNLLNSAMDPAERQRFDRRWRECQLNDDWVPVPLHPWQWQQKIALHFLPQLAEGE 243
E + LL +AMDP E RF + W+E L+ +W+P+P+HPWQWQQKIA F+ AEG
Sbjct: 182 NEMDIHQLLTAAMDPQEFARFSQVWQENGLDHNWLPLPVHPWQWQQKIATDFIADFAEGR 241

Query: 244 LIELGEFGDHYLAQQSLRTLTNVSRRVPFDIKLPLTIYNTSCYRGIPGKYISAGPAASRW 303
++ LGEFGD +LAQQSLRTLTN SRR DIKLPLTIYNTSCYRGIPG+YI+AGP ASRW
Sbjct: 242 MVSLGEFGDQWLAQQSLRTLTNASRRGGLDIKLPLTIYNTSCYRGIPGRYIAAGPLASRW 301

Query: 304 LQQVFAQDRTLHESGAEILGEPAAGYMLHQTYATLAKAPYRCQEMLGVIWRENPSCYLRE 363
LQQVFA D TL +SGA ILGEPAAGY+ H+ YA LA+APYR QEMLGVIWRENP +L+
Sbjct: 302 LQQVFATDATLVQSGAVILGEPAAGYVSHEGYAALARAPYRYQEMLGVIWRENPCRWLKP 361

Query: 364 GEHAILMATLMETNNQGHPLIAAYIARSGLSAEAWLEQMFRVVVVPMYHLMCCYGVALIA 423
E +LMATLME + PL AYI RSGL AE WL Q+FRVVVVP+YHL+C YGVALIA
Sbjct: 362 DESPVLMATLMECDENNQPLAGAYIDRSGLDAETWLTQLFRVVVVPLYHLLCRYGVALIA 421

Query: 424 HGQNITLVMKDHAPQRILLKDFQGDMRLVDKDFPQAASLPNVVKEVTVRLSADYLIHDLQ 483
HGQNITL MK+ PQR+LLKDFQGDMRLV ++FP+ SLP V++VT RLSADYLIHDLQ
Sbjct: 422 HGQNITLAMKEGVPQRVLLKDFQGDMRLVKEEFPEMDSLPQEVRDVTSRLSADYLIHDLQ 481

Query: 484 TGHFVTVLRFISPLMQACNLSEYRFYQLLAQVLERYMAQHPDLADRFTLFNLFKPQIIRV 543
TGHFVTVLRFISPLM + E RFYQLLA VL YM +HP +++RF LF+LF+PQIIRV
Sbjct: 482 TGHFVTVLRFISPLMVRLGVPERRFYQLLAAVLSDYMKKHPQMSERFALFSLFRPQIIRV 541

Query: 544 VLNPVKLTYSEQDGGSRMLPDYLQDLDNPLYLVTKE 579
VLNPVKLT+ + DGGSRMLP+YL+DL NPL+LVT+E
Sbjct: 542 VLNPVKLTWPDLDGGSRMLPNYLEDLQNPLWLVTQE 577


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
YpsIP31758_0714  PF04183  320    e-104    IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 320 bits (821), Expect = e-104
Identities = 101/457 (22%), Positives = 170/457 (37%), Gaps = 37/457 (8%)

Query: 62 TQHHHYLFPAYLHQQGNDRQDDDTPVKLGIEQLVTLLLEKPTVKGELSDDVVARFRQRVL 121
+ F A G D T L LL + +SD VA Q +
Sbjct: 41 LPGAQWRFIAERGIWGWLWIDAQTLRCADEPVLAQTLLMQLKQVLSMSDATVAEHMQDLY 100

Query: 122 ESHDNTQQAINIRLDWPSLRDKPLNFAQAEQGLLAGHAFHPAPKSHQPFNEKQAQRYLPD 181
+ Q + R + LN Q LL+GH K + + ++ +RY P+
Sbjct: 101 ATLLGDLQLLKARRGLSASDLINLNA-DRLQCLLSGHPKFVFNKGRRGWGKEALERYAPE 159

Query: 182 FASRFPLRWFAVDKRYLCGDSLKLTLQHRLQRFASESAPQLLAYFT--------DDVW-L 232
+A+ F L W AV + ++ H Q + PQ A F+ D W
Sbjct: 160 YANTFRLHWLAVKREHMIWRCDNEMDIH--QLLTAAMDPQEFARFSQVWQENGLDHNWLP 217

Query: 233 LPMHPWQADHLLKQDWCQQLVQQNALHDLGEAGERWLPTSSSRSLYSPSNRD--MVKFSL 290
LP+HPWQ + D+ + + LGE G++WL S R+L + S R +K L
Sbjct: 218 LPVHPWQWQQKIATDFIADFAEGR-MVSLGEFGDQWLAQQSLRTLTNASRRGGLDIKLPL 276

Query: 291 SVRLTNSVRTLSVKEAKRGMRLARLAQTPRWQELQARY--------PTFRVMQEDGWAGL 342
++ T+ R + + G +R Q + P + +G+A L
Sbjct: 277 TIYNTSCYRGIPGRYIAAGPLASRWLQQVFATDATLVQSGAVILGEPAAGYVSHEGYAAL 336

Query: 343 RSADFTLQEESLLVLRDNLLFSQPDSQTNVLVTLTQAAPDGGDSLLASAVRRLAARLNLP 402
A + QE ++ R+N ++ VL+ + L + + R
Sbjct: 337 ARAPYRYQEMLGVIWRENPCRWLKPDESPVLMATLMECDENNQPLAGAYIDRSG------ 390

Query: 403 LQQAAFCWLDAYCQHVLLPLFSTEADYGLVLLAHQQNILVEMQQDLPVGMLYRDCQGSGF 462
A WL + V++PL+ YG+ L+AH QNI + M++ +P +L +D QG
Sbjct: 391 --LDAETWLTQLFRVVVVPLYHLLCRYGVALIAHGQNITLAMKEGVPQRVLLKDFQGD-- 446

Query: 463 TQSALPWLAEIGEAEAENSFSEQQLLRYFPYYLLVNS 499
+ E+ E + + L++
Sbjct: 447 MRLVKEEFPEMDSLPQE----VRDVTSRLSADYLIHD 479


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
YpsIP31758_0715  TCRTETA  43     2e-06    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 42.5 bits (100), Expect = 2e-06
Identities = 42/180 (23%), Positives = 73/180 (40%), Gaps = 16/180 (8%)

Query: 24 FCVGLLGIGQNGLLVVLPVLVSRTHLSLSVWAG---LLTLGSMLFLVGSAWWGRQSEIRG 80
V L +G ++ VLP L+ S V A LL L +++ + G S+ G
Sbjct: 12 STVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFG 71

Query: 81 CKFVVIMALAGYLLSFVLLALAVWGLSAGWLSEMAGLGWLIVARIIYGLTVSGMVPASQT 140
+ V++++LAG + + ++A A W+ L + RI+ G+T + A
Sbjct: 72 RRPVLLVSLAGAAVDYAIMATA----PFLWV--------LYIGRIVAGITGATGAVAGAY 119

Query: 141 WALQRAGYEQRMAALATISSGLSCGRLLGPLCAALALSIHPIAPLWLMAITPLIALLVVY 200
A G ++R +S+ G + GP+ L P AP + A + L
Sbjct: 120 IADITDG-DERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGC 178


75  YpsIP31758_0952  YpsIP31758_0963  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias     Product
YpsIP31758_0952  6       42       -13.824908  general secretion pathway protein C-like
YpsIP31758_0953  6       42       -13.567722  general secretion pathway protein D
YpsIP31758_0954  4       41       -13.700360  general secretory pathway protein E
YpsIP31758_0955  6       44       -15.801741  general secretion pathway protein F
YpsIP31758_0956  6       45       -16.621021  general secretion pathway protein G
YpsIP31758_0957  5       44       -16.653333  general secretion pathway protein I
YpsIP31758_0958  6       42       -16.784428  general secretion pathway protein J
YpsIP31758_0959  5       42       -16.743225  general secretion pathway protein K
YpsIP31758_0960  6       41       -16.052691  general secretion pathway protein L
YpsIP31758_0961  4       32       -11.204792  hypothetical protein
YpsIP31758_0963  3       30       -8.817158   type IV prepilin peptidase
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0952  BCTERIALGSPC  45     4e-08    Bacterial general secretion pathway protein C signa...
		>BCTERIALGSPC#Bacterial general secretion pathway protein C signature.

Length = 272

Score = 44.6 bits (105), Expect = 4e-08
Identities = 19/62 (30%), Positives = 31/62 (50%)

Query: 105 IKLVGVIEHSAPSESIAILEVKGKQTTHLTRENINYEDIVIVKIFTDRVIIKRNGKYYSL 164
+ L GV+ S SIAI+ +Q + E + + IV I DRV+++ G+Y L
Sbjct: 95 LSLTGVMAGDDDSRSIAIISKDNEQFSRGVNEEVPGYNAKIVSIRPDRVVLQYQGRYEVL 154

Query: 165 II 166
+
Sbjct: 155 GL 156


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0953  BCTERIALGSPD  543    0.0      Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D signature.

Length = 660

Score = 543 bits (1401), Expect = 0.0
Identities = 310/610 (50%), Positives = 432/610 (70%), Gaps = 15/610 (2%)

Query: 3 ISGKGIKSIHGMIFLFTLIMPLDIISANFSVSFKDVDIKEFINSVSKNINKTIIIDPTVQ 62
I I+S + +F ++ + FS SFK DI+EFIN+VSKN+NKT+IIDP+V+
Sbjct: 2 IIANVIRSFSLTLLIFAALLFRPAAAEEFSASFKGTDIQEFINTVSKNLNKTVIIDPSVR 61

Query: 63 GLISIRSYENLDKDTYYQLFLNVLDVYGYAAIEMPHNVLKVISSKRAKGVVAPLPKEGVT 122
G I++RSY+ L+++ YYQ FL+VLDVYG+A I M + VLKV+ SK AK P+ +
Sbjct: 62 GTITVRSYDMLNEEQYYQFFLSVLDVYGFAVINMNNGVLKVVRSKDAKTAAVPVASDAAP 121

Query: 123 FDGDELINRVIPLRYISAKKITPLLRQLNDNTESGSIINYDPSNILLITGRAAVVNRLHS 182
GDE++ RV+PL ++A+ + PLLRQLNDN GS+++Y+PSN+LL+TGRAAV+ RL +
Sbjct: 122 GIGDEVVTRVVPLTNVAARDLAPLLRQLNDNAGVGSVVHYEPSNVLLMTGRAAVIKRLLT 181

Query: 183 IVTDLDQAGDNEIELYKLNYAIAADVVKIVNEAINPINNLKQEVSIVGKVIADERTNSIL 242
IV +D AGD + L++A AADVVK+V E + S+V V+ADERTN++L
Sbjct: 182 IVERVDNAGDRSVVTVPLSWASAADVVKLVTELNKDTSKSALPGSMVANVVADERTNAVL 241

Query: 243 ISGDTYIRKKSILMIKKLDKRQSSDGNTKVVYMKYAQASKLLDVLNGISEGFHNEKKTKQ 302
+SG+ R++ I MIK+LD++Q++ GNTKV+Y+KYA+AS L++VL GIS +EK+ +
Sbjct: 242 VSGEPNSRQRIIAMIKQLDRQQATQGNTKVIYLKYAKASDLVEVLTGISSTMQSEKQAAK 301

Query: 303 SNQWNQRPVAIKAYDQTNALVITADPDMMLALGEVIEKLDIRRAQVLVEAIIVETQNGEG 362
+ + IKA+ QTNAL++TA PD+M L VI +LDIRR QVLVEAII E Q+ +G
Sbjct: 302 PVAALDKNIIIKAHGQTNALIVTAAPDVMNDLERVIAQLDIRRPQVLVEAIIAEVQDADG 361

Query: 363 INLGVKWENKRSDDINF----IKNSDGLLNNNGWGIATTIT-----------GLTAGFYK 407
+NLG++W NK + F + S + N + T++ G+ AGFY+
Sbjct: 362 LNLGIQWANKNAGMTQFTNSGLPISTAIAGANQYNKDGTVSSSLASALSSFNGIAAGFYQ 421

Query: 408 GNWDVLLSALSTNTNNNILATPSIVTLDNMEAEFNVGQEVPVLISTQTTTTDKVYNSISR 467
GNW +LL+ALS++T N+ILATPSIVTLDNMEA FNVGQEVPVL +QTT+ D ++N++ R
Sbjct: 422 GNWAMLLTALSSSTKNDILATPSIVTLDNMEATFNVGQEVPVLTGSQTTSGDNIFNTVER 481

Query: 468 QSIGVMLKVKPQINKGDSVLLEIRQEVSSIADSSTVNTHNLGSVFNKRVVNNAVLVKSGE 527
+++G+ LKVKPQIN+GDSVLLEI QEVSS+AD+++ + +LG+ FN R VNNAVLV SGE
Sbjct: 482 KTVGIKLKVKPQINEGDSVLLEIEQEVSSVADAASSTSSDLGATFNTRTVNNAVLVGSGE 541

Query: 528 TVVVGGLLDKKSSTIVNKVPFLGDLPLIGWLFRQTKEKVEKSNLILFIKPTILRESDDYS 587
TVVVGGLLDK S +KVP LGD+P+IG LFR T +KV K NL+LFI+PT++R+ D+Y
Sbjct: 542 TVVVGGLLDKSVSDTADKVPLLGDIPVIGALFRSTSKKVSKRNLMLFIRPTVIRDRDEYR 601

Query: 588 VVTSKEYNKY 597
+S +Y +
Sbjct: 602 QASSGQYTAF 611


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0955  BCTERIALGSPF  354    e-122    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 354 bits (911), Expect = e-122
Identities = 169/406 (41%), Positives = 264/406 (65%), Gaps = 7/406 (1%)

Query: 1 MAVFKYVAISRSGTKITGDIDAENIRIARYLLYKKNMHVLSI-------KKRILLFNKYV 53
MA + Y A+ G K G +A++ R AR LL ++ + LS+ +K
Sbjct: 1 MAQYHYQALDAQGKKCRGTQEADSARQARQLLRERGLVPLSVDENRGDQQKSGSTGLSLR 60

Query: 54 VKKNSNKTDLVLITRQIATLVNASMPLDEVLDIVGKQNSKSKMIEIIQRIRVNIQEGHSF 113
K + +DL L+TRQ+ATLV ASMPL+E LD V KQ+ K + +++ +R + EGHS
Sbjct: 61 RKIRLSTSDLALLTRQLATLVAASMPLEEALDAVAKQSEKPHLSQLMAAVRSKVMEGHSL 120

Query: 114 ADALSPFPAVFSPLYKTMVTAGEVSGHLGLVLVRLAEHIEQTQKIQRKIIQALIYPCVLV 173
ADA+ FP F LY MV AGE SGHL VL RLA++ EQ Q+++ +I QA+IYPCVL
Sbjct: 121 ADAMKCFPGSFERLYCAMVAAGETSGHLDAVLNRLADYTEQRQQMRSRIQQAMIYPCVLT 180

Query: 174 LISLSVIVILLTAVVPNIVEQFSFSETALPLSTKVLMILSYSIKENVIFIIAIGVSAVIF 233
+++++V+ ILL+ VVP +VEQF + ALPLST+VLM +S +++ +++ ++ +
Sbjct: 181 VVAIAVVSILLSVVVPKVVEQFIHMKQALPLSTRVLMGMSDAVRTFGPWMLLALLAGFMA 240

Query: 234 LNRLLKINKINIFFHRHYLSLPMLGNMFVRINTSRYLRTLTTLHSNGVTIVQAMSISNAV 293
+L+ K + FHR L LP++G + +NT+RY RTL+ L+++ V ++QAM IS V
Sbjct: 241 FRVMLRQEKRRVSFHRRLLHLPLIGRIARGLNTARYARTLSILNASAVPLLQAMRISGDV 300

Query: 294 LTNVYIKNKLNISVKLVSEGCSLSSSLVDSGVFPPIILHMIISGERSGKLDHMLETVAGV 353
++N Y +++L+++ V EG SL +L + +FPP++ HMI SGERSG+LD MLE A
Sbjct: 301 MSNDYARHRLSLATDAVREGVSLHKALEQTALFPPMMRHMIASGERSGELDSMLERAADN 360

Query: 354 QEEELMNQISIVMSLLEPTIIIVMAAFISFVILSILQPILEINSLV 399
Q+ E +Q+++ + L EP +++ MAA + F++L+ILQPIL++N+L+
Sbjct: 361 QDREFSSQMTLALGLFEPLLVVSMAAVVLFIVLAILQPILQLNTLM 406


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0956  BCTERIALGSPG  207    2e-72    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 207 bits (529), Expect = 2e-72
Identities = 87/136 (63%), Positives = 103/136 (75%)

Query: 2 ANKKTKGFTLLEIMVVIVILGLLASLTIPSLMSNKNRADQQKAVSDISALENALDMYRLD 61
A K +GFTLLEIMVVIVI+G+LASL +P+LM NK +AD+QKAVSDI ALENALDMY+LD
Sbjct: 3 ATDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYKLD 62

Query: 62 NGDYPTEQQGIAALVTKPNVPPLPQRYPSDGYIRRLPTDPWGNSYQMNNPGKHGQIDIFS 121
N YPT QG+ +LV P +PPL Y +GYI+RLP DPWGN Y + NPG+HG D+ S
Sbjct: 63 NHHYPTTNQGLESLVEAPTLPPLAANYNKEGYIKRLPADPWGNDYVLVNPGEHGAYDLLS 122

Query: 122 IGPDRLPETEDDIGNW 137
GPD TEDDI NW
Sbjct: 123 AGPDGEMGTEDDITNW 138


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0958  BCTERIALGSPG  30     0.003    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 30.2 bits (68), Expect = 0.003
Identities = 13/44 (29%), Positives = 24/44 (54%), Gaps = 9/44 (20%)

Query: 4 RPDCGFTLLEMLLAVVIFSMISFIIYSSLRITIKSNNVMGNKAQ 47
GFTLLE+++ +VI +++ ++ N+MGNK +
Sbjct: 5 DKQRGFTLLEIMVVIVIIGVLASLVVP---------NLMGNKEK 39


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_0963  PREPILNPTASE  231    1e-77    Type IV prepilin cysteine protease (C20) family sig...
		>PREPILNPTASE#Type IV prepilin cysteine protease (C20) family signature.

Length = 290

Score = 231 bits (591), Expect = 1e-77
Identities = 115/275 (41%), Positives = 150/275 (54%), Gaps = 4/275 (1%)

Query: 3 FFFVGYLILGAMVGSFLNVLIYRLPIMLANLSSR-SESHGEEIKMRSHLRNINLFQPGSF 61
+F + M+GSFLNV+I+RLPIML S+ NL P S
Sbjct: 14 LYFSLVFLFSLMIGSFLNVVIHRLPIMLEREWQAEYRSYFNPDDEGVDEPPYNLMVPRSC 73

Query: 62 CHHCNESIPIKYNIPILGWIFLRGASRCCNKKISTRYLFIEVLAVIQTLLVLMIFKEDLL 121
C HCN I NIP+L W++LRG R C IS RY +E+L + ++ V M
Sbjct: 74 CPHCNHPITALENIPLLSWLWLRGRCRGCQAPISARYPLVELLTALLSVAVAMTLAPGWG 133

Query: 122 ICTSLVLIWSLTALAFIDFDTYLLPDCMTIPLLWLGLLINIDTVFAPLTSAVLGAVSGYL 181
+L+L W L AL FID D LLPD +T+PLLW GLL N+ F L AV+GA++GYL
Sbjct: 134 TLAALLLTWVLVALTFIDLDKMLLPDQLTLPLLWGGLLFNLLGGFVSLGDAVIGAMAGYL 193

Query: 182 FLWLSYWLFKIVRGVDGMGYGDFKLMAALGAWFGVSAVPFLILFSSFFGLVAYAIFYFFD 241
LW YW FK++ G +GMGYGDFKL+AALGAW G A+P ++L SS G
Sbjct: 194 VLWSLYWAFKLLTGKEGMGYGDFKLLAALGAWLGWQALPIVLLLSSLVGAFMGIGLILLR 253

Query: 242 KKDNGKEINYIAFGPYISLAGVLYLFLGSHVTNLF 276
K I FGPY+++AG + L G +T +
Sbjct: 254 NHHQSKP---IPFGPYLAIAGWIALLWGDSITRWY 285


76  YpsIP31758_1210  YpsIP31758_1216  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
YpsIP31758_1210  -1      21       4.785410   BaeR family transcriptional regulator
YpsIP31758_1211  -1      20       4.391095   signal transduction histidine-protein kinase
YpsIP31758_1212  -1      21       4.755973   multidrug efflux system protein MdtE
YpsIP31758_1213  -1      20       4.614811   multidrug efflux system subunit MdtC
YpsIP31758_1214  -1      18       3.751146   multidrug efflux system subunit MdtB
YpsIP31758_1215  -1      14       3.281305   multidrug efflux system subunit MdtA
YpsIP31758_1216  -1      15       2.983734   spermidine/putrescine ABC transporter
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits    Score  E-value  Comments
YpsIP31758_1210  HTHFIS  78     9e-19    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 77.9 bits (192), Expect = 9e-19
Identities = 33/150 (22%), Positives = 70/150 (46%), Gaps = 5/150 (3%)

Query: 10 QSGSVLIVEDEPKLGQLLVDYLQAAGYRTQWLTNGAEVVATVRQTPPAIILLDLMLPGSD 69
++L+ +D+ + +L L AGY + +N A + + +++ D+++P +
Sbjct: 2 TGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 70 GITLCREIR-RFSDIPIVMVTAKTEEIDRLLGLEIGADDYICKPYSPREVVARVKTIL-- 126
L I+ D+P+++++A+ + + E GA DY+ KP+ E++ + L
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121

Query: 127 --RRCSQQRHQPTDDAPLLINESRFQASYQ 154
RR S+ D PL+ + Q Y+
Sbjct: 122 PKRRPSKLEDDSQDGMPLVGRSAAMQEIYR 151


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_1211  BCTERIALGSPF  34     0.001    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 34.0 bits (78), Expect = 0.001
Identities = 24/90 (26%), Positives = 38/90 (42%), Gaps = 21/90 (23%)

Query: 170 LSTLLAAAVTWVLS-------------RGMLAPVKRLVEGTHRLAA------GDFST--R 208
L+TL+AA++ + ++A V+ V H LA G F
Sbjct: 77 LATLVAASMPLEEALDAVAKQSEKPHLSQLMAAVRSKVMEGHSLADAMKCFPGSFERLYC 136

Query: 209 VAVSSRDELGHLAQDFNQLASSLEKNEQMR 238
V++ + GHL N+LA E+ +QMR
Sbjct: 137 AMVAAGETSGHLDAVLNRLADYTEQRQQMR 166


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
YpsIP31758_1212  TCRTETB  126    5e-34    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 126 bits (318), Expect = 5e-34
Identities = 97/435 (22%), Positives = 182/435 (41%), Gaps = 17/435 (3%)

Query: 20 FMQTLDTTIVNTALPSIAASLGENPLRMQSVIVSYVLTVAVMLPASGWLADRIGVKWVFF 79
F L+ ++N +LP IA + P V +++LT ++ G L+D++G+K +
Sbjct: 24 FFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLL 83

Query: 80 SAIILFTFGSLMCAQSATLNE-LILSRVLQGVGGAMMVPVGRLTVMKIVPREQYMAAMAF 138
II+ FGS++ + LI++R +QG G A + + V + +P+E A
Sbjct: 84 FGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGL 143

Query: 139 VTLPGQIGPLVGPALGGFLVEFASWHWIFLINLP-VGVIGALATLLLMPNHKMSTRRFDI 197
+ +G VGPA+GG + + HW +L+ +P + +I + L+ FDI
Sbjct: 144 IGSIVAMGEGVGPAIGGMIAHY--IHWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDI 201

Query: 198 SGFIMLAIGMATLTLALDGHTGLGLSPLAIAGLILCGVIALGSYWWHALGNRFALFSLHL 257
G I++++G+ L + ++ V++ + H L
Sbjct: 202 KGIILMSVGIVFFMLF---------TTSYSISFLIVSVLSFLIFVKHIRKVTDPFVDPGL 252

Query: 258 FKNKIYTLGLVGSMSARIGSGMLPFMTPIFLQIGLGFSPFHAG-LMMIPMIIGSMGMKRI 316
KN + +G++ M P ++ S G +++ P + + I
Sbjct: 253 GKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYI 312

Query: 317 IVQVVNRFGYRRVLVNATLLLAVVSLSLPLVAIMGWTLLMPVVLFFQGMLNALRFSTMNT 376
+V+R G VL L+V L+ + + +++F G L+ + ++T
Sbjct: 313 GGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGLSFTKTV-IST 371

Query: 377 LTLKTLPDRLASSGNSLLSMAMQLSMSIGVSTAGILLGTFAHHQVATNTPATHSAFLYS- 435
+ +L + A +G SLL+ LS G++ G LL Q S +LYS
Sbjct: 372 IVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLSIPLLDQRLLPMEVDQSTYLYSN 431

Query: 436 -YLCMAIIIALPALI 449
L + II + L+
Sbjct: 432 LLLLFSGIIVISWLV 446


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_1213  ACRIFLAVINRP  862    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 862 bits (2229), Expect = 0.0
Identities = 286/1035 (27%), Positives = 504/1035 (48%), Gaps = 36/1035 (3%)

Query: 6 LFIQRPVATTLLTLAITLSGIIGFSLLPVSPLPQVDYPVIMVSASMPGADPETMASSVAT 65
FI+RP+ +L + + ++G + LPV+ P + P + VSA+ PGAD +T+ +V
Sbjct: 4 FFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTVTQ 63

Query: 66 PLERALGRIAGVNEMTSTS-SLGSTRIILQFDLNRDINGAARDVQAALNAAQSLLPSGMP 124
+E+ + I + M+STS S GS I L F D + A VQ L A LLP +
Sbjct: 64 VIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQEVQ 123

Query: 125 SRPTYRKMNPSDAPIMIMTLTSDT--FSQGQLYDYASTKLAQKIAQTEGVSDVTVGGSSL 182
+ S + +M+ SD +Q + DY ++ + +++ GV DV + G+
Sbjct: 124 -QQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGAQY 182

Query: 183 PAVRVELNPSALFNQGVSLDAVRQAISAANVRRPQGSVDAAET------HWQVQANDEIK 236
A+R+ L+ L ++ V + N + G + + + A K
Sbjct: 183 -AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFK 241

Query: 237 TAEGYRPLIVHYN-NGSPVRLQDVANVIDSVQDVRNAGMSAGQPAVLLVISREPGANIIA 295
E + + + N +GS VRL+DVA V ++ G+PA L I GAN +
Sbjct: 242 NPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANALD 301

Query: 296 TVDRIRAELPALRASIPASIQLNIAQDRSPTIRASLDEVERSLVIAVALVILVVFIFLRS 355
T I+A+L L+ P +++ D +P ++ S+ EV ++L A+ LV LV+++FL++
Sbjct: 302 TAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQN 361

Query: 356 GRATLIPAVAVPVSLIGTFAAMYLCGFSLNNLSLMALTIATGFVVDDAIVVLENISRHL- 414
RATLIP +AVPV L+GTFA + G+S+N L++ + +A G +VDDAIVV+EN+ R +
Sbjct: 362 MRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMM 421

Query: 415 EAGVKPMVAALRGVREVGFTVLSMSISLVAVFIPLLLMAGLPGRLFREFAVTLSVAIGIS 474
E + P A + + ++ ++ +++ L AVFIP+ G G ++R+F++T+ A+ +S
Sbjct: 422 EDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALS 481

Query: 475 LVISLTLTPMMCAWLLRSHPKGQQQRIRGFG----KVLLAIQQGYGRSLNWALGHTRWVM 530
++++L LTP +CA LL+ + GF Y S+ LG T +
Sbjct: 482 VLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRYL 541

Query: 531 VVLLSTIALNVWLYISIPKTFFPEQDTGRMMGFIQADQSISFQSMQQKLKDFMQIVGADP 590
++ +A V L++ +P +F PE+D G + IQ + + Q+ L +
Sbjct: 542 LIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKNE 601

Query: 591 -----AVDSVTGFT-GGSRTNSGSMFISLKPLSER---QETAQQVITRLRGKLAKEPGAN 641
+V +V GF+ G N+G F+SLKP ER + +A+ VI R + +L K
Sbjct: 602 KANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDGF 661

Query: 642 LFLSSVQDIRVGGRHSNAAYQFTLLADDLAALREWEPKVRAALAKL-----PQLADVNSD 696
+ ++ I G + ++ L D + + R L + L V +
Sbjct: 662 VIPFNMPAIVELGTATGFDFE---LIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPN 718

Query: 697 QQDKGAEMALTYDRETMARLGIDVSEANALLNNAFGQRQISTIYQPLNQYKVVMEVAPEY 756
+ A+ L D+E LG+ +S+ N ++ A G ++ K+ ++ ++
Sbjct: 719 GLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKF 778

Query: 757 TQDVSSLDKMFVINSNGQSIPLSYFAKWQPANAPLAVNHQGLSAASTISFNLPDGGSLSE 816
+DK++V ++NG+ +P S F + + I G S +
Sbjct: 779 RMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGD 838

Query: 817 ATAAVERAMTELGVPSTVRGAFAGTAQVFQETLKSQLWLIMAAIATVYIVLGILYESYVH 876
A A +E ++L P+ + + G + + + L+ + V++ L LYES+
Sbjct: 839 AMALMENLASKL--PAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSI 896

Query: 877 PLTILSTLPSAGVGALLALELFDAPFSLIALIGIMLLIGIVKKNAIMMVDFALDAQRNGN 936
P++++ +P VG LLA LF+ + ++G++ IG+ KNAI++V+FA D
Sbjct: 897 PVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEG 956

Query: 937 ISAREAIFQASLLRFRPIIMTTLAALFGALPLVLSSGDGAELRQPLGITIVGGLVVSQLL 996
EA A +R RPI+MT+LA + G LPL +S+G G+ + +GI ++GG+V + LL
Sbjct: 957 KGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLL 1016

Query: 997 TLYTTPVIYLYFDRL 1011
++ PV ++ R
Sbjct: 1017 AIFFVPVFFVVIRRC 1031



Score = 77.6 bits (191), Expect = 2e-16
Identities = 59/350 (16%), Positives = 130/350 (37%), Gaps = 12/350 (3%)

Query: 680 VRAALAKLPQLADVNSDQQDKGAEMALTYDRETMARLGI---DVSEANALLNNAFGQRQI 736
V+ L++L + DV M + D + + + + DV + N+ Q+
Sbjct: 162 VKDTLSRLNGVGDVQLFGAQY--AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQL 219

Query: 737 STIYQPLNQYKVVMEVAPEYTQDVSSLDKMFV-INSNGQSIPLSYFAK--WQPANAPLAV 793
Q +A ++ K+ + +NS+G + L A+ N +
Sbjct: 220 GGTPALPGQQLNASIIAQTRFKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIA 279

Query: 794 NHQGLSAASTISFNLPDGGSLSEATAAVERAMTEL--GVPSTVRGAFA-GTAQVFQETLK 850
G AA +L + A++ + EL P ++ + T Q ++
Sbjct: 280 RINGKPAAGLGIKLATGANAL-DTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIH 338

Query: 851 SQLWLIMAAIATVYIVLGILYESYVHPLTILSTLPSAGVGALLALELFDAPFSLIALIGI 910
+ + AI V++V+ + ++ L +P +G L F + + + G+
Sbjct: 339 EVVKTLFEAIMLVFLVMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGM 398

Query: 911 MLLIGIVKKNAIMMVDFALDAQRNGNISAREAIFQASLLRFRPIIMTTLAALFGALPLVL 970
+L IG++ +AI++V+ + +EA ++ ++ + +P+
Sbjct: 399 VLAIGLLVDDAIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAF 458

Query: 971 SSGDGAELRQPLGITIVGGLVVSQLLTLYTTPVIYLYFDRLRNRFSKQPL 1020
G + + ITIV + +S L+ L TP + + + +
Sbjct: 459 FGGSTGAIYRQFSITIVSAMALSVLVALILTPALCATLLKPVSAEHHENK 508


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_1214  ACRIFLAVINRP  873    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 873 bits (2256), Expect = 0.0
Identities = 288/1036 (27%), Positives = 502/1036 (48%), Gaps = 29/1036 (2%)

Query: 13 SRLFILRPVATTLFMIAILLAGIIGYRALPVSALPEVDYPTIQVVTLYPGASPDVVTSSI 72
+ FI RP+ + I +++AG + LPV+ P + P + V YPGA V ++
Sbjct: 2 ANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTV 61

Query: 73 TAPLERQFGQMSGLKQMASQS-SGGASVITLQFQLTLPLDVAEQEVQAAINAATNLLPSD 131
T +E+ + L M+S S S G+ ITL FQ D+A+ +VQ + AT LLP +
Sbjct: 62 TQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQE 121

Query: 132 LPYPPIYNKVNPADPPILTLAVTATAIPMTQVE--DMVETRIAQKISQVTGVGLVTLSGG 189
+ I + ++ + TQ + D V + + +S++ GVG V L G
Sbjct: 122 VQQQGIS-VEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 190 QRPAVRVKLNAPAVAALGLDSETIRTAISNANVNSAKGSLDGP------TRSVTLSANDQ 243
Q A+R+ L+A + L + + N A G L G + ++ A +
Sbjct: 181 QY-AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 244 MKSAEEYRDLII-AYQNGAPIRLQDVATIEQGAENNKLAAWANTQSAIVLNIQRQPGVNV 302
K+ EE+ + + +G+ +RL+DVA +E G EN + A N + A L I+ G N
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANA 299

Query: 303 IATADSIREMLPELIKSLPKSVDVKVLTDRTSTIRASVNDVQFELLLAIALVVMVIYLFL 362
+ TA +I+ L EL P+ + V D T ++ S+++V L AI LV +V+YLFL
Sbjct: 300 LDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFL 359

Query: 363 RNAAATIIPSIAVPLSLVGTFAAMYFLGFSINNLTLMALTIATGFVVDDAIVVIENISRY 422
+N AT+IP+IAVP+ L+GTFA + G+SIN LT+ + +A G +VDDAIVV+EN+ R
Sbjct: 360 QNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERV 419

Query: 423 I-EKGEKPLDAALKGAGEIGFTIISLTFSLIAVLIPLLFMEDIVGRLFREFAVTLAVAIL 481
+ E P +A K +I ++ + L AV IP+ F G ++R+F++T+ A+
Sbjct: 420 MMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMA 479

Query: 482 ISAVVSLTLTPMMCARML---SYESLRKQNRLSRASEKFFDWVIAHYAVALKKVLNHPWL 538
+S +V+L LTP +CA +L S E + FD + HY ++ K+L
Sbjct: 480 LSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 539 TLSVAFSTLVLTVILYLLIPKGFFPLQDNGLIQGTLEAPQSVSFSNMAERQQQVAAIILK 598
L + + V+L+L +P F P +D G+ ++ P + + QV LK
Sbjct: 540 YLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLK 599

Query: 599 DPA--VESLTSFVGVDGTNATLNNGRLQINLKPLSERDDRIP---QIITRLQESVSGVPG 653
+ VES+ + G + N G ++LKP ER+ +I R + + +
Sbjct: 600 NEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRD 659

Query: 654 IKLYLQPVQDLTIDTQLSRTQYQFTLQ---ATSLEELSTWVPKLVNELQQK-APFQDVTS 709
++ P I + T + F L + L+ +L+ Q A V
Sbjct: 660 --GFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRP 717

Query: 710 DWQDQGLVAFVNVDRDSASRLGITMAAIDNALYNAFGQRLISTIYTQSNQYRVVLEHDVQ 769
+ + + VD++ A LG++++ I+ + A G ++ + ++ ++ D +
Sbjct: 718 NGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAK 777

Query: 770 ATPGLAAFNDIRLTGSDGKGVPLNSIATIEERFGPLSINHLNQFPSATVSFNLAQGYSLG 829
+ + + ++G+ VP ++ T +G + N PS + A G S G
Sbjct: 778 FRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSG 837

Query: 830 EAVAAVTLAEKEIQLPADITTRFQGSTLAFQAALGSTLWLIIAAIVAMYIVLGVLYESFI 889
+A+A + +LPA I + G + + + L+ + V +++ L LYES+
Sbjct: 838 DAMALMENLAS--KLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWS 895

Query: 890 HPITILSTLPTAGVGALLALMLTGNELDVIAIIGIILLIGIVKKNAIMMIDFALAAERDQ 949
P++++ +P VG LLA L + DV ++G++ IG+ KNAI++++FA +
Sbjct: 896 IPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKE 955

Query: 950 GMTPYDAIYQACLLRFRPILMTTLAALFGALPLMLSTGVGAELRQPLGVCMVGGLIVSQV 1009
G +A A +R RPILMT+LA + G LPL +S G G+ + +G+ ++GG++ + +
Sbjct: 956 GKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATL 1015

Query: 1010 LTLFTTPVIYLLFDKL 1025
L +F PV +++ +
Sbjct: 1016 LAIFFVPVFFVVIRRC 1031



Score = 84.1 bits (208), Expect = 2e-18
Identities = 78/517 (15%), Positives = 191/517 (36%), Gaps = 25/517 (4%)

Query: 533 LNHPWLTLSVAFSTLVLTVILYLLIPKGFFPLQDNGLIQGTLEAPQSVSFSNMAERQQQV 592
+ P +A ++ + L +P +P + + P + + Q V
Sbjct: 6 IRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGA----DAQTVQDTV 61

Query: 593 AAIILKDPAVESLTSFVGVDGTNAT-LNNGRLQINL--KPLSERDDRIPQIITRLQESVS 649
+I +++ + ++T + G + I L + ++ D Q+ +LQ +
Sbjct: 62 TQVI-----EQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATP 116

Query: 650 GVP-GIKLYLQPVQDLTIDTQLSRTQYQFTLQATSLEELSTWVPK-LVNELQQKAPFQDV 707
+P ++ V+ + + L + T+ +++S +V + + L + DV
Sbjct: 117 LLPQEVQQQGISVEKSS-SSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDV 175

Query: 708 TSDWQDQGLVAFVNVDRDSASRLGITMAAIDNALYNAFGQRLISTIYTQSNQYRVVLEHD 767
+ + +D D ++ +T + N L Q + L
Sbjct: 176 QLFGAQYAMR--IWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNAS 233

Query: 768 VQATPGLAAFNDIR----LTGSDGKGVPLNSIATIEERFGPLSIN-HLNQFPSATVSFNL 822
+ A + SDG V L +A +E ++ +N P+A + L
Sbjct: 234 IIAQTRFKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKL 293

Query: 823 AQGYSLGEAVAAV--TLAEKEIQLPADI-TTRFQGSTLAFQAALGSTLWLIIAAIVAMYI 879
A G + + A+ LAE + P + +T Q ++ + + AI+ +++
Sbjct: 294 ATGANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFL 353

Query: 880 VLGVLYESFIHPITILSTLPTAGVGALLALMLTGNELDVIAIIGIILLIGIVKKNAIMMI 939
V+ + ++ + +P +G L G ++ + + G++L IG++ +AI+++
Sbjct: 354 VMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVV 413

Query: 940 DFALAAERDQGMTPYDAIYQACLLRFRPILMTTLAALFGALPLMLSTGVGAELRQPLGVC 999
+ + + P +A ++ ++ + +P+ G + + +
Sbjct: 414 ENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSIT 473

Query: 1000 MVGGLIVSQVLTLFTTPVIYLLFDKLARNTRGKNRHR 1036
+V + +S ++ L TP + K +N+
Sbjct: 474 IVSAMALSVLVALILTPALCATLLKPVSAEHHENKGG 510


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits       Score  E-value  Comments
YpsIP31758_1215  RTXTOXIND  43     1e-06    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 43.3 bits (102), Expect = 1e-06
Identities = 22/115 (19%), Positives = 42/115 (36%), Gaps = 10/115 (8%)

Query: 84 VIAANTVTVTSRVDGELMALHFTEGQQVKAGDLLAEIDPRPYEVQLTQAQGQLAKDQATL 143
+ + + + + + EG+ V+ GD+L ++ A+ K Q++L
Sbjct: 91 THSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTA-------LGAEADTLKTQSSL 143

Query: 144 DNARRDLARYQKLSK---TGLISQQELDTQSSLVRQSEGSVKADQGAIDSAKLQL 195
AR + RYQ LS+ + + +L + SE V I
Sbjct: 144 LQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTW 198



Score = 42.5 bits (100), Expect = 3e-06
Identities = 23/124 (18%), Positives = 54/124 (43%), Gaps = 4/124 (3%)

Query: 125 YEVQLTQAQGQLAKDQATLDNARRDLARYQKLSKTGLISQQELDTQSSLVRQSEGSVKAD 184
E + +A +L ++ L+ ++ + + L++Q + +RQ+ ++
Sbjct: 257 QENKYVEAVNELRVYKSQLEQIESEILSAK--EEYQLVTQLFKNEILDKLRQTTDNIGLL 314

Query: 185 QGAIDSAKLQLTYSRITAPISGRV-GLKQVDVGNYITSGTATPIVVITQTHPVDVVFTLP 243
+ + + S I AP+S +V LK G +T+ T +V++ + ++V +
Sbjct: 315 TLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAE-TLMVIVPEDDTLEVTALVQ 373

Query: 244 ESDI 247
DI
Sbjct: 374 NKDI 377


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
YpsIP31758_1216  PF05272  31     0.007    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 31.2 bits (70), Expect = 0.007
Identities = 8/31 (25%), Positives = 14/31 (45%)

Query: 34 LTLLGPSGSGKTTSLMMLAGFETPTQGEITL 64
+ L G G GK+T + L G + + +
Sbjct: 599 VVLEGTGGIGKSTLINTLVGLDFFSDTHFDI 629


77  YpsIP31758_1312  YpsIP31758_1316  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
YpsIP31758_1312  0       14       -1.114061  RND efflux transporter
YpsIP31758_1313  1       17       -1.204901  CpxR family transcriptional regulator
YpsIP31758_1314  1       17       -1.155732  sensor histidine kinase CpxA
YpsIP31758_1315  2       19       -1.071607  PTS system glucose-specific transporter
YpsIP31758_1316  1       13       0.180252   phosphoenolpyruvate-protein phosphotransferase
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits       Score  E-value  Comments
YpsIP31758_1312  RTXTOXIND  59     8e-12    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 59.5 bits (144), Expect = 8e-12
Identities = 40/259 (15%), Positives = 87/259 (33%), Gaps = 25/259 (9%)

Query: 27 FRWISPPDKPSYITAVAEIRDLEQTVLADGTIKAQKQVTVGAQVSGQIKALHVTLGQQVE 86
F+ +S + + + E Q + K+ V +I +
Sbjct: 176 FQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKS 235

Query: 87 KNQLVAEI--DDLAQQNALKDAEEALKNVQAQRAAKIA--TQKNNQLTYQRQQQILAKGV 142
+ + + ++A+ + E + + Q +++ +++ L
Sbjct: 236 RLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQL---- 291

Query: 143 GVRADFDS-IKAILEATQAEISALDAQIAQAEIAVSTAKLNLGYTKISSPIAGTVVAIPV 201
V F + I L T I L ++A+ E + I +P++ V + V
Sbjct: 292 -VTQLFKNEILDKLRQTTDNIGLLTLELAKNE-------ERQQASVIRAPVSVKVQQLKV 343

Query: 202 -EEGQTVNAVQSAPTIIKVAQLDTMTVEAQISEADVVKVKTGMPVYFTILGEPEKRF--- 257
EG V + ++ V + DT+ V A + D+ + G + P R+
Sbjct: 344 HTEGGVVTTAE--TLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYL 401

Query: 258 SATLRAIEPAPDSINDETT 276
++ I D+I D+
Sbjct: 402 VGKVKNI--NLDAIEDQRL 418



Score = 49.1 bits (117), Expect = 2e-08
Identities = 17/167 (10%), Positives = 58/167 (34%), Gaps = 17/167 (10%)

Query: 10 RLIGWVVLLLIIGGLLFFRWISPPDKPSYITAVAEIRDLEQTVLADGTIKAQKQV-TVGA 68
RL+ + ++ ++ + + + +E A+G + + +
Sbjct: 58 RLVAYFIMGFLVIAFI---L-------------SVLGQVEIVATANGKLTHSGRSKEIKP 101

Query: 69 QVSGQIKALHVTLGQQVEKNQLVAEIDDLAQQNALKDAEEALKNVQAQRAAKIATQKNNQ 128
+ +K + V G+ V K ++ ++ L + + +L + ++ ++ +
Sbjct: 102 IENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIE 161

Query: 129 LTYQRQQQILAKGVGVRADFDSIKAILEATQAEISALDAQIAQAEIA 175
L + ++ + + + + + + S Q Q E+
Sbjct: 162 LNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELN 208


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits    Score  E-value  Comments
YpsIP31758_1313  HTHFIS  89     1e-22    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 89.1 bits (221), Expect = 1e-22
Identities = 32/122 (26%), Positives = 63/122 (51%), Gaps = 1/122 (0%)

Query: 2 KILLVDDDLELGTMLKEYLGGEGFTAKHVLTGKAGIDGALSGDYTALILDIMLPDMSGID 61
IL+ DDD + T+L + L G+ + +GD ++ D+++PD + D
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 VLRQVRK-KSRLPIIMLTAKGDNIDRVIGLEMGADDYMPKPCYPRELVARLRAVLRRFEE 120
+L +++K + LP+++++A+ + + E GA DY+PKP EL+ + L +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 121 QP 122
+P
Sbjct: 125 RP 126


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
YpsIP31758_1314  PF06580  38     7e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 37.9 bits (88), Expect = 7e-05
Identities = 41/231 (17%), Positives = 83/231 (35%), Gaps = 59/231 (25%)

Query: 239 ELRSPLARLQLAIGLAHQNPGNVDNAL----QRIEHESERLDKMIGEL-------LALSR 287
++ S QL A NP + NAL I + + +M+ L L S
Sbjct: 153 KMASMAQEAQLMALKAQINPHFMFNALNNIRALILEDPTKAREMLTSLSELMRYSLRYSN 212

Query: 288 AENHSLADD----DEYFDLQEL-------VKVVVNDARYEAQLPGVEIQLEVAAQSEYTV 336
A SLAD+ D Y L + + +N A + Q+P + +Q V
Sbjct: 213 ARQVSLADELTVVDSYLQLASIQFEDRLQFENQINPAIMDVQVPPMLVQTLV-------- 264

Query: 337 KGNAELMRRAIENIVRNALRFSASGQQVKVTLSALDKRYQIQVIDQGPGVEENKLSSIFD 396
EN +++ + G ++ + + + ++V + G +N
Sbjct: 265 -----------ENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKN------- 306

Query: 397 PFVRVKSAMSGKGYGLGLAITHK-VILAHGGQVEAR-NGEQGGLVITLRVP 445
+ + G GL + + + +G + + + + +QG + + +P
Sbjct: 307 ---------TKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_1316  PHPHTRNFRASE  750    0.0      Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase signature.

Length = 572

Score = 750 bits (1939), Expect = 0.0
Identities = 278/571 (48%), Positives = 392/571 (68%), Gaps = 2/571 (0%)

Query: 1 MISGILVSPGIAFGKALLLKEDEIVINRKKISADQVEQEVERFKAGRAKAAEQLEAIKTK 60
I+GI S G+A KA + E + I + I V E+E+ A K+ E+L AIK +
Sbjct: 4 KITGIAASSGVAIAKAFIHLEPNVDIEKTSI--TDVSTEIEKLTAALEKSKEELRAIKDQ 61

Query: 61 AGVSLGEEKAAIFEGHIMLLEDEELEQEIIALIKDEHASADAAAYSVIEGQAKALEELDD 120
S+G +KA IF H+++L+D EL I I++E +A+ A V + E +D+
Sbjct: 62 TEASMGADKAEIFAAHLLVLDDPELVDGIKGKIENEQMNAEYALKEVSDMFVSMFESMDN 121

Query: 121 EYLKERAADVRDIGKRLLKNILGLNIVDLSAIQDEVILVATDLTPSETAQLNLDKVLGFI 180
EY+KERAAD+RD+ KR+L +++G+ L+ I +E +++A DLTPS+TAQLN V GF
Sbjct: 122 EYMKERAADIRDVSKRVLGHLIGVETGSLATIAEETVIIAEDLTPSDTAQLNKQFVKGFA 181

Query: 181 TDIGGRTSHTSIMARSLELPAIVGTSNVTKQVKNDDYLILDAVNNKVYLNPTADVIEQLK 240
TDIGGRTSH++IM+RSLE+PA+VGT VT+++++ D +I+D + V +NPT + ++ +
Sbjct: 182 TDIGGRTSHSAIMSRSLEIPAVVGTKEVTEKIQHGDMVIVDGIEGIVIVNPTEEEVKAYE 241

Query: 241 AVKNQYITEKNELAKLKDLPAITLDGHQVEVVANIGTVRDIAGAERNGAEGVGLYRTEFL 300
+ + +K E AKL P+ T DG VE+ ANIGT +D+ G NG EG+GLYRTEFL
Sbjct: 242 EKRAAFEKQKQEWAKLVGEPSTTKDGAHVELAANIGTPKDVDGVLANGGEGIGLYRTEFL 301

Query: 301 FMDRDSLPTEEEQFQAYKAVAEAMGSQAVIVRTMDIGGDKDLPYMNLPKEENPFLGWRAI 360
+MDRD LPTEEEQF+AYK V + M + V++RT+DIGGDK+L Y+ LPKE NPFLG+RAI
Sbjct: 302 YMDRDQLPTEEEQFEAYKEVVQRMDGKPVVIRTLDIGGDKELSYLQLPKELNPFLGFRAI 361

Query: 361 RIAMDRKEILHAQLRAILRASAFGKLRIMFPMIISVEEVRELKAELELLKSQLREENKAF 420
R+ +++++I QLRA+LRAS +G L++MFPMI ++EE+R+ KA ++ K +L E
Sbjct: 362 RLCLEKQDIFRTQLRALLRASTYGNLKVMFPMIATLEELRQAKAIMQEEKDKLLSEGVDV 421

Query: 421 DETIEVGVMVETPAAAVIARHLAKEVDFFSIGTNDLTQYTLAVDRGNELISHLYNPMSPS 480
++IEVG+MVE P+ AV A AKEVDFFSIGTNDL QYT+A DR NE +S+LY P P+
Sbjct: 422 SDSIEVGIMVEIPSTAVAANLFAKEVDFFSIGTNDLIQYTMAADRMNERVSYLYQPYHPA 481

Query: 481 VLGLIKQVIDASHAEGKWTGMCGELAGDERATLLLLGMGLDEFSMSAISIPRIKKIIRNT 540
+L L+ VI A+H+EGKW GMCGE+AGDE A LLLG+GLDEFSMSA SI + +
Sbjct: 482 ILRLVDMVIKAAHSEGKWVGMCGEMAGDEVAIPLLLGLGLDEFSMSATSILPARSQLLKL 541

Query: 541 NFEDVKVLAEQALAQPTAKELMDLVTTFIEE 571
+ E++K A++AL TA+E+ LV +
Sbjct: 542 SKEELKPFAQKALMLDTAEEVEQLVKKTYLK 572


78  YpsIP31758_1358  YpsIP31758_1365  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
YpsIP31758_1358  -1      9        0.275037   hypothetical protein
YpsIP31758_1359  -1      9        -0.078377  ImpA domain-containing protein
YpsIP31758_1360  -2      10       -1.040705  hypothetical protein
YpsIP31758_1361  -2      12       -2.021539  hypothetical protein
YpsIP31758_1362  -1      17       -3.044254  Clp protease-associated protein ClpB
YpsIP31758_1363  0       16       -3.890765  fimbrial protein
YpsIP31758_1364  0       15       -4.063886  pili assembly chaperone
YpsIP31758_1365  0       15       -3.718324  fimbrial usher protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_1358  FIMBRIALPAPE  31     0.003    Escherichia coli: P pili tip fibrillum papE protein...
		>FIMBRIALPAPE#Escherichia coli: P pili tip fibrillum papE protein signature.

Length = 173

Score = 31.2 bits (70), Expect = 0.003
Identities = 24/83 (28%), Positives = 38/83 (45%), Gaps = 7/83 (8%)

Query: 207 PSCTFDGPQKVNFGLVTSSNL-NNGGIERDLDFNITCKTDYGHYSATAAISTQTPSDDNN 265
P+CT + VN+G + NL +GG ++D ++ C G T + QT N
Sbjct: 37 PACTVQNAE-VNWGDIEIQNLVQSGGNQKDFTVDMNCPYSLGTMKVTITSNGQT----GN 91

Query: 266 YIKVKDNQN-QEDRLLIKISDTN 287
I V + D LLI + ++N
Sbjct: 92 SILVPNTSTASGDGLLIYLYNSN 114


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits    Score  E-value  Comments
YpsIP31758_1362  HTHFIS  32     0.007    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 32.5 bits (74), Expect = 0.007
Identities = 35/164 (21%), Positives = 57/164 (34%), Gaps = 32/164 (19%)

Query: 576 DDIRAVMELPQRLEAR----------VIGQPHALMQLGENIMTARAGLSDPRKPLGVFML 625
I + P+R ++ ++G+ A+ ++ + AR +D L + M+
Sbjct: 113 GIIGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVL--ARLMQTD----LTL-MI 165

Query: 626 VGPSGVGKTETALAIAESMYGGEQNMITINMSEYQESHTVSSLKGSPPGYVGYGEGGVLT 685
G SG GK A A+ + + INM+ S L G E G T
Sbjct: 166 TGESGTGKELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFGH--------EKGAFT 217

Query: 686 EAVRRKPYSV-------VLLDEIEKAHSDVHELFFQVFDKGQME 722
A R + LDEI D +V +G+
Sbjct: 218 GAQTRSTGRFEQAEGGTLFLDEIGDMPMDAQTRLLRVLQQGEYT 261


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_1363  FIMBRIALPAPE  32     6e-04    Escherichia coli: P pili tip fibrillum papE protein...
		>FIMBRIALPAPE#Escherichia coli: P pili tip fibrillum papE protein signature.

Length = 173

Score = 32.3 bits (73), Expect = 6e-04
Identities = 32/113 (28%), Positives = 49/113 (43%), Gaps = 18/113 (15%)

Query: 1 MRKLNLAVCAVALSVISSTSYAAAGGTVTFNGKLIADTCQVDTASENITVTLPTLSIQSL 60
M+K+ V L + + + A +TF GKLI C V +N V + IQ+L
Sbjct: 1 MKKIRGLCLPVMLGAVLMSQHVHAADNLTFKGKLIIPACTV----QNAEVNWGDIEIQNL 56

Query: 61 AVAEAQDGS--KDFEIKVLDCP-------ATLTQVGAHFNAIDSSGVNPATGN 104
Q G KDF + ++CP T+T G N+I + A+G+
Sbjct: 57 ----VQSGGNQKDFTVD-MNCPYSLGTMKVTITSNGQTGNSILVPNTSTASGD 104


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits     Score  E-value  Comments
YpsIP31758_1365  PF00577  695    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 695 bits (1794), Expect = 0.0
Identities = 249/856 (29%), Positives = 403/856 (47%), Gaps = 54/856 (6%)

Query: 4 VEFNADFIHGGG---VDVMRFMHENPVAPGVYDVTVIINGKNRGKHRIRFELSEGESTAE 60
+ FN F+ D+ RF + + PG Y V + +N + F + E
Sbjct: 47 LYFNPRFLADDPQAVADLSRFENGQELPPGTYRVDIYLNNGYMATRDVTFNTGDSEQGIV 106

Query: 61 PCFTLEQLDSIGLKIETSDTDLLVNGKAAPKDQCYNLRALIKDSHVNYNSGDLELSLTVP 120
PC T QL S+GL +T + D C L ++I D+ + G L+LT+P
Sbjct: 107 PCLTRAQLASMGL-----NTASVSGMNLLADDACVPLTSMIHDATAQLDVGQQRLNLTIP 161

Query: 121 QFNLVHHPRGYIDSSLWDAGGTVGFLDYNSNVYSIFNGRSNSDVGSDNSNSYNSNIGLSA 180
Q + + RGYI LWD G G L+YN + + +G ++ +Y + L +
Sbjct: 162 QAFMSNRARGYIPPELWDPGINAGLLNYNFSGN-----SVQNRIGGNSHYAY---LNLQS 213

Query: 181 GINLGEWRFRKRLNTTWSNSSG-----MHTQNLYGYAATDITALKSQLTIGDTNTQGSLF 235
G+N+G WR R ++++S Q++ + DI L+S+LT+GD TQG +F
Sbjct: 214 GLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRSRLTLGDGYTQGDIF 273

Query: 236 DSYALRGVLLASDTRMLPEGIRNYSPIVRGIAETNARVTVTQRGQIIYETVVTPGAFELT 295
D RG LASD MLP+ R ++P++ GIA A+VT+ Q G IY + V PG F +
Sbjct: 274 DGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIYNSTVPPGPFTIN 333

Query: 296 DIGTMSYGGDLQMTITESDGRTRIQRIPFSAPPMLLYQGVSRFDFSAGQL-NDSSINHNP 354
DI GDLQ+TI E+DG T+I +P+S+ P+L +G +R+ +AG+ + ++ P
Sbjct: 334 DIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITAGEYRSGNAQQEKP 393

Query: 355 AIVQGAYHYGLGNTYTLYGGAQVAENYRSVAIGNAFNT-PLGGVSMDITHAKSELAGDRR 413
Q +GL +T+YGG Q+A+ YR+ G N LG +S+D+T A S L D +
Sbjct: 394 RFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALSVDMTQANSTLPDDSQ 453

Query: 414 SSGNSYKIDYSKYVGETDTNLTLAAYRYSSGGYYSFREASLDRYGNSNGIDE-------- 465
G S + Y+K + E+ TN+ L YRYS+ GY++F + + R N +
Sbjct: 454 HDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGYNIETQDGVIQVKP 513

Query: 466 -------IDFRTRNRLSLSVSQRVADNMSVNLNSSLYSYWGNQDASQQYSVGFNHSLRSF 518
+ + R +L L+V+Q++ ++ L+ S +YWG + +Q+ G N +
Sbjct: 514 KFTDYYNLAYNKRGKLQLTVTQQLGRTSTLYLSGSHQTYWGTSNVDEQFQAGLNTAFEDI 573

Query: 519 SYTVSAIRTSNSGNSSNGDNDREYENSYMLAVSIPIGG----SGKNKPLFSSLSTMVSHS 574
++T+S T N+ + + L V+IP K++ +S S +SH
Sbjct: 574 NWTLSYSLTKNAW-------QKGRDQMLALNVNIPFSHWLRSDSKSQWRHASASYSMSHD 626

Query: 575 EAGDTQLQLTTSGSRGDQNELTYGIGTSYGNRNDASSEQSVIGNMGYQSSVGQLGMTASA 634
G G+ + N L+Y + T Y D +S + + Y+ G + S
Sbjct: 627 LNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGYGNANIGYSH 686

Query: 635 NNNASRQLSVSASGSLVAHQGGVIAGPRLGDAPFAIINAQGAGGAKVFNGRGAKIDSNGY 694
+++ +QL SG ++AH GV G L D ++ A GA AKV N G + D GY
Sbjct: 687 SDD-IKQLYYGVSGGVLAHANGVTLGQPLNDT-VVLVKAPGAKDAKVENQTGVRTDWRGY 744

Query: 695 ALVPSLTPYRENTIAIDYKDLPETVDILENHKVVVPRMGAMIPVKMKTMTGNPMMLIVRD 754
A++P T YREN +A+D L + VD+ VVP GA++ + K G +++ +
Sbjct: 745 AVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAEFKARVGIKLLMTLTH 804

Query: 755 ENKEFLPIGTDLLDADGVSQSIVGQGGMAFIRGWDPVSQPITATLNGGIDKCVIKPDAKI 814
NK LP G + S IV G ++ G + CV + ++
Sbjct: 805 NNKP-LPFGAMVTSESSQSSGIVADNGQVYLSGMPLAGKVQVKWGEEENAHCVA--NYQL 861

Query: 815 DTATKTVQIIQLEVIC 830
++ + QL C
Sbjct: 862 PPESQQQLLTQLSAEC 877


79  YpsIP31758_1645  YpsIP31758_1652  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
YpsIP31758_1645  1       19       3.707746   chemotaxis-specific methylesterase
YpsIP31758_1646  1       18       1.648710   chemotaxis regulatory protein CheY
YpsIP31758_1647  1       16       0.882149   chemotaxis regulator CheZ
YpsIP31758_1648  1       16       0.878504   hypothetical protein
YpsIP31758_1649  1       15       0.860379   N-acetylmuramoyl-L-alanine amidase
YpsIP31758_1650  0       13       1.460561   hemagglutination repeat-containing protein
YpsIP31758_1651  -1      15       -4.019633  hypothetical protein
YpsIP31758_1652  -1      14       -3.227332  alanine racemase
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_1645  HTHFIS        63     6e-13    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 62.5 bits (152), Expect = 6e-13
Identities = 27/109 (24%), Positives = 52/109 (47%), Gaps = 5/109 (4%)

Query: 1 MSKIRVLCVDDSALMRQLMTEIINSHPDMEMVAAAQDPLVARDLIKKFNPQVLTLDVEMP 60
M+ +L DD A +R ++ + ++ V + I + ++ DV MP
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAG--YDVRITSNAATLWRWIAAGDGDLVVTDVVMP 58

Query: 61 RMDGLDFLEKLMRLRPMPVVMVSSLTGKNSEITM-RALELGAIDFVTKP 108
+ D L ++ + RP V+V ++ +N+ +T +A E GA D++ KP
Sbjct: 59 DENAFDLLPRIKKARPDLPVLV--MSAQNTFMTAIKASEKGAYDYLPKP 105


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_1646  HTHFIS        89     6e-24    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 88.7 bits (220), Expect = 6e-24
Identities = 34/105 (32%), Positives = 53/105 (50%), Gaps = 3/105 (2%)

Query: 7 RFLVVDDFSTMRRIVRNLLKELGFHNVEEAEDGVDALNKLRAGGFDFVVSDWNMPNMDGL 66
LV DD + +R ++ L G+ +V + + AG D VV+D MP+ +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGY-DVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 67 DLLKTIRTDGALATLPVLMVTAEAKKENIIAAAQAGASGYVVKPF 111
DLL I+ A LPVL+++A+ I A++ GA Y+ KPF
Sbjct: 64 DLLPRIKK--ARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPF 106


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_1650  PF03895       64     1e-14    Serum resistance protein DsrA.
		>PF03895#Serum resistance protein DsrA.

Length = 79

Score = 63.7 bits (155), Expect = 1e-14
Identities = 22/78 (28%), Positives = 34/78 (43%)

Query: 803 DSTLSAGIAGAMAMASLTQPYTPGASMATIGAASYRGQSALSVGVSSISDSGRWVSKLQA 862
L G+A A++ L QP G + + YR ++AL++GV S A
Sbjct: 2 SKELQTGLANQSALSMLVQPNGVGKTSVSAAVGGYRDKTALAIGVGSRITDRFTAKAGVA 61

Query: 863 SSNTQGDMGVGVGVGYQW 880
+ G M G VGY++
Sbjct: 62 FNTYNGGMSYGASVGYEF 79


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_1652  ALARACEMASE   198    2e-62    Alanine racemase signature.
		>ALARACEMASE#Alanine racemase signature.

Length = 356

Score = 198 bits (506), Expect = 2e-62
Identities = 85/354 (24%), Positives = 160/354 (45%), Gaps = 30/354 (8%)

Query: 45 AWLEISQGALDFNTKKMLTLLDNKSTLCAILKGDAYGHDLTLVTPVMLKNNVQCIGVASN 104
+ AL N ++ + + +++K +AYGH + + + + + +
Sbjct: 5 IQASLDLQALKQNLS-IVRQAATHARVWSVVKANAYGHGIERIWSAIGATD--GFALLNL 61

Query: 105 QELKTVRDLGFTGQLIRVRSAT-LKEMQQAMAYDVEELIGDKTVAEQLNNIAKLNGKVLR 163
+E T+R+ G+ G ++ + ++++ + + + + L N L
Sbjct: 62 EEAITLRERGWKGPILMLEGFFHAQDLEIYDQHRLTTCVHSNWQLKALQNARL--KAPLD 119

Query: 164 IHLALNSAGMSRNGLEVSKARGLNDAKTIAGLKNLTIVGIMSHYPVEDASE-IKADLARF 222
I+L +NS GM+R G + + L + + + N+ + +MSH+ + + I +AR
Sbjct: 120 IYLKVNS-GMNRLGFQPDRV--LTVWQQLRAMANVGEMTLMSHFAEAEHPDGISGAMARI 176

Query: 223 QQQAKDVIAVTGLKREKIKLHVANTFATLAVPDSWLDMVRVGGVFYG-------DTIAST 275
+Q A GL+ + ++N+ ATL P++ D VR G + YG IA+T
Sbjct: 177 EQ------AAEGLECRR---SLSNSAATLWHPEAHFDWVRPGIILYGASPSGQWRDIANT 227

Query: 276 EYKRVMTFKSNIASLNNYPKGGTVGYDRTYTLKRDSLLANIPVGYADGYRRVFSNAGHVI 335
+ VMT S I + G VGY YT + + + + GYADGY R V+
Sbjct: 228 GLRPVMTLSSEIIGVQTLKAGERVGYGGRYTARDEQRIGIVAAGYADGYPRHAPTGTPVL 287

Query: 336 IQGQRLPVLGKTSMNTVIVDVTDLKKVSLGDEVVLFGKQGNAEIQAEEIEDLSG 389
+ G R +G SM+ + VD+T + +G V L+GK EI+ +++ +G
Sbjct: 288 VDGVRTMTVGTVSMDMLAVDLTPCPQAGIGTPVELWGK----EIKIDDVAAAAG 337


80  YpsIP31758_1718  YpsIP31758_1727  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
YpsIP31758_1718  1       18       -1.382190  integration host factor subunit alpha
YpsIP31758_1719  1       16       -1.606888  hypothetical protein
YpsIP31758_1720  -1      17       -2.075292  vitamin B12-transporter permease
YpsIP31758_1721  -1      17       -2.013236  glutathione peroxidase
YpsIP31758_1722  0       15       -2.231577  vitamin B12-transporter ATPase
YpsIP31758_1723  0       16       -2.098333  UDP-4-amino-4-deoxy-L-arabinose--oxoglutarate
YpsIP31758_1725  0       17       -2.723514  IS1541, transposase
YpsIP31758_1727  0       17       -2.658851  bifunctional UDP-glucuronic acid
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_1718  DNABINDINGHU  117    2e-38    Prokaryotic integration host factor signature.
		>DNABINDINGHU#Prokaryotic integration host factor signature.

Length = 91

Score = 117 bits (296), Expect = 2e-38
Identities = 36/89 (40%), Positives = 55/89 (61%)

Query: 4 TKAEMSEHLFEKLGLSKRDAKDLVELFFEEVRRALENGEQVKLSGFGNFDLRDKNQRPGR 63
K ++ + E L+K+D+ V+ F V L GE+V+L GFGNF++R++ R GR
Sbjct: 3 NKQDLIAKVAEATELTKKDSAAAVDAVFSAVSSYLAKGEKVQLIGFGNFEVRERAARKGR 62

Query: 64 NPKTGEDIPITARRVVTFRPGQKLKSRVE 92
NP+TGE+I I A +V F+ G+ LK V+
Sbjct: 63 NPQTGEEIKIKASKVPAFKAGKALKDAVK 91


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_1719  OUTRSURFACE   30     0.004    Outer surface protein signature.
		>OUTRSURFACE#Outer surface protein signature.

Length = 273

Score = 29.5 bits (66), Expect = 0.004
Identities = 18/52 (34%), Positives = 27/52 (51%), Gaps = 8/52 (15%)

Query: 1 MKKYLLLFGVLSFMPLIAQSDVSLD------INMPGIN--LHLGDQDKRGYY 44
MKKYLL G++ + Q+ SLD +++PG L ++DK G Y
Sbjct: 1 MKKYLLGIGLILALIACKQNVSSLDEKNSASVDLPGEMKVLVSKEKDKDGKY 52


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_1722  PF05272       32     0.003    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 31.6 bits (71), Expect = 0.003
Identities = 17/66 (25%), Positives = 27/66 (40%), Gaps = 10/66 (15%)

Query: 29 LIGPNGAGKSTLLASLAGL------LPASGEIVLAGKSLQHYEGHELAR----QRAYLSQ 78
L G G GKSTL+ +L GL G + + + +EL+ +RA
Sbjct: 601 LEGTGGIGKSTLINTLVGLDFFSDTHFDIGTGKDSYEQIAGIVAYELSEMTAFRRADAEA 660

Query: 79 QQSALS 84
++ S
Sbjct: 661 VKAFFS 666


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_1727  NUCEPIMERASE  100    3e-25    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 100 bits (251), Expect = 3e-25
Identities = 73/361 (20%), Positives = 138/361 (38%), Gaps = 60/361 (16%)

Query: 317 RVLILGVNGFIGNHLTERLLQDDRYEVYGLDIGSD--------AISRFLGNPAFHFVEGD 368
+ L+ G GFIG H+++RLL+ ++V G+D +D A L P F F + D
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAG-HQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKID 60

Query: 369 ISIHSEWIE--YHIKKCDVILPLVAIATPIEYT-RNPLRVFELDFEENLKIVRDCVKYN- 424
++ E + + + + + Y+ NP + + L I+ C
Sbjct: 61 LADR-EGMTDLFASGHFERVFISPHRLA-VRYSLENPHAYADSNLTGFLNILEGCRHNKI 118

Query: 425 KRIVFPSTSEVYGMCDDKEFDEDTSRLIVGPINKQRWIYSVSKQLLDRVIWAYGVKEGLK 484
+ +++ S+S VYG+ F D ++ +Y+ +K+ + + Y GL
Sbjct: 119 QHLLYASSSSVYGLNRKMPFSTD------DSVDHPVSLYAATKKANELMAHTYSHLYGLP 172

Query: 485 FTLFRPFNWMGPRLDNLDAARIGSSRAITQLILNLVEGSPIKLVDGGAQKRCFTDIHDGI 544
T R F GP D A ++A+ +EG I + + G KR FT I D
Sbjct: 173 ATGLRFFTVYGPWGRP-DMALFKFTKAM-------LEGKSIDVYNYGKMKRDFTYIDDIA 224

Query: 545 EALFRIIEN---------------RDGCCDGQIINIGNPTNEASIRELAEMLLTSFENHE 589
EA+ R+ + ++ NIGN ++ + + + L +
Sbjct: 225 EAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGN-SSPVELMDYIQALEDALGIEA 283

Query: 590 LRDHFPPFAGFKDIESSAYYGKGYQDVEYRTPSIKNARRILHWQPEIAMQQTVTETLDFF 649
++ P G DV + K ++ + PE ++ V ++++
Sbjct: 284 KKNMLPLQPG---------------DVLETSADTKALYEVIGFTPETTVKDGVKNFVNWY 328

Query: 650 L 650

Sbjct: 329 R 329


81  YpsIP31758_1778  YpsIP31758_1786  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
YpsIP31758_1778  -1      12       0.964430   tripeptide transporter permease
YpsIP31758_1779  -1      11       1.181870   hypothetical protein
YpsIP31758_1780  -3      13       1.338719   peptide ABC transporter ATP-binding protein
YpsIP31758_1781  -3      13       1.148076   peptide ABC transporter ATP-binding protein
YpsIP31758_1782  -3      15       0.964952   peptide ABC transporter permease
YpsIP31758_1783  -3      15       0.683475   peptide ABC transporter permease
YpsIP31758_1784  -1      15       0.293357   peptide ABC transporter periplasmic protein
YpsIP31758_1785  0       14       0.618491   phage shock protein operon transcriptional
YpsIP31758_1786  0       15       0.939275   phage shock protein PspA
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_1778  TCRTETB       38     5e-05    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 38.3 bits (89), Expect = 5e-05
Identities = 72/440 (16%), Positives = 151/440 (34%), Gaps = 49/440 (11%)

Query: 52 LGMSEADSITLFSSFSALVYGFVAIGGWLGDKVLGAKRVIVLGALTLAVGYSMIAYSGHE 111
A + + ++F A+ G L D+ LG KR+++ G + G S+I + GH
Sbjct: 44 FNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQ-LGIKRLLLFGIIINCFG-SVIGFVGHS 101

Query: 112 IF-WVYLGMATIAVGNGLFKANPSSLLSTCYSKDDPRLDGAFTMYYMSINIGSFFSMLAT 170
F + + G F A +++ K+ AF + + +G
Sbjct: 102 FFSLLIMARFIQGAGAAAFPALVMVVVARYIPKE--NRGKAFGLIGSIVAMGEGVGPAIG 159

Query: 171 PWLAAKYGWSVAFSLSVVGMLITLVNFWFCRKWVKNQGSKPDFLPLQFKKLLMVLVGIIA 230
+A WS + +IT++ F K +K + K ++++ VGI+
Sbjct: 160 GMIAHYIHWSYLLLI----PMITIITVPFLMKLLKKEVRIKG--HFDIKGIILMSVGIVF 213

Query: 231 LITLSNWLLHNQIIARWALALVSLGIIFIFTKET-----------LFLQGIARRRMIVAF 279
+ + + +VS+ IF K L ++
Sbjct: 214 FMLFTT-------SYSISFLIVSVLSFLIFVKHIRKVTDPFVDPGLGKNIPFMIGVLCGG 266

Query: 280 LLMLEAVIFFVLYSQMPTSLNFFAIHNVEHSIFGIGFEPEQFQALNPFWIMLASPILAAI 339
++ F + +P + +H + + G F ++ I I
Sbjct: 267 IIFGTVAGFVSM---VPYMMKD--VHQLSTAEIGSVI---------IFPGTMSVIIFGYI 312

Query: 340 YNKMGDRLPMPHKFAFGMMLCSAAFLVLPWGASFANEHGIVSVNW-LILSYALQSIGELM 398
+ DR + G+ S +FL ASF E + ++ S + +
Sbjct: 313 GGILVDRRGPLYVLNIGVTFLSVSFL----TASFLLETTSWFMTIIIVFVLGGLSFTKTV 368

Query: 399 ISGLGLAMVAQLVPQRLMGFIMGSWFLTTAAAALIAGKVAALTAVPSDAI-TDAHASLAI 457
IS + + + Q M + + FL+ I G + ++ + + + S +
Sbjct: 369 ISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLSIPLLDQRLLPMEVDQSTYL 428

Query: 458 YSHVFMQIGIVTAIIAVLMM 477
YS++ + + I ++ +
Sbjct: 429 YSNLLLLFSGIIVISWLVTL 448


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_1781  HTHFIS        31     0.006    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 31.3 bits (71), Expect = 0.006
Identities = 9/16 (56%), Positives = 14/16 (87%)

Query: 38 LVGESGSGKSLIAKAI 53
+ GESG+GK L+A+A+
Sbjct: 165 ITGESGTGKELVARAL 180


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_1782  TATBPROTEIN   32     0.002    Bacterial sec-independent translocation TatB protein...
		>TATBPROTEIN#Bacterial sec-independent translocation TatB protein signature.

Length = 171

Score = 31.5 bits (71), Expect = 0.002
Identities = 15/46 (32%), Positives = 25/46 (54%), Gaps = 3/46 (6%)

Query: 144 LLLAIIVVAFVGPS-LEHAMFAVWLALLPRMVRTIYSAVHDELDKE 188
LL+ II + +GP L A +A R +R++ + V +EL +E
Sbjct: 10 LLVFIIGLVVLGPQRLPVA--VKTVAGWIRALRSLATTVQNELTQE 53


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_1785  HTHFIS        346    e-119    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 346 bits (890), Expect = e-119
Identities = 124/344 (36%), Positives = 176/344 (51%), Gaps = 19/344 (5%)

Query: 3 EQLDNLLGEANAFVDVLEQVSGLAKLNKPVLVIGERGTGKELIAHRLHYLSERWQGPFIS 62
+ L+G + A ++ ++ L + + +++ GE GTGKEL+A LH +R GPF++
Sbjct: 134 QDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALHDYGKRRNGPFVA 193

Query: 63 LNCAALNENLLDSELFGHEAGAFTGAQKRHLGRFERADGGTLFLDELATAPMLVQEKLLR 122
+N AA+ +L++SELFGHE GAFTGAQ R GRFE+A+GGTLFLDE+ PM Q +LLR
Sbjct: 194 INMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDMPMDAQTRLLR 253

Query: 123 VIEYGHLERVGGSQPLQVDVRLVCATNDNLPALAAAGKFRADLLDRLAFDVVQLPPLRER 182
V++ G VGG P++ DVR+V ATN +L G FR DL RL ++LPPLR+R
Sbjct: 254 VLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRLNVVPLRLPPLRDR 313

Query: 183 QQDIMLLAEHFAILMCRELGLPLFSGFTATAKEQLLEYRWPGNVRELKNVVERSV----- 237
+DI L HF +E F A E + + WPGNVREL+N+V R
Sbjct: 314 AEDIPDLVRHFVQQAEKEGLDVK--RFDQEALELMKAHPWPGNVRELENLVRRLTALYPQ 371

Query: 238 -----------YRHSDSSLPLNNIIINPFASNQKGEIEGVDTPNEGGAVLPALPVD-LKH 285
R P+ + + +E P
Sbjct: 372 DVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFASFGDALPPSGLYDR 431

Query: 286 WLHTSEHQMLTRALKQARFNQRKAAHLLGLTYHQLRGLLKKHTI 329
L E+ ++ AL R NQ KAA LLGL + LR +++ +
Sbjct: 432 VLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGV 475


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_1786  cloacin       30     0.006    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 30.5 bits (68), Expect = 0.006
Identities = 33/146 (22%), Positives = 54/146 (36%), Gaps = 29/146 (19%)

Query: 56 QLLRRIDHSESQQQEWQ------------EKAELALRKDKEDLARAALIEKQ-KVMTLVE 102
Q+ +R D +QQEW E+A L + ED+AR E+Q K + +
Sbjct: 295 QVKQRQDEENRRQQEWDATHPVEAAERNYERARAELNQANEDVAR--NQERQAKAVQVYN 352

Query: 103 TLKREVATVDETLSRMKHEITELENKLTETRA--------------RQQALTLRHQAASS 148
+ K E+ ++TL+ EI + + A R Q QAA
Sbjct: 353 SRKSELDAANKTLADAIAEIKQFNRFAHDPMAGGHRMWQMAGLKAQRAQTDVNNKQAAFD 412

Query: 149 SRDVRRQLDSGKLDEAMARFEQFERR 174
+ + L AM ++ E +
Sbjct: 413 AAAKEKSDADAALSSAMESRKKKEDK 438


82  YpsIP31758_1818  YpsIP31758_1826  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
YpsIP31758_1818  0       17       1.000924   protease
YpsIP31758_1819  -2      15       0.998459   acid shock protein
YpsIP31758_1820  -2      13       0.576363   cytochrome b561 family protein
YpsIP31758_1821  -3      12       0.812875   hypothetical protein
YpsIP31758_1822  -3      14       1.169481   hypothetical protein
YpsIP31758_1823  -3      11       1.036883   toxin protein
YpsIP31758_1824  -1      13       -1.384446  thermostable carboxypeptidase I
YpsIP31758_1825  -1      10       -0.403697  sensor protein RstB
YpsIP31758_1826  0       12       -0.147984  RstA family transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_1818  V8PROTEASE    103    5e-28    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 103 bits (259), Expect = 5e-28
Identities = 38/249 (15%), Positives = 84/249 (33%), Gaps = 41/249 (16%)

Query: 33 QTALFFGKDDRTAVTNSRQWPWEAIGQVET---TSGNLCTATLISPRLVLTAGHCVLTP- 88
+ +DR +T++ + + ++ T + + ++ +LT H V
Sbjct: 66 HANVILPNNDRHQITDTTNGHYAPVTYIQVEAPTGTFIASGVVVGKDTLLTNKHVVDATH 125

Query: 89 --PGNIDQAVALRFISDKGHWKYQITDLKTRVDAKLGQKLKADGDGWIVPPAAAAYDFAL 146
P + + + + + + +GD A F+
Sbjct: 126 GDPHALKAFPSAINQDNYPNGGFTAEQITKY---------SGEGD-------LAIVKFSP 169

Query: 147 IQLTNAAPIPIKPLPLWEGTANELTKALKLVNRKVTQAGYPLD-NLNTLYKHEDCLVTGW 205
+ +KP + A VN+ +T GYP D + T+++ + +
Sbjct: 170 NEQNKHIGEVVKPATM-------SNNAETQVNQNITVTGYPGDKPVATMWESKG--KITY 220

Query: 206 AQQGVLAHQCDTLPGDSGSPLLLKNGNSWSLIAIQSSAPAAKERYLADNRALSVT-AINN 264
+ + + T G+SGSP+ + +I I + N A+ + + N
Sbjct: 221 LKGEAMQYDLSTTGGNSGSPVFNEKNE---VIGIHWGGVPNE-----FNGAVFINENVRN 272

Query: 265 RLKKLVNKI 273
LK+ + I
Sbjct: 273 FLKQNIEDI 281


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_1823  ANTHRAXTOXNA  79     5e-17    Anthrax toxin LF subunit signature.
		>ANTHRAXTOXNA#Anthrax toxin LF subunit signature.

Length = 800

Score = 79.0 bits (194), Expect = 5e-17
Identities = 83/309 (26%), Positives = 128/309 (41%), Gaps = 33/309 (10%)

Query: 696 YSQAFKRTANKYNVIIGVRAPNPLGETLLKEGFPSKNFHMKAKSSPTGPTAGFIAEDPIY 755
++ AFK+ A + N I R N L L+K G +K ++ KSS GP AG+I D
Sbjct: 311 HADAFKKIARELNTYILFRPVNKLATNLIKSGVATKGLNVHGKSSDWGPVAGYIPFDQDL 370

Query: 756 SKVSPSAYKKQRASIDKAKALGSES-----IDLFISKSRINELIETGNL--------NFL 802
SK ++ +++ K++ I L + RI EL E G + N
Sbjct: 371 SKKHGQQLAVEKGNLENKKSITEHEGEIGKIPLKLDHLRIEELKENGIILKGKKEIDN-- 428

Query: 803 GENRYSAKYPYGTQEFEIGNNGRVLNSEGKPVKVMTNPPEIGERKSNS---------SPI 853
G+ Y + EF I + + + K K+ + R P+
Sbjct: 429 GKKYYLLESNNQVYEFRISDENNEVQYKTKEGKITVLGEKFNWRNIEVMAKNVEGVLKPL 488

Query: 854 TADYDLFAIIPSVNQSVNERPLTVPHKLLRGNFSLP----FTSPKGKNGMSE--DVNMGN 907
TADYDLFA+ PS+ + + P K++ SL T+ K G+ D G
Sbjct: 489 TADYDLFALAPSLTEIKKQIPQKEWDKVVNTPNSLEKQKGVTNLLIKYGIERKPDSTKGT 548

Query: 908 LHHFGKTIVNSLNKEINAEGYAGGKLVWHNDEAGNPFSPGFDENDKPIFFLPSGGMFQAK 967
L ++ K +++ LN+ + GY GG +V H E N F E D IF + G F
Sbjct: 549 LSNWQKQMLDRLNEAVKYTGYTGGDVVNHGTEQDN---EEFPEKDNEIFIINPEGEFILT 605

Query: 968 NKSELLGFY 976
E+ G +
Sbjct: 606 KNWEMTGRF 614


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_1824  PREPILNPTASE  29     0.032    Type IV prepilin cysteine protease (C20) family sig...
		>PREPILNPTASE#Type IV prepilin cysteine protease (C20) family signature.

Length = 290

Score = 29.4 bits (66), Expect = 0.032
Identities = 23/87 (26%), Positives = 28/87 (32%), Gaps = 14/87 (16%)

Query: 402 YRNGCMQDIHWTDGAFGYFPTYTLGAMYAAQLFHAARSAIPALDSHIANGNLAPLLNWLQ 461
+R M + W YF G RS P + I PLL+WL
Sbjct: 35 HRLPIMLEREWQAEYRSYFNPDDEGVDEPPYNLMVPRSCCPHCNHPITALENIPLLSWL- 93

Query: 462 QNIWQHGS----------RYPTAELIT 478
W G RYP EL+T
Sbjct: 94 ---WLRGRCRGCQAPISARYPLVELLT 117


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_1825  PF06580       29     0.030    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 29.4 bits (66), Expect = 0.030
Identities = 16/105 (15%), Positives = 31/105 (29%), Gaps = 28/105 (26%)

Query: 327 LVNNALRY------SHQRLRIGLWFDGDNACLQVEDDGPGIPPEERTRIFEPFVRLDPSR 380
LV N +++ ++ + D L+VE+ G +
Sbjct: 263 LVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKE------------- 309

Query: 381 DRATGGCGLGLAIVHS-IALAY--QGSISVNTSPLGGASFRFSWP 422
G GL V + + Y + I + + G + P
Sbjct: 310 -----STGTGLQNVRERLQMLYGTEAQIKL-SEKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_1826  HTHFIS        70     6e-16    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 69.9 bits (171), Expect = 6e-16
Identities = 26/133 (19%), Positives = 59/133 (44%), Gaps = 2/133 (1%)

Query: 2 SKIVFVEDDPEVGKLIAAYLGKHDIDVFVEPRGDTAQAVIEQQQPDLVLLDIMLPGKDGM 61
+ I+ +DD + ++ L + DV + T I DLV+ D+++P ++
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 62 TLCRDLRPHYDG-PIVLLTSLDSDMNHILSLEMGANDYILKTTPPAVLLARLRLHLRQHN 120
L ++ P++++++ ++ M I + E GA DY+ K L+ + L
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRAL-AEP 122

Query: 121 QRLRQQTPLQAKE 133
+R + +++
Sbjct: 123 KRRPSKLEDDSQD 135


83  YpsIP31758_2160  YpsIP31758_2172  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
YpsIP31758_2160  -2      11       -2.034310  LuxR family transcriptional regulator
YpsIP31758_2161  -1      11       -1.448881  sensor histidine kinase/response regulator
YpsIP31758_2162  1       14       -1.662039  hypothetical protein
YpsIP31758_2163  1       14       -1.430418  fimbrial protein
YpsIP31758_2164  -1      10       -0.994654  pili assembly chaperone
YpsIP31758_2165  -1      11       0.009938   fimbrial usher protein
YpsIP31758_2166  -1      18       0.156783   hypothetical protein
YpsIP31758_2167  0       20       1.155113   pili assembly chaperone
YpsIP31758_2168  1       23       2.895722   *hypothetical protein
YpsIP31758_2169  1       23       2.892392   carbohydrate ABC transporter ATP-binding
YpsIP31758_2170  1       24       3.295261   carbohydrate ABC transporter permease
YpsIP31758_2171  2       22       3.911784   carbohydrate ABC transporter permease
YpsIP31758_2172  2       20       2.116086   carbohydrate ABC transporter periplasmic-binding
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2160  HTHFIS        67     3e-15    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 66.8 bits (163), Expect = 3e-15
Identities = 34/166 (20%), Positives = 76/166 (45%), Gaps = 10/166 (6%)

Query: 1 MTK-SVMIVDDHPAIRVAIHALLSQSKEFSTISESVDGSEALEKLKNNPVDLVIIDIELP 59
MT ++++ DD AIR ++ LS+ + + + + + DLV+ D+ +P
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSR--AGYDVRITSNAATLWRWIAAGDGDLVVTDVVMP 58

Query: 60 NFDGFSLLKKLQQRGFTGKSLFLSAKNEQVFAVRALQAGANGFISKNKDISEILFAAQNV 119
+ + F LL ++++ L +SA+N + A++A + GA ++ K D++E++
Sbjct: 59 DENAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRA 118

Query: 120 LRGYSFFPSETLTQ------LAGQ-PSSHDPVNRARLLSEREINVL 158
L PS+ L G+ + + L + ++ ++
Sbjct: 119 LAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLM 164


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2161  HTHFIS        71     2e-14    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 70.6 bits (173), Expect = 2e-14
Identities = 33/116 (28%), Positives = 53/116 (45%), Gaps = 3/116 (2%)

Query: 946 RILVVDDLPANRQLLQQQLAFIGIEQVVTAENGAKACQILQHNNFDVVITDCSMPVMDGY 1005
ILV DD A R +L Q L+ G V N A + + + D+V+TD MP + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAG-YDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 1006 ELAAHIRQDPALKDLIVIGCTADAREESAARCIDAGMNACMIKPVAIDTLQATLLR 1061
+L I++ A DL V+ +A +A + + G + KP + L + R
Sbjct: 64 DLLPRIKK--ARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR 117


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2165  PF00577       743    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 743 bits (1920), Expect = 0.0
Identities = 245/884 (27%), Positives = 392/884 (44%), Gaps = 73/884 (8%)

Query: 5 SLLVTHISSAADKNN-QDDYIFDDALVRGSSLGLGSIARFNKKNSYDAGQYQVDMYMNNK 63
L + AA + F+ + + ++RF G Y+VD+Y+NN
Sbjct: 28 VRLFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPGTYRVDIYLNNG 87

Query: 64 FVDRLKMLFVDKDNS--VEPCLSVAQLLQAGVKEEALKTAD--PKTPCLAFQSILPASDF 119
++ + F D+ + PCL+ AQL G+ ++ + C+ S++ +
Sbjct: 88 YMATRDVTFNTGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDACVPLTSMIHDATA 147

Query: 120 RFDHAKLRFDLSIPQKFVKNVPRGYVDPKNLTAGNTIGFSNYNLNQYHVDYNKEGIKRTT 179
+ D + R +L+IPQ F+ N RGY+ P+ G G NYN + V G +
Sbjct: 148 QLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSVQNRIGG---NS 204

Query: 180 NSTYLSLNSGINIGMWRFRQQGSLRYDASRG-----TNWTSNRLYSQRALPTIGSEITLG 234
+ YL+L SG+NIG WR R + Y++S W + +R + + S +TLG
Sbjct: 205 HYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRSRLTLG 264

Query: 235 ETFSSGQFFSSLGFLGVALSTDDRMLPESQRGYAPVVRGIARTNARVTVYQNNRSIYQTT 294
+ ++ G F + F G L++DD MLP+SQRG+APV+ GIAR A+VT+ QN IY +T
Sbjct: 265 DGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIYNST 324

Query: 295 VSPGAFEFNDLSVTHFGGDLTVEINEADGSVSTFQVPFASVPESLRPGYSRYSFAAGQVR 354
V PG F ND+ GDL V I EADGS F VP++SVP R G++RYS AG+ R
Sbjct: 325 VPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITAGEYR 384

Query: 355 DVGN---NETFSELTYQQGISNAITANTGIRLASGYQAIMLGGVF-THYIGALGLNTTYS 410
F + T G+ T G +LA Y+A G +GAL ++ T +
Sbjct: 385 SGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALSVDMTQA 444

Query: 411 HARLPDGEQQQGWMAKASFSRTFQPTNTTLSVAGYRYSTDGYRDLSDVLGVR-------- 462
++ LPD Q G + ++++ + T + + GYRYST GY + +D R
Sbjct: 445 NSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGYNIET 504

Query: 463 ------ATSNDSSWNSSTYRQRSRAEISLNQNFHRYGSLYLTASSQDYRDDRSRDSQLQL 516
+ + + Y +R + ++++ Q R +LYL+ S Q Y + D Q Q
Sbjct: 505 QDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRTSTLYLSGSHQTYWGTSNVDEQFQA 564

Query: 517 GYSNTFWRNTSFNLAISQQKTGGANKIYFVDPGSGMPASNGANTLATRETVAQMSISFPL 576
G + + + ++ L+ S K R+ + ++++ P
Sbjct: 565 GLNTA-FEDINWTLSYSLTKNAWQKG---------------------RDQMLALNVNIPF 602

Query: 577 GGSSSAP--------YVSAGAVNSRTSGASYQTSLSGTMGSDQTAGYSVDVARNEP---T 625
+ S + + + GT+ D YSV
Sbjct: 603 SHWLRSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGN 662

Query: 626 NENTLSGSLQKQLPTTSLSGSASRSPGYWQGSASARGSVAFHRGGVTLGPYLSDTFALIE 685
+ +T +L + + + S S Q G V H GVTLG L+DT L++
Sbjct: 663 SGSTGYATLNYRGGYGNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNDTVVLVK 722

Query: 686 AKGASGAKVMYGQGARIDRFGYALVPTLTPYRYNTLSLDPDGMDFNTELQDGERQIAPYA 745
A GA AKV G R D GYA++P T YR N ++LD + + N +L + + P
Sbjct: 723 APGAKDAKVENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPTR 782

Query: 746 GSTVKVTFRTLNGYPALITIKMPDGSQLPMGTVVYNYNGKGTNDKNDIVGMVGQSSQAYL 805
G+ V+ F+ G L+T+ + LP G +V T++ + G+V + Q YL
Sbjct: 783 GAIVRAEFKARVGIKLLMTLT-HNNKPLPFGAMV-------TSESSQSSGIVADNGQVYL 834

Query: 806 RAEELSGTLTLVWGESSKERCQLDYDLGKPTDNDKQLYKLDALC 849
L+G + + WGE C +Y L P + L +L A C
Sbjct: 835 SGMPLAGKVQVKWGEEENAHCVANYQL-PPESQQQLLTQLSAEC 877


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2166  PF00577       30     0.022    Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 29.8 bits (67), Expect = 0.022
Identities = 13/116 (11%), Positives = 26/116 (22%), Gaps = 15/116 (12%)

Query: 287 ESGTSSGQTAIGIQTSLPGYLKALGLGLVNTAGGVSYLLSDSYG--TDSRIATGVGISLS 344
+ + + ++P L + + S SY D +
Sbjct: 584 NAWQKGRDQMLALNVNIP-----FSHWLRSDSKSQWRHASASYSMSHDLNGRMTNLAGVY 638

Query: 345 DSNGSTMNFVGWG-------GCAQTQDCLTTADAGWYPILTGASGNGSHSAGYNNY 393
+ N + + G A + A+ SHS
Sbjct: 639 GTLLED-NNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGYGNANIGYSHSDDIKQL 693


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2169  PF05272       35     4e-04    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 35.4 bits (81), Expect = 4e-04
Identities = 13/33 (39%), Positives = 16/33 (48%)

Query: 33 VFIGPSGCGKSTLLRMIAGLETISSGEISIGDK 65
V G G GKSTL+ + GL+ S IG
Sbjct: 600 VLEGTGGIGKSTLINTLVGLDFFSDTHFDIGTG 632


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2172  MALTOSEBP     49     3e-08    Maltose binding protein signature.
		>MALTOSEBP#Maltose binding protein signature.

Length = 396

Score = 48.6 bits (115), Expect = 3e-08
Identities = 101/420 (24%), Positives = 171/420 (40%), Gaps = 55/420 (13%)

Query: 14 TLLMASNASA---QETLRVLLEGHSTSDSIKALLPEFEKQTGIKVQAEIVPYSDLTSKAL 70
T++ +++A A + L + + G + + + +FEK TGIKV E + D +
Sbjct: 17 TMMFSASALAKIEEGKLVIWINGDKGYNGLAEVGKKFEKDTGIKVTVE---HPDKLEEKF 73

Query: 71 LAFSSHSGRYDVVMDDWVHAV--GYASAGYITPVDQWMESDTAFYDGADFVKSYA---DT 125
++ D++ W H GYA +G + + D AF D K Y D
Sbjct: 74 PQVAATGDGPDIIF--WAHDRFGGYAQSGLLAEI----TPDKAFQD-----KLYPFTWDA 122

Query: 126 LRYKDGYYGLPVYGESTFLMYRKDLFEQYGIAVPKTFDELTAAAKTIKEKTEGKVAGITL 185
+RY P+ E+ L+Y KDL PKT++E+ A K +K K GK A +
Sbjct: 123 VRYNGKLIAYPIAVEALSLIYNKDLLPN----PPKTWEEIPALDKELKAK--GKSA--LM 174

Query: 186 RGAQGIQNTFAWASFLWGYGGQWIDDNGK-----SAIASPQAVEATKSFVNILKNYGPIG 240
Q + F W G + +NGK + + A V+++KN
Sbjct: 175 FNLQ--EPYFTWPLIAADGGYAFKYENGKYDIKDVGVDNAGAKAGLTFLVDLIKNKHMNA 232

Query: 241 AANFGWQENRLVFQQGKAAMTIDSTVNGGFNEDPKESTVVGKVGYAPVPVQPGDHPGNSG 300
++ E F +G+ AMTI NG + +++ KV Y + +
Sbjct: 233 DTDYSIAE--AAFNKGETAMTI----NGPWAWSNIDTS---KVNYGVTVLPTFKGQPSKP 283

Query: 301 ALQVHGLYISSDSKKQDAAWKFISWATDKQTQMKSVELNPNAGVSSLSAINSDAFTKRYG 360
+ V I++ S ++ A +F+ +++V + L A+ ++ +
Sbjct: 284 FVGVLSAGINAASPNKELAKEFLENYLLTDEGLEAVNKD-----KPLGAVALKSYEEELA 338

Query: 361 AFKDGMLAALQNGNAK--YLPTIPQSTQIINITGIALSEALAGTQTVENALQQANTRNDK 418
KD +AA K +P IPQ + A+ A +G QTV+ AL+ A TR K
Sbjct: 339 --KDPRIAATMENAQKGEIMPNIPQMSAFWYAVRTAVINAASGRQTVDEALKDAQTRITK 396


84  YpsIP31758_2289  YpsIP31758_2304  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
YpsIP31758_2289  0       17       4.013617   flagellar hook-basal body protein FliE
YpsIP31758_2290  0       17       4.709441   flagellar MS-ring protein
YpsIP31758_2291  1       16       4.486558   flagellar motor switch protein G
YpsIP31758_2292  1       15       4.695781   flagellar assembly protein H
YpsIP31758_2293  0       15       3.796091   flagellum-specific ATP synthase
YpsIP31758_2294  -1      17       2.696529   flagellar biosynthesis chaperone
YpsIP31758_2295  -1      17       2.938649   flagellar hook-length control protein
YpsIP31758_2296  0       22       2.310740   hypothetical protein
YpsIP31758_2297  0       23       2.644224   hypothetical protein
YpsIP31758_2298  -1      22       2.709455   flagellar basal body-associated protein FliL
YpsIP31758_2299  1       19       2.795656   flagellar motor switch protein FliM
YpsIP31758_2300  1       18       2.311639   flagellar motor switch protein FliN
YpsIP31758_2301  2       20       2.877882   flagellar protein fliO
YpsIP31758_2302  2       19       2.033874   flagellar biosynthesis protein FliP
YpsIP31758_2303  0       18       2.019090   flagellar biosynthesis protein FliQ
YpsIP31758_2304  1       19       1.933795   flagellar biosynthesis protein FliR
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2289  FLGHOOKFLIE   80     3e-23    Flagellar hook-basal body complex protein FliE signa...
		>FLGHOOKFLIE#Flagellar hook-basal body complex protein FliE signature.

Length = 103

Score = 80.1 bits (197), Expect = 3e-23
Identities = 59/102 (57%), Positives = 73/102 (71%)

Query: 4 SVQGIEGVLQQLQVTALQASGSAKTLPAEAGFASELKAAIGKISENQQVARTSAQNFELG 63
++QGIEGV+ QLQ TA+ A FA +L AA+ +IS+ Q ART A+ F LG
Sbjct: 2 AIQGIEGVISQLQATAMSARAQESLPQPTISFAGQLHAALDRISDTQTAARTQAEKFTLG 61

Query: 64 VPGVGLNDVMVNAQKSSVSLQLGIQVRNKLVAAYQEVMNMGV 105
PGV LNDVM + QK+SVS+Q+GIQVRNKLVAAYQEVM+M V
Sbjct: 62 EPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV 103


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2290  FLGMRINGFLIF  577    0.0      Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 577 bits (1488), Expect = 0.0
Identities = 354/552 (64%), Positives = 443/552 (80%), Gaps = 9/552 (1%)

Query: 19 LARLRANPKIPLLIAAAAAIAIIVALMLWAKSPDYRVLYSNLSDRDGGDIVTQLTQLNIP 78
L RLRANP+IPL++A +AA+AI+VA++LWAK+PDYR L+SNLSD+DGG IV QLTQ+NIP
Sbjct: 16 LNRLRANPRIPLIVAGSAAVAIVVAMVLWAKTPDYRTLFSNLSDQDGGAIVAQLTQMNIP 75

Query: 79 YRFADNGGALLIPAEKVHETRLRLAQQGLPKGGAVGFELLDQEKFGISQFSEQINYQRAL 138
YRFA+ GA+ +PA+KVHE RLRLAQQGLPKGGAVGFELLDQEKFGISQFSEQ+NYQRAL
Sbjct: 76 YRFANGSGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQFSEQVNYQRAL 135

Query: 139 EGELSRTIGTLGPVLNVRVHLAMPKPSLFVREQKSPTASVTLALQPGRALDDGQINAIVY 198
EGEL+RTI TLGPV + RVHLAMPKPSLFVREQKSP+ASVT+ L+PGRALD+GQI+A+V+
Sbjct: 136 EGELARTIETLGPVKSARVHLAMPKPSLFVREQKSPSASVTVTLEPGRALDEGQISAVVH 195

Query: 199 MVSSSVAGLPPGNVTVVDQTGRLLTQSDSAGRDLNASQLKFTSEVENRYQRRIENILAPM 258
+VSS+VAGLPPGNVT+VDQ+G LLTQS+++GRDLN +QLKF ++VE+R QRRIE IL+P+
Sbjct: 196 LVSSAVAGLPPGNVTLVDQSGHLLTQSNTSGRDLNDAQLKFANDVESRIQRRIEAILSPI 255

Query: 259 VGNGNVHAQVTAQVDFASREQTDEEYKPNQAANQGAVRSQQVSTSEQLGGTNVGGVPGAL 318
VGNGNVHAQVTAQ+DFA++EQT+E Y PN A++ +RS+Q++ SEQ+G GGVPGAL
Sbjct: 256 VGNGNVHAQVTAQLDFANKEQTEEHYSPNGDASKATLRSRQLNISEQVGAGYPGGVPGAL 315

Query: 319 SNQPPVAPIAPIEIPQPAGAAANNAAPANTAATANANTTATAAKASSSNSRHDQTTNFEV 378
SNQP API P A N +T+ +N+ A +++ ++T+N+EV
Sbjct: 316 SNQPAPPNEAPIATPPTNQQNAQNTPQTSTSTNSNS--------AGPRSTQRNETSNYEV 367

Query: 379 DRTIRHTQQQAGMVQRLSVAVVVNYTSDKAGKPIALSKDQLAQVESLTREAMGFSTVRGD 438
DRTIRHT+ G ++RLSVAVVVNY + GKP+ L+ DQ+ Q+E LTREAMGFS RGD
Sbjct: 368 DRTIRHTKMNVGDIERLSVAVVVNYKTLADGKPLPLTADQMKQIEDLTREAMGFSDKRGD 427

Query: 439 TLNVVNTPFTASDDTRGSSLPFWQQQSFFDQLLNAGRYLLILLVAWILWRKLLRPMLAKK 498
TLNVVN+PF+A D+T G LPFWQQQSF DQLL AGR+LL+L+VAWILWRK +RP L ++
Sbjct: 428 TLNVVNSPFSAVDNT-GGELPFWQQQSFIDQLLAAGRWLLVLVVAWILWRKAVRPQLTRR 486

Query: 499 QVADKAAASVNNIVQTAQAAETVKQSKEELALRKKNQQRVSAEVQAQRIRELADKDPRVV 558
KAA + Q + A V+ SK+E +++ QR+ AEV +QRIRE++D DPRVV
Sbjct: 487 VEEAKAAQEQAQVRQETEEAVEVRLSKDEQLQQRRANQRLGAEVMSQRIREMSDNDPRVV 546

Query: 559 ALVIRQWMSNDQ 570
ALVIRQWMSND
Sbjct: 547 ALVIRQWMSNDH 558


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2291  FLGMOTORFLIG  314    e-108    Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 314 bits (806), Expect = e-108
Identities = 113/327 (34%), Positives = 192/327 (58%), Gaps = 2/327 (0%)

Query: 2 SLTGTEKSAIMLMTLGEDHAAEVFKHLSSREVQQLSTTMASMRQVSHQQLVDVLAEFEDD 61
+LTG +K+AI+L+++G + +++VFK+LS E++ L+ +A + ++ + +VL EF++
Sbjct: 14 ALTGKQKAAILLVSIGSEISSKVFKYLSQEEIESLTFEIAKLETITSELKDNVLLEFKEL 73

Query: 62 AEQYAALSVNASDYLRSVLIKALGEERASSLLEDILESRETTSGMETLNFMEPQMAADLI 121
+ DY R +L K+LG ++A ++ + L S + E + +P + I
Sbjct: 74 MMAQEFIQKGGIDYARELLEKSLGTQKAVDIINN-LGSALQSRPFEFVRRADPANILNFI 132

Query: 122 RDEHPQIIATILVHLKRAQAADILALFDERLRNDVMLRIATFGGVQPAALAELTEVLNNL 181
+ EHPQ IA IL +L +A+ IL+ ++ +V RIA P + E+ VL
Sbjct: 133 QQEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEVVREVERVLEKK 192

Query: 182 LDGQ-NLKRSKMGGIRTAAEIINLMKTQQEETVMDAVREYDGELAQKIIDEMFLFENLVS 240
L + + GG+ EIIN+ + E+ +++++ E D ELA++I +MF+FE++V
Sbjct: 193 LASLSSEDYTSAGGVDNVVEIINMADRKTEKFIIESLEEEDPELAEEIKKKMFVFEDIVL 252

Query: 241 VDDRSIQRLLQEIDNESLLIALKGADQALRERFLSNMSLRAAEILRDDLATRGPVRMSLV 300
+DDRSIQR+L+EID + L ALK D ++E+ NMS RAA +L++D+ GP R V
Sbjct: 253 LDDRSIQRVLREIDGQELAKALKSVDIPVQEKIFKNMSKRAASMLKEDMEFLGPTRRKDV 312

Query: 301 ENEQKSILLIVRRLAESGEIVIGGGED 327
E Q+ I+ ++R+L E GEIVI G +
Sbjct: 313 EESQQKIVSLIRKLEEQGEIVISRGGE 339


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2292  FLGFLIH       221    5e-75    Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 221 bits (563), Expect = 5e-75
Identities = 128/233 (54%), Positives = 167/233 (71%), Gaps = 7/233 (3%)

Query: 6 NALPWQPWSLKDFASQSEAPLSESMPDISLLFPNEPMEATAAVDEQQVLVNLQLEAEKQG 65
+ LPW+ W+ D A P +E +P + P E + A +Q L LQ++A +QG
Sbjct: 3 DNLPWKTWTPDDLAP----PQAEFVPIVE---PEETIIEEAEPSLEQQLAQLQMQAHEQG 55

Query: 66 RQQGFAKGLQEGLDKGYQTGLEEGHQQALADAQQQLAPMTAHWQVMVTDFQNTLDTLDSV 125
Q G A+G Q+G +GYQ GL +G +Q LA+A+ Q AP+ A Q +V++FQ TLD LDSV
Sbjct: 56 YQAGIAEGRQQGHKQGYQEGLAQGLEQGLAEAKSQQAPIHARMQQLVSEFQTTLDALDSV 115

Query: 126 IASRLVQIALAAAKQIIGQPAICDGTALLAQIQQMIQQEPMFAGKTQLRVNPDDLAIVEQ 185
IASRL+Q+AL AA+Q+IGQ D +AL+ QIQQ++QQEP+F+GK QLRV+PDDL V+
Sbjct: 116 IASRLMQMALEAARQVIGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDD 175

Query: 186 RLGSTLSLHGWRLLGDSQIHAGGCKVSAEEGDLDASLATRWHELCRLAAPGEL 238
LG+TLSLHGWRL GD +H GGCKVSA+EGDLDAS+ATRW ELCRLAAPG +
Sbjct: 176 MLGATLSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATRWQELCRLAAPGVV 228


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2294  FLGFLIJ       112    9e-35    Flagellar FliJ protein signature.
		>FLGFLIJ#Flagellar FliJ protein signature.

Length = 147

Score = 112 bits (281), Expect = 9e-35
Identities = 82/144 (56%), Positives = 102/144 (70%)

Query: 1 MKSQSPLVTLCDLAQKAVEQASTQLGHVRQSYQNAEQQLTMLLTYQDEYRERLNDTLCNG 60
M L TL DLA+K VE A+ LG +R+ Q AE+QL ML+ YQ+EYR LN + G
Sbjct: 1 MAEHGALATLKDLAEKEVEDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRNNLNSDMSAG 60

Query: 61 MASSSWQNYQQFIQTLEQAIDQHRKQLAQWSIKVEQAVKYWQEKQQRLNAFETLQERAET 120
+ S+ W NYQQFIQTLE+AI QHR+QL QW+ KV+ A+ W+EK+QRL A++TLQER T
Sbjct: 61 ITSNRWINYQQFIQTLEKAITQHRQQLNQWTQKVDIALNSWREKKQRLQAWQTLQERQST 120

Query: 121 TQRQQENRLDQKLMDEFAQRASQR 144
ENRLDQK MDEFAQRA+ R
Sbjct: 121 AALLAENRLDQKKMDEFAQRAAMR 144


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2295  FLGHOOKFLIK   138    8e-39    Flagellar hook-length control protein signature.
		>FLGHOOKFLIK#Flagellar hook-length control protein signature.

Length = 375

Score = 138 bits (347), Expect = 8e-39
Identities = 102/226 (45%), Positives = 127/226 (56%), Gaps = 9/226 (3%)

Query: 229 TPAPDKLTHLAAQDGESVLNTKTPPLVAQSEVSLSFASSDKTQLNSTP--VTAALSSPMN 286
T P T L ++ + P AQ L + K ++ STP VTAA S +
Sbjct: 157 TEKPTLFTKLTSEQLTTAQPDDAPGTPAQPLTPLVAEAQSKAEVISTPSPVTAAASPLIT 216

Query: 287 TAAASSLASAPANGYLSAPLGSQEWQQSLGQQVIMFSRNGQQSAELRLHPQELGALQISL 346
L + A LSAPLGS EWQQSL Q + +F+R GQQSAELRLHPQ+LG +QISL
Sbjct: 217 PHQTQPLPTVAAP-VLSAPLGSHEWQQSLSQHISLFTRQGQQSAELRLHPQDLGEVQISL 275

Query: 347 KMEDNQAQLHFASAHSQVRAALEAAMPSLRHALAESGVQLGQSSVGSEGQWQQAQQQSQQ 406
K++DNQAQ+ S H VRAALEAA+P LR LAESG+QLGQS++ E Q Q SQQ
Sbjct: 276 KVDDNQAQIQMVSPHQHVRAALEAALPVLRTQLAESGIQLGQSNISGESFSGQQQAASQQ 335

Query: 407 NQQDVVARGQPTYGDVVAGPLTETPLAAPTALQSLANGQGGVDVFA 452
Q A +P G+ + L P +LQ G GVD+FA
Sbjct: 336 QQSQRTANHEPLAGE------DDDTLPVPVSLQGRVTGNSGVDIFA 375


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2298  PF04335       27     0.031    VirB8 type IV secretion protein
		>PF04335#VirB8 type IV secretion protein

Length = 227

Score = 27.1 bits (60), Expect = 0.031
Identities = 26/156 (16%), Positives = 44/156 (28%), Gaps = 27/156 (17%)

Query: 8 AKRKSSIWLILLVLVAIAASAGGGYSWWLLHKSKPTNTQIVAAIPVFMPLETFTVNLITP 67
A+R + ++ + A+AG V A+ PL+T +IT
Sbjct: 28 AERSKKLAWVVAGVAGALATAG------------------VVAVAALTPLKTVEPYVITV 69

Query: 68 DNNLDRVLYIGLTLRLPDDTTRTKLNDYLPE--VRSR-----LLLLLSRQSADSLSNEEG 120
D N T + Y VR R + +S
Sbjct: 70 DRNTGEASIAAKLHGDATITYDEAVRKYFLATYVRYREGWIAAAREEYFDAVMVMSARPE 129

Query: 121 KQRLVN--DIKNILSPPMVKGQPNQVISDVLFTAFI 154
+ R N SP + V ++ +F+
Sbjct: 130 QDRWSRFYKTDNPQSPQNILANRTDVFVEIKRVSFL 165


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2299  FLGMOTORFLIM  334    e-116    Flagellar motor switch protein FliM signature.
		>FLGMOTORFLIM#Flagellar motor switch protein FliM signature.

Length = 344

Score = 334 bits (859), Expect = e-116
Identities = 78/288 (27%), Positives = 138/288 (47%), Gaps = 8/288 (2%)

Query: 5 ILSQAEIDALLNGDS---GSEEPVVITANETDVKPYDPTTQRRVVRERLHALEIINERFA 61
+LSQ EID LL S S E ++ + YD + +E++ L +++E FA
Sbjct: 4 VLSQDEIDQLLTAISSGDASIEDARPISDTRKITLYDFRRPDKFSKEQMRTLSLMHETFA 63

Query: 62 RQFRMGLFNLLRRSPDITVGPIKIQPYHDFARNLPVPTNLNLVHLKPLRGTALFVFAPSL 121
R L LR + V + Y +F R++P P+ L ++ + PL+G A+ PS+
Sbjct: 64 RLTTTSLSAQLRSMVHVHVASVDQLTYEEFIRSIPTPSTLAVITMDPLKGNAVLEVDPSI 123

Query: 122 VFIAVDNLFGGDGRFPTKVEGREFTHTEQRVITRMLRLALDAYRDAWAAIYKIDVEYVRS 181
F +D LFGG G+ KV+ R+ T E V+ ++ L R++W + + +
Sbjct: 124 TFSIIDRLFGGTGQ-AAKVQ-RDLTDIENSVMEGVIVRILANVRESWTQVIDLRPRLGQI 181

Query: 182 EIQVKFTNITTSPNDIVVSTPFQVEIGTLSGEFNICIPFAMIEPLRELLTNPPLENS--R 239
E +F I P+++VV + ++G G N CIP+ IEP+ L++ +S R
Sbjct: 182 ETNPQFAQI-VPPSEMVVLVTLETKVGEEEGMMNFCIPYITIEPIISKLSSQFWFSSVRR 240

Query: 240 QEDNYWRETLVKQVQHSELELVANFVDIPLRLSQILKLQPGDVLPIEK 287
+ L ++ ++++VA + L + IL L+ GD++ +
Sbjct: 241 SSTTQYMGVLRDKLSTVDMDVVAEVGSLRLSVRDILGLRVGDIIRLHD 288


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2300  FLGMOTORFLIN  161    1e-54    Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 161 bits (410), Expect = 1e-54
Identities = 103/138 (74%), Positives = 117/138 (84%), Gaps = 1/138 (0%)

Query: 1 MSDPKFPSADGKESVDDLWADAFNEQQATEKPTATTEGVFKSLEAPEGLGNLQDIDLILD 60
MSD PS + ++DDLWADA NEQ+AT +A + VF+ L + G +QDIDLI+D
Sbjct: 1 MSDMNNPSDENTGALDDLWADALNEQKATTTKSAA-DAVFQQLGGGDVSGAMQDIDLIMD 59

Query: 61 IPVKLSVELGRTKMTIKELLRLSQGSVVSLDGLAGEPLDILINGYLIAQGEVVVVADKYG 120
IPVKL+VELGRT+MTIKELLRL+QGSVV+LDGLAGEPLDILINGYLIAQGEVVVVADKYG
Sbjct: 60 IPVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYG 119

Query: 121 VRITDIITPSERMRRLSR 138
VRITDIITPSERMRRLSR
Sbjct: 120 VRITDIITPSERMRRLSR 137


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2302  FLGBIOSNFLIP  306    e-108    Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP signature.

Length = 245

Score = 306 bits (786), Expect = e-108
Identities = 196/240 (81%), Positives = 215/240 (89%), Gaps = 1/240 (0%)

Query: 35 TTLGLLTLFCSPSVLAQLPGIISQPLANGGQSWSLPVQTLVFITTLSFLPAALLMMTSFT 94
LL L P AQLPGI SQPL GGQSWSLPVQTLVFIT+L+F+PA LLMMTSFT
Sbjct: 7 VAPVLLWLIT-PLAFAQLPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILLMMTSFT 65

Query: 95 RIIIVLGLLRNAMGTPSAPPNQVMLGLALFLTFFIMSPVFDKVYQEAYLPFSQDKISMDV 154
RIIIV GLLRNA+GTPSAPPNQV+LGLALFLTFFIMSPV DK+Y +AY PFS++KISM
Sbjct: 66 RIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEKISMQE 125

Query: 155 ALDKGSQPLREFMLRQTRESDLALYARLANLPPLEGPEMVPMRILLPAYVTSELKTAFQI 214
AL+KG+QPLREFMLRQTRE+DL L+ARLAN PL+GPE VPMRILLPAYVTSELKTAFQI
Sbjct: 126 ALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELKTAFQI 185

Query: 215 GFTVFIPFLIIDLVVASVLMALGMMMVPPASISLPFKLMLFVLVDGWQLLLGSLAQSFYS 274
GFT+FIPFLIIDLV+ASVLMALGMMMVPPA+I+LPFKLMLFVLVDGWQLL+GSLAQSFYS
Sbjct: 186 GFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLAQSFYS 245


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2303  TYPE3IMQPROT  67     1e-18    Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.

Length = 86

Score = 67.1 bits (164), Expect = 1e-18
Identities = 24/78 (30%), Positives = 40/78 (51%)

Query: 4 ESVMALGTEAMKIALALAAPLLLAALISGLIVSLLQAATQINEMTLSFIPKILAVFTTMV 63
+ ++ G +A+ + L L+ + A I GL+V L Q TQ+ E TL F K+L V +
Sbjct: 2 DDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLF 61

Query: 64 IAGPWMLNLILDYMRNLF 81
+ W ++L Y R +
Sbjct: 62 LLSGWYGEVLLSYGRQVI 79


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2304  TYPE3IMRPROT  173    1e-55    Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 173 bits (440), Expect = 1e-55
Identities = 172/258 (66%), Positives = 215/258 (83%)

Query: 1 MLSFDTHQLSVWVSQYFWPLVRVLALIGTAPLLSEKQINKKVKIGLGVLITFLIAPSLPP 60
ML + Q W++ YFWPL+RVLALI TAP+LSE+ + K+VK+GL ++ITF IAPSLP
Sbjct: 1 MLQVTSEQWLSWLNLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA 60

Query: 61 VNIPLFSSAALWVAIQQILIGVALGVTMQFAFAAVRLSGEVIGLQMGLSFATFFDPSGGP 120
++P+FS ALW+A+QQILIG+ALG TMQFAFAAVR +GE+IGLQMGLSFATF DP+
Sbjct: 61 NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHL 120

Query: 121 NMPVLSRLLNILVTLLFLSFDGHLWLISLLADSFHTLPIQFAPLNGNGFLTLAQSGSMIF 180
NMPVL+R++++L LLFL+F+GHLWLISLL D+FHTLPI PLN N FL L ++GS+IF
Sbjct: 121 NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIF 180

Query: 181 MNGLMLALPLITLLLTLNMALGMLNRMTPQLSVFVIGFPLTLTVGIISLGLIMPLLAPFT 240
+NGLMLALPLITLLLTLN+ALG+LNRM PQLS+FVIGFPLTLTVGI + +MPL+APF
Sbjct: 181 LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC 240

Query: 241 EHLFGEFFDRLAEVLSGM 258
EHLF E F+ LA+++S +
Sbjct: 241 EHLFSEIFNLLADIISEL 258


85  YpsIP31758_2311  YpsIP31758_2322  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
YpsIP31758_2311  -1      15       2.074834   short chain dehydrogenase
YpsIP31758_2312  0       17       2.260785   sugar-binding domain-containing protein
YpsIP31758_2313  1       18       2.582262   flagellar hook-associated protein FlgL
YpsIP31758_2314  1       20       3.305468   flagellar hook-associated protein FlgK
YpsIP31758_2315  3       21       4.848353   hypothetical protein
YpsIP31758_2316  2       20       4.438193   flagellar rod assembly protein/muramidase FlgJ
YpsIP31758_2317  2       21       4.183153   flagellar basal body P-ring protein
YpsIP31758_2318  2       21       3.914305   flagellar basal body L-ring protein
YpsIP31758_2319  2       20       3.674619   flagellar basal body rod protein FlgG
YpsIP31758_2320  1       18       3.770541   flagellar basal body rod protein FlgF
YpsIP31758_2321  2       16       2.310279   flagellar hook protein FlgE
YpsIP31758_2322  3       16       1.969839   flagellar basal body rod modification protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2311  DHBDHDRGNASE  101    5e-26    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 101 bits (252), Expect = 5e-26
Identities = 70/256 (27%), Positives = 114/256 (44%), Gaps = 8/256 (3%)

Query: 433 SVKPLQGQIVVVTGAGGGIGAAIAKEFSLLGAELAVLDIDSESAKNVAAQL---GPHALA 489
+ K ++G+I +TGA GIG A+A+ + GA +A +D + E + V + L HA A
Sbjct: 2 NAKGIEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEA 61

Query: 490 LQCDVTETASVQAAFETIATRFGGVDIVVSNAGVALSGAIAELPEATLRTSFEVNFFAHQ 549
DV ++A++ I G +DI+V+ AGV G I L + +F VN
Sbjct: 62 FPADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVF 121

Query: 550 RVAQQAVSIMKKQGIGGVLLFNISKQAINPGINFGAYGTSKAALLSLVRQYALEQGQDSI 609
++ M + G ++ S A P + AY +SKAA + + LE + +I
Sbjct: 122 NASRSVSKYMMDRRSGSIVTVG-SNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNI 180

Query: 610 RVNAVNADRIRSGLLDDEMISLRARARGL--SEEKYMAGNLLGQEVTAQDVAKA--FVVS 665
R N V+ + + + + S E + G L + D+A A F+VS
Sbjct: 181 RCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVS 240

Query: 666 AMLDKSTGNVITVDGG 681
T + + VDGG
Sbjct: 241 GQAGHITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2313  FLAGELLIN     41     5e-06    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 40.8 bits (95), Expect = 5e-06
Identities = 35/137 (25%), Positives = 63/137 (45%), Gaps = 7/137 (5%)

Query: 4 STSMLYQQNMQGITNAQSLWMQTGQQLSTGKRVVNPSDDPMAASQAVMVSQAESENSQYT 63
S S+L Q N+ ++ S ++LS+G R+ + DD AA QA+ +
Sbjct: 8 SLSLLTQNNLNKSQSSLS---SAIERLSSGLRINSAKDD--AAGQAIANRFTSNIKGLTQ 62

Query: 64 LARSFARQSSSLETT--VLAQTTSTIQSIQSLVISAKNDTLSDDDRASYATQLQGLKDQL 121
+R+ S +TT L + + +Q ++ L + A N T SD D S ++Q +++
Sbjct: 63 ASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEEI 122

Query: 122 LNQANTTDGNGRYIFAG 138
+N T NG + +
Sbjct: 123 DRVSNQTQFNGVKVLSQ 139


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2314  FLGHOOKAP1    436    e-150    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 436 bits (1123), Expect = e-150
Identities = 315/552 (57%), Positives = 398/552 (72%), Gaps = 9/552 (1%)

Query: 3 NSLMNTAMSGLNAAQYALSTVSNNITNFQVAGYNRQNTVFAQNGGTITSAGFIGNGVTVT 62
+SL+N AMSGLNAAQ AL+T SNNI+++ VAGY RQ T+ AQ T+ + G++GNGV V+
Sbjct: 1 SSLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVS 60

Query: 63 GVNREYNAFITNQLRASQTQSSGLATYYQQISQIDNLLSNASNNLSTTMQDFFSNLQNLV 122
GV REY+AFITNQLRA+QTQSSGL Y+Q+S+IDN+LS ++++L+T MQDFF++LQ LV
Sbjct: 61 GVQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTLV 120

Query: 123 SNADDDAARKTVLGKAEGLVNQFQNADKYLRDMDDGVNQKITDSATQINNYAEQIAKLND 182
SNA+D AAR+ ++GK+EGLVNQF+ D+YLRD D VN I S QINNYA+QIA LND
Sbjct: 121 SNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLND 180

Query: 183 QITRLRG-SSGSEPNALLDQRDQLVTELNQIMAVTVTQQDGDAYNVSFAGGLSLVQGPNA 241
QI+RL G +G+ PN LLDQRDQLV+ELNQI+ V V+ QDG YN++ A G SLVQG A
Sbjct: 181 QISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGSTA 240

Query: 242 YKVEAIPSSADATRLTLGYKRGNGEATEVDESRITTGSLGGTLKFRSEALDSARNQLGQL 301
++ A+PSSAD +R T+ Y G E+ E + TGSLGG L FRS+ LD RN LGQL
Sbjct: 241 RQLAAVPSSADPSRTTVAYVDGTAGNIEIPEKLLNTGSLGGILTFRSQDLDQTRNTLGQL 300

Query: 302 ALVMADSFNTQHNAGFDINGDEGEDFFSFADPTVLKNAKNQGNASITVEYKDTSKVKASD 361
AL A++FNTQH AGFD NGD GEDFF+ P VL+N KN+G+ +I D S V A+D
Sbjct: 301 ALAFAEAFNTQHKAGFDANGDAGEDFFAIGKPAVLQNTKNKGDVAIGATVTDASAVLATD 360

Query: 362 YTVEFDGTDWQVTRLSDNTKVQTTPGVNADGDPTLEFEGVAIKIDNGTPGPQAKDKFTIK 421
Y + FD WQVTRL+ NT TP D + + F+G+ + P D FT+K
Sbjct: 361 YKISFDNNQWQVTRLASNTTFTVTP----DANGKVAFDGLELTFTG---TPAVNDSFTLK 413

Query: 422 TVSNVAANLQVAITDSSKIAAAGSADGGISDNTNAQALLDLQSKKLVEGK-TTLSGAYAG 480
VS+ N+ V ITD +KIA A D G SDN N QALLDLQS G + + AYA
Sbjct: 414 PVSDAIVNMDVLITDEAKIAMASEEDAGDSDNRNGQALLDLQSNSKTVGGAKSFNDAYAS 473

Query: 481 LVSNVGNQTATAKTNSTAQANIVTQLTTEQQSISGVNLDEEYGDLQRFQQYYLANAQVLQ 540
LVS++GN+TAT KT+S Q N+VTQL+ +QQSISGVNLDEEYG+LQRFQQYYLANAQVLQ
Sbjct: 474 LVSDIGNKTATLKTSSATQGNVVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQ 533

Query: 541 AASTLFNALLSI 552
A+ +F+AL++I
Sbjct: 534 TANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2316  FLGFLGJ       314    e-109    Flagellar protein FlgJ signature.
		>FLGFLGJ#Flagellar protein FlgJ signature.

Length = 313

Score = 314 bits (805), Expect = e-109
Identities = 181/316 (57%), Positives = 233/316 (73%), Gaps = 6/316 (1%)

Query: 1 MSDLLAMSGAAYDAQSLEALKRDAARDPEGNLKQVAQQVEGMFVQMMLKSMRAALPQDGV 60
+SD ++ AA+DAQSL LK A DP N++ VA+QVEGMFVQMMLKSMR ALP+DG+
Sbjct: 2 ISDSKLLASAAWDAQSLNELKAKAGEDPAANIRPVARQVEGMFVQMMLKSMRDALPKDGL 61

Query: 61 MNSEQTKLYTSLYDQQIAQQMSA-KGLGLADMMVEQLS-GSTSASETAGTVPMMLDNEVL 118
+SE T+LYTS+YDQQIAQQM+A KGLGLA+MMV+Q++ E+ PM E +
Sbjct: 62 FSSEHTRLYTSMYDQQIAQQMTAGKGLGLAEMMVKQMTPEQPLPEESTPAAPMKFPLETV 121

Query: 119 QSMPAQALAQVMRRAIPTPPSSSMAAISPGNGNFVARMSIPAQIASQQSGIPHQLIMAQA 178
QAL+Q++++A+P S+ S F+A++S+PAQ+ASQQSG+PH LI+AQA
Sbjct: 122 VRYQNQALSQLVQKAVPRNYDDSLPGDSK---AFLAQLSLPAQLASQQSGVPHHLILAQA 178

Query: 179 ALESGWGQREIPTADGKSSYNVFGIKAGSSWNGPVSEITTTEYEQGVAKKTKARFRVYGS 238
ALESGWGQR+I +G+ SYN+FG+KA +W GPV+EITTTEYE G AKK KA+FRVY S
Sbjct: 179 ALESGWGQRQIRRENGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKKVKAKFRVYSS 238

Query: 239 YVEAVSDYVKLLTQNPRYAHVAAAQSPEQGAHALQKAGYATDPQYAQKLVSVIQQMRSTG 298
Y+EA+SDYV LLT+NPRYA V A S EQGA ALQ AGYATDP YA+KL ++IQQM+S
Sbjct: 239 YLEALSDYVGLLTRNPRYAAVTTAASAEQGAQALQDAGYATDPHYARKLTNMIQQMKSIS 298

Query: 299 EQAVKAYGGSDLSQLF 314
++ K Y ++ LF
Sbjct: 299 DKVSKTY-SMNIDNLF 313


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2317  FLGPRINGFLGI  391    e-138    Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 391 bits (1007), Expect = e-138
Identities = 155/366 (42%), Positives = 217/366 (59%), Gaps = 9/366 (2%)

Query: 5 SLVTLLMVLLSLVWLPASAERIRDLVTVQGVRDNALIGYGLVVGLDGSGDQTMQTPFTTQ 64
+LV + LS A RI+D+ ++Q RDN LIGYGLVVGL G+GD +PFT Q
Sbjct: 10 ALVFSALPFLSTPPAQADTSRIKDIASLQAGRDNQLIGYGLVVGLQGTGDSLRSSPFTEQ 69

Query: 65 SLSNMLSQLGITVPPGTNMQLKNVAAVMVTAKLPAFSRAGQTIDVVVSSMGNAKSIRGGT 124
S+ ML LGIT G + KN+AAVMVTA LP F+ G +DV VSS+G+A S+RGG
Sbjct: 70 SMRAMLQNLGITTQGGQS-NAKNIAAVMVTANLPPFASPGSRVDVTVSSLGDATSLRGGN 128

Query: 125 LLMTPLKGVDNQVYALAQGNVLVGGAGAAAGGSSVQVNQLAGGRISNGATIERELPTTFG 184
L+MT L G D Q+YA+AQG ++V G A +++ R+ NGA IERELP+ F
Sbjct: 129 LIMTSLSGADGQIYAVAQGALIVNGFSAQGDAATLTQGVTTSARVPNGAIIERELPSKFK 188

Query: 185 TDGIINLQLNSEDFTLAQQVSDAINR----QRGFGSATAIDARTIQVLVPRGGSSQVRFL 240
+ LQL + DF+ A +V+D +N + G A D++ I V PR + R +
Sbjct: 189 DSVNLVLQLRNPDFSTAVRVADVVNAFARARYGDPIAEPRDSQEIAVQKPR-VADLTRLM 247

Query: 241 ADIQNIPINVDPGDAKVIINSRTGSVVMNRNVVLDSCAVAQGNLSVVVDKQNIVSQPDTP 300
A+I+N+ + D AKV+IN RTG++V+ +V + AV+ G L+V V + V QP P
Sbjct: 248 AEIENLTVETD-TPAKVVINERTGTIVIGADVRISRVAVSYGTLTVQVTESPQVIQP-AP 305

Query: 301 FGGGQTVVTPNTQISVQQQGGVLQRVNASPNLNNVVRALNSLGATPIDLMSILQAMESAG 360
F GQT V P T I Q+G + V P+L +V LNS+G +++ILQ ++SAG
Sbjct: 306 FSRGQTAVQPQTDIMAMQEGSKVAIVE-GPDLRTLVAGLNSIGLKADGIIAILQGIKSAG 364

Query: 361 CLRAKL 366
L+A+L
Sbjct: 365 ALQAEL 370


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2318  FLGLRINGFLGH  283    4e-99    Flagellar L-ring protein signature.
		>FLGLRINGFLGH#Flagellar L-ring protein signature.

Length = 232

Score = 283 bits (724), Expect = 4e-99
Identities = 176/222 (79%), Positives = 193/222 (86%), Gaps = 2/222 (0%)

Query: 23 PLMTMLL--LNGCAYIPHKPLVDGTTSAQPAPASAPLPNGSIFQTVQPMNYGYQPLFEDR 80
+ ++L+ L GCA+IP PLV G TSAQP P P+ NGSIFQ+ QP+NYGYQPLFEDR
Sbjct: 10 AISSLLVLSLTGCAWIPSTPLVQGATSAQPVPGPTPVANGSIFQSAQPINYGYQPLFEDR 69

Query: 81 RPRNIGDTLTITLQENVSASKSSSANASRNGTSSFGVTTAPRYLDGLLGNGRADMEITGD 140
RPRNIGDTLTI LQENVSASKSSSANASR+G ++FG T PRYL GL GN RAD+E +G
Sbjct: 70 RPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYLQGLFGNARADVEASGG 129

Query: 141 NTFGGKGGANANNTFSGTITVTVDQVLANGNLHVVGEKQIAINQGTEFIRFSGVVNPRTI 200
NTF GKGGANA+NTFSGT+TVTVDQVL NGNLHVVGEKQIAINQGTEFIRFSGVVNPRTI
Sbjct: 130 NTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRFSGVVNPRTI 189

Query: 201 SGSNSVTSTQVADARIEYVGNGYINEAQTMGWLQRFFLNVSP 242
SGSN+V STQVADARIEYVGNGYINEAQ MGWLQRFFLN+SP
Sbjct: 190 SGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSP 231


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2319  FLGHOOKAP1    42     2e-06    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 41.9 bits (98), Expect = 2e-06
Identities = 17/80 (21%), Positives = 35/80 (43%), Gaps = 14/80 (17%)

Query: 4 SLWIAKTGLDAQQTNMDVIANNLANVSTNGFKRQRAVFEDLLYQTLRQPGAQSSEQTTLP 63
+ A +GL+A Q ++ +NN+++ + G+ RQ + + +TL
Sbjct: 3 LINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTI--------------MAQANSTLG 48

Query: 64 SGLQIGTGVRPVATERLHSQ 83
+G +G GV +R +
Sbjct: 49 AGGWVGNGVYVSGVQREYDA 68



Score = 41.1 bits (96), Expect = 3e-06
Identities = 11/41 (26%), Positives = 22/41 (53%)

Query: 220 ETSNVNVAEELVNMIQTQRAYEINSKAVSTSDQMLQKLAQL 260
S VN+ EE N+ + Q+ Y N++ + T++ + L +
Sbjct: 505 SISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2321  FLGHOOKAP1    45     3e-07    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 45.3 bits (107), Expect = 3e-07
Identities = 22/87 (25%), Positives = 42/87 (48%), Gaps = 8/87 (9%)

Query: 6 AVSGMNAASSNLDVIGNNIANSATSGFKAGSVSFAD----MFAGSQTGMGVKVAGITQDF 61
A+SG+NAA + L+ NNI++ +G+ + A + AG G GV V+G+ +++
Sbjct: 7 AMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVSGVQREY 66

Query: 62 NDGTATTTNRRLDLAISQNGFFRMQDS 88
+ +L A +Q+ +
Sbjct: 67 DA----FITNQLRAAQTQSSGLTARYE 89



Score = 40.7 bits (95), Expect = 9e-06
Identities = 15/49 (30%), Positives = 28/49 (57%)

Query: 380 TLTSGALESSNVDLSKELVNMIVAQRNYQSNAQTIKTQDQILQTLVSLR 428
L++ S V+L +E N+ Q+ Y +NAQ ++T + I L+++R
Sbjct: 498 QLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINIR 546


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2322  SYCECHAPRONE  29     0.008    Gram-negative bacterial type III secretion SycE cha...
		>SYCECHAPRONE#Gram-negative bacterial type III secretion SycE chaperone signature.

Length = 130

Score = 28.9 bits (64), Expect = 0.008
Identities = 15/34 (44%), Positives = 21/34 (61%), Gaps = 2/34 (5%)

Query: 43 LKNQDPTNPMENNELTTQLAQINTVSGIEKLNTT 76
L N+ P N ++NN L TQL + V G E+L T+
Sbjct: 89 LWNRQPLNSLDNNSLYTQLEML--VQGAERLQTS 120


86  YpsIP31758_2621  YpsIP31758_2625  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
YpsIP31758_2621  -1      16       1.430251   NAD dependent epimerase/dehydratase family
YpsIP31758_2622  -2      17       0.707288   NAD-dependent epimerase/dehydratase family
YpsIP31758_2623  -2      17       0.552018   lipoprotein
YpsIP31758_2624  -2      17       0.614409   chorismate mutase
YpsIP31758_2625  -2      17       1.061511   arginine transporter ATP-binding subunit
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2621  NUCEPIMERASE  36     2e-04    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 36.3 bits (84), Expect = 2e-04
Identities = 13/26 (50%), Positives = 18/26 (69%)

Query: 5 RILVLGASGYIGQHLVPLLSQQGHQV 30
+ LV GA+G+IG H+ L + GHQV
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQV 27


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2622  NUCEPIMERASE  76     9e-18    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 76.4 bits (188), Expect = 9e-18
Identities = 71/366 (19%), Positives = 126/366 (34%), Gaps = 73/366 (19%)

Query: 1 MKVLVTGATSGLGRNAVEYLRRQEISVIA---------TGRNQAMGALLTKLGAKFIHAD 51
MK LVTGA +G + + L V+ QA LL + G +F D
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKID 60

Query: 52 LTDLVSSQAKAMLADVDTLWHCS-------SFTSPWGTEQAFALANVRATRRLGEWAAAY 104
L D + ++ S +P A+A +N+ + E
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPH----AYADSNLTGFLNILEGCRHN 116

Query: 105 GVENFIHISSPAIYFDYHHHRNIQEDFRPVRFANEFARSKAAGEEVIKLLALSNPQTH-- 162
+++ ++ SS ++Y + D + +A +K A E L+A + +
Sbjct: 117 KIQHLLYASSSSVYGL-NRKMPFSTDDSVDHPVSLYAATKKANE----LMAHTYSHLYGL 171

Query: 163 -FTILRPQGLFGPHDK--VMLPRLLHMIKHYGTLLLPRGGDALVDMTYLENAVHAM---- 215
T LR ++GP + + L + + ++ + G D TY+++ A+
Sbjct: 172 PATGLRFFTVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIRLQ 231

Query: 216 ---------WLATQSQKTLS---GRAYNITNQQPRPLRTIVQQLLDALDMKCRIRSVPYP 263
W S R YNI N P L +Q L DAL ++ + +P
Sbjct: 232 DVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAKKNMLPLQ 291

Query: 264 MMDIMARAMEKMSNKAEKEPVLTHYAVAKLNFDLTLDTLRAEQELGYRPIISLDEGILRT 323
D VL A DT + +G+ P ++ +G+
Sbjct: 292 PGD-----------------VLETSA----------DTKALYEVIGFTPETTVKDGVKNF 324

Query: 324 ARWLKE 329
W ++
Sbjct: 325 VNWYRD 330


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2623  PF04183       30     0.007    IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 29.8 bits (67), Expect = 0.007
Identities = 14/65 (21%), Positives = 23/65 (35%), Gaps = 9/65 (13%)

Query: 54 IQQIGGQQGLPDDNLSAQFRPYLSQSLYNDIQA--ARKQASNRTPAQVNKTQMISGDIFT 111
+ Q+ + D + A+ L +L D+Q AR+ S +N D
Sbjct: 78 LMQLKQVLSMSDATV-AEHMQDLYATLLGDLQLLKARRGLSASDLINLN------ADRLQ 130

Query: 112 SLREG 116
L G
Sbjct: 131 CLLSG 135


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2625  PF05272       30     0.010    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.0 bits (67), Expect = 0.010
Identities = 9/18 (50%), Positives = 12/18 (66%)

Query: 31 LVLLGPSGAGKSSLLRVL 48
+VL G G GKS+L+ L
Sbjct: 599 VVLEGTGGIGKSTLINTL 616


87  YpsIP31758_2760  YpsIP31758_2768  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
YpsIP31758_2760  -2      15       -5.146693  TetR family transcriptional regulator
YpsIP31758_2761  -1      15       -4.957344  outer membrane porin protein C
YpsIP31758_2762  0       10       -3.449759  major facilitator transporter
YpsIP31758_2763  1       9        -3.696116  phosphotransfer intermediate protein in
YpsIP31758_2764  0       11       -1.878224  RcsB family transcriptional regulator
YpsIP31758_2765  1       13       -1.519682  RcsB family transcriptional regulator
YpsIP31758_2766  1       19       -0.102969  hypothetical protein
YpsIP31758_2767  1       18       -0.220940  DNA gyrase subunit A
YpsIP31758_2768  0       15       0.345521   3-demethylubiquinone-9 3-methyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2760  HTHTETR       65     4e-15    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 64.6 bits (157), Expect = 4e-15
Identities = 25/104 (24%), Positives = 41/104 (39%), Gaps = 4/104 (3%)

Query: 15 PAQQRILLTAHRLFYQEGIRATGIDKIIKESGVTKVTFYRHFPSKNDLISAFLEYRHQRW 74
+Q IL A RLF Q+G+ +T + +I K +GVT+ Y HF K+DL S E
Sbjct: 11 ETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNI 70

Query: 75 INWFIEELKQQTLHHA----NLALALTKCMASWFEHPSFRGCAF 114
+E + + + + + + F
Sbjct: 71 GELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIF 114


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2761  ECOLIPORIN    502    0.0      E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 502 bits (1294), Expect = 0.0
Identities = 242/388 (62%), Positives = 287/388 (73%), Gaps = 22/388 (5%)

Query: 1 MKLRVLSFIIPALLVAGSASAAEIYNKDGNKLDLYGKIDGLHYFSDNKNLDGDQSYMRFG 60
MK +VL+ +IPALL AG+A AAEIYNKDGNKLDLYGK+DGLHYFSD+ + DGDQ+YMR G
Sbjct: 1 MKRKVLALVIPALLAAGAAHAAEIYNKDGNKLDLYGKVDGLHYFSDDSSKDGDQTYMRVG 60

Query: 61 LKGETQITDQLTGYGQWEYQVNLNKAENEDGNHDSFTRVGFAGLKFADYGSLDYGRNYGV 120
KGETQI DQLTGYGQWEY V N E E N S+TR+ FAGLKF DYGS DYGRNYGV
Sbjct: 61 FKGETQINDQLTGYGQWEYNVQANTTEGEGAN--SWTRLAFAGLKFGDYGSFDYGRNYGV 118

Query: 121 LYDVTSWTDVLPEFGGDTYG-ADNFLSQRGNGMLTYRNTNFFGLVDGLNFALQYQGKNGS 179
LYDV WTD+LPEFGGD+Y ADN+++ R NG+ TYRNT+FFGLVDGLNFALQYQGKN S
Sbjct: 119 LYDVEGWTDMLPEFGGDSYTYADNYMTGRANGVATYRNTDFFGLVDGLNFALQYQGKNES 178

Query: 180 SS---------ETNNGRGVADQNGDGYGMSLSYDLGWGVSASAAMASSLRTTAQNDLQ-- 228
S NNG + NGDG+G+S +YD+G G SA AA +S RT Q +
Sbjct: 179 QSADDVNIGTNNRNNGDDIRYDNGDGFGISTTYDIGMGFSAGAAYTTSDRTNEQVNAGGT 238

Query: 229 YGQGKRANAYTGGLKYDANNVYLAANYTQTYNLTRFGDFSNRSSDAAFGFADKAHNIEVV 288
G +A+A+T GLKYDANN+YLA Y++T N+T +G G A+K N EV
Sbjct: 239 IAGGDKADAWTAGLKYDANNIYLATMYSETRNMTPYG---KTDKGYDGGVANKTQNFEVT 295

Query: 289 AQYQFDFGLRPSVAYLQSKGKDIGI----YGDQDLLKYVDIGATYFFNKNMSTYVDYKIN 344
AQYQFDFGLRP+V++L SKGKD+ D+DL+KY D+GATY+FNKN STYVDYKIN
Sbjct: 296 AQYQFDFGLRPAVSFLMSKGKDLTYNNVNGDDKDLVKYADVGATYYFNKNFSTYVDYKIN 355

Query: 345 LLDKND-FTKNARINTDDIVAVGMVYQF 371
LLD +D F K+A I+TDDIVA+GMVYQF
Sbjct: 356 LLDDDDPFYKDAGISTDDIVALGMVYQF 383


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2762  TCRTETA       34     6e-04    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 34.4 bits (79), Expect = 6e-04
Identities = 56/301 (18%), Positives = 102/301 (33%), Gaps = 15/301 (4%)

Query: 25 FIAGLGMAAWAPLVPFAKARIGLND---ASLGLLLLCIGIGSMLAMPLTGVLTAKWGCRA 81
+ +G+ P++P + ++ A G+LL + P+ G L+ ++G R
Sbjct: 15 ALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRP 74

Query: 82 VILLAGAVLCLDLPLLVLMNTPATMAIALLVFGAAMGIIDVAMNIQAVIVEKASGRAMMS 141
V+L++ A +D ++ + I +V G VA A I RA
Sbjct: 75 VLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIADIT-DGDERARHF 133

Query: 142 GFHG-LFSVGGIVG------AGGVSALLWLGLNPLTAIMATVVLMIILLLAAN---KNLL 191
GF F G + G GG S + + +L + + L
Sbjct: 134 GFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPESHKGERRPLR 193

Query: 192 RGSGEPHDGPLFVFPRGWVMFIGFLCFVMFLAEGSMLDWSAVFLTTLRGMSPSQAGMGYA 251
R + P + V + + F+M L +F + G+ A
Sbjct: 194 REALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGISLA 253

Query: 252 VFAIAMTLGR-LNGDRIVNGLGRYKVLLGGSLCSAIGIIIAISIDSSMAAIIGFMLVGFG 310
F I +L + + + LG + L+ G + G I+ A +L+ G
Sbjct: 254 AFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLASG 313

Query: 311 A 311

Sbjct: 314 G 314


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2764  HTHFIS        53     1e-10    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 53.3 bits (128), Expect = 1e-10
Identities = 24/133 (18%), Positives = 57/133 (42%), Gaps = 24/133 (18%)

Query: 1 MNNLNVIIADDHPIVLFGIRKSLEQIEWVNVVGEFEDSTALINNLSKLDANVLITDLSMP 60
M +++ADD + + ++L + + + ++ L ++ D ++++TD+ MP
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRI--TSNAATLWRWIAAGDGDLVVTDVVMP 58

Query: 61 GDKYGDGITLIKYIKRHYPDLAIIVLTMNNNPAILSSVLDLDIDGIV--LKQGA------ 112
+ L+ IK+ PDL ++V++ N + ++GA
Sbjct: 59 D---ENAFDLLPRIKKARPDLPVLVMSAQN-----------TFMTAIKASEKGAYDYLPK 104

Query: 113 PADLPKALAALQK 125
P DL + + + +
Sbjct: 105 PFDLTELIGIIGR 117


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2765  HTHFIS        82     3e-18    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 82.2 bits (203), Expect = 3e-18
Identities = 29/109 (26%), Positives = 50/109 (45%)

Query: 829 ILVVDDHPINRRLLADQLTTLGYRVITANDGLDALVALNTNTVDMVLTDVNMPNMDGYRL 888
ILV DD R +L L+ GY V ++ + D+V+TDV MP+ + + L
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 889 TERLRQLNHNFPIIGVTANALAEGKQRCIEAGMDNCLSKPVTLDTLRQM 937
R+++ + P++ ++A + E G + L KP L L +
Sbjct: 66 LPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGI 114


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_2768  DHBDHDRGNASE  32     0.002    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 31.6 bits (71), Expect = 0.002
Identities = 21/98 (21%), Positives = 35/98 (35%), Gaps = 26/98 (26%)

Query: 54 GIFEKKVLDVGCGGGI---LAESMAREGAQVTGLDMGYEPLQVARLHALETGVKLEYVQE 110
GI K G GI +A ++A +GA + +D E L+ K+ +
Sbjct: 5 GIEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLE-----------KVVSSLK 53

Query: 111 TVENHAQQHPQHYDVVTCMEMLEHVPDPASVVRACAQL 148
HA+ P V D A++ A++
Sbjct: 54 AEARHAEAFPA------------DVRDSAAIDEITARI 79


88    YpsIP31758_2806    YpsIP31758_2815    N    Y    N    Pathogenicity Island (unbiased-composition)
LocusTag    DNBias    CDNBias    %GCBias    Product
YpsIP31758_2806    1    16    0.646789    D-alanyl-D-alanine endopeptidase
YpsIP31758_2807    -1    13    1.745534    hypothetical protein
YpsIP31758_2808    -1    13    2.202892    tRNA-dihydrouridine synthase C
YpsIP31758_2810    -1    14    2.009713    ATP-dependent RNA helicase RhlE
YpsIP31758_2811    -2    14    1.427732    DNA-binding transcriptional regulator
YpsIP31758_2812    -2    16    1.635662    hypothetical protein
YpsIP31758_2813    -1    17    1.905520    ABC transporter ATP-binding protein
YpsIP31758_2814    0    20    1.657303    ABC transporter permease
YpsIP31758_2815    1    19    0.026177    ABC transporter permease
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
YpsIP31758_2806    BLACTAMASEA    44    3e-07    Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 44.0 bits (104), Expect = 3e-07
Identities = 32/196 (16%), Positives = 67/196 (34%), Gaps = 24/196 (12%)

Query: 5 IRFALLSFLLLSTGISVAPLAIARGSAVEVKGTAPLELASGSAM---VVDLQTNKVIYAN 61
+R+ L + L + +A S ++ E + +DL + + + A
Sbjct: 1 MRYIRLCIISLLATLPLA----VHASPQPLEQIKLSESQLSGRVGMIEMDLASGRTLTAW 56

Query: 62 NADKVVPIASITKLMTAMVVLD----AKLPLDEILSVDIDQTKELKGVFSRVRVNSEISR 117
AD+ P+ S K++ VL L+ + + V S + ++
Sbjct: 57 RADERFPMMSTFKVVLCGAVLARVDAGDEQLERKIHYRQQDLVDYSPV-SEKHLADGMTV 115

Query: 118 KDMLLLTLMSSENRAAASLAHHY--PGGYNAFIKAMNAKAKSL-----GMSSTHYVEPTG 170
++ + S+N AA L P G AF++ + L ++ +
Sbjct: 116 GELCAAAITMSDNSAANLLLATVGGPAGLTAFLRQIGDNVTRLDRWETELNEALPGDAR- 174

Query: 171 LSINNVSTARDLAKLL 186
+ +T +A L
Sbjct: 175 ----DTTTPASMAATL 186


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
YpsIP31758_2811    HTHTETR    71    3e-17    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 71.2 bits (174), Expect = 3e-17
Identities = 28/158 (17%), Positives = 53/158 (33%), Gaps = 15/158 (9%)

Query: 12 PSPATTRGEQARQQLLQAAIELFGELGLKGATTRDIAQRAGQNIAAITYYFNSKEGLYLA 71
++ RQ +L A+ LF + G+ + +IA+ AG AI ++F K L+
Sbjct: 2 ARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSE 61

Query: 72 VAQYIADFIQQAFSPLAQEIDHFLQLPAEHQPPEQQLHYIRQGLLAFSHLMTQPETL-NL 130
+ + I + + P L +R+ L+ E L
Sbjct: 62 IWELSESNIGELELEYQAKF------------PGDPLSVLREILIHVLESTVTEERRRLL 109

Query: 131 SKIMAREQLSPSEAYPLIHTQAIAP--LHQTLNQLLAA 166
+I+ + E + Q + + Q L
Sbjct: 110 MEIIFHKCEFVGEMAVVQQAQRNLCLESYDRIEQTLKH 147


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
YpsIP31758_2812    RTXTOXIND    72    4e-16    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 72.2 bits (177), Expect = 4e-16
Identities = 54/261 (20%), Positives = 94/261 (36%), Gaps = 29/261 (11%)

Query: 82 NALKQAQANVQSAQAQLALLKAGYREEEIAQVRSEVAQRQAAFD--YADNFLKRQQGLWA 139
N Q + N+ +A+ + A E R E R F + + L
Sbjct: 200 NQKYQKELNLDKKRAERLTVLARINRYE-NLSRVE-KSRLDDFSSLLHKQAIAKHAVLEQ 257

Query: 140 SKAVSA--NELENARTARNQAQANLQAAKDKLAQFLSGNRPQ---EIAQAEANLAQTEAE 194
NEL ++ Q ++ + +AK++ + + ++ Q N+ E
Sbjct: 258 ENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLE 317

Query: 195 LAQAQLNLQDTILLAPSAGTVLTRAV--EPGTILSASNTVFTVSLTDPVWVRAYVSERHL 252
LA+ + Q +++ AP + V V E G + +A + V D + V A V + +
Sbjct: 318 LAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDI 377

Query: 253 GQAIPGSEVEVFTDGRPDKPYH---GKIGFVSPTAEFTPKTVETPDLRTDLVYRLRIIIT 309
G G + + P Y GK+ ++ A D R LV+ + I I
Sbjct: 378 GFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDA--------IEDQRLGLVFNVIISIE 429

Query: 310 DADES-------LRQGMPVTV 323
+ S L GM VT
Sbjct: 430 ENCLSTGNKNIPLSSGMAVTA 450


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
YpsIP31758_2813    PF05272    31    0.014    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 31.2 bits (70), Expect = 0.014
Identities = 21/91 (23%), Positives = 27/91 (29%), Gaps = 13/91 (14%)

Query: 296 PRFEDAFIDLLGGGPDSESALAKIMPRVAGNPGETVIEAQALTKKFGDFAATDHVNFQVK 355
PR E + +LG PD + Q + K + K
Sbjct: 548 PRLEKWLVHVLGKTPDD-------------YKPRRLRYLQLVGKYILMGHVARVMEPGCK 594

Query: 356 RGEIFGLLGPNGAGKSTTFKMMCGLLVPSDG 386
L G G GKST + GL SD
Sbjct: 595 FDYSVVLEGTGGIGKSTLINTLVGLDFFSDT 625



Score = 30.0 bits (67), Expect = 0.030
Identities = 10/19 (52%), Positives = 12/19 (63%)

Query: 40 LVGPDGAGKTTLLRMLAGL 58
L G G GK+TL+ L GL
Sbjct: 601 LEGTGGIGKSTLINTLVGL 619


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
YpsIP31758_2815    ABC2TRNSPORT    51    2e-09    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 50.7 bits (121), Expect = 2e-09
Identities = 34/147 (23%), Positives = 60/147 (40%), Gaps = 1/147 (0%)

Query: 197 AREREQGTMEQLLVSPLTTWQIFIGKAVPALIVATFQASIVLLIGIFFYQIPFAGSLALF 256
R Q T E +L + L I +G+ A A + + ++ + SL
Sbjct: 92 GRMEGQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGAGIGVVAAALGYTQWL-SLLYA 150

Query: 257 YGTMLLYGLSLVGFGLLISSLCSTQQQAFIGVFVFMMPAILLSGYVSPVENMPIWLQNIT 316
+ L GL+ G+++++L + + + P + LSG V PV+ +PI Q
Sbjct: 151 LPVIALTGLAFASLGMVVTALAPSYDYFIFYQTLVITPILFLSGAVFPVDQLPIVFQTAA 210

Query: 317 WINPIRHFTDITKQIYLKDASFDIIWH 343
P+ H D+ + I L D+ H
Sbjct: 211 RFLPLSHSIDLIRPIMLGHPVVDVCQH 237


89    YpsIP31758_3064    YpsIP31758_3068    N    Y    N    Pathogenicity Island (unbiased-composition)
LocusTag    DNBias    CDNBias    %GCBias    Product
YpsIP31758_3064    -1    12    -1.034677    potassium efflux protein KefA
YpsIP31758_3065    -1    16    -0.454222    DsrE family protein
YpsIP31758_3066    0    16    -1.426951    DNA-binding transcriptional repressor AcrR
YpsIP31758_3067    0    17    -1.364397    acriflavine resistance protein A
YpsIP31758_3068    0    19    -2.284781    acriflavine resistance protein B
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
YpsIP31758_3064    GPOSANCHOR    40    4e-05    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 40.4 bits (94), Expect = 4e-05
Identities = 28/235 (11%), Positives = 58/235 (24%), Gaps = 21/235 (8%)

Query: 35 SEVQSQLDLLSKQKILSPAEKLAQQDLTQTLE-YLDTIERTKQEANQLKQQLAQAPAKLR 93
S + +L K ++ + LE L+ + + L A L
Sbjct: 95 SNAKEKLRKNDKSLSEKASKIQELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALA 154

Query: 94 QATEGLE-ALKSSSADTMTKESLANYSLRQLESRLNETLDNLQSAQEDLSAYNSQLIALQ 152
LE AL+ + + +++ + + + L
Sbjct: 155 ARKADLEKALEGAMNFS-----------TADSAKIKTLEAEKAALEARQAELEKALEGAM 203

Query: 153 TQPERVQSAMYSASMRLMQIRNQLNGLTPNQESLRPTQQ--QELLAEQVMLNGQLDLERK 210
+ + + + + L E + L+ +
Sbjct: 204 NFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQA 263

Query: 211 NLEANTTLQDLLQKQRDYTTAHINQLERYVQLLQEVVSGKRLILSEKTVKEAQAQ 265
LE + +A I LE L+ K + + V A Q
Sbjct: 264 ELEK---ALEGAMNFSTADSAKIKTLEAEKAALEAE---KADLEHQSQVLNANRQ 312



Score = 32.0 bits (72), Expect = 0.016
Identities = 36/201 (17%), Positives = 72/201 (35%), Gaps = 33/201 (16%)

Query: 37 VQSQLDLLSKQKILSPAEKLAQQDLTQTLEYLDTIERTKQEANQLKQQ------------ 84
+ L ++Q L A + A T + T+E K K
Sbjct: 252 EAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANR 311

Query: 85 --LAQAPAKLRQATEGLEA-----------LKSSSADTMTKESLANYSLRQLESRLNETL 131
L + R+A + LEA ++S + + +QLE+ +
Sbjct: 312 QSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQLEAEHQKLE 371

Query: 132 DNLQSAQEDLSAYNSQLIALQTQPERVQSAMYSASMRLMQIRNQLNGLTPNQESLRPTQQ 191
+ + ++ + L A + ++V+ A+ A+ +L + L +ES + T++
Sbjct: 372 EQNKISEASRQSLRRDLDASREAKKQVEKALEEANSKLAALEKLNKEL---EESKKLTEK 428

Query: 192 QELLAEQVMLNGQLDLERKNL 212
E+ L +L+ E K L
Sbjct: 429 -----EKAELQAKLEAEAKAL 444


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
YpsIP31758_3065    ADHESNFAMILY    26    0.034    Adhesin family signature.
		>ADHESNFAMILY#Adhesin family signature.

Length = 309

Score = 26.0 bits (57), Expect = 0.034
Identities = 9/71 (12%), Positives = 27/71 (38%)

Query: 47 IAGLNGQQPREGYNLQQMLEILTAQNVPIKLCKTCADARGIAGLTLVDGVEIGTLVELAQ 106
I +N ++ ++ ++E L VP ++ D R + ++ + I +
Sbjct: 222 IWEINTEEEGTPEQIKTLVEKLRQTKVPSLFVESSVDDRPMKTVSQDTNIPIYAQIFTDS 281

Query: 107 WTLAAEKVLTF 117
++ ++
Sbjct: 282 IAEQGKEGDSY 292


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
YpsIP31758_3066    HTHTETR    165    7e-54    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 165 bits (420), Expect = 7e-54
Identities = 135/210 (64%), Positives = 164/210 (78%)

Query: 1 MARKTKQKAEETRQQILDAAVREFSAHGVSRTSLTDIAIAAGVTRGAIYWHFKNKVDLFN 60
MARKTKQ+A+ETRQ ILD A+R FS GVS TSL +IA AAGVTRGAIYWHFK+K DLF+
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60

Query: 61 EVWELSESKIDQLEIEYQAKYPDNPLRILRELLIYILVSTREDRRRRALMEIVFHKCEFV 120
E+WELSES I +LE+EYQAK+P +PL +LRE+LI++L ST + RRR LMEI+FHKCEFV
Sbjct: 61 EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV 120

Query: 121 GEMTSVHDARKVLDLASYERIESVLQGCIDANQLPVNLNTHRAAIIMRAYITGLMENWLF 180
GEM V A++ L L SY+RIE L+ CI+A LP +L T RAAIIMR YI+GLMENWLF
Sbjct: 121 GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF 180

Query: 181 MPESFDIKQEAPVLIDAYLEMLGQSFSLRN 210
P+SFD+K+EA + LEM +LRN
Sbjct: 181 APQSFDLKKEARDYVAILLEMYLLCPTLRN 210


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
YpsIP31758_3067    RTXTOXIND    40    1e-05    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 39.8 bits (93), Expect = 1e-05
Identities = 22/166 (13%), Positives = 52/166 (31%), Gaps = 45/166 (27%)

Query: 96 QIDPATYQAAYDSAKGDLAKAQASAQIAHLTVNRYKPLLGTNYISKQ---EYDQALSDAQ 152
+++ +A + + + + +++ ++ + LL I+K E + +A
Sbjct: 206 ELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAV 265

Query: 153 QADATVLAAKAALES----------------------------------------ARINL 172
+ +ES
Sbjct: 266 NELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQ 325

Query: 173 AYTQVRSPISGRTGKSAV-TEGALVTSGQASAMTTVQQLDPMYVDV 217
+ +R+P+S + + V TEG +VT+ + M V + D + V
Sbjct: 326 QASVIRAPVSVKVQQLKVHTEGGVVTTAET-LMVIVPEDDTLEVTA 370


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
YpsIP31758_3068    ACRIFLAVINRP    1344    0.0    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 1344 bits (3479), Expect = 0.0
Identities = 807/1032 (78%), Positives = 919/1032 (89%)

Query: 1 MAKFFIDRPIFAWVIAIIIMLAGALAIMKLPVAQYPTIAPPAITIAANYPGADATTVQNT 60
MA FFI RPIFAWV+AII+M+AGALAI++LPVAQYPTIAPPA++++ANYPGADA TVQ+T
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VTQVIEQNMNGIDNLLYMSSSSDSSGNVQLTLTFNSGTDPDIAQVQVQNKLQLAMPLLPQ 120
VTQVIEQNMNGIDNL+YMSS+SDS+G+V +TLTF SGTDPDIAQVQVQNKLQLA PLLPQ
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 121 EVQQQGVSVEKSSSSFLMVAGFISEDGTMQQEDIADYVGSNIKDPISRTPGVGDVQLFGS 180
EVQQQG+SVEKSSSS+LMVAGF+S++ Q+DI+DYV SN+KD +SR GVGDVQLFG+
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 181 QYAMRIWMDPHKLNNYKLTPVDVINAIKIQNNQVAAGQLGGTPPVPGQELNSSIIAQTRL 240
QYAMRIW+D LN YKLTPVDVIN +K+QN+Q+AAGQLGGTP +PGQ+LN+SIIAQTR
Sbjct: 181 QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRF 240

Query: 241 TNAEEFSQILLKVNTDGSQVRLKDVAIVKLGAESYNIIARYNGKPAAGIGIKLATGANAL 300
N EEF ++ L+VN+DGS VRLKDVA V+LG E+YN+IAR NGKPAAG+GIKLATGANAL
Sbjct: 241 KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL 300

Query: 301 NTSAAVKAELAKLQPFFPSGLTVVYPYDTTPFVKISINEVVKTLIEAIILVFLVMYLFLQ 360
+T+ A+KA+LA+LQPFFP G+ V+YPYDTTPFV++SI+EVVKTL EAI+LVFLVMYLFLQ
Sbjct: 301 DTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQ 360

Query: 361 NFRATLIPTIAVPVVLLGTFAILSAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420
N RATLIPTIAVPVVLLGTFAIL+AFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM
Sbjct: 361 NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420

Query: 421 QEEGLPPKEATKKSMEQIQGALVGIALVLSAVFVPMAFFGGATGAIYRQFSITIVSAMVL 480
E+ LPPKEAT+KSM QIQGALVGIA+VLSAVF+PMAFFGG+TGAIYRQFSITIVSAM L
Sbjct: 421 MEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL 480

Query: 481 SVLVALILTPALCATMLKPIKKGDHGPKTGFFGWFNNMFEKSTHHYTDSVANILRSTGRY 540
SVLVALILTPALCAT+LKP+ H K GFFGWFN F+ S +HYT+SV IL STGRY
Sbjct: 481 SVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRY 540

Query: 541 LVIYLAIVIGMAVLFMRLPSSFLPEEDQGVFLTMVQLPAGATQERTQKVLNQVTDYYLDK 600
L+IY IV GM VLF+RLPSSFLPEEDQGVFLTM+QLPAGATQERTQKVL+QVTDYYL
Sbjct: 541 LLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKN 600

Query: 601 EKNVVNSVFTVNGFGFSGQGQNTGLAFVSLKNWDERKGEQNKVPAIVSRASAAFSKIKDG 660
EK V SVFTVNGF FSGQ QN G+AFVSLK W+ER G++N A++ RA KI+DG
Sbjct: 601 EKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDG 660

Query: 661 MVFAFNLPAIVELGTATGFDFQLIDQGNLGHQQLTDARNQLLGMAAQHPDMLVGVRPNGL 720
V FN+PAIVELGTATGFDF+LIDQ LGH LT ARNQLLGMAAQHP LV VRPNGL
Sbjct: 661 FVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNGL 720

Query: 721 EDTPQFKVEVDQEKAQALGVAISDINTTLGSAMGGSYVNDFIDRGRVKKVYVQADAPFRM 780
EDT QFK+EVDQEKAQALGV++SDIN T+ +A+GG+YVNDFIDRGRVKK+YVQADA FRM
Sbjct: 721 EDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFRM 780

Query: 781 LPDDIDKWYVRNNMGQMVSFATFSTAKWEYGSPRLERYNGLPSMEILGQAAPGKSTGEAM 840
LP+D+DK YVR+ G+MV F+ F+T+ W YGSPRLERYNGLPSMEI G+AAPG S+G+AM
Sbjct: 781 LPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDAM 840

Query: 841 DLMQELAAKLPSGVGYDWTGMSYQERLSGNQAPALYAISLIVVFLCLAALYESWSIPFSV 900
LM+ LA+KLP+G+GYDWTGMSYQERLSGNQAPAL AIS +VVFLCLAALYESWSIP SV
Sbjct: 841 ALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVSV 900

Query: 901 MLVVPLGVVGALLAATLRGLENDVYFQVGLLTTIGLSAKNAILIVEFAKDLMDKEGKGLV 960
MLVVPLG+VG LLAATL +NDVYF VGLLTTIGLSAKNAILIVEFAKDLM+KEGKG+V
Sbjct: 901 MLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGVV 960

Query: 961 ESTLESVRMRLRPILMTSLAFILGVMPLVISSGAGSGAQNAVGTGVMGGMITATVLAIFF 1020
E+TL +VRMRLRPILMTSLAFILGV+PL IS+GAGSGAQNAVG GVMGGM++AT+LAIFF
Sbjct: 961 EATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIFF 1020

Query: 1021 VPLFFVVVRRRF 1032
VP+FFVV+RR F
Sbjct: 1021 VPVFFVVIRRCF 1032


90    YpsIP31758_3089    YpsIP31758_3096    N    Y    N    Pathogenicity Island (unbiased-composition)
LocusTag    DNBias    CDNBias    %GCBias    Product
YpsIP31758_3089    3    24    -1.136835    transcriptional regulator HU subunit beta
YpsIP31758_3090    3    23    -1.227218    DNA-binding ATP-dependent protease La
YpsIP31758_3091    2    18    -1.143042    ATP-dependent protease ATP-binding subunit ClpX
YpsIP31758_3092    0    20    -1.986861    ATP-dependent Clp protease proteolytic subunit
YpsIP31758_3093    0    20    -2.298792    trigger factor
YpsIP31758_3094    -2    18    -1.502065    BolA family transcriptional regulator
YpsIP31758_3095    -1    19    -1.267985    hypothetical protein
YpsIP31758_3096    -1    20    -1.350711    muropeptide transporter
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
YpsIP31758_3089    DNABINDINGHU    121    6e-40    Prokaryotic integration host factor signature.
		>DNABINDINGHU#Prokaryotic integration host factor signature.

Length = 91

Score = 121 bits (305), Expect = 6e-40
Identities = 48/88 (54%), Positives = 65/88 (73%)

Query: 2 NKSQLIDKIAAGADISKAAAGRALDAIITSVTESLKEGDDVALVGFGTFAVRERSARTGR 61
NK LI K+A +++K + A+DA+ ++V+ L +G+ V L+GFG F VRER+AR GR
Sbjct: 3 NKQDLIAKVAEATELTKKDSAAAVDAVFSAVSSYLAKGEKVQLIGFGNFEVRERAARKGR 62

Query: 62 NPQTGKEISIPAAKVPGFRAGKGLKDAV 89
NPQTG+EI I A+KVP F+AGK LKDAV
Sbjct: 63 NPQTGEEIKIKASKVPAFKAGKALKDAV 90


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
YpsIP31758_3090    PF05272    32    0.010    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 32.4 bits (73), Expect = 0.010
Identities = 15/76 (19%), Positives = 32/76 (42%), Gaps = 6/76 (7%)

Query: 296 DWMLQVPWNSRSKVKKDLVKAQEVLDTDHYGLERVKDRILEYLAVQSRVSKIKGP----- 350
DW+ W+ +++K LV D+ +++ + V+++ P
Sbjct: 537 DWVKAQQWDEVPRLEKWLVHVLGKTPDDYKPRRLRYLQLVGKYILMGHVARVMEPGCKFD 596

Query: 351 -ILCLVGPPGVGKTSL 365
+ L G G+GK++L
Sbjct: 597 YSVVLEGTGGIGKSTL 612


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
YpsIP31758_3091    HTHFIS    29    0.032    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 29.4 bits (66), Expect = 0.032
Identities = 15/72 (20%), Positives = 27/72 (37%), Gaps = 13/72 (18%)

Query: 61 RSSLPTPHEIRHHLDDYVIGQEPAKKVLAVAVYNHYKRLRNGDTSNGIELGKSNILLIGP 120
P+ E ++G+ A + +Y RL D +++ G
Sbjct: 122 PKRRPSKLEDDSQDGMPLVGRSAAMQ----EIYRVLARLMQTD---------LTLMITGE 168

Query: 121 TGSGKTLLAETL 132
+G+GK L+A L
Sbjct: 169 SGTGKELVARAL 180


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
YpsIP31758_3095    PF06291    28    0.014    Lambda prophage Bor protein
		>PF06291#Lambda prophage Bor protein

Length = 102

Score = 27.7 bits (61), Expect = 0.014
Identities = 12/38 (31%), Positives = 19/38 (50%)

Query: 2 LKKILFPLLAIFILAGCATTSNTLNVTPKVVLPTQDPT 39
+KK+LF ++ GCA + T+ P V P + T
Sbjct: 6 MKKMLFSAALAMLITGCAQQTFTVGNKPTAVTPKETIT 43


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
YpsIP31758_3096    TCRTETB    47    1e-07    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 46.8 bits (111), Expect = 1e-07
Identities = 44/199 (22%), Positives = 77/199 (38%), Gaps = 15/199 (7%)

Query: 221 RNNAWLI-LLLIVFYKMGDAFAASLSTTFLIRGVGFDAGEVGLVNKTLGLIATIIGALYG 279
R+N LI L ++ F+ + + ++S + VN L +I A+YG
Sbjct: 11 RHNQILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYG 70

Query: 280 GLLMQRLSLFRALMIFGILQAVSNMGYWLLAITDKNIFSMGSAIFLENLCGGMGTAAFVA 339
L +L + R L+ I+ + ++ + FS+ + + G G AAF A
Sbjct: 71 KL-SDQLGIKRLLLFGIIINCFGS----VIGFVGHSFFSL---LIMARFIQGAGAAAFPA 122

Query: 340 LLM----TLCNKSFSATQFALLSALSAVGRVYVGP-IAGWFVEAHGWPLFYLFSIAAAIP 394
L+M K F L+ ++ A+G VGP I G W L + I
Sbjct: 123 LVMVVVARYIPKENRGKAFGLIGSIVAMG-EGVGPAIGGMIAHYIHWSYLLLIPMITIIT 181

Query: 395 GLLLLYVCRQTLDHTQKTD 413
L+ + ++ + D
Sbjct: 182 VPFLMKLLKKEVRIKGHFD 200


91    YpsIP31758_3136    YpsIP31758_3143    N    Y    N    Pathogenicity Island (unbiased-composition)
LocusTag    DNBias    CDNBias    %GCBias    Product
YpsIP31758_3136    5    13    -0.815974    PhoB family transcriptional regulator
YpsIP31758_3137    6    13    -1.082680    hypothetical protein
YpsIP31758_3138    6    14    -0.782171    exonuclease subunit SbcD
YpsIP31758_3139    7    15    -1.365058    nuclease SbcCD, C subunit
YpsIP31758_3140    -1    21    -2.847300    fructokinase
YpsIP31758_3141    1    23    -4.273920    recombination associated protein
YpsIP31758_3142    0    26    -5.095414    hypothetical protein
YpsIP31758_3143    0    23    -5.232762    shikimate kinase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
YpsIP31758_3136    HTHFIS    90    4e-23    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 90.3 bits (224), Expect = 4e-23
Identities = 31/119 (26%), Positives = 58/119 (48%), Gaps = 2/119 (1%)

Query: 4 RILVVEDEAPIREMVCFVLEQNGYQPLEAEDYDSAVARLSEPFPDLVLLDWMLPGGSGIQ 63
ILV +D+A IR ++ L + GY + + ++ DLV+ D ++P +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 64 FIKHMKREALTRDIPVMMLTARGEEEDRVRGLEVGADDYITKPFSPKELVARIKAVMRR 122
+ +K+ D+PV++++A+ ++ E GA DY+ KPF EL+ I +
Sbjct: 65 LLPRIKKARP--DLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
YpsIP31758_3139    RTXTOXIND    42    2e-05    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 41.7 bits (98), Expect = 2e-05
Identities = 32/222 (14%), Positives = 74/222 (33%), Gaps = 8/222 (3%)

Query: 321 QYLAQLTPLT--QAVEQATAARQQQQLNQHEQETLIEQRIVPLDNLITQQQQTLSQLAGQ 378
L +LT L + ++ Q +L Q + L + + + Q +
Sbjct: 122 DVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSE 181

Query: 379 IQQLRAKEQQNSQQLALNEQKLLQTHQRLQQLADYANLHAHHQHWEKHLPLWHEQFRQLQ 438
+ LR Q QK + ++ A+ + A +E + +
Sbjct: 182 EEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFS 241

Query: 439 LQQQQSAQSEQQLHQQTTLLATLQQQATTLSAQEKQQQVALAEARAQASYLQQKL--LVL 496
+ A ++ + +Q + +Q +Q + + A+ + + Q +L
Sbjct: 242 SLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEIL 301

Query: 497 EQ----QQPSAQLRQQLNEFNEQRQICQQLAALSPLAQQIQA 534
++ L +L + E++Q A +S QQ++
Sbjct: 302 DKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKV 343



Score = 39.4 bits (92), Expect = 8e-05
Identities = 30/236 (12%), Positives = 65/236 (27%), Gaps = 42/236 (17%)

Query: 458 LATLQQQATTLSAQEKQQQVALAEARAQASYLQQKLLVLEQQ----QPSAQLRQQLNEFN 513
L L +A TL Q Q L + R Q +L L + +P Q +
Sbjct: 127 LTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLR 186

Query: 514 EQRQICQQLAALSPLAQQIQALYDKQQQQFTAQQQQLKQLEQQ---LTEKRQLYQQ-QKQ 569
I +Q + Q + DK++ + ++ + E + + +
Sbjct: 187 LTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHK 246

Query: 570 HLVDLEALLEREKQLVTLEAERAKLQPGDACPLCGAVEHPAIAAYQAVKPSETAVRVAKL 629
+ A+LE+E + V E ++ ++
Sbjct: 247 QAIAKHAVLEQENKYVEAVNELR----------------------------VYKSQLEQI 278

Query: 630 RLQVEQLDTEGTELRTQVASMQQHQQRIEQELQDHRQQLAAYQQRWQTLAQPLSLA 685
++ E + Q + I +L+ + + +
Sbjct: 279 ESEILSAKEEYQLVT------QLFKNEILDKLRQTTDNIGLLTLELAKNEERQQAS 328



Score = 37.9 bits (88), Expect = 3e-04
Identities = 28/206 (13%), Positives = 71/206 (34%), Gaps = 19/206 (9%)

Query: 658 EQELQDHRQQLAAYQQRWQTLAQPLSL----AFTLNEPDALALWLEQHEQQEQACQLKLV 713
+ Q Q Q R+Q L++ + L L + E+ + +
Sbjct: 136 TLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSL----- 190

Query: 714 EYERLTQQYQQAKDILTQLEQRQQEHQQQLALITERQKNAQQTYQQLQSQYQHQQEALIA 773
+ +Q+ ++ Q E + + + + R + + +S+ +L+
Sbjct: 191 ----IKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFS-SLLH 245

Query: 774 QQQVLNHTLTELSLSVPDADQQQNWLAQREEECQRWQQHQQEQQRLTIEQKTLETRIENE 833
+Q + H + E +A + + E+ + +E+ +L + E +
Sbjct: 246 KQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDK-- 303

Query: 834 RRHLQECIDQLSALSQQRQQAETLLQ 859
L++ D + L+ + + E Q
Sbjct: 304 ---LRQTTDNIGLLTLELAKNEERQQ 326



Score = 32.9 bits (75), Expect = 0.009
Identities = 26/180 (14%), Positives = 71/180 (39%), Gaps = 13/180 (7%)

Query: 844 LSALSQQRQQAETLLQQQIQQRRALFGEDIVAE-------VRQRLRLQQQQAELAQQNAE 896
+ L Q R Q + + + ++ + +R +++Q + Q +
Sbjct: 145 QARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQ 204

Query: 897 K--ALQQVQSQLNRLSGELTGLEQQCQQYQQRATTTQAEL-QQALSTSEFADETALTAAL 953
K L + +++ + + E + + R + L +QA++ ++
Sbjct: 205 KELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEA 264

Query: 954 LSE--EERQHLQQLQQQLNERRQQAQIRLQQAR-EILDQHLQLCPQGVDKSSELTLLQQQ 1010
++E + L+Q++ ++ +++ Q+ Q + EILD+ Q + EL +++
Sbjct: 265 VNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEER 324


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
YpsIP31758_3140    BCTERIALGSPF    28    0.045    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 28.3 bits (63), Expect = 0.045
Identities = 11/37 (29%), Positives = 21/37 (56%)

Query: 218 DVIAEQAMNNYERRFAKSLAHVINLFDPDVVVLGGGM 254
D + E+A +N +R F+ + + LF+P +VV +
Sbjct: 351 DSMLERAADNQDREFSSQMTLALGLFEPLLVVSMAAV 387


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
YpsIP31758_3143    PF05272    28    0.014    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 28.5 bits (63), Expect = 0.014
Identities = 16/68 (23%), Positives = 27/68 (39%), Gaps = 12/68 (17%)

Query: 7 MVGARGAGKTTIGKALAQALGYRFVDTDL-------FMQQTSQMTVAEVVESEGWDGFRL 59
+ G G GK+T+ L F DT +Q + + E+ E FR
Sbjct: 601 LEGTGGIGKSTLINTLVGL--DFFSDTHFDIGTGKDSYEQIAGIVAYELSE---MTAFRR 655

Query: 60 RESMALQA 67
++ A++A
Sbjct: 656 ADAEAVKA 663


92    YpsIP31758_3197    YpsIP31758_3206    N    Y    N    Pathogenicity Island (unbiased-composition)
LocusTag    DNBias    CDNBias    %GCBias    Product
YpsIP31758_3197    -2    19    2.504736    major facilitator transporter
YpsIP31758_3198    -3    16    2.525656    azaleucine resistance protein AzlC
YpsIP31758_3199    -3    17    2.399233    hypothetical protein
YpsIP31758_3200    -2    16    3.116025    transcriptional repressor MprA
YpsIP31758_3201    -2    14    3.611460    multidrug resistance protein A
YpsIP31758_3202    -2    12    2.961009    multidrug resistance protein B
YpsIP31758_3203    -1    12    2.231414    methyltransferase
YpsIP31758_3204    -1    10    0.957897    thioredoxin 2
YpsIP31758_3205    -2    11    1.097485    DTW domain-containing protein
YpsIP31758_3206    -2    11    0.648686    acyl-CoA synthetase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
YpsIP31758_3197    TCRTETB    46    1e-07    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 46.0 bits (109), Expect = 1e-07
Identities = 34/163 (20%), Positives = 65/163 (39%), Gaps = 5/163 (3%)

Query: 35 LETIATNFSLSVNQAGFIVTAAQLGYAVGLMFLVPLGDMFE-RRGLIVGMTLLAAGGMLI 93
L IA +F+ ++ TA L +++G L D +R L+ G+ + G +I
Sbjct: 37 LPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGS-VI 95

Query: 94 TAMSQNLTMMIIGTALTGLFSVVA--QLLVPLAATLAAPEKRGKVVGIIMSGLLLGILLA 151
+ + ++I A L++ + A E RGK G+I S + +G +
Sbjct: 96 GFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVG 155

Query: 152 RTVAGALATLGGWRTIYWVASALMFIMALVLWRCLPRYKQHTG 194
+ G +A W + + + I L + L + + G
Sbjct: 156 PAIGGMIAHYIHWSYLLLIPMITI-ITVPFLMKLLKKEVRIKG 197


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
YpsIP31758_3200    PF05272    28    0.017    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 28.5 bits (63), Expect = 0.017
Identities = 23/105 (21%), Positives = 37/105 (35%), Gaps = 12/105 (11%)

Query: 12 LNSRAKRQKDFPYQEILLTRLSMHMHSKLLENRNKMLKAQGINETLFMALITLDAQESRS 71
+ + P QE+ L + L R A+G + + T
Sbjct: 745 PSPEDEEIYFRPEQELRLVETGVQGRLWALLTREGAPAAEGAAQKGYSVNTTF------- 797

Query: 72 IQPSELSAALG-----SSRTNATRIADELEKKGWIERRESHNDRR 111
+ ++L ALG SS ++ D L + GW RE+ RR
Sbjct: 798 VTIADLVQALGADPGKSSPMLEGQVRDWLNENGWEYLRETSGQRR 842


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
YpsIP31758_3201    RTXTOXIND    68    1e-14    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 68.3 bits (167), Expect = 1e-14
Identities = 63/410 (15%), Positives = 118/410 (28%), Gaps = 99/410 (24%)

Query: 25 LLLTAIFIMIGVAYLIYWFLVLRHHQ---ETDNAYISGNQVQIMSQVPGSVVSVHFENTD 81
L A FIM + ++ + SG +I V + + +
Sbjct: 57 PRLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGE 116

Query: 82 FVKSGDVLVTLDPTD-------AEQAFEQAK----------------------------- 105
V+ GDVL+ L + + QA+
Sbjct: 117 SVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYF 176

Query: 106 ----------------TALANSVRQTHQLIINSKQYQ-------ANIALKKTELSQAQND 142
+ Q +Q +N + + A I + ++
Sbjct: 177 QNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSR 236

Query: 143 LKRRVVLGAAAVIGREELQHARDAVEAAQASLDMAVQQYNANQALVLNTPLE-------- 194
L L I + + + A L + Q ++ +L+ E
Sbjct: 237 LDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLF 296

Query: 195 -KQPAIEQAAAKMRDAWLT---------LQRTKVVSPISGYVSRRSVQ-VGAEISSGTPL 243
+ + LT Q + + +P+S V + V G +++ L
Sbjct: 297 KNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETL 356

Query: 244 MAVVPADQ-LWIDANFKETQLANMRIGQPATI-VTDF----YGDDVVYQGKVVGLDMGTG 297
M +VP D L + A + + + +GQ A I V F YG GKV +
Sbjct: 357 MVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGY---LVGKVKNI----- 408

Query: 298 SAFSLLPAQNATGNWIKVVQRLPVRIALDEKQLKEHPLRIGLSSLVKVDT 347
+ ++ G V+ + K PL G++ ++ T
Sbjct: 409 NLDAIE--DQRLGLVFNVIISIEENCLSTG--NKNIPLSSGMAVTAEIKT 454


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
YpsIP31758_3202    TCRTETB    140    1e-38    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 140 bits (355), Expect = 1e-38
Identities = 94/404 (23%), Positives = 167/404 (41%), Gaps = 17/404 (4%)

Query: 18 LSLATFMQVLDSTIANVAIPTIAGDLGSSNSQGTWVITSFGVANAISIPVTGWLAKRVGE 77
L + +F VL+ + NV++P IA D + WV T+F + +I V G L+ ++G
Sbjct: 19 LCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGI 78

Query: 78 VRLFLWSTGLFVLASWLCGMSNS-LGMLIFFRVIQGLVAGPLIPLSQSLLLNNYPPAKRS 136
RL L+ + S + + +S +LI R IQG A L ++ P R
Sbjct: 79 KRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRG 138

Query: 137 MALALWSMTIVVAPIFGPILGGYISDNYHWGWIFFINIPIGLVVVLLAGSTLKGRETKTE 196
A L + + GP +GG I+ HW + + IP+ ++ + L +E + +
Sbjct: 139 KAFGLIGSIVAMGEGVGPAIGGMIAHYIHWS--YLLLIPMITIITVPFLMKLLKKEVRIK 196

Query: 197 IRPIDTIGLVLLVVGIGALQIMLDQGKELDWFNSTEIIVLTVVAVVAITFLIVWELTDDH 256
D G++L+ VGI + ML F ++ I +V+V++ +
Sbjct: 197 -GHFDIKGIILMSVGI--VFFML--------FTTSYSISFLIVSVLSFLIFVKHIRKVTD 245

Query: 257 PVIDLSLFKSRNFTIGCLCLSLAYMLYFGAIVLLPQLLQEVYGYTATWAGLASAPVGILP 316
P +D L K+ F IG LC + + G + ++P ++++V+ + G G +
Sbjct: 246 PFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMS 305

Query: 317 VLLS-PLIGRFAHRIDMRQLVTFSFIMYAVCFYWRAYTFEPGMDFGASAWPQFFQGFAIA 375
V++ + G R ++ +V F ++ E F G +
Sbjct: 306 VIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGLSFT 365

Query: 376 CFFMPLTTITLSGLPPERMAAASSLSNFMRTLAGSIGTSITTTL 419
++TI S L + A SL NF L+ G +I L
Sbjct: 366 K--TVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGL 407


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
YpsIP31758_3206    SACTRNSFRASE    37    1e-04    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 36.8 bits (85), Expect = 1e-04
Identities = 16/54 (29%), Positives = 22/54 (40%)

Query: 812 VLVRSDLKGLGLGRALLEKMIRYARSHGLSRLTAVTMPNNRGMIGLAQKLGFTI 865
+ V D + G+G ALL K I +A+ + L T N K F I
Sbjct: 95 IAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFII 148


93    YpsIP31758_3504    YpsIP31758_3513    N    Y    Y    Pathogenicity Island (unbiased-composition)
LocusTag    DNBias    CDNBias    %GCBias    Product
YpsIP31758_3504    -2    12    0.237412    peptide chain release factor 3
YpsIP31758_3505    0    14    -0.505517    ribosomal-protein-alanine N-acetyltransferase
YpsIP31758_3506    -1    14    -0.994785    DNA polymerase III subunit psi
YpsIP31758_3507    -2    11    0.188458    16S ribosomal RNA m2G1207 methyltransferase
YpsIP31758_3508    -1    12    0.068935    ***hypothetical protein
YpsIP31758_3509    -1    10    -0.493040    diguanylate cyclase
YpsIP31758_3510    -2    11    0.066274    hypothetical protein
YpsIP31758_3511    -2    10    0.019133    pectinesterase A
YpsIP31758_3512    -1    11    0.093517    AcrB/AcrD/AcrF family transporter
YpsIP31758_3513    -2    11    0.413605    RND family efflux transporter MFP subunit
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
YpsIP31758_3504    TCRTETOQM    219    4e-66    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 219 bits (560), Expect = 4e-66
Identities = 115/462 (24%), Positives = 215/462 (46%), Gaps = 48/462 (10%)

Query: 12 KRRTFAIISHPDAGKTTITEKVLLFGHAIQTAGTVKGRGSSHHAKSDWMEMEKQRGISIT 71
K +++H DAGKTT+TE +L AI G+V ++D +E+QRGI+I
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGT----TRTDNTLLERQRGITIQ 57

Query: 72 TSVMQFPYGGCLVNLLDTPGHEDFSEDTYRTLTAVDCCLMVIDAAKGVEDRTRKLMEVTR 131
T + F + VN++DTPGH DF + YR+L+ +D +++I A GV+ +TR L R
Sbjct: 58 TGITSFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALR 117

Query: 132 LRDTPILTFMNKLDREIRDPMEVLDEVERELNIACSPITWPIGCGKSFKGVYHLHKDETY 191
P + F+NK+D+ D V +++ +L+ K +
Sbjct: 118 KMGIPTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVI------------------KQKVE 159

Query: 192 LYQSGKGHTIQEVRIVKGLNNPDLDVAVGEDLAKQFRQELELVQGASHEFDHEAFLSGDL 251
LY + E + + +DL +++ L + + F + L
Sbjct: 160 LYPNMCVTNFTESEQWDTVIEGN------DDLLEKYMSGKSLEALELEQEESIRFHNCSL 213

Query: 252 TPVFFGTALGNFGVDHMLDGLVEWAPAPMPRKTDTRVVVASEEKFTGFVFKIQANMDPKH 311
PV+ G+A N G+D++++ + + R + + G VFKI+ K
Sbjct: 214 FPVYHGSAKNNIGIDNLIEVITNKFYSSTHR---------GQSELCGKVFKIE--YSEK- 261

Query: 312 RDRVAFMRVVSGRFEKGMKLRQVRTKKDVVISDALTFMAGDRSHVEEAYAGDIIGLHNHG 371
R R+A++R+ SG +R + K+ + I++ T + G+ +++AY+G+I+ L N
Sbjct: 262 RQRLAYIRLYSGVLHLRDSVR-ISEKEKIKITEMYTSINGELCKIDKAYSGEIVILQNEF 320

Query: 372 ---TIQIGDTFTQGEDMKFTGIPNFAPELFRRIRLRDPLKQKQLLKGLVQLSEEG-AVQV 427
+GDT + + I N P L + P +++ LL L+++S+ ++
Sbjct: 321 LKLNSVLGDTKLLPQRER---IENPLPLLQTTVEPSKPQQREMLLDALLEISDSDPLLRY 377

Query: 428 FRPLSNNDLIVGAVGVLQFEVVSSRLKSEYNVEAVYESVNVS 469
+ + +++I+ +G +Q EV + L+ +Y+VE + V
Sbjct: 378 YVDSATHEIILSFLGKVQMEVTCALLQEKYHVEIEIKEPTVI 419


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_3505  SACTRNSFRASE  47     2e-09    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 47.2 bits (112), Expect = 2e-09
Identities = 22/80 (27%), Positives = 33/80 (41%), Gaps = 1/80 (1%)

Query: 62 DEATLFNIAIDPQYQRQGYGRLLLEHLIEQLEARNIVTLWLEVRASNARAIALYESLGFN 121
A + +IA+ Y+++G G LL IE + + L LE + N A Y F
Sbjct: 88 GYALIEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFI 147

Query: 122 EVSVRRNYYPS-ANGREDAI 140
+V Y + E AI
Sbjct: 148 IGAVDTMLYSNFPTANEIAI 167


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_3506  PF04183       28     0.017    IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 27.9 bits (62), Expect = 0.017
Identities = 8/38 (21%), Positives = 14/38 (36%), Gaps = 2/38 (5%)

Query: 32 HLPEDTRLLIVA--QQLPEHGDPLLCDVLRSLGLTPHQ 67
L D +++A + E+ PL + GL
Sbjct: 358 WLKPDESPVLMATLMECDENNQPLAGAYIDRSGLDAET 395


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_3511  ANTHRAXTOXNA  30     0.022    Anthrax toxin LF subunit signature.
		>ANTHRAXTOXNA#Anthrax toxin LF subunit signature.

Length = 800

Score = 29.7 bits (66), Expect = 0.022
Identities = 25/105 (23%), Positives = 39/105 (37%), Gaps = 18/105 (17%)

Query: 106 WGTSGSSTVLVNAANFTAENLTILNDFDFPANQAKAEGDPTKLKDTQAVALLLAEKSDKA 165
+ S S + VNA N I + N+ + E K KD+ + ++
Sbjct: 21 FAISSSQAIEVNAMNEHYTESDIKRNHKTEKNKTEKE----KFKDSINNLVKTEFTNETL 76

Query: 166 RFRQVKLEGYQDTL----------YSKTGSRSYFTDCDISGHVDF 200
K++ QD L YS+ G YFTD D+ H +
Sbjct: 77 ----DKIQQTQDLLKKIPKDVLEIYSELGGEIYFTDIDLVEHKEL 117


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_3512  ACRIFLAVINRP  458    e-147    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 458 bits (1181), Expect = e-147
Identities = 223/1047 (21%), Positives = 434/1047 (41%), Gaps = 69/1047 (6%)

Query: 7 FINNNTRIWLTILLLGIGGIIAYLNIGRLEDPAFTIKTAVVVTRYDGASAQQVEEEVTLP 66
FI W+ ++L + G +A L + + P V Y GA AQ V++ VT
Sbjct: 5 FIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTVTQV 64

Query: 67 LENAIQELSYVDDVTSISSAGLSQITINIRAQYGANELPQIWDELRRKISDNSVRLPPGA 126
+E + + + ++S S + S +TI + Q G + +++ K+ + LP
Sbjct: 65 IEQNMNGIDNLMYMSSTSDSAGS-VTITLTFQSGTD-PDIAQVQVQNKLQLATPLLPQEV 122

Query: 127 SAPMVN-----DDFGDVYGFFFSLTGDGYSNQDLRNFAE-QLRRELVLVPGVGKVGIVGI 180
++ + V GF G + D+ ++ ++ L + GVG V + G
Sbjct: 123 QQQGISVEKSSSSYLMVAGF--VSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFG- 179

Query: 181 LPEEVQVEISRAQMTAAGITPQQLSDLLSRQNVVSDAGQL--------QVGSESIRLHPT 232
+++ + + +TP + + L QN AGQL Q + SI
Sbjct: 180 AQYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQ-- 237

Query: 233 GEFQSVQELGNLLVSQPGSPKSVYLRDIATVTQGFGHSPTNIYRANGQPALALGISFAPN 292
F++ +E G + + V L+D+A V G G + I R NG+PA LGI A
Sbjct: 238 TRFKNPEEFGKVTLRVNSDGSVVRLKDVARVELG-GENYNVIARINGKPAAGLGIKLATG 296

Query: 293 VNVVNVGNAIKSRLAQLEGERPSGMHINVFYDQSKEVEGAVNGFILNFLLALLIVVVTLL 352
N ++ AIK++LA+L+ P GM + YD + V+ +++ + A+++V + +
Sbjct: 297 ANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMY 356

Query: 353 IFMG-VRSGVVIAISLALNVLGTLLIMWLFNIELQRVSLGALIIALSMLVDNAIVVVEGV 411
+F+ +R+ ++ I++ + +LGT I+ F + +++ +++A+ +LVD+AIVVVE V
Sbjct: 357 LFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENV 416

Query: 412 V-VGRQRGDSISTAIGNVVKRSKMPLLGATIIAILAFAPIGLSNDATGEYCKSLFQVLLI 470
V + A + + + L+G ++ F P+ +TG + ++
Sbjct: 417 ERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVS 476

Query: 471 SLMLSWITALTLTPVFAKWAFQNQKQPEADADKPV----KQPYDGWLFRYYRVVLNKLLQ 526
++ LS + AL LTP + + +D + Y + K+L
Sbjct: 477 AMALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNS-VGKILG 535

Query: 527 HRSITLVLLAALLVASVIGFGHVRQSFFPPSNTPIFFVDIWLPYGTDISYTEDIAAKIEQ 586
L++ A ++ V+ F + SF P + +F I LP G T+ + ++
Sbjct: 536 STGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTD 595

Query: 587 HI--KQQKNVADTMTTIGQGAMRFMLTYNGQRHYPNYAQVMV-----RTEQLEQIPGLID 639
+ ++ NV T G +++GQ A V + R +I
Sbjct: 596 YYLKNEKANVESVFTVNG-------FSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIH 648

Query: 640 EIESYMHDEYPQV-DAQIKRIMFGPSNNSSIEARF-------IGPDPDVLRTLAAQ-AEQ 690
+ E ++ D + F G D L Q
Sbjct: 649 RAKM----ELGKIRDGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGM 704

Query: 691 AIIADPMADGARHDWQDRSKMIRPQFSDYLGRELGVDKREVDGTLRMSFGGLPVGLYRDG 750
A R + + + + + + LGV +++ T+ + GG V + D
Sbjct: 705 AAQHPASLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDR 764

Query: 751 TRLIPIVLRTPDSERLNAERLNDVMVWSQARQAFIPIDNVVTSFETEWEDPLIMRLDRKR 810
R+ + ++ R+ E ++ + V S + +P T + P + R +
Sbjct: 765 GRVKKLYVQADAKFRMLPEDVDKLYVRSANGE-MVPFSAFTT-SHWVYGSPRLERYNGLP 822

Query: 811 TLTVQTDPTHQGGETSSELLQRIKPGVEAITLPRGYELEWGGDYESTKEAQRGIFISLPI 870
++ +Q + G +S + + ++ LP G +W G + + + I
Sbjct: 823 SMEIQGEA--APGTSSGDAMALMENLASK--LPAGIGYDWTGMSYQERLSGNQAPALVAI 878

Query: 871 AFLLMFVVTVLMFSSVRNALAIWLTVPLALIGVTFGFLLTGIPFGFMALLGLLSLSGMLI 930
+F+++F+ ++ S +++ L VPL ++GV L ++GLL+ G+
Sbjct: 879 SFVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSA 938

Query: 931 RNGIVLVEEI-GLQRQE-KPLREAIIDASTARLRPIMLTAFTTVLGLAPLL-----SDAF 983
+N I++VE L +E K + EA + A RLRPI++T+ +LG+ PL
Sbjct: 939 KNAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGA 998

Query: 984 FQSMAVVIMFGLGFATVLTLLVLPVIY 1010
++ + +M G+ AT+L + +PV +
Sbjct: 999 QNAVGIGVMGGMVSATLLAIFFVPVFF 1025


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_3513  RTXTOXIND     39     2e-05    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 39.4 bits (92), Expect = 2e-05
Identities = 22/128 (17%), Positives = 39/128 (30%), Gaps = 42/128 (32%)

Query: 65 GGQLQELLVREGEQVKKGQKIAMLNDTD-----LSLRVRDRQSTFNLARDQF-------- 111
++E++V+EGE V+KG + L L + Q+ R Q
Sbjct: 104 NSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELN 163

Query: 112 -----------------------------NRFNTLQGRSAVSRADLDIRRAEMESAQAAL 142
+F+T Q + +LD +RAE + A +
Sbjct: 164 KLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARI 223

Query: 143 DIARKELS 150
+
Sbjct: 224 NRYENLSR 231



Score = 32.1 bits (73), Expect = 0.003
Identities = 7/54 (12%), Positives = 16/54 (29%), Gaps = 3/54 (5%)

Query: 129 DIRRAEMESAQ--AALDIARKELSDATIIAPFDGIIANVNVRNH-QVMAAGQPV 179
+R+ L + + I AP + + V V+ + +
Sbjct: 303 KLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETL 356



Score = 28.6 bits (64), Expect = 0.038
Identities = 15/100 (15%), Positives = 32/100 (32%), Gaps = 10/100 (10%)

Query: 153 TIIAPFDGIIANVNVRNHQVMAAGQPVATLSALDT-LDVVFSVP---------ERLFTTL 202
I + I+ + V+ + + G + L+AL D + + R
Sbjct: 98 EIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILS 157

Query: 203 DISNRNYKPTVLLNHMPGREFIAEYKEHTTSTTSASQTFQ 242
N P + L P + ++E + ++ Q
Sbjct: 158 RSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFST 197


94  YpsIP31758_3726  YpsIP31758_3736  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
YpsIP31758_3726  -1      14       1.735414   transposase
YpsIP31758_3727  0       15       2.092931   shufflon protein
YpsIP31758_3728  1       19       2.309937   type IV prepilin peptidase
YpsIP31758_3729  0       17       2.081821   type IV pilus biogenesis protein
YpsIP31758_3730  0       18       2.408800   type IV pilus biogenesis protein PilR
YpsIP31758_3731  0       18       2.460600   type IV secretion system protein
YpsIP31758_3732  1       19       1.768326   type IV pilus biogenesis protein PilP
YpsIP31758_3733  1       19       1.747999   pilin accessory protein PilO
YpsIP31758_3734  0       20       1.288864   type IVB pilus formation outer membrane protein
YpsIP31758_3735  1       24       0.093517   PilM protein
YpsIP31758_3736  2       23       -0.714999  pilus biogenesis protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_3726  FLGFLIH       31     0.006    Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 30.5 bits (68), Expect = 0.006
Identities = 18/44 (40%), Positives = 25/44 (56%)

Query: 236 QHKELLMTIAQKLKQEGRQEGRQEGRVEGIQIGEANGLKKGKLE 279
Q +L M ++ Q G EGRQ+G +G Q G A GL++G E
Sbjct: 43 QLAQLQMQAHEQGYQAGIAEGRQQGHKQGYQEGLAQGLEQGLAE 86


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_3728  PREPILNPTASE  54     8e-11    Type IV prepilin cysteine protease (C20) family sig...
		>PREPILNPTASE#Type IV prepilin cysteine protease (C20) family signature.

Length = 290

Score = 53.7 bits (129), Expect = 8e-11
Identities = 40/139 (28%), Positives = 55/139 (39%), Gaps = 7/139 (5%)

Query: 81 MLFVPFVYRLSLIDRLSGWLPQEFTWSFLAAGLLAA--AGNDELLSHSMTAIALLLFCGS 138
+L + L+ ID LP + T L GLL G L + A+A L S
Sbjct: 138 LLLTWVLVALTFIDLDKMLLPDQLTLPLLWGGLLFNLLGGFVSLGDAVIGAMAGYLVLWS 197

Query: 139 VRIVFGYYAQSEVLGLGDVWFATAIGAWLAWPLALLVL----CVGLCGFILWHLMSG-DV 193
+ F E +G GD A+GAWL W +VL VG I L+
Sbjct: 198 LYWAFKLLTGKEGMGYGDFKLLAALGAWLGWQALPIVLLLSSLVGAFMGIGLILLRNHHQ 257

Query: 194 KTGGPLGPWLGHAALLAMI 212
P GP+L A +A++
Sbjct: 258 SKPIPFGPYLAIAGWIALL 276


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_3729  PilS_PF08805  98     3e-28    PilS N terminal
		>PilS_PF08805#PilS N terminal

Length = 185

Score = 98.5 bits (245), Expect = 3e-28
Identities = 44/193 (22%), Positives = 82/193 (42%), Gaps = 23/193 (11%)

Query: 2 SLPVASRKQPHSGWGILESGGVALVVIVVIAVVLGGIYTLWNRKDIALESANVQSIITST 61
SL +K+ G ++E L+V+ VI V+ Y L++ ++S+N Q+ + +
Sbjct: 15 SLSARRKKEQDKGATLME----VLLVVGVIVVLAASAYKLYSMVQSNIQSSNEQNNVLTV 70

Query: 62 QGLLKGRNGYSFASGTTMTGILIQVGGVPKNMMTKGNVTASTATLWNTWGGQVIVAPVTA 121
+K + + L G +P +M+ A+ N WGG V +
Sbjct: 71 IANMKSLKFQGRYTDSNYIKTLYAQGLLPSDMIADTTG----ASAKNPWGGSVTITT--- 123

Query: 122 NGFNHGFTLTYQKVPQSVCIAITTRLSAGGSMSGITINSTVYSDGNITAENAGTTCVKDT 181
+ + F + VPQ C+A+ L + ++S I + + + +A T C D+
Sbjct: 124 SSDKYSFNVVEANVPQKNCMAMVNALRSSSAISKIN-------NTSTSTVSAATVCASDS 176

Query: 182 GRIGMNTLTFTVN 194
NTLTF+ +
Sbjct: 177 -----NTLTFSTD 184


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_3730  BCTERIALGSPF  43     1e-06    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 42.9 bits (101), Expect = 1e-06
Identities = 50/266 (18%), Positives = 109/266 (40%), Gaps = 22/266 (8%)

Query: 39 LLGNGHPLGAALRMIAEVHTDFGKLWHPYGDLVDDCLESLSDNSTGRMLEDVLAAWGPLE 98
L+ PL AL +A+ L + +E S L D + +
Sbjct: 80 LVAASMPLEEALDAVAKQSEK-PHLSQLMAAVRSKVMEGHS-------LADAMKCFPGSF 131

Query: 99 EA---ALISAGMRSGKLPEALRQAGKLVEARRRILTLVGKMSLYPLLLLSLGSGMLGING 155
E A+++AG SG L L + E R+++ + + + +YP +L + ++ I
Sbjct: 132 ERLYCAMVAAGETSGHLDAVLNRLADYTEQRQQMRSRIQQAMIYPCVLTVVAIAVVSILL 191

Query: 156 MYLIPTLRKLSDPENWRGAL-GFMHAVANVTDK-QSTVAVGVIAILVGLVLWSI----PR 209
++P + + + + AL + ++D ++ ++A+L G + + + +
Sbjct: 192 SVVVPKVVEQF--IHMKQALPLSTRVLMGMSDAVRTFGPWMLLALLAGFMAFRVMLRQEK 249

Query: 210 WRGRLRRFADNI-MPWSVYKDIQGAVFLMNMAALLRANVQTLEALQILLP-FSSPWLQER 267
R R ++ + + + + A + ++ L + V L+A++I S+ + + R
Sbjct: 250 RRVSFHRRLLHLPLIGRIARGLNTARYARTLSILNASAVPLLQAMRISGDVMSNDYARHR 309

Query: 268 LDAIIACIEQGDHLGKALRNSGYAFP 293
L + +G L KAL + FP
Sbjct: 310 LSLATDAVREGVSLHKALEQTA-LFP 334


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_3734  BCTERIALGSPD  60     1e-11    Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D signature.

Length = 660

Score = 60.3 bits (146), Expect = 1e-11
Identities = 52/281 (18%), Positives = 99/281 (35%), Gaps = 20/281 (7%)

Query: 216 GDSSSVQKTSVDLNSNLYEDIKKTIENMLTPSKGRFWLSSASGTLTVTDTPETLERIGRY 275
+S + + ++S + + + + L VT P+ + + R
Sbjct: 277 AKASDLVEVLTGISSTMQSEKQAAKPVAALDKNIIIKAHGQTNALIVTAAPDVMNDLERV 336

Query: 276 IDHQNELLNRQVQLNVQVLSVTQSRKEQFGLDWKLVYQSLNN-------IGASVTGNLIN 328
I Q ++ QV + + V + G+ W + I ++ G
Sbjct: 337 IA-QLDIRRPQVLVEAIIAEVQDADGLNLGIQWANKNAGMTQFTNSGLPISTAIAGANQY 395

Query: 329 VANNTLSGGL-SILDTATGPAEKFSGSE--LLIKALSQQGNVSLVTELNRATINLTPVPF 385
+ T+S L S L + G A F +L+ ALS ++ + T++ F
Sbjct: 396 NKDGTVSSSLASALSSFNGIAAGFYQGNWAMLLTALSSSTKNDILATPSIVTLDNMEATF 455

Query: 386 QISDQTGYIQSSSTTVTANAGTTSSMQTGIITTGLFMSMLPYIQENGDVQLQFAFTLSDK 445
+ + + S TT N T T G+ + + P I E V L+ +S
Sbjct: 456 NVGQEVPVLTGSQTTSGDNIFNTVE----RKTVGIKLKVKPQINEGDSVLLEIEQEVSSV 511

Query: 446 PIIQPFFSRDGNTRNDTPAFKLRSLTQTVNLRAGQTLVLTG 486
S D +T R++ V + +G+T+V+ G
Sbjct: 512 ADAASSTSSDLGATFNT-----RTVNNAVLVGSGETVVVGG 547


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_3736  FLGHOOKFLIK   31     0.008    Flagellar hook-length control protein signature.
		>FLGHOOKFLIK#Flagellar hook-length control protein signature.

Length = 375

Score = 31.0 bits (69), Expect = 0.008
Identities = 32/115 (27%), Positives = 46/115 (40%), Gaps = 16/115 (13%)

Query: 152 DDVNREVCFTLRDAYQNLVSAKPVVPANVIIPTKNLAQQSGNPFSGSSASALTPVVP--- 208
DD+N +V +L A ++ P P+ L + F+ ++ LT P
Sbjct: 121 DDLNEDVTASL-SALFAMLPGFDNTPKVTDAPSTVLPTEKPTLFTKLTSEQLTTAQPDDA 179

Query: 209 --TLLVPLKPTI-----KPESISIPSPVLTEASKVLAPATAVFSTEAPGKTVSGP 256
T PL P + K E IS PSPV AS ++ P P TV+ P
Sbjct: 180 PGTPAQPLTPLVAEAQSKAEVISTPSPVTAAASPLITPHQT-----QPLPTVAAP 229


95  YpsIP31758_3805  YpsIP31758_3817  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
YpsIP31758_3805  1       15       -2.968573  hemin importer ATP-binding subunit
YpsIP31758_3806  1       15       -3.513295  cystathionine beta-lyase
YpsIP31758_3807  1       12       -2.619634  serine transporter
YpsIP31758_3808  1       14       -0.560588  LysR family substrate binding transcriptional
YpsIP31758_3809  0       18       1.414712   hypothetical protein
YpsIP31758_3810  0       17       1.978186   inner membrane protein
YpsIP31758_3811  -1      17       1.505037   secretion system apparatus protein SsaU
YpsIP31758_3812  -1      20       2.751020   type III secretion apparatus protein
YpsIP31758_3813  0       21       4.884925   HrpO family type III secretion protein
YpsIP31758_3814  0       20       4.582773   type III secretion system protein
YpsIP31758_3815  0       19       5.576071   type III secretion system protein
YpsIP31758_3816  -1      18       5.569269   hypothetical protein
YpsIP31758_3817  -1      19       5.925245   type III secretion system protein
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_3805  PF05272       28     0.049    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 28.1 bits (62), Expect = 0.049
Identities = 10/21 (47%), Positives = 12/21 (57%)

Query: 39 MVAIIGPNGAGKSTLLRLLTG 59
V + G G GKSTL+ L G
Sbjct: 598 SVVLEGTGGIGKSTLINTLVG 618


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_3809  PF01206       92     1e-28    SirA family protein
		>PF01206#SirA family protein

Length = 76

Score = 92.1 bits (229), Expect = 1e-28
Identities = 17/71 (23%), Positives = 37/71 (52%)

Query: 19 DYRLDMVGEPCPYPAVATLEAMPQLKPGEILEVISDCPQSINNIPLDARNYGYTVLDIQQ 78
D LD G CP P + + + + GE+L V++ P S+ + ++ G+ +L+ ++
Sbjct: 5 DQSLDATGLNCPLPILKAKKTLATMNAGEVLYVMATDPGSVKDFESFSKQTGHELLEQKE 64

Query: 79 DGPTIRYLIQR 89
+ T + ++R
Sbjct: 65 EDGTYHFRLKR 75


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_3811  TYPE3IMSPROT  345    e-120    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 345 bits (887), Expect = e-120
Identities = 123/351 (35%), Positives = 198/351 (56%), Gaps = 2/351 (0%)

Query: 2 MSTEKNEKPTPKRLKEAKEKGQVVKSVEITSGVQLVALVIYFLLTGYSLVEQAKALIRSS 61
MS EK E+PTPK++++A++KGQV KS E+ S +VAL + E L+
Sbjct: 1 MSGEKTEQPTPKKIRDARKKGQVAKSKEVVSTALIVALSAMLMGLSDYYFEHFSKLMLIP 60

Query: 62 IIQLQQPLTLALARIGAECMTVLMHIVVVLGGALIVVTIIAGIAQVGPLLATKAVSFKGE 121
Q P + AL+ + + ++ L ++ I + + Q G L++ +A+ +
Sbjct: 61 AEQSYLPFSQALSYVVDNVLLEFFYLCFPLLTVAALMAIASHVVQYGFLISGEAIKPDIK 120

Query: 122 RINPIQNAKQLFSLRSVFELMKSLLKVGVLTLIFGYLLMQYAPSFGYLTHCGSRCALPVF 181
+INPI+ AK++FS++S+ E +KS+LKV +L+++ ++ + L CG C P+
Sbjct: 121 KINPIEGAKRIFSIKSLVEFLKSILKVVLLSILIWIIIKGNLVTLLQLPTCGIECITPLL 180

Query: 182 STLMGWLLGSLIACYLVFSLMDYTFQRYTIMKQLKMSHDEVKREHKDSNGDPHIKQKRRQ 241
++ L+ ++V S+ DY F+ Y +K+LKMS DE+KRE+K+ G P IK KRRQ
Sbjct: 181 GQILRQLMVICTVGFVVISIADYAFEYYQYIKELKMSKDEIKREYKEMEGSPEIKSKRRQ 240

Query: 242 LQHEVQSGSFATNVRRSTAVVRNPTHFAVCLVYHPEETPLPIVIEKGHDEQAALIVSLAE 301
E+QS + NV+RS+ VV NPTH A+ ++Y ETPLP+V K D Q + +AE
Sbjct: 241 FHQEIQSRNMRENVKRSSVVVANPTHIAIGILYKRGETPLPLVTFKYTDAQVQTVRKIAE 300

Query: 302 QSGIPVVENIALARALHRDVACGDTIPEQFFEPVAALLRM--ALELDYQPS 350
+ G+P+++ I LARAL+ D IP + E A +LR ++ Q S
Sbjct: 301 EEGVPILQRIPLARALYWDALVDHYIPAEQIEATAEVLRWLERQNIEKQHS 351


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_3812  TYPE3IMRPROT  141    5e-43    Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 141 bits (356), Expect = 5e-43
Identities = 52/230 (22%), Positives = 105/230 (45%), Gaps = 4/230 (1%)

Query: 5 LPGLTALALAMMRPYGILLILPLFTARSLGSSLLRNGLIVAIALPVTPLFLSAPIITNSS 64
L L ++R ++ P+ + RS+ + + GL + I + P + + S
Sbjct: 10 LSWLNLYFWPLLRVLALISTAPILSERSVPKRV-KLGLAMMITFAIAPSLPANDVPVFS- 67

Query: 65 PVTWIGVLCTELLIGVVMGFVAALPFWAMNMAGFLIDTLRGATMSTLFNPGMGVESSLFG 124
+ + ++LIG+ +GF F A+ AG +I G + +T +P + +
Sbjct: 68 -FFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHLNMPVLA 126

Query: 125 VLFTQILTVLFLISGGFNQVLAALYGSYDSLPIGQGIQPAADLLLFLQTEWQMMFELCLC 184
+ + +LFL G +++ L ++ +LPIG + L + +F L
Sbjct: 127 RIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSL-IFLNGLM 185

Query: 185 FALPALLVMVLADLSLGLINRSARQLNVFFLAMPIKSALALFLLLISLPY 234
ALP + +++ +L+LGL+NR A QL++F + P+ + + L+ +P
Sbjct: 186 LALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPL 235


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_3813  TYPE3IMQPROT  69     3e-19    Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.

Length = 86

Score = 68.6 bits (168), Expect = 3e-19
Identities = 32/79 (40%), Positives = 47/79 (59%)

Query: 10 IVHLATELLWLVLLLSLPVVVVASTVGLVISLVQALTQIQDQTLQFLIKLLAVSATLLMT 69
+V + L+LVL+LS +VA+ +GL++ L Q +TQ+Q+QTL F IKLL V L +
Sbjct: 4 LVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLFLL 63

Query: 70 YHWMGATLLNYTQQSFLQI 88
W G LL+Y +Q
Sbjct: 64 SGWYGEVLLSYGRQVIFLA 82


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_3814  TYPE3IMPPROT  224    1e-76    Type III secretion system inner membrane P protein ...
		>TYPE3IMPPROT#Type III secretion system inner membrane P protein family signature.

Length = 224

Score = 224 bits (572), Expect = 1e-76
Identities = 86/220 (39%), Positives = 143/220 (65%), Gaps = 7/220 (3%)

Query: 4 LNSSYQLIALLFMLSVLPLLVVMGTAFLKLSVVFSLLRNALGVQQVPPNIAIYGLALVLT 63
+ + LIALL ++LP ++ GT F+K S+VF ++RNALG+QQ+P N+ + G+AL+L+
Sbjct: 1 MGNDISLIALLAFSTLLPFIIASGTCFVKFSIVFVMVRNALGLQQIPSNMTLNGVALLLS 60

Query: 64 IFIMAPVGLDVQARLQNEELSNDIGALAHQIDQNALVPYRDFLQRNTDIEQVTFFNDIVQ 123
+F+M P+ D ++E+++ + + + L YRD+L + +D E V FF +
Sbjct: 61 MFVMWPIMHDAYVYFEDEDVTFNDISSLSKHVDEGLDGYRDYLIKYSDRELVQFFENAQL 120

Query: 124 NKWPE-------RYRDSVKPDSLLILMPAFTLSQLNEAFKIGLLLFLPFVAIDLIVSNIL 176
+ R +D ++ S+ L+PA+ LS++ AFKIG L+LPFV +DL+VS++L
Sbjct: 121 KRQYGEETETVKRDKDEIEKPSIFALLPAYALSEIKSAFKIGFYLYLPFVVVDLVVSSVL 180

Query: 177 LAMGMMMVSPMTLSLPFKLLVFVLVDGWSLVLGQLVGSYL 216
LA+GMMM+SP+T+S P KL++FV +DGW+L+ L+ Y+
Sbjct: 181 LALGMMMMSPVTISTPIKLVLFVALDGWTLLSKGLILQYM 220


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_3815  TYPE3OMOPROT  50     3e-09    Type III secretion system outer membrane O protein ...
		>TYPE3OMOPROT#Type III secretion system outer membrane O protein family signature.

Length = 303

Score = 50.4 bits (120), Expect = 3e-09
Identities = 29/111 (26%), Positives = 50/111 (45%), Gaps = 4/111 (3%)

Query: 205 YIKLEGGNRMTIQQINEASDPLACGSRAESLPLAAVQFEDLPQTLVMEIGRLTLPLGEIK 264
+ ++EGG + I + AE+LP LP L + R + L E++
Sbjct: 194 FNRVEGGIIVETLDIQHIEEENNTTETAETLP----GLNQLPVKLEFVLYRKNVTLAELE 249

Query: 265 QLAVGQTLACQTHCYGEVNICLNGQSVGRGSLLRCDEQLVVRIAQWGLQNG 315
+ Q L+ T+ V I NG +G G L++ ++ L V I +W ++G
Sbjct: 250 AMGQQQLLSLPTNAELNVEIMANGVLLGNGELVQMNDTLGVEIHEWLSESG 300


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_3817  RTXTOXIND     32     5e-04    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 32.1 bits (73), Expect = 5e-04
Identities = 15/118 (12%), Positives = 38/118 (32%), Gaps = 11/118 (9%)

Query: 5 QQRTLQRLLALRQRQERRLRQQLGQLRREQQQQEQQLENGRRRHQQLCQQLQQLAQWCGI 64
++ + Q Q+ + L + R E+ ++ + +L +
Sbjct: 187 LTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSL--- 243

Query: 65 LTPREADEQKVLRQAVYQAERQAKKQLNAWVAQGRQQVSAIERQ--QARLRRNQREQE 120
+Q + + AV + E + + + + Q+ IE + A+ Q
Sbjct: 244 -----LHKQAIAKHAVLEQENKY-VEAVNELRVYKSQLEQIESEILSAKEEYQLVTQL 295


96  YpsIP31758_3830  YpsIP31758_3834  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
YpsIP31758_3830  -1      13       -3.299554  YscC/HrcC family type III secretion outer
YpsIP31758_3831  -1      14       -3.108605  hypothetical protein
YpsIP31758_3832  -1      14       -2.699017  sensor histidine kinase/response regulator EsrA
YpsIP31758_3833  -2      14       -3.134452  DNA-binding response regulator EsrB
YpsIP31758_3834  -1      15       -3.041018  glutamate/aspartate:proton symporter
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_3830  TYPE3OMGPROT  478    e-166    Type III secretion system outer membrane G protein ...
		>TYPE3OMGPROT#Type III secretion system outer membrane G protein family signature.

Length = 607

Score = 478 bits (1231), Expect = e-166
Identities = 160/514 (31%), Positives = 269/514 (52%), Gaps = 21/514 (4%)

Query: 4 IYIMRKITGLILLFFATLLPYGKFSYGKAIPWQGEPFFIYSRGMTVSELLKDLGMNYGIP 63
+ R +TG +LL + S+ + + W P+ ++G ++ +LL D G NY
Sbjct: 7 SFFKRVLTGTLLLLSSY-------SWAQELDWLPIPYVYVAKGESLRDLLTDFGANYDAT 59

Query: 64 VVISSEINEHFTGKIRDKTPEKILSELAGRYNITWYYDGETLYFYPVQSIKREFISPDGL 123
VV+S +IN+ +G+ P+ L +A YN+ WYYDG LY + + I
Sbjct: 60 VVVSDKINDKVSGQFEHDNPQDFLQHIASLYNLVWYYDGNVLYIFKNSEVASRLIRLQES 119

Query: 124 AANTLVKYLQRGDVLAGKNCAIKAIPHLDTLEVKGVPICIERVKSVSKMLS--EQVRHQN 181
A L + LQR + + + V G P +E V+ + L Q+R +
Sbjct: 120 EAAELKQALQRSGIWE-PRFGWRPDASNRLVYVSGPPRYLELVEQTAAALEQQTQIRSEK 178

Query: 182 QNKETVKVFPLKYASAADSDYQYRDQNVRLPGLVSVLRELNQGNNLPLAGGNQPDGNQAS 241
+++FPLKYASA+D YRD V PG+ ++L+ + + + QA+
Sbjct: 179 TGALAIEIFPLKYASASDRTIHYRDDEVAAPGVATILQRVLSDATIQQVTVDNQRIPQAA 238

Query: 242 S-----PVFSADPRQNAVIIRDRQANMPIYRSLITQLDQRPIQIEISVTIIDVDAGDISQ 296
+ ADP NA+I+RD MP+Y+ LI LD+ +IE++++I+D++A +++
Sbjct: 239 TRASAQARVEADPSLNAIIVRDSPERMPMYQRLIHALDKPSARIEVALSIVDINADQLTE 298

Query: 297 LGVDWSASASIGGTGV------SFNSTFAKNNAEGFSTVIGDTGNFMVRLNALQKNSRAR 350
LGVDW G S A N A G + R+N L+ A+
Sbjct: 299 LGVDWRVGIRTGNNHQVVIKTTGDQSNIASNGALGSLVDARGLDYLLARVNLLENEGSAQ 358

Query: 351 ILSQPSVVTLNNIQAVLDKNVTFYTKLQGEKVAKLESVTSGSLLRVTPRMIETEGVQEVL 410
++S+P+++T N QAV+D + T+Y K+ G++VA+L+ +T G++LR+TPR++ E+
Sbjct: 359 VVSRPTLLTQENAQAVIDHSETYYVKVTGKEVAELKGITYGTMLRMTPRVLTQGDKSEIS 418

Query: 411 LNLNIQDGQQQASTNSNEPLPEIRNSDISTQATLQVGQSLLLGGFIQDTQIESQNKIPLL 470
LNL+I+DG Q+ +++ E +P I + + T A + GQSL++GG +D + +K+PLL
Sbjct: 419 LNLHIEDGNQKPNSSGIEGIPTISRTVVDTVARVGHGQSLIIGGIYRDELSVALSKVPLL 478

Query: 471 GDIPLLGGLFRSTDKQSHSVVRLFLIKAVPVNAG 504
GDIP +G LFR + + VRLF+I+ ++ G
Sbjct: 479 GDIPYIGALFRRKSELTRRTVRLFIIEPRIIDEG 512


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_3832  HTHFIS        80     1e-17    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 79.9 bits (197), Expect = 1e-17
Identities = 35/173 (20%), Positives = 64/173 (36%), Gaps = 14/173 (8%)

Query: 695 HILLVDDSETNRDITGMMLQQLGHQVTRADSGTTALAIGRQHRFDLVLMDIRMPVLDGLA 754
IL+ DD R + L + G+ V + T DLV+ D+ MP +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 755 TTARWRHDPANIDSHCMITALSANASPDEQIKTNQAGMNHYLSKPVTLGQLAEMLDLTAQ 814
R + + +SA + IK ++ G YL KP L E++ + +
Sbjct: 65 LLPRIK----KARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPF---DLTELIGIIGR 117

Query: 815 FQLERGVDLSPQLSEPQPLLDL-ADSALSLKLYQSLQVLIQQAKDAIENLPVL 866
E S + Q + L SA ++Y+ ++ + +L ++
Sbjct: 118 ALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYR----VLARLMQT--DLTLM 164


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_3833  HTHFIS        59     2e-12    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 58.7 bits (142), Expect = 2e-12
Identities = 25/127 (19%), Positives = 53/127 (41%), Gaps = 3/127 (2%)

Query: 3 TKLLIVDDHELIIHGIKNMLAAYPRYLIVGQADNGLEVYNLCRQTEPDMVILDLGLPGMD 62
+L+ DD I + L+ V N ++ + D+V+ D+ +P +
Sbjct: 4 ATILVADDDAAIRTVLNQALS--RAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 63 GLDVIIQLLRRWPAMKILTLTARNEEHYASRTFNSGALGYVLKKSPQQILMAAIQTVAIG 122
D++ ++ + P + +L ++A+N A + GA Y+ K L+ I A+
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR-ALA 120

Query: 123 KRYIDPA 129
+ P+
Sbjct: 121 EPKRRPS 127


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_3834  V8PROTEASE    31     0.008    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 31.1 bits (70), Expect = 0.008
Identities = 7/43 (16%), Positives = 18/43 (41%)

Query: 293 AYGAPKAITSFVVPTGYSFNLDGSTLYQSIAAIFIAQLYGIEL 335
+ A + + TGY + +T+++S I + ++
Sbjct: 186 SNNAETQVNQNITVTGYPGDKPVATMWESKGKITYLKGEAMQY 228


97  YpsIP31758_3864  YpsIP31758_3868  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
YpsIP31758_3864  3       28       -1.573150  50S ribosomal protein L11
YpsIP31758_3865  3       26       -3.404785  transcription antitermination protein NusG
YpsIP31758_3866  2       29       -4.791845  preprotein translocase subunit SecE
YpsIP31758_3867  2       28       -5.211557  elongation factor Tu
YpsIP31758_3868  3       29       -6.359136  ****acetyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_3864  ACRIFLAVINRP  27     0.045    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 26.7 bits (59), Expect = 0.045
Identities = 14/71 (19%), Positives = 26/71 (36%), Gaps = 2/71 (2%)

Query: 4 KVQAYVKLQVAAGMANPSPPVGPALGQQ-GVNIMEFCKAFNAKTESIEKGLPIPVVITVY 62
+V+ + N P G + G N ++ KA AK ++ P + +
Sbjct: 267 RVELGGENYNVIARINGKPAAGLGIKLATGANALDTAKAIKAKLAELQPFFPQGMKVLYP 326

Query: 63 SDRSFTFVTKT 73
D + FV +
Sbjct: 327 YDTT-PFVQLS 336


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_3866  SECETRNLCASE  161    7e-55    Bacterial translocase SecE signature.
		>SECETRNLCASE#Bacterial translocase SecE signature.

Length = 127

Score = 161 bits (410), Expect = 7e-55
Identities = 109/127 (85%), Positives = 116/127 (91%)

Query: 1 MSANTEAPGSGRGLETAKWLIVAVLLVVAIVGNYYYREYSLPLRALAVVVIIAVAGAVAL 60
MSANTEA GSGRGLE KW++V LL+VAIVGNY YR+ LPLRALAVV++IA AG VAL
Sbjct: 1 MSANTEAQGSGRGLEAMKWVVVVALLLVAIVGNYLYRDIMLPLRALAVVILIAAAGGVAL 60

Query: 61 MTAKGKATVAFAREARTEVRKVIWPTRQETLHTTLIVAAVTAVMSLILWGLDGILVRLVS 120
+T KGKATVAFAREARTEVRKVIWPTRQETLHTTLIVAAVTAVMSLILWGLDGILVRLVS
Sbjct: 61 LTTKGKATVAFAREARTEVRKVIWPTRQETLHTTLIVAAVTAVMSLILWGLDGILVRLVS 120

Query: 121 FITGLRF 127
FITGLRF
Sbjct: 121 FITGLRF 127


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_3867  TCRTETOQM     80     3e-18    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 79.5 bits (196), Expect = 3e-18
Identities = 53/154 (34%), Positives = 78/154 (50%), Gaps = 13/154 (8%)

Query: 13 VNVGTIGHVDHGKTTLTAAI------TTVLAKTYGGSARAFDQIDNAPEEKARGITINTS 66
+N+G + HVD GKTTLT ++ T L G+ R DN E+ RGITI T
Sbjct: 4 INIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRT----DNTLLERQRGITIQTG 59

Query: 67 HVEYDTPARHYAHVDCPGHADYVKNMITGAAQMDGAILVVAATDGPMPQTREHILLGRQV 126
+ +D PGH D++ + + +DGAIL+++A DG QTR R++
Sbjct: 60 ITSFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKM 119

Query: 127 GVPYIIVFLNKCDMVDDEELLELVEMEVRELLSQ 160
G+P I F+NK D + L V +++E LS
Sbjct: 120 GIP-TIFFINKIDQNGID--LSTVYQDIKEKLSA 150


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_3868  SACTRNSFRASE  31     0.001    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 31.1 bits (70), Expect = 0.001
Identities = 28/116 (24%), Positives = 47/116 (40%), Gaps = 10/116 (8%)

Query: 55 IEREALLLWIARDEIGIIGTIQLVLCQKPNGLNRAEIQKLLVHS--RRTGIGHKLIIAAE 112
+E E ++ E IG I++ + N A I+ + V R+ G+G L+ A
Sbjct: 60 VEEEGKAAFLYYLENNCIGRIKI----RSNWNGYALIEDIAVAKDYRKKGVGTALLHKAI 115

Query: 113 NTAVQLRRGLIYLDTQS-GSSAESFYRAQGYRYVG-EIPDYACTPNGNYHPTAIYF 166
A + + L+TQ SA FY + + Y+ P N AI++
Sbjct: 116 EWAKENHFCGLMLETQDINISACHFYAKHHFIIGAVDTMLYSNFPTAN--EIAIFW 169


98  YpsIP31758_4077  YpsIP31758_4080  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag         DNBias  CDNBias  %GCBias    Product
YpsIP31758_4077  -1      15       -3.188573  hemagglutinin/hemolysin
YpsIP31758_4078  -1      12       -1.409175  *regulatory protein UhpC
YpsIP31758_4079  0       13       -1.076686  sensory histidine kinase UhpB
YpsIP31758_4080  -1      12       -0.719845  UhpA family transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_4077  PF05860       59     6e-13    haemagglutination activity domain.
		>PF05860#haemagglutination activity domain.

Length = 117

Score = 58.7 bits (142), Expect = 6e-13
Identities = 17/115 (14%), Positives = 35/115 (30%), Gaps = 18/115 (15%)

Query: 66 VGMTETVVNIQAPDENGLSHNKYSKFDVVANGLFDVTTLNNRLAQEVDGNSFLQDKLATI 125
++ + L H+ + +F V +G F
Sbjct: 17 TEGNTRIIERGTQAGSNLFHS-FQEFSVPTSGTA----------------FFNNPTNIQN 59

Query: 126 ILNEVNSSQASLLDGNLHVGGQDAHVIIANPAGINCRGCSFTNTSHVTLTTGAPS 180
I++ V S +DG + A++ + NP GI + + + + A
Sbjct: 60 IISRVTGGSVSNIDGLIRANAT-ANLFLINPNGIIFGQNARLDIGGSFVGSTANR 113


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_4078  TCRTETB       45     5e-07    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 44.9 bits (106), Expect = 5e-07
Identities = 33/157 (21%), Positives = 68/157 (43%), Gaps = 7/157 (4%)

Query: 49 FNFIMPAMLTDLGLSMSDVGILGTLFYITYGCSKFVSGMISDRSNPRYFMGIGLVMTGII 108
N +P + D + + T F +T+ V G +SD+ + + G+++
Sbjct: 33 LNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFG 92

Query: 109 NILFGMSSSLLVLGALWILNAFFQGWG---WPPCSKILTSWY-SRSERGGWWAIWNTSHN 164
+++ + S +L I+ F QG G +P ++ + Y + RG + + +
Sbjct: 93 SVIGFVGHSFF---SLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVA 149

Query: 165 FGGALIPLLVGVITLHFSWRYGMIIPGIIGVVIGLLM 201
G + P + G+I + W Y ++IP I + + LM
Sbjct: 150 MGEGVGPAIGGMIAHYIHWSYLLLIPMITIITVPFLM 186


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_4079  PF06580       37     2e-04    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 36.8 bits (85), Expect = 2e-04
Identities = 17/85 (20%), Positives = 33/85 (38%), Gaps = 10/85 (11%)

Query: 426 VTNAYRHGAASR-----IEINARQDNQQIYLTISDNGK-GIDLASITPGYGLRGIQSRVS 479
V N +HG A I + +DN + L + + G + + G GL+ ++ R+
Sbjct: 264 VENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKESTGTGLQNVRERLQ 323

Query: 480 A-FGGNVSLSV---DNGTCLNVTLP 500
+G + + V +P
Sbjct: 324 MLYGTEAQIKLSEKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag         Hits          Score  E-value  Comments
YpsIP31758_4080  HTHFIS        61     3e-13    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 60.6 bits (147), Expect = 3e-13
Identities = 34/173 (19%), Positives = 63/173 (36%), Gaps = 20/173 (11%)

Query: 4 RVVFIDDHDIVRSGFAQLLSLEEDIQVVGEFSSAKQARAGLPGLQANICICDISMPDENG 63
++ DD +R+ Q LS V S+A + ++ + D+ MPDEN
Sbjct: 5 TILVADDDAAIRTVLNQALSRA-GYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDENA 62

Query: 64 LDLLKGLPS---GMGVIMLSMHDSPALVETALERGARGFLSKRCKPEDLISAVRTVGSGG 120
DLL + + V+++S ++ A E+GA +L K +LI +
Sbjct: 63 FDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR----- 117

Query: 121 VYLMPEIAQQLARVAVDPLTRREREIAVLLAEG---MEVREIAESLGLSPKTV 170
A + L ++ L+ E+ + L + T+
Sbjct: 118 -------ALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTL 163



 
Contact Sachin Pundhir for Bugs/Comments.