PredictBias

Identification of genomic and pathogenicity islands in prokaryotic genomes
 
A) Input parameters
Genome: 1063.gbk
Threshold dinucleotide bias: 2
Threshold codon bias: 4
Threshold %GC bias: 3
E-value (RPSBlast): 0.05
Genome (non-pathogenic):
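The E-value cutoff above can be checked against the "Expect" values reported further down. The following is a minimal sketch, not the tool's own code: the function name `parse_evalue`, the `<=` comparison, and the choice of sample hits are assumptions made for illustration. It also handles the abbreviated form `e-178` that appears in this report, which Python's `float()` does not accept directly.

```python
def parse_evalue(text: str) -> float:
    """Convert a BLAST-style E-value string to a float."""
    text = text.strip()
    # Reports like this one abbreviate 1e-178 as "e-178"; float() rejects that form.
    if text.lower().startswith("e"):
        text = "1" + text
    return float(text)


# "E-value (RPSBlast)" from the input parameters.
E_VALUE_CUTOFF = 0.05

# A few (locus tag, Expect) pairs copied from this report.
hits = [
    ("Z0016", "9e-16"),
    ("Z0022", "0.0"),
    ("Z0073", "0.022"),
    ("Z0375", "e-178"),
]

# Assumption: a hit is kept when its E-value does not exceed the cutoff.
passing = [tag for tag, evalue in hits if parse_evalue(evalue) <= E_VALUE_CUTOFF]
print(passing)
```

All four sample hits fall under the 0.05 cutoff, which is consistent with their being listed in the report.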
 
 
C) Potential GIs and PAIs in NC_002655
S.No  Start  End    Bias  Virulence  Insertion elements  Prediction
1     Z0015  Z0033  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z0015     -2      16       -3.089568   molecular chaperone DnaJ
Z0016     1       27       -7.609934   Gef protein
Z0018     1       29       -8.649492   pH-dependent sodium/proton antiporter
Z0019     1       36       -10.798841  transcriptional activator NhaR
Z0020     3       41       -13.811160  hypothetical protein
Z0021     1       40       -13.359377  hypothetical protein
Z0022     1       38       -12.895206  usher protein
Z0023     -1      32       -10.893083  chaperone protein
Z0024     -2      15       -3.592779   type-1 fimbrial protein
Z0025     -2      14       -2.397600   hypothetical protein
Z0027     -1      23       2.748744    30S ribosomal protein S20
Z0028     -1      22       3.247541    hypothetical protein
Z0029     -2      21       3.601578    bifunctional riboflavin kinase/FMN
Z0030     -1      21       3.556907    isoleucyl-tRNA synthetase
Z0031     -3      16       2.789552    lipoprotein signal peptidase
Z0033     -1      24       3.529944    peptidyl-prolyl cis-trans isomerase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z0016     HOKGEFTOXIC   58     9e-16    Hok/Gef cell toxic protein family signature.
		>HOKGEFTOXIC#Hok/Gef cell toxic protein family signature.

Length = 52

Score = 58.3 bits (141), Expect = 9e-16
Identities = 18/46 (39%), Positives = 28/46 (60%)

Query: 23 HKVMIVALIVXCITAVVAALVTRKDLCEVHIRTGQTEVAVFTAYES 68
++ +++ C+T ++ +TRK LCE+ R G EVA F AYES
Sbjct: 5 RSSLVWCVLIVCLTLLIFTYLTRKSLCEIRYRDGYREVAAFMAYES 50
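
The "Identities" and "Positives" figures in each alignment are simple ratios over the aligned length. The sketch below recomputes them from one statistics line; the helper name `alignment_stats` and the regular expression are assumptions for illustration, not part of PredictBias or BLAST. Integer division is used because the percentages in this report appear to be truncated rather than rounded (28/46 is about 60.9% but is shown as 60%).

```python
import re

# Matches fragments such as "Identities = 18/46 (39%)".
STAT = re.compile(r"(Identities|Positives|Gaps) = (\d+)/(\d+) \((\d+)%\)")


def alignment_stats(line: str) -> dict:
    """Return {name: (count, aligned_length, truncated_percent, reported_percent)}."""
    stats = {}
    for name, num, den, shown in STAT.findall(line):
        num, den = int(num), int(den)
        # Integer division truncates toward zero, mirroring the report's figures.
        stats[name] = (num, den, num * 100 // den, int(shown))
    return stats


line = "Identities = 18/46 (39%), Positives = 28/46 (60%)"
for name, (num, den, computed, shown) in alignment_stats(line).items():
    print(f"{name}: {num}/{den} -> {computed}% (reported {shown}%)")
```

The same helper can be applied to lines that also carry a "Gaps" field.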


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z0022     PF00577       656    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 656 bits (1695), Expect = 0.0
Identities = 243/833 (29%), Positives = 397/833 (47%), Gaps = 38/833 (4%)

Query: 7 WVAAIIFLYSFPGYAEETFDTHFMIGGMKGEKVSEYHFDNKQP-LPGNYELDFYVNNQWR 65
+VA + AE F+ F+ F+N Q PG Y +D Y+NN +
Sbjct: 31 FVACAFAAQAPLSSAELYFNPRFLADD-PQAVADLSRFENGQELPPGTYRVDIYLNNGYM 89

Query: 66 GKQDITI----PESPVKPCLPKVLLTKLGVKTGNLNT-----EDNCILLDKAVHGGQYQW 116
+D+T E + PCL + L +G+ T +++ +D C+ L +H Q
Sbjct: 90 ATRDVTFNTGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDACVPLTSMIHDATAQL 149

Query: 117 DISEHRLNLTVPQAYINELERGYVPPESWDRGIDAFYTSYNLSQYRSYDSNNNSNTASYG 176
D+ + RLNLT+PQA+++ RGY+PPE WD GI+A +YN S + ++ +Y
Sbjct: 150 DVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSVQNRIGGNSHYAYL 209

Query: 177 RFNSGLNLFSWQLHSDASYSKPD-----DMKGTWQSNTLYLEHGWSQILSTVQIGENYTS 231
SGLN+ +W+L + ++S K WQ +LE + S + +G+ YT
Sbjct: 210 NLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRSRLTLGDGYTQ 269

Query: 232 SLIFDSLRFSGIRLFRDMQMLPDSMQSFTPLVQGVAQSNALITVSQNGYIIYQKEVPPGP 291
IFD + F G +L D MLPDS + F P++ G+A+ A +T+ QNGY IY VPPGP
Sbjct: 270 GDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIYNSTVPPGP 329

Query: 292 FTIADLQLSGSGSDLDVSIKEADGSVRSFLVPYSSVPNMLQPGISNFDFIAGRSKIYGVK 351
FTI D+ +G+ DL V+IKEADGS + F VPYSSVP + + G + + AG + +
Sbjct: 330 FTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITAGEYRSGNAQ 389

Query: 352 NQED-FLEANYIYGLNNLLTLYGGTILSDNYNAITLGNGWNT-PLGAISFDATRSSSKLN 409
++ F ++ ++GL T+YGGT L+D Y A G G N LGA+S D T+++S L
Sbjct: 390 QEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALSVDMTQANSTLP 449

Query: 410 NDITHEGTSYQVAYNKYLVQTATRFSVAAWRYASQDYRTFSDHLYENDKINHQSDYDDFY 469
+D H+G S + YNK L ++ T + +RY++ Y F+D Y + D
Sbjct: 450 DDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGYNIETQDGVI 509

Query: 470 DIG------------RKNSLSANIMQPLSNNLGNVSLSALWRNYWGRSGNAKDYQFSYSN 517
+ ++ L + Q L + LS + YWG S + +Q +
Sbjct: 510 QVKPKFTDYYNLAYNKRGKLQLTVTQQL-GRTSTLYLSGSHQTYWGTSNVDEQFQAGLNT 568

Query: 518 SWQRISYTFSASQSYDENDKEEER-FNLFISIPF--YWGDDIAKTRHQINLSNSTSFSKD 574
+++ I++T S S + + K ++ L ++IPF + D + S S S +
Sbjct: 569 AFEDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRSDSKSQWRHASASYSMSHDLN 628

Query: 575 GYSSNNTGITGIAGEHDQLNYGI---YVNQQQQNNDTSLGTNLSWRTPIATIDGSYSHSK 631
G +N G+ G E + L+Y + Y N+ ++ L++R + YSHS
Sbjct: 629 GRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGYGNANIGYSHSD 688

Query: 632 NAWQSGGSISSGLVVWPGGINITNQLSDTFAILDAPGLEGAHINGQKYNRTNSKGQVVYD 691
+ Q +S G++ G+ + L+DT ++ APG + A + Q RT+ +G V
Sbjct: 689 DIKQLYYGVSGGVLAHANGVTLGQPLNDTVVLVKAPGAKDAKVENQTGVRTDWRGYAVLP 748

Query: 692 LMIPHRENHLVLDTANSESETELQGNRQIIAPYRGAVSYVQFTTDQRKPWYIQALRPDGS 751
+REN + LDT +L + P RGA+ +F + L +
Sbjct: 749 YATEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAEFKARVGIKLLMT-LTHNNK 807

Query: 752 PLTFGYDVLDLQENNIGVVGQGSRLFIRVDEIPTGIKVALNDEQNLFCTITFQ 804
PL FG V + G+V ++++ + ++V +E+N C +Q
Sbjct: 808 PLPFGAMVTSESSQSSGIVADNGQVYLSGMPLAGKVQVKWGEEENAHCVANYQ 860


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z0033     INFPOTNTIATR  31     0.002    Macrophage infectivity potentiator signature.
		>INFPOTNTIATR#Macrophage infectivity potentiator signature.

Length = 233

Score = 30.7 bits (69), Expect = 0.002
Identities = 14/32 (43%), Positives = 19/32 (59%)

Query: 8 NSAVLVHFTLKLDDGTTAESTRNNGKPALFRL 39
+ V V +T L DGT +ST GKPA F++
Sbjct: 144 SDTVTVEYTGTLIDGTVFDSTEKAGKPATFQV 175


2     Z0066  Z0076  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z0066     -2      20       4.180411    23S rRNA/tRNA pseudouridine synthase A
Z0067     -3      20       4.041595    ATP-dependent helicase HepA
Z0068     -2      17       4.125743    DNA polymerase II
Z0069     -1      17       4.293128    L-ribulose-5-phosphate 4-epimerase
Z0070     -1      19       5.096664    L-arabinose isomerase
Z0072     0       17       4.776178    ribulokinase
Z0073     -1      15       2.896252    DNA-binding transcriptional regulator AraC
Z0074     0       15       3.321345    hypothetical protein
Z0075     1       16       3.128194    thiamine transporter ATP-binding subunit
Z0076     0       17       3.106654    thiamine transporter membrane protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z0073     PF05616       29     0.022    Neisseria meningitidis TspB protein
		>PF05616#Neisseria meningitidis TspB protein

Length = 501

Score = 28.9 bits (64), Expect = 0.022
Identities = 26/118 (22%), Positives = 47/118 (39%), Gaps = 21/118 (17%)

Query: 82 YGRHPEAREWYHQWVYFRPRAYWHEWLNWPSIFANTGFFRPDEAHQPHFSDLFGQ-IINA 140
Y R PE +E + R YW + N P ++ +F+ + +F G ++
Sbjct: 158 YSRFPEVKELMESQMERLARPYWEKLRNRPDMY----YFKNYNFKRCYFGLNGGDCLVAK 213

Query: 141 G-----------QGEGRYSELLAINLLEQLLLRRMEA-----INESLHPPMDNRVREA 182
G QG +Y E + LE++L +++A I + +P +V A
Sbjct: 214 GDDGRTFISFSLQGNSKYKEEMDAKKLEEILSLKVDANPDKYIKATGYPGYSEKVEVA 271


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z0076     PF06580       32     0.005    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 32.1 bits (73), Expect = 0.005
Identities = 17/80 (21%), Positives = 29/80 (36%), Gaps = 5/80 (6%)

Query: 4 RRQPLIPGWLIPGVSAATLVVAVALAAFLALWWNAPQGDWSAVWQDS-YLWHVVRFSFWQ 62
R GWL + L V A +W+ A +++W+ ++
Sbjct: 60 RSFIKRQGWLKLNMGQIILRVLPACVVIGMVWFVAN----TSIWRLLAFINTKPVAFTLP 115

Query: 63 AFLSALLSVVPAIFLARALY 82
LS + +VV F+ LY
Sbjct: 116 LALSIIFNVVVVTFMWSLLY 135


3     Z0145  Z0164  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z0145     3       16       -3.397102   3-methyl-2-oxobutanoate
Z0146     3       17       -4.134029   fimbrial protein
Z0147     4       18       -3.559846   fimbrial protein
Z0148     4       19       -3.295532   fimbrial protein
Z0149     3       16       -1.403636   fimbrial protein
Z0150     1       14       -0.725615   outer membrane usher protein
Z0151     0       14       0.473033    chaperone protein EcpD
Z0152     0       14       0.648942    fimbrial protein
Z0153     -1      15       1.947087    2-amino-4-hydroxy-6-
Z0154     0       14       3.385576    poly(A) polymerase
Z0155     -1      14       3.273936    glutamyl-Q tRNA(Asp) synthetase
Z0156     0       11       2.539494    RNA polymerase-binding transcription factor
Z0157     0       12       2.864265    sugar fermentation stimulation protein A
Z0158     -1      13       3.253249    2'-5' RNA ligase
Z0159     -1      16       4.168028    ATP-dependent RNA helicase HrpB
Z0160     -2      17       3.928044    penicillin-binding protein 1b
Z0161     0       15       3.857092    ferrichrome outer membrane transporter
Z0162     0       17       4.871691    iron-hydroxamate transporter ATP-binding
Z0163     1       16       4.536862    iron-hydroxamate transporter substrate-binding
Z0164     0       15       4.295704    iron-hydroxamate transporter permease subunit
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z0150     PF00577       740    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 740 bits (1912), Expect = 0.0
Identities = 262/889 (29%), Positives = 425/889 (47%), Gaps = 45/889 (5%)

Query: 1 MYQFTHQKSRIPKKTLLA-----ACCALFYSSNGAAADTVEYDSSFLMGTGASTIDVKRY 55
+YQ Q I K L F + ++ + ++ FL + D+ R+
Sbjct: 8 LYQRNTQCLHIRKHRLAGFFVRLFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRF 67

Query: 56 AQGNPTPPGLYNVRVFVNGQATSSLEIPFV-DIGENSAAACLTHKNLAQLHIKQPEQPVT 114
G PPG Y V +++N ++ ++ F E CLT LA + +
Sbjct: 68 ENGQELPPGTYRVDIYLNNGYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLNTASVSGM 127

Query: 115 LLAREGEEEDCLDLAKSYEKADVCFDGSDQFLDLTIPQAYVLKSYGGYVDPSLWESGINA 174
L ++ C+ L A D Q L+LTIPQA++ GY+ P LW+ GINA
Sbjct: 128 NLL---ADDACVPLTSMIHDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINA 184

Query: 175 ATLAYTLNAYHTSSDND-NSDSVYGAFNSGINLGAWHFRARGNYNWTTDNGS-----DFD 228
L Y + + NS Y SG+N+GAW R +++ + + S +
Sbjct: 185 GLLNYNFSGNSVQNRIGGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQ 244

Query: 229 FQDRYLQRDIPAIRSQIIMGDAYTTGETFDSVNVRGVRLYSDSRMLPSALASYAPTIRGV 288
+ +L+RDI +RS++ +GD YT G+ FD +N RG +L SD MLP + +AP I G+
Sbjct: 245 HINTWLERDIIPLRSRLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGI 304

Query: 289 ANSNAKVTVTQSGYKIYETTVPPGEFVIDDISPSGFGSELVVTIEEADGSKRTFTHPFSS 348
A A+VT+ Q+GY IY +TVPPG F I+DI +G +L VTI+EADGS + FT P+SS
Sbjct: 305 ARGTAQVTIKQNGYDIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSS 364

Query: 349 VVQMQRPGVGRWDFSAGKV-IDDSLRSEPNMGQASYYYGLNNLFTGYTGIQFTDNNYLAG 407
V +QR G R+ +AG+ ++ + +P Q++ +GL +T Y G Q D Y A
Sbjct: 365 VPLLQREGHTRYSITAGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLAD-RYRAF 423

Query: 408 LLGVGINT-SIGAFAVDVTHSRAEIPDDKTYQGQSYRVTWNKLFQDTGTSFNLAAYRYST 466
G+G N ++GA +VD+T + + +PDD + GQS R +NK ++GT+ L YRYST
Sbjct: 424 NFGIGKNMGALGALSVDMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYST 483

Query: 467 QDYLGLHDALVLIDDAKHL--------SADEDKNTMQTYSRMKNQFTVSINQPLNIAYED 518
Y D + ++ + + + + +++ Q L
Sbjct: 484 SGYFNFADTTYSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLG----R 539

Query: 519 YGSLFISGSWTYYWAANNSRTEYNVGYSKSVSWGSFSVNLQRSWNE-DGEKDDAMYVSVS 577
+L++SGS YW +N ++ G + + +++++ + N +D + ++V+
Sbjct: 540 TSTLYLSGSHQTYWGTSNVDEQFQAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVN 599

Query: 578 VPIENILGGKRKSS-GFRNLNTQLNTDFDGSHQLNVNSSGNT-ENNLVNYSVNAGYSLDK 635
+P + L KS + + ++ D +G G E+N ++YSV GY+
Sbjct: 600 IPFSHWLRSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGG 659

Query: 636 NAGDLASVGGYLNYESGLGGISASASATSDNSQQYSISTDGGFVLHSGGLTFTNNSFSSN 695
+ ++ LNY G G + S SD+ +Q GG + H+ G+T N
Sbjct: 660 DGNSGSTGYATLNYRGGYGNANIGYSH-SDDIKQLYYGVSGGVLAHANGVTLGQP---LN 715

Query: 696 DTLVLINALGAKGARINNSNN-EIDRWGYAVTSSVSPYRENRVGLNIETLENDVELKSTS 754
DT+VL+ A GAK A++ N D GYAV + YRENRV L+ TL ++V+L +
Sbjct: 716 DTVVLVKAPGAKDAKVENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAV 775

Query: 755 ATTVPRSGSVVLTRFETDEGRSAVLNITAANGKSIPFAAEVYQGE-VMIGSMGQGGQAFV 813
A VP G++V F+ G ++ +T N K +PF A V G + GQ ++
Sbjct: 776 ANVVPTRGAIVRAEFKARVGIKLLMTLT-HNNKPLPFGAMVTSESSQSSGIVADNGQVYL 834

Query: 814 RGINDSGELIVRWYENNQTIDCKLHYQFPAQPQTQGSTNTLLLNNLTCQ 862
G+ +G++ V+W E C +YQ P + Q Q L + C+
Sbjct: 835 SGMPLAGKVQVKWGEEENAH-CVANYQLPPESQQQL----LTQLSAECR 878


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z0163     FERRIBNDNGPP  511    0.0      Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 511 bits (1316), Expect = 0.0
Identities = 294/296 (99%), Positives = 295/296 (99%)

Query: 1 MSGLPLISRRRLLTAMALSPLLWQMNTAHAAAIDPNRIVALEWLPVELLLALGIVPYGVA 60
MSGLPLISRRRLLTAMALSPLLWQMNTAHAAAIDPNRIVALEWLPVELLLALGIVPYGVA
Sbjct: 1 MSGLPLISRRRLLTAMALSPLLWQMNTAHAAAIDPNRIVALEWLPVELLLALGIVPYGVA 60

Query: 61 DTINYRLWVSEPPLPESVIDVGLRTEPNLELLTEMKPSFMVWSAGYGPSPEMLARIAPGR 120
DTINYRLWVSEPPLP+SVIDVGLRTEPNLELLTEMKPSFMVWSAGYGPSPEMLARIAPGR
Sbjct: 61 DTINYRLWVSEPPLPDSVIDVGLRTEPNLELLTEMKPSFMVWSAGYGPSPEMLARIAPGR 120

Query: 121 GFNFSDGKQPLAMARKSLTEMADLLNLQSAAETHLAQYEDFIRSMNPRFVKRGARPLLLT 180
GFNFSDGKQPLAMARKSLTEMADLLNLQSAAETHLAQYEDFIRSM PRFVKRGARPLLLT
Sbjct: 121 GFNFSDGKQPLAMARKSLTEMADLLNLQSAAETHLAQYEDFIRSMKPRFVKRGARPLLLT 180

Query: 181 TLIDPRHMLVFGPNSLFQEILDEYGIPNAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDH 240
TLIDPRHMLVFGPNSLFQEILDEYGIPNAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDH
Sbjct: 181 TLIDPRHMLVFGPNSLFQEILDEYGIPNAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDH 240

Query: 241 DNSKDMDALMATPLWQAMPFVRAGRFQRVPAVWFYGATLSAMHFVRVLDNAIGGKA 296
DNSKDMDALMATPLWQAMPFVRAGRFQRVPAVWFYGATLSAMHFVRVLDNAIGGKA
Sbjct: 241 DNSKDMDALMATPLWQAMPFVRAGRFQRVPAVWFYGATLSAMHFVRVLDNAIGGKA 296


4     Z0235  Z0291  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z0235     -2      20       -3.566536   membrane-bound lytic murein transglycosylase D
Z0236     -2      26       -6.569821   hydroxyacylglutathione hydrolase
Z0237     -1      25       -7.033838   hypothetical protein
Z0239     1       30       -9.688942   ribonuclease H
Z0241     1       35       -11.980449  DNA polymerase III subunit epsilon
Z0243     5       47       -16.060974  *hypothetical protein
Z0244     2       33       -9.802537   hypothetical protein
Z0245     0       17       -1.527252   hypothetical protein
Z0246     0       17       0.444598    hypothetical protein
Z0247     1       17       2.111037    hypothetical protein
Z0248     1       19       3.855778    hypothetical protein
Z0249     2       23       6.189883    hypothetical protein
Z0250     1       25       6.416982    macrophage toxin
Z0251     0       25       6.572669    hypothetical protein
Z0252     -1      27       6.051488    hypothetical protein
Z0253     0       26       6.106516    hypothetical protein
Z0254     1       22       5.138445    protease
Z0255     1       18       2.973832    hypothetical protein
Z0256     2       19       3.013170    hypothetical protein
Z0257     1       18       2.210577    hypothetical protein
Z0258     2       19       2.228509    hypothetical protein
Z0259     1       20       1.228585    hypothetical protein
Z0260     1       21       0.015786    hypothetical protein
Z0261     1       20       -0.636637   hypothetical protein
Z0262     1       19       1.315968    hypothetical protein
Z0263     3       34       6.119477    hypothetical protein
Z0264     3       29       4.865609    hypothetical protein
Z0265     2       28       3.309348    hypothetical protein
Z0266     3       29       3.942658    hypothetical protein
Z0267     3       30       3.605092    hypothetical protein
Z0268     4       30       2.593944    hypothetical protein
Z0269     3       29       -7.151241   hypothetical protein
Z0271     1       26       -4.853574   hypothetical protein
Z0272     0       25       -3.276445   hypothetical protein
Z0273     -1      16       -2.342561   hypothetical protein
Z0274     -1      15       -1.584984   hypothetical protein
Z0275     -2      15       -0.321834   hypothetical protein
Z0276     -2      18       2.189040    hypothetical protein
Z0277     -2      16       1.525096    C-lysozyme inhibitor
Z0278     -1      17       1.503843    acyl-CoA dehydrogenase
Z0280     2       17       -1.028087   phosphoheptose isomerase
Z0281     2       17       -1.613213   amidotransferase
Z0282     0       22       1.595556    hypothetical protein
Z0284     1       25       2.242118    hypothetical protein
Z0285     1       24       2.797122    damage-inducible protein J
Z0287     1       22       2.372235    lipoprotein
Z0288     2       20       1.652249    hypothetical protein
Z0290     2       19       1.273371    flagellar biosynthesis
Z0291     2       18       -1.881382   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z0236     BINARYTOXINB  34     4e-04    Binary toxin B family signature.
		>BINARYTOXINB#Binary toxin B family signature.

Length = 764

Score = 34.3 bits (78), Expect = 4e-04
Identities = 12/55 (21%), Positives = 28/55 (50%), Gaps = 4/55 (7%)

Query: 186 NDYYRKVKELRAKNQITLPVILKNERQINVFLRT----EDIDLINVINEETLLQQ 236
+ ++ EL A N T+ +K ++N+ +R D + I V +E+++++
Sbjct: 589 QNIKNQLAELNATNIYTVLDKIKLNAKMNILIRDKRFHYDRNNIAVGADESVVKE 643


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z0268     OUTRSURFACE   39     1e-04    Outer surface protein signature.
		>OUTRSURFACE#Outer surface protein signature.

Length = 273

Score = 38.7 bits (90), Expect = 1e-04
Identities = 42/199 (21%), Positives = 75/199 (37%), Gaps = 38/199 (19%)

Query: 395 RVTITDSLNRR--EVLYTEGEGGLKRVVKKEHADGSITRSEYDEAGRL--KAQTDAAGRR 450
++TI D L++ E+ +G+ + R V + D + T ++E G L K T G +
Sbjct: 87 KLTIADDLSKTTFELFKEDGKTLVSRKVSSK--DKTSTDEMFNEKGELSAKTMTRENGTK 144

Query: 451 TEYSLHMASGAVTAVTGPDGRTVRYGYNSQRQVTSVTYPDGLRSSREYDEKGRLAAETSR 510
EY+ + G A T+ + + ++ +G + L+ E ++
Sbjct: 145 LEYTEMKSDGTGKAKEVLKNFTLEGKVANDK--VTLEVKEGTVT---------LSKEIAK 193

Query: 511 SGETTRYSYDDPASELPTGIQDATGSTKQMA-WSRYGQLLTFTDCSGYTTRYEYDRYGQQ 569
SGE T D + T +TK+ W LT + S TT+
Sbjct: 194 SGEVTVALND----------TNTTQATKKTGAWDSKTSTLTISVNSKKTTQL-------- 235

Query: 570 IAVHREEGISTYSSYNPRG 588
V ++ T Y+ G
Sbjct: 236 --VFTKQDTITVQKYDSAG 252



Score = 33.0 bits (75), Expect = 0.007
Identities = 30/139 (21%), Positives = 52/139 (37%), Gaps = 39/139 (28%)

Query: 549 LTFTDCSGYTTRYEYDRYGQQIA---VHREEGISTYSSYNPRGQLVSQKDAQGRETRYEY 605
LT D TT + G+ + V ++ ST +N +G+L ++ + T+ EY
Sbjct: 88 LTIADDLSKTTFELFKEDGKTLVSRKVSSKDKTSTDEMFNEKGELSAKTMTRENGTKLEY 147

Query: 606 SAAGDLTAIVAPDGSRSEIQYDAWGKAVST-------------------TQGGLTRSMGY 646
+E++ D GKA +G +T S
Sbjct: 148 ----------------TEMKSDGTGKAKEVLKNFTLEGKVANDKVTLEVKEGTVTLSKEI 191

Query: 647 DAAGRITV-LTNENGSQST 664
+G +TV L + N +Q+T
Sbjct: 192 AKSGEVTVALNDTNTTQAT 210


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z0284     ENTSNTHTASED  26     0.008    Enterobactin synthetase component D signature.
		>ENTSNTHTASED#Enterobactin synthetase component D signature.

Length = 234

Score = 26.1 bits (57), Expect = 0.008
Identities = 6/23 (26%), Positives = 10/23 (43%)

Query: 20 AVYKDHPLQGSWKGYRDAHVEPD 42
+VYK + + G+ A V
Sbjct: 153 SVYKAFSDRVTLPGFNSAKVTSL 175


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z0291     OMPADOMAIN    39     9e-06    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 39.1 bits (91), Expect = 9e-06
Identities = 30/118 (25%), Positives = 46/118 (38%), Gaps = 22/118 (18%)

Query: 121 FERGSAQIMPFFKTLLVELAPVFDSL---DNKIIITGHTDAM---AYKNNIYNNWNLSGD 174
F A + P + L +L +L D +++ G+TD + AY N LS
Sbjct: 223 FNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAY------NQGLSER 276

Query: 175 RALSARRVLEEAGMPEDKVMQVS-----AMADQMLLDAKNPQS-----AGNRRIEIMV 222
RA S L G+P DK+ + + K + A +RR+EI V
Sbjct: 277 RAQSVVDYLISKGIPADKISARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEV 334


5     Z0304  Z0409  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z0304     0       22       -3.079038   gamma-glutamyl phosphate reductase
Z0307     4       33       -4.729409   *integrase for prophage CP-933H
Z0308     4       31       -5.060831   hypothetical protein
Z0309     4       28       -3.470823   cI repressor protein for prophage CP-933H
Z0310     2       24       -3.567074   cII antiterminator protein for prophage CP-933H
Z0311     2       23       -3.527481   hypothetical protein
Z0312     1       21       -2.907483   hypothetical protein
Z0313     1       26       -3.412520   hypothetical protein
Z0314     1       30       -4.300450   hypothetical protein
Z0315     3       36       -7.162382   hypothetical protein
Z0316     2       34       -6.781910   hypothetical protein
Z0317     4       36       -7.393696   tail fiber protein from prophage CP-933H
Z0318     6       39       -8.612509   DNA invertase from prophage CP-933H
Z0319     6       43       -11.420286  hypothetical protein
Z0321     7       47       -13.029804  AraC family transcriptional regulator
Z0322     6       47       -14.004067  hypothetical protein
Z0324     8       51       -16.620111  integrase for prophage CP-933I
Z0325     9       57       -20.538453  hypothetical protein
Z0326     6       54       -19.596688  hypothetical protein
Z0327     5       47       -15.855081  hypothetical protein
Z0328     3       36       -12.108814  hypothetical protein
Z0330     3       32       -10.148888  hypothetical protein
Z0331     2       24       -5.280826   hypothetical protein
Z0332     0       22       -1.107949   activator encoded in prophage CP-933I
Z0333     1       22       -0.332790   polarity suppression protein encoded in CP-933I
Z0334     1       21       0.252307    capsid morphogenesis protein encoded in CP-933I
Z0335     1       28       -3.663809   hypothetical protein
Z0336     1       27       -2.860740   regulatory protein encoded in prophage CP-933I
Z0337     1       27       -3.077743   regulator encoded in prophage CP-933I
Z0338     1       27       -4.224304   hypothetical protein
Z0339     1       28       -5.145505   alpha replication protein of prophage CP-933I
Z0340     2       31       -7.121394   hypothetical protein
Z0342     -1      18       -1.522382   LysR family transcriptional regulator
Z0343     -1      14       0.483995    oxidoreductase
Z0344     -1      17       2.547736    hypothetical protein
Z0345     -1      18       5.268627    hypothetical protein
Z0346     -1      20       6.342941    LysR family transcriptional regulator
Z0347     -2      20       6.566702    hypothetical protein
Z0348     -1      20       6.064651    transcriptional regulator
Z0349     -1      20       5.739943    hypothetical protein
Z0350     -1      19       4.659124    hypothetical protein
Z0351     -1      17       1.647675    hypothetical protein
Z0352     0       18       1.223265    xanthine dehydrogenase iron-sulfur-binding
Z0353     0       19       1.239646    hypothetical protein
Z0354     1       21       1.964949    ferredoxin
Z0356     2       20       0.879378    hypothetical protein
Z0357     1       20       0.464841    receptor
Z0358     1       21       0.349857    hypothetical protein
Z0359     3       22       -3.415111   hypothetical protein
Z0360     4       24       -4.857216   hypothetical protein
Z0361     4       25       -4.613084   regulator
Z0362     2       23       2.879747    hypothetical protein
Z0363     1       18       0.560256    hypothetical protein
Z0364     0       19       -0.468550   50S ribosomal protein L31
Z0365     0       21       -1.731308   hypothetical protein
Z0366     0       22       -2.648735   hypothetical protein
Z0367     0       24       -3.535160   hypothetical protein
Z0369     1       33       -7.061011   oxidoreductase
Z0370     -1      22       -2.306822   hypothetical protein
Z0371     -1      22       -2.546405   LysR family transcriptional regulator
Z0372     -1      20       -2.427983   hypothetical protein
Z0373     -1      19       -2.213689   hypothetical protein
Z0374     -1      20       -2.530637   hypothetical protein
Z0375     -1      21       -3.228642   adhesin
Z0376     1       29       -6.483824   AraC family transcriptional regulator
Z0377     1       28       -5.318601   dehydrogenase
Z0378     1       24       -3.729241   hypothetical protein
Z0380     0       22       -2.814393   hypothetical protein
Z0381     -1      21       -2.394646   pyridine nucleotide-disulfide oxidoreductase
Z0382     0       19       -2.640808   AraC family transcriptional regulator
Z0384     1       19       -2.990497   dehydrogenase subunit
Z0385     1       25       -4.967043   hypothetical protein
Z0386     2       29       -6.307005   hypothetical protein
Z0387     3       34       -7.766755   hypothetical protein
Z0388     3       34       -7.389932   hypothetical protein
Z0389     3       35       -6.055321   hypothetical protein
Z0390     3       37       -6.877934   hypothetical protein
Z0391     4       34       -5.652608   hypothetical protein
Z0392     0       18       0.283201    hypothetical protein
Z0393     1       21       3.450370    hypothetical protein
Z0394     1       21       3.838177    hypothetical protein
Z0395     0       23       4.110120    hypothetical protein
Z0397     -2      14       2.200839    hypothetical protein
Z0398     -2      13       1.585336    choline dehydrogenase
Z0399     -2      11       0.670151    betaine aldehyde dehydrogenase
Z0400     -2      10       -0.650869   transcriptional regulator BetI
Z0401     -1      9        -0.935692   choline transport protein BetT
Z0402     0       14       -2.713746   beta-barrel outer membrane protein
Z0403     2       21       -3.878309   hypothetical protein
Z0404     0       16       -0.437492   hypothetical protein
Z0405     1       17       1.685545    hypothetical protein
Z0406     1       18       2.728614    hypothetical protein
Z0407     1       20       3.132683    transcription factor
Z0408     1       20       3.323203    hypothetical protein
Z0409     0       21       3.301867    oxidoreductase subunit
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z0343     DHBDHDRGNASE  82     6e-21    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 82.4 bits (203), Expect = 6e-21
Identities = 56/190 (29%), Positives = 88/190 (46%), Gaps = 2/190 (1%)

Query: 3 KVILITGASSGIGEGIARELGMTGAKVLLGARRVERIEAIATEICRAGGIAKARELDVTD 62
K+ ITGA+ GIGE +AR L GA + E++E + + + A+A DV D
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRD 68

Query: 63 RQSMADFVQAALDSWGRVDVLINNAGVMPLSPLAAGKQDEWALTIDVNIKGVLWGIGAVL 122
++ + G +D+L+N AGV+ + + +EW T VN GV +V
Sbjct: 69 SAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVS 128

Query: 123 PVMEAQGSGQIINLGSIGALSVVPTGAVYCASKFAVR--AISDGLRQESSKIRVTCVNPG 180
M + SG I+ +GS A + A Y +SK A GL IR V+PG
Sbjct: 129 KYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSPG 188

Query: 181 VVESELASTI 190
E+++ ++
Sbjct: 189 STETDMQWSL 198


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z0348     TCRTETB       34     0.001    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 34.1 bits (78), Expect = 0.001
Identities = 36/194 (18%), Positives = 73/194 (37%), Gaps = 8/194 (4%)

Query: 2 STLSQTPSPHHQHAYWGGIFAMTLCVFVLIASEFMPVSLLTPIARDLGVTEGLAGRGIAI 61
++ SQ+ H+Q W I L F ++ + VSL IA D
Sbjct: 3 TSYSQSNLRHNQILIWLCI----LSFFSVLNEMVLNVSLPD-IANDFNKPPASTNWVNTA 57

Query: 62 SGALAVLTSLTLSTLAGKMNRKFLLLGMTVLMAVSGLIIALATSYLMYMV-GRAMIGVAI 120
+ + L+ ++ K LLL ++ +I + S+ ++ R + G
Sbjct: 58 FMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGA 117

Query: 121 GGFWSMSAATAIRLVPQHQVTRALAIFNAGNALATVVAAPLGSYLGATVGWRGAFLCLVP 180
F ++ R +P+ +A + + A+ V +G + + W ++L L+P
Sbjct: 118 AAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHW--SYLLLIP 175

Query: 181 MAVVAFIWQCISLP 194
M + + + L
Sbjct: 176 MITIITVPFLMKLL 189


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z0358     PF00577       63     4e-12    Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 62.9 bits (153), Expect = 4e-12
Identities = 30/247 (12%), Positives = 72/247 (29%), Gaps = 23/247 (9%)

Query: 487 TLNLNSLWSKLGTFSISYNDDRRYNSHYYTADYYQNVYSGTFGSLGLRAGIQRYNNGDSN 546
L + + T +S + Y + +Q + F + N
Sbjct: 530 QLTVTQQLGRTSTLYLSG-SHQTYWGTSNVDEQFQAGLNTAFEDINWTLSYSLTKNAWQK 588

Query: 547 ANTGKYIALDLSLPLGNWFSAGMTHQNGYTMANLSARKQFDEGT------------IRTV 594
+ +AL++++P +W + Q + A+ S + +
Sbjct: 589 -GRDQMLALNVNIPFSHWLRSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNL 647

Query: 595 GANLSRAISGDTGDDKTLSGGAYAQFDARYASGTLNVNSAADGYVNTNLTANGSVGWQGK 654
++ +G + +G A + Y + + S +D +G V
Sbjct: 648 SYSVQTGYAGGGDGNSGSTGYATLNYRGGYGNANIGY-SHSDDIKQLYYGVSGGVLAHAN 706

Query: 655 NIAASGRTDGNAGVIFNTGLED---DGQISAKINGRIFPLNGKRNYLPLSPYGRYEVELQ 711
+ + ++ G +D + Q + + R G + Y V L
Sbjct: 707 GVTLGQPLNDTVVLVKAPGAKDAKVENQTGVRTDWR-----GYAVLPYATEYRENRVALD 761

Query: 712 NSKNSLD 718
+ + +
Sbjct: 762 TNTLADN 768


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z0375     INTIMIN       549    e-178    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 549 bits (1415), Expect = e-178
Identities = 226/818 (27%), Positives = 357/818 (43%), Gaps = 49/818 (5%)

Query: 41 PVMAARAQHAVQPRLSMGNTTVTADNNVEKNVASFAANAGTFLSSQPDS-----DATRNF 95
P++AA +L+ + VT N + ++AA L SQ S D ++
Sbjct: 131 PLVAAGGVAGHTNKLTKMSPDVTKSNMTDDKALNYAAQQAASLGSQLQSRSLNGDYAKDT 190

Query: 96 ITGMATAKANQEIQEWLGKYGTARVKLNVDKDFSLKDSSLEMLYPIYDTPTNMLFTQGAI 155
G+A +A+ ++Q WL YGTA V L +F SSL+ L P YD+ + F Q
Sbjct: 191 ALGIAGNQASSQLQAWLQHYGTAEVNLQSGNNFD--GSSLDFLLPFYDSEKMLAFGQVGA 248

Query: 156 HRTDDRTQSNIGFGWRHFSGNDWMAGVNTFIDHDLSRSHTRIGVGAEYWRDYLKLSANGY 215
D R +N+G G R F M G N FID D S +TR+G+G EYWRDY K S NGY
Sbjct: 249 RYIDSRFTANLGAGQRFFLPE-NMLGYNVFIDQDFSGDNTRLGIGGEYWRDYFKSSVNGY 307

Query: 216 IRASGWKKSPDIEDYQERPANGWDIRAEGYLPAWPQLGASLMYEQYYGDEVGLFGKDKRQ 275
R SGW +S + +DY ERPANG+DIR GYLP++P LGA LMYEQYYGD V LF DK Q
Sbjct: 308 FRMSGWHESYNKKDYDERPANGFDIRFNGYLPSYPALGAKLMYEQYYGDNVALFNSDKLQ 367

Query: 276 KDPHAISAEVTYTPVPLLTLSAGHKQGKSGENDTRFGLEVNYRIGEPLAKQLDTDSIRER 335
+P A + V YTP+PL+T+ ++ G END + ++ Y+ +P ++Q++ + E
Sbjct: 368 SNPGAATVGVNYTPIPLVTMGIDYRHGTGNENDLLYSMQFRYQFDKPWSQQIEPQYVNEL 427

Query: 336 RVLAGSRYDLVERNNNIVLEYRKSEVIRIALPERIEGKGGQTLSLGLVVSKATHGLKNVQ 395
R L+GSRYDLV+RNNNI+LEY+K +++ + +P I G T + L+V K+ +GL +
Sbjct: 428 RTLSGSRYDLVQRNNNIILEYKKQDILSLNIPHDINGTERSTQKIQLIV-KSKYGLDRIV 486

Query: 396 WEAPSLLAEGGKITGQGSQ----WQVTLPAYRPGKDNYYAISAVAYDNKGNTSKRVQTEV 451
W+ +L ++GG+I GSQ +Q LPAY G N Y ++A AYD GN+S V +
Sbjct: 487 WDDSALRSQGGQIQHSGSQSAQDYQAILPAYVQGGSNVYKVTARAYDRNGNSSNNVLLTI 546

Query: 452 VITGAGMSADRTALTLDGQSRIQMLANGNEQKPLVLSLRDAEGQPVTGMKDQIKTELTFK 511
+ G D+ +T + A+G E +++ Q ++F
Sbjct: 547 TVLSNGQVVDQVGVTDFTADKTSAKADGTEAITYTATVKKNG-------VAQANVPVSFN 599

Query: 512 PAGNIVTRSLKATKSQAKPTLGEFTETEAGVYQSVFTTGTQSGEATITVSVDGMSKTVTA 571
S + + G + G+ ++ M+ + A
Sbjct: 600 IVSGTAVLSANSANTNGS-----------GKATVTLKSDK-PGQVVVSAKTAEMTSALNA 647

Query: 572 ELRATMMDVANSTLSANEPSGDVVADGQQAYTLTLTAVDSEGNPVTGEASRLRFVPQDTN 631
+ S VA+GQ A T T+ V PV+ + T
Sbjct: 648 NAVIFVDQTKASITEIKADKTTAVANGQDAITYTVK-VMKGDKPVSNQEVTF-----TTT 701

Query: 632 GVTVGAIS--EIKPGVYSAAVSSTRAGNVVVRAFSEQYQLGTLQQTLKFVAGP-LDAAHS 688
+ + G ++ST G +V A + ++F +D +
Sbjct: 702 LGKLSNSTEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFFTTLTIDDGNI 761

Query: 689 SITLNPDKPVVGGTVTAIWTVKDAYDNPVTSLTPE---APSLAGAAAEGSTASGWTNNGD 745
I V G + +W + + + + A+ +++ T
Sbjct: 762 EIVGTG----VKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPAIASVDASSGQVTLKEK 817

Query: 746 GTWTAQITLGSTAGELEVMPKLNGQNAAANAAKVTVVADALSSNQSKVSVAEDHVKAGES 805
GT T + + N N +K DA+++ ++ E+
Sbjct: 818 GTTTISVISSDNQTATYTIATPNSL-IVPNMSKRVTYNDAVNTCKNFGGKLPSSQNELEN 876

Query: 806 TTVTLVAKDAHGNAISGLALSASLTGTASEGATVSSWT 843
A + + S + + + TA + + + T
Sbjct: 877 VFKAWGAANKYEYYKSSQTIISWVQQTAQDAKSGVAST 914



Score = 74.0 bits (181), Expect = 4e-15
Identities = 76/421 (18%), Positives = 140/421 (33%), Gaps = 45/421 (10%)

Query: 976 NGQNAVAQPLVLNVAGDAS-KAEIRDMTVKVNNQLANGQSANQITLTVVDTYGNPLQGQE 1034
N N V + + G + + D T + A+G A T TV G
Sbjct: 537 NSSNNVLLTITVLSNGQVVDQVGVTDFTADKTSAKADGTEAITYTATVKKN-GVAQANVP 595

Query: 1035 VTLTLPQGVTSKTGNTVTTNAAGKADIELMSTVAGEHNISASVNGAQKTV---TVKFNAD 1091
V+ + G + N+ TN +GKA + L S G+ +SA + V F
Sbjct: 596 VSFNIVSGTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAKTAEMTSALNANAVIFVDQ 655

Query: 1092 ASTGQANLQVDAAAQKVANGKDAFTLTANVEDKNGNPVPGSLVTFNLPRGVKPLTGDNVW 1151
++ D VANG+DA T T V K PV VTF G + +
Sbjct: 656 TKASITEIKADKTTA-VANGQDAITYTVKV-MKGDKPVSNQEVTFTTTLGKLSNSTE--- 710

Query: 1152 VKANDEGKAELQVVSVTAGTYEITASAGNSQPSNTQTITFVADKATATVSGIEVIGNYAL 1211
K + G A++ + S T G ++A + T IE++G
Sbjct: 711 -KTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFFTTLTIDDGNIEIVGT--- 766

Query: 1212 ADGNAKQTYKVTVTDANNNLLK---DSEVTLTASPANLVLTPNGTAKTNEQGQAIFTATT 1268
G + V + NL + + T ++ + + + + + T +
Sbjct: 767 --GVKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPAIASVDASSGQVTLKEKGTTTISV 824

Query: 1269 TVAAKYTLTAKVSQADGQESTKTAESKFVADDTNAVLTASSDVTSLVADGISTAKLEVTL 1328
+ T T + N+++ + D ++T K
Sbjct: 825 ISSDNQTAT------------------YTIATPNSLIVPNMSKRVTYNDAVNTCKNFGGK 866

Query: 1329 MSANNPVGGNMWVDIKTPEGVTEKDYQFLPSKNDHFVSGKITRTFSTSKPGVYTFTFNAL 1388
+ ++ N++ Y++ S + + +T +K GV + T++ +
Sbjct: 867 LPSSQNELENVFKAWGAANK-----YEYYKSSQT--IISWVQQTAQDAKSGVAS-TYDLV 918

Query: 1389 T 1389

Sbjct: 919 K 919



Score = 74.0 bits (181), Expect = 4e-15
Identities = 72/347 (20%), Positives = 114/347 (32%), Gaps = 46/347 (13%)

Query: 905 KTTTELTFTVK----DAYGNPVTGLKPDAPVFSGAASTGSERPSAGNWTEKGNGVYVSTL 960
T TVK PV+ + SG A SA + G+G TL
Sbjct: 575 TEAITYTATVKKNGVAQANVPVSFN-----IVSGTAV-----LSANSANTNGSGKATVTL 624

Query: 961 TLGSAAGQLSVMPRVNGQNAVAQPLVLNVAGDASKAEIRDMTVKVNNQLANGQSANQITL 1020
+ +A+ V+ V D +KA I ++ +ANGQ A IT
Sbjct: 625 KSDKPGQVVVSAKTAEMTSALNANAVIFV--DQTKASITEIKADKTTAVANGQDA--ITY 680

Query: 1021 TVVDTY-GNPLQGQEVTLTLPQGVTSKTGNTVTTNAAGKADIELMSTVAGEHNISASVNG 1079
TV P+ QEVT T G S + T T+ G A + L ST G+ +SA V+
Sbjct: 681 TVKVMKGDKPVSNQEVTFTTTLGKLSNS--TEKTDTNGYAKVTLTSTTPGKSLVSARVSD 738

Query: 1080 AQ---KTVTVKFNADASTGQANLQVDAAAQKVANGKDAFTLTANVEDKNGN-PVPGSLVT 1135
K V+F + N+++ V G T ++ N G
Sbjct: 739 VAVDVKAPEVEFFTTLTIDDGNIEI------VGTGVKGKLPTVWLQYGQVNLKASGGNGK 792

Query: 1136 FNLPRGVKPLTGDNVWVKANDEGKAELQVVSVTAGTYEITASAGNSQPSNTQTITFVADK 1195
+ N + + D QV GT I+ + ++Q T+
Sbjct: 793 YT-------WRSANPAIASVDASSG--QVTLKEKGTTTISVISSDNQT-----ATYTIAT 838

Query: 1196 ATATVSGIEVIGNYALADGNAKQTYKVTVTDANNNLLKDSEVTLTAS 1242
+ + + D ++ N L++ A+
Sbjct: 839 PNSLIV-PNMSKRVTYNDAVNTCKNFGGKLPSSQNELENVFKAWGAA 884



Score = 51.2 bits (122), Expect = 3e-08
Identities = 71/405 (17%), Positives = 126/405 (31%), Gaps = 56/405 (13%)

Query: 768 NGQNAAANAAKVTVVADALSSNQSKV---SVAEDHVKAGESTTVTLVA------KDAHGN 818
NG ++ +TV+++ +Q V + + KA + +T A
Sbjct: 535 NGNSSNNVLLTITVLSNGQVVDQVGVTDFTADKTSAKADGTEAITYTATVKKNGVAQANV 594

Query: 819 AISGLALSASLTGTASEGATVSSWTEKGNGSYVATLTTGGKTGELRVMPLFNGQPAATEA 878
+S +S GTA A +S G+G TL + + A
Sbjct: 595 PVSFNIVS----GTAVLSA--NSANTNGSGKATVTLKSDKPGQVVVSAKTAEMTSALN-- 646

Query: 879 AQLTVIAGEMSSANSTLVADNKAPTVKTTTELTFTVKDAY-GNPVTGLKPDAPVFSGAAS 937
A + + ++ + + AD +T+TVK PV+ + +
Sbjct: 647 ANAVIFVDQTKASITEIKADKTTAVANGQDAITYTVKVMKGDKPVSNQEV-------TFT 699

Query: 938 TGSERPSAGNWTEKGNGVYVSTLTLGSAAGQLSVMPRVNGQN-AVAQPLVLNVAG---DA 993
T + S NG TLT + G+ V RV+ V P V D
Sbjct: 700 TTLGKLSNSTEKTDTNGYAKVTLT-STTPGKSLVSARVSDVAVDVKAPEVEFFTTLTIDD 758

Query: 994 SKAEIRDMTVK---VNNQLANGQSANQI-------TLTVVDTYGNPLQGQEVTLTLPQGV 1043
EI VK L GQ + T + + +TL
Sbjct: 759 GNIEIVGTGVKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPAIASVDASSGQVTL---- 814

Query: 1044 TSKTGNTVT---------TNAAGKADIELMSTVAGEHNISASVNGAQKTVTVKFNADAST 1094
K T++ T + ++ ++ + +VN + +
Sbjct: 815 KEKGTTTISVISSDNQTATYTIATPNSLIVPNMSKRVTYNDAVNTCKNFGGKL--PSSQN 872

Query: 1095 GQANLQVD-AAAQKVANGKDAFTLTANVEDKNGNPVPGSLVTFNL 1138
N+ AA K K + T+ + V+ + G T++L
Sbjct: 873 ELENVFKAWGAANKYEYYKSSQTIISWVQQTAQDAKSGVASTYDL 917



Score = 44.3 bits (104), Expect = 4e-06
Identities = 53/303 (17%), Positives = 91/303 (30%), Gaps = 63/303 (20%)

Query: 1035 VTLTLPQGVTSKTGNTVTTNAAGKADIELMSTVAGEHNISASVNGAQKTVTVKFNADAST 1094
++L +P + +T K+ L V + + + G Q ++ + S
Sbjct: 454 LSLNIPHDINGTERSTQKIQLIVKSKYGLDRIVWDDSALRSQ--GGQ----IQHSGSQSA 507

Query: 1095 GQANLQVDAAAQKVANGKDAFTLTANVEDKNGNPVPGSLVTFNLPRGVKPLTGDNVWVKA 1154
+ A Q +N + +TA D+NG
Sbjct: 508 QDYQAILPAYVQGGSN---VYKVTARAYDRNG---------------------------- 536

Query: 1155 NDEGKAELQVVSVTAGTYEITASAGNSQPSNTQTITFVADKATATVSGIEVIGNYALADG 1214
N L + ++ G + F ADK + A ADG
Sbjct: 537 NSSNNVLLTITVLSNGQVVDQVGVTD----------FTADKTS------------AKADG 574

Query: 1215 NAKQTYKVTVTDANNNLLKDSEVTLTASPANLVLTPNGTAKTNEQGQAIFTATTTVAAKY 1274
TY TV S + +A TN G+A T + +
Sbjct: 575 TEAITYTATVKKNGVAQANVPVSFNIVS--GTAVLSANSANTNGSGKATVTLKSDKPGQV 632

Query: 1275 TLTAKVSQADGQESTKTAESKFVADDTNAVLTASSDVTSLVADGISTAKLEVTLMSANNP 1334
++AK A+ + FV ++ +D T+ VA+G V +M + P
Sbjct: 633 VVSAK--TAEMTSALNANAVIFVDQTKASITEIKADKTTAVANGQDAITYTVKVMKGDKP 690

Query: 1335 VGG 1337
V
Sbjct: 691 VSN 693


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z0376     HTHTETR  28     0.029    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 28.4 bits (63), Expect = 0.029
Identities = 12/42 (28%), Positives = 19/42 (45%)

Query: 14 RQKILQQLLEWIECNLEHPISIEDIAQKSGYSRRNIQLLFRN 55
RQ IL L S+ +IA+ +G +R I F++
Sbjct: 13 RQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKD 54
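
The "Score" and "Identities" lines that head each alignment on this page follow a regular pattern, so they can be read programmatically. The following is a minimal Python sketch, not part of PredictBias itself; the function names `parse_hsp_header` and `truncated_percent` are hypothetical. It assumes the line layout shown above and keeps the Expect value as text. The percentages printed on this page agree with integer truncation (12/42 is reported as 28%), which the second helper reproduces.

```python
import re

# Patterns for the two summary lines that precede each alignment block.
SCORE_RE = re.compile(
    r"Score\s*=\s*([\d.]+)\s*bits\s*\((\d+)\),\s*Expect\s*=\s*(\S+)"
)
RATIO_RE = re.compile(r"(Identities|Positives|Gaps)\s*=\s*(\d+)/(\d+)\s*\((\d+)%\)")


def parse_hsp_header(score_line, stats_line):
    """Return bit score, raw score, Expect text and the match ratios."""
    m = SCORE_RE.search(score_line)
    if m is None:
        raise ValueError("not a Score line: %r" % score_line)
    hsp = {
        "bits": float(m.group(1)),
        "raw": int(m.group(2)),
        "expect": m.group(3),  # kept as text; may look like "e-113"
    }
    # Each ratio is stored as (matches, alignment length, reported percent).
    for name, num, den, pct in RATIO_RE.findall(stats_line):
        hsp[name.lower()] = (int(num), int(den), int(pct))
    return hsp


def truncated_percent(num, den):
    """Percentage with the fractional part dropped, as seen in this output."""
    return (100 * num) // den


hsp = parse_hsp_header(
    "Score = 28.4 bits (63), Expect = 0.029",
    "Identities = 12/42 (28%), Positives = 19/42 (45%)",
)
num, den, reported = hsp["identities"]
print(hsp["bits"], hsp["expect"], truncated_percent(num, den) == reported)
```

A "Gaps" entry is only present when the alignment contains gaps, so callers should treat that key as optional.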


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z0389     PRTACTNFAMLY  28     0.002    Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 28.5 bits (63), Expect = 0.002
Identities = 13/48 (27%), Positives = 22/48 (45%)

Query: 33 IDGAAFRVGAGVQADITKNMGAYASLDYTKGDDIENPLQGVVGINVTW 80
+ G +G G+ A + + YAS +Y+KG + P G +W
Sbjct: 863 LRGTRAELGLGMAAALGRGHSLYASYEYSKGPKLAMPWTFHAGYRYSW 910


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
Z0390     IGASERPTASE  31     0.012    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 30.8 bits (69), Expect = 0.012
Identities = 20/109 (18%), Positives = 42/109 (38%), Gaps = 4/109 (3%)

Query: 257 ASAQGTGSATQNLNLSVADSTIYSDVLALSESENSAATTTNVNMNVARSYWEGNAYTFNS 316
A+ +A+ + + T + + + TT ++ S+ N
Sbjct: 761 ANITSNITASNKAQVHIGYKTGDTVCVRSDYTGYVTCTTDKLSDKALNSF---NPTNLRG 817

Query: 317 GDKAGSNLDINLSDSSVWKGKVSGAGNASVSLQNESVWNVTGSSTVDAL 365
+ + L ++++ G + GN+ V L S W++TG+S V L
Sbjct: 818 NVNLTESANFVLGKANLF-GTIQSRGNSQVRLTENSHWHLTGNSDVHQL 865


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z0400     HTHTETR  65     3e-15    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 65.0 bits (158), Expect = 3e-15
Identities = 32/172 (18%), Positives = 59/172 (34%), Gaps = 15/172 (8%)

Query: 10 RRRQLIDATLEAINEVGMHDATIAQIARRAGVSTGIISHYFRDKNGLLEATMRDITSQLR 69
R+ ++D L ++ G+ ++ +IA+ AGV+ G I +F+DK+ L S +
Sbjct: 12 TRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIG 71

Query: 70 DAVLNRLHALPQGSAEQRLQAIVGGNFDETQVSSAAMKAWLAFWASSMHQP-------ML 122
+ L P G L+ I+ + T V+ + +
Sbjct: 72 ELELEYQAKFP-GDPLSVLREILIHVLEST-VTEERRRLLMEIIFHKCEFVGEMAVVQQA 129

Query: 123 YRLQQVSSRRLLSNLVSEFRRE---LPRHQAQEAGYGLAALIDGL---WLRA 168
R + S + + + A + I GL WL A
Sbjct: 130 QRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLFA 181


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z0402     PRTACTNFAMLY  132    2e-33    Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 132 bits (333), Expect = 2e-33
Identities = 146/614 (23%), Positives = 232/614 (37%), Gaps = 97/614 (15%)

Query: 758 GTLNGSADSLLSLNGGSLTVTNG------GTSTGSLTGSGELNIQGGTL----------- 800
G + LS G++ T G + S+T + QG L
Sbjct: 326 GARVTVSGGSLSAPHGNVIETGGARRFAPQAAPLSITLQAGAHAQGKALLYRVLPEPVKL 385

Query: 801 ----------DIAGDNSNLTANVNIANSANVLVSHAQGLGSANVENNGTLALNNSAEKRA 850
DI +I L S A+ G+ + +L+++N+
Sbjct: 386 TLTGGADAQGDIVATELPSIPGTSIGPLDVALASQARWTGATRAVD--SLSIDNATWVMT 443

Query: 851 AASVNYALGGNLTNNGTLMTGMSGQQAGNVLVVKGNYHGNNGQLVMNTVLNGDDSVTDKL 910
S AL L ++G++ + AG V+ N +G MN D ++DKL
Sbjct: 444 DNSNVGAL--RLASDGSVDFQQPAE-AGRFKVLTVNTLAGSGLFRMNV--FADLGLSDKL 498

Query: 911 VVEGDTSGTTAVTVNNAGGTGAKTLNGIELIHVDGKSEGEFVQA---GRIVAGAYDYTLA 967
VV D SG + V N+G + N + L+ S F A G++ G Y Y LA
Sbjct: 499 VVMQDASGQHRLWVRNSGS-EPASANTLLLVQTPLGSAATFTLANKDGKVDIGTYRYRLA 557

Query: 968 RGQGANSGNWYLTSGSDSPELQPEPDPMPNPEPNPNPEPN-PNPTPTPGPDLNVDNDLRP 1026
+G W L P +P P P P P P P+P P P P G +L+
Sbjct: 558 ---ANGNGQWSLVGAKAPPAPKPAPQPGPQPPQPPQPQPEAPAPQPPAGRELS------- 607

Query: 1027 EAGSYIANLAAANTMFTTRLHERLGNTYYTDMVTGEQKQTTMWMRHEGGHNKWRDGSGQL 1086
AAAN T +Y + ++ + + + G W G Q
Sbjct: 608 ---------AAANAAVNTGGVGLASTLWYAESNALSKRLGELRLNPDAG-GAWGRGFAQR 657

Query: 1087 KTQSNRYV---------LQLGGDVAQWSQNGSDRWHVGVMAGYGNSDSKTISSRTGYRAK 1137
+ NR +LG D A G RWH+G +AGY D G+
Sbjct: 658 QQLDNRAGRRFDQKVAGFELGADHAVAVAGG--RWHLGGLAGYTRGDRGFTGDGGGH--- 712

Query: 1138 ASVNGYSTGLYATWYADDESRNGAYLDSWAQYSWFDN--TVKGDDLQS--ESYKSKGFTA 1193
+ G YAT+ AD +G YLD+ + S +N V G D + Y++ G A
Sbjct: 713 --TDSVHVGGYATYIAD----SGFYLDATLRASRLENDFKVAGSDGYAVKGKYRTHGVGA 766

Query: 1194 SLEAGYKHKLAEFNGSQGTRNEWYVQPQAQVTWMGVKADKHRESNGTLVHSNGDGNVQTR 1253
SLEAG + A + W+++PQA++ +R +NG V G +V R
Sbjct: 767 SLEAGRRFTHA---------DGWFLEPQAELAVFRAGGGAYRAANGLRVRDEGGSSVLGR 817

Query: 1254 LGVKTWLKSHHKMDDGKSREFQPFVEVNWLHNSKDFST-SMDGVSVTQDGARNIAEIKTG 1312
LG L+ +++ R+ QP+++ + L T +G++ + AE+ G
Sbjct: 818 LG----LEVGKRIELAGGRQVQPYIKASVLQEFDGAGTVHTNGIAHRTELRGTRAELGLG 873

Query: 1313 VEGQLNANLNVWGN 1326
+ L +++ +
Sbjct: 874 MAAALGRGHSLYAS 887


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
Z0405     HTHFIS  31     0.004    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 31.0 bits (70), Expect = 0.004
Identities = 15/50 (30%), Positives = 24/50 (48%), Gaps = 2/50 (4%)

Query: 6 TEENLLAFTTAARFGSFSKAAEELGLTTSAISYTIKRMETGLDVVLFTRS 55
E L+ A G+ KAA+ LGL + + I+ + G+ V +RS
Sbjct: 436 MEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIREL--GVSVYRSSRS 483


6  Z0419  Z0447  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
Z0419     -1      20       -3.670159  permease
Z0420     -1      17       -1.849625  oxidoreductase
Z0421     0       18       -1.394688  hypothetical protein
Z0423     1       16       1.975305   *hypothetical protein
Z0424     0       17       2.735447   cytochrome subunit of dehydrogenase
Z0425     1       23       4.319798   hypothetical protein
Z0426     1       24       5.078082   regulator for prp operon
Z0427     0       24       4.931004   2-methylisocitrate lyase
Z0428     -1      22       4.419762   methylcitrate synthase
Z0429     -1      20       4.332725   2-methylcitrate dehydratase
Z0430     -1      20       4.368200   propionyl-CoA synthetase
Z0431     -1      17       3.032666   hypothetical protein
Z0432     0       15       3.650985   cytosine permease
Z0433     -1      16       2.300360   cytosine deaminase
Z0434     -1      16       0.442469   DNA-binding transcriptional regulator CynR
Z0435     -3      12       1.636724   carbonic anhydrase
Z0436     -3      12       1.757545   cyanate hydratase
Z0437     -2      11       1.656461   cyanate transporter
Z0438     -2      11       0.340167   galactoside O-acetyltransferase
Z0439     -2      11       1.421805   galactoside permease
Z0440     -2      13       3.255224   beta-D-galactosidase
Z0441     -2      14       3.147712   lac repressor
Z0442     -2      13       2.783242   AraC family transcriptional regulator
Z0443     -1      15       3.352643   hypothetical protein
Z0444     -1      15       4.220168   DNA-binding transcriptional activator MhpR
Z0445     -1      16       4.480474   3-(3-hydroxyphenyl)propionate hydroxylase
Z0446     0       14       4.490361   3-(2,3-dihydroxyphenyl)propionate dioxygenase
Z0447     1       13       3.441254   2-hydroxy-6-ketonona-2,4-dienedioic acid
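
The per-ORF rows of the island tables carry five fields: locus tag, dinucleotide bias, codon bias, %GC bias and product. When the page is saved as text the fields can end up run together, so a small parser is useful. The sketch below is an illustration only and rests on assumptions read off the rows on this page rather than on any PredictBias specification: locus tags look like `Z` plus four digits, the dinucleotide bias is a single signed digit, the codon bias has exactly two digits, and the %GC bias has exactly six decimals. `BiasRow` and `parse_bias_row` are hypothetical names.

```python
import re
from typing import NamedTuple


class BiasRow(NamedTuple):
    locus_tag: str
    dn_bias: int      # dinucleotide bias (assumed single signed digit)
    cdn_bias: int     # codon bias (assumed two digits)
    gc_bias: float    # %GC bias (assumed six decimal places)
    product: str


# \s* between fields lets the same pattern read both run-together
# rows and rows whose columns are separated by whitespace.
ROW_RE = re.compile(
    r"^(Z\d{4})\s*(-?\d)\s*(\d{2})\s*(-?\d+\.\d{6})\s*(.*)$"
)


def parse_bias_row(line):
    """Split one island-table row into its five fields."""
    m = ROW_RE.match(line.strip())
    if m is None:
        raise ValueError("unrecognised row: %r" % line)
    tag, dn, cdn, gc, product = m.groups()
    return BiasRow(tag, int(dn), int(cdn), float(gc), product)


# The six-decimal assumption is what separates the %GC value from a
# product name that itself starts with a digit.
print(parse_bias_row("Z04270244.9310042-methylisocitrate lyase"))
print(parse_bias_row("Z0419     -1      20       -3.670159  permease"))
```

If a genome uses a different locus-tag scheme or bias values outside these ranges, the pattern has to be adjusted; the fixed field widths are the weakest assumption here.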
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
Z0426     HTHFIS  338    e-113    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 338 bits (868), Expect = e-113
Identities = 122/401 (30%), Positives = 200/401 (49%), Gaps = 54/401 (13%)

Query: 164 DLAEEAGMTGIFIYSAATVRQAFSDALDMTRMSLRHNTHDATRNALRTRYVLGDMLGQSP 223
A +A G + Y ++ + + +L ++ ++G+S
Sbjct: 88 MTAIKASEKGAYDYLPKPFDL--TELIGIIGRALAEPKRRPSK-LEDDSQDGMPLVGRSA 144

Query: 224 QMEQVRQTILLYARSSAAVLIEGETGTGKELAAQAIHREYFARHDARQGKKSHPFVAVNC 283
M+++ + + ++ ++I GE+GTGKEL A+A+H + R+ PFVA+N
Sbjct: 145 AMQEIYRVLARLMQTDLTLMITGESGTGKELVARALHD-----YGKRRNG---PFVAINM 196

Query: 284 GAIAESLLEAELFGYEEGAFTGSRRGGRAGLFEIAHGGTLFLDEIGEMPLPLQTRLLRVL 343
AI L+E+ELFG+E+GAFTG++ G FE A GGTLFLDEIG+MP+ QTRLLRVL
Sbjct: 197 AAIPRDLIESELFGHEKGAFTGAQTR-STGRFEQAEGGTLFLDEIGDMPMDAQTRLLRVL 255

Query: 344 EEKEVTRVGGHQPVPVDVRVISATHCNLEEDMRQGQFRRDLFYRLSILRLQLPPLRERVA 403
++ E T VGG P+ DVR+++AT+ +L++ + QG FR DL+YRL+++ L+LPPLR+R
Sbjct: 256 QQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRLNVVPLRLPPLRDRAE 315

Query: 404 DILPLAESFLKVSLAALSAPFSAALRQGLQASETVLVHYDWPGNIRELRNMMERLALFLS 463
DI L F++ ++ L+ + + WPGN+REL N++ RL
Sbjct: 316 DIPDLVRHFVQ-QAEKEGLDVKRFDQEALEL----MKAHPWPGNVRELENLVRRLTALYP 370

Query: 464 VEP-TPDLTPQFLQLLLPELARESAKIPAPRLLTP------------------------- 497
+ T ++ L+ +P+ E A + L
Sbjct: 371 QDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFASFGDALPPSGLYD 430

Query: 498 -----------QQALEKFNGDKTAAANYLGISRTTFWRRLK 527
AL G++ AA+ LG++R T ++++
Sbjct: 431 RVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIR 471
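
Expect values on this page appear in two spellings: ordinary numbers such as `0.023` or `2e-04`, and a bare exponent such as `e-113`, which is the usual BLAST shorthand for `1e-113`. Python's `float()` rejects the bare form, so it has to be normalised first. The sketch below shows one way to do that; `parse_expect` and `passes_cutoff` are hypothetical helpers, and using the page's "E-value (RPSBlast)" input of 0.05 as an upper cutoff is an assumption about how that parameter is applied.

```python
def parse_expect(text):
    """Convert a BLAST-style Expect string to a float."""
    text = text.strip()
    # A bare exponent ("e-113") has an implied mantissa of 1.
    if text.startswith(("e", "E")):
        text = "1" + text
    return float(text)


def passes_cutoff(expect_text, cutoff=0.05):
    """True when the hit's Expect value does not exceed the cutoff."""
    return parse_expect(expect_text) <= cutoff


for value in ("e-113", "2e-04", "0.023"):
    print(value, parse_expect(value), passes_cutoff(value))
```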


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z0428     PHPHTRNFRASE  30     0.023    Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase signature.

Length = 572

Score = 29.8 bits (67), Expect = 0.023
Identities = 11/33 (33%), Positives = 19/33 (57%), Gaps = 1/33 (3%)

Query: 65 LIHGKLPTRDE-LAAYKTKLKALRGLPANVRTV 96
+ +LPT +E AYK ++ + G P +RT+
Sbjct: 303 MDRDQLPTEEEQFEAYKEVVQRMDGKPVVIRTL 335


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z0438     BCTERIALGSPD  28     0.023    Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D signature.

Length = 660

Score = 28.3 bits (63), Expect = 0.023
Identities = 24/125 (19%), Positives = 53/125 (42%), Gaps = 22/125 (17%)

Query: 82 FYANFN----LTIVDDYTVTIGDNVLIAPNVTLSVTGHPVHHELRKNGEMYSFPITIGNN 137
F A+F ++ + + V+I P+V ++T +++ + Y F +++ +
Sbjct: 30 FSASFKGTDIQEFINTVSKNLNKTVIIDPSVRGTIT--VRSYDMLNEEQYYQFFLSV-LD 86

Query: 138 VWIGSHVVINPGVTI---------------GDNSVIGAGSIVTKDIPPNVVAAGVPCRVI 182
V+ + + +N GV D + +VT+ +P VAA ++
Sbjct: 87 VYGFAVINMNNGVLKVVRSKDAKTAAVPVASDAAPGIGDEVVTRVVPLTNVAARDLAPLL 146

Query: 183 REIND 187
R++ND
Sbjct: 147 RQLND 151


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z0439     TCRTETA  36     2e-04    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 36.0 bits (83), Expect = 2e-04
Identities = 44/192 (22%), Positives = 72/192 (37%), Gaps = 22/192 (11%)

Query: 4 LKNTNFWMFGLFFFFYFFI-MGAYFPFFPIWLHDINHISK--SDTGIIFAAISLFSLLFQ 60
+K + L + +G P P L D+ H + + GI+ A +L
Sbjct: 1 MKPNRPLIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACA 60

Query: 61 PLFGLLSDKLGLRKYLLWIITGMLVMFAPFFIFIFGPLLQYNILVGSIVGGIYLGFCFNA 120
P+ G LSD+ G R LL + + I P L + + +G IV GI A
Sbjct: 61 PVLGALSDRFGRRPVLL---VSLAGAAVDYAIMATAPFL-WVLYIGRIVAGIT-----GA 111

Query: 121 GAPAVEAFIEKVSRRSNFEFGRARMFG----CVGWALCAS--IVGIMFTINNQFVFWLGS 174
A+I ++ RAR FG C G+ + A + G+M + F+ +
Sbjct: 112 TGAVAGAYIADITDGDE----RARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAA 167

Query: 175 GCALILAILLFF 186
+ + F
Sbjct: 168 ALNGLNFLTGCF 179


7  Z0461  Z0469  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
Z0461     2       20       3.986407   permease; hexosephosphate transport
Z0462     1       22       3.795496   sensor kinase
Z0463     0       20       3.959169   response regulator; hexosephosphate transport
Z0464     2       22       0.553716   taurine transporter substrate binding subunit
Z0465     2       20       -1.087057  taurine transporter ATP-binding subunit
Z0466     1       18       -0.961020  taurine transporter subunit
Z0467     2       19       -1.771207  taurine dioxygenase
Z0468     2       19       -1.769163  delta-aminolevulinic acid dehydratase
Z0469     2       18       -2.193773  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z0461     TCRTETA  40     2e-05    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 39.8 bits (93), Expect = 2e-05
Identities = 62/399 (15%), Positives = 122/399 (30%), Gaps = 31/399 (7%)

Query: 38 VNYVLPALQTDLGLD---KGDIGLLGSLFYLSYGLSKFTAGLWHDSHGQRGFMGVGLFAT 94
+ VLP L DL G+L +L+ L G D G+R + V L
Sbjct: 24 IMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGA 83

Query: 95 GLLNVVFAFGESLTLLLVVWTLNGFFQGWGWPPCARLLTHWYSRNERGFWWGCWNMSINI 154
+ + A L +L + + G G + +ER +G +
Sbjct: 84 AVDYAIMATAPFLWVLYIGRIVAGITGATG-AVAGAYIADITDGDERARHFGFMSACFGF 142

Query: 155 GGAIIPLISAFAAHWWGWQAAMLTPGIISMALGIWLTLQLKGTPQEEGLPTVGHWRHDPL 214
G P++ + A ++ + L + + E P
Sbjct: 143 GMVAGPVLGGLMGG-FSPHAPFFAAAALNGLNFLTGCFLLPESHKGERRP---------- 191

Query: 215 ELRQEQQSPPMGLWQMLRTTMLQNPLIWLLGVSYVLVYVIRIALNDWGNIWLTESHGVNL 274
LR+E +P R + L+ V +++ V ++ W + +
Sbjct: 192 -LRREALNPLAS----FRWARGMTVVAALMAVFFIMQLVGQVPAALWV---IFGEDRFHW 243

Query: 275 LSANATVMLFEVGGLLGALFAGWGSDLLFSGQRAPMILLFTLGLMVSVAALWLAPVHHYA 334
+ + L G+L +L + + + L+ + + L +
Sbjct: 244 DATTIGISL-AAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGWM 302

Query: 335 LLAVCFFTVGFFVFGPQMLIGLAAVECGHK--AAAGSITGFLGLFAYLGAALAGWPLSLV 392
+ + P + L+ + GS+ L + +G L +
Sbjct: 303 AFPIMVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYAAS 362

Query: 393 IERYGWPGMFSLLSVAAVLMGLLLMPLLMAGITTTHARR 431
I W G +A + LL +P L G+ + +R
Sbjct: 363 ITT--WNG---WAWIAGAALYLLCLPALRRGLWSGAGQR 396


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z0462     PF06580  49     1e-08    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 49.5 bits (118), Expect = 1e-08
Identities = 42/205 (20%), Positives = 79/205 (38%), Gaps = 43/205 (20%)

Query: 337 QSQLVKRARDPAQIQSAASQIN-------------------ELARRIHLSTRQLLR-QLR 376
Q ++ A++ AQ+ + +QIN AR + S +L+R LR
Sbjct: 151 QWKMASMAQE-AQLMALKAQINPHFMFNALNNIRALILEDPTKAREMLTSLSELMRYSLR 209

Query: 377 PPALDELTFREALLHL-----INEFAFSERGIHCQFAYQLNSTPENETVRFTLYRLLQEL 431
+++ + L + + F +R ++ P V+ L+Q L
Sbjct: 210 YSNARQVSLADELTVVDSYLQLASIQFEDR-----LQFENQINPAIMDVQVPPM-LVQTL 263

Query: 432 LNNICKHA-----EASEVTIILRQQGEVLHLEVSDNGVGIA--SGKMAGFGIQGMRERVS 484
+ N KH + ++ + + + LEV + G + + G G+Q +RER+
Sbjct: 264 VENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKESTGTGLQNVRERLQ 323

Query: 485 ALGGD---LTLE-KQHGTRVIVNLP 505
L G + L KQ +V +P
Sbjct: 324 MLYGTEAQIKLSEKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
Z0463     HTHFIS  73     3e-17    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 72.9 bits (179), Expect = 3e-17
Identities = 31/116 (26%), Positives = 53/116 (45%), Gaps = 4/116 (3%)

Query: 2 IRVVLVDDHVVVRSGFAQLLSLED-DLEVIGQYSSAAQAWSALIRDDVNVAVIDIAMPDE 60
+++ DD +R+ Q LS D+ + +AA W + D ++ V D+ MPDE
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITS---NAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 NGLSLLKRLRAQKPQFRAIILSIYDAPTFVQSALDAGASGYLTKRCGPEELVQAVR 116
N LL R++ +P +++S + A + GA YL K EL+ +
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIG 116


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z0469     PRTACTNFAMLY  79     8e-17    Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 78.6 bits (193), Expect = 8e-17
Identities = 75/318 (23%), Positives = 115/318 (36%), Gaps = 56/318 (17%)

Query: 584 TINGNGDNDNTASIEAGQNEVDNNGDHVAAATGNYKVRIDNATGAGSIADYNGNELIYVN 643
T+ G+G + G ++ A+G +++ + N+ GS L+
Sbjct: 477 TLAGSGLFRMNVFADLGLSDKLVVMQD---ASGQHRLWVRNS---GSEPASANTLLLVQT 530

Query: 644 DKNSNATFSAAN---KADLGAYTYQAEQRGNTV--------------------------- 673
S ATF+ AN K D+G Y Y+ GN
Sbjct: 531 PLGSAATFTLANKDGKVDIGTYRYRLAANGNGQWSLVGAKAPPAPKPAPQPGPQPPQPPQ 590

Query: 674 ---------VLQQMELTDYANMALSIP--SANTNIWNLEQDTVGTRLTNSRHGLADNGGA 722
EL+ AN A++ + +W E + + RL R D GGA
Sbjct: 591 PQPEAPAPQPPAGRELSAAANAAVNTGGVGLASTLWYAESNALSKRLGELRL-NPDAGGA 649

Query: 723 WVSYFGGNFNGDNGTIN-YDQDVNGIMVGVDTKIDGNNAKWIVGAAAGFAKGDMN---DR 778
W F DN +DQ V G +G D + +W +G AG+ +GD D
Sbjct: 650 WGRGFAQRQQLDNRAGRRFDQKVAGFELGADHAVAVAGGRWHLGGLAGYTRGDRGFTGDG 709

Query: 779 SGQVDQDSQTAYIYSSAHFANNVF-VDGSLSYSHFNNDLSATMSNGTYVDGSTNSDAWGF 837
G D Y + + A++ F +D +L S ND S+G V G + G
Sbjct: 710 GGHTDSVHVGGY---ATYIADSGFYLDATLRASRLENDFKVAGSDGYAVKGKYRTHGVGA 766

Query: 838 GLKAGYDFKLGDAGYVTP 855
L+AG F D ++ P
Sbjct: 767 SLEAGRRFTHADGWFLEP 784


8  Z0507  Z0520  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
Z0507     3       28       -0.514395  preprotein translocase subunit SecD
Z0508     3       23       -2.475950  preprotein translocase subunit SecF
Z0509     4       20       -3.134912  hypothetical protein
Z0510     1       16       -0.541414  hypothetical protein
Z0511     -1      17       -0.022897  hypothetical protein
Z0512     0       20       0.595666   nucleoside channel phage T6/colicin K receptor
Z0513     -2      14       1.618969   hypothetical protein
Z0514     -1      14       2.450345   transcriptional regulator NrdR
Z0515     -2      15       -3.937586  bifunctional
Z0516     -1      14       -4.640098  6,7-dimethyl-8-ribityllumazine synthase
Z0518     0       13       -2.999571  transcription antitermination protein NusB
Z0519     0       12       -2.540861  thiamine monophosphate kinase
Z0520     0       13       -3.202745  phosphatidylglycerophosphatase A
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z0507     SECFTRNLCASE  69     1e-14    Bacterial translocase SecF protein signature.
		>SECFTRNLCASE#Bacterial translocase SecF protein signature.

Length = 333

Score = 68.7 bits (168), Expect = 1e-14
Identities = 38/183 (20%), Positives = 88/183 (48%), Gaps = 5/183 (2%)

Query: 433 IQIVEERTIGPTLGMQNIEQGLEACLAGLLVSILFMII-FYKKFGLIATSALIANLILIV 491
++I ++GP + + + + + LA +V + ++ + F +F L A AL+ +++L V
Sbjct: 135 LKITSFESVGPKVSGELVWTAVWSLLAATVVIMFYIWVRFEWQFALGAVVALVHDVLLTV 194

Query: 492 GIMSLLPGATLSMPGIAGIVLTLAVAVDANVLINERIKEEL--SNGRTVQQAIDEGYRGA 549
G+ ++L + +A ++ +++ V++ +R++E L ++ ++
Sbjct: 195 GLFAVL-QLKFDLTTVAALLTITGYSINDTVVVFDRLRENLIKYKTMPLRDVMNLSVNET 253

Query: 550 FSSIFDANITTLIKVIILYAVGTGAIKGFAITTGIGVATSMFTAIVGTRAIVNLLYGGKR 609
S +TTL+ ++ + G I+GF GV T ++++ + IV L G R
Sbjct: 254 LSRTVMTGMTTLLALVPMLIWGGDVIRGFVFAMVWGVFTGTYSSVYVAKNIV-LFIGLDR 312

Query: 610 VKK 612
K+
Sbjct: 313 NKE 315


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z0508     SECFTRNLCASE  350    e-123    Bacterial translocase SecF protein signature.
		>SECFTRNLCASE#Bacterial translocase SecF protein signature.

Length = 333

Score = 350 bits (899), Expect = e-123
Identities = 104/309 (33%), Positives = 179/309 (57%), Gaps = 12/309 (3%)

Query: 17 YDFMRWDYWAFGISGLLLIAAIVIMGVRGFNWGLDFTGGTVIEITLEKPAEIDVMRDALQ 76
+DF RW + FG + +++IA++++ V G N+G+DF GGT I ++ V R AL+
Sbjct: 14 FDFFRWQWATFGAAIVMMIASVILPLVIGLNFGIDFKGGTTIRTESTTAIDVGVYRAALE 73

Query: 77 KAGFEEPMLQNFGS------SHDIMVRMPPAEGETGGQVLGSQVLKVINE------STNQ 124
+ ++ H M+R+ E G + G+Q +++N+ + +
Sbjct: 74 PLELGDVIISEVRDPSFREDQHVAMIRIQMQEDGQGAEGQGAQGQELVNKVETALTAVDP 133

Query: 125 NAAVKRIEFVGPSVGADLAQTGAMALMAALLSILVYVGFRFEWRLAAGVVIALAHDVIIT 184
+ E VGP V +L T +L+AA + I+ Y+ RFEW+ A G V+AL HDV++T
Sbjct: 134 ALKITSFESVGPKVSGELVWTAVWSLLAATVVIMFYIWVRFEWQFALGAVVALVHDVLLT 193

Query: 185 LGILSLFHIEIDLTIVASLMSVIGYSLNDSIVVSDRIRENFRKIRRGTPYEIFNVSLTQT 244
+G+ ++ ++ DLT VA+L+++ GYS+ND++VV DR+REN K + ++ N+S+ +T
Sbjct: 194 VGLFAVLQLKFDLTTVAALLTITGYSINDTVVVFDRLRENLIKYKTMPLRDVMNLSVNET 253

Query: 245 LHRTLITSGTTLMVILMLYLFGGPVLEGFSLTMLIGVSIGTASSIYVASALALKLGMKRE 304
L RT++T TTL+ ++ + ++GG V+ GF M+ GV GT SS+YVA + L +G+ R
Sbjct: 254 LSRTVMTGMTTLLALVPMLIWGGDVIRGFVFAMVWGVFTGTYSSVYVAKNIVLFIGLDRN 313

Query: 305 HMLQQKVEK 313
+ +K
Sbjct: 314 KEKKDPSDK 322


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
Z0512     CHANNELTSX  403    e-146    Nucleoside-specific channel-forming protein Tsx signa...
		>CHANNELTSX#Nucleoside-specific channel-forming protein Tsx signature.

Length = 294

Score = 403 bits (1036), Expect = e-146
Identities = 200/218 (91%), Positives = 207/218 (94%)

Query: 1 MKKTLLAAGAVLALSSSFTVNAAENDKPQYLSDWWHQSVNVVGSYHTRFGPQIRNDTYLE 60
MKKTLLAAGAV+ALS++F AAENDKPQYLSDWWHQSVNVVGSYHTRFGPQIRNDTYLE
Sbjct: 1 MKKTLLAAGAVVALSTTFAAGAAENDKPQYLSDWWHQSVNVVGSYHTRFGPQIRNDTYLE 60

Query: 61 YEAFAKKDWFDFYGYADAPVFFGGNSDAKGIWNHGSPLFMEIEPRFSIDKLTNTDLSFGP 120
YEAFAKKDWFDFYGY DAPVFFGGNS AKGIWN GSPLFMEIEPRFSIDKLTNTDLSFGP
Sbjct: 61 YEAFAKKDWFDFYGYIDAPVFFGGNSTAKGIWNKGSPLFMEIEPRFSIDKLTNTDLSFGP 120

Query: 121 FKEWYFANNYIYDMGRNKDGRQSTWYMGLGTDIDTGLPMSLSMNVYAKYQWQNYGAANEN 180
FKEWYFANNYIYDMGRN QSTWYMGLGTDIDTGLPMSLS+NVYAKYQWQNYGA+NEN
Sbjct: 121 FKEWYFANNYIYDMGRNDSQEQSTWYMGLGTDIDTGLPMSLSLNVYAKYQWQNYGASNEN 180

Query: 181 EWDGYRFKIKYFVPITDLWGGQLSYIGFTNFDWGSDLG 218
EWDGYRFK+KYFVP+TDLWGG LSYIGFTNFDWGSDLG
Sbjct: 181 EWDGYRFKVKYFVPLTDLWGGSLSYIGFTNFDWGSDLG 218


9  Z0581  Z0588  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
Z0581     3       16       2.550056   potassium efflux protein KefA
Z0583     4       16       4.229799   hypothetical protein
Z0584     3       17       4.720858   primosomal replication protein N''
Z0585     3       24       3.376607   hypothetical protein
Z0586     4       29       3.233283   adenine phosphoribosyltransferase
Z0587     2       24       3.186945   DNA polymerase III subunits gamma and tau
Z0588     2       23       1.635196   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
Z0581     RTXTOXIND  32     0.017    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 31.7 bits (72), Expect = 0.017
Identities = 19/125 (15%), Positives = 40/125 (32%), Gaps = 6/125 (4%)

Query: 28 QNTAFARASSNGDLPTKADLQAQLDSLNKQKDLSAQDKLVQQDLTDTLATLDKIDRVKEE 87
N RA L + + L L+ + A L++ ++ E
Sbjct: 207 LNLDKKRAERLTVLARINRYENLSRVEKSR--LDDFSSLLHKQAIAKHAVLEQENKYVEA 264

Query: 88 TVQLRQKVAEAPEKMRQATAALTALSDVDND--EETRKIL--STLSLRQLETRVAQALDD 143
+LR ++ + + +A V E L +T ++ L +A+ +
Sbjct: 265 VNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEER 324

Query: 144 LQNAQ 148
Q +
Sbjct: 325 QQASV 329


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
Z0587     IGASERPTASE  41     2e-05    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 40.8 bits (95), Expect = 2e-05
Identities = 40/251 (15%), Positives = 77/251 (30%), Gaps = 31/251 (12%)

Query: 404 PLPETTSQVLAARQQLQRVQGATKAKKSEPAA----ATRARPVNNAALERLASVTDRVQA 459
P E +Q + + + P+ AR + A + A T
Sbjct: 983 PEVEKRNQTVDTTN----ITTPNNIQADVPSVPSNNEEIARV-DEAPVPPPAPATPSETT 1037

Query: 460 RPVPSALEKAPAKKEAYRWKATTPVMQQKE--------VVATPKALKKA---LEHEKTPE 508
V ++ E AT Q +E V A + + A E ++T
Sbjct: 1038 ETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQT 1097

Query: 509 LAAKLAA---------EAIERDAWAAQVSQLSLPKLVEQVALNAWKE-ESDNAVCLHLRS 558
K A E+ +V+ PK + + E +N ++++
Sbjct: 1098 TETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKE 1157

Query: 559 SQRHLNNRGAQQKLAEALS-MLKGSTVELTIVEDDNPAVRTPLEWRQAIYEEKLAQARES 617
Q N ++ A+ S ++ E T V N V P A + + +
Sbjct: 1158 PQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSN 1217

Query: 618 IIADNNIQTLR 628
+ + +++R
Sbjct: 1218 KPKNRHRRSVR 1228


10  Z0600  Z0615  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
Z0600     -1      14       3.141818   hypothetical protein
Z0601     0       13       2.775795   ligase
Z0602     1       15       4.643962   hypothetical protein
Z0604     4       28       7.792198   copper exporting ATPase
Z0606     4       29       7.717368   glutaminase
Z0607     4       30       7.806448   amino acid/amine transport protein
Z0608     4       31       8.079922   outer membrane export protein
Z0609     4       31       8.253974   hypothetical protein
Z0615     3       31       7.915574   RTX family exoprotein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
Z0606     BLACTAMASEA  29     0.021    Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 29.0 bits (65), Expect = 0.021
Identities = 11/43 (25%), Positives = 19/43 (44%)

Query: 38 GQLAAVAIVTSDGNVYSAGDSDYRFALESISKVCTLALALEDV 80
G++ + + + G +A +D RF + S KV L V
Sbjct: 38 GRVGMIEMDLASGRTLTAWRADERFPMMSTFKVVLCGAVLARV 80


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
Z0608     RTXTOXIND  32     0.006    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 31.7 bits (72), Expect = 0.006
Identities = 22/165 (13%), Positives = 57/165 (34%), Gaps = 20/165 (12%)

Query: 199 DELQAQTRIAGMRSTLEQYQAQMASAKAQLAVLTGVQPEAIAAP----PAELAEQPVSLK 254
L A+ +S+L Q + + + + + + P ++E+ V
Sbjct: 128 TALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVL-- 185

Query: 255 NIDYQSIPLVLAAENLRQSAQYGVEKTKAQYWPTLSIQGGKTRYQTSDRSYWDDQLQLNV 314
+ L+ + Q+ +Y E + R + ++ +L+
Sbjct: 186 ----RLTSLIKEQFSTWQNQKYQKELNLDKK--RAERLTVLARINRYENLSRVEKSRLDD 239

Query: 315 NAPLYQGGAVS--------AQVQQAEGQQKISASQVEQAKLDVLQ 351
+ L A++ + +A + ++ SQ+EQ + ++L
Sbjct: 240 FSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILS 284


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z0609     INTIMIN  37     5e-04    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 37.4 bits (86), Expect = 5e-04
Identities = 63/372 (16%), Positives = 115/372 (30%), Gaps = 44/372 (11%)

Query: 649 QTVTVTLNGQTYQGVVQPDGTWSVTVPAANVGALADGNA--TVTASVNDVAGNPSSVSRV 706
T+TV NGQ V D T A A ADG T TA+V ++V
Sbjct: 544 LTITVLSNGQVVDQVGVTDFT------ADKTSAKADGTEAITYTATVKKNGVAQANVPVS 597

Query: 707 ALVDATPPVVTINPVATDNVINTPEHAQAQIISGTVTGAQAGDIVTVTLNNVDYTTVVDG 766
+ + V++ N T+ ++ V A+ ++ + N + VD
Sbjct: 598 FNIVSGTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAKTAEMTSAL--NANAVIFVDQ 655

Query: 767 SGNWSLGVPASVVSGLADGSYPVSVSVTDKAGNTGSQSLTVTVNTAAPLIGINSIAGDDV 826
+ + A + +A+G ++ +V G+ + VT T +
Sbjct: 656 TKASITEIKADKTTAVANGQDAITYTVKVMKGDKPVSNQEVTFTTTLGKLSN-------- 707

Query: 827 INASEKGADLQITGTSDQPVNTAITVTLNGQNYTTTTDASGNWSVTVPASAVTALGQANY 886
T +D +T+T + + + +V V A V
Sbjct: 708 -----------STEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFF--TTL 754

Query: 887 TVTAAVTSDIGNSATASHNVLVDSALPGVTINPVATDDIINAAEAGVAQTISGQVTGAED 946
T+ +G V LP V + + + + + D
Sbjct: 755 TIDDGNIEIVGTG--------VKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPAIASVD 806

Query: 947 GDTVTITL---GGNTYTATVGSN--LTWSVDVPAADIQALGNGDLTVNASVTNQNGNTGS 1001
+ +TL G T + N T+++ P + I + +T N +V G
Sbjct: 807 ASSGQVTLKEKGTTTISVISSDNQTATYTIATPNSLIVPNMSKRVTYNDAVNTCKNFGGK 866

Query: 1002 GTRDITIDANLP 1013
N+
Sbjct: 867 LPSSQNELENVF 878



Score = 35.0 bits (80), Expect = 0.002
Identities = 81/416 (19%), Positives = 139/416 (33%), Gaps = 61/416 (14%)

Query: 783 ADGSYPVSVSVTDKAGN-TGSQSLTVTVNTAAPLIGINSIAGDDVINASEKGADLQITGT 841
Y V+ D+ GN + + LT+TV + + D + + AD GT
Sbjct: 521 GSNVYKVTARAYDRNGNSSNNVLLTITV-LSNGQVVDQVGVTDFTADKTSAKAD----GT 575

Query: 842 SDQPVNTAITVTLNGQNYTTTTDASGNWSVTVPASAVTALGQANYTVTAAVTSDIGNSAT 901
AIT YT T +G VP S G A + +A T + +
Sbjct: 576 ------EAIT-------YTATVKKNGVAQANVPVSFNIVSGTAVLSANSANT-----NGS 617

Query: 902 ASHNVLVDSALPGVTINPVATDDIINAAEAGVAQTISGQVTGAEDGDTVTITLGGNTYTA 961
V + S PG + T ++ +A A A Q I T A
Sbjct: 618 GKATVTLKSDKPGQVVVSAKTAEMTSALNAN-AVIFVDQTK----ASITEIKADKTTAVA 672

Query: 962 TVGSNLTWSVDVPAADIQALGNGDLTVNASVTNQNG----NTGSGTRDITIDANLPG--- 1014
+T++V V D + + N ++T ++ + +G +T+ + PG
Sbjct: 673 NGQDAITYTVKVMKGD-KPVSNQEVTFTTTLGKLSNSTEKTDTNGYAKVTLTSTTPGKSL 731

Query: 1015 --LRVDTVAGDDVVNIIEHGQALVVTGSS-----SGLAESTP----------LTVTINNV 1057
RV VA D +E L + + +G+ P L + N
Sbjct: 732 VSARVSDVAVDVKAPEVEFFTTLTIDDGNIEIVGTGVKGKLPTVWLQYGQVNLKASGGNG 791

Query: 1058 EYTTAVQADGSWSVGVTAAQVSAWPAGTVNIAVSGESSAGNSVSITHPVTVDLTPAAITI 1117
+YT SV ++ QV+ GT I+V + + +I TP ++ +
Sbjct: 792 KYTWRSANPAIASVDASSGQVTLKEKGTTTISVISSDNQTATYTIA-------TPNSLIV 844

Query: 1118 NTIATDDVINAAEKGADLTLSGTTTNVEPGQTVTVTFGGKNYTASVASDGSWTATV 1173
++ N A ++ + V +G N S + + V
Sbjct: 845 PNMSKRVTYNDAVNTCKNFGGKLPSSQNELENVFKAWGAANKYEYYKSSQTIISWV 900


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
Z0615     CABNDNGRPT  45     9e-06    NodO calcium binding signature.
		>CABNDNGRPT#NodO calcium binding signature.

Length = 479

Score = 44.6 bits (105), Expect = 9e-06
Identities = 43/221 (19%), Positives = 68/221 (30%), Gaps = 12/221 (5%)

Query: 4539 GTASNNGKFVGTGYNDTFFATAGTDTYDGSGGWVYSSGTGTWLANGGMDVVDFRLSTVGV 4598
T + F D + AT + S + T + ++ +
Sbjct: 265 RTGDSVYGFNSNTDRDFYTATDSSKALIFSVWDAGGTDTFDFSGYSNNQRINLNEGSFSD 324

Query: 4599 TANLSSTAAQATGFNTSTFTNIEGISGSNFNDILTGSSGDNQLEGRGGNDTLNIGNGGHD 4658
L + A G IE G + NDIL G+S DN L+G GND L G G
Sbjct: 325 VGGLKGNVSIAHG------VTIENAIGGSGNDILVGNSADNILQGGAGNDVLYGGAG--A 376

Query: 4659 TLLYKLLNASDATGGNGSDVVNGFTVGTWEGTADTDRIDIRELLQGSGYTG-NGKASYVN 4717
LY G+G D + D+ID+ + + +
Sbjct: 377 DTLYGGAGRDTFVYGSGQDSTVAAYDWIADFQKGIDKIDLSAFRNEGQLSFVQDQFTGKG 436

Query: 4718 GVATLDAQAGNIGDFVKVTQS---GSDTIVQIDRDGTGGTF 4755
L A N + + ++ D +V+I
Sbjct: 437 QEVMLQWDAANSITNLWLHEAGHSSVDFLVRIVGQAAQSDI 477


11  Z0644  Z0651  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
Z0644     1       19       3.513065   metal resistance protein
Z0645     -1      26       6.921673   thioredoxin-like protein
Z0646     0       25       6.333031   short chain dehydrogenase
Z0647     0       26       6.269226   multifunctional acyl-CoA thioesterase I/protease
Z0648     1       28       6.292291   ABC transporter ATP-binding protein
Z0649     1       27       5.912981   oxidoreductase
Z0651     2       28       5.335338   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z0646     DHBDHDRGNASE  78     5e-19    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 77.8 bits (191), Expect = 5e-19
Identities = 49/212 (23%), Positives = 81/212 (38%), Gaps = 7/212 (3%)

Query: 16 KSVLITGCSSGIGLESALELKRQGFHVLAGCRKPDDVERMNS----MGFT--GVLIDLDS 69
K ITG + GIG A L QG H+ A P+ +E++ S D+
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRD 68

Query: 70 PESVDRAADEVIALTDNCLYGIFNNAGFGMYGPLSTISRAQMEQQFSANFFGAHQLTMRL 129
++D + + + N AG G + ++S + E FS N G + +
Sbjct: 69 SAAIDEITARIEREMGP-IDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSV 127

Query: 130 LPAMLPHGEGRIVMTSSVMGLISTPGRGAYAASKYALEAWSDALRMELRRSGIKVSLIEP 189
M+ G IV S + AYA+SK A ++ L +EL I+ +++ P
Sbjct: 128 SKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSP 187

Query: 190 GPIRTRFTDNVNQTQSDKPVENPGIAARFTLG 221
G T ++ ++ G F G
Sbjct: 188 GSTETDMQWSLWADENGAEQVIKGSLETFKTG 219


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z0648     PF05272  29     0.014    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.014
Identities = 12/20 (60%), Positives = 13/20 (65%)

Query: 41 LVGESGSGKSTLLAILAGLD 60
L G G GKSTL+ L GLD
Sbjct: 601 LEGTGGIGKSTLINTLVGLD 620


12  Z0662  Z0705  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
Z0662     0       20       -3.259048  hydroxypyruvate isomerase
Z0663     1       18       -1.996014  oxidoreductase
Z0664     3       16       -0.740196  hypothetical protein
Z0665     2       16       -0.595553  allantoin permease
Z0666     1       14       0.388788   allantoinase
Z0667     4       16       0.297001   hypothetical protein
Z0668     4       18       1.435643   purine permease YbbY
Z0669     3       17       2.364432   glycerate kinase
Z0670     2       16       2.269035   hypothetical protein
Z0671     1       17       3.573150   allantoate amidohydrolase
Z0672     1       17       4.504490   malate dehydrogenase
Z0673     2       17       5.216305   membrane protein FdrA
Z0674     1       17       4.975813   hypothetical protein
Z0675     1       15       3.873260   carboxylase
Z0676     1       19       3.486883   carbamate kinase
Z0677     2       20       2.924412   phosphoribosylaminoimidazole carboxylase ATPase
Z0678     3       21       2.282846   phosphoribosylaminoimidazole carboxylase
Z0679     3       21       1.926323   UDP-2,3-diacylglucosamine hydrolase
Z0680     2       18       0.609073   peptidyl-prolyl cis-trans isomerase B
Z0681     -1      15       -0.522492  cysteinyl-tRNA synthetase
Z0682     0       20       -3.020802  hypothetical protein
Z0683     1       22       -3.931246  hypothetical protein
Z0684     2       24       -4.045434  bifunctional 5,10-methylene-tetrahydrofolate
Z0686     2       28       -6.201592  fimbrial-like protein
Z0688     2       26       -5.965782  chaperone
Z0689     2       23       -4.666426  hypothetical protein
Z0690     0       16       -1.424000  fimbrial asembly protein
Z0691     0       12       -0.221973  fimbrial protein
Z0693     -1      11       -1.411120  transcriptional regulator FimZ
Z0696     -1      11       -0.773777  *envelope protein; thermoregulation of porin
Z0697     -2      11       0.003325   hypothetical protein
Z0698     -1      17       2.485357   bacteriophage N4 receptor, outer membrane
Z0699     0       21       2.190216   bacteriophage N4 adsorption protein B
Z0700     2       25       3.084476   receptor
Z0701     2       24       4.284560   hypothetical protein
Z0702     1       24       4.695772   hypothetical protein
Z0705     -1      22       4.446440   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z0666  UREASE  55  1e-10  Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 54.7 bits (132), Expect = 1e-10
Identities = 39/163 (23%), Positives = 59/163 (36%), Gaps = 32/163 (19%)

Query: 4 DLIIKNGTVILENEARVVDIAVKGGKIAAIG-------QD-----LGDAKEVMDASGLVV 51
D +I N ++ DI +K G+IAAIG Q +G EV+ G +V
Sbjct: 69 DTVITNALILDHWGIVKADIGLKDGRIAAIGKAGNPDMQPGVTIIVGPGTEVIAGEGKIV 128

Query: 52 SPGMVDAHTHISEPGRSHWEGYETGTRAAAKGGITTMIEMPLNQLPATVDRAS------- 104
+ G +D+H H P + A G+T M+ PA A+
Sbjct: 129 TAGGMDSHIHFICPQQIE---------EALMSGLTCMLGGGTG--PAHGTLATTCTPGPW 177

Query: 105 -IELKFDAAKGKLTIDAAQLGGLVSYNIDRLHELDEVGVVGFK 146
I +AA ++ A G + L E+ G K
Sbjct: 178 HIARMIEAADA-FPMNLAFAGKGNASLPGALVEMVLGGATSLK 219


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z0676  CARBMTKINASE  381  e-136  Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 381 bits (980), Expect = e-136
Identities = 125/310 (40%), Positives = 174/310 (56%), Gaps = 16/310 (5%)

Query: 2 KTLVVALGGNALLQRGEALTAENQYRNIASAVPALARL-ARSYRLAIVHGNGPQVGLLAL 60
K +V+ALGGNAL QRG+ + E N+ +A + AR Y + I HGNGPQVG L L
Sbjct: 3 KRVVIALGGNALQQRGQKGSYEEMMDNVRKTARQIAEIIARGYEVVITHGNGPQVGSLLL 62

Query: 61 QNLAWKE---VEPYPLDVLVAESQGMIGYMLAQSLSAQPQM----PPVTTVRTRIEVSPD 113
A + + P+DV A SQG IGYM+ Q+L + + V T+ T+ V +
Sbjct: 63 HMDAGQATYGIPAQPMDVAGAMSQGWIGYMIQQALKNELRKRGMEKKVVTIITQTIVDKN 122

Query: 114 DPAFLQPEKFIGPVYQPEEQEALEAAYGWQMKRD-GKYLRRVVASPQPRKILDSEAIELL 172
DPAF P K +GP Y E + L GW +K D G+ RRVV SP P+ +++E I+ L
Sbjct: 123 DPAFQNPTKPVGPFYDEETAKRLAREKGWIVKEDSGRGWRRVVPSPDPKGHVEAETIKKL 182

Query: 173 LKEGHVVICSGGGGVPVTDDG---AGSEAVIDKDLAAALLAEQINADGLVILTDADAVYE 229
++ G +VI SGGGGVPV + G EAVIDKDLA LAE++NAD +ILTD +
Sbjct: 183 VERGVIVIASGGGGVPVILEDGEIKGVEAVIDKDLAGEKLAEEVNADIFMILTDVNGAAL 242

Query: 230 NWGMPQQRAIRHATPDELAPFAKAD----GSMGPKVTAVSGYVRSRGKPAWIGALSRIEE 285
+G +++ +R +EL + + GSMGPKV A ++ G+ A I L + E
Sbjct: 243 YYGTEKEQWLREVKVEELRKYYEEGHFKAGSMGPKVLAAIRFIEWGGERAIIAHLEKAVE 302

Query: 286 TLAGEAGTCI 295
L G+ GT +
Sbjct: 303 ALEGKTGTQV 312


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z0689  PF00577  744  0.0  Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 744 bits (1923), Expect = 0.0
Identities = 372/876 (42%), Positives = 535/876 (61%), Gaps = 88/876 (10%)

Query: 11 QRYTWCL------AGICYSSLAILPSFLSY-----AESYFNPAFLLENGTFVADLSRFER 59
QR T CL + L + +F + AE YFNP FL ++ VADLSRFE
Sbjct: 10 QRNTQCLHIRKHRLAGFFVRLFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFEN 69

Query: 60 GNHQPAGVYRVDLWRNDEFIGSQDIVFESTTENTGDKSGGLMPCFNQVLLERIGLNSSAF 119
G P G YRVD++ N+ ++ ++D+ F NTGD G++PC + L +GLN+++
Sbjct: 70 GQELPPGTYRVDIYLNNGYMATRDVTF-----NTGDSEQGIVPCLTRAQLASMGLNTASV 124

Query: 120 PELAQQQNNKCINLLKAVPDATINFDFAAMRLNITIPQIALLSSAHGYIPPEEWDEGIPA 179
+ ++ C+ L + DAT D RLN+TIPQ + + A GYIPPE WD GI A
Sbjct: 125 SGMNLLADDACVPLTSMIHDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINA 184

Query: 180 LLLNYNFTGN----RGNGNDSYFFSEL-SGINIGPWRLRNNGSWNYFRGNG--YHSEQWN 232
LLNYNF+GN R GN Y + L SG+NIG WRLR+N +W+Y + +W
Sbjct: 185 GLLNYNFSGNSVQNRIGGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQ 244

Query: 233 NIGTWVQRAIIPLKSELVMGDGNTGSDIFDGVGFRGVRLYSSDNMYPDSQQGFAPTVRGI 292
+I TW++R IIPL+S L +GDG T DIFDG+ FRG +L S DNM PDSQ+GFAP + GI
Sbjct: 245 HINTWLERDIIPLRSRLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGI 304

Query: 293 ARTAAQLTIRQNGFIIYQSYVSPGAFEITDLHPTSSNGDLDVTIDERDGNQQNYTIPYST 352
AR AQ+TI+QNG+ IY S V PG F I D++ ++GDL VTI E DG+ Q +T+PYS+
Sbjct: 305 ARGTAQVTIKQNGYDIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSS 364

Query: 353 VPILQREGRFKFDLTAGDFRSGNSQQSSPFFFQGTALGGLPQEFTAYGGTQLSANYTAFL 412
VP+LQREG ++ +TAG++RSGN+QQ P FFQ T L GLP +T YGGTQL+ Y AF
Sbjct: 365 VPLLQREGHTRYSITAGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFN 424

Query: 413 LGLGRNLGNWGAVSLDVTHARSQLADDSRHEGDSIRFLYAKSMNTFGTNFQLMGYRYSTQ 472
G+G+N+G GA+S+D+T A S L DDS+H+G S+RFLY KS+N GTN QL+GYRYST
Sbjct: 425 FGIGKNMGALGALSVDMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTS 484

Query: 473 GFYTLDDVAYRRMEGYEYDYDYDGEHRDEPIIVNYHNLRFSRKDRLQLNISQSLNDFGSL 532
G++ D Y RM GY DG + +P +Y+NL ++++ +LQL ++Q L +L
Sbjct: 485 GYFNFADTTYSRMNGY-NIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRTSTL 543

Query: 533 YISGTHQKYWNTSDSDTWYQVGYTSSWVGISYSLSFSWNESVGIPDNERIVGLNVSVPFN 592
Y+SG+HQ YW TS+ D +Q G +++ I+++LS+S ++ ++++ LNV++PF+
Sbjct: 544 YLSGSHQTYWGTSNVDEQFQAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNIPFS 603

Query: 593 VLTKRRYTRENALDRAYASFNANRNSNGQN----------------SWQLSGGVVGHENG 636
R ++ A AS++ + + NG+ S+ + G G +G
Sbjct: 604 HWL--RSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDG 661

Query: 637 ---------ITLSQPLGDTN-------------------------------------VLI 650
+ G+ N VL+
Sbjct: 662 NSGSTGYATLNYRGGYGNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNDTVVLV 721

Query: 651 KAPGAGGVRIENQTGILTDWRGYAVMPYATVYRYNRIALDTNTMGNSIDVEKNISSVVPT 710
KAPGA ++ENQTG+ TDWRGYAV+PYAT YR NR+ALDTNT+ +++D++ +++VVPT
Sbjct: 722 KAPGAKDAKVENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPT 781

Query: 711 QGALVRANFDTRIGVRALITVTQGGKPVPFGSLVRENSTGITSMVGDDGQVYLSGAPLSG 770
+GA+VRA F R+G++ L+T+T KP+PFG++V S+ + +V D+GQVYLSG PL+G
Sbjct: 782 RGAIVRAEFKARVGIKLLMTLTHNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLAG 841

Query: 771 ELLVQWGDGANSRCIAHYVLPKQSLQQAVTVISAVC 806
++ V+WG+ N+ C+A+Y LP +S QQ +T +SA C
Sbjct: 842 KVQVKWGEEENAHCVANYQLPPESQQQLLTQLSAEC 877


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z0693  HTHFIS  61  7e-13  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 60.6 bits (147), Expect = 7e-13
Identities = 26/122 (21%), Positives = 55/122 (45%), Gaps = 2/122 (1%)

Query: 22 MKPTSVIIMDTHPIIRMSIEVLLQKNSELQIVLKTDDYRITIDYLRTRPVDLIIMDIDLP 81
M ++++ D IR + L + V T + ++ DL++ D+ +P
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGY--DVRITSNAATLWRWIAAGDGDLVVTDVVMP 58

Query: 82 GTDGFTFLKRIKQIQSTVKVLFLSSKSECFYAGRAIQAGANGFVSKCNDQNDIFHAVQMI 141
+ F L RIK+ + + VL +S+++ A +A + GA ++ K D ++ +
Sbjct: 59 DENAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRA 118

Query: 142 LS 143
L+
Sbjct: 119 LA 120


13  Z0722  Z0747  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
Z0722     -3      10       3.024143   hypothetical protein
Z0723     -3      10       1.963806   phosphopantetheinyltransferase component of
Z0724     -3      11       2.341445   outer membrane receptor FepA
Z0725     0       13       2.744288   enterobactin/ferric enterobactin esterase
Z0726     0       12       3.552960   hypothetical protein
Z0727     1       14       3.970103   enterobactin synthase subunit F
Z0728     1       14       3.489648   ferric enterobactin transport protein FepE
Z0729     0       15       5.617841   iron-enterobactin transporter ATP-binding
Z0731     0       15       5.616900   iron-enterobactin transporter permease
Z0732     -1      16       5.341371   iron-enterobactin transporter membrane protein
Z0733     -1      16       4.835976   enterobactin exporter EntS
Z0734     -2      16       4.601897   iron-enterobactin transporter periplasmic
Z0735     -2      20       4.926254   isochorismate synthase
Z0736     -2      20       4.795683   enterobactin synthase subunit E
Z0737     -1      19       4.581074   2,3-dihydro-2,3-dihydroxybenzoate synthetase
Z0738     -1      17       3.765373   2,3-dihydroxybenzoate-2,3-dehydrogenase
Z0739     -1      14       2.074611   hypothetical protein
Z0740     -2      13       -0.090887  carbon starvation protein
Z0742     -2      17       -2.618901  hypothetical protein
Z0743     -2      14       -3.870109  aminotransferase
Z0744     -2      16       -3.837933  hypothetical protein
Z0746     -2      17       -3.640214  hypothetical protein
Z0747     -2      14       -3.012509  LysR family transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z0722  HOKGEFTOXIC  64  5e-18  Hok/Gef cell toxic protein family signature.
		>HOKGEFTOXIC#Hok/Gef cell toxic protein family signature.

Length = 52

Score = 64.5 bits (157), Expect = 5e-18
Identities = 18/52 (34%), Positives = 28/52 (53%)

Query: 32 INMLTKYALVAVIVLCLTVLGFTLLVGDSLCEFTVKERNIEFKAVLAYEPKK 83
+ + + V+++CLT+L FT L SLCE ++ E A +AYE K
Sbjct: 1 MKLPRSSLVWCVLIVCLTLLIFTYLTRKSLCEIRYRDGYREVAAFMAYESGK 52


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z0723  ENTSNTHTASED  275  7e-97  Enterobactin synthetase component D signature.
		>ENTSNTHTASED#Enterobactin synthetase component D signature.

Length = 234

Score = 275 bits (705), Expect = 7e-97
Identities = 105/183 (57%), Positives = 130/183 (71%), Gaps = 1/183 (0%)

Query: 4 MKTTHTSLPFAGHTLHFVEFDPANFCEQDLLWLPHYAQLQHAGRKRKTEHLAGRIAAVYA 63
M T+H LPFAGH LH V+FD ++F E DLLWLPH+ +L+ AGRKRK EHLAGRIAAV+A
Sbjct: 1 MLTSHFPLPFAGHRLHIVDFDASSFREHDLLWLPHHDRLRSAGRKRKAEHLAGRIAAVHA 60

Query: 64 LREYGYKCVPAIGELRQPVWPAEVYGSISHCGATALAVVSRQPIGVDIEEIFSAQTATEL 123
LRE G + VP +G+ RQP+WP ++GSISHC TALAV+SRQ IG+DIE+I S TATEL
Sbjct: 61 LREVGVRTVPGMGDKRQPLWPDGLFGSISHCATTALAVISRQRIGIDIEKIMSQHTATEL 120

Query: 124 TDNIITPAEHERLADCGLAFSLALTLAFSAKESAFKA-SEIQTDAGFLDYQIISWNKQQV 182
+II E + L L F LALTLAFSAKES +KA S+ T GF ++ S +
Sbjct: 121 APSIIDSDERQILQASLLPFPLALTLAFSAKESVYKAFSDRVTLPGFNSAKVTSLTATHI 180

Query: 183 IIH 185
+H
Sbjct: 181 SLH 183


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z0733  TCRTETA  36  2e-04  Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 36.0 bits (83), Expect = 2e-04
Identities = 82/394 (20%), Positives = 145/394 (36%), Gaps = 38/394 (9%)

Query: 27 FISIVSLGLLGVAVPVQIQMMTHSTWQV---GLSVTLTGGAMFVGLMVGGVLADRYERKK 83
+ V +GL+ +P ++ + HS G+ + L F V G L+DR+ R+
Sbjct: 15 ALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRP 74

Query: 84 VILLARGTCGIGFIGLCLNALL--PEPSLLAIYLLGLWDGFFASLGVTALLAATPALVGR 141
V+L + G ++ + P L +Y+ + G + G A A +
Sbjct: 75 VLL-------VSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVA-GAYIADITDG 126

Query: 142 ENLMQAGAITMLTVRLGSVISPMIGGLLLATGGVAWNYGLAAAGTFITLLPLLSLPALPP 201
+ + G V P++GGL+ GG + + AA L L LP
Sbjct: 127 DERARHFGFMSACFGFGMVAGPVLGGLM---GGFSPHAPFFAAAALNGLNFLTGCFLLPE 183

Query: 202 PPQPREHPLK----SLLAGFRFLLASPLVGGIALLGGLLTMAS----AVRVLYPALADNW 253
+ PL+ + LA FR+ +V + + ++ + A+ V++ D +
Sbjct: 184 SHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFG--EDRF 241

Query: 254 QMSAAQIGFLYAAIP-LGAAIGALTSGKLAHSARPGLLMLLSTLGS---FLAIGLFGLMP 309
A IG AA L + A+ +G +A ++L + ++ +
Sbjct: 242 HWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGW 301

Query: 310 MWILGVVCLALFGWLSAVSSLLQYTMLQTQTPEAMLGRINGLWTAQNVTGDAIGAALLGG 369
M +V LA G ML Q E G++ G A +G L
Sbjct: 302 MAFPIMVLLASGGIGMPALQ----AMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTA 357

Query: 370 LGAMMTPVASASASGFGLLIIGVLLLLVLVELRR 403
+ A + + +G+ + L LL L LRR
Sbjct: 358 IYA----ASITTWNGWAWIAGAALYLLCLPALRR 387


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z0734  FERRIBNDNGPP  64  1e-13  Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 63.8 bits (155), Expect = 1e-13
Identities = 61/285 (21%), Positives = 101/285 (35%), Gaps = 35/285 (12%)

Query: 40 HTLESQPQRIVSTSVTLTGSLLAIDAPVIASGATTPNNRVADDQGFLRQWSKVAKERKLQ 99
H P RIV+ LLA+ VAD + R W E L
Sbjct: 29 HAAAIDPNRIVALEWLPVELLLALGIVPYG---------VADTINY-RLW---VSEPPLP 75

Query: 100 RLYIG-----EPSTEAVAAQMPDLILISATGGDSALALYDQLSTIAPTLIINYDDKSWQA 154
I EP+ E + P ++ SA G S + L+ IAP N+ D
Sbjct: 76 DSVIDVGLRTEPNLELLTEMKPSFMVWSAGYGPS----PEMLARIAPGRGFNFSDGKQPL 131

Query: 155 L-----LTQLGEITGHEKQAAERIAQFDKQLAAAKEQIKLPPQPVTAIVYTAAAHSANLW 209
LT++ ++ + A +AQ++ + + K + + ++
Sbjct: 132 AMARKSLTEMADLLNLQSAAETHLAQYEDFIRSMKPRFVKRGARPLLLTTLIDPRHMLVF 191

Query: 210 TPESAQGQMLEQLGFTLAKLPAGLNASQSQGKRHDIIQLGGENLAAGLNGESLFLFAGDQ 269
P S ++L++ G NA Q + + + LAA + + L +
Sbjct: 192 GPNSLFQEILDEYGIP--------NAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDHDNS 243

Query: 270 KDADAIYANPLLAHLPAVQNKQVYALGTETFRLDYYSAMQVLERL 314
KD DA+ A PL +P V+ + + F SAM + L
Sbjct: 244 KDMDALMATPLWQAMPFVRAGRFQRVPAVWFYGATLSAMHFVRVL 288


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z0737  ISCHRISMTASE  444  e-161  Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 444 bits (1142), Expect = e-161
Identities = 146/299 (48%), Positives = 195/299 (65%), Gaps = 18/299 (6%)

Query: 1 MAIPKLQAYALPESHDIPQNKVDWAFEPQRAALLIHDMQDYFVSFWGENCPMMEQVIANI 60
MAIP +Q Y +P + D+PQNKV W +P RA LLIHDMQ+YFV + + ++ ANI
Sbjct: 1 MAIPAIQPYQMPTASDMPQNKVSWVPDPNRAVLLIHDMQNYFVDAFTAGASPVTELSANI 60

Query: 61 AALRDYCKQHNIPVYYTAQPKEQSDEDRALLNDMWGPGLTRSPEQQKVVDRLTPDADDTV 120
L++ C Q IPV YTAQP Q+ +DRALL D WGPGL P ++K++ L P+ DD V
Sbjct: 61 RKLKNQCVQLGIPVVYTAQPGSQNPDDRALLTDFWGPGLNSGPYEEKIITELAPEDDDLV 120

Query: 121 LVKWRYSAFHRSPLEQMLKESGRNQLIITGVYAHIGCMTTATDAFMRDIKPFMVADALAD 180
L KWRYSAF R+ L +M+++ GR+QLIITG+YAHIGC+ TA +AFM DIK F V DA+AD
Sbjct: 121 LTKWRYSAFKRTNLLEMMRKEGRDQLIITGIYAHIGCLVTACEAFMEDIKAFFVGDAVAD 180

Query: 181 FSRDEHLMSLKYVAGRSGRVVMTEELL------PAPIPASKA-----------ALREVIL 223
FS ++H M+L+Y AGR VMT+ LL PA + + A +R+ I
Sbjct: 181 FSLEKHQMALEYAAGRCAFTVMTDSLLDQLQNAPADVQKTSANTGKKNVFTCENIRKQIA 240

Query: 224 PLLDESDEPFDDD-NLIDYGLDSVRMMALAARWRKVHGDIDFVMLAKNPTIDAWWKLLS 281
LL E+ E D +L+D GLDSVR+M L +WR+ ++ FV LA+ PTI+ W KLL+
Sbjct: 241 ELLQETPEDITDQEDLLDRGLDSVRIMTLVEQWRREGAEVTFVELAERPTIEEWQKLLT 299


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z0738  DHBDHDRGNASE  362  e-130  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 362 bits (930), Expect = e-130
Identities = 110/258 (42%), Positives = 150/258 (58%), Gaps = 20/258 (7%)

Query: 5 GKNVWVTGAGKGIGYATALAFVEAGAKVTGFD---------------QAFAQEQYPFATE 49
GK ++TGA +GIG A A GA + D +A E +P
Sbjct: 8 GKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFP---- 63

Query: 50 VMDVADAAQVAQVCQRLLAETERLDVLVNAAGILRMGATDQLSKEDWQQTFAVNVGGAFN 109
DV D+A + ++ R+ E +D+LVN AG+LR G LS E+W+ TF+VN G FN
Sbjct: 64 -ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFN 122

Query: 110 LFQQTMNQFRRQRGGAIVTVASDAAHTPRIGMSAYGASKAALKSLALSVGLELAGSGVRC 169
+ +R G+IVTV S+ A PR M+AY +SKAA +GLELA +RC
Sbjct: 123 ASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRC 182

Query: 170 NVVSPGSTDTDMQRTLWVSDDAEEQRIRGFGEQFKLGIPLGKIARPQEIANTILFLASDL 229
N+VSPGST+TDMQ +LW ++ EQ I+G E FK GIPL K+A+P +IA+ +LFL S
Sbjct: 183 NIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQ 242

Query: 230 ASHITLQDIVVDGGSTLG 247
A HIT+ ++ VDGG+TLG
Sbjct: 243 AGHITMHNLCVDGGATLG 260


14  Z0840  Z0861  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
Z0840     -1      17       4.327637   hypothetical protein
Z0841     -2      18       4.681133   DNA-binding transcriptional activator KdpE
Z0842     -2      19       5.786073   sensor protein KdpD
Z0843     0       19       4.697235   potassium-transporting ATPase subunit C
Z0844     0       19       4.431681   potassium-transporting ATPase subunit B
Z0845     1       20       3.193409   potassium-transporting ATPase subunit A
Z0846     2       23       2.633045   hypothetical protein
Z0847     1       22       1.010734   rhsC protein in rhs element, interrupted
Z0849     2       30       -6.883209  hypothetical protein
Z0851     -1      22       -3.321569  hypothetical protein
Z0853     -1      15       -3.412167  hypothetical protein
Z0855     -2      14       -1.581959  hypothetical protein
Z0857     -3      13       -0.325084  receptor
Z0858     -1      14       2.079666   hypothetical protein
Z0859     -1      14       2.774000   deoxyribodipyrimidine photolyase
Z0860     -1      16       2.629538   transporter
Z0861     0       16       3.741892   hydrolase-oxidase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z0841  HTHFIS  92  1e-23  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 91.8 bits (228), Expect = 1e-23
Identities = 35/125 (28%), Positives = 58/125 (46%), Gaps = 1/125 (0%)

Query: 2 TNVLIVEDEQAIRRFLRTALEGDGMRVYEAETLQRGLLEAATRKPDLIILDLGLPDGDGI 61
+L+ +D+ AIR L AL G V A DL++ D+ +PD +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 62 EFIRDLRQWSA-VPVIVLSARSEESDKIAALDAGADDYLSKPFGIGELQARLRVALRRHS 120
+ + +++ +PV+V+SA++ I A + GA DYL KPF + EL + AL
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 121 ATAAP 125
+
Sbjct: 124 RRPSK 128


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z0842  PF06580  32  0.012  Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 31.8 bits (72), Expect = 0.012
Identities = 10/48 (20%), Positives = 21/48 (43%), Gaps = 4/48 (8%)

Query: 785 LLENAVKYAGAQAE----IGIDAHVEGENLQLDVWDNGPGLPPGQEQT 828
L+EN +K+ AQ I + + + L+V + G +++
Sbjct: 263 LVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKES 310


15  Z0876  Z0907  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
Z0876     1       25       3.303711   succinate dehydrogenase cytochrome b556 small
Z0877     1       27       3.432043   succinate dehydrogenase flavoprotein subunit
Z0878     2       30       2.958933   succinate dehydrogenase iron-sulfur subunit
Z0879     2       32       3.042444   hypothetical protein
Z0880     2       27       1.957497   2-oxoglutarate dehydrogenase E1
Z0881     -1      22       0.737165   dihydrolipoamide succinyltransferase
Z0882     -1      15       -1.017402  succinyl-CoA synthetase subunit beta
Z0883     0       16       -2.298905  succinyl-CoA synthetase subunit alpha
Z0884     0       20       -3.702485  hypothetical protein
Z0885     0       15       -1.986594  LysR family transcriptional regulator
Z0886     0       16       -0.523331  cob(I)yrinic acid a,c-diamide
Z0887     -1      17       -0.355242  fumarate hydratase
Z0888     -1      18       0.496399   symport protein
Z0890     -1      20       0.819891   hypothetical protein
Z0891     -2      20       0.892919   hypothetical protein
Z0892     -1      19       -1.101680  methylaspartate ammonia-lyase
Z0893     0       23       -4.241010  glutamate mutase subunit E
Z0894     0       19       -4.032287  hypothetical protein
Z0895     2       21       -4.133358  methylaspartate mutase subunit S
Z0896     1       21       -4.073315  hypothetical protein
Z0897     2       21       -3.223660  hypothetical protein
Z0898     0       18       -1.962796  hypothetical protein
Z0900     1       23       0.587355   cytochrome d terminal oxidase, polypeptide
Z0901     2       18       0.383974   cytochrome d terminal oxidase polypeptide
Z0903     1       18       0.305388   hypothetical protein
Z0904     2       22       0.369221   acyl-CoA thioester hydrolase
Z0905     2       21       0.191695   colicin uptake protein TolQ
Z0906     2       19       0.221541   colicin uptake protein TolR
Z0907     2       17       -0.273455  cell envelope integrity inner membrane protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z0878  TCRTETOQM  31  0.003  Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 31.4 bits (71), Expect = 0.003
Identities = 11/41 (26%), Positives = 23/41 (56%), Gaps = 1/41 (2%)

Query: 14 VDDAPRMQDYTLEAEEGRDM-MLLDALIQLKEKDPSLSFRR 53
+++ + T+E + + MLLDAL+++ + DP L +
Sbjct: 339 IENPLPLLQTTVEPSKPQQREMLLDALLEISDSDPLLRYYV 379


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z0881  RTXTOXIND  30  0.020  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 29.8 bits (67), Expect = 0.020
Identities = 27/196 (13%), Positives = 56/196 (28%), Gaps = 12/196 (6%)

Query: 48 EVPASADGILDAVLEDEGTTVTSRQILGRLREGNSAGKETSAKSE-EKASTPAQRQQASL 106
E+ + I+ ++ EG +V +L +L + +S +A R Q
Sbjct: 98 EIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILS 157

Query: 107 EEQNNDAL----SPAIRRLLAEHNLDASAIKGTGVGGRLTRED----VEKHLAKAPAKES 158
+ L P + + T ++ E +L K A+
Sbjct: 158 RSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERL 217

Query: 159 APAAAAPAAQPALAARSEKRVPMTRLRKRVA---ERLLEAKNSTAMLTTFNEVNMKPIMD 215
A + + + L + A +LE +N V +
Sbjct: 218 TVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQ 277

Query: 216 LRKQYGEAFEKRHGIR 231
+ + A E+ +
Sbjct: 278 IESEILSAKEEYQLVT 293


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z0894  PF03309  34  0.001  Bvg accessory factor
		>PF03309#Bvg accessory factor

Length = 271

Score = 33.6 bits (77), Expect = 0.001
Identities = 10/55 (18%), Positives = 23/55 (41%)

Query: 3 IVSVDIGSTWTKAALFTREGDALTLVNHVLTPTTTHHLAKGFFSSLNQVLNVDNA 57
++++D+ +T T L + GD +V T A +++ ++ D
Sbjct: 2 LLAIDVRNTHTVVGLISGSGDHAKVVQQWRIRTEPEVTADELALTIDGLIGDDAE 56


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z0907  IGASERPTASE  54  6e-10  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 54.3 bits (130), Expect = 6e-10
Identities = 33/215 (15%), Positives = 69/215 (32%), Gaps = 1/215 (0%)

Query: 66 RMQSQESSAKRSDEQRKMKEQQAAEELREKQAAEQERLKQLEKERLAAQEQKKQAEEAAK 125
R ++E+ + + + Q+ E +E Q E + +EKE A E +K E
Sbjct: 1066 REVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKV 1125

Query: 126 QAELKQKQAEEAAAKAAADAKAKAEADDKAAEEAAKKAAADAKKKAEAEAAKAAAEAQKK 185
+++ KQ + + A+ + + +E + A + A+ + E
Sbjct: 1126 TSQVSPKQEQSETVQPQAEPARENDP-TVNIKEPQSQTNTTADTEQPAKETSSNVEQPVT 1184

Query: 186 AEAAAAALKKKAEAAEAAAAEARKKAAAEKAAADKKAAEKAAAEKAAADKKAAAEKAAAD 245
E E + +++ K + + + + A +
Sbjct: 1185 ESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDR 1244

Query: 246 KKAAAAKAAAEKAAAAKAAAEADDIFGELSSGKNA 280
A + A + A A F L+ GK
Sbjct: 1245 STVALCDLTSTNTNAVLSDARAKAQFVALNVGKAV 1279



Score = 53.9 bits (129), Expect = 7e-10
Identities = 26/169 (15%), Positives = 57/169 (33%), Gaps = 2/169 (1%)

Query: 99 EQERLKQLEKERLAAQEQKKQAEEAAKQAELKQKQAEEAAAKAAADAKAKAEADDKAAEE 158
E E+ Q QA+ + + ++ + A +E + AE
Sbjct: 984 EVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAEN 1043

Query: 159 AAKKAAADAKKKAEAEAAKAAAEAQKKAEAAAAALKKKAEAAEAAAAEARKKAAAEKAAA 218
+ +++ +K E +A + A+ ++ A+ A + +K + E A + + K
Sbjct: 1044 SKQESK--TVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETK 1101

Query: 219 DKKAAEKAAAEKAAADKKAAAEKAAADKKAAAAKAAAEKAAAAKAAAEA 267
+ EK K +K K + ++ + A A
Sbjct: 1102 ETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPAREND 1150



Score = 52.0 bits (124), Expect = 3e-09
Identities = 23/162 (14%), Positives = 58/162 (35%), Gaps = 8/162 (4%)

Query: 86 QQAAEELREKQAAEQERLKQLEKERLAAQEQKKQAEEAAKQAELKQKQAEEAAAKAAADA 145
+ AE +++ + K + + ++ A+EA + + E A + +
Sbjct: 1038 ETVAENSKQESKTVE---KNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKE 1094

Query: 146 KAKAEADDKAAEEAAKKAAADAKKKAEAEAAKAAAEAQKKAEAAAAALKKKAEAAEAAAA 205
E + A E +KA + +K E K ++ K E + + A E
Sbjct: 1095 TQTTETKETATVEKEEKAKVETEKT--QEVPKVTSQVSPKQEQSETVQPQAEPARENDPT 1152

Query: 206 EARKKAAAEKAAADKKAAEKAAAEKAAADKKAAAEKAAADKK 247
K+ ++ + A + A++ +++ + ++
Sbjct: 1153 VNIKEPQSQT---NTTADTEQPAKETSSNVEQPVTESTTVNT 1191



Score = 52.0 bits (124), Expect = 3e-09
Identities = 24/192 (12%), Positives = 66/192 (34%), Gaps = 5/192 (2%)

Query: 68 QSQESSAKRSDEQRKMKEQQAAEELREKQAAEQERLKQLEKERLAAQEQKKQAEEAAKQA 127
Q+ S ++E+ ++ +E ++ + +K E+ A +
Sbjct: 1004 QADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKN--EQDATET 1061

Query: 128 ELKQKQ-AEEAAAKAAADAKAKAEADDKAAEEAAKKAAADAKKKAEAEAAKAAAEAQKKA 186
+ ++ A+EA + A+ + E +E + + + KA E +K
Sbjct: 1062 TAQNREVAKEAKSNVKANTQT-NEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQ 1120

Query: 187 EAAAAALKKKAEAAEAAAAEARKKAAAEKAAADKKAAEKAAAEKAAADKKAAAEKAAADK 246
E + + ++ + + + A E E + AD + A++ +++
Sbjct: 1121 EVPKVTSQVSPKQEQSETVQPQAEPARENDPT-VNIKEPQSQTNTTADTEQPAKETSSNV 1179

Query: 247 KAAAAKAAAEKA 258
+ ++
Sbjct: 1180 EQPVTESTTVNT 1191



Score = 51.6 bits (123), Expect = 4e-09
Identities = 31/187 (16%), Positives = 58/187 (31%), Gaps = 8/187 (4%)

Query: 87 QAAEELREKQAAEQERLKQLEKERLAAQEQKKQAEEAAKQAELKQKQAEEAAAKAAADAK 146
QA E R+ + A + E A+ KQ + K DA
Sbjct: 1004 QADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAEN----SKQESKTVEKNEQDAT 1059

Query: 147 AKAEADDKAAEEAAKKAAADAKKKAEAEAAKAAAEAQKKA--EAAAAALKKKAEAAEAAA 204
+ + A+EA A+ + A++ E Q E A ++KA+
Sbjct: 1060 ETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKT 1119

Query: 205 AEARKKAAAEKAAADKKAAEKAAAEKAAADKKA--AAEKAAADKKAAAAKAAAEKAAAAK 262
E K + ++ + AE A + E + A + A++ ++
Sbjct: 1120 QEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNV 1179

Query: 263 AAAEADD 269
+
Sbjct: 1180 EQPVTES 1186



Score = 50.1 bits (119), Expect = 1e-08
Identities = 29/251 (11%), Positives = 84/251 (33%), Gaps = 19/251 (7%)

Query: 51 DAVMVDSGAVVEQYKRMQSQESSAKRSDEQRKMKEQQAAE-ELREKQAAEQER------L 103
D V A + ++ ++K+ + + EQ A E + ++ A++ +
Sbjct: 1021 DEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANT 1080

Query: 104 KQLEKERLAAQEQKKQAEEAAKQAELKQKQAE--------EAAAKAAADAKAK---AEAD 152
+ E + ++ ++ Q K+ +K+ + + K + K +E
Sbjct: 1081 QTNEVAQSGSETKETQ-TTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETV 1139

Query: 153 DKAAEEAAKKAAADAKKKAEAEAAKAAAEAQKKAEAAAAALKKKAEAAEAAAAEARKKAA 212
AE A + K+ +++ A Q E ++ + E+ + +
Sbjct: 1140 QPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENP 1199

Query: 213 AEKAAADKKAAEKAAAEKAAADKKAAAEKAAADKKAAAAKAAAEKAAAAKAAAEADDIFG 272
A + + + ++ + ++ A ++ +++ A + +
Sbjct: 1200 ENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCDLTSTNTNA 1259

Query: 273 ELSSGKNAPKT 283
LS + +
Sbjct: 1260 VLSDARAKAQF 1270


16  Z0943  Z0993  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
Z0943     0       23       -4.127354  pectinesterase
Z0946     3       32       -5.572247  integrase encoded by prophage CP-933K; partial
Z0947     2       25       -3.809957  hypothetical protein
Z0948     0       24       -3.319219  hypothetical protein
Z0949     1       24       -1.765885  hypothetical protein
Z0950     2       24       -4.536656  hypothetical protein
Z0951     0       24       -4.738435  exonuclease
Z0952     1       30       -7.680345  Bet recombination protein of prophage CP-933K
Z0953     2       35       -8.988455  hypothetical protein
Z0954     2       34       -8.915797  hypothetical protein
Z0955     2       35       -8.054504  hypothetical protein
Z0956     0       31       -5.891128  antiterminator Q protein of prophage CP-933K
Z0957     2       26       -4.540835  hypothetical protein
Z0958     3       17       1.841297   hypothetical protein
Z0960     2       17       2.392229   lysozyme protein R of prophage CP-933K
Z0961     3       18       2.709658   endopeptidase Rz of prophage CP-933K
Z0962     3       19       2.006751   hypothetical protein
Z0963     3       19       1.758276   hypothetical protein
Z0964     3       20       1.839192   DNA packaging protein of prophage CP-933K
Z0965     3       23       1.601685   hypothetical protein
Z0966     5       22       1.070619   hypothetical protein
Z0967     5       20       1.450567   protease encoded in prophage CP-933K
Z0968     6       26       2.568577   hypothetical protein
Z0969     7       28       3.164904   hypothetical protein
Z0970     4       26       2.583165   tail component of prophage CP-933K
Z0971     4       26       2.272823   tail component of prophage CP-933K
Z0972     3       25       2.547950   tail component of prophage CP-933K
Z0973     3       27       2.885730   tail component of prophage CP-933K
Z0974     3       24       3.540997   tail component of prophage CP-933K
Z0975     2       22       3.963877   tail component of prophage CP-933K
Z0976     3       25       4.822316   tail component of prophage CP-933K
Z0977     3       26       5.362192   tail component of prophage CP-933K
Z0978     4       28       5.100463   tail component of prophage CP-933K
Z0979     2       25       1.841472   tail component of prophage CP-933K
Z0980     2       23       0.148976   tail component of prophage CP-933K
Z0981     5       37       -4.976131  prophage protein
Z0982     8       43       -6.172506  tail component of prophage CP-933K
Z0984     6       45       -9.295099  hypothetical protein
Z0985     2       32       -6.345159  hypothetical protein
Z0986     0       20       -2.556443  hypothetical protein
Z0989     -1      16       0.144057   hypothetical protein
Z0990     0       14       2.766209   hypothetical protein
Z0992     1       15       4.144666   kinase inhibitor protein
Z0993     -1      13       3.573228   adenosylmethionine-8-amino-7-oxononanoate
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z0953  TYPE4SSCAGX  29  0.012  Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein signature.

Length = 522

Score = 29.0 bits (64), Expect = 0.012
Identities = 14/41 (34%), Positives = 30/41 (73%), Gaps = 2/41 (4%)

Query: 36 KIALERRSKEREKAEKAEKAAEKKRRREEQKQKDKLKIQKL 76
K ALE+ + +E+A+KA+K +K+ +R+E++ K++ ++ L
Sbjct: 145 KKALEKEKEAKEQAQKAQK--DKREKRKEERAKNRANLENL 183


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z0975  cloacin  44  3e-06  Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 43.9 bits (103), Expect = 3e-06
Identities = 34/142 (23%), Positives = 62/142 (43%), Gaps = 4/142 (2%)

Query: 519 DQQRLNDLQEKKRQKDLQDAK--EQAERNYQEQQKRRNAENAALNRMNETEAARHQREIA 576
DQ + +E +RQ++ E AERNY+ + N N + R E +A Q +
Sbjct: 294 DQVKQRQDEENRRQQEWDATHPVEAAERNYERARAELNQANEDVARNQERQAKAVQVYNS 353

Query: 577 RINAMQYADQAVRDA-AIQRENERYEKALASGKKKTRETRNDEATRLLLQYSQQQAQVEG 635
R + + A++ + DA A ++ R+ +G + + +A R + +QA +
Sbjct: 354 RKSELDAANKTLADAIAEIKQFNRFAHDPMAGGHRMWQMAGLKAQRAQTDVNNKQAAFDA 413

Query: 636 QIAAARQSAGIATERMTEARKQ 657
A + A A E+RK+
Sbjct: 414 -AAKEKSDADAALSSAMESRKK 434


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z0980  SURFACELAYER  33  0.005  Lactobacillus surface layer protein signature.
		>SURFACELAYER#Lactobacillus surface layer protein signature.

Length = 439

Score = 33.5 bits (76), Expect = 0.005
Identities = 34/143 (23%), Positives = 45/143 (31%), Gaps = 30/143 (20%)

Query: 966 SVNANSGTLNNVTVNENCTIKGMLEATQV----RGDF---------VKAVSKSFPKQAGT 1012
+ + L NVT + +K L+A ++ G F VKA S K A
Sbjct: 235 AAQYDKKQLTNVTFDTETAVKDALKAQKIEVSSVGYFKAPHTFTVNVKATSNKNGKSATL 294

Query: 1013 WGNTETPNGTVTVTISDDHNFDRQIIIPPIIFNGIAYSDPGSGNNPGGTRYTGYGFEVRK 1072
PN V S I+ N Y + G R
Sbjct: 295 PVTVTVPNVADPVVPSQSKT---------IMHNAYFYDKDA--------KRVGTDKVTRY 337

Query: 1073 NGVLIASRETKGAIPGSYSAVID 1095
N V +A TK A SY VI+
Sbjct: 338 NTVTVAMNTTKLANGISYYEVIE 360


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z0981  ENTEROVIROMP  87  2e-24  Enterobacterial virulence outer membrane protein si...
		>ENTEROVIROMP#Enterobacterial virulence outer membrane protein signature.

Length = 171

Score = 87.3 bits (216), Expect = 2e-24
Identities = 45/140 (32%), Positives = 72/140 (51%), Gaps = 15/140 (10%)

Query: 1 MRKVCAAILSAAICLAVSGVPAWASEHQSTLSAGYLHASTDAPG-SDDLNGINVKYRYEF 59
M+K+ AA+ +G A ST++ GY A +DA G + + G N+KYRYE
Sbjct: 1 MKKIACLSALAAVLAFTAGTSVAA---TSTVTGGY--AQSDAQGQMNKMGGFNLKYRYEE 55

Query: 60 TDT-LGLITSFSYANAEDEQKTHYSDTRWHEDYVRNRWFSVMAGPSVRVNEWFSAYAMAG 118
++ LG+I SF+Y T S T DY +N+++ + AGP+ R+N+W S Y + G
Sbjct: 56 DNSPLGVIGSFTY--------TEKSRTASSGDYNKNQYYGITAGPAYRINDWASIYGVVG 107

Query: 119 VAYSRVSTFSGDYFRVTDNK 138
V Y + T ++ +
Sbjct: 108 VGYGKFQTTEYPTYKHDTSD 127


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z0982  CHANLCOLICIN  33  0.002  Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 33.1 bits (75), Expect = 0.002
Identities = 36/130 (27%), Positives = 55/130 (42%), Gaps = 15/130 (11%)

Query: 133 SARNAGISASKAEASAANADTSAEDASESARQAAESAASAKKSEEASSSSAS-------- 184
S G SK+E+SAA T+ ++ + AE AA AK + EA + + +
Sbjct: 34 SGGGGGKGGSKSESSAAIHATAKWSTAQLKKTQAEQAARAKAAAEAQAKAKANRDALTQR 93

Query: 185 ------EAAQKASESLQSATDAELSKKTAESAAGNAARDATTSTEKARESAESAQSAEQS 238
EA + + SAT+ + A A R A + EKAR+ AE+A+ A Q
Sbjct: 94 LKDIVNEALRHNASRTPSATELAHANNAAMQAEDERLRLA-KAEEKARKEAEAAEKAFQE 152

Query: 239 RIAAEDAVNR 248
+ R
Sbjct: 153 AEQRRKEIER 162


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z0989  YERSSTKINASE  29  0.027  Yersinia serine/threonine protein kinase signature.
		>YERSSTKINASE#Yersinia serine/threonine protein kinase signature.

Length = 732

Score = 28.9 bits (64), Expect = 0.027
Identities = 19/66 (28%), Positives = 32/66 (48%), Gaps = 3/66 (4%)

Query: 200 RMDKINGESLLNISSLPAQAEHAIYDMFDRLEQKGILFVDTTETNVLYDRAKNEFNPIDI 259
+ KIN E+ A H + D+ + L + G++ D NV++DRA E ID+
Sbjct: 234 KQGKINSEAYWGTIKFIA---HRLLDVTNHLAKAGVVHNDIKPGNVVFDRASGEPVVIDL 290

Query: 260 SSYNVS 265
++ S
Sbjct: 291 GLHSRS 296


17  Z1008  Z1019  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
Z1008  -2  21  3.184291  cardiolipin synthase 2
Z1009  -2  21  3.636463  hypothetical protein
Z1010  -2  21  3.521536  hypothetical protein
Z1012  -2  19  3.797930  hypothetical protein
Z1013  -2  19  3.853869  hypothetical protein
Z1014  -1  13  -2.111963  ABC transporter ATP-binding protein
Z1015  -1  13  -2.345046  hypothetical protein
Z1016  0  13  -2.569080  DNA-binding transcriptional regulator
Z1017  0  14  -2.109336  ATP-dependent RNA helicase RhlE
Z1018  0  17  -3.513179  hypothetical protein
Z1019  0  15  -3.092076  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z1012  ABC2TRNSPORT  32  0.003  ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.
Length = 262

Score = 31.8 bits (72), Expect = 0.003
Identities = 15/51 (29%), Positives = 26/51 (50%)

Query: 232 VFMMPAILLSGYVSPVENMPVWLQNLTWINPIRHFTDITKQIYLKDASLDI 282
+ + P + LSG V PV+ +P+ Q P+ H D+ + I L +D+
Sbjct: 184 LVITPILFLSGAVFPVDQLPIVFQTAARFLPLSHSIDLIRPIMLGHPVVDV 234


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z1014  PF05272  32  0.008  Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 31.6 bits (71), Expect = 0.008
Identities = 20/90 (22%), Positives = 28/90 (31%), Gaps = 21/90 (23%)

Query: 240 TPRFEDAFIDLLGGAGTSESPLGAILHTVEGTPGETVIEAKELTKKFGDFAATDHVNFAV 299
PR E + +LG P + + + K HV +
Sbjct: 547 VPRLEKWLVHVLGKTPDDYKP-------------RRLRYLQLVGKYI----LMGHVARVM 589

Query: 300 KRGEIFG----LLGPNGAGKSTTFKMMCGL 325
+ G F L G G GKST + GL
Sbjct: 590 EPGCKFDYSVVLEGTGGIGKSTLINTLVGL 619



Score = 29.7 bits (66), Expect = 0.037
Identities = 11/23 (47%), Positives = 13/23 (56%)

Query: 39 YVTGLVGPDGAGKTTLMRMLAGL 61
Y L G G GK+TL+ L GL
Sbjct: 597 YSVVLEGTGGIGKSTLINTLVGL 619


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z1015  RTXTOXIND  53  4e-10  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.
Length = 478

Score = 52.9 bits (127), Expect = 4e-10
Identities = 36/203 (17%), Positives = 78/203 (38%), Gaps = 24/203 (11%)

Query: 80 YEIALMQAKAGVSVAQAQYD-LMLTLKSAQDKLRQYRSGNREQ---DIAQAKASLEQAQA 135
E ++A + V ++Q + + + SA+++ + + + + Q ++
Sbjct: 257 QENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTL 316

Query: 136 QLAQAELNLQDSTLIAPSDGTLLTRAV-EPGTVLNEGGTVFTVSLT-RPVWVRAYVDERN 193
+LA+ E Q S + AP + V G V+ T+ + + V A V ++
Sbjct: 317 ELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKD 376

Query: 194 LDQAQPGRKVLLYTDGRPDKPYH---GQIGFVSPTAEFTPKTVETPDLRTDLVYRLRIVV 250
+ G+ ++ + P Y G++ ++ A D R LV+ + I +
Sbjct: 377 IGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDA--------IEDQRLGLVFNVIISI 428

Query: 251 T-------DADDALRQGMPVTVQ 266
+ + L GM VT +
Sbjct: 429 EENCLSTGNKNIPLSSGMAVTAE 451



Score = 48.3 bits (115), Expect = 1e-08
Identities = 24/157 (15%), Positives = 52/157 (33%), Gaps = 15/157 (9%)

Query: 4 KPVVIGLAVVVLAAVVAGXYWWYQSRQDNGLTLYGNV--DIRTVNLSFRVGGRVESLAVD 61
+A ++ +V + + T G + R+ + V+ + V
Sbjct: 54 SRRPRLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVK 113

Query: 62 EGDAIKAGQVLGELDHKPYEIALMQAKAGVSVAQAQYDLMLTLKSAQDK----------- 110
EG++++ G VL +L E ++ ++ + A+ + L + +
Sbjct: 114 EGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDE 173

Query: 111 --LRQYRSGNREQDIAQAKASLEQAQAQLAQAELNLQ 145
+ + + K Q Q Q ELNL
Sbjct: 174 PYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLD 210


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z1016  HTHTETR  73  6e-18  TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 73.1 bits (179), Expect = 6e-18
Identities = 33/214 (15%), Positives = 77/214 (35%), Gaps = 17/214 (7%)

Query: 13 KGEQAKKQLIAAALAQFGEYGMNATT-REIAAQAGQNIAAITYYFGSKEDLYLACAQWIA 71
+ ++ ++ ++ AL F + G+++T+ EIA AG AI ++F K DL+ +
Sbjct: 8 EAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSE 67

Query: 72 DFIGEQFRPHAEEAERLFAQPQPDRAAIRELILRACRNMIKLLTQDDTVNLSKFISREQL 131
IGE E + P + +RE+++ + + + + + F E +
Sbjct: 68 SNIGELEL---EYQAKFPGDP---LSVLREILIHVLESTVTEERRRLLMEII-FHKCEFV 120

Query: 132 SPTAAYHLVHEQVISPLHSHLTRLIAAWTGCDANDTRMILHTHALIGEILAFRLGKETIL 191
A + + + + + +A L T + + G
Sbjct: 121 GEMAVVQQAQRNLCLESYDRIEQTLKH--CIEAKMLPADLMTRRAAIIMRGYISG----- 173

Query: 192 LRTGWTAFDEEKTELINQTVTCHIDLILQGLSQR 225
L W + + + ++ ++L+
Sbjct: 174 LMENWLFAPQSFD--LKKEARDYVAILLEMYLLC 205


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z1017  SECA  30  0.025  SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 29.8 bits (67), Expect = 0.025
Identities = 20/67 (29%), Positives = 34/67 (50%), Gaps = 4/67 (5%)

Query: 246 QQVLVFTRTKHGANHLAEQLNKDGIRSAAIHG-NKSQGARTRALADFKSGDIRVLVATDI 304
Q VLV T + + ++ +L K GI+ ++ + A A A + + V +AT++
Sbjct: 450 QPVLVGTISIEKSELVSNELTKAGIKHNVLNAKFHANEAAIVAQAGYPAA---VTIATNM 506

Query: 305 AARGLDI 311
A RG DI
Sbjct: 507 AGRGTDI 513


18  Z1046m  Z1058  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias  Product
Z1046m  -2  14  3.850275  pyruvate-formate lyase
Z1047  -1  12  3.696222  pyruvate formate-lyase 2 activating enzyme
Z1048  -1  13  3.128698  fructose-6-phosphate aldolase
Z1049  -1  13  3.157173  molybdopterin biosynthesis protein MoeB
Z1050  -1  14  2.785659  molybdopterin biosynthesis protein MoeA
Z1051m  0  15  -1.039933  L-asparaginase
Z1053  0  15  -2.751716  glutathione transporter ATP-binding protein
Z1054  0  15  -4.220934  transporter
Z1055  0  14  -5.832617  transport system permease
Z1056  -1  11  -5.590528  transport system permease
Z1057  -1  12  -6.419552  hypothetical protein
Z1058  0  11  -3.242020  hypothetical protein
19  Z1092  Z1102  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
Z1092  -1  17  -5.654933  arginine transporter permease subunit ArtQ
Z1093  -1  22  -8.510815  arginine ABC transporter substrate-binding
Z1094  -2  26  -8.731153  arginine transporter ATP-binding subunit
Z1095  1  25  -7.455761  lipoprotein
Z1096  1  21  -6.073960  hypothetical protein
Z1097  0  18  -3.484609  hypothetical protein
Z1098  1  12  1.781010  hypothetical protein
Z1099  -1  15  3.339830  hypothetical protein
Z1100  -1  14  3.429807  regulator
Z1102  -3  17  3.142214  nucleotide di-P-sugar epimerase or dehydratase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z1094  PF05272  30  0.010  Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.0 bits (67), Expect = 0.010
Identities = 9/18 (50%), Positives = 12/18 (66%)

Query: 31 LVLLGPSGAGKSSLLRVL 48
+VL G G GKS+L+ L
Sbjct: 599 VVLEGTGGIGKSTLINTL 616


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z1100  ECOLIPORIN  29  0.025  E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 28.7 bits (64), Expect = 0.025
Identities = 20/54 (37%), Positives = 27/54 (50%), Gaps = 9/54 (16%)

Query: 2 RRVFWLIAVALLLAGCAGEKGIVEKEGYQLDTRRQAQAAYPRIKVLVIHYTADD 55
R+V L+ ALL AG A I K+G +LD Y ++ L HY +DD
Sbjct: 3 RKVLALVIPALLAAGAAHAAEIYNKDGNKLDL-------YGKVDGL--HYFSDD 47


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z1102  NUCEPIMERASE  73  1e-16  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 72.9 bits (179), Expect = 1e-16
Identities = 69/363 (19%), Positives = 122/363 (33%), Gaps = 65/363 (17%)

Query: 13 MKVLVTGATSGLGRNAVEFLCQKXISVRA---------TGRNEAMGKLLEKMGAEFVPAD 63
MK LVTGA +G + + L + V +A +LL + G +F D
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKID 60

Query: 64 LTELVSSQAKVMLAGIDTLWHCS-------SFTSPWGTQQAFDLANVRATRRLGEWAVAW 116
L + + ++ S +P A+ +N+ + E
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPH----AYADSNLTGFLNILEGCRHN 116

Query: 117 GVRNFIHISSPSLYFDYHHHRDIKEDFRPHRFANEFARSKAASEEVINMLSQANPQTRFT 176
+++ ++ SS S+Y + D + +A +K A+E + + S T
Sbjct: 117 KIQHLLYASSSSVYGL-NRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSH-LYGLPAT 174

Query: 177 ILRPQSLFGPHDK--VFIPRLAHMMHHYGSILLPHGGSALVDMTYYENAVHAMWLASQEA 234
LR +++GP + + + + M SI + + G D TY ++ A+
Sbjct: 175 GLRFFTVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIRLQDVI 234

Query: 235 CDKLPS--------------GRVYNITNGEHRTLRSIVQKLIDELNIDCRIRSVPYPMLD 280
RVYNI N L +Q L D L I+ + +P D
Sbjct: 235 PHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAKKNMLPLQPGD 294

Query: 281 MIARSMERLGRKSAKEPPLTHYGVSKLNFDFTLDITRAQEELGYQPVITLDEGIEKTAAW 340
+ T D E +G+ P T+ +G++ W
Sbjct: 295 V----------------LETS-----------ADTKALYEVIGFTPETTVKDGVKNFVNW 327

Query: 341 LRD 343
RD
Sbjct: 328 YRD 330


20  Z1116  Z1222  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
Z1116  -3  16  -4.289040  macrolide transporter ATP-binding /permease
Z1117  -1  20  -6.441858  stationary phase/starvation inducible regulatory
Z1118  -1  21  -6.441341  ATP-dependent Clp protease adaptor protein ClpS
Z1119  0  22  -6.428082  ATP-dependent Clp protease ATP-binding subunit
Z1120  3  39  -10.816884  P4 family integrase
Z1121  4  39  -12.549741  hypothetical protein
Z1122  5  33  -7.641213  hypothetical protein
Z1123  4  30  -5.716048  hypothetical protein
Z1124  5  32  -5.776372  prophage regulatory protein
Z1125  3  23  -2.676602  hypothetical protein
Z1126  2  22  -1.958186  hypothetical protein
Z1128  2  20  -0.186910  hypothetical protein
Z1129  3  23  0.538603  helicase
Z1130  3  26  2.447262  hypothetical protein
Z1131  4  26  2.033276  hypothetical protein
Z1132  4  32  -2.726236  hypothetical protein
Z1133  6  36  -4.221510  transposase
Z1134  6  36  -5.511649  hypothetical protein
Z1135  8  47  -13.604275  complement resistance protein
Z1136  9  51  -14.878603  hypothetical protein
Z1137  6  40  -11.701745  hypothetical protein
Z1138  5  35  -10.197567  hypothetical protein
Z1139  4  33  -9.321625  diacylglycerol kinase
Z1140  3  23  -4.984786  hypothetical protein
Z1141  1  21  1.904814  hypothetical protein
Z1142  0  20  3.346768  urease accessory protein D
Z1143  1  21  3.357131  urease subunit gamma
Z1144  1  20  3.198917  urease subunit beta
Z1145  0  20  2.653443  urease subunit alpha
Z1146  1  20  2.039280  urease accessory protein UreE
Z1147  1  21  0.447087  urease accessory protein F
Z1148  2  22  -2.788870  urease accessory protein G
Z1149  1  27  -7.741974  hypothetical protein
Z1150  4  38  -13.693169  hypothetical protein
Z1151  6  37  -14.907319  hypothetical protein
Z1152  5  37  -13.818896  50S ribosomal protein L31
Z1153  3  32  -9.653385  hypothetical protein
Z1154  3  33  -8.768367  hypothetical protein
Z1155  3  29  -6.462815  hypothetical protein
Z1156  3  22  1.635475  hypothetical protein
Z1157  4  24  2.173659  hypothetical protein
Z1158  4  26  2.236172  hypothetical protein
Z1159  4  24  1.468329  hypothetical protein
Z1160  2  22  1.947019  hypothetical protein
Z1161  2  22  2.095622  hypothetical protein
Z1162  1  25  1.791720  hypothetical protein
Z1163  1  25  1.702878  hypothetical protein
Z1164  1  25  1.907296  hypothetical protein
Z1165  1  23  1.666223  hypothetical protein
Z1166  1  23  1.364423  hypothetical protein
Z1167  2  22  1.525438  hypothetical protein
Z1168  3  20  1.190815  hypothetical protein
Z1169  2  20  0.883595  hypothetical protein
Z1170  1  21  0.502643  hypothetical protein
Z1171  1  26  0.756741  phage inhibition, colicin resistance and
Z1172  2  24  0.602667  phage inhibition, colicin resistance and
Z1173  5  22  -0.171150  phage inhibition, colicin resistance and
Z1174  6  23  -0.739781  phage inhibition, colicin resistance and
Z1175  5  23  -2.211291  phage inhibition, colicin resistance and
Z1176  5  25  -3.563242  phage inhibition, colicin resistance and
Z1177  7  29  -4.579344  phage inhibition, colicin resistance and
Z1178  7  30  -4.748620  bifunctional enterobactin receptor/adhesin
Z1180  4  44  -13.761508  hypothetical protein
Z1181  6  38  -11.958773  hypothetical protein
Z1182  4  34  -8.652680  hypothetical protein
Z1183  4  30  -6.356795  hypothetical protein
Z1184  4  23  -4.364218  hypothetical protein
Z1185  4  33  -6.376010  hypothetical protein
Z1186  4  37  -6.402119  hypothetical protein
Z1187  3  31  -4.907910  hypothetical protein
Z1188  3  33  -6.842569  hypothetical protein
Z1189  3  33  -5.855308  hypothetical protein
Z1190  2  29  -4.659750  glucosyltransferase
Z1191  3  23  -2.891019  hypothetical protein
Z1192  3  22  -2.148245  IS1 protein InsB
Z1193  4  22  -1.583003  hypothetical protein
Z1194  3  21  0.826967  hypothetical protein
Z1195  3  25  1.038875  hypothetical protein
Z1196  3  26  -0.810715  hypothetical protein
Z1197  3  27  -4.474247  hypothetical protein
Z1198  2  26  -3.669148  transposase
Z1199  4  29  -3.989727  hypothetical protein
Z1200  4  30  -4.827473  hypothetical protein
Z1201  5  30  -6.126465  hypothetical protein
Z1202  5  28  -5.629425  hypothetical protein
Z1203  5  24  -2.244071  hypothetical protein
Z1204  7  26  -4.456452  hypothetical protein
Z1205  5  23  -2.168763  hypothetical protein
Z1206  8  23  3.405803  hypothetical protein
Z1207  6  21  2.741119  hypothetical protein
Z1208  6  22  2.839312  hypothetical protein
Z1209  6  22  3.035171  hypothetical protein
Z1210  6  22  3.433772  histone
Z1211  6  22  3.164245  adhesin
Z1212  5  23  2.463925  hypothetical protein
Z1213  6  23  3.539066  hypothetical protein
Z1214  7  25  3.880504  hypothetical protein
Z1215  8  30  4.459322  hypothetical protein
Z1216  7  28  4.006156  hypothetical protein
Z1217  6  27  3.600437  RadC family DNA repair protein
Z1218  6  23  2.431510  hypothetical protein
Z1219  5  24  2.262041  hypothetical protein
Z1220  6  22  0.696047  structural protein
Z1221  5  20  -0.233454  transposase
Z1222  2  19  -0.347932  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z1119  HTHFIS  36  5e-04  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 36.0 bits (83), Expect = 5e-04
Identities = 31/114 (27%), Positives = 46/114 (40%), Gaps = 4/114 (3%)

Query: 177 NQLARVGGIDPLIGREKELERAIQVLCR--RRKNNPLLVGESGVGKTAIAEGLAWRIVQG 234
PL+GR ++ +VL R + ++ GESG GK +A L +
Sbjct: 128 KLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALHDYGKRR 187

Query: 235 DVPEVMADCTIYSLD-IGSLLAGTKYRGDFEKRFKALLKQLEQDTNSILFIDEI 287
+ P V + D I S L G + +G F + EQ LF+DEI
Sbjct: 188 NGPFVAINMAAIPRDLIESELFGHE-KGAFTGAQTRSTGRFEQAEGGTLFLDEI 240


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z1124  HTHFIS  26  0.043  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 25.6 bits (56), Expect = 0.043
Identities = 6/15 (40%), Positives = 13/15 (86%)

Query: 21 SQLLGISRSTIYEKM 35
+ LLG++R+T+ +K+
Sbjct: 456 ADLLGLNRNTLRKKI 470


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z1145  UREASE  1081  0.0  Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 1081 bits (2797), Expect = 0.0
Identities = 397/570 (69%), Positives = 462/570 (81%), Gaps = 2/570 (0%)

Query: 1 MMSNISRQAYADMFGPTTGDKIRLADTELWIEVEDDLTTYGEEVKFGGGKVIRDGMGQGQ 60
M +SR AYA+MFGPT GDK+RLADTEL+IEVE D TT+GEEVKFGGGKVIRDGMGQ Q
Sbjct: 1 MSYRMSRAAYANMFGPTVGDKVRLADTELFIEVEKDFTTHGEEVKFGGGKVIRDGMGQSQ 60

Query: 61 ML-SAGCADLVLTNALIIDYWGIVKADIGVKDGRIFAIGKAGNPDIQPNVTIPIGVSTEI 119
+ G D V+TNALI+D+WGIVKADIG+KDGRI AIGKAGNPD+QP VTI +G TE+
Sbjct: 61 VTREGGAVDTVITNALILDHWGIVKADIGLKDGRIAAIGKAGNPDMQPGVTIIVGPGTEV 120

Query: 120 IAAEGRIVTAGGVDTHIHWICPQQAEEALTSGITTMIGGGTGPTAGSNATTCTPGPWYIY 179
IA EG+IVTAGG+D+HIH+ICPQQ EEAL SG+T M+GGGTGP G+ ATTCTPGPW+I
Sbjct: 121 IAGEGKIVTAGGMDSHIHFICPQQIEEALMSGLTCMLGGGTGPAHGTLATTCTPGPWHIA 180

Query: 180 QMLQAADSLPVNIGLLGKGNCSNPDALREQVAAGVIGLKIHEDWGATPAVINCALTVADE 239
+M++AAD+ P+N+ GKGN S P AL E V G LK+HEDWG TPA I+C L+VADE
Sbjct: 181 RMIEAADAFPMNLAFAGKGNASLPGALVEMVLGGATSLKLHEDWGTTPAAIDCCLSVADE 240

Query: 240 MDVQVALHSDTLNESGFVEDTLTAIGGRTIHTFHTEGAGGGHAPDIITACAHPNILPSST 299
DVQV +H+DTLNESGFVEDT+ AI GRTIH +HTEGAGGGHAPDII C PN++PSST
Sbjct: 241 YDVQVMIHTDTLNESGFVEDTIAAIKGRTIHAYHTEGAGGGHAPDIIRICGQPNVIPSST 300

Query: 300 NPTLPYTVNTIDEHLDMLMVCHHLDPDIAEDVAFAESRIRQETIAAEDVLHDLGAFSLTS 359
NPT PYTVNT+ EHLDMLMVCHHL P I ED+AFAESRIR+ETIAAED+LHD+GAFS+ S
Sbjct: 301 NPTRPYTVNTLAEHLDMLMVCHHLSPTIPEDIAFAESRIRKETIAAEDILHDIGAFSIIS 360

Query: 360 SDSQAMGRVGEVVLRTWQVAHRMKVQRGPLPEESGDNDNVRVKRYIAKYTINPALTHGIA 419
SDSQAMGRVGEV +RTWQ A +MK QRG L EE+GDNDN RVKRYIAKYTINPA+ HG++
Sbjct: 361 SDSQAMGRVGEVAIRTWQTADKMKRQRGRLKEETGDNDNFRVKRYIAKYTINPAIAHGLS 420

Query: 420 HEVGSIEVGKLADLVLWSPAFFGVKPATIVKGGMIAMAPMGDINGSIPTPQPVHYRPMFA 479
HE+GS+EVGK ADLVLW+PAFFGVKP ++ GG IA APMGD N SIPTPQPVHYRPMF
Sbjct: 421 HEIGSLEVGKRADLVLWNPAFFGVKPDMVLLGGTIAAAPMGDPNASIPTPQPVHYRPMFG 480

Query: 480 ALGSARHRCRVTFLSQAAAANGVAEQLNLHSTTAVVKGCR-TVQKADMRHNSLLPDITVD 538
A G +R VTF+SQA+ G+A +L + V+ R + KA M HNSL P I VD
Sbjct: 481 AYGRSRTNSSVTFVSQASLDAGLAGRLGVAKELVAVQNTRGGIGKASMIHNSLTPHIEVD 540

Query: 539 SQTYEVRINGELITSEPADILPMAQRYFLF 568
+TYEVR +GEL+T EPA +LPMAQRYFLF
Sbjct: 541 PETYEVRADGELLTCEPATVLPMAQRYFLF 570


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z1153  HOKGEFTOXIC  34  2e-06  Hok/Gef cell toxic protein family signature.
		>HOKGEFTOXIC#Hok/Gef cell toxic protein family signature.

Length = 52

Score = 33.6 bits (77), Expect = 2e-06
Identities = 14/48 (29%), Positives = 28/48 (58%), Gaps = 2/48 (4%)

Query: 1 MPQKTIIVGML--CLTMLLTVWVLHASPCEFRVSFMWSEIAAFLQCKP 46
+P+ +++ +L CLT+L+ ++ S CE R + E+AAF+ +
Sbjct: 3 LPRSSLVWCVLIVCLTLLIFTYLTRKSLCEIRYRDGYREVAAFMAYES 50


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z1161  PF02370  36  1e-04  M protein repeat
		>PF02370#M protein repeat

Length = 168

Score = 35.9 bits (82), Expect = 1e-04
Identities = 20/103 (19%), Positives = 47/103 (45%), Gaps = 3/103 (2%)

Query: 18 EQAEALRQKDQQLSLVEETEAFLRSALARAEEKIEEEEREIEHLRAQIEKLRRMLFGTRS 77
+ +++ R+ D Q + LR + ++KIEE E+E + + + E+ + +
Sbjct: 38 DSSDSKRENDPQYRALMGENQDLRKREGQYQDKIEELEKERKEKQERPERREKFERQHQD 97

Query: 78 EKLQREVEQAEAQLKQREQESDRYSGREDDPQVPRQLRQSRHR 120
+ Q + ++ + + +Q E E + + Q+ RQ +R
Sbjct: 98 KHYQEQQKKHQQEQQQLEAEKQKL---AKEKQISDASRQGLNR 137


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z1172  TYPE4SSCAGA  30  0.028  Type IV secretion system CagA exotoxin signature.
		>TYPE4SSCAGA#Type IV secretion system CagA exotoxin signature.

Length = 1147

Score = 29.7 bits (66), Expect = 0.028
Identities = 21/54 (38%), Positives = 29/54 (53%), Gaps = 4/54 (7%)

Query: 314 KTDGVVTIHVPDQPPIETRLTEGENRRTLCAIARLVNE--NGAIK-VERINQYF 364
K D V + PDQ PI + + +NR+ I++L E N AIK + NQYF
Sbjct: 31 KVDNAVASYDPDQKPIVDK-NDRDNRQAFEGISQLREEYSNKAIKNPTKKNQYF 83


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z1175  PF07824  28  0.014  Type III secretion chaperone
		>PF07824#Type III secretion chaperone

Length = 120

Score = 28.0 bits (62), Expect = 0.014
Identities = 10/36 (27%), Positives = 15/36 (41%)

Query: 132 VNDDNQTEVARYDLTEDASTETAMLFGELYRHNGEW 167
+D+ + +AR DLT E + E Y W
Sbjct: 73 TDDEGGSLIARLDLTGINEFEDIYVNTEYYISRVRW 108


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z1208  cdtoxina  28  0.013  Cytolethal distending toxin A signature.
		>cdtoxina#Cytolethal distending toxin A signature.

Length = 258

Score = 27.7 bits (61), Expect = 0.013
Identities = 15/61 (24%), Positives = 24/61 (39%), Gaps = 5/61 (8%)

Query: 74 VELLPVEITPDEQKEPVAAIAPSLSTSTQTSVSAGSCKVEFRHGNMTLENPSPELLTLLI 133
VE P +PDE P+ P+L T+ + ++L N +LT+
Sbjct: 40 VEGGPTVPSPDEPGLPLPGPGPALPTNGAIPIPEPGTAPA-----VSLMNMDGSVLTMWS 94

Query: 134 R 134
R
Sbjct: 95 R 95


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z1211  PRTACTNFAMLY  33  0.006  Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 33.1 bits (75), Expect = 0.006
Identities = 100/488 (20%), Positives = 153/488 (31%), Gaps = 74/488 (15%)

Query: 423 GTLAVSAGGKATSVT---ITSGGALI---ADSGATV---EGTNASGKFSIDGTSGQASGL 473
G +GG ++ I +GGA + ++ G +A GK + + L
Sbjct: 326 GARVTVSGGSLSAPHGNVIETGGARRFAPQAAPLSITLQAGAHAQGKALLYRVLPEPVKL 385

Query: 474 LLENG----GSFTVNAGGQAGNTTVGHRGTLTLAAGGSLSGRTQLSKGASMVLNGDVVST 529
L G G T++G + LA+ +G T+ S+ N V T
Sbjct: 386 TLTGGADAQGDIVATELPSIPGTSIG-PLDVALASQARWTGATRAVDSLSID-NATWVMT 443

Query: 530 GDIVNAGEIRFDNQTTPNAALSRAVAKSNSPVTFHKLTTTNLTGQGGTINMRVRLDGSNA 589
+ N G +R + S + F LT L G G M V D
Sbjct: 444 DN-SNVGALRLASDG------SVDFQQPAEAGRFKVLTVNTLAGSG-LFRMNVFAD-LGL 494

Query: 590 SDQLVINGGQATGKTWLAFTNVGNSNLGVATTGQGIRVVDAQNGATTEEGAFALSRPLQA 649
SD+LV+ A+G+ L N G+ T + +V G+ +
Sbjct: 495 SDKLVVMQD-ASGQHRLWVRNSGSEPASANT----LLLVQTPLGSAATFTLANKDGKVDI 549

Query: 650 GAFNYTLNRDSDEDWYLRSENAYRAEVPLY-----------------------TSMLTQA 686
G + Y L + + W L A A P L+ A
Sbjct: 550 GTYRYRLAANGNGQWSLVGAKAPPAPKPAPQPGPQPPQPPQPQPEAPAPQPPAGRELSAA 609

Query: 687 MDYDRILAGSRSHQTGVNGENNSVRLSIQGGHLGHDNNGGIARGATPESSGSYGFVRLEG 746
+ G T E+N++ + L D G RG R
Sbjct: 610 ANAAVNTGGVGLASTLWYAESNALSKRLGELRLNPDAGGAWGRGFAQRQQLDNRAGR--- 666

Query: 747 DLLRTEVAGMSL--------TTGVYGAAGHSSVDVKDDDGSRAGTVRDDAGSLGGYLNLV 798
+VAG L G + G + D + G D+ +GGY +
Sbjct: 667 -RFDQKVAGFELGADHAVAVAGGRWHLGGLAGYTRGDRGFTGDGGGHTDSVHVGGYATYI 725

Query: 799 HTSSGLWADIVAQGTRHSMKASSDNND-------FRARGWGWLGSLETGLPFSITDNLML 851
SG + D + +R +D +R G G SLE G F+ D L
Sbjct: 726 -ADSGFYLDATLRASRLENDFKVAGSDGYAVKGKYRTHGVGA--SLEAGRRFTHADGWFL 782

Query: 852 EPQLQYTW 859
EPQ +
Sbjct: 783 EPQAELAV 790


21  Z1320  Z1390  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
Z1320  0  17  -3.425481  acylphosphatase
Z1321  0  18  -3.975251  sulfur transfer protein TusE
Z1322  0  18  -4.262959  hypothetical protein
Z1323  1  22  -5.445599  integrase for cryptic prophage CP-933M
Z1324  2  22  -5.028547  exodeoxyribonuclease VIII
Z1325  2  32  -9.925173  hypothetical protein
Z1326  2  33  -9.547244  inhibitor of cell division encoded by cryptic
Z1328  5  34  -8.248825  hypothetical protein
Z1329  5  37  -7.634308  hypothetical protein
Z1330  4  29  -4.401397  hypothetical protein
Z1331  3  28  -4.341668  hypothetical protein
Z1332  5  26  -2.169268  hypothetical protein
Z1333  4  22  -1.303236  DicA, regulator of DicB; encoded within cryptic
Z1334  3  21  -0.366135  hypothetical protein
Z1335  2  22  -0.344164  hypothetical protein
Z1336  1  27  -3.254689  hypothetical protein
Z1337  1  26  -2.880181  hypothetical protein
Z1338  -1  24  -2.769718  replication protein
Z1339  -1  26  -3.068110  hypothetical protein
Z1340  0  25  -4.183761  hypothetical protein
Z1341  -1  24  -3.678709  hypothetical protein
Z1342  -1  24  0.243422  cell killing protein encoded within cryptic
Z1343  1  22  1.730226  hypothetical protein
Z1344  2  27  1.614115  endonuclease of cryptic prophage CP-933M
Z1345  2  28  1.132787  antitermination protein Q
Z1347  3  28  2.724031  hypothetical protein
Z1348  4  24  1.492629  hypothetical protein
Z1349  3  22  1.809626  hypothetical protein
Z1350  1  24  -0.990911  holin protein of cryptic prophage CP-933M
Z1351  1  22  -0.430690  hypothetical protein
Z1352  1  23  0.976285  endolysin of cryptic prophage CP-933M
Z1353  2  22  2.862074  antirepressor protein of cryptic prophage
Z1354  4  26  4.734009  endopeptidase of cryptic prophage CP-933M
Z1355  5  28  6.297005  hypothetical protein
Z1356  5  30  6.950132  hypothetical protein
Z1357  5  31  6.742454  hypothetical protein
Z1358  5  31  6.821175  hypothetical protein
Z1359  5  31  6.533485  hypothetical protein
Z1360  6  28  6.326657  hypothetical protein
Z1361  7  28  4.713099  hypothetical protein
Z1362  7  27  5.052057  hypothetical protein
Z1363  8  28  5.432650  hypothetical protein
Z1364  10  30  6.061434  hypothetical protein
Z1365  7  28  4.920456  hypothetical protein
Z1366  6  33  5.040173  hypothetical protein
Z1367  5  33  5.387148  hypothetical protein
Z1368  5  32  4.843944  hypothetical protein
Z1369  5  32  5.207228  hypothetical protein
Z1370  6  32  5.028747  tail component encoded by cryptic prophage
Z1371  7  32  6.384106  tail assembly chaperone encoded by cryptic
Z1372  7  34  6.013543  hypothetical protein
Z1373  7  33  5.881778  hypothetical protein
Z1374  7  36  6.282443  hypothetical protein
Z1375  8  34  5.457146  tail component encoded by cryptic prophage
Z1376  6  32  5.372803  tail component encoded by cryptic prophage
Z1377  6  31  6.106330  tail component encoded by cryptic prophage
Z1378  6  31  5.766395  tail component encoded by cryptic prophage
Z1379  6  26  5.235110  tail component encoded by cryptic prophage
Z1380  5  26  4.415429  tail component encoded by cryptic prophage
Z1381  4  32  3.564138  outer membrane protein Lom encoded by cryptic
Z1382  4  25  3.299391  tail component encoded by cryptic prophage
Z1383  1  15  2.451220  hypothetical protein
Z1385  0  16  2.460709  hypothetical protein
Z1386  -1  19  3.193262  hypothetical protein
Z1387  -1  20  3.919049  hypothetical protein
Z1389  -1  18  4.009412  *hydrogenase-1 small subunit
Z1390  -2  19  3.730070  hydrogenase 1 large subunit
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z1342  HOKGEFTOXIC  66  6e-19  Hok/Gef cell toxic protein family signature.
		>HOKGEFTOXIC#Hok/Gef cell toxic protein family signature.

Length = 52

Score = 66.4 bits (162), Expect = 6e-19
Identities = 19/46 (41%), Positives = 32/46 (69%)

Query: 23 QKAMLIALIVICLTVIVTALVTRKDLCEVRVRTGQTEVAVFTAYEP 68
+ +++ ++++CLT+++ +TRK LCE+R R G EVA F AYE
Sbjct: 5 RSSLVWCVLIVCLTLLIFTYLTRKSLCEIRYRDGYREVAAFMAYES 50


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z1381  ENTEROVIROMP  138  4e-44  Enterobacterial virulence outer membrane protein si...
		>ENTEROVIROMP#Enterobacterial virulence outer membrane protein signature.
Length = 171

Score = 138 bits (350), Expect = 4e-44
Identities = 61/200 (30%), Positives = 98/200 (49%), Gaps = 30/200 (15%)

Query: 1 MRKLYAAILSAAICLAVSGAPAWASEQQATLSAGYLHARTSAPGSDNLNGINVKYRYEFT 60
M+K+ AA+ +G + +T++ GY + + + G N+KYRYE
Sbjct: 1 MKKIACLSALAAVLAFTAGT---SVAATSTVTGGYAQSDAQGQMNK-MGGFNLKYRYEED 56

Query: 61 DT-LGLVTSFSYAGDKNRQLTRYSDTRWHEDSVRNRWFSVMAGPSVRVNEWFSAYAMAGV 119
++ LG++ SF+Y T S T D +N+++ + AGP+ R+N+W S Y + GV
Sbjct: 57 NSPLGVIGSFTY--------TEKSRTASSGDYNKNQYYGITAGPAYRINDWASIYGVVGV 108

Query: 120 AYSRVSTFSGDYLRVTDNKGKTHDVLTGSDDGRHSNTSLAWGAGVQFNPTESVAIDIAYE 179
Y + T T+ HD S+ ++GAG+QFNP E+VA+D +YE
Sbjct: 109 GYGKFQT--------TEYPTYKHD---------TSDYGFSYGAGLQFNPMENVALDFSYE 151

Query: 180 GSGSGDWRTDGFIVGVGYKF 199
S +I GVGY+F
Sbjct: 152 QSRIRSVDVGTWIAGVGYRF 171


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z1382  IGASERPTASE  40  8e-06  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 40.4 bits (94), Expect = 8e-06
Identities = 47/289 (16%), Positives = 91/289 (31%), Gaps = 30/289 (10%)

Query: 11 LKDGTGKPVENCTIQLKARRNSATVVVNTVASENPDE-AGRYSMDVEYGQYSVILLVEGF 69
+ D TG+P N A + + ++ D A +Y + G+Y + +
Sbjct: 928 VADKTGEPNHNELTLFDASKAQRDHLNVSLVGNTVDLGAWKYKLRNVNGRYDL------Y 981

Query: 70 PPSHAGTITVYEDSQPGTLNDFLGAMSEDDVRPEALRRFELMVEEAARHAEEAKKNAGEA 129
P + + T N+ + E + R + A ++
Sbjct: 982 NPEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSE----TT 1037

Query: 130 ETSARNAGISASQAEESAANADTSAGDASESARQAA-ESAAAAKQSEEASSSSASAAAQK 188
ET A N+ + E++ +A + E A++A A + +E A S S + Q
Sbjct: 1038 ETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQT 1097

Query: 189 ASESSQSAAEA------------ELSKKTAESAAGNAARDAT-TATEKARE-----SAES 230
+ E E+ K T++ + + E ARE + +
Sbjct: 1098 TETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKE 1157

Query: 231 AQSAEQSRIAAEEAVNRIPTVVGPPGPKGEPGPAGPQGPKGDKGERGDT 279
QS + E+ + V P + G + + T
Sbjct: 1158 PQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPAT 1206


22  Z1416  Z1489  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
Z1416  -2  16  -4.275135  chaperone protein TorD
Z1417  0  17  -5.043456  chaperone-modulator protein CbpM
Z1418  0  16  -4.482075  curved DNA-binding protein CbpA
Z1419  -1  21  -5.669456  hypothetical protein
Z1420  0  19  -2.946632  hypothetical protein
Z1421  -1  18  -2.191276  glucose-1-phosphatase/inositol phosphatase
Z1422  -2  27  -3.521976  hypothetical protein
Z1423  -3  27  -3.578896  TrpR binding protein WrbA
Z1424  -2  30  -4.675652  integrase for bacteriophage BP-933W
Z1425  -1  29  -5.269648  excisionase
Z1426  0  26  -4.710435  hypothetical protein
Z1428  0  26  -4.176947  hypothetical protein
Z1429  1  28  -4.163210  hypothetical protein
Z1430  1  26  -2.990786  hypothetical protein
Z1431  1  25  -2.112179  hypothetical protein
Z1432  1  24  -1.513974  hypothetical protein
Z1433  1  24  -1.557237  hypothetical protein
Z1434  4  24  -2.963406  hypothetical protein
Z1435  4  25  -4.928217  exonuclease of bacteriophage BP-933W
Z1437  2  29  -5.940508  Bet recombination protein of bacteriophage
Z1438  4  37  -9.319443  host-nuclease inhibitor protein Gam of
Z1439  4  44  -12.805561  Kil protein of bacteriophage BP-933W
Z1440  5  49  -13.781691  single-stranded DNA binding protein
Z1441  5  48  -13.078688  hypothetical protein
Z1442  4  42  -11.117692  antitermination protein N of bacteriophage
Z1443  6  44  -12.134213  hypothetical protein
Z1444  6  42  -11.516812  serine/threonine kinase encoded by bacteriophage
Z1445  4  32  -7.545744  hypothetical protein
Z1446  3  24  -4.202581  hypothetical protein
Z1447  4  24  -4.050481  repressor protein CI of bacteriophage BP-933W
Z1448  2  25  -2.673014  regulatory protein Cro of bacteriophage BP-933W
Z1449  2  26  -1.873297  regulatory protein CII of bacteriophage BP-933W
Z1450  2  29  -3.337926  replication protein O of bacteriophage BP-933W
Z1451  0  30  -3.495212  replication protein P of bacteriophage BP-933W
Z1452  2  32  -4.208615  hypothetical protein
Z1453  1  33  -3.701105  hypothetical protein
Z1454  1  34  -4.127406  DNA N-6-adenine-methyltransferase of
Z1456  0  35  -5.176675  hypothetical protein
Z1457  0  33  -4.385385  DNA-binding protein Roi of bacteriophage
Z1458  2  27  -0.615277  hypothetical protein
Z1459  3  27  -0.135583  antitermination protein Q of bacteriophage
Z1460  4  28  0.080251  hypothetical protein
Z1464  4  26  0.301039  ***shiga-like toxin II A subunit encoded by
Z1465  2  22  1.156731  shiga-like toxin II B subunit encoded by
Z1466  2  21  2.075784  hypothetical protein
Z1467  1  20  -0.507650  hypothetical protein
Z1468  1  21  -1.601307  lysis protein S of bacteriophage BP-933W
Z1469  0  18  -1.897174  lysozyme protein R of bacteriophage BP-933W
Z1471  1  20  -1.330356  antirepressor protein Ant of bacteriophage
Z1473  2  20  -0.516965  endopeptidase Rz of bacteriophage BP-933W
Z1474  2  20  -0.619431  Bor protein of bacteriophage BP-933W
Z1475  3  20  -0.177225  terminase small subunit of bacteriophage
Z1476  3  22  0.285522  terminase large subunit of bacteriophage
Z1477  4  23  0.861696  portal protein of bacteriophage BP-933W
Z1478  7  28  3.710976  hypothetical protein
Z1479  8  30  3.729229  hypothetical protein
Z1480  8  34  3.785381  hypothetical protein
Z1481  6  31  2.804925  hypothetical protein
Z1482  5  30  1.924385  hypothetical protein
Z1483  4  30  1.012931  tail fiber protein of bacteriophage BP-933W
Z1484  4  29  -3.139393  tail fiber protein of bacteriophage BP-933W
Z1485  3  28  -2.957579  hypothetical protein
Z1486  2  28  -1.984248  hypothetical protein
Z1487  2  30  -2.121898  hypothetical protein
Z1488  3  32  -2.059877  hypothetical protein
Z1489  2  29  -0.765779  outer membrane protein Lom of bacteriophage
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z1440  UREASE  27  0.014  Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 27.4 bits (61), Expect = 0.014
Identities = 18/66 (27%), Positives = 26/66 (39%), Gaps = 7/66 (10%)

Query: 57 IMLAQHALLIAISSDLNAYGVVCEFDWN----DGNGQEGWPSMDGSEGIRITD---IDTS 109
+ LA L I + D +G +F DG GQ G+ IT+ +D
Sbjct: 22 VRLADTELFIEVEKDFTTHGEEVKFGGGKVIRDGMGQSQVTREGGAVDTVITNALILDHW 81

Query: 110 GIFDSD 115
GI +D
Sbjct: 82 GIVKAD 87


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z1444  YERSSTKINASE  31  0.007  Yersinia serine/threonine protein kinase signature.
		>YERSSTKINASE#Yersinia serine/threonine protein kinase signature.

Length = 732

Score = 31.2 bits (70), Expect = 0.007
Identities = 19/54 (35%), Positives = 27/54 (50%), Gaps = 7/54 (12%)

Query: 122 HIHSKGLMHFDIKPNNIMISNRN-EAMLSDFGLSQLVNE------ESRAAPEFG 168
H+ G++H DIKP N++ + E ++ D GL E ES APE G
Sbjct: 260 HLAKAGVVHNDIKPGNVVFDRASGEPVVIDLGLHSRSGEQPKGFTESFKAPELG 313


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z1448  HTHTETR  30  5e-04  TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 30.0 bits (67), Expect = 5e-04
Identities = 9/24 (37%), Positives = 14/24 (58%)

Query: 10 GVGIPEVAKACGVSERAVYKWLKN 33
+ E+AKA GV+ A+Y K+
Sbjct: 31 STSLGEIAKAAGVTRGAIYWHFKD 54


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z1464  SHIGARICIN  144  4e-43  Ribosome inactivating protein family signature.
		>SHIGARICIN#Ribosome inactivating protein family signature.

Length = 289

Score = 144 bits (364), Expect = 4e-43
Identities = 53/277 (19%), Positives = 115/277 (41%), Gaps = 35/277 (12%)

Query: 4 ILFKWVLCLLLGFSSVSYSREFTIDFSTQQSYVSSLNSIRTEISTPLEHISQGTTSVSVI 63
+ +L L L +V F + +T SY ++++R + + + ++
Sbjct: 6 VFSLLILTLFLTAPAVEGDVSFRLSGATSSSYGVFISNLRKALPYERK-----LYDIPLL 60

Query: 64 NHTPPGSYFAVDIRGLDVYQARFDHLRLIIEQNNLYVAGFVNTATNTFYRFSDFT----- 118
T PGS I L Y + + + I+ N+YV G+ A +T Y F++ +
Sbjct: 61 RSTLPGSQRYALIH-LTNYAD--ETISVAIDVTNVYVMGY--RAGDTSYFFNEASATEAA 115

Query: 119 HISVPGVT-TVSMTTDSSYTTLQRVAALERSGMQISRHSLVSSYLALMEFSGNTMTRDAS 177
V++ +Y LQ A R + + +L S+ L ++ N+ A+
Sbjct: 116 KYVFKDAKRKVTLPYSGNYERLQIAAGKIRENIPLGLPALDSAITTLFYYNANS----AA 171

Query: 178 RAVLRFVTVTAEALRFRQIQREFRQALSETAPVYTMTPGDVDLTLNWGRISNVLPEYRGE 237
A++ + T+EA R++ I+++ + + +T + + + L +W +S +
Sbjct: 172 SALMVLIQSTSEAARYKFIEQQIGKRVDKT---FLPSLAIISLENSWSALSKQIQIASTN 228

Query: 238 DGV----------RVGRISFNNISA--ILGTVAVILN 262
+G + R++ N+ A + +A++LN
Sbjct: 229 NGQFETPVVLINAQNQRVTITNVDAGVVTSNIALLLN 265


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z1474  PF06291  163  3e-56  Lambda prophage Bor protein
		>PF06291#Lambda prophage Bor protein

Length = 102

Score = 163 bits (413), Expect = 3e-56
Identities = 89/97 (91%), Positives = 91/97 (93%)

Query: 1 MKKMLLATALALLITGCAQQTFTVQNKQTAVAPKETITHHFFVSGIGQKKTVDAAKICGG 60
MKKML + ALA+LITGCAQQTFTV NK TAV PKETITHHFFVSGIGQKKTVDAAKICGG
Sbjct: 6 MKKMLFSAALAMLITGCAQQTFTVGNKPTAVTPKETITHHFFVSGIGQKKTVDAAKICGG 65

Query: 61 TENVVKTETQQTFVNGLLGFITLGIYTPLEARVYCSQ 97
ENVVKTETQQTFVNGLLGFITLGIYTPLEARVYCSQ
Sbjct: 66 AENVVKTETQQTFVNGLLGFITLGIYTPLEARVYCSQ 102


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z1475  RTXTOXIND  31  0.008  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 30.6 bits (69), Expect = 0.008
Identities = 17/91 (18%), Positives = 29/91 (31%), Gaps = 25/91 (27%)

Query: 179 KILKAEQALDRNIARIESIERSLL----------------TLDVLAETAPKLRADRERIN 222
K ++A L +++E IE +L LD L +T + +
Sbjct: 260 KYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELA 319

Query: 223 AARDKLRAETDILTNQRRGVVTPVSDIVSSL 253
++ Q + PVS V L
Sbjct: 320 KNEERQ---------QASVIRAPVSVKVQQL 341


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z1477  CHANLCOLICIN  33  0.004  Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 33.1 bits (75), Expect = 0.004
Identities = 21/101 (20%), Positives = 46/101 (45%), Gaps = 10/101 (9%)

Query: 613 QEVAAQQQALQQQQAELQMREMAGRVAKLEADAARAHAAAQRDNASAQREVALTQGQRYV 672
+E A ++A Q+ AE + +E+ + +A+ R A+ + +R AL++ + V
Sbjct: 141 KEAEAAEKAFQE--AEQRRKEIE----REKAETERQLKLAEAEE---KRLAALSEEAKAV 191

Query: 673 DALNQAHTAEIITGVQNMEQEQDVLQQQMLYTLQQRMNEMS 713
+ + + + V M+ E L ++ ++ R EM
Sbjct: 192 EIAQKK-LSAAQSEVVKMDGEIKTLNSRLSSSIHARDAEMK 231


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z1483  CHANLCOLICIN  35  0.001  Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 34.7 bits (79), Expect = 0.001
Identities = 30/111 (27%), Positives = 48/111 (43%), Gaps = 4/111 (3%)

Query: 130 KNTQATQSKESAAASAKSASDSAK--TATSRAAEAGQKATDATEAATRAVTAAGNAEESS 187
K TQA Q+ + AA+ A A T R + +A + T + T +A ++
Sbjct: 63 KKTQAEQAARAKAAAEAQAKAKANRDALTQRLKDIVNEALRHNASRTPSATELAHANNAA 122

Query: 188 TRAGESEKAAGADAEKARQHAEKARLAQESAGEILKRAEAATVSAEEARRM 238
+A + EKAR+ AE A A + A + +R E AE R++
Sbjct: 123 MQAEDERLRLAKAEEKARKEAEAAEKAFQEAEQ--RRKEIEREKAETERQL 171


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z1489  ENTEROVIROMP  113  2e-33  Enterobacterial virulence outer membrane protein si...
		>ENTEROVIROMP#Enterobacterial virulence outer membrane protein signature.

Length = 171

Score = 113 bits (283), Expect = 2e-33
Identities = 46/167 (27%), Positives = 73/167 (43%), Gaps = 31/167 (18%)

Query: 79 SGYEGKDKNPQGINIRYRYEITDD-FGVITSFTWTRSLTNSQTFIDVQSADHTRKIKNPA 137
S +G+ G N++YRYE + GVI SFT+T K+
Sbjct: 35 SDAQGQMNKMGGFNLKYRYEEDNSPLGVIGSFTYTE--------------------KSRT 74

Query: 138 ASARTDIRANYWSLLAGPSWRVNQYMSLYAMAGMGVAKVSADLKIKDNINSSGGFSESNS 197
AS+ + Y+ + AGP++R+N + S+Y + G+G K + + +
Sbjct: 75 ASSGDYNKNQYYGITAGPAYRINDWASIYGVVGVGYGKFQT----------TEYPTYKHD 124

Query: 198 TKKTSLAWAAGAQFNLNESVTLDVAYEGSGSGDWRTSGVTAGIGLKF 244
T ++ AG QFN E+V LD +YE S AG+G +F
Sbjct: 125 TSDYGFSYGAGLQFNPMENVALDFSYEQSRIRSVDVGTWIAGVGYRF 171


23  Z1503  Z1510  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias     %GCBias  Product
Z1503         -3       15    3.065158  hypothetical protein
Z1505         -1       17    4.294768  transporter
Z1506         -1       21    4.834521  4-hydroxyphenylacetate 3-monooxygenase
Z1507         -1       16    3.616544  hypothetical protein
Z1508         -1       13    4.288413  acetyltransferase
Z1509         -1       15    3.787778  hypothetical protein
Z1510         -2       12    3.048098  synthetase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z1510  ISCHRISMTASE  73  3e-17  Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 72.7 bits (178), Expect = 3e-17
Identities = 43/176 (24%), Positives = 70/176 (39%), Gaps = 23/176 (13%)

Query: 26 TFDPQQSALIVVDMQNAYATPGGYLDLAGFDVSTTRPVIANIQTAVTAARAAGMLIIWFQ 85
DP ++ L++ DMQN + +D S + ANI+ G+ +++
Sbjct: 25 VPDPNRAVLLIHDMQNYF------VDAFTAGASPVTELSANIRKLKNQCVQLGIPVVY-- 76

Query: 86 NGWDEQYVEAGGPGSPNFHKSNALKTMRKQPQLQGKLLAKGSWDYQLVDELVPQPGDIVL 145
PGS N L G L G ++ +++ EL P+ D+VL
Sbjct: 77 ---------TAQPGSQNPDDRALLTDF------WGPGLNSGPYEEKIITELAPEDDDLVL 121

Query: 146 PKPRYSGFFNTPLDSILRSRGIRHLVFTSIATNVCVESTLRDGFFLEYFGVVLEDA 201
K RYS F T L ++R G L+ T I ++ T + F + + DA
Sbjct: 122 TKWRYSAFKRTNLLEMMRKEGRDQLIITGIYAHIGCLVTACEAFMEDIKAFFVGDA 177


24  Z1521  Z1543  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias     %GCBias  Product
Z1521         -2       22   -3.633507  hypothetical protein
Z1522         -2       26   -5.964651  hypothetical protein
Z1523         -1       31   -8.265630  PGA biosynthesis protein
Z1524         -1       31   -8.815091  N-glycosyltransferase
Z1525         -1       34  -10.272305  outer membrane N-deacetylase
Z1526         -1       35  -11.482608  outer membrane protein PgaA
Z1527          1       40  -14.550970  hypothetical protein
Z1528          3       40  -13.609436  hypothetical protein
Z1530          2       33   -9.574921  hypothetical protein
Z1531          1       25   -5.138679  hypothetical protein
Z1532          1       22   -3.531862  hypothetical protein
Z1533          1       21   -2.886246  oxidoreductase
Z1534          2       22   -3.712351  chaperone
Z1535          2       22   -3.854002  hypothetical protein
Z1536          2       21   -4.564710  usher protein
Z1537          2       15   -4.891174  chaperone
Z1538          3       13   -5.782654  pilin subunit
Z1539          3       14   -6.432012  hypothetical protein
Z1540          2       12   -4.524312  hypothetical protein
Z1541          2       12   -4.188654  hypothetical protein
Z1542          2       12   -4.264985  ShlA/HecA/FhaA exofamily protein
Z1543          0       15   -5.476669  ShlA/HecA/FhaA exofamily protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z1527  BINARYTOXINA  30  0.025  Clostridial binary toxin A signature.
		>BINARYTOXINA#Clostridial binary toxin A signature.

Length = 454

Score = 29.6 bits (66), Expect = 0.025
Identities = 22/77 (28%), Positives = 36/77 (46%), Gaps = 6/77 (7%)

Query: 335 DQVIKTVVNIIGKSIRPDDLLA--RVGGEEFGVLLTDIDTERAKALAERIRENVERLTGD 392
D + + N + + P +L+ R G +EFG+ LT + + K E I E+ G
Sbjct: 313 DSKVNNIENALKLTPIPSNLIVYRRSGPQEFGLTLTSPEYDFNK--IENIDAFKEKWEGK 370

Query: 393 NPEYAIPQKVTISIGAV 409
Y P ++ SIG+V
Sbjct: 371 VITY--PNFISTSIGSV 385


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z1530  TRNSINTIMINR  30  0.004  Translocated intimin receptor (Tir) signature.
		>TRNSINTIMINR#Translocated intimin receptor (Tir) signature.

Length = 549

Score = 30.1 bits (67), Expect = 0.004
Identities = 13/40 (32%), Positives = 21/40 (52%), Gaps = 1/40 (2%)

Query: 5 YFLFAGIILCAFIAAILSHIAFHHANEPAEQNISCNAHVI 44
Y L + +I+ I A ++ A H N+PAEQ + H +
Sbjct: 366 YGLSSALIVAGGIGAGVT-TALHRRNQPAEQTTTTTTHTV 404


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z1533  DHBDHDRGNASE  104  6e-29  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 104 bits (260), Expect = 6e-29
Identities = 70/255 (27%), Positives = 118/255 (46%), Gaps = 11/255 (4%)

Query: 18 LHNKVAIVTGAAGELGRGLCSALAKAGANLLLVDIK-EPDNRYLKHLTHEGVEVEFMTID 76
+ K+A +TGAA +G + LA GA++ VD E + + L E E D
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPAD 65

Query: 77 ITKPDASCTIINRCLERFGQLDILVNNAGVCNINRPIDFNRNDWDPMINLNLNAAFDMSQ 136
+ A I R G +DILVN AGV + +W+ ++N F+ S+
Sbjct: 66 VRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASR 125

Query: 137 AALNIFVPQRKGKIINMCSVLSFHGGRWSPG-YAATKHALAGLTKAYADDFAEYNIQING 195
+ + +R G I+ + S + R S YA++K A TK + AEYNI+ N
Sbjct: 126 SVSKYMMDRRSGSIVTVGSNPA-GVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNI 184

Query: 196 IAPGYYVSEMTAIIYNNPKIKE-LIKGR-------IPAQRWGRAQDLMGAMVFLASAASD 247
++PG ++M ++ + E +IKG IP ++ + D+ A++FL S +
Sbjct: 185 VSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAG 244

Query: 248 YVNGQLLVIDGGYSI 262
++ L +DGG ++
Sbjct: 245 HITMHNLCVDGGATL 259


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z1536  PF00577  677  0.0  Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 677 bits (1748), Expect = 0.0
Identities = 237/805 (29%), Positives = 386/805 (47%), Gaps = 44/805 (5%)

Query: 28 ATPSDEDNYTFDPQLFRGSRFSQSSLAKLTTRESVAPGNYKMDIYTNNKLSGSWNVTFKE 87
P F+P+ + + L++ + + PG Y++DIY NN + +VTF
Sbjct: 39 QAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPGTYRVDIYLNNGYMATRDVTFNT 98

Query: 88 AADG-RVLPCLTPEVADAIGLKTGEDKGEK---DPVCTFAKELAPGITSQTQLSQLRLDL 143
++PCLT ++GL T G D C + T+Q + Q RL+L
Sbjct: 99 GDSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDACVPLTSMIHDATAQLDVGQQRLNL 158

Query: 144 SVPQSQLISRPRGYVPPSELDTGASLAFMNYIANYYNVAYSGQNAHSQRSLWASFNGGIN 203
++PQ+ + +R RGY+PP D G + +NY + + + + + + G+N
Sbjct: 159 TIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNS--VQNRIGGNSHYAYLNLQSGLN 216

Query: 204 LGAWQYRQLSNMTW-----DNDKGNQWNNIRSYLQRPLPAINSQLMMGQLITSGRFFSGL 258
+GAW+ R + ++ + N+W +I ++L+R + + S+L +G T G F G+
Sbjct: 217 IGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRSRLTLGDGYTQGDIFDGI 276

Query: 259 SYHGVSLATDERMLPDSMRGYAPTIRGVAATNARVSVMQNGHEIYQTTVAPGPFEINDLY 318
++ G LA+D+ MLPDS RG+AP I G+A A+V++ QNG++IY +TV PGPF IND+Y
Sbjct: 277 NFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIYNSTVPPGPFTINDIY 336

Query: 319 PTSYSGDLDVTVTEANGAVSRFSVPFSAVPESMRPGTSRYNVEVGKTQDSG---DDSMFG 375
SGDL VT+ EA+G+ F+VP+S+VP R G +RY++ G+ + + F
Sbjct: 337 AAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITAGEYRSGNAQQEKPRFF 396

Query: 376 DLTWQHGMTNTLTFNSGSRIADGYQALMLGGVYGS-SLGAFGANLTWSHARVPESEAQSG 434
T HG+ T G+++AD Y+A G +LGA ++T +++ +P+ G
Sbjct: 397 QSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALSVDMTQANSTLPDDSQHDG 456

Query: 435 WMSQLTWSKTFQPTSTTVSLAGYRYSTSGYRDLADVLGERHAASNKQSWD---------- 484
+ ++K+ + T + L GYRYSTSGY + AD R N ++ D
Sbjct: 457 QSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGYNIETQDGVIQVKPKFT 516

Query: 485 ---SSQWRQQSRFDLTLSQSLANYGNLFVSGSTQNYRGGKSRDTQLQLGYSNSFSHGISM 541
+ + ++ + LT++Q L L++SGS Q Y G + D Q Q G + +F I+
Sbjct: 517 DYYNLAYNKRGKLQLTVTQQLGRTSTLYLSGSHQTYWGTSNVDEQFQAGLNTAF-EDINW 575

Query: 542 NLSVGRQRMGGYKDNSDDMQTVTSLSFSFPLGG-------NGPRVPSLSNSWTHSTDGSS 594
LS + D +L+ + P + R S S S +H +G
Sbjct: 576 TLSYSLTKNAW--QKGRDQM--LALNVNIPFSHWLRSDSKSQWRHASASYSMSHDLNGRM 631

Query: 595 QLQSSLTGMLDEAQTTNYSLNV---MRDQQYKQTTLSGNMQKRFSQTTVGLNASKGQDYW 651
+ + G L E +YS+ +T + R + S D
Sbjct: 632 TNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGYGNANIGYSHSDDIK 691

Query: 652 QASGNVQGAMAVHGGGITFGPYLGETFALVEAKGAEGAKVYNSSQLEINDSGYALVPAVT 711
Q V G + H G+T G L +T LV+A GA+ AKV N + + + GYA++P T
Sbjct: 692 QLYYGVSGGVLAHANGVTLGQPLNDTVVLVKAPGAKDAKVENQTGVRTDWRGYAVLPYAT 751

Query: 712 PYRYNRISLDPQGMDGDAELVDSERQVAPVAGAAVKVIFRTRPGKALLIKSRMADGSELP 771
YR NR++LD + + +L ++ V P GA V+ F+ R G LL+ + LP
Sbjct: 752 EYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAEFKARVGIKLLMTLT-HNNKPLP 810

Query: 772 MGADVLDENNTVVGIAGQGGQIYLR 796
GA V E++ GI GQ+YL
Sbjct: 811 FGAMVTSESSQSSGIVADNGQVYLS 835


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z1537  SECA  29  0.022  SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 28.7 bits (64), Expect = 0.022
Identities = 19/66 (28%), Positives = 27/66 (40%), Gaps = 14/66 (21%)

Query: 170 VTNPTGYYVTIRAAELLNNGKKVPLANSVMIAPQSTTEW-----TLPSGISVAPGAQIHL 224
V + V + +LN IA T E TLP+ ++ G +H+
Sbjct: 78 VFGMRHFDVQLLGGMVLNERC---------IAEMRTGEGKTLTATLPAYLNALTGKGVHV 128

Query: 225 VTVNDY 230
VTVNDY
Sbjct: 129 VTVNDY 134


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z1542  PF05860  65  2e-14  haemagglutination activity domain.
		>PF05860#haemagglutination activity domain.

Length = 117

Score = 64.8 bits (158), Expect = 2e-14
Identities = 23/126 (18%), Positives = 48/126 (38%), Gaps = 21/126 (16%)

Query: 39 KNGTVYNANGVPVVDINKPNGSGLSHNIWDNLNVDKNGVVFNNSANESSTSLAGNIQGNS 98
N + +++ GS L H+ + +V +G F N+
Sbjct: 11 INSNITTEGNTRIIERGTQAGSNLFHS-FQEFSVPTSGTAFFNNP--------------- 54

Query: 99 NLTSGSAKVILNEVTSKNPSTINGMMEVAGDKADLIIANPNGITVNGGGSINTGKLTLTT 158
+ + I++ VT + S I+G++ A+L + NPNGI ++ G + +
Sbjct: 55 ----TNIQNIISRVTGGSVSNIDGLIRANA-TANLFLINPNGIIFGQNARLDIGGSFVGS 109

Query: 159 GTPDIQ 164
++
Sbjct: 110 TANRLK 115


25  Z1553  Z1677  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias     %GCBias  Product
Z1553          0       19   -3.467304  hypothetical protein
Z1554         -2       21   -3.887828  hypothetical protein
Z1555         -2       26   -8.102580  hypothetical protein
Z1556         -1       29   -8.632854  hypothetical protein
Z1557          1       36  -11.000641  hypothetical protein
Z1558          1       36  -10.878162  hypothetical protein
Z1559          3       39  -10.816884  P4 family integrase
Z1560          4       39  -12.549741  hypothetical protein
Z1561          5       33   -7.641213  hypothetical protein
Z1562          4       30   -5.716048  hypothetical protein
Z1563          5       32   -5.776372  prophage regulatory protein
Z1564          3       23   -2.676602  hypothetical protein
Z1565          2       22   -1.958186  hypothetical protein
Z1567          2       20   -0.186910  hypothetical protein
Z1568          3       23    0.538603  helicase
Z1569          3       26    2.447262  hypothetical protein
Z1570          4       26    2.033276  hypothetical protein
Z1571          4       32   -2.726236  hypothetical protein
Z1572          5       37   -5.205308  transposase
Z1573          6       41  -10.951995  hypothetical protein
Z1574          8       47  -13.144833  complement resistance protein
Z1575          6       41  -11.706043  hypothetical protein
Z1576          5       37  -10.653475  hypothetical protein
Z1578          4       33   -9.321625  diacylglycerol kinase
Z1579          3       23   -4.984786  hypothetical protein
Z1580          1       21    1.904814  hypothetical protein
Z1581          0       20    3.346768  urease accessory protein D
Z1582          1       21    3.357131  urease subunit gamma
Z1583          1       20    3.198917  urease subunit beta
Z1584          0       20    2.653443  urease subunit alpha
Z1585          1       20    2.039280  urease accessory protein UreE
Z1586          1       21    0.447087  urease accessory protein F
Z1587          2       22   -2.788870  urease accessory protein G
Z1588          1       27   -7.741974  hypothetical protein
Z1589          4       38  -13.693169  hypothetical protein
Z1590          6       37  -14.907319  hypothetical protein
Z1591          5       37  -13.818896  50S ribosomal protein L31
Z1592          3       32   -9.653385  hypothetical protein
Z1593          3       33   -8.768367  hypothetical protein
Z1594          3       29   -6.462815  hypothetical protein
Z1595          3       22    1.635475  hypothetical protein
Z1596          4       24    2.173659  hypothetical protein
Z1597          4       26    2.236172  hypothetical protein
Z1598          4       24    1.468329  hypothetical protein
Z1599          2       22    1.947019  hypothetical protein
Z1600          2       22    2.095622  hypothetical protein
Z1601          1       25    1.791720  hypothetical protein
Z1602          1       25    1.702878  hypothetical protein
Z1603          1       25    1.907296  hypothetical protein
Z1604          1       23    1.666223  hypothetical protein
Z1605          1       23    1.364423  hypothetical protein
Z1606          2       22    1.525438  hypothetical protein
Z1607          3       20    1.190815  hypothetical protein
Z1608          2       20    0.883595  hypothetical protein
Z1609          1       21    0.502643  hypothetical protein
Z1610          1       26    0.756741  phage inhibition, colicin resistance and
Z1611          2       24    0.602667  phage inhibition, colicin resistance and
Z1612          5       22   -0.171150  phage inhibition, colicin resistance and
Z1613          6       23   -0.739781  phage inhibition, colicin resistance and
Z1614          5       23   -2.211291  phage inhibition, colicin resistance and
Z1615          5       25   -3.563242  phage inhibition, colicin resistance and
Z1616          7       29   -4.579344  phage inhibition, colicin resistance and
Z1617          7       30   -4.748620  bifunctional enterobactin receptor/adhesin
Z1619          4       44  -13.761508  hypothetical protein
Z1620          6       38  -11.958773  hypothetical protein
Z1621          4       34   -8.652680  hypothetical protein
Z1622          4       30   -6.356795  hypothetical protein
Z1623          4       23   -4.364218  hypothetical protein
Z1624          4       28   -6.353650  hypothetical protein
Z1625          4       30   -6.380635  hypothetical protein
Z1626          3       26   -4.887915  hypothetical protein
Z1627          3       28   -6.825987  hypothetical protein
Z1628          3       29   -5.840119  hypothetical protein
Z1629m         2       26   -4.646402  glycosyl transferase
Z1631          3       23   -2.891019  hypothetical protein
Z1632          3       22   -2.148245  IS1 protein InsB
Z1633          4       22   -1.583003  hypothetical protein
Z1634          3       21    0.826967  hypothetical protein
Z1635          3       25    1.038875  hypothetical protein
Z1636          3       26   -0.810715  hypothetical protein
Z1637          3       27   -4.474247  hypothetical protein
Z1638          2       26   -3.669148  transposase
Z1639          4       29   -3.989727  hypothetical protein
Z1640          4       30   -4.827473  hypothetical protein
Z1641          5       30   -6.126465  hypothetical protein
Z1642          5       28   -5.629425  hypothetical protein
Z1643          5       24   -2.273940  hypothetical protein
Z1644          7       26   -4.499968  hypothetical protein
Z1645          5       23   -2.208257  hypothetical protein
Z1646          8       23    3.385416  hypothetical protein
Z1647          6       21    2.727107  hypothetical protein
Z1648          6       22    2.826482  hypothetical protein
Z1649          6       22    3.035171  hypothetical protein
Z1650          6       22    3.433772  hypothetical protein
Z1651          6       22    3.164245  adhesin
Z1652          5       23    2.431389  hypothetical protein
Z1653          7       25    3.800176  hypothetical protein
Z1654          7       25    3.653275  hypothetical protein
Z1655          6       25    3.484849  hypothetical protein
Z1656          6       24    3.248127  hypothetical protein
Z1657          6       24    2.892365  RadC family DNA repair protein
Z1658          6       22    0.680673  structural protein
Z1660          2       16   -0.131672  transposase for IS629
Z1661          0       16   -1.207382  hypothetical protein
Z1662          0       15   -0.886809  hypothetical protein
Z1663          0       15   -0.740364  hypothetical protein
Z1664         -1       14   -0.851189  hypothetical protein
Z1666         -1       16   -1.275791  *dehydrogenase
Z1667         -2       15   -2.332130  hydrolase
Z1668         -1       17   -3.608370  oxidoreductase component
Z1669          0       21   -5.420715  hypothetical protein
Z1670          0       25   -6.127853  curli production assembly/transport component,
Z1671          1       34   -7.964467  curli assembly protein CsgF
Z1672          0       33   -7.629614  curli assembly protein CsgE
Z1673          2       31   -6.069306  DNA-binding transcriptional regulator CsgD
Z1675         -1       25   -3.342995  curlin minor subunit
Z1676          0       21   -3.718021  cryptic curlin major subunit
Z1677         -1       17   -3.374108  autoagglutination protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z1563  HTHFIS  26  0.043  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 25.6 bits (56), Expect = 0.043
Identities = 6/15 (40%), Positives = 13/15 (86%)

Query: 21 SQLLGISRSTIYEKM 35
+ LLG++R+T+ +K+
Sbjct: 456 ADLLGLNRNTLRKKI 470


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z1584  UREASE  1081  0.0  Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 1081 bits (2797), Expect = 0.0
Identities = 397/570 (69%), Positives = 462/570 (81%), Gaps = 2/570 (0%)

Query: 1 MMSNISRQAYADMFGPTTGDKIRLADTELWIEVEDDLTTYGEEVKFGGGKVIRDGMGQGQ 60
M +SR AYA+MFGPT GDK+RLADTEL+IEVE D TT+GEEVKFGGGKVIRDGMGQ Q
Sbjct: 1 MSYRMSRAAYANMFGPTVGDKVRLADTELFIEVEKDFTTHGEEVKFGGGKVIRDGMGQSQ 60

Query: 61 ML-SAGCADLVLTNALIIDYWGIVKADIGVKDGRIFAIGKAGNPDIQPNVTIPIGVSTEI 119
+ G D V+TNALI+D+WGIVKADIG+KDGRI AIGKAGNPD+QP VTI +G TE+
Sbjct: 61 VTREGGAVDTVITNALILDHWGIVKADIGLKDGRIAAIGKAGNPDMQPGVTIIVGPGTEV 120

Query: 120 IAAEGRIVTAGGVDTHIHWICPQQAEEALTSGITTMIGGGTGPTAGSNATTCTPGPWYIY 179
IA EG+IVTAGG+D+HIH+ICPQQ EEAL SG+T M+GGGTGP G+ ATTCTPGPW+I
Sbjct: 121 IAGEGKIVTAGGMDSHIHFICPQQIEEALMSGLTCMLGGGTGPAHGTLATTCTPGPWHIA 180

Query: 180 QMLQAADSLPVNIGLLGKGNCSNPDALREQVAAGVIGLKIHEDWGATPAVINCALTVADE 239
+M++AAD+ P+N+ GKGN S P AL E V G LK+HEDWG TPA I+C L+VADE
Sbjct: 181 RMIEAADAFPMNLAFAGKGNASLPGALVEMVLGGATSLKLHEDWGTTPAAIDCCLSVADE 240

Query: 240 MDVQVALHSDTLNESGFVEDTLTAIGGRTIHTFHTEGAGGGHAPDIITACAHPNILPSST 299
DVQV +H+DTLNESGFVEDT+ AI GRTIH +HTEGAGGGHAPDII C PN++PSST
Sbjct: 241 YDVQVMIHTDTLNESGFVEDTIAAIKGRTIHAYHTEGAGGGHAPDIIRICGQPNVIPSST 300

Query: 300 NPTLPYTVNTIDEHLDMLMVCHHLDPDIAEDVAFAESRIRQETIAAEDVLHDLGAFSLTS 359
NPT PYTVNT+ EHLDMLMVCHHL P I ED+AFAESRIR+ETIAAED+LHD+GAFS+ S
Sbjct: 301 NPTRPYTVNTLAEHLDMLMVCHHLSPTIPEDIAFAESRIRKETIAAEDILHDIGAFSIIS 360

Query: 360 SDSQAMGRVGEVVLRTWQVAHRMKVQRGPLPEESGDNDNVRVKRYIAKYTINPALTHGIA 419
SDSQAMGRVGEV +RTWQ A +MK QRG L EE+GDNDN RVKRYIAKYTINPA+ HG++
Sbjct: 361 SDSQAMGRVGEVAIRTWQTADKMKRQRGRLKEETGDNDNFRVKRYIAKYTINPAIAHGLS 420

Query: 420 HEVGSIEVGKLADLVLWSPAFFGVKPATIVKGGMIAMAPMGDINGSIPTPQPVHYRPMFA 479
HE+GS+EVGK ADLVLW+PAFFGVKP ++ GG IA APMGD N SIPTPQPVHYRPMF
Sbjct: 421 HEIGSLEVGKRADLVLWNPAFFGVKPDMVLLGGTIAAAPMGDPNASIPTPQPVHYRPMFG 480

Query: 480 ALGSARHRCRVTFLSQAAAANGVAEQLNLHSTTAVVKGCR-TVQKADMRHNSLLPDITVD 538
A G +R VTF+SQA+ G+A +L + V+ R + KA M HNSL P I VD
Sbjct: 481 AYGRSRTNSSVTFVSQASLDAGLAGRLGVAKELVAVQNTRGGIGKASMIHNSLTPHIEVD 540

Query: 539 SQTYEVRINGELITSEPADILPMAQRYFLF 568
+TYEVR +GEL+T EPA +LPMAQRYFLF
Sbjct: 541 PETYEVRADGELLTCEPATVLPMAQRYFLF 570


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z1592  HOKGEFTOXIC  34  2e-06  Hok/Gef cell toxic protein family signature.
		>HOKGEFTOXIC#Hok/Gef cell toxic protein family signature.

Length = 52

Score = 33.6 bits (77), Expect = 2e-06
Identities = 14/48 (29%), Positives = 28/48 (58%), Gaps = 2/48 (4%)

Query: 1 MPQKTIIVGML--CLTMLLTVWVLHASPCEFRVSFMWSEIAAFLQCKP 46
+P+ +++ +L CLT+L+ ++ S CE R + E+AAF+ +
Sbjct: 3 LPRSSLVWCVLIVCLTLLIFTYLTRKSLCEIRYRDGYREVAAFMAYES 50


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z1600  PF02370  36  1e-04  M protein repeat
		>PF02370#M protein repeat

Length = 168

Score = 35.9 bits (82), Expect = 1e-04
Identities = 20/103 (19%), Positives = 47/103 (45%), Gaps = 3/103 (2%)

Query: 18 EQAEALRQKDQQLSLVEETEAFLRSALARAEEKIEEEEREIEHLRAQIEKLRRMLFGTRS 77
+ +++ R+ D Q + LR + ++KIEE E+E + + + E+ + +
Sbjct: 38 DSSDSKRENDPQYRALMGENQDLRKREGQYQDKIEELEKERKEKQERPERREKFERQHQD 97

Query: 78 EKLQREVEQAEAQLKQREQESDRYSGREDDPQVPRQLRQSRHR 120
+ Q + ++ + + +Q E E + + Q+ RQ +R
Sbjct: 98 KHYQEQQKKHQQEQQQLEAEKQKL---AKEKQISDASRQGLNR 137


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z1611  TYPE4SSCAGA  30  0.028  Type IV secretion system CagA exotoxin signature.
		>TYPE4SSCAGA#Type IV secretion system CagA exotoxin signature.

Length = 1147

Score = 29.7 bits (66), Expect = 0.028
Identities = 21/54 (38%), Positives = 29/54 (53%), Gaps = 4/54 (7%)

Query: 314 KTDGVVTIHVPDQPPIETRLTEGENRRTLCAIARLVNE--NGAIK-VERINQYF 364
K D V + PDQ PI + + +NR+ I++L E N AIK + NQYF
Sbjct: 31 KVDNAVASYDPDQKPIVDK-NDRDNRQAFEGISQLREEYSNKAIKNPTKKNQYF 83


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z1614  PF07824  28  0.014  Type III secretion chaperone
		>PF07824#Type III secretion chaperone

Length = 120

Score = 28.0 bits (62), Expect = 0.014
Identities = 10/36 (27%), Positives = 15/36 (41%)

Query: 132 VNDDNQTEVARYDLTEDASTETAMLFGELYRHNGEW 167
+D+ + +AR DLT E + E Y W
Sbjct: 73 TDDEGGSLIARLDLTGINEFEDIYVNTEYYISRVRW 108


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z1648  cdtoxina  28  0.010  Cytolethal distending toxin A signature.
		>cdtoxina#Cytolethal distending toxin A signature.

Length = 258

Score = 28.1 bits (62), Expect = 0.010
Identities = 15/61 (24%), Positives = 24/61 (39%), Gaps = 5/61 (8%)

Query: 74 VELLPVEITPDEQKEPVAAIAPSLSTSTQTSVSAGSCKVEFRHGNMTLENPSPELLTLLI 133
VE P +PDE P+ P+L T+ + ++L N +LT+
Sbjct: 40 VEGGPTVPSPDEPGLPLPGPGPALPTNGAIPIPEPGTAPA-----VSLMNMDGSVLTMWS 94

Query: 134 R 134
R
Sbjct: 95 R 95


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z1651  PRTACTNFAMLY  33  0.006  Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 33.1 bits (75), Expect = 0.006
Identities = 100/488 (20%), Positives = 153/488 (31%), Gaps = 74/488 (15%)

Query: 423 GTLAVSAGGKATSVT---ITSGGALI---ADSGATV---EGTNASGKFSIDGTSGQASGL 473
G +GG ++ I +GGA + ++ G +A GK + + L
Sbjct: 326 GARVTVSGGSLSAPHGNVIETGGARRFAPQAAPLSITLQAGAHAQGKALLYRVLPEPVKL 385

Query: 474 LLENG----GSFTVNAGGQAGNTTVGHRGTLTLAAGGSLSGRTQLSKGASMVLNGDVVST 529
L G G T++G + LA+ +G T+ S+ N V T
Sbjct: 386 TLTGGADAQGDIVATELPSIPGTSIG-PLDVALASQARWTGATRAVDSLSID-NATWVMT 443

Query: 530 GDIVNAGEIRFDNQTTPNAALSRAVAKSNSPVTFHKLTTTNLTGQGGTINMRVRLDGSNA 589
+ N G +R + S + F LT L G G M V D
Sbjct: 444 DN-SNVGALRLASDG------SVDFQQPAEAGRFKVLTVNTLAGSG-LFRMNVFAD-LGL 494

Query: 590 SDQLVINGGQATGKTWLAFTNVGNSNLGVATTGQGIRVVDAQNGATTEEGAFALSRPLQA 649
SD+LV+ A+G+ L N G+ T + +V G+ +
Sbjct: 495 SDKLVVMQD-ASGQHRLWVRNSGSEPASANT----LLLVQTPLGSAATFTLANKDGKVDI 549

Query: 650 GAFNYTLNRDSDEDWYLRSENAYRAEVPLY-----------------------TSMLTQA 686
G + Y L + + W L A A P L+ A
Sbjct: 550 GTYRYRLAANGNGQWSLVGAKAPPAPKPAPQPGPQPPQPPQPQPEAPAPQPPAGRELSAA 609

Query: 687 MDYDRILAGSRSHQTGVNGENNSVRLSIQGGHLGHDNNGGIARGATPESSGSYGFVRLEG 746
+ G T E+N++ + L D G RG R
Sbjct: 610 ANAAVNTGGVGLASTLWYAESNALSKRLGELRLNPDAGGAWGRGFAQRQQLDNRAGR--- 666

Query: 747 DLLRTEVAGMSL--------TTGVYGAAGHSSVDVKDDDGSRAGTVRDDAGSLGGYLNLV 798
+VAG L G + G + D + G D+ +GGY +
Sbjct: 667 -RFDQKVAGFELGADHAVAVAGGRWHLGGLAGYTRGDRGFTGDGGGHTDSVHVGGYATYI 725

Query: 799 HTSSGLWADIVAQGTRHSMKASSDNND-------FRARGWGWLGSLETGLPFSITDNLML 851
SG + D + +R +D +R G G SLE G F+ D L
Sbjct: 726 -ADSGFYLDATLRASRLENDFKVAGSDGYAVKGKYRTHGVGA--SLEAGRRFTHADGWFL 782

Query: 852 EPQLQYTW 859
EPQ +
Sbjct: 783 EPQAELAV 790


26  Z1704  Z1722  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias     %GCBias  Product
Z1704          2       16    1.125996  hypothetical protein
Z1705          2       16    1.256880  virulence factor
Z1707          1       13    0.795483  virulence factor
Z1708          1       19    0.839092  flagellar synthesis protein FlgN
Z1709          1       16    1.139946  anti-sigma-28 factor FlgM
Z1710          0       16    2.439841  flagellar basal body P-ring biosynthesis protein
Z1711          2       17    2.689835  flagellar basal-body rod protein FlgB
Z1712          3       16    2.606053  flagellar basal body rod protein FlgC
Z1713          2       14    2.721811  flagellar basal body rod modification protein
Z1714          0       14    2.787836  flagellar hook protein FlgE
Z1715         -1       13    2.589100  flagellar basal body rod protein FlgF
Z1716         -1        9    1.443819  flagellar basal body rod protein FlgG
Z1717          0       13    2.439050  flagellar basal body L-ring protein
Z1718          0       13    2.167724  flagellar basal body P-ring biosynthesis protein
Z1719          1       14    1.883117  flagellar rod assembly protein/muramidase FlgJ
Z1720          2       14    1.436184  flagellar hook-associated protein FlgK
Z1721          3       16    1.465268  flagellar hook-associated protein FlgL
Z1722          4       19    1.829082  ribonuclease E
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z1713  SYCECHAPRONE  27  0.033  Gram-negative bacterial type III secretion SycE cha...
		>SYCECHAPRONE#Gram-negative bacterial type III secretion SycE chaperone signature.

Length = 130

Score = 26.6 bits (58), Expect = 0.033
Identities = 14/34 (41%), Positives = 21/34 (61%), Gaps = 2/34 (5%)

Query: 43 LKNQDPTNPMENNELTSQLAQISTVSGIEKLNTT 76
L N+ P N ++NN L +QL + V G E+L T+
Sbjct: 89 LWNRQPLNSLDNNSLYTQLEML--VQGAERLQTS 120


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z1714  FLGHOOKAP1  41  4e-06  Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 41.5 bits (97), Expect = 4e-06
Identities = 17/49 (34%), Positives = 29/49 (59%)

Query: 353 TLTNGALEASNVDLSKELVNMIVAQRNYQSNAQTIKTQDQILNTLVNLR 401
L+N S V+L +E N+ Q+ Y +NAQ ++T + I + L+N+R
Sbjct: 498 QLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINIR 546



Score = 37.2 bits (86), Expect = 1e-04
Identities = 22/56 (39%), Positives = 30/56 (53%), Gaps = 4/56 (7%)

Query: 6 AVSGLNAAATNLDVIGNNIANSATYGFKSGTASFAD----MFAGSKVGLGVKVAGI 57
A+SGLNAA L+ NNI++ G+ T A + AG VG GV V+G+
Sbjct: 7 AMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVSGV 62


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z1716  FLGHOOKAP1  44  4e-07  Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 43.8 bits (103), Expect = 4e-07
Identities = 18/81 (22%), Positives = 36/81 (44%), Gaps = 14/81 (17%)

Query: 3 SSLWIAKTGLDAQQTNMDVIANNLANVSTNGFKRQRAVFEDLLYQTIRQPGAQSSEQTTL 62
S + A +GL+A Q ++ +NN+++ + G+ RQ + + +TL
Sbjct: 2 SLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTI--------------MAQANSTL 47

Query: 63 PSGLQIGTGVRPVATERLHSQ 83
+G +G GV +R +
Sbjct: 48 GAGGWVGNGVYVSGVQREYDA 68



Score = 41.1 bits (96), Expect = 3e-06
Identities = 11/41 (26%), Positives = 21/41 (51%)

Query: 220 ETSNVNVAEELVNMIQVQRAYEINSKAVSTTDQMLQKLTQL 260
S VN+ EE N+ + Q+ Y N++ + T + + L +
Sbjct: 505 SISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z1717  FLGLRINGFLGH  349  e-126  Flagellar L-ring protein signature.
		>FLGLRINGFLGH#Flagellar L-ring protein signature.

Length = 232

Score = 349 bits (897), Expect = e-126
Identities = 232/232 (100%), Positives = 232/232 (100%)

Query: 1 MQKNAAHTYAISSLLVLSLTGCAWIPSTPLVQGATSAQPVPGPTPVANGSIFQSAQPINY 60
MQKNAAHTYAISSLLVLSLTGCAWIPSTPLVQGATSAQPVPGPTPVANGSIFQSAQPINY
Sbjct: 1 MQKNAAHTYAISSLLVLSLTGCAWIPSTPLVQGATSAQPVPGPTPVANGSIFQSAQPINY 60

Query: 61 GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYLQGLFGNA 120
GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYLQGLFGNA
Sbjct: 61 GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYLQGLFGNA 120

Query: 121 RADVEASGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRF 180
RADVEASGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRF
Sbjct: 121 RADVEASGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRF 180

Query: 181 SGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSPM 232
SGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSPM
Sbjct: 181 SGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSPM 232


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z1718  FLGPRINGFLGI  427  e-152  Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 427 bits (1098), Expect = e-152
Identities = 156/363 (42%), Positives = 213/363 (58%), Gaps = 9/363 (2%)

Query: 4 FLSALILLLVTTAAQAERIRDLTSVQGVRQNSLIGYGLVVGLDGTGDQTTQTPFTTQTLN 63
F + L A RI+D+ S+Q R N LIGYGLVVGL GTGD +PFT Q++
Sbjct: 13 FSALPFLSTPPAQADTSRIKDIASLQAGRDNQLIGYGLVVGLQGTGDSLRSSPFTEQSMR 72

Query: 64 NMLSQLGITVPTGTNMQLKNVAAVMVTASLPPFGRQGQTIDVVVSSMGNAKSLRGGTLLM 123
ML LGIT G + KN+AAVMVTA+LPPF G +DV VSS+G+A SLRGG L+M
Sbjct: 73 AMLQNLGITTQGGQS-NAKNIAAVMVTANLPPFASPGSRVDVTVSSLGDATSLRGGNLIM 131

Query: 124 TPLKGVDSQVYALAQGNILVGGAGASAGGSSVQVNQLNGGRITNGAVIERELPSQFGVGN 183
T L G D Q+YA+AQG ++V G A +++ R+ NGA+IERELPS+F
Sbjct: 132 TSLSGADGQIYAVAQGALIVNGFSAQGDAATLTQGVTTSARVPNGAIIERELPSKFKDSV 191

Query: 184 TLNLQLNDEDFSMAQQIADTINRVR----GYGSATALDARTIQVRVPSGNSSQVRFLADI 239
L LQL + DFS A ++AD +N G A D++ I V+ P + R +A+I
Sbjct: 192 NLVLQLRNPDFSTAVRVADVVNAFARARYGDPIAEPRDSQEIAVQKPRV-ADLTRLMAEI 250

Query: 240 QNMQVNVTPQDAKVVINSRTGSVVMNREVTLDSCAIAQGNLSVTVNRQANVSQPDTPFGG 299
+N+ V T AKVVIN RTG++V+ +V + A++ G L+V V V QP PF
Sbjct: 251 ENLTVE-TDTPAKVVINERTGTIVIGADVRISRVAVSYGTLTVQVTESPQVIQP-APFSR 308

Query: 300 GQTVVTPQTQIDLRQSGGSLQSVRSSASLNNVVRALNALGATPMDLMSILQSMQSAGCLR 359
GQT V PQT I Q G + ++ L +V LN++G +++ILQ ++SAG L+
Sbjct: 309 GQTAVQPQTDIMAMQEGSKV-AIVEGPDLRTLVAGLNSIGLKADGIIAILQGIKSAGALQ 367

Query: 360 AKL 362
A+L
Sbjct: 368 AEL 370


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z1719  FLGFLGJ  508  0.0  Flagellar protein FlgJ signature.
		>FLGFLGJ#Flagellar protein FlgJ signature.

Length = 313

Score = 508 bits (1308), Expect = 0.0
Identities = 311/313 (99%), Positives = 311/313 (99%)

Query: 1 MISDSKLLASAAWDAQSLNELKAKASEDPAANIRPVARQVEGMFVQMMLKSMRDALPKDG 60
MISDSKLLASAAWDAQSLNELKAKA EDPAANIRPVARQVEGMFVQMMLKSMRDALPKDG
Sbjct: 1 MISDSKLLASAAWDAQSLNELKAKAGEDPAANIRPVARQVEGMFVQMMLKSMRDALPKDG 60

Query: 61 LFSSEHTRLYTSMYDQQIAQQMTTGKGLGLAEMMVKQMTPEQPLPEESTPAAPMKFPLET 120
LFSSEHTRLYTSMYDQQIAQQMT GKGLGLAEMMVKQMTPEQPLPEESTPAAPMKFPLET
Sbjct: 61 LFSSEHTRLYTSMYDQQIAQQMTAGKGLGLAEMMVKQMTPEQPLPEESTPAAPMKFPLET 120

Query: 121 VVRYQNQALSQLVQKAVPRNYDDSLPGDSKAFLAQLSLPAQLASQQSGVPHHLILAQAAL 180
VVRYQNQALSQLVQKAVPRNYDDSLPGDSKAFLAQLSLPAQLASQQSGVPHHLILAQAAL
Sbjct: 121 VVRYQNQALSQLVQKAVPRNYDDSLPGDSKAFLAQLSLPAQLASQQSGVPHHLILAQAAL 180

Query: 181 ESGWGQRQIRRENGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKKVKAKFRVYSSYL 240
ESGWGQRQIRRENGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKKVKAKFRVYSSYL
Sbjct: 181 ESGWGQRQIRRENGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKKVKAKFRVYSSYL 240

Query: 241 EALSDYVGLLTRNPRYAAVTTAASAEQGAQALQDAGYATDPHYARKLTNMIQQMKSISDK 300
EALSDYVGLLTRNPRYAAVTTAASAEQGAQALQDAGYATDPHYARKLTNMIQQMKSISDK
Sbjct: 241 EALSDYVGLLTRNPRYAAVTTAASAEQGAQALQDAGYATDPHYARKLTNMIQQMKSISDK 300

Query: 301 VSKTYSMNIDNLF 313
VSKTYSMNIDNLF
Sbjct: 301 VSKTYSMNIDNLF 313


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
Z1720 FLGHOOKAP1 677 0.0 Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 677 bits (1747), Expect = 0.0
Identities = 541/546 (99%), Positives = 543/546 (99%)

Query: 2 SSLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVS 61
SSLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVS
Sbjct: 1 SSLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVS 60

Query: 62 GVQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTLV 121
GVQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTLV
Sbjct: 61 GVQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTLV 120

Query: 122 SNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLND 181
SNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLND
Sbjct: 121 SNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLND 180

Query: 182 QISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGSTA 241
QISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGSTA
Sbjct: 181 QISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGSTA 240

Query: 242 RQLAAVPSSADPSRTTVAYVDGTAGNIEIPEKLLNTGSLGGILTFRSQDLDQTRNTLGQL 301
RQLAAVPSSADPSRTTVAYVDGTAGNIEIPEKLLNTGSLGGILTFRSQDLDQTRNTLGQL
Sbjct: 241 RQLAAVPSSADPSRTTVAYVDGTAGNIEIPEKLLNTGSLGGILTFRSQDLDQTRNTLGQL 300

Query: 302 ALAFAEAFNSQHKAGFDANGDEGEDFFAIGKPAVLQNTKNNGNVAIGATVTDASAVLATD 361
ALAFAEAFN+QHKAGFDANGD GEDFFAIGKPAVLQNTKN G+VAIGATVTDASAVLATD
Sbjct: 301 ALAFAEAFNTQHKAGFDANGDAGEDFFAIGKPAVLQNTKNKGDVAIGATVTDASAVLATD 360

Query: 362 YKISFDNNQWQVTRLASNTTFTVTPDANGKVAFDGLELTFTGTPAVNDSFTLKPVSDAIV 421
YKISFDNNQWQVTRLASNTTFTVTPDANGKVAFDGLELTFTGTPAVNDSFTLKPVSDAIV
Sbjct: 361 YKISFDNNQWQVTRLASNTTFTVTPDANGKVAFDGLELTFTGTPAVNDSFTLKPVSDAIV 420

Query: 422 NMDVLITDEAKIAMASEEDAGDSDNRNGQALLDLQSNSKTVGGAKSFNDAYASLVSDIGN 481
NMDVLITDEAKIAMASEEDAGDSDNRNGQALLDLQSNSKTVGGAKSFNDAYASLVSDIGN
Sbjct: 421 NMDVLITDEAKIAMASEEDAGDSDNRNGQALLDLQSNSKTVGGAKSFNDAYASLVSDIGN 480

Query: 482 KTATLKTSSTTQGNVVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFD 541
KTATLKTSS TQGNVVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFD
Sbjct: 481 KTATLKTSSATQGNVVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFD 540

Query: 542 ALINIR 547
ALINIR
Sbjct: 541 ALINIR 546


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
Z1721 FLAGELLIN 46 1e-07 Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 45.8 bits (108), Expect = 1e-07
Identities = 41/226 (18%), Positives = 81/226 (35%), Gaps = 9/226 (3%)

Query: 7 MMYQQNMRGITNSQAEWMKYGEQMSTGKRVVNPSDDPIAASQAVVLSQAQAQNSQYTLAR 66
++ Q N+ +S + + E++S+G R+ + DD + A + +Q +
Sbjct: 11 LLTQNNLNKSQSSLSSAI---ERLSSGLRINSAKDDAAGQAIANRFTSNIKGLTQASRNA 67

Query: 67 TFATQKVSLEESVLSQVTTAIQNAQEKIVYASNGTLSDNDRASLATDIQGLRDQLLNLAN 126
E L+++ +Q +E V A+NGT SD+D S+ +IQ +++ ++N
Sbjct: 68 NDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEEIDRVSN 127

Query: 127 TTDGNGRYIFAGYKTETAPFSEVNGDYVGGTESIKQQVDASRSMVIGHTGDKIFDSITSN 186
T NG + + +G E+I + +G G + +
Sbjct: 128 QTQFNGVKVLSQDNQMKIQVGANDG------ETITIDLQKIDVKSLGLDGFNVNGPKEAT 181

Query: 187 AVAEPDGSASETNLFAMLDSAIAALKTPVADSEADKETAAAALDKT 232
+ T A + + TA DK
Sbjct: 182 VGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKV 227


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
Z1722 IGASERPTASE 64 3e-12 IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 64.3 bits (156), Expect = 3e-12
Identities = 47/288 (16%), Positives = 84/288 (29%), Gaps = 36/288 (12%)

Query: 513 PSEEEFAERKRPEQPALATFAMPDVPPAPT-PAEPAAPVVAPAPKAAPATPATPAQPGLL 571
P E+ + DVP P+ E A AP P APATP+
Sbjct: 983 PEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETT----- 1037

Query: 572 SRFFGALKALFSGGEETKPTEQPAPKAEAKPERQQDRRKPRQNNRRDRNERRDTRSER-- 629
ET + Q QN + + + ++
Sbjct: 1038 ---------------ETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQT 1082

Query: 630 TEGSDNREENRRNRRQAQQQTAETREGRQQAEVTEKARTADEQQAPRRERSRRRNDDKRQ 689
E + + E + + ++TA + + TEK + + + + + + Q
Sbjct: 1083 NEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQ 1142

Query: 690 AQ---QEAKALNVEEQSVQETEQEERVRPVQPRRKQRQLNQKVRYEQSV--AEETVVAPV 744
A+ + +N++E Q + +P + + Q V +V V P
Sbjct: 1143 AEPARENDPTVNIKEPQSQTNTTADTEQPA--KETSSNVEQPVTESTTVNTGNSVVENPE 1200

Query: 745 AEETVAAEPIVQEAPA------PRTELVKVPLPVVAQTAPEQQEENNA 786
+P V + R + VP V T A
Sbjct: 1201 NTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVA 1248



Score = 63.5 bits (154), Expect = 4e-12
Identities = 46/261 (17%), Positives = 81/261 (31%), Gaps = 26/261 (9%)

Query: 551 VAPAPKAAPATPATPAQPGLLSRFFGALKALFSGGEETKPTEQP-APKAEAKPERQQDRR 609
P + S E + E P P A A P
Sbjct: 991 TVDTTNITTPNNIQADVPSVPSN----------NEEIARVDEAPVPPPAPATPSETT--- 1037

Query: 610 KPRQNNRRDRNERRDTRSERTEGSDNREENRRNRRQAQQQTAETREGRQQAEV------T 663
N ++++++ D E +NR A++ + + Q EV T
Sbjct: 1038 -----ETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSET 1092

Query: 664 EKARTADEQQAPRRERSRRRNDDKRQAQQEAKALNVEEQSVQETEQEERVRPVQPRRKQR 723
++ +T + ++ E+ + + + Q+ K + + QE + + + R
Sbjct: 1093 KETQTTETKETATVEKEEKAKVETEKTQEVPK-VTSQVSPKQEQSETVQPQAEPARENDP 1151

Query: 724 QLNQKVRYEQSVAEETVVAPVAEETVAAEPIVQEAPAPRTELVKVPLPVVAQTAPEQQEE 783
+N K Q+ P E + E V E+ T V P A Q
Sbjct: 1152 TVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTV 1211

Query: 784 NNADNRDNGGMPRRSRRSPRH 804
N+ + RRS RS H
Sbjct: 1212 NSESSNKPKNRHRRSVRSVPH 1232


27 Z1762 Z1826 Y Y Y Pathogenicity Island (biased-composition)
LocusTag DNBias CDNBias %GCBias Product
Z1762 -2 20 -3.797025 spermidine/putrescine ABC transporter
Z1763 -1 25 -4.701355 spermidine/putrescine ABC transporter
Z1764 2 32 -6.296319 hypothetical protein
Z1765 2 31 -6.674290 excisionase
Z1766 2 27 -5.561396 hypothetical protein
Z1768 2 25 -2.951496 hypothetical protein
Z1769 2 24 0.006283 hypothetical protein
Z1770 3 24 0.354495 hypothetical protein
Z1771 4 25 1.715494 hypothetical protein
Z1772 3 26 2.730637 hypothetical protein
Z1773 2 28 2.241100 hypothetical protein
Z1774 2 25 0.796917 hypothetical protein
Z1775 2 23 -0.834007 hypothetical protein
Z1776 2 26 -1.957278 hypothetical protein
Z1777 2 27 -2.560159 hypothetical protein
Z1778 2 28 -2.911498 hypothetical protein
Z1779 2 25 -1.018006 hypothetical protein
Z1780 0 23 0.021081 hypothetical protein
Z1781 0 28 -1.126987 hypothetical protein
Z1782 2 31 -4.374442 hypothetical protein
Z1783 1 30 -4.068972 Gef-like protein encoded by prophage CP-933N
Z1784 1 31 -5.401639 hypothetical protein
Z1785 3 25 -3.153743 endodeoxyribonuclease of prophage CP-933N
Z1786 5 26 -3.326061 Q antiterminator of prophage CP-933N
Z1787 5 25 -2.836747 hypothetical protein
Z1788 3 24 0.126831 hypothetical protein
Z1789 1 23 -0.532799 envelope protein encoded within prophage
Z1793 1 23 1.347593**hypothetical protein
Z1794 0 25 -1.515896 holin protein
Z1795 0 27 -0.987896 hypothetical protein
Z1796 0 26 0.103587 endolysin of prophage CP-933N
Z1797 2 24 2.275537 antirepressor of prophage CP-933N
Z1798 4 27 5.351060 endopeptidase of prophage CP-933N
Z1799 5 29 5.572456 hypothetical protein
Z1800 5 28 5.963090 hypothetical protein
Z1802 5 29 6.009842 hypothetical protein
Z1803 6 30 6.356386 terminase encoded by prophage CP-933N
Z1804 6 30 6.407505 hypothetical protein
Z1805 6 30 5.617795 hypothetical protein
Z1806 5 28 4.953912 hypothetical protein
Z1807 7 31 5.216703 hypothetical protein
Z1808 6 31 5.599547 hypothetical protein
Z1809 4 30 5.850932 hypothetical protein
Z1810 4 32 5.890683 hypothetical protein
Z1811 4 29 4.951980 tail component encoded by prophage CP-933N
Z1812 4 30 5.266939 tail assembly chaperone encoded by prophage
Z1813 0 23 1.811909 hypothetical protein
Z1814 0 25 2.285712 hypothetical protein
Z1815 0 22 0.162095 tail component L homolog encoded by prophage
Z1816 0 24 -3.402119 hypothetical protein
Z1817 5 37 -9.288140 hypothetical protein
Z1818 5 40 -11.235062 antirepressor protein encoded by prophage
Z1819 10 51 -15.030383 tail component K homolog encoded by prophage
Z1820 9 52 -14.929522 hypothetical protein
Z1821 9 52 -14.962942 hypothetical protein
Z1822 8 50 -14.220736 hypothetical protein
Z1823 6 42 -12.891502 hypothetical protein
Z1824 4 34 -10.240780 hypothetical protein
Z1825 1 18 -2.949813 insertion element IS2 transposase InsD
Z1826 -1 12 -3.749609 IS encoded protein
ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
Z1762 CHLAMIDIAOMP 28 0.045 Chlamydia major outer membrane protein signature.
		>CHLAMIDIAOMP#Chlamydia major outer membrane protein signature.

Length = 393

Score = 28.4 bits (63), Expect = 0.045
Identities = 19/67 (28%), Positives = 28/67 (41%), Gaps = 8/67 (11%)

Query: 137 GVNGDAVDPKSVTSWADL------WKPEYKGSLLLTDDAREVFQMALRKLGYSGNTTDPK 190
G GD DP T+W D + ++ +L D + FQM + +GN T P
Sbjct: 42 GFGGDPCDP--CTTWCDAISMRMGYYGDFVFDRVLKTDVNKEFQMGDKPTSTTGNATAPT 99

Query: 191 EIEAAYN 197
+ A N
Sbjct: 100 TLTAREN 106


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
Z1766 BINARYTOXINB 26 0.043 Binary toxin B family signature.
		>BINARYTOXINB#Binary toxin B family signature.

Length = 764

Score = 25.8 bits (56), Expect = 0.043
Identities = 14/52 (26%), Positives = 20/52 (38%), Gaps = 5/52 (9%)

Query: 1 MTDEIPLDDALLQLREF--IDENSGEFFVQVWGNGA-NFDNTILRRSYERQG 49
+ I D+ LQL E NS + G + DN + S E +G
Sbjct: 171 KKEVISSDN--LQLPELKQKSSNSRKKRSTSAGPTVPDRDNDGIPDSLEVEG 220


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
Z1768 GPOSANCHOR 28 0.032 Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 28.1 bits (62), Expect = 0.032
Identities = 22/79 (27%), Positives = 36/79 (45%), Gaps = 13/79 (16%)

Query: 42 AELTAAMTAIRETAQIAKL-----------MNEAKTQAEVNAAIGELNSKLASIQRECVS 90
+L A + E +I++ EAK Q V A+ E NSKLA++++
Sbjct: 361 KQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQ--VEKALEEANSKLAALEKLNKE 418

Query: 91 LVELVGTYQEINASLKAKI 109
L E ++ A L+AK+
Sbjct: 419 LEESKKLTEKEKAELQAKL 437


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
Z1781 FLGMRINGFLIF 32 0.001 Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 32.2 bits (73), Expect = 0.001
Identities = 15/58 (25%), Positives = 27/58 (46%), Gaps = 6/58 (10%)

Query: 22 GSNVVLPAEEAEELARIALASLAAVSDERAAYELFMEKRFG-----ESVDRRRAKNGD 74
+ +PA++ E R+ LA +EL +++FG E V+ +RA G+
Sbjct: 82 SGAIEVPADKVHE-LRLRLAQQGLPKGGAVGFELLDQEKFGISQFSEQVNYQRALEGE 138


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
Z1783 HOKGEFTOXIC 65 2e-18 Hok/Gef cell toxic protein family signature.
		>HOKGEFTOXIC#Hok/Gef cell toxic protein family signature.

Length = 52

Score = 64.8 bits (158), Expect = 2e-18
Identities = 18/46 (39%), Positives = 32/46 (69%)

Query: 23 QKAMLIALIVICITVIVTALVTRKDLCEVRIRTGQTEVAVFTAYEP 68
+ +++ ++++C+T+++ +TRK LCE+R R G EVA F AYE
Sbjct: 5 RSSLVWCVLIVCLTLLIFTYLTRKSLCEIRYRDGYREVAAFMAYES 50


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
Z1799 TONBPROTEIN 71 2e-18 Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 70.8 bits (173), Expect = 2e-18
Identities = 31/78 (39%), Positives = 49/78 (62%)

Query: 28 PRQLVKALPQYPAYAAANYIKGRVDVKFDIGADGTVTRIEFIRSEPHHLFDEQVVKAMAK 87
PR L + PQYPA A A I+G+V VKFD+ DG V ++ + ++P ++F+ +V AM +
Sbjct: 153 PRALSRNQPQYPARAQALRIEGQVKVKFDVTPDGRVDNVQILSAKPANMFEREVKNAMRR 212

Query: 88 WRFEKDRPRKGVKKTFIF 105
WR+E +P G+ +F
Sbjct: 213 WRYEPGKPGSGIVVNILF 230


28 Z1863 Z1941 Y Y Y Pathogenicity Island (biased-composition)
LocusTag DNBias CDNBias %GCBias Product
Z1863 -1 21 -3.593300 phosphohydrolase
Z1864 -1 22 -3.030047 23S rRNA pseudouridine synthase E
Z1865 0 23 -4.012218 isocitrate dehydrogenase
Z1866 1 38 -8.680153 integrase of prophage CP-933X
Z1867 1 36 -7.944493 integrase of prophage CP-933X
Z1868 1 32 -7.256711 replication protein O of prophage CP-933X
Z1869 1 33 -7.085957 replication protein P of prophage CP-933X
Z1870 3 35 -8.340894 multidrug efflux protein
Z1871 2 33 -7.375859 hypothetical protein
Z1872 4 26 -1.459043 hypothetical protein
Z1873 2 25 -2.355200 endodeoxyribonuclease RUS
Z1874 2 29 -5.792077 antiterminator Q of prophage CP-933X
Z1875 1 31 -6.971810 holin protein of prophage CP-933X
Z1876 1 34 -7.875958 endolysin of prophage CP-933X
Z1877 2 31 -6.936999 endopeptidase of prophage CP-933X
Z1878 0 18 -2.505366 Bor protein of prophage CP-933X
Z1879 1 19 -1.913665 envelope protein of prophage CP-933X
Z1880 1 22 2.777254 hypothetical protein
Z1881 1 23 4.795071 hypothetical protein
Z1882 1 23 5.347427 DNA packaging protein of prophage CP-933X
Z1883 1 24 5.632055 DNA packaging protein of prophage CP-933X
Z1884 4 25 6.820629 head-tail joining protein of prophage CP-933X
Z1885 4 25 6.968720 capsid structural protein of prophage CP-933X
Z1886 4 27 6.534287 capsid protein of prophage CP-933X
Z1887 3 27 4.723677 head-DNA stabilization protein of prophage
Z1888 4 24 4.831953 capsid protein of prophage CP-933X
Z1889 5 26 4.476768 DNA packaging protein of prophage CP-933X
Z1890 4 26 4.388357 head-tail joining protein of prophage CP-933X
Z1891 2 27 5.233945 tail component of prophage CP-933X
Z1893 4 27 5.397793 tail component of prophage CP-933X
Z1894 4 29 5.853164 tail component of prophage CP-933X
Z1895 5 31 5.888096 tail component of prophage CP-933X
Z1896 4 30 6.445307 tail component of prophage CP-933X
Z1898 4 31 6.902643 tail component of prophage CP-933X
Z1901 6 30 4.631675 hypothetical protein
Z1902 6 31 4.727543 head-tail adaptor of prophage CP-933X
Z1903 5 32 4.810194 hypothetical protein
Z1905 4 32 5.402983 hypothetical protein
Z1906 4 33 5.281322 hypothetical protein
Z1908 4 32 5.585976 tail component of prophage CP-933X
Z1910 4 31 5.791003 hypothetical protein
Z1912 3 29 5.658131 hypothetical protein
Z1913 2 24 4.872824 tail component of prophage CP-933X
Z1914 2 22 4.751123 minor tail fiber protein of prophage CP-933X
Z1915 2 21 4.117936 tail protein (partial) of prophage CP-933X
Z1916 1 19 2.196737 tail protein (partial) of prophage CP-933X
Z1917 3 24 0.166111 prophage protein
Z1918 4 28 -1.310638 membrane protein of prophage CP-933X
Z1919 6 36 -5.210807 hypothetical protein
Z1920 5 42 -9.299094 tail fiber protein of prophage CP-933X
Z1921 5 42 -9.780790 hypothetical protein
Z1922 4 35 -8.490364 hypothetical protein
Z1923 4 33 -7.099209 hypothetical protein
Z1924 1 22 0.726954 hypothetical protein
Z1925 2 25 -2.409685 hypothetical protein
Z1926 3 24 -2.365740 hypothetical protein
Z1927 3 26 -2.730403 hypothetical protein
Z1928 2 23 -2.595794 hypothetical protein
Z1929 2 22 -2.111840 hypothetical protein
Z1930 4 30 -5.597786 protease encoded within prophage CP-933X
Z1931 3 25 -3.847084 outer membrane protease
Z1932 0 18 -1.650967 hypothetical protein
Z1933 -2 14 -0.653902 hypothetical protein
Z1934 -3 16 -2.062893 transposase for IS629
Z1935 -2 19 -4.069821 hypothetical protein
Z1936 -1 20 -4.768060 cell division topological specificity factor
Z1937 -2 18 -3.318600 cell division inhibitor MinD
Z1938 -3 20 -3.893083 septum formation inhibitor
Z1939 -2 19 -6.476544 hypothetical protein
Z1940 -2 18 -5.719159 hypothetical protein
Z1941 -3 18 -3.860633 hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
Z1878 PF06291 170 4e-59 Lambda prophage Bor protein
		>PF06291#Lambda prophage Bor protein

Length = 102

Score = 170 bits (432), Expect = 4e-59
Identities = 91/97 (93%), Positives = 93/97 (95%)

Query: 1 MKKMLLATALALLITGCAQQTFTVQNKPAAVTPKETITHHFFVSGIGQKKTVDAAKICGG 60
MKKML + ALA+LITGCAQQTFTV NKP AVTPKETITHHFFVSGIGQKKTVDAAKICGG
Sbjct: 6 MKKMLFSAALAMLITGCAQQTFTVGNKPTAVTPKETITHHFFVSGIGQKKTVDAAKICGG 65

Query: 61 AENVVKTETQQTFVNGLLGFITLGIYTPLEARVYCSQ 97
AENVVKTETQQTFVNGLLGFITLGIYTPLEARVYCSQ
Sbjct: 66 AENVVKTETQQTFVNGLLGFITLGIYTPLEARVYCSQ 102


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
Z1894 INTIMIN 29 0.019 Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 29.3 bits (65), Expect = 0.019
Identities = 32/202 (15%), Positives = 61/202 (30%), Gaps = 29/202 (14%)

Query: 76 DWAATGQGQKSAGDTSFT----LAWMPGEQGQQALLAWFNEGDTRAYKIRFPNGTVDVFR 131
G G+ + S + + AL + A I +
Sbjct: 611 SANTNGSGKATVTLKSDKPGQVVVSAKTAEMTSAL-------NANAV-IFVDQTKASITE 662

Query: 132 GWVSSIGKAVTAKEVITRTVKVTNVGRPSMAEDRSTVTAATGMTVTPASTSVVKGQSTTL 191
++ IT TVKV +P ++ + T ++ + T TL
Sbjct: 663 IKADKTTAVANGQDAITYTVKVMKGDKPVSNQEVTFTTTLGKLSNSTEKTDTNGYAKVTL 722

Query: 192 T---------------VAFQPEGATDKSFRAVSADKTKATVSVSGMTITVKG--VAAGKV 234
T VA + + F ++ D + +G+ + + G+V
Sbjct: 723 TSTTPGKSLVSARVSDVAVDVKAPEVEFFTTLTIDDGNIEIVGTGVKGKLPTVWLQYGQV 782

Query: 235 NIPVVSGNGEFAAVAEINVTAS 256
N+ GNG++ + AS
Sbjct: 783 NLKASGGNGKYTWRSANPAIAS 804


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
Z1910 TACYTOLYSIN 25 0.037 Bacterial thiol-activated pore-forming cytolysin sig...
		>TACYTOLYSIN#Bacterial thiol-activated pore-forming cytolysin signature.

Length = 574

Score = 24.9 bits (54), Expect = 0.037
Identities = 10/39 (25%), Positives = 20/39 (51%)

Query: 11 ARLSGFRHKTVKVPEWRNVSVVLREPSAEAWYLWQEVLN 49
++ S F RN+ ++ RE + AW W++V++
Sbjct: 508 SKTSPFSTVIPLGANSRNIRIMARECTGLAWEWWRKVID 546


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
Z1913 CHANLCOLICIN 32 0.015 Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 32.0 bits (72), Expect = 0.015
Identities = 45/240 (18%), Positives = 93/240 (38%), Gaps = 24/240 (10%)

Query: 356 RAQQAVAAARGTEMQIAAEARLAATQERLN-------RNIAARSAAQNALNSTTAVGSRL 408
+A+QA A E Q A+A A +RL R+ A+R+ + L +
Sbjct: 66 QAEQAARAKAAAEAQAKAKANRDALTQRLKDIVNEALRHNASRTPSATELAHANNAAMQA 125

Query: 409 MSGALGLVGGVPGLVMLGAAAWYTLYQNQEQARESARQYALTIDEIAHKTPSMSLPEASD 468
L L AA + +++ +E R+ A T ++ + ++
Sbjct: 126 EDERLRLAKAEEKARKEAEAAEKAFQEAEQRRKEIEREKAETERQL----------KLAE 175

Query: 469 NEGRTRAALTEQNRLID-------EQASRVKSLQEKAQSIQDVLAGLEDRRVALIRQQAA 521
E + AAL+E+ + ++ S V + + +++ L+ R A ++ A
Sbjct: 176 AEEKRLAALSEEAKAVEIAQKKLSAAQSEVVKMDGEIKTLNSRLSSSIHARDAEMKTLAG 235

Query: 522 EQNKVYQSMLVMNGQHTEFNRLLGLGNELLQQRQGLVNVPLRLPQATLDDKQQSALTKTE 581
++N++ Q+ +L N+ LQ R R+ + +++Q +T +E
Sbjct: 236 KRNELAQASAKYKELDELVKKLSPRANDPLQNRPFFEATRRRVGAGKIREEKQKQVTASE 295


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
Z1917 ENTEROVIROMP 83 1e-22 Enterobacterial virulence outer membrane protein si...
		>ENTEROVIROMP#Enterobacterial virulence outer membrane protein signature.

Length = 171

Score = 82.6 bits (204), Expect = 1e-22
Identities = 39/134 (29%), Positives = 67/134 (50%), Gaps = 12/134 (8%)

Query: 7 VILSAVVWQVAAATPASAAEHQSTLSAGYLHASTNVPG-SDDLNGINVKYRYEFMDA-LG 64
+ + + V A T ++ ST++ GY A ++ G + + G N+KYRYE ++ LG
Sbjct: 4 IACLSALAAVLAFTAGTSVAATSTVTGGY--AQSDAQGQMNKMGGFNLKYRYEEDNSPLG 61

Query: 65 LITSFSYANAEDEQKTRYSDTRWHEDSVRNRWFSVMAGPSVRVNEWFSAYAMAGVAYSRV 124
+I SF+Y T S T D +N+++ + AGP+ R+N+W S Y + GV Y +
Sbjct: 62 VIGSFTY--------TEKSRTASSGDYNKNQYYGITAGPAYRINDWASIYGVVGVGYGKF 113

Query: 125 STFSGDYLRVTDNK 138
T + +
Sbjct: 114 QTTEYPTYKHDTSD 127


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
Z1918 CHANLCOLICIN 45 2e-06 Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 44.7 bits (105), Expect = 2e-06
Identities = 54/319 (16%), Positives = 118/319 (36%)

Query: 154 ARAASTSAGQAASSAQSASSSAGTASTKATEASKSAAAAESSKSAAATSAGAAKTSETNA 213
+ S S AA A + S+A T+A +A+++ AAAE+ A A + +
Sbjct: 39 GKGGSKSESSAAIHATAKWSTAQLKKTQAEQAARAKAAAEAQAKAKANRDALTQRLKDIV 98

Query: 214 AVSQQSAATSASTATTKASEAASSARDASASKEAAKSSETSAASSASSAASSATAAGNSA 273
+ + A+ +AT A ++ + AK+ E + + ++ + A
Sbjct: 99 NEALRHNASRTPSATELAHANNAAMQAEDERLRLAKAEEKARKEAEAAEKAFQEAEQRRK 158

Query: 274 KAAKTSETNAKSSETAAEQSASAAAGSKTAAALSASAASTSAGQASASATAAGKSAESAA 333
+ + + + A + AA S+ A A+ + SA Q+ ++
Sbjct: 159 EIEREKAETERQLKLAEAEEKRLAALSEEAKAVEIAQKKLSAAQSEVVKMDGEIKTLNSR 218

Query: 334 SSASTATTKAGEATEQASAAASSASAAKTSETNAKASETSAESSKTAAASSASSAASSAS 393
S+S A T + ++AK E + + S ++ A
Sbjct: 219 LSSSIHARDAEMKTLAGKRNELAQASAKYKELDELVKKLSPRANDPLQNRPFFEATRRRV 278

Query: 394 SASASKDEATRQASAAKSSATTASTKATEAAGSATAAAQSKSTAESAATRAETAAKRAED 453
A ++E +Q +A+++ + T+ + + + +++ + AE K+A++
Sbjct: 279 GAGKIREEKQKQVTASETRINRINADITQIQKAISQVSNNRNAGIARVHEAEENLKKAQN 338

Query: 454 IASAVALEDASTTKKGIVQ 472
++DA Q
Sbjct: 339 NLLNSQIKDAVDATVSFYQ 357



Score = 31.6 bits (71), Expect = 0.019
Identities = 58/332 (17%), Positives = 111/332 (33%), Gaps = 22/332 (6%)

Query: 315 AGQASASATAAGKSAESAASSA----STATTKAGEATEQASAAASSASAAKTSETNAKAS 370
+G KS SAA A STA K +A + A A A++ + AK +
Sbjct: 32 SGSGGGGGKGGSKSESSAAIHATAKWSTAQLKKTQAEQAARAKAAAEAQAKAKANRDALT 91

Query: 371 E---------TSAESSKTAAASSASSAASSASSASASKDEATRQASAAKSSATTASTKAT 421
+ +S+T +A+ + A ++A A + + A+ A A
Sbjct: 92 QRLKDIVNEALRHNASRTPSATELAHANNAAMQAEDERLRLAKAEEKARKEAEAAEKAFQ 151

Query: 422 EAAGSATAAAQSKSTAESAATRAETAAKRAEDIASA-----VALEDASTTKKGIVQLSSA 476
EA + K+ E AE KR ++ +A + S + +V++
Sbjct: 152 EAEQRRKEIEREKAETERQLKLAEAEEKRLAALSEEAKAVEIAQKKLSAAQSEVVKMDGE 211

Query: 477 TNSTSESLAATPKAVKAAYELANGKYTAQDATTAQKGIVQLSNATNSTSEMLAATPKSVK 536
+ + L+++ A A + GK +A+ + + A P +
Sbjct: 212 IKTLNSRLSSSIHARDAEMKTLAGKRNELAQASAK---YKELDELVKKLSPRANDPLQNR 268

Query: 537 AAYDLANGKYTAQDAT-TAQKGIVQLSSATNSASETLAATPKAVKAANDNANGRVPSARK 595
++ + A QK + + N + + KA+ ++N N + +
Sbjct: 269 PFFEATRRRVGAGKIREEKQKQVTASETRINRINADITQIQKAISQVSNNRNAGIARVHE 328

Query: 596 VNGKALSSDITLTPKDIGTLNSTTMSFSGGAG 627
+ L I T+SF
Sbjct: 329 AEENLKKAQNNLLNSQIKDAVDATVSFYQTLT 360


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
Z1925 LUXSPROTEIN 31 0.002 Bacterial autoinducer-2 (AI-2) production protein Lu...
		>LUXSPROTEIN#Bacterial autoinducer-2 (AI-2) production protein LuxS signature.

Length = 171

Score = 31.4 bits (71), Expect = 0.002
Identities = 18/66 (27%), Positives = 30/66 (45%), Gaps = 7/66 (10%)

Query: 41 TKEHLLPHFL-EHLGNNHLDI------GVGTGFYLTHVPESSLISLMDLNEASLNAASTR 93
T EHL F+ HL + ++I G TGFY++ + S + D A++
Sbjct: 54 TLEHLYAGFMRNHLNGDSVEIIDISPMGCRTGFYMSLIGTPSEQQVADAWIAAMEDVLKV 113

Query: 94 AGESKI 99
++KI
Sbjct: 114 ENQNKI 119


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
Z1931 OMPTIN 527 0.0 Omptin serine protease signature.
		>OMPTIN#Omptin serine protease signature.

Length = 317

Score = 527 bits (1358), Expect = 0.0
Identities = 313/317 (98%), Positives = 316/317 (99%)

Query: 1 MRAKLLGIVLTTPIAISSFASTETLSFTPDNINADISLGTLSGKTKERVYLAEEGGRKVS 60
MRAKLLGIVLTTPIAISSFASTETLSFTPDNINADISLGTLSGKTKERVYLAEEGGRKVS
Sbjct: 1 MRAKLLGIVLTTPIAISSFASTETLSFTPDNINADISLGTLSGKTKERVYLAEEGGRKVS 60

Query: 61 QLDWKFNNAAIIKGAINWDLMPQISIGAAGWTTLGSRGGNMVDQDWMDSSNPGTWTDESR 120
QLDWKFNNAAIIKGAINWDLMPQISIGAAGWTTLGSRGGNMVDQDWMDSSNPGTWTDESR
Sbjct: 61 QLDWKFNNAAIIKGAINWDLMPQISIGAAGWTTLGSRGGNMVDQDWMDSSNPGTWTDESR 120

Query: 121 HPDTQLNYANEFDLNIKGWLLNEPNYRLGLMAGYQESRYSFTARGGSYIYSSEEGFRDDI 180
HPDTQLNYANEFDLNIKGWLLNEPNYRLGLMAGYQESRYSFTARGGSYIYSSEEGFRDDI
Sbjct: 121 HPDTQLNYANEFDLNIKGWLLNEPNYRLGLMAGYQESRYSFTARGGSYIYSSEEGFRDDI 180

Query: 181 GSFPNGERAIGYKQRFKMPYIGLTGSYRYEDFELGGTFKYSGWVEASDNDEHYDPGKRIT 240
GSFPNGERAIGYKQRFKMPYIGLTGSYRYEDFELGGTFKYSGWVE+SDNDEHYDPGKRIT
Sbjct: 181 GSFPNGERAIGYKQRFKMPYIGLTGSYRYEDFELGGTFKYSGWVESSDNDEHYDPGKRIT 240

Query: 241 YRSKVKDQNYYSVSVNAGYYVTPNAKVYVEGTWNRVTNKKGNTSLYDHNDNTSDYSKNGA 300
YRSKVKDQNYYSV+VNAGYYVTPNAKVYVEG WNRVTNKKGNTSLYDHN+NTSDYSKNGA
Sbjct: 241 YRSKVKDQNYYSVAVNAGYYVTPNAKVYVEGAWNRVTNKKGNTSLYDHNNNTSDYSKNGA 300

Query: 301 GIENYNFITTAGLKYTF 317
GIENYNFITTAGLKYTF
Sbjct: 301 GIENYNFITTAGLKYTF 317


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
Z1932 HTHTETR 28 0.009 TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 27.7 bits (61), Expect = 0.009
Identities = 9/49 (18%), Positives = 25/49 (51%), Gaps = 2/49 (4%)

Query: 1 MKEGRRMDKRAKNQIVDSDIARLLLKLRKSRNLTVTELAQRSGVSQAMI 49
++ ++ + + I+D A L + + ++ E+A+ +GV++ I
Sbjct: 2 ARKTKQEAQETRQHILDV--ALRLFSQQGVSSTSLGEIAKAAGVTRGAI 48


29 Z2028 Z2151 Y Y Y Pathogenicity Island (biased-composition)
LocusTag DNBias CDNBias %GCBias Product
Z2028 -2 19 -3.925278 voltage-gated potassium channel
Z2029 1 15 -1.572230 hypothetical protein
Z2030 -1 15 -2.467882 transporter
Z2031 -1 15 -2.906640 acyl-CoA thioester hydrolase
Z2032 0 15 -3.095052 intracellular septation protein A
Z2033 0 18 -3.294474 hypothetical protein
Z2034 1 20 -4.042117 outer membrane protein W
Z2036 1 23 -4.978370 integrase for prophage CP-933O
Z2037 1 24 -4.815016 exonuclease VIII
Z2038 1 37 -9.287212 hypothetical protein
Z2039 2 35 -8.824249 regulator of cell division encoded by prophage
Z2040 5 35 -7.423901 hypothetical protein
Z2041 2 31 -5.587192 hypothetical protein
Z2042 3 28 -2.989351 hypothetical protein
Z2043 3 27 -1.137420 hypothetical protein
Z2045 2 26 -0.533883 transcriptional repressor DicA
Z2046 1 26 -0.379951 DNA-binding transcriptional regulator DicC
Z2047 0 23 -1.901888 hypothetical protein
Z2048 1 25 -1.541881 hypothetical protein
Z2049 0 27 -2.541441 hypothetical protein
Z2051 -2 23 -2.702591 hypothetical protein
Z2052 -1 25 -3.137924 hypothetical protein
Z2053 -1 24 -3.021258 intestinal colonization factor encoded by
Z2054 -1 26 -1.851967 killer protein encoded by prophage CP-933O;
Z2055 0 21 -0.773953 hypothetical protein
Z2056 2 22 0.612648 hypothetical protein
Z2057 2 25 0.876639 endonuclease of prophage CP-933O
Z2058 3 26 1.068700 hypothetical protein
Z2059 4 29 2.141532 hypothetical protein
Z2060 4 30 2.073087 DNA adenine methyltransferase encoded by
Z2065 5 28 2.343300**phage protein YjhS encoded within prophage
Z2066 3 27 2.874228 hypothetical protein
Z2068 5 27 2.231715 hypothetical protein
Z2069 3 22 2.414450 holin protein of prophage CP-933O
Z2070 2 20 0.233974 hypothetical protein
Z2071 2 21 0.273760 endolysin of prophage CP933-O; partial
Z2072 1 25 -4.189224 IS encoded protein encoded by prophage CP-933O
Z2073 2 26 -4.949473 transposase within CP-933O
Z2074 2 25 -5.465611 IS encoded protein within CP-933O
Z2075 1 24 -2.790308 hypothetical protein
Z2076 0 23 -0.525758 hypothetical protein
Z2077 0 23 0.866813 hypothetical protein
Z2078 1 25 3.336138 transposase within CP-933O
Z2079 0 22 2.486072 IS encoded protein within CP-933O
Z2080 0 20 0.667799 IS encoded protein within CP-933O
Z2081 1 19 -1.875191 IS encoded protein within CP-933O
Z2082 1 20 -2.697600 transposase within CP-933O; partial
Z2083 0 26 -5.974481 hypothetical protein
Z2084 0 26 -5.283699 integrase within CP-933O; partial
Z2085 1 27 -5.827814 exonuclease VIII
Z2086 0 32 -7.317834 division inhibition protein DicB within CP-933O
Z2087 0 27 -5.262218 hypothetical protein
Z2088 1 25 -2.990124 hypothetical protein
Z2089 1 24 0.112579 hypothetical protein
Z2090 2 24 0.313906 repressor protein encoded within prophage
Z2091 2 24 2.705264 hypothetical protein
Z2092 3 25 2.300509 hypothetical protein
Z2093 2 25 1.787755 hypothetical protein
Z2094 1 24 0.278130 hypothetical protein
Z2095 0 19 -0.100151 hypothetical protein
Z2096 -1 21 -0.122111 hypothetical protein
Z2097 -1 21 -2.070971 hypothetical protein
Z2098 -3 23 -1.770581 hypothetical protein
Z2099 -1 25 -3.196163 hypothetical protein
Z2100 -1 22 -2.464690 hypothetical protein
Z2101 -2 24 -4.067838 endonuclease encoded within prophage CP-933O
Z2102 0 23 -4.223830 hypothetical protein
Z2103 2 23 -1.008267 hypothetical protein
Z2104 2 25 0.570770 AraC family transcriptional regulator
Z2105 1 28 2.897679 hypothetical protein
Z2106 3 24 3.053577 hypothetical protein
Z2107 4 22 2.771096 hypothetical protein
Z2108 4 20 2.940656 phage protein YjhS encoded within prophage
Z2109 4 22 2.959693 phage protein YjhS encoded within prophage
Z2110 3 20 3.341353 transposase encoded within prophage CP-933O
Z2111 5 24 4.109707 transposase encoded within prophage CP-933O
Z2112 5 24 3.584084 ClpP-like protease encoded within prophage
Z2113 4 24 4.025105 hypothetical protein
Z2114 4 26 3.686316 hypothetical protein
Z2115 3 23 3.526954 hypothetical protein
Z2116 3 21 1.511902 hypothetical protein
Z2117 5 25 -1.311656 hypothetical protein
Z2118 2 23 -1.143904 endopeptidase Rz of prophage CP-933O
Z2119 2 25 -2.559895 hypothetical protein
Z2120 2 24 -0.013172 endolysin of prophage CP-933O
Z2121 2 24 3.012447 hypothetical protein
Z2122 2 22 3.428333 holin protein of prophage CP-933O
Z2123 1 22 3.584469 hypothetical protein
Z2124 1 23 4.173425 hypothetical protein
Z2127 2 23 5.408872 prophage CP-933O IS protein
Z2130 1 23 5.299033 prophage CP-933O IS protein
Z2131 1 23 4.653846 terminase large subunit of prophage CP-933O
Z2132 -1 27 5.871651 head completion protein of prophage CP-933O
Z2133 -1 26 5.536471 capsid assembly protein of prophage CP-933O
Z2134 1 24 5.629926 head-tail preconnector protein of prophage
Z2135 1 25 5.365388 capsid protein small subunit of prophage
Z2136 1 26 5.476188 major capsid protein of prophage CP-933O
Z2137 2 29 5.822790 tail component of prophage CP-933O
Z2138 3 30 6.221260 tail component of prophage CP-933O
Z2139 3 30 6.930012 tail component of prophage CP-933O
Z2140 3 30 6.541042 tail component of prophage CP-933O
Z2141 5 31 6.093402 tail component of prophage CP-933O
Z2142 5 29 6.352695 tail component of prophage CP-933O
Z2143 5 29 6.009216 tail component of prophage CP-933O
Z2144 5 26 4.693945 tail component of prophage CP-933O
Z2145 4 23 2.743877 tail component of prophage CP-933O
Z2146 4 32 -2.587005 outer membrane protein Lom of prophage CP-933O
Z2147 6 38 -4.297146 tail fiber protein of prophage CP-933O
Z2148 4 36 -8.564024 hypothetical protein
Z2149 0 31 -6.870763 hypothetical protein
Z2150 -1 28 -6.257698 hypothetical protein
Z2151 -2 22 -4.826367 hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
Z2029 adhesinmafb 31 4e-04 Neisseria meningitidis: adhesin MafB signature.
		>adhesinmafb#Neisseria meningitidis: adhesin MafB signature.

Length = 467

Score = 31.2 bits (70), Expect = 4e-04
Identities = 16/57 (28%), Positives = 20/57 (35%), Gaps = 2/57 (3%)

Query: 41 GPMPAVDSNDPGAAGFTGSTVIAEFESLEAAQAWADADPYVAAGVYEHVSVKPFKKV 97
P+PA G GS E + EA W +P A V +V KV
Sbjct: 268 APLPA--EGKFAVIGGLGSVAGFEKNTREAVDRWIQENPNAAETVEAVFNVAAAAKV 322


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
Z2030 TONBPROTEIN 156 3e-50 Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 156 bits (395), Expect = 3e-50
Identities = 157/161 (97%), Positives = 157/161 (97%)

Query: 1 MTLDLPRRFPWPTLLSVCIHGAVVAGLLYTSVHQVIELPAPAQPISVTMVAPADLEPPQA 60
MTLDLPRRFPWPTLLSVCIHGAVVAGLLYTSVHQVIELPAPAQPISVTMV PADLEPPQA
Sbjct: 1 MTLDLPRRFPWPTLLSVCIHGAVVAGLLYTSVHQVIELPAPAQPISVTMVTPADLEPPQA 60

Query: 61 VQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQQKRDVKPVESR 120
VQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQ KRDVKPVESR
Sbjct: 61 VQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVESR 120

Query: 121 PASPFENTAPARPTSSTATAATSKPVTSVASGPRALSRNQK 161
PASPFENTAPAR TSSTATAATSKPVTSVASGPRALSRNQ
Sbjct: 121 PASPFENTAPARLTSSTATAATSKPVTSVASGPRALSRNQP 161


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
Z2039 STREPKINASE 29 0.002 Streptococcus streptokinase protein signature.
		>STREPKINASE#Streptococcus streptokinase protein signature.

Length = 440

Score = 29.3 bits (65), Expect = 0.002
Identities = 12/36 (33%), Positives = 21/36 (58%)

Query: 17 TSPGGTRHRITKFIVEDAIMETLLPNVNTSEGCFEI 52
T G H++ K + AI E L+ NV++++ FE+
Sbjct: 91 TDSGAMSHKLEKADLLKAIQEQLIANVHSNDDYFEV 126


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z2054  HOKGEFTOXIC  59  3e-16  Hok/Gef cell toxic protein family signature.
		>HOKGEFTOXIC#Hok/Gef cell toxic protein family signature.

Length = 52

Score = 59.5 bits (144), Expect = 3e-16
Identities = 18/46 (39%), Positives = 31/46 (67%)

Query: 23 QKAMLIALIVICLIVIVTALVTRKDLCEVRIRTGQTEVAVFTAYEP 68
+ +++ ++++CL +++ +TRK LCE+R R G EVA F AYE
Sbjct: 5 RSSLVWCVLIVCLTLLIFTYLTRKSLCEIRYRDGYREVAAFMAYES 50


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z2090  PF07675  28  0.033  Cleaved Adhesin
		>PF07675#Cleaved Adhesin

Length = 1358

Score = 28.1 bits (62), Expect = 0.033
Identities = 17/73 (23%), Positives = 30/73 (41%), Gaps = 4/73 (5%)

Query: 116 IPANTFAVVLESDSMSTSGGGVSIPNGSTVFVDPDRIVQPGNIVLALPKGTTTPVIRKLE 175
I A+ + V S G GV+ +G +I + GN + + + PVI++++
Sbjct: 267 IQASAGSYVAISKDGVLYGTGVANASGVATVNMTKQITENGNYDVVITRSNYLPVIKQIQ 326

Query: 176 IEGPDILLVPTNP 188
P P P
Sbjct: 327 AGEPS----PYQP 335


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z2137  INTIMIN  33  0.001  Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 32.7 bits (74), Expect = 0.001
Identities = 23/119 (19%), Positives = 45/119 (37%), Gaps = 17/119 (14%)

Query: 93 KEVITRTVKVTNVGKPSVAEERSKITPVSAIKVTP-------------TSGTVAKGKTTT 139
++ IT TVKV KP +E + T + + + TS T K +
Sbjct: 675 QDAITYTVKVMKGDKPVSNQEVTFTTTLGKLSNSTEKTDTNGYAKVTLTSTTPGKSLVSA 734

Query: 140 LT--VSFEPESATDKTFRAVSADPSKATI--SVKDMTITVNGVATGKVQIPVVSGNGQF 194
V+ + ++ + F ++ D I + + + G+V + GNG++
Sbjct: 735 RVSDVAVDVKAPEVEFFTTLTIDDGNIEIVGTGVKGKLPTVWLQYGQVNLKASGGNGKY 793


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z2140  GPOSANCHOR  30  0.043  Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 30.0 bits (67), Expect = 0.043
Identities = 55/235 (23%), Positives = 90/235 (38%), Gaps = 23/235 (9%)

Query: 362 AWNDRENARLGLAAATLQSDMEKAGELAARDR--AERDASQLKYTGEAQKAYERLLTPLE 419
++ ++A++ A + + EL + + L
Sbjct: 239 NFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKA 298

Query: 420 KYTARQEELNKALKDGKILRADYNTLMAAAKKDYESTLKKPKSSGVKVSAGERQE----- 474
+ + LN + LR D + AKK E+ +K + K+S RQ
Sbjct: 299 DLEHQSQVLNANRQS---LRRDLDA-SREAKKQLEAEHQKLEE-QNKISEASRQSLRRDL 353

Query: 475 DQAHAALLALETELRTLEKHSGANEKISQQ-RRDLWKA-ENQYAVLKE-AATKRQLSEQE 531
D + A LE E + LE+ + +E Q RRDL + E + V K +L+ E
Sbjct: 354 DASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQVEKALEEANSKLAALE 413

Query: 532 KF---LLAHKDETLEYKRQLAELGDKVEHQ-KRLNE-LAQQAVRFEEQQSAKQAA 581
K L K T + K AEL K+E + K L E LA+QA + ++ K +
Sbjct: 414 KLNKELEESKKLTEKEK---AELQAKLEAEAKALKEKLAKQAEELAKLRAGKASD 465


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z2146  ENTEROVIROMP  136  4e-43  Enterobacterial virulence outer membrane protein si...
		>ENTEROVIROMP#Enterobacterial virulence outer membrane protein signature.

Length = 171

Score = 136 bits (344), Expect = 4e-43
Identities = 64/200 (32%), Positives = 101/200 (50%), Gaps = 30/200 (15%)

Query: 1 MRKLYAAILSAAICLAVSGAPAWASEHQSTLSAGYLHVSTNVPGSDELNGINVKYRYEFT 60
M+K+ A + + A LA + + A+ ST++ GY + + G N+KYRYE
Sbjct: 1 MKKI-ACLSALAAVLAFTAGTSVAA--TSTVTGGYAQSDAQGQMNK-MGGFNLKYRYEED 56

Query: 61 DT-LGMVTSFSYAGDKNRQLTRYSDTRWHEDSVRNRWFSVMAGPSVRVNEWFSAYAMAGV 119
++ LG++ SF+Y T S T D +N+++ + AGP+ R+N+W S Y + GV
Sbjct: 57 NSPLGVIGSFTY--------TEKSRTASSGDYNKNQYYGITAGPAYRINDWASIYGVVGV 108

Query: 120 AYSRVSTFSGDYLRVTDNKGKTHDVLTGSDDGRHSNTSLAWGAGVQFNPTESVAIDIAYE 179
Y + T T+ HD S+ ++GAG+QFNP E+VA+D +YE
Sbjct: 109 GYGKFQT--------TEYPTYKHD---------TSDYGFSYGAGLQFNPMENVALDFSYE 151

Query: 180 GSGSGDWRTDGFIVGVGYKF 199
S +I GVGY+F
Sbjct: 152 QSRIRSVDVGTWIAGVGYRF 171


30  Z2196  Z2213  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z2196     -1      22       -4.330995   ABC transporter ATP-binding protein
Z2197     0       28       -5.887944   DNA-binding regulator
Z2199     1       31       -6.415231   hypothetical protein
Z2200     3       31       -6.509250   major fimbrial subunit
Z2201     1       26       -5.127981   fimbrial chaperone protein
Z2203     1       27       -5.339165   fimbrial usher protein
Z2204     0       24       -5.485796   fimbrial-like protein
Z2205     0       25       -6.007161   fimbrial-like protein
Z2206     0       25       -6.991394   adhesin FimH
Z2207     -1      24       -7.236321   oxidoreductase
Z2208     -1      21       -8.363821   hypothetical protein
Z2209     -2      20       -7.379710   transcriptional regulator YdeO
Z2210     -3      15       -5.853571   sulfatase
Z2211     -2      14       -5.213727   hypothetical protein
Z2212     -2      12       -3.783174   ABC transporter ATP-binding protein
Z2213     -2      12       -3.518821   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z2196  PRTACTNFAMLY  114  4e-29  Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 114 bits (286), Expect = 4e-29
Identities = 118/467 (25%), Positives = 176/467 (37%), Gaps = 59/467 (12%)

Query: 22 NGLMTFNATLGGDNSPTDKMNVKGDTQGNTRVRVDNIGGVGAQTVNGIELIEVGGNSAGN 81
+GL N D +DK+ V D G R+ V N G + N + L++ SA
Sbjct: 481 SGLFRMNVFA--DLGLSDKLVVMQDASGQHRLWVRNSGS-EPASANTLLLVQTPLGSAAT 537

Query: 82 FALTT--GTVEAGAYVYTLAKGKGNDEKNWYLTSKWDGVTPADTPDPINNPPVVDPEGPS 139
F L G V+ G Y Y LA N W L P P P PP P
Sbjct: 538 FTLANKDGKVDIGTYRYRLAA---NGNGQWSLVGAKAPPAPKPAPQPGPQPPQPPQPQPE 594

Query: 140 --VYRPEAGSYIS----------NIAAANSLF---SHRLHDRLGEPQYTDSLHSQDSASS 184
+P AG +S + A++L+ S+ L RLGE L A
Sbjct: 595 APAPQPPAGRELSAAANAAVNTGGVGLASTLWYAESNALSKRLGE------LRLNPDAGG 648

Query: 185 MWMRHVGGHERSSAGDGQLNTQANRYVLQLGGDLAQWSSNAQDRWHLGVMAGYANQHSNT 244
W R ++ G+ Q +LG D A + A RWHLG +AGY
Sbjct: 649 AWGRGFAQRQQLDNRAGRRFDQ-KVAGFELGADHA--VAVAGGRWHLGGLAGYTR----- 700

Query: 245 QSNRVGYKSDGRISGYSAGLYATWYQNDANKTGAYVDSWALYNWFDNSV---SSDNRSAD 301
G+ DG G++ ++ Y +G Y+D+ + +N SD +
Sbjct: 701 --GDRGFTGDG--GGHTDSVHVGGYATYIADSGFYLDATLRASRLENDFKVAGSDGYAVK 756

Query: 302 -DYDSRGVTASVEGGYTFEAGTCSGSEGTLNTWYVQPQAQITWMGVKDSDHARKDGTRIE 360
Y + GV AS+E G F + W+++PQA++ + +G R+
Sbjct: 757 GKYRTHGVGASLEAGRRFTHA---------DGWFLEPQAELAVFRAGGGAYRAANGLRVR 807

Query: 361 TEGDGNVQTRLGVKTYLNSHHQRDDGKQREFQPYIEANWINNSK-VYAVKMNGQTVSRDG 419
EG +V RLG L + + R+ QPYI+A+ + V NG +
Sbjct: 808 DEGGSSVLGRLG----LEVGKRIELAGGRQVQPYIKASVLQEFDGAGTVHTNGIAHRTEL 863

Query: 420 ARNLGEVRTGVEAKVNNNLSLWGNVGVQLGDKGYSDTQGMLGVKYSW 466
E+ G+ A + SL+ + G K G +YSW
Sbjct: 864 RGTRAELGLGMAAALGRGHSLYASYEYSKGPKLAMPWTFHAGYRYSW 910


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z2203  PF00577  939  0.0  Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 939 bits (2429), Expect = 0.0
Identities = 501/869 (57%), Positives = 653/869 (75%), Gaps = 10/869 (1%)

Query: 15 QVLLLPRFARLTIALSLATAVFPVDAEYYFNPRFLSNDLAESVDLSAFTKGREAPPGTYR 74
+ L F RL +A + A AE YFNPRFL++D DLS F G+E PPGTYR
Sbjct: 20 KHRLAGFFVRLFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPGTYR 79

Query: 75 VDIYLNDEFMTSRDITFITDDNNADLIPCLSTDLLVSLGIKKSALLDNKEHSAEKHVPDN 134
VDIYLN+ +M +RD+TF T D+ ++PCL+ L S+G+ +++ + ++ +
Sbjct: 80 VDIYLNNGYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLNTASV-------SGMNLLAD 132

Query: 135 SACTPLQDRLVDASTEFDVGQQHLSLSVPQIYVGRMARGYVSPDLWEEGINAGLLNYSFN 194
AC PL + DA+ + DVGQQ L+L++PQ ++ ARGY+ P+LW+ GINAGLLNY+F+
Sbjct: 133 DACVPLTSMIHDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFS 192

Query: 195 GNSINNRSNHNAGKSNYAYLNLQSGINIGSWRLRDNSTWSYNSGSSNSSDSNKWQHINTS 254
GNS N G S+YAYLNLQSG+NIG+WRLRDN+TWSYNS S+S NKWQHINT
Sbjct: 193 GNS---VQNRIGGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTW 249

Query: 255 AERDIIPLRSRLTVGDSYTDGDIFDSVNFRGLKINSTEAMLPDSQHGFAPVIHGIARGTA 314
ERDIIPLRSRLT+GD YT GDIFD +NFRG ++ S + MLPDSQ GFAPVIHGIARGTA
Sbjct: 250 LERDIIPLRSRLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTA 309

Query: 315 QVSVKQNGYDVYQTTVPPGPFTIDDINSAANGGDLQVTIKEADGSIQTLYVPYSSVPVLQ 374
QV++KQNGYD+Y +TVPPGPFTI+DI +A N GDLQVTIKEADGS Q VPYSSVP+LQ
Sbjct: 310 QVTIKQNGYDIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQ 369

Query: 375 RAGYTRYALAMGEYRSGNNLQSSPRFIQGSLMHGLEGNWTPYGGMQIAEDYQAFNLGIGK 434
R G+TRY++ GEYRSGN Q PRF Q +L+HGL WT YGG Q+A+ Y+AFN GIGK
Sbjct: 370 REGHTRYSITAGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGK 429

Query: 435 DLGLFGAFSFDITQANTTLADGTRHSGQSVKSVYSKSFYQTGTNIQVAGYRYSTQGFYNL 494
++G GA S D+TQAN+TL D ++H GQSV+ +Y+KS ++GTNIQ+ GYRYST G++N
Sbjct: 430 NMGALGALSVDMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNF 489

Query: 495 SDSAYSRMSGYTVKPPTGDSNEQTQFIDYFNLFYSKRGQEQISISQQLGNYGATFFSASR 554
+D+ YSRM+GY ++ G + +F DY+NL Y+KRG+ Q++++QQLG + S S
Sbjct: 490 ADTTYSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRTSTLYLSGSH 549

Query: 555 QSYWNTSRSDQQISFGLNVPFGDITTSLNYSYSNNIWQNDRDHLLAFTLNVPFSHWMRTD 614
Q+YW TS D+Q GLN F DI +L+YS + N WQ RD +LA +N+PFSHW+R+D
Sbjct: 550 QTYWGTSNVDEQFQAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRSD 609

Query: 615 SQSAFRNSNASYSMSNDLKGGMTNLSGVYGTLLPDNNLNYSVQVGNTHGGNTSSGTSGYS 674
S+S +R+++ASYSMS+DL G MTNL+GVYGTLL DNNL+YSVQ G GG+ +SG++GY+
Sbjct: 610 SKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYA 669

Query: 675 TLNYRGAYGNTNVGYSRSGDSSQIYYGMSGGIIAHADGITFGQPLGDTMVLVKAPGADNV 734
TLNYRG YGN N+GYS S D Q+YYG+SGG++AHA+G+T GQPL DT+VLVKAPGA +
Sbjct: 670 TLNYRGGYGNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNDTVVLVKAPGAKDA 729

Query: 735 KIENQTGIHTDWRGYAILPFATEYRENRVALNANSLADNVELDETVVTVIPTHGAIARAT 794
K+ENQTG+ TDWRGYA+LP+ATEYRENRVAL+ N+LADNV+LD V V+PT GAI RA
Sbjct: 730 KVENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAE 789

Query: 795 FNAQIGGKVLMTLKYGNKSVPFGAIVTHGENKNGSIVAENGQVYLTGLPQSGKLQVSWGN 854
F A++G K+LMTL + NK +PFGA+VT +++ IVA+NGQVYL+G+P +GK+QV WG
Sbjct: 790 FKARVGIKLLMTLTHNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLAGKVQVKWGE 849

Query: 855 DKNSNCIVDYKLPEVSPGTLLNQQTAICR 883
++N++C+ +Y+LP S LL Q +A CR
Sbjct: 850 EENAHCVANYQLPPESQQQLLTQLSAECR 878


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z2205  FIMBRIALPAPF  33  2e-04  Escherichia coli: P pili tip fibrillum papF protein...
		>FIMBRIALPAPF#Escherichia coli: P pili tip fibrillum papF protein signature.

Length = 167

Score = 33.1 bits (75), Expect = 2e-04
Identities = 29/93 (31%), Positives = 47/93 (50%), Gaps = 7/93 (7%)

Query: 16 LLTATLQAADVTITVNGRVVAKPCTIQT-KEANVNLGDLYTRNLQQPGSASGWHNITLSL 74
LLT+ ADV I + G V PCTI + V+ G++ N + ++ G +S+
Sbjct: 11 LLTSVAVLADVQINIRGNVYIPPCTINNGQNIVVDFGNI---NPEHVDNSRGEVTKNISI 67

Query: 75 TDCPAETSAVTAIVTGSTDNTGYYKNEGTAENI 107
+ CP ++ ++ VTG+T G +N A NI
Sbjct: 68 S-CPYKSGSLWIKVTGNTMGVG--QNNVLATNI 97


31  Z2249  Z2259  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z2249     -1      24       -4.630189   hypothetical protein
Z2250     -1      25       -4.973069   N-hydroxyarylamine O-acetyltransferase
Z2251     2       29       -9.092633   hypothetical protein
Z2252     1       25       -1.467858   4-oxalocrotonate tautomerase
Z2253     3       29       3.624493    H repeat-containing Rhs element protein
Z2254     4       31       6.360691    H repeat-containing Rhs element protein
Z2255     3       32       6.003763    Rhs element protein
Z2256     3       31       5.626299    Rhs element protein
Z2257     4       31       5.720138    Rhs element protein
Z2259     3       25       4.488427    Rhs element protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z2251  IGASERPTASE  27  0.024  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 27.3 bits (60), Expect = 0.024
Identities = 7/29 (24%), Positives = 13/29 (44%)

Query: 119 WQFDDDKLNTLHHLGAGTFVTSGKRVTAG 147
W+ + + + L +G GT + G G
Sbjct: 437 WKVHNPQYDRLAKIGKGTLIVEGTGDNKG 465


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z2253  TYPE3OMGPROT  28  0.031  Type III secretion system outer membrane G protein ...
		>TYPE3OMGPROT#Type III secretion system outer membrane G protein family signature.

Length = 607

Score = 27.9 bits (62), Expect = 0.031
Identities = 11/37 (29%), Positives = 18/37 (48%)

Query: 29 RRGAIHVISAFSTMHSLVIGQIKTDEKSNEITAIPEL 65
R + ++ SL+IG I DE S ++ +P L
Sbjct: 442 SRTVVDTVARVGHGQSLIIGGIYRDELSVALSKVPLL 478


32  Z2302  Z2316  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias     Product
Z2302     -2      17       -3.239677   hypothetical protein
Z2303     -2      18       -4.246275   cytochrome b561
Z2304     -1      21       -5.273929   glyceraldehyde-3-phosphate dehydrogenase
Z2306     -1      26       -7.087581   aldehyde dehydrogenase
Z2308     0       14       -2.872917   hypothetical protein
Z2309     0       14       -2.875927   hypothetical protein
Z2310     -1      11       -2.013708   hypothetical protein
Z2311     -2      10       -0.918904   hypothetical protein
Z2312     -2      11       -0.874198   hypothetical protein
Z2313     -3      11       -0.465400   ATP-dependent RNA helicase HrpA
Z2315     -3      25       -4.352017   azoreductase
Z2316     -3      27       -4.109201   hypothetical protein
33  Z2333  Z2413  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z2333     4       37       -8.521677   hypothetical protein
Z2334     5       32       -3.714328   hypothetical protein
Z2335     5       33       -2.971620   hypothetical protein
Z2337     4       34       -2.569637   hypothetical protein
Z2338     5       24       2.630176    hypothetical protein
Z2339     5       25       4.253522    hypothetical protein
Z2340     5       27       4.899300    tail fiber protein encoded by prophage CP-933R
Z2342     5       27       4.170699    outer membrane protein Lom encoded by prophage
Z2343     5       28       4.385585    outer membrane protein Lom encoded by prophage
Z2344     6       30       4.689934    tail fiber protein encoded by prophage CP-933R
Z2346     5       31       4.206488    phage tail protein encoded by prophage CP-933R
Z2347     5       31       4.397988    copper-zinc superoxide dismutase encoded within
Z2348     4       36       6.809962    phage tail protein encoded by prophage CP-933R
Z2350     3       33       6.863085    phage tail protein encoded by prophage CP-933R
Z2351     4       33       6.614737    tail component of prophage CP-933R
Z2352     3       32       6.360566    tail component of prophage CP-933R
Z2353     2       28       5.938201    tail component of prophage CP-933R
Z2354     1       27       5.629542    tail component of prophage CP-933R
Z2355     1       27       5.019026    tail component of prophage CP-933R
Z2356     1       24       5.633923    tail component of prophage CP-933R
Z2357     0       26       5.581111    tail component of prophage CP-933R
Z2358     -1      26       5.917428    tail component of prophage CP-933R
Z2359     1       23       4.705392    capsid structural protein of prophage CP-933R
Z2360     2       24       4.085688    capsid protein of prophage CP-933R
Z2361     2       23       3.452138    capsid assembly protein of prophage CP-933R
Z2362     2       23       0.965264    capsid protein of prophage CP-933R
Z2363     2       21       -2.175652   DNA packaging protein of prophage CP-933R
Z2364     1       20       -2.008604   DNA packaging protein of prophage CP-933R;
Z2365     1       31       -6.402902   DNA packaging protein of prophage CP-933R;
Z2366     0       24       -3.809669   hypothetical protein
Z2367     0       23       -2.987423   hypothetical protein
Z2368     2       22       -1.156150   hypothetical protein
Z2369     1       23       0.678635    endopeptidase Rz of prophage CP-933R
Z2370     3       22       1.122634    hypothetical protein
Z2371     2       22       2.678830    lysozyme R of prophage CP-933R
Z2372     3       24       2.574851    hypothetical protein
Z2374     2       22       3.021041    holin protein of prophage CP-933R
Z2375     1       22       2.279326    hypothetical protein
Z2376     1       23       1.492763    IS629 transposase within prophage CP-933R
Z2377     -1      23       0.793322    hypothetical protein
Z2378     1       29       -7.981334   hypothetical protein
Z2379     4       36       -11.175551  hypothetical protein
Z2382     7       45       -13.310259  **hypothetical protein
Z2384     7       45       -12.689868  hypothetical protein
Z2385     7       47       -13.555729  hypothetical protein
Z2386     8       52       -14.989420  hypothetical protein
Z2387     8       50       -14.219250  hypothetical protein
Z2389     4       42       -10.073409  DNA modification methyltransferase encoded
Z2390     -1      28       -3.724267   hypothetical protein
Z2391     -2      24       -2.145717   hypothetical protein
Z2392     0       25       -1.943648   hypothetical protein
Z2393     0       23       -1.057216   hypothetical protein
Z2394     0       23       -1.468964   hypothetical protein
Z2395     0       24       -2.086285   hypothetical protein
Z2396     -1      26       -2.565037   replication protein
Z2397     0       32       -4.347334   hypothetical protein
Z2398     -2      35       -7.449738   hypothetical protein
Z2399     0       40       -9.625436   regulatory protein Cro of prophage CP-933R
Z2400     1       36       -9.605465   repressor protein encoded within prophage
Z2402     0       25       -4.512063   hypothetical protein
Z2403     0       25       -4.745765   hypothetical protein
Z2404     1       26       -4.924647   prophage CP-933R superinfection exclusion
Z2406     1       27       -4.498280   FtsZ inhibitor protein
Z2408     2       26       -4.217505   hypothetical protein
Z2409     2       23       -4.573990   exonuclease VIII
Z2410     1       19       -6.052821   recombination and repair protein RecT
Z2412     -1      18       -4.130190   restriction alleviation and modification
Z2413     -2      17       -3.137190   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z2333  ECOLIPORIN  206  5e-69  E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 206 bits (525), Expect = 5e-69
Identities = 100/117 (85%), Positives = 107/117 (91%)

Query: 1 MKSKVLALLIPALLGAGAAHAAEVYNKDGNKLDLYGKVDGLHYFSDNSAKDGDQSYARLG 60
MK KVLAL+IPALL AGAAHAAE+YNKDGNKLDLYGKVDGLHYFSD+S+KDGDQ+Y R+G
Sbjct: 1 MKRKVLALVIPALLAAGAAHAAEIYNKDGNKLDLYGKVDGLHYFSDDSSKDGDQTYMRVG 60

Query: 61 FKGETQINDQLTGYGQWEYNIQANNTESSKNQSWTRLAFAGLKFSDYGSFDYGRNYG 117
FKGETQINDQLTGYGQWEYN+QAN TE SWTRLAFAGLKF DYGSFDYGRNYG
Sbjct: 61 FKGETQINDQLTGYGQWEYNVQANTTEGEGANSWTRLAFAGLKFGDYGSFDYGRNYG 117


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z2334  ECOLIPORIN  117  1e-34  E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 117 bits (295), Expect = 1e-34
Identities = 81/154 (52%), Positives = 86/154 (55%), Gaps = 59/154 (38%)

Query: 45 NDQVNH--TAAGGDKADAWTAGLKYDANNIYLATMYSETRNMTPFGDS----DYAVANKT 98
N+QVN T AGGDKADAWTAGLKYDANNIYLATMYSETRNMTP+G + D VANKT
Sbjct: 230 NEQVNAGGTIAGGDKADAWTAGLKYDANNIYLATMYSETRNMTPYGKTDKGYDGGVANKT 289

Query: 99 QNFE-----------------------------------------------------STY 105
QNFE STY
Sbjct: 290 QNFEVTAQYQFDFGLRPAVSFLMSKGKDLTYNNVNGDDKDLVKYADVGATYYFNKNFSTY 349

Query: 106 VDYKINLLDEDDSFYAANGISTDDIVALGLVYQF 139
VDYKINLLD+DD FY GISTDDIVALG+VYQF
Sbjct: 350 VDYKINLLDDDDPFYKDAGISTDDIVALGMVYQF 383



Score = 83.4 bits (206), Expect = 5e-22
Identities = 43/70 (61%), Positives = 50/70 (71%)

Query: 2 GWTDMLPEFGGDSYTNADNFMTGRANGVATYRNTDFFGLVNGLNDQVNHTAAGGDKADAW 61
GWTDMLPEFGGDSYT ADN+MTGRANGVATYRNTDFFGLV+GLN + + ++
Sbjct: 124 GWTDMLPEFGGDSYTYADNYMTGRANGVATYRNTDFFGLVDGLNFALQYQGKNESQSADD 183

Query: 62 TAGLKYDANN 71
+ NN
Sbjct: 184 VNIGTNNRNN 193


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z2342  ENTEROVIROMP  45  7e-10  Enterobacterial virulence outer membrane protein si...
		>ENTEROVIROMP#Enterobacterial virulence outer membrane protein signature.

Length = 171

Score = 44.9 bits (106), Expect = 7e-10
Identities = 13/35 (37%), Positives = 23/35 (65%)

Query: 1 VRNRWFSVMAGPSVRVNEWFSAYAMAGVAYSRVST 35
+N+++ + AGP+ R+N+W S Y + GV Y + T
Sbjct: 81 NKNQYYGITAGPAYRINDWASIYGVVGVGYGKFQT 115


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z2344  RTXTOXIND  30  0.045  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 30.2 bits (68), Expect = 0.045
Identities = 15/103 (14%), Positives = 34/103 (33%), Gaps = 2/103 (1%)

Query: 548 YVRSVNLVGKSAFVEASGRASNDAEGYLGLFREKIGKLH--LAQGLWELIDNSQLADEMA 605
Y + +NL K A N E + + ++ L + + ++
Sbjct: 203 YQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYV 262

Query: 606 EMKTSITETRNEITQTVSKTLEDQSAIIQQIQRVQKDTNDDLA 648
E + ++++ Q S+ L + Q + + D L
Sbjct: 263 EAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLR 305


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z2358  INTIMIN  33  0.001  Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 32.7 bits (74), Expect = 0.001
Identities = 23/119 (19%), Positives = 45/119 (37%), Gaps = 17/119 (14%)

Query: 104 KEVITRTVKVTNVGKPSVAEERSKITPVSAIKVTP-------------TSGTVAKGKTTT 150
++ IT TVKV KP +E + T + + + TS T K +
Sbjct: 675 QDAITYTVKVMKGDKPVSNQEVTFTTTLGKLSNSTEKTDTNGYAKVTLTSTTPGKSLVSA 734

Query: 151 LT--VSFEPESATDKTFRAVSADPSKATI--SVKDMTITVNGVATGKVQIPVVSGNGQF 205
V+ + ++ + F ++ D I + + + G+V + GNG++
Sbjct: 735 RVSDVAVDVKAPEVEFFTTLTIDDGNIEIVGTGVKGKLPTVWLQYGQVNLKASGGNGKY 793


34  Z2432  Z2439  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias     Product
Z2432     -1      13       -3.042547   O-6-alkylguanine-DNA:cysteine-protein
Z2433     -1      14       -3.712948   fumarate/nitrate reduction transcriptional
Z2435     -2      16       -4.045964   universal stress protein UspE
Z2436     -1      19       -4.588833   hypothetical protein
Z2437     -1      19       -4.291823   hypothetical protein
Z2438     0       23       -4.157764   transport periplasmic protein
Z2439     2       26       -3.232003   LysR family transcriptional regulator
35  Z2474m  Z2487  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z2474m    -2      13       -3.524714   sugar-binding periplasmic protein
Z2475m    -2      14       -4.321564   glycosidase
Z2477     2       11       -0.684789   thiosulfate:cyanide sulfurtransferase
Z2478     2       13       1.728449    peripheral inner membrane phage-shock protein
Z2479     2       19       3.200278    DNA-binding transcriptional activator PspC
Z2480     2       21       4.199274    phage shock protein B
Z2482     1       20       4.063540    phage shock protein PspA
Z2484     0       19       4.315497    phage shock protein operon transcriptional
Z2486     -1      18       4.043636    4-aminobutyrate aminotransferase
Z2487     -3      16       3.161295    oxidoreductase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z2480  MPTASEINHBTR  25  0.030  Metalloprotease inhibitor signature.
		>MPTASEINHBTR#Metalloprotease inhibitor signature.

Length = 122

Score = 24.6 bits (53), Expect = 0.030
Identities = 7/43 (16%), Positives = 17/43 (39%)

Query: 30 SGRSELSQSEQQRLAQLADEAKRMRERIQALESILDAEHPNWR 72
+G+ + + A A++A + + E L + +W
Sbjct: 37 AGQLGIEATGSGVCAGPAEQANALAGDVACAEQWLGDKPVSWS 79


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z2484  HTHFIS  342  e-118  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 342 bits (880), Expect = e-118
Identities = 126/341 (36%), Positives = 182/341 (53%), Gaps = 23/341 (6%)

Query: 6 DNLLGEANSFLEVLEQVSHLAPLDKPVLIIGERGTGKELIASRLHYLSSRWQGPFISLNC 65
L+G + + E+ ++ L D ++I GE GTGKEL+A LH R GPF+++N
Sbjct: 137 MPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALHDYGKRRNGPFVAINM 196

Query: 66 AALNENLLDSELFGHEAGAFTGAQKRHPGRFERADGGTLFLDELATAPMMVQEKLLRVIE 125
AA+ +L++SELFGHE GAFTGAQ R GRFE+A+GGTLFLDE+ PM Q +LLRV++
Sbjct: 197 AAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDMPMDAQTRLLRVLQ 256

Query: 126 YGELERVGGSQPLQVNVRLVCATNADLPAMVNEGTFRADLLDRLAFDVVQLPPLRERESD 185
GE VGG P++ +VR+V ATN DL +N+G FR DL RL ++LPPLR+R D
Sbjct: 257 QGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRLNVVPLRLPPLRDRAED 316

Query: 186 IMLMAEHFAIQMCREIKLPLFPGFTERARETLLNYRWPGNIRELKNVVERSVYRHGTSDY 245
I + HF Q +E F + A E + + WPGN+REL+N+V R +
Sbjct: 317 IPDLVRHFVQQAEKEGLDVK--RFDQEALELMKAHPWPGNVRELENLVRRLTALYPQDVI 374

Query: 246 PLDDIIID---PFKRRPPEDAIAVSETTSLPTLPLD------------------LREFQM 284
+ I + P E A A S + S+ +
Sbjct: 375 TREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFASFGDALPPSGLYDRVLA 434

Query: 285 QQEKELLQLSLQQGKYNQKRAAELLGLTYHQFRALLKKHQI 325
+ E L+ +L + NQ +AA+LLGL + R +++ +
Sbjct: 435 EMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGV 475


36  Z2551  Z6051  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z2551     0       21       -4.345039   tryptophan synthase subunit alpha
Z2553     1       38       -9.932316   hypothetical protein
Z2554     2       37       -9.610369   hypothetical protein
Z2555     2       38       -8.878309   hypothetical protein
Z2557     1       32       -6.954584   hypothetical protein
Z2558     2       35       -8.756737   hypothetical protein
Z2560     2       38       -8.968621   hypothetical protein
Z2561     1       34       -7.596122   transposase (partial)
Z2562     3       40       -9.337733   transposase (partial)
Z2563     4       40       -8.471041   transposase (partial)
Z2565     4       35       -6.875680   chaperone protein
Z6010     2       20       0.344671    hypothetical protein
Z6011     3       22       2.883503    hypothetical protein
Z6012     3       24       3.289373    hypothetical protein
Z6014     3       25       1.720536    hypothetical protein
Z6015     2       23       -1.610565   hypothetical protein
Z6016     3       27       -2.456008   hypothetical protein
Z6017     7       46       -8.251487   transposase
Z6019     8       51       -9.417804   transposase
Z6020     9       52       -9.395756   hypothetical protein
Z6021     6       43       -4.276783   hypothetical protein
Z6022     4       35       -1.571150   integrase fragment
Z6024     3       22       0.787463    hypothetical protein
Z6025     3       23       3.814169    hypothetical protein
Z6026     4       27       5.458503    hypothetical protein
Z6027     4       27       5.706944    tail fiber protein of cryptic prophage CP-933P
Z6028     4       27       5.247926    Lom-like outer membrane protein of cryptic
Z6029     4       29       5.605831    tail component of cryptic prophage CP-933P
Z6030     4       31       6.071847    tail component of cryptic prophage CP-933P
Z6031     4       33       6.340391    tail assembly protein of cryptic prophage
Z6032     4       32       5.526219    tail assembly protein of cryptic prophage
Z6033     4       33       5.685059    tail component of cryptic prophage CP-933P
Z6034     4       31       5.741152    tail component of cryptic prophage CP-933P
Z6035     6       33       6.196464    tail assembly protein of cryptic prophage
Z6036     6       33       5.772558    tail assembly protein of cryptic prophage
Z6037     5       29       5.238886    tail component of cryptic prophage CP-933P
Z6038     6       30       5.884792    structural component of cryptic prophage
Z6039     5       31       6.610751    structural component of cryptic prophage
Z6040     6       30       6.481747    head-tail adaptor of cryptic prophage CP-933P
Z6041     5       30       6.078401    hypothetical protein
Z6042     5       28       6.099479    hypothetical protein
Z6043     5       28       6.072924    hypothetical protein
Z6044     4       26       5.907924    hypothetical protein
Z6045     3       24       2.960912    terminase subunit encoded by cryptic prophage
Z6046     0       23       1.213266    terminase subunit encoded by cryptic prophage
Z6047     1       21       -0.025599   DNAse encoded by cryptic prophage CP-933P
Z6048     0       24       -0.626869   hypothetical protein
Z6049     2       22       1.996861    endopeptidase encoded by cryptic prophage
Z6050     3       24       1.710908    antirepressor protein encoded by cryptic
Z6051     2       28       2.697654    endolysin encoded by cryptic prophage CP-933P
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z6027  CHANLCOLICIN  32  0.007  Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 31.6 bits (71), Expect = 0.007
Identities = 33/170 (19%), Positives = 63/170 (37%), Gaps = 34/170 (20%)

Query: 101 EALRRFELMVEEAARHAEEAKKNAGEAETSARNAGISASQAEENAANADTSAGDASESAR 160
EA + + E A+ E A+K A+ + + E N+ S+ S AR
Sbjct: 175 EAEEKRLAALSEEAKAVEIAQKKLSAAQ-----SEVVKMDGEIKTLNSRLSS---SIHAR 226

Query: 161 QAAESAAAAKQSEEASSSSA--------SAAAQKASESLQS----------------ATD 196
A A K++E A +S+ + +A++ LQ+ +
Sbjct: 227 DAEMKTLAGKRNELAQASAKYKELDELVKKLSPRANDPLQNRPFFEATRRRVGAGKIREE 286

Query: 197 AELSKKTAESAAGNAARDATTAAEKARESAESAQSAEQSRIA-AEEAVNR 245
+ +E+ N T +KA + ++A +R+ AEE + +
Sbjct: 287 KQKQVTASETRI-NRINADITQIQKAISQVSNNRNAGIARVHEAEENLKK 335



Score = 30.8 bits (69), Expect = 0.012
Identities = 35/149 (23%), Positives = 52/149 (34%), Gaps = 10/149 (6%)

Query: 101 EALRRFELMVEEAARHAEEAKKNAGEAETSARNAGISASQAEENAANADTSAGDASESAR 160
E R+ E+A + AE+ +K E + + ++AEE A + A E
Sbjct: 137 EKARKEAEAAEKAFQEAEQRRKEIER-EKAETERQLKLAEAEEKRLAALSEEAKAVE--- 192

Query: 161 QAAESAAAAKQSEEASSSSASAAAQKASESLQSATDAE---LSKKTAESAAGNAARDATT 217
A+ +A QSE S A DAE L+ K E A +A
Sbjct: 193 -IAQKKLSAAQSEVVKMDGEIKTLNSRLSSSIHARDAEMKTLAGKRNELAQASAKYKELD 251

Query: 218 AAEKARESAESAQSAEQSRIAAEEAVNRI 246
K + A Q+R E R+
Sbjct: 252 ELVKK--LSPRANDPLQNRPFFEATRRRV 278


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z6028  ENTEROVIROMP  137  1e-43  Enterobacterial virulence outer membrane protein si...
		>ENTEROVIROMP#Enterobacterial virulence outer membrane protein signature.

Length = 171

Score = 137 bits (347), Expect = 1e-43
Identities = 62/195 (31%), Positives = 98/195 (50%), Gaps = 29/195 (14%)

Query: 7 VILSAVVWQVAAATPASAAEHQSTLSAGYLHASTNVPG-SDDLNGINVKYRYEFMDA-LG 64
+ + + V A T ++ ST++ GY A ++ G + + G N+KYRYE ++ LG
Sbjct: 4 IACLSALAAVLAFTAGTSVAATSTVTGGY--AQSDAQGQMNKMGGFNLKYRYEEDNSPLG 61

Query: 65 LITSFSYANAEDEQKTRYSDTRWHEDSVRNRWFSVMAGPSVRVNEWFSAYAMAGVAYSRV 124
+I SF+Y T S T D +N+++ + AGP+ R+N+W S Y + GV Y +
Sbjct: 62 VIGSFTY--------TEKSRTASSGDYNKNQYYGITAGPAYRINDWASIYGVVGVGYGKF 113

Query: 125 STFYGDYLRVTDNKGKTHDVLTGSDDGRHSNTSLAWGAGVQFNPTESVTIDLAYEGSGSG 184
T T+ HD S+ ++GAG+QFNP E+V +D +YE S
Sbjct: 114 QT--------TEYPTYKHD---------TSDYGFSYGAGLQFNPMENVALDFSYEQSRIR 156

Query: 185 DWRSDAFIVGIGYRF 199
+I G+GYRF
Sbjct: 157 SVDVGTWIAGVGYRF 171


37  Z6066  Z2568  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z6066     2       19       1.098513    exclusion protein ren of cryptic prophage
Z6067     2       18       0.883595    hypothetical protein
Z6068     2       19       -0.730135   hypothetical protein
Z6069     2       19       -0.733129   replication protein
Z6070     2       21       -2.184834   hypothetical protein
Z6071     3       24       -3.878309   hypothetical protein
Z6072     2       33       -6.052913   hypothetical protein
Z6073     3       33       -6.385461   repressor protein encoded by cryptic prophage
Z6074     1       23       -3.465228   hypothetical protein
Z6075     1       23       -3.723971   hypothetical protein
Z6076     1       23       -3.783309   hypothetical protein
Z6078     2       24       -4.655387   inhibitor of cell division encoded by cryptic
Z6079     1       22       -3.849095   hypothetical protein
Z6080     1       22       -3.409521   exodeoxyribonuclease encoded by cryptic prophage
Z6081     -1      19       -5.138814   excisionase
Z2566     -2      21       -5.553489   integrase fragment, cryptic prophage CP-933P
Z2568     -3      22       -4.509405   integrase fragment, cryptic prophage CP-933P
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z6078  STREPKINASE  29  0.002  Streptococcus streptokinase protein signature.
		>STREPKINASE#Streptococcus streptokinase protein signature.

Length = 440

Score = 29.3 bits (65), Expect = 0.002
Identities = 12/36 (33%), Positives = 21/36 (58%)

Query: 17 TSPGGTRHRITKFIVEDAIMETLLPNVNTSEGCFEI 52
T G H++ K + AI E L+ NV++++ FE+
Sbjct: 91 TDSGAMSHKLEKADLLKAIQEQLIANVHSNDDYFEV 126


38  Z2741  Z2785  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z2741     4       29       1.055196    integration host factor subunit alpha
Z2742     3       29       0.402182    phenylalanyl-tRNA synthetase subunit beta
Z2743     0       21       -4.582500   phenylalanyl-tRNA synthetase subunit alpha
Z2745     -1      22       -5.540750   50S ribosomal protein L20
Z2746     -2      18       -4.456707   50S ribosomal protein L35
Z2747     -1      19       -4.692936   translation initiation factor IF-3
Z2748     -2      19       -4.128936   threonyl-tRNA synthetase
Z2749     -1      26       -5.486366   hypothetical protein
Z2751     -1      19       -1.386164   hypothetical protein
Z2752     -1      18       -1.149329   6-phosphofructokinase
Z2753     0       16       -1.186369   hypothetical protein
Z2754     -1      18       -3.370016   hypothetical protein
Z2755     -1      18       -3.837580   hypothetical protein
Z2756     -2      15       -2.002918   2-deoxyglucose-6-phosphatase
Z2757     -2      14       -2.161368   hypothetical protein
Z2758     -3      13       -2.720463   hypothetical protein
Z2759     -2      15       -4.599810   hypothetical protein
Z2760     -3      14       -2.800512   cell division modulator
Z2761     -2      12       -2.876020   hydroperoxidase II
Z2763     0       18       -4.465668   hypothetical protein
Z2764     0       17       -4.848211   6-phospho-beta-glucosidase
Z2765     0       17       -4.454546   DNA-binding transcriptional regulator ChbR
Z2766     0       18       -2.353940   PTS system N,N'-diacetylchitobiose-specific
Z2767     1       15       -2.068916   PTS system N,N'-diacetylchitobiose-specific
Z2768     0       15       -1.740145   PTS system N,N'-diacetylchitobiose-specific
Z2769     -1      13       -0.304298   DNA-binding transcriptional activator OsmE
Z2770     0       12       0.811529    NAD synthetase
Z2771     0       12       2.766209    nucleotide excision repair endonuclease
Z2774     0       12       3.442159    hypothetical protein
Z2775     -1      11       3.660280    hypothetical protein
Z2776     -1      12       3.805342    succinylglutamate desuccinylase
Z2777     -1      11       3.054938    succinylarginine dihydrolase
Z2778     0       11       2.125831    succinylglutamic semialdehyde dehydrogenase
Z2779     -1      13       0.606926    arginine succinyltransferase
Z2780     -1      12       0.345395    bifunctional succinylornithine
Z2781     1       15       0.602788    exonuclease III
Z2782     1       15       1.538713    hypothetical protein
Z2783     2       13       2.105747    hypothetical protein
Z2784     1       12       2.836825    hypothetical protein
Z2785     2       14       3.043407    hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z2741  DNABINDINGHU  119  3e-39  Prokaryotic integration host factor signature.
		>DNABINDINGHU#Prokaryotic integration host factor signature.

Length = 91

Score = 119 bits (301), Expect = 3e-39
Identities = 34/89 (38%), Positives = 55/89 (61%)

Query: 4 TKAEMSEYLFDKLGLSKRDAKELVELFFEEIRRALENGEQVKLSGFGNFDLRDKNQRPGR 63
K ++ + + L+K+D+ V+ F + L GE+V+L GFGNF++R++ R GR
Sbjct: 3 NKQDLIAKVAEATELTKKDSAAAVDAVFSAVSSYLAKGEKVQLIGFGNFEVRERAARKGR 62

Query: 64 NPKTGEDIPITARRVVTFRPGQKLKSRVE 92
NP+TGE+I I A +V F+ G+ LK V+
Sbjct: 63 NPQTGEEIKIKASKVPAFKAGKALKDAVK 91


39  Z2803  Z2835  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z2803     -2      20       -4.075680   transporter
Z2804     -2      19       -3.878309   hypothetical protein
Z2806     -2      20       -4.437453   transposase
Z2808     -1      22       -4.968877   DEOR-type transcriptional regulator
Z2809     0       19       -3.864599   hypothetical protein
Z2810     0       20       -4.118577   kinase
Z2811     0       19       -3.686706   aldolase
Z2812     -2      18       -2.922104   oxidoreductase
Z2813     -2      18       -2.312521   transporter
Z2815     -1      21       -1.957278   oxidoreductase
Z2816     0       25       -1.662607   hypothetical protein
Z2817     1       20       -1.317962   methionine sulfoxide reductase B
Z2818     0       18       -1.341177   glyceraldehyde-3-phosphate dehydrogenase
Z2820     -2      11       -3.437317   hypothetical protein
Z2821     -2      12       -4.040838   aldehyde reductase
Z2822     -2      12       -4.211173   hypothetical protein
Z2823     -1      14       -4.765038   hypothetical protein
Z2824     -3      18       -5.275646   hypothetical protein
Z2825     -3      22       -5.474630   hypothetical protein
Z2826     -2      20       -1.660436   hypothetical protein
Z2827     -1      20       0.296624    hypothetical protein
Z2828     0       20       0.420323    hypothetical protein
Z2829     -1      18       -0.271867   hypothetical protein
Z2831     -1      18       -0.982159   AraC family transcriptional regulator
Z2833     -1      18       -1.081076   amino acid/amine transport protein
Z2834     -1      21       -2.509336   hypothetical protein
Z2835     -2      20       -3.894112   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z2803  TCRTETB  39  2e-05  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 39.5 bits (92), Expect = 2e-05
Identities = 30/129 (23%), Positives = 50/129 (38%), Gaps = 1/129 (0%)

Query: 29 ALMFGYFIGSLTGGFIGDYFGRRRAFRINLLIVGIAATGAAFVPDMY-WLIFFRFLMGTG 87
A M + IG+ G + D G +R ++I + + LI RF+ G G
Sbjct: 57 AFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAG 116

Query: 88 MGALIMVGYASFTEFIPATVRGKWSARLSFVGNWSPMLSAAIGVVVIAFFSWRIMFLLGG 147
A + +IP RGK + + + AIG ++ + W + L+
Sbjct: 117 AAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPM 176

Query: 148 IGILLAWFL 156
I I+ FL
Sbjct: 177 ITIITVPFL 185


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z2813  TCRTETB  31  0.011  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 31.0 bits (70), Expect = 0.011
Identities = 33/142 (23%), Positives = 48/142 (33%), Gaps = 23/142 (16%)

Query: 71 MFLGALVGGIIGDKTGRRNAFILYEAIHIASMVVGAFSPNMDF-LIACRFVMGVGLGALL 129
+G V G + D+ G + + I+ V+G + LI RF+ G G A
Sbjct: 62 FSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFP 121

Query: 130 VTLFAGFTEYMPGRNR----GTWSSRVSFIGNWSYPLCSLIAMGLTPLISA----EWNWR 181
+ Y+P NR G S V+ + G+ P I +W
Sbjct: 122 ALVMVVVARYIPKENRGKAFGLIGSIVA------------MGEGVGPAIGGMIAHYIHWS 169

Query: 182 VQLLIPAILSLIATALAWRYFP 203
LLIP I I T
Sbjct: 170 YLLLIPMI--TIITVPFLMKLL 189


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z2820  INVEPROTEIN  29  0.024  Salmonella/Shigella invasion protein E (InvE) signat...
		>INVEPROTEIN#Salmonella/Shigella invasion protein E (InvE) signature.

Length = 372

Score = 28.9 bits (64), Expect = 0.024
Identities = 18/81 (22%), Positives = 34/81 (41%), Gaps = 13/81 (16%)

Query: 165 ETTSALHTYFNVGDIAKVSVSGLGDRFIDKVNDAKED-----------VLTDGIQTFPDR 213
E ++AL + N D K S S L + F ++V + + V ++ F +
Sbjct: 57 EMSAALAQFRNRRDYEKKS-SNLSNSF-ERVLEDEALPKAKQILKLISVHGGALEDFLRQ 114

Query: 214 TDRVYLNPQDCSVINDEALNR 234
++ +P D ++ E L R
Sbjct: 115 ARSLFPDPSDLVLVLRELLRR 135


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z2829  PRTACTNFAMLY  28  0.021  Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 27.7 bits (61), Expect = 0.021
Identities = 18/61 (29%), Positives = 26/61 (42%)

Query: 49 QGLSIGIIILTIGVMAPIASGTLPPSTLIHSFLNWKSLVAIAVGVIVSWLGGRGVTLMGS 108
Q +I L IG + + LPPS ++ N ++ A VS LG +TL G
Sbjct: 174 QRSAIVDGGLHIGALQSLQPEDLPPSRVVLRDTNVTAVPASGAPAAVSVLGASELTLDGG 233

Query: 109 Q 109

Sbjct: 234 H 234


40  Z2959  Z2989  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z2959     0       22       -3.095531   hypothetical protein
Z2960     0       26       -4.371790   ferritin
Z2962     -1      33       -4.835546   hypothetical protein
Z2963     -1      35       -5.068786   tyrosine-specific transporter
Z2964     1       42       -7.083438   hypothetical protein
Z2966     0       45       -8.741531   integrase for prophage CP-933T
Z2967     -1      40       -9.584935   hypothetical protein
Z2968     0       35       -8.084324   hypothetical protein
Z2969     -1      31       -7.158231   hypothetical protein
Z2970     2       29       -5.399252   regulator for prophage CP-933T
Z2971     4       30       -4.270559   hypothetical protein
Z2972     6       28       -3.185810   hypothetical protein
Z2973     1       28       3.023191    hypothetical protein
Z2974     0       27       3.495776    hypothetical protein
Z2975     -1      27       4.138279    hypothetical protein
Z2976     0       25       4.008678    hypothetical protein
Z2977     -1      23       4.039954    hypothetical protein
Z2978     -1      22       3.799100    replication protein for prophage CP-933T
Z2979     1       18       -1.904597   stability/partitioning protein encoded within
Z2980     -1      26       -3.602620   stability/partitioning protein encoded within
Z2981     -1      25       -2.327725   IS629 transposase encoded within prophage
Z2982     -1      21       -0.062201   prophage-associated protein
Z2983     -1      21       -0.150117   tail fiber assembly protein of prophage CP-933T
Z2984     -1      19       -0.211925   serine acetlyltransferase of prophage CP-933T
Z2985     0       18       2.450602    tail fiber protein of prophage CP-933T
Z2986     0       18       2.659597    tail fiber protein of prophage CP-933T
Z2987     1       17       2.736298    tail fiber component of prophage CP-933T
Z2988     2       20       -0.846627   tail fiber protein component of prophage
Z2989     2       19       -2.022095   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z2964  SECA  60  8e-13  SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 60.3 bits (146), Expect = 8e-13
Identities = 27/70 (38%), Positives = 31/70 (44%), Gaps = 5/70 (7%)

Query: 155 RVEKMSPEAFEESVDAIRLAALDLH---AYWMAHPQEKAVQQPI--KAEEKPGRNDPCPC 209
+V+ PE EE R+ A L A E K GRNDPCPC
Sbjct: 828 KVQVRMPEEVEELEQQRRMEAERLAQMQQLSHQDDDSAAAAALAAQTGERKVGRNDPCPC 887

Query: 210 GSGKKFKQCC 219
GSGKK+KQC
Sbjct: 888 GSGKKYKQCH 897


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z2985  PF03944  28  0.016  delta endotoxin
		>PF03944#delta endotoxin

Length = 633

Score = 28.1 bits (62), Expect = 0.016
Identities = 11/31 (35%), Positives = 18/31 (58%)

Query: 74 TQAYTGRPWPLIDGVGQIYGMYVLTGTNTTR 104
TQ++T + WP + + Q+ YVL G + R
Sbjct: 285 TQSFTSQDWPFLYSLFQVNSNYVLNGFSGAR 315


41  Z3018  Z3163  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z3018     0       16       -3.536942   hypothetical protein
Z3019     1       19       -4.637094   hypothetical protein
Z3020     2       36       -10.392134  hypothetical protein
Z3021     2       36       -10.454443  hypothetical protein
Z3022     3       38       -11.486410  hypothetical protein
Z3023     0       26       -7.177059   hypothetical protein
Z3025     0       14       -2.510366   hypothetical protein
Z3026     0       13       -1.182148   hypothetical protein
Z3024     -1      15       4.095863    hypothetical protein
Z3027     0       17       4.356524    flagellar hook-basal body protein FliE
Z3028     0       15       4.227834    flagellar MS-ring protein
Z3029     1       17       4.391925    flagellar motor switch protein G
Z3030     -1      17       3.823043    flagellar assembly protein H
Z3031     -1      19       3.554403    flagellum-specific ATP synthase
Z3032     -1      17       2.427290    flagellar biosynthesis chaperone
Z3033     -1      17       2.438503    flagellar hook-length control protein
Z3034     -3      21       1.940335    flagellar basal body protein FliL
Z3035     -1      17       0.695993    flagellar motor switch protein FliM
Z3036     0       16       -2.247826   flagellar motor switch protein FliN
Z3037     -1      17       -2.926889   flagellar biosynthesis protein FliO
Z3038     -1      20       -3.813146   flagellar biosynthesis protein FliP
Z3039     -1      21       -4.024679   flagellar biosynthesis protein FliQ
Z3040     -2      15       -2.657476   flagellar biosynthesis protein FliR
Z3041     -1      18       -2.270690   positive regulator for ctr capsule biosynthesis,
Z3042     -2      16       0.258330    hypothetical protein
Z3043     -3      16       0.533631    hypothetical protein
Z3045     -2      17       1.100565    mannosyl-3-phosphoglycerate phosphatase
Z3047     0       16       1.079779    hypothetical protein
Z3048     2       18       1.844657    hypothetical protein
Z3049     2       18       1.607839    hypothetical protein
Z3050     0       14       0.384184    hypothetical protein
Z3053     -2      12       -0.990845   DNA mismatch endonuclease, patch repair protein
Z3054     -2      13       -1.950196   DNA cytosine methylase
Z3055     -2      19       -5.717191   hypothetical protein
Z3056     0       25       -7.360513   hypothetical protein
Z3057     -1      28       -7.895774   hypothetical protein
Z3058     -1      23       -6.191904   hypothetical protein
Z3059     -1      26       -5.648835   chaperone protein HchA
Z3060     0       31       -7.123281   2-component sensor protein
Z3061     0       27       -5.720713   transcriptional regulatory protein YedW
Z3062     1       24       -6.106663   hypothetical protein
Z3063     1       25       -6.418558   sulfite oxidase subunit YedY
Z3064     3       38       -11.490717  sulfite oxidase subunit YedZ
Z3065     5       45       -9.809450   hypothetical protein
Z3066     6       42       -8.671960   hypothetical protein
Z3067     4       34       -3.210320   cytochrome
Z3069     2       31       -1.187653   hypothetical protein
Z3071     4       24       2.600784    hypothetical protein
Z3072     6       27       4.710195    hypothetical protein
Z3073     5       31       6.014281    hypothetical protein
Z3074     5       31       6.443585    tail fiber protein of prophage CP-933U
Z3075     5       33       5.962815    prophage protein
Z3077     5       30       6.139833    tail fiber component J of prophage CP-933U
Z3079     4       29       6.410995    tail fiber component I of prophage CP-933U
Z3081     4       28       5.788411    tail fiber component K of prophage CP-933U
Z3082     3       27       5.341490    tail fiber component L of prophage CP-933U
Z3083     4       26       4.551113    tail fiber component M of prophage CP-933U
Z3084     3       26       4.683723    tail fiber component H of prophage CP-933U
Z3085     3       28       3.187908    tail fiber component T of prophage CP-933U
Z3086     4       25       3.264548    tail fiber component G of prophage CP-933U
Z3087     4       25       4.475836    tail fiber component V of prophage CP-933U
Z3088     4       27       4.457400    tail fiber component U of prophage CP-933U
Z3089     4       25       4.633904    tail fiber component Z of prophage CP-933U
Z3090     6       21       3.497385    hypothetical protein
Z3091     5       21       3.328514    hypothetical protein
Z3092     5       22       3.889322    hypothetical protein
Z3093     4       21       3.147185    hypothetical protein
Z3095     4       20       3.257732    transposase encoded within prophage CP-933U
Z3097     3       21       2.583765    peptidase encoded within prophage CP-933U
Z3098     2       21       2.950082    head-tail preconnector protein of prophage
Z3099     1       20       2.266815    DNA packaging protein of prophage CP-933U
Z3100     2       23       -0.391162   hypothetical protein
Z3101     2       23       1.743044    endopeptidase of prophage CP-933U
Z3103     2       25       1.418980    antirepressor protein of prophage CP-933U
Z3104     2       28       2.308634    endolysin of prophage CP-933U
Z3105     1       28       0.650206    hypothetical protein
Z3106     0       28       1.195052    holin protein of prophage CP-933U
Z3107     1       24       1.408852    hypothetical protein
Z3108     0       22       0.063068    hypothetical protein
Z3109     0       23       -0.124010   hypothetical protein
Z3114     -1      24       -0.802239   ***antitermination protein Q
Z3115     1       23       -0.410801   endonuclease encoded within prophage CP-933U
Z3116     2       24       -0.907386   hypothetical protein
Z3117     1       26       -1.888787   hypothetical protein
Z3118     1       27       -0.412758   hypothetical protein
Z3119     2       28       0.543735    hypothetical protein
Z3120     2       26       1.352966    hypothetical protein
Z3121     2       27       0.638060    hypothetical protein
Z3122     1       25       0.675360    hypothetical protein
Z3123     3       26       -0.190167   hypothetical protein
Z3124     2       19       -2.144307   hypothetical protein
Z3125     2       21       -4.443381   hypothetical protein
Z3126     3       25       -5.172571   repressor protein of prophage CP-933U
Z3127     0       17       -2.678522   hypothetical protein
Z3128     0       17       -2.665984   inhibitor of cell division encoded within
Z3129     -1      18       -3.260248   exodeoxyribonuclease VIII
Z3130     -1      19       -3.235057   integrase for prophage CP-933U
Z3132     -1      18       -2.664838   *hypothetical protein
Z3135     -2      18       -2.965527   *invasin
Z3136     -2      27       -5.096731   hypothetical protein
Z3137     -1      28       -4.996302   hypothetical protein
Z3138     -2      26       -3.470281   shikimate transporter
Z3139     -2      27       -3.830337   AMP nucleosidase
Z3140     -1      29       -4.086921   hypothetical protein
Z3143     0       28       -2.179299   *hypothetical protein
Z3144     -1      26       -1.843699   hypothetical protein
Z3146     0       26       -1.168664   *transcriptional regulator Cbl
Z3147     1       26       -0.578041   nitrogen assimilation transcriptional regulator
Z3150     1       22       0.488826    *hypothetical protein
Z3151     1       19       3.085171    nicotinate-nucleotide--dimethylbenzimidazole
Z3152     1       19       0.027299    cobalamin synthase
Z3153     1       21       -0.144482   adenosylcobinamide kinase
Z3154     2       21       0.187838    hypothetical protein
Z3155     2       20       0.261345    hypothetical protein
Z3156     3       19       0.245562    hypothetical protein
Z3159     4       22       -2.123391   outer membrane receptor for iron compound or
Z3161     4       22       1.853613    hypothetical protein
Z3162     5       22       1.418222    IS629 transposase
Z3163     2       23       -0.027584   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z3019  RTXTOXIND  30  0.018  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 30.2 bits (68), Expect = 0.018
Identities = 10/57 (17%), Positives = 17/57 (29%), Gaps = 2/57 (3%)

Query: 164 RFTLLPIFRIPVKMQKVAAASPLTQKPDQARRRF--RLGMLVFFGMLGWALLTAMNQ 218
R L R + + + A L + P R R M ++L +
Sbjct: 26 RKQLDTPVREKDENEFLPAHLELIETPVSRRPRLVAYFIMGFLVIAFILSVLGQVEI 82


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z3020  PF01206  93  6e-29  SirA family protein
		>PF01206#SirA family protein

Length = 76

Score = 92.5 bits (230), Expect = 6e-29
Identities = 16/71 (22%), Positives = 37/71 (52%)

Query: 7 DYRLDMVGEPCPYPAVATLEAMPQLKKGEILEVVSDCPQSINNIPLDARNHGYTVLDIQQ 66
D LD G CP P + + + + GE+L V++ P S+ + ++ G+ +L+ ++
Sbjct: 5 DQSLDATGLNCPLPILKAKKTLATMNAGEVLYVMATDPGSVKDFESFSKQTGHELLEQKE 64

Query: 67 DGPTIRYLIQK 77
+ T + +++
Sbjct: 65 EDGTYHFRLKR 75


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z3022  SACTRNSFRASE  32  4e-04  Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 32.2 bits (73), Expect = 4e-04
Identities = 17/54 (31%), Positives = 27/54 (50%), Gaps = 2/54 (3%)

Query: 80 APNYLRRGVASLILRHILQVAHDRCLHRLSLETGTQAGFTACHQLYLKHGFVDC 133
A +Y ++GV + +L ++ A + L LET +ACH Y KH F+
Sbjct: 98 AKDYRKKGVGTALLHKAIEWAKENHFCGLMLET-QDINISACH-FYAKHHFIIG 149


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z3023  PF07299  28  0.044  Fibronectin-binding protein (FBP)
		>PF07299#Fibronectin-binding protein (FBP)

Length = 219

Score = 27.5 bits (61), Expect = 0.044
Identities = 17/78 (21%), Positives = 30/78 (38%), Gaps = 5/78 (6%)

Query: 1 MFPLNDLSLKTQSVQLNKITSNTESTIKQHELVSDDAIINELSSEKSQGEGTLPIRHKLE 60
M+ + + +S Q N I S H +D +I L S + I H E
Sbjct: 1 MYGVIKMEAFIRSDQYNFIKSQAYILANGHATANDRGVIQALKSLAIE-----KIIHVFE 55

Query: 61 FISTNIAELLDKLTKITD 78
++ EL+D + + +
Sbjct: 56 NLTDEQKELIDTVLTVQN 73


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z3025  SACTRNSFRASE  32  1e-04  Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 32.2 bits (73), Expect = 1e-04
Identities = 17/54 (31%), Positives = 27/54 (50%), Gaps = 2/54 (3%)

Query: 20 APNYLRRGVASLILRHILQVAHDRCLHRLSLETGTQAGFTACHQLYLKHGFVDC 73
A +Y ++GV + +L ++ A + L LET +ACH Y KH F+
Sbjct: 98 AKDYRKKGVGTALLHKAIEWAKENHFCGLMLET-QDINISACH-FYAKHHFIIG 149


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z3027  FLGHOOKFLIE  117  5e-38  Flagellar hook-basal body complex protein FliE signa...
		>FLGHOOKFLIE#Flagellar hook-basal body complex protein FliE signature.

Length = 103

Score = 117 bits (294), Expect = 5e-38
Identities = 103/103 (100%), Positives = 103/103 (100%)

Query: 2 SAIQGIEGVISQLQATAMSARAQESLPQPTISFAGQLHAALDRISDTQTAARTQAEKFTL 61
SAIQGIEGVISQLQATAMSARAQESLPQPTISFAGQLHAALDRISDTQTAARTQAEKFTL
Sbjct: 1 SAIQGIEGVISQLQATAMSARAQESLPQPTISFAGQLHAALDRISDTQTAARTQAEKFTL 60

Query: 62 GEPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV 104
GEPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV
Sbjct: 61 GEPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV 103


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z3028  FLGMRINGFLIF  752  0.0  Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 752 bits (1943), Expect = 0.0
Identities = 478/555 (86%), Positives = 515/555 (92%), Gaps = 5/555 (0%)

Query: 3 ATAAQTKSLEWLNRLRANPKIPLIVAGSAAVAVMVALILWAKAPDYRTLFSNLSDQDGGA 62
+TA Q K LEWLNRLRANP+IPLIVAGSAAVA++VA++LWAK PDYRTLFSNLSDQDGGA
Sbjct: 5 STATQPKPLEWLNRLRANPRIPLIVAGSAAVAIVVAMVLWAKTPDYRTLFSNLSDQDGGA 64

Query: 63 IVSQLTQMNIPYRFSEASGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQ 122
IV+QLTQMNIPYRF+ SGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQ
Sbjct: 65 IVAQLTQMNIPYRFANGSGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQ 124

Query: 123 FSEQVNYQRALEGELSRTIETIGPVKGARVHLAMPKPSLFVREQKSPSASVTVNLLPGRA 182
FSEQVNYQRALEGEL+RTIET+GPVK ARVHLAMPKPSLFVREQKSPSASVTV L PGRA
Sbjct: 125 FSEQVNYQRALEGELARTIETLGPVKSARVHLAMPKPSLFVREQKSPSASVTVTLEPGRA 184

Query: 183 LDEGQISAIVHLVSSAVAGLPPGNVTLVDQGGHLLTQSNTSWRDLNDAQLKYASDVEGRI 242
LDEGQISA+VHLVSSAVAGLPPGNVTLVDQ GHLLTQSNTS RDLNDAQLK+A+DVE RI
Sbjct: 185 LDEGQISAVVHLVSSAVAGLPPGNVTLVDQSGHLLTQSNTSGRDLNDAQLKFANDVESRI 244

Query: 243 QRRIEAILSPIVGNGNIHAQVTAQLDFASKEQTEEQYRPNGDESQAALRSRQLNESEQSG 302
QRRIEAILSPIVGNGN+HAQVTAQLDFA+KEQTEE Y PNGD S+A LRSRQLN SEQ G
Sbjct: 245 QRRIEAILSPIVGNGNVHAQVTAQLDFANKEQTEEHYSPNGDASKATLRSRQLNISEQVG 304

Query: 303 SGYPGGVPGALSNQPAPANNAPISTPPANQNNRQQ--QASTTSNS---GPRSTQRNETSN 357
+GYPGGVPGALSNQPAP N API+TPP NQ N Q Q ST++NS GPRSTQRNETSN
Sbjct: 305 AGYPGGVPGALSNQPAPPNEAPIATPPTNQQNAQNTPQTSTSTNSNSAGPRSTQRNETSN 364

Query: 358 YEVDRTIRHTKMNVGDVQRLSVAVVVNYKTLPDGKPLPLSNEQMKQIEDLTREAMGFSEK 417
YEVDRTIRHTKMNVGD++RLSVAVVVNYKTL DGKPLPL+ +QMKQIEDLTREAMGFS+K
Sbjct: 365 YEVDRTIRHTKMNVGDIERLSVAVVVNYKTLADGKPLPLTADQMKQIEDLTREAMGFSDK 424

Query: 418 RGDSLNVVNSPFNSSDESGGELPFWQQQAFIDQLLAAGRWLLVLLVAWLLWRKAVRPQLT 477
RGD+LNVVNSPF++ D +GGELPFWQQQ+FIDQLLAAGRWLLVL+VAW+LWRKAVRPQLT
Sbjct: 425 RGDTLNVVNSPFSAVDNTGGELPFWQQQSFIDQLLAAGRWLLVLVVAWILWRKAVRPQLT 484

Query: 478 RRAEAMKAVQQQAQAREEVEDAVEVRLSKDEQLQQRRANQRLGAEVMSQRIREMSDNDPR 537
RR E KA Q+QAQ R+E E+AVEVRLSKDEQLQQRRANQRLGAEVMSQRIREMSDNDPR
Sbjct: 485 RRVEEAKAAQEQAQVRQETEEAVEVRLSKDEQLQQRRANQRLGAEVMSQRIREMSDNDPR 544

Query: 538 VVALVIRQWINNDHE 552
VVALVIRQW++NDHE
Sbjct: 545 VVALVIRQWMSNDHE 559


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z3029  FLGMOTORFLIG  262  3e-89  Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 262 bits (672), Expect = 3e-89
Identities = 105/328 (32%), Positives = 168/328 (51%), Gaps = 58/328 (17%)

Query: 1 MSNLTGTDKSVILLMTIGEDRAAEVFKHLSQREVQTLSAAMANVTASGIETLN------- 53
+S LTG K+ ILL++IG + +++VFK+LSQ E+++L+ +A + E +
Sbjct: 12 VSALTGKQKAAILLVSIGSEISSKVFKYLSQEEIESLTFEIAKLETITSELKDNVLLEFK 71

Query: 54 -----------------------FMEPQSAADLI-------------------------- 64
+ Q A D+I
Sbjct: 72 ELMMAQEFIQKGGIDYARELLEKSLGTQKAVDIINNLGSALQSRPFEFVRRADPANILNF 131

Query: 65 -RDEHPQIIATILVHLKRAQAADILALFDERLRHDVMLRIATFGGVQPAALAELTEVLNG 123
+ EHPQ IA IL +L +A+ IL+ ++ +V RIA P + E+ VL
Sbjct: 132 IQQEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEVVREVERVLEK 191

Query: 124 LLDGQ-NLKRSKMGGVRTAAEIINLMKTQQEEAVITAVREFDGELAQKIIDEMFLFENLV 182
L + + GGV EIIN+ + E+ +I ++ E D ELA++I +MF+FE++V
Sbjct: 192 KLASLSSEDYTSAGGVDNVVEIINMADRKTEKFIIESLEEEDPELAEEIKKKMFVFEDIV 251

Query: 183 DVDDRSIQRLLQEVDSESLLIALKGAEQPLREKFLRNMSQRAADILRDDLANRGPVRLSQ 242
+DDRSIQR+L+E+D + L ALK + P++EK +NMS+RAA +L++D+ GP R
Sbjct: 252 LLDDRSIQRVLREIDGQELAKALKSVDIPVQEKIFKNMSKRAASMLKEDMEFLGPTRRKD 311

Query: 243 VENEQKAILLIVRRLAETGEMVIGSGED 270
VE Q+ I+ ++R+L E GE+VI G +
Sbjct: 312 VEESQQKIVSLIRKLEEQGEIVISRGGE 339


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z3030  FLGFLIH  374  e-135  Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 374 bits (961), Expect = e-135
Identities = 226/228 (99%), Positives = 227/228 (99%)

Query: 1 MSDNLPWKTWTPDDLAPPQAEFVPMVESEETIIEEAEPSLEQQLAQLQMQAHEQGYQAGI 60
MSDNLPWKTWTPDDLAPPQAEFVP+VE EETIIEEAEPSLEQQLAQLQMQAHEQGYQAGI
Sbjct: 1 MSDNLPWKTWTPDDLAPPQAEFVPIVEPEETIIEEAEPSLEQQLAQLQMQAHEQGYQAGI 60

Query: 61 AEGRQQGHKQGYQEGLAQGLEQGLAEAKSQQAPIHARMQQLVSEFQTTLDALDSVIASRL 120
AEGRQQGHKQGYQEGLAQGLEQGLAEAKSQQAPIHARMQQLVSEFQTTLDALDSVIASRL
Sbjct: 61 AEGRQQGHKQGYQEGLAQGLEQGLAEAKSQQAPIHARMQQLVSEFQTTLDALDSVIASRL 120

Query: 121 MQMALEAARQVIGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGAT 180
MQMALEAARQVIGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGAT
Sbjct: 121 MQMALEAARQVIGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGAT 180

Query: 181 LSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATRWQELCRLAAPGVV 228
LSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATRWQELCRLAAPGVV
Sbjct: 181 LSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATRWQELCRLAAPGVV 228


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z3032  FLGFLIJ  202  2e-70  Flagellar FliJ protein signature.
		>FLGFLIJ#Flagellar FliJ protein signature.

Length = 147

Score = 202 bits (515), Expect = 2e-70
Identities = 146/147 (99%), Positives = 147/147 (100%)

Query: 1 MAEHGALATLKDLAEKEVEDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRNNLNSDMSAG 60
MAEHGALATLKDLAEKEVEDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRNNLNSDMSAG
Sbjct: 1 MAEHGALATLKDLAEKEVEDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRNNLNSDMSAG 60

Query: 61 MTSNRWINYQQFIQTLEKAITQHRQQLNQWTQKVDIALNSWREKKQRLQAWQTLQERQST 120
+TSNRWINYQQFIQTLEKAITQHRQQLNQWTQKVDIALNSWREKKQRLQAWQTLQERQST
Sbjct: 61 ITSNRWINYQQFIQTLEKAITQHRQQLNQWTQKVDIALNSWREKKQRLQAWQTLQERQST 120

Query: 121 AALLAENRLDQKKMDEFAQRAAMRKPE 147
AALLAENRLDQKKMDEFAQRAAMRKPE
Sbjct: 121 AALLAENRLDQKKMDEFAQRAAMRKPE 147


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z3033  FLGHOOKFLIK  409  e-145  Flagellar hook-length control protein signature.
		>FLGHOOKFLIK#Flagellar hook-length control protein signature.

Length = 375

Score = 409 bits (1051), Expect = e-145
Identities = 329/334 (98%), Positives = 329/334 (98%)

Query: 1 MIRLAPLITADVDTTTLPGGKASDAAQDFLALLSEALAGETTTDKAAPQLLVATDKPTTK 60
MIRLAPLITADVDTTTLPGGKASDAAQDFLALLSEALAGETTTDKAAPQLLVATDKPTTK
Sbjct: 1 MIRLAPLITADVDTTTLPGGKASDAAQDFLALLSEALAGETTTDKAAPQLLVATDKPTTK 60

Query: 61 GEPLISDIVSDAQQANLLIPVDETPPVINDEQSTSTPLTTAQTMALAAVADKNTTKDEKA 120
GEPLISDIVSDAQQANLLIPVDETPPVINDEQSTSTPLTTAQTMALAAVADKNTTKDEKA
Sbjct: 61 GEPLISDIVSDAQQANLLIPVDETPPVINDEQSTSTPLTTAQTMALAAVADKNTTKDEKA 120

Query: 121 DDLNEDVTASLSALFAMLPGFDNTPKVTDVPSTVLPAEKPTLFTKLTSAQLTTAQPDDAP 180
DDLNEDVTASLSALFAMLPGFDNTPKVTD PSTVLP EKPTLFTKLTS QLTTAQPDDAP
Sbjct: 121 DDLNEDVTASLSALFAMLPGFDNTPKVTDAPSTVLPTEKPTLFTKLTSEQLTTAQPDDAP 180

Query: 181 GTPAQPLTPQVAEAQSKAEVISTPSPVTAAASPLITPHQTQPLPTVAAPVLSAPLGSHEW 240
GTPAQPLTP VAEAQSKAEVISTPSPVTAAASPLITPHQTQPLPTVAAPVLSAPLGSHEW
Sbjct: 181 GTPAQPLTPLVAEAQSKAEVISTPSPVTAAASPLITPHQTQPLPTVAAPVLSAPLGSHEW 240

Query: 241 QQSLSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQMVSSHQHVRAALEAA 300
QQSLSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQMVS HQHVRAALEAA
Sbjct: 241 QQSLSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQMVSPHQHVRAALEAA 300

Query: 301 LPVLRTQLAESGIQLGQSNISGESFSGQQQAASQ 334
LPVLRTQLAESGIQLGQSNISGESFSGQQQAASQ
Sbjct: 301 LPVLRTQLAESGIQLGQSNISGESFSGQQQAASQ 334


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z3035  FLGMOTORFLIM  253  1e-85  Flagellar motor switch protein FliM signature.
		>FLGMOTORFLIM#Flagellar motor switch protein FliM signature.

Length = 344

Score = 253 bits (648), Expect = 1e-85
Identities = 68/324 (20%), Positives = 119/324 (36%), Gaps = 68/324 (20%)

Query: 5 ILSQAEIDALLNGDS--EVKDEPTASVSGES----------------------------- 33
+LSQ EID LL S + E +S
Sbjct: 4 VLSQDEIDQLLTAISSGDASIEDARPISDTRKITLYDFRRPDKFSKEQMRTLSLMHETFA 63

Query: 34 ----------------------DIRPYD------PN-TQRNLIHLKPLRGTGLVVFSPSL 64
D Y+ P + +I + PL+G ++ PS+
Sbjct: 64 RLTTTSLSAQLRSMVHVHVASVDQLTYEEFIRSIPTPSTLAVITMDPLKGNAVLEVDPSI 123

Query: 65 VFIAVDNLFGGDGRFPTKVEGREFTHTEQRVINRMLKLALEGYSDAWKAINPLEVEYVRS 124
F +D LFGG G+ KV+ R+ T E V+ ++ L ++W + L +
Sbjct: 124 TFSIIDRLFGGTGQ-AAKVQ-RDLTDIENSVMEGVIVRILANVRESWTQVIDLRPRLGQI 181

Query: 125 EMQVKFTNITTSPNDIVVNTPFHVEIGNLTGEFNICLPFSMIEPLRELLVNPPLENS--R 182
E +F I P+++VV ++G G N C+P+ IEP+ L + +S R
Sbjct: 182 ETNPQFAQI-VPPSEMVVLVTLETKVGEEEGMMNFCIPYITIEPIISKLSSQFWFSSVRR 240

Query: 183 NEDQNWRDNLVRQVQHSQLELVANFADISLRLSQILKLKPGDVLPIEKP---DRIIAHVD 239
+ + L ++ +++VA + L + IL L+ GD++ + D + +
Sbjct: 241 SSTTQYMGVLRDKLSTVDMDVVAEVGSLRLSVRDILGLRVGDIIRLHDTHVGDPFVLSIG 300

Query: 240 GVPVLTSQYGTLNGQYALRIEHLI 263
Q G + + A +I I
Sbjct: 301 NRKKFLCQPGVVGKKIAAQILERI 324


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z3036  FLGMOTORFLIN  212  1e-74  Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 212 bits (542), Expect = 1e-74
Identities = 125/137 (91%), Positives = 134/137 (97%)

Query: 1 MSDMNNPADDNNGAMDDLWAEALSEQKSTSSKSAADAVFQQFGGGDVSGTLQDIDLIMDI 60
MSDMNNP+D+N GA+DDLWA+AL+EQK+T++KSAADAVFQQ GGGDVSG +QDIDLIMDI
Sbjct: 1 MSDMNNPSDENTGALDDLWADALNEQKATTTKSAADAVFQQLGGGDVSGAMQDIDLIMDI 60

Query: 61 PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV 120
PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV
Sbjct: 61 PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV 120

Query: 121 RITDIITPSERMRRLSR 137
RITDIITPSERMRRLSR
Sbjct: 121 RITDIITPSERMRRLSR 137


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z3038  FLGBIOSNFLIP  334  e-119  Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP signature.

Length = 245

Score = 334 bits (858), Expect = e-119
Identities = 245/245 (100%), Positives = 245/245 (100%)

Query: 1 MRRLLSVAPVLLWLITPLAFAQLPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILLM 60
MRRLLSVAPVLLWLITPLAFAQLPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILLM
Sbjct: 1 MRRLLSVAPVLLWLITPLAFAQLPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILLM 60

Query: 61 MTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEK 120
MTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEK
Sbjct: 61 MTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEK 120

Query: 121 ISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELK 180
ISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELK
Sbjct: 121 ISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELK 180

Query: 181 TAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLA 240
TAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLA
Sbjct: 181 TAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLA 240

Query: 241 QSFYS 245
QSFYS
Sbjct: 241 QSFYS 245


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z3039  TYPE3IMQPROT  67  1e-18  Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.

Length = 86

Score = 67.1 bits (164), Expect = 1e-18
Identities = 22/78 (28%), Positives = 42/78 (53%)

Query: 4 ESVMMMGTEAMKVALALAAPLLLVALVTGLIISILQAATQINEMTLSFIPKIIAVFIAII 63
+ ++ G +A+ + L L+ +VA + GL++ + Q TQ+ E TL F K++ V + +
Sbjct: 2 DDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLF 61

Query: 64 IAGPWMLNLLLDYVRTLF 81
+ W +LL Y R +
Sbjct: 62 LLSGWYGEVLLSYGRQVI 79


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z3040  TYPE3IMRPROT  203  3e-67  Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 203 bits (518), Expect = 3e-67
Identities = 260/261 (99%), Positives = 261/261 (100%)

Query: 1 MLQVTSEQWLSWLSLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA 60
MLQVTSEQWLSWL+LYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA
Sbjct: 1 MLQVTSEQWLSWLNLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA 60

Query: 61 NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHL 120
NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHL
Sbjct: 61 NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHL 120

Query: 121 NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIF 180
NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIF
Sbjct: 121 NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIF 180

Query: 181 LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC 240
LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC
Sbjct: 181 LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC 240

Query: 241 EHLFSEIFNLLADIISELPLI 261
EHLFSEIFNLLADIISELPLI
Sbjct: 241 EHLFSEIFNLLADIISELPLI 261


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z3054  PF05272  29  0.045  Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.045
Identities = 20/62 (32%), Positives = 29/62 (46%), Gaps = 15/62 (24%)

Query: 320 AKYILTPVLWKYLYRYAKKHQARGNGFGYGMVYPNNPQSVTRTLSARYYKDGAEILIDRG 379
A+Y + PVLW Y+ R+ K + G+ VY +R +DG+E RG
Sbjct: 166 ARYQVGPVLWGYVVRFIK---SDGDKLTLPYVY------------SRSQRDGSEAWKWRG 210

Query: 380 WD 381
WD
Sbjct: 211 WD 212


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z3055  CARBMTKINASE  35  2e-04  Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 34.8 bits (80), Expect = 2e-04
Identities = 22/92 (23%), Positives = 36/92 (39%), Gaps = 9/92 (9%)

Query: 37 AQKLAADDDVDMLVILTACYFHDIVSLAKNHPQRQSSSILAAEETRRLLREEFVQFPA-- 94
+KLA + + D+ +ILT + +L + Q + EE R+ E F A
Sbjct: 219 GEKLAEEVNADIFMILTDV---NGAALYYGTEKEQWLREVKVEELRKYYEEG--HFKAGS 273

Query: 95 --EKIEAVCHAIAAHSFSAQIAPLTTEAKIVQ 124
K+ A I A IA L + ++
Sbjct: 274 MGPKVLAAIRFIEWGGERAIIAHLEKAVEALE 305


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z3057  ECOLIPORIN  307  e-107  E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 307 bits (789), Expect = e-107
Identities = 143/204 (70%), Positives = 160/204 (78%), Gaps = 2/204 (0%)

Query: 11 MKRKVLAMLVPALLVAGAANAAEIYNKDGNKLDLYGKVAGLHYFSDDASSDGDMSYARIG 70
MKRKVLA+++PALL AGAA+AAEIYNKDGNKLDLYGKV GLHYFSDD+S DGD +Y R+G
Sbjct: 1 MKRKVLALVIPALLAAGAAHAAEIYNKDGNKLDLYGKVDGLHYFSDDSSKDGDQTYMRVG 60

Query: 71 FKGETQIADQFTGYGQWEFNIGANGPESDKGNTATRLAFAGFGFGQNGTFDYGRNYGVVY 130
FKGETQI DQ TGYGQWE+N+ AN E + N+ TRLAFAG FG G+FDYGRNYGV+Y
Sbjct: 61 FKGETQINDQLTGYGQWEYNVQANTTEGEGANSWTRLAFAGLKFGDYGSFDYGRNYGVLY 120

Query: 131 DVEAWTDMLPEFGGDTYAGADNFMNGRANSVATYRNNGFFGQVDGLNFALQYQGNNEKSG 190
DVE WTDMLPEFGGD+Y ADN+M GRAN VATYRN FFG VDGLNFALQYQG NE
Sbjct: 121 DVEGWTDMLPEFGGDSYTYADNYMTGRANGVATYRNTDFFGLVDGLNFALQYQGKNESQS 180

Query: 191 LFDQEGSGNG--NGRKLAKENGDG 212
D N NG + +NGDG
Sbjct: 181 ADDVNIGTNNRNNGDDIRYDNGDG 204


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z3058  ECOLIPORIN  143  8e-45  E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 143 bits (363), Expect = 8e-45
Identities = 85/187 (45%), Positives = 100/187 (53%), Gaps = 64/187 (34%)

Query: 1 MSTSYDFDFGLSLGAAYSNSDRTDNQVHKGTHNTRYGDRFDATAGGETAEAWTVGAKYDA 60
+ST+YD G S GAAY+ SDRT+ QV+ G AGG+ A+AWT G KYDA
Sbjct: 207 ISTTYDIGMGFSAGAAYTTSDRTNEQVNAGGTI----------AGGDKADAWTAGLKYDA 256

Query: 61 NNVYLAAMYAEPRNMTGYGDADA-----IANKTQNFEVVAQYQFDFG------------- 102
NN+YLA MY+E RNMT YG D +ANKTQNFEV AQYQFDFG
Sbjct: 257 NNIYLATMYSETRNMTPYGKTDKGYDGGVANKTQNFEVTAQYQFDFGLRPAVSFLMSKGK 316

Query: 103 ------------------------------------KINLLDNDDDFYKENGIATDDIVA 126
KINLLD+DD FYK+ GI+TDDIVA
Sbjct: 317 DLTYNNVNGDDKDLVKYADVGATYYFNKNFSTYVDYKINLLDDDDPFYKDAGISTDDIVA 376

Query: 127 VGLVYQF 133
+G+VYQF
Sbjct: 377 LGMVYQF 383


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z3060  PF06580  32  0.005  Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 31.8 bits (72), Expect = 0.005
Identities = 35/181 (19%), Positives = 61/181 (33%), Gaps = 37/181 (20%)

Query: 290 ENILFLARADKNNVLVKLDSLS----------------LNKEVENLLDYL--EYLSDEKE 331
NI L D L SLS L E+ + YL + E
Sbjct: 180 NNIRALILEDPTKAREMLTSLSELMRYSLRYSNARQVSLADELTVVDSYLQLASIQFEDR 239

Query: 332 ICFKVECNQQIFADKI---LLQRMLSNLIVNAIRYSPEKSRIHITSFLDTNSYLNIDIAS 388
+ F+ + N I ++ L+Q ++ N I + I P+ +I + D N + +++ +
Sbjct: 240 LQFENQINPAIMDVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKD-NGTVTLEVEN 298

Query: 389 PGAKINEPEKLFRRFWRGDNSRHSVGQGLGLSLVKA-IAELHGGSATYHYLNKHNVFRIT 447
G+ + K G GL V+ + L+G A K
Sbjct: 299 TGSLALKNTKE--------------STGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAM 344

Query: 448 L 448
+
Sbjct: 345 V 345


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z3061  HTHFIS  82  2e-20  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 82.2 bits (203), Expect = 2e-20
Identities = 30/117 (25%), Positives = 60/117 (51%), Gaps = 1/117 (0%)

Query: 2 KILLIEDNQRTQEWVTQGLSEAGYVIDAVSDGRDGLYLALKDDYALIILDIMLPGMDGWQ 61
IL+ +D+ + + Q LS AGY + S+ D L++ D+++P + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 ILQTLRTA-KQTPVICLTARDSVDDRVRGLDSGANDYLVKPFSFSELLARVRAQLRQ 117
+L ++ A PV+ ++A+++ ++ + GA DYL KPF +EL+ + L +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z3074     CHANLCOLICIN  30     0.014    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 30.4 bits (68), Expect = 0.014
Identities = 31/118 (26%), Positives = 50/118 (42%), Gaps = 10/118 (8%)

Query: 132 SARNAGISASQAEESAANADTSAGDASESARQAAESAAAAKQSEEASSSSASAAAQKASE 191
S G S++E SAA T+ ++ + AE AA AK A+A AQ ++
Sbjct: 34 SGGGGGKGGSKSESSAAIHATAKWSTAQLKKTQAEQAARAK---------AAAEAQAKAK 84

Query: 192 SSQSAADAELSKKTAESAAGNAARDATTATEKARESAESAQSAEQSRIA-AEEAVNRI 248
+++ A L E+ NA+R + +A E+ R+A AEE +
Sbjct: 85 ANRDALTQRLKDIVNEALRHNASRTPSATELAHANNAAMQAEDERLRLAKAEEKARKE 142



Score = 30.0 bits (67), Expect = 0.020
Identities = 31/147 (21%), Positives = 51/147 (34%), Gaps = 6/147 (4%)

Query: 103 EALRRFELMVEEAARHAEEAKKNAGEAETSARNAGISASQAEESAANADTSAGDASESAR 162
E R+ E+A + AE+ +K E + + ++AEE A + A E
Sbjct: 137 EKARKEAEAAEKAFQEAEQRRKEIER-EKAETERQLKLAEAEEKRLAALSEEAKAVE--- 192

Query: 163 QAAESAAAAKQSEEASSSSASAAAQKASESSQSAADAELSKKTAESAAGNAARDATTATE 222
A+ +A QSE SS A DAE+ + A +
Sbjct: 193 -IAQKKLSAAQSEVVKMDGEIKTLNSRLSSSIHARDAEMKTLAGKRNELAQASAKYKELD 251

Query: 223 KARES-AESAQSAEQSRIAAEEAVNRI 248
+ + + A Q+R E R+
Sbjct: 252 ELVKKLSPRANDPLQNRPFFEATRRRV 278



Score = 29.3 bits (65), Expect = 0.039
Identities = 28/162 (17%), Positives = 58/162 (35%), Gaps = 18/162 (11%)

Query: 103 EALRRFELMVEEAARHAEEAKKNAGEA---------ETSARNAGISAS----QAEESA-A 148
EA + + E A+ E A+K A E N+ +S+S AE A
Sbjct: 175 EAEEKRLAALSEEAKAVEIAQKKLSAAQSEVVKMDGEIKTLNSRLSSSIHARDAEMKTLA 234

Query: 149 NADTSAGDASESARQ--AAESAAAAKQSEEASSSSASAAAQKASESSQSAADAELSKKTA 206
AS ++ + + ++ + A ++ + + + + +
Sbjct: 235 GKRNELAQASAKYKELDELVKKLSPRANDPLQNRPFFEATRRRVGAGKIREEKQKQVTAS 294

Query: 207 ESAAGNAARDATTATEKARESAESAQSAEQSRIA-AEEAVNR 247
E+ N T +KA + ++A +R+ AEE + +
Sbjct: 295 ETRI-NRINADITQIQKAISQVSNNRNAGIARVHEAEENLKK 335


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z3075     ENTEROVIROMP  138    4e-44    Enterobacterial virulence outer membrane protein si...
		>ENTEROVIROMP#Enterobacterial virulence outer membrane protein signature.

Length = 171

Score = 138 bits (350), Expect = 4e-44
Identities = 61/200 (30%), Positives = 98/200 (49%), Gaps = 30/200 (15%)

Query: 1 MRKLYAAILSAAICLAVSGAPAWASEQQATLSAGYLHARTSAPGSDNLNGINVKYRYEFT 60
M+K+ AA+ +G + +T++ GY + + + G N+KYRYE
Sbjct: 1 MKKIACLSALAAVLAFTAGT---SVAATSTVTGGYAQSDAQGQMNK-MGGFNLKYRYEED 56

Query: 61 DT-LGLVTSFSYAGDKNRQLTRYSDTRWHEDSVRNRWFSVMAGPSVRVNEWFSAYAMAGV 119
++ LG++ SF+Y T S T D +N+++ + AGP+ R+N+W S Y + GV
Sbjct: 57 NSPLGVIGSFTY--------TEKSRTASSGDYNKNQYYGITAGPAYRINDWASIYGVVGV 108

Query: 120 AYSRVSTFSGDYLRVTDNKGKTHDVLTGSDDGRHSNTSLAWGAGVQFNPTESVAIDIAYE 179
Y + T T+ HD S+ ++GAG+QFNP E+VA+D +YE
Sbjct: 109 GYGKFQT--------TEYPTYKHD---------TSDYGFSYGAGLQFNPMENVALDFSYE 151

Query: 180 GSGSGDWRTDGFIVGVGYKF 199
S +I GVGY+F
Sbjct: 152 QSRIRSVDVGTWIAGVGYRF 171


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
Z3084     LCRVANTIGEN  34     0.002    Low calcium response V antigen signature.
		>LCRVANTIGEN#Low calcium response V antigen signature.

Length = 326

Score = 34.3 bits (78), Expect = 0.002
Identities = 25/101 (24%), Positives = 46/101 (45%), Gaps = 4/101 (3%)

Query: 532 DLWKAESQYAVL-KEAATKRQLSEQEKSLLAHKDETLEYKRQLAELG---DKVEYQKRLN 587
+++KA ++Y +L K T Q+ EK +++ KD ++ LG + Y K N
Sbjct: 205 EIFKASAEYKILEKMPQTTIQVDGSEKKIVSIKDFLGSENKRTGALGNLKNSYSYNKDNN 264

Query: 588 ELAQQAVRFEEQQSAKQAAISAKARGLTDRQAQRESEAQRL 628
EL+ A ++ +S K L+D ++ S + L
Sbjct: 265 ELSHFATTCSDKSRPLNDLVSQKTTQLSDITSRFNSAIEAL 305


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z3087     INTIMIN  31     0.006    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 30.8 bits (69), Expect = 0.006
Identities = 23/119 (19%), Positives = 44/119 (36%), Gaps = 17/119 (14%)

Query: 134 KEVITRTVKVTNVGKPSVAEERSEITPATAIKVTP-------------TSGTVAKGKTTT 180
++ IT TVKV KP +E + T + + TS T K +
Sbjct: 675 QDAITYTVKVMKGDKPVSNQEVTFTTTLGKLSNSTEKTDTNGYAKVTLTSTTPGKSLVSA 734

Query: 181 LT--VSFEPESATDKTFRAVSADPSKATI--SVKDMTITVNGVATGKVQIPVVSGNGQF 235
V+ + ++ + F ++ D I + + + G+V + GNG++
Sbjct: 735 RVSDVAVDVKAPEVEFFTTLTIDDGNIEIVGTGVKGKLPTVWLQYGQVNLKASGGNGKY 793


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z3135     INTIMIN  702    0.0      Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 702 bits (1813), Expect = 0.0
Identities = 221/790 (27%), Positives = 353/790 (44%), Gaps = 70/790 (8%)

Query: 139 QQIASTSQQIGSLLAEDMNSEQAANMARGWASSQASGAMTDWLSRFGTARITLGVDEDFS 198
QQ AS Q+ S +N + A + A G A +QAS + WL +GTA + L +F
Sbjct: 168 QQAASLGSQLQS---RSLNGDYAKDTALGIAGNQASSQLQAWLQHYGTAEVNLQSGNNFD 224

Query: 199 LKNSQFDFLHPWYETPDNLFFSQHTLHRTDERTQINNGLGWRHFTPTWMSGINFFFDHDL 258
S DFL P+Y++ L F Q D R N G G R F P M G N F D D
Sbjct: 225 --GSSLDFLLPFYDSEKMLAFGQVGARYIDSRFTANLGAGQRFFLPENMLGYNVFIDQDF 282

Query: 259 SRYHSRAGIGAEYWRDYLKLSSNGYLRLTNWRSAPELDNDYEARPANGWDVRAEGWLPAW 318
S ++R GIG EYWRDY K S NGY R++ W + DY+ RPANG+D+R G+LP++
Sbjct: 283 SGDNTRLGIGGEYWRDYFKSSVNGYFRMSGWHESYN-KKDYDERPANGFDIRFNGYLPSY 341

Query: 319 PHLGGKLVYEQYYGDEVALFDKDDRQSNPHAITAGLNYTPFPLMTFSAEQRQGKQGENDT 378
P LG KL+YEQYYGD VALF+ D QSNP A T G+NYTP PL+T + R G END
Sbjct: 342 PALGAKLMYEQYYGDNVALFNSDKLQSNPGAATVGVNYTPIPLVTMGIDYRHGTGNENDL 401

Query: 379 RFAVDFTWQPGSAMQKQLDPNEVDARRSLAGSRFDLVDRNNNIVLEYRKKELVRLTLTDP 438
+++ F +Q +Q++P V+ R+L+GSR+DLV RNNNI+LEY+K++++ L +
Sbjct: 402 LYSMQFRYQFDKPWSQQIEPQYVNELRTLSGSRYDLVQRNNNIILEYKKQDILSLNIPHD 461

Query: 439 VTGKSGEVKSLVSSLQTKYALKGYNVEATALEAAGGKVVTTG----KDILVTLPAYRFTS 494
+ G + + +++KY L + +AL + GG++ +G +D LPAY
Sbjct: 462 INGTERSTQKIQLIVKSKYGLDRIVWDDSALRSQGGQIQHSGSQSAQDYQAILPAY---- 517

Query: 495 TPETDNTWPIEVTAEDVKGNFSNREQ-SMVVVQAPTLSQKDSSVSLSSQTLSADSHSTAT 553
N + + A D GN SN ++ V+ + + ++ SA + T
Sbjct: 518 VQGGSNVYKVTARAYDRNGNSSNNVLLTITVLSNGQVVDQVGVTDFTADKTSAKADGTEA 577

Query: 554 LTFIAH------DAAGNPVIGLVLSTRHEGVQDITLSDWKDNGDGSYTQILTTGAMSGTL 607
+T+ A A PV ++S G ++ + NG G T L + +
Sbjct: 578 ITYTATVKKNGVAQANVPVSFNIVS----GTAVLSANSANTNGSGKATVTLKSDKPGQVV 633

Query: 608 TLMPQLNGVDAAKAPAVVNIISVSSSRTHSSIKIDKDRYLSGNPIEVTVELR-DENDKPV 666
A A AV I + + + IK DK ++ +T ++ + DKPV
Sbjct: 634 VSAKTAEMTSALNANAV--IFVDQTKASITEIKADKTTAVANGQDAITYTVKVMKGDKPV 691

Query: 667 KEQKQQLNTAVSIDNVKPGVTTDWKETADGVYKATYTAYTKGSGL-TAKLLMQNWNEDLH 725
Q+ T + K +T+ K +G K T T+ T G L +A++ +
Sbjct: 692 SNQEVTFTTTLG----KLSNSTE-KTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAP 746

Query: 726 TAGFIIDANPQSAKIATLSASNNGVLANENAANTVSVNVADEGSNPINDHTVTFAVLSGS 785
F I + G L + + ++
Sbjct: 747 EVEFFTTLTIDDGNIEIVGTGVKGKLPTV---------------------WLQYGQVNLK 785

Query: 786 ATSFNNQNTAKTDVNGLATFDLKSSK---QEDNTVEVTLENGVKQTLIVSFVGDSSTAQV 842
A+ N + T ++ +A+ D S + +E T +++ + QT ++ + + +
Sbjct: 786 ASGGNGKYTWRSANPAIASVDASSGQVTLKEKGTTTISVISSDNQT--ATYTIATPNSLI 843

Query: 843 DLQKSKNEVVADGNDSATMTATVRDAKGNLLNDVKVTF----------NVNSAAAKLSQT 892
SK D ++ + N L +V + + + + + QT
Sbjct: 844 VPNMSKRVTYNDAVNTCKNFGGKLPSSQNELENVFKAWGAANKYEYYKSSQTIISWVQQT 903

Query: 893 EVNSHDGIAT 902
++ G+A+
Sbjct: 904 AQDAKSGVAS 913



Score = 155 bits (393), Expect = 5e-40
Identities = 96/389 (24%), Positives = 157/389 (40%), Gaps = 44/389 (11%)

Query: 1117 TAEAILLNGNRD-----TKIVNIAPDASNAQVTLNIPAQQV--VTNNSDSVQLTATVK-- 1167
TA A NGN T V + + A + + ++++ TATVK
Sbjct: 528 TARAYDRNGNSSNNVLLTITVLSNGQVVDQVGVTDFTADKTSAKADGTEAITYTATVKKN 587

Query: 1168 --DPSNHPVAGITVNFTMPQDVAANFTLENNGIAITQANGEAHVTLKGKKAGTHTVTA-T 1224
+N PV+ V+ T ++AN A T +G+A VTLK K G V+A T
Sbjct: 588 GVAQANVPVSFNIVSGT--AVLSAN-------SANTNGSGKATVTLKSDKPGQVVVSAKT 638

Query: 1225 LGNNNASDAQPVTFVADKDSAVVVLQTSKAEIIGNGVDETTLTATVKDPFDNAVKDLQVT 1284
+A +A V FV +++ ++ K + NG D T T V D V + +VT
Sbjct: 639 AEMTSALNANAVIFVDQTKASITEIKADKTTAVANGQDAITYTVKVMKG-DKPVSNQEVT 697

Query: 1285 FSTNPADTQLSQSKSNTNDSGVAEVTFKGTVLGVHTAEATLPNGNNDTKIVNIAPDASNA 1344
F+T +LS S T+ +G A+VT T G A + + D K +
Sbjct: 698 FTTTLG--KLSNSTEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVE---FFT 752

Query: 1345 QVTLNIPAQQVVTNNSDSVQLTATVK-DPSNHPVAGITVNFTMPQDVAANFTLENNGIAI 1403
+T++ ++V T ++ N +G +T + N IA
Sbjct: 753 TLTIDDGNIEIVGTGVKGKLPTVWLQYGQVNLKASGGNGKYT--------WRSANPAIAS 804

Query: 1404 TQANGEAHVTLKGKKAGTHTVTATLSNNNTSDSQPVTFVADKTSALVVLQISKNEITGNG 1463
A+ VTLK K GT T++ +SD+Q T+ ++L+V +SK +
Sbjct: 805 VDAS-SGQVTLKEK--GTTTISVI-----SSDNQTATYTIATPNSLIVPNMSKRVTYNDA 856

Query: 1464 VDSATLTATVKDQFDNEVNNLPVTFSTAS 1492
V++ NE+ N+ + A+
Sbjct: 857 VNTCKNFGGKLPSSQNELENVFKAWGAAN 885



Score = 129 bits (325), Expect = 7e-32
Identities = 83/397 (20%), Positives = 151/397 (38%), Gaps = 36/397 (9%)

Query: 914 TVTASVSSGSQANQQVIFIGDQSTAALTLSVPSGDITVT-------NTAPLHMTATLQDK 966
T A +G+ +N ++ I S + V D T T + TAT++ K
Sbjct: 528 TARAYDRNGNSSNNVLLTITVLSNGQVVDQVGVTDFTADKTSAKADGTEAITYTATVK-K 586

Query: 967 NGNPLKDKEITFSVPNDVASRFSISNSGKGMTDSNGTAIASLTGTLAGTHMITARLANSN 1026
NG + ++F++ + A + S + T+ +G A +L G +++A+ A
Sbjct: 587 NGVAQANVPVSFNIVSGTAVLSANSAN----TNGSGKATVTLKSDKPGQVVVSAKTAEMT 642

Query: 1027 VS-DTQPMTFVADKDRAVVVLQTSKAEIIGNGVDETTLTATVKDPFDNVVKNLSVVFRTS 1085
+ + + FV ++ ++ K + NG D T T V D V N V F T+
Sbjct: 643 SALNANAVIFVDQTKASITEIKADKTTAVANGQDAITYTVKVMKG-DKPVSNQEVTFTTT 701

Query: 1086 PADTQLSLNARNTNENGIAEVTLKGTVLGVHTAEAILLNGNRDTKIVNIAPDASNAQVTL 1145
+LS + T+ NG A+VTL T G A + + D K + +T+
Sbjct: 702 LG--KLSNSTEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVE---FFTTLTI 756

Query: 1146 NIPAQQVVTNNSDSVQLTATVK-DPSNHPVAGITVNFTMPQDVAANFTLENNGIAITQAN 1204
+ ++V T ++ N +G +T + N IA A+
Sbjct: 757 DDGNIEIVGTGVKGKLPTVWLQYGQVNLKASGGNGKYT--------WRSANPAIASVDAS 808

Query: 1205 GEAHVTLKGKKAGTHTVTATLGNNNASDAQPVTFVADKDSAVVVLQTSKAEIIGNGVDET 1264
VTLK K GT T++ +SD Q T+ ++++V SK + V+
Sbjct: 809 -SGQVTLKEK--GTTTISVI-----SSDNQTATYTIATPNSLIVPNMSKRVTYNDAVNTC 860

Query: 1265 TLTATVKDPFDNAVKDLQVTFSTNPADTQLSQSKSNT 1301
N ++++ + S++
Sbjct: 861 KNFGGKLPSSQNELENVFKAWGAANKYEYYKSSQTII 897



Score = 120 bits (301), Expect = 5e-29
Identities = 80/367 (21%), Positives = 145/367 (39%), Gaps = 29/367 (7%)

Query: 821 LENGVKQTLIVSFVGDSSTAQ--VDLQKSKNEVVADGNDSATMTATVRDAKGNLLNDVKV 878
N V T+ V G D K ADG ++ T TATV+ N V V
Sbjct: 538 SSNNVLLTITVLSNGQVVDQVGVTDFTADKTSAKADGTEAITYTATVKKNGVAQAN-VPV 596

Query: 879 TFNVNSAAAKLSQTEVNSH-DGIATATLTSLKNGDYTVTASVSSGSQA-NQQVIFIGDQS 936
+FN+ S A LS N++ G AT TL S K G V+A + + A N + DQ+
Sbjct: 597 SFNIVSGTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAKTAEMTSALNANAVIFVDQT 656

Query: 937 TAALTLSVPSGDITVTNTAPLHMTATLQ-DKNGNPLKDKEITFSVPNDVASRFSISNSGK 995
A++T + + T +T T++ K P+ ++E+TF+ + ++
Sbjct: 657 KASIT-EIKADKTTAVANGQDAITYTVKVMKGDKPVSNQEVTFT------TTLGKLSNST 709

Query: 996 GMTDSNGTAIASLTGTLAGTHMITARLANSNVSDTQPMTFVADKDRAVVVLQTSKAEIIG 1055
TD+NG A +LT T G +++AR+++ V P + + + EI+G
Sbjct: 710 EKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEV----EFFTTLTIDDGNIEIVG 765

Query: 1056 NGVDETTLTATVK-DPFDNVVKNLSVVFRTSPADTQLSLNARNTNENGIAEVTLKGTVLG 1114
GV T ++ + + + A+ ++ ++ +VTLK G
Sbjct: 766 TGVKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPAIASVDASS-----GQVTLKEK--G 818

Query: 1115 VHTAEAILLNGNRDTKIVNIAPDASNAQVTLNIPAQQVVTNNSDSVQLTATVKDPSNHPV 1174
T I + D + N+ + N+ + + ++ + S + +
Sbjct: 819 TTTISVI----SSDNQTATYTIATPNSLIVPNMSKRVTYNDAVNTCKNFGGKLPSSQNEL 874

Query: 1175 AGITVNF 1181
+ +
Sbjct: 875 ENVFKAW 881



Score = 85.1 bits (210), Expect = 2e-18
Identities = 80/380 (21%), Positives = 130/380 (34%), Gaps = 41/380 (10%)

Query: 1415 KGKKAGTHTVTAT-LSNNNTSDSQPVT-FVADKTSALVVLQISKNEITGNGVDSATLTAT 1472
G + +T T LSN D VT F ADKTSA +G ++ T TAT
Sbjct: 535 NGNSSNNVLLTITVLSNGQVVDQVGVTDFTADKTSAK-----------ADGTEAITYTAT 583

Query: 1473 VKDQFDNEVNNLPVTFSTASSGLTLTPGESNTNESGIAQATLAGVAFGEQTVTASLANNG 1532
VK + N PV+F+ S L+ +NTN SG A TL G+ V+A A
Sbjct: 584 VKKNGVAQANV-PVSFNIVSGTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAKTAEMT 642

Query: 1533 ASDNKTVHFIGDTAAAKIIELTPVPDSIIAGTPQNSSGSVITATV-VDNNGFPVKGVTVN 1591
++ N D A I E+ + +A IT TV V PV V
Sbjct: 643 SALNANAVIFVDQTKASITEIKADKTTAVANGQ-----DAITYTVKVMKGDKPVSNQEVT 697

Query: 1592 FTSNAAT---AEMTNGGQAVTNDTVSAGDTTNLYIEVK-DNYGNGVPQQEVTLSVSPSEG 1647
FT+ + T+++ + + + V EV + +
Sbjct: 698 FTTTLGKLSNSTEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFFTTLT-- 755

Query: 1648 VTPSNNAIYTTNHDGNF-YASFADTEGNAIANSEVTFTLPEDVRANFTLGDGGKVVTDTE 1706
+ N I T G + N A+ + + + + D
Sbjct: 756 IDDGNIEIVGTGVKGKLPTVWLQYGQVNLKASG-------GNGKYTWRSANPAIASVDAS 808

Query: 1707 GKAKVTLKGTKAGAHTVTASMAGGKSEQLVVNFIADTLTAQVNLNVTEDNFIANNVGMTR 1766
+VTLK G T++ + ++ + T + + N+++ + V +
Sbjct: 809 -SGQVTLKEK--GTTTISVISSDNQT----ATYTIATPNSLIVPNMSKRVTYNDAVNTCK 861

Query: 1767 LQATVTDGNGNPLANEAVTF 1786
+ N L N +
Sbjct: 862 NFGGKLPSSQNELENVFKAW 881



Score = 76.3 bits (187), Expect = 1e-15
Identities = 91/469 (19%), Positives = 174/469 (37%), Gaps = 44/469 (9%)

Query: 2094 SGGKVRTNSSGQA--------PVVLTSNKVGTYTVTASFHNGVT----IQTQTIVKVTGN 2141
GG+++ + S A V + V T A NG + + T T++
Sbjct: 495 QGGQIQHSGSQSAQDYQAILPAYVQGGSNVYKVTARAYDRNGNSSNNVLLTITVLSNGQV 554

Query: 2142 SSTAHVASFIADPSTIAATNSDLSTLKATVEDGSGNLIEGLTVYFALKSGSATLTSLTAV 2201
V F AD ++ A ++ T ATV+ +G + V F + SG+A L++ +A
Sbjct: 555 VDQVGVTDFTADKTSAKADGTEAITYTATVKK-NGVAQANVPVSFNIVSGTAVLSANSAN 613

Query: 2202 TDQNGIATTSVRGAITGSVTVSAVTTAGGMQTVDITLVAGPAD--ASQSVLKNNRSSLKG 2259
T+ +G AT +++ G V VSA TA ++ V AS + +K ++++
Sbjct: 614 TNGSGKATVTLKSDKPGQVVVSA-KTAEMTSALNANAVIFVDQTKASITEIKADKTTAVA 672

Query: 2260 DFTDSAELHLVLHDISGNPIKVSEGLEFVQSGTNAPYVQVSAIDYSKNFSGEYKATVTGG 2319
+ D + + + G+ ++ + F + +S + +G K T+T
Sbjct: 673 NGQD--AITYTVKVMKGDKPVSNQEVTFTTTLGK-----LSNSTEKTDTNGYAKVTLTST 725

Query: 2320 GEGIATLIPVLNGVHQAGLSTTIQFTRAEDKIMSGTVLVNGANLPTTTFPSQGFTGAYYQ 2379
G + + ++ V + ++F I G + + G + P+
Sbjct: 726 TPGKSLVSARVSDVAVDVKAPEVEF-FTTLTIDDGNIEIVGTGV-KGKLPTVWLQYGQVN 783

Query: 2380 LNNDNFAPGKTAADYEFSSSASWVDVDATGKVTFKNVGSKWERITATPKTGGPSYIYEIR 2439
L + G + ++ A ++G+VT K G+ I+ + Y I
Sbjct: 784 L---KASGGNGKYTWRSANPAIASVDASSGQVTLKEKGT--TTISVISSDNQTA-TYTIA 837

Query: 2440 VKSWWVNAG-DAFMIYSLAENFCSSNGYTLPLGDHLNHSRSRGIGSLYSEWGDMGHYTTE 2498
+ + + Y+ A N C + G LP + + +++ WG Y
Sbjct: 838 TPNSLIVPNMSKRVTYNDAVNTCKNFGGKLPSSQNE-------LENVFKAWGAANKYEYY 890

Query: 2499 AGFHSNMYW---SSSPANSNEQYVVSLATGDQSVFEKLGF--AYATCYK 2542
+ + W ++ A S L + K AYATC K
Sbjct: 891 KSSQTIISWVQQTAQDAKSGVASTYDLVKQNPLNNIKASESNAYATCVK 939



Score = 59.7 bits (144), Expect = 2e-10
Identities = 49/199 (24%), Positives = 82/199 (41%), Gaps = 5/199 (2%)

Query: 1748 VNLNVTEDNFIANNVGMTRLQATVTDGNGNPLANEAVTFTLPADVSASFTLGQGGSAITD 1807
+ + + A+ ATV NG AN V+F + VS + L SA T+
Sbjct: 561 TDFTADKTSAKADGTEAITYTATVKK-NGVAQANVPVSFNI---VSGTAVLSAN-SANTN 615

Query: 1808 INGKAEVTLSGTKSGTYPVTVSVNNYGVSDTKQVTLIADAGTAKLASLTSVYSFVVSTTE 1867
+GKA VTL K G V+ + + D A + + + + V+ +
Sbjct: 616 GSGKATVTLKSDKPGQVVVSAKTAEMTSALNANAVIFVDQTKASITEIKADKTTAVANGQ 675

Query: 1868 GATMTASVTDANGNPVEGIKVNFRGTSVTLSSTSVETDDRGFAEILVTSTEVGLKTVSAS 1927
A PV +V F T LS+++ +TD G+A++ +TST G VSA
Sbjct: 676 DAITYTVKVMKGDKPVSNQEVTFTTTLGKLSNSTEKTDTNGYAKVTLTSTTPGKSLVSAR 735

Query: 1928 LADKPTEVISRLLNAKADI 1946
++D +V + + +
Sbjct: 736 VSDVAVDVKAPEVEFFTTL 754



Score = 57.8 bits (139), Expect = 6e-10
Identities = 51/197 (25%), Positives = 76/197 (38%), Gaps = 12/197 (6%)

Query: 2029 LANGSSYEKDLVVIDQKLTLSASSPLIGVNSPTGATLTATLTSANGTPVEGQVINFSVTP 2088
L+NG ++ V SA + + T TAT+ NG ++F++
Sbjct: 549 LSNGQVVDQVGVTDFTADKTSAKA-----DGTEAITYTATVKK-NGVAQANVPVSFNIVS 602

Query: 2089 EGATLSGGKVRTNSSGQAPVVLTSNKVGTYTVTASFHNGV-TIQTQTIVKVTGNSSTAHV 2147
A LS TN SG+A V L S+K G V+A + ++ V + + A +
Sbjct: 603 GTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAKTAEMTSALNANAVIFV--DQTKASI 660

Query: 2148 ASFIADPSTIAATNSDLSTLKATVEDGSGNLIEGLTVYFALKSGSATLTSLTAVTDQNGI 2207
AD +T A D T V G + V F G + + T TD NG
Sbjct: 661 TEIKADKTTAVANGQDAITYTVKVMKG-DKPVSNQEVTFTTTLGKLSNS--TEKTDTNGY 717

Query: 2208 ATTSVRGAITGSVTVSA 2224
A ++ G VSA
Sbjct: 718 AKVTLTSTTPGKSLVSA 734



Score = 46.2 bits (109), Expect = 2e-06
Identities = 53/287 (18%), Positives = 90/287 (31%), Gaps = 36/287 (12%)

Query: 1563 GTPQNSSGSVITATVVDNNGFPVKGVTVNFTSNAATA------EMTNGG----QAVTNDT 1612
+ + ++ V NG V V+F + TA TNG + +D
Sbjct: 569 SAKADGTEAITYTATVKKNGVAQANVPVSFNIVSGTAVLSANSANTNGSGKATVTLKSDK 628

Query: 1613 ----VSAGDTTNLYIEVKDNYGNGVPQQEVTLS-VSPSEGVTPSNNAIYTTNHDGNFYAS 1667
V + T + + N V Q + +++ + + +N T Y
Sbjct: 629 PGQVVVSAKTAEMTSALNANAVIFVDQTKASITEIKADKTTAVANGQDAIT------YTV 682

Query: 1668 FADTEGNAIANSEVTFTLPEDVRANFTLGDGGKVVTDTEGKAKVTLKGTKAGAHTVTASM 1727
++N EVTFT TDT G AKVTL T G V+A +
Sbjct: 683 KVMKGDKPVSNQEVTFT------TTLGKLSNSTEKTDTNGYAKVTLTSTTPGKSLVSARV 736

Query: 1728 AGGKSE--QLVVNFIADTLTAQVNLNVTEDNFIANNVGMTRLQATVTDGNGN-PLANEAV 1784
+ + V F N+ + + V + G N +
Sbjct: 737 SDVAVDVKAPEVEFFTTLTIDDGNIEI-----VGTGVKGKLPTVWLQYGQVNLKASGGNG 791

Query: 1785 TFTLPADVSASFTLGQGGSAITDINGKAEVTLSGTKSGTYPVTVSVN 1831
+T + A ++ S + K T+S S T ++
Sbjct: 792 KYTWRSANPAIASV-DASSGQVTLKEKGTTTISVISSDNQTATYTIA 837


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z3138     TCRTETB  29     0.026    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 29.1 bits (65), Expect = 0.026
Identities = 22/150 (14%), Positives = 53/150 (35%), Gaps = 17/150 (11%)

Query: 79 LGGVIFGHFGDRLGRKRMLMLTVWMMGIATALIGILPSFSTIGWWAPILLVTLRAIQGFA 138
+G ++G D+LG KR+L+ + + + + + SF ++ I+ ++ A
Sbjct: 64 IGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSL----LIMARFIQGAGAAA 119

Query: 139 VGGEWGGAALLSVESAPKNKK-AFYSSGVQVGYGVGLLLSTGLVSLISMMTTDEQFLSWG 197
+ + K S V +G GVG + + I
Sbjct: 120 FPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYI------------H 167

Query: 198 WRIPFLFSIVLVLGALWVRNGMEESAEFEQ 227
W L ++ ++ ++ +++ +
Sbjct: 168 WSYLLLIPMITIITVPFLMKLLKKEVRIKG 197


42  Z3181  Z3213  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z3181     -1      23       3.746782    ATP phosphoribosyltransferase
Z3182     0       23       3.439580    histidinol dehydrogenase
Z3183     -1      26       2.651294    histidinol-phosphate aminotransferase
Z3184     -2      19       0.773476    imidazole glycerol-phosphate
Z3185     -3      18       -1.527975   imidazole glycerol phosphate synthase subunit
Z3186     -2      17       -1.943551   1-(5-phosphoribosyl)-5-[(5-
Z3187     -1      17       -4.288426   imidazole glycerol phosphate synthase subunit
Z3188     -2      14       -3.381276   bifunctional phosphoribosyl-AMP
Z3189     -1      20       -5.978556   regulator of length of O-antigen component of
Z3190     -1      22       -6.641954   UDP-glucose 6-dehydrogenase
Z3191     -1      26       -7.330298   6-phosphogluconate dehydrogenase
Z3192     0       39       -9.727911   acetyl transferase; O-antigen biosynthesis
Z3194     0       40       -11.120283  phosphomannomutase
Z3195     4       56       -15.032138  mannose-1-phosphate guanylyltransferase
Z3196     4       58       -16.433041  GDP-mannose mannosyl hydrolase
Z3197     5       59       -17.245750  fucose synthetase
Z3198     7       60       -18.623079  GDP-mannose dehydratase
Z3199     7       60       -20.080807  glycosyl transferase
Z3200     4       51       -17.261793  perosamine synthetase
Z3201     2       44       -15.512698  O antigen flippase Wzx
Z3202     0       31       -11.830690  glycosyl transferase
Z3203     -2      21       -7.907613   O antigen polymerase
Z3204     -3      16       -3.399945   glycosyl transferase
Z3205     -2      15       -0.244949   UTP-glucose-1-phosphate uridylyltransferase
Z3206     -1      17       0.350203    UDP-galactose 4-epimerase
Z3207     -1      21       1.659116    colanic acid biosynthesis protein
Z3208     0       25       3.133630    colanic acid biosynthesis glycosyl transferase
Z3209     0       24       3.260185    pyruvyl transferase
Z3210     0       25       3.435652    colanic acid exporter
Z3211     -1      24       3.825505    UDP-glucose lipid carrier transferase
Z3212     -1      25       3.901704    phosphomannomutase
Z3213     -1      22       3.320149    mannose-1-phosphate guanylyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z3197     NUCEPIMERASE  94     5e-24    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 93.7 bits (233), Expect = 5e-24
Identities = 64/347 (18%), Positives = 129/347 (37%), Gaps = 53/347 (15%)

Query: 5 RIFIAGHQGMVGSAITRRLKQRD-------------DVEL------VLRTRD----ELNL 41
+ + G G +G +++RL + DV L +L +++L
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 42 LDSSAVLDFFSSQKIDQVYLAAAKVGGILANSSYPADFIYENIMIEANVIHAAHKNNVNK 101
D + D F+S ++V+++ + + + P + N+ N++ N +
Sbjct: 62 ADREGMTDLFASGHFERVFISPHR-LAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQH 120

Query: 102 LLFLGSSCIYPKLAHQPIMEDELLQGKLEPTNEP---YAIAKIAGIKLCESYNRQFGRDY 158
LL+ SS +Y P D + + P YA K A + +Y+ +G
Sbjct: 121 LLYASSSSVYGLNRKMPFSTD-------DSVDHPVSLYAATKKANELMAHTYSHLYGLPA 173

Query: 159 RSVMPTNLYGPNDNFHPSNSHVIPALLRRFHDAVENNSPNVVVWGSGTPKREFLHVDDMA 218
+ +YGP P L +F A+ + V+ G KR+F ++DD+A
Sbjct: 174 TGLRFFTVYGPWGR--PD------MALFKFTKAMLEGKS-IDVYNYGKMKRDFTYIDDIA 224

Query: 219 SASIYVMEMPYDIWQKNTK---------VMLSHINIGTGIDCTICELAETIAKVVGYKGH 269
A I + ++ + T NIG + + + + +G +
Sbjct: 225 EAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAK 284

Query: 270 ITFDTTKPDGAPRKLLDVTLLHQ-LGWNHKITLHKGLENTYNWFLEN 315
+P D L++ +G+ + T+ G++N NW+ +
Sbjct: 285 KNMLPLQPGDVLETSADTKALYEVIGFTPETTVKDGVKNFVNWYRDF 331


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z3198     NUCEPIMERASE  105    8e-28    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 105 bits (263), Expect = 8e-28
Identities = 73/353 (20%), Positives = 121/353 (34%), Gaps = 42/353 (11%)

Query: 6 LITGVTGQDGSYLAEFLLDKGYEVHGIKRRASSFNTERIDHIYQDPH--------GSNPN 57
L+TG G G ++++ LL+ G++V GI + N Y D + P
Sbjct: 4 LVTGAAGFIGFHVSKRLLEAGHQVVGI----DNLND------YYDVSLKQARLELLAQPG 53

Query: 58 FHLHYGDLTDSSNLTRILKEVQPDEVYNLAAMSHVAVSFESPEYTADVDAIGTLRLLEAI 117
F H DL D +T + + V+ V S E+P AD + G L +LE
Sbjct: 54 FQFHKIDLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGC 113

Query: 118 RFLGLENKTRFYQASTSELYGLVQEIPQKESTPF-YPRSPYAVAKLYAYWITVNYRESYG 176
R ++ AS+S +YGL +++P +P S YA K + Y YG
Sbjct: 114 RHNKIQ---HLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYG 170

Query: 177 IYACNGILFNHESPRRGETFVTRKITRGLANIAQGLESCLYLGNMDSLRDWGHAKDYVRM 236
+ A F P K T+ + +G +Y RD+ + D
Sbjct: 171 LPATGLRFFTVYGPWGRPDMALFKFTK---AMLEGKSIDVY-NYGKMKRDFTYIDDIAEA 226

Query: 237 QWLMLQQEQPEDFVIATGVQYSVRQFVEMAAAQLGIKMSFVGKGIEEKGIVDSVEGQDAP 296
+ D +G +E + ++E DA
Sbjct: 227 IIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIG-----NSSPVELMDYIQALE--DAL 279

Query: 297 GVKPGDVIVAVDPRY--FRPAEVDTLLGDPSKANLKLGWRPEITLAEMISEMV 347
G++ +P +V D +G+ PE T+ + + V
Sbjct: 280 GIE-------AKKNMLPLQPGDVLETSADTKALYEVIGFTPETTVKDGVKNFV 325


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z3206     NUCEPIMERASE  94     5e-24    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 93.7 bits (233), Expect = 5e-24
Identities = 70/334 (20%), Positives = 126/334 (37%), Gaps = 62/334 (18%)

Query: 4 NVLLIGASGFVGT----RLLE----------------TAIADFNIKNLDKQQSHFYPEIT 43
L+ GA+GF+G RLLE ++ ++ L + F+
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHK--- 58

Query: 44 QIGDVRDQQALDQALA--GFDTVVLLAAEH--RDDVSPTSLYYDVNVQGTRNVLAAMEKN 99
D+ D++ + A F+ V + R + Y D N+ G N+L N
Sbjct: 59 --IDLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHN 116

Query: 100 GVKNIIFTSSVAVYGLNKHNP-DENHPHD-PFNHYGKSKWQAEEVLREWYNKA---PTER 154
++++++ SS +VYGLN+ P + D P + Y +K +A E++ Y+ P
Sbjct: 117 KIQHLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATK-KANELMAHTYSHLYGLPA-- 173

Query: 155 SLTIIRPTVIFGERNRGN--VYNLLKQIAGGKFMMV-GAGTNYKSMAYVGNIVEFIKYKL 211
T +R ++G R + ++ K + GK + V G + Y+ +I E I
Sbjct: 174 --TGLRFFTVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIRLQ 231

Query: 212 KNVA-----------------AGYEVYNYVDKPDLNMNQLVAEVEQSLNKKIPSMHLPYP 254
+ A Y VYN + + + + +E +L + LP
Sbjct: 232 DVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAKKNMLPLQ 291

Query: 255 LGMLGGYCFDI--LSKITGKKYAVS-SVRVKKFC 285
G + D L ++ G + VK F
Sbjct: 292 PGDVLETSADTKALYEVIGFTPETTVKDGVKNFV 325


43  Z3240  Z3254  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z3240     -3      17       3.803153    hypothetical protein
Z3241     -3      18       4.240856    hypothetical protein
Z3242     -3      18       4.215009    hypothetical protein
Z3243     -2      18       4.143109    multidrug efflux system subunit MdtA
Z3244     -2      19       3.929689    multidrug efflux system subunit MdtB
Z3245     -2      18       3.308149    multidrug efflux system subunit MdtC
Z3246     -2      12       1.376206    multidrug efflux system protein MdtE
Z3247     -2      9        0.146932    signal transduction histidine-protein kinase
Z3248     -1      10       -1.377797   DNA-binding transcriptional regulator BaeR
Z3249     0       13       -2.341801   hypothetical protein
Z3250     2       13       -1.770880   hypothetical protein
Z3251     3       21       -3.460143   hypothetical protein
Z3252     4       21       -3.259963   lipid kinase
Z3253     3       20       -3.323982   galactitol utilization operon repressor
Z3254     3       19       -2.211235   galactitol-1-phosphate dehydrogenase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
Z3243     RTXTOXIND  33     0.002    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 32.9 bits (75), Expect = 0.002
Identities = 25/108 (23%), Positives = 47/108 (43%), Gaps = 5/108 (4%)

Query: 125 EASVASAQLQLDWSRITAPVDGRV-GLKQVDVGNQISSGDTTGIVVITQTHPIDLLFTLP 183
+A + + S I APV +V LK G +++ +T +V++ + +++ +
Sbjct: 315 TLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETL-MVIVPEDDTLEVTALVQ 373

Query: 184 ESDIATVVQAQKAGKPLVVEAWDRTNSKKL-SEGTLLSLDNQIDATTG 230
DI + Q A + VEA+ T L + ++LD D G
Sbjct: 374 NKDIGFINVGQNA--IIKVEAFPYTRYGYLVGKVKNINLDAIEDQRLG 419



Score = 28.6 bits (64), Expect = 0.038
Identities = 8/53 (15%), Positives = 16/53 (30%), Gaps = 5/53 (9%)

Query: 4 SYKSRWVIVIVVVIAAIAAFWFWQGRNDSQSAAPG-----ATKQAQQSPAGGR 51
S + R V ++ IA G+ + + A G + +
Sbjct: 54 SRRPRLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSI 106


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z3244     ACRIFLAVINRP  917    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 917 bits (2372), Expect = 0.0
Identities = 298/1036 (28%), Positives = 513/1036 (49%), Gaps = 29/1036 (2%)

Query: 13 SRLFIMRPVATTLLMVAILLAGIIGYRALPVSALPEVDYPTIQVVTLYPGASPDVMTSAV 72
+ FI RP+ +L + +++AG + LPV+ P + P + V YPGA + V
Sbjct: 2 ANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTV 61

Query: 73 TAPLERQFGQMSGLKQMSSQS-SGGASVITLQFQLTLPLDVAEQEVQAAINAATNLLPSD 131
T +E+ + L MSS S S G+ ITL FQ D+A+ +VQ + AT LLP +
Sbjct: 62 TQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQE 121

Query: 132 LPNPPVYSKVNPADPPIMTLAVTSTAMPMTQVE--DMVETRVAQKISQISGVGLVTLSGG 189
+ + S + +M S TQ + D V + V +S+++GVG V L G
Sbjct: 122 VQQQGI-SVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 190 QRPAVRVKLNAQAIAALGLTSETVRTAITGANVNSAKGSLDGP------SRAVTLSANDQ 243
Q A+R+ L+A + LT V + N A G L G ++ A +
Sbjct: 181 QY-AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 244 MQSAEEYRQLII-AYQNGAPIRLGDVATVEQGAENSWLGAWANKEQAIVMNVQRQPGANI 302
++ EE+ ++ + +G+ +RL DVA VE G EN + A N + A + ++ GAN
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANA 299

Query: 303 ISTADSIRQMLPQLTESLPKSVKVTVLSDRTTNIRASVDDTQFELMMAIALVVMIIYLFL 362
+ TA +I+ L +L P+ +KV D T ++ S+ + L AI LV +++YLFL
Sbjct: 300 LDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFL 359

Query: 363 RNIPATIIPGVAVPLSLIGTFAVMVFLDFSINNLTLMALTIATGFVVDDAIVVIENISRY 422
+N+ AT+IP +AVP+ L+GTFA++ +SIN LT+ + +A G +VDDAIVV+EN+ R
Sbjct: 360 QNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERV 419

Query: 423 I-EKGEKPLAAALKGAGEIGFTIISLTFSLIAVLIPLLFMGDIVGRLFREFAITLAVAIL 481
+ E P A K +I ++ + L AV IP+ F G G ++R+F+IT+ A+
Sbjct: 420 MMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMA 479

Query: 482 ISAVVSLTLTPMMCARML---SQESLRKQNRFSRASEKMFDRIIAAYGRGLAKVLNHPWL 538
+S +V+L LTP +CA +L S E + F FD + Y + K+L
Sbjct: 480 LSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 539 TLSVALSTLLLSVLLWVFIPKGFFPVQDNGIIQGTLQAPQSSSFANMAQRQRQVADVILQ 598
L + + V+L++ +P F P +D G+ +Q P ++ + QV D L+
Sbjct: 540 YLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLK 599

Query: 599 DPA--VQSLTSFVGVDGTNPSLNSARLQINLKPLDERDDR---VQKVIARLQTAVDKVPG 653
+ V+S+ + G + + N+ ++LKP +ER+ + VI R + + K+
Sbjct: 600 NEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIR- 658

Query: 654 VDLFLQPTQDLTIDTQVSRTQYQFTLQ---ATSLDALSTWVPQLMEKLQQLP-QISDVSS 709
D F+ P I + T + F L DAL+ QL+ Q P + V
Sbjct: 659 -DGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRP 717

Query: 710 DWQDKGLVAYVNVDRDSASRLGISMADVDNALYNAFGQRLISTIYTQANQYRVVLEHNTE 769
+ + + VD++ A LG+S++D++ + A G ++ + ++ ++ + +
Sbjct: 718 NGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAK 777

Query: 770 NTPGLAALDTIRLTSSDGGVVPLSSIAKIEQRFAPLSINHLDQFPVTTISFNVPDNYSLG 829
+D + + S++G +VP S+ + + + P I S G
Sbjct: 778 FRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSG 837

Query: 830 DAVQAIMDTEKTLNLPVDITTQFQGSTLAFQSALGSTVWLIVAAVVAMYIVLGILYESFI 889
DA A+M+ + LP I + G + + + L+ + V +++ L LYES+
Sbjct: 838 DA-MALMENLAS-KLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWS 895

Query: 890 HPITILSTLPTAGVGALLALMIAGSELDVIAIIGIILLIGIVKKNAIMMIDFALAAEREQ 949
P++++ +P VG LLA + + DV ++G++ IG+ KNAI++++FA ++
Sbjct: 896 IPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKE 955

Query: 950 GMSPRDAIYQACLLRFRPILMTTLAALLGALPLMLSTGVGAELRRPLGIGMVGGLIVSQV 1009
G +A A +R RPILMT+LA +LG LPL +S G G+ + +GIG++GG++ + +
Sbjct: 956 GKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATL 1015

Query: 1010 LTLFTTPVIYLLFDRL 1025
L +F PV +++ R
Sbjct: 1016 LAIFFVPVFFVVIRRC 1031


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z3245     ACRIFLAVINRP  777    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 777 bits (2009), Expect = 0.0
Identities = 243/877 (27%), Positives = 429/877 (48%), Gaps = 32/877 (3%)

Query: 45 QLAPTISQIDGVGDVDVGGSSLPAVRVGLNPQALFNQGVSLDDVRTAISNANVRKPQG-- 102
+ T+S+++GVGDV + G+ A+R+ L+ L ++ DV + N + G
Sbjct: 161 NVKDTLSRLNGVGDVQLFGAQY-AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQL 219

Query: 103 ----ALEDGTHRWQIQTNDELKTAAEYQPLIIHYN-NGGAVRLGDVATVTDSVQDVRNAG 157
AL I K E+ + + N +G VRL DVA V ++
Sbjct: 220 GGTPALPGQQLNASIIAQTRFKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIA 279

Query: 158 MTNAKPAILLMIRKLPEANIIQTVDSIRAKLPELQETIPAAIDLQIAQDRSPTIRASLEE 217
N KPA L I+ AN + T +I+AKL ELQ P + + D +P ++ S+ E
Sbjct: 280 RINGKPAAGLGIKLATGANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHE 339

Query: 218 VEQTLIISVALVILVVFLFLRSGRATIIPAVAVPVSLIGTFAAMYLCGFSLNNLSLMALT 277
V +TL ++ LV LV++LFL++ RAT+IP +AVPV L+GTFA + G+S+N L++ +
Sbjct: 340 VVKTLFEAIMLVFLVMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMV 399

Query: 278 IATGFVVDDAIVVLENIARHL-EAGMKPLQAALQGTREVGFTVLSMSLSLVAVFLPLLLM 336
+A G +VDDAIVV+EN+ R + E + P +A + ++ ++ +++ L AVF+P+
Sbjct: 400 LAIGLLVDDAIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFF 459

Query: 337 GGLPGRLLREFAVTLSVAIGISLLVSLTLTPMMCGWMLKASKPREQKRLRGFG----RML 392
GG G + R+F++T+ A+ +S+LV+L LTP +C +LK + GF
Sbjct: 460 GGSTGAIYRQFSITIVSAMALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTF 519

Query: 393 VALQQGYGKSLKWVLNHTRLVGMVLLGTIALNIWLYISIPKTFFPEQDTGVLMGGIQADQ 452
Y S+ +L T ++ +A + L++ +P +F PE+D GV + IQ
Sbjct: 520 DHSVNHYTNSVGKILGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPA 579

Query: 453 SISFQ----AMRGKLQDFMKIIRD-DPAVDNVTGFT-GGSRVNSGMMFITLKPRDERS-- 504
+ + + ++K + +V V GF+ G N+GM F++LKP +ER+
Sbjct: 580 GATQERTQKVLDQVTDYYLKNEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGD 639

Query: 505 -ETAQQIIDRLRVKLAKEPGANLFLMAVQDIRVGGRQSNASYQYTLLSDDLAALREWEPK 563
+A+ +I R +++L K + + I G + ++ L D + +
Sbjct: 640 ENSAEAVIHRAKMELGKIRDGFVIPFNMPAIVELGTATGFDFE---LIDQAGLGHDALTQ 696

Query: 564 IRKKLATL-----PELADVNSDQQDNGAEMNLVYDRDTMARLGIDVQAANSLLNNAFGQR 618
R +L + L V + ++ A+ L D++ LG+ + N ++ A G
Sbjct: 697 ARNQLLGMAAQHPASLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGT 756

Query: 619 QISTIYQPMNQYKVVMEVDPRYTQDISALEKMFVINNEGKAIPLSYFAKWQPANAPLSVN 678
++ K+ ++ D ++ ++K++V + G+ +P S F +
Sbjct: 757 YVNDFIDRGRVKKLYVQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLE 816

Query: 679 HQGLSAASTISFNLPTGKSLSDASAAIDRAMTQLGVPSTVRGSFAGTAQVFQETMNSQVI 738
+ I G S DA A ++ ++L P+ + + G + + + N
Sbjct: 817 RYNGLPSMEIQGEAAPGTSSGDAMALMENLASKL--PAGIGYDWTGMSYQERLSGNQAPA 874

Query: 739 LIIAAIATVYIVLGILYESYVHPLTILSTLPSAGVGALLALELFNAPFSLIALIGIMLLI 798
L+ + V++ L LYES+ P++++ +P VG LLA LFN + ++G++ I
Sbjct: 875 LVAISFVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTI 934

Query: 799 GIVKKNAIMMVDFALEAQRHGNLTPQEAIFQACLLRFRPIMMTTLAALFGALPLVLSGGD 858
G+ KNAI++V+FA + EA A +R RPI+MT+LA + G LPL +S G
Sbjct: 935 GLSAKNAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGA 994

Query: 859 GSELRQPLGITIVGGLVMSQLLTLYTTPVVYLFFDRL 895
GS + +GI ++GG+V + LL ++ PV ++ R
Sbjct: 995 GSGAQNAVGIGVMGGMVSATLLAIFFVPVFFVVIRRC 1031



Score = 49.1 bits (117), Expect = 7e-08
Identities = 13/39 (33%), Positives = 20/39 (51%)

Query: 6 LFIYRPVATILLSVAITLCGILGFRMLPVAPLPQVDFPV 44
FI RP+ +L++ + + G L LPVA P + P
Sbjct: 4 FFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPA 42


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z3246     TCRTETB  126    8e-34    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 126 bits (317), Expect = 8e-34
Identities = 97/429 (22%), Positives = 188/429 (43%), Gaps = 23/429 (5%)

Query: 20 FMQSLDTTIVNTALPSMAQSLGESPLHMHMVIVSYVLTVAVMLPASGWLADKVGVRNIFF 79
F L+ ++N +LP +A + P + V +++LT ++ G L+D++G++ +
Sbjct: 24 FFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLL 83

Query: 80 TAIVLFTLGSLFCALSGTLNELL-LARALQGVGGAMMVPVGRLTVMKIVPREQYMAAMTF 138
I++ GS+ + + LL +AR +QG G A + + V + +P+E A
Sbjct: 84 FGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGL 143

Query: 139 VTLPGQVGPLLGPALGGLLVEYASWHWIFLINIPVGIIGAIATLM-LMPNYTMQTRRFDL 197
+ +G +GPA+GG++ Y HW +L+ IP+ I + LM L+ FD+
Sbjct: 144 IGSIVAMGEGVGPAIGGMIAHY--IHWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDI 201

Query: 198 SGFLLLAVGMAVLTLALDGSKGTGLSPLAITGLVAVGVVALVLYLLHARNNNRALFSLKL 257
G +L++VG+ L + + V V++ ++++ H R L
Sbjct: 202 KGIILMSVGIVFFMLF---------TTSYSISFLIVSVLSFLIFVKHIRKVTDPFVDPGL 252

Query: 258 FRTRTFSLGLAGSFAGRIGSGMLPFMTPVFLQIGLGFSPFHAG-LMMIPMVLGSMGMKRI 316
+ F +G+ M P ++ S G +++ P + + I
Sbjct: 253 GKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYI 312

Query: 317 VVQVVNRFGYRRVLVATTLGLSLVTLLFMTTALL----GWYYVLPFVLFLQGMVNSTRFS 372
+V+R G VL +G++ +++ F+T + L W+ + V L G+ S +
Sbjct: 313 GGILVDRRGPLYVL---NIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGL--SFTKT 367

Query: 373 SMNTLTLKDLPDNLASSGNSLLSMIMQLSMSIGVTIAGLLLGLFGSQHVSVDSSTTQTVF 432
++T+ L A +G SLL+ LS G+ I G LL + + Q+ +
Sbjct: 368 VISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLSIPLLDQRLLPMEVDQSTY 427

Query: 433 MYTWLSMAF 441
+Y+ L + F
Sbjct: 428 LYSNLLLLF 436


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z3247     BCTERIALGSPF  31     0.009    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 31.3 bits (71), Expect = 0.009
Identities = 27/93 (29%), Positives = 34/93 (36%), Gaps = 27/93 (29%)

Query: 173 LATLLAALATFLLA-------------RGLLAPVKRLVDGTHKLAAGDFTTRVTPTSEDE 219
LATL+AA A L+A V+ V H LA + P S +
Sbjct: 77 LATLVAASMPLEEALDAVAKQSEKPHLSQLMAAVRSKVMEGHSLAD---AMKCFPGSFER 133

Query: 220 L-----------GKLAQDFNQLASTLEKNQQMR 241
L G L N+LA E+ QQMR
Sbjct: 134 LYCAMVAAGETSGHLDAVLNRLADYTEQRQQMR 166


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
Z3248     HTHFIS  58     1e-12    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 58.3 bits (141), Expect = 1e-12
Identities = 17/80 (21%), Positives = 40/80 (50%), Gaps = 1/80 (1%)

Query: 11 PRILIVEDEPKLGQLLIDYLRAASYAPTLISHGDQVLSYVRQTPPDLILLDLMLPGTDGL 70
IL+ +D+ + +L L A Y + S+ + ++ DL++ D+++P +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 71 TLCREIR-RFSDIPIVMVTA 89
L I+ D+P+++++A
Sbjct: 64 DLLPRIKKARPDLPVLVMSA 83


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z3254     DHBDHDRGNASE  34     7e-04    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 33.9 bits (77), Expect = 7e-04
Identities = 22/92 (23%), Positives = 36/92 (39%), Gaps = 2/92 (2%)

Query: 156 AQGCENKNVIIIGAGT-IGLLAIQCAVALGAKSVTAIDISSEKLALAKSFGAMQTFNSSE 214
A+G E K I GA IG + + GA + A+D + EKL S + ++
Sbjct: 3 AKGIEGKIAFITGAAQGIGEAVARTLASQGAH-IAAVDYNPEKLEKVVSSLKAEARHAEA 61

Query: 215 MSAPQMQSVLRELRFNQLILETAGVPQTVELA 246
A S + ++ E + V +A
Sbjct: 62 FPADVRDSAAIDEITARIEREMGPIDILVNVA 93


44  Z3267  Z3278  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z3267     2       24       -2.358436   phosphomethylpyrimidine kinase
Z3268     2       26       -4.830690   hydroxyethylthiazole kinase
Z3269     2       22       -6.152850   hypothetical protein
Z3270     0       20       -5.967897   hypothetical protein
Z3271     1       23       -7.781019   hypothetical protein
Z3272     1       25       -7.095531   hypothetical protein
Z3273     0       26       -7.472200   hypothetical protein
Z3274     1       29       -8.103890   nickel/cobalt efflux protein RcnA
Z3275     2       33       -9.333646   hypothetical protein
Z3276     1       27       -7.555018   fimbrial protein
Z3277     -2      14       -4.236797   hypothetical protein
Z3278     -2      14       -3.240578   chaperone protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z3275     TYPE3OMGPROT  28     0.020    Type III secretion system outer membrane G protein ...
		>TYPE3OMGPROT#Type III secretion system outer membrane G protein family signature.

Length = 607

Score = 27.9 bits (62), Expect = 0.020
Identities = 13/42 (30%), Positives = 21/42 (50%), Gaps = 1/42 (2%)

Query: 66 KMLLGALLLVTSAAWAAPATAGSTNTSGISKYE-LSSFIADF 106
++L G LLL++S +WA ++K E L + DF
Sbjct: 11 RVLTGTLLLLSSYSWAQELDWLPIPYVYVAKGESLRDLLTDF 52


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z3277     PF00577  714    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 714 bits (1844), Expect = 0.0
Identities = 241/843 (28%), Positives = 395/843 (46%), Gaps = 35/843 (4%)

Query: 2 LRMTPLASAI---VALLLGIEAHAAEETFDTHFMMGGMKGEQVTNLRL--DDNQPLPGQY 56
R+ + A +AE F+ F+ + V +L + + PG Y
Sbjct: 21 HRLAGFFVRLFVACAFAAQAPLSSAELYFNPRFLADD--PQAVADLSRFENGQELPPGTY 78

Query: 57 DIDIYVNKQWRGKYEIIVKDNPHET----CLTREIVKRLGIN-----SDNFARENQCLTF 107
+DIY+N + ++ E CLTR + +G+N N ++ C+
Sbjct: 79 RVDIYLNNGYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDACVPL 138

Query: 108 EQLVQGGSYSWDIGIFRLDLAVPQAWVEELENGYVPPENWERGINAFYTSYYVSQYYSDY 167
++ + D+G RL+L +PQA++ GY+PPE W+ GINA +Y S
Sbjct: 139 TSMIHDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSVQN 198

Query: 168 KASGNSKSTYVRFNSGLNLLGWQLHSDASFSKTDNNP-----GEWKSNTLYLEHGFSQIL 222
+ GNS Y+ SGLN+ W+L + ++S ++ +W+ +LE +
Sbjct: 199 RIGGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLR 258

Query: 223 GTLRIGDMYTSADIFDSVRFTGVRLFRDMQMLPNSKQNFTPRVQGIAQSNALVTIEQNGF 282
L +GD YT DIFD + F G +L D MLP+S++ F P + GIA+ A VTI+QNG+
Sbjct: 259 SRLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGY 318

Query: 283 VVYQKEVPPGPFSISDLQLAGGGADLDVSVKEADGSVTTYLVPYAAVPNMLQPGVSKYDF 342
+Y VPPGPF+I+D+ AG DL V++KEADGS + VPY++VP + + G ++Y
Sbjct: 319 DIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSI 378

Query: 343 AAGRSHIEGASKQSD-FVQAGYQYGFNNLLTLYGGTMVANNYYAFTLGTGWNT-RIGAIS 400
AG A ++ F Q+ +G T+YGGT +A+ Y AF G G N +GA+S
Sbjct: 379 TAGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALS 438

Query: 401 VDATKSHSKQDNGDVFDGQSYQIAYNKFLSQTSTRFGLAAWRYSSRDYRTFNDYVWANNK 460
VD T+++S + DGQS + YNK L+++ T L +RYS+ Y F D ++
Sbjct: 439 VDMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMN 498

Query: 461 DNYRRDKNDVYDI----ADYYQNDFGRKNSFSANMSQSLPEGWGSVSLSTLWRDYWGRSG 516
++ V + DYY + ++ ++Q L ++ LS + YWG S
Sbjct: 499 GYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGR-TSTLYLSGSHQTYWGTSN 557

Query: 517 SSKDYQLSYSNNWRRISYTLAASQAYDENHAE-EKRFNIFISIPFD--WGDDVTTPRRQI 573
+ +Q + + I++TL+ S + ++ + ++IPF D + R
Sbjct: 558 VDEQFQAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRSDSKSQWRHA 617

Query: 574 YMSNSTTFDDQGFASNNTGLSGTVGNRDQFNYGINLSHQHQGN---ETTAGANLTWTAPA 630
S S + D G +N G+ GT+ + +Y + + G+ +T A L +
Sbjct: 618 SASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGY 677

Query: 631 ATVNGSYSQSSTYRQVGASVSGGLVAWSGGVNLANRLSETFAVMHAPGIKDAYVNGQKYR 690
N YS S +Q+ VSGG++A + GV L L++T ++ APG KDA V Q
Sbjct: 678 GNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNDTVVLVKAPGAKDAKVENQTGV 737

Query: 691 TTNCNGVVVYDGLTPYRENHLMMDVSQSDSETELRGNRKMTAPYRGAVVLVDFDTDQRKP 750
T+ G V T YREN + +D + +L P RGA+V +F +
Sbjct: 738 RTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAEFKA-RVGI 796

Query: 751 WFIKALRSDGQPLTFGYEVNDMHGHNIGVVGQGSQIFIRTNEIPPAVNVAIDKQQGLSCT 810
+ L + +PL FG V + G+V Q+++ + V V +++ C
Sbjct: 797 KLLMTLTHNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLAGKVQVKWGEEENAHCV 856

Query: 811 ITF 813
+
Sbjct: 857 ANY 859


45  Z3305  Z3376  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z3305     3       24       4.455024    hypothetical protein
Z3306     4       25       4.487139    hypothetical protein
Z3307     4       26       5.112467    tail fiber protein encoded within prophage
Z3309     4       27       5.151396    tail fiber protein encoded within prophage
Z3310     5       27       5.169310    Lom-like outer membrane protein of prophage
Z3311     5       28       5.269057    tail fiber protein of prophage CP-933V
Z3312     4       30       5.216626    superoxide dismutase
Z3313     4       34       5.951021    tail component of prophage CP-933V
Z3314     5       35       5.536641    tail component of prophage CP-933V
Z3315     4       33       4.694740    tail component of prophage CP-933V
Z3316     4       33       4.672856    hypothetical protein
Z3318     3       32       4.717821    tail component of prophage CP-933V
Z3319     6       32       4.284956    hypothetical protein
Z3320     7       31       3.612268    hypothetical protein
Z3322     5       29       3.337404    major tail subunit encoded within prophage
Z3323     6       31       4.042615    hypothetical protein
Z3325     6       31       5.136947    hypothetical protein
Z3326     6       31       5.163665    hypothetical protein
Z3327     5       30       5.058336    hypothetical protein
Z3328     5       30       5.152685    portal protein for prophage CP-933V
Z3329     5       29       5.498528    hypothetical protein
Z3331     4       28       5.200645    hypothetical protein
Z3332     3       25       2.114200    hypothetical protein
Z3333     1       26       0.073524    hypothetical protein
Z3334     2       25       -0.354623   hypothetical protein
Z3335     1       25       -0.073370   hypothetical protein
Z3336     2       25       2.277572    endopeptidase Rz for prophage CP-933V
Z3337     1       25       1.218139    antirepressor of prophage CP-933V
Z3339     3       28       -0.381692   endolysin R of prophage CP-933V
Z3340     2       30       -0.693723   lysis protein S of prophage CP-933V
Z3341     1       26       -1.151486   hypothetical protein
Z3342     1       27       -1.625832   hypothetical protein
Z3343     2       38       -6.884197   shiga-like toxin 1 subunit B encoded within
Z3344     3       37       -5.201061   shiga-like toxin 1 subunit A encoded within
Z3345     2       33       -2.230611   antitermination protein Q for prophage CP-933V
Z3346     3       35       -2.436899   hypothetical protein
Z3347     3       33       -2.793915   hypothetical protein
Z3348     1       32       -2.588499   hypothetical protein
Z3349     1       29       -0.250713   DNA methyltransferase encoded within prophage
Z3351     1       24       -1.132877   hypothetical protein
Z3352     1       25       -1.811165   hypothetical protein
Z3353     3       28       -1.433986   hypothetical protein
Z3354     2       29       -1.447106   exclusion protein ren of prophage CP-933V
Z3355     3       31       -3.006907   DNA replication protein P of prophage CP-933V
Z3356     5       34       -4.071017   DNA replication protein O of prophage CP-933V
Z3357     7       41       -5.899066   regulatory protein CII of prophage CP-933V
Z3358     7       41       -7.218750   repressor protein CI of prophage CP-933V
Z3359     8       42       -10.002765  hypothetical protein
Z3360     8       38       -8.717330   hypothetical protein
Z3361     4       27       -5.384211   transcription antitermination protein N of
Z3362     3       25       -4.467308   superinfection exclusion protein B of prophage
Z3363     3       24       -3.194888   ssDNA-binding protein
Z3364     1       24       -1.836678   host killing protein Kil of prophage CP-933V
Z3365     1       21       -2.486989   host-nuclease inhibitor protein Gam of prophage
Z3366     0       22       -2.653694   recombination protein Bet of prophage CP-933V
Z3367     0       22       -3.447712   exonuclease of prophage CP-933V
Z3368     1       24       -4.135710   hypothetical protein
Z3369     0       22       -3.740101   hypothetical protein
Z3370     0       28       -5.446732   hypothetical protein
Z3371     1       25       -3.595376   hypothetical protein
Z3372     1       20       -1.706063   hypothetical protein
Z3373     0       18       -0.399858   hypothetical protein
Z3374     1       15       0.927443    hypothetical protein
Z3375     2       15       1.042326    integrase for prophage CP-933V
Z3376     0       16       3.314797    transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z3310     ENTEROVIROMP  136    3e-43    Enterobacterial virulence outer membrane protein si...
		>ENTEROVIROMP#Enterobacterial virulence outer membrane protein signature.

Length = 171

Score = 136 bits (344), Expect = 3e-43
Identities = 57/193 (29%), Positives = 93/193 (48%), Gaps = 27/193 (13%)

Query: 8 ILSAAICLTVSGAPAWASEQQATLSAGYLHVSTNAPGSDNLNGINVKYRYEFTDT-LGLV 66
+A+ ++ + +T++ GY + + G N+KYRYE ++ LG++
Sbjct: 5 ACLSALAAVLAFTAGTSVAATSTVTGGYAQSDAQGQMNK-MGGFNLKYRYEEDNSPLGVI 63

Query: 67 TSFSYAGDRNRQITRYSDTRWHEDSVRNRWFSVMAGPSVRVNEWFSAYAMAGVAYSRVST 126
SF+Y T S T D +N+++ + AGP+ R+N+W S Y + GV Y + T
Sbjct: 64 GSFTY--------TEKSRTASSGDYNKNQYYGITAGPAYRINDWASIYGVVGVGYGKFQT 115

Query: 127 FSGDYLRVTDNKGKTHDVLTGSDDGRHSNTSLAWGAGVQFNPTESVAIDIAYEGSGSGDW 186
T+ HD S+ ++GAG+QFNP E+VA+D +YE S
Sbjct: 116 --------TEYPTYKHD---------TSDYGFSYGAGLQFNPMENVALDFSYEQSRIRSV 158

Query: 187 RTDGFIVGVGYKF 199
+I GVGY+F
Sbjct: 159 DVGTWIAGVGYRF 171


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z3328     PF07675  32     0.006    Cleaved Adhesin
		>PF07675#Cleaved Adhesin

Length = 1358

Score = 32.4 bits (73), Expect = 0.006
Identities = 21/102 (20%), Positives = 42/102 (41%)

Query: 164 SGVIEIPGSITEENAKKLKSNWDSGYTGENAGKTAILSNGAKYNPTTFSPVDAQTALETG 223
+GV G T K++ N + + ++ P+ + PV TA G
Sbjct: 286 TGVANASGVATVNMTKQITENGNYDVVITRSNYLPVIKQIQAGEPSPYQPVSNLTATAQG 345

Query: 224 ENESTEFDVTTLLRMDSERRMKTLGDAVKNTLLTPNEARKRE 265
+ + ++D + + + R +K +GD + T+ N+ R E
Sbjct: 346 QKVTLKWDAPSAKKAEGSREVKRIGDGLFVTIEPANDVRANE 387


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
Z3341     GPOSANCHOR  28     0.008    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 28.5 bits (63), Expect = 0.008
Identities = 20/65 (30%), Positives = 34/65 (52%), Gaps = 3/65 (4%)

Query: 69 EAVKEVLRSEEVRSALKQKLRHNLEARLDAEVDAILDELLGAPAAPEPEGIAGEGSASDS 128
A++++ + E L +K + L+A+L+AE A+ ++L A A E + G ASDS
Sbjct: 410 AALEKLNKELEESKKLTEKEKAELQAKLEAEAKALKEKL--AKQAEELAKL-RAGKASDS 466

Query: 129 GDPTP 133
P
Sbjct: 467 QTPDA 471


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z3343     FLGMOTORFLIM  26     0.024    Flagellar motor switch protein FliM signature.
		>FLGMOTORFLIM#Flagellar motor switch protein FliM signature.

Length = 344

Score = 26.0 bits (57), Expect = 0.024
Identities = 7/36 (19%), Positives = 17/36 (47%)

Query: 38 DTFTVKVGDKELFTNRWNLQSLLLSAQITGMTVTIK 73
D F + +G+++ F + + ++AQI +
Sbjct: 293 DPFVLSIGNRKKFLCQPGVVGKKIAAQILERIESTS 328


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
Z3344     SHIGARICIN  120    3e-34    Ribosome inactivating protein family signature.
		>SHIGARICIN#Ribosome inactivating protein family signature.

Length = 289

Score = 120 bits (303), Expect = 3e-34
Identities = 49/283 (17%), Positives = 112/283 (39%), Gaps = 40/283 (14%)

Query: 3 IIIFRVLTFFFVIFSVNVVAKE----FTLDFSTAKTYVDSLNVIRSAIGTPLQTISSGGT 58
+I F V + + + A E F L +T+ +Y ++ +R A+ +
Sbjct: 1 MIRFLVFSLLILTLFLTAPAVEGDVSFRLSGATSSSYGVFISNLRKALPYERKL-----Y 55

Query: 59 SLLMIDSGTGDNLFAVDVRGIDPEEGRFNNLRLIVERNNLYVTGFVNRTNNVFYRFADF- 117
+ ++ S + + + + + + ++ N+YV G+ + Y F +
Sbjct: 56 DIPLLRSTLPGSQRYALIHLTNYADE---TISVAIDVTNVYVMGYRA--GDTSYFFNEAS 110

Query: 118 ----SHVTFPGTTA-VTLSGDSSYTTLQRVAGISRTGMQINRHSLTTSYLDLMSHSGTSL 172
+ F VTL +Y LQ AG R + + +L ++ L ++
Sbjct: 111 ATEAAKYVFKDAKRKVTLPYSGNYERLQIAAGKIRENIPLGLPALDSAITTLFYYNA--- 167

Query: 173 TQSVARAMLRFVTVTAEALRFRQIQRGFRTTLDDLSGRSYVMTAEDVDLTLNWGRLSSVL 232
S A A++ + T+EA R++ I++ +D +++ + + L +W LS +
Sbjct: 168 -NSAASALMVLIQSTSEAARYKFIEQQIGKRVDK----TFLPSLAIISLENSWSALSKQI 222

Query: 233 PDYHGQDSV----------RVGRISFGSINA--ILGSVALILN 263
+ + R++ +++A + ++AL+LN
Sbjct: 223 QIASTNNGQFETPVVLINAQNQRVTITNVDAGVVTSNIALLLN 265


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
Z3348     60KDINNERMP  28     0.014    60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 28.0 bits (62), Expect = 0.014
Identities = 9/37 (24%), Positives = 16/37 (43%)

Query: 88 MTYMKAYQKAWKEHRDRYQQDMEKLESENMELRRKLG 124
M M+ Q + R+R D +++ E M L +
Sbjct: 380 MAKMRMLQPKIQAMRERLGDDKQRISQEMMALYKAEK 416


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
Z3363     UREASE  27     0.014    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 27.4 bits (61), Expect = 0.014
Identities = 18/66 (27%), Positives = 26/66 (39%), Gaps = 7/66 (10%)

Query: 57 IMLAQHALLIAISSDLNAYGVVCEFDWN----DGNGQEGWPSMDGSEGIRITD---IDTS 109
+ LA L I + D +G +F DG GQ G+ IT+ +D
Sbjct: 22 VRLADTELFIEVEKDFTTHGEEVKFGGGKVIRDGMGQSQVTREGGAVDTVITNALILDHW 81

Query: 110 GIFDSD 115
GI +D
Sbjct: 82 GIVKAD 87


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
Z3370     GPOSANCHOR  29     0.033    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 28.9 bits (64), Expect = 0.033
Identities = 13/80 (16%), Positives = 32/80 (40%), Gaps = 2/80 (2%)

Query: 79 NVALALLDERERNQQYIKRRDQENEEIALTVGKLRVELEAAKSKLNEQREYYEGVIADGS 138
N + A + + + + E ++ L ++ + L+ RE + + A+
Sbjct: 274 NFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEAEHQ 333

Query: 139 KRIAELEKQCAEWERKALSN 158
K E + + +E R++L
Sbjct: 334 K--LEEQNKISEASRQSLRR 351


46  Z3385  Z3399  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z3385     -1      17       3.688708    hypothetical protein
Z3386     -1      18       3.763213    acetoin dehydrogenase
Z3387     -2      16       3.885728    multidrug resistance outer membrane protein
Z3388     -2      20       4.123176    hypothetical protein
Z3389     -1      20       4.443490    tRNA-dihydrouridine synthase C
Z3390     -1      19       3.775185    salicylate hydroxylase
Z3391     -1      19       2.566247    glutathione-S-transferase
Z3392     0       19       2.508866    isomerase-decarboxylase
Z3393     0       17       2.455695    1,2-dioxygenase
Z3394     0       15       1.273775    transporter
Z3395     0       15       -0.740364   regulator
Z3396     2       17       -0.288934   hypothetical protein
Z3397     2       16       -0.201504   hypothetical protein
Z3398     1       14       -0.681375   cytidine deaminase
Z3399     3       17       -2.695614   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z3385     BCTERIALGSPF  29     0.019    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 28.6 bits (64), Expect = 0.019
Identities = 5/33 (15%), Positives = 16/33 (48%), Gaps = 2/33 (6%)

Query: 164 WLHNLDQHLKHW-VWLILVVVL-VVGVRWWLKR 194
L + ++ + W++L ++ + R L++
Sbjct: 215 VLMGMSDAVRTFGPWMLLALLAGFMAFRVMLRQ 247


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z3386     DHBDHDRGNASE  113    2e-32    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 113 bits (283), Expect = 2e-32
Identities = 70/253 (27%), Positives = 115/253 (45%), Gaps = 12/253 (4%)

Query: 3 QVAIITASDSGIGKECALLLAQQGFDIGITWXSDEEGAKDTAREVVSHGVRAEIVQLDLG 62
++A IT + GIG+ A LA QG I + E + + + AE D+
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHI-AAVDYNPEKLEKVVSSLKAEARHAEAFPADVR 67

Query: 63 NLPEGAQALEKLIQRLGRIDVLVNNAGAMTKVPFLDMAFDEWRKIFTVDVDGAFLCSQIA 122
+ + ++ + +G ID+LVN AG + ++ +EW F+V+ G F S+
Sbjct: 68 DSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSV 127

Query: 123 ARQMVKQGQGGRIINITSVHEHTPLPDASAYTAAKHALGGLTKAMALELVRHKILVNAVA 182
++ M+ + + G I+ + S P +AY ++K A TK + LEL + I N V+
Sbjct: 128 SKYMMDR-RSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVS 186

Query: 183 PGAIATPM-------NGMDDSDVKPDAEP---SIPLRRFGATHEIASLVAWLCSEGANYT 232
PG+ T M + +K E IPL++ +IA V +L S A +
Sbjct: 187 PGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGHI 246

Query: 233 TGQSLIVDGGFML 245
T +L VDGG L
Sbjct: 247 TMHNLCVDGGATL 259


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z3389     SHAPEPROTEIN  29     0.029    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 28.6 bits (64), Expect = 0.029
Identities = 32/127 (25%), Positives = 53/127 (41%), Gaps = 5/127 (3%)

Query: 122 GAKAMREAVPAHLPVSVKVRLGWDSGEK-KFEIADAVQQAGATELVVHGRTKEQGY-RAE 179
G EA+ ++ + +G + E+ K EI A E+ V GR +G R
Sbjct: 190 GGDRFDEAIINYVRRNYGSLIGEATAERIKHEIGSAYPGDEVREIEVRGRNLAEGVPRGF 249

Query: 180 HIDWQAIGE-IRQRLNIPVIANGEIWDWQSAQECMAISGCDSVMIGRGALNIPNLSRVVK 238
++ I E +++ L V A + + IS V+ G GAL + NL R++
Sbjct: 250 TLNSNEILEALQEPLTGIVSAVMVALEQCPPELASDISERGMVLTGGGAL-LRNLDRLL- 307

Query: 239 YNEPRMP 245
E +P
Sbjct: 308 MEETGIP 314


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z3394     TCRTETB  43     1e-06    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 43.3 bits (102), Expect = 1e-06
Identities = 42/200 (21%), Positives = 79/200 (39%), Gaps = 5/200 (2%)

Query: 193 RQLPITLMLWVVFFMSLLIIYLLSSWMPTLLNHRGIDLQHASWVTAAFQIGGTLGALALG 252
R I + L ++ F S+L +L+ +P + N +WV AF + ++G G
Sbjct: 11 RHNQILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYG 70

Query: 253 VLMDKFNPFRVLTLSYAIGAICIVMIGLSQDGLWLMALAIFGTGIGISGSQVGLNALTAT 312
L D+ R+L I V+ + L+ +A F G G + + + A
Sbjct: 71 KLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVAR 130

Query: 313 LYPTQSRATGVSWSNAIGRCGAIVGSLSGGVMMAMNFSFDTLFFIIAVPAAISAVMLTLL 372
P ++R +I G VG GG M+A + L I I+ + + L
Sbjct: 131 YIPKENRGKAFGLIGSIVAMGEGVGPAIGG-MIAHYIHWSYLLLI----PMITIITVPFL 185

Query: 373 ITVVRQSTSVPDSLPRAGVV 392
+ ++++ + G++
Sbjct: 186 MKLLKKEVRIKGHFDIKGII 205



Score = 31.0 bits (70), Expect = 0.009
Identities = 54/355 (15%), Positives = 125/355 (35%), Gaps = 22/355 (6%)

Query: 11 IDAAPVGKMQWRVIICCFLVVMLDGFDTAA-IGFIASAFSPDLQTLVFLRFLTGLGLGGA 69
I A GK+ ++ I L+ + + IGF+ +F L+ RF+ G G A
Sbjct: 64 IGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFS---LLIMARFIQGAG-AAA 119

Query: 70 MPNTIT-MTSEYLPARRRGALVTLMFCGFTLGSAFGGIVSAQLVPVIGWHGILVLGGVLP 128
P + + + Y+P RG L+ +G G + + I W +L++ +
Sbjct: 120 FPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPMITI 179

Query: 129 LMLFVALLVVLPESPRWQVRRQLPQAVI-----------AKTVSAITRERYVDTHFYLIE 177
+ + L+ +L + R + + ++ + S V + ++
Sbjct: 180 ITVPF-LMKLLKKEVRIKGHFDIKGIILMSVGIVFFMLFTTSYSISFLIVSVLSFLIFVK 238

Query: 178 SASVTKGSIRQLFMGRQLPITLMLWVVF--FMSLLIIYLLSSWMPTLLNHRGIDLQHASW 235
+G+ +P + + F ++ + +M ++ +
Sbjct: 239 HIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVI 298

Query: 236 VTAAFQIGGTLGALALGVLMDKFNPFRVLTLSYAIGAICIVMIGLSQDG-LWLMALAIFG 294
+ G + G+L+D+ P VL + ++ + + W M + I
Sbjct: 299 IFPGTMSVIIFGYIG-GILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVF 357

Query: 295 TGIGISGSQVGLNALTATLYPTQSRATGVSWSNAIGRCGAIVGSLSGGVMMAMNF 349
G+S ++ ++ + ++ Q G+S N G G ++++
Sbjct: 358 VLGGLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLSIPL 412


47  Z3450  Z3462  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z3450     0       20       3.601764    transcriptional regulator NarP
Z3451     0       22       4.031441    subunit of heme lyase
Z3452     1       20       4.413812    disulfide oxidoreductase
Z3453     -1      18       4.488029    cytochrome C biogenesis protein
Z3454     -1      16       3.031629    cytochrome C biogenesis protein CcmE
Z3455     -1      16       3.316781    heme exporter protein C
Z3456     -1      15       3.431505    heme exporter protein C
Z3457     -1      17       4.126680    heme exporter protein B, cytochrome C biogenesis
Z3458     -1      20       4.448778    cytochrome c biogenesis protein CcmA
Z3459     -1      23       4.449697    cytochrome C
Z3460     -1      22       4.876311    citrate reductase cytochrome C subunit
Z3461     -1      22       4.328894    quinol dehydrogenase membrane component
Z3462     0       20       3.767533    quinol dehydrogenase periplasmic component
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
Z3450     HTHFIS  64     2e-14    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 64.5 bits (157), Expect = 2e-14
Identities = 22/113 (19%), Positives = 47/113 (41%), Gaps = 2/113 (1%)

Query: 9 VMIVDDHPLMRRGVRQLLELDPGFEVVAEAGDGASAIDLANRLDIDVILLDLNMKGMSGL 68
+++ DD +R + Q L G++V + A+ D D+++ D+ M +
Sbjct: 6 ILVADDDAAIRTVLNQALS-RAGYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 69 DTLNALRRDGVTAQIIILTVSDASSDVFALIDAGADGYLLKDSDPEVLLEAIR 121
D L +++ +++++ + + GA YL K D L+ I
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIG 116


48  Z3516  Z3542  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z3516     0       15       3.160732    hypothetical protein
Z3517     -1      13       3.610868    hypothetical protein
Z3519     0       14       4.922020    polymyxin resistance protein B
Z3520     0       14       4.869440    O-succinylbenzoic acid--CoA ligase
Z3521     0       14       4.696048    O-succinylbenzoate synthase
Z3522     -1      14       3.528801    naphthoate synthase
Z3523     -1      11       2.826368    acyl-CoA thioester hydrolase
Z3524     -1      15       -1.539896   2-succinyl-5-enolpyruvyl-6-hydroxy-3-
Z3525     -1      22       -5.307202   menaquinone-specific isochorismate synthase
Z3526     0       28       -8.019096   hypothetical protein
Z3527     -1      17       -4.377374   hypothetical protein
Z3528     -1      14       -2.373654   ribonuclease Z
Z3529     -2      18       -0.328411   deubiquitinase
Z3531     0       25       3.196521    hypothetical protein
Z3533     1       27       3.719051    hypothetical protein
Z3534     0       30       4.432670    NADH dehydrogenase subunit N
Z3536     0       30       3.849539    NADH dehydrogenase subunit M
Z3537     -1      30       4.283489    NADH dehydrogenase subunit L
Z3538     0       30       4.006519    NADH dehydrogenase subunit K
Z3539     0       30       4.083269    NADH dehydrogenase subunit J
Z3540     0       30       4.379036    NADH dehydrogenase subunit I
Z3541     0       29       4.153841    NADH dehydrogenase subunit H
Z3542     0       28       4.142846    NADH dehydrogenase subunit G
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z3516     BCTERIALGSPC  28     0.008    Bacterial general secretion pathway protein C signa...
		>BCTERIALGSPC#Bacterial general secretion pathway protein C signature.

Length = 272

Score = 28.0 bits (62), Expect = 0.008
Identities = 12/31 (38%), Positives = 18/31 (58%), Gaps = 1/31 (3%)

Query: 34 KHIVLWLGLALACLGLAMVLWLLVL-QNVPV 63
+ I+ +L + L C LAM+ W + L N PV
Sbjct: 15 RRILFYLLMLLFCQQLAMIFWRIGLPDNAPV 45


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z3520     ACETATEKNASE  30     0.016    Acetate kinase family signature.
		>ACETATEKNASE#Acetate kinase family signature.

Length = 400

Score = 30.2 bits (68), Expect = 0.016
Identities = 19/124 (15%), Positives = 47/124 (37%), Gaps = 20/124 (16%)

Query: 339 EMHNGKLTIVG-----RLDNLFFSGGEGIQTEEVERVIAAHPAVLQVFIVPVADKEF--- 390
E +G + G +++ + + ++++ + H +++ + + + ++
Sbjct: 19 ESKDGNVLAKGLAERIGINDSLLTHNANGEKIKIKKDMKDHKDAIKLVLDALVNSDYGVI 78

Query: 391 ---------GHRPVAVMEYDHESVDLSEWVKDKLARFQQPVRWLTLPPELKNGGIKISRQ 441
GHR V EY SV +++ V + + L P + GIK Q
Sbjct: 79 KDMSEIDAVGHRVVHGGEYFTSSVLITDDVLKAITDC-IELAPLHNPANI--EGIKACTQ 135

Query: 442 ALKE 445
+ +
Sbjct: 136 IMPD 139


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z3527     AUTOINDCRSYN  35     6e-05    Autoinducer synthesis protein signature.
		>AUTOINDCRSYN#Autoinducer synthesis protein signature.

Length = 216

Score = 34.8 bits (80), Expect = 6e-05
Identities = 14/79 (17%), Positives = 32/79 (40%), Gaps = 12/79 (15%)

Query: 1 MIEWQDLHHSELSVSQLYALLQLRCAVFV--------VEQNCPYQDIDGDDLTGDNRHIL 52
M+E D++H+ LS ++ L LR F + D + + ++
Sbjct: 1 MLEIFDVNHTLLSETKSGELFTLRKETFKDRLNWAVQCTDGMEFDQYDNN----NTTYLF 56

Query: 53 GWKNDELVAYARILKSDDD 71
G K++ ++ R +++
Sbjct: 57 GIKDNTVICSLRFIETKYP 75


49  Z3606  Z3649  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z3606     -1      17       -4.629649   hypothetical protein
Z3608     0       21       -5.646443   long-chain fatty acid outer membrane
Z3609     0       25       -6.180769   hypothetical protein
Z3610     -1      25       -4.022023   lipoprotein precursor
Z3611     0       27       -4.397217   transport
Z3613     1       32       -5.184673   *prophage integrase
Z3614     2       34       -6.286318   prophage DNA injection protein
Z3615     2       34       -8.672584   prophage DNA injection protein
Z3616     3       35       -10.147876  hypothetical protein
Z3617     3       38       -9.644586   hypothetical protein
Z3618     3       35       -10.196544  hypothetical protein
Z3619     2       34       -8.356204   hypothetical protein
Z3620     1       31       -5.992801   hypothetical protein
Z3621     0       25       -2.423283   hypothetical protein
Z3622     0       24       -2.214005   resolvase
Z3623     -1      19       -1.627565   galactoside permease
Z3624     -1      24       -2.421988   aminoimidazole riboside kinase
Z3625     -1      24       -4.124209   sucrose hydrolase
Z3626     -2      27       -5.830349   sucrose specific transcriptional regulator
Z3627     -1      30       -8.554095   D-serine permease
Z3628     -1      30       -8.406718   D-serine dehydratase
Z3629     -1      32       -9.397335   multidrug resistance protein Y
Z3630     -1      32       -8.447542   multidrug resistance protein K
Z3631     0       30       -7.719747   DNA-binding transcriptional activator EvgA
Z3632     0       30       -7.334811   hybrid sensory histidine kinase in two-component
Z3633m    1       30       -5.641742   hypothetical protein
Z3635     1       31       -5.441075   transporter YfdV
Z3637     -1      26       -4.205441   oxalyl-CoA decarboxylase
Z3639     -2      15       -1.891575   formyl-coenzyme A transferase
Z3640     -1      12       -0.908887   hypothetical protein
Z3641     0       13       -0.793132   hypothetical protein
Z3642     0       15       0.171066    hypothetical protein
Z3643     0       17       1.516359    lipid A biosynthesis palmitoleoyl
Z3644     0       18       2.322087    aminotransferase
Z3645     0       19       2.541323    sensor protein
Z3646     0       17       3.240614    2-component transcriptional regulator
Z3647     0       16       3.840797    AraC family transcriptional regulator
Z3648     -1      15       3.522892    PTS system enzyme IIA component, enzyme I
Z3649     -1      13       3.095686    exoaminopeptidase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z3610     VACJLIPOPROT  407    e-148    VacJ lipoprotein signature.
		>VACJLIPOPROT#VacJ lipoprotein signature.

Length = 251

Score = 407 bits (1048), Expect = e-148
Identities = 250/251 (99%), Positives = 250/251 (99%)

Query: 1 MKLRLSALALGTTLLVGCASSGTDQQGRSDPLEGFNRTMYNFNFNVLDPYIVRPVAVAWR 60
MKLRLSALALGTTLLVGCASSGTDQQGRSDPLEGFNRTMYNFNFNVLDPYIVRPVAVAWR
Sbjct: 1 MKLRLSALALGTTLLVGCASSGTDQQGRSDPLEGFNRTMYNFNFNVLDPYIVRPVAVAWR 60

Query: 61 DYVPQPARNGLSNFTGNLEEPAVMVNYFLQGDPYQGMVHFTRFFLNTILGMGGFIDVAGM 120
DYVPQPARNGLSNFTGNLEEPAVMVNYFLQGDPYQGMVHFTRFFLNTILGMGGFIDVAGM
Sbjct: 61 DYVPQPARNGLSNFTGNLEEPAVMVNYFLQGDPYQGMVHFTRFFLNTILGMGGFIDVAGM 120

Query: 121 ANPKLQRTEPHRFGSTLGHYGVGYGPYVQLPFYGSFTLRDDGGDMADGLYPVLSWLTWPM 180
ANPKLQRTEPHRFGSTLGHYGVGYGPYVQLPFYGSFTLRDDGGDMAD LYPVLSWLTWPM
Sbjct: 121 ANPKLQRTEPHRFGSTLGHYGVGYGPYVQLPFYGSFTLRDDGGDMADALYPVLSWLTWPM 180

Query: 181 SVGKWTLEGIETRAQLLDSDGLLRQSSDPYIMVREAYFQRHDFIANGGELKPQENPNAQA 240
SVGKWTLEGIETRAQLLDSDGLLRQSSDPYIMVREAYFQRHDFIANGGELKPQENPNAQA
Sbjct: 181 SVGKWTLEGIETRAQLLDSDGLLRQSSDPYIMVREAYFQRHDFIANGGELKPQENPNAQA 240

Query: 241 IQDDLKDIDSE 251
IQDDLKDIDSE
Sbjct: 241 IQDDLKDIDSE 251


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z3614     TCRTETB  35     4e-04    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 34.9 bits (80), Expect = 4e-04
Identities = 26/120 (21%), Positives = 55/120 (45%), Gaps = 7/120 (5%)

Query: 229 LSFAEISSVV------FILMSVYMVAKSQNSNGLQYDA-LFIPSMAFSILVFSFNGGIIS 281
LS AEI SV+ +++ Y+ + G Y + + ++ S L SF S
Sbjct: 289 LSTAEIGSVIIFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTS 348

Query: 282 KIISNKVMILLGDASFSFYLVHTIVISTLSKFFNVSGLGAISVIKFIVMALFASLFISIM 341
++ ++ +LG SF+ ++ TIV S+L + +G+ ++ F+ ++ ++
Sbjct: 349 WFMTIIIVFVLGGLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLL 408


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
Z3615     RTXTOXINA  28     0.037    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 28.0 bits (62), Expect = 0.037
Identities = 26/129 (20%), Positives = 55/129 (42%), Gaps = 4/129 (3%)

Query: 93 GQARYQSLAAAEATGGLGSTATGNQLAAIAPTLGQNWLS--GQMNNYNNLANIGLGALTG 150
G+ Q + A A GL ++A L A A TL + LS + + I +
Sbjct: 284 GKGISQYIIAQRAAQGLSTSAAAAGLIASAVTLAISPLSFLSIADKFKRANKIEEYSQRF 343

Query: 151 QANAGQNYANNVSQLYQQQAAASAANANKPSGLQSFATGAIGGAASGAMIGSAVPVIGTG 210
+ G + + ++ +++ A A+ + L S + I AA+ +++G+ V +
Sbjct: 344 KK-LGYDGDSLLAAFHKETGAIDASLTTISTVLASVS-SGISAAATTSLVGAPVSALVGA 401

Query: 211 IGALAGGVI 219
+ + G++
Sbjct: 402 VTGIISGIL 410


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z3629     TCRTETB       53     9e-10    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 53.0 bits (127), Expect = 9e-10
Identities = 60/268 (22%), Positives = 105/268 (39%), Gaps = 24/268 (8%)

Query: 39 VAIPTILGGYICDNFSWGWIFLINVPMGIIVLTLCLTLLKGRETETSPVKMNLPGLTLLV 98
+ +GG I W +L+ +PM I+ L L +E ++ G+ L+
Sbjct: 152 EGVGPAIGGMIAHYIHWS--YLLLIPMITIITVPFLMKLLKKEVRIKG-HFDIKGIILMS 208

Query: 99 LGVGGLQIMLDKGRDLDWFNSSTIIILTVVSVIFLISLVIWESTSENPILDLSLFKSRNF 158
+G+ + ML F +S I +VSV+ + V +P +D L K+ F
Sbjct: 209 VGI--VFFML--------FTTSYSISFLIVSVLSFLIFVKHIRKVTDPFVDPGLGKNIPF 258

Query: 159 TIGIVSITCAYLFYSGAIVLMPQLLQETMGYNAIWAGLAYAPIGIMPLLISPLIG----- 213
IG++ + +G + ++P ++++ + G G M ++I IG
Sbjct: 259 MIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYIGGILVD 318

Query: 214 RYGNKIDMRVLVTFSFLMYAVCYYWRSVTFMPTIDFTGIILPQFFQGFAVACFFLPLTTI 273
R G + + VTF +V + S T F II+ G + ++TI
Sbjct: 319 RRGPLYVLNIGVTFL----SVSFLTASFLLETTSWFMTIIIVFVLGGLSFTK--TVISTI 372

Query: 274 SFSGLPDNKFANASSMSNFFRTLSGSVG 301
S L + S+ NF LS G
Sbjct: 373 VSSSLKQQEAGAGMSLLNFTSFLSEGTG 400


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z3630     RTXTOXIND     79     4e-18    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 78.7 bits (194), Expect = 4e-18
Identities = 63/412 (15%), Positives = 122/412 (29%), Gaps = 96/412 (23%)

Query: 13 RRKYFSLLVIVLFIAFSGAYAYWSMELEDMISTDDAYVT-GNADPISAQVSGSVTVVNHK 71
RR I+ F+ + + ++E + + + G + I + V + K
Sbjct: 55 RRPRLVAYFIMGFLVIAFILSVLG-QVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVK 113

Query: 72 DTNYVRQGDILVSLDKTDATIALNKA---------------------------------- 97
+ VR+GD+L+ L A K
Sbjct: 114 EGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDE 173

Query: 98 ------------------KNNLANIVRQTNKLYLQDKQYSAEVASARIQ---YQQSLEDY 136
K + Q + L + AE + + Y+
Sbjct: 174 PYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVE 233

Query: 137 NRRV----PLAKQGVISKE----------TLEHTKDTLISSKAALNAAIQAYKANKALVM 182
R+ L + I+K + S + + I + K LV
Sbjct: 234 KSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVT 293

Query: 183 N-------TPLNR-QPQVVEAADATKEAWLALKRTDIKSPVTGYIAQRSVQ-VGETVSPG 233
L + + + + + I++PV+ + Q V G V+
Sbjct: 294 QLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTA 353

Query: 234 QSLMAVVPARQ-MWVNANFKETQLTDVRIGQSVNIISDLYGENVVFHGRVTGINMGTGNA 292
++LM +VP + V A + + + +GQ+ I + F G +G
Sbjct: 354 ETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVE------AFPYTRYGYLVGK--- 404

Query: 293 FSLLPAQNATGNWIKIVQRVPVEVSLDPKELMEH----PLRIGLSMTATIDT 340
+ + +V V +S++ L PL G+++TA I T
Sbjct: 405 VKNINLDAIEDQRLGLVFNVI--ISIEENCLSTGNKNIPLSSGMAVTAEIKT 454


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z3631     HTHFIS        48     5e-09    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 48.3 bits (115), Expect = 5e-09
Identities = 22/147 (14%), Positives = 52/147 (35%), Gaps = 31/147 (21%)

Query: 5 IIDDHPLAIAAIRNLLIKNDIEILAELTEGGSAVQRVETLKPDIVIIDVDIPGVNGIQVL 64
+ DD + L + ++ + + + + D+V+ DV +P N +L
Sbjct: 8 VADDDAAIRTVLNQALSRAGYDVRIT-SNAATLWRWIAAGDGDLVVTDVVMPDENAFDLL 66

Query: 65 ETLRKRQYSGIIIIVSAKNDHFYGKHCADAGANGFVSKKEGMNNIIAAIEAAKNGYCYF- 123
++K + ++++SA+N + AI+A++ G +
Sbjct: 67 PRIKKARPDLPVLVMSAQNT------------------------FMTAIKASEKGAYDYL 102

Query: 124 --PFSLNRFVGSLTSDQQKLDSLSKQE 148
PF L + + L ++
Sbjct: 103 PKPFDLTE---LIGIIGRALAEPKRRP 126


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z3632     HTHFIS        76     2e-16    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 76.4 bits (188), Expect = 2e-16
Identities = 30/105 (28%), Positives = 51/105 (48%)

Query: 960 SILIADDHPTNRLLLKRQLNLLGYDVDEATDGVQALHKVSMQHYDLLITDVNMPNVDGFE 1019
+IL+ADD R +L + L+ GYDV ++ ++ DL++TDV MP+ + F+
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 1020 LTRKLREQNSSLPIWGLTANAQANEREKGLNCGMNLCLFKPLTLD 1064
L ++++ LP+ ++A K G L KP L
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLT 109


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z3645     PF06580       223    3e-70    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 223 bits (570), Expect = 3e-70
Identities = 60/207 (28%), Positives = 102/207 (49%), Gaps = 11/207 (5%)

Query: 348 RAEQLREMANKAELRALQSKINPHFLFNALNAISSSIRLNPDTARQLIFNLSRYLRYNIE 407
++ MA +A+L AL+++INPHF+FNALN I + I +P AR+++ +LS +RY++
Sbjct: 150 DQWKMASMAQEAQLMALKAQINPHFMFNALNNIRALILEDPTKAREMLTSLSELMRYSLR 209

Query: 408 LKDDEQIDIKKELYQIKDYIAIEQARFGDKLTVIYDIDEEV-NCCIPSLLIQPLVENAIV 466
+ Q+ + EL + Y+ + +F D+L I+ + + +P +L+Q LVEN I
Sbjct: 210 YSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQINPAIMDVQVPPMLVQTLVENGIK 269

Query: 467 HGIQPCKGKGVVTISVAECGNRVRIAVRDTGHGIDPKVIERVEANEMPGNKIGLLNVHHR 526
HGI G + + + V + V +TG E GL NV R
Sbjct: 270 HGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKE--------STGTGLQNVRER 321

Query: 527 VKLLYGE--GLHIRRLEPGTEIAFYIP 551
+++LYG + + + IP
Sbjct: 322 LQMLYGTEAQIKLSEKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z3646     HTHFIS        55     5e-11    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 55.2 bits (133), Expect = 5e-11
Identities = 21/132 (15%), Positives = 57/132 (43%), Gaps = 6/132 (4%)

Query: 2 KVIIVEDEFLAQQELSWLIKEHSQMEIVGTFDDGLDVLKFLQHNRVDAIFLDINIPSLDG 61
+++ +D+ + L+ + V + + +++ D + D+ +P +
Sbjct: 5 TILVADDDAAIRTVLNQALS--RAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENA 62

Query: 62 V-LLAQNISQFAHKPFIVFITAWK--EHAVEAFELEAFDYILKPYQESRITGMLQKLEAA 118
LL + P +V ++A A++A E A+DY+ KP+ + + G++ + A
Sbjct: 63 FDLLPRIKKARPDLPVLV-MSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121

Query: 119 WQQQQTSSTPAA 130
+++ + +
Sbjct: 122 PKRRPSKLEDDS 133


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z3648     PHPHTRNFRASE  614    0.0      Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase signature.

Length = 572

Score = 614 bits (1585), Expect = 0.0
Identities = 202/567 (35%), Positives = 331/567 (58%), Gaps = 8/567 (1%)

Query: 117 LYGNVLASGVGVGTLTLLQSDSLDSYRAIPA-SAQDSTRLEHSLATLAEQLNQQLRERDG 175
+ G +SGV + + ++D + + + +L +L E+L + +
Sbjct: 5 ITGIAASSGVAIAKAFIHLEPNVDIEKTSITDVSTEIEKLTAALEKSKEELRAIKDQTEA 64

Query: 176 ----ESKTILSAHLSLIQDDEFAGNIRRLMTEQHQGLGAAIIRNMEQVCAKLSASASDYL 231
+ I +AHL ++ D E I+ + + A+ + + + ++Y+
Sbjct: 65 SMGADKAEIFAAHLLVLDDPELVDGIKGKIENEQMNAEYALKEVSDMFVSMFESMDNEYM 124

Query: 232 RERVSDIRDISEQLL-HITWPELKPRNNLVLEKPTILVAEDLTPSQFLSLDLKNLAGMIL 290
+ER +DIRD+S+++L H+ E + + T+++AEDLTPS L+ + + G
Sbjct: 125 KERAADIRDVSKRVLGHLIGVETGSLATIA--EETVIIAEDLTPSDTAQLNKQFVKGFAT 182

Query: 291 EKTGRTSHTLILARASAIPVLSGLPLDAIARYAGQPAVLDAQCGVLAINPNDAVSGYYQV 350
+ GRTSH+ I++R+ IP + G G ++D G++ +NP + Y+
Sbjct: 183 DIGGRTSHSAIMSRSLEIPAVVGTKEVTEKIQHGDMVIVDGIEGIVIVNPTEEEVKAYEE 242

Query: 351 AQTLADKRQKQQAQAAAQLAYSRDNKRIDIAANIGTALEAPGAFANGAEGVGLFRTEMLY 410
+ +K++++ A+ + + ++D +++AANIGT + G ANG EG+GL+RTE LY
Sbjct: 243 KRAAFEKQKQEWAKLVGEPSTTKDGAHVELAANIGTPKDVDGVLANGGEGIGLYRTEFLY 302

Query: 411 MDRDSEPDEQEQFEAYQQVLLAAGDKPIIFRTMDIGGDKSIPYLNIPQEENPFLGYRAVR 470
MDRD P E+EQFEAY++V+ KP++ RT+DIGGDK + YL +P+E NPFLG+RA+R
Sbjct: 303 MDRDQLPTEEEQFEAYKEVVQRMDGKPVVIRTLDIGGDKELSYLQLPKELNPFLGFRAIR 362

Query: 471 IYPEFAGLFRTQLRAILRAASFGNAQLMIPMVHGLDQILWVKGEIQKAIVELKRDGLRHA 530
+ E +FRTQLRA+LRA+++GN ++M PM+ L+++ K +Q+ +L +G+ +
Sbjct: 363 LCLEKQDIFRTQLRALLRASTYGNLKVMFPMIATLEELRQAKAIMQEEKDKLLSEGVDVS 422

Query: 531 ETITLGIMVEVPSVCYIIDHFCDEVDFFSIGSNDMTQYLYAVDRNNPRVSPLYNPITPSF 590
++I +GIMVE+PS + F EVDFFSIG+ND+ QY A DR N RVS LY P P+
Sbjct: 423 DSIEVGIMVEIPSTAVAANLFAKEVDFFSIGTNDLIQYTMAADRMNERVSYLYQPYHPAI 482

Query: 591 LRMLQQIVTTAHQRGKWVGICGELGGESRYLPLLLGLGLDELSMSSPRIPAVKSQLRQLD 650
LR++ ++ AH GKWVG+CGE+ G+ +PLLLGLGLDE SMS+ I +SQL +L
Sbjct: 483 LRLVDMVIKAAHSEGKWVGMCGEMAGDEVAIPLLLGLGLDEFSMSATSILPARSQLLKLS 542

Query: 651 SEACRELARQACECRSAQEIEALLTAF 677
E + A++A +A+E+E L+
Sbjct: 543 KEELKPFAQKALMLDTAEEVEQLVKKT 569


50  Z3701  Z3714  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z3701     -2      17       3.828528    coproporphyrinogen III oxidase
Z3702     -2      19       4.852660    transcriptional regulator EutR
Z3703     -2      22       5.166003    hypothetical protein
Z3704     -1      22       5.518629    hypothetical protein
Z3705     0       22       5.822706    ethanolamine ammonia-lyase small subunit
Z3706     0       22       5.879472    ethanolamine ammonia-lyase, heavy chain
Z3707     2       21       6.055584    reactivating factor for ethanolamine ammonia
Z3708     1       20       5.710620    ethanolamine transport protein
Z3709     4       20       6.385927    iron-containing alcohol dehydrogenase
Z3710     2       19       6.070447    hypothetical protein
Z3711     2       21       5.500615    ethanolamine utilization; acetaldehyde
Z3712     1       20       4.637001    detox protein
Z3713     2       21       4.049023    detox protein
Z3714     2       18       3.352260    phosphotransacetylase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z3710     SHAPEPROTEIN  51     2e-09    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 50.5 bits (121), Expect = 2e-09
Identities = 33/116 (28%), Positives = 50/116 (43%), Gaps = 9/116 (7%)

Query: 63 VRDGIVWDFFGAVTIVRRHLD-TLEQQFGRRFSHAATSFPPGTDP---RISINVLESAGL 118
++DG++ DFF +++ + F R P G R + AG
Sbjct: 76 MKDGVIADFFVTEKMLQHFIKQVHSNSFMRPSPRVLVCVPVGATQVERRAIRESAQGAGA 135

Query: 119 EVSHVLDEPTAVA---DLLQLDNAG--VVDIGGGTTGIAIVKKGKVTYSADEATGG 169
+++EP A A L + G VVDIGGGTT +A++ V YS+ GG
Sbjct: 136 REVFLIEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVAVISLNGVVYSSSVRIGG 191


51  Z3764  Z3770  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z3764     -3      11       -3.425301   polyphosphate kinase
Z3765     -1      14       -3.655353   exopolyphosphatase
Z3766     -2      14       -2.798543   cytochrome C biogenesis protein
Z3767     2       30       0.222131    hypothetical protein
Z3768     2       24       1.158369    hypothetical protein
Z3769     2       22       1.050139    outer membrane lipoprotein
Z3770     2       21       1.512632    hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z3770     IGASERPTASE   28     0.024    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 28.1 bits (62), Expect = 0.024
Identities = 19/124 (15%), Positives = 40/124 (32%), Gaps = 6/124 (4%)

Query: 34 QQGKNEEQRQHDEWVAERNREIQQEKQRRANAQAAANKRAATAAANKKARQDKLDAEATA 93
Q + ++ + + + E+ Q Q K AT +KA+ + +
Sbjct: 1064 QNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVET-EKTQEV 1122

Query: 94 DKKRDQSYEDELRSLEIQKQKLALAKEEARVKRENEFIDQELKHKAAQTDVVQSEADANR 153
K Q + +S +Q Q + + V I + D Q + +
Sbjct: 1123 PKVTSQVSPKQEQSETVQPQAEPARENDPTVN-----IKEPQSQTNTTADTEQPAKETSS 1177

Query: 154 NMTE 157
N+ +
Sbjct: 1178 NVEQ 1181


52  Z3910  Z3961  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z3910     1       21       -3.164024   small membrane protein A
Z3911     1       23       -5.721416   hypothetical protein
Z3912     3       31       -7.525726   hypothetical protein
Z3913     3       40       -10.760178  SsrA-binding protein
Z3916     7       51       -12.517569  hypothetical protein
Z3917     7       48       -11.340862  hypothetical protein
Z3918     5       46       -10.894039  chaperone protein
Z3919     3       43       -9.363480   hypothetical protein
Z3920     2       34       -7.435204   hypothetical protein
Z3921     4       29       -1.072428   hypothetical protein
Z3922     4       23       2.824948    transposase
Z3923     6       25       2.568154    hypothetical protein
Z3924     4       24       -1.545021   transposase
Z3925     3       24       -1.722582   transposase
Z3926     1       21       -2.902663   hypothetical protein
Z3927     1       21       -3.534708   hypothetical protein
Z3929     2       34       -7.985392   hypothetical protein
Z3931     1       36       -7.449738   hypothetical protein
Z3932     1       33       -5.743845   antiterminator of prophage CP-933Y
Z3933     4       34       -6.177722   hypothetical protein
Z3934     4       39       -10.152878  hypothetical protein
Z3935     6       43       -12.019484  hypothetical protein
Z3936     4       41       -12.723327  IS30 transposase
Z3937     8       52       -17.431745  hypothetical protein
Z3938     9       52       -18.791686  hypothetical protein
Z3939     10      53       -19.055389  hypothetical protein
Z3940     10      53       -18.633808  hypothetical protein
Z3941     10      50       -16.900201  hypothetical protein
Z3942     3       34       -9.668955   hypothetical protein
Z3943     3       30       -7.845396   hypothetical protein
Z3945     3       30       -7.350241   hypothetical protein
Z3946     2       27       -6.042812   DNA binding protein
Z3947     2       29       -7.052913   hypothetical protein
Z3948     2       30       -7.658187   ABC transporter ATP-binding protein
Z3949     2       32       -11.000641  hypothetical protein
Z3950     -1      21       -7.459752   hypothetical protein
Z3951     1       13       -2.276325   hypothetical protein
Z3954     1       12       -0.564572*  hypothetical protein
Z3955     2       16       1.615901    hypothetical protein
Z3956     1       21       3.772096    hypothetical protein
Z3958     2       20       3.571802    hydroxyglutarate oxidase
Z3959     2       20       2.953874    succinate-semialdehyde dehydrogenase I
Z3960     1       18       1.887692    4-aminobutyrate aminotransferase
Z3961     2       16       -0.953256   gamma-aminobutyrate transporter
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z3948     PRTACTNFAMLY  233    2e-65    Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 233 bits (596), Expect = 2e-65
Identities = 201/789 (25%), Positives = 317/789 (40%), Gaps = 52/789 (6%)

Query: 697 TAEGSIINGGSQIVNEGGLAENSVLNDGGTLDVREKGSATGIQQSSQGALVATTRATRVT 756
A G I G+ + + S + DGG G+ +Q R T VT
Sbjct: 159 GAGGVQIERGANVT-----VQRSAIVDGGL----HIGALQSLQPEDLPPSRVVLRDTNVT 209

Query: 757 GTRADGVAFSIEQGAANNILLANGGVLTVESDTSSDKTQVNTGGREIVKTKATAT-GTTL 815
A G ++ A+ + L G + + + + + A G +
Sbjct: 210 AVPASGAPAAVSVLGASELTLDGGHITGGRAAGVAAMQGAVVHLQRATIRRGDAPAGGAV 269

Query: 816 TGGEQIVEGVANETTINDGGIQTVSANGEAIKTTINE----GGTLTVNDNGKATDIVQN- 870
GG V G A GG V + + + + + G A + +
Sbjct: 270 PGGA--VPGGAVPGGFGPGGFGPVLDGWYGVDVSGSSVELAQSIVEAPELGAAIRVGRGA 327

Query: 871 ----SGAALQTSTANGIEISGT--HQYGTFSISGNLATNMLLENGGNLLVAIVYAGTLAD 924
SG +L N IE G +S L + L + L
Sbjct: 328 RVTVSGGSLSAPHGNVIETGGARRFAPQAAPLSITLQAGAHAQGKALLYRVLPEPVKLTL 387

Query: 925 ASVSGATGSLSLMTPRDNVTPVKLEGAIRITDSATLTIGNGVDTTLADLTAASRGSVWLN 984
+ A G + + + A T T A + + + W+
Sbjct: 388 TGGADAQGDIVATELPSIPGTSIGPLDVALASQARWT-----GATRAVDSLSIDNATWVM 442

Query: 985 SNNSCAGTSNCEYRVNSLLLNDGNVYLSAQTAAPATTNGIYNTLTTNELSGSGNFYLHTN 1044
++NS V +L L + + Q A A G + LT N L+GSG F ++
Sbjct: 443 TDNS---------NVGALRL-ASDGSVDFQQPAEA---GRFKVLTVNTLAGSGLFRMNVF 489

Query: 1045 VAGSRGDQLVVNNNATGNFKIFVQDTGVSPQSDDAMTLVKT-GGGDASFSLGNTGGFVDL 1103
D+LVV +A+G +++V+++G P S + + LV+T G A+F+L N G VD+
Sbjct: 490 ADLGLSDKLVVMQDASGQHRLWVRNSGSEPASANTLLLVQTPLGSAATFTLANKDGKVDI 549

Query: 1104 GTYEYVLKSDGNSNWNLTNDVKPNPDPNPNPNPNPKPDPKPDPKPDPKPDPTPEPTPTPV 1163
GTY Y L ++GN W+L P P P P P P P P P+P P+ P P+P
Sbjct: 550 GTYRYRLAANGNGQWSLVGAKAP---PAPKPAPQPGPQPPQPPQPQPEA-PAPQPPAG-- 603

Query: 1164 PEKRITPSTAAVLNMAATLP-LVFDAELNSIRERLNIMKASPHNNNVWGATYNTRNNVTT 1222
+ + AAV L ++ AE N++ +RL ++ +P WG + R +
Sbjct: 604 -RELSAAANAAVNTGGVGLASTLWYAESNALSKRLGELRLNPDAGGAWGRGFAQRQQLDN 662

Query: 1223 DAGAGFEQTLTGMTVGIDSPNDIPEGIATLGAFMGYSHSHIGFDRGGHGSVGSYSLGGYA 1282
AG F+Q + G +G D + G LG GY+ GF G G S +GGYA
Sbjct: 663 RAGRRFDQKVAGFELGADHAVAVAGGRWHLGGLAGYTRGDRGFTGDGGGHTDSVHVGGYA 722

Query: 1283 SWEHESGFYLDGVVKLNRFESNVAGKMSSGGAANGSYHSNGLGGHIETGMRFT-DGNWNL 1341
++ +SGFYLD ++ +R E++ S G A G Y ++G+G +E G RFT W L
Sbjct: 723 TYIADSGFYLDATLRASRLENDFKVAGSDGYAVKGKYRTHGVGASLEAGRRFTHADGWFL 782

Query: 1342 TPYASLTGFTADNPEYHLSNGMESKSVDTRSIYRELGATLSYNMRLGNGMEIEPWLKAAV 1401
P A L F A Y +NG+ + S+ LG + + L G +++P++KA+V
Sbjct: 783 EPQAELAVFRAGGGAYRAANGLRVRDEGGSSVLGRLGLEVGKRIELAGGRQVQPYIKASV 842

Query: 1402 RKEFVDDNRVKVNNDGNFVNDLSGRRGIYQAGIKASFSSTLSGHLGVGYSHGAGVESPWN 1461
+EF V N + +L G R G+ A+ S + YS G + PW
Sbjct: 843 LQEFDGAGTVHTNGIAH-RTELRGTRAELGLGMAAALGRGHSLYASYEYSKGPKLAMPWT 901

Query: 1462 AVAGVNWSF 1470
AG +S+
Sbjct: 902 FHAGYRYSW 910


53  Z4011  Z4043  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z4011     1       17       3.296579    PTS system glucitol/sorbitol-specific
Z4012     1       15       2.928459    sorbitol-6-phosphate dehydrogenase
Z4013     1       17       3.188290    DNA-binding transcriptional activator GutM
Z4014     0       16       3.994047    DNA-binding transcriptional repressor SrlR
Z4015     1       17       4.438178    D-arabinose 5-phosphate isomerase
Z4017     1       18       3.867226    anaerobic nitric oxide reductase transcriptional
Z4018     0       16       3.664653    anaerobic nitric oxide reductase
Z4019     1       16       3.418158    nitric oxide reductase
Z4020     0       16       3.475366    transcriptional regulatory protein
Z4021     -2      18       2.205300    electron transport protein HydN
Z4022     -1      19       2.784237    ascBF operon repressor
Z4023     -1      22       3.309578    PTS system cellobiose/arbutin/salicin-specific
Z4024     -1      25       3.344985    cryptic 6-phospho-beta-glucosidase
Z4025     -1      29       4.707325    hydrogenase 3 maturation protease
Z4026     -1      27       5.303549    processing of large subunit (HycE) of
Z4027     -1      27       5.567179    hydrogenase activity
Z4028     0       27       5.034397    formate hydrogenlyase complex iron-sulfur
Z4029     0       25       4.575268    large subunit of hydrogenase 3 (part of FHL
Z4030     2       21       4.205007    membrane-spanning protein of hydrogenase 3 (part
Z4031     2       21       3.581226    formate hydrogenlyase subunit 3
Z4032     3       24       1.806445    small subunit of hydrogenase-3, iron-sulfur
Z4033     2       21       2.121779    formate hydrogenlyase regulatory protein HycA
Z4034     2       21       3.548268    hypothetical protein
Z4035     0       17       2.958263    hydrogenase nickel incorporation protein
Z4036     0       16       2.823631    hydrogenase nickel incorporation protein HypB
Z4037     -1      15       2.436139    hydrogenase assembly chaperone
Z4038     -2      16       3.245753    pleiotrophic effects on 3 hydrogenase isozymes
Z4039     -2      14       1.818214    plays structural role in maturation of all 3
Z4040     -1      14       1.225087    formate hydrogenlyase transcriptional activator
Z4041     0       19       2.755388    hypothetical protein
Z4042     0       18       3.268365    hypothetical protein
Z4043     -1      16       3.241619    DNA mismatch repair protein MutS
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4012     DHBDHDRGNASE  82     1e-20    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 82.0 bits (202), Expect = 1e-20
Identities = 66/257 (25%), Positives = 119/257 (46%), Gaps = 7/257 (2%)

Query: 3 QVAVVIGGGQTLGAFLCHGLAAEGYRVAVVDIQSDKAANVAQEINAEYGEGTAYGFGADA 62
++A + G Q +G + LA++G +A VD +K V + AE A F AD
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAE--ARHAEAFPADV 66

Query: 63 TSEQSVLALSRGVDEIFVRVDLLVYSAGIAKAAFISDFQLGDFDRSLQVNLVGYFLCARE 122
++ ++ ++ +D+LV AG+ + I +++ + VN G F +R
Sbjct: 67 RDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRS 126

Query: 123 FSRLMIRDGIQGRIIQINSKSGKVGSKHNSGYSAAKFGGVGLTQSLALDLAEYGITVHSL 182
S+ M D G I+ + S V + Y+++K V T+ L L+LAEY I + +
Sbjct: 127 VSKYM-MDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIV 185

Query: 183 MLGNLLKSPMFQSL-LPQYATKLGIKPDQVEQYYIDKVPLKRGCDYQDVLNMLLFYASPK 241
G+ ++ M SL + + IK +E + +PLK+ D+ + +LF S +
Sbjct: 186 SPGS-TETDMQWSLWADENGAEQVIKGS-LETFKTG-IPLKKLAKPSDIADAVLFLVSGQ 242

Query: 242 ASYCTGQSINVTGGQVM 258
A + T ++ V GG +
Sbjct: 243 AGHITMHNLCVDGGATL 259


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4014     PF07201       28     0.039    Hypersensitivity response secretion protein HrpJ
		>PF07201#Hypersensitivity response secretion protein HrpJ

Length = 293

Score = 27.5 bits (61), Expect = 0.039
Identities = 14/56 (25%), Positives = 22/56 (39%), Gaps = 5/56 (8%)

Query: 86 NNITVMTNSLHIVNALSELDNEQTILMPGGTFRKKSASFH---GQLAENAFEHFTF 138
+N++ LH N E+ + Q + G FR +S Q + E TF
Sbjct: 5 HNLSYGNTPLH--NERPEIASSQIVNQTLGQFRGESVQIVSGTLQSIADMAEEVTF 58


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4017     HTHFIS        373    e-127    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 373 bits (960), Expect = e-127
Identities = 125/388 (32%), Positives = 195/388 (50%), Gaps = 33/388 (8%)

Query: 149 IAALAAGALS----------NALLIEQLESQNMLPGDAAPFEAVKQTQMIGLSPGMTQLK 198
I A GA +I + ++ ++ ++G S M ++
Sbjct: 91 IKASEKGAYDYLPKPFDLTELIGIIGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIY 150

Query: 199 KEIEIVAASDLNVLISGETGTGKELVAKAIHEASPRAVNPLVYLNCAALPESVAESELFG 258
+ + + +DL ++I+GE+GTGKELVA+A+H+ R P V +N AA+P + ESELFG
Sbjct: 151 RVLARLMQTDLTLMITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFG 210

Query: 259 HVKGAFTGAISNRSGKFEMADNGTLFLDEIGELSLALQAKLLRVLQYGDIQRVGDDRSLR 318
H KGAFTGA + +G+FE A+ GTLFLDEIG++ + Q +LLRVLQ G+ VG +R
Sbjct: 211 HEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDMPMDAQTRLLRVLQQGEYTTVGGRTPIR 270

Query: 319 VDVRVLAATNRDLREEVLAGRFRADLFHRLSVFPLSVPPLRERGDDVILLAGYFCEQCRL 378
DVR++AATN+DL++ + G FR DL++RL+V PL +PPLR+R +D+ L +F +Q
Sbjct: 271 SDVRIVAATNKDLKQSINQGLFREDLYYRLNVVPLRLPPLRDRAEDIPDLVRHFVQQAE- 329

Query: 379 RQGLSRVVLSAGARNLLQHYSFPGNVRELEHAIHRAVVLARATRSGDEVIL-----EAQH 433
++GL A L++ + +PGNVRELE+ + R L E+I E
Sbjct: 330 KEGLDVKRFDQEALELMKAHPWPGNVRELENLVRRLTALYPQDVITREIIENELRSEIPD 389

Query: 434 FAFPEVTLPPPEVAAVPVVKQNLR-----------------EATEAFQRETIRQALAQNH 476
+ ++ V++N+R + I AL
Sbjct: 390 SPIEKAAARSGSLSISQAVEENMRQYFASFGDALPPSGLYDRVLAEMEYPLILAALTATR 449

Query: 477 HNWAACARMLETDVANLHRLAKRLGLKD 504
N A +L + L + + LG+
Sbjct: 450 GNQIKAADLLGLNRNTLRKKIRELGVSV 477


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4022     HTHTETR       28     0.035    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 28.4 bits (63), Expect = 0.035
Identities = 17/93 (18%), Positives = 29/93 (31%), Gaps = 7/93 (7%)

Query: 3 TTMLEVAKRAGVSKATVSRVLSG-----NGYVSQETKDRVFQAVEESGYRPNLLARNLSA 57
T++ E+AK AGV++ + + + +E P L
Sbjct: 32 TSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIGELELEYQAKFPGDPLSVLRE 91

Query: 58 KSTQTLGLVVTNTLYHGIYFSELLFHAARMAEE 90
L VT + E++FH E
Sbjct: 92 ILIHVLESTVTEERRRLLM--EIIFHKCEFVGE 122


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4037     TYPE4SSCAGA   27     0.012    Type IV secretion system CagA exotoxin signature.
		>TYPE4SSCAGA#Type IV secretion system CagA exotoxin signature.

Length = 1147

Score = 27.0 bits (59), Expect = 0.012
Identities = 19/75 (25%), Positives = 37/75 (49%), Gaps = 8/75 (10%)

Query: 12 IDGNQAKVD--VCGIQRDVDLTLVGSCDENGQPRVGQWVLVHVGFAMSVINEAEARDTLD 69
I GNQ + D G+ D L ++NG+P G W+ + + F + ++ ++ D +
Sbjct: 171 IIGNQIRTDQKFMGV-FDESLKERQEAEKNGEPTGGDWLDIFLSF---IFDKKQSSDVKE 226

Query: 70 ALQN--MFDVEPDVG 82
A+ + V+PD+
Sbjct: 227 AINQEPVPHVQPDIA 241


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4040     HTHFIS        389    e-131    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 389 bits (1000), Expect = e-131
Identities = 140/373 (37%), Positives = 203/373 (54%), Gaps = 39/373 (10%)

Query: 350 YQEIHRLKERLVDENLALTEQLNNVDSEFGEIIGRSEAMYSVLKQVEMVAQSDSTVLILG 409
E+ + R + E +L + + ++GRS AM + + + + Q+D T++I G
Sbjct: 108 LTELIGIIGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITG 167

Query: 410 ETGTGKELIARAIHNLSGRNNRRMVKMNCAAMPAGLLESDLFGHERGAFTGASAQRIGRF 469
E+GTGKEL+ARA+H+ R N V +N AA+P L+ES+LFGHE+GAFTGA + GRF
Sbjct: 168 ESGTGKELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRF 227

Query: 470 ELADKSSLFLDEVGDMPLELQPKLLRVLQEQEFERLGSNKIIQTDVRLIAATNRDLKKMV 529
E A+ +LFLDE+GDMP++ Q +LLRVLQ+ E+ +G I++DVR++AATN+DLK+ +
Sbjct: 228 EQAEGGTLFLDEIGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSI 287

Query: 530 ADREFRSDLYYRLNVFPIHLPPLRERPEDIPLLAKAFTFKIARRLGRNIDSIPAETLRTL 589
FR DLYYRLNV P+ LPPLR+R EDIP L + F + A + G ++ E L +
Sbjct: 288 NQGLFREDLYYRLNVVPLRLPPLRDRAEDIPDLVRHFV-QQAEKEGLDVKRFDQEALELM 346

Query: 590 SNMEWPGNVRELENVIERAVLLTRGNVLQLSL---------------------PDIALPE 628
WPGNVRELEN++ R L +V+ + +++ +
Sbjct: 347 KAHPWPGNVRELENLVRRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQ 406

Query: 629 PETPPAATVVAQEG--------------EDEYQLIVRVLKETNGVVAGPKGAAQRLGLKR 674
A G E EY LI+ L T G AA LGL R
Sbjct: 407 AVEENMRQYFASFGDALPPSGLYDRVLAEMEYPLILAALTATRGNQI---KAADLLGLNR 463

Query: 675 TTLLSRMKRLGID 687
TL +++ LG+
Sbjct: 464 NTLRKKIRELGVS 476


54  Z4071  Z4089  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z4071     0       16       3.075060    hypothetical protein
Z4072     0       16       3.355371    phosphoadenosine phosphosulfate reductase
Z4073     0       15       3.045094    sulfite reductase subunit beta
Z4074     0       15       2.347131    sulfite reductase subunit alpha
Z4075     -1      18       1.204149    6-pyruvoyl tetrahydrobiopterin synthase
Z4076     1       17       2.272015    hypothetical protein
Z4077     3       18       1.977349    hypothetical protein
Z4078     2       16       1.403822    anti-terminator regulatory protein
Z4079     1       14       0.855208    hypothetical protein
Z4080     1       13       0.231016    hypothetical protein
Z4081     0       12       -0.649899   transporter
Z4083m    -1      13       -1.220178   major facilitator superfamily permease
Z4084     -1      12       -2.715995   alkyl-dihydroxyacetonephosphate synthase
Z4085     0       15       -3.393783   oxidoreductase
Z4086     -1      16       -3.403109   transporter
Z4087     -2      20       -3.775135   kinase
Z4089     -1      20       -3.508851   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4071     HOKGEFTOXIC   54     3e-14    Hok/Gef cell toxic protein family signature.
		>HOKGEFTOXIC#Hok/Gef cell toxic protein family signature.

Length = 52

Score = 53.7 bits (129), Expect = 3e-14
Identities = 17/50 (34%), Positives = 28/50 (56%)

Query: 1 MLTKYALVAIIVLCCTVLGFTLMVGDSLCELSIRERGMEFKAVLAYESKK 50
+ + ++++C T+L FT + SLCE+ R+ E A +AYES K
Sbjct: 3 LPRSSLVWCVLIVCLTLLIFTYLTRKSLCEIRYRDGYREVAAFMAYESGK 52


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4073     PF07675       30     0.021    Cleaved Adhesin
		>PF07675#Cleaved Adhesin

Length = 1358

Score = 30.4 bits (68), Expect = 0.021
Identities = 20/92 (21%), Positives = 39/92 (42%), Gaps = 12/92 (13%)

Query: 206 ILGQTYLPRKFKTTVVIP---PQND--IDLHANDMNFVAIAENGKLVGFNLLVGGGLSIE 260
++ +P+ T +P PQN + A+ ++VAI+++G L G + G++
Sbjct: 240 VMPYRAMPKT--NTYTLPASLPQNQASYSIQASAGSYVAISKDGVLYGTGVANASGVATV 297

Query: 261 HGNK-----KTYARTASEFGYLPLEHTLAVAE 287
+ K Y + YLP+ + E
Sbjct: 298 NMTKQITENGNYDVVITRSNYLPVIKQIQAGE 329


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4083m    TCRTETB       37     2e-04    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 36.8 bits (85), Expect = 2e-04
Identities = 45/314 (14%), Positives = 112/314 (35%), Gaps = 36/314 (11%)

Query: 93 LGSLVLGWISDHIGRQKIFTFSFLLITLASFLQFFATTP-EHLIGLRILIGIGLGGDYSV 151
+G+ V G +SD +G +++ F ++ S + F + LI R + G G ++
Sbjct: 64 IGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPAL 123

Query: 152 GHTLLAEFSPRRHRGVLLGAFSVVWT----VGYVLASIAGHHFISENPEAWRWLLASAAL 207
++A + P+ +RG G + VG + + H+ W +LL +
Sbjct: 124 VMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYI------HWSYLLLIPMI 177

Query: 208 PALLITLLRWGTPESPRWLLRQGRFAEAHAIVHRYFGPHVLLGDEVATATHKHIKTLF-- 265
+ + L + R +G F I+ +L + + + L
Sbjct: 178 TIITVPFLMKLLKKEVR---IKGHFDIKGIILMSVGIVFFMLFTTSYSISFLIVSVLSFL 234

Query: 266 -SSRYWRRTA--------FNSVFFVCLVIPWFVIYT----WLPTIAQTIGLEDALTASLM 312
++ R+ ++ F+ V+ +I+ ++ + + L+ + +
Sbjct: 235 IFVKHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEI 294

Query: 313 LNALLIVGALLGLV-------LTHLLAHRKFLLGSFLLLAATLVVMACLPSGSSLTLLLF 365
+ ++ G + ++ L L L+ + + + L +S + +
Sbjct: 295 GSVIIFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTII 354

Query: 366 VLFSTTISAVSNLV 379
++F + + V
Sbjct: 355 IVFVLGGLSFTKTV 368


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4085     DHBDHDRGNASE  105    2e-29    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 105 bits (264), Expect = 2e-29
Identities = 72/257 (28%), Positives = 116/257 (45%), Gaps = 11/257 (4%)

Query: 36 MDFFSLKGKTAIVTGGNSGLGQAFAMALAKAGANVFIPSFVKDNGETKEMIEN-QGVEVD 94
M+ ++GK A +TG G+G+A A LA GA++ + + E + +
Sbjct: 1 MNAKGIEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAE 60

Query: 95 FMQVDITAEGAPQKIIAACCERFGTVDILVNNAGICKLNKVLDFGRADWDPMIDVNLTAA 154
D+ A +I A G +DILVN AG+ + + +W+ VN T
Sbjct: 61 AFPADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGV 120

Query: 155 FELSYEAAKIMIPQKSGKIINICSLFSYLGGQWSPAYSATKHALAGFTKAYCDELGQYNI 214
F S +K M+ ++SG I+ + S + + AY+++K A FTK EL +YNI
Sbjct: 121 FNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNI 180

Query: 215 QVNGIAPGYYATDI--TLATRSNPETNQRVLDH-------IPANRWGDTQDLMGAAVFLA 265
+ N ++PG TD+ +L N Q + IP + D+ A +FL
Sbjct: 181 RCNIVSPGSTETDMQWSLWADENGAE-QVIKGSLETFKTGIPLKKLAKPSDIADAVLFLV 239

Query: 266 SQASNYVNGHLLVVDGG 282
S + ++ H L VDGG
Sbjct: 240 SGQAGHITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4086     TCRTETA       29     0.029    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 29.4 bits (66), Expect = 0.029
Identities = 21/103 (20%), Positives = 45/103 (43%), Gaps = 8/103 (7%)

Query: 48 GLIMSTFGIAAIILYAPSGVIADKFSHRKMITSAMIITGLLGLLMATYPPLWVMLCIQVA 107
G++++ + + G ++D+F R ++ ++ + +MAT P LWV+ ++
Sbjct: 46 GILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIV 105

Query: 108 FAITTILMLWSVSIKAASLLGD---HSEQGKIMGWMEGLRGVG 147
IT + A + + D E+ + G+M G G
Sbjct: 106 AGITG-----ATGAVAGAYIADITDGDERARHFGFMSACFGFG 143


55  Z4162  Z4199  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z4162     1       19       -4.924996   2-deoxy-D-gluconate 3-dehydrogenase
Z4163     1       24       -6.716016   5-keto-4-deoxyuronate isomerase
Z4164     0       31       -8.640214   acetyl-CoA acetyltransferase
Z4165     1       36       -12.954868  transporter protein
Z4166     3       44       -16.368398  hypothetical protein
Z4167     3       45       -17.185687  sensory transducer
Z4168     7       50       -18.368165  hypothetical protein
Z4169     7       51       -18.440521  hypothetical protein
Z4170     6       51       -18.344692  hypothetical protein
Z4171     6       51       -17.512650  hypothetical protein
Z4172     5       49       -17.439386  hypothetical protein
Z4173     6       49       -17.489650  transcriptional regulator
Z4174     5       49       -18.314174  hypothetical protein
Z4175     6       49       -17.328701  hypothetical protein
Z4176     7       51       -16.670212  2-component transcriptional regulator
Z4177     7       52       -17.568121  hypothetical protein
Z4178     7       53       -17.252862  hypothetical protein
Z4179     7       53       -17.091142  hypothetical protein
Z4180     7       52       -17.238098  lipoprotein of type III secretion apparatus
Z4181     6       50       -17.152934  Type III secretion apparatus protein
Z4182     6       50       -16.751594  hypothetical protein
Z4183     5       51       -15.718576  hypothetical protein
Z4184     4       49       -15.941801  hypothetical protein
Z4185     5       50       -15.641211  surface presentation of antigens protein SpaS
Z4186     3       49       -14.257281  integral membrane protein-component of typeIII
Z4187     4       48       -14.758158  type III secretion apparatus protein
Z4188     3       46       -15.282557  type III secretion apparatus protein
Z4189     3       43       -12.717877  surface presentation of antigens protein SpaP
Z4190     4       43       -12.735556  surface presentation of antigens protein SpaO
Z4191     3       42       -12.676111  type III secretion apparatus protein
Z4192     3       42       -12.620721  hypothetical protein
Z4193     3       43       -12.546967  type III secretion apparatus protein
Z4194     3       43       -12.522354  ATP synthase SpaL
Z4195     5       44       -13.578871  type III secretion apparatus protein
Z4196     4       42       -12.937800  hypothetical protein
Z4197     3       39       -10.716139  type III secretion apparatus protein
Z4198     -2      25       -5.154562   regulatory protein for type III secretion
Z4199     -2      19       -3.035135   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4162     DHBDHDRGNASE  69     3e-16    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 68.5 bits (167), Expect = 3e-16
Identities = 39/163 (23%), Positives = 82/163 (50%), Gaps = 9/163 (5%)

Query: 37 IVGINIVEEFSEKDWDDVMNLNIKSVFFMSQAAAKHFIAQGNGGKIINIASMLSFQGGIR 96
++ ++ S+++W+ ++N VF S++ +K+ + + G I+ + S +
Sbjct: 95 VLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVSKYMMDR-RSGSIVTVGSNPAGVPRTS 153

Query: 97 VPSYTASKSGVMGVTRLMANEWAKHNINVNAIAPGYMATNNTQQLRADEQRSAEILD--- 153
+ +Y +SK+ + T+ + E A++NI N ++PG T+ L ADE + +++
Sbjct: 154 MAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSPGSTETDMQWSLWADENGAEQVIKGSL 213

Query: 154 -----RIPAGRWGLPSDLMGPVVFLASSASDYVNGYTIAVDGG 191
IP + PSD+ V+FL S + ++ + + VDGG
Sbjct: 214 ETFKTGIPLKKLAKPSDIADAVLFLVSGQAGHITMHNLCVDGG 256



Score = 34.6 bits (79), Expect = 1e-04
Identities = 17/47 (36%), Positives = 27/47 (57%)

Query: 3 LSAFSLEGKVAVVTGCDTGLGQGMALGLAQAGCDIVGINIVEEFSEK 49
++A +EGK+A +TG G+G+ +A LA G I ++ E EK
Sbjct: 1 MNAKGIEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEK 47


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4172     SYCDCHAPRONE  75     1e-19    Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD

chaperone signature.
Length = 168

Score = 75.0 bits (184), Expect = 1e-19
Identities = 25/143 (17%), Positives = 58/143 (40%), Gaps = 5/143 (3%)

Query: 22 ALSKGENLALLHGLTPDILDRIYAYAFDYHEKGNVTDAEIYYKLLCIYAFENHEYLKGFA 81
L G +A+L+ ++ D L+++Y+ AF+ ++ G DA ++ LC+ + + G
Sbjct: 18 FLKGGGTIAMLNEISSDTLEQLYSLAFNQYQSGKYEDAHKVFQALCVLDHYDSRFFLGLG 77

Query: 82 SVCQSKKKYQQAYDLYKLSYNYSPYDDYSVIYRMGQCQIGAKNIDNAMQCFYH----IIN 137
+ Q+ +Y A Y + + +C + + A + I +
Sbjct: 78 ACRQAMGQYDLAIHSYSYGAIMDI-KEPRFPFHAAECLLQKGELAEAESGLFLAQELIAD 136

Query: 138 NCEDASVKSKAQAYIELLTDNSE 160
E + ++ + +E + E
Sbjct: 137 KTEFKELSTRVSSMLEAIKLKKE 159


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4180     FLGMRINGFLIF  35     3e-04    Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 34.6 bits (79), Expect = 3e-04
Identities = 22/126 (17%), Positives = 49/126 (38%), Gaps = 5/126 (3%)

Query: 4 ISLLLFILLLCGCKQQE-LLNHLDQQQANDVLAVLQRHNINAEKKDQGKTGFSIYVEPTD 62
+++++ ++L L ++L Q ++A L + NI + I V
Sbjct: 35 VAIVVAMVLWAKTPDYRTLFSNLSDQDGGAIVAQLTQMNIPYRFANGSGA---IEVPADK 91

Query: 63 FASAVDWLKIYNLPGKPDIQISQMFPADALVSSPRAEKARLYSAIEQRLEQSLKIMDGIV 122
L LP + + + S +E+ A+E L ++++ + +
Sbjct: 92 VHELRLRLAQQGLPKGGAVGFE-LLDQEKFGISQFSEQVNYQRALEGELARTIETLGPVK 150

Query: 123 SSRVHV 128
S+RVH+
Sbjct: 151 SARVHL 156


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4185     TYPE3IMSPROT  310    e-106    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein

family signature.
Length = 354

Score = 310 bits (796), Expect = e-106
Identities = 112/340 (32%), Positives = 185/340 (54%), Gaps = 5/340 (1%)

Query: 2 ANKTEKPTQKKLQDASKKGQILKSRDLTVSVIMLVG--TLYLGYVFDVHHIMSILEYILD 59
KTE+PT KK++DA KKGQ+ KS+++ + +++ L + H ++ +
Sbjct: 3 GEKTEQPTPKKIRDARKKGQVAKSKEVVSTALIVALSAMLMGLSDYYFEHFSKLMLIPAE 62

Query: 60 HNAKPDIWD---YFKAMGIGWLKTIIPFLLVCMFTTILVSWFQSKMQLATEAVKLKFDSL 116
+ P + + + P L V I Q ++ EA+K +
Sbjct: 63 QSYLPFSQALSYVVDNVLLEFFYLCFPLLTVAALMAIASHVVQYGFLISGEAIKPDIKKI 122

Query: 117 NPVNGLKRIFGLKTVKEFVKAILYIIFFALEIKVFWSNHKSLLFKTLDGDIISLLSDWGE 176
NP+ G KRIF +K++ EF+K+IL ++ ++ I + + L + I + G+
Sbjct: 123 NPIEGAKRIFSIKSLVEFLKSILKVVLLSILIWIIIKGNLVTLLQLPTCGIECITPLLGQ 182

Query: 177 MLFLLILYCLGSMIIVLIFDFIAEYFLFMKDMKMDKQEVKREYKEQEGNPEIKSKRRERH 236
+L L++ C +++ I D+ EY+ ++K++KM K E+KREYKE EG+PEIKSKRR+ H
Sbjct: 183 ILRQLMVICTVGFVVISIADYAFEYYQYIKELKMSKDEIKREYKEMEGSPEIKSKRRQFH 242

Query: 237 QEILSEQLKSDVSNSRLMIANPTHIAIGIYFKPHLSPIPLISVRETNEVALAVRKYAKEI 296
QEI S ++ +V S +++ANPTHIAIGI +K +P+PL++ + T+ VRK A+E
Sbjct: 243 QEIQSRNMRENVKRSSVVVANPTHIAIGILYKRGETPLPLVTFKYTDAQVQTVRKIAEEE 302

Query: 297 GIPIITDKKLARKIYATHRRYDYVSFENIDEILRLLLWLE 336
G+PI+ LAR +Y Y+ E I+ +L WLE
Sbjct: 303 GVPILQRIPLARALYWDALVDHYIPAEQIEATAEVLRWLE 342


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4186     TYPE3IMRPROT  44     4e-09    Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein

family signature.
Length = 261

Score = 44.4 bits (105), Expect = 4e-09
Identities = 14/54 (25%), Positives = 25/54 (46%)

Query: 1 MTHTIVYASPVIAVMLGGEAVLGLLARYASQLNAFAISLTVKSALAFLILIIYF 54
+ ++ A P+I ++L LGLL R A QL+ F I + + ++
Sbjct: 180 FLNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALM 233


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4187     TYPE3IMRPROT  87     6e-24    Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein

family signature.
Length = 261

Score = 87.5 bits (217), Expect = 6e-24
Identities = 30/150 (20%), Positives = 65/150 (43%)

Query: 1 MGEVILYQLHSLLAATALGFCRLAPTFYLLPFFASGNIPTVVRHPIIIVVSCALVQHYHY 60
M +V Q S L R+ P + ++P V+ + ++++ A+
Sbjct: 1 MLQVTSEQWLSWLNLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA 60

Query: 61 ELSTLNEIDIALLAAREIIIGLFIACLLASPFWIFLAIGSFIDNQRGATLSSTLDPATGV 120
+ LA ++I+IG+ + + F G I Q G + ++ +DPA+ +
Sbjct: 61 NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHL 120

Query: 121 DTSELARLFNLFSAAVYLTNGGLNFILETL 150
+ LAR+ ++ + ++LT G +++ L
Sbjct: 121 NMPVLARIMDMLALLLFLTFNGHLWLISLL 150


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4188     TYPE3IMQPROT  79     4e-23    Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein

family signature.
Length = 86

Score = 78.7 bits (194), Expect = 4e-23
Identities = 59/86 (68%), Positives = 73/86 (84%)

Query: 1 MDDIVFAGNRALYLILVMSAGPIAVATFVGLLVGLFQTVTQLQEQTLPFGVKLLCVSICF 60
MDD+VFAGN+ALYL+L++S P VAT +GLLVGLFQTVTQLQEQTLPFG+KLL V +C
Sbjct: 1 MDDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCL 60

Query: 61 FLMSGWYGEKLYSFGIEMLNLAFARG 86
FL+SGWYGE L S+G +++ LA A+G
Sbjct: 61 FLLSGWYGEVLLSYGRQVIFLALAKG 86


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4189     TYPE3IMPPROT  226    2e-77    Type III secretion system inner membrane P protein ...
		>TYPE3IMPPROT#Type III secretion system inner membrane P protein

family signature.
Length = 224

Score = 226 bits (577), Expect = 2e-77
Identities = 151/223 (67%), Positives = 181/223 (81%), Gaps = 5/223 (2%)

Query: 1 MSNSISLIAILSLFTLLPFIIASGTCFIKFSIVFVIVRNALGLQQVPSNMTLNGVALLLS 60
M N ISLIA+L+ TLLPFIIASGTCF+KFSIVFV+VRNALGLQQ+PSNMTLNGVALLLS
Sbjct: 1 MGNDISLIALLAFSTLLPFIIASGTCFVKFSIVFVMVRNALGLQQIPSNMTLNGVALLLS 60

Query: 61 MFVMMPVGKEIYYNSQNENLSFNNVASVVNFVETGMSGYKSYLIKYSEPELVSFFEKIQK 120
MFVM P+ + Y ++E+++FN+++S+ V+ G+ GY+ YLIKYS+ ELV FFE Q
Sbjct: 61 MFVMWPIMHDAYVYFEDEDVTFNDISSLSKHVDEGLDGYRDYLIKYSDRELVQFFENAQL 120

Query: 121 VNSSEDNEEIIDDD-----NISIFSLLPAYALSEIKSAFIIGFYIYLPFVVVDLVISSVL 175
+ E + D SIF+LLPAYALSEIKSAF IGFY+YLPFVVVDLV+SSVL
Sbjct: 121 KRQYGEETETVKRDKDEIEKPSIFALLPAYALSEIKSAFKIGFYLYLPFVVVDLVVSSVL 180

Query: 176 LTLGMMMMSPVTISTPIKLILFVAMDGWTMLSKGLILQYFDLS 218
L LGMMMMSPVTISTPIKL+LFVA+DGWT+LSKGLILQY D++
Sbjct: 181 LALGMMMMSPVTISTPIKLVLFVALDGWTLLSKGLILQYMDIA 223


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4190     TYPE3OMOPROT  156    1e-47    Type III secretion system outer membrane O protein ...
		>TYPE3OMOPROT#Type III secretion system outer membrane O protein

family signature.
Length = 303

Score = 156 bits (395), Expect = 1e-47
Identities = 91/292 (31%), Positives = 136/292 (46%), Gaps = 13/292 (4%)

Query: 35 KENGEDVALLMPEFSAKWLPIAEESGSWSGWVLLREIFPLISAELAGMALMPETERLIGE 94
+ +G + L P W+ +++ WS W+ + +S LAG A+ E L+
Sbjct: 23 QRHGREATLEYPTRQGMWVRLSDAEKRWSAWIKPGDWLEHVSPALAGAAVSAGAEHLVVP 82

Query: 95 WLSLSSSPLNLKYPELKYNRLCVGKVFDGVLSPAQPLIRIWTGELNLWLDKVTVCQYENA 154
WL+ + P L P L RLCV G P L+ I + LW + +
Sbjct: 83 WLAATERPFELPVPHLSCRRLCVENPVPGSALPEGKLLHIMSDRGGLWFEHLPELPAVGG 142

Query: 155 PTLDKKSLYWPIHFVIGFSKTCYRTIVDIEVGDVLLISNNMAYAVIYNTKICDLIYPEEL 214
K L WP+ FVIG S T + I +GDVLLI + A +Y
Sbjct: 143 GRP--KMLRWPLRFVIGSSDTQRSLLGRIGIGDVLLIRTSRA-----------EVYCYAK 189

Query: 215 KMADHFQYEEDFETDDFDIKKSESEIYDENDEQMINSFEELPVKIEFVLGKKIMNLYEID 274
K+ + E + DI+ E E + + +LPVK+EFVL +K + L E++
Sbjct: 190 KLGHFNRVEGGIIVETLDIQHIEEENNTTETAETLPGLNQLPVKLEFVLYRKNVTLAELE 249

Query: 275 ELCAKRIISLLPESEKNIEIRVNGALTGYGELVEVDDKLGVEIHSWLSGHNN 326
+ ++++SL +E N+EI NG L G GELV+++D LGVEIH WLS N
Sbjct: 250 AMGQQQLLSLPTNAELNVEIMANGVLLGNGELVQMNDTLGVEIHEWLSESGN 301


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4191     SSPANPROTEIN  49     2e-09    Salmonella invasion protein InvJ signature.
		>SSPANPROTEIN#Salmonella invasion protein InvJ signature.

Length = 336

Score = 49.4 bits (117), Expect = 2e-09
Identities = 31/75 (41%), Positives = 44/75 (58%), Gaps = 3/75 (4%)

Query: 105 ENELTYQFQRWGQNHTVRILESSEG-IRLKPSDTLVSDRLHEAQHNDVTAQRWVLTEQDE 163
++ LTY+FQRWG +++V I G L PS+T V RLH+ N QRW LT +D+
Sbjct: 260 DSSLTYRFQRWGNDYSVNIQARQAGEFSLIPSNTQVEHRLHDQWQNG-NPQRWHLT-RDD 317

Query: 164 RQGQRHQPHEEQENE 178
+Q + Q H +Q E
Sbjct: 318 QQNPQQQQHRQQSGE 332


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4193     SSPAMPROTEIN  35     2e-05    Salmonella surface presentation of antigen gene typ...
		>SSPAMPROTEIN#Salmonella surface presentation of antigen gene type M

signature.
Length = 147

Score = 34.7 bits (79), Expect = 2e-05
Identities = 31/101 (30%), Positives = 56/101 (55%)

Query: 2 QLKNLQSLLDMKELLGEVVFRQDIFYSLRKVTVIQQQIAEINLEKQKIAERRKILNKEIV 61
Q+ L+ LLD + R++I+ LRK +++++QI ++ L+ +I E+R L K+
Sbjct: 45 QIAGLKLLLDTLRAENRQLSREEIYALLRKQSIVRRQIKDLELQIIQIQEKRSELEKKRE 104

Query: 62 QQQAQRKHWWLKGEKYDRLKKRIKKQLLNQMLYQDELEQEE 102
+ Q + K+W K Y R R K+ + + + Q+E E EE
Sbjct: 105 EFQEKSKYWLRKEGNYQRWIIRQKRLYIQREIQQEEAESEE 145


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4195     VACCYTOTOXIN  31     0.019    Helicobacter pylori vacuolating cytotoxin signature.
		>VACCYTOTOXIN#Helicobacter pylori vacuolating cytotoxin signature.

Length = 1291

Score = 31.2 bits (70), Expect = 0.019
Identities = 18/60 (30%), Positives = 31/60 (51%), Gaps = 3/60 (5%)

Query: 597 EIEDRIRDGVRPTAGGTFLNLDASEAEMILDNFKLAL---SGINIPIKDIILLGSVDIRR 653
EI +R+ G A T L L ASE +N +++L + +N+ + L+G+V + R
Sbjct: 202 EINNRVGSGAGRKASSTVLTLQASEGITSRENAEISLYDGATLNLASNSVKLMGNVWMGR 261


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4196     INVEPROTEIN   240    2e-78    Salmonella/Shigella invasion protein E (InvE) signat...
		>INVEPROTEIN#Salmonella/Shigella invasion protein E (InvE)

signature.
Length = 372

Score = 240 bits (613), Expect = 2e-78
Identities = 128/321 (39%), Positives = 195/321 (60%)

Query: 14 AREVSRLEDIITEDNEDIEAEMPKMRDDPAGKEARFLQATDEMSAALTQFMKKKIYEEQL 73
+R+ S + D + E + P + +F+Q+TDEMSAAL QF ++ YE++
Sbjct: 16 SRQASHQDATQHTDAQQAEIQQAAEDSSPGAEVQKFVQSTDEMSAALAQFRNRRDYEKKS 75

Query: 74 ANFLDGEEYVLEDQPIEKTDKVMEALKAATTHDYEVYSFAKKLFPDESDLVVVLRAILRK 133
+N + E VLED+ + K ++++ + + A+ LFPD SDLV+VLR +LR+
Sbjct: 76 SNLSNSFERVLEDEALPKAKQILKLISVHGGALEDFLRQARSLFPDPSDLVLVLRELLRR 135

Query: 134 KQISENVRLNAEALLRKVNQETTKKFINSGINSALKAKLFGQALSLNPKLLRASYRQFLM 193
K + E VR E+LL+ V ++T K + +GIN ALKA+LFG+ LSL P LLRASYRQF+
Sbjct: 136 KDLEEIVRKKLESLLKHVEEQTDPKTLKAGINCALKARLFGKTLSLKPGLLRASYRQFIQ 195

Query: 194 AEDDAVDTYVEWIGSYGYQNRMLVTKFIKETLFSDINALDASCSSLEFGMFLNKLSQLLS 253
+E V+ Y +WI SYGYQ R++V FI+ +L +DI+A DASCS LEFG L +L+QL
Sbjct: 196 SESHEVEIYSDWIASYGYQRRLVVLDFIEGSLLTDIDANDASCSRLEFGQLLRRLTQLKM 255

Query: 254 LQSAEALFLKTLMNNPIIKKFISAEDYWIFFLISLIKFPETAEELLNNALVTLPADANYK 313
L+SA+ LF+ TL++ K F + E W+ ++SL++ P + LL + + ++K
Sbjct: 256 LRSADLLFVSTLLSYSFTKAFNAEESSWLLLMLSLLQQPHEVDSLLADIIGLNALLLSHK 315

Query: 314 DKTLLLKAIYSGCTNLPFSLF 334
+ L+ Y C +P SLF
Sbjct: 316 EHASFLQIFYQVCKAIPSSLF 336


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4197     TYPE3OMGPROT  448    e-154    Type III secretion system outer membrane G protein ...
		>TYPE3OMGPROT#Type III secretion system outer membrane G protein

family signature.
Length = 607

Score = 448 bits (1155), Expect = e-154
Identities = 158/536 (29%), Positives = 271/536 (50%), Gaps = 54/536 (10%)

Query: 34 YVANKENLRSFFETVSSYAGKPTIVSKLAMKKQISGNFDLTEPYALIERLSAQMGLIWYD 93
YVA E+LR + +VS K +SG F+ P ++ +++ L+WY
Sbjct: 38 YVAKGESLRDLLTDFGANYDATVVVSDKINDK-VSGQFEHDNPQDFLQHIASLYNLVWYY 96

Query: 94 DGKAIYIYDSSEMRNALINLRKVSTNEFNNFLKKSGLYNSRYEIKGD-GNGTFYVSGPPV 152
DG +YI+ +SE+ + LI L++ E L++SG++ R+ + D N YVSGPP
Sbjct: 97 DGNVLYIFKNSEVASRLIRLQESEAAELKQALQRSGIWEPRFGWRPDASNRLVYVSGPPR 156

Query: 153 YVDLVVNAAKLMEQNSD--GIEIGRNKVGIIHLVNTFVNDRTYELRGEKIVIPGMAKVLS 210
Y++LV A +EQ + + G + I L +DRT R +++ PG+A +L
Sbjct: 157 YLELVEQTAAALEQQTQIRSEKTGALAIEIFPLKYASASDRTIHYRDDEVAAPGVATILQ 216

Query: 211 TLLNNNIKQSTGVNVLSEISSRQQLKNVSRMPPFPGAEEDDDLQVEKIISTAGAPETDDI 270
+L++ ++ QQ+ ++ P A +
Sbjct: 217 RVLSD--------------ATIQQVTVDNQRIPQ-----------------AATRASAQA 245

Query: 271 QIIAYPDTNSLLVKGTVSQVDFIEKLVATLDIPKRHIELSLWIIDIDKTDLEQLGADWSG 330
++ A P N+++V+ + ++ ++L+ LD P IE++L I+DI+ L +LG DW
Sbjct: 246 RVEADPSLNAIIVRDSPERMPMYQRLIHALDKPSARIEVALSIVDINADQLTELGVDWRV 305

Query: 331 TIKIGSSLSASFNNSG----------SISTLDG---TQFIATIQALAQKRRAAVVARPVV 377
I+ G++ +G S +D +A + L + A VV+RP +
Sbjct: 306 GIRTGNNHQVVIKTTGDQSNIASNGALGSLVDARGLDYLLARVNLLENEGSAQVVSRPTL 365

Query: 378 LTQENIPAIFDNNRTFYTKLVGERTAELDEVTYGTMISVLPRFAARN---QIELLLNIED 434
LTQEN A+ D++ T+Y K+ G+ AEL +TYGTM+ + PR + +I L L+IED
Sbjct: 366 LTQENAQAVIDHSETYYVKVTGKEVAELKGITYGTMLRMTPRVLTQGDKSEISLNLHIED 425

Query: 435 GNEINSDKTNVDDLPQVGRTLISTIARVPQGKSLLIGGYTRDTNTYESRKIPILGSIPFI 494
GN+ + + ++ +P + RT++ T+ARV G+SL+IGG RD + K+P+LG IP+I
Sbjct: 426 GNQ-KPNSSGIEGIPTISRTVVDTVARVGHGQSLIIGGIYRDELSVALSKVPLLGDIPYI 484

Query: 495 GKLFGYEGTNANNIVRVFLIEPREIDERMMNNANEAAVDARAITQQMAKNKEINDE 550
G LF + VR+F+IEPR IDE + ++ A + + + + EI+++
Sbjct: 485 GALFRRKSELTRRTVRLFIIEPRIIDEGIAHHL--ALGNGQDLRTGILTVDEISNQ 538


56  Z4313  Z4346  Y        Y        Y        Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z4313     2       25       0.797395    *pathogenicity island integrase
Z4314     2       28       2.125831    hypothetical protein
Z4315     3       26       -0.205752   hypothetical protein
Z4316     4       27       0.388904    hypothetical protein
Z4317     4       28       0.697376    hypothetical protein
Z4318     6       30       -0.792670   hypothetical protein
Z4320     5       32       -1.080141   hypothetical protein
Z4321     5       41       -9.181677   PagC-like membrane protein
Z4322     5       42       -10.573494  hypothetical protein
Z4323     7       43       -12.036880  hypothetical protein
Z4324     5       42       -12.309500  transposase
Z4325     7       46       -15.341872  hypothetical protein
Z4326     7       46       -15.906831  enterotoxin
Z4328     6       37       -12.321709  hypothetical protein
Z4329     4       32       -10.360306  hypothetical protein
Z4330     2       26       -7.836040   transposase
Z4332     2       20       -4.241748   cytotoxin
Z4333     1       18       0.148275    cytotoxin
Z4334     3       19       4.621335    IS629 transposase
Z4335     -1      16       3.818572    hypothetical protein
Z4336     -1      15       3.100091    hypothetical protein
Z4337     -2      14       2.835164    hypothetical protein
Z4338     -2      12       1.734270    hypothetical protein
Z4340     -2      13       1.944404    hypothetical protein
Z4341     -3      15       1.576889    transporter
Z4342     -1      16       2.134827    bifunctional glutathionylspermidine
Z4343     0       19       2.532209    glutathione S-transferase
Z4344     3       25       2.630054    hydrogenase 2 accessory protein HypG
Z4345     3       26       2.755915    hydrogenase nickel incorporation protein HybF
Z4346     2       25       2.663031    hydrogenase 2-specific chaperone
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4315     HTHTETR       28     0.037    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 27.7 bits (61), Expect = 0.037
Identities = 12/50 (24%), Positives = 23/50 (46%), Gaps = 9/50 (18%)

Query: 3 SQTKKDIPCFRSYLPDALRLRFE---DKLTIRAIAQRLGLSHSTIHTLFQ 49
+T++ I L ALRL + ++ IA+ G++ I+ F+
Sbjct: 10 QETRQHI------LDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFK 53


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4321     ENTEROVIROMP  121    1e-37    Enterobacterial virulence outer membrane protein si...
		>ENTEROVIROMP#Enterobacterial virulence outer membrane protein

signature.
Length = 171

Score = 121 bits (305), Expect = 1e-37
Identities = 61/178 (34%), Positives = 89/178 (50%), Gaps = 26/178 (14%)

Query: 2 SGSRLADNHTLSAGYAQSKVQDFKN-IKGVNLQYRYEWD-SPVSVVGSFSYMKGDWADSH 59
+G+ +A T++ GYAQS Q N + G NL+YRYE D SP+ V+GSF+Y +
Sbjct: 18 AGTSVAATSTVTGGYAQSDAQGQMNKMGGFNLKYRYEEDNSPLGVIGSFTYTEKS----- 72

Query: 60 RDEADDFYRHQADIKYYSFLAGPAYRLNDYISFYGLVGISHTKAKGDYEWRNSVGADESD 119
R + Y +YY AGPAYR+ND+ S YG+VG+ + K + E
Sbjct: 73 RTASSGDY---NKNQYYGITAGPAYRINDWASIYGVVGVGYGKFQ----------TTEYP 119

Query: 120 GYLSESVSKKSTDFAYAAGVIINPWGNMSVNVGYEGTKADIYGKHSVNGFTVGVGYRF 177
Y F+Y AG+ NP N++++ YE ++ V + GVGYRF
Sbjct: 120 TY---KHDTSDYGFSYGAGLQFNPMENVALDFSYEQSRI---RSVDVGTWIAGVGYRF 171


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4323     CHLAMIDIAOMP  27     0.043    Chlamydia major outer membrane protein signature.
		>CHLAMIDIAOMP#Chlamydia major outer membrane protein signature.

Length = 393

Score = 27.3 bits (60), Expect = 0.043
Identities = 15/46 (32%), Positives = 22/46 (47%), Gaps = 1/46 (2%)

Query: 56 LRSLLPAVTFFFFTGTFFSLVLNTVGVRGTTPALLFDKIDWQGAVG 101
++ LL +V F + SL VG P+L+ D I W+G G
Sbjct: 1 MKKLLKSVLVFAALSSASSLQALPVG-NPAEPSLMIDGILWEGFGG 45


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4341     TYPE3IMSPROT  38     7e-05    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein

family signature.
Length = 354

Score = 37.8 bits (88), Expect = 7e-05
Identities = 26/194 (13%), Positives = 58/194 (29%), Gaps = 40/194 (20%)

Query: 12 TGLLLLLALAFVLFYEAINGFHDTANAVATVIY------TRAMQPQLAVVMAAFFNFFGV 65
L++AL+ +L + F + + ++A+ + V+ FF
Sbjct: 30 VSTALIVALSAMLMGLSDYYFEHFSKLMLIPAEQSYLPFSQALSYVVDNVLLEFFYLCFP 89

Query: 66 LLGGLSVAYAIVHML-------------------PTDLLLNMGSTHGLAMVFSMLLAAII 106
LL ++ H++ P + + S L +L ++
Sbjct: 90 LLTVAALMAIASHVVQYGFLISGEAIKPDIKKINPIEGAKRIFSIKSLVEFLKSILKVVL 149

Query: 107 WNLGTWFFGLPASSSHTLIGAIIGIGLTNALLTGSSVMDALNLREVTKIFSSLIVSPIVG 166
++ W + ++ L + L + +I L+V VG
Sbjct: 150 LSILIWIIIKG------NLVTLLQ-------LPTCGIECITPL--LGQILRQLMVICTVG 194

Query: 167 LVIAGGLIFLLRRY 180
V+ + Y
Sbjct: 195 FVVISIADYAFEYY 208


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4346     DPTHRIATOXIN  28     0.020    Diphtheria toxin signature.
		>DPTHRIATOXIN#Diphtheria toxin signature.

Length = 567

Score = 27.8 bits (61), Expect = 0.020
Identities = 15/48 (31%), Positives = 24/48 (50%), Gaps = 1/48 (2%)

Query: 92 MTFTVGELDGVSQYLSCSLMSPLSHSMSIEEG-QRLTDDCARMILSLP 138
+ V + + + L SL PL + EE +R D +R++LSLP
Sbjct: 124 LALKVDNAETIKKELGLSLTEPLMEQVGTEEFIKRFGDGASRVVLSLP 171


57  Z4466  Z4472  Y        Y        N        Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z4466     -1      14       -3.322687   formate acetyltransferase 3
Z4467     -1      20       -6.731210   propionate/acetate kinase
Z4468     1       26       -10.544976  threonine/serine transporter TdcC
Z4469     1       29       -9.961277   threonine dehydratase
Z4470     0       27       -8.972702   DNA-binding transcriptional activator TdcA
Z4471     1       24       -7.290911   DNA-binding transcriptional activator TdcR
Z4472     -1      18       -5.095456   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4467     ACETATEKNASE  533    0.0      Acetate kinase family signature.
		>ACETATEKNASE#Acetate kinase family signature.

Length = 400

Score = 533 bits (1375), Expect = 0.0
Identities = 172/397 (43%), Positives = 253/397 (63%), Gaps = 11/397 (2%)

Query: 11 VLVINCGSSSIKFSVLDASDCEVLMSGIADGINSENAFLSVN-GGEPAP--LAHHSYEGA 67
+LVINCGSSS+K+ ++++ D VL G+A+ I ++ L+ N GE ++ A
Sbjct: 3 ILVINCGSSSLKYQLIESKDGNVLAKGLAERIGINDSLLTHNANGEKIKIKKDMKDHKDA 62

Query: 68 LKAIAFELEKRNLN-----DSVALIGHRIAHGGSIFTESAIITDEVIDNIRRVSPLAPLH 122
+K + L + + +GHR+ HGG FT S +ITD+V+ I LAPLH
Sbjct: 63 IKLVLDALVNSDYGVIKDMSEIDAVGHRVVHGGEYFTSSVLITDDVLKAITDCIELAPLH 122

Query: 123 NYANLSGIESAQQLFPGVTQVAVFDTSFHQTMAPEAYLYGLPWKYYEELGVRRYGFHGTS 182
N AN+ GI++ Q+ P V VAVFDT+FHQTM AYLY +P++YY + +R+YGFHGTS
Sbjct: 123 NPANIEGIKACTQIMPDVPMVAVFDTAFHQTMPDYAYLYPIPYEYYTKYKIRKYGFHGTS 182

Query: 183 HRYVSLRAHSLLNLAEDDSGLVVAHLGNGASICAVRNGQSVDTSMGMTPLEGLMMGTRSG 242
H+YVS RA +LN + ++ HLGNG+SI AV+NG+S+DTSMG TPLEGL MGTRSG
Sbjct: 183 HKYVSQRAAEILNKPIESLKIITCHLGNGSSIAAVKNGKSIDTSMGFTPLEGLAMGTRSG 242

Query: 243 DVDFGAMSWVASQTNQSLGDLERVVNKESGLLGISGLSSDLR-VLEKAWHEGHERAQLAI 301
+D +S++ + N S ++ ++NK+SG+ GISG+SSD R + + A+ G +RAQLA+
Sbjct: 243 SIDPSIISYLMEKENISAEEVVNILNKKSGVYGISGISSDFRDLEDAAFKNGDKRAQLAL 302

Query: 302 KTFVHRIARHIAGHAASLRRLDGIIFTGGIGENSSLIRRLVMEHLAVLGVEIDTEMNNRS 361
F +R+ + I +AA++ +D I+FT GIGEN IR +++ L LG ++D E N
Sbjct: 303 NVFAYRVKKTIGSYAAAMGGVDVIVFTAGIGENGPEIREFILDGLEFLGFKLDKEKNKVR 362

Query: 362 NSFGERIVSSENARVICAVIPTNEEKMIALDAIHLGK 398
E I+S+ +++V V+PTNEE MIA D + +
Sbjct: 363 GE--EAIISTADSKVNVMVVPTNEEYMIAKDTEKIVE 397


58  Z4484  Z4528  Y        Y        Y        Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z4484     0       14       3.035418    tagatose 6-phosphate kinase 2
Z4485     0       15       3.336079    PTS system N-acetylgalactosamine-specific
Z4486     0       15       3.346274    phosphotransferase system enzyme subunit
Z4487     0       14       2.216466    phosphotransferase system enzyme subunit
Z4488     0       14       2.426748    phosphotransferase system enzyme subunit
Z4489     -1      15       1.735876    N-acetylgalactosamine-6-phosphate deacetylase
Z4490     -2      16       0.116245    tagatose-6-phosphate aldose/ketose isomerase
Z4491     -2      13       -2.053324   tagatose-bisphosphate aldolase
Z4492     -2      15       -3.661190   PTS system N-acetylgalactosamine-specific
Z4493     -2      17       -3.557472   PTS system N-acetylgalactosamine-specific
Z4494     -1      20       -4.094870   PTS system N-acetylgalactosamine-specific
Z4497m    0       22       -3.957586   galactosamine-6-phosphate isomerase
Z4498     1       23       -3.498112   fimbrial-like protein
Z4499     1       22       -3.236070   chaperone
Z4500     0       16       -1.702199   hypothetical protein
Z4501     -2      13       1.283829    hypothetical protein
Z4502     -2      14       2.104827    hypothetical protein
Z4503     -2      13       1.736474    transposase
Z4504     -1      14       1.763293    fimbrial protein
Z4505     0       14       3.010095    hypothetical protein
Z4506     -1      13       3.124965    glycosylase
Z4507     -1      16       2.870818    hypothetical protein
Z4508     0       17       2.945375    chromosome replication initiator DnaA
Z4509     0       19       3.549787    hypothetical protein
Z4510     -1      20       3.596022    hypothetical protein
Z4511     0       20       2.180575    hypothetical protein
Z4512     0       21       2.928086    hypothetical protein
Z4514     1       20       3.776560    hypothetical protein
Z4516     -1      18       3.995201    GIY-YIG nuclease superfamily protein
Z4517     -1      18       3.381870    hypothetical protein
Z4518     0       19       3.479019    hypothetical protein
Z4519     -1      16       2.858950    collagenase
Z4520     1       24       2.254181    hypothetical protein
Z4521     2       27       1.974588    hypothetical protein
Z4522     2       30       1.540277    tryptophan permease
Z4523     4       35       1.572220    ATP-dependent RNA helicase DeaD
Z4524     5       35       1.144919    lipoprotein NlpI
Z4525     6       39       1.690668    polynucleotide phosphorylase
Z4526     6       34       1.222913    30S ribosomal protein S15
Z4527     4       31       1.233649    tRNA pseudouridine synthase B
Z4528     2       23       -0.822694   ribosome-binding factor A
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4500     PF00577       778    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 778 bits (2011), Expect = 0.0
Identities = 321/849 (37%), Positives = 470/849 (55%), Gaps = 48/849 (5%)

Query: 31 SGMLCTTANAEEYYFDPIMLETTKSGMQTTDLSRFSKKYAQLPGTYQVDIWLNKKKVSQK 90
+ ++ E YF+P L DLSRF PGTY+VDI+LN ++ +
Sbjct: 35 AFAAQAPLSSAELYFNPRFLAD--DPQAVADLSRFENGQELPPGTYRVDIYLNNGYMATR 92

Query: 91 KITFTAN-AEQLLQPQFTVEQLRELGIKVDEIPALAEKDDDSVINSLEQIIPGTAAEFDF 149
+TF +EQ + P T QL +G+ + + DD+ + L +I A+ D
Sbjct: 93 DVTFNTGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDACVP-LTSMIHDATAQLDV 151

Query: 150 NHQRLNLSIPQIALYRDARGYVSPSRWDDGIPTLFTNYSFTGSDNRYRQGNRSQRQYLNM 209
QRLNL+IPQ + ARGY+ P WD GI NY+F+G+ + R G S YLN+
Sbjct: 152 GQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSVQNRIGGNSHYAYLNL 211

Query: 210 QNGANFGPWRLRNYSTWTRNDQASS------WNTISSYLQRDIKALKSQLLLGESATSGS 263
Q+G N G WRLR+ +TW+ N SS W I+++L+RDI L+S+L LG+ T G
Sbjct: 212 QSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRSRLTLGDGYTQGD 271

Query: 264 IFSSYNFTGVQLASDDNMLPNSQRGFAPTVRGIANSSAIVTIRQNGYVIYQSNVPAGAFE 323
IF NF G QLASDDNMLP+SQRGFAP + GIA +A VTI+QNGY IY S VP G F
Sbjct: 272 IFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIYNSTVPPGPFT 331

Query: 324 INDLYPSSNSGDLEVTIEESDGTQRRFIQPYSSLPMMQRPGHLKYSATAGRYRADANSDS 383
IND+Y + NSGDL+VTI+E+DG+ + F PYSS+P++QR GH +YS TAG YR+
Sbjct: 332 INDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITAGEYRSGNAQQE 391

Query: 384 KEPEFAEATAIYGLNNTFTLYGGLLGSEDYYALGIGIGGTLGALGALSMDINRADTQFDN 443
K P F ++T ++GL +T+YGG ++ Y A GIG +GALGALS+D+ +A++ +
Sbjct: 392 K-PRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALSVDMTQANSTLPD 450

Query: 444 QHSFHGYQWRTQYIKDIPETNTNIAVSYYRYTNDGYFSFDEA------------------ 485
G R Y K + E+ TNI + YRY+ GYF+F +
Sbjct: 451 DSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGYNIETQDGVIQ 510

Query: 486 ----NTRNWDYNSRQKSEIQFNISQTIFDGVSLYASGSQQDYWGNNEKNRNISVGVSGQQ 541
T ++ ++ ++Q ++Q + +LY SGS Q YWG + + G++
Sbjct: 511 VKPKFTDYYNLAYNKRGKLQLTVTQQLGRTSTLYLSGSHQTYWGTSNVDEQFQAGLNTAF 570

Query: 542 WGIGYSLNYQYSRYTDQN-NDRALSLNLSIPLERWLPRSR--------VSYQMTSQKDRP 592
I ++L+Y ++ Q D+ L+LN++IP WL SY M+ +
Sbjct: 571 EDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRSDSKSQWRHASASYSMSHDLNGR 630

Query: 593 TQHEMRLDGSLLDDGRLSYSLEQSLDDDNNHNS----SVNASYRSPYGTFSAGYSYGNDS 648
+ + G+LL+D LSYS++ + NS +YR YG + GYS+ +D
Sbjct: 631 MTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGYGNANIGYSHSDDI 690

Query: 649 SQYNYGVTGGVVIHPHGVTLSQYLGNAFALIDANGASGVRIQNYPGIATDPFGYAVVPYL 708
Q YGV+GGV+ H +GVTL Q L + L+ A GA +++N G+ TD GYAV+PY
Sbjct: 691 KQLYYGVSGGVLAHANGVTLGQPLNDTVVLVKAPGAKDAKVENQTGVRTDWRGYAVLPYA 750

Query: 709 TTYQENRLSVDTTQLPDNVDLEQTTQFVVPNRGAMVAARFNANIGYRVLVTVSDRNGKPL 768
T Y+ENR+++DT L DNVDL+ VVP RGA+V A F A +G ++L+T+ N KPL
Sbjct: 751 TEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAEFKARVGIKLLMTL-THNNKPL 809

Query: 769 PFGALASNDDTGQQSIVDEGGILYLSGISSKSQSWTVRWGNQADQQCQFAFSTPDSEPTT 828
PFGA+ +++ + IV + G +YLSG+ + V+WG + + C + P
Sbjct: 810 PFGAMVTSESSQSSGIVADNGQVYLSGMPLAGK-VQVKWGEEENAHCVANYQLPPESQQQ 868

Query: 829 SVLQGTAQC 837
+ Q +A+C
Sbjct: 869 LLTQLSAEC 877


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4506     BINARYTOXINB  30     0.043    Binary toxin B family signature.
		>BINARYTOXINB#Binary toxin B family signature.

Length = 764

Score = 29.7 bits (66), Expect = 0.043
Identities = 11/72 (15%), Positives = 24/72 (33%), Gaps = 4/72 (5%)

Query: 487 AGVNGGSGIALTGTPITPRATTDSGMTTNNPTLQTTPTDDQFTNNGGRVDAVYIVATPGE 546
+ V+G + + + I + + ++ T D + G R A + +
Sbjct: 330 SEVHGNAEVHASFFDIGGSVSAGFSNSNSS----TVAIDHSLSLAGERTWAETMGLNTAD 385

Query: 547 IAFIKPMIAMRN 558
A + I N
Sbjct: 386 TARLNANIRYVN 397


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4508     RTXTOXINA     28     0.036    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family

signature.
Length = 1024

Score = 27.6 bits (61), Expect = 0.036
Identities = 26/111 (23%), Positives = 44/111 (39%), Gaps = 22/111 (19%)

Query: 42 NKILCCGNGTSAANAQHFAASMINRFETERPSLPAIALNTDNVVLTAIA-------NDRL 94
K+L GN + A T + IA + V AI+ D+
Sbjct: 277 TKVL--GNVGKGISQYIIAQRAAQGLSTSAAAAGLIA----SAVTLAISPLSFLSIADKF 330

Query: 95 HD----EVYAKQVRALGHAGDVLLAISTRGNSRDIVKAVEAAVTRDMTIVA 141
E Y+++ + LG+ GD LLA + A++A++T T++A
Sbjct: 331 KRANKIEEYSQRFKKLGYDGDSLLAAFHKETG-----AIDASLTTISTVLA 376


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4511     NUCEPIMERASE  29     0.014    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 29.0 bits (65), Expect = 0.014
Identities = 8/22 (36%), Positives = 13/22 (59%)

Query: 19 VLITGATGLVGGHLLRMLINEP 40
L+TGA G +G H+ + L+
Sbjct: 3 YLVTGAAGFIGFHVSKRLLEAG 24


59  Z4563  Z4576  Y        Y        N        Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z4563     2       17       0.260052    lipopolysaccharide transport periplasmic protein
Z4564     3       17       0.420047    ABC transporter ATP-binding protein
Z4565     3       16       -0.151445   RNA polymerase factor sigma-54
Z4566     0       18       -0.641088   sigma(54) modulation protein
Z4567     -1      16       0.107581    PTS system transporter subunit IIA-like
Z4568     -1      14       -0.386246   hypothetical protein
Z4569     -2      15       0.182787    phosphohistidinoprotein-hexose
Z4570     -2      15       2.758178    hypothetical protein
Z4571     -1      17       3.092560    monofunctional biosynthetic peptidoglycan
Z4573     -1      17       2.716101    isoprenoid biosynthesis protein with
Z4574     -1      16       2.542624    aerobic respiration control sensor protein ArcB
Z4575     -2      18       3.882458    hypothetical protein
Z4576     -1      18       4.069243    glutamate synthase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4574     HTHFIS        65     6e-13    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 64.9 bits (158), Expect = 6e-13
Identities = 26/115 (22%), Positives = 45/115 (39%), Gaps = 4/115 (3%)

Query: 528 VLLVEDIELNVIVARSVLEKLGNSVDVAMTGKAALEMFKPGEYDLVLLDIQLPDMTGLDI 587
+L+ +D V L + G V + G+ DLV+ D+ +PD D+
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 588 SRELTKRYPREDLPPLVALTA-NVLKDKQEYLNAGMDDVLSKPLSVPALTAMIKK 641
+ K P P++ ++A N + G D L KP + L +I +
Sbjct: 66 LPRIKKARPD---LPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR 117


60  Z4738  Z4746  Y        Y        N        Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z4738     2       15       0.726154    phosphoglycolate phosphatase
Z4739     2       15       1.103492    ribulose-phosphate 3-epimerase
Z4740     2       15       1.215821    DNA adenine methylase
Z4741     2       15       1.847924    hypothetical protein
Z4742     1       17       1.789193    3-dehydroquinate synthase
Z4743     1       20       2.369908    shikimate kinase I
Z4744     -2      14       3.528310    porin
Z4746     -2      15       3.249892    hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4741     IGASERPTASE   44     1e-06    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 43.5 bits (102), Expect = 1e-06
Identities = 41/203 (20%), Positives = 67/203 (33%), Gaps = 10/203 (4%)

Query: 126 APSTTSSDQTASGEKSIDLAGNATDQANGVQPAPGTTSAENTQQDVSLPPISST-PTQGQ 184
P+ +D + + ++A D+A PAP T S + S T Q
Sbjct: 999 TPNNIQADVPSVPSNNEEIA--RVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQ 1056

Query: 185 TPAATDGQQRVEVQGDLNNALTQPQN----QQQLNNVAVNSTLPTEPATVAPVRNGNASR 240
T Q R + +N Q Q +T E ATV + A
Sbjct: 1057 DATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVE--KEEKAKV 1114

Query: 241 DTAKTQTAERPATTRPARQQAVIEPKKPQATVKTEPKPVAQTPKRTEPAAPVASTKAPAA 300
+T KTQ + + +Q+ E +PQA E P + A T+ PA
Sbjct: 1115 ETEKTQEVPKVTSQVSPKQEQS-ETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAK 1173

Query: 301 TSTPAPKETATTAPVQTASPAQT 323
++ ++ T + +
Sbjct: 1174 ETSSNVEQPVTESTTVNTGNSVV 1196



Score = 42.4 bits (99), Expect = 4e-06
Identities = 38/199 (19%), Positives = 68/199 (34%), Gaps = 19/199 (9%)

Query: 143 DLAGNATDQANGVQPAPGTTSAENTQQDVSL-----------------PPISSTPTQGQT 185
DL ++ N T+ N Q DV PP +TP++
Sbjct: 979 DLYNPEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTE 1038

Query: 186 PAATDGQQRVEVQGDLNNALTQPQNQQQLNNVAVNSTLPTEPATVAPVRNGNASRDTAKT 245
A + +Q + T+ Q + S + T ++G+ +++T T
Sbjct: 1039 TVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTT 1098

Query: 246 QTAERPATTRPARQQAVIEPKKPQATVKTEPKPVAQTPKRTEPAAPVASTKAPAATSTPA 305
+T E + + + E + V ++ P + + +P A A P
Sbjct: 1099 ETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEP 1158

Query: 306 PKETATTAPVQTASPAQTT 324
+T TTA T PA+ T
Sbjct: 1159 QSQTNTTA--DTEQPAKET 1175


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4743     CARBMTKINASE  32     8e-04    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 32.1 bits (73), Expect = 8e-04
Identities = 27/91 (29%), Positives = 40/91 (43%), Gaps = 18/91 (19%)

Query: 32 FYDSDQEIEKRTGADVGWVFDLEGEEGFRD----------REEKVINELTEKQGIVLATG 81
FYD + KR + GW+ + G+R E + I +L E+ IV+A+G
Sbjct: 136 FYDEETA--KRLAREKGWIVKEDSGRGWRRVVPSPDPKGHVEAETIKKLVERGVIVIASG 193

Query: 82 GGSVKSRETRNRLSARGVVVYLETTIEKQLA 112
GG V + +GV E I+K LA
Sbjct: 194 GGGVPVILEDGEI--KGV----EAVIDKDLA 218


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4744     TYPE3OMGPROT  286    2e-93    Type III secretion system outer membrane G protein ...
		>TYPE3OMGPROT#Type III secretion system outer membrane G protein

family signature.
Length = 607

Score = 286 bits (734), Expect = 2e-93
Identities = 80/301 (26%), Positives = 131/301 (43%), Gaps = 18/301 (5%)

Query: 117 LENRNITLQYADAGELAKAGEKLLSAKGSMTVDKRTNRLLLRDNKTALSALEQWVAQMDL 176
L + I D + +A SA+ + D N +++RD+ + ++ + +D
Sbjct: 219 LSDATIQQVTVDNQRIPQAAT-RASAQARVEADPSLNAIIVRDSPERMPMYQRLIHALDK 277

Query: 177 PVGQVELSAHIVTINEKSLRELGVKWTLADAQHAGGVGQVTTLGSDLSVATATTHVGFNI 236
P ++E++ IV IN L ELGV W + + T G ++A+ G
Sbjct: 278 PSARIEVALSIVDINADQLTELGVDWRVGIRTGNNHQVVIKTTGDQSNIASN----GALG 333

Query: 237 GRINGRLLDL---ELSALEQKQQLDIIASPRLLASHLQPASIKQGSEIPYQVSSGESGAT 293
++ R LD ++ LE + +++ P LL A I SE Y +G+ A
Sbjct: 334 SLVDARGLDYLLARVNLLENEGSAQVVSRPTLLTQENAQAVIDH-SETYYVKVTGKEVA- 391

Query: 294 SVEFKEAVLG--MEVTPTVLQKG---RIRLKLHISQNVPGQVLQQADGEVLAIDKQEIET 348
E K G + +TP VL +G I L LHI +G + I + ++T
Sbjct: 392 --ELKGITYGTMLRMTPRVLTQGDKSEISLNLHIEDGNQKPNSSGIEG-IPTISRTVVDT 448

Query: 349 QVEVKSGETLALGGIFTRKNKSGQDSVPLLGDIPWFGQLFRHDGKEDERRELVVFITPRL 408
V G++L +GGI+ + VPLLGDIP+ G LFR + R + I PR+
Sbjct: 449 VARVGHGQSLIIGGIYRDELSVALSKVPLLGDIPYIGALFRRKSELTRRTVRLFIIEPRI 508

Query: 409 V 409
+
Sbjct: 509 I 509



Score = 36.8 bits (85), Expect = 2e-04
Identities = 18/95 (18%), Positives = 33/95 (34%), Gaps = 4/95 (4%)

Query: 1 MKQWIAALLLMLIPGVQAA----KPQKVTLMVDDVPVAQVLQALAEQEKLNLVVSPDVSG 56
K+ + LL+L A P + + +L +VVS ++
Sbjct: 9 FKRVLTGTLLLLSSYSWAQELDWLPIPYVYVAKGESLRDLLTDFGANYDATVVVSDKIND 68

Query: 57 TVSLHLTDVPWKQALQTVVKSAGLITRQEGNILSV 91
VS + LQ + L+ +GN+L +
Sbjct: 69 KVSGQFEHDNPQDFLQHIASLYNLVWYYDGNVLYI 103


61  Z4805  Z4824  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z4805     0       15       -3.110959   gluconate kinase
Z4806     0       17       -4.981600   gluconate (gnt) operon regulator
Z4807     0       15       -3.824258   hypothetical protein
Z4808     0       17       -4.230412   dehydrogenase
Z4809     0       18       -3.618953   acetyltransferase YhhY
Z4810     0       16       -1.938175   hypothetical protein
Z4811     -1      17       1.665036    hypothetical protein
Z4813     -2      22       3.750202    gamma-glutamyltranspeptidase
Z4815     -2      23       3.505469    hypothetical protein
Z4817     -2      21       3.214918    glycerophosphodiester phosphodiesterase
Z4818     -1      23       3.276062    glycerol-3-phosphate transporter ATP-binding
Z4819     -2      23       2.536734    glycerol-3-phosphate transporter membrane
Z4820     -1      23       3.159506    glycerol-3-phosphate transporter permease
Z4822     -2      23       3.171396    glycerol-3-phosphate transporter periplasmic
Z4823     -1      21       3.300623    hypothetical protein
Z4824     -1      23       4.051765    leucine/isoleucine/valine transporter
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4809     SACTRNSFRASE  36     1e-05    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 36.5 bits (84), Expect = 1e-05
Identities = 21/92 (22%), Positives = 33/92 (35%), Gaps = 16/92 (17%)

Query: 51 VACIDGDVVGHLTIDVQQRPRRSHVADFGICVDARWKNRGVASALMREMIE------MCD 104
+ ++ + +G + I + + D + D R K GV +AL+ + IE C
Sbjct: 69 LYYLENNCIGRIKIR-SNWNGYALIEDIAVAKDYRKK--GVGTALLHKAIEWAKENHFCG 125

Query: 105 NWLRVDRIELTVFVDNAPAIKVYKKFGFEIEG 136
L I N A Y K F I
Sbjct: 126 LMLETQDI-------NISACHFYAKHHFIIGA 150


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4813     NAFLGMOTY     32     0.007    Sodium-type flagellar protein MotY precursor signature.
		>NAFLGMOTY#Sodium-type flagellar protein MotY precursor signature.

Length = 293

Score = 31.6 bits (71), Expect = 0.007
Identities = 27/82 (32%), Positives = 37/82 (45%), Gaps = 17/82 (20%)

Query: 276 RTPISGDYRGYQVYSMPPPSSGGIHIVQILNI--LENFDMKKYGF-GSADAMQIMAEAEK 332
R P+ G+ R + SMPPP G H +I N+ + FD G+ G A I++E EK
Sbjct: 77 RRPM-GETRNVSLISMPPPWRPGEHADRITNLKFFKQFD----GYVGGQTAWGILSELEK 131

Query: 333 YAYADRSEYLGDPDFVKVPWQA 354
Y P F WQ+
Sbjct: 132 GRY---------PTFSYQDWQS 144


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4817     PF04619       30     0.004    Dr-family adhesin
		>PF04619#Dr-family adhesin

Length = 160

Score = 30.3 bits (68), Expect = 0.004
Identities = 13/63 (20%), Positives = 23/63 (36%), Gaps = 4/63 (6%)

Query: 29 VGAKYGHKMIEFDAKLSKDGEIFLLHDDNLERTSNGWGVAGELNWQD----LLRVDAGSW 84
+G ++ D + G+ FL+ D+N ++ W + D GSW
Sbjct: 70 LGCDARQVALKADTDNFEQGKFFLISDNNRDKLYVNIRPTDNSAWTTDNGVFYKNDVGSW 129

Query: 85 YGK 87
G
Sbjct: 130 GGI 132


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4818     PF05272       31     0.010    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.8 bits (69), Expect = 0.010
Identities = 11/35 (31%), Positives = 19/35 (54%)

Query: 33 IVMVGPSGCGKSTLLRMVAGLERVTEGDICINDQR 67
+V+ G G GKSTL+ + GL+ ++ I +
Sbjct: 599 VVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGTGK 633


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4822     MALTOSEBP     39     2e-05    Maltose binding protein signature.
		>MALTOSEBP#Maltose binding protein signature.

Length = 396

Score = 39.3 bits (91), Expect = 2e-05
Identities = 39/160 (24%), Positives = 66/160 (41%), Gaps = 14/160 (8%)

Query: 134 GHLLSQPFNSSTPVLYYNKDAFKKAGLDPEQPPKTWQDLADYAAKLKASGMKCGYASGWQ 193
G L++ P L YNKD PPKTW+++ +LKA G + +
Sbjct: 127 GKLIAYPIAVEALSLIYNKDLLP-------NPPKTWEEIPALDKELKAKGKSALMFNLQE 179

Query: 194 GWIQLENFSAWNGLPFASKNNGFDGTDAVLEF--NKPEQVKHIAMLEEMNKKGDFSYVGR 251
+ +A G F +N +D D ++ K + +++ + D Y
Sbjct: 180 PYFTWPLIAADGGYAFKYENGKYDIKDVGVDNAGAKAGLTFLVDLIKNKHMNADTDY--- 236

Query: 252 KDESTEKFYNGDCAMTTASSGSLANIREYAKFNYGVGMMP 291
+ F G+ AMT + +NI + +K NYGV ++P
Sbjct: 237 -SIAEAAFNKGETAMTINGPWAWSNI-DTSKVNYGVTVLP 274


62  Z4836  Z4908  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z4836     2       14       1.624835    cell division protein FtsX
Z4837     2       12       2.034981    cell division protein FtsE
Z4838     2       12       3.637778    cell division protein FtsY
Z4839     0       16       4.027184    16S rRNA m(2)G966-methyltransferase
Z4840     -1      14       3.585061    hypothetical protein
Z4841     -1      15       3.786148    receptor
Z4842     -2      15       4.360459    hypothetical protein
Z4843     -1      15       3.467538    zinc/cadmium/mercury/lead-transporting ATPase
Z4844     0       15       0.839092    sulfur transfer protein SirA
Z4845     1       14       -0.514605   hypothetical protein
Z4846     0       16       -0.099380   hypothetical protein
Z4847     1       18       -0.225452   major facilitator superfamily transporter
Z4848     0       18       -1.623881   hypothetical protein
Z4849     0       19       -2.125211   hypothetical protein
Z4850     1       16       -0.766838   O-methyltransferase
Z4851     2       14       2.166059    hypothetical protein
Z4852     1       17       2.221651    phospholipid biosynthesis acyltransferase
Z4853     3       19       4.154817    acyl carrier protein
Z4854     2       19       4.201568    acyl carrier protein
Z4855     2       19       4.437895    hypothetical protein
Z4856     2       20       5.215297    hypothetical protein
Z4857     2       24       5.544897    hypothetical protein
Z4858     2       24       6.231257    hypothetical protein
Z4859     1       26       6.422902    hypothetical protein
Z4860     1       25       6.525444    hypothetical protein
Z4861     1       22       6.247605    hypothetical protein
Z4862     0       22       5.944991    hypothetical protein
Z4863     -1      20       4.874171    3-oxoacyl-ACP synthase
Z4864     -1      20       4.383402    hypothetical protein
Z4865     0       21       4.786518    3-ketoacyl-ACP reductase
Z4866     1       22       5.231631    3-oxoacyl-ACP synthase
Z4867     -1      24       5.575548    holo-(acyl carrier protein) synthase 2
Z4868     -1      25       5.522163    periplasmic binding protein for nickel
Z4869     0       23       4.279977    nickel transporter permease NikB
Z4870     -1      20       2.935531    nickel transporter permease NikC
Z4871     1       21       0.596734    nickel transporter ATP-binding protein NikD
Z4872     1       20       -1.546019   nickel transporter ATP-binding protein NikE
Z4873     0       17       -1.247980   nickel responsive regulator
Z4874     2       19       -1.817870   regulator
Z4875     2       16       -0.692759   phosphotransferase system enzyme subunit
Z4876     1       16       -1.101296   phosphotransferase system enzyme subunit
Z4877     1       16       -1.143659   phosphotransferase system enzyme subunit
Z4878     0       16       -0.203079   xylulose kinase
Z4879     1       17       1.370027    phosphocarrier protein
Z4881     0       17       2.594835    fructose-1,6-bisphosphate aldolase
Z4882     -1      14       -0.699038   hypothetical protein
Z4883     1       17       -3.732985   hypothetical protein
Z4884     1       17       -3.801751   transporter
Z4885     -1      16       -3.554773   ABC transporter ATP-binding protein, fragment 1
Z4886     -1      20       -5.232150   hypothetical protein
Z4887     0       23       -6.999494   hypothetical protein
Z4888     -1      18       -4.830690   hypothetical protein
Z4890     -1      12       -0.234234   hypothetical protein
Z4891     0       16       1.846508    hypothetical protein
Z4893     -2      18       1.891047    low-affinity phosphate transport protein
Z4894     -2      20       2.006044    universal stress protein UspB
Z4895     -2      21       2.337644    universal stress protein; broad regulatory
Z4896     -2      20       2.478582    inner membrane transporter YhiP
Z4897     -1      21       3.083716    methyltransferase
Z4898     -2      22       2.378059    oligopeptidase A
Z4899     -1      19       1.817686    hypothetical protein
Z4900     -1      14       -2.670157   glutathione reductase
Z4903     -1      18       -5.066951   ArsR family transcriptional regulator
Z4904     -1      19       -6.299617   arsenical pump membrane protein
Z4905     -2      22       -8.164024   arsenate reductase
Z4906     -1      18       -6.108872   hypothetical protein
Z4907     -1      18       -6.529825   hypothetical protein
Z4908     -3      15       -3.266026   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4838     IGASERPTASE   53     3e-09    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 52.8 bits (126), Expect = 3e-09
Identities = 36/183 (19%), Positives = 61/183 (33%), Gaps = 12/183 (6%)

Query: 19 EQTPEKETEVQNEQTVVEEIVQAQEPVKASEQAVEE----QPQAHTEAEAET-FAADVVE 73
P T + +TV E Q + V+ +EQ E + EA++ E
Sbjct: 1025 VPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNE 1084

Query: 74 VTEQVAENEKAQPEAEVVAQPEPVVEETPEPVAIEREELPLPEDVNAEEVSPEEWQAEAE 133
V + +E ++ Q + V +E V E+ + +VSP++ Q+E
Sbjct: 1085 VAQSGSETKETQTTE--TKETATVEKEEKAKVETEKTQ---EVPKVTSQVSPKQEQSETV 1139

Query: 134 TVEIVEAAEEEAAK--EEITDEELEAQALAAEAAEEAVMVVPPAEEEQPVEEIAQEQEKP 191
+ A E + +E + A E + V P E V E P
Sbjct: 1140 QPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENP 1199

Query: 192 TKE 194

Sbjct: 1200 ENT 1202



Score = 46.6 bits (110), Expect = 2e-07
Identities = 39/178 (21%), Positives = 65/178 (36%), Gaps = 20/178 (11%)

Query: 22 PEKETEVQNEQTVVEEIVQAQEPVKASEQAV----EEQPQAHTEAEAETFAADVVEVTEQ 77
PE E + QTV + ++A +V EE + A E TE
Sbjct: 983 PEVE---KRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTET 1039

Query: 78 VAENEKAQPEAEVVAQPEPVVEETPEPVAIEREELPLPEDVNAEEVSPEEWQAEAETVEI 137
VAEN K + + + + + + + + EV+ Q+ +ET E
Sbjct: 1040 VAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVA----QSGSETKET 1095

Query: 138 VEAAEEEAAKEEITDEELEAQALAAEAAEEAVMV--VPP----AEEEQPVEEIAQEQE 189
+E A E +E +A+ + E + V P +E QP E A+E +
Sbjct: 1096 QTTETKETATVE---KEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPAREND 1150



Score = 43.5 bits (102), Expect = 2e-06
Identities = 27/157 (17%), Positives = 49/157 (31%), Gaps = 6/157 (3%)

Query: 17 QKEQTPEKETE----VQNEQTVVEEIVQAQEPVKASEQAVEEQPQAHTEAEAETFAADVV 72
+ ++T E E V+ E+T V +Q K EQ+ QPQA E +
Sbjct: 1099 ETKETATVEKEEKAKVETEKTQEVPKVTSQVSPK-QEQSETVQPQAEPARENDPTVNIKE 1157

Query: 73 EVTEQVAENEKAQPEAEVVAQPEPVVEETPEPVAIEREELPLPEDVNAEEVSPEEWQAEA 132
++ + QP E + E V E+ V + PE+ P +
Sbjct: 1158 PQSQTNTTADTEQPAKETSSNVEQPVTES-TTVNTGNSVVENPENTTPATTQPTVNSESS 1216

Query: 133 ETVEIVEAAEEEAAKEEITDEELEAQALAAEAAEEAV 169
+ + + + + A +
Sbjct: 1217 NKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCDLT 1253



Score = 43.1 bits (101), Expect = 2e-06
Identities = 25/176 (14%), Positives = 49/176 (27%), Gaps = 2/176 (1%)

Query: 17 QKEQTPEKETEVQNEQTVVEEIVQAQEPVKASEQAVEEQPQAHTEAEAETFAADVVEVTE 76
+E E ++ V+ E E + +E E +A+ EV
Sbjct: 1065 NREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVP- 1123

Query: 77 QVAENEKAQPEAEVVAQPEPVVEETPEPVAIEREELPLPEDVNAEEVSPEEWQAEAETVE 136
+V + E QP+ +P + +E + A+ P + +
Sbjct: 1124 KVTSQVSPKQEQSETVQPQAEPARENDPT-VNIKEPQSQTNTTADTEQPAKETSSNVEQP 1182

Query: 137 IVEAAEEEAAKEEITDEELEAQALAAEAAEEAVMVVPPAEEEQPVEEIAQEQEKPT 192
+ E+ + + E A P + V + E T
Sbjct: 1183 VTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPAT 1238



Score = 30.8 bits (69), Expect = 0.016
Identities = 34/152 (22%), Positives = 52/152 (34%), Gaps = 21/152 (13%)

Query: 52 VEEQPQAHTEAEAETFAADVVEVTEQVAEN-EKAQPEAEVVAQPEPVVE-ETPEPVAIER 109
VE++ Q T +V + N E A+ + V P P ET E VA
Sbjct: 985 VEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENS 1044

Query: 110 EELPLPEDVNAEEVSPEEWQAEAETVEIVEAAEEEAAKE--------EITDEELEAQALA 161
++ ++ V E A T + E A+E A E+ E +
Sbjct: 1045 KQ-------ESKTVEKNEQDATETTAQNREVAKE-AKSNVKANTQTNEVAQSGSETKETQ 1096

Query: 162 AEAAEEAVMVVPPAEEEQPVEEIAQEQEKPTK 193
+E V +EE+ E + QE P
Sbjct: 1097 TTETKETATV---EKEEKAKVETEKTQEVPKV 1125


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4844     PF01206       105    3e-34    SirA family protein
		>PF01206#SirA family protein

Length = 76

Score = 105 bits (265), Expect = 3e-34
Identities = 24/72 (33%), Positives = 41/72 (56%)

Query: 9 DHTLDALGLRCPEPVMMVRKTVRNMQPGETLLIIADDPATTRDIPGFCTFMEHELVAKET 68
D +LDA GL CP P++ +KT+ M GE L ++A DP + +D F HEL+ ++
Sbjct: 5 DQSLDATGLNCPLPILKAKKTLATMNAGEVLYVMATDPGSVKDFESFSKQTGHELLEQKE 64

Query: 69 DGLPYRYLIRKG 80
+ Y + +++
Sbjct: 65 EDGTYHFRLKRA 76


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4847     TCRTETA       51     6e-09    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 50.6 bits (121), Expect = 6e-09
Identities = 79/398 (19%), Positives = 146/398 (36%), Gaps = 32/398 (8%)

Query: 13 LRLNLRILSIVMFNFASYLTIGLPLAVLPGYVHDVM--GFSAFWAGLVISLQYFATLLSR 70
++ N ++ I+ + IGL + VLPG + D++ G++++L
Sbjct: 1 MKPNRPLIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACA 60

Query: 71 PHAGRYADLLGPKKIVVFGLCGCFLSGLGYLTAGLTASLPVISLLLLCLGRVILGI-GQS 129
P G +D G + +++ L G + + Y L V L +GR++ GI G +
Sbjct: 61 PVLGALSDRFGRRPVLLVSLAG---AAVDYAIMATAPFLWV-----LYIGRIVAGITGAT 112

Query: 130 FAGTGSTLWGVGVVGSL--HIGRVISWNDIVTYGAMAMGAPLGVVFYHWGGLQALALIIM 187
A G+ + + H G + + +G +G H A AL +
Sbjct: 113 GAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGL 172

Query: 188 GVALVAILLAIPRPTVK--ASKGKPLPFRAVLGRVWLYGMALALA-----SAGFGVIATF 240
LL + + P + + +A +A V A
Sbjct: 173 NFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAAL 232

Query: 241 ITLFYDVK-GWDGAAFALTLFSCAFVGT---RLLFPNGINRIGGLNVAMICFSVEIIGLL 296
+F + + WD ++L + + + ++ R+G M+ + G +
Sbjct: 233 WVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYI 292

Query: 297 LVGVATMPWMAKIG-VLLAGAGFSLVFPALGVVAVKAVPQQNQGAALATYTVFMDLSLGV 355
L+ AT WMA VLLA G + PAL + + V ++ QG + L+ +
Sbjct: 293 LLAFATRGWMAFPIMVLLASGGIGM--PALQAMLSRQVDEERQGQLQGSLAALTSLT-SI 349

Query: 356 TGPLAGLVMSWAGVPV----IYLAAAGLVAIALLLTWR 389
GPL + A + ++A A L + L R
Sbjct: 350 VGPLLFTAIYAASITTWNGWAWIAGAALYLLCLPALRR 387


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4861     SECFTRNLCASE  31     0.015    Bacterial translocase SecF protein signature.
		>SECFTRNLCASE#Bacterial translocase SecF protein signature.

Length = 333

Score = 31.0 bits (70), Expect = 0.015
Identities = 21/111 (18%), Positives = 42/111 (37%), Gaps = 22/111 (19%)

Query: 568 VNLFSLLALVLVLGIGINYTLF--------FSNPRGTPLTSLLAIAL---------AMLT 610
+L ++ AL+ + G IN T+ + PL ++ +++ +T
Sbjct: 204 FDLTTVAALLTITGYSINDTVVVFDRLRENLIKYKTMPLRDVMNLSVNETLSRTVMTGMT 263

Query: 611 TLLTLGMLVFSATQAISSFGIVLVSGIFTA-----FLLSPLAMPDKKRTKK 656
TLL L ++ I F +V G+FT ++ + + K
Sbjct: 264 TLLALVPMLIWGGDVIRGFVFAMVWGVFTGTYSSVYVAKNIVLFIGLDRNK 314


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4865     DHBDHDRGNASE  93     5e-25    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 93.2 bits (231), Expect = 5e-25
Identities = 63/251 (25%), Positives = 119/251 (47%), Gaps = 15/251 (5%)

Query: 3 RSVLVTGASKGIGRAIACQLAADGFNI-GVHYHRDATGAQETLNAIVANGGNGRLLSFDV 61
+ +TGA++GIG A+A LA+ G +I V Y+ + + ++ A + DV
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVS--SLKAEARHAEAFPADV 66

Query: 62 ANREQCREVLEHEIAQHGAWYGVVSNAGIARDAAFPALSDDDWDAVIHTNLDSFYNVIQP 121
+ E+ + G +V+ AG+ R +LSD++W+A N +N +
Sbjct: 67 RDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRS 126

Query: 122 CIMPMIGARQGGRIITLSSVSGVMGNRGQVNYSAAKAGIIGATKALAIELAKRKITVNCI 181
+ + R+ G I+T+ S + Y+++KA + TK L +ELA+ I N +
Sbjct: 127 -VSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIV 185

Query: 182 APGLIDTGMIEM-------EESALKEAMSM----IPMKRMGQAEEVAGLASYLMSDIAGY 230
+PG +T M E +K ++ IP+K++ + ++A +L+S AG+
Sbjct: 186 SPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGH 245

Query: 231 VTRQVISINGG 241
+T + ++GG
Sbjct: 246 ITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4884     ABC2TRNSPORT  51     2e-09    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 51.1 bits (122), Expect = 2e-09
Identities = 41/171 (23%), Positives = 73/171 (42%), Gaps = 7/171 (4%)

Query: 201 REREHGTVEHLLVMPITPFEIMMAKI-WSMGLVVLVVSGLSLVLMVKGVLGVPIEGSIPL 259
R T E +L + +I++ ++ W+ L +G+ +V G + L
Sbjct: 93 RMEGQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGAGIGVVAAALGY----TQWLSLL 148

Query: 260 FMLGV-ALSLFATTSIGIFMGTIARSMPQLGLLVILVLLPLQMLSGGSTPRESMPQMVQD 318
+ L V AL+ A S+G+ + +A S LV+ P+ LSG P + +P + Q
Sbjct: 149 YALPVIALTGLAFASLGMVVTALAPSYDYFIFYQTLVITPILFLSGAVFPVDQLPIVFQT 208

Query: 319 IMLIMPTTHFVSLAQAILYRGAGFEIVWPQFLTLMAIGGAFF-TIALLRFR 368
+P +H + L + I+ ++ + I FF + ALLR R
Sbjct: 209 AARFLPLSHSIDLIRPIMLGHPVVDVCQHVGALCIYIVIPFFLSTALLRRR 259


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4885     PF05272       30     0.043    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.4 bits (68), Expect = 0.043
Identities = 9/26 (34%), Positives = 14/26 (53%)

Query: 20 ARCMVGLIGPDGVGKSSLLSLISGAR 45
V L G G+GKS+L++ + G
Sbjct: 595 FDYSVVLEGTGGIGKSTLINTLVGLD 620


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4886     RTXTOXIND     82     3e-19    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 81.8 bits (202), Expect = 3e-19
Identities = 72/408 (17%), Positives = 137/408 (33%), Gaps = 81/408 (19%)

Query: 6 RHLAWWGVGLLAVAAIVXWWLLRPAGVP-EGFAVSNGRIEATEVDIASKIAGRIDTILVK 64
R +A++ +G L +A I+ G +GR + I + I+VK
Sbjct: 58 RLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKE----IKPIENSIVKEIIVK 113

Query: 65 EGQFVREGEVLAKMDTRV----------------LQEQRLEAI----------------- 91
EG+ VR+G+VL K+ L++ R + +
Sbjct: 114 EGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDE 173

Query: 92 -------------------AQIKEAQSAVAAAQALLEQRQSETRAAQSLVNQRQAELDSV 132
Q Q+ + L+++++E + +N+ +
Sbjct: 174 PYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVE 233

Query: 133 AKRHTRSRSLAQRGAISAQQLDDDRAAAESARAALESAKAQVSASKAAIEAARTNIIQ-- 190
R SL + AI+ + + A L K+Q+ ++ I +A+
Sbjct: 234 KSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVT 293

Query: 191 -----------AQTRVEAAQATERRIAADID--DSELKAPRDGRV-QYRVAEPGEVLAAG 236
QT T + S ++AP +V Q +V G V+
Sbjct: 294 QLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTA 353

Query: 237 GRVLNMVDLSDVY-MTFFLPTEQAGTLKLGGEARLILDAAPDLRIPATISFVASVAQFTP 295
++ +V D +T + + G + +G A + ++A P R V V
Sbjct: 354 ETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYG---YLVGKVKNINL 410

Query: 296 KTVETSDERLKLMFRVKARIPPELLQQHLEYV--KTGLPGVAWVRVNE 341
+E D+RL L+F V I L + + +G+ A ++
Sbjct: 411 DAIE--DQRLGLVFNVIISIEENCLSTGNKNIPLSSGMAVTAEIKTGM 456


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4891     ALARACEMASE   29     0.033    Alanine racemase signature.
		>ALARACEMASE#Alanine racemase signature.

Length = 356

Score = 29.0 bits (65), Expect = 0.033
Identities = 23/98 (23%), Positives = 38/98 (38%), Gaps = 18/98 (18%)

Query: 226 ENLLFTHRGLSGPAVLQISSYWQPGEFVSINLLPDVDLETFL--NEQRNAHPNQSLKNTL 283
E + RG GP +L + ++ + + + L T + N Q A N LK L
Sbjct: 63 EAITLRERGWKGP-ILMLEGFFHAQD---LEIYDQHRLTTCVHSNWQLKALQNARLKAPL 118

Query: 284 AVHL------------PKRLVERLQQLGQIPDVSLKQL 309
++L P R++ QQL + +V L
Sbjct: 119 DIYLKVNSGMNRLGFQPDRVLTVWQQLRAMANVGEMTL 156


63  Z5044  Z5053  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z5044     -1      15       -4.504395   2-amino-3-ketobutyrate CoA ligase
Z5045     0       24       -7.757441   hypothetical protein
Z5046     1       26       -8.903888   ADP-L-glycero-D-manno-heptose-6-epimerase
Z5047     2       36       -11.605306  ADP-heptose--LPS heptosyltransferase
Z5048     3       45       -15.447974  ADP-heptose--LPS heptosyltransferase
Z5049     3       45       -16.262173  LPS biosynthesis protein
Z5050     1       40       -13.771259  LPS biosynthesis protein
Z5051     1       34       -11.958815  LPS biosynthesis protein
Z5052     1       25       -7.403790   lipopolysaccharide core biosynthesis protein
Z5053     1       19       -5.032184   LPS biosynthesis protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5046     NUCEPIMERASE  102    3e-27    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 102 bits (256), Expect = 3e-27
Identities = 76/348 (21%), Positives = 127/348 (36%), Gaps = 67/348 (19%)

Query: 2 IIVTGGAGFIGSNIVKALNDKGITDILVVDNLKD--------------GTKFVNLVDLNI 47
+VTG AGFIG ++ K L + G ++ +DNL D +++
Sbjct: 3 YLVTGAAGFIGFHVSKRLLEAG-HQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 48 ADYMDKEDFLIQIMAGEEFGDVEAIFHEGACSSTTEWDGKYMMDNNYQYSK-------EL 100
AD + + + A F E +F + +Y ++N + Y+ +
Sbjct: 62 ADR----EGMTDLFASGHF---ERVFISPHRLAV-----RYSLENPHAYADSNLTGFLNI 109

Query: 101 LHYCLEREIP-FLYASSAATYGGRTSD-FIESREYEKPLNVYGYSKFLFDEYVRQILPEA 158
L C +I LYASS++ YG F + P+++Y +K +
Sbjct: 110 LEGCRHNKIQHLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLY 169

Query: 159 NSQIVGFRYFNVYGPREGHKGSMASVAFHLNTQLNNGESPKLFEGSENFKRDFVYVGDVA 218
G R+F VYGP + MA F + G+S ++ KRDF Y+ D+A
Sbjct: 170 GLPATGLRFFTVYGPWG--RPDMA--LFKFTKAMLEGKSIDVY-NYGKMKRDFTYIDDIA 224

Query: 219 DVNL------------WFLENGVSG-------IFNLGTGRAESFQAVADATLAY-HKKGQ 258
+ + W +E G ++N+G A + +
Sbjct: 225 EAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAK 284

Query: 259 IEYIPFPDKLKGRYQAFTQADLTNLRAA-GYDKPFKTVAEGVTEYMAW 305
+P G T AD L G+ P TV +GV ++ W
Sbjct: 285 KNMLPLQ---PGDVL-ETSADTKALYEVIGF-TPETTVKDGVKNFVNW 327


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5053     RTXTOXINA     33     0.003    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 32.6 bits (74), Expect = 0.003
Identities = 25/117 (21%), Positives = 45/117 (38%), Gaps = 10/117 (8%)

Query: 60 HVFTDYISDKDKLYFSDL-------AKQYNSRINIYVINCDKLKSLPSTKNWTYATYFRF 112
H+ D +DKL +D+ ++ N I + S+ T+ +F
Sbjct: 860 HIIDDDGGKEDKLSLADIDFRDVAFKREGNDLIMYKGEG--NVLSIGHKNGITFRNWFEK 917

Query: 113 IIADYFYHKHEKILYLDADIACKGSIKELLDYQFSTNEIAAVVAERDVEWWQNRASV 169
D H+ E+I I S+K+ L+YQ N A+ V D + ++ +
Sbjct: 918 ESGDISNHEIEQIFDKSGRIITPDSLKKALEYQ-QRNNKASYVYGNDALAYGSQGDL 973


64  Z5076  Z5148  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z5076     -2      14       3.103649    bifunctional (p)ppGpp synthetase II/
Z5077     -2      14       3.164780    tRNA guanosine-2'-O-methyltransferase
Z5078     -2      13       2.408240    ATP-dependent DNA helicase RecG
Z5079     -2      12       1.054181    hypothetical protein
Z5081     -3      11       0.634932    glutamate transport
Z5082     -3      9        0.271865    transporter
Z5083     -2      10       -0.323279   hypothetical protein
Z5084     -1      11       -0.924838   alpha-xylosidase
Z5085     1       15       -1.441316   transporter
Z5087     2       20       -1.222632   *integrase for prophage 933L and the LEE
Z5088     5       23       0.801494    hypothetical protein
Z5089     5       23       0.921998    transposase
Z5091     5       23       0.956631    hypothetical protein
Z5092     4       28       1.690823    hypothetical protein
Z5093     4       26       4.719041    hypothetical protein
Z5094     2       24       4.644487    hypothetical protein
Z5095     2       23       3.239789    hypothetical protein
Z5096     1       22       2.305301    prophage-associated protein
Z5097     1       20       0.887627    prophage-associated protein
Z5098     1       25       -0.747478   prophage-associated protein
Z5100     4       39       -5.045744   hypothetical protein
Z5102     2       41       -7.775510   hypothetical protein
Z5103     4       41       -9.478415   hypothetical protein
Z5104     3       38       -9.868228   hypothetical protein
Z5105     2       38       -9.387325   hypothetical protein
Z5106     2       37       -10.557575  hypothetical protein
Z5107     3       39       -9.712237   hypothetical protein
Z5108     3       37       -9.441759   hypothetical protein
Z5109     2       36       -8.903215   hypothetical protein
Z5110     3       39       -8.421149   intimin adherence protein
Z5111     5       45       -9.256166   hypothetical protein
Z5112     5       45       -9.110526   translocated intimin receptor protein
Z5113     5       44       -13.578658  hypothetical protein
Z5114     3       45       -12.196621  hypothetical protein
Z5115     3       42       -11.317474  hypothetical protein
Z5116     2       42       -12.026158  hypothetical protein
Z5117     2       41       -11.065456  hypothetical protein
Z5118     2       42       -10.503568  hypothetical protein
Z5119     2       40       -10.562034  hypothetical protein
Z5120     2       43       -12.197840  hypothetical protein
Z5121     2       45       -13.699493  hypothetical protein
Z5122     2       45       -13.234634  hypothetical protein
Z5123     2       44       -14.207468  hypothetical protein
Z5124     3       44       -15.066911  hypothetical protein
Z5125     3       47       -16.107747  hypothetical protein
Z5126     5       47       -16.527745  hypothetical protein
Z5127     7       50       -17.908051  hypothetical protein
Z5128     7       50       -17.803663  hypothetical protein
Z5129     7       50       -18.248057  negative regulator GrlR
Z5131     7       52       -18.767223  hypothetical protein
Z5132     9       51       -18.442223  secretion system apparatus protein SsaU
Z5133     9       53       -18.904764  hypothetical protein
Z5134     9       52       -19.208021  hypothetical protein
Z5135     9       50       -18.627978  type III secretion system protein
Z5136     6       50       -15.802290  hypothetical protein
Z5137     6       49       -14.613728  hypothetical protein
Z5138     3       37       -10.882846  hypothetical protein
Z5139     1       32       -9.052302   hypothetical protein
Z5140     1       31       -8.434414   hypothetical protein
Z5142     -1      25       -6.099277   hypothetical protein
Z5143     0       22       -5.166261   hypothetical protein
Z5146     1       19       -3.245785   permease transporter
Z5147     1       18       -2.984188   hypothetical protein
Z5148     2       19       -2.121946   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5078     SECA          42     7e-06    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 42.2 bits (99), Expect = 7e-06
Identities = 39/129 (30%), Positives = 57/129 (44%), Gaps = 18/129 (13%)

Query: 233 NLSMLALRAGAQRFHAQPLSANDALKNKLLAALPFKPTGAQARVVAEIERDM-ALDVPMM 291
LS L+ F A+ L + L+N + A A R ++ M DV ++
Sbjct: 37 KLSDEELKGKTAEFRAR-LEKGEVLENLIPEAF------AVVREASKRVFGMRHFDVQLL 89

Query: 292 ---RLVQGDV-----GSGKTLVAALAA-LRAIAHGKQVALMAPTELLAEQHANNFRNWFE 342
L + + G GKTL A L A L A+ GK V ++ + LA++ A N R FE
Sbjct: 90 GGMVLNERCIAEMRTGEGKTLTATLPAYLNALT-GKGVHVVTVNDYLAQRDAENNRPLFE 148

Query: 343 PLGIEVGWL 351
LG+ VG
Sbjct: 149 FLGLTVGIN 157


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5106     BACINVASINB   30     0.020    Salmonella/Shigella invasin protein B signature.
		>BACINVASINB#Salmonella/Shigella invasin protein B signature.

Length = 593

Score = 29.7 bits (66), Expect = 0.020
Identities = 29/102 (28%), Positives = 52/102 (50%), Gaps = 9/102 (8%)

Query: 112 MMMVTLLSLDTSAQKVSSLKNSNEIY---MDGQTKALENKTQEYKKQLEEQQKAEEKSQK 168
M+M + + SL+N ++ +G+ +E K+ E++ EE +KAEE ++
Sbjct: 258 MLMAMFIEI-VGKNTEESLQNDLALFNALQEGRQAEMEKKSAEFQ---EETRKAEETNRI 313

Query: 169 SKIVGQVFGWLGVALTAVAAVFNPALWAVVAIGATAMALQTA 210
+G+V G L ++ VAAVF A +A+ A +A+ A
Sbjct: 314 MGCIGKVLGALLTIVSVVAAVFTGG--ASLALAAVGLAVMVA 353


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5108     PF07201       28     0.047    Hypersensitivity response secretion protein HrpJ
		>PF07201#Hypersensitivity response secretion protein HrpJ

Length = 293

Score = 28.3 bits (63), Expect = 0.047
Identities = 43/225 (19%), Positives = 73/225 (32%), Gaps = 23/225 (10%)

Query: 39 SPLINLQNELAMITSSSLSETIEGLSLGYRK---GSARKEEEGSTIEKLLNDMQELLTLT 95
+ ++ E+ SE E LSL RK AR + + + L+ + EL
Sbjct: 47 QSIADMAEEVTF----VFSERKE-LSLDKRKLSDSQARVSDVEEQVNQYLSKVPEL---E 98

Query: 96 DSDKIKELS--LKNSGL--LEQHDPTLAMFGNMPKGEIVALISSLLQSK--FVKIELKKK 149
+ EL L NS L Q L P + L K L
Sbjct: 99 QKQNVSELLSLLSNSPNISLSQLKAYLEGKSEEPSEQFKMLCGLRDALKGRPELAHLSHL 158

Query: 150 YARLLLDLLGEDDWELAL-----LSWLGVGELNQEGIQKIKKLYEKAKDEDSENGASLLD 204
+ L+ + E + L + +Q ++ Y A + ++
Sbjct: 159 VEQALVSMAEEQGETIVLGARITPEAYRESQSGVNPLQPLRDTYRDAV-MGYQGIYAIWS 217

Query: 205 WFMEIKDLPEREKHLKVIIRALSFDLSYMSSFEDKVKTSSIISDL 249
+ + + + + +ALS DL S + K +ISDL
Sbjct: 218 DLQKRFPNGDIDSVILFLQKALSADLQSQQSGSGREKLGIVISDL 262


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5110     INTIMIN       1459   0.0      Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 1459 bits (3777), Expect = 0.0
Identities = 780/942 (82%), Positives = 837/942 (88%), Gaps = 11/942 (1%)

Query: 1 MITHGCYTRTRHKHKLKKTLIMLSAGLGLFFYVNQNSFANGENYFKLGSDSKLLTHDSYQ 60
MITHG Y RTRHKHKLKKT IMLSAGLGLFFYVNQNSFANGENYFKLGSDSKLLTH+SYQ
Sbjct: 1 MITHGFYARTRHKHKLKKTFIMLSAGLGLFFYVNQNSFANGENYFKLGSDSKLLTHNSYQ 60

Query: 61 NRLFYTLKTGETVADLSKSQDINLSTIWSLNKHLYSSESEMMKAAPGQQIILPLKKLPFE 120
NRLFYTLKTGETVADLSKSQDINLSTIWSLNKHLYSSESEMMKA PGQQIILPLKKLPFE
Sbjct: 61 NRLFYTLKTGETVADLSKSQDINLSTIWSLNKHLYSSESEMMKAEPGQQIILPLKKLPFE 120

Query: 121 YSALPLLGSAPLVAAGGVAGHTNKLTKMSPDVTKSNMTDDKALNYAAQQAASLGSQLQSR 180
YSALPLLGSAPLVAAGGVAGHTNKLTKMSPDVTKSNMTDDKALNYAAQQAASLGSQLQSR
Sbjct: 121 YSALPLLGSAPLVAAGGVAGHTNKLTKMSPDVTKSNMTDDKALNYAAQQAASLGSQLQSR 180

Query: 181 SLNGDYAKDTALGIAGNQASSQLQAWLQHYGTAEVNLQSGNNFDGSSLDFLLPFYDSEKM 240
SLNGDYAKDTALGIAGNQASSQLQAWLQHYGTAEVNLQSGNNFDGSSLDFLLPFYDSEKM
Sbjct: 181 SLNGDYAKDTALGIAGNQASSQLQAWLQHYGTAEVNLQSGNNFDGSSLDFLLPFYDSEKM 240

Query: 241 LAFGQVGARYIDSRFTANLGAGQRFFLPANMLGYNVFIDQDFSGDNTRLGIGGEYWRDYF 300
LAFGQVGARYIDSRFTANLGAGQRFFLP NMLGYNVFIDQDFSGDNTRLGIGGEYWRDYF
Sbjct: 241 LAFGQVGARYIDSRFTANLGAGQRFFLPENMLGYNVFIDQDFSGDNTRLGIGGEYWRDYF 300

Query: 301 KSSVNGYFRMSGWHESYNKKDYDERPANGFDIRFNGYLPSYPALGAKLIYEQYYGDNVAL 360
KSSVNGYFRMSGWHESYNKKDYDERPANGFDIRFNGYLPSYPALGAKL+YEQYYGDNVAL
Sbjct: 301 KSSVNGYFRMSGWHESYNKKDYDERPANGFDIRFNGYLPSYPALGAKLMYEQYYGDNVAL 360

Query: 361 FNSDKLQSNPGAATVGVNYTPIPLVTMGIDYRHGTGNENDLLYSMQFRYQFDKSWSQQIE 420
FNSDKLQSNPGAATVGVNYTPIPLVTMGIDYRHGTGNENDLLYSMQFRYQFDK WSQQIE
Sbjct: 361 FNSDKLQSNPGAATVGVNYTPIPLVTMGIDYRHGTGNENDLLYSMQFRYQFDKPWSQQIE 420

Query: 421 PQYVNELRTLSGSRYDLVQRNNNIILEYKKQDILSLNIPHDINGTEHSTQKIQLIVKSKY 480
PQYVNELRTLSGSRYDLVQRNNNIILEYKKQDILSLNIPHDINGTE STQKIQLIVKSKY
Sbjct: 421 PQYVNELRTLSGSRYDLVQRNNNIILEYKKQDILSLNIPHDINGTERSTQKIQLIVKSKY 480

Query: 481 GLDRIVWDDSALRSQGGQIQHSGSQSAQDYQAILPAYVQGGSNIYKVTARAYDRNGNSSN 540
GLDRIVWDDSALRSQGGQIQHSGSQSAQDYQAILPAYVQGGSN+YKVTARAYDRNGNSSN
Sbjct: 481 GLDRIVWDDSALRSQGGQIQHSGSQSAQDYQAILPAYVQGGSNVYKVTARAYDRNGNSSN 540

Query: 541 NVQLTITVLSNGQVVDQVGVTDFTADKTSAKADNADTITYTATVKKNGVAQANVPVSFNI 600
NV LTITVLSNGQVVDQVGVTDFTADKTSAKAD + ITYTATVKKNGVAQANVPVSFNI
Sbjct: 541 NVLLTITVLSNGQVVDQVGVTDFTADKTSAKADGTEAITYTATVKKNGVAQANVPVSFNI 600

Query: 601 VSGTATLGANSAKTDANGKATVTLKSSTPGQVVVSAKTAEMTSALNASAVIFFDQTKASI 660
VSGTA L ANSA T+ +GKATVTLKS PGQVVVSAKTAEMTSALNA+AVIF DQTKASI
Sbjct: 601 VSGTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAKTAEMTSALNANAVIFVDQTKASI 660

Query: 661 TEIKADKTTAVANGKDAIKYTVKVMKNGQPVNNQSVTFSTNFGMFNGKSQTQATTGNDGR 720
TEIKADKTTAVANG+DAI YTVKVMK +PV+NQ VTF+T G + + T +G
Sbjct: 661 TEIKADKTTAVANGQDAITYTVKVMKGDKPVSNQEVTFTTTLGKLS---NSTEKTDTNGY 717

Query: 721 ATITLTSSSAGKATVSATVSDGA-EVKATEVTFFDELKID-NKVDIIGNNVRGELPNIWL 778
A +TLTS++ GK+ VSA VSD A +VKA EV FF L ID ++I+G V+G+LP +WL
Sbjct: 718 AKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFFTTLTIDDGNIEIVGTGVKGKLPTVWL 777

Query: 779 QYGQFKLKASGGDGTYSWYSENTSIATVDA-SGKVTLNGKGSVVIKATSGDKQTVSYTIK 837
QYGQ LKASGG+G Y+W S N +IA+VDA SG+VTL KG+ I S D QT +YTI
Sbjct: 778 QYGQVNLKASGGNGKYTWRSANPAIASVDASSGQVTLKEKGTTTISVISSDNQTATYTIA 837

Query: 838 APSYMI--KVDKQAYYADAMSICKNL---LPSTQTVLSDIYDSWGAANKYSHYSSMNSIT 892
P+ +I + K+ Y DA++ CKN LPS+Q L +++ +WGAANKY +Y S +I
Sbjct: 838 TPNSLIVPNMSKRVTYNDAVNTCKNFGGKLPSSQNELENVFKAWGAANKYEYYKSSQTII 897

Query: 893 AWIKQTSSEQRSGVSSTYNLITQNPLPGVNVNTPNVYAVCVE 934
+W++QT+ + +SGV+STY+L+ QNPL + + N YA CV+
Sbjct: 898 SWVQQTAQDAKSGVASTYDLVKQNPLNNIKASESNAYATCVK 939


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5111     PF05932       122    4e-39    Tir chaperone protein (CesT)
		>PF05932#Tir chaperone protein (CesT)

Length = 127

Score = 122 bits (309), Expect = 4e-39
Identities = 24/125 (19%), Positives = 52/125 (41%), Gaps = 5/125 (4%)

Query: 1 MSSRS-ELLLEKFAEKIGIGSISFNENRLCSFAIDEIYYISLS-DANDEYMMIYGVCGKF 58
MS+ + LL+ F+ + + + F+++ C+ ID + ++LS D E +++ G+
Sbjct: 1 MSNLFYKTLLDDFSRSLEMQPLVFDDHGTCNMIIDNTFALTLSCDYARERLLLIGLLEPH 60

Query: 59 PTDNSNFALEILNANLWFAENGGPYLCYEAGAQSLLLALRFPLDDATPEKLENEIEVVVK 118
+L L N GP L + + P + + L+ E+ +++
Sbjct: 61 KD---IPQQCLLAGALNPLLNAGPGLGLDEKSGLYHAYQSIPREKLSVPTLKREMAGLLE 117

Query: 119 SMENL 123
M
Sbjct: 118 WMRGW 122


ORFs having significant similarity with Known Virulence factors
LocusTagHitsScoreE-valueComments
Z5112TRNSINTIMINR7310.0 Translocated intimin receptor (Tir) signature.
		>TRNSINTIMINR#Translocated intimin receptor (Tir) signature.

Length = 549

Score = 731 bits (1887), Expect = 0.0
Identities = 328/566 (57%), Positives = 390/566 (68%), Gaps = 25/566 (4%)

Query: 1 MPIGNLGHNPNVNNSIPPAPPLPSQTDGA--GGRGQLINSTGPLGSRALFTPVRNSMADS 58
MPIGNLG+N N N+ IPPAPPLPSQTDGA GG G LI+STG LGSR+LF+P+RNSMADS
Sbjct: 1 MPIGNLGNNVNGNHLIPPAPPLPSQTDGAARGGTGHLISSTGALGSRSLFSPLRNSMADS 60

Query: 59 GDNRASDVPGLPVNPMRLAA--SEITLNDGFEVLHDHGPLDTLNRQIGSSVFRVETQEDG 116
D+R D+PGLP NP RLAA SE L GFEVLHD GPLD LN QIG S FRVE Q DG
Sbjct: 61 VDSR--DIPGLPTNPSRLAAATSETCLLGGFEVLHDKGPLDILNTQIGPSAFRVEVQADG 118

Query: 117 KHIAVGQRNGVETSVVLSDQEYARLQSIDPEGKDKFVFTGGRGGAGHAMVTVASDITEAR 176
H A+G++NG+E SV LS QE++ LQSID EGK++FVFTGGRGG+GH MVTVASDI EAR
Sbjct: 119 THAAIGEKNGLEVSVTLSPQEWSSLQSIDTEGKNRFVFTGGRGGSGHPMVTVASDIAEAR 178

Query: 177 QRILELLEPKGTGESK-GAGESKGVGELRESNSGAENTTETQTSTSTSSLRSDPKLWLAL 235
+IL L+P G + +++ VG S +ET TST+ SS+RSDPK W+++
Sbjct: 179 TKILAKLDPDNHGGRQPKDVDTRSVGVGSASGIDDGVVSETHTSTTNSSVRSDPKFWVSV 238

Query: 236 GTVATGLIGLAATGIVQALALTPEPDSPTTTDPDAAASATETATRDQLTKEAFQNPDNQK 295
G +A GL GLAATGI QALALTPEPD PTTTDPD AA+A E+AT+DQLT+EAF+NP+NQK
Sbjct: 239 GAIAAGLAGLAATGIAQALALTPEPDDPTTTDPDQAANAAESATKDQLTQEAFKNPENQK 298

Query: 296 VNIDELGNAIPSGVLKDDVVANIEEQAKAAGEEAKQQAIENNAQAQKKYDEQQAKRQEEL 355
VNID GNAIPSG LKDD+V I +QAK AGE A+QQA+E+NAQAQ++Y++Q A+RQEEL
Sbjct: 299 VNIDANGNAIPSGELKDDIVEQIAQQAKEAGEVARQQAVESNAQAQQRYEDQHARRQEEL 358

Query: 356 KVSSGAGYGLSGALILGGGIGVAVTAALHRKNQPVEQTTTTTTTTTTTSARTVENKPANN 415
++SSG GYGLS ALI+ GGIG VT ALHR+NQP EQTTTTTT TV +
Sbjct: 359 QLSSGIGYGLSSALIVAGGIGAGVTTALHRRNQPAEQTTTTTT-------HTVVQQQTGG 411

Query: 416 TPAQGNVDTPGSEDTMESRRSSMASTSSTFFDTSSIGTVQNPYADV---KTSLHDSQVPT 472
P P RR S S +ST + SS V NPYA+V + SL Q
Sbjct: 412 IPQHKVALMPQERRRFSDRRDSQGSVASTHWSDSS-SEVVNPYAEVGGARNSLSAHQPEE 470

Query: 473 SNSNTSVQNMGNTDSVVYSTIQHPPRDTTDNGARLLGNPSAGIQSTYARLALSGGLRHDM 532
+ + G YS IQ+ G RL+G P GIQSTYA LA SGGLR M
Sbjct: 471 HIYDEVAADPG------YSVIQNFSGSGPVTG-RLIGTPGQGIQSTYALLANSGGLRLGM 523

Query: 533 GGLTGGSNSAVNTSNNPPAPGSHRFV 558
GGLT G +AV++ N P PG RFV
Sbjct: 524 GGLTSGGETAVSSVNAAPTPGPVRFV 549


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z5114     PF06704  36     6e-06    DspF/AvrF protein
		>PF06704#DspF/AvrF protein

Length = 129

Score = 36.4 bits (84), Expect = 6e-06
Identities = 21/119 (17%), Positives = 51/119 (42%), Gaps = 4/119 (3%)

Query: 3 EKFRTDLAHTFGIALEEQTDVLSFHDNDGHEW-ILECASQSEILFFYCYLLNSESIQINS 61
+ L G +L Q V + +D+ +E ++E SE++ F+C + S +
Sbjct: 9 SRLIKSLGAQLGTSLTAQNGVCALYDSQDNEAAVIEMPDHSEMVIFHCRVGRSPDRAADL 68

Query: 62 ILEMNSNRELLGMF--FLSLKDDNILLNIAFPADKIDITEFANLMENGYLLKNEIIRSL 118
++ N ++ M + ++ ++ L +D +F + G++++ R+L
Sbjct: 69 QKLLSLNFDVARMHGSWFAVDQGDVRLCAQRELAVLDEAQFCDTA-RGFIVQAREARAL 126


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5124     FLGMRINGFLIF  56     1e-11    Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 55.7 bits (134), Expect = 1e-11
Identities = 32/166 (19%), Positives = 58/166 (34%), Gaps = 10/166 (6%)

Query: 22 EQLYTGLTEKEANQMQALLLSNDVNVSKEMDKSGNMTLSVEKEDFVRAITILNNNGFPKK 81
L++ L++++ + A L N+ + V + L G PK
Sbjct: 51 RTLFSNLSDQDGGAIVAQLTQM--NIPYRFANGSG-AIEVPADKVHELRLRLAQQGLPKG 107

Query: 82 KFADIEVIFPPSQLVASPSQENAKINYLKEQDIERLLSKIPGVIDCSVSLNVNNN----- 136
E + + S E E ++ R + + V V L +
Sbjct: 108 GAVGFE-LLDQEKFGISQFSEQVNYQRALEGELARTIETLGPVKSARVHLAMPKPSLFVR 166

Query: 137 ESQPSSAAVLVISSPEVNLAPSVIQ-IKNLVKNSVDDLKLENISVV 181
E + SA+V V P L I + +LV ++V L N+++V
Sbjct: 167 EQKSPSASVTVTLEPGRALDEGQISAVVHLVSSAVAGLPPGNVTLV 212


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5126     TYPE3OMGPROT  559    0.0      Type III secretion system outer membrane G protein ...
		>TYPE3OMGPROT#Type III secretion system outer membrane G protein family signature.

Length = 607

Score = 559 bits (1441), Expect = 0.0
Identities = 153/494 (30%), Positives = 262/494 (53%), Gaps = 24/494 (4%)

Query: 30 KSEYFIITKSSPVRAILNDFAANYSIPVFISSSVNDDFSGEIKNEKPVKVLEKLSKLYHL 89
Y + K +R +L DF ANY V +S +ND SG+ +++ P L+ ++ LY+L
Sbjct: 33 PIPYVYVAKGESLRDLLTDFGANYDATVVVSDKINDKVSGQFEHDNPQDFLQHIASLYNL 92

Query: 90 TWYYDENILYIYKTNEISRSIITPTYLDIDSLLKYLSDTISVNKNSCNVRKITTFNSIEV 149
WYYD N+LYI+K +E++ +I + L + L + + R + + V
Sbjct: 93 VWYYDGNVLYIFKNSEVASRLIRLQESEAAELKQALQR-SGIWEPRFGWRPDASNRLVYV 151

Query: 150 RGVPECIKYITSLSESLDKEAQSKAKNKD--VVKVFKLNYASATDITYKYRDQNVVVPGV 207
G P ++ + + +L+++ Q +++ +++F L YASA+D T YRD V PGV
Sbjct: 152 SGPPRYLELVEQTAAALEQQTQIRSEKTGALAIEIFPLKYASASDRTIHYRDDEVAAPGV 211

Query: 208 VSILKTMASNGSLP--STGKGAVERSGNLFDNSVTISADPRLNAVVVKDREITMDIYQQL 265
+IL+ + S+ ++ + + ++ + ADP LNA++V+D M +YQ+L
Sbjct: 212 ATILQRVLSDATIQQVTVDNQRIPQAATRASAQARVEADPSLNAIIVRDSPERMPMYQRL 271

Query: 266 ISELDIEQRQIEISVSIIDVDANDLQQLGVNWSGTLNAGQGTIA--------FNSSTAQA 317
I LD +IE+++SI+D++A+ L +LGV+W + G N ++ A
Sbjct: 272 IHALDKPSARIEVALSIVDINADQLTELGVDWRVGIRTGNNHQVVIKTTGDQSNIASNGA 331

Query: 318 NISSSVISNASNFMIRVNALQQNSKAKILSQPSIITLNNMQAILDKNVTFYTKVSGEKVA 377
S + RVN L+ A+++S+P+++T N QA++D + T+Y KV+G++VA
Sbjct: 332 LGSLVDARGLDYLLARVNLLENEGSAQVVSRPTLLTQENAQAVIDHSETYYVKVTGKEVA 391

Query: 378 SLESITSGTLLRVTPRILDDSSNSLTGKRRERVRLLLDIQDGNQSTNQSNAQDASSTLPE 437
L+ IT GT+LR+TPR+L S + L L I+DGNQ N S + +P
Sbjct: 392 ELKGITYGTMLRMTPRVLTQGDKS-------EISLNLHIEDGNQKPNSSGIE----GIPT 440

Query: 438 VQNSEMTTEATLSAGESLLLGGFIQDKESSSKDGIPLLSDIPVIGSLFSSTVKQKHSVVR 497
+ + + T A + G+SL++GG +D+ S + +PLL DIP IG+LF + VR
Sbjct: 441 ISRTVVDTVARVGHGQSLIIGGIYRDELSVALSKVPLLGDIPYIGALFRRKSELTRRTVR 500

Query: 498 LFLIKATPIKSASS 511
LF+I+ I +
Sbjct: 501 LFIIEPRIIDEGIA 514


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5127     SYCDCHAPRONE  139    4e-45    Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 139 bits (352), Expect = 4e-45
Identities = 33/142 (23%), Positives = 63/142 (44%)

Query: 6 SSLEDIYDFYQDGGTLASLTNLTQQDLNDLHSYAYTAYQSGDVITARNLFHLLTYLEHWN 65
+ F + GGT+A L ++ L L+S A+ YQSG A +F L L+H++
Sbjct: 10 EYQLAMESFLKGGGTIAMLNEISSDTLEQLYSLAFNQYQSGKYEDAHKVFQALCVLDHYD 69

Query: 66 YDYTLSLGLCHQRLSNHEDAQLCFARCATLVMQDPRASYYSGISYLLVGNKKMAKKAFKA 125
+ L LG C Q + ++ A ++ A + +++PR +++ L G A+
Sbjct: 70 SRFFLGLGACRQAMGQYDLAIHSYSYGAIMDIKEPRFPFHAAECLLQKGELAEAESGLFL 129

Query: 126 CLMWCNEKEKYTTYKENIKKLL 147
+K ++ + +L
Sbjct: 130 AQELIADKTEFKELSTRVSSML 151


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
Z5131     OMPTIN  26     0.048    Omptin serine protease signature.
		>OMPTIN#Omptin serine protease signature.

Length = 317

Score = 26.5 bits (58), Expect = 0.048
Identities = 13/38 (34%), Positives = 17/38 (44%), Gaps = 4/38 (10%)

Query: 115 AYNAGYFNTPNAVELRRQYAMKIYKTYNKLKNNEQIID 152
A NAGY+ TPNA + Y + K N + D
Sbjct: 254 AVNAGYYVTPNA----KVYVEGAWNRVTNKKGNTSLYD 287


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5132     TYPE3IMSPROT  376    e-132    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 376 bits (967), Expect = e-132
Identities = 123/339 (36%), Positives = 195/339 (57%), Gaps = 4/339 (1%)

Query: 2 SEKTEKPTPKKLRDLKKKGDVTKSEEVMAAVQSLILFSFFSLYGMS--FFVDIVGLVNTT 59
EKTE+PTPKK+RD +KKG V KS+EV++ LI+ L G+S +F L+
Sbjct: 3 GEKTEQPTPKKIRDARKKGQVAKSKEVVSTA--LIVALSAMLMGLSDYYFEHFSKLMLIP 60

Query: 60 IDSLNRPFLYAIREILGAVLNIFLLYILPISLIVFVGTVTTGVSQIGFIFAVEKIKPSAQ 119
+ PF A+ ++ VL F P+ + + + + V Q GF+ + E IKP +
Sbjct: 61 AEQSYLPFSQALSYVVDNVLLEFFYLCFPLLTVAALMAIASHVVQYGFLISGEAIKPDIK 120

Query: 120 KISVKNNLKNIFSVKSIFELLKSVFKLVIIVLIFYFMGHSYANEFANFTGLNAYQALVVV 179
KI+ K IFS+KS+ E LKS+ K+V++ ++ + + ++
Sbjct: 121 KINPIEGAKRIFSIKSLVEFLKSILKVVLLSILIWIIIKGNLVTLLQLPTCGIECITPLL 180

Query: 180 AFFVFLLWKGVLFGYLLFSVFDFWFQKHEGLKKMKMSKDEVKREAKDTDGNPEIKGERRR 239
+ L G+++ S+ D+ F+ ++ +K++KMSKDE+KRE K+ +G+PEIK +RR+
Sbjct: 181 GQILRQLMVICTVGFVVISIADYAFEYYQYIKELKMSKDEIKREYKEMEGSPEIKSKRRQ 240

Query: 240 LHSEIQSGSLANNIKKSTVIVKNPTHIAICLYYKLGETPLPLVIETGKDAKALQIIKLAE 299
H EIQS ++ N+K+S+V+V NPTHIAI + YK GETPLPLV DA+ + K+AE
Sbjct: 241 FHQEIQSRNMRENVKRSSVVVANPTHIAIGILYKRGETPLPLVTFKYTDAQVQTVRKIAE 300

Query: 300 LYDIPVIEDIPLARTLYKNIHKGQYITEDFFEPVAQLIR 338
+P+++ IPLAR LY + YI + E A+++R
Sbjct: 301 EEGVPILQRIPLARALYWDALVDHYIPAEQIEATAEVLR 339


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5133     TYPE3IMRPROT  155    1e-48    Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 155 bits (394), Expect = 1e-48
Identities = 46/230 (20%), Positives = 102/230 (44%), Gaps = 4/230 (1%)

Query: 11 SFYCILRPLGMFIILPIFSTGVLLSNFIRNSIMIAFTLPIIVENYTFSEKLPSGIFQLTG 70
F+ +LR L + PI S + R + +A + + + +P F
Sbjct: 16 YFWPLLRVLALISTAPILSERSVPK---RVKLGLAMMITFAIAPSLPANDVPVFSFFALW 72

Query: 71 IALKEISIGFFIGLSFTILFWAIDAAGQIIDTLRGSTISSIFNPSISDSSSITGVILYQF 130
+A+++I IG +G + F A+ AG+II G + ++ +P+ + + I+
Sbjct: 73 LAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHLNMPVLARIMDML 132

Query: 131 ISVIFVIHGGIQSILDKLYLSYEILPLQADIAFNRALIDFLFSLWDSFIKLMLSFSVPMI 190
++F+ G ++ L ++ LP+ + + A + + F+ L ++P+I
Sbjct: 133 ALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIFL-NGLMLALPLI 191

Query: 191 IGIFLCDMGFGFLNKTAPQLNVFTLSLPVKSLIAIFILLLVIHVFPDFIT 240
+ ++ G LN+ APQL++F + P+ + I ++ ++ + F
Sbjct: 192 TLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFCE 241


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5134     TYPE3IMQPROT  69     2e-19    Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.

Length = 86

Score = 69.0 bits (169), Expect = 2e-19
Identities = 25/78 (32%), Positives = 45/78 (57%)

Query: 7 VQLCVQTFWIIFILSLPTVIAASVIGIIISLVQAITQLQDQTLPFLLKIIAVFATLALTY 66
V + +++ ILS I A++IG+++ L Q +TQLQ+QTLPF +K++ V L L
Sbjct: 5 VFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLFLLS 64

Query: 67 HWMGTTIINFSSIIFEMI 84
W G ++++ + +
Sbjct: 65 GWYGEVLLSYGRQVIFLA 82


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5135     TYPE3IMPPROT  223    2e-76    Type III secretion system inner membrane P protein ...
		>TYPE3IMPPROT#Type III secretion system inner membrane P protein family signature.

Length = 224

Score = 223 bits (570), Expect = 2e-76
Identities = 89/212 (41%), Positives = 136/212 (64%), Gaps = 9/212 (4%)

Query: 12 IFLIIVFFLLSLLPIFVVIGTSFLKISIVLGILKNALGIQQVPPNMALTSVSLILTMFIM 71
I LI + +LLP + GT F+K SIV +++NALG+QQ+P NM L V+L+L+MF+M
Sbjct: 5 ISLIALLAFSTLLPFIIASGTCFVKFSIVFVMVRNALGLQQIPSNMTLNGVALLLSMFVM 64

Query: 72 SPIILQINDNISQEPINYTDSDFFQKVDEKILSPYRGFLEKNTEKDNVEFFERAAQKKLG 131
PI+ E + + D K ++ L YR +L K ++++ V+FFE A K+
Sbjct: 65 WPIMHDAYVYFEDEDVTFNDISSLSKHVDEGLDGYRDYLIKYSDRELVQFFENAQLKRQY 124

Query: 132 NETI---------LKKDSLFILLPAFTMGQLEAAFKIGFLLYLPFIAIDLIISNILLALG 182
E ++K S+F LLPA+ + ++++AFKIGF LYLPF+ +DL++S++LLALG
Sbjct: 125 GEETETVKRDKDEIEKPSIFALLPAYALSEIKSAFKIGFYLYLPFVVVDLVVSSVLLALG 184

Query: 183 MMMVSPVTISIPFKILLFILVGGWQKLFEFLL 214
MMM+SPVTIS P K++LF+ + GW L + L+
Sbjct: 185 MMMMSPVTISTPIKLVLFVALDGWTLLSKGLI 216


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z5142     PF06872  724    0.0      EspG protein
		>PF06872#EspG protein

Length = 398

Score = 724 bits (1871), Expect = 0.0
Identities = 293/397 (73%), Positives = 338/397 (85%)

Query: 1 MILVAKLFITNQIGESLMINGLNNDSASLVLDAAMKVNSGFKKSWDEMSCAEKLFKVLSF 60
MILV K+F+ ++ + M+NGLNN+SASLVLDA +K+NS +KK W+EM+CAEKL K+L+
Sbjct: 1 MILVIKIFVIDETERAFMLNGLNNNSASLVLDATIKINSDYKKPWNEMTCAEKLLKILTL 60

Query: 61 GLWNPTYSRSERQSFQELLTVLEPVYPLPNELGRVSARFSDGSSLRISVTNSELVEAEIR 120
GLWNP YS+ ERQ FQ LLTVLEPV P NELGRV A+FSDGSSLRISVTNSEL+EAEI
Sbjct: 61 GLWNPKYSQDERQQFQGLLTVLEPVSPAHNELGRVYAKFSDGSSLRISVTNSELIEAEIH 120

Query: 121 TANNEKITVLLESNEQNRLLQSLPIDRHMPYIQVHRALSEMDLTDTTSMRNLLGFTSKLS 180
T NNEK VLLE+NEQNRLLQSLPI+RHMPYIQVH L + +LTD SM LL FTSKLS
Sbjct: 121 TPNNEKFLVLLEANEQNRLLQSLPINRHMPYIQVHHTLPQEELTDLLSMHKLLSFTSKLS 180

Query: 181 TTLIPHNAQTDPLSGPTPFSSIFMDTCRGLGNAKLSLNGVDIPANAQKLLRDALGLKDTH 240
TLIPHN QTDPLSG TPFS++FMDT RGLGN+KLSLNGVDIPA+AQKLLR+ LGLKDT+
Sbjct: 181 ATLIPHNNQTDPLSGLTPFSTVFMDTSRGLGNSKLSLNGVDIPADAQKLLRNTLGLKDTN 240

Query: 241 SSPTRNVIDHGISRHDAEQIARESSGSDKQKAEVVEFLCHPEAATAICSAFYQSFNVPAL 300
SSP NVI +GI RH AEQI +ESS +++QKA VV+FLC PEA TAICSAFYQSFNVPAL
Sbjct: 241 SSPDLNVIRNGIPRHYAEQIVKESSSTNEQKAAVVDFLCQPEAPTAICSAFYQSFNVPAL 300

Query: 301 TLTHERISKASEYNAERSLDTPNACINISISQSSDGNIYVTSHTGVLIMAPEDRPNEMGM 360
LTH RIS+AS YNA+RSLD PNACINISI+QSS+G+I+VTSHTGVLIMAPEDRPN++GM
Sbjct: 301 MLTHVRISQASAYNAQRSLDMPNACINISITQSSEGSIHVTSHTGVLIMAPEDRPNQLGM 360

Query: 361 LTNRTSYEVPQGVKCIIDEMVSALQPRYAASETYLQN 397
LTNRTSYEVP GVKC +EM L+ +YA+SETYL N
Sbjct: 361 LTNRTSYEVPPGVKCEPNEMARMLKAKYASSETYLNN 397


65  Z5158  Z5165  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z5158     0       26       -6.615800   sensory histidine kinase UhpB
Z5159     1       35       -11.645379  DNA-binding transcriptional activator UhpA
Z5160     0       25       -7.971007   hypothetical protein
Z5161     0       22       -4.936070   hypothetical protein
Z5162     1       17       -1.191858   hypothetical protein
Z5163     0       14       0.757943    hypothetical protein
Z5164     1       16       3.307775    acetolactate synthase 1 regulatory subunit
Z5165     1       15       3.540847    acetolactate synthase catalytic subunit
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z5158     PF06580  38     5e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 38.3 bits (89), Expect = 5e-05
Identities = 28/142 (19%), Positives = 56/142 (39%), Gaps = 11/142 (7%)

Query: 308 LRPRQLDDLTLEQAIRSLMREMELEGRGIVSHLEWRIEESALSENQRVTLFRVCQEGLNN 367
LR ++L + + ++L L++ + + + +V + Q + N
Sbjct: 208 LRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQINPAIMDVQVPPM-LVQTLVEN 266

Query: 368 IVKHA-----DASAVTLQGWLQDERLMLVIEDDGSGLPPGSGQ-QGFGLTGMRERVTALG 421
+KH + L+G + + L +E+ GS + + G GL +RER+ L
Sbjct: 267 GIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKESTGTGLQNVRERLQMLY 326

Query: 422 G---TLTISCLHG-TRVSVSLP 439
G + +S G V +P
Sbjct: 327 GTEAQIKLSEKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
Z5159     HTHFIS  61     2e-13    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 61.4 bits (149), Expect = 2e-13
Identities = 29/174 (16%), Positives = 59/174 (33%), Gaps = 20/174 (11%)

Query: 2 ITVALIDDHLIVRSGFAQLLGLEPDLQVVAEFGSGREALAGLPGRGVQVCICDISMPDIS 61
T+ + DD +R+ Q L V + + + + D+ MPD +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRA-GYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 62 GLELLSQLPK---GMATIMLSVHDSPALVEQALNAGARGFLSKRCSPDELIAAVHTVATG 118
+LL ++ K + +++S ++ +A GA +L K ELI +
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR---- 117

Query: 119 GCYLTPDIAIKLASGRQDPLTKRERQVAEKLAQG---MAVKEIAAELGLSPKTV 169
A+ R L + + + + + A L + T+
Sbjct: 118 --------ALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTL 163


66  Z5189  Z5233  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z5189     3       22       2.713250    hypothetical protein
Z5190     3       23       3.021504    DNA gyrase subunit B
Z5191     1       21       2.614744    recombination protein F
Z5192     0       20       3.022417    DNA polymerase III subunit beta
Z5193     -1      19       2.170672    chromosome replication initiator DnaA
Z5194     -1      15       -3.838377   50S ribosomal protein L34
Z5195     -1      15       -3.993841   ribonuclease P
Z5197     -1      16       -4.690028   inner membrane protein translocase component
Z5198     1       19       -5.844764   tRNA modification GTPase TrmE
Z5199     1       22       -8.479606   hypothetical protein
Z5200     0       18       -6.361235   hypothetical protein
Z5201     -2      11       -1.547407   hypothetical protein
Z5202     -2      12       -1.178886   tryptophanase
Z5203     -2      12       -0.764935   tryptophanase
Z5204     -3      12       -0.018263   tryptophan permease
Z5205     -3      13       1.016814    multidrug efflux system protein MdtL
Z5206     -1      16       -6.615232   DNA-binding transcriptional regulator YidZ
Z5207     0       21       -9.168077   hypothetical protein
Z5208     0       24       -11.319667  hypothetical protein
Z5209     3       36       -15.441126  membrane / transport protein
Z5210     3       39       -17.120641  6-phosphogluconate phosphatase
Z5211     3       36       -16.689457  hypothetical protein
Z5212     0       24       -11.950200  hypothetical protein
Z5213     1       18       -8.572982   hypothetical protein
Z5214     -1      17       -6.433169   hypothetical protein
Z5215     -3      20       0.064287    transcriptional regulator PhoU
Z5216     -2      13       -0.758525   phosphate transporter ATP-binding protein
Z5217     -1      9        -1.715006   phosphate transporter permease subunit PtsA
Z5218     -1      9        -2.767059   phosphate transporter permease subunit PstC
Z5219     0       15       -4.365426   phosphate ABC transporter substrate-binding
Z5220     1       21       -5.300029   fimbrial protein
Z5221     0       12       -2.826135   fimbrial protein
Z5222     -1      14       -1.780581   fimbrial usher
Z5223     -1      19       -0.679747   fimbrial chaperone
Z5224     2       28       0.530685    fimbrial chaperone
Z5225     2       29       1.539268    major fimbrial subunit
Z5227     3       37       2.254330    glucosamine--fructose-6-phosphate
Z5228     3       35       2.051874    bifunctional N-acetylglucosamine-1-phosphate
Z5229     4       39       2.146678    ATP synthase F0F1 subunit epsilon
Z5230     4       41       2.058463    ATP synthase F0F1 subunit beta
Z5231     2       34       1.021112    ATP synthase F0F1 subunit gamma
Z5232     3       34       0.761679    ATP synthase F0F1 subunit alpha
Z5233     2       21       -0.453652   ATP synthase F0F1 subunit delta
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
Z5197     60KDINNERMP  874    0.0      60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 874 bits (2260), Expect = 0.0
Identities = 547/548 (99%), Positives = 548/548 (100%)

Query: 1 MDSQRNLLVIALLFVSFMIWQAWEQDKNPQPQAQQTTQTTTTAAGSAADQGVPASGQGKL 60
MDSQRNLLVIALLFVSFMIWQAWEQDKNPQPQAQQTTQTTTTAAGSAADQGVPASGQGKL
Sbjct: 1 MDSQRNLLVIALLFVSFMIWQAWEQDKNPQPQAQQTTQTTTTAAGSAADQGVPASGQGKL 60

Query: 61 ISVKTDVLDLTINTRGGDVEQALLPAYPKELNSTQPFQLLETSPQFIYQAQSGLTGRDGP 120
ISVKTDVLDLTINTRGGDVEQALLPAYPKELNSTQPFQLLETSPQFIYQAQSGLTGRDGP
Sbjct: 61 ISVKTDVLDLTINTRGGDVEQALLPAYPKELNSTQPFQLLETSPQFIYQAQSGLTGRDGP 120

Query: 121 DNPANGPRPLYNVEKDAYVLAEGQNELQVPMTYTDAAGNTFTKTFVLKRGDYAVNVNYNV 180
DNPANGPRPLYNVEKDAYVLAEGQNELQVPMTYTDAAGNTFTKTFVLKRGDYAVNVNYNV
Sbjct: 121 DNPANGPRPLYNVEKDAYVLAEGQNELQVPMTYTDAAGNTFTKTFVLKRGDYAVNVNYNV 180

Query: 181 QNAGEKPLEISTFGQLKQSITLPPHLDTGSSNFALHTFRGAAYSTPDEKYEKYKFDTIAD 240
QNAGEKPLEIS+FGQLKQSITLPPHLDTGSSNFALHTFRGAAYSTPDEKYEKYKFDTIAD
Sbjct: 181 QNAGEKPLEISSFGQLKQSITLPPHLDTGSSNFALHTFRGAAYSTPDEKYEKYKFDTIAD 240

Query: 241 NENLNISSKGGWVAMLQQYFATAWIPHNDGTNNFYTANLGNGIAAIGYKSQPVLVQPGQT 300
NENLNISSKGGWVAMLQQYFATAWIPHNDGTNNFYTANLGNGIAAIGYKSQPVLVQPGQT
Sbjct: 241 NENLNISSKGGWVAMLQQYFATAWIPHNDGTNNFYTANLGNGIAAIGYKSQPVLVQPGQT 300

Query: 301 GAMNSTLWVGPEIQDKMAAVAPHLDLTVDYGWLWFISQPLFKLLKWIHSFVGNWGFSIII 360
GAMNSTLWVGPEIQDKMAAVAPHLDLTVDYGWLWFISQPLFKLLKWIHSFVGNWGFSIII
Sbjct: 301 GAMNSTLWVGPEIQDKMAAVAPHLDLTVDYGWLWFISQPLFKLLKWIHSFVGNWGFSIII 360

Query: 361 ITFIVRGIMYPLTKAQYTSMAKMRMLQPKIQAMRERLGDDKQRISQEMMALYKAEKVNPL 420
ITFIVRGIMYPLTKAQYTSMAKMRMLQPKIQAMRERLGDDKQRISQEMMALYKAEKVNPL
Sbjct: 361 ITFIVRGIMYPLTKAQYTSMAKMRMLQPKIQAMRERLGDDKQRISQEMMALYKAEKVNPL 420

Query: 421 GGCFPLLIQMPIFLALYYMLMGSVELRQAPFALWIHDLSAQDPYYILPILMGVTMFFIQK 480
GGCFPLLIQMPIFLALYYMLMGSVELRQAPFALWIHDLSAQDPYYILPILMGVTMFFIQK
Sbjct: 421 GGCFPLLIQMPIFLALYYMLMGSVELRQAPFALWIHDLSAQDPYYILPILMGVTMFFIQK 480

Query: 481 MSPTTVTDPMQQKIMTFMPVIFTVFFLWFPSGLVLYYIVSNLVTIIQQQLIYRGLEKRGL 540
MSPTTVTDPMQQKIMTFMPVIFTVFFLWFPSGLVLYYIVSNLVTIIQQQLIYRGLEKRGL
Sbjct: 481 MSPTTVTDPMQQKIMTFMPVIFTVFFLWFPSGLVLYYIVSNLVTIIQQQLIYRGLEKRGL 540

Query: 541 HSREKKKS 548
HSREKKKS
Sbjct: 541 HSREKKKS 548


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
Z5199     FLGHOOKFLIE  25     0.018    Flagellar hook-basal body complex protein FliE signa...
		>FLGHOOKFLIE#Flagellar hook-basal body complex protein FliE signature.

Length = 103

Score = 25.4 bits (55), Expect = 0.018
Identities = 9/44 (20%), Positives = 16/44 (36%)

Query: 19 LKDVMMQLEAKNNEGKYVISKANGNPVFKELFWKAIDEFNFPQE 62
++ V+ QL+A + S F A+D + Q
Sbjct: 6 IEGVISQLQATAMSARAQESLPQPTISFAGQLHAALDRISDTQT 49


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z5205     TCRTETA  54     3e-10    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 54.4 bits (131), Expect = 3e-10
Identities = 61/257 (23%), Positives = 97/257 (37%), Gaps = 10/257 (3%)

Query: 2 SRFLICSFALVLLYPAGIDMYLVGLPRIAADLNASEAQLHIAFSVYLAGMAAAML----F 57
+R LI + V L GI + + LP + DL S + + + LA A
Sbjct: 4 NRPLIVILSTVALDAVGIGLIMPVLPGLLRDLVHSN-DVTAHYGILLALYALMQFACAPV 62

Query: 58 AGKVADRSGRKPVAIPGAALFIITSVFCSLAETSTLFLAGRFLQGLGAGCCYVVAFAILR 117
G ++DR GR+PV + A + + A + GR + G+ G VA A +
Sbjct: 63 LGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGI-TGATGAVAGAYIA 121

Query: 118 DTLDDRRRAKVLSLLNGITCIIPVLAPVLGHLIMLKFPWQSLFWTMAIMGIAVLMLSLFI 177
D D RA+ ++ V PVLG L M F + F+ A + + F+
Sbjct: 122 DITDGDERARHFGFMSACFGFGMVAGPVLGGL-MGGFSPHAPFFAAAALNGLNFLTGCFL 180

Query: 178 LKETRPAAPAASDKSRENSESLLNRFFLSRVVITTLSVSVILTFVNTSPVLLMEIMGFER 237
L E+ + N + VV ++V I+ V P L I G +R
Sbjct: 181 LPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGEDR 240

Query: 238 GEYATIMALTAGVSMTV 254
+ G+S+
Sbjct: 241 FHWDATT---IGISLAA 254


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5220     FIMBRIALPAPF  32     0.002    Escherichia coli: P pili tip fibrillum papF protein...
		>FIMBRIALPAPF#Escherichia coli: P pili tip fibrillum papF protein signature.

Length = 167

Score = 32.0 bits (72), Expect = 0.002
Identities = 38/169 (22%), Positives = 67/169 (39%), Gaps = 11/169 (6%)

Query: 189 LYANISSTTTRGEAIAKVRISGSLTAPQSCQINAGQVIYFDFDTIPASEFSSTAGQAITS 248
L+ ++ T+ A ++ I G++ P C IN GQ I DF I ++ G+
Sbjct: 6 LFISLLLTSVAVLADVQINIRGNVYIP-PCTINNGQNIVVDFGNINPEHVDNSRGE---- 60

Query: 249 RKITKTVSIECTGMGYERTQKVDASFTGTNRSSDDTMVATDNADVGIKIYNKSNAEVSVN 308
+TK +SI C KV + G + + ++AT+ GI +Y +
Sbjct: 61 --VTKNISISCPYKSGSLWIKVTGNTMGVGQ---NNVLATNITHFGIALYQGKGMSTPLT 115

Query: 309 NGKLPADMGNTTI-FGRKNGSVTFSAAPASFTGARPQPGVFNATATLTI 356
G + T + TF++ P G F TA++++
Sbjct: 116 LGNGSGNGYRVTAGLDTARSTFTFTSVPFRNGSGILNGGDFRTTASMSM 164


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z5222     PF00577  769    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 769 bits (1986), Expect = 0.0
Identities = 329/875 (37%), Positives = 481/875 (54%), Gaps = 58/875 (6%)

Query: 6 LFITLASGICLLCSISAFARDSLFNPRLLELDHPADNIDIHQFNRSNTLPAGTYKVDVMI 65
F+ L + + FNPR L D P D+ +F LP GTY+VD+ +
Sbjct: 26 FFVRLFVACAFAAQAPLSSAELYFNPRFLADD-PQAVADLSRFENGQELPPGTYRVDIYL 84

Query: 66 NGMLFERQEVKFAQDNPDAELHPCYVAIKNVLATYGIKVDAIKSLANVDDKTCVNPVPLI 125
N ++V F + + + PC LA+ G+ ++ + + D CV +I
Sbjct: 85 NNGYMATRDVTFNTGDSEQGIVPCLTR--AQLASMGLNTASVSGMNLLADDACVPLTSMI 142

Query: 126 DGATWLLDASKLALNITIPQIYLNNAVNGYISPSRWDQGINAMMMNYDFSASHTIRSNYD 185
AT LD + LN+TIPQ +++N GYI P WD GINA ++NY+FS + ++
Sbjct: 143 HDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSV-QNRIG 201

Query: 186 DDDDSYYLNLRNGINLGAWRFRNYSTLN------SYDGNVDYHSVSNYIQRDIMALRSQI 239
+ YLNL++G+N+GAWR R+ +T + S + ++ +++RDI+ LRS++
Sbjct: 202 GNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRSRL 261

Query: 240 MIGDTWTASDVFDSTQVRGVRLYTDDDMLPSSQNGFAPVVHGIAKTNATVIIKQNGYVIY 299
+GD +T D+FD RG +L +DD+MLP SQ GFAPV+HGIA+ A V IKQNGY IY
Sbjct: 262 TLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIY 321

Query: 300 QSAVPQGAFALTDLNTTSSGGDLDVTIKEEDGSEQHFIQPFTSLAILKREGQTDVDLSIG 359
S VP G F + D+ + GDL VTIKE DGS Q F P++S+ +L+REG T ++ G
Sbjct: 322 NSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITAG 381

Query: 360 EVR--DESGFTPEVLQLQAMHGFPLGITLYGGTQLANDYASAALGIGKDMGALGAISFDV 417
E R + P Q +HG P G T+YGGTQLA+ Y + GIGK+MGALGA+S D+
Sbjct: 382 EYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALSVDM 441

Query: 418 THARSQFDYDDNESGQSYRFLYSKRFEDTNTTFRLVGYRYSMEGFYTLNEWVSRQDNDSD 477
T A S D GQS RFLY+K ++ T +LVGYRYS G++ + + N +
Sbjct: 442 TQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGYN 501

Query: 478 -----------------FWVTGNRRSRFEGTWTQSFTPGWGNIYLTFSRQEYWQTDEVER 520
+ + N+R + + T TQ +YL+ S Q YW T V+
Sbjct: 502 IETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQL-GRTSTLYLSGSHQTYWGTSNVDE 560

Query: 521 LLQFGYNNNWRNISWNVSWNYTDSIKRSLGNHHDDNNDDFGKEQIFMFSMSIPLSCWMED 580
Q G N + +I+W +S++ T ++ G++Q+ +++IP S W+
Sbjct: 561 QFQAGLNTAFEDINWTLSYSLT----KNAWQK--------GRDQMLALNVNIPFSHWLRS 608

Query: 581 --------SYVNYSLTQNNHHESTMQVGLNGTMLEGRNLSYNVQESWMHSPDDSYSGNAG 632
+ +YS++ + + T G+ GT+LE NLSY+VQ + D SG+ G
Sbjct: 609 DSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGY-AGGGDGNSGSTG 667

Query: 633 ---MTYDGTYGSVNGSYSWSRDSQHFDYGARGGVLVHSDGVTFSQELGETVALVKAPGAE 689
+ Y G YG+ N YS S D + YG GGVL H++GVT Q L +TV LVKAPGA+
Sbjct: 668 YATLNYRGGYGNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNDTVVLVKAPGAK 727

Query: 690 GLSIENATGISTDWRGYTVKTQLSPYDENRVALNSDYFSKANIELENTVINLVPTRGAVV 749
+EN TG+ TDWRGY V + Y ENRVAL+++ + N++L+N V N+VPTRGA+V
Sbjct: 728 DAKVENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLA-DNVDLDNAVANVVPTRGAIV 786

Query: 750 KAEFVTHVGYRVLFNVRQVNGKPIMFGAMATASLETGTVTGIVGDNGELYLSGMPEKGEF 809
+AEF VG ++L + N KP+ FGAM T E+ +GIV DNG++YLSGMP G+
Sbjct: 787 RAEFKARVGIKLLMTLTH-NNKPLPFGAMVT--SESSQSSGIVADNGQVYLSGMPLAGKV 843

Query: 810 LLSWGQAADEKCKAAYHITHKPDDTSLVQMDAICR 844
+ WG+ + C A Y + + L Q+ A CR
Sbjct: 844 QVKWGEEENAHCVANYQLPPESQQQLLTQLSAECR 878


67  Z5325  Z5336  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias     Product
Z5325     -1      21       4.469273    hypothetical protein
Z5326     -2      17       3.419752    diaminopimelate epimerase
Z5327     -2      16       2.557576    hypothetical protein
Z5328     -2      17       2.114275    site-specific tyrosine recombinase XerC
Z5329     -2      14       0.347076    flavin mononucleotide phosphatase
Z5330     -2      10       -2.942727   DNA-dependent helicase II
Z5331     0       18       -7.845396   hypothetical protein
Z5332     1       24       -9.942924   hypothetical protein
Z5333     4       30       -12.217383  magnesium/nickel/cobalt transporter CorA
Z5334     3       32       -12.980297  hypothetical protein
Z5335     3       29       -9.830690   hypothetical protein
Z5336     2       19       -4.206717   magnesium/cobalt transporter CorA
68  Z5365  Z5409  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z5365     -2      19       3.101731    FMN reductase
Z5366     -2      20       3.090427    3-ketoacyl-CoA thiolase
Z5367     -2      18       2.088570    multifunctional fatty acid oxidation complex
Z5369     -3      15       1.078941    proline dipeptidase
Z5370     -2      14       0.435374    hypothetical protein
Z5371     -1      15       -0.957499   potassium transporter
Z5372     -1      14       -2.404386   protoporphyrinogen oxidase
Z5388     -2      20       -5.369411   **molybdopterin-guanine dinucleotide biosynthesis
Z5389     -3      19       -5.753767   molybdopterin-guanine dinucleotide biosynthesis
Z5390     -2      14       -3.694616   hypothetical protein
Z5391     -2      13       -3.284991   serine/threonine protein kinase
Z5392     -1      13       -2.977242   protein disulfide isomerase
Z5393     -1      11       -2.100705   hypothetical protein
Z5394     0       15       0.805102    acyltransferase
Z5398     0       15       1.923594    DNA polymerase I
Z5400     0       19       2.512668    ribosome biogenesis GTP-binding protein YsxC
Z5402     2       25       2.562049    hypothetical protein
Z5403     1       23       2.198491    coproporphyrinogen III oxidase
Z5404     1       19       0.953225    nitrogen regulation protein NR(I)
Z5405     0       20       -1.083107   nitrogen regulation protein NR(II)
Z5406     1       21       -2.608468   glutamine synthetase
Z5407     1       14       -4.394960   GTP-binding protein
Z5408     0       12       -4.995108   transcriptional regulator
Z5409     -1      11       -3.382086   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z5402     SECA  30     0.004    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 30.2 bits (68), Expect = 0.004
Identities = 11/71 (15%), Positives = 29/71 (40%)

Query: 13 AKARRKTREELNQEARDRKRQKKRRGHAPGSRAAGGNNTSGSKGQNAPKDPRIGSKTPIP 72
+K + + EE+ + + R+ + +R ++ + + + ++G P P
Sbjct: 827 SKVQVRMPEEVEELEQQRRMEAERLAQMQQLSHQDDDSAAAAALAAQTGERKVGRNDPCP 886

Query: 73 LGVTEKVTKQH 83
G +K + H
Sbjct: 887 CGSGKKYKQCH 897


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
Z5404     HTHFIS  601    0.0      FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 601 bits (1552), Expect = 0.0
Identities = 205/478 (42%), Positives = 299/478 (62%), Gaps = 11/478 (2%)

Query: 1 MQRGIVWVVDDDSSIRWVLERALAGAGLTCTTFENGXEVLEALASKTPDVLLSDIRMPGM 60
M + V DDD++IR VL +AL+ AG N + +A+ D++++D+ MP
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 DGLALLKQIKQRHPMLPVIIMTAHSDLDAAVSAYQQGAFDYLPKPFDIDEAVALVERAIS 120
+ LL +IK+ P LPV++M+A + A+ A ++GA+DYLPKPFD+ E + ++ RA++
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 121 HYQEQQQPRNVQLNGPTTDIIGEAPAMQDVFRIIGRLSRSSISVLINGESGTGKELVAHA 180
+ + ++G + AMQ+++R++ RL ++ ++++I GESGTGKELVA A
Sbjct: 121 EPKRRPSKLEDDSQDGM-PLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARA 179

Query: 181 LHRHSPRAKAPFIALNMAAIPKDLIESELFGHEKGAFTGANTIRQGRFEQADGGTLFLDE 240
LH + R PF+A+NMAAIP+DLIESELFGHEKGAFTGA T GRFEQA+GGTLFLDE
Sbjct: 180 LHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDE 239

Query: 241 IGDMPLDVQTRLLRVLADGQFYRVGGYAPVKVDVRIIAATHQNLEQRVQEGKFREDLFHR 300
IGDMP+D QTRLLRVL G++ VGG P++ DVRI+AAT+++L+Q + +G FREDL++R
Sbjct: 240 IGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYR 299

Query: 301 LNVIRVHLPPLRERREDIPRLARHFLQVAARELGVEAKLLHPETEAALTRLAWPGNVRQL 360
LNV+ + LPPLR+R EDIP L RHF+Q A +E G++ K E + WPGNVR+L
Sbjct: 300 LNVVPLRLPPLRDRAEDIPDLVRHFVQQAEKE-GLDVKRFDQEALELMKAHPWPGNVREL 358

Query: 361 ENTCRWLTVMAAGQEVLIQDLPGELFESTVAESTSQMQPDSWATLLAQWADRALRS---- 416
EN R LT + + + + EL + S + ++Q + +R
Sbjct: 359 ENLVRRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFAS 418

Query: 417 -----GHQNLLSEAQPELERTLLTTALRHTQGHKQEAARLLGWGRNTLTRKLKELGME 469
L E+E L+ AL T+G++ +AA LLG RNTL +K++ELG+
Sbjct: 419 FGDALPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGVS 476


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z5405     PF06580  28     0.042    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 28.3 bits (63), Expect = 0.042
Identities = 34/190 (17%), Positives = 72/190 (37%), Gaps = 41/190 (21%)

Query: 171 IIEQADRLRNLVDRL---LGPQLPGTRVTE-SIHKVAERV---VTLVSMELPDNVRLIRD 223
I+E + R ++ L + L + + S+ V + L S++ D ++
Sbjct: 186 ILEDPTKAREMLTSLSELMRYSLRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQ 245

Query: 224 YDPSLPELAHDPDQIEQVLLN-IVRNALQ---ALGPEGGEIILRTRTAFQLTLHGERYRL 279
+P++ ++ Q+ +L+ +V N ++ A P+GG+I+L+
Sbjct: 246 INPAIMDV-----QVPPMLVQTLVENGIKHGIAQLPQGGKILLKGT------KDNGTVT- 293

Query: 280 AARIDVEDNGPGIPPHLQDTLFYPMVSGREGGTGLGLSIARNLIDQHSGK---IEFTSWP 336
++VE+ G + ++ TG GL R + G I+ +
Sbjct: 294 ---LEVENTGSLALKNTKE------------STGTGLQNVRERLQMLYGTEAQIKLSEKQ 338

Query: 337 GHTEFSVYLP 346
G V +P
Sbjct: 339 GKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
Z5407     TCRTETOQM  180    4e-51    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 180 bits (458), Expect = 4e-51
Identities = 97/445 (21%), Positives = 170/445 (38%), Gaps = 81/445 (18%)

Query: 4 KLRNIAIIAHVDHGKTTLVDKLLQQSGTFDSRAETQE--RVMDSNDLEKERGITILAKNT 61
K+ NI ++AHVD GKTTL + LL SG + D+ LE++RGITI T
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 62 AIKWNDYRINIVDTPGHADFGGEVERVMSMVDSVLLVVDAFDGPMPQTRFVTKKAFAYGL 121
+ +W + ++NI+DTPGH DF EV R +S++D +L++ A DG QTR + G+
Sbjct: 62 SFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGI 121

Query: 122 KPIVVINKVDRPGARPDWVVDQVFD-------------LFVNLDATDEQLD--------- 159
I INK+D+ G V + + L+ N+ T+
Sbjct: 122 PTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEG 181

Query: 160 --------------------------------FPIVYASALNGIAGLDHEDMAEDMTPLY 187
FP+ + SA N I G+D+ L
Sbjct: 182 NDDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNI-GIDN---------LI 231

Query: 188 QAIVDHVPAPDVDLDGPFQMQISQLDYNSYVGVIGIGRIKRGKVKPNQQVTIIDSEGKTR 247
+ I + + ++ +++Y+ + R+ G + V I + E
Sbjct: 232 EVITNKFYSSTHRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEKEKI-- 289

Query: 248 NAKVGKVLGHLGLERIETDLAEAGDIVAITGLGELNISDTVCDTQNVEALPALSVDEPTV 307
K+ ++ + E + D A +G+IV + L ++ + DT+ + + P +
Sbjct: 290 --KITEMYTSINGELCKIDKAYSGEIVILQNEF-LKLNSVLGDTKLLPQRERIENPLPLL 346

Query: 308 SMFFCVNTSPFCGKEGKFVTSRQILDRLNKELVHNVALRVEETEDADAFRVSGRGELHLS 367
+ + D L LR +S G++ +
Sbjct: 347 QTTVEPSKPQQREMLLDALLEISDSDPL---------LRYYVDSATHEIILSFLGKVQME 397

Query: 368 VLIENMRRE-GFELAVSRPKVIFRE 391
V ++ + E+ + P VI+ E
Sbjct: 398 VTCALLQEKYHVEIEIKEPTVIYME 422



Score = 32.5 bits (74), Expect = 0.005
Identities = 13/75 (17%), Positives = 29/75 (38%), Gaps = 1/75 (1%)

Query: 398 EPYENVTLDVEEQHQGSVMQALGERKGDLKNMNPDGKGRVRLDYVIPSRGLIGFRSEFMT 457
EPY + + +++ + ++ + V L IP+R + +RS+
Sbjct: 537 EPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQLKN-NEVILSGEIPARCIQEYRSDLTF 595

Query: 458 MTSGTGLLYSTFSHY 472
T+G + + Y
Sbjct: 596 FTNGRSVCLTELKGY 610


69  Z5424  Z5429  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias     Product
Z5424     -2      20       -4.377374   phosphatase
Z5425     -2      23       -5.149692   ribonuclease BN
Z5426     -1      28       -6.214848   D-tyrosyl-tRNA(Tyr) deacylase
Z5427     0       31       -7.461969   acetyltransferase
Z5428     1       29       -7.623954   hypothetical protein
Z5429     -2      18       -3.521976   hypothetical protein
70  Z5472  Z5497  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z5472     -1      15       -3.403215   facilitated diffusion of glycerol
Z5473     -2      14       -2.623637   hypothetical protein
Z5474     -2      16       -2.223853   hypothetical protein
Z5475     -2      15       0.361266    hypothetical protein
Z5476     -1      16       1.923396    ribonuclease activity regulator protein RraA
Z5477     0       13       3.140636    1,4-dihydroxy-2-naphthoate
Z5478     1       15       3.062617    ATP-dependent protease ATP-binding subunit HslU
Z5479     2       13       4.549197    ATP-dependent protease peptidase subunit
Z5480     1       15       5.980597    cell division protein FtsN
Z5481     -1      20       5.440567    DNA-binding transcriptional regulator CytR
Z5482     0       21       5.196930    primosome assembly protein PriA
Z5484     2       27       4.159025    50S ribosomal protein L31
Z5485     2       27       4.301508    hypothetical protein
Z5487     1       25       2.878020    hypothetical protein
Z5488     0       22       1.195035    hypothetical protein
Z5489     -1      14       -1.497357   hypothetical protein
Z5490     0       17       3.717244    hypothetical protein
Z5491     0       19       4.022472    hypothetical protein
Z5492     -1      20       4.185509    peptidoglycan peptidase
Z5493     -2      20       4.276971    transcriptional repressor protein MetJ
Z5494     -2      19       4.200077    cystathionine gamma-synthase
Z5495     -2      20       4.111264    bifunctional aspartate kinase II/homoserine
Z5496     -3      18       2.786570    5,10-methylenetetrahydrofolate reductase
Z5497     -2      20       3.220612    catalase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5478     HTHFIS        30     0.017    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 30.2 bits (68), Expect = 0.017
Identities = 11/36 (30%), Positives = 18/36 (50%), Gaps = 3/36 (8%)

Query: 49 TPKNILMIGPTGVGKTEIAR---RLAKLANAPFIKV 81
T +++ G +G GK +AR K N PF+ +
Sbjct: 159 TDLTLMITGESGTGKELVARALHDYGKRRNGPFVAI 194


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5480     IGASERPTASE   41     4e-06    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 41.2 bits (96), Expect = 4e-06
Identities = 32/155 (20%), Positives = 64/155 (41%), Gaps = 5/155 (3%)

Query: 114 LTPEQRQLLEQMQADMRQQPTQLVEVPWNEQTPEQRQQTLQRQRQAQQLAEQQRLAQQSR 173
+ +QAD+ P+ E+ ++ P + +AE + Q+S+
Sbjct: 992 VDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSK--QESK 1049

Query: 174 TTEQSWQQQT-RTSQAAPVQAQPRQSKPASTQQPYQDLLQTPAHTTAQSKPQQAAPVTRA 232
T E++ Q T T+Q V + + + A+TQ + T ++ ++ A V +
Sbjct: 1050 TVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKE 1109

Query: 233 ADAPKPTAEKKDERRWMVQCGSFRGAEQAETVRAQ 267
A T + ++ + Q + EQ+ETV+ Q
Sbjct: 1110 EKAKVETEKTQEVPKVTSQVSPKQ--EQSETVQPQ 1142


71  Z5660  Z5705  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z5660     0       15       -3.124288   hypothetical protein
Z5661     -1      15       -0.620946   DNA-binding transcriptional regulator SoxS
Z5662     -1      15       -0.701581   redox-sensitive transcriptional activator SoxR
Z5663     -1      17       0.589779    hypothetical protein
Z5664     -1      16       -0.004101   hypothetical protein
Z5665     -1      15       -0.798588   hypothetical protein
Z5666     -2      23       3.377000    acetate permease
Z5667     -2      21       3.719961    hypothetical protein
Z5668     -3      18       4.297048    acetyl-CoA synthetase
Z5669     -1      15       3.800073    cytochrome c552
Z5670     -1      18       4.223420    cytochrome c nitrite reductase pentaheme
Z5671     -1      19       3.795496    formate-dependent nitrite reductase; Fe-S
Z5672     -1      18       2.814373    formate-dependent nitrate reductase complex;
Z5673     -3      15       2.345399    heme lyase subunit NrfE
Z5674     -3      16       2.078401    formate-dependent nitrite reductase complex
Z5675     -2      16       2.360892    formate-dependent nitrite reductase complex
Z5676     -2      16       2.644893    glutamate/aspartate:proton symporter
Z5677     -1      16       2.385673    hypothetical protein
Z5678     -2      17       2.774106    formate dehydrogenase
Z5680     -2      16       3.232618    outer membrane efflux protein MdtP
Z5681     -1      16       3.019655    multidrug efflux system protein MdtO
Z5682     -1      16       2.794250    multidrug resistance protein MdtN
Z5683     -1      18       2.920798    hypothetical protein
Z5684     0       21       4.109579    transcriptional regulator
Z5686     0       23       4.567781    kinase
Z5687     1       23       4.780947    hypothetical protein
Z5688     1       24       4.926562    hypothetical protein
Z5689     1       27       5.770979    ABC transporter substrate-binding protein
Z5690     1       26       5.987769    permease of ribose ABC transport system
Z5691     2       27       6.379237    ribose ABC transporter ATP-binding protein
Z5692     3       30       7.407007    histidine kinase
Z5694     2       36       8.634395    hypothetical protein
Z5695     1       39       9.203401    carbon-phosphorus lyase complex accessory
Z5696     1       41       8.859869    aminoalkylphosphonic acid N-acetyltransferase
Z5697     2       41       9.303277    ribose 1,5-bisphosphokinase
Z5698     1       42       9.689318    phosphonate metabolism protein
Z5699     0       40       9.746662    phosphonate ABC transporter ATP-binding protein
Z5700     0       40       9.937014    phosphonate C-P lyase system protein PhnK
Z5701     0       43       10.131582   phosphonate metabolism protein
Z5702     2       42       9.139804    phosphonate metabolism protein
Z5703     1       43       8.602620    carbon-phosphorus lyase complex subunit
Z5704     1       40       7.188089    phosphonate metabolism protein
Z5705     2       39       5.731704    phosphonate metabolism transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5667     RTXTOXIND     27     0.020    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 26.7 bits (59), Expect = 0.020
Identities = 5/33 (15%), Positives = 13/33 (39%), Gaps = 1/33 (3%)

Query: 17 ELVEKR-QRFATILSIIMLAVYIGFILLIAFAP 48
EL+E R +++ ++ + +L
Sbjct: 47 ELIETPVSRRPRLVAYFIMGFLVIAFILSVLGQ 79


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5671     VACJLIPOPROT  30     0.006    VacJ lipoprotein signature.
		>VACJLIPOPROT#VacJ lipoprotein signature.

Length = 251

Score = 29.9 bits (67), Expect = 0.006
Identities = 6/21 (28%), Positives = 11/21 (52%)

Query: 179 FGNLDDPNSEISQLLRQKPTY 199
GNL++P ++ L+ P
Sbjct: 75 TGNLEEPAVMVNYFLQGDPYQ 95


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5682     RTXTOXIND     80     9e-19    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 80.3 bits (198), Expect = 9e-19
Identities = 53/363 (14%), Positives = 118/363 (32%), Gaps = 78/363 (21%)

Query: 8 APRSKFPALLVVALALVALVFVIW-RVDS-APSTNDAYVSADTIDVVPEVSGRIVELAVT 65
+ R + A ++ ++A + + +V+ A + S + ++ P + + E+ V
Sbjct: 54 SRRPRLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVK 113

Query: 66 DNQAVKQGDLLFRIDPRPYEANLAKSEAS-----LAALDKQIM----------------- 103
+ ++V++GD+L ++ EA+ K+++S L QI+
Sbjct: 114 EGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDE 173

Query: 104 -----------LTQRSVDAQQFGA-------------------DSVNATVEKARAAAKQA 133
L S+ +QF +V A + + ++
Sbjct: 174 PYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVE 233

Query: 134 TDTLRRTEPLLKEGFVSAEDVDRARTAQRAAEADLNAVLLQAQSAASAVSGVDALVAQR- 192
L LL + ++ V A +L Q + S +
Sbjct: 234 KSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVT 293

Query: 193 ------------------AAVEADIALTKLHLEMATVRAPFDGRVISLKT-SVGQFASAM 233
+ ++A + + + +RAP +V LK + G +
Sbjct: 294 QLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTA 353

Query: 234 RPIFTLIDTRHWYVI-ANFRETDLKNIRSGTPATIRLMSDSGKTF---EGKVDSIGYGVL 289
+ ++ + A + D+ I G A I++ + + GKV +I +
Sbjct: 354 ETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDAI 413

Query: 290 PDD 292
D
Sbjct: 414 EDQ 416


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5684     HTHFIS        82     5e-20    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 81.8 bits (202), Expect = 5e-20
Identities = 33/112 (29%), Positives = 52/112 (46%), Gaps = 1/112 (0%)

Query: 16 DDTAICALLQDVLSEHVFTVSVCHTGQEAILRIEGDPDIALVVLDMMLPDTNGLRVLQQI 75
DD AI +L LS + V + I LVV D+++PD N +L +I
Sbjct: 11 DDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGD-GDLVVTDVVMPDENAFDLLPRI 69

Query: 76 QKLRPTLPVVMLTGMGSESDVVVGLEMGADDYICKPFTPRVVVARLKAVLRR 127
+K RP LPV++++ + + E GA DY+ KPF ++ + L
Sbjct: 70 KKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5689     SUBTILISIN    29     0.027    Subtilisin serine protease family (S8) signature.
		>SUBTILISIN#Subtilisin serine protease family (S8) signature.

Length = 326

Score = 28.7 bits (64), Expect = 0.027
Identities = 15/65 (23%), Positives = 24/65 (36%), Gaps = 5/65 (7%)

Query: 55 KLAGDNVKVTLVSSGYDLGQQVSQIDNFIAANVDMIIL---NAADSKGIGPAVKRAKDAG 111
L +KV + I I VD+I + D + AVK+A +
Sbjct: 111 DLLI--IKVLNKQGSGQYDWIIQGIYYAIEQKVDIISMSLGGPEDVPELHEAVKKAVASQ 168

Query: 112 IVVVA 116
I+V+
Sbjct: 169 ILVMC 173


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5690     PF00577       28     0.047    Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 28.3 bits (63), Expect = 0.047
Identities = 16/73 (21%), Positives = 27/73 (36%), Gaps = 1/73 (1%)

Query: 219 FVYGMSGLLSGLGGIMSASRLYSANGNLGMG-YELDAIAAVILGGTSFVGGIGTITGTLV 277
++G+ + GG A R + N +G L A++ + S + G V
Sbjct: 400 LLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALSVDMTQANSTLPDDSQHDGQSV 459

Query: 278 GALIIATLNNGMT 290
L +LN T
Sbjct: 460 RFLYNKSLNESGT 472


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5692     HTHFIS        58     6e-11    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 58.3 bits (141), Expect = 6e-11
Identities = 21/81 (25%), Positives = 44/81 (54%), Gaps = 2/81 (2%)

Query: 643 VLVLEDEAAVRQTICEQLHLLGYLTLEASSGEQALDLLAASAEIDIFISDLMLPGGMSGA 702
+LV +D+AA+R + + L GY S+ +AA + D+ ++D+++P +
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAA-GDGDLVVTDVVMP-DENAF 63

Query: 703 EVVNAARKLYPHLTLLLISGQ 723
+++ +K P L +L++S Q
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQ 84


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5694     RTXTOXIND     26     0.034    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 25.9 bits (57), Expect = 0.034
Identities = 17/107 (15%), Positives = 41/107 (38%), Gaps = 8/107 (7%)

Query: 11 TLLTLTTVPAQADIIDDTIGNIQ--------QAINDASNPDRGRDYEDSRDDGWQREVSD 62
LL LT + A+AD + +Q Q ++ + ++ + + + +Q +
Sbjct: 123 VLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEE 182

Query: 63 DRRRQYDDRRRQFEDRRRQLDDRQHQLNQERRQLEDEERRMEDEYGQ 109
+ R + QF + Q ++ L+++R + R+
Sbjct: 183 EVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENL 229


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5696     SACTRNSFRASE  32     2e-04    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 32.2 bits (73), Expect = 2e-04
Identities = 20/84 (23%), Positives = 32/84 (38%), Gaps = 5/84 (5%)

Query: 15 HLALLDGEVVGMIGLHLQFHLHHVNWIGEIQELVVMPQARGLNVGSKLLAWAEEEARQAG 74
L L+ +G I + + N I+++ V R VG+ LL A E A++
Sbjct: 68 FLYYLENNCIGRIKIRSNW-----NGYALIEDIAVAKDYRKKGVGTALLHKAIEWAKENH 122

Query: 75 AEMTELSTNVKRHDAHRFYLREGY 98
L T A FY + +
Sbjct: 123 FCGLMLETQDINISACHFYAKHHF 146


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5699     PF05272       29     0.015    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.015
Identities = 17/70 (24%), Positives = 25/70 (35%), Gaps = 8/70 (11%)

Query: 36 CVVLHGHSGSGKSTLLRSLYANYLPDEGQIQIKHGDEWVDLVTAPARKVVEI------RK 89
VVL G G GKSTL+ +L + I G + + + E+ R+
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGTGKDSYEQIAGIV--AYELSEMTAFRR 655

Query: 90 TTVGWVSQFL 99
V F
Sbjct: 656 ADAEAVKAFF 665


72  Z5726  Z5732  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z5726     0       16       -4.518096   DNA-binding transcriptional activator DcuR
Z5727     1       13       -4.636751   sensory histidine kinase DcuS
Z5728     -2      15       -4.268807   hypothetical protein
Z5729     -2      19       -4.104719   hypothetical protein
Z5730     -1      23       -3.459839   hypothetical protein
Z5731     0       19       -4.103818   hypothetical protein
Z5732     -1      19       -3.416408   lysyl-tRNA synthetase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5726     HTHFIS        70     4e-16    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 70.2 bits (172), Expect = 4e-16
Identities = 31/109 (28%), Positives = 50/109 (45%), Gaps = 4/109 (3%)

Query: 4 VLIIDDDAMVAELNRRYVAQIPGFQCCGTASTLEKAKEIIFNSDTPIDLILLDIYMQKEN 63
+L+ DDDA + + + +++ G+ S I + DL++ D+ M EN
Sbjct: 6 ILVADDDAAIRTVLNQALSRA-GYDVR-ITSNAATLWRWI--AAGDGDLVVTDVVMPDEN 61

Query: 64 GLDLLPVLHNARCKSDVIVISSAADAATIKDSLHYGVVDYLIKPFQASR 112
DLLP + AR V+V+S+ T + G DYL KPF +
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTE 110


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5727     PF06580       41     7e-06    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 41.0 bits (96), Expect = 7e-06
Identities = 21/99 (21%), Positives = 38/99 (38%), Gaps = 18/99 (18%)

Query: 442 LIENALE-ALGP-EPGGEISVTLHYRHGWLHCEVNDDGPGIAPDKIDHIFDKGVSTKGSE 499
L+EN ++ + GG+I + +G + EV + G +
Sbjct: 263 LVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKN------------TKES 310

Query: 500 RGVGLALVKQQVENLGG---SIAVESEPGIFTQFFVQIP 535
G GL V+++++ L G I + + G V IP
Sbjct: 311 TGTGLQNVRERLQMLYGTEAQIKLSEKQGKVN-AMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5729     SACTRNSFRASE  26     0.012    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 26.4 bits (58), Expect = 0.012
Identities = 9/28 (32%), Positives = 16/28 (57%)

Query: 32 LAIIEHTDVDESLKGQGIGKQLVAKVVE 59
A+IE V + + +G+G L+ K +E
Sbjct: 89 YALIEDIAVAKDYRKKGVGTALLHKAIE 116


73  Z5765  Z5787  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z5765     -1      15       3.665813    hypothetical protein
Z5766     -2      14       3.067176    phosphatidylserine decarboxylase
Z5767     -2      15       3.258680    ribosome-associated GTPase
Z5768     -2      15       3.835395    oligoribonuclease
Z5773     -1      14       3.710547    ***hypothetical protein
Z5774     -2      13       3.707104    hypothetical protein
Z5775     -1      13       2.873301    ATPase
Z5776     0       13       3.318730    N-acetylmuramoyl-L-alanine amidase
Z5777     1       15       3.056295    DNA mismatch repair protein
Z5778     2       19       2.139733    tRNA delta(2)-isopentenylpyrophosphate
Z5779     4       26       2.202195    RNA-binding protein Hfq
Z5780     4       23       2.095622    GTPase HflX
Z5781     4       24       2.575041    FtsH protease regulator HflK
Z5782     4       24       2.452168    FtsH protease regulator HflC
Z5783     2       20       1.421355    hypothetical protein
Z5784     3       19       1.302643    adenylosuccinate synthetase
Z5785     4       14       0.339987    transcriptional repressor NsrR
Z5786     3       14       -0.053787   exoribonuclease R
Z5787     2       17       -2.870906   23S rRNA (guanosine-2'-O-)-methyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5765     GPOSANCHOR    51     2e-08    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 50.8 bits (121), Expect = 2e-08
Identities = 50/312 (16%), Positives = 105/312 (33%), Gaps = 18/312 (5%)

Query: 121 SRQAQQEQERAREIADSLNQLPQQQTDARRQLNEIERRLGTLTGNTPLNQAQNFALQSDS 180
+ ++ QERA + N L + +D ++ LT L+ +
Sbjct: 49 TDTLEKVQERADKFEIENNTLKLKNSDLSFNNKALKDHNDELTE---ELSNAKEKLRKND 105

Query: 181 ARLKALVDEL-ELAQLSANNRQELARLRSELAEKES--QQLDAYLQALRNQLNSQRQLEA 237
L ++ EL A+ + L + + + L+A AL + + +
Sbjct: 106 KSLSEKASKIQELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALAAR-KADLEKAL 164

Query: 238 ERALESTELLAENSADLPKDIVAQFKINRELSAALNQQAQRMDLVASQQRQAASQTLQVR 297
E A+ + + L + A EL AL +++ + ++ +
Sbjct: 165 EGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALA 224

Query: 298 QALNTLREQSQWLGSSNLLGEALRAQVARLPEMPKPQQLDTEMAQLRVQRLRYEDLLNKQ 357
L + L + A A++ L + L+ A+L +
Sbjct: 225 ARKADLEKA---LEGAMNFSTADSAKIKTLEA--EKAALEARQAELEKALEGAMNFSTAD 279

Query: 358 PLLRQIHQADGQPLTAE------QNRILEAQLRTQRELLNSLLQGGDTLLLELTKLKVSN 411
+ +A+ L AE Q+++L A ++ R L++ + L E KL+ N
Sbjct: 280 SAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEAEHQKLEEQN 339

Query: 412 GQLEDALKEVNE 423
E + + +
Sbjct: 340 KISEASRQSLRR 351



Score = 42.7 bits (100), Expect = 6e-06
Identities = 48/239 (20%), Positives = 92/239 (38%), Gaps = 23/239 (9%)

Query: 20 ATAPDSKQITQELEQAKAAKPAQPEVVEALQSALNALEERKGSLER-IKQYQEVIDNYPK 78
A A + + LE A A ++ L++ ALE R+ LE+ ++
Sbjct: 222 ALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSA 281

Query: 79 LSATLRAQLNNMRDEPRSVSPGMSTDALNQEILQVSSQLLDKSRQAQQEQERAREIADSL 138
TL A+ + E + + Q + LD SR+A+++ E + +
Sbjct: 282 KIKTLEAEKAALEAEKADL---EHQSQVLNANRQSLRRDLDASREAKKQLEAEHQKLEEQ 338

Query: 139 NQLPQQQTDARRQLNEIERRLGTLTGNTPLNQAQNFALQSDSARLKAL--VDELELAQLS 196
N++ ++A RQ + R L L+++ +L+ + E L
Sbjct: 339 NKI----SEASRQ--SLRRDLDASR-------EAKKQLEAEHQKLEEQNKISEASRQSLR 385

Query: 197 AN---NRQELARLRSELAEKESQQLDAYLQALRNQLNSQRQLEAERALESTELLAENSA 252
+ +R+ ++ L E S +L A + + S++ E E+A +L AE A
Sbjct: 386 RDLDASREAKKQVEKALEEANS-KLAALEKLNKELEESKKLTEKEKAELQAKLEAEAKA 443


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5780     SECA          32     0.005    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 32.2 bits (73), Expect = 0.005
Identities = 26/144 (18%), Positives = 54/144 (37%), Gaps = 6/144 (4%)

Query: 282 HVIDAADVRVQENIEAVNTVLEEIDAHEIPTLLVMNKIDMLEDFEPRIDRDEENK-PIRV 340
++D +DV N + IDA+ P L ++ + + R+ D + PI
Sbjct: 665 ELLDVSDVSETINSIREDVFKATIDAYIPPQSL--EEMWDIPGLQERLKNDFDLDLPIAE 722

Query: 341 WLSAQTGAGIPQLFQALTERLSGEVAQHTLRLPPQEGRLRSRFYQLQAIEKEWMEEDGSV 400
WL + L + + + + + + R + LQ ++ W E ++
Sbjct: 723 WLDKEPELHEETLRERILAQSIEVYQRKEEVVGAEMMRHFEKGVMLQTLDSLWKEHLAAM 782

Query: 401 SLQVRMPIVDWRRLCKQEPALIDY 424
+R I R +++P +Y
Sbjct: 783 D-YLRQGIH-LRGYAQKDP-KQEY 803


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5781     cloacin       32     0.006    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 31.6 bits (71), Expect = 0.006
Identities = 25/81 (30%), Positives = 30/81 (37%), Gaps = 10/81 (12%)

Query: 17 GSSKPGGNSEGNGNKGGRDQGPPDLDDIFRKLSKKLGGLGGGKGTGSGGGSSSQGP---- 72
S G +SE N GG G G GGG GTG G S+ P
Sbjct: 33 ASDGSGWSSENNPWGGGSGSGIHWGGGSGHGNGGGNGNSGGGSGTG-GNLSAVAAPVAFG 91

Query: 73 -----RPQLGGRVVTIAAAAI 88
P GG V+I+A A+
Sbjct: 92 FPALSTPGAGGLAVSISAGAL 112


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5786     RTXTOXIND     31     0.028    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 30.6 bits (69), Expect = 0.028
Identities = 12/55 (21%), Positives = 24/55 (43%), Gaps = 1/55 (1%)

Query: 165 VVPDDSRLSFDILIPPDQIMGARMGFVVVVELTQRPTRRTKAV-GKIVEVLGDNM 218
+VP+D L L+ I +G ++++ P R + GK+ + D +
Sbjct: 359 IVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDAI 413


74  Z5802  Z5811  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z5802     -1      31       3.339874    PTS system ascorbate-specific transporter
Z5803     -1      28       3.410050    PTS system L-ascorbate-specific transporter
Z5804     -1      27       3.517658    PTS system L-ascorbate-specific transporter
Z5805     -1      24       3.408303    3-keto-L-gulonate-6-phosphate decarboxylase
Z5806     -1      23       2.764522    L-xylulose 5-phosphate 3-epimerase
Z5807     1       28       0.671010    L-ribulose-5-phosphate 4-epimerase
Z5808     1       30       -3.611178   hypothetical protein
Z5809     0       31       -3.504117   30S ribosomal protein S6
Z5810     0       31       -4.596184   primosomal replication protein N
Z5811     -1      27       -3.386449   30S ribosomal protein S18
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5805     ECOLNEIPORIN  28     0.034    E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 27.8 bits (62), Expect = 0.034
Identities = 6/19 (31%), Positives = 7/19 (36%), Gaps = 2/19 (10%)

Query: 105 FNGDVQI--ELTGYWTWEQ 121
F G + L W EQ
Sbjct: 62 FKGQEDLGNGLKAIWQVEQ 80


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5811     FRAGILYSIN    25     0.026    Fragilysin metallopeptidase (M10C) enterotoxin signat...
		>FRAGILYSIN#Fragilysin metallopeptidase (M10C) enterotoxin signature.

Length = 405

Score = 25.4 bits (55), Expect = 0.026
Identities = 8/23 (34%), Positives = 16/23 (69%)

Query: 18 VQEIDYEDIATLKNYITESGKIV 40
+Q + Y D+AT N +++ GK++
Sbjct: 48 LQSVSYTDLATQLNDVSDFGKMI 70


75  Z5858  Z5913  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z5858     2       32       -2.259884   hypothetical protein
Z5859     0       26       -1.796208   hypothetical protein
Z5860     2       37       -10.048771  hypothetical protein
Z5861     1       28       -8.191346   oxidoreductase
Z5863     1       26       -9.397235   hypothetical protein
Z5862     0       26       -9.167812   hypothetical protein
Z5864     0       28       -10.730941  hypothetical protein
Z5865     -1      17       -5.840312   hypothetical protein
Z5866     -2      20       -1.160819   ornithine carbamoyltransferase subunit I
Z5867     -1      19       -0.304234   hypothetical protein
Z5868     -2      18       0.312628    hypothetical protein
Z5869     -2      19       0.734189    hypothetical protein
Z5870     -1      22       3.148609    valyl-tRNA synthetase
Z5871     -3      16       2.915285    DNA polymerase III subunit chi
Z5872     -2      13       0.986208    leucyl aminopeptidase
Z5873     -2      15       0.487278    hypothetical protein
Z5874     -2      15       0.221013    hypothetical protein
Z5875     -1      18       -1.003151   hypothetical protein
Z5876     -1      21       -2.015492   oxidoreductase
Z5878     0       25       -3.404242   *integrase
Z5879     -1      22       -1.197057   hypothetical protein
Z5880     0       29       -5.410876   transposase
Z5881     -1      30       -7.132278   hypothetical protein
Z5882     -1      28       -5.906460   hypothetical protein
Z5884     -1      29       -6.386606   hypothetical protein
Z5885     0       30       -6.159193   resolvase
Z5886     0       31       -7.424491   hypothetical protein
Z5887     -1      22       -4.146772   hypothetical protein
Z5888     -1      25       -3.965168   hypothetical protein
Z5889     0       28       -7.276105   hypothetical protein
Z5890     1       30       -8.290230   integrase
Z5891     1       36       -10.979395  hypothetical protein
Z5892     2       40       -11.362828  hypothetical protein
Z5893     1       33       -7.872747   hypothetical protein
Z5894     0       28       -6.665243   hypothetical protein
Z5895     -1      19       -4.544169   hypothetical protein
Z5896     -2      13       -2.801466   hypothetical protein
Z5897     -2      12       -2.033622   hypothetical protein
Z5898     -3      9        -0.489116   hypothetical protein
Z5899     -3      8        0.110378    ATP-dependent helicase
Z5900     -3      7        -0.202675   hypothetical protein
Z5901     -4      11       -1.011335   helicase
Z5902     -1      23       -4.402556   helicase
Z5904     1       29       -5.778493   hypothetical protein
Z5905     -1      29       -5.682756   hypothetical protein
Z5906     0       28       -5.577578   N-acetylneuraminic acid mutarotase
Z5907     1       28       -5.303540   hypothetical protein
Z5910     1       26       -3.893190   tyrosine recombinase
Z5911     0       25       -3.255444   tyrosine recombinase
Z5912     0       23       -2.703791   major type 1 subunit fimbrin (pilin)
Z5913     2       23       -2.800097   fimbrial protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5861     DHBDHDRGNASE  87     1e-22    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 86.6 bits (214), Expect = 1e-22
Identities = 67/250 (26%), Positives = 114/250 (45%), Gaps = 24/250 (9%)

Query: 6 GKTVLILGGSRGIGAAIVRRFVTDGANVRFTYAGSKDAAERLAQETGATAVFT-----DS 60
GK I G ++GIG A+ R + GA++ + + E++ A A D
Sbjct: 8 GKIAFITGAAQGIGEAVARTLASQGAHI-AAVDYNPEKLEKVVSSLKAEARHAEAFPADV 66

Query: 61 ADRDAVIDVV----RKSGALDILVVNAGIGVFGDALELNADDIDRLFKINIHAPYHASVE 116
D A+ ++ R+ G +DILV AG+ G L+ ++ + F +N ++AS
Sbjct: 67 RDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRS 126

Query: 117 AARQMP--EGGRILIIGSVNGDRMPVAGMAAYAASKSALQGMARGLARDFGPRGITINVV 174
++ M G I+ +GS N +P MAAYA+SK+A + L + I N+V
Sbjct: 127 VSKYMMDRRSGSIVTVGS-NPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIV 185

Query: 175 QPGPIDTDA--------NPANGPMRDMLHSF---MAIKRHGQPEEVAGMVAWLAGPEASF 223
PG +TD N A ++ L +F + +K+ +P ++A V +L +A
Sbjct: 186 SPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGH 245

Query: 224 VTGAMHTIDG 233
+T +DG
Sbjct: 246 ITMHNLCVDG 255


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5865     TYPE4SSCAGX   33     0.003    Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein signature.

Length = 522

Score = 33.2 bits (75), Expect = 0.003
Identities = 30/125 (24%), Positives = 56/125 (44%), Gaps = 2/125 (1%)

Query: 138 IIFPQPDGSTNRYERKSFERKDESSLHLITNKVLACYQR--EANKEIARLLNNHQKLNNL 195
+I PD ++K+ E++ E+ + +R E K A L N ++N
Sbjct: 131 LIVDAPDPKELEEQKKALEKEKEAKEQAQKAQKDKREKRKEERAKNRANLENLTNAMSNP 190

Query: 196 QKLNNLQKLNNLQKLNNLQKLNNIQKLNNIQELNNSQELNNSQELNNSQELNNSQDLKNS 255
Q L+N + L+ L K +L+ +++L ++QE + L +ELN Q +
Sbjct: 191 QNLSNNKNLSELIKQQRENELDQMERLEDMQEQAQANALKQIEELNKKQAEEAVRQRAKD 250

Query: 256 QVSCK 260
++S K
Sbjct: 251 KISIK 255


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5868     SACTRNSFRASE  33     1e-04    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 32.6 bits (74), Expect = 1e-04
Identities = 16/48 (33%), Positives = 19/48 (39%)

Query: 27 PAIRGKGLAKKLALKAMEEAREMGFKRCYLETTAFLKEAIGLYEHLGF 74
R KG+ L KA+E A+E F LET A Y F
Sbjct: 99 KDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHF 146


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5893     THERMOLYSIN   28     0.007    Thermolysin metalloprotease (M4) family signature.
		>THERMOLYSIN#Thermolysin metalloprotease (M4) family signature.

Length = 544

Score = 28.1 bits (62), Expect = 0.007
Identities = 15/39 (38%), Positives = 20/39 (51%), Gaps = 4/39 (10%)

Query: 41 AGKDSAAIDKINAHYFDKKAEDYFFNKF-LLSFDPSTQQ 78
A D+AA+D AHY+ DY+ N LS+D S
Sbjct: 292 ASYDAAAVD---AHYYAGVVYDYYKNVHGRLSYDGSNAA 327


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5895     OMPADOMAIN    58     4e-12    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 58.0 bits (140), Expect = 4e-12
Identities = 36/136 (26%), Positives = 53/136 (38%), Gaps = 28/136 (20%)

Query: 95 SPDVLFGLGSTELKPKFKLILDDFFPRYLKVLDNYQEHITEVRIEGHTSTDWTGTTNPDI 154
DVLF LKP+ + LD + L N V + G+ TD G+
Sbjct: 218 KSDVLFNFNKATLKPEGQAALD----QLYSQLSNLDPKDGSVVVLGY--TDRIGSD---- 267

Query: 155 AYFNNMALSQGRTRAVLQYVYDIKNIATHQQWVKSKFAAVGYSSAHPILDKTGKEDPNRS 214
AY N LS+ R ++V+ Y+ K I K +A G ++P+ T R+
Sbjct: 268 AY--NQGLSERRAQSVVDYLI-SKGIP------ADKISARGMGESNPVTGNTCDNVKQRA 318

Query: 215 ---------RRVTFKV 221
RRV +V
Sbjct: 319 ALIDCLAPDRRVEIEV 334


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5896     RTXTOXIND     32     0.008    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 31.7 bits (72), Expect = 0.008
Identities = 18/134 (13%), Positives = 50/134 (37%), Gaps = 13/134 (9%)

Query: 161 AQIKLLRTEISDSSQAQLANHTHFSNKLWEQLEQFADLMAKGATEQI-IDALRQVIIDFN 219
+ R E + A++ + + S +L+ F+ L+ K A + + ++
Sbjct: 207 LNLDKKRAER-LTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAV 265

Query: 220 QNLTEQFGENFKALDASVKKLVEWQGNYKTQIEQMSEQYQQSV-ESLVETKTAVAGIWEE 278
L + L+ +++ K + + +++ ++ + + L +T + + E
Sbjct: 266 NELRVYKSQ----LEQIESEILS----AKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLE 317

Query: 279 CK--EIPLAMSELR 290
E S +R
Sbjct: 318 LAKNEERQQASVIR 331


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5899     RTXTOXIND     31     0.030    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 30.6 bits (69), Expect = 0.030
Identities = 26/163 (15%), Positives = 59/163 (36%), Gaps = 16/163 (9%)

Query: 334 RLASGAEEEAYRRLVESQFRDDDDEQAQSN---KGRLFKITLEKALFSSPMACASVVANR 390
+ S E L++ QF +++ Q + + A + + V +R
Sbjct: 177 QNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSR 236

Query: 391 LKRLESRKDHN--SQSQINELESLLLALNNIDASQFSKYQLLLDTIRKDLAWKANNTEDR 448
L S ++ + E E+ + N S+ L+ I ++ + ++
Sbjct: 237 LDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQ----LEQIESEIL----SAKEE 288

Query: 449 LVIFTESIKTLEFLEQ--QLRADLKLKDDQIATLRGDQGDTVL 489
+ T+ K E L++ Q ++ L ++A Q +V+
Sbjct: 289 YQLVTQLFKN-EILDKLRQTTDNIGLLTLELAKNEERQQASVI 330


76  Z5929  Z5949  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z5929     -2      19       -4.847654   hypothetical protein
Z5930     -2      19       -4.553615   hypothetical protein
Z5932     -3      16       -3.920882   invasin
Z5933     -1      17       -4.242145   transporter
Z5934     0       16       -4.144108   hypothetical protein
Z5935     0       16       -3.645505   hypothetical protein
Z5936     1       18       2.010623    hypothetical protein
Z5937     1       19       1.673698    hypothetical protein
Z5938     0       19       -1.217144   hypothetical protein
Z5939     -2      19       -1.339785   transporter
Z5940     -2      20       -4.420575   hypothetical protein
Z5941     -2      16       -3.911150   regulator
Z5942     0       13       -4.677526   hypothetical protein
Z5943m    0       13       -5.180313   hypothetical protein
Z5945     0       17       -6.826511   endoribonuclease SymE
Z5946     0       15       -6.228658   restriction modification enzyme S subunit
Z5947     -1      14       -5.154051   restriction modification enzyme subunit M
Z5948     -2      15       -3.058573   restriction modification enzyme subunit R
Z5949     -1      15       -4.509183   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5932     INTIMIN       521    e-166    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 521 bits (1343), Expect = e-166
Identities = 157/528 (29%), Positives = 241/528 (45%), Gaps = 28/528 (5%)

Query: 96 ENQIASTQFDFLHPWYDTPDYLLFSQHTLHRTDDRTQINTGLGWRHFTSSWMSGINLFFD 155
N + DFL P+YD+ L F Q D R N G G R F M G N+F D
Sbjct: 220 GNNFDGSSLDFLLPFYDSEKMLAFGQVGARYIDSRFTANLGAGQRFFLPENMLGYNVFID 279

Query: 156 HDLSRYHSRAGLGAEYWRDYLKLSSNAYIGLTGWRSAPELDNDFEARPANGWDLRAEGWL 215
D S ++R G+G EYWRDY K S N Y ++GW + D++ RPANG+D+R G+L
Sbjct: 280 QDFSGDNTRLGIGGEYWRDYFKSSVNGYFRMSGWHESYN-KKDYDERPANGFDIRFNGYL 338

Query: 216 PAWPQLGGKLVYEQYYGDEVALFDKNDRQSNPHAITAGLNYTPFPLLTLSAEQRQGKQGE 275
P++P LG KL+YEQYYGD VALF+ + QSNP A T G+NYTP PL+T+ + R G E
Sbjct: 339 PSYPALGAKLMYEQYYGDNVALFNSDKLQSNPGAATVGVNYTPIPLVTMGIDYRHGTGNE 398

Query: 276 NDTRFAVDLTWQPSSSMQKQLNPDEVAGRRSLAGSRYDLIDRNNNIVLEYRKKELIRLSL 335
ND +++ +Q +Q+ P V R+L+GSRYDL+ RNNNI+LEY+K++++ L++
Sbjct: 399 NDLLYSMQFRYQFDKPWSQQIEPQYVNELRTLSGSRYDLVQRNNNIILEYKKQDILSLNI 458

Query: 336 LDPVKGKSGEIKPLVSSLQTKYALKGYNIEAAALEAAGGKVSTSG----KDITVTLPGYR 391
+ G + + +++KY L + +AL + GG++ SG +D LP Y
Sbjct: 459 PHDINGTERSTQKIQLIVKSKYGLDRIVWDDSALRSQGGQIQHSGSQSAQDYQAILPAY- 517

Query: 392 FTNTPETDNTWSIDVTAEDVKGNLSRHEQ-SMVVIQAPTLSQKDSLLSVNPLTVAADKKS 450
N + + A D GN S + ++ V+ + + + +A
Sbjct: 518 ---VQGGSNVYKVTARAYDRNGNSSNNVLLTITVLSNGQVVDQVGVTDFTADKTSAKADG 574

Query: 451 TTTLTVTAHDSD------GTPVPGLALQTRSEGVQDITLSDWTDNGDGSYTQILTAGTTS 504
T +T TA PV + G ++ + NG G T L +
Sbjct: 575 TEAITYTATVKKNGVAQANVPVSFNIVS----GTAVLSANSANTNGSGKATVTLKSDKPG 630

Query: 505 GSVTLTPQINGESAVKESIVVNIVPVVSSRDHSSITIDNVSYYAGDDIKVRVELKDDSN- 563
V SA+ + V I + + I D + A + +K
Sbjct: 631 QVVVSAKTAEMTSALNANAV--IFVDQTKASITEIKADKTTAVANGQDAITYTVKVMKGD 688

Query: 564 QPVAYQKEELVKAVTVENSKPGATIVWHEEQPGVYAANYPAYKQGTAL 611
+PV+ Q+ + K + + G + G +L
Sbjct: 689 KPVSNQEVTFTTTL----GKLSNSTE-KTDTNGYAKVTLTSTTPGKSL 731



Score = 72.8 bits (178), Expect = 8e-15
Identities = 83/423 (19%), Positives = 137/423 (32%), Gaps = 41/423 (9%)

Query: 868 VYGHPLPDEDVKFTLPASMTGNFTLSSETARTDANGDAVVTLRGTKAGEFTVTATLTRNN 927
+ +D + LPA + G + TAR G + +T T+ N
Sbjct: 500 QHSGSQSAQDYQAILPAYVQGGSNVYKVTARAYDR-------NGNSSNNVLLTITVLSNG 552

Query: 928 TVAYQQVTFIGDTNSAQLQPLTASLNSIVAGNSTGSTLTATILDAYQNPLKDQLV-TFQS 986
V Q + D TA S A + T TAT+ + S
Sbjct: 553 QVVDQVG--VTD--------FTADKTSAKADGTEAITYTATVKKNGVAQANVPVSFNIVS 602

Query: 987 NDVTLSETEVTTNTLGQATVTMTSNIAGQHNVVVSRKAQASDNKTFSLSVLPDESSAKVI 1046
LS TN G+ATVT+ S+ GQ VV ++ A+ + + + D++ A +
Sbjct: 603 GTAVLSANSANTNGSGKATVTLKSDKPGQ-VVVSAKTAEMTSALNANAVIFVDQTKASIT 661

Query: 1047 SITGAEKTITVGENITLRILVQDAFN-NVIAGQRVRLS-AQPTTNITIGDTAYTDNNGYA 1104
I + T + V+ ++ Q V + + + T TD NGYA
Sbjct: 662 EIKADKTTAVANGQDAITYTVKVMKGDKPVSNQEVTFTTTLGKLSNS---TEKTDTNGYA 718

Query: 1105 YVNLLSTQPGVYQVTATLDNNSSSKVDVNVAN-GKLELTSSKPETTVHNSEGITLTATAR 1163
V L ST PG V+A + + + V L + E +G T +
Sbjct: 719 KVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFFTTLTIDDGNIEIVGTGVKGKLPTVWLQ 778

Query: 1164 NARGEL-MPGQIITFSVTPEGATLSNTGEVLTDQSGQAKVTLTSDKVNVYTVTAIMGKDV 1222
+ L G ++ +N D S +VTL +V +
Sbjct: 779 YGQVNLKASGGNGKYTW-----RSANPAIASVDAS-SGQVTLKEKGTTTISVIS------ 826

Query: 1223 PVQSQVTVAVKADAKTAHVVSVVASPDTITADGIDSSTITSRVEDDYGFPVEGVDISHGL 1282
T + +V + S D +++ +E V + G
Sbjct: 827 --SDNQTATYTIATPNSLIVPNM-SKRVTYNDAVNTCKNFGGKLPSSQNELENVFKAWGA 883

Query: 1283 DTK 1285
K
Sbjct: 884 ANK 886



Score = 67.4 bits (164), Expect = 3e-13
Identities = 65/278 (23%), Positives = 105/278 (37%), Gaps = 21/278 (7%)

Query: 1064 RILVQDAFNNVIAGQRVRLSAQPTTNITIGDTAYTDNNGYAYVNLLSTQPGVYQVTATLD 1123
RI+ D+ GQ +Q + AY N+ Y
Sbjct: 484 RIVWDDSALRSQGGQIQHSGSQSAQDYQAILPAYVQ----GGSNVYKVTARAYDRNGNSS 539

Query: 1124 NNSSSKVDVNVANGKL-------ELTSSKPETTVHNSEGITLTATARNARGELMPGQIIT 1176
NN + V ++NG++ + T+ K +E IT TAT + G ++
Sbjct: 540 NNVLLTITV-LSNGQVVDQVGVTDFTADKTSAKADGTEAITYTATVKK-NGVAQANVPVS 597

Query: 1177 FSVTPEGATLSNTGEVLTDQSGQAKVTLTSDKVNVYTVTA-IMGKDVPVQSQVTVAVKAD 1235
F++ A LS T+ SG+A VTL SDK V+A + + + V D
Sbjct: 598 FNIVSGTAVLSANSAN-TNGSGKATVTLKSDKPGQVVVSAKTAEMTSALNANAVIFV--D 654

Query: 1236 AKTAHVVSVVASPDTITADGIDSSTITSRVEDDYGFPVEGVDISHGLDTKGSPVVNIPTT 1295
A + + A T A+G D+ T T +V PV +++ T + + T
Sbjct: 655 QTKASITEIKADKTTAVANGQDAITYTVKV-MKGDKPVSNQEVT--FTTTLGKL-SNSTE 710

Query: 1296 RTDQSGQVTATITSTLAETLTVNVQVPGTANQSATITL 1333
+TD +G T+TST V+ +V A +
Sbjct: 711 KTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEV 748



Score = 62.4 bits (151), Expect = 1e-11
Identities = 65/344 (18%), Positives = 112/344 (32%), Gaps = 26/344 (7%)

Query: 744 LVADPDTIIAGNSQGSTLTAIITDFHNNPLKDMKVNFVAPGGSQLDNTTATTDQSGIVRV 803
AD + A ++ T TA + + G + L +A T+ SG V
Sbjct: 563 FTADKTSAKADGTEAITYTATVKKNGVAQANVPVSFNIVSGTAVLSANSANTNGSGKATV 622

Query: 804 HLTSSKAGSYSVDASLEVDKNIHQSVTITVVPNREQSVMTLNAGSGSAIANNTNIVTLTA 863
L S K G V A + + + V + S+ + A +A+AN + +T T
Sbjct: 623 TLKSDKPGQVVVSAKTAEMTSALNANAVIFVDQTKASITEIKADKTTAVANGQDAITYTV 682

Query: 864 SVKDVYGHPLPDEDVKFTLPASMTGNFTLSSETARTDANGDAVVTLRGTKAGEFTVTATL 923
V P+ +++V FT N T +TD NG A VTL T G+ V+A +
Sbjct: 683 KVMK-GDKPVSNQEVTFTTTLGKLSN-----STEKTDTNGYAKVTLTSTTPGKSLVSARV 736

Query: 924 TRNNT-VAYQQVTFIGDTNSAQLQPLTASLNSIVAGNSTGSTLTATI-LDAYQNPLKDQL 981
+ V +V F + IV G T +
Sbjct: 737 SDVAVDVKAPEVEFFTTLT------IDDGNIEIVGTGVKGKLPTVWLQYGQVNLKASGGN 790

Query: 982 VTFQSNDVTLSETEVTTNTLGQATVTMTSNIAGQHNVVVSRKAQASDNKTFSLSVLPDES 1041
+ + + + VT+ G + V +SDN+T + ++
Sbjct: 791 GKY---TWRSANPAIASVDASSGQVTL--KEKGTTTISVI----SSDNQTATYTI--ATP 839

Query: 1042 SAKVISITGAEKTITVGENI-TLRILVQDAFNNVIAGQRVRLSA 1084
++ ++ T N + N + A
Sbjct: 840 NSLIVPNMSKRVTYNDAVNTCKNFGGKLPSSQNELENVFKAWGA 883



Score = 33.5 bits (76), Expect = 0.008
Identities = 44/314 (14%), Positives = 91/314 (28%), Gaps = 38/314 (12%)

Query: 367 AALEAAGGKVSTSGKDITVTLPGYRFTNTPETDNTWSIDVTAEDVKGNLSRHEQSMVVIQ 426
A L A + SGK TVTL + V + S + V+
Sbjct: 605 AVLSANSANTNGSGK-ATVTL----------KSDKPGQVVVSAKTAEMTSALNANAVIF- 652

Query: 427 APTLSQKDSLLSVNPLTVAADKKSTTTLTVTAHDSDGTPVPGLALQTRSEGVQDITLSDW 486
+ + + T A+ + T TV PV T + + ++ S
Sbjct: 653 VDQTKASITEIKADKTTAVANGQDAITYTVKV-MKGDKPVSN-QEVTFTTTLGKLSNSTE 710

Query: 487 TDNGDGSYTQILTAGTTSGSVTLTPQINGESAVKESIVVNIVPVVSSRDHSSITIDNVSY 546
+ +G LT TT G ++ +++ + ++ V ++
Sbjct: 711 KTDTNGYAKVTLT-STTPGKSLVSARVSDVAVDVKAPEVEFFTTLTI------------- 756

Query: 547 YAGDDIKVRVELKD-DSNQPVAYQKEELVKAVTVENSKPGATIVWHEEQPGVYAANYPAY 605
DD + + P + + V + W P + + + +
Sbjct: 757 ---DDGNIEIVGTGVKGKLPTVWLQYGQVNLKASGGN---GKYTWRSANPAIASVDASSG 810

Query: 606 KQGTALRAQLSLHNWNAPLQSHIYNIEANQNKARVATLSATNNDVYADKKTFNTLTINVT 665
+ + ++ ++ Q+ Y I + + + + Y D
Sbjct: 811 QVTLKEKGTTTISVISSDNQTATYTIATPNS---LIVPNMSKRVTYNDAVNTCKNFGGKL 867

Query: 666 DESDNPLTNHQVTF 679
S N L N +
Sbjct: 868 PSSQNELENVFKAW 881


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5933     TCRTETA       29     0.040    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 28.6 bits (64), Expect = 0.040
Identities = 64/316 (20%), Positives = 113/316 (35%), Gaps = 24/316 (7%)

Query: 82 RPFLLASALATGLLILAMAWLPPFLLVFIIRFLAGV-----ASAGMLIFGSTLIMQHTRH 136
RP LL S + MA P +++I R +AG+ A AG I T + RH
Sbjct: 73 RPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIADITDGDERARH 132

Query: 137 PFVLAALFSGVGVGIALGNEYVLAGLHFALSSQTLWQGAGALSAIILLALALLIP-SNKH 195
++A F G G+ G VL GL S + A AL+ + L L+P S+K
Sbjct: 133 FGFMSACF---GFGMVAGP--VLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPESHKG 187

Query: 196 VIPPAPLAKIAQQPMSWW---------LLAILYGLAGFGYIIVATYLPLMAKDAGQPVLT 246
P + W L+A+ + + G + A ++ + +D T
Sbjct: 188 ERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWV-IFGEDRFHWDAT 246

Query: 247 AHLWTLVGLSIVPGCFGWLWA---AKRWGALPCLTANLLVQAICVLLTLASSSPLLLIIS 303
+L I+ + A R G L ++ +L ++ +
Sbjct: 247 TIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGWMAFPI 306

Query: 304 SIGFGGTFMGTTSLVMTIARQLSVPGNLNLLGFVTLIYGIGQILGPALTSMLGNGTSALA 363
+ +G +L ++RQ+ L G + + + I+GP L + + +
Sbjct: 307 MVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYAASITTW 366

Query: 364 SATLCGAAALFIAALI 379
+ A A +
Sbjct: 367 NGWAWIAGAALYLLCL 382


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5939     TCRTETB       52     3e-09    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 51.8 bits (124), Expect = 3e-09
Identities = 47/189 (24%), Positives = 76/189 (40%), Gaps = 5/189 (2%)

Query: 7 RHAATLFFPMALILYDFAAYLSTDLIQPGIINVVRDFNADVSLAPAAVSLYLAGGMALQW 66
RH L + L + + ++ P I N A + A L + G A+
Sbjct: 11 RHNQILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAV-- 68

Query: 67 LLGPLSDRIGRKPVLITGALIFTLACAATMFTTSMTQFLI-ARAIQGTSICFIATVGYVT 125
G LSD++G K +L+ G +I S LI AR IQG + V
Sbjct: 69 -YGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVV 127

Query: 126 VQEAFGQTKGIKLMAIITSIVLIAPIIGPLSGAALMHFVHWKVLFAIIAVMGFISFVGLL 185
V + K +I SIV + +GP G + H++HW L +I ++ I+ L+
Sbjct: 128 VARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLL-LIPMITIITVPFLM 186

Query: 186 LAMPETVKR 194
+ + V+
Sbjct: 187 KLLKKEVRI 195


77  Z0458  Z0463  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
Z0458     -2      19       2.648026   ferric transporter ATP-binding subunit
Z0459     -1      20       2.998736   permease component of transport system for
Z0460     1       18       2.845592   periplasmic ferric iron-binding protein
Z0461     2       20       3.986407   permease; hexosephosphate transport
Z0462     1       22       3.795496   sensor kinase
Z0463     0       20       3.959169   response regulator; hexosephosphate transport
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z0458     PF05272       32     0.004    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 31.6 bits (71), Expect = 0.004
Identities = 20/91 (21%), Positives = 29/91 (31%), Gaps = 15/91 (16%)

Query: 34 MVTLLGPSGCGKTTILRLVAGLEKPSEGQIFIDGEDVTHRSI-QQRDICMVFQSYALFPH 92
V L G G GK+T++ + GL F D TH I +D
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGL------DFFSD----THFDIGTGKDSYEQIAGIVA--- 644

Query: 93 MSLGENVGYGDEPLSNLDANLRRSMRDKIRE 123
L E + + A +D+ R
Sbjct: 645 YELSEMTAFRRADAEAVKAFFSSR-KDRYRG 674


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z0461     TCRTETA       40     2e-05    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 39.8 bits (93), Expect = 2e-05
Identities = 62/399 (15%), Positives = 122/399 (30%), Gaps = 31/399 (7%)

Query: 38 VNYVLPALQTDLGLD---KGDIGLLGSLFYLSYGLSKFTAGLWHDSHGQRGFMGVGLFAT 94
+ VLP L DL G+L +L+ L G D G+R + V L
Sbjct: 24 IMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGA 83

Query: 95 GLLNVVFAFGESLTLLLVVWTLNGFFQGWGWPPCARLLTHWYSRNERGFWWGCWNMSINI 154
+ + A L +L + + G G + +ER +G +
Sbjct: 84 AVDYAIMATAPFLWVLYIGRIVAGITGATG-AVAGAYIADITDGDERARHFGFMSACFGF 142

Query: 155 GGAIIPLISAFAAHWWGWQAAMLTPGIISMALGIWLTLQLKGTPQEEGLPTVGHWRHDPL 214
G P++ + A ++ + L + + E P
Sbjct: 143 GMVAGPVLGGLMGG-FSPHAPFFAAAALNGLNFLTGCFLLPESHKGERRP---------- 191

Query: 215 ELRQEQQSPPMGLWQMLRTTMLQNPLIWLLGVSYVLVYVIRIALNDWGNIWLTESHGVNL 274
LR+E +P R + L+ V +++ V ++ W + +
Sbjct: 192 -LRREALNPLAS----FRWARGMTVVAALMAVFFIMQLVGQVPAALWV---IFGEDRFHW 243

Query: 275 LSANATVMLFEVGGLLGALFAGWGSDLLFSGQRAPMILLFTLGLMVSVAALWLAPVHHYA 334
+ + L G+L +L + + + L+ + + L +
Sbjct: 244 DATTIGISL-AAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGWM 302

Query: 335 LLAVCFFTVGFFVFGPQMLIGLAAVECGHK--AAAGSITGFLGLFAYLGAALAGWPLSLV 392
+ + P + L+ + GS+ L + +G L +
Sbjct: 303 AFPIMVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYAAS 362

Query: 393 IERYGWPGMFSLLSVAAVLMGLLLMPLLMAGITTTHARR 431
I W G +A + LL +P L G+ + +R
Sbjct: 363 ITT--WNG---WAWIAGAALYLLCLPALRRGLWSGAGQR 396


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z0462     PF06580       49     1e-08    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 49.5 bits (118), Expect = 1e-08
Identities = 42/205 (20%), Positives = 79/205 (38%), Gaps = 43/205 (20%)

Query: 337 QSQLVKRARDPAQIQSAASQIN-------------------ELARRIHLSTRQLLR-QLR 376
Q ++ A++ AQ+ + +QIN AR + S +L+R LR
Sbjct: 151 QWKMASMAQE-AQLMALKAQINPHFMFNALNNIRALILEDPTKAREMLTSLSELMRYSLR 209

Query: 377 PPALDELTFREALLHL-----INEFAFSERGIHCQFAYQLNSTPENETVRFTLYRLLQEL 431
+++ + L + + F +R ++ P V+ L+Q L
Sbjct: 210 YSNARQVSLADELTVVDSYLQLASIQFEDR-----LQFENQINPAIMDVQVPPM-LVQTL 263

Query: 432 LNNICKHA-----EASEVTIILRQQGEVLHLEVSDNGVGIA--SGKMAGFGIQGMRERVS 484
+ N KH + ++ + + + LEV + G + + G G+Q +RER+
Sbjct: 264 VENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKESTGTGLQNVRERLQ 323

Query: 485 ALGGD---LTLE-KQHGTRVIVNLP 505
L G + L KQ +V +P
Sbjct: 324 MLYGTEAQIKLSEKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z0463     HTHFIS        73     3e-17    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 72.9 bits (179), Expect = 3e-17
Identities = 31/116 (26%), Positives = 53/116 (45%), Gaps = 4/116 (3%)

Query: 2 IRVVLVDDHVVVRSGFAQLLSLED-DLEVIGQYSSAAQAWSALIRDDVNVAVIDIAMPDE 60
+++ DD +R+ Q LS D+ + +AA W + D ++ V D+ MPDE
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITS---NAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 NGLSLLKRLRAQKPQFRAIILSIYDAPTFVQSALDAGASGYLTKRCGPEELVQAVR 116
N LL R++ +P +++S + A + GA YL K EL+ +
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIG 116


78  Z0493  Z0498  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
Z0493     1       14       2.127895   fructokinase
Z0494     1       12       2.191260   MFS transport protein AraJ
Z0495     0       14       2.451906   exonuclease SbcC
Z0496     -1      13       2.482646   exonuclease SbcD
Z0497     -2      11       -0.528072  transcriptional regulator PhoB
Z0498     -2      10       -0.642329  phosphate regulon sensor protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z0493     ACETATEKNASE  30     0.015    Acetate kinase family signature.
		>ACETATEKNASE#Acetate kinase family signature.

Length = 400

Score = 29.8 bits (67), Expect = 0.015
Identities = 17/69 (24%), Positives = 29/69 (42%), Gaps = 10/69 (14%)

Query: 187 FISGTGFATDYRRLSGHALKGSEIIRLVEESDPVAELALRRYELRLAKSLAHVVNILDP- 245
+G ++D+R L A + D A+LAL + R+ K++ +
Sbjct: 273 VYGISGISSDFRDLEDAAF---------KNGDKRAQLALNVFAYRVKKTIGSYAAAMGGV 323

Query: 246 DVIVLGGGM 254
DVIV G+
Sbjct: 324 DVIVFTAGI 332


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z0494     TCRTETA       51     3e-09    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 51.4 bits (123), Expect = 3e-09
Identities = 74/356 (20%), Positives = 126/356 (35%), Gaps = 35/356 (9%)

Query: 5 ILSLALGTFGLGMAEFGIMGVLTELAHNVGISIPAAGH---MISYYALGVVVGAPIIALF 61
+ ++AL G+G+ IM VL L ++ S H +++ YAL AP++
Sbjct: 11 LSTVALDAVGIGL----IMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGAL 66

Query: 62 SSRYSLKHILLFLVALCVIGNAMFTLSSSYLMLAIGRLVSGFPHGAFFGVGAIVLSKIIK 121
S R+ + +LL +A + A+ + +L IGR+V+G GA + I
Sbjct: 67 SDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIAD--IT 124

Query: 122 PGKVTAAVAGMVSGMTVANLLGIPLGTYLSQEFSWRYTFLLIAVFNIAVMASVYFWVPDI 181
G A G +S ++ P+ L FS F A N + F +P+
Sbjct: 125 DGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPES 184

Query: 182 RDEAKGKLREQ----------FHFLRSPAPWLI--FAATMFGNAGVFAWFSYVKPYMMFI 229
+ LR + + A + F + G W +
Sbjct: 185 HKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFG------E 238

Query: 230 SGFSETAMTFIMMLVGLGM---VLGNMLSGRISGRYSPLRIAAVTDFIIVLALLMLFFCG 286
F A T + L G+ + M++G ++ R R + ++L F
Sbjct: 239 DRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFAT 298

Query: 287 GMKTTSLIFAFICCAGLFALSAPLQILLLQNAKGGELLGAAGGQIAF--NLGSAVG 340
I + G+ LQ +L + E G G +A +L S VG
Sbjct: 299 RGWMAFPIMVLLASGGIG--MPALQAMLSRQV-DEERQGQLQGSLAALTSLTSIVG 351


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z0495     RTXTOXIND     39     7e-05    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 39.4 bits (92), Expect = 7e-05
Identities = 34/199 (17%), Positives = 71/199 (35%), Gaps = 14/199 (7%)

Query: 671 QQEAQSWQQRQNELTALQNRIQQLTPILETLPQSDDLPHSEETVALDNWRQVHEQCLALH 730
+ + Q + Q R Q L+ +E + E + +V +
Sbjct: 133 EADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIK 192

Query: 731 SQQQTLQQQDVLAAQSLQKAQAQFDTAL--------QASVFDDQQAFLAALMDEQTLTQL 782
Q T Q Q +L K +A+ T L + V + ++L+ +Q +
Sbjct: 193 EQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIA-- 250

Query: 783 EQLKQNLENQRRQAQTLVTQTAETLAQHQQHRPDGLALTVTVEQIQQEL-AQTHQKLREN 841
K + Q + V + +Q +Q + L+ + + Q + KLR+
Sbjct: 251 ---KHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQT 307

Query: 842 TTSQGEIRQQLKQDADNRQ 860
T + G + +L ++ + +Q
Sbjct: 308 TDNIGLLTLELAKNEERQQ 326



Score = 39.4 bits (92), Expect = 7e-05
Identities = 25/204 (12%), Positives = 59/204 (28%), Gaps = 18/204 (8%)

Query: 487 EARIKTLEAQRAQLQAGQPCPLCGSTSHPAVEAYQALEPGVNQSRLLALENEVKKLGEEG 546
EA ++ Q + Q ++E + E + +E + L
Sbjct: 133 EADTLKTQSSLLQARLEQ---TRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLT- 188

Query: 547 AALRGQLDALTKQLQRDENEAQSLRQDEQALTQQWQAVTASLNITLQPQDDIQPWLDAQD 606
+ ++ Q Q + E R + + + + DD L Q
Sbjct: 189 SLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQA 248

Query: 607 -------EHERQL-RLLSQRHELQGQIAAHNQQIIQYQQQIEQRQQQLLTALAGYALTLP 658
E E + +++ + Q+ +I+ +++ + Q L
Sbjct: 249 IAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLF------KNEILD 302

Query: 659 QEDEEESWLATRQQEAQSWQQRQN 682
+ + + E ++RQ
Sbjct: 303 KLRQTTDNIGLLTLELAKNEERQQ 326



Score = 33.3 bits (76), Expect = 0.006
Identities = 16/150 (10%), Positives = 42/150 (28%), Gaps = 5/150 (3%)

Query: 731 SQQQTLQQQDVLAAQSLQKAQAQFDTA----LQASVFDDQQAFLAALMDEQTLTQLEQLK 786
+ Q + A + Q + L D+ F +E+ L +K
Sbjct: 134 ADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVS-EEEVLRLTSLIK 192

Query: 787 QNLENQRRQAQTLVTQTAETLAQHQQHRPDGLALTVTVEQIQQELAQTHQKLRENTTSQG 846
+ + Q + A+ + L L + ++
Sbjct: 193 EQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKH 252

Query: 847 EIRQQLKQDADNRQQQQTLLQQIAQMTQQV 876
+ +Q + + + + Q+ Q+ ++
Sbjct: 253 AVLEQENKYVEAVNELRVYKSQLEQIESEI 282


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z0496     FRAGILYSIN    30     0.022    Fragilysin metallopeptidase (M10C) enterotoxin signat...
		>FRAGILYSIN#Fragilysin metallopeptidase (M10C) enterotoxin signature.

Length = 405

Score = 29.7 bits (66), Expect = 0.022
Identities = 13/70 (18%), Positives = 23/70 (32%), Gaps = 4/70 (5%)

Query: 149 KQQHLLAAITDYYQQHYADACKLRGDQPLPIIATGHLTTVGASKSDAVRDIYIGTLDAFP 208
K+ ++ I ++Y + + + I T D + + I A
Sbjct: 135 KEAQMMNEIAEFYAAPFKKTRAINEKEAFECI-YDSRTRSA--GKD-IVSVKINIDKAKK 190

Query: 209 AQNFPPADYI 218
N P DYI
Sbjct: 191 ILNLPECDYI 200


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z0497     HTHFIS        95     1e-24    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 94.9 bits (236), Expect = 1e-24
Identities = 33/149 (22%), Positives = 62/149 (41%), Gaps = 9/149 (6%)

Query: 4 RILVVEDEAPIREMVCFVLEQNGFQPVEAEDYDSAVNQLNEPWPDLILLDWMLPGGSGIQ 63
ILV +D+A IR ++ L + G+ + + + DL++ D ++P +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 64 FIKHLKRESMTRDIPVVMLTARGEEEDRVRGLETGADDYITKPFSPKELVARIKAVMRRI 123
+ +K+ D+PV++++A+ ++ E GA DY+ KPF EL+ I +
Sbjct: 65 LLPRIKKARP--DLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA-- 120

Query: 124 SPMAVEEVIEMQGLSLDPTSHRVMAGEEP 152
E L D + G
Sbjct: 121 -----EPKRRPSKLEDDSQDGMPLVGRSA 144


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z0498     PF06580       34     0.001    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 34.1 bits (78), Expect = 0.001
Identities = 19/105 (18%), Positives = 33/105 (31%), Gaps = 26/105 (24%)

Query: 325 LVYNAVNH----TPEGTHITVRWQRVPHGAEFSVEDNGPGIAPEHIPRLTERFYRVDKAR 380
LV N + H P+G I ++ + VE+ G
Sbjct: 263 LVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKN---------------- 306

Query: 381 SRQTGGSGLGLAIVKHAVNH---HESRLNIESTVGKGTRFSFVIP 422
+G GL V+ + E+++ + GK +IP
Sbjct: 307 --TKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAM-VLIP 348


79  Z0536  Z0553  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
Z0536     0       23       0.270728   muropeptide transporter
Z0537     3       28       -0.165563  hypothetical protein
Z0539     3       28       -0.245135  transcriptional regulator BolA
Z0541     3       28       0.045972   trigger factor
Z0542     0       21       0.307220   ATP-dependent Clp protease proteolytic subunit
Z0543     1       21       0.088138   ATP-dependent protease ATP-binding subunit ClpX
Z0545     0       19       0.066183   DNA-binding ATP-dependent protease La
Z0547     0       14       0.394230   transcriptional regulator HU subunit beta
Z0548     -2      14       0.367743   peptidyl-prolyl cis-trans isomerase
Z0549     -2      18       0.342033   hypothetical protein
Z0550     -1      15       0.798257   hypothetical protein
Z0551     0       15       2.065195   queuosine biosynthesis protein QueC
Z0552     -1      15       2.140987   hypothetical protein
Z0553     1       15       2.605564   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z0536     TCRTETA       39     3e-05    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 39.0 bits (91), Expect = 3e-05
Identities = 71/347 (20%), Positives = 135/347 (38%), Gaps = 20/347 (5%)

Query: 62 KFLWSPLMDRYTPPFFGRRRGWLLATQILLLVAIAAMGFLEPGTQLRWMAALAVVIAFCS 121
+F +P++ + F RR LL + V A M W+ + ++A +
Sbjct: 56 QFACAPVLGALSDRF--GRRPVLLVSLAGAAVDYAIMAT----APFLWVLYIGRIVAGIT 109

Query: 122 ASQDIVFDAWKTDVLPAEERGAGAAISVLGYRLGMLVSGGLALWLADKWLGWQGMYWLMA 181
+ V A+ D+ +ER + GM+ L + ++ A
Sbjct: 110 GATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGG--FSPHAPFFAAA 167

Query: 182 AL-LIPCIIATLLAPEP--TDTIPVPKTLEQAVVAPLRDFFGRNNAWLILLLIVLYKLGD 238
AL + + L PE + P+ + + + A L+ + ++ +G
Sbjct: 168 ALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQ 227

Query: 239 AFAMSLTTTFLIRGVGFDAGEVGVVNKTLGLLATIVGALYGGILMQRLSLFRALLIFGIL 298
A +L F +DA +G+ G+L ++ A+ G + RL RAL+ G++
Sbjct: 228 VPA-ALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALM-LGMI 285

Query: 299 QGASNAGYWLLSITDKHLYSMGAAVFFENLCGGMGTSAFVALLMTLCNKSFSATQFALLS 358
A GY LL+ + + V GG+G A A+L ++ L+
Sbjct: 286 --ADGTGYILLAFATRGWMAFPIMVLL--ASGGIGMPALQAMLSRQVDEERQGQLQGSLA 341

Query: 359 ALSAVGRVYVGPVAGWFVEAHGWSTF--YLFSVAAAVPGLILLLVCR 403
AL+++ + VGP+ + A +T+ + + AA+ L L + R
Sbjct: 342 ALTSLTSI-VGPLLFTAIYAASITTWNGWAWIAGAALYLLCLPALRR 387


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z0537     PF06291       27     0.029    Lambda prophage Bor protein
		>PF06291#Lambda prophage Bor protein

Length = 102

Score = 26.5 bits (58), Expect = 0.029
Identities = 11/34 (32%), Positives = 18/34 (52%)

Query: 3 KKILFPLVALFMLAGCARPPTTIEVSPTITLPQQ 36
KK+LF ++ GCA+ T+ PT P++
Sbjct: 7 KKMLFSAALAMLITGCAQQTFTVGNKPTAVTPKE 40


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z0543     HTHFIS        29     0.043    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 29.0 bits (65), Expect = 0.043
Identities = 16/73 (21%), Positives = 29/73 (39%), Gaps = 13/73 (17%)

Query: 60 ERSALPTPHEIRNHLDDYVIGQEQAKKVLAVAVYNHYKRLRNGDTSNGVELGKSNILLIG 119
E P+ E + ++G+ A + +Y RL D +++ G
Sbjct: 121 EPKRRPSKLEDDSQDGMPLVGRSAAMQ----EIYRVLARLMQTD---------LTLMITG 167

Query: 120 PTGSGKTLLAETL 132
+G+GK L+A L
Sbjct: 168 ESGTGKELVARAL 180


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z0545     GPOSANCHOR    33     0.003    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 33.5 bits (76), Expect = 0.003
Identities = 23/101 (22%), Positives = 48/101 (47%), Gaps = 7/101 (6%)

Query: 195 KQSVLEMSDVNERLEYLMAMMESEIDLLQVEKRIRNRVKKQMEKSQREYYLNEQMKAIQK 254
++ + L A QV R +++ ++ S+ +Q++A +
Sbjct: 277 TADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDASREAK---KQLEAEHQ 333

Query: 255 ELGEMDDAPD-ENEALKRKIDAAKMPKEAKEKAEAELQKLK 294
+L E + + ++L+R +DA++ EAK++ EAE QKL+
Sbjct: 334 KLEEQNKISEASRQSLRRDLDASR---EAKKQLEAEHQKLE 371


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z0547     DNABINDINGHU  117    3e-38    Prokaryotic integration host factor signature.
		>DNABINDINGHU#Prokaryotic integration host factor signature.

Length = 91

Score = 117 bits (294), Expect = 3e-38
Identities = 49/88 (55%), Positives = 67/88 (76%)

Query: 2 NKSQLIDKIAAGADISKAAAGRALDAIIASVTESLKEGDDVALVGFGTFAVKERAARTGR 61
NK LI K+A +++K + A+DA+ ++V+ L +G+ V L+GFG F V+ERAAR GR
Sbjct: 3 NKQDLIAKVAEATELTKKDSAAAVDAVFSAVSSYLAKGEKVQLIGFGNFEVRERAARKGR 62

Query: 62 NPQTGKEITIAAAKVPSFRAGKALKDAV 89
NPQTG+EI I A+KVP+F+AGKALKDAV
Sbjct: 63 NPQTGEEIKIKASKVPAFKAGKALKDAV 90


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z0550     PF08280       28     0.018    M protein trans-acting positive regulator
		>PF08280#M protein trans-acting positive regulator

Length = 530

Score = 27.5 bits (61), Expect = 0.018
Identities = 24/138 (17%), Positives = 41/138 (29%), Gaps = 20/138 (14%)

Query: 1 MQTQIKVRGYHLDVYQHVNNARYL-------EFLEEARWDGLENSDSFHWMTAH------ 47
+Q I + Y N Y E++ + N FH +
Sbjct: 361 LQHFIPETNLFVSPYYKGNQKLYTSLKLIVEEWMAKLPGKRYLNHKHFHLFCHYVEQILR 420

Query: 48 ------NIAFVVVN-ININYRRPAVLSDLLTITSQLQQLNGKSGILSQVITLEPEGQVVA 100
+ FV N IN + + + + Q+ L+P+ +
Sbjct: 421 NIQPPLVVVFVASNFINAHLLTDSFPRYFSDKSIDFHSYYLLQDNVYQIPDLKPDLVITH 480

Query: 101 DALITFVCIDLKTQKALA 118
LI FV +L A+A
Sbjct: 481 SQLIPFVHHELTKGIAVA 498


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z0553     HTHFIS        29     0.020    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 29.0 bits (65), Expect = 0.020
Identities = 12/64 (18%), Positives = 24/64 (37%), Gaps = 10/64 (15%)

Query: 197 LTVLTQHLGLSLRDCMAFGDAMNDREMLGSVGSGFIMGN----------AMPQLRAELPH 246
TVL Q L + D +A + + ++ + +P+++ P
Sbjct: 16 RTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDLLPRIKKARPD 75

Query: 247 LPVI 250
LPV+
Sbjct: 76 LPVL 79


80  Z0569  Z0587  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
Z0569     -1      12       -1.442351  hypothetical protein
Z0570     1       17       -0.097357  hypothetical protein
Z0571     1       15       -0.405796  maltose O-acetyltransferase
Z0573     0       16       0.072651   hemolysin expression-modulating protein
Z0574     0       16       0.284543   hypothetical protein
Z0576     0       16       1.133059   acridine efflux pump
Z0578     1       13       0.542159   acridine efflux pump
Z0579     1       15       0.295330   DNA-binding transcriptional repressor AcrR
Z0581     3       16       2.550056   potassium efflux protein KefA
Z0583     4       16       4.229799   hypothetical protein
Z0584     3       17       4.720858   primosomal replication protein N''
Z0585     3       24       3.376607   hypothetical protein
Z0586     4       29       3.233283   adenine phosphoribosyltransferase
Z0587     2       24       3.186945   DNA polymerase III subunits gamma and tau
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z0569     BCTERIALGSPF  30     0.028    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 29.8 bits (67), Expect = 0.028
Identities = 31/137 (22%), Positives = 54/137 (39%), Gaps = 24/137 (17%)

Query: 247 IWLPLGLVIGLLAAMFVLRILRRIQSPHHRLQDAIENRDICVHYQPIVSLANGKIVGAEA 306
W+ L L+ G +A +LR R+ + + P++ G+I
Sbjct: 228 PWMLLALLAGFMAFRVMLR------QEKRRVS-----FHRRLLHLPLI----GRIARGLN 272

Query: 307 LARWPQTDGSWLSPDSFIPLAQQTGLS-EPLTLLIIRSVFEDMGDWLRQHPQQHISINLE 365
AR+ +T + S +PL Q +S + ++ R D +R+ H + LE
Sbjct: 273 TARYARTLSILNA--SAVPLLQAMRISGDVMSNDYARHRLSLATDAVREGVSLHKA--LE 328

Query: 366 STVLTSEKIPQLLREMI 382
T L P ++R MI
Sbjct: 329 QTAL----FPPMMRHMI 341


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z0576     ACRIFLAVINRP  1369   0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 1369 bits (3545), Expect = 0.0
Identities = 801/1033 (77%), Positives = 915/1033 (88%), Gaps = 1/1033 (0%)

Query: 1 MPNFFIDRPIFAWVIAIIIMLAGGLAILKLPVAQYPTIAPPAVTISASYPGADAKTVQDT 60
M NFFI RPIFAWV+AII+M+AG LAIL+LPVAQYPTIAPPAV++SA+YPGADA+TVQDT
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VTQVIEQNMNGIDNLMYMSSNSDSTGTVQITLTFESGTDADIAQVQVQNKLQLAMPLLPQ 120
VTQVIEQNMNGIDNLMYMSS SDS G+V ITLTF+SGTD DIAQVQVQNKLQLA PLLPQ
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 121 EVQQQGVSIEKSSSSFLMVVGVINTDGTMTQEDISDYVAANMKDAISRTSGVGDVQLFGS 180
EVQQQG+S+EKSSSS+LMV G ++ + TQ+DISDYVA+N+KD +SR +GVGDVQLFG+
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 181 QYAMRIWMNPNELNKFQLTPVDVITAIKAQNAQVAAGQLGGTPPVKGQQLNASIIAQTRL 240
QYAMRIW++ + LNK++LTPVDVI +K QN Q+AAGQLGGTP + GQQLNASIIAQTR
Sbjct: 181 QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRF 240

Query: 241 TSTEEFGKILLKVNQDGSRVLLRDVAKIELGGENYDIIAEFNGQPASGLGIKLATGANAL 300
+ EEFGK+ L+VN DGS V L+DVA++ELGGENY++IA NG+PA+GLGIKLATGANAL
Sbjct: 241 KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL 300

Query: 301 DTAAAIRAELAKMEPFFPSGLKIVYPYDTTPFVKISIHEVVKTLVEAIILVFLVMYLFLQ 360
DTA AI+A+LA+++PFFP G+K++YPYDTTPFV++SIHEVVKTL EAI+LVFLVMYLFLQ
Sbjct: 301 DTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQ 360

Query: 361 NFRATLIPTIAVPVVLLGTFAVLAAFGFSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420
N RATLIPTIAVPVVLLGTFA+LAAFG+SINTLTMFGMVLAIGLLVDDAIVVVENVERVM
Sbjct: 361 NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420

Query: 421 AEEGLPPKEATRKSMGQIQGALVGIAMVLSAVFVPMAFFGGSTGAIYRQFSITIVSAMAL 480
E+ LPPKEAT KSM QIQGALVGIAMVLSAVF+PMAFFGGSTGAIYRQFSITIVSAMAL
Sbjct: 421 MEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL 480

Query: 481 SVLVALILTPALCATMLKPIAKGDHGEGKKGFFGWFNRMFEKSTHHYTDSVGGILRSTGR 540
SVLVALILTPALCAT+LKP++ H E K GFFGWFN F+ S +HYT+SVG IL STGR
Sbjct: 481 SVLVALILTPALCATLLKPVSAE-HHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 541 YLVLYLIIVVGMAYLFVRLPSSFLPDEDQGVFMTMVQLPAGATQERTQKVLNEVTHYYLT 600
YL++Y +IV GM LF+RLPSSFLP+EDQGVF+TM+QLPAGATQERTQKVL++VT YYL
Sbjct: 540 YLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLK 599

Query: 601 KEKNNVESVFAVNGFGFAGRGQNTGIAFVSLKDWADRPGEENKVEAITMRATRAFSQIKD 660
EK NVESVF VNGF F+G+ QN G+AFVSLK W +R G+EN EA+ RA +I+D
Sbjct: 600 NEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRD 659

Query: 661 AMVFAFNLPAIVELGTATGFDFELIDQAGLGHEKLTQARNQLLAEAAKHPDMLTSVRPNG 720
V FN+PAIVELGTATGFDFELIDQAGLGH+ LTQARNQLL AA+HP L SVRPNG
Sbjct: 660 GFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNG 719

Query: 721 LEDTPQFKIDIDQEKAQALGVSINDINTTLGAAWGGSYVNDFIDRGRVKKVYVMSEAKYR 780
LEDT QFK+++DQEKAQALGVS++DIN T+ A GG+YVNDFIDRGRVKK+YV ++AK+R
Sbjct: 720 LEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFR 779

Query: 781 MLPDDIGDWYVRAADGQMVPFSAFSSSRWEYGSPRLERYNGLPSMEILGQAAPGKSTGEA 840
MLP+D+ YVR+A+G+MVPFSAF++S W YGSPRLERYNGLPSMEI G+AAPG S+G+A
Sbjct: 780 MLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDA 839

Query: 841 MELMEQLASKLPTGVGYDWTGMSYQERLSGNQAPSLYAISLIVVFLCLAALYESWSIPFS 900
M LME LASKLP G+GYDWTGMSYQERLSGNQAP+L AIS +VVFLCLAALYESWSIP S
Sbjct: 840 MALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVS 899

Query: 901 VMLVVPLGVIGALLAATFRGLTNDVYFQVGLLTTIGLSAKNAILIVEFAKDLMDKEGKGL 960
VMLVVPLG++G LLAAT NDVYF VGLLTTIGLSAKNAILIVEFAKDLM+KEGKG+
Sbjct: 900 VMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGV 959

Query: 961 IEATLDAVRMRLRPILMTSLAFILGVMPLVISTGAGSGAQNAVGTGVMGGMVTATVLAIF 1020
+EATL AVRMRLRPILMTSLAFILGV+PL IS GAGSGAQNAVG GVMGGMV+AT+LAIF
Sbjct: 960 VEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIF 1019

Query: 1021 FVPVFFVVVRRRF 1033
FVPVFFVV+RR F
Sbjct: 1020 FVPVFFVVIRRCF 1032


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z0578     RTXTOXIND     44     6e-07    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 44.0 bits (104), Expect = 6e-07
Identities = 33/212 (15%), Positives = 71/212 (33%), Gaps = 23/212 (10%)

Query: 100 TYQATYDSAKGDLAKAQAAANIAQLTVNRYQKLLGTQYISKQEYDQALADAQQANAAVTA 159
+ Y A +L + + Q+ Q +++ ++ L +Q +
Sbjct: 256 EQENKYVEAVNELR--VYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGL 313

Query: 160 AKAAVETARINLAYTKVTSPISGRIGKSNV-TEGALVQNGQATALATVQQLDPIYVDVTQ 218
+ + + +P+S ++ + V TEG +V + T + V + D + V
Sbjct: 314 LTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAE-TLMVIVPEDDTLEVTALV 372

Query: 219 SSNDFLRLKQELA----------NGTLKQENGKAKVSLITSDGIKFPQDGTLEFSDVTVD 268
+ D + KV I D I+ + G + ++++
Sbjct: 373 QNKDIGFINVGQNAIIKVEAFPYTRYGYLV---GKVKNINLDAIEDQRLGLVFNVIISIE 429

Query: 269 QTTGSITLRAIFPNPDHTLLPGMFVRARLEEG 300
+ S + I L GM V A ++ G
Sbjct: 430 ENCLSTGNKNIP------LSSGMAVTAEIKTG 455



Score = 34.4 bits (79), Expect = 8e-04
Identities = 24/125 (19%), Positives = 43/125 (34%), Gaps = 13/125 (10%)

Query: 49 PLQITTELPGR-TSAYRIAEVRPQVSGIILKRNFKEGSDIEAGVSLYQIDPATYQATYDS 107
++I G+ T + R E++P + I+ + KEG + G L ++ +A
Sbjct: 79 QVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEA---- 134

Query: 108 AKGDLAKAQAAANIAQLTVNRYQKLLGTQYISKQEYDQALADAQQANAAVTAAKAAVETA 167
D K Q++ A+L RYQ L E ++
Sbjct: 135 ---DTLKTQSSLLQARLEQTRYQILS-----RSIELNKLPELKLPDEPYFQNVSEEEVLR 186

Query: 168 RINLA 172
+L
Sbjct: 187 LTSLI 191


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z0579     HTHTETR       221    1e-75    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 221 bits (564), Expect = 1e-75
Identities = 214/215 (99%), Positives = 214/215 (99%)

Query: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60
MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60

Query: 61 EIWELSESNIGELELEYQAKFPSDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV 120
EIWELSESNIGELELEYQAKFP DPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV
Sbjct: 61 EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV 120

Query: 121 GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF 180
GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF
Sbjct: 121 GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF 180

Query: 181 APQSFDLKKEARDYVAILLEMYLLCPTLRNPATNE 215
APQSFDLKKEARDYVAILLEMYLLCPTLRNPATNE
Sbjct: 181 APQSFDLKKEARDYVAILLEMYLLCPTLRNPATNE 215


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
Z0581     RTXTOXIND  32     0.017    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 31.7 bits (72), Expect = 0.017
Identities = 19/125 (15%), Positives = 40/125 (32%), Gaps = 6/125 (4%)

Query: 28 QNTAFARASSNGDLPTKADLQAQLDSLNKQKDLSAQDKLVQQDLTDTLATLDKIDRVKEE 87
N RA L + + L L+ + A L++ ++ E
Sbjct: 207 LNLDKKRAERLTVLARINRYENLSRVEKSR--LDDFSSLLHKQAIAKHAVLEQENKYVEA 264

Query: 88 TVQLRQKVAEAPEKMRQATAALTALSDVDND--EETRKIL--STLSLRQLETRVAQALDD 143
+LR ++ + + +A V E L +T ++ L +A+ +
Sbjct: 265 VNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEER 324

Query: 144 LQNAQ 148
Q +
Sbjct: 325 QQASV 329


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
Z0587     IGASERPTASE  41     2e-05    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 40.8 bits (95), Expect = 2e-05
Identities = 40/251 (15%), Positives = 77/251 (30%), Gaps = 31/251 (12%)

Query: 404 PLPETTSQVLAARQQLQRVQGATKAKKSEPAA----ATRARPVNNAALERLASVTDRVQA 459
P E +Q + + + P+ AR + A + A T
Sbjct: 983 PEVEKRNQTVDTTN----ITTPNNIQADVPSVPSNNEEIARV-DEAPVPPPAPATPSETT 1037

Query: 460 RPVPSALEKAPAKKEAYRWKATTPVMQQKE--------VVATPKALKKA---LEHEKTPE 508
V ++ E AT Q +E V A + + A E ++T
Sbjct: 1038 ETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQT 1097

Query: 509 LAAKLAA---------EAIERDAWAAQVSQLSLPKLVEQVALNAWKE-ESDNAVCLHLRS 558
K A E+ +V+ PK + + E +N ++++
Sbjct: 1098 TETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKE 1157

Query: 559 SQRHLNNRGAQQKLAEALS-MLKGSTVELTIVEDDNPAVRTPLEWRQAIYEEKLAQARES 617
Q N ++ A+ S ++ E T V N V P A + + +
Sbjct: 1158 PQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSN 1217

Query: 618 IIADNNIQTLR 628
+ + +++R
Sbjct: 1218 KPKNRHRRSVR 1228


81  Z0606  Z0639  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
Z0606     4       29       7.717368   glutaminase
Z0607     4       30       7.806448   amino acid/amine transport protein
Z0608     4       31       8.079922   outer membrane export protein
Z0609     4       31       8.253974   hypothetical protein
Z0615     3       31       7.915574   RTX family exoprotein
Z0634     -3      18       2.164743   hypothetical protein
Z0635     -2      14       0.690392   membrane spanning export protein
Z0636     -1      14       -0.714550  transcriptional regulator
Z0638     -2      14       -2.326802  hypothetical protein
Z0639     -3      15       -2.298866  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
Z0606     BLACTAMASEA  29     0.021    Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 29.0 bits (65), Expect = 0.021
Identities = 11/43 (25%), Positives = 19/43 (44%)

Query: 38 GQLAAVAIVTSDGNVYSAGDSDYRFALESISKVCTLALALEDV 80
G++ + + + G +A +D RF + S KV L V
Sbjct: 38 GRVGMIEMDLASGRTLTAWRADERFPMMSTFKVVLCGAVLARV 80


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
Z0608     RTXTOXIND  32     0.006    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 31.7 bits (72), Expect = 0.006
Identities = 22/165 (13%), Positives = 57/165 (34%), Gaps = 20/165 (12%)

Query: 199 DELQAQTRIAGMRSTLEQYQAQMASAKAQLAVLTGVQPEAIAAP----PAELAEQPVSLK 254
L A+ +S+L Q + + + + + + P ++E+ V
Sbjct: 128 TALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVL-- 185

Query: 255 NIDYQSIPLVLAAENLRQSAQYGVEKTKAQYWPTLSIQGGKTRYQTSDRSYWDDQLQLNV 314
+ L+ + Q+ +Y E + R + ++ +L+
Sbjct: 186 ----RLTSLIKEQFSTWQNQKYQKELNLDKK--RAERLTVLARINRYENLSRVEKSRLDD 239

Query: 315 NAPLYQGGAVS--------AQVQQAEGQQKISASQVEQAKLDVLQ 351
+ L A++ + +A + ++ SQ+EQ + ++L
Sbjct: 240 FSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILS 284


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z0609     INTIMIN  37     5e-04    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 37.4 bits (86), Expect = 5e-04
Identities = 63/372 (16%), Positives = 115/372 (30%), Gaps = 44/372 (11%)

Query: 649 QTVTVTLNGQTYQGVVQPDGTWSVTVPAANVGALADGNA--TVTASVNDVAGNPSSVSRV 706
T+TV NGQ V D T A A ADG T TA+V ++V
Sbjct: 544 LTITVLSNGQVVDQVGVTDFT------ADKTSAKADGTEAITYTATVKKNGVAQANVPVS 597

Query: 707 ALVDATPPVVTINPVATDNVINTPEHAQAQIISGTVTGAQAGDIVTVTLNNVDYTTVVDG 766
+ + V++ N T+ ++ V A+ ++ + N + VD
Sbjct: 598 FNIVSGTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAKTAEMTSAL--NANAVIFVDQ 655

Query: 767 SGNWSLGVPASVVSGLADGSYPVSVSVTDKAGNTGSQSLTVTVNTAAPLIGINSIAGDDV 826
+ + A + +A+G ++ +V G+ + VT T +
Sbjct: 656 TKASITEIKADKTTAVANGQDAITYTVKVMKGDKPVSNQEVTFTTTLGKLSN-------- 707

Query: 827 INASEKGADLQITGTSDQPVNTAITVTLNGQNYTTTTDASGNWSVTVPASAVTALGQANY 886
T +D +T+T + + + +V V A V
Sbjct: 708 -----------STEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFF--TTL 754

Query: 887 TVTAAVTSDIGNSATASHNVLVDSALPGVTINPVATDDIINAAEAGVAQTISGQVTGAED 946
T+ +G V LP V + + + + + D
Sbjct: 755 TIDDGNIEIVGTG--------VKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPAIASVD 806

Query: 947 GDTVTITL---GGNTYTATVGSN--LTWSVDVPAADIQALGNGDLTVNASVTNQNGNTGS 1001
+ +TL G T + N T+++ P + I + +T N +V G
Sbjct: 807 ASSGQVTLKEKGTTTISVISSDNQTATYTIATPNSLIVPNMSKRVTYNDAVNTCKNFGGK 866

Query: 1002 GTRDITIDANLP 1013
N+
Sbjct: 867 LPSSQNELENVF 878



Score = 35.0 bits (80), Expect = 0.002
Identities = 81/416 (19%), Positives = 139/416 (33%), Gaps = 61/416 (14%)

Query: 783 ADGSYPVSVSVTDKAGN-TGSQSLTVTVNTAAPLIGINSIAGDDVINASEKGADLQITGT 841
Y V+ D+ GN + + LT+TV + + D + + AD GT
Sbjct: 521 GSNVYKVTARAYDRNGNSSNNVLLTITV-LSNGQVVDQVGVTDFTADKTSAKAD----GT 575

Query: 842 SDQPVNTAITVTLNGQNYTTTTDASGNWSVTVPASAVTALGQANYTVTAAVTSDIGNSAT 901
AIT YT T +G VP S G A + +A T + +
Sbjct: 576 ------EAIT-------YTATVKKNGVAQANVPVSFNIVSGTAVLSANSANT-----NGS 617

Query: 902 ASHNVLVDSALPGVTINPVATDDIINAAEAGVAQTISGQVTGAEDGDTVTITLGGNTYTA 961
V + S PG + T ++ +A A A Q I T A
Sbjct: 618 GKATVTLKSDKPGQVVVSAKTAEMTSALNAN-AVIFVDQTK----ASITEIKADKTTAVA 672

Query: 962 TVGSNLTWSVDVPAADIQALGNGDLTVNASVTNQNG----NTGSGTRDITIDANLPG--- 1014
+T++V V D + + N ++T ++ + +G +T+ + PG
Sbjct: 673 NGQDAITYTVKVMKGD-KPVSNQEVTFTTTLGKLSNSTEKTDTNGYAKVTLTSTTPGKSL 731

Query: 1015 --LRVDTVAGDDVVNIIEHGQALVVTGSS-----SGLAESTP----------LTVTINNV 1057
RV VA D +E L + + +G+ P L + N
Sbjct: 732 VSARVSDVAVDVKAPEVEFFTTLTIDDGNIEIVGTGVKGKLPTVWLQYGQVNLKASGGNG 791

Query: 1058 EYTTAVQADGSWSVGVTAAQVSAWPAGTVNIAVSGESSAGNSVSITHPVTVDLTPAAITI 1117
+YT SV ++ QV+ GT I+V + + +I TP ++ +
Sbjct: 792 KYTWRSANPAIASVDASSGQVTLKEKGTTTISVISSDNQTATYTIA-------TPNSLIV 844

Query: 1118 NTIATDDVINAAEKGADLTLSGTTTNVEPGQTVTVTFGGKNYTASVASDGSWTATV 1173
++ N A ++ + V +G N S + + V
Sbjct: 845 PNMSKRVTYNDAVNTCKNFGGKLPSSQNELENVFKAWGAANKYEYYKSSQTIISWV 900


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
Z0615     CABNDNGRPT  45     9e-06    NodO calcium binding signature.
		>CABNDNGRPT#NodO calcium binding signature.

Length = 479

Score = 44.6 bits (105), Expect = 9e-06
Identities = 43/221 (19%), Positives = 68/221 (30%), Gaps = 12/221 (5%)

Query: 4539 GTASNNGKFVGTGYNDTFFATAGTDTYDGSGGWVYSSGTGTWLANGGMDVVDFRLSTVGV 4598
T + F D + AT + S + T + ++ +
Sbjct: 265 RTGDSVYGFNSNTDRDFYTATDSSKALIFSVWDAGGTDTFDFSGYSNNQRINLNEGSFSD 324

Query: 4599 TANLSSTAAQATGFNTSTFTNIEGISGSNFNDILTGSSGDNQLEGRGGNDTLNIGNGGHD 4658
L + A G IE G + NDIL G+S DN L+G GND L G G
Sbjct: 325 VGGLKGNVSIAHG------VTIENAIGGSGNDILVGNSADNILQGGAGNDVLYGGAG--A 376

Query: 4659 TLLYKLLNASDATGGNGSDVVNGFTVGTWEGTADTDRIDIRELLQGSGYTG-NGKASYVN 4717
LY G+G D + D+ID+ + + +
Sbjct: 377 DTLYGGAGRDTFVYGSGQDSTVAAYDWIADFQKGIDKIDLSAFRNEGQLSFVQDQFTGKG 436

Query: 4718 GVATLDAQAGNIGDFVKVTQS---GSDTIVQIDRDGTGGTF 4755
L A N + + ++ D +V+I
Sbjct: 437 QEVMLQWDAANSITNLWLHEAGHSSVDFLVRIVGQAAQSDI 477


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
Z0635     RTXTOXIND  257    1e-83    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 257 bits (657), Expect = 1e-83
Identities = 103/434 (23%), Positives = 168/434 (38%), Gaps = 56/434 (12%)

Query: 11 LTEPRLPRSALAV-RVTAVMLLCFLGWAWYFQLDEVTTGSGTVEPSGREQVVQSLEGGIL 69
L E + R V L+ + Q++ V T +G + SGR + ++ +E I+
Sbjct: 48 LIETPVSRRPRLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIV 107

Query: 70 YHLDVKVGDIVEQGQPLAQLNRTKTESDVQEAMSRLYAALATSARLRAEVSNK------P 123
+ VK G+ V +G L +L E+D + S L A R + +
Sbjct: 108 KEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPE 167

Query: 124 LVFPDEL----------------------NKFPELIESETALYNTR--RDGLNKATTGLT 159
L PDE + + E L R R +
Sbjct: 168 LKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYE 227

Query: 160 QGISLVNRELAMTQPLVKQGAASSVEVLRLQRQANELEN--------------------- 198
+ L L+ + A + VL + + E N
Sbjct: 228 NLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKE 287

Query: 199 KLSDVRTQYYVQAREELAKANAEVETQRSVIRGREDSLTRLNFTAPVRGIVQDIDVTTVG 258
+ V + + ++L + + + E+ APV VQ + V T G
Sbjct: 288 EYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEG 347

Query: 259 GVIAPGGKLMTIVPLDEQLLIEAKISPRDVAFIHPGQKSLVKITAYDYSIYGGLPGEVAV 318
GV+ LM IVP D+ L + A + +D+ FI+ GQ +++K+ A+ Y+ YG L G+V
Sbjct: 348 GVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKN 407

Query: 319 ISPDTVQDEVRRDVYYYRVYIRTFSNHLENKSKQQFPIFPGMVATVDIRTGKKSVLDYLL 378
I+ D ++D+ R + V I N L +K P+ GM T +I+TG +SV+ YLL
Sbjct: 408 INLDAIEDQ--RLGLVFNVIISIEENCLSTGNK-NIPLSSGMAVTAEIKTGMRSVISYLL 464

Query: 379 KPF-NKAQEALRER 391
P E+LRER
Sbjct: 465 SPLEESVTESLRER 478


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z0639     PF03895  55     3e-12    Serum resistance protein DsrA.
		>PF03895#Serum resistance protein DsrA.

Length = 79

Score = 55.2 bits (133), Expect = 3e-12
Identities = 21/78 (26%), Positives = 34/78 (43%), Gaps = 1/78 (1%)

Query: 262 RKEANAGTASAIAIASQPQVKTGDVMMVSAGAGTFNGESAVSVGTSFNAGTHTVLKAGIS 321
KE G A+ A++ Q VSA G + ++A+++G KAG++
Sbjct: 2 SKELQTGLANQSALSMLVQPNGVGKTSVSAAVGGYRDKTALAIGVGSRITDRFTAKAGVA 61

Query: 322 ADTQS-DFGAGVGVGYSF 338
+T + G VGY F
Sbjct: 62 FNTYNGGMSYGASVGYEF 79


82  Z0708  Z0714  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias   Product
Z0708     0       20       2.345050  sensor kinase CusS
Z0709     -1      19       2.391327  DNA-binding transcriptional activator CusR
Z0711     -1      17       1.426922  copper/silver efflux system outer membrane
Z0712     -2      17       1.537506  copper-binding protein
Z0713     -2      17       1.611257  copper/silver efflux system membrane fusion
Z0714     -3      16       1.039956  inner membrane component for iron transport
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z0708     PF06580  30     0.018    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 30.2 bits (68), Expect = 0.018
Identities = 30/183 (16%), Positives = 67/183 (36%), Gaps = 34/183 (18%)

Query: 306 EELTRMAKMVSDML-FLAQADNNQLIPEKKMLNLADEVGKVFDFFEALAEDR-GVELQFV 363
+ M +S+++ + + N + + LADE+ V + + LA + LQF
Sbjct: 191 TKAREMLTSLSELMRYSLRYSNARQVS------LADELTVVDSYLQ-LASIQFEDRLQFE 243

Query: 364 GDECQVAGDPLMLRRALSNLLSNALRY----TPPGEAIVVRCQTVDHLVQVIVENPGTPI 419
D + + L+ N +++ P G I+++ + V + VEN G+
Sbjct: 244 NQINPAIMDVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLA 303

Query: 420 APEHLPRLFDRFYRVDPSRQRKGEGSGIGLAIVK---SIVVAHKGTVAVTSNARGTRFVI 476
E +G GL V+ ++ + + ++ ++
Sbjct: 304 LKN------------------TKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMV 345

Query: 477 VLP 479
++P
Sbjct: 346 LIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
Z0709     HTHFIS  86     2e-21    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 85.7 bits (212), Expect = 2e-21
Identities = 35/117 (29%), Positives = 62/117 (52%)

Query: 2 KLLIVEDEKKTGEYLTKGLTEAGFVVDLADNGLNGYHLAMTGDYDLIILDIMLPDVNGWD 61
+L+ +D+ L + L+ AG+ V + N + GD DL++ D+++PD N +D
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 IVRMLRSANKGMPILLLTALGTIEHRVKGLELGADDYLVKPFAFAELLARVRTLLRR 118
++ ++ A +P+L+++A T +K E GA DYL KPF EL+ + L
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
Z0711     RTXTOXIND  38     9e-05    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 37.5 bits (87), Expect = 9e-05
Identities = 24/182 (13%), Positives = 59/182 (32%), Gaps = 13/182 (7%)

Query: 254 QAQTVNSDSLQSVKLPA-GLSSQILLQRPDIMEAEHALM-----AANANIGAARAAFFPS 307
+ +S + +K + +I+++ + + L+ A A+ ++
Sbjct: 87 NGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQS----- 141

Query: 308 ISLTSGISTASSDLSSLFNASSGMWNFIPKIEIPIFNAGRNQANLDIAEIRQQQSVVNYE 367
SL + + + + P F + L + + ++Q
Sbjct: 142 -SLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQN 200

Query: 368 QKIQNAFKEVADALALRQSLNDQISAQQRYLASLQITLQRARALYQHGAVSYLEVLDAER 427
QK Q + A R ++ +I+ + + L +L A++ VL+ E
Sbjct: 201 QKYQ-KELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQEN 259

Query: 428 SL 429

Sbjct: 260 KY 261


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z0714     ACRIFLAVINRP  694    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 694 bits (1793), Expect = 0.0
Identities = 213/1058 (20%), Positives = 439/1058 (41%), Gaps = 54/1058 (5%)

Query: 1 MIEWIIRRSVANRFLVLMGALFLSIWGTWTIINTPVDALPDLSDVQVIIKTSYPGQAPQI 60
M + IRR + A+ L + G I+ PV P ++ V + +YPG Q
Sbjct: 1 MANFFIRR----PIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQT 56

Query: 61 VENQVTYPLTTTMLSVPGAKTVRGFSQ-FGDSYVYVIFEDGTDPYWARSRVLEYLNQVQG 119
V++ VT + M + + S G + + F+ GTDP A+ +V L
Sbjct: 57 VQDTVTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATP 116

Query: 120 KLPAGVSAELGP-DATGVGWIYEYALVDRSGKHDLADLRSLQDWFLKYELKTIPDVAEVA 178
LP V + + + ++ V + D+ +K L + V +V
Sbjct: 117 LLPQEVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQ 176

Query: 179 SVGGVVKEYQVVIDPQRLAQYGISLAEVKSALDASNQEAGGSSIELA------EAEYMVR 232
G ++ +D L +Y ++ +V + L N + + + +
Sbjct: 177 LFGAQ-YAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASII 235

Query: 233 ASGYLQTLDDFNHIVLKASENGVPVYLRDVAKIQVGPEMRRGIAELNG-EVVGGVVILRS 291
A + ++F + L+ + +G V L+DVA++++G E IA +NG G + L +
Sbjct: 236 AQTRFKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLAT 295

Query: 292 GKNAREVIAAVKDKLETLKSSLPEGVEIVTTYDRSQLIDRAIDNLSGKLLEEFIVVAVVC 351
G NA + A+K KL L+ P+G++++ YD + + +I + L E ++V +V
Sbjct: 296 GANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVM 355

Query: 352 ALFLWHVRSALVAIISLPLGLCIAFIVMHFQGLNANIMSLGGIAIAVGAMVDAAIVMIEN 411
LFL ++R+ L+ I++P+ L F ++ G + N +++ G+ +A+G +VD AIV++EN
Sbjct: 356 YLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVEN 415

Query: 412 AHKRLEEWQHQHPDATLDNKTRWQVITNASVEVGPALFISLLIITLSFIPIFTLEGQEGR 471
+ + E D + + ++ AL ++++ FIP+ G G
Sbjct: 416 VERVMME----------DKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGA 465

Query: 472 LFGPLAFTKTYAMAGAALLAIVVIPILMGYWIRGKIPPESSNPLNRF----------LIR 521
++ + T AMA + L+A+++ P L ++ + E F +
Sbjct: 466 IYRQFSITIVSAMALSVLVALILTPALCATLLKP-VSAEHHENKGGFFGWFNTTFDHSVN 524

Query: 522 VYHPLLLKVLHWPKTTLLVAALSVLTVLWPLNKVGGEFLPQINEGDLLYMPSTLPGISAA 581
Y + K+L LL+ AL V ++ ++ FLP+ ++G L M G +
Sbjct: 525 HYTNSVGKILGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQE 584

Query: 582 EAASMLQKTDKLIM--SVPEVARVFGKTGKAETATDSAPLEMVETTIQLKPQEQW-RPGM 638
+L + + V VF G + + + LKP E+
Sbjct: 585 RTQKVLDQVTDYYLKNEKANVESVFTVNGFSFSGQAQN---AGMAFVSLKPWEERNGDEN 641

Query: 639 TMDKIIEELDNTVRLPGLANLWVPPIRNRIDMLSTGIKSPIGIKVSGTVLADI-DAMAEQ 697
+ + +I + + + +++ + I +G + A +
Sbjct: 642 SAEAVIHRAKMELGKIRDGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQL 701

Query: 698 IEEVARTVPGVASALAERLEGGRYINVEINREKAARYGMTVADVQLFVTSAVGGAMVGET 757
+ A+ + S LE +E+++EKA G++++D+ +++A+GG V +
Sbjct: 702 LGMAAQHPASLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDF 761

Query: 758 VEGIARYPINLRYPQSWRDSPQALRQLPILTPMKQQITLADVADVKVSTGPSMLKTENAR 817
++ + ++ +R P+ + +L + + + + + G L+ N
Sbjct: 762 IDRGRVKKLYVQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGL 821

Query: 818 PTSWIYIDARDRDMVSVVHDLQKAIAEKVQLKPGTSVAFSGQFELLERANHKLKLMVPMT 877
P+ I +A L + +A K L G ++G + ++ +V ++
Sbjct: 822 PSMEIQGEAAPGTSSGDAMALMENLASK--LPAGIGYDWTGMSYQERLSGNQAPALVAIS 879

Query: 878 LMIIFVLLYLAFRRVGEALLIISSVPFALVGGIWLLWWMGFHLSVATGTGFIALAGVAAE 937
+++F+ L + + ++ VP +VG + V G + G++A+
Sbjct: 880 FVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAK 939

Query: 938 FGVVMLMYLRHAIEAEPSLNNPQTFSEQKLDEALHHGAVLRVRPKAMTVAVIIAGLLPIL 997
++++ + + +E E + + EA +R+RP MT I G+LP+
Sbjct: 940 NAILIVEFAKDLMEKE----------GKGVVEATLMAVRMRLRPILMTSLAFILGVLPLA 989

Query: 998 WGTGAGSEVMSRIAAPMIGGMITAPLLSLFIIPAAYKL 1035
GAGS + + ++GGM++A LL++F +P + +
Sbjct: 990 ISNGAGSGAQNAVGIGVMGGMVSATLLAIFFVPVFFVV 1027


83  Z0733  Z0738  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias   Product
Z0733     -1      16       4.835976  enterobactin exporter EntS
Z0734     -2      16       4.601897  iron-enterobactin transporter periplasmic
Z0735     -2      20       4.926254  isochorismate synthase
Z0736     -2      20       4.795683  enterobactin synthase subunit E
Z0737     -1      19       4.581074  2,3-dihydro-2,3-dihydroxybenzoate synthetase
Z0738     -1      17       3.765373  2,3-dihydroxybenzoate-2,3-dehydrogenase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z0733     TCRTETA  36     2e-04    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 36.0 bits (83), Expect = 2e-04
Identities = 82/394 (20%), Positives = 145/394 (36%), Gaps = 38/394 (9%)

Query: 27 FISIVSLGLLGVAVPVQIQMMTHSTWQV---GLSVTLTGGAMFVGLMVGGVLADRYERKK 83
+ V +GL+ +P ++ + HS G+ + L F V G L+DR+ R+
Sbjct: 15 ALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRP 74

Query: 84 VILLARGTCGIGFIGLCLNALL--PEPSLLAIYLLGLWDGFFASLGVTALLAATPALVGR 141
V+L + G ++ + P L +Y+ + G + G A A +
Sbjct: 75 VLL-------VSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVA-GAYIADITDG 126

Query: 142 ENLMQAGAITMLTVRLGSVISPMIGGLLLATGGVAWNYGLAAAGTFITLLPLLSLPALPP 201
+ + G V P++GGL+ GG + + AA L L LP
Sbjct: 127 DERARHFGFMSACFGFGMVAGPVLGGLM---GGFSPHAPFFAAAALNGLNFLTGCFLLPE 183

Query: 202 PPQPREHPLK----SLLAGFRFLLASPLVGGIALLGGLLTMAS----AVRVLYPALADNW 253
+ PL+ + LA FR+ +V + + ++ + A+ V++ D +
Sbjct: 184 SHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFG--EDRF 241

Query: 254 QMSAAQIGFLYAAIP-LGAAIGALTSGKLAHSARPGLLMLLSTLGS---FLAIGLFGLMP 309
A IG AA L + A+ +G +A ++L + ++ +
Sbjct: 242 HWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGW 301

Query: 310 MWILGVVCLALFGWLSAVSSLLQYTMLQTQTPEAMLGRINGLWTAQNVTGDAIGAALLGG 369
M +V LA G ML Q E G++ G A +G L
Sbjct: 302 MAFPIMVLLASGGIGMPALQ----AMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTA 357

Query: 370 LGAMMTPVASASASGFGLLIIGVLLLLVLVELRR 403
+ A + + +G+ + L LL L LRR
Sbjct: 358 IYA----ASITTWNGWAWIAGAALYLLCLPALRR 387


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z0734     FERRIBNDNGPP  64     1e-13    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 63.8 bits (155), Expect = 1e-13
Identities = 61/285 (21%), Positives = 101/285 (35%), Gaps = 35/285 (12%)

Query: 40 HTLESQPQRIVSTSVTLTGSLLAIDAPVIASGATTPNNRVADDQGFLRQWSKVAKERKLQ 99
H P RIV+ LLA+ VAD + R W E L
Sbjct: 29 HAAAIDPNRIVALEWLPVELLLALGIVPYG---------VADTINY-RLW---VSEPPLP 75

Query: 100 RLYIG-----EPSTEAVAAQMPDLILISATGGDSALALYDQLSTIAPTLIINYDDKSWQA 154
I EP+ E + P ++ SA G S + L+ IAP N+ D
Sbjct: 76 DSVIDVGLRTEPNLELLTEMKPSFMVWSAGYGPS----PEMLARIAPGRGFNFSDGKQPL 131

Query: 155 L-----LTQLGEITGHEKQAAERIAQFDKQLAAAKEQIKLPPQPVTAIVYTAAAHSANLW 209
LT++ ++ + A +AQ++ + + K + + ++
Sbjct: 132 AMARKSLTEMADLLNLQSAAETHLAQYEDFIRSMKPRFVKRGARPLLLTTLIDPRHMLVF 191

Query: 210 TPESAQGQMLEQLGFTLAKLPAGLNASQSQGKRHDIIQLGGENLAAGLNGESLFLFAGDQ 269
P S ++L++ G NA Q + + + LAA + + L +
Sbjct: 192 GPNSLFQEILDEYGIP--------NAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDHDNS 243

Query: 270 KDADAIYANPLLAHLPAVQNKQVYALGTETFRLDYYSAMQVLERL 314
KD DA+ A PL +P V+ + + F SAM + L
Sbjct: 244 KDMDALMATPLWQAMPFVRAGRFQRVPAVWFYGATLSAMHFVRVL 288


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z0737     ISCHRISMTASE  444    e-161    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 444 bits (1142), Expect = e-161
Identities = 146/299 (48%), Positives = 195/299 (65%), Gaps = 18/299 (6%)

Query: 1 MAIPKLQAYALPESHDIPQNKVDWAFEPQRAALLIHDMQDYFVSFWGENCPMMEQVIANI 60
MAIP +Q Y +P + D+PQNKV W +P RA LLIHDMQ+YFV + + ++ ANI
Sbjct: 1 MAIPAIQPYQMPTASDMPQNKVSWVPDPNRAVLLIHDMQNYFVDAFTAGASPVTELSANI 60

Query: 61 AALRDYCKQHNIPVYYTAQPKEQSDEDRALLNDMWGPGLTRSPEQQKVVDRLTPDADDTV 120
L++ C Q IPV YTAQP Q+ +DRALL D WGPGL P ++K++ L P+ DD V
Sbjct: 61 RKLKNQCVQLGIPVVYTAQPGSQNPDDRALLTDFWGPGLNSGPYEEKIITELAPEDDDLV 120

Query: 121 LVKWRYSAFHRSPLEQMLKESGRNQLIITGVYAHIGCMTTATDAFMRDIKPFMVADALAD 180
L KWRYSAF R+ L +M+++ GR+QLIITG+YAHIGC+ TA +AFM DIK F V DA+AD
Sbjct: 121 LTKWRYSAFKRTNLLEMMRKEGRDQLIITGIYAHIGCLVTACEAFMEDIKAFFVGDAVAD 180

Query: 181 FSRDEHLMSLKYVAGRSGRVVMTEELL------PAPIPASKA-----------ALREVIL 223
FS ++H M+L+Y AGR VMT+ LL PA + + A +R+ I
Sbjct: 181 FSLEKHQMALEYAAGRCAFTVMTDSLLDQLQNAPADVQKTSANTGKKNVFTCENIRKQIA 240

Query: 224 PLLDESDEPFDDD-NLIDYGLDSVRMMALAARWRKVHGDIDFVMLAKNPTIDAWWKLLS 281
LL E+ E D +L+D GLDSVR+M L +WR+ ++ FV LA+ PTI+ W KLL+
Sbjct: 241 ELLQETPEDITDQEDLLDRGLDSVRIMTLVEQWRREGAEVTFVELAERPTIEEWQKLLT 299


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z0738     DHBDHDRGNASE  362    e-130    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 362 bits (930), Expect = e-130
Identities = 110/258 (42%), Positives = 150/258 (58%), Gaps = 20/258 (7%)

Query: 5 GKNVWVTGAGKGIGYATALAFVEAGAKVTGFD---------------QAFAQEQYPFATE 49
GK ++TGA +GIG A A GA + D +A E +P
Sbjct: 8 GKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFP---- 63

Query: 50 VMDVADAAQVAQVCQRLLAETERLDVLVNAAGILRMGATDQLSKEDWQQTFAVNVGGAFN 109
DV D+A + ++ R+ E +D+LVN AG+LR G LS E+W+ TF+VN G FN
Sbjct: 64 -ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFN 122

Query: 110 LFQQTMNQFRRQRGGAIVTVASDAAHTPRIGMSAYGASKAALKSLALSVGLELAGSGVRC 169
+ +R G+IVTV S+ A PR M+AY +SKAA +GLELA +RC
Sbjct: 123 ASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRC 182

Query: 170 NVVSPGSTDTDMQRTLWVSDDAEEQRIRGFGEQFKLGIPLGKIARPQEIANTILFLASDL 229
N+VSPGST+TDMQ +LW ++ EQ I+G E FK GIPL K+A+P +IA+ +LFL S
Sbjct: 183 NIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQ 242

Query: 230 ASHITLQDIVVDGGSTLG 247
A HIT+ ++ VDGG+TLG
Sbjct: 243 AGHITMHNLCVDGGATLG 260


84  Z0870  Z0878  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
Z0870     0       19       -1.599356  hypothetical protein
Z0871     -1      20       0.028951   outer membrane usher protein
Z0872     -1      22       0.404870   fimbriae structural protein
Z0873     0       22       0.988483   type II citrate synthase
Z0875     1       23       2.774719   succinate dehydrogenase cytochrome b556 large
Z0876     1       25       3.303711   succinate dehydrogenase cytochrome b556 small
Z0877     1       27       3.432043   succinate dehydrogenase flavoprotein subunit
Z0878     2       30       2.958933   succinate dehydrogenase iron-sulfur subunit
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z0870     PF00577  46     5e-10    Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 45.6 bits (108), Expect = 5e-10
Identities = 14/53 (26%), Positives = 23/53 (43%), Gaps = 5/53 (9%)

Query: 1 MVGENGHAWLSGVDENQQFTVHWG--DQKTCAIH--LPEHLEDVT-KRLILPC 48
+V +NG +LSG+ + V WG + C + LP + +L C
Sbjct: 825 IVADNGQVYLSGMPLAGKVQVKWGEEENAHCVANYQLPPESQQQLLTQLSAEC 877


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z0871     PF00577  548    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 548 bits (1414), Expect = 0.0
Identities = 208/760 (27%), Positives = 338/760 (44%), Gaps = 57/760 (7%)

Query: 5 RLSVLSCLAMVTPPALTA-EFNLNVLDKSIRDSVDISLLNQKGVVAPGDYFVSVTVNNNK 63
RL V A P + FN L + D+S + PG Y V + +NN
Sbjct: 29 RLFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPGTYRVDIYLNNGY 88

Query: 64 ISNGQQIRWQKSGDK--IIPCINESLIELFGLKSDFRKKLPAIKE--CVDF-SVFPEIIF 118
++ + + + + I+PC+ + + GL + + + + CV S+ +
Sbjct: 89 MAT-RDVTFNTGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDACVPLTSMIHDATA 147

Query: 119 TFDQANQQLNITIPQAWLAWHSENWTPPSTWNNGIPGFLMDYNLFASTYRPQSGSSSNNL 178
D Q+LN+TIPQA+++ + + PP W+ GI L++YN ++ + + G +S+
Sbjct: 148 QLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSVQNRIGGNSHYA 207

Query: 179 NAYGTTGLNAGAWRLRSDYQLSQSDSGDNREQSGAI--SRTYLFRPLPQIGSRLTLGETD 236
+GLN GAWRLR + S + S + T+L R + + SRLTLG+
Sbjct: 208 YLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRSRLTLGDGY 267

Query: 237 FSSNIFDGFSYTGAALASDDRMLPWELRGYAPQISGIAQTNATVTISHSGRVIYQKKVPP 296
+IFDG ++ GA LASDD MLP RG+AP I GIA+ A VTI +G IY VPP
Sbjct: 268 TQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIYNSTVPP 327

Query: 297 GPFIIDDLNQ-SVQGTLDVKVSEEDGRVNNFQVSAASTPFLTRQGQVRYKLAAGRPRSSM 355
GPF I+D+ G L V + E DG F V +S P L R+G RY + AG RS
Sbjct: 328 GPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITAGEYRSG- 386

Query: 356 SHHTEDETFISHEVSWGMLSNTSLYGGMLLAGDDYRSGALGIGQNMLWMGALSFDVTWAD 415
+ E F + G+ + ++YGG LA D YR+ GIG+NM +GALS D+T A+
Sbjct: 387 NAQQEKPRFFQSTLLHGLPAGWTIYGGTQLA-DRYRAFNFGIGKNMGALGALSVDMTQAN 445

Query: 416 SHFDTQQDEQGYSYRFNYSKQVDATNSTISLAAYRFSDRHFHSYANYIDHKYNDADAQDE 475
S G S RF Y+K ++ + + I L YR+S + ++A+ + N + + +
Sbjct: 446 STLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGYNIETQ 505

Query: 476 --------------------KQTISLSFGQPITLLNLNLYANILHQSWWNADTSTTANIT 515
+ + L+ Q + + LY + HQ++W
Sbjct: 506 DGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRTS-TLYLSGSHQTYWGTSNVDE-QFQ 563

Query: 516 VGFNVDIGDWKDISVSTSFNTTHYE-DKDRDNQIYFSISLPIG-----------ESGRLG 563
G N ++DI+ + S++ T K RD + ++++P
Sbjct: 564 AGLNT---AFEDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRSDSKSQWRHASAS 620

Query: 564 YDMQNNSN-TTTHRMSWNDTLDERN--SWGMSAG-IQSDRPDNGAQVSGNYQHLSSAGEW 619
Y M ++ N T+ TL E N S+ + G ++G+ + G
Sbjct: 621 YSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGYGNA 680

Query: 620 DLTGTYAANDYTSASASWSGSFTATQHGAAFHRRSSTNEPRLMVSTDGVGDIPIQGN-ID 678
++ G ++D SG A +G + N+ ++V G D ++
Sbjct: 681 NI-GYSHSDDIKQLYYGVSGGVLAHANGVTLGQP--LNDTVVLVKAPGAKDAKVENQTGV 737

Query: 679 YTNRFGIAVVPFVSSYQPTTVAVNMNDLPDGVTVSENVVK 718
T+ G AV+P+ + Y+ VA++ N L D V + V
Sbjct: 738 RTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVAN 777


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z0872     FIMBRIALPAPE  33     2e-04    Escherichia coli: P pili tip fibrillum papE protein...
		>FIMBRIALPAPE#Escherichia coli: P pili tip fibrillum papE protein signature.

Length = 173

Score = 33.5 bits (76), Expect = 2e-04
Identities = 41/179 (22%), Positives = 79/179 (44%), Gaps = 26/179 (14%)

Query: 14 SLLFASPVFAADEGSGEIHFKGEVIEAPCEIHPDDLDMNIDLGEVTTSHINREHHSEKK- 72
++L + V AAD + FKG++I C + + ++ G++ ++ + ++K
Sbjct: 15 AVLMSQHVHAADN----LTFKGKLIIPACTVQ----NAEVNWGDIEIQNLVQSGGNQKDF 66

Query: 73 PVNIRLINCDIAGWDDGKGGVVSKVGVTFDSTAKTTGATPLLSNISAGEATGVGVRLMNK 132
V++ NC + + + VT S TG + L+ N S G+ + L N
Sbjct: 67 TVDM---NCPYS---------LGTMKVTITSNG-QTGNSILVPNTSTASGDGLLIYLYNS 113

Query: 133 DES----FVTLGTEAPTIDLLASSTEQTLNFFAWMEQMDNAVSVTAGAVTANATYVLDY 187
+ S VTLG++ + ++ + + +A + N S+ AG +A AT V Y
Sbjct: 114 NNSGIGNAVTLGSQVTPGKITGTAPARKITLYAKLGYKGNMQSLQAGTFSATATLVASY 172


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
Z0878     TCRTETOQM  31     0.003    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 31.4 bits (71), Expect = 0.003
Identities = 11/41 (26%), Positives = 23/41 (56%), Gaps = 1/41 (2%)

Query: 14 VDDAPRMQDYTLEAEEGRDM-MLLDALIQLKEKDPSLSFRR 53
+++ + T+E + + MLLDAL+++ + DP L +
Sbjct: 339 IENPLPLLQTTVEPSKPQQREMLLDALLEISDSDPLLRYYV 379


85  Z0975  Z0989  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
Z0975     2       22       3.963877   tail component of prophage CP-933K
Z0976     3       25       4.822316   tail component of prophage CP-933K
Z0977     3       26       5.362192   tail component of prophage CP-933K
Z0978     4       28       5.100463   tail component of prophage CP-933K
Z0979     2       25       1.841472   tail component of prophage CP-933K
Z0980     2       23       0.148976   tail component of prophage CP-933K
Z0981     5       37       -4.976131  prophage protein
Z0982     8       43       -6.172506  tail component of prophage CP-933K
Z0984     6       45       -9.295099  hypothetical protein
Z0985     2       32       -6.345159  hypothetical protein
Z0986     0       20       -2.556443  hypothetical protein
Z0989     -1      16       0.144057   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z0975     cloacin  44     3e-06    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 43.9 bits (103), Expect = 3e-06
Identities = 34/142 (23%), Positives = 62/142 (43%), Gaps = 4/142 (2%)

Query: 519 DQQRLNDLQEKKRQKDLQDAK--EQAERNYQEQQKRRNAENAALNRMNETEAARHQREIA 576
DQ + +E +RQ++ E AERNY+ + N N + R E +A Q +
Sbjct: 294 DQVKQRQDEENRRQQEWDATHPVEAAERNYERARAELNQANEDVARNQERQAKAVQVYNS 353

Query: 577 RINAMQYADQAVRDA-AIQRENERYEKALASGKKKTRETRNDEATRLLLQYSQQQAQVEG 635
R + + A++ + DA A ++ R+ +G + + +A R + +QA +
Sbjct: 354 RKSELDAANKTLADAIAEIKQFNRFAHDPMAGGHRMWQMAGLKAQRAQTDVNNKQAAFDA 413

Query: 636 QIAAARQSAGIATERMTEARKQ 657
A + A A E+RK+
Sbjct: 414 -AAKEKSDADAALSSAMESRKK 434


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z0980     SURFACELAYER  33     0.005    Lactobacillus surface layer protein signature.
		>SURFACELAYER#Lactobacillus surface layer protein signature.

Length = 439

Score = 33.5 bits (76), Expect = 0.005
Identities = 34/143 (23%), Positives = 45/143 (31%), Gaps = 30/143 (20%)

Query: 966 SVNANSGTLNNVTVNENCTIKGMLEATQV----RGDF---------VKAVSKSFPKQAGT 1012
+ + L NVT + +K L+A ++ G F VKA S K A
Sbjct: 235 AAQYDKKQLTNVTFDTETAVKDALKAQKIEVSSVGYFKAPHTFTVNVKATSNKNGKSATL 294

Query: 1013 WGNTETPNGTVTVTISDDHNFDRQIIIPPIIFNGIAYSDPGSGNNPGGTRYTGYGFEVRK 1072
PN V S I+ N Y + G R
Sbjct: 295 PVTVTVPNVADPVVPSQSKT---------IMHNAYFYDKDA--------KRVGTDKVTRY 337

Query: 1073 NGVLIASRETKGAIPGSYSAVID 1095
N V +A TK A SY VI+
Sbjct: 338 NTVTVAMNTTKLANGISYYEVIE 360


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z0981     ENTEROVIROMP  87     2e-24    Enterobacterial virulence outer membrane protein si...
		>ENTEROVIROMP#Enterobacterial virulence outer membrane protein signature.

Length = 171

Score = 87.3 bits (216), Expect = 2e-24
Identities = 45/140 (32%), Positives = 72/140 (51%), Gaps = 15/140 (10%)

Query: 1 MRKVCAAILSAAICLAVSGVPAWASEHQSTLSAGYLHASTDAPG-SDDLNGINVKYRYEF 59
M+K+ AA+ +G A ST++ GY A +DA G + + G N+KYRYE
Sbjct: 1 MKKIACLSALAAVLAFTAGTSVAA---TSTVTGGY--AQSDAQGQMNKMGGFNLKYRYEE 55

Query: 60 TDT-LGLITSFSYANAEDEQKTHYSDTRWHEDYVRNRWFSVMAGPSVRVNEWFSAYAMAG 118
++ LG+I SF+Y T S T DY +N+++ + AGP+ R+N+W S Y + G
Sbjct: 56 DNSPLGVIGSFTY--------TEKSRTASSGDYNKNQYYGITAGPAYRINDWASIYGVVG 107

Query: 119 VAYSRVSTFSGDYFRVTDNK 138
V Y + T ++ +
Sbjct: 108 VGYGKFQTTEYPTYKHDTSD 127


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z0982     CHANLCOLICIN  33     0.002    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 33.1 bits (75), Expect = 0.002
Identities = 36/130 (27%), Positives = 55/130 (42%), Gaps = 15/130 (11%)

Query: 133 SARNAGISASKAEASAANADTSAEDASESARQAAESAASAKKSEEASSSSAS-------- 184
S G SK+E+SAA T+ ++ + AE AA AK + EA + + +
Sbjct: 34 SGGGGGKGGSKSESSAAIHATAKWSTAQLKKTQAEQAARAKAAAEAQAKAKANRDALTQR 93

Query: 185 ------EAAQKASESLQSATDAELSKKTAESAAGNAARDATTSTEKARESAESAQSAEQS 238
EA + + SAT+ + A A R A + EKAR+ AE+A+ A Q
Sbjct: 94 LKDIVNEALRHNASRTPSATELAHANNAAMQAEDERLRLA-KAEEKARKEAEAAEKAFQE 152

Query: 239 RIAAEDAVNR 248
+ R
Sbjct: 153 AEQRRKEIER 162


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z0989     YERSSTKINASE  29     0.027    Yersinia serine/threonine protein kinase signature.
		>YERSSTKINASE#Yersinia serine/threonine protein kinase signature.

Length = 732

Score = 28.9 bits (64), Expect = 0.027
Identities = 19/66 (28%), Positives = 32/66 (48%), Gaps = 3/66 (4%)

Query: 200 RMDKINGESLLNISSLPAQAEHAIYDMFDRLEQKGILFVDTTETNVLYDRAKNEFNPIDI 259
+ KIN E+ A H + D+ + L + G++ D NV++DRA E ID+
Sbjct: 234 KQGKINSEAYWGTIKFIA---HRLLDVTNHLAKAGVVHNDIKPGNVVFDRASGEPVVIDL 290

Query: 260 SSYNVS 265
++ S
Sbjct: 291 GLHSRS 296


86  Z1012  Z1017  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
Z1012     -2      19       3.797930   hypothetical protein
Z1013     -2      19       3.853869   hypothetical protein
Z1014     -1      13       -2.111963  ABC transporter ATP-binding protein
Z1015     -1      13       -2.345046  hypothetical protein
Z1016     0       13       -2.569080  DNA-binding transcriptional regulator
Z1017     0       14       -2.109336  ATP-dependent RNA helicase RhlE
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z1012     ABC2TRNSPORT  32     0.003    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 31.8 bits (72), Expect = 0.003
Identities = 15/51 (29%), Positives = 26/51 (50%)

Query: 232 VFMMPAILLSGYVSPVENMPVWLQNLTWINPIRHFTDITKQIYLKDASLDI 282
+ + P + LSG V PV+ +P+ Q P+ H D+ + I L +D+
Sbjct: 184 LVITPILFLSGAVFPVDQLPIVFQTAARFLPLSHSIDLIRPIMLGHPVVDV 234


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z1014     PF05272  32     0.008    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 31.6 bits (71), Expect = 0.008
Identities = 20/90 (22%), Positives = 28/90 (31%), Gaps = 21/90 (23%)

Query: 240 TPRFEDAFIDLLGGAGTSESPLGAILHTVEGTPGETVIEAKELTKKFGDFAATDHVNFAV 299
PR E + +LG P + + + K HV +
Sbjct: 547 VPRLEKWLVHVLGKTPDDYKP-------------RRLRYLQLVGKYI----LMGHVARVM 589

Query: 300 KRGEIFG----LLGPNGAGKSTTFKMMCGL 325
+ G F L G G GKST + GL
Sbjct: 590 EPGCKFDYSVVLEGTGGIGKSTLINTLVGL 619



Score = 29.7 bits (66), Expect = 0.037
Identities = 11/23 (47%), Positives = 13/23 (56%)

Query: 39 YVTGLVGPDGAGKTTLMRMLAGL 61
Y L G G GK+TL+ L GL
Sbjct: 597 YSVVLEGTGGIGKSTLINTLVGL 619


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
Z1015     RTXTOXIND  53     4e-10    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 52.9 bits (127), Expect = 4e-10
Identities = 36/203 (17%), Positives = 78/203 (38%), Gaps = 24/203 (11%)

Query: 80 YEIALMQAKAGVSVAQAQYD-LMLTLKSAQDKLRQYRSGNREQ---DIAQAKASLEQAQA 135
E ++A + V ++Q + + + SA+++ + + + + Q ++
Sbjct: 257 QENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTL 316

Query: 136 QLAQAELNLQDSTLIAPSDGTLLTRAV-EPGTVLNEGGTVFTVSLT-RPVWVRAYVDERN 193
+LA+ E Q S + AP + V G V+ T+ + + V A V ++
Sbjct: 317 ELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKD 376

Query: 194 LDQAQPGRKVLLYTDGRPDKPYH---GQIGFVSPTAEFTPKTVETPDLRTDLVYRLRIVV 250
+ G+ ++ + P Y G++ ++ A D R LV+ + I +
Sbjct: 377 IGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDA--------IEDQRLGLVFNVIISI 428

Query: 251 T-------DADDALRQGMPVTVQ 266
+ + L GM VT +
Sbjct: 429 EENCLSTGNKNIPLSSGMAVTAE 451



Score = 48.3 bits (115), Expect = 1e-08
Identities = 24/157 (15%), Positives = 52/157 (33%), Gaps = 15/157 (9%)

Query: 4 KPVVIGLAVVVLAAVVAGXYWWYQSRQDNGLTLYGNV--DIRTVNLSFRVGGRVESLAVD 61
+A ++ +V + + T G + R+ + V+ + V
Sbjct: 54 SRRPRLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVK 113

Query: 62 EGDAIKAGQVLGELDHKPYEIALMQAKAGVSVAQAQYDLMLTLKSAQDK----------- 110
EG++++ G VL +L E ++ ++ + A+ + L + +
Sbjct: 114 EGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDE 173

Query: 111 --LRQYRSGNREQDIAQAKASLEQAQAQLAQAELNLQ 145
+ + + K Q Q Q ELNL
Sbjct: 174 PYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLD 210


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z1016     HTHTETR  73     6e-18    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 73.1 bits (179), Expect = 6e-18
Identities = 33/214 (15%), Positives = 77/214 (35%), Gaps = 17/214 (7%)

Query: 13 KGEQAKKQLIAAALAQFGEYGMNATT-REIAAQAGQNIAAITYYFGSKEDLYLACAQWIA 71
+ ++ ++ ++ AL F + G+++T+ EIA AG AI ++F K DL+ +
Sbjct: 8 EAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSE 67

Query: 72 DFIGEQFRPHAEEAERLFAQPQPDRAAIRELILRACRNMIKLLTQDDTVNLSKFISREQL 131
IGE E + P + +RE+++ + + + + + F E +
Sbjct: 68 SNIGELEL---EYQAKFPGDP---LSVLREILIHVLESTVTEERRRLLMEII-FHKCEFV 120

Query: 132 SPTAAYHLVHEQVISPLHSHLTRLIAAWTGCDANDTRMILHTHALIGEILAFRLGKETIL 191
A + + + + + +A L T + + G
Sbjct: 121 GEMAVVQQAQRNLCLESYDRIEQTLKH--CIEAKMLPADLMTRRAAIIMRGYISG----- 173

Query: 192 LRTGWTAFDEEKTELINQTVTCHIDLILQGLSQR 225
L W + + + ++ ++L+
Sbjct: 174 LMENWLFAPQSFD--LKKEARDYVAILLEMYLLC 205


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z1017     SECA  30     0.025    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 29.8 bits (67), Expect = 0.025
Identities = 20/67 (29%), Positives = 34/67 (50%), Gaps = 4/67 (5%)

Query: 246 QQVLVFTRTKHGANHLAEQLNKDGIRSAAIHG-NKSQGARTRALADFKSGDIRVLVATDI 304
Q VLV T + + ++ +L K GI+ ++ + A A A + + V +AT++
Sbjct: 450 QPVLVGTISIEKSELVSNELTKAGIKHNVLNAKFHANEAAIVAQAGYPAA---VTIATNM 506

Query: 305 AARGLDI 311
A RG DI
Sbjct: 507 AGRGTDI 513


87  Z1530  Z1537  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
Z1530     2       33       -9.574921  hypothetical protein
Z1531     1       25       -5.138679  hypothetical protein
Z1532     1       22       -3.531862  hypothetical protein
Z1533     1       21       -2.886246  oxidoreductase
Z1534     2       22       -3.712351  chaperone
Z1535     2       22       -3.854002  hypothetical protein
Z1536     2       21       -4.564710  usher protein
Z1537     2       15       -4.891174  chaperone
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z1530     TRNSINTIMINR  30     0.004    Translocated intimin receptor (Tir) signature.
		>TRNSINTIMINR#Translocated intimin receptor (Tir) signature.

Length = 549

Score = 30.1 bits (67), Expect = 0.004
Identities = 13/40 (32%), Positives = 21/40 (52%), Gaps = 1/40 (2%)

Query: 5 YFLFAGIILCAFIAAILSHIAFHHANEPAEQNISCNAHVI 44
Y L + +I+ I A ++ A H N+PAEQ + H +
Sbjct: 366 YGLSSALIVAGGIGAGVT-TALHRRNQPAEQTTTTTTHTV 404


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z1533     DHBDHDRGNASE  104    6e-29    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 104 bits (260), Expect = 6e-29
Identities = 70/255 (27%), Positives = 118/255 (46%), Gaps = 11/255 (4%)

Query: 18 LHNKVAIVTGAAGELGRGLCSALAKAGANLLLVDIK-EPDNRYLKHLTHEGVEVEFMTID 76
+ K+A +TGAA +G + LA GA++ VD E + + L E E D
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPAD 65

Query: 77 ITKPDASCTIINRCLERFGQLDILVNNAGVCNINRPIDFNRNDWDPMINLNLNAAFDMSQ 136
+ A I R G +DILVN AGV + +W+ ++N F+ S+
Sbjct: 66 VRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASR 125

Query: 137 AALNIFVPQRKGKIINMCSVLSFHGGRWSPG-YAATKHALAGLTKAYADDFAEYNIQING 195
+ + +R G I+ + S + R S YA++K A TK + AEYNI+ N
Sbjct: 126 SVSKYMMDRRSGSIVTVGSNPA-GVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNI 184

Query: 196 IAPGYYVSEMTAIIYNNPKIKE-LIKGR-------IPAQRWGRAQDLMGAMVFLASAASD 247
++PG ++M ++ + E +IKG IP ++ + D+ A++FL S +
Sbjct: 185 VSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAG 244

Query: 248 YVNGQLLVIDGGYSI 262
++ L +DGG ++
Sbjct: 245 HITMHNLCVDGGATL 259


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z1536     PF00577  677    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 677 bits (1748), Expect = 0.0
Identities = 237/805 (29%), Positives = 386/805 (47%), Gaps = 44/805 (5%)

Query: 28 ATPSDEDNYTFDPQLFRGSRFSQSSLAKLTTRESVAPGNYKMDIYTNNKLSGSWNVTFKE 87
P F+P+ + + L++ + + PG Y++DIY NN + +VTF
Sbjct: 39 QAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPGTYRVDIYLNNGYMATRDVTFNT 98

Query: 88 AADG-RVLPCLTPEVADAIGLKTGEDKGEK---DPVCTFAKELAPGITSQTQLSQLRLDL 143
++PCLT ++GL T G D C + T+Q + Q RL+L
Sbjct: 99 GDSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDACVPLTSMIHDATAQLDVGQQRLNL 158

Query: 144 SVPQSQLISRPRGYVPPSELDTGASLAFMNYIANYYNVAYSGQNAHSQRSLWASFNGGIN 203
++PQ+ + +R RGY+PP D G + +NY + + + + + + G+N
Sbjct: 159 TIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNS--VQNRIGGNSHYAYLNLQSGLN 216

Query: 204 LGAWQYRQLSNMTW-----DNDKGNQWNNIRSYLQRPLPAINSQLMMGQLITSGRFFSGL 258
+GAW+ R + ++ + N+W +I ++L+R + + S+L +G T G F G+
Sbjct: 217 IGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRSRLTLGDGYTQGDIFDGI 276

Query: 259 SYHGVSLATDERMLPDSMRGYAPTIRGVAATNARVSVMQNGHEIYQTTVAPGPFEINDLY 318
++ G LA+D+ MLPDS RG+AP I G+A A+V++ QNG++IY +TV PGPF IND+Y
Sbjct: 277 NFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIYNSTVPPGPFTINDIY 336

Query: 319 PTSYSGDLDVTVTEANGAVSRFSVPFSAVPESMRPGTSRYNVEVGKTQDSG---DDSMFG 375
SGDL VT+ EA+G+ F+VP+S+VP R G +RY++ G+ + + F
Sbjct: 337 AAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITAGEYRSGNAQQEKPRFF 396

Query: 376 DLTWQHGMTNTLTFNSGSRIADGYQALMLGGVYGS-SLGAFGANLTWSHARVPESEAQSG 434
T HG+ T G+++AD Y+A G +LGA ++T +++ +P+ G
Sbjct: 397 QSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALSVDMTQANSTLPDDSQHDG 456

Query: 435 WMSQLTWSKTFQPTSTTVSLAGYRYSTSGYRDLADVLGERHAASNKQSWD---------- 484
+ ++K+ + T + L GYRYSTSGY + AD R N ++ D
Sbjct: 457 QSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGYNIETQDGVIQVKPKFT 516

Query: 485 ---SSQWRQQSRFDLTLSQSLANYGNLFVSGSTQNYRGGKSRDTQLQLGYSNSFSHGISM 541
+ + ++ + LT++Q L L++SGS Q Y G + D Q Q G + +F I+
Sbjct: 517 DYYNLAYNKRGKLQLTVTQQLGRTSTLYLSGSHQTYWGTSNVDEQFQAGLNTAF-EDINW 575

Query: 542 NLSVGRQRMGGYKDNSDDMQTVTSLSFSFPLGG-------NGPRVPSLSNSWTHSTDGSS 594
LS + D +L+ + P + R S S S +H +G
Sbjct: 576 TLSYSLTKNAW--QKGRDQM--LALNVNIPFSHWLRSDSKSQWRHASASYSMSHDLNGRM 631

Query: 595 QLQSSLTGMLDEAQTTNYSLNV---MRDQQYKQTTLSGNMQKRFSQTTVGLNASKGQDYW 651
+ + G L E +YS+ +T + R + S D
Sbjct: 632 TNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGYGNANIGYSHSDDIK 691

Query: 652 QASGNVQGAMAVHGGGITFGPYLGETFALVEAKGAEGAKVYNSSQLEINDSGYALVPAVT 711
Q V G + H G+T G L +T LV+A GA+ AKV N + + + GYA++P T
Sbjct: 692 QLYYGVSGGVLAHANGVTLGQPLNDTVVLVKAPGAKDAKVENQTGVRTDWRGYAVLPYAT 751

Query: 712 PYRYNRISLDPQGMDGDAELVDSERQVAPVAGAAVKVIFRTRPGKALLIKSRMADGSELP 771
YR NR++LD + + +L ++ V P GA V+ F+ R G LL+ + LP
Sbjct: 752 EYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAEFKARVGIKLLMTLT-HNNKPLP 810

Query: 772 MGADVLDENNTVVGIAGQGGQIYLR 796
GA V E++ GI GQ+YL
Sbjct: 811 FGAMVTSESSQSSGIVADNGQVYLS 835


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z1537     SECA  29     0.022    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 28.7 bits (64), Expect = 0.022
Identities = 19/66 (28%), Positives = 27/66 (40%), Gaps = 14/66 (21%)

Query: 170 VTNPTGYYVTIRAAELLNNGKKVPLANSVMIAPQSTTEW-----TLPSGISVAPGAQIHL 224
V + V + +LN IA T E TLP+ ++ G +H+
Sbjct: 78 VFGMRHFDVQLLGGMVLNERC---------IAEMRTGEGKTLTATLPAYLNALTGKGVHV 128

Query: 225 VTVNDY 230
VTVNDY
Sbjct: 129 VTVNDY 134


88  Z1713  Z1722  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias   Product
Z1713     2       14       2.721811  flagellar basal body rod modification protein
Z1714     0       14       2.787836  flagellar hook protein FlgE
Z1715     -1      13       2.589100  flagellar basal body rod protein FlgF
Z1716     -1      9        1.443819  flagellar basal body rod protein FlgG
Z1717     0       13       2.439050  flagellar basal body L-ring protein
Z1718     0       13       2.167724  flagellar basal body P-ring biosynthesis protein
Z1719     1       14       1.883117  flagellar rod assembly protein/muramidase FlgJ
Z1720     2       14       1.436184  flagellar hook-associated protein FlgK
Z1721     3       16       1.465268  flagellar hook-associated protein FlgL
Z1722     4       19       1.829082  ribonuclease E
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z1713     SYCECHAPRONE  27     0.033    Gram-negative bacterial type III secretion SycE cha...
		>SYCECHAPRONE#Gram-negative bacterial type III secretion SycE chaperone signature.

Length = 130

Score = 26.6 bits (58), Expect = 0.033
Identities = 14/34 (41%), Positives = 21/34 (61%), Gaps = 2/34 (5%)

Query: 43 LKNQDPTNPMENNELTSQLAQISTVSGIEKLNTT 76
L N+ P N ++NN L +QL + V G E+L T+
Sbjct: 89 LWNRQPLNSLDNNSLYTQLEML--VQGAERLQTS 120


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
Z1714     FLGHOOKAP1  41     4e-06    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 41.5 bits (97), Expect = 4e-06
Identities = 17/49 (34%), Positives = 29/49 (59%)

Query: 353 TLTNGALEASNVDLSKELVNMIVAQRNYQSNAQTIKTQDQILNTLVNLR 401
L+N S V+L +E N+ Q+ Y +NAQ ++T + I + L+N+R
Sbjct: 498 QLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINIR 546



Score = 37.2 bits (86), Expect = 1e-04
Identities = 22/56 (39%), Positives = 30/56 (53%), Gaps = 4/56 (7%)

Query: 6 AVSGLNAAATNLDVIGNNIANSATYGFKSGTASFAD----MFAGSKVGLGVKVAGI 57
A+SGLNAA L+ NNI++ G+ T A + AG VG GV V+G+
Sbjct: 7 AMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVSGV 62


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
Z1716     FLGHOOKAP1  44     4e-07    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 43.8 bits (103), Expect = 4e-07
Identities = 18/81 (22%), Positives = 36/81 (44%), Gaps = 14/81 (17%)

Query: 3 SSLWIAKTGLDAQQTNMDVIANNLANVSTNGFKRQRAVFEDLLYQTIRQPGAQSSEQTTL 62
S + A +GL+A Q ++ +NN+++ + G+ RQ + + +TL
Sbjct: 2 SLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTI--------------MAQANSTL 47

Query: 63 PSGLQIGTGVRPVATERLHSQ 83
+G +G GV +R +
Sbjct: 48 GAGGWVGNGVYVSGVQREYDA 68



Score = 41.1 bits (96), Expect = 3e-06
Identities = 11/41 (26%), Positives = 21/41 (51%)

Query: 220 ETSNVNVAEELVNMIQVQRAYEINSKAVSTTDQMLQKLTQL 260
S VN+ EE N+ + Q+ Y N++ + T + + L +
Sbjct: 505 SISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z1717     FLGLRINGFLGH  349    e-126    Flagellar L-ring protein signature.
		>FLGLRINGFLGH#Flagellar L-ring protein signature.

Length = 232

Score = 349 bits (897), Expect = e-126
Identities = 232/232 (100%), Positives = 232/232 (100%)

Query: 1 MQKNAAHTYAISSLLVLSLTGCAWIPSTPLVQGATSAQPVPGPTPVANGSIFQSAQPINY 60
MQKNAAHTYAISSLLVLSLTGCAWIPSTPLVQGATSAQPVPGPTPVANGSIFQSAQPINY
Sbjct: 1 MQKNAAHTYAISSLLVLSLTGCAWIPSTPLVQGATSAQPVPGPTPVANGSIFQSAQPINY 60

Query: 61 GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYLQGLFGNA 120
GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYLQGLFGNA
Sbjct: 61 GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYLQGLFGNA 120

Query: 121 RADVEASGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRF 180
RADVEASGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRF
Sbjct: 121 RADVEASGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRF 180

Query: 181 SGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSPM 232
SGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSPM
Sbjct: 181 SGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSPM 232


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z1718     FLGPRINGFLGI  427    e-152    Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 427 bits (1098), Expect = e-152
Identities = 156/363 (42%), Positives = 213/363 (58%), Gaps = 9/363 (2%)

Query: 4 FLSALILLLVTTAAQAERIRDLTSVQGVRQNSLIGYGLVVGLDGTGDQTTQTPFTTQTLN 63
F + L A RI+D+ S+Q R N LIGYGLVVGL GTGD +PFT Q++
Sbjct: 13 FSALPFLSTPPAQADTSRIKDIASLQAGRDNQLIGYGLVVGLQGTGDSLRSSPFTEQSMR 72

Query: 64 NMLSQLGITVPTGTNMQLKNVAAVMVTASLPPFGRQGQTIDVVVSSMGNAKSLRGGTLLM 123
ML LGIT G + KN+AAVMVTA+LPPF G +DV VSS+G+A SLRGG L+M
Sbjct: 73 AMLQNLGITTQGGQS-NAKNIAAVMVTANLPPFASPGSRVDVTVSSLGDATSLRGGNLIM 131

Query: 124 TPLKGVDSQVYALAQGNILVGGAGASAGGSSVQVNQLNGGRITNGAVIERELPSQFGVGN 183
T L G D Q+YA+AQG ++V G A +++ R+ NGA+IERELPS+F
Sbjct: 132 TSLSGADGQIYAVAQGALIVNGFSAQGDAATLTQGVTTSARVPNGAIIERELPSKFKDSV 191

Query: 184 TLNLQLNDEDFSMAQQIADTINRVR----GYGSATALDARTIQVRVPSGNSSQVRFLADI 239
L LQL + DFS A ++AD +N G A D++ I V+ P + R +A+I
Sbjct: 192 NLVLQLRNPDFSTAVRVADVVNAFARARYGDPIAEPRDSQEIAVQKPRV-ADLTRLMAEI 250

Query: 240 QNMQVNVTPQDAKVVINSRTGSVVMNREVTLDSCAIAQGNLSVTVNRQANVSQPDTPFGG 299
+N+ V T AKVVIN RTG++V+ +V + A++ G L+V V V QP PF
Sbjct: 251 ENLTVE-TDTPAKVVINERTGTIVIGADVRISRVAVSYGTLTVQVTESPQVIQP-APFSR 308

Query: 300 GQTVVTPQTQIDLRQSGGSLQSVRSSASLNNVVRALNALGATPMDLMSILQSMQSAGCLR 359
GQT V PQT I Q G + ++ L +V LN++G +++ILQ ++SAG L+
Sbjct: 309 GQTAVQPQTDIMAMQEGSKV-AIVEGPDLRTLVAGLNSIGLKADGIIAILQGIKSAGALQ 367

Query: 360 AKL 362
A+L
Sbjct: 368 AEL 370


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z1719     FLGFLGJ  508    0.0      Flagellar protein FlgJ signature.
		>FLGFLGJ#Flagellar protein FlgJ signature.

Length = 313

Score = 508 bits (1308), Expect = 0.0
Identities = 311/313 (99%), Positives = 311/313 (99%)

Query: 1 MISDSKLLASAAWDAQSLNELKAKASEDPAANIRPVARQVEGMFVQMMLKSMRDALPKDG 60
MISDSKLLASAAWDAQSLNELKAKA EDPAANIRPVARQVEGMFVQMMLKSMRDALPKDG
Sbjct: 1 MISDSKLLASAAWDAQSLNELKAKAGEDPAANIRPVARQVEGMFVQMMLKSMRDALPKDG 60

Query: 61 LFSSEHTRLYTSMYDQQIAQQMTTGKGLGLAEMMVKQMTPEQPLPEESTPAAPMKFPLET 120
LFSSEHTRLYTSMYDQQIAQQMT GKGLGLAEMMVKQMTPEQPLPEESTPAAPMKFPLET
Sbjct: 61 LFSSEHTRLYTSMYDQQIAQQMTAGKGLGLAEMMVKQMTPEQPLPEESTPAAPMKFPLET 120

Query: 121 VVRYQNQALSQLVQKAVPRNYDDSLPGDSKAFLAQLSLPAQLASQQSGVPHHLILAQAAL 180
VVRYQNQALSQLVQKAVPRNYDDSLPGDSKAFLAQLSLPAQLASQQSGVPHHLILAQAAL
Sbjct: 121 VVRYQNQALSQLVQKAVPRNYDDSLPGDSKAFLAQLSLPAQLASQQSGVPHHLILAQAAL 180

Query: 181 ESGWGQRQIRRENGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKKVKAKFRVYSSYL 240
ESGWGQRQIRRENGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKKVKAKFRVYSSYL
Sbjct: 181 ESGWGQRQIRRENGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKKVKAKFRVYSSYL 240

Query: 241 EALSDYVGLLTRNPRYAAVTTAASAEQGAQALQDAGYATDPHYARKLTNMIQQMKSISDK 300
EALSDYVGLLTRNPRYAAVTTAASAEQGAQALQDAGYATDPHYARKLTNMIQQMKSISDK
Sbjct: 241 EALSDYVGLLTRNPRYAAVTTAASAEQGAQALQDAGYATDPHYARKLTNMIQQMKSISDK 300

Query: 301 VSKTYSMNIDNLF 313
VSKTYSMNIDNLF
Sbjct: 301 VSKTYSMNIDNLF 313


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
Z1720     FLGHOOKAP1  677    0.0      Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 677 bits (1747), Expect = 0.0
Identities = 541/546 (99%), Positives = 543/546 (99%)

Query: 2 SSLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVS 61
SSLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVS
Sbjct: 1 SSLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVS 60

Query: 62 GVQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTLV 121
GVQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTLV
Sbjct: 61 GVQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTLV 120

Query: 122 SNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLND 181
SNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLND
Sbjct: 121 SNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLND 180

Query: 182 QISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGSTA 241
QISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGSTA
Sbjct: 181 QISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGSTA 240

Query: 242 RQLAAVPSSADPSRTTVAYVDGTAGNIEIPEKLLNTGSLGGILTFRSQDLDQTRNTLGQL 301
RQLAAVPSSADPSRTTVAYVDGTAGNIEIPEKLLNTGSLGGILTFRSQDLDQTRNTLGQL
Sbjct: 241 RQLAAVPSSADPSRTTVAYVDGTAGNIEIPEKLLNTGSLGGILTFRSQDLDQTRNTLGQL 300

Query: 302 ALAFAEAFNSQHKAGFDANGDEGEDFFAIGKPAVLQNTKNNGNVAIGATVTDASAVLATD 361
ALAFAEAFN+QHKAGFDANGD GEDFFAIGKPAVLQNTKN G+VAIGATVTDASAVLATD
Sbjct: 301 ALAFAEAFNTQHKAGFDANGDAGEDFFAIGKPAVLQNTKNKGDVAIGATVTDASAVLATD 360

Query: 362 YKISFDNNQWQVTRLASNTTFTVTPDANGKVAFDGLELTFTGTPAVNDSFTLKPVSDAIV 421
YKISFDNNQWQVTRLASNTTFTVTPDANGKVAFDGLELTFTGTPAVNDSFTLKPVSDAIV
Sbjct: 361 YKISFDNNQWQVTRLASNTTFTVTPDANGKVAFDGLELTFTGTPAVNDSFTLKPVSDAIV 420

Query: 422 NMDVLITDEAKIAMASEEDAGDSDNRNGQALLDLQSNSKTVGGAKSFNDAYASLVSDIGN 481
NMDVLITDEAKIAMASEEDAGDSDNRNGQALLDLQSNSKTVGGAKSFNDAYASLVSDIGN
Sbjct: 421 NMDVLITDEAKIAMASEEDAGDSDNRNGQALLDLQSNSKTVGGAKSFNDAYASLVSDIGN 480

Query: 482 KTATLKTSSTTQGNVVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFD 541
KTATLKTSS TQGNVVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFD
Sbjct: 481 KTATLKTSSATQGNVVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFD 540

Query: 542 ALINIR 547
ALINIR
Sbjct: 541 ALINIR 546


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
Z1721     FLAGELLIN  46     1e-07    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 45.8 bits (108), Expect = 1e-07
Identities = 41/226 (18%), Positives = 81/226 (35%), Gaps = 9/226 (3%)

Query: 7 MMYQQNMRGITNSQAEWMKYGEQMSTGKRVVNPSDDPIAASQAVVLSQAQAQNSQYTLAR 66
++ Q N+ +S + + E++S+G R+ + DD + A + +Q +
Sbjct: 11 LLTQNNLNKSQSSLSSAI---ERLSSGLRINSAKDDAAGQAIANRFTSNIKGLTQASRNA 67

Query: 67 TFATQKVSLEESVLSQVTTAIQNAQEKIVYASNGTLSDNDRASLATDIQGLRDQLLNLAN 126
E L+++ +Q +E V A+NGT SD+D S+ +IQ +++ ++N
Sbjct: 68 NDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEEIDRVSN 127

Query: 127 TTDGNGRYIFAGYKTETAPFSEVNGDYVGGTESIKQQVDASRSMVIGHTGDKIFDSITSN 186
T NG + + +G E+I + +G G + +
Sbjct: 128 QTQFNGVKVLSQDNQMKIQVGANDG------ETITIDLQKIDVKSLGLDGFNVNGPKEAT 181

Query: 187 AVAEPDGSASETNLFAMLDSAIAALKTPVADSEADKETAAAALDKT 232
+ T A + + TA DK
Sbjct: 182 VGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKV 227


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
Z1722     IGASERPTASE  64     3e-12    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 64.3 bits (156), Expect = 3e-12
Identities = 47/288 (16%), Positives = 84/288 (29%), Gaps = 36/288 (12%)

Query: 513 PSEEEFAERKRPEQPALATFAMPDVPPAPT-PAEPAAPVVAPAPKAAPATPATPAQPGLL 571
P E+ + DVP P+ E A AP P APATP+
Sbjct: 983 PEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETT----- 1037

Query: 572 SRFFGALKALFSGGEETKPTEQPAPKAEAKPERQQDRRKPRQNNRRDRNERRDTRSER-- 629
ET + Q QN + + + ++
Sbjct: 1038 ---------------ETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQT 1082

Query: 630 TEGSDNREENRRNRRQAQQQTAETREGRQQAEVTEKARTADEQQAPRRERSRRRNDDKRQ 689
E + + E + + ++TA + + TEK + + + + + + Q
Sbjct: 1083 NEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQ 1142

Query: 690 AQ---QEAKALNVEEQSVQETEQEERVRPVQPRRKQRQLNQKVRYEQSV--AEETVVAPV 744
A+ + +N++E Q + +P + + Q V +V V P
Sbjct: 1143 AEPARENDPTVNIKEPQSQTNTTADTEQPA--KETSSNVEQPVTESTTVNTGNSVVENPE 1200

Query: 745 AEETVAAEPIVQEAPA------PRTELVKVPLPVVAQTAPEQQEENNA 786
+P V + R + VP V T A
Sbjct: 1201 NTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVA 1248



Score = 63.5 bits (154), Expect = 4e-12
Identities = 46/261 (17%), Positives = 81/261 (31%), Gaps = 26/261 (9%)

Query: 551 VAPAPKAAPATPATPAQPGLLSRFFGALKALFSGGEETKPTEQP-APKAEAKPERQQDRR 609
P + S E + E P P A A P
Sbjct: 991 TVDTTNITTPNNIQADVPSVPSN----------NEEIARVDEAPVPPPAPATPSETT--- 1037

Query: 610 KPRQNNRRDRNERRDTRSERTEGSDNREENRRNRRQAQQQTAETREGRQQAEV------T 663
N ++++++ D E +NR A++ + + Q EV T
Sbjct: 1038 -----ETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSET 1092

Query: 664 EKARTADEQQAPRRERSRRRNDDKRQAQQEAKALNVEEQSVQETEQEERVRPVQPRRKQR 723
++ +T + ++ E+ + + + Q+ K + + QE + + + R
Sbjct: 1093 KETQTTETKETATVEKEEKAKVETEKTQEVPK-VTSQVSPKQEQSETVQPQAEPARENDP 1151

Query: 724 QLNQKVRYEQSVAEETVVAPVAEETVAAEPIVQEAPAPRTELVKVPLPVVAQTAPEQQEE 783
+N K Q+ P E + E V E+ T V P A Q
Sbjct: 1152 TVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTV 1211

Query: 784 NNADNRDNGGMPRRSRRSPRH 804
N+ + RRS RS H
Sbjct: 1212 NSESSNKPKNRHRRSVRSVPH 1232


89  Z1910  Z1918  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
Z1910     4       31       5.791003   hypothetical protein
Z1912     3       29       5.658131   hypothetical protein
Z1913     2       24       4.872824   tail component of prophage CP-933X
Z1914     2       22       4.751123   minor tail fiber protein of prophage CP-933X
Z1915     2       21       4.117936   tail protein (partial) of prophage CP-933X
Z1916     1       19       2.196737   tail protein (partial) of prophage CP-933X
Z1917     3       24       0.166111   prophage protein
Z1918     4       28       -1.310638  membrane protein of prophage CP-933X
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
Z1910     TACYTOLYSIN  25     0.037    Bacterial thiol-activated pore-forming cytolysin sig...
		>TACYTOLYSIN#Bacterial thiol-activated pore-forming cytolysin signature.

Length = 574

Score = 24.9 bits (54), Expect = 0.037
Identities = 10/39 (25%), Positives = 20/39 (51%)

Query: 11 ARLSGFRHKTVKVPEWRNVSVVLREPSAEAWYLWQEVLN 49
++ S F RN+ ++ RE + AW W++V++
Sbjct: 508 SKTSPFSTVIPLGANSRNIRIMARECTGLAWEWWRKVID 546


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z1913     CHANLCOLICIN  32     0.015    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 32.0 bits (72), Expect = 0.015
Identities = 45/240 (18%), Positives = 93/240 (38%), Gaps = 24/240 (10%)

Query: 356 RAQQAVAAARGTEMQIAAEARLAATQERLN-------RNIAARSAAQNALNSTTAVGSRL 408
+A+QA A E Q A+A A +RL R+ A+R+ + L +
Sbjct: 66 QAEQAARAKAAAEAQAKAKANRDALTQRLKDIVNEALRHNASRTPSATELAHANNAAMQA 125

Query: 409 MSGALGLVGGVPGLVMLGAAAWYTLYQNQEQARESARQYALTIDEIAHKTPSMSLPEASD 468
L L AA + +++ +E R+ A T ++ + ++
Sbjct: 126 EDERLRLAKAEEKARKEAEAAEKAFQEAEQRRKEIEREKAETERQL----------KLAE 175

Query: 469 NEGRTRAALTEQNRLID-------EQASRVKSLQEKAQSIQDVLAGLEDRRVALIRQQAA 521
E + AAL+E+ + ++ S V + + +++ L+ R A ++ A
Sbjct: 176 AEEKRLAALSEEAKAVEIAQKKLSAAQSEVVKMDGEIKTLNSRLSSSIHARDAEMKTLAG 235

Query: 522 EQNKVYQSMLVMNGQHTEFNRLLGLGNELLQQRQGLVNVPLRLPQATLDDKQQSALTKTE 581
++N++ Q+ +L N+ LQ R R+ + +++Q +T +E
Sbjct: 236 KRNELAQASAKYKELDELVKKLSPRANDPLQNRPFFEATRRRVGAGKIREEKQKQVTASE 295


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z1917     ENTEROVIROMP  83     1e-22    Enterobacterial virulence outer membrane protein si...
		>ENTEROVIROMP#Enterobacterial virulence outer membrane protein signature.

Length = 171

Score = 82.6 bits (204), Expect = 1e-22
Identities = 39/134 (29%), Positives = 67/134 (50%), Gaps = 12/134 (8%)

Query: 7 VILSAVVWQVAAATPASAAEHQSTLSAGYLHASTNVPG-SDDLNGINVKYRYEFMDA-LG 64
+ + + V A T ++ ST++ GY A ++ G + + G N+KYRYE ++ LG
Sbjct: 4 IACLSALAAVLAFTAGTSVAATSTVTGGY--AQSDAQGQMNKMGGFNLKYRYEEDNSPLG 61

Query: 65 LITSFSYANAEDEQKTRYSDTRWHEDSVRNRWFSVMAGPSVRVNEWFSAYAMAGVAYSRV 124
+I SF+Y T S T D +N+++ + AGP+ R+N+W S Y + GV Y +
Sbjct: 62 VIGSFTY--------TEKSRTASSGDYNKNQYYGITAGPAYRINDWASIYGVVGVGYGKF 113

Query: 125 STFSGDYLRVTDNK 138
T + +
Sbjct: 114 QTTEYPTYKHDTSD 127


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z1918     CHANLCOLICIN  45     2e-06    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 44.7 bits (105), Expect = 2e-06
Identities = 54/319 (16%), Positives = 118/319 (36%)

Query: 154 ARAASTSAGQAASSAQSASSSAGTASTKATEASKSAAAAESSKSAAATSAGAAKTSETNA 213
+ S S AA A + S+A T+A +A+++ AAAE+ A A + +
Sbjct: 39 GKGGSKSESSAAIHATAKWSTAQLKKTQAEQAARAKAAAEAQAKAKANRDALTQRLKDIV 98

Query: 214 AVSQQSAATSASTATTKASEAASSARDASASKEAAKSSETSAASSASSAASSATAAGNSA 273
+ + A+ +AT A ++ + AK+ E + + ++ + A
Sbjct: 99 NEALRHNASRTPSATELAHANNAAMQAEDERLRLAKAEEKARKEAEAAEKAFQEAEQRRK 158

Query: 274 KAAKTSETNAKSSETAAEQSASAAAGSKTAAALSASAASTSAGQASASATAAGKSAESAA 333
+ + + + A + AA S+ A A+ + SA Q+ ++
Sbjct: 159 EIEREKAETERQLKLAEAEEKRLAALSEEAKAVEIAQKKLSAAQSEVVKMDGEIKTLNSR 218

Query: 334 SSASTATTKAGEATEQASAAASSASAAKTSETNAKASETSAESSKTAAASSASSAASSAS 393
S+S A T + ++AK E + + S ++ A
Sbjct: 219 LSSSIHARDAEMKTLAGKRNELAQASAKYKELDELVKKLSPRANDPLQNRPFFEATRRRV 278

Query: 394 SASASKDEATRQASAAKSSATTASTKATEAAGSATAAAQSKSTAESAATRAETAAKRAED 453
A ++E +Q +A+++ + T+ + + + +++ + AE K+A++
Sbjct: 279 GAGKIREEKQKQVTASETRINRINADITQIQKAISQVSNNRNAGIARVHEAEENLKKAQN 338

Query: 454 IASAVALEDASTTKKGIVQ 472
++DA Q
Sbjct: 339 NLLNSQIKDAVDATVSFYQ 357



Score = 31.6 bits (71), Expect = 0.019
Identities = 58/332 (17%), Positives = 111/332 (33%), Gaps = 22/332 (6%)

Query: 315 AGQASASATAAGKSAESAASSA----STATTKAGEATEQASAAASSASAAKTSETNAKAS 370
+G KS SAA A STA K +A + A A A++ + AK +
Sbjct: 32 SGSGGGGGKGGSKSESSAAIHATAKWSTAQLKKTQAEQAARAKAAAEAQAKAKANRDALT 91

Query: 371 E---------TSAESSKTAAASSASSAASSASSASASKDEATRQASAAKSSATTASTKAT 421
+ +S+T +A+ + A ++A A + + A+ A A
Sbjct: 92 QRLKDIVNEALRHNASRTPSATELAHANNAAMQAEDERLRLAKAEEKARKEAEAAEKAFQ 151

Query: 422 EAAGSATAAAQSKSTAESAATRAETAAKRAEDIASA-----VALEDASTTKKGIVQLSSA 476
EA + K+ E AE KR ++ +A + S + +V++
Sbjct: 152 EAEQRRKEIEREKAETERQLKLAEAEEKRLAALSEEAKAVEIAQKKLSAAQSEVVKMDGE 211

Query: 477 TNSTSESLAATPKAVKAAYELANGKYTAQDATTAQKGIVQLSNATNSTSEMLAATPKSVK 536
+ + L+++ A A + GK +A+ + + A P +
Sbjct: 212 IKTLNSRLSSSIHARDAEMKTLAGKRNELAQASAK---YKELDELVKKLSPRANDPLQNR 268

Query: 537 AAYDLANGKYTAQDAT-TAQKGIVQLSSATNSASETLAATPKAVKAANDNANGRVPSARK 595
++ + A QK + + N + + KA+ ++N N + +
Sbjct: 269 PFFEATRRRVGAGKIREEKQKQVTASETRINRINADITQIQKAISQVSNNRNAGIARVHE 328

Query: 596 VNGKALSSDITLTPKDIGTLNSTTMSFSGGAG 627
+ L I T+SF
Sbjct: 329 AEENLKKAQNNLLNSQIKDAVDATVSFYQTLT 360


90  Z1964  Z1977  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
Z1964     -1      17       0.333581   iron ABC transporter ATP-binding protein
Z1965     -2      16       0.216529   iron ABC transporter permease
Z1966     -2      15       0.027138   hypothetical protein
Z1967     -2      15       -0.777431  hypothetical protein
Z1968     -2      15       -0.259262  trehalase, periplasmic
Z1969     -2      16       -0.516965  dihydroxyacetone kinase subunit DhaM
Z1970     -1      16       -0.297507  dihydroxyacetone kinase subunit DhaL
Z1971     -1      15       0.454170   dihydroxyacetone kinase subunit DhaK
Z1972     -1      19       0.349119   adhesion protein
Z1974     -2      18       1.285043   GTP-dependent nucleic acid-binding protein EngD
Z1975     -2      15       1.528346   peptidyl-tRNA hydrolase
Z1976     -2      14       2.021260   hypothetical protein
Z1977     -2      16       1.938828   sulfate transporter YchM
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
Z1964     LCRVANTIGEN  30     0.010    Low calcium response V antigen signature.
		>LCRVANTIGEN#Low calcium response V antigen signature.

Length = 326

Score = 29.7 bits (66), Expect = 0.010
Identities = 19/63 (30%), Positives = 28/63 (44%), Gaps = 7/63 (11%)

Query: 193 LMSTHHPLHANAIADSIIQVEPDGRVTQGLPTEQLTTNKLAAL------YRVSADQIHHH 246
+ H L A+ I D I++V D G +L +LA L Y V +I+ H
Sbjct: 119 MAVMHFSLTADRIDDDILKVIVDSMNHHGDARSKL-REELAELTAELKIYSVIQAEINKH 177

Query: 247 LSA 249
LS+
Sbjct: 178 LSS 180


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
Z1966     BACINVASINB  29     0.044    Salmonella/Shigella invasin protein B signature.
		>BACINVASINB#Salmonella/Shigella invasin protein B signature.

Length = 593

Score = 28.6 bits (63), Expect = 0.044
Identities = 21/51 (41%), Positives = 29/51 (56%), Gaps = 2/51 (3%)

Query: 55 LLLAVAPEKMVGFSSFDFARQALIPLSEHIRQLPRLGRLAGRASTLSLEGL 105
L + VA E + + F +QAL P+ EH+ L L L G+A T +LEGL
Sbjct: 348 LAVMVADEIVKAATGVSFIQQALNPIMEHV--LKPLMELIGKAITKALEGL 396


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z1969     PHPHTRNFRASE  141    1e-38    Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase signature.

Length = 572

Score = 141 bits (357), Expect = 1e-38
Identities = 63/206 (30%), Positives = 102/206 (49%), Gaps = 1/206 (0%)

Query: 259 GKAFYYQPVLCTVQAKSPLTVEEEQERLRQAIDFTLLDLMTLTAKAEASGLDDIAAIFSG 318
KAF + ++ S V E E+L A++ + +L + + EAS D A IF+
Sbjct: 17 AKAFIHLEPNVDIEKTSITDVSTEIEKLTAALEKSKEELRAIKDQTEASMGADKAEIFAA 76

Query: 319 HHTLLDDPELLAAASELLQHEHCTAEYAWQQVLKELSQQYQQLDDEYLQARYIDVDDLLH 378
H +LDDPEL+ +++E AEYA ++V ++ +D+EY++ R D+ D+
Sbjct: 77 HLLVLDDPELVDGIKGKIENEQMNAEYALKEVSDMFVSMFESMDNEYMKERAADIRDVSK 136

Query: 379 RTLVHLT-QTKEELPQFNSPTILLAENIYPSPVLQLDPAVVKGICLSAGSPVSHSALIAR 437
R L HL L T+++AE++ PS QL+ VKG G SHSA+++R
Sbjct: 137 RVLGHLIGVETGSLATIAEETVIIAEDLTPSDTAQLNKQFVKGFATDIGGRTSHSAIMSR 196

Query: 438 ELGIGWICQQGEKLYAIQPEEKLTLD 463
L I + E IQ + + +D
Sbjct: 197 SLEIPAVVGTKEVTEKIQHGDMVIVD 222


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
Z1970     adhesinmafb  32     0.002    Neisseria meningitidis: adhesin MafB signature.
		>adhesinmafb#Neisseria meningitidis: adhesin MafB signature.

Length = 467

Score = 31.6 bits (71), Expect = 0.002
Identities = 10/47 (21%), Positives = 25/47 (53%)

Query: 138 VESLRQSSEQNLSVPAALEAASSIAEFAAQSTITMQARKGRASYLGE 184
E++ + ++N + +EA ++A A + + A+ G+A+ G+
Sbjct: 293 REAVDRWIQENPNAAETVEAVFNVAAAAKVAKLAKAAKPGKAAVSGD 339


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z1972     PRTACTNFAMLY  44     2e-06    Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 43.5 bits (102), Expect = 2e-06
Identities = 117/548 (21%), Positives = 198/548 (36%), Gaps = 92/548 (16%)

Query: 14 RLAELKIRSPSIQLIKFGAIGLNAILFSPLLIAADTGSQYGTNITINDGDRI---TGDTA 70
+ A L+ + ++ L GA ++ I Q+G +I +D + +G T
Sbjct: 10 KAAPLRRTTLAMALGALGAAPAAHADWNNQSIVKTGERQHGIHIQGSDPGGVRTASGTTI 69

Query: 71 DPSGN-LYGVMTPAGNTPGNINLGNDVTVN---VNDASGYAKGIIIQGKNSSLTANRLTV 126
SG G++ N + N + ++D + K L A+ T+
Sbjct: 70 KVSGRQAQGILLE--NPAAELQFRNGSVTSSGQLSDDGIRRFLGTVTVKAGKLVADHATL 127

Query: 127 DVVGQT---SAIGINLIGDYTHADLGTGSTIKSNDDGIIIGHSSTLTATQFTIENSNGIG 183
VG T I + + G+ A + ST++ G+ I + +T + I + G+
Sbjct: 128 ANVGDTWDDDGIALYVAGEQAQASIAD-STLQGAG-GVQIERGANVTVQRSAIVD-GGLH 184

Query: 184 LTINDYGTSVDLGSGSKIKTDGS-TGVYIGGLNGNNANGAARFTATDLTID---VQGYSA 239
+ DL + D + T V G + A++LT+D + G A
Sbjct: 185 IGALQSLQPEDLPPSRVVLRDTNVTAVPASGAPA----AVSVLGASELTLDGGHITGGRA 240

Query: 240 MGINVQKNSVVDLGTNSSIKTSGDNAHGLWSFGQVSANAL-------TVDVTGAAANGVE 292
G+ + +VV L ++I+ A G G V A+ GV+
Sbjct: 241 AGVAAMQGAVVHL-QRATIRRGDAPAGGAVPGGAVPGGAVPGGFGPGGFGPVLDGWYGVD 299

Query: 293 VRGGTTTIGADSHISSAQGGGLVTSGSDATINFSG---TAAQRNSIFSGGSYGASAQTAT 349
V G + + A S + + + G + G A + SG +A N I +GG+ + Q A
Sbjct: 300 VSGSSVEL-AQSIVEAPELGAAIRVGRGARVTVSGGSLSAPHGNVIETGGARRFAPQAAP 358

Query: 350 AVINMQNTDITVDRNGSLALGLWALSGGRITGDSLAITGAAGARGIYAMTNSQIDLTSDL 409
I +Q G+ A G L L +TG A A+G T + +
Sbjct: 359 LSITLQA--------GAHAQGKALLYRVLPEPVKLTLTGGADAQGDIVATELPSIPGTSI 410

Query: 410 VIDMSTPDQMAIATQHDDGYAASRINASGRMLINGSVLSKGGLINLDMHPGSVWTGSSLS 469
P +A+A+ + WTG++
Sbjct: 411 -----GPLDVALAS------------------------------------QARWTGAT-- 427

Query: 470 DNVNGGKLDVAMNNSVWNVTSNSNLDTLAL-SHSTVDFASHGSTAGTFTTLNVENLSGNS 528
V+ +D N+ W +T NSN+ L L S +VDF + AG F L V L+G+
Sbjct: 428 RAVDSLSID----NATWVMTDNSNVGALRLASDGSVDFQQ-PAEAGRFKVLTVNTLAGSG 482

Query: 529 TFIMRADV 536
F M
Sbjct: 483 LFRMNVFA 490


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
Z1977     RTXTOXINA  33     0.003    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 33.0 bits (75), Expect = 0.003
Identities = 24/81 (29%), Positives = 37/81 (45%), Gaps = 16/81 (19%)

Query: 279 LGAIESLLCAV----VL---DGMTGTKHKANSELVGQGLGNI---IAPFF------GGIT 322
L + +L A+ +L D T TK A EL + LGN+ I+ + G++
Sbjct: 242 LDTVSGILSAISASFILSNADADTRTKAAAGVELTTKVLGNVGKGISQYIIAQRAAQGLS 301

Query: 323 ATAAIARSAANVRAGATSPIS 343
+AA A A+ A SP+S
Sbjct: 302 TSAAAAGLIASAVTLAISPLS 322


91  Z1995  Z2000  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z1995     -2      17       2.188679    hypothetical protein
Z1996     -2      21       2.610839    transcriptional regulator NarL
Z1998     -1      22       2.827762    nitrate/nitrite sensor protein NarX
Z2000     -2      25       2.404069    nitrite extrusion protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z1995     INTIMIN  258    8e-80    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 258 bits (660), Expect = 8e-80
Identities = 120/378 (31%), Positives = 196/378 (51%), Gaps = 21/378 (5%)

Query: 32 GEQAKAFALGKVRDALSQQVNQHVESWLSPWGNASVDVKVDNEGHFTGSRGSWFVPLQDN 91
G+ AK ALG + Q + +++WL +G A V+++ N F GS + +P D+
Sbjct: 184 GDYAKDTALGIAGN----QASSQLQAWLQHYGTAEVNLQSGNN--FDGSSLDFLLPFYDS 237

Query: 92 DRYLTWSQLGLTQQDDGLVSNVGVGQRWARGNWLVGYNTFYDNLLDENLQRAGFGAEAWG 151
++ L + Q+G D +N+G GQR+ ++GYN F D + R G G E W
Sbjct: 238 EKMLAFGQVGARYIDSRFTANLGAGQRFFLPENMLGYNVFIDQDFSGDNTRLGIGGEYWR 297

Query: 152 EYLRLSANFYQPFAAWHE--QTATQEQRMARGYDLTARMRMPFYQHLNTSVSVEQYFGDR 209
+Y + S N Y + WHE ++R A G+D+ +P Y L + EQY+GD
Sbjct: 298 DYFKSSVNGYFRMSGWHESYNKKDYDERPANGFDIRFNGYLPSYPALGAKLMYEQYYGDN 357

Query: 210 VDLFNSGTGYHNPVALSLGLNYTPVPLVTVTAQHKQGESGENQNNLGLNLNYRFGVPLKK 269
V LFNS NP A ++G+NYTP+PLVT+ ++ G EN + Y+F P +
Sbjct: 358 VALFNSDKLQSNPGAATVGVNYTPIPLVTMGIDYRHGTGNENDLLYSMQFRYQFDKPWSQ 417

Query: 270 QLSAGEVAESQSLRGSRYDNPQRNNLPTLEYRQRKTLTVFLATPPWDLKPGETVPLKLQI 329
Q+ V E ++L GSRYD QRNN LEY+++ L++ + + T ++L +
Sbjct: 418 QIEPQYVNELRTLSGSRYDLVQRNNNIILEYKKQDILSLNI-PHDINGTERSTQKIQLIV 476

Query: 330 RSRYGIRQLIWQGDTQILS-----LTPGAQANSAEGWTLIMPDWQNGEGASNHWRLSVVV 384
+S+YG+ +++W D+ + S G+Q SA+ + I+P + +G SN ++++
Sbjct: 477 KSKYGLDRIVWD-DSALRSQGGQIQHSGSQ--SAQDYQAILPAYV--QGGSNVYKVTARA 531

Query: 385 EDNQGQRVSSNEITLTLV 402
D G SSN + LT+
Sbjct: 532 YDRNGN--SSNNVLLTIT 547


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
Z1996     HTHFIS  74     2e-17    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 73.7 bits (181), Expect = 2e-17
Identities = 32/117 (27%), Positives = 56/117 (47%), Gaps = 2/117 (1%)

Query: 7 ATILLIDDHPMLRTGVKQLISMAPDITVVGEASNGEQGIELAESLDPDLILLDLNMPGMN 66
ATIL+ DD +RT + Q +S A + SN + D DL++ D+ MP N
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRI--TSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 67 GLETLDKLREKSLSGRIVVFSVSNHEEDVVTALKRGADGYLLKDMEPEDLLKALHQA 123
+ L ++++ ++V S N + A ++GA YL K + +L+ + +A
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRA 118


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z1998     PF06580  53     1e-09    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 53.3 bits (128), Expect = 1e-09
Identities = 36/172 (20%), Positives = 73/172 (42%), Gaps = 23/172 (13%)

Query: 424 PESSRELLSQIRNELNASWAQLRELLTTFRLQLTEPGLRPALEASCEEYSAKFGFPVKLD 483
P +RE+L+ + + S + +LT +++ + S +F ++ +
Sbjct: 190 PTKAREMLTSLSELMRYSLRYSNARQVSLADELT------VVDSYLQLASIQFEDRLQFE 243

Query: 484 YQLPPRL----VPSHQAIHLLQIAREALSNALKH-----SQASEVVVTVAQNDNQVKLTV 534
Q+ P + VP L+Q E N +KH Q ++++ +++ V L V
Sbjct: 244 NQINPAIMDVQVPPM----LVQTLVE---NGIKHGIAQLPQGGKILLKGTKDNGTVTLEV 296

Query: 535 QDNGCGVPENAIRSNHYGMIIMRDRAQSLRG-DCRVRRRESGGTEVVVTFIP 585
++ G +N S G+ +R+R Q L G + +++ E G + IP
Sbjct: 297 ENTGSLALKNTKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z2000     TCRTETB  30     0.015    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 30.2 bits (68), Expect = 0.015
Identities = 17/58 (29%), Positives = 29/58 (50%), Gaps = 1/58 (1%)

Query: 128 TPYSVFIIISLLCGFAGANF-ASSMANISFFFPKQKQGGALGLNGGLGNMGVSVMQLV 184
+ +S+ I+ + G A F A M ++ + PK+ +G A GL G + MG V +
Sbjct: 101 SFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAI 158


92  Z2499  Z2516  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z2499     -1      18       -0.097176   peptide ABC transporter ATP-binding protein
Z2500     -1      19       0.348327    peptide ABC transporter ATP-binding protein
Z2501     0       14       0.237337    membrane transport protein
Z2503     0       13       0.263045    membrane transport protein
Z2504     0       13       0.312880    outer membrane channel protein
Z2506     -1      13       -0.231036   outer membrane channel protein
Z2507     -2      13       -0.459266   hypothetical protein
Z2508     -3      14       -0.543563   efflux pump protein
Z2509     -1      17       -0.265913   efflux pump protein
Z2510     -1      16       -1.085107   transcriptional repressor
Z2511     -1      17       -0.982376   hypothetical protein
Z2512     -1      17       -0.810605   enoyl-ACP reductase
Z2513     -1      17       -1.037754   oxidoreductase
Z2514     0       17       -1.247496   exoribonuclease II
Z2516     1       18       -1.830215   RNase II stability modulator
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
Z2499     HTHFIS  31     0.007    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 31.0 bits (70), Expect = 0.007
Identities = 9/16 (56%), Positives = 14/16 (87%)

Query: 38 LVGESGSGKSLIAKAI 53
+ GESG+GK L+A+A+
Sbjct: 165 ITGESGTGKELVARAL 180


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z2503     TCRTETB  62     2e-13    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 62.2 bits (151), Expect = 2e-13
Identities = 41/192 (21%), Positives = 82/192 (42%), Gaps = 19/192 (9%)

Query: 6 LSWALILGLLAGIGPMCTDLYLPALPEMSEQLAATTTITQLTLTASLIGLGVGQLLFGPL 65
L W IL + + +LP+++ T TA ++ +G ++G L
Sbjct: 16 LIWLCILSFF---SVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKL 72

Query: 66 SDKIGRKRPLILSLLLFIVSSILCATTNNIY-WLVVWRFIQGIAGAG-----GSVLSRSI 119
SD++G KR L+ +++ S++ ++ + L++ RFIQG A V++R I
Sbjct: 73 SDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYI 132

Query: 120 ARDKYQGVTLTQFFALLMTVNGLAPVLSPVLGGYIVSTFDWRTLFWVMAEISTVLLLGCV 179
++ F L+ ++ + + P +GG I W L + ++ + V
Sbjct: 133 PKENRGKA-----FGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIP-----MITIITV 182

Query: 180 LFINETLPENKR 191
F+ + L + R
Sbjct: 183 PFLMKLLKKEVR 194


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
Z2506     RTXTOXIND  32     0.002    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 32.5 bits (74), Expect = 0.002
Identities = 26/166 (15%), Positives = 49/166 (29%), Gaps = 11/166 (6%)

Query: 70 DVQKAIADIDSARALYGQTNASLFPTVNAALSSTRSRSLANGTETTAEADGTVSSFTLDL 129
A AD ++ Q +RS L E + + + +
Sbjct: 128 TALGAEADTLKTQSSLLQARL----EQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEE 183

Query: 130 FGRNQSLSRAARETWLASEFTAQNTRLTLIAEISTAWLTLAADNSNLALAKETMTSAENS 189
R SL + TW ++ + AE T + + + K
Sbjct: 184 VLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKS-------R 236

Query: 190 LKIIQRQQQVGTAAATDVSEAMSVYQQARASVASYQTQVMQDKNAL 235
L A V E + Y +A + Y++Q+ Q ++ +
Sbjct: 237 LDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEI 282


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z2507     ACRIFLAVINRP  62     1e-15    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 62.2 bits (151), Expect = 1e-15
Identities = 27/49 (55%), Positives = 42/49 (85%)

Query: 1 MPLAIATGSGANSRIAIGTGIIGGTLTATLLAIFFVPLFFVLVKRLFAG 49
+PLAI+ G+G+ ++ A+G G++GG ++ATLLAIFFVP+FFV+++R F G
Sbjct: 986 LPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIFFVPVFFVVIRRCFKG 1034


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z2508     ACRIFLAVINRP  1105   0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 1105 bits (2860), Expect = 0.0
Identities = 550/983 (55%), Positives = 713/983 (72%), Gaps = 8/983 (0%)

Query: 3 SRFFVRRPVFAWVIAILIMLAGILAIRTLPVAQYPDVAPPTIKISATYTGASAETLENSV 62
+ FF+RRP+FAWV+AI++M+AG LAI LPVAQYP +APP + +SA Y GA A+T++++V
Sbjct: 2 ANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTV 61

Query: 63 TQVIEQQLTGLDNLLYFSSTSSSDGSVSINVTFEQGTDPDTAQ--VQNKIQQAESRLPSE 120
TQVIEQ + G+DNL+Y SSTS S GSV+I +TF+ GTDPD AQ VQNK+Q A LP E
Sbjct: 62 TQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQE 121

Query: 121 VQQTGVTVEKSQSNFLLIAAVYDTTDKASSSDIADWLVSNVQDPLARVEGVGSLQVFGAE 180
VQQ G++VEKS S++L++A + DI+D++ SNV+D L+R+ GVG +Q+FGA+
Sbjct: 122 VQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGAQ 181

Query: 181 YAMRIWLDPAKLASYSLMPSDVQSAIEAQNVQVTAGKIGALPSPNTQQLTATVRAQSRLQ 240
YAMRIWLD L Y L P DV + ++ QN Q+ AG++G P+ QQL A++ AQ+R +
Sbjct: 182 YAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFK 241

Query: 241 TVDQFKNIIVKSQSDSAVVRIKDVARVEMGSEDYTAIGKLNGHPSAGVAVMLSPGANALN 300
++F + ++ SD +VVR+KDVARVE+G E+Y I ++NG P+AG+ + L+ GANAL+
Sbjct: 242 NPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANALD 301

Query: 301 TATLVKDKIAEFQRNMPQGYDIAYPKDSTEFIKISVEDVIQTLFEAIVLVVCVMYLFLQN 360
TA +K K+AE Q PQG + YP D+T F+++S+ +V++TLFEAI+LV VMYLFLQN
Sbjct: 302 TAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQN 361

Query: 361 LRATLIPALAVPVVLLGTFGVLALFGYSINTLTLFAMVLAIGLLVDDAIVVVENVERIMR 420
+RATLIP +AVPVVLLGTF +LA FGYSINTLT+F MVLAIGLLVDDAIVVVENVER+M
Sbjct: 362 MRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMM 421

Query: 421 DKGLPAREATEKSMGEISGALVAIALVLSAVFLPMAFFGGSTGVIYRQFSITIISAMLLS 480
+ LP +EATEKSM +I GALV IA+VLSAVF+PMAFFGGSTG IYRQFSITI+SAM LS
Sbjct: 422 EDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALS 481

Query: 481 VVVALTLTPALCGSVL----QHVPPHKKGFFGAFDRFYRRTEDKYQRGVIYVLRRAARTM 536
V+VAL LTPALC ++L +K GFFG F+ + + + Y V +L R +
Sbjct: 482 VLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRYL 541

Query: 537 GLYLVLGGGMALMMWKLPGSFLPTEDQGEIMVQYTLPAGATAARTAEVNRQIVDWFLINE 596
+Y ++ GM ++ +LP SFLP EDQG + LPAGAT RT +V Q+ D++L NE
Sbjct: 542 LIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKNE 601

Query: 597 KANTDVIFTVDGFSFSGSGQNTGMAFVSLKNWSQRKGAENTAQAIALRATKELGTIRDAT 656
KAN + +FTV+GFSFSG QN GMAFVSLK W +R G EN+A+A+ RA ELG IRD
Sbjct: 602 KANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDGF 661

Query: 657 VFAMTPPAVDGLGQSNGFTFELLANGGTDRETLLQMRNQLIEKANQSP-ELHSVRANDLP 715
V PA+ LG + GF FEL+ G + L Q RNQL+ A Q P L SVR N L
Sbjct: 662 VIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNGLE 721

Query: 716 QMPQLQVDIDSNKAVSLGLSLNDVTDTLSSAWGGTYVNDFIDRGRVKKVYIQGDSEFRSA 775
Q ++++D KA +LG+SL+D+ T+S+A GGTYVNDFIDRGRVKK+Y+Q D++FR
Sbjct: 722 DTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFRML 781

Query: 776 PSDLGKWFVRGSDNAMTPFSAFATTRWLYGPERLVRYNGSAAYEIQGENATGFSSGDAMT 835
P D+ K +VR ++ M PFSAF T+ W+YG RL RYNG + EIQGE A G SSGDAM
Sbjct: 782 PEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDAMA 841

Query: 836 KMEELANSLPAGTTWAWSGLSLQEKLASGQALSLYAVSILVVFLCLAALYESWSVPFSVI 895
ME LA+ LPAG + W+G+S QE+L+ QA +L A+S +VVFLCLAALYESWS+P SV+
Sbjct: 842 LMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVSVM 901

Query: 896 LVIPLGLLGAALAAWMRDLNNDVYFQVALLTTIGLSSKNAILIVEFA-EAAVAEGYSLSR 954
LV+PLG++G LAA + + NDVYF V LLTTIGLS+KNAILIVEFA + EG +
Sbjct: 902 LVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGVVE 961

Query: 955 AALRAAQTRLRPIIMTSLAFIAG 977
A L A + RLRPI+MTSLAFI G
Sbjct: 962 ATLMAVRMRLRPILMTSLAFILG 984



Score = 89.9 bits (223), Expect = 3e-20
Identities = 76/502 (15%), Positives = 164/502 (32%), Gaps = 26/502 (5%)

Query: 6 FVRRPVFAWVIAILIMLAGILAIRTLPVAQYPDVAPPTIKISA-TYTGASAETLENSVTQ 64
+ +I LI+ ++ LP + P+ GA+ E + + Q
Sbjct: 533 ILGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQ 592

Query: 65 VIEQQLTGLDNLLY-------FSSTSSSDGSVSINVTFEQGTDPDTAQ--VQNKIQQAES 115
V + L + FS + + + V+ + + + + + I +A+
Sbjct: 593 VTDYYLKNEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKM 652

Query: 116 RLPSEVQQTGVTVEKSQSNFLLIAAVYDTTDKASSSDIADWLVSNVQDPLARVEGVGS-- 173
L + L A +D + D L L +
Sbjct: 653 ELGKIRDGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASL 712

Query: 174 ----LQVFGAEYAMRIWLDPAKLASYSLMPSDVQSAIEAQNVQVTAGKIGALPSPNTQQL 229
++ +D K + + SD+ I + ++L
Sbjct: 713 VSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDF--IDRGRVKKL 770

Query: 230 TATVRAQSRLQTVDQFKNIIVKSQSDSAVVRIKDVARVEMGSEDYTAIGKLNGHPSAGVA 289
A+ R + + V+S ++ +V + + NG PS +
Sbjct: 771 YVQADAKFR-MLPEDVDKLYVRS-ANGEMVPFSAFTTSHWV-YGSPRLERYNGLPSMEIQ 827

Query: 290 VMLSPGANALNTATLVKDKIAEFQRNMPQGYDIAYPKDSTEFIKISVEDVIQTLFEAIVL 349
+PG + A + + +A +P G + S + S + + V+
Sbjct: 828 GEAAPGTS-SGDAMALMENLAS---KLPAGIGYDWTGMSYQERL-SGNQAPALVAISFVV 882

Query: 350 VVCVMYLFLQNLRATLIPALAVPVVLLGTFGVLALFGYSINTLTLFAMVLAIGLLVDDAI 409
V + ++ + L VP+ ++G LF + + ++ IGL +AI
Sbjct: 883 VFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAI 942

Query: 410 VVVENVERIMRDKGLPAREATEKSMGEISGALVAIALVLSAVFLPMAFFGGSTGVIYRQF 469
++VE + +M +G EAT ++ ++ +L LP+A G+
Sbjct: 943 LIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAV 1002

Query: 470 SITIISAMLLSVVVALTLTPAL 491
I ++ M+ + ++A+ P
Sbjct: 1003 GIGVMGGMVSATLLAIFFVPVF 1024


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
Z2509     RTXTOXIND  48     3e-08    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 47.9 bits (114), Expect = 3e-08
Identities = 28/133 (21%), Positives = 56/133 (42%), Gaps = 10/133 (7%)

Query: 41 PVSVVSELTGR-TSAALSAEVRPQVGGIIQKRLFKEGDLVKAGQPLYQIDAASYQAAWNE 99
V +V+ G+ T + S E++P I+++ + KEG+ V+ G L ++ A +A +
Sbjct: 79 QVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLK 138

Query: 100 ARAALQQAQALVKADCQKAQRYARLVKENGVSQQDADDAQSTCAQDKASV--------AA 151
+++L QA+ Q R L K + D Q+ ++ + +
Sbjct: 139 TQSSLLQARLEQ-TRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFST 197

Query: 152 KKAALETARINLD 164
+ +NLD
Sbjct: 198 WQNQKYQKELNLD 210



Score = 32.1 bits (73), Expect = 0.004
Identities = 17/114 (14%), Positives = 37/114 (32%), Gaps = 5/114 (4%)

Query: 83 QPLYQIDAASYQAAWN--EARAALQQAQALVKADCQKAQRYARLVKEN--GVSQQDADDA 138
L A + A + K+ ++ + KE V+Q ++
Sbjct: 241 SSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEI 300

Query: 139 QSTCAQDKASVAAKKAALETARINLDWTTVTAPISGRI-GISSVTPGALVTASQ 191
Q ++ L + + AP+S ++ + T G +VT ++
Sbjct: 301 LDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAE 354


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z2510     HTHTETR  55     8e-12    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 55.4 bits (133), Expect = 8e-12
Identities = 17/65 (26%), Positives = 33/65 (50%)

Query: 1 MTSKLEIRHKQRQDEIINAARRCFRRCGFHAASMSQIASEAQLSVGQIYRYFANKDAIIE 60
M K + ++ + I++ A R F + G + S+ +IA A ++ G IY +F +K +
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60

Query: 61 EMVRR 65
E+
Sbjct: 61 EIWEL 65


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z2512     DHBDHDRGNASE  50     1e-09    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 50.4 bits (120), Expect = 1e-09
Identities = 51/260 (19%), Positives = 98/260 (37%), Gaps = 22/260 (8%)

Query: 4 LSGKRILVTGVASKLSIAYGIAQAMHREGAEL-AFTYQNDKLKGRVEEFAAQLGSDIVLQ 62
+ GK +TG A I +A+ + +GA + A Y +KL+ V A+
Sbjct: 6 IEGKIAFITGAAQ--GIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFP 63

Query: 63 CDVAEDASIDTMFAELGKVWPKFDGFVHSIGF---APGDQLDGDYVNAVTREGFKIAHDI 119
DV + A+ID + A + + D V+ G L + A F +
Sbjct: 64 ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEAT----FSVN--- 116

Query: 120 SSYSFVAMAKACRSMLNP-GSALLTLSYLGAERAIPNYNVMGLAKASLEANVRYMANAMG 178
S+ F A + M++ +++T+ A + +KA+ + + +
Sbjct: 117 STGVFNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELA 176

Query: 179 PEGVRVNAISAGPIRTLAASGI--------KDFRKMLAHCEAVTPIRRTVTIEDVGNSAA 230
+R N +S G T + + + L + P+++ D+ ++
Sbjct: 177 EYNIRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVL 236

Query: 231 FLCSDLSAGISGEVVHVDGG 250
FL S + I+ + VDGG
Sbjct: 237 FLVSGQAGHITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z2516     PF08280  31     0.018    M protein trans-acting positive regulator
		>PF08280#M protein trans-acting positive regulator

Length = 530

Score = 31.0 bits (70), Expect = 0.018
Identities = 21/105 (20%), Positives = 36/105 (34%), Gaps = 2/105 (1%)

Query: 526 PIDVELTESCLIENDELALSVIQQFSQLGAQVHLDDFGTAYSSLSQLARFPIDAIKLDQV 585
P+ V S I L S + FS + + ++ Q+ D +
Sbjct: 425 PLVVVFVASNFINAHLLTDSFPRYFSDKS--IDFHSYYLLQDNVYQIPDLKPDLVITHSQ 482

Query: 586 FVRDIHKQPVSQSLVRAIVAVAQALNLQVIAEGVESAKEDAFLTK 630
+ +H + V I L++Q + V+ K A LTK
Sbjct: 483 LIPFVHHELTKGIAVAEISFDESILSIQELMYQVKEEKFQADLTK 527


93  Z2936  Z2944  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z2936     0       13       1.427342    chemotaxis regulatory protein CheY
Z2937     -1      11       1.629278    chemotaxis-specific methylesterase
Z2938     -1      13       1.514211    chemotaxis methyltransferase CheR
Z2939     -1      13       1.192097    methyl-accepting protein IV
Z2940     -1      13       0.640574    methyl-accepting chemotaxis protein II
Z2941     -1      12       0.197071    purine-binding chemotaxis protein
Z2942     -1      14       -0.293006   chemotaxis protein CheA
Z2943     -1      16       -1.714371   flagellar motor protein MotB
Z2944     -2      15       -2.189013   flagellar motor protein MotA
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
Z2936     HTHFIS  90     4e-24    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 89.5 bits (222), Expect = 4e-24
Identities = 30/105 (28%), Positives = 51/105 (48%), Gaps = 3/105 (2%)

Query: 7 KFLVVDDFSTMRRIVRNLLKELGFNNVEEAEDGVDALNKLQAGGYGFVISDWNMPNMDGL 66
LV DD + +R ++ L G++ V + + AG V++D MP+ +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYD-VRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 67 ELLKTIRADGAMSALPVLMVTAEAKKENIIAAAQAGASGYVVKPF 111
+LL I+ LPVL+++A+ I A++ GA Y+ KPF
Sbjct: 64 DLLPRIKKARPD--LPVLVMSAQNTFMTAIKASEKGAYDYLPKPF 106


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
Z2937     HTHFIS  63     5e-13    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 62.9 bits (153), Expect = 5e-13
Identities = 34/188 (18%), Positives = 71/188 (37%), Gaps = 23/188 (12%)

Query: 1 MSKIRVLSVDDSALMXQIMTEIINSHSDMEMVATAPDPLVARDLIKKFNPDVLTLDVEMP 60
M+ +L DD A + ++ + ++ V + I + D++ DV MP
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAG--YDVRITSNAATLWRWIAAGDGDLVVTDVVMP 58

Query: 61 RMDGLDFLEKLMRLRPMPVVMVSSLTGKGS-EVTLRALELGAIDFVTKPQLGIREGMLAY 119
+ D L ++ + RP V+V ++ + + ++A E GA D++ KP + E +
Sbjct: 59 DENAFDLLPRIKKARPDLPVLV--MSAQNTFMTAIKASEKGAYDYLPKP-FDLTELIGII 115

Query: 120 SEMIAEKVRTAAKASLAAHKPLSAPTTLKAGPLLSSEKLIAIGASTGGTEAIRHVLQPLP 179
+AE R +K + + +G S E R + + +
Sbjct: 116 GRALAEPKRRPSKLEDDSQDGMP-----------------LVGRSAAMQEIYRVLARLMQ 158

Query: 180 LSSPALLI 187
++
Sbjct: 159 TDLTLMIT 166


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z2942     PF06580  43     4e-06    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 42.5 bits (100), Expect = 4e-06
Identities = 23/151 (15%), Positives = 49/151 (32%), Gaps = 52/151 (34%)

Query: 361 ELDKSLIERIIDPLT--HLVRNSLDHGIELPEKRLAAGKNSVGNLILSAEHQGGNICIEV 418
+++ ++++ + P+ LV N + HGI G ++L G + +EV
Sbjct: 245 QINPAIMDVQVPPMLVQTLVENGIKHGIA--------QLPQGGKILLKGTKDNGTVTLEV 296

Query: 419 TDDGAGLNRERILAKAASQGLTVSENMSDDEVAMLIFAPGFSTAEQVTDVSGRGVGMDVV 478
+ G+ + G G+ V
Sbjct: 297 ENTGSLALKNTK--------------------------------------ESTGTGLQNV 318

Query: 479 KRNIQEMGG---HVEIQSKQGTGTTIRILLP 506
+ +Q + G +++ KQG +L+P
Sbjct: 319 RERLQMLYGTEAQIKLSEKQG-KVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z2943     PF05272  30     0.010    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.4 bits (68), Expect = 0.010
Identities = 22/93 (23%), Positives = 35/93 (37%), Gaps = 11/93 (11%)

Query: 46 LISISSPKELIQIAEYFRTPLATAVTGGDRISNSESPIPGGGDDYTQSQGEVNKQPNIEE 105
L +SSP A P + G + ++ PGGGDD GE +++
Sbjct: 384 LADVSSPTAAAGGAGGGEPPKKRDPSAG---AGTDPGGPGGGDD-----GEDPFGEWLDD 435

Query: 106 LKKRM---EQSRLRKLRGDLDQLIESDPKLRAL 135
R+ + L+ R L + + S P L
Sbjct: 436 EVARLRLRGRWLLKPRRAALIEALRSAPALAGC 468


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z2944     PF05844  33     0.001    YopD protein
		>PF05844#YopD protein

Length = 295

Score = 33.1 bits (75), Expect = 0.001
Identities = 12/28 (42%), Positives = 22/28 (78%), Gaps = 2/28 (7%)

Query: 76 MDLLALLYRLMAKSRQMGMFSLERDIEN 103
++LL +L+R+ K+R++G+ L+RD EN
Sbjct: 74 VELLLILFRIAQKARELGV--LQRDNEN 99


94  Z3013  Z3040  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z3013     -1      14       -1.060339   flagellin
Z3014     -1      19       0.575329    flagellar capping protein
Z3015     0       15       0.238501    flagellar protein FliS
Z3016     0       14       0.615963    flagellar biosynthesis protein FliT
Z3017     0       14       0.094716    alpha-amylase
Z3018     0       16       -3.536942   hypothetical protein
Z3019     1       19       -4.637094   hypothetical protein
Z3020     2       36       -10.392134  hypothetical protein
Z3021     2       36       -10.454443  hypothetical protein
Z3022     3       38       -11.486410  hypothetical protein
Z3023     0       26       -7.177059   hypothetical protein
Z3025     0       14       -2.510366   hypothetical protein
Z3026     0       13       -1.182148   hypothetical protein
Z3024     -1      15       4.095863    hypothetical protein
Z3027     0       17       4.356524    flagellar hook-basal body protein FliE
Z3028     0       15       4.227834    flagellar MS-ring protein
Z3029     1       17       4.391925    flagellar motor switch protein G
Z3030     -1      17       3.823043    flagellar assembly protein H
Z3031     -1      19       3.554403    flagellum-specific ATP synthase
Z3032     -1      17       2.427290    flagellar biosynthesis chaperone
Z3033     -1      17       2.438503    flagellar hook-length control protein
Z3034     -3      21       1.940335    flagellar basal body protein FliL
Z3035     -1      17       0.695993    flagellar motor switch protein FliM
Z3036     0       16       -2.247826   flagellar motor switch protein FliN
Z3037     -1      17       -2.926889   flagellar biosynthesis protein FliO
Z3038     -1      20       -3.813146   flagellar biosynthesis protein FliP
Z3039     -1      21       -4.024679   flagellar biosynthesis protein FliQ
Z3040     -2      15       -2.657476   flagellar biosynthesis protein FliR
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
Z3013     FLAGELLIN  237    6e-74    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 237 bits (605), Expect = 6e-74
Identities = 252/525 (48%), Positives = 308/525 (58%), Gaps = 18/525 (3%)

Query: 2 AQVINTNSLSLITQNNINKNQSALSSSIERLSSGLRINSAKDDAAGQAIANRFTSNIKGL 61
AQVINTNSLSL+TQNN+NK+QS+LSS+IERLSSGLRINSAKDDAAGQAIANRFTSNIKGL
Sbjct: 1 AQVINTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGL 60

Query: 62 TQAARNANDGISVAQTTEGALSEINNNLQRIRELTVQATTGTNSDSDLDSIQDEIKSRLD 121
TQA+RNANDGIS+AQTTEGAL+EINNNLQR+REL+VQAT GTNSDSDL SIQDEI+ RL+
Sbjct: 61 TQASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLE 120

Query: 122 EIDRVSGQTQFNGVNVLAKDGSMKIQVGANDGETITIDLKKIDSDTLGLNGFNVNGKGTI 181
EIDRVS QTQFNGV VL++D MKIQVGANDGETITIDL+KID +LGL+GFNVNG
Sbjct: 121 EIDRVSNQTQFNGVKVLSQDNQMKIQVGANDGETITIDLQKIDVKSLGLDGFNVNG---- 176

Query: 182 TNKAATVSDLTSAGAKLNTTTGLYDLKTENTLLTTDAAFDKLGNGDKVTVGGVDYTYNAK 241
+ + + TG D + NA
Sbjct: 177 ----PKEATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAA 232

Query: 242 SGDFTTTKSTAGTGVDAAAQAADSASKRDALAATLHADVGKSVNGSYTTKDGTVSFETDS 301
+G TT + T VD +A +A A GK +
Sbjct: 233 NGQLTTDDAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTIDTK-- 290

Query: 302 AGNITIGGSQAYVDDAGNLTTNNAGSAAKADMKKAAAATSSITFNSGVLSKTIGFTAGES 361
GN G ++ T +A A++ A +S + S V + ++
Sbjct: 291 TGNDGNGKVSTTINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKTKN 350

Query: 362 SDAAKSYVDDKGGITNVADYTVSYSVNKDNGSVTVAGYASATDTNKDYAPAIGTAVNVNS 421
A S ++ + + TV A+
Sbjct: 351 ESAKLSDLEANNAVKGESKITV--------NGAEYTANAAGDKVTLAGKTMFIDKTASGV 402

Query: 422 AGKITTETTSAGSATTNPLAALDDAISSIDKFRSSLGAIQNRLDSAVTNLNNTTTNLSEA 481
+ I + +A +T NPLA++D A+S +D RSSLGAIQNR DSA+TNL NT TNL+ A
Sbjct: 403 STLINEDAAAAKKSTANPLASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSA 462

Query: 482 QSRIQDADYATEVSNMSKAQIIQQAGNSVLAKANQVPQQVLSLLQ 526
+SRI+DADYATEVSNMSKAQI+QQAG SVLA+ANQVPQ VLSLL+
Sbjct: 463 RSRIEDADYATEVSNMSKAQILQQAGTSVLAQANQVPQNVLSLLR 507


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z3014     TYPE3OMBPROT  33     0.003    Type III secretion system outer membrane B protein ...
		>TYPE3OMBPROT#Type III secretion system outer membrane B protein family signature.

Length = 538

Score = 32.7 bits (74), Expect = 0.003
Identities = 24/72 (33%), Positives = 37/72 (51%), Gaps = 2/72 (2%)

Query: 211 NGMEVSVAAQNAQLTVNNVAIENSSNTISDALENITLNLNDVTTGNQTLTITQDTSKAQT 270
N E +VAA+N + + A+ + +S AL T++L V+T LT T T ++
Sbjct: 236 NSSERAVAARNKAEELVSAALYSRPELLSQALSGKTVDLKIVSTS--LLTPTSLTGGEES 293

Query: 271 AIKDWVNAYNSL 282
+KD VNA L
Sbjct: 294 MLKDQVNALKGL 305


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
Z3019     RTXTOXIND  30     0.018    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 30.2 bits (68), Expect = 0.018
Identities = 10/57 (17%), Positives = 17/57 (29%), Gaps = 2/57 (3%)

Query: 164 RFTLLPIFRIPVKMQKVAAASPLTQKPDQARRRF--RLGMLVFFGMLGWALLTAMNQ 218
R L R + + + A L + P R R M ++L +
Sbjct: 26 RKQLDTPVREKDENEFLPAHLELIETPVSRRPRLVAYFIMGFLVIAFILSVLGQVEI 82


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z3020     PF01206  93     6e-29    SirA family protein
		>PF01206#SirA family protein

Length = 76

Score = 92.5 bits (230), Expect = 6e-29
Identities = 16/71 (22%), Positives = 37/71 (52%)

Query: 7 DYRLDMVGEPCPYPAVATLEAMPQLKKGEILEVVSDCPQSINNIPLDARNHGYTVLDIQQ 66
D LD G CP P + + + + GE+L V++ P S+ + ++ G+ +L+ ++
Sbjct: 5 DQSLDATGLNCPLPILKAKKTLATMNAGEVLYVMATDPGSVKDFESFSKQTGHELLEQKE 64

Query: 67 DGPTIRYLIQK 77
+ T + +++
Sbjct: 65 EDGTYHFRLKR 75


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z3022     SACTRNSFRASE  32     4e-04    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 32.2 bits (73), Expect = 4e-04
Identities = 17/54 (31%), Positives = 27/54 (50%), Gaps = 2/54 (3%)

Query: 80 APNYLRRGVASLILRHILQVAHDRCLHRLSLETGTQAGFTACHQLYLKHGFVDC 133
A +Y ++GV + +L ++ A + L LET +ACH Y KH F+
Sbjct: 98 AKDYRKKGVGTALLHKAIEWAKENHFCGLMLET-QDINISACH-FYAKHHFIIG 149


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z3023     PF07299  28     0.044    Fibronectin-binding protein (FBP)
		>PF07299#Fibronectin-binding protein (FBP)

Length = 219

Score = 27.5 bits (61), Expect = 0.044
Identities = 17/78 (21%), Positives = 30/78 (38%), Gaps = 5/78 (6%)

Query: 1 MFPLNDLSLKTQSVQLNKITSNTESTIKQHELVSDDAIINELSSEKSQGEGTLPIRHKLE 60
M+ + + +S Q N I S H +D +I L S + I H E
Sbjct: 1 MYGVIKMEAFIRSDQYNFIKSQAYILANGHATANDRGVIQALKSLAIE-----KIIHVFE 55

Query: 61 FISTNIAELLDKLTKITD 78
++ EL+D + + +
Sbjct: 56 NLTDEQKELIDTVLTVQN 73


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z3025     SACTRNSFRASE  32     1e-04    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 32.2 bits (73), Expect = 1e-04
Identities = 17/54 (31%), Positives = 27/54 (50%), Gaps = 2/54 (3%)

Query: 20 APNYLRRGVASLILRHILQVAHDRCLHRLSLETGTQAGFTACHQLYLKHGFVDC 73
A +Y ++GV + +L ++ A + L LET +ACH Y KH F+
Sbjct: 98 AKDYRKKGVGTALLHKAIEWAKENHFCGLMLET-QDINISACH-FYAKHHFIIG 149


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
Z3027     FLGHOOKFLIE  117    5e-38    Flagellar hook-basal body complex protein FliE signa...
		>FLGHOOKFLIE#Flagellar hook-basal body complex protein FliE signature.

Length = 103

Score = 117 bits (294), Expect = 5e-38
Identities = 103/103 (100%), Positives = 103/103 (100%)

Query: 2 SAIQGIEGVISQLQATAMSARAQESLPQPTISFAGQLHAALDRISDTQTAARTQAEKFTL 61
SAIQGIEGVISQLQATAMSARAQESLPQPTISFAGQLHAALDRISDTQTAARTQAEKFTL
Sbjct: 1 SAIQGIEGVISQLQATAMSARAQESLPQPTISFAGQLHAALDRISDTQTAARTQAEKFTL 60

Query: 62 GEPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV 104
GEPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV
Sbjct: 61 GEPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV 103


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z3028     FLGMRINGFLIF  752    0.0      Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 752 bits (1943), Expect = 0.0
Identities = 478/555 (86%), Positives = 515/555 (92%), Gaps = 5/555 (0%)

Query: 3 ATAAQTKSLEWLNRLRANPKIPLIVAGSAAVAVMVALILWAKAPDYRTLFSNLSDQDGGA 62
+TA Q K LEWLNRLRANP+IPLIVAGSAAVA++VA++LWAK PDYRTLFSNLSDQDGGA
Sbjct: 5 STATQPKPLEWLNRLRANPRIPLIVAGSAAVAIVVAMVLWAKTPDYRTLFSNLSDQDGGA 64

Query: 63 IVSQLTQMNIPYRFSEASGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQ 122
IV+QLTQMNIPYRF+ SGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQ
Sbjct: 65 IVAQLTQMNIPYRFANGSGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQ 124

Query: 123 FSEQVNYQRALEGELSRTIETIGPVKGARVHLAMPKPSLFVREQKSPSASVTVNLLPGRA 182
FSEQVNYQRALEGEL+RTIET+GPVK ARVHLAMPKPSLFVREQKSPSASVTV L PGRA
Sbjct: 125 FSEQVNYQRALEGELARTIETLGPVKSARVHLAMPKPSLFVREQKSPSASVTVTLEPGRA 184

Query: 183 LDEGQISAIVHLVSSAVAGLPPGNVTLVDQGGHLLTQSNTSWRDLNDAQLKYASDVEGRI 242
LDEGQISA+VHLVSSAVAGLPPGNVTLVDQ GHLLTQSNTS RDLNDAQLK+A+DVE RI
Sbjct: 185 LDEGQISAVVHLVSSAVAGLPPGNVTLVDQSGHLLTQSNTSGRDLNDAQLKFANDVESRI 244

Query: 243 QRRIEAILSPIVGNGNIHAQVTAQLDFASKEQTEEQYRPNGDESQAALRSRQLNESEQSG 302
QRRIEAILSPIVGNGN+HAQVTAQLDFA+KEQTEE Y PNGD S+A LRSRQLN SEQ G
Sbjct: 245 QRRIEAILSPIVGNGNVHAQVTAQLDFANKEQTEEHYSPNGDASKATLRSRQLNISEQVG 304

Query: 303 SGYPGGVPGALSNQPAPANNAPISTPPANQNNRQQ--QASTTSNS---GPRSTQRNETSN 357
+GYPGGVPGALSNQPAP N API+TPP NQ N Q Q ST++NS GPRSTQRNETSN
Sbjct: 305 AGYPGGVPGALSNQPAPPNEAPIATPPTNQQNAQNTPQTSTSTNSNSAGPRSTQRNETSN 364

Query: 358 YEVDRTIRHTKMNVGDVQRLSVAVVVNYKTLPDGKPLPLSNEQMKQIEDLTREAMGFSEK 417
YEVDRTIRHTKMNVGD++RLSVAVVVNYKTL DGKPLPL+ +QMKQIEDLTREAMGFS+K
Sbjct: 365 YEVDRTIRHTKMNVGDIERLSVAVVVNYKTLADGKPLPLTADQMKQIEDLTREAMGFSDK 424

Query: 418 RGDSLNVVNSPFNSSDESGGELPFWQQQAFIDQLLAAGRWLLVLLVAWLLWRKAVRPQLT 477
RGD+LNVVNSPF++ D +GGELPFWQQQ+FIDQLLAAGRWLLVL+VAW+LWRKAVRPQLT
Sbjct: 425 RGDTLNVVNSPFSAVDNTGGELPFWQQQSFIDQLLAAGRWLLVLVVAWILWRKAVRPQLT 484

Query: 478 RRAEAMKAVQQQAQAREEVEDAVEVRLSKDEQLQQRRANQRLGAEVMSQRIREMSDNDPR 537
RR E KA Q+QAQ R+E E+AVEVRLSKDEQLQQRRANQRLGAEVMSQRIREMSDNDPR
Sbjct: 485 RRVEEAKAAQEQAQVRQETEEAVEVRLSKDEQLQQRRANQRLGAEVMSQRIREMSDNDPR 544

Query: 538 VVALVIRQWINNDHE 552
VVALVIRQW++NDHE
Sbjct: 545 VVALVIRQWMSNDHE 559


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z3029     FLGMOTORFLIG  262    3e-89    Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 262 bits (672), Expect = 3e-89
Identities = 105/328 (32%), Positives = 168/328 (51%), Gaps = 58/328 (17%)

Query: 1 MSNLTGTDKSVILLMTIGEDRAAEVFKHLSQREVQTLSAAMANVTASGIETLN------- 53
+S LTG K+ ILL++IG + +++VFK+LSQ E+++L+ +A + E +
Sbjct: 12 VSALTGKQKAAILLVSIGSEISSKVFKYLSQEEIESLTFEIAKLETITSELKDNVLLEFK 71

Query: 54 -----------------------FMEPQSAADLI-------------------------- 64
+ Q A D+I
Sbjct: 72 ELMMAQEFIQKGGIDYARELLEKSLGTQKAVDIINNLGSALQSRPFEFVRRADPANILNF 131

Query: 65 -RDEHPQIIATILVHLKRAQAADILALFDERLRHDVMLRIATFGGVQPAALAELTEVLNG 123
+ EHPQ IA IL +L +A+ IL+ ++ +V RIA P + E+ VL
Sbjct: 132 IQQEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEVVREVERVLEK 191

Query: 124 LLDGQ-NLKRSKMGGVRTAAEIINLMKTQQEEAVITAVREFDGELAQKIIDEMFLFENLV 182
L + + GGV EIIN+ + E+ +I ++ E D ELA++I +MF+FE++V
Sbjct: 192 KLASLSSEDYTSAGGVDNVVEIINMADRKTEKFIIESLEEEDPELAEEIKKKMFVFEDIV 251

Query: 183 DVDDRSIQRLLQEVDSESLLIALKGAEQPLREKFLRNMSQRAADILRDDLANRGPVRLSQ 242
+DDRSIQR+L+E+D + L ALK + P++EK +NMS+RAA +L++D+ GP R
Sbjct: 252 LLDDRSIQRVLREIDGQELAKALKSVDIPVQEKIFKNMSKRAASMLKEDMEFLGPTRRKD 311

Query: 243 VENEQKAILLIVRRLAETGEMVIGSGED 270
VE Q+ I+ ++R+L E GE+VI G +
Sbjct: 312 VEESQQKIVSLIRKLEEQGEIVISRGGE 339


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z3030     FLGFLIH  374    e-135    Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 374 bits (961), Expect = e-135
Identities = 226/228 (99%), Positives = 227/228 (99%)

Query: 1 MSDNLPWKTWTPDDLAPPQAEFVPMVESEETIIEEAEPSLEQQLAQLQMQAHEQGYQAGI 60
MSDNLPWKTWTPDDLAPPQAEFVP+VE EETIIEEAEPSLEQQLAQLQMQAHEQGYQAGI
Sbjct: 1 MSDNLPWKTWTPDDLAPPQAEFVPIVEPEETIIEEAEPSLEQQLAQLQMQAHEQGYQAGI 60

Query: 61 AEGRQQGHKQGYQEGLAQGLEQGLAEAKSQQAPIHARMQQLVSEFQTTLDALDSVIASRL 120
AEGRQQGHKQGYQEGLAQGLEQGLAEAKSQQAPIHARMQQLVSEFQTTLDALDSVIASRL
Sbjct: 61 AEGRQQGHKQGYQEGLAQGLEQGLAEAKSQQAPIHARMQQLVSEFQTTLDALDSVIASRL 120

Query: 121 MQMALEAARQVIGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGAT 180
MQMALEAARQVIGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGAT
Sbjct: 121 MQMALEAARQVIGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGAT 180

Query: 181 LSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATRWQELCRLAAPGVV 228
LSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATRWQELCRLAAPGVV
Sbjct: 181 LSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATRWQELCRLAAPGVV 228


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z3032     FLGFLIJ  202    2e-70    Flagellar FliJ protein signature.
		>FLGFLIJ#Flagellar FliJ protein signature.

Length = 147

Score = 202 bits (515), Expect = 2e-70
Identities = 146/147 (99%), Positives = 147/147 (100%)

Query: 1 MAEHGALATLKDLAEKEVEDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRNNLNSDMSAG 60
MAEHGALATLKDLAEKEVEDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRNNLNSDMSAG
Sbjct: 1 MAEHGALATLKDLAEKEVEDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRNNLNSDMSAG 60

Query: 61 MTSNRWINYQQFIQTLEKAITQHRQQLNQWTQKVDIALNSWREKKQRLQAWQTLQERQST 120
+TSNRWINYQQFIQTLEKAITQHRQQLNQWTQKVDIALNSWREKKQRLQAWQTLQERQST
Sbjct: 61 ITSNRWINYQQFIQTLEKAITQHRQQLNQWTQKVDIALNSWREKKQRLQAWQTLQERQST 120

Query: 121 AALLAENRLDQKKMDEFAQRAAMRKPE 147
AALLAENRLDQKKMDEFAQRAAMRKPE
Sbjct: 121 AALLAENRLDQKKMDEFAQRAAMRKPE 147


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
Z3033     FLGHOOKFLIK  409    e-145    Flagellar hook-length control protein signature.
		>FLGHOOKFLIK#Flagellar hook-length control protein signature.

Length = 375

Score = 409 bits (1051), Expect = e-145
Identities = 329/334 (98%), Positives = 329/334 (98%)

Query: 1 MIRLAPLITADVDTTTLPGGKASDAAQDFLALLSEALAGETTTDKAAPQLLVATDKPTTK 60
MIRLAPLITADVDTTTLPGGKASDAAQDFLALLSEALAGETTTDKAAPQLLVATDKPTTK
Sbjct: 1 MIRLAPLITADVDTTTLPGGKASDAAQDFLALLSEALAGETTTDKAAPQLLVATDKPTTK 60

Query: 61 GEPLISDIVSDAQQANLLIPVDETPPVINDEQSTSTPLTTAQTMALAAVADKNTTKDEKA 120
GEPLISDIVSDAQQANLLIPVDETPPVINDEQSTSTPLTTAQTMALAAVADKNTTKDEKA
Sbjct: 61 GEPLISDIVSDAQQANLLIPVDETPPVINDEQSTSTPLTTAQTMALAAVADKNTTKDEKA 120

Query: 121 DDLNEDVTASLSALFAMLPGFDNTPKVTDVPSTVLPAEKPTLFTKLTSAQLTTAQPDDAP 180
DDLNEDVTASLSALFAMLPGFDNTPKVTD PSTVLP EKPTLFTKLTS QLTTAQPDDAP
Sbjct: 121 DDLNEDVTASLSALFAMLPGFDNTPKVTDAPSTVLPTEKPTLFTKLTSEQLTTAQPDDAP 180

Query: 181 GTPAQPLTPQVAEAQSKAEVISTPSPVTAAASPLITPHQTQPLPTVAAPVLSAPLGSHEW 240
GTPAQPLTP VAEAQSKAEVISTPSPVTAAASPLITPHQTQPLPTVAAPVLSAPLGSHEW
Sbjct: 181 GTPAQPLTPLVAEAQSKAEVISTPSPVTAAASPLITPHQTQPLPTVAAPVLSAPLGSHEW 240

Query: 241 QQSLSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQMVSSHQHVRAALEAA 300
QQSLSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQMVS HQHVRAALEAA
Sbjct: 241 QQSLSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQMVSPHQHVRAALEAA 300

Query: 301 LPVLRTQLAESGIQLGQSNISGESFSGQQQAASQ 334
LPVLRTQLAESGIQLGQSNISGESFSGQQQAASQ
Sbjct: 301 LPVLRTQLAESGIQLGQSNISGESFSGQQQAASQ 334


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z3035     FLGMOTORFLIM  253    1e-85    Flagellar motor switch protein FliM signature.
		>FLGMOTORFLIM#Flagellar motor switch protein FliM signature.

Length = 344

Score = 253 bits (648), Expect = 1e-85
Identities = 68/324 (20%), Positives = 119/324 (36%), Gaps = 68/324 (20%)

Query: 5 ILSQAEIDALLNGDS--EVKDEPTASVSGES----------------------------- 33
+LSQ EID LL S + E +S
Sbjct: 4 VLSQDEIDQLLTAISSGDASIEDARPISDTRKITLYDFRRPDKFSKEQMRTLSLMHETFA 63

Query: 34 ----------------------DIRPYD------PN-TQRNLIHLKPLRGTGLVVFSPSL 64
D Y+ P + +I + PL+G ++ PS+
Sbjct: 64 RLTTTSLSAQLRSMVHVHVASVDQLTYEEFIRSIPTPSTLAVITMDPLKGNAVLEVDPSI 123

Query: 65 VFIAVDNLFGGDGRFPTKVEGREFTHTEQRVINRMLKLALEGYSDAWKAINPLEVEYVRS 124
F +D LFGG G+ KV+ R+ T E V+ ++ L ++W + L +
Sbjct: 124 TFSIIDRLFGGTGQ-AAKVQ-RDLTDIENSVMEGVIVRILANVRESWTQVIDLRPRLGQI 181

Query: 125 EMQVKFTNITTSPNDIVVNTPFHVEIGNLTGEFNICLPFSMIEPLRELLVNPPLENS--R 182
E +F I P+++VV ++G G N C+P+ IEP+ L + +S R
Sbjct: 182 ETNPQFAQI-VPPSEMVVLVTLETKVGEEEGMMNFCIPYITIEPIISKLSSQFWFSSVRR 240

Query: 183 NEDQNWRDNLVRQVQHSQLELVANFADISLRLSQILKLKPGDVLPIEKP---DRIIAHVD 239
+ + L ++ +++VA + L + IL L+ GD++ + D + +
Sbjct: 241 SSTTQYMGVLRDKLSTVDMDVVAEVGSLRLSVRDILGLRVGDIIRLHDTHVGDPFVLSIG 300

Query: 240 GVPVLTSQYGTLNGQYALRIEHLI 263
Q G + + A +I I
Sbjct: 301 NRKKFLCQPGVVGKKIAAQILERI 324


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z3036     FLGMOTORFLIN  212    1e-74    Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 212 bits (542), Expect = 1e-74
Identities = 125/137 (91%), Positives = 134/137 (97%)

Query: 1 MSDMNNPADDNNGAMDDLWAEALSEQKSTSSKSAADAVFQQFGGGDVSGTLQDIDLIMDI 60
MSDMNNP+D+N GA+DDLWA+AL+EQK+T++KSAADAVFQQ GGGDVSG +QDIDLIMDI
Sbjct: 1 MSDMNNPSDENTGALDDLWADALNEQKATTTKSAADAVFQQLGGGDVSGAMQDIDLIMDI 60

Query: 61 PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV 120
PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV
Sbjct: 61 PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV 120

Query: 121 RITDIITPSERMRRLSR 137
RITDIITPSERMRRLSR
Sbjct: 121 RITDIITPSERMRRLSR 137


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z3038     FLGBIOSNFLIP  334    e-119    Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP signature.

Length = 245

Score = 334 bits (858), Expect = e-119
Identities = 245/245 (100%), Positives = 245/245 (100%)

Query: 1 MRRLLSVAPVLLWLITPLAFAQLPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILLM 60
MRRLLSVAPVLLWLITPLAFAQLPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILLM
Sbjct: 1 MRRLLSVAPVLLWLITPLAFAQLPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILLM 60

Query: 61 MTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEK 120
MTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEK
Sbjct: 61 MTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEK 120

Query: 121 ISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELK 180
ISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELK
Sbjct: 121 ISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELK 180

Query: 181 TAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLA 240
TAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLA
Sbjct: 181 TAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLA 240

Query: 241 QSFYS 245
QSFYS
Sbjct: 241 QSFYS 245


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z3039     TYPE3IMQPROT  67     1e-18    Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.

Length = 86

Score = 67.1 bits (164), Expect = 1e-18
Identities = 22/78 (28%), Positives = 42/78 (53%)

Query: 4 ESVMMMGTEAMKVALALAAPLLLVALVTGLIISILQAATQINEMTLSFIPKIIAVFIAII 63
+ ++ G +A+ + L L+ +VA + GL++ + Q TQ+ E TL F K++ V + +
Sbjct: 2 DDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLF 61

Query: 64 IAGPWMLNLLLDYVRTLF 81
+ W +LL Y R +
Sbjct: 62 LLSGWYGEVLLSYGRQVI 79


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z3040     TYPE3IMRPROT  203    3e-67    Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 203 bits (518), Expect = 3e-67
Identities = 260/261 (99%), Positives = 261/261 (100%)

Query: 1 MLQVTSEQWLSWLSLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA 60
MLQVTSEQWLSWL+LYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA
Sbjct: 1 MLQVTSEQWLSWLNLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA 60

Query: 61 NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHL 120
NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHL
Sbjct: 61 NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHL 120

Query: 121 NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIF 180
NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIF
Sbjct: 121 NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIF 180

Query: 181 LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC 240
LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC
Sbjct: 181 LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC 240

Query: 241 EHLFSEIFNLLADIISELPLI 261
EHLFSEIFNLLADIISELPLI
Sbjct: 241 EHLFSEIFNLLADIISELPLI 261


S.No  Start  End    Bias  Virulence  Insertion elements  Prediction
95    Z3054  Z3061  N     Y          N                   Pathogenicity Island (unbiased-composition)

LocusTag  DNBias  CDNBias  %GCBias    Product
Z3054     -2      13       -1.950196  DNA cytosine methylase
Z3055     -2      19       -5.717191  hypothetical protein
Z3056     0       25       -7.360513  hypothetical protein
Z3057     -1      28       -7.895774  hypothetical protein
Z3058     -1      23       -6.191904  hypothetical protein
Z3059     -1      26       -5.648835  chaperone protein HchA
Z3060     0       31       -7.123281  2-component sensor protein
Z3061     0       27       -5.720713  transcriptional regulatory protein YedW
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z3054     PF05272  29     0.045    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.045
Identities = 20/62 (32%), Positives = 29/62 (46%), Gaps = 15/62 (24%)

Query: 320 AKYILTPVLWKYLYRYAKKHQARGNGFGYGMVYPNNPQSVTRTLSARYYKDGAEILIDRG 379
A+Y + PVLW Y+ R+ K + G+ VY +R +DG+E RG
Sbjct: 166 ARYQVGPVLWGYVVRFIK---SDGDKLTLPYVY------------SRSQRDGSEAWKWRG 210

Query: 380 WD 381
WD
Sbjct: 211 WD 212


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z3055     CARBMTKINASE  35     2e-04    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 34.8 bits (80), Expect = 2e-04
Identities = 22/92 (23%), Positives = 36/92 (39%), Gaps = 9/92 (9%)

Query: 37 AQKLAADDDVDMLVILTACYFHDIVSLAKNHPQRQSSSILAAEETRRLLREEFVQFPA-- 94
+KLA + + D+ +ILT + +L + Q + EE R+ E F A
Sbjct: 219 GEKLAEEVNADIFMILTDV---NGAALYYGTEKEQWLREVKVEELRKYYEEG--HFKAGS 273

Query: 95 --EKIEAVCHAIAAHSFSAQIAPLTTEAKIVQ 124
K+ A I A IA L + ++
Sbjct: 274 MGPKVLAAIRFIEWGGERAIIAHLEKAVEALE 305


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
Z3057     ECOLIPORIN  307    e-107    E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 307 bits (789), Expect = e-107
Identities = 143/204 (70%), Positives = 160/204 (78%), Gaps = 2/204 (0%)

Query: 11 MKRKVLAMLVPALLVAGAANAAEIYNKDGNKLDLYGKVAGLHYFSDDASSDGDMSYARIG 70
MKRKVLA+++PALL AGAA+AAEIYNKDGNKLDLYGKV GLHYFSDD+S DGD +Y R+G
Sbjct: 1 MKRKVLALVIPALLAAGAAHAAEIYNKDGNKLDLYGKVDGLHYFSDDSSKDGDQTYMRVG 60

Query: 71 FKGETQIADQFTGYGQWEFNIGANGPESDKGNTATRLAFAGFGFGQNGTFDYGRNYGVVY 130
FKGETQI DQ TGYGQWE+N+ AN E + N+ TRLAFAG FG G+FDYGRNYGV+Y
Sbjct: 61 FKGETQINDQLTGYGQWEYNVQANTTEGEGANSWTRLAFAGLKFGDYGSFDYGRNYGVLY 120

Query: 131 DVEAWTDMLPEFGGDTYAGADNFMNGRANSVATYRNNGFFGQVDGLNFALQYQGNNEKSG 190
DVE WTDMLPEFGGD+Y ADN+M GRAN VATYRN FFG VDGLNFALQYQG NE
Sbjct: 121 DVEGWTDMLPEFGGDSYTYADNYMTGRANGVATYRNTDFFGLVDGLNFALQYQGKNESQS 180

Query: 191 LFDQEGSGNG--NGRKLAKENGDG 212
D N NG + +NGDG
Sbjct: 181 ADDVNIGTNNRNNGDDIRYDNGDG 204


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
Z3058     ECOLIPORIN  143    8e-45    E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 143 bits (363), Expect = 8e-45
Identities = 85/187 (45%), Positives = 100/187 (53%), Gaps = 64/187 (34%)

Query: 1 MSTSYDFDFGLSLGAAYSNSDRTDNQVHKGTHNTRYGDRFDATAGGETAEAWTVGAKYDA 60
+ST+YD G S GAAY+ SDRT+ QV+ G AGG+ A+AWT G KYDA
Sbjct: 207 ISTTYDIGMGFSAGAAYTTSDRTNEQVNAGGTI----------AGGDKADAWTAGLKYDA 256

Query: 61 NNVYLAAMYAEPRNMTGYGDADA-----IANKTQNFEVVAQYQFDFG------------- 102
NN+YLA MY+E RNMT YG D +ANKTQNFEV AQYQFDFG
Sbjct: 257 NNIYLATMYSETRNMTPYGKTDKGYDGGVANKTQNFEVTAQYQFDFGLRPAVSFLMSKGK 316

Query: 103 ------------------------------------KINLLDNDDDFYKENGIATDDIVA 126
KINLLD+DD FYK+ GI+TDDIVA
Sbjct: 317 DLTYNNVNGDDKDLVKYADVGATYYFNKNFSTYVDYKINLLDDDDPFYKDAGISTDDIVA 376

Query: 127 VGLVYQF 133
+G+VYQF
Sbjct: 377 LGMVYQF 383


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z3060     PF06580  32     0.005    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 31.8 bits (72), Expect = 0.005
Identities = 35/181 (19%), Positives = 61/181 (33%), Gaps = 37/181 (20%)

Query: 290 ENILFLARADKNNVLVKLDSLS----------------LNKEVENLLDYL--EYLSDEKE 331
NI L D L SLS L E+ + YL + E
Sbjct: 180 NNIRALILEDPTKAREMLTSLSELMRYSLRYSNARQVSLADELTVVDSYLQLASIQFEDR 239

Query: 332 ICFKVECNQQIFADKI---LLQRMLSNLIVNAIRYSPEKSRIHITSFLDTNSYLNIDIAS 388
+ F+ + N I ++ L+Q ++ N I + I P+ +I + D N + +++ +
Sbjct: 240 LQFENQINPAIMDVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKD-NGTVTLEVEN 298

Query: 389 PGAKINEPEKLFRRFWRGDNSRHSVGQGLGLSLVKA-IAELHGGSATYHYLNKHNVFRIT 447
G+ + K G GL V+ + L+G A K
Sbjct: 299 TGSLALKNTKE--------------STGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAM 344

Query: 448 L 448
+
Sbjct: 345 V 345


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
Z3061     HTHFIS  82     2e-20    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 82.2 bits (203), Expect = 2e-20
Identities = 30/117 (25%), Positives = 60/117 (51%), Gaps = 1/117 (0%)

Query: 2 KILLIEDNQRTQEWVTQGLSEAGYVIDAVSDGRDGLYLALKDDYALIILDIMLPGMDGWQ 61
IL+ +D+ + + Q LS AGY + S+ D L++ D+++P + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 ILQTLRTA-KQTPVICLTARDSVDDRVRGLDSGANDYLVKPFSFSELLARVRAQLRQ 117
+L ++ A PV+ ++A+++ ++ + GA DYL KPF +EL+ + L +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121


S.No  Start  End    Bias  Virulence  Insertion elements  Prediction
96    Z3238  Z3248  N     Y          N                   Pathogenicity Island (unbiased-composition)

LocusTag  DNBias  CDNBias  %GCBias    Product
Z3238     -2      12       1.685428   chaperone
Z3239     -3      13       2.480565   chaperonin
Z3240     -3      17       3.803153   hypothetical protein
Z3241     -3      18       4.240856   hypothetical protein
Z3242     -3      18       4.215009   hypothetical protein
Z3243     -2      18       4.143109   multidrug efflux system subunit MdtA
Z3244     -2      19       3.929689   multidrug efflux system subunit MdtB
Z3245     -2      18       3.308149   multidrug efflux system subunit MdtC
Z3246     -2      12       1.376206   multidrug efflux system protein MdtE
Z3247     -2      9        0.146932   signal transduction histidine-protein kinase
Z3248     -1      10       -1.377797  DNA-binding transcriptional regulator BaeR
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z3238     SHAPEPROTEIN  51     4e-09    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 50.9 bits (122), Expect = 4e-09
Identities = 33/129 (25%), Positives = 57/129 (44%), Gaps = 20/129 (15%)

Query: 132 AMMLH-IRQQAQAQLPEAITQAVIGRPINFQGLGGDEANAQAQGILERAAKRAGFRDVVF 190
M+ H I+Q + ++ P+ + E A + +A+ AG R+V
Sbjct: 89 KMLQHFIKQVHSNSFMRPSPRVLVCVPVGATQV---ERRA-----IRESAQGAGAREVFL 140

Query: 191 QYEPVAAGLDYEATLQEEKRVLVVDIGGGTTDCSLLLMGPQWRSRLDREASLLGHSGCRI 250
EP+AA + + E +VVDIGGGTT+ +++ + ++ S RI
Sbjct: 141 IEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVAVISLN-----------GVVYSSSVRI 189

Query: 251 GGNDLDIAL 259
GG+ D A+
Sbjct: 190 GGDRFDEAI 198



Score = 33.2 bits (76), Expect = 0.002
Identities = 32/137 (23%), Positives = 55/137 (40%), Gaps = 23/137 (16%)

Query: 332 RLSYRLV---RSAEECKIALSSV--AETRASLPFISDELAT------LISQQGLESALSQ 380
R +Y + +AE K + S + + LA ++ + AL +
Sbjct: 203 RRNYGSLIGEATAERIKHEIGSAYPGDEVREIEVRGRNLAEGVPRGFTLNSNEILEALQE 262

Query: 381 PLARIQEQVQLALDNAQEKPDV--------IYLTGGSARSPLIKKALAEQLPGIPIAGGD 432
PL I V +AL+ Q P++ + LTGG A + + L E+ GIP+ +
Sbjct: 263 PLTGIVSAVMVALE--QCPPELASDISERGMVLTGGGALLRNLDRLLMEET-GIPVVVAE 319

Query: 433 D-FGSVTAGLARWAEVV 448
D V G + E++
Sbjct: 320 DPLTCVARGGGKALEMI 336


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
Z3243     RTXTOXIND  33     0.002    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 32.9 bits (75), Expect = 0.002
Identities = 25/108 (23%), Positives = 47/108 (43%), Gaps = 5/108 (4%)

Query: 125 EASVASAQLQLDWSRITAPVDGRV-GLKQVDVGNQISSGDTTGIVVITQTHPIDLLFTLP 183
+A + + S I APV +V LK G +++ +T +V++ + +++ +
Sbjct: 315 TLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETL-MVIVPEDDTLEVTALVQ 373

Query: 184 ESDIATVVQAQKAGKPLVVEAWDRTNSKKL-SEGTLLSLDNQIDATTG 230
DI + Q A + VEA+ T L + ++LD D G
Sbjct: 374 NKDIGFINVGQNA--IIKVEAFPYTRYGYLVGKVKNINLDAIEDQRLG 419



Score = 28.6 bits (64), Expect = 0.038
Identities = 8/53 (15%), Positives = 16/53 (30%), Gaps = 5/53 (9%)

Query: 4 SYKSRWVIVIVVVIAAIAAFWFWQGRNDSQSAAPG-----ATKQAQQSPAGGR 51
S + R V ++ IA G+ + + A G + +
Sbjct: 54 SRRPRLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSI 106


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z3244     ACRIFLAVINRP  917    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 917 bits (2372), Expect = 0.0
Identities = 298/1036 (28%), Positives = 513/1036 (49%), Gaps = 29/1036 (2%)

Query: 13 SRLFIMRPVATTLLMVAILLAGIIGYRALPVSALPEVDYPTIQVVTLYPGASPDVMTSAV 72
+ FI RP+ +L + +++AG + LPV+ P + P + V YPGA + V
Sbjct: 2 ANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTV 61

Query: 73 TAPLERQFGQMSGLKQMSSQS-SGGASVITLQFQLTLPLDVAEQEVQAAINAATNLLPSD 131
T +E+ + L MSS S S G+ ITL FQ D+A+ +VQ + AT LLP +
Sbjct: 62 TQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQE 121

Query: 132 LPNPPVYSKVNPADPPIMTLAVTSTAMPMTQVE--DMVETRVAQKISQISGVGLVTLSGG 189
+ + S + +M S TQ + D V + V +S+++GVG V L G
Sbjct: 122 VQQQGI-SVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 190 QRPAVRVKLNAQAIAALGLTSETVRTAITGANVNSAKGSLDGP------SRAVTLSANDQ 243
Q A+R+ L+A + LT V + N A G L G ++ A +
Sbjct: 181 QY-AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 244 MQSAEEYRQLII-AYQNGAPIRLGDVATVEQGAENSWLGAWANKEQAIVMNVQRQPGANI 302
++ EE+ ++ + +G+ +RL DVA VE G EN + A N + A + ++ GAN
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANA 299

Query: 303 ISTADSIRQMLPQLTESLPKSVKVTVLSDRTTNIRASVDDTQFELMMAIALVVMIIYLFL 362
+ TA +I+ L +L P+ +KV D T ++ S+ + L AI LV +++YLFL
Sbjct: 300 LDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFL 359

Query: 363 RNIPATIIPGVAVPLSLIGTFAVMVFLDFSINNLTLMALTIATGFVVDDAIVVIENISRY 422
+N+ AT+IP +AVP+ L+GTFA++ +SIN LT+ + +A G +VDDAIVV+EN+ R
Sbjct: 360 QNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERV 419

Query: 423 I-EKGEKPLAAALKGAGEIGFTIISLTFSLIAVLIPLLFMGDIVGRLFREFAITLAVAIL 481
+ E P A K +I ++ + L AV IP+ F G G ++R+F+IT+ A+
Sbjct: 420 MMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMA 479

Query: 482 ISAVVSLTLTPMMCARML---SQESLRKQNRFSRASEKMFDRIIAAYGRGLAKVLNHPWL 538
+S +V+L LTP +CA +L S E + F FD + Y + K+L
Sbjct: 480 LSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 539 TLSVALSTLLLSVLLWVFIPKGFFPVQDNGIIQGTLQAPQSSSFANMAQRQRQVADVILQ 598
L + + V+L++ +P F P +D G+ +Q P ++ + QV D L+
Sbjct: 540 YLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLK 599

Query: 599 DPA--VQSLTSFVGVDGTNPSLNSARLQINLKPLDERDDR---VQKVIARLQTAVDKVPG 653
+ V+S+ + G + + N+ ++LKP +ER+ + VI R + + K+
Sbjct: 600 NEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIR- 658

Query: 654 VDLFLQPTQDLTIDTQVSRTQYQFTLQ---ATSLDALSTWVPQLMEKLQQLP-QISDVSS 709
D F+ P I + T + F L DAL+ QL+ Q P + V
Sbjct: 659 -DGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRP 717

Query: 710 DWQDKGLVAYVNVDRDSASRLGISMADVDNALYNAFGQRLISTIYTQANQYRVVLEHNTE 769
+ + + VD++ A LG+S++D++ + A G ++ + ++ ++ + +
Sbjct: 718 NGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAK 777

Query: 770 NTPGLAALDTIRLTSSDGGVVPLSSIAKIEQRFAPLSINHLDQFPVTTISFNVPDNYSLG 829
+D + + S++G +VP S+ + + + P I S G
Sbjct: 778 FRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSG 837

Query: 830 DAVQAIMDTEKTLNLPVDITTQFQGSTLAFQSALGSTVWLIVAAVVAMYIVLGILYESFI 889
DA A+M+ + LP I + G + + + L+ + V +++ L LYES+
Sbjct: 838 DA-MALMENLAS-KLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWS 895

Query: 890 HPITILSTLPTAGVGALLALMIAGSELDVIAIIGIILLIGIVKKNAIMMIDFALAAEREQ 949
P++++ +P VG LLA + + DV ++G++ IG+ KNAI++++FA ++
Sbjct: 896 IPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKE 955

Query: 950 GMSPRDAIYQACLLRFRPILMTTLAALLGALPLMLSTGVGAELRRPLGIGMVGGLIVSQV 1009
G +A A +R RPILMT+LA +LG LPL +S G G+ + +GIG++GG++ + +
Sbjct: 956 GKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATL 1015

Query: 1010 LTLFTTPVIYLLFDRL 1025
L +F PV +++ R
Sbjct: 1016 LAIFFVPVFFVVIRRC 1031


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z3245     ACRIFLAVINRP  777    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 777 bits (2009), Expect = 0.0
Identities = 243/877 (27%), Positives = 429/877 (48%), Gaps = 32/877 (3%)

Query: 45 QLAPTISQIDGVGDVDVGGSSLPAVRVGLNPQALFNQGVSLDDVRTAISNANVRKPQG-- 102
+ T+S+++GVGDV + G+ A+R+ L+ L ++ DV + N + G
Sbjct: 161 NVKDTLSRLNGVGDVQLFGAQY-AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQL 219

Query: 103 ----ALEDGTHRWQIQTNDELKTAAEYQPLIIHYN-NGGAVRLGDVATVTDSVQDVRNAG 157
AL I K E+ + + N +G VRL DVA V ++
Sbjct: 220 GGTPALPGQQLNASIIAQTRFKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIA 279

Query: 158 MTNAKPAILLMIRKLPEANIIQTVDSIRAKLPELQETIPAAIDLQIAQDRSPTIRASLEE 217
N KPA L I+ AN + T +I+AKL ELQ P + + D +P ++ S+ E
Sbjct: 280 RINGKPAAGLGIKLATGANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHE 339

Query: 218 VEQTLIISVALVILVVFLFLRSGRATIIPAVAVPVSLIGTFAAMYLCGFSLNNLSLMALT 277
V +TL ++ LV LV++LFL++ RAT+IP +AVPV L+GTFA + G+S+N L++ +
Sbjct: 340 VVKTLFEAIMLVFLVMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMV 399

Query: 278 IATGFVVDDAIVVLENIARHL-EAGMKPLQAALQGTREVGFTVLSMSLSLVAVFLPLLLM 336
+A G +VDDAIVV+EN+ R + E + P +A + ++ ++ +++ L AVF+P+
Sbjct: 400 LAIGLLVDDAIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFF 459

Query: 337 GGLPGRLLREFAVTLSVAIGISLLVSLTLTPMMCGWMLKASKPREQKRLRGFG----RML 392
GG G + R+F++T+ A+ +S+LV+L LTP +C +LK + GF
Sbjct: 460 GGSTGAIYRQFSITIVSAMALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTF 519

Query: 393 VALQQGYGKSLKWVLNHTRLVGMVLLGTIALNIWLYISIPKTFFPEQDTGVLMGGIQADQ 452
Y S+ +L T ++ +A + L++ +P +F PE+D GV + IQ
Sbjct: 520 DHSVNHYTNSVGKILGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPA 579

Query: 453 SISFQ----AMRGKLQDFMKIIRD-DPAVDNVTGFT-GGSRVNSGMMFITLKPRDERS-- 504
+ + + ++K + +V V GF+ G N+GM F++LKP +ER+
Sbjct: 580 GATQERTQKVLDQVTDYYLKNEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGD 639

Query: 505 -ETAQQIIDRLRVKLAKEPGANLFLMAVQDIRVGGRQSNASYQYTLLSDDLAALREWEPK 563
+A+ +I R +++L K + + I G + ++ L D + +
Sbjct: 640 ENSAEAVIHRAKMELGKIRDGFVIPFNMPAIVELGTATGFDFE---LIDQAGLGHDALTQ 696

Query: 564 IRKKLATL-----PELADVNSDQQDNGAEMNLVYDRDTMARLGIDVQAANSLLNNAFGQR 618
R +L + L V + ++ A+ L D++ LG+ + N ++ A G
Sbjct: 697 ARNQLLGMAAQHPASLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGT 756

Query: 619 QISTIYQPMNQYKVVMEVDPRYTQDISALEKMFVINNEGKAIPLSYFAKWQPANAPLSVN 678
++ K+ ++ D ++ ++K++V + G+ +P S F +
Sbjct: 757 YVNDFIDRGRVKKLYVQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLE 816

Query: 679 HQGLSAASTISFNLPTGKSLSDASAAIDRAMTQLGVPSTVRGSFAGTAQVFQETMNSQVI 738
+ I G S DA A ++ ++L P+ + + G + + + N
Sbjct: 817 RYNGLPSMEIQGEAAPGTSSGDAMALMENLASKL--PAGIGYDWTGMSYQERLSGNQAPA 874

Query: 739 LIIAAIATVYIVLGILYESYVHPLTILSTLPSAGVGALLALELFNAPFSLIALIGIMLLI 798
L+ + V++ L LYES+ P++++ +P VG LLA LFN + ++G++ I
Sbjct: 875 LVAISFVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTI 934

Query: 799 GIVKKNAIMMVDFALEAQRHGNLTPQEAIFQACLLRFRPIMMTTLAALFGALPLVLSGGD 858
G+ KNAI++V+FA + EA A +R RPI+MT+LA + G LPL +S G
Sbjct: 935 GLSAKNAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGA 994

Query: 859 GSELRQPLGITIVGGLVMSQLLTLYTTPVVYLFFDRL 895
GS + +GI ++GG+V + LL ++ PV ++ R
Sbjct: 995 GSGAQNAVGIGVMGGMVSATLLAIFFVPVFFVVIRRC 1031



Score = 49.1 bits (117), Expect = 7e-08
Identities = 13/39 (33%), Positives = 20/39 (51%)

Query: 6 LFIYRPVATILLSVAITLCGILGFRMLPVAPLPQVDFPV 44
FI RP+ +L++ + + G L LPVA P + P
Sbjct: 4 FFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPA 42


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z3246     TCRTETB  126    8e-34    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 126 bits (317), Expect = 8e-34
Identities = 97/429 (22%), Positives = 188/429 (43%), Gaps = 23/429 (5%)

Query: 20 FMQSLDTTIVNTALPSMAQSLGESPLHMHMVIVSYVLTVAVMLPASGWLADKVGVRNIFF 79
F L+ ++N +LP +A + P + V +++LT ++ G L+D++G++ +
Sbjct: 24 FFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLL 83

Query: 80 TAIVLFTLGSLFCALSGTLNELL-LARALQGVGGAMMVPVGRLTVMKIVPREQYMAAMTF 138
I++ GS+ + + LL +AR +QG G A + + V + +P+E A
Sbjct: 84 FGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGL 143

Query: 139 VTLPGQVGPLLGPALGGLLVEYASWHWIFLINIPVGIIGAIATLM-LMPNYTMQTRRFDL 197
+ +G +GPA+GG++ Y HW +L+ IP+ I + LM L+ FD+
Sbjct: 144 IGSIVAMGEGVGPAIGGMIAHY--IHWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDI 201

Query: 198 SGFLLLAVGMAVLTLALDGSKGTGLSPLAITGLVAVGVVALVLYLLHARNNNRALFSLKL 257
G +L++VG+ L + + V V++ ++++ H R L
Sbjct: 202 KGIILMSVGIVFFMLF---------TTSYSISFLIVSVLSFLIFVKHIRKVTDPFVDPGL 252

Query: 258 FRTRTFSLGLAGSFAGRIGSGMLPFMTPVFLQIGLGFSPFHAG-LMMIPMVLGSMGMKRI 316
+ F +G+ M P ++ S G +++ P + + I
Sbjct: 253 GKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYI 312

Query: 317 VVQVVNRFGYRRVLVATTLGLSLVTLLFMTTALL----GWYYVLPFVLFLQGMVNSTRFS 372
+V+R G VL +G++ +++ F+T + L W+ + V L G+ S +
Sbjct: 313 GGILVDRRGPLYVL---NIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGL--SFTKT 367

Query: 373 SMNTLTLKDLPDNLASSGNSLLSMIMQLSMSIGVTIAGLLLGLFGSQHVSVDSSTTQTVF 432
++T+ L A +G SLL+ LS G+ I G LL + + Q+ +
Sbjct: 368 VISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLSIPLLDQRLLPMEVDQSTY 427

Query: 433 MYTWLSMAF 441
+Y+ L + F
Sbjct: 428 LYSNLLLLF 436


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z3247     BCTERIALGSPF  31     0.009    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 31.3 bits (71), Expect = 0.009
Identities = 27/93 (29%), Positives = 34/93 (36%), Gaps = 27/93 (29%)

Query: 173 LATLLAALATFLLA-------------RGLLAPVKRLVDGTHKLAAGDFTTRVTPTSEDE 219
LATL+AA A L+A V+ V H LA + P S +
Sbjct: 77 LATLVAASMPLEEALDAVAKQSEKPHLSQLMAAVRSKVMEGHSLAD---AMKCFPGSFER 133

Query: 220 L-----------GKLAQDFNQLASTLEKNQQMR 241
L G L N+LA E+ QQMR
Sbjct: 134 LYCAMVAAGETSGHLDAVLNRLADYTEQRQQMR 166


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
Z3248     HTHFIS  58     1e-12    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 58.3 bits (141), Expect = 1e-12
Identities = 17/80 (21%), Positives = 40/80 (50%), Gaps = 1/80 (1%)

Query: 11 PRILIVEDEPKLGQLLIDYLRAASYAPTLISHGDQVLSYVRQTPPDLILLDLMLPGTDGL 70
IL+ +D+ + +L L A Y + S+ + ++ DL++ D+++P +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 71 TLCREIR-RFSDIPIVMVTA 89
L I+ D+P+++++A
Sbjct: 64 DLLPRIKKARPDLPVLVMSA 83


S.No  Start  End    Bias  Virulence  Insertion elements  Prediction
97    Z3341  Z3348  N     Y          N                   Pathogenicity Island (unbiased-composition)

LocusTag  DNBias  CDNBias  %GCBias    Product
Z3341     1       26       -1.151486  hypothetical protein
Z3342     1       27       -1.625832  hypothetical protein
Z3343     2       38       -6.884197  shiga-like toxin 1 subunit B encoded within
Z3344     3       37       -5.201061  shiga-like toxin 1 subunit A encoded within
Z3345     2       33       -2.230611  antitermination protein Q for prophage CP-933V
Z3346     3       35       -2.436899  hypothetical protein
Z3347     3       33       -2.793915  hypothetical protein
Z3348     1       32       -2.588499  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
Z3341     GPOSANCHOR  28     0.008    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 28.5 bits (63), Expect = 0.008
Identities = 20/65 (30%), Positives = 34/65 (52%), Gaps = 3/65 (4%)

Query: 69 EAVKEVLRSEEVRSALKQKLRHNLEARLDAEVDAILDELLGAPAAPEPEGIAGEGSASDS 128
A++++ + E L +K + L+A+L+AE A+ ++L A A E + G ASDS
Sbjct: 410 AALEKLNKELEESKKLTEKEKAELQAKLEAEAKALKEKL--AKQAEELAKL-RAGKASDS 466

Query: 129 GDPTP 133
P
Sbjct: 467 QTPDA 471


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z3343     FLGMOTORFLIM  26     0.024    Flagellar motor switch protein FliM signature.
		>FLGMOTORFLIM#Flagellar motor switch protein FliM signature.

Length = 344

Score = 26.0 bits (57), Expect = 0.024
Identities = 7/36 (19%), Positives = 17/36 (47%)

Query: 38 DTFTVKVGDKELFTNRWNLQSLLLSAQITGMTVTIK 73
D F + +G+++ F + + ++AQI +
Sbjct: 293 DPFVLSIGNRKKFLCQPGVVGKKIAAQILERIESTS 328


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
Z3344     SHIGARICIN  120    3e-34    Ribosome inactivating protein family signature.
		>SHIGARICIN#Ribosome inactivating protein family signature.

Length = 289

Score = 120 bits (303), Expect = 3e-34
Identities = 49/283 (17%), Positives = 112/283 (39%), Gaps = 40/283 (14%)

Query: 3 IIIFRVLTFFFVIFSVNVVAKE----FTLDFSTAKTYVDSLNVIRSAIGTPLQTISSGGT 58
+I F V + + + A E F L +T+ +Y ++ +R A+ +
Sbjct: 1 MIRFLVFSLLILTLFLTAPAVEGDVSFRLSGATSSSYGVFISNLRKALPYERKL-----Y 55

Query: 59 SLLMIDSGTGDNLFAVDVRGIDPEEGRFNNLRLIVERNNLYVTGFVNRTNNVFYRFADF- 117
+ ++ S + + + + + + ++ N+YV G+ + Y F +
Sbjct: 56 DIPLLRSTLPGSQRYALIHLTNYADE---TISVAIDVTNVYVMGYRA--GDTSYFFNEAS 110

Query: 118 ----SHVTFPGTTA-VTLSGDSSYTTLQRVAGISRTGMQINRHSLTTSYLDLMSHSGTSL 172
+ F VTL +Y LQ AG R + + +L ++ L ++
Sbjct: 111 ATEAAKYVFKDAKRKVTLPYSGNYERLQIAAGKIRENIPLGLPALDSAITTLFYYNA--- 167

Query: 173 TQSVARAMLRFVTVTAEALRFRQIQRGFRTTLDDLSGRSYVMTAEDVDLTLNWGRLSSVL 232
S A A++ + T+EA R++ I++ +D +++ + + L +W LS +
Sbjct: 168 -NSAASALMVLIQSTSEAARYKFIEQQIGKRVDK----TFLPSLAIISLENSWSALSKQI 222

Query: 233 PDYHGQDSV----------RVGRISFGSINA--ILGSVALILN 263
+ + R++ +++A + ++AL+LN
Sbjct: 223 QIASTNNGQFETPVVLINAQNQRVTITNVDAGVVTSNIALLLN 265


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
Z3348     60KDINNERMP  28     0.014    60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 28.0 bits (62), Expect = 0.014
Identities = 9/37 (24%), Positives = 16/37 (43%)

Query: 88 MTYMKAYQKAWKEHRDRYQQDMEKLESENMELRRKLG 124
M M+ Q + R+R D +++ E M L +
Sbjct: 380 MAKMRMLQPKIQAMRERLGDDKQRISQEMMALYKAEK 416


98  Z3383  Z3389  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias   Product
Z3383     1       15       1.943576  D-alanyl-D-alanine endopeptidase
Z3384     0       16       2.529489  hypothetical protein
Z3385     -1      17       3.688708  hypothetical protein
Z3386     -1      18       3.763213  acetoin dehydrogenase
Z3387     -2      16       3.885728  multidrug resistance outer membrane protein
Z3388     -2      20       4.123176  hypothetical protein
Z3389     -1      20       4.443490  tRNA-dihydrouridine synthase C
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
Z3383     BLACTAMASEA  44     3e-07    Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 44.0 bits (104), Expect = 3e-07
Identities = 43/195 (22%), Positives = 77/195 (39%), Gaps = 18/195 (9%)

Query: 4 MPKFRVSLFSLALMLAVPLAPQAVAKTAAATTASQPEIASGSAMI-VDLNTNKVIYSNHP 62
M R+ + SL + +PLA A + S+ +++ MI +DL + + + +
Sbjct: 1 MRYIRLCIISL--LATLPLAVHASPQPLEQIKLSESQLSGRVGMIEMDLASGRTLTAWRA 58

Query: 63 DLVRPIASISKLMTAMVVLDARLPLDEKLKVDISQTPEMKGVYSRV---RLNSEISRKDM 119
D P+ S K++ VL DE+L+ I + YS V L ++ ++
Sbjct: 59 DERFPMMSTFKVVLCGAVLARVDAGDEQLERKIHYRQQDLVDYSPVSEKHLADGMTVGEL 118

Query: 120 LLLALMSSENRAAASLAHHYPGGYKAFIKAMNAKAKSLGMNNTRFV--EPTGLS-----V 172
A+ S+N +AA+L GG + A + +G N TR E
Sbjct: 119 CAAAITMSDN-SAANLLLATVGG----PAGLTAFLRQIGDNVTRLDRWETELNEALPGDA 173

Query: 173 HNVSTARDLTKLLIA 187
+ +T + L
Sbjct: 174 RDTTTPASMAATLRK 188


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z3385     BCTERIALGSPF  29     0.019    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 28.6 bits (64), Expect = 0.019
Identities = 5/33 (15%), Positives = 16/33 (48%), Gaps = 2/33 (6%)

Query: 164 WLHNLDQHLKHW-VWLILVVVL-VVGVRWWLKR 194
L + ++ + W++L ++ + R L++
Sbjct: 215 VLMGMSDAVRTFGPWMLLALLAGFMAFRVMLRQ 247


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z3386     DHBDHDRGNASE  113    2e-32    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 113 bits (283), Expect = 2e-32
Identities = 70/253 (27%), Positives = 115/253 (45%), Gaps = 12/253 (4%)

Query: 3 QVAIITASDSGIGKECALLLAQQGFDIGITWXSDEEGAKDTAREVVSHGVRAEIVQLDLG 62
++A IT + GIG+ A LA QG I + E + + + AE D+
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHI-AAVDYNPEKLEKVVSSLKAEARHAEAFPADVR 67

Query: 63 NLPEGAQALEKLIQRLGRIDVLVNNAGAMTKVPFLDMAFDEWRKIFTVDVDGAFLCSQIA 122
+ + ++ + +G ID+LVN AG + ++ +EW F+V+ G F S+
Sbjct: 68 DSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSV 127

Query: 123 ARQMVKQGQGGRIINITSVHEHTPLPDASAYTAAKHALGGLTKAMALELVRHKILVNAVA 182
++ M+ + + G I+ + S P +AY ++K A TK + LEL + I N V+
Sbjct: 128 SKYMMDR-RSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVS 186

Query: 183 PGAIATPM-------NGMDDSDVKPDAEP---SIPLRRFGATHEIASLVAWLCSEGANYT 232
PG+ T M + +K E IPL++ +IA V +L S A +
Sbjct: 187 PGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGHI 246

Query: 233 TGQSLIVDGGFML 245
T +L VDGG L
Sbjct: 247 TMHNLCVDGGATL 259


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z3389     SHAPEPROTEIN  29     0.029    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 28.6 bits (64), Expect = 0.029
Identities = 32/127 (25%), Positives = 53/127 (41%), Gaps = 5/127 (3%)

Query: 122 GAKAMREAVPAHLPVSVKVRLGWDSGEK-KFEIADAVQQAGATELVVHGRTKEQGY-RAE 179
G EA+ ++ + +G + E+ K EI A E+ V GR +G R
Sbjct: 190 GGDRFDEAIINYVRRNYGSLIGEATAERIKHEIGSAYPGDEVREIEVRGRNLAEGVPRGF 249

Query: 180 HIDWQAIGE-IRQRLNIPVIANGEIWDWQSAQECMAISGCDSVMIGRGALNIPNLSRVVK 238
++ I E +++ L V A + + IS V+ G GAL + NL R++
Sbjct: 250 TLNSNEILEALQEPLTGIVSAVMVALEQCPPELASDISERGMVLTGGGAL-LRNLDRLL- 307

Query: 239 YNEPRMP 245
E +P
Sbjct: 308 MEETGIP 314


99  Z3629  Z3632  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
Z3629     -1      32       -9.397335  multidrug resistance protein Y
Z3630     -1      32       -8.447542  multidrug resistance protein K
Z3631     0       30       -7.719747  DNA-binding transcriptional activator EvgA
Z3632     0       30       -7.334811  hybrid sensory histidine kinase in two-component
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z3629     TCRTETB  53     9e-10    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 53.0 bits (127), Expect = 9e-10
Identities = 60/268 (22%), Positives = 105/268 (39%), Gaps = 24/268 (8%)

Query: 39 VAIPTILGGYICDNFSWGWIFLINVPMGIIVLTLCLTLLKGRETETSPVKMNLPGLTLLV 98
+ +GG I W +L+ +PM I+ L L +E ++ G+ L+
Sbjct: 152 EGVGPAIGGMIAHYIHWS--YLLLIPMITIITVPFLMKLLKKEVRIKG-HFDIKGIILMS 208

Query: 99 LGVGGLQIMLDKGRDLDWFNSSTIIILTVVSVIFLISLVIWESTSENPILDLSLFKSRNF 158
+G+ + ML F +S I +VSV+ + V +P +D L K+ F
Sbjct: 209 VGI--VFFML--------FTTSYSISFLIVSVLSFLIFVKHIRKVTDPFVDPGLGKNIPF 258

Query: 159 TIGIVSITCAYLFYSGAIVLMPQLLQETMGYNAIWAGLAYAPIGIMPLLISPLIG----- 213
IG++ + +G + ++P ++++ + G G M ++I IG
Sbjct: 259 MIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYIGGILVD 318

Query: 214 RYGNKIDMRVLVTFSFLMYAVCYYWRSVTFMPTIDFTGIILPQFFQGFAVACFFLPLTTI 273
R G + + VTF +V + S T F II+ G + ++TI
Sbjct: 319 RRGPLYVLNIGVTFL----SVSFLTASFLLETTSWFMTIIIVFVLGGLSFTK--TVISTI 372

Query: 274 SFSGLPDNKFANASSMSNFFRTLSGSVG 301
S L + S+ NF LS G
Sbjct: 373 VSSSLKQQEAGAGMSLLNFTSFLSEGTG 400


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
Z3630     RTXTOXIND  79     4e-18    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 78.7 bits (194), Expect = 4e-18
Identities = 63/412 (15%), Positives = 122/412 (29%), Gaps = 96/412 (23%)

Query: 13 RRKYFSLLVIVLFIAFSGAYAYWSMELEDMISTDDAYVT-GNADPISAQVSGSVTVVNHK 71
RR I+ F+ + + ++E + + + G + I + V + K
Sbjct: 55 RRPRLVAYFIMGFLVIAFILSVLG-QVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVK 113

Query: 72 DTNYVRQGDILVSLDKTDATIALNKA---------------------------------- 97
+ VR+GD+L+ L A K
Sbjct: 114 EGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDE 173

Query: 98 ------------------KNNLANIVRQTNKLYLQDKQYSAEVASARIQ---YQQSLEDY 136
K + Q + L + AE + + Y+
Sbjct: 174 PYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVE 233

Query: 137 NRRV----PLAKQGVISKE----------TLEHTKDTLISSKAALNAAIQAYKANKALVM 182
R+ L + I+K + S + + I + K LV
Sbjct: 234 KSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVT 293

Query: 183 N-------TPLNR-QPQVVEAADATKEAWLALKRTDIKSPVTGYIAQRSVQ-VGETVSPG 233
L + + + + + I++PV+ + Q V G V+
Sbjct: 294 QLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTA 353

Query: 234 QSLMAVVPARQ-MWVNANFKETQLTDVRIGQSVNIISDLYGENVVFHGRVTGINMGTGNA 292
++LM +VP + V A + + + +GQ+ I + F G +G
Sbjct: 354 ETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVE------AFPYTRYGYLVGK--- 404

Query: 293 FSLLPAQNATGNWIKIVQRVPVEVSLDPKELMEH----PLRIGLSMTATIDT 340
+ + +V V +S++ L PL G+++TA I T
Sbjct: 405 VKNINLDAIEDQRLGLVFNVI--ISIEENCLSTGNKNIPLSSGMAVTAEIKT 454


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
Z3631     HTHFIS  48     5e-09    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 48.3 bits (115), Expect = 5e-09
Identities = 22/147 (14%), Positives = 52/147 (35%), Gaps = 31/147 (21%)

Query: 5 IIDDHPLAIAAIRNLLIKNDIEILAELTEGGSAVQRVETLKPDIVIIDVDIPGVNGIQVL 64
+ DD + L + ++ + + + + D+V+ DV +P N +L
Sbjct: 8 VADDDAAIRTVLNQALSRAGYDVRIT-SNAATLWRWIAAGDGDLVVTDVVMPDENAFDLL 66

Query: 65 ETLRKRQYSGIIIIVSAKNDHFYGKHCADAGANGFVSKKEGMNNIIAAIEAAKNGYCYF- 123
++K + ++++SA+N + AI+A++ G +
Sbjct: 67 PRIKKARPDLPVLVMSAQNT------------------------FMTAIKASEKGAYDYL 102

Query: 124 --PFSLNRFVGSLTSDQQKLDSLSKQE 148
PF L + + L ++
Sbjct: 103 PKPFDLTE---LIGIIGRALAEPKRRP 126


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
Z3632     HTHFIS  76     2e-16    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 76.4 bits (188), Expect = 2e-16
Identities = 30/105 (28%), Positives = 51/105 (48%)

Query: 960 SILIADDHPTNRLLLKRQLNLLGYDVDEATDGVQALHKVSMQHYDLLITDVNMPNVDGFE 1019
+IL+ADD R +L + L+ GYDV ++ ++ DL++TDV MP+ + F+
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 1020 LTRKLREQNSSLPIWGLTANAQANEREKGLNCGMNLCLFKPLTLD 1064
L ++++ LP+ ++A K G L KP L
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLT 109


100  Z3982  Z3988  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias   Product
Z3982     -2      13       2.519364  transporter
Z3983     -2      14       2.030473  hypothetical protein
Z3984     -2      15       1.598892  hypothetical protein
Z3985     -2      13       1.261564  transcriptional repressor MprA
Z3986     -1      13       1.676834  multidrug resistance secretion protein
Z3987     -1      14       1.485538  multidrug resistance protein B
Z3988     0       16       0.912691  S-ribosylhomocysteinase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z3982     TCRTETB  44     3e-07    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 44.5 bits (105), Expect = 3e-07
Identities = 32/165 (19%), Positives = 70/165 (42%), Gaps = 2/165 (1%)

Query: 34 LDTIARNFSLSASSAGFIVTAAQLGYAAGLLFLVPLGDMFERRRLIVSMTLLAAGGMLIT 93
L IA +F+ +S ++ TA L ++ G L D +RL++ ++ G +I
Sbjct: 37 LPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIG 96

Query: 94 ASSQSLA-MMILGTALTGLFSVVAQILVPLA-ATLASPDKRGKVVGTIMSGLLLGILLAR 151
S ++I+ + G + LV + A + RGK G I S + +G +
Sbjct: 97 FVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGP 156

Query: 152 TVAGLLANLGGWRTVFWVASVLMALMALALWRGLPQMKSETHLNY 196
+ G++A+ W + + + + + + +++ + H +
Sbjct: 157 AIGGMIAHYIHWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDI 201


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z3985     PF05272  28     0.017    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 28.5 bits (63), Expect = 0.017
Identities = 23/94 (24%), Positives = 36/94 (38%), Gaps = 12/94 (12%)

Query: 23 PYQEILLXRLCMHMQSKLLENRNKMLKAQGINETLFMALITLESQENHSIQPSELSCALG 82
P QE+ L + + L R A+G + + T + ++L ALG
Sbjct: 756 PEQELRLVETGVQGRLWALLTREGAPAAEGAAQKGYSVNTTFVTI-------ADLVQALG 808

Query: 83 -----SSRTNATRIADELEKRGWIERRESDNDRR 111
SS ++ D L + GW RE+ RR
Sbjct: 809 ADPGKSSPMLEGQVRDWLNENGWEYLRETSGQRR 842


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
Z3986     RTXTOXIND  74     2e-16    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 74.1 bits (182), Expect = 2e-16
Identities = 64/412 (15%), Positives = 117/412 (28%), Gaps = 97/412 (23%)

Query: 25 LLLTLLFIIIAVAIGIYWFLVLRHFEETDDA----YVAGNQIQIMSQVSGSVTKVWADNT 80
L FI+ + I VL E A +G +I + V ++
Sbjct: 57 PRLVAYFIMGFLVIAFILS-VLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEG 115

Query: 81 DFVKEGDVLVTLDPTD-------------------------------------------- 96
+ V++GDVL+ L
Sbjct: 116 ESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPY 175

Query: 97 ---ARQAFEKAKTALASSVRQTHQQMINSKQ------------LQANIEVQKIALAKAQS 141
+ T+L T Q K+ + A I + +S
Sbjct: 176 FQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKS 235

Query: 142 DYNRRVPLGNANLIGREELQHARDAVTSAQAQLDVAIQQYNANQAMILGTKLEDQPAVQQ 201
+ L + I + + + A +L V Q ++ IL K E Q Q
Sbjct: 236 RLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQL 295

Query: 202 AATEVRN------------------AWLALERTRIVSPMTGYVSRRAVQ-PGAQISPTTP 242
E+ + + + I +P++ V + V G ++
Sbjct: 296 FKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAET 355

Query: 243 LMAVVPA-TNMWVDANFKETQIANMRIGQPVTITTDIYGDDVKY---TGKVVGLDMGTGS 298
LM +VP + V A + I + +GQ I + + +Y GKV + +
Sbjct: 356 LMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAF-PYTRYGYLVGKVKNI-----N 409

Query: 299 AFSLLPAQNATGNWIKVVQRLPVRIELDQKQLEQYPLRIGLSTLVSVNTTNR 350
++ G V+ + + PL G++ + T R
Sbjct: 410 LDAIE--DQRLGLVFNVIISIEENCLST--GNKNIPLSSGMAVTAEIKTGMR 457


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z3987     TCRTETB  92     1e-22    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 92.3 bits (229), Expect = 1e-22
Identities = 54/198 (27%), Positives = 89/198 (44%), Gaps = 4/198 (2%)

Query: 17 IALSLATFMQVLDSTIANVAIPTIAGNLGSSLSQGTWVITSFGVANAISIPLTGWLAKRV 76
I L + +F VL+ + NV++P IA + + WV T+F + +I + G L+ ++
Sbjct: 17 IWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQL 76

Query: 77 GEVKLFLWSTIAFAIASWACGVS-SSLNMLIFFRVIQGIVAGPLIPLSQSLLLNNYPPAK 135
G +L L+ I S V S ++LI R IQG A L ++ P
Sbjct: 77 GIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKEN 136

Query: 136 RSIALALWSMTVIVAPICGPILGGYISDNYHWGWIFFINVPIGVAVVLMTLQTLRGRETR 195
R A L V + GP +GG I+ HW + + +P+ + + L L +E R
Sbjct: 137 RGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWS--YLLLIPMITIITVPFLMKLLKKEVR 194

Query: 196 TERRRIDAVGLALLVIGI 213
+ D G+ L+ +GI
Sbjct: 195 I-KGHFDIKGIILMSVGI 211


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
Z3988     LUXSPROTEIN  292    e-105    Bacterial autoinducer-2 (AI-2) production protein Lu...
		>LUXSPROTEIN#Bacterial autoinducer-2 (AI-2) production protein LuxS signature.

Length = 171

Score = 292 bits (748), Expect = e-105
Identities = 130/170 (76%), Positives = 147/170 (86%)

Query: 2 PLLDSFTVDHTRMEAPAVRVAKTMNTPHGDAITVFDLRFCVPNKEVMPERGIHTLEHLFA 61
PLLDSFTVDHTRM APAVRVAKTM TP GD ITVFDLRF PNK+++ E+GIHTLEHL+A
Sbjct: 1 PLLDSFTVDHTRMNAPAVRVAKTMQTPKGDTITVFDLRFTAPNKDILSEKGIHTLEHLYA 60

Query: 62 GFMRNHLNGNGVEIIDISPMGCRTGFYMSLIGTPDEQRVADVWKAAMEDVLKVQDQNQIP 121
GFMRNHLNG+ VEIIDISPMGCRTGFYMSLIGTP EQ+VAD W AAMEDVLKV++QN+IP
Sbjct: 61 GFMRNHLNGDSVEIIDISPMGCRTGFYMSLIGTPSEQQVADAWIAAMEDVLKVENQNKIP 120

Query: 122 ELNVYQCGTYQMHSLQEAQDIARSILERDVRINSNEELALPKEKLQELHI 171
ELN YQCGT MHSL EA+ IA++ILE V +N N+ELALP+ L+EL I
Sbjct: 121 ELNEYQCGTAAMHSLDEAKQIAKNILEVGVAVNKNDELALPESMLRELRI 170


101  Z4180  Z4197  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z4180     7       52       -17.238098  lipoprotein of type III secretion apparatus
Z4181     6       50       -17.152934  Type III secretion apparatus protein
Z4182     6       50       -16.751594  hypothetical protein
Z4183     5       51       -15.718576  hypothetical protein
Z4184     4       49       -15.941801  hypothetical protein
Z4185     5       50       -15.641211  surface presentation of antigens protein SpaS
Z4186     3       49       -14.257281  integral membrane protein-component of typeIII
Z4187     4       48       -14.758158  type III secretion apparatus protein
Z4188     3       46       -15.282557  type III secretion apparatus protein
Z4189     3       43       -12.717877  surface presentation of antigens protein SpaP
Z4190     4       43       -12.735556  surface presentation of antigens protein SpaO
Z4191     3       42       -12.676111  type III secretion apparatus protein
Z4192     3       42       -12.620721  hypothetical protein
Z4193     3       43       -12.546967  type III secretion apparatus protein
Z4194     3       43       -12.522354  ATP synthase SpaL
Z4195     5       44       -13.578871  type III secretion apparatus protein
Z4196     4       42       -12.937800  hypothetical protein
Z4197     3       39       -10.716139  type III secretion apparatus protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4180     FLGMRINGFLIF  35     3e-04    Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 34.6 bits (79), Expect = 3e-04
Identities = 22/126 (17%), Positives = 49/126 (38%), Gaps = 5/126 (3%)

Query: 4 ISLLLFILLLCGCKQQE-LLNHLDQQQANDVLAVLQRHNINAEKKDQGKTGFSIYVEPTD 62
+++++ ++L L ++L Q ++A L + NI + I V
Sbjct: 35 VAIVVAMVLWAKTPDYRTLFSNLSDQDGGAIVAQLTQMNIPYRFANGSGA---IEVPADK 91

Query: 63 FASAVDWLKIYNLPGKPDIQISQMFPADALVSSPRAEKARLYSAIEQRLEQSLKIMDGIV 122
L LP + + + S +E+ A+E L ++++ + +
Sbjct: 92 VHELRLRLAQQGLPKGGAVGFE-LLDQEKFGISQFSEQVNYQRALEGELARTIETLGPVK 150

Query: 123 SSRVHV 128
S+RVH+
Sbjct: 151 SARVHL 156


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4185     TYPE3IMSPROT  310    e-106    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 310 bits (796), Expect = e-106
Identities = 112/340 (32%), Positives = 185/340 (54%), Gaps = 5/340 (1%)

Query: 2 ANKTEKPTQKKLQDASKKGQILKSRDLTVSVIMLVG--TLYLGYVFDVHHIMSILEYILD 59
KTE+PT KK++DA KKGQ+ KS+++ + +++ L + H ++ +
Sbjct: 3 GEKTEQPTPKKIRDARKKGQVAKSKEVVSTALIVALSAMLMGLSDYYFEHFSKLMLIPAE 62

Query: 60 HNAKPDIWD---YFKAMGIGWLKTIIPFLLVCMFTTILVSWFQSKMQLATEAVKLKFDSL 116
+ P + + + P L V I Q ++ EA+K +
Sbjct: 63 QSYLPFSQALSYVVDNVLLEFFYLCFPLLTVAALMAIASHVVQYGFLISGEAIKPDIKKI 122

Query: 117 NPVNGLKRIFGLKTVKEFVKAILYIIFFALEIKVFWSNHKSLLFKTLDGDIISLLSDWGE 176
NP+ G KRIF +K++ EF+K+IL ++ ++ I + + L + I + G+
Sbjct: 123 NPIEGAKRIFSIKSLVEFLKSILKVVLLSILIWIIIKGNLVTLLQLPTCGIECITPLLGQ 182

Query: 177 MLFLLILYCLGSMIIVLIFDFIAEYFLFMKDMKMDKQEVKREYKEQEGNPEIKSKRRERH 236
+L L++ C +++ I D+ EY+ ++K++KM K E+KREYKE EG+PEIKSKRR+ H
Sbjct: 183 ILRQLMVICTVGFVVISIADYAFEYYQYIKELKMSKDEIKREYKEMEGSPEIKSKRRQFH 242

Query: 237 QEILSEQLKSDVSNSRLMIANPTHIAIGIYFKPHLSPIPLISVRETNEVALAVRKYAKEI 296
QEI S ++ +V S +++ANPTHIAIGI +K +P+PL++ + T+ VRK A+E
Sbjct: 243 QEIQSRNMRENVKRSSVVVANPTHIAIGILYKRGETPLPLVTFKYTDAQVQTVRKIAEEE 302

Query: 297 GIPIITDKKLARKIYATHRRYDYVSFENIDEILRLLLWLE 336
G+PI+ LAR +Y Y+ E I+ +L WLE
Sbjct: 303 GVPILQRIPLARALYWDALVDHYIPAEQIEATAEVLRWLE 342


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4186     TYPE3IMRPROT  44     4e-09    Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 44.4 bits (105), Expect = 4e-09
Identities = 14/54 (25%), Positives = 25/54 (46%)

Query: 1 MTHTIVYASPVIAVMLGGEAVLGLLARYASQLNAFAISLTVKSALAFLILIIYF 54
+ ++ A P+I ++L LGLL R A QL+ F I + + ++
Sbjct: 180 FLNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALM 233


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4187     TYPE3IMRPROT  87     6e-24    Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 87.5 bits (217), Expect = 6e-24
Identities = 30/150 (20%), Positives = 65/150 (43%)

Query: 1 MGEVILYQLHSLLAATALGFCRLAPTFYLLPFFASGNIPTVVRHPIIIVVSCALVQHYHY 60
M +V Q S L R+ P + ++P V+ + ++++ A+
Sbjct: 1 MLQVTSEQWLSWLNLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA 60

Query: 61 ELSTLNEIDIALLAAREIIIGLFIACLLASPFWIFLAIGSFIDNQRGATLSSTLDPATGV 120
+ LA ++I+IG+ + + F G I Q G + ++ +DPA+ +
Sbjct: 61 NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHL 120

Query: 121 DTSELARLFNLFSAAVYLTNGGLNFILETL 150
+ LAR+ ++ + ++LT G +++ L
Sbjct: 121 NMPVLARIMDMLALLLFLTFNGHLWLISLL 150


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4188     TYPE3IMQPROT  79     4e-23    Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.

Length = 86

Score = 78.7 bits (194), Expect = 4e-23
Identities = 59/86 (68%), Positives = 73/86 (84%)

Query: 1 MDDIVFAGNRALYLILVMSAGPIAVATFVGLLVGLFQTVTQLQEQTLPFGVKLLCVSICF 60
MDD+VFAGN+ALYL+L++S P VAT +GLLVGLFQTVTQLQEQTLPFG+KLL V +C
Sbjct: 1 MDDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCL 60

Query: 61 FLMSGWYGEKLYSFGIEMLNLAFARG 86
FL+SGWYGE L S+G +++ LA A+G
Sbjct: 61 FLLSGWYGEVLLSYGRQVIFLALAKG 86


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4189     TYPE3IMPPROT  226    2e-77    Type III secretion system inner membrane P protein ...
		>TYPE3IMPPROT#Type III secretion system inner membrane P protein family signature.

Length = 224

Score = 226 bits (577), Expect = 2e-77
Identities = 151/223 (67%), Positives = 181/223 (81%), Gaps = 5/223 (2%)

Query: 1 MSNSISLIAILSLFTLLPFIIASGTCFIKFSIVFVIVRNALGLQQVPSNMTLNGVALLLS 60
M N ISLIA+L+ TLLPFIIASGTCF+KFSIVFV+VRNALGLQQ+PSNMTLNGVALLLS
Sbjct: 1 MGNDISLIALLAFSTLLPFIIASGTCFVKFSIVFVMVRNALGLQQIPSNMTLNGVALLLS 60

Query: 61 MFVMMPVGKEIYYNSQNENLSFNNVASVVNFVETGMSGYKSYLIKYSEPELVSFFEKIQK 120
MFVM P+ + Y ++E+++FN+++S+ V+ G+ GY+ YLIKYS+ ELV FFE Q
Sbjct: 61 MFVMWPIMHDAYVYFEDEDVTFNDISSLSKHVDEGLDGYRDYLIKYSDRELVQFFENAQL 120

Query: 121 VNSSEDNEEIIDDD-----NISIFSLLPAYALSEIKSAFIIGFYIYLPFVVVDLVISSVL 175
+ E + D SIF+LLPAYALSEIKSAF IGFY+YLPFVVVDLV+SSVL
Sbjct: 121 KRQYGEETETVKRDKDEIEKPSIFALLPAYALSEIKSAFKIGFYLYLPFVVVDLVVSSVL 180

Query: 176 LTLGMMMMSPVTISTPIKLILFVAMDGWTMLSKGLILQYFDLS 218
L LGMMMMSPVTISTPIKL+LFVA+DGWT+LSKGLILQY D++
Sbjct: 181 LALGMMMMSPVTISTPIKLVLFVALDGWTLLSKGLILQYMDIA 223


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4190     TYPE3OMOPROT  156    1e-47    Type III secretion system outer membrane O protein ...
		>TYPE3OMOPROT#Type III secretion system outer membrane O protein family signature.

Length = 303

Score = 156 bits (395), Expect = 1e-47
Identities = 91/292 (31%), Positives = 136/292 (46%), Gaps = 13/292 (4%)

Query: 35 KENGEDVALLMPEFSAKWLPIAEESGSWSGWVLLREIFPLISAELAGMALMPETERLIGE 94
+ +G + L P W+ +++ WS W+ + +S LAG A+ E L+
Sbjct: 23 QRHGREATLEYPTRQGMWVRLSDAEKRWSAWIKPGDWLEHVSPALAGAAVSAGAEHLVVP 82

Query: 95 WLSLSSSPLNLKYPELKYNRLCVGKVFDGVLSPAQPLIRIWTGELNLWLDKVTVCQYENA 154
WL+ + P L P L RLCV G P L+ I + LW + +
Sbjct: 83 WLAATERPFELPVPHLSCRRLCVENPVPGSALPEGKLLHIMSDRGGLWFEHLPELPAVGG 142

Query: 155 PTLDKKSLYWPIHFVIGFSKTCYRTIVDIEVGDVLLISNNMAYAVIYNTKICDLIYPEEL 214
K L WP+ FVIG S T + I +GDVLLI + A +Y
Sbjct: 143 GRP--KMLRWPLRFVIGSSDTQRSLLGRIGIGDVLLIRTSRA-----------EVYCYAK 189

Query: 215 KMADHFQYEEDFETDDFDIKKSESEIYDENDEQMINSFEELPVKIEFVLGKKIMNLYEID 274
K+ + E + DI+ E E + + +LPVK+EFVL +K + L E++
Sbjct: 190 KLGHFNRVEGGIIVETLDIQHIEEENNTTETAETLPGLNQLPVKLEFVLYRKNVTLAELE 249

Query: 275 ELCAKRIISLLPESEKNIEIRVNGALTGYGELVEVDDKLGVEIHSWLSGHNN 326
+ ++++SL +E N+EI NG L G GELV+++D LGVEIH WLS N
Sbjct: 250 AMGQQQLLSLPTNAELNVEIMANGVLLGNGELVQMNDTLGVEIHEWLSESGN 301


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4191     SSPANPROTEIN  49     2e-09    Salmonella invasion protein InvJ signature.
		>SSPANPROTEIN#Salmonella invasion protein InvJ signature.

Length = 336

Score = 49.4 bits (117), Expect = 2e-09
Identities = 31/75 (41%), Positives = 44/75 (58%), Gaps = 3/75 (4%)

Query: 105 ENELTYQFQRWGQNHTVRILESSEG-IRLKPSDTLVSDRLHEAQHNDVTAQRWVLTEQDE 163
++ LTY+FQRWG +++V I G L PS+T V RLH+ N QRW LT +D+
Sbjct: 260 DSSLTYRFQRWGNDYSVNIQARQAGEFSLIPSNTQVEHRLHDQWQNG-NPQRWHLT-RDD 317

Query: 164 RQGQRHQPHEEQENE 178
+Q + Q H +Q E
Sbjct: 318 QQNPQQQQHRQQSGE 332


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4193     SSPAMPROTEIN  35     2e-05    Salmonella surface presentation of antigen gene typ...
		>SSPAMPROTEIN#Salmonella surface presentation of antigen gene type M signature.

Length = 147

Score = 34.7 bits (79), Expect = 2e-05
Identities = 31/101 (30%), Positives = 56/101 (55%)

Query: 2 QLKNLQSLLDMKELLGEVVFRQDIFYSLRKVTVIQQQIAEINLEKQKIAERRKILNKEIV 61
Q+ L+ LLD + R++I+ LRK +++++QI ++ L+ +I E+R L K+
Sbjct: 45 QIAGLKLLLDTLRAENRQLSREEIYALLRKQSIVRRQIKDLELQIIQIQEKRSELEKKRE 104

Query: 62 QQQAQRKHWWLKGEKYDRLKKRIKKQLLNQMLYQDELEQEE 102
+ Q + K+W K Y R R K+ + + + Q+E E EE
Sbjct: 105 EFQEKSKYWLRKEGNYQRWIIRQKRLYIQREIQQEEAESEE 145


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4195     VACCYTOTOXIN  31     0.019    Helicobacter pylori vacuolating cytotoxin signature.
		>VACCYTOTOXIN#Helicobacter pylori vacuolating cytotoxin signature.

Length = 1291

Score = 31.2 bits (70), Expect = 0.019
Identities = 18/60 (30%), Positives = 31/60 (51%), Gaps = 3/60 (5%)

Query: 597 EIEDRIRDGVRPTAGGTFLNLDASEAEMILDNFKLAL---SGINIPIKDIILLGSVDIRR 653
EI +R+ G A T L L ASE +N +++L + +N+ + L+G+V + R
Sbjct: 202 EINNRVGSGAGRKASSTVLTLQASEGITSRENAEISLYDGATLNLASNSVKLMGNVWMGR 261


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
Z4196     INVEPROTEIN  240    2e-78    Salmonella/Shigella invasion protein E (InvE) signat...
		>INVEPROTEIN#Salmonella/Shigella invasion protein E (InvE) signature.

Length = 372

Score = 240 bits (613), Expect = 2e-78
Identities = 128/321 (39%), Positives = 195/321 (60%)

Query: 14 AREVSRLEDIITEDNEDIEAEMPKMRDDPAGKEARFLQATDEMSAALTQFMKKKIYEEQL 73
+R+ S + D + E + P + +F+Q+TDEMSAAL QF ++ YE++
Sbjct: 16 SRQASHQDATQHTDAQQAEIQQAAEDSSPGAEVQKFVQSTDEMSAALAQFRNRRDYEKKS 75

Query: 74 ANFLDGEEYVLEDQPIEKTDKVMEALKAATTHDYEVYSFAKKLFPDESDLVVVLRAILRK 133
+N + E VLED+ + K ++++ + + A+ LFPD SDLV+VLR +LR+
Sbjct: 76 SNLSNSFERVLEDEALPKAKQILKLISVHGGALEDFLRQARSLFPDPSDLVLVLRELLRR 135

Query: 134 KQISENVRLNAEALLRKVNQETTKKFINSGINSALKAKLFGQALSLNPKLLRASYRQFLM 193
K + E VR E+LL+ V ++T K + +GIN ALKA+LFG+ LSL P LLRASYRQF+
Sbjct: 136 KDLEEIVRKKLESLLKHVEEQTDPKTLKAGINCALKARLFGKTLSLKPGLLRASYRQFIQ 195

Query: 194 AEDDAVDTYVEWIGSYGYQNRMLVTKFIKETLFSDINALDASCSSLEFGMFLNKLSQLLS 253
+E V+ Y +WI SYGYQ R++V FI+ +L +DI+A DASCS LEFG L +L+QL
Sbjct: 196 SESHEVEIYSDWIASYGYQRRLVVLDFIEGSLLTDIDANDASCSRLEFGQLLRRLTQLKM 255

Query: 254 LQSAEALFLKTLMNNPIIKKFISAEDYWIFFLISLIKFPETAEELLNNALVTLPADANYK 313
L+SA+ LF+ TL++ K F + E W+ ++SL++ P + LL + + ++K
Sbjct: 256 LRSADLLFVSTLLSYSFTKAFNAEESSWLLLMLSLLQQPHEVDSLLADIIGLNALLLSHK 315

Query: 314 DKTLLLKAIYSGCTNLPFSLF 334
+ L+ Y C +P SLF
Sbjct: 316 EHASFLQIFYQVCKAIPSSLF 336


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4197     TYPE3OMGPROT  448    e-154    Type III secretion system outer membrane G protein ...
		>TYPE3OMGPROT#Type III secretion system outer membrane G protein family signature.

Length = 607

Score = 448 bits (1155), Expect = e-154
Identities = 158/536 (29%), Positives = 271/536 (50%), Gaps = 54/536 (10%)

Query: 34 YVANKENLRSFFETVSSYAGKPTIVSKLAMKKQISGNFDLTEPYALIERLSAQMGLIWYD 93
YVA E+LR + +VS K +SG F+ P ++ +++ L+WY
Sbjct: 38 YVAKGESLRDLLTDFGANYDATVVVSDKINDK-VSGQFEHDNPQDFLQHIASLYNLVWYY 96

Query: 94 DGKAIYIYDSSEMRNALINLRKVSTNEFNNFLKKSGLYNSRYEIKGD-GNGTFYVSGPPV 152
DG +YI+ +SE+ + LI L++ E L++SG++ R+ + D N YVSGPP
Sbjct: 97 DGNVLYIFKNSEVASRLIRLQESEAAELKQALQRSGIWEPRFGWRPDASNRLVYVSGPPR 156

Query: 153 YVDLVVNAAKLMEQNSD--GIEIGRNKVGIIHLVNTFVNDRTYELRGEKIVIPGMAKVLS 210
Y++LV A +EQ + + G + I L +DRT R +++ PG+A +L
Sbjct: 157 YLELVEQTAAALEQQTQIRSEKTGALAIEIFPLKYASASDRTIHYRDDEVAAPGVATILQ 216

Query: 211 TLLNNNIKQSTGVNVLSEISSRQQLKNVSRMPPFPGAEEDDDLQVEKIISTAGAPETDDI 270
+L++ ++ QQ+ ++ P A +
Sbjct: 217 RVLSD--------------ATIQQVTVDNQRIPQ-----------------AATRASAQA 245

Query: 271 QIIAYPDTNSLLVKGTVSQVDFIEKLVATLDIPKRHIELSLWIIDIDKTDLEQLGADWSG 330
++ A P N+++V+ + ++ ++L+ LD P IE++L I+DI+ L +LG DW
Sbjct: 246 RVEADPSLNAIIVRDSPERMPMYQRLIHALDKPSARIEVALSIVDINADQLTELGVDWRV 305

Query: 331 TIKIGSSLSASFNNSG----------SISTLDG---TQFIATIQALAQKRRAAVVARPVV 377
I+ G++ +G S +D +A + L + A VV+RP +
Sbjct: 306 GIRTGNNHQVVIKTTGDQSNIASNGALGSLVDARGLDYLLARVNLLENEGSAQVVSRPTL 365

Query: 378 LTQENIPAIFDNNRTFYTKLVGERTAELDEVTYGTMISVLPRFAARN---QIELLLNIED 434
LTQEN A+ D++ T+Y K+ G+ AEL +TYGTM+ + PR + +I L L+IED
Sbjct: 366 LTQENAQAVIDHSETYYVKVTGKEVAELKGITYGTMLRMTPRVLTQGDKSEISLNLHIED 425

Query: 435 GNEINSDKTNVDDLPQVGRTLISTIARVPQGKSLLIGGYTRDTNTYESRKIPILGSIPFI 494
GN+ + + ++ +P + RT++ T+ARV G+SL+IGG RD + K+P+LG IP+I
Sbjct: 426 GNQ-KPNSSGIEGIPTISRTVVDTVARVGHGQSLIIGGIYRDELSVALSKVPLLGDIPYI 484

Query: 495 GKLFGYEGTNANNIVRVFLIEPREIDERMMNNANEAAVDARAITQQMAKNKEINDE 550
G LF + VR+F+IEPR IDE + ++ A + + + + EI+++
Sbjct: 485 GALFRRKSELTRRTVRLFIIEPRIIDEGIAHHL--ALGNGQDLRTGILTVDEISNQ 538


102  Z4593  Z4600  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
Z4593     -2      17       -0.505025  serine endoprotease
Z4594     -3      14       -0.186272  serine endoprotease
Z4595     -2      13       0.014081   malate dehydrogenase
Z4596     -3      13       -0.403870  arginine repressor ArgR
Z4597     -3      13       0.190955   hypothetical protein
Z4598     -2      13       1.189132   hypothetical protein
Z4599     -3      11       1.881240   p-hydroxybenzoic acid efflux subunit AaeB
Z4600     -2      11       1.879252   p-hydroxybenzoic acid efflux subunit AaeA
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
Z4593     V8PROTEASE  67     2e-14    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 66.6 bits (162), Expect = 2e-14
Identities = 28/170 (16%), Positives = 56/170 (32%), Gaps = 36/170 (21%)

Query: 46 VLTNNHVINQAQKISIQL------------NDGREFDAKLIGSDDQSDIALLQIQN---- 89
+LTN HV++ L +G ++ + D+A+++
Sbjct: 114 LLTNKHVVDATHGDPHALKAFPSAINQDNYPNGGFTAEQITKYSGEGDLAIVKFSPNEQN 173

Query: 90 ---PSKLTQIAIADSDKLRVGDFAVAVGNPFGLGQTATSGIVSALGRSGLNLEGLEN-FI 145
+ ++++ + +V G P V+ + S + L+ +
Sbjct: 174 KHIGEVVKPATMSNNAETQVNQNITVTGYPGDKP-------VATMWESKGKITYLKGEAM 226

Query: 146 QTDASINRGNSGGALLNLNGELIGINTAILAPGGGSVGIGFAIPSNMART 195
Q D S GNSG + N E+IGI+ G+ +
Sbjct: 227 QYDLSTTGGNSGSPVFNEKNEVIGIHWG---------GVPNEFNGAVFIN 267


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4595     DHBDHDRGNASE  28     0.045    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 28.1 bits (62), Expect = 0.045
Identities = 37/167 (22%), Positives = 61/167 (36%), Gaps = 27/167 (16%)

Query: 3 VAVLGAAGGIGQALALLLKTQLPSGSELSLYDIAPVTPGVAVDLSHIPTAVKIKGFSGED 62
+ GAA GIG+A+A L G+ ++ D P V S A + F +
Sbjct: 11 AFITGAAQGIGEAVARTL---ASQGAHIAAVDYNP-EKLEKVVSSLKAEARHAEAFPADV 66

Query: 63 ATPA------------LEGADVVLISAGVARK------PGMDRSDLFNVNAGIVKNLVQQ 104
A + D+++ AGV R + F+VN+ V N +
Sbjct: 67 RDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRS 126

Query: 105 VAKTCPK----ACIGIITNPVNTT-VAIAAEVLKKAGVYDKNKLFGV 146
V+K + + + +NP ++AA KA K G+
Sbjct: 127 VSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGL 173


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4596     ARGREPRESSOR  169    4e-57    Bacterial arginine repressor signature.
		>ARGREPRESSOR#Bacterial arginine repressor signature.

Length = 149

Score = 169 bits (430), Expect = 4e-57
Identities = 44/141 (31%), Positives = 71/141 (50%), Gaps = 5/141 (3%)

Query: 15 KALLKEEKFSSQGEIVAALQEQGFDNINQSKVSRMLTKFGAVRTRNAKMEMVYCLPAELG 74
+ ++ + +Q E+V L++ G+ N+ Q+ VSR + + V+ Y LPA+
Sbjct: 11 REIITANEIETQDELVDILKKDGY-NVTQATVSRDIKELHLVKVPTNNGSYKYSLPADQR 69

Query: 75 VPTTSSPLKNLV---LDIDYNDAVVVIHTSPGAAQLIARLLDSLGKAEGILGTIAGDDTI 131
S ++L+ + ID ++V+ T PG AQ I L+D+L E I+GTI GDDTI
Sbjct: 70 FNPLSKLKRSLMDAFVKIDSASHLIVLKTMPGNAQAIGALMDNLDWEE-IMGTICGDDTI 128

Query: 132 FTTPANGFTVKDLYEAILELF 152
K + + ILEL
Sbjct: 129 LIICRTHDDTKVVQKKILELL 149


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
Z4600     RTXTOXIND  51     2e-09    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 51.4 bits (123), Expect = 2e-09
Identities = 28/147 (19%), Positives = 54/147 (36%), Gaps = 15/147 (10%)

Query: 99 LAQEKRQEAGRRNRLGVQ-AMSREEIDQANNVLQT-VLHQLAKAQAT-------RDLAKL 149
E R + ++ + ++EE + + +L +L + +
Sbjct: 264 AVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEE 323

Query: 150 DLERTVIRAPADGWVTNLNVYT-GEFITRGSTAVALVKQNSFY-VLAYMEETKLEGVRPG 207
+ +VIRAP V L V+T G +T T + +V ++ V A ++ + + G
Sbjct: 324 RQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVG 383

Query: 208 YRAEIT----PLGSNKVLKGTVDSVAA 230
A I P L G V ++
Sbjct: 384 QNAIIKVEAFPYTRYGYLVGKVKNINL 410



Score = 47.9 bits (114), Expect = 3e-08
Identities = 29/163 (17%), Positives = 58/163 (35%), Gaps = 17/163 (10%)

Query: 6 RKFSRTAITVVLVILAFIAIFNAWVYYTE----SPWTRDARFSADVVAIAPDVSGLITQV 61
SR V I+ F+ I + + S I P + ++ ++
Sbjct: 51 TPVSRRPRLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEI 110

Query: 62 NVH-NQLVKKGQVLFTIDQPR-------YQKALEEAQADVAYYQVLAQEKRQEAGRRNRL 113
V + V+KG VL + Q +L +A+ + YQ+L++ E + L
Sbjct: 111 IVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRS--IELNKLPEL 168

Query: 114 GVQAMSREEIDQANNVL---QTVLHQLAKAQATRDLAKLDLER 153
+ + VL + Q + Q + +L+L++
Sbjct: 169 KLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDK 211


103  Z4621  Z4628  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
Z4621     -2      13       -2.286312  Fis family transcriptional regulator
Z4622     -3      12       -1.763757  methyltransferase
Z4623     -3      12       -1.188542  hypothetical protein
Z4624     -3      13       -1.078018  DNA-binding transcriptional regulator EnvR
Z4625     -2      14       -0.406448  transmembrane protein affects septum formation
Z4626     -2      15       -0.481935  hypothetical protein
Z4627     -2      14       -0.916283  hypothetical protein
Z4628     -1      16       -1.830085  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4621     DNABINDNGFIS  157    3e-54    DNA-binding protein FIS signature.
		>DNABINDNGFIS#DNA-binding protein FIS signature.

Length = 98

Score = 157 bits (399), Expect = 3e-54
Identities = 98/98 (100%), Positives = 98/98 (100%)

Query: 1 MFEQRVNSDVLTVSTVNSQDQVTQKPLRDSVKQALKNYFAQLNGQDVNDLYELVLAEVEQ 60
MFEQRVNSDVLTVSTVNSQDQVTQKPLRDSVKQALKNYFAQLNGQDVNDLYELVLAEVEQ
Sbjct: 1 MFEQRVNSDVLTVSTVNSQDQVTQKPLRDSVKQALKNYFAQLNGQDVNDLYELVLAEVEQ 60

Query: 61 PLLDMVMQYTRGNQTRAALMMGINRGTLRKKLKKYGMN 98
PLLDMVMQYTRGNQTRAALMMGINRGTLRKKLKKYGMN
Sbjct: 61 PLLDMVMQYTRGNQTRAALMMGINRGTLRKKLKKYGMN 98


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z4624     HTHTETR  127    6e-39    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 127 bits (321), Expect = 6e-39
Identities = 78/209 (37%), Positives = 122/209 (58%), Gaps = 3/209 (1%)

Query: 1 MAKRTKAEALKTRQELIETAIAQFAQHGVSKTTLNDIADAANVTRGAIYWHFENKTQLFN 60
MA++TK EA +TRQ +++ A+ F+Q GVS T+L +IA AA VTRGAIYWHF++K+ LF+
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60

Query: 61 EMW-LQQPSLRELIQEHLTAGLEHDPFQQLREKLIVGLQYIAKIPRQQALLKILYHKCEF 119
E+W L + ++ EL E A DP LRE LI L+ R++ L++I++HKCEF
Sbjct: 61 EIWELSESNIGELELE-YQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEF 119

Query: 120 NDEM-LAEGVIREKMGFNPQTLREVLQACQQQGCVANNLDLDVVMIIIDGAFSGIVQNWL 178
EM + + R + + + L+ C + + +L II+ G SG+++NWL
Sbjct: 120 VGEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWL 179

Query: 179 MNMAGYDLYKQAPALVDNVLRMFMPDENI 207
+DL K+A V +L M++ +
Sbjct: 180 FAPQSFDLKKEARDYVAILLEMYLLCPTL 208


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
Z4625     RTXTOXIND  43     1e-06    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 43.3 bits (102), Expect = 1e-06
Identities = 38/217 (17%), Positives = 70/217 (32%), Gaps = 38/217 (17%)

Query: 98 ATYQANYDSAKGELAKSEAAAAIAHLTVKRYVPLVGTKYISQQEYDQAIADA-RQADAAV 156
K +L + E+ A + Q + I D RQ +
Sbjct: 262 VEAVNELRVYKSQLEQIESEILSAKEEYQLV----------TQLFKNEILDKLRQTTDNI 311

Query: 157 IAAKATVESARINLAYTKVTAPISGRIGK-STVTEGALVTNGQTTELATVQQLDPIYVDV 215
+ + + AP+S ++ + TEG +VT +T + V + D + V
Sbjct: 312 GLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETL-MVIVPEDDTLEVTA 370

Query: 216 TQSSND--FMRLKQSVEQGNLHKENATSNVELVMENGQTYP-LKGTLQ--FSDVTVDEST 270
+ D F+ + Q+ +++ Y L G ++ D D+
Sbjct: 371 LVQNKDIGFINVGQNAI------------IKVEAFPYTRYGYLVGKVKNINLDAIEDQRL 418

Query: 271 GSIT--LRAV------FPNPQHTLLPGMFVRARIDEG 299
G + + ++ N L GM V A I G
Sbjct: 419 GLVFNVIISIEENCLSTGNKNIPLSSGMAVTAEIKTG 455



Score = 34.4 bits (79), Expect = 7e-04
Identities = 22/127 (17%), Positives = 43/127 (33%), Gaps = 13/127 (10%)

Query: 46 TAPLEVKTELPGR-TNAYRIAEVRPQVSGIVLNRNFTEGSDVQAGQSLYQIDPATYQANY 104
+E+ G+ T++ R E++P + IV EG V+ G L ++ +A
Sbjct: 77 LGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEA-- 134

Query: 105 DSAKGELAKSEAAAAIAHLTVKRYVPLVGTKYISQQEYDQAIADARQADAAVIAAKATVE 164
+ K++++ A L RY L E ++ +
Sbjct: 135 -----DTLKTQSSLLQARLEQTRYQIL-----SRSIELNKLPELKLPDEPYFQNVSEEEV 184

Query: 165 SARINLA 171
+L
Sbjct: 185 LRLTSLI 191



Score = 29.0 bits (65), Expect = 0.031
Identities = 11/34 (32%), Positives = 15/34 (44%), Gaps = 1/34 (2%)

Query: 65 AEVRPQVSGIVLNRN-FTEGSDVQAGQSLYQIDP 97
+ +R VS V TEG V ++L I P
Sbjct: 328 SVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVP 361


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4626     ACRIFLAVINRP  848    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 848 bits (2193), Expect = 0.0
Identities = 616/627 (98%), Positives = 616/627 (98%)

Query: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60
MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120
VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPDTTQDDISDYVASNVKDTLSRLNSVGDVQLFGA 180
EVQQQGISVEKSSSSYLMVAGFVSDNP TTQDDISDYVASNVKDTLSRLN VGDVQLFGA
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 181 QYAMRIWLDADLLNKYKLTPVDVINQLKVQNAQIAAGQLGGTPALPGQQLNASIIAQTRL 240
QYAMRIWLDADLLNKYKLTPVDVINQLKVQN QIAAGQLGGTPALPGQQLNASIIAQTR
Sbjct: 181 QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRF 240

Query: 241 KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL 300
KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL
Sbjct: 241 KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL 300

Query: 301 DTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQ 360
DTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQ
Sbjct: 301 DTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQ 360

Query: 361 NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420
NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM
Sbjct: 361 NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420

Query: 421 MEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL 480
MEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL
Sbjct: 421 MEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL 480

Query: 481 SVLVALILTPALCATLLKPTSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRY 540
SVLVALILTPALCATLLKP SAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRY
Sbjct: 481 SVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRY 540

Query: 541 LLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKN 600
LLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKN
Sbjct: 541 LLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKN 600

Query: 601 EKANVESVFTVNGFSFSGQAPPQPHVF 627
EKANVESVFTVNGFSFSGQA F
Sbjct: 601 EKANVESVFTVNGFSFSGQAQNAGMAF 627



Score = 96.8 bits (241), Expect = 7e-23
Identities = 77/511 (15%), Positives = 180/511 (35%), Gaps = 42/511 (8%)

Query: 5 FIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYP-GADAQTVQDTVTQ 63
+ ++ +++ + L+LP + P P GA + Q + Q
Sbjct: 533 ILGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQ 592

Query: 64 VIEQNMNGIDNLMY---------MSSTSDSAGSVTITL-TFQSGTDPDIAQVQVQNKLQL 113
V + + + S + +AG ++L ++ + + V ++ ++
Sbjct: 593 VTDYYLKNEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKM 652

Query: 114 ATPLLP----QEVQQQGISVEKSSSSYLMVAGFVSDNPDTTQDDISDYVASNVKDTLSRL 169
+ I +++ + + D D ++ +
Sbjct: 653 ELGKIRDGFVIPFNMPAIVELGTATGFDF---ELIDQAGLGHDALTQARNQLLGMAAQHP 709

Query: 170 NSVGDVQLFGA--QYAMRIWLDADLLNKYKLTPVDVINQLKVQNAQIAAGQLGGTPALPG 227
S+ V+ G ++ +D + ++ D+ + G
Sbjct: 710 ASLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDF----IDRG 765

Query: 228 QQLNASIIA-QTRLKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPA 286
+ + A PE+ K+ +R +++G +V + R NG P+
Sbjct: 766 RVKKLYVQADAKFRMLPEDVDKLYVR-SANGEMVPFSAFTTSHWV-YGSPRLERYNGLPS 823

Query: 287 AGLGIKLATGANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFV---QLSIHEVVKT 343
+ + A G ++ D ++ ++L P G YD T +LS ++
Sbjct: 824 MEIQGEAAPGTSSGDAMALMENLASKL----PAG----IGYDWTGMSYQERLSGNQAPAL 875

Query: 344 LFEAIMLVFLVMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIG 403
+ + ++VFL + ++ + + VP+ ++G F + M G++ IG
Sbjct: 876 VAISFVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIG 935

Query: 404 LLVDDAIVVVENVERVMMEDKLPPKEAT-EKSMSQIQGALV-GIAMVLSAVFIPMAFFGG 461
L +AI++VE + +M ++ EAT +++ L+ +A +L +P+A G
Sbjct: 936 LSAKNAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILG--VLPLAISNG 993

Query: 462 STGAIYRQFSITIVSAMALSVLVALILTPAL 492
+ I ++ M + L+A+ P
Sbjct: 994 AGSGAQNAVGIGVMGGMVSATLLAIFFVPVF 1024


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4627     ACRIFLAVINRP  526    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 526 bits (1357), Expect = 0.0
Identities = 389/392 (99%), Positives = 391/392 (99%)

Query: 1 MAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDGFVIPFNMPAIVELGTATGFDFELI 60
MAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDGFVIPFNMPAIVELGTATGFDFELI
Sbjct: 625 MAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDGFVIPFNMPAIVELGTATGFDFELI 684

Query: 61 DKAGLGHDALTQARNQLLGMAAQHPASLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSD 120
D+AGLGHDALTQARNQLLGMAAQHPASLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSD
Sbjct: 685 DQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSD 744

Query: 121 INQTISTALGGTYVNDFIDRGRVKKVYVQADAKFRMLPEDVDKLYVRSTNGEMVPFSAFT 180
INQTISTALGGTYVNDFIDRGRVKK+YVQADAKFRMLPEDVDKLYVRS NGEMVPFSAFT
Sbjct: 745 INQTISTALGGTYVNDFIDRGRVKKLYVQADAKFRMLPEDVDKLYVRSANGEMVPFSAFT 804

Query: 181 TSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDAMALMENLASKLPAGIGYDWTGMSYQ 240
TSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDAMALMENLASKLPAGIGYDWTGMSYQ
Sbjct: 805 TSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDAMALMENLASKLPAGIGYDWTGMSYQ 864

Query: 241 ERLSGNQAPALVAISFVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDV 300
ERLSGNQAPALVAISFVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDV
Sbjct: 865 ERLSGNQAPALVAISFVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDV 924

Query: 301 YFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILG 360
YFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILG
Sbjct: 925 YFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILG 984

Query: 361 VLPLAISNGAGSGAQNAVGIGVMGGMVSATLL 392
VLPLAISNGAGSGAQNAVGIGVMGGMVSATLL
Sbjct: 985 VLPLAISNGAGSGAQNAVGIGVMGGMVSATLL 1016



Score = 68.3 bits (167), Expect = 2e-14
Identities = 54/316 (17%), Positives = 122/316 (38%), Gaps = 18/316 (5%)

Query: 90 SVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTAL----GGTYVNDFIDRGRVKK 145
V+ G + ++ +D + ++ D+ + G G+
Sbjct: 174 DVQLFGAQ--YAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLN 231

Query: 146 VYVQADAKFRMLPEDVDKLYVR-STNGEMVPFSAFTTSHWVYGSPRLE---RYNGLPSME 201
+ A +F+ PE+ K+ +R +++G +V G R NG P+
Sbjct: 232 ASIIAQTRFKN-PEEFGKVTLRVNSDGSVVRLKDVARV--ELGGENYNVIARINGKPAAG 288

Query: 202 IQGEAAPGTSSGDA----MALMENLASKLPAGIGYDWT-GMSYQERLSGNQAPALVAISF 256
+ + A G ++ D A + L P G+ + + +LS ++ + +
Sbjct: 289 LGIKLATGANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAI 348

Query: 257 VVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKN 316
++VFL + ++ + + VP+ ++G F + M G++ IGL +
Sbjct: 349 MLVFLVMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDD 408

Query: 317 AILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQN 376
AI++VE + +M ++ EAT ++ ++ ++ +P+A G+
Sbjct: 409 AIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYR 468

Query: 377 AVGIGVMGGMVSATLL 392
I ++ M + L+
Sbjct: 469 QFSITIVSAMALSVLV 484


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits      Score  E-value  Comments
Z4628     adhesinb  28     0.004    Adhesin B signature.
		>adhesinb#Adhesin B signature.

Length = 310

Score = 27.5 bits (61), Expect = 0.004
Identities = 14/68 (20%), Positives = 26/68 (38%), Gaps = 10/68 (14%)

Query: 1 MKR---LIPVALLTALLAGCAHDSPCVPVYDDQGRLVHTNTCMKGTTQDNWETAGAIAGG 57
MK+ L+ + L LA C+ + +V TN+ + T++ IAG
Sbjct: 1 MKKCRFLVLLLLAFVGLAACSSQKSSTETGSSKLNVVATNSIIADITKN-------IAGD 53

Query: 58 AAAVAGLT 65
+ +
Sbjct: 54 KINLHSIV 61


104  Z4693  Z4698  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
Z4693     4       48       -1.690800  leader peptidase
Z4695     6       55       -1.227535  bacterioferritin
Z4696     6       57       -0.298394  bacterioferritin-associated ferredoxin
Z4697     6       55       -0.058058  elongation factor Tu
Z4698     4       45       -0.196056  elongation factor G
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4693     PREPILNPTASE  141    1e-44    Type IV prepilin cysteine protease (C20) family sig...
		>PREPILNPTASE#Type IV prepilin cysteine protease (C20) family signature.

Length = 290

Score = 141 bits (358), Expect = 1e-44
Identities = 65/142 (45%), Positives = 84/142 (59%), Gaps = 2/142 (1%)

Query: 4 TLPFLILYACLSALLFFWDAKHGLLPDRFTCPLLWSGLLFYQVCHPDGLADALWGAIIGY 63
TL L+L L AL F D LLPD+ T PLLW GLLF + L DA+ GA+ GY
Sbjct: 134 TLAALLLTWVLVALTFI-DLDKMLLPDQLTLPLLWGGLLFNLLGGFVSLGDAVIGAMAGY 192

Query: 64 GTFAVIYWGYRILRHKEGLGYGDVKFLAALGAWHSWAFLPRLVFLAASFACGAVVIGLLM 123
+YW +++L KEG+GYGD K LAALGAW W LP +V L +S + IGL++
Sbjct: 193 LVLWSLYWAFKLLTGKEGMGYGDFKLLAALGAWLGWQALP-IVLLLSSLVGAFMGIGLIL 251

Query: 124 RGKESLKNPLPFGPFLAAAGFV 145
P+PFGP+LA AG++
Sbjct: 252 LRNHHQSKPIPFGPYLAIAGWI 273


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
Z4695     HELNAPAPROT  38     3e-06    Helicobacter neutrophil-activating protein A family ...
		>HELNAPAPROT#Helicobacter neutrophil-activating protein A family signature.

Length = 153

Score = 38.3 bits (89), Expect = 3e-06
Identities = 19/103 (18%), Positives = 43/103 (41%), Gaps = 10/103 (9%)

Query: 44 EYHESIDEMKHADKYIERILFLEGIPN--LQDLGKL------GIGEDVEEMLQSDLRLEL 95
E ++ E D ER+L + G P +++ + G EM+Q+ +
Sbjct: 52 ELYDHAAE--TVDTIAERLLAIGGQPVATVKEYTEHASITDGGNETSASEMVQALVNDYK 109

Query: 96 EGAKDLREAIAYADSVHDYVSRDMMIEILADEEGHIDWLETEL 138
+ + + + I A+ D + D+ + ++ + E + L + L
Sbjct: 110 QISSESKFVIGLAEENQDNATADLFVGLIEEVEKQVWMLSSYL 152


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
Z4697     TCRTETOQM  80     3e-18    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 79.5 bits (196), Expect = 3e-18
Identities = 57/198 (28%), Positives = 87/198 (43%), Gaps = 13/198 (6%)

Query: 13 VNVGTIGHVDHGKTTLTAAI------TTVLAKTYGGAARAFDQIDNAPEEKARGITINTS 66
+N+G + HVD GKTTLT ++ T L G R DN E+ RGITI T
Sbjct: 4 INIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRT----DNTLLERQRGITIQTG 59

Query: 67 HVEYDTPTRHYAHVDCPGHADYVKNMITGAAQMDGAILVVAATDGPMPQTREHILLGRQV 126
+ +D PGH D++ + + +DGAIL+++A DG QTR R++
Sbjct: 60 ITSFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKM 119

Query: 127 GVPYIIVFLNKCDMVDDEELLELVEMEVRELLSQYDFPGDDTPIVRGSALKALEGDAEWE 186
G+P I F+NK D + L V +++E LS + + +W+
Sbjct: 120 GIP-TIFFINKIDQNGID--LSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWD 176

Query: 187 AKILELAGFLDSYIPEPE 204
I L+ Y+
Sbjct: 177 TVIEGNDDLLEKYMSGKS 194


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
Z4698     TCRTETOQM  613    0.0      Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 613 bits (1583), Expect = 0.0
Identities = 178/698 (25%), Positives = 304/698 (43%), Gaps = 81/698 (11%)

Query: 9 RYRNIGISAHIDAGKTTTTERILFYTGVNHKIGEVHDGAATMDWMEQEQERGITITSAAT 68
+ NIG+ AH+DAGKTT TE +L+ +G ++G V G D E++RGITI + T
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 69 TAFWSGMAKQYEPHRINIIDTPGHVDFTIEVERSMRVLDGAVMVYCAVGGVQPQSETVWR 128
+ W ++NIIDTPGH+DF EV RS+ VLDGA+++ A GVQ Q+ ++
Sbjct: 62 SFQWEN-------TKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFH 114

Query: 129 QANKYKVPRIAFVNKMDRMGANFLKVVNQIKTRLGANPVPLQLAIGAEEHFTGVVDLVKM 188
K +P I F+NK+D+ G + V IK +L A V Q V M
Sbjct: 115 ALRKMGIPTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQ----------KVELYPNM 164

Query: 189 KAINWNDADQGVTFEYEDIPADMVELANEWHQNLIESAAEASEELMEKYLGGEELTEAEI 248
N+ +++Q ++ E +++L+EKY+ G+ L E+
Sbjct: 165 CVTNFTESEQ------------------------WDTVIEGNDDLLEKYMSGKSLEALEL 200

Query: 249 KGALRQRVLNNEIILVTCGSAFKNKGVQAMLDAVIDYLPSPVDVPAINGILDDGKDTPAE 308
+ R N + V GSA N G+ +++ + + S
Sbjct: 201 EQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVITNKFYSSTH----------------- 243

Query: 309 RHASDDEPFSALAFKIATDPFVGNLTFFRVYSGVVNSGDTVLNSVKAARERFGRIVQMHA 368
FKI L + R+YSGV++ D+V S K + + +
Sbjct: 244 ---RGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEK-EKIKITEMYTSIN 299

Query: 369 NKREEIKEVRAGDIAAAIG----LKDVTTGDTLCDPDAPIILERMEFPEPVISIAVEPKT 424
+ +I + +G+I L V GDT P ER+E P P++ VEP
Sbjct: 300 GELCKIDKAYSGEIVILQNEFLKLNSV-LGDTKLLPQR----ERIENPLPLLQTTVEPSK 354

Query: 425 KADQEKMGLALGRLAKEDPSFRVWTDEESNQTIIAGMGELHLDIIVDRMKREFNVEANVG 484
+E + AL ++ DP R + D +++ I++ +G++ +++ ++ +++VE +
Sbjct: 355 PQQREMLLDALLEISDSDPLLRYYVDSATHEIILSFLGKVQMEVTCALLQEKYHVEIEIK 414

Query: 485 KPQVAYRETIRQKVTDVEGKHAKQSGGRGQYGHVVIDMYPLEPGSNPKGYEFINDIKGGV 544
+P V Y E +K E + + + + + PL GS G ++ + + G
Sbjct: 415 EPTVIYMERPLKK---AEYTIHIEVPPNPFWASIGLSVSPLPLGS---GMQYESSVSLGY 468

Query: 545 IPGEYIPAVDKGIQEQLKAGPLAGYPVVDMGIRLHFGSYHDVDSSELAFKLAASIAFKEG 604
+ + AV +GI+ + G L G+ V D I +G Y+ S+ F++ A I ++
Sbjct: 469 LNQSFQNAVMEGIRYGCEQG-LYGWNVTDCKICFKYGLYYSPVSTPADFRMLAPIVLEQV 527

Query: 605 FKKAKPVLLEPIMKVEVETPEENTGDVIGDLSRRRGMLKGQESEVTGVKIHAEVPLSEMF 664
KKA LLEP + ++ P+E D + + + + V + E+P +
Sbjct: 528 LKKAGTELLEPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQLKNNEVILSGEIPARCIQ 587

Query: 665 GYATQLRSLTKGRASYTMEFLKYDEAPSNVAQAVIEAR 702
Y + L T GR+ E Y + V + R
Sbjct: 588 EYRSDLTFFTNGRSVCLTELKGYHVT---TGEPVCQPR 622


105  Z4704  Z4716  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias   Product
Z4704     0       15       0.415967  hypothetical protein
Z4705     -1      16       1.845596  FKBP-type peptidylprolyl isomerase
Z4706     -1      15       3.379996  hypothetical protein
Z4707     -2      15       3.326987  FKBP-type peptidylprolyl isomerase
Z4708     -1      15       3.080046  hypothetical protein
Z4710     -2      14       2.983775  glutathione-regulated potassium-efflux system
Z4712     -1      18       2.694380  glutathione-regulated potassium-efflux system
Z4713     -1      19       1.918915  ABC transporter ATP-binding protein
Z4714     -2      12       0.916305  hydrolase
Z4715     -1      12       0.993910  hypothetical protein
Z4716     -2      12       1.182449  phosphoribulokinase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4704     ACRIFLAVINRP  29     0.024    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 28.7 bits (64), Expect = 0.024
Identities = 14/62 (22%), Positives = 29/62 (46%), Gaps = 1/62 (1%)

Query: 164 ASSVEDLVTQTLEFTIEEVNADRNV-SNNAKNRQIVLNLYEKGIFDIKDAINQVADRLNI 222
A +V+D VTQ +E + ++ + S + + + L + D A QV ++L +
Sbjct: 54 AQTVQDTVTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQL 113

Query: 223 SK 224
+
Sbjct: 114 AT 115


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4705     INFPOTNTIATR  107    6e-31    Macrophage infectivity potentiator signature.
		>INFPOTNTIATR#Macrophage infectivity potentiator signature.

Length = 233

Score = 107 bits (269), Expect = 6e-31
Identities = 62/186 (33%), Positives = 100/186 (53%), Gaps = 8/186 (4%)

Query: 28 AAKPATTADSKAAFKNDDQKSAYALGASLGRYMENSLKEQEKLGIKLDKDQLIAGVQDAF 87
A A A + D K +Y++GA LG K + GI ++ D L G+QD
Sbjct: 14 AMSTAMAATDATSLTTDKDKLSYSIGADLG-------KNFKNQGIDINPDVLAKGMQDGM 66

Query: 88 A-DKSKLSDQEIEQTLQAFEARVKSSAQAKMEKDAADNEAKGKEYREKFAKEKGVKTSST 146
+ + L++++++ L F+ + + A+ K A +N+AKG + + G+ +
Sbjct: 67 SGAQLILTEEQMKDVLSKFQKDLMAKRSAEFNKKAEENKAKGDAFLSANKSKPGIVVLPS 126

Query: 147 GLVYQVVEAGKGEAPKDSDTVVVNYKGTLIDGKEFDNSYTRGEPLSFRLDGVIPGWTEGL 206
GL Y++++AG G P SDTV V Y GTLIDG FD++ G+P +F++ VIPGWTE L
Sbjct: 127 GLQYKIIDAGTGAKPGKSDTVTVEYTGTLIDGTVFDSTEKAGKPATFQVSQVIPGWTEAL 186

Query: 207 KNIKKG 212
+ + G
Sbjct: 187 QLMPAG 192


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
Z4710     60KDINNERMP  31     0.021    60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 30.7 bits (69), Expect = 0.021
Identities = 13/69 (18%), Positives = 29/69 (42%), Gaps = 6/69 (8%)

Query: 261 TAIDPFKGLLLG---LFFISVGMSLNLGVLYTHL-LWVVISVVVLVAVKILVLYLLARLY 316
A+ P L + L+FIS + L +++ + W +++ V+ ++ L
Sbjct: 318 AAVAPHLDLTVDYGWLWFISQPLFKLLKWIHSFVGNWGFSIIIITFIVRGIMYPLTKA-- 375

Query: 317 GVRSSERMQ 325
S +M+
Sbjct: 376 QYTSMAKMR 384


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4712     ISCHRISMTASE  32     7e-04    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 31.9 bits (72), Expect = 7e-04
Identities = 32/135 (23%), Positives = 51/135 (37%), Gaps = 16/135 (11%)

Query: 12 YAHPESQDSVANRVLLKPATQLSNVTVHDLYAHYPDFFIDIPREQALLREHEVIVFQH-- 69
Y P + D N+V P + + +HD+ ++ D F L + +
Sbjct: 9 YQMPTASDMPQNKVSWVPDPNRAVLLIHDMQNYFVDAFTAGASPVTELSANIRKLKNQCV 68

Query: 70 ----PLYTYSCPALLKEWLDRVLSRGFASGPGGNQLAGKYWRSVITTGEPESA------Y 119
P+ + P DR L F GPG N +G Y +IT PE +
Sbjct: 69 QLGIPVVYTAQPGSQNP-DDRALLTDFW-GPGLN--SGPYEEKIITELAPEDDDLVLTKW 124

Query: 120 RYDALNRYPMSDVLR 134
RY A R + +++R
Sbjct: 125 RYSAFKRTNLLEMMR 139


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z4713     PF05272  30     0.033    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.0 bits (67), Expect = 0.033
Identities = 14/52 (26%), Positives = 20/52 (38%), Gaps = 5/52 (9%)

Query: 1 MIVFSSLQIRRGVRVLLDNATATI-NPGQKVG----LVGKNGCGKSTLLALL 47
++ + +L A + PG K L G G GKSTL+ L
Sbjct: 565 YKPRRLRYLQLVGKYILMGHVARVMEPGCKFDYSVVLEGTGGIGKSTLINTL 616


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z4716     PF07299  32     0.002    Fibronectin-binding protein (FBP)
		>PF07299#Fibronectin-binding protein (FBP)

Length = 219

Score = 31.8 bits (72), Expect = 0.002
Identities = 10/46 (21%), Positives = 21/46 (45%), Gaps = 2/46 (4%)

Query: 71 PEANDFGLLEQTFIEYGQSGKGKSRKYLHTYDEAVPWNQVPGTFTP 116
P+ + + E ++ KG SRK++ ++ + + GTF
Sbjct: 112 PDMEELDMKELSY--LSWIDKGSSRKFIIAKNDKNKFVGLQGTFQS 155


106  Z4809  Z4822  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
Z4809     0       18       -3.618953  acetyltransferase YhhY
Z4810     0       16       -1.938175  hypothetical protein
Z4811     -1      17        1.665036  hypothetical protein
Z4813     -2      22        3.750202  gamma-glutamyltranspeptidase
Z4815     -2      23        3.505469  hypothetical protein
Z4817     -2      21        3.214918  glycerophosphodiester phosphodiesterase
Z4818     -1      23        3.276062  glycerol-3-phosphate transporter ATP-binding
Z4819     -2      23        2.536734  glycerol-3-phosphate transporter membrane
Z4820     -1      23        3.159506  glycerol-3-phosphate transporter permease
Z4822     -2      23        3.171396  glycerol-3-phosphate transporter periplasmic
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4809     SACTRNSFRASE  36     1e-05    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 36.5 bits (84), Expect = 1e-05
Identities = 21/92 (22%), Positives = 33/92 (35%), Gaps = 16/92 (17%)

Query: 51 VACIDGDVVGHLTIDVQQRPRRSHVADFGICVDARWKNRGVASALMREMIE------MCD 104
+ ++ + +G + I + + D + D R K GV +AL+ + IE C
Sbjct: 69 LYYLENNCIGRIKIR-SNWNGYALIEDIAVAKDYRKK--GVGTALLHKAIEWAKENHFCG 125

Query: 105 NWLRVDRIELTVFVDNAPAIKVYKKFGFEIEG 136
L I N A Y K F I
Sbjct: 126 LMLETQDI-------NISACHFYAKHHFIIGA 150


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
Z4813     NAFLGMOTY  32     0.007    Sodium-type flagellar protein MotY precursor signature.
		>NAFLGMOTY#Sodium-type flagellar protein MotY precursor signature.

Length = 293

Score = 31.6 bits (71), Expect = 0.007
Identities = 27/82 (32%), Positives = 37/82 (45%), Gaps = 17/82 (20%)

Query: 276 RTPISGDYRGYQVYSMPPPSSGGIHIVQILNI--LENFDMKKYGF-GSADAMQIMAEAEK 332
R P+ G+ R + SMPPP G H +I N+ + FD G+ G A I++E EK
Sbjct: 77 RRPM-GETRNVSLISMPPPWRPGEHADRITNLKFFKQFD----GYVGGQTAWGILSELEK 131

Query: 333 YAYADRSEYLGDPDFVKVPWQA 354
Y P F WQ+
Sbjct: 132 GRY---------PTFSYQDWQS 144


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z4817     PF04619  30     0.004    Dr-family adhesin
		>PF04619#Dr-family adhesin

Length = 160

Score = 30.3 bits (68), Expect = 0.004
Identities = 13/63 (20%), Positives = 23/63 (36%), Gaps = 4/63 (6%)

Query: 29 VGAKYGHKMIEFDAKLSKDGEIFLLHDDNLERTSNGWGVAGELNWQD----LLRVDAGSW 84
+G ++ D + G+ FL+ D+N ++ W + D GSW
Sbjct: 70 LGCDARQVALKADTDNFEQGKFFLISDNNRDKLYVNIRPTDNSAWTTDNGVFYKNDVGSW 129

Query: 85 YGK 87
G
Sbjct: 130 GGI 132


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z4818     PF05272  31     0.010    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.8 bits (69), Expect = 0.010
Identities = 11/35 (31%), Positives = 19/35 (54%)

Query: 33 IVMVGPSGCGKSTLLRMVAGLERVTEGDICINDQR 67
+V+ G G GKSTL+ + GL+ ++ I +
Sbjct: 599 VVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGTGK 633


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
Z4822     MALTOSEBP  39     2e-05    Maltose binding protein signature.
		>MALTOSEBP#Maltose binding protein signature.

Length = 396

Score = 39.3 bits (91), Expect = 2e-05
Identities = 39/160 (24%), Positives = 66/160 (41%), Gaps = 14/160 (8%)

Query: 134 GHLLSQPFNSSTPVLYYNKDAFKKAGLDPEQPPKTWQDLADYAAKLKASGMKCGYASGWQ 193
G L++ P L YNKD PPKTW+++ +LKA G + +
Sbjct: 127 GKLIAYPIAVEALSLIYNKDLLP-------NPPKTWEEIPALDKELKAKGKSALMFNLQE 179

Query: 194 GWIQLENFSAWNGLPFASKNNGFDGTDAVLEF--NKPEQVKHIAMLEEMNKKGDFSYVGR 251
+ +A G F +N +D D ++ K + +++ + D Y
Sbjct: 180 PYFTWPLIAADGGYAFKYENGKYDIKDVGVDNAGAKAGLTFLVDLIKNKHMNADTDY--- 236

Query: 252 KDESTEKFYNGDCAMTTASSGSLANIREYAKFNYGVGMMP 291
+ F G+ AMT + +NI + +K NYGV ++P
Sbjct: 237 -SIAEAAFNKGETAMTINGPWAWSNI-DTSKVNYGVTVLP 274


107  Z4884  Z4891  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
Z4884     1       17       -3.801751  transporter
Z4885     -1      16       -3.554773  ABC transporter ATP-binding protein, fragment 1
Z4886     -1      20       -5.232150  hypothetical protein
Z4887     0       23       -6.999494  hypothetical protein
Z4888     -1      18       -4.830690  hypothetical protein
Z4890     -1      12       -0.234234  hypothetical protein
Z4891     0       16        1.846508  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4884     ABC2TRNSPORT  51     2e-09    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 51.1 bits (122), Expect = 2e-09
Identities = 41/171 (23%), Positives = 73/171 (42%), Gaps = 7/171 (4%)

Query: 201 REREHGTVEHLLVMPITPFEIMMAKI-WSMGLVVLVVSGLSLVLMVKGVLGVPIEGSIPL 259
R T E +L + +I++ ++ W+ L +G+ +V G + L
Sbjct: 93 RMEGQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGAGIGVVAAALGY----TQWLSLL 148

Query: 260 FMLGV-ALSLFATTSIGIFMGTIARSMPQLGLLVILVLLPLQMLSGGSTPRESMPQMVQD 318
+ L V AL+ A S+G+ + +A S LV+ P+ LSG P + +P + Q
Sbjct: 149 YALPVIALTGLAFASLGMVVTALAPSYDYFIFYQTLVITPILFLSGAVFPVDQLPIVFQT 208

Query: 319 IMLIMPTTHFVSLAQAILYRGAGFEIVWPQFLTLMAIGGAFF-TIALLRFR 368
+P +H + L + I+ ++ + I FF + ALLR R
Sbjct: 209 AARFLPLSHSIDLIRPIMLGHPVVDVCQHVGALCIYIVIPFFLSTALLRRR 259


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z4885     PF05272  30     0.043    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.4 bits (68), Expect = 0.043
Identities = 9/26 (34%), Positives = 14/26 (53%)

Query: 20 ARCMVGLIGPDGVGKSSLLSLISGAR 45
V L G G+GKS+L++ + G
Sbjct: 595 FDYSVVLEGTGGIGKSTLINTLVGLD 620


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
Z4886     RTXTOXIND  82     3e-19    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 81.8 bits (202), Expect = 3e-19
Identities = 72/408 (17%), Positives = 137/408 (33%), Gaps = 81/408 (19%)

Query: 6 RHLAWWGVGLLAVAAIVXWWLLRPAGVP-EGFAVSNGRIEATEVDIASKIAGRIDTILVK 64
R +A++ +G L +A I+ G +GR + I + I+VK
Sbjct: 58 RLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKE----IKPIENSIVKEIIVK 113

Query: 65 EGQFVREGEVLAKMDTRV----------------LQEQRLEAI----------------- 91
EG+ VR+G+VL K+ L++ R + +
Sbjct: 114 EGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDE 173

Query: 92 -------------------AQIKEAQSAVAAAQALLEQRQSETRAAQSLVNQRQAELDSV 132
Q Q+ + L+++++E + +N+ +
Sbjct: 174 PYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVE 233

Query: 133 AKRHTRSRSLAQRGAISAQQLDDDRAAAESARAALESAKAQVSASKAAIEAARTNIIQ-- 190
R SL + AI+ + + A L K+Q+ ++ I +A+
Sbjct: 234 KSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVT 293

Query: 191 -----------AQTRVEAAQATERRIAADID--DSELKAPRDGRV-QYRVAEPGEVLAAG 236
QT T + S ++AP +V Q +V G V+
Sbjct: 294 QLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTA 353

Query: 237 GRVLNMVDLSDVY-MTFFLPTEQAGTLKLGGEARLILDAAPDLRIPATISFVASVAQFTP 295
++ +V D +T + + G + +G A + ++A P R V V
Sbjct: 354 ETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYG---YLVGKVKNINL 410

Query: 296 KTVETSDERLKLMFRVKARIPPELLQQHLEYV--KTGLPGVAWVRVNE 341
+E D+RL L+F V I L + + +G+ A ++
Sbjct: 411 DAIE--DQRLGLVFNVIISIEENCLSTGNKNIPLSSGMAVTAEIKTGM 456


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
Z4891     ALARACEMASE  29     0.033    Alanine racemase signature.
		>ALARACEMASE#Alanine racemase signature.

Length = 356

Score = 29.0 bits (65), Expect = 0.033
Identities = 23/98 (23%), Positives = 38/98 (38%), Gaps = 18/98 (18%)

Query: 226 ENLLFTHRGLSGPAVLQISSYWQPGEFVSINLLPDVDLETFL--NEQRNAHPNQSLKNTL 283
E + RG GP +L + ++ + + + L T + N Q A N LK L
Sbjct: 63 EAITLRERGWKGP-ILMLEGFFHAQD---LEIYDQHRLTTCVHSNWQLKALQNARLKAPL 118

Query: 284 AVHL------------PKRLVERLQQLGQIPDVSLKQL 309
++L P R++ QQL + +V L
Sbjct: 119 DIYLKVNSGMNRLGFQPDRVLTVWQQLRAMANVGEMTL 156


108  Z4968m  Z4977  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
Z4968m    0       13       -1.767971  PapC-like porin protein involved in fimbrial
Z4969     -1      14       -2.642211  fimbrial chaperone
Z4971     -1      12        0.195765  major fimbrial subunit
Z4972     -1      12        1.376801  resistance protein
Z4973     0       12        1.602373  lipase
Z4974     -1      10        1.498602  3-methyladenine DNA glycosylase
Z4975     0       10        1.234278  hypothetical protein
Z4976     -1      10        1.200138  biotin sulfoxide reductase
Z4977     -1      16       -0.696226  outer membrane lipoprotein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z4968m    PF00577  676    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 676 bits (1745), Expect = 0.0
Identities = 352/896 (39%), Positives = 490/896 (54%), Gaps = 135/896 (15%)

Query: 3 RKTVSRTFSSFSISVVAVAVASTFSAHAGKFNPKFLEDVQGVGQHVDLTMFEKGQEQQLP 62
RK F A A + S+ FNP+FL D DL+ FE GQ + P
Sbjct: 19 RKHRLAGFFVRLFVACAFAAQAPLSSAELYFNPRFLAD--DPQAVADLSRFENGQ-ELPP 75

Query: 63 GIYRVSVYVNEQRMETRTLEFKEATEAQRKAMGESLVPCLSRTQLAEMGVRVESFPALNL 122
G YRV +Y+N M TR + F + +VPCL+R QLA MG+ S +NL
Sbjct: 76 GTYRVDIYLNNGYMATRDVTFNTGDS------EQGIVPCLTRAQLASMGLNTASVSGMNL 129

Query: 123 VSAEACVPFDEIIPLASSHFDFSEQKLVLSFPQAAMHQVARGTVPESLWDEGIPALLLDY 182
++ +ACVP +I A++ D +Q+L L+ PQA M ARG +P LWD GI A LL+Y
Sbjct: 130 LADDACVPLTSMIHDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNY 189

Query: 183 SFSGSNSEYDSTGSSSSYVDDNGTVHHDDGKDTLKSDSYYLNLRSGLNLGAWRLRNYSTW 242
+FSG++ + G+S YLNL+SGLN+GAWRLR+ +TW
Sbjct: 190 NFSGNSVQNRIGGNS---------------------HYAYLNLQSGLNIGAWRLRDNTTW 228

Query: 243 S------HSGGKAQWDNIGTSLSRAIIPFKAQLTMGDTATAGDIFDSVQMRGAMLASDEE 296
S SG K +W +I T L R IIP +++LT+GD T GDIFD + RGA LASD+
Sbjct: 229 SYNSSDSSSGSKNKWQHINTWLERDIIPLRSRLTLGDGYTQGDIFDGINFRGAQLASDDN 288

Query: 297 MLPDSQRGFAPIVRGIAKSN-----------------------------------AEVSI 321
MLPDSQRGFAP++ GIA+ +V+I
Sbjct: 289 MLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIYNSTVPPGPFTINDIYAAGNSGDLQVTI 348

Query: 322 -EQNGYVIYRTYVQ-------------------AGNYDSA-----SPRFGQLDLIYGLPW 356
E +G + + AG Y S PRF Q L++GLP
Sbjct: 349 KEADGST--QIFTVPYSSVPLLQREGHTRYSITAGEYRSGNAQQEKPRFFQSTLLHGLPA 406

Query: 357 GMTAYGGVLISNNYNAFTLGIGKNFGYIGAISIDVTQAKSELNNDRDSQGQSYRFLYSKS 416
G T YGG +++ Y AF GIGKN G +GA+S+D+TQA S L +D GQS RFLY+KS
Sbjct: 407 GWTIYGGTQLADRYRAFNFGIGKNMGALGALSVDMTQANSTLPDDSQHDGQSVRFLYNKS 466

Query: 417 F-ESGTDFRLAGYRYSTSGFYTFQEATDVRSDA-----------------DSDYNRYHKR 458
ESGT+ +L GYRYSTSG++ F + T R + D Y+KR
Sbjct: 467 LNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKR 526

Query: 459 SEIQGNLTQQLGAYGSVYLNLTQQDYWNDAGKQNTVSAGYNGRIGKVSYSIAYSWNKSPE 518
++Q +TQQLG ++YL+ + Q YW + AG N ++++++YS K+
Sbjct: 527 GKLQLTVTQQLGRTSTLYLSGSHQTYWGTSNVDEQFQAGLNTAFEDINWTLSYSLTKNAW 586

Query: 519 WDESDRLWSFNISVPLGR------------AWSNYRVTTDQDGRTNQQVGVSGTLLEDRN 566
D++ + N+++P A ++Y ++ D +GR GV GTLLED N
Sbjct: 587 QKGRDQMLALNVNIPFSHWLRSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNN 646

Query: 567 LSYSVQEGYASNG---VGNSGNANVGYQGGSGNVNVGYSYGKDYRQLNYSVRGGVIVHSE 623
LSYSVQ GYA G G++G A + Y+GG GN N+GYS+ D +QL Y V GGV+ H+
Sbjct: 647 LSYSVQTGYAGGGDGNSGSTGYATLNYRGGYGNANIGYSHSDDIKQLYYGVSGGVLAHAN 706

Query: 624 GVTLSQPLGETMTLISVPGARNARVVNNGGVQVDWMGNAIVPYAMPYRENEISLRSDSLG 683
GVTL QPL +T+ L+ PGA++A+V N GV+ DW G A++PYA YREN ++L +++L
Sbjct: 707 GVTLGQPLNDTVVLVKAPGAKDAKVENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLA 766

Query: 684 DDVDVENAFQKVVPTRGAIVRARFDTRVGYRVLMTLLRSAGSPVPFGATATLITDKQNEV 743
D+VD++NA VVPTRGAIVRA F RVG ++LMT L P+PFGA +T + ++
Sbjct: 767 DNVDLDNAVANVVPTRGAIVRAEFKARVGIKLLMT-LTHNNKPLPFGAM---VTSESSQS 822

Query: 744 SSIVGEEGQLYISGMPEEGRVLIKWGNDASQQCVAPYKLSLELKQGGIIPVSANCQ 799
S IV + GQ+Y+SGMP G+V +KWG + + CVA Y+L E +Q + +SA C+
Sbjct: 823 SGIVADNGQVYLSGMPLAGKVQVKWGEEENAHCVANYQLPPESQQQLLTQLSAECR 878


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z4972     TCRTETA  43     2e-06    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 42.5 bits (100), Expect = 2e-06
Identities = 47/275 (17%), Positives = 94/275 (34%), Gaps = 32/275 (11%)

Query: 44 PVSQVAFSFGLLSLGLAIS----SSVAGKLQERFGVKRVTIASGILLGLGFFLTAHSDNL 99
+ V +G+L A+ + V G L +RFG + V + S + + + A + L
Sbjct: 37 HSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFL 96

Query: 100 MMLWLS---AGVLVGLADGAGYLL----TLSNCVKWFPERKGLISAFAIGSYGLGSLGFK 152
+L++ AG+ AG + + F + LG
Sbjct: 97 WVLYIGRIVAGITGATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGG---- 152

Query: 153 FIDTQLLETVGLEKTFVIWGAIALVMIVFGATLMKDAPKQEVKTSNGVVEKDYTLAESMR 212
L+ F A+ + + G L+ ++ K E + R
Sbjct: 153 -----LMGGFSPHAPFFAAAALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWAR 207

Query: 213 --KPQYWMLAVMFLTACMSG----LYVIGVAKDIAQSLAHLDVVSAANAVTVISIAN-LS 265
++AV F+ + L+VI + H D + ++ I + L+
Sbjct: 208 GMTVVAALMAVFFIMQLVGQVPAALWVI-----FGEDRFHWDATTIGISLAAFGILHSLA 262

Query: 266 GRLVLGILSDKIARIRVITIGQVISLVGMAALLFA 300
++ G ++ ++ R + +G + G L FA
Sbjct: 263 QAMITGPVAARLGERRALMLGMIADGTGYILLAFA 297



Score = 36.7 bits (85), Expect = 1e-04
Identities = 37/155 (23%), Positives = 63/155 (40%), Gaps = 9/155 (5%)

Query: 241 AQSLAHLDVVSAANAVTVISIANLSGRLVLGILSDKIARIRVITIGQVISLVGMAALLFA 300
AH ++ A A+ + A + G L SD+ R V+ + + V A + A
Sbjct: 39 NDVTAHYGILLALYALMQFACAPVLGAL-----SDRFGRRPVLLVSLAGAAVDYAIMATA 93

Query: 301 PLNAVTFFAAIACVAFNFGGTITVFPSLVSEFFGLNNLAKNYGVIYLGFGIGSICGSIIA 360
P V + I VA G T V + +++ + A+++G + FG G + G ++
Sbjct: 94 PFLWVLYIGRI--VAGITGATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLG 151

Query: 361 SLFGGFYVK--FYVIFALLILSLALSTTIRQPEQK 393
L GGF F+ AL L+ + K
Sbjct: 152 GLMGGFSPHAPFFAAAALNGLNFLTGCFLLPESHK 186


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4973     ECOLNEIPORIN  28     0.040    E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 27.8 bits (62), Expect = 0.040
Identities = 22/117 (18%), Positives = 47/117 (40%), Gaps = 16/117 (13%)

Query: 121 SMYNEFGDSTTTLTDPLWHASVSTLGWRVDSRLGDLRPWAQISYNQQFGENIWKAQSGLS 180
S+ + D+ + H S + + + R G++ P ++SY F +
Sbjct: 228 SVAVQQQDAKLV-EENYSHNSQTEVAATLAYRFGNVTP--RVSYAHGFKGSF-------- 276

Query: 181 RMTATNQNGNWLDVTVGADMLLNQNIAAYAA---LSQAENTTNNSDYLYTMGVSARF 234
ATN N ++ V VGA+ ++ +A + L + + + +G+ +F
Sbjct: 277 --DATNYNNDYDQVVVGAEYDFSKRTSALVSAGWLQEGKGESKFVSTAGGVGLRHKF 331


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z4975     SACTRNSFRASE  33     2e-04    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 33.0 bits (75), Expect = 2e-04
Identities = 16/52 (30%), Positives = 22/52 (42%), Gaps = 5/52 (9%)

Query: 76 VAPKAVRRGIGKALMQYV-----QQRYPHLMLEVYQKNQPAIDFYQAQGFHI 122
VA ++G+G AL+ + + LMLE N A FY F I
Sbjct: 97 VAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFII 148


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
Z4977     OMPADOMAIN  113    2e-32    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 113 bits (285), Expect = 2e-32
Identities = 41/122 (33%), Positives = 62/122 (50%), Gaps = 11/122 (9%)

Query: 108 LNMPNNVTFDSSSATLKPAGANTLTGVAMVLKEY--PKTAVNVIGYTDSTGGHDLNMRLS 165
+ ++V F+ + ATLKP G L + L +V V+GYTD G N LS
Sbjct: 215 FTLKSDVLFNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAYNQGLS 274

Query: 166 QQRADSVASALITQGVDASRIRTQGLGPANPIASNSTAEGK---------AQNRRVEITL 216
++RA SV LI++G+ A +I +G+G +NP+ N+ K A +RRVEI +
Sbjct: 275 ERRAQSVVDYLISKGIPADKISARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEV 334

Query: 217 SP 218

Sbjct: 335 KG 336


109  Z5106  Z5114  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z5106     2       37       -10.557575  hypothetical protein
Z5107     3       39       -9.712237   hypothetical protein
Z5108     3       37       -9.441759   hypothetical protein
Z5109     2       36       -8.903215   hypothetical protein
Z5110     3       39       -8.421149   intimin adherence protein
Z5111     5       45       -9.256166   hypothetical protein
Z5112     5       45       -9.110526   translocated intimin receptor protein
Z5113     5       44       -13.578658  hypothetical protein
Z5114     3       45       -12.196621  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
Z5106     BACINVASINB  30     0.020    Salmonella/Shigella invasin protein B signature.
		>BACINVASINB#Salmonella/Shigella invasin protein B signature.

Length = 593

Score = 29.7 bits (66), Expect = 0.020
Identities = 29/102 (28%), Positives = 52/102 (50%), Gaps = 9/102 (8%)

Query: 112 MMMVTLLSLDTSAQKVSSLKNSNEIY---MDGQTKALENKTQEYKKQLEEQQKAEEKSQK 168
M+M + + SL+N ++ +G+ +E K+ E++ EE +KAEE ++
Sbjct: 258 MLMAMFIEI-VGKNTEESLQNDLALFNALQEGRQAEMEKKSAEFQ---EETRKAEETNRI 313

Query: 169 SKIVGQVFGWLGVALTAVAAVFNPALWAVVAIGATAMALQTA 210
+G+V G L ++ VAAVF A +A+ A +A+ A
Sbjct: 314 MGCIGKVLGALLTIVSVVAAVFTGG--ASLALAAVGLAVMVA 353


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z5108     PF07201  28     0.047    Hypersensitivity response secretion protein HrpJ
		>PF07201#Hypersensitivity response secretion protein HrpJ

Length = 293

Score = 28.3 bits (63), Expect = 0.047
Identities = 43/225 (19%), Positives = 73/225 (32%), Gaps = 23/225 (10%)

Query: 39 SPLINLQNELAMITSSSLSETIEGLSLGYRK---GSARKEEEGSTIEKLLNDMQELLTLT 95
+ ++ E+ SE E LSL RK AR + + + L+ + EL
Sbjct: 47 QSIADMAEEVTF----VFSERKE-LSLDKRKLSDSQARVSDVEEQVNQYLSKVPEL---E 98

Query: 96 DSDKIKELS--LKNSGL--LEQHDPTLAMFGNMPKGEIVALISSLLQSK--FVKIELKKK 149
+ EL L NS L Q L P + L K L
Sbjct: 99 QKQNVSELLSLLSNSPNISLSQLKAYLEGKSEEPSEQFKMLCGLRDALKGRPELAHLSHL 158

Query: 150 YARLLLDLLGEDDWELAL-----LSWLGVGELNQEGIQKIKKLYEKAKDEDSENGASLLD 204
+ L+ + E + L + +Q ++ Y A + ++
Sbjct: 159 VEQALVSMAEEQGETIVLGARITPEAYRESQSGVNPLQPLRDTYRDAV-MGYQGIYAIWS 217

Query: 205 WFMEIKDLPEREKHLKVIIRALSFDLSYMSSFEDKVKTSSIISDL 249
+ + + + + +ALS DL S + K +ISDL
Sbjct: 218 DLQKRFPNGDIDSVILFLQKALSADLQSQQSGSGREKLGIVISDL 262


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z5110     INTIMIN  1459   0.0      Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 1459 bits (3777), Expect = 0.0
Identities = 780/942 (82%), Positives = 837/942 (88%), Gaps = 11/942 (1%)

Query: 1 MITHGCYTRTRHKHKLKKTLIMLSAGLGLFFYVNQNSFANGENYFKLGSDSKLLTHDSYQ 60
MITHG Y RTRHKHKLKKT IMLSAGLGLFFYVNQNSFANGENYFKLGSDSKLLTH+SYQ
Sbjct: 1 MITHGFYARTRHKHKLKKTFIMLSAGLGLFFYVNQNSFANGENYFKLGSDSKLLTHNSYQ 60

Query: 61 NRLFYTLKTGETVADLSKSQDINLSTIWSLNKHLYSSESEMMKAAPGQQIILPLKKLPFE 120
NRLFYTLKTGETVADLSKSQDINLSTIWSLNKHLYSSESEMMKA PGQQIILPLKKLPFE
Sbjct: 61 NRLFYTLKTGETVADLSKSQDINLSTIWSLNKHLYSSESEMMKAEPGQQIILPLKKLPFE 120

Query: 121 YSALPLLGSAPLVAAGGVAGHTNKLTKMSPDVTKSNMTDDKALNYAAQQAASLGSQLQSR 180
YSALPLLGSAPLVAAGGVAGHTNKLTKMSPDVTKSNMTDDKALNYAAQQAASLGSQLQSR
Sbjct: 121 YSALPLLGSAPLVAAGGVAGHTNKLTKMSPDVTKSNMTDDKALNYAAQQAASLGSQLQSR 180

Query: 181 SLNGDYAKDTALGIAGNQASSQLQAWLQHYGTAEVNLQSGNNFDGSSLDFLLPFYDSEKM 240
SLNGDYAKDTALGIAGNQASSQLQAWLQHYGTAEVNLQSGNNFDGSSLDFLLPFYDSEKM
Sbjct: 181 SLNGDYAKDTALGIAGNQASSQLQAWLQHYGTAEVNLQSGNNFDGSSLDFLLPFYDSEKM 240

Query: 241 LAFGQVGARYIDSRFTANLGAGQRFFLPANMLGYNVFIDQDFSGDNTRLGIGGEYWRDYF 300
LAFGQVGARYIDSRFTANLGAGQRFFLP NMLGYNVFIDQDFSGDNTRLGIGGEYWRDYF
Sbjct: 241 LAFGQVGARYIDSRFTANLGAGQRFFLPENMLGYNVFIDQDFSGDNTRLGIGGEYWRDYF 300

Query: 301 KSSVNGYFRMSGWHESYNKKDYDERPANGFDIRFNGYLPSYPALGAKLIYEQYYGDNVAL 360
KSSVNGYFRMSGWHESYNKKDYDERPANGFDIRFNGYLPSYPALGAKL+YEQYYGDNVAL
Sbjct: 301 KSSVNGYFRMSGWHESYNKKDYDERPANGFDIRFNGYLPSYPALGAKLMYEQYYGDNVAL 360

Query: 361 FNSDKLQSNPGAATVGVNYTPIPLVTMGIDYRHGTGNENDLLYSMQFRYQFDKSWSQQIE 420
FNSDKLQSNPGAATVGVNYTPIPLVTMGIDYRHGTGNENDLLYSMQFRYQFDK WSQQIE
Sbjct: 361 FNSDKLQSNPGAATVGVNYTPIPLVTMGIDYRHGTGNENDLLYSMQFRYQFDKPWSQQIE 420

Query: 421 PQYVNELRTLSGSRYDLVQRNNNIILEYKKQDILSLNIPHDINGTEHSTQKIQLIVKSKY 480
PQYVNELRTLSGSRYDLVQRNNNIILEYKKQDILSLNIPHDINGTE STQKIQLIVKSKY
Sbjct: 421 PQYVNELRTLSGSRYDLVQRNNNIILEYKKQDILSLNIPHDINGTERSTQKIQLIVKSKY 480

Query: 481 GLDRIVWDDSALRSQGGQIQHSGSQSAQDYQAILPAYVQGGSNIYKVTARAYDRNGNSSN 540
GLDRIVWDDSALRSQGGQIQHSGSQSAQDYQAILPAYVQGGSN+YKVTARAYDRNGNSSN
Sbjct: 481 GLDRIVWDDSALRSQGGQIQHSGSQSAQDYQAILPAYVQGGSNVYKVTARAYDRNGNSSN 540

Query: 541 NVQLTITVLSNGQVVDQVGVTDFTADKTSAKADNADTITYTATVKKNGVAQANVPVSFNI 600
NV LTITVLSNGQVVDQVGVTDFTADKTSAKAD + ITYTATVKKNGVAQANVPVSFNI
Sbjct: 541 NVLLTITVLSNGQVVDQVGVTDFTADKTSAKADGTEAITYTATVKKNGVAQANVPVSFNI 600

Query: 601 VSGTATLGANSAKTDANGKATVTLKSSTPGQVVVSAKTAEMTSALNASAVIFFDQTKASI 660
VSGTA L ANSA T+ +GKATVTLKS PGQVVVSAKTAEMTSALNA+AVIF DQTKASI
Sbjct: 601 VSGTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAKTAEMTSALNANAVIFVDQTKASI 660

Query: 661 TEIKADKTTAVANGKDAIKYTVKVMKNGQPVNNQSVTFSTNFGMFNGKSQTQATTGNDGR 720
TEIKADKTTAVANG+DAI YTVKVMK +PV+NQ VTF+T G + + T +G
Sbjct: 661 TEIKADKTTAVANGQDAITYTVKVMKGDKPVSNQEVTFTTTLGKLS---NSTEKTDTNGY 717

Query: 721 ATITLTSSSAGKATVSATVSDGA-EVKATEVTFFDELKID-NKVDIIGNNVRGELPNIWL 778
A +TLTS++ GK+ VSA VSD A +VKA EV FF L ID ++I+G V+G+LP +WL
Sbjct: 718 AKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFFTTLTIDDGNIEIVGTGVKGKLPTVWL 777

Query: 779 QYGQFKLKASGGDGTYSWYSENTSIATVDA-SGKVTLNGKGSVVIKATSGDKQTVSYTIK 837
QYGQ LKASGG+G Y+W S N +IA+VDA SG+VTL KG+ I S D QT +YTI
Sbjct: 778 QYGQVNLKASGGNGKYTWRSANPAIASVDASSGQVTLKEKGTTTISVISSDNQTATYTIA 837

Query: 838 APSYMI--KVDKQAYYADAMSICKNL---LPSTQTVLSDIYDSWGAANKYSHYSSMNSIT 892
P+ +I + K+ Y DA++ CKN LPS+Q L +++ +WGAANKY +Y S +I
Sbjct: 838 TPNSLIVPNMSKRVTYNDAVNTCKNFGGKLPSSQNELENVFKAWGAANKYEYYKSSQTII 897

Query: 893 AWIKQTSSEQRSGVSSTYNLITQNPLPGVNVNTPNVYAVCVE 934
+W++QT+ + +SGV+STY+L+ QNPL + + N YA CV+
Sbjct: 898 SWVQQTAQDAKSGVASTYDLVKQNPLNNIKASESNAYATCVK 939


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z5111     PF05932  122    4e-39    Tir chaperone protein (CesT)
		>PF05932#Tir chaperone protein (CesT)

Length = 127

Score = 122 bits (309), Expect = 4e-39
Identities = 24/125 (19%), Positives = 52/125 (41%), Gaps = 5/125 (4%)

Query: 1 MSSRS-ELLLEKFAEKIGIGSISFNENRLCSFAIDEIYYISLS-DANDEYMMIYGVCGKF 58
MS+ + LL+ F+ + + + F+++ C+ ID + ++LS D E +++ G+
Sbjct: 1 MSNLFYKTLLDDFSRSLEMQPLVFDDHGTCNMIIDNTFALTLSCDYARERLLLIGLLEPH 60

Query: 59 PTDNSNFALEILNANLWFAENGGPYLCYEAGAQSLLLALRFPLDDATPEKLENEIEVVVK 118
+L L N GP L + + P + + L+ E+ +++
Sbjct: 61 KD---IPQQCLLAGALNPLLNAGPGLGLDEKSGLYHAYQSIPREKLSVPTLKREMAGLLE 117

Query: 119 SMENL 123
M
Sbjct: 118 WMRGW 122


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5112     TRNSINTIMINR  731    0.0      Translocated intimin receptor (Tir) signature.
		>TRNSINTIMINR#Translocated intimin receptor (Tir) signature.

Length = 549

Score = 731 bits (1887), Expect = 0.0
Identities = 328/566 (57%), Positives = 390/566 (68%), Gaps = 25/566 (4%)

Query: 1 MPIGNLGHNPNVNNSIPPAPPLPSQTDGA--GGRGQLINSTGPLGSRALFTPVRNSMADS 58
MPIGNLG+N N N+ IPPAPPLPSQTDGA GG G LI+STG LGSR+LF+P+RNSMADS
Sbjct: 1 MPIGNLGNNVNGNHLIPPAPPLPSQTDGAARGGTGHLISSTGALGSRSLFSPLRNSMADS 60

Query: 59 GDNRASDVPGLPVNPMRLAA--SEITLNDGFEVLHDHGPLDTLNRQIGSSVFRVETQEDG 116
D+R D+PGLP NP RLAA SE L GFEVLHD GPLD LN QIG S FRVE Q DG
Sbjct: 61 VDSR--DIPGLPTNPSRLAAATSETCLLGGFEVLHDKGPLDILNTQIGPSAFRVEVQADG 118

Query: 117 KHIAVGQRNGVETSVVLSDQEYARLQSIDPEGKDKFVFTGGRGGAGHAMVTVASDITEAR 176
H A+G++NG+E SV LS QE++ LQSID EGK++FVFTGGRGG+GH MVTVASDI EAR
Sbjct: 119 THAAIGEKNGLEVSVTLSPQEWSSLQSIDTEGKNRFVFTGGRGGSGHPMVTVASDIAEAR 178

Query: 177 QRILELLEPKGTGESK-GAGESKGVGELRESNSGAENTTETQTSTSTSSLRSDPKLWLAL 235
+IL L+P G + +++ VG S +ET TST+ SS+RSDPK W+++
Sbjct: 179 TKILAKLDPDNHGGRQPKDVDTRSVGVGSASGIDDGVVSETHTSTTNSSVRSDPKFWVSV 238

Query: 236 GTVATGLIGLAATGIVQALALTPEPDSPTTTDPDAAASATETATRDQLTKEAFQNPDNQK 295
G +A GL GLAATGI QALALTPEPD PTTTDPD AA+A E+AT+DQLT+EAF+NP+NQK
Sbjct: 239 GAIAAGLAGLAATGIAQALALTPEPDDPTTTDPDQAANAAESATKDQLTQEAFKNPENQK 298

Query: 296 VNIDELGNAIPSGVLKDDVVANIEEQAKAAGEEAKQQAIENNAQAQKKYDEQQAKRQEEL 355
VNID GNAIPSG LKDD+V I +QAK AGE A+QQA+E+NAQAQ++Y++Q A+RQEEL
Sbjct: 299 VNIDANGNAIPSGELKDDIVEQIAQQAKEAGEVARQQAVESNAQAQQRYEDQHARRQEEL 358

Query: 356 KVSSGAGYGLSGALILGGGIGVAVTAALHRKNQPVEQTTTTTTTTTTTSARTVENKPANN 415
++SSG GYGLS ALI+ GGIG VT ALHR+NQP EQTTTTTT TV +
Sbjct: 359 QLSSGIGYGLSSALIVAGGIGAGVTTALHRRNQPAEQTTTTTT-------HTVVQQQTGG 411

Query: 416 TPAQGNVDTPGSEDTMESRRSSMASTSSTFFDTSSIGTVQNPYADV---KTSLHDSQVPT 472
P P RR S S +ST + SS V NPYA+V + SL Q
Sbjct: 412 IPQHKVALMPQERRRFSDRRDSQGSVASTHWSDSS-SEVVNPYAEVGGARNSLSAHQPEE 470

Query: 473 SNSNTSVQNMGNTDSVVYSTIQHPPRDTTDNGARLLGNPSAGIQSTYARLALSGGLRHDM 532
+ + G YS IQ+ G RL+G P GIQSTYA LA SGGLR M
Sbjct: 471 HIYDEVAADPG------YSVIQNFSGSGPVTG-RLIGTPGQGIQSTYALLANSGGLRLGM 523

Query: 533 GGLTGGSNSAVNTSNNPPAPGSHRFV 558
GGLT G +AV++ N P PG RFV
Sbjct: 524 GGLTSGGETAVSSVNAAPTPGPVRFV 549


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z5114     PF06704  36     6e-06    DspF/AvrF protein
		>PF06704#DspF/AvrF protein

Length = 129

Score = 36.4 bits (84), Expect = 6e-06
Identities = 21/119 (17%), Positives = 51/119 (42%), Gaps = 4/119 (3%)

Query: 3 EKFRTDLAHTFGIALEEQTDVLSFHDNDGHEW-ILECASQSEILFFYCYLLNSESIQINS 61
+ L G +L Q V + +D+ +E ++E SE++ F+C + S +
Sbjct: 9 SRLIKSLGAQLGTSLTAQNGVCALYDSQDNEAAVIEMPDHSEMVIFHCRVGRSPDRAADL 68

Query: 62 ILEMNSNRELLGMF--FLSLKDDNILLNIAFPADKIDITEFANLMENGYLLKNEIIRSL 118
++ N ++ M + ++ ++ L +D +F + G++++ R+L
Sbjct: 69 QKLLSLNFDVARMHGSWFAVDQGDVRLCAQRELAVLDEAQFCDTA-RGFIVQAREARAL 126


110  Z5124  Z5135  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z5124     3       44       -15.066911  hypothetical protein
Z5125     3       47       -16.107747  hypothetical protein
Z5126     5       47       -16.527745  hypothetical protein
Z5127     7       50       -17.908051  hypothetical protein
Z5128     7       50       -17.803663  hypothetical protein
Z5129     7       50       -18.248057  negative regulator GrlR
Z5131     7       52       -18.767223  hypothetical protein
Z5132     9       51       -18.442223  secretion system apparatus protein SsaU
Z5133     9       53       -18.904764  hypothetical protein
Z5134     9       52       -19.208021  hypothetical protein
Z5135     9       50       -18.627978  type III secretion system protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5124     FLGMRINGFLIF  56     1e-11    Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 55.7 bits (134), Expect = 1e-11
Identities = 32/166 (19%), Positives = 58/166 (34%), Gaps = 10/166 (6%)

Query: 22 EQLYTGLTEKEANQMQALLLSNDVNVSKEMDKSGNMTLSVEKEDFVRAITILNNNGFPKK 81
L++ L++++ + A L N+ + V + L G PK
Sbjct: 51 RTLFSNLSDQDGGAIVAQLTQM--NIPYRFANGSG-AIEVPADKVHELRLRLAQQGLPKG 107

Query: 82 KFADIEVIFPPSQLVASPSQENAKINYLKEQDIERLLSKIPGVIDCSVSLNVNNN----- 136
E + + S E E ++ R + + V V L +
Sbjct: 108 GAVGFE-LLDQEKFGISQFSEQVNYQRALEGELARTIETLGPVKSARVHLAMPKPSLFVR 166

Query: 137 ESQPSSAAVLVISSPEVNLAPSVIQ-IKNLVKNSVDDLKLENISVV 181
E + SA+V V P L I + +LV ++V L N+++V
Sbjct: 167 EQKSPSASVTVTLEPGRALDEGQISAVVHLVSSAVAGLPPGNVTLV 212


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5126     TYPE3OMGPROT  559    0.0      Type III secretion system outer membrane G protein ...
		>TYPE3OMGPROT#Type III secretion system outer membrane G protein family signature.

Length = 607

Score = 559 bits (1441), Expect = 0.0
Identities = 153/494 (30%), Positives = 262/494 (53%), Gaps = 24/494 (4%)

Query: 30 KSEYFIITKSSPVRAILNDFAANYSIPVFISSSVNDDFSGEIKNEKPVKVLEKLSKLYHL 89
Y + K +R +L DF ANY V +S +ND SG+ +++ P L+ ++ LY+L
Sbjct: 33 PIPYVYVAKGESLRDLLTDFGANYDATVVVSDKINDKVSGQFEHDNPQDFLQHIASLYNL 92

Query: 90 TWYYDENILYIYKTNEISRSIITPTYLDIDSLLKYLSDTISVNKNSCNVRKITTFNSIEV 149
WYYD N+LYI+K +E++ +I + L + L + + R + + V
Sbjct: 93 VWYYDGNVLYIFKNSEVASRLIRLQESEAAELKQALQR-SGIWEPRFGWRPDASNRLVYV 151

Query: 150 RGVPECIKYITSLSESLDKEAQSKAKNKD--VVKVFKLNYASATDITYKYRDQNVVVPGV 207
G P ++ + + +L+++ Q +++ +++F L YASA+D T YRD V PGV
Sbjct: 152 SGPPRYLELVEQTAAALEQQTQIRSEKTGALAIEIFPLKYASASDRTIHYRDDEVAAPGV 211

Query: 208 VSILKTMASNGSLP--STGKGAVERSGNLFDNSVTISADPRLNAVVVKDREITMDIYQQL 265
+IL+ + S+ ++ + + ++ + ADP LNA++V+D M +YQ+L
Sbjct: 212 ATILQRVLSDATIQQVTVDNQRIPQAATRASAQARVEADPSLNAIIVRDSPERMPMYQRL 271

Query: 266 ISELDIEQRQIEISVSIIDVDANDLQQLGVNWSGTLNAGQGTIA--------FNSSTAQA 317
I LD +IE+++SI+D++A+ L +LGV+W + G N ++ A
Sbjct: 272 IHALDKPSARIEVALSIVDINADQLTELGVDWRVGIRTGNNHQVVIKTTGDQSNIASNGA 331

Query: 318 NISSSVISNASNFMIRVNALQQNSKAKILSQPSIITLNNMQAILDKNVTFYTKVSGEKVA 377
S + RVN L+ A+++S+P+++T N QA++D + T+Y KV+G++VA
Sbjct: 332 LGSLVDARGLDYLLARVNLLENEGSAQVVSRPTLLTQENAQAVIDHSETYYVKVTGKEVA 391

Query: 378 SLESITSGTLLRVTPRILDDSSNSLTGKRRERVRLLLDIQDGNQSTNQSNAQDASSTLPE 437
L+ IT GT+LR+TPR+L S + L L I+DGNQ N S + +P
Sbjct: 392 ELKGITYGTMLRMTPRVLTQGDKS-------EISLNLHIEDGNQKPNSSGIE----GIPT 440

Query: 438 VQNSEMTTEATLSAGESLLLGGFIQDKESSSKDGIPLLSDIPVIGSLFSSTVKQKHSVVR 497
+ + + T A + G+SL++GG +D+ S + +PLL DIP IG+LF + VR
Sbjct: 441 ISRTVVDTVARVGHGQSLIIGGIYRDELSVALSKVPLLGDIPYIGALFRRKSELTRRTVR 500

Query: 498 LFLIKATPIKSASS 511
LF+I+ I +
Sbjct: 501 LFIIEPRIIDEGIA 514


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5127     SYCDCHAPRONE  139    4e-45    Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 139 bits (352), Expect = 4e-45
Identities = 33/142 (23%), Positives = 63/142 (44%)

Query: 6 SSLEDIYDFYQDGGTLASLTNLTQQDLNDLHSYAYTAYQSGDVITARNLFHLLTYLEHWN 65
+ F + GGT+A L ++ L L+S A+ YQSG A +F L L+H++
Sbjct: 10 EYQLAMESFLKGGGTIAMLNEISSDTLEQLYSLAFNQYQSGKYEDAHKVFQALCVLDHYD 69

Query: 66 YDYTLSLGLCHQRLSNHEDAQLCFARCATLVMQDPRASYYSGISYLLVGNKKMAKKAFKA 125
+ L LG C Q + ++ A ++ A + +++PR +++ L G A+
Sbjct: 70 SRFFLGLGACRQAMGQYDLAIHSYSYGAIMDIKEPRFPFHAAECLLQKGELAEAESGLFL 129

Query: 126 CLMWCNEKEKYTTYKENIKKLL 147
+K ++ + +L
Sbjct: 130 AQELIADKTEFKELSTRVSSML 151


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
Z5131     OMPTIN  26     0.048    Omptin serine protease signature.
		>OMPTIN#Omptin serine protease signature.

Length = 317

Score = 26.5 bits (58), Expect = 0.048
Identities = 13/38 (34%), Positives = 17/38 (44%), Gaps = 4/38 (10%)

Query: 115 AYNAGYFNTPNAVELRRQYAMKIYKTYNKLKNNEQIID 152
A NAGY+ TPNA + Y + K N + D
Sbjct: 254 AVNAGYYVTPNA----KVYVEGAWNRVTNKKGNTSLYD 287


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5132     TYPE3IMSPROT  376    e-132    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 376 bits (967), Expect = e-132
Identities = 123/339 (36%), Positives = 195/339 (57%), Gaps = 4/339 (1%)

Query: 2 SEKTEKPTPKKLRDLKKKGDVTKSEEVMAAVQSLILFSFFSLYGMS--FFVDIVGLVNTT 59
EKTE+PTPKK+RD +KKG V KS+EV++ LI+ L G+S +F L+
Sbjct: 3 GEKTEQPTPKKIRDARKKGQVAKSKEVVSTA--LIVALSAMLMGLSDYYFEHFSKLMLIP 60

Query: 60 IDSLNRPFLYAIREILGAVLNIFLLYILPISLIVFVGTVTTGVSQIGFIFAVEKIKPSAQ 119
+ PF A+ ++ VL F P+ + + + + V Q GF+ + E IKP +
Sbjct: 61 AEQSYLPFSQALSYVVDNVLLEFFYLCFPLLTVAALMAIASHVVQYGFLISGEAIKPDIK 120

Query: 120 KISVKNNLKNIFSVKSIFELLKSVFKLVIIVLIFYFMGHSYANEFANFTGLNAYQALVVV 179
KI+ K IFS+KS+ E LKS+ K+V++ ++ + + ++
Sbjct: 121 KINPIEGAKRIFSIKSLVEFLKSILKVVLLSILIWIIIKGNLVTLLQLPTCGIECITPLL 180

Query: 180 AFFVFLLWKGVLFGYLLFSVFDFWFQKHEGLKKMKMSKDEVKREAKDTDGNPEIKGERRR 239
+ L G+++ S+ D+ F+ ++ +K++KMSKDE+KRE K+ +G+PEIK +RR+
Sbjct: 181 GQILRQLMVICTVGFVVISIADYAFEYYQYIKELKMSKDEIKREYKEMEGSPEIKSKRRQ 240

Query: 240 LHSEIQSGSLANNIKKSTVIVKNPTHIAICLYYKLGETPLPLVIETGKDAKALQIIKLAE 299
H EIQS ++ N+K+S+V+V NPTHIAI + YK GETPLPLV DA+ + K+AE
Sbjct: 241 FHQEIQSRNMRENVKRSSVVVANPTHIAIGILYKRGETPLPLVTFKYTDAQVQTVRKIAE 300

Query: 300 LYDIPVIEDIPLARTLYKNIHKGQYITEDFFEPVAQLIR 338
+P+++ IPLAR LY + YI + E A+++R
Sbjct: 301 EEGVPILQRIPLARALYWDALVDHYIPAEQIEATAEVLR 339


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5133     TYPE3IMRPROT  155    1e-48    Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 155 bits (394), Expect = 1e-48
Identities = 46/230 (20%), Positives = 102/230 (44%), Gaps = 4/230 (1%)

Query: 11 SFYCILRPLGMFIILPIFSTGVLLSNFIRNSIMIAFTLPIIVENYTFSEKLPSGIFQLTG 70
F+ +LR L + PI S + R + +A + + + +P F
Sbjct: 16 YFWPLLRVLALISTAPILSERSVPK---RVKLGLAMMITFAIAPSLPANDVPVFSFFALW 72

Query: 71 IALKEISIGFFIGLSFTILFWAIDAAGQIIDTLRGSTISSIFNPSISDSSSITGVILYQF 130
+A+++I IG +G + F A+ AG+II G + ++ +P+ + + I+
Sbjct: 73 LAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHLNMPVLARIMDML 132

Query: 131 ISVIFVIHGGIQSILDKLYLSYEILPLQADIAFNRALIDFLFSLWDSFIKLMLSFSVPMI 190
++F+ G ++ L ++ LP+ + + A + + F+ L ++P+I
Sbjct: 133 ALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIFL-NGLMLALPLI 191

Query: 191 IGIFLCDMGFGFLNKTAPQLNVFTLSLPVKSLIAIFILLLVIHVFPDFIT 240
+ ++ G LN+ APQL++F + P+ + I ++ ++ + F
Sbjct: 192 TLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFCE 241


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5134     TYPE3IMQPROT  69     2e-19    Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.

Length = 86

Score = 69.0 bits (169), Expect = 2e-19
Identities = 25/78 (32%), Positives = 45/78 (57%)

Query: 7 VQLCVQTFWIIFILSLPTVIAASVIGIIISLVQAITQLQDQTLPFLLKIIAVFATLALTY 66
V + +++ ILS I A++IG+++ L Q +TQLQ+QTLPF +K++ V L L
Sbjct: 5 VFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLFLLS 64

Query: 67 HWMGTTIINFSSIIFEMI 84
W G ++++ + +
Sbjct: 65 GWYGEVLLSYGRQVIFLA 82


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5135     TYPE3IMPPROT  223    2e-76    Type III secretion system inner membrane P protein ...
		>TYPE3IMPPROT#Type III secretion system inner membrane P protein family signature.

Length = 224

Score = 223 bits (570), Expect = 2e-76
Identities = 89/212 (41%), Positives = 136/212 (64%), Gaps = 9/212 (4%)

Query: 12 IFLIIVFFLLSLLPIFVVIGTSFLKISIVLGILKNALGIQQVPPNMALTSVSLILTMFIM 71
I LI + +LLP + GT F+K SIV +++NALG+QQ+P NM L V+L+L+MF+M
Sbjct: 5 ISLIALLAFSTLLPFIIASGTCFVKFSIVFVMVRNALGLQQIPSNMTLNGVALLLSMFVM 64

Query: 72 SPIILQINDNISQEPINYTDSDFFQKVDEKILSPYRGFLEKNTEKDNVEFFERAAQKKLG 131
PI+ E + + D K ++ L YR +L K ++++ V+FFE A K+
Sbjct: 65 WPIMHDAYVYFEDEDVTFNDISSLSKHVDEGLDGYRDYLIKYSDRELVQFFENAQLKRQY 124

Query: 132 NETI---------LKKDSLFILLPAFTMGQLEAAFKIGFLLYLPFIAIDLIISNILLALG 182
E ++K S+F LLPA+ + ++++AFKIGF LYLPF+ +DL++S++LLALG
Sbjct: 125 GEETETVKRDKDEIEKPSIFALLPAYALSEIKSAFKIGFYLYLPFVVVDLVVSSVLLALG 184

Query: 183 MMMVSPVTISIPFKILLFILVGGWQKLFEFLL 214
MMM+SPVTIS P K++LF+ + GW L + L+
Sbjct: 185 MMMMSPVTISTPIKLVLFVALDGWTLLSKGLI 216


111  Z5155  Z5159  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
Z5155     -2      10       0.891033    cryptic adenine deaminase
Z5156     -2      12       -0.877719   sugar phosphate antiporter
Z5157     -1      19       -2.936540   regulatory protein UhpC
Z5158     0       26       -6.615800   sensory histidine kinase UhpB
Z5159     1       35       -11.645379  DNA-binding transcriptional activator UhpA
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
Z5155     UREASE  38     9e-05    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 38.2 bits (89), Expect = 9e-05
Identities = 28/105 (26%), Positives = 41/105 (39%), Gaps = 17/105 (16%)

Query: 22 AVSRGDAVADYIIDNVSILDLINGGEISGPIVIKGRYIAGVG----------AEYTDAPA 71
V+R D +I N ILD + G + I +K IA +G P
Sbjct: 60 QVTREGGAVDTVITNALILD--HWGIVKADIGLKDGRIAAIGKAGNPDMQPGVTIIVGPG 117

Query: 72 LQRIDAHGATAVPGFIDAHLHIESSMMTPVTFETATLPRGLTTVI 116
+ I G G +D+H+H + P E A L GLT ++
Sbjct: 118 TEVIAGEGKIVTAGGMDSHIH----FICPQQIEEA-LMSGLTCML 157


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z5157     TCRTETB  41     1e-05    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 40.6 bits (95), Expect = 1e-05
Identities = 65/408 (15%), Positives = 137/408 (33%), Gaps = 60/408 (14%)

Query: 30 RHILLTIWLGYALFY--FTRKSFNAAVPEILANGVLSRSDIGLLATLFYITYGVSKFVSG 87
RH + IWL F+ N ++P+I + + + T F +T+ + V G
Sbjct: 11 RHNQILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYG 70

Query: 88 IVSDRSNARYFMGIGLIATGIINILFGFSTSLWAFAVLWVLNAFFQGWGS---PVCARLL 144
+SD+ + + G+I +++ S F L ++ F QG G+ P ++
Sbjct: 71 KLSDQLGIKRLLLFGIIINCFGSVIGFVGHS---FFSLLIMARFIQGAGAAAFPALVMVV 127

Query: 145 TAWY-SRTERGGWWALWNTAHNVGGALIPIVMAAAALHYGWRAGMMIAGCMAIVVGIFLC 203
A Y + RG + L + +G + P + A + W ++ M ++ +
Sbjct: 128 VARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHW--SYLLLIPMITIITVPF- 184

Query: 204 WRLRDRPQALGLPAVGEWRHDALEIAQQQEGAGLTRKEILTKYVLLNPYIWLLSFCYVLV 263
+ L +I G L I+ + Y VL
Sbjct: 185 --------LMKLLKKEVRIKGHFDIK----GIILMSVGIVFFMLFTTSYSISFLIVSVLS 232

Query: 264 YVV-----RAAINDWGNLYMSETLGVDLVTANTAVTMFELGGFI-----------GALVA 307
+++ R + + + + + + + + + GF+ A
Sbjct: 233 FLIFVKHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTA 292

Query: 308 GWGSDKLFNGNRGPMNLIFAAGILL-SVGSLWLMPFASYVMQATCFFTIGFFVFGPQMLI 366
GS +F G + + GIL+ G L+++ + + F T F + +
Sbjct: 293 EIGSVIIFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFL-SVSFLTASFLLETTSWFM 351

Query: 367 ---------GMAAAECS---------HKEAAGAATGFVGLFAYLGASL 396
G++ + ++ AGA + ++L
Sbjct: 352 TIIIVFVLGGLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGT 399


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z5158     PF06580  38     5e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 38.3 bits (89), Expect = 5e-05
Identities = 28/142 (19%), Positives = 56/142 (39%), Gaps = 11/142 (7%)

Query: 308 LRPRQLDDLTLEQAIRSLMREMELEGRGIVSHLEWRIEESALSENQRVTLFRVCQEGLNN 367
LR ++L + + ++L L++ + + + +V + Q + N
Sbjct: 208 LRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQINPAIMDVQVPPM-LVQTLVEN 266

Query: 368 IVKHA-----DASAVTLQGWLQDERLMLVIEDDGSGLPPGSGQ-QGFGLTGMRERVTALG 421
+KH + L+G + + L +E+ GS + + G GL +RER+ L
Sbjct: 267 GIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKESTGTGLQNVRERLQMLY 326

Query: 422 G---TLTISCLHG-TRVSVSLP 439
G + +S G V +P
Sbjct: 327 GTEAQIKLSEKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
Z5159     HTHFIS  61     2e-13    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 61.4 bits (149), Expect = 2e-13
Identities = 29/174 (16%), Positives = 59/174 (33%), Gaps = 20/174 (11%)

Query: 2 ITVALIDDHLIVRSGFAQLLGLEPDLQVVAEFGSGREALAGLPGRGVQVCICDISMPDIS 61
T+ + DD +R+ Q L V + + + + D+ MPD +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRA-GYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 62 GLELLSQLPK---GMATIMLSVHDSPALVEQALNAGARGFLSKRCSPDELIAAVHTVATG 118
+LL ++ K + +++S ++ +A GA +L K ELI +
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR---- 117

Query: 119 GCYLTPDIAIKLASGRQDPLTKRERQVAEKLAQG---MAVKEIAAELGLSPKTV 169
A+ R L + + + + + A L + T+
Sbjct: 118 --------ALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTL 163


112  Z5402  Z5410  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
Z5402     2       25       2.562049   hypothetical protein
Z5403     1       23       2.198491   coproporphyrinogen III oxidase
Z5404     1       19       0.953225   nitrogen regulation protein NR(I)
Z5405     0       20       -1.083107  nitrogen regulation protein NR(II)
Z5406     1       21       -2.608468  glutamine synthetase
Z5407     1       14       -4.394960  GTP-binding protein
Z5408     0       12       -4.995108  transcriptional regulator
Z5409     -1      11       -3.382086  hypothetical protein
Z5410     0       12       -2.749313  resistance protein (transport)
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Z5402     SECA  30     0.004    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 30.2 bits (68), Expect = 0.004
Identities = 11/71 (15%), Positives = 29/71 (40%)

Query: 13 AKARRKTREELNQEARDRKRQKKRRGHAPGSRAAGGNNTSGSKGQNAPKDPRIGSKTPIP 72
+K + + EE+ + + R+ + +R ++ + + + ++G P P
Sbjct: 827 SKVQVRMPEEVEELEQQRRMEAERLAQMQQLSHQDDDSAAAAALAAQTGERKVGRNDPCP 886

Query: 73 LGVTEKVTKQH 83
G +K + H
Sbjct: 887 CGSGKKYKQCH 897


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
Z5404     HTHFIS  601    0.0      FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 601 bits (1552), Expect = 0.0
Identities = 205/478 (42%), Positives = 299/478 (62%), Gaps = 11/478 (2%)

Query: 1 MQRGIVWVVDDDSSIRWVLERALAGAGLTCTTFENGXEVLEALASKTPDVLLSDIRMPGM 60
M + V DDD++IR VL +AL+ AG N + +A+ D++++D+ MP
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 DGLALLKQIKQRHPMLPVIIMTAHSDLDAAVSAYQQGAFDYLPKPFDIDEAVALVERAIS 120
+ LL +IK+ P LPV++M+A + A+ A ++GA+DYLPKPFD+ E + ++ RA++
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 121 HYQEQQQPRNVQLNGPTTDIIGEAPAMQDVFRIIGRLSRSSISVLINGESGTGKELVAHA 180
+ + ++G + AMQ+++R++ RL ++ ++++I GESGTGKELVA A
Sbjct: 121 EPKRRPSKLEDDSQDGM-PLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARA 179

Query: 181 LHRHSPRAKAPFIALNMAAIPKDLIESELFGHEKGAFTGANTIRQGRFEQADGGTLFLDE 240
LH + R PF+A+NMAAIP+DLIESELFGHEKGAFTGA T GRFEQA+GGTLFLDE
Sbjct: 180 LHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDE 239

Query: 241 IGDMPLDVQTRLLRVLADGQFYRVGGYAPVKVDVRIIAATHQNLEQRVQEGKFREDLFHR 300
IGDMP+D QTRLLRVL G++ VGG P++ DVRI+AAT+++L+Q + +G FREDL++R
Sbjct: 240 IGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYR 299

Query: 301 LNVIRVHLPPLRERREDIPRLARHFLQVAARELGVEAKLLHPETEAALTRLAWPGNVRQL 360
LNV+ + LPPLR+R EDIP L RHF+Q A +E G++ K E + WPGNVR+L
Sbjct: 300 LNVVPLRLPPLRDRAEDIPDLVRHFVQQAEKE-GLDVKRFDQEALELMKAHPWPGNVREL 358

Query: 361 ENTCRWLTVMAAGQEVLIQDLPGELFESTVAESTSQMQPDSWATLLAQWADRALRS---- 416
EN R LT + + + + EL + S + ++Q + +R
Sbjct: 359 ENLVRRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFAS 418

Query: 417 -----GHQNLLSEAQPELERTLLTTALRHTQGHKQEAARLLGWGRNTLTRKLKELGME 469
L E+E L+ AL T+G++ +AA LLG RNTL +K++ELG+
Sbjct: 419 FGDALPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGVS 476


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5405     PF06580       28     0.042    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 28.3 bits (63), Expect = 0.042
Identities = 34/190 (17%), Positives = 72/190 (37%), Gaps = 41/190 (21%)

Query: 171 IIEQADRLRNLVDRL---LGPQLPGTRVTE-SIHKVAERV---VTLVSMELPDNVRLIRD 223
I+E + R ++ L + L + + S+ V + L S++ D ++
Sbjct: 186 ILEDPTKAREMLTSLSELMRYSLRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQ 245

Query: 224 YDPSLPELAHDPDQIEQVLLN-IVRNALQ---ALGPEGGEIILRTRTAFQLTLHGERYRL 279
+P++ ++ Q+ +L+ +V N ++ A P+GG+I+L+
Sbjct: 246 INPAIMDV-----QVPPMLVQTLVENGIKHGIAQLPQGGKILLKGT------KDNGTVT- 293

Query: 280 AARIDVEDNGPGIPPHLQDTLFYPMVSGREGGTGLGLSIARNLIDQHSGK---IEFTSWP 336
++VE+ G + ++ TG GL R + G I+ +
Sbjct: 294 ---LEVENTGSLALKNTKE------------STGTGLQNVRERLQMLYGTEAQIKLSEKQ 338

Query: 337 GHTEFSVYLP 346
G V +P
Sbjct: 339 GKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5407     TCRTETOQM     180    4e-51    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 180 bits (458), Expect = 4e-51
Identities = 97/445 (21%), Positives = 170/445 (38%), Gaps = 81/445 (18%)

Query: 4 KLRNIAIIAHVDHGKTTLVDKLLQQSGTFDSRAETQE--RVMDSNDLEKERGITILAKNT 61
K+ NI ++AHVD GKTTL + LL SG + D+ LE++RGITI T
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 62 AIKWNDYRINIVDTPGHADFGGEVERVMSMVDSVLLVVDAFDGPMPQTRFVTKKAFAYGL 121
+ +W + ++NI+DTPGH DF EV R +S++D +L++ A DG QTR + G+
Sbjct: 62 SFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGI 121

Query: 122 KPIVVINKVDRPGARPDWVVDQVFD-------------LFVNLDATDEQLD--------- 159
I INK+D+ G V + + L+ N+ T+
Sbjct: 122 PTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEG 181

Query: 160 --------------------------------FPIVYASALNGIAGLDHEDMAEDMTPLY 187
FP+ + SA N I G+D+ L
Sbjct: 182 NDDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNI-GIDN---------LI 231

Query: 188 QAIVDHVPAPDVDLDGPFQMQISQLDYNSYVGVIGIGRIKRGKVKPNQQVTIIDSEGKTR 247
+ I + + ++ +++Y+ + R+ G + V I + E
Sbjct: 232 EVITNKFYSSTHRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEKEKI-- 289

Query: 248 NAKVGKVLGHLGLERIETDLAEAGDIVAITGLGELNISDTVCDTQNVEALPALSVDEPTV 307
K+ ++ + E + D A +G+IV + L ++ + DT+ + + P +
Sbjct: 290 --KITEMYTSINGELCKIDKAYSGEIVILQNEF-LKLNSVLGDTKLLPQRERIENPLPLL 346

Query: 308 SMFFCVNTSPFCGKEGKFVTSRQILDRLNKELVHNVALRVEETEDADAFRVSGRGELHLS 367
+ + D L LR +S G++ +
Sbjct: 347 QTTVEPSKPQQREMLLDALLEISDSDPL---------LRYYVDSATHEIILSFLGKVQME 397

Query: 368 VLIENMRRE-GFELAVSRPKVIFRE 391
V ++ + E+ + P VI+ E
Sbjct: 398 VTCALLQEKYHVEIEIKEPTVIYME 422



Score = 32.5 bits (74), Expect = 0.005
Identities = 13/75 (17%), Positives = 29/75 (38%), Gaps = 1/75 (1%)

Query: 398 EPYENVTLDVEEQHQGSVMQALGERKGDLKNMNPDGKGRVRLDYVIPSRGLIGFRSEFMT 457
EPY + + +++ + ++ + V L IP+R + +RS+
Sbjct: 537 EPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQLKN-NEVILSGEIPARCIQEYRSDLTF 595

Query: 458 MTSGTGLLYSTFSHY 472
T+G + + Y
Sbjct: 596 FTNGRSVCLTELKGY 610


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5410     TCRTETB       30     0.024    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 29.8 bits (67), Expect = 0.024
Identities = 31/161 (19%), Positives = 64/161 (39%), Gaps = 15/161 (9%)

Query: 227 NVFFVYAVYCGLTFFIPFLKNIYLLP----------VALVGAYGIINQYCLKMIGGPIGG 276
N+ F+ V CG F + ++P A +G+ I +I G IGG
Sbjct: 255 NIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYIGG 314

Query: 277 MISDKILKSPSKYLCYTFIISTAALVLLIMLPHESMPVYLGMACTLGFGAIVFTQRAVFF 336
++ D+ + P L + + + L E+ ++ + G + FT+
Sbjct: 315 ILVDR--RGPLYVLNIGVTFLSVSFLTASFLL-ETTSWFMTIIIVFVLGGLSFTK--TVI 369

Query: 337 APIGEAKIAENKTGAAMALGSFIGYAPAMFCFSLYGYILDL 377
+ I + + + + GA M+L +F + ++ G +L +
Sbjct: 370 STIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLSI 410


113  Z5626  Z5633  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
Z5626     -2      14       1.645979   hypothetical protein
Z5627     -2      17       0.834794   hypothetical protein
Z5628     -2      19       0.689672   phosphate-starvation-inducible protein PsiE
Z5629     -2      21       0.804230   D-xylose transporter XylE
Z5630     -1      22       1.607568   maltose ABC transporter permease
Z5631     -2      13       -2.959345  maltose transporter membrane protein
Z5632     0       13       -3.663422  maltose ABC transporter substrate-binding
Z5633     -2      11       -3.756441  maltose ABC transporter ATP-binding protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5626     CHANLCOLICIN  30     0.007    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 30.4 bits (68), Expect = 0.007
Identities = 21/95 (22%), Positives = 38/95 (40%), Gaps = 3/95 (3%)

Query: 20 AAGTVKVFSNGSSEAKTLTGAEHLIDLVGQPRLANSWWPGAVISEELATAAALRQQQALL 79
A + + + LT + L D+V + N+ + A AA++ + L
Sbjct: 73 AKAAAEAQAKAKANRDALT--QRLKDIVNEALRHNASRTPSATELAHANNAAMQAEDERL 130

Query: 80 TRLAEQGADSSADDAAAINALRQQIQALKVTGRQK 114
RLA+ + + AA A ++ Q K R+K
Sbjct: 131 -RLAKAEEKARKEAEAAEKAFQEAEQRRKEIEREK 164


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5629     TCRTETA       36     4e-04    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 35.6 bits (82), Expect = 4e-04
Identities = 20/87 (22%), Positives = 42/87 (48%), Gaps = 3/87 (3%)

Query: 279 VIGVMLSIFQQFVGINVVLYYAPEVFKTLGASTDIALLQTIIVGVINLTFTVLAIMT--- 335
+I ++ ++ VGI +++ P + + L S D+ I++ + L A +
Sbjct: 7 LIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGAL 66

Query: 336 VDKFGRKPLQIIGALGMAIGMFSLGTA 362
D+FGR+P+ ++ G A+ + TA
Sbjct: 67 SDRFGRRPVLLVSLAGAAVDYAIMATA 93


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5632     MALTOSEBP     756    0.0      Maltose binding protein signature.
		>MALTOSEBP#Maltose binding protein signature.

Length = 396

Score = 756 bits (1953), Expect = 0.0
Identities = 396/396 (100%), Positives = 396/396 (100%)

Query: 1 MKIKTGARILALSALTTMMFSASALAKIEEGKLVIWINGDKGYNGLAEVGKKFEKDTGIK 60
MKIKTGARILALSALTTMMFSASALAKIEEGKLVIWINGDKGYNGLAEVGKKFEKDTGIK
Sbjct: 1 MKIKTGARILALSALTTMMFSASALAKIEEGKLVIWINGDKGYNGLAEVGKKFEKDTGIK 60

Query: 61 VTVEHPDKLEEKFPQVAATGDGPDIIFWAHDRFGGYAQSGLLAEITPDKAFQDKLYPFTW 120
VTVEHPDKLEEKFPQVAATGDGPDIIFWAHDRFGGYAQSGLLAEITPDKAFQDKLYPFTW
Sbjct: 61 VTVEHPDKLEEKFPQVAATGDGPDIIFWAHDRFGGYAQSGLLAEITPDKAFQDKLYPFTW 120

Query: 121 DAVRYNGKLIAYPIAVEALSLIYNKDLLPNPPKTWEEIPALDKELKAKGKSALMFNLQEP 180
DAVRYNGKLIAYPIAVEALSLIYNKDLLPNPPKTWEEIPALDKELKAKGKSALMFNLQEP
Sbjct: 121 DAVRYNGKLIAYPIAVEALSLIYNKDLLPNPPKTWEEIPALDKELKAKGKSALMFNLQEP 180

Query: 181 YFTWPLIAADGGYAFKYENGKYDIKDVGVDNAGAKAGLTFLVDLIKNKHMNADTDYSIAE 240
YFTWPLIAADGGYAFKYENGKYDIKDVGVDNAGAKAGLTFLVDLIKNKHMNADTDYSIAE
Sbjct: 181 YFTWPLIAADGGYAFKYENGKYDIKDVGVDNAGAKAGLTFLVDLIKNKHMNADTDYSIAE 240

Query: 241 AAFNKGETAMTINGPWAWSNIDTSKVNYGVTVLPTFKGQPSKPFVGVLSAGINAASPNKE 300
AAFNKGETAMTINGPWAWSNIDTSKVNYGVTVLPTFKGQPSKPFVGVLSAGINAASPNKE
Sbjct: 241 AAFNKGETAMTINGPWAWSNIDTSKVNYGVTVLPTFKGQPSKPFVGVLSAGINAASPNKE 300

Query: 301 LAKEFLENYLLTDEGLEAVNKDKPLGAVALKSYEEELAKDPRIAATMENAQKGEIMPNIP 360
LAKEFLENYLLTDEGLEAVNKDKPLGAVALKSYEEELAKDPRIAATMENAQKGEIMPNIP
Sbjct: 301 LAKEFLENYLLTDEGLEAVNKDKPLGAVALKSYEEELAKDPRIAATMENAQKGEIMPNIP 360

Query: 361 QMSAFWYAVRTAVINAASGRQTVDEALKDAQTRITK 396
QMSAFWYAVRTAVINAASGRQTVDEALKDAQTRITK
Sbjct: 361 QMSAFWYAVRTAVINAASGRQTVDEALKDAQTRITK 396


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5633     PF05272       35     6e-04    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 34.7 bits (79), Expect = 6e-04
Identities = 13/35 (37%), Positives = 18/35 (51%)

Query: 32 VVFVGPSGCGKSTLLRMIAGLETITSGDLFIGEKR 66
VV G G GKSTL+ + GL+ + IG +
Sbjct: 599 VVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGTGK 633


114  Z5682  Z5699  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
Z5682     -1      16       2.794250   multidrug resistance protein MdtN
Z5683     -1      18       2.920798   hypothetical protein
Z5684     0       21       4.109579   transcriptional regulator
Z5686     0       23       4.567781   kinase
Z5687     1       23       4.780947   hypothetical protein
Z5688     1       24       4.926562   hypothetical protein
Z5689     1       27       5.770979   ABC transporter substrate-binding protein
Z5690     1       26       5.987769   permease of ribose ABC transport system
Z5691     2       27       6.379237   ribose ABC transporter ATP-binding protein
Z5692     3       30       7.407007   histidine kinase
Z5694     2       36       8.634395   hypothetical protein
Z5695     1       39       9.203401   carbon-phosphorus lyase complex accessory
Z5696     1       41       8.859869   aminoalkylphosphonic acid N-acetyltransferase
Z5697     2       41       9.303277   ribose 1,5-bisphosphokinase
Z5698     1       42       9.689318   phosphonate metabolism protein
Z5699     0       40       9.746662   phosphonate ABC transporter ATP-binding protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5682     RTXTOXIND     80     9e-19    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 80.3 bits (198), Expect = 9e-19
Identities = 53/363 (14%), Positives = 118/363 (32%), Gaps = 78/363 (21%)

Query: 8 APRSKFPALLVVALALVALVFVIW-RVDS-APSTNDAYVSADTIDVVPEVSGRIVELAVT 65
+ R + A ++ ++A + + +V+ A + S + ++ P + + E+ V
Sbjct: 54 SRRPRLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVK 113

Query: 66 DNQAVKQGDLLFRIDPRPYEANLAKSEAS-----LAALDKQIM----------------- 103
+ ++V++GD+L ++ EA+ K+++S L QI+
Sbjct: 114 EGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDE 173

Query: 104 -----------LTQRSVDAQQFGA-------------------DSVNATVEKARAAAKQA 133
L S+ +QF +V A + + ++
Sbjct: 174 PYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVE 233

Query: 134 TDTLRRTEPLLKEGFVSAEDVDRARTAQRAAEADLNAVLLQAQSAASAVSGVDALVAQR- 192
L LL + ++ V A +L Q + S +
Sbjct: 234 KSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVT 293

Query: 193 ------------------AAVEADIALTKLHLEMATVRAPFDGRVISLKT-SVGQFASAM 233
+ ++A + + + +RAP +V LK + G +
Sbjct: 294 QLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTA 353

Query: 234 RPIFTLIDTRHWYVI-ANFRETDLKNIRSGTPATIRLMSDSGKTF---EGKVDSIGYGVL 289
+ ++ + A + D+ I G A I++ + + GKV +I +
Sbjct: 354 ETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDAI 413

Query: 290 PDD 292
D
Sbjct: 414 EDQ 416


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5684     HTHFIS        82     5e-20    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 81.8 bits (202), Expect = 5e-20
Identities = 33/112 (29%), Positives = 52/112 (46%), Gaps = 1/112 (0%)

Query: 16 DDTAICALLQDVLSEHVFTVSVCHTGQEAILRIEGDPDIALVVLDMMLPDTNGLRVLQQI 75
DD AI +L LS + V + I LVV D+++PD N +L +I
Sbjct: 11 DDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGD-GDLVVTDVVMPDENAFDLLPRI 69

Query: 76 QKLRPTLPVVMLTGMGSESDVVVGLEMGADDYICKPFTPRVVVARLKAVLRR 127
+K RP LPV++++ + + E GA DY+ KPF ++ + L
Sbjct: 70 KKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5689     SUBTILISIN    29     0.027    Subtilisin serine protease family (S8) signature.
		>SUBTILISIN#Subtilisin serine protease family (S8) signature.

Length = 326

Score = 28.7 bits (64), Expect = 0.027
Identities = 15/65 (23%), Positives = 24/65 (36%), Gaps = 5/65 (7%)

Query: 55 KLAGDNVKVTLVSSGYDLGQQVSQIDNFIAANVDMIIL---NAADSKGIGPAVKRAKDAG 111
L +KV + I I VD+I + D + AVK+A +
Sbjct: 111 DLLI--IKVLNKQGSGQYDWIIQGIYYAIEQKVDIISMSLGGPEDVPELHEAVKKAVASQ 168

Query: 112 IVVVA 116
I+V+
Sbjct: 169 ILVMC 173


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5690     PF00577       28     0.047    Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 28.3 bits (63), Expect = 0.047
Identities = 16/73 (21%), Positives = 27/73 (36%), Gaps = 1/73 (1%)

Query: 219 FVYGMSGLLSGLGGIMSASRLYSANGNLGMG-YELDAIAAVILGGTSFVGGIGTITGTLV 277
++G+ + GG A R + N +G L A++ + S + G V
Sbjct: 400 LLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALSVDMTQANSTLPDDSQHDGQSV 459

Query: 278 GALIIATLNNGMT 290
L +LN T
Sbjct: 460 RFLYNKSLNESGT 472


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5692     HTHFIS        58     6e-11    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 58.3 bits (141), Expect = 6e-11
Identities = 21/81 (25%), Positives = 44/81 (54%), Gaps = 2/81 (2%)

Query: 643 VLVLEDEAAVRQTICEQLHLLGYLTLEASSGEQALDLLAASAEIDIFISDLMLPGGMSGA 702
+LV +D+AA+R + + L GY S+ +AA + D+ ++D+++P +
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAA-GDGDLVVTDVVMP-DENAF 63

Query: 703 EVVNAARKLYPHLTLLLISGQ 723
+++ +K P L +L++S Q
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQ 84


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5694     RTXTOXIND     26     0.034    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 25.9 bits (57), Expect = 0.034
Identities = 17/107 (15%), Positives = 41/107 (38%), Gaps = 8/107 (7%)

Query: 11 TLLTLTTVPAQADIIDDTIGNIQ--------QAINDASNPDRGRDYEDSRDDGWQREVSD 62
LL LT + A+AD + +Q Q ++ + ++ + + + +Q +
Sbjct: 123 VLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEE 182

Query: 63 DRRRQYDDRRRQFEDRRRQLDDRQHQLNQERRQLEDEERRMEDEYGQ 109
+ R + QF + Q ++ L+++R + R+
Sbjct: 183 EVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENL 229


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5696     SACTRNSFRASE  32     2e-04    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 32.2 bits (73), Expect = 2e-04
Identities = 20/84 (23%), Positives = 32/84 (38%), Gaps = 5/84 (5%)

Query: 15 HLALLDGEVVGMIGLHLQFHLHHVNWIGEIQELVVMPQARGLNVGSKLLAWAEEEARQAG 74
L L+ +G I + + N I+++ V R VG+ LL A E A++
Sbjct: 68 FLYYLENNCIGRIKIRSNW-----NGYALIEDIAVAKDYRKKGVGTALLHKAIEWAKENH 122

Query: 75 AEMTELSTNVKRHDAHRFYLREGY 98
L T A FY + +
Sbjct: 123 FCGLMLETQDINISACHFYAKHHF 146


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5699     PF05272       29     0.015    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.015
Identities = 17/70 (24%), Positives = 25/70 (35%), Gaps = 8/70 (11%)

Query: 36 CVVLHGHSGSGKSTLLRSLYANYLPDEGQIQIKHGDEWVDLVTAPARKVVEI------RK 89
VVL G G GKSTL+ +L + I G + + + E+ R+
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGTGKDSYEQIAGIV--AYELSEMTAFRR 655

Query: 90 TTVGWVSQFL 99
V F
Sbjct: 656 ADAEAVKAFF 665


115  Z5708  Z5715  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
Z5708     -2      14       0.950992   phosphonate/organophosphate ester transporter
Z5709     -2      14       0.487139   hypothetical protein
Z5710     -3      14       0.697400   hypothetical protein
Z5711     -3      13       0.468199   hypothetical protein
Z5712     -2      13       1.128310   hypothetical protein
Z5713     -1      13       -0.616425  proline/glycine betaine transporter
Z5714     -2      17       0.091165   sensor protein BasS/PmrB
Z5715     -2      17       -0.546288  DNA-binding transcriptional regulator BasR
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5708     PF05272       29     0.020    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.020
Identities = 12/22 (54%), Positives = 13/22 (59%)

Query: 32 MVALLGPSGSGKSTLLRHLSGL 53
V L G G GKSTL+ L GL
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGL 619


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5713     TCRTETA       43     2e-06    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 43.3 bits (102), Expect = 2e-06
Identities = 57/290 (19%), Positives = 105/290 (36%), Gaps = 55/290 (18%)

Query: 85 FFGMLGDKYGRQKILAITIVIMSISTFCIGLIPSYDTIGIWAPILLLICKMAQGFSVGGE 144
G L D++GR+ +L +++ ++ + P +W +L I ++ G + G
Sbjct: 62 VLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPF-----LW---VLYIGRIVAGIT-GAT 112

Query: 145 YTGASIFVAEYSPDRKR----GFMGSWLDFGSIAGFVLGAGVVVLISTIVGEANFLDWGW 200
A ++A+ + +R GFM + FG +AG VLG G++ S
Sbjct: 113 GAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLG-GLMGGFSP------------ 159

Query: 201 RIPFFIALPLGIIGLYLRHALEETPAFQQHVDKLEQGDREGLQDGPKVSFKEIATKYWRS 260
PFF A L + L K E+ P SF+ W
Sbjct: 160 HAPFFAAAALNGLNFLTGCFLLPESH------KGERRPLRREALNPLASFR------WAR 207

Query: 261 LLTCIGLVIATNVTYYML----LTYMPSYLSHNLHYS-EDHGVLIIIAIMIGMLFVQPVM 315
+T + ++A ++ + H+ G+ + ++ L +
Sbjct: 208 GMTVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMIT 267

Query: 316 GLLSDRFGRRPFVLLG----SVALFVLA--------IPAFILINSNVIGL 353
G ++ R G R ++LG +LA P +L+ S IG+
Sbjct: 268 GPVAARLGERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLASGGIGM 317



Score = 39.0 bits (91), Expect = 4e-05
Identities = 39/164 (23%), Positives = 73/164 (44%), Gaps = 16/164 (9%)

Query: 286 LSHNLHYSEDHGVLI-IIAIMIGMLFVQPVMGLLSDRFGRRPFVLLGSVALFVLAIPAFI 344
L H+ + +G+L+ + A+M PV+G LSDRFGRRP +L+ L A+ I
Sbjct: 35 LVHSNDVTAHYGILLALYALM--QFACAPVLGALSDRFGRRPVLLVS---LAGAAVDYAI 89

Query: 345 LINSNVIGLIFAGLLMLAVILNCFTGVMASTLPAMFPTHIR---YSALAAAFNISVLVAG 401
+ + + +++ G ++A I V + + + R + ++A F +VAG
Sbjct: 90 MATAPFLWVLYIG-RIVAGITGATGAVAGAYIADITDGDERARHFGFMSACFGFG-MVAG 147

Query: 402 LTPTLAAWLVESSQNLMMPAYYLMVVAVVGLITG-VTMKETANR 444
P L + S + P + + + +TG + E+
Sbjct: 148 --PVLGGLMGGFSPH--APFFAAAALNGLNFLTGCFLLPESHKG 187


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5714     PF06580       37     1e-04    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 36.8 bits (85), Expect = 1e-04
Identities = 40/182 (21%), Positives = 80/182 (43%), Gaps = 34/182 (18%)

Query: 181 ARLDQMMESVSQLLQLARAGQSFSSGNYQHVKLLEDV-ILPSYDELSTML--DQRQQTLL 237
+ +M+ S+S+L++ S N + V L +++ ++ SY +L+++ D+ Q
Sbjct: 191 TKAREMLTSLSELMR-----YSLRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQ 245

Query: 238 LPESAADITVQGDATLLRMLLRNLVENAHRY----SPQGSNIMIKLQEDGGAV-MAVEDE 292
+ + D+ V ML++ LVEN ++ PQG I++K +D G V + VE+
Sbjct: 246 INPAIMDVQV------PPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENT 299

Query: 293 GPGIDESKCGELSKAFVRMDSRYGGIGLGLSIV-SRITQLHHGQFFLQNRQETSGTRAWV 351
G + + G GL V R+ L+ + ++ ++ A V
Sbjct: 300 GSLA--------------LKNTKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMV 345

Query: 352 RL 353
+
Sbjct: 346 LI 347


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5715     HTHFIS        91     2e-23    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 90.7 bits (225), Expect = 2e-23
Identities = 41/121 (33%), Positives = 60/121 (49%)

Query: 2 KILIVEDDTLLLQGLILAAQTEGYACDGVTTARMAEQSLEAGHYSLVVLDLGLPDEDGLH 61
IL+ +DD + L A GY + A + + AG LVV D+ +PDE+
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 FLARIRQKKYTLPVLILTARDTLTDKIAGLDVGADDYLVKPFALEELHARIRALLRRHNN 121
L RI++ + LPVL+++A++T I + GA DYL KPF L EL I L
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 122 Q 122
+
Sbjct: 125 R 125


116  Z5726  Z5733  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
Z5726     0       16       -4.518096  DNA-binding transcriptional activator DcuR
Z5727     1       13       -4.636751  sensory histidine kinase DcuS
Z5728     -2      15       -4.268807  hypothetical protein
Z5729     -2      19       -4.104719  hypothetical protein
Z5730     -1      23       -3.459839  hypothetical protein
Z5731     0       19       -4.103818  hypothetical protein
Z5732     -1      19       -3.416408  lysyl-tRNA synthetase
Z5733     0       15       -2.180884  peptide transporter
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5726     HTHFIS        70     4e-16    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 70.2 bits (172), Expect = 4e-16
Identities = 31/109 (28%), Positives = 50/109 (45%), Gaps = 4/109 (3%)

Query: 4 VLIIDDDAMVAELNRRYVAQIPGFQCCGTASTLEKAKEIIFNSDTPIDLILLDIYMQKEN 63
+L+ DDDA + + + +++ G+ S I + DL++ D+ M EN
Sbjct: 6 ILVADDDAAIRTVLNQALSRA-GYDVR-ITSNAATLWRWI--AAGDGDLVVTDVVMPDEN 61

Query: 64 GLDLLPVLHNARCKSDVIVISSAADAATIKDSLHYGVVDYLIKPFQASR 112
DLLP + AR V+V+S+ T + G DYL KPF +
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTE 110


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5727     PF06580       41     7e-06    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 41.0 bits (96), Expect = 7e-06
Identities = 21/99 (21%), Positives = 38/99 (38%), Gaps = 18/99 (18%)

Query: 442 LIENALE-ALGP-EPGGEISVTLHYRHGWLHCEVNDDGPGIAPDKIDHIFDKGVSTKGSE 499
L+EN ++ + GG+I + +G + EV + G +
Sbjct: 263 LVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKN------------TKES 310

Query: 500 RGVGLALVKQQVENLGG---SIAVESEPGIFTQFFVQIP 535
G GL V+++++ L G I + + G V IP
Sbjct: 311 TGTGLQNVRERLQMLYGTEAQIKLSEKQGKVN-AMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5729     SACTRNSFRASE  26     0.012    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 26.4 bits (58), Expect = 0.012
Identities = 9/28 (32%), Positives = 16/28 (57%)

Query: 32 LAIIEHTDVDESLKGQGIGKQLVAKVVE 59
A+IE V + + +G+G L+ K +E
Sbjct: 89 YALIEDIAVAKDYRKKGVGTALLHKAIE 116


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5733     TCRTETA       30     0.020    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 30.2 bits (68), Expect = 0.020
Identities = 36/190 (18%), Positives = 66/190 (34%), Gaps = 14/190 (7%)

Query: 44 NHAISLFSAYA-SLVYVTPILGGWLADRLLGNRTAVIAGALLMTLGHVVLGIDTNSTFSL 102
H L + YA P+LG +DR G R ++ + + ++ + L
Sbjct: 43 AHYGILLALYALMQFACAPVLGAL-SDRF-GRRPVLLVSLAGAAVDYAIMAT-APFLWVL 99

Query: 103 YLALAIIICGYGLFKSNISCLLGELYDEND-HRRDGGFSLLYAAGNIGSIAAPIACGLAA 161
Y+ + G+ + + + D D R F + A G +A P+ GL
Sbjct: 100 YIGRIV----AGITGATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMG 155

Query: 162 QWYGWHVGFALAGGGMFIGLLIFLSGHRHFQSTRSMDKKALTSVKF-ALPVWSWLVVMLC 220
+ H F A + L FL+G + +++ L L + W M
Sbjct: 156 G-FSPHAPFFAAA---ALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTV 211

Query: 221 LAPVFFTLLL 230
+A + +
Sbjct: 212 VAALMAVFFI 221


117  Z5893  Z5899  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
Z5893     1       33       -7.872747  hypothetical protein
Z5894     0       28       -6.665243  hypothetical protein
Z5895     -1      19       -4.544169  hypothetical protein
Z5896     -2      13       -2.801466  hypothetical protein
Z5897     -2      12       -2.033622  hypothetical protein
Z5898     -3      9        -0.489116  hypothetical protein
Z5899     -3      8        0.110378   ATP-dependent helicase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5893     THERMOLYSIN   28     0.007    Thermolysin metalloprotease (M4) family signature.
		>THERMOLYSIN#Thermolysin metalloprotease (M4) family signature.

Length = 544

Score = 28.1 bits (62), Expect = 0.007
Identities = 15/39 (38%), Positives = 20/39 (51%), Gaps = 4/39 (10%)

Query: 41 AGKDSAAIDKINAHYFDKKAEDYFFNKF-LLSFDPSTQQ 78
A D+AA+D AHY+ DY+ N LS+D S
Sbjct: 292 ASYDAAAVD---AHYYAGVVYDYYKNVHGRLSYDGSNAA 327


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5895     OMPADOMAIN    58     4e-12    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 58.0 bits (140), Expect = 4e-12
Identities = 36/136 (26%), Positives = 53/136 (38%), Gaps = 28/136 (20%)

Query: 95 SPDVLFGLGSTELKPKFKLILDDFFPRYLKVLDNYQEHITEVRIEGHTSTDWTGTTNPDI 154
DVLF LKP+ + LD + L N V + G+ TD G+
Sbjct: 218 KSDVLFNFNKATLKPEGQAALD----QLYSQLSNLDPKDGSVVVLGY--TDRIGSD---- 267

Query: 155 AYFNNMALSQGRTRAVLQYVYDIKNIATHQQWVKSKFAAVGYSSAHPILDKTGKEDPNRS 214
AY N LS+ R ++V+ Y+ K I K +A G ++P+ T R+
Sbjct: 268 AY--NQGLSERRAQSVVDYLI-SKGIP------ADKISARGMGESNPVTGNTCDNVKQRA 318

Query: 215 ---------RRVTFKV 221
RRV +V
Sbjct: 319 ALIDCLAPDRRVEIEV 334


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5896     RTXTOXIND     32     0.008    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 31.7 bits (72), Expect = 0.008
Identities = 18/134 (13%), Positives = 50/134 (37%), Gaps = 13/134 (9%)

Query: 161 AQIKLLRTEISDSSQAQLANHTHFSNKLWEQLEQFADLMAKGATEQI-IDALRQVIIDFN 219
+ R E + A++ + + S +L+ F+ L+ K A + + ++
Sbjct: 207 LNLDKKRAER-LTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAV 265

Query: 220 QNLTEQFGENFKALDASVKKLVEWQGNYKTQIEQMSEQYQQSV-ESLVETKTAVAGIWEE 278
L + L+ +++ K + + +++ ++ + + L +T + + E
Sbjct: 266 NELRVYKSQ----LEQIESEILS----AKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLE 317

Query: 279 CK--EIPLAMSELR 290
E S +R
Sbjct: 318 LAKNEERQQASVIR 331


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5899     RTXTOXIND     31     0.030    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 30.6 bits (69), Expect = 0.030
Identities = 26/163 (15%), Positives = 59/163 (36%), Gaps = 16/163 (9%)

Query: 334 RLASGAEEEAYRRLVESQFRDDDDEQAQSN---KGRLFKITLEKALFSSPMACASVVANR 390
+ S E L++ QF +++ Q + + A + + V +R
Sbjct: 177 QNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSR 236

Query: 391 LKRLESRKDHN--SQSQINELESLLLALNNIDASQFSKYQLLLDTIRKDLAWKANNTEDR 448
L S ++ + E E+ + N S+ L+ I ++ + ++
Sbjct: 237 LDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQ----LEQIESEIL----SAKEE 288

Query: 449 LVIFTESIKTLEFLEQ--QLRADLKLKDDQIATLRGDQGDTVL 489
+ T+ K E L++ Q ++ L ++A Q +V+
Sbjct: 289 YQLVTQLFKN-EILDKLRQTTDNIGLLTLELAKNEERQQASVI 330


118  Z5915  Z5919  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
Z5915     0       10       -0.652762  outer membrane protein; export and assembly of
Z5916     -2      19       2.163237   fimbrial morphology protein
Z5917     -1      20       2.545748   fimbrial morphology protein
Z5918     -3      17       0.826435   minor fimbrial subunit, D-mannose specific
Z5919     -1      21       0.706860   fructuronate transporter
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5915     PF00577       996    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 996 bits (2576), Expect = 0.0
Identities = 797/878 (90%), Positives = 802/878 (91%), Gaps = 58/878 (6%)

Query: 1 MSYLNLRLYQRNTQCLHIRKHRLAGFFVRLFVACAFAVQAPLSSAELYFNPRFLADDPQA 60
MSYLNLRLYQRNTQCLHIRKHRLAGFFVRLFVACAFA QAPLSSAELYFNPRFLADDPQA
Sbjct: 1 MSYLNLRLYQRNTQCLHIRKHRLAGFFVRLFVACAFAAQAPLSSAELYFNPRFLADDPQA 60

Query: 61 VADLSRFENGQELPPGTYRVDIYLNNGYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLN 120
VADLSRFENGQELPPGTYRVDIYLNNGYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLN
Sbjct: 61 VADLSRFENGQELPPGTYRVDIYLNNGYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLN 120

Query: 121 TASVAGMNLLADDACVPLTTMVQDATAHLDVGQQRLNLTIPQAFMSNRARGYIPPELWDP 180
TASV+GMNLLADDACVPLT+M+ DATA LDVGQQRLNLTIPQAFMSNRARGYIPPELWDP
Sbjct: 121 TASVSGMNLLADDACVPLTSMIHDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDP 180

Query: 181 GINAGLLNYNFSGNSVQNRIGGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDRSSGSK 240
GINAGLLNYNFSGNSVQNRIGGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSD SSGSK
Sbjct: 181 GINAGLLNYNFSGNSVQNRIGGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSK 240

Query: 241 NKWQHINTWLERDIIPLRSRLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPV 300
NKWQHINTWLERDIIPLRSRLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPV
Sbjct: 241 NKWQHINTWLERDIIPLRSRLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPV 300

Query: 301 IHGIARGTAQVTIKQNGYDIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTV 360
IHGIARGTAQVTIKQNGYDIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTV
Sbjct: 301 IHGIARGTAQVTIKQNGYDIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTV 360

Query: 361 PYSSVPLLQREGHTRYSITAGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRY 420
PYSSVPLLQREGHTRYSITAGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRY
Sbjct: 361 PYSSVPLLQREGHTRYSITAGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRY 420

Query: 421 RAFNFGIGKNMGALGALSVDMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYR 480
RAFNFGIGKNMGALGALSVDMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYR
Sbjct: 421 RAFNFGIGKNMGALGALSVDMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYR 480

Query: 481 YSTSGYFNFADTTYSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRS 540
YSTSGYFNFADTTYSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGR+
Sbjct: 481 YSTSGYFNFADTTYSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRT 540

Query: 541 STLYLSGSHQTYWGTSNVDEQFQAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLARNVNI 600
STLYLSGSHQTYWGTSNVDEQFQAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLA NVNI
Sbjct: 541 STLYLSGSHQTYWGTSNVDEQFQAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNI 600

Query: 601 PFSHWLRSDSKSQWRHASASYSMSGYSHSDDIKQ------------LYYGVSGG------ 642
PFSHWLRSDSKSQWRHASASYSMS + L Y V G
Sbjct: 601 PFSHWLRSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGD 660

Query: 643 ----------------------------------------VLAHANGVTLGQPLNDTVVL 662
VLAHANGVTLGQPLNDTVVL
Sbjct: 661 GNSGSTGYATLNYRGGYGNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNDTVVL 720

Query: 663 VKAPGAKDAKVENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVP 722
VKAPGAKDAKVENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVP
Sbjct: 721 VKAPGAKDAKVENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVP 780

Query: 723 TRGAIVRAEFKARVGIKLLMTLTHNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLA 782
TRGAIVRAEFKARVGIKLLMTLTHNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLA
Sbjct: 781 TRGAIVRAEFKARVGIKLLMTLTHNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLA 840

Query: 783 GKVQVKWGEEENAHCVANYQLPPESQQQLLTQLSAECR 820
GKVQVKWGEEENAHCVANYQLPPESQQQLLTQLSAECR
Sbjct: 841 GKVQVKWGEEENAHCVANYQLPPESQQQLLTQLSAECR 878


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5917     VACCYTOTOXIN  30     0.003    Helicobacter pylori vacuolating cytotoxin signature.
		>VACCYTOTOXIN#Helicobacter pylori vacuolating cytotoxin signature.

Length = 1291

Score = 30.4 bits (68), Expect = 0.003
Identities = 30/158 (18%), Positives = 49/158 (31%), Gaps = 9/158 (5%)

Query: 3 WRKRGYLLAAILALASATIQAADVTITVNGKVVAKPCTVSTTNATVDLGDLYSFSLMSAG 62
W R + A LA + +TI + VT VN + + + + G
Sbjct: 258 WMGRLQYVGAYLAPSYSTINTSKVTGEVNFNHLTVGDHNAAQAGIIASNKTH------IG 311

Query: 63 AASAWHDVALELTNCPVG--TSRVTASFSGAADSTGYYKNQGTAQNIQLELQDDSGNTLN 120
W L + P G + S + Q ++QN + N+
Sbjct: 312 TLDLWQSAGLNIIAPPEGGYKDKPNDKPSNTTQNNAKNDKQESSQNNSNTQVINPPNSAQ 371

Query: 121 TGATKTVQVDDSSQSAHFPLQVRALTVNGGATQGTIQA 158
+ QV D + V +N A GTI+
Sbjct: 372 KTEIQPTQVIDGPFAGGKNTVVNINRINTNA-DGTIRV 408


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5918     SURFACELAYER  28     0.044    Lactobacillus surface layer protein signature.
		>SURFACELAYER#Lactobacillus surface layer protein signature.

Length = 439

Score = 28.1 bits (62), Expect = 0.044
Identities = 19/79 (24%), Positives = 32/79 (40%), Gaps = 1/79 (1%)

Query: 211 SQNLGYYLSGTTADAGNSIFTNTASFSPAQGVGVQLTRNGTIIPANNTVSLGAVGTSAVS 270
S+N G ++ +A+ N FT PA V V L ++G ++ + + +
Sbjct: 133 SENAGKEITIGSAN-PNVTFTEKTGDQPASTVKVTLDQDGVAKLSSVQIKNVYAIDTTYN 191

Query: 271 LGLTANYARTGGQVTAGNV 289
+ TG VT G V
Sbjct: 192 SNVNFYDVTTGATVTTGAV 210


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z5919     PF06580  31     0.008    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 31.4 bits (71), Expect = 0.008
Identities = 10/49 (20%), Positives = 25/49 (51%)

Query: 230 LVPLIPAIIMISTTIANIWLVKDTPAWEVVNFIGSSPIAMFIAMVVAFV 278
+ +I ++ I +W V +T W ++ FI + P+A + + ++ +
Sbjct: 73 MGQIILRVLPACVVIGMVWFVANTSIWRLLAFINTKPVAFTLPLALSII 121


119  Z5968  Z5978  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias   Product
Z5968     -2      17       2.704663  ferric hydroximate transport ferric iron
Z5972     0       19       2.498867  *16S ribosomal RNA m2G1207 methyltransferase
Z5973     1       20       1.938147  DNA polymerase III subunit psi
Z5974     -1      18       1.704829  ribosomal-protein-alanine N-acetyltransferase
Z5975     -1      18       2.139384  nucleotidase
Z5976     -2      17       2.544260  peptide chain release factor 3
Z5977     -1      15       2.162655  hypothetical protein
Z5978     -1      15       2.681486  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5968     2FE2SRDCTASE  490    e-180    Ferric iron reductase signature.
		>2FE2SRDCTASE#Ferric iron reductase signature.

Length = 262

Score = 490 bits (1262), Expect = e-180
Identities = 253/262 (96%), Positives = 255/262 (97%)

Query: 1 MAYRSAPLYEDIIWRTHLQPQDARLAQAVRATIAEHREHLLEFIRLDEPAPLNAMTLAQW 60
MAYRSAPLYED+IWRTHLQPQD LAQAVRATIA+HREHLLEFIRLDEPAPLNAMTLAQW
Sbjct: 1 MAYRSAPLYEDVIWRTHLQPQDPTLAQAVRATIAKHREHLLEFIRLDEPAPLNAMTLAQW 60

Query: 61 SSPNALSSLLAVYSDHIYRNQPTMIREYKPLISLWAQWYIGLMVPPLMLALLTQEKALDV 120
SSPN LSSLLAVYSDHIYRNQP MIRE KPLISLWAQWYIGLMVPPLMLALLTQEKALDV
Sbjct: 61 SSPNVLSSLLAVYSDHIYRNQPMMIRENKPLISLWAQWYIGLMVPPLMLALLTQEKALDV 120

Query: 121 SPEHFHAEFHETGRVACFWVDVCEDKNATPHSPQQRMETLISQALVPVVQALEATGEING 180
SPEHFHAEFHETGRVACFWVDVCEDKNATPHSPQ RMETLISQALVPVVQALEATGEING
Sbjct: 121 SPEHFHAEFHETGRVACFWVDVCEDKNATPHSPQHRMETLISQALVPVVQALEATGEING 180

Query: 181 KLIWSNTGYLINWYLTEMKQLLGEATVESLRHALFFEKTLTTGEDNPLWRTVVLRDGLLV 240
KLIWSNTGYLINWYLTEMKQLLGEATVESLRHALFFEKTLT GEDNPLWRTVVLRDGLLV
Sbjct: 181 KLIWSNTGYLINWYLTEMKQLLGEATVESLRHALFFEKTLTNGEDNPLWRTVVLRDGLLV 240

Query: 241 RRTCCQRYRLPDVQQCGDCTLK 262
RRTCCQRYRLPDVQQCGDCTLK
Sbjct: 241 RRTCCQRYRLPDVQQCGDCTLK 262


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5974     SACTRNSFRASE  55     4e-12    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 54.6 bits (131), Expect = 4e-12
Identities = 23/80 (28%), Positives = 35/80 (43%), Gaps = 1/80 (1%)

Query: 62 DEATLFNIAVDPDYQRQGLGRALLEHLIDELEKRGVATLWLEVRASNAAAIALYESLGFN 121
A + +IAV DY+++G+G ALL I+ ++ L LE + N +A Y F
Sbjct: 88 GYALIEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFI 147

Query: 122 EATIRRNYYPTTDG-REDAI 140
+ Y E AI
Sbjct: 148 IGAVDTMLYSNFPTANEIAI 167


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
Z5976     TCRTETOQM  216    8e-65    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 216 bits (551), Expect = 8e-65
Identities = 107/460 (23%), Positives = 208/460 (45%), Gaps = 44/460 (9%)

Query: 12 KRRTFAIISHPDAGKTTITEKVLLFGQAIQTAGTVKGRGSNQHAKSDWMEMEKQRGISIT 71
K +++H DAGKTT+TE +L AI G+V ++D +E+QRGI+I
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSV----DKGTTRTDNTLLERQRGITIQ 57

Query: 72 TSVMQFPYHDCLVNLLDTPGHEDFSEDTYRTLTAVDCCLMVIDAAKGVEDRTRKLMEVTR 131
T + F + + VN++DTPGH DF + YR+L+ +D +++I A GV+ +TR L R
Sbjct: 58 TGITSFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALR 117

Query: 132 LRDTPILTFMNKLDRDIRDPMELLDEVENELKIGCAPITWPIGCGKLFKGVYHLYKDETY 191
P + F+NK+D++ D + +++ +L K +
Sbjct: 118 KMGIPTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVI------------------KQKVE 159

Query: 192 LYQSGKGHTIQEVRIVKGLNNPDLDAAVGEDLAQQLRDELELVKGASNEFDKELFLAGEI 251
LY + E + + +DL ++ L + + F +
Sbjct: 160 LYPNMCVTNFTESEQWDTVIEGN------DDLLEKYMSGKSLEALELEQEESIRFHNCSL 213

Query: 252 TPVFFGTALGNFGVDHMLDGLVEWAPAPMPRQTDTRTVEASEDKFTGFVFKIQANMDPKH 311
PV+ G+A N G+D++++ + + + + G VFKI+ K
Sbjct: 214 FPVYHGSAKNNIGIDNLIEVITNKFYSS---------THRGQSELCGKVFKIE--YSEK- 261

Query: 312 RDRVAFMRVVSGKYEKGMKLRQVRTAKDVVISDALTFMAGDRSHVEEAYPGDILGLHNHG 371
R R+A++R+ SG +R K + I++ T + G+ +++AY G+I+ L N
Sbjct: 262 RQRLAYIRLYSGVLHLRDSVRISEKEK-IKITEMYTSINGELCKIDKAYSGEIVILQNEF 320

Query: 372 TIQIGDTFTQGEMMKFTGIPNFA-PELFRRIRLKDPLKQKQLLKGLVQLSEEG-AVQVFR 429
+++ +++ P L + P +++ LL L+++S+ ++ +
Sbjct: 321 -LKLNSVLGDTKLLPQRERIENPLPLLQTTVEPSKPQQREMLLDALLEISDSDPLLRYYV 379

Query: 430 PISNNDLIVGAVGVLQFDVVVARLKSEYNVEAVYESVNVA 469
+ +++I+ +G +Q +V A L+ +Y+VE + V
Sbjct: 380 DSATHEIILSFLGKVQMEVTCALLQEKYHVEIEIKEPTVI 419


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5978     CHANLCOLICIN  27     0.006    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 26.6 bits (58), Expect = 0.006
Identities = 16/49 (32%), Positives = 20/49 (40%), Gaps = 8/49 (16%)

Query: 10 WGIIFLVIALIA--------AALGFGGLAGTAAGAAKIVFVVGIILFLV 50
W +FL + A AL F LAGT G I V GI+ +
Sbjct: 460 WKPLFLTLEKKAADAGVSYVVALLFSLLAGTTLGIWGIAIVTGILCSYI 508


120  Z5997  Z6004  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias   Product
Z5997     -2      15       1.158321  phosphoglycerate mutase
Z5999     -2      13       0.609715  right origin-binding protein
Z6000     0       15       0.148476  hypothetical protein
Z6001                                DNA-binding response regulator CreB
Z6002                                sensory histidine kinase CreC
Z6003                                hypothetical protein
Z6004                                two-component response regulator
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
Z5997     VACCYTOTOXIN  29     0.013    Helicobacter pylori vacuolating cytotoxin signature.
		>VACCYTOTOXIN#Helicobacter pylori vacuolating cytotoxin signature.

Length = 1291

Score = 29.2 bits (65), Expect = 0.013
Identities = 14/45 (31%), Positives = 20/45 (44%), Gaps = 4/45 (8%)

Query: 145 PLLVSHGIALGCLVSTILGLPAWAERRLRLRNCSISRVDYQESLW 189
P +V GIA G V T+ GL W ++ N D + +W
Sbjct: 42 PAIVG-GIATGAAVGTVSGLLGWGLKQAEEAN---KTPDKPDKVW 82


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
Z6001     HTHFIS  87     6e-22    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 86.8 bits (215), Expect = 6e-22
Identities = 33/139 (23%), Positives = 60/139 (43%)

Query: 1 MQRETVWLVEDEQGIADTLVYMLQQEGFAVEVFERGLPVLDKARQQVPDVMILDVGLPDI 60
M T+ + +D+ I L L + G+ V + + D+++ DV +PD
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 SGFELCRQLLALHPALPVLFLTARSEEVDRLLGLEIGADDYVAKPFSPREVCARVRTLLR 120
+ F+L ++ P LPVL ++A++ + + E GA DY+ KPF E+ + L
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 121 RVKKFSTPSPVIRIGHFEL 139
K+ + L
Sbjct: 121 EPKRRPSKLEDDSQDGMPL 139


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
Z6002     PF06580  31     0.006    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 31.4 bits (71), Expect = 0.006
Identities = 45/207 (21%), Positives = 80/207 (38%), Gaps = 51/207 (24%)

Query: 240 LTQNARMQAL---------VETL--LRQARLENRQEVVLTAVDVAALFR---RVSEARTV 285
+ Q A++ AL L +R LE+ + ++ L R R S AR V
Sbjct: 157 MAQEAQLMALKAQINPHFMFNALNNIRALILEDPTKAREMLTSLSELMRYSLRYSNARQV 216

Query: 286 QLAE--KNITLHVM--------PTEVNVAAEPALLEQALGNLL-----DNA----IDFTP 326
LA+ + ++ + PA+++ + +L +N I P
Sbjct: 217 SLADELTVVDSYLQLASIQFEDRLQFENQINPAIMDVQVPPMLVQTLVENGIKHGIAQLP 276

Query: 327 KSGRITLSAEVDQEHVALKVLDTGSGIPDYALSRIFERFYSLPRANGQKSSGLGLAFVSE 386
+ G+I L D V L+V +TGS N ++S+G GL V E
Sbjct: 277 QGGKILLKGTKDNGTVTLEVENTGSLALK----------------NTKESTGTGLQNVRE 320

Query: 387 -VARLFNGEVTLR-NVQEGGVLASLRL 411
+ L+ E ++ + ++G V A + +
Sbjct: 321 RLQMLYGTEAQIKLSEKQGKVNAMVLI 347


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
Z6004     HTHFIS  82     4e-20    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 81.8 bits (202), Expect = 4e-20
Identities = 30/122 (24%), Positives = 60/122 (49%), Gaps = 1/122 (0%)

Query: 1 MQTPHILIVEDELVTRNTLKSIFEAEGYDVFEATDGAEMHQILSEYDINLVIMDINLPGK 60
M IL+ +D+ R L GYDV ++ A + + ++ D +LV+ D+ +P +
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 NGLLLARELRE-QANVALMFLTGRDNEVDKILGLEIGADDYITKPFNPRELTIRARNLLS 119
N L +++ + ++ ++ ++ ++ + I E GA DY+ KPF+ EL L+
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 120 RT 121

Sbjct: 121 EP 122
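
The "Score = ..., Expect = ..." lines in the hit reports above can be read programmatically and compared with the RPSBlast E-value threshold (0.05) listed under the input parameters. The following is a minimal sketch and not part of PredictBias: the function names `parse_stats` and `passes_threshold` are hypothetical, and the code assumes the statistics lines keep the layout shown on this page, including the abbreviated form `e-180` for 1e-180.

```python
import re

# Matches statistics lines such as "Score = 28.1 bits (62), Expect = 0.044".
STATS = re.compile(
    r"Score\s*=\s*([\d.]+)\s*bits\s*\((\d+)\),\s*Expect\s*=\s*(\S+)"
)


def parse_stats(line):
    """Return (bit_score, raw_score, e_value), or None if the line does not match."""
    match = STATS.search(line)
    if match is None:
        return None
    expect = match.group(3)
    # The report abbreviates 1e-180 to "e-180"; restore the mantissa
    # so that float() can parse it.
    if expect.startswith("e"):
        expect = "1" + expect
    return float(match.group(1)), int(match.group(2)), float(expect)


def passes_threshold(line, max_e_value=0.05):
    """True if the line is a statistics line whose E-value is within the threshold."""
    stats = parse_stats(line)
    return stats is not None and stats[2] <= max_e_value


if __name__ == "__main__":
    for text in (
        "Score = 490 bits (1262), Expect = e-180",
        "Score = 28.1 bits (62), Expect = 0.044",
    ):
        print(parse_stats(text), passes_threshold(text))
```

The default of 0.05 mirrors the threshold shown in the input parameters; a different cut-off can be passed as `max_e_value`.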



 
Contact Sachin Pundhir for Bugs/Comments.