PredictBias

Identification of genomic and pathogenicity islands in prokaryotic genomes
 
A) Input parameters
Genome: 1076.gbk
Threshold dinucleotide bias: 2
Threshold codon bias: 4
Threshold %GC bias: 3
E-value (RPSBlast): 0.05
Genome (non-pathogenic):
 
 
C) Potential GIs and PAIs in NC_010498
S.No  Start         End           Bias  Virulence  Insertion elements  Prediction
1     EcSMS35_0008  EcSMS35_0024  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
EcSMS35_0008  2       29       0.822899   molybdenum cofactor biosynthesis protein MogA
EcSMS35_0009  1       25       -0.960606  hypothetical protein
EcSMS35_0010  -1      17       -2.243361  hypothetical protein
EcSMS35_0011  0       20       -3.875377  hypothetical protein
EcSMS35_0012  0       19       -3.380434  molecular chaperone DnaK
EcSMS35_0013  -1      19       -4.299597  chaperone protein DnaJ
EcSMS35_0014  1       20       -4.666618  hypothetical protein
EcSMS35_0015  1       18       -3.907421  sulfatase
EcSMS35_0016  1       17       -3.858794  hypothetical protein
EcSMS35_0017  1       13       -2.526737  pH-dependent sodium/proton antiporter
EcSMS35_0018  -1      15       -2.187501  transcriptional activator NhaR
EcSMS35_0019  -2      15       0.257770   glycosyl hydrolase family protein
EcSMS35_0020  -1      18       0.864190   xylose-proton symporter
EcSMS35_0021  -1      23       2.233993   hypothetical protein
EcSMS35_0022  0       24       2.997293   30S ribosomal protein S20
EcSMS35_0023  0       21       3.367141   bifunctional riboflavin kinase/FMN
EcSMS35_0024  0       21       3.333443   isoleucyl-tRNA synthetase
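
Note on reading the per-gene bias rows: the sketch below shows one way to split such a row into its five fields, whether or not the columns are separated by whitespace. The field widths are an assumption inferred from the rows in this report (a single-digit signed DNBias, a two-digit CDNBias, and a %GCBias with one integer digit and six decimals); they are not documented by PredictBias. The names `BiasRow` and `parse_bias_row` are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class BiasRow(NamedTuple):
    locus_tag: str
    dn_bias: int
    cdn_bias: int
    gc_bias: float
    product: str


# Assumed layout (inferred from this report, not from PredictBias documentation):
# locus tag ending in four digits, one-digit signed DNBias, two-digit CDNBias,
# %GCBias with one integer digit and six decimals, then the product name.
ROW_RE = re.compile(
    r"^(?P<tag>[A-Za-z0-9]+_\d{4})\s*"
    r"(?P<dn>-?\d)\s*"
    r"(?P<cdn>\d{2})\s*"
    r"(?P<gc>-?\d\.\d{6})\s*"
    r"(?P<product>.*)$"
)


def parse_bias_row(line: str) -> Optional[BiasRow]:
    """Return the parsed row, or None for headers and other non-matching lines."""
    match = ROW_RE.match(line.strip())
    if match is None:
        return None
    return BiasRow(
        locus_tag=match.group("tag"),
        dn_bias=int(match.group("dn")),
        cdn_bias=int(match.group("cdn")),
        gc_bias=float(match.group("gc")),
        product=match.group("product"),
    )


if __name__ == "__main__":
    # One row with fused columns and one with spaced columns.
    print(parse_bias_row("EcSMS35_00220242.99729330S ribosomal protein S20"))
    print(parse_bias_row("EcSMS35_0010  -1  17  -2.243361  hypothetical protein"))
```

Fixed widths matter here because some product names begin with a digit (for example "30S ribosomal protein S20"); a greedy numeric pattern would otherwise absorb part of the product into the %GCBias value.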
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
EcSMS35_0010  PF07201  30     0.007    Hypersensitivity response secretion protein HrpJ
		>PF07201#Hypersensitivity response secretion protein HrpJ

Length = 293

Score = 30.2 bits (68), Expect = 0.007
Identities = 9/51 (17%), Positives = 24/51 (47%)

Query: 138 LHAVDARVNELEELLPLLMKDKLLAKGVSHLLSSQLTRILRTHAAMSVLGH 188
+ V+ +VN+ +P L + + +++ +S L +S + + A +
Sbjct: 80 VSDVEEQVNQYLSKVPELEQKQNVSELLSLLSNSPNISLSQLKAYLEGKSE 130
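
Note on the hit statistics: each alignment block carries a line of the form `Score = ... bits (...), Expect = ...`, and the input parameters list an RPSBlast E-value of 0.05. The sketch below parses such a line and checks the E-value against a cutoff. Treating the cutoff as inclusive (`<=`) is an assumption, and the function names are hypothetical. Expect values printed without a leading digit, such as `e-101`, are read as `1e-101`.

```python
import re
from typing import Optional, Tuple

SCORE_RE = re.compile(
    r"Score\s*=\s*(?P<bits>[\d.]+)\s*bits\s*\((?P<raw>\d+)\),"
    r"\s*Expect\s*=\s*(?P<expect>\S+)"
)


def parse_score_line(line: str) -> Optional[Tuple[float, int, float]]:
    """Return (bit score, raw score, E-value), or None if the line does not match."""
    match = SCORE_RE.search(line)
    if match is None:
        return None
    expect_text = match.group("expect")
    # BLAST-style reports may print very small E-values as "e-101";
    # float() needs a mantissa, so prepend "1".
    if expect_text.startswith("e"):
        expect_text = "1" + expect_text
    return float(match.group("bits")), int(match.group("raw")), float(expect_text)


def passes_cutoff(expect: float, cutoff: float = 0.05) -> bool:
    """Assumed rule: keep hits whose E-value does not exceed the cutoff."""
    return expect <= cutoff


if __name__ == "__main__":
    for text in (
        "Score = 30.2 bits (68), Expect = 0.007",
        "Score = 298 bits (765), Expect = e-101",
    ):
        parsed = parse_score_line(text)
        if parsed is not None:
            bits, raw, expect = parsed
            print(bits, raw, expect, passes_cutoff(expect))
```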


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0012  SHAPEPROTEIN  142    7e-40    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 142 bits (361), Expect = 7e-40
Identities = 83/387 (21%), Positives = 149/387 (38%), Gaps = 84/387 (21%)

Query: 5 IGIDLGTTNSCVAIMDGTTPRVLENAEGDRTTPSIIAYTQDGET------LVGQPAKRQA 58
+ IDLGT N+ + + + E PS++A QD VG AK+
Sbjct: 13 LSIDLGTANTLIYVKGQGIV-LNE--------PSVVAIRQDRAGSPKSVAAVGHDAKQML 63

Query: 59 VTNPQNTLFAIKRLIGRRFQDEEVQRDVSIMPFKIIAADNGDAWVEVKGQKMAPPQISAE 118
P N + AI+ + +D I F + +
Sbjct: 64 GRTPGN-IAAIRPM-----------KDGVIADFFVTEK------------------MLQH 93

Query: 119 VLKKMKKTAEDYLGEPVTEAVITVPAYFNDAQRQATKDAGRIAGLEVKRIINEPTAAALA 178
+K++ + P ++ VP +R+A +++ + AG +I EP AAA+
Sbjct: 94 FIKQVHS---NSFMRPSPRVLVCVPVGATQVERRAIRESAQGAGAREVFLIEEPMAAAIG 150

Query: 179 YGL--DKGTGNRTIAVYDLGGGTFDISIIEIDEVDGEKTFEVLATNGDTHLGGEDFDSRL 236
GL + TG+ V D+GGGT ++++I ++ V + +GG+ FD +
Sbjct: 151 AGLPVSEATGS---MVVDIGGGTTEVAVISLNGV---------VYSSSVRIGGDRFDEAI 198

Query: 237 INYLVEEFKKDQGIDLRNDPLAMQRLKEAAEKAKIELSSA----QQTDVNLPYITADATG 292
INY+ + G + AE+ K E+ SA + ++ +
Sbjct: 199 INYVRRNYGSLIG-------------EATAERIKHEIGSAYPGDEVREIEVRGRNLAEGV 245

Query: 293 PKHMNIKVTRAKLESLVEDLVNRSIEPLKVALQD-AGLSVSDIDD--VILVGGQTRMPMV 349
P+ + + LE+L E + + + VAL+ SDI + ++L GG + +
Sbjct: 246 PRGFTLN-SNEILEALQEP-LTGIVSAVMVALEQCPPELASDISERGMVLTGGGALLRNL 303

Query: 350 QKKVAEFFGKEPRKDVNPDEAVAIGAA 376
+ + E G +P VA G
Sbjct: 304 DRLLMEETGIPVVVAEDPLTCVARGGG 330
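
Note on the identity statistics: the `Identities`, `Positives`, and `Gaps` fields give a count over the alignment length followed by a percentage. In this report the percentages appear to be truncated rather than rounded (for example, 149/387 is about 38.5% but is shown as 38%). The sketch below parses these fields and recomputes the percentage by integer division; the function names are hypothetical.

```python
import re
from typing import Dict, Tuple

COUNT_RE = re.compile(r"(Identities|Positives|Gaps)\s*=\s*(\d+)/(\d+)\s*\((\d+)%\)")


def parse_counts(line: str) -> Dict[str, Tuple[int, int, int]]:
    """Map each field name to (count, alignment length, reported percent)."""
    result = {}
    for name, count, length, percent in COUNT_RE.findall(line):
        result[name] = (int(count), int(length), int(percent))
    return result


def floor_percent(count: int, length: int) -> int:
    """Percentage truncated toward zero, matching the values seen in this report."""
    return count * 100 // length


if __name__ == "__main__":
    line = "Identities = 83/387 (21%), Positives = 149/387 (38%), Gaps = 84/387 (21%)"
    for name, (count, length, reported) in parse_counts(line).items():
        print(name, count, length, reported, floor_percent(count, length))
```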


2     EcSMS35_0059  EcSMS35_0070  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
EcSMS35_0059  -1      19       3.425550   Dna-J like membrane chaperone protein
EcSMS35_0060  -1      19       4.138584   23S rRNA/tRNA pseudouridine synthase A
EcSMS35_0061  -2      18       4.043972   ATP-dependent helicase HepA
EcSMS35_0062  -1      11       1.877788   DNA polymerase II
EcSMS35_0063  -1      11       -0.607571  L-ribulose-5-phosphate 4-epimerase
EcSMS35_0064  -1      11       -0.592911  L-arabinose isomerase
EcSMS35_0065  2       19       -1.100225  ribulokinase
EcSMS35_0066  2       21       -1.382350  DNA-binding transcriptional regulator AraC
EcSMS35_0067  2       17       -1.253784  hypothetical protein
EcSMS35_0068  0       16       0.898445   hypothetical protein
EcSMS35_0069  0       16       3.228719   hypothetical protein
EcSMS35_0070  0       15       3.091479   thiamine transporter ATP-binding subunit
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
EcSMS35_0066  PF05616  29     0.021    Neisseria meningitidis TspB protein
		>PF05616#Neisseria meningitidis TspB protein

Length = 501

Score = 29.3 bits (65), Expect = 0.021
Identities = 26/118 (22%), Positives = 47/118 (39%), Gaps = 21/118 (17%)

Query: 82 YGRHPEAREWYHQWVYFRPRAYWHEWLNWPSIFANTGFFRPDEAHQPHFSDLFGQ-IINA 140
Y R PE +E + R YW + N P ++ +F+ + +F G ++
Sbjct: 158 YSRFPEVKELMESQMERLARPYWEKLRNRPDMY----YFKNYNFKRCYFGLNGGDCLVAK 213

Query: 141 G-----------QGEGRYSELLAINLLEQLLLRRMEA-----INESLHPPMDNRVREA 182
G QG +Y E + LE++L +++A I + +P +V A
Sbjct: 214 GDDGRTFISFSLQGNSKYKEEMDAKKLEEILSLKVDANPDKYIKATGYPGYSEKVEVA 271


3     EcSMS35_0114  EcSMS35_0125  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
EcSMS35_0114  2       15       -2.856421  regulatory protein AmpE
EcSMS35_0115  3       17       -4.061615  aromatic amino acid transporter
EcSMS35_0116  6       31       -8.038197  putative S-type colicin
EcSMS35_0117  6       25       -8.188794  putative colicin immunity protein
EcSMS35_0118  4       27       -7.671468  putative colicin
EcSMS35_0119  4       27       -1.568007  putative colicin immunity protein
EcSMS35_0120  5       35       0.606547   putative colicin
EcSMS35_0121  5       38       1.063178   putative colicin immunity protein
EcSMS35_0123  5       31       1.888495   transcriptional regulator PdhR
EcSMS35_0122  4       35       2.185829   hypothetical protein
EcSMS35_0124  4       34       2.385966   pyruvate dehydrogenase subunit E1
EcSMS35_0125  2       27       1.919927   dihydrolipoamide acetyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0116  PYOCINKILLER  184    8e-53    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 184 bits (468), Expect = 8e-53
Identities = 97/288 (33%), Positives = 139/288 (48%), Gaps = 21/288 (7%)

Query: 309 LQQKALAGSTATTRVRFFWGTDIHGKPQVYGVHTGEGTPY-ENVRVANMQWNEQTQRYEF 367
+ A+A ++ T + + G V + +G + V V +N T YE
Sbjct: 341 VNLNAVAKASGTVDLPMRLTNEARGNTTTLSVVSTDGVSVPKAVPVRMAAYNATTGLYEV 400

Query: 368 T---PAHDVDGPLITWTPENPEHGNVPGHTGN--DRPPLEQPTILVTPIPDGTDTYSTPP 422
T + ++TWTP +P P T +P +TP+ +TY
Sbjct: 401 TVPSTTAEAPPLILTWTPASPPGNQNPSSTTPVVPKPVPVYEGATLTPVKATPETY---- 456

Query: 423 FPVPDPKEFNDYILVFPAGSGIKPIYVYLKEDPRKLPGVVTGRGVPLSPGTRWLDMSVSN 482
P D I+ FPA SGIKPIYV + DPR +PG TG+G P+ WL ++
Sbjct: 457 -PGVITLP-EDLIIGFPADSGIKPIYVMFR-DPRDVPGAATGKGQPV--SGNWLG--AAS 509

Query: 483 NGNGAPIPAHIADKLRGREFKTFDEFREALWLEVSQDPELIAQFSSGNQTRIKQGLTAKA 542
G GAPIP+ IADKLRG+ FK + +FRE W+ V+ DPEL QF+ G+ ++ G
Sbjct: 510 QGEGAPIPSQIADKLRGKTFKNWRDFREQFWIAVANDPELSKQFNPGSLAVMRDGGAPYV 569

Query: 543 PIDGWHYGPKEIVKKFQIHHRVAIEYGGSVYDIDNLRIVTPRLHDEIH 590
G + K +IHH+V + GG VY++ NL VTP+ H EIH
Sbjct: 570 RESE-QAGGRI---KIEIHHKVRVADGGGVYNMGNLVAVTPKRHIEIH 613


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0118  PYOCINKILLER  54     2e-12    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 54.0 bits (129), Expect = 2e-12
Identities = 18/64 (28%), Positives = 31/64 (48%), Gaps = 4/64 (6%)

Query: 1 MSQYPELIAQFSSGNQTRIKQGLIAKAPLEGWHYGTKEIVKKFHIYHRVAIEYSGGIYDI 60
++ PEL QF+ G+ ++ G E G + K I+H+V + GG+Y++
Sbjct: 543 VANDPELSKQFNPGSLAVMRDGGAPYVR-ESEQAGGRI---KIEIHHKVRVADGGGVYNM 598

Query: 61 DNLR 64
NL
Sbjct: 599 GNLV 602


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
EcSMS35_0119  PF04605  26     0.013    Virulence-associated protein D (VapD)
		>PF04605#Virulence-associated protein D (VapD)

Length = 125

Score = 26.4 bits (58), Expect = 0.013
Identities = 13/34 (38%), Positives = 18/34 (52%)

Query: 1 MYDFKNKIEDYTEREFIELLGEFTNPTGDNAQLK 34
Y K I+D ++F + L EFT T N +LK
Sbjct: 88 QYSLKETIQDLCAKDFHQKLKEFTEKTPKNQKLK 121


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0120  PYOCINKILLER  47     2e-10    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 47.5 bits (112), Expect = 2e-10
Identities = 16/62 (25%), Positives = 29/62 (46%), Gaps = 4/62 (6%)

Query: 1 MSQYPELIAQFSTGNQTRIKQGLIAKAPLEGWYYGSKEIVKEFHIYHSVAIECGGEIYDI 60
++ PEL QF+ G+ ++ G E G + + I+H V + GG +Y++
Sbjct: 543 VANDPELSKQFNPGSLAVMRDGGAPYVR-ESEQAGGRI---KIEIHHKVRVADGGGVYNM 598

Query: 61 DN 62
N
Sbjct: 599 GN 600


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits       Score  E-value  Comments
EcSMS35_0125  RTXTOXIND  34     0.002    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 34.0 bits (78), Expect = 0.002
Identities = 42/281 (14%), Positives = 89/281 (31%), Gaps = 32/281 (11%)

Query: 26 DKVEAEQSLITVEGDKASMEVPSPQAGIVKEIKVSVGDKTQTGALIMIFDSADGAADAAP 85
+ V +T G S E+ + IVKEI V G+ + G +++ + AD
Sbjct: 81 EIVATANGKLTHSGR--SKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLK 138

Query: 86 AQA--------EEKKEAAPAAA-----PAAAAAKDVNVPDIGSDEVEVTEILVKVG-DKV 131
Q+ + + + + P + ++ +EV L+K
Sbjct: 139 TQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTW 198

Query: 132 EAEQSLITVEGDKASMEVPAPFAGTVKEIKVNVGDKVSTGSLIMVFEVAGEAGAAAPAAK 191
+ ++ + DK E A + ++ +K + A +
Sbjct: 199 QNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHK-QAIAKHAVLEQ 257

Query: 192 QEAAPAAAPASAAGVKDVNVPDIGGDEV-------------EVTEVMVKVGDKVAA-EQS 237
+ A + + E+ + + + D +
Sbjct: 258 ENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLE 317

Query: 238 LITVEGDKASMEVPAPFAGVVKELKVN-VGDKVKTGSLIMI 277
L E + + + AP + V++LKV+ G V T +M+
Sbjct: 318 LAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMV 358



Score = 29.8 bits (67), Expect = 0.034
Identities = 20/95 (21%), Positives = 35/95 (36%), Gaps = 3/95 (3%)

Query: 230 DKVAAEQSLITVEGDKASMEVPAPFAGVVKELKVNVGDKVKTGSLIMIFEVEGAAPAAAP 289
+ VA +T G S E+ +VKE+ V G+ V+ G +++ GA A
Sbjct: 81 EIVATANGKLTHSGR--SKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAE-ADTL 137

Query: 290 AKQEAAAPAPAAKAEAPAAAPAAKAEGKSEFAEND 324
Q + A + + + + E D
Sbjct: 138 KTQSSLLQARLEQTRYQILSRSIELNKLPELKLPD 172


4     EcSMS35_0141  EcSMS35_0165  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
EcSMS35_0141  -1      20       -3.640617  aspartate alpha-decarboxylase
EcSMS35_0142  1       21       -4.980633  ISNCY family transposase
EcSMS35_0143  1       25       -5.891213  hypothetical protein
EcSMS35_0144  2       28       -6.297390  pantoate--beta-alanine ligase
EcSMS35_0145  3       31       -7.639241  3-methyl-2-oxobutanoate
EcSMS35_0146  4       35       -8.756432  putative fimbrial-like adhesin protein
EcSMS35_0147  4       31       -7.605517  putative fimbrial protein
EcSMS35_0148  3       30       -6.846685  hypothetical protein
EcSMS35_0149  1       20       -4.304164  fimbrial protein
EcSMS35_0150  -1      16       -3.137762  putative outer membrane usher protein
EcSMS35_0151  -2      14       -0.324450  putative chaperone protein EcpD
EcSMS35_0152  -1      14       0.238166   fimbrial protein
EcSMS35_0153  1       13       1.580734   2-amino-4-hydroxy-6-hydroxymethyldihyropteridine
EcSMS35_0154  0       15       3.365431   poly(A) polymerase I
EcSMS35_0155  -1      15       3.328883   glutamyl-Q tRNA(Asp) synthetase
EcSMS35_0156  -1      16       3.331103   RNA polymerase-binding transcription factor
EcSMS35_0157  0       12       2.446923   sugar fermentation stimulation protein A
EcSMS35_0158  -1      13       3.074889   2'-5' RNA ligase
EcSMS35_0159  -2      13       3.004205   ATP-dependent RNA helicase HrpB
EcSMS35_0160  -1      16       3.447211   penicillin-binding protein 1b
EcSMS35_0161  -2      15       3.391281   hypothetical protein
EcSMS35_0162  -1      13       3.399391   ferrichrome outer membrane transporter
EcSMS35_0163  0       15       4.464119   iron-hydroxamate transporter ATP-binding
EcSMS35_0164  1       14       4.133492   iron-hydroxamate transporter substrate-binding
EcSMS35_0165  0       13       3.948541   iron-hydroxamate transporter permease subunit
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0145  FLGMRINGFLIF  29     0.021    Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 29.2 bits (65), Expect = 0.021
Identities = 26/99 (26%), Positives = 39/99 (39%), Gaps = 20/99 (20%)

Query: 110 MVKIEGGEWL----VETVQMLTERAVPVCGHLGLTPQSVNIFGGYKVQGRGDEAGDQL-L 164
V +E G L + V L AV GL P +V + D++G L
Sbjct: 176 TVTLEPGRALDEGQISAVVHLVSSAVA-----GLPPGNVTLV---------DQSGHLLTQ 221

Query: 165 SDALALEAAGAQLLVLECVPVELAKRITEALAIPVIGIG 203
S+ + AQL V + +RI L+ P++G G
Sbjct: 222 SNTSGRDLNDAQLKFANDVESRIQRRIEAILS-PIVGNG 259


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
EcSMS35_0150  PF00577  767    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 767 bits (1982), Expect = 0.0
Identities = 252/847 (29%), Positives = 420/847 (49%), Gaps = 37/847 (4%)

Query: 18 SFCALLLSPCAISAEHVEYDNTFLMGQDAFNIDLSRYTEGNPTLPGVYDVSVYINDQPVI 77
+ +S+ + ++ FL DLSR+ G PG Y V +Y+N+ +
Sbjct: 31 FVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPGTYRVDIYLNNGYMA 90

Query: 78 NQSISFITLEGKKNAQACITLKNLLQFHINKPDINAENSILLQREGELGDCLDLANIIPQ 137
+ ++F T + ++ C+T L +N ++ N + C+ L ++I
Sbjct: 91 TRDVTFNTGDSEQGIVPCLTRAQLASMGLNTASVSGMNLL------ADDACVPLTSMIHD 144

Query: 138 ASVHYDVNDQRLDINVPQAWVMKNYQNYVDSSLWENGINAAMLAYNVNAYHSEIP-DRKN 196
A+ DV QRL++ +PQA++ + Y+ LW+ GINA +L YN + + +
Sbjct: 145 ATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSVQNRIGGNS 204

Query: 197 DSVYAAFNGGINLGAWRLRATGNYNWMTNVGS-----DYDFQNRYLQRDLASLRSQLIVG 251
Y G+N+GAWRLR +++ ++ S + N +L+RD+ LRS+L +G
Sbjct: 205 HYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRSRLTLG 264

Query: 252 ESYTTGETFDAVSIRGIRLYSDSRMLPPALASFAPIIHGVANTNAKVTITQGGYKIYETT 311
+ YT G+ FD ++ RG +L SD MLP + FAP+IHG+A A+VTI Q GY IY +T
Sbjct: 265 DGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIYNST 324

Query: 312 VPPGAFVIDDLSPSGYGSDLIITIEESDGIKRTFSQPFSSVIQMQRPGVGRWDISAGQVL 371
VPPG F I+D+ +G DL +TI+E+DG + F+ P+SSV +QR G R+ I+AG+
Sbjct: 325 VPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITAGEYR 384

Query: 372 KDD-IQNEPNLFQASYYYGLNNYLTGYTGIQITDNNYTAGLLGLGLNT-AYGAFSVDVTH 429
+ Q +P FQ++ +GL T Y G Q+ D Y A G+G N A GA SVD+T
Sbjct: 385 SGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLAD-RYRAFNFGIGKNMGALGALSVDMTQ 443

Query: 430 SDVQIPDDKTYRGQSYRISWNKLFEDTRTSLNIAAYRYSTQNYLGLNDALTLIDEVKHPE 489
++ +PDD + GQS R +NK ++ T++ + YRYST Y D + E
Sbjct: 444 ANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGYNIE 503

Query: 490 -----QDLEPKNMRNYSRM---KNQVTVSINQPLKFEKKDYGSFYLAGSWSDYWADGQNN 541
++PK Y+ + ++ +++ Q L + YL+GS YW +
Sbjct: 504 TQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQL----GRTSTLYLSGSHQTYWGTSNVD 559

Query: 542 TNYSIGYSNSASWGSYSISAQRSWSQ-DGGNEDSIYLSFSIPIEKLLGSEHRDS-GFQSI 599
+ G + + ++++S + + G + + L+ +IP L S+ + S
Sbjct: 560 EQFQAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRSDSKSQWRHASA 619

Query: 600 DTQLNSDFNGSNQLSISSSGYSTENH-ISYSVNTGYSMMKSSDDLGYIGGYASYESPWGT 658
++ D NG G E++ +SYSV TGY+ + +Y +G
Sbjct: 620 SYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGYGN 679

Query: 659 LSSSVSASSDNSRQISFNTDGGFVLHSGGLTFSNDSFSDSDTLAVVQAPGAKGARINYGN 718
+ S S D +Q+ + GG + H+ G+T +DT+ +V+APGAK A++
Sbjct: 680 ANIGYSHSDDI-KQLYYGVSGGVLAHANGVTLGQPL---NDTVVLVKAPGAKDAKVENQT 735

Query: 719 ST-IDRWGYGVTNALSPYHENRIALDINGLENDVELKSTSAITVPRQGSVVFAGFETVQG 777
D GY V + Y ENR+ALD N L ++V+L + A VP +G++V A F+ G
Sbjct: 736 GVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAEFKARVG 795

Query: 778 QSAIMNIKRTDGKNIPFAADIYDENGNIIGNVGQGGQAFVRGIEQQGNIRINWLDDGKPV 837
+M + + K +PF A + E+ G V GQ ++ G+ G +++ W ++
Sbjct: 796 IKLLMTLTH-NNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLAGKVQVKWGEEENA- 853

Query: 838 TCLAHYQ 844
C+A+YQ
Sbjct: 854 HCVANYQ 860


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0164  FERRIBNDNGPP  510    0.0      Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 510 bits (1314), Expect = 0.0
Identities = 293/296 (98%), Positives = 294/296 (99%)

Query: 1 MSGLPLISRRRLLTAMALSPLLWQMNTAHAAAIDPNRIVALEWLPVELLLALGIVPYGVA 60
MSGLPLISRRRLLTAMALSPLLWQMNTAHAAAIDPNRIVALEWLPVELLLALGIVPYGVA
Sbjct: 1 MSGLPLISRRRLLTAMALSPLLWQMNTAHAAAIDPNRIVALEWLPVELLLALGIVPYGVA 60

Query: 61 DTINYRLWVSEPPLPDSVIDVGLRTEPNLELLTEMKPSFMVWSAGYGPSPEMLARIAPGR 120
DTINYRLWVSEPPLPDSVIDVGLRTEPNLELLTEMKPSFMVWSAGYGPSPEMLARIAPGR
Sbjct: 61 DTINYRLWVSEPPLPDSVIDVGLRTEPNLELLTEMKPSFMVWSAGYGPSPEMLARIAPGR 120

Query: 121 GFNFSDGKQPLAMARKSLTEMADLLNLQSAAETHLAHYEDFIRSMKPRFVKRGARPLLLT 180
GFNFSDGKQPLAMARKSLTEMADLLNLQSAAETHLA YEDFIRSMKPRFVKRGARPLLLT
Sbjct: 121 GFNFSDGKQPLAMARKSLTEMADLLNLQSAAETHLAQYEDFIRSMKPRFVKRGARPLLLT 180

Query: 181 TLIDPRHMLVFGPNSLFQEILDEYGIPNAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDH 240
TLIDPRHMLVFGPNSLFQEILDEYGIPNAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDH
Sbjct: 181 TLIDPRHMLVFGPNSLFQEILDEYGIPNAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDH 240

Query: 241 DNSKDMDTLMATPLWQAMPFVRAGRFQRVPAVWFYGATLSAMHFVRILDNAIGGKA 296
DNSKDMD LMATPLWQAMPFVRAGRFQRVPAVWFYGATLSAMHFVR+LDNAIGGKA
Sbjct: 241 DNSKDMDALMATPLWQAMPFVRAGRFQRVPAVWFYGATLSAMHFVRVLDNAIGGKA 296


5     EcSMS35_0238  EcSMS35_0283  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
EcSMS35_0238  0       29       3.361539   addiction module antitoxin component DinJ
EcSMS35_0239  1       31       3.730228   NlpC/P60 family protein
EcSMS35_0240  1       33       4.223703   hypothetical protein
EcSMS35_0241  1       37       5.132237   lateral flagellar export/assembly protein LfhA
EcSMS35_0242  2       35       2.645455   flagellar biosynthesis protein FlhB
EcSMS35_0243  1       31       1.646066   lateral flagellar export/assembly protein LfiR
EcSMS35_0244  0       30       1.599193   lateral flagellar export/assembly protein LfiQ
EcSMS35_0245  1       30       3.185673   flagellar biosynthesis protein FliP
EcSMS35_0246  3       30       3.510778   lateral flagellar export/assembly protein LfiN
EcSMS35_0247  3       29       4.223450   lateral flagellar export/assembly protein LfiM
EcSMS35_0248  3       35       6.629697   lateral flagellar RpoN-interacting regulatory
EcSMS35_0249  4       38       7.537996   lateral flagellar basal body component protein
EcSMS35_0250  4       37       7.395812   flagellar MS-ring protein
EcSMS35_0251  4       33       5.338043   flagellar motor switch protein G
EcSMS35_0252  3       21       0.757247   flagellar assembly protein H
EcSMS35_0253  1       20       0.706095   lateral flagellar export/assembly protein LfiI
EcSMS35_0254  -1      16       -1.169552  lateral flagellar export/assembly protein LfiJ
EcSMS35_0255  -1      16       -1.469493  cytidyltransferase-like protein
EcSMS35_0256  0       17       -1.497857  hypothetical protein
EcSMS35_0257  -1      19       0.771174   hypothetical protein
EcSMS35_0258  1       23       3.330431   glycosyl transferase, group 2 family protein
EcSMS35_0259  2       27       3.791636   lateral flagellar associated protein LafV
EcSMS35_0260  0       27       4.898350   lateral flagellar chaperone protein LfgN
EcSMS35_0261  1       29       5.751068   lateral flagellar anti-sigma factor 28 protein
EcSMS35_0262  2       32       5.805658   lateral flagellar P-ring addition protein LfgA
EcSMS35_0263  3       33       5.841270   lateral flagellar rod protein LfgB
EcSMS35_0264  2       35       6.608235   flagellar basal body rod protein FlgC
EcSMS35_0265  2       36       7.213633   lateral flagellar rod protein LfgD
EcSMS35_0266  2       37       6.569546   lateral flagellar hook protein LfgE
EcSMS35_0267  1       39       6.709611   flagellar basal body rod protein FlgF
EcSMS35_0268  1       38       6.308839   flagellar basal body rod protein FlgG
EcSMS35_0269  0       35       6.626141   flagellar basal body L-ring protein
EcSMS35_0270  -1      29       3.954097   flagellar basal body P-ring protein
EcSMS35_0271  0       24       2.553266   lateral flagellar peptidoglycan hydrolase LfgJ
EcSMS35_0272  1       25       3.074209   lateral flagellar hook associated protein 1
EcSMS35_0273  1       22       2.132523   lateral flagellar hook associated protein 3
EcSMS35_0274  1       19       1.807642   lateral flagellar hook associated protein LafW
EcSMS35_0275  1       18       1.505811   lateral flagellar transmembrane regulator LafZ
EcSMS35_0276  2       22       3.313618   lateral flagellar flagellin LafA
EcSMS35_0277  2       23       3.193386   lateral flagellar hook associated protein 2
EcSMS35_0278  2       23       2.819148   lateral flagellar chaperone protein LafC
EcSMS35_0279  3       24       2.856566   lateral flagellar chaperone protein LafD
EcSMS35_0280  2       21       2.536080   lateral flagellar hook length control protein
EcSMS35_0281  2       19       0.823342   lateral flagellar basal body-associated protein
EcSMS35_0282  2       20       0.216679   flagellar biosynthesis sigma factor
EcSMS35_0283  2       17       0.340104   flagellar motor protein MotA
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0242  TYPE3IMSPROT  298    e-101    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 298 bits (765), Expect = e-101
Identities = 97/347 (27%), Positives = 178/347 (51%), Gaps = 6/347 (1%)

Query: 6 SEEKTEKPSAQKLRKAREEGQLPRSKDMGLAASLFAAFVVISSSFPWYADFVRESFISVH 65
S EKTE+P+ +K+R AR++GQ+ +SK++ A + A ++ +Y + + +
Sbjct: 2 SGEKTEQPTPKKIRDARKKGQVAKSKEVVSTALIVALSAMLMGLSDYYFEHFSKLMLI-- 59

Query: 66 QYAQEINNPDV--IGQFLRHHLLILGKFILTLLPMPA-AALLSSLVPGGWLFLPKKILPD 122
A++ P + + + LL LL + A A+ S +V G+L + I PD
Sbjct: 60 -PAEQSYLPFSQALSYVVDNVLLEFFYLCFPLLTVAALMAIASHVVQYGFLISGEAIKPD 118

Query: 123 FSKISPLKGIGRLFSSEHLAETGKMTVKSVVVLVMLWVSLRNNFAAFLGLQALPFKLAIN 182
KI+P++G R+FS + L E K +K V++ +++W+ ++ N L L +
Sbjct: 119 IKKINPIEGAKRIFSIKSLVEFLKSILKVVLLSILIWIIIKGNLVTLLQLPTCGIECITP 178

Query: 183 DGLSLYASVMRNFVILFIFFALIDVPLAKALFTKGLKMTKQELKEEYKNQEGKPEVKARV 242
+ +M + F+ ++ D + K LKM+K E+K EYK EG PE+K++
Sbjct: 179 LLGQILRQLMVICTVGFVVISIADYAFEYYQYIKELKMSKDEIKREYKEMEGSPEIKSKR 238

Query: 243 RRLQRQLAMGQIRKVVPKANVVITNPTHYAVALQYDQSRAAAPFVVAKGTDEIALYIRQV 302
R+ +++ +R+ V +++VV+ NPTH A+ + Y + P V K TD +R++
Sbjct: 239 RQFHQEIQSRNMRENVKRSSVVVANPTHIAIGILYKRGETPLPLVTFKYTDAQVQTVRKI 298

Query: 303 AAENQVEVVEFPRLARSVYYTTQVNQQIPFQLYRAIAHVLTYVLQMK 349
A E V +++ LAR++Y+ V+ IP + A A VL ++ +
Sbjct: 299 AEEEGVPILQRIPLARALYWDALVDHYIPAEQIEATAEVLRWLERQN 345


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0243  TYPE3IMRPROT  110    2e-31    Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 110 bits (277), Expect = 2e-31
Identities = 75/232 (32%), Positives = 126/232 (54%), Gaps = 2/232 (0%)

Query: 8 QLTDLALGLWFPFVRIMAFLRYVPVLDNSALTVRVRIILSLALAIIITPLIPHPIPHDLL 67
Q ++P +R++A + P+L ++ RV++ L++ + I P +P +
Sbjct: 8 QWLSWLNLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPANDV-PVF 66

Query: 68 SLNSLILTVEQILWGMLFGLMFQFLFLALQLAGQILSFNMGMSMAVMNDPSSGASTTVLA 127
S +L L V+QIL G+ G QF F A++ AG+I+ MG+S A DP+S + VLA
Sbjct: 67 SFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHLNMPVLA 126

Query: 128 ELINVYAILLFFAMDGHLLLVSVLYKGFTYWPIGNA-LHPQTLRTIALAFSWVLASASLL 186
++++ A+LLF +GHL L+S+L F PIG L+ + A S + + +L
Sbjct: 127 RIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIFLNGLML 186

Query: 187 ALPTTFIMLIVQGCFGLLNRIAPPLNLFSLGFPINMLAGLVCFATLLYNLPD 238
ALP ++L + GLLNR+AP L++F +GFP+ + G+ A L+ +
Sbjct: 187 ALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAP 238


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0244  TYPE3IMQPROT  43     3e-09    Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.

Length = 86

Score = 43.2 bits (102), Expect = 3e-09
Identities = 21/73 (28%), Positives = 36/73 (49%)

Query: 14 GIKVVILLVSVLVVPSLLVGLLVSVFQAVTQINEQTLSFLPRLIVTLVVLGVCGKWMIIQ 73
+ +V++L + + ++GLLV +FQ VTQ+ EQTL F +L+ + L + W
Sbjct: 11 ALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLFLLSGWYGEV 70

Query: 74 LHDLCIHLFSQAA 86
L + A
Sbjct: 71 LLSYGRQVIFLAL 83


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0245  FLGBIOSNFLIP  223    2e-75    Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP signature.

Length = 245

Score = 223 bits (569), Expect = 2e-75
Identities = 111/243 (45%), Positives = 152/243 (62%), Gaps = 4/243 (1%)

Query: 3 RRTQLALGLGLLALAPLALAQGGDIALLNVVTHGNTQEYSVKIQVLILMTLVGLLPTMVL 62
RR + L + PLA AQ + + G Q +S+ +Q L+ +T + +P ++L
Sbjct: 2 RRLLSVAPVLLWLITPLAFAQLP--GITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILL 59

Query: 63 MMTCFTRFIIVLSLLRQALGLQQTPPNRILIGIALSLTMLVMRPVWLNIYDHAVVPFEND 122
MMT FTR IIV LLR ALG PPN++L+G+AL LT +M PV IY A PF +
Sbjct: 60 MMTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEE 119

Query: 123 QITLTDALSTAATPLKRFMLAQTDKKAMAQIMTIGGAKG--NAADQDLTIVVPAYVLSEL 180
+I++ +AL A PL+ FML QT + + + + I++PAYV SEL
Sbjct: 120 KISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSEL 179

Query: 181 KTAFQIGFMIYIPFLVIDLIVASVLMAMGMMMLSPLIVSLPFKLMLFVLIDGWSLTIGTL 240
KTAFQIGF I+IPFL+IDL++ASVLMA+GMMM+ P ++LPFKLMLFVL+DGW L +G+L
Sbjct: 180 KTAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSL 239

Query: 241 TTS 243
S
Sbjct: 240 AQS 242


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0246  FLGMOTORFLIN  71     5e-19    Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 71.1 bits (174), Expect = 5e-19
Identities = 39/104 (37%), Positives = 61/104 (58%), Gaps = 5/104 (4%)

Query: 13 GLADEVAPVTKSADAETLVTRL----EDRFSDSMTLLKRIPVTLTLEVSSVEIMLADLLN 68
L ++ A TKSA A+ + +L + L+ IPV LT+E+ + + +LL
Sbjct: 22 ALNEQKATTTKSA-ADAVFQQLGGGDVSGAMQDIDLIMDIPVKLTVELGRTRMTIKELLR 80

Query: 69 IDDDTVIELDKLAGEPLDIKVNNILLGKAEVVVVNEKYGLRVLE 112
+ +V+ LD LAGEPLDI +N L+ + EVVVV +KYG+R+ +
Sbjct: 81 LTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGVRITD 124


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits    Score  E-value  Comments
EcSMS35_0248  HTHFIS  361    e-125    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 361 bits (929), Expect = e-125
Identities = 112/345 (32%), Positives = 183/345 (53%), Gaps = 26/345 (7%)

Query: 3 ELIATAASSINAFTLAKRVAAFNVPVLIQGETGAGKECVAKYIHTVAFGENDNAPYIGVN 62
L+ +A+ + + R+ ++ ++I GE+G GKE VA+ +H +G+ N P++ +N
Sbjct: 138 PLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALH--DYGKRRNGPFVAIN 195

Query: 63 CAAIPENMLESTLFGYDKGAFTGAIASVPGKMELANNGSLLLDEIGEMPLALQAKILRVL 122
AAIP +++ES LFG++KGAFTGA G+ E A G+L LDEIG+MP+ Q ++LRVL
Sbjct: 196 MAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDMPMDAQTRLLRVL 255

Query: 123 QEQQVERLGSNRQIKLNFRLIACTNKNLEQEVAAGRFREDLYYRLAVIPITMPPLRERLN 182
Q+ + +G I+ + R++A TNK+L+Q + G FREDLYYRL V+P+ +PPLR+R
Sbjct: 256 QQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRLNVVPLRLPPLRDRAE 315

Query: 183 DIIPLAESFIKKYSTVLVKNITLSESTRRALLNYRWPGNVRQLENAIQRGMILNRDGVIY 242
DI L F+++ + + + + WPGNVR+LEN ++R L VI
Sbjct: 316 DIPDLVRHFVQQAEKEGLDVKRFDQEALELMKAHPWPGNVRELENLVRRLTALYPQDVIT 375

Query: 243 PDAL---------------------GLPDTDIADHSELQWPVQPAVHIAETGDLGQHGRS 281
+ + L + + + Q+ + +G +
Sbjct: 376 REIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFASFGDALPPSGLYDRVLAE 435

Query: 282 AQYQYIADLMRKYQGNRSKIADLLGITPRALRYRLASMRKHGIEV 326
+Y I + +GN+ K ADLLG+ LR + +R+ G+ V
Sbjct: 436 MEYPLILAALTATRGNQIKAADLLGLNRNTLRKK---IRELGVSV 477


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits         Score  E-value  Comments
EcSMS35_0249  FLGHOOKFLIE  39     4e-07    Flagellar hook-basal body complex protein FliE signa...
		>FLGHOOKFLIE#Flagellar hook-basal body complex protein FliE signature.

Length = 103

Score = 38.9 bits (90), Expect = 4e-07
Identities = 24/79 (30%), Positives = 36/79 (45%), Gaps = 1/79 (1%)

Query: 36 SSTDPDVSFNRIMSGALGHVDQFQQVAEQQQTAVDTGKSD-DLAGAMIASQQASLSFSAL 94
S P +SF + AL + Q A Q G+ L M Q+AS+S
Sbjct: 25 SLPQPTISFAGQLHAALDRISDTQTAARTQAEKFTLGEPGVALNDVMTDMQKASVSMQMG 84

Query: 95 VQVRNKIATGFNDLMSMSI 113
+QVRNK+ + ++MSM +
Sbjct: 85 IQVRNKLVAAYQEVMSMQV 103


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0250  FLGMRINGFLIF  284    5e-91    Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 284 bits (727), Expect = 5e-91
Identities = 161/551 (29%), Positives = 255/551 (46%), Gaps = 37/551 (6%)

Query: 18 RLADNKRWALMAGVGLAVAATAIIVSVLWTGNRGYVSLYGRQENLPVSQIVTVLDGEKLS 77
RL N R L V + A ++ VLW Y +L+ + IV L +
Sbjct: 18 RLRANPRIPL--IVAGSAAVAIVVAMVLWAKTPDYRTLFSNLSDQDGGAIVAQLTQMNIP 75

Query: 78 YRIDPQSGQILVPEDELSKTRMTLAAKGVQAILPSGYELMDKDEVLGSSQFVQNVRYKRS 137
YR SG I VP D++ + R+ LA +G+ G+EL+D+ E G SQF + V Y+R+
Sbjct: 76 YRFANGSGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQ-EKFGISQFSEQVNYQRA 134

Query: 138 LEGELAQSIMSLDAVESARVHLALNEESSFVVSDEPQNSASVVVRLHYGAKLNMDQVNAI 197
LEGELA++I +L V+SARVHLA+ + S FV + SASV V L G L+ Q++A+
Sbjct: 135 LEGELARTIETLGPVKSARVHLAMPKPSLFVREQKSP-SASVTVTLEPGRALDEGQISAV 193

Query: 198 VHLVSGSIPGLHASKVSVVDQAGNLLTDG-IGAGEAVSAATRKRDQILKDIQDKTRASVA 256
VHLVS ++ GL V++VDQ+G+LLT + A + + + IQ + +
Sbjct: 194 VHLVSSAVAGLPPGNVTLVDQSGHLLTQSNTSGRDLNDAQLKFANDVESRIQRR----IE 249

Query: 257 NVLDSLVGSGNYRVSVMPDLDLSNIDETQEHY---GDAPKIN---REENVLDSDTNQVAM 310
+L +VG+GN V LD +N ++T+EHY GDA K R+ N+ +
Sbjct: 250 AILSPIVGNGNVHAQVTAQLDFANKEQTEEHYSPNGDASKATLRSRQLNISEQVGAGYPG 309

Query: 311 GVPGSLSNRPPIAANQMTNGTEENR----------------SPEALSKHSESKRDYSYDR 354
GVPG+LSN+P N+ S S +Y DR
Sbjct: 310 GVPGALSNQPAPPNEAPIATPPTNQQNAQNTPQTSTSTNSNSAGPRSTQRNETSNYEVDR 369

Query: 355 SVQHIQHPGFAVKRLNVAVVLN-QNAPALKN--WKPEQTTQLTALLNNAAGIDVQRGDNL 411
+++H + ++RL+VAVV+N + K +Q Q+ L A G +RGD L
Sbjct: 370 TIRHTKMNVGDIERLSVAVVVNYKTLADGKPLPLTADQMKQIEDLTREAMGFSDKRGDTL 429

Query: 412 TLSLLNFVPQAVPVEPIIPLWKDDSVLAWVRLIGCGLLALLLLFFVVRPVMKRLTAVRAP 471
+ F +P W+ S + + G LL L++ + + R ++ R
Sbjct: 430 NVVNSPF-SAVDNTGGELPFWQQQSFIDQLLAAGRWLLVLVVAWILWRKAVRPQLTRRVE 488

Query: 472 VITPEPEAVSEPWIAMPEEERKNVDLPSLPGDDSLPSQSSGLEVKLEFLQKLAMSDTDRV 531
E E + L + +Q G EV + +++++ +D V
Sbjct: 489 EAKAAQEQAQVRQETEEAVEVRLSKDEQLQQRRA--NQRLGAEVMSQRIREMSDNDPRVV 546

Query: 532 AEVLRQWITSN 542
A V+RQW++++
Sbjct: 547 ALVIRQWMSND 557


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0251  FLGMOTORFLIG  186    2e-58    Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 186 bits (473), Expect = 2e-58
Identities = 86/317 (27%), Positives = 170/317 (53%), Gaps = 2/317 (0%)

Query: 14 EQAAILLLCLGEEAAATVMQKLSREEVVRLSENMARLSGVKTSMARKVINNFFDEFREQS 73
++AAILL+ +G E ++ V + LS+EE+ L+ +A+L + + + V+ F + Q
Sbjct: 19 QKAAILLVSIGSEISSKVFKYLSQEEIESLTFEIAKLETITSELKDNVLLEFKELMMAQE 78

Query: 74 GINGASRSMLQGILNKALGTEIASSVINGIYGDEIRSRMARLQWVEPRQLAMLISEEHLQ 133
I + +L K+LGT+ A +IN + ++ +P + I +EH Q
Sbjct: 79 FIQKGGIDYARELLEKSLGTQKAVDIINNLGSALQSRPFEFVRRADPANILNFIQQEHPQ 138

Query: 134 LQAVFLAFLTPEISAAVLSYLNESVQNEILYRVAKLNDVNRDVVDELDRLIERGL-SVLS 192
A+ L++L P+ ++ +LS L VQ + R+A ++ + +VV E++R++E+ L S+ S
Sbjct: 139 TIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEVVREVERVLEKKLASLSS 198

Query: 193 EHGSKVKGIKQAADIVNRFQGNQQ-VILDQMRERDEDVLEQLQDEMYDFFILSRQNEEVR 251
E + G+ +I+N + I++ + E D ++ E+++ +M+ F + ++
Sbjct: 199 EDYTSAGGVDNVVEIINMADRKTEKFIIESLEEEDPELAEEIKKKMFVFEDIVLLDDRSI 258

Query: 252 RRLLDEVPMEDWAVALKGTEALLRRSIYAVMPKRQAQQLEAITARLGPVPVSRIEQIRRE 311
+R+L E+ ++ A ALK + ++ I+ M KR A L+ LGP +E+ +++
Sbjct: 259 QRVLREIDGQELAKALKSVDIPVQEKIFKNMSKRAASMLKEDMEFLGPTRRKDVEESQQK 318

Query: 312 IMGIARELEEAGEIQLQ 328
I+ + R+LEE GEI +
Sbjct: 319 IVSLIRKLEEQGEIVIS 335


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
EcSMS35_0252  FLGFLIH  56     1e-11    Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 56.0 bits (134), Expect = 1e-11
Identities = 44/187 (23%), Positives = 84/187 (44%), Gaps = 6/187 (3%)

Query: 40 LMDGFQEGLQKGFAQGMTEGQEQGFSEGHQQGFAEGRRQGYTEGSLAGQQEGRKQFVDAA 99
+++ + L++ AQ + EQG+ G +G +G +QGY EG G ++G +
Sbjct: 32 IIEEAEPSLEQQLAQLQMQAHEQGYQAGIAEGRQQGHKQGYQEGLAQGLEQGLAEAKSQQ 91

Query: 100 QPLEA----ISGKVNDFLAHIERKQREDLLQLVEKVTRQVIRCELALQPTQLLALVEEAL 155
P+ A + + L ++ L+Q+ + RQVI + + L+ +++ L
Sbjct: 92 APIHARMQQLVSEFQTTLDALDSVIASRLMQMALEAARQVIGQTPTVDNSALIKQIQQLL 151

Query: 156 AAFPAMPETLQVMLSTEEFNRLRDAVPEKVS--EWGLTPSPDLPPGECRVITDKSELDIG 213
P Q+ + ++ R+ D + +S W L P L PG C+V D+ +LD
Sbjct: 152 QQEPLFSGKPQLRVHPDDLQRVDDMLGATLSLHGWRLRGDPTLHPGGCKVSADEGDLDAS 211

Query: 214 CEHRLEQ 220
R ++
Sbjct: 212 VATRWQE 218
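
The Identities, Positives and Gaps percentages in the statistics lines can be re-derived from the raw counts. Below is a minimal Python sketch, assuming the percentages are truncated rather than rounded; that is an inference from the figures printed in this report, not a documented rule, and the function names are illustrative only.

```python
import re

# Matches e.g. "Identities = 44/187 (23%)" inside a BLAST statistics line.
STATS_RE = re.compile(r"(Identities|Positives|Gaps) = (\d+)/(\d+) \((\d+)%\)")


def parse_stats(line):
    """Return {name: (count, length, reported_percent)} for one statistics line."""
    return {
        name: (int(count), int(length), int(pct))
        for name, count, length, pct in STATS_RE.findall(line)
    }


def truncated_percent(count, length):
    """Integer percentage with the fractional part dropped (assumed convention)."""
    return count * 100 // length


line = "Identities = 44/187 (23%), Positives = 84/187 (44%), Gaps = 6/187 (3%)"
stats = parse_stats(line)
for name, (count, length, reported) in stats.items():
    # Print the recomputed value next to the value reported in the line.
    print(name, truncated_percent(count, length), reported)
```

For the FliH hit above, the recomputed and reported values agree for all three fields.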


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_0255  LPSBIOSNTHSS  38  2e-06  Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein signature.

Length = 166

Score = 38.3 bits (89), Expect = 2e-06
Identities = 30/135 (22%), Positives = 55/135 (40%), Gaps = 20/135 (14%)

Query: 8 GTFDVFHVGHLRLLQRARTLGERLLVGVSSDALNIAKKGRAPVYPQDDRMAIIAG-LACV 66
G+FD GHL +++R L +++ V V N K+ P++ +R+ IA +A +
Sbjct: 7 GSFDPITFGHLDIIERGCRLFDQVYVAV---LRNPNKQ---PMFSVQERLEQIAKAIAHL 60

Query: 67 DGVFLEESLEQKAEYLRGYSADILVMG-----D-----DWAGKFDSFAYICEVVYFPRTP 116
++ Y R A ++ G D A + A E V+ +
Sbjct: 61 PNAQVDSFEGLTVNYARQRQAGAILRGLRVLSDFELELQMANTNKTLASDLETVFLTTST 120

Query: 117 ---SVSTTGIIEVIR 128
+S++ + EV R
Sbjct: 121 EYSFLSSSLVKEVAR 135


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_0264  FLGHOOKAP1  28  0.017  Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 27.6 bits (61), Expect = 0.017
Identities = 7/37 (18%), Positives = 18/37 (48%)

Query: 104 VNVVEQMADMMSASRDFETNVDVLNNVKSMQQSLLKL 140
VN+ E+ ++ + + N VL ++ +L+ +
Sbjct: 509 VNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_0266  FLGHOOKAP1  39  3e-05  Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 38.8 bits (90), Expect = 3e-05
Identities = 20/59 (33%), Positives = 26/59 (44%), Gaps = 5/59 (8%)

Query: 2 SYEIAATGLNAVNEQLDGISNNIANAGTVGYKS----MTTQFSAMYAGSQ-AMGVSVAG 55
A +GLNA L+ SNNI++ GY M S + AG GV V+G
Sbjct: 3 LINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVSG 61



Score = 37.2 bits (86), Expect = 1e-04
Identities = 16/47 (34%), Positives = 24/47 (51%)

Query: 352 TLSSGVLESSNVDITSELVNLMTAQRNYQANTKVIATSTQLDDALFQ 398
LS+ S V++ E NL Q+ Y AN +V+ T+ + DAL
Sbjct: 498 QLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALIN 544


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_0268  FLGHOOKAP1  40  4e-06  Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 40.3 bits (94), Expect = 4e-06
Identities = 19/93 (20%), Positives = 38/93 (40%), Gaps = 19/93 (20%)

Query: 5 LWISKTGLSAQDAEMSAIANNIANVNTTGFKRDRVMFQDLFYQTQEAPGAMLDQNNIMPT 64
+ + +GL+A A ++ +NNI++ N G+ R + M N+ +
Sbjct: 4 INNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTI--------------MAQANSTLGA 49

Query: 65 GLQFGSGVRIVGTQKT-----FTEGNVETTDNA 92
G G+GV + G Q+ + T ++
Sbjct: 50 GGWVGNGVYVSGVQREYDAFITNQLRAAQTQSS 82



Score = 38.8 bits (90), Expect = 2e-05
Identities = 9/45 (20%), Positives = 20/45 (44%)

Query: 213 TLQDNALEGSNVDIVNEMVAMITVQRAYEMNAKMVSAADDMLQYI 257
L + S V++ E + Q+ Y NA+++ A+ + +
Sbjct: 498 QLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDAL 542


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_0269  FLGLRINGFLGH  136  4e-42  Flagellar L-ring protein signature.
		>FLGLRINGFLGH#Flagellar L-ring protein signature.

Length = 232

Score = 136 bits (344), Expect = 4e-42
Identities = 63/189 (33%), Positives = 93/189 (49%), Gaps = 12/189 (6%)

Query: 37 EAPPPADGRAGGVFET------GYNWSLTADRRAYRVGDILTVILEESTQSSKQAKTNFG 90
P P G +F++ GY L DRR +GD LT++L+E+ +SK + N
Sbjct: 39 PVPGPTPVANGSIFQSAQPINYGYQ-PLFEDRRPRNIGDTLTIVLQENVSASKSSSANAS 97

Query: 91 KSNTVDIG---APTIFGHTKDKLSGSIDA--NRDFDGSATSQQQNSLRGEITVSVHAVQP 145
+ + G P ++A F+G + N+ G +TV+V V
Sbjct: 98 RDGKTNFGFDTVPRYLQGLFGNARADVEASGGNTFNGKGGANASNTFSGTLTVTVDQVLV 157

Query: 146 NGILEIRGEKWLTLNQGDEYIRLSGLVRADDIQNDNSVSSQRIADARISYAGRGALSDAN 205
NG L + GEK + +NQG E+IR SG+V I N+V S ++ADARI Y G G +++A
Sbjct: 158 NGNLHVVGEKQIAINQGTEFIRFSGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQ 217

Query: 206 AAGWLTRLF 214
GWL R F
Sbjct: 218 NMGWLQRFF 226


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_0270  FLGPRINGFLGI  329  e-113  Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 329 bits (844), Expect = e-113
Identities = 147/369 (39%), Positives = 219/369 (59%), Gaps = 15/369 (4%)

Query: 10 IKAAVITVSL----ALPGVALAQSLESLVNVQGVRENQLVGYSLVVGLDGTGDK-NQVKF 64
I AA++ +L P A ++ + ++Q R+NQL+GY LVVGL GTGD F
Sbjct: 7 IAAALVFSALPFLSTPPAQADTSRIKDIASLQAGRDNQLIGYGLVVGLQGTGDSLRSSPF 66

Query: 65 TNQTITNMLRQFGVQLPNKIDPKVKNVAAVAVSATLPPMYSRGQTIDVTVSSIGDAKSIR 124
T Q++ ML+ G+ KN+AAV V+A LPP S G +DVTVSS+GDA S+R
Sbjct: 67 TEQSMRAMLQNLGITTQGG-QSNAKNIAAVMVTANLPPFASPGSRVDVTVSSLGDATSLR 125

Query: 125 GGTLLLTQLHGADGEVYALAQGSVVVGGMNATGASGSSVTVNTPTAGLIPNGATVEREIP 184
GG L++T L GADG++YA+AQG+++V G +A G +++T T+ +PNGA +ERE+P
Sbjct: 126 GGNLIMTSLSGADGQIYAVAQGALIVNGFSAQG-DAATLTQGVTTSARVPNGAIIERELP 184

Query: 185 SDFQMGDTITLNLKRPSFKDANNIAAAINASF-----GGIATAQSSTNVSVRAPTSPGAR 239
S F+ + L L+ P F A +A +NA IA + S ++V+ P
Sbjct: 185 SKFKDSVNLVLQLRNPDFSTAVRVADVVNAFARARYGDPIAEPRDSQEIAVQKP-RVADL 243

Query: 240 VAFMSQLDDVQVQAEKIRARVVFNSRTGTVVMGDGVALHAAAVSHGSLTVSINETSNVSQ 299
M++++++ V+ + A+VV N RTGT+V+G V + AVS+G+LTV + E+ V Q
Sbjct: 244 TRLMAEIENLTVETDTP-AKVVINERTGTIVIGADVRISRVAVSYGTLTVQVTESPQVIQ 302

Query: 300 PNAFAGGRTAVTPQSNIAVNHARPGVVSLPESSSLKTLVNALNSLGATPDDIMSILQALH 359
P F+ G+TAV PQ++I V++ E L+TLV LNS+G D I++ILQ +
Sbjct: 303 PAPFSRGQTAVQPQTDIMA-MQEGSKVAIVEGPDLRTLVAGLNSIGLKADGIIAILQGIK 361

Query: 360 EAGALDADL 368
AGAL A+L
Sbjct: 362 SAGALQAEL 370


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_0271  FLGFLGJ  57  3e-13  Flagellar protein FlgJ signature.
		>FLGFLGJ#Flagellar protein FlgJ signature.

Length = 313

Score = 57.0 bits (137), Expect = 3e-13
Identities = 26/75 (34%), Positives = 42/75 (56%), Gaps = 4/75 (5%)

Query: 22 ANDIKQAAEQFEAIFLRNMLKEMRKTNELFDSKDNPFNSDSVRMMQGFYDDELCNTLAQQ 81
A +I+ A Q E +F++ MLK MR KD F+S+ R+ YD ++ +
Sbjct: 30 AANIRPVARQVEGMFVQMMLKSMRDAL----PKDGLFSSEHTRLYTSMYDQQIAQQMTAG 85

Query: 82 HGIGIAAMIVKQLSP 96
G+G+A M+VKQ++P
Sbjct: 86 KGLGLAEMMVKQMTP 100


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_0272  FLGHOOKAP1  176  1e-51  Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 176 bits (448), Expect = 1e-51
Identities = 97/326 (29%), Positives = 165/326 (50%), Gaps = 7/326 (2%)

Query: 2 DMINIGYSGASTAQVELNVTAQNTANAMTTGYTRQVAEISTIGASGGSPNSAGNGVQVDS 61
+IN SG + AQ LN + N ++ GYTRQ ++ ++ G+ GNGV V
Sbjct: 2 SLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVSG 61

Query: 62 IRRVSNQYQVNQVWYAASDYGYYSTQQGYLTQLEAVLSDDNSSLSGGFDNFFAALNEATT 121
++R + + NQ+ A + + + +++++ +LS SSL+ +FF +L +
Sbjct: 62 VQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTLVS 121

Query: 122 SPDDSALREQVISEAGALSLRIDNTLDYIDSQSTEIISQQQAMVSQINTLTSGIASYNQQ 181
+ +D A R+ +I ++ L + T Y+ Q ++ A V QIN IAS N Q
Sbjct: 122 NAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLNDQ 181

Query: 182 IAQAEAN--GDNASALYDARDQMVEELSGMMDVQVNIDDQGNYNVTLKNGQPLVSGQQSS 239
I++ G + + L D RDQ+V EL+ ++ V+V++ D G YN+T+ NG LV G +
Sbjct: 182 ISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGSTAR 241

Query: 240 TIA-LETNADGTPT----MSLTFAGTTSTMTTDTGGSLGALFDYQNDVLTPLTDTINSMA 294
+A + ++AD + T + T GSLG + +++ L +T+ +A
Sbjct: 242 QLAAVPSSADPSRTTVAYVDGTAGNIEIPEKLLNTGSLGGILTFRSQDLDQTRNTLGQLA 301

Query: 295 LQFADAVNNQLAQGYDLNGNPGEPLF 320
L FA+A N Q G+D NG+ GE F
Sbjct: 302 LAFAEAFNTQHKAGFDANGDAGEDFF 327



Score = 74.2 bits (182), Expect = 3e-16
Identities = 54/286 (18%), Positives = 110/286 (38%), Gaps = 14/286 (4%)

Query: 179 NQQIAQAEANGDNASALYDARDQMVEELSGMMDV-------QVNIDDQGNYNVTLKNGQP 231
N +I + N + + R Q +++ + N + ++ G+
Sbjct: 266 NIEIPEKLLNTGSLGGILTFRSQDLDQTRNTLGQLALAFAEAFNTQHKAGFDANGDAGED 325

Query: 232 LVSGQQSSTIALETNADGTPTMSLTFAGTTSTMTTDTGGSLGALFDYQNDVLTPLTDTI- 290
+ + A+ N +++ T ++ T + + Q V ++T
Sbjct: 326 FFAIGKP---AVLQNTKNKGDVAIGATVTDASAVLATDYKISFDNN-QWQVTRLASNTTF 381

Query: 291 NSMALQFADAVNNQLAQGYDLNGNPGEPLFIYDASNADGPLTVNPDITADELAFSSSPDE 350
+ L + + + S+A + V A S
Sbjct: 382 TVTPDANGKVAFDGLELTFTGTPAVNDSFTLKPVSDAIVNMDVLITDEAKIAMASEEDAG 441

Query: 351 SGNSDNLQALINISTEPLEIANLGSVTVGQACSSIISNIGIYSQQNQTEVDAASNVYSEA 410
++ N QAL+++ + G+ + A +S++S+IG + +T NV ++
Sbjct: 442 DSDNRNGQALLDLQSN--SKTVGGAKSFNDAYASLVSDIGNKTATLKTSSATQGNVVTQL 499

Query: 411 QNQQSSVSGVSMDEEAVNLITYQQIYEANLKVISAGAEIFDSVLEM 456
NQQ S+SGV++DEE NL +QQ Y AN +V+ IFD+++ +
Sbjct: 500 SNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_0273  FLAGELLIN  42  3e-06  Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 41.6 bits (97), Expect = 3e-06
Identities = 47/313 (15%), Positives = 99/313 (31%), Gaps = 12/313 (3%)

Query: 1 MRVTTQQTYVSMTQSFNNLSGSLAHVVEQMATGKEILQPSDDPIAATRITQLNRQQSAIE 60
+ T + + N SL+ +E++++G I DD + +
Sbjct: 2 QVINTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGLT 61

Query: 61 QYQSNIDSASAGLSQQESILDGVNNSLLAVRDDLLEAANGTNTADSLASLGQDIQSLTES 120
Q N + + E L+ +NN+L VR+ ++A NGTN+ L S+ +IQ E
Sbjct: 62 QASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEE 121

Query: 121 MVAALNYQDEDGHYVFGGTINDQPPIVAVDDDGDGVT-----------DSYSYQGNSDHR 169
+ N +G V + + A D + + D ++ G +
Sbjct: 122 IDRVSNQTQFNGVKVLSQDNQMKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPKEAT 181

Query: 170 QTTVSNGVEVDTNVAASDFFGSNLDV-LNTLNSLSQELQNPDVDPADPQVQSDLQNAVDV 228
+ + + T + V +N+ ++ D + D
Sbjct: 182 VGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTTDDA 241

Query: 229 VDTASDDLNASIASLGETQNTMSMLSDAQTDISTSNDELIGSLQDLDYGPASITFTGLEV 288
+ + DL + S T ++ + + G +D + +
Sbjct: 242 ENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTIDTKTGNDGNGKVST 301

Query: 289 AMEATLKTYSKVS 301
+ T +
Sbjct: 302 TINGEKVTLTVAD 314



Score = 37.7 bits (87), Expect = 4e-05
Identities = 27/262 (10%), Positives = 71/262 (27%), Gaps = 3/262 (1%)

Query: 31 ATGKEILQPSDDPIAATRITQLNRQQSAIEQYQSNIDSASAGLSQQESILDGVNNSLLAV 90
+DD T + +S ++ + + ++ D +
Sbjct: 229 VNAANGQLTTDDAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTID 288

Query: 91 RDDLLEAANGTNTADSLASLGQDIQSLTESMVAALNYQDEDGHYVFGGTINDQPPIVAVD 150
+ +T + + + +T + V+ +N Q
Sbjct: 289 TKTGNDGNGKVSTTINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKT 348

Query: 151 DDGDGVTDSYSYQGNSDHRQTTVSNGVEVDTNVAASDFFGSNLDVLNTLNSLSQELQNPD 210
+ ++ + V A + L + +
Sbjct: 349 KNESAKLSDL---EANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFIDKTASGVSTL 405

Query: 211 VDPADPQVQSDLQNAVDVVDTASDDLNASIASLGETQNTMSMLSDAQTDISTSNDELIGS 270
++ + N + +D+A ++A +SLG QN + T+ +
Sbjct: 406 INEDAAAAKKSTANPLASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARSR 465

Query: 271 LQDLDYGPASITFTGLEVAMEA 292
++D DY + ++ +A
Sbjct: 466 IEDADYATEVSNMSKAQILQQA 487


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_0276  FLAGELLIN  89  5e-22  Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 89.3 bits (221), Expect = 5e-22
Identities = 69/253 (27%), Positives = 122/253 (48%)

Query: 4 INTNNASMAAVNAISKSSSSLSTSMERLATGNRINSSADDAAGKQIANRLTAQSSGMGVA 63
INTN+ S+ N ++KS SSLS+++ERL++G RINS+ DDAAG+ IANR T+ G+ A
Sbjct: 4 INTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGLTQA 63

Query: 64 LSNINDATAMLQTADSMFDEMSDVLGRMKDLSTQAANGTYSDDDLQAMQDEYDELGQQMS 123
N ND ++ QT + +E+++ L R+++LS QA NGT SD DL+++QDE + +++
Sbjct: 64 SRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEEID 123

Query: 124 DMLQNTTYGGTNLFGVSGTSNTGTDGLFQSAVTFQVGAESSDTMTVNISSQLNTLVTDLS 183
+ T + G + +T + ++ ++ + +
Sbjct: 124 RVSNQTQFNGVKVLSQDNQMKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPKEATVG 183

Query: 184 AISNSFSADQADTTGTAGVSGGTELTASGSANQMITSISTAMDDVSQIQSKLGASINRLN 243
+ +SF T G + SG+ T+ + + + + N
Sbjct: 184 DLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTTDDAEN 243

Query: 244 DTANNLTSMQDNT 256
+TA +L +T
Sbjct: 244 NTAVDLFKTTKST 256



Score = 65.1 bits (158), Expect = 5e-14
Identities = 55/273 (20%), Positives = 95/273 (34%), Gaps = 2/273 (0%)

Query: 32 ATGNRINSSADDAAGKQIANRLTAQSSGMGVALSNINDATAMLQTADSMFDEMSDVLGRM 91
T + N++A D + TA++ + A+ + + +
Sbjct: 237 TTDDAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTIDTKTGNDGN 296

Query: 92 KDLSTQAANGTYSDDDLQAMQDEYDELGQQMSDMLQNTTYGGTNLFGVSGTSNTGTDGLF 151
+ST + + + Y + T +
Sbjct: 297 GKVSTTINGEKVTLTVADITAGAANVDAATLQS--SKNVYTSVVNGQFTFDDKTKNESAK 354

Query: 152 QSAVTFQVGAESSDTMTVNISSQLNTLVTDLSAISNSFSADQADTTGTAGVSGGTELTAS 211
S + + +TVN + D ++ +G + + A
Sbjct: 355 LSDLEANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFIDKTASGVSTLINEDAAAAK 414

Query: 212 GSANQMITSISTAMDDVSQIQSKLGASINRLNDTANNLTSMQDNTEVAIGNIMDTDYATE 271
S + SI +A+ V ++S LGA NR + NL + N A I D DYATE
Sbjct: 415 KSTANPLASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARSRIEDADYATE 474

Query: 272 ASNMTKQQVLMQTGITMLKQSNSMSSMVSSLLQ 304
SNM+K Q+L Q G ++L Q+N + V SLL+
Sbjct: 475 VSNMSKAQILQQAGTSVLAQANQVPQNVLSLLR 507


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_0280  FLGHOOKFLIK  35  3e-04  Flagellar hook-length control protein signature.
		>FLGHOOKFLIK#Flagellar hook-length control protein signature.

Length = 375

Score = 35.2 bits (80), Expect = 3e-04
Identities = 26/99 (26%), Positives = 45/99 (45%)

Query: 236 LKDQIHFQLNKQQQISTIRLDPPSLGKLEIAVQLDNGKLMVHIGANQSEVCRALQQFSDD 295
L I + QQ + +RL P LG+++I++++D+ + + + + V AL+
Sbjct: 244 LSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQMVSPHQHVRAALEAALPV 303

Query: 296 LRQHLTAQNFMEVNVQVSSEGQSQQQQQSGHQQEEVSAA 334
LR L +S E S QQQ + QQ+ A
Sbjct: 304 LRTQLAESGIQLGQSNISGESFSGQQQAASQQQQSQRTA 342


6  EcSMS35_0294  EcSMS35_0341  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias     %GCBias  Product
EcSMS35_0294      -2       16   -3.490943  DNA-binding transcriptional regulator Crl
EcSMS35_0295       0       22   -6.934837  outer membrane phosphoporin protein E
EcSMS35_0296      -1       26   -7.686204  gamma-glutamyl kinase
EcSMS35_0297       4       41  -14.538292  gamma-glutamyl phosphate reductase
EcSMS35_0299       7       49  -16.917339  *phage integrase family site specific
EcSMS35_0300       6       48  -17.388479  hypothetical protein
EcSMS35_0301       6       43  -15.845316  hypothetical protein
EcSMS35_0302       5       37  -12.756606  response regulator
EcSMS35_0303       4       33  -11.750271  hypothetical protein
EcSMS35_0304       4       24    2.057837  IS911 transposase orfA
EcSMS35_0305       4       23    1.472821  IS911 transposase orfB
EcSMS35_0307       5       22    1.443061  antirestriction protein
EcSMS35_0308       4       19   -1.000332  RadC family DNA repair protein
EcSMS35_0309       5       26   -2.607665  hypothetical protein
EcSMS35_0310       4       23   -1.893603  putative antitoxin module of toxin-antitoxin
EcSMS35_0311       1       20   -1.181514  hypothetical protein
EcSMS35_0312       1       20    1.136071  hypothetical protein
EcSMS35_0313       1       20    1.366162  hypothetical protein
EcSMS35_0314       3       22    2.134926  hypothetical protein
EcSMS35_0315       3       21    1.048783  hypothetical protein
EcSMS35_0316       2       20    0.569849  hypothetical protein
EcSMS35_0317       2       22    0.363590  hypothetical protein
EcSMS35_0318       4       24   -3.500510  hypothetical protein
EcSMS35_0319       6       25   -6.246369  fimbrillin MatB
EcSMS35_0320       4       31   -8.631119  fimbrillin MatA
EcSMS35_0321       2       34   -7.067509  hypothetical protein
EcSMS35_0322       2       34   -7.091220  hypothetical protein
EcSMS35_0323       2       32   -5.800616  50S ribosomal protein L36
EcSMS35_0324       1       28   -4.689274  50S ribosomal protein L31 type B
EcSMS35_0325       1       25   -4.116165  oxidoreductase, FAD/FMN-binding
EcSMS35_0326       0       20   -1.409059  hypothetical protein
EcSMS35_0327       0       20   -1.653504  LysR family transcriptional regulator
EcSMS35_0328       0       19   -1.660320  LysR family transcriptional regulator
EcSMS35_0329       0       18   -2.117865  aldo/keto reductase family oxidoreductase
EcSMS35_0330       0       21   -2.760278  organophosphate reductase
EcSMS35_0331       0       24   -3.497108  putative invasin
EcSMS35_0332       2       30   -6.657038  AraC family transcriptional regulator
EcSMS35_0333       2       28   -5.377875  aldo/keto reductase family oxidoreductase
EcSMS35_0334       2       25   -3.639425  hypothetical protein
EcSMS35_0335       1       23   -2.704530  hypothetical protein
EcSMS35_0336       0       22   -2.397701  pyridine nucleotide-disulfide oxidoreductase
EcSMS35_0337      -1       12    1.232826  AraC family transcriptional regulator
EcSMS35_0338       0       14    2.702583  hypothetical protein
EcSMS35_0339       1       15    3.353635  iron-sulfur cluster binding protein
EcSMS35_0340       0       22    4.396734  hypothetical protein
EcSMS35_0341       0       19    3.199901  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_0295  ECOLIPORIN  550  0.0  E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 550 bits (1418), Expect = 0.0
Identities = 232/384 (60%), Positives = 268/384 (69%), Gaps = 34/384 (8%)

Query: 1 MKKSTLALVVMGIVASASVQAAEIYNKDGNKLDIYGKVKAMHYMSDNDSKDGDQSYIRFG 60
MK+ LALV+ ++A+ + AAEIYNKDGNKLD+YGKV +HY SD+ SKDGDQ+Y+R G
Sbjct: 1 MKRKVLALVIPALLAAGAAHAAEIYNKDGNKLDLYGKVDGLHYFSDDSSKDGDQTYMRVG 60

Query: 61 FKGETQINDQLTGYGRWEAEFAGNKAESDTAQQKTRLAFAGLKYKDLGSFDYGRNLGALY 120
FKGETQINDQLTGYG+WE N E + A TRLAFAGLK+ D GSFDYGRN G LY
Sbjct: 61 FKGETQINDQLTGYGQWEYNVQANTTEGEGANSWTRLAFAGLKFGDYGSFDYGRNYGVLY 120

Query: 121 DVEAWTDMFPEFGGDSSAQTDNFMTKRASGLATYRNTDFFGVIDGLNLTLQYQGKNEN-- 178
DVE WTDM PEFGGDS DN+MT RA+G+ATYRNTDFFG++DGLN LQYQGKNE+
Sbjct: 121 DVEGWTDMLPEFGGDSYTYADNYMTGRANGVATYRNTDFFGLVDGLNFALQYQGKNESQS 180

Query: 179 --------------RDVKKQNGDGFGTSLTYDFGGSDFAISGAYTNSDRTNEQNLQSR-- 222
D++ NGDGFG S TYD G F+ AYT SDRTNEQ
Sbjct: 181 ADDVNIGTNNRNNGDDIRYDNGDGFGISTTYDI-GMGFSAGAAYTTSDRTNEQVNAGGTI 239

Query: 223 GTGKRAEAWATGLKYDANNIYLATFYSETRKMTP-------ITGGFANKTQNFEAVAQYQ 275
G +A+AW GLKYDANNIYLAT YSETR MTP GG ANKTQNFE AQYQ
Sbjct: 240 AGGDKADAWTAGLKYDANNIYLATMYSETRNMTPYGKTDKGYDGGVANKTQNFEVTAQYQ 299

Query: 276 FDFGLRPSLGYVLSKGKDIE----GIGDEDLVNYIDVGATYYFNKNMSAFVDYKINQLDS 331
FDFGLRP++ +++SKGKD+ D+DLV Y DVGATYYFNKN S +VDYKIN LD
Sbjct: 300 FDFGLRPAVSFLMSKGKDLTYNNVNGDDKDLVKYADVGATYYFNKNFSTYVDYKINLLDD 359

Query: 332 DNKL----NINNDDIVAVGMTYQF 351
D+ I+ DDIVA+GM YQF
Sbjct: 360 DDPFYKDAGISTDDIVALGMVYQF 383


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_0296  CARBMTKINASE  37  6e-05  Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 37.5 bits (87), Expect = 6e-05
Identities = 28/127 (22%), Positives = 48/127 (37%), Gaps = 17/127 (13%)

Query: 119 DTLRALLDNNI---------VPVINENDAVATAEIKVGDNDNLSALAAILAGADKLLLLT 169
+T++ L++ + VPVI E+ + E V D D A AD ++LT
Sbjct: 177 ETIKKLVERGVIVIASGGGGVPVILEDGEIKGVE-AVIDKDLAGEKLAEEVNADIFMILT 235

Query: 170 DQKGLYTADPRSNPQAELIKDVYGIDDALRAIAGDSVSGLGTGGMSTKLQAA-DVACRAG 228
D G + + +++V +++ + G M K+ AA G
Sbjct: 236 DVNGAALY--YGTEKEQWLREV-KVEELRKYYEEG---HFKAGSMGPKVLAAIRFIEWGG 289

Query: 229 IDTIIAA 235
IIA
Sbjct: 290 ERAIIAH 296



Score = 30.2 bits (68), Expect = 0.013
Identities = 16/76 (21%), Positives = 33/76 (43%), Gaps = 13/76 (17%)

Query: 4 SQTLVVKLGTSVLTGGSRRLNRAHIVELVRQCAQ----LHAAGHRIVIVTSG-------- 51
+ +V+ LG + L ++ + +++ VR+ A+ + A G+ +VI
Sbjct: 2 GKRVVIALGGNALQQRGQKGSYEEMMDNVRKTARQIAEIIARGYEVVITHGNGPQVGSLL 61

Query: 52 -AIAAGREHLGYPELP 66
+ AG+ G P P
Sbjct: 62 LHMDAGQATYGIPAQP 77


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_0301  HTHFIS  41  4e-06  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 41.4 bits (97), Expect = 4e-06
Identities = 14/80 (17%), Positives = 28/80 (35%), Gaps = 12/80 (15%)

Query: 2 IKILIVDDNKSRIEKLKSSLTELITKNMIRIDEKYTSDAAKIALKLNQYDYLILDVFLPK 61
IL+ DD+ + L +L ++ + + + D ++ DV +P
Sbjct: 4 ATILVADDDAAIRTVLNQAL----SRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMP- 58

Query: 62 KDNYSPDERNGLGLLKQINS 81
+ N LL +I
Sbjct: 59 -------DENAFDLLPRIKK 71


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_0302  HTHFIS  38  8e-06  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 37.5 bits (87), Expect = 8e-06
Identities = 15/98 (15%), Positives = 36/98 (36%), Gaps = 16/98 (16%)

Query: 2 KILLVEDIEYKRDKVIGLLESISADVSVDVAKSYVSAVNSATSKTYDLIILDMSLPTYDK 61
IL+ +D R + L A V + + + + DL++ D+ +P +
Sbjct: 5 TILVADDDAAIRTVLNQALSR--AGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDEN- 61

Query: 62 GPNENGGRFRVYGGKDIIRKLMRGKEHPVVVVLTQHTT 99
D++ ++ + + V+V++ T
Sbjct: 62 -------------AFDLLPRIKKARPDLPVLVMSAQNT 86


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_0304  RTXTOXIND  26  0.043  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 25.6 bits (56), Expect = 0.043
Identities = 6/48 (12%), Positives = 12/48 (25%)

Query: 42 LESWVRQLRRERQGIAPSATPITPEQQRIRELEKQVRRLEEQNTILKK 89
+W Q ++ + RI E R + +
Sbjct: 195 FSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSS 242


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_0317  PF00577  63  4e-12  Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 62.6 bits (152), Expect = 4e-12
Identities = 30/247 (12%), Positives = 72/247 (29%), Gaps = 23/247 (9%)

Query: 487 TLNLNSLWSKLGTFSISYNDDRRYNSHYYTADYYQNVYSGTFGSLGLRAGIQRYNNGDSS 546
L + + T +S + Y + +Q + F + N
Sbjct: 530 QLTVTQQLGRTSTLYLSG-SHQTYWGTSNVDEQFQAGLNTAFEDINWTLSYSLTKNAWQK 588

Query: 547 ANTGKYIALDLSLPLGNWFSAGMTHQNGYTMANLSARKQFDEGT------------IRTV 594
+ +AL++++P +W + Q + A+ S + +
Sbjct: 589 -GRDQMLALNVNIPFSHWLRSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNL 647

Query: 595 GANLSRAISGDTGDDKTLSGGAYAQFDARYASGTLNVNSAADGYINTNLTANGSVGWQGK 654
++ +G + +G A + Y + + S +D +G V
Sbjct: 648 SYSVQTGYAGGGDGNSGSTGYATLNYRGGYGNANIGY-SHSDDIKQLYYGVSGGVLAHAN 706

Query: 655 NIAASGRTDGNAGVIFNTGLED---DGQISAKINGRIFPLNGKRNYLPLSPYGRYEVELQ 711
+ + ++ G +D + Q + + R G + Y V L
Sbjct: 707 GVTLGQPLNDTVVLVKAPGAKDAKVENQTGVRTDWR-----GYAVLPYATEYRENRVALD 761

Query: 712 NSKNSLD 718
+ + +
Sbjct: 762 TNTLADN 768


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_0331  INTIMIN  546  e-177  Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 546 bits (1409), Expect = e-177
Identities = 221/818 (27%), Positives = 358/818 (43%), Gaps = 49/818 (5%)

Query: 41 PVMAARAQHAVQPRLSMGNTTVTADSNVEKNVASFAANAGTFLSSQPDS-----DATRNF 95
P++AA +L+ + VT + + ++AA L SQ S D ++
Sbjct: 131 PLVAAGGVAGHTNKLTKMSPDVTKSNMTDDKALNYAAQQAASLGSQLQSRSLNGDYAKDT 190

Query: 96 ITGMATAKANQEIQEWLGKYGTARVKLNVDKDFSLKDSSLEMLYPIYDTPTNMLFTQGAI 155
G+A +A+ ++Q WL YGTA V L +F SSL+ L P YD+ + F Q
Sbjct: 191 ALGIAGNQASSQLQAWLQHYGTAEVNLQSGNNFD--GSSLDFLLPFYDSEKMLAFGQVGA 248

Query: 156 HRTDDRTQSNIGFGWRHFSGNDWMAGVNTFIDHDLSRSHTRIGVGAEYWRDYLKLSANGY 215
D R +N+G G R F M G N FID D S +TR+G+G EYWRDY K S NGY
Sbjct: 249 RYIDSRFTANLGAGQRFFLPE-NMLGYNVFIDQDFSGDNTRLGIGGEYWRDYFKSSVNGY 307

Query: 216 IRASGWKKSPDIEDYQERPANGWDIRAEGYLPAWPQLGASLMYEQYYGDEVGLFGKDKRQ 275
R SGW +S + +DY ERPANG+DIR GYLP++P LGA LMYEQYYGD V LF DK Q
Sbjct: 308 FRMSGWHESYNKKDYDERPANGFDIRFNGYLPSYPALGAKLMYEQYYGDNVALFNSDKLQ 367

Query: 276 KDPHAISAEVTYTPVPLLTLSAGHKQGKSGENDTRFGLEVNYRIGEPLAKQLDTDSIRER 335
+P A + V YTP+PL+T+ ++ G END + ++ Y+ +P ++Q++ + E
Sbjct: 368 SNPGAATVGVNYTPIPLVTMGIDYRHGTGNENDLLYSMQFRYQFDKPWSQQIEPQYVNEL 427

Query: 336 RMLAGSRYDLVERNNNIVLEYRKSEVIRIALPDRIAGKGGQTVSLGLVVSKATHGLKNVQ 395
R L+GSRYDLV+RNNNI+LEY+K +++ + +P I G T + L+V K+ +GL +
Sbjct: 428 RTLSGSRYDLVQRNNNIILEYKKQDILSLNIPHDINGTERSTQKIQLIV-KSKYGLDRIV 486

Query: 396 WEAPSLLAAGGKITGQG----NQWQVTLPAYQAGKDNYYAISAVAYDNKGNASKRVQTEV 451
W+ +L + GG+I G +Q LPAY G N Y ++A AYD GN+S V +
Sbjct: 487 WDDSALRSQGGQIQHSGSQSAQDYQAILPAYVQGGSNVYKVTARAYDRNGNSSNNVLLTI 546

Query: 452 VITGAGMSAERTALTLDGQSRIQMLANGSEQKPLVLSLRDAEGQPVTGMKDQIKTELTFK 511
+ G ++ +T + A+G+E +++ Q ++F
Sbjct: 547 TVLSNGQVVDQVGVTDFTADKTSAKADGTEAITYTATVKKNG-------VAQANVPVSFN 599

Query: 512 PAGNIVTRTLKATKSQAQPTLGEFTETEAGVYQSVFTTGTQSGEATITVSVDDMSKTVTA 571
+ + + G + G+ ++ +M+ + A
Sbjct: 600 IVSGTAVLSANSANTNGS-----------GKATVTLKSDK-PGQVVVSAKTAEMTSALNA 647

Query: 572 ELRATMMDVANSTLSANEPSGDVVADGQQSHTLTLTAVDTDGNPVTGEASRLRLVPQDTN 631
+ S VA+GQ + T T+ V PV+ + T
Sbjct: 648 NAVIFVDQTKASITEIKADKTTAVANGQDAITYTVK-VMKGDKPVSNQEVTF-----TTT 701

Query: 632 GVTVGAIS--EIKPGVYSATVSSTRAGNVVVRAFSEQYQLGTLQQTLKFVAGP-LDAAHS 688
+ + G T++ST G +V A + ++F +D +
Sbjct: 702 LGKLSNSTEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFFTTLTIDDGNI 761

Query: 689 SITLNPDKPVVGGTVTAIWTAKDAYDNPVTSLTPE---APSLAGAAAVGSTASGWTNNGD 745
I V G + +W + + + + A+V +++ T
Sbjct: 762 EIVGTG----VKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPAIASVDASSGQVTLKEK 817

Query: 746 GTWTAQITLGSTAGELDVMPKLNGQDAAANAAKVTVVADALSSNQSKVSVAEDHVKAGES 805
GT T + + N N +K DA+++ ++ E+
Sbjct: 818 GTTTISVISSDNQTATYTIATPNSL-IVPNMSKRVTYNDAVNTCKNFGGKLPSSQNELEN 876

Query: 806 TTVTLIAKDAHGNAISGLSLSASLTGAASEGATVSSWT 843
A + + S ++ + + A + + + T
Sbjct: 877 VFKAWGAANKYEYYKSSQTIISWVQQTAQDAKSGVAST 914



Score = 76.3 bits (187), Expect = 6e-16
Identities = 69/373 (18%), Positives = 121/373 (32%), Gaps = 37/373 (9%)

Query: 976 NGQNAVAQPLVLNVAGDAS-KAEIRDMTVKVDNQLANGQSTNQVTLTVVDTYGNPLQGQE 1034
N N V + + G + + D T + A+G T TV G
Sbjct: 537 NSSNNVLLTITVLSNGQVVDQVGVTDFTADKTSAKADGTEAITYTATVKKN-GVAQANVP 595

Query: 1035 VTLTLPQGVTSKTGNTVTTNAAGKADIELISTVAGELEIAAAVKNSQKTV---TVKFNAD 1091
V+ + G + N+ TN +GKA + L S G++ ++A + V F
Sbjct: 596 VSFNIVSGTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAKTAEMTSALNANAVIFVDQ 655

Query: 1092 ASTGQANLQVDAAAQKVANGKDAFTLTANVEDKNGNPVPGSLVTFNLPRGVKPLTGDNVW 1151
++ D VANG+DA T T V K PV VTF G + +
Sbjct: 656 TKASITEIKADKTTA-VANGQDAITYTVKV-MKGDKPVSNQEVTFTTTLGKLSNSTE--- 710

Query: 1152 VKANDEGKAELQVVSVTAGTYEITASAGNDQPSDAQTITFVADKTTATVSGIEVIGNYAL 1211
K + G A++ + S T G ++A + T IE++G
Sbjct: 711 -KTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFFTTLTIDDGNIEIVGT--- 766

Query: 1212 ADGKAKQTYKVTVTDANNNLLK---DSEVTLTASPANLALDPDGTAKTNEQGQAIFTATT 1268
G + V + NL + + T ++ +A + + + + T +
Sbjct: 767 --GVKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPAIASVDASSGQVTLKEKGTTTISV 824

Query: 1269 TVAAKYTLTAKVEQANGQESTKTAESKFVADDKNAVLAASSDVTSLVADGVQTATMTVTL 1328
+ T T + N+++ + D V T
Sbjct: 825 ISSDNQTAT------------------YTIATPNSLIVPNMSKRVTYNDAVNTCKNFGGK 866

Query: 1329 FSANNPVGGNVWV 1341
++ NV+
Sbjct: 867 LPSSQNELENVFK 879



Score = 75.1 bits (184), Expect = 1e-15
Identities = 72/367 (19%), Positives = 120/367 (32%), Gaps = 45/367 (12%)

Query: 882 TVIAGEMSSANSTLVADNKAP-TVKATTELTFTAKDAYGNPVSGLKLDAPVFSGAASTGS 940
V + ++ ++ AD T AT + PVS + +G+
Sbjct: 557 QVGVTDFTADKTSAKADGTEAITYTAT--VKKNGVAQANVPVSFNIV----------SGT 604

Query: 941 ERPSAGSWTEQSNGVYVATLTLGSAAGQLSVMPRVNGQNAVAQPLVLNVAGDASKAEIRD 1000
SA S +G TL + +A+ V+ V D +KA I +
Sbjct: 605 AVLSANSANTNGSGKATVTLKSDKPGQVVVSAKTAEMTSALNANAVIFV--DQTKASITE 662

Query: 1001 MTVKVDNQLANGQSTNQVTLTVVDTY-GNPLQGQEVTLTLPQGVTSKTGNTVTTNAAGKA 1059
+ +ANGQ +T TV P+ QEVT T G S + T T+ G A
Sbjct: 663 IKADKTTAVANGQDA--ITYTVKVMKGDKPVSNQEVTFTTTLGKLSNS--TEKTDTNGYA 718

Query: 1060 DIELISTVAGELEIAAAVKNSQ---KTVTVKFNADASTGQANLQVDAAAQKVANGKDAFT 1116
+ L ST G+ ++A V + K V+F + N+++ V G
Sbjct: 719 KVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFFTTLTIDDGNIEI------VGTGVKGKL 772

Query: 1117 LTANVEDKNGN-PVPGSLVTFNLPRGVKPLTGDNVWVKANDEGKAELQVVSVTAGTYEIT 1175
T ++ N G + N + + D QV GT I+
Sbjct: 773 PTVWLQYGQVNLKASGGNGKYT-------WRSANPAIASVDASSG--QVTLKEKGTTTIS 823

Query: 1176 ASAGNDQPSDAQTITFVADKTTATVSGIEVIGNYALADGKAKQTYKVTVTDANNNLLKDS 1235
+ D QT T+ + + + D ++ N L++
Sbjct: 824 VISS-----DNQTATYTIATPNSLIV-PNMSKRVTYNDAVNTCKNFGGKLPSSQNELENV 877

Query: 1236 EVTLTAS 1242
A+
Sbjct: 878 FKAWGAA 884


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_0332  HTHTETR  28  0.029  TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 28.4 bits (63), Expect = 0.029
Identities = 12/42 (28%), Positives = 19/42 (45%)

Query: 3 RQKILQQLLEWIECNLEHPISIEDIAQKSGYSRRNIQLLFRN 44
RQ IL L S+ +IA+ +G +R I F++
Sbjct: 13 RQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKD 54


7  EcSMS35_0350  EcSMS35_0401  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias     %GCBias  Product
EcSMS35_0350       1       20    3.533156  hypothetical protein
EcSMS35_0351       1       22    4.375507  acyl-CoA synthetase
EcSMS35_0352      -1       15    2.255007  hypothetical protein
EcSMS35_0353      -1       14    0.654164  putative carbamate kinase
EcSMS35_0354      -1       15   -1.139100  putative deaminase
EcSMS35_0355       0       23   -4.057432  hypothetical protein
EcSMS35_0356      -1       15   -1.254148  zinc-binding dehydrogenase family
EcSMS35_0357       1       16   -0.748549  hypothetical protein
EcSMS35_0359       2       16    2.400627  *putative homoserine/threonine efflux protein
EcSMS35_0360       2       20    3.558110  hypothetical protein
EcSMS35_0362       1       24    4.911175  hypothetical protein
EcSMS35_0361       0       22    4.653365  propionate catabolism operon regulatory protein
EcSMS35_0363       1       22    4.377800  2-methylisocitrate lyase
EcSMS35_0364       0       20    4.201415  methylcitrate synthase
EcSMS35_0365      -1       19    4.416615  2-methylcitrate dehydratase
EcSMS35_0366      -1       19    3.912543  propionyl-CoA synthetase
EcSMS35_0367      -1       14    3.694984  cytosine permease
EcSMS35_0368      -1       16    2.469703  cytosine deaminase
EcSMS35_0369       0       15    0.537029  DNA-binding transcriptional regulator CynR
EcSMS35_0370      -2       11    1.952866  carbonic anhydrase
EcSMS35_0371      -2       11    1.926022  cyanate hydratase
EcSMS35_0372      -2       11    2.220974  putative cyanate transporter
EcSMS35_0373      -2       12    2.198536  galactoside O-acetyltransferase
EcSMS35_0374      -2       13    3.428292  galactoside permease
EcSMS35_0375      -2       15    4.743314  beta-D-galactosidase
EcSMS35_0376      -2       17    4.801957  lac repressor
EcSMS35_0377      -1       16    4.642573  DNA-binding transcriptional activator MhpR
EcSMS35_0378       0       16    4.749742  3-(3-hydroxyphenyl)propionate hydroxylase
EcSMS35_0379       0       15    5.004237  3-(2,3-dihydroxyphenyl)propionate dioxygenase
EcSMS35_0380       2       13    3.888082  2-hydroxy-6-ketonona-2,4-dienedioic acid
EcSMS35_0381       2       13    2.904575  2-keto-4-pentenoate hydratase
EcSMS35_0382       0       11    2.442139  acetaldehyde dehydrogenase
EcSMS35_0383      -1       11    2.222127  4-hydroxy-2-ketovalerate aldolase
EcSMS35_0384      -1       13    1.103394  putative 3-hydroxyphenylpropionic transporter
EcSMS35_0385       0       16   -1.349897  hypothetical protein
EcSMS35_0386       2       14   -1.078072  S-formylglutathione hydrolase
EcSMS35_0387       1       18   -1.644862  S-(hydroxymethyl)glutathione dehydrogenase/class
EcSMS35_0388       2       24   -3.748149  regulator protein FrmR
EcSMS35_0389       2       17   -2.162499  hypothetical protein
EcSMS35_0390       1       16   -1.026286  putative acyltransferase
EcSMS35_0391       1       14    0.768682  glycosyl transferase, group 2 family protein
EcSMS35_0392       1       15    1.924827  putative GlcNAc-PI de-N-acetylase
EcSMS35_0393       0       16    3.395663  hypothetical protein
EcSMS35_0394       0       13    2.096788  taurine transporter substrate binding subunit
EcSMS35_0395       0       12    1.837360  taurine transporter ATP-binding subunit
EcSMS35_0396       0       13    1.120434  taurine transporter subunit
EcSMS35_0397       5       22   -1.594005  taurine dioxygenase
EcSMS35_0398       6       24   -3.422710  delta-aminolevulinic acid dehydratase
EcSMS35_0400       4       21   -2.979576  hypothetical protein
EcSMS35_0399       3       21   -2.122570  IS2 transposase orfB
EcSMS35_0401       2       20   -2.600532  insertion sequence 2 OrfA protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_0353  CARBMTKINASE  430  e-155  Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 430 bits (1106), Expect = e-155
Identities = 140/315 (44%), Positives = 201/315 (63%), Gaps = 3/315 (0%)

Query: 1 MKELVVVAIGGNSIIKDNASQSIEHQAEAVKAVADTVLEMLASDYDIVLTHGNGPQVGLD 60
M + VV+A+GGN++ + S E + V+ A + E++A Y++V+THGNGPQVG
Sbjct: 1 MGKRVVIALGGNALQQRGQKGSYEEMMDNVRKTARQIAEIIARGYEVVITHGNGPQVGSL 60

Query: 61 LRRAEIAHEREGLPLTPLANCVADTQGGIGYLIQQALNNRLARHG-EKKAVTVVTQVEVD 119
L + G+P P+ A +QG IGY+IQQAL N L + G EKK VT++TQ VD
Sbjct: 61 LLHMDAGQATYGIPAQPMDVAGAMSQGWIGYMIQQALKNELRKRGMEKKVVTIITQTIVD 120

Query: 120 KNDPGFAHPTKPIGAFFSESQRDELQKANPDWRFVEDAGRGYRRVVASPEPKRIVEAPAI 179
KNDP F +PTKP+G F+ E L + W ED+GRG+RRVV SP+PK VEA I
Sbjct: 121 KNDPAFQNPTKPVGPFYDEETAKRLAR-EKGWIVKEDSGRGWRRVVPSPDPKGHVEAETI 179

Query: 180 KALIQQGFVVIGAGGGGIPVVRTEAGDYQSVDAVIDKDLSTALLAREIHADILVITTGVE 239
K L+++G +VI +GGGG+PV+ E G+ + V+AVIDKDL+ LA E++ADI +I T V
Sbjct: 180 KKLVERGVIVIASGGGGVPVIL-EDGEIKGVEAVIDKDLAGEKLAEEVNADIFMILTDVN 238

Query: 240 KVCIHFGKPQQQVLDRVDIATMTRYMQEGHFPPGSMLPKIIASLTFLEQGGKEVIITTPE 299
+++G ++Q L V + + +Y +EGHF GSM PK++A++ F+E GG+ II E
Sbjct: 239 GAALYYGTEKEQWLREVKVEELRKYYEEGHFKAGSMGPKVLAAIRFIEWGGERAIIAHLE 298

Query: 300 CLPAALRGETGTHII 314
AL G+TGT ++
Sbjct: 299 KAVEALEGKTGTQVL 313


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
EcSMS35_0361	HTHFIS	338	e-113	FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 338 bits (868), Expect = e-113
Identities = 122/401 (30%), Positives = 200/401 (49%), Gaps = 54/401 (13%)

Query: 164 DLAEEAGMTGIFIYSAATVRQAFSDALDMTRMSLRHNTHDATRNALRTRYVLGDMLGQSP 223
A +A G + Y ++ + + +L ++ ++G+S
Sbjct: 88 MTAIKASEKGAYDYLPKPFDL--TELIGIIGRALAEPKRRPSK-LEDDSQDGMPLVGRSA 144

Query: 224 QMEQVRQTILLYARSSAAVLIEGETGTGKELAAQAIHREYFARHDARQGKKSHPFVAVNC 283
M+++ + + ++ ++I GE+GTGKEL A+A+H + R+ PFVA+N
Sbjct: 145 AMQEIYRVLARLMQTDLTLMITGESGTGKELVARALHD-----YGKRRNG---PFVAINM 196

Query: 284 GAIAESLLEAELFGYEEGAFTGSRRGGRAGLFEIAHGGTLFLDEIGEMPLPLQTRLLRVL 343
AI L+E+ELFG+E+GAFTG++ G FE A GGTLFLDEIG+MP+ QTRLLRVL
Sbjct: 197 AAIPRDLIESELFGHEKGAFTGAQTR-STGRFEQAEGGTLFLDEIGDMPMDAQTRLLRVL 255

Query: 344 EEKEVTRVGGHQPVPVDVRVISATHCNLEEDMRQGEFRRDLFYRLSILRLQLPPLRERVT 403
++ E T VGG P+ DVR+++AT+ +L++ + QG FR DL+YRL+++ L+LPPLR+R
Sbjct: 256 QQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRLNVVPLRLPPLRDRAE 315

Query: 404 DILPLAESFLKVSLAALSAPFSAALRQGLQASETVLVHYDWPGNIRELRNMMERLALFLS 463
DI L F++ ++ L+ + + WPGN+REL N++ RL
Sbjct: 316 DIPDLVRHFVQ-QAEKEGLDVKRFDQEALEL----MKAHPWPGNVRELENLVRRLTALYP 370

Query: 464 VEP-TPDLTPQFLQLLLPELARESAKTPAPRLLTP------------------------- 497
+ T ++ L+ +P+ E A + L
Sbjct: 371 QDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFASFGDALPPSGLYD 430

Query: 498 -----------QQALEKFNGDKTAAANYLGISRTTFWRRLK 527
AL G++ AA+ LG++R T ++++
Sbjct: 431 RVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIR 471


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
EcSMS35_0364	PHPHTRNFRASE	30	0.023	Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase signature.

Length = 572

Score = 29.8 bits (67), Expect = 0.023
Identities = 11/33 (33%), Positives = 19/33 (57%), Gaps = 1/33 (3%)

Query: 65 LIHGKLPTRDE-LAAYKTKLKALRGLPANVRTV 96
+ +LPT +E AYK ++ + G P +RT+
Sbjct: 303 MDRDQLPTEEEQFEAYKEVVQRMDGKPVVIRTL 335


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
EcSMS35_0374	TCRTETA	36	3e-04	Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 35.6 bits (82), Expect = 3e-04
Identities = 44/192 (22%), Positives = 72/192 (37%), Gaps = 22/192 (11%)

Query: 4 LKNTNFWMFGLFFFFYFFI-MGAYFPFFPIWLHDINHISK--SDTGIIFAAISLFSLLFQ 60
+K + L + +G P P L D+ H + + GI+ A +L
Sbjct: 1 MKPNRPLIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACA 60

Query: 61 PLFGLLSDKLGLRKYLLWIITGMLVMFAPFFIFIFGPLLQYNILVGSIVGGIYLGFCFNA 120
P+ G LSD+ G R LL + + I P L + + +G IV GI A
Sbjct: 61 PVLGALSDRFGRRPVLL---VSLAGAAVDYAIMATAPFL-WVLYIGRIVAGIT-----GA 111

Query: 121 GAPAVEAFIEKVSRRSNFEFGRARMFG----CVGWALCAS--IVGIMFTINNQFVFWLGS 174
A+I ++ RAR FG C G+ + A + G+M + F+ +
Sbjct: 112 TGAVAGAYIADITDGDE----RARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAA 167

Query: 175 GCALILAVLLFF 186
+ + F
Sbjct: 168 ALNGLNFLTGCF 179


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
EcSMS35_0384	TCRTETB	58	2e-11	Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 58.4 bits (141), Expect = 2e-11
Identities = 85/414 (20%), Positives = 152/414 (36%), Gaps = 51/414 (12%)

Query: 1 MSTRTPSSSSSRLMLTIGLCFLVALMEGLDLQAAGIAAGGIAQAFALDKMQMGWIFSAGI 60
M+T S+ + I LC L L+ ++ IA F W+ +A +
Sbjct: 1 MNTSYSQSNLRHNQILIWLCILS-FFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFM 59

Query: 61 LGLLPGALVGGMLADRYGRKRILIGSVALFGLFSLATAIAWD-FPSLVFARLMTGVGLGA 119
L G V G L+D+ G KR+L+ + + S+ + F L+ AR + G G A
Sbjct: 60 LTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAG-AA 118

Query: 120 ALPNLIA-LTSEAAGPRFRGTAVSLMYCGVPIGAALAATLGFAGANLAWQTVFWVGGVVP 178
A P L+ + + RG A L+ V +G + +G A+ + + ++
Sbjct: 119 AFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPMIT 178

Query: 179 LILVPLLMRWLPESAVFAGEKQ--------AAPPLRALFAPETATATLLLWLCYFFTLLV 230
+I VP LM+ L + G LF + + L++ + F + V
Sbjct: 179 IITVPFLMKLLKKEVRIKGHFDIKGIILMSVGIVFFMLFTTSYSISFLIVSVLSFL-IFV 237

Query: 231 VYMLINWLPLLLVEQGFQPSQAAGVMFA-LQMGAASGTLMLGALMDK------------- 276
++ P + G GV+ + G +G + + M K
Sbjct: 238 KHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSV 297

Query: 277 -LRPVTMSLLIYS---GMLAS------LLALGTVSSFNGMLLAGFV----------AGLF 316
+ P TMS++I+ G+L +L +G L A F+ +F
Sbjct: 298 IIFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVF 357

Query: 317 ATGGQSVLYALAPLFYSSQIRATGVGTAVA----VGRLGAMSGPLLAGKMLALG 366
GG S + SS ++ G ++ L +G + G +L++
Sbjct: 358 VLGGLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLSIP 411


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
EcSMS35_0385	TRNSINTIMINR	28	0.020	Translocated intimin receptor (Tir) signature.
		>TRNSINTIMINR#Translocated intimin receptor (Tir) signature.

Length = 549

Score = 28.2 bits (62), Expect = 0.020
Identities = 14/56 (25%), Positives = 30/56 (53%), Gaps = 2/56 (3%)

Query: 11 LKAGLVTSKKAAKVERTAKKSRVQAREARAAVEENKKAQLERDKQLSEQQKQAALA 66
+ +G + ++ + AK++ AR+ AVE N +AQ + Q + +Q++ L+
Sbjct: 308 IPSGELKDDIVEQIAQQAKEAGEVARQQ--AVESNAQAQQRYEDQHARRQEELQLS 361


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
EcSMS35_0398	BINARYTOXINB	30	0.015	Binary toxin B family signature.
		>BINARYTOXINB#Binary toxin B family signature.

Length = 764

Score = 30.0 bits (67), Expect = 0.015
Identities = 19/69 (27%), Positives = 30/69 (43%)

Query: 254 DIVRELRERTELPIGAYQVSGEYAMIKFAALAGAIDEEKVVLESLGSIKRAGADLIFSYF 313
+ EL + +L + QV G A F +D E L I+ A +IF+
Sbjct: 466 NQFLELEKTKQLRLDTDQVYGNIATYNFENGRVRVDTGSNWSEVLPQIQETTARIIFNGK 525

Query: 314 ALDLAEKKI 322
L+L E++I
Sbjct: 526 DLNLVERRI 534


8	EcSMS35_0421	EcSMS35_0426	Y	Y	N	Pathogenicity Island (biased-composition)
LocusTag	DNBias	CDNBias	%GCBias	Product
EcSMS35_0421	4	15	1.689143	hypothetical protein
EcSMS35_0422	3	14	1.686958	hypothetical protein
EcSMS35_0423	3	14	1.715596	recombination associated protein
EcSMS35_0424	2	16	2.057403	fructokinase
EcSMS35_0425	2	15	1.614919	MFS transport protein AraJ
EcSMS35_0426	2	15	1.845071	exonuclease subunit SbcC
ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
EcSMS35_0425	TCRTETA	53	1e-09	Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 52.5 bits (126), Expect = 1e-09
Identities = 73/356 (20%), Positives = 128/356 (35%), Gaps = 35/356 (9%)

Query: 33 ILSLALGTFGLGMAEFGIMGVLTELAHNVGISIPAAGH---MISYYALGVVVGAPIIALF 89
+ ++AL G+G+ IM VL L ++ S H +++ YAL AP++
Sbjct: 11 LSTVALDAVGIGL----IMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGAL 66

Query: 90 SSRYSLKHILLFLVALCVIGNAMFTLSSSYLMLAIGRLVSGFPHGAFFGVGAIVLSKIIK 149
S R+ + +LL +A + A+ + +L IGR+V+G GA + I
Sbjct: 67 SDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIAD--IT 124

Query: 150 PGKVTAAVAGMVSGMTVANLLGIPLGTYLSQEFSWRYTFLLIAVFNIVVMASVYFWVPDI 209
G A G +S ++ P+ L FS F A N + + F +P+
Sbjct: 125 DGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPES 184

Query: 210 RDEAKGKLREQ----------FHFLRSPAPWLI--FAATMFGNAGVFAWFSYVKPYMMFI 257
+ LR + + A + F + G W +
Sbjct: 185 HKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFG------E 238

Query: 258 SGFSETAMTFIMMLVGLGM---VLGNVLSGRISGRYSPLRIAAVTDFIIVLALLMLFFCG 314
F A T + L G+ + +++G ++ R R + ++L F
Sbjct: 239 DRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFAT 298

Query: 315 GMKITSLIFAFICCAGLFALSAPLQILLLQNAKGGELLGAAGGQIAF--NLGSAVG 368
+ I + G+ LQ +L + E G G +A +L S VG
Sbjct: 299 RGWMAFPIMVLLASGGIG--MPALQAMLSRQV-DEERQGQLQGSLAALTSLTSIVG 351


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
EcSMS35_0426	IGASERPTASE	39	1e-04	IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 38.9 bits (90), Expect = 1e-04
Identities = 40/264 (15%), Positives = 81/264 (30%), Gaps = 11/264 (4%)

Query: 162 LNAKPKERAELLEELTGTEIYGKISAMVFEQHKSARTELEKLQAQASGVALLTPEQVQSL 221
A P E E + E + E S V + + A + + A Q+
Sbjct: 1029 APATPSETTETVAENSKQE-----SKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTN 1083

Query: 222 TASLQVLTDEEKQLITAQQQEQQSLNWLTRLD-ELQQEGSRRQQALQQALAEEEKAQPQL 280
+ +E Q ++ +++ E QE + + + E QPQ
Sbjct: 1084 EVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQA 1143

Query: 281 AALSLAQPARNLRPHWE---RIAEHSAALAHTRQQIEEVNTRLQSTMALRASIRHHAAKQ 337
P N++ A+ T +E+ T + + + +
Sbjct: 1144 EPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTT 1203

Query: 338 SAELQQQQQSLNAWLQEHDRFRQWNNELAGWRAQFSQQTSDREHLRQWQQQLTHAEQKLN 397
A Q S ++ ++ R + + ++DR + T+ L+
Sbjct: 1204 PATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPA-TTSSNDRSTVALCDLTSTNTNAVLS 1262

Query: 398 ALAAITLTLTADEVASALAQHAEQ 421
A + + V A++QH Q
Sbjct: 1263 DARAKAQFVALN-VGKAVSQHISQ 1285



Score = 37.0 bits (85), Expect = 5e-04
Identities = 28/140 (20%), Positives = 50/140 (35%), Gaps = 15/140 (10%)

Query: 738 QQDVLAAQSLQKAQAQFDTALQASVFDDQQAFLAALMDEQTLTQLEQLKQNLENQRRQAQ 797
Q DV + S + A+ D A A E T T E KQ +++
Sbjct: 1004 QADVPSVPSNNEEIARVDEAPVPPP-------APATPSETTETVAENSKQ-------ESK 1049

Query: 798 TLVTQTAEALAQHQQHRPDGLDLSVTVEQIQQELAQTQQKLRENTTSQGEIRQQLKQDAD 857
T+ +A Q+R + V+ Q Q T E ++ + +
Sbjct: 1050 TVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKE 1109

Query: 858 NRQQQQT-LMQQIAQMTQQV 876
+ + +T Q++ ++T QV
Sbjct: 1110 EKAKVETEKTQEVPKVTSQV 1129


9	EcSMS35_0461	EcSMS35_0479	Y	Y	N	Pathogenicity Island (biased-composition)
LocusTag	DNBias	CDNBias	%GCBias	Product
EcSMS35_0461	-2	14	-3.213966	2-dehydropantoate 2-reductase
EcSMS35_0462	0	17	-3.880481	putative nucleotide-binding protein
EcSMS35_0463	0	16	-2.689785	major facilitator transporter
EcSMS35_0464	0	19	-4.360737	hypothetical protein
EcSMS35_0465	0	16	-2.329177	hypothetical protein
EcSMS35_0466	1	21	0.294783	acetyltransferase
EcSMS35_0467	3	25	0.948845	hypothetical protein
EcSMS35_0468	3	25	0.584909	protoheme IX farnesyltransferase
EcSMS35_0469	1	20	0.700657	cytochrome o ubiquinol oxidase subunit IV
EcSMS35_0470	-1	21	0.701519	cytochrome o ubiquinol oxidase subunit III
EcSMS35_0471	-1	18	0.304567	cytochrome o ubiquinol oxidase, subunit I
EcSMS35_0472	-1	14	-0.751537	cytochrome o ubiquinol oxidase subunit II
EcSMS35_0473	-1	15	-0.951781	hypothetical protein
EcSMS35_0474	1	22	-0.210105	muropeptide transporter
EcSMS35_0475	3	30	-1.472128	hypothetical protein
EcSMS35_0476	5	29	-1.027129	hypothetical protein
EcSMS35_0477	4	27	-0.175534	transcriptional regulator BolA
EcSMS35_0478	4	28	-0.107047	hypothetical protein
EcSMS35_0479	3	28	0.122546	trigger factor
ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
EcSMS35_0463	TCRTETA	82	5e-19	Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 81.8 bits (202), Expect = 5e-19
Identities = 77/362 (21%), Positives = 139/362 (38%), Gaps = 29/362 (8%)

Query: 16 GLGTVFSLRMLGMFMVLPVLTTY--GMALQGASEALIGIAIGIYGLTQAVFQIPFGLLSD 73
L TV L +G+ +++PVL + A GI + +Y L Q G LSD
Sbjct: 10 ILSTVA-LDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSD 68

Query: 74 RIGRKPLIVGGLAVFAAGSVIAALSDSIWGIILGRALQG-SGAIAAAVMALLSDLTREQN 132
R GR+P+++ LA A I A + +W + +GR + G +GA A A ++D+T
Sbjct: 69 RFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIADITDGDE 128

Query: 133 RTKAMAFIGVSFGITFAIAMVLGPIITHKLG---LHALFWMIAILATTGIALTIWVVPNS 189
R + F+ FG MV GP++ +G HA F+ A L +++P S
Sbjct: 129 RARHFGFMSACFG----FGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPES 184

Query: 190 STHVLNRESGMVKGSFSKVLAEPRLLKLNFGIMCLHMLLMSTFVA-LPGQLADAGFPAAE 248
+ + G+ + L+ F+ L GQ+ A +
Sbjct: 185 HKGERRPLRREALNPLAS-------FRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFG 237

Query: 249 HWKVYLATMLIAF--------GSVVPFIIYAEVKRKMKQVFVFCVGLIV-VAEIVLWNAQ 299
+ + I S+ +I V ++ + +G+I +L
Sbjct: 238 EDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFA 297

Query: 300 TQFWQLVVGVQLFFVAFNLMEALLPSLISKESPAGYKGTAMGVYSTSQFLGVAIGGSLGG 359
T+ W + + + + + L +++S++ +G G + L +G L
Sbjct: 298 TRGW-MAFPIMVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFT 356

Query: 360 WI 361
I
Sbjct: 357 AI 358


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
EcSMS35_0474	TCRTETA	39	3e-05	Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 39.4 bits (92), Expect = 3e-05
Identities = 71/347 (20%), Positives = 135/347 (38%), Gaps = 20/347 (5%)

Query: 62 KFLWSPLMDRYTPPFFGRRRGWLLATQILLLVAIAAMGFLEPGTQLRWMAALAVVIAFCS 121
+F +P++ + F RR LL + V A M W+ + ++A +
Sbjct: 56 QFACAPVLGALSDRF--GRRPVLLVSLAGAAVDYAIMAT----APFLWVLYIGRIVAGIT 109

Query: 122 ASQDIVFDAWKTDVLPAEERGAGAAISVLGYRLGMLVSGGLALWLADKWLGWQGMYWLMA 181
+ V A+ D+ +ER + GM+ L + ++ A
Sbjct: 110 GATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGG--FSPHAPFFAAA 167

Query: 182 AL-LIPCIIATLLAPEP--TDTIPVPKTLEQAVVAPLRDFFGRNNAWLILLLIVLYKLGD 238
AL + + L PE + P+ + + + A L+ + ++ +G
Sbjct: 168 ALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQ 227

Query: 239 AFAMSLTTTFLIRGVGFDAGEVGVVNKTLGLLATIVGALYGGILMQRLSLFRALLIFGIL 298
A +L F +DA +G+ G+L ++ A+ G + RL RAL+ G++
Sbjct: 228 VPA-ALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALM-LGMI 285

Query: 299 QGASNAGYWLLSITDKHLYSMGAAVFFENLCGGMGTSAFVALLMTLCNKSFSATQFALLS 358
A GY LL+ + + V GG+G A A+L ++ L+
Sbjct: 286 --ADGTGYILLAFATRGWMAFPIMVLL--ASGGIGMPALQAMLSRQVDEERQGQLQGSLA 341

Query: 359 ALSAVGRVYVGPVAGWFVEAHGWSTF--YLFSVAAAVPGLILLLICR 403
AL+++ + VGP+ + A +T+ + + AA+ L L + R
Sbjct: 342 ALTSLTSI-VGPLLFTAIYAASITTWNGWAWIAGAALYLLCLPALRR 387


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
EcSMS35_0475	PF06291	27	0.027	Lambda prophage Bor protein
		>PF06291#Lambda prophage Bor protein

Length = 102

Score = 26.5 bits (58), Expect = 0.027
Identities = 11/34 (32%), Positives = 18/34 (52%)

Query: 3 KKILFPLVALFMLAGCAKPPTTIEVSPTITLPQQ 36
KK+LF ++ GCA+ T+ PT P++
Sbjct: 7 KKMLFSAALAMLITGCAQQTFTVGNKPTAVTPKE 40


10	EcSMS35_0498	EcSMS35_0545	Y	Y	N	Pathogenicity Island (biased-composition)
LocusTag	DNBias	CDNBias	%GCBias	Product
EcSMS35_0498	2	19	-3.439693	methylated-DNA-[protein]-cysteine
EcSMS35_0499	1	19	-5.008989	hypothetical protein
EcSMS35_0500	0	12	-1.504090	cyclic diguanylate phosphodiesterase
EcSMS35_0501	2	17	-0.195949	hypothetical protein
EcSMS35_0502	1	15	-0.435460	maltose O-acetyltransferase
EcSMS35_0503	1	15	0.042066	hemolysin expression-modulating protein
EcSMS35_0504	1	15	0.259050	hypothetical protein
EcSMS35_0505	1	16	1.182149	acriflavine resistance protein B
EcSMS35_0506	2	12	0.596062	acriflavine resistance protein A
EcSMS35_0507	2	14	0.420813	DNA-binding transcriptional repressor AcrR
EcSMS35_0508	3	15	2.524753	potassium efflux protein KefA
EcSMS35_0509	4	15	4.333479	hypothetical protein
EcSMS35_0510	3	17	4.822418	primosomal replication protein N''
EcSMS35_0511	3	23	3.456212	hypothetical protein
EcSMS35_0512	4	24	3.372465	adenine phosphoribosyltransferase
EcSMS35_0513	4	24	3.081923	DNA polymerase III subunits gamma and tau
EcSMS35_0514	3	26	1.735060	hypothetical protein
EcSMS35_0515	1	20	1.632511	recombination protein RecR
EcSMS35_0516	0	19	0.980243	heat shock protein 90
EcSMS35_0517	-3	13	1.665367	hypothetical protein
EcSMS35_0518	-2	12	1.768273	adenylate kinase
EcSMS35_0520	-1	14	1.410568	ferrochelatase
EcSMS35_0519	-1	13	1.129611	acetyl esterase
EcSMS35_0521	-1	14	1.225242	inosine kinase
EcSMS35_0522	-1	14	3.055010	putative cation:proton antiport protein
EcSMS35_0523	-1	13	2.762430	fosmidomycin resistance protein
EcSMS35_0524	-1	13	2.944432	bifunctional UDP-sugar hydrolase/5'-nucleotidase
EcSMS35_0525	0	14	3.952960	hypothetical protein
EcSMS35_0526	0	15	4.034517	GumN family protein
EcSMS35_0527	0	15	3.485560	copper exporting ATPase
EcSMS35_0528	0	12	-0.215278	glutaminase
EcSMS35_0529	0	15	-1.546537	amino acid permease family protein
EcSMS35_0530	1	22	-4.574682	DNA-binding transcriptional regulator CueR
EcSMS35_0531	0	25	-6.593975	hypothetical protein
EcSMS35_0532	0	23	-3.954451	putative lipoprotein
EcSMS35_0533	-1	17	-1.870449	hypothetical protein
EcSMS35_0534	-2	17	-2.574404	hypothetical protein
EcSMS35_0535	-1	18	-2.675518	hypothetical protein
EcSMS35_0536	-1	16	-0.735733	hypothetical protein
EcSMS35_0537	0	16	0.338683	nodulation efficiency family protein
EcSMS35_0538	-1	16	-0.212862	SPFH domain-containing protein/band 7 family
EcSMS35_0539	-1	18	0.252352	putative ABC transporter ATP-binding protein
EcSMS35_0540	1	19	3.583493	hypothetical protein
EcSMS35_0541	2	18	4.754952	protein YbbN
EcSMS35_0542	1	16	4.164445	short chain dehydrogenase
EcSMS35_0543	2	16	3.992399	multifunctional acyl-CoA thioesterase I and
EcSMS35_0544	0	16	3.889599	putative ABC transporter ATP-binding protein
EcSMS35_0545	-1	11	3.255482	efflux ABC transporter permease
ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
EcSMS35_0505	ACRIFLAVINRP	1369	0.0	Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 1369 bits (3546), Expect = 0.0
Identities = 802/1033 (77%), Positives = 915/1033 (88%), Gaps = 1/1033 (0%)

Query: 1 MPNFFIDRPIFAWVIAIIIMLAGGLAILKLPVAQYPTIAPPAVTISASYPGADAKTVQDT 60
M NFFI RPIFAWV+AII+M+AG LAIL+LPVAQYPTIAPPAV++SA+YPGADA+TVQDT
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VTQVIEQNMNGIDNLMYMSSNSDSTGTVQITLTFESGTDADIAQVQVQNKLQLAMPLLPQ 120
VTQVIEQNMNGIDNLMYMSS SDS G+V ITLTF+SGTD DIAQVQVQNKLQLA PLLPQ
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 121 EVQQQGVSVEKSSSSFLMVVGVINTDGTMTQEDISDYVAANMKDAISRTSGVGDVQLFGS 180
EVQQQG+SVEKSSSS+LMV G ++ + TQ+DISDYVA+N+KD +SR +GVGDVQLFG+
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 181 QYAMRIWMNPNELNKFQLTPVDVITAIKAQNAQVAAGQLGGTPPVKGQQLNASIIAQTRL 240
QYAMRIW++ + LNK++LTPVDVI +K QN Q+AAGQLGGTP + GQQLNASIIAQTR
Sbjct: 181 QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRF 240

Query: 241 TSTEEFGKILLKVNQDGSRVLLRDVAKIELGGENYDIIAEFNGQPASGLGIKLATGANAL 300
+ EEFGK+ L+VN DGS V L+DVA++ELGGENY++IA NG+PA+GLGIKLATGANAL
Sbjct: 241 KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL 300

Query: 301 DTAAAIRAELAKMEPFFPSGLKIVYPYDTTPFVKISIHEVVKTLVEAIILVFLVMYLFLQ 360
DTA AI+A+LA+++PFFP G+K++YPYDTTPFV++SIHEVVKTL EAI+LVFLVMYLFLQ
Sbjct: 301 DTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQ 360

Query: 361 NFRATLIPTIAVPVVLLGTFAVLAAFGFSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420
N RATLIPTIAVPVVLLGTFA+LAAFG+SINTLTMFGMVLAIGLLVDDAIVVVENVERVM
Sbjct: 361 NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420

Query: 421 AEEGLPPKEATRKSMGQIQGALVGIAMVLSAVFVPMAFFGGSTGAIYRQFSITIVSAMAL 480
E+ LPPKEAT KSM QIQGALVGIAMVLSAVF+PMAFFGGSTGAIYRQFSITIVSAMAL
Sbjct: 421 MEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL 480

Query: 481 SVLVALILTPALCATMLKPIAKGDHGEGKKGFFGWFNRMFEKSTHHYTDSVGGILRSTGR 540
SVLVALILTPALCAT+LKP++ H E K GFFGWFN F+ S +HYT+SVG IL STGR
Sbjct: 481 SVLVALILTPALCATLLKPVSAE-HHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 541 YLVLYLIIVVGMAYLFVRLPSSFLPDEDQGVFMTMVQLPAGATQERTQKVLNEVTHYYLT 600
YL++Y +IV GM LF+RLPSSFLP+EDQGVF+TM+QLPAGATQERTQKVL++VT YYL
Sbjct: 540 YLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLK 599

Query: 601 KEKNNVESVFAVNGFGFAGRGQNTGIAFVSLKDWADRPGEENKVEAITMRATRAFSQIKD 660
EK NVESVF VNGF F+G+ QN G+AFVSLK W +R G+EN EA+ RA +I+D
Sbjct: 600 NEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRD 659

Query: 661 AMVFAFNLPAIVELGTATGFDFELIDQAGLGHEKLTQARNQLLAEAAKHPDMLTSVRPNG 720
V FN+PAIVELGTATGFDFELIDQAGLGH+ LTQARNQLL AA+HP L SVRPNG
Sbjct: 660 GFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNG 719

Query: 721 LEDTPQFKIDIDQEKAQALGVSINDINTTLGAAWGGSYVNDFIDRGRVKKVYVMSEAKYR 780
LEDT QFK+++DQEKAQALGVS++DIN T+ A GG+YVNDFIDRGRVKK+YV ++AK+R
Sbjct: 720 LEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFR 779

Query: 781 MLPDDIGDWYVRAADGQMVPFSAFSSSRWEYGSPRLERYNGLPSMEILGQAAPGKSTGEA 840
MLP+D+ YVR+A+G+MVPFSAF++S W YGSPRLERYNGLPSMEI G+AAPG S+G+A
Sbjct: 780 MLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDA 839

Query: 841 MELMEQLASKLPTGVGYDWTGMSYQERLSGNQAPSLYAISLIVVFLCLAALYESWSIPFS 900
M LME LASKLP G+GYDWTGMSYQERLSGNQAP+L AIS +VVFLCLAALYESWSIP S
Sbjct: 840 MALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVS 899

Query: 901 VMLVVPLGVIGALLAATFRGLTNDVYFQVGLLTTIGLSAKNAILIVEFAKDLMDKEGKGL 960
VMLVVPLG++G LLAAT NDVYF VGLLTTIGLSAKNAILIVEFAKDLM+KEGKG+
Sbjct: 900 VMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGV 959

Query: 961 IEATLDAVRMRLRPILMTSLAFILGVMPLVISTGAGSGAQNAVGTGVMGGMVTATVLAIF 1020
+EATL AVRMRLRPILMTSLAFILGV+PL IS GAGSGAQNAVG GVMGGMV+AT+LAIF
Sbjct: 960 VEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIF 1019

Query: 1021 FVPVFFVVVRRRF 1033
FVPVFFVV+RR F
Sbjct: 1020 FVPVFFVVIRRCF 1032


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
EcSMS35_0506	RTXTOXIND	44	6e-07	Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 44.4 bits (105), Expect = 6e-07
Identities = 33/212 (15%), Positives = 71/212 (33%), Gaps = 23/212 (10%)

Query: 112 TYQAAYDSAKGDLAKAQAAANIAQLTVNRYQKLLGTQYISKQEYDQALADAQQANAAVTA 171
+ Y A +L + + Q+ Q +++ ++ L +Q +
Sbjct: 256 EQENKYVEAVNELR--VYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGL 313

Query: 172 AKAAVETARINLAYTKVTSPISGRIGKSNV-TEGALVQNGQATALATVQQLDPIYVDVTQ 230
+ + + +P+S ++ + V TEG +V + T + V + D + V
Sbjct: 314 LTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAE-TLMVIVPEDDTLEVTALV 372

Query: 231 SSNDFLRLKQELA----------NGTLKQENGKAKVSLITSDGIKFPQDGTLEFSDVTVD 280
+ D + KV I D I+ + G + ++++
Sbjct: 373 QNKDIGFINVGQNAIIKVEAFPYTRYGYLV---GKVKNINLDAIEDQRLGLVFNVIISIE 429

Query: 281 QTTGSITLRAIFPNPDHTLLPGMFVRARLEEG 312
+ S + I L GM V A ++ G
Sbjct: 430 ENCLSTGNKNIP------LSSGMAVTAEIKTG 455



Score = 34.4 bits (79), Expect = 7e-04
Identities = 24/125 (19%), Positives = 43/125 (34%), Gaps = 13/125 (10%)

Query: 61 PLQITTELPGR-TSAYRIAEVRPQVSGIILKRNFKEGSDIEAGVSLYQIDPATYQAAYDS 119
++I G+ T + R E++P + I+ + KEG + G L ++ +A
Sbjct: 79 QVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEA---- 134

Query: 120 AKGDLAKAQAAANIAQLTVNRYQKLLGTQYISKQEYDQALADAQQANAAVTAAKAAVETA 179
D K Q++ A+L RYQ L E ++
Sbjct: 135 ---DTLKTQSSLLQARLEQTRYQILS-----RSIELNKLPELKLPDEPYFQNVSEEEVLR 186

Query: 180 RINLA 184
+L
Sbjct: 187 LTSLI 191


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
EcSMS35_0507	HTHTETR	222	5e-76	TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 222 bits (567), Expect = 5e-76
Identities = 215/215 (100%), Positives = 215/215 (100%)

Query: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60
MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60

Query: 61 EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV 120
EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV
Sbjct: 61 EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV 120

Query: 121 GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF 180
GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF
Sbjct: 121 GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF 180

Query: 181 APQSFDLKKEARDYVAILLEMYLLCPTLRNPATNE 215
APQSFDLKKEARDYVAILLEMYLLCPTLRNPATNE
Sbjct: 181 APQSFDLKKEARDYVAILLEMYLLCPTLRNPATNE 215


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
EcSMS35_0508	RTXTOXIND	32	0.017	Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 31.7 bits (72), Expect = 0.017
Identities = 19/125 (15%), Positives = 40/125 (32%), Gaps = 6/125 (4%)

Query: 28 QNTAFARASSNGDLPTKADLQAQLDSLNKQKDLSAQDKLVQQDLTDTLATLDKIDRVKEE 87
N RA L + + L L+ + A L++ ++ E
Sbjct: 207 LNLDKKRAERLTVLARINRYENLSRVEKSR--LDDFSSLLHKQAIAKHAVLEQENKYVEA 264

Query: 88 TVQLRQKVAEAPEKMRQATAALTALSDVDND--EETRKIL--STLSLRQLETRVAQALDD 143
+LR ++ + + +A V E L +T ++ L +A+ +
Sbjct: 265 VNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEER 324

Query: 144 LQNAQ 148
Q +
Sbjct: 325 QQASV 329


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
EcSMS35_0513	IGASERPTASE	39	9e-05	IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 38.5 bits (89), Expect = 9e-05
Identities = 40/251 (15%), Positives = 78/251 (31%), Gaps = 31/251 (12%)

Query: 404 PLPETTSQVLAARQQLQRVQGATKAKKSEPAA----ATRARPVNNAALERLASVTDRVQA 459
P E +Q + + + P+ AR + A + A T
Sbjct: 983 PEVEKRNQTVDTTN----ITTPNNIQADVPSVPSNNEEIARV-DEAPVPPPAPATPSETT 1037

Query: 460 RPVPSALEKAPAKKEAYRWKATTPVMQQKE--------VVATPKALKKA---LEHEKTPE 508
V ++ E AT Q +E V A + + A E ++T
Sbjct: 1038 ETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQT 1097

Query: 509 LAAKLAA---------EAIERDPWAAQVSQLSLPKLVEQVALNAWKE-ESDNAVCLHLRS 558
K A E+ +V+ PK + + E +N ++++
Sbjct: 1098 TETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKE 1157

Query: 559 SQRHLNNRGAQQKLAEALST-LKGSTVELTIVEDDNPAVRTPLEWRQAIYEEKLAQARES 617
Q N ++ A+ S+ ++ E T V N V P A + + +
Sbjct: 1158 PQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSN 1217

Query: 618 IIADNNIQTLR 628
+ + +++R
Sbjct: 1218 KPKNRHRRSVR 1228


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
EcSMS35_0516	FRAGILYSIN	32	0.009	Fragilysin metallopeptidase (M10C) enterotoxin signat...
		>FRAGILYSIN#Fragilysin metallopeptidase (M10C) enterotoxin signature.

Length = 405

Score = 31.6 bits (71), Expect = 0.009
Identities = 24/108 (22%), Positives = 46/108 (42%), Gaps = 12/108 (11%)

Query: 422 RMKEGQEK--IYYITADSYAAAKSSPHLELLRKKGIEVLLLSDRIDEWMMNYLTEFDGKP 479
R+ G++K +I D +A + + G + ++ + + MMN + EF P
Sbjct: 99 RLFNGRDKDSTSFILGDEFAVLR-------FYRNGESISYIAYK-EAQMMNEIAEFYAAP 150

Query: 480 FQSVSKV--DESLEKLADEVDESAKEAEKALTPFIDRVKALLGERVKD 525
F+ + E+ E + D SA + ++ ID+ K +L D
Sbjct: 151 FKKTRAINEKEAFECIYDSRTRSAGKDIVSVKINIDKAKKILNLPECD 198


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
EcSMS35_0523	TCRTETA	31	0.008	Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 30.9 bits (70), Expect = 0.008
Identities = 37/167 (22%), Positives = 57/167 (34%), Gaps = 14/167 (8%)

Query: 52 QSEFSLTFMQIGMITLTFQLASSLLQ-----PVVGYWTDKYPMPWSLPIGMCFTLSGLVL 106
+ F IG+ F + SL Q PV ++ + + G +L
Sbjct: 238 EDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGT----GYIL 293

Query: 107 LALAGSFGAVLLAAALVGTGSSVFHPESSRVARMASGGRHGLAQSIFQVGGNFGSSLGPL 166
LA A L+ +G + ++R R G Q + S +GPL
Sbjct: 294 LAFATRGWMAFPIMVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPL 353

Query: 167 LAAVIIA---PYGKGNVAWFVLAALLAIVVLA-QISRWYSAQHRMNK 209
L I A G AW AAL + + A + W A R ++
Sbjct: 354 LFTAIYAASITTWNG-WAWIAGAALYLLCLPALRRGLWSGAGQRADR 399



Score = 30.2 bits (68), Expect = 0.017
Identities = 32/133 (24%), Positives = 50/133 (37%), Gaps = 2/133 (1%)

Query: 267 LHLFAFLFAVAAGTVIGGPVGDKIGRKYVIWGSILGVAPFTLILPYASLHWTGVLTVIIG 326
L L+A + A + G + D+ GR+ V+ S+ G A I+ A W + I+
Sbjct: 49 LALYALMQFACAP--VLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVA 106

Query: 327 FILASAFSAILVYAQELLPGRIGMVSGLFFGFAFGMGGLGAAVLGLIADHTSIELVYKIC 386
I + + Y ++ G F FG G + VLG + S +
Sbjct: 107 GITGATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAA 166

Query: 387 AFLPLLGMLTIFL 399
A L L LT
Sbjct: 167 AALNGLNFLTGCF 179



Score = 29.8 bits (67), Expect = 0.018
Identities = 51/319 (15%), Positives = 111/319 (34%), Gaps = 24/319 (7%)

Query: 43 LILAIYPLLQSEFSLTFMQI---GMITLTFQLASSLLQPVVGYWTDKYPMPWSLPIGMCF 99
LI+ + P L + + G++ + L PV+G +D++ L + +
Sbjct: 23 LIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAG 82

Query: 100 TLSGLVLLALAGSFGAVLLAAALVGTGSSVFHPESSRVARMASGGRH----GLAQSIFQV 155
++A A + + + G + + +A + G G + F
Sbjct: 83 AAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIADITDGDERARHFGFMSACFGF 142

Query: 156 GGNFGSSLGPLLAAVIIAPYGKGNVAWFVLAALLAIVVLAQISRWYSAQHRMNKGKPKAT 215
G G LG L+ A F AA L + H+ + +
Sbjct: 143 GMVAGPVLGGLMGGF-------SPHAPFFAAAALNGLNFLTGCFLLPESHKGERRPLRRE 195

Query: 216 IINPLPRNKVVLAVSILLILIFSKYFYMASISSY----YTFYLMQKFGLSIQNAQLHLFA 271
+NPL + ++++ L+ +F M + + + +F + L A
Sbjct: 196 ALNPLASFRWARGMTVVAALMAV-FFIMQLVGQVPAALWVIFGEDRFHWDATTIGISLAA 254

Query: 272 F-LFAVAAGTVIGGPVGDKIGRKYVIWGSILGVAPFTLILPYASLHWTGVLTVIIGFILA 330
F + A +I GPV ++G + + ++ ++L +A+ W +L
Sbjct: 255 FGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGW----MAFPIMVLL 310

Query: 331 SAFSAILVYAQELLPGRIG 349
++ + Q +L ++
Sbjct: 311 ASGGIGMPALQAMLSRQVD 329


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
EcSMS35_0533	PF03895	55	3e-12	Serum resistance protein DsrA.
		>PF03895#Serum resistance protein DsrA.

Length = 79

Score = 55.2 bits (133), Expect = 3e-12
Identities = 21/78 (26%), Positives = 34/78 (43%), Gaps = 1/78 (1%)

Query: 262 RKEANAGTASAIAIASQPQVKTGDVMMVSAGAGTFNGESAVSVGTSFNAGTHTVLKAGIS 321
KE G A+ A++ Q VSA G + ++A+++G KAG++
Sbjct: 2 SKELQTGLANQSALSMLVQPNGVGKTSVSAAVGGYRDKTALAIGVGSRITDRFTAKAGVA 61

Query: 322 ADTQS-DFGAGVGVGYSF 338
+T + G VGY F
Sbjct: 62 FNTYNGGMSYGASVGYEF 79


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
EcSMS35_0542	DHBDHDRGNASE	75	5e-18	2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 74.7 bits (183), Expect = 5e-18
Identities = 48/212 (22%), Positives = 80/212 (37%), Gaps = 7/212 (3%)

Query: 3 KSVLITGCSSGIGLESALELKRQGFHVLAGCRKADDVERMNS----MGFT--GVLIDLDS 56
K ITG + GIG A L QG H+ A + +E++ S D+
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRD 68

Query: 57 PESVDRAADEVIALTDNCLYGIFNNAGFGMYGPLSTISRVQMEQQFSANFFGAHQLTMRL 116
++D + + + N AG G + ++S + E FS N G + +
Sbjct: 69 SAAIDEITARIEREMGP-IDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSV 127

Query: 117 LPVMLPHGEGRIVMTSSVMGLISTPGRGAYAASKYALEAWSDALRMELRHSGIKVSLIEP 176
M+ G IV S + AYA+SK A ++ L +EL I+ +++ P
Sbjct: 128 SKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSP 187

Query: 177 GPIRTRFTDNVNQTQSDKPVENPGIAARFTLG 208
G T ++ ++ G F G
Sbjct: 188 GSTETDMQWSLWADENGAEQVIKGSLETFKTG 219


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
EcSMS35_0544	PF05272	29	0.013	Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.013
Identities = 12/20 (60%), Positives = 13/20 (65%)

Query: 41 LVGESGSGKSTLLAILAGLD 60
L G G GKSTL+ L GLD
Sbjct: 601 LEGTGGIGKSTLINTLVGLD 620


11	EcSMS35_0556	EcSMS35_0579	Y	Y	Y	Pathogenicity Island (biased-composition)
LocusTag	DNBias	CDNBias	%GCBias	Product
EcSMS35_0556	3	14	0.791373	IS1 transposase orfA
EcSMS35_0557	2	15	1.735525	IS1 transposase orfB
EcSMS35_0559	2	13	2.360045	glycerate kinase II
EcSMS35_0560	2	14	2.384977	hypothetical protein
EcSMS35_0561	0	15	3.638832	allantoate amidohydrolase
EcSMS35_0562	0	14	4.565616	ureidoglycolate dehydrogenase
EcSMS35_0563	0	15	5.477632	membrane protein FdrA
EcSMS35_0564	1	16	5.464878	hypothetical protein
EcSMS35_0565	1	15	4.259630	hypothetical protein
EcSMS35_0566	1	18	3.653582	carbamate kinase
EcSMS35_0567	2	19	2.997282	phosphoribosylaminoimidazole carboxylase ATPase
EcSMS35_0568	4	20	2.369883	phosphoribosylaminoimidazole carboxylase,
EcSMS35_0569	3	18	1.935418	UDP-2,3-diacylglucosamine hydrolase
EcSMS35_0570	2	18	0.623015	peptidyl-prolyl cis-trans isomerase B (rotamase
EcSMS35_0571	0	14	-0.280876	cysteinyl-tRNA synthetase
EcSMS35_0572	1	19	-2.828203	hypothetical protein
EcSMS35_0573	2	21	-3.830695	hypothetical protein
EcSMS35_0574	2	22	-4.064711	bifunctional 5,10-methylene-tetrahydrofolate
EcSMS35_0575	3	28	-6.492660	type-1 fimbrial protein
EcSMS35_0576	3	27	-6.509025	chaperone protein FimC-like protein
EcSMS35_0577	3	25	-6.137042	outer membrane usher protein SfmD
EcSMS35_0578	1	25	-6.038701	mannose binding protein FimH-like protein
EcSMS35_0579	2	15	-0.837986	fimbrial protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0566  CARBMTKINASE  381    e-136    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 381 bits (981), Expect = e-136
Identities = 126/310 (40%), Positives = 175/310 (56%), Gaps = 16/310 (5%)

Query: 2 KTLVVALGGNALLQRGEALTAENQYRNIASAVPALARL-ARSYRLAIVHGNGPQVGLLAL 60
K +V+ALGGNAL QRG+ + E N+ +A + AR Y + I HGNGPQVG L L
Sbjct: 3 KRVVIALGGNALQQRGQKGSYEEMMDNVRKTARQIAEIIARGYEVVITHGNGPQVGSLLL 62

Query: 61 QNLAWKE---VEPYPLDVLVAESQGMIGYMLVQSLSAQPQM----PPVTTVLTRIEVSPD 113
A + + P+DV A SQG IGYM+ Q+L + + V T++T+ V +
Sbjct: 63 HMDAGQATYGIPAQPMDVAGAMSQGWIGYMIQQALKNELRKRGMEKKVVTIITQTIVDKN 122

Query: 114 DPAFLQPEKFIGPVYQPEEQEALEAAYGWQMKRD-GKYLRRVVASPQPRNILDSEAIELL 172
DPAF P K +GP Y E + L GW +K D G+ RRVV SP P+ +++E I+ L
Sbjct: 123 DPAFQNPTKPVGPFYDEETAKRLAREKGWIVKEDSGRGWRRVVPSPDPKGHVEAETIKKL 182

Query: 173 LKEGHVVICSGGGGVPVTDDG---VGSEAVIDKDLAAALLAEQINADGLVILTDADAVYE 229
++ G +VI SGGGGVPV + G EAVIDKDLA LAE++NAD +ILTD +
Sbjct: 183 VERGVIVIASGGGGVPVILEDGEIKGVEAVIDKDLAGEKLAEEVNADIFMILTDVNGAAL 242

Query: 230 NWGTPQQRAIRHATPDELAPLAKAD----GSMGPKVTAVSGYVRSRGKPAWIGALSRIEE 285
+GT +++ +R +EL + GSMGPKV A ++ G+ A I L + E
Sbjct: 243 YYGTEKEQWLREVKVEELRKYYEEGHFKAGSMGPKVLAAIRFIEWGGERAIIAHLEKAVE 302

Query: 286 TLAGEAGTCI 295
L G+ GT +
Sbjct: 303 ALEGKTGTQV 312


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0571  RTXTOXIND     29     0.036    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 29.4 bits (66), Expect = 0.036
Identities = 16/150 (10%), Positives = 43/150 (28%), Gaps = 8/150 (5%)

Query: 299 RSQLNYSEENLKQARAALERLYTALRGTDKTVAPAGGEAFEARFIEAMNDDFNTP----- 353
+ ++ +L QAR R R + P E F ++
Sbjct: 133 EADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIK 192

Query: 354 EAYSVLFDMAREVNRLKAEDMAAANAMASHLRKLSAVLGLLEQEPEAFLQSGAQADDSEV 413
E +S + + + A + + + + + + + + F +
Sbjct: 193 EQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSS---LLHKQAI 249

Query: 414 AEIEALIQQRLDARKAKDWAAADAARDRLN 443
A+ L Q+ + + +++
Sbjct: 250 AKHAVLEQENKYVEAVNELRVYKSQLEQIE 279


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0577  PF00577       826    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 826 bits (2136), Expect = 0.0
Identities = 404/856 (47%), Positives = 572/856 (66%), Gaps = 20/856 (2%)

Query: 20 ICYSSLAILPSFLSYAESYFNPAFLLENGTSVADLSRFERGNHQPAGVYRVDLWRNDEFI 79
+ A + LS AE YFNP FL ++ +VADLSRFE G P G YRVD++ N+ ++
Sbjct: 31 FVACAFA-AQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPGTYRVDIYLNNGYM 89

Query: 80 GSQDIVFESTTVNTGDKSGGLMPCFNQALLERIGLNSSAFPELAQQQNNKCINLLKAVPD 139
++D+ F NTGD G++PC +A L +GLN+++ + ++ C+ L + D
Sbjct: 90 ATRDVTF-----NTGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDACVPLTSMIHD 144

Query: 140 ATINFDFAAMRLNITIPQIALLSSAHGYIPPEEWDEGIPALLLNYNFTGN----RGNGND 195
AT D RLN+TIPQ + + A GYIPPE WD GI A LLNYNF+GN R GN
Sbjct: 145 ATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSVQNRIGGNS 204

Query: 196 SYFFSEL-SGINIGPWRLRNNGSWNYFRGNG--YHSEQWNNIGTWVQRAIIPLKSELVMG 252
Y + L SG+NIG WRLR+N +W+Y + +W +I TW++R IIPL+S L +G
Sbjct: 205 HYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRSRLTLG 264

Query: 253 DGNTGSDIFDGVGFRGVRLYSSDNMYPDSQQGFAPTVRGIARTAAQLTIRQNGFIIYQSY 312
DG T DIFDG+ FRG +L S DNM PDSQ+GFAP + GIAR AQ+TI+QNG+ IY S
Sbjct: 265 DGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIYNST 324

Query: 313 VSPGAFEITDLHPTSSNGDLDVTIDERDGNQQNYTIPYSTVPILQREGRFKFDLTAGDFR 372
V PG F I D++ ++GDL VTI E DG+ Q +T+PYS+VP+LQREG ++ +TAG++R
Sbjct: 325 VPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITAGEYR 384

Query: 373 SGNSQQSSPFFFQGTALGGLPQEFTAYGGTQLSANYTAFLLGLGRNLGNWGAVSLDVTHA 432
SGN+QQ P FFQ T L GLP +T YGGTQL+ Y AF G+G+N+G GA+S+D+T A
Sbjct: 385 SGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALSVDMTQA 444

Query: 433 RSQLADDSRHEGDSIRFLYAKSMNTFGTNFQLMGYRYSTQGFYTLDDVAYRRMEGYEYDY 492
S L DDS+H+G S+RFLY KS+N GTN QL+GYRYST G++ D Y RM GY
Sbjct: 445 NSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGY-NIE 503

Query: 493 DYDGEHRDEPIIVNYHNLRFSRKDRLQLNISQSLNDFGSLYISGTHQKYWNTSDSDTWYQ 552
DG + +P +Y+NL ++++ +LQL ++Q L +LY+SG+HQ YW TS+ D +Q
Sbjct: 504 TQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRTSTLYLSGSHQTYWGTSNVDEQFQ 563

Query: 553 VGYTSSWVGISYSLSFSWNESVGIPDNERIVGLNVSVPFNVLTKRRYTRENALDRAYASF 612
G +++ I+++LS+S ++ ++++ LNV++PF+ R ++ A AS+
Sbjct: 564 AGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWL--RSDSKSQWRHASASY 621

Query: 613 NANRNSNGQNNWLAGVGGTLLEGHNLSYHVSQG----DTSNNGYTGSATANWQAAYGTLG 668
+ + + NG+ LAGV GTLLE +NLSY V G N+G TG AT N++ YG
Sbjct: 622 SMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGYGNAN 681

Query: 669 VGYNYDRDQHDVNWQLSGGVVGHENGITLSQPLGDTNVLIKAPGAGGVRIENQTGILTDW 728
+GY++ D + + +SGGV+ H NG+TL QPL DT VL+KAPGA ++ENQTG+ TDW
Sbjct: 682 IGYSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNDTVVLVKAPGAKDAKVENQTGVRTDW 741

Query: 729 RGYAVMPYATVYRYNRIALDTNTMGNSIDVEKNISSVVPTQGALVRANFDTRIGVRALIT 788
RGYAV+PYAT YR NR+ALDTNT+ +++D++ +++VVPT+GA+VRA F R+G++ L+T
Sbjct: 742 RGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAEFKARVGIKLLMT 801

Query: 789 VTQGGKPVPFGSPVRENSTGITSMVGDDGQVYLSGAPLSGELLVQWGDGANSRCIAHYVL 848
+T KP+PFG+ V S+ + +V D+GQVYLSG PL+G++ V+WG+ N+ C+A+Y L
Sbjct: 802 LTHNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLAGKVQVKWGEEENAHCVANYQL 861

Query: 849 PKQSLQQAVTVISAVC 864
P +S QQ +T +SA C
Sbjct: 862 PPESQQQLLTQLSAEC 877


12  EcSMS35_0601  EcSMS35_0640  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
EcSMS35_0601  -1      11       3.549096   phosphopantetheinyltransferase component of
EcSMS35_0602  -2      12       2.694010   outer membrane receptor FepA
EcSMS35_0603  0       12       2.695329   hypothetical protein
EcSMS35_0604  -1      12       3.258919   enterobactin/ferric enterobactin esterase
EcSMS35_0605  0       12       3.885076   mbtH-like protein
EcSMS35_0606  1       13       4.133366   enterobactin synthase subunit F
EcSMS35_0608  1       13       3.417521   ferric enterobactin transport protein FepE
EcSMS35_0607  0       15       5.530131   iron-enterobactin transporter ATP-binding
EcSMS35_0609  0       17       5.811895   iron-enterobactin transporter permease
EcSMS35_0610  0       17       5.452537   iron-enterobactin transporter membrane protein
EcSMS35_0611  -1      18       5.072344   enterobactin exporter EntS
EcSMS35_0612  -1      18       5.010090   iron-enterobactin transporter periplasmic
EcSMS35_0613  -2      22       5.441715   isochorismate synthase, entC
EcSMS35_0614  -1      24       5.421074   enterobactin synthase subunit E
EcSMS35_0615  -1      21       5.182100   isochorismatase
EcSMS35_0616  -1      18       4.889691   2,3-dihydroxybenzoate-2,3-dehydrogenase
EcSMS35_0617  -1      17       3.524269   hypothetical protein
EcSMS35_0618  -1      14       1.679265   carbon starvation protein A
EcSMS35_0619  -1      16       -2.894813  hypothetical protein
EcSMS35_0620  -1      16       -3.255808  hypothetical protein
EcSMS35_0621  -1      17       -4.616817  putative aminotransferase
EcSMS35_0622  -1      18       -4.404710  immunoglobulin-binding regulator family protein
EcSMS35_0623  -1      19       -4.169421  phosphoadenosine phosphosulfate reductase family
EcSMS35_0624  -1      21       -3.292933  LysR family transcriptional regulator
EcSMS35_0625  0       23       -0.491296  disulfide isomerase/thiol-disulfide oxidase
EcSMS35_0626  0       25       0.821780   alkyl hydroperoxide reductase subunit C
EcSMS35_0627  0       17       1.456092   alkyl hydroperoxide reductase subunit F
EcSMS35_0628  0       15       2.182175   universal stress protein G
EcSMS35_0629  -1      17       3.170566   nucleoside diphosphate kinase regulator
EcSMS35_0630  -1      19       3.609103   ribonuclease I
EcSMS35_0631  -2      21       4.180369   sodium:sulfate symporter family protein
EcSMS35_0632  -2      20       3.742184   triphosphoribosyl-dephospho-CoA synthase
EcSMS35_0633  -1      16       2.331009   2'-(5''-triphosphoribosyl)-3'-dephospho-CoA:apo-
EcSMS35_0634  -1      14       1.558333   citrate lyase, alpha subunit
EcSMS35_0635  -1      11       0.406246   citrate (pro-3S)-lyase, beta subunit
EcSMS35_0636  1       13       0.067294   citrate lyase subunit gamma
EcSMS35_0637  0       13       -0.436158  [citrate (pro-3S)-lyase] ligase
EcSMS35_0638  1       13       -0.406011  sensor histidine kinase DpiB
EcSMS35_0639  2       14       0.458621   two-component response regulator DpiA
EcSMS35_0640  2       14       0.604527   C4-dicarboxylate transporter DcuC
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0601  ENTSNTHTASED  268    6e-94    Enterobactin synthetase component D signature.
		>ENTSNTHTASED#Enterobactin synthetase component D signature.

Length = 234

Score = 268 bits (685), Expect = 6e-94
Identities = 106/183 (57%), Positives = 130/183 (71%), Gaps = 1/183 (0%)

Query: 1 MKTTHTALPFAGNTLHFVKFDPASFCEQDLLWLPHYAQLQHAGRKRKTEHLAGRIAAVYA 60
M T+H LPFAG+ LH V FD +SF E DLLWLPH+ +L+ AGRKRK EHLAGRIAAV+A
Sbjct: 1 MLTSHFPLPFAGHRLHIVDFDASSFREHDLLWLPHHDRLRSAGRKRKAEHLAGRIAAVHA 60

Query: 61 LREYGYKCVPAIGELRQPVWPAGVYGSISHCGTTALAVVSRQPIGIDIEEIFSAQTAAEL 120
LRE G + VP +G+ RQP+WP G++GSISHC TTALAV+SRQ IGIDIE+I S TA EL
Sbjct: 61 LREVGVRTVPGMGDKRQPLWPDGLFGSISHCATTALAVISRQRIGIDIEKIMSQHTATEL 120

Query: 121 TDNIITPAEHKRLADCGLAFSLALTLAFSAKESAFKA-CERQTDAGFLDYQIINWNKQQV 179
+II E + L L F LALTLAFSAKES +KA +R T GF ++ + +
Sbjct: 121 APSIIDSDERQILQASLLPFPLALTLAFSAKESVYKAFSDRVTLPGFNSAKVTSLTATHI 180

Query: 180 IIH 182
+H
Sbjct: 181 SLH 183


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0611  TCRTETA       36     2e-04    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 36.3 bits (84), Expect = 2e-04
Identities = 82/394 (20%), Positives = 145/394 (36%), Gaps = 38/394 (9%)

Query: 27 FISIVSLGLLGVAVPVQIQMMTHSTWQV---GLSVTLTGGAMFVGLMVGGVLADRYERKK 83
+ V +GL+ +P ++ + HS G+ + L F V G L+DR+ R+
Sbjct: 15 ALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRP 74

Query: 84 VILLARGTCGIGFIGLCLNALL--PEPSLLAIYLLGLWDGFFASLGVTALLAATPALVGR 141
V+L + G ++ + P L +Y+ + G + G A A +
Sbjct: 75 VLL-------VSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVA-GAYIADITDG 126

Query: 142 ENLMQAGAITMLTVRLGSVISPMIGGLLLATGGVAWNYGLAAAGTFITLLPLLSLPALPP 201
+ + G V P++GGL+ GG + + AA L L LP
Sbjct: 127 DERARHFGFMSACFGFGMVAGPVLGGLM---GGFSPHAPFFAAAALNGLNFLTGCFLLPE 183

Query: 202 PPQPREHPLK----SLLAGFRFLLASPLVGGIALLGGLLTMAS----AVRVLYPALADNW 253
+ PL+ + LA FR+ +V + + ++ + A+ V++ D +
Sbjct: 184 SHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFG--EDRF 241

Query: 254 QMSAAQIGFLYAAIP-LGAAIGALTSGKLAHSARPGLLMLLSTLGS---FLAIGLFGLMP 309
A IG AA L + A+ +G +A ++L + ++ +
Sbjct: 242 HWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGW 301

Query: 310 MWILGVVCLALFGWLSAVSSLLQYTMLQTQTPEAMLGRINGLWTAQNVTGDAIGAALLGG 369
M +V LA G ML Q E G++ G A +G L
Sbjct: 302 MAFPIMVLLASGGIGMPALQ----AMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTA 357

Query: 370 LGAMMTPVASASASGFGLLIIGVLLLLVLVELRR 403
+ A + + +G+ + L LL L LRR
Sbjct: 358 IYA----ASITTWNGWAWIAGAALYLLCLPALRR 387


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0612  FERRIBNDNGPP  63     2e-13    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 63.0 bits (153), Expect = 2e-13
Identities = 61/285 (21%), Positives = 102/285 (35%), Gaps = 35/285 (12%)

Query: 40 HTLESQPQRIVSTSVTLTGSLLAIDAPVIASGATTPNNRVADDQGFLRQWSKVAKERKLQ 99
H P RIV+ LLA+ VAD + R W E L
Sbjct: 29 HAAAIDPNRIVALEWLPVELLLALGIVPYG---------VADTINY-RLW---VSEPPLP 75

Query: 100 RLYIG-----EPSAEAVAAQMPDLILISATGGDSALALYDQLSTIAPTLIINYDDKS--- 151
I EP+ E + P ++ SA G S + L+ IAP N+ D
Sbjct: 76 DSVIDVGLRTEPNLELLTEMKPSFMVWSAGYGPS----PEMLARIAPGRGFNFSDGKQPL 131

Query: 152 --WQSLLTQLGEITGHEKQAAERIAQFDKQLAAAKEQIKLPPQPVTAIVYTAAAHSANLW 209
+ LT++ ++ + A +AQ++ + + K + + ++
Sbjct: 132 AMARKSLTEMADLLNLQSAAETHLAQYEDFIRSMKPRFVKRGARPLLLTTLIDPRHMLVF 191

Query: 210 TPESAQGQMLEQLGFTLAKLPAGLNASQSQGKRHDIIQLGGENLAAGLNGESLFLFAGDQ 269
P S ++L++ G NA Q + + + LAA + + L +
Sbjct: 192 GPNSLFQEILDEYGIP--------NAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDHDNS 243

Query: 270 KDADAIYANPLLAHLPAVQNKQVYALGTETFRLDYYSAMQVLERL 314
KD DA+ A PL +P V+ + + F SAM + L
Sbjct: 244 KDMDALMATPLWQAMPFVRAGRFQRVPAVWFYGATLSAMHFVRVL 288


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0615  ISCHRISMTASE  442    e-160    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 442 bits (1139), Expect = e-160
Identities = 146/299 (48%), Positives = 195/299 (65%), Gaps = 18/299 (6%)

Query: 1 MAIPKLQAYALPDSHDIPQNKVDWAFEPQRAALLIHDMQDYFVSFWGENCPMMEQVIANI 60
MAIP +Q Y +P + D+PQNKV W +P RA LLIHDMQ+YFV + + ++ ANI
Sbjct: 1 MAIPAIQPYQMPTASDMPQNKVSWVPDPNRAVLLIHDMQNYFVDAFTAGASPVTELSANI 60

Query: 61 AALRDYCKQHNIPVYYTAQPKEQSDEDRALLNDMWGPGLTRSPEQQKVVDRLTPDADDTV 120
L++ C Q IPV YTAQP Q+ +DRALL D WGPGL P ++K++ L P+ DD V
Sbjct: 61 RKLKNQCVQLGIPVVYTAQPGSQNPDDRALLTDFWGPGLNSGPYEEKIITELAPEDDDLV 120

Query: 121 LVKWRYSAFHRSPLEQMLKESGRNQLIITGVYAHIGCMTTATDAFMRDIKPFMVADALAD 180
L KWRYSAF R+ L +M+++ GR+QLIITG+YAHIGC+ TA +AFM DIK F V DA+AD
Sbjct: 121 LTKWRYSAFKRTNLLEMMRKEGRDQLIITGIYAHIGCLVTACEAFMEDIKAFFVGDAVAD 180

Query: 181 FSRDEHLMSLKYVAGRSGRVVMTEELL------PAPIPASKA-----------ALREVIL 223
FS ++H M+L+Y AGR VMT+ LL PA + + A +R+ I
Sbjct: 181 FSLEKHQMALEYAAGRCAFTVMTDSLLDQLQNAPADVQKTSANTGKKNVFTCENIRKQIA 240

Query: 224 PLLDESDEPFDDD-NLIDYGLDSVRMMALAARWRKVHGDIDFVMLAKNPTIDAWWKLLS 281
LL E+ E D +L+D GLDSVR+M L +WR+ ++ FV LA+ PTI+ W KLL+
Sbjct: 241 ELLQETPEDITDQEDLLDRGLDSVRIMTLVEQWRREGAEVTFVELAERPTIEEWQKLLT 299


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0616  DHBDHDRGNASE  361    e-130    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase

signature.
Length = 261

Score = 361 bits (928), Expect = e-130
Identities = 109/258 (42%), Positives = 150/258 (58%), Gaps = 20/258 (7%)

Query: 5 GKNVWVTGAGKGIGYATALAFVEAGAKVTGFD---------------QAFAQEQYPFATE 49
GK ++TGA +GIG A A GA + D +A E +P
Sbjct: 8 GKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFP---- 63

Query: 50 VMDVADAAQVAQVCQRLLAQTERLDVLVNAAGILRMGATDQLSKEDWQQTFAVNVGGAFN 109
DV D+A + ++ R+ + +D+LVN AG+LR G LS E+W+ TF+VN G FN
Sbjct: 64 -ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFN 122

Query: 110 LFQQTMNQFRRQRGGAIVTVASDAAHTPRIGMSAYGASKAALKSLALSVGLELAGSGVRC 169
+ +R G+IVTV S+ A PR M+AY +SKAA +GLELA +RC
Sbjct: 123 ASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRC 182

Query: 170 NVVSPGSTDTDMQRTLWVSDDAEEQRIRGFGEQFKLGIPLGKIARPQEIANTILFLASDL 229
N+VSPGST+TDMQ +LW ++ EQ I+G E FK GIPL K+A+P +IA+ +LFL S
Sbjct: 183 NIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQ 242

Query: 230 ASHITLQDIVVDGGSTLG 247
A HIT+ ++ VDGG+TLG
Sbjct: 243 AGHITMHNLCVDGGATLG 260


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0625  BCTLIPOCALIN  28     0.017    Bacterial lipocalin signature.
		>BCTLIPOCALIN#Bacterial lipocalin signature.

Length = 171

Score = 28.4 bits (63), Expect = 0.017
Identities = 18/98 (18%), Positives = 39/98 (39%), Gaps = 13/98 (13%)

Query: 50 QGITIIKTFDAPGGMKGYLGKYQDMGVTIYLTPDGKHAISG--YMYNEKGENLSNTLIEK 107
+ + + F+ YLGK+ ++ + G ++ + N+ G ++ N
Sbjct: 21 ESVKPVSDFEL----NNYLGKWYEVARLDHSFERGLSQVTAEYRVRNDGGISVLN----- 71

Query: 108 EIYAPAGREMWQRMEQSHWLLDGKKDAPVIVYVFADPF 145
Y+ + W+ E + ++G D + V F PF
Sbjct: 72 RGYSEE-KGEWKEAEGKAYFVNGSTDGYLKVSFFG-PF 107


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0636  PF03944       27     0.009    delta endotoxin
		>PF03944#delta endotoxin

Length = 633

Score = 27.3 bits (60), Expect = 0.009
Identities = 12/43 (27%), Positives = 24/43 (55%), Gaps = 3/43 (6%)

Query: 21 IAPLDTQDIDLQINSSVEKQFG---DAIRTTILDVLARYNVRG 60
I+P+ ++ Q + + ++FG D++R + ARY +RG
Sbjct: 496 ISPIHATQVNNQTRTFISEKFGNQGDSLRFEQNNTTARYTLRG 538


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0637  LPSBIOSNTHSS  39     1e-05    Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein

signature.
Length = 166

Score = 38.6 bits (90), Expect = 1e-05
Identities = 14/67 (20%), Positives = 33/67 (49%), Gaps = 2/67 (2%)

Query: 155 NPFTNGHRYLIQQAAAQCDWLHLFLVKEDSSR--FPYEDRLDLVLKGTADIPRLTVHRGS 212
+P T GH +I++ D +++ +++ + + F ++RL+ + K A +P V
Sbjct: 10 DPITFGHLDIIERGCRLFDQVYVAVLRNPNKQPMFSVQERLEQIAKAIAHLPNAQVDSFE 69

Query: 213 EYIISRA 219
++ A
Sbjct: 70 GLTVNYA 76


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0639  HTHFIS        62     2e-13    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 62.2 bits (151), Expect = 2e-13
Identities = 28/121 (23%), Positives = 51/121 (42%), Gaps = 5/121 (4%)

Query: 1 MTAPLTLLIVEDETPLAEMHAEYIRHIPGFSQILLAGNLAQARMMIERFKPGLILLDNYL 60
MT T+L+ +D+ + + + + G+ + + N A I L++ D +
Sbjct: 1 MTGA-TILVADDDAAIRTVLNQALS-RAGY-DVRITSNAATLWRWIAAGDGDLVVTDVVM 57

Query: 61 PDGRGINLLHELVQAHYPG-DVVFTTAASDMETVSEAVRCGVFDYLIKPIAYERLGQTLT 119
PD +LL + + P V+ +A + T +A G +DYL KP L +
Sbjct: 58 PDENAFDLLPRI-KKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIG 116

Query: 120 R 120
R
Sbjct: 117 R 117


13  EcSMS35_0663  EcSMS35_0680  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
EcSMS35_0663  2       25       -4.418407  hypothetical protein
EcSMS35_0665  3       27       -4.772608  hypothetical protein
EcSMS35_0664  2       24       -4.413292  hypothetical protein
EcSMS35_0666  0       22       -3.987495  hypothetical protein
EcSMS35_0667  -1      19       -3.443818  DnaJ domain-containing protein
EcSMS35_0668  0       15       -2.844082  hypothetical protein
EcSMS35_0669  -1      13       -2.120558  hypothetical protein
EcSMS35_0670  -1      12       -1.625337  DnaJ domain-containing protein
EcSMS35_0671  0       13       -1.311521  DnaK family protein HscC
EcSMS35_0672  2       17       -0.721279  ribonucleoside hydrolase 1
EcSMS35_0673  2       20       -0.729306  glutamate/aspartate ABC transporter ATP-binding
EcSMS35_0674  -1      18       0.690944   glutamate/aspartate ABC transporter permease
EcSMS35_0675  -1      18       0.819787   glutamate/aspartate ABC transporter permease
EcSMS35_0676  0       17       0.691035   glutamate and aspartate transporter subunit
EcSMS35_0677  1       20       1.845071   hypothetical protein
EcSMS35_0679  1       19       1.917014   hypothetical protein
EcSMS35_0678  2       22       2.221533   apolipoprotein N-acyltransferase
EcSMS35_0680  2       19       2.377465   magnesium and cobalt efflux protein CorC
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0671  SHAPEPROTEIN  97     3e-24    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein

signature.
Length = 347

Score = 97.1 bits (242), Expect = 3e-24
Identities = 80/346 (23%), Positives = 140/346 (40%), Gaps = 40/346 (11%)

Query: 8 IGIDLGTTNSLIAVWKDGAAQLIPNKFGEYLTPSIISMDENNH------ILVGKPAVSRR 61
+ IDLGT N+LI V K L PS++++ ++ VG A
Sbjct: 13 LSIDLGTANTLIYV-KGQGIVLN--------EPSVVAIRQDRAGSPKSVAAVGHDAKQML 63

Query: 62 TSHPDKTAALFKRAMGSNTNWRLGSDTFNAPELSSLVLRSLKEDAEEFLQRPIKDVVISV 121
P AA+ R M +D F ++ ++ + ++ RP V++ V
Sbjct: 64 GRTPGNIAAI--RPMKDGVI----ADFFVTEKMLQHFIKQVHSNS---FMRPSPRVLVCV 114

Query: 122 PAYFSDEQRKHTRLAAELAGLNAVRLINEPTAAAMAYGLHTQQNTRSLVFDLGGGTFDVT 181
P + +R+ R +A+ AG V LI EP AAA+ GL + T S+V D+GGGT +V
Sbjct: 115 PVGATQVERRAIRESAQGAGAREVFLIEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVA 174

Query: 182 VLEYATPVIEVHASAGDNFLGGEDFTHMLVDEVLKRAAVAKNMLNESELAALYTSVEAAK 241
V+ V + +GG+ F +++ V + ++ E+ + + +A
Sbjct: 175 VISLNGVVY-----SSSVRIGGDRFDEAIINYVRRNYGS---LIGEATAERIKHEIGSAY 226

Query: 242 CSNQLPLQISWQYQEET----RECEFYENELEDLWLPLLNRLRVPIEQALRDA--RLKPS 295
++ +I + + R NE+ + L + + AL L
Sbjct: 227 PGDE-VREIEVRGRNLAEGVPRGFTLNSNEILEALQEPLTGIVSAVMVALEQCPPELASD 285

Query: 296 QID-SLVLVGGASQMPLVQRIAVRLFGKLPYQSYDPSTIVALGAAI 340
+ +VL GG + + + R+ + G + DP T VA G
Sbjct: 286 ISERGMVLTGGGALLRNLDRLLMEETGIPVVVAEDPLTCVARGGGK 331


14  EcSMS35_0715  EcSMS35_0742  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
EcSMS35_0715  -2      13       3.320016   ornithine decarboxylase
EcSMS35_0716  -1      17       4.868209   DNA-binding transcriptional activator KdpE
EcSMS35_0717  -2      15       4.279470   sensor protein KdpD
EcSMS35_0718  -1      14       3.331390   potassium-transporting ATPase subunit C
EcSMS35_0719  -2      14       2.714163   potassium-transporting ATPase subunit B
EcSMS35_0720  -1      14       2.040565   potassium-transporting ATPase subunit A
EcSMS35_0721  -2      14       1.829132   hypothetical protein
EcSMS35_0722  -1      13       2.297639   hypothetical protein
EcSMS35_0723  -1      13       3.218606   deoxyribodipyrimidine photolyase
EcSMS35_0724  -1      13       2.938090   amino acid/peptide transporter
EcSMS35_0725  0       16       4.012960   putative hydrolase-oxidase
EcSMS35_0726  0       15       3.039167   allophanate hydrolase, subunit 1
EcSMS35_0727  -1      10       1.783128   allophanate hydrolase, subunit 2
EcSMS35_0728  -2      11       0.883450   LamB/YcsF family protein
EcSMS35_0730  -2      15       -0.159163  endonuclease VIII
EcSMS35_0729  -1      21       1.209786   protein AbrB
EcSMS35_0731  1       23       0.370398   hypothetical protein
EcSMS35_0732  2       27       2.424069   type II citrate synthase
EcSMS35_0734  3       26       3.245099   succinate dehydrogenase cytochrome b556 large
EcSMS35_0735  2       28       3.556008   succinate dehydrogenase cytochrome b556 small
EcSMS35_0736  2       30       3.552131   succinate dehydrogenase flavoprotein subunit
EcSMS35_0737  2       29       2.983635   succinate dehydrogenase iron-sulfur subunit
EcSMS35_0738  2       26       1.999783   2-oxoglutarate dehydrogenase E1 component
EcSMS35_0739  -1      20       0.721064   dihydrolipoamide succinyltransferase
EcSMS35_0740  -1      14       -0.964506  succinyl-CoA synthetase subunit beta
EcSMS35_0741  0       16       -2.275808  succinyl-CoA synthetase subunit alpha
EcSMS35_0742  0       21       -3.688751  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0716  HTHFIS        92     7e-24    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 92.2 bits (229), Expect = 7e-24
Identities = 35/125 (28%), Positives = 58/125 (46%), Gaps = 1/125 (0%)

Query: 2 TNVLIVEDEQAIRRFLRTALEGDGMRVFEAETLQRGLLEAATRKPDLIILDLGLPDGDGI 61
+L+ +D+ AIR L AL G V A DL++ D+ +PD +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 62 EFIRDLRQWSA-VPVIVLSARSEESDKIAALDAGADDYLSKPFGIGELQARLRVALRRHS 120
+ + +++ +PV+V+SA++ I A + GA DYL KPF + EL + AL
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 121 ATTAP 125
+
Sbjct: 124 RRPSK 128


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0717  PF06580       32     0.012    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 31.8 bits (72), Expect = 0.012
Identities = 10/48 (20%), Positives = 21/48 (43%), Gaps = 4/48 (8%)

Query: 785 LLENAVKYAGAQAE----IGIDAHVEGENLQLDVWDNGPGLPPGQEQT 828
L+EN +K+ AQ I + + + L+V + G +++
Sbjct: 263 LVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKES 310


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0737  TCRTETOQM     31     0.003    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family

signature.
Length = 639

Score = 31.4 bits (71), Expect = 0.003
Identities = 11/41 (26%), Positives = 23/41 (56%), Gaps = 1/41 (2%)

Query: 14 VDDAPRMQDYTLEAEEGRDM-MLLDALIQLKEKDPSLSFRR 53
+++ + T+E + + MLLDAL+++ + DP L +
Sbjct: 339 IENPLPLLQTTVEPSKPQQREMLLDALLEISDSDPLLRYYV 379


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0739  RTXTOXIND     30     0.020    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 29.8 bits (67), Expect = 0.020
Identities = 27/196 (13%), Positives = 56/196 (28%), Gaps = 12/196 (6%)

Query: 48 EVPASADGILDAVLEDEGTTVTSRQILGRLREGNSAGKETSAKSE-EKASTPAQRQQASL 106
E+ + I+ ++ EG +V +L +L + +S +A R Q
Sbjct: 98 EIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILS 157

Query: 107 EEQNNDAL----SPAIRRLLAEHNLDASAIKGTGVGGRLTRED----VEKHLAKAPAKES 158
+ L P + + T ++ E +L K A+
Sbjct: 158 RSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERL 217

Query: 159 APAAAAPAAQPALAARSEKRVPMTRLRKRVA---ERLLEAKNSTAMLTTFNEVNMKPIMD 215
A + + + L + A +LE +N V +
Sbjct: 218 TVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQ 277

Query: 216 LRKQYGEAFEKRHGIR 231
+ + A E+ +
Sbjct: 278 IESEILSAKEEYQLVT 293


15  EcSMS35_0751  EcSMS35_0762  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
EcSMS35_0751  2       18       -2.113685  putative glutamate mutase mutL
EcSMS35_0752  3       21       -3.158161  methylaspartate mutase subunit S
EcSMS35_0753  2       22       -3.149355  hypothetical protein
EcSMS35_0754  2       20       -1.924797  hypothetical protein
EcSMS35_0755  2       22       0.753731   cytochrome d ubiquinol oxidase, subunit I
EcSMS35_0756  1       18       0.479218   cytochrome d ubiquinol oxidase, subunit II
EcSMS35_0757  7       18       -0.151197  cyd operon protein YbgT
EcSMS35_0758  3       19       0.376222   hypothetical protein
EcSMS35_0759  4       23       0.414029   acyl-CoA thioester hydrolase YbgC
EcSMS35_0760  3       22       0.297096   colicin uptake protein TolQ
EcSMS35_0761  4       20       0.282804   colicin uptake protein TolR
EcSMS35_0762  4       18       -0.188527  cell envelope integrity inner membrane protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0751  PF03309       32     0.005    Bvg accessory factor
		>PF03309#Bvg accessory factor

Length = 271

Score = 31.7 bits (72), Expect = 0.005
Identities = 10/55 (18%), Positives = 23/55 (41%)

Query: 3 IVSVDIGSTWTKAALFAREGDALTLVNHVLTPTTTHHLAKGFFSSLNLVLNVDNA 57
++++D+ +T T L + GD +V T A +++ ++ D
Sbjct: 2 LLAIDVRNTHTVVGLISGSGDHAKVVQQWRIRTEPEVTADELALTIDGLIGDDAE 56


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0762  IGASERPTASE   61     6e-12    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 60.8 bits (147), Expect = 6e-12
Identities = 34/207 (16%), Positives = 71/207 (34%), Gaps = 8/207 (3%)

Query: 99 EQERLKQLEKERLAAQEQKKQAEEAAKQAELKQKQAEEAAAKAAADAKAKAEADAKAAEE 158
E E+ Q QA+ + E A A ++ E
Sbjct: 984 EVEKRNQTVDTTNITTPNNIQADVPSVP----SNNEEIARVDEAPVPPPAPATPSETTET 1039

Query: 159 AAK--KAAADAKKKAEAEAAKAAAEAQKKAEAAAAALKKKAEAAEAA--AAEARKKAATE 214
A+ K + +K E +A + A+ ++ A+ A + +K + E A +E ++ TE
Sbjct: 1040 VAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTE 1099

Query: 215 AAEKAKAEAEKKAAAEKAAADKKAAAEKAAADKKAAEKAAAEKAAADKKAAAEKAAADKK 274
E A E E+KA E + + K+ + +A ++ + +
Sbjct: 1100 TKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQ 1159

Query: 275 AAAAKAAAAKAAAEKAAAAKAAAEADD 301
+ A + A++ ++ +
Sbjct: 1160 SQTNTTADTEQPAKETSSNVEQPVTES 1186



Score = 56.2 bits (135), Expect = 1e-10
Identities = 28/230 (12%), Positives = 74/230 (32%), Gaps = 2/230 (0%)

Query: 66 RMQSQESSAKRSDEQRKMKEQQAAEELREKQAAEQERLKQLEKERLAAQEQKKQAEEAAK 125
R ++E+ + + + Q+ E +E Q E + +EKE A E +K E
Sbjct: 1066 REVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKV 1125

Query: 126 QAELKQKQAEEAAAKAAADAKAKAEADAKAAEEAAK--KAAADAKKKAEAEAAKAAAEAQ 183
+++ KQ + + A+ + + E ++ A + E + +
Sbjct: 1126 TSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTE 1185

Query: 184 KKAEAAAAALKKKAEAAEAAAAEARKKAATEAAEKAKAEAEKKAAAEKAAADKKAAAEKA 243
++ + E A + + + K + ++ ++ +++
Sbjct: 1186 STTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRS 1245

Query: 244 AADKKAAEKAAAEKAAADKKAAAEKAAADKKAAAAKAAAAKAAAEKAAAA 293
+D +A A+ A + A ++ + +
Sbjct: 1246 TVALCDLTSTNTNAVLSDARAKAQFVALNVGKAVSQHISQLEMNNEGQYN 1295



Score = 56.2 bits (135), Expect = 2e-10
Identities = 32/241 (13%), Positives = 87/241 (36%), Gaps = 13/241 (5%)

Query: 68 QSQESSAKRSDEQRKMKEQQAAEELREKQAAEQERLKQLEKERLAAQEQKKQAEEAAKQA 127
Q+ S ++E+ ++ +E ++ + +K E+ A +
Sbjct: 1004 QADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKN--EQDATET 1061

Query: 128 ELKQKQ-AEEAAAKAAAD------AKAKAEADAKAAEEAAKKAAADAKKKAEAEAAKAAA 180
+ ++ A+EA + A+ A++ +E E + A + ++KA+ E K
Sbjct: 1062 TAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQE 1121

Query: 181 EAQKKAEAAAAALKKKAEAAEAAAAEARKKAATEAAEKAKAEAEKKAAAEKAAADKKAAA 240
+ ++ + ++++E + A AR+ T ++ +++ A E+ A + +
Sbjct: 1122 VPKVTSQVSPK--QEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNV 1179

Query: 241 EKAAADKKAAEK--AAAEKAAADKKAAAEKAAADKKAAAAKAAAAKAAAEKAAAAKAAAE 298
E+ + + E A + + + K ++ + A
Sbjct: 1180 EQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATT 1239

Query: 299 A 299
+
Sbjct: 1240 S 1240



Score = 55.1 bits (132), Expect = 3e-10
Identities = 33/257 (12%), Positives = 87/257 (33%), Gaps = 21/257 (8%)

Query: 51 DAVMVDSGAVVEQYKRMQSQESSAKRSDEQRKMKEQQAAE-ELREKQAAEQER------L 103
D V A + ++ ++K+ + + EQ A E + ++ A++ +
Sbjct: 1021 DEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANT 1080

Query: 104 KQLEKERLAAQEQKKQAEEAAKQAELKQKQAEEAAAKAAADAKAKAEADAKAAEEAAKKA 163
+ E + ++ ++ Q K+ +K+ KAK E + +E K
Sbjct: 1081 QTNEVAQSGSETKETQ-TTETKETATVEKE-----------EKAKVETEKT--QEVPKVT 1126

Query: 164 AADAKKKAEAEAAKAAAEAQKKAEAAAAALKKKAEAAEAAAAEARKKAATEAAEKAKAEA 223
+ + K+ ++E + AE ++ + + +++ A E K + E+ E+
Sbjct: 1127 SQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTES 1186

Query: 224 EKKAAAEKAAADKKAAAEKAAADKKAAEKAAAEKAAADKKAAAEKAAADKKAAAAKAAAA 283
+ + +E + K + + + ++ +
Sbjct: 1187 TTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRST 1246

Query: 284 KAAAEKAAAAKAAAEAD 300
A + + A +D
Sbjct: 1247 VALCDLTSTNTNAVLSD 1263


16  EcSMS35_0812  EcSMS35_0820  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
EcSMS35_0812  -1      21       3.087055   cardiolipin synthase 2
EcSMS35_0813  -1      21       3.595519   endonuclease/exonuclease/phosphatase family
EcSMS35_0815  -2      21       3.390391   hypothetical protein
EcSMS35_0814  -1      19       3.739238   ABC-2 type transporter, permease
EcSMS35_0816  -1      17       4.067294   ABC-2 type transporter, permease
EcSMS35_0817  -2      15       4.055157   ABC transporter ATP-binding protein
EcSMS35_0818  -2      13       3.565065   hypothetical protein
EcSMS35_0819  -1      13       3.295531   putative DNA-binding transcriptional regulator
EcSMS35_0820  -1      13       3.568990   ATP-dependent RNA helicase RhlE
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0814  ABC2TRNSPORT  47     3e-08    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein

signature.
Length = 262

Score = 47.2 bits (112), Expect = 3e-08
Identities = 36/146 (24%), Positives = 63/146 (43%), Gaps = 5/146 (3%)

Query: 197 AREREQGTLDQLLVSPLTTWQIFIGKAVPALIVATFQATIVLAIGIWAYQIPFAGSLALF 256
R Q T + +L + L I +G+ A A IG+ A + + L+L
Sbjct: 92 GRMEGQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGA---GIGVVAAALGYTQWLSLL 148

Query: 257 YFTMVI--YGLSLVGFGLLISSLCSTQQQAFIGVFVFMMPAILLSGYVSPVENMPVWLQN 314
Y VI GL+ G+++++L + + + P + LSG V PV+ +P+ Q
Sbjct: 149 YALPVIALTGLAFASLGMVVTALAPSYDYFIFYQTLVITPILFLSGAVFPVDQLPIVFQT 208

Query: 315 LTWINPIRHFTDITKQIYLKDASLDI 340
P+ H D+ + I L +D+
Sbjct: 209 AARFLPLSHSIDLIRPIMLGHPVVDV 234


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0817  PF05272       31     0.012    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 31.2 bits (70), Expect = 0.012
Identities = 20/90 (22%), Positives = 28/90 (31%), Gaps = 21/90 (23%)

Query: 293 TPRFEDAFIDLLGGAGTSESPLGAILHTVEGTPGETVIEAKELTKKFGDFAATDHVNFAV 352
PR E + +LG P + + + K HV +
Sbjct: 547 VPRLEKWLVHVLGKTPDDYKP-------------RRLRYLQLVGKYI----LMGHVARVM 589

Query: 353 KRGEIFG----LLGPNGAGKSTTFKMMCGL 378
+ G F L G G GKST + GL
Sbjct: 590 EPGCKFDYSVVLEGTGGIGKSTLINTLVGL 619



Score = 29.3 bits (65), Expect = 0.047
Identities = 11/23 (47%), Positives = 13/23 (56%)

Query: 34 YVTGLVGPDGAGKTTLMRMLAGL 56
Y L G G GK+TL+ L GL
Sbjct: 597 YSVVLEGTGGIGKSTLINTLVGL 619


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0818  RTXTOXIND     62     6e-13    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 62.2 bits (151), Expect = 6e-13
Identities = 42/259 (16%), Positives = 92/259 (35%), Gaps = 25/259 (9%)

Query: 82 ALMQAKAGVSVAQAQYDLMLAGYRDEEIAQAAAAVKQAQAAYDYAQNFYNRQQGLWKSRT 141
Q + + +A+ +LA E + + + + L +
Sbjct: 201 QKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENK 260

Query: 142 ISA--NDLENARSSRDQAQATLKSAQDKLRQYRSGNREQ---DIAQAKASLEQAQAQLAQ 196
N+L +S +Q ++ + SA+++ + + + + Q ++ +LA+
Sbjct: 261 YVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAK 320

Query: 197 AELNLQDSTLIAPSDGTLLTRAV-EPGTVLNEGGTVFTVSLT-RPVWVRAYVDERNLDQA 254
E Q S + AP + V G V+ T+ + + V A V +++
Sbjct: 321 NEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFI 380

Query: 255 QPGRKVLLYTDGRPDKPYH---GQIGFVSPTAEFTPKTVETPDLRTDLVYRLRIVVT--- 308
G+ ++ + P Y G++ ++ A D R LV+ + I +
Sbjct: 381 NVGQNAIIKVEAFPYTRYGYLVGKVKNINLDA--------IEDQRLGLVFNVIISIEENC 432

Query: 309 ----DADDALRQGMPVTVQ 323
+ + L GM VT +
Sbjct: 433 LSTGNKNIPLSSGMAVTAE 451


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
EcSMS35_0819  HTHTETR  72     1e-17    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 72.0 bits (176), Expect = 1e-17
Identities = 32/220 (14%), Positives = 74/220 (33%), Gaps = 29/220 (13%)

Query: 9 KGEQAKKQLIAAALAQFGEYGMNATT-REIAAQAGQNIAAITYYFGSKEDLYLACAQWIA 67
+ ++ ++ ++ AL F + G+++T+ EIA AG AI ++F K DL+ +
Sbjct: 8 EAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSE 67

Query: 68 DFIGEQFRPHAEEAERLFAQPQPDRAAIRELILRACRNMIKLLTQDDTVNLSK---FISR 124
IGE E + P + +RE+++ + + + + +
Sbjct: 68 SNIGELEL---EYQAKFPGDP---LSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVG 121

Query: 125 EQLSPTAAYHLVHEQVISPLHSHLTRLIAAW---TGCDASDTRMILHTHALIGEILAFRL 181
E A + + + L I A +I+ I ++
Sbjct: 122 EMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIM--RGYISGLM---- 175

Query: 182 GKETILLRTGWTAFDEEKTELINQTVTCHIDLILQGLSQR 221
W + + + ++ ++L+
Sbjct: 176 --------ENWLFAPQSFD--LKKEARDYVAILLEMYLLC 205


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits  Score  E-value  Comments
EcSMS35_0820  SECA  30     0.025    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 29.8 bits (67), Expect = 0.025
Identities = 20/67 (29%), Positives = 34/67 (50%), Gaps = 4/67 (5%)

Query: 246 QQVLVFTRTKHGANHLAEQLNKDGIRSAAIHG-NKSQGARTRALADFKSGDIRVLVATDI 304
Q VLV T + + ++ +L K GI+ ++ + A A A + + V +AT++
Sbjct: 450 QPVLVGTISIEKSELVSNELTKAGIKHNVLNAKFHANEAAIVAQAGYPAA---VTIATNM 506

Query: 305 AARGLDI 311
A RG DI
Sbjct: 507 AGRGTDI 513
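
The hit rows can be compared against the E-value cutoff and ranked by significance. The sketch below is illustrative only: the cutoff of 0.05 is taken from the RPSBlast E-value listed in the input parameters, the "less than or equal" comparison is an assumption, and rank_hits is a hypothetical helper rather than anything PredictBias exposes.

```python
# Assumed cutoff: the RPSBlast E-value threshold (0.05) from the input parameters.
E_VALUE_CUTOFF = 0.05

# (locus tag, hit, E-value as printed) for four of the hits listed above.
hits = [
    ("EcSMS35_0817", "PF05272", "0.012"),
    ("EcSMS35_0818", "RTXTOXIND", "6e-13"),
    ("EcSMS35_0819", "HTHTETR", "1e-17"),
    ("EcSMS35_0820", "SECA", "0.025"),
]

def rank_hits(rows, cutoff=E_VALUE_CUTOFF):
    """Keep rows at or below the cutoff, most significant (smallest E-value) first."""
    kept = [(locus, hit, float(e)) for locus, hit, e in rows if float(e) <= cutoff]
    return sorted(kept, key=lambda row: row[2])

ranked = rank_hits(hits)
for locus, hit, e_value in ranked:
    print(f"{locus}\t{hit}\t{e_value:g}")
```

Ranking this way separates the strong family matches (HTHTETR, RTXTOXIND) from the short, weak matches that sit just under the cutoff.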


17  EcSMS35_0848  EcSMS35_0859  Y  N  N  Genomic Island
LocusTag      DNBias  CDNBias  %GCBias     Product
EcSMS35_0848  -2      15       4.038508    formate C-acetyltransferase 3
EcSMS35_0849  -2      11       3.717625    glycyl-radical activating family protein
EcSMS35_0850  -2      12       3.194345    fructose-6-phosphate aldolase
EcSMS35_0851  -2      12       3.124588    molybdopterin biosynthesis protein MoeB
EcSMS35_0852  -1      13       2.755627    molybdopterin biosynthesis protein MoeA
EcSMS35_0853  0       14       -1.026759   L-asparaginase
EcSMS35_0854  0       15       -2.610711   glutathione transporter ATP-binding protein
EcSMS35_0855  1       15       -4.098869   glutathione ABC transporter periplasmic
EcSMS35_0856  1       15       -5.732467   glutathione ABC transporter permease GsiC
EcSMS35_0857  0       11       -5.430997   glutathione ABC transporter permease GsiD
EcSMS35_0858  0       11       -6.327509   cyclic diguanylate phosphodiesterase
EcSMS35_0859  1       10       -3.060589   diguanylate cyclase
18  EcSMS35_0889  EcSMS35_0896  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias     Product
EcSMS35_0889  -1      15       -5.959807   arginine transporter permease subunit ArtM
EcSMS35_0890  -2      17       -6.290170   arginine transporter permease subunit ArtQ
EcSMS35_0891  -1      15       -5.713186   arginine ABC transporter periplasmic
EcSMS35_0892  1       15       -4.600769   arginine transporter ATP-binding subunit
EcSMS35_0893  1       16       -3.293468   putative lipoprotein
EcSMS35_0894  1       13       -2.505185   hypothetical protein
EcSMS35_0895  -1      14       3.521963    hypothetical protein
EcSMS35_0897  -2      13       3.638845    N-acetylmuramoyl-L-alanine amidase AmiD
EcSMS35_0896  -2      15       3.266339    NAD-dependent epimerase/dehydratase family
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
EcSMS35_0892  PF05272  30     0.010    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.0 bits (67), Expect = 0.010
Identities = 9/18 (50%), Positives = 12/18 (66%)

Query: 31 LVLLGPSGAGKSSLLRVL 48
+VL G G GKS+L+ L
Sbjct: 599 VVLEGTGGIGKSTLINTL 616


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits        Score  E-value  Comments
EcSMS35_0897  ECOLIPORIN  31     0.007    E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 30.7 bits (69), Expect = 0.007
Identities = 20/54 (37%), Positives = 27/54 (50%), Gaps = 9/54 (16%)

Query: 2 RRVFWLVAAALLLAGCAGEKGIVEKEGYQLDTRRQAQAAYPRIKVMVIHYTADD 55
R+V LV ALL AG A I K+G +LD Y ++ + HY +DD
Sbjct: 3 RKVLALVIPALLAAGAAHAAEIYNKDGNKLDL-------YGKVDGL--HYFSDD 47


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0896  NUCEPIMERASE  75     2e-17    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 75.2 bits (185), Expect = 2e-17
Identities = 70/363 (19%), Positives = 123/363 (33%), Gaps = 65/363 (17%)

Query: 13 MKVLVTGATSGLGRNAVEFLCQKGISVRA---------TGRNEAMGKLLEKMGAEFVPAD 63
MK LVTGA +G + + L + G V +A +LL + G +F D
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKID 60

Query: 64 LTELVSSQAKVMLAGIDTLWHCS-------SFTSPWGTQQAFDLANVRATRRLGEWAVAW 116
L + + ++ S +P A+ +N+ + E
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPH----AYADSNLTGFLNILEGCRHN 116

Query: 117 GVRNFIHISSPSLYFDYHHHRDIKEDFRPHRFANEFARSKAASEEVINMLSQANPQTRFT 176
+++ ++ SS S+Y + D + +A +K A+E + + S T
Sbjct: 117 KIQHLLYASSSSVYGL-NRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSH-LYGLPAT 174

Query: 177 ILRPQSLFGPHDK--VFIPRLAHMMHHYGSILLPHGGSALVDMTYYENAVHAMWLASQEA 234
LR +++GP + + + + M SI + + G D TY ++ A+
Sbjct: 175 GLRFFTVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIRLQDVI 234

Query: 235 CDKLPS--------------GRVYNITNGEHRTLRSIVQKLIDELNIDCRIRSVPYPMLD 280
RVYNI N L +Q L D L I+ + +P D
Sbjct: 235 PHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAKKNMLPLQPGD 294

Query: 281 MIARSMERLGRKSAKEPPLTHYGVSKLNFDFTLDITRAQEELGYQPVLTLDEGIEKTAAW 340
+ T D E +G+ P T+ +G++ W
Sbjct: 295 V----------------LETS-----------ADTKALYEVIGFTPETTVKDGVKNFVNW 327

Query: 341 LRD 343
RD
Sbjct: 328 YRD 330


19  EcSMS35_0929  EcSMS35_0943  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias     Product
EcSMS35_0929  0       13       -4.033352   methionyl-tRNA synthetase
EcSMS35_0930  3       23       -5.736967   putative ATPase
EcSMS35_0931  3       26       -6.547299   hypothetical protein
EcSMS35_0932  3       24       -6.328146   fimbrial protein
EcSMS35_0934  2       24       -5.375531   fimbrial usher protein
EcSMS35_0935  1       26       -7.314999   hypothetical protein
EcSMS35_0936  -1      24       -5.608112   hypothetical protein
EcSMS35_0937  -1      28       -5.518608   nickel/cobalt efflux protein RcnA
EcSMS35_0938  -2      32       -6.200522   transcriptional repressor RcnR
EcSMS35_0939  0       35       -6.879209   phage integrase family site specific
EcSMS35_0940  0       36       -8.830406   hypothetical protein
EcSMS35_0941  0       23       -2.668083   AlpA family transcriptional regulator
EcSMS35_0942  0       22       -2.567011   hypothetical protein
EcSMS35_0943  2       24       -0.900027   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
EcSMS35_0934  PF00577  722    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 722 bits (1865), Expect = 0.0
Identities = 244/843 (28%), Positives = 392/843 (46%), Gaps = 35/843 (4%)

Query: 2 LRMTPLASAI---VALLLGIEAYAAEETFDTHFMIGGMKDQQVANIRL--DDNQPLPGQY 56
R+ + A +AE F+ F+ Q VA++ + + PG Y
Sbjct: 21 HRLAGFFVRLFVACAFAAQAPLSSAELYFNPRFLADD--PQAVADLSRFENGQELPPGTY 78

Query: 57 DIDIYVNKQWRGKYEIIVKDNPQET----CLSREIIKRLGINTD-----NFASGKQCLTF 107
+DIY+N + ++ E CL+R + +G+NT N + C+
Sbjct: 79 RVDIYLNNGYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDACVPL 138

Query: 108 EQLVQGGSYTWDIGVFRLDFSVPQAWVEELESGYVPPENWERGINAFYTSYYVSQYYSDY 167
++ + D+G RL+ ++PQA++ GY+PPE W+ GINA +Y S
Sbjct: 139 TSMIHDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSVQN 198

Query: 168 KASGNSKSTYVRFNSGLNLLGWQLHSDASFSKTNSNPGV-----WKSNTLYLERGFAQLL 222
+ GNS Y+ SGLN+ W+L + ++S +S+ W+ +LER L
Sbjct: 199 RIGGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLR 258

Query: 223 GTLRVGDMYTSSDIFDSVRFSGVRLFRDMQMLPNSKQNFTPRVQGIAQSNALVTIEQNGF 282
L +GD YT DIFD + F G +L D MLP+S++ F P + GIA+ A VTI+QNG+
Sbjct: 259 SRLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGY 318

Query: 283 VVYQKEVPPGPFAITDLQLAGGGADLDVSVKEADGSVTTYLVPYAAVPNMLQPGVSKYDF 342
+Y VPPGPF I D+ AG DL V++KEADGS + VPY++VP + + G ++Y
Sbjct: 319 DIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSI 378

Query: 343 AAGRSHIEGASKQSD-FVQAGYQYGFNNLLTLYGGSMVANNYYAFTLGTGWNT-RIGAIS 400
AG A ++ F Q+ +G T+YGG+ +A+ Y AF G G N +GA+S
Sbjct: 379 TAGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALS 438

Query: 401 VDATKSHSKQDNGDVFDGQSYQIAYNKFVSQTSTRFGLAAWRYSSRDYRTFNDHVWANNK 460
VD T+++S + DGQS + YNK ++++ T L +RYS+ Y F D ++
Sbjct: 439 VDMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMN 498

Query: 461 DNYRRDENDIYDI----ADYYQNDFGRKNSFSANMSQSLPEGWGSVSLSTLWRDYWGRSG 516
++ + + DYY + ++ ++Q L ++ LS + YWG S
Sbjct: 499 GYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGR-TSTLYLSGSHQTYWGTSN 557

Query: 517 SSKDYQLSYSNNLRRISYTLAASHAYDENHHE-EKRFNIFISIPFD--WGDDVTTPRRQI 573
+ +Q + I++TL+ S + ++ + ++IPF D + R
Sbjct: 558 VDEQFQAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRSDSKSQWRHA 617

Query: 574 YMSNSTTFDDQGVASNNTGLSGTVGSRDQFNYGVNLSYQYQGN---ETTAGANLTWNAPV 630
S S + D G +N G+ GT+ + +Y V Y G+ +T A L +
Sbjct: 618 SASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGY 677

Query: 631 ATVNGSYSQSSAYRQAGASVSGGIVAWSGGVNLANRLSETFAVMNAPGIKDAYVNGQKYR 690
N YS S +Q VSGG++A + GV L L++T ++ APG KDA V Q
Sbjct: 678 GNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNDTVVLVKAPGAKDAKVENQTGV 737

Query: 691 TTNRNGVVVYDGMTPYRENYLMLDVSQSDSEAELRGNRKIAAPYRGAVVLVNFDTDQRKP 750
T+ G V T YREN + LD + +L P RGA+V F +
Sbjct: 738 RTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAEFKA-RVGI 796

Query: 751 WFIKALRADGQPLTFGYEVNDIHGHNIGVVGQGSQLFIRTNEVPPSVNVAIDKQQGLSCT 810
+ L + +PL FG V + G+V Q+++ + V V +++ C
Sbjct: 797 KLLMTLTHNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLAGKVQVKWGEEENAHCV 856

Query: 811 ITF 813
+
Sbjct: 857 ANY 859


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0936  TYPE3OMGPROT  28     0.007    Type III secretion system outer membrane G protein ...
		>TYPE3OMGPROT#Type III secretion system outer membrane G protein family signature.

Length = 607

Score = 27.9 bits (62), Expect = 0.007
Identities = 13/42 (30%), Positives = 21/42 (50%), Gaps = 1/42 (2%)

Query: 6 KMLLGALLLVTSAAWAAPATAGSTNTSGISKYE-LSSFIADF 46
++L G LLL++S +WA ++K E L + DF
Sbjct: 11 RVLTGTLLLLSSYSWAQELDWLPIPYVYVAKGESLRDLLTDF 52


20  EcSMS35_0967  EcSMS35_0984  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias     Product
EcSMS35_0967  2       15       -1.174872   fructose-bisphosphate aldolase
EcSMS35_0968  3       19       -1.688148   tagatose-bisphosphate aldolase
EcSMS35_0969  3       19       -2.704948   putative tagatose-6-phosphate kinase
EcSMS35_0970  4       20       -2.661678   PTS system galactitol-specific transporter
EcSMS35_0971  3       19       -2.652557   PTS system galactitol-specific transporter
EcSMS35_0972  2       12       -1.129557   PTS system, galactitol-specific IIC component
EcSMS35_0973  0       12       -2.111114   galactitol-1-phosphate dehydrogenase
EcSMS35_0974  0       11       -1.992529   galactitol utilization operon repressor
EcSMS35_0975  -2      11       -1.546742   lipid kinase
EcSMS35_0976  0       15       -0.671117   hypothetical protein
EcSMS35_0977  -1      15       1.268544    U32 family peptidase
EcSMS35_0978  -2      14       1.641682    RelE/ParE family plasmid stabilization system
EcSMS35_0979  -1      18       3.687980    CC2985 family addiction module antidote protein
EcSMS35_0980  -1      19       4.232687    hypothetical protein
EcSMS35_0981  -2      17       4.379642    DNA-binding transcriptional regulator BaeR
EcSMS35_0982  -3      16       4.280793    signal transduction histidine-protein kinase
EcSMS35_0983  -2      15       4.047697    multidrug efflux system protein MdtE
EcSMS35_0984  -2      14       3.161652    multidrug efflux system subunit MdtC
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0973  DHBDHDRGNASE  32     0.003    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 31.9 bits (72), Expect = 0.003
Identities = 21/92 (22%), Positives = 35/92 (38%), Gaps = 2/92 (2%)

Query: 156 AQGCENKNVIIIGAGT-IGLLAIQCAVALGAKSVTAIDISSEKLALAKSFGAMQTFNSRE 214
A+G E K I GA IG + + GA + A+D + EKL S + ++
Sbjct: 3 AKGIEGKIAFITGAAQGIGEAVARTLASQGAH-IAAVDYNPEKLEKVVSSLKAEARHAEA 61

Query: 215 MSAPQIQGVLRELRFNQLILETAGVPQTVELA 246
A + ++ E + V +A
Sbjct: 62 FPADVRDSAAIDEITARIEREMGPIDILVNVA 93


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits    Score  E-value  Comments
EcSMS35_0981  HTHFIS  76     6e-18    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 75.6 bits (186), Expect = 6e-18
Identities = 28/136 (20%), Positives = 65/136 (47%), Gaps = 1/136 (0%)

Query: 11 PRILIVEDEPKLGQLLIDYLRAASYAPTLISHGDQVLPYVRQTPPDLILLDLMLPGTDGL 70
IL+ +D+ + +L L A Y + S+ + ++ DL++ D+++P +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 71 TLCREIR-RFSDIPIVMVTAKIEEIDRLLGLEIGADDYICKPYSPREVVARVKTILRRCK 129
L I+ D+P+++++A+ + + E GA DY+ KP+ E++ + L K
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 130 PQRELQQQDAESPLII 145
+ + D++ + +
Sbjct: 124 RRPSKLEDDSQDGMPL 139


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0982  BCTERIALGSPF  34     0.002    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 33.6 bits (77), Expect = 0.002
Identities = 28/95 (29%), Positives = 36/95 (37%), Gaps = 20/95 (21%)

Query: 164 RQTSWLIVALSTLLAALATF------PLARGLLAPVKRLVDGTHKLAAGDFTTRVTPTSE 217
RQ + L+ A L AL P L+A V+ V H LA + P S
Sbjct: 75 RQLATLVAASMPLEEALDAVAKQSEKPHLSQLMAAVRSKVMEGHSLAD---AMKCFPGSF 131

Query: 218 DEL-----------GKLAQDFNQLASTLEKNQQMR 241
+ L G L N+LA E+ QQMR
Sbjct: 132 ERLYCAMVAAGETSGHLDAVLNRLADYTEQRQQMR 166


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
EcSMS35_0983  TCRTETB  125    1e-33    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 125 bits (315), Expect = 1e-33
Identities = 97/429 (22%), Positives = 188/429 (43%), Gaps = 23/429 (5%)

Query: 20 FMQSLDTTIVNTALPSMAQSLGESPLHMHMVIVSYVLTVAVMLPASGWLADKVGVRNIFF 79
F L+ ++N +LP +A + P + V +++LT ++ G L+D++G++ +
Sbjct: 24 FFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLL 83

Query: 80 TAIVLFTLGSLFCALSGTLNELL-LARALQGVGGAMMVPVGRLTVMKIVPREQYMAAMTF 138
I++ GS+ + + LL +AR +QG G A + + V + +P+E A
Sbjct: 84 FGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGL 143

Query: 139 VTLPGQVGPLLGPALGGLLVEYASWHWIFLINIPVGIIGAIATLM-LMPNYTMQTRRFDL 197
+ +G +GPA+GG++ Y HW +L+ IP+ I + LM L+ FD+
Sbjct: 144 IGSIVAMGEGVGPAIGGMIAHY--IHWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDI 201

Query: 198 SGFLLLAVGMAVLTLALDGSKGTGLSPLAIAGLVAVGVVALVLYLLHARNNNRALFSLKL 257
G +L++VG+ L + + V V++ ++++ H R L
Sbjct: 202 KGIILMSVGIVFFMLF---------TTSYSISFLIVSVLSFLIFVKHIRKVTDPFVDPGL 252

Query: 258 FRTRTFSLGLAGSFAGRIGSGMLPFMTPVFLQIGLGFSPFHAG-LMMIPMVLGSMGMKRI 316
+ F +G+ M P ++ S G +++ P + + I
Sbjct: 253 GKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYI 312

Query: 317 VVQVVNRFGYRRVLVATTLGLSLVTLLFMTTALL----GWYYVLPFVLFLQGMVNSTRFS 372
+V+R G VL +G++ +++ F+T + L W+ + V L G+ S +
Sbjct: 313 GGILVDRRGPLYVL---NIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGL--SFTKT 367

Query: 373 SMNTLTLKDLPDNLASSGNSLLSMIMQLSMSIGVTIAGLLLGLFGSQHVSVDSGTTQTVF 432
++T+ L A +G SLL+ LS G+ I G LL + + Q+ +
Sbjct: 368 VISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLSIPLLDQRLLPMEVDQSTY 427

Query: 433 MYTWLSMAF 441
+Y+ L + F
Sbjct: 428 LYSNLLLLF 436


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0984  ACRIFLAVINRP  918    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 918 bits (2373), Expect = 0.0
Identities = 287/1035 (27%), Positives = 505/1035 (48%), Gaps = 36/1035 (3%)

Query: 6 LFIYRPVATILLSVAITLCGILGFRMLPVAPLPQVDFPVIMVSASLPGASPETMASSVAT 65
FI RP+ +L++ + + G L LPVA P + P + VSA+ PGA +T+ +V
Sbjct: 4 FFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTVTQ 63

Query: 66 PLERSLGRIAGVSEMTSSS-SLGSTRIILQFDFDRDINGAARDVQAAINAAQSLLPSGMP 124
+E+++ I + M+S+S S GS I L F D + A VQ + A LLP +
Sbjct: 64 VIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQEVQ 123

Query: 125 SRPTYRKANPSDAPIMILTLTSDT--YSQGELYDFASTQLAPTISQIDGVGDVDVGGSSL 182
+ S + +M+ SD +Q ++ D+ ++ + T+S+++GVGDV + G+
Sbjct: 124 -QQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGAQY 182

Query: 183 PAVRVGLNPQALFNQGVSLDDVRTAISNANVRKPQG------VLEDGTHRWQIQTNDELK 236
A+R+ L+ L ++ DV + N + G L I K
Sbjct: 183 -AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFK 241

Query: 237 TAAEYQPLIIHYN-NGGAVRLGDVATVTDSVQDVRNAGMTNAKPAILLMIRKLPEANIIQ 295
E+ + + N +G VRL DVA V ++ N KPA L I+ AN +
Sbjct: 242 NPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANALD 301

Query: 296 TVDSIRARLPELQSTIPAAIDLQIAQDRSPTIRASLEEVEQTLIISVALVILVVFLFLRS 355
T +I+A+L ELQ P + + D +P ++ S+ EV +TL ++ LV LV++LFL++
Sbjct: 302 TAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQN 361

Query: 356 GRATIIPAVAVPVSLIGTFAAMYLCGFSLNNLSLMALTIATGFVVDDAIVVLENIARHL- 414
RAT+IP +AVPV L+GTFA + G+S+N L++ + +A G +VDDAIVV+EN+ R +
Sbjct: 362 MRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMM 421

Query: 415 EAGMKPLQAALQGTREVGFTVLSMSLSLVAVFLPLLLMGGLPGRLLREFAVTLSVAIGIS 474
E + P +A + ++ ++ +++ L AVF+P+ GG G + R+F++T+ A+ +S
Sbjct: 422 EDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALS 481

Query: 475 LLVSLTLTPMMCGWMLKASKPREQKRLRGFG----RMLVALQQGYGKSLKWVLNHTRLVG 530
+LV+L LTP +C +LK + GF Y S+ +L T
Sbjct: 482 VLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRYL 541

Query: 531 VVLLGTIALNIWLYISIPKTFFPEQDTGVLMGGIQADQSISFQ----AMRGKLQDFMKII 586
++ +A + L++ +P +F PE+D GV + IQ + + + ++K
Sbjct: 542 LIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKNE 601

Query: 587 RD-DPAVDNVTGFT-GGSRVNSGMMFITLKPRGERS---ETAQQIIDRLRVKLAKEPGAN 641
+ +V V GF+ G N+GM F++LKP ER+ +A+ +I R +++L K
Sbjct: 602 KANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDGF 661

Query: 642 LFLMAVQDIRVGGRQSNASYQYTLLSDDLAALREWEPKIRKKLATL-----PELADVNSD 696
+ + I G + ++ L D + + R +L + L V +
Sbjct: 662 VIPFNMPAIVELGTATGFDFE---LIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPN 718

Query: 697 QQDNGAEMNLVYDRDTMARLGIDVQAANSLLNNAFGQRQISTIYQPMNQYKVVMEVDPRY 756
++ A+ L D++ LG+ + N ++ A G ++ K+ ++ D ++
Sbjct: 719 GLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKF 778

Query: 757 TQDISALEKMFVINNEGKAIPLSYFAKWQPANAPLSVNHQGLSAASTISFNLPTGKSLSD 816
++K++V + G+ +P S F + + I G S D
Sbjct: 779 RMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGD 838

Query: 817 ASAAIDRAMTQLGVPSTVRGSFAGTAQVFQETMNSQVILIIAAIATVYIVLGILYESYVH 876
A A ++ ++L P+ + + G + + + N L+ + V++ L LYES+
Sbjct: 839 AMALMENLASKL--PAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSI 896

Query: 877 PLTILSTLPSAGVGALLALELFNAPFSLIALIGIMLLIGIVKKNAIMMVDFALEAQRHGN 936
P++++ +P VG LLA LFN + ++G++ IG+ KNAI++V+FA +
Sbjct: 897 PVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEG 956

Query: 937 LTPQEAIFQACLLRFRPIMMTTLAALFGALPLVLSGGDGSELRQPLGITIVGGLVMSQLL 996
EA A +R RPI+MT+LA + G LPL +S G GS + +GI ++GG+V + LL
Sbjct: 957 KGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLL 1016

Query: 997 TLYTTPVVYLFFDRL 1011
++ PV ++ R
Sbjct: 1017 AIFFVPVFFVVIRRC 1031



Score = 80.7 bits (199), Expect = 2e-17
Identities = 77/446 (17%), Positives = 162/446 (36%), Gaps = 26/446 (5%)

Query: 592 VDNVTGFTGGS-RVNSGMMFITLKPRGERSETAQQIIDRLRVKLAKEPGANLFLMAVQDI 650
+DN+ + S S + +T + + Q+ ++L++ P + Q I
Sbjct: 72 IDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQE----VQQQGI 127

Query: 651 RVGGRQSNASYQYTLLSDDLAALREW-----EPKIRKKLATLPELADVNSDQQDNGAE-- 703
V S+ +SD+ ++ ++ L+ L + DV GA+
Sbjct: 128 SVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQL----FGAQYA 183

Query: 704 MNLVYDRDTMARLGID----VQAANSLLNNAFGQRQISTIYQPMNQYKVVMEVDPRYTQD 759
M + D D + + + + + + T P Q + R+
Sbjct: 184 MRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFKNP 243

Query: 760 ISALEKMFVINNEGKAIPLSYFAK--WQPANAPLSVNHQGLSAASTISFNLPTGKSLSDA 817
+ +N++G + L A+ N + G AA +L D
Sbjct: 244 EEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL-DT 302

Query: 818 SAAIDRAMTQL--GVPSTVRGSFA-GTAQVFQETMNSQVILIIAAIATVYIVLGILYESY 874
+ AI + +L P ++ + T Q +++ V + AI V++V+ + ++
Sbjct: 303 AKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQNM 362

Query: 875 VHPLTILSTLPSAGVGALLALELFNAPFSLIALIGIMLLIGIVKKNAIMMVDFALEAQRH 934
L +P +G L F + + + G++L IG++ +AI++V+
Sbjct: 363 RATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMME 422

Query: 935 GNLTPQEAIFQACLLRFRPIMMTTLAALFGALPLVLSGGDGSELRQPLGITIVGGLVMSQ 994
L P+EA ++ ++ + +P+ GG + + ITIV + +S
Sbjct: 423 DKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALSV 482

Query: 995 LLTLYTTPVVYLFFDRLRLRFSRKPK 1020
L+ L TP + + + K
Sbjct: 483 LVALILTPALCATLLKPVSAEHHENK 508


21  EcSMS35_1008  EcSMS35_1037  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias     Product
EcSMS35_1008  -1      22       3.271170    putative colanic acid biosynthesis
EcSMS35_1009  0       24       3.905677    GDP-mannose 4,6-dehydratase
EcSMS35_1010  0       23       3.713251    GDP-L-fucose synthetase
EcSMS35_1011  0       24       3.507676    GDP-mannose mannosyl hydrolase
EcSMS35_1012  0       23       3.305924    putative glycosyl transferase
EcSMS35_1013  0       24       3.228258    mannose-1-phosphate guanylyltransferase
EcSMS35_1014  0       22       1.754185    phosphomannomutase
EcSMS35_1015  -1      18       0.333548    putative UDP-glucose lipid carrier transferase
EcSMS35_1016  -1      14       -0.139514   colanic acid exporter
EcSMS35_1017  -3      14       -3.251010   putative pyruvyl transferase
EcSMS35_1018  -1      21       -8.486148   colanic acid biosynthesis glycosyl transferase
EcSMS35_1019  1       31       -11.677976  putative colanic acid biosynthesis protein
EcSMS35_1020  3       40       -14.786379  UDP-N-acetylglucosamine 4-epimerase
EcSMS35_1021  6       47       -16.615529  UTP--glucose-1-phosphate uridylyltransferase
EcSMS35_1022  8       55       -19.674064  glycosyl transferase, group 2
EcSMS35_1023  9       55       -19.763591  hypothetical protein
EcSMS35_1024  8       54       -18.745440  glycosyl transferase, group 2
EcSMS35_1025  7       48       -16.789891  hypothetical protein
EcSMS35_1026  3       31       -11.194761  glycosyl transferase, group 1
EcSMS35_1027  2       25       -8.828926   putative polysaccharide biosynthesis protein
EcSMS35_1028  0       17       -4.851300   hypothetical protein
EcSMS35_1029  0       14       -2.409638   IS1 transposase orfA
EcSMS35_1030  -1      12       -2.038923   IS1 transposase orfB
EcSMS35_1031  -1      15       -1.699455   6-phosphogluconate dehydrogenase
EcSMS35_1032  -2      15       -1.097798   UDP-glucose 6-dehydrogenase
EcSMS35_1033  -2      19       1.081488    chain length determinant protein
EcSMS35_1034  0       26       3.298643    bifunctional phosphoribosyl-AMP
EcSMS35_1035  0       23       3.935161    imidazole glycerol phosphate synthase subunit
EcSMS35_1036  -1      23       4.178933    1-(5-phosphoribosyl)-5-[(5-
EcSMS35_1037  0       23       3.744755    imidazole glycerol phosphate synthase subunit
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1009  NUCEPIMERASE  104    1e-27    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 104 bits (262), Expect = 1e-27
Identities = 76/353 (21%), Positives = 122/353 (34%), Gaps = 42/353 (11%)

Query: 6 LITGVTGQDGSYLAEFLLEKGYEVHGIKRRASSFNTERVDHIYQDPH--------TCNPK 57
L+TG G G ++++ LLE G++V GI + N Y D P
Sbjct: 4 LVTGAAGFIGFHVSKRLLEAGHQVVGI----DNLND------YYDVSLKQARLELLAQPG 53

Query: 58 FHLHYGDLSDTSNLTRILREVQPDEVYNLGAMSHVAVSFESPEYTADVDAMGTLRLLEAI 117
F H DL+D +T + + V+ V S E+P AD + G L +LE
Sbjct: 54 FQFHKIDLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGC 113

Query: 118 RFLGLEKKTRFYQASTSELYGLVQEIPQKETTPF-YPRSPYAVAKLYAYWITVNYRESYG 176
R ++ AS+S +YGL +++P +P S YA K + Y YG
Sbjct: 114 RHNKIQ---HLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYG 170

Query: 177 MYACNGILFNHESPRRGETFVTRKITRAIANIAQGLESCLYLGNMDSLRDWGHAKDYVKM 236
+ A F P K T+A+ G +Y RD+ + D +
Sbjct: 171 LPATGLRFFTVYGPWGRPDMALFKFTKAMLE---GKSIDVY-NYGKMKRDFTYIDDIAEA 226

Query: 237 QWMMLQQEQPEDFVIATGVQYSVRQFVEMAAAQLGIKLRFEGTGVEEKGIVVSVTGHDAP 296
+ D +G + E + DA
Sbjct: 227 IIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIG------NSSPVELMDYIQAL-EDAL 279

Query: 297 GVKPGDVIIAVDPRY--FRPAEVETLLGDPTKAHEKLGWKPEITLREMVSEMV 347
G++ +P +V D +E +G+ PE T+++ V V
Sbjct: 280 GIE-------AKKNMLPLQPGDVLETSADTKALYEVIGFTPETTVKDGVKNFV 325


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1010  NUCEPIMERASE  87     1e-21    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 87.1 bits (216), Expect = 1e-21
Identities = 66/344 (19%), Positives = 132/344 (38%), Gaps = 47/344 (13%)

Query: 5 RIFIAGHRGMVGSAIRRQLEQRG-------------DVEL------VLRTRD----ELNL 41
+ + G G +G + ++L + G DV L +L +++L
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 42 LDSRAVHDFFASERIDQVYLAAAKVGGIVANNTYPADFIYQNMMIESNIIHAAHQNDVNK 101
D + D FAS ++V+++ + + + P + N+ NI+ N +
Sbjct: 62 ADREGMTDLFASGHFERVFISPHR-LAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQH 120

Query: 102 LLFLGSSCIYPKLAKQPMAESELLQGTLEPTNEPYAIAKIAGIKLCESYNRQYGRDYRSV 161
LL+ SS +Y K P + P + YA K A + +Y+ YG +
Sbjct: 121 LLYASSSSVYGLNRKMPFSTD---DSVDHPVS-LYAATKKANELMAHTYSHLYGLPATGL 176

Query: 162 MPTNLYGPHDNFHPSNSHVIPALLRRFHEATAQNAPDVVVWGSGTPMREFLHVDDMAAAS 221
+YGP P L +F +A + + V+ G R+F ++DD+A A
Sbjct: 177 RFFTVYGPWGR--PD------MALFKFTKAMLEGKS-IDVYNYGKMKRDFTYIDDIAEAI 227

Query: 222 IHVMELAH----EVWLENTQPMLSH-----INVGTGVDCTIRELAQTIAKVVGYKGRVVF 272
I + ++ + +E P S N+G + + Q + +G + +
Sbjct: 228 IRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAKKNM 287

Query: 273 DASKPDGTPRKLLDVTRLHQ-LGWYHEISLEAGLASTYQWFLEN 315
+P D L++ +G+ E +++ G+ + W+ +
Sbjct: 288 LPLQPGDVLETSADTKALYEVIGFTPETTVKDGVKNFVNWYRDF 331


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1020  NUCEPIMERASE  94     5e-24    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 93.7 bits (233), Expect = 5e-24
Identities = 70/334 (20%), Positives = 126/334 (37%), Gaps = 62/334 (18%)

Query: 4 NVLLIGASGFVGT----RLLE----------------TAIADFNIKNLDKQQSHFYPEIT 43
L+ GA+GF+G RLLE ++ ++ L + F+
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHK--- 58

Query: 44 QIGDVRDQQALDQALA--GFDTVVLLAAEH--RDDVSPTSLYYDVNVQGTRNVLAAMEKN 99
D+ D++ + A F+ V + R + Y D N+ G N+L N
Sbjct: 59 --IDLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHN 116

Query: 100 GVKNIIFTSSVAVYGLNKHNP-DENHPHD-PFNHYGKSKWQAEEVLREWYNKA---PTER 154
++++++ SS +VYGLN+ P + D P + Y +K +A E++ Y+ P
Sbjct: 117 KIQHLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATK-KANELMAHTYSHLYGLPA-- 173

Query: 155 SLTIIRPTVIFGERNRGN--VYNLLKQIAGGKFMMV-GAGTNYKSMAYVGNIVEFIKYKL 211
T +R ++G R + ++ K + GK + V G + Y+ +I E I
Sbjct: 174 --TGLRFFTVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIRLQ 231

Query: 212 KNVA-----------------AGYEVYNYVDKPDLNMNQLVAEVEQSLNKKIPSMHLPYP 254
+ A Y VYN + + + + +E +L + LP
Sbjct: 232 DVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAKKNMLPLQ 291

Query: 255 LGMLGGYCFDI--LSKITGKKYAVS-SVRVKKFC 285
G + D L ++ G + VK F
Sbjct: 292 PGDVLETSADTKALYEVIGFTPETTVKDGVKNFV 325


22  EcSMS35_1050  EcSMS35_1168  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias     Product
EcSMS35_1050  -1      15       -3.088504   hypothetical protein
EcSMS35_1051  -1      14       -2.456923   D-alanyl-D-alanine carboxypeptidase
EcSMS35_1052  -1      15       -2.487854   DNA gyrase inhibitor
EcSMS35_1054  -1      17       -2.110669   hypothetical protein
EcSMS35_1053  3       28       -2.361624   hypothetical protein
EcSMS35_1055  3       25       -0.897711   hypothetical protein
EcSMS35_1056  7       25       0.489139    hypothetical protein
EcSMS35_1057  7       30       2.743509    hypothetical protein
EcSMS35_1058  6       26       2.875566    hypothetical protein
EcSMS35_1059  6       28       4.119953    hypothetical protein
EcSMS35_1060  5       28       3.734376    hypothetical protein
EcSMS35_1061  4       25       1.593795    hypothetical protein
EcSMS35_1062  4       24       -0.142300   RadC family DNA repair protein
EcSMS35_1063  4       24       -0.003388   antirestriction protein
EcSMS35_1064  4       24       0.286372    hypothetical protein
EcSMS35_1066  4       24       -0.381909   hypothetical protein
EcSMS35_1065  4       24       0.782315    hypothetical protein
EcSMS35_1067  5       23       1.116566    hypothetical protein
EcSMS35_1068  7       23       1.249782    patatin family phospholipase
EcSMS35_1069  6       23       -0.324430   hypothetical protein
EcSMS35_1070  7       23       -0.745959   hypothetical protein
EcSMS35_1071  7       23       -1.286314   putative GTPase
EcSMS35_1072  7       22       -2.128303   hypothetical protein
EcSMS35_1073  6       23       -2.089148   IS150 transposase orfA
EcSMS35_1074  7       23       -0.991573   IS150 transposase orfB
EcSMS35_1075  6       20       -0.056689   hypothetical protein
EcSMS35_1076  3       22       -1.080588   prophage CP4-57 regulatory protein AlpA
EcSMS35_1077  4       23       -2.928330   hypothetical protein
EcSMS35_1078  3       29       -5.484908   hypothetical protein
EcSMS35_1079  3       31       -5.640962   ISL3 family transposase
EcSMS35_1081  1       28       -5.116380   phosphotriesterase family protein
EcSMS35_1082  1       27       -5.028085   hypothetical protein
EcSMS35_1083  2       27       -5.179421   hypothetical protein
EcSMS35_1084  1       28       -5.055943   PfkB family DNA-binding protein/kinase
EcSMS35_1086  2       30       -4.588392   insertion sequence 2 OrfA protein
EcSMS35_1088  2       29       -5.086858   IS2 transposase orfB
EcSMS35_1087  2       24       -5.042850   lyase
EcSMS35_1089  0       18       -4.181130   anaerobic C4-dicarboxylate transporter
EcSMS35_1090  -1      14       -2.329383   aspartate racemase
EcSMS35_1091  -2      13       -0.181467   LysR family transcriptional regulator
EcSMS35_1092  -1      14       2.122149    hypothetical protein
EcSMS35_1094  -1      14       2.157571    d-galactonate transporter
EcSMS35_1095  0       16       2.295451    galactonate dehydratase
EcSMS35_1096  0       18       2.740594    2-dehydro-3-deoxy-6-phosphogalactonate aldolase
EcSMS35_1097  0       17       1.715555    2-dehydro-3-deoxygalactonokinase
EcSMS35_1098  2       18       0.018984    galactonate operon transcriptional repressor
EcSMS35_1099  1       23       -0.380667   hypothetical protein
EcSMS35_1100  1       21       0.036652    IS911 transposase orfA
EcSMS35_1101  1       23       -1.488262   IS911 transposase orfB
EcSMS35_1102  0       26       -2.490441   hypothetical protein
EcSMS35_1103  0       31       -4.322233   major facilitator family transporter
EcSMS35_1104  1       32       -4.784195   periplasmic sugar binding transcriptional
EcSMS35_1105  3       34       -5.676630   inosine-uridine preferring nucleoside hydrolase
EcSMS35_1106  3       37       -6.767765   mgtC family protein
EcSMS35_1110  4       33       -5.503125   hypothetical protein
EcSMS35_1111  5       33       -5.360162   asparaginase family protein
EcSMS35_1113  4       25       -3.507084   hypothetical protein
EcSMS35_1114  4       26       -5.107791   anaerobic C4-dicarboxylate transporter
EcSMS35_1115  4       28       -6.120054   peptidase T
EcSMS35_1116  4       32       -6.817585   putative aspartate ammonia-lyase
EcSMS35_1117  4       34       -7.575151   anaerobic C4-dicarboxylate transporter
EcSMS35_1118  3       36       -8.668824   L-asparaginase II
EcSMS35_1119  5       42       -9.908041   response regulator
EcSMS35_1120  5       35       -7.266358   GntR family transcriptional regulator
EcSMS35_1121  4       35       -6.852546   anaerobic C4-dicarboxylate transporter
EcSMS35_1122  5       31       -6.456733   anaerobic C4-dicarboxylate transporter
EcSMS35_1123  4       27       -6.201918   argininosuccinate lyase ArgH-like protein
EcSMS35_1124  4       27       -2.420663   hypothetical protein
EcSMS35_1125  3       23       -1.078426   immunoglobulin-binding regulator A-like protein
EcSMS35_1127  3       26       -0.461235   hypothetical protein
EcSMS35_1129  3       28       -0.443985   hypothetical protein
EcSMS35_1130  2       28       -0.728593   hypothetical protein
EcSMS35_1131  1       27       -0.966220   adenosylcobinamide
EcSMS35_1132  1       26       -1.216925   cobalamin synthase
EcSMS35_1133  0       28       -2.037713   nicotinate-nucleotide--dimethylbenzimidazole
EcSMS35_1134  0       29       -3.989733   hypothetical protein
EcSMS35_1136  -2      27       -3.775146   *nitrogen assimilation transcriptional regulator
EcSMS35_1137  -2      26       -3.216411   transcriptional regulator Cbl
EcSMS35_1139  -1      28       -4.886955   *hypothetical protein
EcSMS35_1141  0       20       -3.378911   *hypothetical protein
EcSMS35_1142  0       21       -3.400962   hypothetical protein
EcSMS35_1143  0       21       -3.429170   AMP nucleosidase
EcSMS35_1144  0       21       -3.740985   shikimate transporter
EcSMS35_1145  1       20       -3.813659   hypothetical protein
EcSMS35_1146  1       19       -3.085502   putative invasin
EcSMS35_1148  2       26       -5.538973   *hypothetical protein
EcSMS35_1150  2       28       -5.463358   *phage integrase family site specific
EcSMS35_1151  1       26       -4.204201   hypothetical protein
EcSMS35_1152  2       26       -3.977637   exonuclease family protein
EcSMS35_1153  -1      34       -7.143024   hypothetical protein
EcSMS35_1154  0       31       -5.607036   division inhibition protein DicB
EcSMS35_1155  1       22       -3.869214   hypothetical protein
EcSMS35_1156  3       27       -2.714068   hypothetical protein
EcSMS35_1157  3       27       -1.231243   hypothetical protein
EcSMS35_1158  3       26       -0.463349   hypothetical protein
EcSMS35_1159  3       28       -0.226525   DNA-binding transcriptional regulator DicC
EcSMS35_1160  3       28       0.021660    hypothetical protein
EcSMS35_1161  3       29       0.416078    hypothetical protein
EcSMS35_1162  1       28       0.232049    hypothetical protein
EcSMS35_1163  2       28       -1.792769   hypothetical protein
EcSMS35_1164  4       31       -3.601692   hypothetical protein
EcSMS35_1165  5       30       -4.500310   hypothetical protein
EcSMS35_1166  4       32       -5.709753   hypothetical protein
EcSMS35_1167  3       32       -7.310813   hypothetical protein
EcSMS35_1168  3       29       -6.638477   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_1051  BLACTAMASEA  29  0.042  Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 28.6 bits (64), Expect = 0.042
Identities = 27/165 (16%), Positives = 56/165 (33%), Gaps = 23/165 (13%)

Query: 42 VLMDYTTGQILTAGNEHQQRNPASLTKLMTGYVVDRAIDSHRITPDDIVTVGRDAWAKDN 101
+ MD +G+ LTA ++ S K++ V +D+ + + + +
Sbjct: 43 IEMDLASGRTLTAWRADERFPMMSTFKVVLCGAVLARVDAGDEQLERKIHYRQQDLVDYS 102

Query: 102 PV---FVGSSLMFLKEGDRVSVRDLSRGLIVDSGNDACVALADYIAGGQRQFVEMMNNYA 158
PV + + +V +L I S N A L + G + +
Sbjct: 103 PVSEKHLADGM---------TVGELCAAAITMSDNSAANLLLATVGG-----PAGLTAFL 148

Query: 159 EKLHLKDTH---FETVHGLDAPGQH---SSAYDLAVLSRAIIHGE 197
++ T +ET PG ++ +A R ++ +
Sbjct: 149 RQIGDNVTRLDRWETELNEALPGDARDTTTPASMAATLRKLLTSQ 193


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_1055  FbpA_PF05833  28  0.012  Fibronectin-binding protein
		>FbpA_PF05833#Fibronectin-binding protein

Length = 577

Score = 27.5 bits (61), Expect = 0.012
Identities = 13/83 (15%), Positives = 33/83 (39%), Gaps = 6/83 (7%)

Query: 16 RLFRRKNKLQREIQDVEKKIRDNQKRVLLLDNLSDYIKPGMSVEAIQGIIASMKGDYEDR 75
+++ NKL++ + +++ N++ + L ++ I + + I+ I E
Sbjct: 385 SYYKKYNKLKKSEEAANEQLLQNEEELNYLYSVLTNINNADNYDEIEEIKK------ELI 438

Query: 76 VDDYIIKNAELSKERRDISKKLK 98
YI ++ SK +
Sbjct: 439 ETGYIKFKKIYKSKKSKTSKPMH 461


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_1094  TCRTETA  44  7e-07  Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 44.0 bits (104), Expect = 7e-07
Identities = 63/383 (16%), Positives = 114/383 (29%), Gaps = 34/383 (8%)

Query: 51 AEMGYVFSAFAWLYTLCQIPGGWFLDRVGSRVTYFIAIFGWSVATLFQGFATGLMSLIGL 110
A G + + +A + C G DR G R +++ G +V A L L
Sbjct: 43 AHYGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIG 102

Query: 111 RAITGIFEAPAFPTNNRMVTSWFPEHERASAVGFYTSGQFVGLAFLTPLLIWIQEMLSWH 170
R + GI A + ERA GF ++ G+ P+L + S H
Sbjct: 103 RIVAGITGAT-GAVAGAYIADITDGDERARHFGFMSACFGFGMV-AGPVLGGLMGGFSPH 160

Query: 171 WVFIVTGGIGIIWSLIWFKVYQPPRLTKGISKAELDYIRDGGGLVDGDAPVKKEARQPLT 230
F + + L + + P+++EA PL
Sbjct: 161 APFFAAAALNGLNFLTGCFLLPESHKGER-------------------RPLRREALNPLA 201

Query: 231 AKDWKLVFHRKLIGVYLGQFAVTSTLWFFLTWFPNYLTQEKGITALKAGFMTSVPFLAAF 290
+ W + + F + + + A G +
Sbjct: 202 SFRWARGM-TVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGISLAA---FGI 257

Query: 291 VGVLLSGWVADLLVRKGFSLGFARKTPIICGLLISTC--IMGANYTNDPMMIMCLMALAF 348
+ L + + + + ++ G++ I+ A T M ++ LA
Sbjct: 258 LHSLAQAMITGPVAAR-----LGERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLAS 312

Query: 349 FGNGFASITWSLVSSLAPMRLIGLTGGVFNFVGGLGGITVPLVVGYL-AQGYGFAPALVY 407
G G ++ +++S G G + L I PL+ + A +
Sbjct: 313 GGIGMPALQ-AMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYAASITTWNGWAW 371

Query: 408 ISAVALIGALSYILMVGDVKRVG 430
I+ AL L G G
Sbjct: 372 IAGAALYLLCLPALRRGLWSGAG 394


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_1103  TCRTETB  38  9e-05  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 37.6 bits (87), Expect = 9e-05
Identities = 73/369 (19%), Positives = 132/369 (35%), Gaps = 57/369 (15%)

Query: 89 IGALLFGWIGDKHGRKIVMVITIGLMGMSTMLIGLIPSYAQIGVWAPICLVILRFSQGLG 148
IG ++G + D+ G K +++ I + +++ + S+ + L++ RF QG G
Sbjct: 64 IGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSL-------LIMARFIQGAG 116

Query: 149 AGAELSGGTVMLGEYAPVKRR----GLVSSVIGLGSNSGTLLASLVWLIVLQMDKDDLLS 204
A A + V++ Y P + R GL+ S++ +G G + ++ +
Sbjct: 117 AAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMI---------AHYIH 167

Query: 205 WGW--RIPFLCSILIAAAALLIRRHIRETPVFERQKALLQ--AEREKVIREEKAQQQHDS 260
W + IP + I + L+++ +R F+ + +L ++
Sbjct: 168 WSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDIKGIILMSVGIVFFMLFTTSYSISFLI 227

Query: 261 RS------FWKRTRAFWT-MVGLRIGENGP--SYLAQGFII--GYVAKVLMVDKSVPTAA 309
S F K R V +G+N P + G II V MV +
Sbjct: 228 VSVLSFLIFVKHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVH 287

Query: 310 VLIASVLGFAII-----------PLAGWLSDRFGRRIIYRWFCLLLILYAFPAFMLLDSR 358
L + +G II + G L DR G + L + A LL++
Sbjct: 288 QLSTAEIGSVIIFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETT 347

Query: 359 EPWIVIPTIITGMGLA----SLGIFGVQAAWGVELFGVTNRYTKMAFAKELGSILSGGTA 414
++ I + GL+ + + E + +F LS GT
Sbjct: 348 SWFMTIIIVFVLGGLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSF-------LSEGTG 400

Query: 415 PLIASALLS 423
I LLS
Sbjct: 401 IAIVGGLLS 409


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_1104  TYPE3IMSPROT  28  0.043  Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 28.2 bits (63), Expect = 0.043
Identities = 18/84 (21%), Positives = 33/84 (39%), Gaps = 15/84 (17%)

Query: 193 REMLNTQDRIRGWQQALEASSLVVNPSWIFSTNYTRAGGYEATKRMLEHQLPRALFATNE 252
+E+ + R + +S +V NP T+ Y+ E LP F +
Sbjct: 243 QEIQSRNMR----ENVKRSSVVVANP-----THIAIGILYKRG----ETPLPLVTFKYTD 289

Query: 253 QQALGCLRALA-EHGLRVPEDVAL 275
Q +R +A E G+ + + + L
Sbjct: 290 AQVQT-VRKIAEEEGVPILQRIPL 312


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_1110  YERSSTKINASE  28  0.006  Yersinia serine/threonine protein kinase signature.
		>YERSSTKINASE#Yersinia serine/threonine protein kinase signature.

Length = 732

Score = 27.8 bits (61), Expect = 0.006
Identities = 25/80 (31%), Positives = 38/80 (47%), Gaps = 12/80 (15%)

Query: 8 LTTVLFSMSASAALTPACEEYYKEIDNFVAKMKEMG-TPEAQVNTLKQQYEQSRKQIAAL 66
L VL ++S P E Y F+ ++ E T Q+NTL+QQ E ++ Q++ L
Sbjct: 572 LLEVLVTLSQQG--QPVSSETY----GFLNRLTEAKITLSQQLNTLQQQQESAKAQLSIL 625

Query: 67 PDASQDMACKQGLDALRQSM 86
+ S A D RQS+
Sbjct: 626 INRSGSWA-----DVARQSL 640


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_1119  HTHFIS  73  7e-17  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 72.6 bits (178), Expect = 7e-17
Identities = 28/116 (24%), Positives = 53/116 (45%), Gaps = 4/116 (3%)

Query: 4 VLIVDDDILVADVIRRIVEQTPAFKCCGIALSLGQAKEIISVNKKLADLILLDLYINKDN 63
+L+ DDD + V+ + + + + + + I+ DL++ D+ + +N
Sbjct: 6 ILVADDDAAIRTVLNQALSRA-GYDVRITS-NAATLWRWIA--AGDGDLVVTDVVMPDEN 61

Query: 64 GLDLLPFLHSIHCKSDVIVISAADDSATIKDSLHYGVSDYLIKPFHISRLVDSLTR 119
DLLP + V+V+SA + T + G DYL KPF ++ L+ + R
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR 117


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_1144  TCRTETB  35  4e-04  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 35.2 bits (81), Expect = 4e-04
Identities = 39/259 (15%), Positives = 96/259 (37%), Gaps = 18/259 (6%)

Query: 79 LGGVIFGHFGDRLGRKRMLMLTVWMMGIATALIGILPSFSTIGWWAPILLVTLRAIQGFA 138
+G ++G D+LG KR+L+ + + + + + SF ++ I+ ++ A
Sbjct: 64 IGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSL----LIMARFIQGAGAAA 119

Query: 139 VGGEWGGAALLSVESAPKNKK-AFYSSGVQVGYGVGLLLSTGLVSLISMMTTDEQFLSWG 197
+ + K S V +G GVG + + I
Sbjct: 120 FPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYI------------H 167

Query: 198 WRIPFLFSIVLVLGALWVRNGMEESAEFEQQQHYQAAAKKRIPVIEALLRHPGAFLKIIA 257
W L ++ ++ ++ +++ + + + ++ +L + +
Sbjct: 168 WSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDIKGIILMSVGIVFFMLFTTSYSISFLI 227

Query: 258 LRLCELLTMYIVTAFALNYSTQNMGLPRELFLNIGLLVGGLSCLTIPCFAWLADRFGRRR 317
+ + L +++ + + GL + + IG+L GG+ T+ F + +
Sbjct: 228 VSVLSFL-IFVKHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDV 286

Query: 318 VYITGALIGTLSAFPFFMA 336
++ A IG++ FP M+
Sbjct: 287 HQLSTAEIGSVIIFPGTMS 305


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_1146  INTIMIN  697  0.0  Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 697 bits (1799), Expect = 0.0
Identities = 218/790 (27%), Positives = 347/790 (43%), Gaps = 70/790 (8%)

Query: 139 QQIASTSQPIGSLLAEDMNSEQAANMARGWASSQASGAMTDWLSRFGTARITLGVDEDFS 198
QQ AS + S +N + A + A G A +QAS + WL +GTA + L +F
Sbjct: 168 QQAASLGSQLQS---RSLNGDYAKDTALGIAGNQASSQLQAWLQHYGTAEVNLQSGNNFD 224

Query: 199 LKNSQFDFLHPWYETPDNLFFSQHTLHRTDERTQINNGLGWRHFTPTWMSGINFFFDHDL 258
S DFL P+Y++ L F Q D R N G G R F P M G N F D D
Sbjct: 225 --GSSLDFLLPFYDSEKMLAFGQVGARYIDSRFTANLGAGQRFFLPENMLGYNVFIDQDF 282

Query: 259 SRYHSRAGIGAEYWRDYLKLSSNGYLPLTNWRSAPELDNDYEARPANGWDVRAEGWLPAW 318
S ++R GIG EYWRDY K S NGY ++ W + DY+ RPANG+D+R G+LP++
Sbjct: 283 SGDNTRLGIGGEYWRDYFKSSVNGYFRMSGWHESYN-KKDYDERPANGFDIRFNGYLPSY 341

Query: 319 PHLGGKLVYEQYYGDEVALFDKDDRQSNPHTITAGLNYTPFPLMTFSAEQRQGKQGENDT 378
P LG KL+YEQYYGD VALF+ D QSNP T G+NYTP PL+T + R G END
Sbjct: 342 PALGAKLMYEQYYGDNVALFNSDKLQSNPGAATVGVNYTPIPLVTMGIDYRHGTGNENDL 401

Query: 379 RFAVDFTWQPGSAMQKQLDPNEVAARRSLAGSRYDLVDRNNNIVLEYRKKELVRLTLTDP 438
+++ F +Q +Q++P V R+L+GSRYDLV RNNNI+LEY+K++++ L +
Sbjct: 402 LYSMQFRYQFDKPWSQQIEPQYVNELRTLSGSRYDLVQRNNNIILEYKKQDILSLNIPHD 461

Query: 439 VTGKSGEVKSLVSSLQTKYALKGYNVEATALEAAGGKVVTTG----KDILVTLPAYRFTS 494
+ G + + +++KY L + +AL + GG++ +G +D LPAY
Sbjct: 462 INGTERSTQKIQLIVKSKYGLDRIVWDDSALRSQGGQIQHSGSQSAQDYQAILPAY---- 517

Query: 495 TPETDNTWPIEVTAEDVKGNLSNREQ-SMVVVQAPTLSQKDSSVSLSTQTLSADSHSTAT 553
N + + A D GN SN ++ V+ + + + SA + T
Sbjct: 518 VQGGSNVYKVTARAYDRNGNSSNNVLLTITVLSNGQVVDQVGVTDFTADKTSAKADGTEA 577

Query: 554 LTFIAH------DAAGNPVIGLVLSTRHEGVQDITLSEWKDNGDGSYTQILTTGAMSGTL 607
+T+ A A PV ++S G ++ + NG G T L + +
Sbjct: 578 ITYTATVKKNGVAQANVPVSFNIVS----GTAVLSANSANTNGSGKATVTLKSDKPGQVV 633

Query: 608 TLMPQLNGVDAAKAPAVVNIISISSSRTHSSIKIDKDRYLSGNPIEVTVELR-DENDKPV 666
A A AV I + + + IK DK ++ +T ++ + DKPV
Sbjct: 634 VSAKTAEMTSALNANAV--IFVDQTKASITEIKADKTTAVANGQDAITYTVKVMKGDKPV 691

Query: 667 KEQKQQLNNAVSIDNVKPGVTTDWKETADGVYKATYTAYTKGSGL-TAKLLMQNWNEDLH 725
Q+ + K +T+ K +G K T T+ T G L +A++ +
Sbjct: 692 SNQEVTFTTTLG----KLSNSTE-KTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAP 746

Query: 726 TAGFIIDANPQSAKIATLSASNNGVLANENAANTVSVNVADEGSNPINDHTVTFAVLSGS 785
F I + G L + + ++
Sbjct: 747 EVEFFTTLTIDDGNIEIVGTGVKGKLPTV---------------------WLQYGQVNLK 785

Query: 786 ATSFNNQNTAKTDVNGLATFDLKSSK---QEDNTVEVTLENGVKQTLIVSFVGDSSTAQV 842
A+ N + T ++ +A+ D S + +E T +++ + QT ++ + + +
Sbjct: 786 ASGGNGKYTWRSANPAIASVDASSGQVTLKEKGTTTISVISSDNQT--ATYTIATPNSLI 843

Query: 843 ELQKSKNEVVADGNDSATMTATVRDAKGNLLNDVKVTF----------NVNSAEAKLSQT 892
SK D ++ + N L +V + + + + + QT
Sbjct: 844 VPNMSKRVTYNDAVNTCKNFGGKLPSSQNELENVFKAWGAANKYEYYKSSQTIISWVQQT 903

Query: 893 EVNSHDGIAT 902
++ G+A+
Sbjct: 904 AQDAKSGVAS 913



Score = 196 bits (498), Expect = 9e-53
Identities = 95/376 (25%), Positives = 157/376 (41%), Gaps = 34/376 (9%)

Query: 821 LENGVKQTLIVSFVGDSSTAQ--VELQKSKNEVVADGNDSATMTATVRDAKGNLLNDVKV 878
N V T+ V G + K ADG ++ T TATV+ N V V
Sbjct: 538 SSNNVLLTITVLSNGQVVDQVGVTDFTADKTSAKADGTEAITYTATVKKNGVAQAN-VPV 596

Query: 879 TFNVNSAEAKLSQTEVNSH-DGIATATLTSLKNGDYRVTASVSSGSQA-NQQVIFIGDQS 936
+FN+ S A LS N++ G AT TL S K G V+A + + A N + DQ+
Sbjct: 597 SFNIVSGTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAKTAEMTSALNANAVIFVDQT 656

Query: 937 TAALTLSVPSGDITVTNTAPLHMTATLQ-DKNGNPLKDKEITFSVPNDVASRFSISNSGK 995
A++T + + T +T T++ K P+ ++E+TF+ + ++
Sbjct: 657 KASIT-EIKADKTTAVANGQDAITYTVKVMKGDKPVSNQEVTFT------TTLGKLSNST 709

Query: 996 GMTDSNGTAIASLTGTLAGTHMITARLANSNVSDTQPMTFVADKDRAVVVLQTSKAEIIG 1055
TD+NG A +LT T G +++AR+++ V P + + + EI+G
Sbjct: 710 EKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEV----EFFTTLTIDDGNIEIVG 765

Query: 1056 NGVDETTLTATVK-DPSNHPVAGITVNFTMPQGVAANFTLENNGIAITQANGEAHVTLKG 1114
GV T ++ N +G +T + N IA A+ VTLK
Sbjct: 766 TGVKGKLPTVWLQYGQVNLKASGGNGKYT--------WRSANPAIASVDAS-SGQVTLKE 816

Query: 1115 KKAGTHTVTATLGNNNTSDSQPVTFVADKTSAQVVLQMSKDEITGNGVDNATLTATVKDQ 1174
K GT T++ +SD+Q T+ ++ +V MSK + V+
Sbjct: 817 K--GTTTISVI-----SSDNQTATYTIATPNSLIVPNMSKRVTYNDAVNTCKNFGGKLPS 869

Query: 1175 FDNEVNNLPVTFSSAS 1190
NE+ N+ + +A+
Sbjct: 870 SQNELENVFKAWGAAN 885



Score = 88.6 bits (219), Expect = 2e-19
Identities = 92/395 (23%), Positives = 132/395 (33%), Gaps = 49/395 (12%)

Query: 1117 AGTHTVTATLGNNNTSDSQPVTFVADKTSAQVVLQMS--------KDEITGNGVDNATLT 1168
+ + VTA + N + S V S V+ K +G + T T
Sbjct: 522 SNVYKVTARAYDRNGNSSNNVLLTITVLSNGQVVDQVGVTDFTADKTSAKADGTEAITYT 581

Query: 1169 ATVKDQFDNEVNNLPVTFSSASSGLTLTPGVSNTNESGIAQATLAGVAFGEQTVTASLAN 1228
ATVK + N PV+F+ S L+ +NTN SG A TL G+ V+A A
Sbjct: 582 ATVKKNGVAQANV-PVSFNIVSGTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAKTAE 640

Query: 1229 NGASDNKTVHFIGDTAAAKIIELTPVPDSIIAGTPQNSSGSVITATV-VDNNGFPVKGVT 1287
++ N D A I E+ + +A IT TV V PV
Sbjct: 641 MTSALNANAVIFVDQTKASITEIKADKTTAVANGQ-----DAITYTVKVMKGDKPVSNQE 695

Query: 1288 VNFTSRTNSAEMTNGGQAVTNEQGKATVTYTNTRSSIESGARPDTVEASLENGSSTLSTS 1347
V F T + + T+ G A VT T+T G S +S
Sbjct: 696 VTF---TTTLGKLSNSTEKTDTNGYAKVTLTSTTP-----------------GKSLVSAR 735

Query: 1348 INVNADASTAHLTLLQALFDTVSAGDTTNLYIEVKDNYGNGVPQQ--EVTLRVSPSEGVT 1405
+ +D + F T++ D N+ I G GV + V L+
Sbjct: 736 V---SDVAVDVKAPEVEFFTTLTI-DDGNIEIV-----GTGVKGKLPTVWLQYGQVNLKA 786

Query: 1406 PSNNAIYTTNHDGNFYASFTATKAGV---YQVTATLENGDSMQQTVTYVPNVANAEITLA 1462
N YT AS A+ V + T T+ S QT TY N+ I
Sbjct: 787 SGGNGKYTWRSANPAIASVDASSGQVTLKEKGTTTISVISSDNQTATYTIATPNSLIVPN 846

Query: 1463 ASKDPLIADNNDLTTLTATVADTEGNAIANTEVTF 1497
SK D + + N + N +
Sbjct: 847 MSKRVTYNDAVNTCKNFGGKLPSSQNELENVFKAW 881



Score = 82.8 bits (204), Expect = 1e-17
Identities = 92/467 (19%), Positives = 170/467 (36%), Gaps = 40/467 (8%)

Query: 1908 SGGKVRTNSSGQA--------PVVLTSNKVGTYTVTASFHNGVT----IQTQTTVKVTGN 1955
GG+++ + S A V + V T A NG + + T T +
Sbjct: 495 QGGQIQHSGSQSAQDYQAILPAYVQGGSNVYKVTARAYDRNGNSSNNVLLTITVLSNGQV 554

Query: 1956 SSTAHVASFIADPSTIAATNSDLSTLKATVEDGSGNLIEGLTVYFALKSGSATLTSLTAV 2015
V F AD ++ A ++ T ATV+ +G + V F + SG+A L++ +A
Sbjct: 555 VDQVGVTDFTADKTSAKADGTEAITYTATVKK-NGVAQANVPVSFNIVSGTAVLSANSAN 613

Query: 2016 TDQNGIATTSVKGAMTGSVTVSAVTTAGGMQTVDITLVAGPADTSQSVLKSNRSSLKGDY 2075
T+ +G AT ++K G V VSA TA ++ V T S+ +
Sbjct: 614 TNGSGKATVTLKSDKPGQVVVSA-KTAEMTSALNANAVIFVDQTKASITEIKADKTTAVA 672

Query: 2076 TDSAELRLVLHDISGNPIKVSEGMEFVQSGTNVPYIKISAIDYSLNINGDYKATVTSGGE 2135
+ + + G+ ++ + F + + + NG K T+TS
Sbjct: 673 NGQDAITYTVKVMKGDKPVSNQEVTFTTTLGKLSNSTEKT-----DTNGYAKVTLTSTTP 727

Query: 2136 GIATLIPVLNGVHQAGLSTTIQFTRAEDKIMSGTVSVNGTDLPTTTFPSQGFTGAYYQLN 2195
G + + ++ V + ++F I G + + GT + P+ L
Sbjct: 728 GKSLVSARVSDVAVDVKAPEVEF-FTTLTIDDGNIEIVGTGV-KGKLPTVWLQYGQVNLK 785

Query: 2196 NDNFAPGKTAADYEFSSSASWVDVDATGKVTFKNVGSNWERITATPKSGGPSYVYEIRVK 2255
+ G + ++ A ++G+VT K G+ I+ + Y I
Sbjct: 786 ---ASGGNGKYTWRSANPAIASVDASSGQVTLKEKGTT--TISVISSDNQTA-TYTIATP 839

Query: 2256 SWWVNAGEAF-MIYSLAENFCSSNGYTLPRANYLNHSSSRGIGSLYSEWGDMGHYTTDAG 2314
+ + + + Y+ A N C + G LP SS + +++ WG Y
Sbjct: 840 NSLIVPNMSKRVTYNDAVNTCKNFGGKLP-------SSQNELENVFKAWGAANKYEYYKS 892

Query: 2315 FQSNMYW-----SSSPANSSEQYVVSLATGDQSVFEKLGFAYATCYK 2356
Q+ + W + + + Y + ++ AYATC K
Sbjct: 893 SQTIISWVQQTAQDAKSGVASTYDLVKQNPLNNIKASESNAYATCVK 939



Score = 67.8 bits (165), Expect = 4e-13
Identities = 55/264 (20%), Positives = 85/264 (32%), Gaps = 18/264 (6%)

Query: 1387 NGVPQQEVTLRVSPSEGVTPSNNAIYTTNHDGNFYASFTATKAGVYQVTATLENGDSMQQ 1446
NGV Q V + + G + TN G + + K G V+A S
Sbjct: 587 NGVAQANVPVSFNIVSGTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAKTAEMTSALN 646

Query: 1447 T--VTYVPNVANAEITLAASKDPLIADNNDLTTLTATVADTEGNAIANTEVTFTLPEDVK 1504
V +V + + A K +A+ D T T V ++N EVTFT
Sbjct: 647 ANAVIFVDQTKASITEIKADKTTAVANGQDAITYTVKV-MKGDKPVSNQEVTFT------ 699

Query: 1505 ANFTLSDGGKAITDAEGKAKVTLKGTKAGAHTVTASMTGGKSE--QLVVNFIADTLSAQV 1562
TD G AKVTL T G V+A ++ + V F
Sbjct: 700 TTLGKLSNSTEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFFTTLTIDDG 759

Query: 1563 NLNVTEDNFIANNVGMTTLQATVTDGNGN-PLANEAVTFTLPADVSASFTLGQGGSAITD 1621
N+ + + V + G N + +T + A ++ S
Sbjct: 760 NIEI-----VGTGVKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPAIASV-DASSGQVT 813

Query: 1622 INGKAEVTLSGTKSGTYPVTVSVN 1645
+ K T+S S T ++
Sbjct: 814 LKEKGTTTISVISSDNQTATYTIA 837



Score = 65.9 bits (160), Expect = 2e-12
Identities = 76/359 (21%), Positives = 130/359 (36%), Gaps = 37/359 (10%)

Query: 1432 YQVTATLENGDSMQQTVTYVPNVANAEITLAASKDPLIADNNDLT-----------TLTA 1480
Y + + Q + P N TL+ S+ L+ NN++ +
Sbjct: 403 YSMQFRYQFDKPWSQQIE--PQYVNELRTLSGSRYDLVQRNNNIILEYKKQDILSLNIPH 460

Query: 1481 TVADTEGNA---IANTEVTFTLPEDVKANFTL-SDGGKAITDAEGKAK---VTLKGTKAG 1533
+ TE + + + L V + L S GG+ A+ L G
Sbjct: 461 DINGTERSTQKIQLIVKSKYGLDRIVWDDSALRSQGGQIQHSGSQSAQDYQAILPAYVQG 520

Query: 1534 AH-----TVTASMTGGKS---EQLVVNFIADTLSAQ----VNLNVTEDNFIANNVGMTTL 1581
T A G S L + +++ + + + A+ T
Sbjct: 521 GSNVYKVTARAYDRNGNSSNNVLLTITVLSNGQVVDQVGVTDFTADKTSAKADGTEAITY 580

Query: 1582 QATVTDGNGNPLANEAVTFTLPADVSASFTLGQGGSAITDINGKAEVTLSGTKSGTYPVT 1641
ATV NG AN V+F + VS + L SA T+ +GKA VTL K G V+
Sbjct: 581 TATVKK-NGVAQANVPVSFNI---VSGTAVLSAN-SANTNGSGKATVTLKSDKPGQVVVS 635

Query: 1642 VSVNNYGVSDTKQVTLIADAGTATLASLTSVYSFVVSTTEGATMTASVTDANGNPVEGIK 1701
+ + D A++ + + + V+ + A PV +
Sbjct: 636 AKTAEMTSALNANAVIFVDQTKASITEIKADKTTAVANGQDAITYTVKVMKGDKPVSNQE 695

Query: 1702 VNFRGTSVTLSSTSVETDDQGFAEILVTSTEVGLKTVSASLADKPTEVISRLLNAKADI 1760
V F T LS+++ +TD G+A++ +TST G VSA ++D +V + + +
Sbjct: 696 VTFTTTLGKLSNSTEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFFTTL 754



Score = 56.6 bits (136), Expect = 1e-09
Identities = 56/267 (20%), Positives = 90/267 (33%), Gaps = 11/267 (4%)

Query: 1875 TGTTLTATLTSANGTPVEGQVINFSVTPEGATLSGGKVRTNSSGQAPVVLTSNKVGTYTV 1934
T TAT+ NG ++F++ A LS TN SG+A V L S+K G V
Sbjct: 576 EAITYTATVKK-NGVAQANVPVSFNIVSGTAVLSANSANTNGSGKATVTLKSDKPGQVVV 634

Query: 1935 TASFHNGVTIQTQTTVKVTGNSSTAHVASFIADPSTIAATNSDLSTLKATVEDGSGNLIE 1994
+A + + + + A + AD +T A D T V G +
Sbjct: 635 SAKTAEMTS-ALNANAVIFVDQTKASITEIKADKTTAVANGQDAITYTVKVMKG-DKPVS 692

Query: 1995 GLTVYFALKSGSATLTSLTAVTDQNGIATTSVKGAMTGSVTVSAVTTAGGMQ--TVDITL 2052
V F G + + T TD NG A ++ G VSA + + ++
Sbjct: 693 NQEVTFTTTLGKLSNS--TEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEF 750

Query: 2053 VAGPADTSQSVLKSNRSSLKGDYTDSAELRLVLHDISGNPIKVSEGMEFVQSGTNVPYIK 2112
++ T + V SG K + + + + +
Sbjct: 751 FTTLTIDDGNIEIVGTGVKGKLPTVWLQYGQVNLKASGGNGKYT----WRSANPAIASVD 806

Query: 2113 ISAIDYSLNINGDYKATVTSGGEGIAT 2139
S+ +L G +V S AT
Sbjct: 807 ASSGQVTLKEKGTTTISVISSDNQTAT 833


23  EcSMS35_1184  EcSMS35_1248  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias  Product
EcSMS35_1184   2  23   2.346639  bacteriophage lysis protein
EcSMS35_1185   2  22   2.377391  P27 family phage terminase small subunit
EcSMS35_1186   2  22   2.135758  putative phage terminase, large subunit
EcSMS35_1187   3  24   2.496382  hypothetical protein
EcSMS35_1188   2  24   1.914361  HK97 family phage portal protein
EcSMS35_1189   3  23   0.770008  HK97 family phage prohead protease
EcSMS35_1190   3  25  -0.237480  HK97 family phage major capsid protein
EcSMS35_1191   5  37  -1.742069  hypothetical protein
EcSMS35_1192   6  35  -1.123298  hypothetical protein
EcSMS35_1193   4  30   0.293475  HK97 family phage protein
EcSMS35_1194   4  30   0.205851  hypothetical protein
EcSMS35_1195   3  28   0.560700  hypothetical protein
EcSMS35_1196   3  27   1.788694  phage tail assembly chaperone
EcSMS35_1197   3  26   2.628001  hypothetical protein
EcSMS35_1198   3  24   3.615118  lambda family phage tail tape measure protein
EcSMS35_1199   4  25   4.799163  phage minor tail protein
EcSMS35_1200   1  20   2.616687  phage minor tail protein L
EcSMS35_1201   1  21   2.672797  tail assembly protein
EcSMS35_1202   0  20   2.071100  bacteriophage lambda tail assembly protein I
EcSMS35_1203   0  20   1.219539  fibronectin type III domain-containing protein
EcSMS35_1204   1  25  -3.115123  Ail/Lom family protein
EcSMS35_1206   0  27  -3.425397  putative prophage side tail fiber protein
EcSMS35_1205   0  25  -4.215535  IS2 transposase orfB
EcSMS35_1207   1  30  -8.186081  insertion sequence 2 OrfA protein
EcSMS35_1208   1  28  -8.198597  hypothetical protein
EcSMS35_1209   0  23  -5.913466  hypothetical protein
EcSMS35_1210   1  21  -5.240740  hypothetical protein
EcSMS35_1211   2  27  -6.413635  nickel-dependent hydrogenase, b-type cytochrome
EcSMS35_1212   1  32  -7.552618  hypothetical protein
EcSMS35_1213   0  28  -6.053522  putative sulfite oxidase subunit YedZ
EcSMS35_1214   0  28  -6.546414  putative sulfite oxidase subunit YedY
EcSMS35_1215   1  35  -8.369982  transthyretin family protein
EcSMS35_1216   1  31  -8.289157  transcriptional regulatory protein YedW
EcSMS35_1217   0  28  -7.327861  heavy metal sensor histidine kinase
EcSMS35_1218  -1  19  -3.366831  chaperone protein HchA
EcSMS35_1219  -3  11  -1.976192  hypothetical protein
EcSMS35_1220  -2  13  -0.884140  hypothetical protein
EcSMS35_1221  -1  12  -0.241780  porin family protein
EcSMS35_1222   2  18   2.196617  hypothetical protein
EcSMS35_1223   2  18   2.162325  hypothetical protein
EcSMS35_1224   0  17   1.350713  DNA cytosine methylase
EcSMS35_1226  -1  17   1.483626  very short patch repair protein
EcSMS35_1225  -2  16   0.907769  hypothetical protein
EcSMS35_1227  -1  16   0.471788  hypothetical protein
EcSMS35_1228   0  18  -2.228106  hypothetical protein
EcSMS35_1230  -2  16  -2.761410  diguanylate cyclase
EcSMS35_1229   0  19  -4.045763  mannosyl-3-phosphoglycerate phosphatase
EcSMS35_1231   1  19  -4.045581  hypothetical protein
EcSMS35_1232   1  17  -3.153247  hypothetical protein
EcSMS35_1233   1  16  -2.551426  colanic acid capsular biosynthesis activation
EcSMS35_1234   0  16   0.677147  flagellar biosynthesis protein FliR
EcSMS35_1235  -2  21   2.041511  flagellar biosynthesis protein FliQ
EcSMS35_1236  -1  17   2.544780  flagellar biosynthesis protein FliP
EcSMS35_1237  -1  16   2.567206  flagellar biosynthesis protein FliO
EcSMS35_1238  -2  17   3.480682  flagellar motor switch protein FliN
EcSMS35_1239   0  17   3.871330  flagellar motor switch protein FliM
EcSMS35_1240   2  17   4.322496  flagellar basal body-associated protein FliL
EcSMS35_1241   1  15   4.221026  flagellar hook-length control protein
EcSMS35_1242   1  15   4.292560  flagellar biosynthesis chaperone
EcSMS35_1243  -1  13   0.642673  flagellum-specific ATP synthase
EcSMS35_1244   0  16  -1.289285  flagellar assembly protein H
EcSMS35_1245  -1  16  -1.954130  flagellar motor switch protein G
EcSMS35_1246  -1  20  -3.411767  flagellar MS-ring protein
EcSMS35_1247   2  19  -4.035245  flagellar hook-basal body protein FliE
EcSMS35_1248   1  17  -3.648532  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_1204  ENTEROVIROMP  148  5e-48  Enterobacterial virulence outer membrane protein si...
		>ENTEROVIROMP#Enterobacterial virulence outer membrane protein signature.

Length = 171

Score = 148 bits (376), Expect = 5e-48
Identities = 66/200 (33%), Positives = 101/200 (50%), Gaps = 30/200 (15%)

Query: 1 MRKLCAVILSAVVWLVAAGTPASAAEHQSTLSAGYLQTHTDMPGSDDLKGINVKYRYEFT 60
M+K+ + A V AGT +A ST++ GY Q+ + + G N+KYRYE
Sbjct: 1 MKKIACLSALAAVLAFTAGTSVAA---TSTVTGGYAQSDAQGQMNK-MGGFNLKYRYEED 56

Query: 61 DT-LGLVTSFSYANAKDEQKTHYSDTRWHEDSVRNRWFSMMAGPSVRVNEWFSAYAMAGV 119
++ LG++ SF+Y T S T D +N+++ + AGP+ R+N+W S Y + GV
Sbjct: 57 NSPLGVIGSFTY--------TEKSRTASSGDYNKNQYYGITAGPAYRINDWASIYGVVGV 108

Query: 120 AYSRVSTFSGDYLRVTDNKGKTHDVLTGSDDGRHSNTSLAWGAGVQFNPTESVAIDIAYE 179
Y + T T+ HD S+ ++GAG+QFNP E+VA+D +YE
Sbjct: 109 GYGKFQT--------TEYPTYKHD---------TSDYGFSYGAGLQFNPMENVALDFSYE 151

Query: 180 GSGSGDWRTDGFIVGVGYKF 199
S +I GVGY+F
Sbjct: 152 QSRIRSVDVGTWIAGVGYRF 171


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_1206  IGASERPTASE  46  8e-07  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 45.8 bits (108), Expect = 8e-07
Identities = 32/165 (19%), Positives = 54/165 (32%), Gaps = 15/165 (9%)

Query: 209 ETAAKNSQVAAAQSESAAAGSA--TSAAGSATAAANSQKAAKTSETNAKSSQTAAKTSET 266
E +N V + A S + A +A A S+T +E
Sbjct: 984 EVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAEN 1043

Query: 267 NAKASETAAKNSQDA------------AAQSESAAAGSASAAASSATASANSQKAAKTSE 314
+ + S+T KN QDA A+S A + A S + + +Q
Sbjct: 1044 SKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKET 1103

Query: 315 TNAKASETAAANSAKASAASQTAAKASEDAAREYASQ-AAEPYKQ 358
+ E A + K + ++ S + Q AEP ++
Sbjct: 1104 ATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARE 1148



Score = 45.1 bits (106), Expect = 1e-06
Identities = 32/183 (17%), Positives = 62/183 (33%), Gaps = 11/183 (6%)

Query: 112 EEAARNAEAASQSAAAAKKSETAAASSKNAAKTSETNAANSAQAAATSQTASANSATAAK 171
E + E ++ + K E + A E + + + + +A++ AK
Sbjct: 1114 VETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAK 1173

Query: 172 KSETNAKNSETAAKTSETNAK-----SSQTAAKTSETNAKASETAAKNSQVAAAQSESAA 226
++ +N + T + T T + T A T T S KN + +S
Sbjct: 1174 ETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHN 1233

Query: 227 AGSATSAAGSATAAANSQKAAKTSETNAKSSQTAAKTS----ETNAKASETAAKNSQDAA 282
AT+++ + A ++ TNA S AK S+ ++ +
Sbjct: 1234 VEPATTSSNDRSTVA--LCDLTSTNTNAVLSDARAKAQFVALNVGKAVSQHISQLEMNNE 1291

Query: 283 AQS 285
Q
Sbjct: 1292 GQY 1294


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_1216  HTHFIS  84  1e-20  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 84.1 bits (208), Expect = 1e-20
Identities = 30/117 (25%), Positives = 60/117 (51%), Gaps = 1/117 (0%)

Query: 39 KILLIEDNQRTQEWVTQGLSEAGYVIDAVSDGRDGLYLALKDDYALIILDIMLPGMDGWQ 98
IL+ +D+ + + Q LS AGY + S+ D L++ D+++P + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 99 ILQTLRTA-KQTPVICLTARDSVDDRVRGLDSGANDYLVKPFSFSELLARVRAQLRQ 154
+L ++ A PV+ ++A+++ ++ + GA DYL KPF +EL+ + L +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_1217  PF06580  33  0.002  Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 33.3 bits (76), Expect = 0.002
Identities = 37/181 (20%), Positives = 61/181 (33%), Gaps = 37/181 (20%)

Query: 290 ENILFLARADKNNVLVKLDSLS----------------LNKEVENLLDYL--EYLSDEKE 331
NI L D L SLS L E+ + YL + E
Sbjct: 180 NNIRALILEDPTKAREMLTSLSELMRYSLRYSNARQVSLADELTVVDSYLQLASIQFEDR 239

Query: 332 ICFKVECNQQIFADKI---LLQRMLSNLIVNAIRYSPEKSHIHITSFLDTNGYLNIDVAS 388
+ F+ + N I ++ L+Q ++ N I + I P+ I + D NG + ++V +
Sbjct: 240 LQFENQINPAIMDVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKD-NGTVTLEVEN 298

Query: 389 PGTKIHEPEKLFRRFWRGDNSRHSVGQGLGLSLVKA-IAELHGGSATYHYLNKHNVFRIT 447
G+ + K G GL V+ + L+G A K
Sbjct: 299 TGSLALKNTKE--------------STGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAM 344

Query: 448 L 448
+
Sbjct: 345 V 345


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_1221  ECOLIPORIN  515  0.0  E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 515 bits (1328), Expect = 0.0
Identities = 261/396 (65%), Positives = 309/396 (78%), Gaps = 22/396 (5%)

Query: 1 MKRKVLAMLVPALLVAGAANAAEIYNKNGNKLDLYGKVDARHTFSDNPGDDGDETIIELG 60
MKRKVLA+++PALL AGAA+AAEIYNK+GNKLDLYGKVD H FSD+ DGD+T + +G
Sbjct: 1 MKRKVLALVIPALLAAGAAHAAEIYNKDGNKLDLYGKVDGLHYFSDDSSKDGDQTYMRVG 60

Query: 61 FKGETQITDQLTGYGQALTKTKASDTEG-SDNTYVKLAFAGLKFGEMGSFDYGRNYGVIY 119
FKGETQI DQLTGYGQ +A+ TEG N++ +LAFAGLKFG+ GSFDYGRNYGV+Y
Sbjct: 61 FKGETQINDQLTGYGQWEYNVQANTTEGEGANSWTRLAFAGLKFGDYGSFDYGRNYGVLY 120

Query: 120 DVEAWTDMLPVFGGDSYTWTDNFMAGRANGVATYRNSDFFGLVEGLNFALQYQGNNEGSN 179
DVE WTDMLP FGGDSYT+ DN+M GRANGVATYRN+DFFGLV+GLNFALQYQG NE +
Sbjct: 121 DVEGWTDMLPEFGGDSYTYADNYMTGRANGVATYRNTDFFGLVDGLNFALQYQGKNESQS 180

Query: 180 AGEDQEGT--KNGHEDVRFQNGDGFGLSTSYDFDFGLSLGAAYSNSDRTNSQVALGGYHY 237
A + GT +N +D+R+ NGDGFG+ST+YD G S GAAY+ SDRTN QV GG
Sbjct: 181 ADDVNIGTNNRNNGDDIRYDNGDGFGISTTYDIGMGFSAGAAYTTSDRTNEQVNAGGT-- 238

Query: 238 NEYSKFAGGDTAEAWTFGAKYDANNVYLAMMYAETRNMTPYG------NVGIANKTQNFE 291
AGGD A+AWT G KYDANN+YLA MY+ETRNMTPYG + G+ANKTQNFE
Sbjct: 239 -----IAGGDKADAWTAGLKYDANNIYLATMYSETRNMTPYGKTDKGYDGGVANKTQNFE 293

Query: 292 AVAQYQFDFGLRPSLAYVYSKGKDLGGNDYNNNGHQEYVDQDLVNYVEIGATYYFNKNFS 351
AQYQFDFGLRP+++++ SKGKDL N+ N D+DLV Y ++GATYYFNKNFS
Sbjct: 294 VTAQYQFDFGLRPAVSFLMSKGKDLTYNNVN------GDDKDLVKYADVGATYYFNKNFS 347

Query: 352 TYVDYKINLLDKDDDFYDNNGIATDDVVGVGLVYQF 387
TYVDYKINLLD DD FY + GI+TDD+V +G+VYQF
Sbjct: 348 TYVDYKINLLDDDDPFYKDAGISTDDIVALGMVYQF 383


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_1223  CARBMTKINASE  34  2e-04  Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 34.4 bits (79), Expect = 2e-04
Identities = 22/92 (23%), Positives = 36/92 (39%), Gaps = 9/92 (9%)

Query: 37 AQKLAADDDVDMLVILTACYFHDIVSLAKNHPQRQRSSILAAEETRRLLREEFEQFPA-- 94
+KLA + + D+ +ILT + +L + Q + EE R+ E F A
Sbjct: 219 GEKLAEEVNADIFMILTDV---NGAALYYGTEKEQWLREVKVEELRKYYEE--GHFKAGS 273

Query: 95 --EKIEAVCHAIAAHSFSAQIAPLTTEAKIVQ 124
K+ A I A IA L + ++
Sbjct: 274 MGPKVLAAIRFIEWGGERAIIAHLEKAVEALE 305


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_1224  PF05272  29  0.047  Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.047
Identities = 20/62 (32%), Positives = 29/62 (46%), Gaps = 15/62 (24%)

Query: 320 AKYILTPVLWKYLYRYAKKHQARGNGFGYGMVYPNNPQSVTRTLSARYYKDGAEILIDRG 379
A+Y + PVLW Y+ R+ K + G+ VY +R +DG+E RG
Sbjct: 166 ARYQVGPVLWGYVVRFIK---SDGDKLTLPYVY------------SRSQRDGSEAWKWRG 210

Query: 380 WD 381
WD
Sbjct: 211 WD 212


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_1234  TYPE3IMRPROT  201  1e-66  Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 201 bits (514), Expect = 1e-66
Identities = 257/261 (98%), Positives = 261/261 (100%)

Query: 1 MMQVTSDQWLSWLSLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA 60
M+QVTS+QWLSWL+LYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA
Sbjct: 1 MLQVTSEQWLSWLNLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA 60

Query: 61 NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHL 120
NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHL
Sbjct: 61 NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHL 120

Query: 121 NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIF 180
NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIF
Sbjct: 121 NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIF 180

Query: 181 LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC 240
LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC
Sbjct: 181 LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC 240

Query: 241 EHLFSEMFNLLADIISELPLI 261
EHLFSE+FNLLADIISELPLI
Sbjct: 241 EHLFSEIFNLLADIISELPLI 261


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_1235  TYPE3IMQPROT  67  1e-18  Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.

Length = 86

Score = 67.1 bits (164), Expect = 1e-18
Identities = 22/78 (28%), Positives = 42/78 (53%)

Query: 4 ESVMMMGTEAMKVALALAAPLLLVALVTGLIISILQAATQINEMTLSFIPKIIAVFIAII 63
+ ++ G +A+ + L L+ +VA + GL++ + Q TQ+ E TL F K++ V + +
Sbjct: 2 DDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLF 61

Query: 64 IAGPWMLNLLLDYVRTLF 81
+ W +LL Y R +
Sbjct: 62 LLSGWYGEVLLSYGRQVI 79


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_1236  FLGBIOSNFLIP  334  e-119  Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP signature.

Length = 245

Score = 334 bits (858), Expect = e-119
Identities = 245/245 (100%), Positives = 245/245 (100%)

Query: 1 MRRLLSVAPVLLWLITPLAFAQLPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILLM 60
MRRLLSVAPVLLWLITPLAFAQLPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILLM
Sbjct: 1 MRRLLSVAPVLLWLITPLAFAQLPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILLM 60

Query: 61 MTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEK 120
MTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEK
Sbjct: 61 MTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEK 120

Query: 121 ISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELK 180
ISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELK
Sbjct: 121 ISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELK 180

Query: 181 TAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLA 240
TAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLA
Sbjct: 181 TAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLA 240

Query: 241 QSFYS 245
QSFYS
Sbjct: 241 QSFYS 245


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1238  FLGMOTORFLIN  211    3e-74    Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 211 bits (539), Expect = 3e-74
Identities = 125/137 (91%), Positives = 133/137 (97%)

Query: 1 MSDMNNPADDNNGAMDDLWAEALSEQKSTNSKSAADAVFQQFGGGDVSGTLQDIDLIMDI 60
MSDMNNP+D+N GA+DDLWA+AL+EQK+T +KSAADAVFQQ GGGDVSG +QDIDLIMDI
Sbjct: 1 MSDMNNPSDENTGALDDLWADALNEQKATTTKSAADAVFQQLGGGDVSGAMQDIDLIMDI 60

Query: 61 PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV 120
PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV
Sbjct: 61 PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV 120

Query: 121 RITDIITPSERMRRLSR 137
RITDIITPSERMRRLSR
Sbjct: 121 RITDIITPSERMRRLSR 137


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1239  FLGMOTORFLIM  381    e-135    Flagellar motor switch protein FliM signature.
		>FLGMOTORFLIM#Flagellar motor switch protein FliM signature.

Length = 344

Score = 381 bits (979), Expect = e-135
Identities = 85/324 (26%), Positives = 148/324 (45%), Gaps = 10/324 (3%)

Query: 5 ILSQAEIDALLNGDS--EVKDEPTASVSGESDIRPYDPNTQRRVVRERLQALEIINERFA 62
+LSQ EID LL S + E +S I YD + +E+++ L +++E FA
Sbjct: 4 VLSQDEIDQLLTAISSGDASIEDARPISDTRKITLYDFRRPDKFSKEQMRTLSLMHETFA 63

Query: 63 RHFRMGLFNLLRRSPDITVGAIRIQPYHEFARNLPVPTNLNLIHLKPLRGTGLVVFSPSL 122
R L LR + V ++ Y EF R++P P+ L +I + PL+G ++ PS+
Sbjct: 64 RLTTTSLSAQLRSMVHVHVASVDQLTYEEFIRSIPTPSTLAVITMDPLKGNAVLEVDPSI 123

Query: 123 VFIAVDNLFGGDGRFPTKVEGREFTHTEQRVINRMLKLALEGYSDAWKAINPLEVEYVRS 182
F +D LFGG G+ KV+ R+ T E V+ ++ L ++W + L +
Sbjct: 124 TFSIIDRLFGGTGQ-AAKVQ-RDLTDIENSVMEGVIVRILANVRESWTQVIDLRPRLGQI 181

Query: 183 EMQVKFTNITTSPNDIVVNTPFHVEIGNLTGEFNICLPFSMIEPLRELLVNPPLENS--R 240
E +F I P+++VV ++G G N C+P+ IEP+ L + +S R
Sbjct: 182 ETNPQFAQI-VPPSEMVVLVTLETKVGEEEGMMNFCIPYITIEPIISKLSSQFWFSSVRR 240

Query: 241 NEDQNWRDNLVRQVQHSQLELVANFADISLRLSQILKLKPGDVLPIEKP---DRIIAHVD 297
+ + L ++ +++VA + L + IL L+ GD++ + D + +
Sbjct: 241 SSTTQYMGVLRDKLSTVDMDVVAEVGSLRLSVRDILGLRVGDIIRLHDTHVGDPFVLSIG 300

Query: 298 GVPVLTSQYGTLNGQYALRIEHLI 321
Q G + + A +I I
Sbjct: 301 NRKKFLCQPGVVGKKIAAQILERI 324


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1241  FLGHOOKFLIK   468    e-168    Flagellar hook-length control protein signature.
		>FLGHOOKFLIK#Flagellar hook-length control protein signature.

Length = 375

Score = 468 bits (1204), Expect = e-168
Identities = 365/375 (97%), Positives = 370/375 (98%)

Query: 1 MIRLAPLITADVDTTTLPGGKASDAAQDFLALLSEALAGETTTDKAAPQLLVATDKPTTK 60
MIRLAPLITADVDTTTLPGGKASDAAQDFLALLSEALAGETTTDKAAPQLLVATDKPTTK
Sbjct: 1 MIRLAPLITADVDTTTLPGGKASDAAQDFLALLSEALAGETTTDKAAPQLLVATDKPTTK 60

Query: 61 GEPLVSEILADAQQADLLIPVDETPPVINDEQSTSTPLTTAQTMTLAAVAGNNTAKDEKA 120
GEPL+S+I++DAQQA+LLIPVDETPPVINDEQSTSTPLTTAQTM LAAVA NT KDEKA
Sbjct: 61 GEPLISDIVSDAQQANLLIPVDETPPVINDEQSTSTPLTTAQTMALAAVADKNTTKDEKA 120

Query: 121 DDLNEDVTASLSALFAMLPGFDNTPKVTDAPSTVLPTEKPTLFTKLTSEQLTTAQPDDAP 180
DDLNEDVTASLSALFAMLPGFDNTPKVTDAPSTVLPTEKPTLFTKLTSEQLTTAQPDDAP
Sbjct: 121 DDLNEDVTASLSALFAMLPGFDNTPKVTDAPSTVLPTEKPTLFTKLTSEQLTTAQPDDAP 180

Query: 181 GTPAQPLTPLVAEAQSKAEVISTPSPVTAAASPLITPHQTQPLPTVAAPVLSAPLGSHEW 240
GTPAQPLTPLVAEAQSKAEVISTPSPVTAAASPLITPHQTQPLPTVAAPVLSAPLGSHEW
Sbjct: 181 GTPAQPLTPLVAEAQSKAEVISTPSPVTAAASPLITPHQTQPLPTVAAPVLSAPLGSHEW 240

Query: 241 QQSLSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQMVSPHQHVRAALEAA 300
QQSLSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQMVSPHQHVRAALEAA
Sbjct: 241 QQSLSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQMVSPHQHVRAALEAA 300

Query: 301 LPVLRTQLAESGIQLGQSNISGESFSGQQQAASQQQQSQRTVNHEPLAGEDDDTLPVPVS 360
LPVLRTQLAESGIQLGQSNISGESFSGQQQAASQQQQSQRT NHEPLAGEDDDTLPVPVS
Sbjct: 301 LPVLRTQLAESGIQLGQSNISGESFSGQQQAASQQQQSQRTANHEPLAGEDDDTLPVPVS 360

Query: 361 LQGRVTGNSGVDIFA 375
LQGRVTGNSGVDIFA
Sbjct: 361 LQGRVTGNSGVDIFA 375


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1242  FLGFLIJ       202    2e-70    Flagellar FliJ protein signature.
		>FLGFLIJ#Flagellar FliJ protein signature.

Length = 147

Score = 202 bits (515), Expect = 2e-70
Identities = 146/147 (99%), Positives = 147/147 (100%)

Query: 1 MAEHGALATLKDLAEKEVEDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRNNLNSDMSAG 60
MAEHGALATLKDLAEKEVEDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRNNLNSDMSAG
Sbjct: 1 MAEHGALATLKDLAEKEVEDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRNNLNSDMSAG 60

Query: 61 MTSNRWINYQQFIQTLEKAITQHRQQLNQWTQKVDIALNSWREKKQRLQAWQTLQERQST 120
+TSNRWINYQQFIQTLEKAITQHRQQLNQWTQKVDIALNSWREKKQRLQAWQTLQERQST
Sbjct: 61 ITSNRWINYQQFIQTLEKAITQHRQQLNQWTQKVDIALNSWREKKQRLQAWQTLQERQST 120

Query: 121 AALLAENRLDQKKMDEFAQRAAMRKPE 147
AALLAENRLDQKKMDEFAQRAAMRKPE
Sbjct: 121 AALLAENRLDQKKMDEFAQRAAMRKPE 147


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1244  FLGFLIH       374    e-136    Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 374 bits (961), Expect = e-136
Identities = 225/228 (98%), Positives = 228/228 (100%)

Query: 1 MSDNLPWKTWTPDDLAPPQAEFVPMVEPEETIIEEAEPSLEQQLAQLQMQAHEQGYQAGI 60
MSDNLPWKTWTPDDLAPPQAEFVP+VEPEETIIEEAEPSLEQQLAQLQMQAHEQGYQAGI
Sbjct: 1 MSDNLPWKTWTPDDLAPPQAEFVPIVEPEETIIEEAEPSLEQQLAQLQMQAHEQGYQAGI 60

Query: 61 AEGRQQGHEQGYQEGLARGLEQGLAEAKSQQAPIHARMQQLVSEFQTTLDALDSVIASRL 120
AEGRQQGH+QGYQEGLA+GLEQGLAEAKSQQAPIHARMQQLVSEFQTTLDALDSVIASRL
Sbjct: 61 AEGRQQGHKQGYQEGLAQGLEQGLAEAKSQQAPIHARMQQLVSEFQTTLDALDSVIASRL 120

Query: 121 MQMALEAARQVIGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGAT 180
MQMALEAARQVIGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGAT
Sbjct: 121 MQMALEAARQVIGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGAT 180

Query: 181 LSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATRWQELCRLAAPGVV 228
LSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATRWQELCRLAAPGVV
Sbjct: 181 LSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATRWQELCRLAAPGVV 228


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1245  FLGMOTORFLIG  341    e-119    Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 341 bits (876), Expect = e-119
Identities = 117/329 (35%), Positives = 197/329 (59%), Gaps = 2/329 (0%)

Query: 1 MSNLTGTDKSVILLMTIGEDRAAEVFKHLSQREVQTLSAAMANVTQISNKQLTDVLAEFE 60
+S LTG K+ ILL++IG + +++VFK+LSQ E+++L+ +A + I+++ +VL EF+
Sbjct: 12 VSALTGKQKAAILLVSIGSEISSKVFKYLSQEEIESLTFEIAKLETITSELKDNVLLEFK 71

Query: 61 QEAEQFAALNINANDYLRSVLVKALGEERAASLLEDILETRDTASGIETLNFMEPQSAAD 120
+ + DY R +L K+LG ++A ++ + L + + E + +P + +
Sbjct: 72 ELMMAQEFIQKGGIDYARELLEKSLGTQKAVDIINN-LGSALQSRPFEFVRRADPANILN 130

Query: 121 LIRDEHPQIIATILVHLKRAQAADILALFDERLRHDVMLRIATFGGVQPAALAELTEVLN 180
I+ EHPQ IA IL +L +A+ IL+ ++ +V RIA P + E+ VL
Sbjct: 131 FIQQEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEVVREVERVLE 190

Query: 181 GLLDGQ-NLKRSKMGGVRTAAEIINLMKTQQEEAVITAVREFDGELAQKIIDEMFLFENL 239
L + + GGV EIIN+ + E+ +I ++ E D ELA++I +MF+FE++
Sbjct: 191 KKLASLSSEDYTSAGGVDNVVEIINMADRKTEKFIIESLEEEDPELAEEIKKKMFVFEDI 250

Query: 240 VDVDDRSIQRLLQEVDSESLLIALKGAEQPLREKFLRNMSQRAADILRDDLANRGPVRLS 299
V +DDRSIQR+L+E+D + L ALK + P++EK +NMS+RAA +L++D+ GP R
Sbjct: 251 VLLDDRSIQRVLREIDGQELAKALKSVDIPVQEKIFKNMSKRAASMLKEDMEFLGPTRRK 310

Query: 300 QVENEQKAILLIVRRLAETGEMVIGSGED 328
VE Q+ I+ ++R+L E GE+VI G +
Sbjct: 311 DVEESQQKIVSLIRKLEEQGEIVISRGGE 339


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1246  FLGMRINGFLIF  756    0.0      Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 756 bits (1953), Expect = 0.0
Identities = 479/555 (86%), Positives = 515/555 (92%), Gaps = 5/555 (0%)

Query: 3 ATAAQTKSLEWLNRLRANPKIPLIVAGSAAVAVMVALILWAKAPDYRTLFSNLSDQDGGA 62
+TA Q K LEWLNRLRANP+IPLIVAGSAAVA++VA++LWAK PDYRTLFSNLSDQDGGA
Sbjct: 5 STATQPKPLEWLNRLRANPRIPLIVAGSAAVAIVVAMVLWAKTPDYRTLFSNLSDQDGGA 64

Query: 63 IVSQLTQMNIPYRFSEASGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQ 122
IV+QLTQMNIPYRF+ SGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQ
Sbjct: 65 IVAQLTQMNIPYRFANGSGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQ 124

Query: 123 FSEQVNYQRALEGELSRTIETIGPVKGARVHLAMPKPSLFVREQKSPSASVTVNLLPGRA 182
FSEQVNYQRALEGEL+RTIET+GPVK ARVHLAMPKPSLFVREQKSPSASVTV L PGRA
Sbjct: 125 FSEQVNYQRALEGELARTIETLGPVKSARVHLAMPKPSLFVREQKSPSASVTVTLEPGRA 184

Query: 183 LDEGQISAIVHLVSSAVAGLPPGNVTLVDQGGHLLTQSNTSGRDLNDAQLKYASDVEGRI 242
LDEGQISA+VHLVSSAVAGLPPGNVTLVDQ GHLLTQSNTSGRDLNDAQLK+A+DVE RI
Sbjct: 185 LDEGQISAVVHLVSSAVAGLPPGNVTLVDQSGHLLTQSNTSGRDLNDAQLKFANDVESRI 244

Query: 243 QRRIEAILSPIVGNGNIHAQVTAQLDFASKEQTEEQYRPNGDESHAALRSRQLNESEQSG 302
QRRIEAILSPIVGNGN+HAQVTAQLDFA+KEQTEE Y PNGD S A LRSRQLN SEQ G
Sbjct: 245 QRRIEAILSPIVGNGNVHAQVTAQLDFANKEQTEEHYSPNGDASKATLRSRQLNISEQVG 304

Query: 303 SGYPGGVPGALSNQPAPANNAPISTPPANQNNRQQ--QASTTSNS---GPRSTQRNETSN 357
+GYPGGVPGALSNQPAP N API+TPP NQ N Q Q ST++NS GPRSTQRNETSN
Sbjct: 305 AGYPGGVPGALSNQPAPPNEAPIATPPTNQQNAQNTPQTSTSTNSNSAGPRSTQRNETSN 364

Query: 358 YEVDRTIRHTKMNVGDVQRLSVAVVVNYKTLPDGKPLPLSNEQMKQIEDLTREAMGFSEK 417
YEVDRTIRHTKMNVGD++RLSVAVVVNYKTL DGKPLPL+ +QMKQIEDLTREAMGFS+K
Sbjct: 365 YEVDRTIRHTKMNVGDIERLSVAVVVNYKTLADGKPLPLTADQMKQIEDLTREAMGFSDK 424

Query: 418 RGDSLNVVNSPFNSSDESGGELPFWQQQAFIDQLLAAGRWLLVLLVAWLLWRKAVRPQLT 477
RGD+LNVVNSPF++ D +GGELPFWQQQ+FIDQLLAAGRWLLVL+VAW+LWRKAVRPQLT
Sbjct: 425 RGDTLNVVNSPFSAVDNTGGELPFWQQQSFIDQLLAAGRWLLVLVVAWILWRKAVRPQLT 484

Query: 478 RRAEAMKAVQQQAQAREEVEDAVEVRLSKDEQLQQRRANQRLGAEVMSQRIREMSDNDPR 537
RR E KA Q+QAQ R+E E+AVEVRLSKDEQLQQRRANQRLGAEVMSQRIREMSDNDPR
Sbjct: 485 RRVEEAKAAQEQAQVRQETEEAVEVRLSKDEQLQQRRANQRLGAEVMSQRIREMSDNDPR 544

Query: 538 VVALVIRQWINNDHE 552
VVALVIRQW++NDHE
Sbjct: 545 VVALVIRQWMSNDHE 559


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1247  FLGHOOKFLIE   117    5e-38    Flagellar hook-basal body complex protein FliE signa...
		>FLGHOOKFLIE#Flagellar hook-basal body complex protein FliE signature.
Length = 103

Score = 117 bits (294), Expect = 5e-38
Identities = 103/103 (100%), Positives = 103/103 (100%)

Query: 2 SAIQGIEGVISQLQATAMSARAQESLPQPTISFAGQLHAALDRISDTQTAARTQAEKFTL 61
SAIQGIEGVISQLQATAMSARAQESLPQPTISFAGQLHAALDRISDTQTAARTQAEKFTL
Sbjct: 1 SAIQGIEGVISQLQATAMSARAQESLPQPTISFAGQLHAALDRISDTQTAARTQAEKFTL 60

Query: 62 GEPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV 104
GEPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV
Sbjct: 61 GEPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV 103


24  EcSMS35_1389  EcSMS35_1419  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
EcSMS35_1389  0       25       -3.500121  LysR family transcriptional regulator
EcSMS35_1390  1       22       -5.877349  leucine export protein LeuE
EcSMS35_1391  2       25       -6.096358  hypothetical protein
EcSMS35_1392  0       21       -3.924909  hypothetical protein
EcSMS35_1393  0       20       -3.568385  hypothetical protein
EcSMS35_1394  1       20       -2.766671  hypothetical protein
EcSMS35_1395  -2      19       -1.110190  hypothetical protein
EcSMS35_1396  -1      20       -0.805531  hypothetical protein
EcSMS35_1397  -1      19       0.035828   GAF domain/diguanylate cyclase domain-containing
EcSMS35_1398  0       21       0.624414   putative lipoprotein
EcSMS35_1399  0       20       0.544258   hypothetical protein
EcSMS35_1400  -1      21       -1.542764  inner membrane transport protein yeaN
EcSMS35_1402  -2      22       -5.306827  AraC family transcriptional regulator
EcSMS35_1401  -3      18       -5.209014  hypothetical protein
EcSMS35_1403  -1      15       -4.800706  hypothetical protein
EcSMS35_1404  -1      12       -4.202078  YbaK/prolyl-tRNA synthetase-associated
EcSMS35_1405  -1      13       -4.057897  diguanylate cyclase
EcSMS35_1406  -1      10       -3.370097  diguanylate cyclase
EcSMS35_1407  1       17       -1.324132  hypothetical protein
EcSMS35_1408  2       19       -1.231060  serine kinase family protein
EcSMS35_1409  1       23       -1.392339  MltA-interacting protein MipA
EcSMS35_1410  0       20       -1.799373  aldo/keto reductase family oxidoreductase
EcSMS35_1411  0       17       -2.214630  aldose 1-epimerase family protein
EcSMS35_1412  -1      18       -2.857861  glyceraldehyde-3-phosphate dehydrogenase
EcSMS35_1413  0       20       -3.727068  methionine sulfoxide reductase B
EcSMS35_1414  0       21       -4.285528  hypothetical protein
EcSMS35_1415  0       21       -4.049296  zinc-binding dehydrogenase family
EcSMS35_1416  -1      23       -5.077100  major facilitator family transporter
EcSMS35_1417  0       23       -5.891096  sorbitol dehydrogenase
EcSMS35_1418  -1      21       -5.270383  fructose-bisphosphate aldolase
EcSMS35_1419  -1      19       -4.327379  PfkB family kinase
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1395  HTHTETR       30     6e-04    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 30.0 bits (67), Expect = 6e-04
Identities = 9/37 (24%), Positives = 17/37 (45%), Gaps = 5/37 (13%)

Query: 4 LSWIIFGLIAGILAKWIMPG-----KDGGGFFMTILL 35
+ I+ G I+G++ W+ K ++ ILL
Sbjct: 163 AAIIMRGYISGLMENWLFAPQSFDLKKEARDYVAILL 199


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1401  PRTACTNFAMLY  28     0.022    Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 27.7 bits (61), Expect = 0.022
Identities = 18/61 (29%), Positives = 26/61 (42%)

Query: 49 QGLSIGIIILTIGVMAPIASGTLPPSTLIHSFLNWKSLVAIAVGVIVSWLGGRGVTLMGS 108
Q +I L IG + + LPPS ++ N ++ A VS LG +TL G
Sbjct: 174 QRSAIVDGGLHIGALQSLQPEDLPPSRVVLRDTNVTAVPASGAPAAVSVLGASELTLDGG 233

Query: 109 Q 109

Sbjct: 234 H 234


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1411  INVEPROTEIN   29     0.023    Salmonella/Shigella invasion protein E (InvE) signat...
		>INVEPROTEIN#Salmonella/Shigella invasion protein E (InvE) signature.
Length = 372

Score = 28.9 bits (64), Expect = 0.023
Identities = 18/81 (22%), Positives = 34/81 (41%), Gaps = 13/81 (16%)

Query: 158 ETTSALHTYFNVGDITKVSVSGLGDRFIDKVNDAKED-----------VLTDGIQTFPDR 206
E ++AL + N D K S S L + F ++V + + V ++ F +
Sbjct: 57 EMSAALAQFRNRRDYEKKS-SNLSNSF-ERVLEDEALPKAKQILKLISVHGGALEDFLRQ 114

Query: 207 TDRVYLNPQDCSVINDEALNR 227
++ +P D ++ E L R
Sbjct: 115 ARSLFPDPSDLVLVLRELLRR 135


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1416  TCRTETB       31     0.011    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 31.0 bits (70), Expect = 0.011
Identities = 33/142 (23%), Positives = 48/142 (33%), Gaps = 23/142 (16%)

Query: 71 MFLGALVGGIIGDKTGRRNAFILYEAIHIASMVVGAFSPNMDF-LIACRFVMGVGLGALL 129
+G V G + D+ G + + I+ V+G + LI RF+ G G A
Sbjct: 62 FSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFP 121

Query: 130 VTLFAGFTEYMPGRNR----GTWSSRVSFIGNWSYPLCSLIAMGLTPLISA----EWNWR 181
+ Y+P NR G S V+ + G+ P I +W
Sbjct: 122 ALVMVVVARYIPKENRGKAFGLIGSIVA------------MGEGVGPAIGGMIAHYIHWS 169

Query: 182 VQLLIPAILSLIATALAWRYFP 203
LLIP I I T
Sbjct: 170 YLLLIPMI--TIITVPFLMKLL 189


25  EcSMS35_1431  EcSMS35_1473  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
EcSMS35_1431  2       15       2.970071   hypothetical protein
EcSMS35_1430  2       15       3.341383   pyrimidine (deoxy)nucleoside triphosphate
EcSMS35_1432  2       15       3.456040   CDP-alcohol phosphatidyltransferase family
EcSMS35_1433  2       14       3.071009   rhodanese-like domain-containing protein
EcSMS35_1434  2       15       2.385510   ABC transporter ATP-binding protein
EcSMS35_1435  2       16       1.680555   ABC transporter permease
EcSMS35_1436  1       14       0.676240   putative ABC transporter solute-binding protein
EcSMS35_1437  0       16       -0.743229  carboxymuconolactone decarboxylase family
EcSMS35_1438  -1      13       -0.231279  hypothetical protein
EcSMS35_1439  0       12       0.750544   hypothetical protein
EcSMS35_1440  1       12       2.862770   hypothetical protein
EcSMS35_1441  0       13       4.034274   exonuclease III
EcSMS35_1442  0       13       3.922576   hypothetical protein
EcSMS35_1443  0       12       3.914248   bifunctional succinylornithine
EcSMS35_1444  0       13       3.546549   arginine succinyltransferase
EcSMS35_1445  1       13       2.795960   succinylglutamic semialdehyde dehydrogenase
EcSMS35_1446  -1      11       0.780447   succinylarginine dihydrolase
EcSMS35_1447  -1      13       -0.558257  succinylglutamate desuccinylase
EcSMS35_1448  0       15       -1.905288  periplasmic protein
EcSMS35_1450  1       16       -2.246883  hypothetical protein
EcSMS35_1449  0       18       -2.518565  nucleotide excision repair endonuclease
EcSMS35_1451  0       18       -4.492922  NAD synthetase
EcSMS35_1452  0       17       -4.965011  DNA-binding transcriptional activator OsmE
EcSMS35_1453  -1      18       -4.530353  PTS system N,N'-diacetylchitobiose-specific
EcSMS35_1454  -2      12       -2.854856  PTS system N,N'-diacetylchitobiose-specific
EcSMS35_1455  -2      13       -2.790211  PTS system N,N'-diacetylchitobiose-specific
EcSMS35_1456  -1      15       -4.575274  DNA-binding transcriptional regulator ChbR
EcSMS35_1457  -2      11       -2.619949  6-phospho-beta-glucosidase
EcSMS35_1458  -2      12       -1.868931  hypothetical protein
EcSMS35_1459  -2      14       -1.798613  hydroperoxidase II
EcSMS35_1460  -1      17       -3.542086  cell division modulator
EcSMS35_1461  -1      17       -2.909770  hypothetical protein
EcSMS35_1462  0       15       -0.762623  sodium/dicarboxylate symporter family protein
EcSMS35_1463  0       18       -0.830876  hypothetical protein
EcSMS35_1464  0       19       -1.129972  2-deoxyglucose-6-phosphatase
EcSMS35_1465  0       19       -1.474362  hypothetical protein
EcSMS35_1466  1       28       -5.128109  fructosamine kinase
EcSMS35_1467  0       20       -4.881327  hypothetical protein
EcSMS35_1468  -1      19       -4.640101  6-phosphofructokinase 2
EcSMS35_1469  -1      20       -5.744559  hypothetical protein
EcSMS35_1470  -1      22       -6.104105  hypothetical protein
EcSMS35_1471  0       20       -4.711289  ankyrin repeat-containing protein
EcSMS35_1472  3       28       0.427173   threonyl-tRNA synthetase
EcSMS35_1473  4       29       1.107191   translation initiation factor IF-3
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1445  DNABINDINGHU  31     0.002    Prokaryotic integration host factor signature.
		>DNABINDINGHU#Prokaryotic integration host factor signature.

Length = 91

Score = 31.2 bits (71), Expect = 0.002
Identities = 14/61 (22%), Positives = 28/61 (45%), Gaps = 5/61 (8%)

Query: 74 SNKAELTAIIARETGKPRWEAATEVTAMINKIAISIKAYHVRTGEQRSEMPDGAASLRHR 133
+NK +L A +A T + ++A V A+ + ++ + GE+ + G +R R
Sbjct: 2 ANKQDLIAKVAEATELTKKDSAAAVDAVFSAVSSYLAK-----GEKVQLIGFGNFEVRER 56

Query: 134 P 134

Sbjct: 57 A 57


26  EcSMS35_1498  EcSMS35_1503  Y  N  N  Genomic Island
LocusTag      DNBias  CDNBias  %GCBias    Product
EcSMS35_1498  -2      16       -3.277971  putative electron transfer flavoprotein YdiQ
EcSMS35_1499  0       17       -3.738536  hypothetical protein
EcSMS35_1500  -1      15       -4.028308  AraC family transcriptional regulator
EcSMS35_1501  0       15       -3.733815  putative acyl-CoA dehydrogenase
EcSMS35_1502  1       18       -4.557569  propionate CoA-transferase
EcSMS35_1503  2       15       -3.290306  3-dehydroquinate dehydratase
27  EcSMS35_1615  EcSMS35_1640  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
EcSMS35_1615  -2      16       -3.577531  hypothetical protein
EcSMS35_1616  -1      15       -3.298732  diamine N-acetyltransferase
EcSMS35_1617  0       17       -3.436314  hypothetical protein
EcSMS35_1618  -1      19       -3.122249  hypothetical protein
EcSMS35_1619  -2      18       -3.224567  starvation-sensing protein RspA
EcSMS35_1620  -2      19       -3.639796  putative dehydrogenase
EcSMS35_1621  -2      18       -2.579012  inner membrane metabolite transport protein
EcSMS35_1622  0       17       -1.431249  hypothetical protein
EcSMS35_1623  0       17       -1.416034  mannitol dehydrogenase family protein
EcSMS35_1624  0       18       -2.373930  hypothetical protein
EcSMS35_1625  0       17       -2.277913  GntR family transcriptional regulator
EcSMS35_1626  0       16       -2.242520  3-hydroxy acid dehydrogenase
EcSMS35_1627  0       18       -3.360095  dipeptidyl carboxypeptidase II
EcSMS35_1628  0       22       -4.876352  hypothetical protein
EcSMS35_1629  1       21       -5.563173  hypothetical protein
EcSMS35_1630  2       21       -5.290543  putative MFS-type transporter YdeE
EcSMS35_1631  2       26       -7.503588  O-acetylserine/cysteine export protein
EcSMS35_1632  1       25       -8.189651  GntR family transcriptional regulator
EcSMS35_1633  1       27       -7.681444  PTS system lactose/cellobiose family IIB
EcSMS35_1634  2       25       -7.411281  PTS system lactose/cellobiose family IIC
EcSMS35_1635  0       22       -6.797402  PTS system lactose/cellobiose-specific family
EcSMS35_1636  0       19       -6.116739  glucoside specific outer membrane porin BglH
EcSMS35_1637  0       14       -3.114560  6-phospho-beta-glucosidase
EcSMS35_1638  1       17       -1.554531  hypothetical protein
EcSMS35_1639  0       17       -1.471133  DNA-binding transcriptional activator MarA
EcSMS35_1640  0       20       -4.158763  DNA-binding transcriptional repressor MarR
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1616  SACTRNSFRASE  42     3e-07    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 41.9 bits (98), Expect = 3e-07
Identities = 22/112 (19%), Positives = 47/112 (41%), Gaps = 4/112 (3%)

Query: 34 FEEPYEAFVELSDLYDKHIHDQSERRFVVECDGEKAGLVELVEINHVHRRAEFQ-IIISP 92
F +PY E D+ ++ ++ + F+ + G +++ + + A + I ++
Sbjct: 42 FSKPYFKQYEDDDMDVSYVEEEGKAAFLYYLENNCIGRIKIRS--NWNGYALIEDIAVAK 99

Query: 93 EYQGKGLATRAAKLAMDYGFTVLNLYKLYLIVDKENEKAIHIYRKLGFTVEG 144
+Y+ KG+ T A+++ + L L N A H Y K F +
Sbjct: 100 DYRKKGVGTALLHKAIEWAKEN-HFCGLMLETQDINISACHFYAKHHFIIGA 150


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1621  TCRTETB       49     3e-08    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 48.7 bits (116), Expect = 3e-08
Identities = 33/118 (27%), Positives = 55/118 (46%), Gaps = 16/118 (13%)

Query: 72 VGAFIFGKMGDRIGRKKVLFITITMMGICTTLIGVLPTYAQIGVFAPILLVTLRIIQGLG 131
+G ++GK+ D++G K++L I + + + V ++ + + A R IQG G
Sbjct: 64 IGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMA-------RFIQGAG 116

Query: 132 AGAEISGAGTMLAEYAPKGKR----GIISSFVAMGTNCGTLSATAI-----WAFMFFI 180
A A + ++A Y PK R G+I S VAMG G I W+++ I
Sbjct: 117 AAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLI 174


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1626  DHBDHDRGNASE  100    2e-27    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.
Length = 261

Score = 99.7 bits (248), Expect = 2e-27
Identities = 70/244 (28%), Positives = 114/244 (46%), Gaps = 16/244 (6%)

Query: 2 IVLVTGATAGFGECITRRFIQQGHKVIATGRRQERLQELKDELGDNLYIAQ---LDVRNR 58
I +TGA G GE + R QG + A E+L+++ L A+ DVR+
Sbjct: 10 IAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRDS 69

Query: 59 AAIEEMLASLPAEWCNIDILVNNAGLALGMEPAHKASVEDWETMIDTNNKGLVYMTRAVL 118
AAI+E+ A + E IDILVN AG+ L H S E+WE N+ G+ +R+V
Sbjct: 70 AAIDEITARIEREMGPIDILVNVAGV-LRPGLIHSLSDEEWEATFSVNSTGVFNASRSVS 128

Query: 119 PGMVERNYGHIINIGSTAGSWPYAGGNVYGATKAFVRQFSLNLRTDLHGTAVRVTDIEPG 178
M++R G I+ +GS P Y ++KA F+ L +L +R + PG
Sbjct: 129 KYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSPG 188

Query: 179 LVGGTEFSNVRFKGDDGKAE------KTYQNTVALT----PEDVSEAV-WWVSTLPAHVN 227
T+ + ++G + +T++ + L P D+++AV + VS H+
Sbjct: 189 ST-ETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGHIT 247

Query: 228 INTL 231
++ L
Sbjct: 248 MHNL 251


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1630  TCRTETA       41     4e-06    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 41.3 bits (97), Expect = 4e-06
Identities = 41/235 (17%), Positives = 82/235 (34%), Gaps = 10/235 (4%)

Query: 7 RSTSALLASSLLLTIGRGATLPFMTIYLSRQYSLSVDLI---GYAMTIALTIGVVFSLGF 63
R +L++ L +G G +P + L R S D+ G + + + +
Sbjct: 5 RPLIVILSTVALDAVGIGLIMPVLPGLL-RDLVHSNDVTAHYGILLALYALMQFACAPVL 63

Query: 64 GILADKFDKKRYMLLAITAFASGFIAIPLVNNVTLVVLFFALINCAYSVFATVLKAWFAD 123
G L+D+F ++ +L+++ A + + + ++ + + + A A+ AD
Sbjct: 64 GALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAG-AYIAD 122

Query: 124 NLSSTSKTKIFSINYTMLNIGWTIGPPLGTLLVMQSINLPFWLAAICSAFPMLFIQIWVK 183
+ + F G GP LG L+ S + PF+ AA + L +
Sbjct: 123 ITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLP 182

Query: 184 RSEKIIAT-ETGRAWSPKVLLQDKALL----WFTCSGFLASFVSGAFASCISQYV 233
S K A +P + + F+ V A+ +
Sbjct: 183 ESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFG 237



Score = 32.1 bits (73), Expect = 0.003
Identities = 22/155 (14%), Positives = 60/155 (38%), Gaps = 2/155 (1%)

Query: 7 RSTSALLASSLLLTIGRGATLPFMTIYLSRQYSLSVDLIGYAMTIALTIGVVF-SLGFGI 65
+AL+A ++ + I+ ++ IG ++ + + ++ G
Sbjct: 210 TVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGP 269

Query: 66 LADKFDKKRYMLLAITAFASGFIAIPLVNNVTLVVLFFALINCAYSVFATVLKAWFADNL 125
+A + ++R ++L + A +G+I + + L+ + L+A + +
Sbjct: 270 VAARLGERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLASG-GIGMPALQAMLSRQV 328

Query: 126 SSTSKTKIFSINYTMLNIGWTIGPPLGTLLVMQSI 160
+ ++ + ++ +GP L T + SI
Sbjct: 329 DEERQGQLQGSLAALTSLTSIVGPLLFTAIYAASI 363


28  EcSMS35_1660  EcSMS35_1678  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
EcSMS35_1660  2       18       -1.726195  transcriptional repressor LsrR
EcSMS35_1661  2       21       -3.109162  autoinducer-2 (AI-2) kinase
EcSMS35_1662  3       24       -4.045020  pertactin family protein
EcSMS35_1663  2       32       -6.639427  hypothetical protein
EcSMS35_1664  2       28       -5.002318  fimbrial protein
EcSMS35_1665  2       29       -4.759772  periplasmic pilus chaperone family protein
EcSMS35_1666  3       29       -4.947695  fimbrial usher family protein
EcSMS35_1667  2       25       -2.621630  protein FimF-like protein
EcSMS35_1669  2       24       -2.552092  protein FimG-like protein
EcSMS35_1668  1       23       -3.461946  IS2 transposase orfB
EcSMS35_1670  1       24       -5.400321  insertion sequence 2 OrfA protein
EcSMS35_1671  1       25       -6.861116  protein FimH-like protein
EcSMS35_1672  -1      23       -7.108719  putative oxidoreductase
EcSMS35_1673  0       20       -8.189557  hypothetical protein
EcSMS35_1674  -1      20       -7.156181  transcriptional regulator YdeO
EcSMS35_1675  -2      15       -5.661733  sulfatase
EcSMS35_1676  -2      14       -5.126953  radical SAM domain-containing protein
EcSMS35_1677  -2      12       -3.621050  inner membrane ABC transporter ATP-binding
EcSMS35_1678  -2      12       -3.293350  TonB-dependent receptor
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1662  PRTACTNFAMLY  63     9e-12    Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 62.8 bits (152), Expect = 9e-12
Identities = 82/349 (23%), Positives = 124/349 (35%), Gaps = 35/349 (10%)

Query: 1271 QGTTDIVGGEIAFGSDSAINMASQHINIHNSGVMSGNVTTAGDVNVMPGGTLRVAKTTVG 1330
DIV E+ S ++ + + + +G +++ + + VG
Sbjct: 392 DAQGDIVATELP--SIPGTSIGPLDVALASQARWTGATRAVDSLSIDNATWVMTDNSNVG 449

Query: 1331 G-NLENGGTVQMNSEGGKPGNVLTVNGNYTGNNGLMTFNATLGGDNSPTDKMNVKGDTQG 1389
L + G+V G + N +GL N D +DK+ V D G
Sbjct: 450 ALRLASDGSVDFQQPAEA-GRFKVLTVNTLAGSGLFRMNVFA--DLGLSDKLVVMQDASG 506

Query: 1390 NTRVRVDNIGGVGAQTVNGIELIEVGGNSAGNFALTT--GTVEAGAYVYTLAKGKGNDEK 1447
R+ V N G + N + L++ SA F L G V+ G Y Y LA N
Sbjct: 507 QHRLWVRNSGS-EPASANTLLLVQTPLGSAATFTLANKDGKVDIGTYRYRLAA---NGNG 562

Query: 1448 NWYLTSKWDGVTPADTPDPINNPPVVDPEGPS--VYRPEAGSYIS----------NIAAA 1495
W L P P P PP P +P AG +S + A
Sbjct: 563 QWSLVGAKAPPAPKPAPQPGPQPPQPPQPQPEAPAPQPPAGRELSAAANAAVNTGGVGLA 622

Query: 1496 NSLF---SHRLHDRLGEPQYTDSLRVQDSATSMWMRHVGGHERFRTGDGQLNTQANRYVL 1552
++L+ S+ L RLGE LR+ A W R ++ G+ Q
Sbjct: 623 STLWYAESNALSKRLGE------LRLNPDAGGAWGRGFAQRQQLDNRAGRRFDQ-KVAGF 675

Query: 1553 QLGGDLAQWSSTQDRWHLGVMAGYANQHSNTQSNHVGYKSDGRISGYSA 1601
+LG D A + RWHLG +AGY + G+ + GY+
Sbjct: 676 ELGADHA-VAVAGGRWHLGGLAGYTRGDRGFTGDGGGHTDSVHVGGYAT 723


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1666  PF00577       940    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 940 bits (2432), Expect = 0.0
Identities = 498/869 (57%), Positives = 654/869 (75%), Gaps = 10/869 (1%)

Query: 3 QVLLLPRFARLTIALGLATAVFPVDAEFYFNPRFLSNDLAESVDLSAFTKGREAPPGTYR 62
+ L F RL +A A AE YFNPRFL++D DLS F G+E PPGTYR
Sbjct: 20 KHRLAGFFVRLFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPGTYR 79

Query: 63 VDIYLNDEFMTSRDITFIADDNNADLIPCLSTDLLVSLGIKKSALLDNKEHSAEKHVSDN 122
VDIYLN+ +M +RD+TF D+ ++PCL+ L S+G+ +++ + ++ +
Sbjct: 80 VDIYLNNGYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLNTASV-------SGMNLLAD 132

Query: 123 SACTPLRDRLADASSEFNVGQQHLSLSVPQIYVGRMARGYVSPDLWEEGINAGLLNYSFN 182
AC PL + DA+++ +VGQQ L+L++PQ ++ ARGY+ P+LW+ GINAGLLNY+F+
Sbjct: 133 DACVPLTSMIHDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFS 192

Query: 183 GNSINNRGNHNAGKSNYAYLNLQSGINIGSWRLRDNSTWSYNSGSSNSSDSNKWQHINTS 242
GNS+ NR G S+YAYLNLQSG+NIG+WRLRDN+TWSYNS S+S NKWQHINT
Sbjct: 193 GNSVQNRI---GGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTW 249

Query: 243 AERDIIPLRSRLTVGDSYTDGDIFDSVNFRGLKINSTEAMLPDSQHGFAPVIHGIARSTA 302
ERDIIPLRSRLT+GD YT GDIFD +NFRG ++ S + MLPDSQ GFAPVIHGIAR TA
Sbjct: 250 LERDIIPLRSRLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTA 309

Query: 303 QVSVKQNGYDVYQTTVPPGPFTIDDINSAANGGDLQVTIKEADGSIQTLYVPYSSVPVLQ 362
QV++KQNGYD+Y +TVPPGPFTI+DI +A N GDLQVTIKEADGS Q VPYSSVP+LQ
Sbjct: 310 QVTIKQNGYDIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQ 369

Query: 363 RAGYTRYALAMGEYRSGNNLQSSPKFIQGSLMHGLEGNWTPYGGMQIAEDYQAFNLGIGK 422
R G+TRY++ GEYRSGN Q P+F Q +L+HGL WT YGG Q+A+ Y+AFN GIGK
Sbjct: 370 REGHTRYSITAGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGK 429

Query: 423 DLGLFGAFSFDITQANTTLADGTRHSGQSIKSVYSKSFYQTGTNIQVAGYRYSTQGFYNL 482
++G GA S D+TQAN+TL D ++H GQS++ +Y+KS ++GTNIQ+ GYRYST G++N
Sbjct: 430 NMGALGALSVDMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNF 489

Query: 483 SDSAYSRMSGYTVKPPTGDSNEQTQFIDYFNLFYSKRGQEQISISQQLGNYGTTFFSASR 542
+D+ YSRM+GY ++ G + +F DY+NL Y+KRG+ Q++++QQLG T + S S
Sbjct: 490 ADTTYSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRTSTLYLSGSH 549

Query: 543 QSYWNTSRSDQQISFGLNVPFSDITTSLNYSYSNNIWQNDRDHLLAFTLNVPFSHWMRTD 602
Q+YW TS D+Q GLN F DI +L+YS + N WQ RD +LA +N+PFSHW+R+D
Sbjct: 550 QTYWGTSNVDEQFQAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRSD 609

Query: 603 SQSAFRNSNASYSMSNDLKGGMTNLSGVYGTLLPDNNLNYSVQVGNTHGGNTSSGTSGYS 662
S+S +R+++ASYSMS+DL G MTNL+GVYGTLL DNNL+YSVQ G GG+ +SG++GY+
Sbjct: 610 SKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYA 669

Query: 663 SLNYRGAYGNTNIGYSRSGDSSQIYYGMSGGIIAHADGITFGQPLGDTMVLVKAPGADNV 722
+LNYRG YGN NIGYS S D Q+YYG+SGG++AHA+G+T GQPL DT+VLVKAPGA +
Sbjct: 670 TLNYRGGYGNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNDTVVLVKAPGAKDA 729

Query: 723 KIENQTGIHTDWRGYAILPFATEYRENRVALNANSLADNVELDETVVTVIPTHGAIARAT 782
K+ENQTG+ TDWRGYA+LP+ATEYRENRVAL+ N+LADNV+LD V V+PT GAI RA
Sbjct: 730 KVENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAE 789

Query: 783 FNAQIGGKVLMTLKYGNKSVPFGAIVTHGENKNGSIVAENGQVYLTGLPQSGKLQVSWGN 842
F A++G K+LMTL + NK +PFGA+VT +++ IVA+NGQVYL+G+P +GK+QV WG
Sbjct: 790 FKARVGIKLLMTLTHNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLAGKVQVKWGE 849

Query: 843 DKNSNCIVDYKLPAVSPGTLLNQQTAICR 871
++N++C+ +Y+LP S LL Q +A CR
Sbjct: 850 EENAHCVANYQLPPESQQQLLTQLSAECR 878


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1669  FIMBRIALPAPF  32     6e-04    Escherichia coli: P pili tip fibrillum papF protein...
		>FIMBRIALPAPF#Escherichia coli: P pili tip fibrillum papF protein signature.
Length = 167

Score = 32.4 bits (73), Expect = 6e-04
Identities = 28/93 (30%), Positives = 46/93 (49%), Gaps = 7/93 (7%)

Query: 16 LFTATLQAADVTITVIGRVVAKPCTIQT-KEANVNLGDLYTRNLQQPGSASGWHNITLSL 74
L T+ ADV I + G V PCTI + V+ G++ N + ++ G +S+
Sbjct: 11 LLTSVAVLADVQINIRGNVYIPPCTINNGQNIVVDFGNI---NPEHVDNSRGEVTKNISI 67

Query: 75 TDCPVETSAVTAIVTGSTDNTGYYKNEGTAENI 107
+ CP ++ ++ VTG+T G +N A NI
Sbjct: 68 S-CPYKSGSLWIKVTGNTMGVG--QNNVLATNI 97


29  EcSMS35_1699  EcSMS35_1714  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
EcSMS35_1699  0       14       -4.861913  formate dehydrogenase, nitrate-inducible,
EcSMS35_1700  -1      15       -5.253508  formate dehydrogenase, nitrate inducible, alpha
EcSMS35_1701  0       17       -4.458148  hypothetical protein
EcSMS35_1702  -1      16       -3.620779  porin family protein
EcSMS35_1703  -1      15       -2.661401  hypothetical protein
EcSMS35_1704  -2      14       -0.623507  hypothetical protein
EcSMS35_1705  -2      15       1.842836   nitrite extrusion protein 2
EcSMS35_1706  -1      15       2.278250   nitrate reductase 2, alpha subunit
EcSMS35_1707  0       13       1.416937   nitrate reductase 2, beta subunit
EcSMS35_1708  1       16       0.286961   nitrate reductase molybdenum cofactor assembly
EcSMS35_1709  -1      16       -2.623751  respiratory nitrate reductase 2, gamma subunit
EcSMS35_1710  0       18       -3.196837  hypothetical protein
EcSMS35_1711  0       24       -4.908932  N-hydroxyarylamine O-acetyltransferase
EcSMS35_1712  0       25       -5.537596  flavin reductase domain-containing protein
EcSMS35_1713  0       19       -4.346183  4-oxalocrotonate tautomerase
EcSMS35_1714  -1      16       -3.844448  metallo-beta-lactamase family protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1702  ECOLIPORIN    479    e-172    E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 479 bits (1234), Expect = e-172
Identities = 225/386 (58%), Positives = 271/386 (70%), Gaps = 23/386 (5%)

Query: 1 MKLKIVAVVVTGLLAANVAHAAEVYNKDGNKLDLYGKVTALRYFTDDKRDDGDKTYARLG 60
MK K++A+V+ LLAA AHAAE+YNKDGNKLDLYGKV L YF+DD DGD+TY R+G
Sbjct: 1 MKRKVLALVIPALLAAGAAHAAEIYNKDGNKLDLYGKVDGLHYFSDDSSKDGDQTYMRVG 60

Query: 61 FKGETQINDQMIGFGHWEYDFKGYNDEANGSRGNKTRLAYAGLKISEFGSLDYGRNYGVG 120
FKGETQINDQ+ G+G WEY+ + E G+ + TRLA+AGLK ++GS DYGRNYGV
Sbjct: 61 FKGETQINDQLTGYGQWEYNVQANTTEGEGAN-SWTRLAFAGLKFGDYGSFDYGRNYGVL 119

Query: 121 YDIGSWTDMLPEFGGDTWSQKDVFMTYRTTGVATYRNYDFFGLIEGLNFAAQYQGKNER- 179
YD+ WTDMLPEFGGD+++ D +MT R GVATYRN DFFGL++GLNFA QYQGKNE
Sbjct: 120 YDVEGWTDMLPEFGGDSYTYADNYMTGRANGVATYRNTDFFGLVDGLNFALQYQGKNESQ 179

Query: 180 -------TDNGHLYGADYTRANGDGFGISSTYVYD-GFGIGAVYTKSDRTNAQERAAANP 231
N G D NGDGFGIS+TY GF GA YT SDRTN Q A
Sbjct: 180 SADDVNIGTNNRNNGDDIRYDNGDGFGISTTYDIGMGFSAGAAYTTSDRTNEQVNAGGT- 238

Query: 232 LNASGKNAELWATGIKYDANNIYFAANYAETLNMTTYG------DGYISNKAQSFEVVAQ 285
A G A+ W G+KYDANNIY A Y+ET NMT YG DG ++NK Q+FEV AQ
Sbjct: 239 -IAGGDKADAWTAGLKYDANNIYLATMYSETRNMTPYGKTDKGYDGGVANKTQNFEVTAQ 297

Query: 286 YQFDFGLRPSLAYLKSKGRDLGR----YGDQDMIEYIDVGATYFFNKNMSTYVDYKINLI 341
YQFDFGLRP++++L SKG+DL D+D+++Y DVGATY+FNKN STYVDYKINL+
Sbjct: 298 YQFDFGLRPAVSFLMSKGKDLTYNNVNGDDKDLVKYADVGATYYFNKNFSTYVDYKINLL 357

Query: 342 DESD-FTRAVDIRTDNIVATGITYQF 366
D+ D F + I TD+IVA G+ YQF
Sbjct: 358 DDDDPFYKDAGISTDDIVALGMVYQF 383


30  EcSMS35_1812  EcSMS35_1821  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias     Product
EcSMS35_1812  -1      13       -3.136614   putative sugar ABC transporter periplasmic
EcSMS35_1813  -2      15       -4.009149   alpha amylase family protein
EcSMS35_1814  2       12       -0.567352   thiosulfate:cyanide sulfurtransferase
EcSMS35_1815  2       13       1.656899    peripheral inner membrane phage-shock protein
EcSMS35_1816  2       17       2.958275    DNA-binding transcriptional activator PspC
EcSMS35_1817  2       21       4.140653    phage shock protein B
EcSMS35_1818  1       20       3.976482    phage shock protein PspA
EcSMS35_1819  0       18       4.362465    phage shock protein operon transcriptional
EcSMS35_1820  -1      19       4.491043    4-aminobutyrate transaminase
EcSMS35_1821  -2      18       3.718710    gamma-glutamylputrescine oxidoreductase
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1817  MPTASEINHBTR  25     0.030    Metalloprotease inhibitor signature.
		>MPTASEINHBTR#Metalloprotease inhibitor signature.

Length = 122

Score = 24.6 bits (53), Expect = 0.030
Identities = 7/43 (16%), Positives = 17/43 (39%)

Query: 30 SGRSELSQSEQQRLAQLADEAKRMRERIQALESILDAEHPNWR 72
+G+ + + A A++A + + E L + +W
Sbjct: 37 AGQLGIEATGSGVCAGPAEQANALAGDVACAEQWLGDKPVSWS 79


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1819  HTHFIS        342    e-118    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 342 bits (880), Expect = e-118
Identities = 126/341 (36%), Positives = 182/341 (53%), Gaps = 23/341 (6%)

Query: 6 DNLLGEANSFLEVLEQVSHLAPLDKPVLIIGERGTGKELIASRLHYLSSRWQGPFISLNC 65
L+G + + E+ ++ L D ++I GE GTGKEL+A LH R GPF+++N
Sbjct: 137 MPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALHDYGKRRNGPFVAINM 196

Query: 66 AALNENLLDSELFGHEAGAFTGAQKRHPGRFERADGGTLFLDELATAPMMVQEKLLRVIE 125
AA+ +L++SELFGHE GAFTGAQ R GRFE+A+GGTLFLDE+ PM Q +LLRV++
Sbjct: 197 AAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDMPMDAQTRLLRVLQ 256

Query: 126 YGELERVGGSQPLQVNVRLVCATNADLPAMVNEGTFRADLLDRLAFDVVQLPPLRERESD 185
GE VGG P++ +VR+V ATN DL +N+G FR DL RL ++LPPLR+R D
Sbjct: 257 QGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRLNVVPLRLPPLRDRAED 316

Query: 186 IMLMAEHFAIQMCREIKLPLFPGFTERARETLLNYRWPGNIRELKNVVERSVYRHGTSDY 245
I + HF Q +E F + A E + + WPGN+REL+N+V R +
Sbjct: 317 IPDLVRHFVQQAEKEGLDVK--RFDQEALELMKAHPWPGNVRELENLVRRLTALYPQDVI 374

Query: 246 PLDDIIID---PFKRRPPEDAIAVSETTSLPTLPLD------------------LREFQM 284
+ I + P E A A S + S+ +
Sbjct: 375 TREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFASFGDALPPSGLYDRVLA 434

Query: 285 QQEKELLQLSLQQGKYNQKRAAELLGLTYHQFRALLKKHQI 325
+ E L+ +L + NQ +AA+LLGL + R +++ +
Sbjct: 435 EMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGV 475


31  EcSMS35_1873  EcSMS35_1883  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias     Product
EcSMS35_1873  1       19       -4.621871   hypothetical protein
EcSMS35_1874  0       19       -4.514199   YciF protein
EcSMS35_1875  0       15       -2.193333   hypothetical protein
EcSMS35_1876  1       15       -1.592708   outer membrane protein W
EcSMS35_1877  0       16       -1.892302   hypothetical protein
EcSMS35_1878  0       28       -9.348959   intracellular septation protein A
EcSMS35_1879  1       33       -9.846105   acyl-CoA thioester hydrolase
EcSMS35_1880  1       34       -10.226864  transport protein TonB
EcSMS35_1881  2       35       -11.617330  YciI-like protein
EcSMS35_1882  3       36       -11.167435  hypothetical protein
EcSMS35_1884  1       33       -8.561771   sulfatase family protein
EcSMS35_1883  3       29       -1.195293   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1880  TONBPROTEIN   252    5e-87    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 252 bits (644), Expect = 5e-87
Identities = 233/239 (97%), Positives = 236/239 (98%), Gaps = 1/239 (0%)

Query: 6 MTLDLPRRFPWPTLLSVCIHGAVVAGLIYTSVHQVIELPAPAQPISVTMVAPADLEPPQA 65
MTLDLPRRFPWPTLLSVCIHGAVVAGL+YTSVHQVIELPAPAQPISVTMV PADLEPPQA
Sbjct: 1 MTLDLPRRFPWPTLLSVCIHGAVVAGLLYTSVHQVIELPAPAQPISVTMVTPADLEPPQA 60

Query: 66 VQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVEQ-PKRDVKPVESR 124
VQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKV++ PKRDVKPVESR
Sbjct: 61 VQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVESR 120

Query: 125 PASPFENTAPARPTSSTATAATSKPVTSVASGPRALSRNQPQYPARAQALRIEGQVKVKF 184
PASPFENTAPAR TSSTATAATSKPVTSVASGPRALSRNQPQYPARAQALRIEGQVKVKF
Sbjct: 121 PASPFENTAPARLTSSTATAATSKPVTSVASGPRALSRNQPQYPARAQALRIEGQVKVKF 180

Query: 185 DVTPDGRVDNVQILSAKPANMFEREVKNAMRRWRYEPGKPGSGIVVNILFKINGTTEIQ 243
DVTPDGRVDNVQILSAKPANMFEREVKNAMRRWRYEPGKPGSGIVVNILFKINGTTEIQ
Sbjct: 181 DVTPDGRVDNVQILSAKPANMFEREVKNAMRRWRYEPGKPGSGIVVNILFKINGTTEIQ 239


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1881  adhesinmafb   31     4e-04    Neisseria meningitidis: adhesin MafB signature.
		>adhesinmafb#Neisseria meningitidis: adhesin MafB signature.

Length = 467

Score = 31.2 bits (70), Expect = 4e-04
Identities = 16/57 (28%), Positives = 20/57 (35%), Gaps = 2/57 (3%)

Query: 41 GPMPAVDSNDPGAAGFTGSTVIAEFESLEAAQAWADADPYVAAGVYEHVSVKPFKKV 97
P+PA G GS E + EA W +P A V +V KV
Sbjct: 268 APLPA--EGKFAVIGGLGSVAGFEKNTREAVDRWIQENPNAAETVEAVFNVAAAAKV 322


32  EcSMS35_1894  EcSMS35_1904  Y  N  N  Genomic Island
LocusTag      DNBias  CDNBias  %GCBias     Product
EcSMS35_1894  -1      14       -3.731271   oligopeptide transporter ATP-binding component
EcSMS35_1895  1       16       -4.669587   oligopeptide ABC transporter permease OppC
EcSMS35_1896  0       26       -4.037282   oligopeptide transporter permease
EcSMS35_1897  0       26       -4.287998   oligopeptide ABC transporter periplasmic
EcSMS35_1899  0       30       -4.582869   hypothetical protein
EcSMS35_1898  0       34       -3.696838   hypothetical protein
EcSMS35_1900  -1      30       -3.360713   hypothetical protein
EcSMS35_1901  -1      28       -3.207012   bifunctional acetaldehyde-CoA/alcohol
EcSMS35_1902  0       25       -4.081958   thymidine kinase
EcSMS35_1903  0       24       -3.531439   hypothetical protein
EcSMS35_1904  -2      20       -3.279793   global DNA-binding transcriptional dual
33  EcSMS35_1965  EcSMS35_1986  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias     Product
EcSMS35_1965  -2      18       -3.790975   DNA polymerase V subunit UmuC
EcSMS35_1966  -1      16       -5.646992   DNA polymerase V subunit UmuD
EcSMS35_1967  -1      18       -6.441473   hemolysin E
EcSMS35_1968  -3      19       -3.933631   hypothetical protein
EcSMS35_1969  -1      18       -3.405392   hypothetical protein
EcSMS35_1970  0       20       -4.673760   hypothetical protein
EcSMS35_1971  -1      18       -3.784713   pre-peptidase C-terminal domain-containing
EcSMS35_1972  -1      19       -4.058179   putative fels-1 prophage protein
EcSMS35_1973  -1      19       -2.670566   septum formation inhibitor
EcSMS35_1974  -1      21       -3.881879   cell division inhibitor MinD
EcSMS35_1975  -1      23       -4.768474   cell division topological specificity factor
EcSMS35_1976  1       26       -5.219605   hypothetical protein
EcSMS35_1977  0       27       -6.018177   hypothetical protein
EcSMS35_1978  0       25       -5.542879   hypothetical protein
EcSMS35_1979  1       25       -6.359725   hypothetical protein
EcSMS35_1980  3       28       -6.832108   autotransporter (AT) family porin
EcSMS35_1981  3       32       -8.837799   hypothetical protein
EcSMS35_1982  2       30       -7.265056   hypothetical protein
EcSMS35_1983  1       29       -7.835230   cyclic diguanylate phosphodiesterase
EcSMS35_1984  0       22       -6.384404   hypothetical protein
EcSMS35_1985  0       21       -5.106610   hypothetical protein
EcSMS35_1986  -1      18       -4.113102   BLUF/cyclic diguanylate phosphodiesterase
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1976  PRTACTNFAMLY  42     9e-08    Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 42.0 bits (98), Expect = 9e-08
Identities = 28/101 (27%), Positives = 45/101 (44%), Gaps = 1/101 (0%)

Query: 8 TRSIYRELGATLSYNMRLGNGMEVEPWLKAAVRKEFVDDNRVKVNNDGNFVNDLSGRRGI 67
S+ LG + + L G +V+P++KA+V +EF V N + +L G R
Sbjct: 811 GSSVLGRLGLEVGKRIELAGGRQVQPYIKASVLQEFDGAGTVHTNGIAH-RTELRGTRAE 869

Query: 68 YQAGIKASFSSSLSGHLGVGYSHGAGVESPWNAVAGANWSF 108
G+ A+ S + YS G + PW AG +S+
Sbjct: 870 LGLGMAAALGRGHSLYASYEYSKGPKLAMPWTFHAGYRYSW 910


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1980  PRTACTNFAMLY  102    2e-24    Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 102 bits (256), Expect = 2e-24
Identities = 145/653 (22%), Positives = 226/653 (34%), Gaps = 89/653 (13%)

Query: 137 DVDITTHGDNAHAIAARQGTVSFNQGEIHTTGPDAAIAKIYNGGKVTLKNTSAVAHQGAG 196
D + + +V Q + AAI + G +VT+ S A G
Sbjct: 285 PGGFGPVLDGWYGVDVSGSSVELAQSIVEAPELGAAIR-VGRGARVTVSGGSLSAPHGNV 343

Query: 197 IVLESSIN--GQEATVDILSGSSLRSANEILYNKNETSNVTITDSEVSSAADVFINNIKG 254
I + Q A + I + + + L + V +T ++ AD + +
Sbjct: 344 IETGGARRFAPQAAPLSITLQAGAHAQGKALLYRVLPEPVKLT---LTGGADAQGDIVAT 400

Query: 255 HLVIDASNSKITGSANLSTD----DSTHTYLSLS-DNSTWDIKTDSTVS--KLTVDNSTV 307
L S L++ +T SLS DN+TW + +S V +L D S
Sbjct: 401 ELPSIPGTSIGPLDVALASQARWTGATRAVDSLSIDNATWVMTDNSNVGALRLASDGSVD 460

Query: 308 YISRADGKAFEPTRLTITENYVGNNGVLHLRTELGDDNSATDKVVINGNTSGTTRVKVTN 367
+ A+ F+ +T N + +G+ + D +DK+V+ + SG R+ V N
Sbjct: 461 FQQPAEAGRFK----VLTVNTLAGSGLFRMNVFA--DLGLSDKLVVMQDASGQHRLWVRN 514

Query: 368 AGGSGAYTLNGIEIISVEGESNGEFI---KDSRIFAGAYEYSLTRGNTEATNKNWYLTNF 424
+G S + N + ++ S F KD ++ G Y Y L N W L
Sbjct: 515 SG-SEPASANTLLLVQTPLGSAATFTLANKDGKVDIGTYRYRLA----ANGNGQWSLVGA 569

Query: 425 QAT-------SGGETNSGGSSAPTVAPTPVLRPEAGSYVANLAAANTLFVMRLNDRAGET 477
+A G AP P A AA NT V A
Sbjct: 570 KAPPAPKPAPQPGPQPPQPPQPQPEAPAPQPPAGRELSAAANAAVNTGGV----GLASTL 625

Query: 478 RYIDPVTEQERSSRLWLRQIGGHNAWRDSNGQLRTTSHRY-------VS--QLGADLLTG 528
Y + +R L L G AW Q + +R V+ +LGAD
Sbjct: 626 WYAESNALSKRLGELRLNPDAG-GAWGRGFAQRQQLDNRAGRRFDQKVAGFELGADHAVA 684

Query: 529 GFTDSDSWRLGVMAGYARDYNSTHSSVSDYRSKGSVRGYSAGLYATWFADDISKKGAYID 588
W LG +AGY R + G G YAT+ AD G Y+D
Sbjct: 685 VAGGR--WHLGGLAGYTR----GDRGFTGDGG-GHTDSVHVGGYATYIADS----GFYLD 733

Query: 589 AWAQYSWFKN----------SVKGDELAYESYSAKGATVSLEAGYGFALNKSFGLEAAKY 638
A + S +N +VKG Y G SLEAG F
Sbjct: 734 ATLRASRLENDFKVAGSDGYAVKGK------YRTHGVGASLEAGRRFTHADG-------- 779

Query: 639 TWIFQPQAQAIWMGVDHNAHTEANGSRIENDANNNFQTRLGFRTFIRTQEKNSGPHGDDF 698
W +PQA+ A+ ANG R+ ++ ++ RLG R + G
Sbjct: 780 -WFLEPQAELAVFRAGGGAYRAANGLRVRDEGGSSVLGRLGLEVGKRIELAG----GRQV 834

Query: 699 EPFVEMNWIHNSK-DFAVSMNGVKVEQDGARNLGEIKLGVNGNLNPSASVWGN 750
+P+++ + + V NG+ + E+ LG+ L S++ +
Sbjct: 835 QPYIKASVLQEFDGAGTVHTNGIAHRTELRGTRAELGLGMAAALGRGHSLYAS 887


34  EcSMS35_2037  EcSMS35_2057  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias     Product
EcSMS35_2037  3       16       1.737792    putative glycerol-3-phosphate acyltransferase
EcSMS35_2038  4       18       1.869428    50S ribosomal protein L32
EcSMS35_2039  3       16       1.469392    hypothetical protein
EcSMS35_2040  2       15       1.503585    Maf-like protein
EcSMS35_2041  1       14       1.682859    23S rRNA pseudouridylate synthase C
EcSMS35_2042  0       14       2.156947    ribonuclease E
EcSMS35_2043  -2      12       1.100645    hypothetical protein
EcSMS35_2044  -1      10       1.255222    acetyltransferase
EcSMS35_2045  -1      10       1.648993    flagellar hook-associated protein FlgL
EcSMS35_2046  -1      13       2.750171    flagellar hook-associated protein FlgK
EcSMS35_2047  1       13       2.815166    flagellar rod assembly protein/muramidase FlgJ
EcSMS35_2048  3       13       2.673632    flagellar basal body P-ring protein
EcSMS35_2049  3       15       2.483135    flagellar basal body L-ring protein
EcSMS35_2050  2       17       2.487811    flagellar basal body rod protein FlgG
EcSMS35_2051  1       19       2.359999    flagellar basal body rod protein FlgF
EcSMS35_2052  2       18       1.057448    flagellar hook protein FlgE
EcSMS35_2053  2       19       1.481848    flagellar basal body rod modification protein
EcSMS35_2054  1       19       0.288388    flagellar basal body rod protein FlgC
EcSMS35_2055  1       14       0.900865    flagellar basal body rod protein FlgB
EcSMS35_2056  2       15       1.485251    hypothetical protein
EcSMS35_2057  2       15       1.492314    flagellar basal body P-ring biosynthesis protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_2042  IGASERPTASE   66     8e-13    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 65.9 bits (160), Expect = 8e-13
Identities = 49/261 (18%), Positives = 83/261 (31%), Gaps = 26/261 (9%)

Query: 551 VAPAPKAATATPASPAQPGLLSRFFSALKALFSGGEETKPAEQP-APKAEAKPERQQDRR 609
T P + S E + E P P A A P
Sbjct: 991 TVDTTNITTPNNIQADVPSVPSN----------NEEIARVDEAPVPPPAPATPSETT--- 1037

Query: 610 KPRQNNRRDRNERRDSRSERTEGSDNREENRRNRRQAQQQTVETRESRQQAEV------T 663
N +++S++ D E +NR A++ + + Q EV T
Sbjct: 1038 -----ETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSET 1092

Query: 664 EKARTTDEQQAPRRERSRRRNDDKRQAQQEAKALNVEEQSVQETEQEERVRPVQPRRKQR 723
++ +TT+ ++ E+ + + + Q+ K + + QE + + + R
Sbjct: 1093 KETQTTETKETATVEKEEKAKVETEKTQEVPK-VTSQVSPKQEQSETVQPQAEPARENDP 1151

Query: 724 QLNQKVRYEQSVAEEAVVAPVVEETVAGEPIVQEAPAPRTELVKVPLPVVAQAAPEQQEE 783
+N K Q+ P E + E V E+ T V P A Q
Sbjct: 1152 TVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTV 1211

Query: 784 NNADNRDNGGMPRRSRRSPRH 804
N+ + RRS RS H
Sbjct: 1212 NSESSNKPKNRHRRSVRSVPH 1232



Score = 38.9 bits (90), Expect = 1e-04
Identities = 50/315 (15%), Positives = 91/315 (28%), Gaps = 34/315 (10%)

Query: 516 EEFAERKRPEQPALATFAMPDVPPAPTPAEPAAPVVAPAPK------------------- 556
+ + E+ A A P PPAP VA K
Sbjct: 1006 DVPSVPSNNEEIARVDEA-PVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQ 1064

Query: 557 ----AATATPASPAQPGLLSRFFSALKALFSGGEETKPAEQPAPKAEAKPERQQDRRKPR 612
A A A S + + ETK + +AK E ++ + P+
Sbjct: 1065 NREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPK 1124

Query: 613 QNNRRDRNERRDSRSERTEGSDNREEN-RRNRRQAQQQTVETRESRQQAEVTEKARTTDE 671
++ + + S + + + RE + N ++ Q QT T ++ Q A+ T + E
Sbjct: 1125 VTSQVSPKQEQ-SETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKET---SSNVE 1180

Query: 672 QQAPRRERSRRRNDDKRQAQQEAKALNVEEQSVQETEQEERVRPVQPRRKQRQLNQKVRY 731
Q N + A + E+ + + R + R N +
Sbjct: 1181 QPVTESTTVNTGNSVVENPENTTPA-TTQPTVNSESSNKPKNRHRRSVRSVPH-NVEPAT 1238

Query: 732 EQSVAEEAVVAPVVEETVAGEPIVQEAPAPRTELVKVPLPV---VAQAAPEQQEENNADN 788
S V + T + + + V V ++Q + + N
Sbjct: 1239 TSSNDRSTVALCDLTSTNTNAVLSDARAKAQFVALNVGKAVSQHISQLEMNNEGQYNVWV 1298

Query: 789 RDNGGMPRRSRRSPR 803
+ S R
Sbjct: 1299 SNTSMNKNYSSSQYR 1313



Score = 35.4 bits (81), Expect = 0.001
Identities = 33/232 (14%), Positives = 69/232 (29%), Gaps = 15/232 (6%)

Query: 515 EEEFAERKRPEQPALATFAMPDVPPAPTPAEPAAPVVAPAPKAATATPASPAQPGLLSRF 574
E+E + E+ V P +E P PA + Q
Sbjct: 1107 EKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQ----- 1161

Query: 575 FSALKALFSGGEETKPAEQPAPKAEAKPERQQDR--RKPRQNNRRDRNERRDSRSERTEG 632
+ A + + P E+ + P +S S
Sbjct: 1162 -TNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPK 1220

Query: 633 SDNREENRRNRRQAQQQTVETRESRQQAEVTEKARTTDEQQAPRRERSRRRNDDKRQAQQ 692
+ +R R + T + + A + T+ + R +++ + +A
Sbjct: 1221 NRHRRSVRSVPHNVEPATTSSNDRSTVALCDLTSTNTNAVLSDARAKAQFVALNVGKAVS 1280

Query: 693 EAKA---LNVEEQS---VQETEQEERVRPVQPRRKQRQLNQ-KVRYEQSVAE 737
+ + +N E Q V T + Q RR + Q ++ ++Q+++
Sbjct: 1281 QHISQLEMNNEGQYNVWVSNTSMNKNYSSSQYRRFSSKSTQTQLGWDQTISN 1332


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_2045  FLAGELLIN     47     4e-08    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 47.3 bits (112), Expect = 4e-08
Identities = 42/226 (18%), Positives = 80/226 (35%), Gaps = 9/226 (3%)

Query: 7 MMYQQNMRGITNSQAEWMKYGEQMSTGKRVVNPSDDPIAASQAVVLSQAQAQNSQYTLAR 66
++ Q N+ +S + + E++S+G R+ + DD + A + +Q +
Sbjct: 11 LLTQNNLNKSQSSLSSAI---ERLSSGLRINSAKDDAAGQAIANRFTSNIKGLTQASRNA 67

Query: 67 TFATQKVSLEESVLSQVTTAIQNAQEKIVYASNGTLSDDDRASLATDIQGLRDQLLNLAN 126
E L+++ +Q +E V A+NGT SD D S+ +IQ +++ ++N
Sbjct: 68 NDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEEIDRVSN 127

Query: 127 TTDGNGRYIFAGYKTESAPFSEVDGKYEGGAESIKQQVDASRSMVIGHTGDKIFDSITSN 186
T NG + + DG E+I + +G G + +
Sbjct: 128 QTQFNGVKVLSQDNQMKIQVGANDG------ETITIDLQKIDVKSLGLDGFNVNGPKEAT 181

Query: 187 AVAEPDGSASETNLFAMLDSAIAALKTPVADSEADKETAAAALDKT 232
+ T A + + TA DK
Sbjct: 182 VGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKV 227


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_2046  FLGHOOKAP1    681    0.0      Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 681 bits (1758), Expect = 0.0
Identities = 543/546 (99%), Positives = 544/546 (99%)

Query: 2 SSLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVS 61
SSLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVS
Sbjct: 1 SSLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVS 60

Query: 62 GVQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTLV 121
GVQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTLV
Sbjct: 61 GVQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTLV 120

Query: 122 SNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLND 181
SNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLND
Sbjct: 121 SNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLND 180

Query: 182 QISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGSTA 241
QISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGSTA
Sbjct: 181 QISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGSTA 240

Query: 242 RQLAAVPSSADPSRTTVAYIDGTAGNIEIPEKLLNTGSLGGILTFRSQDLDQTRNTLGQL 301
RQLAAVPSSADPSRTTVAY+DGTAGNIEIPEKLLNTGSLGGILTFRSQDLDQTRNTLGQL
Sbjct: 241 RQLAAVPSSADPSRTTVAYVDGTAGNIEIPEKLLNTGSLGGILTFRSQDLDQTRNTLGQL 300

Query: 302 ALAFAEAFNTQHKAGFDANGDAGEDFFAIGKPAVLQNTKNKGDVAIGATVTDASAVLATD 361
ALAFAEAFNTQHKAGFDANGDAGEDFFAIGKPAVLQNTKNKGDVAIGATVTDASAVLATD
Sbjct: 301 ALAFAEAFNTQHKAGFDANGDAGEDFFAIGKPAVLQNTKNKGDVAIGATVTDASAVLATD 360

Query: 362 YKISFDNNQWQATRLASNTTFTVTPDANGKVAFDGLELTFTGTPAVNDSFTLKPVSDAIV 421
YKISFDNNQWQ TRLASNTTFTVTPDANGKVAFDGLELTFTGTPAVNDSFTLKPVSDAIV
Sbjct: 361 YKISFDNNQWQVTRLASNTTFTVTPDANGKVAFDGLELTFTGTPAVNDSFTLKPVSDAIV 420

Query: 422 NMDVLITDEAKIAMASEEDVGDSDNRNGQALLDLQSNSKTVGGAKSFNDAYASLVSDIGN 481
NMDVLITDEAKIAMASEED GDSDNRNGQALLDLQSNSKTVGGAKSFNDAYASLVSDIGN
Sbjct: 421 NMDVLITDEAKIAMASEEDAGDSDNRNGQALLDLQSNSKTVGGAKSFNDAYASLVSDIGN 480

Query: 482 KTATLKTSSATQGNVVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFD 541
KTATLKTSSATQGNVVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFD
Sbjct: 481 KTATLKTSSATQGNVVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFD 540

Query: 542 ALINIR 547
ALINIR
Sbjct: 541 ALINIR 546


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_2047  FLGFLGJ       508    0.0      Flagellar protein FlgJ signature.
		>FLGFLGJ#Flagellar protein FlgJ signature.

Length = 313

Score = 508 bits (1310), Expect = 0.0
Identities = 310/313 (99%), Positives = 311/313 (99%)

Query: 1 MISDSKLLASAAWDAQSLNELKAKAGEDPAANIRPVARQVEGMFVQMMLKSMRDALPKDG 60
MISDSKLLASAAWDAQSLNELKAKAGEDPAANIRPVARQVEGMFVQMMLKSMRDALPKDG
Sbjct: 1 MISDSKLLASAAWDAQSLNELKAKAGEDPAANIRPVARQVEGMFVQMMLKSMRDALPKDG 60

Query: 61 LFSSEHTRLYTSMYDQQIAQQMTAGKGLGLAEMMVKQMTPEQPLPEESMPAAPMKFPLET 120
LFSSEHTRLYTSMYDQQIAQQMTAGKGLGLAEMMVKQMTPEQPLPEES PAAPMKFPLET
Sbjct: 61 LFSSEHTRLYTSMYDQQIAQQMTAGKGLGLAEMMVKQMTPEQPLPEESTPAAPMKFPLET 120

Query: 121 VVRYQNQTLSQLVQKAVPRNYDDSLPGDSKAFLAQLSLPAQLASQQSGVPHHLILAQAAL 180
VVRYQNQ LSQLVQKAVPRNYDDSLPGDSKAFLAQLSLPAQLASQQSGVPHHLILAQAAL
Sbjct: 121 VVRYQNQALSQLVQKAVPRNYDDSLPGDSKAFLAQLSLPAQLASQQSGVPHHLILAQAAL 180

Query: 181 ESGWGQRQIRRENGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKKVKAKFRVYSSYL 240
ESGWGQRQIRRENGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKKVKAKFRVYSSYL
Sbjct: 181 ESGWGQRQIRRENGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKKVKAKFRVYSSYL 240

Query: 241 EALSDYVGLLTRNPRYAAVTTAASAEQGAQALQDAGYATDPHYARKLTSMIQQMKSISDK 300
EALSDYVGLLTRNPRYAAVTTAASAEQGAQALQDAGYATDPHYARKLT+MIQQMKSISDK
Sbjct: 241 EALSDYVGLLTRNPRYAAVTTAASAEQGAQALQDAGYATDPHYARKLTNMIQQMKSISDK 300

Query: 301 VSKTYSMNIDNLF 313
VSKTYSMNIDNLF
Sbjct: 301 VSKTYSMNIDNLF 313


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_2048  FLGPRINGFLGI  425    e-151    Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 425 bits (1095), Expect = e-151
Identities = 156/363 (42%), Positives = 212/363 (58%), Gaps = 9/363 (2%)

Query: 4 FLSALILLLVTTAVQAERIRDLTSVQGVRQNSLIGYGLVVGLDGTGDQTTQTPFTTQTLN 63
F + L RI+D+ S+Q R N LIGYGLVVGL GTGD +PFT Q++
Sbjct: 13 FSALPFLSTPPAQADTSRIKDIASLQAGRDNQLIGYGLVVGLQGTGDSLRSSPFTEQSMR 72

Query: 64 NMLSQLGITVPTGTNMQLKNVAAVMVTASLPPFGRQGQTIDVVVSSMGNAKSLRGGTLLM 123
ML LGIT G + KN+AAVMVTA+LPPF G +DV VSS+G+A SLRGG L+M
Sbjct: 73 AMLQNLGITTQGGQS-NAKNIAAVMVTANLPPFASPGSRVDVTVSSLGDATSLRGGNLIM 131

Query: 124 TPLKGVDSQVYALAQGNILVGGAGASAGGSSVQVNQLNGGRITNGAVIERELPSQFGVGN 183
T L G D Q+YA+AQG ++V G A +++ R+ NGA+IERELPS+F
Sbjct: 132 TSLSGADGQIYAVAQGALIVNGFSAQGDAATLTQGVTTSARVPNGAIIERELPSKFKDSV 191

Query: 184 TLNLQLNDEDFSMAQQIADTINRVR----GYGSATALDARTIQVRVPSGNSSQVRFLADI 239
L LQL + DFS A ++AD +N G A D++ I V+ P + R +A+I
Sbjct: 192 NLVLQLRNPDFSTAVRVADVVNAFARARYGDPIAEPRDSQEIAVQKPRV-ADLTRLMAEI 250

Query: 240 QNMQVNVTPQDAKVVINSRTGSVVMNREVTLDSCAVAQGNLSVTVNRQANVSQPDTPFGG 299
+N+ V T AKVVIN RTG++V+ +V + AV+ G L+V V V QP PF
Sbjct: 251 ENLTVE-TDTPAKVVINERTGTIVIGADVRISRVAVSYGTLTVQVTESPQVIQP-APFSR 308

Query: 300 GQTVVTPQTQIDLRQSGGSLQSVRSSASLNNVVRALNALGATPMDLMSILQSMQSAGCLR 359
GQT V PQT I Q G + ++ L +V LN++G +++ILQ ++SAG L+
Sbjct: 309 GQTAVQPQTDIMAMQEGSKV-AIVEGPDLRTLVAGLNSIGLKADGIIAILQGIKSAGALQ 367

Query: 360 AKL 362
A+L
Sbjct: 368 AEL 370


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_2049  FLGLRINGFLGH  349    e-125    Flagellar L-ring protein signature.
		>FLGLRINGFLGH#Flagellar L-ring protein signature.

Length = 232

Score = 349 bits (896), Expect = e-125
Identities = 231/232 (99%), Positives = 232/232 (100%)

Query: 1 MQKNAAHTYAISSLLVLSLTGCAWIPSTPLVQGATSAQPVPGPTPVANGSIFQSAQPINY 60
MQKNAAHTYAISSLLVLSLTGCAWIPSTPLVQGATSAQPVPGPTPVANGSIFQSAQPINY
Sbjct: 1 MQKNAAHTYAISSLLVLSLTGCAWIPSTPLVQGATSAQPVPGPTPVANGSIFQSAQPINY 60

Query: 61 GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKSNFGFDTVPRYLQGLFGNA 120
GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGK+NFGFDTVPRYLQGLFGNA
Sbjct: 61 GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYLQGLFGNA 120

Query: 121 RADVEASGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRF 180
RADVEASGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRF
Sbjct: 121 RADVEASGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRF 180

Query: 181 SGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSPM 232
SGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSPM
Sbjct: 181 SGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSPM 232


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_2050  FLGHOOKAP1    44     4e-07    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 43.8 bits (103), Expect = 4e-07
Identities = 18/81 (22%), Positives = 36/81 (44%), Gaps = 14/81 (17%)

Query: 3 SSLWIAKTGLDAQQTNMDVIANNLANVSTNGFKRQRAVFEDLLYQTIRQPGAQSSEQTTL 62
S + A +GL+A Q ++ +NN+++ + G+ RQ + + +TL
Sbjct: 2 SLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTI--------------MAQANSTL 47

Query: 63 PSGLQIGTGVRPVATERLHSQ 83
+G +G GV +R +
Sbjct: 48 GAGGWVGNGVYVSGVQREYDA 68



Score = 41.1 bits (96), Expect = 3e-06
Identities = 11/41 (26%), Positives = 21/41 (51%)

Query: 220 ETSNVNVAEELVNMIQVQRAYEINSKAVSTTDQMLQKLTQL 260
S VN+ EE N+ + Q+ Y N++ + T + + L +
Sbjct: 505 SISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_2052  FLGHOOKAP1    41     4e-06    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 41.5 bits (97), Expect = 4e-06
Identities = 17/49 (34%), Positives = 29/49 (59%)

Query: 353 TLTNGALEASNVDLSKELVNMIVAQRNYQSNAQTIKTQDQILNTLVNLR 401
L+N S V+L +E N+ Q+ Y +NAQ ++T + I + L+N+R
Sbjct: 498 QLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINIR 546



Score = 37.2 bits (86), Expect = 1e-04
Identities = 22/56 (39%), Positives = 30/56 (53%), Gaps = 4/56 (7%)

Query: 6 AVSGLNAAATNLDVIGNNIANSATYGFKSGTASFAD----MFAGSKVGLGVKVAGI 57
A+SGLNAA L+ NNI++ G+ T A + AG VG GV V+G+
Sbjct: 7 AMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVSGV 62


35  EcSMS35_2082  EcSMS35_2093  Y  N  N  Genomic Island
LocusTag      DNBias  CDNBias  %GCBias     Product
EcSMS35_2082  0       17       -3.349514   hypothetical protein
EcSMS35_2083  0       21       -3.770087   hypothetical protein
EcSMS35_2084  0       18       -3.372963   glucans biosynthesis protein
EcSMS35_2085  0       22       -2.847600   phospholipase D
EcSMS35_2086  4       33       -5.398877   hypothetical protein
EcSMS35_2087  5       37       -7.444734   hypothetical protein
EcSMS35_2088  3       34       -7.614013   putative autoagglutination protein
EcSMS35_2089  2       29       -7.749523   cryptic curlin major subunit
EcSMS35_2090  2       30       -9.626731   curlin minor subunit
EcSMS35_2091  3       25       -7.017488   hypothetical protein
EcSMS35_2092  1       21       -4.626337   hypothetical protein
EcSMS35_2093  1       19       -3.296223   DNA-binding transcriptional regulator CsgD
36  EcSMS35_2109  EcSMS35_2116  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias     Product
EcSMS35_2109  -1      12       3.112510    hypothetical protein
EcSMS35_2110  -1      11       3.241951    hypothetical protein
EcSMS35_2111  -1      11       4.098036    trifunctional transcriptional regulator/proline
EcSMS35_2112  0       14       3.608940    transcriptional regulator RutR
EcSMS35_2113  -1      19       4.820035    putative monooxygenase rutA
EcSMS35_2114  -1      17       4.415241    putative isochorismatase family protein, rutB
EcSMS35_2115  1       15       4.257227    rutC protein
EcSMS35_2116  0       15       4.001499    putative rutD protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_2112  HTHTETR       66     2e-15    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 65.8 bits (160), Expect = 2e-15
Identities = 30/165 (18%), Positives = 62/165 (37%), Gaps = 8/165 (4%)

Query: 10 GKRSRAVSAKKKAILSAALDTFSQFGFHGTRLEQIAELAGVSKTNLLYYFPSKEALYIAV 69
K + ++ IL AL FSQ G T L +IA+ AGV++ + ++F K L+ +
Sbjct: 3 RKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEI 62

Query: 70 LRQILDIWLAPLKAFREDF--APLAAIKEYIRLKLEVSRDYPQASRLFCM-----EMLAG 122
++ F PL+ ++E + LE + + L + E +
Sbjct: 63 WELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGE 122

Query: 123 APLLMDELTGDLKALIDEKSALIAGWVKSGKL-APIDPQHLIFMI 166
++ D + +++ L A + + ++
Sbjct: 123 MAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIM 167


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_2114  ISCHRISMTASE  76     2e-18    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 75.8 bits (186), Expect = 2e-18
Identities = 46/176 (26%), Positives = 72/176 (40%), Gaps = 23/176 (13%)

Query: 13 TFDPQQTALIVVDMQNAYATPGGYLDLAGFDVSTTRPVIANIQTAVTAARAAGMLIIWFQ 72
DP + L++ DMQN + +D S + ANI+ G+ +++
Sbjct: 25 VPDPNRAVLLIHDMQNYF------VDAFTAGASPVTELSANIRKLKNQCVQLGIPVVY-- 76

Query: 73 NGWDAQYVEAGGPGSPNFHKSNALKTMRKQPQLQGKLLAKGSWDYQLVDELVPQPGDIVL 132
AQ PGS N L G L G ++ +++ EL P+ D+VL
Sbjct: 77 ---TAQ------PGSQNPDDRALLTDF------WGPGLNSGPYEEKIITELAPEDDDLVL 121

Query: 133 PKPRYSGFFNTPLDSILRSRGIRHLVFTGIATNVCVESTLRDGFFLEYFGVVLEDA 188
K RYS F T L ++R G L+ TGI ++ T + F + + DA
Sbjct: 122 TKWRYSAFKRTNLLEMMRKEGRDQLIITGIYAHIGCLVTACEAFMEDIKAFFVGDA 177


37  EcSMS35_2158  EcSMS35_2163  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias     Product
EcSMS35_2158  3       16       -0.707950   hypothetical protein
EcSMS35_2160  2       15       -0.908726   hypothetical protein
EcSMS35_2159  3       17       -1.357884   TfoX family protein
EcSMS35_2162  2       18       -0.720059   SOS cell division inhibitor
EcSMS35_2161  3       17       -0.289142   hypothetical protein
EcSMS35_2163  2       14       -0.233067   outer membrane protein A
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_2163  OUTRMMBRANEA  609    0.0      Outer membrane protein A signature.
		>OUTRMMBRANEA#Outer membrane protein A signature.

Length = 346

Score = 609 bits (1571), Expect = 0.0
Identities = 338/350 (96%), Positives = 340/350 (97%), Gaps = 4/350 (1%)

Query: 9 MKKTAIAIAVALAGFATVAQAAPKDNTWYTGAKLGWSQYHDTGFIPNNGPTHENQLGAGA 68
MKKTAIAIAVALAGFATVAQAAPKDNTWYTGAKLGWSQYHDTGFI NNGPTHENQLGAGA
Sbjct: 1 MKKTAIAIAVALAGFATVAQAAPKDNTWYTGAKLGWSQYHDTGFINNNGPTHENQLGAGA 60

Query: 69 FGGYQVNPYVGFEMGYDWLGRMPYKGDNINGAYKAQGVQLTAKLGYPITDDLDVYTRLGG 128
FGGYQVNPYVGFEMGYDWLGRMPYKG NGAYKAQGVQLTAKLGYPITDDLD+YTRLGG
Sbjct: 61 FGGYQVNPYVGFEMGYDWLGRMPYKGSVENGAYKAQGVQLTAKLGYPITDDLDIYTRLGG 120

Query: 129 MVWRADTKSNVPGGASTKDHDTGVSPVFAGGVEYAITPEIATRLEYQWTNNIGDAHTIGT 188
MVWRADTKSNV G K+HDTGVSPVFAGGVEYAITPEIATRLEYQWTNNIGDAHTIGT
Sbjct: 121 MVWRADTKSNVYG----KNHDTGVSPVFAGGVEYAITPEIATRLEYQWTNNIGDAHTIGT 176

Query: 189 RPDNGMLSLGVSYRFGQGEAAPVVAPAPAPAPEVQTKHFTLKSDVLFNFNKATLKPEGQA 248
RPDNGMLSLGVSYRFGQGEAAPVVAPAPAPAPEVQTKHFTLKSDVLFNFNKATLKPEGQA
Sbjct: 177 RPDNGMLSLGVSYRFGQGEAAPVVAPAPAPAPEVQTKHFTLKSDVLFNFNKATLKPEGQA 236

Query: 249 ALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAYNQALSERRAQSVVDYLISKGIPADKIS 308
ALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAYNQ LSERRAQSVVDYLISKGIPADKIS
Sbjct: 237 ALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAYNQGLSERRAQSVVDYLISKGIPADKIS 296

Query: 309 ARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEVKGIKDVVTQPQA 358
ARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEVKGIKDVVTQPQA
Sbjct: 297 ARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEVKGIKDVVTQPQA 346


38  EcSMS35_2172  EcSMS35_2183  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias     Product
EcSMS35_2172  -2      21       -3.392023   hypothetical protein
EcSMS35_2174  0       22       -3.673149   dihydroorotate dehydrogenase 2
EcSMS35_2175  1       24       -3.628858   putative pili assembly chaperone
EcSMS35_2176  2       26       -4.572151   fimbrial protein
EcSMS35_2177  1       24       -3.905436   fimbrial protein
EcSMS35_2178  1       21       -3.068520   putative fimbrial protein
EcSMS35_2179  0       17       -1.411315   outer membrane usher protein fimD-like protein
EcSMS35_2180  -1      16       0.877467    periplasmic pilus chaperone family protein
EcSMS35_2181  -2      17       2.074956    putative fimbrial protein
EcSMS35_2182  -2      13       3.381963    NAD(P)H-dependent FMN reductase
EcSMS35_2183  -2      13       3.156547    alkanesulfonate transporter substrate-binding
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_2178  CLENTEROTOXN  32     0.005    Clostridium enterotoxin signature.
		>CLENTEROTOXN#Clostridium enterotoxin signature.

Length = 319

Score = 31.6 bits (71), Expect = 0.005
Identities = 13/48 (27%), Positives = 22/48 (45%)

Query: 295 VGVVVTDSQNNIISPAGGTLPLSIPDDADSFARMNVYPVSTTGVPPET 342
+ V TD + I+ A T L++ D +S N+Y ++ P T
Sbjct: 188 LTVPSTDIEKEILDLAAATERLNLTDALNSNPAGNLYDWRSSNSYPWT 235


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
EcSMS35_2179  PF00577  824    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 824 bits (2131), Expect = 0.0
Identities = 416/874 (47%), Positives = 570/874 (65%), Gaps = 19/874 (2%)

Query: 3 RTQRQH-SLLSSGGVPSFIGGLVVFVSAAFNAQAETWFDPAFFKDDPSMVADLSRFEKGQ 61
TQ H G + F + A + AE +F+P F DDP VADLSRFE GQ
Sbjct: 12 NTQCLHIRKHRLAGFFVRLFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFENGQ 71

Query: 62 KITPGVYRVDIVLNQTIVDTRNVNFVEITPEKGIAACLTTESLDAMGVNTDAFPAFKQLD 121
++ PG YRVDI LN + TR+V F E+GI CLT L +MG+NT + L
Sbjct: 72 ELPPGTYRVDIYLNNGYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLA 131

Query: 122 KQACAPLAEIIPDASVTFNVNKLRLEISVPQIAIKSNARGYVPPERWDEGINALLLGYSF 181
AC PL +I DA+ +V + RL +++PQ + + ARGY+PPE WD GINA LL Y+F
Sbjct: 132 DDACVPLTSMIHDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNF 191

Query: 182 SGANSIHSSAGSDSGDSYFLNLNSGVNLGPWRLRNNSTWSR-----SSGQTAEWKNLSSY 236
SG + + G+ +LNL SG+N+G WRLR+N+TWS SSG +W++++++
Sbjct: 192 SGNSVQNRIGGNS--HYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTW 249

Query: 237 LQRAVIPLKGELTVGDDYTAGDFFDSVSFRGVQLASDDNMLPDSLKGFAPVVRGIAKSNA 296
L+R +IPL+ LT+GD YT GD FD ++FRG QLASDDNMLPDS +GFAPV+ GIA+ A
Sbjct: 250 LERDIIPLRSRLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTA 309

Query: 297 QVTIKQNGYTIYQTYVSPGAFEISDLYSTSSSGDLLVEIKEADGSVNSYSVPFSSVPLLQ 356
QVTIKQNGY IY + V PG F I+D+Y+ +SGDL V IKEADGS ++VP+SSVPLLQ
Sbjct: 310 QVTIKQNGYDIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQ 369

Query: 357 RQGRIKYAVTLAKYRTNSNDQQESKFAQATLQWGGPRGTTWYGGGQYAEYYRAAMFGLGF 416
R+G +Y++T +YR+ + Q++ +F Q+TL G P G T YGG Q A+ YRA FG+G
Sbjct: 370 REGHTRYSITAGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGK 429

Query: 417 NLGDFGAISFDATQAKSTLADQSEHKGQSYRFLYAKTLNQLGTNFQLMGYRYSTSGFYTL 476
N+G GA+S D TQA STL D S+H GQS RFLY K+LN+ GTN QL+GYRYSTSG++
Sbjct: 430 NMGALGALSVDMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNF 489

Query: 477 SDTMYKHMDGY--EFNDGDDEDTPMWSRYYNLFYTKRGKLQVNISQQLGEYGSFYLSGSQ 534
+DT Y M+GY E DG + P ++ YYNL Y KRGKLQ+ ++QQLG + YLSGS
Sbjct: 490 ADTTYSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRTSTLYLSGSH 549

Query: 535 QTYWHTDQQDRLLQFGYNTQIKDLSLGVSWNYSKSRGQPDADQVFALNFSLPLNLLLPKS 594
QTYW T D Q G NT +D++ +S++ +K+ Q DQ+ ALN ++P + L
Sbjct: 550 QTYWGTSNVDEQFQAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRSD 609

Query: 595 NDSYTRKKNYAWMTSNTSIDNEGHTTQNLGLTETLLDDGNLSYSVQQGYNSEGKTANGS- 653
+ S R +A + + S D G T G+ TLL+D NLSYSVQ GY G +GS
Sbjct: 610 SKSQWR---HASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGST 666

Query: 654 --ASMDYKGAFADARVGYNYSDNGSQQQLNYALSGSLVAHSQGITLGQSLGETNVLIAAP 711
A+++Y+G + +A +GY++SD+ +QL Y +SG ++AH+ G+TLGQ L +T VL+ AP
Sbjct: 667 GYATLNYRGGYGNANIGYSHSDD--IKQLYYGVSGGVLAHANGVTLGQPLNDTVVLVKAP 724

Query: 712 GAENTRVANSTGLKTDWRGYTVVPYATSYRENRIALDAASLKRNVDLENAVVNVVPTKGA 771
GA++ +V N TG++TDWRGY V+PYAT YRENR+ALD +L NVDL+NAV NVVPT+GA
Sbjct: 725 GAKDAKVENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPTRGA 784

Query: 772 LVLAEFNAHAGARVLMKTSKQGMPLRFGAMATLDGAQTISGIIDDDGSLYMSGLPAKGTI 831
+V AEF A G ++LM + PL FGAM T + SGI+ D+G +Y+SG+P G +
Sbjct: 785 IVRAEFKARVGIKLLMTLTHNNKPLPFGAMVT-SESSQSSGIVADNGQVYLSGMPLAGKV 843

Query: 832 TVRWGDAPDQICHISYELTEQQINAAITRMDSVC 865
V+WG+ + C +Y+L + +T++ + C
Sbjct: 844 QVKWGEEENAHCVANYQLPPESQQQLLTQLSAEC 877


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_2181  FIMBRIALPAPE  28     0.016    Escherichia coli: P pili tip fibrillum papE protein...
		>FIMBRIALPAPE#Escherichia coli: P pili tip fibrillum papE protein signature.

Length = 173

Score = 28.1 bits (62), Expect = 0.016
Identities = 20/67 (29%), Positives = 30/67 (44%), Gaps = 7/67 (10%)

Query: 31 SVTFNGKVIAPACTLVAATKDSVVTLPDVSATKLQSNGQVS---GVQTDVPIALEDCDIT 87
++TF GK+I PACT+ A V D+ L +G V + P +L +T
Sbjct: 27 NLTFKGKLIIPACTVQNAE----VNWGDIEIQNLVQSGGNQKDFTVDMNCPYSLGTMKVT 82

Query: 88 VTKNATF 94
+T N
Sbjct: 83 ITSNGQT 89


39  EcSMS35_2236  EcSMS35_2275  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias     Product
EcSMS35_2236  4       22       -0.114055   leucyl/phenylalanyl-tRNA--protein transferase
EcSMS35_2237  7       27       0.189846    translation initiation factor IF-1
EcSMS35_2239  8       29       1.540569    *hypothetical protein
EcSMS35_2240  6       27       3.215712    hypothetical protein
EcSMS35_2241  7       27       4.249443    hypothetical protein
EcSMS35_2242  8       28       4.529243    hypothetical protein
EcSMS35_2243  8       30       4.854199    YagB/YeeU/YfjZ family protein
EcSMS35_2244  9       27       5.440374    hypothetical protein
EcSMS35_2245  9       27       5.188475    hypothetical protein
EcSMS35_2246  9       27       5.509601    RadC family DNA repair protein
EcSMS35_2247  7       22       2.825551    antirestriction protein
EcSMS35_2248  6       21       2.828873    hypothetical protein
EcSMS35_2249  6       19       2.439551    antigen 43
EcSMS35_2250  1       20       -5.388971   hypothetical protein
EcSMS35_2251  1       21       -6.758641   putative GTPase
EcSMS35_2253  1       20       -6.943527   type I restriction-modification system, M
EcSMS35_2252  2       24       -5.854609   IS1 transposase orfB
EcSMS35_2254  3       25       -6.261947   IS1 transposase orfA
EcSMS35_2255  2       22       -5.152335   glycosyl transferase, group 1/glycosyl
EcSMS35_2256  2       18       -2.330408   hypothetical protein
EcSMS35_2257  0       19       -1.967083   IS5 transposase
EcSMS35_2258  0       23       -4.566340   IS629 transposase orfB
EcSMS35_2259  1       24       -6.191409   IS629 transposase orfA
EcSMS35_2261  1       26       -8.194046   IS903 transposase
EcSMS35_2262  1       28       -9.352061   glycosyl transferase, group 2
EcSMS35_2263  1       28       -10.496451  glycosyl transferase, group 1 family protein
EcSMS35_2264  2       28       -10.491477  glycosyl transferase, group 2 family protein
EcSMS35_2265  1       24       -9.909423   UDP-galactopyranose mutase
EcSMS35_2266  2       27       -10.518755  glycosyl transferase family protein
EcSMS35_2267  2       29       -4.490507   O-antigen export system ATP-binding protein
EcSMS35_2268  2       27       -4.514539   O-antigen export system permease RfbA
EcSMS35_2269  5       34       -7.073909   hypothetical protein
EcSMS35_2270  5       37       -7.368105   hypothetical protein
EcSMS35_2271  5       38       -6.837166   hypothetical protein
EcSMS35_2272  3       27       -4.832633   ISL3 family transposase
EcSMS35_2274  2       25       -6.043087   hypothetical protein
EcSMS35_2275  2       25       -5.907556   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_2249  PRTACTNFAMLY  31     0.033    Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 30.8 bits (69), Expect = 0.033
Identities = 100/525 (19%), Positives = 168/525 (32%), Gaps = 67/525 (12%)

Query: 366 GTLAVSAGGKA---TGVTMTSGGALI---ADSGATV---EGTNASGKFSIDGISGQASGL 416
G +GG G + +GGA + ++ G +A GK + + + L
Sbjct: 326 GARVTVSGGSLSAPHGNVIETGGARRFAPQAAPLSITLQAGAHAQGKALLYRVLPEPVKL 385

Query: 417 LLENG----GSFTVNAGGQAGNTTVGHRGTLTLAAGGSLSGRTQLSKGASMVLNGDVVST 472
L G G T++G + LA+ +G T+ S+ N V T
Sbjct: 386 TLTGGADAQGDIVATELPSIPGTSIG-PLDVALASQARWTGATRAVDSLSID-NATWVMT 443

Query: 473 GDIVNAGEIRFDNQTTQDAVLSRAVAKGDAPVTFHKLTTSNLTGQGGTINMRVRLDGSNA 532
+ N G +R + + D + F LT + L G G M V D
Sbjct: 444 DN-SNVGALRLASDGSVD------FQQPAEAGRFKVLTVNTLAGSG-LFRMNVFAD-LGL 494

Query: 533 SDQLVINGGQATGKTWLAFTNVGNSNLGVATSGQGIRVVDAQNGATTEEGAFALSRPLQA 592
SD+LV+ A+G+ L N G+ S + +V G+ +
Sbjct: 495 SDKLVVMQD-ASGQHRLWVRNSGSE----PASANTLLLVQTPLGSAATFTLANKDGKVDI 549

Query: 593 GAFNYTLNRDSDEDWYLRSENAYRAEVPLYASMLTQAMDYDRILAGSRSHQTGVNGENNS 652
G + Y L + + W L A A P + +
Sbjct: 550 GTYRYRLAANGNGQWSLVGAKAPPAPKPAPQPGPQPPQPPQPQPEAPAPQPPAGRELSAA 609

Query: 653 VRLSIQGGHLGHDNNGGIARG-----------ATPESSGSYGFVRLESDLLRTEVA---- 697
++ G +G + A P++ G++G + L
Sbjct: 610 ANAAVNTGGVGLASTLWYAESNALSKRLGELRLNPDAGGAWGRGFAQRQQLDNRAGRRFD 669

Query: 698 ------------GMSVTAGVYSAAGHSSVDVKDDDGSRAGTVRDDAGSLGGYLNLVHTSS 745
++V G + G + D + G D+ +GGY + S
Sbjct: 670 QKVAGFELGADHAVAVAGGRWHLGGLAGYTRGDRGFTGDGGGHTDSVHVGGYATYI-ADS 728

Query: 746 GLWADIMAQGTRHSMKASSDNND-------FRARGWGWLGSLETGLPFSITDNLMLEPQL 798
G + D + +R +D +R G G SLE G F+ D LEPQ
Sbjct: 729 GFYLDATLRASRLENDFKVAGSDGYAVKGKYRTHGVGA--SLEAGRRFTHADGWFLEPQA 786

Query: 799 QYTWQGLSLDDGQDNAGY-VKFGHGSAQHMRAGFRLGSHNDMSFG 842
+ + G V+ GS+ R G +G +++ G
Sbjct: 787 ELAVFRAGGGAYRAANGLRVRDEGGSSVLGRLGLEVGKRIELAGG 831


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_2265  NUCEPIMERASE  29     0.031    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 29.0 bits (65), Expect = 0.031
Identities = 15/43 (34%), Positives = 23/43 (53%), Gaps = 5/43 (11%)

Query: 5 NILIVG-AGFSGVVIARQLAEQGHKVKIIDQRDHIGGNSYDTR 46
L+ G AGF G ++++L E GH+V ID + + YD
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLN----DYYDVS 40


40  EcSMS35_2334  EcSMS35_2353  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias     Product
EcSMS35_2334  2       13       0.130221    50S ribosomal protein L25
EcSMS35_2335  1       12       0.827418    nucleoid-associated protein NdpA
EcSMS35_2336  1       14       1.180696    hypothetical protein
EcSMS35_2337  0       17       2.936803    hypothetical protein
EcSMS35_2338  0       17       3.012048    sulfatase family protein
EcSMS35_2341  0       21       4.040044    *transcriptional regulator NarP
EcSMS35_2342  0       22       4.421832    cytochrome c-type biogenesis family protein
EcSMS35_2343  1       21       4.880125    thiol:disulfide interchange protein DsbE
EcSMS35_2344  0       19       4.948446    cytochrome c-type biogenesis protein CcmF
EcSMS35_2345  -1      16       3.493622    cytochrome c-type biogenesis protein CcmE
EcSMS35_2346  0       16       3.935260    heme exporter protein D
EcSMS35_2347  0       15       3.872510    heme exporter protein C
EcSMS35_2348  -1      17       4.520983    heme exporter protein CcmB
EcSMS35_2349  0       20       4.387918    cytochrome c biogenesis protein CcmA
EcSMS35_2350  0       23       4.309650    cytochrome c-type protein NapC
EcSMS35_2351  0       21       4.657096    citrate reductase cytochrome c-type subunit
EcSMS35_2352  0       21       4.094116    quinol dehydrogenase membrane component
EcSMS35_2353  0       20       3.450815    quinol dehydrogenase periplasmic component
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits         Score  E-value  Comments
EcSMS35_2338  IGASERPTASE  30     0.027    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 30.4 bits (68), Expect = 0.027
Identities = 19/70 (27%), Positives = 28/70 (40%), Gaps = 6/70 (8%)

Query: 503 LHVSTPASEYSQGQ-DLF---NPQRRHYWVTAADNDTLAITTPKKTLVLNNNGKYRTYNL 558
L V+ E + + LF QR H V+ +T+ + K L N NG+Y YN
Sbjct: 926 LQVADKTGEPNHNELTLFDASKAQRDHLNVSLV-GNTVDLGAWKYKLR-NVNGRYDLYNP 983

Query: 559 RGERVKDEKP 568
E+
Sbjct: 984 EVEKRNQTVD 993


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits    Score  E-value  Comments
EcSMS35_2341  HTHFIS  64     3e-14    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 64.1 bits (156), Expect = 3e-14
Identities = 22/113 (19%), Positives = 47/113 (41%), Gaps = 2/113 (1%)

Query: 9 VMIVDDHPLMRRGVRQLLELDPGFEVVAEAGDGASAIDLANRLDIDVILLDLNMKGMSGL 68
+++ DD +R + Q L G++V + A+ D D+++ D+ M +
Sbjct: 6 ILVADDDAAIRTVLNQALS-RAGYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 69 DTLNALRRDGVTAQIIILTVSDASSDVFALIDAGADGYLLKDSDPEVLLEAIR 121
D L +++ +++++ + + GA YL K D L+ I
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIG 116


41  EcSMS35_2411  EcSMS35_2438  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias     Product
EcSMS35_2411  2       14       2.502064    4-amino-4-deoxy-L-arabinose transferase
EcSMS35_2412  1       15       3.731320    SMR family multidrug efflux pump
EcSMS35_2414  1       13       3.958361    hypothetical protein
EcSMS35_2413  1       14       5.100093    polymyxin B resistance protein pmrD
EcSMS35_2415  1       13       4.864049    O-succinylbenzoic acid--CoA ligase
EcSMS35_2416  0       12       4.551376    O-succinylbenzoate synthase
EcSMS35_2417  0       11       3.257627    naphthoate synthase
EcSMS35_2418  0       12       2.563303    acyl-CoA thioester hydrolase YfbB
EcSMS35_2419  -1      12       1.741872    2-succinyl-5-enolpyruvyl-6-hydroxy-3-
EcSMS35_2420  -1      17       -1.903158   isochorismate synthase, menaquinone-specific
EcSMS35_2421  0       22       -3.895669   hypothetical protein
EcSMS35_2422  0       22       -3.932396   hypothetical protein
EcSMS35_2424  0       21       -4.139253   ribonuclease Z
EcSMS35_2423  -1      12       -1.673447   hypothetical protein
EcSMS35_2426  -1      10       -0.072561   von Willebrand factor type A domain-containing
EcSMS35_2427  1       20       2.704260    hypothetical protein
EcSMS35_2428  1       21       3.230473    M28 family peptidase
EcSMS35_2429  2       27       3.894861    hypothetical protein
EcSMS35_2430  2       30       4.511738    NADH dehydrogenase subunit N
EcSMS35_2431  2       31       3.899031    NADH dehydrogenase subunit M
EcSMS35_2432  0       31       4.461032    NADH dehydrogenase subunit L
EcSMS35_2433  0       31       4.222477    NADH dehydrogenase subunit K
EcSMS35_2434  0       31       4.278073    NADH dehydrogenase subunit J
EcSMS35_2435  1       29       4.329228    NADH dehydrogenase subunit I
EcSMS35_2436  0       29       4.179653    NADH dehydrogenase subunit H
EcSMS35_2437  0       27       4.102963    NADH dehydrogenase subunit G
EcSMS35_2438  0       26       3.193156    NADH dehydrogenase I subunit F
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_2412  BCTERIALGSPC  28     0.008    Bacterial general secretion pathway protein C signa...
		>BCTERIALGSPC#Bacterial general secretion pathway protein C signature.

Length = 272

Score = 28.0 bits (62), Expect = 0.008
Identities = 12/31 (38%), Positives = 18/31 (58%), Gaps = 1/31 (3%)

Query: 34 KHIVLWLGLALACLGLAMVLWLLVL-QNVPV 63
+ I+ +L + L C LAM+ W + L N PV
Sbjct: 15 RRILFYLLMLLFCQQLAMIFWRIGLPDNAPV 45


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits         Score  E-value  Comments
EcSMS35_2415  ALARACEMASE  30     0.023    Alanine racemase signature.
		>ALARACEMASE#Alanine racemase signature.

Length = 356

Score = 29.7 bits (67), Expect = 0.023
Identities = 32/192 (16%), Positives = 58/192 (30%), Gaps = 37/192 (19%)

Query: 268 GYGLTEFASTVCAKEADGLADVGSPL----PGREVKIVNDEVWLRAASMAEGYWRNGQRV 323
G+G+ S + A + L ++ + G + I+ E + A + + R+
Sbjct: 40 GHGIERIWSAIGATDGFALLNLEEAITLRERGWKGPILMLEGFFHAQDLEIY---DQHRL 96

Query: 324 PLVNDEGWYATRDRGEMHNGKLTI-------VGRLDNLFFSGGEGIQPEEVERVIAAHPA 376
W + L I + RL G QP+ V V A
Sbjct: 97 TTCVHSNWQLKALQNARLKAPLDIYLKVNSGMNRL---------GFQPDRVLTVWQQLRA 147

Query: 377 VLQVFIVPVADKEFGHRPVAVVEYDQQTVDLDEWVKDKLARFQQPVRWLTLPPELKNGGI 436
+ V + + H A + + + +AR +Q L L N
Sbjct: 148 MANVGEMTL----MSHFAEA---------EHPDGISGAMARIEQAAEGLECRRSLSNSAA 194

Query: 437 KISRQALK-EWV 447
+ +WV
Sbjct: 195 TLWHPEAHFDWV 206


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_2422  AUTOINDCRSYN  35     6e-05    Autoinducer synthesis protein signature.
		>AUTOINDCRSYN#Autoinducer synthesis protein signature.

Length = 216

Score = 34.8 bits (80), Expect = 6e-05
Identities = 14/79 (17%), Positives = 31/79 (39%), Gaps = 12/79 (15%)

Query: 1 MIEWQDLHHSELSVSQLYALLQLRCAVFV--------VEQNCPYQDIDGDDLTRDNRHIL 52
M+E D++H+ LS ++ L LR F + D + ++
Sbjct: 1 MLEIFDVNHTLLSETKSGELFTLRKETFKDRLNWAVQCTDGMEFDQYD----NNNTTYLF 56

Query: 53 GWKNDELVAYARILKSDDD 71
G K++ ++ R +++
Sbjct: 57 GIKDNTVICSLRFIETKYP 75


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits         Score  E-value  Comments
EcSMS35_2426  IGASERPTASE  36     5e-04    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 36.2 bits (83), Expect = 5e-04
Identities = 20/115 (17%), Positives = 37/115 (32%), Gaps = 5/115 (4%)

Query: 22 ESENKESLQQQPSTPTDQQVLAAQHAAIKEAEQRSVAAKATADAKAKALAQQEAQQYSDK 81
ES+ E +Q + T Q A KEA+ A T + +E Q K
Sbjct: 1047 ESKTVEKNEQDATETTAQNREVA-----KEAKSNVKANTQTNEVAQSGSETKETQTTETK 1101

Query: 82 QALQGRLQAAPKYQHAAREKAASQIANPGTARYQQFDDNPVKQVAQNPLATFSLD 136
+ + K + ++ + + Q P + A+ T ++
Sbjct: 1102 ETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIK 1156


42  EcSMS35_2501  EcSMS35_2536  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias     Product
EcSMS35_2501  0       17       -3.318133   3-ketoacyl-CoA thiolase
EcSMS35_2502  0       18       -5.454480   hypothetical protein
EcSMS35_2503  1       20       -6.759438   long-chain fatty acid outer membrane
EcSMS35_2504  4       24       -4.698074   hypothetical protein
EcSMS35_2505  5       24       -3.774581   VacJ family lipoprotein
EcSMS35_2506  5       26       -4.788085   hypothetical protein
EcSMS35_2507  5       27       -5.502013   hypothetical protein
EcSMS35_2509  5       28       -5.581360   *putative outer membrane protein
EcSMS35_2510  5       26       -4.828250   outer membrane autotransporter
EcSMS35_2511  3       30       -6.779560   LuxR family transcriptional regulator
EcSMS35_2512  2       31       -6.802043   type 1 fimbriae regulatory protein
EcSMS35_2513  1       29       -6.403911   type 1 fimbriae regulatory protein
EcSMS35_2514  1       28       -5.873466   DNA-binding transcriptional regulator DsdC
EcSMS35_2515  0       30       -7.809733   permease DsdX
EcSMS35_2516  0       33       -8.465894   D-serine dehydratase
EcSMS35_2517  1       35       -9.432081   multidrug resistance protein Y
EcSMS35_2518  0       34       -8.421998   drug resistance MFS transporter, membrane fusion
EcSMS35_2519  1       32       -7.707828   DNA-binding transcriptional activator EvgA
EcSMS35_2520  1       32       -7.275990   hybrid sensory histidine kinase in two-component
EcSMS35_2521  3       31       -5.575008   hypothetical protein
EcSMS35_2522  2       31       -5.769328   putative transporter YfdV
EcSMS35_2523  1       29       -5.509614   putative oxalyl-CoA decarboxylase
EcSMS35_2524  -1      23       -4.786722   formyl-coenzyme A transferase
EcSMS35_2525  -1      23       -5.411106   hypothetical protein
EcSMS35_2526  1       18       -1.625155   hypothetical protein
EcSMS35_2527  1       15       -0.616948   putative lipoprotein
EcSMS35_2528  0       16       -0.383905   hypothetical protein
EcSMS35_2529  0       17       -0.028803   lipid A biosynthesis palmitoleoyl
EcSMS35_2530  0       19       1.676650    hypothetical protein
EcSMS35_2531  0       19       2.389441    aminotransferase
EcSMS35_2532  0       19       2.762340    sensor histidine kinase
EcSMS35_2533  0       17       3.410545    response regulator
EcSMS35_2534  0       17       4.005854    AraC family transcriptional regulator
EcSMS35_2535  0       16       3.741455    multiphosphoryl transfer protein 1
EcSMS35_2536  0       13       3.172326    exoaminopeptidase
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_2505  VACJLIPOPROT  407    e-148    VacJ lipoprotein signature.
		>VACJLIPOPROT#VacJ lipoprotein signature.

Length = 251

Score = 407 bits (1048), Expect = e-148
Identities = 250/251 (99%), Positives = 250/251 (99%)

Query: 1 MKLRLSALALGTTLLVGCASSGTDQQGRSDPLEGFNRTMYNFNFNVLDPYIVRPVAVAWR 60
MKLRLSALALGTTLLVGCASSGTDQQGRSDPLEGFNRTMYNFNFNVLDPYIVRPVAVAWR
Sbjct: 1 MKLRLSALALGTTLLVGCASSGTDQQGRSDPLEGFNRTMYNFNFNVLDPYIVRPVAVAWR 60

Query: 61 DYVPQPARNGLSNFTGNLEEPAVMVNYFLQGDPYQGMVHFTRFFLNTILGMGGFIDVAGM 120
DYVPQPARNGLSNFTGNLEEPAVMVNYFLQGDPYQGMVHFTRFFLNTILGMGGFIDVAGM
Sbjct: 61 DYVPQPARNGLSNFTGNLEEPAVMVNYFLQGDPYQGMVHFTRFFLNTILGMGGFIDVAGM 120

Query: 121 ANPKLQRTEPHRFGSTLGHYGVGYGPYVQLPFYGSFTLRDDGGDMADGLYPVLSWLTWPM 180
ANPKLQRTEPHRFGSTLGHYGVGYGPYVQLPFYGSFTLRDDGGDMAD LYPVLSWLTWPM
Sbjct: 121 ANPKLQRTEPHRFGSTLGHYGVGYGPYVQLPFYGSFTLRDDGGDMADALYPVLSWLTWPM 180

Query: 181 SVGKWTLEGIETRAQLLDSDGLLRQSSDPYIMVREAYFQRHDFIANGGELKPQENPNAQA 240
SVGKWTLEGIETRAQLLDSDGLLRQSSDPYIMVREAYFQRHDFIANGGELKPQENPNAQA
Sbjct: 181 SVGKWTLEGIETRAQLLDSDGLLRQSSDPYIMVREAYFQRHDFIANGGELKPQENPNAQA 240

Query: 241 IQDDLKDIDSE 251
IQDDLKDIDSE
Sbjct: 241 IQDDLKDIDSE 251


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_2510  PRTACTNFAMLY  92     2e-20    Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 91.7 bits (227), Expect = 2e-20
Identities = 115/496 (23%), Positives = 189/496 (38%), Gaps = 62/496 (12%)

Query: 2155 TAEAGNFTVGSLTNTGVIRLAGGKTGN-----------TLTVNGDYTGGGTLIINTVLGD 2203
+ + + + +N G +RLA + + LTVN G G +N
Sbjct: 434 SIDNATWVMTDNSNVGALRLASDGSVDFQQPAEAGRFKVLTVN-TLAGSGLFRMNVFA-- 490

Query: 2204 DTSATDKLIVTGNTSGDTGVVVNNVRGQGAQTADGIEIVHVGGQSDGNFRLQN---RAVA 2260
D +DKL+V + SG + V N G +A+ + +V S F L N +
Sbjct: 491 DLGLSDKLVVMQDASGQHRLWVRNS-GSEPASANTLLLVQTPLGSAATFTLANKDGKVDI 549

Query: 2261 GAWEYFLHKGNAGGTDGNWYLR-SELPPEPQPQPQPQPQPQPHPTPDKPVQKVYRPEAGS 2319
G + Y L A +G W L ++ PP P+P PQP PQP P P +P +P AG
Sbjct: 550 GTYRYRL----AANGNGQWSLVGAKAPPAPKPAPQPGPQP-PQPPQPQPEAPAPQPPAGR 604

Query: 2320 YIANIAAANTLFNIRMHDREGETYYTDVFTGEKKATSMWMRHIGGHNRWKDSSSQLNTQS 2379
++ AAAN N +Y + K+ + + G W +Q
Sbjct: 605 ELS--AAANAAVNTGGVGLASTLWYAESNALSKRLGELRLNPDAG-GAWGRGFAQRQQLD 661

Query: 2380 NRYVVQLGGSIAQWTDGQD--------RLQLGIMAGYGNEKSSTTSSLSGYKSKGAINGY 2431
NR + +A + G D R LG +AGY T G+ + GY
Sbjct: 662 NRAGRRFDQKVAGFELGADHAVAVAGGRWHLGGLAGYTRGDRGFTGDGGGHTDSVHVGGY 721

Query: 2432 STGLYGTWQQNDGNDNGAYVDTWIQYGWFNN--TVNGEKLAAESWKSR--GFTGSVEAGY 2487
+T + D+G Y+D ++ N V G A K R G S+EAG
Sbjct: 722 ATYI---------ADSGFYLDATLRASRLENDFKVAGSDGYAVKGKYRTHGVGASLEAGR 772

Query: 2488 TFKAGEFTGSQGSHYDWYIQPQSQITWMNVRASEHTEKNGTKVQLSGDGNIQSRLGVRTY 2547
F W+++PQ+++ + NG +V+ G ++ RLG+
Sbjct: 773 RFTH---------ADGWFLEPQAELAVFRAGGGAYRAANGLRVRDEGGSSVLGRLGLEVG 823

Query: 2548 LKGKSASDDNKAHQFEPFVEVNWIHNTRSWG-VKMDNTALSQDGATNIAEVKTGVQGKLS 2606
+ + A Q +P+++ + + G V + A + AE+ G+ L
Sbjct: 824 KRIELAGG----RQVQPYIKASVLQEFDGAGTVHTNGIAHRTELRGTRAELGLGMAAALG 879

Query: 2607 DNLNVWGNVGVQAGDK 2622
+++ + G K
Sbjct: 880 RGHSLYASYEYSKGPK 895


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
EcSMS35_2517  TCRTETB  121    4e-32    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 121 bits (306), Expect = 4e-32
Identities = 92/404 (22%), Positives = 167/404 (41%), Gaps = 17/404 (4%)

Query: 19 VTIALSLATFMQMLDSTISNVAIPTISGFLGASTDEGTWVITSFGVANAIAIPVTGRLAQ 78
+ I L + +F +L+ + NV++P I+ WV T+F + +I V G+L+
Sbjct: 15 ILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSD 74

Query: 79 RIGELRLFLLSVTFFSLSSLMCSLS-TNLDVLIFFRVVQGLMAGPLIPLSQSLLLRNYPP 137
++G RL L + S++ + + +LI R +QG A L ++ R P
Sbjct: 75 QLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPK 134

Query: 138 EKRTFALALWSMTVIIAPICGPILGGYICDNFSWGWIFLINVPMGIIVLTLCLTLLKGRE 197
E R A L V + GP +GG I W +L+ +PM I+ L L +E
Sbjct: 135 ENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWS--YLLLIPMITIITVPFLMKLLKKE 192

Query: 198 TETSPVKMNLPGLTLLVLGVGGLQIMLDKGRDLDWFNSSTIIILTVVSVISLISLVIWES 257
++ G+ L+ +G+ + ML F +S I +VSV+S + V
Sbjct: 193 VRIKG-HFDIKGIILMSVGI--VFFML--------FTTSYSISFLIVSVLSFLIFVKHIR 241

Query: 258 TSENPILDLSLFKSRNFTIGIVSITCAYLFYSGAIVLMPQLLQETMGYNAIWAGLAYAPI 317
+P +D L K+ F IG++ + +G + ++P ++++ + G
Sbjct: 242 KVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFP 301

Query: 318 GIMPLLIS-PLIGRYGNKIDMRLLVTFSFLMYAVCYYWRSVTFMPTIDFTGIILPQFFQG 376
G M ++I + G ++ ++ +V + S T F II+ G
Sbjct: 302 GTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGG 361

Query: 377 FAVACFFLPLTTISFSGLPDNKFANASSMSNFFRTLSGSVGTSL 420
+ ++TI S L + S+ NF LS G ++
Sbjct: 362 LSFTK--TVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAI 403


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits       Score  E-value  Comments
EcSMS35_2518  RTXTOXIND  78     5e-18    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 78.3 bits (193), Expect = 5e-18
Identities = 61/413 (14%), Positives = 122/413 (29%), Gaps = 96/413 (23%)

Query: 13 RKKYFALLAVVLFIAFSGAYAYWSMELEDMISTDDAYVT-GNADPISAQVSGSVTVVNHK 71
R+ ++ F+ + + ++E + + + G + I + V + K
Sbjct: 55 RRPRLVAYFIMGFLVIAFILSVLG-QVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVK 113

Query: 72 DTNYVRQGDILVSLDKTDATIALNKA---------------------------------- 97
+ VR+GD+L+ L A K
Sbjct: 114 EGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDE 173

Query: 98 ------------------KNNLANIVRQTNKLYLQDKQYSAEVASARIQ---YQQSLEDY 136
K + Q + L + AE + + Y+
Sbjct: 174 PYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVE 233

Query: 137 NRRV----PLAKQGVISKE----------TLEHTKDTLISSKAALNAAIQAYKANKALVM 182
R+ L + I+K + S + + I + K LV
Sbjct: 234 KSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVT 293

Query: 183 N-------TPLNR-QPQVVEAADATKEAWLALKRTDIKSPVTGYIAQRSVQ-VGETVSPG 233
L + + + + + I++PV+ + Q V G V+
Sbjct: 294 QLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTA 353

Query: 234 QSLMAVVPARQ-MWVNANFKETQLTDVRIGQSVNIISDLYGENVVFHGRVTGINMGTGNA 292
++LM +VP + V A + + + +GQ+ I + F G +G
Sbjct: 354 ETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVE------AFPYTRYGYLVGK--- 404

Query: 293 FSLLPAQNATGNWIKIVQRVPVEVSLDPKELMEH----PLRIGLSMTATIDTR 341
+ + +V V +S++ L PL G+++TA I T
Sbjct: 405 VKNINLDAIEDQRLGLVFNVI--ISIEENCLSTGNKNIPLSSGMAVTAEIKTG 455


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits    Score  E-value  Comments
EcSMS35_2519  HTHFIS  49     3e-09    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 49.1 bits (117), Expect = 3e-09
Identities = 22/148 (14%), Positives = 53/148 (35%), Gaps = 31/148 (20%)

Query: 4 IIIDDHPLAIAAIRNLLIKNDIEILAELTEGGSAVQRVETLKPDIVIIDVDIPGVNGIQV 63
++ DD + L + ++ + + + + D+V+ DV +P N +
Sbjct: 7 LVADDDAAIRTVLNQALSRAGYDVRIT-SNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 64 LETLRKRQYSGIIIIVSAKNDHFYGKHCADAGANGFVSKKEGMNNIIAAIEAAKNGYCYF 123
L ++K + ++++SA+N + AI+A++ G +
Sbjct: 66 LPRIKKARPDLPVLVMSAQNT------------------------FMTAIKASEKGAYDY 101

Query: 124 ---PFSLNRFVGSLTSDQQKLDSLSKQE 148
PF L + + L ++
Sbjct: 102 LPKPFDLTE---LIGIIGRALAEPKRRP 126


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits    Score  E-value  Comments
EcSMS35_2520  HTHFIS  79     2e-17    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 79.5 bits (196), Expect = 2e-17
Identities = 31/105 (29%), Positives = 51/105 (48%)

Query: 890 SILIADDHPTNRLLLKRQLNLLGYDVDEATDGVQALHKISMQHYDLLITDVNMPNMDGFE 949
+IL+ADD R +L + L+ GYDV ++ I+ DL++TDV MP+ + F+
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 950 LTRKLREQNSSLPIWGLTANAQANEREKGLNCGMNLCLFKPLTLD 994
L ++++ LP+ ++A K G L KP L
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLT 109


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
EcSMS35_2532  PF06580  223    2e-70    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 223 bits (571), Expect = 2e-70
Identities = 60/207 (28%), Positives = 102/207 (49%), Gaps = 11/207 (5%)

Query: 342 RAEQLREMANKAELRALQSKINPHFLFNALNAISSSIRLNPDTARQLIFNLSRYLRYNIE 401
++ MA +A+L AL+++INPHF+FNALN I + I +P AR+++ +LS +RY++
Sbjct: 150 DQWKMASMAQEAQLMALKAQINPHFMFNALNNIRALILEDPTKAREMLTSLSELMRYSLR 209

Query: 402 LKDDEQIDIKKELYQIKDYIAIEQARFGDKLTVIYDIDEEV-NCCIPSLLIQPLVENAIV 460
+ Q+ + EL + Y+ + +F D+L I+ + + +P +L+Q LVEN I
Sbjct: 210 YSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQINPAIMDVQVPPMLVQTLVENGIK 269

Query: 461 HGIQPCKGKGVVTISVAECGNRVRIAVRDTGHGIDPKVIERVEANEMPGNKIGLLNVHHR 520
HGI G + + + V + V +TG E GL NV R
Sbjct: 270 HGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKE--------STGTGLQNVRER 321

Query: 521 VKLLYGE--GLHIRRLEPGTEIAFYIP 545
+++LYG + + + IP
Sbjct: 322 LQMLYGTEAQIKLSEKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits    Score  E-value  Comments
EcSMS35_2533  HTHFIS  55     5e-11    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 55.2 bits (133), Expect = 5e-11
Identities = 21/125 (16%), Positives = 55/125 (44%), Gaps = 6/125 (4%)

Query: 2 KVIIVEDEFLAQQELSWLIKEHSQMEIVGTFDDGLDVLKFLQHNRVDAIFLDINIPSLDG 61
+++ +D+ + L+ + V + + +++ D + D+ +P +
Sbjct: 5 TILVADDDAAIRTVLNQALS--RAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENA 62

Query: 62 V-LLAQNISQFAHKPFIVFITAWK--EHAVEAFELEAFDYILKPYQESRITGMLQKLEAA 118
LL + P +V ++A A++A E A+DY+ KP+ + + G++ + A
Sbjct: 63 FDLLPRIKKARPDLPVLV-MSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121

Query: 119 WQQQQ 123
+++
Sbjct: 122 PKRRP 126


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_2535  PHPHTRNFRASE  613    0.0      Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase signature.

Length = 572

Score = 613 bits (1583), Expect = 0.0
Identities = 201/567 (35%), Positives = 330/567 (58%), Gaps = 8/567 (1%)

Query: 117 LYGNVLASGVGVGTLTLLQSDSLDSYRAIPA-SAQDSTRLEHSLATLAEQLNQQLRERDG 175
+ G +SGV + + ++D + + + +L +L E+L + +
Sbjct: 5 ITGIAASSGVAIAKAFIHLEPNVDIEKTSITDVSTEIEKLTAALEKSKEELRAIKDQTEA 64

Query: 176 ----ESKTILSAHLSLIQDDEFAGNIRRLMAEQHQGLGAAIISNMEQVCAKLSASASDYL 231
+ I +AHL ++ D E I+ + + A+ + + + ++Y+
Sbjct: 65 SMGADKAEIFAAHLLVLDDPELVDGIKGKIENEQMNAEYALKEVSDMFVSMFESMDNEYM 124

Query: 232 RERVSDIRDISEQLL-HITWPELKPRNNLVLEKPTILVAEDLTPSQFLSLDLKNLAGMIL 290
+ER +DIRD+S+++L H+ E + + T+++AEDLTPS L+ + + G
Sbjct: 125 KERAADIRDVSKRVLGHLIGVETGSLATIA--EETVIIAEDLTPSDTAQLNKQFVKGFAT 182

Query: 291 EKTGRTSHTLILARASAIPVLSGLPLDAIARYAGQPAVLDAQCGVLAINPNDAVSGYYQV 350
+ GRTSH+ I++R+ IP + G G ++D G++ +NP + Y+
Sbjct: 183 DIGGRTSHSAIMSRSLEIPAVVGTKEVTEKIQHGDMVIVDGIEGIVIVNPTEEEVKAYEE 242

Query: 351 AQTLADKRQKQQAQAAAQLAYSRDNKRIDIAANIGTALEAPGAFANGAEGVGLFRTEMLY 410
+ +K++++ A+ + + ++D +++AANIGT + G ANG EG+GL+RTE LY
Sbjct: 243 KRAAFEKQKQEWAKLVGEPSTTKDGAHVELAANIGTPKDVDGVLANGGEGIGLYRTEFLY 302

Query: 411 MDRDSAPDEQEQFEAYQQVLLAAGDKPIIFRTMDIGGDKSIPYLNIPQEENPFLGYRAVR 470
MDRD P E+EQFEAY++V+ KP++ RT+DIGGDK + YL +P+E NPFLG+RA+R
Sbjct: 303 MDRDQLPTEEEQFEAYKEVVQRMDGKPVVIRTLDIGGDKELSYLQLPKELNPFLGFRAIR 362

Query: 471 IYPEFAGLFRTQLRAILRAACFGNAQLMIPMVHSLDQILWVKGEIQKAIVELKRDCLRHA 530
+ E +FRTQLRA+LRA+ +GN ++M PM+ +L+++ K +Q+ +L + + +
Sbjct: 363 LCLEKQDIFRTQLRALLRASTYGNLKVMFPMIATLEELRQAKAIMQEEKDKLLSEGVDVS 422

Query: 531 ETITLGIMVEVPSVCYIIDHFCDEVDFFSIGSNDMTQYLYAVDRNNPRVSPLYNPITPSF 590
++I +GIMVE+PS + F EVDFFSIG+ND+ QY A DR N RVS LY P P+
Sbjct: 423 DSIEVGIMVEIPSTAVAANLFAKEVDFFSIGTNDLIQYTMAADRMNERVSYLYQPYHPAI 482

Query: 591 LRMLQQIVTTAHQRGKWVGICGELGGESRYLPLLLGLGLDELSMSSPRIPAVKSQLRQLD 650
LR++ ++ AH GKWVG+CGE+ G+ +PLLLGLGLDE SMS+ I +SQL +L
Sbjct: 483 LRLVDMVIKAAHSEGKWVGMCGEMAGDEVAIPLLLGLGLDEFSMSATSILPARSQLLKLS 542

Query: 651 SEACRELARQACECRSAQEIEALLTAF 677
E + A++A +A+E+E L+
Sbjct: 543 KEELKPFAQKALMLDTAEEVEQLVKKT 569


43  EcSMS35_2568  EcSMS35_2577  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias     Product
EcSMS35_2568  2       20       -0.705747   putative sulfate transport protein CysZ
EcSMS35_2569  2       19       -0.470383   cysteine synthase A
EcSMS35_2570  2       19       -0.537013   PTS system phosphohistidinoprotein-hexose
EcSMS35_2571  1       16       0.336299    phosphoenolpyruvate-protein phosphotransferase
EcSMS35_2572  -1      16       2.436787    PTS system glucose-specific transporter subunit
EcSMS35_2573  -1      20       3.441834    IS4 transposase
EcSMS35_2574  0       23       3.925622    pyridoxal kinase
EcSMS35_2575  -1      26       4.191331    hypothetical protein
EcSMS35_2576  -1      25       4.615916    cysteine synthase B
EcSMS35_2577  -1      24       3.982058    sulfate/thiosulfate transporter subunit
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_2571  PHPHTRNFRASE  748    0.0      Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase signature.

Length = 572

Score = 748 bits (1933), Expect = 0.0
Identities = 276/571 (48%), Positives = 386/571 (67%), Gaps = 2/571 (0%)

Query: 1 MISGILASPGIAFGKALLLKEDEIVIDRKKISADQVDQEVERFLSGRAKASAQLETIKTK 60
I+GI AS G+A KA + E + I++ I V E+E+ + K+ +L IK +
Sbjct: 4 KITGIAASSGVAIAKAFIHLEPNVDIEKTSI--TDVSTEIEKLTAALEKSKEELRAIKDQ 61

Query: 61 AGETFGEEKEAIFEGHIMLLEDEELEQEIIALIKDKHMTADAAAHEVIEGQASALEELDD 120
+ G +K IF H+++L+D EL I I+++ M A+ A EV + S E +D+
Sbjct: 62 TEASMGADKAEIFAAHLLVLDDPELVDGIKGKIENEQMNAEYALKEVSDMFVSMFESMDN 121

Query: 121 EYLKERAADVRDIGKRLLRNILGLKIIDLSAIQDEVILVAADLTPSETAQLNLKKVLGFI 180
EY+KERAAD+RD+ KR+L +++G++ L+ I +E +++A DLTPS+TAQLN + V GF
Sbjct: 122 EYMKERAADIRDVSKRVLGHLIGVETGSLATIAEETVIIAEDLTPSDTAQLNKQFVKGFA 181

Query: 181 TDAGGRTSHTSIMARSLELPAIVGTGSVTSQVKNDDYLILDAVNNQVYVNPTNEVIDKMR 240
TD GGRTSH++IM+RSLE+PA+VGT VT ++++ D +I+D + V VNPT E +
Sbjct: 182 TDIGGRTSHSAIMSRSLEIPAVVGTKEVTEKIQHGDMVIVDGIEGIVIVNPTEEEVKAYE 241

Query: 241 AVQEQVASEKAELAKLKDLPAITLDGHQVEVCANIGTVRDVEGAERNGAEGVGLYRTEFL 300
+ +K E AKL P+ T DG VE+ ANIGT +DV+G NG EG+GLYRTEFL
Sbjct: 242 EKRAAFEKQKQEWAKLVGEPSTTKDGAHVELAANIGTPKDVDGVLANGGEGIGLYRTEFL 301

Query: 301 FMDRDALPTEEEQFAAYKAVAEACGSQAVIVRTMDIGGDKELPYMNFPKEENPFLGWRAI 360
+MDRD LPTEEEQF AYK V + + V++RT+DIGGDKEL Y+ PKE NPFLG+RAI
Sbjct: 302 YMDRDQLPTEEEQFEAYKEVVQRMDGKPVVIRTLDIGGDKELSYLQLPKELNPFLGFRAI 361

Query: 361 RIAMDRKEILRDQLRAILRASAFGKLRIMFPMIISVEEVRALRKEIEIYKQELRDEGKAF 420
R+ +++++I R QLRA+LRAS +G L++MFPMI ++EE+R + ++ K +L EG
Sbjct: 362 RLCLEKQDIFRTQLRALLRASTYGNLKVMFPMIATLEELRQAKAIMQEEKDKLLSEGVDV 421

Query: 421 DESIEIGVMVETPAAATIARHLAKEVDFFSIGTNDLTQYTLAVDRGNDMISHLYQPMSPS 480
+SIE+G+MVE P+ A A AKEVDFFSIGTNDL QYT+A DR N+ +S+LYQP P+
Sbjct: 422 SDSIEVGIMVEIPSTAVAANLFAKEVDFFSIGTNDLIQYTMAADRMNERVSYLYQPYHPA 481

Query: 481 VLNLIKQVIDASHAEGKWTGMCGELAGDERATLLLLGMGLDEFSMSAISIPRIKKIIRNT 540
+L L+ VI A+H+EGKW GMCGE+AGDE A LLLG+GLDEFSMSA SI + +
Sbjct: 482 ILRLVDMVIKAAHSEGKWVGMCGEMAGDEVAIPLLLGLGLDEFSMSATSILPARSQLLKL 541

Query: 541 NFEDAKVLAEQALAQPTTDELMTLVNKFIEE 571
+ E+ K A++AL T +E+ LV K +
Sbjct: 542 SKEELKPFAQKALMLDTAEEVEQLVKKTYLK 572


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_2577  PF05272  34  7e-04  Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 34.3 bits (78), Expect = 7e-04
Identities = 11/33 (33%), Positives = 16/33 (48%)

Query: 30 MVALLGPSGSGKTTLLRIIAGLEHQTSGHIRFH 62
V L G G GK+TL+ + GL+ + H
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGLDFFSDTHFDIG 630


44  EcSMS35_2586  EcSMS35_2604  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
EcSMS35_2586  2  15  1.173430  dyp-type peroxidase family protein
EcSMS35_2587  1  14  0.680442  hypothetical protein
EcSMS35_2588  1  13  1.562006  putative inner membrane protein YfeZ
EcSMS35_2589  0  15  2.162532  putative acetyltransferase
EcSMS35_2590  0  14  2.904185  N-acetylmuramoyl-l-alanine amidase I
EcSMS35_2591  -2  16  3.769009  coproporphyrinogen III oxidase
EcSMS35_2592  -2  18  4.902331  transcriptional regulator EutR
EcSMS35_2593  -2  22  5.231367  putative ethanolamine utilization protein EutK
EcSMS35_2594  -1  21  5.444417  putative ethanolamine utilization protein EutL
EcSMS35_2595  -1  21  5.708379  ethanolamine ammonia-lyase small subunit
EcSMS35_2596  0  21  5.881719  ethanolamine ammonia-lyase, large subunit
EcSMS35_2597  1  19  6.057422  reactivating factor for ethanolamine ammonia
EcSMS35_2598  1  19  5.690050  ethanolamine utilization protein EutH
EcSMS35_2599  4  19  6.308078  ethanolamine utilization protein EutG
EcSMS35_2600  2  18  6.306742  ethanolamine utilization protein EutJ
EcSMS35_2601  3  19  5.619744  ethanolamine utilization protein EutE
EcSMS35_2602  1  18  4.495002  ethanolamine utilization protein
EcSMS35_2603  1  19  4.140695  ethanolamine utilization protein EutM
EcSMS35_2604  2  18  3.414746  phosphotransacetylase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_2589  SACTRNSFRASE  31  6e-04  Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 31.5 bits (71), Expect = 6e-04
Identities = 15/102 (14%), Positives = 38/102 (37%), Gaps = 4/102 (3%)

Query: 24 LRPWNDPEMDIERKMNHDVSLFLVAEVNGEVVG--TVMGGYDGHRGSAYYLGVHPEFRGR 81
+ + D +MD+ + FL + +G + ++G + V ++R +
Sbjct: 47 FKQYEDDDMDVSYVEEEGKAAFL-YYLENNCIGRIKIRSNWNG-YALIEDIAVAKDYRKK 104

Query: 82 GIANALLNRLEKKLIARGCPKIQINVPEDNDMVLGMYERLGY 123
G+ ALL++ + + + + N Y + +
Sbjct: 105 GVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHF 146


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_2600  SHAPEPROTEIN  50  3e-09  Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 50.1 bits (120), Expect = 3e-09
Identities = 33/116 (28%), Positives = 50/116 (43%), Gaps = 9/116 (7%)

Query: 63 VRDGIVWDFFGAVTIVRRHLD-TLEQQFGRRFSHAATSFPPGTDP---RISINVLESAGL 118
++DG++ DFF +++ + F R P G R + AG
Sbjct: 76 MKDGVIADFFVTEKMLQHFIKQVHSNSFMRPSPRVLVCVPVGATQVERRAIRESAQGAGA 135

Query: 119 EVSHVLDEPTAVA---DLLQLDNAG--VVDIGGGTTGIAIVKKGKVTYSADEATGG 169
+++EP A A L + G VVDIGGGTT +A++ V YS+ GG
Sbjct: 136 REVFLIEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVAVISLNGVVYSSSVRIGG 191


45  EcSMS35_2623  EcSMS35_2632  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias  Product
EcSMS35_2623  3  15  1.472111  phosphoribosylaminoimidazole-succinocarboxamide
EcSMS35_2624  1  10  2.470630  lipoprotein
EcSMS35_2625  0  11  2.215442  dihydrodipicolinate synthase
EcSMS35_2626  0  14  2.869952  glycine cleavage system transcriptional
EcSMS35_2627  0  15  3.156280  thioredoxin-dependent thiol peroxidase
EcSMS35_2628  1  13  3.465576  hydrogenase-4 component A
EcSMS35_2629  0  14  3.317244  hydrogenase 4 subunit B
EcSMS35_2630  -1  15  3.176080  hydrogenase-4 component C
EcSMS35_2631  0  16  3.375028  hydrogenase 4 subunit D
EcSMS35_2632  -1  17  3.126172  hydrogenase 4 membrane subunit
46  EcSMS35_2649  EcSMS35_2655  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
EcSMS35_2649  -3  11  -3.275572  polyphosphate kinase
EcSMS35_2650  -1  14  -3.367961  exopolyphosphatase
EcSMS35_2651  -2  13  -2.455139  putative cytochrome C-type biogenesis protein
EcSMS35_2652  2  28  0.471871  hypothetical protein
EcSMS35_2653  2  20  1.618195  hypothetical protein
EcSMS35_2654  0  13  3.102600  surface antigen family protein
EcSMS35_2655  0  12  3.184934  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_2655  IGASERPTASE  28  0.024  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 28.1 bits (62), Expect = 0.024
Identities = 19/124 (15%), Positives = 40/124 (32%), Gaps = 6/124 (4%)

Query: 34 QQGKNEEQRQHDEWVAERNREIQQEKQRRANAQAAANKRAATAAANKKARQDKLDAEATA 93
Q + ++ + + + E+ Q Q K AT +KA+ + +
Sbjct: 1064 QNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVET-EKTQEV 1122

Query: 94 DKKRDQSYEDELRSLEIQKQKLALAKEEARVKRENEFIDQELKHKAAQTDVVQSEADANR 153
K Q + +S +Q Q + + V I + D Q + +
Sbjct: 1123 PKVTSQVSPKQEQSETVQPQAEPARENDPTVN-----IKEPQSQTNTTADTEQPAKETSS 1177

Query: 154 NMTE 157
N+ +
Sbjct: 1178 NVEQ 1181


47  EcSMS35_2673  EcSMS35_2681  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
EcSMS35_2673  -1  15  3.148187  3-mercaptopyruvate sulfurtransferase
EcSMS35_2674  1  17  3.134094  enhanced serine sensitivity protein SseB
EcSMS35_2675  0  21  4.655757  aminopeptidase B
EcSMS35_2676  1  21  3.818166  hypothetical protein
EcSMS35_2677  2  23  3.425360  ferredoxin, 2Fe-2S type, ISC system
EcSMS35_2678  2  27  3.182940  chaperone protein HscA
EcSMS35_2679  4  30  0.972368  co-chaperone HscB
EcSMS35_2680  1  28  1.612868  hypothetical protein
EcSMS35_2681  2  30  1.718809  iron-sulfur cluster assembly protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_2674  STREPKINASE  29  0.025  Streptococcus streptokinase protein signature.
		>STREPKINASE#Streptococcus streptokinase protein signature.

Length = 440

Score = 28.5 bits (63), Expect = 0.025
Identities = 27/120 (22%), Positives = 52/120 (43%), Gaps = 21/120 (17%)

Query: 127 GNPLSSQEVLEGGESLILSE-----VAEPPAQMIDSLTTLFKTIKPVKRAFICSIKENEE 181
G+ ++SQE+L +S++ + E + ++ +F+TI P+ + F +K E+
Sbjct: 217 GDTITSQELLAQAQSILNKNHPGYTIYERDSSIVTHDNDIFRTILPMDQEFTYRVKNREQ 276

Query: 182 A-QPNLLIGIEADGDIEEIIQAAGSVATDTIPGDEPIDICQVKKGEKGISHFITEHIVPF 240
A + N G+ + + ++I V +KKGEK F H+ F
Sbjct: 277 AYRINKKSGLNEEINNTDLISEKYYV---------------LKKGEKPYDPFDRSHLKLF 321


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_2678  SHAPEPROTEIN  114  5e-30  Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 114 bits (288), Expect = 5e-30
Identities = 81/371 (21%), Positives = 144/371 (38%), Gaps = 74/371 (19%)

Query: 23 GIDLGTTNSLVATVRSGQAETLADHEGRHLLPSVVHYQQQGHS-------VGYDARTNAA 75
IDLGT N+L+ G + +E PSVV +Q VG+DA+
Sbjct: 14 SIDLGTANTLIYVKGQG----IVLNE-----PSVVAIRQDRAGSPKSVAAVGHDAK-QML 63

Query: 76 LDTANTISSVKRLMGRSLADIQQRYPHLPYQFQASENGLPMIETAAGLLNPVRVSADILK 135
T I++++ + +AD V+ +L+
Sbjct: 64 GRTPGNIAAIRPMKDGVIADF-------------------------------FVTEKMLQ 92

Query: 136 ALAARATEALAGE-LDGVVITVPAYFDDAQRQGTKDAARLAGLHVLRLLNEPTAAAIAYG 194
+ V++ VP +R+ +++A+ AG + L+ EP AAAI G
Sbjct: 93 HFIKQVHSNSFMRPSPRVLVCVPVGATQVERRAIRESAQGAGAREVFLIEEPMAAAIGAG 152

Query: 195 LDSGQEGVIAVYDLGGGTFDISILRLSRGVFEVLATGGDSALGGDDFDHLLADYIREQAG 254
L + V D+GGGT +++++ L+ V +GGD FD + +Y+R G
Sbjct: 153 LPVSEATGSMVVDIGGGTTEVAVISLNGVV-----YSSSVRIGGDRFDEAIINYVRRNYG 207

Query: 255 --IPDRSDNRVQRELLDAAIAAKIALSDADSVTVNVAG---WQG-----EISREQFNELI 304
I + + R++ E+ A + + V G +G ++ + E +
Sbjct: 208 SLIGEATAERIKHEI-------GSAYPGDEVREIEVRGRNLAEGVPRGFTLNSNEILEAL 260

Query: 305 APLVKRTLLACRRALKDAGVE-ADEVLE--VVMVGGSTRVPLVRERVGEFFGRPPLTSID 361
+ + A AL+ E A ++ E +V+ GG + + + E G P + + D
Sbjct: 261 QEPLTGIVSAVMVALEQCPPELASDISERGMVLTGGGALLRNLDRLLMEETGIPVVVAED 320

Query: 362 PDKVVAIGAAI 372
P VA G
Sbjct: 321 PLTCVARGGGK 331


48  EcSMS35_2761  EcSMS35_2783  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
EcSMS35_2761  2  15  -0.455455  30S ribosomal protein S16
EcSMS35_2762  1  13  -0.626193  signal recognition particle protein
EcSMS35_2763  2  11  -1.062222  putative cytochrome C assembly protein
EcSMS35_2764  2  11  -1.366607  hypothetical protein
EcSMS35_2765  3  15  -1.522579  hypothetical protein
EcSMS35_2766  2  15  -0.804622  heat shock protein GrpE
EcSMS35_2767  2  15  -0.738383  inorganic polyphosphate/ATP-NAD kinase
EcSMS35_2768  1  17  -1.268219  recombination and repair protein
EcSMS35_2769  3  20  -1.792427  hypothetical protein
EcSMS35_2770  2  22  -3.240718  hypothetical protein
EcSMS35_2771  3  26  -5.980917  hypothetical protein
EcSMS35_2772  1  22  -5.237454  SsrA-binding protein
EcSMS35_2773  1  19  -4.130296  hypothetical protein
EcSMS35_2774  1  12  -3.607341  hypothetical protein
EcSMS35_2775  1  11  -1.676232  hypothetical protein
EcSMS35_2778  2  11  -0.300682  *alpha amylase family protein
EcSMS35_2779  1  22  4.171445  hypothetical protein
EcSMS35_2780  2  21  4.062176  hydroxyglutarate oxidase
EcSMS35_2781  2  20  3.573741  succinate-semialdehyde dehydrogenase I
EcSMS35_2782  1  17  2.602006  4-aminobutyrate aminotransferase
EcSMS35_2783  2  16  -0.687805  gamma-aminobutyrate transporter
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_2769  BLACTAMASEA  26  0.033  Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 25.9 bits (57), Expect = 0.033
Identities = 23/87 (26%), Positives = 36/87 (41%), Gaps = 11/87 (12%)

Query: 4 KTLTAAAAVLLMLTAGCSTLERVVYRPDINQGNYLTANDVSKIRV--GMTQQQVAYALGT 61
K + AVL + AG LER ++ Q + + + VS+ + GMT ++ A
Sbjct: 69 KVV-LCGAVLARVDAGDEQLERKIH---YRQQDLVDYSPVSEKHLADGMTVGELCAA--A 122

Query: 62 PLMSDPFGTNTWFYVFRQQPGHEGVTQ 88
MSD N + G G+T
Sbjct: 123 ITMSDNSAANL---LLATVGGPAGLTA 146


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_2773  PRTACTNFAMLY  241  7e-68  Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 241 bits (616), Expect = 7e-68
Identities = 219/883 (24%), Positives = 351/883 (39%), Gaps = 89/883 (10%)

Query: 763 NDGGTLDVREKGSATGIQQSSQGAL-VATTRATRVTGTRADGVAFSIEQGAANNILLTNG 821
N+ + E+ IQ S G + A+ +V+G +A G+ + A + NG
Sbjct: 37 NNQSIVKTGERQHGIHIQGSDPGGVRTASGTTIKVSGRQAQGILL---ENPAAELQFRNG 93

Query: 822 GVLT----VESDTSSAKTQVNAGGREIVKTKATATGTTLTGGEQ----IVEGVANETTIN 873
V + + V ++V AT T + V G + +I
Sbjct: 94 SVTSSGQLSDDGIRRFLGTVTVKAGKLVADHATLANVGDTWDDDGIALYVAGEQAQASIA 153

Query: 874 DGGIQTVS-------ANGEAVKTTINEGGTLTVNDNGKATDIIQNSGAALQTSTANGIEI 926
D +Q AN ++ I +GG + + S L+ + +
Sbjct: 154 DSTLQGAGGVQIERGANVTVQRSAIVDGGLHIGALQSLQPEDLPPSRVVLRDTNVTAVPA 213

Query: 927 SGTHQY------------GTFSIAGNLATNALLENGGNLLVLAGTEARDSTVG------- 967
SG G G A A ++ L A D+ G
Sbjct: 214 SGAPAAVSVLGASELTLDGGHITGGRAAGVAAMQGAVVHLQRATIRRGDAPAGGAVPGGA 273

Query: 968 -KGGAMQNLGQDSATKVNSGGQYTL---GRSKDEFQALARAEDLQVA-----GGTAIVYA 1018
GGA+ G Y + G S + Q++ A +L A G V
Sbjct: 274 VPGGAVPGGFGPGGFGPVLDGWYGVDVSGSSVELAQSIVEAPELGAAIRVGRGARVTVSG 333

Query: 1019 GTLA--DASVSGATGSLSLMTPRDNVTPVKLEGAIRITDSATLTIGNGVDTTLTDLTAA- 1075
G+L+ +V G+ P+ + L+ A L LT A
Sbjct: 334 GSLSAPHGNVIETGGARRFA-PQAAPLSITLQAGAHAQGKALLYRVLPEPVKLTLTGGAD 392

Query: 1076 SRGSVWLNSNNSCAGTSNCEYR---------------VNSLLLNDG-----------DVY 1109
++G + S GTS V+SL +++ +
Sbjct: 393 AQGDIVATELPSIPGTSIGPLDVALASQARWTGATRAVDSLSIDNATWVMTDNSNVGALR 452

Query: 1110 LSAQTA---APATTNGIYNTLTTSELSGSGNFYLHTNVAGSRGDQLVVNNNATGNFKIFV 1166
L++ + G + LT + L+GSG F ++ D+LVV +A+G +++V
Sbjct: 453 LASDGSVDFQQPAEAGRFKVLTVNTLAGSGLFRMNVFADLGLSDKLVVMQDASGQHRLWV 512

Query: 1167 QDTGVSPQSDDAMTLVKT-GGGDASFTLGNTGGFVDLGTYEYVLKSDGNSNWNLTNDVKP 1225
+++G P S + + LV+T G A+FTL N G VD+GTY Y L ++GN W+L P
Sbjct: 513 RNSGSEPASANTLLLVQTPLGSAATFTLANKDGKVDIGTYRYRLAANGNGQWSLVGAKAP 572

Query: 1226 NPDPNPNPKPDPKPDPKPDPKPDPTPDPTPTPVPEKRITPSTAAVLNMA--ATLPLVFDA 1283
P P P P+P P+P P P+P+ P P P + ++ + A +N ++ A
Sbjct: 573 -PAPKPAPQPGPQPPQPPQPQPEA---PAPQPPAGRELSAAANAAVNTGGVGLASTLWYA 628

Query: 1284 ELNSIRERLNIMKASPHNNNVWGTTYNTRNNVTTDAGAGFEQTLTGMTVGIDSRNDIPEG 1343
E N++ +RL ++ +P WG + R + AG F+Q + G +G D + G
Sbjct: 629 ESNALSKRLGELRLNPDAGGAWGRGFAQRQQLDNRAGRRFDQKVAGFELGADHAVAVAGG 688

Query: 1344 IATLGAFMGYSHSHIGFDRGGHGSVGSYSLGGYASWEHESGFYLDGIVKLNRFESNVAGK 1403
LG GY+ GF G G S +GGYA++ +SGFYLD ++ +R E++
Sbjct: 689 RWHLGGLAGYTRGDRGFTGDGGGHTDSVHVGGYATYIADSGFYLDATLRASRLENDFKVA 748

Query: 1404 MSSGGAANGSYRSNGLGGHIETGMRFT-DGNWNLTPYASLTGFTADNPEYHLSNGMESKS 1462
S G A G YR++G+G +E G RFT W L P A L F A Y +NG+ +
Sbjct: 749 GSDGYAVKGKYRTHGVGASLEAGRRFTHADGWFLEPQAELAVFRAGGGAYRAANGLRVRD 808

Query: 1463 VDTRSIYRELGATLSYNMRLGNGMEVEPWLKAAVRKEFVDDNRVKVNNDGNFVNDLSGRR 1522
S+ LG + + L G +V+P++KA+V +EF V N + +L G R
Sbjct: 809 EGGSSVLGRLGLEVGKRIELAGGRQVQPYIKASVLQEFDGAGTVHTNGIAH-RTELRGTR 867

Query: 1523 GIYQAGIKASFSSSLSGNLGVGYSHGAGVESPWNAVAGVNWSF 1565
G+ A+ S YS G + PW AG +S+
Sbjct: 868 AELGLGMAAALGRGHSLYASYEYSKGPKLAMPWTFHAGYRYSW 910


49  EcSMS35_2827  EcSMS35_2868  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
EcSMS35_2827  0  16  3.200221  PTS system glucitol/sorbitol-specific
EcSMS35_2828  0  15  2.787659  sorbitol-6-phosphate dehydrogenase
EcSMS35_2829  1  15  2.898553  DNA-binding transcriptional activator GutM
EcSMS35_2830  1  15  3.894837  DNA-binding transcriptional repressor SrlR
EcSMS35_2832  1  16  4.406047  D-arabinose 5-phosphate isomerase
EcSMS35_2831  1  17  3.784568  anaerobic nitric oxide reductase transcription
EcSMS35_2833  1  16  2.912827  anaerobic nitric oxide reductase
EcSMS35_2834  2  16  3.458948  nitric oxide reductase
EcSMS35_2835  0  16  3.128123  [NiFe] hydrogenase maturation protein HypF
EcSMS35_2836  -2  13  1.165784  electron transport protein HydN
EcSMS35_2837  -1  15  1.466156  hypothetical protein
EcSMS35_2838  -1  16  1.974242  transcriptional regulator AscG
EcSMS35_2839  -1  18  2.718582  PTS system cellobiose/arbutin/salicin-specific
EcSMS35_2840  -2  18  2.306891  cryptic 6-phospho-beta-glucosidase
EcSMS35_2841  -1  24  3.746781  hypothetical protein
EcSMS35_2842  -1  30  5.009263  hydrogenase 3 maturation protease
EcSMS35_2843  -1  27  5.523343  formate hydrogenlyase maturation protein
EcSMS35_2844  -1  26  5.748272  formate hydrogenlyase, subunit G
EcSMS35_2845  0  26  5.127577  formate hydrogenlyase complex iron-sulfur
EcSMS35_2846  0  23  4.990873  formate hydrogenlyase, subunit E
EcSMS35_2847  2  20  4.706166  formate hydrogenlyase, subunit D
EcSMS35_2848  2  20  4.180672  formate hydrogenlyase subunit 3
EcSMS35_2849  2  20  3.032656  formate hydrogenlyase, subunit B
EcSMS35_2850  1  19  3.531231  formate hydrogenlyase regulatory protein HycA
EcSMS35_2851  0  16  3.047658  hydrogenase nickel incorporation protein
EcSMS35_2852  0  15  2.717025  hydrogenase nickel incorporation protein HypB
EcSMS35_2853  0  12  2.381123  hydrogenase assembly chaperone
EcSMS35_2854  0  12  2.097009  hydrogenase expression/formation protein HypD
EcSMS35_2855  -1  14  2.739011  hydrogenase expression/formation protein HypE
EcSMS35_2856  -1  12  0.929285  formate hydrogenlyase transcriptional activator
EcSMS35_2857  0  16  1.077587  molybdenum-pterin binding protein
EcSMS35_2858  0  18  2.762297  hypothetical protein
EcSMS35_2859  0  18  3.330908  hypothetical protein
EcSMS35_2860  -1  16  3.336746  DNA mismatch repair protein MutS
EcSMS35_2861  0  15  1.928463  serine/threonine-specific protein phosphatase 2
EcSMS35_2862  -2  14  3.240905  4-hydroxybenzoate decarboxylase, subunit D
EcSMS35_2863  -2  12  3.097480  4-hydroxybenzoate decarboxylase, subunit C
EcSMS35_2864  -2  11  2.132047  4-hydroxybenzoate decarboxylase, subunit B
EcSMS35_2865  -1  11  1.299544  MarR family transcriptional regulator
EcSMS35_2866  -1  11  1.615732  RNA polymerase sigma factor RpoS
EcSMS35_2867  1  13  2.134016  lipoprotein NlpD
EcSMS35_2868  2  15  2.285801  protein-L-isoaspartate O-methyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_2828  DHBDHDRGNASE  81  3e-20  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 80.9 bits (199), Expect = 3e-20
Identities = 67/257 (26%), Positives = 120/257 (46%), Gaps = 7/257 (2%)

Query: 3 QVAVVIGGGQTLGAFLCHGLAAEGYRVAVVDIQSDKAANVAQEINAEYGEGMAYGFGADA 62
++A + G Q +G + LA++G +A VD +K V + AE A F AD
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAE--ARHAEAFPADV 66

Query: 63 TSEQSVLALSRGVDEIFGRVDLLVYSAGIAKAAFISDFQLGDFDRSLQVNLVGYFLCARE 122
++ ++ ++ G +D+LV AG+ + I +++ + VN G F +R
Sbjct: 67 RDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRS 126

Query: 123 FSRLMIRDGIQGRIIQINSKSGKVGSKHNSGYSAAKFGGVGLTQSLALDLAEYGITVHSL 182
S+ M D G I+ + S V + Y+++K V T+ L L+LAEY I + +
Sbjct: 127 VSKYM-MDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIV 185

Query: 183 MLGNLLKSPMFQSL-LPQYATKLGIKPEQVEQYYIDKVPLKRGCDYQDVLNMLLFYASPK 241
G+ ++ M SL + + IK +E + +PLK+ D+ + +LF S +
Sbjct: 186 SPGS-TETDMQWSLWADENGAEQVIKGS-LETFKTG-IPLKKLAKPSDIADAVLFLVSGQ 242

Query: 242 ASYCTGQSINVTGGQVM 258
A + T ++ V GG +
Sbjct: 243 AGHITMHNLCVDGGATL 259


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_2830  ARGREPRESSOR  29  0.014  Bacterial arginine repressor signature.
		>ARGREPRESSOR#Bacterial arginine repressor signature.

Length = 149

Score = 28.7 bits (64), Expect = 0.014
Identities = 20/105 (19%), Positives = 35/105 (33%), Gaps = 17/105 (16%)

Query: 1 MKPRQRQAAILEYLQKQGKCSVEEL-----AQYFDTTGTTIRKDLVILEHAGTVIRTYGG 55
M QR I E + + +EL ++ T T+ +D+ E + T G
Sbjct: 1 MNKGQRHIKIREIITANEIETQDELVDILKKDGYNVTQATVSRDIK--ELHLVKVPTNNG 58

Query: 56 ---VVLNKEESDPPIDHKTLINTHKKELIAEAAVSFIHDGDSIIL 97
L ++ P+ K + +A V I+L
Sbjct: 59 SYKYSLPADQRFNPLS-------KLKRSLMDAFVKIDSASHLIVL 96


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_2831  HTHFIS  370  e-126  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 370 bits (951), Expect = e-126
Identities = 125/388 (32%), Positives = 194/388 (50%), Gaps = 33/388 (8%)

Query: 149 IAALAAGALS----------NALLIEQLESQNMLPGDAAPFEAVKQTQMIGLSPGMTQLK 198
I A GA +I + ++ ++ ++G S M ++
Sbjct: 91 IKASEKGAYDYLPKPFDLTELIGIIGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIY 150

Query: 199 KEIEIVAASDLNVLISGETGTGKELVAKAIHEASPRAVNPLVYLNCAALPESVAESELFG 258
+ + + +DL ++I+GE+GTGKELVA+A+H+ R P V +N AA+P + ESELFG
Sbjct: 151 RVLARLMQTDLTLMITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFG 210

Query: 259 HVKGAFTGAISNRSGKFEMADNGTLFLDEIGELSLALQAKLLRVLQYGDIQRVGDDRSLR 318
H KGAFTGA + +G+FE A+ GTLFLDEIG++ + Q +LLRVLQ G+ VG +R
Sbjct: 211 HEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDMPMDAQTRLLRVLQQGEYTTVGGRTPIR 270

Query: 319 VDVRVLAATNRDLREEVLAGRFRADLFHRLSVFPLSVPPLRERGDDVILLAGYFCEQCRL 378
DVR++AATN+DL++ + G FR DL++RL+V PL +PPLR+R +D+ L +F +Q
Sbjct: 271 SDVRIVAATNKDLKQSINQGLFREDLYYRLNVVPLRLPPLRDRAEDIPDLVRHFVQQAE- 329

Query: 379 RLGLSRVVLSARARNLLQHYNFPGNVRELEHAIHRAVVLARATRSGDEVIL-----EAQH 433
+ GL A L++ + +PGNVRELE+ + R L E+I E
Sbjct: 330 KEGLDVKRFDQEALELMKAHPWPGNVRELENLVRRLTALYPQDVITREIIENELRSEIPD 389

Query: 434 FAFPEVTLPPPEAAAVPIVKQNLR-----------------EATEAFQRETIRQALAQNH 476
+ + V++N+R + I AL
Sbjct: 390 SPIEKAAARSGSLSISQAVEENMRQYFASFGDALPPSGLYDRVLAEMEYPLILAALTATR 449

Query: 477 HNWAASARMLETDVANLHRLAKRLGLKD 504
N +A +L + L + + LG+
Sbjct: 450 GNQIKAADLLGLNRNTLRKKIRELGVSV 477


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_2837  TCRTETA  26  0.010  Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 25.6 bits (56), Expect = 0.010
Identities = 10/42 (23%), Positives = 21/42 (50%)

Query: 5 VIVTSVDDIVFPALQNLISLNNSKLKTGMICECIGALTVIAG 46
+++ + I PALQ ++S + + G + + ALT +
Sbjct: 307 MVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTS 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_2838  HTHTETR  30  0.012  TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 29.6 bits (66), Expect = 0.012
Identities = 19/103 (18%), Positives = 34/103 (33%), Gaps = 8/103 (7%)

Query: 6 LIACKGMKMMTTMLEVAKRAGVSKATVSRVLSG-----NGYVSQETKDRVFQAVEESGYR 60
L + +G+ T++ E+AK AGV++ + + + +E
Sbjct: 23 LFSQQGVSS-TSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIGELELEYQAKF 81

Query: 61 PNLLARNLSAKSTQTLGLVVTNTLYHGIYFSELLFHAARMAEE 103
P L L VT + E++FH E
Sbjct: 82 PGDPLSVLREILIHVLESTVTEERRRLLM--EIIFHKCEFVGE 122


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_2853  TYPE4SSCAGA  27  0.012  Type IV secretion system CagA exotoxin signature.
		>TYPE4SSCAGA#Type IV secretion system CagA exotoxin signature.

Length = 1147

Score = 27.0 bits (59), Expect = 0.012
Identities = 19/75 (25%), Positives = 37/75 (49%), Gaps = 8/75 (10%)

Query: 12 IDGNQAKVD--VCGIQRDVDLTLVGSCDENGQPRVGQWVLVHVGFAMSVINEAEARDTLD 69
I GNQ + D G+ D L ++NG+P G W+ + + F + ++ ++ D +
Sbjct: 171 IIGNQIRTDQKFMGV-FDESLKERQEAEKNGEPTGGDWLDIFLSF---IFDKKQSSDVKE 226

Query: 70 ALQN--MFDVEPDVG 82
A+ + V+PD+
Sbjct: 227 AINQEPVPHVQPDIA 241


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_2856  HTHFIS  389  e-131  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 389 bits (1000), Expect = e-131
Identities = 140/373 (37%), Positives = 204/373 (54%), Gaps = 39/373 (10%)

Query: 350 YQEIHRLKERLVDENLALTEQLNNVDSEFGEIIGRSEAMYSVLKQVEMVAQSDSTVLILG 409
E+ + R + E +L + + ++GRS AM + + + + Q+D T++I G
Sbjct: 108 LTELIGIIGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITG 167

Query: 410 ETGTGKELIARAIHNLSGRNNRRMVKMNCAAMPAGLLESDLFGHERGAFTGASAQRIGRF 469
E+GTGKEL+ARA+H+ R N V +N AA+P L+ES+LFGHE+GAFTGA + GRF
Sbjct: 168 ESGTGKELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRF 227

Query: 470 ELADKSSLFLDEVGDMPLELQPKLLRVLQEQEFERLGSNKIIQTDVRLIAATNRDLKKMV 529
E A+ +LFLDE+GDMP++ Q +LLRVLQ+ E+ +G I++DVR++AATN+DLK+ +
Sbjct: 228 EQAEGGTLFLDEIGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSI 287

Query: 530 ADREFRSDLYYRLNVFPIHLPPLRERPEDIPLLAKAFTFKIARRLGRNIDSIPAETLRIL 589
FR DLYYRLNV P+ LPPLR+R EDIP L + F + A + G ++ E L ++
Sbjct: 288 NQGLFREDLYYRLNVVPLRLPPLRDRAEDIPDLVRHFV-QQAEKEGLDVKRFDQEALELM 346

Query: 590 SNMEWPGNVRELENVIERAVLLTRGNVLQLSL---------------------PDIALPE 628
WPGNVRELEN++ R L +V+ + +++ +
Sbjct: 347 KAHPWPGNVRELENLVRRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQ 406

Query: 629 PETPPAATVVAQEG--------------EDEYQLIVRVLKETNGVVAGPKGAAQRLGLKR 674
A G E EY LI+ L T G AA LGL R
Sbjct: 407 AVEENMRQYFASFGDALPPSGLYDRVLAEMEYPLILAALTATRGNQI---KAADLLGLNR 463

Query: 675 TTLLSRMKRLGID 687
TL +++ LG+
Sbjct: 464 NTLRKKIRELGVS 476


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_2857  ALARACEMASE  27  0.027  Alanine racemase signature.
		>ALARACEMASE#Alanine racemase signature.

Length = 356

Score = 27.4 bits (61), Expect = 0.027
Identities = 16/138 (11%), Positives = 39/138 (28%), Gaps = 28/138 (20%)

Query: 35 NDEVELTLAGGAKLVAIV--------------THSSQQALGLAKGKEAIAL----IKAPW 76
N + A A++ ++V + L +EAI L K P
Sbjct: 17 NLSIVRQAATHARVWSVVKANAYGHGIERIWSAIGATDGFALLNLEEAITLRERGWKGPI 76

Query: 77 VTL--ATEDCGLKFSARNQFAGSVSTI--------TEGAVNATVHIKTDAGFEIVAVVTN 126
+ L L+ +++ V + +++K ++G + +
Sbjct: 77 LMLEGFFHAQDLEIYDQHRLTTCVHSNWQLKALQNARLKAPLDIYLKVNSGMNRLGFQPD 136

Query: 127 ESQDEMKLTTGSRVIALI 144
+ + +
Sbjct: 137 RVLTVWQQLRAMANVGEM 154


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_2867  RTXTOXIND  30  0.020  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 29.8 bits (67), Expect = 0.020
Identities = 16/84 (19%), Positives = 36/84 (42%), Gaps = 12/84 (14%)

Query: 292 IIATADGRVVYAGNALRGYGNLIIIKHNDDYLSAYAHNDTMLVREQQEVKAGQKIATMGS 351
I+ATA+G++ ++G + IK ++ ++V+E + V+ G + + +
Sbjct: 82 IVATANGKLTHSGRSK-------EIKP---IENSIV--KEIIVKEGESVRKGDVLLKLTA 129

Query: 352 TGTSSTRLHFEIRYKGKSVNPLRY 375
G + L + + RY
Sbjct: 130 LGAEADTLKTQSSLLQARLEQTRY 153


50  EcSMS35_2890  EcSMS35_2917  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
EcSMS35_2890  1  19  3.189077  phosphoadenosine phosphosulfate reductase
EcSMS35_2891  1  18  3.662924  sulfite reductase subunit beta
EcSMS35_2892  1  14  2.342107  sulfite reductase subunit alpha
EcSMS35_2893  -1  17  1.224242  hypothetical protein
EcSMS35_2894  0  17  2.122241  queuosine biosynthesis protein QueD
EcSMS35_2895  3  16  2.580134  pyridine nucleotide-disulphide oxidoreductase
EcSMS35_2896  1  14  1.348012  putative ferredoxin
EcSMS35_2897  1  12  0.323332  glycerol-3-phosphate responsive antiterminator
EcSMS35_2898  1  10  0.189177  electron transfer flavoprotein
EcSMS35_2899  0  10  -0.813726  electron transfer flavoprotein
EcSMS35_2900  0  12  -1.701849  major facilitator family transporter
EcSMS35_2901  -1  14  -3.095093  FAD binding domain-containing protein
EcSMS35_2902  1  20  -3.929442  short chain dehydrogenase/reductase family
EcSMS35_2903  1  21  -3.539448  major facilitator family transporter
EcSMS35_2904  0  25  -3.819128  carbohydrate kinase, FGGY family protein
EcSMS35_2905  1  27  -3.707583  aminoimidazole riboside kinase
EcSMS35_2906  1  24  -3.944316  sucrose porin ScrY
EcSMS35_2907  2  27  -4.330489  PTS system sucrose-specific EIIBC component
EcSMS35_2908  1  25  -3.884428  beta-fructofuranosidase
EcSMS35_2909  -1  25  -4.686460  sucrose operon repressor
EcSMS35_2910  1  26  -5.507257  hypothetical protein
EcSMS35_2911  2  26  -4.182271  hypothetical protein
EcSMS35_2912  2  22  -3.282384  LemA family protein
EcSMS35_2913  1  20  -1.960878  hypothetical protein
EcSMS35_2914  1  20  -1.671567  hypothetical protein
EcSMS35_2915  2  24  -1.002516  hypothetical protein
EcSMS35_2916  2  26  0.009168  IS4 transposase
EcSMS35_2917  2  25  0.321175  phosphopyruvate hydratase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_2891  PF07675  31  0.020  Cleaved Adhesin
		>PF07675#Cleaved Adhesin

Length = 1358

Score = 30.8 bits (69), Expect = 0.020
Identities = 20/92 (21%), Positives = 39/92 (42%), Gaps = 12/92 (13%)

Query: 206 ILGQTYLPRKFKTTVVIP---PQND--IDLHANDMNFVAIAENGKLVGFNLLVGGGLSIE 260
++ +P+ T +P PQN + A+ ++VAI+++G L G + G++
Sbjct: 240 VMPYRAMPKT--NTYTLPASLPQNQASYSIQASAGSYVAISKDGVLYGTGVANASGVATV 297

Query: 261 HGNK-----KTYARTASEFGYLPLEHTLAVAE 287
+ K Y + YLP+ + E
Sbjct: 298 NMTKQITENGNYDVVITRSNYLPVIKQIQAGE 329


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_2900  TCRTETB  35  5e-04  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 35.2 bits (81), Expect = 5e-04
Identities = 45/314 (14%), Positives = 112/314 (35%), Gaps = 36/314 (11%)

Query: 69 LGSLVLGWISDHIGRQKIFTFSFLLITLASFLQFFATTP-EHLIGLRILIGIGLGGDYSV 127
+G+ V G +SD +G +++ F ++ S + F + LI R + G G ++
Sbjct: 64 IGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPAL 123

Query: 128 GHTLLAEFSPRRHRGILLGAFSVVWT----VGYVLASIAGHHFISENPEAWRWLLASAAL 183
++A + P+ +RG G + VG + + H+ W +LL +
Sbjct: 124 VMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYI------HWSYLLLIPMI 177

Query: 184 PALLITLLRWGTPESPRWLLRQGRFAEAHAIVHRYLGPHVLLGDEVAAATHKHIKTLF-- 241
+ + L + R +G F I+ +L + + + L
Sbjct: 178 TIITVPFLMKLLKKEVR---IKGHFDIKGIILMSVGIVFFMLFTTSYSISFLIVSVLSFL 234

Query: 242 -SSRYWRRTA--------FNSVFFVCLVIPWFVIYT----WLPTIAQTIGLEDALTASLM 288
++ R+ ++ F+ V+ +I+ ++ + + L+ + +
Sbjct: 235 IFVKHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEI 294

Query: 289 LNALLIVGALLGLV-------LTHLLAHRKFLLGSFLLLAATLVVMACLPSGSSLTLLLF 341
+ ++ G + ++ L L L+ + + + L +S + +
Sbjct: 295 GSVIIFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTII 354

Query: 342 VLFSTTISAVSNLV 355
++F + + V
Sbjct: 355 IVFVLGGLSFTKTV 368


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_2902  DHBDHDRGNASE  106  1e-29  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 106 bits (265), Expect = 1e-29
Identities = 72/257 (28%), Positives = 116/257 (45%), Gaps = 11/257 (4%)

Query: 11 MDFFSLKGKTAIVTGGNSGLGQAFAMALAKAGANVFIPSFVKDNGETKEMIEK-QGVEVD 69
M+ ++GK A +TG G+G+A A LA GA++ + + E K + +
Sbjct: 1 MNAKGIEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAE 60

Query: 70 FMQVDITAEGAPQKIIAACCERFGTVDILVNNAGICKLNKVLDFGRADWDPMIDVNLTAA 129
D+ A +I A G +DILVN AG+ + + +W+ VN T
Sbjct: 61 AFPADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGV 120

Query: 130 FELSYEAAKIMIPQKSGKIINICSLFSYLGGQWSPAYSATKHALAGFTKAYCDELGQYNI 189
F S +K M+ ++SG I+ + S + + AY+++K A FTK EL +YNI
Sbjct: 121 FNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNI 180

Query: 190 QVNGIAPGYYATDI--TLATRSNPETNQRVLDH-------IPANRWGDTQDLMGTAIFLA 240
+ N ++PG TD+ +L N Q + IP + D+ +FL
Sbjct: 181 RCNIVSPGSTETDMQWSLWADENGAE-QVIKGSLETFKTGIPLKKLAKPSDIADAVLFLV 239

Query: 241 SPASNYVNGHLLVVDGG 257
S + ++ H L VDGG
Sbjct: 240 SGQAGHITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_2903  TCRTETA  29  0.034  Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 29.0 bits (65), Expect = 0.034
Identities = 21/103 (20%), Positives = 45/103 (43%), Gaps = 8/103 (7%)

Query: 48 GLIMSTFGIAAIILYAPSGVIADKFSHRKMITSAMIITGLLGLLMATYPPLWVMLCIQVA 107
G++++ + + G ++D+F R ++ ++ + +MAT P LWV+ ++
Sbjct: 46 GILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIV 105

Query: 108 FAITTILMLWSVSIKAASLLGD---HSEQGKIMGWMEGLRGVG 147
IT + A + + D E+ + G+M G G
Sbjct: 106 AGITG-----ATGAVAGAYIADITDGDERARHFGFMSACFGFG 143


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_2915  cloacin  39  2e-05  Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 38.5 bits (89), Expect = 2e-05
Identities = 19/39 (48%), Positives = 20/39 (51%), Gaps = 1/39 (2%)

Query: 253 PQKKASGRSYHSDSGGSGGGSSGGGFNGGGGSSGGGGAS 291
P SG H GGSG G+ GG N GGGS GG S
Sbjct: 45 PWGGGSGSGIH-WGGGSGHGNGGGNGNSGGGSGTGGNLS 82



Score = 33.5 bits (76), Expect = 8e-04
Identities = 16/29 (55%), Positives = 18/29 (62%)

Query: 264 SDSGGSGGGSSGGGFNGGGGSSGGGGASG 292
S SG GG SG G GG G+SGGG +G
Sbjct: 50 SGSGIHWGGGSGHGNGGGNGNSGGGSGTG 78



Score = 31.2 bits (70), Expect = 0.005
Identities = 15/39 (38%), Positives = 20/39 (51%), Gaps = 3/39 (7%)

Query: 257 ASGRSYHSDSGGSGGGSSGG---GFNGGGGSSGGGGASG 292
+ G + S++ GGGS G G G G+ GG G SG
Sbjct: 34 SDGSGWSSENNPWGGGSGSGIHWGGGSGHGNGGGNGNSG 72



Score = 31.2 bits (70), Expect = 0.005
Identities = 10/25 (40%), Positives = 11/25 (44%)

Query: 268 GSGGGSSGGGFNGGGGSSGGGGASG 292
G G G GG NG G G G +
Sbjct: 57 GGGSGHGNGGGNGNSGGGSGTGGNL 81



Score = 30.5 bits (68), Expect = 0.009
Identities = 15/29 (51%), Positives = 17/29 (58%), Gaps = 1/29 (3%)

Query: 263 HSDSGGSGGGSSGGGFNGGGGSSGGGGAS 291
HS SG GG +G G GGG S G G +S
Sbjct: 14 HSTSGNINGGPTGLGV-GGGASDGSGWSS 41



Score = 30.5 bits (68), Expect = 0.010
Identities = 12/30 (40%), Positives = 12/30 (40%)

Query: 263 HSDSGGSGGGSSGGGFNGGGGSSGGGGASG 292
GGGS G G G S GG G G
Sbjct: 50 SGSGIHWGGGSGHGNGGGNGNSGGGSGTGG 79



Score = 29.3 bits (65), Expect = 0.021
Identities = 12/26 (46%), Positives = 14/26 (53%)

Query: 267 GGSGGGSSGGGFNGGGGSSGGGGASG 292
GGSG G GG +G G G G + G
Sbjct: 48 GGSGSGIHWGGGSGHGNGGGNGNSGG 73


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcSMS35_2917  ANTHRAXTOXNA  29  0.038  Anthrax toxin LF subunit signature.
		>ANTHRAXTOXNA#Anthrax toxin LF subunit signature.

Length = 800

Score = 29.3 bits (65), Expect = 0.038
Identities = 31/132 (23%), Positives = 51/132 (38%), Gaps = 9/132 (6%)

Query: 211 GYAPNLGSNAEALAVIAEAVKAAGYELGKDITLAMDCAASEFYKDGKYVLA-----GEGN 265
P L N + A+ +E K YE+GK I+L + + ++ + +
Sbjct: 147 RETPKLIINIKDYAINSEQSKEVYYEIGKGISLDIISKDKSLDPEFLNLIKSLSDDSDSS 206

Query: 266 KAFTSEEFTHFLEELTKQYPIVSIEDGLDESDW---DGFAYQTKVLG-DKIQLVGDDLFV 321
S++F LE K I I++ L E F+Y ++L D+F
Sbjct: 207 DLLFSQKFKEKLELNNKSIDINFIKENLTEFQHAFSLAFSYYFAPDHRTVLELYAPDMFE 266

Query: 322 TNTKILKEGIEK 333
K+ K G EK
Sbjct: 267 YMNKLEKGGFEK 278


51  EcSMS35_2952  EcSMS35_2960  Y  N  Y  Genomic Island
LocusTag      DNBias  CDNBias  %GCBias     Product
EcSMS35_2952  0       20       -4.309080   cysteine desulfurase, sulfur acceptor subunit
EcSMS35_2953  1       21       -5.628596   ThiF family protein
EcSMS35_2954  1       26       -8.292136   murein transglycosylase A
EcSMS35_2958  2       33       -10.598110  ***D-isomer specific 2-hydroxyacid dehydrogenase
EcSMS35_2959  0       25       -8.380493   KpsF/GutQ family sugar isomerase
EcSMS35_2960  0       21       -6.432100   aminotransferase, classes I and II
52  EcSMS35_3107  EcSMS35_3255  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias     Product
EcSMS35_3107  2       15       -0.730397   hypothetical protein
EcSMS35_3108  2       16       -1.232536   nucleoside permease NupG
EcSMS35_3109  4       20       -1.248297   ornithine decarboxylase
EcSMS35_3112  6       27       -3.665787   *site-specific recombinase, phage integrase
EcSMS35_3113  6       28       -3.556958   HTH-type transcriptional regulator RafR
EcSMS35_3114  5       25       -3.437622   alpha-galactosidase
EcSMS35_3115  2       27       -5.221675   galactoside permease
EcSMS35_3116  2       27       -4.196736   raffinose invertase
EcSMS35_3117  3       27       -5.021720   glycoporin RafY
EcSMS35_3118  4       26       2.508186    IS2 transposase orfB
EcSMS35_3119  5       26       1.870300    IS1 transposase orfA
EcSMS35_3120  7       26       1.190847    hypothetical protein
EcSMS35_3124  7       26       1.059824    IS66 family transposase orfA
EcSMS35_3125  8       25       0.302372    hypothetical protein
EcSMS35_3126  7       24       0.082372    IS66 family transposase orfB
EcSMS35_3127  7       26       -2.075370   hypothetical protein
EcSMS35_3128  7       25       -1.569904   bifunctional enterobactin receptor/adhesin
EcSMS35_3129  5       24       -3.107310   hypothetical protein
EcSMS35_3130  1       16       -1.154929   hypothetical protein
EcSMS35_3131  0       17       -0.970833   IS630 transposase
EcSMS35_3132  3       17       -0.513085   hypothetical protein
EcSMS35_3136  3       18       0.269729    IS66 family transposase orfB
EcSMS35_3137  5       20       0.116359    IS66 family transposase orfA
EcSMS35_3140  5       21       -0.054472   N-acetylmuramoyl-L-alanine amidase AmiC
EcSMS35_3141  8       27       0.172677    hypothetical protein
EcSMS35_3143  7       27       0.764628    IS21 transposase orfB
EcSMS35_3144  6       27       0.369617    IS2 transposase orfA
EcSMS35_3152  6       27       -1.104910   immunoglobulin-binding regulator B
EcSMS35_3153  5       26       -1.812813   immunoglobulin-binding regulator A
EcSMS35_3155  3       30       -1.317614   hypothetical protein
EcSMS35_3156  2       25       -0.135836   ISEc16 transposase orfA
EcSMS35_3161  2       28       -0.523011   hypothetical protein
EcSMS35_3162  3       30       -1.008332   hypothetical protein
EcSMS35_3163  3       30       -1.052313   superoxide dismutase (Cu-Zn)
EcSMS35_3164  1       25       -0.316268   insertion sequence 2 OrfA protein
EcSMS35_3165  2       21       -0.415982   IS2 transposase orfB
EcSMS35_3166  2       21       -1.701595   hypothetical protein
EcSMS35_3167  2       16       0.216965    hypothetical protein
EcSMS35_3170  3       16       -0.496883   iron/manganese transport system periplasmic
EcSMS35_3171  3       17       0.681743    iron/manganese transport system inner membrane
EcSMS35_3172  3       18       1.129340    iron/manganese transport system inner membrane
EcSMS35_3173  4       19       1.537379    iron/manganese transport system membrane protein
EcSMS35_3178  4       21       2.252112    ferric aerobactin receptor IutA
EcSMS35_3179  4       21       1.399960    L-lysine 6-monooxygenase IucD
EcSMS35_3180  4       21       2.034300    aerobactin siderophore biosynthesis protein
EcSMS35_3181  4       23       1.307006    aerobactin siderophore biosynthesis protein
EcSMS35_3182  5       24       0.539837    aerobactin siderophore biosynthesis protein
EcSMS35_3183  6       30       -0.282217   transport protein ShiF
EcSMS35_3187  4       31       -2.494703   putative immunoglobuling-binding protein
EcSMS35_3188  3       23       -0.743527   hypothetical protein
EcSMS35_3189  5       22       0.034079    hypothetical protein
EcSMS35_3190  5       21       -0.167361   hypothetical protein
EcSMS35_3195  5       22       -0.555861   hypothetical protein
EcSMS35_3196  5       30       -7.755200   AlpA family transcriptional regulator
EcSMS35_3197  4       28       -6.043818   hypothetical protein
EcSMS35_3198  5       32       -8.604190   hypothetical protein
EcSMS35_3199  5       34       -8.672170   hypothetical protein
EcSMS35_3200  5       31       -7.079316   hypothetical protein
EcSMS35_3201  4       31       -6.346143   hypothetical protein
EcSMS35_3202  5       25       0.057550    putative GTPase
EcSMS35_3203  5       26       0.018240    hypothetical protein
EcSMS35_3204  6       27       1.850828    hypothetical protein
EcSMS35_3205  6       26       2.508421    hypothetical protein
EcSMS35_3206  6       30       4.362160    hypothetical protein
EcSMS35_3207  7       30       4.308013    hypothetical protein
EcSMS35_3208  8       27       3.357457    antirestriction protein
EcSMS35_3209  7       26       2.994155    RadC family DNA repair protein
EcSMS35_3210  7       27       0.317467    hypothetical protein
EcSMS35_3211  6       29       -0.325471   putative antitoxin module of toxin-antitoxin
EcSMS35_3212  5       28       -1.190643   hypothetical protein
EcSMS35_3213  5       28       -2.895034   hypothetical protein
EcSMS35_3214  3       21       -3.619160   hypothetical protein
EcSMS35_3215  2       18       -2.983093   hypothetical protein
EcSMS35_3216  0       11       -1.750281   CC2985 family addiction module antidote protein
EcSMS35_3217  -1      11       -0.851188   hypothetical protein
EcSMS35_3218  -2      11       -0.598582   hypothetical protein
EcSMS35_3219  -3      10       -1.136914   polysialic acid capsule expression protein KpsF
EcSMS35_3220  -1      14       -4.592654   polysialic acid capsule export inner-membrane
EcSMS35_3221  2       23       -7.998679   polysialic acid capsule transport protein KpsD
EcSMS35_3222  3       35       -11.662796  3-deoxy-manno-octulosonate cytidylyltransferase
EcSMS35_3223  3       43       -14.392087  polysialic acid capsule polysaccharide export
EcSMS35_3224  6       54       -19.142792  polysialic acid capsule biosynthesis protein
EcSMS35_3228  9       62       -21.675839  polysialic acid capsule biosynthesis
EcSMS35_3229  10      61       -20.463658  polysialic acid capsule biosynthesis protein
EcSMS35_3230  7       55       -17.739391  polysialic acid capsule biosynthesis protein
EcSMS35_3231  4       46       -14.305315  polysialic acid capsule biosynthesis
EcSMS35_3232  2       39       -11.802687  polysialic acid capsule biosynthesis sialic acid
EcSMS35_3233  0       31       -8.233444   polysialic acid capsule biosynthesis protein
EcSMS35_3234  -2      21       -3.337779   polysialic acid capsule biosynthesis transport
EcSMS35_3235  1       18       -0.188026   polysialic acid capsule biosynthesis transport
EcSMS35_3236  2       16       1.620810    IS200 transposase orfB
EcSMS35_3237  2       14       3.306931    IS200 transposase orfA
EcSMS35_3238  1       17       3.988603    putative general secretion pathway protein YghD
EcSMS35_3239  1       15       3.987069    GspL-like protein
EcSMS35_3240  1       19       4.749833    general secretion pathway protein GspK
EcSMS35_3241  -1      20       5.551602    general secretion pathway protein GspJ
EcSMS35_3242  -1      20       4.855929    general secretion pathway protein GspI
EcSMS35_3243  -1      17       4.082089    general secretion pathway protein GspH
EcSMS35_3244  -2      16       3.476921    general secretion pathway protein GspG
EcSMS35_3245  -2      14       3.281853    general secretion pathway protein GspF
EcSMS35_3246  -1      13       1.518636    general secretory pathway protein GspE
EcSMS35_3247  -2      13       0.839538    general secretion pathway protein GspD
EcSMS35_3248  -3      11       0.659383    putative type II secretion protein GspC
EcSMS35_3249  -3      13       0.922577    putative lipoprotein
EcSMS35_3250  -3      12       1.640482    leader peptidase PppA
EcSMS35_3251  -3      13       2.108277    hypothetical protein
EcSMS35_3252  -1      14       3.643776    glycolate transporter
EcSMS35_3253  0       14       4.006398    malate synthase G
EcSMS35_3254  1       12       3.497774    hypothetical protein
EcSMS35_3255  1       11       3.054727    glycolate oxidase iron-sulfur subunit
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3108  TCRTETA       29     0.036    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 29.0 bits (65), Expect = 0.036
Identities = 36/239 (15%), Positives = 76/239 (31%), Gaps = 18/239 (7%)

Query: 158 SHMQLYIGAALSAVLVLFTLTLPHIPVAKQQANQSWTTLLGLDAFALFKNKRMAIFFIFS 217
H + AAL+ + L L + + L+ A F+ R
Sbjct: 159 PHAPFFAAAALNGLNFLTGCFLLPES---HKGERRPLRREALNPLASFRWARGMTVVAAL 215

Query: 218 MLLGAELQITNMFGNTFLHSFDKDPMFASSFIVQHASIIMSISQISETLF-ILTIPFFLS 276
M + +Q+ F +D + + I ++ I +L + +
Sbjct: 216 MAVFFIMQLVGQVPAALWVIFGEDRFHWDATTI---GISLAAFGILHSLAQAMITGPVAA 272

Query: 277 RYGIKNVMMISIVAWILRFALFAYGDPTPFGTVLLVLSMIVYGCAFDFFNISGSVFVEKE 336
R G + +M+ ++A + L A+ + ++V + + + ++
Sbjct: 273 RLGERRALMLGMIADGTGYILLAF-----ATRGWMAFPIMVLLASGGIGMPALQAMLSRQ 327

Query: 337 VSPAIRASAQGMFLMMTNGFGCILGGIVSGKVVEMYTQNGITDWQ-TVWLIFAGYSVVL 394
V + QG +T+ L IV + IT W W+ A ++
Sbjct: 328 VDEERQGQLQGSLAALTS-----LTSIVGPLLFTAIYAASITTWNGWAWIAGAALYLLC 381


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3115  TCRTETA       45     4e-07    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 44.8 bits (106), Expect = 4e-07
Identities = 63/306 (20%), Positives = 106/306 (34%), Gaps = 27/306 (8%)

Query: 20 FLYFFIMATCFPFLPVWLSDVV--GLSKTDTGIVFSCLSLFAISFQPLLGVISDRLGLKK 77
L + P LP L D+V GI+ + +L + P+LG +SDR G +
Sbjct: 15 ALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRP 74

Query: 78 NLIWSISLLLVFFAPFFLYVFAPLLRFNIWAGALTGGVFIGFVFSAGAGAIEAYIERVSR 137
L+ S L + + AP L ++ G + G+ G + I + R
Sbjct: 75 VLLVS---LAGAAVDYAIMATAPFLWV-LYIGRIVAGI-TGATGAVAGAYIADITDGDER 129

Query: 138 SRGFEYGKARMFGCLGWALCA--AMAGMLFNVDPSLVFWMGSGSALLLLLL-LFLARPST 194
+R F + M C G+ + A + G++ P F+ + L L FL S
Sbjct: 130 ARHFGF----MSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPESH 185

Query: 195 SQTAMVMNTLGANSSLISTRMVFSLFRMRQMWMFVLYTIGVACVYDVFDQQFATFFRSFF 254
+ N + FR + V + V + + Q A + F
Sbjct: 186 KGERRPLRREALNP--------LASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFG 237

Query: 255 -DTPQAGIKAFGFATTAGEICNAII-MFCTPWIIHRIGAKNTLLVAGGIMTIRITGSAFA 312
D G + A I +++ T + R+G + L++ M TG
Sbjct: 238 EDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLG---MIADGTGYILL 294

Query: 313 TTATEV 318
AT
Sbjct: 295 AFATRG 300


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3126  CHANLCOLICIN  31     0.010    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 31.2 bits (70), Expect = 0.010
Identities = 34/155 (21%), Positives = 65/155 (41%), Gaps = 15/155 (9%)

Query: 4 SLAHENARLRALLQTQQDTIRQMAEYNRLLSQRVAAYASEINRLKALVAKLQRMQFGKSS 63
+ A A AL Q +D + + +N + A N A+ A+ +R++ K+
Sbjct: 79 AQAKAKANRDALTQRLKDIVNEALRHNASRTPSATELAHANN--AAMQAEDERLRLAKAE 136

Query: 64 EKLR---AKTERQIQEAQERISALQEEMAETLGEQYDPVLPSAALRQSSACKQLPASLPR 120
EK R E+ QEA++R ++ E AET L+ + A ++ A+L
Sbjct: 137 EKARKEAEAAEKAFQEAEQRRKEIEREKAET----------ERQLKLAEAEEKRLAALSE 186

Query: 121 ETRVIRPEEECCPACGGELSSLGCDVSEQLELISS 155
E + + ++ A E+ + ++ +SS
Sbjct: 187 EAKAVEIAQKKLSAAQSEVVKMDGEIKTLNSRLSS 221


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3137  cdtoxina      27     0.018    Cytolethal distending toxin A signature.
		>cdtoxina#Cytolethal distending toxin A signature.

Length = 258

Score = 27.4 bits (60), Expect = 0.018
Identities = 15/61 (24%), Positives = 24/61 (39%), Gaps = 5/61 (8%)

Query: 74 VELLPVEITPDEQKEPVAAIAPSLSTSTQTRVSASSCKVEFRHGNMTLENPSPELLTVLI 133
VE P +PDE P+ P+L T+ + ++L N +LT+
Sbjct: 40 VEGGPTVPSPDEPGLPLPGPGPALPTNGAIPIPEPGTAPA-----VSLMNMDGSVLTMWS 94

Query: 134 R 134
R
Sbjct: 95 R 95


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3170  adhesinb      334    e-117    Adhesin B signature.
		>adhesinb#Adhesin B signature.

Length = 310

Score = 334 bits (858), Expect = e-117
Identities = 93/298 (31%), Positives = 164/298 (55%), Gaps = 7/298 (2%)

Query: 7 LTMLLGFLALTCSIAFQASATEKFKVITTFTIIADMAKNVAGDVAEVSSITKPGAEIHEY 66
L +G A + + + + K V+ T +IIAD+ KN+AGD + SI G + HEY
Sbjct: 11 LLAFVGLAACSSQKSSTETGSSKLNVVATNSIIADITKNIAGDKINLHSIVPVGQDPHEY 70

Query: 67 QPTPGDIKRAQGAQLILANGMNLEL----WFQRFYQHLNGVPE---VIVSSGVTPVGITE 119
+P P D+K+ A LI NG+NLE WF + ++ VS GV + +
Sbjct: 71 EPLPEDVKKTSQADLIFYNGINLETGGNAWFTKLVENAKKKENKDYYAVSEGVDVIYLEG 130

Query: 120 GPYEGKPNPHAWMSPDNALIYVDNIRDALIKYDPANAKTYQRNADTYKAKITQTLAPLRK 179
+GK +PHAW++ +N +IY NI L + DPAN +TY++N Y K++ ++
Sbjct: 131 QSEKGKEDPHAWLNLENGIIYAQNIAKRLSEKDPANKETYEKNLKAYVEKLSALDKEAKE 190

Query: 180 QIAELPENKRWMVTSEGAFSYLARDLGLKELYLWPINADQQGTPQQVRKVVDIVKKNNIP 239
+ +P K+ +VTSEG F Y ++ + Y+W IN +++GTP Q++ +V+ ++K +P
Sbjct: 191 KFNNIPGEKKMIVTSEGCFKYFSKAYNVPSAYIWEINTEEEGTPDQIKTLVEKLRKTKVP 250

Query: 240 AVFSESTVSDKPARQVARETGAHYGGVLYVDSLSTENGPVPTYIDLLKVTTSTLVQGI 297
++F ES+V D+P + V+++T ++ DS++ + +Y ++K + +G+
Sbjct: 251 SLFVESSVDDRPMKTVSKDTNIPIYAKIFTDSVAEKGEEGDSYYSMMKYNLEKIAEGL 308


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3180  PF04183       823    0.0      IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 823 bits (2128), Expect = 0.0
Identities = 577/580 (99%), Positives = 580/580 (100%)

Query: 1 MNHKDWDLVNRRLVAKMLSELEYEQVFHAESQGDDRYCINLPGAQWRFIAERGIWGWLWI 60
MNHKDWDLVNRRLVAKMLSELEYEQVFHAESQGDDRYCINLPGAQWRFIAERGIWGWLWI
Sbjct: 1 MNHKDWDLVNRRLVAKMLSELEYEQVFHAESQGDDRYCINLPGAQWRFIAERGIWGWLWI 60

Query: 61 DAQTLRCADEPVLAQTLLMQLKQVLSMSDATVAEHMQDLYSTLLGDLQLLKARRGLSASD 120
DAQTLRCADEPVLAQTLLMQLKQVLSMSDATVAEHMQDLY+TLLGDLQLLKARRGLSASD
Sbjct: 61 DAQTLRCADEPVLAQTLLMQLKQVLSMSDATVAEHMQDLYATLLGDLQLLKARRGLSASD 120

Query: 121 LINLSADRLQCLLSGHPKFVFNKGRRGWGKEALERYAPEYANTFRLHWLAVKREHMIWRC 180
LINL+ADRLQCLLSGHPKFVFNKGRRGWGKEALERYAPEYANTFRLHWLAVKREHMIWRC
Sbjct: 121 LINLNADRLQCLLSGHPKFVFNKGRRGWGKEALERYAPEYANTFRLHWLAVKREHMIWRC 180

Query: 181 DNEMDIHQLLTAAMDPQEFARFSQVWQENGLDHNWLPLPVHPWQWQQKIATDFIADFAEG 240
DNEMDIHQLLTAAMDPQEFARFSQVWQENGLDHNWLPLPVHPWQWQQKIATDFIADFAEG
Sbjct: 181 DNEMDIHQLLTAAMDPQEFARFSQVWQENGLDHNWLPLPVHPWQWQQKIATDFIADFAEG 240

Query: 241 RMVSLGEFGDQWLAQQSLRTLTNASRRGGLDIKLPLTIYNTSCYRGIPGRYIAAGPLASR 300
RMVSLGEFGDQWLAQQSLRTLTNASRRGGLDIKLPLTIYNTSCYRGIPGRYIAAGPLASR
Sbjct: 241 RMVSLGEFGDQWLAQQSLRTLTNASRRGGLDIKLPLTIYNTSCYRGIPGRYIAAGPLASR 300

Query: 301 WLQQVFATDATLVQSGAVILGEPAAGYVSHEGYAALARAPYRYQEMLGVIWRENPCRWLK 360
WLQQVFATDATLVQSGAVILGEPAAGYVSHEGYAALARAPYRYQEMLGVIWRENPCRWLK
Sbjct: 301 WLQQVFATDATLVQSGAVILGEPAAGYVSHEGYAALARAPYRYQEMLGVIWRENPCRWLK 360

Query: 361 PDESPVLMATLMECDENDQPLAGAYIDRSGLDAETWLTQLFRVVVVPLYHLLCRYGVALI 420
PDESPVLMATLMECDEN+QPLAGAYIDRSGLDAETWLTQLFRVVVVPLYHLLCRYGVALI
Sbjct: 361 PDESPVLMATLMECDENNQPLAGAYIDRSGLDAETWLTQLFRVVVVPLYHLLCRYGVALI 420

Query: 421 AHGQNITLAMKEGVPQRVLLKDFQGDMRLVKEEFPEMDSLPQEVRDVTSRLSADYLIHDL 480
AHGQNITLAMKEGVPQRVLLKDFQGDMRLVKEEFPEMDSLPQEVRDVTSRLSADYLIHDL
Sbjct: 421 AHGQNITLAMKEGVPQRVLLKDFQGDMRLVKEEFPEMDSLPQEVRDVTSRLSADYLIHDL 480

Query: 481 QTGHFVTVLRFISPLMVRLGVPERRFYQLLAAVLSDYMKKHPQMSERFALFSLFRPQIIR 540
QTGHFVTVLRFISPLMVRLGVPERRFYQLLAAVLSDYMKKHPQMSERFALFSLFRPQIIR
Sbjct: 481 QTGHFVTVLRFISPLMVRLGVPERRFYQLLAAVLSDYMKKHPQMSERFALFSLFRPQIIR 540

Query: 541 VVLNPVKLTWPDLDGGSRMLPNYLEDLQNPLWLVTQEYES 580
VVLNPVKLTWPDLDGGSRMLPNYLEDLQNPLWLVTQEYES
Sbjct: 541 VVLNPVKLTWPDLDGGSRMLPNYLEDLQNPLWLVTQEYES 580


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3182  PF04183       332    e-109    IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 332 bits (853), Expect = e-109
Identities = 110/474 (23%), Positives = 181/474 (38%), Gaps = 45/474 (9%)

Query: 37 ELIIPLDEQKSLHFRVAYFSPTQHHRF-----AFPAHLVTASGSYPVDFTTLSRLIIDKL 91
E + + Q + + P RF + + A D L++ ++ +L
Sbjct: 24 EQVFHAESQGDDRYCIN--LPGAQWRFIAERGIWGWLWIDAQTLRCADEPVLAQTLLMQL 81

Query: 92 RHQLFLPVPLCETFHQRVLESYAHTQQTIDARHDWAILREKALNFGEAEQALLTGHAFHP 151
+ L + Q + + Q + AR + LN + Q LL+GH
Sbjct: 82 KQVLSMSDATVAEHMQDLYATLLGDLQLLKARRGLSASDLINLNA-DRLQCLLSGHPKFV 140

Query: 152 APKSHEPFNRQEAERYLPDMAPHFPLRWFSVDKTQIAGES-LHLNLQQRLTRFAAENAPQ 210
K + ++ ERY P+ A F L W +V + + +++ Q LT A PQ
Sbjct: 141 FNKGRRGWGKEALERYAPEYANTFRLHWLAVKREHMIWRCDNEMDIHQLLT---AAMDPQ 197

Query: 211 LLNELS--------DNQWLF-PLHPWQGEYLLQQVWCQALFAKGLIRDLGEAGTSWLPTT 261
S D+ WL P+HPWQ + + + A FA+G + LGE G WL
Sbjct: 198 EFARFSQVWQENGLDHNWLPLPVHPWQWQQKIATDFI-ADFAEGRMVSLGEFGDQWLAQQ 256

Query: 262 SSRSLYCATSRD--MIKFSLSVRLTNSVRTLSVKEVERGMRLARLAQ----TDGWQMLQA 315
S R+L A+ R IK L++ T+ R + + + G +R Q TD +
Sbjct: 257 SLRTLTNASRRGGLDIKLPLTIYNTSCYRGIPGRYIAAGPLASRWLQQVFATDATLVQSG 316

Query: 316 ----RFPTFRVMQEDGWAGLRDLNGNIMQESLFSLRENLLLEQPQSQTNVLVSLTQAAPD 371
P + +G+A L + REN ++ VL++ +
Sbjct: 317 AVILGEPAAGYVSHEGYAALARAPYRYQEMLGVIWRENPCRWLKPDESPVLMATLMECDE 376

Query: 372 GGDSLLVSAVKRLSDRLGITVQQAAHAWVDAYCQQVLKPLFTAEADYGLVLLAHQQNILV 431
L + + DR G+ A W+ + V+ PL+ YG+ L+AH QNI +
Sbjct: 377 NNQPLAGAYI----DRSGLD----AETWLTQLFRVVVVPLYHLLCRYGVALIAHGQNITL 428

Query: 432 QMLGDLPVGFIYRDCQGSAFMPHATEWLDTIDEAQAENIFTREQLLRYFPYYLL 485
M +P + +D QG M E +D E R+ R YL+
Sbjct: 429 AMKEGVPQRVLLKDFQGD--MRLVKEEFPEMDSLPQE---VRDVTSRLSADYLI 477


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3183  TCRTETA       46     2e-07    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 46.0 bits (109), Expect = 2e-07
Identities = 84/380 (22%), Positives = 135/380 (35%), Gaps = 49/380 (12%)

Query: 20 FSAGLLGIGQNGLLVVLPVLVIQTNLSLSV---WAALLMLGSMLFLPSSPWWGKQISLTG 76
+ L +G ++ VLP L+ S V + LL L +++ +P G G
Sbjct: 12 STVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFG 71

Query: 77 SKTVVLWALGGYGVSFTLLGLGSVLMATGAVTKAVGLGILIIARIVYGLTVSAMVPACQV 136
+ V+L +L G V + ++ L +L I RIV G+T + A
Sbjct: 72 RRPVLLVSLAGAAVDYAIMATAPFLW------------VLYIGRIVAGITGATGAVAGAY 119

Query: 137 WALQRAGEGNRMAALATISSGLSCGRLFGPLCAAAMLVIHPLAPVWM--LMAAPALALVM 194
A R +S+ G + GP+ M P AP + +
Sbjct: 120 IA-DITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGC 178

Query: 195 LLRLPGTPPQPTPERK-------SVSLKRDFLPYLLCAMLLAAAMSMMQLGLSPAL---- 243
L + P R+ S R A L+A M +G PA
Sbjct: 179 FLLPESHKGERRPLRREALNPLASFRWARGMTVV---AALMAVFFIMQLVGQVPAALWVI 235

Query: 244 --TRQFATDTTTISQQVAWLLGLSAIA-ALIAQFVVLRPQRLTPVALLLSAGVLMSSGLA 300
+F D TTI +A L ++A A+I V R AL+L + +
Sbjct: 236 FGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERR--ALMLGMIADGTGYIL 293

Query: 301 IMLAEQLWLFYLGCAVLSFGAALATPAYQLLLNDKLADGAGAGWLACSHTLGYGLCALLV 360
+ A + W+ + +L+ G + PA Q +L+ + D G L L
Sbjct: 294 LAFATRGWMAFPIMVLLASG-GIGMPALQAMLS-RQVDEERQGQLQ----------GSLA 341

Query: 361 PLVSKTGVAIALIVMALFAA 380
L S T + L+ A++AA
Sbjct: 342 ALTSLTSIVGPLLFTAIYAA 361


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3187  OMADHESIN     90     4e-22    Yersinia outer membrane adhesin signature.
		>OMADHESIN#Yersinia outer membrane adhesin signature.

Length = 455

Score = 90.3 bits (223), Expect = 4e-22
Identities = 89/331 (26%), Positives = 141/331 (42%), Gaps = 30/331 (9%)

Query: 67 AVGSGAAILDADKSMAVGNNTAVFNADNSVALGYGSQVNGESNVLSVGAGPSGY-----G 121
V GA +D +AVG N+ +A NSVA+G+ S V S+ G
Sbjct: 127 GVAIGARASTSDTGVAVGFNSKA-DAKNSVAIGHSSHVAANHG-YSIAIGDRSKTDRENS 184

Query: 122 FSVDGAPETRRIINVSDGVKDSDAATKGQMDNAIADAVRESGDALRGEIGAVYRDAVADA 181
S+ R++ +++ G KD+DA Q+ I + + A +
Sbjct: 185 VSIGHESLNRQLTHLAAGTKDTDAVNVAQLKKEIEKTQENTNKRSAELLANANAYADNKS 244

Query: 182 KSRVESAENRLNGNITAARASAQEYTDAVKSDVLDETRTYTDSSVRTVRNEVKSQAEHLS 241
S + A N + +A++ A DVL+ + +++S RT + A ++
Sbjct: 245 SSVLGIANNYTDSKSAETLENARKEAFAQSKDVLNMAKAHSNSVARTTLETAEEHANSVA 304

Query: 242 DVLVK------NRAQTDAAIASNTAAIRNNSHRLDLTEAWQKMAT--------------- 280
++ N+ +A ++N A +SH L ++ +
Sbjct: 305 RTTLETAEEHANKKSAEALASANVYADSKSSHTLKTANSYTDVTVSNSTKKAIRESNQYT 364

Query: 281 -ERMNNMQEQIKENRKELRESAAQSAALAGLFQPYSVGKFNATAAVGGYRDEQAIAVGVG 339
+ + ++ + + + A SAAL LFQPY VGK N TA VGGYR QA+A+G G
Sbjct: 365 DHKFRQLDNRLDKLDTRVDKGLASSAALNSLFQPYGVGKVNFTAGVGGYRSSQALAIGSG 424

Query: 340 YRFTENVAGKVAVA-AGGSSASWNAGVNFEF 369
YR ENVA K VA AG S +NA N E+
Sbjct: 425 YRVNENVALKAGVAYAGSSDVMYNASFNIEW 455


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3235  ABC2TRNSPORT  33     6e-04    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 33.4 bits (76), Expect = 6e-04
Identities = 29/125 (23%), Positives = 54/125 (43%), Gaps = 10/125 (8%)

Query: 137 ITNFLQLVLTWSLLIILS--CGVGLIF----MVVGKTFPEMQKVL---PILLKPLYFISC 187
+ L SLL L GL F MVV P + +++ P+ F+S
Sbjct: 135 VAAALGYTQWLSLLYALPVIALTGLAFASLGMVVTALAPSYDYFIFYQTLVITPILFLSG 194

Query: 188 IMFPLHSIPKQYWSYLLWNPLVHVVELSREAVMPGYISE-GVSLNYLAMFTLVTLFIGLA 246
+FP+ +P + + + PL H ++L R ++ + + + L ++ ++ F+ A
Sbjct: 195 AVFPVDQLPIVFQTAARFLPLSHSIDLIRPIMLGHPVVDVCQHVGALCIYIVIPFFLSTA 254

Query: 247 LYRTR 251
L R R
Sbjct: 255 LLRRR 259


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3241  BCTERIALGSPG  29     0.010    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 28.7 bits (64), Expect = 0.010
Identities = 15/46 (32%), Positives = 22/46 (47%), Gaps = 2/46 (4%)

Query: 1 MRRARAGFTLLEMLVAIAIFASLA-LMAQQVTNGVTRVNSAVAGHD 45
+ R GFTLLE++V I I LA L+ + + + A D
Sbjct: 4 TDKQR-GFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSD 48


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3242  PilS_PF08805  34     9e-05    PilS N terminal
		>PilS_PF08805#PilS N terminal

Length = 185

Score = 33.8 bits (77), Expect = 9e-05
Identities = 15/52 (28%), Positives = 26/52 (50%)

Query: 3 RGFTLLEVILALAIFALAATAVLQIASGALSNQQILEEKTVAGWVAENQTAL 54
+G TL+EV+L + + + A + ++ S SN Q E+ V N +L
Sbjct: 26 KGATLMEVLLVVGVIVVLAASAYKLYSMVQSNIQSSNEQNNVLTVIANMKSL 77


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3243  BCTERIALGSPH  78     2e-20    Bacterial general secretion pathway protein H signa...
		>BCTERIALGSPH#Bacterial general secretion pathway protein H signature.

Length = 170

Score = 77.7 bits (191), Expect = 2e-20
Identities = 38/166 (22%), Positives = 65/166 (39%), Gaps = 28/166 (16%)

Query: 1 MPERGFTLLEIMLVIFLIGLASAGVVQTFATASEPPAKKAAQDFLTRFAQFKDRAVIEGQ 60
M +RGFTLLE+ML++ L+G+++ V+ F + + A + F + + R + GQ
Sbjct: 1 MRQRGFTLLEMMLILLLMGVSAGMVLLAFPASRDDSAAQTLARFEAQLRFVQQRGLQTGQ 60

Query: 61 TLGVLIDPPGYQFMQRRHGQWLPVSATRLSAQVTVPKQVQMLLQPGSDIWQKEYALELQR 120
GV + P +QF+ + P D W L L+
Sbjct: 61 FFGVSVHPDRWQFLVLEARDGADPA-------------------PADDGWSGYRWLPLRA 101

Query: 121 RRL----TLHDIELEL-----QKEAKKKTPQIRFSPFEPVTPFTLR 157
R+ ++ +L L + P + P +TPF L
Sbjct: 102 GRVATSGSIAGGKLNLAFAQGEAWTPGDNPDVLIFPGGEMTPFRLT 147


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3244  BCTERIALGSPG  218    2e-76    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 218 bits (556), Expect = 2e-76
Identities = 91/146 (62%), Positives = 109/146 (74%), Gaps = 3/146 (2%)

Query: 6 RTQKPRAGFTLLEVMVVIVILGVLASLVVPNLLGNKEKADRQKAISDIVALENALDMYRL 65
R + GFTLLE+MVVIVI+GVLASLVVPNL+GNKEKAD+QKA+SDIVALENALDMY+L
Sbjct: 2 RATDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYKL 61

Query: 66 DNGRYPTTEQGLEALIQQPANMADARNYRTGGYIKRLPKDPWGNDYQYLSPGEKGLFDVY 125
DN YPTT QGLE+L++ P A NY GYIKRLP DPWGNDY ++PGE G +D+
Sbjct: 62 DNHHYPTTNQGLESLVEAPTLPPLAANYNKEGYIKRLPADPWGNDYVLVNPGEHGAYDLL 121

Query: 126 TLGADGQENGEGAGADIGNWNLQEFQ 151
+ G DG+ E DI NW L + +
Sbjct: 122 SAGPDGEMGTED---DITNWGLSKKK 144


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3245  BCTERIALGSPF  454    e-161    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 454 bits (1170), Expect = e-161
Identities = 225/406 (55%), Positives = 300/406 (73%), Gaps = 1/406 (0%)

Query: 1 MALFYYQALERNGRKTKGMIEADSARHARQLLRGKDLIPVHI-EARMNASAGGMLQRRRH 59
MA ++YQAL+ G+K +G EADSAR ARQLLR + L+P+ + E R + G
Sbjct: 1 MAQYHYQALDAQGKKCRGTQEADSARQARQLLRERGLVPLSVDENRGDQQKSGSTGLSLR 60

Query: 60 AHRRVAAADLALFTRQLATLVQAAMPLETCLQAVSEQSEKLHVKSLGMALRSRIQEGYTL 119
R++ +DLAL TRQLATLV A+MPLE L AV++QSEK H+ L A+RS++ EG++L
Sbjct: 61 RKIRLSTSDLALLTRQLATLVAASMPLEEALDAVAKQSEKPHLSQLMAAVRSKVMEGHSL 120

Query: 120 SDSLREHPRVFDSLFCSMVAAGEKSGHLDVVLNRLADYTEQRQRLKSRLLQAMLYPLVLL 179
+D+++ P F+ L+C+MVAAGE SGHLD VLNRLADYTEQRQ+++SR+ QAM+YP VL
Sbjct: 121 ADAMKCFPGSFERLYCAMVAAGETSGHLDAVLNRLADYTEQRQQMRSRIQQAMIYPCVLT 180

Query: 180 VVATGVVTILLTAVVPKIIEQFDHLGHALPASTRMLIAMSDALQASGVYWLAGLLGLLVL 239
VVA VV+ILL+ VVPK++EQF H+ ALP STR+L+ MSDA++ G + L LL +
Sbjct: 181 VVAIAVVSILLSVVVPKVVEQFIHMKQALPLSTRVLMGMSDAVRTFGPWMLLALLAGFMA 240

Query: 240 GQRLLKNPAMRLRWDKTLLRLPVTGRVARGLNTARFSRTLSILTASSVPLLEGIQTAAAV 299
+ +L+ R+ + + LL LP+ GR+ARGLNTAR++RTLSIL AS+VPLL+ ++ + V
Sbjct: 241 FRVMLRQEKRRVSFHRRLLHLPLIGRIARGLNTARYARTLSILNASAVPLLQAMRISGDV 300

Query: 300 SANRYVEQQLLLAADRVREGSSLRAALADLRLFPPMMLYMIASGEQSGELETMLEQAAIN 359
+N Y +L LA D VREG SL AL LFPPMM +MIASGE+SGEL++MLE+AA N
Sbjct: 301 MSNDYARHRLSLATDAVREGVSLHKALEQTALFPPMMRHMIASGERSGELDSMLERAADN 360

Query: 360 QEREFDTQVGLALGLFEPALVVVMAGVVLFIVIAILEPMLQLNNMV 405
Q+REF +Q+ LALGLFEP LVV MA VVLFIV+AIL+P+LQLN ++
Sbjct: 361 QDREFSSQMTLALGLFEPLLVVSMAAVVLFIVLAILQPILQLNTLM 406


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3247  BCTERIALGSPD  576    0.0      Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D signature.

Length = 660

Score = 576 bits (1485), Expect = 0.0
Identities = 295/668 (44%), Positives = 431/668 (64%), Gaps = 34/668 (5%)

Query: 24 LLPLVLAAALCSSPVWAEEATFTANFKDTDLKSFIETVGANLNKTIIMGPGVQGKVSIRT 83
L L++ AAL P AEE F+A+FK TD++ FI TV NLNKT+I+ P V+G +++R+
Sbjct: 11 SLTLLIFAALLFRPAAAEE--FSASFKGTDIQEFINTVSKNLNKTVIIDPSVRGTITVRS 68

Query: 84 MTPLNERQYYQLFLNLLEAQGYAVVPMENDVLKVVKSSAAKVEPLPLVGEGSDNYAGDEM 143
LNE QYYQ FL++L+ G+AV+ M N VLKVV+S AK +P+ + + GDE+
Sbjct: 69 YDMLNEEQYYQFFLSVLDVYGFAVINMNNGVLKVVRSKDAKTAAVPVASDAAPG-IGDEV 127

Query: 144 VTKVVPVRNVSVRELAPILRQMIDSAGSGNVVNYDPSNVIMLTGRASVVERLTEVIQRVD 203
VT+VVP+ NV+ R+LAP+LRQ+ D+AG G+VV+Y+PSNV+++TGRA+V++RL +++RVD
Sbjct: 128 VTRVVPLTNVAARDLAPLLRQLNDNAGVGSVVHYEPSNVLLMTGRAAVIKRLLTIVERVD 187

Query: 204 HAGNRTEEVIPLDNASASEIARVLESLTKNSGENQ-PATLKSQIVADERTNSVIVSGDPA 262
+AG+R+ +PL ASA+++ +++ L K++ ++ P ++ + +VADERTN+V+VSG+P
Sbjct: 188 NAGDRSVVTVPLSWASAADVVKLVTELNKDTSKSALPGSMVANVVADERTNAVLVSGEPN 247

Query: 263 TRDKMRRLIRRLDSEMERRGNSQVFYLKYSKAEDLVDVLKQVSGTLTAAKEEAEGTVGSG 322
+R ++ +I++LD + +GN++V YLKY+KA DLV+VL +S T+ + K+ A+ +
Sbjct: 248 SRQRIIAMIKQLDRQQATQGNTKVIYLKYAKASDLVEVLTGISSTMQSEKQAAKPV-AAL 306

Query: 323 REVVSIAASKHSNALIVTAPQDIMQSLQSVIEQLDIRRAQVHVEALIVEVAEGSNINFGV 382
+ + I A +NALIVTA D+M L+ VI QLDIRR QV VEA+I EV + +N G+
Sbjct: 307 DKNIIIKAHGQTNALIVTAAPDVMNDLERVIAQLDIRRPQVLVEAIIAEVQDADGLNLGI 366

Query: 383 QWGSKDAGLMQFANGTQIPIGTLGAAISAAKPQKGSTVISENGATTINPDTNGDLST-LA 441
QW +K+AG+ QF N + +PI T A + +G +S+ LA
Sbjct: 367 QWANKNAGMTQFTN-SGLPISTAIAG-------------------ANQYNKDGTVSSSLA 406

Query: 442 QLLSGFSGTAVGVVKGDWMALVQAVKNDSSSNVLSTPSITTLDNQEAFFMVGQDVPVLTG 501
LS F+G A G +G+W L+ A+ + + +++L+TPSI TLDN EA F VGQ+VPVLTG
Sbjct: 407 SALSSFNGIAAGFYQGNWAMLLTALSSSTKNDILATPSIVTLDNMEATFNVGQEVPVLTG 466

Query: 502 STVGSNNSNPFNTVERKKVGIMLKVTPQINEGNAVQMVIEQEVSKVEGQTS-----LDVV 556
S S N FNTVERK VGI LKV PQINEG++V + IEQEVS V S L
Sbjct: 467 SQTTSG-DNIFNTVERKTVGIKLKVKPQINEGDSVLLEIEQEVSSVADAASSTSSDLGAT 525

Query: 557 FGERKLKTTVLANDGELIVLGGLMDDQAGESVAKVPLLGDIPLIGNLFKSTADKKEKRNL 616
F R + VL GE +V+GGL+D ++ KVPLLGDIP+IG LF+ST+ K KRNL
Sbjct: 526 FNTRTVNNAVLVGSGETVVVGGLLDKSVSDTADKVPLLGDIPVIGALFRSTSKKVSKRNL 585

Query: 617 MVFIRPTILRDGMAADGVSQRKYNYMRAEQIYR--DEQGLSLMPHTAQPILPAQNQALPP 674
M+FIRPT++RD S +Y Q + E +++ I P Q+ A
Sbjct: 586 MLFIRPTVIRDRDEYRQASSGQYTAFNDAQSKQRGKENNDAMLNQDLLEIYPRQDTAAFR 645

Query: 675 EVRAFLNA 682
+V A ++A
Sbjct: 646 QVSAAIDA 653


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3248  BCTERIALGSPC  117    3e-33    Bacterial general secretion pathway protein C signa...
		>BCTERIALGSPC#Bacterial general secretion pathway protein C signature.

Length = 272

Score = 117 bits (294), Expect = 3e-33
Identities = 67/286 (23%), Positives = 114/286 (39%), Gaps = 38/286 (13%)

Query: 40 IARGMFWLMLLIISAKVAHSLWRYFSFSAEYMAVSPSANKPLRADAKAFDKNDVQLISQQ 99
I R +F+L++L+ ++A WR S P +A + ND L
Sbjct: 14 IRRILFYLLMLLFCQQLAMIFWR-IGLPDNAPVSSVQIT-PAQARQQPVTLNDFTL---- 67

Query: 100 NWFGKYQPV--ATPVKQPEPAPVAETRLNVVLRGIAFG---ARPGAVIEEGGKQQVYLQG 154
FG A + + + + + LN+ L G+ G +R A+I + +Q
Sbjct: 68 --FGVSPEKNKAGALDASQMSNLPPSTLNLSLTGVMAGDDDSRSIAIISKDNEQFSRGVN 125

Query: 155 ETLGSHNAVIEEINRDHVMLRYQGKMERLSLAEEKRPTIAVTSKKAVSDEAKQAVAEPAA 214
E + +NA I I D V+L+YQG+ E L L + +
Sbjct: 126 EEVPGYNAKIVSIRPDRVVLQYQGRYEVLGLYSQ-----------------------EDS 162

Query: 215 SAPVEIPAAVRQAL-AKDPQKIFNYIQLTPVRKEG-IVGYAVKPGADRSLFDASGFKEGD 272
+ A V + L + + +Y+ +P+ + + GY + PG F G ++ D
Sbjct: 163 GSDGVPGAQVNEQLQQRASTTMSDYVSFSPIMNDNKLQGYRLNPGPKSDSFYRVGLQDND 222

Query: 273 IAIALNQQDFTDPRAMIALMRQLPSMDSIQLTVLRKGARYDISIAL 318
+A+ALN D D M ++ + + LTV R G R DI +
Sbjct: 223 MAVALNGLDLRDAEQAKKAMERMADVHNFTLTVERDGQRQDIYMEF 268


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3250  PREPILNPTASE  279    1e-96    Type IV prepilin cysteine protease (C20) family sig...
		>PREPILNPTASE#Type IV prepilin cysteine protease (C20) family signature.

Length = 290

Score = 279 bits (715), Expect = 1e-96
Identities = 109/271 (40%), Positives = 149/271 (54%), Gaps = 12/271 (4%)

Query: 1 MLFDVFQQYPTAMPVLATVGGLIIGSFLNVVIWRYPIML-RQQMAEFHGEMSSAQSKI-- 57
+L ++ P L + L+IGSFLNVVI R PIML R+ AE+ + +
Sbjct: 3 LLLELAHGLPWLYFSLVFLFSLMIGSFLNVVIHRLPIMLEREWQAEYRSYFNPDDEGVDE 62

Query: 58 ---SLALPRSHCPHCQQTIRIRDNIPLFSWLMLKGRCRDCQAKISKRYPLVELLTALAFL 114
+L +PRS CPHC I +NIPL SWL L+GRCR CQA IS RYPLVELLTAL +
Sbjct: 63 PPYNLMVPRSCCPHCNHPITALENIPLLSWLWLRGRCRGCQAPISARYPLVELLTALLSV 122

Query: 115 LASLVWPESGWALAVMILSAWLIAASVIDLDHQWLPDVFTQGVLWTGLSAAWAQQSPLTL 174
++ LA ++L+ L+A + IDLD LPD T +LW GL ++L
Sbjct: 123 AVAMTLAPGWGTLAALLLTWVLVALTFIDLDKMLLPDQLTLPLLWGGLLFNL-LGGFVSL 181

Query: 175 QDAVTGVLVGFIAFYSLRWIAGIVLRKEALGMGDVLLFAALGSWVGPLSLPNVALIASCC 234
DAV G + G++ +SL W ++ KE +G GD L AALG+W+G +LP V L++S
Sbjct: 182 GDAVIGAMAGYLVLWSLYWAFKLLTGKEGMGYGDFKLLAALGAWLGWQALPIVLLLSSLV 241

Query: 235 GLIYAVI-----TKRGSTTLPFGPCLSLGGI 260
G + S +PFGP L++ G
Sbjct: 242 GAFMGIGLILLRNHHQSKPIPFGPYLAIAGW 272


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3251  PF03544       49     6e-08    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 48.8 bits (116), Expect = 6e-08
Identities = 24/60 (40%), Positives = 30/60 (50%), Gaps = 3/60 (5%)

Query: 32 SSDTPPVDSGTGSLPEVKPDPTPNPEPTPEPTPDPEPTPEP---IPDPEPTPEPEPEPVP 88
S T + V+P P P EP PEP P PEP E I P+P P+P+P+PV
Sbjct: 50 ISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVK 109



Score = 41.9 bits (98), Expect = 1e-05
Identities = 16/92 (17%), Positives = 27/92 (29%), Gaps = 2/92 (2%)

Query: 33 SDTPPVDSGTGSLPEVKPDPTPNPEPTPEPTPDPEPTPEPIPDPEPTPEPEPEPVPTKTG 92
+D P + PE +P P PEP PEP + E + V
Sbjct: 58 ADLEPPQAVQ-PPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVEQPKR 116

Query: 93 YLTLGGSQRVTGATCNGESSDGFTFKPGEDVT 124
+ S+ + N + + +
Sbjct: 117 DVKPVESRPASPFE-NTAPARPTSSTATAATS 147



Score = 40.7 bits (95), Expect = 2e-05
Identities = 18/59 (30%), Positives = 23/59 (38%), Gaps = 2/59 (3%)

Query: 35 TPPVDSGTGSLPEVKPDPTPNPEPTPEPTPDPEPTPEPIPDP--EPTPEPEPEPVPTKT 91
P + +P +P PEP +PEP PEPIP+P E E K
Sbjct: 45 APAQPISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKP 103



Score = 40.3 bits (94), Expect = 3e-05
Identities = 20/96 (20%), Positives = 28/96 (29%), Gaps = 7/96 (7%)

Query: 29 SGSSSDTPPVDSGTGSLPEVKPDPTPNPEPT---PEPTPDPEPTPEPIPDPEPTPEPEPE 85
+ P V PE +P P P E +P P P+P P+P+ E
Sbjct: 65 AVQPPPEPVV----EPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVEQPKRDVKP 120

Query: 86 PVPTKTGYLTLGGSQRVTGATCNGESSDGFTFKPGE 121
R T +T +S T
Sbjct: 121 VESRPASPFENTAPARPTSSTATAATSKPVTSVASG 156



Score = 35.0 bits (80), Expect = 0.002
Identities = 17/40 (42%), Positives = 17/40 (42%)

Query: 50 PDPTPNPEPTPEPTPDPEPTPEPIPDPEPTPEPEPEPVPT 89
P P T D EP P PEP EPEPEP P
Sbjct: 44 PAPAQPISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPI 83



Score = 30.3 bits (68), Expect = 0.043
Identities = 11/40 (27%), Positives = 13/40 (32%)

Query: 52 PTPNPEPTPEPTPDPEPTPEPIPDPEPTPEPEPEPVPTKT 91
P P + + P P P P EPEP P
Sbjct: 44 PAPAQPISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPI 83


53  EcSMS35_3385  EcSMS35_3394  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias     Product
EcSMS35_3385  0       19       -4.429438   hexuronate transporter
EcSMS35_3386  -1      20       -5.570240   pilus biogenesis initiator
EcSMS35_3387  -1      21       -6.218266   hypothetical protein
EcSMS35_3388  1       18       -6.309199   CS1 type fimbrial major subunit
EcSMS35_3389  1       18       -5.276141   putative CFA/I fimbrial subunit A
EcSMS35_3390  1       17       -3.014665   hypothetical protein
EcSMS35_3391  0       14       -0.661657   DNA-binding transcriptional repressor ExuR
EcSMS35_3392  2       16       -0.164799   hypothetical protein
EcSMS35_3393  2       17       1.165138    putative inner membrane protein
EcSMS35_3394  2       19       1.020699    hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3385  TCRTETA       41     6e-06    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 41.0 bits (96), Expect = 6e-06
Identities = 59/329 (17%), Positives = 107/329 (32%), Gaps = 37/329 (11%)

Query: 34 PTLMEELNISTQQ---YSYIIAAYSAAYTVMQPVAGYVLDVLGTK----IGYAMFAVLWA 86
P L+ +L S Y ++A Y+ PV G + D G + + A AV +A
Sbjct: 29 PGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYA 88

Query: 87 VFCGATALAGSWGGLAVA--RGAVGAAEAAMIPAGLKASSEWFPAKERSIAVGYFNVGSS 144
+ A L + G VA GA GA A I ++ ER+ G+ +
Sbjct: 89 IMATAPFLWVLYIGRIVAGITGATGAVAGAYI-------ADITDGDERARHFGFMSACFG 141

Query: 145 IGAMIAPPLVVWAIVMHSWQMAFIISGALSFIWAMAWLIFYKHPRDQKHLTDEERDYIIN 204
G M+A P++ + S F + AL+ + + K R +N
Sbjct: 142 FG-MVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPES--HKGERRPLRREALN 198

Query: 205 GQEAQHQVDTAKKMSVGQILRNRQFWGIALPRFLAEPAWGTFNAWIPLFMFKVYGFNLKE 264
+ ++ + F+ + L + W +F + ++
Sbjct: 199 PLASFRWARGMTVVAALMAV----FFIMQLVGQVPAALWV-------IFGEDRFHWDATT 247

Query: 265 IAMFAWMPMLFADLGCILGGYLPPLFQRWFGVNLIVSRKMVV-TLGAVLMIGPGMIGLFT 323
I + F L + + G + M+ G +L+ +
Sbjct: 248 IGI---SLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGWMA- 303

Query: 324 NPYVAIMLLCIGGFAHQALSGALITLSSD 352
+ ++LL GG AL L +
Sbjct: 304 --FPIMVLLASGGIGMPALQAMLSRQVDE 330


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3387  PF00577       78     9e-17    Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 78.0 bits (192), Expect = 9e-17
Identities = 73/433 (16%), Positives = 139/433 (32%), Gaps = 30/433 (6%)

Query: 285 NSRVDAYRNEQLLGSFYLNSGSQFIDTSSFPPGSYSVALKVYENNQLTRTELVPFTKTGG 344
++V +N + + + G I+ S + + + E + T+ VP++
Sbjct: 308 TAQVTIKQNGYDIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPL 367

Query: 345 LT-DGNAQWFLQAGKTTSQVS-DDESSAYQLGVRLPLHPQYELYAGLANADDVSAFELGN 402
L +G+ ++ + AG+ S + ++ +Q + L + +Y G AD AF G
Sbjct: 368 LQREGHTRYSITAGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGI 427

Query: 403 NWTADLGGAGNLAISASVFRNDDGGKGDMQQANWSH-PGWPTLGF------YRTNSDGDA 455
GA ++ ++ + D + D Q + + G YR ++ G
Sbjct: 428 GKNMGALGALSVDMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYF 487

Query: 456 CTTDNRESYNALSCYES--ISATVSQNFVGWNMMLGYTRTQNNTDDSLRWDKQQSFENNY 513
D S E+ V F + + R + + + + + +
Sbjct: 488 NFADTTYSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRTSTLYLSG 547

Query: 514 LRQT--SAQSISETVQLSASRAFVMRDWILSTSLGVFHRNDNGGGSDDNGLYLSFS--LS 569
QT ++ E Q + AF ++ +L + D L L+ + S
Sbjct: 548 SHQTYWGTSNVDEQFQAGLNTAFED----INWTLSYSLTKNAWQKGRDQMLALNVNIPFS 603

Query: 570 DTPTMDSNNNSHSTNVSTDYRYSDQDGDQTSWQLSHTFYNDSFSHKEL--GVTVGGLNTD 627
DS + + S + + T D+ + G GG
Sbjct: 604 HWLRSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNS 663

Query: 628 TINSAVNGRWDGQYGNVYATVSDSYDRQNHDHLSAFTGTYSSTLAVSRYGINLGASGSDD 687
+ G YGN S S D + S + G+ LG +D
Sbjct: 664 GSTGYATLNYRGGYGNANIGYSHSDDIKQ------LYYGVSGGVLAHANGVTLGQPLND- 716

Query: 688 LLGAVLVDVKGFS 700
VLV G
Sbjct: 717 --TVVLVKAPGAK 727



Score = 32.1 bits (73), Expect = 0.012
Identities = 40/222 (18%), Positives = 68/222 (30%), Gaps = 35/222 (15%)

Query: 299 SFYLNSGSQFIDTSSF------PPGSYSVALKVYENNQLTRTELVPFTKTGGLTDGNAQW 352
F + D S F PPG+Y V + + NN T V F
Sbjct: 52 RFLADDPQAVADLSRFENGQELPPGTYRVDIYL--NNGYMATRDVTFNTGDSEQG----- 104

Query: 353 FLQAGKTTSQVSDDESSAYQLGVRLPLHPQYELYAGLANADDVSAFELGNNWTADLG-GA 411
+ T +Q++ +G+ L A A S D+G
Sbjct: 105 -IVPCLTRAQLA-------SMGLNTASVSGMNLLADDACVPLTSMIH-DATAQLDVGQQR 155

Query: 412 GNLAISASVFRNDDGGKGDMQQANWSHPGWPTLGFYRTNSDGDACTTDNRESYNALSCYE 471
NL I + N +G + W L Y + + NR N+ Y
Sbjct: 156 LNLTIPQAFMSNRA--RGYIPPELWDPGINAGLLNYNFS----GNSVQNRIGGNSHYAYL 209

Query: 472 SISATVSQNFVGW----NMMLGYTRTQNNTDDSLRWDKQQSF 509
++ + + N W N Y + +++ +W ++
Sbjct: 210 NLQSGL--NIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTW 249


54  EcSMS35_3410  EcSMS35_3418  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias     Product
EcSMS35_3410  0       14       -3.190090   formate acetyltransferase
EcSMS35_3411  0       17       -5.415175   propionate/acetate kinase
EcSMS35_3412  0       19       -7.370615   threonine/serine transporter TdcC
EcSMS35_3413  2       34       -12.840919  threonine dehydratase
EcSMS35_3414  1       31       -10.822526  DNA-binding transcriptional activator TdcA
EcSMS35_3415  1       27       -8.748988   DNA-binding transcriptional activator TdcR
EcSMS35_3416  0       22       -6.265469   hypothetical protein
EcSMS35_3417  0       16       -4.923897   hypothetical protein
EcSMS35_3418  0       14       -4.157497   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3411  ACETATEKNASE  536    0.0      Acetate kinase family signature.
		>ACETATEKNASE#Acetate kinase family signature.

Length = 400

Score = 536 bits (1383), Expect = 0.0
Identities = 173/397 (43%), Positives = 254/397 (63%), Gaps = 11/397 (2%)

Query: 7 VLVINCGSSSIKFSVLDASDCEVLMSGIADGINSENAFLSVN-GGEPAP--LAHHSYEGA 63
+LVINCGSSS+K+ ++++ D VL G+A+ I ++ L+ N GE ++ A
Sbjct: 3 ILVINCGSSSLKYQLIESKDGNVLAKGLAERIGINDSLLTHNANGEKIKIKKDMKDHKDA 62

Query: 64 LKAIAFELEKRNLN-----DSVALIGHRIAHGGSIFTESAIITDEVIDNIRRVSPLAPLH 118
+K + L + + +GHR+ HGG FT S +ITD+V+ I LAPLH
Sbjct: 63 IKLVLDALVNSDYGVIKDMSEIDAVGHRVVHGGEYFTSSVLITDDVLKAITDCIELAPLH 122

Query: 119 NYANLSGIESAQQLFPGVTQVAVFDTSFHQTMAPEAYLYGLPWKYYEELGVRRYGFHGTS 178
N AN+ GI++ Q+ P V VAVFDT+FHQTM AYLY +P++YY + +R+YGFHGTS
Sbjct: 123 NPANIEGIKACTQIMPDVPMVAVFDTAFHQTMPDYAYLYPIPYEYYTKYKIRKYGFHGTS 182

Query: 179 HRYVSQRAHSLLNLAEDDSGLVVAHLGNGASICAVRNGQSVDTSMGMTPLEGLMMGTRSG 238
H+YVSQRA +LN + ++ HLGNG+SI AV+NG+S+DTSMG TPLEGL MGTRSG
Sbjct: 183 HKYVSQRAAEILNKPIESLKIITCHLGNGSSIAAVKNGKSIDTSMGFTPLEGLAMGTRSG 242

Query: 239 DVDFGAMSWVASQTNQSLGDLERVVNKESGLLGISGLSSDLR-VLEKAWHEGHERAQLAI 297
+D +S++ + N S ++ ++NK+SG+ GISG+SSD R + + A+ G +RAQLA+
Sbjct: 243 SIDPSIISYLMEKENISAEEVVNILNKKSGVYGISGISSDFRDLEDAAFKNGDKRAQLAL 302

Query: 298 KTFVHRIARHIAGHAASLHRLDGIIFTGGIGENSSLIRRLVMEHLAVLGVEIDTEMNNRS 357
F +R+ + I +AA++ +D I+FT GIGEN IR +++ L LG ++D E N
Sbjct: 303 NVFAYRVKKTIGSYAAAMGGVDVIVFTAGIGENGPEIREFILDGLEFLGFKLDKEKNKVR 362

Query: 358 NSCGERIVSSENARVICAVIPTNEEKMIALDAIHLGK 394
E I+S+ +++V V+PTNEE MIA D + +
Sbjct: 363 GE--EAIISTADSKVNVMVVPTNEEYMIAKDTEKIVE 397


55  EcSMS35_3494  EcSMS35_3524  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias     Product
EcSMS35_3494  2       15       -0.415714   3-deoxy-D-manno-octulosonate 8-phosphate
EcSMS35_3495  2       17       -0.530972   hypothetical protein
EcSMS35_3496  3       17       0.361037    lipopolysaccharide transport periplasmic protein
EcSMS35_3497  4       17       0.502888    putative ABC transporter ATP-binding protein
EcSMS35_3498  3       16       -0.041721   RNA polymerase factor sigma-54
EcSMS35_3499  0       18       -0.417926   putative sigma(54) modulation protein
EcSMS35_3500  -1      15       0.519439    PTS system transporter subunit IIA-like
EcSMS35_3501  0       14       -0.110776   hypothetical protein
EcSMS35_3502  -2      15       0.491580    phosphohistidinoprotein-hexose
EcSMS35_3504  -2      15       2.806747    hypothetical protein
EcSMS35_3503  -1      17       3.270281    monofunctional biosynthetic peptidoglycan
EcSMS35_3505  -2      17       3.927625    isoprenoid biosynthesis protein with
EcSMS35_3506  -1      17       4.272750    aerobic respiration control sensor protein ArcB
EcSMS35_3507  -1      19       5.366946    radical SAM family protein
EcSMS35_3508  0       20       5.918238    glutamate synthase subunit alpha
EcSMS35_3509  1       18       5.970551    glutamate synthase subunit beta
EcSMS35_3510  1       18       5.968919    SWIM zinc finger domain-containing protein
EcSMS35_3511  1       18       4.884287    hypothetical protein
EcSMS35_3512  0       15       4.833577    AAA family ATPase
EcSMS35_3513  1       15       5.228423    hypothetical protein
EcSMS35_3514  -1      11       3.626431    VWA domain-containing protein
EcSMS35_3515  -1      11       2.861946    hypothetical protein
EcSMS35_3516  -2      13       3.002986    hypothetical protein
EcSMS35_3517  -1      12       2.838188    N-acetylmannosamine kinase
EcSMS35_3518  -1      14       2.390145    N-acetylmannosamine-6-phosphate 2-epimerase
EcSMS35_3519  -1      14       1.869750    putative sialic acid transporter
EcSMS35_3520  -1      12       0.684646    N-acetylneuraminate lyase
EcSMS35_3521  0       15       0.851607    transcriptional regulator NanR
EcSMS35_3522  0       16       0.312372    putative cryptic C4-dicarboxylate transporter
EcSMS35_3523  1       20       0.569924    ClpXP protease specificity-enhancing factor
EcSMS35_3524  2       22       0.363590    stringent starvation protein A
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3495  MYCMG045      29     0.017    Hypothetical mycoplasma lipoprotein (MG045) signature.
		>MYCMG045#Hypothetical mycoplasma lipoprotein (MG045) signature.

Length = 483

Score = 28.5 bits (63), Expect = 0.017
Identities = 30/144 (20%), Positives = 53/144 (36%), Gaps = 17/144 (11%)

Query: 57 ALSYRLIAQHVEYYSDQAVSWFTQPVLTTFDKDKIPTWSVKADKAKLTNDRMLYLYGHVE 116
AL +I + + +A L D+ K +K D+ T+D YL G ++
Sbjct: 307 ALDLLVINKQQSNFQKEAHEIIFDLALDGADQTKEQL--IKTDEELGTDDEDFYLKGAMQ 364

Query: 117 ----VNALVPDSQLRRITT----------DNAQINLVTQDVTSEDLVTLYGTTFNSSGLK 162
VN + P + +T + + T +TSE Y T + K
Sbjct: 365 NFSYVNYVSPLKVISDPSTGIVSSKKNNAEMKSKQMSTDQMTSEKEFDYYTETLKALLEK 424

Query: 163 M-RGNLRSKNAELIEKVRTSYEIQ 185
L +L+E ++ +Y I+
Sbjct: 425 EDSAELNENEKKLVETIKKAYTIE 448


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3506  HTHFIS        65     6e-13    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 64.9 bits (158), Expect = 6e-13
Identities = 26/115 (22%), Positives = 45/115 (39%), Gaps = 4/115 (3%)

Query: 528 VLLVEDIELNVIVARSVLEKLGNSVDVAMTGKAALEMFKPGEYDLVLLDIQLPDMTGLDI 587
+L+ +D V L + G V + G+ DLV+ D+ +PD D+
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 588 SRELTKRYPREDLPPLVALTA-NVLKDKQEYLNAGMDDVLSKPLSVPALTAMIKK 641
+ K P P++ ++A N + G D L KP + L +I +
Sbjct: 66 LPRIKKARPD---LPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR 117


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3510  cloacin       29     0.032    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 29.3 bits (65), Expect = 0.032
Identities = 14/39 (35%), Positives = 20/39 (51%)

Query: 100 VSEWLTSRQQRAVRKEEKKEQAEAKAADPQAAAKREAAR 138
VS+ L+ Q + + EE + Q E A P AA+R R
Sbjct: 287 VSDVLSPDQVKQRQDEENRRQQEWDATHPVEAAERNYER 325


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3519  TCRTETB       60     7e-12    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 60.3 bits (146), Expect = 7e-12
Identities = 64/406 (15%), Positives = 135/406 (33%), Gaps = 32/406 (7%)

Query: 30 LLDGFDFVLIALVLTEVQGEFGLTTVQAASLISAAFISRWFGGLMLGAMGDRYGRRLAMV 89
+ +++ + L ++ +F + +A ++ G + G + D+ G + ++
Sbjct: 24 FFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLL 83

Query: 90 TSIVLFSAGTLACGFAPGYITMFI-ARLVIGMGMAGEYGSSATYVIESWPKHLRNKASGF 148
I++ G++ + ++ I AR + G G A V PK R KA G
Sbjct: 84 FGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGL 143

Query: 149 LISGFSVGAVVAAQVYSLVVPVWGWRALFFIGILPIIFALWLRKNIPEAEDWKEKHGGKA 208
+ S ++G V + ++ W L I ++ II +L K + +
Sbjct: 144 IGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPMITIITVPFLMKLLKKEVR--------- 194

Query: 209 PVRTMVDILYRGEHRIANIVMTLAAATALWFCFAGNLQNAAIVAVLGLLCAAIFISFMVQ 268
+G I I++ + IV+VL L IF+ + +
Sbjct: 195 ---------IKGHFDIKGIILMSVGIVFFMLFTTSYSISFLIVSVLSFL---IFVKHIRK 242

Query: 269 STGK----RWPTGVMLMVVVLFAFLYSWPIQA---LLPTYLKTDLAYDPHTVANVLFFSG 321
T + M+ VL + + ++P +K + +V+ F G
Sbjct: 243 VTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPG 302

Query: 322 -FGAAVGCCVGGFLGDWLGTRK-AYVCSLLASQLLIIPVFAIGGANVWVLGLLLFFQQML 379
+ +GG L D G + S + F + W + +++ F
Sbjct: 303 TMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLET-TSWFMTIIIVFVLGG 361

Query: 380 GQGIAGILPKLIGGYFDTDQRAAGLGFTYNVGALGGALAPIIGALI 425
++ ++ + AG+ L I +
Sbjct: 362 LSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGL 407


56  EcSMS35_3662  EcSMS35_3670  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias     Product
EcSMS35_3662  2       13       0.836118    ribulose-phosphate 3-epimerase
EcSMS35_3663  2       14       1.050130    DNA adenine methylase
EcSMS35_3664  2       16       1.840986    hypothetical protein
EcSMS35_3665  1       16       1.528633    3-dehydroquinate synthase
EcSMS35_3666  2       20       1.436266    shikimate kinase I
EcSMS35_3667  3       23       3.661037    outer membrane porin HofQ
EcSMS35_3668  -1      16       3.771478    hypothetical protein
EcSMS35_3669  -1      16       3.536864    hypothetical protein
EcSMS35_3670  -1      14       3.258439    hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3664  IGASERPTASE   44     2e-06    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 43.5 bits (102), Expect = 2e-06
Identities = 41/203 (20%), Positives = 67/203 (33%), Gaps = 10/203 (4%)

Query: 126 APSTSSSDQTASGEKSIDLAGNATDQANGVQPAPGTTSAENTQQDVSLPPISST-PTQGQ 184
P+ +D + + ++A D+A PAP T S + S T Q
Sbjct: 999 TPNNIQADVPSVPSNNEEIA--RVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQ 1056

Query: 185 TPAATDGQQRVEVQGDLNNALTQPQN----QQQLNNVAVNSTLPTEPATVAPVRNGNASR 240
T Q R + +N Q Q +T E ATV + A
Sbjct: 1057 DATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVE--KEEKAKV 1114

Query: 241 DTAKTQTAERPATTRPARQQAVIEPKKPQATVKTEPKPVAQTPKRTEPAAPVASTKAPAA 300
+T KTQ + + +Q+ E +PQA E P + A T+ PA
Sbjct: 1115 ETEKTQEVPKVTSQVSPKQEQS-ETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAK 1173

Query: 301 TSTPAPKETATTAPVQTASPAQT 323
++ ++ T + +
Sbjct: 1174 ETSSNVEQPVTESTTVNTGNSVV 1196



Score = 42.4 bits (99), Expect = 3e-06
Identities = 38/199 (19%), Positives = 68/199 (34%), Gaps = 19/199 (9%)

Query: 143 DLAGNATDQANGVQPAPGTTSAENTQQDVSL-----------------PPISSTPTQGQT 185
DL ++ N T+ N Q DV PP +TP++
Sbjct: 979 DLYNPEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTE 1038

Query: 186 PAATDGQQRVEVQGDLNNALTQPQNQQQLNNVAVNSTLPTEPATVAPVRNGNASRDTAKT 245
A + +Q + T+ Q + S + T ++G+ +++T T
Sbjct: 1039 TVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTT 1098

Query: 246 QTAERPATTRPARQQAVIEPKKPQATVKTEPKPVAQTPKRTEPAAPVASTKAPAATSTPA 305
+T E + + + E + V ++ P + + +P A A P
Sbjct: 1099 ETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEP 1158

Query: 306 PKETATTAPVQTASPAQTT 324
+T TTA T PA+ T
Sbjct: 1159 QSQTNTTA--DTEQPAKET 1175


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3666  CARBMTKINASE  34     4e-04    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 33.6 bits (77), Expect = 4e-04
Identities = 27/91 (29%), Positives = 40/91 (43%), Gaps = 18/91 (19%)

Query: 84 FYDSDQEIEKRTGADVGWVFDLEGEEGFRD----------REEKVINELTEKQGIVLATG 133
FYD + KR + GW+ + G+R E + I +L E+ IV+A+G
Sbjct: 136 FYDEETA--KRLAREKGWIVKEDSGRGWRRVVPSPDPKGHVEAETIKKLVERGVIVIASG 193

Query: 134 GGSVKSRETRNRLSARGVVVYLETTIEKQLA 164
GG V + +GV E I+K LA
Sbjct: 194 GGGVPVILEDGEI--KGV----EAVIDKDLA 218


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3667  TYPE3OMGPROT  284    2e-92    Type III secretion system outer membrane G protein ...
		>TYPE3OMGPROT#Type III secretion system outer membrane G protein family signature.

Length = 607

Score = 284 bits (727), Expect = 2e-92
Identities = 80/301 (26%), Positives = 131/301 (43%), Gaps = 18/301 (5%)

Query: 117 LENRNITLQYADAGELAKAGEKLLSAKGSMTVDKRTNRLLLRDNKTALSALEQWVAQMDL 176
L + I D + +A SA+ + D N +++RD+ + ++ + +D
Sbjct: 219 LSDATIQQVTVDNQRIPQAAT-RASAQARVEADPSLNAIIVRDSPERMPMYQRLIHALDK 277

Query: 177 PVGQVELSAHIVTINEKSLRELGVKWTLADAQQAGGVGQVTTLGSDLSVATATTHVGFNI 236
P ++E++ IV IN L ELGV W + + T G ++A+ G
Sbjct: 278 PSARIEVALSIVDINADQLTELGVDWRVGIRTGNNHQVVIKTTGDQSNIASN----GALG 333

Query: 237 GRINGRLLDL---ELSALEQKQQLDIIASPRLLASHLQPASIKQGSEIPYQVSSGESGAT 293
++ R LD ++ LE + +++ P LL A I SE Y +G+ A
Sbjct: 334 SLVDARGLDYLLARVNLLENEGSAQVVSRPTLLTQENAQAVIDH-SETYYVKVTGKEVA- 391

Query: 294 SVEFKEAVLG--MEVTPTVLQKG---RIRLKLHISQNVPGQVLQQADGEVLAIDKQEIET 348
E K G + +TP VL +G I L LHI +G + I + ++T
Sbjct: 392 --ELKGITYGTMLRMTPRVLTQGDKSEISLNLHIEDGNQKPNSSGIEG-IPTISRTVVDT 448

Query: 349 QVEVKSGETLALGGIFTRKNKSGQDSVPLLGDIPWFGQLFRHDGKEDERRELVVFITPRL 408
V G++L +GGI+ + VPLLGDIP+ G LFR + R + I PR+
Sbjct: 449 VARVGHGQSLIIGGIYRDELSVALSKVPLLGDIPYIGALFRRKSELTRRTVRLFIIEPRI 508

Query: 409 V 409
+
Sbjct: 509 I 509



Score = 36.8 bits (85), Expect = 2e-04
Identities = 18/95 (18%), Positives = 33/95 (34%), Gaps = 4/95 (4%)

Query: 1 MKQWIAALLLMLIPGVQAA----KPQKVTLMVDDVPVAQVLQALAEQEKLNLVVSPDVSG 56
K+ + LL+L A P + + +L +VVS ++
Sbjct: 9 FKRVLTGTLLLLSSYSWAQELDWLPIPYVYVAKGESLRDLLTDFGANYDATVVVSDKIND 68

Query: 57 TVSLHLTDVPWKQALQTVVKSAGLITRQEGNILSV 91
VS + LQ + L+ +GN+L +
Sbjct: 69 KVSGQFEHDNPQDFLQHIASLYNLVWYYDGNVLYI 103


57  EcSMS35_3726  EcSMS35_3811  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias     Product
EcSMS35_3726  -2      19       3.166550    gamma-glutamyltranspeptidase
EcSMS35_3728  -2      21       3.118861    hypothetical protein
EcSMS35_3727  -1      22       3.144135    cytoplasmic glycerophosphodiester
EcSMS35_3729  -2      21       2.275179    glycerol-3-phosphate transporter ATP-binding
EcSMS35_3730  -1      20       1.071131    glycerol-3-phosphate transporter membrane
EcSMS35_3731  -1      20       1.662839    glycerol-3-phosphate transporter permease
EcSMS35_3732  -1      20       1.370000    glycerol-3-phosphate transporter periplasmic
EcSMS35_3733  -1      19       2.467434    hypothetical protein
EcSMS35_3734  -2      22       3.198554    hypothetical protein
EcSMS35_3736  -1      21       3.614193    hypothetical protein
EcSMS35_3735  -1      21       3.243248    leucine/isoleucine/valine transporter
EcSMS35_3737  -1      20       3.126287    leucine/isoleucine/valine transporter
EcSMS35_3738  -1      22       2.913681    leucine/isoleucine/valine transporter permease
EcSMS35_3739  -2      23       2.208064    branched-chain amino acid transporter permease
EcSMS35_3740  -1      21       1.737544    high-affinity branched-chain amino acid ABC
EcSMS35_3741  -2      21       2.087409    hypothetical protein
EcSMS35_3742  1       19       2.362005    acetyltransferase
EcSMS35_3743  1       18       2.178233    high-affinity branched-chain amino acid ABC
EcSMS35_3744  2       17       1.708771    RNA polymerase factor sigma-32
EcSMS35_3745  2       14       1.315476    cell division protein FtsX
EcSMS35_3746  3       12       1.712132    cell division protein FtsE
EcSMS35_3747  2       11       3.286416    cell division protein FtsY
EcSMS35_3748  0       13       3.624072    16S rRNA m(2)G966-methyltransferase
EcSMS35_3749  0       14       3.560354    hypothetical protein
EcSMS35_3750  0       14       3.560524    hypothetical protein
EcSMS35_3751  -1      15       3.538882    hypothetical protein
EcSMS35_3752  0       14       3.493570    zinc/cadmium/mercury/lead-transporting ATPase
EcSMS35_3753  1       16       0.938209    hypothetical protein
EcSMS35_3754  0       17       1.757895    sulfur transfer protein SirA
EcSMS35_3755  -1      16       1.957589    hypothetical protein
EcSMS35_3756  0       17       3.078522    hypothetical protein
EcSMS35_3757  -1      20       4.174080    major facilitator superfamily transporter
EcSMS35_3758  0       23       4.634719    hypothetical protein
EcSMS35_3759  -1      25       5.566508    holo-(acyl carrier protein) synthase 2
EcSMS35_3760  -1      26       5.493720    nickel ABC transporter periplasmic
EcSMS35_3761  -1      22       4.289516    nickel transporter permease NikB
EcSMS35_3762  -1      19       2.718737    nickel transporter permease NikC
EcSMS35_3763  0       20       0.453067    nickel transporter ATP-binding protein NikD
EcSMS35_3764  1       18       -1.132084   nickel transporter ATP-binding protein NikE
EcSMS35_3765  1       18       -0.744495   nickel responsive regulator
EcSMS35_3766  2       19       -1.434394   GntR family transcriptional regulator
EcSMS35_3767  2       17       -0.426222   PEP-dependent sugar transporting PTS family, IIA
EcSMS35_3768  2       17       -0.686355   PEP-dependent sugar transporting PTS family, IIB
EcSMS35_3769  1       15       0.371734    PEP-dependent sugar transporting PTS family, IIC
EcSMS35_3770  0       17       2.418674    carbohydrate kinase
EcSMS35_3771  1       18       3.071413    HPr family phosphocarrier protein
EcSMS35_3772  1       17       3.329759    putative fructose-1,6-bisphosphate aldolase
EcSMS35_3773  0       14       -0.080918   HicB family protein
EcSMS35_3774  1       16       -3.297374   ABC transporter ATP-binding protein
EcSMS35_3775  0       18       -4.307469   ABC transporter ATP binding protein
EcSMS35_3776  0       26       -6.805510   MFP family transporter
EcSMS35_3777  0       23       -7.029824   hypothetical protein
EcSMS35_3778  1       22       -6.943553   hypothetical protein
EcSMS35_3779  0       16       -4.655092   hypothetical protein
EcSMS35_3780  0       12       0.270109    hypothetical protein
EcSMS35_3781  0       15       1.993928    pyridine nucleotide-disulfide oxidoreductase
EcSMS35_3782  -2      18       2.375690    low-affinity inorganic phosphate transporter 1
EcSMS35_3783  -2      19       2.631202    universal stress protein UspB
EcSMS35_3784  -3      21       2.767211    universal stress protein A
EcSMS35_3785  -3      21       2.978465    inner membrane transporter YhiP
EcSMS35_3786  -2      23       3.506063    putative methyltransferase
EcSMS35_3787  -1      23       2.697234    oligopeptidase A
EcSMS35_3788  -3      17       1.283881    DNA utilization protein YhiR
EcSMS35_3789  -2      15       1.228331    glutathione reductase
EcSMS35_3790  0       17       0.287364    hypothetical protein
EcSMS35_3791  -1      18       0.023496    hypothetical protein
EcSMS35_3792  1       17       -1.474005   hypothetical protein
EcSMS35_3793  0       22       -5.830909   DNA-binding transcriptional repressor ArsR
EcSMS35_3794  1       22       -6.542414   arsenical pump membrane protein
EcSMS35_3795  2       30       -9.993112   arsenate reductase
EcSMS35_3796  1       29       -9.024494   ArsR family transcriptional regulator
EcSMS35_3797  0       23       -6.894916   putative permease
EcSMS35_3798  0       22       -7.149975   hypothetical protein
EcSMS35_3799  -2      15       -2.994906   Slp family outer membrane lipoprotein
EcSMS35_3800  -2      12       -1.505046   transcriptional regulator DctR
EcSMS35_3801  -2      11       -0.576408   hemin transport protein HmuS
EcSMS35_3802  -2      13       0.333380    outer membrane heme/hemoglobin receptor ChuA
EcSMS35_3803  0       18       1.195635    hypothetical protein
EcSMS35_3804  -1      18       0.931405    periplasmic-binding protein
EcSMS35_3805  0       18       -0.235971   coproporphyrinogen III oxidase
EcSMS35_3806  2       19       -1.386013   hypothetical protein
EcSMS35_3807  2       20       -2.101595   hypothetical protein
EcSMS35_3808  2       21       -3.748611   iron chelate ABC transporter permease
EcSMS35_3809  0       22       -7.709029   hemin importer ATP-binding subunit
EcSMS35_3810  1       22       -9.471455   putative Mg(2+) transport ATPase
EcSMS35_3811  -1      20       -5.475583   acid-resistance protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3726  NAFLGMOTY     33     0.003    Sodium-type flagellar protein MotY precursor signature.
		>NAFLGMOTY#Sodium-type flagellar protein MotY precursor signature.

Length = 293

Score = 33.2 bits (75), Expect = 0.003
Identities = 28/82 (34%), Positives = 37/82 (45%), Gaps = 17/82 (20%)

Query: 276 RTPISGEYRGYQVYSMPPPSSGGIHIVQILNI--LENFDMKKYGF-GSADAMQIMAEAEK 332
R P+ GE R + SMPPP G H +I N+ + FD G+ G A I++E EK
Sbjct: 77 RRPM-GETRNVSLISMPPPWRPGEHADRITNLKFFKQFD----GYVGGQTAWGILSELEK 131

Query: 333 YAYADRSEYLGDPDFVKVPWQA 354
Y P F WQ+
Sbjct: 132 GRY---------PTFSYQDWQS 144


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3727  PF04619       29     0.014    Dr-family adhesin
		>PF04619#Dr-family adhesin

Length = 160

Score = 28.7 bits (64), Expect = 0.014
Identities = 12/60 (20%), Positives = 23/60 (38%), Gaps = 4/60 (6%)

Query: 29 VGAKYGHKMIEFDAKLSKDGEIFLLHDDNLERTSNGWGVAGDLNWQD----LLRVDAGSW 84
+G ++ D + G+ FL+ D+N ++ + W + D GSW
Sbjct: 70 LGCDARQVALKADTDNFEQGKFFLISDNNRDKLYVNIRPTDNSAWTTDNGVFYKNDVGSW 129


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3729  PF05272       29     0.042    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 28.9 bits (64), Expect = 0.042
Identities = 10/29 (34%), Positives = 16/29 (55%)

Query: 33 IVMVGPSGCGKSTLLRMVAGLERVTTGDI 61
+V+ G G GKSTL+ + GL+ +
Sbjct: 599 VVLEGTGGIGKSTLINTLVGLDFFSDTHF 627


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3732  MALTOSEBP     39     3e-05    Maltose binding protein signature.
		>MALTOSEBP#Maltose binding protein signature.

Length = 396

Score = 38.9 bits (90), Expect = 3e-05
Identities = 39/160 (24%), Positives = 66/160 (41%), Gaps = 14/160 (8%)

Query: 134 GHLLSQPFNSSTPVLYYNKDAFKKAGLDPEQPPKTWQDLADYAAKLKASGMKCGYASGWQ 193
G L++ P L YNKD PPKTW+++ +LKA G + +
Sbjct: 127 GKLIAYPIAVEALSLIYNKDLLP-------NPPKTWEEIPALDKELKAKGKSALMFNLQE 179

Query: 194 GWIQLENFSAWNGLPFASKNNGFDGTDAVLEF--NKPEQVKHIAMLEEMNKKGDFSYVGR 251
+ +A G F +N +D D ++ K + +++ + D Y
Sbjct: 180 PYFTWPLIAADGGYAFKYENGKYDIKDVGVDNAGAKAGLTFLVDLIKNKHMNADTDY--- 236

Query: 252 KDESTEKFYNGDCAMTTASSGSLANIREYAKFNYGVGMMP 291
+ F G+ AMT + +NI + +K NYGV ++P
Sbjct: 237 -SIAEAAFNKGETAMTINGPWAWSNI-DTSKVNYGVTVLP 274


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3747  IGASERPTASE   55     6e-10    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 55.1 bits (132), Expect = 6e-10
Identities = 39/206 (18%), Positives = 67/206 (32%), Gaps = 1/206 (0%)

Query: 19 EQTPEKETEVQNEQPVVEEIVPAQEPVKASEQTVEEQPQAHTEAEAETFAADVVEVTEQV 78
TP + TE E E + A+E T + + A EV +
Sbjct: 1030 PATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSG 1089

Query: 79 AESEKAQPEAVAEPEVITEPETIVEETPASVIEPEVTPE-PVAIEREELPLPEDVNAEEV 137
+E+++ Q E + + E ET + P+VT + E+ E P+ A E
Sbjct: 1090 SETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPAREN 1149

Query: 138 SPEEWQAEAETLEIVEAAEEEAAKEDITDEELEAQALAAEAAEEAVMVVSPAEEEQPVEE 197
P E ++ A E+ AKE ++ E +V+ +
Sbjct: 1150 DPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQP 1209

Query: 198 IAQEQEKPTKEGFFARLKRSLLKTKE 223
+ + R RS+ E
Sbjct: 1210 TVNSESSNKPKNRHRRSVRSVPHNVE 1235



Score = 43.5 bits (102), Expect = 2e-06
Identities = 41/201 (20%), Positives = 67/201 (33%), Gaps = 19/201 (9%)

Query: 17 QKEQTPEK------ETEVQNEQPVVEEIVPAQEPVKASEQTVEEQPQAHTEAEAETFAAD 70
Q+ +T EK ET QN + E A+ VKA+ QT E E +T
Sbjct: 1046 QESKTVEKNEQDATETTAQNREVAKE----AKSNVKANTQTNEVAQSGSETKETQTTET- 1100

Query: 71 VVEVTEQVAESEKAQPEA---VAEPEVITE--PETIVEETPASVIEPEVTPEPVAIEREE 125
+ T V + EKA+ E P+V ++ P+ ET EP +P + +E
Sbjct: 1101 --KETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPT-VNIKE 1157

Query: 126 LPLPEDVNAEEVSPEEWQAEAETLEIVEAAEEEAAKEDITDEELEAQALAAEAAEEAVMV 185
+ A+ P + + + E+ + + E A
Sbjct: 1158 PQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSN 1217

Query: 186 VSPAEEEQPVEEIAQEQEKPT 206
+ V + E T
Sbjct: 1218 KPKNRHRRSVRSVPHNVEPAT 1238



Score = 43.1 bits (101), Expect = 2e-06
Identities = 25/171 (14%), Positives = 56/171 (32%), Gaps = 9/171 (5%)

Query: 17 QKEQTPEKETEVQNEQPVVEEIVPAQEPVKASEQTVEEQPQAHTEAEAETFAADVVEVTE 76
+T E +T E VE+ +E K + +E P+ ++ + ++ V+
Sbjct: 1088 SGSETKETQTTETKETATVEK----EEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQA 1143

Query: 77 QVAESEKAQPEAVAEPEVIT----EPETIVEETPASVIEPEVTPEPVAIEREELPLPEDV 132
+ E + EP+ T + E +ET ++V +P V + PE+
Sbjct: 1144 E-PARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENT 1202

Query: 133 NAEEVSPEEWQAEAETLEIVEAAEEEAAKEDITDEELEAQALAAEAAEEAV 183
P + + + ++ + + A +
Sbjct: 1203 TPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCDLT 1253


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3754  PF01206       104    1e-33    SirA family protein
		>PF01206#SirA family protein

Length = 76

Score = 104 bits (261), Expect = 1e-33
Identities = 24/71 (33%), Positives = 41/71 (57%)

Query: 9 DHTLDALGLRCPEPVMMVRKTVRNMQPGETLLIIADDPATTRDIPGFCTFMEHELVAKET 68
D +LDA GL CP P++ +KT+ M GE L ++A DP + +D F HEL+ ++
Sbjct: 5 DQSLDATGLNCPLPILKAKKTLATMNAGEVLYVMATDPGSVKDFESFSKQTGHELLEQKE 64

Query: 69 DGLPYRYLIRK 79
+ Y + +++
Sbjct: 65 EDGTYHFRLKR 75


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3757  TCRTETA       54     6e-10    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 53.7 bits (129), Expect = 6e-10
Identities = 82/398 (20%), Positives = 149/398 (37%), Gaps = 32/398 (8%)

Query: 24 LRLNLRIVSIVMFNFASYLTIGLPLAVLPGYVHDVM--GFSAFWAGLVISLQYFATLLSR 81
++ N ++ I+ + IGL + VLPG + D++ G++++L
Sbjct: 1 MKPNRPLIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACA 60

Query: 82 PHAGRYADLLGPKKIVVFGLCGCFLSGLGYLTAGLTASLPVISLLLLCLGRVILGI-GQS 140
P G +D G + +++ L G + + Y L V L +GR++ GI G +
Sbjct: 61 PVLGALSDRFGRRPVLLVSLAG---AAVDYAIMATAPFLWV-----LYIGRIVAGITGAT 112

Query: 141 FAGTGSTLWGVGVVGSL--HIGRVISWNGIVTYGAMAMGAPLGVVFYHWGGLQALALIIM 198
A G+ + + H G + + G +G +G H A AL +
Sbjct: 113 GAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGL 172

Query: 199 GVALVAILLAIP-----RPMVKASKGKPLPFRAVLGRVWLYGMALA--LASAGFGVIATF 251
LL RP+ + + FR G + + + V A
Sbjct: 173 NFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAAL 232

Query: 252 ITLFYDAK-GWDGAAFALTLFSCAFVGT---RLLFPNGINRIGGLNVAMICFSVEIIGLL 307
+F + + WD ++L + + + ++ R+G M+ + G +
Sbjct: 233 WVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYI 292

Query: 308 LVGVATMPWMAKIG-VLLAGAGFSLVFPALGVVAVKAVPQQNQGAALATYTVFMDLSLGV 366
L+ AT WMA VLLA G + PAL + + V ++ QG + L+ +
Sbjct: 293 LLAFATRGWMAFPIMVLLASGGIGM--PALQAMLSRQVDEERQGQLQGSLAALTSLT-SI 349

Query: 367 TGPLAGLVMSWAGVPV----IYLAAAGLVAIALLLTWR 400
GPL + A + ++A A L + L R
Sbjct: 350 VGPLLFTAIYAASITTWNGWAWIAGAALYLLCLPALRR 387


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3764  HTHFIS        29     0.021    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 29.0 bits (65), Expect = 0.021
Identities = 10/34 (29%), Positives = 19/34 (55%)

Query: 25 QAVLNNVSLTLKSGETVALLGRSGCGKSTLARLL 58
Q + ++ +++ T+ + G SG GK +AR L
Sbjct: 147 QEIYRVLARLMQTDLTLMITGESGTGKELVARAL 180


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3774  ABC2TRNSPORT  50     4e-09    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 50.3 bits (120), Expect = 4e-09
Identities = 41/171 (23%), Positives = 73/171 (42%), Gaps = 7/171 (4%)

Query: 200 REREHGTVEHLLVMPITPFEIMMAKV-WSMGLVVLVVSGLSLVLMVKGVLGVPIEGSIPL 258
R T E +L + +I++ ++ W+ L +G+ +V G + L
Sbjct: 93 RMEGQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGAGIGVVAAALGY----TQWLSLL 148

Query: 259 FMLGV-ALSLFATTSIGIFMGTIARSMPQLGLLVILVLLPLQMLSGGSTPRESMPQMVQD 317
+ L V AL+ A S+G+ + +A S LV+ P+ LSG P + +P + Q
Sbjct: 149 YALPVIALTGLAFASLGMVVTALAPSYDYFIFYQTLVITPILFLSGAVFPVDQLPIVFQT 208

Query: 318 IMLTMPTTHFVSLAQAILYRGAGFEIVWPQFLTLMAIGGAFF-TIALLRFR 367
+P +H + L + I+ ++ + I FF + ALLR R
Sbjct: 209 AARFLPLSHSIDLIRPIMLGHPVVDVCQHVGALCIYIVIPFFLSTALLRRR 259


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3775  PF05272       30     0.046    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.0 bits (67), Expect = 0.046
Identities = 9/26 (34%), Positives = 14/26 (53%)

Query: 37 ARCMVGLIGPDGVGKSSLLSLISGAR 62
V L G G+GKS+L++ + G
Sbjct: 595 FDYSVVLEGTGGIGKSTLINTLVGLD 620


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3776  RTXTOXIND     85     3e-20    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 84.9 bits (210), Expect = 3e-20
Identities = 72/408 (17%), Positives = 139/408 (34%), Gaps = 81/408 (19%)

Query: 6 RHLAWWVVGLLAVAAIVAWWLLRPAGVP-EGFAVSNGRIEATEVDIASKIAGRIDTILVK 64
R +A++++G L +A I++ G +GR + I + I+VK
Sbjct: 58 RLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKE----IKPIENSIVKEIIVK 113

Query: 65 EGQFVREGEVLAKMDTRV----------------LQEQRLEAI----------------- 91
EG+ VR+G+VL K+ L++ R + +
Sbjct: 114 EGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDE 173

Query: 92 -------------------AQIKEAQSAIAAAQALLEQRQSETRAAQSLVNQRQAELDSV 132
Q Q+ + L+++++E + +N+ +
Sbjct: 174 PYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVE 233

Query: 133 AKRHTRSRSLAQRGAISAQQLDDDRAAAESARAALESAKAQVSASKAAIEAARTNIIQ-- 190
R SL + AI+ + + A L K+Q+ ++ I +A+
Sbjct: 234 KSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVT 293

Query: 191 -----------AQTRVEAAQATERRIAADID--DSELKAPRDGRV-QYRVAEPGEVLAAG 236
QT T + S ++AP +V Q +V G V+
Sbjct: 294 QLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTA 353

Query: 237 GRVLNMVDLSDVY-MTFFLPTEQAGTLKLGGEARLILDAAPDLRIPATISFVASVAQFTP 295
++ +V D +T + + G + +G A + ++A P R V V
Sbjct: 354 ETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYG---YLVGKVKNINL 410

Query: 296 KTVETSDERLKLMFRVKARIPPELLQQHLEYV--KTGLPGVAWVRVNE 341
+E D+RL L+F V I L + + +G+ A ++
Sbjct: 411 DAIE--DQRLGLVFNVIISIEENCLSTGNKNIPLSSGMAVTAEIKTGM 456


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3781  ALARACEMASE   29     0.032    Alanine racemase signature.
		>ALARACEMASE#Alanine racemase signature.

Length = 356

Score = 29.0 bits (65), Expect = 0.032
Identities = 23/98 (23%), Positives = 38/98 (38%), Gaps = 18/98 (18%)

Query: 226 ENLLFTHRGLSGPAVLQISSYWQPGEFVSINLLPDVDLETFL--NEQRNAHPNQSLKNTL 283
E + RG GP +L + ++ + + + L T + N Q A N LK L
Sbjct: 63 EAITLRERGWKGP-ILMLEGFFHAQD---LEIYDQHRLTTCVHSNWQLKALQNARLKAPL 118

Query: 284 AVHL------------PKRLVERLQQLGQIPDVSLKQL 309
++L P R++ QQL + +V L
Sbjct: 119 DIYLKVNSGMNRLGFQPDRVLTVWQQLRAMANVGEMTL 156


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3804  FERRIBNDNGPP  32     0.002    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 32.2 bits (73), Expect = 0.002
Identities = 42/210 (20%), Positives = 73/210 (34%), Gaps = 23/210 (10%)

Query: 28 MVKRKKLFT--ALLALSWTFSVTAAE-----RIVVAGGSLTELIYAMGAGKRVVGVDETT 80
++ R++L T AL L W + A RIV EL+ A+G GV +T
Sbjct: 6 LISRRRLLTAMALSPLLWQMNTAHAAAIDPNRIVALEWLPVELLLALGIVP--YGVADTI 63

Query: 81 SY------PPETARLPHIGYWKQLSSEGILSLRPDSVITWQDAGPQIVLDQL-RAQKVNV 133
+Y PP + +G + + E + ++P ++ GP + L R
Sbjct: 64 NYRLWVSEPPLPDSVIDVGLRTEPNLELLTEMKPSFMVWSAGYGP--SPEMLARIAPGRG 121

Query: 134 VTLPRVPATLEQMYANIRQLAKTLQVPEQGEALVTQINQRLERVQQNVAAKKAPVKAMFI 193
L ++ ++A L + E + Q + ++ + A +
Sbjct: 122 FNFSDGKQPLAMARKSLTEMADLLNLQSAAETHLAQYEDFIRSMKPRFVKRGARPLLLTT 181

Query: 194 LSAGGSAPQ--VAGKGSVADAILSLAGAEN 221
L V G S+ IL G N
Sbjct: 182 LI---DPRHMLVFGPNSLFQEILDEYGIPN 208


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3809  PF05272       29     0.019    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.019
Identities = 15/59 (25%), Positives = 24/59 (40%), Gaps = 4/59 (6%)

Query: 5 AQPEDHMISAQNLVYSLQGRRLTDNVSLTFPGG---EIVAIL-GPNGAGKSTLLRQLTG 59
P+D+ + + L +V+ G + +L G G GKSTL+ L G
Sbjct: 560 KTPDDYKPRRLRYLQLVGKYILMGHVARVMEPGCKFDYSVVLEGTGGIGKSTLINTLVG 618
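
Note on the statistics lines: each hit reports Identities, Positives and, where present, Gaps as count/aligned-length followed by a whole-number percentage. In the hits listed here the percentage looks truncated rather than rounded (for example 14/26 is shown as 53%). The sketch below checks this for one line. The names `parse_stats` and `truncated_percent` are hypothetical helpers written for illustration and are not part of PredictBias or BLAST.

```python
import re

# Hypothetical helper pattern: matches "Name = n/total (pct%)" fields
# in a statistics line copied from the report.
STAT = re.compile(r"(Identities|Positives|Gaps) = (\d+)/(\d+) \((\d+)%\)")

def parse_stats(line):
    """Return {field: (count, aligned_length, reported_percent)}."""
    return {name: (int(n), int(total), int(pct))
            for name, n, total, pct in STAT.findall(line)}

def truncated_percent(count, total):
    """Percentage with the fractional part dropped (integer division)."""
    return count * 100 // total

line = "Identities = 15/59 (25%), Positives = 24/59 (40%), Gaps = 4/59 (6%)"
stats = parse_stats(line)
for name, (count, total, reported) in stats.items():
    # Compare the reported percentage with the recomputed, truncated one.
    print(name, count, total, reported, truncated_percent(count, total))
```

Lines without a Gaps field parse the same way; the returned dictionary simply has two entries instead of three.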


58  EcSMS35_3848  EcSMS35_3859  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias     Product
EcSMS35_3848  6       47       -0.680181   hypothetical protein
EcSMS35_3849  7       46       -0.276141   hypothetical protein
EcSMS35_3850  7       46       -0.175131   hypothetical protein
EcSMS35_3851  7       46       -0.175131   hypothetical protein
EcSMS35_3852  6       40       0.544258    hypothetical protein
EcSMS35_3853  1       23       -3.116614   hypothetical protein
EcSMS35_3854  0       18       -0.699510   hypothetical protein
EcSMS35_3855  -1      16       0.628565    hypothetical protein
EcSMS35_3856  -2      18       1.942784    hypothetical protein
EcSMS35_3857  -1      18       2.461318    hypothetical protein
EcSMS35_3858  -1      18       2.471948    serine transporter family protein
EcSMS35_3859  -2      22       3.525862    dipeptide transporter ATP-binding subunit
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3855  ABC2TRNSPORT  25     0.013    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 24.5 bits (53), Expect = 0.013
Identities = 10/35 (28%), Positives = 14/35 (40%)

Query: 6 QNIIRLQPLHFSYEKGRRVLSLRPGPGANCHYGSL 40
Q R PL S + R ++ P H G+L
Sbjct: 207 QTAARFLPLSHSIDLIRPIMLGHPVVDVCQHVGAL 241


59  EcSMS35_3874  EcSMS35_3885  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias     Product
EcSMS35_3874  1       18       -3.041491   2-ketogluconate reductase
EcSMS35_3875  3       27       -5.609342   putative lipoprotein
EcSMS35_3876  3       20       -1.817450   putative transcriptional regulator
EcSMS35_3877  2       24       -0.259969   major cold shock protein
EcSMS35_3878  2       22       -0.388220   small toxic polypeptide
EcSMS35_3879  0       19       -0.702182   IS150 transposase orfA
EcSMS35_3880  0       19       -0.941373   IS150 transposase orfB
EcSMS35_3881  -2      18       -0.274526   glycyl-tRNA synthetase subunit beta
EcSMS35_3882  2       15       -1.286106   glycyl-tRNA synthetase subunit alpha
EcSMS35_3883  1       16       -2.362208   hypothetical protein
EcSMS35_3884  0       16       -3.263997   acyltransferase family protein
EcSMS35_3885  -2      16       -3.400292   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3878  HOKGEFTOXIC   68     8e-20    Hok/Gef cell toxic protein family signature.
		>HOKGEFTOXIC#Hok/Gef cell toxic protein family signature.

Length = 52

Score = 67.5 bits (165), Expect = 8e-20
Identities = 18/50 (36%), Positives = 33/50 (66%)

Query: 1 MPQKYRLLSLIVICFTLLFFTWMIRDSLCELHIKQGSYELAAFLACKLKE 50
+P+ + ++++C TLL FT++ R SLCE+ + G E+AAF+A + +
Sbjct: 3 LPRSSLVWCVLIVCLTLLIFTYLTRKSLCEIRYRDGYREVAAFMAYESGK 52


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3885  FLGBIOSNFLIP  27     0.021    Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP signature.

Length = 245

Score = 27.5 bits (61), Expect = 0.021
Identities = 19/66 (28%), Positives = 26/66 (39%), Gaps = 1/66 (1%)

Query: 77 MTCLTVFIISVALLLVGLWNATLLLSEKGFYGLAFFLSLFGAVAVQKNIRDAGINPPKET 136
MT T II LL L + + GLA FL+ F V I P E
Sbjct: 61 MTSFTRIIIVFGLLRNALGTPSAP-PNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEE 119

Query: 137 QVTQEE 142
+++ +E
Sbjct: 120 KISMQE 125


60  EcSMS35_3919  EcSMS35_3964  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias     Product
EcSMS35_3919  4       29       -5.274670   selenocysteine synthase
EcSMS35_3920  5       34       -7.875488   putative glutathione S-transferase
EcSMS35_3921  7       38       -9.790213   outer membrane autotransporter
EcSMS35_3922  8       47       -12.097640  putative fimbrial protein FanH
EcSMS35_3923  7       47       -13.181442  putative fimbrial protein FanG
EcSMS35_3924  6       48       -14.768847  putative fimbrial protein FanF
EcSMS35_3925  3       37       -12.064230  putative fimbrial assembly chaperone FanE
EcSMS35_3926  3       34       -11.100346  putative fimbrial usher protein FanD
EcSMS35_3927  0       18       -5.743186   putative fimbrial protein FanC
EcSMS35_3928  -1      15       -3.962025   sigma-70 domain-containing protein
EcSMS35_3929  -2      15       -2.129288   putative fimbrial operon positive regulatory
EcSMS35_3930  -1      19       0.882230    auxiliary membrane fusion protein family
EcSMS35_3931  0       22       0.719032    hypothetical protein
EcSMS35_3932  -1      20       0.016680    PTS system, mannitol-specific IIABC component
EcSMS35_3933  -2      14       -3.505643   mannitol-1-phosphate 5-dehydrogenase
EcSMS35_3934  3       20       -0.900727   mannitol repressor protein
EcSMS35_3935  3       19       -0.005782   hypothetical protein
EcSMS35_3936  2       19       0.414107    hypothetical protein
EcSMS35_3937  2       16       0.895801    hypothetical protein
EcSMS35_3938  2       17       1.427344    putative lipoprotein
EcSMS35_3939  1       17       2.430233    hemagluttinin family protein
EcSMS35_3940  -1      15       3.797933    L-lactate permease
EcSMS35_3941  1       16       3.562243    DNA-binding transcriptional repressor LldR
EcSMS35_3942  0       16       3.326027    L-lactate dehydrogenase
EcSMS35_3943  -1      15       2.404472    putative tRNA/rRNA methyltransferase YibK
EcSMS35_3944  0       21       2.047759    serine acetyltransferase
EcSMS35_3945  1       19       2.881840    NAD(P)H-dependent glycerol-3-phosphate
EcSMS35_3946  2       20       2.019975    preprotein translocase subunit SecB
EcSMS35_3947  -1      15       0.900505    glutaredoxin 3
EcSMS35_3948  -1      16       0.940629    rhodanese domain-containing protein
EcSMS35_3949  -1      16       1.380882    phosphoglyceromutase
EcSMS35_3950  -1      14       1.034837    hypothetical protein
EcSMS35_3952  1       16       -0.384697   polysaccharide deacetylase family protein
EcSMS35_3951  1       16       -0.059160   putative glycosyl transferase
EcSMS35_3953  -1      14       0.887325    L-threonine 3-dehydrogenase
EcSMS35_3954  -1      16       -3.262078   2-amino-3-ketobutyrate coenzyme A ligase
EcSMS35_3955  1       27       -7.480771   hypothetical protein
EcSMS35_3956  2       27       -8.587474   ADP-L-glycero-D-mannoheptose-6-epimerase
EcSMS35_3957  3       32       -10.728938  ADP-heptose:LPS heptosyltransferase II
EcSMS35_3958  3       40       -14.005131  ADP-heptose:LPS heptosyl transferase I
EcSMS35_3959  3       42       -16.129146  O-antigen polymerase
EcSMS35_3960  3       35       -14.058499  glycosyl transferase, group 2 family protein
EcSMS35_3961  2       29       -11.102967  lipopolysaccharide 1,2-galactosyltransferase
EcSMS35_3962  2       27       -9.329331   lipopolysaccharide core biosynthesis protein
EcSMS35_3963  2       22       -6.167044   lipopolysaccharide 1,2-glucosyltransferase
EcSMS35_3964  2       18       -3.807178   lipopolysaccharide 1,3-galactosyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3921  IGASERPTASE   436    e-132    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 436 bits (1123), Expect = e-132
Identities = 244/888 (27%), Positives = 382/888 (43%), Gaps = 145/888 (16%)

Query: 50 SPLIQASIVGMDIPYQTYRDFAENKGAFSVGALDIPLYKKDGTLYSTL--NKAPMIDFSA 107
+P +A++V D+ YQ +RDFAENKG FSVGA ++ + K+ T N PMIDFS
Sbjct: 20 TPYTEAALVRDDVDYQIFRDFAENKGKFSVGATNVLVKDKNNKDLGTALPNGIPMIDFSV 79

Query: 108 VDSGQTVATLISPQYIVSVKH-NTGYKNVRFG-------------YRDDSS----YILVD 149
VD + +ATLI+PQY+V VKH + G + FG +RD SS Y V+
Sbjct: 80 VDVDKRIATLINPQYVVGVKHVSNGVSELHFGNLNGNMNNGNAKAHRDVSSEENRYFSVE 139

Query: 150 RNN------------------SSVDFHTPRLNKIVTEVVPADITDAGTANGTYQNQDRFP 191
+N D++ PRL+K VTEV P + + A + GTY +Q+++P
Sbjct: 140 KNEYPTKLNGKTVTTEDQTQKRREDYYMPRLDKFVTEVAPIEASTASSDAGTYNDQNKYP 199

Query: 192 IFYRVGTGTQYVKDRNGK--------------LTQLAGGYAYRTGGTVGKPTSSNKRIV- 236
F R+G+G+Q++ + L + Y Y GT K N ++
Sbjct: 200 AFVRLGSGSQFIYKKGDNYSLILNNHEVGGNNLKLVGDAYTYGIAGTPYKVNHENNGLIG 259

Query: 237 -SNPGNTYSAANG-----PMPSYGIPGDSGSPLFAWDTQRNKWVLVAVLNSYAGNAGKTN 290
N +S G P+ +Y + GDSGSPLF +D ++ KW+ + + +AG K +
Sbjct: 260 FGNSKEEHSDPKGILSQDPLTNYAVLGDSGSPLFVYDREKGKWLFLGSYDFWAGY-NKKS 318

Query: 291 WFTVIPVNEVSANIEADTDAPVTPTSTTENINWTYDISTGTGKLTQGTDAWEMHGRDTGS 350
W + D+ + ++++ + T +T G + + D
Sbjct: 319 WQEWNIYKSQFTKDVLNKDSAGS--LIGSKTDYSWSSNGKTSTITGGEKSLNVDLADGKD 376

Query: 351 SAVSFNHGKDLSFENTGTVVLKDIVNQGAGTLTFNGDYIVKPDAD-QTWVGGGIIVNGDH 409
NHGK ++FE +GT+ L + ++QGAG L F GDY VK +D TW G G+ V
Sbjct: 377 KP---NHGKSVTFEGSGTLTLNNNIDQGAGGLFFEGDYEVKGTSDNTTWKGAGVSVAEGK 433

Query: 410 TVNWQVNGVKGDSMHKLGTGTLNISGTGINPGTLSVGDGTVVLAQKPDSNGQVQAFESAS 469
TV W+V+ + D + K+G GTL + GTG N G+L VGDGTV+L Q+ + +GQ AF S
Sbjct: 434 TVTWKVHNPQYDRLAKIGKGTLIVEGTGDNKGSLKVGDGTVILKQQTNGSGQ-HAFASVG 492

Query: 470 IVSGRPTLVLSDSQQMNPDNIKWGYRGGKLDINGNDLTFHALNAADEGAILTNSGSLATT 529
IVSGR TLVL+D +Q++P++I +G+RGG+LD+NGN LTF + D+GA L N
Sbjct: 493 IVSGRSTLVLNDDKQVDPNSIYFGFRGGRLDLNGNSLTFDHIRNIDDGARLVN------- 545

Query: 530 SLDFNSTDTTKPVTTMFHGFFTGNVNVKNNATSNVNNTFVVDGGINTPAGSMTQQGGRLF 589
++ +T TG + + T N D N A + GG+L+
Sbjct: 546 ----HNMTNASNIT------ITGESLITDPNTITPYNIDAPDED-NPYAFRRIKDGGQLY 594

Query: 590 FQGHPVIHAVSTQSVANKLKALGDDSVLTQPVSFTQSDWQTRQFNLKSLDLNNAAFYLAR 649
T K +
Sbjct: 595 LNLEN----------------------------------YTYYALRKGASTRSELP---- 616

Query: 650 NAGLITTINANNSTVTLGSEDLYIDTNDGNGVKTTPVEGQSVATASEDQSHFTGNVNLTN 709
+ +N + + +G N N + + G + E+ + GN+N+T
Sbjct: 617 ----KNSGESNENWLYMGKTSDEAKRNVMNHINNERMNGFNGYFGEEEGKN-NGNLNVTF 671

Query: 710 GSALRVNENF--SGGIISSNSSVTISSTNANLTESSMFTHSVIKLSDNAQLTSTAGLQSD 767
F +GG +N + ++ L + D A ++ST
Sbjct: 672 -KGKSEQNRFLLTGG---TNLNGDLTVEKGTLF---LSGRPTPHARDIAGISSTKKDPHF 724

Query: 768 GTIEFGNGAKLSLLGESSSTFTPFSATAWNLKGTGSSLNIGSGTNVNGDINAWSDTNINF 827
++ E F AT N+ G S + + N+ +I A + ++
Sbjct: 725 AENN-------EVVVEDDWINRNFKATTMNVTGNASLYSGRNVANITSNITASNKAQVHI 777

Query: 828 GNSGKQSTSSGILYTGDIYAPEANVSIDNTSWTLNKTSLLGNLTLKNS 875
G + YTG + +S D + N T+L GN+ L S
Sbjct: 778 GYKTGDTVCVRSDYTGYVTCTTDKLS-DKALNSFNPTNLRGNVNLTES 824


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3926  PF00577       386    e-123    Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 386 bits (992), Expect = e-123
Identities = 192/840 (22%), Positives = 341/840 (40%), Gaps = 83/840 (9%)

Query: 5 IFIAAIIFHLLSKGALAEEFNYSFIRGGSKDIPDVLNSNKENV--PGKYVVDVVFNGSKI 62
+F+A + FN F+ + + D+ PG Y VD+ N +
Sbjct: 30 LFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPGTYRVDIYLNNGYM 89

Query: 63 ASSTEMSIAKEDAEG---ICLSDEWLTENGIIINKDFYKN-VYNSARQCYLLGN-EANSK 117
A+ +++ D+E CL+ L G+ N + C L + ++
Sbjct: 90 AT-RDVTFNTGDSEQGIVPCLTRAQLASMGL--NTASVSGMNLLADDACVPLTSMIHDAT 146

Query: 118 VAFDQSLQEVSIDLPQAGFQDAAK---DGGVWDYGSNGFKIAYDVN--TAKNSNQERTTY 172
D Q +++ +PQA + A+ +WD G N + Y+ + + +N + Y
Sbjct: 147 AQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSVQNRIGGNSHY 206

Query: 173 SSIDGQ--VNLGEWVL---------LGRGYAYQGENFDTNNLLLTRAIKSLKSDLQLGKT 221
+ ++ Q +N+G W L + + N L R I L+S L LG
Sbjct: 207 AYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRSRLTLGDG 266

Query: 222 QLYNSLNNGFTFYGAQLKSNQDMYPWNSRAYSPVINGIARTHARVTIEQSGYTLKSIVVP 281
+ +G F GAQL S+ +M P + R ++PVI+GIAR A+VTI+Q+GY + + VP
Sbjct: 267 YTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIYNSTVP 326

Query: 282 PGPFVINDLNGV-YSGDLIMKIYEEDGSVREQRFPVAVLPNLLRPGTYNYALAMGSKVNQ 340
PGPF IND+ SGDL + I E DGS + P + +P L R G Y++ G +
Sbjct: 327 PGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITAG-EYRS 385

Query: 341 DNGERDKESLFAQMSYDYGFEP-FTLNSSLLLDKNYNNIGLGLIRSFGWFGAMSFSGNLS 399
N +++K F Q + +G +T+ L Y G+ ++ G GA+S +
Sbjct: 386 GNAQQEKP-RFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALSVDMTQA 444

Query: 400 QAKYHNGKNLKGYSTSLKYAKALGD-NANLQLIGYRFNSEDYIDYADFTY---------- 448
+ + G S Y K+L + N+QL+GYR+++ Y ++AD TY
Sbjct: 445 NSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGYNIET 504

Query: 449 -----------NSYSFIRNRPKQRYESIVTYQLPEKGMFLNFSAWKEDYWD-NYNEVGAN 496
Y + + + + VT QL L S + YW + +
Sbjct: 505 QDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRTST-LYLSGSHQTYWGTSNVDEQFQ 563

Query: 497 LSLTKSFDQITMTLNGGYSRLQNM-DADYNVGLSLSVPLSLFDKTH--------YSFSNV 547
L +F+ I TL+ ++ D + L++++P S + ++ + ++
Sbjct: 564 AGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRSDSKSQWRHASASYSM 623

Query: 548 NYDRRTGTSMNTGISGMI--NQRLSYNASVNQTRDTIG-----GTLSASYLFDWMQTSAT 600
++D + G+ G + + LSY+ G G + +Y + +
Sbjct: 624 SHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGYGNANIG 683

Query: 601 YSQTGKNSSTSLQLGGSVIGVPEGGIIFTPVKNDQLAIVQMKDVPGVMFNGSLPG--DKY 658
YS + + G V+ G+ ND + +V+ D
Sbjct: 684 YSHSDDIKQLYYGVSGGVLAHA-NGVTLGQPLNDTVVLVKAPGAKDAKVENQTGVRTDWR 742

Query: 659 GRAVIP-LTAYNNNTISVNAEKLPKNIELTDNAINVTPTGNAIIYKNVKFK-KINTYVVK 716
G AV+P T Y N ++++ L N++L + NV PT AI+ +FK ++ ++
Sbjct: 743 GYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVR--AEFKARVGIKLLM 800

Query: 717 LYGKNGYVVPMGSIAKNTQGKEVGYVNNGG-ILLMNLETHDE-----GVISLDQCQFNTQ 770
N +P G++ + + G V + G + L + + G C N Q
Sbjct: 801 TLTHNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLAGKVQVKWGEEENAHCVANYQ 860


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3930  RTXTOXIND     64     2e-13    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 64.5 bits (157), Expect = 2e-13
Identities = 56/314 (17%), Positives = 103/314 (32%), Gaps = 82/314 (26%)

Query: 66 ITPQVTGIVTEVTDKNNQLIQKGEVLFKLDPVR------------YQARVD--RLQA--- 108
I P IV E+ K + ++KG+VL KL + QAR++ R Q
Sbjct: 99 IKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSR 158

Query: 109 ------------------------DLMTATHNIK----TLRAQLTEAQANTTQVSAERDR 140
+++ T IK T + Q + + N + AER
Sbjct: 159 SIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLT 218

Query: 141 LFKNYQRY----------LKGSQAAVNPFS---------ERDIDDARQNF---LAQDALV 178
+ RY L + ++ + E +A +Q +
Sbjct: 219 VLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQI 278

Query: 179 KGSVAE----QAQIQSQLDSMVNGE----QSQIVSLRAQLTEAKYNLEQTVIRAPSNGYV 230
+ + + + + + I L +L + + + +VIRAP + V
Sbjct: 279 ESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKV 338

Query: 231 TQVLIR-PGTYAAALPLRPVMVFIPEQKRQIV-AQFRQNSLLRLKPGDDAEVVFNALPGQ 288
Q+ + G +MV +PE V A + + + G +A + A P
Sbjct: 339 QQLKVHTEGGVVT--TAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYT 396

Query: 289 VFH---GKLTSILP 299
+ GK+ +I
Sbjct: 397 RYGYLVGKVKNINL 410


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3939  PF03895       64     1e-14    Serum resistance protein DsrA.
		>PF03895#Serum resistance protein DsrA.

Length = 79

Score = 63.7 bits (155), Expect = 1e-14
Identities = 19/79 (24%), Positives = 36/79 (45%), Gaps = 2/79 (2%)

Query: 1472 ESKLSGGIASAMAMTGLPQAYTPGASMASIGGGTYNGESAVALGV-SMVSANGRWVYKLQ 1530
+L G+A+ A++ L Q G + S G Y ++A+A+GV S ++ +
Sbjct: 2 SKELQTGLANQSALSMLVQPNGVGKTSVSAAVGGYRDKTALAIGVGSRITDRFTAKAGVA 61

Query: 1531 GSTNSQGEYSAALGAGIQW 1549
+T + G S G ++
Sbjct: 62 FNTYN-GGMSYGASVGYEF 79


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3945  NUCEPIMERASE  29     0.020    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 29.4 bits (66), Expect = 0.020
Identities = 20/87 (22%), Positives = 30/87 (34%), Gaps = 13/87 (14%)

Query: 8 MTVI---GAGSYGTALAITLARNGHEVVLWGHD---PEHIATLERDRCNAAFLPDVPFPD 61
M + AG G ++ L GH+VV G D + +L++ R P F
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVV--GIDNLNDYYDVSLKQARLELLAQPGFQF-- 56

Query: 62 TLHLESDLATALAASRNILVVVPSHVF 88
+ DLA + VF
Sbjct: 57 ---HKIDLADREGMTDLFASGHFERVF 80


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3946  SECBCHAPRONE  235    1e-82    Bacterial protein-transport SecB chaperone protein ...
		>SECBCHAPRONE#Bacterial protein-transport SecB chaperone protein signature.

Length = 170

Score = 235 bits (600), Expect = 1e-82
Identities = 87/153 (56%), Positives = 117/153 (76%), Gaps = 4/153 (2%)

Query: 3 EQNNTEMTFQIQRIYTKDISFEAPNAPHVFQKDWQPEVKLDLDTASSQLADDVYEVVLRV 62
Q + QIQRIY KD+SFEAPN PH+FQ+DW+P++ DL T + Q+ DD+YEV L +
Sbjct: 12 TQATQQPVLQIQRIYVKDVSFEAPNLPHIFQQDWEPKLSFDLSTEAKQVGDDLYEVCLNI 71

Query: 63 TVTASLGEE--TAFLCEVQQGGIFSIAGIEGTQMAHCLGAYCPNILFPYARECITSMVSR 120
+V ++ AF+CEV+Q G+F+I+G+E QMAHCL + CPN+LFPYARE ++S+V+R
Sbjct: 72 SVETTMESSGDVAFICEVKQAGVFTISGLEEMQMAHCLTSQCPNMLFPYARELVSSLVNR 131

Query: 121 GTFPQLNLAPVNFDALFMNYL--QQQAGEGTEE 151
GTFP LNL+PVNFDALFM+YL Q+QA + TEE
Sbjct: 132 GTFPALNLSPVNFDALFMDYLQRQEQAEQTTEE 164


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3950  CHANLCOLICIN  36     2e-04    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 36.2 bits (83), Expect = 2e-04
Identities = 51/223 (22%), Positives = 74/223 (33%), Gaps = 22/223 (9%)

Query: 52 RAVRQKQQQRASLLAQLKKQEEAISEATRKLRETQNTLNQLNKQIDEMNASIAKLEQQKA 111
R + +++ R A K +EA RE T QL E A E+ KA
Sbjct: 131 RLAKAEEKARKEAEAAEKAFQEAEQRRKEIEREKAETERQLKLAEAEEKRLAALSEEAKA 190

Query: 112 ---AQERSLAAQLDAAFRQGEHTGIQLILSGEESQRGQRLQAYFGYLNQARQETIAQLKQ 168
AQ++ AAQ + GE + LS AR + L
Sbjct: 191 VEIAQKKLSAAQSEVVKMDGEIKTLNSRLSS---------------SIHARDAEMKTLAG 235

Query: 169 TREEVAMQRAELEEKQSEQQTLLYEQRAQQAKLTQALSERKKTLAGLESSIQQGQQQLSE 228
R E+A A+ +E + L + R++ AG +Q Q SE
Sbjct: 236 KRNELAQASAKYKELDELVKKLSPRANDPLQNRPFFEATRRRVGAGKIREEKQKQVTASE 295

Query: 229 LRANESRLRNSIARAEAAAKARAEREAREAQAVRDRQKEATRK 271
R N R+ I + + A R A R + E K
Sbjct: 296 TRIN--RINADITQIQKAIS--QVSNNRNAGIARVHEAEENLK 334


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3956  NUCEPIMERASE  105    2e-28    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 105 bits (264), Expect = 2e-28
Identities = 78/348 (22%), Positives = 128/348 (36%), Gaps = 67/348 (19%)

Query: 2 IIVTGGAGFIGSNIVKALNDKGITDILVVDNLKD--------------GTKFVNLVDLDI 47
+VTG AGFIG ++ K L + G ++ +DNL D +D+
Sbjct: 3 YLVTGAAGFIGFHVSKRLLEAG-HQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 48 ADYMDKEDFLIQIMAGEEFGDVEAIFHEGACSSTTEWDGKYMMDNNYQYSK-------EL 100
AD + + + A F E +F + +Y ++N + Y+ +
Sbjct: 62 ADR----EGMTDLFASGHF---ERVFISPHRLAV-----RYSLENPHAYADSNLTGFLNI 109

Query: 101 LHYCLEREIP-LLYASSAATYGGRTSD-FIESREYEKPLNVYGYSKFLFDEYVRQILPEA 158
L C +I LLYASS++ YG F + P+++Y +K +
Sbjct: 110 LEGCRHNKIQHLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLY 169

Query: 159 NSQIVGFRYFNVYGPREGHKGSMASVAFHLNTQLNNGESPKLFEGSENFKRDFVYVGDVA 218
G R+F VYGP + MA F + G+S ++ KRDF Y+ D+A
Sbjct: 170 GLPATGLRFFTVYGPWG--RPDMA--LFKFTKAMLEGKSIDVY-NYGKMKRDFTYIDDIA 224

Query: 219 DVNL------------WFLENGVSG-------IFNLGTGRAESFQAVADATLAY-HKKGQ 258
+ + W +E G ++N+G A + +
Sbjct: 225 EAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAK 284

Query: 259 IEYIPFPDKLKGRYQAFTQADLTNLRAA-GYDKPFKTVAEGVTEYMAW 305
+P G T AD L G+ P TV +GV ++ W
Sbjct: 285 KNMLPLQ---PGDVL-ETSADTKALYEVIGF-TPETTVKDGVKNFVNW 327


61  EcSMS35_3975  EcSMS35_4020  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias     Product
EcSMS35_3975  3       15       1.289516    deoxyuridine 5'-triphosphate
EcSMS35_3976  1       12       1.103809    nucleoid occlusion protein
EcSMS35_3977  0       11       1.026227    orotate phosphoribosyltransferase
EcSMS35_3978  -1      10       0.640794    ribonuclease PH
EcSMS35_3979  -1      13       -0.036319   hypothetical protein
EcSMS35_3980  -2      12       0.622577    DNA-damage-inducible protein D
EcSMS35_3982  -3      10       1.545071    hypothetical protein
EcSMS35_3981  -2      11       2.745393    NAD-dependent DNA ligase LigB
EcSMS35_3983  -2      14       3.070204    guanylate kinase
EcSMS35_3984  -2      15       3.598779    DNA-directed RNA polymerase subunit omega
EcSMS35_3985  -2      14       3.436104    bifunctional (p)ppGpp synthetase II/
EcSMS35_3986  -1      14       3.182528    tRNA guanosine-2'-O-methyltransferase
EcSMS35_3987  -1      12       3.034477    ATP-dependent DNA helicase RecG
EcSMS35_3988  -1      12       1.924682    sodium/glutamate symporter
EcSMS35_3989  -2      11       2.174827    xanthine permease
EcSMS35_3990  -1      13       1.017185    AsmA family protein
EcSMS35_3991  0       19       -2.646707   hypothetical protein
EcSMS35_3992  0       19       -3.045211   ROK family protein
EcSMS35_3993  -1      15       -2.533473   hypothetical protein
EcSMS35_3994  -1      14       -2.959949   fructose-bisphosphate aldolase, class II
EcSMS35_3996  -1      15       -4.621738   PTS system, fructose family, IIA component
EcSMS35_3997  0       24       -9.467733   putative PTS regulatory protein
EcSMS35_3998  0       21       -8.558603   hypothetical protein
EcSMS35_3999  0       20       -7.272410   alpha-xylosidase YicI
EcSMS35_4000  2       25       -9.284550   putative transporter
EcSMS35_4002  4       31       -9.697981   *phage integrase family site specific
EcSMS35_4003  4       31       -9.576055   hypothetical protein
EcSMS35_4004  1       29       4.862692    phage polarity suppression protein
EcSMS35_4005  2       31       6.036010    glycoprotein 3, capsid size determination
EcSMS35_4006  1       35       8.054848    AlpA family transcriptional regulator
EcSMS35_4007  1       34       7.986362    phage protein
EcSMS35_4008  1       33       7.807397    hypothetical protein
EcSMS35_4009  0       25       4.761738    hypothetical protein
EcSMS35_4010  -2      20       3.324785    hypothetical protein
EcSMS35_4012  -2      16       2.058607    D5 family nucleoside triphosphatase
EcSMS35_4011  -1      20       -3.639725   hypothetical protein
EcSMS35_4013  0       21       -5.406028   hypothetical protein
EcSMS35_4014  0       22       -6.317880   sugar efflux transporter C
EcSMS35_4015  -1      27       -8.088262   carboxylate/amino acid/amine transporter
EcSMS35_4016  1       37       -11.280986  cytoplasmic membrane lipoprotein-28
EcSMS35_4017  2       39       -12.526482  hypothetical protein
EcSMS35_4018  2       42       -13.682469  putative type III secretion chaperone SicA
EcSMS35_4019  -1      22       -3.936983   hypothetical protein
EcSMS35_4020  -1      20       -3.376295   putative type III cell invasion protein SipB
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3976  HTHTETR       51     3e-10    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 51.2 bits (122), Expect = 3e-10
Identities = 40/182 (21%), Positives = 70/182 (38%), Gaps = 9/182 (4%)

Query: 1 MAEK-QTAKRNRREEILQSLALMLESSDGSQRITTAKLAASVGVSEAALYRHFPSKTRMF 59
MA K + + R+ IL AL L S G + ++A + GV+ A+Y HF K+ +F
Sbjct: 1 MARKTKQEAQETRQHILDV-ALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLF 59

Query: 60 DSLIEFIEDSLITRIN-LILKDEKDTTARLRLIVLLLL---GFGERNPGLTRILTGHALM 115
+ E E ++ K D + LR I++ +L ER L I+
Sbjct: 60 SEIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEF 119

Query: 116 F-EQDRLQGRINQLFERIEAQLRQVLREKRMREGEGYA-TDETLLASQILAFCEGMLSRF 173
E +Q L ++ Q L+ + A A + + G++ +
Sbjct: 120 VGEMAVVQQAQRNLCLESYDRIEQTLKHC-IEAKMLPADLMTRRAAIIMRGYISGLMENW 178

Query: 174 VR 175
+
Sbjct: 179 LF 180


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3987  SECA          42     9e-06    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 41.8 bits (98), Expect = 9e-06
Identities = 39/129 (30%), Positives = 57/129 (44%), Gaps = 18/129 (13%)

Query: 233 NLSMLALRAGAQRFHAQPLSANDALKNKLLAALPFKPTGAQARVAAEIERDM-ALDVPMM 291
LS L+ F A+ L + L+N + A A R A++ M DV ++
Sbjct: 37 KLSDEELKGKTAEFRAR-LEKGEVLENLIPEAF------AVVREASKRVFGMRHFDVQLL 89

Query: 292 ---RLVQGDV-----GSGKTLVAALAA-LRAIAHGKQVALMAPTELLAEQHANNFRNWFA 342
L + + G GKTL A L A L A+ GK V ++ + LA++ A N R F
Sbjct: 90 GGMVLNERCIAEMRTGEGKTLTATLPAYLNALT-GKGVHVVTVNDYLAQRDAENNRPLFE 148

Query: 343 PLGIEVGWL 351
LG+ VG
Sbjct: 149 FLGLTVGIN 157


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3997  PF08280       33     0.003    M protein trans-acting positive regulator
		>PF08280#M protein trans-acting positive regulator

Length = 530

Score = 32.9 bits (75), Expect = 0.003
Identities = 78/491 (15%), Positives = 168/491 (34%), Gaps = 73/491 (14%)

Query: 7 RQNRLLRFLLPRREYTTIVTIAGYLNVSEKTIQRDLRLLEQWL-GQWRINVEKRAGAGVM 65
+ +L+ + I +A ++ + L + + ++KR M
Sbjct: 45 SKCQLVVLFF-KTSSLPITEVAEKTGLTFLQLNHYCEELNAFFPDSLSMTIQKR-----M 98

Query: 66 LSAENIADLLHLDHLLGAECEEIDGVMNNARRVKIASQLLSETPNETSISKLSERYFISG 125
+ H ++ + + ++ +++ + L+ + ++ + +F+S
Sbjct: 99 I-------SCQFTHP--SKETYLYQLYASSNVLQLLAFLIKNGSHSRPLTDFARSHFLSN 149

Query: 126 ASIVNDLRVIESWLAPLGLSLIRSPSGTHIEGSEGQVRQAMALLINGIINHNEPQGVVYS 185
+S + L L L S I G E ++R +ALL G+
Sbjct: 150 SSAYRMREALIPLLRNFELKL----SKNKIVGEEYRIRYLIALL-------YSKFGIKVY 198

Query: 186 RLDPGSYKALVHYFGEEEVLFVQSLLLDMENELSWSLGEPYYVNIFTHILIMMYRNTHGN 245
L K ++H F L S L LS E + F IL+ + H
Sbjct: 199 DLTQQD-KNIIHSF-----LSHSSTHLKTSPWLS----ESFS---FYDILLALSWKRHQF 245

Query: 246 ALSREEDQTRQYDENIF---NVASQMIHKIEQRIAHTLPDDEVWFIYQ-YIISSGVAIDG 301
+++ + + Q + +F ++ IE ++ ++Y YI ++
Sbjct: 246 SVTIPQTRIFQQLKKLFVYDSLKKSSRDIIETYCQLNFSAGDLDYLYLIYITANNSFASL 305

Query: 302 Q---KDVSIISHMQASNEA-RLITWRLITVFSDIVD---------CDFSEDSALYDGLMV 348
Q + + + N+ RL+ +IT+ ++ + FS+ S L++ +
Sbjct: 306 QWTPEHIRQCCQLFEENDTFRLLLNPIITLLPNLKEQKASLVKALMFFSK-SFLFN--LQ 362

Query: 349 HIKPLINRLNYRIHIRNPLLEDIKAELADVWRLTQYVVNQVFKTWGENAVSEDEVGYLTV 408
H P N + N L + + W + K G+ ++
Sbjct: 363 HFIPETNLFVSPYYKGNQKLYTSLKLIVEEW---------MAKLPGKRYLNHKHFHLFCH 413

Query: 409 HFQAAMERQIARKRVLLVCSTGIGTSHLLKSRILRAFPEWTI---VDVISAANLSQVLPD 465
+ + + V+ V S I +HLL R F + +I + N+ Q+
Sbjct: 414 YVEQILRNIQPPLVVVFVASNFI-NAHLLTDSFPRYFSDKSIDFHSYYLLQDNVYQIPDL 472

Query: 466 NIELIISTINL 476
+L+I+ L
Sbjct: 473 KPDLVITHSQL 483


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_4008  ANTHRAXTOXNA  28     0.006    Anthrax toxin LF subunit signature.
		>ANTHRAXTOXNA#Anthrax toxin LF subunit signature.

Length = 800

Score = 27.8 bits (61), Expect = 0.006
Identities = 10/34 (29%), Positives = 19/34 (55%), Gaps = 4/34 (11%)

Query: 52 QLYIPHIFSYLNE----DIDFVLNELKAKGLCRD 81
+LY P +F Y+N+ + + LK +G+ +D
Sbjct: 258 ELYAPDMFEYMNKLEKGGFEKISESLKKEGVEKD 291


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_4010  2FE2SRDCTASE  26     0.036    Ferric iron reductase signature.
		>2FE2SRDCTASE#Ferric iron reductase signature.

Length = 262

Score = 25.8 bits (56), Expect = 0.036
Identities = 12/31 (38%), Positives = 17/31 (54%)

Query: 18 VACAWLTVCERQHRYPHLTLESLEAAIAAEL 48
VAC W+ VCE ++ PH +E I+ L
Sbjct: 135 VACFWVDVCEDKNATPHSPQHRMETLISQAL 165


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_4014  TCRTETA       41     7e-06    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 40.6 bits (95), Expect = 7e-06
Identities = 71/391 (18%), Positives = 129/391 (32%), Gaps = 33/391 (8%)

Query: 20 LLVAFLTGIAGALQTPTLSIFLADELKARPIM--VGFFFTGSAIMGILVSQFLARHSDKQ 77
L L + L P L L D + + + G A+M + L SD+
Sbjct: 11 LSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRF 70

Query: 78 GDRKLLILLCCLFGVLACTLFAWNRNYFILLSTGVLLSSFASTANPQMFALAREHADRTG 137
G R ++L+ + + A ++L ++ A AD T
Sbjct: 71 GRR-PVLLVSLAGAAVDYAIMATAPFLWVLYIGRIV----AGITGATGAVAGAYIADITD 125

Query: 138 RET-VMFSTFLRAQISLAWVIGPPLAYELAMGFSFKVMYLTAAIAFVVCGLIVWLFLP-- 194
+ F+ A V GP L + GFS + AA + L LP
Sbjct: 126 GDERARHFGFMSACFGFGMVAGPVLGGLMG-GFSPHAPFFAAAALNGLNFLTGCFLLPES 184

Query: 195 --SIQRNIPVVT-QPVEILPSIHRKRDTRLLFVVCSMMWAANNLYMINMPLFIIDELHLT 251
+R + P+ L V +M + +F D H
Sbjct: 185 HKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWD 244

Query: 252 DKLAGEMI-GIAAGLEIPMMLIAGYYMKRIGKRLLMLIAIVSGMCFYASVLMATTPAIEL 310
G + + +I G R+G+R +++ +++ Y +L+A +
Sbjct: 245 ATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGY--ILLAFATRGWM 302

Query: 311 ELQILNAIFLGILCGIGMLYFQDLMPEKI---------GSATTLYANTSRVGWIIAGSVD 361
+ L GIGM Q ++ ++ GS L + TS VG ++ ++
Sbjct: 303 ---AFPIMVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIY 359

Query: 362 GIMVEIWSYHALFWLAIGMLGIAMICLLFIK 392
+ W+ W I + ++CL ++
Sbjct: 360 AASITTWNG----WAWIAGAALYLLCLPALR 386



Score = 32.1 bits (73), Expect = 0.004
Identities = 18/102 (17%), Positives = 34/102 (33%)

Query: 17 AAFLLVAFLTGIAGALQTPTLSIFLADELKARPIMVGFFFTGSAIMGILVSQFLARHSDK 76
AA + V F+ + G + IF D +G I+ L +
Sbjct: 213 AALMAVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAA 272

Query: 77 QGDRKLLILLCCLFGVLACTLFAWNRNYFILLSTGVLLSSFA 118
+ + ++L + L A+ ++ VLL+S
Sbjct: 273 RLGERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLASGG 314


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_4018  SYCDCHAPRONE  100    2e-29    Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 100 bits (249), Expect = 2e-29
Identities = 34/164 (20%), Positives = 69/164 (42%), Gaps = 5/164 (3%)

Query: 5 NLDLEENKEIASKFERALGMGATLAELHGITPDTLEGVYAYAYNFYEKGRLDEAELFFKF 64
+ + +E E L G T+A L+ I+ DTLE +Y+ A+N Y+ G+ ++A F+
Sbjct: 2 QQETTDTQEYQLAMESFLKGGGTIAMLNEISSDTLEQLYSLAFNQYQSGKYEDAHKVFQA 61

Query: 65 LCIYDFQNYNYLKGYAAVCQLKKDYQKAFDMYHICLMLSPDNDFSLVYYMGQCQMGLKNI 124
LC+ D + + G A Q Y A Y ++ + ++ +C + +
Sbjct: 62 LCVLDHYDSRFFLGLGACRQAMGQYDLAIHSYSYGAIMDIK-EPRFPFHAAECLLQKGEL 120

Query: 125 KMATELFNT----VVTYSQNEKIKEMATTYLELLTANSEEEVQT 164
A + ++ +++ ++ LE + E E +
Sbjct: 121 AEAESGLFLAQELIADKTEFKELSTRVSSMLEAIKLKKEMEHEC 164


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_4020  BACINVASINB   115    4e-29    Salmonella/Shigella invasin protein B signature.
		>BACINVASINB#Salmonella/Shigella invasin protein B signature.

Length = 593

Score = 115 bits (288), Expect = 4e-29
Identities = 117/544 (21%), Positives = 227/544 (41%), Gaps = 37/544 (6%)

Query: 60 NKPMLAPPTIQVSDSDNATTAKTNDARLTMILGNLTGIADQDITTRLHNNLDSTLLRHEM 119
N L PPT + ++ + +LT++LG L + ++L + L E
Sbjct: 64 NTVGLKPPTDAAREKLSS------EGQLTLLLGKLMTLLGDVSLSQLESRLAVWQAMIES 117

Query: 120 AHNKFRELSDAYSSSLDDAQKADDIMHQANNNYNAVDKKVQSLEKKVNTLNQELSQLQPG 179
++S + ++L +AQ+A D+ + + + KK+ +L L P
Sbjct: 118 QKEMGIQVSKEFQTALGEAQEATDLYEASIKKTDTAKSVYDAATKKLTQAQNKLQSLDPA 177

Query: 180 DPQYNKVLTQKNAAEKTLTLSLQKKSLAEKSLNTAIMDADAAIGQSMEIFDEIQQQEQIN 239
DP Y + A K T + + A + A DA A ++ I + Q
Sbjct: 178 DPGYAQAEAAVEQAGKEATEAKEALDKATDATVKAGTDAKAKAEKADNILTKFQ---GTA 234

Query: 240 NFTTNICLTQENQKNRNATATFILLITSVMEVIGDTNCDSIKNQSEVMKEINHVRENKLN 299
N + ++Q Q N + A +L+ +E++G +S++N + + R+ ++
Sbjct: 235 NAASQNQVSQGEQDNLSNVARLTMLMAMFIEIVGKNTEESLQNDLALFNALQEGRQAEME 294

Query: 300 ETARKYTTTTKVLKIVNECVTVVTFAVSAVLIVVGLLAAVPSGGSSIAGALALIGGIAGA 359
+ + ++ T+ + N + + + A+L +V ++AAV +GG+S+A A A
Sbjct: 295 KKSAEFQEETRKAEETNRIMGCIGKVLGALLTIVSVVAAVFTGGASLALA---------A 345

Query: 360 VVLGVDITCQIALGTTATGWILGKVVEGLSAAIKTVDPTL-LAITALLDVIGVDQDTIEL 418
V L V + +I T +I + + +K + + AIT L+ +GVD+ T E+
Sbjct: 346 VGLAVMVADEIVKAATGVSFIQQALNPIMEHVLKPLMELIGKAITKALEGLGVDKKTAEM 405

Query: 419 VKSIYASAAASIVMATVMIGAAVICSVAIGAVVSALSKTAAEEVTKEITSTIK---STIE 475
SI + A+I M V++ AV+ A + +ALSK E + K + + +K
Sbjct: 406 AGSIVGAIVAAIAMVAVIVVVAVVGKGAAAKLGNALSKMMGETIKKLVPNVLKQLAQNGS 465

Query: 476 SIINSVSKNIIKVLDSVCS--VLQTSAVVLKLIAKISNGLEKIGLLICAIATSTMNC--- 530
+ + I L +V S LQT+A+ +L + N L K+ L + T+ +
Sbjct: 466 KLFTQGMQRITSGLGNVGSKMGLQTNALSKEL---VGNTLNKVALGMEVTNTAAQSAGGV 522

Query: 531 -------FVAGNSADMAILQQDMSNLSKTREQMLSVLQRVDKTVEQEVSQMVRVLQHRTE 583
+ AD + + M + + +Q + + K + M +Q +
Sbjct: 523 AEGVFIKNASEALADFMLARFAMDQIQQWLKQSVEIFGENQKVTAELQKAMSSAVQQNAD 582

Query: 584 ALKF 587
A +F
Sbjct: 583 ASRF 586


62  EcSMS35_4031  EcSMS35_4037  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias     Product
EcSMS35_4031  0       16       -3.217388   sugar phosphate antiporter
EcSMS35_4032  0       21       -4.240415   regulatory protein UhpC
EcSMS35_4033  -1      20       -3.619791   sensory histidine kinase UhpB
EcSMS35_4034  -1      20       -4.104585   DNA-binding transcriptional activator UhpA
EcSMS35_4035  -1      21       -4.629704   hypothetical protein
EcSMS35_4036  0       14       -1.450682   hypothetical protein
EcSMS35_4037  0       14       3.221386    acetolactate synthase 1 regulatory subunit
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_4031  TCRTETB       35     7e-04    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 34.9 bits (80), Expect = 7e-04
Identities = 61/372 (16%), Positives = 133/372 (35%), Gaps = 32/372 (8%)

Query: 49 FNIAQNDMISTYGLSMTQLGMIGLGFSITYGVGKTLVSYYADGKNTKQFLPFMLILSAIC 108
N++ D+ + + + F +T+ +G + +D K+ L F +I++ C
Sbjct: 33 LNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIIN--C 90

Query: 109 MLGFSASMGSGSVSLFLMIAFYALSGFFQSTGGSCSYSTI----TKWTPRRKRGTFLGFW 164
+G SL +M + F Q G + + + ++ P+ RG G
Sbjct: 91 FGSVIGFVGHSFFSLLIM------ARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLI 144

Query: 165 NISHNLGGA-----GAAGVALFGANYLFDGHVIGMFIFPSIIALIVGFIGLRYGSDSPES 219
+G G +YL +I + P ++ L+ + ++ D
Sbjct: 145 GSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDIKGI 204

Query: 220 YGLGKAEELFGEEISEEDKETESTDMTKWQIFVEYVLK--NKVIWLLCFANI-FLYVVRI 276
+ F + + + IFV+++ K + + NI F+ V
Sbjct: 205 ILMSVGIVFFMLFTTSYSISFLIVSVLSFLIFVKHIRKVTDPFVDPGLGKNIPFMIGVLC 264

Query: 277 GIDQWSTVYAFQELKLSKAVAIQGFTLFEAG------ALVGTLLWGWLSDLANGRRG--L 328
G + TV F + + + E G + +++G++ + RRG
Sbjct: 265 GGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYIGGILVDRRGPLY 324

Query: 329 VACIALALIIA---TLGVYQHASNQYIYLASLFALGFLVFGPQLLIGVAAVGFVPKKAIG 385
V I + + T ++ ++ + +F LG L F ++ + + ++A G
Sbjct: 325 VLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGLSFTKTVISTIVSSSLKQQEA-G 383

Query: 386 AADGIKGTFAYL 397
A + ++L
Sbjct: 384 AGMSLLNFTSFL 395


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_4032  TCRTETB       40     1e-05    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 40.2 bits (94), Expect = 1e-05
Identities = 65/408 (15%), Positives = 137/408 (33%), Gaps = 60/408 (14%)

Query: 29 RHILLTIWLGYALFY--FTRKSFNAAVPEILANGVLSRSDIGLLATLFYITYGVSKFVSG 86
RH + IWL F+ N ++P+I + + + T F +T+ + V G
Sbjct: 11 RHNQILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYG 70

Query: 87 IVSDRSNARYFMGIGLIATGIINILFGFSTSLWAFAVLWVLNAFFQGWGS---PVCARLL 143
+SD+ + + G+I +++ S F L ++ F QG G+ P ++
Sbjct: 71 KLSDQLGIKRLLLFGIIINCFGSVIGFVGHS---FFSLLIMARFIQGAGAAAFPALVMVV 127

Query: 144 TAWY-SRTERGGWWALWNTAHNVGGALIPIVMAASALHYGWRAGMMIAGCMAIVVGIFLC 202
A Y + RG + L + +G + P + A + W ++ M ++ +
Sbjct: 128 VARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHW--SYLLLIPMITIITVPF- 184

Query: 203 WRLRDRPQALGLPAVGEWRHDALEIAQQQEGAGLTRKEILTKYVLLNPYIWLLSFCYVLV 262
+ L +I G L I+ + Y VL
Sbjct: 185 --------LMKLLKKEVRIKGHFDIK----GIILMSVGIVFFMLFTTSYSISFLIVSVLS 232

Query: 263 YVV-----RAAINDWGNLYMSETLGVDLVTANTAVTMFELGGFI-----------GALVA 306
+++ R + + + + + + + + + GF+ A
Sbjct: 233 FLIFVKHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTA 292

Query: 307 GWGSDKLFNGNRGPMNLIFAAGILL-SVGSLWLMPFASYVMQATCFFTIGFFVFGPQMLI 365
GS +F G + + GIL+ G L+++ + + F T F + +
Sbjct: 293 EIGSVIIFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFL-SVSFLTASFLLETTSWFM 351

Query: 366 ---------GMAAAECS---------HKEAAGAATGFVGLFAYLGASL 395
G++ + ++ AGA + ++L
Sbjct: 352 TIIIVFVLGGLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGT 399


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_4033  PF06580       40     1e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 40.2 bits (94), Expect = 1e-05
Identities = 32/166 (19%), Positives = 68/166 (40%), Gaps = 18/166 (10%)

Query: 341 KQSGQLIEHLSLGVYDAVRRLLGRLRPRQLDDLTLEQAIRSLMREMELEGRGIVSHLEWR 400
++ +++ LS + +R L RQ+ +L + + ++L L++
Sbjct: 191 TKAREMLTSLS----ELMRYSLRYSNARQV---SLADELTVVDSYLQLASIQFEDRLQFE 243

Query: 401 IDESALSENQRVTLFRVCQEGLNNIVKHA-----DASAVTLQGWQQDERLMLVIEDDGSG 455
+ + +V + Q + N +KH + L+G + + + L +E+ GS
Sbjct: 244 NQINPAIMDVQVPPM-LVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSL 302

Query: 456 LPPDSGQ-HGFGLTGMRERVTALGG---TLTISCLHG-TRVSVSLP 496
++ + G GL +RER+ L G + +S G V +P
Sbjct: 303 ALKNTKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_4034  HTHFIS        61     2e-13    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 61.4 bits (149), Expect = 2e-13
Identities = 29/174 (16%), Positives = 59/174 (33%), Gaps = 20/174 (11%)

Query: 2 ITVALIDDHLIVRSGFAQLLGLEPDLQVVAEFGSGREALAGLPGRGVQVCICDISMPDIS 61
T+ + DD +R+ Q L V + + + + D+ MPD +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRA-GYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 62 GLELLSQLPK---GMATIMLSVHDSPALVEQALNAGARGFLSKRCSPDELIAAVHTVATG 118
+LL ++ K + +++S ++ +A GA +L K ELI +
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR---- 117

Query: 119 GCYLTPDIAIKLASGRQDPLTKRERQVAEKLAQG---MAVKEIAAELGLSPKTV 169
A+ R L + + + + + A L + T+
Sbjct: 118 --------ALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTL 163


63  EcSMS35_4078  EcSMS35_4103  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias     Product
EcSMS35_4078  0       16       -6.645213   DNA-binding transcriptional regulator YidZ
EcSMS35_4079  1       18       -7.768207   hypothetical protein
EcSMS35_4080  1       25       -11.500997  NADPH-dependent FMN reductase
EcSMS35_4081  3       36       -15.315197  sulfate permease family inorganic anion
EcSMS35_4082  5       44       -18.785211  6-phosphogluconate phosphatase
EcSMS35_4083  5       43       -18.538397  hypothetical protein
EcSMS35_4084  1       31       -14.917785  hypothetical protein
EcSMS35_4085  0       25       -12.235472  hypothetical protein
EcSMS35_4086  0       15       -7.760302   hypothetical protein
EcSMS35_4087  -3      28       1.463358    hypothetical protein
EcSMS35_4088  -1      21       -0.344202   transcriptional regulator PhoU
EcSMS35_4089  1       14       -1.763744   phosphate transporter ATP-binding protein
EcSMS35_4090  1       14       -2.693582   phosphate transporter permease subunit PtsA
EcSMS35_4091  3       17       -3.604204   phosphate transporter permease subunit PstC
EcSMS35_4092  2       16       -3.187104   phosphate ABC transporter periplasmic
EcSMS35_4093  0       15       -2.735176   long polar fimbrial operon protein LpfD
EcSMS35_4094  0       13       -1.280180   long polar fimbrial operon protein LpfC
EcSMS35_4095  2       27       0.817967    long polar fimbrial operon protein LpfB
EcSMS35_4096  2       30       1.935119    long polar fimbrial operon protein LpfA
EcSMS35_4097  3       37       2.290415    glucosamine--fructose-6-phosphate
EcSMS35_4098  4       36       2.257240    bifunctional N-acetylglucosamine-1-phosphate
EcSMS35_4099  5       40       2.232490    F0F1 ATP synthase subunit epsilon
EcSMS35_4100  5       42       2.147018    F0F1 ATP synthase subunit beta
EcSMS35_4101  3       34       1.186911    F0F1 ATP synthase subunit gamma
EcSMS35_4102  4       36       0.971183    F0F1 ATP synthase subunit alpha
EcSMS35_4103  2       22       -0.420769   F0F1 ATP synthase subunit delta
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_4094  PF00577       765    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 765 bits (1978), Expect = 0.0
Identities = 329/867 (37%), Positives = 489/867 (56%), Gaps = 60/867 (6%)

Query: 15 CLIFSQSLMAEVSV------FNPALLEIDHQSGVDIRQFNRANLMPPGVYSVDIFINGKM 68
L + + A+ + FNP L D Q+ D+ +F +PPG Y VDI++N
Sbjct: 29 RLFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPGTYRVDIYLNNGY 88

Query: 69 FERQDVTFVQDNPDADLHACFVAIKKTLTTFGVKVDALKSLNDVDETVCIDPGPRIEGSS 128
+DVTF + + + C L + G+ ++ +N + + C+ I ++
Sbjct: 89 MATRDVTFNTGDSEQGIVPCLTR--AQLASMGLNTASVSGMNLLADDACVPLTSMIHDAT 146

Query: 129 WQFDSDKLQLNISIPQIYMDAMAYDYISPSRWDEGINALTINYDFSGSHTLRSDYGSQET 188
Q D + +LN++IPQ +M A YI P WD GINA +NY+FSG+ + +
Sbjct: 147 AQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSV--QNRIGGNS 204

Query: 189 DTSYLNLRNGLNIGPWRLRNYSTLN------TTDGSAEYNSISTWIQRDIAALRSQIMIG 242
+YLNL++GLNIG WRLR+ +T + ++ ++ I+TW++RDI LRS++ +G
Sbjct: 205 HYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRSRLTLG 264

Query: 243 DTWTASDIFDSTQIRGARLYTDNDMLPASQNGFAPVVRGIAKSNATVIIRQNGYVIYQSA 302
D +T DIFD RGA+L +D++MLP SQ GFAPV+ GIA+ A V I+QNGY IY S
Sbjct: 265 DGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIYNST 324

Query: 303 VPQGAFEITDLNTASTGGDLDVTIKEEDGSEQRFTQPYASLAILKREGQTDVDVSVGELR 362
VP G F I D+ A GDL VTIKE DGS Q FT PY+S+ +L+REG T ++ GE R
Sbjct: 325 VPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITAGEYR 384

Query: 363 DEDG--FTPDVIQAQILHGFPYGFTLYGGMQAAEKYGSAALGVGKDLGALGAISFDVTHA 420
+ P Q+ +LHG P G+T+YGG Q A++Y + G+GK++GALGA+S D+T A
Sbjct: 385 SGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALSVDMTQA 444

Query: 421 RAKFSHDDTETGQSYRFLYSKRFDDTDTSLRLVGYRYSTEGYYTLNEWASRRNN------ 474
+ D GQS RFLY+K +++ T+++LVGYRYST GY+ + R N
Sbjct: 445 NSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGYNIET 504

Query: 475 -----------PEDFWETGNRRSRVEGTLTQSLGRDYGNLYLTLSRQQYWHTDDVERLMQ 523
+ + N+R +++ T+TQ LGR LYL+ S Q YW T +V+ Q
Sbjct: 505 QDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGR-TSTLYLSGSHQTYWGTSNVDEQFQ 563

Query: 524 FGYSSSWKRLSWNVSWSYSNTARQGTGNNHASDNTSEQIYMLSLSVPLSGW--------W 575
G +++++ ++W +S+S + A +Q+ L++++P S W W
Sbjct: 564 AGLNTAFEDINWTLSYSLTKN---------AWQKGRDQMLALNVNIPFSHWLRSDSKSQW 614

Query: 576 GNSYATYSVSQNDNSGSSHQLGLSGTALERNNLSWNLMQSYNSHDDEVGGN---MSLTYD 632
++ A+YS+S + N ++ G+ GT LE NNLS+++ Y D G+ +L Y
Sbjct: 615 RHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYR 674

Query: 633 GTYGTVNGSYNYSQNSQRLNYGIRGGILAHSEGVTLSQELGETIALVKAPGAAGLEIDNM 692
G YG N Y++S + ++L YG+ GG+LAH+ GVTL Q L +T+ LVKAPGA +++N
Sbjct: 675 GGYGNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNDTVVLVKAPGAKDAKVENQ 734

Query: 693 RGAATDWRGYTVKTQLNPYDENRVAISDNYFSKSNIELDNTVVTMVPTRGAVVKAEFVTR 752
G TDWRGY V Y ENRVA+ N + N++LDN V +VPTRGA+V+AEF R
Sbjct: 735 TGVRTDWRGYAVLPYATEYRENRVALDTNTLA-DNVDLDNAVANVVPTRGAIVRAEFKAR 793

Query: 753 VGYRVLFRVAGTKGKPAPFGAIATVQNTSSADSGIVGDLGELYLSGLPEKGQVMLSWGEN 812
VG ++L + KP PFG A V + SS SGIV D G++YLSG+P G+V + WGE
Sbjct: 794 VGIKLLMTLTH-NNKPLPFG--AMVTSESSQSSGIVADNGQVYLSGMPLAGKVQVKWGEE 850

Query: 813 AATTCTFDYSISIPESESGLIEQGVTC 839
C +Y + + L + C
Sbjct: 851 ENAHCVANYQLPPESQQQLLTQLSAEC 877


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_4098  RTXTOXINA     29     0.048    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 29.2 bits (65), Expect = 0.048
Identities = 23/80 (28%), Positives = 31/80 (38%), Gaps = 10/80 (12%)

Query: 367 LGDAEIGDNVNIGAGTITCNYDGANKFKTIIGDDVFVGSDTQLVAPVTVGKGATIAAGTT 426
LGD + D V + AG+ N G DV T G AT A T
Sbjct: 616 LGDGD--DKVFLSAGSA--NIYAGK------GHDVVYYDKTDTGYLTIDGTKATEAGNYT 665

Query: 427 VTRNVGENALAISRVPQTQK 446
VTR +G + + V + Q+
Sbjct: 666 VTRVLGGDVKVLQEVVKEQE 685


64  EcSMS35_4133  EcSMS35_4139  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias     Product
EcSMS35_4133  -2      15       3.311377    putative ATP-dependent protease
EcSMS35_4134  0       20       4.394503    acetolactate synthase 2 catalytic subunit
EcSMS35_4135  0       26       4.359439    acetolactate synthase 2 regulatory subunit
EcSMS35_4136  0       27       4.358686    branched-chain amino acid aminotransferase
EcSMS35_4137  1       22       3.988814    dihydroxy-acid dehydratase
EcSMS35_4139  1       20       3.379339    threonine dehydratase
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_4133  HTHFIS        34     0.001    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 34.0 bits (78), Expect = 0.001
Identities = 40/196 (20%), Positives = 62/196 (31%), Gaps = 51/196 (26%)

Query: 170 KHALERPKPTDAVSRALQHDLSDVVGQEQG----KRGLEITAAGGHNLLLIGPPGTGKTM 225
AL PK + D +VG+ R L L++ G GTGK +
Sbjct: 116 GRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKEL 175

Query: 226 LASRINGLLPDLSNEEALESAAILSLVNAESVQKQWRQRPFRSPHHSA--------SLTA 277
+A ++ R PF + + +A L
Sbjct: 176 VARALHDYGK-------------------------RRNGPFVAINMAAIPRDLIESELFG 210

Query: 278 MVGG---GAIP-GPGEISLAHNGVLFLDEL----PEFERRTLDALREPIESGQIHLSRTR 329
G GA G A G LFLDE+ + + R L L++ G+
Sbjct: 211 HEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDMPMDAQTRLLRVLQQ----GEYT--TVG 264

Query: 330 AKITYPARFQLVAAMN 345
+ + ++VAA N
Sbjct: 265 GRTPIRSDVRIVAATN 280


65  EcSMS35_4173  EcSMS35_4184  Y  N  N  Genomic Island
LocusTag      DNBias  CDNBias  %GCBias     Product
EcSMS35_4173  0       19       3.255860    putative lipoprotein
EcSMS35_4174  -2      20       4.659725    putative lipoprotein
EcSMS35_4175  -2      16       3.509324    diaminopimelate epimerase
EcSMS35_4176  -2      16       2.485409    hypothetical protein
EcSMS35_4177  -2      16       2.028558    site-specific tyrosine recombinase XerC
EcSMS35_4178  -2      15       1.727487    flavin mononucleotide phosphatase
EcSMS35_4179  -1      13       -0.431532   DNA-dependent helicase II
EcSMS35_4180  0       16       -7.640223   hypothetical protein
EcSMS35_4181  -1      15       -6.945489   hypothetical protein
EcSMS35_4182  -1      14       -6.354345   hypothetical protein
EcSMS35_4183  -1      11       -5.166194   magnesium/nickel/cobalt transporter CorA
EcSMS35_4184  0       15       -3.391440   hypothetical protein
66  EcSMS35_4223  EcSMS35_4258  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias     Product
EcSMS35_4223  0       15       3.305054    DNase TatD
EcSMS35_4222  -1      18       3.076926    transcriptional activator RfaH
EcSMS35_4224  -1      17       3.360344    3-octaprenyl-4-hydroxybenzoate decarboxylase
EcSMS35_4225  -2      18       3.132798    FMN reductase
EcSMS35_4226  -2      19       3.176761    3-ketoacyl-CoA thiolase
EcSMS35_4227  -2      18       2.031174    multifunctional fatty acid oxidation complex
EcSMS35_4228  -2      15       1.121465    proline dipeptidase
EcSMS35_4229  -1      15       0.288515    hypothetical protein
EcSMS35_4230  0       15       -0.961705   potassium transporter
EcSMS35_4231  -1      15       -2.292665   protoporphyrinogen oxidase
EcSMS35_4237  0       20       -3.623380   *molybdopterin-guanine dinucleotide biosynthesis
EcSMS35_4238  -2      18       -4.417895   molybdopterin-guanine dinucleotide biosynthesis
EcSMS35_4239  -2      18       -6.096848   hypothetical protein
EcSMS35_4240  -2      13       -2.763548   serine/threonine protein kinase
EcSMS35_4241  -2      14       -2.684632   periplasmic protein disulfide isomerase I
EcSMS35_4242  -1      14       -2.085707   hypothetical protein
EcSMS35_4244  0       15       -1.248936   putative acyltransferase
EcSMS35_4245  1       15       0.028964    hypothetical protein
EcSMS35_4246  1       16       1.105527    DNA polymerase I
EcSMS35_4247  3       20       1.272262    hypothetical protein
EcSMS35_4248  2       18       2.699403    ribosome biogenesis GTP-binding protein YsxC
EcSMS35_4249  0       17       3.049260    hypothetical protein
EcSMS35_4250  -1      17       2.778696    hypothetical protein
EcSMS35_4251  2       24       2.412447    coproporphyrinogen III oxidase
EcSMS35_4252  2       22       2.050341    hypothetical protein
EcSMS35_4253  2       20       1.041561    nitrogen regulation protein NR(I)
EcSMS35_4254  1       20       -1.012072   nitrogen regulation protein NR(II)
EcSMS35_4255  3       21       -2.645567   glutamine synthetase
EcSMS35_4256  1       13       -4.225461   GTP-binding protein
EcSMS35_4257  0       12       -4.950953   GntR family transcriptional regulator
EcSMS35_4258  -1      10       -3.242991   AP endonuclease
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_4249  SECA          30     0.005    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 29.8 bits (67), Expect = 0.005
Identities = 11/71 (15%), Positives = 31/71 (43%)

Query: 14 AKARRKTREELDQEARDRKRQKKRRGHAPGSRAAGGNTSSGSKGQNAPKDPRIGSKTPIP 73
+K + + EE+++ + R+ + +R ++++ + + ++G P P
Sbjct: 827 SKVQVRMPEEVEELEQQRRMEAERLAQMQQLSHQDDDSAAAAALAAQTGERKVGRNDPCP 886

Query: 74 LGVTEKVTKQH 84
G +K + H
Sbjct: 887 CGSGKKYKQCH 897


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_4253  HTHFIS        603    0.0      FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 603 bits (1556), Expect = 0.0
Identities = 206/478 (43%), Positives = 300/478 (62%), Gaps = 11/478 (2%)

Query: 4 MQRGIVWVVDDDSSIRWVLERALAGAGLTCTTFENGAEVLEALASKTPDVLLSDIRMPGM 63
M + V DDD++IR VL +AL+ AG N A + +A+ D++++D+ MP
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 64 DGLALLKQIKQRHPMLPVIIMTAHSDLDAAVSAYQQGAFDYLPKPFDIDEAVALVERAIS 123
+ LL +IK+ P LPV++M+A + A+ A ++GA+DYLPKPFD+ E + ++ RA++
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 124 HYQEQQQPRNVQLNGPTTDIIGEAPAMQDVFRIIGRLSRSSISVLINGESGTGKELVAHA 183
+ + ++G + AMQ+++R++ RL ++ ++++I GESGTGKELVA A
Sbjct: 121 EPKRRPSKLEDDSQDGM-PLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARA 179

Query: 184 LHRHSPRAKAPFIALNMAAIPKDLIESELFGHEKGAFTGANTIRQGRFEQADGGTLFLDE 243
LH + R PF+A+NMAAIP+DLIESELFGHEKGAFTGA T GRFEQA+GGTLFLDE
Sbjct: 180 LHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDE 239

Query: 244 IGDMPLDVQTRLLRVLADGQFYRVGGYAPVKVDVRIIAATHQNLEQRVQEGKFREDLFHR 303
IGDMP+D QTRLLRVL G++ VGG P++ DVRI+AAT+++L+Q + +G FREDL++R
Sbjct: 240 IGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYR 299

Query: 304 LNVIRVHLPPLRERREDIPRLARHFLQVAARELGVEAKLLHPETEAALTRLAWPGNVRQL 363
LNV+ + LPPLR+R EDIP L RHF+Q A +E G++ K E + WPGNVR+L
Sbjct: 300 LNVVPLRLPPLRDRAEDIPDLVRHFVQQAEKE-GLDVKRFDQEALELMKAHPWPGNVREL 358

Query: 364 ENTCRWLTVMAAGQEVLIQDLPGELFESTVAESTSQMQPDSWATLLAQWADRALRS---- 419
EN R LT + + + + EL + S + ++Q + +R
Sbjct: 359 ENLVRRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFAS 418

Query: 420 -----GHQNLLSEAQPELERTLLTTALRHTQGHKQEAARLLGWGRNTLTRKLKELGME 472
L E+E L+ AL T+G++ +AA LLG RNTL +K++ELG+
Sbjct: 419 FGDALPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGVS 476


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_4256  TCRTETOQM     180    4e-51    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 180 bits (458), Expect = 4e-51
Identities = 97/445 (21%), Positives = 170/445 (38%), Gaps = 81/445 (18%)

Query: 4 KLRNIAIIAHVDHGKTTLVDKLLQQSGTFDSRAETQE--RVMDSNDLEKERGITILAKNT 61
K+ NI ++AHVD GKTTL + LL SG + D+ LE++RGITI T
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 62 AIKWNDYRINIVDTPGHADFGGEVERVMSMVDSVLLVVDAFDGPMPQTRFVTKKAFAYGL 121
+ +W + ++NI+DTPGH DF EV R +S++D +L++ A DG QTR + G+
Sbjct: 62 SFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGI 121

Query: 122 KPIVVINKVDRPGARPDWVVDQVFD-------------LFVNLDATDEQLD--------- 159
I INK+D+ G V + + L+ N+ T+
Sbjct: 122 PTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEG 181

Query: 160 --------------------------------FPIVYASALNGIAGLDHEDMAEDMTPLY 187
FP+ + SA N I G+D+ L
Sbjct: 182 NDDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNI-GIDN---------LI 231

Query: 188 QAIVDHVPAPDVDLDGPFQMQISQLDYNSYVGVIGIGRIKRGKVKPNQQVTIIDSEGKTR 247
+ I + + ++ +++Y+ + R+ G + V I + E
Sbjct: 232 EVITNKFYSSTHRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEKEKI-- 289

Query: 248 NAKVGKVLGHLGLERIETDLAEAGDIVAITGLGELNISDTVCDTQNVEALPALSVDEPTV 307
K+ ++ + E + D A +G+IV + L ++ + DT+ + + P +
Sbjct: 290 --KITEMYTSINGELCKIDKAYSGEIVILQNEF-LKLNSVLGDTKLLPQRERIENPLPLL 346

Query: 308 SMFFCVNTSPFCGKEGKFVTSRQILDRLNKELVHNVALRVEETEDADAFRVSGRGELHLS 367
+ + D L LR +S G++ +
Sbjct: 347 QTTVEPSKPQQREMLLDALLEISDSDPL---------LRYYVDSATHEIILSFLGKVQME 397

Query: 368 VLIENMRRE-GFELAVSRPKVIFRE 391
V ++ + E+ + P VI+ E
Sbjct: 398 VTCALLQEKYHVEIEIKEPTVIYME 422



Score = 32.5 bits (74), Expect = 0.005
Identities = 13/75 (17%), Positives = 29/75 (38%), Gaps = 1/75 (1%)

Query: 398 EPYENVTLDVEEQHQGSVMQALGERKGDLKNMNPDGKGRVRLDYVIPSRGLIGFRSEFMT 457
EPY + + +++ + ++ + V L IP+R + +RS+
Sbjct: 537 EPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQLKN-NEVILSGEIPARCIQEYRSDLTF 595

Query: 458 MTSGTGLLYSTFSHY 472
T+G + + Y
Sbjct: 596 FTNGRSVCLTELKGY 610


67  EcSMS35_4269  EcSMS35_4274  Y  N  N  Genomic Island
LocusTag      DNBias  CDNBias  %GCBias     Product
EcSMS35_4269  -2      18       -3.541801   DeoR family transcriptional regulator
EcSMS35_4270  -3      20       -4.383699   phosphatase
EcSMS35_4271  -3      24       -4.861542   ribonuclease BN
EcSMS35_4272  -1      21       -3.997576   D-tyrosyl-tRNA(Tyr) deacylase
EcSMS35_4273  -1      19       -3.977235   acetyltransferase
EcSMS35_4274  -2      16       -3.860138   hypothetical protein
68  EcSMS35_4305  EcSMS35_4350  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias     Product
EcSMS35_4305  1       26       -3.443390   DNA-binding transcriptional regulator CpxR
EcSMS35_4306  1       29       -4.095972   periplasmic repressor CpxP
EcSMS35_4307  0       28       -5.762291   phage integrase family site specific
EcSMS35_4308  2       30       -5.662329   regulatory protein Cox
EcSMS35_4309  3       27       -5.388971   hypothetical protein
EcSMS35_4310  0       22       -1.107385   putative replication gene B protein
EcSMS35_4311  -2      23       -0.771701   hypothetical protein
EcSMS35_4312  -2      22       -0.751764   hypothetical protein
EcSMS35_4313  -2      26       -2.046142   C4-type zinc finger DksA/TraR family protein
EcSMS35_4314  1       29       -3.928003   hypothetical protein
EcSMS35_4315  2       34       -6.021662   replication gene A protein
EcSMS35_4316  6       48       -10.045438  hypothetical protein
EcSMS35_4317  5       37       -7.934488   hypothetical protein
EcSMS35_4319  1       22       -4.020423   DNA-cytosine methyltransferase family protein
EcSMS35_4318  0       21       -2.074361   hypothetical protein
EcSMS35_4320  0       24       0.469150    hypothetical protein
EcSMS35_4321  2       34       4.847021    hypothetical protein
EcSMS35_4322  2       35       4.929698    PBSX family phage portal protein
EcSMS35_4323  3       36       5.837319    phage large terminase subunit GpP
EcSMS35_4324  3       33       5.448233    phage capsid scaffolding protein GpO
EcSMS35_4325  2       34       6.422366    phage major capsid protein GpN
EcSMS35_4326  2       34       7.595822    phage small terminase subunit GpM
EcSMS35_4327  2       35       6.823758    phage head completion protein GPL
EcSMS35_4328  3       32       6.856825    phage tail protein GpX
EcSMS35_4329  2       32       6.144528    phage holin GpY
EcSMS35_4330  2       32       5.849755    phage lysozyme
EcSMS35_4331  -2      27       -1.137385   phage lysis protein LysA
EcSMS35_4332  -3      24       -0.148393   phage lysis timing protein LysB
EcSMS35_4333  -2      23       -0.415157   phage lysis protein LysC
EcSMS35_4334  1       24       0.595071    phage tail completion protein GpR
EcSMS35_4335  1       22       0.600190    phage virion morphogenesis protein GpS
EcSMS35_4336  2       17       -0.800189   hypothetical protein
EcSMS35_4337  1       15       1.447001    phage baseplate assembly protein GpV
EcSMS35_4338  0       19       -2.232821   baseplate assembly protein GPW
EcSMS35_4339  0       20       -2.840688   baseplate assembly protein GpJ
EcSMS35_4340  0       18       -2.599373   phage tail protein GpI
EcSMS35_4341  -1      18       -2.583015   putative phage tail fiber protein
EcSMS35_4342  0       19       -1.882406   tail fiber assembly protein GpG
EcSMS35_4343  0       19       -0.947550   hypothetical protein
EcSMS35_4344  0       25       4.585907    hypothetical protein
EcSMS35_4345  0       23       4.689728    phage tail sheath protein
EcSMS35_4346  -1      24       4.173811    phage major tail tube protein FII
EcSMS35_4347  -1      24       4.003559    phage tail protein E
EcSMS35_4348  0       23       3.511738    phage tail protein P2
EcSMS35_4349  0       22       3.257530    TP901 family phage tail tape measure protein
EcSMS35_4350  2       19       1.949386    phage protein gpU
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_4305  HTHFIS        92     9e-24    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 92.2 bits (229), Expect = 9e-24
Identities = 35/117 (29%), Positives = 62/117 (52%), Gaps = 2/117 (1%)

Query: 3 KILLVDDDRELTSLLKELLEMEGFNVIVAHDGEQALDLL-DDSIDLLLLDVMMPKKNGID 61
IL+ DDD + ++L + L G++V + + + DL++ DV+MP +N D
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 TLKALRQTH-QTPVIMLTARGSELDRVLGLELGADDYLPKPFNDRELVARIRAILRR 117
L +++ PV++++A+ + + + E GA DYLPKPF+ EL+ I L
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_4308  MICOLLPTASE   26     0.023    Microbial collagenase metalloprotease (M9) signature.
		>MICOLLPTASE#Microbial collagenase metalloprotease (M9) signature.

Length = 1104

Score = 26.2 bits (57), Expect = 0.023
Identities = 6/22 (27%), Positives = 14/22 (63%)

Query: 28 ETAVVKMVKENKLPVIELRDPS 49
E+ +K+V++ + VI +P+
Sbjct: 849 ESKKIKVVEDKPVEVINESEPN 870


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_4310  SECA          28     0.016    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 28.3 bits (63), Expect = 0.016
Identities = 11/72 (15%), Positives = 26/72 (36%)

Query: 6 VEKQPAAMRRIIGKHLAVPRWQDTCDYYNQMMERERLTVCFHAQLKQRHATMRFEEMNDV 65
+ ++ L + W D ++ RER+ +++ + E M
Sbjct: 703 IPGLQERLKNDFDLDLPIAEWLDKEPELHEETLRERILAQSIEVYQRKEEVVGAEMMRHF 762

Query: 66 ERERLVCAIDEL 77
E+ ++ +D L
Sbjct: 763 EKGVMLQTLDSL 774


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_4326  PF06872       29     0.014    EspG protein
		>PF06872#EspG protein

Length = 398

Score = 29.3 bits (65), Expect = 0.014
Identities = 24/95 (25%), Positives = 37/95 (38%), Gaps = 5/95 (5%)

Query: 122 PPYMFTEEVALAAMRAHAAGESVDTRLLTETIELTATADMPDEVRAKLHKITGLFLRDAG 181
P M T ++ A+ A S+D I +T +++ V + TG+ +
Sbjct: 298 PALMLTHV-RISQASAYNAQRSLDMPNACINISITQSSEGSIHVTSH----TGVLIMAPE 352

Query: 182 DAAGALAHLQRATQLDCQAGVKKEIERLERELKPK 216
D L L T + GVK E + R LK K
Sbjct: 353 DRPNQLGMLTNRTSYEVPPGVKCEPNEMARMLKAK 387


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_4334  SALSPVBPROT   28     0.016    Salmonella virulence plasmid 65kDa B protein signature.
		>SALSPVBPROT#Salmonella virulence plasmid 65kDa B protein signature.

Length = 591

Score = 28.2 bits (62), Expect = 0.016
Identities = 13/35 (37%), Positives = 18/35 (51%), Gaps = 2/35 (5%)

Query: 105 SISLMLTERTLVSEVDGALH--VKNIPEPPPPEPV 137
++ + RTL E DG V N+ PPPP P+
Sbjct: 340 ILTQLCAARTLAYEGDGYRRAPVNNMMPPPPPPPM 374


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_4349  GPOSANCHOR    32     0.008    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 32.3 bits (73), Expect = 0.008
Identities = 28/144 (19%), Positives = 53/144 (36%), Gaps = 4/144 (2%)

Query: 36 RETQKSLRELNGQASRIEGFRKTSAQLAVTGHALEKARQEAEALATQFKNTERPTRAQAK 95
+ + ++ +++I+ A LA LEKA + A +T + A+
Sbjct: 197 KALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKA 256

Query: 96 VLE----SAKRAAEDLQAKYNRLTDSVKRQQRELAVVGINTRNLAHDEQGLKNRISETTA 151
LE ++A E + +K + E A + +L H Q L
Sbjct: 257 ALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRR 316

Query: 152 QLNRQRDALARVSAQQAKLNAVKQ 175
L+ R+A ++ A+ KL +
Sbjct: 317 DLDASREAKKQLEAEHQKLEEQNK 340


69  EcSMS35_4367  EcSMS35_4391  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias     Product
EcSMS35_4367  0       16       -3.496448   glycerol uptake facilitator protein
EcSMS35_4368  -1      14       -2.728308   hypothetical protein
EcSMS35_4369  -3      15       -2.193390   hypothetical protein
EcSMS35_4370  -2      13       0.207858    hypothetical protein
EcSMS35_4371  0       14       1.886785    ribonuclease activity regulator protein RraA
EcSMS35_4372  1       12       3.170390    1,4-dihydroxy-2-naphthoate
EcSMS35_4373  2       14       3.183516    ATP-dependent protease ATP-binding subunit HslU
EcSMS35_4374  3       12       3.441315    ATP-dependent protease peptidase subunit
EcSMS35_4375  3       14       3.497653    essential cell division protein FtsN
EcSMS35_4376  1       14       3.035806    DNA-binding transcriptional regulator CytR
EcSMS35_4377  2       17       4.562510    primosome assembly protein PriA
EcSMS35_4378  1       18       3.800791    50S ribosomal protein L31
EcSMS35_4379  -2      13       1.660489    putative peptidoglycan peptidase
EcSMS35_4380  -2      11       -0.440381   transcriptional repressor protein MetJ
EcSMS35_4381  -1      12       -2.556473   cystathionine gamma-synthase
EcSMS35_4382  -1      14       -3.627334   bifunctional aspartate kinase II/homoserine
EcSMS35_4383  1       23       -7.102144   hypothetical protein
EcSMS35_4384  0       19       -6.215415   nucleoside-specific channel-forming protein Tsx
EcSMS35_4385  -1      11       -3.062492   serine/threonine protein phosphatase family
EcSMS35_4386  -1      12       -1.530888   serine/threonine protein phosphatase family
EcSMS35_4387  -2      12       0.778844    hypothetical protein
EcSMS35_4388  -2      14       1.834043    serine/threonine protein phosphatase family
EcSMS35_4389  -2      17       3.234353    5,10-methylenetetrahydrofolate reductase
EcSMS35_4390  -1      19       3.603286    catalase/peroxidase HPI
EcSMS35_4391  0       15       3.116714    putative transporter
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_4373  HTHFIS        30     0.018    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 30.2 bits (68), Expect = 0.018
Identities = 11/36 (30%), Positives = 18/36 (50%), Gaps = 3/36 (8%)

Query: 49 TPKNILMIGPTGVGKTEIAR---RLAKLANAPFIKV 81
T +++ G +G GK +AR K N PF+ +
Sbjct: 159 TDLTLMITGESGTGKELVARALHDYGKRRNGPFVAI 194


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_4375  IGASERPTASE   42     2e-06    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 42.0 bits (98), Expect = 2e-06
Identities = 32/155 (20%), Positives = 64/155 (41%), Gaps = 5/155 (3%)

Query: 114 LTPEQRQLLEQMQADMRQQPTQLVEVPWNEQTPEQRQQTLQRQRQAQQLAEQQRLAQQSR 173
+ +QAD+ P+ E+ ++ P + +AE + Q+S+
Sbjct: 992 VDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSK--QESK 1049

Query: 174 TTEQSWQQQT-RTSQAAPVQAQPRQSKPASTQQPYQDLLQTPAHTTAQSKPQQAAPVARA 232
T E++ Q T T+Q V + + + A+TQ + T ++ ++ A V +
Sbjct: 1050 TVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKE 1109

Query: 233 ADAPKPTAEKKDERRWMVQCGSFRGAEQAETVRAQ 267
A T + ++ + Q + EQ+ETV+ Q
Sbjct: 1110 EKAKVETEKTQEVPKVTSQVSPKQ--EQSETVQPQ 1142


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_4384  CHANNELTSX    360    e-128    Nucleoside-specific channel-forming protein Tsx signa...
		>CHANNELTSX#Nucleoside-specific channel-forming protein Tsx signature.

Length = 294

Score = 360 bits (925), Expect = e-128
Identities = 171/262 (65%), Positives = 203/262 (77%), Gaps = 6/262 (2%)

Query: 30 WLHQSLNVIGRTDSRFGPRLTNDLYPEYTVAGRKDWFDFYGYVDLPKFFGVGSHYDVGIW 89
W HQS+NV+G +RFGP++ ND Y EY +KDWFDFYGY+D P FFG G+ GIW
Sbjct: 34 WWHQSVNVVGSYHTRFGPQIRNDTYLEYEAFAKKDWFDFYGYIDAPVFFG-GNSTAKGIW 92

Query: 90 DEGSPLFTEIEPRFSIDKLTGLNLAFGPFKEWFIANNYVYDMGDNQSSRQSTWYMGLGTD 149
++GSPLF EIEPRFSIDKLT +L+FGPFKEW+ ANNY+YDMG N S QSTWYMGLGTD
Sbjct: 93 NKGSPLFMEIEPRFSIDKLTNTDLSFGPFKEWYFANNYIYDMGRNDSQEQSTWYMGLGTD 152

Query: 150 IDTGLPIKLSANIYAKYQWQNYGAANENEWDGYRFKIKYSIPLTHLFGGRLVYNSFTNFD 209
IDTGLP+ LS N+YAKYQWQNYGA+NENEWDGYRFK+KY +PLT L+GG L Y FTNFD
Sbjct: 153 IDTGLPMSLSLNVYAKYQWQNYGASNENEWDGYRFKVKYFVPLTDLWGGSLSYIGFTNFD 212

Query: 210 FGSDLADKSHNN-----KRTSNAIASSHILSLLYEHWKFAFTLRYFHNGGQWNAGEKVNF 264
+GSDL D + + RTSN+IASSHIL+L Y HW ++ RYFHNGGQW K+NF
Sbjct: 213 WGSDLGDDNFYDLNGKHARTSNSIASSHILALNYAHWHYSIVARYFHNGGQWADDAKLNF 272

Query: 265 GDGPFELKNTGWGTYTTIGYQF 286
GDGPF +++TGWG Y +GY F
Sbjct: 273 GDGPFSVRSTGWGGYFVVGYNF 294


70  EcSMS35_4524  EcSMS35_4542  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias     Product
EcSMS35_4524  -1      19       3.328954    DNA-binding transcriptional regulator SoxS
EcSMS35_4525  -1      19       3.201493    redox-sensitive transcriptional activator SoxR
EcSMS35_4526  0       21       4.010459    hypothetical protein
EcSMS35_4527  -1      21       3.636723    sulfate permease family inorganic anion
EcSMS35_4528  -2      21       3.367901    Na+/H+ antiporter
EcSMS35_4529  -2      23       3.466283    acetate permease
EcSMS35_4530  -2      22       3.819001    hypothetical protein
EcSMS35_4531  -3      19       4.594212    acetyl-CoA synthetase
EcSMS35_4532  -1      16       4.002966    cytochrome c552
EcSMS35_4533  0       18       4.437664    cytochrome c nitrite reductase pentaheme
EcSMS35_4534  -1      18       4.195354    cytochrome c nitrite reductase, Fe-S protein
EcSMS35_4535  0       18       3.297607    nrfD protein
EcSMS35_4536  -1      15       1.949238    heme lyase subunit NrfE
EcSMS35_4537  -3      16       0.857417    formate-dependent nitrite reductase complex
EcSMS35_4538  -2      17       0.723950    formate-dependent nitrite reductase complex
EcSMS35_4539  -2      16       1.759820    hypothetical protein
EcSMS35_4540  -2      16       2.435951    glutamate/aspartate:proton symporter
EcSMS35_4541  -1      15       2.876659    Sel1 repeat-containing protein
EcSMS35_4542  -2      15       3.061040    formate dehydrogenase H, alpha subunit,
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits       Score  E-value  Comments
EcSMS35_4530  RTXTOXIND  27     0.020    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 26.7 bits (59), Expect = 0.020
Identities = 5/33 (15%), Positives = 13/33 (39%), Gaps = 1/33 (3%)

Query: 17 ELVEKR-QRFATILSIIMLAVYIGFILLIAFAP 48
EL+E R +++ ++ + +L
Sbjct: 47 ELIETPVSRRPRLVAYFIMGFLVIAFILSVLGQ 79


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_4534  VACJLIPOPROT  29     0.013    VacJ lipoprotein signature.
		>VACJLIPOPROT#VacJ lipoprotein signature.

Length = 251

Score = 29.1 bits (65), Expect = 0.013
Identities = 6/21 (28%), Positives = 10/21 (47%)

Query: 179 FGNLDDPNSEISQLLHQKPTY 199
GNL++P ++ L P
Sbjct: 75 TGNLEEPAVMVNYFLQGDPYQ 95


71  EcSMS35_4556  EcSMS35_4568  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias     Product
EcSMS35_4556  2       28       6.592959    ribose-5-phosphate isomerase B
EcSMS35_4557  2       33       7.742507    hypothetical protein
EcSMS35_4558  0       36       8.594898    carbon-phosphorus lyase complex accessory
EcSMS35_4559  0       38       8.407571    aminoalkylphosphonic acid N-acetyltransferase
EcSMS35_4560  1       38       8.935836    ribose 1,5-bisphosphokinase
EcSMS35_4561  1       40       9.284456    phosphonate metabolism protein PhnM
EcSMS35_4562  0       38       9.277288    phosphonate C-P lyase system protein PhnL
EcSMS35_4563  0       40       9.675473    phosphonate C-P lyase system protein PhnK
EcSMS35_4564  0       41       9.544286    phosphonate metabolism protein PhnJ
EcSMS35_4565  3       40       8.736554    phosphonate metabolism protein PhnI
EcSMS35_4566  1       41       8.212887    carbon-phosphorus lyase complex subunit
EcSMS35_4567  2       38       7.005442    phosphonate C-P lyase system protein PhnG
EcSMS35_4568  2       37       5.394232    phosphonate metabolism transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_4559  SACTRNSFRASE  33     3e-04    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 32.6 bits (74), Expect = 3e-04
Identities = 20/84 (23%), Positives = 32/84 (38%), Gaps = 5/84 (5%)

Query: 50 HLALLDGEVVGMIGLHLQFHLHHVNWIGEIQELVVMPQARGLNVGSKLLAWAEEEARQAG 109
L L+ +G I + + N I+++ V R VG+ LL A E A++
Sbjct: 68 FLYYLENNCIGRIKIRSNW-----NGYALIEDIAVAKDYRKKGVGTALLHKAIEWAKENH 122

Query: 110 AEMTELSTNVKRHDAHRFYLREGY 133
L T A FY + +
Sbjct: 123 FCGLMLETQDINISACHFYAKHHF 146


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
EcSMS35_4562  PF05272  29     0.014    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.014
Identities = 17/70 (24%), Positives = 25/70 (35%), Gaps = 8/70 (11%)

Query: 36 CVVLHGHSGSGKSTLLRSLYANYLPDEGQIQIKHGDEWVDLVTAPARKVVEI------RK 89
VVL G G GKSTL+ +L + I G + + + E+ R+
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGTGKDSYEQIAGIV--AYELSEMTAFRR 655

Query: 90 TTVGWVSQFL 99
V F
Sbjct: 656 ADAEAVKAFF 665


72  EcSMS35_4591  EcSMS35_4607  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias     Product
EcSMS35_4591  1       17       -3.301896   DNA-binding transcriptional activator DcuR
EcSMS35_4592  2       18       -4.693390   sensory histidine kinase DcuS
EcSMS35_4593  3       21       -6.334598   hypothetical protein
EcSMS35_4594  -1      15       -3.858114   hypothetical protein
EcSMS35_4595  -1      17       -3.842018   hypothetical protein
EcSMS35_4596  0       22       -3.271888   hypothetical protein
EcSMS35_4597  1       18       -4.143671   hypothetical protein
EcSMS35_4598  0       18       -3.436011   lysyl-tRNA synthetase
EcSMS35_4599  1       14       -2.207933   amino acid/peptide transporter
EcSMS35_4600  1       16       -2.581910   lysine decarboxylase, constitutive
EcSMS35_4601  2       18       -1.525086   lysine/cadaverine antiporter
EcSMS35_4602  1       18       -1.582588   DNA-binding transcriptional activator CadC
EcSMS35_4604  1       20       0.270187    *putative transcriptional regulator
EcSMS35_4605  1       16       0.124641    thiol:disulfide interchange protein precursor
EcSMS35_4606  1       22       -0.861548   divalent-cation tolerance protein CutA
EcSMS35_4607  2       35       -0.209809   anaerobic C4-dicarboxylate transporter
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits    Score  E-value  Comments
EcSMS35_4591  HTHFIS  70     4e-16    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 70.2 bits (172), Expect = 4e-16
Identities = 31/109 (28%), Positives = 50/109 (45%), Gaps = 4/109 (3%)

Query: 4 VLIIDDDAMVAELNRRYVAQIPGFQCCGTASTLEKAKEIIFNSDTPIDLILLDIYMQKEN 63
+L+ DDDA + + + +++ G+ S I + DL++ D+ M EN
Sbjct: 6 ILVADDDAAIRTVLNQALSRA-GYDVR-ITSNAATLWRWI--AAGDGDLVVTDVVMPDEN 61

Query: 64 GLDLLPVLHNARCKSDVIVISSAADAATIKDSLHYGVVDYLIKPFQASR 112
DLLP + AR V+V+S+ T + G DYL KPF +
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTE 110


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
EcSMS35_4592  PF06580  41     8e-06    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 41.0 bits (96), Expect = 8e-06
Identities = 21/99 (21%), Positives = 38/99 (38%), Gaps = 18/99 (18%)

Query: 442 LIENALE-ALGP-EPGGEISVTLHYRHGWLHCEVNDDGPGIAPDKIDHIFDKGVSTKGSE 499
L+EN ++ + GG+I + +G + EV + G +
Sbjct: 263 LVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKN------------TKES 310

Query: 500 RGVGLALVKQQVENLGG---SIAVESEPGIFTQFFVQIP 535
G GL V+++++ L G I + + G V IP
Sbjct: 311 TGTGLQNVRERLQMLYGTEAQIKLSEKQGKVN-AMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_4595  SACTRNSFRASE  26     0.012    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 26.4 bits (58), Expect = 0.012
Identities = 9/28 (32%), Positives = 16/28 (57%)

Query: 32 LAIIEHTDVDESLKGQGIGKQLVAKVVE 59
A+IE V + + +G+G L+ K +E
Sbjct: 89 YALIEDIAVAKDYRKKGVGTALLHKAIE 116


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
EcSMS35_4599  TCRTETA  29     0.040    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 29.0 bits (65), Expect = 0.040
Identities = 36/190 (18%), Positives = 66/190 (34%), Gaps = 14/190 (7%)

Query: 44 NHAISLFSAYA-SLVYVTPILGGWLADRLLGNRTAVIAGALLMTLGHVVLGIDTNSTFSL 102
H L + YA P+LG +DR G R ++ + + ++ + L
Sbjct: 43 AHYGILLALYALMQFACAPVLGAL-SDRF-GRRPVLLVSLAGAAVDYAIMAT-APFLWVL 99

Query: 103 YLALAIIICGYGLFKSNISCLLGELYDEND-HRRDGGFSLLYAAGNIGSIAAPIACGLAA 161
Y+ + G+ + + + D D R F + A G +A P+ GL
Sbjct: 100 YIGRIV----AGITGATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMG 155

Query: 162 QWYGWHVGFALAGGGMFIGLLIFLSGHRHFQSTRSMDKKALTSVKF-ALPVWTWLVVMLC 220
+ H F A + L FL+G + +++ L L + W M
Sbjct: 156 G-FSPHAPFFAAA---ALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTV 211

Query: 221 LAPVFFTLLL 230
+A + +
Sbjct: 212 VAALMAVFFI 221


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_4602  SYCDCHAPRONE  37     8e-05    Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 36.8 bits (85), Expect = 8e-05
Identities = 16/97 (16%), Positives = 36/97 (37%), Gaps = 7/97 (7%)

Query: 391 PLDEKQLAALNTEIDNIVTLPELNNLS-----IIYQIKAVSALVKGKTDESYQAINTGID 445
++ A+ + + T+ LN +S +Y + A + GK +++++
Sbjct: 6 TDTQEYQLAMESFLKGGGTIAMLNEISSDTLEQLYSL-AFNQYQSGKYEDAHKVFQALCV 64

Query: 446 LEMSWLNYVL-LGKVYEMKGMNREAADAYLTAFNLRP 481
L+ + L LG + G A +Y +
Sbjct: 65 LDHYDSRFFLGLGACRQAMGQYDLAIHSYSYGAIMDI 101


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
EcSMS35_4604  HTHTETR  45     3e-08    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 45.0 bits (106), Expect = 3e-08
Identities = 28/188 (14%), Positives = 51/188 (27%), Gaps = 13/188 (6%)

Query: 3 REDVLGEALKLLELQGIANTTLEMVAERVDYPLDELRRFWPDKEAILYDALRYLSQQIDA 62
R+ +L AL+L QG+++T+L +A+ + + DK + + I
Sbjct: 13 RQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIGE 72

Query: 63 WRRQLMLDETQTAEQKLLARYQALSECVKNNRYPGCLFIAACTFYPDPGH----PIHQLA 118
+ L + E + F+ + Q
Sbjct: 73 LELEYQAKFPGDPLSVLREILIHVLEST--VTEERRRLLMEIIFHKCEFVGEMAVVQQAQ 130

Query: 119 DQQKSAAYDFTHELLTT-------LEVDDPAMVAKQMELVLEGCLSRMLVNRSQADVDTA 171
+YD + L A M + G + L D+
Sbjct: 131 RNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLFAPQSFDLKKE 190

Query: 172 HRLAEDIL 179
R IL
Sbjct: 191 ARDYVAIL 198


73  EcSMS35_4630  EcSMS35_4652  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias     Product
EcSMS35_4630  -1      14       3.444604    hypothetical protein
EcSMS35_4631  -2      11       2.774896    phosphatidylserine decarboxylase
EcSMS35_4632  -2      12       2.853475    ribosome-associated GTPase
EcSMS35_4633  -2      12       3.479687    oligoribonuclease
EcSMS35_4637  -1      12       3.319827    ***iron-sulfur cluster binding protein
EcSMS35_4638  -1      12       3.357964    hypothetical protein
EcSMS35_4639  -1      13       2.682446    putative ATPase
EcSMS35_4640  0       12       3.213327    N-acetylmuramoyl-l-alanine amidase II
EcSMS35_4641  1       14       3.095447    DNA mismatch repair protein
EcSMS35_4642  3       19       2.248748    tRNA delta(2)-isopentenylpyrophosphate
EcSMS35_4643  5       26       2.379451    RNA-binding protein Hfq
EcSMS35_4644  5       23       2.250996    putative GTPase HflX
EcSMS35_4645  4       23       2.523804    FtsH protease regulator HflK
EcSMS35_4646  4       23       2.313770    FtsH protease regulator HflC
EcSMS35_4647  3       19       1.303155    hypothetical protein
EcSMS35_4648  3       18       1.228405    adenylosuccinate synthetase
EcSMS35_4649  4       11       0.311775    transcriptional repressor NsrR
EcSMS35_4650  3       12       -0.025945   exoribonuclease R
EcSMS35_4651  2       16       -2.635246   23S rRNA (guanosine-2'-O-)-methyltransferase
EcSMS35_4652  1       17       -3.065350   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits        Score  E-value  Comments
EcSMS35_4630  GPOSANCHOR  51     2e-08    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 50.8 bits (121), Expect = 2e-08
Identities = 50/312 (16%), Positives = 105/312 (33%), Gaps = 18/312 (5%)

Query: 121 SRQAQQEQERAREIADSLNQLPQQQTDARRQLNEIERRLGTLTGNSPLNQAQNFALQSDS 180
+ ++ QERA + N L + +D ++ LT L+ +
Sbjct: 49 TDTLEKVQERADKFEIENNTLKLKNSDLSFNNKALKDHNDELTE---ELSNAKEKLRKND 105

Query: 181 ARLKALVDEL-ELAQLSANNRQELARLRSELAEKES--QQLDAYLQALRNQLNSQRQLEA 237
L ++ EL A+ + L + + + L+A AL + + +
Sbjct: 106 KSLSEKASKIQELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALAAR-KADLEKAL 164

Query: 238 ERALESTELLAENSADLPKDIVAQFKINRELSAALNQQAQRMDLVASQQRQAASQTLQVR 297
E A+ + + L + A EL AL +++ + ++ +
Sbjct: 165 EGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALA 224

Query: 298 QALNTLREQSQWLGSSNLLGEALRAQVARLPEMPKPQQLDTEMAQLRVQRLRYEDLLNKQ 357
L + L + A A++ L + L+ A+L +
Sbjct: 225 ARKADLEKA---LEGAMNFSTADSAKIKTLEA--EKAALEARQAELEKALEGAMNFSTAD 279

Query: 358 PQLRQIHQADGQPLTAE------QNRILEAQLRTQRELLNSLLQGGDTLLLELTKLKVSN 411
+ +A+ L AE Q+++L A ++ R L++ + L E KL+ N
Sbjct: 280 SAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEAEHQKLEEQN 339

Query: 412 GQLEDALKEVNE 423
E + + +
Sbjct: 340 KISEASRQSLRR 351



Score = 42.4 bits (99), Expect = 1e-05
Identities = 48/239 (20%), Positives = 92/239 (38%), Gaps = 23/239 (9%)

Query: 20 ATAPDSKQISQELEQAKAAKPAQPEVVEALQSALNALEERKGSLER-IKQYQEVIDNYPK 78
A A + + LE A A ++ L++ ALE R+ LE+ ++
Sbjct: 222 ALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSA 281

Query: 79 LSATLRAQLNNMRDEPRSVSPGMSTDALNQEILQVSSQLLDKSRQAQQEQERAREIADSL 138
TL A+ + E + + Q + LD SR+A+++ E + +
Sbjct: 282 KIKTLEAEKAALEAEKADL---EHQSQVLNANRQSLRRDLDASREAKKQLEAEHQKLEEQ 338

Query: 139 NQLPQQQTDARRQLNEIERRLGTLTGNSPLNQAQNFALQSDSARLKAL--VDELELAQLS 196
N++ ++A RQ + R L L+++ +L+ + E L
Sbjct: 339 NKI----SEASRQ--SLRRDLDASR-------EAKKQLEAEHQKLEEQNKISEASRQSLR 385

Query: 197 AN---NRQELARLRSELAEKESQQLDAYLQALRNQLNSQRQLEAERALESTELLAENSA 252
+ +R+ ++ L E S +L A + + S++ E E+A +L AE A
Sbjct: 386 RDLDASREAKKQVEKALEEANS-KLAALEKLNKELEESKKLTEKEKAELQAKLEAEAKA 443


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits  Score  E-value  Comments
EcSMS35_4644  SECA  32     0.005    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 32.2 bits (73), Expect = 0.005
Identities = 26/144 (18%), Positives = 54/144 (37%), Gaps = 6/144 (4%)

Query: 282 HVIDAADVRVQENIEAVNTVLEEIDAHEIPTLLVMNKIDMLEDFEPRIDRDEENK-PIRV 340
++D +DV N + IDA+ P L ++ + + R+ D + PI
Sbjct: 665 ELLDVSDVSETINSIREDVFKATIDAYIPPQSL--EEMWDIPGLQERLKNDFDLDLPIAE 722

Query: 341 WLSAQTGAGIPQLFQALTERLSGEVAQHTLRLPPQEGRLRSRFYQLQAIEKEWMEEDGSV 400
WL + L + + + + + + R + LQ ++ W E ++
Sbjct: 723 WLDKEPELHEETLRERILAQSIEVYQRKEEVVGAEMMRHFEKGVMLQTLDSLWKEHLAAM 782

Query: 401 SLQVRMPIVDWRRLCKQEPALIDY 424
+R I R +++P +Y
Sbjct: 783 D-YLRQGIH-LRGYAQKDP-KQEY 803


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
EcSMS35_4645  cloacin  32     0.006    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 31.6 bits (71), Expect = 0.006
Identities = 25/81 (30%), Positives = 30/81 (37%), Gaps = 10/81 (12%)

Query: 17 GSSKPGGNSEGNGNKGGRDQGPPDLDDIFRKLSKKLGGLGGGKGTGSGGGSSSQGP---- 72
S G +SE N GG G G GGG GTG G S+ P
Sbjct: 33 ASDGSGWSSENNPWGGGSGSGIHWGGGSGHGNGGGNGNSGGGSGTG-GNLSAVAAPVAFG 91

Query: 73 -----RPQLGGRVVTIAAAAI 88
P GG V+I+A A+
Sbjct: 92 FPALSTPGAGGLAVSISAGAL 112


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits       Score  E-value  Comments
EcSMS35_4650  RTXTOXIND  31     0.029    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 30.6 bits (69), Expect = 0.029
Identities = 12/55 (21%), Positives = 24/55 (43%), Gaps = 1/55 (1%)

Query: 165 VVPDDSRLSFDILIPPDQIMGARMGFVVVVELTQRPTRRTKAV-GKIVEVLGDNM 218
+VP+D L L+ I +G ++++ P R + GK+ + D +
Sbjct: 359 IVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDAI 413


74  EcSMS35_4666  EcSMS35_4675  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias     Product
EcSMS35_4666  -2      25       3.106333    PTS system L-ascorbate-specific transporter
EcSMS35_4667  -2      26       3.280571    3-keto-L-gulonate-6-phosphate decarboxylase
EcSMS35_4668  0       27       2.142443    L-xylulose 5-phosphate 3-epimerase
EcSMS35_4669  0       23       -1.504879   L-ribulose-5-phosphate 4-epimerase
EcSMS35_4670  1       25       -4.854708   hypothetical protein
EcSMS35_4671  1       25       -5.788916   30S ribosomal protein S6
EcSMS35_4672  1       20       -3.892544   30S ribosomal protein S18
EcSMS35_4673  0       17       -2.931485   50S ribosomal protein L9
EcSMS35_4674  1       17       -2.998329   TRAP transporter solute receptor DctP family
EcSMS35_4675  2       15       -1.708965   TRAP transporter, DctM subunit
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_4667  ECOLNEIPORIN  28     0.034    E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 27.8 bits (62), Expect = 0.034
Identities = 6/19 (31%), Positives = 7/19 (36%), Gaps = 2/19 (10%)

Query: 105 FNGDVQI--ELTGYWTWEQ 121
F G + L W EQ
Sbjct: 62 FKGQEDLGNGLKAIWQVEQ 80


75  EcSMS35_4748  EcSMS35_4894  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias     Product
EcSMS35_4748  0       20       -3.010776   L-idonate 5-dehydrogenase
EcSMS35_4749  0       19       -2.409268   D-gluconate kinase
EcSMS35_4750  1       19       -1.818762   zinc-binding dehydrogenase family
EcSMS35_4753  3       23       -3.426246   *phage integrase family site specific
EcSMS35_4754  3       16       -3.064715   sulfatase family protein
EcSMS35_4755  2       20       -5.090841   heat resistant agglutinin 1
EcSMS35_4757  2       21       -4.655771   IS1 transposase orfB
EcSMS35_4758  1       23       -5.658146   IS1 transposase orfA
EcSMS35_4759  1       22       -6.650820   addiction module antitoxin
EcSMS35_4760  1       23       -6.424461   type I restriction-modification system DNA
EcSMS35_4761  1       24       -7.325341   type I restriction modification DNA specificity
EcSMS35_4762  2       24       -6.851971   type I restriction-modification system
EcSMS35_4763  3       35       -9.590321   hypothetical protein
EcSMS35_4764  3       29       -7.277736   hypothetical protein
EcSMS35_4765  2       25       -3.058415   hypothetical protein
EcSMS35_4766  2       23       -1.776239   hypothetical protein
EcSMS35_4767  3       24       -0.875107   hypothetical protein
EcSMS35_4768  4       26       -1.978458   hypothetical protein
EcSMS35_4770  4       25       -3.497908   hypothetical protein
EcSMS35_4771  3       29       -7.070336   IS1203 transposase orfB
EcSMS35_4772  5       38       -11.878059  IS1203 transposase orfA
EcSMS35_4773  4       40       -11.624874  hypothetical protein
EcSMS35_4775  4       40       -11.636918  hypothetical protein
EcSMS35_4777  2       38       -9.955285   IS10 transposase
EcSMS35_4776  3       30       -8.495214   hypothetical protein
EcSMS35_4778  2       25       -4.692661   hypothetical protein
EcSMS35_4780  5       26       -2.035042   hypothetical protein
EcSMS35_4781  5       28       -2.373384   hypothetical protein
EcSMS35_4782  4       26       -2.050850   AlpA family transcriptional regulator
EcSMS35_4783  3       24       -1.102658   hypothetical protein
EcSMS35_4784  3       23       -2.162826   hypothetical protein
EcSMS35_4786  5       25       -1.519473   H-NS histone family protein
EcSMS35_4789  4       22       -0.390868   hypothetical protein
EcSMS35_4790  3       27       -5.849074   IS1203 transposase orfA
EcSMS35_4791  4       31       -7.128313   IS1203 transposase orfB
EcSMS35_4792  5       28       -7.456516   hypothetical protein
EcSMS35_4793  5       29       -8.223851   hypothetical protein
EcSMS35_4794  5       31       -8.412808   hypothetical protein
EcSMS35_4795  4       30       -7.070336   hypothetical protein
EcSMS35_4796  5       28       -2.420162   hypothetical protein
EcSMS35_4797  5       24       -0.221710   putative GTPase
EcSMS35_4798  6       25       -0.333790   hypothetical protein
EcSMS35_4799  7       26       1.691950    hypothetical protein
EcSMS35_4800  6       26       2.376226    hypothetical protein
EcSMS35_4801  6       27       3.984064    hypothetical protein
EcSMS35_4802  7       28       3.851913    hypothetical protein
EcSMS35_4803  8       26       3.053242    antirestriction protein
EcSMS35_4804  7       26       2.759130    RadC family DNA repair protein
EcSMS35_4805  3       18       0.404742    hypothetical protein
EcSMS35_4806  1       21       -0.848806   hypothetical protein
EcSMS35_4807  1       22       -0.944658   hypothetical protein
EcSMS35_4808  3       21       -1.917754   hypothetical protein
EcSMS35_4809  3       22       -2.153141   hypothetical protein
EcSMS35_4812  2       24       -3.590826   hypothetical protein
EcSMS35_4813  2       23       -3.723846   hypothetical protein
EcSMS35_4814  2       23       -3.974845   hypothetical protein
EcSMS35_4815  2       23       -4.243490   hypothetical protein
EcSMS35_4816  2       31       -7.794287   hypothetical protein
EcSMS35_4817  3       33       -10.063223  hypothetical protein
EcSMS35_4818  2       33       -9.983265   hypothetical protein
EcSMS35_4819  4       41       -11.540963  hypothetical protein
EcSMS35_4820  2       33       -7.554145   HNH endonuclease domain-containing protein
EcSMS35_4822  2       34       -7.858482   hypothetical protein
EcSMS35_4823  1       27       -6.184505   hypothetical protein
EcSMS35_4824  0       20       -4.996350   hypothetical protein
EcSMS35_4825  -1      13       -2.679933   hypothetical protein
EcSMS35_4827  -2      9        -1.269921   DEAD/DEAH box helicase domain-containing
EcSMS35_4826  -2      7        -1.103126   hypothetical protein
EcSMS35_4828  -2      7        -1.043193   helicase family protein
EcSMS35_4829  -2      8        -1.132952   hypothetical protein
EcSMS35_4830  -2      11       -1.119258   DEAD/DEAH box helicase domain-containing
EcSMS35_4831  0       24       -4.936238   ATP-dependent DNA helicase UvrD
EcSMS35_4832  1       29       -6.877326   hypothetical protein
EcSMS35_4833  1       28       -6.360466   hypothetical protein
EcSMS35_4834  1       28       -6.227883   hypothetical protein
EcSMS35_4835  0       30       -6.401247   N-acetylneuraminic acid mutarotase
EcSMS35_4836  1       29       -5.071115   N-acetylneuraminic acid porin NanC
EcSMS35_4837  0       32       -4.513173   hypothetical protein
EcSMS35_4838  1       32       -4.735446   tyrosine recombinase
EcSMS35_4839  2       26       -3.333428   tyrosine recombinase
EcSMS35_4840  2       26       -3.134603   type-1 fimbrial protein
EcSMS35_4842  2       25       -2.913903   type-1 fimbrial protein
EcSMS35_4841  2       23       -2.851898   hypothetical protein
EcSMS35_4843  3       16       -1.718889   chaperone protein FimC
EcSMS35_4844  2       11       -0.507870   outer membrane usher protein fimD
EcSMS35_4845  0       18       2.155463    fimbrial protein FimF
EcSMS35_4846  0       19       2.473819    protein fimG
EcSMS35_4847  -1      17       0.681636    protein FimH
EcSMS35_4848  0       20       0.472010    fructuronate transporter
EcSMS35_4849  -1      20       -0.031308   mannonate dehydratase
EcSMS35_4850  0       18       -1.179868   fructuronate reductase
EcSMS35_4851  2       21       -3.036649   DNA-binding transcriptional repressor UxuR
EcSMS35_4852  2       27       -3.717138   hypothetical protein
EcSMS35_4853  1       24       -2.046980   hypothetical protein
EcSMS35_4854  1       23       -2.192239   hypothetical protein
EcSMS35_4855  1       21       -2.647300   DNA replication/recombination/repair protein
EcSMS35_4856  0       21       -2.110296   phosphoenolpyruvate-protein phosphotransferase
EcSMS35_4857  1       24       -3.737795   dihydroxyacetone kinase subunit DhaL
EcSMS35_4858  2       23       -3.396659   dihydroxyacetone kinase subunit DhaK
EcSMS35_4859  2       24       -3.463571   glycerol dehydrogenase
EcSMS35_4860  2       26       -4.034757   major facilitator family protein CglT
EcSMS35_4861  2       29       -3.773478   dihydrolipoamide dehydrogenase CglE
EcSMS35_4862  1       32       -6.447248   putative carnitine transporter CglC
EcSMS35_4863  0       32       -6.269214   glycerate kinase GcxK
EcSMS35_4864  1       32       -7.105273   2-hydroxy-3-oxopropionate reductase GcxR
EcSMS35_4865  0       29       -6.745615   hydroxypyruvate isomerase Gcxl
EcSMS35_4866  -1      23       -4.774812   glyoxylate carboligase
EcSMS35_4867  -1      21       -4.623576   DNA-binding transcriptional regulator DhaR
EcSMS35_4868  -1      15       -1.554862   invasion protein IbeA
EcSMS35_4869  -1      14       -0.580310   Na+/H+ antiporter IbeT
EcSMS35_4870  0       17       1.482534    putative DNA-binding transcriptional regulator
EcSMS35_4871  0       14       -0.819668   isoaspartyl dipeptidase
EcSMS35_4872  -1      16       -1.024797   hypothetical protein
EcSMS35_4873  -1      16       -1.844799   putative transporter
EcSMS35_4874  -1      20       -4.999373   hypothetical protein
EcSMS35_4875  -1      20       -4.751255   RNA 2'-phosphotransferase-like protein
EcSMS35_4876  -2      19       -4.286035   putative invasin
EcSMS35_4877  -1      19       -4.711011   major facilitator transporter
EcSMS35_4878  0       18       -5.141869   SdiA-regulated protein
EcSMS35_4879  1       17       -3.383393   pentapeptide repeat-containing protein
EcSMS35_4880  2       18       1.657726    hypothetical protein
EcSMS35_4881  2       18       2.083167    (R)-2-hydroxyglutaryl-CoA dehydratase activator
EcSMS35_4882  1       20       1.777340    putative 2-hydroxyglutaryl-CoA dehydratase,
EcSMS35_4883  1       18       -0.837275   hypothetical protein
EcSMS35_4884  -1      16       -0.937232   multidrug resistance protein MdtM
EcSMS35_4885  0       21       -4.969873   hypothetical protein
EcSMS35_4886  -1      15       -4.446541   GntR family transcriptional regulator
EcSMS35_4887  0       18       -6.391260   hypothetical protein
EcSMS35_4888  -1      9        -3.456766   hypothetical protein
EcSMS35_4889  0       14       -1.065852   endoribonuclease SymE
EcSMS35_4891  0       12       -0.429348   hypothetical protein
EcSMS35_4892  1       19       3.266872    putative GTP-binding protein YjiA
EcSMS35_4893  2       15       2.333161    hypothetical protein
EcSMS35_4894  2       13       2.343539    carbon starvation family protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits        Score  E-value  Comments
EcSMS35_4755  OMPADOMAIN  42     1e-06    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 42.2 bits (99), Expect = 1e-06
Identities = 51/246 (20%), Positives = 78/246 (31%), Gaps = 55/246 (22%)

Query: 18 MNKVIAVSALAMAGMFSTQALADESKTGFYVTGKAGASVMSLADQRFLSGDGEETSKYKG 77
M K A+A+AG F+T A A +Y K G S D F++
Sbjct: 1 MKKTAIAIAVALAG-FATVAQAAPKDNTWYTGAKLGWS--QYHDTGFIN---------NN 48

Query: 78 GDGHDTVFSGGIAAGYDFYPQFSIPVRTELEFYARGKADSKYNVDKDSWSGGYWRDDLKN 137
G H+ G GY P E+ + G+ K +V+ +
Sbjct: 49 GPTHENQLGAGAFGGYQVNPYVGF----EMGYDWLGRMPYKGSVE-----------NGAY 93

Query: 138 EVSVNTLMLNAYYDFRNDSAFTPWVSAGIGYARIHQKTTGISTWDYGYGSSGRESLSRSG 197
+ L Y +D + G R K YG + +S
Sbjct: 94 KAQGVQLTAKLGYPITDD--LDIYTRLGGMVWRADTK-------SNVYGKNHDTGVS--- 141

Query: 198 SADNFAWSLGAGVRYDVTPDIALDLSYRYLDAGDSSVSYKDEWGDKYKSEVDVKSHDILL 257
GV Y +TP+IA L Y++ + GD + + + L
Sbjct: 142 ------PVFAGGVEYAITPEIATRLEYQWT----------NNIGDAHTIGTRPDNGMLSL 185

Query: 258 GVTYNF 263
GV+Y F
Sbjct: 186 GVSYRF 191


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_4775  FbpA_PF05833  28     0.010    Fibronectin-binding protein
		>FbpA_PF05833#Fibronectin-binding protein

Length = 577

Score = 27.9 bits (62), Expect = 0.010
Identities = 13/67 (19%), Positives = 27/67 (40%), Gaps = 4/67 (5%)

Query: 4 ISAIESMD----IIADINSIKELIDDVKYARKLTRVAKSSPVILANIENEKIIEFCKIYP 59
I I ++ ++ D S EL + Y+ + + + S + L + I++ K
Sbjct: 88 IVDIHQINQDRIVVIDFESTDELGFNSIYSLIIEIMGRHSNMTLIRKRDNIIMDSIKHIT 147

Query: 60 VLVNRIR 66
+N R
Sbjct: 148 PDINTYR 154


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits       Score  E-value  Comments
EcSMS35_4815  RTXTOXIND  33     0.011    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 32.9 bits (75), Expect = 0.011
Identities = 19/171 (11%), Positives = 53/171 (30%), Gaps = 18/171 (10%)

Query: 267 QGELEGIRRNESFLKDVLANWLAFLDPAKQLESIQAQQQLLDDGFREAMRDLAAFVGDAQ 326
Q I N+ + +++ L+ + F +
Sbjct: 154 QILSRSIELNKLPELKLPDEPYFQNVSEEEV---LRLTSLIKEQFSTWQNQKYQKELNLD 210

Query: 327 VEQTQLEDNMDSLQTHIGQAMSLREQLGKRRALLEYDLV-QHRFNKQKAIVDALQDEADT 385
++ + + + + + + +L +LL + +H +Q+ +E
Sbjct: 211 KKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRV 270

Query: 386 NQRQIEITHAAI------------DYKAYAKALNEYRQADEDLKVFQSETE 424
+ Q+E + I +K + L++ RQ +++ + E
Sbjct: 271 YKSQLEQIESEILSAKEEYQLVTQLFK--NEILDKLRQTTDNIGLLTLELA 319


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_4820  FIMBRIALPAPE  28     0.016    Escherichia coli: P pili tip fibrillum papE protein...
		>FIMBRIALPAPE#Escherichia coli: P pili tip fibrillum papE protein signature.

Length = 173

Score = 28.5 bits (63), Expect = 0.016
Identities = 10/35 (28%), Positives = 18/35 (51%)

Query: 49 AGIAATLMTGTQFTPADFTGGIKSKCVKLLIEQGF 83
+GI + G+Q TP TG ++ + L + G+
Sbjct: 116 SGIGNAVTLGSQVTPGKITGTAPARKITLYAKLGY 150


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits        Score  E-value  Comments
EcSMS35_4823  OMPADOMAIN  58     4e-12    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 58.0 bits (140), Expect = 4e-12
Identities = 36/136 (26%), Positives = 53/136 (38%), Gaps = 28/136 (20%)

Query: 95 SPDVLFGLGSTELKPKFKLILDDFFPRYLKVLDNYQEHITEVRIEGHTSTDWTGTTNPDI 154
DVLF LKP+ + LD + L N V + G+ TD G+
Sbjct: 218 KSDVLFNFNKATLKPEGQAALD----QLYSQLSNLDPKDGSVVVLGY--TDRIGSD---- 267

Query: 155 AYFNNMALSQGRTRAVLQYVYDIKNIATHQQWVKSKFAAVGYSSAHPILDKTGKEDPNRS 214
AY N LS+ R ++V+ Y+ K I K +A G ++P+ T R+
Sbjct: 268 AY--NQGLSERRAQSVVDYLI-SKGIP------ADKISARGMGESNPVTGNTCDNVKQRA 318

Query: 215 ---------RRVTFKV 221
RRV +V
Sbjct: 319 ALIDCLAPDRRVEIEV 334


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits         Score  E-value  Comments
EcSMS35_4824  FLGHOOKFLIK  32     0.005    Flagellar hook-length control protein signature.
		>FLGHOOKFLIK#Flagellar hook-length control protein signature.

Length = 375

Score = 32.1 bits (72), Expect = 0.005
Identities = 27/111 (24%), Positives = 49/111 (44%), Gaps = 19/111 (17%)

Query: 242 EWQENYKTQVELMSEQYQQSVESLVETKTAVAGIWEECKEIPLAMSELREVLQVNQHQIS 301
EWQ++ + L + Q QQS E + ++ E+ +++ + NQ QI
Sbjct: 239 EWQQSLSQHISLFTRQGQQSAELRLHP--------QDLGEVQISLK-----VDDNQAQIQ 285

Query: 302 ELSRHLETFVAIRDKATTVLPEIQNKMAEVGELLKSGAANVSASLEQTSQQ 352
+S H +R LP ++ ++AE G ++ G +N+S QQ
Sbjct: 286 MVSPHQH----VRAALEAALPVLRTQLAESG--IQLGQSNISGESFSGQQQ 330


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits       Score  E-value  Comments
EcSMS35_4828  RTXTOXIND  31     0.030    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 30.6 bits (69), Expect = 0.030
Identities = 26/163 (15%), Positives = 59/163 (36%), Gaps = 16/163 (9%)

Query: 334 RLASGAEEEAYRRLVESQFRDDDDEQAQSN---KGRLFKITLEKALFSSPMACASVVANR 390
+ S E L++ QF +++ Q + + A + + V +R
Sbjct: 177 QNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSR 236

Query: 391 LKRLESRKDHN--SQSQINELESLLLALNNIDASQFSKYQLLLDTIRKDLAWKANNTEDR 448
L S ++ + E E+ + N S+ L+ I ++ + ++
Sbjct: 237 LDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQ----LEQIESEIL----SAKEE 288

Query: 449 LVIFTESIKTLEFLEQ--QLRADLKLKDDQIATLRGDQGDTVL 489
+ T+ K E L++ Q ++ L ++A Q +V+
Sbjct: 289 YQLVTQLFKN-EILDKLRQTTDNIGLLTLELAKNEERQQASVI 330


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
EcSMS35_4844  PF00577  1053   0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 1053 bits (2725), Expect = 0.0
Identities = 852/855 (99%), Positives = 855/855 (100%)

Query: 2 AGFFVRLFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPGTYRVDIY 61
AGFFVRLFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPGTYRVDIY
Sbjct: 24 AGFFVRLFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPGTYRVDIY 83

Query: 62 LNNGYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDACVPLTSMIH 121
LNNGYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDACVPLTSMIH
Sbjct: 84 LNNGYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDACVPLTSMIH 143

Query: 122 DATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSVQNRIGGN 181
DATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSVQNRIGGN
Sbjct: 144 DATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSVQNRIGGN 203

Query: 182 SHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRSRLTL 241
SHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRSRLTL
Sbjct: 204 SHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRSRLTL 263

Query: 242 GDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIYNS 301
GDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIYNS
Sbjct: 264 GDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIYNS 323

Query: 302 TVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITAGEY 361
TVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITAGEY
Sbjct: 324 TVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITAGEY 383

Query: 362 RSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALSVDMTQ 421
RSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALSVDMTQ
Sbjct: 384 RSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALSVDMTQ 443

Query: 422 ANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGYNIE 481
ANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGYNIE
Sbjct: 444 ANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGYNIE 503

Query: 482 TQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRSSTLYLSGSHQTYWGTNNVDEQFQ 541
TQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGR+STLYLSGSHQTYWGT+NVDEQFQ
Sbjct: 504 TQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRTSTLYLSGSHQTYWGTSNVDEQFQ 563

Query: 542 AGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRSDSKSQWRHASASYSM 601
AGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRSDSKSQWRHASASYSM
Sbjct: 564 AGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRSDSKSQWRHASASYSM 623

Query: 602 SHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGYGNANIG 661
SHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGYGNANIG
Sbjct: 624 SHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGYGNANIG 683

Query: 662 YSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNDTVVLVKAPGAKDAKVENQTGVRTDWRG 721
YSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNDTVVLVKAPGAKDAKVENQTGVRTDWRG
Sbjct: 684 YSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNDTVVLVKAPGAKDAKVENQTGVRTDWRG 743

Query: 722 YAVLPYATEYRENRVALDTNTLADNVDLDNSVANVVPTRGAIVRAEFKARVGIKLLMTLT 781
YAVLPYATEYRENRVALDTNTLADNVDLDN+VANVVPTRGAIVRAEFKARVGIKLLMTLT
Sbjct: 744 YAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAEFKARVGIKLLMTLT 803

Query: 782 HNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLAGKVQVKWGEEENAHCVANYQLPP 841
HNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLAGKVQVKWGEEENAHCVANYQLPP
Sbjct: 804 HNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLAGKVQVKWGEEENAHCVANYQLPP 863

Query: 842 ESQQQLLTQLSAECR 856
ESQQQLLTQLSAECR
Sbjct: 864 ESQQQLLTQLSAECR 878


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_4846  VACCYTOTOXIN  32     0.001    Helicobacter pylori vacuolating cytotoxin signature.
		>VACCYTOTOXIN#Helicobacter pylori vacuolating cytotoxin signature.

Length = 1291

Score = 31.5 bits (71), Expect = 0.001
Identities = 31/158 (19%), Positives = 49/158 (31%), Gaps = 9/158 (5%)

Query: 3 WRKRGYLLAAMLAFASATIQAADVTITVNGKVVAKPCTVSTTNATIDLGDLYSFSLVSAG 62
W R + A LA + +TI + VT VN + + I + G
Sbjct: 258 WMGRLQYVGAYLAPSYSTINTSKVTGEVNFNHLTVGDHNAAQAGIIASNKTH------IG 311

Query: 63 AASAWHDVALELTHCPVG--TSRVTASFSGAADSTGYYKNQGTAQNIQLELQDDSGNTLN 120
W L + P G + S + Q ++QN + N+
Sbjct: 312 TLDLWQSAGLNIIAPPEGGYKDKPNDKPSNTTQNNAKNDKQESSQNNSNTQVINPPNSAQ 371

Query: 121 TGATKTVQVDDSSQSAHFPLQVRALTVNGGATQGTIQA 158
+ QV D + V +N A GTI+
Sbjct: 372 KTEIQPTQVIDGPFAGGKNTVVNINRINTNA-DGTIRV 408


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_4847  SURFACELAYER  28     0.047    Lactobacillus surface layer protein signature.
		>SURFACELAYER#Lactobacillus surface layer protein signature.

Length = 439

Score = 28.1 bits (62), Expect = 0.047
Identities = 19/79 (24%), Positives = 32/79 (40%), Gaps = 1/79 (1%)

Query: 211 SQNLGYYLSGTTADAGNSIFTNTASFSPAQGVGVQLTRNGTIIPANNTVSLGAVGTSAVS 270
S+N G ++ +A+ N FT PA V V L ++G ++ + + +
Sbjct: 133 SENAGKEITIGSAN-PNVTFTEKTGDQPASTVKVTLDQDGVAKLSSVQIKNVYAIDTTYN 191

Query: 271 LGLTANYARTGGQVTAGNV 289
+ TG VT G V
Sbjct: 192 SNVNFYDVTTGATVTTGAV 210


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
EcSMS35_4848  PF06580  31     0.008    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 31.4 bits (71), Expect = 0.008
Identities = 10/49 (20%), Positives = 25/49 (51%)

Query: 230 LVPLIPAIIMISTTIANIWLVKDTPAWEVVNFIGSSPIAMFIAMVVAFV 278
+ +I ++ I +W V +T W ++ FI + P+A + + ++ +
Sbjct: 73 MGQIILRVLPACVVIGMVWFVANTSIWRLLAFINTKPVAFTLPLALSII 121


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_4856  PHPHTRNFRASE  575    0.0      Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase signature.

Length = 572

Score = 575 bits (1485), Expect = 0.0
Identities = 197/537 (36%), Positives = 298/537 (55%), Gaps = 13/537 (2%)

Query: 266 AISGLSVQNGIAIGPVKWFTCERPEITQRTVDSPQEELSRIESAIDIVVCEL------AD 319
I+G++ +G+AI +I + ++ E+ ++ +A++ EL +
Sbjct: 4 KITGIAASSGVAIAKAFIHLEPNVDIEKTSITDVSTEIEKLTAALEKSKEELRAIKDQTE 63

Query: 320 KAAGPE-GDIFAAHKMMLEDPKINRQLQQRLAK-GKQAEFAWLEVMQALAEQYCQAETLY 377
+ G + +IFAAH ++L+DP++ ++ ++ AE+A EV + + Y
Sbjct: 64 ASMGADKAEIFAAHLLVLDDPELVDGIKGKIENEQMNAEYALKEVSDMFVSMFESMDNEY 123

Query: 378 LREREADIRDLTRQVLNQLCGGSEQHFITTA-PCILLANDLLPSQITSLNKAHILGICLH 436
++ER ADIRD++++VL L G T A +++A DL PS LNK + G
Sbjct: 124 MKERAADIRDVSKRVLGHLIGVETGSLATIAEETVIIAEDLTPSDTAQLNKQFVKGFATD 183

Query: 437 NGGTTSHTAILARAMGIPAIVKAAITPQNVRDNDTVILDGETGRLWLQPDEVTRLDLLQR 496
GG TSH+AI++R++ IPA+V + ++ D VI+DG G + + P E ++
Sbjct: 184 IGGRTSHSAIMSRSLEIPAVVGTKEVTEKIQHGDMVIVDGIEGIVIVNPTEEEVKAYEEK 243

Query: 497 AEAWRQQRDRQLADAMQPAVTQGGRKISVLANIGDLQDIEAALSHGAEGVGLLRTEFLFH 556
A+ +Q+ +P+ T+ G + + ANIG +D++ L++G EG+GL RTEFL+
Sbjct: 244 RAAFEKQKQEWAKLVGEPSTTKDGAHVELAANIGTPKDVDGVLANGGEGIGLYRTEFLYM 303

Query: 557 ESATLPDEEEQFRVYCSVAQAFGDKPVTIRTLDIGGDKPLPSYPLPAEDNPFLGLRGIRL 616
+ LP EEEQF Y V Q KPV IRTLDIGGDK L LP E NPFLG R IRL
Sbjct: 304 DRDQLPTEEEQFEAYKEVVQRMDGKPVVIRTLDIGGDKELSYLQLPKELNPFLGFRAIRL 363

Query: 617 CLAHPQIFIPQLRALLRAGKEYPTLQIMLPMVSTLEEVNAVKTLIQTQAKLL---GLTAE 673
CL IF QLRALLRA Y L++M PM++TLEE+ K ++Q + L G+
Sbjct: 364 CLEKQDIFRTQLRALLRAS-TYGNLKVMFPMIATLEELRQAKAIMQEEKDKLLSEGVDVS 422

Query: 674 NLPALGIMIEVPAAVMIAEKLASEVDFFSIGTNDLTQYIMAADRGNSTVAELVDYRNDAV 733
+ +GIM+E+P+ + A A EVDFFSIGTNDL QY MAADR N V+ L + A+
Sbjct: 423 DSIEVGIMVEIPSTAVAANLFAKEVDFFSIGTNDLIQYTMAADRMNERVSYLYQPYHPAI 482

Query: 734 INAIAMVCQAGRNNEIPVSMCGEMAGDTQQTARLLTMGIDKLSASPSRLPALKAAIR 790
+ + MV +A + V MCGEMAGD LL +G+D+ S S + + ++ +
Sbjct: 483 LRLVDMVIKAAHSEGKWVGMCGEMAGDEVAIPLLLGLGLDEFSMSATSILPARSQLL 539


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
EcSMS35_4860  TCRTETA  36     2e-04    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 36.0 bits (83), Expect = 2e-04
Identities = 58/324 (17%), Positives = 106/324 (32%), Gaps = 25/324 (7%)

Query: 39 TGATNAELGFLMTAYGLVNFLLYLPGGWAADRFSARKLMTFSLISTGISGFYYATFPSYT 98
+ A G L+ Y L+ F G +DRF R ++ SL + AT P
Sbjct: 38 SNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLW 97

Query: 99 MICLLHALWAVTTVFTFWAVCVRIIRTLGTSEEQGRLYGYWFLGKGLTSIVLGFLSVPVF 158
++ + + +T T AV I + +E+ R +G+ + G ++ PV
Sbjct: 98 VLYIGRIVAGITGA-TG-AVAGAYIADITDGDERARHFGF--MS---ACFGFGMVAGPVL 150

Query: 159 AKFGEGVDGLRATIIFYSVVTILAGVLAWFVCQDETHSEDKANFRLADMAF-----VLKM 213
G A + + L + F+ + E + R A M
Sbjct: 151 GGLMGGF-SPHAPFFAAAALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWARGM 209

Query: 214 PTVWLAGVVTFCMWSI-YIGFGMVTPYLTQILHMGESEVAVASILRAYVLFAMGGLIGGQ 272
V V F M + + + + H + + ++ + L
Sbjct: 210 TVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGIS----LAAFGILHSLAQAM 265

Query: 273 LADRCASRTRFMIYAFIGMIVFTTVYFFLP--GESRYVTIALANMVALGVFIYSANAVFF 330
+ A+R +GMI T Y L + + + G+ + + A+
Sbjct: 266 ITGPVAARLGERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLASGGIGMPALQAMLS 325

Query: 331 SIIDEIRIPAKVTGTAAGLISLLT 354
+DE R G G ++ LT
Sbjct: 326 RQVDEER-----QGQLQGSLAALT 344


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits    Score  E-value  Comments
EcSMS35_4867  HTHFIS  228    1e-69    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 228 bits (582), Expect = 1e-69
Identities = 82/355 (23%), Positives = 157/355 (44%), Gaps = 41/355 (11%)

Query: 327 RKIAQQQISTNANFTFDSLHAASGGMKQVLLIARRAIKSISPILINGEEGVGKLSLAMAI 386
K ++ ++ L S M+++ + R +++ ++I GE G GK +A A+
Sbjct: 122 PKRRPSKLEDDSQ-DGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARAL 180

Query: 387 HNESEQRDGPFISVDCQMLSPENILHELLGSDVG-------PSPSKFELAHNGTLYLDKV 439
H+ ++R+GPF++++ + + I EL G + G S +FE A GTL+LD++
Sbjct: 181 HDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEI 240

Query: 440 EYLSGEVQSVLLKVLKTGLVTRSDSHRLIPVRFRLITCTSSSLREYVQQGAFSRQLYYEI 499
+ + Q+ LL+VL+ G T I R++ T+ L++ + QG F LYY +
Sbjct: 241 GDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRL 300

Query: 500 SMNEIEIPPLRKRREDLKQMIDDIIDKYQERTRKKMTITPDANSVLLEYRWPGNISEFKN 559
++ + +PPLR R ED+ ++ + + ++ +A ++ + WPGN+ E +N
Sbjct: 301 NVVPLRLPPLRDRAEDIPDLVRHFVQQAEKEGLDVKRFDQEALELMKAHPWPGNVRELEN 360

Query: 560 RMEKVFINCNRLVLGLENIPLDIRQN-----NSSGDDDIPHLT----------------- 597
+ ++ + V+ E I ++R L+
Sbjct: 361 LVRRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFASFG 420

Query: 598 -----------SLAELEMQAIEHTCRVCEWNLTKAAEVLKIGRTTLWRKLKIYNL 641
LAE+E I N KAA++L + R TL +K++ +
Sbjct: 421 DALPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGV 475


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits    Score  E-value  Comments
EcSMS35_4871  UREASE  34     0.001    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 33.6 bits (77), Expect = 0.001
Identities = 21/85 (24%), Positives = 37/85 (43%), Gaps = 20/85 (23%)

Query: 26 CDVLVANGKIIAVASNIPSDIVPDCT--------VVDLSGQILCPGFIDQHVHLIGGGGE 77
D+ + +G+I A+ D+ P T V+ G+I+ G +D H+H I
Sbjct: 86 ADIGLKDGRIAAIGKAGNPDMQPGVTIIVGPGTEVIAGEGKIVTAGGMDSHIHFI----- 140

Query: 78 AGPTTRTPEVALSRLTEAGVTSVVG 102
+ E AL +G+T ++G
Sbjct: 141 ---CPQQIEEALM----SGLTCMLG 158


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
EcSMS35_4876  INTIMIN  577    0.0      Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 577 bits (1487), Expect = 0.0
Identities = 178/592 (30%), Positives = 275/592 (46%), Gaps = 33/592 (5%)

Query: 136 NGENTLENQIASTSQRVGPLLSQDMNSEQASGMARGWASSEASGAMTDWLNNFGTAKISL 195
N Q AS ++ S+ +N + A A G A ++AS + WL ++GTA+++L
Sbjct: 161 KALNYAAQQAASLGSQLQ---SRSLNGDYAKDTALGIAGNQASSQLQAWLQHYGTAEVNL 217

Query: 196 GVDEDFSLKNSQFDFLHPWYDTPDYLLFSQHTLHRTDDRTQINTGLGWRHFTPSWMSGIN 255
+F S DFL P+YD+ L F Q D R N G G R F P M G N
Sbjct: 218 QSGNNFD--GSSLDFLLPFYDSEKMLAFGQVGARYIDSRFTANLGAGQRFFLPENMLGYN 275

Query: 256 LFFDHDLSRYHSRAGLGAEYWRDYLKLSSNAYIGLTGWRSAPELDNDYEARPANGWDLRA 315
+F D D S ++R G+G EYWRDY K S N Y ++GW + DY+ RPANG+D+R
Sbjct: 276 VFIDQDFSGDNTRLGIGGEYWRDYFKSSVNGYFRMSGWHESYN-KKDYDERPANGFDIRF 334

Query: 316 EGWLPAWPQLGGKLVYEQYYGDEVALFDKNDRQSNPHAITAGLNYTPFPLLTLSAEQRQG 375
G+LP++P LG KL+YEQYYGD VALF+ + QSNP A T G+NYTP PL+T+ + R G
Sbjct: 335 NGYLPSYPALGAKLMYEQYYGDNVALFNSDKLQSNPGAATVGVNYTPIPLVTMGIDYRHG 394

Query: 376 KQGENDTRFAVDLTWQPSSSMQKQLNPDEVAGRRSLAGSRYDLIDRNNNIVLEYRKKELI 435
END +++ +Q +Q+ P V R+L+GSRYDL+ RNNNI+LEY+K++++
Sbjct: 395 TGNENDLLYSMQFRYQFDKPWSQQIEPQYVNELRTLSGSRYDLVQRNNNIILEYKKQDIL 454

Query: 436 HLSLQDPVKGKSGEIKPLVSSIQTKYALKGYNIEAAALEAAGGKVRTSG----KDITVTL 491
L++ + G + + +++KY L + +AL + GG+++ SG +D L
Sbjct: 455 SLNIPHDINGTERSTQKIQLIVKSKYGLDRIVWDDSALRSQGGQIQHSGSQSAQDYQAIL 514

Query: 492 PGYRFTNTPETDNTWSIDVTAEDVKGNLSRHEQ-SMVVIQAPTLSQKDSLLSVNPLTVAA 550
P Y N + + A D GN S + ++ V+ + + + +A
Sbjct: 515 PAY----VQGGSNVYKVTARAYDRNGNSSNNVLLTITVLSNGQVVDQVGVTDFTADKTSA 570

Query: 551 DKKSTTTLTVTAHDSD------GTPVPGLALQTRSEGVQDITLSDWTDNGDGSYTQMLTA 604
T +T TA PV + G ++ + NG G T L +
Sbjct: 571 KADGTEAITYTATVKKNGVAQANVPVSFNIVS----GTAVLSANSANTNGSGKATVTLKS 626

Query: 605 GTTSGSVTLTPQINGESAVKESIVVNIVPVVSSRDHSSITIDNVSYYAGDDIKVRVELKD 664
V SA+ + V I + + I D + A + +K
Sbjct: 627 DKPGQVVVSAKTAEMTSALNANAV--IFVDQTKASITEIKADKTTAVANGQDAITYTVKV 684

Query: 665 DSN-QPVAYQKEELVKAVTVENSKPGATIVWHEEQPGVYAANYPAHKQGTAL 715
+PV+ Q+ + K + + G + G +L
Sbjct: 685 MKGDKPVSNQEVTFTTTL----GKLSNSTE-KTDTNGYAKVTLTSTTPGKSL 731



Score = 73.2 bits (179), Expect = 7e-15
Identities = 83/435 (19%), Positives = 145/435 (33%), Gaps = 43/435 (9%)

Query: 961 NTVILTASVKDVYGHPLPDEDVKFTLPASMTGNFTLSSETARTDANGDAVVTLRGTKAGE 1020
++ + + + + +D + LPA + G + TAR G +
Sbjct: 489 DSALRSQGGQIQHSGSQSAQDYQAILPAYVQGGSNVYKVTARAYDR-------NGNSSNN 541

Query: 1021 FTVTATLTRNNTVAHQQ--VTFIGDTNSAQLQPLTASLNTIVAGDSTGSTLTATILDAYQ 1078
+T T+ N V Q F D SA+ A + T TAT+
Sbjct: 542 VLLTITVLSNGQVVDQVGVTDFTADKTSAK------------ADGTEAITYTATVKKNGV 589

Query: 1079 NPLKDQLV-TFQSNDVTLSGTEVTTNTLGQATVTMTSNIAGQHNVVVSRKAQVSDNKTFS 1137
+ S LS TN G+ATVT+ S+ GQ VV ++ A+++ +
Sbjct: 590 AQANVPVSFNIVSGTAVLSANSANTNGSGKATVTLKSDKPGQ-VVVSAKTAEMTSALNAN 648

Query: 1138 LSVLPDESSAKVISITGAEKTITVGENITLRILVQDAFN-NVISGQRVRLSAQPTANITI 1196
+ D++ A + I + T + V+ +S Q V + ++
Sbjct: 649 AVIFVDQTKASITEIKADKTTAVANGQDAITYTVKVMKGDKPVSNQEVTFT-TTLGKLSN 707

Query: 1197 GDTAYTDNNGYAYVNLISTQPGVYQVTATLDNNSSSKVDVNVAN-GKLELTSSKPETTVH 1255
T TD NGYA V L ST PG V+A + + + V L + E
Sbjct: 708 -STEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFFTTLTIDDGNIEIVGT 766

Query: 1256 NSEGITLTATARNARDEL-MPGQIITFSVTPEGATLSNTGEILTDQSGQAKVTLTSNKVN 1314
+G T + + L G ++ +N I + + +VTL +
Sbjct: 767 GVKGKLPTVWLQYGQVNLKASGGNGKYTW-----RSANPA-IASVDASSGQVTL--KEKG 818

Query: 1315 VYTVTATMGKDVPVQSQVTVAVKADAKTAHVVSVVASPDTITADGVDSSTITSRVEDDYG 1374
T++ T + +V + S D V++
Sbjct: 819 TTTISVISS------DNQTATYTIATPNSLIVPNM-SKRVTYNDAVNTCKNFGGKLPSSQ 871

Query: 1375 FPVEGVDVSYALDTK 1389
+E V ++ K
Sbjct: 872 NELENVFKAWGAANK 886



Score = 66.2 bits (161), Expect = 8e-13
Identities = 65/278 (23%), Positives = 105/278 (37%), Gaps = 21/278 (7%)

Query: 1168 RILVQDAFNNVISGQRVRLSAQPTANITIGDTAYTDNNGYAYVNLISTQPGVYQVTATLD 1227
RI+ D+ GQ +Q + AY N+ Y
Sbjct: 484 RIVWDDSALRSQGGQIQHSGSQSAQDYQAILPAYVQ----GGSNVYKVTARAYDRNGNSS 539

Query: 1228 NNSSSKVDVNVANGKL-------ELTSSKPETTVHNSEGITLTATARNARDELMPGQIIT 1280
NN + V ++NG++ + T+ K +E IT TAT + ++
Sbjct: 540 NNVLLTITV-LSNGQVVDQVGVTDFTADKTSAKADGTEAITYTATVKK-NGVAQANVPVS 597

Query: 1281 FSVTPEGATLSNTGEILTDQSGQAKVTLTSNKVNVYTVTA-TMGKDVPVQSQVTVAVKAD 1339
F++ A LS T+ SG+A VTL S+K V+A T + + + V D
Sbjct: 598 FNIVSGTAVLSANSA-NTNGSGKATVTLKSDKPGQVVVSAKTAEMTSALNANAVIFV--D 654

Query: 1340 AKTAHVVSVVASPDTITADGVDSSTITSRVEDDYGFPVEGVDVSYALDTKGSPVVNIPTT 1399
A + + A T A+G D+ T T +V PV +V T + + T
Sbjct: 655 QTKASITEIKADKTTAVANGQDAITYTVKV-MKGDKPVSNQEV--TFTTTLGKL-SNSTE 710

Query: 1400 RTDQSGQVTATITSTLAETLIVNVQVPGTANQSATITL 1437
+TD +G T+TST +V+ +V A +
Sbjct: 711 KTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEV 748



Score = 63.5 bits (154), Expect = 6e-12
Identities = 66/344 (19%), Positives = 111/344 (32%), Gaps = 26/344 (7%)

Query: 848 LVADPDTIIAGNSQGSTLTATVTDFHNNPLKDMKVNFVAPGGSQLDNTTATTDQSGIVRV 907
AD + A ++ T TATV + G + L +A T+ SG V
Sbjct: 563 FTADKTSAKADGTEAITYTATVKKNGVAQANVPVSFNIVSGTAVLSANSANTNGSGKATV 622

Query: 908 HLTSSKAGSYSVDASLEADKNIHQSVTITVVPNREQSVMTLNAGSGSAIANNTNTVILTA 967
L S K G V A + + + V + S+ + A +A+AN + + T
Sbjct: 623 TLKSDKPGQVVVSAKTAEMTSALNANAVIFVDQTKASITEIKADKTTAVANGQDAITYTV 682

Query: 968 SVKDVYGHPLPDEDVKFTLPASMTGNFTLSSETARTDANGDAVVTLRGTKAGEFTVTATL 1027
V P+ +++V FT N T +TD NG A VTL T G+ V+A +
Sbjct: 683 KVMK-GDKPVSNQEVTFTTTLGKLSN-----STEKTDTNGYAKVTLTSTTPGKSLVSARV 736

Query: 1028 TRNNT-VAHQQVTFIGDTNSAQLQPLTASLNTIVAGDSTGSTLTATI-LDAYQNPLKDQL 1085
+ V +V F + IV G T +
Sbjct: 737 SDVAVDVKAPEVEFFTTLT------IDDGNIEIVGTGVKGKLPTVWLQYGQVNLKASGGN 790

Query: 1086 VTFQSNDVTLSGTEVTTNTLGQATVTMTSNIAGQHNVVVSRKAQVSDNKTFSLSVLPDES 1145
+ + + + VT+ G + V SDN+T + ++
Sbjct: 791 GKY---TWRSANPAIASVDASSGQVTL--KEKGTTTISVI----SSDNQTATYTI--ATP 839

Query: 1146 SAKVISITGAEKTITVGENI-TLRILVQDAFNNVISGQRVRLSA 1188
++ ++ T N + N + A
Sbjct: 840 NSLIVPNMSKRVTYNDAVNTCKNFGGKLPSSQNELENVFKAWGA 883



Score = 33.9 bits (77), Expect = 0.007
Identities = 34/283 (12%), Positives = 80/283 (28%), Gaps = 27/283 (9%)

Query: 502 TDNTWSIDVTAEDVKGNLSRHEQSMVVIQAPTLSQKDSLLSVNPLTVAADKKSTTTLTVT 561
+ V + S + V+ + + + T A+ + T TV
Sbjct: 625 KSDKPGQVVVSAKTAEMTSALNANAVIF-VDQTKASITEIKADKTTAVANGQDAITYTVK 683

Query: 562 AHDSDGTPVPGLALQTRSEGVQDITLSDWTDNGDGSYTQMLTAGTTSGSVTLTPQINGES 621
PV T + + ++ S + +G LT TT G ++ +++ +
Sbjct: 684 V-MKGDKPVSN-QEVTFTTTLGKLSNSTEKTDTNGYAKVTLT-STTPGKSLVSARVSDVA 740

Query: 622 AVKESIVVNIVPVVSSRDHSSITIDNVSYYAGDDIKVRVELKD-DSNQPVAYQKEELVKA 680
++ V ++ DD + + P + + V
Sbjct: 741 VDVKAPEVEFFTTLTI----------------DDGNIEIVGTGVKGKLPTVWLQYGQVNL 784

Query: 681 VTVENSKPGATIVWHEEQPGVYAANYPAHKQGTALRAQLSLHNWNAPLQSHIYNIEANQN 740
+ W P + + + + + + ++ ++ Q+ Y I +
Sbjct: 785 KASGGN---GKYTWRSANPAIASVDASSGQVTLKEKGTTTISVISSDNQTATYTIATPNS 841

Query: 741 KARVATLSATNNDVYADKKTFNTLTINVTDESDNPLTNHQVTF 783
+ + + Y D S N L N +
Sbjct: 842 ---LIVPNMSKRVTYNDAVNTCKNFGGKLPSSQNELENVFKAW 881


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
EcSMS35_4877  TCRTETA  29     0.037    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 29.0 bits (65), Expect = 0.037
Identities = 64/316 (20%), Positives = 113/316 (35%), Gaps = 24/316 (7%)

Query: 82 RPFLLASALATGLLILAMAWLPPFLLVFIIRFLAGV-----ASAGMLIFGSTLIMQHTRH 136
RP LL S + MA P +++I R +AG+ A AG I T + RH
Sbjct: 73 RPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIADITDGDERARH 132

Query: 137 PFVLAALFSGVGVGIALGNEYVLAGLHFALSSQTLWQGAGALSAIILLALALLIP-SNKH 195
++A F G G+ G VL GL S + A AL+ + L L+P S+K
Sbjct: 133 FGFMSACF---GFGMVAGP--VLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPESHKG 187

Query: 196 VIPPAPLAKIAQQPMSWW---------LLAILYGLAGFGYIIVATYLPLMAKDAGQPVLT 246
P + W L+A+ + + G + A ++ + +D T
Sbjct: 188 ERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWV-IFGEDRFHWDAT 246

Query: 247 AHLWTLVGLSIVPGCFGWLWA---AKRWGALPCLTANLLVQAICVLLTLASSSPLLLIIS 303
+L I+ + A R G L ++ +L ++ +
Sbjct: 247 TIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGWMAFPI 306

Query: 304 SIGFGGTFMGTTSLVMTIARQLSVPGNLNLLGFVTLIYGIGQILGPALTSMLGNGTSALA 363
+ +G +L ++RQ+ L G + + + I+GP L + + +
Sbjct: 307 MVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYAASITTW 366

Query: 364 SATLCGAAALFIAALI 379
+ A A +
Sbjct: 367 NGWAWIAGAALYLLCL 382


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
EcSMS35_4884  TCRTETB  50     1e-08    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 49.9 bits (119), Expect = 1e-08
Identities = 46/189 (24%), Positives = 76/189 (40%), Gaps = 5/189 (2%)

Query: 7 RHAATLFFPMALILYDFAAYLSTDLIQPGIINVVRDFNADVSLAPAAVSLYLAGGMALQW 66
RH L + L + + ++ P I N A + A L + G A+
Sbjct: 11 RHNQILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAV-- 68

Query: 67 LLGPLSDRIGRRPVLINGALIFTLACAATMFTTSMTQFLI-ARAIQGTSICFIATVGYVT 125
G LSD++G + +L+ G +I S LI AR IQG + V
Sbjct: 69 -YGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVV 127

Query: 126 VQEAFGQTKGIKLMAIITSIVLIAPIIGPLSGAALMHFMHWKVLFAIIAVMGFISFVGLL 185
V + K +I SIV + +GP G + H++HW L +I ++ I+ L+
Sbjct: 128 VARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLL-LIPMITIITVPFLM 186

Query: 186 LAMPETVKR 194
+ + V+
Sbjct: 187 KLLKKEVRI 195


76  EcSMS35_0116  EcSMS35_0125  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
EcSMS35_0116  6       31       -8.038197  putative S-type colicin
EcSMS35_0117  6       25       -8.188794  putative colicin immunity protein
EcSMS35_0118  4       27       -7.671468  putative colicin
EcSMS35_0119  4       27       -1.568007  putative colicin immunity protein
EcSMS35_0120  5       35       0.606547   putative colicin
EcSMS35_0121  5       38       1.063178   putative colicin immunity protein
EcSMS35_0123  5       31       1.888495   transcriptional regulator PdhR
EcSMS35_0122  4       35       2.185829   hypothetical protein
EcSMS35_0124  4       34       2.385966   pyruvate dehydrogenase subunit E1
EcSMS35_0125  2       27       1.919927   dihydrolipoamide acetyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0116  PYOCINKILLER  184    8e-53    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 184 bits (468), Expect = 8e-53
Identities = 97/288 (33%), Positives = 139/288 (48%), Gaps = 21/288 (7%)

Query: 309 LQQKALAGSTATTRVRFFWGTDIHGKPQVYGVHTGEGTPY-ENVRVANMQWNEQTQRYEF 367
+ A+A ++ T + + G V + +G + V V +N T YE
Sbjct: 341 VNLNAVAKASGTVDLPMRLTNEARGNTTTLSVVSTDGVSVPKAVPVRMAAYNATTGLYEV 400

Query: 368 T---PAHDVDGPLITWTPENPEHGNVPGHTGN--DRPPLEQPTILVTPIPDGTDTYSTPP 422
T + ++TWTP +P P T +P +TP+ +TY
Sbjct: 401 TVPSTTAEAPPLILTWTPASPPGNQNPSSTTPVVPKPVPVYEGATLTPVKATPETY---- 456

Query: 423 FPVPDPKEFNDYILVFPAGSGIKPIYVYLKEDPRKLPGVVTGRGVPLSPGTRWLDMSVSN 482
P D I+ FPA SGIKPIYV + DPR +PG TG+G P+ WL ++
Sbjct: 457 -PGVITLP-EDLIIGFPADSGIKPIYVMFR-DPRDVPGAATGKGQPV--SGNWLG--AAS 509

Query: 483 NGNGAPIPAHIADKLRGREFKTFDEFREALWLEVSQDPELIAQFSSGNQTRIKQGLTAKA 542
G GAPIP+ IADKLRG+ FK + +FRE W+ V+ DPEL QF+ G+ ++ G
Sbjct: 510 QGEGAPIPSQIADKLRGKTFKNWRDFREQFWIAVANDPELSKQFNPGSLAVMRDGGAPYV 569

Query: 543 PIDGWHYGPKEIVKKFQIHHRVAIEYGGSVYDIDNLRIVTPRLHDEIH 590
G + K +IHH+V + GG VY++ NL VTP+ H EIH
Sbjct: 570 RESE-QAGGRI---KIEIHHKVRVADGGGVYNMGNLVAVTPKRHIEIH 613


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0118  PYOCINKILLER  54     2e-12    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 54.0 bits (129), Expect = 2e-12
Identities = 18/64 (28%), Positives = 31/64 (48%), Gaps = 4/64 (6%)

Query: 1 MSQYPELIAQFSSGNQTRIKQGLIAKAPLEGWHYGTKEIVKKFHIYHRVAIEYSGGIYDI 60
++ PEL QF+ G+ ++ G E G + K I+H+V + GG+Y++
Sbjct: 543 VANDPELSKQFNPGSLAVMRDGGAPYVR-ESEQAGGRI---KIEIHHKVRVADGGGVYNM 598

Query: 61 DNLR 64
NL
Sbjct: 599 GNLV 602


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
EcSMS35_0119  PF04605  26     0.013    Virulence-associated protein D (VapD)
		>PF04605#Virulence-associated protein D (VapD)

Length = 125

Score = 26.4 bits (58), Expect = 0.013
Identities = 13/34 (38%), Positives = 18/34 (52%)

Query: 1 MYDFKNKIEDYTEREFIELLGEFTNPTGDNAQLK 34
Y K I+D ++F + L EFT T N +LK
Sbjct: 88 QYSLKETIQDLCAKDFHQKLKEFTEKTPKNQKLK 121


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0120  PYOCINKILLER  47     2e-10    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 47.5 bits (112), Expect = 2e-10
Identities = 16/62 (25%), Positives = 29/62 (46%), Gaps = 4/62 (6%)

Query: 1 MSQYPELIAQFSTGNQTRIKQGLIAKAPLEGWYYGSKEIVKEFHIYHSVAIECGGEIYDI 60
++ PEL QF+ G+ ++ G E G + + I+H V + GG +Y++
Sbjct: 543 VANDPELSKQFNPGSLAVMRDGGAPYVR-ESEQAGGRI---KIEIHHKVRVADGGGVYNM 598

Query: 61 DN 62
N
Sbjct: 599 GN 600


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits       Score  E-value  Comments
EcSMS35_0125  RTXTOXIND  34     0.002    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 34.0 bits (78), Expect = 0.002
Identities = 42/281 (14%), Positives = 89/281 (31%), Gaps = 32/281 (11%)

Query: 26 DKVEAEQSLITVEGDKASMEVPSPQAGIVKEIKVSVGDKTQTGALIMIFDSADGAADAAP 85
+ V +T G S E+ + IVKEI V G+ + G +++ + AD
Sbjct: 81 EIVATANGKLTHSGR--SKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLK 138

Query: 86 AQA--------EEKKEAAPAAA-----PAAAAAKDVNVPDIGSDEVEVTEILVKVG-DKV 131
Q+ + + + + P + ++ +EV L+K
Sbjct: 139 TQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTW 198

Query: 132 EAEQSLITVEGDKASMEVPAPFAGTVKEIKVNVGDKVSTGSLIMVFEVAGEAGAAAPAAK 191
+ ++ + DK E A + ++ +K + A +
Sbjct: 199 QNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHK-QAIAKHAVLEQ 257

Query: 192 QEAAPAAAPASAAGVKDVNVPDIGGDEV-------------EVTEVMVKVGDKVAA-EQS 237
+ A + + E+ + + + D +
Sbjct: 258 ENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLE 317

Query: 238 LITVEGDKASMEVPAPFAGVVKELKVN-VGDKVKTGSLIMI 277
L E + + + AP + V++LKV+ G V T +M+
Sbjct: 318 LAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMV 358



Score = 29.8 bits (67), Expect = 0.034
Identities = 20/95 (21%), Positives = 35/95 (36%), Gaps = 3/95 (3%)

Query: 230 DKVAAEQSLITVEGDKASMEVPAPFAGVVKELKVNVGDKVKTGSLIMIFEVEGAAPAAAP 289
+ VA +T G S E+ +VKE+ V G+ V+ G +++ GA A
Sbjct: 81 EIVATANGKLTHSGR--SKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAE-ADTL 137

Query: 290 AKQEAAAPAPAAKAEAPAAAPAAKAEGKSEFAEND 324
Q + A + + + + E D
Sbjct: 138 KTQSSLLQARLEQTRYQILSRSIELNKLPELKLPD 172


77  EcSMS35_0242  EcSMS35_0255  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
EcSMS35_0242  2       35       2.645455   flagellar biosynthesis protein FlhB
EcSMS35_0243  1       31       1.646066   lateral flagellar export/assembly protein LfiR
EcSMS35_0244  0       30       1.599193   lateral flagellar export/assembly protein LfiQ
EcSMS35_0245  1       30       3.185673   flagellar biosynthesis protein FliP
EcSMS35_0246  3       30       3.510778   lateral flagellar export/assembly protein LfiN
EcSMS35_0247  3       29       4.223450   lateral flagellar export/assembly protein LfiM
EcSMS35_0248  3       35       6.629697   lateral flagellar RpoN-interacting regulatory
EcSMS35_0249  4       38       7.537996   lateral flagellar basal body component protein
EcSMS35_0250  4       37       7.395812   flagellar MS-ring protein
EcSMS35_0251  4       33       5.338043   flagellar motor switch protein G
EcSMS35_0252  3       21       0.757247   flagellar assembly protein H
EcSMS35_0253  1       20       0.706095   lateral flagellar export/assembly protein LfiI
EcSMS35_0254  -1      16       -1.169552  lateral flagellar export/assembly protein LfiJ
EcSMS35_0255  -1      16       -1.469493  cytidyltransferase-like protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0242  TYPE3IMSPROT  298    e-101    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 298 bits (765), Expect = e-101
Identities = 97/347 (27%), Positives = 178/347 (51%), Gaps = 6/347 (1%)

Query: 6 SEEKTEKPSAQKLRKAREEGQLPRSKDMGLAASLFAAFVVISSSFPWYADFVRESFISVH 65
S EKTE+P+ +K+R AR++GQ+ +SK++ A + A ++ +Y + + +
Sbjct: 2 SGEKTEQPTPKKIRDARKKGQVAKSKEVVSTALIVALSAMLMGLSDYYFEHFSKLMLI-- 59

Query: 66 QYAQEINNPDV--IGQFLRHHLLILGKFILTLLPMPA-AALLSSLVPGGWLFLPKKILPD 122
A++ P + + + LL LL + A A+ S +V G+L + I PD
Sbjct: 60 -PAEQSYLPFSQALSYVVDNVLLEFFYLCFPLLTVAALMAIASHVVQYGFLISGEAIKPD 118

Query: 123 FSKISPLKGIGRLFSSEHLAETGKMTVKSVVVLVMLWVSLRNNFAAFLGLQALPFKLAIN 182
KI+P++G R+FS + L E K +K V++ +++W+ ++ N L L +
Sbjct: 119 IKKINPIEGAKRIFSIKSLVEFLKSILKVVLLSILIWIIIKGNLVTLLQLPTCGIECITP 178

Query: 183 DGLSLYASVMRNFVILFIFFALIDVPLAKALFTKGLKMTKQELKEEYKNQEGKPEVKARV 242
+ +M + F+ ++ D + K LKM+K E+K EYK EG PE+K++
Sbjct: 179 LLGQILRQLMVICTVGFVVISIADYAFEYYQYIKELKMSKDEIKREYKEMEGSPEIKSKR 238

Query: 243 RRLQRQLAMGQIRKVVPKANVVITNPTHYAVALQYDQSRAAAPFVVAKGTDEIALYIRQV 302
R+ +++ +R+ V +++VV+ NPTH A+ + Y + P V K TD +R++
Sbjct: 239 RQFHQEIQSRNMRENVKRSSVVVANPTHIAIGILYKRGETPLPLVTFKYTDAQVQTVRKI 298

Query: 303 AAENQVEVVEFPRLARSVYYTTQVNQQIPFQLYRAIAHVLTYVLQMK 349
A E V +++ LAR++Y+ V+ IP + A A VL ++ +
Sbjct: 299 AEEEGVPILQRIPLARALYWDALVDHYIPAEQIEATAEVLRWLERQN 345


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0243  TYPE3IMRPROT  110    2e-31    Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 110 bits (277), Expect = 2e-31
Identities = 75/232 (32%), Positives = 126/232 (54%), Gaps = 2/232 (0%)

Query: 8 QLTDLALGLWFPFVRIMAFLRYVPVLDNSALTVRVRIILSLALAIIITPLIPHPIPHDLL 67
Q ++P +R++A + P+L ++ RV++ L++ + I P +P +
Sbjct: 8 QWLSWLNLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPANDV-PVF 66

Query: 68 SLNSLILTVEQILWGMLFGLMFQFLFLALQLAGQILSFNMGMSMAVMNDPSSGASTTVLA 127
S +L L V+QIL G+ G QF F A++ AG+I+ MG+S A DP+S + VLA
Sbjct: 67 SFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHLNMPVLA 126

Query: 128 ELINVYAILLFFAMDGHLLLVSVLYKGFTYWPIGNA-LHPQTLRTIALAFSWVLASASLL 186
++++ A+LLF +GHL L+S+L F PIG L+ + A S + + +L
Sbjct: 127 RIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIFLNGLML 186

Query: 187 ALPTTFIMLIVQGCFGLLNRIAPPLNLFSLGFPINMLAGLVCFATLLYNLPD 238
ALP ++L + GLLNR+AP L++F +GFP+ + G+ A L+ +
Sbjct: 187 ALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAP 238


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0244  TYPE3IMQPROT  43     3e-09    Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.

Length = 86

Score = 43.2 bits (102), Expect = 3e-09
Identities = 21/73 (28%), Positives = 36/73 (49%)

Query: 14 GIKVVILLVSVLVVPSLLVGLLVSVFQAVTQINEQTLSFLPRLIVTLVVLGVCGKWMIIQ 73
+ +V++L + + ++GLLV +FQ VTQ+ EQTL F +L+ + L + W
Sbjct: 11 ALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLFLLSGWYGEV 70

Query: 74 LHDLCIHLFSQAA 86
L + A
Sbjct: 71 LLSYGRQVIFLAL 83


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0245  FLGBIOSNFLIP  223    2e-75    Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP signature.

Length = 245

Score = 223 bits (569), Expect = 2e-75
Identities = 111/243 (45%), Positives = 152/243 (62%), Gaps = 4/243 (1%)

Query: 3 RRTQLALGLGLLALAPLALAQGGDIALLNVVTHGNTQEYSVKIQVLILMTLVGLLPTMVL 62
RR + L + PLA AQ + + G Q +S+ +Q L+ +T + +P ++L
Sbjct: 2 RRLLSVAPVLLWLITPLAFAQLP--GITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILL 59

Query: 63 MMTCFTRFIIVLSLLRQALGLQQTPPNRILIGIALSLTMLVMRPVWLNIYDHAVVPFEND 122
MMT FTR IIV LLR ALG PPN++L+G+AL LT +M PV IY A PF +
Sbjct: 60 MMTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEE 119

Query: 123 QITLTDALSTAATPLKRFMLAQTDKKAMAQIMTIGGAKG--NAADQDLTIVVPAYVLSEL 180
+I++ +AL A PL+ FML QT + + + + I++PAYV SEL
Sbjct: 120 KISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSEL 179

Query: 181 KTAFQIGFMIYIPFLVIDLIVASVLMAMGMMMLSPLIVSLPFKLMLFVLIDGWSLTIGTL 240
KTAFQIGF I+IPFL+IDL++ASVLMA+GMMM+ P ++LPFKLMLFVL+DGW L +G+L
Sbjct: 180 KTAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSL 239

Query: 241 TTS 243
S
Sbjct: 240 AQS 242


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0246  FLGMOTORFLIN  71     5e-19    Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 71.1 bits (174), Expect = 5e-19
Identities = 39/104 (37%), Positives = 61/104 (58%), Gaps = 5/104 (4%)

Query: 13 GLADEVAPVTKSADAETLVTRL----EDRFSDSMTLLKRIPVTLTLEVSSVEIMLADLLN 68
L ++ A TKSA A+ + +L + L+ IPV LT+E+ + + +LL
Sbjct: 22 ALNEQKATTTKSA-ADAVFQQLGGGDVSGAMQDIDLIMDIPVKLTVELGRTRMTIKELLR 80

Query: 69 IDDDTVIELDKLAGEPLDIKVNNILLGKAEVVVVNEKYGLRVLE 112
+ +V+ LD LAGEPLDI +N L+ + EVVVV +KYG+R+ +
Sbjct: 81 LTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGVRITD 124


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits    Score  E-value  Comments
EcSMS35_0248  HTHFIS  361    e-125    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 361 bits (929), Expect = e-125
Identities = 112/345 (32%), Positives = 183/345 (53%), Gaps = 26/345 (7%)

Query: 3 ELIATAASSINAFTLAKRVAAFNVPVLIQGETGAGKECVAKYIHTVAFGENDNAPYIGVN 62
L+ +A+ + + R+ ++ ++I GE+G GKE VA+ +H +G+ N P++ +N
Sbjct: 138 PLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALH--DYGKRRNGPFVAIN 195

Query: 63 CAAIPENMLESTLFGYDKGAFTGAIASVPGKMELANNGSLLLDEIGEMPLALQAKILRVL 122
AAIP +++ES LFG++KGAFTGA G+ E A G+L LDEIG+MP+ Q ++LRVL
Sbjct: 196 MAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDMPMDAQTRLLRVL 255

Query: 123 QEQQVERLGSNRQIKLNFRLIACTNKNLEQEVAAGRFREDLYYRLAVIPITMPPLRERLN 182
Q+ + +G I+ + R++A TNK+L+Q + G FREDLYYRL V+P+ +PPLR+R
Sbjct: 256 QQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRLNVVPLRLPPLRDRAE 315

Query: 183 DIIPLAESFIKKYSTVLVKNITLSESTRRALLNYRWPGNVRQLENAIQRGMILNRDGVIY 242
DI L F+++ + + + + WPGNVR+LEN ++R L VI
Sbjct: 316 DIPDLVRHFVQQAEKEGLDVKRFDQEALELMKAHPWPGNVRELENLVRRLTALYPQDVIT 375

Query: 243 PDAL---------------------GLPDTDIADHSELQWPVQPAVHIAETGDLGQHGRS 281
+ + L + + + Q+ + +G +
Sbjct: 376 REIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFASFGDALPPSGLYDRVLAE 435

Query: 282 AQYQYIADLMRKYQGNRSKIADLLGITPRALRYRLASMRKHGIEV 326
+Y I + +GN+ K ADLLG+ LR + +R+ G+ V
Sbjct: 436 MEYPLILAALTATRGNQIKAADLLGLNRNTLRKK---IRELGVSV 477


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits         Score  E-value  Comments
EcSMS35_0249  FLGHOOKFLIE  39     4e-07    Flagellar hook-basal body complex protein FliE signa...
		>FLGHOOKFLIE#Flagellar hook-basal body complex protein FliE signature.

Length = 103

Score = 38.9 bits (90), Expect = 4e-07
Identities = 24/79 (30%), Positives = 36/79 (45%), Gaps = 1/79 (1%)

Query: 36 SSTDPDVSFNRIMSGALGHVDQFQQVAEQQQTAVDTGKSD-DLAGAMIASQQASLSFSAL 94
S P +SF + AL + Q A Q G+ L M Q+AS+S
Sbjct: 25 SLPQPTISFAGQLHAALDRISDTQTAARTQAEKFTLGEPGVALNDVMTDMQKASVSMQMG 84

Query: 95 VQVRNKIATGFNDLMSMSI 113
+QVRNK+ + ++MSM +
Sbjct: 85 IQVRNKLVAAYQEVMSMQV 103


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0250  FLGMRINGFLIF  284    5e-91    Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 284 bits (727), Expect = 5e-91
Identities = 161/551 (29%), Positives = 255/551 (46%), Gaps = 37/551 (6%)

Query: 18 RLADNKRWALMAGVGLAVAATAIIVSVLWTGNRGYVSLYGRQENLPVSQIVTVLDGEKLS 77
RL N R L V + A ++ VLW Y +L+ + IV L +
Sbjct: 18 RLRANPRIPL--IVAGSAAVAIVVAMVLWAKTPDYRTLFSNLSDQDGGAIVAQLTQMNIP 75

Query: 78 YRIDPQSGQILVPEDELSKTRMTLAAKGVQAILPSGYELMDKDEVLGSSQFVQNVRYKRS 137
YR SG I VP D++ + R+ LA +G+ G+EL+D+ E G SQF + V Y+R+
Sbjct: 76 YRFANGSGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQ-EKFGISQFSEQVNYQRA 134

Query: 138 LEGELAQSIMSLDAVESARVHLALNEESSFVVSDEPQNSASVVVRLHYGAKLNMDQVNAI 197
LEGELA++I +L V+SARVHLA+ + S FV + SASV V L G L+ Q++A+
Sbjct: 135 LEGELARTIETLGPVKSARVHLAMPKPSLFVREQKSP-SASVTVTLEPGRALDEGQISAV 193

Query: 198 VHLVSGSIPGLHASKVSVVDQAGNLLTDG-IGAGEAVSAATRKRDQILKDIQDKTRASVA 256
VHLVS ++ GL V++VDQ+G+LLT + A + + + IQ + +
Sbjct: 194 VHLVSSAVAGLPPGNVTLVDQSGHLLTQSNTSGRDLNDAQLKFANDVESRIQRR----IE 249

Query: 257 NVLDSLVGSGNYRVSVMPDLDLSNIDETQEHY---GDAPKIN---REENVLDSDTNQVAM 310
+L +VG+GN V LD +N ++T+EHY GDA K R+ N+ +
Sbjct: 250 AILSPIVGNGNVHAQVTAQLDFANKEQTEEHYSPNGDASKATLRSRQLNISEQVGAGYPG 309

Query: 311 GVPGSLSNRPPIAANQMTNGTEENR----------------SPEALSKHSESKRDYSYDR 354
GVPG+LSN+P N+ S S +Y DR
Sbjct: 310 GVPGALSNQPAPPNEAPIATPPTNQQNAQNTPQTSTSTNSNSAGPRSTQRNETSNYEVDR 369

Query: 355 SVQHIQHPGFAVKRLNVAVVLN-QNAPALKN--WKPEQTTQLTALLNNAAGIDVQRGDNL 411
+++H + ++RL+VAVV+N + K +Q Q+ L A G +RGD L
Sbjct: 370 TIRHTKMNVGDIERLSVAVVVNYKTLADGKPLPLTADQMKQIEDLTREAMGFSDKRGDTL 429

Query: 412 TLSLLNFVPQAVPVEPIIPLWKDDSVLAWVRLIGCGLLALLLLFFVVRPVMKRLTAVRAP 471
+ F +P W+ S + + G LL L++ + + R ++ R
Sbjct: 430 NVVNSPF-SAVDNTGGELPFWQQQSFIDQLLAAGRWLLVLVVAWILWRKAVRPQLTRRVE 488

Query: 472 VITPEPEAVSEPWIAMPEEERKNVDLPSLPGDDSLPSQSSGLEVKLEFLQKLAMSDTDRV 531
E E + L + +Q G EV + +++++ +D V
Sbjct: 489 EAKAAQEQAQVRQETEEAVEVRLSKDEQLQQRRA--NQRLGAEVMSQRIREMSDNDPRVV 546

Query: 532 AEVLRQWITSN 542
A V+RQW++++
Sbjct: 547 ALVIRQWMSND 557


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0251  FLGMOTORFLIG  186    2e-58    Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 186 bits (473), Expect = 2e-58
Identities = 86/317 (27%), Positives = 170/317 (53%), Gaps = 2/317 (0%)

Query: 14 EQAAILLLCLGEEAAATVMQKLSREEVVRLSENMARLSGVKTSMARKVINNFFDEFREQS 73
++AAILL+ +G E ++ V + LS+EE+ L+ +A+L + + + V+ F + Q
Sbjct: 19 QKAAILLVSIGSEISSKVFKYLSQEEIESLTFEIAKLETITSELKDNVLLEFKELMMAQE 78

Query: 74 GINGASRSMLQGILNKALGTEIASSVINGIYGDEIRSRMARLQWVEPRQLAMLISEEHLQ 133
I + +L K+LGT+ A +IN + ++ +P + I +EH Q
Sbjct: 79 FIQKGGIDYARELLEKSLGTQKAVDIINNLGSALQSRPFEFVRRADPANILNFIQQEHPQ 138

Query: 134 LQAVFLAFLTPEISAAVLSYLNESVQNEILYRVAKLNDVNRDVVDELDRLIERGL-SVLS 192
A+ L++L P+ ++ +LS L VQ + R+A ++ + +VV E++R++E+ L S+ S
Sbjct: 139 TIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEVVREVERVLEKKLASLSS 198

Query: 193 EHGSKVKGIKQAADIVNRFQGNQQ-VILDQMRERDEDVLEQLQDEMYDFFILSRQNEEVR 251
E + G+ +I+N + I++ + E D ++ E+++ +M+ F + ++
Sbjct: 199 EDYTSAGGVDNVVEIINMADRKTEKFIIESLEEEDPELAEEIKKKMFVFEDIVLLDDRSI 258

Query: 252 RRLLDEVPMEDWAVALKGTEALLRRSIYAVMPKRQAQQLEAITARLGPVPVSRIEQIRRE 311
+R+L E+ ++ A ALK + ++ I+ M KR A L+ LGP +E+ +++
Sbjct: 259 QRVLREIDGQELAKALKSVDIPVQEKIFKNMSKRAASMLKEDMEFLGPTRRKDVEESQQK 318

Query: 312 IMGIARELEEAGEIQLQ 328
I+ + R+LEE GEI +
Sbjct: 319 IVSLIRKLEEQGEIVIS 335


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0252  FLGFLIH       56     1e-11    Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 56.0 bits (134), Expect = 1e-11
Identities = 44/187 (23%), Positives = 84/187 (44%), Gaps = 6/187 (3%)

Query: 40 LMDGFQEGLQKGFAQGMTEGQEQGFSEGHQQGFAEGRRQGYTEGSLAGQQEGRKQFVDAA 99
+++ + L++ AQ + EQG+ G +G +G +QGY EG G ++G +
Sbjct: 32 IIEEAEPSLEQQLAQLQMQAHEQGYQAGIAEGRQQGHKQGYQEGLAQGLEQGLAEAKSQQ 91

Query: 100 QPLEA----ISGKVNDFLAHIERKQREDLLQLVEKVTRQVIRCELALQPTQLLALVEEAL 155
P+ A + + L ++ L+Q+ + RQVI + + L+ +++ L
Sbjct: 92 APIHARMQQLVSEFQTTLDALDSVIASRLMQMALEAARQVIGQTPTVDNSALIKQIQQLL 151

Query: 156 AAFPAMPETLQVMLSTEEFNRLRDAVPEKVS--EWGLTPSPDLPPGECRVITDKSELDIG 213
P Q+ + ++ R+ D + +S W L P L PG C+V D+ +LD
Sbjct: 152 QQEPLFSGKPQLRVHPDDLQRVDDMLGATLSLHGWRLRGDPTLHPGGCKVSADEGDLDAS 211

Query: 214 CEHRLEQ 220
R ++
Sbjct: 212 VATRWQE 218


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0255  LPSBIOSNTHSS  38     2e-06    Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein signature.

Length = 166

Score = 38.3 bits (89), Expect = 2e-06
Identities = 30/135 (22%), Positives = 55/135 (40%), Gaps = 20/135 (14%)

Query: 8 GTFDVFHVGHLRLLQRARTLGERLLVGVSSDALNIAKKGRAPVYPQDDRMAIIAG-LACV 66
G+FD GHL +++R L +++ V V N K+ P++ +R+ IA +A +
Sbjct: 7 GSFDPITFGHLDIIERGCRLFDQVYVAV---LRNPNKQ---PMFSVQERLEQIAKAIAHL 60

Query: 67 DGVFLEESLEQKAEYLRGYSADILVMG-----D-----DWAGKFDSFAYICEVVYFPRTP 116
++ Y R A ++ G D A + A E V+ +
Sbjct: 61 PNAQVDSFEGLTVNYARQRQAGAILRGLRVLSDFELELQMANTNKTLASDLETVFLTTST 120

Query: 117 ---SVSTTGIIEVIR 128
+S++ + EV R
Sbjct: 121 EYSFLSSSLVKEVAR 135


78  EcSMS35_0264  EcSMS35_0276  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias     Product
EcSMS35_0264  2       35       6.608235    flagellar basal body rod protein FlgC
EcSMS35_0265  2       36       7.213633    lateral flagellar rod protein LfgD
EcSMS35_0266  2       37       6.569546    lateral flagellar hook protein LfgE
EcSMS35_0267  1       39       6.709611    flagellar basal body rod protein FlgF
EcSMS35_0268  1       38       6.308839    flagellar basal body rod protein FlgG
EcSMS35_0269  0       35       6.626141    flagellar basal body L-ring protein
EcSMS35_0270  -1      29       3.954097    flagellar basal body P-ring protein
EcSMS35_0271  0       24       2.553266    lateral flagellar peptidoglycan hydrolase LfgJ
EcSMS35_0272  1       25       3.074209    lateral flagellar hook associated protein 1
EcSMS35_0273  1       22       2.132523    lateral flagellar hook associated protein 3
EcSMS35_0274  1       19       1.807642    lateral flagellar hook associated protein LafW
EcSMS35_0275  1       18       1.505811    lateral flagellar transmembrane regulator LafZ
EcSMS35_0276  2       22       3.313618    lateral flagellar flagellin LafA
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0264  FLGHOOKAP1    28     0.017    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 27.6 bits (61), Expect = 0.017
Identities = 7/37 (18%), Positives = 18/37 (48%)

Query: 104 VNVVEQMADMMSASRDFETNVDVLNNVKSMQQSLLKL 140
VN+ E+ ++ + + N VL ++ +L+ +
Sbjct: 509 VNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0266  FLGHOOKAP1    39     3e-05    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 38.8 bits (90), Expect = 3e-05
Identities = 20/59 (33%), Positives = 26/59 (44%), Gaps = 5/59 (8%)

Query: 2 SYEIAATGLNAVNEQLDGISNNIANAGTVGYKS----MTTQFSAMYAGSQ-AMGVSVAG 55
A +GLNA L+ SNNI++ GY M S + AG GV V+G
Sbjct: 3 LINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVSG 61



Score = 37.2 bits (86), Expect = 1e-04
Identities = 16/47 (34%), Positives = 24/47 (51%)

Query: 352 TLSSGVLESSNVDITSELVNLMTAQRNYQANTKVIATSTQLDDALFQ 398
LS+ S V++ E NL Q+ Y AN +V+ T+ + DAL
Sbjct: 498 QLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALIN 544


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0268  FLGHOOKAP1    40     4e-06    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 40.3 bits (94), Expect = 4e-06
Identities = 19/93 (20%), Positives = 38/93 (40%), Gaps = 19/93 (20%)

Query: 5 LWISKTGLSAQDAEMSAIANNIANVNTTGFKRDRVMFQDLFYQTQEAPGAMLDQNNIMPT 64
+ + +GL+A A ++ +NNI++ N G+ R + M N+ +
Sbjct: 4 INNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTI--------------MAQANSTLGA 49

Query: 65 GLQFGSGVRIVGTQKT-----FTEGNVETTDNA 92
G G+GV + G Q+ + T ++
Sbjct: 50 GGWVGNGVYVSGVQREYDAFITNQLRAAQTQSS 82



Score = 38.8 bits (90), Expect = 2e-05
Identities = 9/45 (20%), Positives = 20/45 (44%)

Query: 213 TLQDNALEGSNVDIVNEMVAMITVQRAYEMNAKMVSAADDMLQYI 257
L + S V++ E + Q+ Y NA+++ A+ + +
Sbjct: 498 QLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDAL 542


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0269  FLGLRINGFLGH  136    4e-42    Flagellar L-ring protein signature.
		>FLGLRINGFLGH#Flagellar L-ring protein signature.

Length = 232

Score = 136 bits (344), Expect = 4e-42
Identities = 63/189 (33%), Positives = 93/189 (49%), Gaps = 12/189 (6%)

Query: 37 EAPPPADGRAGGVFET------GYNWSLTADRRAYRVGDILTVILEESTQSSKQAKTNFG 90
P P G +F++ GY L DRR +GD LT++L+E+ +SK + N
Sbjct: 39 PVPGPTPVANGSIFQSAQPINYGYQ-PLFEDRRPRNIGDTLTIVLQENVSASKSSSANAS 97

Query: 91 KSNTVDIG---APTIFGHTKDKLSGSIDA--NRDFDGSATSQQQNSLRGEITVSVHAVQP 145
+ + G P ++A F+G + N+ G +TV+V V
Sbjct: 98 RDGKTNFGFDTVPRYLQGLFGNARADVEASGGNTFNGKGGANASNTFSGTLTVTVDQVLV 157

Query: 146 NGILEIRGEKWLTLNQGDEYIRLSGLVRADDIQNDNSVSSQRIADARISYAGRGALSDAN 205
NG L + GEK + +NQG E+IR SG+V I N+V S ++ADARI Y G G +++A
Sbjct: 158 NGNLHVVGEKQIAINQGTEFIRFSGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQ 217

Query: 206 AAGWLTRLF 214
GWL R F
Sbjct: 218 NMGWLQRFF 226


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0270  FLGPRINGFLGI  329    e-113    Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 329 bits (844), Expect = e-113
Identities = 147/369 (39%), Positives = 219/369 (59%), Gaps = 15/369 (4%)

Query: 10 IKAAVITVSL----ALPGVALAQSLESLVNVQGVRENQLVGYSLVVGLDGTGDK-NQVKF 64
I AA++ +L P A ++ + ++Q R+NQL+GY LVVGL GTGD F
Sbjct: 7 IAAALVFSALPFLSTPPAQADTSRIKDIASLQAGRDNQLIGYGLVVGLQGTGDSLRSSPF 66

Query: 65 TNQTITNMLRQFGVQLPNKIDPKVKNVAAVAVSATLPPMYSRGQTIDVTVSSIGDAKSIR 124
T Q++ ML+ G+ KN+AAV V+A LPP S G +DVTVSS+GDA S+R
Sbjct: 67 TEQSMRAMLQNLGITTQGG-QSNAKNIAAVMVTANLPPFASPGSRVDVTVSSLGDATSLR 125

Query: 125 GGTLLLTQLHGADGEVYALAQGSVVVGGMNATGASGSSVTVNTPTAGLIPNGATVEREIP 184
GG L++T L GADG++YA+AQG+++V G +A G +++T T+ +PNGA +ERE+P
Sbjct: 126 GGNLIMTSLSGADGQIYAVAQGALIVNGFSAQG-DAATLTQGVTTSARVPNGAIIERELP 184

Query: 185 SDFQMGDTITLNLKRPSFKDANNIAAAINASF-----GGIATAQSSTNVSVRAPTSPGAR 239
S F+ + L L+ P F A +A +NA IA + S ++V+ P
Sbjct: 185 SKFKDSVNLVLQLRNPDFSTAVRVADVVNAFARARYGDPIAEPRDSQEIAVQKP-RVADL 243

Query: 240 VAFMSQLDDVQVQAEKIRARVVFNSRTGTVVMGDGVALHAAAVSHGSLTVSINETSNVSQ 299
M++++++ V+ + A+VV N RTGT+V+G V + AVS+G+LTV + E+ V Q
Sbjct: 244 TRLMAEIENLTVETDTP-AKVVINERTGTIVIGADVRISRVAVSYGTLTVQVTESPQVIQ 302

Query: 300 PNAFAGGRTAVTPQSNIAVNHARPGVVSLPESSSLKTLVNALNSLGATPDDIMSILQALH 359
P F+ G+TAV PQ++I V++ E L+TLV LNS+G D I++ILQ +
Sbjct: 303 PAPFSRGQTAVQPQTDIMA-MQEGSKVAIVEGPDLRTLVAGLNSIGLKADGIIAILQGIK 361

Query: 360 EAGALDADL 368
AGAL A+L
Sbjct: 362 SAGALQAEL 370


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0271  FLGFLGJ       57     3e-13    Flagellar protein FlgJ signature.
		>FLGFLGJ#Flagellar protein FlgJ signature.

Length = 313

Score = 57.0 bits (137), Expect = 3e-13
Identities = 26/75 (34%), Positives = 42/75 (56%), Gaps = 4/75 (5%)

Query: 22 ANDIKQAAEQFEAIFLRNMLKEMRKTNELFDSKDNPFNSDSVRMMQGFYDDELCNTLAQQ 81
A +I+ A Q E +F++ MLK MR KD F+S+ R+ YD ++ +
Sbjct: 30 AANIRPVARQVEGMFVQMMLKSMRDAL----PKDGLFSSEHTRLYTSMYDQQIAQQMTAG 85

Query: 82 HGIGIAAMIVKQLSP 96
G+G+A M+VKQ++P
Sbjct: 86 KGLGLAEMMVKQMTP 100


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0272  FLGHOOKAP1    176    1e-51    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 176 bits (448), Expect = 1e-51
Identities = 97/326 (29%), Positives = 165/326 (50%), Gaps = 7/326 (2%)

Query: 2 DMINIGYSGASTAQVELNVTAQNTANAMTTGYTRQVAEISTIGASGGSPNSAGNGVQVDS 61
+IN SG + AQ LN + N ++ GYTRQ ++ ++ G+ GNGV V
Sbjct: 2 SLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVSG 61

Query: 62 IRRVSNQYQVNQVWYAASDYGYYSTQQGYLTQLEAVLSDDNSSLSGGFDNFFAALNEATT 121
++R + + NQ+ A + + + +++++ +LS SSL+ +FF +L +
Sbjct: 62 VQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTLVS 121

Query: 122 SPDDSALREQVISEAGALSLRIDNTLDYIDSQSTEIISQQQAMVSQINTLTSGIASYNQQ 181
+ +D A R+ +I ++ L + T Y+ Q ++ A V QIN IAS N Q
Sbjct: 122 NAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLNDQ 181

Query: 182 IAQAEAN--GDNASALYDARDQMVEELSGMMDVQVNIDDQGNYNVTLKNGQPLVSGQQSS 239
I++ G + + L D RDQ+V EL+ ++ V+V++ D G YN+T+ NG LV G +
Sbjct: 182 ISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGSTAR 241

Query: 240 TIA-LETNADGTPT----MSLTFAGTTSTMTTDTGGSLGALFDYQNDVLTPLTDTINSMA 294
+A + ++AD + T + T GSLG + +++ L +T+ +A
Sbjct: 242 QLAAVPSSADPSRTTVAYVDGTAGNIEIPEKLLNTGSLGGILTFRSQDLDQTRNTLGQLA 301

Query: 295 LQFADAVNNQLAQGYDLNGNPGEPLF 320
L FA+A N Q G+D NG+ GE F
Sbjct: 302 LAFAEAFNTQHKAGFDANGDAGEDFF 327



Score = 74.2 bits (182), Expect = 3e-16
Identities = 54/286 (18%), Positives = 110/286 (38%), Gaps = 14/286 (4%)

Query: 179 NQQIAQAEANGDNASALYDARDQMVEELSGMMDV-------QVNIDDQGNYNVTLKNGQP 231
N +I + N + + R Q +++ + N + ++ G+
Sbjct: 266 NIEIPEKLLNTGSLGGILTFRSQDLDQTRNTLGQLALAFAEAFNTQHKAGFDANGDAGED 325

Query: 232 LVSGQQSSTIALETNADGTPTMSLTFAGTTSTMTTDTGGSLGALFDYQNDVLTPLTDTI- 290
+ + A+ N +++ T ++ T + + Q V ++T
Sbjct: 326 FFAIGKP---AVLQNTKNKGDVAIGATVTDASAVLATDYKISFDNN-QWQVTRLASNTTF 381

Query: 291 NSMALQFADAVNNQLAQGYDLNGNPGEPLFIYDASNADGPLTVNPDITADELAFSSSPDE 350
+ L + + + S+A + V A S
Sbjct: 382 TVTPDANGKVAFDGLELTFTGTPAVNDSFTLKPVSDAIVNMDVLITDEAKIAMASEEDAG 441

Query: 351 SGNSDNLQALINISTEPLEIANLGSVTVGQACSSIISNIGIYSQQNQTEVDAASNVYSEA 410
++ N QAL+++ + G+ + A +S++S+IG + +T NV ++
Sbjct: 442 DSDNRNGQALLDLQSN--SKTVGGAKSFNDAYASLVSDIGNKTATLKTSSATQGNVVTQL 499

Query: 411 QNQQSSVSGVSMDEEAVNLITYQQIYEANLKVISAGAEIFDSVLEM 456
NQQ S+SGV++DEE NL +QQ Y AN +V+ IFD+++ +
Sbjct: 500 SNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0273  FLAGELLIN     42     3e-06    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 41.6 bits (97), Expect = 3e-06
Identities = 47/313 (15%), Positives = 99/313 (31%), Gaps = 12/313 (3%)

Query: 1 MRVTTQQTYVSMTQSFNNLSGSLAHVVEQMATGKEILQPSDDPIAATRITQLNRQQSAIE 60
+ T + + N SL+ +E++++G I DD + +
Sbjct: 2 QVINTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGLT 61

Query: 61 QYQSNIDSASAGLSQQESILDGVNNSLLAVRDDLLEAANGTNTADSLASLGQDIQSLTES 120
Q N + + E L+ +NN+L VR+ ++A NGTN+ L S+ +IQ E
Sbjct: 62 QASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEE 121

Query: 121 MVAALNYQDEDGHYVFGGTINDQPPIVAVDDDGDGVT-----------DSYSYQGNSDHR 169
+ N +G V + + A D + + D ++ G +
Sbjct: 122 IDRVSNQTQFNGVKVLSQDNQMKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPKEAT 181

Query: 170 QTTVSNGVEVDTNVAASDFFGSNLDV-LNTLNSLSQELQNPDVDPADPQVQSDLQNAVDV 228
+ + + T + V +N+ ++ D + D
Sbjct: 182 VGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTTDDA 241

Query: 229 VDTASDDLNASIASLGETQNTMSMLSDAQTDISTSNDELIGSLQDLDYGPASITFTGLEV 288
+ + DL + S T ++ + + G +D + +
Sbjct: 242 ENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTIDTKTGNDGNGKVST 301

Query: 289 AMEATLKTYSKVS 301
+ T +
Sbjct: 302 TINGEKVTLTVAD 314



Score = 37.7 bits (87), Expect = 4e-05
Identities = 27/262 (10%), Positives = 71/262 (27%), Gaps = 3/262 (1%)

Query: 31 ATGKEILQPSDDPIAATRITQLNRQQSAIEQYQSNIDSASAGLSQQESILDGVNNSLLAV 90
+DD T + +S ++ + + ++ D +
Sbjct: 229 VNAANGQLTTDDAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTID 288

Query: 91 RDDLLEAANGTNTADSLASLGQDIQSLTESMVAALNYQDEDGHYVFGGTINDQPPIVAVD 150
+ +T + + + +T + V+ +N Q
Sbjct: 289 TKTGNDGNGKVSTTINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKT 348

Query: 151 DDGDGVTDSYSYQGNSDHRQTTVSNGVEVDTNVAASDFFGSNLDVLNTLNSLSQELQNPD 210
+ ++ + V A + L + +
Sbjct: 349 KNESAKLSDL---EANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFIDKTASGVSTL 405

Query: 211 VDPADPQVQSDLQNAVDVVDTASDDLNASIASLGETQNTMSMLSDAQTDISTSNDELIGS 270
++ + N + +D+A ++A +SLG QN + T+ +
Sbjct: 406 INEDAAAAKKSTANPLASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARSR 465

Query: 271 LQDLDYGPASITFTGLEVAMEA 292
++D DY + ++ +A
Sbjct: 466 IEDADYATEVSNMSKAQILQQA 487


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0276  FLAGELLIN     89     5e-22    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 89.3 bits (221), Expect = 5e-22
Identities = 69/253 (27%), Positives = 122/253 (48%)

Query: 4 INTNNASMAAVNAISKSSSSLSTSMERLATGNRINSSADDAAGKQIANRLTAQSSGMGVA 63
INTN+ S+ N ++KS SSLS+++ERL++G RINS+ DDAAG+ IANR T+ G+ A
Sbjct: 4 INTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGLTQA 63

Query: 64 LSNINDATAMLQTADSMFDEMSDVLGRMKDLSTQAANGTYSDDDLQAMQDEYDELGQQMS 123
N ND ++ QT + +E+++ L R+++LS QA NGT SD DL+++QDE + +++
Sbjct: 64 SRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEEID 123

Query: 124 DMLQNTTYGGTNLFGVSGTSNTGTDGLFQSAVTFQVGAESSDTMTVNISSQLNTLVTDLS 183
+ T + G + +T + ++ ++ + +
Sbjct: 124 RVSNQTQFNGVKVLSQDNQMKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPKEATVG 183

Query: 184 AISNSFSADQADTTGTAGVSGGTELTASGSANQMITSISTAMDDVSQIQSKLGASINRLN 243
+ +SF T G + SG+ T+ + + + + N
Sbjct: 184 DLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTTDDAEN 243

Query: 244 DTANNLTSMQDNT 256
+TA +L +T
Sbjct: 244 NTAVDLFKTTKST 256



Score = 65.1 bits (158), Expect = 5e-14
Identities = 55/273 (20%), Positives = 95/273 (34%), Gaps = 2/273 (0%)

Query: 32 ATGNRINSSADDAAGKQIANRLTAQSSGMGVALSNINDATAMLQTADSMFDEMSDVLGRM 91
T + N++A D + TA++ + A+ + + +
Sbjct: 237 TTDDAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTIDTKTGNDGN 296

Query: 92 KDLSTQAANGTYSDDDLQAMQDEYDELGQQMSDMLQNTTYGGTNLFGVSGTSNTGTDGLF 151
+ST + + + Y + T +
Sbjct: 297 GKVSTTINGEKVTLTVADITAGAANVDAATLQS--SKNVYTSVVNGQFTFDDKTKNESAK 354

Query: 152 QSAVTFQVGAESSDTMTVNISSQLNTLVTDLSAISNSFSADQADTTGTAGVSGGTELTAS 211
S + + +TVN + D ++ +G + + A
Sbjct: 355 LSDLEANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFIDKTASGVSTLINEDAAAAK 414

Query: 212 GSANQMITSISTAMDDVSQIQSKLGASINRLNDTANNLTSMQDNTEVAIGNIMDTDYATE 271
S + SI +A+ V ++S LGA NR + NL + N A I D DYATE
Sbjct: 415 KSTANPLASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARSRIEDADYATE 474

Query: 272 ASNMTKQQVLMQTGITMLKQSNSMSSMVSSLLQ 304
SNM+K Q+L Q G ++L Q+N + V SLL+
Sbjct: 475 VSNMSKAQILQQAGTSVLAQANQVPQNVLSLLR 507


79  EcSMS35_0295  EcSMS35_0304  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias     Product
EcSMS35_0295  0       22       -6.934837   outer membrane phosphoporin protein E
EcSMS35_0296  -1      26       -7.686204   gamma-glutamyl kinase
EcSMS35_0297  4       41       -14.538292  gamma-glutamyl phosphate reductase
EcSMS35_0299  7       49       -16.917339  *phage integrase family site specific
EcSMS35_0300  6       48       -17.388479  hypothetical protein
EcSMS35_0301  6       43       -15.845316  hypothetical protein
EcSMS35_0302  5       37       -12.756606  response regulator
EcSMS35_0303  4       33       -11.750271  hypothetical protein
EcSMS35_0304  4       24       2.057837    IS911 transposase orfA
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0295  ECOLIPORIN    550    0.0      E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 550 bits (1418), Expect = 0.0
Identities = 232/384 (60%), Positives = 268/384 (69%), Gaps = 34/384 (8%)

Query: 1 MKKSTLALVVMGIVASASVQAAEIYNKDGNKLDIYGKVKAMHYMSDNDSKDGDQSYIRFG 60
MK+ LALV+ ++A+ + AAEIYNKDGNKLD+YGKV +HY SD+ SKDGDQ+Y+R G
Sbjct: 1 MKRKVLALVIPALLAAGAAHAAEIYNKDGNKLDLYGKVDGLHYFSDDSSKDGDQTYMRVG 60

Query: 61 FKGETQINDQLTGYGRWEAEFAGNKAESDTAQQKTRLAFAGLKYKDLGSFDYGRNLGALY 120
FKGETQINDQLTGYG+WE N E + A TRLAFAGLK+ D GSFDYGRN G LY
Sbjct: 61 FKGETQINDQLTGYGQWEYNVQANTTEGEGANSWTRLAFAGLKFGDYGSFDYGRNYGVLY 120

Query: 121 DVEAWTDMFPEFGGDSSAQTDNFMTKRASGLATYRNTDFFGVIDGLNLTLQYQGKNEN-- 178
DVE WTDM PEFGGDS DN+MT RA+G+ATYRNTDFFG++DGLN LQYQGKNE+
Sbjct: 121 DVEGWTDMLPEFGGDSYTYADNYMTGRANGVATYRNTDFFGLVDGLNFALQYQGKNESQS 180

Query: 179 --------------RDVKKQNGDGFGTSLTYDFGGSDFAISGAYTNSDRTNEQNLQSR-- 222
D++ NGDGFG S TYD G F+ AYT SDRTNEQ
Sbjct: 181 ADDVNIGTNNRNNGDDIRYDNGDGFGISTTYDI-GMGFSAGAAYTTSDRTNEQVNAGGTI 239

Query: 223 GTGKRAEAWATGLKYDANNIYLATFYSETRKMTP-------ITGGFANKTQNFEAVAQYQ 275
G +A+AW GLKYDANNIYLAT YSETR MTP GG ANKTQNFE AQYQ
Sbjct: 240 AGGDKADAWTAGLKYDANNIYLATMYSETRNMTPYGKTDKGYDGGVANKTQNFEVTAQYQ 299

Query: 276 FDFGLRPSLGYVLSKGKDIE----GIGDEDLVNYIDVGATYYFNKNMSAFVDYKINQLDS 331
FDFGLRP++ +++SKGKD+ D+DLV Y DVGATYYFNKN S +VDYKIN LD
Sbjct: 300 FDFGLRPAVSFLMSKGKDLTYNNVNGDDKDLVKYADVGATYYFNKNFSTYVDYKINLLDD 359

Query: 332 DNKL----NINNDDIVAVGMTYQF 351
D+ I+ DDIVA+GM YQF
Sbjct: 360 DDPFYKDAGISTDDIVALGMVYQF 383


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0296  CARBMTKINASE  37     6e-05    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 37.5 bits (87), Expect = 6e-05
Identities = 28/127 (22%), Positives = 48/127 (37%), Gaps = 17/127 (13%)

Query: 119 DTLRALLDNNI---------VPVINENDAVATAEIKVGDNDNLSALAAILAGADKLLLLT 169
+T++ L++ + VPVI E+ + E V D D A AD ++LT
Sbjct: 177 ETIKKLVERGVIVIASGGGGVPVILEDGEIKGVE-AVIDKDLAGEKLAEEVNADIFMILT 235

Query: 170 DQKGLYTADPRSNPQAELIKDVYGIDDALRAIAGDSVSGLGTGGMSTKLQAA-DVACRAG 228
D G + + +++V +++ + G M K+ AA G
Sbjct: 236 DVNGAALY--YGTEKEQWLREV-KVEELRKYYEEG---HFKAGSMGPKVLAAIRFIEWGG 289

Query: 229 IDTIIAA 235
IIA
Sbjct: 290 ERAIIAH 296



Score = 30.2 bits (68), Expect = 0.013
Identities = 16/76 (21%), Positives = 33/76 (43%), Gaps = 13/76 (17%)

Query: 4 SQTLVVKLGTSVLTGGSRRLNRAHIVELVRQCAQ----LHAAGHRIVIVTSG-------- 51
+ +V+ LG + L ++ + +++ VR+ A+ + A G+ +VI
Sbjct: 2 GKRVVIALGGNALQQRGQKGSYEEMMDNVRKTARQIAEIIARGYEVVITHGNGPQVGSLL 61

Query: 52 -AIAAGREHLGYPELP 66
+ AG+ G P P
Sbjct: 62 LHMDAGQATYGIPAQP 77


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0301  HTHFIS        41     4e-06    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 41.4 bits (97), Expect = 4e-06
Identities = 14/80 (17%), Positives = 28/80 (35%), Gaps = 12/80 (15%)

Query: 2 IKILIVDDNKSRIEKLKSSLTELITKNMIRIDEKYTSDAAKIALKLNQYDYLILDVFLPK 61
IL+ DD+ + L +L ++ + + + D ++ DV +P
Sbjct: 4 ATILVADDDAAIRTVLNQAL----SRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMP- 58

Query: 62 KDNYSPDERNGLGLLKQINS 81
+ N LL +I
Sbjct: 59 -------DENAFDLLPRIKK 71


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0302  HTHFIS        38     8e-06    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 37.5 bits (87), Expect = 8e-06
Identities = 15/98 (15%), Positives = 36/98 (36%), Gaps = 16/98 (16%)

Query: 2 KILLVEDIEYKRDKVIGLLESISADVSVDVAKSYVSAVNSATSKTYDLIILDMSLPTYDK 61
IL+ +D R + L A V + + + + DL++ D+ +P +
Sbjct: 5 TILVADDDAAIRTVLNQALSR--AGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDEN- 61

Query: 62 GPNENGGRFRVYGGKDIIRKLMRGKEHPVVVVLTQHTT 99
D++ ++ + + V+V++ T
Sbjct: 62 -------------AFDLLPRIKKARPDLPVLVMSAQNT 86


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0304  RTXTOXIND     26     0.043    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 25.6 bits (56), Expect = 0.043
Identities = 6/48 (12%), Positives = 12/48 (25%)

Query: 42 LESWVRQLRRERQGIAPSATPITPEQQRIRELEKQVRRLEEQNTILKK 89
+W Q ++ + RI E R + +
Sbjct: 195 FSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSS 242


80  EcSMS35_0425  EcSMS35_0430  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias     Product
EcSMS35_0425  2       15       1.614919    MFS transport protein AraJ
EcSMS35_0426  2       15       1.845071    exonuclease subunit SbcC
EcSMS35_0427  -1      12       1.840897    exonuclease subunit SbcD
EcSMS35_0428  -1      15       1.744366    hypothetical protein
EcSMS35_0429  -1      14       2.398372    transcriptional regulator PhoB
EcSMS35_0430  -1      11       -1.062989   phosphate regulon sensor protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0425  TCRTETA       53     1e-09    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 52.5 bits (126), Expect = 1e-09
Identities = 73/356 (20%), Positives = 128/356 (35%), Gaps = 35/356 (9%)

Query: 33 ILSLALGTFGLGMAEFGIMGVLTELAHNVGISIPAAGH---MISYYALGVVVGAPIIALF 89
+ ++AL G+G+ IM VL L ++ S H +++ YAL AP++
Sbjct: 11 LSTVALDAVGIGL----IMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGAL 66

Query: 90 SSRYSLKHILLFLVALCVIGNAMFTLSSSYLMLAIGRLVSGFPHGAFFGVGAIVLSKIIK 149
S R+ + +LL +A + A+ + +L IGR+V+G GA + I
Sbjct: 67 SDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIAD--IT 124

Query: 150 PGKVTAAVAGMVSGMTVANLLGIPLGTYLSQEFSWRYTFLLIAVFNIVVMASVYFWVPDI 209
G A G +S ++ P+ L FS F A N + + F +P+
Sbjct: 125 DGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPES 184

Query: 210 RDEAKGKLREQ----------FHFLRSPAPWLI--FAATMFGNAGVFAWFSYVKPYMMFI 257
+ LR + + A + F + G W +
Sbjct: 185 HKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFG------E 238

Query: 258 SGFSETAMTFIMMLVGLGM---VLGNVLSGRISGRYSPLRIAAVTDFIIVLALLMLFFCG 314
F A T + L G+ + +++G ++ R R + ++L F
Sbjct: 239 DRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFAT 298

Query: 315 GMKITSLIFAFICCAGLFALSAPLQILLLQNAKGGELLGAAGGQIAF--NLGSAVG 368
+ I + G+ LQ +L + E G G +A +L S VG
Sbjct: 299 RGWMAFPIMVLLASGGIG--MPALQAMLSRQV-DEERQGQLQGSLAALTSLTSIVG 351


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0426  IGASERPTASE   39     1e-04    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 38.9 bits (90), Expect = 1e-04
Identities = 40/264 (15%), Positives = 81/264 (30%), Gaps = 11/264 (4%)

Query: 162 LNAKPKERAELLEELTGTEIYGKISAMVFEQHKSARTELEKLQAQASGVALLTPEQVQSL 221
A P E E + E + E S V + + A + + A Q+
Sbjct: 1029 APATPSETTETVAENSKQE-----SKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTN 1083

Query: 222 TASLQVLTDEEKQLITAQQQEQQSLNWLTRLD-ELQQEGSRRQQALQQALAEEEKAQPQL 280
+ +E Q ++ +++ E QE + + + E QPQ
Sbjct: 1084 EVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQA 1143

Query: 281 AALSLAQPARNLRPHWE---RIAEHSAALAHTRQQIEEVNTRLQSTMALRASIRHHAAKQ 337
P N++ A+ T +E+ T + + + +
Sbjct: 1144 EPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTT 1203

Query: 338 SAELQQQQQSLNAWLQEHDRFRQWNNELAGWRAQFSQQTSDREHLRQWQQQLTHAEQKLN 397
A Q S ++ ++ R + + ++DR + T+ L+
Sbjct: 1204 PATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPA-TTSSNDRSTVALCDLTSTNTNAVLS 1262

Query: 398 ALAAITLTLTADEVASALAQHAEQ 421
A + + V A++QH Q
Sbjct: 1263 DARAKAQFVALN-VGKAVSQHISQ 1285



Score = 37.0 bits (85), Expect = 5e-04
Identities = 28/140 (20%), Positives = 50/140 (35%), Gaps = 15/140 (10%)

Query: 738 QQDVLAAQSLQKAQAQFDTALQASVFDDQQAFLAALMDEQTLTQLEQLKQNLENQRRQAQ 797
Q DV + S + A+ D A A E T T E KQ +++
Sbjct: 1004 QADVPSVPSNNEEIARVDEAPVPPP-------APATPSETTETVAENSKQ-------ESK 1049

Query: 798 TLVTQTAEALAQHQQHRPDGLDLSVTVEQIQQELAQTQQKLRENTTSQGEIRQQLKQDAD 857
T+ +A Q+R + V+ Q Q T E ++ + +
Sbjct: 1050 TVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKE 1109

Query: 858 NRQQQQT-LMQQIAQMTQQV 876
+ + +T Q++ ++T QV
Sbjct: 1110 EKAKVETEKTQEVPKVTSQV 1129


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0427  FRAGILYSIN    30     0.022    Fragilysin metallopeptidase (M10C) enterotoxin signat...
		>FRAGILYSIN#Fragilysin metallopeptidase (M10C) enterotoxin signature.

Length = 405

Score = 29.7 bits (66), Expect = 0.022
Identities = 13/70 (18%), Positives = 23/70 (32%), Gaps = 4/70 (5%)

Query: 149 KQQHLLAAITDYYQQHYADACKLRGDQPLPIIATGHLTTVGASKSDAVRDIYIGTLDAFP 208
K+ ++ I ++Y + + + I T D + + I A
Sbjct: 135 KEAQMMNEIAEFYAAPFKKTRAINEKEAFECI-YDSRTRSA--GKD-IVSVKINIDKAKK 190

Query: 209 AQNFPPADYI 218
N P DYI
Sbjct: 191 ILNLPECDYI 200


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0429  HTHFIS        95     1e-24    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 94.9 bits (236), Expect = 1e-24
Identities = 33/149 (22%), Positives = 62/149 (41%), Gaps = 9/149 (6%)

Query: 4 RILVVEDEAPIREMVCFVLEQNGFQPVEAEDYDSAVNQLNEPWPDLILLDWMLPGGSGIQ 63
ILV +D+A IR ++ L + G+ + + + DL++ D ++P +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 64 FIKHLKRESMTRDIPVVMLTARGEEEDRVRGLETGADDYITKPFSPKELVARIKAVMRRI 123
+ +K+ D+PV++++A+ ++ E GA DY+ KPF EL+ I +
Sbjct: 65 LLPRIKKARP--DLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA-- 120

Query: 124 SPMAVEEVIEMQGLSLDPTSHRVMAGEEP 152
E L D + G
Sbjct: 121 -----EPKRRPSKLEDDSQDGMPLVGRSA 144


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0430  PF06580       34     0.001    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 34.1 bits (78), Expect = 0.001
Identities = 19/105 (18%), Positives = 33/105 (31%), Gaps = 26/105 (24%)

Query: 325 LVYNAVNH----TPEGTHITVRWQRVPHGAEFSVEDNGPGIAPEHIPRLTERFYRVDKAR 380
LV N + H P+G I ++ + VE+ G
Sbjct: 263 LVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKN---------------- 306

Query: 381 SRQTGGSGLGLAIVKHAVNH---HESRLNIESTVGKGTRFSFVIP 422
+G GL V+ + E+++ + GK +IP
Sbjct: 307 --TKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAM-VLIP 348


81  EcSMS35_0505  EcSMS35_0513  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias     Product
EcSMS35_0505  1       16       1.182149    acriflavine resistance protein B
EcSMS35_0506  2       12       0.596062    acriflavine resistance protein A
EcSMS35_0507  2       14       0.420813    DNA-binding transcriptional repressor AcrR
EcSMS35_0508  3       15       2.524753    potassium efflux protein KefA
EcSMS35_0509  4       15       4.333479    hypothetical protein
EcSMS35_0510  3       17       4.822418    primosomal replication protein N''
EcSMS35_0511  3       23       3.456212    hypothetical protein
EcSMS35_0512  4       24       3.372465    adenine phosphoribosyltransferase
EcSMS35_0513  4       24       3.081923    DNA polymerase III subunits gamma and tau
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0505  ACRIFLAVINRP  1369   0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 1369 bits (3546), Expect = 0.0
Identities = 802/1033 (77%), Positives = 915/1033 (88%), Gaps = 1/1033 (0%)

Query: 1 MPNFFIDRPIFAWVIAIIIMLAGGLAILKLPVAQYPTIAPPAVTISASYPGADAKTVQDT 60
M NFFI RPIFAWV+AII+M+AG LAIL+LPVAQYPTIAPPAV++SA+YPGADA+TVQDT
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VTQVIEQNMNGIDNLMYMSSNSDSTGTVQITLTFESGTDADIAQVQVQNKLQLAMPLLPQ 120
VTQVIEQNMNGIDNLMYMSS SDS G+V ITLTF+SGTD DIAQVQVQNKLQLA PLLPQ
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 121 EVQQQGVSVEKSSSSFLMVVGVINTDGTMTQEDISDYVAANMKDAISRTSGVGDVQLFGS 180
EVQQQG+SVEKSSSS+LMV G ++ + TQ+DISDYVA+N+KD +SR +GVGDVQLFG+
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 181 QYAMRIWMNPNELNKFQLTPVDVITAIKAQNAQVAAGQLGGTPPVKGQQLNASIIAQTRL 240
QYAMRIW++ + LNK++LTPVDVI +K QN Q+AAGQLGGTP + GQQLNASIIAQTR
Sbjct: 181 QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRF 240

Query: 241 TSTEEFGKILLKVNQDGSRVLLRDVAKIELGGENYDIIAEFNGQPASGLGIKLATGANAL 300
+ EEFGK+ L+VN DGS V L+DVA++ELGGENY++IA NG+PA+GLGIKLATGANAL
Sbjct: 241 KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL 300

Query: 301 DTAAAIRAELAKMEPFFPSGLKIVYPYDTTPFVKISIHEVVKTLVEAIILVFLVMYLFLQ 360
DTA AI+A+LA+++PFFP G+K++YPYDTTPFV++SIHEVVKTL EAI+LVFLVMYLFLQ
Sbjct: 301 DTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQ 360

Query: 361 NFRATLIPTIAVPVVLLGTFAVLAAFGFSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420
N RATLIPTIAVPVVLLGTFA+LAAFG+SINTLTMFGMVLAIGLLVDDAIVVVENVERVM
Sbjct: 361 NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420

Query: 421 AEEGLPPKEATRKSMGQIQGALVGIAMVLSAVFVPMAFFGGSTGAIYRQFSITIVSAMAL 480
E+ LPPKEAT KSM QIQGALVGIAMVLSAVF+PMAFFGGSTGAIYRQFSITIVSAMAL
Sbjct: 421 MEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL 480

Query: 481 SVLVALILTPALCATMLKPIAKGDHGEGKKGFFGWFNRMFEKSTHHYTDSVGGILRSTGR 540
SVLVALILTPALCAT+LKP++ H E K GFFGWFN F+ S +HYT+SVG IL STGR
Sbjct: 481 SVLVALILTPALCATLLKPVSAE-HHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 541 YLVLYLIIVVGMAYLFVRLPSSFLPDEDQGVFMTMVQLPAGATQERTQKVLNEVTHYYLT 600
YL++Y +IV GM LF+RLPSSFLP+EDQGVF+TM+QLPAGATQERTQKVL++VT YYL
Sbjct: 540 YLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLK 599

Query: 601 KEKNNVESVFAVNGFGFAGRGQNTGIAFVSLKDWADRPGEENKVEAITMRATRAFSQIKD 660
EK NVESVF VNGF F+G+ QN G+AFVSLK W +R G+EN EA+ RA +I+D
Sbjct: 600 NEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRD 659

Query: 661 AMVFAFNLPAIVELGTATGFDFELIDQAGLGHEKLTQARNQLLAEAAKHPDMLTSVRPNG 720
V FN+PAIVELGTATGFDFELIDQAGLGH+ LTQARNQLL AA+HP L SVRPNG
Sbjct: 660 GFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNG 719

Query: 721 LEDTPQFKIDIDQEKAQALGVSINDINTTLGAAWGGSYVNDFIDRGRVKKVYVMSEAKYR 780
LEDT QFK+++DQEKAQALGVS++DIN T+ A GG+YVNDFIDRGRVKK+YV ++AK+R
Sbjct: 720 LEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFR 779

Query: 781 MLPDDIGDWYVRAADGQMVPFSAFSSSRWEYGSPRLERYNGLPSMEILGQAAPGKSTGEA 840
MLP+D+ YVR+A+G+MVPFSAF++S W YGSPRLERYNGLPSMEI G+AAPG S+G+A
Sbjct: 780 MLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDA 839

Query: 841 MELMEQLASKLPTGVGYDWTGMSYQERLSGNQAPSLYAISLIVVFLCLAALYESWSIPFS 900
M LME LASKLP G+GYDWTGMSYQERLSGNQAP+L AIS +VVFLCLAALYESWSIP S
Sbjct: 840 MALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVS 899

Query: 901 VMLVVPLGVIGALLAATFRGLTNDVYFQVGLLTTIGLSAKNAILIVEFAKDLMDKEGKGL 960
VMLVVPLG++G LLAAT NDVYF VGLLTTIGLSAKNAILIVEFAKDLM+KEGKG+
Sbjct: 900 VMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGV 959

Query: 961 IEATLDAVRMRLRPILMTSLAFILGVMPLVISTGAGSGAQNAVGTGVMGGMVTATVLAIF 1020
+EATL AVRMRLRPILMTSLAFILGV+PL IS GAGSGAQNAVG GVMGGMV+AT+LAIF
Sbjct: 960 VEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIF 1019

Query: 1021 FVPVFFVVVRRRF 1033
FVPVFFVV+RR F
Sbjct: 1020 FVPVFFVVIRRCF 1032


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0506  RTXTOXIND     44     6e-07    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 44.4 bits (105), Expect = 6e-07
Identities = 33/212 (15%), Positives = 71/212 (33%), Gaps = 23/212 (10%)

Query: 112 TYQAAYDSAKGDLAKAQAAANIAQLTVNRYQKLLGTQYISKQEYDQALADAQQANAAVTA 171
+ Y A +L + + Q+ Q +++ ++ L +Q +
Sbjct: 256 EQENKYVEAVNELR--VYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGL 313

Query: 172 AKAAVETARINLAYTKVTSPISGRIGKSNV-TEGALVQNGQATALATVQQLDPIYVDVTQ 230
+ + + +P+S ++ + V TEG +V + T + V + D + V
Sbjct: 314 LTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAE-TLMVIVPEDDTLEVTALV 372

Query: 231 SSNDFLRLKQELA----------NGTLKQENGKAKVSLITSDGIKFPQDGTLEFSDVTVD 280
+ D + KV I D I+ + G + ++++
Sbjct: 373 QNKDIGFINVGQNAIIKVEAFPYTRYGYLV---GKVKNINLDAIEDQRLGLVFNVIISIE 429

Query: 281 QTTGSITLRAIFPNPDHTLLPGMFVRARLEEG 312
+ S + I L GM V A ++ G
Sbjct: 430 ENCLSTGNKNIP------LSSGMAVTAEIKTG 455



Score = 34.4 bits (79), Expect = 7e-04
Identities = 24/125 (19%), Positives = 43/125 (34%), Gaps = 13/125 (10%)

Query: 61 PLQITTELPGR-TSAYRIAEVRPQVSGIILKRNFKEGSDIEAGVSLYQIDPATYQAAYDS 119
++I G+ T + R E++P + I+ + KEG + G L ++ +A
Sbjct: 79 QVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEA---- 134

Query: 120 AKGDLAKAQAAANIAQLTVNRYQKLLGTQYISKQEYDQALADAQQANAAVTAAKAAVETA 179
D K Q++ A+L RYQ L E ++
Sbjct: 135 ---DTLKTQSSLLQARLEQTRYQILS-----RSIELNKLPELKLPDEPYFQNVSEEEVLR 186

Query: 180 RINLA 184
+L
Sbjct: 187 LTSLI 191


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0507  HTHTETR       222    5e-76    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 222 bits (567), Expect = 5e-76
Identities = 215/215 (100%), Positives = 215/215 (100%)

Query: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60
MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60

Query: 61 EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV 120
EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV
Sbjct: 61 EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV 120

Query: 121 GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF 180
GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF
Sbjct: 121 GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF 180

Query: 181 APQSFDLKKEARDYVAILLEMYLLCPTLRNPATNE 215
APQSFDLKKEARDYVAILLEMYLLCPTLRNPATNE
Sbjct: 181 APQSFDLKKEARDYVAILLEMYLLCPTLRNPATNE 215


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0508  RTXTOXIND     32     0.017    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 31.7 bits (72), Expect = 0.017
Identities = 19/125 (15%), Positives = 40/125 (32%), Gaps = 6/125 (4%)

Query: 28 QNTAFARASSNGDLPTKADLQAQLDSLNKQKDLSAQDKLVQQDLTDTLATLDKIDRVKEE 87
N RA L + + L L+ + A L++ ++ E
Sbjct: 207 LNLDKKRAERLTVLARINRYENLSRVEKSR--LDDFSSLLHKQAIAKHAVLEQENKYVEA 264

Query: 88 TVQLRQKVAEAPEKMRQATAALTALSDVDND--EETRKIL--STLSLRQLETRVAQALDD 143
+LR ++ + + +A V E L +T ++ L +A+ +
Sbjct: 265 VNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEER 324

Query: 144 LQNAQ 148
Q +
Sbjct: 325 QQASV 329


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0513  IGASERPTASE   39     9e-05    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 38.5 bits (89), Expect = 9e-05
Identities = 40/251 (15%), Positives = 78/251 (31%), Gaps = 31/251 (12%)

Query: 404 PLPETTSQVLAARQQLQRVQGATKAKKSEPAA----ATRARPVNNAALERLASVTDRVQA 459
P E +Q + + + P+ AR + A + A T
Sbjct: 983 PEVEKRNQTVDTTN----ITTPNNIQADVPSVPSNNEEIARV-DEAPVPPPAPATPSETT 1037

Query: 460 RPVPSALEKAPAKKEAYRWKATTPVMQQKE--------VVATPKALKKA---LEHEKTPE 508
V ++ E AT Q +E V A + + A E ++T
Sbjct: 1038 ETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQT 1097

Query: 509 LAAKLAA---------EAIERDPWAAQVSQLSLPKLVEQVALNAWKE-ESDNAVCLHLRS 558
K A E+ +V+ PK + + E +N ++++
Sbjct: 1098 TETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKE 1157

Query: 559 SQRHLNNRGAQQKLAEALST-LKGSTVELTIVEDDNPAVRTPLEWRQAIYEEKLAQARES 617
Q N ++ A+ S+ ++ E T V N V P A + + +
Sbjct: 1158 PQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSN 1217

Query: 618 IIADNNIQTLR 628
+ + +++R
Sbjct: 1218 KPKNRHRRSVR 1228


82  EcSMS35_0587  EcSMS35_0593  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
EcSMS35_0587  1       17       0.703965   sensor kinase CusS
EcSMS35_0588  0       21       2.468988   DNA-binding transcriptional activator CusR
EcSMS35_0589  0       19       2.194290   hypothetical protein
EcSMS35_0590  -1      18       1.491414   copper/silver efflux system outer membrane
EcSMS35_0591  -2      18       1.657101   periplasmic copper-binding protein
EcSMS35_0592  -1      17       1.718063   copper/silver efflux system membrane fusion
EcSMS35_0593  -2      17       1.147792   cation efflux system protein cusA
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0587  PF06580       33     0.004    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 32.5 bits (74), Expect = 0.004
Identities = 29/184 (15%), Positives = 67/184 (36%), Gaps = 34/184 (18%)

Query: 306 EELTRMAKMVSDML-FLAQADNNQLIPEKKMLNLADEVGKVFDFFEALAEDR-GVELRFV 363
+ M +S+++ + + N + + LADE+ V + + LA + L+F
Sbjct: 191 TKAREMLTSLSELMRYSLRYSNARQVS------LADELTVVDSYLQ-LASIQFEDRLQFE 243

Query: 364 GDECQVAGDPLMLRRALSNLLSNALRY----TPTGETIVVRCQTVDHLVQVTVENPGTPI 419
D + + L+ N +++ P G I+++ + V + VEN G+
Sbjct: 244 NQINPAIMDVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLA 303

Query: 420 APEHLPRLFDRFYRIDPSRQRKGEGSGIGLAIVK---SIVVAHKGTVAVTSDARGTRFVI 476
E +G GL V+ ++ + + ++ ++
Sbjct: 304 LKN------------------TKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMV 345

Query: 477 ILPA 480
++P
Sbjct: 346 LIPG 349


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0588  HTHFIS        86     2e-21    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 85.7 bits (212), Expect = 2e-21
Identities = 35/117 (29%), Positives = 62/117 (52%)

Query: 2 KLLIVEDEKKTGEYLTKGLTEAGFVVDLADNGLNGYHLAMTGDYDLIILDIMLPDVNGWD 61
+L+ +D+ L + L+ AG+ V + N + GD DL++ D+++PD N +D
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 IVRMLRSANKGMPILLLTALGTIEHRVKGLELGADDYLVKPFAFAELLARVRTLLRR 118
++ ++ A +P+L+++A T +K E GA DYL KPF EL+ + L
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0590  RTXTOXIND     38     8e-05    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 37.5 bits (87), Expect = 8e-05
Identities = 26/189 (13%), Positives = 60/189 (31%), Gaps = 13/189 (6%)

Query: 254 QAQTVNSDSLQSVKLPA-GLPSQILLQRPDIMEAEHALM-----AANANIGAARAAFFPS 307
+ +S + +K + +I+++ + + L+ A A+ ++
Sbjct: 87 NGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQS----- 141

Query: 308 ISLTSGISTASSDLSSLFNASSGMWNFIPKIEIPIFNAGRNQANLDIAEIRQQQSVVNYE 367
SL + + + + P F + L + + ++Q
Sbjct: 142 -SLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQN 200

Query: 368 QKIQNAFKEVADALALRQSLNDQISAQQRYLASLKITLQRARALYQHGAVSYLEVLDAER 427
QK Q + A R ++ +I+ + K L +L A++ VL+ E
Sbjct: 201 QKYQ-KELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQEN 259

Query: 428 SLFATRQTL 436
L
Sbjct: 260 KYVEAVNEL 268


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0593  ACRIFLAVINRP  696    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 696 bits (1798), Expect = 0.0
Identities = 213/1059 (20%), Positives = 440/1059 (41%), Gaps = 54/1059 (5%)

Query: 1 MIEWIIRRSVANRFLVLMGALFLSIWGTWTIINTPVDALPDLSDVQVIIKTSYPGQAPQI 60
M + IRR + A+ L + G I+ PV P ++ V + +YPG Q
Sbjct: 1 MANFFIRR----PIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQT 56

Query: 61 VENQVTYPLTTTMLSVPGAKTVRGFSQ-FGDSYVYVIFEDGTDPYWARSRVLEYLNQVQG 119
V++ VT + M + + S G + + F+ GTDP A+ +V L
Sbjct: 57 VQDTVTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATP 116

Query: 120 KLPAGVSAELGP-DATGVGWIYEYALVDRSGKHDLADLRSLQDWFLKYELKTIPDVAEVA 178
LP V + + + ++ V + D+ +K L + V +V
Sbjct: 117 LLPQEVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQ 176

Query: 179 SVGGVVKEYQVVIDPQRLAQYGISLAEVKSALDASNQEAGGSSIELA------EAEYMVR 232
G ++ +D L +Y ++ +V + L N + + + +
Sbjct: 177 LFGAQ-YAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASII 235

Query: 233 ASGYLQTLDDFNHIVLKASENGVPVYLRDVAKVQVGPEMRRGIAELNGEGEVAGGVVILR 292
A + ++F + L+ + +G V L+DVA+V++G E IA +NG+ AG + L
Sbjct: 236 AQTRFKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGK-PAAGLGIKLA 294

Query: 293 SGKNAREVIAAVKDKLETLKSSLPEGVEIVTTYDRSQLIDRAIDNLSGKLLEEFIVVAVV 352
+G NA + A+K KL L+ P+G++++ YD + + +I + L E ++V +V
Sbjct: 295 TGANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLV 354

Query: 353 CALFLWHVRSALVAIISLPLGLCIAFIVMHFQGLNANIMSLGGIAIAVGAMVDAAIVMIE 412
LFL ++R+ L+ I++P+ L F ++ G + N +++ G+ +A+G +VD AIV++E
Sbjct: 355 MYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVE 414

Query: 413 NAHKRLEEWQHQHPDATLDNKTRWQVITDASVEVGPALFISLLIITLSFIPIFTLEGQEG 472
N + + E D + + ++ AL ++++ FIP+ G G
Sbjct: 415 NVERVMME----------DKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTG 464

Query: 473 RLFGPLAFTKTYAMAGAALLAIVVIPILMGYWIRGKIPPESSNPLNRF----------LI 522
++ + T AMA + L+A+++ P L ++ + E F +
Sbjct: 465 AIYRQFSITIVSAMALSVLVALILTPALCATLLKP-VSAEHHENKGGFFGWFNTTFDHSV 523

Query: 523 RVYHPLLLKVLHWPKTTLLVAALSVLTVLWPLNKVGGEFLPQINEGDLLYMPSTLPGISA 582
Y + K+L LL+ AL V ++ ++ FLP+ ++G L M G +
Sbjct: 524 NHYTNSVGKILGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQ 583

Query: 583 AEAASMLQKTDKLIM--SVPEVARVFGKTGKAETATDSAPLEMVETTIQLKPQDQW-RPG 639
+L + + V VF G + + + LKP ++
Sbjct: 584 ERTQKVLDQVTDYYLKNEKANVESVFTVNGFSFSGQAQN---AGMAFVSLKPWEERNGDE 640

Query: 640 MTMDKIIEELDNTVRLPGLANLWVPPIRNRIDMLSTGIKSPIGIKVSGTVLADI-DTMAE 698
+ + +I + + + +++ + I +G + +
Sbjct: 641 NSAEAVIHRAKMELGKIRDGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQ 700

Query: 699 QIEEVARTVPGVASALAERLEGGRYINVEINREKAARYGMTVADVQLFVTSAVGGAMVGE 758
+ A+ + S LE +E+++EKA G++++D+ +++A+GG V +
Sbjct: 701 LLGMAAQHPASLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVND 760

Query: 759 TVEGIARYPINLRYPQSWRDSPQALRQLPILTPMKQQITLADVADVKVSTGPSMLKTENA 818
++ + ++ +R P+ + +L + + + + + G L+ N
Sbjct: 761 FIDRGRVKKLYVQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNG 820

Query: 819 RPTSWIYIDARDRDMVSVVHDLQKAIAEKVQLKPGTSVAFSGQFELLERANHKLKLMVPM 878
P+ I +A L + +A K L G ++G + ++ +V +
Sbjct: 821 LPSMEIQGEAAPGTSSGDAMALMENLASK--LPAGIGYDWTGMSYQERLSGNQAPALVAI 878

Query: 879 TLMIIFVLLYLAFRRVGEALLIISSVPFALVGGIWLLWWMGFHLSVATGTGFIALAGVAA 938
+ +++F+ L + + ++ VP +VG + V G + G++A
Sbjct: 879 SFVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSA 938

Query: 939 EFGVVMLMYLRHAIEAEPSLNNPQTFSEQKLDEALYHGAVLRVRPKAMTVAVIIAGLLPI 998
+ ++++ + + +E E + + EA +R+RP MT I G+LP+
Sbjct: 939 KNAILIVEFAKDLMEKE----------GKGVVEATLMAVRMRLRPILMTSLAFILGVLPL 988

Query: 999 LWGTGAGSEVMSRIAAPMIGGMITAPLLSLFIIPAAYKL 1037
GAGS + + ++GGM++A LL++F +P + +
Sbjct: 989 AISNGAGSGAQNAVGIGVMGGMVSATLLAIFFVPVFFVV 1027


83  EcSMS35_0611  EcSMS35_0616  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
EcSMS35_0611  -1      18       5.072344   enterobactin exporter EntS
EcSMS35_0612  -1      18       5.010090   iron-enterobactin transporter periplasmic
EcSMS35_0613  -2      22       5.441715   isochorismate synthase, entC
EcSMS35_0614  -1      24       5.421074   enterobactin synthase subunit E
EcSMS35_0615  -1      21       5.182100   isochorismatase
EcSMS35_0616  -1      18       4.889691   2,3-dihydroxybenzoate-2,3-dehydrogenase
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0611  TCRTETA       36     2e-04    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 36.3 bits (84), Expect = 2e-04
Identities = 82/394 (20%), Positives = 145/394 (36%), Gaps = 38/394 (9%)

Query: 27 FISIVSLGLLGVAVPVQIQMMTHSTWQV---GLSVTLTGGAMFVGLMVGGVLADRYERKK 83
+ V +GL+ +P ++ + HS G+ + L F V G L+DR+ R+
Sbjct: 15 ALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRP 74

Query: 84 VILLARGTCGIGFIGLCLNALL--PEPSLLAIYLLGLWDGFFASLGVTALLAATPALVGR 141
V+L + G ++ + P L +Y+ + G + G A A +
Sbjct: 75 VLL-------VSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVA-GAYIADITDG 126

Query: 142 ENLMQAGAITMLTVRLGSVISPMIGGLLLATGGVAWNYGLAAAGTFITLLPLLSLPALPP 201
+ + G V P++GGL+ GG + + AA L L LP
Sbjct: 127 DERARHFGFMSACFGFGMVAGPVLGGLM---GGFSPHAPFFAAAALNGLNFLTGCFLLPE 183

Query: 202 PPQPREHPLK----SLLAGFRFLLASPLVGGIALLGGLLTMAS----AVRVLYPALADNW 253
+ PL+ + LA FR+ +V + + ++ + A+ V++ D +
Sbjct: 184 SHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFG--EDRF 241

Query: 254 QMSAAQIGFLYAAIP-LGAAIGALTSGKLAHSARPGLLMLLSTLGS---FLAIGLFGLMP 309
A IG AA L + A+ +G +A ++L + ++ +
Sbjct: 242 HWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGW 301

Query: 310 MWILGVVCLALFGWLSAVSSLLQYTMLQTQTPEAMLGRINGLWTAQNVTGDAIGAALLGG 369
M +V LA G ML Q E G++ G A +G L
Sbjct: 302 MAFPIMVLLASGGIGMPALQ----AMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTA 357

Query: 370 LGAMMTPVASASASGFGLLIIGVLLLLVLVELRR 403
+ A + + +G+ + L LL L LRR
Sbjct: 358 IYA----ASITTWNGWAWIAGAALYLLCLPALRR 387


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0612  FERRIBNDNGPP  63     2e-13    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 63.0 bits (153), Expect = 2e-13
Identities = 61/285 (21%), Positives = 102/285 (35%), Gaps = 35/285 (12%)

Query: 40 HTLESQPQRIVSTSVTLTGSLLAIDAPVIASGATTPNNRVADDQGFLRQWSKVAKERKLQ 99
H P RIV+ LLA+ VAD + R W E L
Sbjct: 29 HAAAIDPNRIVALEWLPVELLLALGIVPYG---------VADTINY-RLW---VSEPPLP 75

Query: 100 RLYIG-----EPSAEAVAAQMPDLILISATGGDSALALYDQLSTIAPTLIINYDDKS--- 151
I EP+ E + P ++ SA G S + L+ IAP N+ D
Sbjct: 76 DSVIDVGLRTEPNLELLTEMKPSFMVWSAGYGPS----PEMLARIAPGRGFNFSDGKQPL 131

Query: 152 --WQSLLTQLGEITGHEKQAAERIAQFDKQLAAAKEQIKLPPQPVTAIVYTAAAHSANLW 209
+ LT++ ++ + A +AQ++ + + K + + ++
Sbjct: 132 AMARKSLTEMADLLNLQSAAETHLAQYEDFIRSMKPRFVKRGARPLLLTTLIDPRHMLVF 191

Query: 210 TPESAQGQMLEQLGFTLAKLPAGLNASQSQGKRHDIIQLGGENLAAGLNGESLFLFAGDQ 269
P S ++L++ G NA Q + + + LAA + + L +
Sbjct: 192 GPNSLFQEILDEYGIP--------NAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDHDNS 243

Query: 270 KDADAIYANPLLAHLPAVQNKQVYALGTETFRLDYYSAMQVLERL 314
KD DA+ A PL +P V+ + + F SAM + L
Sbjct: 244 KDMDALMATPLWQAMPFVRAGRFQRVPAVWFYGATLSAMHFVRVL 288


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0615  ISCHRISMTASE  442    e-160    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 442 bits (1139), Expect = e-160
Identities = 146/299 (48%), Positives = 195/299 (65%), Gaps = 18/299 (6%)

Query: 1 MAIPKLQAYALPDSHDIPQNKVDWAFEPQRAALLIHDMQDYFVSFWGENCPMMEQVIANI 60
MAIP +Q Y +P + D+PQNKV W +P RA LLIHDMQ+YFV + + ++ ANI
Sbjct: 1 MAIPAIQPYQMPTASDMPQNKVSWVPDPNRAVLLIHDMQNYFVDAFTAGASPVTELSANI 60

Query: 61 AALRDYCKQHNIPVYYTAQPKEQSDEDRALLNDMWGPGLTRSPEQQKVVDRLTPDADDTV 120
L++ C Q IPV YTAQP Q+ +DRALL D WGPGL P ++K++ L P+ DD V
Sbjct: 61 RKLKNQCVQLGIPVVYTAQPGSQNPDDRALLTDFWGPGLNSGPYEEKIITELAPEDDDLV 120

Query: 121 LVKWRYSAFHRSPLEQMLKESGRNQLIITGVYAHIGCMTTATDAFMRDIKPFMVADALAD 180
L KWRYSAF R+ L +M+++ GR+QLIITG+YAHIGC+ TA +AFM DIK F V DA+AD
Sbjct: 121 LTKWRYSAFKRTNLLEMMRKEGRDQLIITGIYAHIGCLVTACEAFMEDIKAFFVGDAVAD 180

Query: 181 FSRDEHLMSLKYVAGRSGRVVMTEELL------PAPIPASKA-----------ALREVIL 223
FS ++H M+L+Y AGR VMT+ LL PA + + A +R+ I
Sbjct: 181 FSLEKHQMALEYAAGRCAFTVMTDSLLDQLQNAPADVQKTSANTGKKNVFTCENIRKQIA 240

Query: 224 PLLDESDEPFDDD-NLIDYGLDSVRMMALAARWRKVHGDIDFVMLAKNPTIDAWWKLLS 281
LL E+ E D +L+D GLDSVR+M L +WR+ ++ FV LA+ PTI+ W KLL+
Sbjct: 241 ELLQETPEDITDQEDLLDRGLDSVRIMTLVEQWRREGAEVTFVELAERPTIEEWQKLLT 299


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0616  DHBDHDRGNASE  361    e-130    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 361 bits (928), Expect = e-130
Identities = 109/258 (42%), Positives = 150/258 (58%), Gaps = 20/258 (7%)

Query: 5 GKNVWVTGAGKGIGYATALAFVEAGAKVTGFD---------------QAFAQEQYPFATE 49
GK ++TGA +GIG A A GA + D +A E +P
Sbjct: 8 GKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFP---- 63

Query: 50 VMDVADAAQVAQVCQRLLAQTERLDVLVNAAGILRMGATDQLSKEDWQQTFAVNVGGAFN 109
DV D+A + ++ R+ + +D+LVN AG+LR G LS E+W+ TF+VN G FN
Sbjct: 64 -ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFN 122

Query: 110 LFQQTMNQFRRQRGGAIVTVASDAAHTPRIGMSAYGASKAALKSLALSVGLELAGSGVRC 169
+ +R G+IVTV S+ A PR M+AY +SKAA +GLELA +RC
Sbjct: 123 ASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRC 182

Query: 170 NVVSPGSTDTDMQRTLWVSDDAEEQRIRGFGEQFKLGIPLGKIARPQEIANTILFLASDL 229
N+VSPGST+TDMQ +LW ++ EQ I+G E FK GIPL K+A+P +IA+ +LFL S
Sbjct: 183 NIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQ 242

Query: 230 ASHITLQDIVVDGGSTLG 247
A HIT+ ++ VDGG+TLG
Sbjct: 243 AGHITMHNLCVDGGATLG 260
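
The `Score`/`Expect` and `Identities` lines printed with each alignment in this report can be turned into numbers, for example to re-check hits against the E-value threshold (0.05 in the input parameters). The following is only a sketch in Python: the function names are hypothetical, and the line layout is assumed to match the hits shown here, including Expect values such as `e-160` that are printed without a leading digit.

```python
import re

# Hypothetical helpers; patterns follow the statistics lines shown in this report.
SCORE_RE = re.compile(
    r"Score\s*=\s*([\d.]+)\s*bits\s*\((\d+)\),\s*Expect\s*=\s*(\S+)"
)
IDENT_RE = re.compile(r"Identities\s*=\s*(\d+)/(\d+)")


def parse_expect(text):
    """Convert an Expect field to float; 'e-160' is read as 1e-160."""
    if text.startswith("e"):
        text = "1" + text
    return float(text)


def parse_hit_stats(score_line, identities_line):
    """Extract bit score, raw score, E-value and identity fraction."""
    score = SCORE_RE.search(score_line)
    ident = IDENT_RE.search(identities_line)
    if score is None or ident is None:
        raise ValueError("unrecognised alignment statistics line")
    matches, length = int(ident.group(1)), int(ident.group(2))
    return {
        "bits": float(score.group(1)),
        "raw_score": int(score.group(2)),
        "expect": parse_expect(score.group(3)),
        "identity": matches / length,  # exact fraction, not the rounded percent
    }


stats = parse_hit_stats(
    "Score = 442 bits (1139), Expect = e-160",
    "Identities = 146/299 (48%), Positives = 195/299 (65%), Gaps = 18/299 (6%)",
)
print(stats["expect"] <= 0.05, round(stats["identity"], 3))
```

Computing the identity from the raw counts avoids relying on the whole-number percentage printed in parentheses.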


84  EcSMS35_0814  EcSMS35_0820  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
EcSMS35_0814  -1      19       3.739238   ABC-2 type transporter, permease
EcSMS35_0816  -1      17       4.067294   ABC-2 type transporter, permease
EcSMS35_0817  -2      15       4.055157   ABC transporter ATP-binding protein
EcSMS35_0818  -2      13       3.565065   hypothetical protein
EcSMS35_0819  -1      13       3.295531   putative DNA-binding transcriptional regulator
EcSMS35_0820  -1      13       3.568990   ATP-dependent RNA helicase RhlE
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0814  ABC2TRNSPORT  47     3e-08    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 47.2 bits (112), Expect = 3e-08
Identities = 36/146 (24%), Positives = 63/146 (43%), Gaps = 5/146 (3%)

Query: 197 AREREQGTLDQLLVSPLTTWQIFIGKAVPALIVATFQATIVLAIGIWAYQIPFAGSLALF 256
R Q T + +L + L I +G+ A A IG+ A + + L+L
Sbjct: 92 GRMEGQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGA---GIGVVAAALGYTQWLSLL 148

Query: 257 YFTMVI--YGLSLVGFGLLISSLCSTQQQAFIGVFVFMMPAILLSGYVSPVENMPVWLQN 314
Y VI GL+ G+++++L + + + P + LSG V PV+ +P+ Q
Sbjct: 149 YALPVIALTGLAFASLGMVVTALAPSYDYFIFYQTLVITPILFLSGAVFPVDQLPIVFQT 208

Query: 315 LTWINPIRHFTDITKQIYLKDASLDI 340
P+ H D+ + I L +D+
Sbjct: 209 AARFLPLSHSIDLIRPIMLGHPVVDV 234


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0817  PF05272       31     0.012    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 31.2 bits (70), Expect = 0.012
Identities = 20/90 (22%), Positives = 28/90 (31%), Gaps = 21/90 (23%)

Query: 293 TPRFEDAFIDLLGGAGTSESPLGAILHTVEGTPGETVIEAKELTKKFGDFAATDHVNFAV 352
PR E + +LG P + + + K HV +
Sbjct: 547 VPRLEKWLVHVLGKTPDDYKP-------------RRLRYLQLVGKYI----LMGHVARVM 589

Query: 353 KRGEIFG----LLGPNGAGKSTTFKMMCGL 378
+ G F L G G GKST + GL
Sbjct: 590 EPGCKFDYSVVLEGTGGIGKSTLINTLVGL 619



Score = 29.3 bits (65), Expect = 0.047
Identities = 11/23 (47%), Positives = 13/23 (56%)

Query: 34 YVTGLVGPDGAGKTTLMRMLAGL 56
Y L G G GK+TL+ L GL
Sbjct: 597 YSVVLEGTGGIGKSTLINTLVGL 619


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0818  RTXTOXIND     62     6e-13    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 62.2 bits (151), Expect = 6e-13
Identities = 42/259 (16%), Positives = 92/259 (35%), Gaps = 25/259 (9%)

Query: 82 ALMQAKAGVSVAQAQYDLMLAGYRDEEIAQAAAAVKQAQAAYDYAQNFYNRQQGLWKSRT 141
Q + + +A+ +LA E + + + + L +
Sbjct: 201 QKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENK 260

Query: 142 ISA--NDLENARSSRDQAQATLKSAQDKLRQYRSGNREQ---DIAQAKASLEQAQAQLAQ 196
N+L +S +Q ++ + SA+++ + + + + Q ++ +LA+
Sbjct: 261 YVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAK 320

Query: 197 AELNLQDSTLIAPSDGTLLTRAV-EPGTVLNEGGTVFTVSLT-RPVWVRAYVDERNLDQA 254
E Q S + AP + V G V+ T+ + + V A V +++
Sbjct: 321 NEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFI 380

Query: 255 QPGRKVLLYTDGRPDKPYH---GQIGFVSPTAEFTPKTVETPDLRTDLVYRLRIVVT--- 308
G+ ++ + P Y G++ ++ A D R LV+ + I +
Sbjct: 381 NVGQNAIIKVEAFPYTRYGYLVGKVKNINLDA--------IEDQRLGLVFNVIISIEENC 432

Query: 309 ----DADDALRQGMPVTVQ 323
+ + L GM VT +
Sbjct: 433 LSTGNKNIPLSSGMAVTAE 451


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0819  HTHTETR       72     1e-17    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 72.0 bits (176), Expect = 1e-17
Identities = 32/220 (14%), Positives = 74/220 (33%), Gaps = 29/220 (13%)

Query: 9 KGEQAKKQLIAAALAQFGEYGMNATT-REIAAQAGQNIAAITYYFGSKEDLYLACAQWIA 67
+ ++ ++ ++ AL F + G+++T+ EIA AG AI ++F K DL+ +
Sbjct: 8 EAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSE 67

Query: 68 DFIGEQFRPHAEEAERLFAQPQPDRAAIRELILRACRNMIKLLTQDDTVNLSK---FISR 124
IGE E + P + +RE+++ + + + + +
Sbjct: 68 SNIGELEL---EYQAKFPGDP---LSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVG 121

Query: 125 EQLSPTAAYHLVHEQVISPLHSHLTRLIAAW---TGCDASDTRMILHTHALIGEILAFRL 181
E A + + + L I A +I+ I ++
Sbjct: 122 EMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIM--RGYISGLM---- 175

Query: 182 GKETILLRTGWTAFDEEKTELINQTVTCHIDLILQGLSQR 221
W + + + ++ ++L+
Sbjct: 176 --------ENWLFAPQSFD--LKKEARDYVAILLEMYLLC 205


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0820  SECA          30     0.025    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 29.8 bits (67), Expect = 0.025
Identities = 20/67 (29%), Positives = 34/67 (50%), Gaps = 4/67 (5%)

Query: 246 QQVLVFTRTKHGANHLAEQLNKDGIRSAAIHG-NKSQGARTRALADFKSGDIRVLVATDI 304
Q VLV T + + ++ +L K GI+ ++ + A A A + + V +AT++
Sbjct: 450 QPVLVGTISIEKSELVSNELTKAGIKHNVLNAKFHANEAAIVAQAGYPAA---VTIATNM 506

Query: 305 AARGLDI 311
A RG DI
Sbjct: 507 AGRGTDI 513


85  EcSMS35_0866  EcSMS35_0874  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
EcSMS35_0866  0       14       -0.752968  D-alanyl-D-alanine carboxypeptidase fraction C
EcSMS35_0867  1       13       0.013884   DNA-binding transcriptional repressor DeoR
EcSMS35_0868  0       12       0.277955   undecaprenyl pyrophosphate phosphatase
EcSMS35_0869  0       12       0.195766   multidrug translocase MdfA
EcSMS35_0870  -1      14       -0.315989  hypothetical protein
EcSMS35_0871  0       15       -0.751161  phosphatase YbjI
EcSMS35_0872  -1      13       0.118717   major facilitator transporter
EcSMS35_0873  0       13       -0.609709  TetR family transcriptional regulator
EcSMS35_0874  0       12       0.353098   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0866  BLACTAMASEA   43     8e-07    Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 43.2 bits (102), Expect = 8e-07
Identities = 41/201 (20%), Positives = 64/201 (31%), Gaps = 34/201 (16%)

Query: 19 AFLFLFAPTAFAAEQTVEAPSVDARAW----------ILMDYASGKVLAEGNADEKLDPA 68
+ L A A P + I MD ASG+ L ADE+
Sbjct: 7 CIISLLATLPLAV-HASPQPLEQIKLSESQLSGRVGMIEMDLASGRTLTAWRADERFPMM 65

Query: 69 SLTKIMTSYVVGQALKADKIKLTDMVTVGKDAWATGNPALRGSSVMFLKPGDQVSVADLN 128
S K++ V + A +L + + +P V D ++V +L
Sbjct: 66 STFKVVLCGAVLARVDAGDEQLERKIHYRQQDLVDYSP------VSEKHLADGMTVGELC 119

Query: 129 KGVIIQSGNDACIALADYVAGSQESFIGLMNGYAKKLGLTNTT---FQTVHGLDAPGQF- 184
I S N A L V G + + +++G T ++T PG
Sbjct: 120 AAAITMSDNSAANLLLATVGGPAG-----LTAFLRQIGDNVTRLDRWETELNEALPGDAR 174

Query: 185 --STARDMA------LLGKAL 197
+T MA L + L
Sbjct: 175 DTTTPASMAATLRKLLTSQRL 195


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0869  TCRTETA       41     6e-06    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 41.0 bits (96), Expect = 6e-06
Identities = 59/269 (21%), Positives = 106/269 (39%), Gaps = 23/269 (8%)

Query: 71 LLGPLSDRIGRRPVMLAGVVWFIVTCLAILLAQNIEQFTLLRFLQGISLCFIGAVGYAAI 130
+LG LSDR GRRPV+L + V + A + + R + GI+ GAV A I
Sbjct: 62 VLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGA-TGAVAGAYI 120

Query: 131 QESFEEAVCIKITALMANVALIAPLLGPLVG---AAWIHVLPWEGMFVLFAALAAISFFG 187
+ + + M+ + GP++G + P F AAL ++F
Sbjct: 121 ADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAP----FFAAAALNGLNFLT 176

Query: 188 LQRAMPETATRIGEKLSLKELGRDYKLVLKNG-RFVAGALALGFVSLPLLAWIAQSP--I 244
+PE+ L + L G VA +A+ F ++ + Q P +
Sbjct: 177 GCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFF----IMQLVGQVPAAL 232

Query: 245 IIITGEQLSSYEYGLLQVPIFGALIAGNL----LLARLTSRRTVRSLIIMGGWPIMIGLL 300
+I GE ++ + + + I +L + + +R R +++G G +
Sbjct: 233 WVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYI 292

Query: 301 VAAAATVISSHAYLWMTAGLSLYAFGIGL 329
+ A AT ++ + L + GIG+
Sbjct: 293 LLAFAT----RGWMAFPIMVLLASGGIGM 317


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0872  TCRTETB       35     5e-04    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 34.9 bits (80), Expect = 5e-04
Identities = 34/150 (22%), Positives = 65/150 (43%), Gaps = 6/150 (4%)

Query: 218 LLIGVVVLAMAFAEGSANDWL-PLLMVDGHGFSP-TSGSLIYAGFTLGMTVGRFTGGWFI 275
+IGV+ + F + + P +M D H S GS+I T+ + + + GG +
Sbjct: 258 FMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYIGGILV 317

Query: 276 DRYSRVAVVR-ASALM--GALGIGMIIFVDSAWVA-GVSVVLWGLGASLGFPLTISAASD 331
DR + V+ + L ++ S ++ + VL GL + TI ++S
Sbjct: 318 DRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGLSFTKTVISTIVSSSL 377

Query: 332 TGPDAPTRVSVVATTGYLAFLVGPPLLGYL 361
+A +S++ T +L+ G ++G L
Sbjct: 378 KQQEAGAGMSLLNFTSFLSEGTGIAIVGGL 407


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0873  HTHTETR       52     1e-10    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 51.6 bits (123), Expect = 1e-10
Identities = 14/81 (17%), Positives = 31/81 (38%)

Query: 2 RRANDPQRREKIIQATLEAVKLYGIHAVTHRKIATLAGVPLGSMTYYFSGIDELLLEAFS 61
+ + R+ I+ L G+ + + +IA AGV G++ ++F +L E +
Sbjct: 5 TKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWE 64

Query: 62 RFTEIMSRQYQAFFSDVSDAP 82
+ + + P
Sbjct: 65 LSESNIGELELEYQAKFPGDP 85


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0874  TCRTETA       32     0.006    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 32.1 bits (73), Expect = 0.006
Identities = 21/106 (19%), Positives = 34/106 (32%), Gaps = 6/106 (5%)

Query: 394 LMIGMITFQFSTFSFGMGNAAGLLFAGIML-GFMRANHPTFG-YIPQ--GALSMVKEFGL 449
L++ + +L+ G ++ G A G YI + FG
Sbjct: 76 LLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIADITDGDERARHFGF 135

Query: 450 MVFMAGVGLSAGSGINNGLGAIGGQM--LIAGLIVSLVPVVICFLF 493
M G G+ AG + +G A + L + CFL
Sbjct: 136 MSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLL 181


86  EcSMS35_0892  EcSMS35_0898  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
EcSMS35_0892  1       15       -4.600769  arginine transporter ATP-binding subunit
EcSMS35_0893  1       16       -3.293468  putative lipoprotein
EcSMS35_0894  1       13       -2.505185  hypothetical protein
EcSMS35_0895  -1      14       3.521963   hypothetical protein
EcSMS35_0897  -2      13       3.638845   N-acetylmuramoyl-L-alanine amidase AmiD
EcSMS35_0896  -2      15       3.266339   NAD-dependent epimerase/dehydratase family
EcSMS35_0898  -3      12       2.642569   NAD dependent epimerase/dehydratase family
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0892  PF05272       30     0.010    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.0 bits (67), Expect = 0.010
Identities = 9/18 (50%), Positives = 12/18 (66%)

Query: 31 LVLLGPSGAGKSSLLRVL 48
+VL G G GKS+L+ L
Sbjct: 599 VVLEGTGGIGKSTLINTL 616


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0897  ECOLIPORIN    31     0.007    E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 30.7 bits (69), Expect = 0.007
Identities = 20/54 (37%), Positives = 27/54 (50%), Gaps = 9/54 (16%)

Query: 2 RRVFWLVAAALLLAGCAGEKGIVEKEGYQLDTRRQAQAAYPRIKVMVIHYTADD 55
R+V LV ALL AG A I K+G +LD Y ++ + HY +DD
Sbjct: 3 RKVLALVIPALLAAGAAHAAEIYNKDGNKLDL-------YGKVDGL--HYFSDD 47


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0896  NUCEPIMERASE  75     2e-17    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 75.2 bits (185), Expect = 2e-17
Identities = 70/363 (19%), Positives = 123/363 (33%), Gaps = 65/363 (17%)

Query: 13 MKVLVTGATSGLGRNAVEFLCQKGISVRA---------TGRNEAMGKLLEKMGAEFVPAD 63
MK LVTGA +G + + L + G V +A +LL + G +F D
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKID 60

Query: 64 LTELVSSQAKVMLAGIDTLWHCS-------SFTSPWGTQQAFDLANVRATRRLGEWAVAW 116
L + + ++ S +P A+ +N+ + E
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPH----AYADSNLTGFLNILEGCRHN 116

Query: 117 GVRNFIHISSPSLYFDYHHHRDIKEDFRPHRFANEFARSKAASEEVINMLSQANPQTRFT 176
+++ ++ SS S+Y + D + +A +K A+E + + S T
Sbjct: 117 KIQHLLYASSSSVYGL-NRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSH-LYGLPAT 174

Query: 177 ILRPQSLFGPHDK--VFIPRLAHMMHHYGSILLPHGGSALVDMTYYENAVHAMWLASQEA 234
LR +++GP + + + + M SI + + G D TY ++ A+
Sbjct: 175 GLRFFTVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIRLQDVI 234

Query: 235 CDKLPS--------------GRVYNITNGEHRTLRSIVQKLIDELNIDCRIRSVPYPMLD 280
RVYNI N L +Q L D L I+ + +P D
Sbjct: 235 PHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAKKNMLPLQPGD 294

Query: 281 MIARSMERLGRKSAKEPPLTHYGVSKLNFDFTLDITRAQEELGYQPVLTLDEGIEKTAAW 340
+ T D E +G+ P T+ +G++ W
Sbjct: 295 V----------------LETS-----------ADTKALYEVIGFTPETTVKDGVKNFVNW 327

Query: 341 LRD 343
RD
Sbjct: 328 YRD 330


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0898  NUCEPIMERASE  56     1e-10    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 55.6 bits (134), Expect = 1e-10
Identities = 29/125 (23%), Positives = 52/125 (41%), Gaps = 17/125 (13%)

Query: 4 RILVLGASGYIGQHLVRTLSQQGHQILA---------AARHVDRLAKLQLANVSCHKVDL 54
+ LV GA+G+IG H+ + L + GHQ++ + RL L HK+DL
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 55 SWPDNLPALLQD--IDTVYFLVH------SMGEGGDFIAQERQVALNVRDALREVPVKQL 106
+ + + L + V+ H S+ + LN+ + R ++ L
Sbjct: 62 ADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQHL 121

Query: 107 IFLSS 111
++ SS
Sbjct: 122 LYASS 126


87  EcSMS35_0981  EcSMS35_0991  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
EcSMS35_0981  -2      17       4.379642   DNA-binding transcriptional regulator BaeR
EcSMS35_0982  -3      16       4.280793   signal transduction histidine-protein kinase
EcSMS35_0983  -2      15       4.047697   multidrug efflux system protein MdtE
EcSMS35_0984  -2      14       3.161652   multidrug efflux system subunit MdtC
EcSMS35_0985  -3      10       1.439253   multidrug efflux system subunit MdtB
EcSMS35_0986  -1      11       0.328927   multidrug efflux system subunit MdtA
EcSMS35_0987  0       14       0.141811   von Willebrand factor type A domain-containing
EcSMS35_0988  -1      12       0.357182   hypothetical protein
EcSMS35_0989  -2      10       0.290211   hypothetical protein
EcSMS35_0990  -1      11       1.589635   hypothetical protein
EcSMS35_0991  -1      13       1.942768   putative chaperone
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0981  HTHFIS        76     6e-18    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 75.6 bits (186), Expect = 6e-18
Identities = 28/136 (20%), Positives = 65/136 (47%), Gaps = 1/136 (0%)

Query: 11 PRILIVEDEPKLGQLLIDYLRAASYAPTLISHGDQVLPYVRQTPPDLILLDLMLPGTDGL 70
IL+ +D+ + +L L A Y + S+ + ++ DL++ D+++P +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 71 TLCREIR-RFSDIPIVMVTAKIEEIDRLLGLEIGADDYICKPYSPREVVARVKTILRRCK 129
L I+ D+P+++++A+ + + E GA DY+ KP+ E++ + L K
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 130 PQRELQQQDAESPLII 145
+ + D++ + +
Sbjct: 124 RRPSKLEDDSQDGMPL 139


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0982  BCTERIALGSPF  34     0.002    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 33.6 bits (77), Expect = 0.002
Identities = 28/95 (29%), Positives = 36/95 (37%), Gaps = 20/95 (21%)

Query: 164 RQTSWLIVALSTLLAALATF------PLARGLLAPVKRLVDGTHKLAAGDFTTRVTPTSE 217
RQ + L+ A L AL P L+A V+ V H LA + P S
Sbjct: 75 RQLATLVAASMPLEEALDAVAKQSEKPHLSQLMAAVRSKVMEGHSLAD---AMKCFPGSF 131

Query: 218 DEL-----------GKLAQDFNQLASTLEKNQQMR 241
+ L G L N+LA E+ QQMR
Sbjct: 132 ERLYCAMVAAGETSGHLDAVLNRLADYTEQRQQMR 166


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
EcSMS35_0983  TCRTETB  125    1e-33    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 125 bits (315), Expect = 1e-33
Identities = 97/429 (22%), Positives = 188/429 (43%), Gaps = 23/429 (5%)

Query: 20 FMQSLDTTIVNTALPSMAQSLGESPLHMHMVIVSYVLTVAVMLPASGWLADKVGVRNIFF 79
F L+ ++N +LP +A + P + V +++LT ++ G L+D++G++ +
Sbjct: 24 FFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLL 83

Query: 80 TAIVLFTLGSLFCALSGTLNELL-LARALQGVGGAMMVPVGRLTVMKIVPREQYMAAMTF 138
I++ GS+ + + LL +AR +QG G A + + V + +P+E A
Sbjct: 84 FGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGL 143

Query: 139 VTLPGQVGPLLGPALGGLLVEYASWHWIFLINIPVGIIGAIATLM-LMPNYTMQTRRFDL 197
+ +G +GPA+GG++ Y HW +L+ IP+ I + LM L+ FD+
Sbjct: 144 IGSIVAMGEGVGPAIGGMIAHY--IHWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDI 201

Query: 198 SGFLLLAVGMAVLTLALDGSKGTGLSPLAIAGLVAVGVVALVLYLLHARNNNRALFSLKL 257
G +L++VG+ L + + V V++ ++++ H R L
Sbjct: 202 KGIILMSVGIVFFMLF---------TTSYSISFLIVSVLSFLIFVKHIRKVTDPFVDPGL 252

Query: 258 FRTRTFSLGLAGSFAGRIGSGMLPFMTPVFLQIGLGFSPFHAG-LMMIPMVLGSMGMKRI 316
+ F +G+ M P ++ S G +++ P + + I
Sbjct: 253 GKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYI 312

Query: 317 VVQVVNRFGYRRVLVATTLGLSLVTLLFMTTALL----GWYYVLPFVLFLQGMVNSTRFS 372
+V+R G VL +G++ +++ F+T + L W+ + V L G+ S +
Sbjct: 313 GGILVDRRGPLYVL---NIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGL--SFTKT 367

Query: 373 SMNTLTLKDLPDNLASSGNSLLSMIMQLSMSIGVTIAGLLLGLFGSQHVSVDSGTTQTVF 432
++T+ L A +G SLL+ LS G+ I G LL + + Q+ +
Sbjct: 368 VISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLSIPLLDQRLLPMEVDQSTY 427

Query: 433 MYTWLSMAF 441
+Y+ L + F
Sbjct: 428 LYSNLLLLF 436


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0984  ACRIFLAVINRP  918    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 918 bits (2373), Expect = 0.0
Identities = 287/1035 (27%), Positives = 505/1035 (48%), Gaps = 36/1035 (3%)

Query: 6 LFIYRPVATILLSVAITLCGILGFRMLPVAPLPQVDFPVIMVSASLPGASPETMASSVAT 65
FI RP+ +L++ + + G L LPVA P + P + VSA+ PGA +T+ +V
Sbjct: 4 FFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTVTQ 63

Query: 66 PLERSLGRIAGVSEMTSSS-SLGSTRIILQFDFDRDINGAARDVQAAINAAQSLLPSGMP 124
+E+++ I + M+S+S S GS I L F D + A VQ + A LLP +
Sbjct: 64 VIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQEVQ 123

Query: 125 SRPTYRKANPSDAPIMILTLTSDT--YSQGELYDFASTQLAPTISQIDGVGDVDVGGSSL 182
+ S + +M+ SD +Q ++ D+ ++ + T+S+++GVGDV + G+
Sbjct: 124 -QQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGAQY 182

Query: 183 PAVRVGLNPQALFNQGVSLDDVRTAISNANVRKPQG------VLEDGTHRWQIQTNDELK 236
A+R+ L+ L ++ DV + N + G L I K
Sbjct: 183 -AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFK 241

Query: 237 TAAEYQPLIIHYN-NGGAVRLGDVATVTDSVQDVRNAGMTNAKPAILLMIRKLPEANIIQ 295
E+ + + N +G VRL DVA V ++ N KPA L I+ AN +
Sbjct: 242 NPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANALD 301

Query: 296 TVDSIRARLPELQSTIPAAIDLQIAQDRSPTIRASLEEVEQTLIISVALVILVVFLFLRS 355
T +I+A+L ELQ P + + D +P ++ S+ EV +TL ++ LV LV++LFL++
Sbjct: 302 TAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQN 361

Query: 356 GRATIIPAVAVPVSLIGTFAAMYLCGFSLNNLSLMALTIATGFVVDDAIVVLENIARHL- 414
RAT+IP +AVPV L+GTFA + G+S+N L++ + +A G +VDDAIVV+EN+ R +
Sbjct: 362 MRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMM 421

Query: 415 EAGMKPLQAALQGTREVGFTVLSMSLSLVAVFLPLLLMGGLPGRLLREFAVTLSVAIGIS 474
E + P +A + ++ ++ +++ L AVF+P+ GG G + R+F++T+ A+ +S
Sbjct: 422 EDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALS 481

Query: 475 LLVSLTLTPMMCGWMLKASKPREQKRLRGFG----RMLVALQQGYGKSLKWVLNHTRLVG 530
+LV+L LTP +C +LK + GF Y S+ +L T
Sbjct: 482 VLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRYL 541

Query: 531 VVLLGTIALNIWLYISIPKTFFPEQDTGVLMGGIQADQSISFQ----AMRGKLQDFMKII 586
++ +A + L++ +P +F PE+D GV + IQ + + + ++K
Sbjct: 542 LIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKNE 601

Query: 587 RD-DPAVDNVTGFT-GGSRVNSGMMFITLKPRGERS---ETAQQIIDRLRVKLAKEPGAN 641
+ +V V GF+ G N+GM F++LKP ER+ +A+ +I R +++L K
Sbjct: 602 KANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDGF 661

Query: 642 LFLMAVQDIRVGGRQSNASYQYTLLSDDLAALREWEPKIRKKLATL-----PELADVNSD 696
+ + I G + ++ L D + + R +L + L V +
Sbjct: 662 VIPFNMPAIVELGTATGFDFE---LIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPN 718

Query: 697 QQDNGAEMNLVYDRDTMARLGIDVQAANSLLNNAFGQRQISTIYQPMNQYKVVMEVDPRY 756
++ A+ L D++ LG+ + N ++ A G ++ K+ ++ D ++
Sbjct: 719 GLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKF 778

Query: 757 TQDISALEKMFVINNEGKAIPLSYFAKWQPANAPLSVNHQGLSAASTISFNLPTGKSLSD 816
++K++V + G+ +P S F + + I G S D
Sbjct: 779 RMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGD 838

Query: 817 ASAAIDRAMTQLGVPSTVRGSFAGTAQVFQETMNSQVILIIAAIATVYIVLGILYESYVH 876
A A ++ ++L P+ + + G + + + N L+ + V++ L LYES+
Sbjct: 839 AMALMENLASKL--PAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSI 896

Query: 877 PLTILSTLPSAGVGALLALELFNAPFSLIALIGIMLLIGIVKKNAIMMVDFALEAQRHGN 936
P++++ +P VG LLA LFN + ++G++ IG+ KNAI++V+FA +
Sbjct: 897 PVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEG 956

Query: 937 LTPQEAIFQACLLRFRPIMMTTLAALFGALPLVLSGGDGSELRQPLGITIVGGLVMSQLL 996
EA A +R RPI+MT+LA + G LPL +S G GS + +GI ++GG+V + LL
Sbjct: 957 KGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLL 1016

Query: 997 TLYTTPVVYLFFDRL 1011
++ PV ++ R
Sbjct: 1017 AIFFVPVFFVVIRRC 1031



Score = 80.7 bits (199), Expect = 2e-17
Identities = 77/446 (17%), Positives = 162/446 (36%), Gaps = 26/446 (5%)

Query: 592 VDNVTGFTGGS-RVNSGMMFITLKPRGERSETAQQIIDRLRVKLAKEPGANLFLMAVQDI 650
+DN+ + S S + +T + + Q+ ++L++ P + Q I
Sbjct: 72 IDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQE----VQQQGI 127

Query: 651 RVGGRQSNASYQYTLLSDDLAALREW-----EPKIRKKLATLPELADVNSDQQDNGAE-- 703
V S+ +SD+ ++ ++ L+ L + DV GA+
Sbjct: 128 SVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQL----FGAQYA 183

Query: 704 MNLVYDRDTMARLGID----VQAANSLLNNAFGQRQISTIYQPMNQYKVVMEVDPRYTQD 759
M + D D + + + + + + T P Q + R+
Sbjct: 184 MRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFKNP 243

Query: 760 ISALEKMFVINNEGKAIPLSYFAK--WQPANAPLSVNHQGLSAASTISFNLPTGKSLSDA 817
+ +N++G + L A+ N + G AA +L D
Sbjct: 244 EEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL-DT 302

Query: 818 SAAIDRAMTQL--GVPSTVRGSFA-GTAQVFQETMNSQVILIIAAIATVYIVLGILYESY 874
+ AI + +L P ++ + T Q +++ V + AI V++V+ + ++
Sbjct: 303 AKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQNM 362

Query: 875 VHPLTILSTLPSAGVGALLALELFNAPFSLIALIGIMLLIGIVKKNAIMMVDFALEAQRH 934
L +P +G L F + + + G++L IG++ +AI++V+
Sbjct: 363 RATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMME 422

Query: 935 GNLTPQEAIFQACLLRFRPIMMTTLAALFGALPLVLSGGDGSELRQPLGITIVGGLVMSQ 994
L P+EA ++ ++ + +P+ GG + + ITIV + +S
Sbjct: 423 DKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALSV 482

Query: 995 LLTLYTTPVVYLFFDRLRLRFSRKPK 1020
L+ L TP + + + K
Sbjct: 483 LVALILTPALCATLLKPVSAEHHENK 508


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0985  ACRIFLAVINRP  920    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 920 bits (2379), Expect = 0.0
Identities = 300/1036 (28%), Positives = 513/1036 (49%), Gaps = 29/1036 (2%)

Query: 13 SRLFIMRPVATTLLMVAILLAGIIGYRALPVSALPEVDYPTIQVVTLYPGASPDVMTSAV 72
+ FI RP+ +L + +++AG + LPV+ P + P + V YPGA + V
Sbjct: 2 ANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTV 61

Query: 73 TAPLERQFGQMSGLKQMSSQS-SGGASVITLQFQLTLPLDVAEQEVQAAINAATNLLPSD 131
T +E+ + L MSS S S G+ ITL FQ D+A+ +VQ + AT LLP +
Sbjct: 62 TQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQE 121

Query: 132 LPNPPVYSKVNPADPPIMTLAVTSTAMPMTQVE--DMVETRVAQKISQISGVGLVTLSGG 189
+ + S + +M S TQ + D V + V +S+++GVG V L G
Sbjct: 122 VQQQGI-SVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 190 QRPAVRVKLNAQAIAALGLTSETVRTAITGANVNSAKGSLDGP------SRAVTLSANDQ 243
Q A+R+ L+A + LT V + N A G L G ++ A +
Sbjct: 181 QY-AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 244 MQSAEEYRQLII-AYQNGAPIRLGDVATVEQGAENSWLGAWANKEQAIVMNVQRQPGANI 302
++ EE+ ++ + +G+ +RL DVA VE G EN + A N + A + ++ GAN
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANA 299

Query: 303 ISTADSIRQMLPQLTESLPKSVKVTVLSDRTTNIRASVDDTQFELMMAIALVVMIIYLFL 362
+ TA +I+ L +L P+ +KV D T ++ S+ + L AI LV +++YLFL
Sbjct: 300 LDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFL 359

Query: 363 RNIPATIIPGVAVPLSLIGTFAVMVFLDFSINNLTLMALTIATGFVVDDAIVVIENISRY 422
+N+ AT+IP +AVP+ L+GTFA++ +SIN LT+ + +A G +VDDAIVV+EN+ R
Sbjct: 360 QNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERV 419

Query: 423 I-EKGEKPLAAALKGAGEIGFTIISLTFSLIAVLIPLLFMGDIVGRLFREFAITLAVAIL 481
+ E P A K +I ++ + L AV IP+ F G G ++R+F+IT+ A+
Sbjct: 420 MMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMA 479

Query: 482 ISAVVSLTLTPMMCARML---SQESLRKQNRFSRASEKMFDRIIAAYGRGLAKVLNHPWL 538
+S +V+L LTP +CA +L S E + F FD + Y + K+L
Sbjct: 480 LSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 539 TLSVALSTLLLSVLLWVFIPKGFFPVQDNGIIQGTLQAPQSSSFANMAQRQRQVADVILQ 598
L + + V+L++ +P F P +D G+ +Q P ++ + QV D L+
Sbjct: 540 YLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLK 599

Query: 599 DPA--VQSLTSFVGVDGTNPSLNSARLQINLKPLDERDDR---VQKVIARLQTAVDKVPG 653
+ V+S+ + G + + N+ ++LKP +ER+ + VI R + + K+
Sbjct: 600 NEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIR- 658

Query: 654 VDLFLQPTQDLTIDTQVSRTQYQFTLQ---ATSLDALSTWVPQLMEKLQQLP-QLSDVSS 709
D F+ P I + T + F L DAL+ QL+ Q P L V
Sbjct: 659 -DGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRP 717

Query: 710 DWQDKGLVAYVNVDRDSASRLGISMADVDNALYNAFGQRLISTIYTQANQYRVVLEHNTE 769
+ + + VD++ A LG+S++D++ + A G ++ + ++ ++ + +
Sbjct: 718 NGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAK 777

Query: 770 NTPGLAALDTIRLTSSDGGVVPLSSIAKIEQRFAPLSINHLDQFPVTTISFNVPDNYSLG 829
+D + + S++G +VP S+ + + + P I S G
Sbjct: 778 FRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSG 837

Query: 830 DAVQAIMDTEKTLNLPVDITTQFQGSTLAFQSALGSTVWLIVAAVVAMYIVLGILYESFI 889
DA A+M+ + LP I + G + + + L+ + V +++ L LYES+
Sbjct: 838 DA-MALMENLAS-KLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWS 895

Query: 890 HPITILSTLPTAGVGALLALMIAGSELDVIAIIGIILLIGIVKKNAIMMIDFALAAEREQ 949
P++++ +P VG LLA + + DV ++G++ IG+ KNAI++++FA ++
Sbjct: 896 IPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKE 955

Query: 950 GMSPREAIYQACLLRFRPILMTTLAALLGALPLMLSTGVGAELRRPLGIGMVGGLIVSQV 1009
G EA A +R RPILMT+LA +LG LPL +S G G+ + +GIG++GG++ + +
Sbjct: 956 GKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATL 1015

Query: 1010 LTLFTTPVIYLLFDRL 1025
L +F PV +++ R
Sbjct: 1016 LAIFFVPVFFVVIRRC 1031


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits       Score  E-value  Comments
EcSMS35_0986  RTXTOXIND  41     6e-06    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 41.4 bits (97), Expect = 6e-06
Identities = 32/167 (19%), Positives = 66/167 (39%), Gaps = 11/167 (6%)

Query: 126 ALAQALGQLAKDKATLANARRDLARYQQLVKTNLVSRQELDAQQALVSETEGTIKADEAS 185
+A+ +L K+ L ++ ++ +L + L + T +
Sbjct: 260 KYVEAVNELRVYKSQLEQIESEILSAKE----EYQLVTQLFKNEILDKLRQTTDNIGLLT 315

Query: 186 --VASAQLQLDWSRITAPVDGRV-GLKQVDVGNQISSGDTTGIVVITQTHPIDLVFTLPE 242
+A + + S I APV +V LK G +++ +T +V++ + +++ +
Sbjct: 316 LELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETL-MVIVPEDDTLEVTALVQN 374

Query: 243 SDIATVVQAQKAGKPLVVEAWDRTNSKKL-SEGTLLSLDNQIDATTG 288
DI + Q A + VEA+ T L + ++LD D G
Sbjct: 375 KDIGFINVGQNA--IIKVEAFPYTRYGYLVGKVKNINLDAIEDQRLG 419



Score = 41.4 bits (97), Expect = 6e-06
Identities = 23/144 (15%), Positives = 52/144 (36%), Gaps = 18/144 (12%)

Query: 59 LAPVQAATA-VEQAVPRYLTGLGTITAA-NTVTVRSRVDGQLMALHFQEGQQVKAGDLLA 116
+A + + VE G +T + + ++ + + + +EG+ V+ GD+L
Sbjct: 70 IAFILSVLGQVEIVAT----ANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLL 125

Query: 117 EIDPSQFKVALAQALGQLAKDKATLANARRDLARYQQLVKTNLVSRQELDAQQALVSETE 176
++ + + +++L AR + RYQ L ++ EL+ L E
Sbjct: 126 KLTALGAEADTLKT-------QSSLLQARLEQTRYQILSRS-----IELNKLPELKLPDE 173

Query: 177 GTIKADEASVASAQLQLDWSRITA 200
+ L + +
Sbjct: 174 PYFQNVSEEEVLRLTSLIKEQFST 197


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_0991  SHAPEPROTEIN  50     8e-09    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 50.1 bits (120), Expect = 8e-09
Identities = 32/129 (24%), Positives = 57/129 (44%), Gaps = 20/129 (15%)

Query: 132 AMMLH-IRQQAQAQLPEAITQAVIGRPINFQGLGGDEANAQAQGILERAAKRAGFKDVVF 190
M+ H I+Q + ++ P+ + E A + +A+ AG ++V
Sbjct: 89 KMLQHFIKQVHSNSFMRPSPRVLVCVPVGATQV---ERRA-----IRESAQGAGAREVFL 140

Query: 191 QYEPVAAGLDYEATLQEEKRVLVVDIGGGTTDCSLLLMGPQWRSRLDREASLLGHSGCRI 250
EP+AA + + E +VVDIGGGTT+ +++ + ++ S RI
Sbjct: 141 IEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVAVISLN-----------GVVYSSSVRI 189

Query: 251 GGNDLDIAL 259
GG+ D A+
Sbjct: 190 GGDRFDEAI 198



Score = 35.9 bits (83), Expect = 2e-04
Identities = 32/137 (23%), Positives = 56/137 (40%), Gaps = 23/137 (16%)

Query: 332 RLSYRLV---RSAEESKIALSSV--AETRASLPFISDELAT------LISQQGLENALSQ 380
R +Y + +AE K + S + + LA ++ + AL +
Sbjct: 203 RRNYGSLIGEATAERIKHEIGSAYPGDEVREIEVRGRNLAEGVPRGFTLNSNEILEALQE 262

Query: 381 PLARILEQVQLALDNAQEKPDV--------IYLTGGSARSPLIKKALAEQLPGIPIAGGD 432
PL I+ V +AL+ Q P++ + LTGG A + + L E+ GIP+ +
Sbjct: 263 PLTGIVSAVMVALE--QCPPELASDISERGMVLTGGGALLRNLDRLLMEET-GIPVVVAE 319

Query: 433 D-FGSVTAGLARWAEVV 448
D V G + E++
Sbjct: 320 DPLTCVARGGGKALEMI 336


88  EcSMS35_1216  EcSMS35_1224  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
EcSMS35_1216  1       31       -8.289157  transcriptional regulatory protein YedW
EcSMS35_1217  0       28       -7.327861  heavy metal sensor histidine kinase
EcSMS35_1218  -1      19       -3.366831  chaperone protein HchA
EcSMS35_1219  -3      11       -1.976192  hypothetical protein
EcSMS35_1220  -2      13       -0.884140  hypothetical protein
EcSMS35_1221  -1      12       -0.241780  porin family protein
EcSMS35_1222  2       18       2.196617   hypothetical protein
EcSMS35_1223  2       18       2.162325   hypothetical protein
EcSMS35_1224  0       17       1.350713   DNA cytosine methylase
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits    Score  E-value  Comments
EcSMS35_1216  HTHFIS  84     1e-20    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 84.1 bits (208), Expect = 1e-20
Identities = 30/117 (25%), Positives = 60/117 (51%), Gaps = 1/117 (0%)

Query: 39 KILLIEDNQRTQEWVTQGLSEAGYVIDAVSDGRDGLYLALKDDYALIILDIMLPGMDGWQ 98
IL+ +D+ + + Q LS AGY + S+ D L++ D+++P + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 99 ILQTLRTA-KQTPVICLTARDSVDDRVRGLDSGANDYLVKPFSFSELLARVRAQLRQ 154
+L ++ A PV+ ++A+++ ++ + GA DYL KPF +EL+ + L +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
EcSMS35_1217  PF06580  33     0.002    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 33.3 bits (76), Expect = 0.002
Identities = 37/181 (20%), Positives = 61/181 (33%), Gaps = 37/181 (20%)

Query: 290 ENILFLARADKNNVLVKLDSLS----------------LNKEVENLLDYL--EYLSDEKE 331
NI L D L SLS L E+ + YL + E
Sbjct: 180 NNIRALILEDPTKAREMLTSLSELMRYSLRYSNARQVSLADELTVVDSYLQLASIQFEDR 239

Query: 332 ICFKVECNQQIFADKI---LLQRMLSNLIVNAIRYSPEKSHIHITSFLDTNGYLNIDVAS 388
+ F+ + N I ++ L+Q ++ N I + I P+ I + D NG + ++V +
Sbjct: 240 LQFENQINPAIMDVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKD-NGTVTLEVEN 298

Query: 389 PGTKIHEPEKLFRRFWRGDNSRHSVGQGLGLSLVKA-IAELHGGSATYHYLNKHNVFRIT 447
G+ + K G GL V+ + L+G A K
Sbjct: 299 TGSLALKNTKE--------------STGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAM 344

Query: 448 L 448
+
Sbjct: 345 V 345


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits        Score  E-value  Comments
EcSMS35_1221  ECOLIPORIN  515    0.0      E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 515 bits (1328), Expect = 0.0
Identities = 261/396 (65%), Positives = 309/396 (78%), Gaps = 22/396 (5%)

Query: 1 MKRKVLAMLVPALLVAGAANAAEIYNKNGNKLDLYGKVDARHTFSDNPGDDGDETIIELG 60
MKRKVLA+++PALL AGAA+AAEIYNK+GNKLDLYGKVD H FSD+ DGD+T + +G
Sbjct: 1 MKRKVLALVIPALLAAGAAHAAEIYNKDGNKLDLYGKVDGLHYFSDDSSKDGDQTYMRVG 60

Query: 61 FKGETQITDQLTGYGQALTKTKASDTEG-SDNTYVKLAFAGLKFGEMGSFDYGRNYGVIY 119
FKGETQI DQLTGYGQ +A+ TEG N++ +LAFAGLKFG+ GSFDYGRNYGV+Y
Sbjct: 61 FKGETQINDQLTGYGQWEYNVQANTTEGEGANSWTRLAFAGLKFGDYGSFDYGRNYGVLY 120

Query: 120 DVEAWTDMLPVFGGDSYTWTDNFMAGRANGVATYRNSDFFGLVEGLNFALQYQGNNEGSN 179
DVE WTDMLP FGGDSYT+ DN+M GRANGVATYRN+DFFGLV+GLNFALQYQG NE +
Sbjct: 121 DVEGWTDMLPEFGGDSYTYADNYMTGRANGVATYRNTDFFGLVDGLNFALQYQGKNESQS 180

Query: 180 AGEDQEGT--KNGHEDVRFQNGDGFGLSTSYDFDFGLSLGAAYSNSDRTNSQVALGGYHY 237
A + GT +N +D+R+ NGDGFG+ST+YD G S GAAY+ SDRTN QV GG
Sbjct: 181 ADDVNIGTNNRNNGDDIRYDNGDGFGISTTYDIGMGFSAGAAYTTSDRTNEQVNAGGT-- 238

Query: 238 NEYSKFAGGDTAEAWTFGAKYDANNVYLAMMYAETRNMTPYG------NVGIANKTQNFE 291
AGGD A+AWT G KYDANN+YLA MY+ETRNMTPYG + G+ANKTQNFE
Sbjct: 239 -----IAGGDKADAWTAGLKYDANNIYLATMYSETRNMTPYGKTDKGYDGGVANKTQNFE 293

Query: 292 AVAQYQFDFGLRPSLAYVYSKGKDLGGNDYNNNGHQEYVDQDLVNYVEIGATYYFNKNFS 351
AQYQFDFGLRP+++++ SKGKDL N+ N D+DLV Y ++GATYYFNKNFS
Sbjct: 294 VTAQYQFDFGLRPAVSFLMSKGKDLTYNNVN------GDDKDLVKYADVGATYYFNKNFS 347

Query: 352 TYVDYKINLLDKDDDFYDNNGIATDDVVGVGLVYQF 387
TYVDYKINLLD DD FY + GI+TDD+V +G+VYQF
Sbjct: 348 TYVDYKINLLDDDDPFYKDAGISTDDIVALGMVYQF 383


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1223  CARBMTKINASE  34     2e-04    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 34.4 bits (79), Expect = 2e-04
Identities = 22/92 (23%), Positives = 36/92 (39%), Gaps = 9/92 (9%)

Query: 37 AQKLAADDDVDMLVILTACYFHDIVSLAKNHPQRQRSSILAAEETRRLLREEFEQFPA-- 94
+KLA + + D+ +ILT + +L + Q + EE R+ E F A
Sbjct: 219 GEKLAEEVNADIFMILTDV---NGAALYYGTEKEQWLREVKVEELRKYYEE--GHFKAGS 273

Query: 95 --EKIEAVCHAIAAHSFSAQIAPLTTEAKIVQ 124
K+ A I A IA L + ++
Sbjct: 274 MGPKVLAAIRFIEWGGERAIIAHLEKAVEALE 305


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
EcSMS35_1224  PF05272  29     0.047    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.047
Identities = 20/62 (32%), Positives = 29/62 (46%), Gaps = 15/62 (24%)

Query: 320 AKYILTPVLWKYLYRYAKKHQARGNGFGYGMVYPNNPQSVTRTLSARYYKDGAEILIDRG 379
A+Y + PVLW Y+ R+ K + G+ VY +R +DG+E RG
Sbjct: 166 ARYQVGPVLWGYVVRFIK---SDGDKLTLPYVY------------SRSQRDGSEAWKWRG 210

Query: 380 WD 381
WD
Sbjct: 211 WD 212


89  EcSMS35_1234  EcSMS35_1258  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
EcSMS35_1234  0       16       0.677147   flagellar biosynthesis protein FliR
EcSMS35_1235  -2      21       2.041511   flagellar biosynthesis protein FliQ
EcSMS35_1236  -1      17       2.544780   flagellar biosynthesis protein FliP
EcSMS35_1237  -1      16       2.567206   flagellar biosynthesis protein FliO
EcSMS35_1238  -2      17       3.480682   flagellar motor switch protein FliN
EcSMS35_1239  0       17       3.871330   flagellar motor switch protein FliM
EcSMS35_1240  2       17       4.322496   flagellar basal body-associated protein FliL
EcSMS35_1241  1       15       4.221026   flagellar hook-length control protein
EcSMS35_1242  1       15       4.292560   flagellar biosynthesis chaperone
EcSMS35_1243  -1      13       0.642673   flagellum-specific ATP synthase
EcSMS35_1244  0       16       -1.289285  flagellar assembly protein H
EcSMS35_1245  -1      16       -1.954130  flagellar motor switch protein G
EcSMS35_1246  -1      20       -3.411767  flagellar MS-ring protein
EcSMS35_1247  2       19       -4.035245  flagellar hook-basal body protein FliE
EcSMS35_1248  1       17       -3.648532  hypothetical protein
EcSMS35_1249  0       14       0.326479   acetyltransferase
EcSMS35_1250  0       14       0.830678   hypothetical protein
EcSMS35_1251  -1      14       0.490374   hypothetical protein
EcSMS35_1252  -2      17       0.647037   putative inner membrane protein
EcSMS35_1253  -2      10       -1.051504  hypothetical protein
EcSMS35_1254  -2      10       -1.451430  cytoplasmic alpha-amylase
EcSMS35_1255  1       12       -0.758961  flagellar biosynthesis protein FliT
EcSMS35_1256  1       11       -1.261364  flagellar protein FliS
EcSMS35_1257  0       10       -1.220984  flagellar capping protein
EcSMS35_1258  -2      14       -1.010885  flagellin
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1234  TYPE3IMRPROT  201    1e-66    Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 201 bits (514), Expect = 1e-66
Identities = 257/261 (98%), Positives = 261/261 (100%)

Query: 1 MMQVTSDQWLSWLSLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA 60
M+QVTS+QWLSWL+LYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA
Sbjct: 1 MLQVTSEQWLSWLNLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA 60

Query: 61 NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHL 120
NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHL
Sbjct: 61 NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHL 120

Query: 121 NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIF 180
NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIF
Sbjct: 121 NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIF 180

Query: 181 LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC 240
LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC
Sbjct: 181 LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC 240

Query: 241 EHLFSEMFNLLADIISELPLI 261
EHLFSE+FNLLADIISELPLI
Sbjct: 241 EHLFSEIFNLLADIISELPLI 261


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1235  TYPE3IMQPROT  67     1e-18    Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.

Length = 86

Score = 67.1 bits (164), Expect = 1e-18
Identities = 22/78 (28%), Positives = 42/78 (53%)

Query: 4 ESVMMMGTEAMKVALALAAPLLLVALVTGLIISILQAATQINEMTLSFIPKIIAVFIAII 63
+ ++ G +A+ + L L+ +VA + GL++ + Q TQ+ E TL F K++ V + +
Sbjct: 2 DDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLF 61

Query: 64 IAGPWMLNLLLDYVRTLF 81
+ W +LL Y R +
Sbjct: 62 LLSGWYGEVLLSYGRQVI 79


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1236  FLGBIOSNFLIP  334    e-119    Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP signature.

Length = 245

Score = 334 bits (858), Expect = e-119
Identities = 245/245 (100%), Positives = 245/245 (100%)

Query: 1 MRRLLSVAPVLLWLITPLAFAQLPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILLM 60
MRRLLSVAPVLLWLITPLAFAQLPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILLM
Sbjct: 1 MRRLLSVAPVLLWLITPLAFAQLPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILLM 60

Query: 61 MTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEK 120
MTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEK
Sbjct: 61 MTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEK 120

Query: 121 ISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELK 180
ISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELK
Sbjct: 121 ISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELK 180

Query: 181 TAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLA 240
TAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLA
Sbjct: 181 TAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLA 240

Query: 241 QSFYS 245
QSFYS
Sbjct: 241 QSFYS 245


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1238  FLGMOTORFLIN  211    3e-74    Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 211 bits (539), Expect = 3e-74
Identities = 125/137 (91%), Positives = 133/137 (97%)

Query: 1 MSDMNNPADDNNGAMDDLWAEALSEQKSTNSKSAADAVFQQFGGGDVSGTLQDIDLIMDI 60
MSDMNNP+D+N GA+DDLWA+AL+EQK+T +KSAADAVFQQ GGGDVSG +QDIDLIMDI
Sbjct: 1 MSDMNNPSDENTGALDDLWADALNEQKATTTKSAADAVFQQLGGGDVSGAMQDIDLIMDI 60

Query: 61 PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV 120
PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV
Sbjct: 61 PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV 120

Query: 121 RITDIITPSERMRRLSR 137
RITDIITPSERMRRLSR
Sbjct: 121 RITDIITPSERMRRLSR 137


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1239  FLGMOTORFLIM  381    e-135    Flagellar motor switch protein FliM signature.
		>FLGMOTORFLIM#Flagellar motor switch protein FliM signature.

Length = 344

Score = 381 bits (979), Expect = e-135
Identities = 85/324 (26%), Positives = 148/324 (45%), Gaps = 10/324 (3%)

Query: 5 ILSQAEIDALLNGDS--EVKDEPTASVSGESDIRPYDPNTQRRVVRERLQALEIINERFA 62
+LSQ EID LL S + E +S I YD + +E+++ L +++E FA
Sbjct: 4 VLSQDEIDQLLTAISSGDASIEDARPISDTRKITLYDFRRPDKFSKEQMRTLSLMHETFA 63

Query: 63 RHFRMGLFNLLRRSPDITVGAIRIQPYHEFARNLPVPTNLNLIHLKPLRGTGLVVFSPSL 122
R L LR + V ++ Y EF R++P P+ L +I + PL+G ++ PS+
Sbjct: 64 RLTTTSLSAQLRSMVHVHVASVDQLTYEEFIRSIPTPSTLAVITMDPLKGNAVLEVDPSI 123

Query: 123 VFIAVDNLFGGDGRFPTKVEGREFTHTEQRVINRMLKLALEGYSDAWKAINPLEVEYVRS 182
F +D LFGG G+ KV+ R+ T E V+ ++ L ++W + L +
Sbjct: 124 TFSIIDRLFGGTGQ-AAKVQ-RDLTDIENSVMEGVIVRILANVRESWTQVIDLRPRLGQI 181

Query: 183 EMQVKFTNITTSPNDIVVNTPFHVEIGNLTGEFNICLPFSMIEPLRELLVNPPLENS--R 240
E +F I P+++VV ++G G N C+P+ IEP+ L + +S R
Sbjct: 182 ETNPQFAQI-VPPSEMVVLVTLETKVGEEEGMMNFCIPYITIEPIISKLSSQFWFSSVRR 240

Query: 241 NEDQNWRDNLVRQVQHSQLELVANFADISLRLSQILKLKPGDVLPIEKP---DRIIAHVD 297
+ + L ++ +++VA + L + IL L+ GD++ + D + +
Sbjct: 241 SSTTQYMGVLRDKLSTVDMDVVAEVGSLRLSVRDILGLRVGDIIRLHDTHVGDPFVLSIG 300

Query: 298 GVPVLTSQYGTLNGQYALRIEHLI 321
Q G + + A +I I
Sbjct: 301 NRKKFLCQPGVVGKKIAAQILERI 324


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits         Score  E-value  Comments
EcSMS35_1241  FLGHOOKFLIK  468    e-168    Flagellar hook-length control protein signature.
		>FLGHOOKFLIK#Flagellar hook-length control protein signature.

Length = 375

Score = 468 bits (1204), Expect = e-168
Identities = 365/375 (97%), Positives = 370/375 (98%)

Query: 1 MIRLAPLITADVDTTTLPGGKASDAAQDFLALLSEALAGETTTDKAAPQLLVATDKPTTK 60
MIRLAPLITADVDTTTLPGGKASDAAQDFLALLSEALAGETTTDKAAPQLLVATDKPTTK
Sbjct: 1 MIRLAPLITADVDTTTLPGGKASDAAQDFLALLSEALAGETTTDKAAPQLLVATDKPTTK 60

Query: 61 GEPLVSEILADAQQADLLIPVDETPPVINDEQSTSTPLTTAQTMTLAAVAGNNTAKDEKA 120
GEPL+S+I++DAQQA+LLIPVDETPPVINDEQSTSTPLTTAQTM LAAVA NT KDEKA
Sbjct: 61 GEPLISDIVSDAQQANLLIPVDETPPVINDEQSTSTPLTTAQTMALAAVADKNTTKDEKA 120

Query: 121 DDLNEDVTASLSALFAMLPGFDNTPKVTDAPSTVLPTEKPTLFTKLTSEQLTTAQPDDAP 180
DDLNEDVTASLSALFAMLPGFDNTPKVTDAPSTVLPTEKPTLFTKLTSEQLTTAQPDDAP
Sbjct: 121 DDLNEDVTASLSALFAMLPGFDNTPKVTDAPSTVLPTEKPTLFTKLTSEQLTTAQPDDAP 180

Query: 181 GTPAQPLTPLVAEAQSKAEVISTPSPVTAAASPLITPHQTQPLPTVAAPVLSAPLGSHEW 240
GTPAQPLTPLVAEAQSKAEVISTPSPVTAAASPLITPHQTQPLPTVAAPVLSAPLGSHEW
Sbjct: 181 GTPAQPLTPLVAEAQSKAEVISTPSPVTAAASPLITPHQTQPLPTVAAPVLSAPLGSHEW 240

Query: 241 QQSLSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQMVSPHQHVRAALEAA 300
QQSLSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQMVSPHQHVRAALEAA
Sbjct: 241 QQSLSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQMVSPHQHVRAALEAA 300

Query: 301 LPVLRTQLAESGIQLGQSNISGESFSGQQQAASQQQQSQRTVNHEPLAGEDDDTLPVPVS 360
LPVLRTQLAESGIQLGQSNISGESFSGQQQAASQQQQSQRT NHEPLAGEDDDTLPVPVS
Sbjct: 301 LPVLRTQLAESGIQLGQSNISGESFSGQQQAASQQQQSQRTANHEPLAGEDDDTLPVPVS 360

Query: 361 LQGRVTGNSGVDIFA 375
LQGRVTGNSGVDIFA
Sbjct: 361 LQGRVTGNSGVDIFA 375


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
EcSMS35_1242  FLGFLIJ  202    2e-70    Flagellar FliJ protein signature.
		>FLGFLIJ#Flagellar FliJ protein signature.

Length = 147

Score = 202 bits (515), Expect = 2e-70
Identities = 146/147 (99%), Positives = 147/147 (100%)

Query: 1 MAEHGALATLKDLAEKEVEDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRNNLNSDMSAG 60
MAEHGALATLKDLAEKEVEDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRNNLNSDMSAG
Sbjct: 1 MAEHGALATLKDLAEKEVEDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRNNLNSDMSAG 60

Query: 61 MTSNRWINYQQFIQTLEKAITQHRQQLNQWTQKVDIALNSWREKKQRLQAWQTLQERQST 120
+TSNRWINYQQFIQTLEKAITQHRQQLNQWTQKVDIALNSWREKKQRLQAWQTLQERQST
Sbjct: 61 ITSNRWINYQQFIQTLEKAITQHRQQLNQWTQKVDIALNSWREKKQRLQAWQTLQERQST 120

Query: 121 AALLAENRLDQKKMDEFAQRAAMRKPE 147
AALLAENRLDQKKMDEFAQRAAMRKPE
Sbjct: 121 AALLAENRLDQKKMDEFAQRAAMRKPE 147


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
EcSMS35_1244  FLGFLIH  374    e-136    Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 374 bits (961), Expect = e-136
Identities = 225/228 (98%), Positives = 228/228 (100%)

Query: 1 MSDNLPWKTWTPDDLAPPQAEFVPMVEPEETIIEEAEPSLEQQLAQLQMQAHEQGYQAGI 60
MSDNLPWKTWTPDDLAPPQAEFVP+VEPEETIIEEAEPSLEQQLAQLQMQAHEQGYQAGI
Sbjct: 1 MSDNLPWKTWTPDDLAPPQAEFVPIVEPEETIIEEAEPSLEQQLAQLQMQAHEQGYQAGI 60

Query: 61 AEGRQQGHEQGYQEGLARGLEQGLAEAKSQQAPIHARMQQLVSEFQTTLDALDSVIASRL 120
AEGRQQGH+QGYQEGLA+GLEQGLAEAKSQQAPIHARMQQLVSEFQTTLDALDSVIASRL
Sbjct: 61 AEGRQQGHKQGYQEGLAQGLEQGLAEAKSQQAPIHARMQQLVSEFQTTLDALDSVIASRL 120

Query: 121 MQMALEAARQVIGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGAT 180
MQMALEAARQVIGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGAT
Sbjct: 121 MQMALEAARQVIGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGAT 180

Query: 181 LSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATRWQELCRLAAPGVV 228
LSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATRWQELCRLAAPGVV
Sbjct: 181 LSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATRWQELCRLAAPGVV 228


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1245  FLGMOTORFLIG  341    e-119    Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 341 bits (876), Expect = e-119
Identities = 117/329 (35%), Positives = 197/329 (59%), Gaps = 2/329 (0%)

Query: 1 MSNLTGTDKSVILLMTIGEDRAAEVFKHLSQREVQTLSAAMANVTQISNKQLTDVLAEFE 60
+S LTG K+ ILL++IG + +++VFK+LSQ E+++L+ +A + I+++ +VL EF+
Sbjct: 12 VSALTGKQKAAILLVSIGSEISSKVFKYLSQEEIESLTFEIAKLETITSELKDNVLLEFK 71

Query: 61 QEAEQFAALNINANDYLRSVLVKALGEERAASLLEDILETRDTASGIETLNFMEPQSAAD 120
+ + DY R +L K+LG ++A ++ + L + + E + +P + +
Sbjct: 72 ELMMAQEFIQKGGIDYARELLEKSLGTQKAVDIINN-LGSALQSRPFEFVRRADPANILN 130

Query: 121 LIRDEHPQIIATILVHLKRAQAADILALFDERLRHDVMLRIATFGGVQPAALAELTEVLN 180
I+ EHPQ IA IL +L +A+ IL+ ++ +V RIA P + E+ VL
Sbjct: 131 FIQQEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEVVREVERVLE 190

Query: 181 GLLDGQ-NLKRSKMGGVRTAAEIINLMKTQQEEAVITAVREFDGELAQKIIDEMFLFENL 239
L + + GGV EIIN+ + E+ +I ++ E D ELA++I +MF+FE++
Sbjct: 191 KKLASLSSEDYTSAGGVDNVVEIINMADRKTEKFIIESLEEEDPELAEEIKKKMFVFEDI 250

Query: 240 VDVDDRSIQRLLQEVDSESLLIALKGAEQPLREKFLRNMSQRAADILRDDLANRGPVRLS 299
V +DDRSIQR+L+E+D + L ALK + P++EK +NMS+RAA +L++D+ GP R
Sbjct: 251 VLLDDRSIQRVLREIDGQELAKALKSVDIPVQEKIFKNMSKRAASMLKEDMEFLGPTRRK 310

Query: 300 QVENEQKAILLIVRRLAETGEMVIGSGED 328
VE Q+ I+ ++R+L E GE+VI G +
Sbjct: 311 DVEESQQKIVSLIRKLEEQGEIVISRGGE 339


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1246  FLGMRINGFLIF  756    0.0      Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 756 bits (1953), Expect = 0.0
Identities = 479/555 (86%), Positives = 515/555 (92%), Gaps = 5/555 (0%)

Query: 3 ATAAQTKSLEWLNRLRANPKIPLIVAGSAAVAVMVALILWAKAPDYRTLFSNLSDQDGGA 62
+TA Q K LEWLNRLRANP+IPLIVAGSAAVA++VA++LWAK PDYRTLFSNLSDQDGGA
Sbjct: 5 STATQPKPLEWLNRLRANPRIPLIVAGSAAVAIVVAMVLWAKTPDYRTLFSNLSDQDGGA 64

Query: 63 IVSQLTQMNIPYRFSEASGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQ 122
IV+QLTQMNIPYRF+ SGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQ
Sbjct: 65 IVAQLTQMNIPYRFANGSGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQ 124

Query: 123 FSEQVNYQRALEGELSRTIETIGPVKGARVHLAMPKPSLFVREQKSPSASVTVNLLPGRA 182
FSEQVNYQRALEGEL+RTIET+GPVK ARVHLAMPKPSLFVREQKSPSASVTV L PGRA
Sbjct: 125 FSEQVNYQRALEGELARTIETLGPVKSARVHLAMPKPSLFVREQKSPSASVTVTLEPGRA 184

Query: 183 LDEGQISAIVHLVSSAVAGLPPGNVTLVDQGGHLLTQSNTSGRDLNDAQLKYASDVEGRI 242
LDEGQISA+VHLVSSAVAGLPPGNVTLVDQ GHLLTQSNTSGRDLNDAQLK+A+DVE RI
Sbjct: 185 LDEGQISAVVHLVSSAVAGLPPGNVTLVDQSGHLLTQSNTSGRDLNDAQLKFANDVESRI 244

Query: 243 QRRIEAILSPIVGNGNIHAQVTAQLDFASKEQTEEQYRPNGDESHAALRSRQLNESEQSG 302
QRRIEAILSPIVGNGN+HAQVTAQLDFA+KEQTEE Y PNGD S A LRSRQLN SEQ G
Sbjct: 245 QRRIEAILSPIVGNGNVHAQVTAQLDFANKEQTEEHYSPNGDASKATLRSRQLNISEQVG 304

Query: 303 SGYPGGVPGALSNQPAPANNAPISTPPANQNNRQQ--QASTTSNS---GPRSTQRNETSN 357
+GYPGGVPGALSNQPAP N API+TPP NQ N Q Q ST++NS GPRSTQRNETSN
Sbjct: 305 AGYPGGVPGALSNQPAPPNEAPIATPPTNQQNAQNTPQTSTSTNSNSAGPRSTQRNETSN 364

Query: 358 YEVDRTIRHTKMNVGDVQRLSVAVVVNYKTLPDGKPLPLSNEQMKQIEDLTREAMGFSEK 417
YEVDRTIRHTKMNVGD++RLSVAVVVNYKTL DGKPLPL+ +QMKQIEDLTREAMGFS+K
Sbjct: 365 YEVDRTIRHTKMNVGDIERLSVAVVVNYKTLADGKPLPLTADQMKQIEDLTREAMGFSDK 424

Query: 418 RGDSLNVVNSPFNSSDESGGELPFWQQQAFIDQLLAAGRWLLVLLVAWLLWRKAVRPQLT 477
RGD+LNVVNSPF++ D +GGELPFWQQQ+FIDQLLAAGRWLLVL+VAW+LWRKAVRPQLT
Sbjct: 425 RGDTLNVVNSPFSAVDNTGGELPFWQQQSFIDQLLAAGRWLLVLVVAWILWRKAVRPQLT 484

Query: 478 RRAEAMKAVQQQAQAREEVEDAVEVRLSKDEQLQQRRANQRLGAEVMSQRIREMSDNDPR 537
RR E KA Q+QAQ R+E E+AVEVRLSKDEQLQQRRANQRLGAEVMSQRIREMSDNDPR
Sbjct: 485 RRVEEAKAAQEQAQVRQETEEAVEVRLSKDEQLQQRRANQRLGAEVMSQRIREMSDNDPR 544

Query: 538 VVALVIRQWINNDHE 552
VVALVIRQW++NDHE
Sbjct: 545 VVALVIRQWMSNDHE 559


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1247  FLGHOOKFLIE   117    5e-38    Flagellar hook-basal body complex protein FliE signa...
		>FLGHOOKFLIE#Flagellar hook-basal body complex protein FliE signature.

Length = 103

Score = 117 bits (294), Expect = 5e-38
Identities = 103/103 (100%), Positives = 103/103 (100%)

Query: 2 SAIQGIEGVISQLQATAMSARAQESLPQPTISFAGQLHAALDRISDTQTAARTQAEKFTL 61
SAIQGIEGVISQLQATAMSARAQESLPQPTISFAGQLHAALDRISDTQTAARTQAEKFTL
Sbjct: 1 SAIQGIEGVISQLQATAMSARAQESLPQPTISFAGQLHAALDRISDTQTAARTQAEKFTL 60

Query: 62 GEPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV 104
GEPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV
Sbjct: 61 GEPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV 103


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1249  SACTRNSFRASE  32     4e-04    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 32.2 bits (73), Expect = 4e-04
Identities = 17/54 (31%), Positives = 27/54 (50%), Gaps = 2/54 (3%)

Query: 80 APNYLRRGVASLILRHILQVAHDRCLHRLSLETGTQAGFTACHQLYLKHGFVDC 133
A +Y ++GV + +L ++ A + L LET +ACH Y KH F+
Sbjct: 98 AKDYRKKGVGTALLHKAIEWAKENHFCGLMLET-QDINISACH-FYAKHHFIIG 149


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1251  PF01206       93     6e-29    SirA family protein
		>PF01206#SirA family protein

Length = 76

Score = 92.5 bits (230), Expect = 6e-29
Identities = 16/71 (22%), Positives = 37/71 (52%)

Query: 7 DYRLDMVGEPCPYPAVATLEAMPQLKKGEILEVVSDCPQSINNIPLDARNHGYTVLDIQQ 66
D LD G CP P + + + + GE+L V++ P S+ + ++ G+ +L+ ++
Sbjct: 5 DQSLDATGLNCPLPILKAKKTLATMNAGEVLYVMATDPGSVKDFESFSKQTGHELLEQKE 64

Query: 67 DGPTIRYLIQK 77
+ T + +++
Sbjct: 65 EDGTYHFRLKR 75


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1252  RTXTOXIND     30     0.017    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 30.2 bits (68), Expect = 0.017
Identities = 10/57 (17%), Positives = 17/57 (29%), Gaps = 2/57 (3%)

Query: 164 RFTLLPIFRIPVKMQKVSAASPLTQKPDQARRRF--RLGMLVFFGMLGWALLTAMNQ 218
R L R + + + A L + P R R M ++L +
Sbjct: 26 RKQLDTPVREKDENEFLPAHLELIETPVSRRPRLVAYFIMGFLVIAFILSVLGQVEI 82


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1257  TYPE3OMBPROT  33     0.003    Type III secretion system outer membrane B protein ...
		>TYPE3OMBPROT#Type III secretion system outer membrane B protein family signature.

Length = 538

Score = 32.7 bits (74), Expect = 0.003
Identities = 27/95 (28%), Positives = 43/95 (45%), Gaps = 2/95 (2%)

Query: 214 NGMEVSVAAQNAQLTVNNVAIENSSNTISDALENITLNLNDVTTGNQTLTITQDTSKAQT 273
N E +VAA+N + + A+ + +S AL T++L V+T LT T T ++
Sbjct: 236 NSSERAVAARNKAEELVSAALYSRPELLSQALSGKTVDLKIVSTS--LLTPTSLTGGEES 293

Query: 274 AIKDWVNAYNSLIDTFSSLTKYTAVDAGADSQSSS 308
+KD VNA L TK ++ + S
Sbjct: 294 MLKDQVNALKGLNSKRGEPTKLLIRNSDGLLKEVS 328


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1258  FLAGELLIN     209    3e-63    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 209 bits (533), Expect = 3e-63
Identities = 238/543 (43%), Positives = 294/543 (54%), Gaps = 36/543 (6%)

Query: 2 AQVINTNSLSLITQNNINKNQSALSSSIERLSSGLRINSAKDDAAGQAIANRFTSNIKGL 61
AQVINTNSLSL+TQNN+NK+QS+LSS+IERLSSGLRINSAKDDAAGQAIANRFTSNIKGL
Sbjct: 1 AQVINTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGL 60

Query: 62 TQAARNANDGISVAQTTEGALSEINNNLQRIRELTVQASTGTNSDSDLDSIQDEIKSRLD 121
TQA+RNANDGIS+AQTTEGAL+EINNNLQR+REL+VQA+ GTNSDSDL SIQDEI+ RL+
Sbjct: 61 TQASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLE 120

Query: 122 EIDRVSGQTQFNGVNVLAKDGSMKIQVGANDGQTITIDLKKIDSDTLGLSGFNVNGSADK 181
EIDRVS QTQFNGV VL++D MKIQVGANDG+TITIDL+KID +LGL GFNVNG +
Sbjct: 121 EIDRVSNQTQFNGVKVLSQDNQMKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPKEA 180

Query: 182 ASVAATADGMVKDGYIKGLTSSDGSTAYTKTTANTAAKGSDILAALKTGDKITATGANSL 241
+ S T Y D+ + D T + +
Sbjct: 181 TVGDLKS-------------SFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKV 227

Query: 242 ADNATSTTYTYNATSNTFSYTADGVNQTNAAANLIPAAGKTTAASVTIGGTAQNVNIDDS 301
NA + T + N + ++ A A
Sbjct: 228 YVNAANGQLTTDDAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTI 287

Query: 302 GNITSSDGDQLYLDSTGNLTKNQAGNPNKATVSGLLGNTDAKGTAVKTTIKTEAGVTVTA 361
T +DG+ + + + + T V A
Sbjct: 288 DTKTGNDGNG-----------------------KVSTTINGEKVTLTVADITAGAANVDA 324

Query: 362 EGNTGTVKIEGATVSASAFTGIAYSANTGGNTYAVAANNTTNGFLAGDALTQDAQTVSTY 421
+ + + V+ + + A N + +
Sbjct: 325 ATLQSSKNVYTSVVNGQFTFDDKTKNESAKLSDLEANNAVKGESKITVNGAEYTANAAGD 384

Query: 422 YSQADGTVTNSAGKEIYKDADGVYSTENKTSKTSDPLAALDDAISSIDKFRSSLGAIQNR 481
G T++PLA++D A+S +D RSSLGAIQNR
Sbjct: 385 KVTLAGKTMFIDKTASGVSTLINEDAAAAKKSTANPLASIDSALSKVDAVRSSLGAIQNR 444

Query: 482 LDSAVTNLNNTTTNLSEAQSRIQDADYATEVSNMSKAQIIQQAGNSVLAKANQVPQQVLS 541
DSA+TNL NT TNL+ A+SRI+DADYATEVSNMSKAQI+QQAG SVLA+ANQVPQ VLS
Sbjct: 445 FDSAITNLGNTVTNLNSARSRIEDADYATEVSNMSKAQILQQAGTSVLAQANQVPQNVLS 504

Query: 542 LLQ 544
LL+
Sbjct: 505 LLR 507


90  EcSMS35_1294  EcSMS35_1302  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
EcSMS35_1294  0       12       1.441995   flagellar motor protein MotA
EcSMS35_1295  0       12       1.778538   flagellar motor protein MotB
EcSMS35_1296  0       11       1.824123   chemotaxis protein CheA
EcSMS35_1297  0       13       1.733027   purine-binding chemotaxis protein
EcSMS35_1298  1       13       1.893381   methyl-accepting chemotaxis protein II
EcSMS35_1299  0       17       0.445335   methyl-accepting protein IV
EcSMS35_1300  -1      17       -2.179506  chemotaxis methyltransferase CheR
EcSMS35_1301  0       19       -3.255600  chemotaxis-specific methylesterase
EcSMS35_1302  0       18       -2.856176  chemotaxis regulatory protein CheY
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1294  PF05844       33     0.001    YopD protein
		>PF05844#YopD protein

Length = 295

Score = 33.1 bits (75), Expect = 0.001
Identities = 12/28 (42%), Positives = 22/28 (78%), Gaps = 2/28 (7%)

Query: 76 MDLLALLYRLMAKSRQMGMFSLERDIEN 103
++LL +L+R+ K+R++G+ L+RD EN
Sbjct: 74 VELLLILFRIAQKARELGV--LQRDNEN 99


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1295  PF05272       30     0.010    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.4 bits (68), Expect = 0.010
Identities = 22/93 (23%), Positives = 35/93 (37%), Gaps = 11/93 (11%)

Query: 46 LISISSPKELIQIAEYFRTPLATAVTGGDRISNSESPIPGGGDDYTQSQGEVNKQPNIEE 105
L +SSP A P + G + ++ PGGGDD GE +++
Sbjct: 384 LADVSSPTAAAGGAGGGEPPKKRDPSAG---AGTDPGGPGGGDD-----GEDPFGEWLDD 435

Query: 106 LKKRM---EQSRLRKLRGDLDQLIESDPKLRAL 135
R+ + L+ R L + + S P L
Sbjct: 436 EVARLRLRGRWLLKPRRAALIEALRSAPALAGC 468


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1296  PF06580       43     4e-06    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 42.5 bits (100), Expect = 4e-06
Identities = 23/151 (15%), Positives = 49/151 (32%), Gaps = 52/151 (34%)

Query: 359 ELDKSLIERIIDPLT--HLVRNSLDHGIELPEKRLAAGKNSVGNLILSAEHQGGNICIEV 416
+++ ++++ + P+ LV N + HGI G ++L G + +EV
Sbjct: 245 QINPAIMDVQVPPMLVQTLVENGIKHGIA--------QLPQGGKILLKGTKDNGTVTLEV 296

Query: 417 TDDGAGLNRERILAKAASQGLTVSENMSDDEVAMLIFAPGFSTAEQVTDVSGRGVGMDVV 476
+ G+ + G G+ V
Sbjct: 297 ENTGSLALKNTK--------------------------------------ESTGTGLQNV 318

Query: 477 KRNIQEMGG---HVEIQSKQGTGTTIRILLP 504
+ +Q + G +++ KQG +L+P
Sbjct: 319 RERLQMLYGTEAQIKLSEKQG-KVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1301  HTHFIS        65     8e-14    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 65.2 bits (159), Expect = 8e-14
Identities = 35/188 (18%), Positives = 72/188 (38%), Gaps = 23/188 (12%)

Query: 1 MSKIRVLSVDDSALMRQIMTEIINSHSDMEMVATAPDPLVARDLIKKFNPDVLTLDVEMP 60
M+ +L DD A +R ++ + ++ V + I + D++ DV MP
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAG--YDVRITSNAATLWRWIAAGDGDLVVTDVVMP 58

Query: 61 RMDGLDFLEKLMRLRPMPVVMVSSLTGKGS-EVTLRALELGAIDFVTKPQLGIREGMLAY 119
+ D L ++ + RP V+V ++ + + ++A E GA D++ KP + E +
Sbjct: 59 DENAFDLLPRIKKARPDLPVLV--MSAQNTFMTAIKASEKGAYDYLPKP-FDLTELIGII 115

Query: 120 SEMIAEKVRTAAKASLAAHKPLSAPTTLKAGPLLSSEKLIAIGASTGGTEAIRHVLQPLP 179
+AE R +K + + +G S E R + + +
Sbjct: 116 GRALAEPKRRPSKLEDDSQDGMP-----------------LVGRSAAMQEIYRVLARLMQ 158

Query: 180 LSSPALLI 187
++
Sbjct: 159 TDLTLMIT 166


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1302  HTHFIS        88     9e-24    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 88.3 bits (219), Expect = 9e-24
Identities = 30/105 (28%), Positives = 51/105 (48%), Gaps = 3/105 (2%)

Query: 7 KFLVVDDFSTMRRIVRNLLKELGFNNVEEAEDGLDALNKLQAGGYGFVISDWNMPNMDGL 66
LV DD + +R ++ L G++ V + + AG V++D MP+ +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYD-VRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 67 ELLKTIRADGAMSALPVLMVTAEAKKENIIAAAQAGASGYVVKPF 111
+LL I+ LPVL+++A+ I A++ GA Y+ KPF
Sbjct: 64 DLLPRIKKARPD--LPVLVMSAQNTFMTAIKASEKGAYDYLPKPF 106


91  EcSMS35_1831  EcSMS35_1842  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
EcSMS35_1831  -1      13       0.215753   peptide ABC transporter ATP-binding protein
EcSMS35_1832  -1      13       0.034705   peptide transport system, ATP-binding protein
EcSMS35_1833  0       14       -0.137272  multidrug efflux transport protein EefD
EcSMS35_1834  -1      12       0.434002   multidrug efflux outer membrane protein EefC
EcSMS35_1835  -2      13       -0.168058  multidrug efflux transport protein EefB
EcSMS35_1836  0       16       0.201284   multidrug efflux transport protein EefA
EcSMS35_1837  -1      15       -0.724437  transcriptional regulator EefR
EcSMS35_1838  -1      15       -0.553759  hypothetical protein
EcSMS35_1839  0       15       -0.556585  enoyl-(acyl carrier protein) reductase
EcSMS35_1840  -1      16       -0.858417  hypothetical protein
EcSMS35_1841  0       15       -0.857932  exoribonuclease II
EcSMS35_1842  1       18       -1.666961  RNase II stability modulator
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1831  HTHFIS        31     0.007    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 31.0 bits (70), Expect = 0.007
Identities = 9/16 (56%), Positives = 14/16 (87%)

Query: 38 LVGESGSGKSLIAKAI 53
+ GESG+GK L+A+A+
Sbjct: 165 ITGESGTGKELVARAL 180


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1833  TCRTETA       68     1e-14    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 67.5 bits (165), Expect = 1e-14
Identities = 68/312 (21%), Positives = 114/312 (36%), Gaps = 18/312 (5%)

Query: 5 SLSWALILGLLAGIGPMCTDLYLPALPEMSEQLAATTTITQLTLTASLIGLGVGQLLFGP 64
L L L +G L +P LP + L + +T L + Q P
Sbjct: 6 PLIVILSTVALDAVG---IGLIMPVLPGLLRDLVHSNDVTA-HYGILLALYALMQFACAP 61

Query: 65 ----LSDKIGRKRPLILSLLLFIVSSILCATTNNIYWLVVWRFIQGIAGAGGSVLSRSIA 120
LSD+ GR+ L++SL V + AT ++ L + R + GI GA G+V IA
Sbjct: 62 VLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIA 121

Query: 121 RDKYQGVTLTQFFALLMTVNGLAPVLSPVLGGYIVSTFDWRTLFWVMAEISTVLLLGCLL 180
D G + F + G V PVLGG + F F+ A ++ + L
Sbjct: 122 -DITDGDERARHFGFMSACFGFGMVAGPVLGGLM-GGFSPHAPFFAAAALNGLNFLTGCF 179

Query: 181 FINETLPENKRGSSL----LLTGRSVVQNRRFMRFCLIQSFMLAGLFAYIGSSSFVL--Q 234
+ E+ +R L + + + F + L + ++ +V+ +
Sbjct: 180 LLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFF-IMQLVGQVPAALWVIFGE 238

Query: 235 KEFGFSPMQFSLVFGLNGI-GLIIASWIFSRLARRINAMTLLRGGLIAAILCALLTVLCA 293
F + + GI + + I +A R+ L G+IA +L
Sbjct: 239 DRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFAT 298

Query: 294 WTQLPIPALVAL 305
+ P +V L
Sbjct: 299 RGWMAFPIMVLL 310


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1834  RTXTOXIND     29     0.048    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 29.0 bits (65), Expect = 0.048
Identities = 24/166 (14%), Positives = 49/166 (29%), Gaps = 11/166 (6%)

Query: 70 DVQKAIADIDSARALYGQTNASLFPTVNAALSSTRSRSLANGTGTTAEADGTVSSYTLDL 129
A AD ++ Q +RS L + + + +
Sbjct: 128 TALGAEADTLKTQSSLLQARL----EQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEE 183

Query: 130 FGRNQSLSRAARETWLASEFTAQNTRLTLIAEISTAWLTLAADNSNLALAKETMASAENS 189
R SL + TW ++ + AE T + + + ++
Sbjct: 184 VLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTV-------LARINRYENLSRVEKSR 236

Query: 190 LKIIQRQQQVGTAAATDVSEAMSVYQQARASVASYQTQVMQDKNAL 235
L A V E + Y +A + Y++Q+ Q ++ +
Sbjct: 237 LDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEI 282


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1835  ACRIFLAVINRP  1161   0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 1161 bits (3005), Expect = 0.0
Identities = 583/1033 (56%), Positives = 760/1033 (73%), Gaps = 6/1033 (0%)

Query: 3 SRFFVRRPVFAWVIAILIMLAGILAIRTLPVAQYPDVAPPTIKISATYTGASAETLENSV 62
+ FF+RRP+FAWV+AI++M+AG LAI LPVAQYP +APP + +SA Y GA A+T++++V
Sbjct: 2 ANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTV 61

Query: 63 TQVIEQQLTGLDNLLYFSSTSSSDGSVSINVTFEQGTDPDTAQVQVQNKIQQAESRLPSE 122
TQVIEQ + G+DNL+Y SSTS S GSV+I +TF+ GTDPD AQVQVQNK+Q A LP E
Sbjct: 62 TQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQE 121

Query: 123 VQQTGVTVEKSQSNFLLIAAVYDTTDKASSSDIADWLVSNVQDPLARVEGVGSLQVFGAE 182
VQQ G++VEKS S++L++A + DI+D++ SNV+D L+R+ GVG +Q+FGA+
Sbjct: 122 VQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGAQ 181

Query: 183 YAMRIWLDPAKLASYSLMPSDVQSAIEAQNVQVTAGKIGALPSPNTQQLTATVRAQSRLQ 242
YAMRIWLD L Y L P DV + ++ QN Q+ AG++G P+ QQL A++ AQ+R +
Sbjct: 182 YAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFK 241

Query: 243 TVDQFKNIIVKSQSDGAVVRIKDVARVEMGSEDYTAIGKLNGHPSAGVAVMLSPGANALN 302
++F + ++ SDG+VVR+KDVARVE+G E+Y I ++NG P+AG+ + L+ GANAL+
Sbjct: 242 NPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANALD 301

Query: 303 TATLVKDKIAEFQRNMPQGYDIAYPKDSTEFIKISVEDVIQTLFEAIVLVVCVMYLFLQN 362
TA +K K+AE Q PQG + YP D+T F+++S+ +V++TLFEAI+LV VMYLFLQN
Sbjct: 302 TAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQN 361

Query: 363 LRATLIPALAVPVVLLGTFGVLALFGYSINTLTLFAMVLAIGLLVDDAIVVVENVERIMR 422
+RATLIP +AVPVVLLGTF +LA FGYSINTLT+F MVLAIGLLVDDAIVVVENVER+M
Sbjct: 362 MRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMM 421

Query: 423 DEGLPAREATEKSMGEISGALVAIALVLSAVFLPMAFFGGSTGVIYRQFSITIISAMLLS 482
++ LP +EATEKSM +I GALV IA+VLSAVF+PMAFFGGSTG IYRQFSITI+SAM LS
Sbjct: 422 EDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALS 481

Query: 483 VVVALTLTPALCGSVL----QHVPPHKKGFFGAFNRFYRRTEDKYQRGVIYVLRRAARTM 538
V+VAL LTPALC ++L +K GFFG FN + + + Y V +L R +
Sbjct: 482 VLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRYL 541

Query: 539 GLYVVLGGGMALMMWKLPGSFLPTEDQGEIMVQYTLPAGATAARTAEVNRQIVDWFLINE 598
+Y ++ GM ++ +LP SFLP EDQG + LPAGAT RT +V Q+ D++L NE
Sbjct: 542 LIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKNE 601

Query: 599 KANTDVIFTVDGFSFSGSGQNTGMAFVSLKNWSQRKGAENTAQAIALRATKELGTIRDAT 658
KAN + +FTV+GFSFSG QN GMAFVSLK W +R G EN+A+A+ RA ELG IRD
Sbjct: 602 KANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDGF 661

Query: 659 VFAMTPPAVDGLGQSNGFTFELLANGGTDRETLLQMRNQLIEKANQSP-ELHSVRANDLP 717
V PA+ LG + GF FEL+ G + L Q RNQL+ A Q P L SVR N L
Sbjct: 662 VIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNGLE 721

Query: 718 QMPQLQVDIDSNKAVSLGLSLNDVTDTLSSAWGGTYVNDFIDRGRVKKVYIQGDSEFRSA 777
Q ++++D KA +LG+SL+D+ T+S+A GGTYVNDFIDRGRVKK+Y+Q D++FR
Sbjct: 722 DTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFRML 781

Query: 778 PSDLGKWFVRGSDNAMTPFSAFATTRWLYGPERLVRYNGSAAYEIQGENATGFSSGDAMT 837
P D+ K +VR ++ M PFSAF T+ W+YG RL RYNG + EIQGE A G SSGDAM
Sbjct: 782 PEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDAMA 841

Query: 838 KMEELANSLPAGTTWAWSGLSLQEKLASGQALSLYAVSILVVFLCLAALYESWSVPFSVI 897
ME LA+ LPAG + W+G+S QE+L+ QA +L A+S +VVFLCLAALYESWS+P SV+
Sbjct: 842 LMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVSVM 901

Query: 898 LVIPLGLLGAALAAWMRDLNNDVYFQVALLTTIGLSSKNAILIVEFA-EAAVAEGYSLSR 956
LV+PLG++G LAA + + NDVYF V LLTTIGLS+KNAILIVEFA + EG +
Sbjct: 902 LVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGVVE 961

Query: 957 AALRAAQTRLRPIIMTSLAFIAGVMPLAIATGAGANSRIAIGTGIIGGTLTATLLAIFFV 1016
A L A + RLRPI+MTSLAFI GV+PLAI+ GAG+ ++ A+G G++GG ++ATLLAIFFV
Sbjct: 962 ATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIFFV 1021

Query: 1017 PLFFVLVKRLFAG 1029
P+FFV+++R F G
Sbjct: 1022 PVFFVVIRRCFKG 1034



Score = 75.3 bits (185), Expect = 1e-15
Identities = 53/330 (16%), Positives = 117/330 (35%), Gaps = 19/330 (5%)

Query: 721 QLQVDIDSNKAVSLGLSLNDVTDTLSSA----WGGTYVNDFIDRGRVKKVYIQGDSEFRS 776
+++ +D++ L+ DV + L G G+ I + F++
Sbjct: 183 AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFKN 242

Query: 777 APSDLGKWFVRGSDN-AMTPFSAFATTRWLYGPER--LVRYNGSAA-----YEIQGENAT 828
P + GK +R + + ++ A L G + R NG A G NA
Sbjct: 243 -PEEFGKVTLRVNSDGSVVRLKDVARVE-LGGENYNVIARINGKPAAGLGIKLATGANAL 300

Query: 829 GFSSGDAMTKMEELANSLPAG--TTWAWSGLSLQEKLASGQALSLYAVSILVVFLCLAAL 886
+ K+ EL P G + + + +L+ +LV + +
Sbjct: 301 DTAKA-IKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLV-MYLF 358

Query: 887 YESWSVPFSVILVIPLGLLGAALAAWMRDLNNDVYFQVALLTTIGLSSKNAILIVEFAEA 946
++ + +P+ LLG + + ++ IGL +AI++VE E
Sbjct: 359 LQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVER 418

Query: 947 AVAEGYSLSRAALRAAQTRLR-PIIMTSLAFIAGVMPLAIATGAGANSRIAIGTGIIGGT 1005
+ E + A + ++++ ++ ++ A +P+A G+ I+
Sbjct: 419 VMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAM 478

Query: 1006 LTATLLAIFFVPLFFVLVKRLFAGKPRRQE 1035
+ L+A+ P + + + + +
Sbjct: 479 ALSVLVALILTPALCATLLKPVSAEHHENK 508


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1836  RTXTOXIND     48     3e-08    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 48.3 bits (115), Expect = 3e-08
Identities = 28/133 (21%), Positives = 55/133 (41%), Gaps = 10/133 (7%)

Query: 41 PVSVVSELTGR-TSAALSAEVRPQVGGIIQKRLFKEGDLVKAGQPLYQIDAASYQAAWNE 99
V +V+ G+ T + S E++P I+++ + KEG+ V+ G L ++ A +A +
Sbjct: 79 QVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLK 138

Query: 100 ARAALQQAQALVKADCQKAQRYARLVKENGVSQQDADDAQSTCAQDKASV--------EA 151
+++L QA+ Q R L K + D Q+ ++ +
Sbjct: 139 TQSSLLQARLEQ-TRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFST 197

Query: 152 KKAALETARINLD 164
+ +NLD
Sbjct: 198 WQNQKYQKELNLD 210



Score = 31.7 bits (72), Expect = 0.005
Identities = 15/116 (12%), Positives = 32/116 (27%), Gaps = 9/116 (7%)

Query: 83 QPLYQIDAASYQAAWN--EARAALQQAQALVKADCQKAQRYARLVKENGVSQQDADDAQS 140
L A + A + K+ ++ + KE Q ++
Sbjct: 241 SSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEE--YQLVTQLFKN 298

Query: 141 TCAQDKASVEAKKAALET----ARINLDWTTVTAPISGRI-GISSVTPGALVTASQ 191
L + + AP+S ++ + T G +VT ++
Sbjct: 299 EILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAE 354


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1837  HTHTETR       55     6e-12    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 55.4 bits (133), Expect = 6e-12
Identities = 17/65 (26%), Positives = 33/65 (50%)

Query: 1 MTSKLEIRHKQRQDEIINAARRCFRRCGFHAASMSQIASEAQLSVGQIYRYFANKDAIIE 60
M K + ++ + I++ A R F + G + S+ +IA A ++ G IY +F +K +
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60

Query: 61 EMVRR 65
E+
Sbjct: 61 EIWEL 65


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1839  DHBDHDRGNASE  50     1e-09    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 50.4 bits (120), Expect = 1e-09
Identities = 51/260 (19%), Positives = 98/260 (37%), Gaps = 22/260 (8%)

Query: 4 LSGKRILVTGVASKLSIAYGIAQAMHREGAEL-AFTYQNDKLKGRVEEFAAQLGSDIVLQ 62
+ GK +TG A I +A+ + +GA + A Y +KL+ V A+
Sbjct: 6 IEGKIAFITGAAQ--GIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFP 63

Query: 63 CDVAEDASIDTMFAELGKVWPKFDGFVHSIGF---APGDQLDGDYVNAVTREGFKIAHDI 119
DV + A+ID + A + + D V+ G L + A F +
Sbjct: 64 ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEAT----FSVN--- 116

Query: 120 SSYSFVAMAKACRSMLNP-GSALLTLSYLGAERAIPNYNVMGLAKASLEANVRYMANAMG 178
S+ F A + M++ +++T+ A + +KA+ + + +
Sbjct: 117 STGVFNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELA 176

Query: 179 PEGVRVNAISAGPIRTLAASGI--------KDFRKMLAHCEAVTPIRRTVTIEDVGNSAA 230
+R N +S G T + + + L + P+++ D+ ++
Sbjct: 177 EYNIRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVL 236

Query: 231 FLCSDLSAGISGEVVHVDGG 250
FL S + I+ + VDGG
Sbjct: 237 FLVSGQAGHITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1842  PF08280       31     0.013    M protein trans-acting positive regulator
		>PF08280#M protein trans-acting positive regulator

Length = 530

Score = 31.4 bits (71), Expect = 0.013
Identities = 21/105 (20%), Positives = 37/105 (35%), Gaps = 2/105 (1%)

Query: 526 PIDVELTESCLIENDELALSVIQQFSQLGAQVHLDDFGTGYSSLSQLARFPIDAIKLDQV 585
P+ V S I L S + FS + + ++ Q+ D +
Sbjct: 425 PLVVVFVASNFINAHLLTDSFPRYFSDKS--IDFHSYYLLQDNVYQIPDLKPDLVITHSQ 482

Query: 586 FVRDIHKQPVSQSLVRAIVAVAQALNLQVIAEGVESKKEDAFLTK 630
+ +H + V I L++Q + V+ +K A LTK
Sbjct: 483 LIPFVHHELTKGIAVAEISFDESILSIQELMYQVKEEKFQADLTK 527


92  EcSMS35_1916  EcSMS35_1920  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
EcSMS35_1916  -1      13       0.491766   nitrite extrusion protein 1
EcSMS35_1917  -1      15       0.487481   hypothetical protein
EcSMS35_1918  -1      15       0.756078   nitrate/nitrite sensor protein NarX
EcSMS35_1919  -2      15       -0.584920  transcriptional regulator NarL
EcSMS35_1920  -3      18       -0.619141  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1916  ACRIFLAVINRP  31     0.011    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 31.0 bits (70), Expect = 0.011
Identities = 35/166 (21%), Positives = 60/166 (36%), Gaps = 22/166 (13%)

Query: 258 IMSLLYLATFGSFIGFSAGFAMLSKTQFPDVQILQYAFFGPFIGALARSA---GGALSDR 314
I+S + L+ + I A A L K + + FFG F S ++
Sbjct: 474 IVSAMALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKI 533

Query: 315 LGGTRVTLVNFILMAIFSGLLFLTLPTD----GQGGSFMAFFAVFLALFLTAGLGSGSTF 370
LG T L+ + L+ +LFL LP+ G F+ L +G+T
Sbjct: 534 LGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTM----------IQLPAGATQ 583

Query: 371 QMISVIFRKLTMDRVKAEGGSDER-----AMREAATDTAAALGFIS 411
+ + ++T +K E + E + A + F+S
Sbjct: 584 ERTQKVLDQVTDYYLKNEKANVESVFTVNGFSFSGQAQNAGMAFVS 629


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1918  PF06580       53     1e-09    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 53.3 bits (128), Expect = 1e-09
Identities = 36/172 (20%), Positives = 73/172 (42%), Gaps = 23/172 (13%)

Query: 424 PESSRELLSQIRNELNASWAQLRELLTTFRLQLTEPGLRPALEASCEEYSAKFGFPVKLD 483
P +RE+L+ + + S + +LT +++ + S +F ++ +
Sbjct: 190 PTKAREMLTSLSELMRYSLRYSNARQVSLADELT------VVDSYLQLASIQFEDRLQFE 243

Query: 484 YQLPPRL----VPSHQAIHLLQIAREALSNALKH-----SQASEVVVTVAQNDNQVKLTV 534
Q+ P + VP L+Q E N +KH Q ++++ +++ V L V
Sbjct: 244 NQINPAIMDVQVPPM----LVQTLVE---NGIKHGIAQLPQGGKILLKGTKDNGTVTLEV 296

Query: 535 QDNGCGVPENAIRSNHYGMIIMRDRAQSLRG-DCRVRRRESGGTEVVVTFIP 585
++ G +N S G+ +R+R Q L G + +++ E G + IP
Sbjct: 297 ENTGSLALKNTKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1919  HTHFIS        74     2e-17    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 73.7 bits (181), Expect = 2e-17
Identities = 32/117 (27%), Positives = 56/117 (47%), Gaps = 2/117 (1%)

Query: 7 ATILLIDDHPMLRTGVKQLISMAPDITVVGEASNGEQGIELAESLDPDLILLDLNMPGMN 66
ATIL+ DD +RT + Q +S A + SN + D DL++ D+ MP N
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRI--TSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 67 GLETLDKLREKSLSGRIVVFSVSNHEEDVVTALKRGADGYLLKDMEPEDLLKALHQA 123
+ L ++++ ++V S N + A ++GA YL K + +L+ + +A
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRA 118


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_1920  INTIMIN       256    1e-78    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 256 bits (656), Expect = 1e-78
Identities = 119/378 (31%), Positives = 195/378 (51%), Gaps = 21/378 (5%)

Query: 79 GEQAKAFALGKVRDALSQQVNQHVESWLSPWGNASVDVKVDNEGHFTGSRGSWFVPLQDN 138
G+ AK ALG + Q + +++WL +G A V+++ N F GS + +P D+
Sbjct: 184 GDYAKDTALGIAGN----QASSQLQAWLQHYGTAEVNLQSGNN--FDGSSLDFLLPFYDS 237

Query: 139 DRYLTWSQLGLTQQDDGLVSNVGVGQRWARGNWLVGYNTFYDNLLDENLQRAGFGAEAWG 198
++ L + Q+G D +N+G GQR+ ++GYN F D + R G G E W
Sbjct: 238 EKMLAFGQVGARYIDSRFTANLGAGQRFFLPENMLGYNVFIDQDFSGDNTRLGIGGEYWR 297

Query: 199 EYLRLSANFYQPFAAWHE--QTATQEQRMARGYDLTARMRMPFYQHLNTSVSVEQYFGDR 256
+Y + S N Y + WHE ++R A G+D+ +P Y L + EQY+GD
Sbjct: 298 DYFKSSVNGYFRMSGWHESYNKKDYDERPANGFDIRFNGYLPSYPALGAKLMYEQYYGDN 357

Query: 257 VDLFNSGTGYHNPVALSLGLNYTPVPLVTVTARHKQGESGENQNNLGLNLNYRFGVPLKK 316
V LFNS NP A ++G+NYTP+PLVT+ ++ G EN + Y+F P +
Sbjct: 358 VALFNSDKLQSNPGAATVGVNYTPIPLVTMGIDYRHGTGNENDLLYSMQFRYQFDKPWSQ 417

Query: 317 QLSAGEVAESQSLRGSRYDNPQRNNLPTLEYRQRKTLTVFLATPPWDLKPGETVPLKLQI 376
Q+ V E ++L GSRYD QRNN LEY+++ L++ + + T ++L +
Sbjct: 418 QIEPQYVNELRTLSGSRYDLVQRNNNIILEYKKQDILSLNI-PHDINGTERSTQKIQLIV 476

Query: 377 RSRYGIRQLIWQGDTQILS-----LTPGAQANSEEGWTLIMPDWQNGEGASNHWRLSVVV 431
+S+YG+ +++W D+ + S G+Q S + + I+P + +G SN ++++
Sbjct: 477 KSKYGLDRIVWD-DSALRSQGGQIQHSGSQ--SAQDYQAILPAYV--QGGSNVYKVTARA 531

Query: 432 EDNQGQRVSSNEITLTLV 449
D G SSN + LT+
Sbjct: 532 YDRNGN--SSNNVLLTIT 547


93  EcSMS35_2042  EcSMS35_2052  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
EcSMS35_2042  0       14       2.156947   ribonuclease E
EcSMS35_2043  -2      12       1.100645   hypothetical protein
EcSMS35_2044  -1      10       1.255222   acetyltransferase
EcSMS35_2045  -1      10       1.648993   flagellar hook-associated protein FlgL
EcSMS35_2046  -1      13       2.750171   flagellar hook-associated protein FlgK
EcSMS35_2047  1       13       2.815166   flagellar rod assembly protein/muramidase FlgJ
EcSMS35_2048  3       13       2.673632   flagellar basal body P-ring protein
EcSMS35_2049  3       15       2.483135   flagellar basal body L-ring protein
EcSMS35_2050  2       17       2.487811   flagellar basal body rod protein FlgG
EcSMS35_2051  1       19       2.359999   flagellar basal body rod protein FlgF
EcSMS35_2052  2       18       1.057448   flagellar hook protein FlgE
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_2042  IGASERPTASE   66     8e-13    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 65.9 bits (160), Expect = 8e-13
Identities = 49/261 (18%), Positives = 83/261 (31%), Gaps = 26/261 (9%)

Query: 551 VAPAPKAATATPASPAQPGLLSRFFSALKALFSGGEETKPAEQP-APKAEAKPERQQDRR 609
T P + S E + E P P A A P
Sbjct: 991 TVDTTNITTPNNIQADVPSVPSN----------NEEIARVDEAPVPPPAPATPSETT--- 1037

Query: 610 KPRQNNRRDRNERRDSRSERTEGSDNREENRRNRRQAQQQTVETRESRQQAEV------T 663
N +++S++ D E +NR A++ + + Q EV T
Sbjct: 1038 -----ETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSET 1092

Query: 664 EKARTTDEQQAPRRERSRRRNDDKRQAQQEAKALNVEEQSVQETEQEERVRPVQPRRKQR 723
++ +TT+ ++ E+ + + + Q+ K + + QE + + + R
Sbjct: 1093 KETQTTETKETATVEKEEKAKVETEKTQEVPK-VTSQVSPKQEQSETVQPQAEPARENDP 1151

Query: 724 QLNQKVRYEQSVAEEAVVAPVVEETVAGEPIVQEAPAPRTELVKVPLPVVAQAAPEQQEE 783
+N K Q+ P E + E V E+ T V P A Q
Sbjct: 1152 TVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTV 1211

Query: 784 NNADNRDNGGMPRRSRRSPRH 804
N+ + RRS RS H
Sbjct: 1212 NSESSNKPKNRHRRSVRSVPH 1232



Score = 38.9 bits (90), Expect = 1e-04
Identities = 50/315 (15%), Positives = 91/315 (28%), Gaps = 34/315 (10%)

Query: 516 EEFAERKRPEQPALATFAMPDVPPAPTPAEPAAPVVAPAPK------------------- 556
+ + E+ A A P PPAP VA K
Sbjct: 1006 DVPSVPSNNEEIARVDEA-PVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQ 1064

Query: 557 ----AATATPASPAQPGLLSRFFSALKALFSGGEETKPAEQPAPKAEAKPERQQDRRKPR 612
A A A S + + ETK + +AK E ++ + P+
Sbjct: 1065 NREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPK 1124

Query: 613 QNNRRDRNERRDSRSERTEGSDNREEN-RRNRRQAQQQTVETRESRQQAEVTEKARTTDE 671
++ + + S + + + RE + N ++ Q QT T ++ Q A+ T + E
Sbjct: 1125 VTSQVSPKQEQ-SETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKET---SSNVE 1180

Query: 672 QQAPRRERSRRRNDDKRQAQQEAKALNVEEQSVQETEQEERVRPVQPRRKQRQLNQKVRY 731
Q N + A + E+ + + R + R N +
Sbjct: 1181 QPVTESTTVNTGNSVVENPENTTPA-TTQPTVNSESSNKPKNRHRRSVRSVPH-NVEPAT 1238

Query: 732 EQSVAEEAVVAPVVEETVAGEPIVQEAPAPRTELVKVPLPV---VAQAAPEQQEENNADN 788
S V + T + + + V V ++Q + + N
Sbjct: 1239 TSSNDRSTVALCDLTSTNTNAVLSDARAKAQFVALNVGKAVSQHISQLEMNNEGQYNVWV 1298

Query: 789 RDNGGMPRRSRRSPR 803
+ S R
Sbjct: 1299 SNTSMNKNYSSSQYR 1313



Score = 35.4 bits (81), Expect = 0.001
Identities = 33/232 (14%), Positives = 69/232 (29%), Gaps = 15/232 (6%)

Query: 515 EEEFAERKRPEQPALATFAMPDVPPAPTPAEPAAPVVAPAPKAATATPASPAQPGLLSRF 574
E+E + E+ V P +E P PA + Q
Sbjct: 1107 EKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQ----- 1161

Query: 575 FSALKALFSGGEETKPAEQPAPKAEAKPERQQDR--RKPRQNNRRDRNERRDSRSERTEG 632
+ A + + P E+ + P +S S
Sbjct: 1162 -TNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPK 1220

Query: 633 SDNREENRRNRRQAQQQTVETRESRQQAEVTEKARTTDEQQAPRRERSRRRNDDKRQAQQ 692
+ +R R + T + + A + T+ + R +++ + +A
Sbjct: 1221 NRHRRSVRSVPHNVEPATTSSNDRSTVALCDLTSTNTNAVLSDARAKAQFVALNVGKAVS 1280

Query: 693 EAKA---LNVEEQS---VQETEQEERVRPVQPRRKQRQLNQ-KVRYEQSVAE 737
+ + +N E Q V T + Q RR + Q ++ ++Q+++
Sbjct: 1281 QHISQLEMNNEGQYNVWVSNTSMNKNYSSSQYRRFSSKSTQTQLGWDQTISN 1332


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_2045  FLAGELLIN     47     4e-08    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 47.3 bits (112), Expect = 4e-08
Identities = 42/226 (18%), Positives = 80/226 (35%), Gaps = 9/226 (3%)

Query: 7 MMYQQNMRGITNSQAEWMKYGEQMSTGKRVVNPSDDPIAASQAVVLSQAQAQNSQYTLAR 66
++ Q N+ +S + + E++S+G R+ + DD + A + +Q +
Sbjct: 11 LLTQNNLNKSQSSLSSAI---ERLSSGLRINSAKDDAAGQAIANRFTSNIKGLTQASRNA 67

Query: 67 TFATQKVSLEESVLSQVTTAIQNAQEKIVYASNGTLSDDDRASLATDIQGLRDQLLNLAN 126
E L+++ +Q +E V A+NGT SD D S+ +IQ +++ ++N
Sbjct: 68 NDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEEIDRVSN 127

Query: 127 TTDGNGRYIFAGYKTESAPFSEVDGKYEGGAESIKQQVDASRSMVIGHTGDKIFDSITSN 186
T NG + + DG E+I + +G G + +
Sbjct: 128 QTQFNGVKVLSQDNQMKIQVGANDG------ETITIDLQKIDVKSLGLDGFNVNGPKEAT 181

Query: 187 AVAEPDGSASETNLFAMLDSAIAALKTPVADSEADKETAAAALDKT 232
+ T A + + TA DK
Sbjct: 182 VGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKV 227


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_2046  FLGHOOKAP1    681    0.0      Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 681 bits (1758), Expect = 0.0
Identities = 543/546 (99%), Positives = 544/546 (99%)

Query: 2 SSLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVS 61
SSLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVS
Sbjct: 1 SSLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVS 60

Query: 62 GVQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTLV 121
GVQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTLV
Sbjct: 61 GVQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTLV 120

Query: 122 SNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLND 181
SNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLND
Sbjct: 121 SNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLND 180

Query: 182 QISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGSTA 241
QISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGSTA
Sbjct: 181 QISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGSTA 240

Query: 242 RQLAAVPSSADPSRTTVAYIDGTAGNIEIPEKLLNTGSLGGILTFRSQDLDQTRNTLGQL 301
RQLAAVPSSADPSRTTVAY+DGTAGNIEIPEKLLNTGSLGGILTFRSQDLDQTRNTLGQL
Sbjct: 241 RQLAAVPSSADPSRTTVAYVDGTAGNIEIPEKLLNTGSLGGILTFRSQDLDQTRNTLGQL 300

Query: 302 ALAFAEAFNTQHKAGFDANGDAGEDFFAIGKPAVLQNTKNKGDVAIGATVTDASAVLATD 361
ALAFAEAFNTQHKAGFDANGDAGEDFFAIGKPAVLQNTKNKGDVAIGATVTDASAVLATD
Sbjct: 301 ALAFAEAFNTQHKAGFDANGDAGEDFFAIGKPAVLQNTKNKGDVAIGATVTDASAVLATD 360

Query: 362 YKISFDNNQWQATRLASNTTFTVTPDANGKVAFDGLELTFTGTPAVNDSFTLKPVSDAIV 421
YKISFDNNQWQ TRLASNTTFTVTPDANGKVAFDGLELTFTGTPAVNDSFTLKPVSDAIV
Sbjct: 361 YKISFDNNQWQVTRLASNTTFTVTPDANGKVAFDGLELTFTGTPAVNDSFTLKPVSDAIV 420

Query: 422 NMDVLITDEAKIAMASEEDVGDSDNRNGQALLDLQSNSKTVGGAKSFNDAYASLVSDIGN 481
NMDVLITDEAKIAMASEED GDSDNRNGQALLDLQSNSKTVGGAKSFNDAYASLVSDIGN
Sbjct: 421 NMDVLITDEAKIAMASEEDAGDSDNRNGQALLDLQSNSKTVGGAKSFNDAYASLVSDIGN 480

Query: 482 KTATLKTSSATQGNVVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFD 541
KTATLKTSSATQGNVVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFD
Sbjct: 481 KTATLKTSSATQGNVVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFD 540

Query: 542 ALINIR 547
ALINIR
Sbjct: 541 ALINIR 546


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_2047  FLGFLGJ       508    0.0      Flagellar protein FlgJ signature.
		>FLGFLGJ#Flagellar protein FlgJ signature.

Length = 313

Score = 508 bits (1310), Expect = 0.0
Identities = 310/313 (99%), Positives = 311/313 (99%)

Query: 1 MISDSKLLASAAWDAQSLNELKAKAGEDPAANIRPVARQVEGMFVQMMLKSMRDALPKDG 60
MISDSKLLASAAWDAQSLNELKAKAGEDPAANIRPVARQVEGMFVQMMLKSMRDALPKDG
Sbjct: 1 MISDSKLLASAAWDAQSLNELKAKAGEDPAANIRPVARQVEGMFVQMMLKSMRDALPKDG 60

Query: 61 LFSSEHTRLYTSMYDQQIAQQMTAGKGLGLAEMMVKQMTPEQPLPEESMPAAPMKFPLET 120
LFSSEHTRLYTSMYDQQIAQQMTAGKGLGLAEMMVKQMTPEQPLPEES PAAPMKFPLET
Sbjct: 61 LFSSEHTRLYTSMYDQQIAQQMTAGKGLGLAEMMVKQMTPEQPLPEESTPAAPMKFPLET 120

Query: 121 VVRYQNQTLSQLVQKAVPRNYDDSLPGDSKAFLAQLSLPAQLASQQSGVPHHLILAQAAL 180
VVRYQNQ LSQLVQKAVPRNYDDSLPGDSKAFLAQLSLPAQLASQQSGVPHHLILAQAAL
Sbjct: 121 VVRYQNQALSQLVQKAVPRNYDDSLPGDSKAFLAQLSLPAQLASQQSGVPHHLILAQAAL 180

Query: 181 ESGWGQRQIRRENGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKKVKAKFRVYSSYL 240
ESGWGQRQIRRENGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKKVKAKFRVYSSYL
Sbjct: 181 ESGWGQRQIRRENGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKKVKAKFRVYSSYL 240

Query: 241 EALSDYVGLLTRNPRYAAVTTAASAEQGAQALQDAGYATDPHYARKLTSMIQQMKSISDK 300
EALSDYVGLLTRNPRYAAVTTAASAEQGAQALQDAGYATDPHYARKLT+MIQQMKSISDK
Sbjct: 241 EALSDYVGLLTRNPRYAAVTTAASAEQGAQALQDAGYATDPHYARKLTNMIQQMKSISDK 300

Query: 301 VSKTYSMNIDNLF 313
VSKTYSMNIDNLF
Sbjct: 301 VSKTYSMNIDNLF 313


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_2048  FLGPRINGFLGI  425    e-151    Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 425 bits (1095), Expect = e-151
Identities = 156/363 (42%), Positives = 212/363 (58%), Gaps = 9/363 (2%)

Query: 4 FLSALILLLVTTAVQAERIRDLTSVQGVRQNSLIGYGLVVGLDGTGDQTTQTPFTTQTLN 63
F + L RI+D+ S+Q R N LIGYGLVVGL GTGD +PFT Q++
Sbjct: 13 FSALPFLSTPPAQADTSRIKDIASLQAGRDNQLIGYGLVVGLQGTGDSLRSSPFTEQSMR 72

Query: 64 NMLSQLGITVPTGTNMQLKNVAAVMVTASLPPFGRQGQTIDVVVSSMGNAKSLRGGTLLM 123
ML LGIT G + KN+AAVMVTA+LPPF G +DV VSS+G+A SLRGG L+M
Sbjct: 73 AMLQNLGITTQGGQS-NAKNIAAVMVTANLPPFASPGSRVDVTVSSLGDATSLRGGNLIM 131

Query: 124 TPLKGVDSQVYALAQGNILVGGAGASAGGSSVQVNQLNGGRITNGAVIERELPSQFGVGN 183
T L G D Q+YA+AQG ++V G A +++ R+ NGA+IERELPS+F
Sbjct: 132 TSLSGADGQIYAVAQGALIVNGFSAQGDAATLTQGVTTSARVPNGAIIERELPSKFKDSV 191

Query: 184 TLNLQLNDEDFSMAQQIADTINRVR----GYGSATALDARTIQVRVPSGNSSQVRFLADI 239
L LQL + DFS A ++AD +N G A D++ I V+ P + R +A+I
Sbjct: 192 NLVLQLRNPDFSTAVRVADVVNAFARARYGDPIAEPRDSQEIAVQKPRV-ADLTRLMAEI 250

Query: 240 QNMQVNVTPQDAKVVINSRTGSVVMNREVTLDSCAVAQGNLSVTVNRQANVSQPDTPFGG 299
+N+ V T AKVVIN RTG++V+ +V + AV+ G L+V V V QP PF
Sbjct: 251 ENLTVE-TDTPAKVVINERTGTIVIGADVRISRVAVSYGTLTVQVTESPQVIQP-APFSR 308

Query: 300 GQTVVTPQTQIDLRQSGGSLQSVRSSASLNNVVRALNALGATPMDLMSILQSMQSAGCLR 359
GQT V PQT I Q G + ++ L +V LN++G +++ILQ ++SAG L+
Sbjct: 309 GQTAVQPQTDIMAMQEGSKV-AIVEGPDLRTLVAGLNSIGLKADGIIAILQGIKSAGALQ 367

Query: 360 AKL 362
A+L
Sbjct: 368 AEL 370


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_2049  FLGLRINGFLGH  349    e-125    Flagellar L-ring protein signature.
		>FLGLRINGFLGH#Flagellar L-ring protein signature.

Length = 232

Score = 349 bits (896), Expect = e-125
Identities = 231/232 (99%), Positives = 232/232 (100%)

Query: 1 MQKNAAHTYAISSLLVLSLTGCAWIPSTPLVQGATSAQPVPGPTPVANGSIFQSAQPINY 60
MQKNAAHTYAISSLLVLSLTGCAWIPSTPLVQGATSAQPVPGPTPVANGSIFQSAQPINY
Sbjct: 1 MQKNAAHTYAISSLLVLSLTGCAWIPSTPLVQGATSAQPVPGPTPVANGSIFQSAQPINY 60

Query: 61 GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKSNFGFDTVPRYLQGLFGNA 120
GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGK+NFGFDTVPRYLQGLFGNA
Sbjct: 61 GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYLQGLFGNA 120

Query: 121 RADVEASGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRF 180
RADVEASGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRF
Sbjct: 121 RADVEASGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRF 180

Query: 181 SGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSPM 232
SGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSPM
Sbjct: 181 SGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSPM 232


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_2050  FLGHOOKAP1    44     4e-07    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 43.8 bits (103), Expect = 4e-07
Identities = 18/81 (22%), Positives = 36/81 (44%), Gaps = 14/81 (17%)

Query: 3 SSLWIAKTGLDAQQTNMDVIANNLANVSTNGFKRQRAVFEDLLYQTIRQPGAQSSEQTTL 62
S + A +GL+A Q ++ +NN+++ + G+ RQ + + +TL
Sbjct: 2 SLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTI--------------MAQANSTL 47

Query: 63 PSGLQIGTGVRPVATERLHSQ 83
+G +G GV +R +
Sbjct: 48 GAGGWVGNGVYVSGVQREYDA 68



Score = 41.1 bits (96), Expect = 3e-06
Identities = 11/41 (26%), Positives = 21/41 (51%)

Query: 220 ETSNVNVAEELVNMIQVQRAYEINSKAVSTTDQMLQKLTQL 260
S VN+ EE N+ + Q+ Y N++ + T + + L +
Sbjct: 505 SISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_2052  FLGHOOKAP1    41     4e-06    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 41.5 bits (97), Expect = 4e-06
Identities = 17/49 (34%), Positives = 29/49 (59%)

Query: 353 TLTNGALEASNVDLSKELVNMIVAQRNYQSNAQTIKTQDQILNTLVNLR 401
L+N S V+L +E N+ Q+ Y +NAQ ++T + I + L+N+R
Sbjct: 498 QLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINIR 546



Score = 37.2 bits (86), Expect = 1e-04
Identities = 22/56 (39%), Positives = 30/56 (53%), Gaps = 4/56 (7%)

Query: 6 AVSGLNAAATNLDVIGNNIANSATYGFKSGTASFAD----MFAGSKVGLGVKVAGI 57
A+SGLNAA L+ NNI++ G+ T A + AG VG GV V+G+
Sbjct: 7 AMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVSGV 62


94  EcSMS35_2364  EcSMS35_2369  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
EcSMS35_2364  0       11       -1.888949  outer membrane porin protein C
EcSMS35_2365  0       11       -1.528063  phosphotransfer intermediate protein in
EcSMS35_2366  0       13       -1.189308  transcriptional regulator RcsB
EcSMS35_2367  -1      14       -0.603205  hybrid sensory kinase in two-component
EcSMS35_2368  0       18       -0.173408  sensory histidine kinase AtoS
EcSMS35_2369  0       15       0.859274   acetoacetate metabolism regulatory protein AtoC
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_2364  ECOLIPORIN    529    0.0      E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 529 bits (1365), Expect = 0.0
Identities = 251/383 (65%), Positives = 294/383 (76%), Gaps = 19/383 (4%)

Query: 1 MKVKVLSLLVPALLVAGAANAAEIYNKDGNKLDLYGKVDGLHYFSDNDSKDGDKTYMRLG 60
MK KVL+L++PALL AGAA+AAEIYNKDGNKLDLYGKVDGLHYFSD+ SKDGD+TYMR+G
Sbjct: 1 MKRKVLALVIPALLAAGAAHAAEIYNKDGNKLDLYGKVDGLHYFSDDSSKDGDQTYMRVG 60

Query: 61 FKGETQVTDQLTGYGQWEYQIQGNEPESDNS-SWTRVAFAGLKFQDVGSFDYGRNYGVVY 119
FKGETQ+ DQLTGYGQWEY +Q N E + + SWTR+AFAGLKF D GSFDYGRNYGV+Y
Sbjct: 61 FKGETQINDQLTGYGQWEYNVQANTTEGEGANSWTRLAFAGLKFGDYGSFDYGRNYGVLY 120

Query: 120 DVTSWTDVLPEFGGDTY-DSDNFMQQRGNGFATYRNTDFFGLVDGLDFAVQYQGKNGSAH 178
DV WTD+LPEFGGD+Y +DN+M R NG ATYRNTDFFGLVDGL+FA+QYQGKN S
Sbjct: 121 DVEGWTDMLPEFGGDSYTYADNYMTGRANGVATYRNTDFFGLVDGLNFALQYQGKNESQS 180

Query: 179 GEGMT-----TNGRDDVFEQNGDGVGGSITYNY-EGFGIGAAVSSSKRTWDQNNT-GLIG 231
+ + N DD+ NGDG G S TY+ GF GAA ++S RT +Q N G I
Sbjct: 181 ADDVNIGTNNRNNGDDIRYDNGDGFGISTTYDIGMGFSAGAAYTTSDRTNEQVNAGGTIA 240

Query: 232 TGDRAETYTGGLKYDANNIYLAAQYTQTYNATRVGSL------GWANKAQNFEAVAQYQF 285
GD+A+ +T GLKYDANNIYLA Y++T N T G G ANK QNFE AQYQF
Sbjct: 241 GGDKADAWTAGLKYDANNIYLATMYSETRNMTPYGKTDKGYDGGVANKTQNFEVTAQYQF 300

Query: 286 DFGLRPSLAYLQSKGKNLGR---GYDDEDILKYVDVGATYYFNKNMSTYVDYKINLLD-D 341
DFGLRP++++L SKGK+L DD+D++KY DVGATYYFNKN STYVDYKINLLD D
Sbjct: 301 DFGLRPAVSFLMSKGKDLTYNNVNGDDKDLVKYADVGATYYFNKNFSTYVDYKINLLDDD 360

Query: 342 NQFTRAAGINTDDIVALGLVYQF 364
+ F + AGI+TDDIVALG+VYQF
Sbjct: 361 DPFYKDAGISTDDIVALGMVYQF 383


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_2366  HTHFIS        48     9e-09    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 47.9 bits (114), Expect = 9e-09
Identities = 26/145 (17%), Positives = 60/145 (41%), Gaps = 20/145 (13%)

Query: 1 MNNMNVIIADDHPIVLFGIRKSLEQIEWVNVVGEFEDSTALINNLPKLDAHVLITDLSMP 60
M +++ADD + + ++L + + + ++ L + D +++TD+ MP
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRI--TSNAATLWRWIAAGDGDLVVTDVVMP 58

Query: 61 GDKYGDGITLIKYIKRHFPSLSIIVLTMNNNPAILSAVLDLDIEGIVLKQGA------PT 114
+ L+ IK+ P L ++V++ N +A+ ++GA P
Sbjct: 59 D---ENAFDLLPRIKKARPDLPVLVMSAQNTFM--TAIKA-------SEKGAYDYLPKPF 106

Query: 115 DLPKALAALQKGKKFTPESVSRLLE 139
DL + + + + S+L +
Sbjct: 107 DLTELIGIIGRALAEPKRRPSKLED 131


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_2367  HTHFIS        82     3e-18    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 82.2 bits (203), Expect = 3e-18
Identities = 29/106 (27%), Positives = 48/106 (45%)

Query: 827 ILVVDDHPINRRLLADQLGSLGYQCKTANDGVDALNVLSKNHIDIVLSDVNMPNMDGYRL 886
ILV DD R +L L GY + ++ ++ D+V++DV MP+ + + L
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 887 TQRIRQLGLTLPVIGVTANALAEEKQRCLESGMDSCLSKPVTLDVI 932
RI++ LPV+ ++A + E G L KP L +
Sbjct: 66 LPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTEL 111


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_2369  HTHFIS        560    0.0      FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 560 bits (1444), Expect = 0.0
Identities = 180/484 (37%), Positives = 268/484 (55%), Gaps = 35/484 (7%)

Query: 1 MTAINRILIVDDEDNVRRMLSTAFALQGFETHCANNGRTALHLFADIHPDVVLMDIRMPE 60
MT IL+ DD+ +R +L+ A + G++ +N T A D+V+ D+ MP+
Sbjct: 1 MTGA-TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPD 59

Query: 61 MDGIKALKEMRSHETRTPVILMTAYAEVETAVEALRCGAFDYVIKPFDLDELNLIVQRTL 120
+ L ++ PV++M+A TA++A GA+DY+ KPFDL EL I+ R L
Sbjct: 60 ENAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRAL 119

Query: 121 QLQSMKKEIRHLHQALSTSWQWGH-ILTNSPAMMDICKDTAKIALSQASVLISGESGTGK 179
+ L Q G ++ S AM +I + A++ + +++I+GESGTGK
Sbjct: 120 AEP------KRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGK 173

Query: 180 ELIARAIHYNSRRAKGPFIKVNCAALPESLLESELFGHEKGAFTGAQTLRQGLFERANEG 239
EL+ARA+H +R GPF+ +N AA+P L+ESELFGHEKGAFTGAQT G FE+A G
Sbjct: 174 ELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGG 233

Query: 240 TLLLDEIGEMPLVLQAKLLRILQEREFERIGGHQTIKVDIRIIAATNRDLQAMVKEGTFR 299
TL LDEIG+MP+ Q +LLR+LQ+ E+ +GG I+ D+RI+AATN+DL+ + +G FR
Sbjct: 234 TLFLDEIGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFR 293

Query: 300 EDLFYRLNVIHLILPPLRDRREDISLLANHFLQKFSSENQRDIIDIDPMAMSLLTAWSWP 359
EDL+YRLNV+ L LPPLRDR EDI L HF+Q+ E + D A+ L+ A WP
Sbjct: 294 EDLYYRLNVVPLRLPPLRDRAEDIPDLVRHFVQQAEKEGLD-VKRFDQEALELMKAHPWP 352

Query: 360 GNIRELSNVIERAVVMNSGPIIFSEDLPPQIRQPV---------CNAGEAKTAPVGERN- 409
GN+REL N++ R + +I E + ++R + +G + E N
Sbjct: 353 GNVRELENLVRRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENM 412

Query: 410 ----------------LKEEIKRVEKRIIMEVLEQQEGNRTRTALMLGISRRALMYKLQE 453
+ +E +I+ L GN+ + A +LG++R L K++E
Sbjct: 413 RQYFASFGDALPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRE 472

Query: 454 YGID 457
G+
Sbjct: 473 LGVS 476


95  EcSMS35_2492  EcSMS35_2496  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
EcSMS35_2492  -1      24       2.889957   putative minor fimbrial subunit
EcSMS35_2493  -1      21       2.634399   putative minor fimbrial protein
EcSMS35_2494  -1      20       2.492130   fimbrial protein
EcSMS35_2495  -2      13       1.836186   fimbrial chaperone protein
EcSMS35_2496  -2      11       1.910726   fimbrial usher protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_2492  FIMBRIALPAPE  29     0.008    Escherichia coli: P pili tip fibrillum papE protein...
		>FIMBRIALPAPE#Escherichia coli: P pili tip fibrillum papE protein signature.

Length = 173

Score = 28.8 bits (64), Expect = 0.008
Identities = 43/167 (25%), Positives = 71/167 (42%), Gaps = 18/167 (10%)

Query: 15 LLCSSTAPAVDNLHFTGNLLGKSCTPVINGSLLAEVQFPTIAASDLMHLGQSDRVPLVFQ 74
+L S A DNL F G L+ +CT V N AEV + I +L+ G + + F
Sbjct: 16 VLMSQHVHAADNLTFKGKLIIPACT-VQN----AEVNWGDIEIQNLVQSGGNQK---DFT 67

Query: 75 LKDCKSSTLFSVRVTLAGTEDSELPGFLAIDASSSASGVGIGIETAAGAAVPINDTTGVT 134
+ +L +++VT+ T + + + + +S+ASG G+ I I + +
Sbjct: 68 VDMNCPYSLGTMKVTI--TSNGQTGNSILVPNTSTASGDGLLIYLYNSNNSGIGNAVTLG 125

Query: 135 FPLNQG-------SNTLNFNAWLQTKSG-RDVTPGDFSATATATFEY 173
+ G + + A L K + + G FSATAT Y
Sbjct: 126 SQVTPGKITGTAPARKITLYAKLGYKGNMQSLQAGTFSATATLVASY 172


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_2493  FIMBRIALPAPF  33     2e-04    Escherichia coli: P pili tip fibrillum papF protein...
		>FIMBRIALPAPF#Escherichia coli: P pili tip fibrillum papF protein signature.

Length = 167

Score = 32.8 bits (74), Expect = 2e-04
Identities = 34/161 (21%), Positives = 67/161 (41%), Gaps = 18/161 (11%)

Query: 1 MLWGFCSMALSNVTFHGYLVQPPNCTISDGETIELTFQDVNIDDINGSNYEQIVPYRITC 60
+L +A + G + PP CTI++G+ I + F ++N + ++ S E I+C
Sbjct: 11 LLTSVAVLADVQINIRGNVYIPP-CTINNGQNIVVDFGNINPEHVDNSRGEVTKNISISC 69

Query: 61 DTPVRDPLLEMTLSWSGTQSDFDDAAVSTDIAGLGIRLKQ-------------AGQSFKL 107
P + L + ++ T + ++T+I GI L Q +G +++
Sbjct: 70 --PYKSGSLWIKVT-GNTMGVGQNNVLATNITHFGIALYQGKGMSTPLTLGNGSGNGYRV 126

Query: 108 NTPLVVDETALPALTAVPVKKSGVDLPEANFEAWATLQVDY 148
L + T+VP + L +F A++ + Y
Sbjct: 127 TAGLDTARSTF-TFTSVPFRNGSGILNGGDFRTTASMSMIY 166


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_2494  FIMBRIALPAPE  29     0.008    Escherichia coli: P pili tip fibrillum papE protein...
		>FIMBRIALPAPE#Escherichia coli: P pili tip fibrillum papE protein signature.

Length = 173

Score = 28.8 bits (64), Expect = 0.008
Identities = 40/142 (28%), Positives = 62/142 (43%), Gaps = 22/142 (15%)

Query: 37 PPCTVTGAEVEFGNVVI-TRIGGVNYKRPIDYKLVCNNLAMDDLRLQMQAATVVINGETV 95
P CTV AEV +G++ I + ++ + C L T+ NG+T
Sbjct: 37 PACTVQNAEVNWGDIEIQNLVQSGGNQKDFTVDMNC------PYSLGTMKVTITSNGQTG 90

Query: 96 ISTGIP--------GFGIRIQKASDSSILD-LTSGA-WLPFNFSSGAPA----LEA-VPV 140
S +P G I + +++S I + +T G+ P + APA L A +
Sbjct: 91 NSILVPNTSTASGDGLLIYLYNSNNSGIGNAVTLGSQVTPGKITGTAPARKITLYAKLGY 150

Query: 141 KQSGTSLTAAEFSASATIVVDY 162
K + SL A FSA+AT+V Y
Sbjct: 151 KGNMQSLQAGTFSATATLVASY 172


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_2496  PF00577       747    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 747 bits (1930), Expect = 0.0
Identities = 226/881 (25%), Positives = 383/881 (43%), Gaps = 70/881 (7%)

Query: 8 RLRGIACYIALAISGGSVNAWADDSIQFDPRFLELKGDTKIDLGKFSKKGYVDAGKYNLR 67
RL G + +A + + + + F+PRFL DL +F + G Y +
Sbjct: 22 RLAGFFVRLFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPGTYRVD 81

Query: 68 VFINKQPLSDEYDINWYVSENDPTKTYACLTPELVAALGLKEGIAKSLQWTHNDECLKPG 127
+++N + D+ + + + CLT +A++GL + +D C+
Sbjct: 82 IYLNNGYM-ATRDVTFN-TGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDACVPLT 139

Query: 128 QL-DGMEVENDLSQSALLLTVPQAYLEYTSSDWDPPSRWDDGIPGLIADYSLNAQTRHQE 186
+ + D+ Q L LT+PQA++ + + PP WD GI + +Y+ + +
Sbjct: 140 SMIHDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSVQNR 199

Query: 187 QGGEDSHDISGNGTVGANLGAWRFRADWQSDYQHTRSNDDDDDSSNSTTSKNWDWSRYYA 246
GG +SH N G N+GAWR R + Y + S+ + W +
Sbjct: 200 IGG-NSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKN--------KWQHINTWL 250

Query: 247 WRALPSLKAKLSLGEDYLNSDIFDGFNYIGSSVSTDDQMLPPNLRGYAPDVSGVAHSSAK 306
R + L+++L+LG+ Y DIFDG N+ G+ +++DD MLP + RG+AP + G+A +A+
Sbjct: 251 ERDIIPLRSRLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQ 310

Query: 307 VTISQMGRVLYETQVPAGPFRIQDI-GDSVSGTLHVRVEEQNGQVQEYDVTTASMPFLTR 365
VTI Q G +Y + VP GPF I DI SG L V ++E +G Q + V +S+P L R
Sbjct: 311 VTIKQNGYDIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQR 370

Query: 366 QGQVRYKVMMGRPEDWNHKTEGGFFSGGEASWGVADGWSLYGGALADEHYQSAAMGVGRD 425
+G RY + G N + E F G+ GW++YGG + Y++ G+G++
Sbjct: 371 EGHTRYSITAGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKN 430

Query: 426 LAQFGALAFDVTHSHVNLDHDSAYGKGKLDGNSFRVSYAKDFDELNSRVTFAGYRFSEKN 485
+ GAL+ D+T ++ L DS + DG S R Y K +E + + GYR+S
Sbjct: 431 MGALGALSVDMTQANSTLPDDSQH-----DGQSVRFLYNKSLNESGTNIQLVGYRYSTSG 485

Query: 486 FMTMSEYLDANQSDMARTGND-------------------KEMYTITYNQNFAAAGVSIY 526
+ ++ + + D + +T Q ++Y
Sbjct: 486 YFNFADTTYSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRTS-TLY 544

Query: 527 LNYSHRTYWDRP-EQTNYNLMFSHYFNMGSIRNVSISVTGYRYEYDDNADKGMYLSMSIP 585
L+ SH+TYW + + F + ++S + + + D+ + L+++IP
Sbjct: 545 LSGSHQTYWGTSNVDEQFQAGLNTAFEDIN---WTLSYSLTKNAWQKGRDQMLALNVNIP 601

Query: 586 WSD-----------SSTVTYNGSYGS-GSDSSQVGYFKRV--DDATHYQVNVGT-----S 626
+S ++ +Y+ S+ G ++ G + + D+ Y V G
Sbjct: 602 FSHWLRSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDG 661

Query: 627 EQHGSVDGYLSHDGSLAKVDLSANYHEGEYRSAGIALQGGATLTAHGGALHRTQSMGGTR 686
+ L++ G ++ H + + + GG A+G L Q + T
Sbjct: 662 NSGSTGYATLNYRGGYGNANIGY-SHSDDIKQLYYGVSGGVLAHANGVTLG--QPLNDTV 718

Query: 687 LLIDADGIANVPVESNGAPVYTNMFGKAVVADINNYYRNQAYIDLNNLPEDAEATQSVVQ 746
+L+ A G + VE N V T+ G AV+ Y N+ +D N L ++ + +V
Sbjct: 719 VLVKAPGAKDAKVE-NQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVAN 777

Query: 747 ATLTEGAIGYRKFKVISGQKAMAVLRLRDGSYPPFGAEVKNDEQQQVGIVDDEGNVYLAG 806
T GAI +FK G K + L + PFGA V ++ Q GIV D G VYL+G
Sbjct: 778 VVPTRGAIVRAEFKARVGIKLLMTLT-HNNKPLPFGAMVTSESSQSSGIVADNGQVYLSG 836

Query: 807 VNAGEHMTVFW--EGSAQCEI--VLPKPLPADLFSGLLLPC 843
+ + V W E +A C LP L + L C
Sbjct: 837 MPLAGKVQVKWGEEENAHCVANYQLPPESQQQLLTQLSAEC 877


96  EcSMS35_2517  EcSMS35_2520  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
EcSMS35_2517  1       35       -9.432081  multidrug resistance protein Y
EcSMS35_2518  0       34       -8.421998  drug resistance MFS transporter, membrane fusion
EcSMS35_2519  1       32       -7.707828  DNA-binding transcriptional activator EvgA
EcSMS35_2520  1       32       -7.275990  hybrid sensory histidine kinase in two-component
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_2517  TCRTETB       121    4e-32    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 121 bits (306), Expect = 4e-32
Identities = 92/404 (22%), Positives = 167/404 (41%), Gaps = 17/404 (4%)

Query: 19 VTIALSLATFMQMLDSTISNVAIPTISGFLGASTDEGTWVITSFGVANAIAIPVTGRLAQ 78
+ I L + +F +L+ + NV++P I+ WV T+F + +I V G+L+
Sbjct: 15 ILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSD 74

Query: 79 RIGELRLFLLSVTFFSLSSLMCSLS-TNLDVLIFFRVVQGLMAGPLIPLSQSLLLRNYPP 137
++G RL L + S++ + + +LI R +QG A L ++ R P
Sbjct: 75 QLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPK 134

Query: 138 EKRTFALALWSMTVIIAPICGPILGGYICDNFSWGWIFLINVPMGIIVLTLCLTLLKGRE 197
E R A L V + GP +GG I W +L+ +PM I+ L L +E
Sbjct: 135 ENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWS--YLLLIPMITIITVPFLMKLLKKE 192

Query: 198 TETSPVKMNLPGLTLLVLGVGGLQIMLDKGRDLDWFNSSTIIILTVVSVISLISLVIWES 257
++ G+ L+ +G+ + ML F +S I +VSV+S + V
Sbjct: 193 VRIKG-HFDIKGIILMSVGI--VFFML--------FTTSYSISFLIVSVLSFLIFVKHIR 241

Query: 258 TSENPILDLSLFKSRNFTIGIVSITCAYLFYSGAIVLMPQLLQETMGYNAIWAGLAYAPI 317
+P +D L K+ F IG++ + +G + ++P ++++ + G
Sbjct: 242 KVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFP 301

Query: 318 GIMPLLIS-PLIGRYGNKIDMRLLVTFSFLMYAVCYYWRSVTFMPTIDFTGIILPQFFQG 376
G M ++I + G ++ ++ +V + S T F II+ G
Sbjct: 302 GTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGG 361

Query: 377 FAVACFFLPLTTISFSGLPDNKFANASSMSNFFRTLSGSVGTSL 420
+ ++TI S L + S+ NF LS G ++
Sbjct: 362 LSFTK--TVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAI 403


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_2518  RTXTOXIND     78     5e-18    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 78.3 bits (193), Expect = 5e-18
Identities = 61/413 (14%), Positives = 122/413 (29%), Gaps = 96/413 (23%)

Query: 13 RKKYFALLAVVLFIAFSGAYAYWSMELEDMISTDDAYVT-GNADPISAQVSGSVTVVNHK 71
R+ ++ F+ + + ++E + + + G + I + V + K
Sbjct: 55 RRPRLVAYFIMGFLVIAFILSVLG-QVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVK 113

Query: 72 DTNYVRQGDILVSLDKTDATIALNKA---------------------------------- 97
+ VR+GD+L+ L A K
Sbjct: 114 EGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDE 173

Query: 98 ------------------KNNLANIVRQTNKLYLQDKQYSAEVASARIQ---YQQSLEDY 136
K + Q + L + AE + + Y+
Sbjct: 174 PYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVE 233

Query: 137 NRRV----PLAKQGVISKE----------TLEHTKDTLISSKAALNAAIQAYKANKALVM 182
R+ L + I+K + S + + I + K LV
Sbjct: 234 KSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVT 293

Query: 183 N-------TPLNR-QPQVVEAADATKEAWLALKRTDIKSPVTGYIAQRSVQ-VGETVSPG 233
L + + + + + I++PV+ + Q V G V+
Sbjct: 294 QLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTA 353

Query: 234 QSLMAVVPARQ-MWVNANFKETQLTDVRIGQSVNIISDLYGENVVFHGRVTGINMGTGNA 292
++LM +VP + V A + + + +GQ+ I + F G +G
Sbjct: 354 ETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVE------AFPYTRYGYLVGK--- 404

Query: 293 FSLLPAQNATGNWIKIVQRVPVEVSLDPKELMEH----PLRIGLSMTATIDTR 341
+ + +V V +S++ L PL G+++TA I T
Sbjct: 405 VKNINLDAIEDQRLGLVFNVI--ISIEENCLSTGNKNIPLSSGMAVTAEIKTG 455


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_2519  HTHFIS        49     3e-09    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 49.1 bits (117), Expect = 3e-09
Identities = 22/148 (14%), Positives = 53/148 (35%), Gaps = 31/148 (20%)

Query: 4 IIIDDHPLAIAAIRNLLIKNDIEILAELTEGGSAVQRVETLKPDIVIIDVDIPGVNGIQV 63
++ DD + L + ++ + + + + D+V+ DV +P N +
Sbjct: 7 LVADDDAAIRTVLNQALSRAGYDVRIT-SNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 64 LETLRKRQYSGIIIIVSAKNDHFYGKHCADAGANGFVSKKEGMNNIIAAIEAAKNGYCYF 123
L ++K + ++++SA+N + AI+A++ G +
Sbjct: 66 LPRIKKARPDLPVLVMSAQNT------------------------FMTAIKASEKGAYDY 101

Query: 124 ---PFSLNRFVGSLTSDQQKLDSLSKQE 148
PF L + + L ++
Sbjct: 102 LPKPFDLTE---LIGIIGRALAEPKRRP 126


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_2520  HTHFIS        79     2e-17    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 79.5 bits (196), Expect = 2e-17
Identities = 31/105 (29%), Positives = 51/105 (48%)

Query: 890 SILIADDHPTNRLLLKRQLNLLGYDVDEATDGVQALHKISMQHYDLLITDVNMPNMDGFE 949
+IL+ADD R +L + L+ GYDV ++ I+ DL++TDV MP+ + F+
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 950 LTRKLREQNSSLPIWGLTANAQANEREKGLNCGMNLCLFKPLTLD 994
L ++++ LP+ ++A K G L KP L
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLT 109


97  EcSMS35_2803  EcSMS35_2809  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
EcSMS35_2803  -1      12       2.607395   major facilitator family transporter
EcSMS35_2804  -1      13       2.083167   branched chain amino acid ABC transporter
EcSMS35_2805  -2      14       1.696451   hypothetical protein
EcSMS35_2806  -2      12       1.288065   transcriptional repressor MprA
EcSMS35_2807  -1      13       1.720525   multidrug resistance protein A
EcSMS35_2808  -1      15       1.682368   multidrug resistance protein B
EcSMS35_2809  0       16       -0.153742  S-ribosylhomocysteinase
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_2803  TCRTETB       43     1e-06    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 43.3 bits (102), Expect = 1e-06
Identities = 32/165 (19%), Positives = 69/165 (41%), Gaps = 2/165 (1%)

Query: 34 LDTIARNFSLSASSAGFIVTAAQLGYAAGLLFLVPLGDMFERRRLIVSMTLLAAGGMLIT 93
L IA +F+ +S ++ TA L ++ G L D +RL++ ++ G +I
Sbjct: 37 LPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIG 96

Query: 94 ASSQSLA-MMILGTALTGLFSVVAQILVPLA-ATLASPDKRGKVVGTIMSGLLLGILLAR 151
S ++I+ + G + LV + A + RGK G I S + +G +
Sbjct: 97 FVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGP 156

Query: 152 TVAGLLASLGGWRTVFWVASVLMALMALALWRGLPQMKSETHLNY 196
+ G++A W + + + + + + +++ + H +
Sbjct: 157 AIGGMIAHYIHWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDI 201


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_2806  PF05272       28     0.018    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 28.5 bits (63), Expect = 0.018
Identities = 23/94 (24%), Positives = 36/94 (38%), Gaps = 12/94 (12%)

Query: 23 PYQEILLTRLCMHMQSKLLENRNKMLKAQGINETLFMALITLESQENHSIQPSELSCALG 82
P QE+ L + + L R A+G + + T + ++L ALG
Sbjct: 756 PEQELRLVETGVQGRLWALLTREGAPAAEGAAQKGYSVNTTFVTI-------ADLVQALG 808

Query: 83 -----SSRTNATRIADELEKRGWIERRESDNDRR 111
SS ++ D L + GW RE+ RR
Sbjct: 809 ADPGKSSPMLEGQVRDWLNENGWEYLRETSGQRR 842


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_2807  RTXTOXIND     79     5e-18    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 78.7 bits (194), Expect = 5e-18
Identities = 64/412 (15%), Positives = 120/412 (29%), Gaps = 97/412 (23%)

Query: 25 LLLTLLFIIIAVAIGIYWFLVLRHFEETDDA----YVAGNQIQIMSQVSGSVTKVWADNT 80
L FI+ + I VL E A +G +I + V ++
Sbjct: 57 PRLVAYFIMGFLVIAFILS-VLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEG 115

Query: 81 DFVKEGDVLVTLDPTDARQAFEKA------------------------------------ 104
+ V++GDVL+ L A K
Sbjct: 116 ESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPY 175

Query: 105 ----------------KTALASSVRQTHQLMINSKQLQANIEVQKIALAKA-------QS 141
K ++ Q +Q +N + +A + + +S
Sbjct: 176 FQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKS 235

Query: 142 DYNRRVPLGNANLIGREELQHARDAVTSAQAQLDVAIQQYNANQAMILGTKLEDQPAVQQ 201
+ L + I + + + A +L V Q ++ IL K E Q Q
Sbjct: 236 RLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQL 295

Query: 202 AATEVRN------------------AWLALERTRIVSPMTGYVSRRAVQ-PGAQISPTTP 242
E+ + + + I +P++ V + V G ++
Sbjct: 296 FKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAET 355

Query: 243 LMAVVPA-TNMWVDANFKETQIANMRIGQPVTITTDIYGDDVKY---TGKVVGLDMGTGS 298
LM +VP + V A + I + +GQ I + + +Y GKV + +
Sbjct: 356 LMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAF-PYTRYGYLVGKVKNI-----N 409

Query: 299 AFSLLPAQNATGNWIKVVQRLPVRIELDQKQLEQYPLRIGLSTLVSVNTTNR 350
++ G V+ + + PL G++ + T R
Sbjct: 410 LDAIE--DQRLGLVFNVIISIEENCLST--GNKNIPLSSGMAVTAEIKTGMR 457


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_2808  TCRTETB       132    9e-36    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 132 bits (333), Expect = 9e-36
Identities = 97/405 (23%), Positives = 169/405 (41%), Gaps = 23/405 (5%)

Query: 17 IALSLATFMQVLDSTIANVAIPTIAGNLGSSLSQGTWVITSFGVANAISIPLTGWLAKRV 76
I L + +F VL+ + NV++P IA + + WV T+F + +I + G L+ ++
Sbjct: 17 IWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQL 76

Query: 77 GEVKLFLWSTIAFAIASWACGVS-SSLNMLIFFRVIQGIVAGPLIPLSQSLLLNNYPPAK 135
G +L L+ I S V S ++LI R IQG A L ++ P
Sbjct: 77 GIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKEN 136

Query: 136 RSIALALWSMTVIVAPICGPILGGYISDNYHWGWIFFINVPIGVAVVLMTLQTLRGRETR 195
R A L V + GP +GG I+ HW + + +P+ + + L L +E R
Sbjct: 137 RGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWS--YLLLIPMITIITVPFLMKLLKKEVR 194

Query: 196 TERRRIDAVGLALLVIGIGSLQIMLDRGKELDWFSSQEIIILTVVAVVAICFLIVWELTD 255
+ D G+ L+ +GI + ML F++ I +V+V++ +
Sbjct: 195 I-KGHFDIKGIILMSVGI--VFFML--------FTTSYSISFLIVSVLSFLIFVKHIRKV 243

Query: 256 DNPIVDLSLFKSRNFTIGCLCISLAYMLYFGAIVLLPQLLQEVYGYTATWAGLASAPVGI 315
+P VD L K+ F IG LC + + G + ++P ++++V+ + G G
Sbjct: 244 TDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGT 303

Query: 316 IPVILS-PIIGRFAHKLDMRRLVTFSFIMYAVCFYWRAYTFEPGMDFGASAWPQFIQGF- 373
+ VI+ I G + ++ +V F ++ S + I F
Sbjct: 304 MSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFL-----LETTSWFMTIIIVFV 358

Query: 374 --AVACFFMPLTTITLSGLPPERLAAASSLSNFTRTLAGSIGTSI 416
++ ++TI S L + A SL NFT L+ G +I
Sbjct: 359 LGGLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAI 403


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_2809  LUXSPROTEIN   293    e-105    Bacterial autoinducer-2 (AI-2) production protein Lu...
		>LUXSPROTEIN#Bacterial autoinducer-2 (AI-2) production protein LuxS signature.

Length = 171

Score = 293 bits (751), Expect = e-105
Identities = 132/170 (77%), Positives = 148/170 (87%)

Query: 2 PLLDSFTVDHTRMEAPAVRVAKTMNTPHGDAITVFDLRFCVPNKEVMPERGIHTLEHLFA 61
PLLDSFTVDHTRM APAVRVAKTM TP GD ITVFDLRF PNK+++ E+GIHTLEHL+A
Sbjct: 1 PLLDSFTVDHTRMNAPAVRVAKTMQTPKGDTITVFDLRFTAPNKDILSEKGIHTLEHLYA 60

Query: 62 GFMRNHLNGNGVEIIDISPMGCRTGFYMSLIGTPDEQRVADAWKAAMEDVLKVQDQNQIP 121
GFMRNHLNG+ VEIIDISPMGCRTGFYMSLIGTP EQ+VADAW AAMEDVLKV++QN+IP
Sbjct: 61 GFMRNHLNGDSVEIIDISPMGCRTGFYMSLIGTPSEQQVADAWIAAMEDVLKVENQNKIP 120

Query: 122 ELNVYQCGTYQMHSLQEAQDIARNILERDVRINSNEELALPKEKLQELHI 171
ELN YQCGT MHSL EA+ IA+NILE V +N N+ELALP+ L+EL I
Sbjct: 121 ELNEYQCGTAAMHSLDEAKQIAKNILEVGVAVNKNDELALPESMLRELRI 170


98  EcSMS35_3180  EcSMS35_3187  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
EcSMS35_3180  4       21       2.034300   aerobactin siderophore biosynthesis protein
EcSMS35_3181  4       23       1.307006   aerobactin siderophore biosynthesis protein
EcSMS35_3182  5       24       0.539837   aerobactin siderophore biosynthesis protein
EcSMS35_3183  6       30       -0.282217  transport protein ShiF
EcSMS35_3187  4       31       -2.494703  putative immunoglobuling-binding protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
EcSMS35_3180  PF04183  823    0.0      IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 823 bits (2128), Expect = 0.0
Identities = 577/580 (99%), Positives = 580/580 (100%)

Query: 1 MNHKDWDLVNRRLVAKMLSELEYEQVFHAESQGDDRYCINLPGAQWRFIAERGIWGWLWI 60
MNHKDWDLVNRRLVAKMLSELEYEQVFHAESQGDDRYCINLPGAQWRFIAERGIWGWLWI
Sbjct: 1 MNHKDWDLVNRRLVAKMLSELEYEQVFHAESQGDDRYCINLPGAQWRFIAERGIWGWLWI 60

Query: 61 DAQTLRCADEPVLAQTLLMQLKQVLSMSDATVAEHMQDLYSTLLGDLQLLKARRGLSASD 120
DAQTLRCADEPVLAQTLLMQLKQVLSMSDATVAEHMQDLY+TLLGDLQLLKARRGLSASD
Sbjct: 61 DAQTLRCADEPVLAQTLLMQLKQVLSMSDATVAEHMQDLYATLLGDLQLLKARRGLSASD 120

Query: 121 LINLSADRLQCLLSGHPKFVFNKGRRGWGKEALERYAPEYANTFRLHWLAVKREHMIWRC 180
LINL+ADRLQCLLSGHPKFVFNKGRRGWGKEALERYAPEYANTFRLHWLAVKREHMIWRC
Sbjct: 121 LINLNADRLQCLLSGHPKFVFNKGRRGWGKEALERYAPEYANTFRLHWLAVKREHMIWRC 180

Query: 181 DNEMDIHQLLTAAMDPQEFARFSQVWQENGLDHNWLPLPVHPWQWQQKIATDFIADFAEG 240
DNEMDIHQLLTAAMDPQEFARFSQVWQENGLDHNWLPLPVHPWQWQQKIATDFIADFAEG
Sbjct: 181 DNEMDIHQLLTAAMDPQEFARFSQVWQENGLDHNWLPLPVHPWQWQQKIATDFIADFAEG 240

Query: 241 RMVSLGEFGDQWLAQQSLRTLTNASRRGGLDIKLPLTIYNTSCYRGIPGRYIAAGPLASR 300
RMVSLGEFGDQWLAQQSLRTLTNASRRGGLDIKLPLTIYNTSCYRGIPGRYIAAGPLASR
Sbjct: 241 RMVSLGEFGDQWLAQQSLRTLTNASRRGGLDIKLPLTIYNTSCYRGIPGRYIAAGPLASR 300

Query: 301 WLQQVFATDATLVQSGAVILGEPAAGYVSHEGYAALARAPYRYQEMLGVIWRENPCRWLK 360
WLQQVFATDATLVQSGAVILGEPAAGYVSHEGYAALARAPYRYQEMLGVIWRENPCRWLK
Sbjct: 301 WLQQVFATDATLVQSGAVILGEPAAGYVSHEGYAALARAPYRYQEMLGVIWRENPCRWLK 360

Query: 361 PDESPVLMATLMECDENDQPLAGAYIDRSGLDAETWLTQLFRVVVVPLYHLLCRYGVALI 420
PDESPVLMATLMECDEN+QPLAGAYIDRSGLDAETWLTQLFRVVVVPLYHLLCRYGVALI
Sbjct: 361 PDESPVLMATLMECDENNQPLAGAYIDRSGLDAETWLTQLFRVVVVPLYHLLCRYGVALI 420

Query: 421 AHGQNITLAMKEGVPQRVLLKDFQGDMRLVKEEFPEMDSLPQEVRDVTSRLSADYLIHDL 480
AHGQNITLAMKEGVPQRVLLKDFQGDMRLVKEEFPEMDSLPQEVRDVTSRLSADYLIHDL
Sbjct: 421 AHGQNITLAMKEGVPQRVLLKDFQGDMRLVKEEFPEMDSLPQEVRDVTSRLSADYLIHDL 480

Query: 481 QTGHFVTVLRFISPLMVRLGVPERRFYQLLAAVLSDYMKKHPQMSERFALFSLFRPQIIR 540
QTGHFVTVLRFISPLMVRLGVPERRFYQLLAAVLSDYMKKHPQMSERFALFSLFRPQIIR
Sbjct: 481 QTGHFVTVLRFISPLMVRLGVPERRFYQLLAAVLSDYMKKHPQMSERFALFSLFRPQIIR 540

Query: 541 VVLNPVKLTWPDLDGGSRMLPNYLEDLQNPLWLVTQEYES 580
VVLNPVKLTWPDLDGGSRMLPNYLEDLQNPLWLVTQEYES
Sbjct: 541 VVLNPVKLTWPDLDGGSRMLPNYLEDLQNPLWLVTQEYES 580


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
EcSMS35_3182  PF04183  332    e-109    IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 332 bits (853), Expect = e-109
Identities = 110/474 (23%), Positives = 181/474 (38%), Gaps = 45/474 (9%)

Query: 37 ELIIPLDEQKSLHFRVAYFSPTQHHRF-----AFPAHLVTASGSYPVDFTTLSRLIIDKL 91
E + + Q + + P RF + + A D L++ ++ +L
Sbjct: 24 EQVFHAESQGDDRYCIN--LPGAQWRFIAERGIWGWLWIDAQTLRCADEPVLAQTLLMQL 81

Query: 92 RHQLFLPVPLCETFHQRVLESYAHTQQTIDARHDWAILREKALNFGEAEQALLTGHAFHP 151
+ L + Q + + Q + AR + LN + Q LL+GH
Sbjct: 82 KQVLSMSDATVAEHMQDLYATLLGDLQLLKARRGLSASDLINLNA-DRLQCLLSGHPKFV 140

Query: 152 APKSHEPFNRQEAERYLPDMAPHFPLRWFSVDKTQIAGES-LHLNLQQRLTRFAAENAPQ 210
K + ++ ERY P+ A F L W +V + + +++ Q LT A PQ
Sbjct: 141 FNKGRRGWGKEALERYAPEYANTFRLHWLAVKREHMIWRCDNEMDIHQLLT---AAMDPQ 197

Query: 211 LLNELS--------DNQWLF-PLHPWQGEYLLQQVWCQALFAKGLIRDLGEAGTSWLPTT 261
S D+ WL P+HPWQ + + + A FA+G + LGE G WL
Sbjct: 198 EFARFSQVWQENGLDHNWLPLPVHPWQWQQKIATDFI-ADFAEGRMVSLGEFGDQWLAQQ 256

Query: 262 SSRSLYCATSRD--MIKFSLSVRLTNSVRTLSVKEVERGMRLARLAQ----TDGWQMLQA 315
S R+L A+ R IK L++ T+ R + + + G +R Q TD +
Sbjct: 257 SLRTLTNASRRGGLDIKLPLTIYNTSCYRGIPGRYIAAGPLASRWLQQVFATDATLVQSG 316

Query: 316 ----RFPTFRVMQEDGWAGLRDLNGNIMQESLFSLRENLLLEQPQSQTNVLVSLTQAAPD 371
P + +G+A L + REN ++ VL++ +
Sbjct: 317 AVILGEPAAGYVSHEGYAALARAPYRYQEMLGVIWRENPCRWLKPDESPVLMATLMECDE 376

Query: 372 GGDSLLVSAVKRLSDRLGITVQQAAHAWVDAYCQQVLKPLFTAEADYGLVLLAHQQNILV 431
L + + DR G+ A W+ + V+ PL+ YG+ L+AH QNI +
Sbjct: 377 NNQPLAGAYI----DRSGLD----AETWLTQLFRVVVVPLYHLLCRYGVALIAHGQNITL 428

Query: 432 QMLGDLPVGFIYRDCQGSAFMPHATEWLDTIDEAQAENIFTREQLLRYFPYYLL 485
M +P + +D QG M E +D E R+ R YL+
Sbjct: 429 AMKEGVPQRVLLKDFQGD--MRLVKEEFPEMDSLPQE---VRDVTSRLSADYLI 477


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
EcSMS35_3183  TCRTETA  46     2e-07    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 46.0 bits (109), Expect = 2e-07
Identities = 84/380 (22%), Positives = 135/380 (35%), Gaps = 49/380 (12%)

Query: 20 FSAGLLGIGQNGLLVVLPVLVIQTNLSLSV---WAALLMLGSMLFLPSSPWWGKQISLTG 76
+ L +G ++ VLP L+ S V + LL L +++ +P G G
Sbjct: 12 STVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFG 71

Query: 77 SKTVVLWALGGYGVSFTLLGLGSVLMATGAVTKAVGLGILIIARIVYGLTVSAMVPACQV 136
+ V+L +L G V + ++ L +L I RIV G+T + A
Sbjct: 72 RRPVLLVSLAGAAVDYAIMATAPFLW------------VLYIGRIVAGITGATGAVAGAY 119

Query: 137 WALQRAGEGNRMAALATISSGLSCGRLFGPLCAAAMLVIHPLAPVWM--LMAAPALALVM 194
A R +S+ G + GP+ M P AP + +
Sbjct: 120 IA-DITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGC 178

Query: 195 LLRLPGTPPQPTPERK-------SVSLKRDFLPYLLCAMLLAAAMSMMQLGLSPAL---- 243
L + P R+ S R A L+A M +G PA
Sbjct: 179 FLLPESHKGERRPLRREALNPLASFRWARGMTVV---AALMAVFFIMQLVGQVPAALWVI 235

Query: 244 --TRQFATDTTTISQQVAWLLGLSAIA-ALIAQFVVLRPQRLTPVALLLSAGVLMSSGLA 300
+F D TTI +A L ++A A+I V R AL+L + +
Sbjct: 236 FGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERR--ALMLGMIADGTGYIL 293

Query: 301 IMLAEQLWLFYLGCAVLSFGAALATPAYQLLLNDKLADGAGAGWLACSHTLGYGLCALLV 360
+ A + W+ + +L+ G + PA Q +L+ + D G L L
Sbjct: 294 LAFATRGWMAFPIMVLLASG-GIGMPALQAMLS-RQVDEERQGQLQ----------GSLA 341

Query: 361 PLVSKTGVAIALIVMALFAA 380
L S T + L+ A++AA
Sbjct: 342 ALTSLTSIVGPLLFTAIYAA 361


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits       Score  E-value  Comments
EcSMS35_3187  OMADHESIN  90     4e-22    Yersinia outer membrane adhesin signature.
		>OMADHESIN#Yersinia outer membrane adhesin signature.

Length = 455

Score = 90.3 bits (223), Expect = 4e-22
Identities = 89/331 (26%), Positives = 141/331 (42%), Gaps = 30/331 (9%)

Query: 67 AVGSGAAILDADKSMAVGNNTAVFNADNSVALGYGSQVNGESNVLSVGAGPSGY-----G 121
V GA +D +AVG N+ +A NSVA+G+ S V S+ G
Sbjct: 127 GVAIGARASTSDTGVAVGFNSKA-DAKNSVAIGHSSHVAANHG-YSIAIGDRSKTDRENS 184

Query: 122 FSVDGAPETRRIINVSDGVKDSDAATKGQMDNAIADAVRESGDALRGEIGAVYRDAVADA 181
S+ R++ +++ G KD+DA Q+ I + + A +
Sbjct: 185 VSIGHESLNRQLTHLAAGTKDTDAVNVAQLKKEIEKTQENTNKRSAELLANANAYADNKS 244

Query: 182 KSRVESAENRLNGNITAARASAQEYTDAVKSDVLDETRTYTDSSVRTVRNEVKSQAEHLS 241
S + A N + +A++ A DVL+ + +++S RT + A ++
Sbjct: 245 SSVLGIANNYTDSKSAETLENARKEAFAQSKDVLNMAKAHSNSVARTTLETAEEHANSVA 304

Query: 242 DVLVK------NRAQTDAAIASNTAAIRNNSHRLDLTEAWQKMAT--------------- 280
++ N+ +A ++N A +SH L ++ +
Sbjct: 305 RTTLETAEEHANKKSAEALASANVYADSKSSHTLKTANSYTDVTVSNSTKKAIRESNQYT 364

Query: 281 -ERMNNMQEQIKENRKELRESAAQSAALAGLFQPYSVGKFNATAAVGGYRDEQAIAVGVG 339
+ + ++ + + + A SAAL LFQPY VGK N TA VGGYR QA+A+G G
Sbjct: 365 DHKFRQLDNRLDKLDTRVDKGLASSAALNSLFQPYGVGKVNFTAGVGGYRSSQALAIGSG 424

Query: 340 YRFTENVAGKVAVA-AGGSSASWNAGVNFEF 369
YR ENVA K VA AG S +NA N E+
Sbjct: 425 YRVNENVALKAGVAYAGSSDVMYNASFNIEW 455


99  EcSMS35_3241  EcSMS35_3251  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias   Product
EcSMS35_3241  -1      20       5.551602  general secretion pathway protein GspJ
EcSMS35_3242  -1      20       4.855929  general secretion pathway protein GspI
EcSMS35_3243  -1      17       4.082089  general secretion pathway protein GspH
EcSMS35_3244  -2      16       3.476921  general secretion pathway protein GspG
EcSMS35_3245  -2      14       3.281853  general secretion pathway protein GspF
EcSMS35_3246  -1      13       1.518636  general secretory pathway protein GspE
EcSMS35_3247  -2      13       0.839538  general secretion pathway protein GspD
EcSMS35_3248  -3      11       0.659383  putative type II secretion protein GspC
EcSMS35_3249  -3      13       0.922577  putative lipoprotein
EcSMS35_3250  -3      12       1.640482  leader peptidase PppA
EcSMS35_3251  -3      13       2.108277  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3241  BCTERIALGSPG  29     0.010    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 28.7 bits (64), Expect = 0.010
Identities = 15/46 (32%), Positives = 22/46 (47%), Gaps = 2/46 (4%)

Query: 1 MRRARAGFTLLEMLVAIAIFASLA-LMAQQVTNGVTRVNSAVAGHD 45
+ R GFTLLE++V I I LA L+ + + + A D
Sbjct: 4 TDKQR-GFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSD 48


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3242  PilS_PF08805  34     9e-05    PilS N terminal
		>PilS_PF08805#PilS N terminal

Length = 185

Score = 33.8 bits (77), Expect = 9e-05
Identities = 15/52 (28%), Positives = 26/52 (50%)

Query: 3 RGFTLLEVILALAIFALAATAVLQIASGALSNQQILEEKTVAGWVAENQTAL 54
+G TL+EV+L + + + A + ++ S SN Q E+ V N +L
Sbjct: 26 KGATLMEVLLVVGVIVVLAASAYKLYSMVQSNIQSSNEQNNVLTVIANMKSL 77


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3243  BCTERIALGSPH  78     2e-20    Bacterial general secretion pathway protein H signa...
		>BCTERIALGSPH#Bacterial general secretion pathway protein H signature.

Length = 170

Score = 77.7 bits (191), Expect = 2e-20
Identities = 38/166 (22%), Positives = 65/166 (39%), Gaps = 28/166 (16%)

Query: 1 MPERGFTLLEIMLVIFLIGLASAGVVQTFATASEPPAKKAAQDFLTRFAQFKDRAVIEGQ 60
M +RGFTLLE+ML++ L+G+++ V+ F + + A + F + + R + GQ
Sbjct: 1 MRQRGFTLLEMMLILLLMGVSAGMVLLAFPASRDDSAAQTLARFEAQLRFVQQRGLQTGQ 60

Query: 61 TLGVLIDPPGYQFMQRRHGQWLPVSATRLSAQVTVPKQVQMLLQPGSDIWQKEYALELQR 120
GV + P +QF+ + P D W L L+
Sbjct: 61 FFGVSVHPDRWQFLVLEARDGADPA-------------------PADDGWSGYRWLPLRA 101

Query: 121 RRL----TLHDIELEL-----QKEAKKKTPQIRFSPFEPVTPFTLR 157
R+ ++ +L L + P + P +TPF L
Sbjct: 102 GRVATSGSIAGGKLNLAFAQGEAWTPGDNPDVLIFPGGEMTPFRLT 147


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3244  BCTERIALGSPG  218    2e-76    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 218 bits (556), Expect = 2e-76
Identities = 91/146 (62%), Positives = 109/146 (74%), Gaps = 3/146 (2%)

Query: 6 RTQKPRAGFTLLEVMVVIVILGVLASLVVPNLLGNKEKADRQKAISDIVALENALDMYRL 65
R + GFTLLE+MVVIVI+GVLASLVVPNL+GNKEKAD+QKA+SDIVALENALDMY+L
Sbjct: 2 RATDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYKL 61

Query: 66 DNGRYPTTEQGLEALIQQPANMADARNYRTGGYIKRLPKDPWGNDYQYLSPGEKGLFDVY 125
DN YPTT QGLE+L++ P A NY GYIKRLP DPWGNDY ++PGE G +D+
Sbjct: 62 DNHHYPTTNQGLESLVEAPTLPPLAANYNKEGYIKRLPADPWGNDYVLVNPGEHGAYDLL 121

Query: 126 TLGADGQENGEGAGADIGNWNLQEFQ 151
+ G DG+ E DI NW L + +
Sbjct: 122 SAGPDGEMGTED---DITNWGLSKKK 144


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3245  BCTERIALGSPF  454    e-161    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 454 bits (1170), Expect = e-161
Identities = 225/406 (55%), Positives = 300/406 (73%), Gaps = 1/406 (0%)

Query: 1 MALFYYQALERNGRKTKGMIEADSARHARQLLRGKDLIPVHI-EARMNASAGGMLQRRRH 59
MA ++YQAL+ G+K +G EADSAR ARQLLR + L+P+ + E R + G
Sbjct: 1 MAQYHYQALDAQGKKCRGTQEADSARQARQLLRERGLVPLSVDENRGDQQKSGSTGLSLR 60

Query: 60 AHRRVAAADLALFTRQLATLVQAAMPLETCLQAVSEQSEKLHVKSLGMALRSRIQEGYTL 119
R++ +DLAL TRQLATLV A+MPLE L AV++QSEK H+ L A+RS++ EG++L
Sbjct: 61 RKIRLSTSDLALLTRQLATLVAASMPLEEALDAVAKQSEKPHLSQLMAAVRSKVMEGHSL 120

Query: 120 SDSLREHPRVFDSLFCSMVAAGEKSGHLDVVLNRLADYTEQRQRLKSRLLQAMLYPLVLL 179
+D+++ P F+ L+C+MVAAGE SGHLD VLNRLADYTEQRQ+++SR+ QAM+YP VL
Sbjct: 121 ADAMKCFPGSFERLYCAMVAAGETSGHLDAVLNRLADYTEQRQQMRSRIQQAMIYPCVLT 180

Query: 180 VVATGVVTILLTAVVPKIIEQFDHLGHALPASTRMLIAMSDALQASGVYWLAGLLGLLVL 239
VVA VV+ILL+ VVPK++EQF H+ ALP STR+L+ MSDA++ G + L LL +
Sbjct: 181 VVAIAVVSILLSVVVPKVVEQFIHMKQALPLSTRVLMGMSDAVRTFGPWMLLALLAGFMA 240

Query: 240 GQRLLKNPAMRLRWDKTLLRLPVTGRVARGLNTARFSRTLSILTASSVPLLEGIQTAAAV 299
+ +L+ R+ + + LL LP+ GR+ARGLNTAR++RTLSIL AS+VPLL+ ++ + V
Sbjct: 241 FRVMLRQEKRRVSFHRRLLHLPLIGRIARGLNTARYARTLSILNASAVPLLQAMRISGDV 300

Query: 300 SANRYVEQQLLLAADRVREGSSLRAALADLRLFPPMMLYMIASGEQSGELETMLEQAAIN 359
+N Y +L LA D VREG SL AL LFPPMM +MIASGE+SGEL++MLE+AA N
Sbjct: 301 MSNDYARHRLSLATDAVREGVSLHKALEQTALFPPMMRHMIASGERSGELDSMLERAADN 360

Query: 360 QEREFDTQVGLALGLFEPALVVVMAGVVLFIVIAILEPMLQLNNMV 405
Q+REF +Q+ LALGLFEP LVV MA VVLFIV+AIL+P+LQLN ++
Sbjct: 361 QDREFSSQMTLALGLFEPLLVVSMAAVVLFIVLAILQPILQLNTLM 406


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3247  BCTERIALGSPD  576    0.0      Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D signature.

Length = 660

Score = 576 bits (1485), Expect = 0.0
Identities = 295/668 (44%), Positives = 431/668 (64%), Gaps = 34/668 (5%)

Query: 24 LLPLVLAAALCSSPVWAEEATFTANFKDTDLKSFIETVGANLNKTIIMGPGVQGKVSIRT 83
L L++ AAL P AEE F+A+FK TD++ FI TV NLNKT+I+ P V+G +++R+
Sbjct: 11 SLTLLIFAALLFRPAAAEE--FSASFKGTDIQEFINTVSKNLNKTVIIDPSVRGTITVRS 68

Query: 84 MTPLNERQYYQLFLNLLEAQGYAVVPMENDVLKVVKSSAAKVEPLPLVGEGSDNYAGDEM 143
LNE QYYQ FL++L+ G+AV+ M N VLKVV+S AK +P+ + + GDE+
Sbjct: 69 YDMLNEEQYYQFFLSVLDVYGFAVINMNNGVLKVVRSKDAKTAAVPVASDAAPG-IGDEV 127

Query: 144 VTKVVPVRNVSVRELAPILRQMIDSAGSGNVVNYDPSNVIMLTGRASVVERLTEVIQRVD 203
VT+VVP+ NV+ R+LAP+LRQ+ D+AG G+VV+Y+PSNV+++TGRA+V++RL +++RVD
Sbjct: 128 VTRVVPLTNVAARDLAPLLRQLNDNAGVGSVVHYEPSNVLLMTGRAAVIKRLLTIVERVD 187

Query: 204 HAGNRTEEVIPLDNASASEIARVLESLTKNSGENQ-PATLKSQIVADERTNSVIVSGDPA 262
+AG+R+ +PL ASA+++ +++ L K++ ++ P ++ + +VADERTN+V+VSG+P
Sbjct: 188 NAGDRSVVTVPLSWASAADVVKLVTELNKDTSKSALPGSMVANVVADERTNAVLVSGEPN 247

Query: 263 TRDKMRRLIRRLDSEMERRGNSQVFYLKYSKAEDLVDVLKQVSGTLTAAKEEAEGTVGSG 322
+R ++ +I++LD + +GN++V YLKY+KA DLV+VL +S T+ + K+ A+ +
Sbjct: 248 SRQRIIAMIKQLDRQQATQGNTKVIYLKYAKASDLVEVLTGISSTMQSEKQAAKPV-AAL 306

Query: 323 REVVSIAASKHSNALIVTAPQDIMQSLQSVIEQLDIRRAQVHVEALIVEVAEGSNINFGV 382
+ + I A +NALIVTA D+M L+ VI QLDIRR QV VEA+I EV + +N G+
Sbjct: 307 DKNIIIKAHGQTNALIVTAAPDVMNDLERVIAQLDIRRPQVLVEAIIAEVQDADGLNLGI 366

Query: 383 QWGSKDAGLMQFANGTQIPIGTLGAAISAAKPQKGSTVISENGATTINPDTNGDLST-LA 441
QW +K+AG+ QF N + +PI T A + +G +S+ LA
Sbjct: 367 QWANKNAGMTQFTN-SGLPISTAIAG-------------------ANQYNKDGTVSSSLA 406

Query: 442 QLLSGFSGTAVGVVKGDWMALVQAVKNDSSSNVLSTPSITTLDNQEAFFMVGQDVPVLTG 501
LS F+G A G +G+W L+ A+ + + +++L+TPSI TLDN EA F VGQ+VPVLTG
Sbjct: 407 SALSSFNGIAAGFYQGNWAMLLTALSSSTKNDILATPSIVTLDNMEATFNVGQEVPVLTG 466

Query: 502 STVGSNNSNPFNTVERKKVGIMLKVTPQINEGNAVQMVIEQEVSKVEGQTS-----LDVV 556
S S N FNTVERK VGI LKV PQINEG++V + IEQEVS V S L
Sbjct: 467 SQTTSG-DNIFNTVERKTVGIKLKVKPQINEGDSVLLEIEQEVSSVADAASSTSSDLGAT 525

Query: 557 FGERKLKTTVLANDGELIVLGGLMDDQAGESVAKVPLLGDIPLIGNLFKSTADKKEKRNL 616
F R + VL GE +V+GGL+D ++ KVPLLGDIP+IG LF+ST+ K KRNL
Sbjct: 526 FNTRTVNNAVLVGSGETVVVGGLLDKSVSDTADKVPLLGDIPVIGALFRSTSKKVSKRNL 585

Query: 617 MVFIRPTILRDGMAADGVSQRKYNYMRAEQIYR--DEQGLSLMPHTAQPILPAQNQALPP 674
M+FIRPT++RD S +Y Q + E +++ I P Q+ A
Sbjct: 586 MLFIRPTVIRDRDEYRQASSGQYTAFNDAQSKQRGKENNDAMLNQDLLEIYPRQDTAAFR 645

Query: 675 EVRAFLNA 682
+V A ++A
Sbjct: 646 QVSAAIDA 653


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3248  BCTERIALGSPC  117    3e-33    Bacterial general secretion pathway protein C signa...
		>BCTERIALGSPC#Bacterial general secretion pathway protein C signature.

Length = 272

Score = 117 bits (294), Expect = 3e-33
Identities = 67/286 (23%), Positives = 114/286 (39%), Gaps = 38/286 (13%)

Query: 40 IARGMFWLMLLIISAKVAHSLWRYFSFSAEYMAVSPSANKPLRADAKAFDKNDVQLISQQ 99
I R +F+L++L+ ++A WR S P +A + ND L
Sbjct: 14 IRRILFYLLMLLFCQQLAMIFWR-IGLPDNAPVSSVQIT-PAQARQQPVTLNDFTL---- 67

Query: 100 NWFGKYQPV--ATPVKQPEPAPVAETRLNVVLRGIAFG---ARPGAVIEEGGKQQVYLQG 154
FG A + + + + + LN+ L G+ G +R A+I + +Q
Sbjct: 68 --FGVSPEKNKAGALDASQMSNLPPSTLNLSLTGVMAGDDDSRSIAIISKDNEQFSRGVN 125

Query: 155 ETLGSHNAVIEEINRDHVMLRYQGKMERLSLAEEKRPTIAVTSKKAVSDEAKQAVAEPAA 214
E + +NA I I D V+L+YQG+ E L L + +
Sbjct: 126 EEVPGYNAKIVSIRPDRVVLQYQGRYEVLGLYSQ-----------------------EDS 162

Query: 215 SAPVEIPAAVRQAL-AKDPQKIFNYIQLTPVRKEG-IVGYAVKPGADRSLFDASGFKEGD 272
+ A V + L + + +Y+ +P+ + + GY + PG F G ++ D
Sbjct: 163 GSDGVPGAQVNEQLQQRASTTMSDYVSFSPIMNDNKLQGYRLNPGPKSDSFYRVGLQDND 222

Query: 273 IAIALNQQDFTDPRAMIALMRQLPSMDSIQLTVLRKGARYDISIAL 318
+A+ALN D D M ++ + + LTV R G R DI +
Sbjct: 223 MAVALNGLDLRDAEQAKKAMERMADVHNFTLTVERDGQRQDIYMEF 268


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3250  PREPILNPTASE  279    1e-96    Type IV prepilin cysteine protease (C20) family sig...
		>PREPILNPTASE#Type IV prepilin cysteine protease (C20) family signature.

Length = 290

Score = 279 bits (715), Expect = 1e-96
Identities = 109/271 (40%), Positives = 149/271 (54%), Gaps = 12/271 (4%)

Query: 1 MLFDVFQQYPTAMPVLATVGGLIIGSFLNVVIWRYPIML-RQQMAEFHGEMSSAQSKI-- 57
+L ++ P L + L+IGSFLNVVI R PIML R+ AE+ + +
Sbjct: 3 LLLELAHGLPWLYFSLVFLFSLMIGSFLNVVIHRLPIMLEREWQAEYRSYFNPDDEGVDE 62

Query: 58 ---SLALPRSHCPHCQQTIRIRDNIPLFSWLMLKGRCRDCQAKISKRYPLVELLTALAFL 114
+L +PRS CPHC I +NIPL SWL L+GRCR CQA IS RYPLVELLTAL +
Sbjct: 63 PPYNLMVPRSCCPHCNHPITALENIPLLSWLWLRGRCRGCQAPISARYPLVELLTALLSV 122

Query: 115 LASLVWPESGWALAVMILSAWLIAASVIDLDHQWLPDVFTQGVLWTGLSAAWAQQSPLTL 174
++ LA ++L+ L+A + IDLD LPD T +LW GL ++L
Sbjct: 123 AVAMTLAPGWGTLAALLLTWVLVALTFIDLDKMLLPDQLTLPLLWGGLLFNL-LGGFVSL 181

Query: 175 QDAVTGVLVGFIAFYSLRWIAGIVLRKEALGMGDVLLFAALGSWVGPLSLPNVALIASCC 234
DAV G + G++ +SL W ++ KE +G GD L AALG+W+G +LP V L++S
Sbjct: 182 GDAVIGAMAGYLVLWSLYWAFKLLTGKEGMGYGDFKLLAALGAWLGWQALPIVLLLSSLV 241

Query: 235 GLIYAVI-----TKRGSTTLPFGPCLSLGGI 260
G + S +PFGP L++ G
Sbjct: 242 GAFMGIGLILLRNHHQSKPIPFGPYLAIAGW 272


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
EcSMS35_3251  PF03544  49     6e-08    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 48.8 bits (116), Expect = 6e-08
Identities = 24/60 (40%), Positives = 30/60 (50%), Gaps = 3/60 (5%)

Query: 32 SSDTPPVDSGTGSLPEVKPDPTPNPEPTPEPTPDPEPTPEP---IPDPEPTPEPEPEPVP 88
S T + V+P P P EP PEP P PEP E I P+P P+P+P+PV
Sbjct: 50 ISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVK 109



Score = 41.9 bits (98), Expect = 1e-05
Identities = 16/92 (17%), Positives = 27/92 (29%), Gaps = 2/92 (2%)

Query: 33 SDTPPVDSGTGSLPEVKPDPTPNPEPTPEPTPDPEPTPEPIPDPEPTPEPEPEPVPTKTG 92
+D P + PE +P P PEP PEP + E + V
Sbjct: 58 ADLEPPQAVQ-PPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVEQPKR 116

Query: 93 YLTLGGSQRVTGATCNGESSDGFTFKPGEDVT 124
+ S+ + N + + +
Sbjct: 117 DVKPVESRPASPFE-NTAPARPTSSTATAATS 147



Score = 40.7 bits (95), Expect = 2e-05
Identities = 18/59 (30%), Positives = 23/59 (38%), Gaps = 2/59 (3%)

Query: 35 TPPVDSGTGSLPEVKPDPTPNPEPTPEPTPDPEPTPEPIPDP--EPTPEPEPEPVPTKT 91
P + +P +P PEP +PEP PEPIP+P E E K
Sbjct: 45 APAQPISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKP 103



Score = 40.3 bits (94), Expect = 3e-05
Identities = 20/96 (20%), Positives = 28/96 (29%), Gaps = 7/96 (7%)

Query: 29 SGSSSDTPPVDSGTGSLPEVKPDPTPNPEPT---PEPTPDPEPTPEPIPDPEPTPEPEPE 85
+ P V PE +P P P E +P P P+P P+P+ E
Sbjct: 65 AVQPPPEPVV----EPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVEQPKRDVKP 120

Query: 86 PVPTKTGYLTLGGSQRVTGATCNGESSDGFTFKPGE 121
R T +T +S T
Sbjct: 121 VESRPASPFENTAPARPTSSTATAATSKPVTSVASG 156



Score = 35.0 bits (80), Expect = 0.002
Identities = 17/40 (42%), Positives = 17/40 (42%)

Query: 50 PDPTPNPEPTPEPTPDPEPTPEPIPDPEPTPEPEPEPVPT 89
P P T D EP P PEP EPEPEP P
Sbjct: 44 PAPAQPISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPI 83



Score = 30.3 bits (68), Expect = 0.043
Identities = 11/40 (27%), Positives = 13/40 (32%)

Query: 52 PTPNPEPTPEPTPDPEPTPEPIPDPEPTPEPEPEPVPTKT 91
P P + + P P P P EPEP P
Sbjct: 44 PAPAQPISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPI 83


100  EcSMS35_3530  EcSMS35_3537  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
EcSMS35_3530  -2      17       -0.495791  serine endoprotease
EcSMS35_3531  -2      13       -0.454748  serine endoprotease
EcSMS35_3532  -1      13       -0.291892  malate dehydrogenase
EcSMS35_3533  -1      12       -0.627430  arginine repressor
EcSMS35_3534  -2      12       0.073901   protein YcfR
EcSMS35_3535  -1      12       1.027623   hypothetical protein
EcSMS35_3536  -3      11       1.596970   p-hydroxybenzoic acid efflux subunit AaeB
EcSMS35_3537  -1      10       1.759225   p-hydroxybenzoic acid efflux subunit AaeA
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits        Score  E-value  Comments
EcSMS35_3530  V8PROTEASE  72     6e-16    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 72.0 bits (176), Expect = 6e-16
Identities = 32/184 (17%), Positives = 63/184 (34%), Gaps = 38/184 (20%)

Query: 90 GLGSGVIINASKGYVLTNNHVINQAQKISIQL------------NDGREFDAKLIGSDDQ 137
+ SGV++ K +LTN HV++ L +G ++ +
Sbjct: 102 FIASGVVVG--KDTLLTNKHVVDATHGDPHALKAFPSAINQDNYPNGGFTAEQITKYSGE 159

Query: 138 SDIALLQIQN-------PSKLTQIAIADSDKLRVGDFAVAVGNPFGLGQTATSGIVSALG 190
D+A+++ + ++++ + +V G P V+ +
Sbjct: 160 GDLAIVKFSPNEQNKHIGEVVKPATMSNNAETQVNQNITVTGYPGDKP-------VATMW 212

Query: 191 RSGLNLEGLEN-FIQTDASINRGNSGGALLNLNGELIGINTAILAPGGGSVGIGFAIPSN 249
S + L+ +Q D S GNSG + N E+IGI+ G+
Sbjct: 213 ESKGKITYLKGEAMQYDLSTTGGNSGSPVFNEKNEVIGIHWG---------GVPNEFNGA 263

Query: 250 MART 253
+
Sbjct: 264 VFIN 267


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits        Score  E-value  Comments
EcSMS35_3531  V8PROTEASE  53     8e-10    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 52.7 bits (126), Expect = 8e-10
Identities = 31/160 (19%), Positives = 59/160 (36%), Gaps = 26/160 (16%)

Query: 77 RTLGSGVIMDQRGYIITNKHVINDADQIIVALQ------------DGRVFEALLVGSDSL 124
+ SGV++ + ++TNKHV++ AL+ +G +
Sbjct: 101 TFIASGVVVG-KDTLLTNKHVVDATHGDPHALKAFPSAINQDNYPNGGFTAEQITKYSGE 159

Query: 125 TDLAVLKI-------NATGGLPTIPINARRVPHIGDVVLAIGNPYNLGQTITQGIISATG 177
DLA++K + + ++ + + G P + T + G
Sbjct: 160 GDLAIVKFSPNEQNKHIGEVVKPATMSNNAETQVNQNITVTGYPGD-KPVATMW--ESKG 216

Query: 178 RIGLNPTGRQNFLQTDASINHGNSGGALVNSLGELMGINT 217
+I + +Q D S GNSG + N E++GI+
Sbjct: 217 KI---TYLKGEAMQYDLSTTGGNSGSPVFNEKNEVIGIHW 253


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3532  DHBDHDRGNASE  28     0.042    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 28.1 bits (62), Expect = 0.042
Identities = 37/167 (22%), Positives = 61/167 (36%), Gaps = 27/167 (16%)

Query: 3 VAVLGAAGGIGQALALLLKTQLPSGSELSLYDIAPVTPGVAVDLSHIPTAVKIKGFSGED 62
+ GAA GIG+A+A L G+ ++ D P V S A + F +
Sbjct: 11 AFITGAAQGIGEAVARTL---ASQGAHIAAVDYNP-EKLEKVVSSLKAEARHAEAFPADV 66

Query: 63 ATPA------------LEGADVVLISAGVARK------PGMDRSDLFNVNAGIVKNLVQQ 104
A + D+++ AGV R + F+VN+ V N +
Sbjct: 67 RDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRS 126

Query: 105 VAKTCPK----ACIGIITNPVNTT-VAIAAEVLKKAGVYDKNKLFGV 146
V+K + + + +NP ++AA KA K G+
Sbjct: 127 VSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGL 173


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3533  ARGREPRESSOR  169    4e-57    Bacterial arginine repressor signature.
		>ARGREPRESSOR#Bacterial arginine repressor signature.

Length = 149

Score = 169 bits (430), Expect = 4e-57
Identities = 44/141 (31%), Positives = 71/141 (50%), Gaps = 5/141 (3%)

Query: 15 KALLKEEKFSSQGEIVAALQEQGFDNINQSKVSRMLTKFGAVRTRNAKMEMVYCLPAELG 74
+ ++ + +Q E+V L++ G+ N+ Q+ VSR + + V+ Y LPA+
Sbjct: 11 REIITANEIETQDELVDILKKDGY-NVTQATVSRDIKELHLVKVPTNNGSYKYSLPADQR 69

Query: 75 VPTTSSPLKNLV---LDIDYNDAVVVIHTSPGAAQLIARLLDSLGKAEGILGTIAGDDTI 131
S ++L+ + ID ++V+ T PG AQ I L+D+L E I+GTI GDDTI
Sbjct: 70 FNPLSKLKRSLMDAFVKIDSASHLIVLKTMPGNAQAIGALMDNLDWEE-IMGTICGDDTI 128

Query: 132 FTTPANGFTVKDLYEAILELF 152
K + + ILEL
Sbjct: 129 LIICRTHDDTKVVQKKILELL 149


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits       Score  E-value  Comments
EcSMS35_3537  RTXTOXIND  54     2e-10    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 54.4 bits (131), Expect = 2e-10
Identities = 29/163 (17%), Positives = 59/163 (36%), Gaps = 16/163 (9%)

Query: 6 RKFSRTAITVVLVILAFIAIFNAWVYYTE----SPWTRDARFSADVVAIAPDVSGLITQV 61
SR V I+ F+ I + + S I P + ++ ++
Sbjct: 51 TPVSRRPRLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEI 110

Query: 62 NVHDNQLVKKGQVLFTIDQPR-------YQKALEEAQADVAYYQVLAQEKRQEAGRRNRL 114
V + + V+KG VL + Q +L +A+ + YQ+L++ E + L
Sbjct: 111 IVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRS--IELNKLPEL 168

Query: 115 GVQAMSREEIDQANNVL---QTVLHQLAKAQATRDLAKLDLER 154
+ + VL + Q + Q + +L+L++
Sbjct: 169 KLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDK 211



Score = 51.4 bits (123), Expect = 2e-09
Identities = 28/147 (19%), Positives = 54/147 (36%), Gaps = 15/147 (10%)

Query: 100 LAQEKRQEAGRRNRLGVQ-AMSREEIDQANNVLQT-VLHQLAKAQAT-------RDLAKL 150
E R + ++ + ++EE + + +L +L + +
Sbjct: 264 AVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEE 323

Query: 151 DLERTVIRAPADGWVTNLNVYT-GEFITRGSTAVALVKQNSFY-VLAYMEETKLEGVRPG 208
+ +VIRAP V L V+T G +T T + +V ++ V A ++ + + G
Sbjct: 324 RQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVG 383

Query: 209 YRAEIT----PLGSNKVLKGTVDSVAA 231
A I P L G V ++
Sbjct: 384 QNAIIKVEAFPYTRYGYLVGKVKNINL 410


101  EcSMS35_3556  EcSMS35_3562  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
EcSMS35_3556  -3      18       -2.761098  DNA-binding protein Fis
EcSMS35_3557  -2      17       -2.446964  putative methyltransferase
EcSMS35_3558  -2      18       -1.912088  hypothetical protein
EcSMS35_3559  -1      16       -1.142790  DNA-binding transcriptional regulator EnvR
EcSMS35_3560  -2      15       -0.708589  acriflavine resistance protein E
EcSMS35_3562  0       16       -1.323245  putative lipoprotein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3556  DNABINDNGFIS  157    3e-54    DNA-binding protein FIS signature.
		>DNABINDNGFIS#DNA-binding protein FIS signature.

Length = 98

Score = 157 bits (399), Expect = 3e-54
Identities = 98/98 (100%), Positives = 98/98 (100%)

Query: 1 MFEQRVNSDVLTVSTVNSQDQVTQKPLRDSVKQALKNYFAQLNGQDVNDLYELVLAEVEQ 60
MFEQRVNSDVLTVSTVNSQDQVTQKPLRDSVKQALKNYFAQLNGQDVNDLYELVLAEVEQ
Sbjct: 1 MFEQRVNSDVLTVSTVNSQDQVTQKPLRDSVKQALKNYFAQLNGQDVNDLYELVLAEVEQ 60

Query: 61 PLLDMVMQYTRGNQTRAALMMGINRGTLRKKLKKYGMN 98
PLLDMVMQYTRGNQTRAALMMGINRGTLRKKLKKYGMN
Sbjct: 61 PLLDMVMQYTRGNQTRAALMMGINRGTLRKKLKKYGMN 98


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
EcSMS35_3559  HTHTETR  130    4e-40    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 130 bits (329), Expect = 4e-40
Identities = 78/209 (37%), Positives = 123/209 (58%), Gaps = 3/209 (1%)

Query: 1 MAKRTKAEALKTRQELIETAIAQFAQHGVSKTTLNDIADAANVTRGAIYWHFENKTQLFN 60
MA++TK EA +TRQ +++ A+ F+Q GVS T+L +IA AA VTRGAIYWHF++K+ LF+
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60

Query: 61 EMW-LQQPSLRELIQEHLTAGLEHDPFQQLREKLIVGLQYIAKIPRQQALLKILYHKCEF 119
E+W L + ++ EL E A DP LRE LI L+ R++ L++I++HKCEF
Sbjct: 61 EIWELSESNIGELELE-YQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEF 119

Query: 120 NDEM-LAEGVIRDKMGFNPQTLHEVLQACQQQGCIANNLDLDVVMIIIDGAFSGIVQNWL 178
EM + + R+ + + + L+ C + + +L II+ G SG+++NWL
Sbjct: 120 VGEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWL 179

Query: 179 MNMAGYDLYKQAPALVDNVLRMFMPDENI 207
+DL K+A V +L M++ +
Sbjct: 180 FAPQSFDLKKEARDYVAILLEMYLLCPTL 208


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits       Score  E-value  Comments
EcSMS35_3560  RTXTOXIND  44     8e-07    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 43.7 bits (103), Expect = 8e-07
Identities = 39/220 (17%), Positives = 73/220 (33%), Gaps = 38/220 (17%)

Query: 98 ATYQASYDSAKGELAKSEAAAAIAHLTVKRYVPLVGTKYISQQEYDQAIADA-RQADAAV 156
K +L + E+ A + Q + I D RQ +
Sbjct: 262 VEAVNELRVYKSQLEQIESEILSAKEEYQLV----------TQLFKNEILDKLRQTTDNI 311

Query: 157 IAAKATVESARINLAYTKVTAPISGRIGK-STVTEGALVTNGQTTELATVQQLDPIYVDV 215
+ + + AP+S ++ + TEG +VT +T + V + D + V
Sbjct: 312 GLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETL-MVIVPEDDTLEVTA 370

Query: 216 TQSSND--FMRLKQSVEQGNLHKENATSNVELVMENGQTYP-LKGTLQ--FSDVTVDEST 270
+ D F+ + Q+ +++ Y L G ++ D D+
Sbjct: 371 LVQNKDIGFINVGQNAI------------IKVEAFPYTRYGYLVGKVKNINLDAIEDQRL 418

Query: 271 GSIT--LRAV------FPNPQHTLLPGMFVRARIDEGVQS 302
G + + ++ N L GM V A I G++S
Sbjct: 419 GLVFNVIISIEENCLSTGNKNIPLSSGMAVTAEIKTGMRS 458



Score = 34.0 bits (78), Expect = 0.001
Identities = 22/127 (17%), Positives = 43/127 (33%), Gaps = 13/127 (10%)

Query: 46 TAPLEVKTELPGR-TNAYRIAEVRPQVSGIVLSRNFTEGSDVQAGQSLYQIDPATYQASY 104
+E+ G+ T++ R E++P + IV EG V+ G L ++ +A
Sbjct: 77 LGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEA-- 134

Query: 105 DSAKGELAKSEAAAAIAHLTVKRYVPLVGTKYISQQEYDQAIADARQADAAVIAAKATVE 164
+ K++++ A L RY L E ++ +
Sbjct: 135 -----DTLKTQSSLLQARLEQTRYQIL-----SRSIELNKLPELKLPDEPYFQNVSEEEV 184

Query: 165 SARINLA 171
+L
Sbjct: 185 LRLTSLI 191


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits      Score  E-value  Comments
EcSMS35_3562  adhesinb  28     0.004    Adhesin B signature.
		>adhesinb#Adhesin B signature.

Length = 310

Score = 27.5 bits (61), Expect = 0.004
Identities = 14/68 (20%), Positives = 26/68 (38%), Gaps = 10/68 (14%)

Query: 1 MKR---LIPVALLTALLAGCAHDSPCVPVYDDQGRLVHTNTCMKGTTQDNWETAGAIAGG 57
MK+ L+ + L LA C+ + +V TN+ + T++ IAG
Sbjct: 1 MKKCRFLVLLLLAFVGLAACSSQKSSTETGSSKLNVVATNSIIADITKN-------IAGD 53

Query: 58 AAAVAGLT 65
+ +
Sbjct: 54 KINLHSIV 61


102  EcSMS35_3618  EcSMS35_3621  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
EcSMS35_3618  4       47       -1.559531  A24 family peptidase
EcSMS35_3617  6       53       -1.259951  bacterioferritin
EcSMS35_3619  7       56       -0.332505  bacterioferritin-associated ferredoxin
EcSMS35_3620  6       55       -0.174119  elongation factor Tu
EcSMS35_3621  4       45       -0.337111  elongation factor G
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3618  PREPILNPTASE  145    8e-46    Type IV prepilin cysteine protease (C20) family sig...
		>PREPILNPTASE#Type IV prepilin cysteine protease (C20) family signature.

Length = 290

Score = 145 bits (367), Expect = 8e-46
Identities = 65/143 (45%), Positives = 84/143 (58%), Gaps = 2/143 (1%)

Query: 3 ATLPFLILYACLSALLFFWDAKHGLLPDRFTCPLLWSGLLFYQVCNPDGLADALWGAIIG 62
TL L+L L AL F D LLPD+ T PLLW GLLF + L DA+ GA+ G
Sbjct: 133 GTLAALLLTWVLVALTFI-DLDKMLLPDQLTLPLLWGGLLFNLLGGFVSLGDAVIGAMAG 191

Query: 63 YGTFAVIYWGYRILRHKEGLGYGDVKFLAALGAWHTWTFLPRLVFLAASFACGAVVIGLL 122
Y +YW +++L KEG+GYGD K LAALGAW W LP +V L +S + IGL+
Sbjct: 192 YLVLWSLYWAFKLLTGKEGMGYGDFKLLAALGAWLGWQALP-IVLLLSSLVGAFMGIGLI 250

Query: 123 MRGKESLKNPLPFGPFLAAAGFV 145
+ P+PFGP+LA AG++
Sbjct: 251 LLRNHHQSKPIPFGPYLAIAGWI 273


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits         Score  E-value  Comments
EcSMS35_3617  HELNAPAPROT  38     3e-06    Helicobacter neutrophil-activating protein A family ...
		>HELNAPAPROT#Helicobacter neutrophil-activating protein A family signature.

Length = 153

Score = 38.3 bits (89), Expect = 3e-06
Identities = 19/103 (18%), Positives = 43/103 (41%), Gaps = 10/103 (9%)

Query: 44 EYHESIDEMKHADKYIERILFLEGIPN--LQDLGKL------GIGEDVEEMLQSDLRLEL 95
E ++ E D ER+L + G P +++ + G EM+Q+ +
Sbjct: 52 ELYDHAAE--TVDTIAERLLAIGGQPVATVKEYTEHASITDGGNETSASEMVQALVNDYK 109

Query: 96 EGAKDLREAIAYADSVHDYVSRDMMIEILADEEGHIDWLETEL 138
+ + + + I A+ D + D+ + ++ + E + L + L
Sbjct: 110 QISSESKFVIGLAEENQDNATADLFVGLIEEVEKQVWMLSSYL 152


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits       Score  E-value  Comments
EcSMS35_3620  TCRTETOQM  80     3e-18    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 79.5 bits (196), Expect = 3e-18
Identities = 57/198 (28%), Positives = 87/198 (43%), Gaps = 13/198 (6%)

Query: 13 VNVGTIGHVDHGKTTLTAAI------TTVLAKTYGGAARAFDQIDNAPEEKARGITINTS 66
+N+G + HVD GKTTLT ++ T L G R DN E+ RGITI T
Sbjct: 4 INIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRT----DNTLLERQRGITIQTG 59

Query: 67 HVEYDTPTRHYAHVDCPGHADYVKNMITGAAQMDGAILVVAATDGPMPQTREHILLGRQV 126
+ +D PGH D++ + + +DGAIL+++A DG QTR R++
Sbjct: 60 ITSFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKM 119

Query: 127 GVPYIIVFLNKCDMVDDEELLELVEMEVRELLSQYDFPGDDTPIVRGSALKALEGDAEWE 186
G+P I F+NK D + L V +++E LS + + +W+
Sbjct: 120 GIP-TIFFINKIDQNGID--LSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWD 176

Query: 187 AKILELAGFLDSYIPEPE 204
I L+ Y+
Sbjct: 177 TVIEGNDDLLEKYMSGKS 194


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits       Score  E-value  Comments
EcSMS35_3621  TCRTETOQM  613    0.0      Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 613 bits (1583), Expect = 0.0
Identities = 178/698 (25%), Positives = 304/698 (43%), Gaps = 81/698 (11%)

Query: 9 RYRNIGISAHIDAGKTTTTERILFYTGVNHKIGEVHDGAATMDWMEQEQERGITITSAAT 68
+ NIG+ AH+DAGKTT TE +L+ +G ++G V G D E++RGITI + T
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 69 TAFWSGMAKQYEPHRINIIDTPGHVDFTIEVERSMRVLDGAVMVYCAVGGVQPQSETVWR 128
+ W ++NIIDTPGH+DF EV RS+ VLDGA+++ A GVQ Q+ ++
Sbjct: 62 SFQWEN-------TKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFH 114

Query: 129 QANKYKVPRIAFVNKMDRMGANFLKVVNQIKTRLGANPVPLQLAIGAEEHFTGVVDLVKM 188
K +P I F+NK+D+ G + V IK +L A V Q V M
Sbjct: 115 ALRKMGIPTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQ----------KVELYPNM 164

Query: 189 KAINWNDADQGVTFEYEDIPADMVELANEWHQNLIESAAEASEELMEKYLGGEELTEAEI 248
N+ +++Q ++ E +++L+EKY+ G+ L E+
Sbjct: 165 CVTNFTESEQ------------------------WDTVIEGNDDLLEKYMSGKSLEALEL 200

Query: 249 KGALRQRVLNNEIILVTCGSAFKNKGVQAMLDAVIDYLPSPVDVPAINGILDDGKDTPAE 308
+ R N + V GSA N G+ +++ + + S
Sbjct: 201 EQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVITNKFYSSTH----------------- 243

Query: 309 RHASDDEPFSALAFKIATDPFVGNLTFFRVYSGVVNSGDTVLNSVKAARERFGRIVQMHA 368
FKI L + R+YSGV++ D+V S K + + +
Sbjct: 244 ---RGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEK-EKIKITEMYTSIN 299

Query: 369 NKREEIKEVRAGDIAAAIG----LKDVTTGDTLCDPDAPIILERMEFPEPVISIAVEPKT 424
+ +I + +G+I L V GDT P ER+E P P++ VEP
Sbjct: 300 GELCKIDKAYSGEIVILQNEFLKLNSV-LGDTKLLPQR----ERIENPLPLLQTTVEPSK 354

Query: 425 KADQEKMGLALGRLAKEDPSFRVWTDEESNQTIIAGMGELHLDIIVDRMKREFNVEANVG 484
+E + AL ++ DP R + D +++ I++ +G++ +++ ++ +++VE +
Sbjct: 355 PQQREMLLDALLEISDSDPLLRYYVDSATHEIILSFLGKVQMEVTCALLQEKYHVEIEIK 414

Query: 485 KPQVAYRETIRQKVTDVEGKHAKQSGGRGQYGHVVIDMYPLEPGSNPKGYEFINDIKGGV 544
+P V Y E +K E + + + + + PL GS G ++ + + G
Sbjct: 415 EPTVIYMERPLKK---AEYTIHIEVPPNPFWASIGLSVSPLPLGS---GMQYESSVSLGY 468

Query: 545 IPGEYIPAVDKGIQEQLKAGPLAGYPVVDMGIRLHFGSYHDVDSSELAFKLAASIAFKEG 604
+ + AV +GI+ + G L G+ V D I +G Y+ S+ F++ A I ++
Sbjct: 469 LNQSFQNAVMEGIRYGCEQG-LYGWNVTDCKICFKYGLYYSPVSTPADFRMLAPIVLEQV 527

Query: 605 FKKAKPVLLEPIMKVEVETPEENTGDVIGDLSRRRGMLKGQESEVTGVKIHAEVPLSEMF 664
KKA LLEP + ++ P+E D + + + + V + E+P +
Sbjct: 528 LKKAGTELLEPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQLKNNEVILSGEIPARCIQ 587

Query: 665 GYATQLRSLTKGRASYTMEFLKYDEAPSNVAQAVIEAR 702
Y + L T GR+ E Y + V + R
Sbjct: 588 EYRSDLTFFTNGRSVCLTELKGYHVT---TGEPVCQPR 622


103  EcSMS35_3627  EcSMS35_3637  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias   Product
EcSMS35_3627  1       16       0.269147  hypothetical protein
EcSMS35_3628  0       17       1.692035  FKBP-type peptidyl-prolyl cis-trans isomerase
EcSMS35_3629  0       14       3.348005  hypothetical protein
EcSMS35_3630  -2      14       3.185633  FKBP-type peptidyl-prolyl cis-trans isomerase
EcSMS35_3631  -1      14       2.947434  hypothetical protein
EcSMS35_3632  -2      13       3.007352  glutathione-regulated potassium-efflux system
EcSMS35_3633  -1      16       2.916715  glutathione-regulated potassium-efflux system
EcSMS35_3634  -1      18       2.085988  putative ABC transporter ATP-binding protein
EcSMS35_3635  -2      12       1.192524  putative hydrolase
EcSMS35_3636  -1      13       1.323204  hypothetical protein
EcSMS35_3637  -1      13       1.492557  phosphoribulokinase/uridine kinase family
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3627  ACRIFLAVINRP  29     0.021    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 29.0 bits (65), Expect = 0.021
Identities = 14/62 (22%), Positives = 29/62 (46%), Gaps = 1/62 (1%)

Query: 160 ASSVEDLVTQTLEFTIEEVNADRNV-SNNAKNRQIVLNLYEKGIFDIKDAINQVADRLNI 218
A +V+D VTQ +E + ++ + S + + + L + D A QV ++L +
Sbjct: 54 AQTVQDTVTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQL 113

Query: 219 SK 220
+
Sbjct: 114 AT 115


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3628  INFPOTNTIATR  132    5e-40    Macrophage infectivity potentiator signature.
		>INFPOTNTIATR#Macrophage infectivity potentiator signature.

Length = 233

Score = 132 bits (334), Expect = 5e-40
Identities = 79/226 (34%), Positives = 124/226 (54%), Gaps = 9/226 (3%)

Query: 28 AAKPATTADSKAAFKNDDQKSAYALGASLGRYMENSLKEQEKLGIKLDKDQLIAGVQDAF 87
A A A + D K +Y++GA LG K + GI ++ D L G+QD
Sbjct: 14 AMSTAMAATDATSLTTDKDKLSYSIGADLG-------KNFKNQGIDINPDVLAKGMQDGM 66

Query: 88 A-DKSKLSDQEIEQTLQAFEARVKSSAQAKMEKDAADNEAKGKEYREKFAKEKGVKTSST 146
+ + L++++++ L F+ + + A+ K A +N+AKG + + G+ +
Sbjct: 67 SGAQLILTEEQMKDVLSKFQKDLMAKRSAEFNKKAEENKAKGDAFLSANKSKPGIVVLPS 126

Query: 147 GLVYQVVEAGKGEAPKDSDTVVVNYKGTLIDGKEFDNSYTRGEPLSFRLDGVIPGWTEGL 206
GL Y++++AG G P SDTV V Y GTLIDG FD++ G+P +F++ VIPGWTE L
Sbjct: 127 GLQYKIIDAGTGAKPGKSDTVTVEYTGTLIDGTVFDSTEKAGKPATFQVSQVIPGWTEAL 186

Query: 207 KNIKKGGKIKLVIPPELAYGKAGVPG-IPPNSTLVFDVELLDVKPA 251
+ + G ++ +P +LAYG V G I PN TL+F + L+ VK A
Sbjct: 187 QLMPAGSTWEVFVPADLAYGPRSVGGPIGPNETLIFKIHLISVKKA 232


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits         Score  E-value  Comments
EcSMS35_3632  60KDINNERMP  31     0.021    60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 30.7 bits (69), Expect = 0.021
Identities = 13/69 (18%), Positives = 29/69 (42%), Gaps = 6/69 (8%)

Query: 261 TAIDPFKGLLLG---LFFISVGMSLNLGVLYTHL-LWVVISVVVLVAVKILVLYLLARLY 316
A+ P L + L+FIS + L +++ + W +++ V+ ++ L
Sbjct: 318 AAVAPHLDLTVDYGWLWFISQPLFKLLKWIHSFVGNWGFSIIIITFIVRGIMYPLTKA-- 375

Query: 317 GVRSSERMQ 325
S +M+
Sbjct: 376 QYTSMAKMR 384


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3633  ISCHRISMTASE  32     0.001    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 31.9 bits (72), Expect = 0.001
Identities = 32/135 (23%), Positives = 51/135 (37%), Gaps = 16/135 (11%)

Query: 11 YAHPESQDSVANRVLLKPATQLSNVTVHDLYAHYPDFFIDIPREQALLREHEVIVFQH-- 68
Y P + D N+V P + + +HD+ ++ D F L + +
Sbjct: 9 YQMPTASDMPQNKVSWVPDPNRAVLLIHDMQNYFVDAFTAGASPVTELSANIRKLKNQCV 68

Query: 69 ----PLYTYSCPALLKEWLDRVLSRGFASGPGGNQLAGKYWRSVITTGEPESA------Y 118
P+ + P DR L F GPG N +G Y +IT PE +
Sbjct: 69 QLGIPVVYTAQPGSQNP-DDRALLTDFW-GPGLN--SGPYEEKIITELAPEDDDLVLTKW 124

Query: 119 RYDALNRYPMSDVLR 133
RY A R + +++R
Sbjct: 125 RYSAFKRTNLLEMMR 139


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits        Score  E-value  Comments
EcSMS35_3634  GPOSANCHOR  33     0.005    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 32.7 bits (74), Expect = 0.005
Identities = 28/152 (18%), Positives = 54/152 (35%), Gaps = 22/152 (14%)

Query: 504 KVEPFDGDLEDYQQWLSDVQKQENQTDEAPKENANSAQARKDQKRREAELRAQTQPLRKE 563
+ D + ++ E + + ++ R+ +R R + L E
Sbjct: 272 AMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEAE 331

Query: 564 IARLEKEME---------------------KLNAQLAQAEEKLGDSELYDQSRKAELTAC 602
+LE++ + +L A+ + EE+ SE QS + +L A
Sbjct: 332 HQKLEEQNKISEASRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDAS 391

Query: 603 LQQQASAKSGLEECEMAWLEAQEQLEQMLLEG 634
+ + + LEE L A E+L + L E
Sbjct: 392 REAKKQVEKALEEANSK-LAALEKLNKELEES 422



Score = 32.0 bits (72), Expect = 0.008
Identities = 13/125 (10%), Positives = 39/125 (31%), Gaps = 7/125 (5%)

Query: 513 EDYQQWLSDVQKQENQTDEAPKENANSAQARKDQKRREAELRAQTQPLRKEIARLEKEME 572
+ + ++ + E A A + D ++ + +++
Sbjct: 127 KALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFST-------ADSAKIK 179

Query: 573 KLNAQLAQAEEKLGDSELYDQSRKAELTACLQQQASAKSGLEECEMAWLEAQEQLEQMLL 632
L A+ A E + + E + TA + + ++ + ++ LE +
Sbjct: 180 TLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMN 239

Query: 633 EGQSN 637
++
Sbjct: 240 FSTAD 244


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
EcSMS35_3637  PF07299  32     0.002    Fibronectin-binding protein (FBP)
		>PF07299#Fibronectin-binding protein (FBP)

Length = 219

Score = 31.8 bits (72), Expect = 0.002
Identities = 10/46 (21%), Positives = 21/46 (45%), Gaps = 2/46 (4%)

Query: 71 PEANDFGLLEQTFIEYGQSGKGKSRKYLHTYDEAVPWNQVPGTFTP 116
P+ + + E ++ KG SRK++ ++ + + GTF
Sbjct: 112 PDMEELDMKELSY--LSWIDKGSSRKFIIAKNDKNKFVGLQGTFQS 155


104  EcSMS35_3725  EcSMS35_3732  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias   Product
EcSMS35_3725  -2      17       2.718221  putative acetyltransferase YhhY
EcSMS35_3726  -2      19       3.166550  gamma-glutamyltranspeptidase
EcSMS35_3728  -2      21       3.118861  hypothetical protein
EcSMS35_3727  -1      22       3.144135  cytoplasmic glycerophosphodiester
EcSMS35_3729  -2      21       2.275179  glycerol-3-phosphate transporter ATP-binding
EcSMS35_3730  -1      20       1.071131  glycerol-3-phosphate transporter membrane
EcSMS35_3731  -1      20       1.662839  glycerol-3-phosphate transporter permease
EcSMS35_3732  -1      20       1.370000  glycerol-3-phosphate transporter periplasmic
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3725  SACTRNSFRASE  33     2e-04    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 33.4 bits (76), Expect = 2e-04
Identities = 21/92 (22%), Positives = 32/92 (34%), Gaps = 16/92 (17%)

Query: 55 VACIDGIVVGHLTIDVQQRPHRSHVADFGICVDSRWKNRGVASALMREMIE------MCD 108
+ ++ +G + I + + D + D R K GV +AL+ + IE C
Sbjct: 69 LYYLENNCIGRIKIR-SNWNGYALIEDIAVAKDYRKK--GVGTALLHKAIEWAKENHFCG 125

Query: 109 NWLRVDRIELTVFVDNAPAIKVYKKFGFEIEG 140
L I N A Y K F I
Sbjct: 126 LMLETQDI-------NISACHFYAKHHFIIGA 150


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits       Score  E-value  Comments
EcSMS35_3726  NAFLGMOTY  33     0.003    Sodium-type flagellar protein MotY precursor signature.
		>NAFLGMOTY#Sodium-type flagellar protein MotY precursor signature.

Length = 293

Score = 33.2 bits (75), Expect = 0.003
Identities = 28/82 (34%), Positives = 37/82 (45%), Gaps = 17/82 (20%)

Query: 276 RTPISGEYRGYQVYSMPPPSSGGIHIVQILNI--LENFDMKKYGF-GSADAMQIMAEAEK 332
R P+ GE R + SMPPP G H +I N+ + FD G+ G A I++E EK
Sbjct: 77 RRPM-GETRNVSLISMPPPWRPGEHADRITNLKFFKQFD----GYVGGQTAWGILSELEK 131

Query: 333 YAYADRSEYLGDPDFVKVPWQA 354
Y P F WQ+
Sbjct: 132 GRY---------PTFSYQDWQS 144


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
EcSMS35_3727  PF04619  29     0.014    Dr-family adhesin
		>PF04619#Dr-family adhesin

Length = 160

Score = 28.7 bits (64), Expect = 0.014
Identities = 12/60 (20%), Positives = 23/60 (38%), Gaps = 4/60 (6%)

Query: 29 VGAKYGHKMIEFDAKLSKDGEIFLLHDDNLERTSNGWGVAGDLNWQD----LLRVDAGSW 84
+G ++ D + G+ FL+ D+N ++ + W + D GSW
Sbjct: 70 LGCDARQVALKADTDNFEQGKFFLISDNNRDKLYVNIRPTDNSAWTTDNGVFYKNDVGSW 129


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
EcSMS35_3729  PF05272  29     0.042    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 28.9 bits (64), Expect = 0.042
Identities = 10/29 (34%), Positives = 16/29 (55%)

Query: 33 IVMVGPSGCGKSTLLRMVAGLERVTTGDI 61
+V+ G G GKSTL+ + GL+ +
Sbjct: 599 VVLEGTGGIGKSTLINTLVGLDFFSDTHF 627


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits       Score  E-value  Comments
EcSMS35_3732  MALTOSEBP  39     3e-05    Maltose binding protein signature.
		>MALTOSEBP#Maltose binding protein signature.

Length = 396

Score = 38.9 bits (90), Expect = 3e-05
Identities = 39/160 (24%), Positives = 66/160 (41%), Gaps = 14/160 (8%)

Query: 134 GHLLSQPFNSSTPVLYYNKDAFKKAGLDPEQPPKTWQDLADYAAKLKASGMKCGYASGWQ 193
G L++ P L YNKD PPKTW+++ +LKA G + +
Sbjct: 127 GKLIAYPIAVEALSLIYNKDLLP-------NPPKTWEEIPALDKELKAKGKSALMFNLQE 179

Query: 194 GWIQLENFSAWNGLPFASKNNGFDGTDAVLEF--NKPEQVKHIAMLEEMNKKGDFSYVGR 251
+ +A G F +N +D D ++ K + +++ + D Y
Sbjct: 180 PYFTWPLIAADGGYAFKYENGKYDIKDVGVDNAGAKAGLTFLVDLIKNKHMNADTDY--- 236

Query: 252 KDESTEKFYNGDCAMTTASSGSLANIREYAKFNYGVGMMP 291
+ F G+ AMT + +NI + +K NYGV ++P
Sbjct: 237 -SIAEAAFNKGETAMTINGPWAWSNI-DTSKVNYGVTVLP 274


105  EcSMS35_3774  EcSMS35_3781  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
EcSMS35_3774  1       16       -3.297374  ABC transporter ATP-binding protein
EcSMS35_3775  0       18       -4.307469  ABC transporter ATP binding protein
EcSMS35_3776  0       26       -6.805510  MFP family transporter
EcSMS35_3777  0       23       -7.029824  hypothetical protein
EcSMS35_3778  1       22       -6.943553  hypothetical protein
EcSMS35_3779  0       16       -4.655092  hypothetical protein
EcSMS35_3780  0       12       0.270109   hypothetical protein
EcSMS35_3781  0       15       1.993928   pyridine nucleotide-disulfide oxidoreductase
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3774  ABC2TRNSPORT  50     4e-09    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 50.3 bits (120), Expect = 4e-09
Identities = 41/171 (23%), Positives = 73/171 (42%), Gaps = 7/171 (4%)

Query: 200 REREHGTVEHLLVMPITPFEIMMAKV-WSMGLVVLVVSGLSLVLMVKGVLGVPIEGSIPL 258
R T E +L + +I++ ++ W+ L +G+ +V G + L
Sbjct: 93 RMEGQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGAGIGVVAAALGY----TQWLSLL 148

Query: 259 FMLGV-ALSLFATTSIGIFMGTIARSMPQLGLLVILVLLPLQMLSGGSTPRESMPQMVQD 317
+ L V AL+ A S+G+ + +A S LV+ P+ LSG P + +P + Q
Sbjct: 149 YALPVIALTGLAFASLGMVVTALAPSYDYFIFYQTLVITPILFLSGAVFPVDQLPIVFQT 208

Query: 318 IMLTMPTTHFVSLAQAILYRGAGFEIVWPQFLTLMAIGGAFF-TIALLRFR 367
+P +H + L + I+ ++ + I FF + ALLR R
Sbjct: 209 AARFLPLSHSIDLIRPIMLGHPVVDVCQHVGALCIYIVIPFFLSTALLRRR 259


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
EcSMS35_3775  PF05272  30     0.046    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.0 bits (67), Expect = 0.046
Identities = 9/26 (34%), Positives = 14/26 (53%)

Query: 37 ARCMVGLIGPDGVGKSSLLSLISGAR 62
V L G G+GKS+L++ + G
Sbjct: 595 FDYSVVLEGTGGIGKSTLINTLVGLD 620


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits       Score  E-value  Comments
EcSMS35_3776  RTXTOXIND  85     3e-20    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 84.9 bits (210), Expect = 3e-20
Identities = 72/408 (17%), Positives = 139/408 (34%), Gaps = 81/408 (19%)

Query: 6 RHLAWWVVGLLAVAAIVAWWLLRPAGVP-EGFAVSNGRIEATEVDIASKIAGRIDTILVK 64
R +A++++G L +A I++ G +GR + I + I+VK
Sbjct: 58 RLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKE----IKPIENSIVKEIIVK 113

Query: 65 EGQFVREGEVLAKMDTRV----------------LQEQRLEAI----------------- 91
EG+ VR+G+VL K+ L++ R + +
Sbjct: 114 EGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDE 173

Query: 92 -------------------AQIKEAQSAIAAAQALLEQRQSETRAAQSLVNQRQAELDSV 132
Q Q+ + L+++++E + +N+ +
Sbjct: 174 PYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVE 233

Query: 133 AKRHTRSRSLAQRGAISAQQLDDDRAAAESARAALESAKAQVSASKAAIEAARTNIIQ-- 190
R SL + AI+ + + A L K+Q+ ++ I +A+
Sbjct: 234 KSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVT 293

Query: 191 -----------AQTRVEAAQATERRIAADID--DSELKAPRDGRV-QYRVAEPGEVLAAG 236
QT T + S ++AP +V Q +V G V+
Sbjct: 294 QLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTA 353

Query: 237 GRVLNMVDLSDVY-MTFFLPTEQAGTLKLGGEARLILDAAPDLRIPATISFVASVAQFTP 295
++ +V D +T + + G + +G A + ++A P R V V
Sbjct: 354 ETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYG---YLVGKVKNINL 410

Query: 296 KTVETSDERLKLMFRVKARIPPELLQQHLEYV--KTGLPGVAWVRVNE 341
+E D+RL L+F V I L + + +G+ A ++
Sbjct: 411 DAIE--DQRLGLVFNVIISIEENCLSTGNKNIPLSSGMAVTAEIKTGM 456


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits         Score  E-value  Comments
EcSMS35_3781  ALARACEMASE  29     0.032    Alanine racemase signature.
		>ALARACEMASE#Alanine racemase signature.

Length = 356

Score = 29.0 bits (65), Expect = 0.032
Identities = 23/98 (23%), Positives = 38/98 (38%), Gaps = 18/98 (18%)

Query: 226 ENLLFTHRGLSGPAVLQISSYWQPGEFVSINLLPDVDLETFL--NEQRNAHPNQSLKNTL 283
E + RG GP +L + ++ + + + L T + N Q A N LK L
Sbjct: 63 EAITLRERGWKGP-ILMLEGFFHAQD---LEIYDQHRLTTCVHSNWQLKALQNARLKAPL 118

Query: 284 AVHL------------PKRLVERLQQLGQIPDVSLKQL 309
++L P R++ QQL + +V L
Sbjct: 119 DIYLKVNSGMNRLGFQPDRVLTVWQQLRAMANVGEMTL 156


106  EcSMS35_3868  EcSMS35_3873  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
EcSMS35_3868  0       11       1.364286   major facilitator family transporter
EcSMS35_3869  1       13       1.622967   hypothetical protein
EcSMS35_3870  1       11       1.757121   3-methyl-adenine DNA glycosylase I
EcSMS35_3872  1       12       1.634428   hypothetical protein
EcSMS35_3871  0       11       1.578868   biotin sulfoxide reductase
EcSMS35_3873  0       17       -0.605811  putative outer membrane lipoprotein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
EcSMS35_3868  TCRTETA  43     2e-06    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 42.9 bits (101), Expect = 2e-06
Identities = 48/275 (17%), Positives = 95/275 (34%), Gaps = 32/275 (11%)

Query: 44 PVSQVAFSFGLLSLGLAIS----SSVAGKLQERFGVKRVTMASGILLGLGFFLTAHSNNL 99
+ V +G+L A+ + V G L +RFG + V + S + + + A + L
Sbjct: 37 HSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFL 96

Query: 100 MMLWLS---AGVLVGLADGAGYLL----TLSNCVKWFPERKGLISAFAIGSYGLGSLGFK 152
+L++ AG+ AG + + F + LG
Sbjct: 97 WVLYIGRIVAGITGATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGG---- 152

Query: 153 FIDTHLLETVGLEKTFVIWGAIVLVMIVFGATLMKDAPKQEVKTSNGVVEKDYTLAESMR 212
L+ F A+ + + G L+ ++ K E + R
Sbjct: 153 -----LMGGFSPHAPFFAAAALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWAR 207

Query: 213 --KPQYWMLAVMFLTACMSG----LYVIGVAKDIAQSLAHLDAISAANAVTVISIAN-LS 265
++AV F+ + L+VI + H DA + ++ I + L+
Sbjct: 208 GMTVVAALMAVFFIMQLVGQVPAALWVI-----FGEDRFHWDATTIGISLAAFGILHSLA 262

Query: 266 GRLVLGILSDKIARIRVITIGQVISLVGMAALLFA 300
++ G ++ ++ R + +G + G L FA
Sbjct: 263 QAMITGPVAARLGERRALMLGMIADGTGYILLAFA 297



Score = 35.6 bits (82), Expect = 3e-04
Identities = 34/127 (26%), Positives = 56/127 (44%), Gaps = 4/127 (3%)

Query: 269 VLGILSDKIARIRVITIGQVISLVGMAALLFAPLNAVTFFAAIACVAFNFGGTITVFPSL 328
VLG LSD+ R V+ + + V A + AP V + I VA G T V +
Sbjct: 62 VLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRI--VAGITGATGAVAGAY 119

Query: 329 VSEFFGLNNLAKNYGVIYLGFGIGSICGSIIASLFGGF--YVTFYVIFALLILSLALSTT 386
+++ + A+++G + FG G + G ++ L GGF + F+ AL L+
Sbjct: 120 IADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCF 179

Query: 387 IRQPEQK 393
+ K
Sbjct: 180 LLPESHK 186


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3869  ECOLNEIPORIN  27     0.045    E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 27.5 bits (61), Expect = 0.045
Identities = 22/117 (18%), Positives = 47/117 (40%), Gaps = 16/117 (13%)

Query: 119 SMYNEFGDSTTTLTDPLWHASVSTLGWRVDSRLGDLRPWAQISYNQQFGENIWKAQSGLS 178
S+ + D+ + H S + + + R G++ P ++SY F +
Sbjct: 228 SVAVQQQDAKLV-EENYSHNSQTEVAATLAYRFGNVTP--RVSYAHGFKGSF-------- 276

Query: 179 RMTATNQNGNWLDVTVGADMLLNQNIAAYAA---LSQAENTTNNSDYLYTMGVSARF 232
ATN N ++ V VGA+ ++ +A + L + + + +G+ +F
Sbjct: 277 --DATNYNNDYDQVVVGAEYDFSKRTSALVSAGWLQEGKGESKFVSTAGGVGLRHKF 331


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_3872  SACTRNSFRASE  36     1e-05    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 36.1 bits (83), Expect = 1e-05
Identities = 17/52 (32%), Positives = 25/52 (48%), Gaps = 5/52 (9%)

Query: 76 VAPKAVRRGIGKALM----QYVQQRHP-HLMLEVYQKNQPAIDFYHAQGFHI 122
VA ++G+G AL+ ++ ++ H LMLE N A FY F I
Sbjct: 97 VAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFII 148


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits        Score  E-value  Comments
EcSMS35_3873  OMPADOMAIN  113    2e-32    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 113 bits (285), Expect = 2e-32
Identities = 41/122 (33%), Positives = 62/122 (50%), Gaps = 11/122 (9%)

Query: 108 LNMPNNVTFDSSSATLKPAGANTLTGVAMVLKEY--PKTAVNVIGYTDSTGGHDLNMRLS 165
+ ++V F+ + ATLKP G L + L +V V+GYTD G N LS
Sbjct: 215 FTLKSDVLFNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAYNQGLS 274

Query: 166 QQRADSVASALITQGVDASRIRTQGLGPANPIASNSTAEGK---------AQNRRVEITL 216
++RA SV LI++G+ A +I +G+G +NP+ N+ K A +RRVEI +
Sbjct: 275 ERRAQSVVDYLISKGIPADKISARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEV 334

Query: 217 SP 218

Sbjct: 335 KG 336


107  EcSMS35_4018  EcSMS35_4039  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias     Product
EcSMS35_4018  2       42       -13.682469  putative type III secretion chaperone SicA
EcSMS35_4019  -1      22       -3.936983   hypothetical protein
EcSMS35_4020  -1      20       -3.376295   putative type III cell invasion protein SipB
EcSMS35_4021  -1      19       -2.667686   hypothetical protein
EcSMS35_4022  -1      18       -1.535662   putative type III effector protein SipD
EcSMS35_4023  -1      18       -0.434891   invasion protein regulator
EcSMS35_4024  -1      18       1.379768    putative invasin
EcSMS35_4025  0       15       -0.383290   ribonucleoside transporter
EcSMS35_4026  -1      11       -0.290817   hypothetical protein
EcSMS35_4027  -1      12       1.063608    putative DNA-binding protein
EcSMS35_4028  -2      14       2.174036    hypothetical protein
EcSMS35_4029  -2      13       2.590021    sulfate permease family inorganic anion
EcSMS35_4030  -1      10       0.257652    cryptic adenine deaminase
EcSMS35_4031  0       16       -3.217388   sugar phosphate antiporter
EcSMS35_4032  0       21       -4.240415   regulatory protein UhpC
EcSMS35_4033  -1      20       -3.619791   sensory histidine kinase UhpB
EcSMS35_4034  -1      20       -4.104585   DNA-binding transcriptional activator UhpA
EcSMS35_4035  -1      21       -4.629704   hypothetical protein
EcSMS35_4036  0       14       -1.450682   hypothetical protein
EcSMS35_4037  0       14       3.221386    acetolactate synthase 1 regulatory subunit
EcSMS35_4038  0       16       2.806162    acetolactate synthase catalytic subunit
EcSMS35_4039  0       17       1.866485    multidrug resistance protein D
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_4018  SYCDCHAPRONE  100    2e-29    Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 100 bits (249), Expect = 2e-29
Identities = 34/164 (20%), Positives = 69/164 (42%), Gaps = 5/164 (3%)

Query: 5 NLDLEENKEIASKFERALGMGATLAELHGITPDTLEGVYAYAYNFYEKGRLDEAELFFKF 64
+ + +E E L G T+A L+ I+ DTLE +Y+ A+N Y+ G+ ++A F+
Sbjct: 2 QQETTDTQEYQLAMESFLKGGGTIAMLNEISSDTLEQLYSLAFNQYQSGKYEDAHKVFQA 61

Query: 65 LCIYDFQNYNYLKGYAAVCQLKKDYQKAFDMYHICLMLSPDNDFSLVYYMGQCQMGLKNI 124
LC+ D + + G A Q Y A Y ++ + ++ +C + +
Sbjct: 62 LCVLDHYDSRFFLGLGACRQAMGQYDLAIHSYSYGAIMDIK-EPRFPFHAAECLLQKGEL 120

Query: 125 KMATELFNT----VVTYSQNEKIKEMATTYLELLTANSEEEVQT 164
A + ++ +++ ++ LE + E E +
Sbjct: 121 AEAESGLFLAQELIADKTEFKELSTRVSSMLEAIKLKKEMEHEC 164


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits         Score  E-value  Comments
EcSMS35_4020  BACINVASINB  115    4e-29    Salmonella/Shigella invasin protein B signature.
		>BACINVASINB#Salmonella/Shigella invasin protein B signature.

Length = 593

Score = 115 bits (288), Expect = 4e-29
Identities = 117/544 (21%), Positives = 227/544 (41%), Gaps = 37/544 (6%)

Query: 60 NKPMLAPPTIQVSDSDNATTAKTNDARLTMILGNLTGIADQDITTRLHNNLDSTLLRHEM 119
N L PPT + ++ + +LT++LG L + ++L + L E
Sbjct: 64 NTVGLKPPTDAAREKLSS------EGQLTLLLGKLMTLLGDVSLSQLESRLAVWQAMIES 117

Query: 120 AHNKFRELSDAYSSSLDDAQKADDIMHQANNNYNAVDKKVQSLEKKVNTLNQELSQLQPG 179
++S + ++L +AQ+A D+ + + + KK+ +L L P
Sbjct: 118 QKEMGIQVSKEFQTALGEAQEATDLYEASIKKTDTAKSVYDAATKKLTQAQNKLQSLDPA 177

Query: 180 DPQYNKVLTQKNAAEKTLTLSLQKKSLAEKSLNTAIMDADAAIGQSMEIFDEIQQQEQIN 239
DP Y + A K T + + A + A DA A ++ I + Q
Sbjct: 178 DPGYAQAEAAVEQAGKEATEAKEALDKATDATVKAGTDAKAKAEKADNILTKFQ---GTA 234

Query: 240 NFTTNICLTQENQKNRNATATFILLITSVMEVIGDTNCDSIKNQSEVMKEINHVRENKLN 299
N + ++Q Q N + A +L+ +E++G +S++N + + R+ ++
Sbjct: 235 NAASQNQVSQGEQDNLSNVARLTMLMAMFIEIVGKNTEESLQNDLALFNALQEGRQAEME 294

Query: 300 ETARKYTTTTKVLKIVNECVTVVTFAVSAVLIVVGLLAAVPSGGSSIAGALALIGGIAGA 359
+ + ++ T+ + N + + + A+L +V ++AAV +GG+S+A A A
Sbjct: 295 KKSAEFQEETRKAEETNRIMGCIGKVLGALLTIVSVVAAVFTGGASLALA---------A 345

Query: 360 VVLGVDITCQIALGTTATGWILGKVVEGLSAAIKTVDPTL-LAITALLDVIGVDQDTIEL 418
V L V + +I T +I + + +K + + AIT L+ +GVD+ T E+
Sbjct: 346 VGLAVMVADEIVKAATGVSFIQQALNPIMEHVLKPLMELIGKAITKALEGLGVDKKTAEM 405

Query: 419 VKSIYASAAASIVMATVMIGAAVICSVAIGAVVSALSKTAAEEVTKEITSTIK---STIE 475
SI + A+I M V++ AV+ A + +ALSK E + K + + +K
Sbjct: 406 AGSIVGAIVAAIAMVAVIVVVAVVGKGAAAKLGNALSKMMGETIKKLVPNVLKQLAQNGS 465

Query: 476 SIINSVSKNIIKVLDSVCS--VLQTSAVVLKLIAKISNGLEKIGLLICAIATSTMNC--- 530
+ + I L +V S LQT+A+ +L + N L K+ L + T+ +
Sbjct: 466 KLFTQGMQRITSGLGNVGSKMGLQTNALSKEL---VGNTLNKVALGMEVTNTAAQSAGGV 522

Query: 531 -------FVAGNSADMAILQQDMSNLSKTREQMLSVLQRVDKTVEQEVSQMVRVLQHRTE 583
+ AD + + M + + +Q + + K + M +Q +
Sbjct: 523 AEGVFIKNASEALADFMLARFAMDQIQQWLKQSVEIFGENQKVTAELQKAMSSAVQQNAD 582

Query: 584 ALKF 587
A +F
Sbjct: 583 ASRF 586


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_4023  SYCDCHAPRONE  29     0.025    Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 29.1 bits (65), Expect = 0.025
Identities = 13/97 (13%), Positives = 33/97 (34%)

Query: 315 QKQAITTAVTSIEKALEINPSNSQALGLLGLISGLKDEHSVSNVLFKQAHLLKPNSPDVY 374
Q A + ++ +S+ LG ++ ++ + ++ P
Sbjct: 48 QSGKYEDAHKVFQALCVLDHYDSRFFLGLGACRQAMGQYDLAIHSYSYGAIMDIKEPRFP 107

Query: 375 YYQSLLYFLNGDLARAFNLIEKSIALEPNKMGISILK 411
++ + G+LA A + + + L +K L
Sbjct: 108 FHAAECLLQKGELAEAESGLFLAQELIADKTEFKELS 144


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
EcSMS35_4024  INTIMIN  538    e-167    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 538 bits (1386), Expect = e-167
Identities = 200/789 (25%), Positives = 332/789 (42%), Gaps = 65/789 (8%)

Query: 144 QVAEMAQQSGTLLARDMDSEQAASMARGWVASSASAQATDWLSRWGTARVSLGVDEDFSL 203
Q A + Q L +R ++ + A A G + AS+Q WL +GTA V+L +F
Sbjct: 169 QAASLGSQ---LQSRSLNGDYAKDTALGIAGNQASSQLQAWLQHYGTAEVNLQSGNNFD- 224

Query: 204 KSSSFEFLHPWYETPDNLVFSQHTLHRTDDRTQTNHGIGWRYFTSSWMSGVNMFIDHDLT 263
SS +FL P+Y++ L F Q D R N G G R+F M G N+FID D +
Sbjct: 225 -GSSLDFLLPFYDSEKMLAFGQVGARYIDSRFTANLGAGQRFFLPENMLGYNVFIDQDFS 283

Query: 264 RYHTRTGMGVEYWRDYLKLSGNGYLRLSNWRSAPELDNDYEARPANGWDLRAEGWLPAWP 323
+TR G+G EYWRDY K S NGY R+S W + DY+ RPANG+D+R G+LP++P
Sbjct: 284 GDNTRLGIGGEYWRDYFKSSVNGYFRMSGWHESYN-KKDYDERPANGFDIRFNGYLPSYP 342

Query: 324 QLGGKLVYEQYYGDEVALFGKDERQNDPHAITAGLSYTPVPLISFSAEQRQGKQGENDTR 383
LG KL+YEQYYGD VALF D+ Q++P A T G++YTP+PL++ + R G END
Sbjct: 343 ALGAKLMYEQYYGDNVALFNSDKLQSNPGAATVGVNYTPIPLVTMGIDYRHGTGNENDLL 402

Query: 384 IGMELTLQPGHSLQKQLDPAEVAARRSLVGSRYDLVDRNNNIVLEYRKKELVRLTLTDPL 443
M+ Q +Q++P V R+L GSRYDLV RNNNI+LEY+K++++ L + +
Sbjct: 403 YSMQFRYQFDKPWSQQIEPQYVNELRTLSGSRYDLVQRNNNIILEYKKQDILSLNIPHDI 462

Query: 444 KGKPGEVKSLVSSLQTKYALKGYDIEAASLQSAGGKVAVSG----KDIQVTIPPYRFTAM 499
G + + +++KY L + ++L+S GG++ SG +D Q +P Y +
Sbjct: 463 NGTERSTQKIQLIVKSKYGLDRIVWDDSALRSQGGQIQHSGSQSAQDYQAILPAY----V 518

Query: 500 PETDNTYPIAVTAEDSKGNFSRREE-SMVVVEKPTLSLTDSTLSVDQQILLADGKSTSTL 558
N Y + A D GN S ++ V+ + A T +
Sbjct: 519 QGGSNVYKVTARAYDRNGNSSNNVLLTITVLSNGQVVDQVGVTDFTADKTSAKADGTEAI 578

Query: 559 TYTA------RDSSGKPIPGMTLKTQVKGLQDFALSEWKDNGNGTYTQIVTAGKTSGALS 612
TYTA + P+ V G + + NG+G T + + K +
Sbjct: 579 TYTATVKKNGVAQANVPVSFNI----VSGTAVLSANSANTNGSGKATVTLKSDKPGQVVV 634

Query: 613 LMPQFNGDDIAKTPALIAIVANTASRADSTIETDQDNYVAGKPIVVKVTLRDD-NGNGVT 671
A+I + AS + I+ D+ VA + T++ V+
Sbjct: 635 SAKTAEMTSALNANAVIFVDQTKASI--TEIKADKTTAVANGQDAITYTVKVMKGDKPVS 692

Query: 672 GRKELLKQTVKVDNTKADDVSAWTEESEGIYKASYTAHLIGDKLTA------QLTMPGWQ 725
++ T+ + + ++ G K + T+ G L + + + +
Sbjct: 693 NQEVTFTTTLGKLSNSTE-----KTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPE 747

Query: 726 TKHSDAFSIAGDKDTAKIAAMQITANNAVARRDHNTVAVTVRDVHQNLLQGQNVTFTVVN 785
+ +I ++ + + + + T+ N
Sbjct: 748 VEFFTTLTIDDGNIEIVGTGVKGKLPTVWLQYGQVNLKASGGNG--------KYTWRSAN 799

Query: 786 GAAVFADPNGGIVTTDKDGIASVNLASDQAVNSLIKAEINGSSQSVEVSFITGDISQLTS 845
A D + G VT + G ++++ S +Q+ + T + S +
Sbjct: 800 PAIASVDASSGQVTLKEKGTTTISVIS-------------SDNQTATYTIATPN-SLIVP 845

Query: 846 TIKTDDVSYTAGGKIKVSVTLMDEQKNLVKGMASLLAGSSVVEVSGTDKNETG----NWS 901
+ A K + +N ++ + ++ E + +
Sbjct: 846 NMSKRVTYNDAVNTCKNFGGKLPSSQNELENVFKAWGAANKYEYYKSSQTIISWVQQTAQ 905

Query: 902 EESDGVYTT 910
+ GV +T
Sbjct: 906 DAKSGVAST 914



Score = 88.2 bits (218), Expect = 3e-19
Identities = 79/419 (18%), Positives = 140/419 (33%), Gaps = 31/419 (7%)

Query: 1425 VTASINNSSQSQNVTFVADV-------RTAKIADLVVSQDNAVADGSTANTLRARVTDAF 1477
A N + S NV V + D + +A ADG+ A T A V
Sbjct: 529 ARAYDRNGNSSNNVLLTITVLSNGQVVDQVGVTDFTADKTSAKADGTEAITYTATVKKN- 587

Query: 1478 GNTLAGQTVSVMAGNGATV--APTVITEPDGTAEISVTSQTAGVSAVTASINNSSQSRDV 1535
G A VS +G V A + T G A +++ S G V+A + + +
Sbjct: 588 GVAQANVPVSFNIVSGTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAKTAEMTSALNA 647

Query: 1536 TF--IADIRTAQIASLEVTQDNAVADGAMANTLQVRVTDANGNTLAGQAVSVMAGNGATV 1593
D A I ++ + AVA+G A T V+V ++ Q V+ G
Sbjct: 648 NAVIFVDQTKASITEIKADKTTAVANGQDAITYTVKVMK-GDKPVSNQEVTFTTTLGKLS 706

Query: 1594 APAVTTQPDGTVEIPVTSQTAGASAVTASINNSSLSRDVTFIADVRTAQIAELVVIKDGS 1653
T +G ++ +TS T G S V+A +++ ++ + T I + + G+
Sbjct: 707 NSTEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFFTTLTIDDGNIEIVGT 766

Query: 1654 AADGAT-ANTLQARVTDAFGNALAGQTVSVLADNSATVAPAVITEPDGTVDISVTSQTAG 1712
G LQ + + G+ A+ + A + G VT + G
Sbjct: 767 GVKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPAI----ASVDASSG----QVTLKEKG 818

Query: 1713 ISTVTATINNHSLSQSVMFIADVRTAQIADLVVIKDGSEADGATANTLRARVTDAFGNAL 1772
+T++ ++ +Q+ + + I + K + D + N L
Sbjct: 819 TTTISVISSD---NQTATYTIATPNSLIV-PNMSKRVTYNDAVNTCKNFGGKLPSSQNEL 874

Query: 1773 AGQTVSVLADNG-----ATVAPTVITGQDGTVEISVTSQTAGISTVTATINSSSQSQNV 1826
+ A N ++ Q S + T + N + N
Sbjct: 875 ENVFKAWGAANKYEYYKSSQTIISWVQQTAQDAKSGVASTYDLVKQNPLNNIKASESNA 933



Score = 77.8 bits (191), Expect = 5e-16
Identities = 50/190 (26%), Positives = 79/190 (41%), Gaps = 6/190 (3%)

Query: 2026 ISTAQIADLVVIKDGSEADGATANTLRARVTDAFGNALAGQTVSVMAGNGATVAPAVT-- 2083
+ + D K ++ADG A T A V G A A VS +G V A +
Sbjct: 555 VDQVGVTDFTADKTSAKADGTEAITYTATVKKN-GVAQANVPVSFNIVSGTAVLSANSAN 613

Query: 2084 TQPDGTVEISVTSQTAGISAVTASINSSSQSRDVTF--IADVRTAKIAELEVIRDNAVAD 2141
T G +++ S G V+A + + + D A I E++ + AVA+
Sbjct: 614 TNGSGKATVTLKSDKPGQVVVSAKTAEMTSALNANAVIFVDQTKASITEIKADKTTAVAN 673

Query: 2142 GSTANTLQVKVTDANGNTLAGQTVSVLAGNSATVASTVTTKPDGTVEISVTSQTAGTSTV 2201
G A T VKV ++ Q V+ ST T +G ++++TS T G S V
Sbjct: 674 GQDAITYTVKVMK-GDKPVSNQEVTFTTTLGKLSNSTEKTDTNGYAKVTLTSTTPGKSLV 732

Query: 2202 SASINNSSQS 2211
SA +++ +
Sbjct: 733 SARVSDVAVD 742



Score = 75.5 bits (185), Expect = 2e-15
Identities = 62/347 (17%), Positives = 110/347 (31%), Gaps = 21/347 (6%)

Query: 1638 VRTAQIAELVVIKDGSAADGATANTLQARVTDAFGNALAGQTVSVLADNSATVAPAVI-- 1695
V + + K + ADG A T A V G A A VS + V A
Sbjct: 555 VDQVGVTDFTADKTSAKADGTEAITYTATVKKN-GVAQANVPVSFNIVSGTAVLSANSAN 613

Query: 1696 TEPDGTVDISVTSQTAGISTVTATINNH--SLSQSVMFIADVRTAQIADLVVIKDGSEAD 1753
T G +++ S G V+A +L+ + + D A I ++ K + A+
Sbjct: 614 TNGSGKATVTLKSDKPGQVVVSAKTAEMTSALNANAVIFVDQTKASITEIKADKTTAVAN 673

Query: 1754 GATANTLRARVTDAFGNALAGQTVSVLADNGATVAPTVITGQDGTVEISVTSQTAGISTV 1813
G A T +V ++ Q V+ G T T +G ++++TS T G S V
Sbjct: 674 GQDAITYTVKVMKG-DKPVSNQEVTFTTTLGKLSNSTEKTDTNGYAKVTLTSTTPGKSLV 732

Query: 1814 TATINSSSQSQNVTFIADVRTAQIAELVVIKDGSAADGVMANMLRARVTDAFGNALAGQT 1873
+A ++ + + L + D ++ V
Sbjct: 733 SARVSDVAVD-----VKAPEVEFFTTLTI-------DDGNIEIVGTGVKGKLPTVWLQYG 780

Query: 1874 VSVLAGNGATTAPTVTTQPDGTVEISVTSQTAGISAVTASINN--SSQSRNVTFIADVRT 1931
L +G T + + +S + + + SS ++ T+
Sbjct: 781 QVNLKASGGNGKYTWRSANPAIASVDASSGQVTLKEKGTTTISVISSDNQTATYTIATPN 840

Query: 1932 AQIADLVVIKDGSEADGATANTLRARVTDAFGNALAGQTVSVTAGNG 1978
+ I + K + D + N L + A N
Sbjct: 841 SLIV-PNMSKRVTYNDAVNTCKNFGGKLPSSQNELENVFKAWGAANK 886



Score = 74.7 bits (183), Expect = 5e-15
Identities = 64/350 (18%), Positives = 115/350 (32%), Gaps = 27/350 (7%)

Query: 1250 VRTAQIADLVVIKDGSEADGATANTLRARVTDAFGNALAGQTVSVLADNGATVAPVVT-- 1307
V + D K ++ADG A T A V G A A VS +G V +
Sbjct: 555 VDQVGVTDFTADKTSAKADGTEAITYTATVKKN-GVAQANVPVSFNIVSGTAVLSANSAN 613

Query: 1308 TQPDGTVEISVTSQTAGSSAVTVSINSSSQSRDVTF--IADVRTAQIADLVVIKDDSVAD 1365
T G +++ S G V+ + + + D A I ++ K +VA+
Sbjct: 614 TNGSGKATVTLKSDKPGQVVVSAKTAEMTSALNANAVIFVDQTKASITEIKADKTTAVAN 673

Query: 1366 GAMANMLRARVSDVFGNALAGQTVSVMADNGAAVASTMTTKPDGTVEISVTSQTAGISVV 1425
G A +V ++ Q V+ G ST T +G ++++TS T G S+V
Sbjct: 674 GQDAITYTVKV-MKGDKPVSNQEVTFTTTLGKLSNSTEKTDTNGYAKVTLTSTTPGKSLV 732

Query: 1426 TASINNSSQSQNVTFVADVRTAKIADLVVSQDNAVADGSTANTLRARVTDAFGNTLAGQT 1485
+A +++ + V L + D + V
Sbjct: 733 SARVSDVAVD-----VKAPEVEFFTTLTI-------DDGNIEIVGTGVKGKLPTVWLQYG 780

Query: 1486 VSVMAGNGATVAPTVITEPDGTAEIS-----VTSQTAGVSAVTASINNSSQSRDVTFIAD 1540
+ +G T + A + VT + G + ++ SS ++ T+
Sbjct: 781 QVNLKASGGNGKYTWRSANPAIASVDASSGQVTLKEKGTTTISVI---SSDNQTATYTIA 837

Query: 1541 IRTAQIASLEVTQDNAVADGAMANTLQVRVTDANGNTLAGQAVSVMAGNG 1590
+ I +++ D ++ N L + A N
Sbjct: 838 TPNSLIV-PNMSKRVTYNDAVNTCKNFGGKLPSSQNELENVFKAWGAANK 886



Score = 71.3 bits (174), Expect = 5e-14
Identities = 73/374 (19%), Positives = 125/374 (33%), Gaps = 30/374 (8%)

Query: 1812 TVTATINSSSQSQNVTFIADVRTAQ-------IAELVVIKDGSAADGVMANMLRARVTDA 1864
T A + + S NV V + + + K + ADG A A V
Sbjct: 528 TARAYDRNGNSSNNVLLTITVLSNGQVVDQVGVTDFTADKTSAKADGTEAITYTATVKKN 587

Query: 1865 FGNALAGQTVSVLAGNGATT--APTVTTQPDGTVEISVTSQTAGISAVTASINNSSQSRN 1922
G A A VS +G A + T G +++ S G V+A + + N
Sbjct: 588 -GVAQANVPVSFNIVSGTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAKTAEMTSALN 646

Query: 1923 VTF--IADVRTAQIADLVVIKDGSEADGATANTLRARVTDAFGNALAGQTVSVTAGNGAT 1980
D A I ++ K + A+G A T +V ++ Q V+ T G
Sbjct: 647 ANAVIFVDQTKASITEIKADKTTAVANGQDAITYTVKVMKG-DKPVSNQEVTFTTTLGKL 705

Query: 1981 VAPTVTTQPDGTAEISVTSQTAGVSAVTASINNSSQSRDVTFIADISTAQIADLVVIKDG 2040
T T +G A++++TS T G S V+A +++ + + +T I D + G
Sbjct: 706 SNSTEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFFTTLTIDDGNIEIVG 765

Query: 2041 SEADGATANTLRARVTDAFGNALAGQTVSVMAGNGATVAPAVTTQPDGTVEIS---VTSQ 2097
+ G GQ +G +V+ S VT +
Sbjct: 766 TGVKGKLPTVWLQY----------GQVNLKASGGNGKYTWRSANPAIASVDASSGQVTLK 815

Query: 2098 TAGISAVTASINSSSQSRDVTFIADVRTAKIAELEVIRDNAVADGSTANTLQVKVTDANG 2157
G + ++ SS ++ T+ + I + + D ++
Sbjct: 816 EKGTTTISV---ISSDNQTATYTIATPNSLIV-PNMSKRVTYNDAVNTCKNFGGKLPSSQ 871

Query: 2158 NTLAGQTVSVLAGN 2171
N L + A N
Sbjct: 872 NELENVFKAWGAAN 885



Score = 69.7 bits (170), Expect = 1e-13
Identities = 76/379 (20%), Positives = 138/379 (36%), Gaps = 26/379 (6%)

Query: 1035 NALTSNDYSISGDAASAQIVAMQV-----TTGNPDVLANGSDRHTVNVRVEDQFGNVLSE 1089
N +SN+ ++ S V QV T A+G++ T V+ G +
Sbjct: 535 NGNSSNNVLLTITVLSNGQVVDQVGVTDFTADKTSAKADGTEAITYTATVKKN-GVAQAN 593

Query: 1090 QTVTFTVTKGAAVFANAGQSADIRTDAHGMAEVDLSSTVADASTVEAKINQSSDSKTVNF 1149
V+F + G AV + + T+ G A V L S V AK + + + N
Sbjct: 594 VPVSFNIVSGTAVLS----ANSANTNGSGKATVTLKSDKPGQVVVSAKTAEMTSALNANA 649

Query: 1150 V--ADVSTAQVAELVVTQDGSVADGSTANTLRARVTDVFGNALAGQTVSVLAGNGATTAP 1207
V D + A + E+ + +VA+G A T +V ++ Q V+ G +
Sbjct: 650 VIFVDQTKASITEIKADKTTAVANGQDAITYTVKVMKG-DKPVSNQEVTFTTTLGKLSNS 708

Query: 1208 TVTTQPDGTVEISVTSQTAGTSVITASVNNSSQSRDVTFIADVRTAQIADLVVIKDGSEA 1267
T T +G ++++TS T G S+++A V++ + + T I D + G+
Sbjct: 709 TEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFFTTLTIDDGNIEIVGTGV 768

Query: 1268 DGAT-ANTLRARVTDAFGNALAGQTVSVLADNGATVAPVVTTQPDGTVEISVTSQTAGSS 1326
G L+ + + G+ A+ +A V + VT + G++
Sbjct: 769 KGKLPTVWLQYGQVNLKASGGNGKYTWRSANPA--IASVDASSG------QVTLKEKGTT 820

Query: 1327 AVTVSINSSSQSRDVTFIADVRTAQIADLVVIKDDSVADGAMANMLRARVSDVFGNALAG 1386
++V SS ++ T+ + I + K + D N L
Sbjct: 821 TISV---ISSDNQTATYTIATPNSLIV-PNMSKRVTYNDAVNTCKNFGGKLPSSQNELEN 876

Query: 1387 QTVSVMADNGAAVASTMTT 1405
+ A N + T
Sbjct: 877 VFKAWGAANKYEYYKSSQT 895



Score = 66.6 bits (162), Expect = 1e-12
Identities = 76/428 (17%), Positives = 130/428 (30%), Gaps = 71/428 (16%)

Query: 1657 GATANTLQARVTDAFGNALAGQTVSVLADNSATVAPAVITEPDGTVDISVTSQTAGISTV 1716
G+ + AR D GN + N+ + V++ + VT TA
Sbjct: 521 GSNVYKVTARAYDRNGN----------SSNNVLLTITVLSNGQVVDQVGVTDFTAD---- 566

Query: 1717 TATINNHSLSQSVMFIADVRTAQIADLVVIKDGSEADGATANTLRARVTDAFGNALAGQT 1776
K ++ADG A T A V G A A
Sbjct: 567 ------------------------------KTSAKADGTEAITYTATVKKN-GVAQANVP 595

Query: 1777 VSVLADNGATV--APTVITGQDGTVEISVTSQTAGISTVTATINSSSQSQNVTF--IADV 1832
VS +G V A + T G +++ S G V+A + + N D
Sbjct: 596 VSFNIVSGTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAKTAEMTSALNANAVIFVDQ 655

Query: 1833 RTAQIAELVVIKDGSAADGVMANMLRARVTDAFGNALAGQTVSVLAGNGATTAPTVTTQP 1892
A I E+ K + A+G A +V ++ Q V+ G + T T
Sbjct: 656 TKASITEIKADKTTAVANGQDAITYTVKVMKG-DKPVSNQEVTFTTTLGKLSNSTEKTDT 714

Query: 1893 DGTVEISVTSQTAGISAVTASINNSSQSRNVTFIADVRTAQIADLVVIKDGSEADGATAN 1952
+G ++++TS T G S V+A +++ + + L + D
Sbjct: 715 NGYAKVTLTSTTPGKSLVSARVSDVAVD-----VKAPEVEFFTTLTI-------DDGNIE 762

Query: 1953 TLRARVTDAFGNALAGQTVSVTAGNGATVAPTVTTQPDGTAEIS-----VTSQTAGVSAV 2007
+ V +G T + A + VT + G + +
Sbjct: 763 IVGTGVKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPAIASVDASSGQVTLKEKGTTTI 822

Query: 2008 TASINNSSQSRDVTFIADISTAQIADLVVIKDGSEADGATANTLRARVTDAFGNALAGQT 2067
+ SS ++ T+ + I + K + D + N L
Sbjct: 823 SVI---SSDNQTATYTIATPNSLIV-PNMSKRVTYNDAVNTCKNFGGKLPSSQNELENVF 878

Query: 2068 VSVMAGNG 2075
+ A N
Sbjct: 879 KAWGAANK 886



Score = 54.3 bits (130), Expect = 7e-09
Identities = 57/323 (17%), Positives = 103/323 (31%), Gaps = 38/323 (11%)

Query: 885 SVVEVSGTDKNETGNWSEESDGVYTTTRTAKIAGDRHYATLKLSTWSSAQQSDAYAIRES 944
S VSGT + + G T T + G + + K + +SA ++A +
Sbjct: 597 SFNIVSGTAVLSANSANTNGSGKATVTLKSDKPG-QVVVSAKTAEMTSALNANAVIFVDQ 655

Query: 945 GAVLAYSSIVTDKTAYTAGGAIKVTVTLKDSY-ENLVGGQRDAINLAIQLPNTKAESIAW 1003
+ + I DKT A G +T T+K + V Q + + E
Sbjct: 656 TKA-SITEIKADKTTAVANGQDAITYTVKVMKGDKPVSNQEVTFTTTLGKLSNSTEK--- 711

Query: 1004 NEDQKGIYTATYTALLPGTGLKAQLQMSGWANALTSNDYSISGDAASAQIVAMQVTTGNP 1063
D G T T+ PG L + ++S A + + + + + GN
Sbjct: 712 -TDTNGYAKVTLTSTTPGKSLVSA-RVSDVAVDVKAPEVEFFTT--------LTIDDGNI 761

Query: 1064 DVLANGSDRHTVNVRVE-DQFGNVLSEQTVTFTVTKGAAVFANAGQSADIRTDAHGMAEV 1122
+++ G V ++ Q S +T +A V
Sbjct: 762 EIVGTGVKGKLPTVWLQYGQVNLKASGGNGKYTWRSANP----------------AIASV 805

Query: 1123 DLSS---TVADASTVEAKINQSSDSKTVNFVADVSTAQVAELVVTQDGSVADGSTANTLR 1179
D SS T+ + T + SSD++T + + + +++ + D
Sbjct: 806 DASSGQVTLKEKGTTTISVI-SSDNQTATYTIATPNSLIV-PNMSKRVTYNDAVNTCKNF 863

Query: 1180 ARVTDVFGNALAGQTVSVLAGNG 1202
N L + A N
Sbjct: 864 GGKLPSSQNELENVFKAWGAANK 886



Score = 38.9 bits (90), Expect = 4e-04
Identities = 25/127 (19%), Positives = 44/127 (34%), Gaps = 9/127 (7%)

Query: 2446 EPLTVTITLRDEFGNPALGLTSEVIESYIDSFAVGGATPDSMRWVEQNNGEYTIVWTAWI 2505
E +T T T++ A S I S G A + +G+ T+ +
Sbjct: 576 EAITYTATVKKNGVAQANVPVSFNIVS-------GTAVLSANSANTNGSGKATVTLKSD- 627

Query: 2506 AEENLVASLKLKTWAEEIKSSLYGIQPGAAAKTQSTIVADKTIYIAGDSITVTVVLKDAQ 2565
+V S K + ++ A + + I ADKT +A +T +K +
Sbjct: 628 KPGQVVVSAKTAEMTSALNANAVIFVDQTKA-SITEIKADKTTAVANGQDAITYTVKVMK 686

Query: 2566 GNFITDG 2572
G+
Sbjct: 687 GDKPVSN 693


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
EcSMS35_4025  TCRTETA  38     6e-05    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 37.9 bits (88), Expect = 6e-05
Identities = 33/208 (15%), Positives = 71/208 (34%), Gaps = 13/208 (6%)

Query: 33 IIVEFLPVSLLTP----MAQDLGISEGVA---GQSVTVTAFVAMFASLFITQTIQATDRR 85
+ ++ + + L+ P + +DL S V G + + A + + + RR
Sbjct: 14 VALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRR 73

Query: 86 YVVILFAVLLTLSCLLVSFANSFSLLLIGRACLGLALGGFWAMSASLTMRLVPPRTVPKA 145
V+++ + +++ A +L IGR G+ G A++ + + +
Sbjct: 74 PVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGIT-GATGAVAGAYIADITDGDERARH 132

Query: 146 LSVIFGAVSIALVIAAPLGSFLGELIGWRNVFNAAAVMG----VLCIFWIIKSLPSLPGE 201
+ +V LG +G F AAA + + F + +S
Sbjct: 133 FGFMSACFGFGMVAGPVLGGLMGG-FSPHAPFFAAAALNGLNFLTGCFLLPESHKGERRP 191

Query: 202 PSHQKQNTFRLLQRPGVMAGMIAIFMSF 229
+ N + M + A+ F
Sbjct: 192 LRREALNPLASFRWARGMTVVAALMAVF 219


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits    Score  E-value  Comments
EcSMS35_4030  UREASE  38     9e-05    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 38.2 bits (89), Expect = 9e-05
Identities = 28/105 (26%), Positives = 41/105 (39%), Gaps = 17/105 (16%)

Query: 22 AVSRGDAVADYIIDNVSILDLINGGEISGPIVIKGRYIAGVG----------AEYTDAPA 71
V+R D +I N ILD + G + I +K IA +G P
Sbjct: 60 QVTREGGAVDTVITNALILD--HWGIVKADIGLKDGRIAAIGKAGNPDMQPGVTIIVGPG 117

Query: 72 LQRIDARGATAVPGFIDAHLHIESSMMTPVTFETATLPRGLTTVI 116
+ I G G +D+H+H + P E A L GLT ++
Sbjct: 118 TEVIAGEGKIVTAGGMDSHIH----FICPQQIEEA-LMSGLTCML 157


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
EcSMS35_4031  TCRTETB  35     7e-04    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 34.9 bits (80), Expect = 7e-04
Identities = 61/372 (16%), Positives = 133/372 (35%), Gaps = 32/372 (8%)

Query: 49 FNIAQNDMISTYGLSMTQLGMIGLGFSITYGVGKTLVSYYADGKNTKQFLPFMLILSAIC 108
N++ D+ + + + F +T+ +G + +D K+ L F +I++ C
Sbjct: 33 LNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIIN--C 90

Query: 109 MLGFSASMGSGSVSLFLMIAFYALSGFFQSTGGSCSYSTI----TKWTPRRKRGTFLGFW 164
+G SL +M + F Q G + + + ++ P+ RG G
Sbjct: 91 FGSVIGFVGHSFFSLLIM------ARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLI 144

Query: 165 NISHNLGGA-----GAAGVALFGANYLFDGHVIGMFIFPSIIALIVGFIGLRYGSDSPES 219
+G G +YL +I + P ++ L+ + ++ D
Sbjct: 145 GSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDIKGI 204

Query: 220 YGLGKAEELFGEEISEEDKETESTDMTKWQIFVEYVLK--NKVIWLLCFANI-FLYVVRI 276
+ F + + + IFV+++ K + + NI F+ V
Sbjct: 205 ILMSVGIVFFMLFTTSYSISFLIVSVLSFLIFVKHIRKVTDPFVDPGLGKNIPFMIGVLC 264

Query: 277 GIDQWSTVYAFQELKLSKAVAIQGFTLFEAG------ALVGTLLWGWLSDLANGRRG--L 328
G + TV F + + + E G + +++G++ + RRG
Sbjct: 265 GGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYIGGILVDRRGPLY 324

Query: 329 VACIALALIIA---TLGVYQHASNQYIYLASLFALGFLVFGPQLLIGVAAVGFVPKKAIG 385
V I + + T ++ ++ + +F LG L F ++ + + ++A G
Sbjct: 325 VLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGLSFTKTVISTIVSSSLKQQEA-G 383

Query: 386 AADGIKGTFAYL 397
A + ++L
Sbjct: 384 AGMSLLNFTSFL 395


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
EcSMS35_4032  TCRTETB  40     1e-05    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 40.2 bits (94), Expect = 1e-05
Identities = 65/408 (15%), Positives = 137/408 (33%), Gaps = 60/408 (14%)

Query: 29 RHILLTIWLGYALFY--FTRKSFNAAVPEILANGVLSRSDIGLLATLFYITYGVSKFVSG 86
RH + IWL F+ N ++P+I + + + T F +T+ + V G
Sbjct: 11 RHNQILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYG 70

Query: 87 IVSDRSNARYFMGIGLIATGIINILFGFSTSLWAFAVLWVLNAFFQGWGS---PVCARLL 143
+SD+ + + G+I +++ S F L ++ F QG G+ P ++
Sbjct: 71 KLSDQLGIKRLLLFGIIINCFGSVIGFVGHS---FFSLLIMARFIQGAGAAAFPALVMVV 127

Query: 144 TAWY-SRTERGGWWALWNTAHNVGGALIPIVMAASALHYGWRAGMMIAGCMAIVVGIFLC 202
A Y + RG + L + +G + P + A + W ++ M ++ +
Sbjct: 128 VARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHW--SYLLLIPMITIITVPF- 184

Query: 203 WRLRDRPQALGLPAVGEWRHDALEIAQQQEGAGLTRKEILTKYVLLNPYIWLLSFCYVLV 262
+ L +I G L I+ + Y VL
Sbjct: 185 --------LMKLLKKEVRIKGHFDIK----GIILMSVGIVFFMLFTTSYSISFLIVSVLS 232

Query: 263 YVV-----RAAINDWGNLYMSETLGVDLVTANTAVTMFELGGFI-----------GALVA 306
+++ R + + + + + + + + + GF+ A
Sbjct: 233 FLIFVKHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTA 292

Query: 307 GWGSDKLFNGNRGPMNLIFAAGILL-SVGSLWLMPFASYVMQATCFFTIGFFVFGPQMLI 365
GS +F G + + GIL+ G L+++ + + F T F + +
Sbjct: 293 EIGSVIIFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFL-SVSFLTASFLLETTSWFM 351

Query: 366 ---------GMAAAECS---------HKEAAGAATGFVGLFAYLGASL 395
G++ + ++ AGA + ++L
Sbjct: 352 TIIIVFVLGGLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGT 399


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
EcSMS35_4033  PF06580  40     1e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 40.2 bits (94), Expect = 1e-05
Identities = 32/166 (19%), Positives = 68/166 (40%), Gaps = 18/166 (10%)

Query: 341 KQSGQLIEHLSLGVYDAVRRLLGRLRPRQLDDLTLEQAIRSLMREMELEGRGIVSHLEWR 400
++ +++ LS + +R L RQ+ +L + + ++L L++
Sbjct: 191 TKAREMLTSLS----ELMRYSLRYSNARQV---SLADELTVVDSYLQLASIQFEDRLQFE 243

Query: 401 IDESALSENQRVTLFRVCQEGLNNIVKHA-----DASAVTLQGWQQDERLMLVIEDDGSG 455
+ + +V + Q + N +KH + L+G + + + L +E+ GS
Sbjct: 244 NQINPAIMDVQVPPM-LVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSL 302

Query: 456 LPPDSGQ-HGFGLTGMRERVTALGG---TLTISCLHG-TRVSVSLP 496
++ + G GL +RER+ L G + +S G V +P
Sbjct: 303 ALKNTKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits    Score  E-value  Comments
EcSMS35_4034  HTHFIS  61     2e-13    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 61.4 bits (149), Expect = 2e-13
Identities = 29/174 (16%), Positives = 59/174 (33%), Gaps = 20/174 (11%)

Query: 2 ITVALIDDHLIVRSGFAQLLGLEPDLQVVAEFGSGREALAGLPGRGVQVCICDISMPDIS 61
T+ + DD +R+ Q L V + + + + D+ MPD +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRA-GYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 62 GLELLSQLPK---GMATIMLSVHDSPALVEQALNAGARGFLSKRCSPDELIAAVHTVATG 118
+LL ++ K + +++S ++ +A GA +L K ELI +
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR---- 117

Query: 119 GCYLTPDIAIKLASGRQDPLTKRERQVAEKLAQG---MAVKEIAAELGLSPKTV 169
A+ R L + + + + + A L + T+
Sbjct: 118 --------ALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTL 163


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
EcSMS35_4039  TCRTETB  60     6e-12    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 59.9 bits (145), Expect = 6e-12
Identities = 41/184 (22%), Positives = 81/184 (44%), Gaps = 1/184 (0%)

Query: 5 RNVNLLLMLVLLVAVGQMAQTIYIPAIADMARDLNVREGAVQSVMGAYLLTYGVSQLFYG 64
R+ +L+ L +L + + + ++ D+A D N + V A++LT+ + YG
Sbjct: 11 RHNQILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYG 70

Query: 65 PISDRVGRRPVILVGMSIFMLATLVA-VTTSSLTVLIAASAMQGMGTGVGGVMARTLPRD 123
+SD++G + ++L G+ I +++ V S ++LI A +QG G + +
Sbjct: 71 KLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVAR 130

Query: 124 LYERTQLRHANSLLNMGILVSPLLAPLIGGLLDTMWNWRACYLFLLVLCAGVTFSMARWM 183
+ A L+ + + + P IGG++ +W L ++ V F M
Sbjct: 131 YIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPMITIITVPFLMKLLK 190

Query: 184 PETR 187
E R
Sbjct: 191 KEVR 194


108  EcSMS35_4302  EcSMS35_4310  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
EcSMS35_4302  -2      15       0.072144   2-keto-3-deoxygluconate permease
EcSMS35_4303  -1      17       -0.533979  hypothetical protein
EcSMS35_4304  0       17       -0.838587  two-component sensor protein
EcSMS35_4305  1       26       -3.443390  DNA-binding transcriptional regulator CpxR
EcSMS35_4306  1       29       -4.095972  periplasmic repressor CpxP
EcSMS35_4307  0       28       -5.762291  phage integrase family site specific
EcSMS35_4308  2       30       -5.662329  regulatory protein Cox
EcSMS35_4309  3       27       -5.388971  hypothetical protein
EcSMS35_4310  0       22       -1.107385  putative replication gene B protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_4302  ACRIFLAVINRP  29     0.022    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 29.4 bits (66), Expect = 0.022
Identities = 38/172 (22%), Positives = 67/172 (38%), Gaps = 17/172 (9%)

Query: 162 TAGIASFEPHVFVGAVLPFLVGFA-LGNLDPELREFFSKAVQTLIPF-FAFALGNTID-L 218
I +F +L FLV + L N+ L + V L F A G +I+ L
Sbjct: 334 QLSIHEVVKTLFEAIMLVFLVMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTL 393

Query: 219 TVIAQTGLLGILLGVAVIIVTGIPLIIADKLIGGGDGTAGIAASSSAGAAV--ATPVLIA 276
T+ +G+L+ A+++V + ++ + A + S A+ VL A
Sbjct: 394 TMFGMVLAIGLLVDDAIVVVENVERVMMED--KLPPKEATEKSMSQIQGALVGIAMVLSA 451

Query: 277 EMVPA----------FKPMAPAATSLVATAVIVTSILVPILTSIWSRKVKAR 318
+P ++ + S +A +V+V IL P L + + V A
Sbjct: 452 VFIPMAFFGGSTGAIYRQFSITIVSAMALSVLVALILTPALCATLLKPVSAE 503


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
EcSMS35_4304  PF06580  29     0.027    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 29.4 bits (66), Expect = 0.027
Identities = 28/186 (15%), Positives = 63/186 (33%), Gaps = 33/186 (17%)

Query: 277 IETEAQRLDSMINDLLVMSRNQQKNALVSETIKANQLWSEV-LDNAAFEAEQM--GKSLT 333
I + + M+ L + R + + + L E+ + ++ + + L
Sbjct: 186 ILEDPTKAREMLTSLSELMRYSLRYSNARQV----SLADELTVVDSYLQLASIQFEDRLQ 241

Query: 334 VNF--PPGPWPLYGNPNALESALENIVRNAL--RYSHTKIEVGFAVDKDGITITVDDDGP 389
P + P +++ +EN +++ + KI + D +T+ V++ G
Sbjct: 242 FENQINPAIMDVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGS 301

Query: 390 GVSPEDREQIFRPFYRTDEARDRESGGTGLGLAIVETAIQQHRGW---VKAEDSPLGGLR 446
+E TG GL V +Q G +K + G +
Sbjct: 302 LALKNTKE------------------STGTGLQNVRERLQMLYGTEAQIKLSEKQ-GKVN 342

Query: 447 LVIWLP 452
++ +P
Sbjct: 343 AMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits    Score  E-value  Comments
EcSMS35_4305  HTHFIS  92     9e-24    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 92.2 bits (229), Expect = 9e-24
Identities = 35/117 (29%), Positives = 62/117 (52%), Gaps = 2/117 (1%)

Query: 3 KILLVDDDRELTSLLKELLEMEGFNVIVAHDGEQALDLL-DDSIDLLLLDVMMPKKNGID 61
IL+ DDD + ++L + L G++V + + + DL++ DV+MP +N D
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 TLKALRQTH-QTPVIMLTARGSELDRVLGLELGADDYLPKPFNDRELVARIRAILRR 117
L +++ PV++++A+ + + + E GA DYLPKPF+ EL+ I L
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits         Score  E-value  Comments
EcSMS35_4308  MICOLLPTASE  26     0.023    Microbial collagenase metalloprotease (M9) signature.
		>MICOLLPTASE#Microbial collagenase metalloprotease (M9) signature.

Length = 1104

Score = 26.2 bits (57), Expect = 0.023
Identities = 6/22 (27%), Positives = 14/22 (63%)

Query: 28 ETAVVKMVKENKLPVIELRDPS 49
E+ +K+V++ + VI +P+
Sbjct: 849 ESKKIKVVEDKPVEVINESEPN 870


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits  Score  E-value  Comments
EcSMS35_4310  SECA  28     0.016    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 28.3 bits (63), Expect = 0.016
Identities = 11/72 (15%), Positives = 26/72 (36%)

Query: 6 VEKQPAAMRRIIGKHLAVPRWQDTCDYYNQMMERERLTVCFHAQLKQRHATMRFEEMNDV 65
+ ++ L + W D ++ RER+ +++ + E M
Sbjct: 703 IPGLQERLKNDFDLDLPIAEWLDKEPELHEETLRERILAQSIEVYQRKEEVVGAEMMRHF 762

Query: 66 ERERLVCAIDEL 77
E+ ++ +D L
Sbjct: 763 EKGVMLQTLDSL 774


109  EcSMS35_4571  EcSMS35_4578  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
EcSMS35_4571  -1      14       0.988568   phosphonate/organophosphate ester transporter
EcSMS35_4572  -2      14       0.459988   hypothetical protein
EcSMS35_4573  -3      15       0.653018   alkylphosphonate utilization operon protein
EcSMS35_4574  -2      14       0.486816   hypothetical protein
EcSMS35_4575  -1      13       1.245127   hypothetical protein
EcSMS35_4576  -1      13       -0.554815  proline/glycine betaine transporter
EcSMS35_4577  -1      16       -0.674391  sensor protein BasS/PmrB
EcSMS35_4578  0       16       -0.579438  DNA-binding transcriptional regulator BasR
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
EcSMS35_4571  PF05272  29     0.020    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.020
Identities = 12/22 (54%), Positives = 13/22 (59%)

Query: 32 MVALLGPSGSGKSTLLRHLSGL 53
V L G G GKSTL+ L GL
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGL 619


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_4572  FLGLRINGFLGH  26     0.045    Flagellar L-ring protein signature.
		>FLGLRINGFLGH#Flagellar L-ring protein signature.

Length = 232

Score = 26.5 bits (58), Expect = 0.045
Identities = 11/42 (26%), Positives = 23/42 (54%)

Query: 66 IAGSDIMMSDAIPSGKASYSGFTLVLDSQQVEEGKRWFDNLA 107
I+GS+ + S + + Y G + ++Q + +R+F NL+
Sbjct: 189 ISGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLS 230


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
EcSMS35_4576  TCRTETA  43     1e-06    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 43.3 bits (102), Expect = 1e-06
Identities = 57/290 (19%), Positives = 105/290 (36%), Gaps = 55/290 (18%)

Query: 85 FFGMLGDKYGRQKILAITIVIMSISTFCIGLIPSYDTIGIWAPILLLICKMAQGFSVGGE 144
G L D++GR+ +L +++ ++ + P +W +L I ++ G + G
Sbjct: 62 VLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPF-----LW---VLYIGRIVAGIT-GAT 112

Query: 145 YTGASIFVAEYSPDRKR----GFMGSWLDFGSIAGFVLGAGVVVLISTIVGEENFLDWGW 200
A ++A+ + +R GFM + FG +AG VLG G++ S
Sbjct: 113 GAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLG-GLMGGFSP------------ 159

Query: 201 RIPFFIALPLGIIGLYLRHALEETPAFQQHVDKLEQGDREGLQDGPKVSFKEIATKYWRS 260
PFF A L + L K E+ P SF+ W
Sbjct: 160 HAPFFAAAALNGLNFLTGCFLLPESH------KGERRPLRREALNPLASFR------WAR 207

Query: 261 LLTCIGLVIATNVTYYML----LTYMPSYLSHNLHYS-EDHGVLIIIAIMIGMLFVQPVM 315
+T + ++A ++ + H+ G+ + ++ L +
Sbjct: 208 GMTVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMIT 267

Query: 316 GLLSDRFGRRPFVLLG----SVALFVLA--------IPAFILINSNVIGL 353
G ++ R G R ++LG +LA P +L+ S IG+
Sbjct: 268 GPVAARLGERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLASGGIGM 317



Score = 39.0 bits (91), Expect = 3e-05
Identities = 39/164 (23%), Positives = 73/164 (44%), Gaps = 16/164 (9%)

Query: 286 LSHNLHYSEDHGVLI-IIAIMIGMLFVQPVMGLLSDRFGRRPFVLLGSVALFVLAIPAFI 344
L H+ + +G+L+ + A+M PV+G LSDRFGRRP +L+ L A+ I
Sbjct: 35 LVHSNDVTAHYGILLALYALM--QFACAPVLGALSDRFGRRPVLLVS---LAGAAVDYAI 89

Query: 345 LINSNVIGLIFAGLLMLAVILNCFTGVMASTLPAMFPTHIR---YSALAAAFNISVLVAG 401
+ + + +++ G ++A I V + + + R + ++A F +VAG
Sbjct: 90 MATAPFLWVLYIG-RIVAGITGATGAVAGAYIADITDGDERARHFGFMSACFGFG-MVAG 147

Query: 402 LTPTLAAWLVESSQNLMMPAYYLMVVAVVGLITG-VTMKETANR 444
P L + S + P + + + +TG + E+
Sbjct: 148 --PVLGGLMGGFSPH--APFFAAAALNGLNFLTGCFLLPESHKG 187


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
EcSMS35_4577  PF06580  37     1e-04    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 36.8 bits (85), Expect = 1e-04
Identities = 40/182 (21%), Positives = 80/182 (43%), Gaps = 34/182 (18%)

Query: 184 ARLDQMMESVSQLLQLARAGQSFSSGNYQHVKLLEDV-ILPSYDELSTML--DQRQQTLL 240
+ +M+ S+S+L++ S N + V L +++ ++ SY +L+++ D+ Q
Sbjct: 191 TKAREMLTSLSELMR-----YSLRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQ 245

Query: 241 LPESAADITVQGDATLLRMLLRNLVENAHRY----SPQGSNIMIKLQEDGGAV-MAVEDE 295
+ + D+ V ML++ LVEN ++ PQG I++K +D G V + VE+
Sbjct: 246 INPAIMDVQV------PPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENT 299

Query: 296 GPGIDESKCGELSKAFVRMDSRYGGIGLGLSIV-SRITQLHHGQFFLQNRQETSGTRAWV 354
G + + G GL V R+ L+ + ++ ++ A V
Sbjct: 300 GSLA--------------LKNTKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMV 345

Query: 355 RL 356
+
Sbjct: 346 LI 347


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits    Score  E-value  Comments
EcSMS35_4578  HTHFIS  91     2e-23    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 90.7 bits (225), Expect = 2e-23
Identities = 41/121 (33%), Positives = 60/121 (49%)

Query: 2 KILIVEDDTLLLQGLILAAQTEGYACDGVTTARMAEQSLEAGHYSLVVLDLGLPDEDGLH 61
IL+ +DD + L A GY + A + + AG LVV D+ +PDE+
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 FLARIRQKKYTLPVLILTARDTLTDKIAGLDVGADDYLVKPFALEELHARIRALLRRHNN 121
L RI++ + LPVL+++A++T I + GA DYL KPF L EL I L
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 122 Q 122
+
Sbjct: 125 R 125


110  EcSMS35_4820  EcSMS35_4828  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
EcSMS35_4820  2       33       -7.554145  HNH endonuclease domain-containing protein
EcSMS35_4822  2       34       -7.858482  hypothetical protein
EcSMS35_4823  1       27       -6.184505  hypothetical protein
EcSMS35_4824  0       20       -4.996350  hypothetical protein
EcSMS35_4825  -1      13       -2.679933  hypothetical protein
EcSMS35_4827  -2      9        -1.269921  DEAD/DEAH box helicase domain-containing
EcSMS35_4826  -2      7        -1.103126  hypothetical protein
EcSMS35_4828  -2      7        -1.043193  helicase family protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_4820  FIMBRIALPAPE  28     0.016    Escherichia coli: P pili tip fibrillum papE protein...
		>FIMBRIALPAPE#Escherichia coli: P pili tip fibrillum papE protein signature.

Length = 173

Score = 28.5 bits (63), Expect = 0.016
Identities = 10/35 (28%), Positives = 18/35 (51%)

Query: 49 AGIAATLMTGTQFTPADFTGGIKSKCVKLLIEQGF 83
+GI + G+Q TP TG ++ + L + G+
Sbjct: 116 SGIGNAVTLGSQVTPGKITGTAPARKITLYAKLGY 150


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits        Score  E-value  Comments
EcSMS35_4823  OMPADOMAIN  58     4e-12    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 58.0 bits (140), Expect = 4e-12
Identities = 36/136 (26%), Positives = 53/136 (38%), Gaps = 28/136 (20%)

Query: 95 SPDVLFGLGSTELKPKFKLILDDFFPRYLKVLDNYQEHITEVRIEGHTSTDWTGTTNPDI 154
DVLF LKP+ + LD + L N V + G+ TD G+
Sbjct: 218 KSDVLFNFNKATLKPEGQAALD----QLYSQLSNLDPKDGSVVVLGY--TDRIGSD---- 267

Query: 155 AYFNNMALSQGRTRAVLQYVYDIKNIATHQQWVKSKFAAVGYSSAHPILDKTGKEDPNRS 214
AY N LS+ R ++V+ Y+ K I K +A G ++P+ T R+
Sbjct: 268 AY--NQGLSERRAQSVVDYLI-SKGIP------ADKISARGMGESNPVTGNTCDNVKQRA 318

Query: 215 ---------RRVTFKV 221
RRV +V
Sbjct: 319 ALIDCLAPDRRVEIEV 334


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits         Score  E-value  Comments
EcSMS35_4824  FLGHOOKFLIK  32     0.005    Flagellar hook-length control protein signature.
		>FLGHOOKFLIK#Flagellar hook-length control protein signature.

Length = 375

Score = 32.1 bits (72), Expect = 0.005
Identities = 27/111 (24%), Positives = 49/111 (44%), Gaps = 19/111 (17%)

Query: 242 EWQENYKTQVELMSEQYQQSVESLVETKTAVAGIWEECKEIPLAMSELREVLQVNQHQIS 301
EWQ++ + L + Q QQS E + ++ E+ +++ + NQ QI
Sbjct: 239 EWQQSLSQHISLFTRQGQQSAELRLHP--------QDLGEVQISLK-----VDDNQAQIQ 285

Query: 302 ELSRHLETFVAIRDKATTVLPEIQNKMAEVGELLKSGAANVSASLEQTSQQ 352
+S H +R LP ++ ++AE G ++ G +N+S QQ
Sbjct: 286 MVSPHQH----VRAALEAALPVLRTQLAESG--IQLGQSNISGESFSGQQQ 330


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits       Score  E-value  Comments
EcSMS35_4828  RTXTOXIND  31     0.030    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 30.6 bits (69), Expect = 0.030
Identities = 26/163 (15%), Positives = 59/163 (36%), Gaps = 16/163 (9%)

Query: 334 RLASGAEEEAYRRLVESQFRDDDDEQAQSN---KGRLFKITLEKALFSSPMACASVVANR 390
+ S E L++ QF +++ Q + + A + + V +R
Sbjct: 177 QNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSR 236

Query: 391 LKRLESRKDHN--SQSQINELESLLLALNNIDASQFSKYQLLLDTIRKDLAWKANNTEDR 448
L S ++ + E E+ + N S+ L+ I ++ + ++
Sbjct: 237 LDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQ----LEQIESEIL----SAKEE 288

Query: 449 LVIFTESIKTLEFLEQ--QLRADLKLKDDQIATLRGDQGDTVL 489
+ T+ K E L++ Q ++ L ++A Q +V+
Sbjct: 289 YQLVTQLFKN-EILDKLRQTTDNIGLLTLELAKNEERQQASVI 330


111  EcSMS35_4844  EcSMS35_4848  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
EcSMS35_4844  2       11       -0.507870  outer membrane usher protein fimD
EcSMS35_4845  0       18       2.155463   fimbrial protein FimF
EcSMS35_4846  0       19       2.473819   protein fimG
EcSMS35_4847  -1      17       0.681636   protein FimH
EcSMS35_4848  0       20       0.472010   fructuronate transporter
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
EcSMS35_4844  PF00577  1053   0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 1053 bits (2725), Expect = 0.0
Identities = 852/855 (99%), Positives = 855/855 (100%)

Query: 2 AGFFVRLFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPGTYRVDIY 61
AGFFVRLFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPGTYRVDIY
Sbjct: 24 AGFFVRLFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPGTYRVDIY 83

Query: 62 LNNGYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDACVPLTSMIH 121
LNNGYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDACVPLTSMIH
Sbjct: 84 LNNGYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDACVPLTSMIH 143

Query: 122 DATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSVQNRIGGN 181
DATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSVQNRIGGN
Sbjct: 144 DATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSVQNRIGGN 203

Query: 182 SHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRSRLTL 241
SHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRSRLTL
Sbjct: 204 SHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRSRLTL 263

Query: 242 GDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIYNS 301
GDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIYNS
Sbjct: 264 GDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIYNS 323

Query: 302 TVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITAGEY 361
TVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITAGEY
Sbjct: 324 TVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITAGEY 383

Query: 362 RSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALSVDMTQ 421
RSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALSVDMTQ
Sbjct: 384 RSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALSVDMTQ 443

Query: 422 ANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGYNIE 481
ANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGYNIE
Sbjct: 444 ANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGYNIE 503

Query: 482 TQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRSSTLYLSGSHQTYWGTNNVDEQFQ 541
TQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGR+STLYLSGSHQTYWGT+NVDEQFQ
Sbjct: 504 TQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRTSTLYLSGSHQTYWGTSNVDEQFQ 563

Query: 542 AGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRSDSKSQWRHASASYSM 601
AGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRSDSKSQWRHASASYSM
Sbjct: 564 AGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRSDSKSQWRHASASYSM 623

Query: 602 SHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGYGNANIG 661
SHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGYGNANIG
Sbjct: 624 SHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGYGNANIG 683

Query: 662 YSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNDTVVLVKAPGAKDAKVENQTGVRTDWRG 721
YSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNDTVVLVKAPGAKDAKVENQTGVRTDWRG
Sbjct: 684 YSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNDTVVLVKAPGAKDAKVENQTGVRTDWRG 743

Query: 722 YAVLPYATEYRENRVALDTNTLADNVDLDNSVANVVPTRGAIVRAEFKARVGIKLLMTLT 781
YAVLPYATEYRENRVALDTNTLADNVDLDN+VANVVPTRGAIVRAEFKARVGIKLLMTLT
Sbjct: 744 YAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAEFKARVGIKLLMTLT 803

Query: 782 HNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLAGKVQVKWGEEENAHCVANYQLPP 841
HNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLAGKVQVKWGEEENAHCVANYQLPP
Sbjct: 804 HNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLAGKVQVKWGEEENAHCVANYQLPP 863

Query: 842 ESQQQLLTQLSAECR 856
ESQQQLLTQLSAECR
Sbjct: 864 ESQQQLLTQLSAECR 878


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_4846  VACCYTOTOXIN  32     0.001    Helicobacter pylori vacuolating cytotoxin signature.
		>VACCYTOTOXIN#Helicobacter pylori vacuolating cytotoxin signature.

Length = 1291

Score = 31.5 bits (71), Expect = 0.001
Identities = 31/158 (19%), Positives = 49/158 (31%), Gaps = 9/158 (5%)

Query: 3 WRKRGYLLAAMLAFASATIQAADVTITVNGKVVAKPCTVSTTNATIDLGDLYSFSLVSAG 62
W R + A LA + +TI + VT VN + + I + G
Sbjct: 258 WMGRLQYVGAYLAPSYSTINTSKVTGEVNFNHLTVGDHNAAQAGIIASNKTH------IG 311

Query: 63 AASAWHDVALELTHCPVG--TSRVTASFSGAADSTGYYKNQGTAQNIQLELQDDSGNTLN 120
W L + P G + S + Q ++QN + N+
Sbjct: 312 TLDLWQSAGLNIIAPPEGGYKDKPNDKPSNTTQNNAKNDKQESSQNNSNTQVINPPNSAQ 371

Query: 121 TGATKTVQVDDSSQSAHFPLQVRALTVNGGATQGTIQA 158
+ QV D + V +N A GTI+
Sbjct: 372 KTEIQPTQVIDGPFAGGKNTVVNINRINTNA-DGTIRV 408


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_4847  SURFACELAYER  28     0.047    Lactobacillus surface layer protein signature.
		>SURFACELAYER#Lactobacillus surface layer protein signature.

Length = 439

Score = 28.1 bits (62), Expect = 0.047
Identities = 19/79 (24%), Positives = 32/79 (40%), Gaps = 1/79 (1%)

Query: 211 SQNLGYYLSGTTADAGNSIFTNTASFSPAQGVGVQLTRNGTIIPANNTVSLGAVGTSAVS 270
S+N G ++ +A+ N FT PA V V L ++G ++ + + +
Sbjct: 133 SENAGKEITIGSAN-PNVTFTEKTGDQPASTVKVTLDQDGVAKLSSVQIKNVYAIDTTYN 191

Query: 271 LGLTANYARTGGQVTAGNV 289
+ TG VT G V
Sbjct: 192 SNVNFYDVTTGATVTTGAV 210


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_4848  PF06580       31     0.008    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 31.4 bits (71), Expect = 0.008
Identities = 10/49 (20%), Positives = 25/49 (51%)

Query: 230 LVPLIPAIIMISTTIANIWLVKDTPAWEVVNFIGSSPIAMFIAMVVAFV 278
+ +I ++ I +W V +T W ++ FI + P+A + + ++ +
Sbjct: 73 MGQIILRVLPACVVIGMVWFVANTSIWRLLAFINTKPVAFTLPLALSII 121


112  EcSMS35_4947  EcSMS35_4952  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias   Product
EcSMS35_4947  -3      15       1.332251  phosphoglycerate mutase
EcSMS35_4946  -2      12       0.691156  right origin-binding protein
EcSMS35_4948                             hypothetical protein
EcSMS35_4949                             DNA-binding response regulator CreB
EcSMS35_4950                             sensory histidine kinase CreC
EcSMS35_4951                             hypothetical protein
EcSMS35_4952                             two-component response regulator
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_4947  VACCYTOTOXIN  29     0.014    Helicobacter pylori vacuolating cytotoxin signature.
		>VACCYTOTOXIN#Helicobacter pylori vacuolating cytotoxin signature.

Length = 1291

Score = 29.2 bits (65), Expect = 0.014
Identities = 14/45 (31%), Positives = 20/45 (44%), Gaps = 4/45 (8%)

Query: 145 PLLVSHGIALGCLVSTILGLPAWAERRLRLRNCSISRVDYQESLW 189
P +V GIA G V T+ GL W ++ N D + +W
Sbjct: 42 PAIVG-GIATGAAVGTVSGLLGWGLKQAEEAN---KTPDKPDKVW 82


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_4949  HTHFIS        84     4e-21    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 84.5 bits (209), Expect = 4e-21
Identities = 33/139 (23%), Positives = 60/139 (43%)

Query: 1 MRRETVWLVEDEQGIADTLVYMLQQEGFAVEVFERGLPVLDKARQQVPDVMILDVGLPDI 60
M T+ + +D+ I L L + G+ V + + D+++ DV +PD
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 SGFELCRQLLALHPALPVLFLTARSEEVDRLLGLEIGADDYVAKPFSPREVCARVRTLLR 120
+ F+L ++ P LPVL ++A++ + + E GA DY+ KPF E+ + L
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 121 RVKKFSTPSPVIRIGHFEL 139
K+ + L
Sbjct: 121 EPKRRPSKLEDDSQDGMPL 139


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_4950  PF06580       36     3e-04    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 36.0 bits (83), Expect = 3e-04
Identities = 41/182 (22%), Positives = 71/182 (39%), Gaps = 40/182 (21%)

Query: 312 LRQARLENRQEVVLTVVDVAALFR---RVSEARTVQLAE-----------------KKIT 351
+R LE+ + + ++ L R R S AR V LA+ ++
Sbjct: 182 IRALILEDPTKAREMLTSLSELMRYSLRYSNARQVSLADELTVVDSYLQLASIQFEDRLQ 241

Query: 352 LHV-MPTEVNVAAEPTLLEQAL-GNLLDNAIDFTPESGRITLSAEVEQEHVTLKVLDTGS 409
+ + P +L Q L N + + I P+ G+I L + VTL+V +TGS
Sbjct: 242 FENQINPAIMDVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGS 301

Query: 410 GIPDYALSRIFERFYSLPRANGQKSSGLGLAFVSE-VARLFNGEVTLH-NVQEGGVLASL 467
N ++S+G GL V E + L+ E + + ++G V A +
Sbjct: 302 LALK----------------NTKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMV 345

Query: 468 RL 469
+
Sbjct: 346 LI 347


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
EcSMS35_4952  HTHFIS        82     4e-20    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 81.8 bits (202), Expect = 4e-20
Identities = 30/122 (24%), Positives = 60/122 (49%), Gaps = 1/122 (0%)

Query: 1 MQTPHILIVEDELVTRNTLKSIFEAEGYDVFEATDGAEMHQILSEYDINLVIMDINLPGK 60
M IL+ +D+ R L GYDV ++ A + + ++ D +LV+ D+ +P +
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 NGLLLARELRE-QANVALMFLTGRDNEVDKILGLEIGADDYITKPFNPRELTIRARNLLS 119
N L +++ + ++ ++ ++ ++ + I E GA DY+ KPF+ EL L+
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 120 RT 121

Sbjct: 121 EP 122



 
Contact Sachin Pundhir for Bugs/Comments.