PredictBias

Identification of genomic and pathogenicity islands in prokaryotic genomes
 
A) Input parameters
Genome: 1039.gbk
Threshold dinucleotide bias: 2
Threshold codon bias: 4
Threshold %GC bias: 3
E-value (RPSBlast): 0.05
Genome (non-pathogenic):
 
 
C) Potential GIs and PAIs in NC_009801
S.No | Start | End | Bias | Virulence | Insertion elements | Prediction
1 | EcE24377A_0010 | EcE24377A_0029 | Y | Y | N | Pathogenicity Island (biased-composition)
LocusTag | DNBias | CDNBias | %GCBias | Product
EcE24377A_0010 | 2 | 30 | -0.024466 | hypothetical protein
EcE24377A_0011 | 0 | 24 | -0.246869 | hypothetical protein
EcE24377A_0012 | 0 | 23 | -0.806785 | hypothetical protein
EcE24377A_0013 | 0 | 20 | -2.808230 | hypothetical protein
EcE24377A_0014 | -1 | 19 | -5.806077 | molecular chaperone DnaK
EcE24377A_0015 | 0 | 24 | -8.061165 | molecular chaperone DnaJ
EcE24377A_0016 | 1 | 31 | -10.641642 | pH-dependent sodium/proton antiporter
EcE24377A_0017 | 2 | 37 | -13.467234 | transcriptional activator NhaR
EcE24377A_0018 | 3 | 43 | -15.459071 | hypothetical protein
EcE24377A_0020 | 1 | 31 | -10.239466 | hypothetical protein
EcE24377A_0021 | 0 | 21 | 1.467596 | hypothetical protein
EcE24377A_0022 | 0 | 21 | 2.090876 | hypothetical protein
EcE24377A_0023 | -1 | 21 | 2.315051 | 30S ribosomal protein S20
EcE24377A_0024 | 0 | 21 | 2.787513 | hypothetical protein
EcE24377A_0025 | -1 | 20 | 3.161952 | bifunctional riboflavin kinase/FMN
EcE24377A_0026 | -1 | 20 | 3.299363 | isoleucyl-tRNA synthetase
EcE24377A_0027 | -3 | 16 | 2.365675 | lipoprotein signal peptidase
EcE24377A_0028 | -2 | 13 | 2.303029 | FKBP-type peptidylprolyl isomerase
EcE24377A_0029 | 0 | 22 | 3.286589 | 4-hydroxy-3-methylbut-2-enyl diphosphate
ORFs having significant similarity with known virulence factors
LocusTag | Hits | Score | E-value | Comments
EcE24377A_0011 | PF07201 | 30 | 0.007 | Hypersensitivity response secretion protein HrpJ
		>PF07201#Hypersensitivity response secretion protein HrpJ

Length = 293

Score = 30.2 bits (68), Expect = 0.007
Identities = 9/51 (17%), Positives = 24/51 (47%)

Query: 138 LHAVDARVNELEELLPLLMKDKLLAKGVSHLLSSQLTRILRTHAAMSVLGH 188
+ V+ +VN+ +P L + + +++ +S L +S + + A +
Sbjct: 80 VSDVEEQVNQYLSKVPELEQKQNVSELLSLLSNSPNISLSQLKAYLEGKSE 130
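
The Score and Expect lines in the alignment blocks can be checked mechanically against the E-value cutoff listed under Input parameters (0.05). The following is a minimal Python sketch, not part of PredictBias or BLAST: the function names are hypothetical, and it assumes the cutoff means "Expect at or below the threshold" (the page does not say whether the comparison is strict). It also handles the abbreviated form `e-179` that appears for very small E-values.

```python
import re

# Cutoff copied from the "E-value (RPSBlast)" input parameter on this page.
EVALUE_THRESHOLD = 0.05

# Matches lines such as "Score = 30.2 bits (68), Expect = 0.007".
STATS_RE = re.compile(
    r"Score\s*=\s*([\d.]+)\s*bits\s*\((\d+)\),\s*Expect\s*=\s*(\S+)"
)


def parse_evalue(text):
    """Convert an Expect string to float; "e-179" is read as 1e-179."""
    return float("1" + text if text.startswith("e") else text)


def parse_stats(line):
    """Return (bit_score, raw_score, evalue), or None if the line has no stats."""
    match = STATS_RE.search(line)
    if match is None:
        return None
    bits, raw, expect = match.groups()
    return float(bits), int(raw), parse_evalue(expect)


def passes_threshold(line, threshold=EVALUE_THRESHOLD):
    """Assumed rule: a hit is kept when its Expect value is <= threshold."""
    stats = parse_stats(line)
    return stats is not None and stats[2] <= threshold
```

For example, `passes_threshold("Score = 29.4 bits (66), Expect = 0.039")` is true under this assumed rule, while a line without Score and Expect fields makes `parse_stats` return `None`.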


ORFs having significant similarity with known virulence factors
LocusTag | Hits | Score | E-value | Comments
EcE24377A_0014 | SHAPEPROTEIN | 142 | 7e-40 | Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 142 bits (361), Expect = 7e-40
Identities = 83/387 (21%), Positives = 149/387 (38%), Gaps = 84/387 (21%)

Query: 5 IGIDLGTTNSCVAIMDGTTPRVLENAEGDRTTPSIIAYTQDGET------LVGQPAKRQA 58
+ IDLGT N+ + + + E PS++A QD VG AK+
Sbjct: 13 LSIDLGTANTLIYVKGQGIV-LNE--------PSVVAIRQDRAGSPKSVAAVGHDAKQML 63

Query: 59 VTNPQNTLFAIKRLIGRRFQDEEVQRDVSIMPFKIIAADNGDAWVEVKGQKMAPPQISAE 118
P N + AI+ + +D I F + +
Sbjct: 64 GRTPGN-IAAIRPM-----------KDGVIADFFVTEK------------------MLQH 93

Query: 119 VLKKMKKTAEDYLGEPVTEAVITVPAYFNDAQRQATKDAGRIAGLEVKRIINEPTAAALA 178
+K++ + P ++ VP +R+A +++ + AG +I EP AAA+
Sbjct: 94 FIKQVHS---NSFMRPSPRVLVCVPVGATQVERRAIRESAQGAGAREVFLIEEPMAAAIG 150

Query: 179 YGL--DKGTGNRTIAVYDLGGGTFDISIIEIDEVDGEKTFEVLATNGDTHLGGEDFDSRL 236
GL + TG+ V D+GGGT ++++I ++ V + +GG+ FD +
Sbjct: 151 AGLPVSEATGS---MVVDIGGGTTEVAVISLNGV---------VYSSSVRIGGDRFDEAI 198

Query: 237 INYLVEEFKKDQGIDLRNDPLAMQRLKEAAEKAKIELSSA----QQTDVNLPYITADATG 292
INY+ + G + AE+ K E+ SA + ++ +
Sbjct: 199 INYVRRNYGSLIG-------------EATAERIKHEIGSAYPGDEVREIEVRGRNLAEGV 245

Query: 293 PKHMNIKVTRAKLESLVEDLVNRSIEPLKVALQD-AGLSVSDIDD--VILVGGQTRMPMV 349
P+ + + LE+L E + + + VAL+ SDI + ++L GG + +
Sbjct: 246 PRGFTLN-SNEILEALQEP-LTGIVSAVMVALEQCPPELASDISERGMVLTGGGALLRNL 303

Query: 350 QKKVAEFFGKEPRKDVNPDEAVAIGAA 376
+ + E G +P VA G
Sbjct: 304 DRLLMEETGIPVVVAEDPLTCVARGGG 330


ORFs having significant similarity with known virulence factors
LocusTag | Hits | Score | E-value | Comments
EcE24377A_0028 | INFPOTNTIATR | 31 | 0.002 | Macrophage infectivity potentiator signature.
		>INFPOTNTIATR#Macrophage infectivity potentiator signature.

Length = 233

Score = 30.7 bits (69), Expect = 0.002
Identities = 14/32 (43%), Positives = 19/32 (59%)

Query: 8 NSAVLVHFTLKLDDGTTAESTRNNGKPALFRL 39
+ V V +T L DGT +ST GKPA F++
Sbjct: 144 SDTVTVEYTGTLIDGTVFDSTEKAGKPATFQV 175


2 | EcE24377A_0060 | EcE24377A_0067 | Y | Y | N | Pathogenicity Island (biased-composition)
LocusTag | DNBias | CDNBias | %GCBias | Product
EcE24377A_0060 | -2 | 18 | 3.719635 | 23S rRNA/tRNA pseudouridine synthase A
EcE24377A_0061 | -2 | 17 | 3.556043 | ATP-dependent helicase HepA
EcE24377A_0062 | -2 | 15 | 3.696573 | DNA polymerase II
EcE24377A_0063 | -1 | 16 | 4.098628 | L-ribulose-5-phosphate 4-epimerase
EcE24377A_0064 | -1 | 17 | 4.772335 | L-arabinose isomerase
EcE24377A_0065 | 1 | 16 | 4.655080 | ribulokinase
EcE24377A_0066 | 0 | 17 | 3.970057 | DNA-binding transcriptional regulator AraC
EcE24377A_0067 | 1 | 16 | 4.145858 | hypothetical protein
ORFs having significant similarity with known virulence factors
LocusTag | Hits | Score | E-value | Comments
EcE24377A_0065 | TCRTETOQM | 29 | 0.039 | Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 29.4 bits (66), Expect = 0.039
Identities = 19/103 (18%), Positives = 39/103 (37%), Gaps = 18/103 (17%)

Query: 300 ILIADKQSVGERAVKGICGQVDGSVV------PGFIGLEAGQS-AFGDIYAWFGRVLGWP 352
+ I++K+ + + + ++G + G I + + + G P
Sbjct: 281 VRISEKEKIK---ITEMYTSINGELCKIDKAYSGEIVILQNEFLKLNSV---LGDTKLLP 334

Query: 353 L-EQLAAQHPELKAQINASQKQ----LLPALTEAWAKNPSLDH 390
E++ P L+ + S+ Q LL AL E +P L +
Sbjct: 335 QRERIENPLPLLQTTVEPSKPQQREMLLDALLEISDSDPLLRY 377


ORFs having significant similarity with known virulence factors
LocusTag | Hits | Score | E-value | Comments
EcE24377A_0066 | PF05616 | 29 | 0.022 | Neisseria meningitidis TspB protein
		>PF05616#Neisseria meningitidis TspB protein

Length = 501

Score = 28.9 bits (64), Expect = 0.022
Identities = 26/118 (22%), Positives = 47/118 (39%), Gaps = 21/118 (17%)

Query: 82 YGRHPEAREWYHQWVYFRPRAYWHEWLNWPSIFANTGFFRPDEAHQPHFSDLFGQ-IINA 140
Y R PE +E + R YW + N P ++ +F+ + +F G ++
Sbjct: 158 YSRFPEVKELMESQMERLARPYWEKLRNRPDMY----YFKNYNFKRCYFGLNGGDCLVAK 213

Query: 141 G-----------QGEGRYSELLAINLLEQLLLRRMEA-----INESLHPPMDNRVREA 182
G QG +Y E + LE++L +++A I + +P +V A
Sbjct: 214 GDDGRTFISFSLQGNSKYKEEMDAKKLEEILSLKVDANPDKYIKATGYPGYSEKVEVA 271


3 | EcE24377A_0112 | EcE24377A_0117 | Y | Y | N | Pathogenicity Island (biased-composition)
LocusTag | DNBias | CDNBias | %GCBias | Product
EcE24377A_0112 | 2 | 28 | 1.881045 | N-acetyl-anhydromuranmyl-L-alanine amidase
EcE24377A_0113 | 3 | 33 | 1.781066 | regulatory protein AmpE
EcE24377A_0114 | 3 | 30 | 2.036636 | aromatic amino acid transporter
EcE24377A_0115 | 4 | 32 | 2.575598 | transcriptional regulator PdhR
EcE24377A_0116 | 3 | 34 | 2.392991 | pyruvate dehydrogenase subunit E1
EcE24377A_0117 | 2 | 26 | 1.972344 | dihydrolipoamide acetyltransferase
ORFs having significant similarity with known virulence factors
LocusTag | Hits | Score | E-value | Comments
EcE24377A_0117 | RTXTOXIND | 32 | 0.007 | Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 32.1 bits (73), Expect = 0.007
Identities = 15/60 (25%), Positives = 29/60 (48%), Gaps = 2/60 (3%)

Query: 119 EVTEILVKVGDKV-EAEQSLITVEGDKASMEVPAPFAGTVKEIKVN-VGDKVSTGSLIMV 176
E+ + L + D + L E + + + AP + V+++KV+ G V+T +MV
Sbjct: 299 EILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMV 358



Score = 32.1 bits (73), Expect = 0.008
Identities = 16/63 (25%), Positives = 27/63 (42%), Gaps = 2/63 (3%)

Query: 26 DKVEAEQSLITVEGDKASMEVPSPQAGIVKEIKVSVGDKTQTGALIMIFDSADGAADAAP 85
+ V +T G S E+ + IVKEI V G+ + G +++ + AD
Sbjct: 81 EIVATANGKLTHSGR--SKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLK 138

Query: 86 AQA 88
Q+
Sbjct: 139 TQS 141



Score = 30.6 bits (69), Expect = 0.019
Identities = 14/60 (23%), Positives = 28/60 (46%), Gaps = 2/60 (3%)

Query: 220 EVTEVMVKVGDKVAA-EQSLITVEGDKASMEVPAPFAGVVKELKVN-VGDKVKTGSLIMI 277
E+ + + + D + L E + + + AP + V++LKV+ G V T +M+
Sbjct: 299 EILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMV 358



Score = 29.8 bits (67), Expect = 0.034
Identities = 20/95 (21%), Positives = 35/95 (36%), Gaps = 3/95 (3%)

Query: 230 DKVAAEQSLITVEGDKASMEVPAPFAGVVKELKVNVGDKVKTGSLIMIFEVEGAAPAAAP 289
+ VA +T G S E+ +VKE+ V G+ V+ G +++ GA A
Sbjct: 81 EIVATANGKLTHSGR--SKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAE-ADTL 137

Query: 290 AKQEAAAPAPAAKAEAPAAAPAAKAEGKSEFAEND 324
Q + A + + + + E D
Sbjct: 138 KTQSSLLQARLEQTRYQILSRSIELNKLPELKLPD 172



Score = 29.8 bits (67), Expect = 0.038
Identities = 13/60 (21%), Positives = 27/60 (45%), Gaps = 2/60 (3%)

Query: 16 EITEILVKVGDKV-EAEQSLITVEGDKASMEVPSPQAGIVKEIKV-SVGDKTQTGALIMI 73
EI + L + D + L E + + + +P + V+++KV + G T +M+
Sbjct: 299 EILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMV 358


4 | EcE24377A_0133 | EcE24377A_0158 | Y | Y | N | Pathogenicity Island (biased-composition)
LocusTag | DNBias | CDNBias | %GCBias | Product
EcE24377A_0133 | -2 | 18 | -3.462134 | aspartate alpha-decarboxylase
EcE24377A_0134 | 0 | 22 | -4.615381 | hypothetical protein
EcE24377A_0135 | 0 | 24 | -5.196530 | hypothetical protein
EcE24377A_0136 | 1 | 26 | -5.659407 | pantoate--beta-alanine ligase
EcE24377A_0137 | 2 | 33 | -7.983320 | 3-methyl-2-oxobutanoate
EcE24377A_0138 | 4 | 38 | -9.427258 | fimbrial-like adhesin protein
EcE24377A_0139 | 2 | 36 | -9.068169 | hypothetical protein
EcE24377A_0140 | 2 | 34 | -8.854695 | fimbrial protein
EcE24377A_0141 | 1 | 31 | -8.037517 | hypothetical protein
EcE24377A_0142 | 0 | 22 | -5.189744 | outer membrane usher protein
EcE24377A_0143 | -1 | 16 | -1.438993 | chaperone protein EcpD
EcE24377A_0144 | -2 | 16 | -0.030488 | hypothetical protein
EcE24377A_0145 | -1 | 14 | 0.334540 | fimbrial protein
EcE24377A_0146 | -1 | 14 | 1.815534 | 2-amino-4-hydroxy-6-
EcE24377A_0147 | 0 | 13 | 3.140466 | poly(A) polymerase
EcE24377A_0148 | -1 | 14 | 3.022942 | glutamyl-Q tRNA(Asp) synthetase
EcE24377A_0149 | 0 | 16 | 2.855023 | RNA polymerase-binding transcription factor
EcE24377A_0150 | 0 | 11 | 2.133916 | sugar fermentation stimulation protein A
EcE24377A_0151 | -1 | 12 | 2.767056 | 2'-5' RNA ligase
EcE24377A_0152 | -1 | 12 | 2.731991 | ATP-dependent RNA helicase HrpB
EcE24377A_0153 | -1 | 16 | 3.226025 | penicillin-binding protein 1b
EcE24377A_0154 | -1 | 15 | 3.281563 | hypothetical protein
EcE24377A_0155 | -1 | 13 | 3.352102 | ferrichrome outer membrane transporter
EcE24377A_0156 | 0 | 16 | 4.362417 | iron-hydroxamate transporter ATP-binding
EcE24377A_0157 | 0 | 15 | 4.081179 | iron-hydroxamate transporter substrate-binding
EcE24377A_0158 | 0 | 14 | 3.867805 | iron-hydroxamate transporter permease subunit
ORFs having significant similarity with known virulence factors
LocusTag | Hits | Score | E-value | Comments
EcE24377A_0137 | FLGMRINGFLIF | 29 | 0.020 | Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 29.2 bits (65), Expect = 0.020
Identities = 26/99 (26%), Positives = 39/99 (39%), Gaps = 20/99 (20%)

Query: 110 MVKIEGGEWL----VETVKMLTERAVPVCGHLGLTPQSVNIFGGYKVQGRGDEAGDQL-L 164
V +E G L + V L AV GL P +V + D++G L
Sbjct: 176 TVTLEPGRALDEGQISAVVHLVSSAVA-----GLPPGNVTLV---------DQSGHLLTQ 221

Query: 165 SDALALEAAGAQLLVLECVPVELAKRITEALAIPVIGIG 203
S+ + AQL V + +RI L+ P++G G
Sbjct: 222 SNTSGRDLNDAQLKFANDVESRIQRRIEAILS-PIVGNG 259


ORFs having significant similarity with known virulence factors
LocusTag | Hits | Score | E-value | Comments
EcE24377A_0142 | PF00577 | 787 | 0.0 | Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 787 bits (2034), Expect = 0.0
Identities = 266/850 (31%), Positives = 424/850 (49%), Gaps = 38/850 (4%)

Query: 15 RIATFCALLYCNSAFCAELVEYDHTFLMGQNASNIDLSRYSEGNPAIPGMYDVSVYVNDQ 74
R+ CA AE + ++ FL + DLSR+ G PG Y V +Y+N+
Sbjct: 29 RLFVACAFAAQAPLSSAE-LYFNPRFLADDPQAVADLSRFENGQELPPGTYRVDIYLNNG 87

Query: 75 PIINQSITFIEIEGKKNAQACITLKNLLQFHINSPDINNEKAVLLARDETLGNCLNLTEI 134
+ + +TF + ++ C+T L +N+ + LLA D C+ LT +
Sbjct: 88 YMATRDVTFNTGDSEQGIVPCLTRAQLASMGLNTASV--SGMNLLADDA----CVPLTSM 141

Query: 135 IPQASVRYDVNEQRLDIDVPQAWVMKNYQNYVDPSLWENGINAAMLSYNLNGYHSETP-G 193
I A+ + DV +QRL++ +PQA++ + Y+ P LW+ GINA +L+YN +G + G
Sbjct: 142 IHDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSVQNRIG 201

Query: 194 RRNDSIYAAFNGGMNLGAWRLRASGNYNWMTDSGS-----NYDFKNRYIQRDIASLRSQL 248
+ Y G+N+GAWRLR + +++ + S + N +++RDI LRS+L
Sbjct: 202 GNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRSRL 261

Query: 249 ILGESYTTGETFDSVSIRGIRLYSDSRMLPPTLASFAPIIHGVANTNAKVTITQGGYKIY 308
LG+ YT G+ FD ++ RG +L SD MLP + FAP+IHG+A A+VTI Q GY IY
Sbjct: 262 TLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIY 321

Query: 309 ETTVPPGAFVIDDLSPSGYGSDLIVTVEESDGSKRTFSQPFSSVVQMLRPGVGRWDISGG 368
+TVPPG F I+D+ +G DL VT++E+DGS + F+ P+SSV + R G R+ I+ G
Sbjct: 322 NSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITAG 381

Query: 369 QVLKDD-IQDEPNLFQASYYYGLNNYLTGYTGIQITDNNYTAGLLGLGLNT-SVGAFSFD 426
+ + Q++P FQ++ +GL T Y G Q+ D Y A G+G N ++GA S D
Sbjct: 382 EYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLAD-RYRAFNFGIGKNMGALGALSVD 440

Query: 427 VTHSNVRIPDDKTYQGQSYRVSWNKLFEETSTSLNIAAYRYSTQNYLGLNDALTLIDEVK 486
+T +N +PDD + GQS R +NK E+ T++ + YRYST Y D
Sbjct: 441 MTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGY 500

Query: 487 HPE-----QDLEPKSMRNYSRM---KNQVTVSINQPLKFEKKDYGSFYLSGSWSDYWASG 538
+ E ++PK Y+ + ++ +++ Q L + YLSGS YW +
Sbjct: 501 NIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQL----GRTSTLYLSGSHQTYWGTS 556

Query: 539 QNRSNYSIGYSNSASWGSYSVSAQRSWNE-DGDTDDSVYLSFTIPIEKLLGTEQRTS-GF 596
+ G + + ++++S + N D + L+ IP L ++ ++
Sbjct: 557 NVDEQFQAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRSDSKSQWRH 616

Query: 597 QSIDTQMSSDFKGNNQLNVSSSGYS-DNARVSYSVNTGYTMNKASKDLSYVGGYASYESP 655
S MS D G G ++ +SYSV TGY S +Y
Sbjct: 617 ASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGG 676

Query: 656 WGTLAGSVSANSDNSRQVSLSTDGGFVLHSGGLTFSNDSFSDSDTLAVVQAPGAQGARIN 715
+G S + D +Q+ GG + H+ G+T +DT+ +V+APGA+ A++
Sbjct: 677 YGNANIGYSHSDDI-KQLYYGVSGGVLAHANGVTLGQPL---NDTVVLVKAPGAKDAKVE 732

Query: 716 YGNST-IDRWGYGVTSALSPYHENRIALDINDLENDVELKSTSAVAVPRQGSVVFADFET 774
D GY V + Y ENR+ALD N L ++V+L + A VP +G++V A+F+
Sbjct: 733 NQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAEFKA 792

Query: 775 VQGQSAIMNITRSDGKNIPFAADIYDEQGNVIGNVGQGGQAFVRGIEQQGNISIKCLEES 834
G +M +T + K +PF A + E G V GQ ++ G+ G + +K EE
Sbjct: 793 RVGIKLLMTLTH-NNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLAGKVQVKWGEEE 851

Query: 835 KPVSCLAHYQ 844
C+A+YQ
Sbjct: 852 NA-HCVANYQ 860


ORFs having significant similarity with known virulence factors
LocusTag | Hits | Score | E-value | Comments
EcE24377A_0157 | FERRIBNDNGPP | 511 | 0.0 | Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 511 bits (1316), Expect = 0.0
Identities = 294/296 (99%), Positives = 294/296 (99%)

Query: 1 MSGLPLISRRRLLTAMALSPLLWQMNTAHAAAIDPNRIVALEWLPVELLLALGIVPYGVA 60
MSGLPLISRRRLLTAMALSPLLWQMNTAHAAAIDPNRIVALEWLPVELLLALGIVPYGVA
Sbjct: 1 MSGLPLISRRRLLTAMALSPLLWQMNTAHAAAIDPNRIVALEWLPVELLLALGIVPYGVA 60

Query: 61 DTINYRLWVSEPPLPDSVIDVGLRTEPNLELLTEMKPSFMVWSAGYGPSPEMLARIAPGR 120
DTINYRLWVSEPPLPDSVIDVGLRTEPNLELLTEMKPSFMVWSAGYGPSPEMLARIAPGR
Sbjct: 61 DTINYRLWVSEPPLPDSVIDVGLRTEPNLELLTEMKPSFMVWSAGYGPSPEMLARIAPGR 120

Query: 121 GFNFSDGKQPLAMARKSLTEMADLLNLQSAAETHLAHYEDFIRSMKPRFVKRGERPLLLT 180
GFNFSDGKQPLAMARKSLTEMADLLNLQSAAETHLA YEDFIRSMKPRFVKRG RPLLLT
Sbjct: 121 GFNFSDGKQPLAMARKSLTEMADLLNLQSAAETHLAQYEDFIRSMKPRFVKRGARPLLLT 180

Query: 181 TLIDPRHMLVFGPNSLFQEILDEYGIPNAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDH 240
TLIDPRHMLVFGPNSLFQEILDEYGIPNAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDH
Sbjct: 181 TLIDPRHMLVFGPNSLFQEILDEYGIPNAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDH 240

Query: 241 DNSKDMDALMATPLWQAMPFVRAGRFQRVPAVWFYGATLSAMHFVRVLDNAIGGKA 296
DNSKDMDALMATPLWQAMPFVRAGRFQRVPAVWFYGATLSAMHFVRVLDNAIGGKA
Sbjct: 241 DNSKDMDALMATPLWQAMPFVRAGRFQRVPAVWFYGATLSAMHFVRVLDNAIGGKA 296


5 | EcE24377A_0215 | EcE24377A_0256 | Y | Y | Y | Pathogenicity Island (biased-composition)
LocusTag | DNBias | CDNBias | %GCBias | Product
EcE24377A_0215 | -1 | 24 | -3.055324 | UbiE/COQ5 family methlytransferase
EcE24377A_0216 | -1 | 23 | -2.767809 | membrane-bound lytic murein transglycosylase D
EcE24377A_0217 | 0 | 30 | -4.802668 | hydroxyacylglutathione hydrolase
EcE24377A_0220 | 0 | 25 | -4.192398 | methyltransferase
EcE24377A_0218 | 1 | 31 | -7.172918 | hypothetical protein
EcE24377A_0219 | 1 | 31 | -8.198197 | ribonuclease H
EcE24377A_0221 | 2 | 27 | -5.043490 | DNA polymerase III subunit epsilon
EcE24377A_0223 | 1 | 18 | -0.571819 | *hypothetical protein
EcE24377A_0224 | 1 | 17 | 0.759883 | lipoprotein
EcE24377A_0225 | 1 | 19 | 2.633987 | hypothetical protein
EcE24377A_0226 | 1 | 21 | 5.437033 | hypothetical protein
EcE24377A_0227 | 1 | 23 | 5.991367 | ImpA domain-containing protein
EcE24377A_0228 | 0 | 24 | 5.954543 | hypothetical protein
EcE24377A_0229 | 1 | 25 | 6.012307 | ImpA domain-containing protein
EcE24377A_0230 | 1 | 25 | 5.839586 | type VI secretion-associated protein
EcE24377A_0231 | 1 | 22 | 4.933733 | ATP-dependent chaperone protein ClpB
EcE24377A_0232 | 2 | 20 | 2.782984 | hypothetical protein
EcE24377A_0233 | 3 | 20 | 2.855200 | hypothetical protein
EcE24377A_0234 | 2 | 19 | 1.971431 | type VI secretion lipoprotein
EcE24377A_0235 | 2 | 20 | 2.004385 | hypothetical protein
EcE24377A_0236 | 2 | 20 | 0.960792 | hypothetical protein
EcE24377A_0237 | 2 | 20 | -0.303377 | hypothetical protein
EcE24377A_0238 | 2 | 20 | -0.854726 | hypothetical protein
EcE24377A_0239 | 0 | 20 | -1.107842 | hypothetical protein
EcE24377A_0240 | 4 | 21 | 1.478614 | hypothetical protein
EcE24377A_0241 | 4 | 33 | 5.777407 | hypothetical protein
EcE24377A_0242 | 4 | 33 | 4.348559 | hypothetical protein
EcE24377A_0243 | 4 | 30 | 4.690997 | hypothetical protein
EcE24377A_0244 | 3 | 31 | 4.477657 | hypothetical protein
EcE24377A_0245 | 1 | 27 | 4.652137 | ImpA family type VI secretion-associated
EcE24377A_0246 | 1 | 25 | 4.453885 | Rhs protein
EcE24377A_0247 | 0 | 15 | 0.329771 | hypothetical protein
EcE24377A_0251 | -1 | 18 | 2.876497 | hypothetical protein
EcE24377A_0252 | -2 | 17 | 2.021205 | C-lysozyme inhibitor
EcE24377A_0253 | -1 | 16 | 1.548278 | acyl-CoA dehydrogenase
EcE24377A_0254 | 1 | 14 | -2.109781 | phosphoheptose isomerase
EcE24377A_0255 | 0 | 16 | -1.690018 | hypothetical protein
EcE24377A_0257 | 0 | 18 | -1.875255 | glutamine amidotransferase
EcE24377A_0256 | -2 | 20 | -3.714525 | hypothetical protein
ORFs having significant similarity with known virulence factors
LocusTag | Hits | Score | E-value | Comments
EcE24377A_0217 | BINARYTOXINB | 34 | 4e-04 | Binary toxin B family signature.
		>BINARYTOXINB#Binary toxin B family signature.

Length = 764

Score = 34.3 bits (78), Expect = 4e-04
Identities = 12/55 (21%), Positives = 28/55 (50%), Gaps = 4/55 (7%)

Query: 186 NDYYRKVKELRAKNQITLPVILKNERQINVFLRT----EDIDLINVINEETLLQQ 236
+ ++ EL A N T+ +K ++N+ +R D + I V +E+++++
Sbjct: 589 QNIKNQLAELNATNIYTVLDKIKLNAKMNILIRDKRFHYDRNNIAVGADESVVKE 643


ORFs having significant similarity with known virulence factors
LocusTag | Hits | Score | E-value | Comments
EcE24377A_0228 | PF06580 | 31 | 0.024 | Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 31.4 bits (71), Expect = 0.024
Identities = 14/95 (14%), Positives = 30/95 (31%), Gaps = 9/95 (9%)

Query: 18 RPAMPRFKISAFWLLILAWIFL-LVWIWWKGPMWTLYEEQWLKPLANRWLATAAWG---- 72
+ + ++ A + + +VW +W L KP+A +
Sbjct: 66 QGWLKLNMGQIILRVLPACVVIGMVWFVANTSIWRLLAFINTKPVAFTLPLALSIIFNVV 125

Query: 73 IIALVW----LTVRVMKRLQQLEKMQKQQREEAVD 103
++ +W K +Q E Q + A +
Sbjct: 126 VVTFMWSLLYFGWHFFKNYKQAEIDQWKMASMAQE 160


ORFs having significant similarity with known virulence factors
LocusTag | Hits | Score | E-value | Comments
EcE24377A_0246 | OUTRSURFACE | 35 | 0.003 | Outer surface protein signature.
		>OUTRSURFACE#Outer surface protein signature.

Length = 273

Score = 34.5 bits (79), Expect = 0.003
Identities = 19/72 (26%), Positives = 32/72 (44%), Gaps = 6/72 (8%)

Query: 401 ITDSLNRR--EVLYTEGEGGLKRVVKKEHADGSITRSEYDEAGRL--KAQTDAAGRRTEY 456
I D L++ E+ +G+ + R V + D + T ++E G L K T G + EY
Sbjct: 90 IADDLSKTTFELFKEDGKTLVSRKVSSK--DKTSTDEMFNEKGELSAKTMTRENGTKLEY 147

Query: 457 SLHMASGAVTAV 468
+ + G A
Sbjct: 148 TEMKSDGTGKAK 159


6 | EcE24377A_0278 | EcE24377A_0404 | Y | Y | Y | Pathogenicity Island (biased-composition)
LocusTag | DNBias | CDNBias | %GCBias | Product
EcE24377A_0278 | 3 | 23 | -3.469194 | *hypothetical protein
EcE24377A_0279 | 2 | 20 | -3.138742 | prophage CP4-57 regulatory protein
EcE24377A_0280 | 1 | 17 | -2.769746 | hypothetical protein
EcE24377A_0281 | 1 | 12 | -1.617428 | hypothetical protein
EcE24377A_0282 | 1 | 13 | -2.775189 | SNF2 family helicase
EcE24377A_0283 | 1 | 12 | -2.653841 | hypothetical protein
EcE24377A_0284 | 0 | 10 | -2.117745 | hypothetical protein
EcE24377A_0285 | 0 | 11 | -3.424413 | hypothetical protein
EcE24377A_0286 | 0 | 13 | -3.857014 | N4/N6-methyltransferase
EcE24377A_0287 | 0 | 18 | -4.615470 | type I restriction modification DNA specificity
EcE24377A_0288 | 0 | 15 | -2.738053 | hypothetical protein
EcE24377A_0289 | -1 | 14 | -0.690132 | HsdR family type I site-specific
EcE24377A_0290 | 1 | 19 | 0.306263 | UvrD family helicase
EcE24377A_0291 | 1 | 19 | 4.138971 | hypothetical protein
EcE24377A_0292 | 0 | 18 | 4.172858 | phage integrase
EcE24377A_0294 | 0 | 19 | 5.087985 | *xanthine dehydrogenase accessory factor
EcE24377A_0295 | 0 | 18 | 3.543126 | xanthine dehydrogenase, molybdopterin-binding
EcE24377A_0296 | 0 | 19 | 2.315026 | FAD binding domain-containing protein
EcE24377A_0297 | 1 | 20 | 1.478149 | xanthine dehydrogenase iron-sulfur-binding
EcE24377A_0298 | 1 | 21 | 1.289582 | hypothetical protein
EcE24377A_0299 | 2 | 21 | 0.780194 | hypothetical protein
EcE24377A_0300 | 1 | 21 | 0.353307 | hypothetical protein
EcE24377A_0301 | 2 | 21 | 0.130369 | hypothetical protein
EcE24377A_0302 | 4 | 22 | -4.317430 | hypothetical protein
EcE24377A_0303 | 5 | 24 | -7.410830 | fimbrillin MatB
EcE24377A_0304 | 7 | 35 | -12.731645 | fimbrillin MatA
EcE24377A_0305 | 4 | 40 | -13.470510 | hypothetical protein
EcE24377A_0306 | 5 | 37 | -11.506502 | 50S ribosomal protein L36
EcE24377A_0307 | 1 | 20 | -0.965385 | 50S ribosomal protein L31
EcE24377A_0308 | 0 | 21 | -1.787890 | hypothetical protein
EcE24377A_0309 | 0 | 21 | -2.604457 | hypothetical protein
EcE24377A_0310 | 0 | 21 | -2.658889 | hypothetical protein
EcE24377A_0311 | 0 | 21 | -2.868041 | hypothetical protein
EcE24377A_0312 | 0 | 22 | -3.308836 | EaeH
EcE24377A_0313 | 3 | 32 | -8.184426 | AraC family transcriptional regulator
EcE24377A_0314 | 2 | 32 | -6.692446 | aldo/keto reductase
EcE24377A_0315 | 2 | 28 | -5.214089 | hypothetical protein
EcE24377A_0316 | 2 | 25 | -3.838081 | hypothetical protein
EcE24377A_0318 | 0 | 22 | -2.642220 | pyridine nucleotide-disulfide oxidoreductase
EcE24377A_0319 | 0 | 20 | -1.849206 | hypothetical protein
EcE24377A_0320 | 0 | 21 | -3.235063 | AraC family transcriptional regulator
EcE24377A_0321 | -1 | 12 | -0.794870 | hypothetical protein
EcE24377A_0322 | 0 | 13 | 1.055243 | iron-sulfur cluster binding protein
EcE24377A_0323 | 0 | 16 | 2.157127 | hypothetical protein
EcE24377A_0324 | -1 | 18 | 2.265851 | hypothetical protein
EcE24377A_0325 | 0 | 19 | 2.802260 | hypothetical protein
EcE24377A_0326 | 0 | 24 | 4.799175 | choline dehydrogenase
EcE24377A_0327 | -1 | 12 | 1.487530 | betaine aldehyde dehydrogenase
EcE24377A_0328 | 0 | 12 | -0.325663 | transcriptional regulator BetI
EcE24377A_0329 | 0 | 11 | -0.930242 | IS1, transposase orfA
EcE24377A_0330 | 0 | 12 | -0.872311 | IS1, transposase orfB
EcE24377A_0331 | 0 | 11 | -1.427495 | choline transport protein BetT
EcE24377A_0332 | 1 | 15 | -3.194481 | outer membrane autotransporter
EcE24377A_0333 | 2 | 21 | -4.517454 | LuxR family transcriptional regulator
EcE24377A_0334 | 1 | 18 | -1.585093 | LysR family transcriptional regulator
EcE24377A_0335 | 1 | 17 | 1.013580 | hypothetical protein
EcE24377A_0336 | 1 | 18 | 1.744624 | hypothetical protein
EcE24377A_0337 | 1 | 18 | 2.746181 | ankyrin repeat-containing protein
EcE24377A_0338 | 1 | 19 | 2.852483 | hypothetical protein
EcE24377A_0339 | 2 | 23 | 3.568781 | acyl-CoA synthetase
EcE24377A_0340 | 0 | 17 | 2.165644 | hypothetical protein
EcE24377A_0341 | -1 | 12 | 0.023419 | carbamate kinase
EcE24377A_0342 | -2 | 12 | -1.159732 | deaminase
EcE24377A_0343 | 0 | 16 | -2.381655 | hypothetical protein
EcE24377A_0344 | 1 | 14 | -1.129298 | hypothetical protein
EcE24377A_0345 | 0 | 15 | -2.314570 | sugar ABC transporter periplasmic sugar-binding
EcE24377A_0346 | 0 | 16 | -1.960060 | sugar ABC transporter ATP-binding protein
EcE24377A_0347 | 0 | 18 | -3.005944 | sugar ABC transporter permease
EcE24377A_0348 | -1 | 16 | -1.405183 | sugar ABC transporter permease
EcE24377A_0349 | -1 | 16 | -1.991990 | zinc-binding dehydrogenase oxidoreductase
EcE24377A_0350 | 1 | 18 | -1.573325 | hypothetical protein
EcE24377A_0352 | 1 | 15 | 1.626567 | *homoserine/threonine efflux protein
EcE24377A_0353 | 1 | 19 | 2.869714 | hypothetical protein
EcE24377A_0354 | 1 | 23 | 4.400860 | propionate catabolism operon regulatory protein
EcE24377A_0355 | 1 | 22 | 4.127564 | hypothetical protein
EcE24377A_0356 | 0 | 21 | 3.953348 | 2-methylisocitrate lyase
EcE24377A_0357 | 0 | 19 | 3.840045 | methylcitrate synthase
EcE24377A_0358 | -1 | 19 | 3.836508 | 2-methylcitrate dehydratase
EcE24377A_0359 | -1 | 18 | 3.771425 | propionyl-CoA synthetase
EcE24377A_0360 | -2 | 16 | 2.354463 | cytosine permease
EcE24377A_0361 | 1 | 17 | 3.554250 | cytosine deaminase
EcE24377A_0362 | 2 | 15 | 1.693215 | DNA-binding transcriptional regulator CynR
EcE24377A_0363 | -2 | 12 | 2.649524 | hypothetical protein
EcE24377A_0364 | -2 | 11 | 2.914747 | carbonic anhydrase
EcE24377A_0365 | -2 | 11 | 2.547527 | cyanate hydratase
EcE24377A_0366 | -2 | 12 | 3.124295 | cyanate transporter
EcE24377A_0367 | -3 | 12 | 2.948119 | galactoside permease
EcE24377A_0368 | -2 | 13 | 4.125997 | beta-D-galactosidase
EcE24377A_0369 | -1 | 17 | 3.999853 | lac repressor
EcE24377A_0370 | -1 | 15 | 3.847789 | DNA-binding transcriptional activator MhpR
EcE24377A_0371 | 0 | 15 | 4.267810 | 3-(3-hydroxyphenyl)propionate hydroxylase
EcE24377A_0372 | 0 | 13 | 4.227259 | 3-(2,3-dihydroxyphenyl)propionate dioxygenase
EcE24377A_0373 | 1 | 13 | 3.184593 | 2-hydroxy-6-ketonona-2,4-dienedioic acid
EcE24377A_0374 | 1 | 14 | 3.443313 | 2-keto-4-pentenoate hydratase
EcE24377A_0375 | 1 | 14 | 2.279112 | acetaldehyde dehydrogenase
EcE24377A_0376 | 0 | 13 | 2.012097 | 4-hydroxy-2-ketovalerate aldolase
EcE24377A_0377 | -1 | 15 | 1.339832 | 3-hydroxyphenylpropionic transporter MhpT
EcE24377A_0378 | -1 | 17 | -0.570680 | hypothetical protein
EcE24377A_0379 | 1 | 16 | -0.834816 | hypothetical protein
EcE24377A_0380 | 2 | 15 | -1.356881 | S-formylglutathione hydrolase
EcE24377A_0381 | 1 | 18 | -2.126578 | alcohol dehydrogenase
EcE24377A_0382 | 3 | 24 | -4.261379 | regulator protein FrmR
EcE24377A_0383 | 3 | 26 | -5.543033 | hypothetical protein
EcE24377A_0384 | 1 | 19 | -3.803760 | acyltransferase
EcE24377A_0385 | 1 | 16 | -1.310670 | glycosyl transferase family protein
EcE24377A_0386 | 1 | 18 | -0.216218 | GlcNAc-PI de-N-acetylase
EcE24377A_0387 | -1 | 17 | 1.781549 | hypothetical protein
EcE24377A_0388 | -2 | 16 | 2.813904 | hypothetical protein
EcE24377A_0389 | -3 | 16 | 3.593080 | taurine transporter substrate binding subunit
EcE24377A_0391 | 2 | 17 | -0.573953 | taurine transporter subunit
EcE24377A_0392 | 3 | 19 | -2.601897 | taurine dioxygenase
EcE24377A_0393 | 3 | 19 | -2.238750 | delta-aminolevulinic acid dehydratase
EcE24377A_0394 | 3 | 22 | -3.150053 | insertion element IS2 transposase InsD
EcE24377A_0395 | 3 | 25 | -4.428151 | insertion sequence 2 OrfA protein
EcE24377A_0396 | 2 | 21 | -4.005982 | outer membrane autotransporter
EcE24377A_0397 | 0 | 13 | -1.143688 | DNA-binding transcriptional regulator
EcE24377A_0398 | 0 | 16 | 0.649688 | beta-lactam binding protein AmpH
EcE24377A_0399 | 1 | 17 | -0.319863 | hypothetical protein
EcE24377A_0400 | 1 | 14 | -0.745300 | hypothetical protein
EcE24377A_0401 | -1 | 13 | -0.428929 | transporter
EcE24377A_0402 | 0 | 14 | -1.374687 | lipoprotein
EcE24377A_0403 | 1 | 17 | -4.006051 | hypothetical protein
EcE24377A_0404 | 0 | 18 | -4.369054 | hypothetical protein
ORFs having significant similarity with known virulence factors
LocusTag | Hits | Score | E-value | Comments
EcE24377A_0284 | OMPADOMAIN | 44 | 3e-07 | OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 43.8 bits (103), Expect = 3e-07
Identities = 23/76 (30%), Positives = 36/76 (47%), Gaps = 6/76 (7%)

Query: 88 DNRISFGEAGRFDHNQFFLNAEGQKALQDVVPLVLEASNSEEGKKWFKQIVIEGFTDTDG 147
+ F+ N+ L EGQ AL D + L + ++G +V+ G+TD G
Sbjct: 212 TKHFTLKSDVLFNFNKATLKPEGQAAL-DQLYSQLSNLDPKDGS-----VVVLGYTDRIG 265

Query: 148 SYLYNLHLSLQRSEWV 163
S YN LS +R++ V
Sbjct: 266 SDAYNQGLSERRAQSV 281


ORFs having significant similarity with known virulence factors
LocusTag | Hits | Score | E-value | Comments
EcE24377A_0285 | RTXTOXIND | 30 | 0.032 | Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 30.2 bits (68), Expect = 0.032
Identities = 20/164 (12%), Positives = 56/164 (34%), Gaps = 25/164 (15%)

Query: 381 DMRQVSDDSRQGSAQLIEQLLSEMKSGQQAMQAGMNDMLTSLQTSVAKIGAEGEGAGERM 440
+ VS++ LI++ S ++ + + ++ T +A+I E
Sbjct: 175 YFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRY-ENLSRVE 233

Query: 441 ARQLEKMFADSEAREKAQAEHMTAFIEAIQNSVQQGQSATMEKMAASVESLGEQLGSLFG 500
+L+ + + ++ + E + +L
Sbjct: 234 KSRLDDF--------SSLLH---------KQAIAKHAVLEQENKYVEAVN---ELRVYKS 273

Query: 501 QIDKGQQQISANQQANQQSLHEQTQRVMSEVDDQIKQLVETVAS 544
Q+++ + +I A ++ TQ +E+ D+++Q + +
Sbjct: 274 QLEQIESEI---LSA-KEEYQLVTQLFKNEILDKLRQTTDNIGL 313


ORFs having significant similarity with known virulence factors
LocusTag | Hits | Score | E-value | Comments
EcE24377A_0301 | PF00577 | 63 | 4e-12 | Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 62.9 bits (153), Expect = 4e-12
Identities = 30/247 (12%), Positives = 72/247 (29%), Gaps = 23/247 (9%)

Query: 487 TLNLNSLWSKLGTFSISYNDDRRYNSHYYTADYYQNVYSGTFGSLGLRAGIQRYNNGDSN 546
L + + T +S + Y + +Q + F + N
Sbjct: 530 QLTVTQQLGRTSTLYLSG-SHQTYWGTSNVDEQFQAGLNTAFEDINWTLSYSLTKNAWQK 588

Query: 547 ANTGKYIALDLSLPLGNWFSAGMTHQNGYTMANLSARKQFDEGT------------IRTV 594
+ +AL++++P +W + Q + A+ S + +
Sbjct: 589 -GRDQMLALNVNIPFSHWLRSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNL 647

Query: 595 GANLSRAISGDTGDDKTLSGGAYAQFDARYASGTLNVNSAADGYVNTNLTANGSVGWQGK 654
++ +G + +G A + Y + + S +D +G V
Sbjct: 648 SYSVQTGYAGGGDGNSGSTGYATLNYRGGYGNANIGY-SHSDDIKQLYYGVSGGVLAHAN 706

Query: 655 NIAASGRTDGNAGVIFNTGLED---DGQISAKINGRIFPLNGKRNYLPLSPYGRYEVELQ 711
+ + ++ G +D + Q + + R G + Y V L
Sbjct: 707 GVTLGQPLNDTVVLVKAPGAKDAKVENQTGVRTDWR-----GYAVLPYATEYRENRVALD 761

Query: 712 NSKNSLD 718
+ + +
Sbjct: 762 TNTLADN 768


ORFs having significant similarity with known virulence factors
LocusTag | Hits | Score | E-value | Comments
EcE24377A_0312 | INTIMIN | 552 | e-179 | Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 552 bits (1423), Expect = e-179
Identities = 228/818 (27%), Positives = 360/818 (44%), Gaps = 49/818 (5%)

Query: 20 PVMAARAQHAVQPRLSMGNTTVTADNNVEKNVASFAANAGTFLSSQPDS-----DATRNF 74
P++AA +L+ + VT N + ++AA L SQ S D ++
Sbjct: 131 PLVAAGGVAGHTNKLTKMSPDVTKSNMTDDKALNYAAQQAASLGSQLQSRSLNGDYAKDT 190

Query: 75 ITGMATAKANQEIQEWLGKYGTARVKLNVDKDFSLKDSSLEMLYPIYDTPTNMLFTQGAI 134
G+A +A+ ++Q WL YGTA V L +F SSL+ L P YD+ + F Q
Sbjct: 191 ALGIAGNQASSQLQAWLQHYGTAEVNLQSGNNFD--GSSLDFLLPFYDSEKMLAFGQVGA 248

Query: 135 HRTDDRTQSNIGFGWRHFSGNDWMAGVNTFIDHDLSRSHTRIGVGAEYWRDYLKLSANGY 194
D R +N+G G R F M G N FID D S +TR+G+G EYWRDY K S NGY
Sbjct: 249 RYIDSRFTANLGAGQRFFLPE-NMLGYNVFIDQDFSGDNTRLGIGGEYWRDYFKSSVNGY 307

Query: 195 IRASGWKKSPDVEDYQERPANGWDIRAEGYLPAWPQLGASLMYEQYYGDEVGLFGKDKRQ 254
R SGW +S + +DY ERPANG+DIR GYLP++P LGA LMYEQYYGD V LF DK Q
Sbjct: 308 FRMSGWHESYNKKDYDERPANGFDIRFNGYLPSYPALGAKLMYEQYYGDNVALFNSDKLQ 367

Query: 255 KDPHAISAEVTYTPVPLLTLSAGHKQGKSGENDTRFGLEVNYRIGEPLAKQLDTDSIRER 314
+P A + V YTP+PL+T+ ++ G END + ++ Y+ +P ++Q++ + E
Sbjct: 368 SNPGAATVGVNYTPIPLVTMGIDYRHGTGNENDLLYSMQFRYQFDKPWSQQIEPQYVNEL 427

Query: 315 RVLAGSRYDLVERNNNIVLEYRKSEVIRIALPERIEGKGGQTLSLGLVVSKATHGLKNVQ 374
R L+GSRYDLV+RNNNI+LEY+K +++ + +P I G T + L+V K+ +GL +
Sbjct: 428 RTLSGSRYDLVQRNNNIILEYKKQDILSLNIPHDINGTERSTQKIQLIV-KSKYGLDRIV 486

Query: 375 WEAPSLLAEGGKITGQGSQ----WQVTLPAYRPGKDNYYAISAVAYDNKGNASKRVQTEV 430
W+ +L ++GG+I GSQ +Q LPAY G N Y ++A AYD GN+S V +
Sbjct: 487 WDDSALRSQGGQIQHSGSQSAQDYQAILPAYVQGGSNVYKVTARAYDRNGNSSNNVLLTI 546

Query: 431 VITGAGMSADRTALTLDGQSRIQMLANGNEQRPLVLSLRDAEGQPVTGMKDQIKTELAFK 490
+ G D+ +T + A+G E +++ Q ++F
Sbjct: 547 TVLSNGQVVDQVGVTDFTADKTSAKADGTEAITYTATVKKNG-------VAQANVPVSFN 599

Query: 491 PAGNIVTRSLKATKSQAKPTLGEFTETEAGVYQSVFTTGTQSGEATITVSVDGMSKTVTA 550
S + + G + G+ ++ M+ + A
Sbjct: 600 IVSGTAVLSANSANTNGS-----------GKATVTLKSDK-PGQVVVSAKTAEMTSALNA 647

Query: 551 ELRATMMDVANSTLSANEPSGDVVADGQQAYTLTLTAVDSEGNPVTGEASRLRFVPQDTN 610
+ S VA+GQ A T T+ V PV+ + T
Sbjct: 648 NAVIFVDQTKASITEIKADKTTAVANGQDAITYTVK-VMKGDKPVSNQEVTF-----TTT 701

Query: 611 GVTVGAIS--EIKPGVYSATVSSTRAGNVVVRAFSEQYQLGTLQQTLKFVAGP-LDAAHS 667
+ + G T++ST G +V A + ++F +D +
Sbjct: 702 LGKLSNSTEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFFTTLTIDDGNI 761

Query: 668 SITLNPDKPVVGGTVTAIWTVKDAYDNPVTSLTPE---APSLAGAAAVGSTASGWTNNGD 724
I V G + +W + + + + A+V +++ T
Sbjct: 762 EIVGTG----VKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPAIASVDASSGQVTLKEK 817

Query: 725 GTWTAQITLGSTAGELEVMPKLNGQDAAANAAKVTVVADALSSNQSKVSVAEDHVKAGES 784
GT T + + N N +K DA+++ ++ E+
Sbjct: 818 GTTTISVISSDNQTATYTIATPNSL-IVPNMSKRVTYNDAVNTCKNFGGKLPSSQNELEN 876

Query: 785 TTVTLVAKDAHGNAISGLSLSASLTGTASEGATVSSWT 822
A + + S ++ + + TA + + + T
Sbjct: 877 VFKAWGAANKYEYYKSSQTIISWVQQTAQDAKSGVAST 914



Score = 78.6 bits (193), Expect = 1e-16
Identities = 77/393 (19%), Positives = 127/393 (32%), Gaps = 39/393 (9%)

Query: 884 KTTTELTFTVKDAYGNPVTGMKPDAPVFSGAASTGTERPSTGDWTETSNGVYVATLTLGS 943
T TVK G+ S +GT S +G TL
Sbjct: 575 TEAITYTATVKK------NGVAQANVPVSFNIVSGTAVLSANSANTNGSGKATVTLKSDK 628

Query: 944 AAGQLSVMPRVNGQNAVAQPLVLNVAGDASKAEIRDMTVKVDNQLANGQSTNLVTLTVVD 1003
+ +A+ V+ V D +KA I ++ +ANGQ +T TV
Sbjct: 629 PGQVVVSAKTAEMTSALNANAVIFV--DQTKASITEIKADKTTAVANGQDA--ITYTVKV 684

Query: 1004 TY-GNPLQGQEVTLNLPQGVTSKTGNTVTTNAAGKADIELISTVAGELEIAAAVKNSQ-- 1060
P+ QEVT G S + T T+ G A + L ST G+ ++A V +
Sbjct: 685 MKGDKPVSNQEVTFTTTLGKLSNS--TEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVD 742

Query: 1061 -KTVTVKFNADASTGQANLQVDTAVQKVANGKDAFTLTATVEDKNGN-PVPGSLVTFNLP 1118
K V+F + N+++ V G T ++ N G +
Sbjct: 743 VKAPEVEFFTTLTIDDGNIEI------VGTGVKGKLPTVWLQYGQVNLKASGGNGKYT-- 794

Query: 1119 RGVKPLTGDNVWVKANDEGKAELQVVSVTAGTYEITASAGNSQPSNTQTITFVADKATAT 1178
N + + D QV GT I+ + ++Q T+ +
Sbjct: 795 -----WRSANPAIASVDASSG--QVTLKEKGTTTISVISSDNQT-----ATYTIATPNSL 842

Query: 1179 VSGIEVMGNYALADGKAKQTYKVTVTDANNNLVKDSEVTLTASPASLNLEPNGTATTNEQ 1238
+ + D ++ N +++ A+ + + T + Q
Sbjct: 843 IV-PNMSKRVTYNDAVNTCKNFGGKLPSSQNELENVFKAWGAANKYEYYKSSQTIISWVQ 901

Query: 1239 GQAIFTATTTVAATYTLKAQVSQTNGQVSTKTA 1271
Q A + VA+TY L Q N + S A
Sbjct: 902 -QTAQDAKSGVASTYDLVKQNPLNNIKASESNA 933



Score = 75.9 bits (186), Expect = 7e-16
Identities = 75/338 (22%), Positives = 123/338 (36%), Gaps = 23/338 (6%)

Query: 955 NGQNAVAQPLVLNVAGDAS-KAEIRDMTVKVDNQLANGQSTNLVTLTVVDTYGNPLQGQE 1013
N N V + + G + + D T + A+G T TV G
Sbjct: 537 NSSNNVLLTITVLSNGQVVDQVGVTDFTADKTSAKADGTEAITYTATVKKN-GVAQANVP 595

Query: 1014 VTLNLPQGVTSKTGNTVTTNAAGKADIELISTVAGELEIAAAVKNSQKTV---TVKFNAD 1070
V+ N+ G + N+ TN +GKA + L S G++ ++A + V F
Sbjct: 596 VSFNIVSGTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAKTAEMTSALNANAVIFVDQ 655

Query: 1071 ASTGQANLQVDTAVQKVANGKDAFTLTATVEDKNGNPVPGSLVTFNLPRGVKPLTGDNVW 1130
++ D VANG+DA T T V K PV VTF G + +
Sbjct: 656 TKASITEIKADKTTA-VANGQDAITYTVKV-MKGDKPVSNQEVTFTTTLGKLSNSTE--- 710

Query: 1131 VKANDEGKAELQVVSVTAGTYEITASAGNSQPSNTQTITFVADKATATVSGIEVMGNYAL 1190
K + G A++ + S T G ++A + T IE++G
Sbjct: 711 -KTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFFTTLTIDDGNIEIVGT--- 766

Query: 1191 ADGKAKQTYKVTVTDANNNLVK---DSEVTLTAS-PASLNLEPN-GTATTNEQGQAIFTA 1245
G + V + NL + + T ++ PA +++ + G T E+G +
Sbjct: 767 --GVKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPAIASVDASSGQVTLKEKGTTTISV 824

Query: 1246 TT--TVAATYTLKAQVSQTNGQVSTKTAESKFVADDKN 1281
+ ATYT+ S +S + + V KN
Sbjct: 825 ISSDNQTATYTIATPNSLIVPNMSKRVTYNDAVNTCKN 862



Score = 55.1 bits (132), Expect = 2e-09
Identities = 59/369 (15%), Positives = 105/369 (28%), Gaps = 44/369 (11%)

Query: 758 VTVVADALSSNQSKV---SVAEDHVKAGESTTVTLVA------KDAHGNAISGLSLSASL 808
+TV+++ +Q V + + KA + +T A +S +S
Sbjct: 546 ITVLSNGQVVDQVGVTDFTADKTSAKADGTEAITYTATVKKNGVAQANVPVSFNIVS--- 602

Query: 809 TGTASEGATVSSWTEKGDGSYVATLTTGGKTGELRVMPLFNGQPAATEAAQLTVIAGEMS 868
GTA A +S G G TL + + A A + +
Sbjct: 603 -GTAVLSA--NSANTNGSGKATVTLKSDKPGQVVVSAKTAEMTSALN--ANAVIFVDQTK 657

Query: 869 SANSTLVADNKAPTVKTTTELTFTVKDAY-GNPVTGMKPDAPVFSGAASTGTERPSTGDW 927
++ + + AD +T+TVK PV+ + +T + S
Sbjct: 658 ASITEIKADKTTAVANGQDAITYTVKVMKGDKPVSNQEV-------TFTTTLGKLSNSTE 710

Query: 928 TETSNGVYVATLTLGSAAGQLSVMPRVNGQN-AVAQPLVLNVAGDASKAEIRDMTVKVDN 986
+NG TLT + G+ V RV+ V P V T+ +D+
Sbjct: 711 KTDTNGYAKVTLT-STTPGKSLVSARVSDVAVDVKAPEVEFFT-----------TLTIDD 758

Query: 987 QLANGQSTNLVTLTVVDTYGNPLQGQEVTLNLPQGVTSKTGNTVTTNAAGKADIELISTV 1046
T LQ +V L G T + A T+
Sbjct: 759 GNIEIVGTG----VKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPAIASVDASSGQVTL 814

Query: 1047 AGELEIAAAVKNSQKTVTVKFNADASTGQANLQVDTAVQKVANGKDAFTLTATVEDKNGN 1106
+ +V +S T + + V + + + N
Sbjct: 815 KEKGTTTISVISSDNQ-TATYTIATPNSLIVPNMSKRVT-YNDAVNTCKNFGGKLPSSQN 872

Query: 1107 PVPGSLVTF 1115
+ +
Sbjct: 873 ELENVFKAW 881



Score = 54.3 bits (130), Expect = 3e-09
Identities = 40/178 (22%), Positives = 63/178 (35%), Gaps = 13/178 (7%)

Query: 1147 TAGTYEITASA----GNSQPSNTQTITFVADKATATVSGI---EVMGNYALADGKAKQTY 1199
+ Y++TA A GNS + TIT +++ G+ A ADG TY
Sbjct: 521 GSNVYKVTARAYDRNGNSSNNVLLTITVLSNGQVVDQVGVTDFTADKTSAKADGTEAITY 580

Query: 1200 KVTVTDANNNLVKDSEVTLTAS-PASLNLEPNGTATTNEQGQAIFTATTTVAATYTLKAQ 1258
TV S A L+ +A TN G+A T + + A+
Sbjct: 581 TATVKKNGVAQANVPVSFNIVSGTAVLSAN---SANTNGSGKATVTLKSDKPGQVVVSAK 637

Query: 1259 VSQTNGQVSTKTAESKFVADDKNAVLTASSDMQSLVADGKSTAKLEVTLMSANNPVGG 1316
+ FV K ++ +D + VA+G+ V +M + PV
Sbjct: 638 --TAEMTSALNANAVIFVDQTKASITEIKADKTTAVANGQDAITYTVKVMKGDKPVSN 693


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_0313  HTHTETR  28     0.027    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 28.4 bits (63), Expect = 0.027
Identities = 12/42 (28%), Positives = 19/42 (45%)

Query: 3 RQKILQQLLEWIECNLEHPISIEDIAQKSGYSRRNIQLLFRN 44
RQ IL L S+ +IA+ +G +R I F++
Sbjct: 13 RQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKD 54


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_0328  HTHTETR  63     1e-14    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 63.5 bits (154), Expect = 1e-14
Identities = 32/172 (18%), Positives = 59/172 (34%), Gaps = 15/172 (8%)

Query: 10 RRRQLIDATLEAINEVGMHDATIAQIARRAGVSTGIISHYFRDKNGLLEATMRDITSQLR 69
R+ ++D L ++ G+ ++ +IA+ AGV+ G I +F+DK+ L S +
Sbjct: 12 TRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIG 71

Query: 70 DAVLNRLHALPQGSAEQRLQAIVGGNFDETQVSSAAMKAWLAFWASSMHQP-------ML 122
+ L P G L+ I+ + T V+ + +
Sbjct: 72 ELELEYQAKFP-GDPLSVLREILIHVLEST-VTEERRRLLMEIIFHKCEFVGEMAVVQQA 129

Query: 123 YRLQQVSSRRLLSNLVSEFRRE---LPRQQAQEAGYGLAALIDGL---WLRA 168
R + S + + + A + I GL WL A
Sbjct: 130 QRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLFA 181


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_0332  PRTACTNFAMLY  127    9e-32    Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 127 bits (320), Expect = 9e-32
Identities = 120/469 (25%), Positives = 187/469 (39%), Gaps = 65/469 (13%)

Query: 854 QAGNVLVVKGNYHGNNGQLVMNTVLNGDDSVTDKLVVEGDTSGTTAVTVNNAGGTGAKTL 913
+AG V+ N +G MN D ++DKLVV D SG + V N+G +
Sbjct: 466 EAGRFKVLTVNTLAGSGLFRMNV--FADLGLSDKLVVMQDASGQHRLWVRNSGS-EPASA 522

Query: 914 NGIELIHVDGKSEGEFVQA---GRIVAGAYDYTLARGQGANSGNWYLTSGSDSPELQPEP 970
N + L+ S F A G++ G Y Y LA +G W L P +P P
Sbjct: 523 NTLLLVQTPLGSAATFTLANKDGKVDIGTYRYRLA---ANGNGQWSLVGAKAPPAPKPAP 579

Query: 971 DPMPNPEPNPNPEPN-PNPTPTPGPDLNVDNDLRPEAGSYIANLAAANTMFTTRLHERLG 1029
P P P P P+P P P P G +L+ AAAN T
Sbjct: 580 QPGPQPPQPPQPQPEAPAPQPPAGRELS----------------AAANAAVNTGGVGLAS 623

Query: 1030 NTYYTDMVTGEQKQTTMWMRHEGGHNKWRDGSGQLKTQSNRYV---------LQLGGDVA 1080
+Y + ++ + + + G W G Q + NR +LG D A
Sbjct: 624 TLWYAESNALSKRLGELRLNPDAG-GAWGRGFAQRQQLDNRAGRRFDQKVAGFELGADHA 682

Query: 1081 QWSQNGSDSWHVGVMAGYGNSDSKTISSRTGYRAKASVNGYSTGLYATWYADDESRNGAY 1140
G WH+G +AGY D G+ + G YAT+ AD +G Y
Sbjct: 683 VAVAGGR--WHLGGLAGYTRGDRGFTGDGGGH-----TDSVHVGGYATYIAD----SGFY 731

Query: 1141 LDSWAQYSWFDN--TVKGDDLQS--ESYKSKGFTASLEAGYKHKLAEFNGSQGTRNEWYV 1196
LD+ + S +N V G D + Y++ G ASLEAG + A + W++
Sbjct: 732 LDATLRASRLENDFKVAGSDGYAVKGKYRTHGVGASLEAGRRFTHA---------DGWFL 782

Query: 1197 QPQAQVTWMGVKADKHRESNGTLVHSNGDGNVQTRLGVKTWLKSHHKMDDGKSREFQPFV 1256
+PQA++ +R +NG V G +V RLG L+ +++ R+ QP++
Sbjct: 783 EPQAELAVFRAGGGAYRAANGLRVRDEGGSSVLGRLG----LEVGKRIELAGGRQVQPYI 838

Query: 1257 EVNWLHNSKDFST-SMDGVSVTQDGARNIAEIKTGVEGQLNANLNVWGN 1304
+ + L T +G++ + AE+ G+ L +++ +
Sbjct: 839 KASVLQEFDGAGTVHTNGIAHRTELRGTRAELGLGMAAALGRGHSLYAS 887


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits    Score  E-value  Comments
EcE24377A_0334  HTHFIS  31     0.009    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 30.6 bits (69), Expect = 0.009
Identities = 15/50 (30%), Positives = 24/50 (48%), Gaps = 2/50 (4%)

Query: 6 TEENLLAFTTAARFGSFSKAAEELGLTTSAISYTIKRMETGLDVVLFTRS 55
E L+ A G+ KAA+ LGL + + I+ + G+ V +RS
Sbjct: 436 MEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIREL--GVSVYRSSRS 483


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_0341  CARBMTKINASE  427    e-153    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 427 bits (1099), Expect = e-153
Identities = 139/315 (44%), Positives = 200/315 (63%), Gaps = 3/315 (0%)

Query: 1 MKELVVVAIGGNSIIKDNASQSIEHQAEAVKAVADTVLEMLASDYDIVLTHGNGPQVGLD 60
M + VV+A+GGN++ + S E + V+ A + E++A Y++V+THGNGPQVG
Sbjct: 1 MGKRVVIALGGNALQQRGQKGSYEEMMDNVRKTARQIAEIIARGYEVVITHGNGPQVGSL 60

Query: 61 LRRAEIAHERQGLPLTPLANCVADTQGGIGYLIQQALNNRLARHG-EKKAVTVVTQVEVD 119
L + G+P P+ A +QG IGY+IQQAL N L + G EKK VT++TQ VD
Sbjct: 61 LLHMDAGQATYGIPAQPMDVAGAMSQGWIGYMIQQALKNELRKRGMEKKVVTIITQTIVD 120

Query: 120 KNDPGFAHPTKPIGAFFSDSQRDELQKANPDWCFVEDAGRGYRRVVASPEPKRIVEAPAI 179
KNDP F +PTKP+G F+ + L + W ED+GRG+RRVV SP+PK VEA I
Sbjct: 121 KNDPAFQNPTKPVGPFYDEETAKRLAREK-GWIVKEDSGRGWRRVVPSPDPKGHVEAETI 179

Query: 180 KALIQQGFVVIGAGGGGIPVVRTEAGDYQSVDAVIDKDLSTALLAHEIHADILVITTGVE 239
K L+++G +VI +GGGG+PV+ E G+ + V+AVIDKDL+ LA E++ADI +I T V
Sbjct: 180 KKLVERGVIVIASGGGGVPVIL-EDGEIKGVEAVIDKDLAGEKLAEEVNADIFMILTDVN 238

Query: 240 KVCIHFGKPEQQALDRVDNATMTRYMQEGHFPPGSMLPKIIASLAFLEQGGKEVIITTPE 299
+++G ++Q L V + +Y +EGHF GSM PK++A++ F+E GG+ II E
Sbjct: 239 GAALYYGTEKEQWLREVKVEELRKYYEEGHFKAGSMGPKVLAAIRFIEWGGERAIIAHLE 298

Query: 300 CLPAALRGETGTHLI 314
AL G+TGT ++
Sbjct: 299 KAVEALEGKTGTQVL 313


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits    Score  E-value  Comments
EcE24377A_0354  HTHFIS  336    e-112    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 336 bits (864), Expect = e-112
Identities = 122/401 (30%), Positives = 199/401 (49%), Gaps = 54/401 (13%)

Query: 164 DLAEEAGMTGIFIYSAATVRQAFSDALDMTRMSLRHNTHDATRNALRTRYVLGDMLGQSP 223
A +A G + Y ++ + + +L ++ ++G+S
Sbjct: 88 MTAIKASEKGAYDYLPKPFDL--TELIGIIGRALAEPKRRPSK-LEDDSQDGMPLVGRSA 144

Query: 224 QMEQVRQTILLYARSSAAVLIEGETGTGKELAAQAIHREYFARHDARQGKKSHPFVAVNC 283
M+++ + + ++ ++I GE+GTGKEL A+A+H + R+ PFVA+N
Sbjct: 145 AMQEIYRVLARLMQTDLTLMITGESGTGKELVARALHD-----YGKRRNG---PFVAINM 196

Query: 284 GAIAESLLEAELFGYEEGAFTGSRRGGRAGLFEIAHGGTLFLDEIGEMPLPLQTRLLRVL 343
AI L+E+ELFG+E+GAFTG++ G FE A GGTLFLDEIG+MP+ QTRLLRVL
Sbjct: 197 AAIPRDLIESELFGHEKGAFTGAQTR-STGRFEQAEGGTLFLDEIGDMPMDAQTRLLRVL 255

Query: 344 EEKEVTRVGGHQPVPVDVRVISATHCKLEEDMQQGRFRRDLFYRLSILRLQLPPLRERVA 403
++ E T VGG P+ DVR+++AT+ L++ + QG FR DL+YRL+++ L+LPPLR+R
Sbjct: 256 QQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRLNVVPLRLPPLRDRAE 315

Query: 404 DILPLAESFLKVSLAALSAPFSAALRQGLQASETVLVHYDWPGNIRELRNMMERLALFLS 463
DI L F++ ++ L+ + + WPGN+REL N++ RL
Sbjct: 316 DIPDLVRHFVQ-QAEKEGLDVKRFDQEALEL----MKAHPWPGNVRELENLVRRLTALYP 370

Query: 464 VEP-TPDLTPQFLQLLLPELARESAKTPAPRLLTP------------------------- 497
+ T ++ L+ +P+ E A + L
Sbjct: 371 QDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFASFGDALPPSGLYD 430

Query: 498 -----------QQALEKFNGDKTAAANYLGISRTTFWRRLK 527
AL G++ AA+ LG++R T ++++
Sbjct: 431 RVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIR 471


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_0357  PHPHTRNFRASE  30     0.023    Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase signature.

Length = 572

Score = 29.8 bits (67), Expect = 0.023
Identities = 11/33 (33%), Positives = 19/33 (57%), Gaps = 1/33 (3%)

Query: 65 LIHGKLPTRDE-LAAYKTKLKALRGLPANVRTV 96
+ +LPT +E AYK ++ + G P +RT+
Sbjct: 303 MDRDQLPTEEEQFEAYKEVVQRMDGKPVVIRTL 335


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_0367  TCRTETA  36     3e-04    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 35.6 bits (82), Expect = 3e-04
Identities = 44/192 (22%), Positives = 72/192 (37%), Gaps = 22/192 (11%)

Query: 4 LKNTNFWMFGLFFFFYFFI-MGAYFPFFPIWLHDINHISK--SDTGIIFAAISLFSLLFQ 60
+K + L + +G P P L D+ H + + GI+ A +L
Sbjct: 1 MKPNRPLIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACA 60

Query: 61 PLFGLLSDKLGLRKYLLWIITGMLVMFAPFFIFIFGPLLQYNILVGSIVGGIYLGFCFNA 120
P+ G LSD+ G R LL + + I P L + + +G IV GI A
Sbjct: 61 PVLGALSDRFGRRPVLL---VSLAGAAVDYAIMATAPFL-WVLYIGRIVAGIT-----GA 111

Query: 121 GAPAVEAFIEKVSRRSNFEFGRARMFG----CVGWALCAS--IVGIMFTINNQFVFWLGS 174
A+I ++ RAR FG C G+ + A + G+M + F+ +
Sbjct: 112 TGAVAGAYIADITDGDE----RARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAA 167

Query: 175 GCALILAVLLFF 186
+ + F
Sbjct: 168 ALNGLNFLTGCF 179


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_0377  TCRTETB  58     2e-11    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 58.4 bits (141), Expect = 2e-11
Identities = 85/414 (20%), Positives = 152/414 (36%), Gaps = 51/414 (12%)

Query: 1 MSTRTPSSSSSRLMLTIGLCFLVALMEGLDLQAAGIAAGGIAQAFALDKMQMGWIFSAGI 60
M+T S+ + I LC L L+ ++ IA F W+ +A +
Sbjct: 1 MNTSYSQSNLRHNQILIWLCILS-FFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFM 59

Query: 61 LGLLPGALVGGMLADRYGRKRILIGSVALFGLFSLATAIAWD-FPSLVFARLMTGVGLGA 119
L G V G L+D+ G KR+L+ + + S+ + F L+ AR + G G A
Sbjct: 60 LTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAG-AA 118

Query: 120 ALPNLIA-LTSEAAGPRFRGTAVSLMYCGVPIGAALAATLGFAGANLAWQTVFWVGGVVP 178
A P L+ + + RG A L+ V +G + +G A+ + + ++
Sbjct: 119 AFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPMIT 178

Query: 179 LILVPLLMRWLPESAVFAGEKQ--------AAPPLRALFAPETATATLLLWLCYFFTLLV 230
+I VP LM+ L + G LF + + L++ + F + V
Sbjct: 179 IITVPFLMKLLKKEVRIKGHFDIKGIILMSVGIVFFMLFTTSYSISFLIVSVLSFL-IFV 237

Query: 231 VYMLINWLPLLLVEQGFQPSQAAGVMFA-LQMGAASGTLMLGALMDK------------- 276
++ P + G GV+ + G +G + + M K
Sbjct: 238 KHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSV 297

Query: 277 -LRPVTMSLLIYS---GMLAS------LLALGTVSSFNGMLLAGFV----------AGLF 316
+ P TMS++I+ G+L +L +G L A F+ +F
Sbjct: 298 IIFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVF 357

Query: 317 ATGGQSVLYALAPLFYSSQIRATGVGTAVA----VGRLGAMSGPLLAGKMLALG 366
GG S + SS ++ G ++ L +G + G +L++
Sbjct: 358 VLGGLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLSIP 411


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_0378  TRNSINTIMINR  28     0.018    Translocated intimin receptor (Tir) signature.
		>TRNSINTIMINR#Translocated intimin receptor (Tir) signature.

Length = 549

Score = 28.2 bits (62), Expect = 0.018
Identities = 14/56 (25%), Positives = 30/56 (53%), Gaps = 2/56 (3%)

Query: 11 LKAGLVTSKKAAKVERTAKKSRVQAREARAAVEENKKAQLERDKQLSEQQKQAALA 66
+ +G + ++ + AK++ AR+ AVE N +AQ + Q + +Q++ L+
Sbjct: 308 IPSGELKDDIVEQIAQQAKEAGEVARQQ--AVESNAQAQQRYEDQHARRQEELQLS 361


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_0393  BINARYTOXINB  30     0.015    Binary toxin B family signature.
		>BINARYTOXINB#Binary toxin B family signature.

Length = 764

Score = 30.0 bits (67), Expect = 0.015
Identities = 19/69 (27%), Positives = 30/69 (43%)

Query: 254 DIVRELRERTELPIGAYQVSGEYAMIKFAALAGAIDEEKVVLESLGSIKRAGADLIFSYF 313
+ EL + +L + QV G A F +D E L I+ A +IF+
Sbjct: 466 NQFLELEKTKQLRLDTDQVYGNIATYNFENGRVRVDTGSNWSEVLPQIQETTARIIFNGK 525

Query: 314 ALDLAEKKI 322
L+L E++I
Sbjct: 526 DLNLVERRI 534


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_0396  PRTACTNFAMLY  121    4e-30    Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 121 bits (304), Expect = 4e-30
Identities = 101/442 (22%), Positives = 171/442 (38%), Gaps = 59/442 (13%)

Query: 532 TINGNGDVDNGTELDNSSVDNVVA---ATGNYKVRIDNATGAGAIADYKDKEIIYVNDVN 588
T+ G+G D D +V A+G +++ + N+ G+ + ++ +
Sbjct: 477 TLAGSGLFRMNVFADLGLSDKLVVMQDASGQHRLWVRNS---GSEPASANTLLLVQTPLG 533

Query: 589 TNATFSAAN---KADLGAYTYQAEQRGNTV------------------------------ 615
+ ATF+ AN K D+G Y Y+ GN
Sbjct: 534 SAATFTLANKDGKVDIGTYRYRLAANGNGQWSLVGAKAPPAPKPAPQPGPQPPQPPQPQP 593

Query: 616 ------VLQQMELTDYANMALSIP--SANTNIWNLEQDTVGTRLTNSRHGLADNGGAWVS 667
EL+ AN A++ + +W E + + RL R D GGAW
Sbjct: 594 EAPAPQPPAGRELSAAANAAVNTGGVGLASTLWYAESNALSKRLGELRL-NPDAGGAWGR 652

Query: 668 YFGGNFNGDNGTIN-YDQDVNGIMVGVDTKIDGNNAKWIVGAAAGFAKGDMN---DRSGQ 723
F DN +DQ V G +G D + +W +G AG+ +GD D G
Sbjct: 653 GFAQRQQLDNRAGRRFDQKVAGFELGADHAVAVAGGRWHLGGLAGYTRGDRGFTGDGGGH 712

Query: 724 VDQDSQTAYIYSSAHFANNVF-VDGSLSYSHFNNDLSATMSNGTYVDGSTNSDAWGFGLK 782
D Y + + A++ F +D +L S ND S+G V G + G L+
Sbjct: 713 TDSVHVGGY---ATYIADSGFYLDATLRASRLENDFKVAGSDGYAVKGKYRTHGVGASLE 769

Query: 783 AGYDFKLGDAGYVTPYGSISGLFQSGDDYQLSNDMKVDGQSYDSMRYELGVDAGYTFTYS 842
AG F D ++ P ++ G Y+ +N ++V + S+ LG++ G +
Sbjct: 770 AGRRFTHADGWFLEPQAELAVFRAGGGAYRAANGLRVRDEGGSSVLGRLGLEVGKRIELA 829

Query: 843 EDQALTPYFKLAYVYDDSNNDNDVNGDSIDNGTEGSAVRV--GLGTQFSFTKNFSAYTDA 900
+ + PY K + + + + V+ + I + TE R GLG + + S Y
Sbjct: 830 GGRQVQPYIKASVL-QEFDGAGTVHTNGIAHRTELRGTRAELGLGMAAALGRGHSLYASY 888

Query: 901 NYLGGGDVDQDWSANVGVKYTW 922
Y G + W+ + G +Y+W
Sbjct: 889 EYSKGPKLAMPWTFHAGYRYSW 910


7  EcE24377A_0419  EcE24377A_0424  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
EcE24377A_0419  4       14       1.167800   hypothetical protein
EcE24377A_0420  3       13       1.350797   recombination associated protein
EcE24377A_0421  3       15       1.409381   fructokinase
EcE24377A_0422  3       16       1.247683   hypothetical protein
EcE24377A_0423  2       14       1.185339   MFS transport protein AraJ
EcE24377A_0424  3       13       1.469538   exonuclease SbcC
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_0421  ACETATEKNASE  30     0.016    Acetate kinase family signature.
		>ACETATEKNASE#Acetate kinase family signature.

Length = 400

Score = 29.8 bits (67), Expect = 0.016
Identities = 17/69 (24%), Positives = 29/69 (42%), Gaps = 10/69 (14%)

Query: 187 FISGTGFATDYRRLSGHALKGSEIISLVEESDPVAELALRRYELRLAKSLAHVVNILDP- 245
+G ++D+R L A + D A+LAL + R+ K++ +
Sbjct: 273 VYGISGISSDFRDLEDAAF---------KNGDKRAQLALNVFAYRVKKTIGSYAAAMGGV 323

Query: 246 DVIVLGGGM 254
DVIV G+
Sbjct: 324 DVIVFTAGI 332


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_0423  TCRTETA  51     4e-09    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 51.0 bits (122), Expect = 4e-09
Identities = 74/356 (20%), Positives = 126/356 (35%), Gaps = 35/356 (9%)

Query: 33 ILSLALGTFGLGMAEFGIMGVLTELAHNVGISIPAAGH---MISYYALGVVVGAPIIALF 89
+ ++AL G+G+ IM VL L ++ S H +++ YAL AP++
Sbjct: 11 LSTVALDAVGIGL----IMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGAL 66

Query: 90 SSRYSLKHILLFLVALCVIGNAMFTLSSSYLMLAIGRLVSGFPHGAFFGVGAIVLSKIIK 149
S R+ + +LL +A + A+ + +L IGR+V+G GA + I
Sbjct: 67 SDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIAD--IT 124

Query: 150 PGKVTAAVAGMVSGMTVANLLGIPLGTYLSQEFSWRYTFLLIAVFNIAVMASVYFWVPDI 209
G A G +S ++ P+ L FS F A N + F +P+
Sbjct: 125 DGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPES 184

Query: 210 RDEAKGKLREQ----------FHFLRSPAPWLI--FAATMFGNAGVFAWFSYVKPYMMFI 257
+ LR + + A + F + G W +
Sbjct: 185 HKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFG------E 238

Query: 258 SGFSETAMTFIMMLVGLGM---VLGNMLSGRISGRYSPLRIAAVTDFIIVLALLMLFFCG 314
F A T + L G+ + M++G ++ R R + ++L F
Sbjct: 239 DRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFAT 298

Query: 315 GMKTTSLIFAFICCAGLFALSAPLQILLLQNAKGGELLGAAGGQIAF--NLGSAVG 368
I + G+ LQ +L + E G G +A +L S VG
Sbjct: 299 RGWMAFPIMVLLASGGIG--MPALQAMLSRQV-DEERQGQLQGSLAALTSLTSIVG 351


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits       Score  E-value  Comments
EcE24377A_0424  RTXTOXIND  40     5e-05    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 39.8 bits (93), Expect = 5e-05
Identities = 34/199 (17%), Positives = 71/199 (35%), Gaps = 14/199 (7%)

Query: 671 QQEAQSWQQRQNELTALQNRIQQLTPILETLPQSDDLPHSEDTVALDNWRQVHEQCLALH 730
+ + Q + Q R Q L+ +E + E + +V +
Sbjct: 133 EADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIK 192

Query: 731 SQQQTLQQQDVLAAQSLQKAQAQFDTAL--------QASVFDDQQAFLAALMDEQTLTQL 782
Q T Q Q +L K +A+ T L + V + ++L+ +Q +
Sbjct: 193 EQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIA-- 250

Query: 783 EQLKQNLENQRRQAQTLVTQTAETLAQHQQHRPDGLALTVTVEQIQQEL-AQTHQKLREN 841
K + Q + V + +Q +Q + L+ + + Q + KLR+
Sbjct: 251 ---KHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQT 307

Query: 842 TTSQGEIRQQLKQDADNRQ 860
T + G + +L ++ + +Q
Sbjct: 308 TDNIGLLTLELAKNEERQQ 326



Score = 39.4 bits (92), Expect = 6e-05
Identities = 25/204 (12%), Positives = 59/204 (28%), Gaps = 18/204 (8%)

Query: 487 EARIKTLEAQRAQLQAGQPCPLCGSTSHPAVEAYQALEPGVNQSRLLALENEVKKLGEEG 546
EA ++ Q + Q ++E + E + +E + L
Sbjct: 133 EADTLKTQSSLLQARLEQ---TRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLT- 188

Query: 547 AALRGQLDALTKQLQRDENEAQSLRQDEQALTQQWQAVTASLNITLQPQDDIQPWLDAQD 606
+ ++ Q Q + E R + + + + DD L Q
Sbjct: 189 SLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQA 248

Query: 607 -------EHERQL-RLLSQRHELQGQIAAHNQQIIQYQQQIEQRQQQLLTALAGYALTLP 658
E E + +++ + Q+ +I+ +++ + Q L
Sbjct: 249 IAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLF------KNEILD 302

Query: 659 QEDEEESWLATRQQEAQSWQQRQN 682
+ + + E ++RQ
Sbjct: 303 KLRQTTDNIGLLTLELAKNEERQQ 326



Score = 32.5 bits (74), Expect = 0.008
Identities = 16/150 (10%), Positives = 42/150 (28%), Gaps = 5/150 (3%)

Query: 731 SQQQTLQQQDVLAAQSLQKAQAQFDTA----LQASVFDDQQAFLAALMDEQTLTQLEQLK 786
+ Q + A + Q + L D+ F +E+ L +K
Sbjct: 134 ADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVS-EEEVLRLTSLIK 192

Query: 787 QNLENQRRQAQTLVTQTAETLAQHQQHRPDGLALTVTVEQIQQELAQTHQKLRENTTSQG 846
+ + Q + A+ + L L + ++
Sbjct: 193 EQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKH 252

Query: 847 EIRQQLKQDADNRQQQQTLMQQIAQMTQQV 876
+ +Q + + + + Q+ Q+ ++
Sbjct: 253 AVLEQENKYVEAVNELRVYKSQLEQIESEI 282


8  EcE24377A_0460  EcE24377A_0472  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
EcE24377A_0460  0       14       -3.129515  hypothetical protein
EcE24377A_0461  1       20       -0.445094  acetyltransferase
EcE24377A_0462  3       23       0.262088   hypothetical protein
EcE24377A_0463  1       19       1.027227   protoheme IX farnesyltransferase
EcE24377A_0464  1       21       0.767689   cytochrome o ubiquinol oxidase subunit IV
EcE24377A_0465  0       20       0.688388   cytochrome o ubiquinol oxidase subunit III
EcE24377A_0466  -2      17       0.311730   cytochrome o ubiquinol oxidase subunit I
EcE24377A_0467  0       20       -0.008360  cytochrome o ubiquinol oxidase subunit II
EcE24377A_0468  0       19       0.137868   muropeptide transporter
EcE24377A_0469  3       29       -0.673473  hypothetical protein
EcE24377A_0470  4       25       -0.139735  transcriptional regulator BolA
EcE24377A_0471  3       27       -0.086320  hypothetical protein
EcE24377A_0472  3       26       0.211675   trigger factor
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_0468  TCRTETA  39     3e-05    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 39.0 bits (91), Expect = 3e-05
Identities = 71/347 (20%), Positives = 135/347 (38%), Gaps = 20/347 (5%)

Query: 62 KFLWSPLMDRYTPPFFGRRRGWLLATQILLLVAIAAMGFLEPGTQLRWMAALAVVIAFCS 121
+F +P++ + F RR LL + V A M W+ + ++A +
Sbjct: 56 QFACAPVLGALSDRF--GRRPVLLVSLAGAAVDYAIMAT----APFLWVLYIGRIVAGIT 109

Query: 122 ASQDIVFDAWKTDVLPAEERGAGAAISVLGYRLGMLVSGGLALWLADKWLGWQGMYWLMA 181
+ V A+ D+ +ER + GM+ L + ++ A
Sbjct: 110 GATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGG--FSPHAPFFAAA 167

Query: 182 AL-LIPCIIATLLAPEP--TDTIPVPKTLEQAVVAPLRDFFGRNNAWLILLLIVLYKLGD 238
AL + + L PE + P+ + + + A L+ + ++ +G
Sbjct: 168 ALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQ 227

Query: 239 AFAMSLTTTFLIRGVGFDAGEVGVVNKTLGLLATIVGALYGGILMQRLSLFRALLIFGIL 298
A +L F +DA +G+ G+L ++ A+ G + RL RAL+ G++
Sbjct: 228 VPA-ALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALM-LGMI 285

Query: 299 QGASNAGYWLLSITDKHLYSMGAAVFFENLCGGMGTSAFVALLMTLCNKSFSATQFALLS 358
A GY LL+ + + V GG+G A A+L ++ L+
Sbjct: 286 --ADGTGYILLAFATRGWMAFPIMVLL--ASGGIGMPALQAMLSRQVDEERQGQLQGSLA 341

Query: 359 ALSAVGRVYVGPVAGWFVEAHGWSTF--YLFSVAAAVPGLILLLVCR 403
AL+++ + VGP+ + A +T+ + + AA+ L L + R
Sbjct: 342 ALTSLTSI-VGPLLFTAIYAASITTWNGWAWIAGAALYLLCLPALRR 387


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_0469  PF06291  27     0.027    Lambda prophage Bor protein
		>PF06291#Lambda prophage Bor protein

Length = 102

Score = 26.5 bits (58), Expect = 0.027
Identities = 11/34 (32%), Positives = 18/34 (52%)

Query: 3 KKILFPLVALFMLAGCAKPPTTIEVSPTITLPQQ 36
KK+LF ++ GCA+ T+ PT P++
Sbjct: 7 KKMLFSAALAMLITGCAQQTFTVGNKPTAVTPKE 40


9  EcE24377A_0490  EcE24377A_0509  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
EcE24377A_0490  1       19       -3.676809  methylated-DNA-[protein]-cysteine
EcE24377A_0491  0       13       -1.323770  hypothetical protein
EcE24377A_0492  0       13       -0.729072  cyclic diguanylate phosphodiesterase
EcE24377A_0493  2       15       -0.565081  hypothetical protein
EcE24377A_0494  1       15       -0.540461  maltose O-acetyltransferase
EcE24377A_0495  1       15       -0.054416  hypothetical protein
EcE24377A_0496  1       15       0.467572   acriflavine resistance protein B
EcE24377A_0497  2       12       -0.439134  acriflavine resistance protein A
EcE24377A_0498  2       13       -0.168869  DNA-binding transcriptional repressor AcrR
EcE24377A_0499  3       16       0.699200   hypothetical protein
EcE24377A_0500  3       16       0.961529   potassium efflux protein KefA
EcE24377A_0501  4       18       2.608909   hypothetical protein
EcE24377A_0502  4       15       4.241163   hypothetical protein
EcE24377A_0503  5       15       4.115241   primosomal replication protein N''
EcE24377A_0504  4       16       3.261165   hypothetical protein
EcE24377A_0505  3       18       3.916936   hypothetical protein
EcE24377A_0506  4       23       3.238208   adenine phosphoribosyltransferase
EcE24377A_0507  4       24       2.976072   DNA polymerase III subunits gamma and tau
EcE24377A_0508  3       27       1.851258   hypothetical protein
EcE24377A_0509  2       22       1.594568   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_0492  BCTERIALGSPF  30     0.031    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 29.8 bits (67), Expect = 0.031
Identities = 31/137 (22%), Positives = 54/137 (39%), Gaps = 24/137 (17%)

Query: 247 IWLPLGLVIGLLAAMFVLRILRRIQSPHHRLQDAIENRDICVHYQPIVSLANGKIVGAEA 306
W+ L L+ G +A +LR R+ + + P++ G+I
Sbjct: 228 PWMLLALLAGFMAFRVMLR------QEKRRVS-----FHRRLLHLPLI----GRIARGLN 272

Query: 307 LARWPQTDGSWLSPDSFIPLAQQTGLS-EPLTLLIIRSVFEDMGDWLRQHPQQHISINLE 365
AR+ +T + S +PL Q +S + ++ R D +R+ H + LE
Sbjct: 273 TARYARTLSILNA--SAVPLLQAMRISGDVMSNDYARHRLSLATDAVREGVSLHKA--LE 328

Query: 366 STVLTSEKIPQLLREMI 382
T L P ++R MI
Sbjct: 329 QTAL----FPPMMRHMI 341


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_0496  ACRIFLAVINRP  1369   0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 1369 bits (3546), Expect = 0.0
Identities = 802/1033 (77%), Positives = 915/1033 (88%), Gaps = 1/1033 (0%)

Query: 1 MPNFFIDRPIFAWVIAIIIMLAGGLAILKLPVAQYPTIAPPAVTISASYPGADAKTVQDT 60
M NFFI RPIFAWV+AII+M+AG LAIL+LPVAQYPTIAPPAV++SA+YPGADA+TVQDT
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VTQVIEQNMNGIDNLMYMSSNSDSTGTVQITLTFESGTDADIAQVQVQNKLQLAMPLLPQ 120
VTQVIEQNMNGIDNLMYMSS SDS G+V ITLTF+SGTD DIAQVQVQNKLQLA PLLPQ
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 121 EVQQQGVSVEKSSSSFLMVVGVINTDGTMTQEDISDYVAANMKDAISRTSGVGDVQLFGS 180
EVQQQG+SVEKSSSS+LMV G ++ + TQ+DISDYVA+N+KD +SR +GVGDVQLFG+
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 181 QYAMRIWMNPNELNKFQLTPVDVITAIKAQNAQVAAGQLGGTPPVKGQQLNASIIAQTRL 240
QYAMRIW++ + LNK++LTPVDVI +K QN Q+AAGQLGGTP + GQQLNASIIAQTR
Sbjct: 181 QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRF 240

Query: 241 TSTEEFGKILLKVNQDGSRVLLRDVAKIELGGENYDIIAEFNGQPASGLGIKLATGANAL 300
+ EEFGK+ L+VN DGS V L+DVA++ELGGENY++IA NG+PA+GLGIKLATGANAL
Sbjct: 241 KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL 300

Query: 301 DTAAAIRAELAKMEPFFPSGLKIVYPYDTTPFVKISIHEVVKTLVEAIILVFLVMYLFLQ 360
DTA AI+A+LA+++PFFP G+K++YPYDTTPFV++SIHEVVKTL EAI+LVFLVMYLFLQ
Sbjct: 301 DTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQ 360

Query: 361 NFRATLIPTIAVPVVLLGTFAVLAAFGFSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420
N RATLIPTIAVPVVLLGTFA+LAAFG+SINTLTMFGMVLAIGLLVDDAIVVVENVERVM
Sbjct: 361 NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420

Query: 421 AEEGLPPKEATRKSMGQIQGALVGIAMVLSAVFVPMAFFGGSTGAIYRQFSITIVSAMAL 480
E+ LPPKEAT KSM QIQGALVGIAMVLSAVF+PMAFFGGSTGAIYRQFSITIVSAMAL
Sbjct: 421 MEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL 480

Query: 481 SVLVALILTPALCATMLKPIAKGDHGEGKKGFFGWFNRMFEKSTHHYTDSVGGILRSTGR 540
SVLVALILTPALCAT+LKP++ H E K GFFGWFN F+ S +HYT+SVG IL STGR
Sbjct: 481 SVLVALILTPALCATLLKPVSAE-HHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 541 YLVLYLIIVVGMAYLFVRLPSSFLPDEDQGVFMTMVQLPAGATQERTQKVLNEVTHYYLT 600
YL++Y +IV GM LF+RLPSSFLP+EDQGVF+TM+QLPAGATQERTQKVL++VT YYL
Sbjct: 540 YLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLK 599

Query: 601 KEKNNVESVFAVNGFGFAGRGQNTGIAFVSLKDWADRPGEENKVEAITMRATRAFSQIKD 660
EK NVESVF VNGF F+G+ QN G+AFVSLK W +R G+EN EA+ RA +I+D
Sbjct: 600 NEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRD 659

Query: 661 AMVFAFNLPAIVELGTATGFDFELIDQAGLGHEKLTQARNQLLAEAAKHPDMLTSVRPNG 720
V FN+PAIVELGTATGFDFELIDQAGLGH+ LTQARNQLL AA+HP L SVRPNG
Sbjct: 660 GFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNG 719

Query: 721 LEDTPQFKIDIDQEKAQALGVSINDINTTLGAAWGGSYVNDFIDRGRVKKVYVMSEAKYR 780
LEDT QFK+++DQEKAQALGVS++DIN T+ A GG+YVNDFIDRGRVKK+YV ++AK+R
Sbjct: 720 LEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFR 779

Query: 781 MLPDDIGDWYVRAADGQMVPFSAFSSSRWEYGSPRLERYNGLPSMEILGQAAPGKSTGEA 840
MLP+D+ YVR+A+G+MVPFSAF++S W YGSPRLERYNGLPSMEI G+AAPG S+G+A
Sbjct: 780 MLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDA 839

Query: 841 MELMEQLASKLPTGVGYDWTGMSYQERLSGNQAPSLYAISLIVVFLCLAALYESWSIPFS 900
M LME LASKLP G+GYDWTGMSYQERLSGNQAP+L AIS +VVFLCLAALYESWSIP S
Sbjct: 840 MALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVS 899

Query: 901 VMLVVPLGVIGALLAATFRGLTNDVYFQVGLLTTIGLSAKNAILIVEFAKDLMDKEGKGL 960
VMLVVPLG++G LLAAT NDVYF VGLLTTIGLSAKNAILIVEFAKDLM+KEGKG+
Sbjct: 900 VMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGV 959

Query: 961 IEATLDAVRMRLRPILMTSLAFILGVMPLVISTGAGSGAQNAVGTGVMGGMVTATVLAIF 1020
+EATL AVRMRLRPILMTSLAFILGV+PL IS GAGSGAQNAVG GVMGGMV+AT+LAIF
Sbjct: 960 VEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIF 1019

Query: 1021 FVPVFFVVVRRRF 1033
FVPVFFVV+RR F
Sbjct: 1020 FVPVFFVVIRRCF 1032


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits       Score  E-value  Comments
EcE24377A_0497  RTXTOXIND  44     6e-07    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 44.0 bits (104), Expect = 6e-07
Identities = 33/212 (15%), Positives = 71/212 (33%), Gaps = 23/212 (10%)

Query: 100 TYQATYDSAKGDLAKAQAAANIAQLTVNRYQKLLGTQYISKQEYDQALADAQQANAAVTA 159
+ Y A +L + + Q+ Q +++ ++ L +Q +
Sbjct: 256 EQENKYVEAVNELR--VYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGL 313

Query: 160 AKAAVETARINLAYTKVTSPISGRIGKSNV-TEGALVQNGQATALATVQQLDPIYVDVTQ 218
+ + + +P+S ++ + V TEG +V + T + V + D + V
Sbjct: 314 LTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAE-TLMVIVPEDDTLEVTALV 372

Query: 219 SSNDFLRLKQELA----------NGTLKQENGKAKVSLITSDGIKFPQDGTLEFSDVTVD 268
+ D + KV I D I+ + G + ++++
Sbjct: 373 QNKDIGFINVGQNAIIKVEAFPYTRYGYLV---GKVKNINLDAIEDQRLGLVFNVIISIE 429

Query: 269 QTTGSITLRAIFPNPDHTLLPGMFVRARLEEG 300
+ S + I L GM V A ++ G
Sbjct: 430 ENCLSTGNKNIP------LSSGMAVTAEIKTG 455



Score = 34.4 bits (79), Expect = 8e-04
Identities = 24/125 (19%), Positives = 43/125 (34%), Gaps = 13/125 (10%)

Query: 49 PLQITTELPGR-TSAYRIAEVRPQVSGIILKRNFKEGSDIEAGVSLYQIDPATYQATYDS 107
++I G+ T + R E++P + I+ + KEG + G L ++ +A
Sbjct: 79 QVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEA---- 134

Query: 108 AKGDLAKAQAAANIAQLTVNRYQKLLGTQYISKQEYDQALADAQQANAAVTAAKAAVETA 167
D K Q++ A+L RYQ L E ++
Sbjct: 135 ---DTLKTQSSLLQARLEQTRYQILS-----RSIELNKLPELKLPDEPYFQNVSEEEVLR 186

Query: 168 RINLA 172
+L
Sbjct: 187 LTSLI 191


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_0498  HTHTETR       222    5e-76    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 222 bits (567), Expect = 5e-76
Identities = 215/215 (100%), Positives = 215/215 (100%)

Query: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60
MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60

Query: 61 EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV 120
EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV
Sbjct: 61 EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV 120

Query: 121 GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF 180
GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF
Sbjct: 121 GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF 180

Query: 181 APQSFDLKKEARDYVAILLEMYLLCPTLRNPATNE 215
APQSFDLKKEARDYVAILLEMYLLCPTLRNPATNE
Sbjct: 181 APQSFDLKKEARDYVAILLEMYLLCPTLRNPATNE 215


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_0500  RTXTOXIND     32     0.017    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 31.7 bits (72), Expect = 0.017
Identities = 19/125 (15%), Positives = 40/125 (32%), Gaps = 6/125 (4%)

Query: 28 QNTAFARASSNGDLPTKADLQAQLDSLNKQKDLSAQDKLVQQDLTDTLATLDKIDRIKEE 87
N RA L + + L L+ + A L++ ++ E
Sbjct: 207 LNLDKKRAERLTVLARINRYENLSRVEKSR--LDDFSSLLHKQAIAKHAVLEQENKYVEA 264

Query: 88 TVQLRQKVAEAPEKMRQATAALTALSDVDND--EETRKIL--STLSLRQLETRVAQALDD 143
+LR ++ + + +A V E L +T ++ L +A+ +
Sbjct: 265 VNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEER 324

Query: 144 LQNAQ 148
Q +
Sbjct: 325 QQASV 329


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_0507  IGASERPTASE   39     5e-05    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 39.3 bits (91), Expect = 5e-05
Identities = 40/251 (15%), Positives = 78/251 (31%), Gaps = 31/251 (12%)

Query: 404 PLPETTSQVLAARQQLQRVQGATKAKKSEPAA----ATRARPVNNAALERLASVTDRVQA 459
P E +Q + + + P+ AR + A + A T
Sbjct: 983 PEVEKRNQTVDTTN----ITTPNNIQADVPSVPSNNEEIARV-DEAPVPPPAPATPSETT 1037

Query: 460 RPVPSALEKAPAKKEAYRWKATTPVMQQKE--------VVATPKALKKA---LEHEKTPE 508
V ++ E AT Q +E V A + + A E ++T
Sbjct: 1038 ETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQT 1097

Query: 509 LAAKLAA---------EAIERDAWAAQVSQLSLPKLIEQVALNAWKE-ESDNAVCLHLRS 558
K A E+ +V+ PK + + E +N ++++
Sbjct: 1098 TETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKE 1157

Query: 559 SQRHLNNRGAQQKLAEALST-LKGSTVELTIVEDDNPAVRTPLEWRQAIYEEKLAQARES 617
Q N ++ A+ S+ ++ E T V N V P A + + +
Sbjct: 1158 PQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSN 1217

Query: 618 IIADNNIQTLR 628
+ + +++R
Sbjct: 1218 KPKNRHRRSVR 1228


10  EcE24377A_0521  EcE24377A_0536  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
EcE24377A_0521  0       12       3.252410   GumN family protein
EcE24377A_0522  0       13       3.631462   addiction module antidote protein
EcE24377A_0523  -1      15       3.969784   copper exporting ATPase
EcE24377A_0524  -2      15       1.336461   glutaminase
EcE24377A_0525  -1      18       -0.114546  amino acid permease
EcE24377A_0526  0       18       0.399094   DNA-binding transcriptional regulator CueR
EcE24377A_0527  -1      16       0.142774   nodulation efficiency family protein
EcE24377A_0528  -2      16       -0.191326  hypothetical protein
EcE24377A_0529  -2      17       0.369234   ABC transporter ATP-binding protein
EcE24377A_0530  0       18       3.305121   hypothetical protein
EcE24377A_0531  0       23       6.295461   protein YbbN
EcE24377A_0532  1       22       5.352397   short chain dehydrogenase
EcE24377A_0533  2       22       4.591147   multifunctional acyl-CoA thioesterase I/protease
EcE24377A_0534  1       22       4.678323   ABC transporter ATP-binding protein
EcE24377A_0535  1       21       4.225939   ABC transporter permease
EcE24377A_0536  2       24       3.349123   RhsD protein
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_0524  BLACTAMASEA   29     0.016    Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 29.4 bits (66), Expect = 0.016
Identities = 11/43 (25%), Positives = 19/43 (44%)

Query: 38 GQLAAVAIVTCDGKVYSAGDSDYRFALESISKVCTLALALEDV 80
G++ + + G+ +A +D RF + S KV L V
Sbjct: 38 GRVGMIEMDLASGRTLTAWRADERFPMMSTFKVVLCGAVLARV 80


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_0532  DHBDHDRGNASE  77     9e-19    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase

signature.
Length = 261

Score = 77.0 bits (189), Expect = 9e-19
Identities = 48/212 (22%), Positives = 81/212 (38%), Gaps = 7/212 (3%)

Query: 16 KSVLITGCSSGIGLESALELKRQGFHVLAGCRKPDDVERMNN----MGFT--GVLIDLDS 69
K ITG + GIG A L QG H+ A P+ +E++ + D+
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRD 68

Query: 70 PESVDRAADEVIALTDNCLYGIFNNAGFGMYGPLSTISRAQMEQQFSANFFGAHQLTMRL 129
++D + + + N AG G + ++S + E FS N G + +
Sbjct: 69 SAAIDEITARIEREMGP-IDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSV 127

Query: 130 LPAMLPHGEGRIVMTSSVMGLISTPGRGAYAASKYALEAWSDALRMELRHSGIKVSLIEP 189
M+ G IV S + AYA+SK A ++ L +EL I+ +++ P
Sbjct: 128 SKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSP 187

Query: 190 GPIRTRFTDNVNQTQSDKPVENPGIAARFTLG 221
G T ++ ++ G F G
Sbjct: 188 GSTETDMQWSLWADENGAEQVIKGSLETFKTG 219


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_0534  PF05272       29     0.014    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.014
Identities = 12/20 (60%), Positives = 13/20 (65%)

Query: 41 LVGESGSGKSTLLAILAGLD 60
L G G GKSTL+ L GLD
Sbjct: 601 LEGTGGIGKSTLINTLVGLD 620


11  EcE24377A_0547  EcE24377A_0585  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
EcE24377A_0547  2       14       -0.787359  allantoin permease
EcE24377A_0548  1       14       -0.419545  allantoinase
EcE24377A_0550  3       18       -0.660968  purine permease YbbY
EcE24377A_0549  5       17       -0.443570  hypothetical protein
EcE24377A_0551  4       17       0.839571   glycerate kinase
EcE24377A_0552  2       14       1.280504   hypothetical protein
EcE24377A_0553  2       14       1.958619   hypothetical protein
EcE24377A_0554  1       16       3.244333   allantoate amidohydrolase
EcE24377A_0555  1       17       4.073208   ureidoglycolate dehydrogenase
EcE24377A_0556  1       15       4.812055   membrane protein FdrA
EcE24377A_0557  1       15       5.234870   hypothetical protein
EcE24377A_0558  1       16       4.673291   hypothetical protein
EcE24377A_0559  1       15       3.463047   carbamate kinase
EcE24377A_0561  1       20       2.844872   hypothetical protein
EcE24377A_0560  2       18       2.761651   phosphoribosylaminoimidazole carboxylase ATPase
EcE24377A_0562  3       20       2.290137   phosphoribosylaminoimidazole carboxylase
EcE24377A_0563  3       18       1.723563   UDP-2,3-diacylglucosamine hydrolase
EcE24377A_0564  2       17       0.461899   peptidyl-prolyl cis-trans isomerase B
EcE24377A_0566  1       14       0.321934   cysteinyl-tRNA synthetase
EcE24377A_0567  0       24       -2.817065  hypothetical protein
EcE24377A_0569  -1      24       -3.771348  hypothetical protein
EcE24377A_0570  1       28       -5.127649  bifunctional 5,10-methylene-tetrahydrofolate
EcE24377A_0571  2       36       -9.159306  type-1 fimbrial protein
EcE24377A_0572  1       29       -8.723598  hypothetical protein
EcE24377A_0573  0       24       -6.267156  chaperone protein FimC
EcE24377A_0575  1       15       -1.622789  hypothetical protein
EcE24377A_0576  0       17       -2.939452  mannose binding protein FimH
EcE24377A_0577  0       15       -2.320196  transcriptional regulator FimZ
EcE24377A_0579  -1      18       2.210654   *AraC family transcriptional regulator
EcE24377A_0580  0       19       2.851641   hypothetical protein
EcE24377A_0581  0       21       3.362827   bacteriophage N4 receptor, outer membrane
EcE24377A_0583  2       24       2.990464   ISEc3, transposase
EcE24377A_0584  1       26       4.528708   hypothetical protein
EcE24377A_0585  0       22       4.093761   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_0548  UREASE        55     3e-10    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 55.1 bits (133), Expect = 3e-10
Identities = 39/163 (23%), Positives = 59/163 (36%), Gaps = 32/163 (19%)

Query: 4 DLIIKNGTVILENEARVVDIAVKGGKIAAIG-------QD-----LGDAKEVMDASGLVV 51
D +I N ++ DI +K G+IAAIG Q +G EV+ G +V
Sbjct: 69 DTVITNALILDHWGIVKADIGLKDGRIAAIGKAGNPDMQPGVTIIVGPGTEVIAGEGKIV 128

Query: 52 SPGMVDAHTHISEPGRSHWEGYETGTRAAAKGGITTMIEMPLNQLPATVDRAS------- 104
+ G +D+H H P + A G+T M+ PA A+
Sbjct: 129 TAGGMDSHIHFICPQQIE---------EALMSGLTCMLGGGTG--PAHGTLATTCTPGPW 177

Query: 105 -IELKFDAAKGKLTIDAAQLGGLVSYNIDRLHELDEVGVVGFK 146
I +AA ++ A G + L E+ G K
Sbjct: 178 HIARMIEAADA-FPMNLAFAGKGNASLPGALVEMVLGGATSLK 219


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_0559  CARBMTKINASE  384    e-137    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 384 bits (987), Expect = e-137
Identities = 126/310 (40%), Positives = 175/310 (56%), Gaps = 16/310 (5%)

Query: 2 KTLVVALGGNALLQRGEALTAENQYRNIASAVPALARL-ARSYRLAIVHGNGPQVGLLAL 60
K +V+ALGGNAL QRG+ + E N+ +A + AR Y + I HGNGPQVG L L
Sbjct: 3 KRVVIALGGNALQQRGQKGSYEEMMDNVRKTARQIAEIIARGYEVVITHGNGPQVGSLLL 62

Query: 61 QNLAWKE---VDPYPLDVLVAESQGMIGYMLAQSLSAQPQM----PPVTTVLTRIEVSPD 113
A + + P+DV A SQG IGYM+ Q+L + + V T++T+ V +
Sbjct: 63 HMDAGQATYGIPAQPMDVAGAMSQGWIGYMIQQALKNELRKRGMEKKVVTIITQTIVDKN 122

Query: 114 DPVFLQPEKFIGPVYQPEEQKALEAAYGWQMKRD-GKYLRRVVASPQPRKILDSEAIELL 172
DP F P K +GP Y E K L GW +K D G+ RRVV SP P+ +++E I+ L
Sbjct: 123 DPAFQNPTKPVGPFYDEETAKRLAREKGWIVKEDSGRGWRRVVPSPDPKGHVEAETIKKL 182

Query: 173 LKEGHVVICSGGGGVPVTEDG---AGSEAVIDKDLAAALLAEQINADGLVILTDADAVYE 229
++ G +VI SGGGGVPV + G EAVIDKDLA LAE++NAD +ILTD +
Sbjct: 183 VERGVIVIASGGGGVPVILEDGEIKGVEAVIDKDLAGEKLAEEVNADIFMILTDVNGAAL 242

Query: 230 NWGTPQQRAIRHATPDELAPFAKAD----GSMGPKVTAVSGYVRSRGKPAWIGALSRIEE 285
+GT +++ +R +EL + + GSMGPKV A ++ G+ A I L + E
Sbjct: 243 YYGTEKEQWLREVKVEELRKYYEEGHFKAGSMGPKVLAAIRFIEWGGERAIIAHLEKAVE 302

Query: 286 TLAGEAGTCI 295
L G+ GT +
Sbjct: 303 ALEGKTGTQV 312


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_0566  RTXTOXIND     29     0.030    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 29.4 bits (66), Expect = 0.030
Identities = 16/150 (10%), Positives = 44/150 (29%), Gaps = 8/150 (5%)

Query: 299 RSQLNYSEENLKQARAALERLYTALRGTDKTVAPAGGEAFEARFIEAMDDDFNTP----- 353
+ ++ +L QAR R R + P E F +++
Sbjct: 133 EADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIK 192

Query: 354 EAYSVLFDMAREVNRLKAEDMAAANAMASHLRKLSAVLGLLEQEPEAFLQSGAQADDSEV 413
E +S + + + A + + + + + + + + F +
Sbjct: 193 EQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSS---LLHKQAI 249

Query: 414 AEIEALIQQRLDARKAKDWAAADAARDRLN 443
A+ L Q+ + + +++
Sbjct: 250 AKHAVLEQENKYVEAVNELRVYKSQLEQIE 279


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_0577  HTHFIS        61     4e-13    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 60.6 bits (147), Expect = 4e-13
Identities = 26/122 (21%), Positives = 55/122 (45%), Gaps = 2/122 (1%)

Query: 1 MKPTSVIIMDTHPIIRMSIEVLLQKNSELQIVLKTDDYRITIDYLRTRPVDLIIMDIDLP 60
M ++++ D IR + L + V T + ++ DL++ D+ +P
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGY--DVRITSNAATLWRWIAAGDGDLVVTDVVMP 58

Query: 61 GTDGFTFLKRIKQIQSTVKVLFLSSKSECFYAGRAIQAGANGFVSKCNDQNDIFHAVQMI 120
+ F L RIK+ + + VL +S+++ A +A + GA ++ K D ++ +
Sbjct: 59 DENAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRA 118

Query: 121 LS 122
L+
Sbjct: 119 LA 120


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_0581  PF07201       30     0.034    Hypersensitivity response secretion protein HrpJ
		>PF07201#Hypersensitivity response secretion protein HrpJ

Length = 293

Score = 30.2 bits (68), Expect = 0.034
Identities = 17/72 (23%), Positives = 25/72 (34%), Gaps = 1/72 (1%)

Query: 629 PAAVSDLRAALELEPNNSNIQAALGYALWDSGDIAQSREMLEQAHKGLPDDPALIRQLAY 688
VS+L + L N ++ Y S + ++ +ML L P L
Sbjct: 100 KQNVSELLSLL-SNSPNISLSQLKAYLEGKSEEPSEQFKMLCGLRDALKGRPELAHLSHL 158

Query: 689 VNQRLDDMPATQ 700
V Q L M Q
Sbjct: 159 VEQALVSMAEEQ 170


12  EcE24377A_0605  EcE24377A_0624  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
EcE24377A_0605  0       13       3.438239   mbtH-like protein
EcE24377A_0606  0       14       3.765591   enterobactin synthase subunit F
EcE24377A_0608  1       16       3.376822   ferric enterobactin transport protein FepE
EcE24377A_0607  1       16       5.584916   iron-enterobactin transporter ATP-binding
EcE24377A_0609  0       17       5.805065   iron-enterobactin transporter permease
EcE24377A_0610  -1      17       5.391884   iron-enterobactin transporter membrane protein
EcE24377A_0611  -1      17       4.851865   enterobactin exporter EntS
EcE24377A_0612  -2      18       4.687698   iron-enterobactin transporter periplasmic
EcE24377A_0613  -1      20       5.003538   isochorismate synthase
EcE24377A_0614  -1      21       4.915810   enterobactin synthase subunit E
EcE24377A_0615  0       20       4.817127   isochorismatase
EcE24377A_0616  -1      18       4.391593   2,3-dihydroxybenzoate-2,3-dehydrogenase
EcE24377A_0617  0       18       3.181227   hypothetical protein
EcE24377A_0618  0       14       1.631339   carbon starvation protein A
EcE24377A_0619  -1      19       -2.475338  hypothetical protein
EcE24377A_0620  -1      18       -2.882631  hypothetical protein
EcE24377A_0621  -1      15       -4.217180  aminotransferase
EcE24377A_0622  -1      16       -3.980930  IbrB protein
EcE24377A_0623  -1      17       -3.720230  IbrA protein
EcE24377A_0624  -1      18       -4.104845  LysR family transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_0611  TCRTETA       36     2e-04    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 36.3 bits (84), Expect = 2e-04
Identities = 82/394 (20%), Positives = 145/394 (36%), Gaps = 38/394 (9%)

Query: 27 FISIVSLGLLGVAVPVQIQMMTHSTWQV---GLSVTLTGGAMFVGLMVGGVLADRYERKK 83
+ V +GL+ +P ++ + HS G+ + L F V G L+DR+ R+
Sbjct: 15 ALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRP 74

Query: 84 VILLARGTCGIGFIGLCLNALL--PEPSLLAIYLLGLWDGFFASLGVTALLAATPALVGR 141
V+L + G ++ + P L +Y+ + G + G A A +
Sbjct: 75 VLL-------VSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVA-GAYIADITDG 126

Query: 142 ENLMQAGAITMLTVRLGSVISPMIGGLLLATGGVAWNYGLAAAGTFITLLPLLSLPALPP 201
+ + G V P++GGL+ GG + + AA L L LP
Sbjct: 127 DERARHFGFMSACFGFGMVAGPVLGGLM---GGFSPHAPFFAAAALNGLNFLTGCFLLPE 183

Query: 202 PPQPREHPLK----SLLAGFRFLLASPLVGGIALLGGLLTMAS----AVRVLYPALADNW 253
+ PL+ + LA FR+ +V + + ++ + A+ V++ D +
Sbjct: 184 SHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFG--EDRF 241

Query: 254 QMSAAQIGFLYAAIP-LGAAIGALTSGKLAHSARPGLLMLLSTLGS---FLAIGLFGLMP 309
A IG AA L + A+ +G +A ++L + ++ +
Sbjct: 242 HWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGW 301

Query: 310 MWILGVVCLALFGWLSAVSSLLQYTMLQTQTPEAMLGRINGLWTAQNVTGDAIGAALLGG 369
M +V LA G ML Q E G++ G A +G L
Sbjct: 302 MAFPIMVLLASGGIGMPALQ----AMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTA 357

Query: 370 LGAMMTPVASASASGFGLLIIGVLLLLVLVELRR 403
+ A + + +G+ + L LL L LRR
Sbjct: 358 IYA----ASITTWNGWAWIAGAALYLLCLPALRR 387


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_0612  FERRIBNDNGPP  63     2e-13    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 63.0 bits (153), Expect = 2e-13
Identities = 61/285 (21%), Positives = 102/285 (35%), Gaps = 35/285 (12%)

Query: 40 HTLESQPQRIVSTSVTLTGSLLAIDAPVIASGATTPNNRVADDQGFLRQWSKVAKERKLQ 99
H P RIV+ LLA+ VAD + R W E L
Sbjct: 29 HAAAIDPNRIVALEWLPVELLLALGIVPYG---------VADTINY-RLW---VSEPPLP 75

Query: 100 RLYIG-----EPSAEAVAAQMPDLILISATGGDSALALYDQLSTIAPTLIINYDDKS--- 151
I EP+ E + P ++ SA G S + L+ IAP N+ D
Sbjct: 76 DSVIDVGLRTEPNLELLTEMKPSFMVWSAGYGPS----PEMLARIAPGRGFNFSDGKQPL 131

Query: 152 --WQSLLTQLGEITGHEKQAAERIAQFDKQLAAAKEQIKLPPQPVTAIVYTAAAHSANLW 209
+ LT++ ++ + A +AQ++ + + K + + ++
Sbjct: 132 AMARKSLTEMADLLNLQSAAETHLAQYEDFIRSMKPRFVKRGARPLLLTTLIDPRHMLVF 191

Query: 210 TPESAQGQMLEQLGFTLAKLPAGLNASQSQGKRHDIIQLGGENLAAGLNGESLFLFAGDQ 269
P S ++L++ G NA Q + + + LAA + + L +
Sbjct: 192 GPNSLFQEILDEYGIP--------NAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDHDNS 243

Query: 270 KDADAIYANPLLAHLPAVQNKQVYALGTETFRLDYYSAMQVLERL 314
KD DA+ A PL +P V+ + + F SAM + L
Sbjct: 244 KDMDALMATPLWQAMPFVRAGRFQRVPAVWFYGATLSAMHFVRVL 288


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_0615  ISCHRISMTASE  443    e-161    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 443 bits (1141), Expect = e-161
Identities = 147/299 (49%), Positives = 195/299 (65%), Gaps = 18/299 (6%)

Query: 1 MAIPKLQAYALPESHDIPQNKVDWAFEPQRAALLIHDMQDYFVSFWGENCPMMEQVIANI 60
MAIP +Q Y +P + D+PQNKV W +P RA LLIHDMQ+YFV + + ++ ANI
Sbjct: 1 MAIPAIQPYQMPTASDMPQNKVSWVPDPNRAVLLIHDMQNYFVDAFTAGASPVTELSANI 60

Query: 61 AALRDYCKQHNIPVYYTAQPKEQSDEDRALLNDMWGPGLTRSPEQQKVVDRLTPDADDTV 120
L++ C Q IPV YTAQP Q+ +DRALL D WGPGL P ++K++ L P+ DD V
Sbjct: 61 RKLKNQCVQLGIPVVYTAQPGSQNPDDRALLTDFWGPGLNSGPYEEKIITELAPEDDDLV 120

Query: 121 LVKWRYSAFHRSPLEQMLKESGRNQLIITGVYAHIGCMTTATDAFMRDIKPFMVADALAD 180
L KWRYSAF R+ L +M+++ GR+QLIITG+YAHIGC+ TA +AFM DIK F V DA+AD
Sbjct: 121 LTKWRYSAFKRTNLLEMMRKEGRDQLIITGIYAHIGCLVTACEAFMEDIKAFFVGDAVAD 180

Query: 181 FSRDEHLMSLKYVAGRSGRVVMTEELL------PAPVPASKA-----------ALREVIL 223
FS ++H M+L+Y AGR VMT+ LL PA V + A +R+ I
Sbjct: 181 FSLEKHQMALEYAAGRCAFTVMTDSLLDQLQNAPADVQKTSANTGKKNVFTCENIRKQIA 240

Query: 224 PLLDESDEPFDDD-NLIDYGLDSVRMMALAARWRKVHGDIDFVMLAKNPTIDAWWKLLS 281
LL E+ E D +L+D GLDSVR+M L +WR+ ++ FV LA+ PTI+ W KLL+
Sbjct: 241 ELLQETPEDITDQEDLLDRGLDSVRIMTLVEQWRREGAEVTFVELAERPTIEEWQKLLT 299


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_0616  DHBDHDRGNASE  362    e-130    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase

signature.
Length = 261

Score = 362 bits (931), Expect = e-130
Identities = 110/258 (42%), Positives = 149/258 (57%), Gaps = 20/258 (7%)

Query: 5 GKNVWVTGAGKGIGYATALAFVEAGAKVTGFD---------------QAFTQEQYPFATE 49
GK ++TGA +GIG A A GA + D +A E +P
Sbjct: 8 GKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFP---- 63

Query: 50 VMDVADAAQVAQVCQRLLAETERLDALVNAAGILRMGATDQLSKEDWQQTFAVNVGGAFN 109
DV D+A + ++ R+ E +D LVN AG+LR G LS E+W+ TF+VN G FN
Sbjct: 64 -ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFN 122

Query: 110 LFQQTMNQFRRQRGGAIVTVASDAAHTPRIGMSAYGASKAALKSLALSVGLELAVSGVRC 169
+ +R G+IVTV S+ A PR M+AY +SKAA +GLELA +RC
Sbjct: 123 ASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRC 182

Query: 170 NVVSPGSTDTDMQRTLWVSDDAEEQRIRGFGEQFKLGIPLGKIARPQEIANTILFLASDL 229
N+VSPGST+TDMQ +LW ++ EQ I+G E FK GIPL K+A+P +IA+ +LFL S
Sbjct: 183 NIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQ 242

Query: 230 ASHITLQDIVVDGGSTLG 247
A HIT+ ++ VDGG+TLG
Sbjct: 243 AGHITMHNLCVDGGATLG 260


13  EcE24377A_0634  EcE24377A_0653  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
EcE24377A_0634  -1      18       3.192685   citrate transporter
EcE24377A_0635  -1      19       3.986091   triphosphoribosyl-dephospho-CoA synthase
EcE24377A_0636  -1      14       2.190174   2-(5''-triphosphoribosyl)-3'-dephosphocoenzyme-A
EcE24377A_0637  -1      12       1.280317   citrate lyase subunit alpha
EcE24377A_0638  -1      11       -0.198002  citrate (pro-3S)-lyase subunit beta
EcE24377A_0639  1       14       -1.969163  citrate lyase subunit gamma
EcE24377A_0641  1       15       -2.823600  hypothetical protein
EcE24377A_0642  0       14       -2.678176  sensor histidine kinase DpiB
EcE24377A_0643  0       14       -3.539398  two-component response regulator DpiA
EcE24377A_0644  0       16       -3.897806  C4-dicarboxylate transporter DcuC
EcE24377A_0645  0       21       -3.642525  hypothetical protein
EcE24377A_0646  -1      20       -2.610418  hypothetical protein
EcE24377A_0647  -1      14       -0.792615  palmitoyl transferase
EcE24377A_0648  -1      15       -0.355314  cold shock protein CspE
EcE24377A_0649  -2      14       -1.679918  camphor resistance protein CrcB
EcE24377A_0650  -2      13       -2.010075  carbon-nitrogen family hydrolase
EcE24377A_0651  -1      13       -3.346030  twin arginine translocase E
EcE24377A_0652  -2      13       -3.000386  lipoyl synthase
EcE24377A_0653  0       15       -4.594347  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_0639  PF03944       27     0.009    delta endotoxin
		>PF03944#delta endotoxin

Length = 633

Score = 27.3 bits (60), Expect = 0.009
Identities = 12/43 (27%), Positives = 24/43 (55%), Gaps = 3/43 (6%)

Query: 21 IAPLDTQDIDLQINSSVEKQFG---DAIRTTILDVLARYNVRG 60
I+P+ ++ Q + + ++FG D++R + ARY +RG
Sbjct: 496 ISPIHATQVNNQTRTFISEKFGNQGDSLRFEQNNTTARYTLRG 538


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_0643  HTHFIS        62     2e-13    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 62.2 bits (151), Expect = 2e-13
Identities = 28/121 (23%), Positives = 51/121 (42%), Gaps = 5/121 (4%)

Query: 1 MTAPLTLLIVEDETPLAEMHAEYIRHIPGFSQILLAGNLAQARMMIERFKPGLILLDNYL 60
MT T+L+ +D+ + + + + G+ + + N A I L++ D +
Sbjct: 1 MTGA-TILVADDDAAIRTVLNQALS-RAGY-DVRITSNAATLWRWIAAGDGDLVVTDVVM 57

Query: 61 PDGRGINLLHELVQAHYPG-DVVFTTAASDMETVSEAVRCGVFDYLIKPIAYERLGQTLT 119
PD +LL + + P V+ +A + T +A G +DYL KP L +
Sbjct: 58 PDENAFDLLPRI-KKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIG 116

Query: 120 R 120
R
Sbjct: 117 R 117


14  EcE24377A_0667  EcE24377A_0675  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
EcE24377A_0667  2       19       -0.717878  LPS-assembly lipoprotein RlpB
EcE24377A_0668  0       15       -1.612715  leucyl-tRNA synthetase
EcE24377A_0669  2       24       -4.624737  hypothetical protein
EcE24377A_0670  2       29       -5.864680  hypothetical protein
EcE24377A_0671  3       30       -6.031917  hypothetical protein
EcE24377A_0672  2       27       -5.389758  hypothetical protein
EcE24377A_0673  0       23       -4.253792  hypothetical protein
EcE24377A_0674  -1      20       -3.590560  DnaJ domain-containing protein
EcE24377A_0675  -1      15       -3.013576  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_0675  SACTRNSFRASE  28     0.011    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 28.4 bits (63), Expect = 0.011
Identities = 12/63 (19%), Positives = 24/63 (38%), Gaps = 2/63 (3%)

Query: 31 KAMVNYAFDYLRSPGS--LPFTTAATELSAIHGHSTSQYRLGEFYLHGSDGKPLDYTQAR 88
A+++ A ++ + L T +SA H ++ + +G P A
Sbjct: 108 TALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFIIGAVDTMLYSNFPTANEIAI 167

Query: 89 YWY 91
+WY
Sbjct: 168 FWY 170


15  EcE24377A_0719  EcE24377A_0727  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
EcE24377A_0719  -1      17       4.139592   hypothetical protein
EcE24377A_0720  -2      17       4.488748   DNA-binding transcriptional activator KdpE
EcE24377A_0721  -2      16       4.214960   sensor protein KdpD
EcE24377A_0722  0       20       5.505632   potassium-transporting ATPase subunit C
EcE24377A_0723  1       20       5.293112   potassium-transporting ATPase subunit B
EcE24377A_0724  2       22       4.101291   potassium-transporting ATPase subunit A
EcE24377A_0725  3       25       3.520862   K+-transporting ATPase subunit F
EcE24377A_0726  2       23       3.394657   hypothetical protein
EcE24377A_0727  1       20       3.179471   protein rhsA
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_0720  HTHFIS        91     1e-23    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 91.4 bits (227), Expect = 1e-23
Identities = 35/125 (28%), Positives = 58/125 (46%), Gaps = 1/125 (0%)

Query: 2 TNVLIVEDEQAIRRFLRTALEGDGMRVYEAETLQRGLLEAATRKPDLIILDLGLPDGDGI 61
+L+ +D+ AIR L AL G V A DL++ D+ +PD +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 62 EFIRDLRQWSA-VPVIVLSARSEESDKIAALDAGADDYLSKPFGIGELQARLRVALRRHS 120
+ + +++ +PV+V+SA++ I A + GA DYL KPF + EL + AL
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 121 ATAAP 125
+
Sbjct: 124 RRPSK 128


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_0721  PF06580       32     0.011    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 31.8 bits (72), Expect = 0.011
Identities = 10/48 (20%), Positives = 21/48 (43%), Gaps = 4/48 (8%)

Query: 785 LLENAVKYAGAQAE----IGIDAHVEGENLQLDVWDNGPGLPPGQEQT 828
L+EN +K+ AQ I + + + L+V + G +++
Sbjct: 263 LVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKES 310


16  EcE24377A_0740  EcE24377A_0769  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
EcE24377A_0740  -1      18       -3.452800  endonuclease VIII
EcE24377A_0742  0       17       -3.588138  hypothetical protein
EcE24377A_0743  0       18       -2.782800  periplasmic pilus chaperone family protein
EcE24377A_0744  0       18       -2.745861  fimbrial usher family protein
EcE24377A_0745  1       24       -0.435004  type 1 fimbrial protein
EcE24377A_0746  1       23       0.082559   hypothetical protein
EcE24377A_0747  1       26       2.106504   type II citrate synthase
EcE24377A_0748  2       26       3.119706   succinate dehydrogenase cytochrome b556 large
EcE24377A_0749  2       28       3.357788   succinate dehydrogenase cytochrome b556 small
EcE24377A_0750  2       30       3.269667   succinate dehydrogenase flavoprotein subunit
EcE24377A_0751  2       26       2.271038   succinate dehydrogenase iron-sulfur subunit
EcE24377A_0752  1       21       2.616684   2-oxoglutarate dehydrogenase E1
EcE24377A_0753  -2      11       0.615185   dihydrolipoamide succinyltransferase
EcE24377A_0754  -3      9        -0.200161  succinyl-CoA synthetase subunit beta
EcE24377A_0755  -3      9        -0.368048  succinyl-CoA synthetase subunit alpha
EcE24377A_0756  -2      9        -0.166029  DNA-binding transcriptional repressor MngR
EcE24377A_0757  -2      10       0.275037   PTS system 2-O-a-mannosyl-D-glycerate specific
EcE24377A_0758  0       14       -0.750532  alpha-mannosidase
EcE24377A_0760  3       23       0.302575   hypothetical protein
EcE24377A_0761  1       21       0.423201   cytochrome d ubiquinol oxidase, subunit I
EcE24377A_0762  1       19       -0.016576  cytochrome d ubiquinol oxidase, subunit II
EcE24377A_0763  8       18       -0.487035  cyd operon protein YbgT
EcE24377A_0764  4       20       0.032282   hypothetical protein
EcE24377A_0765  4       23       0.118382   acyl-CoA thioester hydrolase
EcE24377A_0766  4       22       -0.053249  colicin uptake protein TolQ
EcE24377A_0767  4       21       -0.152969  colicin uptake protein TolR
EcE24377A_0768  4       19       -0.530259  cell envelope integrity inner membrane protein
EcE24377A_0769  2       16       -0.459265  translocation protein TolB
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_0744  PF00577       608    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 608 bits (1570), Expect = 0.0
Identities = 239/862 (27%), Positives = 388/862 (45%), Gaps = 64/862 (7%)

Query: 22 RLSFVSCLVVAMPCALA-VEFNLNVLDKSMRDRIDISLLKEKGVIAPGEYFVSVAVNNNQ 80
RL P + A + FN L + D+S + + PG Y V + +NN
Sbjct: 29 RLFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPGTYRVDIYLNNGY 88

Query: 81 ISNGQKINWHKNDDK--TIPCINDLLVDKFGLKPEVRQSLPLI--NQCVDFSSR-PEMLF 135
++ + + ++ D + +PC+ + GL + L+ + CV +S +
Sbjct: 89 MAT-RDVTFNTGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDACVPLTSMIHDATA 147

Query: 136 NFDQANQQLNISIPQAWLAWHSENWTPPSTWKEGVAGILMDYNLFASSYRPQDGSSSTNL 195
D Q+LN++IPQA+++ + + PP W G+ L++YN +S + + G +S
Sbjct: 148 QLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSVQNRIGGNSHYA 207

Query: 196 NAYGTTGINAGAWRLRSDYQLNQTDSDDNHEQSGGI--SRTYLFRPLPQLGSKLTLGETD 253
+G+N GAWRLR + + SD + T+L R + L S+LTLG+
Sbjct: 208 YLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRSRLTLGDGY 267

Query: 254 FSSNIFDGFSYTGAALASDERMLPWELRGYAPQISGIAQTNATVTISQSGRVIYQKKVPP 313
+IFDG ++ GA LASD+ MLP RG+AP I GIA+ A VTI Q+G IY VPP
Sbjct: 268 TQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIYNSTVPP 327

Query: 314 GPFIIDDLNQ-SVQGTLDVKVTEEDGRVNNFQVSAASTPFLTRQGQVRYKLAAGQPRPSM 372
GPF I+D+ G L V + E DG F V +S P L R+G RY + AG+ R
Sbjct: 328 GPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITAGEYRSG- 386

Query: 373 SHQTENETFFSNEVSWGMLSNTSLYGGLLLSGDDYHSAAMGIGQNMLWLGALSFDVTWAS 432
+ Q E FF + + G+ + ++YGG L+ D Y + GIG+NM LGALS D+T A+
Sbjct: 387 NAQQEKPRFFQSTLLHGLPAGWTIYGGTQLA-DRYRAFNFGIGKNMGALGALSVDMTQAN 445

Query: 433 SHFDTQQDERGLSYRFNYSKQVDATNSTISLAAYRFSDRHFHSYANYLDHKYND------ 486
S G S RF Y+K ++ + + I L YR+S + ++A+ + N
Sbjct: 446 STLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGYNIETQ 505

Query: 487 --------------SDAQDEKQTISLSVGQPITPLNLNLYANLLHQTWWNADASTTANIT 532
+ A +++ + L+V Q + LY + HQT+W
Sbjct: 506 DGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQL-GRTSTLYLSGSHQTYWGTSNVDE-QFQ 563

Query: 533 AGFNVDIGDWRDISISTSFNTTHYE-DKDRDNQIYLSISLPFGNGGR-----------VG 580
AG N + DI+ + S++ T K RD + L++++PF + R
Sbjct: 564 AGLNT---AFEDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRSDSKSQWRHASAS 620

Query: 581 YDMQNSSHS-TTHRMSWNDTLDERN--SWGMSAGL-QSDRPDNGAQVSGNYQHLSSAGEW 636
Y M + + T+ TL E N S+ + G ++G+ + G
Sbjct: 621 YSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGYGNA 680

Query: 637 DISGTYAANDYSSVSSSWSGSFTATQYGAAFHRRSSTNEPRLMVSTDGVADIPVQGNLDY 696
+I ++ ++D + SG A G + N+ ++V G D V+
Sbjct: 681 NIGYSH-SDDIKQLYYGVSGGVLAHANGVTLGQP--LNDTVVLVKAPGAKDAKVENQTGV 737

Query: 697 -TNHFGIAVVPLISSYQPSTVAVNMNDLPDGVTVAENVIKETWIEGAIGYKSLASRSGKD 755
T+ G AV+P + Y+ + VA++ N L D V + V GAI +R G
Sbjct: 738 RTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAEFKARVGIK 797

Query: 756 VNVIIRNASGQFPPLGADIRQDDSGISVGMVGEEGHAWLSGVAENQKFTVVWG--DSQHC 813
+ + + + + P GA + +S S G+V + G +LSG+ K V WG ++ HC
Sbjct: 798 LLMTLT-HNNKPLPFGAMV-TSESSQSSGIVADNGQVYLSGMPLAGKVQVKWGEEENAHC 855

Query: 814 SLH--LPEH-MEDTANRLILPC 832
+ LP + +L C
Sbjct: 856 VANYQLPPESQQQLLTQLSAEC 877


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_0745  FIMBRIALPAPE  35     7e-05    Escherichia coli: P pili tip fibrillum papE protein...
		>FIMBRIALPAPE#Escherichia coli: P pili tip fibrillum papE protein

signature.
Length = 173

Score = 35.0 bits (80), Expect = 7e-05
Identities = 43/179 (24%), Positives = 74/179 (41%), Gaps = 26/179 (14%)

Query: 14 SLLFTAPVYAADEGSGEIHFKGEVIEAPCEIHQDDIDKEVELGQVTTSHINQS-HHSDAV 72
++L + V+AAD + FKG++I C + + EV G + ++ QS +
Sbjct: 15 AVLMSQHVHAADN----LTFKGKLIIPACTVQ----NAEVNWGDIEIQNLVQSGGNQKDF 66

Query: 73 AVDLRLVNCDLENSSNGSGGKISKVAVTFDSSAKTTGADPILNNTSTGEATGVGVRLMNK 132
VD+ NC + + VT S+ TG ++ NTST G+ + L N
Sbjct: 67 TVDM---NCPYS---------LGTMKVTITSNG-QTGNSILVPNTSTASGDGLLIYLYNS 113

Query: 133 DQSNI----VLGTATPDIDLAPTSSEQTLNFFAWMEQIDQATPVTPGAVTANATYVLDY 187
+ S I LG+ + T+ + + +A + + G +A AT V Y
Sbjct: 114 NNSGIGNAVTLGSQVTPGKITGTAPARKITLYAKLGYKGNMQSLQAGTFSATATLVASY 172


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_0751  TCRTETOQM     31     0.003    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family

signature.
Length = 639

Score = 31.4 bits (71), Expect = 0.003
Identities = 11/41 (26%), Positives = 23/41 (56%), Gaps = 1/41 (2%)

Query: 14 VDDAPRMQDYTLEAEEGRDM-MLLDALIQLKEKDPSLSFRR 53
+++ + T+E + + MLLDAL+++ + DP L +
Sbjct: 339 IENPLPLLQTTVEPSKPQQREMLLDALLEISDSDPLLRYYV 379


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_0753  RTXTOXIND     30     0.020    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 29.8 bits (67), Expect = 0.020
Identities = 27/196 (13%), Positives = 56/196 (28%), Gaps = 12/196 (6%)

Query: 48 EVPASADGILDAVLEDEGTTVTSRQILGRLREGNSAGKETSAKSE-EKASTPAQRQQASL 106
E+ + I+ ++ EG +V +L +L + +S +A R Q
Sbjct: 98 EIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILS 157

Query: 107 EEQNNDAL----SPAIRRLLAEHNLDASAIKGTGVGGRLTRED----VEKHLAKAPAKES 158
+ L P + + T ++ E +L K A+
Sbjct: 158 RSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERL 217

Query: 159 APAAAAPAAQPALAARSEKRVPMTRLRKRVA---ERLLEAKNSTAMLTTFNEVNMKPIMD 215
A + + + L + A +LE +N V +
Sbjct: 218 TVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQ 277

Query: 216 LRKQYGEAFEKRHGIR 231
+ + A E+ +
Sbjct: 278 IESEILSAKEEYQLVT 293


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_0768  IGASERPTASE   64     8e-13    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 63.5 bits (154), Expect = 8e-13
Identities = 40/213 (18%), Positives = 72/213 (33%), Gaps = 14/213 (6%)

Query: 99 EQERLKQLEKERLAAQEQKKQAEEAAKQAELKQKQAEEAAAKAAADAKAKAEADAKAAEE 158
E E+ Q QA+ + E A A ++ E
Sbjct: 984 EVEKRNQTVDTTNITTPNNIQADVPSVP----SNNEEIARVDEAPVPPPAPATPSETTET 1039

Query: 159 AAK--KAAADAKKKAEAEAAKAAAEAQKKAEAAAAALKKKAEAAEAA--AAEARKKAATE 214
A+ K + +K E +A + A+ ++ A+ A + +K + E A +E ++ TE
Sbjct: 1040 VAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTE 1099

Query: 215 AAEKAKAEAEKKAAAEKAAAEKKAAAEKAAADKKAAAEKAAADKKAAEKAAAEKAAADKK 274
E A E E+KA K EK K + E++ + AE A
Sbjct: 1100 TKETATVEKEEKA---KVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTV--- 1153

Query: 275 AAAEKAAADKKAAAAKAAAEKAAAAKAAAEADD 307
E + A + A++ ++ +
Sbjct: 1154 NIKEPQSQTNTTADTEQPAKETSSNVEQPVTES 1186



Score = 58.9 bits (142), Expect = 3e-11
Identities = 32/245 (13%), Positives = 87/245 (35%), Gaps = 11/245 (4%)

Query: 68 QSQESSAKRSDEQRKMKEQQAAEELREKQAAEQERLKQLEKERLAAQEQKKQAEEAAKQA 127
Q+ S ++E+ ++ +E ++ + +K E+ A +
Sbjct: 1004 QADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKN--EQDATET 1061

Query: 128 ELKQKQ-AEEAAAKAAAD------AKAKAEADAKAAEEAAKKAAADAKKKAEAEAAKAAA 180
+ ++ A+EA + A+ A++ +E E + A + ++KA+ E K
Sbjct: 1062 TAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQE 1121

Query: 181 EAQKKAEAAAAALKKKAEAAEAAAAEARKKAATEAAEKAKAEAEKKAAAEKAAAEKKAAA 240
+ ++ + ++++E + A AR+ T ++ +++ A E+ A E +
Sbjct: 1122 VPKVTSQVSPK--QEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNV 1179

Query: 241 EKAAADKKAAAEKAAADKKAAEKAAAEKAAADKKAAAEKAAADKKAAAAKAAAEKAAAAK 300
E+ + + + A ++ K + + A
Sbjct: 1180 EQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATT 1239

Query: 301 AAAEA 305
++ +
Sbjct: 1240 SSNDR 1244



Score = 58.2 bits (140), Expect = 4e-11
Identities = 29/238 (12%), Positives = 78/238 (32%), Gaps = 7/238 (2%)

Query: 86 QQAAEELREKQAAEQERLKQLEKERLAAQEQKKQAEEAAKQAELKQKQAEEAAAKAAADA 145
+ AE +++ + K + + ++ A+EA + + E A + +
Sbjct: 1038 ETVAENSKQESKTVE---KNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKE 1094

Query: 146 KAKAEADAKAAEEAAKKAAADAKKKAEAEAAKAAAEAQKKAEAAAAALKKKAEAAEAAAA 205
E A E +KA + +K E K ++ K E + + A E
Sbjct: 1095 TQTTETKETATVEKEEKAKVETEKT--QEVPKVTSQVSPKQEQSETVQPQAEPARENDPT 1152

Query: 206 EARKKAATEAAEKAKAE--AEKKAAAEKAAAEKKAAAEKAAADKKAAAEKAAADKKAAEK 263
K+ ++ A E A++ ++ + + + + A +
Sbjct: 1153 VNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVN 1212

Query: 264 AAAEKAAADKKAAAEKAAADKKAAAAKAAAEKAAAAKAAAEADDIFGELSSGKNAPKT 321
+ + ++ + ++ A ++ +++ A + + LS + +
Sbjct: 1213 SESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCDLTSTNTNAVLSDARAKAQF 1270



Score = 55.5 bits (133), Expect = 3e-10
Identities = 39/276 (14%), Positives = 95/276 (34%), Gaps = 25/276 (9%)

Query: 51 DAVMVDSGAVVEQYKRMQSQESSAKRSDEQRKMKEQQAAE-ELREKQAAEQER------L 103
D V A + ++ ++K+ + + EQ A E + ++ A++ +
Sbjct: 1021 DEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANT 1080

Query: 104 KQLEKERLAAQEQKKQAEEAAKQAELKQKQAEEAAAKAAADAKAKAEADAKAAEEAAKKA 163
+ E + ++ ++ Q K+ +K+ KAK E + +E K
Sbjct: 1081 QTNEVAQSGSETKETQ-TTETKETATVEKE-----------EKAKVETEKT--QEVPKVT 1126

Query: 164 AADAKKKAEAEAAKAAAEAQKKAEAAAAALKKKAEAAEAAAAEARKKAATEAAEKAKAEA 223
+ + K+ ++E + AE ++ + + +++ A E A E + +
Sbjct: 1127 SQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQ---PAKETSSNVEQPV 1183

Query: 224 EKKAAAEKAAAEKKAAAEKAAADKKAAAEKAAADKKAAEKAAAEKAAADKKAAAEKAAAD 283
+ + + A + +++K + ++ A ++ D
Sbjct: 1184 TESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSND 1243

Query: 284 KKAAA-AKAAAEKAAAAKAAAEADDIFGELSSGKNA 318
+ A + A + A A F L+ GK
Sbjct: 1244 RSTVALCDLTSTNTNAVLSDARAKAQFVALNVGKAV 1279



Score = 53.5 bits (128), Expect = 1e-09
Identities = 26/237 (10%), Positives = 69/237 (29%), Gaps = 9/237 (3%)

Query: 66 RMQSQESSAKRSDEQRKMKEQQAAEELREKQAAEQERLKQLEKERLAAQEQKKQAEEAAK 125
R ++E+ + + + Q+ E +E Q E + +EKE A E +K E
Sbjct: 1066 REVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKV 1125

Query: 126 QAELKQKQAEEAAAKAAADAKAKAEADAKAAEEAAKKAAADAKKKAEAEAAKAAAEAQKK 185
+++ KQ + + A+ A E + + + A +
Sbjct: 1126 TSQVSPKQEQSETVQPQAEP---------ARENDPTVNIKEPQSQTNTTADTEQPAKETS 1176

Query: 186 AEAAAAALKKKAEAAEAAAAEARKKAATEAAEKAKAEAEKKAAAEKAAAEKKAAAEKAAA 245
+ + + E + + + ++
Sbjct: 1177 SNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEP 1236

Query: 246 DKKAAAEKAAADKKAAEKAAAEKAAADKKAAAEKAAADKKAAAAKAAAEKAAAAKAA 302
++ +++ +D +A A+ A + A ++ ++ +
Sbjct: 1237 ATTSSNDRSTVALCDLTSTNTNAVLSDARAKAQFVALNVGKAVSQHISQLEMNNEGQ 1293



Score = 42.0 bits (98), Expect = 4e-06
Identities = 38/292 (13%), Positives = 91/292 (31%), Gaps = 24/292 (8%)

Query: 59 AVVEQYKRMQSQESSAKRSDEQRKMKEQQAAEELREKQAAEQERLKQLEKERLAAQEQKK 118
A + + QS + + + K E EK E E+ +++ K +++
Sbjct: 1078 ANTQTNEVAQSGSETKETQTTETKETATVEKE---EKAKVETEKTQEVPKVTSQVSPKQE 1134

Query: 119 QAEEAAKQAELKQKQAEEAAAKAAADAKAKAEADAKAAEEAAKKAAADAKKKAEAEAAKA 178
Q+E QAE ++ K + A+E + + +
Sbjct: 1135 QSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNS 1194

Query: 179 AAEAQKKAEAAAAALKKKAEAAEAAAAEARK---------KAATEAAEKAKAEAEKKAA- 228
E + A +E++ R+ + AT ++ A
Sbjct: 1195 VVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCDLTS 1254

Query: 229 --AEKAAAEKKAAAEKAAADKKAAAEKAAADKKAAEKAAAEKAAADKKAAAEKAAADKKA 286
++ +A A+ A + A + + + + ++ +++ +
Sbjct: 1255 TNTNAVLSDARAKAQFVALNVGKAVSQHISQLEMNNEGQYNVWVSNTSMNKNYSSSQYRR 1314

Query: 287 AAAKAAAEKAAAAKAAAEADDIFGELSSGKNAPKTGGGAKGNNASPAGSGNT 338
++K+ + + + + G + +N+ NN A S NT
Sbjct: 1315 FSSKSTQTQLGWDQTISNNVQLGGVFTYVRNS---------NNFDKATSKNT 1357


17  EcE24377A_0794  EcE24377A_0837  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias     %GCBias  Product
EcE24377A_0794       2       12   -0.918348  6-phosphogluconolactonase
EcE24377A_0795       0       13   -2.181700  LysR family transcriptional regulator
EcE24377A_0796       0       13   -1.560628  hypothetical protein
EcE24377A_0797      -1       14   -1.481213  anion transporter
EcE24377A_0798      -2       16   -1.542422  hypothetical protein
EcE24377A_0799       1       23   -2.054472  pectinesterase
EcE24377A_0800       4       30   -2.937175  phage integrase
EcE24377A_0801       4       23   -1.172578  phage/conjugal plasmid C-4 type zinc finger
EcE24377A_0802       3       21   -0.934307  hypothetical protein
EcE24377A_0803       2       22   -1.773743  hypothetical protein
EcE24377A_0804       3       21   -1.524934  exonuclease
EcE24377A_0805       2       22   -1.328516  phage recombination protein Bet
EcE24377A_0806       2       24   -1.848670  host-nuclease inhibitor protein Gam
EcE24377A_0807       2       26   -2.832658  protein KiL
EcE24377A_0808       2       25    0.732373  repressor protein
EcE24377A_0809       1       23    2.618736  hypothetical protein
EcE24377A_0811       1       25    2.909579  replication protein P
EcE24377A_0812       2       27    0.224159  hypothetical protein
EcE24377A_0813       3       24    0.159713  multidrug efflux protein
EcE24377A_0815       3       24    1.152388  IS66 family transposase
EcE24377A_0816       4       26   -2.313023  IS66 family orf2
EcE24377A_0817       5       30   -3.644587  IS66 family orf1
EcE24377A_0818       4       30   -5.098634  hypothetical protein
EcE24377A_0819       3       29   -4.431075  hypothetical protein
EcE24377A_0820       2       31   -5.525063  hypothetical protein
EcE24377A_0821       2       32   -6.187024  endodeoxyribonuclease RUS
EcE24377A_0822       3       31   -6.899117  hypothetical protein
EcE24377A_0823       3       31   -6.262746  phage antitermination Q
EcE24377A_0824       4       31   -5.492911  outer membrane porin protein LC
EcE24377A_0825       2       30   -4.207252  hypothetical protein
EcE24377A_0826       1       27   -2.443013  lysis protein S
EcE24377A_0827       1       20   -0.520041  phage lysozyme
EcE24377A_0828       1       19   -0.035200  bacteriophage lysis protein
EcE24377A_0829       2       18   -0.436390  KilA domain-containing protein
EcE24377A_0830       2       22   -5.759189  hypothetical protein
EcE24377A_0831       2       20   -5.220917  phage DNA packaging protein Nu1
EcE24377A_0832       2       17   -3.750037  phage terminase large subunit (GpA)
EcE24377A_0833       2       19   -4.012531  phage tail collar domain-containing protein
EcE24377A_0834       2       18   -2.523213  tail fiber assembly protein
EcE24377A_0835       2       17   -1.500789  diguanylate cyclase
EcE24377A_0836       0       13    4.029303  kinase inhibitor protein
EcE24377A_0837      -2       13    3.533958  adenosylmethionine-8-amino-7-oxononanoate
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_0824  ECOLIPORIN      505      0.0  E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 505 bits (1303), Expect = 0.0
Identities = 239/388 (61%), Positives = 278/388 (71%), Gaps = 33/388 (8%)

Query: 1 MKKLTVAISAVAASVLMAMSAQAAEIYNKDSNKLDLYGKVNAKHYFSSNDADDGDTTYAR 60
MK+ +A+ V ++L A +A AAEIYNKD NKLDLYGKV+ HYFS + + DGD TY R
Sbjct: 1 MKRKVLAL--VIPALLAAGAAHAAEIYNKDGNKLDLYGKVDGLHYFSDDSSKDGDQTYMR 58

Query: 61 LGFKGETQINDQLTGFGQWEYEFKGNRAESQGSSKDKTRLAFAGLKFGDYGSIDYGRNYG 120
+GFKGETQINDQLTG+GQWEY + N E +G++ TRLAFAGLKFGDYGS DYGRNYG
Sbjct: 59 VGFKGETQINDQLTGYGQWEYNVQANTTEGEGANS-WTRLAFAGLKFGDYGSFDYGRNYG 117

Query: 121 VAYDIGAWTDVLPEFGGDTWTQTDVFMTGRTTGVATYRNNDFFGLVDGLNFAAQYQGKND 180
V YD+ WTD+LPEFGGD++T D +MTGR GVATYRN DFFGLVDGLNFA QYQGKN+
Sbjct: 118 VLYDVEGWTDMLPEFGGDSYTYADNYMTGRANGVATYRNTDFFGLVDGLNFALQYQGKNE 177

Query: 181 RNE----------------VTEANGDGFGFSTTYEY-EGFGVGATYAKSDRTNNQVIYGN 223
+ NGDGFG STTY+ GF GA Y SDRTN QV
Sbjct: 178 SQSADDVNIGTNNRNNGDDIRYDNGDGFGISTTYDIGMGFSAGAAYTTSDRTNEQV--NA 235

Query: 224 NGLNASGQNAEVWAAGLKYDANNIYLATTYSETQNMTVFG------NNHIANKAQNFEAV 277
G A G A+ W AGLKYDANNIYLAT YSET+NMT +G + +ANK QNFE
Sbjct: 236 GGTIAGGDKADAWTAGLKYDANNIYLATMYSETRNMTPYGKTDKGYDGGVANKTQNFEVT 295

Query: 278 AQYQFDFGLRPSVAYLHSKGKDLGV----WGDQDLVEYVDVGATYYFNKNMSTFVDYKIN 333
AQYQFDFGLRP+V++L SKGKDL D+DLV+Y DVGATYYFNKN ST+VDYKIN
Sbjct: 296 AQYQFDFGLRPAVSFLMSKGKDLTYNNVNGDDKDLVKYADVGATYYFNKNFSTYVDYKIN 355

Query: 334 LIDKSD-FTKASGVATDDIVAVGMVYQF 360
L+D D F K +G++TDDIVA+GMVYQF
Sbjct: 356 LLDDDDPFYKDAGISTDDIVALGMVYQF 383


18  EcE24377A_0855  EcE24377A_0860  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias     %GCBias  Product
EcE24377A_0855      -2       22    3.066469  hypothetical protein
EcE24377A_0854      -2       21    3.424960  hypothetical protein
EcE24377A_0857      -2       21    3.303348  hypothetical protein
EcE24377A_0856      -2       20    3.660503  ABC-2 type transporter permease
EcE24377A_0858      -2       19    4.078942  ABC-2 type transporter permease
EcE24377A_0859      -2       18    3.904588  ABC transporter ATP-binding protein
EcE24377A_0860      -2       15    3.337050  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_0856  ABC2TRNSPORT     47    3e-08  ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 47.2 bits (112), Expect = 3e-08
Identities = 36/146 (24%), Positives = 63/146 (43%), Gaps = 5/146 (3%)

Query: 197 AREREQGTLDQLLVSPLTTWQIFIGKAVPALIVATFQATIVLAIGIWAYQIPFAGSLALF 256
R Q T + +L + L I +G+ A A IG+ A + + L+L
Sbjct: 92 GRMEGQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGA---GIGVVAAALGYTQWLSLL 148

Query: 257 YFTMVI--YGLSLVGFGLLISSLCSTQQQAFIGVFVFMMPAILLSGYVSPVENMPVWLQN 314
Y VI GL+ G+++++L + + + P + LSG V PV+ +P+ Q
Sbjct: 149 YALPVIALTGLAFASLGMVVTALAPSYDYFIFYQTLVITPILFLSGAVFPVDQLPIVFQT 208

Query: 315 LTWINPIRHFTDITKQIYLKDASLDI 340
P+ H D+ + I L +D+
Sbjct: 209 AARFLPLSHSIDLIRPIMLGHPVVDV 234


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_0859  PF05272          32    0.012  Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 31.6 bits (71), Expect = 0.012
Identities = 20/90 (22%), Positives = 28/90 (31%), Gaps = 21/90 (23%)

Query: 293 TPRFEDAFIDLLGGAGTSESPLGAILHTVEGTPGETVIEAKELTKKFGDFAATDHVNFAV 352
PR E + +LG P + + + K HV +
Sbjct: 547 VPRLEKWLVHVLGKTPDDYKP-------------RRLRYLQLVGKYI----LMGHVARVM 589

Query: 353 KRGEIFG----LLGPNGAGKSTTFKMMCGL 378
+ G F L G G GKST + GL
Sbjct: 590 EPGCKFDYSVVLEGTGGIGKSTLINTLVGL 619



Score = 29.7 bits (66), Expect = 0.046
Identities = 11/23 (47%), Positives = 13/23 (56%)

Query: 34 YVTGLVGPDGAGKTTLMRMLAGL 56
Y L G G GK+TL+ L GL
Sbjct: 597 YSVVLEGTGGIGKSTLINTLVGL 619


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_0860  RTXTOXIND        63    6e-13  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 62.5 bits (152), Expect = 6e-13
Identities = 42/259 (16%), Positives = 92/259 (35%), Gaps = 25/259 (9%)

Query: 82 ALMQAKAGVSVAQAQYDLMLAGYRDEEIAQAAAAVKQAQAAYDYAQNFYNRQQGLWKSRT 141
Q + + +A+ +LA E + + + + L +
Sbjct: 201 QKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENK 260

Query: 142 ISA--NDLENARSSRDQAQATLKSAQDKLRQYRSGNREQ---DIAQAKASLEQAQAQLAQ 196
N+L +S +Q ++ + SA+++ + + + + Q ++ +LA+
Sbjct: 261 YVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAK 320

Query: 197 AELNLQDSTLIAPSDGTLLTRAV-EPGTVLNEGGTVFTVSLT-RPVWVRAYVDERNLDQA 254
E Q S + AP + V G V+ T+ + + V A V +++
Sbjct: 321 NEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFI 380

Query: 255 QPGRKVLLYTDGRPDKPYH---GQIGFVSPTAEFTPKTVETPDLRTDLVYRLRIVVT--- 308
G+ ++ + P Y G++ ++ A D R LV+ + I +
Sbjct: 381 NVGQNAIIKVEAFPYTRYGYLVGKVKNINLDA--------IEDQRLGLVFNVIISIEENC 432

Query: 309 ----DADDALRQGMPVTVQ 323
+ + L GM VT +
Sbjct: 433 LSTGNKNIPLSSGMAVTAE 451


19  EcE24377A_0894  EcE24377A_0904  Y  N  N  Genomic Island
LocusTag        DNBias  CDNBias     %GCBias  Product
EcE24377A_0894      -2       13    3.327920  formate C-acetyltransferas
EcE24377A_0895      -1       11    3.294112  glycyl-radical activating family protein
EcE24377A_0896      -2       12    2.887550  fructose-6-phosphate aldolase
EcE24377A_0897      -2       12    2.834042  molybdopterin biosynthesis protein MoeB
EcE24377A_0898      -1       15    2.533426  molybdopterin biosynthesis protein MoeA
EcE24377A_0899       0       16   -1.145554  L-asparaginase
EcE24377A_0900       0       16   -2.855926  glutathione transporter ATP-binding protein
EcE24377A_0901       0       13   -3.431152  glutathione ABC transporter periplasmic
EcE24377A_0902       0       12   -4.329352  glutathione ABC transporter permease GsiC
EcE24377A_0903       1       10   -4.455803  glutathione ABC transporter permease GsiD
EcE24377A_0904       1        9   -4.780844  cyclic diguanylate phosphodiesterase
20  EcE24377A_0971  EcE24377A_0999  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias     %GCBias  Product
EcE24377A_0971       2       17    0.197840  dimethylsulfoxide reductase subunit B
EcE24377A_0970       2       13   -1.610643  hypothetical protein
EcE24377A_0972       1       14   -2.884051  anaerobic dimethyl sulfoxide reductase subunit
EcE24377A_0973       0       16   -4.107984  isochorismatase hydrolase
EcE24377A_0974       1       20   -4.459220  hypothetical protein
EcE24377A_0975       0       16   -3.980837  MFS family transporter protein
EcE24377A_0976       0       20   -5.848322  amino acid permease
EcE24377A_0977       1       23   -5.907160  LysR family transcriptional regulator
EcE24377A_0978       1       24   -6.236851  hypothetical protein
EcE24377A_0979       0       25   -5.994102  hypothetical protein
EcE24377A_0980      -1       26   -6.345742  pyruvate formate lyase-activating enzyme 1
EcE24377A_0981       0       29   -7.064802  phage integrase site specific recombinase
EcE24377A_0982       3       28   -6.150976  phage regulatory protein
EcE24377A_0983       0       24   -1.515560  replication gene B protein
EcE24377A_0984       1       29   -5.246841  hypothetical protein
EcE24377A_0985       1       28   -5.459175  hypothetical protein
EcE24377A_0986       0       26   -3.221528  C4-type zinc finger DksA/TraR family protein
EcE24377A_0987      -1       22   -2.272188  hypothetical protein
EcE24377A_0988      -1       20   -1.570604  replication gene A protein
EcE24377A_0989       0       19   -1.843339  hypothetical protein
EcE24377A_0990      -1       21    2.060237  PBSX family phage portal protein
EcE24377A_0993       0       22    3.047413  hypothetical protein
EcE24377A_0994       1       24    2.776203  IS21 family transposase
EcE24377A_0995       3       28    1.574636  IS21 family transposition helper protein
EcE24377A_0996       3       28    0.782373  hypothetical protein
EcE24377A_0997       2       26    0.101680  phage protein gpU
EcE24377A_0998       2       25   -0.434125  phage late control gene D protein
EcE24377A_0999       2       27   -1.282056  DNA-binding transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_0973  ISCHRISMTASE     39    5e-06  Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 39.2 bits (91), Expect = 5e-06
Identities = 30/159 (18%), Positives = 53/159 (33%), Gaps = 20/159 (12%)

Query: 7 RLDKNDAAVLLVDHQAGLLSLVRDIEP--DKFKNNVLALGDLAKYFNLPTILTT---SFE 61
D N A +L+ D Q + + N+ L + +P + T S
Sbjct: 25 VPDPNRAVLLIHDMQNYFVDAFTAGASPVTELSANIRKLKNQCVQLGIPVVYTAQPGSQN 84

Query: 62 TGPNGPLV----PELKAQFSDAPYIAR----PGNI-------NAWDNEDFVKAVKATGKK 106
L P L + + I ++ +A+ + ++ ++ G+
Sbjct: 85 PDDRALLTDFWGPGLNSGPYEEKIITELAPEDDDLVLTKWRYSAFKRTNLLEMMRKEGRD 144

Query: 107 QLIIAGVVTEVCVAFPALSAIEEGFDVFVVTDASGTFNE 145
QLII G+ + A A E F V DA F+
Sbjct: 145 QLIITGIYAHIGCLVTACEAFMEDIKAFFVGDAVADFSL 183


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_0981  PERTACTIN        30    0.015  Pertactin signature.
		>PERTACTIN#Pertactin signature.

Length = 922

Score = 30.1 bits (67), Expect = 0.015
Identities = 13/49 (26%), Positives = 24/49 (48%)

Query: 9 GRYEVDVRPQGADGKRIRRKFKTKGEAQAFERHVLVNYHNKEWLEKPAD 57
R E D + G+DG ++ K++T G + E + + +LE A+
Sbjct: 751 SRLENDFKVAGSDGYAVKGKYRTHGVGVSLEAGRRFAHADGWFLEPQAE 799


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_0983  SECA             28    0.016  SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 28.3 bits (63), Expect = 0.016
Identities = 11/72 (15%), Positives = 26/72 (36%)

Query: 9 VEKQPAAMRRIIGKHLAVPRWQDTCDYYNQMMERERLTVCFHAQLKQRHATMRFEEMNDV 68
+ ++ L + W D ++ RER+ +++ + E M
Sbjct: 703 IPGLQERLKNDFDLDLPIAEWLDKEPELHEETLRERILAQSIEVYQRKEEVVGAEMMRHF 762

Query: 69 ERERLVCAIDEL 80
E+ ++ +D L
Sbjct: 763 EKGVMLQTLDSL 774


21  EcE24377A_1050  EcE24377A_1056  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias     %GCBias  Product
EcE24377A_1050       2       21   -3.758481  repressor protein
EcE24377A_1051       2       20   -3.244688  aliphatic sulfonate ABC transporter periplasmic
EcE24377A_1052       1       23   -3.894204  NAD(P)H-dependent FMN reductase
EcE24377A_1053       1       25   -4.518404  fimbrial protein
EcE24377A_1054       0       23   -3.676575  periplasmic pilus chaperone family protein
EcE24377A_1055      -1       21   -3.529449  outer membrane usher protein fimD
EcE24377A_1056      -1       21   -3.204378  fimbrial protein
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_1053  INTIMIN          30    0.006  Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 29.7 bits (66), Expect = 0.006
Identities = 34/169 (20%), Positives = 58/169 (34%), Gaps = 4/169 (2%)

Query: 10 ITVVCATSSVMAADDNAITDGKVTFNGKVIAPACTLVAATKDSVVTLPNVSATKL--QTN 67
+V T+ + A N GK T K P +V+A + + N +A QT
Sbjct: 598 FNIVSGTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAKTAEMTSALNANAVIFVDQTK 657

Query: 68 GAVSGVKTDVPIALEGCDVTVTKNATFTFSGTADGVQPTAFANQATTDAATNVALQM--Y 125
+++ +K D A+ +T Q F + + Y
Sbjct: 658 ASITEIKADKTTAVANGQDAITYTVKVMKGDKPVSNQEVTFTTTLGKLSNSTEKTDTNGY 717

Query: 126 LPDGSTSVTPGTETSNIQLADSAEQTVTFKVDYIATGKATSGNVNAVTN 174
TS TPG + +++D A +V++ T GN+ V
Sbjct: 718 AKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFFTTLTIDDGNIEIVGT 766


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_1055  PF00577         824      0.0  Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 824 bits (2129), Expect = 0.0
Identities = 415/862 (48%), Positives = 571/862 (66%), Gaps = 18/862 (2%)

Query: 15 GVPSFIGGLVVFVSAAFNAQAETWFDPAFFKDDPSMVADLSRFEKGQKITPGVYRVDIVL 74
G + F + A + AE +F+P F DDP VADLSRFE GQ++ PG YRVDI L
Sbjct: 25 GFFVRLFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPGTYRVDIYL 84

Query: 75 NQTIVNTRNVNFVEITPEKGIAACLTTESLDAMGVNTDAFPAFKQLDKQACVPLAEIIPD 134
N + TR+V F E+GI CLT L +MG+NT + L ACVPL +I D
Sbjct: 85 NNGYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDACVPLTSMIHD 144

Query: 135 ASVTFNVNKLRLEISVPQIAIKSNARGYVPPERWDEGINALLLGYSFSGANSIHSSADSD 194
A+ +V + RL +++PQ + + ARGY+PPE WD GINA LL Y+FSG + + +
Sbjct: 145 ATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSVQNRIGGNS 204

Query: 195 SGDSYFLNLNSGVNLGPWRLRNNSTWSR-----SSGQTAEWKNLSSYLQRAVIPLKGELT 249
+LNL SG+N+G WRLR+N+TWS SSG +W++++++L+R +IPL+ LT
Sbjct: 205 --HYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRSRLT 262

Query: 250 VGDDYTAGDFFDSVSFRGVQLASDDNMLPDSLKGFAPVVRGIAKSNAQITIKQNGYTIYQ 309
+GD YT GD FD ++FRG QLASDDNMLPDS +GFAPV+ GIA+ AQ+TIKQNGY IY
Sbjct: 263 LGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIYN 322

Query: 310 TYVSPGAFEISDLYSTSSSGDLLVEIKEADGSVNSYSVPFSSVPLLQRQGRIKYAVTLAK 369
+ V PG F I+D+Y+ +SGDL V IKEADGS ++VP+SSVPLLQR+G +Y++T +
Sbjct: 323 STVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITAGE 382

Query: 370 YRTNSNEQQESKFAQATLQWGGPWGTTWYGGGQYAEYYRAAMFGLGFNLGDFGAISFDAT 429
YR+ + +Q++ +F Q+TL G P G T YGG Q A+ YRA FG+G N+G GA+S D T
Sbjct: 383 YRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALSVDMT 442

Query: 430 QAKSTLADQSEHKGQSYRFLYAKTLNQLGTNFQLMGYRYSTSGFYTLSDTMYKHMDGY-- 487
QA STL D S+H GQS RFLY K+LN+ GTN QL+GYRYSTSG++ +DT Y M+GY
Sbjct: 443 QANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGYNI 502

Query: 488 EFNDGDDEDTPMWSRYYNLFYTKRGKLQVNISQQLGEYGSFYLSGSQQTYWHTDQQDRLL 547
E DG + P ++ YYNL Y KRGKLQ+ ++QQLG + YLSGS QTYW T D
Sbjct: 503 ETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRTSTLYLSGSHQTYWGTSNVDEQF 562

Query: 548 QFGYNTQIKDLSLGVSWNYSKSRGQPDADQVFALNFSLPLNLLLPRSNDSYTRKKNYAWM 607
Q G NT +D++ +S++ +K+ Q DQ+ ALN ++P + L + S R +A
Sbjct: 563 QAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRSDSKSQWR---HASA 619

Query: 608 TSNTSIDNEGHITQNLGLTETLLDDGNLSYSVQQGYNSEGKTANGS---ASMDYKGAFAD 664
+ + S D G +T G+ TLL+D NLSYSVQ GY G +GS A+++Y+G + +
Sbjct: 620 SYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGYGN 679

Query: 665 ARVGYNYSDNGSQQQLNYALSGSLVAHSQGITLGQSLGETNVLIAAPGAENTRVANSTGL 724
A +GY++SD+ +QL Y +SG ++AH+ G+TLGQ L +T VL+ APGA++ +V N TG+
Sbjct: 680 ANIGYSHSDD--IKQLYYGVSGGVLAHANGVTLGQPLNDTVVLVKAPGAKDAKVENQTGV 737

Query: 725 KTDWRGYTVVPYATSYRENRIALDAASLKRNVDLENAVVNVVPTKGALVLAEFNAHAGAR 784
+TDWRGY V+PYAT YRENR+ALD +L NVDL+NAV NVVPT+GA+V AEF A G +
Sbjct: 738 RTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAEFKARVGIK 797

Query: 785 VLMKTSKQGIPLRFGAIATLDGIQTNSGIIDDDGSLYMSGLPAQGAITVRWGEAPDQICH 844
+LM + PL FGA+ T + +SGI+ D+G +Y+SG+P G + V+WGE + C
Sbjct: 798 LLMTLTHNNKPLPFGAMVTSES-SQSSGIVADNGQVYLSGMPLAGKVQVKWGEEENAHCV 856

Query: 845 ISYQLTEQQINSAITRMDAICR 866
+YQL + +T++ A CR
Sbjct: 857 ANYQLPPESQQQLLTQLSAECR 878


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_1056  CLENTEROTOXN     32    0.004  Clostridium enterotoxin signature.
		>CLENTEROTOXN#Clostridium enterotoxin signature.

Length = 319

Score = 31.9 bits (72), Expect = 0.004
Identities = 13/48 (27%), Positives = 22/48 (45%)

Query: 295 VGVVVTDSQNNIISPAGGTLPLSIPDDADSIARMNVYPVSTTGVPPET 342
+ V TD + I+ A T L++ D +S N+Y ++ P T
Sbjct: 188 LTVPSTDIEKEILDLAAATERLNLTDALNSNPAGNLYDWRSSNSYPWT 235


22  EcE24377A_1116  EcE24377A_1127  Y  N  N  Genomic Island
LocusTag        DNBias  CDNBias     %GCBias  Product
EcE24377A_1116      -2       17   -4.525858  chaperone protein TorD
EcE24377A_1115      -1       18   -5.339671  hypothetical protein
EcE24377A_1117       0       17   -4.578526  chaperone-modulator protein CbpM
EcE24377A_1118       0       18   -4.844653  curved DNA-binding protein CbpA
EcE24377A_1119       1       18   -4.146253  hypothetical protein
EcE24377A_1120       1       16    1.388038  glucose-1-phosphatase/inositol phosphatase
EcE24377A_1121       2       17    2.494033  hypothetical protein
EcE24377A_1122       0       16    3.656092  TrpR binding protein WrbA
EcE24377A_1123       0       15    3.769885  hypothetical protein
EcE24377A_1124      -1       16    3.823909  purine permease ycdG
EcE24377A_1125      -2       18    4.128708  flavin reductase rutF
EcE24377A_1126      -1       14    3.141098  hypothetical protein
EcE24377A_1127      -2       11    3.857745  rutD protein
23  EcE24377A_1137  EcE24377A_1164  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias     %GCBias  Product
EcE24377A_1137      -1       20   -3.996978  hypothetical protein
EcE24377A_1138      -1       25   -4.670450  Tat-translocated enzyme
EcE24377A_1139      -1       31   -7.330652  hypothetical protein
EcE24377A_1141       0       27   -6.198601  PGA biosynthesis protein
EcE24377A_1142      -1       26   -5.972728  N-glycosyltransferase
EcE24377A_1143      -1       28   -6.185888  outer membrane N-deacetylase
EcE24377A_1144      -1       28   -6.594481  outer membrane protein PgaA
EcE24377A_1145       0       28   -7.910855  diguanylate cyclase
EcE24377A_1146      -1       22   -4.541212  IS3, transposase orfB
EcE24377A_1147      -1       21   -5.547422  IS3, transposase orfA
EcE24377A_1148       0       20   -4.768173  hypothetical protein
EcE24377A_1149       0       16   -3.740022  hypothetical protein
EcE24377A_1151      -1       17   -2.200083  *hypothetical protein
EcE24377A_1153      -1       16   -2.628738  hydrolase
EcE24377A_1154       0       18   -3.955100  chaperone TorD
EcE24377A_1155       1       22   -5.960622  hypothetical protein
EcE24377A_1156       1       27   -7.709343  curli production assembly/transport subunit
EcE24377A_1157       1       31   -8.413235  curli assembly protein CsgF
EcE24377A_1158       0       34   -8.281270  curli assembly protein CsgE
EcE24377A_1159       1       32   -8.165117  DNA-binding transcriptional regulator CsgD
EcE24377A_1160       3       31   -5.716288  hypothetical protein
EcE24377A_1161       0       23   -3.287714  curlin minor subunit
EcE24377A_1162       0       20   -3.901923  cryptic curlin major subunit
EcE24377A_1163       0       23   -4.305693  autoagglutination protein
EcE24377A_1164      -1       15   -3.336796  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_1144  ARGDEIMINASE     31    0.022  Bacterial arginine deiminase signature.
		>ARGDEIMINASE#Bacterial arginine deiminase signature.

Length = 409

Score = 31.0 bits (70), Expect = 0.022
Identities = 27/183 (14%), Positives = 61/183 (33%), Gaps = 23/183 (12%)

Query: 450 WPRAAENELKK-AEVIEPRNINLEVEQTWTALTLQEWQQA--AVLTHDVVEREPQDPGVV 506
+ A E + A +++ + +E + + L ++ ++E E + +
Sbjct: 47 YLEVARQEHEVFASILKNNLVEIEYIEDLISEVLVSSVALENKFISQFILEAEIKTDFTI 106

Query: 507 -RLK---RAVDVHNLAELRIAGSTGIDAEGPDSGKHDVDLTTIVYS---PPLKDNWRGFA 559
LK ++ + N+ I+G E + DL P+ + F
Sbjct: 107 NLLKDYFSSLTIDNMISKMISGVVT--EELKNYTSSLDDLVNGANLFIIDPMPNVL--FT 162

Query: 560 GFGYADGQFSEGKGIVRDWLAGVEWRSRNIWLEAEYAERVFNHEHKPGARLSGWYDFNDN 619
D S G G+ + + + R E +AE +F + + W + +
Sbjct: 163 ----RDPFASIGNGVT---INKMFTKVRQ--RETIFAEYIFKYHPVYKENVPIWLNRWEE 213

Query: 620 WRI 622
+
Sbjct: 214 ASL 216


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_1145  BINARYTOXINA     30    0.027  Clostridial binary toxin A signature.
		>BINARYTOXINA#Clostridial binary toxin A signature.

Length = 454

Score = 29.6 bits (66), Expect = 0.027
Identities = 22/77 (28%), Positives = 36/77 (46%), Gaps = 6/77 (7%)

Query: 359 DQVIKTVVNIIGKSIRPDDLLA--RVGGEEFGVLLTDIDTERAKALAERIRENVERLTGD 416
D + + N + + P +L+ R G +EFG+ LT + + K E I E+ G
Sbjct: 313 DSKVNNIENALKLTPIPSNLIVYRRSGPQEFGLTLTSPEYDFNK--IENIDAFKEKWEGK 370

Query: 417 NPEYAIPQKVTISIGAV 433
Y P ++ SIG+V
Sbjct: 371 VITY--PNFISTSIGSV 385


24  EcE24377A_1191  EcE24377A_1207  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias     %GCBias  Product
EcE24377A_1191       2       16    1.359911  oxidoreductase
EcE24377A_1192       0       13    0.917566  integral membrane protein MviN
EcE24377A_1193       1       19    0.751012  flagellar synthesis protein FlgN
EcE24377A_1194       2       16    0.967787  anti-sigma-28 factor FlgM
EcE24377A_1195       1       16    2.140841  flagellar basal body P-ring biosynthesis protein
EcE24377A_1196       2       16    2.369468  flagellar basal-body rod protein FlgB
EcE24377A_1197       3       14    2.313511  flagellar basal body rod protein FlgC
EcE24377A_1198       2       12    2.380727  flagellar basal body rod modification protein
EcE24377A_1199       0       12    2.506596  flagellar hook protein FlgE
EcE24377A_1200      -1       12    2.446995  flagellar basal body rod protein FlgF
EcE24377A_1201      -1        9    1.331663  flagellar basal body rod protein FlgG
EcE24377A_1202       0       12    2.259446  flagellar basal body L-ring protein
EcE24377A_1203       0       13    2.021553  flagellar basal body P-ring biosynthesis protein
EcE24377A_1204       1       14    1.633508  flagellar rod assembly protein/muramidase FlgJ
EcE24377A_1205       2       15    1.203864  flagellar hook-associated protein FlgK
EcE24377A_1206       2       14    1.248637  flagellar hook-associated protein FlgL
EcE24377A_1207       2       14    1.726975  ribonuclease E
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_1199  FLGHOOKAP1       42    4e-06  Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 41.9 bits (98), Expect = 4e-06
Identities = 17/49 (34%), Positives = 29/49 (59%)

Query: 353 TLTNGALEASNVDLSKELVNMIVAQRNYQSNAQTIKTQDQILNTLVNLR 401
L+N S V+L +E N+ Q+ Y +NAQ ++T + I + L+N+R
Sbjct: 498 QLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINIR 546



Score = 37.2 bits (86), Expect = 9e-05
Identities = 22/56 (39%), Positives = 30/56 (53%), Gaps = 4/56 (7%)

Query: 6 AVSGLNAAATNLDVIGNNIANSATYGFKSGTASFAD----MFAGSKVGLGVKVAGI 57
A+SGLNAA L+ NNI++ G+ T A + AG VG GV V+G+
Sbjct: 7 AMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVSGV 62


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_1201  FLGHOOKAP1       44    4e-07  Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 43.8 bits (103), Expect = 4e-07
Identities = 18/81 (22%), Positives = 36/81 (44%), Gaps = 14/81 (17%)

Query: 3 SSLWIAKTGLDAQQTNMDVIANNLANVSTNGFKRQRAVFEDLLYQTIRQPGAQSSEQTTL 62
S + A +GL+A Q ++ +NN+++ + G+ RQ + + +TL
Sbjct: 2 SLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTI--------------MAQANSTL 47

Query: 63 PSGLQIGTGVRPVATERLHSQ 83
+G +G GV +R +
Sbjct: 48 GAGGWVGNGVYVSGVQREYDA 68



Score = 41.1 bits (96), Expect = 3e-06
Identities = 11/41 (26%), Positives = 21/41 (51%)

Query: 220 ETSNVNVAEELVNMIQVQRAYEINSKAVSTTDQMLQKLTQL 260
S VN+ EE N+ + Q+ Y N++ + T + + L +
Sbjct: 505 SISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_1202  FLGLRINGFLGH    349    e-126  Flagellar L-ring protein signature.
		>FLGLRINGFLGH#Flagellar L-ring protein signature.

Length = 232

Score = 349 bits (897), Expect = e-126
Identities = 232/232 (100%), Positives = 232/232 (100%)

Query: 1 MQKNAAHTYAISSLLVLSLTGCAWIPSTPLVQGATSAQPVPGPTPVANGSIFQSAQPINY 60
MQKNAAHTYAISSLLVLSLTGCAWIPSTPLVQGATSAQPVPGPTPVANGSIFQSAQPINY
Sbjct: 1 MQKNAAHTYAISSLLVLSLTGCAWIPSTPLVQGATSAQPVPGPTPVANGSIFQSAQPINY 60

Query: 61 GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYLQGLFGNA 120
GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYLQGLFGNA
Sbjct: 61 GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYLQGLFGNA 120

Query: 121 RADVEASGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRF 180
RADVEASGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRF
Sbjct: 121 RADVEASGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRF 180

Query: 181 SGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSPM 232
SGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSPM
Sbjct: 181 SGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSPM 232


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_1203  FLGPRINGFLGI    427    e-152  Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 427 bits (1100), Expect = e-152
Identities = 157/363 (43%), Positives = 213/363 (58%), Gaps = 9/363 (2%)

Query: 4 FLSALILLLVTTAAQAERIRDLTSVQGVRQNSLIGYGLVVGLDGTGDQTTQTPFTTQTLN 63
F + L A RI+D+ S+Q R N LIGYGLVVGL GTGD +PFT Q++
Sbjct: 13 FSALPFLSTPPAQADTSRIKDIASLQAGRDNQLIGYGLVVGLQGTGDSLRSSPFTEQSMR 72

Query: 64 NMLSQLGITVPTGTNMQLKNVAAVMVTASLPPFGRQGQTIDVVVSSMGNAKSLRGGTLLM 123
ML LGIT G + KN+AAVMVTA+LPPF G +DV VSS+G+A SLRGG L+M
Sbjct: 73 AMLQNLGITTQGGQS-NAKNIAAVMVTANLPPFASPGSRVDVTVSSLGDATSLRGGNLIM 131

Query: 124 TPLKGVDSQVYALAQGNILVGGAGASAGGSSVQVNQLNGGRITNGAVIERELPSQFGVGN 183
T L G D Q+YA+AQG ++V G A +++ R+ NGA+IERELPS+F
Sbjct: 132 TSLSGADGQIYAVAQGALIVNGFSAQGDAATLTQGVTTSARVPNGAIIERELPSKFKDSV 191

Query: 184 TLNLQLNDEDFSMAQQIADTINRVR----GYGSATALDARTIQVRVPSGNSSQVRFLADI 239
L LQL + DFS A ++AD +N G A D++ I V+ P + R +A+I
Sbjct: 192 NLVLQLRNPDFSTAVRVADVVNAFARARYGDPIAEPRDSQEIAVQKPRV-ADLTRLMAEI 250

Query: 240 QNMQVNVTPQDAKVVINSRTGSVVMNREVTLDSCAVAQGNLSVTVNRQANVSQPDTPFGG 299
+N+ V T AKVVIN RTG++V+ +V + AV+ G L+V V V QP PF
Sbjct: 251 ENLTVE-TDTPAKVVINERTGTIVIGADVRISRVAVSYGTLTVQVTESPQVIQP-APFSR 308

Query: 300 GQTVVTPQTQIDLRQSGGSLQSVRSSASLNNVVRALNALGATPMDLMSILQSMQSAGCLR 359
GQT V PQT I Q G + ++ L +V LN++G +++ILQ ++SAG L+
Sbjct: 309 GQTAVQPQTDIMAMQEGSKV-AIVEGPDLRTLVAGLNSIGLKADGIIAILQGIKSAGALQ 367

Query: 360 AKL 362
A+L
Sbjct: 368 AEL 370


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_1204  FLGFLGJ         511      0.0  Flagellar protein FlgJ signature.
		>FLGFLGJ#Flagellar protein FlgJ signature.

Length = 313

Score = 511 bits (1318), Expect = 0.0
Identities = 313/313 (100%), Positives = 313/313 (100%)

Query: 1 MISDSKLLASAAWDAQSLNELKAKAGEDPAANIRPVARQVEGMFVQMMLKSMRDALPKDG 60
MISDSKLLASAAWDAQSLNELKAKAGEDPAANIRPVARQVEGMFVQMMLKSMRDALPKDG
Sbjct: 1 MISDSKLLASAAWDAQSLNELKAKAGEDPAANIRPVARQVEGMFVQMMLKSMRDALPKDG 60

Query: 61 LFSSEHTRLYTSMYDQQIAQQMTAGKGLGLAEMMVKQMTPEQPLPEESTPAAPMKFPLET 120
LFSSEHTRLYTSMYDQQIAQQMTAGKGLGLAEMMVKQMTPEQPLPEESTPAAPMKFPLET
Sbjct: 61 LFSSEHTRLYTSMYDQQIAQQMTAGKGLGLAEMMVKQMTPEQPLPEESTPAAPMKFPLET 120

Query: 121 VVRYQNQALSQLVQKAVPRNYDDSLPGDSKAFLAQLSLPAQLASQQSGVPHHLILAQAAL 180
VVRYQNQALSQLVQKAVPRNYDDSLPGDSKAFLAQLSLPAQLASQQSGVPHHLILAQAAL
Sbjct: 121 VVRYQNQALSQLVQKAVPRNYDDSLPGDSKAFLAQLSLPAQLASQQSGVPHHLILAQAAL 180

Query: 181 ESGWGQRQIRRENGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKKVKAKFRVYSSYL 240
ESGWGQRQIRRENGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKKVKAKFRVYSSYL
Sbjct: 181 ESGWGQRQIRRENGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKKVKAKFRVYSSYL 240

Query: 241 EALSDYVGLLTRNPRYAAVTTAASAEQGAQALQDAGYATDPHYARKLTNMIQQMKSISDK 300
EALSDYVGLLTRNPRYAAVTTAASAEQGAQALQDAGYATDPHYARKLTNMIQQMKSISDK
Sbjct: 241 EALSDYVGLLTRNPRYAAVTTAASAEQGAQALQDAGYATDPHYARKLTNMIQQMKSISDK 300

Query: 301 VSKTYSMNIDNLF 313
VSKTYSMNIDNLF
Sbjct: 301 VSKTYSMNIDNLF 313


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits        Score  E-value  Comments
EcE24377A_1205  FLGHOOKAP1  683    0.0      Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 683 bits (1764), Expect = 0.0
Identities = 544/546 (99%), Positives = 545/546 (99%)

Query: 2 SSLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVS 61
SSLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVS
Sbjct: 1 SSLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVS 60

Query: 62 GVQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTLV 121
GVQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTLV
Sbjct: 61 GVQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTLV 120

Query: 122 SNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLND 181
SNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLND
Sbjct: 121 SNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLND 180

Query: 182 QISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGSTA 241
QISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGSTA
Sbjct: 181 QISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGSTA 240

Query: 242 RQLAAVPSSADPSRTTVAYVDGTAGNIEIPEKLLNTGSLGGILTFRSQDLDQTRNTLGQL 301
RQLAAVPSSADPSRTTVAYVDGTAGNIEIPEKLLNTGSLGGILTFRSQDLDQTRNTLGQL
Sbjct: 241 RQLAAVPSSADPSRTTVAYVDGTAGNIEIPEKLLNTGSLGGILTFRSQDLDQTRNTLGQL 300

Query: 302 ALAFAEAFNSQHKAGFDANGDAGEDFFAIGKPAVLQNTKNKGDVAIGATVTDASAVLATD 361
ALAFAEAFN+QHKAGFDANGDAGEDFFAIGKPAVLQNTKNKGDVAIGATVTDASAVLATD
Sbjct: 301 ALAFAEAFNTQHKAGFDANGDAGEDFFAIGKPAVLQNTKNKGDVAIGATVTDASAVLATD 360

Query: 362 YKISFDNNQWQVTRLASNTTFTVTPDANGKVAFDGLELTFTGTPAVNDSFTLKPVSDAIV 421
YKISFDNNQWQVTRLASNTTFTVTPDANGKVAFDGLELTFTGTPAVNDSFTLKPVSDAIV
Sbjct: 361 YKISFDNNQWQVTRLASNTTFTVTPDANGKVAFDGLELTFTGTPAVNDSFTLKPVSDAIV 420

Query: 422 NMDVLITDEAKIAMASEEDAGDSDNRNGQALLDLQSNSKTVGGAKSFNDAYASLVSDIGN 481
NMDVLITDEAKIAMASEEDAGDSDNRNGQALLDLQSNSKTVGGAKSFNDAYASLVSDIGN
Sbjct: 421 NMDVLITDEAKIAMASEEDAGDSDNRNGQALLDLQSNSKTVGGAKSFNDAYASLVSDIGN 480

Query: 482 KTATLKTSSTTQGNVVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFD 541
KTATLKTSS TQGNVVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFD
Sbjct: 481 KTATLKTSSATQGNVVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFD 540

Query: 542 ALINIR 547
ALINIR
Sbjct: 541 ALINIR 546


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits       Score  E-value  Comments
EcE24377A_1206  FLAGELLIN  45     2e-07    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 45.4 bits (107), Expect = 2e-07
Identities = 46/229 (20%), Positives = 84/229 (36%), Gaps = 15/229 (6%)

Query: 7 MMYQQNMRGITNSQAEWMKYGEQMSTGKRVVNPSDDPIAASQAVVLSQAQAQNSQYTLAR 66
++ Q N+ +S + + E++S+G R+ + DD AA QA+ + T A
Sbjct: 11 LLTQNNLNKSQSSLSSAI---ERLSSGLRINSAKDD--AAGQAIANRF-TSNIKGLTQAS 64

Query: 67 TFATQKVSL---EESVLSQVTTAIQNAQEKIVYASNGTLSDDDRASLATDIQGLRDQLLN 123
A +S+ E L+++ +Q +E V A+NGT SD D S+ +IQ +++
Sbjct: 65 RNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEEIDR 124

Query: 124 LANTTDGNGRYIFAGYKTETAPFSEEKGKYVGGAESIRQQVDASRSMVIGHTGDKIFDSI 183
++N T NG + + G E+I + +G G +
Sbjct: 125 VSNQTQFNGVKVLSQDNQMKIQVGANDG------ETITIDLQKIDVKSLGLDGFNVNGPK 178

Query: 184 TSNAVAEPDGSASETNLFAMLDSAIAALKTPVADSEADKEIAAAALDKT 232
+ + T A + + A DK
Sbjct: 179 EATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKV 227


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits         Score  E-value  Comments
EcE24377A_1207  IGASERPTASE  65     2e-12    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 65.1 bits (158), Expect = 2e-12
Identities = 47/261 (18%), Positives = 83/261 (31%), Gaps = 26/261 (9%)

Query: 551 VAPAPKAAPATPATPAQPGLLSRFFGALKALFSGSEETKPTEQP-APKAEAKPERQQDRR 609
P + S E + E P P A A P
Sbjct: 991 TVDTTNITTPNNIQADVPSVPSN----------NEEIARVDEAPVPPPAPATPSETT--- 1037

Query: 610 KPRQNNRRDRNERRDTRSERTEGSDNREENRRNRRQAQQQTAETRESRQQAEV------T 663
N ++++++ D E +NR A++ + + + Q EV T
Sbjct: 1038 -----ETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSET 1092

Query: 664 EKARTTDEQQAPRRERSRRRNDDKRQAQQEAKALNVEEQSVQETEQEERVRPVQPRRKQR 723
++ +TT+ ++ E+ + + + Q+ K + + QE + + + R
Sbjct: 1093 KETQTTETKETATVEKEEKAKVETEKTQEVPK-VTSQVSPKQEQSETVQPQAEPARENDP 1151

Query: 724 QLNQKVRYEQSVAEEAVVAPVVEETVAAEPIVQEAPAPRTELVKVPLPVVAQTAPEQQEE 783
+N K Q+ P E + E V E+ T V P A Q
Sbjct: 1152 TVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTV 1211

Query: 784 NNADNRDNGGMPRRSRRSPRH 804
N+ + RRS RS H
Sbjct: 1212 NSESSNKPKNRHRRSVRSVPH 1232



Score = 64.3 bits (156), Expect = 2e-12
Identities = 47/288 (16%), Positives = 84/288 (29%), Gaps = 36/288 (12%)

Query: 513 PSEEEFAERKRPEQPALATFAMPDVPPAPT-PAEPAAPVVAPAPKAAPATPATPAQPGLL 571
P E+ + DVP P+ E A AP P APATP+
Sbjct: 983 PEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETT----- 1037

Query: 572 SRFFGALKALFSGSEETKPTEQPAPKAEAKPERQQDRRKPRQNNRRDRNERRDTRSER-- 629
ET + Q QN + + + ++
Sbjct: 1038 ---------------ETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQT 1082

Query: 630 TEGSDNREENRRNRRQAQQQTAETRESRQQAEVTEKARTTDEQQAPRRERSRRRNDDKRQ 689
E + + E + + ++TA + + TEK + + + + + + Q
Sbjct: 1083 NEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQ 1142

Query: 690 AQ---QEAKALNVEEQSVQETEQEERVRPVQPRRKQRQLNQKVRYEQSV--AEEAVVAPV 744
A+ + +N++E Q + +P + + Q V +V V P
Sbjct: 1143 AEPARENDPTVNIKEPQSQTNTTADTEQPA--KETSSNVEQPVTESTTVNTGNSVVENPE 1200

Query: 745 VEETVAAEPIVQEAPA------PRTELVKVPLPVVAQTAPEQQEENNA 786
+P V + R + VP V T A
Sbjct: 1201 NTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVA 1248


25  EcE24377A_1241  EcE24377A_1254  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias     Product
EcE24377A_1241  -2      14       -3.377284   N-acetyl-D-glucosamine kinase
EcE24377A_1243  -1      20       -4.942284   hypothetical protein
EcE24377A_1244  -1      21       -4.007892   hypothetical protein
EcE24377A_1245  -1      20       -3.282994   hypothetical protein
EcE24377A_1246  0       20       -2.862722   spermidine/putrescine ABC transporter
EcE24377A_1247  1       23       -3.637513   spermidine/putrescine ABC transporter membrane
EcE24377A_1248  2       26       -4.501671   phage integrase
EcE24377A_1249  2       26       -4.738891   excisionase
EcE24377A_1250  1       25       -5.656786   exonuclease
EcE24377A_1251  1       26       -6.075924   exonuclease
EcE24377A_1252  0       34       -8.850526   hypothetical protein
EcE24377A_1253  1       28       -7.062239   DicB protein
EcE24377A_1254  2       24       -4.309774   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits         Score  E-value  Comments
EcE24377A_1253  STREPKINASE  29     0.002    Streptococcus streptokinase protein signature.
		>STREPKINASE#Streptococcus streptokinase protein signature.

Length = 440

Score = 29.3 bits (65), Expect = 0.002
Identities = 12/36 (33%), Positives = 21/36 (58%)

Query: 17 TSPGGTRHRITKFIVEDAIMETLLPNVNTSEGCFEI 52
T G H++ K + AI E L+ NV++++ FE+
Sbjct: 91 TDSGAMSHKLEKADLLKAIQEQLIANVHSNDDYFEV 126


26  EcE24377A_1264  EcE24377A_1271  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias     Product
EcE24377A_1264  0       29       -4.352381   hypothetical protein
EcE24377A_1265  0       32       -5.705100   hypothetical protein
EcE24377A_1266  0       31       -5.564139   hypothetical protein
EcE24377A_1267  1       31       -6.287996   crossover junction endodeoxyribonuclease
EcE24377A_1268  1       30       -5.429011   phage antitermination protein Q
EcE24377A_1269  1       30       -5.210941   protein kinase domain-containing protein
EcE24377A_1270  1       25       -2.075780   hypothetical protein
EcE24377A_1271  2       25       1.151057    hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_1269  YERSSTKINASE  30     0.013    Yersinia serine/threonine protein kinase signature.
		>YERSSTKINASE#Yersinia serine/threonine protein kinase signature.

Length = 732

Score = 29.7 bits (66), Expect = 0.013
Identities = 29/91 (31%), Positives = 41/91 (45%), Gaps = 16/91 (17%)

Query: 119 IVKMVLDGVAHIHAKGYLHRDIKPFNVL-RFSDGTYKVSDFGLVKDT--NPEGDTTKLTE 175
I +LD H+ G +H DIKP NV+ + G V D GL + P+G T
Sbjct: 250 IAHRLLDVTNHLAKAGVVHNDIKPGNVVFDRASGEPVVIDLGLHSRSGEQPKGFTES--- 306

Query: 176 IGTRMGSTRYMAPEI-LYNAEYSVKTDVYAV 205
+ APE+ + N S K+DV+ V
Sbjct: 307 ---------FKAPELGVGNLGASEKSDVFLV 328


27  EcE24377A_1297  EcE24377A_1322  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias     Product
EcE24377A_1297  -2      17       -3.186841   NUDIX family hydrolase
EcE24377A_1298  -1      20       -4.753165   23S rRNA pseudouridine synthase E
EcE24377A_1299  0       22       -6.145155   isocitrate dehydrogenase
EcE24377A_1300  0       34       -8.923104   hypothetical protein
EcE24377A_1301  2       33       -8.723200   transcriptional regulator mlrA-like protein
EcE24377A_1302  1       32       -9.116366   hypothetical protein
EcE24377A_1303  3       33       -9.436518   BLUF/cyclic diguanylate phosphodiesterase
EcE24377A_1304  3       36       -10.627499  hypothetical protein
EcE24377A_1305  3       37       -10.292631  hypothetical protein
EcE24377A_1306  3       34       -9.493763   hypothetical protein
EcE24377A_1307  1       24       -6.519269   cyclic diguanylate phosphodiesterase
EcE24377A_1308  0       25       -6.075038   hypothetical protein
EcE24377A_1309  3       25       -4.569421   hypothetical protein
EcE24377A_1310  1       24       -4.670616   hypothetical protein
EcE24377A_1311  0       23       -3.717985   hypothetical protein
EcE24377A_1312  1       21       -3.049404   autotransporter (AT) family porin
EcE24377A_1313  1       24       -5.032940   hypothetical protein
EcE24377A_1314  -1      21       -3.554879   hypothetical protein
EcE24377A_1315  -1      18       -3.315993   hypothetical protein
EcE24377A_1316  -1      19       -3.106554   hypothetical protein
EcE24377A_1317  -2      19       -3.921324   hypothetical protein
EcE24377A_1318  -1      21       -4.768596   cell division topological specificity factor
EcE24377A_1319  -2      19       -3.455495   cell division inhibitor MinD
EcE24377A_1320  -2      21       -4.065204   septum formation inhibitor
EcE24377A_1321  -1      20       -5.335729   fels-1 prophage protein
EcE24377A_1322  -1      19       -4.384600   pre-peptidase C-terminal domain-containing
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_1312  PRTACTNFAMLY  62     1e-12    Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 61.6 bits (149), Expect = 1e-12
Identities = 58/242 (23%), Positives = 88/242 (36%), Gaps = 41/242 (16%)

Query: 53 QLGADLLTGGFTDSDSWRLGVMAGYARDYNSTHSSVSDYRSKGSVRGYCAGLYATWFADD 112
+LGAD W LG +AGY R + G G YAT+ AD
Sbjct: 676 ELGADHAVAVAGGR--WHLGGLAGYTR----GDRGFTGDGG-GHTDSVHVGGYATYIADS 728

Query: 113 ISKKGAYIDAWAQYSWFKN----------SVKGDELAYESYSAKGATVSLEAGYGFALNK 162
G Y+DA + S +N +VKG Y G SLEAG F
Sbjct: 729 ----GFYLDATLRASRLENDFKVAGSDGYAVKGK------YRTHGVGASLEAGRRFTHAD 778

Query: 163 SFGLEAAKYTWIFQPQAQAIWMGVDHNAHTEANGSRIENDANNNIQTRLGFRTFIRTQEK 222
W +PQA+ A+ ANG R+ ++ +++ RLG R +
Sbjct: 779 G---------WFLEPQAELAVFRAGGGAYRAANGLRVRDEGGSSVLGRLGLEVGKRIELA 829

Query: 223 NSGPHGDDFEPFVEMNWFHNSK-DFAVSMNGVKVEQDGASNLGEIKLGVNGNLNPAASVW 281
G +P+++ + V NG+ + E+ LG+ L S++
Sbjct: 830 G----GRQVQPYIKASVLQEFDGAGTVHTNGIAHRTELRGTRAELGLGMAAALGRGHSLY 885

Query: 282 GN 283
+
Sbjct: 886 AS 887


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_1317  PRTACTNFAMLY  49     1e-09    Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 48.5 bits (115), Expect = 1e-09
Identities = 30/114 (26%), Positives = 50/114 (43%), Gaps = 1/114 (0%)

Query: 23 YHLSNGMESKSVDTRSIYRELGATLSYNMRLGNGMEIEPCLKAAVRKEFVDDNRVKVNSD 82
Y +NG+ + S+ LG + + L G +++P +KA+V +EF V N
Sbjct: 798 YRAANGLRVRDEGGSSVLGRLGLEVGKRIELAGGRQVQPYIKASVLQEFDGAGTVHTNGI 857

Query: 83 GNFVNDLSGRRGIYQAGIKASFSSTLSGHFGVGYSHGAGVESPWNAVAGVNWSF 136
+ +L G R G+ A+ S + YS G + PW AG +S+
Sbjct: 858 AH-RTELRGTRAELGLGMAAALGRGHSLYASYEYSKGPKLAMPWTFHAGYRYSW 910


28  EcE24377A_1381  EcE24377A_1454  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias     Product
EcE24377A_1381  -2      21       -3.627083   **formyltetrahydrofolate deformylase
EcE24377A_1382  -1      26       -4.293337   hypothetical protein
EcE24377A_1383  -2      25       -2.877798   hypothetical protein
EcE24377A_1384  -2      25       -3.179740   response regulator of RpoS
EcE24377A_1385  0       27       -3.426051   UTP-glucose-1-phosphate uridylyltransferase
EcE24377A_1386  -1      26       -3.495404   global DNA-binding transcriptional dual
EcE24377A_1387  -1      23       -2.924973   thymidine kinase
EcE24377A_1389  -1      21       -2.095402   bifunctional acetaldehyde-CoA/alcohol
EcE24377A_1390  0       12       -2.156264   hypothetical protein
EcE24377A_1391  -1      12       -2.128298   oligopeptide ABC transporter periplasmic
EcE24377A_1392  0       13       -0.996943   oligopeptide transporter permease
EcE24377A_1393  -1      14       -2.100588   oligopeptide ABC transporter permease OppC
EcE24377A_1394  -1      17       -2.609922   oligopeptide ABC transporter ATP-binding
EcE24377A_1395  -2      20       -3.378598   oligopeptide ABC transporter ATP-binding protein
EcE24377A_1396  -2      26       -6.340200   dsDNA-mimic protein
EcE24377A_1397  -1      28       -6.985973   cardiolipin synthetase
EcE24377A_1398  4       37       -10.456356  voltage-gated potassium channel
EcE24377A_1399  8       48       -10.750424  hypothetical protein
EcE24377A_1400  8       47       -10.360791  phage N-6-adenine-methyltransferase
EcE24377A_1401  8       44       -9.843015   hypothetical protein
EcE24377A_1402  5       29       -4.115564   hypothetical protein
EcE24377A_1403  5       31       -4.402618   hypothetical protein
EcE24377A_1404  6       33       -4.225783   hypothetical protein
EcE24377A_1405  4       25       -0.647384   IS66 family orf1
EcE24377A_1406  4       23       -0.869930   IS66 family orf2
EcE24377A_1407  3       23       -1.825688   IS66 family transposase
EcE24377A_1408  4       28       -4.827192   hypothetical protein
EcE24377A_1409  4       23       -3.531914   recombinase
EcE24377A_1410  0       17       -2.502709   transporter
EcE24377A_1411  0       17       -3.065665   acyl-CoA thioester hydrolase
EcE24377A_1412  0       17       -3.339086   intracellular septation protein A
EcE24377A_1413  0       19       -3.290522   hypothetical protein
EcE24377A_1414  1       20       -3.544932   outer membrane protein W
EcE24377A_1415  1       21       -3.767822   phage integrase site specific recombinase
EcE24377A_1416  1       24       -3.396755   exonuclease
EcE24377A_1417  3       27       -4.914212   hypothetical protein
EcE24377A_1418  1       25       -4.155370   hypothetical protein
EcE24377A_1419  4       27       -3.127572   Rha family phage regulatory protein
EcE24377A_1420  4       29       -1.036608   hypothetical protein
EcE24377A_1421  4       31       -0.790053   hypothetical protein
EcE24377A_1422  3       31       -0.302462   DNA-binding transcriptional regulator DicC
EcE24377A_1423  3       31       -0.979987   hypothetical protein
EcE24377A_1424  3       33       -5.141972   hypothetical protein
EcE24377A_1425  1       33       -6.502180   hypothetical protein
EcE24377A_1426  1       35       -10.614298  hypothetical protein
EcE24377A_1427  0       27       -7.454703   hypothetical protein
EcE24377A_1429  -1      28       -7.541826   hypothetical protein
EcE24377A_1430  -1      25       -6.445870   hypothetical protein
EcE24377A_1431  -1      29       -4.899937   hypothetical protein
EcE24377A_1433  0       32       -6.262413   hypothetical protein
EcE24377A_1434  0       32       -5.818123   hypothetical protein
EcE24377A_1435  1       32       -7.018901   crossover junction endodeoxyribonuclease
EcE24377A_1436  1       32       -6.512875   phage antitermination protein Q
EcE24377A_1437  1       33       -6.085928   protein kinase domain-containing protein
EcE24377A_1438  0       24       -0.713792   hypothetical protein
EcE24377A_1439  2       24       2.692556    hypothetical protein
EcE24377A_1440  3       26       3.831663    DNA methylase
EcE24377A_1443  7       35       7.264871    **hypothetical protein
EcE24377A_1444  10      38       7.513294    hypothetical protein
EcE24377A_1445  8       36       5.591153    lambda family phage portal protein
EcE24377A_1446  8       36       4.856510    hypothetical protein
EcE24377A_1447  7       38       4.372086    hypothetical protein
EcE24377A_1448  6       34       2.913001    hypothetical protein
EcE24377A_1449  6       33       2.198294    hypothetical protein
EcE24377A_1450  4       27       0.665522    hypothetical protein
EcE24377A_1451  1       22       -3.513157   hypothetical protein
EcE24377A_1452  -1      22       -5.585933   hypothetical protein
EcE24377A_1453  0       29       -7.182408   hypothetical protein
EcE24377A_1454  1       22       -4.172961   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits  Score  E-value  Comments
EcE24377A_1382  SECA  57     2e-12    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 57.2 bits (138), Expect = 2e-12
Identities = 16/28 (57%), Positives = 20/28 (71%)

Query: 125 IDGTRPQFGRNDPCPCGSGKKFKKCCGQ 152
+ GRNDPCPCGSGKK+K+C G+
Sbjct: 872 AQTGERKVGRNDPCPCGSGKKYKQCHGR 899


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits    Score  E-value  Comments
EcE24377A_1384  HTHFIS  90     7e-22    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 89.5 bits (222), Expect = 7e-22
Identities = 40/152 (26%), Positives = 64/152 (42%), Gaps = 3/152 (1%)

Query: 10 ILIVEDEQVFRSLLDSWFSSLGATTVLAADGVDALELLGGFTPDLMICDIAMPRMNGLKL 69
IL+ +D+ R++L+ S G + ++ + DL++ D+ MP N L
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 70 LEHIRNRGDQTPVLVISATENMADIAKALRLGVEDVLLKPVKDLNRLREMVFACLYPSMF 129
L I+ PVLV+SA KA G D L KP DL L ++ L +
Sbjct: 66 LPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPF-DLTELIGIIGRAL--AEP 122

Query: 130 NSRVEEEERLFRDWDAMVDNPAAAAKLLQELQ 161
R + E +D +V AA ++ + L
Sbjct: 123 KRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLA 154


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits    Score  E-value  Comments
EcE24377A_1395  HTHFIS  31     0.008    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 30.6 bits (69), Expect = 0.008
Identities = 9/16 (56%), Positives = 11/16 (68%)

Query: 55 VVGESGCGKSTFARAI 70
+ GESG GK ARA+
Sbjct: 165 ITGESGTGKELVARAL 180


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits         Score  E-value  Comments
EcE24377A_1399  adhesinmafb  31     5e-04    Neisseria meningitidis: adhesin MafB signature.
		>adhesinmafb#Neisseria meningitidis: adhesin MafB signature.

Length = 467

Score = 30.8 bits (69), Expect = 5e-04
Identities = 16/57 (28%), Positives = 20/57 (35%), Gaps = 2/57 (3%)

Query: 41 GPMPAVDSNDPGAAGFTGSTIIAEFESLEAAQAWADADPYVAAGVYEHVSVKPFKKV 97
P+PA G GS E + EA W +P A V +V KV
Sbjct: 268 APLPA--EGKFAVIGGLGSVAGFEKNTREAVDRWIQENPNAAETVEAVFNVAAAAKV 322


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits         Score  E-value  Comments
EcE24377A_1410  TONBPROTEIN  259    7e-90    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 259 bits (663), Expect = 7e-90
Identities = 237/239 (99%), Positives = 237/239 (99%)

Query: 1 MTLDLPRRFPWPTLLSVCIHGAVVAGLLYTSVHQVIELPAPAQPISVTMVAPADLEPPQA 60
MTLDLPRRFPWPTLLSVCIHGAVVAGLLYTSVHQVIELPAPAQPISVTMV PADLEPPQA
Sbjct: 1 MTLDLPRRFPWPTLLSVCIHGAVVAGLLYTSVHQVIELPAPAQPISVTMVTPADLEPPQA 60

Query: 61 VQPPPEPVVEPEPEPEPIPEPPKEAPAVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVESR 120
VQPPPEPVVEPEPEPEPIPEPPKEAP VIEKPKPKPKPKPKPVKKVQEQPKRDVKPVESR
Sbjct: 61 VQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVESR 120

Query: 121 PASPFENTAPARLTSSTATAATSKPVTSVASGPRALSRNQPQYPARAQALRIEGQVKVKF 180
PASPFENTAPARLTSSTATAATSKPVTSVASGPRALSRNQPQYPARAQALRIEGQVKVKF
Sbjct: 121 PASPFENTAPARLTSSTATAATSKPVTSVASGPRALSRNQPQYPARAQALRIEGQVKVKF 180

Query: 181 DVTPDGRVDNVQILSAKPANMFEREVKNAMRRWRYEPGKPGSGIVVNILFKINGTTEIQ 239
DVTPDGRVDNVQILSAKPANMFEREVKNAMRRWRYEPGKPGSGIVVNILFKINGTTEIQ
Sbjct: 181 DVTPDGRVDNVQILSAKPANMFEREVKNAMRRWRYEPGKPGSGIVVNILFKINGTTEIQ 239


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_1437  YERSSTKINASE  30     0.010    Yersinia serine/threonine protein kinase signature.
		>YERSSTKINASE#Yersinia serine/threonine protein kinase signature.

Length = 732

Score = 30.1 bits (67), Expect = 0.010
Identities = 29/91 (31%), Positives = 41/91 (45%), Gaps = 16/91 (17%)

Query: 144 IVKMVLDGVAHIHAKGYLHRDIKPFNVL-RFSDGTYKVSDFGLVKDT--NPEGDTTKLTE 200
I +LD H+ G +H DIKP NV+ + G V D GL + P+G T
Sbjct: 250 IAHRLLDVTNHLAKAGVVHNDIKPGNVVFDRASGEPVVIDLGLHSRSGEQPKGFTES--- 306

Query: 201 IGTRMGSTRYMAPEI-LYNAEYSVKTDVYAV 230
+ APE+ + N S K+DV+ V
Sbjct: 307 ---------FKAPELGVGNLGASEKSDVFLV 328


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_1450  ACRIFLAVINRP  31     0.021    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 31.3 bits (71), Expect = 0.021
Identities = 15/70 (21%), Positives = 21/70 (30%), Gaps = 8/70 (11%)

Query: 212 ADEGVVVSGVTVNVVIKSASGQRYVATVTTDSEGHWSYEMEDDDFANG------WYSSTA 265
GV +S +N I +A G YV Y D F Y +A
Sbjct: 736 QALGVSLS--DINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFRMLPEDVDKLYVRSA 793

Query: 266 YNDLVRLRQL 275
++V
Sbjct: 794 NGEMVPFSAF 803


29  EcE24377A_1506  EcE24377A_1515  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias     Product
EcE24377A_1506  -2      17       3.341640    glutamate--putrescine ligase
EcE24377A_1507  -1      20       4.329266    hypothetical protein
EcE24377A_1508  0       17       3.926360    gamma-glutamyl-gamma-aminobutyrate hydrolase
EcE24377A_1509  1       19       3.644697    DNA-binding transcriptional repressor PuuR
EcE24377A_1510  2       20       3.769589    gamma-glutamyl-gamma-aminobutyraldehyde
EcE24377A_1511  2       17       2.827268    gamma-glutamylputrescine oxidoreductase
EcE24377A_1512  2       13       1.394387    4-aminobutyrate aminotransferase
EcE24377A_1513  2       12       -0.911325   phage shock protein operon transcriptional
EcE24377A_1514  -1      18       -4.561692   phage shock protein PspA
EcE24377A_1515  -1      14       -3.836225   phage shock protein B
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_1507  ACETATEKNASE  27     0.002    Acetate kinase family signature.
		>ACETATEKNASE#Acetate kinase family signature.

Length = 400

Score = 27.1 bits (60), Expect = 0.002
Identities = 12/25 (48%), Positives = 14/25 (56%)

Query: 18 LKPVPAHYDTDNRIVHYIFHAQSHK 42
L P+P Y T +I Y FH SHK
Sbjct: 160 LYPIPYEYYTKYKIRKYGFHGTSHK 184


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits    Score  E-value  Comments
EcE24377A_1513  HTHFIS  342    e-118    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 342 bits (880), Expect = e-118
Identities = 126/341 (36%), Positives = 182/341 (53%), Gaps = 23/341 (6%)

Query: 6 DNLLGEANSFLEVLEQVSHLAPLDKPVLIIGERGTGKELIASRLHYLSSRWQGPFISLNC 65
L+G + + E+ ++ L D ++I GE GTGKEL+A LH R GPF+++N
Sbjct: 137 MPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALHDYGKRRNGPFVAINM 196

Query: 66 AALNENLLDSELFGHEAGAFTGAQKRHPGRFERADGGTLFLDELATAPMMVQEKLLRVIE 125
AA+ +L++SELFGHE GAFTGAQ R GRFE+A+GGTLFLDE+ PM Q +LLRV++
Sbjct: 197 AAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDMPMDAQTRLLRVLQ 256

Query: 126 YGELERVGGSQPLQVNVRLVCATNADLPAMVNEGTFRADLLDRLAFDVVQLPPLRERESD 185
GE VGG P++ +VR+V ATN DL +N+G FR DL RL ++LPPLR+R D
Sbjct: 257 QGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRLNVVPLRLPPLRDRAED 316

Query: 186 IMLMAEHFAIQMCREIKLPLFPGFTERARETLLNYRWPGNIRELKNVVERSVYRHGTSDY 245
I + HF Q +E F + A E + + WPGN+REL+N+V R +
Sbjct: 317 IPDLVRHFVQQAEKEGLDVK--RFDQEALELMKAHPWPGNVRELENLVRRLTALYPQDVI 374

Query: 246 PLDDIIID---PFKRRPPEDAIAVSETTSLPTLPLD------------------LREFQM 284
+ I + P E A A S + S+ +
Sbjct: 375 TREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFASFGDALPPSGLYDRVLA 434

Query: 285 QQEKELLQLSLQQGKYNQKRAAELLGLTYHQFRALLKKHQI 325
+ E L+ +L + NQ +AA+LLGL + R +++ +
Sbjct: 435 EMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGV 475


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_1515  MPTASEINHBTR  25     0.032    Metalloprotease inhibitor signature.
		>MPTASEINHBTR#Metalloprotease inhibitor signature.

Length = 122

Score = 24.6 bits (53), Expect = 0.032
Identities = 7/43 (16%), Positives = 17/43 (39%)

Query: 30 SGRSELSQSEQQRLAQLADEAKRMRERIQALESILDAEHPNWR 72
+G+ + + A A++A + + E L + +W
Sbjct: 37 AGQLGIEATGSGVCAGPAEQANALAGDVACAEQWLGDKPVSWS 79


30  EcE24377A_1625  EcE24377A_1639  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias     Product
EcE24377A_1625  2       19       -0.708092   hypothetical protein
EcE24377A_1626  2       20       -1.772673   hypothetical protein
EcE24377A_1627  1       16       -1.142383   hypothetical protein
EcE24377A_1628  2       17       -1.592181   hypothetical protein
EcE24377A_1629  1       15       -1.714879   acetyltransferase
EcE24377A_1630  0       14       -1.360397   zinc-binding dehydrogenase oxidoreductase
EcE24377A_1631  0       14       -2.172404   DNA-binding transcriptional regulator
EcE24377A_1632  1       10       0.292638    TonB-dependent receptor
EcE24377A_1633  3       24       4.528845    hypothetical protein
EcE24377A_1634  2       23       4.084024    hypothetical protein
EcE24377A_1635  3       25       4.481528    L-asparagine permease
EcE24377A_1637  3       29       4.985543    hypothetical protein
EcE24377A_1638  3       27       4.846208    ImpA family type VI secretion-associated
EcE24377A_1639  3       24       4.420236    protein rhsD, truncation
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_1629  SACTRNSFRASE  36     3e-05    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 36.5 bits (84), Expect = 3e-05
Identities = 13/61 (21%), Positives = 30/61 (49%), Gaps = 1/61 (1%)

Query: 99 RHTVEHSVYVHPDHQGKGLGRKLLSRLIDEARDCGKHVMVAGIESQNQASLHLHQSLGFV 158
+E + V D++ KG+G LL + I+ A++ ++ + N ++ H + F+
Sbjct: 89 YALIED-IAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFI 147

Query: 159 V 159
+
Sbjct: 148 I 148


31  EcE24377A_1679  EcE24377A_1706  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias     Product
EcE24377A_1679  -2      15       -3.269923   diguanylate cyclase
EcE24377A_1680  -1      12       -3.319191   lipoprotein
EcE24377A_1681  -2      13       -4.735819   glutamate:gamma aminobutyrate antiporter
EcE24377A_1682  -2      15       -5.979883   glutamate decarboxylase GadB
EcE24377A_1683  -2      21       -7.756336   hypothetical protein
EcE24377A_1684  -1      21       -6.401064   M16B family peptidase
EcE24377A_1685  0       22       -6.868868   TonB-dependent receptor
EcE24377A_1686  1       26       -7.247507   inner membrane ABC transporter ATP-binding
EcE24377A_1687  1       27       -6.797064   radical SAM domain-containing protein
EcE24377A_1689  1       26       -5.454139   transcriptional regulator YdeO
EcE24377A_1690  2       26       -4.484244   oxidoreductase
EcE24377A_1691  4       30       -5.947637   protein FimH-like protein
EcE24377A_1692  1       28       -1.928392   protein fimG-like protein
EcE24377A_1693  1       25       0.540821    fimbrial protein
EcE24377A_1695  0       26       1.424421    periplasmic pilus chaperone family protein
EcE24377A_1696  2       26       1.766673    hypothetical protein
EcE24377A_1697  2       26       -0.305695   hypothetical protein
EcE24377A_1698  3       27       -0.879039   type II/III secretion system protein
EcE24377A_1699  3       31       -3.194481   hypothetical protein
EcE24377A_1700  4       38       -7.039508   hypothetical protein
EcE24377A_1702  4       39       -7.380878   hypothetical protein
EcE24377A_1701  3       38       -7.804974   hypothetical protein
EcE24377A_1703  1       40       -8.549689   hypothetical protein
EcE24377A_1704  0       39       -9.045545   hypothetical protein
EcE24377A_1705  0       35       -8.316279   replication initiation factor family protein
EcE24377A_1706  3       20       -3.616222   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_1698  BCTERIALGSPD  121    5e-32    Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D signature.

Length = 660

Score = 121 bits (305), Expect = 5e-32
Identities = 72/394 (18%), Positives = 159/394 (40%), Gaps = 65/394 (16%)

Query: 51 PFMLAPELVNDPRAVTLHISPDIDEREFITRYLGNMNIRISRKKGVDYIYSHTPAAPEEP 110
P + +V D R + +S + + R+ I + ++ + + + IY
Sbjct: 224 PGSMVANVVADERTNAVLVSGEPNSRQRIIAMIKQLDRQQATQGNTKVIYL--------- 274

Query: 111 LKSLVYTPRYRTVEYLHQALSGLGQLPAAQQPVQGANGEQTWQAVSSGTRFISASGDV-- 168
+Y L + L+G+ + +Q + V++ + I
Sbjct: 275 --------KYAKASDLVEVLTGISS--------TMQSEKQAAKPVAALDKNIIIKAHGQT 318

Query: 169 --FVFRGTAREVELVRQLLPQIDVRAQEVSVAGYVFEVQTSER----------------- 209
+ + + +++ Q+D+R +V V + EVQ ++
Sbjct: 319 NALIVTAAPDVMNDLERVIAQLDIRRPQVLVEAIIAEVQDADGLNLGIQWANKNAGMTQF 378

Query: 210 NGSGLALAAELLSGRF-----SITMSSASGLDNF----IRLSTGSVDAMYELFRTDSRFQ 260
SGL ++ + +++ S AS L +F G+ + + ++
Sbjct: 379 TNSGLPISTAIAGANQYNKDGTVSSSLASALSSFNGIAAGFYQGNWAMLLTALSSSTKND 438

Query: 261 VVSSPRLRVISGKEAVFSVGSDVPVLS-SVSWQDKVPVQSVEYRSSGAIFRVKPTVT-QD 318
++++P + + EA F+VG +VPVL+ S + +VE ++ G +VKP + D
Sbjct: 439 ILATPSIVTLDNMEATFNVGQEVPVLTGSQTTSGDNIFNTVERKTVGIKLKVKPQINEGD 498

Query: 319 VIGLDIVQQLSNFAKTDTGVNNT--PTLIKREVSTSVSLNDGDIIVLGGLAENKTSKART 376
+ L+I Q++S+ A + ++ T R V+ +V + G+ +V+GGL + S
Sbjct: 499 SVLLEIEQEVSSVADAASSTSSDLGATFNTRTVNNAVLVGSGETVVVGGLLDKSVSDTAD 558

Query: 377 GLSFLPDV------FGSDSDERAKTDIIVVLQAR 404
+ L D+ F S S + +K ++++ ++
Sbjct: 559 KVPLLGDIPVIGALFRSTSKKVSKRNLMLFIRPT 592


32  EcE24377A_1730  EcE24377A_1787  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias     Product
EcE24377A_1730  0       15       -3.396327   sugar efflux transporter
EcE24377A_1731  -1      18       -6.103304   multiple drug resistance protein MarC
EcE24377A_1732  0       22       -6.988230   DNA-binding transcriptional repressor MarR
EcE24377A_1733  2       25       -7.797260   DNA-binding transcriptional activator MarA
EcE24377A_1734  1       27       -8.031994   hypothetical protein
EcE24377A_1735  1       26       -8.618346   6-phospho-beta-glucosidase
EcE24377A_1736  1       27       -8.047056   glucoside specific outer membrane porin
EcE24377A_1737  3       23       -6.157755   PTS system lactose/cellobiose-specific family
EcE24377A_1738  3       22       -6.170555   PTS system lactose/cellobiose family transporter
EcE24377A_1739  1       21       -5.443913   PTS system lactose/cellobiose family transporter
EcE24377A_1740  1       20       -4.951635   GntR family transcriptional regulator
EcE24377A_1741  1       19       -3.467058   O-acetylserine/cysteine export protein
EcE24377A_1742  1       20       -3.058395   MFS-type transporter YdeE
EcE24377A_1743  1       22       -2.960040   hypothetical protein
EcE24377A_1744  0       21       -2.585146   hypothetical protein
EcE24377A_1745  0       19       -1.615103   hypothetical protein
EcE24377A_1746  0       18       -1.529684   competence damage-inducible protein A
EcE24377A_1747  0       19       -2.213860   dipeptidyl carboxypeptidase II
EcE24377A_1748  -1      19       -3.336796   3-hydroxy acid dehydrogenase
EcE24377A_1749  -1      21       -4.282861   GntR family transcriptional regulator
EcE24377A_1750  -1      24       -4.564413   hypothetical protein
EcE24377A_1751  0       25       -4.631402   mannitol dehydrogenase
EcE24377A_1752  1       23       -2.226237   inner membrane metabolite transport protein
EcE24377A_1753  3       27       -0.665468   hypothetical protein
EcE24377A_1754  3       27       0.962554    site-specific recombinase resolvase PinR
EcE24377A_1755  3       26       3.959622    tail fiber assembly protein
EcE24377A_1756  3       28       5.101931    hypothetical protein
EcE24377A_1757  3       27       4.925614    IS66 family transposase
EcE24377A_1758  2       27       4.201700    IS66 family orf2
EcE24377A_1759  2       25       3.264878    IS66 family orf1
EcE24377A_1760  2       23       1.902327    IS66 family transposase
EcE24377A_1761  3       28       -4.125430   IS66 family orf2
EcE24377A_1762  2       28       -3.771307   IS66 family orf1
EcE24377A_1763  2       30       -5.034525   lysis protein S
EcE24377A_1764  1       30       -4.922145   hypothetical protein
EcE24377A_1765  4       37       -9.378303   cold shock DNA-binding protein
EcE24377A_1766  3       35       -8.449452   antitermination protein
EcE24377A_1767  4       37       -8.944306   hypothetical protein
EcE24377A_1768  4       38       -10.057226  hypothetical protein
EcE24377A_1770  4       37       -9.379457   hypothetical protein
EcE24377A_1771  5       38       -9.235807   hypothetical protein
EcE24377A_1772  2       26       -4.111046   IS3, transposase orfA
EcE24377A_1774  1       26       -4.270189   hypothetical protein
EcE24377A_1775  0       27       -4.454590   phage O family protein
EcE24377A_1776  -2      34       -7.026923   hypothetical protein
EcE24377A_1777  2       41       -7.873626   DNA-binding transcriptional regulator DicC
EcE24377A_1778  1       23       -4.528880   hypothetical protein
EcE24377A_1779  1       23       -4.172691   hypothetical protein
EcE24377A_1780  1       24       -4.537746   hypothetical protein
EcE24377A_1781  2       26       -5.754394   division inhibition protein DicB
EcE24377A_1782  1       24       -5.580176   hypothetical protein
EcE24377A_1784  1       20       -4.555570   exonuclease
EcE24377A_1783  0       21       -5.114533   hypothetical protein
EcE24377A_1785  -1      20       -4.732683   phage excisionase
EcE24377A_1786  -1      18       -4.589464   phage integrase site specific recombinase
EcE24377A_1787  -2      16       -3.955612   dehydrogenase
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_1730  TCRTETB  53     7e-10    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 53.3 bits (128), Expect = 7e-10
Identities = 41/192 (21%), Positives = 83/192 (43%), Gaps = 8/192 (4%)

Query: 36 LSDIAQSFHMQTAQVGIMLTIYAWVVALMSLPFMLMTSQVERRKLLICLFVVFIASHVLS 95
L DIA F+ A + T + ++ + + ++ Q+ ++LL+ ++ V+
Sbjct: 37 LPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIG 96

Query: 96 FLSWS-FTVLVISRIGVAFAHAIFWSITASLAIRMAPAGKRAQALSLIATGTALAMVLGL 154
F+ S F++L+++R A F ++ + R P R +A LI + A+ +G
Sbjct: 97 FVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGP 156

Query: 155 PLGRIVGQYFGWRMTFFAIGIGALITLLCLIKLLPLLPSEHSGSLKSLPLLFRRPALMSI 214
+G ++ Y W I + +IT+ L+KLL + LMS+
Sbjct: 157 AIGGMIAHYIHWSY-LLLIPMITIITVPFLMKLLK------KEVRIKGHFDIKGIILMSV 209

Query: 215 YLLTVVVVTAHY 226
++ ++ T Y
Sbjct: 210 GIVFFMLFTTSY 221


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
EcE24377A_1736	RTXTOXIND	30	0.032	Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 29.8 bits (67), Expect = 0.032
Identities = 13/50 (26%), Positives = 22/50 (44%), Gaps = 5/50 (10%)

Query: 33 QRLELLENELSQNKQELKATQNELGVYKSRLSTLQKSITENKYKSASLAE 82
+ +LE E + NEL VYKS+L ++ I K + + +
Sbjct: 250 AKHAVLEQE-----NKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQ 294


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
EcE24377A_1742	TCRTETA	43	1e-06	Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 43.3 bits (102), Expect = 1e-06
Identities = 42/239 (17%), Positives = 82/239 (34%), Gaps = 18/239 (7%)

Query: 7 RSTSALLASSLLLTIGRGATLPFMTIYLSRQYSLSVDLI---GYAMTIALTISVVFSLGF 63
R +L++ L +G G +P + L R S D+ G + + + +
Sbjct: 5 RPLIVILSTVALDAVGIGLIMPVLPGLL-RDLVHSNDVTAHYGILLALYALMQFACAPVL 63

Query: 64 GILADKFDKKRYMLLAITAFASGFIAIPLVNNVKLVVLFFALINCAYSVFATVLKAWFAD 123
G L+D+F ++ +L+++ A + + + ++ + + + A A+ AD
Sbjct: 64 GALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAG-AYIAD 122

Query: 124 NLSSTSKTKIFSINYTMLNIGWTIGPPLGTLLVMQSINLPFWLAAICSAFPMLFIQIWVK 183
+ + F G GP LG L+ S + PF+ AA + L +
Sbjct: 123 ITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLP 182

Query: 184 RSEK---------IIATETGSVWSPKVLLQDKALLWFTCSGFLASFVSGAFASCISQYV 233
S K + W + A L F+ V A+ +
Sbjct: 183 ESHKGERRPLRREALNPLASFRW--ARGMTVVAALMAV--FFIMQLVGQVPAALWVIFG 237


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
EcE24377A_1748	DHBDHDRGNASE	100	2e-27	2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 100 bits (249), Expect = 2e-27
Identities = 70/244 (28%), Positives = 114/244 (46%), Gaps = 16/244 (6%)

Query: 2 IVLVTGATAGFGECITRRFIQQGHKVIATGRRQERLQELKDELGDNLYIAQ---LDVRNR 58
I +TGA G GE + R QG + A E+L+++ L A+ DVR+
Sbjct: 10 IAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRDS 69

Query: 59 AAIEEMLASLPAEWCNIDILVNNAGLALGMEPAHKASVEDWETMIDTNNKGLVYMTRAVL 118
AAI+E+ A + E IDILVN AG+ L H S E+WE N+ G+ +R+V
Sbjct: 70 AAIDEITARIEREMGPIDILVNVAGV-LRPGLIHSLSDEEWEATFSVNSTGVFNASRSVS 128

Query: 119 PGMVERNHGHIINIGSTAGSWPYAGGNVYGATKAFVRQFSLNLRTDLHGTAVRVTDIEPG 178
M++R G I+ +GS P Y ++KA F+ L +L +R + PG
Sbjct: 129 KYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSPG 188

Query: 179 LVGGTEFSNVRFKGDDGKAE------KTYQNTVALT----PEDVSEAV-WWVSTLPAHVN 227
T+ + ++G + +T++ + L P D+++AV + VS H+
Sbjct: 189 ST-ETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGHIT 247

Query: 228 INTL 231
++ L
Sbjct: 248 MHNL 251


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
EcE24377A_1752	TCRTETB	48	4e-08	Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 48.0 bits (114), Expect = 4e-08
Identities = 33/118 (27%), Positives = 55/118 (46%), Gaps = 16/118 (13%)

Query: 44 VGAFIFGKMGDRIGRKKVLFITITMMGICTTLIGVLPTYAQIGVFAPILLVTLRIIQGLG 103
+G ++GK+ D++G K++L I + + + V ++ + + A R IQG G
Sbjct: 64 IGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMA-------RFIQGAG 116

Query: 104 AGAEISGAGTMLAEYAPKGKR----GIISSFVAMGTNCGTLSATAI-----WAFMFFI 152
A A + ++A Y PK R G+I S VAMG G I W+++ I
Sbjct: 117 AAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLI 174


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
EcE24377A_1757	CHANLCOLICIN	31	0.014	Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 30.8 bits (69), Expect = 0.014
Identities = 25/91 (27%), Positives = 42/91 (46%), Gaps = 5/91 (5%)

Query: 4 SLAHENARLRALLQTQQDTIRQMAEYNRLLSQRVAAYASEINRLKALVAKLQRMQFGKSS 63
+ A A AL Q +D + + +N + A N A+ A+ +R++ K+
Sbjct: 79 AQAKAKANRDALTQRLKDIVNEALRHNASRTPSATELAHANN--AAMQAEDERLRLAKAE 136

Query: 64 EKLR---AKTERQIQEAQERISALQEEMAET 91
EK R E+ QEA++R ++ E AET
Sbjct: 137 EKARKEAEAAEKAFQEAEQRRKEIEREKAET 167
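
Each hit block above carries a summary line of the form "Score = … bits (…), Expect = …". As a minimal sketch, assuming Python and hypothetical helper names (parse_score_line, passes_threshold) that are not part of PredictBias, such lines could be parsed and checked against the RPSBlast E-value of 0.05 listed under the input parameters. Treating that value as an inclusive upper bound is an assumption of this sketch.

```python
import re

# Matches summary lines such as
# "Score = 53.3 bits (128), Expect = 7e-10"
SCORE_RE = re.compile(
    r"Score\s*=\s*([\d.]+)\s*bits\s*\((\d+)\),\s*Expect\s*=\s*(\S+)"
)


def parse_score_line(line):
    """Return (bit_score, raw_score, e_value) or None if the line does not match."""
    m = SCORE_RE.search(line)
    if m is None:
        return None
    bits, raw, expect = m.groups()
    # Very small E-values are printed without a mantissa ("e-119");
    # prepend "1" so that float() can read them.
    if expect.lower().startswith("e"):
        expect = "1" + expect
    return float(bits), int(raw), float(expect)


def passes_threshold(line, max_evalue=0.05):
    """True if the line parses and its E-value is at or below max_evalue (assumed rule)."""
    parsed = parse_score_line(line)
    return parsed is not None and parsed[2] <= max_evalue


if __name__ == "__main__":
    for text in (
        "Score = 53.3 bits (128), Expect = 7e-10",
        "Score = 341 bits (876), Expect = e-119",
    ):
        print(parse_score_line(text), passes_threshold(text))
```

The mantissa fix matters for lines such as "Expect = e-119", which would otherwise fail to convert to a number.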


33	EcE24377A_1904	EcE24377A_1909	Y	Y	N	Pathogenicity Island (biased-composition)
LocusTag	DNBias	CDNBias	%GCBias	Product
EcE24377A_1904	3	16	-3.649027	inner membrane protein
EcE24377A_1905	1	20	-5.030967	hypothetical protein
EcE24377A_1906	1	17	-4.106928	major facilitator family transporter
EcE24377A_1907	0	17	-4.346644	major facilitator transporter
EcE24377A_1908	0	19	-4.076080	quinate/shikimate dehydrogenase
EcE24377A_1909	-1	17	-3.542584	3-dehydroquinate dehydratase
ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
EcE24377A_1906	TCRTETA	30	0.015	Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 30.2 bits (68), Expect = 0.015
Identities = 58/311 (18%), Positives = 107/311 (34%), Gaps = 16/311 (5%)

Query: 61 FAGLLSDRFGRRPFIMLGMCCYMAFFFGILHTNNIIIAYVFGFLAGMANSFLDAGTYPSL 120
G LSDRFGRRP +++ + + + + + Y+ +AG+ + A +
Sbjct: 62 VLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATG-AVAGAYI 120

Query: 121 MEAFPRSPGTANI-LIKAFVSSGQFLLPLIISLLVWAELWFGWSFMIAAGIMFINALFLY 179
+ + + A G P++ L+ F AA + +N L
Sbjct: 121 ADITDGDERARHFGFMSACFGFGMVAGPVLGGLM--GGFSPHAPFFAAAALNGLNFLTGC 178

Query: 180 RCTFPPHPGHRLPV---IKKTTSSTEHRCSIIDLASYTLYGYISMATFYLVSQWLAQYGQ 236
H G R P+ +S + +A+ +I + + +G+
Sbjct: 179 FLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGE 238

Query: 237 FVAGMSYTM-SIKLLSIYTVGSLLCVFITAPLIRNTVRPTTLLMLYTFISFIALFTVCLH 295
T I L + + SL IT P+ L++ IA T +
Sbjct: 239 DRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALML-----GMIADGTGYIL 293

Query: 296 PTFYVVIIFAF-VIGFTSAGGVVQIGLTLMAERF--PYAKGKATGIYYSAGSIATFTIPL 352
F AF ++ ++GG+ L M R +G+ G + S+ + PL
Sbjct: 294 LAFATRGWMAFPIMVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPL 353

Query: 353 ITAHLSQRSIA 363
+ + SI
Sbjct: 354 LFTAIYAASIT 364


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
EcE24377A_1907	TCRTETB	31	0.012	Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 30.6 bits (69), Expect = 0.012
Identities = 38/177 (21%), Positives = 75/177 (42%), Gaps = 9/177 (5%)

Query: 14 ILAVLCIYFSYFLHGISVITLAQNMSSLAEKFSTDNAGIAYLISGIGLGRLISILFFGVI 73
IL LCI F ++ + L ++ +A F+ A ++ + L I +G +
Sbjct: 15 ILIWLCIL--SFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKL 72

Query: 74 SDKFGRRAVILMAVIMY----LLFFFGIPACPNLTLAYGLAVCVGIANSALDTGGYPALM 129
SD+ G + ++L +I+ ++ F G L +A + A AL
Sbjct: 73 SDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARY- 131

Query: 130 ECFPKASGSAVILVKAMVSFGQMFYPMLVSYMLLNNIWYGYGLIIPGILFVLITLML 186
+ G A L+ ++V+ G+ P + M+ + I + Y L+IP I + + ++
Sbjct: 132 -IPKENRGKAFGLIGSIVAMGEGVGP-AIGGMIAHYIHWSYLLLIPMITIITVPFLM 186


34	EcE24377A_1953	EcE24377A_1978	Y	N	N	Genomic Island
LocusTag	DNBias	CDNBias	%GCBias	Product
EcE24377A_1953	-1	12	-3.052605	hydroperoxidase II
EcE24377A_1954	0	19	-4.704025	hypothetical protein
EcE24377A_1955	0	17	-5.000246	6-phospho-beta-glucosidase
EcE24377A_1956	0	17	-4.571838	DNA-binding transcriptional regulator ChbR
EcE24377A_1957	1	17	-2.409160	PTS system N,N'-diacetylchitobiose-specific
EcE24377A_1958	2	16	-2.083438	PTS system N,N'-diacetylchitobiose-specific
EcE24377A_1959	1	16	-1.694806	PTS system N,N'-diacetylchitobiose-specific
EcE24377A_1960	0	14	-1.584098	DNA-binding transcriptional activator OsmE
EcE24377A_1961	0	14	-0.723633	NAD synthetase
EcE24377A_1963	1	15	0.945065	nucleotide excision repair endonuclease
EcE24377A_1962	0	13	2.614283	hypothetical protein
EcE24377A_1964	0	13	3.253023	hypothetical protein
EcE24377A_1965	0	13	3.560660	hypothetical protein
EcE24377A_1966	0	13	3.606546	succinylglutamate desuccinylase
EcE24377A_1967	0	13	3.633545	succinylarginine dihydrolase
EcE24377A_1968	1	12	2.719322	succinylglutamic semialdehyde dehydrogenase
EcE24377A_1969	0	13	0.836191	arginine succinyltransferase
EcE24377A_1970	0	14	-0.108819	bifunctional succinylornithine
EcE24377A_1971	0	17	-0.482477	hypothetical protein
EcE24377A_1972	1	16	0.931900	exonuclease III
EcE24377A_1973	1	17	2.005013	hypothetical protein
EcE24377A_1974	2	15	2.511531	hypothetical protein
EcE24377A_1975	1	15	3.071784	hypothetical protein
EcE24377A_1976	2	15	3.374650	carboxymuconolactone decarboxylase
EcE24377A_1977	2	15	3.246251	ABC transporter substrate-binding protein
EcE24377A_1978	2	15	2.755274	ABC transporter permease
35	EcE24377A_1991	EcE24377A_2019	Y	Y	N	Pathogenicity Island (biased-composition)
LocusTag	DNBias	CDNBias	%GCBias	Product
EcE24377A_1991	0	19	-4.545484	asparaginase
EcE24377A_1992	0	22	-5.506014	nicotinamidase/pyrazinamidase
EcE24377A_1993	0	23	-6.194203	major facilitator family transporter
EcE24377A_1994	0	23	-5.295922	DeoR family transcriptional regulator
EcE24377A_1995	0	21	-4.189187	aldo/keto reductase
EcE24377A_1996	0	22	-4.359259	PfkB family kinase
EcE24377A_1997	0	22	-3.902997	fructose-bisphosphate aldolase
EcE24377A_1998	-1	18	-3.112782	sorbitol dehydrogenase
EcE24377A_1999	-1	17	-2.338241	major facilitator family transporter
EcE24377A_2000	-1	19	-2.066738	zinc-binding dehydrogenase oxidoreductase
EcE24377A_2001	0	23	-1.616520	hypothetical protein
EcE24377A_2002	1	18	-1.390731	methionine sulfoxide reductase B
EcE24377A_2003	1	16	-1.508215	glyceraldehyde-3-phosphate dehydrogenase
EcE24377A_2004	-1	10	-3.854950	aldose 1-epimerase
EcE24377A_2005	-1	11	-4.535911	aldo/keto reductase
EcE24377A_2006	-1	12	-4.675352	MltA-interacting protein MipA
EcE24377A_2007	-1	14	-5.282274	serine kinase
EcE24377A_2008	-3	18	-5.841540	hypothetical protein
EcE24377A_2009	-2	21	-5.064841	diguanylate cyclase
EcE24377A_2010	-2	19	-1.614802	diguanylate cyclase
EcE24377A_2011	0	21	0.785626	prolyl-tRNA synthetase
EcE24377A_2012	-1	19	-0.876435	hypothetical protein
EcE24377A_2014	-1	19	-1.024093	hypothetical protein
EcE24377A_2015	-1	21	-2.072061	inner membrane transport protein yeaN
EcE24377A_2016	0	22	-4.073476	hypothetical protein
EcE24377A_2017	-1	22	-3.221662	lipoprotein
EcE24377A_2018	-2	22	-3.485835	GAF domain/diguanylate cyclase domain-containing
EcE24377A_2019	0	22	-3.488599	hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
EcE24377A_1992	ISCHRISMTASE	38	2e-05	Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 38.1 bits (88), Expect = 2e-05
Identities = 36/192 (18%), Positives = 57/192 (29%), Gaps = 58/192 (30%)

Query: 2 PHRALLLV-DLQNDFCAGGALAVPEGDSTVDVANRLIDWCQSRGEAVI-----ASQD--- 52
P+RA+LL+ D+QN F +L + C G V+ SQ+
Sbjct: 28 PNRAVLLIHDMQNYFVDAFTAGASPVTELSANIRKLKNQCVQLGIPVVYTAQPGSQNPDD 87

Query: 53 -------WHPANHGSFASQHGVEPYTPGQLDGLPQTFWPDHCVQNSEGAQLHPLLNQKAI 105
W P + + + P D + T W
Sbjct: 88 RALLTDFWGPGLNSGPYEEKIITELAPEDDDLV-LTKW---------------------- 124

Query: 106 AAVFHKGENPLVDSYSAFFDNGRRQKTSLDDWLRDHEIDELIVMGLATDYCVKFTVLDAL 165
YSAF +T+L + +R D+LI+ G+ T +A
Sbjct: 125 -------------RYSAFK------RTNLLEMMRKEGRDQLIITGIYAHIGCLVTACEAF 165

Query: 166 QLGYKVNVITDG 177
K + D
Sbjct: 166 MEDIKAFFVGDA 177


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
EcE24377A_1993	TCRTETB	40	2e-05	Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 39.9 bits (93), Expect = 2e-05
Identities = 30/129 (23%), Positives = 50/129 (38%), Gaps = 1/129 (0%)

Query: 65 ALMFGYFIGSLTGGFIGDYFGRRRAFRINLLIVGIAATGAAFVPDMY-WLIFFRFLMGTG 123
A M + IG+ G + D G +R ++I + + LI RF+ G G
Sbjct: 57 AFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAG 116

Query: 124 MGALIMVGYASFTEFIPATVRGKWSARLSFVGNWSPMLSAAIGVVVIAFFSWRIMFLLGG 183
A + +IP RGK + + + AIG ++ + W + L+
Sbjct: 117 AAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPM 176

Query: 184 IGILLAWFL 192
I I+ FL
Sbjct: 177 ITIITVPFL 185


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
EcE24377A_1999	TCRTETB	31	0.011	Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 31.0 bits (70), Expect = 0.011
Identities = 33/142 (23%), Positives = 48/142 (33%), Gaps = 23/142 (16%)

Query: 71 MFLGALVGGIIGDKTGRRNAFILYEAIHIASMVVGAFSPNMDF-LIACRFVMGVGLGALL 129
+G V G + D+ G + + I+ V+G + LI RF+ G G A
Sbjct: 62 FSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFP 121

Query: 130 VTLFAGFTEYMPGRNR----GTWSSRVSFIGNWSYPLCSLIAMGLTPLISA----EWNWR 181
+ Y+P NR G S V+ + G+ P I +W
Sbjct: 122 ALVMVVVARYIPKENRGKAFGLIGSIVA------------MGEGVGPAIGGMIAHYIHWS 169

Query: 182 VQLLIPAILSLIATALAWRYFP 203
LLIP I I T
Sbjct: 170 YLLLIPMI--TIITVPFLMKLL 189


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
EcE24377A_2004	INVEPROTEIN	29	0.021	Salmonella/Shigella invasion protein E (InvE) signat...
		>INVEPROTEIN#Salmonella/Shigella invasion protein E (InvE) signature.

Length = 372

Score = 29.3 bits (65), Expect = 0.021
Identities = 18/81 (22%), Positives = 34/81 (41%), Gaps = 13/81 (16%)

Query: 165 ETTSALHTYFNVGDIAKVSVSGLGDRFIDKVNDAKED-----------VLTDGIQTFPDR 213
E ++AL + N D K S S L + F ++V + + V ++ F +
Sbjct: 57 EMSAALAQFRNRRDYEKKS-SNLSNSF-ERVLEDEALPKAKQILKLISVHGGALEDFLRQ 114

Query: 214 TDRVYLNPQDCSVINDEALNR 234
++ +P D ++ E L R
Sbjct: 115 ARSLFPDPSDLVLVLRELLRR 135


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
EcE24377A_2014	PRTACTNFAMLY	28	0.022	Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 27.7 bits (61), Expect = 0.022
Identities = 18/61 (29%), Positives = 26/61 (42%)

Query: 49 QGLSIGIIILTIGVMAPIASGTLPPSTLIHSFLNWKSLVAIAVGVIVSWLGGRGVTLMGS 108
Q +I L IG + + LPPS ++ N ++ A VS LG +TL G
Sbjct: 174 QRSAIVDGGLHIGALQSLQPEDLPPSRVVLRDTNVTAVPASGAPAAVSVLGASELTLDGG 233

Query: 109 Q 109

Sbjct: 234 H 234


36	EcE24377A_2170	EcE24377A_2242	Y	Y	Y	Pathogenicity Island (biased-composition)
LocusTag	DNBias	CDNBias	%GCBias	Product
EcE24377A_2170	1	16	4.129988	flagellar hook-basal body protein FliE
EcE24377A_2171	1	15	4.049008	flagellar MS-ring protein
EcE24377A_2172	2	17	4.165389	flagellar motor switch protein G
EcE24377A_2173	0	16	3.557314	flagellar assembly protein H
EcE24377A_2174	-1	17	3.224347	flagellum-specific ATP synthase
EcE24377A_2175	0	16	2.253082	flagellar biosynthesis chaperone
EcE24377A_2176	-1	16	2.333444	flagellar hook-length control protein
EcE24377A_2177	-2	21	1.775186	flagellar basal body protein FliL
EcE24377A_2178	0	16	0.441516	flagellar motor switch protein FliM
EcE24377A_2179	1	16	-2.536901	flagellar motor switch protein FliN
EcE24377A_2180	1	17	-3.153425	flagellar biosynthesis protein FliO
EcE24377A_2181	0	19	-4.069572	flagellar biosynthesis protein FliP
EcE24377A_2182	0	20	-4.247065	flagellar biosynthesis protein FliQ
EcE24377A_2183	-2	17	-3.020189	flagellar biosynthesis protein FliR
EcE24377A_2184	0	19	-2.490640	colanic acid capsular biosynthesis activation
EcE24377A_2185	-2	16	0.183362	hypothetical protein
EcE24377A_2186	-2	16	0.605856	hypothetical protein
EcE24377A_2188	-2	17	1.144137	mannosyl-3-phosphoglycerate phosphatase
EcE24377A_2187	0	17	0.968896	diguanylate cyclase
EcE24377A_2189	2	18	1.770031	hypothetical protein
EcE24377A_2190	1	18	1.488833	hypothetical protein
EcE24377A_2192	2	18	0.964279	hypothetical protein
EcE24377A_2191	-1	13	-1.820232	very short patch repair protein
EcE24377A_2193	-1	13	-2.617341	DNA cytosine methylase
EcE24377A_2194	-2	23	-6.452263	hypothetical protein
EcE24377A_2195	0	30	-8.237051	hypothetical protein
EcE24377A_2196	-1	31	-8.891649	hypothetical protein
EcE24377A_2197	-1	26	-6.908876	outer membrane protein
EcE24377A_2198	1	28	-6.110900	chaperone protein HchA
EcE24377A_2199	1	33	-7.705741	heavy metal sensor histidine kinase
EcE24377A_2200	2	28	-6.524888	transcriptional regulatory protein YedW
EcE24377A_2201	1	24	-6.166350	transthyretin family protein
EcE24377A_2202	0	25	-5.735536	sulfite oxidase subunit YedY
EcE24377A_2203	0	31	-8.124025	sulfite oxidase subunit YedZ
EcE24377A_2204	0	34	-6.348028	hypothetical protein
EcE24377A_2205	1	34	-4.948383	nickel-dependent hydrogenase, b-type cytochrome
EcE24377A_2206	2	33	-4.636116	hypothetical protein
EcE24377A_2207	1	34	-4.141650	tail fiber family protein
EcE24377A_2209	2	32	-5.003214	hypothetical protein
EcE24377A_2210	2	34	-5.274085	hypothetical protein
EcE24377A_2211	4	34	-5.367210	hypothetical protein
EcE24377A_2212	5	35	-5.761284	hypothetical protein
EcE24377A_2213	5	36	-4.832176	major tail sheath protein
EcE24377A_2214	6	36	-5.115627	hypothetical protein
EcE24377A_2215	3	35	-3.278101	phage tail protein I
EcE24377A_2216	5	33	-1.764313	baseplate assembly protein J
EcE24377A_2217	5	30	-1.847579	baseplate assembly protein W
EcE24377A_2218	6	25	-0.488854	phage baseplate assembly protein V
EcE24377A_2219	5	24	-0.097513	hypothetical protein
EcE24377A_2220	5	25	2.627622	hypothetical protein
EcE24377A_2221	3	26	3.433163	hypothetical protein
EcE24377A_2222	2	25	3.297855	hypothetical protein
EcE24377A_2223	2	26	2.883461	phage major capsid protein E
EcE24377A_2224	3	25	2.370400	bacteriophage lambda head decoration protein D
EcE24377A_2225	2	23	2.045276	S49 family peptidase
EcE24377A_2227	2	21	0.363940	IS21 family transposition helper protein
EcE24377A_2228	2	22	-0.091846	IS21 family transposase
EcE24377A_2230	3	24	-0.819673	phage terminase large subunit (GpA)
EcE24377A_2231	0	28	-2.991320	hypothetical protein
EcE24377A_2232	0	27	-3.758525	hypothetical protein
EcE24377A_2233	2	34	-7.668910	phage antitermination protein Q
EcE24377A_2234	5	36	-8.212782	hypothetical protein
EcE24377A_2235	6	36	-8.853523	hypothetical protein
EcE24377A_2236	4	32	-7.884409	hypothetical protein
EcE24377A_2237	4	33	-7.976456	hypothetical protein
EcE24377A_2238	4	32	-7.498485	hypothetical protein
EcE24377A_2239	2	26	-4.124163	hypothetical protein
EcE24377A_2240	2	24	-3.941792	hypothetical protein
EcE24377A_2241	2	26	-3.572950	hypothetical protein
EcE24377A_2242	2	27	-4.477793	lipoprotein
ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
EcE24377A_2170	FLGHOOKFLIE	117	4e-38	Flagellar hook-basal body complex protein FliE signa...
		>FLGHOOKFLIE#Flagellar hook-basal body complex protein FliE signature.

Length = 103

Score = 117 bits (295), Expect = 4e-38
Identities = 101/103 (98%), Positives = 102/103 (99%)

Query: 2 SAIQGIEGVISLLQATAMSARAQDSLPQPTISFAGQLHAALDRISDTQTAARTQAEKFTL 61
SAIQGIEGVIS LQATAMSARAQ+SLPQPTISFAGQLHAALDRISDTQTAARTQAEKFTL
Sbjct: 1 SAIQGIEGVISQLQATAMSARAQESLPQPTISFAGQLHAALDRISDTQTAARTQAEKFTL 60

Query: 62 GEPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV 104
GEPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV
Sbjct: 61 GEPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV 103


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
EcE24377A_2171	FLGMRINGFLIF	755	0.0	Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 755 bits (1951), Expect = 0.0
Identities = 477/555 (85%), Positives = 514/555 (92%), Gaps = 5/555 (0%)

Query: 3 ATAAQTKSLEWLNRLRANPKIPLIVAGSAAVAVMVALILWAKAPDYRTLFSNLSDQDGGA 62
+TA Q K LEWLNRLRANP+IPLIVAGSAAVA++VA++LWAK PDYRTLFSNLSDQDGGA
Sbjct: 5 STATQPKPLEWLNRLRANPRIPLIVAGSAAVAIVVAMVLWAKTPDYRTLFSNLSDQDGGA 64

Query: 63 IVSQLTQMNIPYRFSEASGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQ 122
IV+QLTQMNIPYRF+ SGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQ
Sbjct: 65 IVAQLTQMNIPYRFANGSGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQ 124

Query: 123 FSEQVNYQRALEGELSRTIATIGPVKGARVHLAMPKPSLFVREQKSPSASVTINLLPGRA 182
FSEQVNYQRALEGEL+RTI T+GPVK ARVHLAMPKPSLFVREQKSPSASVT+ L PGRA
Sbjct: 125 FSEQVNYQRALEGELARTIETLGPVKSARVHLAMPKPSLFVREQKSPSASVTVTLEPGRA 184

Query: 183 LDEGQISAIVHLVSSAVAGLPPGNVTLVDQGGHLLTQSNTSGRDLNDAQLKYASDVEGRI 242
LDEGQISA+VHLVSSAVAGLPPGNVTLVDQ GHLLTQSNTSGRDLNDAQLK+A+DVE RI
Sbjct: 185 LDEGQISAVVHLVSSAVAGLPPGNVTLVDQSGHLLTQSNTSGRDLNDAQLKFANDVESRI 244

Query: 243 QRRIEAILSPIVGNGNIHAQVTAQLDFASKEQTEEQYRPNGDESHAALRSRQLNESEQSG 302
QRRIEAILSPIVGNGN+HAQVTAQLDFA+KEQTEE Y PNGD S A LRSRQLN SEQ G
Sbjct: 245 QRRIEAILSPIVGNGNVHAQVTAQLDFANKEQTEEHYSPNGDASKATLRSRQLNISEQVG 304

Query: 303 SGYPGGVPGALSNQPAPANNAPISTPPANQNNRQQ--QASTTSNS---GPRSTQRNETSN 357
+GYPGGVPGALSNQPAP N API+TPP NQ N Q Q ST++NS GPRSTQRNETSN
Sbjct: 305 AGYPGGVPGALSNQPAPPNEAPIATPPTNQQNAQNTPQTSTSTNSNSAGPRSTQRNETSN 364

Query: 358 YEVDRTIRHTKMNVGDVQRLSVAVVVNYKTLPDGKPLPLSNEQMKQIEDLTREAMGFSEK 417
YEVDRTIRHTKMNVGD++RLSVAVVVNYKTL DGKPLPL+ +QMKQIEDLTREAMGFS+K
Sbjct: 365 YEVDRTIRHTKMNVGDIERLSVAVVVNYKTLADGKPLPLTADQMKQIEDLTREAMGFSDK 424

Query: 418 RGDSLNVVNSPFNSSDESGGELPFWQQQAFIDQLLAAGRWLLVLLVAWLLWRKAVRPQLT 477
RGD+LNVVNSPF++ D +GGELPFWQQQ+FIDQLLAAGRWLLVL+VAW+LWRKAVRPQLT
Sbjct: 425 RGDTLNVVNSPFSAVDNTGGELPFWQQQSFIDQLLAAGRWLLVLVVAWILWRKAVRPQLT 484

Query: 478 RRAEAMKAVQQQAQAREEVEDAVEVRLSKDEQLQQRRANQRLGAEVMSQRIREMSDNDPR 537
RR E KA Q+QAQ R+E E+AVEVRLSKDEQLQQRRANQRLGAEVMSQRIREMSDNDPR
Sbjct: 485 RRVEEAKAAQEQAQVRQETEEAVEVRLSKDEQLQQRRANQRLGAEVMSQRIREMSDNDPR 544

Query: 538 VVALVIRQWINNDHE 552
VVALVIRQW++NDHE
Sbjct: 545 VVALVIRQWMSNDHE 559


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
EcE24377A_2172	FLGMOTORFLIG	341	e-119	Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 341 bits (876), Expect = e-119
Identities = 117/329 (35%), Positives = 197/329 (59%), Gaps = 2/329 (0%)

Query: 1 MSNLTGTDKSVILLMTIGEDRAAEVFKHLSQREVQTLSAAMANVTQISNKQLTDVLAEFE 60
+S LTG K+ ILL++IG + +++VFK+LSQ E+++L+ +A + I+++ +VL EF+
Sbjct: 12 VSALTGKQKAAILLVSIGSEISSKVFKYLSQEEIESLTFEIAKLETITSELKDNVLLEFK 71

Query: 61 QEAEQFAALNINANDYLRSVLVKALGEERAASLLEDILETRDTASGIETLNFMEPQSAAD 120
+ + DY R +L K+LG ++A ++ + L + + E + +P + +
Sbjct: 72 ELMMAQEFIQKGGIDYARELLEKSLGTQKAVDIINN-LGSALQSRPFEFVRRADPANILN 130

Query: 121 LIRDEHPQIIATILVHLKRAQAADILALFDERLRHDVMLRIATFGGVQPAALAELTEVLN 180
I+ EHPQ IA IL +L +A+ IL+ ++ +V RIA P + E+ VL
Sbjct: 131 FIQQEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEVVREVERVLE 190

Query: 181 GLLDGQ-NLKRSKMGGVRTAAEIINLMKTQQEEAVITAVREFDGELAQKIIDEMFLFENL 239
L + + GGV EIIN+ + E+ +I ++ E D ELA++I +MF+FE++
Sbjct: 191 KKLASLSSEDYTSAGGVDNVVEIINMADRKTEKFIIESLEEEDPELAEEIKKKMFVFEDI 250

Query: 240 VDVDDRSIQRLLQEVDSESLLIALKGAEQPLREKFLRNMSQRAADILRDDLANRGPVRLS 299
V +DDRSIQR+L+E+D + L ALK + P++EK +NMS+RAA +L++D+ GP R
Sbjct: 251 VLLDDRSIQRVLREIDGQELAKALKSVDIPVQEKIFKNMSKRAASMLKEDMEFLGPTRRK 310

Query: 300 QVENEQKAILLIVRRLAETGEMVIGSGED 328
VE Q+ I+ ++R+L E GE+VI G +
Sbjct: 311 DVEESQQKIVSLIRKLEEQGEIVISRGGE 339


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
EcE24377A_2173	FLGFLIH	371	e-134	Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 371 bits (952), Expect = e-134
Identities = 223/228 (97%), Positives = 227/228 (99%)

Query: 1 MSDNLPWKTWTPDDLAPPQAEFVPMVEPEETIIEEAEPSLEQQLAQLQMQAHEQGYQAGI 60
MSDNLPWKTWTPDDLAPPQAEFVP+VEPEETIIEEAEPSLEQQLAQLQMQAHEQGYQAGI
Sbjct: 1 MSDNLPWKTWTPDDLAPPQAEFVPIVEPEETIIEEAEPSLEQQLAQLQMQAHEQGYQAGI 60

Query: 61 AEGRQQGHEQGYQEGLARGLEQGLAEAKSQQAPIHARMQQLVSEFQTTLDALDSVIASRL 120
AEGRQQGH+QGYQEGLA+GLEQGLAEAKSQQAPIHARMQQLVSEFQTTLDALDSVIASRL
Sbjct: 61 AEGRQQGHKQGYQEGLAQGLEQGLAEAKSQQAPIHARMQQLVSEFQTTLDALDSVIASRL 120

Query: 121 MQIALEAARQVIGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGAT 180
MQ+ALEAARQVIGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGAT
Sbjct: 121 MQMALEAARQVIGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGAT 180

Query: 181 LSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATRWQELCRLAAPRVV 228
LSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATRWQELCRLAAP VV
Sbjct: 181 LSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATRWQELCRLAAPGVV 228


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
EcE24377A_2175	FLGFLIJ	202	2e-70	Flagellar FliJ protein signature.
		>FLGFLIJ#Flagellar FliJ protein signature.

Length = 147

Score = 202 bits (515), Expect = 2e-70
Identities = 146/147 (99%), Positives = 147/147 (100%)

Query: 1 MAEHGALATLKDLAEKEVEDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRNNLNSDMSAG 60
MAEHGALATLKDLAEKEVEDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRNNLNSDMSAG
Sbjct: 1 MAEHGALATLKDLAEKEVEDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRNNLNSDMSAG 60

Query: 61 MTSNRWINYQQFIQTLEKAITQHRQQLNQWTQKVDIALNSWREKKQRLQAWQTLQERQST 120
+TSNRWINYQQFIQTLEKAITQHRQQLNQWTQKVDIALNSWREKKQRLQAWQTLQERQST
Sbjct: 61 ITSNRWINYQQFIQTLEKAITQHRQQLNQWTQKVDIALNSWREKKQRLQAWQTLQERQST 120

Query: 121 AALLAENRLDQKKMDEFAQRAAMRKPE 147
AALLAENRLDQKKMDEFAQRAAMRKPE
Sbjct: 121 AALLAENRLDQKKMDEFAQRAAMRKPE 147


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
EcE24377A_2176	FLGHOOKFLIK	463	e-166	Flagellar hook-length control protein signature.
		>FLGHOOKFLIK#Flagellar hook-length control protein signature.

Length = 375

Score = 463 bits (1191), Expect = e-166
Identities = 362/375 (96%), Positives = 368/375 (98%)

Query: 1 MIRLAPLITADVDTTTLPGGKASDAAQDFLALLSEALAGETTTDKAASQLLVATDKPTTK 60
MIRLAPLITADVDTTTLPGGKASDAAQDFLALLSEALAGETTTDKAA QLLVATDKPTTK
Sbjct: 1 MIRLAPLITADVDTTTLPGGKASDAAQDFLALLSEALAGETTTDKAAPQLLVATDKPTTK 60

Query: 61 GEPLVSEILADAQQADLLIPVDETPPVINDEQSTSTPLTTAQTMTLAAVAGNNTAKDEKA 120
GEPL+S+I++DAQQA+LLIPVDETPPVINDEQSTSTPLTTAQTM LAAVA NT KDEKA
Sbjct: 61 GEPLISDIVSDAQQANLLIPVDETPPVINDEQSTSTPLTTAQTMALAAVADKNTTKDEKA 120

Query: 121 DDLNEDVTASLSALFAMLPGFDNTPKVTDAPSTVLPTEKPTLFTKLTSAQLTTAQPDDAP 180
DDLNEDVTASLSALFAMLPGFDNTPKVTDAPSTVLPTEKPTLFTKLTS QLTTAQPDDAP
Sbjct: 121 DDLNEDVTASLSALFAMLPGFDNTPKVTDAPSTVLPTEKPTLFTKLTSEQLTTAQPDDAP 180

Query: 181 GTPAQPLTPLVAEAQSKAEVISTPSPVTAAASPLITPHQTQPLPTVAAPVLSAPLGSHEW 240
GTPAQPLTPLVAEAQSKAEVISTPSPVTAAASPLITPHQTQPLPTVAAPVLSAPLGSHEW
Sbjct: 181 GTPAQPLTPLVAEAQSKAEVISTPSPVTAAASPLITPHQTQPLPTVAAPVLSAPLGSHEW 240

Query: 241 QQSLSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQMVSPHQHVRAALEAA 300
QQSLSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQMVSPHQHVRAALEAA
Sbjct: 241 QQSLSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQMVSPHQHVRAALEAA 300

Query: 301 LPVLRTQLAESGIQLGQSNISGESFSGQQQAASQQQQSQRTVNHEPLAGEEDDTLPVPVS 360
LPVLRTQLAESGIQLGQSNISGESFSGQQQAASQQQQSQRT NHEPLAGE+DDTLPVPVS
Sbjct: 301 LPVLRTQLAESGIQLGQSNISGESFSGQQQAASQQQQSQRTANHEPLAGEDDDTLPVPVS 360

Query: 361 LQGRVTGNSGVDIFA 375
LQGRVTGNSGVDIFA
Sbjct: 361 LQGRVTGNSGVDIFA 375


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
EcE24377A_2178	FLGMOTORFLIM	381	e-135	Flagellar motor switch protein FliM signature.
		>FLGMOTORFLIM#Flagellar motor switch protein FliM signature.

Length = 344

Score = 381 bits (979), Expect = e-135
Identities = 85/324 (26%), Positives = 148/324 (45%), Gaps = 10/324 (3%)

Query: 5 ILSQAEIDALLNGDS--EVKDEPTASVSGESDIRPYDPNTQRRVVRERLQALEIINERFA 62
+LSQ EID LL S + E +S I YD + +E+++ L +++E FA
Sbjct: 4 VLSQDEIDQLLTAISSGDASIEDARPISDTRKITLYDFRRPDKFSKEQMRTLSLMHETFA 63

Query: 63 RHFRMGLFNLLRRSPDITVGAIRIQPYHEFARNLPVPTNLNLIHLKPLRGTGLVVFSPSL 122
R L LR + V ++ Y EF R++P P+ L +I + PL+G ++ PS+
Sbjct: 64 RLTTTSLSAQLRSMVHVHVASVDQLTYEEFIRSIPTPSTLAVITMDPLKGNAVLEVDPSI 123

Query: 123 VFIAVDNLFGGDGRFPTKVEGREFTHTEQRVINRMLKLALEGYSDAWKAINPLEVEYVRS 182
F +D LFGG G+ KV+ R+ T E V+ ++ L ++W + L +
Sbjct: 124 TFSIIDRLFGGTGQ-AAKVQ-RDLTDIENSVMEGVIVRILANVRESWTQVIDLRPRLGQI 181

Query: 183 EMQVKFTNITTSPNDIVVNTPFHVEIGNLTGEFNICLPFSMIEPLRELLVNPPLENS--R 240
E +F I P+++VV ++G G N C+P+ IEP+ L + +S R
Sbjct: 182 ETNPQFAQI-VPPSEMVVLVTLETKVGEEEGMMNFCIPYITIEPIISKLSSQFWFSSVRR 240

Query: 241 NEDQNWRDNLVRQVQHSQLELVANFADISLRLSQILKLKPGDVLPIEKP---DRIIAHVD 297
+ + L ++ +++VA + L + IL L+ GD++ + D + +
Sbjct: 241 SSTTQYMGVLRDKLSTVDMDVVAEVGSLRLSVRDILGLRVGDIIRLHDTHVGDPFVLSIG 300

Query: 298 GVPVLTSQYGTLNGQYALRIEHLI 321
Q G + + A +I I
Sbjct: 301 NRKKFLCQPGVVGKKIAAQILERI 324


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
EcE24377A_2179	FLGMOTORFLIN	211	3e-74	Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 211 bits (539), Expect = 3e-74
Identities = 125/137 (91%), Positives = 133/137 (97%)

Query: 1 MSDMNNPADDNNGAMDDLWAEALSEQKSTNSKSAADAVFQQFGGGDVSGTLQDIDLIMDI 60
MSDMNNP+D+N GA+DDLWA+AL+EQK+T +KSAADAVFQQ GGGDVSG +QDIDLIMDI
Sbjct: 1 MSDMNNPSDENTGALDDLWADALNEQKATTTKSAADAVFQQLGGGDVSGAMQDIDLIMDI 60

Query: 61 PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV 120
PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV
Sbjct: 61 PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV 120

Query: 121 RITDIITPSERMRRLSR 137
RITDIITPSERMRRLSR
Sbjct: 121 RITDIITPSERMRRLSR 137


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
EcE24377A_2181	FLGBIOSNFLIP	334	e-119	Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP signature.

Length = 245

Score = 334 bits (858), Expect = e-119
Identities = 245/245 (100%), Positives = 245/245 (100%)

Query: 1 MRRLLSVAPVLLWLITPLAFAQLPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILLM 60
MRRLLSVAPVLLWLITPLAFAQLPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILLM
Sbjct: 1 MRRLLSVAPVLLWLITPLAFAQLPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILLM 60

Query: 61 MTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEK 120
MTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEK
Sbjct: 61 MTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEK 120

Query: 121 ISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELK 180
ISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELK
Sbjct: 121 ISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELK 180

Query: 181 TAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLA 240
TAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLA
Sbjct: 181 TAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLA 240

Query: 241 QSFYS 245
QSFYS
Sbjct: 241 QSFYS 245


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
EcE24377A_2182	TYPE3IMQPROT	67	1e-18	Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.

Length = 86

Score = 67.1 bits (164), Expect = 1e-18
Identities = 22/78 (28%), Positives = 42/78 (53%)

Query: 4 ESVMMMGTEAMKVALALAAPLLLVALVTGLIISILQAATQINEMTLSFIPKIIAVFIAII 63
+ ++ G +A+ + L L+ +VA + GL++ + Q TQ+ E TL F K++ V + +
Sbjct: 2 DDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLF 61

Query: 64 IAGPWMLNLLLDYVRTLF 81
+ W +LL Y R +
Sbjct: 62 LLSGWYGEVLLSYGRQVI 79


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
EcE24377A_2183	TYPE3IMRPROT	202	9e-67	Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 202 bits (515), Expect = 9e-67
Identities = 258/261 (98%), Positives = 261/261 (100%)

Query: 1 MMQVTSDQWLSWLSLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA 60
M+QVTS+QWLSWL+LYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA
Sbjct: 1 MLQVTSEQWLSWLNLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA 60

Query: 61 NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHL 120
NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHL
Sbjct: 61 NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHL 120

Query: 121 NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIF 180
NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIF
Sbjct: 121 NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIF 180

Query: 181 LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC 240
LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC
Sbjct: 181 LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC 240

Query: 241 EHLFSEIFNLLADIISELPLI 261
EHLFSEIFNLLADIISELPLI
Sbjct: 241 EHLFSEIFNLLADIISELPLI 261


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_2193  PF05272  29     0.042    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.042
Identities = 20/62 (32%), Positives = 29/62 (46%), Gaps = 15/62 (24%)

Query: 320 AKYILTPVLWKYLYRYAKKHQARGNGFGYGMVYPNNPQSVTRTLSARYYKDGAEILIDRG 379
A+Y + PVLW Y+ R+ K + G+ VY +R +DG+E RG
Sbjct: 166 ARYQVGPVLWGYVVRFIK---SDGDKLTLPYVY------------SRSQRDGSEAWKWRG 210

Query: 380 WD 381
WD
Sbjct: 211 WD 212


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_2194  CARBMTKINASE  34     2e-04    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 34.4 bits (79), Expect = 2e-04
Identities = 22/92 (23%), Positives = 36/92 (39%), Gaps = 9/92 (9%)

Query: 37 AQKLAADDDVDMLVILTACYFHDIVSLAKNHPQRQRSSILAAEETRRLLREEFEQFPA-- 94
+KLA + + D+ +ILT + +L + Q + EE R+ E F A
Sbjct: 219 GEKLAEEVNADIFMILTDV---NGAALYYGTEKEQWLREVKVEELRKYYEE--GHFKAGS 273

Query: 95 --EKIEAVCHAIAAHSFSAQIAPLTTEAKIVQ 124
K+ A I A IA L + ++
Sbjct: 274 MGPKVLAAIRFIEWGGERAIIAHLEKAVEALE 305


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits        Score  E-value  Comments
EcE24377A_2197  ECOLIPORIN  447    e-160    E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 447 bits (1152), Expect = e-160
Identities = 214/387 (55%), Positives = 258/387 (66%), Gaps = 34/387 (8%)

Query: 1 MKRKVLAMLVPALLVAGAANAAEIYNKDGNKVDFYGKMVGERIWSNTDDNNSENEDTSYA 60
MKRKVLA+++PALL AGAA+AAEIYNKDGNK+D YGK+ G +S D++S++ D +Y
Sbjct: 1 MKRKVLALVIPALLAAGAAHAAEIYNKDGNKLDLYGKVDGLHYFS---DDSSKDGDQTYM 57

Query: 61 RFGVKGETQITSELTGFGQFEYNLDASKPEG-SNQEKTRLTFAGLKYNELGSFDYGRNYG 119
R G KGETQI +LTG+GQ+EYN+ A+ EG TRL FAGLK+ + GSFDYGRNYG
Sbjct: 58 RVGFKGETQINDQLTGYGQWEYNVQANTTEGEGANSWTRLAFAGLKFGDYGSFDYGRNYG 117

Query: 120 VAYDAAAYTDMLVEWGGDSWASADNFMNGRTNGVATYRNSDFFGLVDGLNFAVQYQGKNS 179
V YD +TDML E+GGDS+ ADN+M GR NGVATYRN+DFFGLVDGLNFA+QYQGKN
Sbjct: 118 VLYDVEGWTDMLPEFGGDSYTYADNYMTGRANGVATYRNTDFFGLVDGLNFALQYQGKNE 177

Query: 180 NRG----------------VTKQNGDGYALSVDYNI-EGFGFVGAYSKSDRTNEQAG--- 219
++ + NGDG+ +S Y+I GF AY+ SDRTNEQ
Sbjct: 178 SQSADDVNIGTNNRNNGDDIRYDNGDGFGISTTYDIGMGFSAGAAYTTSDRTNEQVNAGG 237

Query: 220 -DGYGDNAEVWSLAAKYDANNIYAAMMYGETRNMTVLA------NDHFANKTQNFEAVVQ 272
GD A+ W+ KYDANNIY A MY ETRNMT + ANKTQNFE Q
Sbjct: 238 TIAGGDKADAWTAGLKYDANNIYLATMYSETRNMTPYGKTDKGYDGGVANKTQNFEVTAQ 297

Query: 273 YQFDFGLRPSLGYVYSKGKDLYARNGHKGVDADRVNYIEVGTWYYFNKNMNVYTAYKFNL 332
YQFDFGLRP++ ++ SKGKDL N G D D V Y +VG YYFNKN + Y YK NL
Sbjct: 298 YQFDFGLRPAVSFLMSKGKDL-TYNNVNGDDKDLVKYADVGATYYFNKNFSTYVDYKINL 356

Query: 333 LDKDDAAITDA--ATDDQFAVGIVYQF 357
LD DD DA +TDD A+G+VYQF
Sbjct: 357 LDDDDPFYKDAGISTDDIVALGMVYQF 383


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_2199  PF06580  33     0.003    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 32.5 bits (74), Expect = 0.003
Identities = 38/181 (20%), Positives = 63/181 (34%), Gaps = 37/181 (20%)

Query: 290 ENILFLARADKNNVLVKLDSLS----------------LNKEVENLLDYL--EYLSDEKE 331
NI L D L SLS L E+ + YL + E
Sbjct: 180 NNIRALILEDPTKAREMLTSLSELMRYSLRYSNARQVSLADELTVVDSYLQLASIQFEDR 239

Query: 332 ICFKVECNQQIFADKI---LLQRMLSNLIVNAIRYSPEKSRIHITSFLDTNGYLNIDVAS 388
+ F+ + N I ++ L+Q ++ N I + I P+ +I + D NG + ++V +
Sbjct: 240 LQFENQINPAIMDVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKD-NGTVTLEVEN 298

Query: 389 PGTKIHEPEKLFRRFWRGDNSRHSVGQGLGLSLVKA-IAELHGGSATYHYLNKHNVFRIM 447
G+ + K G GL V+ + L+G A K M
Sbjct: 299 TGSLALKNTKE--------------STGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAM 344

Query: 448 L 448
+
Sbjct: 345 V 345


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits    Score  E-value  Comments
EcE24377A_2200  HTHFIS  84     2e-20    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 83.7 bits (207), Expect = 2e-20
Identities = 30/117 (25%), Positives = 60/117 (51%), Gaps = 1/117 (0%)

Query: 39 KILLIEDNQRTQEWVTQGLSEAGYVIDAVSDGRDGLYLALKDDYALIILDIMLPGMDGWQ 98
IL+ +D+ + + Q LS AGY + S+ D L++ D+++P + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 99 ILQTLRTA-KQTPVICLTARDSVDDRVRGLDSGANDYLVKPFSFSELLARVRAQLRQ 154
+L ++ A PV+ ++A+++ ++ + GA DYL KPF +EL+ + L +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121


37  EcE24377A_2255  EcE24377A_2275  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias     Product
EcE24377A_2255  1       19       -3.415142   hypothetical protein
EcE24377A_2256  0       19       -3.183553   hypothetical protein
EcE24377A_2257  -1      19       -3.091190   phage integrase site specific recombinase
EcE24377A_2259  -1      21       -3.526689   *hypothetical protein
EcE24377A_2261  -1      21       -3.337783   *invasin
EcE24377A_2262  -2      28       -4.807717   hypothetical protein
EcE24377A_2263  -2      25       -3.431280   shikimate transporter
EcE24377A_2264  -2      28       -3.651331   AMP nucleosidase
EcE24377A_2265  0       30       -3.422611   hypothetical protein
EcE24377A_2267  0       27       -1.811654   *hypothetical protein
EcE24377A_2269  1       28       -1.240521   *transcriptional regulator Cbl
EcE24377A_2270  2       30       -1.262421   nitrogen assimilation transcriptional regulator
EcE24377A_2272  1       28       -1.960147   *hypothetical protein
EcE24377A_2273  2       28       -2.396737   nicotinate-nucleotide--dimethylbenzimidazole
EcE24377A_2274  1       27       -3.088259   cobalamin synthase
EcE24377A_2275  -1      25       -3.211988   adenosylcobinamide kinase
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_2261  INTIMIN  699    0.0      Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 699 bits (1806), Expect = 0.0
Identities = 219/790 (27%), Positives = 349/790 (44%), Gaps = 70/790 (8%)

Query: 77 QQIASTSQQIGSLLAEDMNSEQAANMARGWASSQASGAMTDWLSRFGTARITLGVDEDFS 136
QQ AS Q+ S +N + A + A G A +QAS + WL +GTA + L +F
Sbjct: 168 QQAASLGSQLQS---RSLNGDYAKDTALGIAGNQASSQLQAWLQHYGTAEVNLQSGNNFD 224

Query: 137 LKNSQFDFLHPWYETPDNLFFSQHTLHRTDERTQINNGLGWRHFTPTWMSGINFFFDHDL 196
S DFL P+Y++ L F Q D R N G G R F P M G N F D D
Sbjct: 225 --GSSLDFLLPFYDSEKMLAFGQVGARYIDSRFTANLGAGQRFFLPENMLGYNVFIDQDF 282

Query: 197 SRYHSRAGIGAEYWRDYLKLSSNGYLRLTNWRSAPELDNDYEARPANGWDVRAEGWLPAW 256
S ++R GIG EYWRDY K S NGY R++ W + DY+ RPANG+D+R G+LP++
Sbjct: 283 SGDNTRLGIGGEYWRDYFKSSVNGYFRMSGWHESYN-KKDYDERPANGFDIRFNGYLPSY 341

Query: 257 PYLGGKLVYEQYYGDEVALFDKDDRQSNPHAITAGLNYTPFPLMTFSAEQRQGKQGENDT 316
P LG KL+YEQYYGD VALF+ D QSNP A T G+NYTP PL+T + R G END
Sbjct: 342 PALGAKLMYEQYYGDNVALFNSDKLQSNPGAATVGVNYTPIPLVTMGIDYRHGTGNENDL 401

Query: 317 RFAVDFTWQPGSAMQKQLDPNEVAARRSLAGSRYDLVDRNNNIVLEYRKKELVRLTLTDP 376
+++ F +Q +Q++P V R+L+GSRYDLV RNNNI+LEY+K++++ L +
Sbjct: 402 LYSMQFRYQFDKPWSQQIEPQYVNELRTLSGSRYDLVQRNNNIILEYKKQDILSLNIPHD 461

Query: 377 VTGKSGEVKSLVSSLQTKYALKGYNVEATALEAAGGKVVTTG----KDILVTLPGYRFTS 432
+ G + + +++KY L + +AL + GG++ +G +D LP Y
Sbjct: 462 INGTERSTQKIQLIVKSKYGLDRIVWDDSALRSQGGQIQHSGSQSAQDYQAILPAY---- 517

Query: 433 TPETDNTWPIEVTAEDVKGNLSNREQ-SMVVVQAPTLSQKDSSVSLSTQTLNADSHSTAT 491
N + + A D GN SN ++ V+ + + + +A + T
Sbjct: 518 VQGGSNVYKVTARAYDRNGNSSNNVLLTITVLSNGQVVDQVGVTDFTADKTSAKADGTEA 577

Query: 492 LTFIAH------DAAGNPVVGLVLSTRHEGVQDITLSEWKDNGDGSYTQILTTGAMSGTL 545
+T+ A A PV ++S G ++ + NG G T L + +
Sbjct: 578 ITYTATVKKNGVAQANVPVSFNIVS----GTAVLSANSANTNGSGKATVTLKSDKPGQVV 633

Query: 546 TLMPQLNGVDAAKAPAVVNIISISSSRTHSSIKIDKDRYLSGNPIEVTVELR-DENDKPV 604
A A AV I + + + IK DK ++ +T ++ + DKPV
Sbjct: 634 VSAKTAEMTSALNANAV--IFVDQTKASITEIKADKTTAVANGQDAITYTVKVMKGDKPV 691

Query: 605 KEQKQQLNNAVSIDNVKPGVTTDWKETADGVYKATYTAYTRGSGL-TAKLLMQNWNEDLH 663
Q+ + K +T+ K +G K T T+ T G L +A++ +
Sbjct: 692 SNQEVTFTTTLG----KLSNSTE-KTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAP 746

Query: 664 TAGFIIDANPQSAKIATLSASNNGVLANENAANTVSVNVADEGSNPINDHTVTFAVLSGS 723
F I + G L + + ++
Sbjct: 747 EVEFFTTLTIDDGNIEIVGTGVKGKLPTV---------------------WLQYGQVNLK 785

Query: 724 ATCFNNQNTAKTDVNGLATFDLKSSK---QEDNTVEVTLENGVKQTLIVSFVGDSSTAQV 780
A+ N + T ++ +A+ D S + +E T +++ + QT ++ + + +
Sbjct: 786 ASGGNGKYTWRSANPAIASVDASSGQVTLKEKGTTTISVISSDNQT--ATYTIATPNSLI 843

Query: 781 DLQKSKNEVVADGNDSATMTATVRDAKGNLLNDVKVTF----------NVNSAAAKLSQT 830
SK D ++ + N L +V + + + + + QT
Sbjct: 844 VPNMSKRVTYNDAVNTCKNFGGKLPSSQNELENVFKAWGAANKYEYYKSSQTIISWVQQT 903

Query: 831 EVNSHDGIAT 840
++ G+A+
Sbjct: 904 AQDAKSGVAS 913



Score = 197 bits (503), Expect = 2e-53
Identities = 96/376 (25%), Positives = 157/376 (41%), Gaps = 34/376 (9%)

Query: 759 LENGVKQTLIVSFVGDSSTAQ--VDLQKSKNEVVADGNDSATMTATVRDAKGNLLNDVKV 816
N V T+ V G D K ADG ++ T TATV+ N V V
Sbjct: 538 SSNNVLLTITVLSNGQVVDQVGVTDFTADKTSAKADGTEAITYTATVKKNGVAQAN-VPV 596

Query: 817 TFNVNSAAAKLSQTEVNSH-DGIATATLTSLKNGDYRVTASVSSGSQA-NQQVIFIGDQS 874
+FN+ S A LS N++ G AT TL S K G V+A + + A N + DQ+
Sbjct: 597 SFNIVSGTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAKTAEMTSALNANAVIFVDQT 656

Query: 875 TAALTLSVPSGDITVTNTAPLHMTATLQ-DKNGNPLKDKEITFSVPNDVASRFSISNSGK 933
A++T + + T +T T++ K P+ ++E+TF+ + ++
Sbjct: 657 KASIT-EIKADKTTAVANGQDAITYTVKVMKGDKPVSNQEVTFT------TTLGKLSNST 709

Query: 934 GMTDSNGTAIASLTGTLAGTHMITARLANSNVSDTQPMTFVADKDRAVVVLQTSKAEIIG 993
TD+NG A +LT T G +++AR+++ V P + + + EI+G
Sbjct: 710 EKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEV----EFFTTLTIDDGNIEIVG 765

Query: 994 NGVDETTLTATVK-DPSNHPVAGITVNFTMPQDVAANFTLENNGIAITQANGEAHVTLKG 1052
GV T ++ N +G +T + N IA A+ VTLK
Sbjct: 766 TGVKGKLPTVWLQYGQVNLKASGGNGKYT--------WRSANPAIASVDAS-SGQVTLKE 816

Query: 1053 KKAGTHTVTATLGNNNTSDSQPVTFVADKTSAQVVLQMSKDEITGNGVDNATLTATVKDQ 1112
K GT T++ +SD+Q T+ ++ +V MSK + V+
Sbjct: 817 K--GTTTISVI-----SSDNQTATYTIATPNSLIVPNMSKRVTYNDAVNTCKNFGGKLPS 869

Query: 1113 FDNEVNNLPVTFSSAS 1128
NE+ N+ + +A+
Sbjct: 870 SQNELENVFKAWGAAN 885



Score = 87.0 bits (215), Expect = 6e-19
Identities = 91/395 (23%), Positives = 131/395 (33%), Gaps = 49/395 (12%)

Query: 1055 AGTHTVTATLGNNNTSDSQPVTFVADKTSAQVVLQMS--------KDEITGNGVDNATLT 1106
+ + VTA + N + S V S V+ K +G + T T
Sbjct: 522 SNVYKVTARAYDRNGNSSNNVLLTITVLSNGQVVDQVGVTDFTADKTSAKADGTEAITYT 581

Query: 1107 ATVKDQFDNEVNNLPVTFSSASSGLTLTPGVSNTNESGIAQATLAGVAFGEQTVTASLAN 1166
ATVK + N PV+F+ S L+ +NTN SG A TL G+ V+A A
Sbjct: 582 ATVKKNGVAQANV-PVSFNIVSGTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAKTAE 640

Query: 1167 NGASDNKTVHFIGDTAAAKIIELTPVPDSIIAGTPQNSSGSVITATV-VDNNGFPVKGVT 1225
++ N D A I E+ + +A IT TV V PV
Sbjct: 641 MTSALNANAVIFVDQTKASITEIKADKTTAVANGQ-----DAITYTVKVMKGDKPVSNQE 695

Query: 1226 VNFTSRTNSAEMTNGGQAVTNEQGKATVTYTNTRSSIESGARPDTVEASLENGSSTLSTS 1285
V F T + + T+ G A VT T+T G S +S
Sbjct: 696 VTF---TTTLGKLSNSTEKTDTNGYAKVTLTSTTP-----------------GKSLVSAR 735

Query: 1286 INVNADASTAHLTLLQALFDTVSAGDTTNLYIEVKDNYGNGVPQQ--EVTLRVSPSEGVT 1343
+ +D + F T++ D N+ I G GV + V L+
Sbjct: 736 V---SDVAVDVKAPEVEFFTTLTI-DDGNIEIV-----GTGVKGKLPTVWLQYGQVNLKA 786

Query: 1344 PSNNAIYTTNHDGNFYTSFTATKAGV---YQVTATLENGDSMQQTVTYVPNVANAEITLA 1400
N YT S A+ V + T T+ S QT TY N+ I
Sbjct: 787 SGGNGKYTWRSANPAIASVDASSGQVTLKEKGTTTISVISSDNQTATYTIATPNSLIVPN 846

Query: 1401 ASKDPVIADNNDLTTLTATVADTEGNAIANTEVTF 1435
SK D + + N + N +
Sbjct: 847 MSKRVTYNDAVNTCKNFGGKLPSSQNELENVFKAW 881



Score = 82.8 bits (204), Expect = 1e-17
Identities = 89/466 (19%), Positives = 166/466 (35%), Gaps = 38/466 (8%)

Query: 1846 SGGKVRTNSSGQA--------PVVLTSNKVGTYTVTASFHNGVT----IQTQTTVKVTGN 1893
GG+++ + S A V + V T A NG + + T T +
Sbjct: 495 QGGQIQHSGSQSAQDYQAILPAYVQGGSNVYKVTARAYDRNGNSSNNVLLTITVLSNGQV 554

Query: 1894 SSTAHVASFIADPSTIAATNTDLSTLKTTVEDGSGNLIEGLTVYFALKSGSATLTSLTAV 1953
V F AD ++ A T+ T TV+ +G + V F + SG+A L++ +A
Sbjct: 555 VDQVGVTDFTADKTSAKADGTEAITYTATVKK-NGVAQANVPVSFNIVSGTAVLSANSAN 613

Query: 1954 TDQNGIATTSVKGAMTGSVTVSAVTTAGGMQTVDITLVAGPADTSQSVLKSNRSSLKGDY 2013
T+ +G AT ++K G V VSA TA ++ V T S+ +
Sbjct: 614 TNGSGKATVTLKSDKPGQVVVSA-KTAEMTSALNANAVIFVDQTKASITEIKADKTTAVA 672

Query: 2014 TDSAELRLVLHDISGNPIKVSEGMEFVQSGTNVPYIKISAIDYSLNINGDYKATVTGGGE 2073
+ + + G+ ++ + F + + + NG K T+T
Sbjct: 673 NGQDAITYTVKVMKGDKPVSNQEVTFTTTLGKLSNSTEKT-----DTNGYAKVTLTSTTP 727

Query: 2074 GIATLIPVLNGVHQAGLSTTIQFTRAEDKIMSGTVSVNGTDLPTTTFPSQGFTGAYYQLN 2133
G + + ++ V + ++F I G + + GT + P+ L
Sbjct: 728 GKSLVSARVSDVAVDVKAPEVEF-FTTLTIDDGNIEIVGTGV-KGKLPTVWLQYGQVNLK 785

Query: 2134 NDNFAPGKTAADYEFSSSASWVDVDATGKVTFKNVGSNWERITATPKSGGPSYIYEIRVK 2193
+ G + ++ A ++G+VT K G+ I+ +
Sbjct: 786 ---ASGGNGKYTWRSANPAIASVDASSGQVTLKEKGTT--TISVISSDNQTATYTIATPN 840

Query: 2194 SWWVNSGDAFMIYSLAENFCSSNGYTLPRADHLNHSRSRGIGSLYSEWGDMGHYTTEAGF 2253
S V + + Y+ A N C + G LP + + + +++ WG Y
Sbjct: 841 SLIVPNMSKRVTYNDAVNTCKNFGGKLPSSQNE-------LENVFKAWGAANKYEYYKSS 893

Query: 2254 QSNMYW-----SSSPANSSEQYVVSLATGDQSVFEKLGFAYATCYK 2294
Q+ + W + + + Y + ++ AYATC K
Sbjct: 894 QTIISWVQQTAQDAKSGVASTYDLVKQNPLNNIKASESNAYATCVK 939



Score = 61.6 bits (149), Expect = 3e-11
Identities = 74/359 (20%), Positives = 128/359 (35%), Gaps = 37/359 (10%)

Query: 1370 YQVTATLENGDSMQQTVTYVPNVANAEITLAASKDPVIADNNDLT-----------TLTA 1418
Y + + Q + P N TL+ S+ ++ NN++ +
Sbjct: 403 YSMQFRYQFDKPWSQQIE--PQYVNELRTLSGSRYDLVQRNNNIILEYKKQDILSLNIPH 460

Query: 1419 TVADTEGNA---IANTEVTFTLPEDVKANFTL-SDGGKAITDAEGKAK---VTLKGTKAG 1471
+ TE + + + L V + L S GG+ A+ L G
Sbjct: 461 DINGTERSTQKIQLIVKSKYGLDRIVWDDSALRSQGGQIQHSGSQSAQDYQAILPAYVQG 520

Query: 1472 AH-----TVTASMTGGKS---EQLVVNFIADTLTAQ----VNLNVTEDNFIANNVGMTRL 1519
T A G S L + +++ + + + A+
Sbjct: 521 GSNVYKVTARAYDRNGNSSNNVLLTITVLSNGQVVDQVGVTDFTADKTSAKADGTEAITY 580

Query: 1520 QATVTDGNGNPLANEAVTFTLPADVSASFTLGQGGSAITDINGKAEVTLSGTKSGTYPVT 1579
ATV NG AN V+F + VS + L SA T+ +GKA VTL K G V+
Sbjct: 581 TATVKK-NGVAQANVPVSFNI---VSGTAVLSAN-SANTNGSGKATVTLKSDKPGQVVVS 635

Query: 1580 VSVNNYGVSDTKQVTLIADAGTAKLASLTSVYSFVVSTTEGATMTASVTDTNGNPVEGIK 1639
+ + D A + + + + V+ + A PV +
Sbjct: 636 AKTAEMTSALNANAVIFVDQTKASITEIKADKTTAVANGQDAITYTVKVMKGDKPVSNQE 695

Query: 1640 VNFRGTSVTLSSTSVETDDRGFAEILVTSTEVGLKTVSASLADKPTEVISRLLNASADV 1698
V F T LS+++ +TD G+A++ +TST G VSA ++D +V + + +
Sbjct: 696 VTFTTTLGKLSNSTEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFFTTL 754



Score = 55.8 bits (134), Expect = 2e-09
Identities = 44/161 (27%), Positives = 63/161 (39%), Gaps = 5/161 (3%)

Query: 1816 TLTATLTSANGTPVEGQVINFSVTPEGATLSGGKVRTNSSGQAPVVLTSNKVGTYTVTAS 1875
T TAT+ NG ++F++ A LS TN SG+A V L S+K G V+A
Sbjct: 579 TYTATVKK-NGVAQANVPVSFNIVSGTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAK 637

Query: 1876 FHNGVTIQTQTTVKVTGNSSTAHVASFIADPSTIAATNTDLSTLKTTVEDGSGNLIEGLT 1935
+ + + + A + AD +T A D T V G +
Sbjct: 638 TAEMTS-ALNANAVIFVDQTKASITEIKADKTTAVANGQDAITYTVKVMKG-DKPVSNQE 695

Query: 1936 VYFALKSGSATLTSLTAVTDQNGIATTSVKGAMTGSVTVSA 1976
V F G + + T TD NG A ++ G VSA
Sbjct: 696 VTFTTTLGKLSNS--TEKTDTNGYAKVTLTSTTPGKSLVSA 734


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_2263  TCRTETB  34     9e-04    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 34.5 bits (79), Expect = 9e-04
Identities = 39/259 (15%), Positives = 95/259 (36%), Gaps = 18/259 (6%)

Query: 79 LGGVIFGHFGDRLGRKRMLMLTVWMMGIATALIGILPSFSTIGWWAPILLVTLRAIQGFA 138
+G ++G D+LG KR+L+ + + + + + SF ++ I+ ++ A
Sbjct: 64 IGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSL----LIMARFIQGAGAAA 119

Query: 139 VGGEWGGAALLSVESAPKNKK-AFYSSGVQVGYGVGLLLSTGLVSLISIMTTDEQFLSWG 197
+ + K S V +G GVG + + I
Sbjct: 120 FPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYI------------H 167

Query: 198 WRIPFLFSIVLVLGALWVRNGMEESAEFEQQQHYQAAAKKRIPVIEALLRHPGAFLKIIA 257
W L ++ ++ ++ +++ + + + ++ +L + +
Sbjct: 168 WSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDIKGIILMSVGIVFFMLFTTSYSISFLI 227

Query: 258 LRLCELLTMYIVTAFALNYSTQNMGLPRELFLNIGLLVGGFSCLTIPCFAWLADRFGRRR 317
+ + L +++ + + GL + + IG+L GG T+ F + +
Sbjct: 228 VSVLSFL-IFVKHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDV 286

Query: 318 VYITGALIGTLSAFPFFMA 336
++ A IG++ FP M+
Sbjct: 287 HQLSTAEIGSVIIFPGTMS 305


38  EcE24377A_2309  EcE24377A_2342  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias     Product
EcE24377A_2309  -1      21       3.163325    antitoxin YefM
EcE24377A_2310  -1      22       3.714147    ATP phosphoribosyltransferase
EcE24377A_2311  0       22       3.519350    histidinol dehydrogenase
EcE24377A_2312  0       25       2.857124    histidinol-phosphate aminotransferase
EcE24377A_2313  -2      19       1.314896    imidazole glycerol-phosphate
EcE24377A_2314  -1      16       -1.215763   imidazole glycerol phosphate synthase subunit
EcE24377A_2315  -1      17       -1.880797   1-(5-phosphoribosyl)-5-[(5-
EcE24377A_2316  -1      16       -5.852333   imidazole glycerol phosphate synthase subunit
EcE24377A_2317  1       26       -9.840725   bifunctional phosphoribosyl-AMP
EcE24377A_2318  3       34       -12.737448  chain length determinant protein
EcE24377A_2319  4       42       -14.784457  UDP-glucose 6-dehydrogenase
EcE24377A_2320  6       47       -16.344175  6-phosphogluconate dehydrogenase
EcE24377A_2321  9       62       -20.905066  hypothetical protein
EcE24377A_2322  9       62       -20.783299  hypothetical protein
EcE24377A_2323  9       61       -20.804769  hypothetical protein
EcE24377A_2324  7       59       -19.391453  VI polysaccharide biosynthesis protein
EcE24377A_2325  4       53       -18.096550  VI polysaccharide biosynthesis protein
EcE24377A_2326  3       50       -16.081501  hypothetical protein
EcE24377A_2327  1       43       -12.903111  glycosyl transferase family protein
EcE24377A_2328  1       42       -11.510132  glycosyl transferase group 2 family protein
EcE24377A_2329  0       32       -7.014538   dTDP-4-dehydrorhamnose 3,5-epimerase
EcE24377A_2330  -1      22       -5.742941   glucose-1-phosphate thymidylyltransferase
EcE24377A_2331  -3      14       -3.494279   dTDP-4-dehydrorhamnose reductase
EcE24377A_2332  -2      13       -1.966652   dTDP-glucose-4,6-dehydratase
EcE24377A_2333  -1      15       0.066870    hypothetical protein
EcE24377A_2334  -1      17       0.657059    UTP-glucose-1-phosphate uridylyltransferase
EcE24377A_2335  -1      20       1.272914    colanic acid biosynthesis protein
EcE24377A_2336  0       22       2.345033    colanic acid biosynthesis glycosyl transferase
EcE24377A_2337  0       22       2.572653    pyruvyl transferase
EcE24377A_2338  0       22       2.840863    colanic acid exporter
EcE24377A_2339  -1      20       3.155764    UDP-glucose lipid carrier transferase
EcE24377A_2340  -1      22       3.642437    phosphomannomutase
EcE24377A_2341  -1      23       3.621319    hypothetical protein
EcE24377A_2342  -1      21       3.035539    mannose-1-phosphate guanylyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_2324  NUCEPIMERASE  265    3e-89    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 265 bits (678), Expect = 3e-89
Identities = 91/341 (26%), Positives = 159/341 (46%), Gaps = 30/341 (8%)

Query: 19 LITGVAGFIGSNLLETLLRLNQTVVGLDNFATGHQRNLDEVQQNVTPEAWQKFTMIEGDI 78
L+TG AGFIG ++ + LL VVG+DN + +L + + + + F + D+
Sbjct: 4 LVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQ--PGFQFHKIDL 61

Query: 79 RDYETCMKAVDN--VNYVLHQAALGSVPRSINDPITTNEVNVSGFLNMLQAAKCRDVESF 136
D E + V +V S+ +P + N++GFLN+L+ + ++
Sbjct: 62 ADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQHL 121

Query: 137 VFAASSSTYGDHTALP-KIEHTIGKPLSPYAITKYVNELYAEVFAKMYGFKSIGLRYFNV 195
++A+SSS YG + +P + ++ P+S YA TK NEL A ++ +YG + GLR+F V
Sbjct: 122 LYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYGLPATGLRFFTV 181

Query: 196 FGKRQDPNGAYAAVIPKWISAMLMDKDIYINGDGETSRDFSFIENTVQMNILAA------ 249
+G P+ A K+ AML K I + G+ RDF++I++ + I
Sbjct: 182 YGPWGRPDMAL----FKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIRLQDVIPHA 237

Query: 250 ----------TADDSAKGEVYNVAYGERTSLNELVEFIKNDLGSNGIKYAGNIIYRDFRK 299
A A VYN+ L + ++ +++ LG K +
Sbjct: 238 DTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAKK-----NMLPLQP 292

Query: 300 GDVRHSLADISKAREMLGYQPQFDIKTGLRQAMPWYINLFK 340
GDV + AD E++G+ P+ +K G++ + WY + +K
Sbjct: 293 GDVLETSADTKALYEVIGFTPETTVKDGVKNFVNWYRDFYK 333


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_2331  NUCEPIMERASE  46     1e-07    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 45.5 bits (108), Expect = 1e-07
Identities = 32/170 (18%), Positives = 64/170 (37%), Gaps = 25/170 (14%)

Query: 1 MNILLFGKTGQVGWELQRALAPLGN-LIALDVHSTDY--------------------CGD 39
M L+ G G +G+ + + L G+ ++ +D + Y D
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKID 60

Query: 40 FSNPEGVAETVKKIRPDVIVNAAAHTAVDKAESEPEFAQLLNATSVESIAKAANEVG-AW 98
++ EG+ + + + + AV + P N T +I +
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQH 120

Query: 99 VIHYSTDYVFPGTGEIPWLETDATA-PLNVYGETKLAGEKALQEHCAKHL 147
+++ S+ V+ ++P+ D+ P+++Y TK A E L H HL
Sbjct: 121 LLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANE--LMAHTYSHL 168


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_2332  NUCEPIMERASE  181    1e-56    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 181 bits (462), Expect = 1e-56
Identities = 88/360 (24%), Positives = 149/360 (41%), Gaps = 48/360 (13%)

Query: 1 MKILVTGGAGFIGSAVVRHIINNTQDSVVNVDKLT--YAGNL-ESLADVSDSERYVFEHA 57
MK LVTG AGFIG V + ++ VV +D L Y +L ++ ++ + F
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAG-HQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKI 59

Query: 58 DICDAAVMARIFAQHQPDAVMHLAAESHVDRSITGPAAFIETNIVGTYVLLEAARNYWSA 117
D+ D M +FA + V V S+ P A+ ++N+ G +LE R+
Sbjct: 60 DLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHN--- 116

Query: 118 LDGDKKNSFRFHHISTDEVYGDLPHPDEVNNTEELPLFTETTAYAPSSPYSASKASSDHL 177
+ S+ VYG ++P T+ + P S Y+A+K +++ +
Sbjct: 117 ------KIQHLLYASSSSVYGL---------NRKMPFSTDDSVDHPVSLYAATKKANELM 161

Query: 178 VRAWKRTYGLPTIVTNCSNNYGPYHFPEKLIPLVILNALEGKALPIYGKGDQIRDWLYVE 237
+ YGLP YGP+ P+ + LEGK++ +Y G RD+ Y++
Sbjct: 162 AHTYSHLYGLPATGLRFFTVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYID 221

Query: 238 D-------------HARALYTVVTEGKA-----GETYNIGGHNEKKNIDVVLTICDLLDE 279
D HA +TV T A YNIG + + +D + + D L
Sbjct: 222 DIAEAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALG- 280

Query: 280 IVPKEKSYREQITYVADRPGHDRRYAIDAEKIGRELGWKPQETFESGIRKTVEWYLSNTK 339
+ +K+ +PG + D + + +G+ P+ T + G++ V WY K
Sbjct: 281 -IEAKKNMLPL------QPGDVLETSADTKALYEVIGFTPETTVKDGVKNFVNWYRDFYK 333


39  EcE24377A_2363  EcE24377A_2399  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias     Product
EcE24377A_2363  -2      13       3.020835    hypothetical protein
EcE24377A_2364  -2      16       3.783058    hypothetical protein
EcE24377A_2365  -2      17       3.871312    von Willebrand factor A
EcE24377A_2366  -2      17       3.925670    multidrug efflux system subunit MdtA
EcE24377A_2367  -2      18       3.768813    multidrug efflux system subunit MdtB
EcE24377A_2368  -1      16       3.096132    multidrug efflux system subunit MdtC
EcE24377A_2369  -2      14       1.050156    multidrug efflux system protein MdtE
EcE24377A_2370  -2      12       0.676469    signal transduction histidine-protein kinase
EcE24377A_2371  -1      15       -1.458406   DNA-binding transcriptional regulator BaeR
EcE24377A_2372  -2      14       -2.326640   hypothetical protein
EcE24377A_2373  0       14       -2.821697   CC2985 family addiction module antidote protein
EcE24377A_2374  0       14       -2.881300   RelE/ParE family plasmid stabilization system
EcE24377A_2375  2       13       -2.059463   U32 family peptidase
EcE24377A_2376  3       21       -3.815148   hypothetical protein
EcE24377A_2377  3       20       -3.543398   lipid kinase
EcE24377A_2378  3       19       -3.599376   galactitol utilization operon repressor
EcE24377A_2379  3       18       -2.551991   galactitol-1-phosphate dehydrogenase
EcE24377A_2380  1       14       -2.037283   PTS system galactitol-specific transporter
EcE24377A_2381  0       12       -2.273344   PTS system galactitol-specific transporter
EcE24377A_2382  1       12       -0.868975   PTS system galactitol-specific transporter
EcE24377A_2383  1       10       0.628132    D-tagatose-bisphosphate aldolase non-catalytic
EcE24377A_2384  1       12       1.155511    tagatose-bisphosphate aldolase
EcE24377A_2385  2       12       0.508554    fructose-bisphosphate aldolase
EcE24377A_2386  1       12       1.178529    nucleoside transporter
EcE24377A_2387  0       14       2.089253    ADP-ribosylglycohydrolase
EcE24377A_2389  -1      14       1.336403    PfkB family kinase
EcE24377A_2388  -1      17       -0.533417   GntR family transcriptional regulator
EcE24377A_2390  1       23       -3.683289   glycosyl hydrolase
EcE24377A_2391  3       27       -3.733943   phosphomethylpyrimidine kinase
EcE24377A_2392  2       29       -6.018838   hydroxyethylthiazole kinase
EcE24377A_2393  3       30       -8.192424   transcriptional repressor RcnR
EcE24377A_2395  3       32       -8.522930   hypothetical protein
EcE24377A_2396  3       34       -9.129083   hypothetical protein
EcE24377A_2397  1       23       -5.765191   hypothetical protein
EcE24377A_2399  -1      12       -3.780263   fimbrial usher protein
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits       Score  E-value  Comments
EcE24377A_2366  RTXTOXIND  48     4e-08    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 47.9 bits (114), Expect = 4e-08
Identities = 47/369 (12%), Positives = 106/369 (28%), Gaps = 87/369 (23%)

Query: 4 SYKSRWVIVIVVVIAAIAAFWFWQGRNDSQSAAPG-----ATKQAQQSPAGGRRG---MR 55
S + R V ++ IA G+ + + A G + + ++
Sbjct: 54 SRRPRLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVK 113

Query: 56 SG-------PLA---PVQAATAVEQAVPRYLTGLGTITAANTVTVRSRVDG--QLMALHF 103
G L + A + L T ++ ++ +L
Sbjct: 114 EGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDE 173

Query: 104 QEGQQVKAGDLLAEI------------DPSQFKVALAQAQGQLA-------KDKATLANA 144
Q V ++L Q ++ L + + + + +
Sbjct: 174 PYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVE 233

Query: 145 RRDLARYQQLAKTNLVSRQELDAQQALVSETEGTIKADEASVA----------------- 187
+ L + L +++ + Q+ E ++ ++ +
Sbjct: 234 KSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVT 293

Query: 188 --------------------------SAQLQLDWSRITAPVDGRV-GLKQVDVGNQISSG 220
+ + S I APV +V LK G +++
Sbjct: 294 QLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTA 353

Query: 221 DTTGIVVITQTHPIDLVFTLPESDIATVVQAQKAGKPLVVEAWDRTNSKKL-SEGTLLSL 279
+T +V++ + +++ + DI + Q A + VEA+ T L + ++L
Sbjct: 354 ETL-MVIVPEDDTLEVTALVQNKDIGFINVGQNA--IIKVEAFPYTRYGYLVGKVKNINL 410

Query: 280 DNQIDATTG 288
D D G
Sbjct: 411 DAIEDQRLG 419


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_2367  ACRIFLAVINRP  918    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 918 bits (2375), Expect = 0.0
Identities = 299/1036 (28%), Positives = 513/1036 (49%), Gaps = 29/1036 (2%)

Query: 13 SRLFIMRPVATTLLMVAILLAGIIGYRALPVSALPEVDYPTIQVVTLYPGASPDVMTSAV 72
+ FI RP+ +L + +++AG + LPV+ P + P + V YPGA + V
Sbjct: 2 ANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTV 61

Query: 73 TAPLERQFGQMSGLKQMSSQS-SGGASVITLQFQLTLPLDVAEQEVQAAINAATNLLPSD 131
T +E+ + L MSS S S G+ ITL FQ D+A+ +VQ + AT LLP +
Sbjct: 62 TQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQE 121

Query: 132 LPNPPVYSKVNPADPPIMTLAVTSTAMPMTQVE--DMVETRVAQKISQISGVGLVTLSGG 189
+ + S + +M S TQ + D V + V +S+++GVG V L G
Sbjct: 122 VQQQGI-SVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 190 QRPAVRVKLNAQAIAALGLTSETVRTAITGANVNSAKGSLDGP------SRAVTLSANDQ 243
Q A+R+ L+A + LT V + N A G L G ++ A +
Sbjct: 181 QY-AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 244 MQSAEEYRQLII-AYQNGAPIRLGDVATVEQGAENSWLGAWANKEQAIVMNVQRQPGANI 302
++ EE+ ++ + +G+ +RL DVA VE G EN + A N + A + ++ GAN
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANA 299

Query: 303 ISTADSIRQMLPQLTESLPKSVKVTVLSDRTTNIRASVDDTQFELMMAIALVVMIIYLFL 362
+ TA +I+ L +L P+ +KV D T ++ S+ + L AI LV +++YLFL
Sbjct: 300 LDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFL 359

Query: 363 RNIPATIIPGVAVPLSLIGTFAVMVFLDFSINNLTLMALTIATGFVVDDAIVVIENISRY 422
+N+ AT+IP +AVP+ L+GTFA++ +SIN LT+ + +A G +VDDAIVV+EN+ R
Sbjct: 360 QNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERV 419

Query: 423 I-EKGEKPLAAALKGAGEIGFTIISLTFSLIAVLIPLLFMGDIVGRLFREFAITLAVAIL 481
+ E P A K +I ++ + L AV IP+ F G G ++R+F+IT+ A+
Sbjct: 420 MMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMA 479

Query: 482 ISAVVSLTLTPMMCARML---SQESLRKQNRFSRASEKMFDRIIAAYGRGLAKVLNHPWL 538
+S +V+L LTP +CA +L S E + F FD + Y + K+L
Sbjct: 480 LSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 539 TLSVALSTLLLSVLLWVFIPKGFFPVQDNGIIQGTLQAPQSSSFANMAQRQRQVADVILQ 598
L + + V+L++ +P F P +D G+ +Q P ++ + QV D L+
Sbjct: 540 YLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLK 599

Query: 599 DPA--VQSLTSFVGVDGTNPSLNSARLQINLKPLDERDDR---VQKVIARLQTAVDKVPG 653
+ V+S+ + G + + N+ ++LKP +ER+ + VI R + + K+
Sbjct: 600 NEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIR- 658

Query: 654 VDLFLQPTQDLTIDTQVSRTQYQFTLQ---ATSLDALSTWVPQLMEKLQQLP-QLSDVSS 709
D F+ P I + T + F L DAL+ QL+ Q P L V
Sbjct: 659 -DGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRP 717

Query: 710 DWQDKGLVAYVNVDRDSASRLGISMADVDNALYNAFGQRLISTIYTQANQYRVVLEHNTE 769
+ + + VD++ A LG+S++D++ + A G ++ + ++ ++ + +
Sbjct: 718 NGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAK 777

Query: 770 NTPGLAALDTIRLTSSDGGVVPLSSIAKIEQRFAPLSINHLDQFPVTTISFNVPDNYSLG 829
+D + + S++G +VP S+ + + + P I S G
Sbjct: 778 FRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSG 837

Query: 830 DAVQAIMDTEKTLNLPVDITTQFQGSTLAFQSALGSTVWLIVAAVVAMYIVLGILYESFI 889
DA A+M+ + LP I + G + + + L+ + V +++ L LYES+
Sbjct: 838 DA-MALMENLAS-KLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWS 895

Query: 890 HPITILSTLPTAGVGALLALMIAGSELDVIAIIGIILLIGIVKKNAIMMIDFALAAEREQ 949
P++++ +P VG LLA + + DV ++G++ IG+ KNAI++++FA ++
Sbjct: 896 IPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKE 955

Query: 950 GMSPRDAIYQACLLRFRPILMTTLAALLGALPLMLSTGVGAELRRPLGIGMVGGLIVSQV 1009
G +A A +R RPILMT+LA +LG LPL +S G G+ + +GIG++GG++ + +
Sbjct: 956 GKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATL 1015

Query: 1010 LTLFTTPVIYLLFDRL 1025
L +F PV +++ R
Sbjct: 1016 LAIFFVPVFFVVIRRC 1031


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_2368  ACRIFLAVINRP  921    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 921 bits (2383), Expect = 0.0
Identities = 289/1035 (27%), Positives = 507/1035 (48%), Gaps = 36/1035 (3%)

Query: 6 LFIYRPVATILLSVAITLCGILGFRMLPVAPLPQVDFPVIMVSASLPGASPETMASSVAT 65
FI RP+ +L++ + + G L LPVA P + P + VSA+ PGA +T+ +V
Sbjct: 4 FFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTVTQ 63

Query: 66 PLERSLGRIAGVSEMTSSS-SLGSTRIILQFDFDRDINGAARDVQAAINAAQSLLPSGMP 124
+E+++ I + M+S+S S GS I L F D + A VQ + A LLP +
Sbjct: 64 VIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQEVQ 123

Query: 125 SRPTYRKANPSDAPIMILTLTSDT--YSQGELYDFASTQLAPTISQIDGVGDVDVGGSSL 182
+ S + +M+ SD +Q ++ D+ ++ + T+S+++GVGDV + G+
Sbjct: 124 -QQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGAQY 182

Query: 183 PAVRVGLNPQALFNQGVSLDDVRTAISNANVRKPQG------ALEDGTHRWQIQTNDELK 236
A+R+ L+ L ++ DV + N + G AL I K
Sbjct: 183 -AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFK 241

Query: 237 TAAEYQPLIIHYN-NGGAVRLGDVATVTDSVQDVRNAGMTNAKPAILLMIRKLPEANIIQ 295
E+ + + N +G VRL DVA V ++ N KPA L I+ AN +
Sbjct: 242 NPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANALD 301

Query: 296 TVDSIRAKLPELQETIPAAIDLQIAQDRSPTIRASLEEVEQTLIISVALVILVVFLFLRS 355
T +I+AKL ELQ P + + D +P ++ S+ EV +TL ++ LV LV++LFL++
Sbjct: 302 TAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQN 361

Query: 356 GRATIIPAVAVPVSLIGTFAAMYLCGFSLNNLSLMALTIATGFVVDDAIVVLENIARHL- 414
RAT+IP +AVPV L+GTFA + G+S+N L++ + +A G +VDDAIVV+EN+ R +
Sbjct: 362 MRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMM 421

Query: 415 EAGMKPLQAALQGTREVGFTVLSMSLSLVAVFLPLLLMGGLPGRLLREFAVTLSVAIVIS 474
E + P +A + ++ ++ +++ L AVF+P+ GG G + R+F++T+ A+ +S
Sbjct: 422 EDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALS 481

Query: 475 LLVSLTLTPMMCGWMLKASKPREQKRLRGFG----RMLVALQQGYGKSLKWVLNHTRLVG 530
+LV+L LTP +C +LK + GF Y S+ +L T
Sbjct: 482 VLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRYL 541

Query: 531 VVLLGTIALNIWLYISIPKTFFPEQDTGVLMGGIQADQSISFQ----AMRGKLQDFMKII 586
++ +A + L++ +P +F PE+D GV + IQ + + + ++K
Sbjct: 542 LIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKNE 601

Query: 587 RD-DPAVDNVTGFT-GGSRVNSGMMFITLKPRDERS---ETAQQIIDRLRVKLAKEPGAN 641
+ +V V GF+ G N+GM F++LKP +ER+ +A+ +I R +++L K
Sbjct: 602 KANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDGF 661

Query: 642 LFLMAVQDIRVGGRQSNASYQYTLLSDDLAALREWEPKIRKKLATL-----PELADVNSD 696
+ + I G + ++ L D + + R +L + L V +
Sbjct: 662 VIPFNMPAIVELGTATGFDFE---LIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPN 718

Query: 697 QQDNGAEMNLVYDRDTMARLGIDVQAANSLLNNAFGQRQISTIYQPMNQYKVVMEVDPRY 756
++ A+ L D++ LG+ + N ++ A G ++ K+ ++ D ++
Sbjct: 719 GLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKF 778

Query: 757 TQDISALEKMFVINNEGKAIPLSYFAKWQPANAPLSVNHQGLSAASTISFNLPTGKSLSD 816
++K++V + G+ +P S F + + I G S D
Sbjct: 779 RMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGD 838

Query: 817 ASAAIDRAMTQLGVPSTVRGSFAGTAQVFQETMNSQVILIIAAIATVYIVLGILYESYVH 876
A A ++ ++L P+ + + G + + + N L+ + V++ L LYES+
Sbjct: 839 AMALMENLASKL--PAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSI 896

Query: 877 PLTILSTLPSAGVGALLALELFNAPFSLIALIGIMLLIGIVKKNAIMMVDFALEAQRHGN 936
P++++ +P VG LLA LFN + ++G++ IG+ KNAI++V+FA +
Sbjct: 897 PVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEG 956

Query: 937 LTPQEAIFQACLLRFRPIMMTTLAALFGALPLVLSGGDGSELRQPLGITIVGGLVMSQLL 996
EA A +R RPI+MT+LA + G LPL +S G GS + +GI ++GG+V + LL
Sbjct: 957 KGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLL 1016

Query: 997 TLYTTPVVYLFFDRL 1011
++ PV ++ R
Sbjct: 1017 AIFFVPVFFVVIRRC 1031



Score = 79.9 bits (197), Expect = 3e-17
Identities = 77/446 (17%), Positives = 162/446 (36%), Gaps = 26/446 (5%)

Query: 592 VDNVTGFTGGS-RVNSGMMFITLKPRDERSETAQQIIDRLRVKLAKEPGANLFLMAVQDI 650
+DN+ + S S + +T + + Q+ ++L++ P + Q I
Sbjct: 72 IDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQE----VQQQGI 127

Query: 651 RVGGRQSNASYQYTLLSDDLAALREW-----EPKIRKKLATLPELADVNSDQQDNGAE-- 703
V S+ +SD+ ++ ++ L+ L + DV GA+
Sbjct: 128 SVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQL----FGAQYA 183

Query: 704 MNLVYDRDTMARLGID----VQAANSLLNNAFGQRQISTIYQPMNQYKVVMEVDPRYTQD 759
M + D D + + + + + + T P Q + R+
Sbjct: 184 MRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFKNP 243

Query: 760 ISALEKMFVINNEGKAIPLSYFAK--WQPANAPLSVNHQGLSAASTISFNLPTGKSLSDA 817
+ +N++G + L A+ N + G AA +L D
Sbjct: 244 EEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL-DT 302

Query: 818 SAAIDRAMTQL--GVPSTVRGSFA-GTAQVFQETMNSQVILIIAAIATVYIVLGILYESY 874
+ AI + +L P ++ + T Q +++ V + AI V++V+ + ++
Sbjct: 303 AKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQNM 362

Query: 875 VHPLTILSTLPSAGVGALLALELFNAPFSLIALIGIMLLIGIVKKNAIMMVDFALEAQRH 934
L +P +G L F + + + G++L IG++ +AI++V+
Sbjct: 363 RATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMME 422

Query: 935 GNLTPQEAIFQACLLRFRPIMMTTLAALFGALPLVLSGGDGSELRQPLGITIVGGLVMSQ 994
L P+EA ++ ++ + +P+ GG + + ITIV + +S
Sbjct: 423 DKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALSV 482

Query: 995 LLTLYTTPVVYLFFDRLRLRFSRKPK 1020
L+ L TP + + + K
Sbjct: 483 LVALILTPALCATLLKPVSAEHHENK 508


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_2369  TCRTETB       121    3e-32    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 121 bits (305), Expect = 3e-32
Identities = 98/435 (22%), Positives = 189/435 (43%), Gaps = 25/435 (5%)

Query: 20 FMQSLDTTIVNTALPSMAQSLGESPLHMHMVIVSYVLTVAVMLPASGWLADKVGVRNIFF 79
F L+ ++N +LP +A + P + V +++LT ++ G L+D++G++ +
Sbjct: 24 FFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLL 83

Query: 80 TAIVLFTLGSLFCALSGTLNELL-LARALQGVGGAMMVPVGRLTVMKIVPREQYMAAMTF 138
I++ GS+ + + LL +AR +QG G A + + V + +P+E A
Sbjct: 84 FGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGL 143

Query: 139 VTLPGQVGPLLGPALGGLLVEYASWHWIFLINIPVGIIGAIATLM-LMPNYTMQTRRFDL 197
+ +G +GPA+GG++ Y HW +L+ IP+ I + LM L+ FD+
Sbjct: 144 IGSIVAMGEGVGPAIGGMIAHY--IHWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDI 201

Query: 198 SGFLLLAVGMAVLTLALDGSKGTGLSPLTIAGLVAVGVVALVLYLLHARNNNRALFSLKL 257
G +L++VG+ L + L V V++ ++++ H R L
Sbjct: 202 KGIILMSVGIVFFMLFTTSYSISFLI---------VSVLSFLIFVKHIRKVTDPFVDPGL 252

Query: 258 FRTRTFSLGLAGSFAGRIGSGMLPFMTPVFLQIGLGFSPFHAG-LMMIPMVLGSMGMKRI 316
+ F +G+ M P ++ S G +++ P + + I
Sbjct: 253 GKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYI 312

Query: 317 VVQVVNCFGYRRVLVATTLGLSLVTLLFMTTALL----GWYYVLPFVLFLQGMVNSTRFS 372
+V+ G VL +G++ +++ F+T + L W+ + V L G+ S +
Sbjct: 313 GGILVDRRGPLYVL---NIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGL--SFTKT 367

Query: 373 SMNTLTLKDLPDNLASSGNSLLSMIMQLSMSIGVTIAGLLLGLFGSQHVSVDSGTTQTVF 432
++T+ L A +G SLL+ LS G+ I G LL + + Q+ +
Sbjct: 368 VISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLSIPLLDQRLLPMEVDQSTY 427

Query: 433 MYT--WLSMALIIAL 445
+Y+ L + II +
Sbjct: 428 LYSNLLLLFSGIIVI 442
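
Each hit above is introduced by two statistics lines ("Score = ... bits (...), Expect = ..." and "Identities = a/b (p%), ..."). The following is a minimal sketch, not part of PredictBias itself, of how those lines could be read programmatically. The function name `parse_hit_summary` and the returned dictionary layout are assumptions made for illustration. In the hits listed on this page the percentages appear to be truncated rather than rounded (for example 98/435 is shown as 22%), which the sketch checks with integer division.

```python
import re

# Patterns for the two statistics lines that precede each alignment block.
SCORE_RE = re.compile(
    r"Score\s*=\s*(?P<bits>[\d.]+)\s*bits\s*\((?P<raw>\d+)\),"
    r"\s*Expect\s*=\s*(?P<evalue>\S+)"
)
FRACTION_RE = re.compile(
    r"(?P<name>Identities|Positives|Gaps)\s*=\s*"
    r"(?P<num>\d+)/(?P<den>\d+)\s*\((?P<pct>\d+)%\)"
)


def parse_hit_summary(text):
    """Parse the Score/Expect and Identities/Positives/Gaps lines of one hit.

    Hypothetical helper: returns a dict with the bit score, raw score,
    E-value, and a (numerator, denominator, percent) tuple for each
    fraction that is present.
    """
    match = SCORE_RE.search(text)
    if match is None:
        raise ValueError("no 'Score = ... Expect = ...' line found")

    evalue = match.group("evalue")
    # Very small E-values are printed without a mantissa, e.g. "e-148".
    if evalue.startswith("e"):
        evalue = "1" + evalue

    summary = {
        "bits": float(match.group("bits")),
        "raw_score": int(match.group("raw")),
        "evalue": float(evalue),
    }
    for frac in FRACTION_RE.finditer(text):
        summary[frac.group("name").lower()] = (
            int(frac.group("num")),
            int(frac.group("den")),
            int(frac.group("pct")),
        )
    return summary


def percent_is_truncated(fraction):
    """Return True if the printed percent equals floor(100 * num / den)."""
    num, den, pct = fraction
    return num * 100 // den == pct
```

Passing the two statistics lines of a hit as one string to `parse_hit_summary` gives the numeric values; `percent_is_truncated` can then be applied to the `identities`, `positives`, and `gaps` entries as a sanity check on a copied block.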


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_2370  BCTERIALGSPF  31     0.010    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 31.0 bits (70), Expect = 0.010
Identities = 27/95 (28%), Positives = 35/95 (36%), Gaps = 20/95 (21%)

Query: 164 RQTSWLIVALATLLAALATFLLA------RGLLAPVKRLVDGTHKLAAGDFTTRVTPTSE 217
RQ + L+ A L AL L+A V+ V H LA + P S
Sbjct: 75 RQLATLVAASMPLEEALDAVAKQSEKPHLSQLMAAVRSKVMEGHSLAD---AMKCFPGSF 131

Query: 218 DEL-----------GKLAQDFNQLASTLEKNQQMR 241
+ L G L N+LA E+ QQMR
Sbjct: 132 ERLYCAMVAAGETSGHLDAVLNRLADYTEQRQQMR 166


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_2371  HTHFIS        76     5e-18    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 75.6 bits (186), Expect = 5e-18
Identities = 28/136 (20%), Positives = 65/136 (47%), Gaps = 1/136 (0%)

Query: 11 PRILIVEDEPKLGQLLIDYLRAASYAPTLISHGDQVLPYVRQTPPDLILLDLMLPGTDGL 70
IL+ +D+ + +L L A Y + S+ + ++ DL++ D+++P +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 71 TLCREIR-RFSDIPIVMVTAKIEEIDRLLGLEIGADDYICKPYSPREVVARVKTILRRCK 129
L I+ D+P+++++A+ + + E GA DY+ KP+ E++ + L K
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 130 PQRELQQQDAESPLII 145
+ + D++ + +
Sbjct: 124 RRPSKLEDDSQDGMPL 139


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_2376  LIPOLPP20     27     0.026    LPP20 lipoprotein precursor signature.
		>LIPOLPP20#LPP20 lipoprotein precursor signature.

Length = 175

Score = 26.6 bits (58), Expect = 0.026
Identities = 12/38 (31%), Positives = 24/38 (63%), Gaps = 1/38 (2%)

Query: 18 EGEMKKIAAISLISIFIVSGCAVHNDETSIGKFGLAYK 55
+ ++KKI +S+++ ++ GC+ H ++ I K AYK
Sbjct: 2 KNQVKKILGMSVVAAMVIVGCS-HAPKSGISKSNKAYK 38


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_2379  DHBDHDRGNASE  34     7e-04    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 33.9 bits (77), Expect = 7e-04
Identities = 22/92 (23%), Positives = 36/92 (39%), Gaps = 2/92 (2%)

Query: 156 AQGCENKNVIIIGAGT-IGLLAIQCAVALGAKSVTAIDISSEKLALAKSFGAMQTFNSSE 214
A+G E K I GA IG + + GA + A+D + EKL S + ++
Sbjct: 3 AKGIEGKIAFITGAAQGIGEAVARTLASQGAH-IAAVDYNPEKLEKVVSSLKAEARHAEA 61

Query: 215 MSAPQMQSVLRELRFNQLILETAGVPQTVELA 246
A S + ++ E + V +A
Sbjct: 62 FPADVRDSAAIDEITARIEREMGPIDILVNVA 93


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_2386  TCRTETA       36     2e-04    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 36.0 bits (83), Expect = 2e-04
Identities = 53/268 (19%), Positives = 89/268 (33%), Gaps = 17/268 (6%)

Query: 29 LSKSGFSAGEIGWSYACTAIAAILSPILVGSITDRFFSAQKVLAVLMFAGALLMYFAAQQ 88
L S G A A+ ++G+++DRF ++ + ++ AGA + Y
Sbjct: 35 LVHSNDVTAHYGILLALYALMQFACAPVLGALSDRF--GRRPVLLVSLAGAAVDYAI--- 89

Query: 89 TTFAGFFPLLLAYSLTYMPTIALTNSIAFANVPDVERDFPRIRVMGTIG-WIASGLACGF 147
A F +L + T A T ++A A + D+ R R G + G+ G
Sbjct: 90 MATAPFLWVLYIGRIVAGITGA-TGAVAGAYIADITDGDERARHFGFMSACFGFGMVAG- 147

Query: 148 LPQILGY-ADISPTNIPLLITAGSSALLGVFAFFLPDTPPKSTGKMDIKVMLGLDALILL 206
P + G SP + P A + L + FL K + + L A
Sbjct: 148 -PVLGGLMGGFSP-HAPFFAAAALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRW 205

Query: 207 RDKN------FLVFFFCSFLFAMPLAFYYIFANGYLTEVGMKNATGWMTLGQFSEIFFML 260
VFF + +P A + IF G + +
Sbjct: 206 ARGMTVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAM 265

Query: 261 ALPFFTKRFGIKKVLLLGLVTAAIRYGF 288
R G ++ L+LG++ Y
Sbjct: 266 ITGPVAARLGERRALMLGMIADGTGYIL 293



Score = 34.0 bits (78), Expect = 0.001
Identities = 32/153 (20%), Positives = 53/153 (34%), Gaps = 20/153 (13%)

Query: 253 FSEIFFMLALPFFTKRFGIKKVLLLGLVTAAIRYGFFIYGSADEYFTYALLFLGILLHGV 312
+ L + RFG + VLL+ L AA+ Y +L++G ++ G+
Sbjct: 54 LMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAP-----FLWVLYIGRIVAGI 108

Query: 313 SYDFYYVTAYIYVDKKAPVHMRTAAQGLITLCCQGFGSLLGYRLGGVMMEKMFAYQEPVN 372
+ V D R G ++ C GFG + G LGG+M F+ P
Sbjct: 109 TGATGAVAGAYIAD-ITDGDERARHFGFMS-ACFGFGMVAGPVLGGLMGG--FSPHAP-- 162

Query: 373 GLTFNWSGMWTFGAVMIAIIAVLFMIFFRESDN 405
+ A + + + ES
Sbjct: 163 ---------FFAAAALNGLNFLTGCFLLPESHK 186


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_2395  TYPE3OMGPROT  28     0.024    Type III secretion system outer membrane G protein ...
		>TYPE3OMGPROT#Type III secretion system outer membrane G protein family signature.

Length = 607

Score = 27.9 bits (62), Expect = 0.024
Identities = 13/42 (30%), Positives = 21/42 (50%), Gaps = 1/42 (2%)

Query: 66 KMLLGALLLVTSAAWAAPATAGSTNTSGISKYE-LSSFIADF 106
++L G LLL++S +WA ++K E L + DF
Sbjct: 11 RVLTGTLLLLSSYSWAQELDWLPIPYVYVAKGESLRDLLTDF 52


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_2399  PF00577       717    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 717 bits (1852), Expect = 0.0
Identities = 240/843 (28%), Positives = 390/843 (46%), Gaps = 35/843 (4%)

Query: 2 LRMTPLASAI---VALLLGIEAYAAEETFDTHFMIGGMKDQQVANIRL--DDNQPLPGQY 56
R+ + A +AE F+ F+ Q VA++ + + PG Y
Sbjct: 21 HRLAGFFVRLFVACAFAAQAPLSSAELYFNPRFLADD--PQAVADLSRFENGQELPPGTY 78

Query: 57 DIDIYVNKQWRGKYEIIVKDNPQET----CLSIEVIKRLGIN-----SDNFASGKQCLTF 107
+DIY+N + ++ E CL+ + +G+N N + C+
Sbjct: 79 RVDIYLNNGYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDACVPL 138

Query: 108 EQLVQGGSYTWDIGVFRLDFSVPQAWVEELESGYVPPENWERGINAFYTSYYVSQYYSDY 167
++ + D+G RL+ ++PQA++ GY+PPE W+ GINA +Y S
Sbjct: 139 TSMIHDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSVQN 198

Query: 168 KASGNSKSTYVRFNSGLNLLGWQLHSDASFSKTNNNPGV-----WKSNTLYLERGFAQLL 222
+ GNS Y+ SGLN+ W+L + ++S +++ W+ +LER L
Sbjct: 199 RIGGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLR 258

Query: 223 GTLRVGDMYTSSDIFDSVRFSGVRMFRDMQMLPNSKQNFTPRVQGIAQSNALVTIEQNGF 282
L +GD YT DIFD + F G ++ D MLP+S++ F P + GIA+ A VTI+QNG+
Sbjct: 259 SRLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGY 318

Query: 283 VVYQKEVPPGPFAITDLQLAGGGADLDVSVKEADGSVTTYLVPYAAVPNMLQPGVSKYDF 342
+Y VPPGPF I D+ AG DL V++KEADGS + VPY++VP + + G ++Y
Sbjct: 319 DIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSI 378

Query: 343 AAGRSHIEGASKQSD-FVQAGYQYGFNNLLTLYGGSMVANNYYAFTLGTGWNT-RIGAIS 400
AG A ++ F Q+ +G T+YGG+ +A+ Y AF G G N +GA+S
Sbjct: 379 TAGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALS 438

Query: 401 VDATKSHSKQDNGDVFDGQSYQIAYNKFVSQTSTRFGLAAWRYSSRDYRTFNDHVWANNK 460
VD T+++S + DGQS + YNK ++++ T L +RYS+ Y F D ++
Sbjct: 439 VDMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMN 498

Query: 461 DNYRRDENDVYDI----ADYYQNDFGRKNSFSANMSQSLPEGWGSVSLSTLWRDYWGRSG 516
++ V + DYY + ++ ++Q L ++ LS + YWG S
Sbjct: 499 GYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGR-TSTLYLSGSHQTYWGTSN 557

Query: 517 SSKDYQLSYSNNWRRISYILAASQAYDENHHE-EKRFNIFISIPFD--WGDDVTTPRRQI 573
+ +Q + + I++ L+ S + ++ + ++IPF D + R
Sbjct: 558 VDEQFQAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRSDSKSQWRHA 617

Query: 574 YMSNSTTFDDQGFASNNTGLSGTVGSRDQFNYGVNLSYQNQGN---ETTAGANLTWNAPV 630
S S + D G +N G+ GT+ + +Y V Y G+ +T A L +
Sbjct: 618 SASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGY 677

Query: 631 ATVNGSYSQSSTYRQAGASVSGGIVAWSGGVNLANRLSETFAVMNAPGIKDAYVNGQKYR 690
N YS S +Q VSGG++A + GV L L++T ++ APG KDA V Q
Sbjct: 678 GNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNDTVVLVKAPGAKDAKVENQTGV 737

Query: 691 TTNRNGVVVYDGMTPYRENHLMLDVSQSDSEAELRGNRKIAAPYRGAVVLVNFDTDQRKP 750
T+ G V T YREN + LD + +L P RGA+V F +
Sbjct: 738 RTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAEFKA-RVGI 796

Query: 751 WFIKALRADGQPLMFGYEVNDIHGHNIGVVGQGSQLFIRTNEVPPSVNVAIDKQQGLSCT 810
+ L + +PL FG V + G+V Q+++ + V V +++ C
Sbjct: 797 KLLMTLTHNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLAGKVQVKWGEEENAHCV 856

Query: 811 ITF 813
+
Sbjct: 857 ANY 859


40  EcE24377A_2426  EcE24377A_2439  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias     Product
EcE24377A_2426  0       17       3.016892    hypothetical protein
EcE24377A_2427  0       17       3.445399    acetoin dehydrogenase
EcE24377A_2428  -1      17       3.962382    multidrug resistance outer membrane protein
EcE24377A_2429  -1      17       3.400220    tRNA-dihydrouridine synthase C
EcE24377A_2431  -1      18       3.098342    maleylacetoacetate isomerase
EcE24377A_2432  -1      17       2.865761    fumarylacetoacetate hydrolase
EcE24377A_2433  0       15       2.811318    gentisate 1,2-dioxygenase
EcE24377A_2434  0       13       1.485836    major facilitator transporter
EcE24377A_2435  0       13       -0.581989   LysR family transcriptional regulator
EcE24377A_2436  1       17       -0.515470   hypothetical protein
EcE24377A_2437  1       14       -0.388167   hypothetical protein
EcE24377A_2438  0       13       -0.926669   cytidine deaminase
EcE24377A_2439  2       16       -2.955668   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_2426  BCTERIALGSPF  28     0.019    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 28.3 bits (63), Expect = 0.019
Identities = 5/33 (15%), Positives = 16/33 (48%), Gaps = 2/33 (6%)

Query: 152 WLHNLDQHLKHW-VWLILVVVL-VVGVRWWLKR 182
L + ++ + W++L ++ + R L++
Sbjct: 215 VLMGMSDAVRTFGPWMLLALLAGFMAFRVMLRQ 247


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_2427  DHBDHDRGNASE  114    8e-33    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 114 bits (286), Expect = 8e-33
Identities = 71/253 (28%), Positives = 115/253 (45%), Gaps = 12/253 (4%)

Query: 3 QVAIITASDSGIGKECALLLAQQGFDIGITWHSDEEGAKDTAREVVSHGVRAEIVQLDLG 62
++A IT + GIG+ A LA QG I ++ E+ K + AE D+
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKA-EARHAEAFPADVR 67

Query: 63 KLPEGAQALEKLIQRLGRIDVLVNNAGAMTKAPFLDMAFDEWRKIFTVDVDGAFLCSQIA 122
+ ++ + +G ID+LVN AG + ++ +EW F+V+ G F S+
Sbjct: 68 DSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSV 127

Query: 123 ARQMVKQGQGGRIINITSVHEHTPLPDASAYTAAKHALGGLTKAMALELVRHKILVNAVA 182
++ M+ + + G I+ + S P +AY ++K A TK + LEL + I N V+
Sbjct: 128 SKYMMDR-RSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVS 186

Query: 183 PGAIATPM-------NGMDDSDVKPDAEP---SIPLRRFGATYEIASLVAWLCSEGANYT 232
PG+ T M + +K E IPL++ +IA V +L S A +
Sbjct: 187 PGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGHI 246

Query: 233 TGQSLIVDGGFML 245
T +L VDGG L
Sbjct: 247 TMHNLCVDGGATL 259


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_2434  TCRTETB       50     1e-08    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 49.9 bits (119), Expect = 1e-08
Identities = 60/402 (14%), Positives = 140/402 (34%), Gaps = 19/402 (4%)

Query: 22 RVIICCFLVVMLDGFDTAAIGFIAPDIRTHWQLTAGDLAPLFGAGLLGLTAGALLCGPLS 81
+++I ++ + + PDI + + A +L + G + G LS
Sbjct: 14 QILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLS 73

Query: 82 DRFGRKRVIELCVFLFGALSLASAFS-PDLQTLVFLRFLTGLGLGGAMPNTITIT-SEYL 139
D+ G KR++ + + S+ L+ RF+ G G A P + + + Y+
Sbjct: 74 DQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAG-AAAFPALVMVVVARYI 132

Query: 140 PARRRGALVTLMFCGFTLGSAFGGIVSAQLVPVIGWHGILVLGGVLPLMLFVALLVVLPE 199
P RG L+ +G G + + I W +L++ + + + L+ +L +
Sbjct: 133 PKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPMITIITVPF-LMKLLKK 191

Query: 200 SPRWQVRRQLPQAVI-----------AKTVSAITRERYVDTHFYLIESASVTKGSIRQLF 248
R + + ++ + S V + ++
Sbjct: 192 EVRIKGHFDIKGIILMSVGIVFFMLFTTSYSISFLIVSVLSFLIFVKHIRKVTDPFVDPG 251

Query: 249 MGRQLPITLMLWVVF--FMSLLIIYLLSSWMPTLLNHRGIDLQHASWVTAAFQIGGTLGA 306
+G+ +P + + F ++ + +M ++ + + G
Sbjct: 252 LGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGY 311

Query: 307 LALGVLMDKFNPFRVLALSYAIGAVCIVMIGLSQDG-LWLMALAIFGTGIGISGSQVGLN 365
+ G+L+D+ P VL + +V + + W M + I G+S ++ ++
Sbjct: 312 IG-GILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGLSFTKTVIS 370

Query: 366 ALTATLYPTQSRATGVSWSNAVGRCGAIVGSLSGGVMMAMNF 407
+ ++ Q G+S N G G ++++
Sbjct: 371 TIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLSIPL 412



Score = 43.3 bits (102), Expect = 1e-06
Identities = 41/200 (20%), Positives = 79/200 (39%), Gaps = 5/200 (2%)

Query: 251 RQLPITLMLWVVFFMSLLIIYLLSSWMPTLLNHRGIDLQHASWVTAAFQIGGTLGALALG 310
R I + L ++ F S+L +L+ +P + N +WV AF + ++G G
Sbjct: 11 RHNQILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYG 70

Query: 311 VLMDKFNPFRVLALSYAIGAVCIVMIGLSQDGLWLMALAIFGTGIGISGSQVGLNALTAT 370
L D+ R+L I V+ + L+ +A F G G + + + A
Sbjct: 71 KLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVAR 130

Query: 371 LYPTQSRATGVSWSNAVGRCGAIVGSLSGGVMMAMNFSFDTLFFIIAVPAAISAVMLALL 430
P ++R ++ G VG GG M+A + L I I+ + + L
Sbjct: 131 YIPKENRGKAFGLIGSIVAMGEGVGPAIGG-MIAHYIHWSYLLLI----PMITIITVPFL 185

Query: 431 ITVVRQSTSVPDSLPRAGVV 450
+ ++++ + G++
Sbjct: 186 MKLLKKEVRIKGHFDIKGII 205


41  EcE24377A_2484  EcE24377A_2504  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias     Product
EcE24377A_2484  2       14       -0.441842   nucleoid-associated protein NdpA
EcE24377A_2485  1       12       -0.110990   hypothetical protein
EcE24377A_2486  2       13       0.647194    hypothetical protein
EcE24377A_2487  1       16       2.302510    sulfatase
EcE24377A_2491  0       19       2.910968    *hypothetical protein
EcE24377A_2492  0       19       3.416103    transcriptional regulator NarP
EcE24377A_2493  1       21       3.845041    cytochrome c-type biogenesis family protein
EcE24377A_2494  1       20       4.339682    thiol:disulfide interchange protein DsbE
EcE24377A_2495  0       18       4.454916    cytochrome c-type biogenesis protein CcmF
EcE24377A_2496  -1      16       3.016445    cytochrome c-type biogenesis protein CcmE
EcE24377A_2497  0       16       3.242591    heme exporter protein D
EcE24377A_2498  0       15       3.408221    heme exporter protein C
EcE24377A_2499  -1      17       4.054227    heme exporter protein CcmB
EcE24377A_2500  -1      19       4.099822    cytochrome c biogenesis protein CcmA
EcE24377A_2501  0       22       4.055376    cytochrome c-type protein NapC
EcE24377A_2502  0       20       4.421465    citrate reductase cytochrome c-type subunit
EcE24377A_2503  0       19       3.894615    quinol dehydrogenase membrane component
EcE24377A_2504  0       19       3.425827    quinol dehydrogenase periplasmic component
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_2487  IGASERPTASE   30     0.027    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 30.4 bits (68), Expect = 0.027
Identities = 19/70 (27%), Positives = 28/70 (40%), Gaps = 6/70 (8%)

Query: 503 LHVSTPASEYSQGQ-DLF---NPQRRHYWVTAADNDTLAITTPKKTLVLNNNGKYRTYNL 558
L V+ E + + LF QR H V+ +T+ + K L N NG+Y YN
Sbjct: 926 LQVADKTGEPNHNELTLFDASKAQRDHLNVSLV-GNTVDLGAWKYKLR-NVNGRYDLYNP 983

Query: 559 RGERVKDEKP 568
E+
Sbjct: 984 EVEKRNQTVD 993


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_2492  HTHFIS        64     3e-14    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 64.1 bits (156), Expect = 3e-14
Identities = 22/113 (19%), Positives = 47/113 (41%), Gaps = 2/113 (1%)

Query: 9 VMIVDDHPLMRRGVRQLLELDPGFEVVAEAGDGASAIDLANRLDIDVILLDLNMKGMSGL 68
+++ DD +R + Q L G++V + A+ D D+++ D+ M +
Sbjct: 6 ILVADDDAAIRTVLNQALS-RAGYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 69 DTLNALRRDGVTAQIIILTVSDASSDVFALIDAGADGYLLKDSDPEVLLEAIR 121
D L +++ +++++ + + GA YL K D L+ I
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIG 116


42  EcE24377A_2553  EcE24377A_2576  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias     Product
EcE24377A_2553  0       15       3.543686    SMR family multidrug efflux pump
EcE24377A_2555  0       14       3.875236    hypothetical protein
EcE24377A_2554  0       14       5.051728    polymyxin B resistance protein pmrD
EcE24377A_2556  0       14       4.889174    O-succinylbenzoic acid--CoA ligase
EcE24377A_2557  0       14       4.691620    O-succinylbenzoate synthase
EcE24377A_2558  -1      13       3.320950    naphthoate synthase
EcE24377A_2559  0       13       2.599832    acyl-CoA thioester hydrolase
EcE24377A_2560  -1      16       -1.510982   2-succinyl-5-enolpyruvyl-6-hydroxy-3-
EcE24377A_2562  -1      24       -4.827634   menaquinone-specific isochorismate synthase
EcE24377A_2563  -1      17       -3.552911   hypothetical protein
EcE24377A_2564  -1      13       -2.085141   hypothetical protein
EcE24377A_2565  -1      15       -0.130854   ribonuclease Z
EcE24377A_2566  -1      17       0.121475    deubiquitinase
EcE24377A_2568  1       25       3.427916    hypothetical protein
EcE24377A_2569  1       29       4.157988    NADH dehydrogenase subunit N
EcE24377A_2570  1       29       3.483897    NADH dehydrogenase subunit M
EcE24377A_2571  0       30       4.105123    NADH dehydrogenase subunit L
EcE24377A_2572  0       30       3.862699    NADH dehydrogenase subunit K
EcE24377A_2573  0       30       3.876830    NADH dehydrogenase subunit J
EcE24377A_2574  1       29       4.093597    NADH dehydrogenase subunit I
EcE24377A_2575  0       28       3.869116    NADH dehydrogenase subunit H
EcE24377A_2576  1       26       3.907461    NADH dehydrogenase subunit G
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_2553  BCTERIALGSPC  28     0.008    Bacterial general secretion pathway protein C signa...
		>BCTERIALGSPC#Bacterial general secretion pathway protein C signature.

Length = 272

Score = 28.0 bits (62), Expect = 0.008
Identities = 12/31 (38%), Positives = 18/31 (58%), Gaps = 1/31 (3%)

Query: 34 KHIVLWLGLALACLGLAMVLWLLVL-QNVPV 63
+ I+ +L + L C LAM+ W + L N PV
Sbjct: 15 RRILFYLLMLLFCQQLAMIFWRIGLPDNAPV 45


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_2564  AUTOINDCRSYN  35     6e-05    Autoinducer synthesis protein signature.
		>AUTOINDCRSYN#Autoinducer synthesis protein signature.

Length = 216

Score = 34.8 bits (80), Expect = 6e-05
Identities = 14/79 (17%), Positives = 32/79 (40%), Gaps = 12/79 (15%)

Query: 1 MIEWQDLHHSELSVSQLYALLQLRCAVFV--------VEQNCPYQDIDGDDLTGDNRHIL 52
M+E D++H+ LS ++ L LR F + D + + ++
Sbjct: 1 MLEIFDVNHTLLSETKSGELFTLRKETFKDRLNWAVQCTDGMEFDQYDNN----NTTYLF 56

Query: 53 GWKNDELVAYARILKSDDD 71
G K++ ++ R +++
Sbjct: 57 GIKDNTVICSLRFIETKYP 75


43  EcE24377A_2639  EcE24377A_2662  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias     Product
EcE24377A_2639  1       19       -5.123233   hypothetical protein
EcE24377A_2640  1       18       -6.044881   long-chain fatty acid outer membrane
EcE24377A_2641  1       23       -6.052897   hypothetical protein
EcE24377A_2642  -1      22       -3.740414   VacJ family lipoprotein
EcE24377A_2643  -1      25       -3.462390   hypothetical protein
EcE24377A_2644  0       23       -2.807759   hypothetical protein
EcE24377A_2646  0       25       -2.721044   *galactoside permease
EcE24377A_2647  1       23       -0.953122   aminoimidazole riboside kinase
EcE24377A_2648  0       25       -3.601451   hypothetical protein
EcE24377A_2649  0       26       -4.429424   sucrose-6-phosphate hydrolase
EcE24377A_2650  -1      30       -6.073944   LacI family sucrose operon repressor
EcE24377A_2651  0       32       -8.839732   hypothetical protein
EcE24377A_2652  0       33       -9.014061   D-serine dehydratase
EcE24377A_2653  0       36       -9.968346   DHA2 family drug:H+ antiporter-1
EcE24377A_2654  0       36       -10.120442  drug resistance MFS transporter membrane fusion
EcE24377A_2655  1       35       -9.688223   DNA-binding transcriptional activator EvgA
EcE24377A_2657  1       35       -8.554062   hybrid sensory histidine kinase in two-component
EcE24377A_2656  3       33       -6.026507   hypothetical protein
EcE24377A_2658  2       32       -5.888395   hypothetical protein
EcE24377A_2660  2       33       -5.573274   hypothetical protein
EcE24377A_2659  2       31       -5.179550   transporter YfdV
EcE24377A_2661  1       26       -3.905823   oxalyl-CoA decarboxylase
EcE24377A_2662  0       20       -3.495010   formyl-coenzyme A transferase
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_2642  VACJLIPOPROT  407    e-148    VacJ lipoprotein signature.
		>VACJLIPOPROT#VacJ lipoprotein signature.

Length = 251

Score = 407 bits (1048), Expect = e-148
Identities = 250/251 (99%), Positives = 250/251 (99%)

Query: 1 MKLRLSALALGTTLLVGCASSGTDQQGRSDPLEGFNRTMYNFNFNVLDPYIVRPVAVAWR 60
MKLRLSALALGTTLLVGCASSGTDQQGRSDPLEGFNRTMYNFNFNVLDPYIVRPVAVAWR
Sbjct: 1 MKLRLSALALGTTLLVGCASSGTDQQGRSDPLEGFNRTMYNFNFNVLDPYIVRPVAVAWR 60

Query: 61 DYVPQPARNGLSNFTGNLEEPAVMVNYFLQGDPYQGMVHFTRFFLNTILGMGGFIDVAGM 120
DYVPQPARNGLSNFTGNLEEPAVMVNYFLQGDPYQGMVHFTRFFLNTILGMGGFIDVAGM
Sbjct: 61 DYVPQPARNGLSNFTGNLEEPAVMVNYFLQGDPYQGMVHFTRFFLNTILGMGGFIDVAGM 120

Query: 121 ANPKLQRTEPHRFGSTLGHYGVGYGPYVQLPFYGSFTLRDDGGDMADGLYPVLSWLTWPM 180
ANPKLQRTEPHRFGSTLGHYGVGYGPYVQLPFYGSFTLRDDGGDMAD LYPVLSWLTWPM
Sbjct: 121 ANPKLQRTEPHRFGSTLGHYGVGYGPYVQLPFYGSFTLRDDGGDMADALYPVLSWLTWPM 180

Query: 181 SVGKWTLEGIETRAQLLDSDGLLRQSSDPYIMVREAYFQRHDFIANGGELKPQENPNAQA 240
SVGKWTLEGIETRAQLLDSDGLLRQSSDPYIMVREAYFQRHDFIANGGELKPQENPNAQA
Sbjct: 181 SVGKWTLEGIETRAQLLDSDGLLRQSSDPYIMVREAYFQRHDFIANGGELKPQENPNAQA 240

Query: 241 IQDDLKDIDSE 251
IQDDLKDIDSE
Sbjct: 241 IQDDLKDIDSE 251


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_2653  TCRTETB       121    4e-32    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 121 bits (306), Expect = 4e-32
Identities = 92/404 (22%), Positives = 167/404 (41%), Gaps = 17/404 (4%)

Query: 19 VTIALSLATFMQMLDSTISNVAIPTISGFLGASTDEGTWVITSFGVANAIAIPVTGRLAQ 78
+ I L + +F +L+ + NV++P I+ WV T+F + +I V G+L+
Sbjct: 15 ILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSD 74

Query: 79 RIGELRLFLLSVTFFSLSSLMCSLS-TNLDVLIFFRVVQGLMAGPLIPLSQSLLLRNYPP 137
++G RL L + S++ + + +LI R +QG A L ++ R P
Sbjct: 75 QLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPK 134

Query: 138 EKRTFALALWSMTVIIAPICGPILGGYICDNFSWGWIFLINVPMGIIVLTLCLTLLKGRE 197
E R A L V + GP +GG I W +L+ +PM I+ L L +E
Sbjct: 135 ENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWS--YLLLIPMITIITVPFLMKLLKKE 192

Query: 198 TETSPVKMNLPGLTLLVLGVGGLQIMLDKGRDLDWFNSSTIIILTVVSVISLISLVIWES 257
++ G+ L+ +G+ + ML F +S I +VSV+S + V
Sbjct: 193 VRIKG-HFDIKGIILMSVGI--VFFML--------FTTSYSISFLIVSVLSFLIFVKHIR 241

Query: 258 TSENPILDLSLFKSRNFTIGIVSITCAYLFYSGAIVLMPQLLQETMGYNAIWAGLAYAPI 317
+P +D L K+ F IG++ + +G + ++P ++++ + G
Sbjct: 242 KVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFP 301

Query: 318 GIMPLLIS-PLIGRYGNKIDMRLLVTFSFLMYAVCYYWRSVTFMPTIDFTGIILPQFFQG 376
G M ++I + G ++ ++ +V + S T F II+ G
Sbjct: 302 GTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGG 361

Query: 377 FAVACFFLPLTTISFSGLPDNKFANASSMSNFFRTLSGSVGTSL 420
+ ++TI S L + S+ NF LS G ++
Sbjct: 362 LSFTK--TVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAI 403


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_2654  RTXTOXIND     76     4e-17    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 75.6 bits (186), Expect = 4e-17
Identities = 63/412 (15%), Positives = 122/412 (29%), Gaps = 96/412 (23%)

Query: 13 RRKYFSLLAVVLFIAFSGAYAYWSMELEDMISTDDAYVT-GNADPISAQVSGSVTVVNHK 71
RR ++ F+ + + ++E + + + G + I + V + K
Sbjct: 55 RRPRLVAYFIMGFLVIAFILSVLG-QVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVK 113

Query: 72 DTNYVRQGDILVSLDKTDATIALNKA---------------------------------- 97
+ VR+GD+L+ L A K
Sbjct: 114 EGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDE 173

Query: 98 ------------------KNNLANIVRQTNKLYLQDKQYSAEVASARIQ---YQQSLEDY 136
K + Q + L + AE + + Y+
Sbjct: 174 PYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVE 233

Query: 137 NRRV----PLAKQGVISKE----------TLEHTKDTLISSKAALNAAIQAYKANKALVM 182
R+ L + I+K + S + + I + K LV
Sbjct: 234 KSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVT 293

Query: 183 N-------TPLNR-QPQVVEAADATKEAWLALKRTDIRSPVTDYIAQRSVQ-VGETVSPG 233
L + + + + + IR+PV+ + Q V G V+
Sbjct: 294 QLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTA 353

Query: 234 QSLMAVVPARQ-MWVNANFKETQLTDVRIGQSVNIISDLYGENVVFHGRVTGINMGTGNA 292
++LM +VP + V A + + + +GQ+ I + F G +G
Sbjct: 354 ETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVE------AFPYTRYGYLVGK--- 404

Query: 293 FSLLPAQNATGNWIKIVQRVPVEVSLDPKELMEH----PLRIGLSMTATIDT 340
+ + +V V +S++ L PL G+++TA I T
Sbjct: 405 VKNINLDAIEDQRLGLVFNVI--ISIEENCLSTGNKNIPLSSGMAVTAEIKT 454


ORFs having significant similarity with Known Virulence factors
LocusTagHitsScoreE-valueComments
EcE24377A_2655HTHFIS493e-09 FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 49.1 bits (117), Expect = 3e-09
Identities = 22/148 (14%), Positives = 53/148 (35%), Gaps = 31/148 (20%)

Query: 4 IIIDDHPLAIAAIRNLLIKNDIEILAELTEGGSAVQRVETLKPDIVIIDVDIPGVNGIQV 63
++ DD + L + ++ + + + + D+V+ DV +P N +
Sbjct: 7 LVADDDAAIRTVLNQALSRAGYDVRIT-SNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 64 LETLRKRQYSGIIIIVSAKNDHFYGKHCADAGANGFVSKKEGMNNIIAAIEAAKNGYCYF 123
L ++K + ++++SA+N + AI+A++ G +
Sbjct: 66 LPRIKKARPDLPVLVMSAQNT------------------------FMTAIKASEKGAYDY 101

Query: 124 ---PFSLNRFVGSLTSDQQKLDSLSKQE 148
PF L + + L ++
Sbjct: 102 LPKPFDLTE---LIGIIGRALAEPKRRP 126


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_2657  HTHFIS        80     2e-17    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 79.9 bits (197), Expect = 2e-17
Identities = 30/105 (28%), Positives = 51/105 (48%)

Query: 960 SILIADDHPTNRLLLKRQLNLLGYDVDEATDGVQALHKVSMQHYDLLITDVNMPNMDGFE 1019
+IL+ADD R +L + L+ GYDV ++ ++ DL++TDV MP+ + F+
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 1020 LTRKLREQNSSLPIWGLTANAQANEREKGLSCGMNLCLFKPLTLD 1064
L ++++ LP+ ++A K G L KP L
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLT 109


44  EcE24377A_2722  EcE24377A_2737  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias     Product
EcE24377A_2722  -2      16       3.775734    coproporphyrinogen III oxidase
EcE24377A_2723  -2      17       4.666700    transcriptional regulator EutR
EcE24377A_2724  -1      21       4.863330    ethanolamine utilization protein EutK
EcE24377A_2725  -1      20       5.089898    ethanolamine utilization protein EutL
EcE24377A_2726  0       21       5.385867    ethanolamine ammonia-lyase small subunit
EcE24377A_2727  0       20       5.497880    ethanolamine ammonia-lyase, large subunit
EcE24377A_2728  2       19       5.680014    reactivating factor for ethanolamine ammonia
EcE24377A_2729  2       18       5.358963    ethanolamine utilization protein EutH
EcE24377A_2730  4       19       6.112225    ethanolamine utilization protein EutG
EcE24377A_2731  2       18       6.049578    ethanolamine utilization protein EutJ
EcE24377A_2732  3       20       5.628070    ethanolamine utilization protein EutE
EcE24377A_2733  1       19       4.538311    ethanolamine utilization protein
EcE24377A_2734  2       19       3.930628    ethanolamine utilization protein EutM
EcE24377A_2735  2       19       3.519074    phosphotransacetylase
EcE24377A_2736  2       17       2.374810    ethanolamine utilization cobalamin
EcE24377A_2737  2       13       1.004855    ethanolamine utilization protein EutQ
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_2731  SHAPEPROTEIN  50     4e-09    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 49.8 bits (119), Expect = 4e-09
Identities = 33/116 (28%), Positives = 52/116 (44%), Gaps = 9/116 (7%)

Query: 63 VRDGIVWDFFGAVTIVRRHLDTLEQQFGLRFS-HAATSFPPGTDP---RISINVLESAGL 118
++DG++ DFF +++ + + +R S P G R + AG
Sbjct: 76 MKDGVIADFFVTEKMLQHFIKQVHSNSFMRPSPRVLVCVPVGATQVERRAIRESAQGAGA 135

Query: 119 EVSHVLDEPTAVA---DLLQLDNAG--VVDIGGGTTGIAIVKKGKVTYSADEATGG 169
+++EP A A L + G VVDIGGGTT +A++ V YS+ GG
Sbjct: 136 REVFLIEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVAVISLNGVVYSSSVRIGG 191


45  EcE24377A_2783  EcE24377A_2790  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias     Product
EcE24377A_2783  -2      12       -3.179210   phosphoribosylglycinamide formyltransferase
EcE24377A_2784  -2      11       -3.779300   polyphosphate kinase
EcE24377A_2785  -1      15       -3.989172   exopolyphosphatase
EcE24377A_2786  -2      13       -2.805647   cyclic diguanylate phosphodiesterase
EcE24377A_2787  2       30       0.187938    hypothetical protein
EcE24377A_2788  2       29       0.704583    hypothetical protein
EcE24377A_2789  2       25       0.676798    surface antigen family protein
EcE24377A_2790  2       25       1.274712    hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcE24377A_2790  IGASERPTASE  28  0.020  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 28.5 bits (63), Expect = 0.020
Identities = 19/124 (15%), Positives = 40/124 (32%), Gaps = 6/124 (4%)

Query: 34 QQGKNEEQRQHDEWVAERNREIQQEKQRRANAQAAANKRAATAAANKKARQDKLDAEASA 93
Q + ++ + + + E+ Q Q K AT +KA+ + +
Sbjct: 1064 QNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVET-EKTQEV 1122

Query: 94 DKKRDQSYEDELRSLEIQKQKLALAKEEARVKRENEFIDQELKHKAAQTDVVQSEADANR 153
K Q + +S +Q Q + + V I + D Q + +
Sbjct: 1123 PKVTSQVSPKQEQSETVQPQAEPARENDPTVN-----IKEPQSQTNTTADTEQPAKETSS 1177

Query: 154 NMTE 157
N+ +
Sbjct: 1178 NVEQ 1181


46  EcE24377A_2895  EcE24377A_2961  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
EcE24377A_2895  2  11  -1.263770  cytochrome C assembly protein
EcE24377A_2896  2  11  -1.602238  hypothetical protein
EcE24377A_2897  3  14  -1.895478  hypothetical protein
EcE24377A_2898  3  13  -1.022013  heat shock protein GrpE
EcE24377A_2899  3  15  -0.905822  inorganic polyphosphate/ATP-NAD kinase
EcE24377A_2900  2  19  -2.088373  recombination and repair protein
EcE24377A_2901  4  33  -8.869981  hypothetical protein
EcE24377A_2902  6  37  -11.016640  hypothetical protein
EcE24377A_2903  7  41  -12.909635  hypothetical protein
EcE24377A_2904  6  37  -11.906173  SsrA-binding protein
EcE24377A_2905  7  41  -13.030494  phage integrase
EcE24377A_2906  5  35  -11.814596  resolvase site-specific recombinase
EcE24377A_2907  5  33  -11.209831  ParB family protein
EcE24377A_2908  4  32  -11.237924  repB plasmid partitioning protein
EcE24377A_2909  3  30  -10.881534  N4/N6-methyltransferase
EcE24377A_2910  3  33  -12.590638  type I restriction-modification system subunit
EcE24377A_2911  3  31  -12.363609  HsdR family type I site-specific
EcE24377A_2912  5  41  -16.782596  hypothetical protein
EcE24377A_2913  5  33  -12.887075  hypothetical protein
EcE24377A_2914  5  30  -11.454306  hypothetical protein
EcE24377A_2916  6  28  -10.019451  hypothetical protein
EcE24377A_2915  5  22  -4.169468  hypothetical protein
EcE24377A_2917  4  24  -2.324277  hypothetical protein
EcE24377A_2918  3  24  -2.715530  relaxase/mobilization nuclease domain-containing
EcE24377A_2919  2  25  -3.959320  lipoprotein
EcE24377A_2920  3  25  -4.372569  lipoprotein
EcE24377A_2921  4  27  -6.967339  IS21 family transposase
EcE24377A_2922  5  32  -10.096846  IS21 family transposition helper protein
EcE24377A_2923  7  42  -14.397636  hypothetical protein
EcE24377A_2924  7  40  -13.465728  hypothetical protein
EcE24377A_2925  7  39  -11.930191  hypothetical protein
EcE24377A_2927  5  34  -10.490025  hypothetical protein
EcE24377A_2926  4  37  -10.417385  hypothetical protein
EcE24377A_2928  4  38  -10.672277  hypothetical protein
EcE24377A_2929  1  24  -6.245298  hypothetical protein
EcE24377A_2931  -1  16  -3.645462  IS66 family orf1
EcE24377A_2932  1  12  -1.764484  IS66 family orf2
EcE24377A_2937  2  11  -0.511259  *alpha amylase
EcE24377A_2938  2  19  3.350532  hypothetical protein
EcE24377A_2939  1  20  3.921612  hypothetical protein
EcE24377A_2940  2  20  3.688359  hydroxyglutarate oxidase
EcE24377A_2941  2  19  3.116011  succinate-semialdehyde dehydrogenase I
EcE24377A_2942  1  17  2.201824  4-aminobutyrate aminotransferase
EcE24377A_2943  2  16  -0.923436  gamma-aminobutyrate transporter
EcE24377A_2944  0  17  -1.803829  DNA-binding transcriptional regulator CsiR
EcE24377A_2945  1  22  -4.569421  LysM domain/BON superfamily protein
EcE24377A_2946  0  24  -4.416982  hypothetical protein
EcE24377A_2947  1  25  -4.787960  ArsR family transcriptional regulator
EcE24377A_2948  1  26  -4.654296  inner membrane protein
EcE24377A_2949  3  19  -2.786695  DNA binding protein
EcE24377A_2950  1  13  -1.041478  hypothetical protein
EcE24377A_2951  0  13  -0.937425  hypothetical protein
EcE24377A_2952  2  14  -1.204798  hypothetical protein
EcE24377A_2953  1  12  0.353307  hypothetical protein
EcE24377A_2955  0  12  0.408319  ribonucleotide reductase stimulatory protein
EcE24377A_2956  0  13  0.179542  ribonucleotide-diphosphate reductase subunit
EcE24377A_2957  0  16  0.306451  ribonucleotide-diphosphate reductase subunit
EcE24377A_2958  1  18  1.400247  glycine betaine transporter ATP-binding subunit
EcE24377A_2959  1  17  3.241529  glycine betaine transporter membrane protein
EcE24377A_2960  0  17  1.712847  glycine betaine transporter periplasmic subunit
EcE24377A_2961  2  16  2.095947  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcE24377A_2901  BLACTAMASEA  26  0.032  Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 26.3 bits (58), Expect = 0.032
Identities = 23/87 (26%), Positives = 36/87 (41%), Gaps = 11/87 (12%)

Query: 4 KTLTAAAAVLLMLTAGCSTLERVVYRPDINQGNYLTANDVSKIRV--GMTQQQVAYALGT 61
K + AVL + AG LER ++ Q + + + VS+ + GMT ++ A
Sbjct: 69 KVV-LCGAVLARVDAGDEQLERKIH---YRQQDLVDYSPVSEKHLADGMTVGELCAA--A 122

Query: 62 PLMSDPFGTNTWFYVFRQQPGHEGVTQ 88
MSD N + G G+T
Sbjct: 123 ITMSDNSAANL---LLATVGGPAGLTA 146


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcE24377A_2929  PRTACTNFAMLY  42  1e-07  Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 41.6 bits (97), Expect = 1e-07
Identities = 27/94 (28%), Positives = 43/94 (45%), Gaps = 1/94 (1%)

Query: 15 LGATLSYNMRLGNGMEVEPWLKAAVRKEFVDDNRVKVNSDGNFVNDLSGRRGIYQAGIKA 74
LG + + L G +V+P++KA+V +EF V N + +L G R G+ A
Sbjct: 818 LGLEVGKRIELAGGRQVQPYIKASVLQEFDGAGTVHTNGIAH-RTELRGTRAELGLGMAA 876

Query: 75 SFSSTLSGHLGVGYSHGAGVESPWNGVAGVNWSF 108
+ S + YS G + PW AG +S+
Sbjct: 877 ALGRGHSLYASYEYSKGPKLAMPWTFHAGYRYSW 910


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcE24377A_2938  IGASERPTASE  27  0.015  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 26.6 bits (58), Expect = 0.015
Identities = 23/88 (26%), Positives = 35/88 (39%), Gaps = 20/88 (22%)

Query: 2 NINHSPHDGLVIINKGNEEVEGTWPNK-------------LQPGIYKNMGSNSVNI---- 44
+++ +D L I KG VEGT NK Q SV I
Sbjct: 438 KVHNPQYDRLAKIGKGTLIVEGTGDNKGSLKVGDGTVILKQQTNGSGQHAFASVGIVSGR 497

Query: 45 ---IINNTRKIIPPGKVFTLRGGTLNIN 69
++N+ +++ P F RGG L++N
Sbjct: 498 STLVLNDDKQVDPNSIYFGFRGGRLDLN 525


47  EcE24377A_3004  EcE24377A_3026  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
EcE24377A_3004  -2  22  3.416695  hypothetical protein
EcE24377A_3005  -1  28  4.542939  hydrogenase 3 maturation protease
EcE24377A_3006  -1  26  5.319613  formate hydrogenlyase maturation protein
EcE24377A_3007  -1  25  5.497285  formate hydrogenlyase subunit G
EcE24377A_3008  0  25  4.955936  formate hydrogenlyase complex iron-sulfur
EcE24377A_3009  0  24  4.901999  formate hydrogenlyase subunit E
EcE24377A_3010  2  22  4.604093  formate hydrogenlyase subunit D
EcE24377A_3011  2  22  4.233360  formate hydrogenlyase subunit 3
EcE24377A_3012  1  19  2.838793  formate hydrogenlyase subunit B
EcE24377A_3013  1  19  3.285882  formate hydrogenlyase regulatory protein HycA
EcE24377A_3014  0  16  2.751497  hydrogenase nickel incorporation protein
EcE24377A_3015  -1  15  2.617844  hydrogenase nickel incorporation protein HypB
EcE24377A_3016  -2  14  -4.488228  hydrogenase assembly chaperone
EcE24377A_3017  -1  16  -6.138677  hydrogenase expression/formation protein HypD
EcE24377A_3018  1  26  -10.672611  hydrogenase expression/formation protein HypE
EcE24377A_3019  2  30  -12.649270  formate hydrogenlyase transcriptional activator
EcE24377A_3020  4  39  -15.038509  hypothetical protein
EcE24377A_3021  6  45  -17.502201  hypothetical protein
EcE24377A_3022  5  41  -14.702677  hypothetical protein
EcE24377A_3023  1  24  -8.929725  hypothetical protein
EcE24377A_3024  0  21  -6.370780  IS3, transposase orfA
EcE24377A_3025  0  21  -5.966861  IS3, transposase orfB
EcE24377A_3026  0  22  -6.136183  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcE24377A_3016  TYPE4SSCAGA  27  0.012  Type IV secretion system CagA exotoxin signature.
		>TYPE4SSCAGA#Type IV secretion system CagA exotoxin signature.

Length = 1147

Score = 27.0 bits (59), Expect = 0.012
Identities = 19/75 (25%), Positives = 37/75 (49%), Gaps = 8/75 (10%)

Query: 12 IDGNQAKVD--VCGIQRDVDLTLVGSCDENGQPRVGQWVLVHVGFAMSVINEAEARDTLD 69
I GNQ + D G+ D L ++NG+P G W+ + + F + ++ ++ D +
Sbjct: 171 IIGNQIRTDQKFMGV-FDESLKERQEAEKNGEPTGGDWLDIFLSF---IFDKKQSSDVKE 226

Query: 70 ALQN--MFDVEPDVG 82
A+ + V+PD+
Sbjct: 227 AINQEPVPHVQPDIA 241


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcE24377A_3019  HTHFIS  390  e-131  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 390 bits (1004), Expect = e-131
Identities = 142/373 (38%), Positives = 203/373 (54%), Gaps = 39/373 (10%)

Query: 350 YQEIHRLKERLVDENLALTEQLNNVDSEFGEIIGRSEAMYSVLKQVEMVAQSDSTVLILG 409
E+ + R + E +L + + ++GRS AM + + + + Q+D T++I G
Sbjct: 108 LTELIGIIGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITG 167

Query: 410 ETGTGKELIARAIHNLSGRNNRRMVKMNCAAMPAGLLESDLFGHERGAFTGASAQRIGRF 469
E+GTGKEL+ARA+H+ R N V +N AA+P L+ES+LFGHE+GAFTGA + GRF
Sbjct: 168 ESGTGKELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRF 227

Query: 470 ELADKSSLFLDEVGDMPLELQPKLLRVLQEQEFERLGSNKIIQTDVRLIAATNRDLKKMV 529
E A+ +LFLDE+GDMP++ Q +LLRVLQ+ E+ +G I++DVR++AATN+DLK+ +
Sbjct: 228 EQAEGGTLFLDEIGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSI 287

Query: 530 ADREFRSDLYYRLNVFPIHLPPLRERPEDIPLLAKAFTFKIARRLGRNIDSIPAETLRTL 589
FR DLYYRLNV P+ LPPLR+R EDIP L + F + A + G ++ E L +
Sbjct: 288 NQGLFREDLYYRLNVVPLRLPPLRDRAEDIPDLVRHFV-QQAEKEGLDVKRFDQEALELM 346

Query: 590 SNMEWPGNVRELENVIERAVLLTRGNV-----LQLSLPDIVLPEPETPPAATVVAQE--- 641
WPGNVRELEN++ R L +V ++ L + P AA +
Sbjct: 347 KAHPWPGNVRELENLVRRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQ 406

Query: 642 ---------------------------GEDEYQLIVRVLKETNGVVAGPKGAAQRLGLKR 674
E EY LI+ L T G AA LGL R
Sbjct: 407 AVEENMRQYFASFGDALPPSGLYDRVLAEMEYPLILAALTATRGNQIK---AADLLGLNR 463

Query: 675 TTLLSRMKRLGID 687
TL +++ LG+
Sbjct: 464 NTLRKKIRELGVS 476


48  EcE24377A_3064  EcE24377A_3081  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
EcE24377A_3064  -1  16  3.343620  phosphoadenosine phosphosulfate reductase
EcE24377A_3065  0  14  2.968826  sulfite reductase subunit beta
EcE24377A_3066  -1  13  2.637478  sulfite reductase subunit alpha
EcE24377A_3067  1  17  1.969013  queuosine biosynthesis protein QueD
EcE24377A_3068  2  17  2.167616  pyridine nucleotide-disulfide oxidoreductase
EcE24377A_3069  1  13  0.922664  ferredoxin
EcE24377A_3070  1  11  -0.257364  glycerol-3-phosphate responsive antiterminator
EcE24377A_3071  1  9  -0.169569  electron transfer flavoprotein
EcE24377A_3072  0  10  -1.246096  electron transfer flavoprotein
EcE24377A_3073  0  11  -1.738162  major facilitator family transporter
EcE24377A_3074  1  12  -3.074779  FAD binding domain-containing protein
EcE24377A_3075  1  17  -4.965431  short chain dehydrogenase/reductase
EcE24377A_3076  -1  15  -3.866004  major facilitator family transporter
EcE24377A_3077  0  19  -3.182431  carbohydrate kinase, FGGY family protein
EcE24377A_3080  0  22  -2.915029  hypothetical protein
EcE24377A_3081  1  27  -3.466362  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcE24377A_3065  PF07675  30  0.021  Cleaved Adhesin
		>PF07675#Cleaved Adhesin

Length = 1358

Score = 30.4 bits (68), Expect = 0.021
Identities = 20/92 (21%), Positives = 39/92 (42%), Gaps = 12/92 (13%)

Query: 206 ILGQTYLPRKFKTTVVIP---PQND--IDLHANDMNFVAIAENGKLVGFNLLVGGGLSIE 260
++ +P+ T +P PQN + A+ ++VAI+++G L G + G++
Sbjct: 240 VMPYRAMPKT--NTYTLPASLPQNQASYSIQASAGSYVAISKDGVLYGTGVANASGVATV 297

Query: 261 HGNK-----KTYARTASEFGYLPLEHTLAVAE 287
+ K Y + YLP+ + E
Sbjct: 298 NMTKQITENGNYDVVITRSNYLPVIKQIQAGE 329


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcE24377A_3073  TCRTETB  34  8e-04  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 34.5 bits (79), Expect = 8e-04
Identities = 45/314 (14%), Positives = 112/314 (35%), Gaps = 36/314 (11%)

Query: 69 LGSLVLGWISDHIGRQKIFTFSFLLITLASFLQFFATTP-EHLIGLRILIGIGLGGDYSV 127
+G+ V G +SD +G +++ F ++ S + F + LI R + G G ++
Sbjct: 64 IGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPAL 123

Query: 128 GHTLLAEFSPRRHRGILLGAFSVVWT----VGYVLASIAGHHFISENPEAWRWLLASAAL 183
++A + P+ +RG G + VG + + H+ W +LL +
Sbjct: 124 VMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYI------HWSYLLLIPMI 177

Query: 184 PALLITLLRWGTPESPRWLLRQGRFAEAHAIVHRYFGPHVLLGDEVVTATHKHIKTLF-- 241
+ + L + R +G F I+ +L + + + L
Sbjct: 178 TIITVPFLMKLLKKEVR---IKGHFDIKGIILMSVGIVFFMLFTTSYSISFLIVSVLSFL 234

Query: 242 -SSRYWRRTA--------FNSVFFVCLVIPWFVIYT----WLPTIAQTIGLEDALTASLM 288
++ R+ ++ F+ V+ +I+ ++ + + L+ + +
Sbjct: 235 IFVKHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEI 294

Query: 289 LNALLIVGALLGLV-------LTHLLAHRKFLLGSFLLLAATLVVMACLPSGSSLTLLLF 341
+ ++ G + ++ L L L+ + + + L +S + +
Sbjct: 295 GSVIIFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTII 354

Query: 342 VLFSTTISAVSNLV 355
++F + + V
Sbjct: 355 IVFVLGGLSFTKTV 368


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcE24377A_3075  DHBDHDRGNASE  105  2e-29  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 105 bits (263), Expect = 2e-29
Identities = 74/257 (28%), Positives = 117/257 (45%), Gaps = 11/257 (4%)

Query: 11 MDFFSLKGKTAIVTGGNSGLGQAFAMALAKAGANIFIPSFVKDNGETKEMIEK-QGVEVD 69
M+ ++GK A +TG G+G+A A LA GA+I + + E K + +
Sbjct: 1 MNAKGIEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAE 60

Query: 70 FMQVDITAEGAPQKIIAACCERFGTVDILVNNAGICKLNKVLDFGRADWDPMIDVNLTAA 129
D+ A +I A G +DILVN AG+ + + +W+ VN T
Sbjct: 61 AFPADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGV 120

Query: 130 FELSYEAAKIMIPQKSGKIINICSLFSYLGGQWSPAYSATKHALAGFTKAYCDELGQYNI 189
F S +K M+ ++SG I+ + S + + AY+++K A FTK EL +YNI
Sbjct: 121 FNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNI 180

Query: 190 QVNGIAPGYYATDI--TLATRSNPETNQRVLDH-------IPANRWGDTQDLMGAAVFLA 240
+ N ++PG TD+ +L N Q + IP + D+ A +FL
Sbjct: 181 RCNIVSPGSTETDMQWSLWADENGAE-QVIKGSLETFKTGIPLKKLAKPSDIADAVLFLV 239

Query: 241 SPASNYVNGHLLVVDGG 257
S + ++ H L VDGG
Sbjct: 240 SGQAGHITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcE24377A_3076  TCRTETA  29  0.042  Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 29.0 bits (65), Expect = 0.042
Identities = 21/103 (20%), Positives = 45/103 (43%), Gaps = 8/103 (7%)

Query: 48 GLIMSTFGIAAIILYAPSGVIADKFSHRKMITSAVIITGLLGLLMATYPPLWVMLCIQVA 107
G++++ + + G ++D+F R ++ ++ + +MAT P LWV+ ++
Sbjct: 46 GILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIV 105

Query: 108 FAITTILMLWSVSIKAASLLGD---HSEQGKIMGWMEGLRGVG 147
IT + A + + D E+ + G+M G G
Sbjct: 106 AGITG-----ATGAVAGAYIADITDGDERARHFGFMSACFGFG 143


49  EcE24377A_3119  EcE24377A_3131  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
EcE24377A_3119  -1  17  4.037696  murein transglycosylase A
EcE24377A_3123  0  20  4.246377  ***hypothetical protein
EcE24377A_3124  1  24  6.423455  hypothetical protein
EcE24377A_3125  0  28  6.966852  hypothetical protein
EcE24377A_3126  -2  25  5.208880  DotU family type IV/VI secretion system protein
EcE24377A_3127  -2  21  3.397943  OmpA domain-containing protein
EcE24377A_3128  -1  21  1.692948  hypothetical protein
EcE24377A_3129  -1  20  1.772316  ClpA/ClpB family protein
EcE24377A_3130  0  21  -1.134452  ImpA family type VI secretion-associated
EcE24377A_3131  3  29  -5.319168  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcE24377A_3127  OMPADOMAIN  82  5e-19  OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 81.9 bits (202), Expect = 5e-19
Identities = 44/142 (30%), Positives = 63/142 (44%), Gaps = 14/142 (9%)

Query: 415 PEQKMEVTASLQAQTVRLDSMSLFDVGQARLKDGSTKVL---VDALVNIRAKPGWLILVA 471
+Q + L S LF+ +A LK L L N+ K G ++V
Sbjct: 200 VAPAPAPAPEVQTKHFTLKSDVLFNFNKATLKPEGQAALDQLYSQLSNLDPKDG-SVVVL 258

Query: 472 GYTDATGDEKSNQQLSLRRAEAVRNWMLQTSDIPATCFAVQGLGESQPAATNDTPQGR-- 529
GYTD G + NQ LS RRA++V ++ L + IPA + +G+GES P N +
Sbjct: 259 GYTDRIGSDAYNQGLSERRAQSVVDY-LISKGIPADKISARGMGESNPVTGNTCDNVKQR 317

Query: 530 -------AVNRRVEISLVPRSD 544
A +RRVEI + D
Sbjct: 318 AALIDCLAPDRRVEIEVKGIKD 339


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcE24377A_3129  HTHFIS  36  6e-04  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 36.0 bits (83), Expect = 6e-04
Identities = 37/182 (20%), Positives = 66/182 (36%), Gaps = 30/182 (16%)

Query: 512 IMTLRQEGTDSTELQQQLRTHQGFAPLLALDVDARAVATVVADWTG----IPLSSLL--- 564
+ + ++ +L +++ + P+L + + + A G +P L
Sbjct: 52 VTDVVMPDENAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTEL 111

Query: 565 RDEQSDLLSMEQSLENR----------VVGQSPALCAIAQRL-RAAKTGLTPENGPQGVF 613
L+ + ++ +VG+S A+ I + L R +T LT
Sbjct: 112 IGIIGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLT--------L 163

Query: 614 LLTGPSGTGKTETALTLADTLFGGEKSLITINLSEYQEPHTVSQL----KGAPPGYVGYG 669
++TG SGTGK A L D + IN++ S+L KGA G
Sbjct: 164 MITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRS 223

Query: 670 QG 671
G
Sbjct: 224 TG 225


50  EcE24377A_3163  EcE24377A_3186  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
EcE24377A_3163  1  19  -3.982898  hypothetical protein
EcE24377A_3164  2  20  -5.006447  2-deoxy-D-gluconate 3-dehydrogenase
EcE24377A_3165  2  25  -6.806497  5-keto-4-deoxyuronate isomerase
EcE24377A_3166  1  30  -8.230973  acetyl-CoA acetyltransferase
EcE24377A_3167  3  36  -12.750762  serine transporter family protein
EcE24377A_3168  5  45  -16.163217  hypothetical protein
EcE24377A_3169  5  46  -16.671445  transcriptional regulatory protein, C terminal
EcE24377A_3170  6  50  -17.901801  hypothetical protein
EcE24377A_3171  7  49  -18.133159  hypothetical protein
EcE24377A_3172  6  51  -17.445125  hypothetical protein
EcE24377A_3173  6  53  -17.753037  transcriptional regulator
EcE24377A_3174  6  54  -18.283808  hypothetical protein
EcE24377A_3175  7  53  -17.133576  LuxR family transcriptional regulator
EcE24377A_3176  7  53  -16.857713  hypothetical protein
EcE24377A_3177  6  53  -16.641020  type III secretion apparatus lipoprotein EprK
EcE24377A_3179  5  51  -17.362344  type III secretion apparatus protein EprH
EcE24377A_3181  5  51  -16.486720  FlhB/HrpN/YscU/SpaS family protein
EcE24377A_3182  4  45  -13.888072  type III secretion apparatus protein EpaR
EcE24377A_3183  4  44  -14.316816  type III secretion apparatus protein EpaQ
EcE24377A_3184  3  38  -11.832758  surface presentation of antigens protein SpaP
EcE24377A_3185  -1  25  -5.585459  type III secretion apparatus protein EpaO2
EcE24377A_3186  -1  19  -3.000636  type III secretion apparatus protein,
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcE24377A_3164  DHBDHDRGNASE  110  2e-31  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 110 bits (276), Expect = 2e-31
Identities = 72/257 (28%), Positives = 129/257 (50%), Gaps = 11/257 (4%)

Query: 3 LSAFSLEGKVAVVTGCDTGLGQGMALGLAQAGCDIVGI--NIVEPTETIKQVTALGRRFL 60
++A +EGK+A +TG G+G+ +A LA G I + N + + + + A R
Sbjct: 1 MNAKGIEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAE 60

Query: 61 SLTADLRKIDGIPGLLDRAVAEFGHIDILVNNAGLIRREDALEFSEKDWDDVMNLNIKSV 120
+ AD+R I + R E G IDILVN AG++R S+++W+ ++N V
Sbjct: 61 AFPADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGV 120

Query: 121 FFMSQAAAKHFIAQGNGGKIINIASMLSFQGGIRVPSYTASKSGVMGVTRLMANEWAKHN 180
F S++ +K+ + + G I+ + S + + +Y +SK+ + T+ + E A++N
Sbjct: 121 FNASRSVSKYMMDR-RSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYN 179

Query: 181 INVNAIAPGYMATNNTQQLRADEQRSAEILD--------RIPAGRWGLPSDLMGPVVFLA 232
I N ++PG T+ L ADE + +++ IP + PSD+ V+FL
Sbjct: 180 IRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLV 239

Query: 233 SSASDYVNGYTIAVDGG 249
S + ++ + + VDGG
Sbjct: 240 SGQAGHITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcE24377A_3172  SYCDCHAPRONE  71  2e-18  Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 71.5 bits (175), Expect = 2e-18
Identities = 28/164 (17%), Positives = 65/164 (39%), Gaps = 9/164 (5%)

Query: 1 MSTETIEIFNNSDEWANQLKHALSKGENLALLHGLTPDILDRIYAYAFDYHEKGNITDAE 60
M ET + + E+ ++ L G +A+L+ ++ D L+++Y+ AF+ ++ G DA
Sbjct: 1 MQQETTD----TQEYQLAMESFLKGGGTIAMLNEISSDTLEQLYSLAFNQYQSGKYEDAH 56

Query: 61 IYYKFLCIYAFENHEYLKDFASVCQPKKKYQQAYDLYKLSYNYSPYDDYSVIYRMGQCQI 120
++ LC+ + + + Q +Y A Y + + +C +
Sbjct: 57 KVFQALCVLDHYDSRFFLGLGACRQAMGQYDLAIHSYSYGAIMDI-KEPRFPFHAAECLL 115

Query: 121 GAKNIDNAMQCFYH----IINNCEDDSVKSKAQAYIELLNDNSE 160
+ A + I + E + ++ + +E + E
Sbjct: 116 QKGELAEAESGLFLAQELIADKTEFKELSTRVSSMLEAIKLKKE 159


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcE24377A_3177  FLGMRINGFLIF  33  0.001  Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 33.0 bits (75), Expect = 0.001
Identities = 22/126 (17%), Positives = 49/126 (38%), Gaps = 5/126 (3%)

Query: 4 ISLLLFILLLCGCKQQE-LLNHLDQQQANDVLAVLQRHNINAEKKDQGKTGFSIFVEPTD 62
+++++ ++L L ++L Q ++A L + NI + I V
Sbjct: 35 VAIVVAMVLWAKTPDYRTLFSNLSDQDGGAIVAQLTQMNIPYRFANGSGA---IEVPADK 91

Query: 63 FASAVDWLKIYNLPGKPDIQISQMFPADALVSSPRAEKARLYSAIEQRLEQSLKIMDGII 122
L LP + + + S +E+ A+E L ++++ + +
Sbjct: 92 VHELRLRLAQQGLPKGGAVGFE-LLDQEKFGISQFSEQVNYQRALEGELARTIETLGPVK 150

Query: 123 SSRVHV 128
S+RVH+
Sbjct: 151 SARVHL 156


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcE24377A_3181  TYPE3IMSPROT  198  3e-64  Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 198 bits (504), Expect = 3e-64
Identities = 72/242 (29%), Positives = 126/242 (52%), Gaps = 5/242 (2%)

Query: 2 ANKTEKPTQKKLQDASKKGQILKSRDLTISVIMLVG--TLYLGYVFDVHHIMSILEYILD 59
KTE+PT KK++DA KKGQ+ KS+++ + +++ L + H ++ +
Sbjct: 3 GEKTEQPTPKKIRDARKKGQVAKSKEVVSTALIVALSAMLMGLSDYYFEHFSKLMLIPAE 62

Query: 60 HNAKPDIWD---YFKAMGVGWLKTNIPFLLVCMFTTILVSWFQSKMQLATEAVKFKFDSL 116
+ P + + + P L V I Q ++ EA+K +
Sbjct: 63 QSYLPFSQALSYVVDNVLLEFFYLCFPLLTVAALMAIASHVVQYGFLISGEAIKPDIKKI 122

Query: 117 NPVNGLKRIFGLKTVKEFVKAILYIVFFALAIKVFWSNHKSLLFKTLDGDIISLLSDWGE 176
NP+ G KRIF +K++ EF+K+IL +V ++ I + + L + I + G+
Sbjct: 123 NPIEGAKRIFSIKSLVEFLKSILKVVLLSILIWIIIKGNLVTLLQLPTCGIECITPLLGQ 182

Query: 177 MLFLLILYCLGSMIIVLIFDFIAEYFLFMKDMKMDKQEVKREYKEQEGNPEIKSKRRERI 236
+L L++ C +++ I D+ EY+ ++K++KM K E+KREYKE EG+PEIKSKRR+
Sbjct: 183 ILRQLMVICTVGFVVISIADYAFEYYQYIKELKMSKDEIKREYKEMEGSPEIKSKRRQFH 242

Query: 237 RK 238
++
Sbjct: 243 QE 244


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcE24377A_3182  TYPE3IMRPROT  135  7e-41  Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 135 bits (341), Expect = 7e-41
Identities = 47/248 (18%), Positives = 98/248 (39%), Gaps = 4/248 (1%)

Query: 1 MGEAILYQLHSLLAATALGFCRLAPTFYLLPFFASGNIPTVVRHPIIIVVSCALVQHYHY 60
M + Q S L R+ P + ++P V+ + ++++ A+
Sbjct: 1 MLQVTSEQWLSWLNLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA 60

Query: 61 ELLNLNEIDIALFAAREIIIGLFIACLLASPFWIFLAIGSFIDNQRGATLSSTLDPATGV 120
+ + A ++I+IG+ + + F G I Q G + ++ +DPA+ +
Sbjct: 61 NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHL 120

Query: 121 DTSELARLFNLFSAAVYLTKGGMNFILETLWQSHNLWPSGNFNF--PKLEPLFSYINNIM 178
+ LAR+ ++ + ++LT G +++ L + + P G L + I
Sbjct: 121 NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIF 180

Query: 179 THTIVYASPVIAVMLGGEAVLGLLACYASQLNAFAISLTVKSALAFLILIIYFA--PILA 236
+ ++ A P+I ++L LGLL A QL+ F I + + ++
Sbjct: 181 LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC 240

Query: 237 ERVMPLSF 244
E + F
Sbjct: 241 EHLFSEIF 248


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcE24377A_3183  TYPE3IMQPROT  79  4e-23  Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.

Length = 86

Score = 78.7 bits (194), Expect = 4e-23
Identities = 59/86 (68%), Positives = 73/86 (84%)

Query: 1 MDDIVFAGNRALYLILVMSAGPIAVATFVGLLVGLFQTVTQLQEQTLPFGVKLLCVSICF 60
MDD+VFAGN+ALYL+L++S P VAT +GLLVGLFQTVTQLQEQTLPFG+KLL V +C
Sbjct: 1 MDDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCL 60

Query: 61 FLMSGWYGEKLYSFGIEMLNLAFARG 86
FL+SGWYGE L S+G +++ LA A+G
Sbjct: 61 FLLSGWYGEVLLSYGRQVIFLALAKG 86


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcE24377A_3184  TYPE3IMPPROT  224  1e-76  Type III secretion system inner membrane P protein ...
		>TYPE3IMPPROT#Type III secretion system inner membrane P protein family signature.

Length = 224

Score = 224 bits (572), Expect = 1e-76
Identities = 150/223 (67%), Positives = 180/223 (80%), Gaps = 5/223 (2%)

Query: 1 MSNSISLIAILSLFTLLPFIIASGTCFIKFSIVFVIVRKALGLQQVPSNMTLNGVALLLS 60
M N ISLIA+L+ TLLPFIIASGTCF+KFSIVFV+VR ALGLQQ+PSNMTLNGVALLLS
Sbjct: 1 MGNDISLIALLAFSTLLPFIIASGTCFVKFSIVFVMVRNALGLQQIPSNMTLNGVALLLS 60

Query: 61 MFVMMPVGKEIYNNSQNENLSFNNVASVVNFVETGMSGYKSYLIKYSEPELVSFFEKIQK 120
MFVM P+ + Y ++E+++FN+++S+ V+ G+ GY+ YLIKYS+ ELV FFE Q
Sbjct: 61 MFVMWPIMHDAYVYFEDEDVTFNDISSLSKHVDEGLDGYRDYLIKYSDRELVQFFENAQL 120

Query: 121 VNSSEDNEEIIDDD-----NISIFSLLPAYALSEIKSAFIIGFYIYLPFVVVDLVISSVL 175
+ E + D SIF+LLPAYALSEIKSAF IGFY+YLPFVVVDLV+SSVL
Sbjct: 121 KRQYGEETETVKRDKDEIEKPSIFALLPAYALSEIKSAFKIGFYLYLPFVVVDLVVSSVL 180

Query: 176 LTLGMMMMSPVTISTPIKLILFVAMDGWTMLSKGLILQYFDLS 218
L LGMMMMSPVTISTPIKL+LFVA+DGWT+LSKGLILQY D++
Sbjct: 181 LALGMMMMSPVTISTPIKLVLFVALDGWTLLSKGLILQYMDIA 223


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcE24377A_3185  TYPE3OMOPROT  104  1e-29  Type III secretion system outer membrane O protein ...
		>TYPE3OMOPROT#Type III secretion system outer membrane O protein family signature.

Length = 303

Score = 104 bits (261), Expect = 1e-29
Identities = 62/186 (33%), Positives = 90/186 (48%), Gaps = 13/186 (6%)

Query: 3 LWLDKVTVCQYGNAPALDKKSLYWSIHFVIGFSKTCYRSLVDIEVGDVLLISNNLAYAVI 62
LW + + K L W + FVIG S T L I +GDVLLI + A
Sbjct: 129 LWFEHLPELPAVGGGRP--KMLRWPLRFVIGSSDTQRSLLGRIGIGDVLLIRTSRA---- 182

Query: 63 YNTKICDLIYPEELKMADHFEYEEDFETDDFDIKKNESEIYDENDDQMINSFEDLPVKIE 122
+Y K+ E + DI+ E E + + LPVK+E
Sbjct: 183 -------EVYCYAKKLGHFNRVEGGIIVETLDIQHIEEENNTTETAETLPGLNQLPVKLE 235

Query: 123 FVLGKKIMNLYEIDELCAKRIISLLPESEKNIKIRVNGALTGYGELVEVDDKLGVEIHSW 182
FVL +K + L E++ + ++++SL +E N++I NG L G GELV+++D LGVEIH W
Sbjct: 236 FVLYRKNVTLAELEAMGQQQLLSLPTNAELNVEIMANGVLLGNGELVQMNDTLGVEIHEW 295

Query: 183 LSGNNN 188
LS + N
Sbjct: 296 LSESGN 301


51  EcE24377A_3251  EcE24377A_3257  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
EcE24377A_3251  2  33  -0.656681  mechanosensitive channel MscS
EcE24377A_3252  2  39  -0.486269  hypothetical protein
EcE24377A_3253  2  36  0.087981  fructose-bisphosphate aldolase
EcE24377A_3254  0  23  1.225960  phosphoglycerate kinase
EcE24377A_3255  3  19  2.161588  erythrose 4-phosphate dehydrogenase
EcE24377A_3256  0  20  3.252959  hypothetical protein
EcE24377A_3257  -1  18  3.071483  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
EcE24377A_3251  SECYTRNLCASE  28  0.040  Preprotein translocase SecY subunit signature.
		>SECYTRNLCASE#Preprotein translocase SecY subunit signature.

Length = 437

Score = 28.2 bits (63), Expect = 0.040
Identities = 34/154 (22%), Positives = 60/154 (38%), Gaps = 16/154 (10%)

Query: 34 ALAI--IIVGLIIARMISNAVNRLMISRKIDATVADFLSALVRYGIIAFTLIAALGRVGV 91
AL I I II ++++ + RL +K ++ RY +A ++ G V
Sbjct: 76 ALGIMPYITASIILQLLTVVIPRLEALKKEGQAGTAKITQYTRYLTVALAILQGTGL--V 133

Query: 92 QTASVIAVLGAAGLAVGLALQGSLSN-------LAAGVLLVMFRPFRAGEYVDLGGVAGT 144
TA + G + + S+ + AG +VM+ GE + G+ G
Sbjct: 134 ATARSAPLFGRCSVGGQIVPDQSIFTTITMVICMTAGTCVVMW----LGELITDRGI-GN 188

Query: 145 VLSVQIFSTTMRTADGKIIVIPNGKIIAGNIINF 178
+S+ +F + T + I +AG I F
Sbjct: 189 GMSILMFISIAATFPSALWAIKKQGTLAGGWIEF 222


52  EcE24377A_3319  EcE24377A_3424  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
EcE24377A_3319  0  16  -3.253414  hypothetical protein
EcE24377A_3320  0  16  -3.615883  SNF2 family helicase
EcE24377A_3321  0  17  -2.967981  ATPase AAA
EcE24377A_3322  0  18  -3.359614  S8A family peptidase
EcE24377A_3323  2  22  -5.417111  DNA methylase
EcE24377A_3325  1  22  -4.923511  type III restriction enzyme, res subunit
EcE24377A_3324  2  25  -4.311678  hypothetical protein
EcE24377A_3327  2  26  -3.542227  group II intron-encoded reverse
EcE24377A_3329  3  29  -5.257270  IS911, transposase orfA
EcE24377A_3330  3  28  -5.635057  hypothetical protein
EcE24377A_3331  3  22  -4.286251  ATPase
EcE24377A_3332  6  25  -0.968337  hypothetical protein
EcE24377A_3333  5  26  -0.335004  hypothetical protein
EcE24377A_3334  4  25  0.140215  hemolysin expression modulating protein
EcE24377A_3335  3  25  1.977001  hypothetical protein
EcE24377A_3337  4  27  3.770913  hypothetical protein
EcE24377A_3338  4  28  3.800025  IS100, transposase
EcE24377A_3339  5  28  4.802094  IS21 family transposition helper protein
EcE24377A_3341  7  24  5.263555  IS21 family transposase
EcE24377A_3343  8  24  5.773173  IS66 family transposase
EcE24377A_3344  9  25  5.151107  IS66 family orf2
EcE24377A_3345  10  25  5.178807  IS66 family orf1
EcE24377A_3346  10  26  5.364029  GTPase
EcE24377A_3347  10  26  5.262881  antigen 43
EcE24377A_3348  7  27  4.389644  hypothetical protein
EcE24377A_3349  6  25  3.165563  antirestriction protein
EcE24377A_3350  5  25  1.373452  RadC family DNA repair protein
EcE24377A_3351  6  26  0.188675  hypothetical protein
EcE24377A_3352  7  28  1.300642  hypothetical protein
EcE24377A_3353  7  28  1.816606  hypothetical protein
EcE24377A_3354  9  25  3.751809  hypothetical protein
EcE24377A_3355  9  25  3.289843  hypothetical protein
EcE24377A_3356  8  23  2.420852  phage integrase
EcE24377A_3357  5  22  -0.621304  helicase/Zfx/Zfy transcription activation region
EcE24377A_3358  5  24  -3.790874  hypothetical protein
EcE24377A_3359  5  26  -4.889757  Ig family protein
EcE24377A_3360  7  44  -10.476539  hypothetical protein
EcE24377A_3361  7  42  -9.679930  hypothetical protein
EcE24377A_3362  5  32  -5.833482  UvrD family helicase
EcE24377A_3363  4  25  -1.648174  hypothetical protein
EcE24377A_3365  6  30  3.359251  hypothetical protein
EcE24377A_3366  6  29  4.183598  IS66 family orf1
EcE24377A_3367  7  29  4.297327  IS66 family orf2
EcE24377A_3368  7  28  3.543638  IS66 family transposase
EcE24377A_3369  7  28  3.194314  IS21 family transposition helper protein
EcE24377A_3370  7  25  2.558223  IS21 family transposase
EcE24377A_3371  6  25  1.123678  DnaB family helicase
EcE24377A_3372  7  27  -1.305246  hypothetical protein
EcE24377A_3373  7  24  -1.173731  hypothetical protein
EcE24377A_3374  7  25  -1.871778  hypothetical protein
EcE24377A_3375  7  25  -1.887786  chromosome partitioning protein
EcE24377A_3377  6  28  -4.021927  *site-specific recombinase, phage integrase
EcE24377A_3379  5  23  -2.190014  HTH-type transcriptional regulator RafR
EcE24377A_3380  4  20  -1.668563  alpha-galactosidase
EcE24377A_3381  3  22  -1.911510  galactoside permease
EcE24377A_3382  3  22  -2.139205  raffinose invertase
EcE24377A_3383  2  21  -2.303790  glycoporin RafY
EcE24377A_3384  3  24  1.741170  IS66 family transposase
EcE24377A_3385  4  26  -1.058791  IS66 family orf2
EcE24377A_3386  5  28  -2.402434  IS66 family orf1
EcE24377A_3387  5  26  -2.565644  DNA-binding protein H-NS-like protein
EcE24377A_3389  5  22  0.759272  hypothetical protein
EcE24377A_3391  4  24  0.510313  prophage CP4-57 regulatory protein
EcE24377A_3392  3  23  -0.189501  hypothetical protein
EcE24377A_3393  3  30  -2.772338  hypothetical protein
EcE24377A_3394  6  37  -9.591913  hypothetical protein
EcE24377A_3395  6  38  -10.987799  hypothetical protein
EcE24377A_3396  6  47  -15.990123  hypothetical protein
EcE24377A_3397  5  32  -10.499875  hypothetical protein
EcE24377A_3398  4  34  -11.181257  hypothetical protein
EcE24377A_3399  4  33  -9.006673  hypothetical protein
EcE24377A_3400  4  29  -4.309304  hypothetical protein
EcE24377A_3401  4  28  -2.808916  hypothetical protein
EcE24377A_3402  4  28  -1.880633  GTPase
EcE24377A_3403  5  26  -0.779151  hypothetical protein
EcE24377A_3405  6  27  0.843890  hypothetical protein
EcE24377A_3404  6  26  2.284978  hypothetical protein
EcE24377A_3406  6  26  2.364157  hypothetical protein
EcE24377A_3407  7  28  4.240817  hypothetical protein
EcE24377A_3408  8  28  4.255751  hypothetical protein
EcE24377A_3409  8  28  3.253389  antirestriction protein
EcE24377A_3410  8  25  2.637907  RadC family DNA repair protein
EcE24377A_3411  8  23  -0.094525  hypothetical protein
EcE24377A_3412  6  23  -2.991843  hypothetical protein
EcE24377A_3413  5  21  -3.046183  hypothetical protein
EcE24377A_3414  2  17  -1.323139  hypothetical protein
EcE24377A_3415  1  17  -0.508505  hypothetical protein
EcE24377A_3416  0  15  0.379094  hypothetical protein
EcE24377A_3417  0  14  1.678794  hypothetical protein
EcE24377A_3418  1  17  4.469960  general secretion pathway protein YghD
EcE24377A_3419  1  14  4.241400  GspL-like protein
EcE24377A_3420  1  19  4.642344  general secretion pathway protein K
EcE24377A_3421  0  20  5.600948  general secretion pathway protein J
EcE24377A_3422  -1  21  4.766141  general secretion pathway protein I
EcE24377A_3423  -2  17  3.772482  general secretion pathway protein H
EcE24377A_3424  -2  15  3.105050  general secretion pathway protein G
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3322  SUBTILISIN       93  1e-22    Subtilisin serine protease family (S8) signature.
		>SUBTILISIN#Subtilisin serine protease family (S8) signature.

Length = 326

Score = 93.4 bits (232), Expect = 1e-22
Identities = 63/323 (19%), Positives = 104/323 (32%), Gaps = 100/323 (30%)

Query: 259 VCILDSGINTNHPLLSAAIAESASFIPNQD-----AFDQEGHGTAVASIALYGDVEACNQ 313
V +LD+G + +HP L A I +F + + D GHGT VA A +
Sbjct: 45 VAVLDTGCDADHPDLKARIIGGRNFTDDDEGDPEIFKDYNGHGTHVAGTI------AATE 98

Query: 314 SNFWQ----PQLWLYNGKVLNERAEFNAETIESTLTAAVAYFTDLGCRIFNLSLGNANAP 369
+ P+ L KVLN++ + I + Y + I ++SLG P
Sbjct: 99 NENGVVGVAPEADLLIIKVLNKQGSGQYDWII----QGIYYAIEQKVDIISMSLGG---P 151

Query: 370 YDGKHIR-GMAYLLDTLARQYNVLFVVSAGNFAGSDDPPVPQNSWRDEYPDYLLHEDSVI 428
D + + A +L + +AGN DD + Y
Sbjct: 152 EDVPELHEAVKK-----AVASQILVMCAAGNEGDGDD-----RTDELGY----------- 190

Query: 429 IDPAPALNVLTVGSVARHNATLDAQRRPGDIQHLSPATENQPSPFTRHGPSVKGAFKPDV 488
P V++VG++ + S F+ V D+
Sbjct: 191 --PGCYNEVISVGAI---------------------NFDRHASEFSNSNNEV------DL 221

Query: 489 VAHGGNVASNVRQGQWQAHMRGLGVLSCHHQFQGNTLFKELSGTSFAAPYITHLAGRLLN 548
VA G ++ S V G++ SGTS A P++ +
Sbjct: 222 VAPGEDILSTVPGGKYATF----------------------SGTSMATPHVAGALALIKQ 259

Query: 549 EYP-----EMSANMLRAMLVNHA 566
+++ L A L+
Sbjct: 260 LANASFERDLTEPELYAQLIKRT 282


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3347  PRTACTNFAMLY     32  0.020    Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 31.6 bits (71), Expect = 0.020
Identities = 161/824 (19%), Positives = 258/824 (31%), Gaps = 111/824 (13%)

Query: 38 IALSLAAVTSVPALAAD----TVVQAGETVNDGTLTNHDNQIVLGTANGMTISTG----- 88
+A++L A+ + PA AD ++V+ GE + + D V TA+G TI
Sbjct: 19 LAMALGALGAAPAAHADWNNQSIVKTGERQHGIHIQGSDPGGVR-TASGTTIKVSGRQAQ 77

Query: 89 ---LEYGPDNEANTGGQWIQNGGIANNTTVTGGGLQRVNAGGSVSDTVISAGGGQSLQGQ 145
LE G +G ++++ G V AG V+D A G +
Sbjct: 78 GILLENPAAELQFRNGSVTSSGQLSDDGIRRFLGTVTVKAGKLVADHATLANVGDTWDDD 137

Query: 146 AVNTTLNGGEQWVHEGGIA---TGTVINEKGWQAVKSGAMATDTVVNTGAEGGPDAENGD 202
+ + G + G V E+G + D ++ GA E+
Sbjct: 138 GIALYVAGEQAQASIADSTLQGAGGVQIERGANVTVQRSAIVDGGLHIGALQSLQPEDLP 197

Query: 203 TGQFVRGNAVRTTINKNGRQIVAAEGTANTTVVYAGGDQTVHGHALDTTLNGGYQYVHNG 262
+ V + T V A G V + T+ G + G +
Sbjct: 198 PSRVVLRDTNVTA--------VPASGAPAAVSVLGASELTLDGGHITGGRAAGVAAMQGA 249

Query: 263 GTASGTVVNSDGWQIIKEGGLADFTTVNQKGKLQVNAGGTATNVTLKQGGALVTSTAATV 322
G GG V G + G L G V + ++V
Sbjct: 250 VVHLQRATIRRGDA--PAGGAVPGGAV-PGGAVPGGFGPGGFGPVL-DGWYGVDVSGSSV 305

Query: 323 TGSNRLGNFTVE-NGKADGVVLESGGRLDVLEGHSAWKTLVDDGGTLAVSAGGKATDVTM 381
L VE + + G R+ V GG+L+ G +V
Sbjct: 306 ----ELAQSIVEAPELGAAIRVGRGARVTV------------SGGSLSAPHG----NVIE 345

Query: 382 TSGGALIADSGATVE-----GTNASGKFSIDGTSGQASGLLLENG----GSFTVNAGGLA 432
T G A A + G +A GK + + L L G G
Sbjct: 346 TGGARRFAPQAAPLSITLQAGAHAQGKALLYRVLPEPVKLTLTGGADAQGDIVATELPSI 405

Query: 433 SNTTVGHRGTLTLAAGGSLSGRTQLSKGASMVLNGDVVSTGDIVNAGEIRFDNQTTPDAV 492
T++G + LA+ +G T+ S+ N V T + N G +R + + D
Sbjct: 406 PGTSIG-PLDVALASQARWTGATRAVDSLSID-NATWVMTDN-SNVGALRLASDGSVD-- 460

Query: 493 LSRAVAKGDSPVTFHKLTTSNLTGQGGTINMRVRLDGSNTSDQLVINGGQATGKTWLAFT 552
+ F LT + L G G M V D SD+LV+ A+G+ L
Sbjct: 461 ----FQQPAEAGRFKVLTVNTLAGSG-LFRMNVFAD-LGLSDKLVVMQD-ASGQHRLWVR 513

Query: 553 NVGNSNLGVATSGQGIRVVDAQNGATTEEGAFALSRPLQAGAFNYTLNRDSDEDWYLRSE 612
N G+ S + +V G+ + G + Y L + + W L
Sbjct: 514 NSGSE----PASANTLLLVQTPLGSAATFTLANKDGKVDIGTYRYRLAANGNGQWSLVGA 569

Query: 613 NAYRAEVPLYASMLTQAMDYDRILAGSRSHQTGVSGENNSVRLSIQGGHLGHDNNGGIAR 672
A A P + + ++ G +G + A
Sbjct: 570 KAPPAPKPAPQPGPQPPQPPQPQPEAPAPQPPAGRELSAAANAAVNTGGVGLASTLWYAE 629

Query: 673 G-----------ATPESSGSYG--FVRLE------GDLLRTEVAGMSL--------TTGV 705
P++ G++G F + + G +VAG L G
Sbjct: 630 SNALSKRLGELRLNPDAGGAWGRGFAQRQQLDNRAGRRFDQKVAGFELGADHAVAVAGGR 689

Query: 706 YGAAGHSSVDVKDDDGSRAGTARDDAGSLGGYLNLVHTSSGLWADIVAQGTRHSMKASSD 765
+ G + D + G D+ +GGY + SG + D + +R
Sbjct: 690 WHLGGLAGYTRGDRGFTGDGGGHTDSVHVGGYATYI-ADSGFYLDATLRASRLENDFKVA 748

Query: 766 NND-------FRARGWGWLGSLETGLPFSITDNLMLEPQLQYTW 802
+D +R G G SLE G F+ D LEPQ +
Sbjct: 749 GSDGYAVKGKYRTHGVGA--SLEAGRRFTHADGWFLEPQAELAV 790


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3359  INTIMIN         479  e-155    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 479 bits (1233), Expect = e-155
Identities = 254/734 (34%), Positives = 368/734 (50%), Gaps = 40/734 (5%)

Query: 44 AQTAMEAGRVLQG-SNSGDAARQMLTSQASGQAADAVTQWLNQFGTAKTQLSVVSDFSLK 102
AQ A G LQ S +GD A+ A QA+ + WL +GTA+ L ++F
Sbjct: 167 AQQAASLGSQLQSRSLNGDYAKDTALGIAGNQASSQLQAWLQHYGTAEVNLQSGNNFD-- 224

Query: 103 GSSLDVLLPFYNTPKNVLFTQLGMRDNDGRFTTNAGLGHRYFTDNGWMLGYNVFYDVDWR 162
GSSLD LLPFY++ K + F Q+G R D RFT N G G R+F MLGYNVF D D+
Sbjct: 225 GSSLDFLLPFYDSEKMLAFGQVGARYIDSRFTANLGAGQRFFL-PENMLGYNVFIDQDFS 283

Query: 163 NTNRRYGIGVEAWRDYLKLSANGYKRLSDWRQSPTVTDYDERPADGWDIRAEGWLPAYPQ 222
N R GIG E WRDY K S NGY R+S W +S DYDERPA+G+DIR G+LP+YP
Sbjct: 284 GDNTRLGIGGEYWRDYFKSSVNGYFRMSGWHESYNKKDYDERPANGFDIRFNGYLPSYPA 343

Query: 223 LGGKLVYEQYYGNEVALFGESERQKNPHAITAGVTWTPFSLLTAGVDYRRGKNGADDTRL 282
LG KL+YEQYYG+ VALF + Q NP A T GV +TP L+T G+DYR G +D
Sbjct: 344 LGAKLMYEQYYGDNVALFNSDKLQSNPGAATVGVNYTPIPLVTMGIDYRHGTGNENDLLY 403

Query: 283 NLGLTYRIGEPLAHQLDSSRVGAQRSLAANRLELVNRNNDVVLEYRKQTLITLQLPPDVY 342
++ Y+ +P + Q++ V R+L+ +R +LV RNN+++LEY+KQ +++L +P D+
Sbjct: 404 SMQFRYQFDKPWSQQIEPQYVNELRTLSGSRYDLVQRNNNIILEYKKQDILSLNIPHDIN 463

Query: 343 GAELTTVTLTPQVNAKYGLSRIELDDAELRQAGGKII---SNTGNQITLQLPAWSSDRQS 399
G E +T + V +KYGL RI DD+ LR GG+I S + LPA+ +
Sbjct: 464 GTERSTQKIQLIVKSKYGLDRIVWDDSALRSQGGQIQHSGSQSAQDYQAILPAYVQGGSN 523

Query: 400 V-TLSGRARDTRGNLSDIARTRILV--SPAVQQQLAV---STDKTTATADGADSVRYTLT 453
V ++ RA D GN S+ I V + V Q+ V + DKT+A ADG +++ YT T
Sbjct: 524 VYKVTARAYDRNGNSSNNVLLTITVLSNGQVVDQVGVTDFTADKTSAKADGTEAITYTAT 583

Query: 454 VTGSDGKPVSGQAVRWEHNGGTLNGENT--TNADGVATATLTSQTAGIIRVTATTRNQTA 511
V + + +G + N+ TN G AT TL S G + V+A T T+
Sbjct: 584 VKKNGVAQANVPVSFNIVSGTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAKTAEMTS 643

Query: 512 --KAADVTFV---AAMQGELLADRTQALADGQEAVSYTLTLKTTDGKPLSGKNVTFTTTV 566
A V FV A E+ AD+T A+A+GQ+A++YT+ + KP+S + VTFTTT+
Sbjct: 644 ALNANAVIFVDQTKASITEIKADKTTAVANGQDAITYTVKVM-KGDKPVSNQEVTFTTTL 702

Query: 567 GQLSRTQGTTDQNGQLSVQLTSTRAGQAVVNASVDSTTISAAPVTFENRLDSAIVVNKTS 626
G+LS + TD NG V LTST G+++V+A V + E I
Sbjct: 703 GKLSNSTEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFFTTLTIDDGNIE 762

Query: 627 AVADGQDSIILTAVIR-DAAGVPVAGQAV--TWHTDNGQFTQQDA---VTNAAGTASATL 680
V G + T ++ + +G TW + N DA + T+
Sbjct: 763 IVGTGVKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPAIASVDASSGQVTLKEKGTTTI 822

Query: 681 TSTQAGNAQVSLSLNGTTTTVSAPRVSFTQQLYLTLQAGRTTAVADGNDAITYTLNVRDA 740
+ + N + ++ + + + D + +
Sbjct: 823 SVISSDNQTATYTIATPNSLI-------------VPNMSKRVTYNDAVNTCKNFGGKLPS 869

Query: 741 AGQPVKDKAVQWST 754
+ +++ W
Sbjct: 870 SQNELENVFKAWGA 883



Score = 103 bits (258), Expect = 2e-24
Identities = 99/354 (27%), Positives = 148/354 (41%), Gaps = 37/354 (10%)

Query: 577 DQNGQLS--VQLTSTRAGQAVVNASVDSTTISAAPVTFENRLDSAIVVNKTSAVADGQDS 634
D+NG S V LT T V V T +A +KTSA ADG ++
Sbjct: 533 DRNGNSSNNVLLTITVLSNGQVVDQVGVTDFTA---------------DKTSAKADGTEA 577

Query: 635 IILTAVIRDAAGVPVAGQAVTWHTDNGQ--FTQQDAVTNAAGTASATLTSTQAGNAQVSL 692
I TA ++ GV A V+++ +G + A TN +G A+ TL S + G VS
Sbjct: 578 ITYTATVKKN-GVAQANVPVSFNIVSGTAVLSANSANTNGSGKATVTLKSDKPGQVVVSA 636

Query: 693 SLNGTTTTVSAPRVSFTQQL---YLTLQAGRTTAVADGNDAITYTLNVRDAAGQPVKDKA 749
T+ ++A V F Q ++A +TTAVA+G DAITYT+ V +PV ++
Sbjct: 637 KTAEMTSALNANAVIFVDQTKASITEIKADKTTAVANGQDAITYTVKV-MKGDKPVSNQE 695

Query: 750 VQWSTDLGTLSPLQGTTSAQGEASVTLTSTQAGQAVVNATVDGKQISAQSVTFTRTVRGV 809
V ++T LG LS T G A VTLTST G+++V+A V + ++
Sbjct: 696 VTFTTTLGKLSNSTEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFFTTLT 755

Query: 810 ITVEKERV--YPGTRQTVTLTLTDAAGN-PVSGERV--DWHVDSGNLWQTQGTTDAQGRT 864
I + + T+ L N SG W + + + G+
Sbjct: 756 IDDGNIEIVGTGVKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPAIASVDAS---SGQV 812

Query: 865 TTTWESTTPGTATITADAYGQQYTAPVITVMPALTVSSVTGIDATGADGKNFGK 918
T + GT TI+ + Q I +L V +++ T D N K
Sbjct: 813 TLKEK----GTTTISVISSDNQTATYTIATPNSLIVPNMSK-RVTYNDAVNTCK 861


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3368  CHANLCOLICIN     31  0.014    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 30.8 bits (69), Expect = 0.014
Identities = 25/91 (27%), Positives = 42/91 (46%), Gaps = 5/91 (5%)

Query: 4 SLAHENARLRALLQTQQDTIRQMAEYNRLLSQRVAAYASEINRLKALVAKLQRMQFGKSS 63
+ A A AL Q +D + + +N + A N A+ A+ +R++ K+
Sbjct: 79 AQAKAKANRDALTQRLKDIVNEALRHNASRTPSATELAHANN--AAMQAEDERLRLAKAE 136

Query: 64 EKLR---AKTERQIQEAQERISALQEEMAET 91
EK R E+ QEA++R ++ E AET
Sbjct: 137 EKARKEAEAAEKAFQEAEQRRKEIEREKAET 167


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3381  TCRTETA          44  8e-07    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 44.0 bits (104), Expect = 8e-07
Identities = 63/306 (20%), Positives = 106/306 (34%), Gaps = 27/306 (8%)

Query: 25 FLYFFIMATCFPFLPIWLSDVV--GLSKTDTGIVFSCLSLFAISFQPLLGVISDRLGLKK 82
L + P LP L D+V GI+ + +L + P+LG +SDR G +
Sbjct: 15 ALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRP 74

Query: 83 NLIWSISLLLVFFAPFFLYVFAPLLRFNIWAGALTGGVFIGFVFSAGAGAIEAYIERVSR 142
L+ S L + + AP L ++ G + G+ G + I + R
Sbjct: 75 VLLVS---LAGAAVDYAIMATAPFLWV-LYIGRIVAGI-TGATGAVAGAYIADITDGDER 129

Query: 143 SRGFEYGKARMFGCLGWALCA--AMAGMLFNVDPSLVFWMGSGSALLLLLL-LFLARPST 199
+R F + M C G+ + A + G++ P F+ + L L FL S
Sbjct: 130 ARHFGF----MSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPESH 185

Query: 200 SQTAMVMNTLGANSSLISTRMVFSLFRMRQMWMFVLYTIGVACVYDVFDQQFATFFRSFF 259
+ N + FR + V + V + + Q A + F
Sbjct: 186 KGERRPLRREALNP--------LASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFG 237

Query: 260 -DTPQAGIKAFGFATTAGEICNAII-MFCTPWIIHRIGAKNTLLVAGGIMTIRITGSAFA 317
D G + A I +++ T + R+G + L++ M TG
Sbjct: 238 EDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLG---MIADGTGYILL 294

Query: 318 TTATEV 323
AT
Sbjct: 295 AFATRG 300


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3421  BCTERIALGSPG     30  0.003    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 30.2 bits (68), Expect = 0.003
Identities = 17/48 (35%), Positives = 23/48 (47%), Gaps = 3/48 (6%)

Query: 1 MRRTR--AGFTLLEMLVAIAIFASLA-LMAQQVTNGVTRVNSAVAGHD 45
MR T GFTLLE++V I I LA L+ + + + A D
Sbjct: 1 MRATDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSD 48


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3422  BCTERIALGSPH     33  1e-04    Bacterial general secretion pathway protein H signa...
		>BCTERIALGSPH#Bacterial general secretion pathway protein H signature.

Length = 170

Score = 33.0 bits (75), Expect = 1e-04
Identities = 13/24 (54%), Positives = 18/24 (75%)

Query: 2 KRGFTLLEVMLALAIFALAATAVL 25
+RGFTLLE+ML L + ++A VL
Sbjct: 3 QRGFTLLEMMLILLLMGVSAGMVL 26


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3423  BCTERIALGSPH     74  4e-19    Bacterial general secretion pathway protein H signa...
		>BCTERIALGSPH#Bacterial general secretion pathway protein H signature.

Length = 170

Score = 74.2 bits (182), Expect = 4e-19
Identities = 41/196 (20%), Positives = 69/196 (35%), Gaps = 41/196 (20%)

Query: 1 MPERGFTLLEIMLVIFLIGLASSGVVQTFATDSEPPAKKAAQDFLTRFAQFKDRAVIEGQ 60
M +RGFTLLE+ML++ L+G+++ V+ F + A + F + + R + GQ
Sbjct: 1 MRQRGFTLLEMMLILLLMGVSAGMVLLAFPASRDDSAAQTLARFEAQLRFVQQRGLQTGQ 60

Query: 61 TLGVLIDAPGYQFMQRRQGQWLPVSATRLSAQVTVPKQVQMLLQPGSDIWQKEYALELQR 120
GV + +QF+ + P D W L L+
Sbjct: 61 FFGVSVHPDRWQFLVLEARDGADPA-------------------PADDGWSGYRWLPLRA 101

Query: 121 RRL----TLHDIELEL-----QKEAKKKTPQIRFSPFEPATPFTLRFYSAAQNACWAVKL 171
R+ ++ +L L + P + P TPF L L
Sbjct: 102 GRVATSGSIAGGKLNLAFAQGEAWTPGDNPDVLIFPGGEMTPFRLT-------------L 148

Query: 172 AHDGALSLNQCDERMP 187
++ N E +P
Sbjct: 149 GEAPGIAFNARGESLP 164


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3424  BCTERIALGSPG    217  5e-76    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 217 bits (553), Expect = 5e-76
Identities = 90/142 (63%), Positives = 107/142 (75%), Gaps = 3/142 (2%)

Query: 6 RTQKPRAGFTLLEVMVVIVILGVLASLVVPNLLGNKEKADRQKAISDIVALENALDMYRL 65
R + GFTLLE+MVVIVI+GVLASLVVPNL+GNKEKAD+QKA+SDIVALENALDMY+L
Sbjct: 2 RATDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYKL 61

Query: 66 DNGRYPTTEQGLEALIQQPANMADSRNYRTGGYIKRLPKDPWGNDYQYLSPGEKGLFDVY 125
DN YPTT QGLE+L++ P + NY GYIKRLP DPWGNDY ++PGE G +D+
Sbjct: 62 DNHHYPTTNQGLESLVEAPTLPPLAANYNKEGYIKRLPADPWGNDYVLVNPGEHGAYDLL 121

Query: 126 TLGADGQENGEGAGADIGNWNL 147
+ G DG+ E DI NW L
Sbjct: 122 SAGPDGEMGTED---DITNWGL 140


53  EcE24377A_3504  EcE24377A_3516  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias     %GCBias  Product
EcE24377A_3504       0       25   -6.245857  zinc transporter ZupT
EcE24377A_3505       1       29   -8.364596  hypothetical protein
EcE24377A_3506       2       33   -9.049059  3,4-dihydroxy-2-butanone 4-phosphate synthase
EcE24377A_3507       4       35  -10.167846  hypothetical protein
EcE24377A_3508       4       36  -10.464907  fimbrial protein
EcE24377A_3509       1       32   -8.571097  fimbrial usher protein
EcE24377A_3510      -1       22   -4.562625  periplasmic pilus chaperone family protein
EcE24377A_3511       0       10   -0.555758  hypothetical protein
EcE24377A_3512       0        8    1.609440  glycogen synthesis protein GlgS
EcE24377A_3513       0        8    1.680112  inner membrane protein
EcE24377A_3514       0        9    2.809572  hypothetical protein
EcE24377A_3515      -1       13    3.561250  bifunctional heptose 7-phosphate kinase/heptose
EcE24377A_3516       0       14    3.141272  bifunctional glutamine-synthetase
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3508  FIMBRIALPAPE     28  0.011    Escherichia coli: P pili tip fibrillum papE protein...
		>FIMBRIALPAPE#Escherichia coli: P pili tip fibrillum papE protein signature.

Length = 173

Score = 28.5 bits (63), Expect = 0.011
Identities = 36/163 (22%), Positives = 66/163 (40%), Gaps = 35/163 (21%)

Query: 14 AMILSNNVFADEGHGIVKFKGEVISAPCSIKPGDEDLTVNLGEVADTVLKSDQKSLAE-- 71
A+++S +V A + + FKG++I C++ ++ VN G++ L + +
Sbjct: 15 AVLMSQHVHAADN---LTFKGKLIIPACTV----QNAEVNWGDIEIQNLVQSGGNQKDFT 67

Query: 72 -----PFTIHLQDCMLSQGGTTYSKAKVTFTTANTMTGQTDLLKNTKETEIGGATGVGVR 126
P+++ ++ G T + V T+ + G L N+ + IG A
Sbjct: 68 VDMNCPYSLGTMKVTITSNGQTGNSILVPNTSTASGDGLLIYLYNSNNSGIGNA------ 121

Query: 127 ILDSQSGEVTLGTPVV---ITFNNTNS----YQELNFKARMES 162
VTLG+ V IT Y +L +K M+S
Sbjct: 122 --------VTLGSQVTPGKITGTAPARKITLYAKLGYKGNMQS 156


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3509  PF00577         637  0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 637 bits (1645), Expect = 0.0
Identities = 224/837 (26%), Positives = 388/837 (46%), Gaps = 56/837 (6%)

Query: 17 IYCSLSVLIIGCASAYAVEFNKDLIEAEDRENVNLSQFETDGQLPVGKYSLNALINNKRT 76
++ + + S+ + FN + + + +LS+FE +LP G Y ++ +NN
Sbjct: 30 LFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPGTYRVDIYLNNGYM 89

Query: 77 PIHLDLQWVLIDN--QTAVCLTPEQLTLLGFTDEIIEVAQQNLIDGCYPIEK-EKQITTY 133
D+ + D+ CLT QL +G + D C P+ T
Sbjct: 90 A-TRDVTFNTGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDACVPLTSMIHDATAQ 148

Query: 134 LDKGKMQLSISAPQAWLKYKDANWTPPELWDHGIAGAFLDYNLYASHYAPHQGDNSQNIS 193
LD G+ +L+++ PQA++ + + PPELWD GI L+YN + G NS
Sbjct: 149 LDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSVQNRIGGNSHYAY 208

Query: 194 SYGQAGVNLGAWRLRTDYQYDQSFNNGKS-QANNLDFPRIYLFRPIPAINAKLTIGQYDT 252
Q+G+N+GAWRLR + + + ++ S N +L R I + ++LT+G T
Sbjct: 209 LNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRSRLTLGDGYT 268

Query: 253 ESSIFDSFHFSGVSLKSDENMLPPDLRGYAPQITGVAQTNAKVTVSQNNRIIYQENVPPG 312
+ IFD +F G L SD+NMLP RG+AP I G+A+ A+VT+ QN IY VPPG
Sbjct: 269 QGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIYNSTVPPG 328

Query: 313 PFAITNLFNT-LQGQLDVKVEEEDGQVTQWQVASNSIPYLTRKGQIRYTTAMGKPTSVGG 371
PF I +++ G L V ++E DG + V +S+P L R+G RY+ G+ S G
Sbjct: 329 PFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITAGEYRS-GN 387

Query: 372 DSLQQPFFWTGEFSWGWLNNVSLYGGSVLTNRDYQSLAAGVGFNLNSLVSLSFDVTRSDA 431
++P F+ G ++YGG+ L +R Y++ G+G N+ +L +LS D+T++++
Sbjct: 388 AQQEKPRFFQSTLLHGLPAGWTIYGGTQLADR-YRAFNFGIGKNMGALGALSVDMTQANS 446

Query: 432 QLHNQDKETGYSYRANYSKRFESTGSQLTFAGYRFSDKNFVTMNEYIND----------- 480
L + + G S R Y+K +G+ + GYR+S + +
Sbjct: 447 TLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGYNIETQD 506

Query: 481 ---------TNHYTNYQNEKESYIVTFNQYLESLRLNTYVSLARNTYWDAS-SNVNYSLS 530
T++Y N++ +T Q L Y+S + TYW S + +
Sbjct: 507 GVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGR-TSTLYLSGSHQTYWGTSNVDEQFQAG 565

Query: 531 LSRDFDIGPLKNVSTSLTFSRIN--WEEDNQDQLYLNISIPWGTSR-----------TLS 577
L+ F ++++ +L++S W++ L LN++IP+ + S
Sbjct: 566 LNTAF-----EDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRSDSKSQWRHASAS 620

Query: 578 YGMQRNQDNKISHTASWYDS--SDRNNSWSVSASGDNDEFKDMKASLRASYQHNTENGRL 635
Y M + + ++++ A Y + D N S+SV + ++ A+ + G
Sbjct: 621 YSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGYGNA 680

Query: 636 YLSGTSQRDSYYSLNASWNGSFTATRHGAAFHDYSGSADSRFMIDADGAEDIPLNNKRAV 695
+ + D L +G A +G D+ ++ A GA+D + N+ V
Sbjct: 681 NIGYSHSDD-IKQLYYGVSGGVLAHANGVTLGQPLN--DTVVLVKAPGAKDAKVENQTGV 737

Query: 696 -TNRYGIGVIPSVSSYITTSLSVDTRNLPENVDIENSVITTTLTEGAIGYAKLDTRKGYQ 754
T+ G V+P + Y +++DT L +NVD++N+V T GAI A+ R G +
Sbjct: 738 RTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAEFKARVGIK 797

Query: 755 IMGVIRLADGSHPPLGISVKDKTSHKELGLVADGGFVYLNGIQDDSKLTLRWGDKSC 811
++ + + P G V ++S + G+VAD G VYL+G+ K+ ++WG++
Sbjct: 798 LLMTLT-HNNKPLPFGAMVTSESS-QSSGIVADNGQVYLSGMPLAGKVQVKWGEEEN 852


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3514  IGASERPTASE      52  5e-09    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 52.0 bits (124), Expect = 5e-09
Identities = 47/287 (16%), Positives = 93/287 (32%), Gaps = 16/287 (5%)

Query: 197 PNNAFDAEGLTKLTQETERRRRERNEVEQDVEVAVREKNRDALSRKLEIEQQEAFMTLEQ 256
N A+ + + E R A + + ++ E +QE+ +
Sbjct: 999 TPNNIQAD-VPSVPSNNEEIARVDEAPVPPPAPATPSETTETVA---ENSKQESKTVEKN 1054

Query: 257 EQQVKTRTAEQNARIAAFEAERRREAE-QTRILAERQIQETEIEREQAVRSRKVEAEREV 315
EQ TA+ R A EA+ +A QT +A+ + E + + + VE E +
Sbjct: 1055 EQDATETTAQN--REVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKA 1112

Query: 316 RIKEIEQQQVTEIANQTKSIAIAAKSEQ---QSQAEARANLALAEAVSAQQNVETTRQTA 372
+++ + Q+V ++ +Q +++ Q + E + + E S T Q A
Sbjct: 1113 KVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPA 1172

Query: 373 EADRAKQVALIAAAQDAET------KAVELTVRAKAEKEAAEMQAAAIVELAEATRKKGL 426
+ + + + T T +E + R
Sbjct: 1173 KETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPH 1232

Query: 427 AEAEAQRALNDAINVLSDEQTSLKFKLALLQALPAVIEKSVEPMKSI 473
A + ND V + TS L A ++ K++
Sbjct: 1233 NVEPATTSSNDRSTVALCDLTSTNTNAVLSDARAKAQFVALNVGKAV 1279


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3515  LPSBIOSNTHSS     29  0.028    Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein signature.

Length = 166

Score = 29.0 bits (65), Expect = 0.028
Identities = 10/37 (27%), Positives = 20/37 (54%)

Query: 347 GVFDILHAGHVSYLANARKLGDRLIVAVNSDASTKRL 383
G FD + GH+ + +L D++ VAV + + + +
Sbjct: 7 GSFDPITFGHLDIIERGCRLFDQVYVAVLRNPNKQPM 43


54  EcE24377A_3560  EcE24377A_3574  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias     %GCBias  Product
EcE24377A_3560      -1       13   -3.381495  glucuronate isomerase
EcE24377A_3561       0       17   -4.264113  hexuronate transporter
EcE24377A_3562       0       18   -5.351800  pilus biogenesis initiator protein
EcE24377A_3563      -1       17   -5.005007  hypothetical protein
EcE24377A_3564      -1       18   -4.591962  CS1 type fimbrial major subunit
EcE24377A_3565      -1       19   -4.218903  hypothetical protein
EcE24377A_3566      -1       13   -0.856841  DNA-binding transcriptional repressor ExuR
EcE24377A_3567       0       15   -0.441842  inner membrane protein
EcE24377A_3569       2       21    0.802989  hypothetical protein
EcE24377A_3568       1       21   -0.252518  hypothetical protein
EcE24377A_3570      -2       17   -0.057226  hypothetical protein
EcE24377A_3571      -3       17    0.065411  hypothetical protein
EcE24377A_3572      -2       16   -0.360965  hypothetical protein
EcE24377A_3573      -1       17   -1.723893  hypothetical protein
EcE24377A_3574       0       17   -3.691035  inner membrane protein YqjF
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3561  TCRTETA          41  7e-06    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 41.0 bits (96), Expect = 7e-06
Identities = 58/329 (17%), Positives = 107/329 (32%), Gaps = 37/329 (11%)

Query: 34 PTLMEELNISTQQ---YSYIIAAYSAAYTVMQPVAGYVLDVLGTK----IGYAMFAVLWA 86
P L+ +L S Y ++A Y+ PV G + D G + + A AV +A
Sbjct: 29 PGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYA 88

Query: 87 VFCGATALAGSWGGLAIA--RGAVGAAEAAMIPAGLKASSEWFPAKERSIAVGYFNVGSS 144
+ A L + G +A GA GA A I ++ ER+ G+ +
Sbjct: 89 IMATAPFLWVLYIGRIVAGITGATGAVAGAYI-------ADITDGDERARHFGFMSACFG 141

Query: 145 IGAMIAPPLVVWAIVMHSWQMAFIISGALSFIWAMAWLIFYKHPRDQKHLTDEERDYIIN 204
G M+A P++ + S F + AL+ + + K R +N
Sbjct: 142 FG-MVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPES--HKGERRPLRREALN 198

Query: 205 GQEAQHQVSTAKKMSVGQILRNRQFWGIALPRFLAEPAWGTFNAWIPLFMFKVYGFNLKE 264
+ ++ + F+ + L + W +F + ++
Sbjct: 199 PLASFRWARGMTVVAALMAV----FFIMQLVGQVPAALWV-------IFGEDRFHWDATT 247

Query: 265 IAMFAWMPMLFADLGCILGGYLPPLFQRWFGVNLIVSRKMVV-TLGAVLMIGPGMIGLFT 323
I + F L + + G + M+ G +L+ +
Sbjct: 248 IGI---SLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGWMA- 303

Query: 324 NPYVAIMLLCIGGFAHQALSGALITLSSD 352
+ ++LL GG AL L +
Sbjct: 304 --FPIMVLLASGGIGMPALQAMLSRQVDE 330


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3563  PF00577          78  1e-16    Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 77.6 bits (191), Expect = 1e-16
Identities = 75/434 (17%), Positives = 142/434 (32%), Gaps = 32/434 (7%)

Query: 291 NSRVDAYRNEQLLGSFYLNSGSQFIDTSSFPPGSYSVALKVYENNQLTRTELVPFTKTGG 350
++V +N + + + G I+ S + + + E + T+ VP++
Sbjct: 308 TAQVTIKQNGYDIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPL 367

Query: 351 LT-DGNAQWFLQAGKTTSQVS-DDESSAYQLGVRLPLHPQYELYAGLANADDVSAFELGN 408
L +G+ ++ + AG+ S + ++ +Q + L + +Y G AD AF G
Sbjct: 368 LQREGHTRYSITAGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGI 427

Query: 409 NWTADLGGVGNLAISASVFRNDDGGKGDMQQANWS-NPGWPTLGF------YRTNSDG-- 459
G ++ ++ + D + D Q + N G YR ++ G
Sbjct: 428 GKNMGALGALSVDMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYF 487

Query: 460 -DACTTDSRESYNALSCYESISATVSQNFVGWNMMLGYTRTQNNTDDSLRWDKQQSFENN 518
A TT SR + + + + V F + + R + + + + + +
Sbjct: 488 NFADTTYSRMNGYNIETQDGV-IQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRTSTLYLS 546

Query: 519 YLRQTT--AQSISETVQLSASRAFVMRDWILSTSVGVFHRNDNGGDNDDNGLYLSFS--L 574
QT ++ E Q + AF ++ ++ + D L L+ +
Sbjct: 547 GSHQTYWGTSNVDEQFQAGLNTAFED----INWTLSYSLTKNAWQKGRDQMLALNVNIPF 602

Query: 575 SDTPTMDSNNNSHSTNVSTDYRYSEQDGDQTSWQLSHTFYNDSFSHKEL--GVTVGGLNT 632
S DS + + S + + T D+ + G GG
Sbjct: 603 SHWLRSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGN 662

Query: 633 DTINSAVNGRWDGQYGNVYATVSDSYDRKNHDHLSAFTGTYSSTLAVSRYGVNLGASGTD 692
+ G YGN S S + L S + GV LG D
Sbjct: 663 SGSTGYATLNYRGGYGNANIGYSHS---DDIKQLYY---GVSGGVLAHANGVTLGQPLND 716

Query: 693 DLLGAVLVDVKGFS 706
VLV G
Sbjct: 717 ---TVVLVKAPGAK 727



Score = 31.4 bits (71), Expect = 0.018
Identities = 39/222 (17%), Positives = 68/222 (30%), Gaps = 35/222 (15%)

Query: 305 SFYLNSGSQFIDTSSF------PPGSYSVALKVYENNQLTRTELVPFTKTGGLTDGNAQW 358
F + D S F PPG+Y V + + NN T V F
Sbjct: 52 RFLADDPQAVADLSRFENGQELPPGTYRVDIYL--NNGYMATRDVTFNTGDSEQG----- 104

Query: 359 FLQAGKTTSQVSDDESSAYQLGVRLPLHPQYELYAGLANADDVSAFELGNNWTADLG-GV 417
+ T +Q++ +G+ L A A S D+G
Sbjct: 105 -IVPCLTRAQLA-------SMGLNTASVSGMNLLADDACVPLTSMIH-DATAQLDVGQQR 155

Query: 418 GNLAISASVFRNDDGGKGDMQQANWSNPGWPTLGFYRTNSDGDACTTDSRESYNALSCYE 477
NL I + N +G + W L Y + + +R N+ Y
Sbjct: 156 LNLTIPQAFMSNRA--RGYIPPELWDPGINAGLLNYNFS----GNSVQNRIGGNSHYAYL 209

Query: 478 SISATVSQNFVGW----NMMLGYTRTQNNTDDSLRWDKQQSF 515
++ + + N W N Y + +++ +W ++
Sbjct: 210 NLQSGL--NIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTW 249


55  EcE24377A_3588  EcE24377A_3596  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias     %GCBias  Product
EcE24377A_3588       0       14   -3.341877  formate acetyltransferase
EcE24377A_3589       0       18   -5.670444  propionate/acetate kinase
EcE24377A_3590       0       20   -7.670292  threonine/serine transporter TdcC
EcE24377A_3591       2       38  -13.260448  threonine dehydratase
EcE24377A_3592       2       43  -15.181837  DNA-binding transcriptional activator TdcA
EcE24377A_3593       3       47  -16.919353  DNA-binding transcriptional activator TdcR
EcE24377A_3594       1       33   -9.981606  hypothetical protein
EcE24377A_3595       1       26   -7.460384  hypothetical protein
EcE24377A_3596       1       19   -4.785117  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3589  ACETATEKNASE    534  0.0      Acetate kinase family signature.
		>ACETATEKNASE#Acetate kinase family signature.

Length = 400

Score = 534 bits (1378), Expect = 0.0
Identities = 173/397 (43%), Positives = 254/397 (63%), Gaps = 11/397 (2%)

Query: 7 VLVINCGSSSIKFSVLDASDCEVLMSGIADGINSENAFLSVN-GGEPAP--LAHHSYEGA 63
+LVINCGSSS+K+ ++++ D VL G+A+ I ++ L+ N GE ++ A
Sbjct: 3 ILVINCGSSSLKYQLIESKDGNVLAKGLAERIGINDSLLTHNANGEKIKIKKDMKDHKDA 62

Query: 64 LKAIAFELEKRNLN-----DSVALIGHRIAHGGSIFTESAIITDEVIDNIRRVSPLAPLH 118
+K + L + + +GHR+ HGG FT S +ITD+V+ I LAPLH
Sbjct: 63 IKLVLDALVNSDYGVIKDMSEIDAVGHRVVHGGEYFTSSVLITDDVLKAITDCIELAPLH 122

Query: 119 NYANLSGIESAQQLFPGVTQVAVFDTSFHQTMAPKAYLYGLPWKYYEELGVRRYGFHGTS 178
N AN+ GI++ Q+ P V VAVFDT+FHQTM AYLY +P++YY + +R+YGFHGTS
Sbjct: 123 NPANIEGIKACTQIMPDVPMVAVFDTAFHQTMPDYAYLYPIPYEYYTKYKIRKYGFHGTS 182

Query: 179 HRYVSQRAHSLLNLAEDDSGLVVAHLGNGASICAVRNGQSVDTSMGMTPLEGLMMGTRSG 238
H+YVSQRA +LN + ++ HLGNG+SI AV+NG+S+DTSMG TPLEGL MGTRSG
Sbjct: 183 HKYVSQRAAEILNKPIESLKIITCHLGNGSSIAAVKNGKSIDTSMGFTPLEGLAMGTRSG 242

Query: 239 DVDFGAMSWVASQTNQSLGDLERVVNKESGLLGISGLSSDLR-VLEKAWHEGHERAQLAI 297
+D +S++ + N S ++ ++NK+SG+ GISG+SSD R + + A+ G +RAQLA+
Sbjct: 243 SIDPSIISYLMEKENISAEEVVNILNKKSGVYGISGISSDFRDLEDAAFKNGDKRAQLAL 302

Query: 298 KTFVHRIARHIAGHAASLRRLDGIIFTGGIGENSSLIRRLVMEHLAVLGVEIDTEMNNRS 357
F +R+ + I +AA++ +D I+FT GIGEN IR +++ L LG ++D E N
Sbjct: 303 NVFAYRVKKTIGSYAAAMGGVDVIVFTAGIGENGPEIREFILDGLEFLGFKLDKEKNKVR 362

Query: 358 NSCGERIVSSENARVICAVIPTNEEKMIALDAIHLGK 394
E I+S+ +++V V+PTNEE MIA D + +
Sbjct: 363 GE--EAIISTADSKVNVMVVPTNEEYMIAKDTEKIVE 397


56  EcE24377A_3612  EcE24377A_3623  Y  N  N  Genomic Island
LocusTag        DNBias  CDNBias     %GCBias  Product
EcE24377A_3612       0       14    3.002176  D-tagatose-bisphosphate aldolase non-catalytic
EcE24377A_3613       0       15    3.351256  PTS system N-acetylgalactosamine-specific
EcE24377A_3614       0       14    3.253224  PTS system N-acetylgalactosamine-specific
EcE24377A_3615       0       13    2.010169  PTS system N-acetylgalactosamine-specific, IID
EcE24377A_3616      -1       11    1.954693  PTS system mannose/sorbose-specific transporter
EcE24377A_3617      -1       12    1.349054  N-acetylglucosamine-6-phosphate deacetylase
EcE24377A_3618      -1       13   -0.227505  AgaS family sugar isomerase
EcE24377A_3619      -1       12   -2.390560  tagatose-bisphosphate aldolase
EcE24377A_3620       0       14   -3.767689  PTS system N-acetylgalactosamine-specific
EcE24377A_3621      -1       16   -3.577552  PTS system N-acetylgalactosamine-specific
EcE24377A_3622       0       19   -4.069342  PTS system N-acetylgalactosamine-specific
EcE24377A_3623       0       19   -3.406696  galactosamine-6-phosphate isomerase
57  EcE24377A_3635  EcE24377A_3652  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias     %GCBias  Product
EcE24377A_3635       0       19    3.466466  hypothetical protein
EcE24377A_3638      -1       18    3.768396  GIY-YIG nuclease superfamily protein
EcE24377A_3637      -1       18    3.211147  acetyltransferase
EcE24377A_3639      -1       17    3.261733  sterol-binding protein
EcE24377A_3640      -1       16    3.356235  U32 family peptidase
EcE24377A_3641       0       15    2.069548  U32 family peptidase
EcE24377A_3642       1       20    1.719967  hypothetical protein
EcE24377A_3643       3       27    1.546260  tryptophan permease
EcE24377A_3644       5       32    1.551332  ATP-dependent RNA helicase DeaD
EcE24377A_3646       4       27    0.859555  hypothetical protein
EcE24377A_3645       5       28    0.986552  lipoprotein NlpI
EcE24377A_3648       6       33    1.714152  hypothetical protein
EcE24377A_3647       6       35    1.606907  polynucleotide phosphorylase
EcE24377A_3649       6       29    1.311427  30S ribosomal protein S15
EcE24377A_3650       5       28    1.081920  tRNA pseudouridine synthase B
EcE24377A_3651       5       27    1.034728  ribosome-binding factor A
EcE24377A_3652       2       19   -0.696723  translation initiation factor IF-2
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3652  TCRTETOQM        73  2e-15    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 73.4 bits (180), Expect = 2e-15
Identities = 69/313 (22%), Positives = 109/313 (34%), Gaps = 77/313 (24%)

Query: 396 IMGHVDHGKTSLLDYI-----RSTKVASGEAG-------------GITQHIGAYHVETEN 437
++ HVD GKT+L + + T++ S + G GIT G + EN
Sbjct: 8 VLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITSFQWEN 67

Query: 438 GMITFLDTPGHAAFTSMRARGAQATDIVVLVVAADDGVMPQTIEAIQHAKAAQVPVVVAV 497
+ +DTPGH F + R D +L+++A DGV QT + +P + +
Sbjct: 68 TKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGIPTIFFI 127

Query: 498 NKIDKPEADPDRV----KNELSQYGI-----------------LPEEWG----------- 525
NKID+ D V K +LS + E+W
Sbjct: 128 NKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEGNDDLLE 187

Query: 526 ---------------GESQFV---------HVSAKAGTGIDELLDAILLQAEVLELKAVR 561
ES H SAK GID L++ I +
Sbjct: 188 KYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVIT--NKFYSSTHRG 245

Query: 562 KGMASGAVIESFLDKGRGPVATVLVREGTLHKGDIVL-CGFEYGRVRAMRNELGQEVLEA 620
+ G V + + R +A + + G LH D V E ++ M + E+ +
Sbjct: 246 QSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEKEKIKITEMYTSINGELCKI 305

Query: 621 GPSIPVEILGLSG 633
+ EI+ L
Sbjct: 306 DKAYSGEIVILQN 318


58  EcE24377A_3855  EcE24377A_3863  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias     %GCBias  Product
EcE24377A_3855       2       14    1.067176  ribulose-phosphate 3-epimerase
EcE24377A_3857       2       14    1.274245  DNA adenine methylase
EcE24377A_3858       2       15    2.015552  hypothetical protein
EcE24377A_3859       1       19    1.935219  3-dehydroquinate synthase
EcE24377A_3860       1       22    2.515985  shikimate kinase I
EcE24377A_3861      -2       15    3.462411  outer membrane porin HofQ
EcE24377A_3862      -2       15    3.238228  hypothetical protein
EcE24377A_3863      -1       15    3.271102  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3858  IGASERPTASE   43     3e-06    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 42.7 bits (100), Expect = 3e-06
Identities = 38/199 (19%), Positives = 69/199 (34%), Gaps = 19/199 (9%)

Query: 140 DLAGNATDQANGVQPAPGTTSAENTQQDVSL-----------------PPISSTPTQGQT 182
DL ++ N T+ N Q DV PP +TP++
Sbjct: 979 DLYNPEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTE 1038

Query: 183 PAATDGQQRVEVQGDLNNALTQPQNQQQLNNVAVNSTLPTEPATVAPVRNGNASRDTAKT 242
A + +Q + T+ Q + S + T ++G+ +++T T
Sbjct: 1039 TVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTT 1098

Query: 243 QTAERPSTTRPARQQAVIEPKKPQATVKTEPKPVAQTPKRTEPAAPVASTKAPAATSTPA 302
+T E + + + + E + V ++ P + + +P A A P
Sbjct: 1099 ETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEP 1158

Query: 303 PKETATTAPVQTASPAQTT 321
+T TTA T PA+ T
Sbjct: 1159 QSQTNTTA--DTEQPAKET 1175



Score = 42.7 bits (100), Expect = 3e-06
Identities = 41/203 (20%), Positives = 68/203 (33%), Gaps = 10/203 (4%)

Query: 123 APSTSSSDQTASGEKSIDLAGNATDQANGVQPAPGTTSAENTQQDVSLPPISST-PTQGQ 181
P+ +D + + ++A D+A PAP T S + S T Q
Sbjct: 999 TPNNIQADVPSVPSNNEEIA--RVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQ 1056

Query: 182 TPAATDGQQRVEVQGDLNNALTQPQN----QQQLNNVAVNSTLPTEPATVAPVRNGNASR 237
T Q R + +N Q Q +T E ATV + A
Sbjct: 1057 DATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVE--KEEKAKV 1114

Query: 238 DTAKTQTAERPSTTRPARQQAVIEPKKPQATVKTEPKPVAQTPKRTEPAAPVASTKAPAA 297
+T KTQ + ++ +Q+ E +PQA E P + A T+ PA
Sbjct: 1115 ETEKTQEVPKVTSQVSPKQEQS-ETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAK 1173

Query: 298 TSTPAPKETATTAPVQTASPAQT 320
++ ++ T + +
Sbjct: 1174 ETSSNVEQPVTESTTVNTGNSVV 1196


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3860  CARBMTKINASE  32     8e-04    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 32.1 bits (73), Expect = 8e-04
Identities = 27/91 (29%), Positives = 40/91 (43%), Gaps = 18/91 (19%)

Query: 32 FYDSDQEIEKRTGADVGWVFDLEGEEGFRD----------REEKVINELTEKQGIVLATG 81
FYD + KR + GW+ + G+R E + I +L E+ IV+A+G
Sbjct: 136 FYDEETA--KRLAREKGWIVKEDSGRGWRRVVPSPDPKGHVEAETIKKLVERGVIVIASG 193

Query: 82 GGSVKSRETRNRLSARGVVVYLETTIEKQLA 112
GG V + +GV E I+K LA
Sbjct: 194 GGGVPVILEDGEI--KGV----EAVIDKDLA 218


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3861  TYPE3OMGPROT  287    1e-93    Type III secretion system outer membrane G protein ...
		>TYPE3OMGPROT#Type III secretion system outer membrane G protein family signature.

Length = 607

Score = 287 bits (736), Expect = 1e-93
Identities = 80/301 (26%), Positives = 132/301 (43%), Gaps = 18/301 (5%)

Query: 117 LENRSITLQYADAGELAKAGEKLLSAKGSMTVDKRTNRLLLRDNKTALSALEQWVAQMDL 176
L + +I D + +A SA+ + D N +++RD+ + ++ + +D
Sbjct: 219 LSDATIQQVTVDNQRIPQAAT-RASAQARVEADPSLNAIIVRDSPERMPMYQRLIHALDK 277

Query: 177 PVGQVELSAHIVTINEKSLRELGVKWTLADAQHAGGVGQVTTLGSDLSVATATTHVGFNI 236
P ++E++ IV IN L ELGV W + + T G ++A+ G
Sbjct: 278 PSARIEVALSIVDINADQLTELGVDWRVGIRTGNNHQVVIKTTGDQSNIASN----GALG 333

Query: 237 GRINGRLLDL---ELSALEQKQQLDIIASPRLLASHLQPASIKQGSEIPYQVSSGESGAT 293
++ R LD ++ LE + +++ P LL A I SE Y +G+ A
Sbjct: 334 SLVDARGLDYLLARVNLLENEGSAQVVSRPTLLTQENAQAVIDH-SETYYVKVTGKEVA- 391

Query: 294 SVEFKEAVLG--MEVTPTVLQKG---RIRLKLHISQNVPGQVLQQADGEVLAIDKQEIET 348
E K G + +TP VL +G I L LHI +G + I + ++T
Sbjct: 392 --ELKGITYGTMLRMTPRVLTQGDKSEISLNLHIEDGNQKPNSSGIEG-IPTISRTVVDT 448

Query: 349 QVEVKSGETLALGGIFTRKNKSGQDSVPLLGDIPWFGQLFRHDGKEDERRELVVFITPRL 408
V G++L +GGI+ + VPLLGDIP+ G LFR + R + I PR+
Sbjct: 449 VARVGHGQSLIIGGIYRDELSVALSKVPLLGDIPYIGALFRRKSELTRRTVRLFIIEPRI 508

Query: 409 V 409
+
Sbjct: 509 I 509



Score = 37.2 bits (86), Expect = 1e-04
Identities = 28/139 (20%), Positives = 51/139 (36%), Gaps = 24/139 (17%)

Query: 1 MKQWIAALLLMLIPGVQAA----KPQKVTLMVDDVPVAQVLQALAEQEKLNLVVSPDVSG 56
K+ + LL+L A P + + +L +VVS ++
Sbjct: 9 FKRVLTGTLLLLSSYSWAQELDWLPIPYVYVAKGESLRDLLTDFGANYDATVVVSDKIND 68

Query: 57 TVSLHLTDVPWKQALQTVVKSAGLITRQEGNILSVHSIAWQNNNIARQEAEQARAQANLP 116
VS + LQ + L+ +GN+L + ++N+ +A
Sbjct: 69 KVSGQFEHDNPQDFLQHIASLYNLVWYYDGNVLYI----FKNSEVA-------------- 110

Query: 117 LENRSITLQYADAGELAKA 135
+R I LQ ++A EL +A
Sbjct: 111 --SRLIRLQESEAAELKQA 127


59  EcE24377A_3916  EcE24377A_3998  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias     Product
EcE24377A_3916  1       16       -4.263679   GntR family transcriptional regulator
EcE24377A_3917  1       22       -8.111558   pirin
EcE24377A_3918  3       30       -9.601444   dehydrogenase
EcE24377A_3919  2       22       -6.860017   hypothetical protein
EcE24377A_3920  3       22       -6.985217   acetyltransferase YhhY
EcE24377A_3921  3       22       -7.138504   HNH endonuclease domain-containing protein
EcE24377A_3922  2       14       -1.462250   hypothetical protein
EcE24377A_3923  -1      19       3.028267    hypothetical protein
EcE24377A_3924  -1      21       3.448521    gamma-glutamyltranspeptidase
EcE24377A_3925  -1      24       3.904602    hypothetical protein
EcE24377A_3927  -1      24       3.636951    hypothetical protein
EcE24377A_3926  -1      25       3.885178    glycerophosphodiester phosphodiesterase
EcE24377A_3928  -1      28       3.953095    glycerol-3-phosphate transporter ATP-binding
EcE24377A_3929  -2      27       3.084214    glycerol-3-phosphate transporter membrane
EcE24377A_3930  -2      25       3.522655    glycerol-3-phosphate transporter permease
EcE24377A_3931  -2      24       3.492392    glycerol-3-phosphate transporter periplasmic
EcE24377A_3932  -2      21       3.620864    hypothetical protein
EcE24377A_3933  -1      21       3.141027    leucine/isoleucine/valine transporter
EcE24377A_3934  -1      20       2.977519    leucine/isoleucine/valine transporter
EcE24377A_3935  0       15       1.907264    leucine/isoleucine/valine transporter permease
EcE24377A_3936  1       15       0.601688    branched-chain amino acid transporter permease
EcE24377A_3937  1       17       0.385991    high-affinity branched-chain amino acid ABC
EcE24377A_3938  0       20       0.260234    hypothetical protein
EcE24377A_3940  0       20       1.218685    acetyltransferase
EcE24377A_3939  0       19       1.288155    hypothetical protein
EcE24377A_3941  2       20       2.149088    hypothetical protein
EcE24377A_3942  1       18       2.377872    high-affinity branched-chain amino acid ABC
EcE24377A_3943  2       16       1.795524    RNA polymerase factor sigma-32
EcE24377A_3944  2       13       1.524028    cell division protein FtsX
EcE24377A_3945  3       12       1.997790    cell division protein FtsE
EcE24377A_3946  2       12       3.477117    cell division protein FtsY
EcE24377A_3947  0       16       3.916954    16S rRNA m(2)G966-methyltransferase
EcE24377A_3948  -1      14       3.518781    hypothetical protein
EcE24377A_3949  -1      15       3.752911    hypothetical protein
EcE24377A_3950  -1      15       4.071643    hypothetical protein
EcE24377A_3951  0       15       3.118054    zinc/cadmium/mercury/lead-transporting ATPase
EcE24377A_3952  0       16       1.733067    sulfur transfer protein SirA
EcE24377A_3953  -1      15       1.707738    hypothetical protein
EcE24377A_3954  0       17       2.743978    hypothetical protein
EcE24377A_3955  0       17       3.079285    major facilitator superfamily transporter
EcE24377A_3956  -1      20       3.831663    hypothetical protein
EcE24377A_3957  0       20       5.454479    holo-(acyl carrier protein) synthase 2
EcE24377A_3958  1       22       5.830064    nickel ABC transporter periplasmic
EcE24377A_3959  2       24       7.246834    nickel transporter permease NikB
EcE24377A_3960  2       24       7.063001    hypothetical protein
EcE24377A_3962  1       20       5.687752    nickel transporter permease NikC
EcE24377A_3961  1       20       4.313039    hypothetical protein
EcE24377A_3963  0       21       4.143742    nickel transporter ATP-binding protein NikD
EcE24377A_3964  0       20       3.932432    nickel transporter ATP-binding protein NikE
EcE24377A_3965  0       15       -0.308599   nickel responsive regulator
EcE24377A_3966  1       18       -3.792900   HicB family protein
EcE24377A_3967  1       18       -3.761301   ABC transporter ATP-binding protein
EcE24377A_3968  0       21       -4.929262   ABC transporter ATP-binding protein
EcE24377A_3969  -1      28       -7.564216   MFP family transporter
EcE24377A_3970  0       24       -7.053556   hypothetical protein
EcE24377A_3971  0       17       -4.950256   hypothetical protein
EcE24377A_3972  -1      12       -0.528030   hypothetical protein
EcE24377A_3973  0       14       0.490421    hypothetical protein
EcE24377A_3974  0       14       1.707738    pyridine nucleotide-disulfide oxidoreductase
EcE24377A_3975  -2      17       2.016355    low-affinity inorganic phosphate transporter 1
EcE24377A_3976  -2      18       2.085631    universal stress protein UspB
EcE24377A_3977  -3      20       2.415588    universal stress protein A
EcE24377A_3978  -2      20       2.616700    inner membrane transporter YhiP
EcE24377A_3979  -1      20       3.134186    methyltransferase
EcE24377A_3980  -2      19       2.477997    oligopeptidase A
EcE24377A_3981  -1      17       1.672846    DNA utilization protein YhiR
EcE24377A_3982  0       17       1.220461    glutathione reductase
EcE24377A_3983  0       14       0.354716    hypothetical protein
EcE24377A_3984  1       16       -1.295077   DNA-binding transcriptional repressor ArsR
EcE24377A_3985  0       21       -4.756474   arsenical pump membrane protein
EcE24377A_3986  1       33       -8.782512   arsenate reductase
EcE24377A_3988  4       34       -9.362782   hypothetical protein
EcE24377A_3987  3       32       -10.071821  ArsR family transcriptional regulator
EcE24377A_3989  3       32       -9.971179   permease
EcE24377A_3990  3       30       -11.486257  hypothetical protein
EcE24377A_3991  1       26       -9.240233   hypothetical protein
EcE24377A_3992  1       22       -7.858965   Slp family outer membrane lipoprotein
EcE24377A_3993  1       22       -10.192728  transcriptional regulator DctR
EcE24377A_3994  -2      21       -5.705452   Mg(2+) transport ATPase
EcE24377A_3995  -2      13       -2.096101   acid-resistance protein
EcE24377A_3996  -2      14       -2.493124   acid-resistance protein
EcE24377A_3997  -2      13       -2.432428   acid-resistance membrane protein
EcE24377A_3998  -2      15       -3.224976   transcriptional regulator GadE
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3920  SACTRNSFRASE  38     4e-06    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 38.4 bits (89), Expect = 4e-06
Identities = 21/92 (22%), Positives = 34/92 (36%), Gaps = 16/92 (17%)

Query: 55 VACIDGDVVGHLTIDVQQRPRRSHVADFGICVDSRWKNRGVASALMREMIE------MCD 108
+ ++ + +G + I + + D + D R K GV +AL+ + IE C
Sbjct: 69 LYYLENNCIGRIKIR-SNWNGYALIEDIAVAKDYRKK--GVGTALLHKAIEWAKENHFCG 125

Query: 109 NWLRVDRIELTVFVDNAPAIKVYKKYGFEIEG 140
L I N A Y K+ F I
Sbjct: 126 LMLETQDI-------NISACHFYAKHHFIIGA 150


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3924  NAFLGMOTY     32     0.007    Sodium-type flagellar protein MotY precursor signature.
		>NAFLGMOTY#Sodium-type flagellar protein MotY precursor signature.

Length = 293

Score = 31.6 bits (71), Expect = 0.007
Identities = 27/82 (32%), Positives = 37/82 (45%), Gaps = 17/82 (20%)

Query: 272 RTPISGDYRGYQVYSMPPPSSGGIHIVQILNI--LENFDMKKYGF-GSADAMQIMAEAEK 328
R P+ G+ R + SMPPP G H +I N+ + FD G+ G A I++E EK
Sbjct: 77 RRPM-GETRNVSLISMPPPWRPGEHADRITNLKFFKQFD----GYVGGQTAWGILSELEK 131

Query: 329 YAYADRSEYLGDPDFVKVPWQA 350
Y P F WQ+
Sbjct: 132 GRY---------PTFSYQDWQS 144


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3926  PF04619       28     0.017    Dr-family adhesin
		>PF04619#Dr-family adhesin

Length = 160

Score = 28.4 bits (63), Expect = 0.017
Identities = 12/60 (20%), Positives = 22/60 (36%), Gaps = 4/60 (6%)

Query: 29 VGAKYGHKMIEFDAKLSKDGEIFLLHDDNLERTSNGWGVAGELNWQD----LLRVDAGSW 84
+G ++ D + G+ FL+ D+N ++ W + D GSW
Sbjct: 70 LGCDARQVALKADTDNFEQGKFFLISDNNRDKLYVNIRPTDNSAWTTDNGVFYKNDVGSW 129


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3928  PF05272       32     0.003    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 32.4 bits (73), Expect = 0.003
Identities = 13/43 (30%), Positives = 20/43 (46%), Gaps = 7/43 (16%)

Query: 33 IVMVGPSGCGKSTLLRMVAGLERVTEGDIWINDQRVTEMEPKD 75
+V+ G G GKSTL+ + GL+ + +D KD
Sbjct: 599 VVLEGTGGIGKSTLINTLVGLD-------FFSDTHFDIGTGKD 634


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3931  MALTOSEBP     40     2e-05    Maltose binding protein signature.
		>MALTOSEBP#Maltose binding protein signature.

Length = 396

Score = 39.7 bits (92), Expect = 2e-05
Identities = 39/160 (24%), Positives = 66/160 (41%), Gaps = 14/160 (8%)

Query: 134 GHLLSQPFNSSTPVLYYNKDAFKKAGLDPEQPPKTWQDLADYSAKLKASGMKCGYASGWQ 193
G L++ P L YNKD PPKTW+++ +LKA G + +
Sbjct: 127 GKLIAYPIAVEALSLIYNKDLLP-------NPPKTWEEIPALDKELKAKGKSALMFNLQE 179

Query: 194 GWIQLENFSAWNGLPFASKNNGFDGTDAVLEF--NKPEQVKHIAMLEEMNKKGDFSYVGR 251
+ +A G F +N +D D ++ K + +++ + D Y
Sbjct: 180 PYFTWPLIAADGGYAFKYENGKYDIKDVGVDNAGAKAGLTFLVDLIKNKHMNADTDY--- 236

Query: 252 KDESTEKFYNGDCAMTTASSGSLANIREYAKFNYGVGMMP 291
+ F G+ AMT + +NI + +K NYGV ++P
Sbjct: 237 -SIAEAAFNKGETAMTINGPWAWSNI-DTSKVNYGVTVLP 274


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3946  IGASERPTASE   50     1e-08    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 50.4 bits (120), Expect = 1e-08
Identities = 41/182 (22%), Positives = 67/182 (36%), Gaps = 16/182 (8%)

Query: 19 EQTPEKETEVQNEQPVVEEI---VQAQEPAKASEQAVEEQPQAHTEAEAETFAADVVEVT 75
TP + TE E E Q+ + + Q E +A + +A T EV
Sbjct: 1030 PATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTN---EVA 1086

Query: 76 EQVAESEKAQP-EAEVVAQPEPVVEETAEPVAIEREELPLPEDVNAEEVSPEEWQAEAET 134
+ +E+++ Q E + A E EE A+ + +E+P +VSP++ Q+E
Sbjct: 1087 QSGSETKETQTTETKETATVEK--EEKAKVETEKTQEVP----KVTSQVSPKQEQSETVQ 1140

Query: 135 VEIVEAAEEEAAK--EEITDEELEAQALAAEAAEEAVMVVPPVEE-QPVEEIAQEQEKPT 191
+ A E + +E + A E + V PV E V E P
Sbjct: 1141 PQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPE 1200

Query: 192 KE 193

Sbjct: 1201 NT 1202



Score = 48.9 bits (116), Expect = 4e-08
Identities = 43/210 (20%), Positives = 69/210 (32%), Gaps = 26/210 (12%)

Query: 20 QTPEK-ETEVQNEQPVVEEIVQAQE----------PAKASEQAVEEQPQAHTEAE----- 63
TP + +V + EEI + E P++ +E E Q E
Sbjct: 998 TTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQD 1057

Query: 64 AETFAADVVEVTEQVAESEKAQPEAEVVAQPEPVVEETAEPVAIEREELPLPEDVNAEEV 123
A A EV ++ + KA + VAQ +ET E V EE
Sbjct: 1058 ATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKE------TATVEKEEK 1111

Query: 124 SPEEW--QAEAETVEIVEAAEEEAAKEEITDEELEAQALAAEAAEEAVMV--VPPVEEQP 179
+ E E V + ++E ++ E + +E EQP
Sbjct: 1112 AKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQP 1171

Query: 180 VEEIAQEQEKPTKEGFFARLKRSLLKTKEN 209
+E + E+P E S+++ EN
Sbjct: 1172 AKETSSNVEQPVTESTTVNTGNSVVENPEN 1201



Score = 46.2 bits (109), Expect = 3e-07
Identities = 25/159 (15%), Positives = 46/159 (28%), Gaps = 7/159 (4%)

Query: 17 QKEQTPEKETEVQNEQPVVEEIVQAQEPAKASE------QAVEEQPQAHTEAEAETFAAD 70
Q +T E T + E+ VE + P S+ Q+ QPQA E +
Sbjct: 1096 QTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNI 1155

Query: 71 VVEVTEQVAESEKAQPEAEVVAQPEPVVEETAEPVAIEREELPLPEDVNAEEVSPEEWQA 130
++ ++ QP E + E V E+ PE+ P
Sbjct: 1156 KEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVV-ENPENTTPATTQPTVNSE 1214

Query: 131 EAETVEIVEAAEEEAAKEEITDEELEAQALAAEAAEEAV 169
+ + + + + + A +
Sbjct: 1215 SSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCDLT 1253


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3949  SHIGARICIN    26     0.039    Ribosome inactivating protein family signature.
		>SHIGARICIN#Ribosome inactivating protein family signature.

Length = 289

Score = 25.9 bits (57), Expect = 0.039
Identities = 6/21 (28%), Positives = 13/21 (61%)

Query: 7 FFIVIIGLIVVAASFRFMQQR 27
+V+I AA ++F++Q+
Sbjct: 173 ALMVLIQSTSEAARYKFIEQQ 193


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3952  PF01206       105    3e-34    SirA family protein
		>PF01206#SirA family protein

Length = 76

Score = 105 bits (265), Expect = 3e-34
Identities = 24/72 (33%), Positives = 41/72 (56%)

Query: 9 DHTLDALGLRCPEPVMMVRKTVRNMQPGETLLIIADDPATTRDIPGFCTFMEHELVAKET 68
D +LDA GL CP P++ +KT+ M GE L ++A DP + +D F HEL+ ++
Sbjct: 5 DQSLDATGLNCPLPILKAKKTLATMNAGEVLYVMATDPGSVKDFESFSKQTGHELLEQKE 64

Query: 69 DGLPYRYLIRKG 80
+ Y + +++
Sbjct: 65 EDGTYHFRLKRA 76


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3955  TCRTETA       55     2e-10    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 55.2 bits (133), Expect = 2e-10
Identities = 80/398 (20%), Positives = 147/398 (36%), Gaps = 32/398 (8%)

Query: 27 LRLNLRIVSIVMFNFASYLTIGLPLAVLPGYVHDVM--GFSAFWAGLVISLQYFATLLSR 84
++ N ++ I+ + IGL + VLPG + D++ G++++L
Sbjct: 1 MKPNRPLIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACA 60

Query: 85 PHAGRYADLLGPKKIVVFGLCGCFLSGLGYLTAGLTASLPVISLLLLCLGRVILGI-GQS 143
P G +D G + +++ L G + + Y L V L +GR++ GI G +
Sbjct: 61 PVLGALSDRFGRRPVLLVSLAG---AAVDYAIMATAPFLWV-----LYIGRIVAGITGAT 112

Query: 144 FAGTGSTLWGVGVVGSL--HIGRVISWNGIVTYGAMAMGAPLGVVFYHWGGLQALALIIM 201
A G+ + + H G + + G +G +G H A AL +
Sbjct: 113 GAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGL 172

Query: 202 GVALVAILLAIPRPTVK--ASKGKPLPFRAVLGRVWLYGMALALA-----SAGFGVIATF 254
LL + + P + + +A +A V A
Sbjct: 173 NFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAAL 232

Query: 255 ITLFYDAK-GWDGAAFALTLFSCAFVGT---RLLFPNGINRIGGLNVAMICFSVEIIGLL 310
+F + + WD ++L + + + ++ R+G M+ + G +
Sbjct: 233 WVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYI 292

Query: 311 LVGVATMPWMAKIG-VLLAGAGFSLVFPALGVVAVKAVPQQNQGAALATYTVFMDLSLGV 369
L+ AT WMA VLLA G + PAL + + V ++ QG + L+ +
Sbjct: 293 LLAFATRGWMAFPIMVLLASGGIGM--PALQAMLSRQVDEERQGQLQGSLAALTSLT-SI 349

Query: 370 TGPLAGLVMSWAGVPV----IYLAAAGLVAIALLLTWR 403
GPL + A + ++A A L + L R
Sbjct: 350 VGPLLFTAIYAASITTWNGWAWIAGAALYLLCLPALRR 387


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3964  HTHFIS        29     0.018    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 29.0 bits (65), Expect = 0.018
Identities = 10/34 (29%), Positives = 19/34 (55%)

Query: 25 QAVLNNVSLTLKSGETVALLGRSGCGKSTLARLL 58
Q + ++ +++ T+ + G SG GK +AR L
Sbjct: 147 QEIYRVLARLMQTDLTLMITGESGTGKELVARAL 180


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3967  ABC2TRNSPORT  50     4e-09    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 50.3 bits (120), Expect = 4e-09
Identities = 41/171 (23%), Positives = 73/171 (42%), Gaps = 7/171 (4%)

Query: 200 REREHGTVEHLLVMPITPFEIMMAKV-WSMGLVVLVVSGLSLVLMVKGVLGVPIEGSIPL 258
R T E +L + +I++ ++ W+ L +G+ +V G + L
Sbjct: 93 RMEGQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGAGIGVVAAALGY----TQWLSLL 148

Query: 259 FMLGV-ALSLFATTSIGIFMGTIARSMPQLGLLVILVLLPLQMLSGGSTPRESMPQMVQD 317
+ L V AL+ A S+G+ + +A S LV+ P+ LSG P + +P + Q
Sbjct: 149 YALPVIALTGLAFASLGMVVTALAPSYDYFIFYQTLVITPILFLSGAVFPVDQLPIVFQT 208

Query: 318 IMLTMPTTHFVSLAQAILYRGAGFEIVWPQFLTLMAIGGAFF-TIALLRFR 367
+P +H + L + I+ ++ + I FF + ALLR R
Sbjct: 209 AARFLPLSHSIDLIRPIMLGHPVVDVCQHVGALCIYIVIPFFLSTALLRRR 259


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3968  PF05272       30     0.045    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.4 bits (68), Expect = 0.045
Identities = 9/26 (34%), Positives = 14/26 (53%)

Query: 37 ARCMVGLIGPDGVGKSSLLSLISGAR 62
V L G G+GKS+L++ + G
Sbjct: 595 FDYSVVLEGTGGIGKSTLINTLVGLD 620


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3969  RTXTOXIND     66     2e-14    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 66.0 bits (161), Expect = 2e-14
Identities = 32/196 (16%), Positives = 75/196 (38%), Gaps = 5/196 (2%)

Query: 6 RHLAWWVVGLLAVAAIVAWWLLRPAGVP-EGFAVSNGRIEATEVDIASKIAGRIDTILVK 64
R +A++++G L +A I++ G +GR + I + I+VK
Sbjct: 58 RLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKE----IKPIENSIVKEIIVK 113

Query: 65 EGQFVREGEVLAKMDTRVLQEQRLEAIAQIKEAQSAVAAAQALLEQRQSETRAAQSLVNQ 124
EG+ VR+G+VL K+ + L+ + + +A+ Q L + L ++
Sbjct: 114 EGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDE 173

Query: 125 RQAELDSVAKRHTRSRSLAHRGAISAQQLDDDRAAAESARAALESAKAQVSASKAAIEAA 184
+ S + + + + + Q + RA + A+++ +
Sbjct: 174 PYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVE 233

Query: 185 RTNIIQAQTRVEARLI 200
++ + + + + I
Sbjct: 234 KSRLDDFSSLLHKQAI 249


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3974  ALARACEMASE   29     0.023    Alanine racemase signature.
		>ALARACEMASE#Alanine racemase signature.

Length = 356

Score = 29.4 bits (66), Expect = 0.023
Identities = 26/109 (23%), Positives = 42/109 (38%), Gaps = 24/109 (22%)

Query: 215 VITAENGIVFRENLLFTHRGLSGPAVLQISSYWQPGEFVSINLLPDVDLETFL--NEQRN 272
++ E I RE RG GP +L + ++ + + + L T + N Q
Sbjct: 58 LLNLEEAITLRE------RGWKGP-ILMLEGFFHAQD---LEIYDQHRLTTCVHSNWQLK 107

Query: 273 AHPNQSLKNTLAVHL------------PKRLVERLQQLGQIPDVSLKQL 309
A N LK L ++L P R++ QQL + +V L
Sbjct: 108 ALQNARLKAPLDIYLKVNSGMNRLGFQPDRVLTVWQQLRAMANVGEMTL 156


60  EcE24377A_4013  EcE24377A_4033  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias     Product
EcE24377A_4013  -2      14       3.118415    diguanylate phosphodiesterase
EcE24377A_4014  -2      14       3.236366    2-dehydro-3-deoxygluconokinase
EcE24377A_4015  -2      15       3.081727    M16 family peptidase
EcE24377A_4016  -3      17       3.270998    C4-dicarboxylate transporter DctA
EcE24377A_4017  -2      16       3.530845    phosphodiesterase
EcE24377A_4018  -3      17       3.879638    cellulose synthase subunit BcsC
EcE24377A_4019  -3      14       1.973932    endo-1,4-D-glucanase
EcE24377A_4020  -2      14       1.704190    cellulose synthase regulator protein
EcE24377A_4021  -3      11       0.963814    cellulose synthase catalytic subunit
EcE24377A_4022  1       15       -0.504381   cell division protein
EcE24377A_4023  1       16       -0.983719   hypothetical protein
EcE24377A_4024  1       16       -0.862932   hypothetical protein
EcE24377A_4025  2       18       -2.147738   cellulose biosynthesis protein BcsF
EcE24377A_4026  0       15       -0.761924   hypothetical protein
EcE24377A_4027  -1      15       -0.001582   hypothetical protein
EcE24377A_4028  -2      16       1.373065    hypothetical protein
EcE24377A_4030  -2      17       2.073339    hypothetical protein
EcE24377A_4031  -2      20       1.721337    serine transporter family protein
EcE24377A_4032  -2      23       3.138686    dipeptide transporter ATP-binding subunit
EcE24377A_4033  -1      21       3.053177    dipeptide transporter ATP-binding subunit
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_4013  SALSPVBPROT   32     0.002    Salmonella virulence plasmid 65kDa B protein signature.
		>SALSPVBPROT#Salmonella virulence plasmid 65kDa B protein signature.

Length = 591

Score = 32.0 bits (72), Expect = 0.002
Identities = 43/165 (26%), Positives = 65/165 (39%), Gaps = 32/165 (19%)

Query: 93 DFFIEHGLLASVNIDGPTLIALRQQPKILRQIERLPWLRFELV----EHIRLPKDSTFAS 148
DF++ H +++ G T A P+ + WL E V EHI ++
Sbjct: 157 DFWLLHDSNGILHLLGKTAAARLSDPQAASHTAQ--WLVEESVTPAGEHI------YYSY 208

Query: 149 MCEFGPLWLDDFGTGMANFSA---LSEVRYDYIKIARELFVMLRQSPEGRTLFSQLLHLM 205
+ E G + + SA LS+V+Y A +L++ +P + LF L+
Sbjct: 209 LAENGDNVDLNGNEAGRDRSAMRYLSKVQYGNATPAADLYLWTSATPAVQWLF----TLV 264

Query: 206 NRYC-RGVIVEGVETPEEWRDVQNSPAFAAQGWFLSRPAPIETLN 249
Y RGV D Q PAF AQ +L+R P N
Sbjct: 265 FDYGERGV------------DPQVPPAFTAQNSWLARQDPFSLYN 297


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_4018  SYCDCHAPRONE  33     0.005    Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 32.6 bits (74), Expect = 0.005
Identities = 11/67 (16%), Positives = 23/67 (34%), Gaps = 3/67 (4%)

Query: 20 LGEATHREDLVQQSLY---RLELIDPNNPDVVAARFRSLLRQGDIDGAQKQLDRLSQLAP 76
LG +++ ++D P LL++G++ A+ L +L
Sbjct: 76 LGACRQAMGQYDLAIHSYSYGAIMDIKEPRFPFHAAECLLQKGELAEAESGLFLAQELIA 135

Query: 77 SSNAYKS 83
+K
Sbjct: 136 DKTEFKE 142


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_4025  TYPE3OMGPROT  26     0.006    Type III secretion system outer membrane G protein ...
		>TYPE3OMGPROT#Type III secretion system outer membrane G protein family signature.

Length = 607

Score = 26.4 bits (58), Expect = 0.006
Identities = 10/27 (37%), Positives = 14/27 (51%)

Query: 21 LGYLARHSLRRIRDTLRLFFAKPRYVK 47
+G L R R T+RLF +PR +
Sbjct: 484 IGALFRRKSELTRRTVRLFIIEPRIID 510


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_4026  FLGMRINGFLIF  32     0.008    Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 31.9 bits (72), Expect = 0.008
Identities = 10/44 (22%), Positives = 19/44 (43%), Gaps = 5/44 (11%)

Query: 94 QGSQVAGFSTDYLIDLVTRFINWQMIGAIFVLLVAWLFLSQWIR 137
G ++ + ID + W + VL+VAW+ + +R
Sbjct: 442 TGGELPFWQQQSFIDQLLAAGRW-----LLVLVVAWILWRKAVR 480


61  EcE24377A_4087  EcE24377A_4107  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias     Product
EcE24377A_4087  0       14       4.428805    alcohol dehydrogenase
EcE24377A_4088  -1      14       3.122286    selenocysteinyl-tRNA-specific translation
EcE24377A_4089  2       18       1.930165    selenocysteine synthase
EcE24377A_4090  2       19       1.409794    glutathione S-transferase
EcE24377A_4091  2       20       1.366874    hypothetical protein
EcE24377A_4092  2       21       1.630446    protein rhsB
EcE24377A_4093  1       15       -2.753166   hypothetical protein
EcE24377A_4094  0       15       0.338239    RHS domain-containing protein
EcE24377A_4095  0       18       0.983964    membrane fusion protein family protein
EcE24377A_4096  0       21       0.789426    hypothetical protein
EcE24377A_4097  1       22       0.492173    hypothetical protein
EcE24377A_4098  1       22       0.470805    PTS system mannitol-specific transporter subunit
EcE24377A_4099  1       17       -1.094623   mannitol-1-phosphate 5-dehydrogenase
EcE24377A_4100  2       21       -6.084860   mannitol repressor protein
EcE24377A_4101  2       20       -2.962683   hypothetical protein
EcE24377A_4102  2       17       -1.667818   hypothetical protein
EcE24377A_4103  1       17       -1.189705   hypothetical protein
EcE24377A_4104  1       15       -0.545769   hypothetical protein
EcE24377A_4105  1       14       -0.234222   lipoprotein
EcE24377A_4106  1       14       0.901222    hemagglutinin/invasin protein
EcE24377A_4107  -2      13       3.259272    L-lactate permease
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_4088  TCRTETOQM     58     8e-11    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 57.9 bits (140), Expect = 8e-11
Identities = 44/147 (29%), Positives = 70/147 (47%), Gaps = 18/147 (12%)

Query: 3 IATAGHVDHGKTTLLQAI---TGV------------NADRLPEEKKRGMTIDLGYAYWPQ 47
I HVD GKTTL +++ +G D E++RG+TI G +
Sbjct: 6 IGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITSFQW 65

Query: 48 PDGRVPGFIDVPGHEKFLSNMLAGVGGIDHALLVVACDDGVMAQTREHLAILQLTGNPML 107
+ +V ID PGH FL+ + + +D A+L+++ DGV AQTR L+ G P +
Sbjct: 66 ENTKV-NIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGIPTI 124

Query: 108 TVALTKADRVDEARVNEVERQVKEVLR 134
+ K D+ ++ V + +KE L
Sbjct: 125 -FFINKIDQNG-IDLSTVYQDIKEKLS 149


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_4095  RTXTOXIND     64     2e-13    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 64.5 bits (157), Expect = 2e-13
Identities = 56/314 (17%), Positives = 103/314 (32%), Gaps = 82/314 (26%)

Query: 66 ITPQVTGIVTEVTDKNNQLIQKGEVLFKLDPVR------------YQARVD--RLQA--- 108
I P IV E+ K + ++KG+VL KL + QAR++ R Q
Sbjct: 99 IKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSR 158

Query: 109 ------------------------DLMTATHNIK----TLRAQLTEAQANTTQVSAERDR 140
+++ T IK T + Q + + N + AER
Sbjct: 159 SIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLT 218

Query: 141 LFKNYQRY----------LKGSQAAVNPFS---------ERDIDDARQNF---LAQDALV 178
+ RY L + ++ + E +A +Q +
Sbjct: 219 VLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQI 278

Query: 179 KGSVAE----QAQIQSQLDSMVNGE----QSQIVSLRAQLTEAKYNLEQTVIRAPSNGYV 230
+ + + + + + I L +L + + + +VIRAP + V
Sbjct: 279 ESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKV 338

Query: 231 TQVLIR-PGTYAAALPLRPVMVFIPEQKRQIV-AQFRQNSLLRLKPGDDAEVVFNALPGQ 288
Q+ + G +MV +PE V A + + + G +A + A P
Sbjct: 339 QQLKVHTEGGVVT--TAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYT 396

Query: 289 VFH---GKLTSILP 299
+ GK+ +I
Sbjct: 397 RYGYLVGKVKNINL 410


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_4106  PF03895       65     5e-15    Serum resistance protein DsrA.
		>PF03895#Serum resistance protein DsrA.

Length = 79

Score = 65.2 bits (159), Expect = 5e-15
Identities = 19/79 (24%), Positives = 36/79 (45%), Gaps = 2/79 (2%)

Query: 1539 ESKLSGGIASAMAMTGLPQAYTPGASMASIGGGTYNGESAVALGV-SMVSANGRWVYKLQ 1597
+L G+A+ A++ L Q G + S G Y ++A+A+GV S ++ +
Sbjct: 2 SKELQTGLANQSALSMLVQPNGVGKTSVSAAVGGYRDKTALAIGVGSRITDRFTAKAGVA 61

Query: 1598 GSTNSQGEYSAALGAGIQW 1616
+T + G S G ++
Sbjct: 62 FNTYN-GGMSYGASVGYEF 79


62  EcE24377A_4121  EcE24377A_4130  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias     Product
EcE24377A_4121  0       20       -5.771177   2-amino-3-ketobutyrate CoA ligase
EcE24377A_4122  2       27       -8.931100   ADP-L-glycero-D-manno-heptose-6-epimerase
EcE24377A_4123  3       33       -11.178378  ADP-heptose--LPS heptosyltransferase
EcE24377A_4124  3       42       -14.474948  ADP-heptose--LPS heptosyltransferase
EcE24377A_4125  3       44       -16.615015  O-antigen polymerase
EcE24377A_4126  3       37       -14.436987  glycosyl transferase family protein
EcE24377A_4127  3       30       -11.405591  lipopolysaccharide 1,2-galactosyltransferase
EcE24377A_4128  2       28       -9.491055   lipopolysaccharide core biosynthesis protein
EcE24377A_4129  2       23       -6.285381   lipopolysaccharide 1,2-glucosyltransferase
EcE24377A_4130  2       19       -3.900086   lipopolysaccharide 1,3-galactosyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_4122  NUCEPIMERASE  102    3e-27    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 102 bits (256), Expect = 3e-27
Identities = 76/348 (21%), Positives = 127/348 (36%), Gaps = 67/348 (19%)

Query: 2 IIVTGGAGFIGSNIVKALNDKGITDILVVDNLKD--------------GTKFVNLVDLNI 47
+VTG AGFIG ++ K L + G ++ +DNL D +++
Sbjct: 3 YLVTGAAGFIGFHVSKRLLEAG-HQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 48 ADYMDKEDFLIQIMAGEEFGDVEAIFHEGACSSTTEWDGKYMMDNNYQYSK-------EL 100
AD + + + A F E +F + +Y ++N + Y+ +
Sbjct: 62 ADR----EGMTDLFASGHF---ERVFISPHRLAV-----RYSLENPHAYADSNLTGFLNI 109

Query: 101 LHYCLEREIP-FLYASSAATYGGRTSD-FIESREYEKPLNVYGYSKFLFDEYVRQILPEA 158
L C +I LYASS++ YG F + P+++Y +K +
Sbjct: 110 LEGCRHNKIQHLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLY 169

Query: 159 NSQIVGFRYFNVYGPREGHKGSMASVAFHLNTQLNNGESPKLFEGSENFKRDFVYVGDVA 218
G R+F VYGP + MA F + G+S ++ KRDF Y+ D+A
Sbjct: 170 GLPATGLRFFTVYGPWG--RPDMA--LFKFTKAMLEGKSIDVY-NYGKMKRDFTYIDDIA 224

Query: 219 DVNL------------WFLENGVSG-------IFNLGTGRAESFQAVADATLAY-HKKGQ 258
+ + W +E G ++N+G A + +
Sbjct: 225 EAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAK 284

Query: 259 IEYIPFPDKLKGRYQAFTQADLTNLRAA-GYDKPFKTVAEGVTEYMAW 305
+P G T AD L G+ P TV +GV ++ W
Sbjct: 285 KNMLPLQ---PGDVL-ETSADTKALYEVIGF-TPETTVKDGVKNFVNW 327


63  EcE24377A_4154  EcE24377A_4180  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias     Product
EcE24377A_4154  -1      12       3.051494    tRNA guanosine-2'-O-methyltransferase
EcE24377A_4156  -2      12       2.307018    ATP-dependent DNA helicase RecG
EcE24377A_4155  -1      12       1.308196    hypothetical protein
EcE24377A_4157  -1      12       0.919869    sodium/glutamate symporter
EcE24377A_4158  -2      10       0.286756    xanthine permease
EcE24377A_4159  -2      10       -1.319853   AsmA family protein
EcE24377A_4160  -2      12       -2.537695   alpha-xylosidase
EcE24377A_4162  -3      17       -4.264806   hypothetical protein
EcE24377A_4161  -2      17       -4.506882   transporter
EcE24377A_4165  0       19       -3.665332   *hypothetical protein
EcE24377A_4166  -1      18       -2.866807   sugar efflux transporter C
EcE24377A_4167  0       20       -2.665137   carboxylate/amino acid/amine transporter
EcE24377A_4168  0       16       -1.019206   hypothetical protein
EcE24377A_4169  -1      12       -0.098425   hypothetical protein
EcE24377A_4170  -1      12       0.788768    ribonucleoside transporter
EcE24377A_4171  -2      12       1.675775    hypothetical protein
EcE24377A_4172  -1      12       1.609440    sulfate permease inorganic anion transporter
EcE24377A_4173  -1      14       3.382426    cryptic adenine deaminase
EcE24377A_4174  -1      16       3.158844    sugar phosphate antiporter
EcE24377A_4175  1       17       4.104439    regulatory protein UhpC
EcE24377A_4176  2       18       4.651946    sensory histidine kinase UhpB
EcE24377A_4177  2       19       4.182700    hypothetical protein
EcE24377A_4178  2       19       3.961617    DNA-binding transcriptional activator UhpA
EcE24377A_4179  1       16       3.237241    acetolactate synthase 1 regulatory subunit
EcE24377A_4180  2       15       2.986012    acetolactate synthase catalytic subunit
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_4156  SECA          42     7e-06    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 42.2 bits (99), Expect = 7e-06
Identities = 39/129 (30%), Positives = 57/129 (44%), Gaps = 18/129 (13%)

Query: 233 NLSMLALRAGAQRFHAQPLSANDALKNKLLAALPFKPTGAQARVVAEIERDM-ALDVPMM 291
LS L+ F A+ L + L+N + A A R ++ M DV ++
Sbjct: 37 KLSDEELKGKTAEFRAR-LEKGEVLENLIPEAF------AVVREASKRVFGMRHFDVQLL 89

Query: 292 ---RLVQGDV-----GSGKTLVAALAA-LRAIAHGKQVALMAPTELLAEQHANNFRNWFE 342
L + + G GKTL A L A L A+ GK V ++ + LA++ A N R FE
Sbjct: 90 GGMVLNERCIAEMRTGEGKTLTATLPAYLNALT-GKGVHVVTVNDYLAQRDAENNRPLFE 148

Query: 343 PLGIEVGWL 351
LG+ VG
Sbjct: 149 FLGLTVGIN 157


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_4166  TCRTETA       40     1e-05    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 39.8 bits (93), Expect = 1e-05
Identities = 71/391 (18%), Positives = 129/391 (32%), Gaps = 33/391 (8%)

Query: 20 LLVAFLTGIAGALQTPTLSIFLADELKARPIM--VGFFFTGSAIMGILVSQFLARHSDKQ 77
L L + L P L L D + + + G A+M + L SD+
Sbjct: 11 LSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRF 70

Query: 78 GDRKLLILLCCLFGVLACTLFAWNRNYFILLSTGVLLSSFASTANPQMFALAREHADRTG 137
G R ++L+ + + A ++L ++ A AD T
Sbjct: 71 GRR-PVLLVSLAGAAVDYAIMATAPFLWVLYIGRIV----AGITGATGAVAGAYIADITD 125

Query: 138 RET-VMFSTFLRAQISLAWVIGPPLAYELAMGFSFKVMYLTAAIAFVVCGLIVWLFLP-- 194
+ F+ A V GP L + GFS + AA + L LP
Sbjct: 126 GDERARHFGFMSACFGFGMVAGPVLGGLMG-GFSPHAPFFAAAALNGLNFLTGCFLLPES 184

Query: 195 --SIQRNIPVVT-QPVEILPSTHRKRDTRLLFVVCSMMWAANNLYMINMPLFIIDELHLT 251
+R + P+ L V +M + +F D H
Sbjct: 185 HKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWD 244

Query: 252 DKLAGEMI-GIAAGLEIPMMLIAGYYMKRIGKRLLMLIAIVSGMCFYASVLMATTPAVEL 310
G + + +I G R+G+R +++ +++ Y +L+A +
Sbjct: 245 ATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGY--ILLAFATRGWM 302

Query: 311 ELQILNAIFLGILCGIGMLYFQDLMPEKI---------GSATTLYANTSRVGWIIAGSVD 361
+ L GIGM Q ++ ++ GS L + TS VG ++ ++
Sbjct: 303 ---AFPIMVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIY 359

Query: 362 GIMVEIWSYHALFWLAIGMLGIAMICLLFIK 392
+ W+ W I + ++CL ++
Sbjct: 360 AASITTWNG----WAWIAGAALYLLCLPALR 386



Score = 32.1 bits (73), Expect = 0.004
Identities = 18/102 (17%), Positives = 34/102 (33%)

Query: 17 AAFLLVAFLTGIAGALQTPTLSIFLADELKARPIMVGFFFTGSAIMGILVSQFLARHSDK 76
AA + V F+ + G + IF D +G I+ L +
Sbjct: 213 AALMAVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAA 272

Query: 77 QGDRKLLILLCCLFGVLACTLFAWNRNYFILLSTGVLLSSFA 118
+ + ++L + L A+ ++ VLL+S
Sbjct: 273 RLGERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLASGG 314


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_4170  TCRTETA       39     2e-05    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 39.4 bits (92), Expect = 2e-05
Identities = 35/208 (16%), Positives = 71/208 (34%), Gaps = 13/208 (6%)

Query: 88 IIVEFLPVSLLTP----MAQDLGISEGVAGQSVTVTAFVAMFASLFITQTIQATDR--RY 141
+ ++ + + L+ P + +DL S V + A A+ +DR R
Sbjct: 14 VALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRR 73

Query: 142 VVILFAVLL-TLSCLLVSFANSFSLLLIGRACLGLALGGFWAMSASLTMRLVPPRTVPKA 200
V+L ++ + +++ A +L IGR G+ G A++ + + +
Sbjct: 74 PVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGIT-GATGAVAGAYIADITDGDERARH 132

Query: 201 LSVIFGAVSIALVIAAPLGSFLGELIGWRNVFNAAAVMG----VLCIFWIIKSLPSLPGE 256
+ +V LG +G F AAA + + F + +S
Sbjct: 133 FGFMSACFGFGMVAGPVLGGLMGG-FSPHAPFFAAAALNGLNFLTGCFLLPESHKGERRP 191

Query: 257 PSHQKQNTFRLLQRPGVMAGMIAIFMSF 284
+ N + M + A+ F
Sbjct: 192 LRREALNPLASFRWARGMTVVAALMAVF 219


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_4173  UREASE        38     1e-04    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 37.8 bits (88), Expect = 1e-04
Identities = 30/105 (28%), Positives = 43/105 (40%), Gaps = 17/105 (16%)

Query: 22 AVSRGDAVADYIIDNVSILDLINGGEISGPIVIKGRYIAGVG-AEYAD---------APA 71
V+R D +I N ILD + G + I +K IA +G A D P
Sbjct: 60 QVTREGGAVDTVITNALILD--HWGIVKADIGLKDGRIAAIGKAGNPDMQPGVTIIVGPG 117

Query: 72 LQRIDARGATAVPGFIDAHLHIESSMMTPVTFETATLPRGLTTVI 116
+ I G G +D+H+H + P E A L GLT ++
Sbjct: 118 TEVIAGEGKIVTAGGMDSHIH----FICPQQIEEA-LMSGLTCML 157


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_4174  TCRTETB       34     0.001    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 34.1 bits (78), Expect = 0.001
Identities = 28/168 (16%), Positives = 61/168 (36%), Gaps = 17/168 (10%)

Query: 49 FNIAQNDMISTYGLSMTQLGMIGLGFSITYGVGKTLVSYYADGKNTKQFLPFMLILSAIC 108
N++ D+ + + + F +T+ +G + +D K+ L F +I++ C
Sbjct: 33 LNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIIN--C 90

Query: 109 MLGFSASMGSGSVSLFLMIAFYALSGFFQSTGGSCSYSTI----TKWTPRRKRGTFLGFW 164
+G SL +M + F Q G + + + ++ P+ RG G
Sbjct: 91 FGSVIGFVGHSFFSLLIM------ARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLI 144

Query: 165 NISHNLGGAGAAGVALFGANYLFDGHVIGMFIFPSIIALIVGFIGLRY 212
+G + A+Y+ + + + P + I+ L
Sbjct: 145 GSIVAMGEGVGPAIGGMIAHYIHWSY---LLLIP--MITIITVPFLMK 187


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_4175  TCRTETB       41     5e-06    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 41.4 bits (97), Expect = 5e-06
Identities = 65/408 (15%), Positives = 138/408 (33%), Gaps = 60/408 (14%)

Query: 29 RHILLTIWLGYALFY--FTRKSFNAAVPEILANGVLSRSDIGLLATLFYITYGVSKFVSG 86
RH + IWL F+ N ++P+I + + + T F +T+ + V G
Sbjct: 11 RHNQILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYG 70

Query: 87 IVSDRSNARYFMGIGLIATGIINILFGFSTSLWAFAVLWVLNAFFQGWGS---PVCARLL 143
+SD+ + + G+I +++ S F L ++ F QG G+ P ++
Sbjct: 71 KLSDQLGIKRLLLFGIIINCFGSVIGFVGHS---FFSLLIMARFIQGAGAAAFPALVMVV 127

Query: 144 TAWY-SRTERGGWWALWNTAHNVGDALIPIVMAAAALHYGWRAGMMIAGCMAIVVGIFLC 202
A Y + RG + L + +G+ + P + A + W ++ M ++ +
Sbjct: 128 VARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHW--SYLLLIPMITIITVPF- 184

Query: 203 WRLRDRPQALGLPAVGEWRHDALEIAQQQEGAGLTRKEILTKYVLLNPYIWLLSFCYVLV 262
+ L +I G L I+ + Y VL
Sbjct: 185 --------LMKLLKKEVRIKGHFDIK----GIILMSVGIVFFMLFTTSYSISFLIVSVLS 232

Query: 263 YVV-----RAAINDWGNLYMSETLGVDLVTANTAVTMFELGGFI-----------GALVA 306
+++ R + + + + + + + + + GF+ A
Sbjct: 233 FLIFVKHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTA 292

Query: 307 GWGSDKLFNGNRGPMNLIFAAGILL-SVGSLWLMPFASYVMQATCFFTIGFFVFGPQMLI 365
GS +F G + + GIL+ G L+++ + + F T F + +
Sbjct: 293 EIGSVIIFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFL-SVSFLTASFLLETTSWFM 351

Query: 366 ---------GMAAAECS---------HKEAAGAATGFVGLFAYLGASL 395
G++ + ++ AGA + ++L
Sbjct: 352 TIIIVFVLGGLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGT 399


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_4176  PF06580       40     2e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 39.8 bits (93), Expect = 2e-05
Identities = 28/142 (19%), Positives = 56/142 (39%), Gaps = 11/142 (7%)

Query: 362 LRPRQLDDLTLEQAIRSLMREMELEGRGIVSHLEWRIDESALSENQRVTLFRVCQEGLNN 421
LR ++L + + ++L L++ + + +V + Q + N
Sbjct: 208 LRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQINPAIMDVQVPPM-LVQTLVEN 266

Query: 422 IVKHA-----DASAVTLQGWQQDERLMLVIEDDGSGLPPGSGQ-QGFGLTGMRERVTALG 475
+KH + L+G + + + L +E+ GS + + G GL +RER+ L
Sbjct: 267 GIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKESTGTGLQNVRERLQMLY 326

Query: 476 G---TLHISCLHG-TRVSVSLP 493
G + +S G V +P
Sbjct: 327 GTEAQIKLSEKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_4178  HTHFIS        61     2e-13    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 61.4 bits (149), Expect = 2e-13
Identities = 29/174 (16%), Positives = 59/174 (33%), Gaps = 20/174 (11%)

Query: 2 ITVALIDDHLIVRSGFAQLLGLEPDLQVVAEFGSGREALAGLPGRGVQVCICDISMPDIS 61
T+ + DD +R+ Q L V + + + + D+ MPD +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRA-GYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 62 GLELLSQLPK---GMATIMLSVHDSPALVEQALNAGARGFLSKRCSPDELIAAVHTVATG 118
+LL ++ K + +++S ++ +A GA +L K ELI +
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR---- 117

Query: 119 GCYLTPDIAIKLASGRQDPLTKRERQVAEKLAQG---MAVKEIAAELGLSPKTV 169
A+ R L + + + + + A L + T+
Sbjct: 118 --------ALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTL 163


64  EcE24377A_4224  EcE24377A_4250  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias     Product
EcE24377A_4224  0       15       -4.600128   6-phosphogluconate phosphatase
EcE24377A_4225  1       15       -4.473384   inner membrane protein
EcE24377A_4226  0       13       -3.135625   hypothetical protein
EcE24377A_4227  1       13       -2.864244   6-phosphogluconolactonase
EcE24377A_4228  0       14       -3.679602   esterase
EcE24377A_4229  0       16       -4.432017   glucoside specific outer membrane porin BglH
EcE24377A_4230  -1      14       -3.108690   6-phospho-beta-glucosidase
EcE24377A_4231  -2      14       -2.283177   PTS system beta-glucoside-specific transporter
EcE24377A_4232  -2      19       -1.623405   hypothetical protein
EcE24377A_4233  -2      19       -0.390096   transcriptional antiterminator BglG
EcE24377A_4234  -3      29       1.255066    hypothetical protein
EcE24377A_4235  -1      21       -0.955507   transcriptional regulator PhoU
EcE24377A_4236  0       11       -2.302405   phosphate transporter ATP-binding protein
EcE24377A_4237  0       12       -3.344841   phosphate transporter permease subunit PtsA
EcE24377A_4238  1       18       -4.448531   phosphate transporter permease subunit PstC
EcE24377A_4239  1       16       -3.860315   phosphate ABC transporter substrate-binding
EcE24377A_4240  -1      16       -3.476979   fimbrial family protein
EcE24377A_4241  -1      13       -1.823218   outer membrane usher protein FimD
EcE24377A_4242  0       16       -0.173719   pili assembly chaperone protein
EcE24377A_4243  3       28       1.668437    fimbrial protein
EcE24377A_4244  3       28       2.163171    glucosamine--fructose-6-phosphate
EcE24377A_4245  4       34       2.100841    bifunctional N-acetylglucosamine-1-phosphate
EcE24377A_4248  5       37       2.110579    hypothetical protein
EcE24377A_4246  5       39       1.977680    ATP synthase F0F1 subunit epsilon
EcE24377A_4247  4       41       1.871657    ATP synthase F0F1 subunit beta
EcE24377A_4249  3       34       0.794576    ATP synthase F0F1 subunit gamma
EcE24377A_4250  4       35       0.560244    ATP synthase F0F1 subunit alpha
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_4241  PF00577       763    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 763 bits (1971), Expect = 0.0
Identities = 331/870 (38%), Positives = 486/870 (55%), Gaps = 54/870 (6%)

Query: 5 IVVGLTAGTCLIFSQNLMAEVSVFNPALLEIDHQSGVDIRQFNRANLMPPGVYSVDIFIN 64
V L L + FNP L D Q+ D+ +F +PPG Y VDI++N
Sbjct: 26 FFVRLFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPGTYRVDIYLN 85

Query: 65 GKMFERQDVTFVQDNPDADLHACFIAIKKTLSSFGIKVDALKSFNDVDETVCLDPALRIE 124
+DVTF + + + C L+S G+ ++ N + + C+ I
Sbjct: 86 NGYMATRDVTFNTGDSEQGIVPCLTR--AQLASMGLNTASVSGMNLLADDACVPLTSMIH 143

Query: 125 GSSWQFDSDKLQLNISIPQIYMDAMAYDYISPTRWDEGINALTINYDFSGSHTLRSDYGS 184
++ Q D + +LN++IPQ +M A YI P WD GINA +NY+FSG+ +
Sbjct: 144 DATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSV--QNRIG 201

Query: 185 QETDTSYLNLRNGLNIGPWRLRNYSTLN------TSDGRAEYNSISTWIQRDIAALRSQI 238
+ +YLNL++GLNIG WRLR+ +T + +S + ++ I+TW++RDI LRS++
Sbjct: 202 GNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRSRL 261

Query: 239 MIGDTWTASDIFDSTQIRGARLYTDNDMLPASQNGFAPVVRGIAKSNATVIIRQNGYVIY 298
+GD +T DIFD RGA+L +D++MLP SQ GFAPV+ GIA+ A V I+QNGY IY
Sbjct: 262 TLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIY 321

Query: 299 QSAVPQGAFEITDLNTASTGGDLDVTIKEEDGSEQRFTQPYASLAILKREGQTDVDVSVG 358
S VP G F I D+ A GDL VTIKE DGS Q FT PY+S+ +L+REG T ++ G
Sbjct: 322 NSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITAG 381

Query: 359 ELRDEDG--FTPDVLQAQILHGFSHGITLYGGMQAAENYGSAALGVGKDLGALGAISFDV 416
E R + P Q+ +LHG G T+YGG Q A+ Y + G+GK++GALGA+S D+
Sbjct: 382 EYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALSVDM 441

Query: 417 THARANFSHDDTETGQSYRFLYSKRFDDTDTSLRLVGYRYSTEGYYTLNEWASRRNS--- 473
T A + D GQS RFLY+K +++ T+++LVGYRYST GY+ + R +
Sbjct: 442 TQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGYN 501

Query: 474 --------------PEDFWETGNRRSRVEGTLTQSLGRDYGNLYLTLSRQQYWHTDDVER 519
+ + N+R +++ T+TQ LGR LYL+ S Q YW T +V+
Sbjct: 502 IETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGR-TSTLYLSGSHQTYWGTSNVDE 560

Query: 520 LMQFGYSSSWKRLSWNVSWSYSNTARQGTGNNHASDNTSEQIYMLSLSVPLSGW------ 573
Q G +++++ ++W +S+S + A +Q+ L++++P S W
Sbjct: 561 QFQAGLNTAFEDINWTLSYSLTKN---------AWQKGRDQMLALNVNIPFSHWLRSDSK 611

Query: 574 --WGNSYATYSVSQNDNSGSSHQLGLSGTALERNNLSWNLMQSYNSHDDEVGGN---MSL 628
W ++ A+YS+S + N ++ G+ GT LE NNLS+++ Y D G+ +L
Sbjct: 612 SQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATL 671

Query: 629 TYDGSYGTVNGSYNYSQNSQRLNYGIRGGILAHSEGVTLSQELGETIALVKAPGAAGLEI 688
Y G YG N Y++S + ++L YG+ GG+LAH+ GVTL Q L +T+ LVKAPGA ++
Sbjct: 672 NYRGGYGNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNDTVVLVKAPGAKDAKV 731

Query: 689 DNMRGAATDWRGYTVKTQLNPYDENRVAISDNYFSKSNIELDNTVVTMVPTRGAVVKAEF 748
+N G TDWRGY V Y ENRVA+ N + N++LDN V +VPTRGA+V+AEF
Sbjct: 732 ENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLA-DNVDLDNAVANVVPTRGAIVRAEF 790

Query: 749 VTHVGYRVLFRVLNANGKPVPFGAIAAIQDASLADSGIVGDRGELYLSGLPEKGQVTLSW 808
VG ++L L N KP+PFG A + S SGIV D G++YLSG+P G+V + W
Sbjct: 791 KARVGIKLLMT-LTHNNKPLPFG--AMVTSESSQSSGIVADNGQVYLSGMPLAGKVQVKW 847

Query: 809 GENASTKCIFNYSLSTPESESGLIEQGVTC 838
GE + C+ NY L + L + C
Sbjct: 848 GEEENAHCVANYQLPPESQQQLLTQLSAEC 877


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_4245  RTXTOXINA     29     0.048    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family

signature.
Length = 1024

Score = 29.2 bits (65), Expect = 0.048
Identities = 23/80 (28%), Positives = 31/80 (38%), Gaps = 10/80 (12%)

Query: 367 LGDAEIGDNVNIGAGTITCNYDGANKFKTIIGDDVFVGSDTQLVAPVTVGKGATIAAGTT 426
LGD + D V + AG+ N G DV T G AT A T
Sbjct: 616 LGDGD--DKVFLSAGSA--NIYAGK------GHDVVYYDKTDTGYLTIDGTKATEAGNYT 665

Query: 427 VTRNVGENALAISRVPQTQK 446
VTR +G + + V + Q+
Sbjct: 666 VTRVLGGDVKVLQEVVKEQE 685


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_4248  PF06438       28     0.017    Heme acquisition protein HasAp
		>PF06438#Heme acquisition protein HasAp

Length = 205

Score = 28.0 bits (62), Expect = 0.017
Identities = 9/26 (34%), Positives = 13/26 (50%)

Query: 17 SRLAFVLFKPVLHRFFGQLDNAQLRD 42
L + LF H +G+LD+ L D
Sbjct: 71 GDLHYTLFSNPSHTLWGKLDSIALGD 96


65  EcE24377A_4327  EcE24377A_4336  Y  N  N  Genomic Island
LocusTag        DNBias  CDNBias  %GCBias     Product
EcE24377A_4327  -2      19       4.097652    lipoprotein
EcE24377A_4328  -2      16       3.222609    diaminopimelate epimerase
EcE24377A_4329  -2      17       2.853637    hypothetical protein
EcE24377A_4330  -2      18       2.576036    site-specific tyrosine recombinase XerC
EcE24377A_4331  -1      18       2.176376    flavin mononucleotide phosphatase
EcE24377A_4332  0       14       -0.222588   DNA-dependent helicase II
EcE24377A_4333  1       22       -9.249792   hypothetical protein
EcE24377A_4334  3       23       -10.599306  hypothetical protein
EcE24377A_4335  1       18       -8.229362   magnesium/nickel/cobalt transporter CorA
EcE24377A_4337  2       26       -9.783049   hypothetical protein
EcE24377A_4336  1       18       -7.376527   hypothetical protein
66  EcE24377A_4373  EcE24377A_4397  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias     Product
EcE24377A_4373  0       19       -3.288914   **molybdopterin-guanine dinucleotide biosynthesis
EcE24377A_4374  -1      21       -5.896807   molybdopterin-guanine dinucleotide biosynthesis
EcE24377A_4375  -2      20       -6.454404   hypothetical protein
EcE24377A_4376  -1      20       -6.988230   hypothetical protein
EcE24377A_4377  -2      14       -4.137587   serine/threonine protein kinase
EcE24377A_4378  -1      13       -3.889277   protein disulfide isomerase I
EcE24377A_4379  -1      14       -3.551850   hypothetical protein
EcE24377A_4380  0       15       -1.358227   acyltransferase
EcE24377A_4381  1       15       0.006333    hypothetical protein
EcE24377A_4382  1       15       0.664815    DNA polymerase I
EcE24377A_4383  2       17       0.942774    ribosome biogenesis GTP-binding protein YsxC
EcE24377A_4384  2       18       2.728823    hypothetical protein
EcE24377A_4385  0       16       3.159716    hypothetical protein
EcE24377A_4386  0       18       2.525191    coproporphyrinogen III oxidase
EcE24377A_4387  -1      18       2.600625    hypothetical protein
EcE24377A_4389  3       27       2.434322    hypothetical protein
EcE24377A_4390  2       23       1.897961    nitrogen regulation protein NR(I)
EcE24377A_4391  2       22       -0.434259   nitrogen regulation protein NR(II)
EcE24377A_4392  2       21       -2.239520   glutamine synthetase
EcE24377A_4393  2       15       -4.355472   hypothetical protein
EcE24377A_4394  1       14       -4.798468   GTP-binding protein
EcE24377A_4395  0       14       -5.391488   GntR family transcriptional regulator
EcE24377A_4396  -1      11       -3.752655   AP endonuclease
EcE24377A_4397  0       11       -3.202492   major facilitator transporter
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_4384  SECA          30     0.004    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 30.2 bits (68), Expect = 0.004
Identities = 11/71 (15%), Positives = 30/71 (42%)

Query: 14 AKARRKTREELDQEARDRKRQKKRRGHAPGSRAAGGNTTSGSKGQNAPKDPRIGSKTPIP 73
+K + + EE+++ + R+ + +R ++ + + + ++G P P
Sbjct: 827 SKVQVRMPEEVEELEQQRRMEAERLAQMQQLSHQDDDSAAAAALAAQTGERKVGRNDPCP 886

Query: 74 LGVTEKVTKQH 84
G +K + H
Sbjct: 887 CGSGKKYKQCH 897


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_4390  HTHFIS        601    0.0      FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 601 bits (1550), Expect = 0.0
Identities = 206/478 (43%), Positives = 300/478 (62%), Gaps = 11/478 (2%)

Query: 1 MQRGIVWVVDDDSSIRWVLERALAGAGLTCTTFENGAEVLEALASKTPDVLLSDIRMPGM 60
M + V DDD++IR VL +AL+ AG N A + +A+ D++++D+ MP
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 DGLALLKQIKQRHPMLPVIIMTAHSDLDAAVSAYQQGAFDYLPKPFDIDEAVALVERAIS 120
+ LL +IK+ P LPV++M+A + A+ A ++GA+DYLPKPFD+ E + ++ RA++
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 121 HYQEQQQPRNIQLNGPTTDIIGEAPAMQDVFRIIGRLSRSSISVLINGESGTGKELVAHA 180
+ + ++G + AMQ+++R++ RL ++ ++++I GESGTGKELVA A
Sbjct: 121 EPKRRPSKLEDDSQDGM-PLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARA 179

Query: 181 LHRHSPRAKAPFIALNMAAIPKDLIESELFGHEKGAFTGANTIRQGRFEQADGGTLFLDE 240
LH + R PF+A+NMAAIP+DLIESELFGHEKGAFTGA T GRFEQA+GGTLFLDE
Sbjct: 180 LHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDE 239

Query: 241 IGDMPLDVQTRLLRVLADGQFYRVGGYAPVKVDVRIIAATHQNLEQRVQEGKFREDLFHR 300
IGDMP+D QTRLLRVL G++ VGG P++ DVRI+AAT+++L+Q + +G FREDL++R
Sbjct: 240 IGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYR 299

Query: 301 LNVIRVHLPPLRERREDIPRLARHFLQVAARELGVEAKLLHPETEAALTRLAWPGNVRQL 360
LNV+ + LPPLR+R EDIP L RHF+Q A +E G++ K E + WPGNVR+L
Sbjct: 300 LNVVPLRLPPLRDRAEDIPDLVRHFVQQAEKE-GLDVKRFDQEALELMKAHPWPGNVREL 358

Query: 361 ENTCRWLTVMAAGQEVLIQDLPGELFESTVAESTSQMQPDSWATLLAQWADRALRS---- 416
EN R LT + + + + EL + S + ++Q + +R
Sbjct: 359 ENLVRRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFAS 418

Query: 417 -----GHQNLLSEAQPELERTLLTTALRHTQGHKQEAARLLGWGRNTLTRKLKELGME 469
L E+E L+ AL T+G++ +AA LLG RNTL +K++ELG+
Sbjct: 419 FGDALPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGVS 476


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_4391  PF06580       28     0.042    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 28.3 bits (63), Expect = 0.042
Identities = 34/190 (17%), Positives = 72/190 (37%), Gaps = 41/190 (21%)

Query: 171 IIEQADRLRNLVDRL---LGPQLPGTRVTE-SIHKVAERV---VTLVSMELPDNVRLIRD 223
I+E + R ++ L + L + + S+ V + L S++ D ++
Sbjct: 186 ILEDPTKAREMLTSLSELMRYSLRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQ 245

Query: 224 YDPSLPELAHDPDQIEQVLLN-IVRNALQ---ALGPEGGEIILRTRTAFQLTLHGERYRL 279
+P++ ++ Q+ +L+ +V N ++ A P+GG+I+L+
Sbjct: 246 INPAIMDV-----QVPPMLVQTLVENGIKHGIAQLPQGGKILLKGT------KDNGTVT- 293

Query: 280 AARIDVEDNGPGIPPHLQDTLFYPMVSGREGGTGLGLSIARNLIDQHSGK---IEFTSWP 336
++VE+ G + ++ TG GL R + G I+ +
Sbjct: 294 ---LEVENTGSLALKNTKE------------STGTGLQNVRERLQMLYGTEAQIKLSEKQ 338

Query: 337 GHTEFSVYLP 346
G V +P
Sbjct: 339 GKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_4394  TCRTETOQM     180    4e-51    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family

signature.
Length = 639

Score = 180 bits (458), Expect = 4e-51
Identities = 97/445 (21%), Positives = 170/445 (38%), Gaps = 81/445 (18%)

Query: 4 KLRNIAIIAHVDHGKTTLVDKLLQQSGTFDSRAETQE--RVMDSNDLEKERGITILAKNT 61
K+ NI ++AHVD GKTTL + LL SG + D+ LE++RGITI T
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 62 AIKWNDYRINIVDTPGHADFGGEVERVMSMVDSVLLVVDAFDGPMPQTRFVTKKAFAYGL 121
+ +W + ++NI+DTPGH DF EV R +S++D +L++ A DG QTR + G+
Sbjct: 62 SFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGI 121

Query: 122 KPIVVINKVDRPGARPDWVVDQVFD-------------LFVNLDATDEQLD--------- 159
I INK+D+ G V + + L+ N+ T+
Sbjct: 122 PTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEG 181

Query: 160 --------------------------------FPIVYASALNGIAGLDHEDMAEDMTPLY 187
FP+ + SA N I G+D+ L
Sbjct: 182 NDDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNI-GIDN---------LI 231

Query: 188 QAIVDHVPAPDVDLDGPFQMQISQLDYNSYVGVIGIGRIKRGKVKPNQQVTIIDSEGKTR 247
+ I + + ++ +++Y+ + R+ G + V I + E
Sbjct: 232 EVITNKFYSSTHRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEKEKI-- 289

Query: 248 NAKVGKVLGHLGLERIETDLAEAGDIVAITGLGELNISDTVCDTQNVEALPALSVDEPTV 307
K+ ++ + E + D A +G+IV + L ++ + DT+ + + P +
Sbjct: 290 --KITEMYTSINGELCKIDKAYSGEIVILQNEF-LKLNSVLGDTKLLPQRERIENPLPLL 346

Query: 308 SMFFCVNTSPFCGKEGKFVTSRQILDRLNKELVHNVALRVEETEDADAFRVSGRGELHLS 367
+ + D L LR +S G++ +
Sbjct: 347 QTTVEPSKPQQREMLLDALLEISDSDPL---------LRYYVDSATHEIILSFLGKVQME 397

Query: 368 VLIENMRRE-GFELAVSRPKVIFRE 391
V ++ + E+ + P VI+ E
Sbjct: 398 VTCALLQEKYHVEIEIKEPTVIYME 422



Score = 32.5 bits (74), Expect = 0.005
Identities = 13/75 (17%), Positives = 29/75 (38%), Gaps = 1/75 (1%)

Query: 398 EPYENVTLDVEEQHQGSVMQALGERKGDLKNMNPDGKGRVRLDYVIPSRGLIGFRSEFMT 457
EPY + + +++ + ++ + V L IP+R + +RS+
Sbjct: 537 EPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQLKN-NEVILSGEIPARCIQEYRSDLTF 595

Query: 458 MTSGTGLLYSTFSHY 472
T+G + + Y
Sbjct: 596 FTNGRSVCLTELKGY 610


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_4397  TCRTETB       29     0.025    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 29.5 bits (66), Expect = 0.025
Identities = 31/161 (19%), Positives = 64/161 (39%), Gaps = 15/161 (9%)

Query: 227 NVFFVYAVYCGLTFFIPFLKNIYLLP----------VALVGAYGIINQYCLKMIGGPIGG 276
N+ F+ V CG F + ++P A +G+ I +I G IGG
Sbjct: 255 NIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYIGG 314

Query: 277 MISDKILKSPSKYLCYTFIISTAALVLLIMLPHESMPVYLGMACTLGFGAIVFTQRAVFF 336
++ D+ + P L + + + L E+ ++ + G + FT+
Sbjct: 315 ILVDR--RGPLYVLNIGVTFLSVSFLTASFLL-ETTSWFMTIIIVFVLGGLSFTK--TVI 369

Query: 337 APIGEAKIAENKTGAAMALGSFIGYAPAMFCFSLYGYILDL 377
+ I + + + + GA M+L +F + ++ G +L +
Sbjct: 370 STIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLSI 410


67  EcE24377A_4466  EcE24377A_4482  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias     Product
EcE24377A_4466  0       13       3.229512    1,4-dihydroxy-2-naphthoate
EcE24377A_4467  1       15       3.219407    ATP-dependent protease ATP-binding subunit HslU
EcE24377A_4468  -1      19       5.316098    ATP-dependent protease peptidase subunit
EcE24377A_4469  -1      19       5.487086    essential cell division protein FtsN
EcE24377A_4470  -1      21       5.048267    DNA-binding transcriptional regulator CytR
EcE24377A_4471  0       23       5.553294    primosome assembly protein PriA
EcE24377A_4472  1       25       5.045664    50S ribosomal protein L31
EcE24377A_4473  1       25       5.051958    protein rhsA
EcE24377A_4474  -1      12       1.755293    lipoprotein
EcE24377A_4475  0       15       4.128117    peptidoglycan peptidase
EcE24377A_4476  -1      17       4.001039    hypothetical protein
EcE24377A_4477  -1      18       4.212352    IS621, transposase
EcE24377A_4478  -1      18       4.144295    transcriptional repressor protein MetJ
EcE24377A_4479  -2      17       4.124703    cystathionine gamma-synthase
EcE24377A_4480  -2      18       4.171840    bifunctional aspartate kinase II/homoserine
EcE24377A_4481  -2      16       2.983006    5,10-methylenetetrahydrofolate reductase
EcE24377A_4482  -1      18       3.367655    catalase/peroxidase HPI
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_4467  HTHFIS        30     0.018    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 30.2 bits (68), Expect = 0.018
Identities = 11/36 (30%), Positives = 18/36 (50%), Gaps = 3/36 (8%)

Query: 49 TPKNILMIGPTGVGKTEIAR---RLAKLANAPFIKV 81
T +++ G +G GK +AR K N PF+ +
Sbjct: 159 TDLTLMITGESGTGKELVARALHDYGKRRNGPFVAI 194


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_4469  IGASERPTASE   42     2e-06    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 42.0 bits (98), Expect = 2e-06
Identities = 32/155 (20%), Positives = 64/155 (41%), Gaps = 5/155 (3%)

Query: 114 LTPEQRQLLEQMQADMRQQPTQLVEVPWNEQTPEQRQQTLQRQRQAQQLAEQQRLAQQSR 173
+ +QAD+ P+ E+ ++ P + +AE + Q+S+
Sbjct: 992 VDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSK--QESK 1049

Query: 174 TTEQSWQQQT-RTSQAAPVQAQPRQSKPASTQQPYQDLLQTPAHTTAQSKPQQAAPVARA 232
T E++ Q T T+Q V + + + A+TQ + T ++ ++ A V +
Sbjct: 1050 TVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKE 1109

Query: 233 ADAPKPTAEKKDERRWMVQCGSFRGAEQAETVRAQ 267
A T + ++ + Q + EQ+ETV+ Q
Sbjct: 1110 EKAKVETEKTQEVPKVTSQVSPKQ--EQSETVQPQ 1142


68  EcE24377A_4623  EcE24377A_4628  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias     Product
EcE24377A_4623  -2      20       3.109736    hypothetical protein
EcE24377A_4624  -2      21       3.895754    hypothetical protein
EcE24377A_4625  -2      18       4.254593    acetyl-CoA synthetase
EcE24377A_4626  -1      16       3.720082    cytochrome c552
EcE24377A_4627  0       18       4.128051    cytochrome c nitrite reductase pentaheme
EcE24377A_4628  0       19       3.819722    cytochrome c nitrite reductase, Fe-S protein
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_4623  RTXTOXIND     27     0.020    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 26.7 bits (59), Expect = 0.020
Identities = 5/33 (15%), Positives = 13/33 (39%), Gaps = 1/33 (3%)

Query: 17 ELVEKR-QRFATILSIIMLAVYIGFILLIAFAP 48
EL+E R +++ ++ + +L
Sbjct: 47 ELIETPVSRRPRLVAYFIMGFLVIAFILSVLGQ 79


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_4628  VACJLIPOPROT  30     0.006    VacJ lipoprotein signature.
		>VACJLIPOPROT#VacJ lipoprotein signature.

Length = 251

Score = 29.9 bits (67), Expect = 0.006
Identities = 6/21 (28%), Positives = 11/21 (52%)

Query: 179 FGNLDDPNSEISQLLRQKPTY 199
GNL++P ++ L+ P
Sbjct: 75 TGNLEEPAVMVNYFLQGDPYQ 95


69  EcE24377A_4642  EcE24377A_4657  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias     Product
EcE24377A_4642  1       30       6.469198    hypothetical protein
EcE24377A_4645  0       33       7.712004    hypothetical protein
EcE24377A_4646  0       35       8.382367    carbon-phosphorus lyase complex accessory
EcE24377A_4647  0       37       8.149295    aminoalkylphosphonic acid N-acetyltransferase
EcE24377A_4648  0       37       8.620935    ribose 1,5-bisphosphokinase
EcE24377A_4649  0       38       8.910840    phosphonate metabolism protein PhnM
EcE24377A_4650  0       38       9.132794    phosphonate C-P lyase system protein PhnL
EcE24377A_4651  0       37       9.259418    phosphonate C-P lyase system protein PhnK
EcE24377A_4652  0       39       9.308655    phosphonate metabolism protein PhnJ
EcE24377A_4653  2       40       8.522540    phosphonate metabolism protein PhnI
EcE24377A_4654  2       41       8.023233    carbon-phosphorus lyase complex subunit
EcE24377A_4655  2       40       7.306035    phosphonate C-P lyase system protein PhnG
EcE24377A_4656  2       37       5.917690    phosphonate metabolism transcriptional regulator
EcE24377A_4657  2       36       3.956291    phosphonate ABC transporter permease
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_4646  FLGMOTORFLIG  31     0.006    Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 30.5 bits (69), Expect = 0.006
Identities = 15/58 (25%), Positives = 22/58 (37%), Gaps = 7/58 (12%)

Query: 169 PEKTLKFLLNNHPQVMVIDCSHPPRADAPRNHCDLNTVLALNQVIR-------SPQVI 219
P L F+ HPQ + + S+ A L T + N R SP+V+
Sbjct: 125 PANILNFIQQEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEVV 182


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_4647  SACTRNSFRASE  33     3e-04    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 32.6 bits (74), Expect = 3e-04
Identities = 20/84 (23%), Positives = 32/84 (38%), Gaps = 5/84 (5%)

Query: 50 HLALLDGEVVGMIGLHLQFHLHHVNWIGEIQELVVMPQARGLNVGSKLLAWAEEEARQAG 109
L L+ +G I + + N I+++ V R VG+ LL A E A++
Sbjct: 68 FLYYLENNCIGRIKIRSNW-----NGYALIEDIAVAKDYRKKGVGTALLHKAIEWAKENH 122

Query: 110 AEMTELSTNVKRHDAHRFYLREGY 133
L T A FY + +
Sbjct: 123 FCGLMLETQDINISACHFYAKHHF 146


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_4650  PF05272       29     0.014    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.014
Identities = 17/70 (24%), Positives = 25/70 (35%), Gaps = 8/70 (11%)

Query: 36 CVVLHGHSGSGKSTLLRSLYANYLPDEGQIQIKHGDEWVDLVTAPARKVVEI------RK 89
VVL G G GKSTL+ +L + I G + + + E+ R+
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGTGKDSYEQIAGIV--AYELSEMTAFRR 655

Query: 90 TTVGWVSQFL 99
V F
Sbjct: 656 ADAEAVKAFF 665


70  EcE24377A_4678  EcE24377A_4687  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag        DNBias  CDNBias  %GCBias     Product
EcE24377A_4678  1       16       -4.684518   DNA-binding transcriptional activator DcuR
EcE24377A_4679  1       14       -4.984499   sensory histidine kinase DcuS
EcE24377A_4680  -1      16       -4.621894   hypothetical protein
EcE24377A_4681  -1      17       -4.484629   hypothetical protein
EcE24377A_4682  0       21       -3.830675   hypothetical protein
EcE24377A_4683  1       19       -4.440486   hypothetical protein
EcE24377A_4684  0       18       -3.799937   lysyl-tRNA synthetase
EcE24377A_4685  0       14       -2.534951   amino acid/peptide transporter
EcE24377A_4686  1       17       -2.922321   lysine decarboxylase-like protein, constitutive
EcE24377A_4687  2       18       -1.849096   lysine/cadaverine antiporter
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_4678  HTHFIS        70     4e-16    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 70.2 bits (172), Expect = 4e-16
Identities = 31/109 (28%), Positives = 50/109 (45%), Gaps = 4/109 (3%)

Query: 4 VLIIDDDAMVAELNRRYVAQIPGFQCCGTASTLEKAKEIIFNSDTPIDLILLDIYMQKEN 63
+L+ DDDA + + + +++ G+ S I + DL++ D+ M EN
Sbjct: 6 ILVADDDAAIRTVLNQALSRA-GYDVR-ITSNAATLWRWI--AAGDGDLVVTDVVMPDEN 61

Query: 64 GLDLLPVLHNARCKSDVIVISSAADAATIKDSLHYGVVDYLIKPFQASR 112
DLLP + AR V+V+S+ T + G DYL KPF +
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTE 110


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_4679  PF06580       41     7e-06    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 41.0 bits (96), Expect = 7e-06
Identities = 21/99 (21%), Positives = 38/99 (38%), Gaps = 18/99 (18%)

Query: 442 LIENALE-ALGP-EPGGEISVTLHYRHGWLHCEVNDDGPGIAPDKIDHIFDKGVSTKGSE 499
L+EN ++ + GG+I + +G + EV + G +
Sbjct: 263 LVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKN------------TKES 310

Query: 500 RGVGLALVKQQVENLGG---SIAVESEPGIFTQFFVQIP 535
G GL V+++++ L G I + + G V IP
Sbjct: 311 TGTGLQNVRERLQMLYGTEAQIKLSEKQGKVN-AMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_4681  SACTRNSFRASE  26     0.012    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 26.4 bits (58), Expect = 0.012
Identities = 9/28 (32%), Positives = 16/28 (57%)

Query: 32 LAIIEHTDVDESLKGQGIGKQLVAKVVE 59
A+IE V + + +G+G L+ K +E
Sbjct: 89 YALIEDIAVAKDYRKKGVGTALLHKAIE 116


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
EcE24377A_4685 TCRTETA 30 0.022 Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 30.2 bits (68), Expect = 0.022
Identities = 36/190 (18%), Positives = 66/190 (34%), Gaps = 14/190 (7%)

Query: 44 NHAISLFSAYA-SLVYVTPILGGWLADRLLGNRTAVIAGALLMTLGHVVLGIDTNSTFSL 102
H L + YA P+LG +DR G R ++ + + ++ + L
Sbjct: 43 AHYGILLALYALMQFACAPVLGAL-SDRF-GRRPVLLVSLAGAAVDYAIMAT-APFLWVL 99

Query: 103 YLALAIIICGYGLFKSNISCLLGELYDEND-HRRDGGFSLLYAAGNIGSIAAPIACGLAA 161
Y+ + G+ + + + D D R F + A G +A P+ GL
Sbjct: 100 YIGRIV----AGITGATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMG 155

Query: 162 QWYGWHVGFALAGGGMFIGLLIFLSGHRHFQSTRSMDKKALTSVKF-ALPVWSWLVVMLC 220
+ H F A + L FL+G + +++ L L + W M
Sbjct: 156 G-FSPHAPFFAAA---ALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTV 211

Query: 221 LAPVFFTLLL 230
+A + +
Sbjct: 212 VAALMAVFFI 221


71 EcE24377A_4716 EcE24377A_4741 Y Y Y Pathogenicity Island (biased-composition)
LocusTag DNBias CDNBias %GCBias Product
EcE24377A_4716-1133.229487hypothetical protein
EcE24377A_4717-2122.578311phosphatidylserine decarboxylase
EcE24377A_4718-1142.917145ribosome-associated GTPase
EcE24377A_4719-2152.929534oligoribonuclease
EcE24377A_4723-1143.510652***iron-sulfur cluster binding protein
EcE24377A_4724-1143.740981hypothetical protein
EcE24377A_4725-1152.787085ATPase
EcE24377A_4727-1142.848075N-acetylmuramoyl-L-alanine amidase
EcE24377A_47260133.014413hypothetical protein
EcE24377A_47281142.844787DNA mismatch repair protein
EcE24377A_47292191.913197tRNA delta(2)-isopentenylpyrophosphate
EcE24377A_47303231.570960RNA-binding protein Hfq
EcE24377A_47315251.986286GTPase HflX
EcE24377A_47325251.455702FtsH protease regulator HflK
EcE24377A_47335251.700765FtsH protease regulator HflC
EcE24377A_47354231.553017hypothetical protein
EcE24377A_47344232.087853hypothetical protein
EcE24377A_47363201.282174adenylosuccinate synthetase
EcE24377A_47374130.745625transcriptional repressor NsrR
EcE24377A_47385130.355744exoribonuclease R
EcE24377A_4739217-3.374058hypothetical protein
EcE24377A_4740218-3.06912123S rRNA (guanosine-2'-O-)-methyltransferase
EcE24377A_4741116-3.073880hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
EcE24377A_4716 GPOSANCHOR 51 2e-08 Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 50.8 bits (121), Expect = 2e-08
Identities = 50/312 (16%), Positives = 105/312 (33%), Gaps = 18/312 (5%)

Query: 121 SRQAQQEQERAREIADSLNQLPQQQTDARRQLNEIERRLGTLTGNTPLNQAQNFALQSDS 180
+ ++ QERA + N L + +D ++ LT L+ +
Sbjct: 49 TDTLEKVQERADKFEIENNTLKLKNSDLSFNNKALKDHNDELTE---ELSNAKEKLRKND 105

Query: 181 ARLKALVDEL-ELAQLSANNRQELARLRSELAEKES--QQLDAYLQALRNQLNSQRQLEA 237
L ++ EL A+ + L + + + L+A AL + + +
Sbjct: 106 KSLSEKASKIQELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALAAR-KADLEKAL 164

Query: 238 ERALESTELLAENSADLPKDIVAQFKINRELSAALNQQAQRMDLVASQQRQAASQTLQVR 297
E A+ + + L + A EL AL +++ + ++ +
Sbjct: 165 EGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALA 224

Query: 298 QALNTLREQSQWLGSSNLLGEALRAQVARLPEMPKPQQLDTEMAQLRVQRLRYEDLLNKQ 357
L + L + A A++ L + L+ A+L +
Sbjct: 225 ARKADLEKA---LEGAMNFSTADSAKIKTLEA--EKAALEARQAELEKALEGAMNFSTAD 279

Query: 358 PLLRQIHQADGQPLTAE------QNRILEAQLRTQRELLNSLLQGGDTLLLELTKLKVSN 411
+ +A+ L AE Q+++L A ++ R L++ + L E KL+ N
Sbjct: 280 SAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEAEHQKLEEQN 339

Query: 412 GQLEDALKEVNE 423
E + + +
Sbjct: 340 KISEASRQSLRR 351



Score = 42.7 bits (100), Expect = 7e-06
Identities = 48/239 (20%), Positives = 92/239 (38%), Gaps = 23/239 (9%)

Query: 20 ATAPDSKQITQELEQAKAAKPAQPEVVEALQSALNALEERKGSLER-IKQYQQVIDNYPK 78
A A + + LE A A ++ L++ ALE R+ LE+ ++
Sbjct: 222 ALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSA 281

Query: 79 LSATLRAQLNNMRDEPRSVSPGMSTDALNQEILQVSSQLLDKSRQAQQEQERAREIADSL 138
TL A+ + E + + Q + LD SR+A+++ E + +
Sbjct: 282 KIKTLEAEKAALEAEKADL---EHQSQVLNANRQSLRRDLDASREAKKQLEAEHQKLEEQ 338

Query: 139 NQLPQQQTDARRQLNEIERRLGTLTGNTPLNQAQNFALQSDSARLKAL--VDELELAQLS 196
N++ ++A RQ + R L L+++ +L+ + E L
Sbjct: 339 NKI----SEASRQ--SLRRDLDASR-------EAKKQLEAEHQKLEEQNKISEASRQSLR 385

Query: 197 AN---NRQELARLRSELAEKESQQLDAYLQALRNQLNSQRQLEAERALESTELLAENSA 252
+ +R+ ++ L E S +L A + + S++ E E+A +L AE A
Sbjct: 386 RDLDASREAKKQVEKALEEANS-KLAALEKLNKELEESKKLTEKEKAELQAKLEAEAKA 443


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
EcE24377A_4731 SECA 32 0.005 SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 32.2 bits (73), Expect = 0.005
Identities = 26/144 (18%), Positives = 54/144 (37%), Gaps = 6/144 (4%)

Query: 282 HVIDAADVRVQENIEAVNTVLEEIDAHEIPTLLVMNKIDMLEDFEPRIDRDEENK-PIRV 340
++D +DV N + IDA+ P L ++ + + R+ D + PI
Sbjct: 665 ELLDVSDVSETINSIREDVFKATIDAYIPPQSL--EEMWDIPGLQERLKNDFDLDLPIAE 722

Query: 341 WLSAQTGAGIPQLFQALTERLSGEVAQHTLRLPPQEGRLRSRFYQLQAIEKEWMEEDGSV 400
WL + L + + + + + + R + LQ ++ W E ++
Sbjct: 723 WLDKEPELHEETLRERILAQSIEVYQRKEEVVGAEMMRHFEKGVMLQTLDSLWKEHLAAM 782

Query: 401 SLQVRMPIVDWRRLCKQEPALIDY 424
+R I R +++P +Y
Sbjct: 783 D-YLRQGIH-LRGYAQKDP-KQEY 803


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
EcE24377A_4732 cloacin 32 0.006 Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 31.6 bits (71), Expect = 0.006
Identities = 25/81 (30%), Positives = 30/81 (37%), Gaps = 10/81 (12%)

Query: 17 GSSKPGGNSEGNGNKGGRDQGPPDLDDIFRKLSKKLGGLGGGKGTGSGGGSSSQGP---- 72
S G +SE N GG G G GGG GTG G S+ P
Sbjct: 33 ASDGSGWSSENNPWGGGSGSGIHWGGGSGHGNGGGNGNSGGGSGTG-GNLSAVAAPVAFG 91

Query: 73 -----RPQLGGRVVTIAAAAI 88
P GG V+I+A A+
Sbjct: 92 FPALSTPGAGGLAVSISAGAL 112


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
EcE24377A_4738 RTXTOXIND 31 0.029 Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 30.6 bits (69), Expect = 0.029
Identities = 12/55 (21%), Positives = 24/55 (43%), Gaps = 1/55 (1%)

Query: 165 VVPDDSRLSFDILIPPDQIMGARMGFVVVVELTQRPTRRTKAV-GKIVEVLGDNM 218
+VP+D L L+ I +G ++++ P R + GK+ + D +
Sbjct: 359 IVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDAI 413


72 EcE24377A_4822 EcE24377A_4914 Y Y Y Pathogenicity Island (biased-composition)
LocusTag DNBias CDNBias %GCBias Product
EcE24377A_4822-120-4.966295TetR family transcriptional regulator
EcE24377A_4823-217-1.529824hypothetical protein
EcE24377A_4825-216-1.241928ornithine carbamoyltransferase subunit I
EcE24377A_4826-119-1.140029hypothetical protein
EcE24377A_4827-116-0.012021acetyltransferase
EcE24377A_4828-1170.454756hypothetical protein
EcE24377A_48291253.213866valyl-tRNA synthetase
EcE24377A_4830-2152.640579DNA polymerase III subunit chi
EcE24377A_4831-2142.715902hypothetical protein
EcE24377A_4832-2152.507938leucyl aminopeptidase
EcE24377A_4834-2162.103381putative permease
EcE24377A_4835-2161.887943permease
EcE24377A_48360191.464162hypothetical protein
EcE24377A_48371211.190050zinc-binding dehydrogenase oxidoreductase
EcE24377A_48393270.274752*hypothetical protein
EcE24377A_48412271.120248group II intron-encoded reverse
EcE24377A_48431280.604369IS629, transposase orfA
EcE24377A_4844228-2.717483IS100, transposition helper protein, truncation
EcE24377A_4845124-2.241491TetR family transcriptional regulator
EcE24377A_4846223-2.080353IS3, transposase orfA
EcE24377A_4847223-3.162292IS3, transposase orfB
EcE24377A_4848326-5.149819hypothetical protein
EcE24377A_4849428-5.888872hypothetical protein
EcE24377A_4850329-6.449832BCCT family transporter
EcE24377A_4851536-8.295322ISSd1, transposase orfA/B, fusion
EcE24377A_4852432-6.911649lipid A biosynthesis (KDO)2-(lauroyl)-lipid IVA
EcE24377A_4853125-3.116135hypothetical protein
EcE24377A_4854122-1.189134glycoside hydrolase family protein
EcE24377A_48552220.729782polysaccharide deacetylase
EcE24377A_48563253.210545IS911, transposase orfA
EcE24377A_48583263.637703group II intron-encoded reverse
EcE24377A_48604245.341835IS66 family transposase
EcE24377A_48615225.587935IS66 family orf2
EcE24377A_48624205.505726IS66 family orf1
EcE24377A_48631236.400303hypothetical protein
EcE24377A_48641236.739649iron-dicitrate transporter ATP-binding subunit
EcE24377A_48652256.931973iron-dicitrate transporter subunit FecD
EcE24377A_48662266.471596iron-dicitrate transporter permease subunit
EcE24377A_48671265.427372iron-dicitrate transporter substrate-binding
EcE24377A_48680275.173060iron(III) dicitrate transport protein FecA
EcE24377A_48692222.328865fec operon regulator FecR
EcE24377A_48701190.491676RNA polymerase sigma factor FecI
EcE24377A_48711201.000260IS1, transposase orfA
EcE24377A_48730222.646110hypothetical protein
EcE24377A_48741233.374451IS629, transposase orfB, truncation
EcE24377A_48751243.184706hypothetical protein
EcE24377A_48761233.397032hypothetical protein
EcE24377A_48784243.815980IS66 family orf2
EcE24377A_48795243.764593IS66 family transposase
EcE24377A_48804243.084424IS66 family orf2
EcE24377A_48814243.111601IS66 family orf1
EcE24377A_48835212.901957hypothetical protein
EcE24377A_48855232.664098hypothetical protein
EcE24377A_4884524-2.441958hypothetical protein
EcE24377A_4886726-1.300021hemolysin expression modulating protein
EcE24377A_48878242.218090prophage CP4-57 regulatory protein AlpA
EcE24377A_48888242.006911hypothetical protein
EcE24377A_48897262.514575hypothetical protein
EcE24377A_48907242.781494hypothetical protein
EcE24377A_48919255.303372GTPase
EcE24377A_48929254.990449antigen 43
EcE24377A_48937274.121392hypothetical protein
EcE24377A_48948284.558935hypothetical protein
EcE24377A_489510294.209145antirestriction protein
EcE24377A_48969293.278547RadC family DNA repair protein
EcE24377A_48979271.906753hypothetical protein
EcE24377A_4898321-4.405495hypothetical protein
EcE24377A_4899128-9.397560hypothetical protein
EcE24377A_4900028-10.536154hypothetical protein
EcE24377A_4901028-10.345213hypothetical protein
EcE24377A_4902129-10.811792hypothetical protein
EcE24377A_4903129-10.292433type III restriction enzyme, res subunit
EcE24377A_4904131-10.808165N4/N6-methyltransferase
EcE24377A_4905-127-8.449039hypothetical protein
EcE24377A_4906127-5.773144hypothetical protein
EcE24377A_4907130-5.942924hypothetical protein
EcE24377A_4908125-4.766435hypothetical protein
EcE24377A_4909120-3.977649N-acetylneuraminic acid mutarotase
EcE24377A_4910119-2.353449oligogalacturonate-specific porin protein KdgM
EcE24377A_49113200.040813hypothetical protein
EcE24377A_4912219-0.099243IS1, transposase orfA
EcE24377A_4913219-0.208741IS1, transposase orfB
EcE24377A_49143130.065184outer membrane usher protein FimD, truncation
ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
EcE24377A_4822 HTHTETR 50 5e-10 TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 50.4 bits (120), Expect = 5e-10
Identities = 20/117 (17%), Positives = 43/117 (36%), Gaps = 7/117 (5%)

Query: 5 KQSRVPGRPRRFAPEQAISAAKVLFHQKGFDAVSVAEVTDYLGINPPSLYAAFGSKAGLF 64
++++ + R + + A LF Q+G + S+ E+ G+ ++Y F K+ LF
Sbjct: 3 RKTKQEAQETR---QHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLF 59

Query: 65 SRVLNEYVGT----EAIPLADILRDDRPVGECLVEVLKEAARRYSQNGGCAGCMVLE 117
S + E A D V ++ + E+ + + +
Sbjct: 60 SEIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHK 116


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
EcE24377A_4827 SACTRNSFRASE 32 4e-04 Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 32.2 bits (73), Expect = 4e-04
Identities = 15/48 (31%), Positives = 18/48 (37%)

Query: 97 PAIRGKGLAKKLALMAMEQAREMGFKRCYLETTAFLKEAIALYEHLGF 144
R KG+ L A+E A+E F LET A Y F
Sbjct: 99 KDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHF 146


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
EcE24377A_4845 HTHTETR 56 2e-13 TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 55.8 bits (134), Expect = 2e-13
Identities = 19/56 (33%), Positives = 30/56 (53%)

Query: 11 PRKTESTYADTRNDLIRSGLELLTQNGFLATGVDAIVKNANVPKGSFYYYLKSKED 66
RKT+ +TR ++ L L +Q G +T + I K A V +G+ Y++ K K D
Sbjct: 2 ARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSD 57


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
EcE24377A_4867 FERRIBNDNGPP 65 5e-14 Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 64.6 bits (157), Expect = 5e-14
Identities = 44/240 (18%), Positives = 91/240 (37%), Gaps = 13/240 (5%)

Query: 38 TPQRIVVLELSFADALAAVDVSPIGIADDNDAKRILPEVRAHLKPWQSVGTRAQPSLEAI 97
P RIV LE + L A+ + P G+AD + + + E VG R +P+LE +
Sbjct: 34 DPNRIVALEWLPVELLLALGIVPYGVADTINYRLWVSE-PPLPDSVIDVGLRTEPNLELL 92

Query: 98 AALKPDLIIADSSRHAGVYIALQQIAPVLLLKSR--NETYAENLQSAAIIGEMVGKKREM 155
+KP ++ S+ + L +IAP + A +S + +++ +
Sbjct: 93 TEMKPSFMVW-SAGYGPSPEMLARIAPGRGFNFSDGKQPLAMARKSLTEMADLLNLQSAA 151

Query: 156 QARLEQHKERMAQWASQLPKGTR---VAFGTSREQQFNLHTQETWTGSVLASLGLNVPAA 212
+ L Q+++ + + K + + + + +L G +P A
Sbjct: 152 ETHLAQYEDFIRSMKPRFVKRGARPLLLTTLIDPRHMLVFGPNSLFQEILDEYG--IPNA 209

Query: 213 MAGSS----MPSIGLEQLLAVNPAWLLVAHYREESIVKRWQQDPLWQMLTAAQKQQVASV 268
G + ++ +++L A +L + + PLWQ + + + V
Sbjct: 210 WQGETNFWGSTAVSIDRLAAYKDVDVLCFDHDNSKDMDALMATPLWQAMPFVRAGRFQRV 269


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
EcE24377A_4868 ECOLNEIPORIN 33 0.005 E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 32.9 bits (75), Expect = 0.005
Identities = 19/89 (21%), Positives = 29/89 (32%), Gaps = 9/89 (10%)

Query: 546 GSFGTVQYSQIGKAVQSGNVEPEKARTWELGTRYDDGALTAEMGLFLINFNNQYDSNQTN 605
G F + NV EK + L + YD+ AL A + Q D+
Sbjct: 187 GFFVQYGGAYKRHHQVQENVNIEKYQIHRLVSGYDNDALYASV------AVQQQDAKLVE 240

Query: 606 DTVTARGKTRHTGLETQARYDLGTLTPTL 634
+ T + Y G +TP +
Sbjct: 241 E---NYSHNSQTEVAATLAYRFGNVTPRV 266


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
EcE24377A_4879 CHANLCOLICIN 31 0.014 Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 30.8 bits (69), Expect = 0.014
Identities = 25/91 (27%), Positives = 42/91 (46%), Gaps = 5/91 (5%)

Query: 4 SLAHENARLRALLQTQQDTIRQMAEYNRLLSQRVAAYASEINRLKALVAKLQRMQFGKSS 63
+ A A AL Q +D + + +N + A N A+ A+ +R++ K+
Sbjct: 79 AQAKAKANRDALTQRLKDIVNEALRHNASRTPSATELAHANN--AAMQAEDERLRLAKAE 136

Query: 64 EKLR---AKTERQIQEAQERISALQEEMAET 91
EK R E+ QEA++R ++ E AET
Sbjct: 137 EKARKEAEAAEKAFQEAEQRRKEIEREKAET 167


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
EcE24377A_4892 PRTACTNFAMLY 33 0.008 Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 32.7 bits (74), Expect = 0.008
Identities = 166/827 (20%), Positives = 253/827 (30%), Gaps = 117/827 (14%)

Query: 38 VALSLAAVTSVPVLAAD----TVVQAGETVSGGTLTNHDNQIVFGTANGMTISTG----- 88
+A++L A+ + P AD ++V+ GE G + D V TA+G TI
Sbjct: 19 LAMALGALGAAPAAHADWNNQSIVKTGERQHGIHIQGSDPGGV-RTASGTTIKVSGRQAQ 77

Query: 89 ---LEYGPDNEANTGGQWIQNGGIANNTTVTGGGLQRVNAGGSVSDTVISAGGGQSLQGQ 145
LE G +G ++++ G V AG V+D A G +
Sbjct: 78 GILLENPAAELQFRNGSVTSSGQLSDDGIRRFLGTVTVKAGKLVADHATLANVGDTWDDD 137

Query: 146 AVNTTLNGGEQWVHEGGIA---TGTVINEKGWQAVKSGAMATDTVVNTGAEGGPDAENGD 202
+ + G + G V E+G + D ++ GA E+
Sbjct: 138 GIALYVAGEQAQASIADSTLQGAGGVQIERGANVTVQRSAIVDGGLHIGALQSLQPEDLP 197

Query: 203 TGQFVRGNAVRTTINENGRQIVAAEGTANTTVVYAGGDQTVHGHALDTTLNGGYQYVHNG 262
+ V + T V A G V + T+ G + G +
Sbjct: 198 PSRVVLRDTNVTA--------VPASGAPAAVSVLGASELTLDGGHITGGRAAGVAAMQGA 249

Query: 263 GTASDTVVNSDGWQIVKEGGLADFTTVNQKGKLQVNAGGTATNVTLKQGGALVTSTAATV 322
G GG V G + G L G V + ++V
Sbjct: 250 VVHLQRATIRRGDA--PAGGAVPGGAV-PGGAVPGGFGPGGFGPVL-DGWYGVDVSGSSV 305

Query: 323 TGSNRLGNFTVENGNADGVVLESGGRLDVLEGHSAWKTLVDDGGTLAVSAGGKATDVTMT 382
L VE A G + V G GG+L+ G +V T
Sbjct: 306 ----ELAQSIVE---APE----LGAAIRVGRGARV----TVSGGSLSAPHG----NVIET 346

Query: 383 SGSALIADSGATVE-----GTNASGKFSIDGTSGQASGLLLENG----GSFTVNAGGLAS 433
G+ A A + G +A GK + + L L G G
Sbjct: 347 GGARRFAPQAAPLSITLQAGAHAQGKALLYRVLPEPVKLTLTGGADAQGDIVATELPSIP 406

Query: 434 NTTVGHRGTLTLAAGGSLSGRTQLSKGASMVLNGDVVSTGDIVNAGEIRFDNQTTPDAAL 493
T++G + LA+ +G T+ S+ N V T + N G +R + + D
Sbjct: 407 GTSIG-PLDVALASQARWTGATRAVDSLSID-NATWVMTDN-SNVGALRLASDGSVDFQQ 463

Query: 494 SRAVAKGDSPVTFHKLTTSNLTGQGGTINMRVRLDGSNTSDQLVINGGQATGKTWLAFTN 553
+ F LT + L G G M V D SD+LV+ A+G+ L N
Sbjct: 464 PAEAGR------FKVLTVNTLAGS-GLFRMNVFAD-LGLSDKLVVMQD-ASGQHRLWVRN 514

Query: 554 VGNSNLGVATSGQGIRVVDAQNGATTEEGAFALSRPLQAGAFNYTLNRDSDEDWYLRSEN 613
G+ S + +V G+ + G + Y L + + W L
Sbjct: 515 SGSE----PASANTLLLVQTPLGSAATFTLANKDGKVDIGTYRYRLAANGNGQWSLVGAK 570

Query: 614 AYRAEVPLY-----------------------TSMLTQAMDYDRILAGSRSHQTGVNGEN 650
A A P L+ A + G T E+
Sbjct: 571 APPAPKPAPQPGPQPPQPPQPQPEAPAPQPPAGRELSAAANAAVNTGGVGLASTLWYAES 630

Query: 651 NSVRLSIQGGHLGHDNNGGIARGATPESSGSYGFVRLEGDLLRTEVAGMSL--TTGVHGA 708
N++ + L D G RG R +VAG L V A
Sbjct: 631 NALSKRLGELRLNPDAGGAWGRGFAQRQQLDNRAGR----RFDQKVAGFELGADHAVAVA 686

Query: 709 AGHSSVDV------KDDDGSRAGTVRDDAGSLGGYLNLTHTSSGLWADIVAQGTRH---- 758
G + D + G D+ +GGY SG + D + +R
Sbjct: 687 GGRWHLGGLAGYTRGDRGFTGDGGGHTDSVHVGGYATYI-ADSGFYLDATLRASRLENDF 745

Query: 759 ---SMKASSDNNDFRARGWGWLGSLETGLPFSITDNLMLEPQLQYTW 802
+ +R G G SLE G F+ D LEPQ +
Sbjct: 746 KVAGSDGYAVKGKYRTHGVGA--SLEAGRRFTHADGWFLEPQAELAV 790


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
EcE24377A_4903 PF07201 31 0.026 Hypersensitivity response secretion protein HrpJ
		>PF07201#Hypersensitivity response secretion protein HrpJ

Length = 293

Score = 30.6 bits (69), Expect = 0.026
Identities = 16/86 (18%), Positives = 38/86 (44%), Gaps = 5/86 (5%)

Query: 865 LEKWAEDMIFAAEEALRDTKMQIKSLKREARLAQSMEEQKQNQEKLKLLERQQKRQRMEI 924
+ AE++ F E + + K +AR++ E+ Q K+ LE++Q +
Sbjct: 49 IADMAEEVTFVFSERKELSLDKRKLSDSQARVSDVEEQVNQYLSKVPELEQKQNVSEL-- 106

Query: 925 FDIEDEIADKRDELISALEERMKQKT 950
+++ + +S L+ ++ K+
Sbjct: 107 ---LSLLSNSPNISLSQLKAYLEGKS 129


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
EcE24377A_4914 PF00577 352 e-117 Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 352 bits (904), Expect = e-117
Identities = 283/285 (99%), Positives = 283/285 (99%)

Query: 2 LALNVNIPFSHWLRSDSKSQWRHASASYSMSHDLNGRMTNLVGVYGTLLEDNNLSYSVQT 61
LALNVNIPFSHWLRSDSKSQWRHASASYSMSHDLNGRMTNL GVYGTLLEDNNLSYSVQT
Sbjct: 594 LALNVNIPFSHWLRSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQT 653

Query: 62 GYAGGGDGNSGSTGYATLNYRGGYGNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQP 121
GYAGGGDGNSGSTGYATLNYRGGYGNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQP
Sbjct: 654 GYAGGGDGNSGSTGYATLNYRGGYGNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQP 713

Query: 122 LNDTVVLVKAPGAKDVKVENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDN 181
LNDTVVLVKAPGAKD KVENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDN
Sbjct: 714 LNDTVVLVKAPGAKDAKVENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDN 773

Query: 182 AVANVVPTRGAIVRAEFKARVGIKLLMTLTHNNKPLPFGAMVTSESSQSSGIVADNGQVY 241
AVANVVPTRGAIVRAEFKARVGIKLLMTLTHNNKPLPFGAMVTSESSQSSGIVADNGQVY
Sbjct: 774 AVANVVPTRGAIVRAEFKARVGIKLLMTLTHNNKPLPFGAMVTSESSQSSGIVADNGQVY 833

Query: 242 LSGMPLAGKVQVKWGEEENAHCVANYQLPPESQQQLLTQLSAECR 286
LSGMPLAGKVQVKWGEEENAHCVANYQLPPESQQQLLTQLSAECR
Sbjct: 834 LSGMPLAGKVQVKWGEEENAHCVANYQLPPESQQQLLTQLSAECR 878


73 EcE24377A_4980 EcE24377A_4987 Y Y N Pathogenicity Island (biased-composition)
LocusTag DNBias CDNBias %GCBias Product
EcE24377A_49801273.294499deoxyribose-phosphate aldolase
EcE24377A_4981-1243.532126thymidine phosphorylase
EcE24377A_4982-1224.100209phosphopentomutase
EcE24377A_4983-3183.295380purine nucleoside phosphorylase
EcE24377A_4985-1203.561718lipoate-protein ligase A
EcE24377A_4986-2173.112758hypothetical protein
EcE24377A_4987-2173.024710phosphoserine phosphatase
ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
EcE24377A_4987 FLGMRINGFLIF 30 0.022 Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 29.5 bits (66), Expect = 0.022
Identities = 21/71 (29%), Positives = 33/71 (46%), Gaps = 2/71 (2%)

Query: 123 QIECIDEIAKLAGTGEMVAEVTERAMRGELDFTASLRSRVATLK-GADANILQQVRENLP 181
Q+ E AK A V + TE A+ L L+ R A + GA+ + Q++RE
Sbjct: 482 QLTRRVEEAKAAQEQAQVRQETEEAVEVRLSKDEQLQQRRANQRLGAEV-MSQRIREMSD 540

Query: 182 LMPGLTQLVLK 192
P + LV++
Sbjct: 541 NDPRVVALVIR 551


74 EcE24377A_0421 EcE24377A_0428 N Y N Pathogenicity Island (unbiased-composition)
LocusTag DNBias CDNBias %GCBias Product
EcE24377A_04213151.409381fructokinase
EcE24377A_04223161.247683hypothetical protein
EcE24377A_04232141.185339MFS transport protein AraJ
EcE24377A_04243131.469538exonuclease SbcC
EcE24377A_0425-2150.286157exonuclease SbcD
EcE24377A_0426-1151.290435hypothetical protein
EcE24377A_0427-1131.443482transcriptional regulator PhoB
EcE24377A_04280141.761030phosphate regulon sensor protein
ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
EcE24377A_0421 ACETATEKNASE 30 0.016 Acetate kinase family signature.
		>ACETATEKNASE#Acetate kinase family signature.

Length = 400

Score = 29.8 bits (67), Expect = 0.016
Identities = 17/69 (24%), Positives = 29/69 (42%), Gaps = 10/69 (14%)

Query: 187 FISGTGFATDYRRLSGHALKGSEIISLVEESDPVAELALRRYELRLAKSLAHVVNILDP- 245
+G ++D+R L A + D A+LAL + R+ K++ +
Sbjct: 273 VYGISGISSDFRDLEDAAF---------KNGDKRAQLALNVFAYRVKKTIGSYAAAMGGV 323

Query: 246 DVIVLGGGM 254
DVIV G+
Sbjct: 324 DVIVFTAGI 332


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
EcE24377A_0423 TCRTETA 51 4e-09 Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 51.0 bits (122), Expect = 4e-09
Identities = 74/356 (20%), Positives = 126/356 (35%), Gaps = 35/356 (9%)

Query: 33 ILSLALGTFGLGMAEFGIMGVLTELAHNVGISIPAAGH---MISYYALGVVVGAPIIALF 89
+ ++AL G+G+ IM VL L ++ S H +++ YAL AP++
Sbjct: 11 LSTVALDAVGIGL----IMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGAL 66

Query: 90 SSRYSLKHILLFLVALCVIGNAMFTLSSSYLMLAIGRLVSGFPHGAFFGVGAIVLSKIIK 149
S R+ + +LL +A + A+ + +L IGR+V+G GA + I
Sbjct: 67 SDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIAD--IT 124

Query: 150 PGKVTAAVAGMVSGMTVANLLGIPLGTYLSQEFSWRYTFLLIAVFNIAVMASVYFWVPDI 209
G A G +S ++ P+ L FS F A N + F +P+
Sbjct: 125 DGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPES 184

Query: 210 RDEAKGKLREQ----------FHFLRSPAPWLI--FAATMFGNAGVFAWFSYVKPYMMFI 257
+ LR + + A + F + G W +
Sbjct: 185 HKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFG------E 238

Query: 258 SGFSETAMTFIMMLVGLGM---VLGNMLSGRISGRYSPLRIAAVTDFIIVLALLMLFFCG 314
F A T + L G+ + M++G ++ R R + ++L F
Sbjct: 239 DRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFAT 298

Query: 315 GMKTTSLIFAFICCAGLFALSAPLQILLLQNAKGGELLGAAGGQIAF--NLGSAVG 368
I + G+ LQ +L + E G G +A +L S VG
Sbjct: 299 RGWMAFPIMVLLASGGIG--MPALQAMLSRQV-DEERQGQLQGSLAALTSLTSIVG 351


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
EcE24377A_0424 RTXTOXIND 40 5e-05 Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 39.8 bits (93), Expect = 5e-05
Identities = 34/199 (17%), Positives = 71/199 (35%), Gaps = 14/199 (7%)

Query: 671 QQEAQSWQQRQNELTALQNRIQQLTPILETLPQSDDLPHSEDTVALDNWRQVHEQCLALH 730
+ + Q + Q R Q L+ +E + E + +V +
Sbjct: 133 EADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIK 192

Query: 731 SQQQTLQQQDVLAAQSLQKAQAQFDTAL--------QASVFDDQQAFLAALMDEQTLTQL 782
Q T Q Q +L K +A+ T L + V + ++L+ +Q +
Sbjct: 193 EQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIA-- 250

Query: 783 EQLKQNLENQRRQAQTLVTQTAETLAQHQQHRPDGLALTVTVEQIQQEL-AQTHQKLREN 841
K + Q + V + +Q +Q + L+ + + Q + KLR+
Sbjct: 251 ---KHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQT 307

Query: 842 TTSQGEIRQQLKQDADNRQ 860
T + G + +L ++ + +Q
Sbjct: 308 TDNIGLLTLELAKNEERQQ 326



Score = 39.4 bits (92), Expect = 6e-05
Identities = 25/204 (12%), Positives = 59/204 (28%), Gaps = 18/204 (8%)

Query: 487 EARIKTLEAQRAQLQAGQPCPLCGSTSHPAVEAYQALEPGVNQSRLLALENEVKKLGEEG 546
EA ++ Q + Q ++E + E + +E + L
Sbjct: 133 EADTLKTQSSLLQARLEQ---TRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLT- 188

Query: 547 AALRGQLDALTKQLQRDENEAQSLRQDEQALTQQWQAVTASLNITLQPQDDIQPWLDAQD 606
+ ++ Q Q + E R + + + + DD L Q
Sbjct: 189 SLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQA 248

Query: 607 -------EHERQL-RLLSQRHELQGQIAAHNQQIIQYQQQIEQRQQQLLTALAGYALTLP 658
E E + +++ + Q+ +I+ +++ + Q L
Sbjct: 249 IAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLF------KNEILD 302

Query: 659 QEDEEESWLATRQQEAQSWQQRQN 682
+ + + E ++RQ
Sbjct: 303 KLRQTTDNIGLLTLELAKNEERQQ 326



Score = 32.5 bits (74), Expect = 0.008
Identities = 16/150 (10%), Positives = 42/150 (28%), Gaps = 5/150 (3%)

Query: 731 SQQQTLQQQDVLAAQSLQKAQAQFDTA----LQASVFDDQQAFLAALMDEQTLTQLEQLK 786
+ Q + A + Q + L D+ F +E+ L +K
Sbjct: 134 ADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVS-EEEVLRLTSLIK 192

Query: 787 QNLENQRRQAQTLVTQTAETLAQHQQHRPDGLALTVTVEQIQQELAQTHQKLRENTTSQG 846
+ + Q + A+ + L L + ++
Sbjct: 193 EQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKH 252

Query: 847 EIRQQLKQDADNRQQQQTLMQQIAQMTQQV 876
+ +Q + + + + Q+ Q+ ++
Sbjct: 253 AVLEQENKYVEAVNELRVYKSQLEQIESEI 282


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
EcE24377A_0427 HTHFIS 95 1e-24 FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 94.9 bits (236), Expect = 1e-24
Identities = 33/149 (22%), Positives = 62/149 (41%), Gaps = 9/149 (6%)

Query: 4 RILVVEDEAPIREMVCFVLEQNGFQPVEAEDYDSAVNQLNEPWPDLILLDWMLPGGSGIQ 63
ILV +D+A IR ++ L + G+ + + + DL++ D ++P +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 64 FIKHLKRESMTRDIPVVMLTARGEEEDRVRGLETGADDYITKPFSPKELVARIKAVMRRI 123
+ +K+ D+PV++++A+ ++ E GA DY+ KPF EL+ I +
Sbjct: 65 LLPRIKKARP--DLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA-- 120

Query: 124 SPMAVEEVIEMQGLSLDPTSHRVMAGEEP 152
E L D + G
Sbjct: 121 -----EPKRRPSKLEDDSQDGMPLVGRSA 144


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
EcE24377A_0428 PF06580 34 9e-04 Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 34.1 bits (78), Expect = 9e-04
Identities = 19/105 (18%), Positives = 33/105 (31%), Gaps = 26/105 (24%)

Query: 325 LVYNAVNH----TPEGTHITVRWQRVPHGAEFSVEDNGPGIAPEHIPRLTERFYRVDKAR 380
LV N + H P+G I ++ + VE+ G
Sbjct: 263 LVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKN---------------- 306

Query: 381 SRQTGGSGLGLAIVKHAVNH---HESRLNIESTVGKGTRFSFVIP 422
+G GL V+ + E+++ + GK +IP
Sbjct: 307 --TKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAM-VLIP 348


75 EcE24377A_0468 EcE24377A_0476 N Y N Pathogenicity Island (unbiased-composition)
LocusTag DNBias CDNBias %GCBias Product
EcE24377A_04680190.137868muropeptide transporter
EcE24377A_0469329-0.673473hypothetical protein
EcE24377A_0470425-0.139735transcriptional regulator BolA
EcE24377A_0471327-0.086320hypothetical protein
EcE24377A_04723260.211675trigger factor
EcE24377A_04730200.380671ATP-dependent Clp protease proteolytic subunit
EcE24377A_04741210.124564ATP-dependent protease ATP-binding subunit ClpX
EcE24377A_04750180.044165DNA-binding ATP-dependent protease La
EcE24377A_0476-1120.148872transcriptional regulator HU subunit beta
ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
EcE24377A_0468 TCRTETA 39 3e-05 Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 39.0 bits (91), Expect = 3e-05
Identities = 71/347 (20%), Positives = 135/347 (38%), Gaps = 20/347 (5%)

Query: 62 KFLWSPLMDRYTPPFFGRRRGWLLATQILLLVAIAAMGFLEPGTQLRWMAALAVVIAFCS 121
+F +P++ + F RR LL + V A M W+ + ++A +
Sbjct: 56 QFACAPVLGALSDRF--GRRPVLLVSLAGAAVDYAIMAT----APFLWVLYIGRIVAGIT 109

Query: 122 ASQDIVFDAWKTDVLPAEERGAGAAISVLGYRLGMLVSGGLALWLADKWLGWQGMYWLMA 181
+ V A+ D+ +ER + GM+ L + ++ A
Sbjct: 110 GATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGG--FSPHAPFFAAA 167

Query: 182 AL-LIPCIIATLLAPEP--TDTIPVPKTLEQAVVAPLRDFFGRNNAWLILLLIVLYKLGD 238
AL + + L PE + P+ + + + A L+ + ++ +G
Sbjct: 168 ALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQ 227

Query: 239 AFAMSLTTTFLIRGVGFDAGEVGVVNKTLGLLATIVGALYGGILMQRLSLFRALLIFGIL 298
A +L F +DA +G+ G+L ++ A+ G + RL RAL+ G++
Sbjct: 228 VPA-ALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALM-LGMI 285

Query: 299 QGASNAGYWLLSITDKHLYSMGAAVFFENLCGGMGTSAFVALLMTLCNKSFSATQFALLS 358
A GY LL+ + + V GG+G A A+L ++ L+
Sbjct: 286 --ADGTGYILLAFATRGWMAFPIMVLL--ASGGIGMPALQAMLSRQVDEERQGQLQGSLA 341

Query: 359 ALSAVGRVYVGPVAGWFVEAHGWSTF--YLFSVAAAVPGLILLLVCR 403
AL+++ + VGP+ + A +T+ + + AA+ L L + R
Sbjct: 342 ALTSLTSI-VGPLLFTAIYAASITTWNGWAWIAGAALYLLCLPALRR 387


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
EcE24377A_0469 PF06291 27 0.027 Lambda prophage Bor protein
		>PF06291#Lambda prophage Bor protein

Length = 102

Score = 26.5 bits (58), Expect = 0.027
Identities = 11/34 (32%), Positives = 18/34 (52%)

Query: 3 KKILFPLVALFMLAGCAKPPTTIEVSPTITLPQQ 36
KK+LF ++ GCA+ T+ PT P++
Sbjct: 7 KKMLFSAALAMLITGCAQQTFTVGNKPTAVTPKE 40


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
EcE24377A_0474 HTHFIS 29 0.043 FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 29.0 bits (65), Expect = 0.043
Identities = 16/73 (21%), Positives = 29/73 (39%), Gaps = 13/73 (17%)

Query: 60 ERSALPTPHEIRNHLDDYVIGQEQAKKVLAVAVYNHYKRLRNGDTSNGVELGKSNILLIG 119
E P+ E + ++G+ A + +Y RL D +++ G
Sbjct: 121 EPKRRPSKLEDDSQDGMPLVGRSAAMQ----EIYRVLARLMQTD---------LTLMITG 167

Query: 120 PTGSGKTLLAETL 132
+G+GK L+A L
Sbjct: 168 ESGTGKELVARAL 180


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
EcE24377A_0475 GPOSANCHOR 34 0.002 Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 34.3 bits (78), Expect = 0.002
Identities = 34/133 (25%), Positives = 68/133 (51%), Gaps = 15/133 (11%)

Query: 191 ERLEYLMAMMESEIDLLQVEKRIRNRVKKQMEKSQREYYLNEQMKAIQKELGEMDDAPD- 249
LE A +E + +L R +++ ++ S+ +Q++A ++L E + +
Sbjct: 291 AALEAEKADLEHQSQVLNAN---RQSLRRDLDASREAK---KQLEAEHQKLEEQNKISEA 344

Query: 250 ENEALKRKIDAAKMPKEAKEKAEAELQKLKMMSPMS-AEATVVRGYIDWMVQVPWNARSK 308
++L+R +DA++ EAK++ EAE QKL+ + +S A +R +D + A+ +
Sbjct: 345 SRQSLRRDLDASR---EAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASRE----AKKQ 397

Query: 309 VKKDLRQAQEILD 321
V+K L +A L
Sbjct: 398 VEKALEEANSKLA 410


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
EcE24377A_0476 DNABINDINGHU 117 3e-38 Prokaryotic integration host factor signature.
		>DNABINDINGHU#Prokaryotic integration host factor signature.

Length = 91

Score = 117 bits (294), Expect = 3e-38
Identities = 49/88 (55%), Positives = 67/88 (76%)

Query: 2 NKSQLIDKIAAGADISKAAAGRALDAIIASVTESLKEGDDVALVGFGTFAVKERAARTGR 61
NK LI K+A +++K + A+DA+ ++V+ L +G+ V L+GFG F V+ERAAR GR
Sbjct: 3 NKQDLIAKVAEATELTKKDSAAAVDAVFSAVSSYLAKGEKVQLIGFGNFEVRERAARKGR 62

Query: 62 NPQTGKEITIAAAKVPSFRAGKALKDAV 89
NPQTG+EI I A+KVP+F+AGKALKDAV
Sbjct: 63 NPQTGEEIKIKASKVPAFKAGKALKDAV 90


76  EcE24377A_0492  EcE24377A_0500  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
EcE24377A_0492  0       13       -0.729072  cyclic diguanylate phosphodiesterase
EcE24377A_0493  2       15       -0.565081  hypothetical protein
EcE24377A_0494  1       15       -0.540461  maltose O-acetyltransferase
EcE24377A_0495  1       15       -0.054416  hypothetical protein
EcE24377A_0496  1       15       0.467572   acriflavine resistance protein B
EcE24377A_0497  2       12       -0.439134  acriflavine resistance protein A
EcE24377A_0498  2       13       -0.168869  DNA-binding transcriptional repressor AcrR
EcE24377A_0499  3       16       0.699200   hypothetical protein
EcE24377A_0500  3       16       0.961529   potassium efflux protein KefA
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_0492  BCTERIALGSPF  30     0.031    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 29.8 bits (67), Expect = 0.031
Identities = 31/137 (22%), Positives = 54/137 (39%), Gaps = 24/137 (17%)

Query: 247 IWLPLGLVIGLLAAMFVLRILRRIQSPHHRLQDAIENRDICVHYQPIVSLANGKIVGAEA 306
W+ L L+ G +A +LR R+ + + P++ G+I
Sbjct: 228 PWMLLALLAGFMAFRVMLR------QEKRRVS-----FHRRLLHLPLI----GRIARGLN 272

Query: 307 LARWPQTDGSWLSPDSFIPLAQQTGLS-EPLTLLIIRSVFEDMGDWLRQHPQQHISINLE 365
AR+ +T + S +PL Q +S + ++ R D +R+ H + LE
Sbjct: 273 TARYARTLSILNA--SAVPLLQAMRISGDVMSNDYARHRLSLATDAVREGVSLHKA--LE 328

Query: 366 STVLTSEKIPQLLREMI 382
T L P ++R MI
Sbjct: 329 QTAL----FPPMMRHMI 341


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_0496  ACRIFLAVINRP  1369   0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 1369 bits (3546), Expect = 0.0
Identities = 802/1033 (77%), Positives = 915/1033 (88%), Gaps = 1/1033 (0%)

Query: 1 MPNFFIDRPIFAWVIAIIIMLAGGLAILKLPVAQYPTIAPPAVTISASYPGADAKTVQDT 60
M NFFI RPIFAWV+AII+M+AG LAIL+LPVAQYPTIAPPAV++SA+YPGADA+TVQDT
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VTQVIEQNMNGIDNLMYMSSNSDSTGTVQITLTFESGTDADIAQVQVQNKLQLAMPLLPQ 120
VTQVIEQNMNGIDNLMYMSS SDS G+V ITLTF+SGTD DIAQVQVQNKLQLA PLLPQ
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 121 EVQQQGVSVEKSSSSFLMVVGVINTDGTMTQEDISDYVAANMKDAISRTSGVGDVQLFGS 180
EVQQQG+SVEKSSSS+LMV G ++ + TQ+DISDYVA+N+KD +SR +GVGDVQLFG+
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 181 QYAMRIWMNPNELNKFQLTPVDVITAIKAQNAQVAAGQLGGTPPVKGQQLNASIIAQTRL 240
QYAMRIW++ + LNK++LTPVDVI +K QN Q+AAGQLGGTP + GQQLNASIIAQTR
Sbjct: 181 QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRF 240

Query: 241 TSTEEFGKILLKVNQDGSRVLLRDVAKIELGGENYDIIAEFNGQPASGLGIKLATGANAL 300
+ EEFGK+ L+VN DGS V L+DVA++ELGGENY++IA NG+PA+GLGIKLATGANAL
Sbjct: 241 KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL 300

Query: 301 DTAAAIRAELAKMEPFFPSGLKIVYPYDTTPFVKISIHEVVKTLVEAIILVFLVMYLFLQ 360
DTA AI+A+LA+++PFFP G+K++YPYDTTPFV++SIHEVVKTL EAI+LVFLVMYLFLQ
Sbjct: 301 DTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQ 360

Query: 361 NFRATLIPTIAVPVVLLGTFAVLAAFGFSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420
N RATLIPTIAVPVVLLGTFA+LAAFG+SINTLTMFGMVLAIGLLVDDAIVVVENVERVM
Sbjct: 361 NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420

Query: 421 AEEGLPPKEATRKSMGQIQGALVGIAMVLSAVFVPMAFFGGSTGAIYRQFSITIVSAMAL 480
E+ LPPKEAT KSM QIQGALVGIAMVLSAVF+PMAFFGGSTGAIYRQFSITIVSAMAL
Sbjct: 421 MEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL 480

Query: 481 SVLVALILTPALCATMLKPIAKGDHGEGKKGFFGWFNRMFEKSTHHYTDSVGGILRSTGR 540
SVLVALILTPALCAT+LKP++ H E K GFFGWFN F+ S +HYT+SVG IL STGR
Sbjct: 481 SVLVALILTPALCATLLKPVSAE-HHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 541 YLVLYLIIVVGMAYLFVRLPSSFLPDEDQGVFMTMVQLPAGATQERTQKVLNEVTHYYLT 600
YL++Y +IV GM LF+RLPSSFLP+EDQGVF+TM+QLPAGATQERTQKVL++VT YYL
Sbjct: 540 YLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLK 599

Query: 601 KEKNNVESVFAVNGFGFAGRGQNTGIAFVSLKDWADRPGEENKVEAITMRATRAFSQIKD 660
EK NVESVF VNGF F+G+ QN G+AFVSLK W +R G+EN EA+ RA +I+D
Sbjct: 600 NEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRD 659

Query: 661 AMVFAFNLPAIVELGTATGFDFELIDQAGLGHEKLTQARNQLLAEAAKHPDMLTSVRPNG 720
V FN+PAIVELGTATGFDFELIDQAGLGH+ LTQARNQLL AA+HP L SVRPNG
Sbjct: 660 GFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNG 719

Query: 721 LEDTPQFKIDIDQEKAQALGVSINDINTTLGAAWGGSYVNDFIDRGRVKKVYVMSEAKYR 780
LEDT QFK+++DQEKAQALGVS++DIN T+ A GG+YVNDFIDRGRVKK+YV ++AK+R
Sbjct: 720 LEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFR 779

Query: 781 MLPDDIGDWYVRAADGQMVPFSAFSSSRWEYGSPRLERYNGLPSMEILGQAAPGKSTGEA 840
MLP+D+ YVR+A+G+MVPFSAF++S W YGSPRLERYNGLPSMEI G+AAPG S+G+A
Sbjct: 780 MLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDA 839

Query: 841 MELMEQLASKLPTGVGYDWTGMSYQERLSGNQAPSLYAISLIVVFLCLAALYESWSIPFS 900
M LME LASKLP G+GYDWTGMSYQERLSGNQAP+L AIS +VVFLCLAALYESWSIP S
Sbjct: 840 MALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVS 899

Query: 901 VMLVVPLGVIGALLAATFRGLTNDVYFQVGLLTTIGLSAKNAILIVEFAKDLMDKEGKGL 960
VMLVVPLG++G LLAAT NDVYF VGLLTTIGLSAKNAILIVEFAKDLM+KEGKG+
Sbjct: 900 VMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGV 959

Query: 961 IEATLDAVRMRLRPILMTSLAFILGVMPLVISTGAGSGAQNAVGTGVMGGMVTATVLAIF 1020
+EATL AVRMRLRPILMTSLAFILGV+PL IS GAGSGAQNAVG GVMGGMV+AT+LAIF
Sbjct: 960 VEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIF 1019

Query: 1021 FVPVFFVVVRRRF 1033
FVPVFFVV+RR F
Sbjct: 1020 FVPVFFVVIRRCF 1032


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits       Score  E-value  Comments
EcE24377A_0497  RTXTOXIND  44     6e-07    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 44.0 bits (104), Expect = 6e-07
Identities = 33/212 (15%), Positives = 71/212 (33%), Gaps = 23/212 (10%)

Query: 100 TYQATYDSAKGDLAKAQAAANIAQLTVNRYQKLLGTQYISKQEYDQALADAQQANAAVTA 159
+ Y A +L + + Q+ Q +++ ++ L +Q +
Sbjct: 256 EQENKYVEAVNELR--VYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGL 313

Query: 160 AKAAVETARINLAYTKVTSPISGRIGKSNV-TEGALVQNGQATALATVQQLDPIYVDVTQ 218
+ + + +P+S ++ + V TEG +V + T + V + D + V
Sbjct: 314 LTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAE-TLMVIVPEDDTLEVTALV 372

Query: 219 SSNDFLRLKQELA----------NGTLKQENGKAKVSLITSDGIKFPQDGTLEFSDVTVD 268
+ D + KV I D I+ + G + ++++
Sbjct: 373 QNKDIGFINVGQNAIIKVEAFPYTRYGYLV---GKVKNINLDAIEDQRLGLVFNVIISIE 429

Query: 269 QTTGSITLRAIFPNPDHTLLPGMFVRARLEEG 300
+ S + I L GM V A ++ G
Sbjct: 430 ENCLSTGNKNIP------LSSGMAVTAEIKTG 455



Score = 34.4 bits (79), Expect = 8e-04
Identities = 24/125 (19%), Positives = 43/125 (34%), Gaps = 13/125 (10%)

Query: 49 PLQITTELPGR-TSAYRIAEVRPQVSGIILKRNFKEGSDIEAGVSLYQIDPATYQATYDS 107
++I G+ T + R E++P + I+ + KEG + G L ++ +A
Sbjct: 79 QVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEA---- 134

Query: 108 AKGDLAKAQAAANIAQLTVNRYQKLLGTQYISKQEYDQALADAQQANAAVTAAKAAVETA 167
D K Q++ A+L RYQ L E ++
Sbjct: 135 ---DTLKTQSSLLQARLEQTRYQILS-----RSIELNKLPELKLPDEPYFQNVSEEEVLR 186

Query: 168 RINLA 172
+L
Sbjct: 187 LTSLI 191


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_0498  HTHTETR  222    5e-76    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 222 bits (567), Expect = 5e-76
Identities = 215/215 (100%), Positives = 215/215 (100%)

Query: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60
MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60

Query: 61 EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV 120
EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV
Sbjct: 61 EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV 120

Query: 121 GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF 180
GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF
Sbjct: 121 GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF 180

Query: 181 APQSFDLKKEARDYVAILLEMYLLCPTLRNPATNE 215
APQSFDLKKEARDYVAILLEMYLLCPTLRNPATNE
Sbjct: 181 APQSFDLKKEARDYVAILLEMYLLCPTLRNPATNE 215


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits       Score  E-value  Comments
EcE24377A_0500  RTXTOXIND  32     0.017    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 31.7 bits (72), Expect = 0.017
Identities = 19/125 (15%), Positives = 40/125 (32%), Gaps = 6/125 (4%)

Query: 28 QNTAFARASSNGDLPTKADLQAQLDSLNKQKDLSAQDKLVQQDLTDTLATLDKIDRIKEE 87
N RA L + + L L+ + A L++ ++ E
Sbjct: 207 LNLDKKRAERLTVLARINRYENLSRVEKSR--LDDFSSLLHKQAIAKHAVLEQENKYVEA 264

Query: 88 TVQLRQKVAEAPEKMRQATAALTALSDVDND--EETRKIL--STLSLRQLETRVAQALDD 143
+LR ++ + + +A V E L +T ++ L +A+ +
Sbjct: 265 VNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEER 324

Query: 144 LQNAQ 148
Q +
Sbjct: 325 QQASV 329


77  EcE24377A_0611  EcE24377A_0616  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias   Product
EcE24377A_0611  -1      17       4.851865  enterobactin exporter EntS
EcE24377A_0612  -2      18       4.687698  iron-enterobactin transporter periplasmic
EcE24377A_0613  -1      20       5.003538  isochorismate synthase
EcE24377A_0614  -1      21       4.915810  enterobactin synthase subunit E
EcE24377A_0615  0       20       4.817127  isochorismatase
EcE24377A_0616  -1      18       4.391593  2,3-dihydroxybenzoate-2,3-dehydrogenase
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_0611  TCRTETA  36     2e-04    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 36.3 bits (84), Expect = 2e-04
Identities = 82/394 (20%), Positives = 145/394 (36%), Gaps = 38/394 (9%)

Query: 27 FISIVSLGLLGVAVPVQIQMMTHSTWQV---GLSVTLTGGAMFVGLMVGGVLADRYERKK 83
+ V +GL+ +P ++ + HS G+ + L F V G L+DR+ R+
Sbjct: 15 ALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRP 74

Query: 84 VILLARGTCGIGFIGLCLNALL--PEPSLLAIYLLGLWDGFFASLGVTALLAATPALVGR 141
V+L + G ++ + P L +Y+ + G + G A A +
Sbjct: 75 VLL-------VSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVA-GAYIADITDG 126

Query: 142 ENLMQAGAITMLTVRLGSVISPMIGGLLLATGGVAWNYGLAAAGTFITLLPLLSLPALPP 201
+ + G V P++GGL+ GG + + AA L L LP
Sbjct: 127 DERARHFGFMSACFGFGMVAGPVLGGLM---GGFSPHAPFFAAAALNGLNFLTGCFLLPE 183

Query: 202 PPQPREHPLK----SLLAGFRFLLASPLVGGIALLGGLLTMAS----AVRVLYPALADNW 253
+ PL+ + LA FR+ +V + + ++ + A+ V++ D +
Sbjct: 184 SHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFG--EDRF 241

Query: 254 QMSAAQIGFLYAAIP-LGAAIGALTSGKLAHSARPGLLMLLSTLGS---FLAIGLFGLMP 309
A IG AA L + A+ +G +A ++L + ++ +
Sbjct: 242 HWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGW 301

Query: 310 MWILGVVCLALFGWLSAVSSLLQYTMLQTQTPEAMLGRINGLWTAQNVTGDAIGAALLGG 369
M +V LA G ML Q E G++ G A +G L
Sbjct: 302 MAFPIMVLLASGGIGMPALQ----AMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTA 357

Query: 370 LGAMMTPVASASASGFGLLIIGVLLLLVLVELRR 403
+ A + + +G+ + L LL L LRR
Sbjct: 358 IYA----ASITTWNGWAWIAGAALYLLCLPALRR 387


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_0612  FERRIBNDNGPP  63     2e-13    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 63.0 bits (153), Expect = 2e-13
Identities = 61/285 (21%), Positives = 102/285 (35%), Gaps = 35/285 (12%)

Query: 40 HTLESQPQRIVSTSVTLTGSLLAIDAPVIASGATTPNNRVADDQGFLRQWSKVAKERKLQ 99
H P RIV+ LLA+ VAD + R W E L
Sbjct: 29 HAAAIDPNRIVALEWLPVELLLALGIVPYG---------VADTINY-RLW---VSEPPLP 75

Query: 100 RLYIG-----EPSAEAVAAQMPDLILISATGGDSALALYDQLSTIAPTLIINYDDKS--- 151
I EP+ E + P ++ SA G S + L+ IAP N+ D
Sbjct: 76 DSVIDVGLRTEPNLELLTEMKPSFMVWSAGYGPS----PEMLARIAPGRGFNFSDGKQPL 131

Query: 152 --WQSLLTQLGEITGHEKQAAERIAQFDKQLAAAKEQIKLPPQPVTAIVYTAAAHSANLW 209
+ LT++ ++ + A +AQ++ + + K + + ++
Sbjct: 132 AMARKSLTEMADLLNLQSAAETHLAQYEDFIRSMKPRFVKRGARPLLLTTLIDPRHMLVF 191

Query: 210 TPESAQGQMLEQLGFTLAKLPAGLNASQSQGKRHDIIQLGGENLAAGLNGESLFLFAGDQ 269
P S ++L++ G NA Q + + + LAA + + L +
Sbjct: 192 GPNSLFQEILDEYGIP--------NAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDHDNS 243

Query: 270 KDADAIYANPLLAHLPAVQNKQVYALGTETFRLDYYSAMQVLERL 314
KD DA+ A PL +P V+ + + F SAM + L
Sbjct: 244 KDMDALMATPLWQAMPFVRAGRFQRVPAVWFYGATLSAMHFVRVL 288


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_0615  ISCHRISMTASE  443    e-161    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 443 bits (1141), Expect = e-161
Identities = 147/299 (49%), Positives = 195/299 (65%), Gaps = 18/299 (6%)

Query: 1 MAIPKLQAYALPESHDIPQNKVDWAFEPQRAALLIHDMQDYFVSFWGENCPMMEQVIANI 60
MAIP +Q Y +P + D+PQNKV W +P RA LLIHDMQ+YFV + + ++ ANI
Sbjct: 1 MAIPAIQPYQMPTASDMPQNKVSWVPDPNRAVLLIHDMQNYFVDAFTAGASPVTELSANI 60

Query: 61 AALRDYCKQHNIPVYYTAQPKEQSDEDRALLNDMWGPGLTRSPEQQKVVDRLTPDADDTV 120
L++ C Q IPV YTAQP Q+ +DRALL D WGPGL P ++K++ L P+ DD V
Sbjct: 61 RKLKNQCVQLGIPVVYTAQPGSQNPDDRALLTDFWGPGLNSGPYEEKIITELAPEDDDLV 120

Query: 121 LVKWRYSAFHRSPLEQMLKESGRNQLIITGVYAHIGCMTTATDAFMRDIKPFMVADALAD 180
L KWRYSAF R+ L +M+++ GR+QLIITG+YAHIGC+ TA +AFM DIK F V DA+AD
Sbjct: 121 LTKWRYSAFKRTNLLEMMRKEGRDQLIITGIYAHIGCLVTACEAFMEDIKAFFVGDAVAD 180

Query: 181 FSRDEHLMSLKYVAGRSGRVVMTEELL------PAPVPASKA-----------ALREVIL 223
FS ++H M+L+Y AGR VMT+ LL PA V + A +R+ I
Sbjct: 181 FSLEKHQMALEYAAGRCAFTVMTDSLLDQLQNAPADVQKTSANTGKKNVFTCENIRKQIA 240

Query: 224 PLLDESDEPFDDD-NLIDYGLDSVRMMALAARWRKVHGDIDFVMLAKNPTIDAWWKLLS 281
LL E+ E D +L+D GLDSVR+M L +WR+ ++ FV LA+ PTI+ W KLL+
Sbjct: 241 ELLQETPEDITDQEDLLDRGLDSVRIMTLVEQWRREGAEVTFVELAERPTIEEWQKLLT 299


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_0616  DHBDHDRGNASE  362    e-130    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 362 bits (931), Expect = e-130
Identities = 110/258 (42%), Positives = 149/258 (57%), Gaps = 20/258 (7%)

Query: 5 GKNVWVTGAGKGIGYATALAFVEAGAKVTGFD---------------QAFTQEQYPFATE 49
GK ++TGA +GIG A A GA + D +A E +P
Sbjct: 8 GKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFP---- 63

Query: 50 VMDVADAAQVAQVCQRLLAETERLDALVNAAGILRMGATDQLSKEDWQQTFAVNVGGAFN 109
DV D+A + ++ R+ E +D LVN AG+LR G LS E+W+ TF+VN G FN
Sbjct: 64 -ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFN 122

Query: 110 LFQQTMNQFRRQRGGAIVTVASDAAHTPRIGMSAYGASKAALKSLALSVGLELAVSGVRC 169
+ +R G+IVTV S+ A PR M+AY +SKAA +GLELA +RC
Sbjct: 123 ASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRC 182

Query: 170 NVVSPGSTDTDMQRTLWVSDDAEEQRIRGFGEQFKLGIPLGKIARPQEIANTILFLASDL 229
N+VSPGST+TDMQ +LW ++ EQ I+G E FK GIPL K+A+P +IA+ +LFL S
Sbjct: 183 NIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQ 242

Query: 230 ASHITLQDIVVDGGSTLG 247
A HIT+ ++ VDGG+TLG
Sbjct: 243 AGHITMHNLCVDGGATLG 260


78  EcE24377A_0856  EcE24377A_0862  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias   Product
EcE24377A_0856  -2      20       3.660503  ABC-2 type transporter permease
EcE24377A_0858  -2      19       4.078942  ABC-2 type transporter permease
EcE24377A_0859  -2      18       3.904588  ABC transporter ATP-binding protein
EcE24377A_0860  -2      15       3.337050  hypothetical protein
EcE24377A_0861  -1      16       2.546750  DNA-binding transcriptional regulator
EcE24377A_0862  0       15       2.835494  ATP-dependent RNA helicase RhlE
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_0856  ABC2TRNSPORT  47     3e-08    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 47.2 bits (112), Expect = 3e-08
Identities = 36/146 (24%), Positives = 63/146 (43%), Gaps = 5/146 (3%)

Query: 197 AREREQGTLDQLLVSPLTTWQIFIGKAVPALIVATFQATIVLAIGIWAYQIPFAGSLALF 256
R Q T + +L + L I +G+ A A IG+ A + + L+L
Sbjct: 92 GRMEGQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGA---GIGVVAAALGYTQWLSLL 148

Query: 257 YFTMVI--YGLSLVGFGLLISSLCSTQQQAFIGVFVFMMPAILLSGYVSPVENMPVWLQN 314
Y VI GL+ G+++++L + + + P + LSG V PV+ +P+ Q
Sbjct: 149 YALPVIALTGLAFASLGMVVTALAPSYDYFIFYQTLVITPILFLSGAVFPVDQLPIVFQT 208

Query: 315 LTWINPIRHFTDITKQIYLKDASLDI 340
P+ H D+ + I L +D+
Sbjct: 209 AARFLPLSHSIDLIRPIMLGHPVVDV 234


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_0859  PF05272  32     0.012    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 31.6 bits (71), Expect = 0.012
Identities = 20/90 (22%), Positives = 28/90 (31%), Gaps = 21/90 (23%)

Query: 293 TPRFEDAFIDLLGGAGTSESPLGAILHTVEGTPGETVIEAKELTKKFGDFAATDHVNFAV 352
PR E + +LG P + + + K HV +
Sbjct: 547 VPRLEKWLVHVLGKTPDDYKP-------------RRLRYLQLVGKYI----LMGHVARVM 589

Query: 353 KRGEIFG----LLGPNGAGKSTTFKMMCGL 378
+ G F L G G GKST + GL
Sbjct: 590 EPGCKFDYSVVLEGTGGIGKSTLINTLVGL 619



Score = 29.7 bits (66), Expect = 0.046
Identities = 11/23 (47%), Positives = 13/23 (56%)

Query: 34 YVTGLVGPDGAGKTTLMRMLAGL 56
Y L G G GK+TL+ L GL
Sbjct: 597 YSVVLEGTGGIGKSTLINTLVGL 619


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits       Score  E-value  Comments
EcE24377A_0860  RTXTOXIND  63     6e-13    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 62.5 bits (152), Expect = 6e-13
Identities = 42/259 (16%), Positives = 92/259 (35%), Gaps = 25/259 (9%)

Query: 82 ALMQAKAGVSVAQAQYDLMLAGYRDEEIAQAAAAVKQAQAAYDYAQNFYNRQQGLWKSRT 141
Q + + +A+ +LA E + + + + L +
Sbjct: 201 QKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENK 260

Query: 142 ISA--NDLENARSSRDQAQATLKSAQDKLRQYRSGNREQ---DIAQAKASLEQAQAQLAQ 196
N+L +S +Q ++ + SA+++ + + + + Q ++ +LA+
Sbjct: 261 YVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAK 320

Query: 197 AELNLQDSTLIAPSDGTLLTRAV-EPGTVLNEGGTVFTVSLT-RPVWVRAYVDERNLDQA 254
E Q S + AP + V G V+ T+ + + V A V +++
Sbjct: 321 NEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFI 380

Query: 255 QPGRKVLLYTDGRPDKPYH---GQIGFVSPTAEFTPKTVETPDLRTDLVYRLRIVVT--- 308
G+ ++ + P Y G++ ++ A D R LV+ + I +
Sbjct: 381 NVGQNAIIKVEAFPYTRYGYLVGKVKNINLDA--------IEDQRLGLVFNVIISIEENC 432

Query: 309 ----DADDALRQGMPVTVQ 323
+ + L GM VT +
Sbjct: 433 LSTGNKNIPLSSGMAVTAE 451


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_0861  HTHTETR  73     7e-18    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 72.7 bits (178), Expect = 7e-18
Identities = 33/214 (15%), Positives = 77/214 (35%), Gaps = 17/214 (7%)

Query: 9 KGEQAKKQLIAAALAQFGEYGMNATT-REIAAQAGQNIAAITYYFGSKEDLYLACAQWIA 67
+ ++ ++ ++ AL F + G+++T+ EIA AG AI ++F K DL+ +
Sbjct: 8 EAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSE 67

Query: 68 DFIGEQFRPHAEEAERLFAQPQPDRAAIRELILRACRNMIKLLTQDDTVNLSKFISREQL 127
IGE E + P + +RE+++ + + + + + F E +
Sbjct: 68 SNIGELEL---EYQAKFPGDP---LSVLREILIHVLESTVTEERRRLLMEII-FHKCEFV 120

Query: 128 SPTAAYHLVHEQVISPLHSHLTRLIAAWTGCDANDTRMILHTHALIGEILAFRLGKETIL 187
A + + + + + +A L T + + G
Sbjct: 121 GEMAVVQQAQRNLCLESYDRIEQTLKH--CIEAKMLPADLMTRRAAIIMRGYISG----- 173

Query: 188 LRTGWTAFDEEKTELINQTVTCHIDLILQGLSQR 221
L W + + + ++ ++L+
Sbjct: 174 LMENWLFAPQSFD--LKKEARDYVAILLEMYLLC 205


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits  Score  E-value  Comments
EcE24377A_0862  SECA  30     0.025    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 29.8 bits (67), Expect = 0.025
Identities = 20/67 (29%), Positives = 34/67 (50%), Gaps = 4/67 (5%)

Query: 246 QQVLVFTRTKHGANHLAEQLNKDGIRSAAIHG-NKSQGARTRALADFKSGDIRVLVATDI 304
Q VLV T + + ++ +L K GI+ ++ + A A A + + V +AT++
Sbjct: 450 QPVLVGTISIEKSELVSNELTKAGIKHNVLNAKFHANEAAIVAQAGYPAA---VTIATNM 506

Query: 305 AARGLDI 311
A RG DI
Sbjct: 507 AGRGTDI 513


79  EcE24377A_0914  EcE24377A_0919  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
EcE24377A_0914  0       13       -0.196115  multidrug translocase MdfA
EcE24377A_0915  -1      14       -0.551620  hypothetical protein
EcE24377A_0916  0       15       -1.027742  phosphatase YbjI
EcE24377A_0917  -1      14       -0.231698  major facilitator transporter
EcE24377A_0918  -1      13       -1.000379  TetR family transcriptional regulator
EcE24377A_0919  0       11       0.117467   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_0914  TCRTETA  39     2e-05    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 39.4 bits (92), Expect = 2e-05
Identities = 57/269 (21%), Positives = 106/269 (39%), Gaps = 23/269 (8%)

Query: 71 LLGPLSDRIGRRPVMLAGVVWFIITCLAILLAQNIEQFTLLRFLQGISLCFIGAVGYAAI 130
+LG LSDR GRRPV+L + + + A + + R + GI+ GAV A I
Sbjct: 62 VLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGA-TGAVAGAYI 120

Query: 131 QESFEEAVCIKITALMANVALIAPLLGPLVG---AAWIHVLPWEGMFVLFAALAAISFFG 187
+ + + M+ + GP++G + P F AAL ++F
Sbjct: 121 ADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAP----FFAAAALNGLNFLT 176

Query: 188 LQQAMPETATRIGEKLSLKELGRDYKLVLKNG-RFVAGALALGFVSLPLLAWIAQSP--I 244
+PE+ L + L G VA +A+ F ++ + Q P +
Sbjct: 177 GCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFF----IMQLVGQVPAAL 232

Query: 245 IIITGEQLSSYEYGLLQVPIFGALIAGNL----LLARLTSRRTVRSLIIMGGWPIMIGLL 300
+I GE ++ + + + I +L + + +R R +++G G +
Sbjct: 233 WVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYI 292

Query: 301 VAAAATVISSHAYLWMTAGLSIYAFGIGL 329
+ A AT ++ + + + GIG+
Sbjct: 293 LLAFAT----RGWMAFPIMVLLASGGIGM 317


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_0917  TCRTETB  34     0.001    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 33.7 bits (77), Expect = 0.001
Identities = 34/150 (22%), Positives = 65/150 (43%), Gaps = 6/150 (4%)

Query: 218 LLIGVVVLAMAFAEGSANDWL-PLLMVDGHGFSP-TSGSLIYAGFTLGMTVGRFTGGWFI 275
+IGV+ + F + + P +M D H S GS+I T+ + + + GG +
Sbjct: 258 FMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYIGGILV 317

Query: 276 DRYSRVAVVR-ASALM--GALGIGLIIFVDSAWVA-GVSVVLWGLGASLGFPLTISAASD 331
DR + V+ + L ++ S ++ + VL GL + TI ++S
Sbjct: 318 DRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGLSFTKTVISTIVSSSL 377

Query: 332 TGPDAPTRVSVVATTGYLAFLVGPPLLGYL 361
+A +S++ T +L+ G ++G L
Sbjct: 378 KQQEAGAGMSLLNFTSFLSEGTGIAIVGGL 407


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_0918  HTHTETR  50     6e-10    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 49.6 bits (118), Expect = 6e-10
Identities = 14/83 (16%), Positives = 30/83 (36%), Gaps = 4/83 (4%)

Query: 2 RRANDPQRREKIIQATLEAVKLYGIHAVTHRKIATLAGVPLGSMTYYFSGIDELLLEAFS 61
+ + R+ I+ L G+ + + +IA AGV G++ ++F +L E +
Sbjct: 5 TKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIW- 63

Query: 62 SFTEIMSRQYQAFFSDVSDAQGA 84
E+ +
Sbjct: 64 ---ELSESNIGELELEYQAKFPG 83


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_0919  TCRTETA  32     0.006    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 32.1 bits (73), Expect = 0.006
Identities = 21/106 (19%), Positives = 34/106 (32%), Gaps = 6/106 (5%)

Query: 394 LMIGMITFQFSTFSFGMGNAAGLLFAGIML-GFMRANHPTFG-YIPQ--GALSMVKEFGL 449
L++ + +L+ G ++ G A G YI + FG
Sbjct: 76 LLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIADITDGDERARHFGF 135

Query: 450 MVFMAGVGLSAGSGINNGLGAIGGQM--LIAGLIVSLVPVVICFLF 493
M G G+ AG + +G A + L + CFL
Sbjct: 136 MSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLL 181


80  EcE24377A_0937  EcE24377A_0942  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias   Product
EcE24377A_0937  -1      13       2.650211  arginine transporter ATP-binding subunit
EcE24377A_0938  -1      13       3.281928  lipoprotein
EcE24377A_0939  -1      14       3.066102  hypothetical protein
EcE24377A_0941  -1      13       3.174707  N-acetylmuramoyl-L-alanine amidase
EcE24377A_0940  -2      15       2.941240  NAD-dependent epimerase/dehydratase
EcE24377A_0942  -3      13       2.302690  NAD dependent epimerase/dehydratase
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_0937  PF05272  30     0.010    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.0 bits (67), Expect = 0.010
Identities = 9/18 (50%), Positives = 12/18 (66%)

Query: 31 LVLLGPSGAGKSSLLRVL 48
+VL G G GKS+L+ L
Sbjct: 599 VVLEGTGGIGKSTLINTL 616


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits        Score  E-value  Comments
EcE24377A_0941  ECOLIPORIN  29     0.017    E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 29.1 bits (65), Expect = 0.017
Identities = 20/54 (37%), Positives = 27/54 (50%), Gaps = 9/54 (16%)

Query: 2 RRVFWLIAAALLLAGCAGEKGIVEKEGYQLDTRRQAQAAYPRIKVLVIHYTADD 55
R+V L+ ALL AG A I K+G +LD Y ++ L HY +DD
Sbjct: 3 RKVLALVIPALLAAGAAHAAEIYNKDGNKLDL-------YGKVDGL--HYFSDD 47


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_0940  NUCEPIMERASE  76     1e-17    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 75.6 bits (186), Expect = 1e-17
Identities = 70/363 (19%), Positives = 123/363 (33%), Gaps = 65/363 (17%)

Query: 13 MKVLVTGATSGLGRNAVEFLCQKGISVRA---------TGRNEAMGKLLEKMGAEFVPAD 63
MK LVTGA +G + + L + G V +A +LL + G +F D
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKID 60

Query: 64 LTELVSSQAKVMLAGIDTLWHCS-------SFTSPWGTQQAFDLANVRATRRLGEWAVAW 116
L + + ++ S +P A+ +N+ + E
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPH----AYADSNLTGFLNILEGCRHN 116

Query: 117 GVRNFIHISSPSLYFDYHHHRDIKEDFRPHRFANEFARSKAASEEVINMLSQANPQTRFT 176
+++ ++ SS S+Y + D + +A +K A+E + + S T
Sbjct: 117 KIQHLLYASSSSVYGL-NRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSH-LYGLPAT 174

Query: 177 ILRPQSLFGPHDK--VFIPRLAHMMHHYGSILLPHGGSALVDMTYYENAVHAMWLASQEA 234
LR +++GP + + + + M SI + + G D TY ++ A+
Sbjct: 175 GLRFFTVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIRLQDVI 234

Query: 235 CDKLPS--------------GRVYNITNGEHRTLRSIVQKLIDELNIDCRIRSVPYPMLD 280
RVYNI N L +Q L D L I+ + +P D
Sbjct: 235 PHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAKKNMLPLQPGD 294

Query: 281 MIARSMERLGRKSAKEPPLTHYGVSKLNFDFTLDITRAQEELGYQPVITLDEGIEKTAAW 340
+ T D E +G+ P T+ +G++ W
Sbjct: 295 V----------------LETS-----------ADTKALYEVIGFTPETTVKDGVKNFVNW 327

Query: 341 LRD 343
RD
Sbjct: 328 YRD 330


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_0942  NUCEPIMERASE  56     1e-10    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 55.6 bits (134), Expect = 1e-10
Identities = 29/125 (23%), Positives = 52/125 (41%), Gaps = 17/125 (13%)

Query: 4 RILVLGASGYIGQHLVRTLSQQGHQILA---------AARHVDRLAKLQLANVSCHKVDL 54
+ LV GA+G+IG H+ + L + GHQ++ + RL L HK+DL
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 55 SWPDNLPALLQD--IDTVYFLVH------SMGEGGDFIAQERQVALNVRDALREVPVKQL 106
+ + + L + V+ H S+ + LN+ + R ++ L
Sbjct: 62 ADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQHL 121

Query: 107 IFLSS 111
++ SS
Sbjct: 122 LYASS 126


81  EcE24377A_1199  EcE24377A_1207  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias   Product
EcE24377A_1199  0       12       2.506596  flagellar hook protein FlgE
EcE24377A_1200  -1      12       2.446995  flagellar basal body rod protein FlgF
EcE24377A_1201  -1      9        1.331663  flagellar basal body rod protein FlgG
EcE24377A_1202  0       12       2.259446  flagellar basal body L-ring protein
EcE24377A_1203  0       13       2.021553  flagellar basal body P-ring biosynthesis protein
EcE24377A_1204  1       14       1.633508  flagellar rod assembly protein/muramidase FlgJ
EcE24377A_1205  2       15       1.203864  flagellar hook-associated protein FlgK
EcE24377A_1206  2       14       1.248637  flagellar hook-associated protein FlgL
EcE24377A_1207  2       14       1.726975  ribonuclease E
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits        Score  E-value  Comments
EcE24377A_1199  FLGHOOKAP1  42     4e-06    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 41.9 bits (98), Expect = 4e-06
Identities = 17/49 (34%), Positives = 29/49 (59%)

Query: 353 TLTNGALEASNVDLSKELVNMIVAQRNYQSNAQTIKTQDQILNTLVNLR 401
L+N S V+L +E N+ Q+ Y +NAQ ++T + I + L+N+R
Sbjct: 498 QLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINIR 546



Score = 37.2 bits (86), Expect = 9e-05
Identities = 22/56 (39%), Positives = 30/56 (53%), Gaps = 4/56 (7%)

Query: 6 AVSGLNAAATNLDVIGNNIANSATYGFKSGTASFAD----MFAGSKVGLGVKVAGI 57
A+SGLNAA L+ NNI++ G+ T A + AG VG GV V+G+
Sbjct: 7 AMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVSGV 62


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits        Score  E-value  Comments
EcE24377A_1201  FLGHOOKAP1  44     4e-07    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 43.8 bits (103), Expect = 4e-07
Identities = 18/81 (22%), Positives = 36/81 (44%), Gaps = 14/81 (17%)

Query: 3 SSLWIAKTGLDAQQTNMDVIANNLANVSTNGFKRQRAVFEDLLYQTIRQPGAQSSEQTTL 62
S + A +GL+A Q ++ +NN+++ + G+ RQ + + +TL
Sbjct: 2 SLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTI--------------MAQANSTL 47

Query: 63 PSGLQIGTGVRPVATERLHSQ 83
+G +G GV +R +
Sbjct: 48 GAGGWVGNGVYVSGVQREYDA 68



Score = 41.1 bits (96), Expect = 3e-06
Identities = 11/41 (26%), Positives = 21/41 (51%)

Query: 220 ETSNVNVAEELVNMIQVQRAYEINSKAVSTTDQMLQKLTQL 260
S VN+ EE N+ + Q+ Y N++ + T + + L +
Sbjct: 505 SISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_1202  FLGLRINGFLGH  349    e-126    Flagellar L-ring protein signature.
		>FLGLRINGFLGH#Flagellar L-ring protein signature.

Length = 232

Score = 349 bits (897), Expect = e-126
Identities = 232/232 (100%), Positives = 232/232 (100%)

Query: 1 MQKNAAHTYAISSLLVLSLTGCAWIPSTPLVQGATSAQPVPGPTPVANGSIFQSAQPINY 60
MQKNAAHTYAISSLLVLSLTGCAWIPSTPLVQGATSAQPVPGPTPVANGSIFQSAQPINY
Sbjct: 1 MQKNAAHTYAISSLLVLSLTGCAWIPSTPLVQGATSAQPVPGPTPVANGSIFQSAQPINY 60

Query: 61 GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYLQGLFGNA 120
GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYLQGLFGNA
Sbjct: 61 GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYLQGLFGNA 120

Query: 121 RADVEASGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRF 180
RADVEASGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRF
Sbjct: 121 RADVEASGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRF 180

Query: 181 SGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSPM 232
SGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSPM
Sbjct: 181 SGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSPM 232


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_1203  FLGPRINGFLGI  427    e-152    Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 427 bits (1100), Expect = e-152
Identities = 157/363 (43%), Positives = 213/363 (58%), Gaps = 9/363 (2%)

Query: 4 FLSALILLLVTTAAQAERIRDLTSVQGVRQNSLIGYGLVVGLDGTGDQTTQTPFTTQTLN 63
F + L A RI+D+ S+Q R N LIGYGLVVGL GTGD +PFT Q++
Sbjct: 13 FSALPFLSTPPAQADTSRIKDIASLQAGRDNQLIGYGLVVGLQGTGDSLRSSPFTEQSMR 72

Query: 64 NMLSQLGITVPTGTNMQLKNVAAVMVTASLPPFGRQGQTIDVVVSSMGNAKSLRGGTLLM 123
ML LGIT G + KN+AAVMVTA+LPPF G +DV VSS+G+A SLRGG L+M
Sbjct: 73 AMLQNLGITTQGGQS-NAKNIAAVMVTANLPPFASPGSRVDVTVSSLGDATSLRGGNLIM 131

Query: 124 TPLKGVDSQVYALAQGNILVGGAGASAGGSSVQVNQLNGGRITNGAVIERELPSQFGVGN 183
T L G D Q+YA+AQG ++V G A +++ R+ NGA+IERELPS+F
Sbjct: 132 TSLSGADGQIYAVAQGALIVNGFSAQGDAATLTQGVTTSARVPNGAIIERELPSKFKDSV 191

Query: 184 TLNLQLNDEDFSMAQQIADTINRVR----GYGSATALDARTIQVRVPSGNSSQVRFLADI 239
L LQL + DFS A ++AD +N G A D++ I V+ P + R +A+I
Sbjct: 192 NLVLQLRNPDFSTAVRVADVVNAFARARYGDPIAEPRDSQEIAVQKPRV-ADLTRLMAEI 250

Query: 240 QNMQVNVTPQDAKVVINSRTGSVVMNREVTLDSCAVAQGNLSVTVNRQANVSQPDTPFGG 299
+N+ V T AKVVIN RTG++V+ +V + AV+ G L+V V V QP PF
Sbjct: 251 ENLTVE-TDTPAKVVINERTGTIVIGADVRISRVAVSYGTLTVQVTESPQVIQP-APFSR 308

Query: 300 GQTVVTPQTQIDLRQSGGSLQSVRSSASLNNVVRALNALGATPMDLMSILQSMQSAGCLR 359
GQT V PQT I Q G + ++ L +V LN++G +++ILQ ++SAG L+
Sbjct: 309 GQTAVQPQTDIMAMQEGSKV-AIVEGPDLRTLVAGLNSIGLKADGIIAILQGIKSAGALQ 367

Query: 360 AKL 362
A+L
Sbjct: 368 AEL 370


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_1204  FLGFLGJ  511    0.0      Flagellar protein FlgJ signature.
		>FLGFLGJ#Flagellar protein FlgJ signature.

Length = 313

Score = 511 bits (1318), Expect = 0.0
Identities = 313/313 (100%), Positives = 313/313 (100%)

Query: 1 MISDSKLLASAAWDAQSLNELKAKAGEDPAANIRPVARQVEGMFVQMMLKSMRDALPKDG 60
MISDSKLLASAAWDAQSLNELKAKAGEDPAANIRPVARQVEGMFVQMMLKSMRDALPKDG
Sbjct: 1 MISDSKLLASAAWDAQSLNELKAKAGEDPAANIRPVARQVEGMFVQMMLKSMRDALPKDG 60

Query: 61 LFSSEHTRLYTSMYDQQIAQQMTAGKGLGLAEMMVKQMTPEQPLPEESTPAAPMKFPLET 120
LFSSEHTRLYTSMYDQQIAQQMTAGKGLGLAEMMVKQMTPEQPLPEESTPAAPMKFPLET
Sbjct: 61 LFSSEHTRLYTSMYDQQIAQQMTAGKGLGLAEMMVKQMTPEQPLPEESTPAAPMKFPLET 120

Query: 121 VVRYQNQALSQLVQKAVPRNYDDSLPGDSKAFLAQLSLPAQLASQQSGVPHHLILAQAAL 180
VVRYQNQALSQLVQKAVPRNYDDSLPGDSKAFLAQLSLPAQLASQQSGVPHHLILAQAAL
Sbjct: 121 VVRYQNQALSQLVQKAVPRNYDDSLPGDSKAFLAQLSLPAQLASQQSGVPHHLILAQAAL 180

Query: 181 ESGWGQRQIRRENGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKKVKAKFRVYSSYL 240
ESGWGQRQIRRENGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKKVKAKFRVYSSYL
Sbjct: 181 ESGWGQRQIRRENGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKKVKAKFRVYSSYL 240

Query: 241 EALSDYVGLLTRNPRYAAVTTAASAEQGAQALQDAGYATDPHYARKLTNMIQQMKSISDK 300
EALSDYVGLLTRNPRYAAVTTAASAEQGAQALQDAGYATDPHYARKLTNMIQQMKSISDK
Sbjct: 241 EALSDYVGLLTRNPRYAAVTTAASAEQGAQALQDAGYATDPHYARKLTNMIQQMKSISDK 300

Query: 301 VSKTYSMNIDNLF 313
VSKTYSMNIDNLF
Sbjct: 301 VSKTYSMNIDNLF 313


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits        Score  E-value  Comments
EcE24377A_1205  FLGHOOKAP1  683    0.0      Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 683 bits (1764), Expect = 0.0
Identities = 544/546 (99%), Positives = 545/546 (99%)

Query: 2 SSLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVS 61
SSLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVS
Sbjct: 1 SSLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVS 60

Query: 62 GVQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTLV 121
GVQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTLV
Sbjct: 61 GVQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTLV 120

Query: 122 SNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLND 181
SNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLND
Sbjct: 121 SNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLND 180

Query: 182 QISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGSTA 241
QISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGSTA
Sbjct: 181 QISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGSTA 240

Query: 242 RQLAAVPSSADPSRTTVAYVDGTAGNIEIPEKLLNTGSLGGILTFRSQDLDQTRNTLGQL 301
RQLAAVPSSADPSRTTVAYVDGTAGNIEIPEKLLNTGSLGGILTFRSQDLDQTRNTLGQL
Sbjct: 241 RQLAAVPSSADPSRTTVAYVDGTAGNIEIPEKLLNTGSLGGILTFRSQDLDQTRNTLGQL 300

Query: 302 ALAFAEAFNSQHKAGFDANGDAGEDFFAIGKPAVLQNTKNKGDVAIGATVTDASAVLATD 361
ALAFAEAFN+QHKAGFDANGDAGEDFFAIGKPAVLQNTKNKGDVAIGATVTDASAVLATD
Sbjct: 301 ALAFAEAFNTQHKAGFDANGDAGEDFFAIGKPAVLQNTKNKGDVAIGATVTDASAVLATD 360

Query: 362 YKISFDNNQWQVTRLASNTTFTVTPDANGKVAFDGLELTFTGTPAVNDSFTLKPVSDAIV 421
YKISFDNNQWQVTRLASNTTFTVTPDANGKVAFDGLELTFTGTPAVNDSFTLKPVSDAIV
Sbjct: 361 YKISFDNNQWQVTRLASNTTFTVTPDANGKVAFDGLELTFTGTPAVNDSFTLKPVSDAIV 420

Query: 422 NMDVLITDEAKIAMASEEDAGDSDNRNGQALLDLQSNSKTVGGAKSFNDAYASLVSDIGN 481
NMDVLITDEAKIAMASEEDAGDSDNRNGQALLDLQSNSKTVGGAKSFNDAYASLVSDIGN
Sbjct: 421 NMDVLITDEAKIAMASEEDAGDSDNRNGQALLDLQSNSKTVGGAKSFNDAYASLVSDIGN 480

Query: 482 KTATLKTSSTTQGNVVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFD 541
KTATLKTSS TQGNVVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFD
Sbjct: 481 KTATLKTSSATQGNVVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFD 540

Query: 542 ALINIR 547
ALINIR
Sbjct: 541 ALINIR 546


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits       Score  E-value  Comments
EcE24377A_1206  FLAGELLIN  45     2e-07    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 45.4 bits (107), Expect = 2e-07
Identities = 46/229 (20%), Positives = 84/229 (36%), Gaps = 15/229 (6%)

Query: 7 MMYQQNMRGITNSQAEWMKYGEQMSTGKRVVNPSDDPIAASQAVVLSQAQAQNSQYTLAR 66
++ Q N+ +S + + E++S+G R+ + DD AA QA+ + T A
Sbjct: 11 LLTQNNLNKSQSSLSSAI---ERLSSGLRINSAKDD--AAGQAIANRF-TSNIKGLTQAS 64

Query: 67 TFATQKVSL---EESVLSQVTTAIQNAQEKIVYASNGTLSDDDRASLATDIQGLRDQLLN 123
A +S+ E L+++ +Q +E V A+NGT SD D S+ +IQ +++
Sbjct: 65 RNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEEIDR 124

Query: 124 LANTTDGNGRYIFAGYKTETAPFSEEKGKYVGGAESIRQQVDASRSMVIGHTGDKIFDSI 183
++N T NG + + G E+I + +G G +
Sbjct: 125 VSNQTQFNGVKVLSQDNQMKIQVGANDG------ETITIDLQKIDVKSLGLDGFNVNGPK 178

Query: 184 TSNAVAEPDGSASETNLFAMLDSAIAALKTPVADSEADKEIAAAALDKT 232
+ + T A + + A DK
Sbjct: 179 EATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKV 227


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits         Score  E-value  Comments
EcE24377A_1207  IGASERPTASE  65     2e-12    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 65.1 bits (158), Expect = 2e-12
Identities = 47/261 (18%), Positives = 83/261 (31%), Gaps = 26/261 (9%)

Query: 551 VAPAPKAAPATPATPAQPGLLSRFFGALKALFSGSEETKPTEQP-APKAEAKPERQQDRR 609
P + S E + E P P A A P
Sbjct: 991 TVDTTNITTPNNIQADVPSVPSN----------NEEIARVDEAPVPPPAPATPSETT--- 1037

Query: 610 KPRQNNRRDRNERRDTRSERTEGSDNREENRRNRRQAQQQTAETRESRQQAEV------T 663
N ++++++ D E +NR A++ + + + Q EV T
Sbjct: 1038 -----ETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSET 1092

Query: 664 EKARTTDEQQAPRRERSRRRNDDKRQAQQEAKALNVEEQSVQETEQEERVRPVQPRRKQR 723
++ +TT+ ++ E+ + + + Q+ K + + QE + + + R
Sbjct: 1093 KETQTTETKETATVEKEEKAKVETEKTQEVPK-VTSQVSPKQEQSETVQPQAEPARENDP 1151

Query: 724 QLNQKVRYEQSVAEEAVVAPVVEETVAAEPIVQEAPAPRTELVKVPLPVVAQTAPEQQEE 783
+N K Q+ P E + E V E+ T V P A Q
Sbjct: 1152 TVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTV 1211

Query: 784 NNADNRDNGGMPRRSRRSPRH 804
N+ + RRS RS H
Sbjct: 1212 NSESSNKPKNRHRRSVRSVPH 1232



Score = 64.3 bits (156), Expect = 2e-12
Identities = 47/288 (16%), Positives = 84/288 (29%), Gaps = 36/288 (12%)

Query: 513 PSEEEFAERKRPEQPALATFAMPDVPPAPT-PAEPAAPVVAPAPKAAPATPATPAQPGLL 571
P E+ + DVP P+ E A AP P APATP+
Sbjct: 983 PEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETT----- 1037

Query: 572 SRFFGALKALFSGSEETKPTEQPAPKAEAKPERQQDRRKPRQNNRRDRNERRDTRSER-- 629
ET + Q QN + + + ++
Sbjct: 1038 ---------------ETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQT 1082

Query: 630 TEGSDNREENRRNRRQAQQQTAETRESRQQAEVTEKARTTDEQQAPRRERSRRRNDDKRQ 689
E + + E + + ++TA + + TEK + + + + + + Q
Sbjct: 1083 NEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQ 1142

Query: 690 AQ---QEAKALNVEEQSVQETEQEERVRPVQPRRKQRQLNQKVRYEQSV--AEEAVVAPV 744
A+ + +N++E Q + +P + + Q V +V V P
Sbjct: 1143 AEPARENDPTVNIKEPQSQTNTTADTEQPA--KETSSNVEQPVTESTTVNTGNSVVENPE 1200

Query: 745 VEETVAAEPIVQEAPA------PRTELVKVPLPVVAQTAPEQQEENNA 786
+P V + R + VP V T A
Sbjct: 1201 NTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVA 1248


82  EcE24377A_1282  EcE24377A_1293  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
EcE24377A_1282  1       22       -1.019434  entero Ail/Lom protein
EcE24377A_1284  0       19       -1.203962  phage tail domain-containing protein
EcE24377A_1285  0       13       -3.137395  tail fiber assembly protein
EcE24377A_1286  -2      11       -1.903920  hypothetical protein
EcE24377A_1287  -1      12       -0.663212  hypothetical protein
EcE24377A_1288  -1      12       -0.875913  spermidine/putrescine ABC transporter membrane
EcE24377A_1289  -1      11       -0.529842  putrescine/spermidine ABC transporter ATPase
EcE24377A_1290  -1      11       0.199776   peptidase T
EcE24377A_1291  -1      13       0.903558   cupin family protein
EcE24377A_1292  -2      14       0.437087   sensor protein PhoQ
EcE24377A_1293  -2      16       0.578470   DNA-binding transcriptional regulator PhoP
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_1282  ENTEROVIROMP  71     1e-19    Enterobacterial virulence outer membrane protein si...
		>ENTEROVIROMP#Enterobacterial virulence outer membrane protein signature.

Length = 171

Score = 71.5 bits (175), Expect = 1e-19
Identities = 22/61 (36%), Positives = 32/61 (52%)

Query: 21 GKTHDVLTGSDDGRHSNTSLAWGAGVQFNPTESVAIDIAYEGSGSGDWRTDGFIVGVSYK 80
GK + S+ ++GAG+QFNP E+VA+D +YE S +I GV Y+
Sbjct: 111 GKFQTTEYPTYKHDTSDYGFSYGAGLQFNPMENVALDFSYEQSRIRSVDVGTWIAGVGYR 170

Query: 81 F 81
F
Sbjct: 171 F 171


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits       Score  E-value  Comments
EcE24377A_1284  FLAGELLIN  50     4e-08    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 50.0 bits (119), Expect = 4e-08
Identities = 42/291 (14%), Positives = 83/291 (28%), Gaps = 3/291 (1%)

Query: 122 QNTAAAKKSASDAGTSAREAATHATDAAGSARAASTSAGQAASSAQSASSSAGTASTKAT 181
N + + ++G + + + A
Sbjct: 174 VNGPKEATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAAN 233

Query: 182 EA-SKSAAAAESSKSAAATSASAAKTSETNAAASQKSAATSASAATTKASEAATSARDAA 240
+ A ++ T+ S A T+E A A K +
Sbjct: 234 GQLTTDDAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTIDTKTGN 293

Query: 241 ASKEAAKSSETNASSSASSAASSATAAGNSAKAAKTSETNARS--SETAAGQSASAAAGS 298
++ + + A +A AA A ++S+ S + + +
Sbjct: 294 DGNGKVSTTINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKTKNESA 353

Query: 299 KTAAASSASAASTSAGQASASATAAGKSAESAASSASTATTKAGEATEQASAAARSASAA 358
K + + +A + A +A + A A+ ++ A+AA
Sbjct: 354 KLSDLEANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFIDKTASGVSTLINEDAAAA 413

Query: 359 KTSETNAKTSADNAASSKAAAASSAGSAASSASSASASKDEATRQASAAKG 409
K S N S D+A S A SS G+ + SA + ++A+
Sbjct: 414 KKSTANPLASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARS 464



Score = 45.0 bits (106), Expect = 2e-06
Identities = 44/358 (12%), Positives = 100/358 (27%), Gaps = 10/358 (2%)

Query: 149 AGSARAASTSAGQAASSAQSASSSAGTASTKATEASKSAAAAE--SSKSAAATSASAAKT 206
A + QA+ +A S A T E + + S ++ T++ +
Sbjct: 50 ANRFTSNIKGLTQASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLK 109

Query: 207 SETNAAASQKSAATSASAATTKASEAATSARDAAASKEAAKSSET--------NASSSAS 258
S + + S T S + + A ET + S
Sbjct: 110 SIQDEIQQRLEEIDRVSNQTQFNGVKVLSQDNQMKIQVGANDGETITIDLQKIDVKSLGL 169

Query: 259 SAASSATAAGNSAKAAKTSETNARSSETAAGQSASAAAGSKTAAASSASAASTSAGQASA 318
+ + K+S N +T A + + A + + A T +
Sbjct: 170 DGFNVNGPKEATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYV 229

Query: 319 SATAAGKSAESAASSASTATTKAGEATEQASAAARSASAAKTSETNAKTSADNAASSKAA 378
+A + + A ++ + K ++T + A A A K + +
Sbjct: 230 NAANGQLTTDDAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTIDT 289

Query: 379 AASSAGSAASSASSASASKDEATRQASAAKGSATTATTKALEAAGSATAASQSKVAAESA 438
+ G+ S + +A + AT ++ + ++ Q ++
Sbjct: 290 KTGNDGNGKVSTTINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKTK 349

Query: 439 ATRAETAAKRAEDIASAVALEDASTTKKGIVQLSSATNSTSETLAATPKAVKSAYDNA 496
A+ + A + + + + +T+ A +
Sbjct: 350 NESAKLSDLEANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFIDKTASGVSTLIN 407


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits         Score  E-value  Comments
EcE24377A_1286  LUXSPROTEIN  30     0.004    Bacterial autoinducer-2 (AI-2) production protein Lu...
		>LUXSPROTEIN#Bacterial autoinducer-2 (AI-2) production protein LuxS signature.

Length = 171

Score = 29.9 bits (67), Expect = 0.004
Identities = 17/66 (25%), Positives = 30/66 (45%), Gaps = 7/66 (10%)

Query: 37 TKEHLLPHFL-EHVGNNHLDI------GVGTGFYLTHVPESSLISLMDLNEASLNAASTR 89
T EHL F+ H+ + ++I G TGFY++ + S + D A++
Sbjct: 54 TLEHLYAGFMRNHLNGDSVEIIDISPMGCRTGFYMSLIGTPSEQQVADAWIAAMEDVLKV 113

Query: 90 AGESKI 95
++KI
Sbjct: 114 ENQNKI 119


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_1289  PF05272  30     0.017    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.0 bits (67), Expect = 0.017
Identities = 10/36 (27%), Positives = 19/36 (52%), Gaps = 1/36 (2%)

Query: 46 LTLLGPSGCGKTTVLRLIAGLE-TVDSGRIMLDNED 80
+ L G G GK+T++ + GL+ D+ + +D
Sbjct: 599 VVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGTGKD 634


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_1292  PF06580  29     0.048    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 28.7 bits (64), Expect = 0.048
Identities = 11/69 (15%), Positives = 22/69 (31%), Gaps = 20/69 (28%)

Query: 389 NACKYCLE------FVEISARQTDEHLYIVVEDDGPGIPLSKREVIFDRGQRVDTLRPGQ 442
N K+ + + + + + + + VE+ G + +E
Sbjct: 266 NGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKE--------------ST 311

Query: 443 GVGLAVARE 451
G GL RE
Sbjct: 312 GTGLQNVRE 320


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits    Score  E-value  Comments
EcE24377A_1293  HTHFIS  87     5e-22    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 86.8 bits (215), Expect = 5e-22
Identities = 31/124 (25%), Positives = 62/124 (50%)

Query: 2 RVLVVEDNALLRHHLKVQIQDAGHQVDDAEDAKEADYYLNEHLPDIAIVDLGLPDEDGLS 61
+LV +D+A +R L + AG+ V +A ++ D+ + D+ +PDE+
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 LIRRWRSNDVSLPILVLTARESWQDKVEVLSAGADDYVTKPFHIEEVMARMQALMRRNSG 121
L+ R + LP+LV++A+ ++ ++ GA DY+ KPF + E++ + +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 122 LASQ 125
S+
Sbjct: 125 RPSK 128


83  EcE24377A_1344  EcE24377A_1348  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
EcE24377A_1344  -1      16       -1.287400  dihydroxyacetone kinase subunit DhaM
EcE24377A_1345  -2      14       -1.096171  dihydroxyacetone kinase subunit DhaL
EcE24377A_1346  -2      14       -1.375865  dihydroxyacetone kinase subunit DhaK
EcE24377A_1347  -1      16       -1.637760  DNA-binding transcriptional regulator DhaR
EcE24377A_1348  1       16       -0.746135  outer membrane autotransporter
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_1344  PHPHTRNFRASE  143    3e-39    Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase signature.

Length = 572

Score = 143 bits (362), Expect = 3e-39
Identities = 62/206 (30%), Positives = 102/206 (49%), Gaps = 1/206 (0%)

Query: 258 GKAFYYQPVLCTVQAKSTLTVEEEQDRLRQAIDFTLLDLMTLTAKAEASGLDDIAAIFSG 317
KAF + ++ S V E ++L A++ + +L + + EAS D A IF+
Sbjct: 17 AKAFIHLEPNVDIEKTSITDVSTEIEKLTAALEKSKEELRAIKDQTEASMGADKAEIFAA 76

Query: 318 HHTLLDDPELLAAASELLQHEHCTAEYAWQQVLKELSQQYQQLDDEYLQARYIDVDDLLH 377
H +LDDPEL+ +++E AEYA ++V ++ +D+EY++ R D+ D+
Sbjct: 77 HLLVLDDPELVDGIKGKIENEQMNAEYALKEVSDMFVSMFESMDNEYMKERAADIRDVSK 136

Query: 378 RTLVHLT-QTKEELPQFNSPTILLAENIYPSTVLQLDPAVVKGICLSAGSPVSHSALIAR 436
R L HL L T+++AE++ PS QL+ VKG G SHSA+++R
Sbjct: 137 RVLGHLIGVETGSLATIAEETVIIAEDLTPSDTAQLNKQFVKGFATDIGGRTSHSAIMSR 196

Query: 437 ELGIGWICQQGEKLYAIQPEETLTLD 462
L I + E IQ + + +D
Sbjct: 197 SLEIPAVVGTKEVTEKIQHGDMVIVD 222


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits         Score  E-value  Comments
EcE24377A_1345  adhesinmafb  28     0.040    Neisseria meningitidis: adhesin MafB signature.
		>adhesinmafb#Neisseria meningitidis: adhesin MafB signature.

Length = 467

Score = 27.7 bits (61), Expect = 0.040
Identities = 10/47 (21%), Positives = 25/47 (53%)

Query: 138 VESLRQSSEQNLSVPVALEAASRIAESAAQSTITMQARKGRASYLGE 184
E++ + ++N + +EA +A +A + + A+ G+A+ G+
Sbjct: 293 REAVDRWIQENPNAAETVEAVFNVAAAAKVAKLAKAAKPGKAAVSGD 339


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits    Score  E-value  Comments
EcE24377A_1347  HTHFIS  246    2e-76    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 246 bits (629), Expect = 2e-76
Identities = 91/363 (25%), Positives = 155/363 (42%), Gaps = 33/363 (9%)

Query: 308 QMRQLMTSQLGKVSHTFAHMPQDDPQTRRLIHFGRQAARSSFPVLLCGEEGVGKALLSQA 367
+ S+L S + + + + ++ +++ GE G GK L+++A
Sbjct: 120 AEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARA 179

Query: 368 IHNESERAAGPYIAVNCELYGDAALAEEFIG---GDRTDNENGRLSRLELAHGGTLFLEK 424
+H+ +R GP++A+N + E G G T + R E A GGTLFL++
Sbjct: 180 LHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDE 239

Query: 425 IEYLAVELQSALLQVIKQGVITRLDARRLIPVDVKVIATTTADLAMLVEQNRFSRQLYYA 484
I + ++ Q+ LL+V++QG T + R I DV+++A T DL + Q F LYY
Sbjct: 240 IGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYR 299

Query: 485 LHAFEITIPPLRMRRGSIPALVNNKLRSLEKRFSTRLKIDDDALARLVSCAWPGNDFELY 544
L+ + +PPLR R IP LV + ++ EK + D +AL + + WPGN EL
Sbjct: 300 LNVVPLRLPPLRDRAEDIPDLVRHFVQQAEKEGLDVKRFDQEALELMKAHPWPGNVRELE 359

Query: 545 SVIENLALSSDNGRIRVSDLPEHLFTEQATDDVSATRLSTS------------------- 585
+++ L I + L +E + +
Sbjct: 360 NLVRRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFASF 419

Query: 586 -----------LSFAEVEKEAIINAAQVTGGRIQEMSALLGIGRTTLWRKMKQHGIDAGQ 634
AE+E I+ A T G + + LLG+ R TL +K+++ G+ +
Sbjct: 420 GDALPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGVSVYR 479

Query: 635 FKR 637
R
Sbjct: 480 SSR 482


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_1348  PRTACTNFAMLY  211    6e-59    Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 211 bits (539), Expect = 6e-59
Identities = 240/984 (24%), Positives = 397/984 (40%), Gaps = 119/984 (12%)

Query: 14 RLAELKIRSPSIQLIKFGAIGLNAIIFSPLLIAADTGSQYGTNITINEGDRI---TGDTA 70
+ A L+ + ++ L GA ++ I Q+G +I ++ + +G T
Sbjct: 10 KAAPLRRTTLAMALGALGAAPAAHADWNNQSIVKTGERQHGIHIQGSDPGGVRTASGTTI 69

Query: 71 DPSGN-LYGVMTPAGNTPGNINLGNDVTVN---VNDASGYAKGIIIQGKNSSLTANRLTV 126
SG G++ N + N + ++D + K L A+ T+
Sbjct: 70 KVSGRQAQGILLE--NPAAELQFRNGSVTSSGQLSDDGIRRFLGTVTVKAGKLVADHATL 127

Query: 127 DVVGQT---SAIGINLIGDYTHADLGTGSTIKSNDDGIIIGHSSTLTATQFTIENSNGIG 183
VG T I + + G+ A + ST++ G+ I + +T + I + G+
Sbjct: 128 ANVGDTWDDDGIALYVAGEQAQASIAD-STLQGAG-GVQIERGANVTVQRSAIVD-GGLH 184

Query: 184 LTINDYGTSVDLGSGSKITTDGS-TGVYIGGLNGNNANGAARFTATDLTID---VQGYSA 239
+ DL + D + T V G + A++LT+D + G A
Sbjct: 185 IGALQSLQPEDLPPSRVVLRDTNVTAVPASGAPA----AVSVLGASELTLDGGHITGGRA 240

Query: 240 MGINVQKNSVVDLGTNSSIKTNGDNAHGLWSFGQVSANAL-------TVDVTGAAANGVE 292
G+ + +VV L ++I+ A G G V A+ GV+
Sbjct: 241 AGVAAMQGAVVHL-QRATIRRGDAPAGGAVPGGAVPGGAVPGGFGPGGFGPVLDGWYGVD 299

Query: 293 VRGGTTTIGADSHISSAQGGGLVTSGSDATINFSGTAAQRNSIFSGGSYGASAQTATAVV 352
V G + + A S + + + G + G A + SG + +G +T A
Sbjct: 300 VSGSSVEL-AQSIVEAPELGAAIRVGRGARVTVSGGSLS-------APHGNVIETGGARR 351

Query: 353 NM-QNTDITVD-RNGSLALGLWALSGGRITGDSLAITGAAGARGIYAMTNSQIDLTSDLV 410
Q +++ + G+ A G L L +TG A A+G T + +
Sbjct: 352 FAPQAAPLSITLQAGAHAQGKALLYRVLPEPVKLTLTGGADAQGDIVATELPSIPGTSI- 410

Query: 411 IDMSTPDQMAIATQHDDGYAASRINASGRMLINGSVLSKGGLINLDMHPGSVWTGSSLSD 470
P +A+A+ + WTG++
Sbjct: 411 ----GPLDVALAS------------------------------------QARWTGAT--R 428

Query: 471 NVNGGKLDVAMNNSVWNVTSNSNLDTLAL-SHSTVDFASHGSTAGTFATLNVENLSGNST 529
V+ +D N+ W +T NSN+ L L S +VDF + AG F L V L+G+
Sbjct: 429 AVDSLSID----NATWVMTDNSNVGALRLASDGSVDFQQ-PAEAGRFKVLTVNTLAGSGL 483

Query: 530 FIMRADVVGEGNGVNNKGDLLNISGSSAGNHVLAIRNQGSEATTGNEVLTVVKTTDGAAS 589
F M D L + ++G H L +RN GSE + N +L V AA+
Sbjct: 484 FRMNV------FADLGLSDKLVVMQDASGQHRLWVRNSGSEPASANTLLLVQTPLGSAAT 537

Query: 590 FSASS---QVELGGYLYDVRKNA-TNWELYASGTVPEPTPNPEPTPAPAQPPIVNPDPTP 645
F+ ++ +V++G Y Y + N W L + P P P P+P P P QPP P+
Sbjct: 538 FTLANKDGKVDIGTYRYRLAANGNGQWSLVGAKAPPAPKPAPQPGPQPPQPPQPQPEA-- 595

Query: 646 EPDPTPNPTPTPKPTTTADAGGNYLNVGYL--LNYVENRTLMQRMGDLRNQSKDGNIWLR 703
P P P + + A+A N VG L Y E+ L +R+G+LR G W R
Sbjct: 596 ---PAPQPPAGRELSAAANAAVNTGGVGLASTLWYAESNALSKRLGELRLNPDAGGAWGR 652

Query: 704 SYG--GSLDSFASGKLSGFDMGYSGIQFGGDKRLSDVM-PLYVGLYIGSTHASPDYSG-G 759
+ LD+ A + FD +G + G D ++ ++G G T ++G G
Sbjct: 653 GFAQRQQLDNRAGRR---FDQKVAGFELGADHAVAVAGGRWHLGGLAGYTRGDRGFTGDG 709

Query: 760 DGTARSDYMGMYASYMAHNGFYSDLVVKASRQKNSFHVLDSQNNGVNANGTANGLSISLE 819
G S ++G YA+Y+A +GFY D ++ASR +N F V S V +G+ SLE
Sbjct: 710 GGHTDSVHVGGYATYIADSGFYLDATLRASRLENDFKVAGSDGYAVKGKYRTHGVGASLE 769

Query: 820 AGQRFNLTPTGYGFYIEPQTQLTYSHQNEMAMKASNGLNIHLNHYESLLGRASMILGYDI 879
AG+RF G+++EPQ +L A +A+NGL + S+LGR + +G I
Sbjct: 770 AGRRFTHAD---GWFLEPQAELAVFRAGGGAYRAANGLRVRDEGGSSVLGRLGLEVGKRI 826

Query: 880 T-AGNSQLNMYVKTGAIREFSGDTDYLLNNSREKYSFKGNGWNNGVGVSAQYNKQHTFYL 938
AG Q+ Y+K ++EF G N + +G G+G++A + H+ Y
Sbjct: 827 ELAGGRQVQPYIKASVLQEFDGAGTVHTNGIAHRTELRGTRAELGLGMAAALGRGHSLYA 886

Query: 939 EADYTQSNLFDQK-QVNGGYRFSF 961
+Y++ + GYR+S+
Sbjct: 887 SYEYSKGPKLAMPWTFHAGYRYSW 910


84  EcE24377A_1370  EcE24377A_1374  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias   Product
EcE24377A_1370  -1      11       1.431024  hypothetical protein
EcE24377A_1371  -1      18       1.928635  transcriptional regulator NarL
EcE24377A_1372  -1      19       2.150096  nitrate/nitrite sensor protein NarX
EcE24377A_1373  0       25       2.450325  hypothetical protein
EcE24377A_1374  -1      24       2.191397  nitrite extrusion protein 1
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_1370  INTIMIN  257    6e-79    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 257 bits (658), Expect = 6e-79
Identities = 120/378 (31%), Positives = 195/378 (51%), Gaps = 21/378 (5%)

Query: 79 GEQAKAFALGKVRDALSQQVNQHVESWLSPWGNASVDVKVDNEGHFTGSRGSWFVPLQDN 138
G+ AK ALG + Q + +++WL +G A V+++ N F GS + +P D+
Sbjct: 184 GDYAKDTALGIAGN----QASSQLQAWLQHYGTAEVNLQSGNN--FDGSSLDFLLPFYDS 237

Query: 139 DRYLTWSQLGLTQQDDGLVSNVGVGQRWARGNWLVGYNTFYDNLLDENLQRAGFGAEAWG 198
++ L + Q+G D +N+G GQR+ ++GYN F D + R G G E W
Sbjct: 238 EKMLAFGQVGARYIDSRFTANLGAGQRFFLPENMLGYNVFIDQDFSGDNTRLGIGGEYWR 297

Query: 199 EYLRLSANFYQPFAAWHE--QTATQEQRMARGYDLTARMRMPFYQHLNTSVSVEQYFGDR 256
+Y + S N Y + WHE ++R A G+D+ +P Y L + EQY+GD
Sbjct: 298 DYFKSSVNGYFRMSGWHESYNKKDYDERPANGFDIRFNGYLPSYPALGAKLMYEQYYGDN 357

Query: 257 VDLFNSGTGYHNPVALSLGLNYTPVPLVTVTAQHKQGESGENQNNLGLNLNYRFGVPLKK 316
V LFNS NP A ++G+NYTP+PLVT+ ++ G EN + Y+F P +
Sbjct: 358 VALFNSDKLQSNPGAATVGVNYTPIPLVTMGIDYRHGTGNENDLLYSMQFRYQFDKPWSQ 417

Query: 317 QLSAGEVAESQSLRGSRYDNPQRNNLPTLEYRQRKTLTVFLATPPWDLKPGETVPLKLQI 376
Q+ V E ++L GSRYD QRNN LEY+++ L++ + + T ++L +
Sbjct: 418 QIEPQYVNELRTLSGSRYDLVQRNNNIILEYKKQDILSLNI-PHDINGTERSTQKIQLIV 476

Query: 377 RSRYGIRQLIWQGDTQILS-----LTPGAQANSAEGWTLIMPDWQNGERASNHWRLSVVV 431
+S+YG+ +++W D+ + S G+Q SA+ + I+P + G SN ++++
Sbjct: 477 KSKYGLDRIVWD-DSALRSQGGQIQHSGSQ--SAQDYQAILPAYVQG--GSNVYKVTARA 531

Query: 432 EDNQGQRVSSNEITLTLV 449
D G SSN + LT+
Sbjct: 532 YDRNGN--SSNNVLLTIT 547


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits    Score  E-value  Comments
EcE24377A_1371  HTHFIS  74     2e-17    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 73.7 bits (181), Expect = 2e-17
Identities = 32/117 (27%), Positives = 56/117 (47%), Gaps = 2/117 (1%)

Query: 7 ATILLIDDHPMLRTGVKQLISMAPDITVVGEASNGEQGIELAESLDPDLILLDLNMPGMN 66
ATIL+ DD +RT + Q +S A + SN + D DL++ D+ MP N
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRI--TSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 67 GLETLDKLREKSLSGRIVVFSVSNHEEDVVTALKRGADGYLLKDMEPEDLLKALHQA 123
+ L ++++ ++V S N + A ++GA YL K + +L+ + +A
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRA 118


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_1372  PF06580  53     1e-09    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 53.3 bits (128), Expect = 1e-09
Identities = 36/172 (20%), Positives = 73/172 (42%), Gaps = 23/172 (13%)

Query: 424 PESSRELLSQIRNELNASWAQLRELLTTFRLQLTEPGLRPALEASCEEYSAKFGFPVKLD 483
P +RE+L+ + + S + +LT +++ + S +F ++ +
Sbjct: 190 PTKAREMLTSLSELMRYSLRYSNARQVSLADELT------VVDSYLQLASIQFEDRLQFE 243

Query: 484 YQLPPRL----VPSHQAIHLLQIAREALSNALKH-----SQASEVVVTVAQNDNQVKLTV 534
Q+ P + VP L+Q E N +KH Q ++++ +++ V L V
Sbjct: 244 NQINPAIMDVQVPPM----LVQTLVE---NGIKHGIAQLPQGGKILLKGTKDNGTVTLEV 296

Query: 535 QDNGCGVPENAIRSNHYGMIIMRDRAQSLRG-DCRVRRRESGGTEVVVTFIP 585
++ G +N S G+ +R+R Q L G + +++ E G + IP
Sbjct: 297 ENTGSLALKNTKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_1374  ACRIFLAVINRP  31     0.012    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 31.0 bits (70), Expect = 0.012
Identities = 35/166 (21%), Positives = 60/166 (36%), Gaps = 22/166 (13%)

Query: 258 IMSLLYLATFGSFIGFSAGFAMLSKTQFPDVQILQYAFFGPFIGALARSA---GGALSDR 314
I+S + L+ + I A A L K + + FFG F S ++
Sbjct: 474 IVSAMALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKI 533

Query: 315 LGGTRVTLVNFILMAIFSGLLFLTLPTD----GQGGSFMAFFAVFLALFLTAGLGSGSTF 370
LG T L+ + L+ +LFL LP+ G F+ L +G+T
Sbjct: 534 LGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTM----------IQLPAGATQ 583

Query: 371 QMISVIFRKLTMDRVKAEGGSDER-----AMREAATDTAAALGFIS 411
+ + ++T +K E + E + A + F+S
Sbjct: 584 ERTQKVLDQVTDYYLKNEKANVESVFTVNGFSFSGQAQNAGMAFVS 629


85  EcE24377A_2115  EcE24377A_2123  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
EcE24377A_2115  0       13       1.118420   chemotaxis regulatory protein CheY
EcE24377A_2116  -1      11       1.507400   chemotaxis-specific methylesterase
EcE24377A_2117  -1      12       1.420636   chemotaxis methyltransferase CheR
EcE24377A_2118  -1      12       1.137792   methyl-accepting protein IV
EcE24377A_2119  -1      13       0.628597   methyl-accepting chemotaxis protein II
EcE24377A_2120  -1      12       0.155662   purine-binding chemotaxis protein
EcE24377A_2121  -1      14       0.157995   chemotaxis protein CheA
EcE24377A_2122  -1      19       -1.739170  flagellar motor protein MotB
EcE24377A_2123  -1      18       -2.771361  flagellar motor protein MotA
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits    Score  E-value  Comments
EcE24377A_2115  HTHFIS  90     4e-24    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 89.5 bits (222), Expect = 4e-24
Identities = 30/105 (28%), Positives = 51/105 (48%), Gaps = 3/105 (2%)

Query: 7 KFLVVDDFSTMRRIVRNLLKELGFNNVEEAEDGVDALNKLQAGGYGFVISDWNMPNMDGL 66
LV DD + +R ++ L G++ V + + AG V++D MP+ +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYD-VRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 67 ELLKTIRADGAMSALPVLMVTAEAKKENIIAAAQAGASGYVVKPF 111
+LL I+ LPVL+++A+ I A++ GA Y+ KPF
Sbjct: 64 DLLPRIKKARPD--LPVLVMSAQNTFMTAIKASEKGAYDYLPKPF 106


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits    Score  E-value  Comments
EcE24377A_2116  HTHFIS  65     8e-14    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 65.2 bits (159), Expect = 8e-14
Identities = 35/188 (18%), Positives = 72/188 (38%), Gaps = 23/188 (12%)

Query: 1 MSKIRVLSVDDSALMRQIMTEIINSHSDMEMVATAPDPLVARDLIKKFNPDVLTLDVEMP 60
M+ +L DD A +R ++ + ++ V + I + D++ DV MP
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAG--YDVRITSNAATLWRWIAAGDGDLVVTDVVMP 58

Query: 61 RMDGLDFLEKLMRLRPMPVVMVSSLTGKGS-EVTLRALELGAIDFVTKPQLGIREGMLAY 119
+ D L ++ + RP V+V ++ + + ++A E GA D++ KP + E +
Sbjct: 59 DENAFDLLPRIKKARPDLPVLV--MSAQNTFMTAIKASEKGAYDYLPKP-FDLTELIGII 115

Query: 120 SEMIAEKVRTAAKASLAAHKPLSAPTTLKAGPLLSSEKLIAIGASTGGTEAIRHVLQPLP 179
+AE R +K + + +G S E R + + +
Sbjct: 116 GRALAEPKRRPSKLEDDSQDGMP-----------------LVGRSAAMQEIYRVLARLMQ 158

Query: 180 LSSPALLI 187
++
Sbjct: 159 TDLTLMIT 166


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_2121  PF06580  43     4e-06    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 42.5 bits (100), Expect = 4e-06
Identities = 23/151 (15%), Positives = 49/151 (32%), Gaps = 52/151 (34%)

Query: 361 ELDKSLIERIIDPLT--HLVRNSLDHGIELPEKRLAAGKNSVGNLILSAEHQGGNICIEV 418
+++ ++++ + P+ LV N + HGI G ++L G + +EV
Sbjct: 245 QINPAIMDVQVPPMLVQTLVENGIKHGIA--------QLPQGGKILLKGTKDNGTVTLEV 296

Query: 419 TDDGAGLNRERILAKAASQGLTVSENMSDDEVAMLIFAPGFSTAEQVTDVSGRGVGMDVV 478
+ G+ + G G+ V
Sbjct: 297 ENTGSLALKNTK--------------------------------------ESTGTGLQNV 318

Query: 479 KRNIQEMGG---HVEIQSKQGTGTTIRILLP 506
+ +Q + G +++ KQG +L+P
Sbjct: 319 RERLQMLYGTEAQIKLSEKQG-KVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_2122  PF05272  30     0.010    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.4 bits (68), Expect = 0.010
Identities = 22/93 (23%), Positives = 35/93 (37%), Gaps = 11/93 (11%)

Query: 46 LISISSPKELIQIAEYFRTPLATAVTGGDRISNSESPIPGGGDDYTQSQGEVNKQPNIEE 105
L +SSP A P + G + ++ PGGGDD GE +++
Sbjct: 384 LADVSSPTAAAGGAGGGEPPKKRDPSAG---AGTDPGGPGGGDD-----GEDPFGEWLDD 435

Query: 106 LKKRM---EQSRLRKLRGDLDQLIESDPKLRAL 135
R+ + L+ R L + + S P L
Sbjct: 436 EVARLRLRGRWLLKPRRAALIEALRSAPALAGC 468


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_2123  PF05844  33     0.001    YopD protein
		>PF05844#YopD protein

Length = 295

Score = 33.1 bits (75), Expect = 0.001
Identities = 12/28 (42%), Positives = 22/28 (78%), Gaps = 2/28 (7%)

Query: 76 MDLLALLYRLMAKSRQMGMFSLERDIEN 103
++LL +L+R+ K+R++G+ L+RD EN
Sbjct: 74 VELLLILFRIAQKARELGV--LQRDNEN 99


86  EcE24377A_2158  EcE24377A_2183  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
EcE24377A_2158  0       12       -1.475137  flagellin
EcE24377A_2159  -2      16       0.589610   flagellar capping protein
EcE24377A_2160  -1      13       0.289693   flagellar protein FliS
EcE24377A_2161  0       13       0.569057   flagellar biosynthesis protein FliT
EcE24377A_2162  -1      10       0.350996   alpha-amylase
EcE24377A_2163  -1      16       -1.467825  hypothetical protein
EcE24377A_2164  -1      15       -1.780854  inner membrane protein
EcE24377A_2165  -3      18       -1.310841  hypothetical protein
EcE24377A_2167  -1      13       -0.242808  hypothetical protein
EcE24377A_2166  0       14       0.313836   IS605 family transposase OrfB
EcE24377A_2169  -1      13       1.870205   AraC family transcriptional regulator
EcE24377A_2170  1       16       4.129988   flagellar hook-basal body protein FliE
EcE24377A_2171  1       15       4.049008   flagellar MS-ring protein
EcE24377A_2172  2       17       4.165389   flagellar motor switch protein G
EcE24377A_2173  0       16       3.557314   flagellar assembly protein H
EcE24377A_2174  -1      17       3.224347   flagellum-specific ATP synthase
EcE24377A_2175  0       16       2.253082   flagellar biosynthesis chaperone
EcE24377A_2176  -1      16       2.333444   flagellar hook-length control protein
EcE24377A_2177  -2      21       1.775186   flagellar basal body protein FliL
EcE24377A_2178  0       16       0.441516   flagellar motor switch protein FliM
EcE24377A_2179  1       16       -2.536901  flagellar motor switch protein FliN
EcE24377A_2180  1       17       -3.153425  flagellar biosynthesis protein FliO
EcE24377A_2181  0       19       -4.069572  flagellar biosynthesis protein FliP
EcE24377A_2182  0       20       -4.247065  flagellar biosynthesis protein FliQ
EcE24377A_2183  -2      17       -3.020189  flagellar biosynthesis protein FliR
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits       Score  E-value  Comments
EcE24377A_2158  FLAGELLIN  96     5e-23    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 95.9 bits (238), Expect = 5e-23
Identities = 99/338 (29%), Positives = 140/338 (41%), Gaps = 5/338 (1%)

Query: 246 TATDYTYNSATGDFTYSATIAAGTNSGDSNSAQLQSFLTPKAGDTANLNVKIGSTSIDVV 305
+ ++ T + + +G V
Sbjct: 170 DGFNVNGPKEATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYV 229

Query: 306 LASDGKITAKDGSELFIDVDGNLTQNNAGTVKAATLDALTKNWHTTGTPGAVSTVITTED 365
A++G++T D T++ AGT +A + K T T +
Sbjct: 230 NAANGQLTTDDAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTIDT 289

Query: 366 ETTFTLAGGTNATTSGAITVANARMSAESLQSATKSTGFTVDVGATGNSAGDIKVDSKGI 425
+T G + T +G + +T + T G D K
Sbjct: 290 KTGNDGNGKVSTTINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKTK 349

Query: 426 VQQYTGTVFEDAYTKADGSLTTDNTTNLFLQKDGTVTNGSGKAVYVSA-----DGNFTTD 480
+ + E S T N G +GK +++ D
Sbjct: 350 NESAKLSDLEANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFIDKTASGVSTLINED 409

Query: 481 AETKAATTADLLKALDEAISSIDKFRSSLGAVQNRLDSAVTNLNNTTTNLSEAQSRIQDA 540
A +TA+ L ++D A+S +D RSSLGA+QNR DSA+TNL NT TNL+ A+SRI+DA
Sbjct: 410 AAAAKKSTANPLASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARSRIEDA 469

Query: 541 DYATEVSNMSKAQIIQQAGNSVLAKANQVPQQVLSLLQ 578
DYATEVSNMSKAQI+QQAG SVLA+ANQVPQ VLSLL+
Sbjct: 470 DYATEVSNMSKAQILQQAGTSVLAQANQVPQNVLSLLR 507


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_2159  TYPE3OMBPROT  33     0.003    Type III secretion system outer membrane B protein ...
		>TYPE3OMBPROT#Type III secretion system outer membrane B protein family signature.

Length = 538

Score = 32.7 bits (74), Expect = 0.003
Identities = 27/95 (28%), Positives = 43/95 (45%), Gaps = 2/95 (2%)

Query: 214 NGMEVSVAAQNAQLTVNNVAIENSSNTISDALENITLNLNDVTTGNQTLTITQDTSKAQT 273
N E +VAA+N + + A+ + +S AL T++L V+T LT T T ++
Sbjct: 236 NSSERAVAARNKAEELVSAALYSRPELLSQALSGKTVDLKIVSTS--LLTPTSLTGGEES 293

Query: 274 AIKDWVNAYNSLIDTFSSLTKYTAVDAGADSQSSS 308
+KD VNA L TK ++ + S
Sbjct: 294 MLKDQVNALKGLNSKRGEPTKLLIRNSDGLLKEVS 328


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_2164  RTXTOXIND     30     0.017    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 30.2 bits (68), Expect = 0.017
Identities = 10/57 (17%), Positives = 17/57 (29%), Gaps = 2/57 (3%)

Query: 164 RFTLLPIFRIPVKMQKVSAASPLTQKPDQARRRF--RLGMLVFFGMLGWALLTAMNQ 218
R L R + + + A L + P R R M ++L +
Sbjct: 26 RKQLDTPVREKDENEFLPAHLELIETPVSRRPRLVAYFIMGFLVIAFILSVLGQVEI 82


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_2165  PF01206       93     6e-29    SirA family protein
		>PF01206#SirA family protein

Length = 76

Score = 92.5 bits (230), Expect = 6e-29
Identities = 16/71 (22%), Positives = 37/71 (52%)

Query: 7 DYRLDMVGEPCPYPAVATLEAMPQLKKGEILEVVSDCPQSINNIPLDARNHGYTVLDIQQ 66
D LD G CP P + + + + GE+L V++ P S+ + ++ G+ +L+ ++
Sbjct: 5 DQSLDATGLNCPLPILKAKKTLATMNAGEVLYVMATDPGSVKDFESFSKQTGHELLEQKE 64

Query: 67 DGPTIRYLIQK 77
+ T + +++
Sbjct: 65 EDGTYHFRLKR 75


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_2170  FLGHOOKFLIE   117    4e-38    Flagellar hook-basal body complex protein FliE signa...
		>FLGHOOKFLIE#Flagellar hook-basal body complex protein FliE signature.

Length = 103

Score = 117 bits (295), Expect = 4e-38
Identities = 101/103 (98%), Positives = 102/103 (99%)

Query: 2 SAIQGIEGVISLLQATAMSARAQDSLPQPTISFAGQLHAALDRISDTQTAARTQAEKFTL 61
SAIQGIEGVIS LQATAMSARAQ+SLPQPTISFAGQLHAALDRISDTQTAARTQAEKFTL
Sbjct: 1 SAIQGIEGVISQLQATAMSARAQESLPQPTISFAGQLHAALDRISDTQTAARTQAEKFTL 60

Query: 62 GEPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV 104
GEPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV
Sbjct: 61 GEPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV 103


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_2171  FLGMRINGFLIF  755    0.0      Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 755 bits (1951), Expect = 0.0
Identities = 477/555 (85%), Positives = 514/555 (92%), Gaps = 5/555 (0%)

Query: 3 ATAAQTKSLEWLNRLRANPKIPLIVAGSAAVAVMVALILWAKAPDYRTLFSNLSDQDGGA 62
+TA Q K LEWLNRLRANP+IPLIVAGSAAVA++VA++LWAK PDYRTLFSNLSDQDGGA
Sbjct: 5 STATQPKPLEWLNRLRANPRIPLIVAGSAAVAIVVAMVLWAKTPDYRTLFSNLSDQDGGA 64

Query: 63 IVSQLTQMNIPYRFSEASGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQ 122
IV+QLTQMNIPYRF+ SGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQ
Sbjct: 65 IVAQLTQMNIPYRFANGSGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQ 124

Query: 123 FSEQVNYQRALEGELSRTIATIGPVKGARVHLAMPKPSLFVREQKSPSASVTINLLPGRA 182
FSEQVNYQRALEGEL+RTI T+GPVK ARVHLAMPKPSLFVREQKSPSASVT+ L PGRA
Sbjct: 125 FSEQVNYQRALEGELARTIETLGPVKSARVHLAMPKPSLFVREQKSPSASVTVTLEPGRA 184

Query: 183 LDEGQISAIVHLVSSAVAGLPPGNVTLVDQGGHLLTQSNTSGRDLNDAQLKYASDVEGRI 242
LDEGQISA+VHLVSSAVAGLPPGNVTLVDQ GHLLTQSNTSGRDLNDAQLK+A+DVE RI
Sbjct: 185 LDEGQISAVVHLVSSAVAGLPPGNVTLVDQSGHLLTQSNTSGRDLNDAQLKFANDVESRI 244

Query: 243 QRRIEAILSPIVGNGNIHAQVTAQLDFASKEQTEEQYRPNGDESHAALRSRQLNESEQSG 302
QRRIEAILSPIVGNGN+HAQVTAQLDFA+KEQTEE Y PNGD S A LRSRQLN SEQ G
Sbjct: 245 QRRIEAILSPIVGNGNVHAQVTAQLDFANKEQTEEHYSPNGDASKATLRSRQLNISEQVG 304

Query: 303 SGYPGGVPGALSNQPAPANNAPISTPPANQNNRQQ--QASTTSNS---GPRSTQRNETSN 357
+GYPGGVPGALSNQPAP N API+TPP NQ N Q Q ST++NS GPRSTQRNETSN
Sbjct: 305 AGYPGGVPGALSNQPAPPNEAPIATPPTNQQNAQNTPQTSTSTNSNSAGPRSTQRNETSN 364

Query: 358 YEVDRTIRHTKMNVGDVQRLSVAVVVNYKTLPDGKPLPLSNEQMKQIEDLTREAMGFSEK 417
YEVDRTIRHTKMNVGD++RLSVAVVVNYKTL DGKPLPL+ +QMKQIEDLTREAMGFS+K
Sbjct: 365 YEVDRTIRHTKMNVGDIERLSVAVVVNYKTLADGKPLPLTADQMKQIEDLTREAMGFSDK 424

Query: 418 RGDSLNVVNSPFNSSDESGGELPFWQQQAFIDQLLAAGRWLLVLLVAWLLWRKAVRPQLT 477
RGD+LNVVNSPF++ D +GGELPFWQQQ+FIDQLLAAGRWLLVL+VAW+LWRKAVRPQLT
Sbjct: 425 RGDTLNVVNSPFSAVDNTGGELPFWQQQSFIDQLLAAGRWLLVLVVAWILWRKAVRPQLT 484

Query: 478 RRAEAMKAVQQQAQAREEVEDAVEVRLSKDEQLQQRRANQRLGAEVMSQRIREMSDNDPR 537
RR E KA Q+QAQ R+E E+AVEVRLSKDEQLQQRRANQRLGAEVMSQRIREMSDNDPR
Sbjct: 485 RRVEEAKAAQEQAQVRQETEEAVEVRLSKDEQLQQRRANQRLGAEVMSQRIREMSDNDPR 544

Query: 538 VVALVIRQWINNDHE 552
VVALVIRQW++NDHE
Sbjct: 545 VVALVIRQWMSNDHE 559


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_2172  FLGMOTORFLIG  341    e-119    Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 341 bits (876), Expect = e-119
Identities = 117/329 (35%), Positives = 197/329 (59%), Gaps = 2/329 (0%)

Query: 1 MSNLTGTDKSVILLMTIGEDRAAEVFKHLSQREVQTLSAAMANVTQISNKQLTDVLAEFE 60
+S LTG K+ ILL++IG + +++VFK+LSQ E+++L+ +A + I+++ +VL EF+
Sbjct: 12 VSALTGKQKAAILLVSIGSEISSKVFKYLSQEEIESLTFEIAKLETITSELKDNVLLEFK 71

Query: 61 QEAEQFAALNINANDYLRSVLVKALGEERAASLLEDILETRDTASGIETLNFMEPQSAAD 120
+ + DY R +L K+LG ++A ++ + L + + E + +P + +
Sbjct: 72 ELMMAQEFIQKGGIDYARELLEKSLGTQKAVDIINN-LGSALQSRPFEFVRRADPANILN 130

Query: 121 LIRDEHPQIIATILVHLKRAQAADILALFDERLRHDVMLRIATFGGVQPAALAELTEVLN 180
I+ EHPQ IA IL +L +A+ IL+ ++ +V RIA P + E+ VL
Sbjct: 131 FIQQEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEVVREVERVLE 190

Query: 181 GLLDGQ-NLKRSKMGGVRTAAEIINLMKTQQEEAVITAVREFDGELAQKIIDEMFLFENL 239
L + + GGV EIIN+ + E+ +I ++ E D ELA++I +MF+FE++
Sbjct: 191 KKLASLSSEDYTSAGGVDNVVEIINMADRKTEKFIIESLEEEDPELAEEIKKKMFVFEDI 250

Query: 240 VDVDDRSIQRLLQEVDSESLLIALKGAEQPLREKFLRNMSQRAADILRDDLANRGPVRLS 299
V +DDRSIQR+L+E+D + L ALK + P++EK +NMS+RAA +L++D+ GP R
Sbjct: 251 VLLDDRSIQRVLREIDGQELAKALKSVDIPVQEKIFKNMSKRAASMLKEDMEFLGPTRRK 310

Query: 300 QVENEQKAILLIVRRLAETGEMVIGSGED 328
VE Q+ I+ ++R+L E GE+VI G +
Sbjct: 311 DVEESQQKIVSLIRKLEEQGEIVISRGGE 339


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_2173  FLGFLIH       371    e-134    Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 371 bits (952), Expect = e-134
Identities = 223/228 (97%), Positives = 227/228 (99%)

Query: 1 MSDNLPWKTWTPDDLAPPQAEFVPMVEPEETIIEEAEPSLEQQLAQLQMQAHEQGYQAGI 60
MSDNLPWKTWTPDDLAPPQAEFVP+VEPEETIIEEAEPSLEQQLAQLQMQAHEQGYQAGI
Sbjct: 1 MSDNLPWKTWTPDDLAPPQAEFVPIVEPEETIIEEAEPSLEQQLAQLQMQAHEQGYQAGI 60

Query: 61 AEGRQQGHEQGYQEGLARGLEQGLAEAKSQQAPIHARMQQLVSEFQTTLDALDSVIASRL 120
AEGRQQGH+QGYQEGLA+GLEQGLAEAKSQQAPIHARMQQLVSEFQTTLDALDSVIASRL
Sbjct: 61 AEGRQQGHKQGYQEGLAQGLEQGLAEAKSQQAPIHARMQQLVSEFQTTLDALDSVIASRL 120

Query: 121 MQIALEAARQVIGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGAT 180
MQ+ALEAARQVIGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGAT
Sbjct: 121 MQMALEAARQVIGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGAT 180

Query: 181 LSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATRWQELCRLAAPRVV 228
LSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATRWQELCRLAAP VV
Sbjct: 181 LSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATRWQELCRLAAPGVV 228


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_2175  FLGFLIJ       202    2e-70    Flagellar FliJ protein signature.
		>FLGFLIJ#Flagellar FliJ protein signature.

Length = 147

Score = 202 bits (515), Expect = 2e-70
Identities = 146/147 (99%), Positives = 147/147 (100%)

Query: 1 MAEHGALATLKDLAEKEVEDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRNNLNSDMSAG 60
MAEHGALATLKDLAEKEVEDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRNNLNSDMSAG
Sbjct: 1 MAEHGALATLKDLAEKEVEDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRNNLNSDMSAG 60

Query: 61 MTSNRWINYQQFIQTLEKAITQHRQQLNQWTQKVDIALNSWREKKQRLQAWQTLQERQST 120
+TSNRWINYQQFIQTLEKAITQHRQQLNQWTQKVDIALNSWREKKQRLQAWQTLQERQST
Sbjct: 61 ITSNRWINYQQFIQTLEKAITQHRQQLNQWTQKVDIALNSWREKKQRLQAWQTLQERQST 120

Query: 121 AALLAENRLDQKKMDEFAQRAAMRKPE 147
AALLAENRLDQKKMDEFAQRAAMRKPE
Sbjct: 121 AALLAENRLDQKKMDEFAQRAAMRKPE 147


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_2176  FLGHOOKFLIK   463    e-166    Flagellar hook-length control protein signature.
		>FLGHOOKFLIK#Flagellar hook-length control protein signature.

Length = 375

Score = 463 bits (1191), Expect = e-166
Identities = 362/375 (96%), Positives = 368/375 (98%)

Query: 1 MIRLAPLITADVDTTTLPGGKASDAAQDFLALLSEALAGETTTDKAASQLLVATDKPTTK 60
MIRLAPLITADVDTTTLPGGKASDAAQDFLALLSEALAGETTTDKAA QLLVATDKPTTK
Sbjct: 1 MIRLAPLITADVDTTTLPGGKASDAAQDFLALLSEALAGETTTDKAAPQLLVATDKPTTK 60

Query: 61 GEPLVSEILADAQQADLLIPVDETPPVINDEQSTSTPLTTAQTMTLAAVAGNNTAKDEKA 120
GEPL+S+I++DAQQA+LLIPVDETPPVINDEQSTSTPLTTAQTM LAAVA NT KDEKA
Sbjct: 61 GEPLISDIVSDAQQANLLIPVDETPPVINDEQSTSTPLTTAQTMALAAVADKNTTKDEKA 120

Query: 121 DDLNEDVTASLSALFAMLPGFDNTPKVTDAPSTVLPTEKPTLFTKLTSAQLTTAQPDDAP 180
DDLNEDVTASLSALFAMLPGFDNTPKVTDAPSTVLPTEKPTLFTKLTS QLTTAQPDDAP
Sbjct: 121 DDLNEDVTASLSALFAMLPGFDNTPKVTDAPSTVLPTEKPTLFTKLTSEQLTTAQPDDAP 180

Query: 181 GTPAQPLTPLVAEAQSKAEVISTPSPVTAAASPLITPHQTQPLPTVAAPVLSAPLGSHEW 240
GTPAQPLTPLVAEAQSKAEVISTPSPVTAAASPLITPHQTQPLPTVAAPVLSAPLGSHEW
Sbjct: 181 GTPAQPLTPLVAEAQSKAEVISTPSPVTAAASPLITPHQTQPLPTVAAPVLSAPLGSHEW 240

Query: 241 QQSLSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQMVSPHQHVRAALEAA 300
QQSLSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQMVSPHQHVRAALEAA
Sbjct: 241 QQSLSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQMVSPHQHVRAALEAA 300

Query: 301 LPVLRTQLAESGIQLGQSNISGESFSGQQQAASQQQQSQRTVNHEPLAGEEDDTLPVPVS 360
LPVLRTQLAESGIQLGQSNISGESFSGQQQAASQQQQSQRT NHEPLAGE+DDTLPVPVS
Sbjct: 301 LPVLRTQLAESGIQLGQSNISGESFSGQQQAASQQQQSQRTANHEPLAGEDDDTLPVPVS 360

Query: 361 LQGRVTGNSGVDIFA 375
LQGRVTGNSGVDIFA
Sbjct: 361 LQGRVTGNSGVDIFA 375


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_2178  FLGMOTORFLIM  381    e-135    Flagellar motor switch protein FliM signature.
		>FLGMOTORFLIM#Flagellar motor switch protein FliM signature.

Length = 344

Score = 381 bits (979), Expect = e-135
Identities = 85/324 (26%), Positives = 148/324 (45%), Gaps = 10/324 (3%)

Query: 5 ILSQAEIDALLNGDS--EVKDEPTASVSGESDIRPYDPNTQRRVVRERLQALEIINERFA 62
+LSQ EID LL S + E +S I YD + +E+++ L +++E FA
Sbjct: 4 VLSQDEIDQLLTAISSGDASIEDARPISDTRKITLYDFRRPDKFSKEQMRTLSLMHETFA 63

Query: 63 RHFRMGLFNLLRRSPDITVGAIRIQPYHEFARNLPVPTNLNLIHLKPLRGTGLVVFSPSL 122
R L LR + V ++ Y EF R++P P+ L +I + PL+G ++ PS+
Sbjct: 64 RLTTTSLSAQLRSMVHVHVASVDQLTYEEFIRSIPTPSTLAVITMDPLKGNAVLEVDPSI 123

Query: 123 VFIAVDNLFGGDGRFPTKVEGREFTHTEQRVINRMLKLALEGYSDAWKAINPLEVEYVRS 182
F +D LFGG G+ KV+ R+ T E V+ ++ L ++W + L +
Sbjct: 124 TFSIIDRLFGGTGQ-AAKVQ-RDLTDIENSVMEGVIVRILANVRESWTQVIDLRPRLGQI 181

Query: 183 EMQVKFTNITTSPNDIVVNTPFHVEIGNLTGEFNICLPFSMIEPLRELLVNPPLENS--R 240
E +F I P+++VV ++G G N C+P+ IEP+ L + +S R
Sbjct: 182 ETNPQFAQI-VPPSEMVVLVTLETKVGEEEGMMNFCIPYITIEPIISKLSSQFWFSSVRR 240

Query: 241 NEDQNWRDNLVRQVQHSQLELVANFADISLRLSQILKLKPGDVLPIEKP---DRIIAHVD 297
+ + L ++ +++VA + L + IL L+ GD++ + D + +
Sbjct: 241 SSTTQYMGVLRDKLSTVDMDVVAEVGSLRLSVRDILGLRVGDIIRLHDTHVGDPFVLSIG 300

Query: 298 GVPVLTSQYGTLNGQYALRIEHLI 321
Q G + + A +I I
Sbjct: 301 NRKKFLCQPGVVGKKIAAQILERI 324


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_2179  FLGMOTORFLIN  211    3e-74    Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 211 bits (539), Expect = 3e-74
Identities = 125/137 (91%), Positives = 133/137 (97%)

Query: 1 MSDMNNPADDNNGAMDDLWAEALSEQKSTNSKSAADAVFQQFGGGDVSGTLQDIDLIMDI 60
MSDMNNP+D+N GA+DDLWA+AL+EQK+T +KSAADAVFQQ GGGDVSG +QDIDLIMDI
Sbjct: 1 MSDMNNPSDENTGALDDLWADALNEQKATTTKSAADAVFQQLGGGDVSGAMQDIDLIMDI 60

Query: 61 PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV 120
PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV
Sbjct: 61 PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV 120

Query: 121 RITDIITPSERMRRLSR 137
RITDIITPSERMRRLSR
Sbjct: 121 RITDIITPSERMRRLSR 137


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_2181  FLGBIOSNFLIP  334    e-119    Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP signature.

Length = 245

Score = 334 bits (858), Expect = e-119
Identities = 245/245 (100%), Positives = 245/245 (100%)

Query: 1 MRRLLSVAPVLLWLITPLAFAQLPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILLM 60
MRRLLSVAPVLLWLITPLAFAQLPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILLM
Sbjct: 1 MRRLLSVAPVLLWLITPLAFAQLPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILLM 60

Query: 61 MTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEK 120
MTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEK
Sbjct: 61 MTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEK 120

Query: 121 ISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELK 180
ISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELK
Sbjct: 121 ISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELK 180

Query: 181 TAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLA 240
TAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLA
Sbjct: 181 TAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLA 240

Query: 241 QSFYS 245
QSFYS
Sbjct: 241 QSFYS 245


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_2182  TYPE3IMQPROT  67     1e-18    Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.

Length = 86

Score = 67.1 bits (164), Expect = 1e-18
Identities = 22/78 (28%), Positives = 42/78 (53%)

Query: 4 ESVMMMGTEAMKVALALAAPLLLVALVTGLIISILQAATQINEMTLSFIPKIIAVFIAII 63
+ ++ G +A+ + L L+ +VA + GL++ + Q TQ+ E TL F K++ V + +
Sbjct: 2 DDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLF 61

Query: 64 IAGPWMLNLLLDYVRTLF 81
+ W +LL Y R +
Sbjct: 62 LLSGWYGEVLLSYGRQVI 79


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_2183  TYPE3IMRPROT  202    9e-67    Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 202 bits (515), Expect = 9e-67
Identities = 258/261 (98%), Positives = 261/261 (100%)

Query: 1 MMQVTSDQWLSWLSLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA 60
M+QVTS+QWLSWL+LYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA
Sbjct: 1 MLQVTSEQWLSWLNLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA 60

Query: 61 NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHL 120
NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHL
Sbjct: 61 NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHL 120

Query: 121 NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIF 180
NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIF
Sbjct: 121 NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIF 180

Query: 181 LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC 240
LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC
Sbjct: 181 LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC 240

Query: 241 EHLFSEIFNLLADIISELPLI 261
EHLFSEIFNLLADIISELPLI
Sbjct: 241 EHLFSEIFNLLADIISELPLI 261


87  EcE24377A_2193  EcE24377A_2200  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
EcE24377A_2193  -1      13       -2.617341  DNA cytosine methylase
EcE24377A_2194  -2      23       -6.452263  hypothetical protein
EcE24377A_2195  0       30       -8.237051  hypothetical protein
EcE24377A_2196  -1      31       -8.891649  hypothetical protein
EcE24377A_2197  -1      26       -6.908876  outer membrane protein
EcE24377A_2198  1       28       -6.110900  chaperone protein HchA
EcE24377A_2199  1       33       -7.705741  heavy metal sensor histidine kinase
EcE24377A_2200  2       28       -6.524888  transcriptional regulatory protein YedW
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_2193  PF05272       29     0.042    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.042
Identities = 20/62 (32%), Positives = 29/62 (46%), Gaps = 15/62 (24%)

Query: 320 AKYILTPVLWKYLYRYAKKHQARGNGFGYGMVYPNNPQSVTRTLSARYYKDGAEILIDRG 379
A+Y + PVLW Y+ R+ K + G+ VY +R +DG+E RG
Sbjct: 166 ARYQVGPVLWGYVVRFIK---SDGDKLTLPYVY------------SRSQRDGSEAWKWRG 210

Query: 380 WD 381
WD
Sbjct: 211 WD 212


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_2194  CARBMTKINASE  34     2e-04    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 34.4 bits (79), Expect = 2e-04
Identities = 22/92 (23%), Positives = 36/92 (39%), Gaps = 9/92 (9%)

Query: 37 AQKLAADDDVDMLVILTACYFHDIVSLAKNHPQRQRSSILAAEETRRLLREEFEQFPA-- 94
+KLA + + D+ +ILT + +L + Q + EE R+ E F A
Sbjct: 219 GEKLAEEVNADIFMILTDV---NGAALYYGTEKEQWLREVKVEELRKYYEE--GHFKAGS 273

Query: 95 --EKIEAVCHAIAAHSFSAQIAPLTTEAKIVQ 124
K+ A I A IA L + ++
Sbjct: 274 MGPKVLAAIRFIEWGGERAIIAHLEKAVEALE 305


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_2197  ECOLIPORIN    447    e-160    E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 447 bits (1152), Expect = e-160
Identities = 214/387 (55%), Positives = 258/387 (66%), Gaps = 34/387 (8%)

Query: 1 MKRKVLAMLVPALLVAGAANAAEIYNKDGNKVDFYGKMVGERIWSNTDDNNSENEDTSYA 60
MKRKVLA+++PALL AGAA+AAEIYNKDGNK+D YGK+ G +S D++S++ D +Y
Sbjct: 1 MKRKVLALVIPALLAAGAAHAAEIYNKDGNKLDLYGKVDGLHYFS---DDSSKDGDQTYM 57

Query: 61 RFGVKGETQITSELTGFGQFEYNLDASKPEG-SNQEKTRLTFAGLKYNELGSFDYGRNYG 119
R G KGETQI +LTG+GQ+EYN+ A+ EG TRL FAGLK+ + GSFDYGRNYG
Sbjct: 58 RVGFKGETQINDQLTGYGQWEYNVQANTTEGEGANSWTRLAFAGLKFGDYGSFDYGRNYG 117

Query: 120 VAYDAAAYTDMLVEWGGDSWASADNFMNGRTNGVATYRNSDFFGLVDGLNFAVQYQGKNS 179
V YD +TDML E+GGDS+ ADN+M GR NGVATYRN+DFFGLVDGLNFA+QYQGKN
Sbjct: 118 VLYDVEGWTDMLPEFGGDSYTYADNYMTGRANGVATYRNTDFFGLVDGLNFALQYQGKNE 177

Query: 180 NRG----------------VTKQNGDGYALSVDYNI-EGFGFVGAYSKSDRTNEQAG--- 219
++ + NGDG+ +S Y+I GF AY+ SDRTNEQ
Sbjct: 178 SQSADDVNIGTNNRNNGDDIRYDNGDGFGISTTYDIGMGFSAGAAYTTSDRTNEQVNAGG 237

Query: 220 -DGYGDNAEVWSLAAKYDANNIYAAMMYGETRNMTVLA------NDHFANKTQNFEAVVQ 272
GD A+ W+ KYDANNIY A MY ETRNMT + ANKTQNFE Q
Sbjct: 238 TIAGGDKADAWTAGLKYDANNIYLATMYSETRNMTPYGKTDKGYDGGVANKTQNFEVTAQ 297

Query: 273 YQFDFGLRPSLGYVYSKGKDLYARNGHKGVDADRVNYIEVGTWYYFNKNMNVYTAYKFNL 332
YQFDFGLRP++ ++ SKGKDL N G D D V Y +VG YYFNKN + Y YK NL
Sbjct: 298 YQFDFGLRPAVSFLMSKGKDL-TYNNVNGDDKDLVKYADVGATYYFNKNFSTYVDYKINL 356

Query: 333 LDKDDAAITDA--ATDDQFAVGIVYQF 357
LD DD DA +TDD A+G+VYQF
Sbjct: 357 LDDDDPFYKDAGISTDDIVALGMVYQF 383


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_2199  PF06580       33     0.003    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 32.5 bits (74), Expect = 0.003
Identities = 38/181 (20%), Positives = 63/181 (34%), Gaps = 37/181 (20%)

Query: 290 ENILFLARADKNNVLVKLDSLS----------------LNKEVENLLDYL--EYLSDEKE 331
NI L D L SLS L E+ + YL + E
Sbjct: 180 NNIRALILEDPTKAREMLTSLSELMRYSLRYSNARQVSLADELTVVDSYLQLASIQFEDR 239

Query: 332 ICFKVECNQQIFADKI---LLQRMLSNLIVNAIRYSPEKSRIHITSFLDTNGYLNIDVAS 388
+ F+ + N I ++ L+Q ++ N I + I P+ +I + D NG + ++V +
Sbjct: 240 LQFENQINPAIMDVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKD-NGTVTLEVEN 298

Query: 389 PGTKIHEPEKLFRRFWRGDNSRHSVGQGLGLSLVKA-IAELHGGSATYHYLNKHNVFRIM 447
G+ + K G GL V+ + L+G A K M
Sbjct: 299 TGSLALKNTKE--------------STGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAM 344

Query: 448 L 448
+
Sbjct: 345 V 345


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_2200  HTHFIS        84     2e-20    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 83.7 bits (207), Expect = 2e-20
Identities = 30/117 (25%), Positives = 60/117 (51%), Gaps = 1/117 (0%)

Query: 39 KILLIEDNQRTQEWVTQGLSEAGYVIDAVSDGRDGLYLALKDDYALIILDIMLPGMDGWQ 98
IL+ +D+ + + Q LS AGY + S+ D L++ D+++P + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 99 ILQTLRTA-KQTPVICLTARDSVDDRVRGLDSGANDYLVKPFSFSELLARVRAQLRQ 154
+L ++ A PV+ ++A+++ ++ + GA DYL KPF +EL+ + L +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121


88  EcE24377A_2362  EcE24377A_2376  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
EcE24377A_2362  -3      12       2.617474   chaperone
EcE24377A_2363  -2      13       3.020835   hypothetical protein
EcE24377A_2364  -2      16       3.783058   hypothetical protein
EcE24377A_2365  -2      17       3.871312   von Willebrand factor A
EcE24377A_2366  -2      17       3.925670   multidrug efflux system subunit MdtA
EcE24377A_2367  -2      18       3.768813   multidrug efflux system subunit MdtB
EcE24377A_2368  -1      16       3.096132   multidrug efflux system subunit MdtC
EcE24377A_2369  -2      14       1.050156   multidrug efflux system protein MdtE
EcE24377A_2370  -2      12       0.676469   signal transduction histidine-protein kinase
EcE24377A_2371  -1      15       -1.458406  DNA-binding transcriptional regulator BaeR
EcE24377A_2372  -2      14       -2.326640  hypothetical protein
EcE24377A_2373  0       14       -2.821697  CC2985 family addiction module antidote protein
EcE24377A_2374  0       14       -2.881300  RelE/ParE family plasmid stabilization system
EcE24377A_2375  2       13       -2.059463  U32 family peptidase
EcE24377A_2376  3       21       -3.815148  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_2362  SHAPEPROTEIN  51     4e-09    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 50.9 bits (122), Expect = 4e-09
Identities = 33/129 (25%), Positives = 57/129 (44%), Gaps = 20/129 (15%)

Query: 132 AMMLH-IRQQAQAQLPEAITQAVIGRPINFQGLGGDEANAQAQGILERAAKRAGFRDVVF 190
M+ H I+Q + ++ P+ + E A + +A+ AG R+V
Sbjct: 89 KMLQHFIKQVHSNSFMRPSPRVLVCVPVGATQV---ERRA-----IRESAQGAGAREVFL 140

Query: 191 QYEPVAAGLDYEATLQEEKRVLVVDIGGGTTDCSLLLMGPQWRSRLDREASLLGHSGCRI 250
EP+AA + + E +VVDIGGGTT+ +++ + ++ S RI
Sbjct: 141 IEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVAVISLN-----------GVVYSSSVRI 189

Query: 251 GGNDLDIAL 259
GG+ D A+
Sbjct: 190 GGDRFDEAI 198



Score = 34.7 bits (80), Expect = 5e-04
Identities = 32/137 (23%), Positives = 56/137 (40%), Gaps = 23/137 (16%)

Query: 332 RLSYRLV---RSAEECKIALSSV--AETRASLPFISDELAT------LISQQGLESALSQ 380
R +Y + +AE K + S + + LA ++ + AL +
Sbjct: 203 RRNYGSLIGEATAERIKHEIGSAYPGDEVREIEVRGRNLAEGVPRGFTLNSNEILEALQE 262

Query: 381 PLARILEQVQLALDNAQEKPDV--------IYLTGGSARSPLIKKALAEQLPGIPIAGGD 432
PL I+ V +AL+ Q P++ + LTGG A + + L E+ GIP+ +
Sbjct: 263 PLTGIVSAVMVALE--QCPPELASDISERGMVLTGGGALLRNLDRLLMEET-GIPVVVAE 319

Query: 433 D-FGSVTAGLARWAEVV 448
D V G + E++
Sbjct: 320 DPLTCVARGGGKALEMI 336


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_2366  RTXTOXIND     48     4e-08    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 47.9 bits (114), Expect = 4e-08
Identities = 47/369 (12%), Positives = 106/369 (28%), Gaps = 87/369 (23%)

Query: 4 SYKSRWVIVIVVVIAAIAAFWFWQGRNDSQSAAPG-----ATKQAQQSPAGGRRG---MR 55
S + R V ++ IA G+ + + A G + + ++
Sbjct: 54 SRRPRLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVK 113

Query: 56 SG-------PLA---PVQAATAVEQAVPRYLTGLGTITAANTVTVRSRVDG--QLMALHF 103
G L + A + L T ++ ++ +L
Sbjct: 114 EGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDE 173

Query: 104 QEGQQVKAGDLLAEI------------DPSQFKVALAQAQGQLA-------KDKATLANA 144
Q V ++L Q ++ L + + + + +
Sbjct: 174 PYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVE 233

Query: 145 RRDLARYQQLAKTNLVSRQELDAQQALVSETEGTIKADEASVA----------------- 187
+ L + L +++ + Q+ E ++ ++ +
Sbjct: 234 KSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVT 293

Query: 188 --------------------------SAQLQLDWSRITAPVDGRV-GLKQVDVGNQISSG 220
+ + S I APV +V LK G +++
Sbjct: 294 QLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTA 353

Query: 221 DTTGIVVITQTHPIDLVFTLPESDIATVVQAQKAGKPLVVEAWDRTNSKKL-SEGTLLSL 279
+T +V++ + +++ + DI + Q A + VEA+ T L + ++L
Sbjct: 354 ETL-MVIVPEDDTLEVTALVQNKDIGFINVGQNA--IIKVEAFPYTRYGYLVGKVKNINL 410

Query: 280 DNQIDATTG 288
D D G
Sbjct: 411 DAIEDQRLG 419


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_2367  ACRIFLAVINRP  918    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 918 bits (2375), Expect = 0.0
Identities = 299/1036 (28%), Positives = 513/1036 (49%), Gaps = 29/1036 (2%)

Query: 13 SRLFIMRPVATTLLMVAILLAGIIGYRALPVSALPEVDYPTIQVVTLYPGASPDVMTSAV 72
+ FI RP+ +L + +++AG + LPV+ P + P + V YPGA + V
Sbjct: 2 ANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTV 61

Query: 73 TAPLERQFGQMSGLKQMSSQS-SGGASVITLQFQLTLPLDVAEQEVQAAINAATNLLPSD 131
T +E+ + L MSS S S G+ ITL FQ D+A+ +VQ + AT LLP +
Sbjct: 62 TQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQE 121

Query: 132 LPNPPVYSKVNPADPPIMTLAVTSTAMPMTQVE--DMVETRVAQKISQISGVGLVTLSGG 189
+ + S + +M S TQ + D V + V +S+++GVG V L G
Sbjct: 122 VQQQGI-SVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 190 QRPAVRVKLNAQAIAALGLTSETVRTAITGANVNSAKGSLDGP------SRAVTLSANDQ 243
Q A+R+ L+A + LT V + N A G L G ++ A +
Sbjct: 181 QY-AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 244 MQSAEEYRQLII-AYQNGAPIRLGDVATVEQGAENSWLGAWANKEQAIVMNVQRQPGANI 302
++ EE+ ++ + +G+ +RL DVA VE G EN + A N + A + ++ GAN
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANA 299

Query: 303 ISTADSIRQMLPQLTESLPKSVKVTVLSDRTTNIRASVDDTQFELMMAIALVVMIIYLFL 362
+ TA +I+ L +L P+ +KV D T ++ S+ + L AI LV +++YLFL
Sbjct: 300 LDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFL 359

Query: 363 RNIPATIIPGVAVPLSLIGTFAVMVFLDFSINNLTLMALTIATGFVVDDAIVVIENISRY 422
+N+ AT+IP +AVP+ L+GTFA++ +SIN LT+ + +A G +VDDAIVV+EN+ R
Sbjct: 360 QNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERV 419

Query: 423 I-EKGEKPLAAALKGAGEIGFTIISLTFSLIAVLIPLLFMGDIVGRLFREFAITLAVAIL 481
+ E P A K +I ++ + L AV IP+ F G G ++R+F+IT+ A+
Sbjct: 420 MMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMA 479

Query: 482 ISAVVSLTLTPMMCARML---SQESLRKQNRFSRASEKMFDRIIAAYGRGLAKVLNHPWL 538
+S +V+L LTP +CA +L S E + F FD + Y + K+L
Sbjct: 480 LSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 539 TLSVALSTLLLSVLLWVFIPKGFFPVQDNGIIQGTLQAPQSSSFANMAQRQRQVADVILQ 598
L + + V+L++ +P F P +D G+ +Q P ++ + QV D L+
Sbjct: 540 YLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLK 599

Query: 599 DPA--VQSLTSFVGVDGTNPSLNSARLQINLKPLDERDDR---VQKVIARLQTAVDKVPG 653
+ V+S+ + G + + N+ ++LKP +ER+ + VI R + + K+
Sbjct: 600 NEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIR- 658

Query: 654 VDLFLQPTQDLTIDTQVSRTQYQFTLQ---ATSLDALSTWVPQLMEKLQQLP-QLSDVSS 709
D F+ P I + T + F L DAL+ QL+ Q P L V
Sbjct: 659 -DGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRP 717

Query: 710 DWQDKGLVAYVNVDRDSASRLGISMADVDNALYNAFGQRLISTIYTQANQYRVVLEHNTE 769
+ + + VD++ A LG+S++D++ + A G ++ + ++ ++ + +
Sbjct: 718 NGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAK 777

Query: 770 NTPGLAALDTIRLTSSDGGVVPLSSIAKIEQRFAPLSINHLDQFPVTTISFNVPDNYSLG 829
+D + + S++G +VP S+ + + + P I S G
Sbjct: 778 FRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSG 837

Query: 830 DAVQAIMDTEKTLNLPVDITTQFQGSTLAFQSALGSTVWLIVAAVVAMYIVLGILYESFI 889
DA A+M+ + LP I + G + + + L+ + V +++ L LYES+
Sbjct: 838 DA-MALMENLAS-KLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWS 895

Query: 890 HPITILSTLPTAGVGALLALMIAGSELDVIAIIGIILLIGIVKKNAIMMIDFALAAEREQ 949
P++++ +P VG LLA + + DV ++G++ IG+ KNAI++++FA ++
Sbjct: 896 IPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKE 955

Query: 950 GMSPRDAIYQACLLRFRPILMTTLAALLGALPLMLSTGVGAELRRPLGIGMVGGLIVSQV 1009
G +A A +R RPILMT+LA +LG LPL +S G G+ + +GIG++GG++ + +
Sbjct: 956 GKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATL 1015

Query: 1010 LTLFTTPVIYLLFDRL 1025
L +F PV +++ R
Sbjct: 1016 LAIFFVPVFFVVIRRC 1031


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_2368  ACRIFLAVINRP  921    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 921 bits (2383), Expect = 0.0
Identities = 289/1035 (27%), Positives = 507/1035 (48%), Gaps = 36/1035 (3%)

Query: 6 LFIYRPVATILLSVAITLCGILGFRMLPVAPLPQVDFPVIMVSASLPGASPETMASSVAT 65
FI RP+ +L++ + + G L LPVA P + P + VSA+ PGA +T+ +V
Sbjct: 4 FFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTVTQ 63

Query: 66 PLERSLGRIAGVSEMTSSS-SLGSTRIILQFDFDRDINGAARDVQAAINAAQSLLPSGMP 124
+E+++ I + M+S+S S GS I L F D + A VQ + A LLP +
Sbjct: 64 VIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQEVQ 123

Query: 125 SRPTYRKANPSDAPIMILTLTSDT--YSQGELYDFASTQLAPTISQIDGVGDVDVGGSSL 182
+ S + +M+ SD +Q ++ D+ ++ + T+S+++GVGDV + G+
Sbjct: 124 -QQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGAQY 182

Query: 183 PAVRVGLNPQALFNQGVSLDDVRTAISNANVRKPQG------ALEDGTHRWQIQTNDELK 236
A+R+ L+ L ++ DV + N + G AL I K
Sbjct: 183 -AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFK 241

Query: 237 TAAEYQPLIIHYN-NGGAVRLGDVATVTDSVQDVRNAGMTNAKPAILLMIRKLPEANIIQ 295
E+ + + N +G VRL DVA V ++ N KPA L I+ AN +
Sbjct: 242 NPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANALD 301

Query: 296 TVDSIRAKLPELQETIPAAIDLQIAQDRSPTIRASLEEVEQTLIISVALVILVVFLFLRS 355
T +I+AKL ELQ P + + D +P ++ S+ EV +TL ++ LV LV++LFL++
Sbjct: 302 TAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQN 361

Query: 356 GRATIIPAVAVPVSLIGTFAAMYLCGFSLNNLSLMALTIATGFVVDDAIVVLENIARHL- 414
RAT+IP +AVPV L+GTFA + G+S+N L++ + +A G +VDDAIVV+EN+ R +
Sbjct: 362 MRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMM 421

Query: 415 EAGMKPLQAALQGTREVGFTVLSMSLSLVAVFLPLLLMGGLPGRLLREFAVTLSVAIVIS 474
E + P +A + ++ ++ +++ L AVF+P+ GG G + R+F++T+ A+ +S
Sbjct: 422 EDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALS 481

Query: 475 LLVSLTLTPMMCGWMLKASKPREQKRLRGFG----RMLVALQQGYGKSLKWVLNHTRLVG 530
+LV+L LTP +C +LK + GF Y S+ +L T
Sbjct: 482 VLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRYL 541

Query: 531 VVLLGTIALNIWLYISIPKTFFPEQDTGVLMGGIQADQSISFQ----AMRGKLQDFMKII 586
++ +A + L++ +P +F PE+D GV + IQ + + + ++K
Sbjct: 542 LIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKNE 601

Query: 587 RD-DPAVDNVTGFT-GGSRVNSGMMFITLKPRDERS---ETAQQIIDRLRVKLAKEPGAN 641
+ +V V GF+ G N+GM F++LKP +ER+ +A+ +I R +++L K
Sbjct: 602 KANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDGF 661

Query: 642 LFLMAVQDIRVGGRQSNASYQYTLLSDDLAALREWEPKIRKKLATL-----PELADVNSD 696
+ + I G + ++ L D + + R +L + L V +
Sbjct: 662 VIPFNMPAIVELGTATGFDFE---LIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPN 718

Query: 697 QQDNGAEMNLVYDRDTMARLGIDVQAANSLLNNAFGQRQISTIYQPMNQYKVVMEVDPRY 756
++ A+ L D++ LG+ + N ++ A G ++ K+ ++ D ++
Sbjct: 719 GLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKF 778

Query: 757 TQDISALEKMFVINNEGKAIPLSYFAKWQPANAPLSVNHQGLSAASTISFNLPTGKSLSD 816
++K++V + G+ +P S F + + I G S D
Sbjct: 779 RMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGD 838

Query: 817 ASAAIDRAMTQLGVPSTVRGSFAGTAQVFQETMNSQVILIIAAIATVYIVLGILYESYVH 876
A A ++ ++L P+ + + G + + + N L+ + V++ L LYES+
Sbjct: 839 AMALMENLASKL--PAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSI 896

Query: 877 PLTILSTLPSAGVGALLALELFNAPFSLIALIGIMLLIGIVKKNAIMMVDFALEAQRHGN 936
P++++ +P VG LLA LFN + ++G++ IG+ KNAI++V+FA +
Sbjct: 897 PVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEG 956

Query: 937 LTPQEAIFQACLLRFRPIMMTTLAALFGALPLVLSGGDGSELRQPLGITIVGGLVMSQLL 996
EA A +R RPI+MT+LA + G LPL +S G GS + +GI ++GG+V + LL
Sbjct: 957 KGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLL 1016

Query: 997 TLYTTPVVYLFFDRL 1011
++ PV ++ R
Sbjct: 1017 AIFFVPVFFVVIRRC 1031



Score = 79.9 bits (197), Expect = 3e-17
Identities = 77/446 (17%), Positives = 162/446 (36%), Gaps = 26/446 (5%)

Query: 592 VDNVTGFTGGS-RVNSGMMFITLKPRDERSETAQQIIDRLRVKLAKEPGANLFLMAVQDI 650
+DN+ + S S + +T + + Q+ ++L++ P + Q I
Sbjct: 72 IDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQE----VQQQGI 127

Query: 651 RVGGRQSNASYQYTLLSDDLAALREW-----EPKIRKKLATLPELADVNSDQQDNGAE-- 703
V S+ +SD+ ++ ++ L+ L + DV GA+
Sbjct: 128 SVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQL----FGAQYA 183

Query: 704 MNLVYDRDTMARLGID----VQAANSLLNNAFGQRQISTIYQPMNQYKVVMEVDPRYTQD 759
M + D D + + + + + + T P Q + R+
Sbjct: 184 MRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFKNP 243

Query: 760 ISALEKMFVINNEGKAIPLSYFAK--WQPANAPLSVNHQGLSAASTISFNLPTGKSLSDA 817
+ +N++G + L A+ N + G AA +L D
Sbjct: 244 EEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL-DT 302

Query: 818 SAAIDRAMTQL--GVPSTVRGSFA-GTAQVFQETMNSQVILIIAAIATVYIVLGILYESY 874
+ AI + +L P ++ + T Q +++ V + AI V++V+ + ++
Sbjct: 303 AKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQNM 362

Query: 875 VHPLTILSTLPSAGVGALLALELFNAPFSLIALIGIMLLIGIVKKNAIMMVDFALEAQRH 934
L +P +G L F + + + G++L IG++ +AI++V+
Sbjct: 363 RATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMME 422

Query: 935 GNLTPQEAIFQACLLRFRPIMMTTLAALFGALPLVLSGGDGSELRQPLGITIVGGLVMSQ 994
L P+EA ++ ++ + +P+ GG + + ITIV + +S
Sbjct: 423 DKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALSV 482

Query: 995 LLTLYTTPVVYLFFDRLRLRFSRKPK 1020
L+ L TP + + + K
Sbjct: 483 LVALILTPALCATLLKPVSAEHHENK 508


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_2369  TCRTETB  121    3e-32    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 121 bits (305), Expect = 3e-32
Identities = 98/435 (22%), Positives = 189/435 (43%), Gaps = 25/435 (5%)

Query: 20 FMQSLDTTIVNTALPSMAQSLGESPLHMHMVIVSYVLTVAVMLPASGWLADKVGVRNIFF 79
F L+ ++N +LP +A + P + V +++LT ++ G L+D++G++ +
Sbjct: 24 FFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLL 83

Query: 80 TAIVLFTLGSLFCALSGTLNELL-LARALQGVGGAMMVPVGRLTVMKIVPREQYMAAMTF 138
I++ GS+ + + LL +AR +QG G A + + V + +P+E A
Sbjct: 84 FGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGL 143

Query: 139 VTLPGQVGPLLGPALGGLLVEYASWHWIFLINIPVGIIGAIATLM-LMPNYTMQTRRFDL 197
+ +G +GPA+GG++ Y HW +L+ IP+ I + LM L+ FD+
Sbjct: 144 IGSIVAMGEGVGPAIGGMIAHY--IHWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDI 201

Query: 198 SGFLLLAVGMAVLTLALDGSKGTGLSPLTIAGLVAVGVVALVLYLLHARNNNRALFSLKL 257
G +L++VG+ L + L V V++ ++++ H R L
Sbjct: 202 KGIILMSVGIVFFMLFTTSYSISFLI---------VSVLSFLIFVKHIRKVTDPFVDPGL 252

Query: 258 FRTRTFSLGLAGSFAGRIGSGMLPFMTPVFLQIGLGFSPFHAG-LMMIPMVLGSMGMKRI 316
+ F +G+ M P ++ S G +++ P + + I
Sbjct: 253 GKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYI 312

Query: 317 VVQVVNCFGYRRVLVATTLGLSLVTLLFMTTALL----GWYYVLPFVLFLQGMVNSTRFS 372
+V+ G VL +G++ +++ F+T + L W+ + V L G+ S +
Sbjct: 313 GGILVDRRGPLYVL---NIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGL--SFTKT 367

Query: 373 SMNTLTLKDLPDNLASSGNSLLSMIMQLSMSIGVTIAGLLLGLFGSQHVSVDSGTTQTVF 432
++T+ L A +G SLL+ LS G+ I G LL + + Q+ +
Sbjct: 368 VISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLSIPLLDQRLLPMEVDQSTY 427

Query: 433 MYT--WLSMALIIAL 445
+Y+ L + II +
Sbjct: 428 LYSNLLLLFSGIIVI 442


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_2370  BCTERIALGSPF  31     0.010    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 31.0 bits (70), Expect = 0.010
Identities = 27/95 (28%), Positives = 35/95 (36%), Gaps = 20/95 (21%)

Query: 164 RQTSWLIVALATLLAALATFLLA------RGLLAPVKRLVDGTHKLAAGDFTTRVTPTSE 217
RQ + L+ A L AL L+A V+ V H LA + P S
Sbjct: 75 RQLATLVAASMPLEEALDAVAKQSEKPHLSQLMAAVRSKVMEGHSLAD---AMKCFPGSF 131

Query: 218 DEL-----------GKLAQDFNQLASTLEKNQQMR 241
+ L G L N+LA E+ QQMR
Sbjct: 132 ERLYCAMVAAGETSGHLDAVLNRLADYTEQRQQMR 166


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits    Score  E-value  Comments
EcE24377A_2371  HTHFIS  76     5e-18    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 75.6 bits (186), Expect = 5e-18
Identities = 28/136 (20%), Positives = 65/136 (47%), Gaps = 1/136 (0%)

Query: 11 PRILIVEDEPKLGQLLIDYLRAASYAPTLISHGDQVLPYVRQTPPDLILLDLMLPGTDGL 70
IL+ +D+ + +L L A Y + S+ + ++ DL++ D+++P +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 71 TLCREIR-RFSDIPIVMVTAKIEEIDRLLGLEIGADDYICKPYSPREVVARVKTILRRCK 129
L I+ D+P+++++A+ + + E GA DY+ KP+ E++ + L K
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 130 PQRELQQQDAESPLII 145
+ + D++ + +
Sbjct: 124 RRPSKLEDDSQDGMPL 139


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits       Score  E-value  Comments
EcE24377A_2376  LIPOLPP20  27     0.026    LPP20 lipoprotein precursor signature.
		>LIPOLPP20#LPP20 lipoprotein precursor signature.

Length = 175

Score = 26.6 bits (58), Expect = 0.026
Identities = 12/38 (31%), Positives = 24/38 (63%), Gaps = 1/38 (2%)

Query: 18 EGEMKKIAAISLISIFIVSGCAVHNDETSIGKFGLAYK 55
+ ++KKI +S+++ ++ GC+ H ++ I K AYK
Sbjct: 2 KNQVKKILGMSVVAAMVIVGCS-HAPKSGISKSNKAYK 38


89  EcE24377A_2408  EcE24377A_2414  N        Y        N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
EcE24377A_2408  1       17       1.253055   hypothetical protein
EcE24377A_2409  0       14       -0.156809  von Willebrand factor A
EcE24377A_2410  0       15       -0.422839  hypothetical protein
EcE24377A_2411  -1      15       -2.046937  lipoprotein
EcE24377A_2412  -1      15       -0.071836  hypothetical protein
EcE24377A_2413  -1      15       0.814333   two-component response-regulatory protein YehT
EcE24377A_2414  1       17       2.253273   sensor histidine kinase
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_2408  PF07201  31     0.016    Hypersensitivity response secretion protein HrpJ
		>PF07201#Hypersensitivity response secretion protein HrpJ

Length = 293

Score = 31.0 bits (70), Expect = 0.016
Identities = 21/128 (16%), Positives = 37/128 (28%), Gaps = 11/128 (8%)

Query: 594 VSLSAAVALLERRSQAIHAPALDRGAVLGALMRLEH-PNASAEAALTMLAQLSPAQSGEA 652
L L+ R + H L A+ M E A +T A
Sbjct: 137 KMLCGLRDALKGRPELAHLSHLVEQAL--VSMAEEQGETIVLGARITPEAYRESQSGVNP 194

Query: 653 LQGLLALARHQLACQPTFIAGFSSHLNQLSDADFINALPDLRAAMA--------WLPPRE 704
LQ L R + A +S + + D + + L+ A++ +
Sbjct: 195 LQPLRDTYRDAVMGYQGIYAIWSDLQKRFPNGDIDSVILFLQKALSADLQSQQSGSGREK 254

Query: 705 RGTLAHQV 712
G + +
Sbjct: 255 LGIVISDL 262


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_2410  PF09025  28     0.050    YopR Core
		>PF09025#YopR Core

Length = 143

Score = 28.4 bits (63), Expect = 0.050
Identities = 28/102 (27%), Positives = 40/102 (39%), Gaps = 6/102 (5%)

Query: 374 WPRSEQENSPAATRRLFSFQAGALAGGQIVSQAAKRSADGELLLATRNRLSSVVPLSPDA 433
+ ++ PAA RRL + GAL + A L + L + +PL
Sbjct: 32 FEQALGGEPPAAGRRLAGLENGALGERLLQRFAQPLQGLEADRLELKAMLRAELPLGRQQ 91

Query: 434 ----WQMLSAPLRQPGIVALREYLRQRPPACIRPLN-QVDNL 470
Q+L A PG L + R+ I PLN +DNL
Sbjct: 92 QTFLLQLLGAVEHAPGGEYLAQLARRELQVLI-PLNGMLDNL 132


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_2411  INTIMIN  27     0.028    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 27.3 bits (60), Expect = 0.028
Identities = 19/94 (20%), Positives = 31/94 (32%)

Query: 36 LNGTEIAITYVYKGDKVLKQSSETKIQFASIGATTKEDAAKTLEPLSAKYKNIAGVEEKL 95
+ + AITY K K K S ++ F + KT AK + K
Sbjct: 671 VANGQDAITYTVKVMKGDKPVSNQEVTFTTTLGKLSNSTEKTDTNGYAKVTLTSTTPGKS 730

Query: 96 TYTDTYAQENVTIDMEKVDFKALQGISGINVSAE 129
+ + V + +V+F I N+
Sbjct: 731 LVSARVSDVAVDVKAPEVEFFTTLTIDDGNIEIV 764


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits    Score  E-value  Comments
EcE24377A_2413  HTHFIS  71     1e-16    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 71.4 bits (175), Expect = 1e-16
Identities = 41/177 (23%), Positives = 77/177 (43%), Gaps = 12/177 (6%)

Query: 2 IKVLIVDDEPLARENLRVFLQEQSDIEIVGECSNAVEGIGAVHKLRPDVLFLDIQMPRIS 61
+L+ DD+ R L L ++ ++ SNA + D++ D+ MP +
Sbjct: 4 ATILVADDDAAIRTVLNQAL-SRAGYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 62 GLEMVGMLDPEHRPYI--VFLTAFD--EYAIKAFEEHAFDYLLKPIDEARLEKTLARLRQ 117
+++ + + RP + + ++A + AIKA E+ A+DYL KP D L + R
Sbjct: 62 AFDLLPRIK-KARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 118 ERSKQDVSLLPENQQALKFIPCTGHSRIYLLQMKDVAFVSSRMSGVYVT--SHEGKE 172
E ++ L ++Q + + G S + +A + + +T S GKE
Sbjct: 121 EPKRRPSKLEDDSQDGMPLV---GRSAAMQEIYRVLARLMQTDLTLMITGESGTGKE 174


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_2414  PF06580  220    4e-69    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 220 bits (562), Expect = 4e-69
Identities = 63/216 (29%), Positives = 115/216 (53%), Gaps = 3/216 (1%)

Query: 343 LGEGIAQLLSAQILAGQYERQKAMLTQSEIKLLHAQVNPHFLFNALNTIKAVIRRDSEQA 402
L G + + + +M ++++ L AQ+NPHF+FNALN I+A+I D +A
Sbjct: 134 LYFGWHFFKNYKQAEIDQWKMASMAQEAQLMALKAQINPHFMFNALNNIRALILEDPTKA 193

Query: 403 SQLVQYLSTFFRKNLKR-PSEFVTLADEIEHVNAYLQIEKARFQSRLQVNIAIPQELSQQ 461
+++ LS R +L+ + V+LADE+ V++YLQ+ +F+ RLQ I +
Sbjct: 194 REMLTSLSELMRYSLRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQINPAIMDV 253

Query: 462 QLPAFTLQPIVENAIKHGTSQLLDTGRVAISARREGQHLMLEIEDNAGL-YQPVTNASGL 520
Q+P +Q +VEN IKHG +QL G++ + ++ + LE+E+ L + ++G
Sbjct: 254 QVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKESTGT 313

Query: 521 GMNLVDKRLRERFGDDYGISVACEPDSYTRITLRLP 556
G+ V +RL+ +G + I ++ + + +P
Sbjct: 314 GLQNVRERLQMLYGTEAQIKLSEKQGKVN-AMVLIP 348


90  EcE24377A_2653  EcE24377A_2657  N        Y        N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias     Product
EcE24377A_2653  0       36       -9.968346   DHA2 family drug:H+ antiporter-1
EcE24377A_2654  0       36       -10.120442  drug resistance MFS transporter membrane fusion
EcE24377A_2655  1       35       -9.688223   DNA-binding transcriptional activator EvgA
EcE24377A_2657  1       35       -8.554062   hybrid sensory histidine kinase in two-component
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_2653  TCRTETB  121    4e-32    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 121 bits (306), Expect = 4e-32
Identities = 92/404 (22%), Positives = 167/404 (41%), Gaps = 17/404 (4%)

Query: 19 VTIALSLATFMQMLDSTISNVAIPTISGFLGASTDEGTWVITSFGVANAIAIPVTGRLAQ 78
+ I L + +F +L+ + NV++P I+ WV T+F + +I V G+L+
Sbjct: 15 ILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSD 74

Query: 79 RIGELRLFLLSVTFFSLSSLMCSLS-TNLDVLIFFRVVQGLMAGPLIPLSQSLLLRNYPP 137
++G RL L + S++ + + +LI R +QG A L ++ R P
Sbjct: 75 QLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPK 134

Query: 138 EKRTFALALWSMTVIIAPICGPILGGYICDNFSWGWIFLINVPMGIIVLTLCLTLLKGRE 197
E R A L V + GP +GG I W +L+ +PM I+ L L +E
Sbjct: 135 ENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWS--YLLLIPMITIITVPFLMKLLKKE 192

Query: 198 TETSPVKMNLPGLTLLVLGVGGLQIMLDKGRDLDWFNSSTIIILTVVSVISLISLVIWES 257
++ G+ L+ +G+ + ML F +S I +VSV+S + V
Sbjct: 193 VRIKG-HFDIKGIILMSVGI--VFFML--------FTTSYSISFLIVSVLSFLIFVKHIR 241

Query: 258 TSENPILDLSLFKSRNFTIGIVSITCAYLFYSGAIVLMPQLLQETMGYNAIWAGLAYAPI 317
+P +D L K+ F IG++ + +G + ++P ++++ + G
Sbjct: 242 KVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFP 301

Query: 318 GIMPLLIS-PLIGRYGNKIDMRLLVTFSFLMYAVCYYWRSVTFMPTIDFTGIILPQFFQG 376
G M ++I + G ++ ++ +V + S T F II+ G
Sbjct: 302 GTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGG 361

Query: 377 FAVACFFLPLTTISFSGLPDNKFANASSMSNFFRTLSGSVGTSL 420
+ ++TI S L + S+ NF LS G ++
Sbjct: 362 LSFTK--TVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAI 403


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits       Score  E-value  Comments
EcE24377A_2654  RTXTOXIND  76     4e-17    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 75.6 bits (186), Expect = 4e-17
Identities = 63/412 (15%), Positives = 122/412 (29%), Gaps = 96/412 (23%)

Query: 13 RRKYFSLLAVVLFIAFSGAYAYWSMELEDMISTDDAYVT-GNADPISAQVSGSVTVVNHK 71
RR ++ F+ + + ++E + + + G + I + V + K
Sbjct: 55 RRPRLVAYFIMGFLVIAFILSVLG-QVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVK 113

Query: 72 DTNYVRQGDILVSLDKTDATIALNKA---------------------------------- 97
+ VR+GD+L+ L A K
Sbjct: 114 EGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDE 173

Query: 98 ------------------KNNLANIVRQTNKLYLQDKQYSAEVASARIQ---YQQSLEDY 136
K + Q + L + AE + + Y+
Sbjct: 174 PYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVE 233

Query: 137 NRRV----PLAKQGVISKE----------TLEHTKDTLISSKAALNAAIQAYKANKALVM 182
R+ L + I+K + S + + I + K LV
Sbjct: 234 KSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVT 293

Query: 183 N-------TPLNR-QPQVVEAADATKEAWLALKRTDIRSPVTDYIAQRSVQ-VGETVSPG 233
L + + + + + IR+PV+ + Q V G V+
Sbjct: 294 QLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTA 353

Query: 234 QSLMAVVPARQ-MWVNANFKETQLTDVRIGQSVNIISDLYGENVVFHGRVTGINMGTGNA 292
++LM +VP + V A + + + +GQ+ I + F G +G
Sbjct: 354 ETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVE------AFPYTRYGYLVGK--- 404

Query: 293 FSLLPAQNATGNWIKIVQRVPVEVSLDPKELMEH----PLRIGLSMTATIDT 340
+ + +V V +S++ L PL G+++TA I T
Sbjct: 405 VKNINLDAIEDQRLGLVFNVI--ISIEENCLSTGNKNIPLSSGMAVTAEIKT 454


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits    Score  E-value  Comments
EcE24377A_2655  HTHFIS  49     3e-09    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 49.1 bits (117), Expect = 3e-09
Identities = 22/148 (14%), Positives = 53/148 (35%), Gaps = 31/148 (20%)

Query: 4 IIIDDHPLAIAAIRNLLIKNDIEILAELTEGGSAVQRVETLKPDIVIIDVDIPGVNGIQV 63
++ DD + L + ++ + + + + D+V+ DV +P N +
Sbjct: 7 LVADDDAAIRTVLNQALSRAGYDVRIT-SNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 64 LETLRKRQYSGIIIIVSAKNDHFYGKHCADAGANGFVSKKEGMNNIIAAIEAAKNGYCYF 123
L ++K + ++++SA+N + AI+A++ G +
Sbjct: 66 LPRIKKARPDLPVLVMSAQNT------------------------FMTAIKASEKGAYDY 101

Query: 124 ---PFSLNRFVGSLTSDQQKLDSLSKQE 148
PF L + + L ++
Sbjct: 102 LPKPFDLTE---LIGIIGRALAEPKRRP 126


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits    Score  E-value  Comments
EcE24377A_2657  HTHFIS  80     2e-17    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 79.9 bits (197), Expect = 2e-17
Identities = 30/105 (28%), Positives = 51/105 (48%)

Query: 960 SILIADDHPTNRLLLKRQLNLLGYDVDEATDGVQALHKVSMQHYDLLITDVNMPNMDGFE 1019
+IL+ADD R +L + L+ GYDV ++ ++ DL++TDV MP+ + F+
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 1020 LTRKLREQNSSLPIWGLTANAQANEREKGLSCGMNLCLFKPLTLD 1064
L ++++ LP+ ++A K G L KP L
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLT 109


91  EcE24377A_2964  EcE24377A_2970  N        Y        N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias   Product
EcE24377A_2964  -1      12       2.292828  major facilitator family transporter
EcE24377A_2965  -1      13       1.907392  branched chain amino acid ABC transporter
EcE24377A_2966  -2      14       1.336970  hypothetical protein
EcE24377A_2967  -2      11       0.965402  transcriptional repressor MprA
EcE24377A_2968  -1      12       1.346510  multidrug resistance protein A
EcE24377A_2969  -1      14       1.238143  multidrug resistance protein B
EcE24377A_2970  0       15       0.654632  S-ribosylhomocysteinase
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_2964  TCRTETB  45     3e-07    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 44.9 bits (106), Expect = 3e-07
Identities = 32/165 (19%), Positives = 70/165 (42%), Gaps = 2/165 (1%)

Query: 34 LDTIARNFSLSASSAGFIVTAAQLGYAAGLLFLVPLGDMFERRRLIVSMTLLAAGGMLIT 93
L IA +F+ +S ++ TA L ++ G L D +RL++ ++ G +I
Sbjct: 37 LPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIG 96

Query: 94 ASSQSLA-MMILGTALTGLFSVVAQILVPLA-ATLASPDKRGKVVGTIMSGLLLGILLAR 151
S ++I+ + G + LV + A + RGK G I S + +G +
Sbjct: 97 FVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGP 156

Query: 152 TVAGLLANLGGWRTVFWVASVLMALMALALWRGLPQMKSETHLNY 196
+ G++A+ W + + + + + + +++ + H +
Sbjct: 157 AIGGMIAHYIHWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDI 201


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_2967  PF05272  28     0.017    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 28.5 bits (63), Expect = 0.017
Identities = 23/94 (24%), Positives = 36/94 (38%), Gaps = 12/94 (12%)

Query: 23 PYQEILLTRLCMHMQSKLLENRNKMLKAQGINETLFMALITLESQENHSIQPSELSCALG 82
P QE+ L + + L R A+G + + T + ++L ALG
Sbjct: 756 PEQELRLVETGVQGRLWALLTREGAPAAEGAAQKGYSVNTTFVTI-------ADLVQALG 808

Query: 83 -----SSRTNATRIADELEKRGWIERRESDNDRR 111
SS ++ D L + GW RE+ RR
Sbjct: 809 ADPGKSSPMLEGQVRDWLNENGWEYLRETSGQRR 842


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits       Score  E-value  Comments
EcE24377A_2968  RTXTOXIND  79     5e-18    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 78.7 bits (194), Expect = 5e-18
Identities = 64/412 (15%), Positives = 120/412 (29%), Gaps = 97/412 (23%)

Query: 25 LLLTLLFIIIAVAIGIYWFLVLRHFEETDDA----YVAGNQIQIMSQVSGSVTKVWADNT 80
L FI+ + I VL E A +G +I + V ++
Sbjct: 57 PRLVAYFIMGFLVIAFILS-VLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEG 115

Query: 81 DFVKEGDVLVTLDPTDARQAFEKA------------------------------------ 104
+ V++GDVL+ L A K
Sbjct: 116 ESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPY 175

Query: 105 ----------------KTALASSVRQTHQLMINSKQLQANIEVQKIALAKA-------QS 141
K ++ Q +Q +N + +A + + +S
Sbjct: 176 FQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKS 235

Query: 142 DYNRRVPLGNANLIGREELQHARDAVTSAQAQLDVAIQQYNANQAMILGTKLEDQPAVQQ 201
+ L + I + + + A +L V Q ++ IL K E Q Q
Sbjct: 236 RLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQL 295

Query: 202 AATEVRN------------------AWLALERTRIVSPMTGYVSRRAVQ-PGAQISPTTP 242
E+ + + + I +P++ V + V G ++
Sbjct: 296 FKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAET 355

Query: 243 LMAVVPA-TNMWVDANFKETQIANMRIGQPVTITTDIYGDDVKY---TGKVVGLDMGTGS 298
LM +VP + V A + I + +GQ I + + +Y GKV + +
Sbjct: 356 LMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAF-PYTRYGYLVGKVKNI-----N 409

Query: 299 AFSLLPAQNATGNWIKVVQRLPVRIELDQKQLEQYPLRIGLSTLVSVNTTNR 350
++ G V+ + + PL G++ + T R
Sbjct: 410 LDAIE--DQRLGLVFNVIISIEENCLST--GNKNIPLSSGMAVTAEIKTGMR 457


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_2969  TCRTETB  132    9e-36    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 132 bits (333), Expect = 9e-36
Identities = 97/405 (23%), Positives = 169/405 (41%), Gaps = 23/405 (5%)

Query: 17 IALSLATFMQVLDSTIANVAIPTIAGNLGSSLSQGTWVITSFGVANAISIPLTGWLAKRV 76
I L + +F VL+ + NV++P IA + + WV T+F + +I + G L+ ++
Sbjct: 17 IWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQL 76

Query: 77 GEVKLFLWSTIAFAIASWACGVS-SSLNMLIFFRVIQGIVAGPLIPLSQSLLLNNYPPAK 135
G +L L+ I S V S ++LI R IQG A L ++ P
Sbjct: 77 GIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKEN 136

Query: 136 RSIALALWSMTVIVAPICGPILGGYISDNYHWGWIFFINVPIGVAVVLMTLQTLRGRETR 195
R A L V + GP +GG I+ HW + + +P+ + + L L +E R
Sbjct: 137 RGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWS--YLLLIPMITIITVPFLMKLLKKEVR 194

Query: 196 TERRRIDAVGLALLVIGIGSLQIMLDRGKELDWFSSQEIIILTVVAVVAICFLIVWELTD 255
+ D G+ L+ +GI + ML F++ I +V+V++ +
Sbjct: 195 I-KGHFDIKGIILMSVGI--VFFML--------FTTSYSISFLIVSVLSFLIFVKHIRKV 243

Query: 256 DNPIVDLSLFKSRNFTIGCLCISLAYMLYFGAIVLLPQLLQEVYGYTATWAGLASAPVGI 315
+P VD L K+ F IG LC + + G + ++P ++++V+ + G G
Sbjct: 244 TDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGT 303

Query: 316 IPVILS-PIIGRFAHKLDMRRLVTFSFIMYAVCFYWRAYTFEPGMDFGASAWPQFIQGF- 373
+ VI+ I G + ++ +V F ++ S + I F
Sbjct: 304 MSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFL-----LETTSWFMTIIIVFV 358

Query: 374 --AVACFFMPLTTITLSGLPPERLAAASSLSNFTRTLAGSIGTSI 416
++ ++TI S L + A SL NFT L+ G +I
Sbjct: 359 LGGLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAI 403


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits         Score  E-value  Comments
EcE24377A_2970  LUXSPROTEIN  292    e-105    Bacterial autoinducer-2 (AI-2) production protein Lu...
		>LUXSPROTEIN#Bacterial autoinducer-2 (AI-2) production protein LuxS signature.

Length = 171

Score = 292 bits (750), Expect = e-105
Identities = 131/170 (77%), Positives = 148/170 (87%)

Query: 2 PLLDSFTVDHTRMEAPAVRVAKTMNTPHGDAITVFDLRFCVPNKEVMPERGIHTLEHLFA 61
PLLDSFTVDHTRM APAVRVAKTM TP GD ITVFDLRF PNK+++ E+GIHTLEHL+A
Sbjct: 1 PLLDSFTVDHTRMNAPAVRVAKTMQTPKGDTITVFDLRFTAPNKDILSEKGIHTLEHLYA 60

Query: 62 GFMRNHLNGNGVEIIDISPMGCRTGFYMSLIGTPDEQRVADAWKAAMEDVLKVQDQNQIP 121
GFMRNHLNG+ VEIIDISPMGCRTGFYMSLIGTP EQ+VADAW AAMEDVLKV++QN+IP
Sbjct: 61 GFMRNHLNGDSVEIIDISPMGCRTGFYMSLIGTPSEQQVADAWIAAMEDVLKVENQNKIP 120

Query: 122 ELNVYQCGTYQMHSLQEAQDIARSILERDVRINSNEELALPKEKLQELHI 171
ELN YQCGT MHSL EA+ IA++ILE V +N N+ELALP+ L+EL I
Sbjct: 121 ELNEYQCGTAAMHSLDEAKQIAKNILEVGVAVNKNDELALPESMLRELRI 170


92  EcE24377A_3073  EcE24377A_3083  N        Y        N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
EcE24377A_3073  0       11       -1.738162  major facilitator family transporter
EcE24377A_3074  1       12       -3.074779  FAD binding domain-containing protein
EcE24377A_3075  1       17       -4.965431  short chain dehydrogenase/reductase
EcE24377A_3076  -1      15       -3.866004  major facilitator family transporter
EcE24377A_3077  0       19       -3.182431  carbohydrate kinase, FGGY family protein
EcE24377A_3080  0       22       -2.915029  hypothetical protein
EcE24377A_3081  1       27       -3.466362  hypothetical protein
EcE24377A_3082  1       27       -2.813155  hypothetical protein
EcE24377A_3083  1       24       -0.067152  phosphopyruvate hydratase
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_3073  TCRTETB  34     8e-04    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 34.5 bits (79), Expect = 8e-04
Identities = 45/314 (14%), Positives = 112/314 (35%), Gaps = 36/314 (11%)

Query: 69 LGSLVLGWISDHIGRQKIFTFSFLLITLASFLQFFATTP-EHLIGLRILIGIGLGGDYSV 127
+G+ V G +SD +G +++ F ++ S + F + LI R + G G ++
Sbjct: 64 IGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPAL 123

Query: 128 GHTLLAEFSPRRHRGILLGAFSVVWT----VGYVLASIAGHHFISENPEAWRWLLASAAL 183
++A + P+ +RG G + VG + + H+ W +LL +
Sbjct: 124 VMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYI------HWSYLLLIPMI 177

Query: 184 PALLITLLRWGTPESPRWLLRQGRFAEAHAIVHRYFGPHVLLGDEVVTATHKHIKTLF-- 241
+ + L + R +G F I+ +L + + + L
Sbjct: 178 TIITVPFLMKLLKKEVR---IKGHFDIKGIILMSVGIVFFMLFTTSYSISFLIVSVLSFL 234

Query: 242 -SSRYWRRTA--------FNSVFFVCLVIPWFVIYT----WLPTIAQTIGLEDALTASLM 288
++ R+ ++ F+ V+ +I+ ++ + + L+ + +
Sbjct: 235 IFVKHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEI 294

Query: 289 LNALLIVGALLGLV-------LTHLLAHRKFLLGSFLLLAATLVVMACLPSGSSLTLLLF 341
+ ++ G + ++ L L L+ + + + L +S + +
Sbjct: 295 GSVIIFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTII 354

Query: 342 VLFSTTISAVSNLV 355
++F + + V
Sbjct: 355 IVFVLGGLSFTKTV 368


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3075  DHBDHDRGNASE  105    2e-29    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 105 bits (263), Expect = 2e-29
Identities = 74/257 (28%), Positives = 117/257 (45%), Gaps = 11/257 (4%)

Query: 11 MDFFSLKGKTAIVTGGNSGLGQAFAMALAKAGANIFIPSFVKDNGETKEMIEK-QGVEVD 69
M+ ++GK A +TG G+G+A A LA GA+I + + E K + +
Sbjct: 1 MNAKGIEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAE 60

Query: 70 FMQVDITAEGAPQKIIAACCERFGTVDILVNNAGICKLNKVLDFGRADWDPMIDVNLTAA 129
D+ A +I A G +DILVN AG+ + + +W+ VN T
Sbjct: 61 AFPADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGV 120

Query: 130 FELSYEAAKIMIPQKSGKIINICSLFSYLGGQWSPAYSATKHALAGFTKAYCDELGQYNI 189
F S +K M+ ++SG I+ + S + + AY+++K A FTK EL +YNI
Sbjct: 121 FNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNI 180

Query: 190 QVNGIAPGYYATDI--TLATRSNPETNQRVLDH-------IPANRWGDTQDLMGAAVFLA 240
+ N ++PG TD+ +L N Q + IP + D+ A +FL
Sbjct: 181 RCNIVSPGSTETDMQWSLWADENGAE-QVIKGSLETFKTGIPLKKLAKPSDIADAVLFLV 239

Query: 241 SPASNYVNGHLLVVDGG 257
S + ++ H L VDGG
Sbjct: 240 SGQAGHITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_3076  TCRTETA  29     0.042    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 29.0 bits (65), Expect = 0.042
Identities = 21/103 (20%), Positives = 45/103 (43%), Gaps = 8/103 (7%)

Query: 48 GLIMSTFGIAAIILYAPSGVIADKFSHRKMITSAVIITGLLGLLMATYPPLWVMLCIQVA 107
G++++ + + G ++D+F R ++ ++ + +MAT P LWV+ ++
Sbjct: 46 GILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIV 105

Query: 108 FAITTILMLWSVSIKAASLLGD---HSEQGKIMGWMEGLRGVG 147
IT + A + + D E+ + G+M G G
Sbjct: 106 AGITG-----ATGAVAGAYIADITDGDERARHFGFMSACFGFG 143


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_3082  cloacin  34     5e-04    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 34.3 bits (78), Expect = 5e-04
Identities = 12/34 (35%), Positives = 15/34 (44%)

Query: 254 SGKSYHSDNSGSGGGSSGGGFSGGGGSSGGGGAS 287
SG H G G G SGGG +GG ++
Sbjct: 50 SGSGIHWGGGSGHGNGGGNGNSGGGSGTGGNLSA 83



Score = 33.9 bits (77), Expect = 6e-04
Identities = 15/36 (41%), Positives = 20/36 (55%)

Query: 253 ASGKSYHSDNSGSGGGSSGGGFSGGGGSSGGGGASG 288
+ G + S+N+ GGGS G GGG G GG +G
Sbjct: 34 SDGSGWSSENNPWGGGSGSGIHWGGGSGHGNGGGNG 69



Score = 32.4 bits (73), Expect = 0.002
Identities = 12/30 (40%), Positives = 12/30 (40%)

Query: 259 HSDNSGSGGGSSGGGFSGGGGSSGGGGASG 288
GGGS G G G S GG G G
Sbjct: 50 SGSGIHWGGGSGHGNGGGNGNSGGGSGTGG 79



Score = 31.6 bits (71), Expect = 0.003
Identities = 12/31 (38%), Positives = 13/31 (41%)

Query: 255 GKSYHSDNSGSGGGSSGGGFSGGGGSSGGGG 285
G G G +GGG GG SG GG
Sbjct: 49 GSGSGIHWGGGSGHGNGGGNGNSGGGSGTGG 79



Score = 30.8 bits (69), Expect = 0.007
Identities = 11/27 (40%), Positives = 12/27 (44%)

Query: 261 DNSGSGGGSSGGGFSGGGGSSGGGGAS 287
SGSG GG G GG +G G
Sbjct: 48 GGSGSGIHWGGGSGHGNGGGNGNSGGG 74



Score = 29.7 bits (66), Expect = 0.016
Identities = 11/34 (32%), Positives = 14/34 (41%)

Query: 255 GKSYHSDNSGSGGGSSGGGFSGGGGSSGGGGASG 288
G S + G G G GG +G G G G +
Sbjct: 48 GGSGSGIHWGGGSGHGNGGGNGNSGGGSGTGGNL 81


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3083  ANTHRAXTOXNA  29     0.038    Anthrax toxin LF subunit signature.
		>ANTHRAXTOXNA#Anthrax toxin LF subunit signature.

Length = 800

Score = 29.3 bits (65), Expect = 0.038
Identities = 31/132 (23%), Positives = 51/132 (38%), Gaps = 9/132 (6%)

Query: 211 GYAPNLGSNAEALAVIAEAVKAAGYELGKDITLAMDCAASEFYKDGKYVLA-----GEGN 265
P L N + A+ +E K YE+GK I+L + + ++ + +
Sbjct: 147 RETPKLIINIKDYAINSEQSKEVYYEIGKGISLDIISKDKSLDPEFLNLIKSLSDDSDSS 206

Query: 266 KAFTSEEFTHFLEELTKQYPIVSIEDGLDESDW---DGFAYQTKVLG-DKIQLVGDDLFV 321
S++F LE K I I++ L E F+Y ++L D+F
Sbjct: 207 DLLFSQKFKEKLELNNKSIDINFIKENLTEFQHAFSLAFSYYFAPDHRTVLELYAPDMFE 266

Query: 322 TNTKILKEGIEK 333
K+ K G EK
Sbjct: 267 YMNKLEKGGFEK 278


93  EcE24377A_3177  EcE24377A_3190  N        Y        Y  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias     Product
EcE24377A_3177  6       53       -16.641020  type III secretion apparatus lipoprotein EprK
EcE24377A_3179  5       51       -17.362344  type III secretion apparatus protein EprH
EcE24377A_3181  5       51       -16.486720  FlhB/HrpN/YscU/SpaS family protein
EcE24377A_3182  4       45       -13.888072  type III secretion apparatus protein EpaR
EcE24377A_3183  4       44       -14.316816  type III secretion apparatus protein EpaQ
EcE24377A_3184  3       38       -11.832758  surface presentation of antigens protein SpaP
EcE24377A_3185  -1      25       -5.585459   type III secretion apparatus protein EpaO2
EcE24377A_3186  -1      19       -3.000636   type III secretion apparatus protein,
EcE24377A_3187  -1      19       -2.573820   hypothetical protein
EcE24377A_3188  0       17       -1.287593   hypothetical protein
EcE24377A_3190  -2      13       -1.175336   *M23B family peptidase
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3177  FLGMRINGFLIF  33     0.001    Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 33.0 bits (75), Expect = 0.001
Identities = 22/126 (17%), Positives = 49/126 (38%), Gaps = 5/126 (3%)

Query: 4 ISLLLFILLLCGCKQQE-LLNHLDQQQANDVLAVLQRHNINAEKKDQGKTGFSIFVEPTD 62
+++++ ++L L ++L Q ++A L + NI + I V
Sbjct: 35 VAIVVAMVLWAKTPDYRTLFSNLSDQDGGAIVAQLTQMNIPYRFANGSGA---IEVPADK 91

Query: 63 FASAVDWLKIYNLPGKPDIQISQMFPADALVSSPRAEKARLYSAIEQRLEQSLKIMDGII 122
L LP + + + S +E+ A+E L ++++ + +
Sbjct: 92 VHELRLRLAQQGLPKGGAVGFE-LLDQEKFGISQFSEQVNYQRALEGELARTIETLGPVK 150

Query: 123 SSRVHV 128
S+RVH+
Sbjct: 151 SARVHL 156


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3181  TYPE3IMSPROT  198    3e-64    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 198 bits (504), Expect = 3e-64
Identities = 72/242 (29%), Positives = 126/242 (52%), Gaps = 5/242 (2%)

Query: 2 ANKTEKPTQKKLQDASKKGQILKSRDLTISVIMLVG--TLYLGYVFDVHHIMSILEYILD 59
KTE+PT KK++DA KKGQ+ KS+++ + +++ L + H ++ +
Sbjct: 3 GEKTEQPTPKKIRDARKKGQVAKSKEVVSTALIVALSAMLMGLSDYYFEHFSKLMLIPAE 62

Query: 60 HNAKPDIWD---YFKAMGVGWLKTNIPFLLVCMFTTILVSWFQSKMQLATEAVKFKFDSL 116
+ P + + + P L V I Q ++ EA+K +
Sbjct: 63 QSYLPFSQALSYVVDNVLLEFFYLCFPLLTVAALMAIASHVVQYGFLISGEAIKPDIKKI 122

Query: 117 NPVNGLKRIFGLKTVKEFVKAILYIVFFALAIKVFWSNHKSLLFKTLDGDIISLLSDWGE 176
NP+ G KRIF +K++ EF+K+IL +V ++ I + + L + I + G+
Sbjct: 123 NPIEGAKRIFSIKSLVEFLKSILKVVLLSILIWIIIKGNLVTLLQLPTCGIECITPLLGQ 182

Query: 177 MLFLLILYCLGSMIIVLIFDFIAEYFLFMKDMKMDKQEVKREYKEQEGNPEIKSKRRERI 236
+L L++ C +++ I D+ EY+ ++K++KM K E+KREYKE EG+PEIKSKRR+
Sbjct: 183 ILRQLMVICTVGFVVISIADYAFEYYQYIKELKMSKDEIKREYKEMEGSPEIKSKRRQFH 242

Query: 237 RK 238
++
Sbjct: 243 QE 244


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3182  TYPE3IMRPROT  135    7e-41    Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 135 bits (341), Expect = 7e-41
Identities = 47/248 (18%), Positives = 98/248 (39%), Gaps = 4/248 (1%)

Query: 1 MGEAILYQLHSLLAATALGFCRLAPTFYLLPFFASGNIPTVVRHPIIIVVSCALVQHYHY 60
M + Q S L R+ P + ++P V+ + ++++ A+
Sbjct: 1 MLQVTSEQWLSWLNLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA 60

Query: 61 ELLNLNEIDIALFAAREIIIGLFIACLLASPFWIFLAIGSFIDNQRGATLSSTLDPATGV 120
+ + A ++I+IG+ + + F G I Q G + ++ +DPA+ +
Sbjct: 61 NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHL 120

Query: 121 DTSELARLFNLFSAAVYLTKGGMNFILETLWQSHNLWPSGNFNF--PKLEPLFSYINNIM 178
+ LAR+ ++ + ++LT G +++ L + + P G L + I
Sbjct: 121 NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIF 180

Query: 179 THTIVYASPVIAVMLGGEAVLGLLACYASQLNAFAISLTVKSALAFLILIIYFA--PILA 236
+ ++ A P+I ++L LGLL A QL+ F I + + ++
Sbjct: 181 LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC 240

Query: 237 ERVMPLSF 244
E + F
Sbjct: 241 EHLFSEIF 248


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3183  TYPE3IMQPROT  79     4e-23    Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.
Length = 86

Score = 78.7 bits (194), Expect = 4e-23
Identities = 59/86 (68%), Positives = 73/86 (84%)

Query: 1 MDDIVFAGNRALYLILVMSAGPIAVATFVGLLVGLFQTVTQLQEQTLPFGVKLLCVSICF 60
MDD+VFAGN+ALYL+L++S P VAT +GLLVGLFQTVTQLQEQTLPFG+KLL V +C
Sbjct: 1 MDDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCL 60

Query: 61 FLMSGWYGEKLYSFGIEMLNLAFARG 86
FL+SGWYGE L S+G +++ LA A+G
Sbjct: 61 FLLSGWYGEVLLSYGRQVIFLALAKG 86


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3184  TYPE3IMPPROT  224    1e-76    Type III secretion system inner membrane P protein ...
		>TYPE3IMPPROT#Type III secretion system inner membrane P protein family signature.
Length = 224

Score = 224 bits (572), Expect = 1e-76
Identities = 150/223 (67%), Positives = 180/223 (80%), Gaps = 5/223 (2%)

Query: 1 MSNSISLIAILSLFTLLPFIIASGTCFIKFSIVFVIVRKALGLQQVPSNMTLNGVALLLS 60
M N ISLIA+L+ TLLPFIIASGTCF+KFSIVFV+VR ALGLQQ+PSNMTLNGVALLLS
Sbjct: 1 MGNDISLIALLAFSTLLPFIIASGTCFVKFSIVFVMVRNALGLQQIPSNMTLNGVALLLS 60

Query: 61 MFVMMPVGKEIYNNSQNENLSFNNVASVVNFVETGMSGYKSYLIKYSEPELVSFFEKIQK 120
MFVM P+ + Y ++E+++FN+++S+ V+ G+ GY+ YLIKYS+ ELV FFE Q
Sbjct: 61 MFVMWPIMHDAYVYFEDEDVTFNDISSLSKHVDEGLDGYRDYLIKYSDRELVQFFENAQL 120

Query: 121 VNSSEDNEEIIDDD-----NISIFSLLPAYALSEIKSAFIIGFYIYLPFVVVDLVISSVL 175
+ E + D SIF+LLPAYALSEIKSAF IGFY+YLPFVVVDLV+SSVL
Sbjct: 121 KRQYGEETETVKRDKDEIEKPSIFALLPAYALSEIKSAFKIGFYLYLPFVVVDLVVSSVL 180

Query: 176 LTLGMMMMSPVTISTPIKLILFVAMDGWTMLSKGLILQYFDLS 218
L LGMMMMSPVTISTPIKL+LFVA+DGWT+LSKGLILQY D++
Sbjct: 181 LALGMMMMSPVTISTPIKLVLFVALDGWTLLSKGLILQYMDIA 223


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3185  TYPE3OMOPROT  104    1e-29    Type III secretion system outer membrane O protein ...
		>TYPE3OMOPROT#Type III secretion system outer membrane O protein family signature.
Length = 303

Score = 104 bits (261), Expect = 1e-29
Identities = 62/186 (33%), Positives = 90/186 (48%), Gaps = 13/186 (6%)

Query: 3 LWLDKVTVCQYGNAPALDKKSLYWSIHFVIGFSKTCYRSLVDIEVGDVLLISNNLAYAVI 62
LW + + K L W + FVIG S T L I +GDVLLI + A
Sbjct: 129 LWFEHLPELPAVGGGRP--KMLRWPLRFVIGSSDTQRSLLGRIGIGDVLLIRTSRA---- 182

Query: 63 YNTKICDLIYPEELKMADHFEYEEDFETDDFDIKKNESEIYDENDDQMINSFEDLPVKIE 122
+Y K+ E + DI+ E E + + LPVK+E
Sbjct: 183 -------EVYCYAKKLGHFNRVEGGIIVETLDIQHIEEENNTTETAETLPGLNQLPVKLE 235

Query: 123 FVLGKKIMNLYEIDELCAKRIISLLPESEKNIKIRVNGALTGYGELVEVDDKLGVEIHSW 182
FVL +K + L E++ + ++++SL +E N++I NG L G GELV+++D LGVEIH W
Sbjct: 236 FVLYRKNVTLAELEAMGQQQLLSLPTNAELNVEIMANGVLLGNGELVQMNDTLGVEIHEW 295

Query: 183 LSGNNN 188
LS + N
Sbjct: 296 LSESGN 301


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits       Score  E-value  Comments
EcE24377A_3190  RTXTOXIND  38     4e-05    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.
Length = 478

Score = 37.5 bits (87), Expect = 4e-05
Identities = 18/82 (21%), Positives = 31/82 (37%), Gaps = 12/82 (14%)

Query: 160 AAGAGKVVYVGNQLRGYGNLIMIKHSEDYITAYAHNDTMLVNNGQSVKAGQKIATMGSTD 219
A GK+ + G IK E+ I ++V G+SV+ G + + +
Sbjct: 84 ATANGKLTHSGRSK-------EIKPIENSIV-----KEIIVKEGESVRKGDVLLKLTALG 131

Query: 220 ATSVRLHFQIRYRATAIDPLRY 241
A + L Q ++ RY
Sbjct: 132 AEADTLKTQSSLLQARLEQTRY 153


94  EcE24377A_3421  EcE24377A_3432  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
EcE24377A_3421   0      20        5.600948  general secretion pathway protein J
EcE24377A_3422  -1      21        4.766141  general secretion pathway protein I
EcE24377A_3423  -2      17        3.772482  general secretion pathway protein H
EcE24377A_3424  -2      15        3.105050  general secretion pathway protein G
EcE24377A_3425  -2      14        2.974383  general secretion pathway protein F
EcE24377A_3426  -3      13        2.489879  general secretory pathway protein E
EcE24377A_3427  -2      12        0.492689  general secretion pathway protein D
EcE24377A_3428  -2      12       -1.252924  type II secretion protein GspC
EcE24377A_3429  -1      12       -0.702964  lipoprotein
EcE24377A_3430  -3      12        0.209003  type IV prepilin peptidase
EcE24377A_3431  -2      12        0.321364  hypothetical protein
EcE24377A_3432  -3      11        0.661215  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3421  BCTERIALGSPG  30     0.003    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.
Length = 145

Score = 30.2 bits (68), Expect = 0.003
Identities = 17/48 (35%), Positives = 23/48 (47%), Gaps = 3/48 (6%)

Query: 1 MRRTR--AGFTLLEMLVAIAIFASLA-LMAQQVTNGVTRVNSAVAGHD 45
MR T GFTLLE++V I I LA L+ + + + A D
Sbjct: 1 MRATDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSD 48


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3422  BCTERIALGSPH  33     1e-04    Bacterial general secretion pathway protein H signa...
		>BCTERIALGSPH#Bacterial general secretion pathway protein H signature.
Length = 170

Score = 33.0 bits (75), Expect = 1e-04
Identities = 13/24 (54%), Positives = 18/24 (75%)

Query: 2 KRGFTLLEVMLALAIFALAATAVL 25
+RGFTLLE+ML L + ++A VL
Sbjct: 3 QRGFTLLEMMLILLLMGVSAGMVL 26


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3423  BCTERIALGSPH  74     4e-19    Bacterial general secretion pathway protein H signa...
		>BCTERIALGSPH#Bacterial general secretion pathway protein H signature.
Length = 170

Score = 74.2 bits (182), Expect = 4e-19
Identities = 41/196 (20%), Positives = 69/196 (35%), Gaps = 41/196 (20%)

Query: 1 MPERGFTLLEIMLVIFLIGLASSGVVQTFATDSEPPAKKAAQDFLTRFAQFKDRAVIEGQ 60
M +RGFTLLE+ML++ L+G+++ V+ F + A + F + + R + GQ
Sbjct: 1 MRQRGFTLLEMMLILLLMGVSAGMVLLAFPASRDDSAAQTLARFEAQLRFVQQRGLQTGQ 60

Query: 61 TLGVLIDAPGYQFMQRRQGQWLPVSATRLSAQVTVPKQVQMLLQPGSDIWQKEYALELQR 120
GV + +QF+ + P D W L L+
Sbjct: 61 FFGVSVHPDRWQFLVLEARDGADPA-------------------PADDGWSGYRWLPLRA 101

Query: 121 RRL----TLHDIELEL-----QKEAKKKTPQIRFSPFEPATPFTLRFYSAAQNACWAVKL 171
R+ ++ +L L + P + P TPF L L
Sbjct: 102 GRVATSGSIAGGKLNLAFAQGEAWTPGDNPDVLIFPGGEMTPFRLT-------------L 148

Query: 172 AHDGALSLNQCDERMP 187
++ N E +P
Sbjct: 149 GEAPGIAFNARGESLP 164


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3424  BCTERIALGSPG  217    5e-76    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.
Length = 145

Score = 217 bits (553), Expect = 5e-76
Identities = 90/142 (63%), Positives = 107/142 (75%), Gaps = 3/142 (2%)

Query: 6 RTQKPRAGFTLLEVMVVIVILGVLASLVVPNLLGNKEKADRQKAISDIVALENALDMYRL 65
R + GFTLLE+MVVIVI+GVLASLVVPNL+GNKEKAD+QKA+SDIVALENALDMY+L
Sbjct: 2 RATDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYKL 61

Query: 66 DNGRYPTTEQGLEALIQQPANMADSRNYRTGGYIKRLPKDPWGNDYQYLSPGEKGLFDVY 125
DN YPTT QGLE+L++ P + NY GYIKRLP DPWGNDY ++PGE G +D+
Sbjct: 62 DNHHYPTTNQGLESLVEAPTLPPLAANYNKEGYIKRLPADPWGNDYVLVNPGEHGAYDLL 121

Query: 126 TLGADGQENGEGAGADIGNWNL 147
+ G DG+ E DI NW L
Sbjct: 122 SAGPDGEMGTED---DITNWGL 140


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3425  BCTERIALGSPF  452    e-161    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.
Length = 408

Score = 452 bits (1165), Expect = e-161
Identities = 225/406 (55%), Positives = 301/406 (74%), Gaps = 1/406 (0%)

Query: 1 MALFYYQALERNGRKTKGMIEADSARHARQLLRGKELIPVHI-EARMNTSSGGMLQRRRH 59
MA ++YQAL+ G+K +G EADSAR ARQLLR + L+P+ + E R + G
Sbjct: 1 MAQYHYQALDAQGKKCRGTQEADSARQARQLLRERGLVPLSVDENRGDQQKSGSTGLSLR 60

Query: 60 AHRRVAAADLALFTRQLATLVQAAMPLETCLQAVSEQSEKLHVKSLGMALRSRIQEGYTL 119
R++ +DLAL TRQLATLV A+MPLE L AV++QSEK H+ L A+RS++ EG++L
Sbjct: 61 RKIRLSTSDLALLTRQLATLVAASMPLEEALDAVAKQSEKPHLSQLMAAVRSKVMEGHSL 120

Query: 120 SDSLREHPRVFDSLFCSMVAAGEKSGHLDVVLNRLADYTEQRQRLKSRLLQAMLYPLVLL 179
+D+++ P F+ L+C+MVAAGE SGHLD VLNRLADYTEQRQ+++SR+ QAM+YP VL
Sbjct: 121 ADAMKCFPGSFERLYCAMVAAGETSGHLDAVLNRLADYTEQRQQMRSRIQQAMIYPCVLT 180

Query: 180 VVATGVVTILLTAVVPKIIEQFDHLGHALPASTRTLIAMSDALQASGVYWLAGLLALLVL 239
VVA VV+ILL+ VVPK++EQF H+ ALP STR L+ MSDA++ G + L LLA +
Sbjct: 181 VVAIAVVSILLSVVVPKVVEQFIHMKQALPLSTRVLMGMSDAVRTFGPWMLLALLAGFMA 240

Query: 240 GQRLLKNPTMRLRWDKTVLRLPVTGRVARGLNTARFSRTLSILTASSVPLLEGIQTAAAV 299
+ +L+ R+ + + +L LP+ GR+ARGLNTAR++RTLSIL AS+VPLL+ ++ + V
Sbjct: 241 FRVMLRQEKRRVSFHRRLLHLPLIGRIARGLNTARYARTLSILNASAVPLLQAMRISGDV 300

Query: 300 SANRYVEQQLLLAADRVREGSSLRAALAELRLFPPMMLYMIASGEQSGELETMLEQAAVN 359
+N Y +L LA D VREG SL AL + LFPPMM +MIASGE+SGEL++MLE+AA N
Sbjct: 301 MSNDYARHRLSLATDAVREGVSLHKALEQTALFPPMMRHMIASGERSGELDSMLERAADN 360

Query: 360 QEREFDTQVGLALGLFEPALVVMMAGVVLFIVIAILEPMLQLNNMV 405
Q+REF +Q+ LALGLFEP LVV MA VVLFIV+AIL+P+LQLN ++
Sbjct: 361 QDREFSSQMTLALGLFEPLLVVSMAAVVLFIVLAILQPILQLNTLM 406


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3427  BCTERIALGSPD  574    0.0      Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D signature.
Length = 660

Score = 574 bits (1482), Expect = 0.0
Identities = 295/668 (44%), Positives = 431/668 (64%), Gaps = 34/668 (5%)

Query: 24 LLPLVLAAALCSSPVWAEEATFTANFKDTDLKSFIETVGANLNKTIIMGPGVQGKVSIRT 83
L L++ AAL P AEE F+A+FK TD++ FI TV NLNKT+I+ P V+G +++R+
Sbjct: 11 SLTLLIFAALLFRPAAAEE--FSASFKGTDIQEFINTVSKNLNKTVIIDPSVRGTITVRS 68

Query: 84 MTPLNERQYYQLFLNLLEAQGYAVVPMENDVLKVVKSSAAKVEPLPLVGEGSDNYAGDEM 143
LNE QYYQ FL++L+ G+AV+ M N VLKVV+S AK +P+ + + GDE+
Sbjct: 69 YDMLNEEQYYQFFLSVLDVYGFAVINMNNGVLKVVRSKDAKTAAVPVASDAAPG-IGDEV 127

Query: 144 VTKVVPVRNVSVRELAPILRQMIDSAGSGNVVNYDPSNVIMLTGRASVVERLTEVIQRVD 203
VT+VVP+ NV+ R+LAP+LRQ+ D+AG G+VV+Y+PSNV+++TGRA+V++RL +++RVD
Sbjct: 128 VTRVVPLTNVAARDLAPLLRQLNDNAGVGSVVHYEPSNVLLMTGRAAVIKRLLTIVERVD 187

Query: 204 HAGNRTEEVIPLDNASASEIARVLESLTKNSGENQ-PATLKSQIVADERTNSVIVSGDPA 262
+AG+R+ +PL ASA+++ +++ L K++ ++ P ++ + +VADERTN+V+VSG+P
Sbjct: 188 NAGDRSVVTVPLSWASAADVVKLVTELNKDTSKSALPGSMVANVVADERTNAVLVSGEPN 247

Query: 263 TRDKMRRLIRRLDSEMERSGNSQVFYLKYSKAEDLVDVLKQVSGTLTAAKEEAEGTVGSG 322
+R ++ +I++LD + GN++V YLKY+KA DLV+VL +S T+ + K+ A+ +
Sbjct: 248 SRQRIIAMIKQLDRQQATQGNTKVIYLKYAKASDLVEVLTGISSTMQSEKQAAKPV-AAL 306

Query: 323 REIVSIAASKHSNALIVTAPQDIMQSLQSVIEQLDIRRAQVHVEALIVEVAEGSNINFGV 382
+ + I A +NALIVTA D+M L+ VI QLDIRR QV VEA+I EV + +N G+
Sbjct: 307 DKNIIIKAHGQTNALIVTAAPDVMNDLERVIAQLDIRRPQVLVEAIIAEVQDADGLNLGI 366

Query: 383 QWASKDAGLMQFANGTQIPIGTLGAAISQAKPQKGSTVISENGATTINPDTNGDLST-LA 441
QWA+K+AG+ QF N + +PI T A + +G +S+ LA
Sbjct: 367 QWANKNAGMTQFTN-SGLPISTAIAG-------------------ANQYNKDGTVSSSLA 406

Query: 442 QLLSGFSGTAVGVVKGDWMALVQAVKNDSSSNVLSTPSITTLDNQEAFFMVGQDVPVLTG 501
LS F+G A G +G+W L+ A+ + + +++L+TPSI TLDN EA F VGQ+VPVLTG
Sbjct: 407 SALSSFNGIAAGFYQGNWAMLLTALSSSTKNDILATPSIVTLDNMEATFNVGQEVPVLTG 466

Query: 502 STVGSNNSNPFNTVERKKVGIMLKVTPQINEGNAVQMVIEQEVSKVEGQTS-----LDVV 556
S S N FNTVERK VGI LKV PQINEG++V + IEQEVS V S L
Sbjct: 467 SQTTSG-DNIFNTVERKTVGIKLKVKPQINEGDSVLLEIEQEVSSVADAASSTSSDLGAT 525

Query: 557 FGERKLKTTVLANDGELIVLGGLMDDQAGESVAKVPLLGDIPLIGNLFKSTADKKEKRNL 616
F R + VL GE +V+GGL+D ++ KVPLLGDIP+IG LF+ST+ K KRNL
Sbjct: 526 FNTRTVNNAVLVGSGETVVVGGLLDKSVSDTADKVPLLGDIPVIGALFRSTSKKVSKRNL 585

Query: 617 MVFIRPTILRDGMAADGVSQRKYNYMRAEQIYR--DEQGLSLMPHTAQPVLPAQNQALPP 674
M+FIRPT++RD S +Y Q + E +++ + P Q+ A
Sbjct: 586 MLFIRPTVIRDRDEYRQASSGQYTAFNDAQSKQRGKENNDAMLNQDLLEIYPRQDTAAFR 645

Query: 675 EVRAFLNA 682
+V A ++A
Sbjct: 646 QVSAAIDA 653


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3428  BCTERIALGSPC  114    3e-32    Bacterial general secretion pathway protein C signa...
		>BCTERIALGSPC#Bacterial general secretion pathway protein C signature.
Length = 272

Score = 114 bits (287), Expect = 3e-32
Identities = 68/287 (23%), Positives = 113/287 (39%), Gaps = 40/287 (13%)

Query: 40 IVRGMFWLMLLIISAKVAHSLWRYFSFSAEYTA-VSPSANKPPRADAKTFDKNDVQLISQ 98
I R +F+L++L+ ++A WR A VS P +A + ND L
Sbjct: 14 IRRILFYLLMLLFCQQLAMIFWR---IGLPDNAPVSSVQITPAQARQQPVTLNDFTL--- 67

Query: 99 QNWFGKYQPV--ATPVKQPEPVPVAETRLNVVLRGIAFG---ARPGAVIEEGGKQQVYLQ 153
FG A + + + + LN+ L G+ G +R A+I + +Q
Sbjct: 68 ---FGVSPEKNKAGALDASQMSNLPPSTLNLSLTGVMAGDDDSRSIAIISKDNEQFSRGV 124

Query: 154 GETLGSHNAVIEEINRDHVMLRYQGKIERLSLAEEGHSTVAVTNKKAVSDEAKQAVAEPA 213
E + +NA I I D V+L+YQG+ E L L +
Sbjct: 125 NEEVPGYNAKIVSIRPDRVVLQYQGRYEVLGLYSQ-----------------------ED 161

Query: 214 VSAPVEIPTAVRQAL-AKDPQKIFNYIQLTPVRKEG-IVGYAVKPGADRSLFDASGFKEG 271
+ V + L + + +Y+ +P+ + + GY + PG F G ++
Sbjct: 162 SGSDGVPGAQVNEQLQQRASTTMSDYVSFSPIMNDNKLQGYRLNPGPKSDSFYRVGLQDN 221

Query: 272 DIAIALNQQDFTDPRAMIALMRQLPSMDSIQLTVLRKGARHDISIAL 318
D+A+ALN D D M ++ + + LTV R G R DI +
Sbjct: 222 DMAVALNGLDLRDAEQAKKAMERMADVHNFTLTVERDGQRQDIYMEF 268


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3430  PREPILNPTASE  277    1e-95    Type IV prepilin cysteine protease (C20) family sig...
		>PREPILNPTASE#Type IV prepilin cysteine protease (C20) family signature.
Length = 290

Score = 277 bits (709), Expect = 1e-95
Identities = 110/271 (40%), Positives = 151/271 (55%), Gaps = 12/271 (4%)

Query: 1 MLFDVFQQYPAAMPILATVGGLIIGSFLNVVIWRYPIML-RQQMAEFHGEMPSAQSKI-- 57
+L ++ P L + L+IGSFLNVVI R PIML R+ AE+ +
Sbjct: 3 LLLELAHGLPWLYFSLVFLFSLMIGSFLNVVIHRLPIMLEREWQAEYRSYFNPDDEGVDE 62

Query: 58 ---SLALPRSHCPHCQQTIRIRDNIPLLSWLMLKGRCRDCQAKISKRYPLVELLTALAFL 114
+L +PRS CPHC I +NIPLLSWL L+GRCR CQA IS RYPLVELLTAL +
Sbjct: 63 PPYNLMVPRSCCPHCNHPITALENIPLLSWLWLRGRCRGCQAPISARYPLVELLTALLSV 122

Query: 115 LASLVWPESGWALAVMILSAWLIAASVIDLDRQWLPDVFTQGVLWTGLIAAWAQQSPLTL 174
++ LA ++L+ L+A + IDLD+ LPD T +LW GL+ ++L
Sbjct: 123 AVAMTLAPGWGTLAALLLTWVLVALTFIDLDKMLLPDQLTLPLLWGGLLFNL-LGGFVSL 181

Query: 175 QDAVTGVLVGFIAFYSLRWIAGVVLRKEALGMGDVLLFAALGSWVGPLSLPNVALIASCC 234
DAV G + G++ +SL W ++ KE +G GD L AALG+W+G +LP V L++S
Sbjct: 182 GDAVIGAMAGYLVLWSLYWAFKLLTGKEGMGYGDFKLLAALGAWLGWQALPIVLLLSSLV 241

Query: 235 GLIYAVI-----TKRGSTTLPFGPCLSLGGI 260
G + S +PFGP L++ G
Sbjct: 242 GAFMGIGLILLRNHHQSKPIPFGPYLAIAGW 272


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_3432  PF03544  48     8e-08    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 48.0 bits (114), Expect = 8e-08
Identities = 23/60 (38%), Positives = 29/60 (48%), Gaps = 3/60 (5%)

Query: 17 SSDTPPVDSGTGSLPEVKPDPTPNPEPTPEPTPDPEPTPEPTPD---PEPTPEPEPEPVP 73
S T + V+P P P EP PEP P PEP E P+P P+P+P+PV
Sbjct: 50 ISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVK 109



Score = 41.9 bits (98), Expect = 1e-05
Identities = 14/57 (24%), Positives = 19/57 (33%), Gaps = 1/57 (1%)

Query: 18 SDTPPVDSGTGSLPEVKPDPTPNPEPTPEPTPDPEPTPEPTPDPEPTPEPEPEPVPT 74
+D P + PE +P P PEP PEP + E + V
Sbjct: 58 ADLEPPQAVQ-PPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVEQ 113



Score = 40.7 bits (95), Expect = 2e-05
Identities = 20/117 (17%), Positives = 37/117 (31%), Gaps = 6/117 (5%)

Query: 20 TPPVDSGTGSLPEVKPDPTPNPEPTPEPTPDPEPTPEPTPDP--EPTPEPEPEPVPTKTG 77
P + +P +P PEP +PEP PEP P+P E E K
Sbjct: 45 APAQPISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPK 104

Query: 78 YLTLGGSLRVTGDITCNDESSDGFTFTPGDKVTCVAGNNTTIATFDTQSEAARSLRA 134
+ + D+ + +P + ++T ++ + +
Sbjct: 105 PKPVKKVEQPKRDVKPVESRPA----SPFENTAPARPTSSTATAATSKPVTSVASGP 157



Score = 37.3 bits (86), Expect = 3e-04
Identities = 15/61 (24%), Positives = 23/61 (37%), Gaps = 6/61 (9%)

Query: 14 SGSSSDTPPVDSGTGSLPEVKPDPTPNPEPTPE-PTPDPEPTPEPTPDPEPTPEPEPEPV 72
+ P V PE +P P P E P P+P P+P + +P+ +
Sbjct: 65 AVQPPPEPVV----EPEPEPEPIPEPPKEAPVVIEKPKPKPKPKP-KPVKKVEQPKRDVK 119

Query: 73 P 73
P
Sbjct: 120 P 120



Score = 35.3 bits (81), Expect = 0.001
Identities = 17/40 (42%), Positives = 17/40 (42%)

Query: 35 PDPTPNPEPTPEPTPDPEPTPEPTPDPEPTPEPEPEPVPT 74
P P T D EP P PEP EPEPEP P
Sbjct: 44 PAPAQPISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPI 83



Score = 30.7 bits (69), Expect = 0.042
Identities = 11/40 (27%), Positives = 13/40 (32%)

Query: 37 PTPNPEPTPEPTPDPEPTPEPTPDPEPTPEPEPEPVPTKT 76
P P + + P P P P EPEP P
Sbjct: 44 PAPAQPISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPI 83


95  EcE24377A_3508  EcE24377A_3515  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias     Product
EcE24377A_3508   4      36       -10.464907  fimbrial protein
EcE24377A_3509   1      32        -8.571097  fimbrial usher protein
EcE24377A_3510  -1      22        -4.562625  periplasmic pilus chaperone family protein
EcE24377A_3511   0      10        -0.555758  hypothetical protein
EcE24377A_3512   0       8         1.609440  glycogen synthesis protein GlgS
EcE24377A_3513   0       8         1.680112  inner membrane protein
EcE24377A_3514   0       9         2.809572  hypothetical protein
EcE24377A_3515  -1      13         3.561250  bifunctional heptose 7-phosphate kinase/heptose
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3508  FIMBRIALPAPE  28     0.011    Escherichia coli: P pili tip fibrillum papE protein...
		>FIMBRIALPAPE#Escherichia coli: P pili tip fibrillum papE protein signature.
Length = 173

Score = 28.5 bits (63), Expect = 0.011
Identities = 36/163 (22%), Positives = 66/163 (40%), Gaps = 35/163 (21%)

Query: 14 AMILSNNVFADEGHGIVKFKGEVISAPCSIKPGDEDLTVNLGEVADTVLKSDQKSLAE-- 71
A+++S +V A + + FKG++I C++ ++ VN G++ L + +
Sbjct: 15 AVLMSQHVHAADN---LTFKGKLIIPACTV----QNAEVNWGDIEIQNLVQSGGNQKDFT 67

Query: 72 -----PFTIHLQDCMLSQGGTTYSKAKVTFTTANTMTGQTDLLKNTKETEIGGATGVGVR 126
P+++ ++ G T + V T+ + G L N+ + IG A
Sbjct: 68 VDMNCPYSLGTMKVTITSNGQTGNSILVPNTSTASGDGLLIYLYNSNNSGIGNA------ 121

Query: 127 ILDSQSGEVTLGTPVV---ITFNNTNS----YQELNFKARMES 162
VTLG+ V IT Y +L +K M+S
Sbjct: 122 --------VTLGSQVTPGKITGTAPARKITLYAKLGYKGNMQS 156


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_3509  PF00577  637    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 637 bits (1645), Expect = 0.0
Identities = 224/837 (26%), Positives = 388/837 (46%), Gaps = 56/837 (6%)

Query: 17 IYCSLSVLIIGCASAYAVEFNKDLIEAEDRENVNLSQFETDGQLPVGKYSLNALINNKRT 76
++ + + S+ + FN + + + +LS+FE +LP G Y ++ +NN
Sbjct: 30 LFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPGTYRVDIYLNNGYM 89

Query: 77 PIHLDLQWVLIDN--QTAVCLTPEQLTLLGFTDEIIEVAQQNLIDGCYPIEK-EKQITTY 133
D+ + D+ CLT QL +G + D C P+ T
Sbjct: 90 A-TRDVTFNTGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDACVPLTSMIHDATAQ 148

Query: 134 LDKGKMQLSISAPQAWLKYKDANWTPPELWDHGIAGAFLDYNLYASHYAPHQGDNSQNIS 193
LD G+ +L+++ PQA++ + + PPELWD GI L+YN + G NS
Sbjct: 149 LDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSVQNRIGGNSHYAY 208

Query: 194 SYGQAGVNLGAWRLRTDYQYDQSFNNGKS-QANNLDFPRIYLFRPIPAINAKLTIGQYDT 252
Q+G+N+GAWRLR + + + ++ S N +L R I + ++LT+G T
Sbjct: 209 LNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRSRLTLGDGYT 268

Query: 253 ESSIFDSFHFSGVSLKSDENMLPPDLRGYAPQITGVAQTNAKVTVSQNNRIIYQENVPPG 312
+ IFD +F G L SD+NMLP RG+AP I G+A+ A+VT+ QN IY VPPG
Sbjct: 269 QGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIYNSTVPPG 328

Query: 313 PFAITNLFNT-LQGQLDVKVEEEDGQVTQWQVASNSIPYLTRKGQIRYTTAMGKPTSVGG 371
PF I +++ G L V ++E DG + V +S+P L R+G RY+ G+ S G
Sbjct: 329 PFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITAGEYRS-GN 387

Query: 372 DSLQQPFFWTGEFSWGWLNNVSLYGGSVLTNRDYQSLAAGVGFNLNSLVSLSFDVTRSDA 431
++P F+ G ++YGG+ L +R Y++ G+G N+ +L +LS D+T++++
Sbjct: 388 AQQEKPRFFQSTLLHGLPAGWTIYGGTQLADR-YRAFNFGIGKNMGALGALSVDMTQANS 446

Query: 432 QLHNQDKETGYSYRANYSKRFESTGSQLTFAGYRFSDKNFVTMNEYIND----------- 480
L + + G S R Y+K +G+ + GYR+S + +
Sbjct: 447 TLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGYNIETQD 506

Query: 481 ---------TNHYTNYQNEKESYIVTFNQYLESLRLNTYVSLARNTYWDAS-SNVNYSLS 530
T++Y N++ +T Q L Y+S + TYW S + +
Sbjct: 507 GVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGR-TSTLYLSGSHQTYWGTSNVDEQFQAG 565

Query: 531 LSRDFDIGPLKNVSTSLTFSRIN--WEEDNQDQLYLNISIPWGTSR-----------TLS 577
L+ F ++++ +L++S W++ L LN++IP+ + S
Sbjct: 566 LNTAF-----EDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRSDSKSQWRHASAS 620

Query: 578 YGMQRNQDNKISHTASWYDS--SDRNNSWSVSASGDNDEFKDMKASLRASYQHNTENGRL 635
Y M + + ++++ A Y + D N S+SV + ++ A+ + G
Sbjct: 621 YSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGYGNA 680

Query: 636 YLSGTSQRDSYYSLNASWNGSFTATRHGAAFHDYSGSADSRFMIDADGAEDIPLNNKRAV 695
+ + D L +G A +G D+ ++ A GA+D + N+ V
Sbjct: 681 NIGYSHSDD-IKQLYYGVSGGVLAHANGVTLGQPLN--DTVVLVKAPGAKDAKVENQTGV 737

Query: 696 -TNRYGIGVIPSVSSYITTSLSVDTRNLPENVDIENSVITTTLTEGAIGYAKLDTRKGYQ 754
T+ G V+P + Y +++DT L +NVD++N+V T GAI A+ R G +
Sbjct: 738 RTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAEFKARVGIK 797

Query: 755 IMGVIRLADGSHPPLGISVKDKTSHKELGLVADGGFVYLNGIQDDSKLTLRWGDKSC 811
++ + + P G V ++S + G+VAD G VYL+G+ K+ ++WG++
Sbjct: 798 LLMTLT-HNNKPLPFGAMVTSESS-QSSGIVADNGQVYLSGMPLAGKVQVKWGEEEN 852


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits         Score  E-value  Comments
EcE24377A_3514  IGASERPTASE  52     5e-09    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 52.0 bits (124), Expect = 5e-09
Identities = 47/287 (16%), Positives = 93/287 (32%), Gaps = 16/287 (5%)

Query: 197 PNNAFDAEGLTKLTQETERRRRERNEVEQDVEVAVREKNRDALSRKLEIEQQEAFMTLEQ 256
N A+ + + E R A + + ++ E +QE+ +
Sbjct: 999 TPNNIQAD-VPSVPSNNEEIARVDEAPVPPPAPATPSETTETVA---ENSKQESKTVEKN 1054

Query: 257 EQQVKTRTAEQNARIAAFEAERRREAE-QTRILAERQIQETEIEREQAVRSRKVEAEREV 315
EQ TA+ R A EA+ +A QT +A+ + E + + + VE E +
Sbjct: 1055 EQDATETTAQN--REVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKA 1112

Query: 316 RIKEIEQQQVTEIANQTKSIAIAAKSEQ---QSQAEARANLALAEAVSAQQNVETTRQTA 372
+++ + Q+V ++ +Q +++ Q + E + + E S T Q A
Sbjct: 1113 KVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPA 1172

Query: 373 EADRAKQVALIAAAQDAET------KAVELTVRAKAEKEAAEMQAAAIVELAEATRKKGL 426
+ + + + T T +E + R
Sbjct: 1173 KETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPH 1232

Query: 427 AEAEAQRALNDAINVLSDEQTSLKFKLALLQALPAVIEKSVEPMKSI 473
A + ND V + TS L A ++ K++
Sbjct: 1233 NVEPATTSSNDRSTVALCDLTSTNTNAVLSDARAKAQFVALNVGKAV 1279


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3515  LPSBIOSNTHSS  29     0.028    Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein signature.
Length = 166

Score = 29.0 bits (65), Expect = 0.028
Identities = 10/37 (27%), Positives = 20/37 (54%)

Query: 347 GVFDILHAGHVSYLANARKLGDRLIVAVNSDASTKRL 383
G FD + GH+ + +L D++ VAV + + + +
Sbjct: 7 GSFDPITFGHLDIIERGCRLFDQVYVAVLRNPNKQPM 43


96  EcE24377A_3626  EcE24377A_3634  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
EcE24377A_3626   0      13       -0.949941  fimbrial usher family protein
EcE24377A_3627   0      12        0.806271  fimbrial protein
EcE24377A_3628   0      14        2.268652  tetrapyrrole methylase
EcE24377A_3629  -1      14        1.840776  lipoprotein
EcE24377A_3630   0      17        1.470288  hypothetical protein
EcE24377A_3631   0      17        1.420454  DnaA initiator-associating protein DiaA
EcE24377A_3632   0      19        2.300252  hypothetical protein
EcE24377A_3634   0      19        1.807325  NAD dependent epimerase/dehydratase
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_3626  PF00577  777    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 777 bits (2009), Expect = 0.0
Identities = 320/849 (37%), Positives = 470/849 (55%), Gaps = 48/849 (5%)

Query: 31 SGMLCTTANAEEYYFDPIMLETTKSGMQTTDLSRFSKKYAQLPGTYQVDIWLNKKKVSQK 90
+ ++ E YF+P L DLSRF PGTY+VDI+LN ++ +
Sbjct: 35 AFAAQAPLSSAELYFNPRFLAD--DPQAVADLSRFENGQELPPGTYRVDIYLNNGYMATR 92

Query: 91 KITFTAN-AEQLLQPQFTVEQLRELGIKVDEIPALAEKDDDSVINSLEQIIPGTAAEFDF 149
+TF +EQ + P T QL +G+ + + DD+ + L +I A+ D
Sbjct: 93 DVTFNTGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDACVP-LTSMIHDATAQLDV 151

Query: 150 NHQRLNLSIPQIALYRDARGYVSPSRWDDGIPTLFTNYSFTGSDNRYRQGNRSQRQYLNM 209
QRLNL+IPQ + ARGY+ P WD GI NY+F+G+ + R G S YLN+
Sbjct: 152 GQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSVQNRIGGNSHYAYLNL 211

Query: 210 QNGANFGPWRLRNYSTWTRNDQASS------WNTISSYLQRDIKALKSQLLLGESATSGS 263
Q+G N G WRLR+ +TW+ N SS W I+++L+RDI L+S+L LG+ T G
Sbjct: 212 QSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRSRLTLGDGYTQGD 271

Query: 264 IFSSYTFTGVQLASDDNMLPNSQRGFAPTVRGIANSSAIVTIRQNGYVIYQSNVPAGAFE 323
IF F G QLASDDNMLP+SQRGFAP + GIA +A VTI+QNGY IY S VP G F
Sbjct: 272 IFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIYNSTVPPGPFT 331

Query: 324 INDLYPSSNSGDLEVTIEESDGTQRRFIQPYSSLPMMQRPGHLKYSATAGRYRADANSDS 383
IND+Y + NSGDL+VTI+E+DG+ + F PYSS+P++QR GH +YS TAG YR+
Sbjct: 332 INDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITAGEYRSGNAQQE 391

Query: 384 KEPEFAEATAIYGLNNTFTLYGGLLGSEDYYALGIGIGGTLGALGALSMDINRADTQFDN 443
K P F ++T ++GL +T+YGG ++ Y A GIG +GALGALS+D+ +A++ +
Sbjct: 392 K-PRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALSVDMTQANSTLPD 450

Query: 444 RHSFHGYQWRTQYIKDIPETNTNIAVSYYRYTNDGYFSFDEA------------------ 485
G R Y K + E+ TNI + YRY+ GYF+F +
Sbjct: 451 DSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGYNIETQDGVIQ 510

Query: 486 ----NTRNWDYNSRQKSEIQFNISQTIFDGVSLYASGSQQDYWGNNDKNRNISVGVSGQQ 541
T ++ ++ ++Q ++Q + +LY SGS Q YWG ++ + G++
Sbjct: 511 VKPKFTDYYNLAYNKRGKLQLTVTQQLGRTSTLYLSGSHQTYWGTSNVDEQFQAGLNTAF 570

Query: 542 WGIGYSLNYQYSRYTDQN-NDRALSLNLSIPLERWLPRSR--------VSYQMTSQKDRP 592
I ++L+Y ++ Q D+ L+LN++IP WL SY M+ +
Sbjct: 571 EDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRSDSKSQWRHASASYSMSHDLNGR 630

Query: 593 TQHEMRLDGSLLDDGRLSYSLEQSLDDDNNHNS----SLNASYRSPYGTFSAGYSYGNDS 648
+ + G+LL+D LSYS++ + NS +YR YG + GYS+ +D
Sbjct: 631 MTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGYGNANIGYSHSDDI 690

Query: 649 SQYNYGVTGGVVIHPHGVTLSQYLGNAFALIDANGASGVRIQNYPGIATDPFGYAVVPYL 708
Q YGV+GGV+ H +GVTL Q L + L+ A GA +++N G+ TD GYAV+PY
Sbjct: 691 KQLYYGVSGGVLAHANGVTLGQPLNDTVVLVKAPGAKDAKVENQTGVRTDWRGYAVLPYA 750

Query: 709 TTYQENRLSVDTTQLPDNVDLEQTTQFVVPNRGAMVAARFNANIGYRVLVTVSDRNGKPL 768
T Y+ENR+++DT L DNVDL+ VVP RGA+V A F A +G ++L+T+ N KPL
Sbjct: 751 TEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAEFKARVGIKLLMTL-THNNKPL 809

Query: 769 PFGALASNDETGQQSIVDEGGILYLSGISSKSQSWTVRWGNQADQQCQFAFSTPDSEPTT 828
PFGA+ +++ + IV + G +YLSG+ + V+WG + + C + P
Sbjct: 810 PFGAMVTSESSQSSGIVADNGQVYLSGMPLAGK-VQVKWGEEENAHCVANYQLPPESQQQ 868

Query: 829 SVLQGTAQC 837
+ Q +A+C
Sbjct: 869 LLTQLSAEC 877


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3627  FIMBRIALPAPF  30     0.009    Escherichia coli: P pili tip fibrillum papF protein...
		>FIMBRIALPAPF#Escherichia coli: P pili tip fibrillum papF protein signature.
Length = 167

Score = 30.1 bits (67), Expect = 0.009
Identities = 43/160 (26%), Positives = 66/160 (41%), Gaps = 21/160 (13%)

Query: 208 VKLSIQGNLTAPQSCKINQGDVIKVNFGFINGQKFTTRNAMPDGFTPVDFNITYDCGDTS 267
V+++I+GN+ P C IN G I V+FG IN + V NI+ C S
Sbjct: 21 VQINIRGNVYIP-PCTINNGQNIVVDFGNINPEHVDNSRG------EVTKNISISCPYKS 73

Query: 268 KIKNSLQMRIDGTTGVVDQYNLVARRRSSDNAPDVGIRIENLGGGVANIPFQNG------ 321
SL +++ G T V Q N++A N GI + G + NG
Sbjct: 74 ---GSLWIKVTGNTMGVGQNNVLA-----TNITHFGIALYQGKGMSTPLTLGNGSGNGYR 125

Query: 322 ILPVDPSGHGTINMRAWPVNLVGGELETGKFQGTATITVI 361
+ + T + P G L G F+ TA++++I
Sbjct: 126 VTAGLDTARSTFTFTSVPFRNGSGILNGGDFRTTASMSMI 165


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3629  BINARYTOXINB  30     0.043    Binary toxin B family signature.
		>BINARYTOXINB#Binary toxin B family signature.

Length = 764

Score = 29.7 bits (66), Expect = 0.043
Identities = 11/72 (15%), Positives = 24/72 (33%), Gaps = 4/72 (5%)

Query: 487 AGVNGGSGIALTGTPITPRATTDSGMTTNNPTLQTTPTDDQFTNNGGRVDAVYIVATPGE 546
+ V+G + + + I + + ++ T D + G R A + +
Sbjct: 330 SEVHGNAEVHASFFDIGGSVSAGFSNSNSS----TVAIDHSLSLAGERTWAETMGLNTAD 385

Query: 547 IAFIKPMIAMRN 558
A + I N
Sbjct: 386 TARLNANIRYVN 397


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits       Score  E-value  Comments
EcE24377A_3631  RTXTOXINA  28     0.031    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.
Length = 1024

Score = 28.0 bits (62), Expect = 0.031
Identities = 26/111 (23%), Positives = 44/111 (39%), Gaps = 22/111 (19%)

Query: 42 NKILCCGNGTSAANAQHFAASMINRFETERPSLPAIALNTDNVVLTAIA-------NDRL 94
K+L GN + A T + IA + V AI+ D+
Sbjct: 277 TKVL--GNVGKGISQYIIAQRAAQGLSTSAAAAGLIA----SAVTLAISPLSFLSIADKF 330

Query: 95 HD----EVYAKQVRALGHAGDVLLAISTRGNSRDIVKAVEAAVTRDMTIVA 141
E Y+++ + LG+ GD LLA + A++A++T T++A
Sbjct: 331 KRANKIEEYSQRFKKLGYDGDSLLAAFHKETG-----AIDASLTTISTVLA 376


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3634  NUCEPIMERASE  29     0.014    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 29.0 bits (65), Expect = 0.014
Identities = 8/22 (36%), Positives = 13/22 (59%)

Query: 4 VLITGATGLVGGHLLRMLINEP 25
L+TGA G +G H+ + L+
Sbjct: 3 YLVTGAAGFIGFHVSKRLLEAG 24


97  EcE24377A_3717  EcE24377A_3724  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
EcE24377A_3717  -2      17       -0.708607  serine endoprotease
EcE24377A_3718  -2      13       -0.309083  serine endoprotease
EcE24377A_3719  -1      13       -0.114982  malate dehydrogenase
EcE24377A_3720  -2      12       -0.447452  arginine repressor ArgR
EcE24377A_3721  -2      13        0.135250  protein YcfR
EcE24377A_3722  -2      12        0.987264  hypothetical protein
EcE24377A_3723  -3      10        1.562037  p-hydroxybenzoic acid efflux subunit AaeB
EcE24377A_3724  -2      10        1.497874  p-hydroxybenzoic acid efflux subunit AaeA
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits        Score  E-value  Comments
EcE24377A_3717  V8PROTEASE  72     6e-16    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 72.0 bits (176), Expect = 6e-16
Identities = 32/184 (17%), Positives = 63/184 (34%), Gaps = 38/184 (20%)

Query: 90 GLGSGVIINASKGYVLTNNHVINQAQKISIQL------------NDGREFDAKLIGSDDQ 137
+ SGV++ K +LTN HV++ L +G ++ +
Sbjct: 102 FIASGVVVG--KDTLLTNKHVVDATHGDPHALKAFPSAINQDNYPNGGFTAEQITKYSGE 159

Query: 138 SDIALLQIQN-------PSKLTQIAIADSDKLRVGDFAVAVGNPFGLGQTATSGIVSALG 190
D+A+++ + ++++ + +V G P V+ +
Sbjct: 160 GDLAIVKFSPNEQNKHIGEVVKPATMSNNAETQVNQNITVTGYPGDKP-------VATMW 212

Query: 191 RSGLNLEGLEN-FIQTDASINRGNSGGALLNLNGELIGINTAILAPGGGSVGIGFAIPSN 249
S + L+ +Q D S GNSG + N E+IGI+ G+
Sbjct: 213 ESKGKITYLKGEAMQYDLSTTGGNSGSPVFNEKNEVIGIHWG---------GVPNEFNGA 263

Query: 250 MART 253
+
Sbjct: 264 VFIN 267


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits        Score  E-value  Comments
EcE24377A_3718  V8PROTEASE  53     8e-10    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 52.7 bits (126), Expect = 8e-10
Identities = 31/160 (19%), Positives = 59/160 (36%), Gaps = 26/160 (16%)

Query: 77 RTLGSGVIMDQRGYIITNKHVINDADQIIVALQ------------DGRVFEALLVGSDSL 124
+ SGV++ + ++TNKHV++ AL+ +G +
Sbjct: 101 TFIASGVVVG-KDTLLTNKHVVDATHGDPHALKAFPSAINQDNYPNGGFTAEQITKYSGE 159

Query: 125 TDLAVLKI-------NATGGLPTIPINARRVPHIGDVVLAIGNPYNLGQTITQGIISATG 177
DLA++K + + ++ + + G P + T + G
Sbjct: 160 GDLAIVKFSPNEQNKHIGEVVKPATMSNNAETQVNQNITVTGYPGD-KPVATMW--ESKG 216

Query: 178 RIGLNPTGRQNFLQTDASINHGNSGGALVNSLGELMGINT 217
+I + +Q D S GNSG + N E++GI+
Sbjct: 217 KI---TYLKGEAMQYDLSTTGGNSGSPVFNEKNEVIGIHW 253


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3719  DHBDHDRGNASE  28     0.045    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.
Length = 261

Score = 28.1 bits (62), Expect = 0.045
Identities = 37/167 (22%), Positives = 61/167 (36%), Gaps = 27/167 (16%)

Query: 3 VAVLGAAGGIGQALALLLKTQLPSGSELSLYDIAPVTPGVAVDLSHIPTAVKIKGFSGED 62
+ GAA GIG+A+A L G+ ++ D P V S A + F +
Sbjct: 11 AFITGAAQGIGEAVARTL---ASQGAHIAAVDYNP-EKLEKVVSSLKAEARHAEAFPADV 66

Query: 63 ATPA------------LEGADVVLISAGVARK------PGMDRSDLFNVNAGIVKNLVQQ 104
A + D+++ AGV R + F+VN+ V N +
Sbjct: 67 RDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRS 126

Query: 105 VAKTCPK----ACIGIITNPVNTT-VAIAAEVLKKAGVYDKNKLFGV 146
V+K + + + +NP ++AA KA K G+
Sbjct: 127 VSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGL 173


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3720  ARGREPRESSOR  169    4e-57    Bacterial arginine repressor signature.
		>ARGREPRESSOR#Bacterial arginine repressor signature.

Length = 149

Score = 169 bits (430), Expect = 4e-57
Identities = 44/141 (31%), Positives = 71/141 (50%), Gaps = 5/141 (3%)

Query: 15 KALLKEEKFSSQGEIVAALQEQGFDNINQSKVSRMLTKFGAVRTRNAKMEMVYCLPAELG 74
+ ++ + +Q E+V L++ G+ N+ Q+ VSR + + V+ Y LPA+
Sbjct: 11 REIITANEIETQDELVDILKKDGY-NVTQATVSRDIKELHLVKVPTNNGSYKYSLPADQR 69

Query: 75 VPTTSSPLKNLV---LDIDYNDAVVVIHTSPGAAQLIARLLDSLGKAEGILGTIAGDDTI 131
S ++L+ + ID ++V+ T PG AQ I L+D+L E I+GTI GDDTI
Sbjct: 70 FNPLSKLKRSLMDAFVKIDSASHLIVLKTMPGNAQAIGALMDNLDWEE-IMGTICGDDTI 128

Query: 132 FTTPANGFTVKDLYEAILELF 152
K + + ILEL
Sbjct: 129 LIICRTHDDTKVVQKKILELL 149


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3724  RTXTOXIND     53     4e-10    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 53.3 bits (128), Expect = 4e-10
Identities = 28/163 (17%), Positives = 59/163 (36%), Gaps = 16/163 (9%)

Query: 6 RKFSRTAITVVLVILAFIAIFNAWVYYTE----SPWTRDARFSADVVAIAPDVSGLITQV 61
SR V I+ F+ I + + S I P + ++ ++
Sbjct: 51 TPVSRRPRLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEI 110

Query: 62 NVHDNQLVKKGQILFTIDQPR-------YQKALEEAQADVAYYQVLAQEKRQEAGRRNRL 114
V + + V+KG +L + Q +L +A+ + YQ+L++ E + L
Sbjct: 111 IVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRS--IELNKLPEL 168

Query: 115 GVQAMSREEIDQANNVL---QTVLHQLAKAQATRDLAKLDLER 154
+ + VL + Q + Q + +L+L++
Sbjct: 169 KLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDK 211



Score = 51.4 bits (123), Expect = 2e-09
Identities = 28/147 (19%), Positives = 54/147 (36%), Gaps = 15/147 (10%)

Query: 100 LAQEKRQEAGRRNRLGVQ-AMSREEIDQANNVLQT-VLHQLAKAQAT-------RDLAKL 150
E R + ++ + ++EE + + +L +L + +
Sbjct: 264 AVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEE 323

Query: 151 DLERTVIRAPADGWVTNLNVYT-GEFITRGSTAVALVKQNSFY-VLAYMEETKLEGVRPG 208
+ +VIRAP V L V+T G +T T + +V ++ V A ++ + + G
Sbjct: 324 RQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVG 383

Query: 209 YRAEIT----PLGSNKVLKGTVDSVAA 231
A I P L G V ++
Sbjct: 384 QNAIIKVEAFPYTRYGYLVGKVKNINL 410


98  EcE24377A_3746  EcE24377A_3752  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
EcE24377A_3746  -2      11       -1.969340  DNA-binding protein Fis
EcE24377A_3747  -2      11       -1.659790  methyltransferase
EcE24377A_3748  -3      12       -1.452743  hypothetical protein
EcE24377A_3749  -3      12       -1.196417  DNA-binding transcriptional regulator EnvR
EcE24377A_3750  -2      13       -0.632190  acriflavine resistance protein E
EcE24377A_3751  -2      14       -0.696898  acriflavine resistance protein F
EcE24377A_3752  -1      17       -1.522551  lipoprotein
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3746  DNABINDNGFIS  157    3e-54    DNA-binding protein FIS signature.
		>DNABINDNGFIS#DNA-binding protein FIS signature.

Length = 98

Score = 157 bits (399), Expect = 3e-54
Identities = 98/98 (100%), Positives = 98/98 (100%)

Query: 1 MFEQRVNSDVLTVSTVNSQDQVTQKPLRDSVKQALKNYFAQLNGQDVNDLYELVLAEVEQ 60
MFEQRVNSDVLTVSTVNSQDQVTQKPLRDSVKQALKNYFAQLNGQDVNDLYELVLAEVEQ
Sbjct: 1 MFEQRVNSDVLTVSTVNSQDQVTQKPLRDSVKQALKNYFAQLNGQDVNDLYELVLAEVEQ 60

Query: 61 PLLDMVMQYTRGNQTRAALMMGINRGTLRKKLKKYGMN 98
PLLDMVMQYTRGNQTRAALMMGINRGTLRKKLKKYGMN
Sbjct: 61 PLLDMVMQYTRGNQTRAALMMGINRGTLRKKLKKYGMN 98


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3749  HTHTETR       127    6e-39    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 127 bits (321), Expect = 6e-39
Identities = 78/209 (37%), Positives = 122/209 (58%), Gaps = 3/209 (1%)

Query: 1 MAKRTKAEALKTRQELIETAIAQFAQHGVSKTTLNDIADAANVTRGAIYWHFENKTQLFN 60
MA++TK EA +TRQ +++ A+ F+Q GVS T+L +IA AA VTRGAIYWHF++K+ LF+
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60

Query: 61 EMW-LQQPSLRELIQEHLTAGLEHDPFQQLREKLIVGLQYIAKIPRQQALLKILYHKCEF 119
E+W L + ++ EL E A DP LRE LI L+ R++ L++I++HKCEF
Sbjct: 61 EIWELSESNIGELELE-YQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEF 119

Query: 120 NDEM-LAEGVIREKMGFNPQTLREVLQACQQQGCVANNLDLDVVMIIIDGAFSGIVQNWL 178
EM + + R + + + L+ C + + +L II+ G SG+++NWL
Sbjct: 120 VGEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWL 179

Query: 179 MNMAGYDLYKQAPALVDNVLRMFMPDENI 207
+DL K+A V +L M++ +
Sbjct: 180 FAPQSFDLKKEARDYVAILLEMYLLCPTL 208


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3750  RTXTOXIND     43     1e-06    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 42.9 bits (101), Expect = 1e-06
Identities = 38/217 (17%), Positives = 70/217 (32%), Gaps = 38/217 (17%)

Query: 98 ATYQANYDSAKGELAKSEAAAAIAHLTVKRYVPLVGTKYISQQEYDQAIADA-RQADAAV 156
K +L + E+ A + Q + I D RQ +
Sbjct: 262 VEAVNELRVYKSQLEQIESEILSAKEEYQLV----------TQLFKNEILDKLRQTTDNI 311

Query: 157 IAAKATVESARINLAYTKVTAPISGRIGK-STVTEGALVTNGQTTELATVQQLDPIYVDV 215
+ + + AP+S ++ + TEG +VT +T + V + D + V
Sbjct: 312 GLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETL-MVIVPEDDTLEVTA 370

Query: 216 TQSSND--FMRLKQSVEQGNLHKENATSNVELVMENGQTYP-LKGTLQ--FSDVTVDEST 270
+ D F+ + Q+ +++ Y L G ++ D D+
Sbjct: 371 LVQNKDIGFINVGQNAI------------IKVEAFPYTRYGYLVGKVKNINLDAIEDQRL 418

Query: 271 GSIT--LRAV------FPNPQHTLLPGMFVRARIDEG 299
G + + ++ N L GM V A I G
Sbjct: 419 GLVFNVIISIEENCLSTGNKNIPLSSGMAVTAEIKTG 455



Score = 34.4 bits (79), Expect = 7e-04
Identities = 22/127 (17%), Positives = 43/127 (33%), Gaps = 13/127 (10%)

Query: 46 TAPLEVKTELPGR-TNAYRIAEVRPQVSGIVLNRNFTEGSDVQAGQSLYQIDPATYQANY 104
+E+ G+ T++ R E++P + IV EG V+ G L ++ +A
Sbjct: 77 LGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEA-- 134

Query: 105 DSAKGELAKSEAAAAIAHLTVKRYVPLVGTKYISQQEYDQAIADARQADAAVIAAKATVE 164
+ K++++ A L RY L E ++ +
Sbjct: 135 -----DTLKTQSSLLQARLEQTRYQIL-----SRSIELNKLPELKLPDEPYFQNVSEEEV 184

Query: 165 SARINLA 171
+L
Sbjct: 185 LRLTSLI 191



Score = 29.0 bits (65), Expect = 0.031
Identities = 11/34 (32%), Positives = 15/34 (44%), Gaps = 1/34 (2%)

Query: 65 AEVRPQVSGIVLNRN-FTEGSDVQAGQSLYQIDP 97
+ +R VS V TEG V ++L I P
Sbjct: 328 SVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVP 361


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3751  ACRIFLAVINRP  1406   0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 1406 bits (3640), Expect = 0.0
Identities = 1029/1034 (99%), Positives = 1030/1034 (99%)

Query: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60
MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120
VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPDTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180
EVQQQGISVEKSSSSYLMVAGFVSDNP TTQDDISDYVASNVKDTLSRLNGVGDVQLFGA
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 181 QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRL 240
QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR
Sbjct: 181 QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRF 240

Query: 241 KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL 300
KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL
Sbjct: 241 KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL 300

Query: 301 DTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQ 360
DTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQ
Sbjct: 301 DTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQ 360

Query: 361 NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420
NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM
Sbjct: 361 NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420

Query: 421 MEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL 480
MEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL
Sbjct: 421 MEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL 480

Query: 481 SVLVALILTPALCATLLKPTSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRY 540
SVLVALILTPALCATLLKP SAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRY
Sbjct: 481 SVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRY 540

Query: 541 LLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKN 600
LLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKN
Sbjct: 541 LLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKN 600

Query: 601 EKANVESVFTVNSFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDG 660
EKANVESVFTVN FSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDG
Sbjct: 601 EKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDG 660

Query: 661 FVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNGL 720
FVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNGL
Sbjct: 661 FVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNGL 720

Query: 721 EDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKVYVQADAKFRM 780
EDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKK+YVQADAKFRM
Sbjct: 721 EDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFRM 780

Query: 781 LPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDAM 840
LPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDAM
Sbjct: 781 LPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDAM 840

Query: 841 ALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVSV 900
ALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVSV
Sbjct: 841 ALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVSV 900

Query: 901 MLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGVV 960
MLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGVV
Sbjct: 901 MLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGVV 960

Query: 961 EATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIFF 1020
EATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIFF
Sbjct: 961 EATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIFF 1020

Query: 1021 VPVFFVVIRRCFKG 1034
VPVFFVVIRRCFKG
Sbjct: 1021 VPVFFVVIRRCFKG 1034


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3752  adhesinb      28     0.004    Adhesin B signature.
		>adhesinb#Adhesin B signature.

Length = 310

Score = 27.5 bits (61), Expect = 0.004
Identities = 14/68 (20%), Positives = 26/68 (38%), Gaps = 10/68 (14%)

Query: 1 MKR---LIPVALLTALLAGCAHDSPCVPVYDDQGRLVHTNTCMKGTTQDNWETAGAIAGG 57
MK+ L+ + L LA C+ + +V TN+ + T++ IAG
Sbjct: 1 MKKCRFLVLLLLAFVGLAACSSQKSSTETGSSKLNVVATNSIIADITKN-------IAGD 53

Query: 58 AAAVAGLT 65
+ +
Sbjct: 54 KINLHSIV 61


99  EcE24377A_3806  EcE24377A_3809  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
EcE24377A_3806  4       47       -1.734074  A24 family peptidase
EcE24377A_3805  6       53       -1.391805  bacterioferritin
EcE24377A_3807  7       56       -0.503327  bacterioferritin-associated ferredoxin
EcE24377A_3808  7       54       -0.388891  elongation factor Tu
EcE24377A_3809  5       45       -0.547717  elongation factor G
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3806  PREPILNPTASE  147    1e-46    Type IV prepilin cysteine protease (C20) family sig...
		>PREPILNPTASE#Type IV prepilin cysteine protease (C20) family signature.

Length = 290

Score = 147 bits (372), Expect = 1e-46
Identities = 63/143 (44%), Positives = 85/143 (59%), Gaps = 2/143 (1%)

Query: 3 ATLPFLILYACLSVLLFLWDAKHGLLPDRFTCPLLWSGLLFSQVCNPDCLADALWGAIIG 62
TL L+L L L F+ D LLPD+ T PLLW GLLF+ + L DA+ GA+ G
Sbjct: 133 GTLAALLLTWVLVALTFI-DLDKMLLPDQLTLPLLWGGLLFNLLGGFVSLGDAVIGAMAG 191

Query: 63 YGTFAVIYWGYRILRHKEGLGYGDVKFLAALGAWHTWTFLPRLVFLAASFACGAVVVGLL 122
Y +YW +++L KEG+GYGD K LAALGAW W LP +V L +S + +GL+
Sbjct: 192 YLVLWSLYWAFKLLTGKEGMGYGDFKLLAALGAWLGWQALP-IVLLLSSLVGAFMGIGLI 250

Query: 123 MRGKESLKNPLPFGPFLAAAGFV 145
+ P+PFGP+LA AG++
Sbjct: 251 LLRNHHQSKPIPFGPYLAIAGWI 273


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3805  HELNAPAPROT   38     3e-06    Helicobacter neutrophil-activating protein A family ...
		>HELNAPAPROT#Helicobacter neutrophil-activating protein A family signature.

Length = 153

Score = 38.3 bits (89), Expect = 3e-06
Identities = 19/103 (18%), Positives = 43/103 (41%), Gaps = 10/103 (9%)

Query: 44 EYHESIDEMKHADKYIERILFLEGIPN--LQDLGKL------GIGEDVEEMLQSDLRLEL 95
E ++ E D ER+L + G P +++ + G EM+Q+ +
Sbjct: 52 ELYDHAAE--TVDTIAERLLAIGGQPVATVKEYTEHASITDGGNETSASEMVQALVNDYK 109

Query: 96 EGAKDLREAIAYADSVHDYVSRDMMIEILADEEGHIDWLETEL 138
+ + + + I A+ D + D+ + ++ + E + L + L
Sbjct: 110 QISSESKFVIGLAEENQDNATADLFVGLIEEVEKQVWMLSSYL 152


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3808  TCRTETOQM     80     3e-18    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 79.5 bits (196), Expect = 3e-18
Identities = 57/198 (28%), Positives = 87/198 (43%), Gaps = 13/198 (6%)

Query: 13 VNVGTIGHVDHGKTTLTAAI------TTVLAKTYGGAARAFDQIDNAPEEKARGITINTS 66
+N+G + HVD GKTTLT ++ T L G R DN E+ RGITI T
Sbjct: 4 INIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRT----DNTLLERQRGITIQTG 59

Query: 67 HVEYDTPTRHYAHVDCPGHADYVKNMITGAAQMDGAILVVAATDGPMPQTREHILLGRQV 126
+ +D PGH D++ + + +DGAIL+++A DG QTR R++
Sbjct: 60 ITSFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKM 119

Query: 127 GVPYIIVFLNKCDMVDDEELLELVEMEVRELLSQYDFPGDDTPIVRGSALKALEGDAEWE 186
G+P I F+NK D + L V +++E LS + + +W+
Sbjct: 120 GIP-TIFFINKIDQNGID--LSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWD 176

Query: 187 AKILELAGFLDSYIPEPE 204
I L+ Y+
Sbjct: 177 TVIEGNDDLLEKYMSGKS 194


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3809  TCRTETOQM     613    0.0      Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 613 bits (1583), Expect = 0.0
Identities = 178/698 (25%), Positives = 304/698 (43%), Gaps = 81/698 (11%)

Query: 9 RYRNIGISAHIDAGKTTTTERILFYTGVNHKIGEVHDGAATMDWMEQEQERGITITSAAT 68
+ NIG+ AH+DAGKTT TE +L+ +G ++G V G D E++RGITI + T
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 69 TAFWSGMAKQYEPHRINIIDTPGHVDFTIEVERSMRVLDGAVMVYCAVGGVQPQSETVWR 128
+ W ++NIIDTPGH+DF EV RS+ VLDGA+++ A GVQ Q+ ++
Sbjct: 62 SFQWEN-------TKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFH 114

Query: 129 QANKYKVPRIAFVNKMDRMGANFLKVVNQIKTRLGANPVPLQLAIGAEEHFTGVVDLVKM 188
K +P I F+NK+D+ G + V IK +L A V Q V M
Sbjct: 115 ALRKMGIPTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQ----------KVELYPNM 164

Query: 189 KAINWNDADQGVTFEYEDIPADMVELANEWHQNLIESAAEASEELMEKYLGGEELTEAEI 248
N+ +++Q ++ E +++L+EKY+ G+ L E+
Sbjct: 165 CVTNFTESEQ------------------------WDTVIEGNDDLLEKYMSGKSLEALEL 200

Query: 249 KGALRQRVLNNEIILVTCGSAFKNKGVQAMLDAVIDYLPSPVDVPAINGILDDGKDTPAE 308
+ R N + V GSA N G+ +++ + + S
Sbjct: 201 EQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVITNKFYSSTH----------------- 243

Query: 309 RHASDDEPFSALAFKIATDPFVGNLTFFRVYSGVVNSGDTVLNSVKAARERFGRIVQMHA 368
FKI L + R+YSGV++ D+V S K + + +
Sbjct: 244 ---RGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEK-EKIKITEMYTSIN 299

Query: 369 NKREEIKEVRAGDIAAAIG----LKDVTTGDTLCDPDAPIILERMEFPEPVISIAVEPKT 424
+ +I + +G+I L V GDT P ER+E P P++ VEP
Sbjct: 300 GELCKIDKAYSGEIVILQNEFLKLNSV-LGDTKLLPQR----ERIENPLPLLQTTVEPSK 354

Query: 425 KADQEKMGLALGRLAKEDPSFRVWTDEESNQTIIAGMGELHLDIIVDRMKREFNVEANVG 484
+E + AL ++ DP R + D +++ I++ +G++ +++ ++ +++VE +
Sbjct: 355 PQQREMLLDALLEISDSDPLLRYYVDSATHEIILSFLGKVQMEVTCALLQEKYHVEIEIK 414

Query: 485 KPQVAYRETIRQKVTDVEGKHAKQSGGRGQYGHVVIDMYPLEPGSNPKGYEFINDIKGGV 544
+P V Y E +K E + + + + + PL GS G ++ + + G
Sbjct: 415 EPTVIYMERPLKK---AEYTIHIEVPPNPFWASIGLSVSPLPLGS---GMQYESSVSLGY 468

Query: 545 IPGEYIPAVDKGIQEQLKAGPLAGYPVVDMGIRLHFGSYHDVDSSELAFKLAASIAFKEG 604
+ + AV +GI+ + G L G+ V D I +G Y+ S+ F++ A I ++
Sbjct: 469 LNQSFQNAVMEGIRYGCEQG-LYGWNVTDCKICFKYGLYYSPVSTPADFRMLAPIVLEQV 527

Query: 605 FKKAKPVLLEPIMKVEVETPEENTGDVIGDLSRRRGMLKGQESEVTGVKIHAEVPLSEMF 664
KKA LLEP + ++ P+E D + + + + V + E+P +
Sbjct: 528 LKKAGTELLEPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQLKNNEVILSGEIPARCIQ 587

Query: 665 GYATQLRSLTKGRASYTMEFLKYDEAPSNVAQAVIEAR 702
Y + L T GR+ E Y + V + R
Sbjct: 588 EYRSDLTFFTNGRSVCLTELKGYHVT---TGEPVCQPR 622


100  EcE24377A_3815  EcE24377A_3825  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
EcE24377A_3815  0       15       0.194324   hypothetical protein
EcE24377A_3816  0       15       1.489966   FKBP-type peptidylprolyl isomerase
EcE24377A_3817  0       15       3.081253   hypothetical protein
EcE24377A_3818  -2      14       2.988751   FKBP-type peptidylprolyl isomerase
EcE24377A_3819  0       15       2.770686   hypothetical protein
EcE24377A_3820  -2      13       2.730467   glutathione-regulated potassium-efflux system
EcE24377A_3821  -1      17       2.427732   glutathione-regulated potassium-efflux system
EcE24377A_3822  -1      18       1.712126   ABC transporter ATP-binding protein
EcE24377A_3823  -2      11       0.594367   hydrolase
EcE24377A_3824  -2      12       0.859302   hypothetical protein
EcE24377A_3825  -2      13       1.042359   phosphoribulokinase/uridine kinase
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3815  ACRIFLAVINRP  29     0.021    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 29.0 bits (65), Expect = 0.021
Identities = 14/62 (22%), Positives = 29/62 (46%), Gaps = 1/62 (1%)

Query: 160 ASSVEDLVTQTLEFTIEEVNADRNV-SNNAKNRQIVLNLYEKGIFDIKDAINQVADRLNI 218
A +V+D VTQ +E + ++ + S + + + L + D A QV ++L +
Sbjct: 54 AQTVQDTVTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQL 113

Query: 219 SK 220
+
Sbjct: 114 AT 115


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3816  INFPOTNTIATR  132    5e-40    Macrophage infectivity potentiator signature.
		>INFPOTNTIATR#Macrophage infectivity potentiator signature.

Length = 233

Score = 132 bits (334), Expect = 5e-40
Identities = 79/226 (34%), Positives = 124/226 (54%), Gaps = 9/226 (3%)

Query: 28 AAKPATTADSKAAFKNDDQKSAYALGASLGRYMENSLKEQEKLGIKLDKDQLIAGVQDAF 87
A A A + D K +Y++GA LG K + GI ++ D L G+QD
Sbjct: 14 AMSTAMAATDATSLTTDKDKLSYSIGADLG-------KNFKNQGIDINPDVLAKGMQDGM 66

Query: 88 A-DKSKLSDQEIEQTLQAFEARVKSSAQAKMEKDAADNEAKGKEYREKFAKEKGVKTSST 146
+ + L++++++ L F+ + + A+ K A +N+AKG + + G+ +
Sbjct: 67 SGAQLILTEEQMKDVLSKFQKDLMAKRSAEFNKKAEENKAKGDAFLSANKSKPGIVVLPS 126

Query: 147 GLVYQVVEAGKGEAPKDSDTVVVNYKGTLIDGKEFDNSYTRGEPLSFRLDGVIPGWTEGL 206
GL Y++++AG G P SDTV V Y GTLIDG FD++ G+P +F++ VIPGWTE L
Sbjct: 127 GLQYKIIDAGTGAKPGKSDTVTVEYTGTLIDGTVFDSTEKAGKPATFQVSQVIPGWTEAL 186

Query: 207 KNIKKGGKIKLVIPPELAYGKAGVPG-IPPNSTLVFDVELLDVKPA 251
+ + G ++ +P +LAYG V G I PN TL+F + L+ VK A
Sbjct: 187 QLMPAGSTWEVFVPADLAYGPRSVGGPIGPNETLIFKIHLISVKKA 232


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3820  60KDINNERMP   31     0.021    60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 30.7 bits (69), Expect = 0.021
Identities = 13/69 (18%), Positives = 29/69 (42%), Gaps = 6/69 (8%)

Query: 261 TAIDPFKGLLLG---LFFISVGMSLNLGVLYTHL-LWVVISVVVLVAVKILVLYLLARLY 316
A+ P L + L+FIS + L +++ + W +++ V+ ++ L
Sbjct: 318 AAVAPHLDLTVDYGWLWFISQPLFKLLKWIHSFVGNWGFSIIIITFIVRGIMYPLTKA-- 375

Query: 317 GVRSSERMQ 325
S +M+
Sbjct: 376 QYTSMAKMR 384


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3821  ISCHRISMTASE  32     0.001    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 31.9 bits (72), Expect = 0.001
Identities = 32/135 (23%), Positives = 51/135 (37%), Gaps = 16/135 (11%)

Query: 12 YAHPESQDSVANRVLLKPATQLSNVTVHDLYAHYPDFFIDIPREQALLREHEVIVFQH-- 69
Y P + D N+V P + + +HD+ ++ D F L + +
Sbjct: 9 YQMPTASDMPQNKVSWVPDPNRAVLLIHDMQNYFVDAFTAGASPVTELSANIRKLKNQCV 68

Query: 70 ----PLYTYSCPALLKEWLDRVLSRGFASGPGGNQLAGKYWRSVITTGEPESA------Y 119
P+ + P DR L F GPG N +G Y +IT PE +
Sbjct: 69 QLGIPVVYTAQPGSQNP-DDRALLTDFW-GPGLN--SGPYEEKIITELAPEDDDLVLTKW 124

Query: 120 RYDALNRYPMSDVLR 134
RY A R + +++R
Sbjct: 125 RYSAFKRTNLLEMMR 139


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3822  GPOSANCHOR    33     0.005    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 32.7 bits (74), Expect = 0.005
Identities = 28/152 (18%), Positives = 54/152 (35%), Gaps = 22/152 (14%)

Query: 504 KVEPFDGDLEDYQQWLSDVQKQENQTDEAPKENANSAQARKDQKRREAELRAQTQPLRKE 563
+ D + ++ E + + ++ R+ +R R + L E
Sbjct: 272 AMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEAE 331

Query: 564 IARLEKEME---------------------KLNAQLAQAEEKLGDSELYDQSRKAELTAC 602
+LE++ + +L A+ + EE+ SE QS + +L A
Sbjct: 332 HQKLEEQNKISEASRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDAS 391

Query: 603 LQQQASAKSGLEECEMAWLEAQEQLEQMLLEG 634
+ + + LEE L A E+L + L E
Sbjct: 392 REAKKQVEKALEEANSK-LAALEKLNKELEES 422



Score = 32.0 bits (72), Expect = 0.008
Identities = 13/125 (10%), Positives = 39/125 (31%), Gaps = 7/125 (5%)

Query: 513 EDYQQWLSDVQKQENQTDEAPKENANSAQARKDQKRREAELRAQTQPLRKEIARLEKEME 572
+ + ++ + E A A + D ++ + +++
Sbjct: 127 KALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFST-------ADSAKIK 179

Query: 573 KLNAQLAQAEEKLGDSELYDQSRKAELTACLQQQASAKSGLEECEMAWLEAQEQLEQMLL 632
L A+ A E + + E + TA + + ++ + ++ LE +
Sbjct: 180 TLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMN 239

Query: 633 EGQSN 637
++
Sbjct: 240 FSTAD 244


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3825  PF07299       32     0.002    Fibronectin-binding protein (FBP)
		>PF07299#Fibronectin-binding protein (FBP)

Length = 219

Score = 31.8 bits (72), Expect = 0.002
Identities = 10/46 (21%), Positives = 21/46 (45%), Gaps = 2/46 (4%)

Query: 71 PEANDFGLLEQTFIEYGQSGKGKSRKYLHTYDEAVPWNQVPGTFTP 116
P+ + + E ++ KG SRK++ ++ + + GTF
Sbjct: 112 PDMEELDMKELSY--LSWIDKGSSRKFIIAKNDKNKFVGLQGTFQS 155


101  EcE24377A_3924  EcE24377A_3931  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
EcE24377A_3924  -1      21       3.448521   gamma-glutamyltranspeptidase
EcE24377A_3925  -1      24       3.904602   hypothetical protein
EcE24377A_3927  -1      24       3.636951   hypothetical protein
EcE24377A_3926  -1      25       3.885178   glycerophosphodiester phosphodiesterase
EcE24377A_3928  -1      28       3.953095   glycerol-3-phosphate transporter ATP-binding
EcE24377A_3929  -2      27       3.084214   glycerol-3-phosphate transporter membrane
EcE24377A_3930  -2      25       3.522655   glycerol-3-phosphate transporter permease
EcE24377A_3931  -2      24       3.492392   glycerol-3-phosphate transporter periplasmic
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3924  NAFLGMOTY     32     0.007    Sodium-type flagellar protein MotY precursor signature.
		>NAFLGMOTY#Sodium-type flagellar protein MotY precursor signature.

Length = 293

Score = 31.6 bits (71), Expect = 0.007
Identities = 27/82 (32%), Positives = 37/82 (45%), Gaps = 17/82 (20%)

Query: 272 RTPISGDYRGYQVYSMPPPSSGGIHIVQILNI--LENFDMKKYGF-GSADAMQIMAEAEK 328
R P+ G+ R + SMPPP G H +I N+ + FD G+ G A I++E EK
Sbjct: 77 RRPM-GETRNVSLISMPPPWRPGEHADRITNLKFFKQFD----GYVGGQTAWGILSELEK 131

Query: 329 YAYADRSEYLGDPDFVKVPWQA 350
Y P F WQ+
Sbjct: 132 GRY---------PTFSYQDWQS 144


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3926  PF04619       28     0.017    Dr-family adhesin
		>PF04619#Dr-family adhesin

Length = 160

Score = 28.4 bits (63), Expect = 0.017
Identities = 12/60 (20%), Positives = 22/60 (36%), Gaps = 4/60 (6%)

Query: 29 VGAKYGHKMIEFDAKLSKDGEIFLLHDDNLERTSNGWGVAGELNWQD----LLRVDAGSW 84
+G ++ D + G+ FL+ D+N ++ W + D GSW
Sbjct: 70 LGCDARQVALKADTDNFEQGKFFLISDNNRDKLYVNIRPTDNSAWTTDNGVFYKNDVGSW 129


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3928  PF05272       32     0.003    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 32.4 bits (73), Expect = 0.003
Identities = 13/43 (30%), Positives = 20/43 (46%), Gaps = 7/43 (16%)

Query: 33 IVMVGPSGCGKSTLLRMVAGLERVTEGDIWINDQRVTEMEPKD 75
+V+ G G GKSTL+ + GL+ + +D KD
Sbjct: 599 VVLEGTGGIGKSTLINTLVGLD-------FFSDTHFDIGTGKD 634


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3931  MALTOSEBP     40     2e-05    Maltose binding protein signature.
		>MALTOSEBP#Maltose binding protein signature.

Length = 396

Score = 39.7 bits (92), Expect = 2e-05
Identities = 39/160 (24%), Positives = 66/160 (41%), Gaps = 14/160 (8%)

Query: 134 GHLLSQPFNSSTPVLYYNKDAFKKAGLDPEQPPKTWQDLADYSAKLKASGMKCGYASGWQ 193
G L++ P L YNKD PPKTW+++ +LKA G + +
Sbjct: 127 GKLIAYPIAVEALSLIYNKDLLP-------NPPKTWEEIPALDKELKAKGKSALMFNLQE 179

Query: 194 GWIQLENFSAWNGLPFASKNNGFDGTDAVLEF--NKPEQVKHIAMLEEMNKKGDFSYVGR 251
+ +A G F +N +D D ++ K + +++ + D Y
Sbjct: 180 PYFTWPLIAADGGYAFKYENGKYDIKDVGVDNAGAKAGLTFLVDLIKNKHMNADTDY--- 236

Query: 252 KDESTEKFYNGDCAMTTASSGSLANIREYAKFNYGVGMMP 291
+ F G+ AMT + +NI + +K NYGV ++P
Sbjct: 237 -SIAEAAFNKGETAMTINGPWAWSNI-DTSKVNYGVTVLP 274


102  EcE24377A_3964  EcE24377A_3974  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
EcE24377A_3964  0       20       3.932432   nickel transporter ATP-binding protein NikE
EcE24377A_3965  0       15       -0.308599  nickel responsive regulator
EcE24377A_3966  1       18       -3.792900  HicB family protein
EcE24377A_3967  1       18       -3.761301  ABC transporter ATP-binding protein
EcE24377A_3968  0       21       -4.929262  ABC transporter ATP-binding protein
EcE24377A_3969  -1      28       -7.564216  MFP family transporter
EcE24377A_3970  0       24       -7.053556  hypothetical protein
EcE24377A_3971  0       17       -4.950256  hypothetical protein
EcE24377A_3972  -1      12       -0.528030  hypothetical protein
EcE24377A_3973  0       14       0.490421   hypothetical protein
EcE24377A_3974  0       14       1.707738   pyridine nucleotide-disulfide oxidoreductase
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3964  HTHFIS        29     0.018    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 29.0 bits (65), Expect = 0.018
Identities = 10/34 (29%), Positives = 19/34 (55%)

Query: 25 QAVLNNVSLTLKSGETVALLGRSGCGKSTLARLL 58
Q + ++ +++ T+ + G SG GK +AR L
Sbjct: 147 QEIYRVLARLMQTDLTLMITGESGTGKELVARAL 180


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3967  ABC2TRNSPORT  50     4e-09    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 50.3 bits (120), Expect = 4e-09
Identities = 41/171 (23%), Positives = 73/171 (42%), Gaps = 7/171 (4%)

Query: 200 REREHGTVEHLLVMPITPFEIMMAKV-WSMGLVVLVVSGLSLVLMVKGVLGVPIEGSIPL 258
R T E +L + +I++ ++ W+ L +G+ +V G + L
Sbjct: 93 RMEGQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGAGIGVVAAALGY----TQWLSLL 148

Query: 259 FMLGV-ALSLFATTSIGIFMGTIARSMPQLGLLVILVLLPLQMLSGGSTPRESMPQMVQD 317
+ L V AL+ A S+G+ + +A S LV+ P+ LSG P + +P + Q
Sbjct: 149 YALPVIALTGLAFASLGMVVTALAPSYDYFIFYQTLVITPILFLSGAVFPVDQLPIVFQT 208

Query: 318 IMLTMPTTHFVSLAQAILYRGAGFEIVWPQFLTLMAIGGAFF-TIALLRFR 367
+P +H + L + I+ ++ + I FF + ALLR R
Sbjct: 209 AARFLPLSHSIDLIRPIMLGHPVVDVCQHVGALCIYIVIPFFLSTALLRRR 259


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3968  PF05272       30     0.045    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.4 bits (68), Expect = 0.045
Identities = 9/26 (34%), Positives = 14/26 (53%)

Query: 37 ARCMVGLIGPDGVGKSSLLSLISGAR 62
V L G G+GKS+L++ + G
Sbjct: 595 FDYSVVLEGTGGIGKSTLINTLVGLD 620


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3969  RTXTOXIND     66     2e-14    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 66.0 bits (161), Expect = 2e-14
Identities = 32/196 (16%), Positives = 75/196 (38%), Gaps = 5/196 (2%)

Query: 6 RHLAWWVVGLLAVAAIVAWWLLRPAGVP-EGFAVSNGRIEATEVDIASKIAGRIDTILVK 64
R +A++++G L +A I++ G +GR + I + I+VK
Sbjct: 58 RLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKE----IKPIENSIVKEIIVK 113

Query: 65 EGQFVREGEVLAKMDTRVLQEQRLEAIAQIKEAQSAVAAAQALLEQRQSETRAAQSLVNQ 124
EG+ VR+G+VL K+ + L+ + + +A+ Q L + L ++
Sbjct: 114 EGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDE 173

Query: 125 RQAELDSVAKRHTRSRSLAHRGAISAQQLDDDRAAAESARAALESAKAQVSASKAAIEAA 184
+ S + + + + + Q + RA + A+++ +
Sbjct: 174 PYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVE 233

Query: 185 RTNIIQAQTRVEARLI 200
++ + + + + I
Sbjct: 234 KSRLDDFSSLLHKQAI 249


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_3974  ALARACEMASE   29     0.023    Alanine racemase signature.
		>ALARACEMASE#Alanine racemase signature.

Length = 356

Score = 29.4 bits (66), Expect = 0.023
Identities = 26/109 (23%), Positives = 42/109 (38%), Gaps = 24/109 (22%)

Query: 215 VITAENGIVFRENLLFTHRGLSGPAVLQISSYWQPGEFVSINLLPDVDLETFL--NEQRN 272
++ E I RE RG GP +L + ++ + + + L T + N Q
Sbjct: 58 LLNLEEAITLRE------RGWKGP-ILMLEGFFHAQD---LEIYDQHRLTTCVHSNWQLK 107

Query: 273 AHPNQSLKNTLAVHL------------PKRLVERLQQLGQIPDVSLKQL 309
A N LK L ++L P R++ QQL + +V L
Sbjct: 108 ALQNARLKAPLDIYLKVNSGMNRLGFQPDRVLTVWQQLRAMANVGEMTL 156


103  EcE24377A_4042  EcE24377A_4047  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
EcE24377A_4042  0       11       1.399520   major facilitator family transporter
EcE24377A_4043  0       13       1.634316   hypothetical protein
EcE24377A_4044  1       11       1.805519   3-methyladenine DNA glycosylase
EcE24377A_4046  0       11       1.716168   hypothetical protein
EcE24377A_4045  0       11       1.307302   biotin sulfoxide reductase
EcE24377A_4047  -1      15       -0.211796  outer membrane lipoprotein
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_4042  TCRTETA       43     2e-06    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 42.5 bits (100), Expect = 2e-06
Identities = 47/275 (17%), Positives = 94/275 (34%), Gaps = 32/275 (11%)

Query: 44 PVSQVAFSFGLLSLGLAIS----SSVAGKLQERFGVKRVTMASGILLGLGFFLTAHSNNL 99
+ V +G+L A+ + V G L +RFG + V + S + + + A + L
Sbjct: 37 HSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFL 96

Query: 100 MMLWLS---AGVLVGLADGAGYLL----TLSNCVKWFPERKGLISAFAIGSYGLGSLGFK 152
+L++ AG+ AG + + F + LG
Sbjct: 97 WVLYIGRIVAGITGATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGG---- 152

Query: 153 FIDTQLLETVGLEKTFVIWGAIALVMIVFGATLMKDAPKQEVKTSNGVVEKDYTLAESMR 212
L+ F A+ + + G L+ ++ K E + R
Sbjct: 153 -----LMGGFSPHAPFFAAAALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWAR 207

Query: 213 --KPQYWMLAVMFLTACMSG----LYVIGVAKDIAQSLAHLDVVSAANAVTVISIAN-LS 265
++AV F+ + L+VI + H D + ++ I + L+
Sbjct: 208 GMTVVAALMAVFFIMQLVGQVPAALWVI-----FGEDRFHWDATTIGISLAAFGILHSLA 262

Query: 266 GRLVLGILSDKIARIRVITIGQVISLVGMAALLFA 300
++ G ++ ++ R + +G + G L FA
Sbjct: 263 QAMITGPVAARLGERRALMLGMIADGTGYILLAFA 297



Score = 36.0 bits (83), Expect = 2e-04
Identities = 37/155 (23%), Positives = 64/155 (41%), Gaps = 9/155 (5%)

Query: 241 AQSLAHLDVVSAANAVTVISIANLSGRLVLGILSDKIARIRVITIGQVISLVGMAALLFA 300
AH ++ A A+ + A + G L SD+ R V+ + + V A + A
Sbjct: 39 NDVTAHYGILLALYALMQFACAPVLGAL-----SDRFGRRPVLLVSLAGAAVDYAIMATA 93

Query: 301 PLNAVTFFAAIACVAFNFGGTITVFPSLVSEFFGLNNLAKNYGVIYLGFGIGSIFGSIIA 360
P V + I VA G T V + +++ + A+++G + FG G + G ++
Sbjct: 94 PFLWVLYIGRI--VAGITGATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLG 151

Query: 361 SLFGGF--YVTFYVIFALLILSLALSTTIRQPEQK 393
L GGF + F+ AL L+ + K
Sbjct: 152 GLMGGFSPHAPFFAAAALNGLNFLTGCFLLPESHK 186


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_4043  ECOLNEIPORIN  28     0.039    E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 27.8 bits (62), Expect = 0.039
Identities = 19/90 (21%), Positives = 37/90 (41%), Gaps = 13/90 (14%)

Query: 119 SMYNEFGDSTTTLTDPLWHASVSSLGWRVDSRLGDLRPWAQISYNQQFGENIWKAQSGLS 178
S+ + D+ + H S + + + R G++ P ++SY F +
Sbjct: 228 SVAVQQQDAKLV-EENYSHNSQTEVAATLAYRFGNVTP--RVSYAHGFKGSF-------- 276

Query: 179 RMTATNQNGNWLDVTVGADMLLNQNIAAYA 208
ATN N ++ V VGA+ ++ +A
Sbjct: 277 --DATNYNNDYDQVVVGAEYDFSKRTSALV 304


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_4046  SACTRNSFRASE  35     5e-05    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 34.9 bits (80), Expect = 5e-05
Identities = 16/52 (30%), Positives = 22/52 (42%), Gaps = 5/52 (9%)

Query: 76 VAPKAVRRGIGKALMQYV-----QQRYPHLMLEVYQKNQPAIDFYRAQGFHI 122
VA ++G+G AL+ + + LMLE N A FY F I
Sbjct: 97 VAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFII 148


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits        Score  E-value  Comments
EcE24377A_4047  OMPADOMAIN  113    2e-32    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 113 bits (285), Expect = 2e-32
Identities = 41/122 (33%), Positives = 62/122 (50%), Gaps = 11/122 (9%)

Query: 108 LNMPNNVTFDSSSATLKPAGANTLTGVAMVLKEY--PKTAVNVIGYTDSTGGHDLNMRLS 165
+ ++V F+ + ATLKP G L + L +V V+GYTD G N LS
Sbjct: 215 FTLKSDVLFNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAYNQGLS 274

Query: 166 QQRADSVASALITQGVDASRIRTQGLGPANPIASNSTAEGK---------AQNRRVEITL 216
++RA SV LI++G+ A +I +G+G +NP+ N+ K A +RRVEI +
Sbjct: 275 ERRAQSVVDYLISKGIPADKISARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEV 334

Query: 217 SP 218

Sbjct: 335 KG 336


104  EcE24377A_4170  EcE24377A_4181  N        Y        N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias   Product
EcE24377A_4170  -1      12       0.788768  ribonucleoside transporter
EcE24377A_4171  -2      12       1.675775  hypothetical protein
EcE24377A_4172  -1      12       1.609440  sulfate permease inorganic anion transporter
EcE24377A_4173  -1      14       3.382426  cryptic adenine deaminase
EcE24377A_4174  -1      16       3.158844  sugar phosphate antiporter
EcE24377A_4175  1       17       4.104439  regulatory protein UhpC
EcE24377A_4176  2       18       4.651946  sensory histidine kinase UhpB
EcE24377A_4177  2       19       4.182700  hypothetical protein
EcE24377A_4178  2       19       3.961617  DNA-binding transcriptional activator UhpA
EcE24377A_4179  1       16       3.237241  acetolactate synthase 1 regulatory subunit
EcE24377A_4180  2       15       2.986012  acetolactate synthase catalytic subunit
EcE24377A_4181  -1      16       1.342036  multidrug resistance protein D
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_4170  TCRTETA  39     2e-05    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 39.4 bits (92), Expect = 2e-05
Identities = 35/208 (16%), Positives = 71/208 (34%), Gaps = 13/208 (6%)

Query: 88 IIVEFLPVSLLTP----MAQDLGISEGVAGQSVTVTAFVAMFASLFITQTIQATDR--RY 141
+ ++ + + L+ P + +DL S V + A A+ +DR R
Sbjct: 14 VALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRR 73

Query: 142 VVILFAVLL-TLSCLLVSFANSFSLLLIGRACLGLALGGFWAMSASLTMRLVPPRTVPKA 200
V+L ++ + +++ A +L IGR G+ G A++ + + +
Sbjct: 74 PVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGIT-GATGAVAGAYIADITDGDERARH 132

Query: 201 LSVIFGAVSIALVIAAPLGSFLGELIGWRNVFNAAAVMG----VLCIFWIIKSLPSLPGE 256
+ +V LG +G F AAA + + F + +S
Sbjct: 133 FGFMSACFGFGMVAGPVLGGLMGG-FSPHAPFFAAAALNGLNFLTGCFLLPESHKGERRP 191

Query: 257 PSHQKQNTFRLLQRPGVMAGMIAIFMSF 284
+ N + M + A+ F
Sbjct: 192 LRREALNPLASFRWARGMTVVAALMAVF 219


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits    Score  E-value  Comments
EcE24377A_4173  UREASE  38     1e-04    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 37.8 bits (88), Expect = 1e-04
Identities = 30/105 (28%), Positives = 43/105 (40%), Gaps = 17/105 (16%)

Query: 22 AVSRGDAVADYIIDNVSILDLINGGEISGPIVIKGRYIAGVG-AEYAD---------APA 71
V+R D +I N ILD + G + I +K IA +G A D P
Sbjct: 60 QVTREGGAVDTVITNALILD--HWGIVKADIGLKDGRIAAIGKAGNPDMQPGVTIIVGPG 117

Query: 72 LQRIDARGATAVPGFIDAHLHIESSMMTPVTFETATLPRGLTTVI 116
+ I G G +D+H+H + P E A L GLT ++
Sbjct: 118 TEVIAGEGKIVTAGGMDSHIH----FICPQQIEEA-LMSGLTCML 157


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_4174  TCRTETB  34     0.001    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 34.1 bits (78), Expect = 0.001
Identities = 28/168 (16%), Positives = 61/168 (36%), Gaps = 17/168 (10%)

Query: 49 FNIAQNDMISTYGLSMTQLGMIGLGFSITYGVGKTLVSYYADGKNTKQFLPFMLILSAIC 108
N++ D+ + + + F +T+ +G + +D K+ L F +I++ C
Sbjct: 33 LNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIIN--C 90

Query: 109 MLGFSASMGSGSVSLFLMIAFYALSGFFQSTGGSCSYSTI----TKWTPRRKRGTFLGFW 164
+G SL +M + F Q G + + + ++ P+ RG G
Sbjct: 91 FGSVIGFVGHSFFSLLIM------ARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLI 144

Query: 165 NISHNLGGAGAAGVALFGANYLFDGHVIGMFIFPSIIALIVGFIGLRY 212
+G + A+Y+ + + + P + I+ L
Sbjct: 145 GSIVAMGEGVGPAIGGMIAHYIHWSY---LLLIP--MITIITVPFLMK 187


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_4175  TCRTETB  41     5e-06    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 41.4 bits (97), Expect = 5e-06
Identities = 65/408 (15%), Positives = 138/408 (33%), Gaps = 60/408 (14%)

Query: 29 RHILLTIWLGYALFY--FTRKSFNAAVPEILANGVLSRSDIGLLATLFYITYGVSKFVSG 86
RH + IWL F+ N ++P+I + + + T F +T+ + V G
Sbjct: 11 RHNQILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYG 70

Query: 87 IVSDRSNARYFMGIGLIATGIINILFGFSTSLWAFAVLWVLNAFFQGWGS---PVCARLL 143
+SD+ + + G+I +++ S F L ++ F QG G+ P ++
Sbjct: 71 KLSDQLGIKRLLLFGIIINCFGSVIGFVGHS---FFSLLIMARFIQGAGAAAFPALVMVV 127

Query: 144 TAWY-SRTERGGWWALWNTAHNVGDALIPIVMAAAALHYGWRAGMMIAGCMAIVVGIFLC 202
A Y + RG + L + +G+ + P + A + W ++ M ++ +
Sbjct: 128 VARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHW--SYLLLIPMITIITVPF- 184

Query: 203 WRLRDRPQALGLPAVGEWRHDALEIAQQQEGAGLTRKEILTKYVLLNPYIWLLSFCYVLV 262
+ L +I G L I+ + Y VL
Sbjct: 185 --------LMKLLKKEVRIKGHFDIK----GIILMSVGIVFFMLFTTSYSISFLIVSVLS 232

Query: 263 YVV-----RAAINDWGNLYMSETLGVDLVTANTAVTMFELGGFI-----------GALVA 306
+++ R + + + + + + + + + GF+ A
Sbjct: 233 FLIFVKHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTA 292

Query: 307 GWGSDKLFNGNRGPMNLIFAAGILL-SVGSLWLMPFASYVMQATCFFTIGFFVFGPQMLI 365
GS +F G + + GIL+ G L+++ + + F T F + +
Sbjct: 293 EIGSVIIFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFL-SVSFLTASFLLETTSWFM 351

Query: 366 ---------GMAAAECS---------HKEAAGAATGFVGLFAYLGASL 395
G++ + ++ AGA + ++L
Sbjct: 352 TIIIVFVLGGLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGT 399


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_4176  PF06580  40     2e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 39.8 bits (93), Expect = 2e-05
Identities = 28/142 (19%), Positives = 56/142 (39%), Gaps = 11/142 (7%)

Query: 362 LRPRQLDDLTLEQAIRSLMREMELEGRGIVSHLEWRIDESALSENQRVTLFRVCQEGLNN 421
LR ++L + + ++L L++ + + +V + Q + N
Sbjct: 208 LRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQINPAIMDVQVPPM-LVQTLVEN 266

Query: 422 IVKHA-----DASAVTLQGWQQDERLMLVIEDDGSGLPPGSGQ-QGFGLTGMRERVTALG 475
+KH + L+G + + + L +E+ GS + + G GL +RER+ L
Sbjct: 267 GIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKESTGTGLQNVRERLQMLY 326

Query: 476 G---TLHISCLHG-TRVSVSLP 493
G + +S G V +P
Sbjct: 327 GTEAQIKLSEKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits    Score  E-value  Comments
EcE24377A_4178  HTHFIS  61     2e-13    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 61.4 bits (149), Expect = 2e-13
Identities = 29/174 (16%), Positives = 59/174 (33%), Gaps = 20/174 (11%)

Query: 2 ITVALIDDHLIVRSGFAQLLGLEPDLQVVAEFGSGREALAGLPGRGVQVCICDISMPDIS 61
T+ + DD +R+ Q L V + + + + D+ MPD +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRA-GYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 62 GLELLSQLPK---GMATIMLSVHDSPALVEQALNAGARGFLSKRCSPDELIAAVHTVATG 118
+LL ++ K + +++S ++ +A GA +L K ELI +
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR---- 117

Query: 119 GCYLTPDIAIKLASGRQDPLTKRERQVAEKLAQG---MAVKEIAAELGLSPKTV 169
A+ R L + + + + + A L + T+
Sbjct: 118 --------ALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTL 163


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_4181  TCRTETB  60     6e-12    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 59.9 bits (145), Expect = 6e-12
Identities = 41/184 (22%), Positives = 81/184 (44%), Gaps = 1/184 (0%)

Query: 5 RNVNLLLMLVLLVAVGQMAQTIYIPAIADMARDLNVREGAVQSVMGAYLLTYGVSQLFYG 64
R+ +L+ L +L + + + ++ D+A D N + V A++LT+ + YG
Sbjct: 11 RHNQILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYG 70

Query: 65 PISDRVGRRPVILVGMSIFMLATLVA-VTTSSLTVLIAASAMQGMGTGVGGVMARTLPRD 123
+SD++G + ++L G+ I +++ V S ++LI A +QG G + +
Sbjct: 71 KLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVAR 130

Query: 124 LYERTQLRHANSLLNMGILVSPLLAPLIGGLLDTMWNWRACYLFLLVLCAGVTFSMARWM 183
+ A L+ + + + P IGG++ +W L ++ V F M
Sbjct: 131 YIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPMITIITVPFLMKLLK 190

Query: 184 PETR 187
E R
Sbjct: 191 KEVR 194


105  EcE24377A_4390  EcE24377A_4397  N        Y        N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
EcE24377A_4390  2       23       1.897961   nitrogen regulation protein NR(I)
EcE24377A_4391  2       22       -0.434259  nitrogen regulation protein NR(II)
EcE24377A_4392  2       21       -2.239520  glutamine synthetase
EcE24377A_4393  2       15       -4.355472  hypothetical protein
EcE24377A_4394  1       14       -4.798468  GTP-binding protein
EcE24377A_4395  0       14       -5.391488  GntR family transcriptional regulator
EcE24377A_4396  -1      11       -3.752655  AP endonuclease
EcE24377A_4397  0       11       -3.202492  major facilitator transporter
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits    Score  E-value  Comments
EcE24377A_4390  HTHFIS  601    0.0      FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 601 bits (1550), Expect = 0.0
Identities = 206/478 (43%), Positives = 300/478 (62%), Gaps = 11/478 (2%)

Query: 1 MQRGIVWVVDDDSSIRWVLERALAGAGLTCTTFENGAEVLEALASKTPDVLLSDIRMPGM 60
M + V DDD++IR VL +AL+ AG N A + +A+ D++++D+ MP
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 DGLALLKQIKQRHPMLPVIIMTAHSDLDAAVSAYQQGAFDYLPKPFDIDEAVALVERAIS 120
+ LL +IK+ P LPV++M+A + A+ A ++GA+DYLPKPFD+ E + ++ RA++
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 121 HYQEQQQPRNIQLNGPTTDIIGEAPAMQDVFRIIGRLSRSSISVLINGESGTGKELVAHA 180
+ + ++G + AMQ+++R++ RL ++ ++++I GESGTGKELVA A
Sbjct: 121 EPKRRPSKLEDDSQDGM-PLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARA 179

Query: 181 LHRHSPRAKAPFIALNMAAIPKDLIESELFGHEKGAFTGANTIRQGRFEQADGGTLFLDE 240
LH + R PF+A+NMAAIP+DLIESELFGHEKGAFTGA T GRFEQA+GGTLFLDE
Sbjct: 180 LHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDE 239

Query: 241 IGDMPLDVQTRLLRVLADGQFYRVGGYAPVKVDVRIIAATHQNLEQRVQEGKFREDLFHR 300
IGDMP+D QTRLLRVL G++ VGG P++ DVRI+AAT+++L+Q + +G FREDL++R
Sbjct: 240 IGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYR 299

Query: 301 LNVIRVHLPPLRERREDIPRLARHFLQVAARELGVEAKLLHPETEAALTRLAWPGNVRQL 360
LNV+ + LPPLR+R EDIP L RHF+Q A +E G++ K E + WPGNVR+L
Sbjct: 300 LNVVPLRLPPLRDRAEDIPDLVRHFVQQAEKE-GLDVKRFDQEALELMKAHPWPGNVREL 358

Query: 361 ENTCRWLTVMAAGQEVLIQDLPGELFESTVAESTSQMQPDSWATLLAQWADRALRS---- 416
EN R LT + + + + EL + S + ++Q + +R
Sbjct: 359 ENLVRRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFAS 418

Query: 417 -----GHQNLLSEAQPELERTLLTTALRHTQGHKQEAARLLGWGRNTLTRKLKELGME 469
L E+E L+ AL T+G++ +AA LLG RNTL +K++ELG+
Sbjct: 419 FGDALPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGVS 476


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_4391  PF06580  28     0.042    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 28.3 bits (63), Expect = 0.042
Identities = 34/190 (17%), Positives = 72/190 (37%), Gaps = 41/190 (21%)

Query: 171 IIEQADRLRNLVDRL---LGPQLPGTRVTE-SIHKVAERV---VTLVSMELPDNVRLIRD 223
I+E + R ++ L + L + + S+ V + L S++ D ++
Sbjct: 186 ILEDPTKAREMLTSLSELMRYSLRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQ 245

Query: 224 YDPSLPELAHDPDQIEQVLLN-IVRNALQ---ALGPEGGEIILRTRTAFQLTLHGERYRL 279
+P++ ++ Q+ +L+ +V N ++ A P+GG+I+L+
Sbjct: 246 INPAIMDV-----QVPPMLVQTLVENGIKHGIAQLPQGGKILLKGT------KDNGTVT- 293

Query: 280 AARIDVEDNGPGIPPHLQDTLFYPMVSGREGGTGLGLSIARNLIDQHSGK---IEFTSWP 336
++VE+ G + ++ TG GL R + G I+ +
Sbjct: 294 ---LEVENTGSLALKNTKE------------STGTGLQNVRERLQMLYGTEAQIKLSEKQ 338

Query: 337 GHTEFSVYLP 346
G V +P
Sbjct: 339 GKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits       Score  E-value  Comments
EcE24377A_4394  TCRTETOQM  180    4e-51    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 180 bits (458), Expect = 4e-51
Identities = 97/445 (21%), Positives = 170/445 (38%), Gaps = 81/445 (18%)

Query: 4 KLRNIAIIAHVDHGKTTLVDKLLQQSGTFDSRAETQE--RVMDSNDLEKERGITILAKNT 61
K+ NI ++AHVD GKTTL + LL SG + D+ LE++RGITI T
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 62 AIKWNDYRINIVDTPGHADFGGEVERVMSMVDSVLLVVDAFDGPMPQTRFVTKKAFAYGL 121
+ +W + ++NI+DTPGH DF EV R +S++D +L++ A DG QTR + G+
Sbjct: 62 SFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGI 121

Query: 122 KPIVVINKVDRPGARPDWVVDQVFD-------------LFVNLDATDEQLD--------- 159
I INK+D+ G V + + L+ N+ T+
Sbjct: 122 PTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEG 181

Query: 160 --------------------------------FPIVYASALNGIAGLDHEDMAEDMTPLY 187
FP+ + SA N I G+D+ L
Sbjct: 182 NDDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNI-GIDN---------LI 231

Query: 188 QAIVDHVPAPDVDLDGPFQMQISQLDYNSYVGVIGIGRIKRGKVKPNQQVTIIDSEGKTR 247
+ I + + ++ +++Y+ + R+ G + V I + E
Sbjct: 232 EVITNKFYSSTHRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEKEKI-- 289

Query: 248 NAKVGKVLGHLGLERIETDLAEAGDIVAITGLGELNISDTVCDTQNVEALPALSVDEPTV 307
K+ ++ + E + D A +G+IV + L ++ + DT+ + + P +
Sbjct: 290 --KITEMYTSINGELCKIDKAYSGEIVILQNEF-LKLNSVLGDTKLLPQRERIENPLPLL 346

Query: 308 SMFFCVNTSPFCGKEGKFVTSRQILDRLNKELVHNVALRVEETEDADAFRVSGRGELHLS 367
+ + D L LR +S G++ +
Sbjct: 347 QTTVEPSKPQQREMLLDALLEISDSDPL---------LRYYVDSATHEIILSFLGKVQME 397

Query: 368 VLIENMRRE-GFELAVSRPKVIFRE 391
V ++ + E+ + P VI+ E
Sbjct: 398 VTCALLQEKYHVEIEIKEPTVIYME 422



Score = 32.5 bits (74), Expect = 0.005
Identities = 13/75 (17%), Positives = 29/75 (38%), Gaps = 1/75 (1%)

Query: 398 EPYENVTLDVEEQHQGSVMQALGERKGDLKNMNPDGKGRVRLDYVIPSRGLIGFRSEFMT 457
EPY + + +++ + ++ + V L IP+R + +RS+
Sbjct: 537 EPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQLKN-NEVILSGEIPARCIQEYRSDLTF 595

Query: 458 MTSGTGLLYSTFSHY 472
T+G + + Y
Sbjct: 596 FTNGRSVCLTELKGY 610


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_4397  TCRTETB  29     0.025    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 29.5 bits (66), Expect = 0.025
Identities = 31/161 (19%), Positives = 64/161 (39%), Gaps = 15/161 (9%)

Query: 227 NVFFVYAVYCGLTFFIPFLKNIYLLP----------VALVGAYGIINQYCLKMIGGPIGG 276
N+ F+ V CG F + ++P A +G+ I +I G IGG
Sbjct: 255 NIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYIGG 314

Query: 277 MISDKILKSPSKYLCYTFIISTAALVLLIMLPHESMPVYLGMACTLGFGAIVFTQRAVFF 336
++ D+ + P L + + + L E+ ++ + G + FT+
Sbjct: 315 ILVDR--RGPLYVLNIGVTFLSVSFLTASFLL-ETTSWFMTIIIVFVLGGLSFTK--TVI 369

Query: 337 APIGEAKIAENKTGAAMALGSFIGYAPAMFCFSLYGYILDL 377
+ I + + + + GA M+L +F + ++ G +L +
Sbjct: 370 STIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLSI 410


106  EcE24377A_4582  EcE24377A_4587  N        Y        N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias   Product
EcE24377A_4582  -1      17       0.562452  D-xylose transporter XylE
EcE24377A_4583  -1      20       1.151892  maltose transporter permease
EcE24377A_4584  0       19       0.913593  maltose transporter membrane protein
EcE24377A_4585  1       21       0.536169  hypothetical protein
EcE24377A_4586  1       19       0.632429  maltose ABC transporter periplasmic protein
EcE24377A_4587  -1      14       0.764887  maltose/maltodextrin transporter ATP-binding
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_4582  TCRTETA  36     4e-04    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 35.6 bits (82), Expect = 4e-04
Identities = 20/87 (22%), Positives = 42/87 (48%), Gaps = 3/87 (3%)

Query: 279 VIGVMLSIFQQFVGINVVLYYAPEVFKTLGASTDIALLQTIIVGVINLTFTVLAIMT--- 335
+I ++ ++ VGI +++ P + + L S D+ I++ + L A +
Sbjct: 7 LIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGAL 66

Query: 336 VDKFGRKPLQIIGALGMAIGMFSLGTA 362
D+FGR+P+ ++ G A+ + TA
Sbjct: 67 SDRFGRRPVLLVSLAGAAVDYAIMATA 93


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits        Score  E-value  Comments
EcE24377A_4584  FLGHOOKAP1  31     0.011    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 31.1 bits (70), Expect = 0.011
Identities = 22/124 (17%), Positives = 43/124 (34%), Gaps = 21/124 (16%)

Query: 128 GDEWQLALSDGETGKNYLSDAFKFGGEQKLQLKETTAQPEGERANLRVITQNRQALSDIT 187
++WQ+ T DA L+L T + L+ + A+ ++
Sbjct: 367 NNQWQVTRLASNTTFTVTPDANGKVAFDGLELTFTGTPAVNDSFTLKPV---SDAIVNMD 423

Query: 188 AILPDGNKVMMSSLRQFSGTQPLYTLDGDGTLTNNQSGVKYRPNNQ--------IGFYQS 239
++ D K+ M+S GD N Q+ + + N++ Y S
Sbjct: 424 VLITDEAKIAMAS----------EEDAGDSDNRNGQALLDLQSNSKTVGGAKSFNDAYAS 473

Query: 240 ITAD 243
+ +D
Sbjct: 474 LVSD 477


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits       Score  E-value  Comments
EcE24377A_4586  MALTOSEBP  756    0.0      Maltose binding protein signature.
		>MALTOSEBP#Maltose binding protein signature.

Length = 396

Score = 756 bits (1953), Expect = 0.0
Identities = 396/396 (100%), Positives = 396/396 (100%)

Query: 1 MKIKTGARILALSALTTMMFSASALAKIEEGKLVIWINGDKGYNGLAEVGKKFEKDTGIK 60
MKIKTGARILALSALTTMMFSASALAKIEEGKLVIWINGDKGYNGLAEVGKKFEKDTGIK
Sbjct: 1 MKIKTGARILALSALTTMMFSASALAKIEEGKLVIWINGDKGYNGLAEVGKKFEKDTGIK 60

Query: 61 VTVEHPDKLEEKFPQVAATGDGPDIIFWAHDRFGGYAQSGLLAEITPDKAFQDKLYPFTW 120
VTVEHPDKLEEKFPQVAATGDGPDIIFWAHDRFGGYAQSGLLAEITPDKAFQDKLYPFTW
Sbjct: 61 VTVEHPDKLEEKFPQVAATGDGPDIIFWAHDRFGGYAQSGLLAEITPDKAFQDKLYPFTW 120

Query: 121 DAVRYNGKLIAYPIAVEALSLIYNKDLLPNPPKTWEEIPALDKELKAKGKSALMFNLQEP 180
DAVRYNGKLIAYPIAVEALSLIYNKDLLPNPPKTWEEIPALDKELKAKGKSALMFNLQEP
Sbjct: 121 DAVRYNGKLIAYPIAVEALSLIYNKDLLPNPPKTWEEIPALDKELKAKGKSALMFNLQEP 180

Query: 181 YFTWPLIAADGGYAFKYENGKYDIKDVGVDNAGAKAGLTFLVDLIKNKHMNADTDYSIAE 240
YFTWPLIAADGGYAFKYENGKYDIKDVGVDNAGAKAGLTFLVDLIKNKHMNADTDYSIAE
Sbjct: 181 YFTWPLIAADGGYAFKYENGKYDIKDVGVDNAGAKAGLTFLVDLIKNKHMNADTDYSIAE 240

Query: 241 AAFNKGETAMTINGPWAWSNIDTSKVNYGVTVLPTFKGQPSKPFVGVLSAGINAASPNKE 300
AAFNKGETAMTINGPWAWSNIDTSKVNYGVTVLPTFKGQPSKPFVGVLSAGINAASPNKE
Sbjct: 241 AAFNKGETAMTINGPWAWSNIDTSKVNYGVTVLPTFKGQPSKPFVGVLSAGINAASPNKE 300

Query: 301 LAKEFLENYLLTDEGLEAVNKDKPLGAVALKSYEEELAKDPRIAATMENAQKGEIMPNIP 360
LAKEFLENYLLTDEGLEAVNKDKPLGAVALKSYEEELAKDPRIAATMENAQKGEIMPNIP
Sbjct: 301 LAKEFLENYLLTDEGLEAVNKDKPLGAVALKSYEEELAKDPRIAATMENAQKGEIMPNIP 360

Query: 361 QMSAFWYAVRTAVINAASGRQTVDEALKDAQTRITK 396
QMSAFWYAVRTAVINAASGRQTVDEALKDAQTRITK
Sbjct: 361 QMSAFWYAVRTAVINAASGRQTVDEALKDAQTRITK 396


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_4587  PF05272  35     6e-04    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 34.7 bits (79), Expect = 6e-04
Identities = 13/35 (37%), Positives = 18/35 (51%)

Query: 32 VVFVGPSGCGKSTLLRMIAGLETITSGDLFIGEKR 66
VV G G GKSTL+ + GL+ + IG +
Sbjct: 599 VVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGTGK 633


107  EcE24377A_4678  EcE24377A_4685  N        Y        N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
EcE24377A_4678  1       16       -4.684518  DNA-binding transcriptional activator DcuR
EcE24377A_4679  1       14       -4.984499  sensory histidine kinase DcuS
EcE24377A_4680  -1      16       -4.621894  hypothetical protein
EcE24377A_4681  -1      17       -4.484629  hypothetical protein
EcE24377A_4682  0       21       -3.830675  hypothetical protein
EcE24377A_4683  1       19       -4.440486  hypothetical protein
EcE24377A_4684  0       18       -3.799937  lysyl-tRNA synthetase
EcE24377A_4685  0       14       -2.534951  amino acid/peptide transporter
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits    Score  E-value  Comments
EcE24377A_4678  HTHFIS  70     4e-16    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 70.2 bits (172), Expect = 4e-16
Identities = 31/109 (28%), Positives = 50/109 (45%), Gaps = 4/109 (3%)

Query: 4 VLIIDDDAMVAELNRRYVAQIPGFQCCGTASTLEKAKEIIFNSDTPIDLILLDIYMQKEN 63
+L+ DDDA + + + +++ G+ S I + DL++ D+ M EN
Sbjct: 6 ILVADDDAAIRTVLNQALSRA-GYDVR-ITSNAATLWRWI--AAGDGDLVVTDVVMPDEN 61

Query: 64 GLDLLPVLHNARCKSDVIVISSAADAATIKDSLHYGVVDYLIKPFQASR 112
DLLP + AR V+V+S+ T + G DYL KPF +
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTE 110


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_4679  PF06580  41     7e-06    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 41.0 bits (96), Expect = 7e-06
Identities = 21/99 (21%), Positives = 38/99 (38%), Gaps = 18/99 (18%)

Query: 442 LIENALE-ALGP-EPGGEISVTLHYRHGWLHCEVNDDGPGIAPDKIDHIFDKGVSTKGSE 499
L+EN ++ + GG+I + +G + EV + G +
Sbjct: 263 LVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKN------------TKES 310

Query: 500 RGVGLALVKQQVENLGG---SIAVESEPGIFTQFFVQIP 535
G GL V+++++ L G I + + G V IP
Sbjct: 311 TGTGLQNVRERLQMLYGTEAQIKLSEKQGKVN-AMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_4681  SACTRNSFRASE  26     0.012    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 26.4 bits (58), Expect = 0.012
Identities = 9/28 (32%), Positives = 16/28 (57%)

Query: 32 LAIIEHTDVDESLKGQGIGKQLVAKVVE 59
A+IE V + + +G+G L+ K +E
Sbjct: 89 YALIEDIAVAKDYRKKGVGTALLHKAIE 116


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_4685  TCRTETA  30     0.022    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 30.2 bits (68), Expect = 0.022
Identities = 36/190 (18%), Positives = 66/190 (34%), Gaps = 14/190 (7%)

Query: 44 NHAISLFSAYA-SLVYVTPILGGWLADRLLGNRTAVIAGALLMTLGHVVLGIDTNSTFSL 102
H L + YA P+LG +DR G R ++ + + ++ + L
Sbjct: 43 AHYGILLALYALMQFACAPVLGAL-SDRF-GRRPVLLVSLAGAAVDYAIMAT-APFLWVL 99

Query: 103 YLALAIIICGYGLFKSNISCLLGELYDEND-HRRDGGFSLLYAAGNIGSIAAPIACGLAA 161
Y+ + G+ + + + D D R F + A G +A P+ GL
Sbjct: 100 YIGRIV----AGITGATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMG 155

Query: 162 QWYGWHVGFALAGGGMFIGLLIFLSGHRHFQSTRSMDKKALTSVKF-ALPVWSWLVVMLC 220
+ H F A + L FL+G + +++ L L + W M
Sbjct: 156 G-FSPHAPFFAAA---ALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTV 211

Query: 221 LAPVFFTLLL 230
+A + +
Sbjct: 212 VAALMAVFFI 221


108  EcE24377A_4914  EcE24377A_4919  N        Y        N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias   Product
EcE24377A_4914  3       13       0.065184  outer membrane usher protein FimD, truncation
EcE24377A_4915  1       13       0.682056  protein FimF
EcE24377A_4916  0       19       2.347354  protein fimG
EcE24377A_4917  -1      22       2.137074  protein FimH
EcE24377A_4918  0       25       2.778477  hypothetical protein
EcE24377A_4919  0       25       2.616691  fructuronate transporter
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_4914  PF00577  352    e-117    Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 352 bits (904), Expect = e-117
Identities = 283/285 (99%), Positives = 283/285 (99%)

Query: 2 LALNVNIPFSHWLRSDSKSQWRHASASYSMSHDLNGRMTNLVGVYGTLLEDNNLSYSVQT 61
LALNVNIPFSHWLRSDSKSQWRHASASYSMSHDLNGRMTNL GVYGTLLEDNNLSYSVQT
Sbjct: 594 LALNVNIPFSHWLRSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQT 653

Query: 62 GYAGGGDGNSGSTGYATLNYRGGYGNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQP 121
GYAGGGDGNSGSTGYATLNYRGGYGNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQP
Sbjct: 654 GYAGGGDGNSGSTGYATLNYRGGYGNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQP 713

Query: 122 LNDTVVLVKAPGAKDVKVENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDN 181
LNDTVVLVKAPGAKD KVENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDN
Sbjct: 714 LNDTVVLVKAPGAKDAKVENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDN 773

Query: 182 AVANVVPTRGAIVRAEFKARVGIKLLMTLTHNNKPLPFGAMVTSESSQSSGIVADNGQVY 241
AVANVVPTRGAIVRAEFKARVGIKLLMTLTHNNKPLPFGAMVTSESSQSSGIVADNGQVY
Sbjct: 774 AVANVVPTRGAIVRAEFKARVGIKLLMTLTHNNKPLPFGAMVTSESSQSSGIVADNGQVY 833

Query: 242 LSGMPLAGKVQVKWGEEENAHCVANYQLPPESQQQLLTQLSAECR 286
LSGMPLAGKVQVKWGEEENAHCVANYQLPPESQQQLLTQLSAECR
Sbjct: 834 LSGMPLAGKVQVKWGEEENAHCVANYQLPPESQQQLLTQLSAECR 878


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_4916  VACCYTOTOXIN  33     3e-04    Helicobacter pylori vacuolating cytotoxin signature.
		>VACCYTOTOXIN#Helicobacter pylori vacuolating cytotoxin signature.

Length = 1291

Score = 33.5 bits (76), Expect = 3e-04
Identities = 30/158 (18%), Positives = 49/158 (31%), Gaps = 9/158 (5%)

Query: 3 WCKRGYVLAAILALASATIQAADVTITVNGKVVAKPCTVSTTNATVDLGDLYSFSLMSAG 62
W R + A LA + +TI + VT VN + + + + G
Sbjct: 258 WMGRLQYVGAYLAPSYSTINTSKVTGEVNFNHLTVGDHNAAQAGIIASNKTH------IG 311

Query: 63 AASAWHDVALELTNCPVG--TSRVTASFSGAADSTGYYKNQGTAQNIQLELQDDSGNTLN 120
W L + P G + S + Q ++QN + N+
Sbjct: 312 TLDLWQSAGLNIIAPPEGGYKDKPNDKPSNTTQNNAKNDKQESSQNNSNTQVINPPNSAQ 371

Query: 121 SGATKTVQVDDSSQSAHFPLQVRALTVNGGATQGTIQA 158
+ QV D + V +N A GTI+
Sbjct: 372 KTEIQPTQVIDGPFAGGKNTVVNINRINTNA-DGTIRV 408


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_4917  SURFACELAYER  28     0.048    Lactobacillus surface layer protein signature.
		>SURFACELAYER#Lactobacillus surface layer protein signature.

Length = 439

Score = 28.1 bits (62), Expect = 0.048
Identities = 19/79 (24%), Positives = 32/79 (40%), Gaps = 1/79 (1%)

Query: 211 SQNLGYYLSGTTADAGNSIFTNTASFSPAQGVGVQLTRNGTIIPANNTVSLGAVGTSAVS 270
S+N G ++ +A+ N FT PA V V L ++G ++ + + +
Sbjct: 133 SENAGKEITIGSAN-PNVTFTEKTGDQPASTVKVTLDQDGVAKLSSVQIKNVYAIDTTYN 191

Query: 271 LGLTANYARTGGQVTAGNV 289
+ TG VT G V
Sbjct: 192 SNVNFYDVTTGATVTTGAV 210


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_4919  PF06580  31     0.008    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 31.4 bits (71), Expect = 0.008
Identities = 10/49 (20%), Positives = 25/49 (51%)

Query: 230 LVPLIPAIIMISTTIANIWLVKDTPAWEVVNFIGSSPIAMFIAMVVAFV 278
+ +I ++ I +W V +T W ++ FI + P+A + + ++ +
Sbjct: 73 MGQIILRVLPACVVIGMVWFVANTSIWRLLAFINTKPVAFTLPLALSII 121


109  EcE24377A_4995  EcE24377A_5000  N        Y        N  Pathogenicity Island (unbiased-composition)
LocusTag        DNBias  CDNBias  %GCBias    Product
EcE24377A_4995  -2      15       0.913470   phosphoglycerate mutase
EcE24377A_4994  -2      12       0.310833   right origin-binding protein
EcE24377A_4996  -1      14       -0.078060  hypothetical protein
EcE24377A_4997                              DNA-binding response regulator CreB
EcE24377A_4998                              sensory histidine kinase CreC
EcE24377A_4999                              hypothetical protein
EcE24377A_5000                              two-component response regulator
ORFs having significant similarity with Known Virulence factors
LocusTag        Hits          Score  E-value  Comments
EcE24377A_4995  VACCYTOTOXIN  29     0.014    Helicobacter pylori vacuolating cytotoxin signature.
		>VACCYTOTOXIN#Helicobacter pylori vacuolating cytotoxin signature.

Length = 1291

Score = 29.2 bits (65), Expect = 0.014
Identities = 14/45 (31%), Positives = 20/45 (44%), Gaps = 4/45 (8%)

Query: 145 PLLVSHGIALGCLVSTILGLPAWAERRLRLRNCSISRVDYQESLW 189
P +V GIA G V T+ GL W ++ N D + +W
Sbjct: 42 PAIVG-GIATGAAVGTVSGLLGWGLKQAEEAN---KTPDKPDKVW 82


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits    Score  E-value  Comments
EcE24377A_4997  HTHFIS  85     3e-21    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 84.9 bits (210), Expect = 3e-21
Identities = 34/139 (24%), Positives = 60/139 (43%)

Query: 1 MQRETVWLVEDEQGIADTLVYMLQQEGFAVEVFERGLPVLDKARQLVPDVMILDVGLPDI 60
M T+ + +D+ I L L + G+ V + + D+++ DV +PD
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 SGFELCRQLLALHPALPVLFLTARSEEVDRLLGLEIGADDYVAKPFSPREVCARVRTLLR 120
+ F+L ++ P LPVL ++A++ + + E GA DY+ KPF E+ + L
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 121 RVKKFSSPSPVIRIGHFEL 139
K+ S L
Sbjct: 121 EPKRRPSKLEDDSQDGMPL 139


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits     Score  E-value  Comments
EcE24377A_4998  PF06580  33     0.002    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 33.3 bits (76), Expect = 0.002
Identities = 42/182 (23%), Positives = 73/182 (40%), Gaps = 40/182 (21%)

Query: 312 LRQARLENRQEVVLTAVDVAALFR---RVSEARTVQLAE--KNITLHVT--------PTE 358
+R LE+ + ++ L R R S AR V LA+ + ++ +
Sbjct: 182 IRALILEDPTKAREMLTSLSELMRYSLRYSNARQVSLADELTVVDSYLQLASIQFEDRLQ 241

Query: 359 VNVAAEPALLDQALGNLL-----DNA----IDFTPESGRITLSAEVDQEHVTLKVLDTGS 409
PA++D + +L +N I P+ G+I L D VTL+V +TGS
Sbjct: 242 FENQINPAIMDVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGS 301

Query: 410 GIPDYALSRIFERFYSLPRANGQKSSGLGLAFVSE-VARLFNGEVTLR-NVQEGGVLASL 467
N ++S+G GL V E + L+ E ++ + ++G V A +
Sbjct: 302 LALK----------------NTKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMV 345

Query: 468 RL 469
+
Sbjct: 346 LI 347


ORFs having significant similarity with Known Virulence factors
LocusTag        Hits    Score  E-value  Comments
EcE24377A_5000  HTHFIS  82     4e-20    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 81.8 bits (202), Expect = 4e-20
Identities = 30/122 (24%), Positives = 60/122 (49%), Gaps = 1/122 (0%)

Query: 1 MQTPHILIVEDELVTRNTLKSIFEAEGYDVFEATDGAEMHQILSEYDINLVIMDINLPGK 60
M IL+ +D+ R L GYDV ++ A + + ++ D +LV+ D+ +P +
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 NGLLLARELRE-QANVALMFLTGRDNEVDKILGLEIGADDYITKPFNPRELTIRARNLLS 119
N L +++ + ++ ++ ++ ++ + I E GA DY+ KPF+ EL L+
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 120 RT 121

Sbjct: 121 EP 122



 
Contact Sachin Pundhir for Bugs/Comments.