PredictBias

identification of genomic and pathogenicity islands in prokaryotic genome
A) Input parameters
Genome: sakai.gbk
Threshold dinucleotide bias: 2
Threshold codon bias: 4
Threshold %GC bias: 3
E-value (RPSBlast): 0.05
Genome (non-pathogenic):
 
C) Potential GIs and PAIs in NC_002695
S.No  Start    End      Bias  Virulence  Insertion elements  Prediction
1     ECs5354  ECs5341  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
ECs5354   -1      15       3.150649    right origin-binding protein
ECs5353   -1      17       2.646343    phosphoglycerate mutase
ECs5352   -1      18       3.127229    NTPase
ECs5351   -2      19       3.316636    Trp operon repressor
ECs5350   -2      18       3.715570    lytic murein transglycosylase
ECs5349   -1      20       3.378801    ABC transporter ATP-binding protein
ECs5348   -3      20       3.085156    nicotinamide-nucleotide adenylyltransferase
ECs5347   -1      23       3.756074    DNA repair protein RadA
ECs5346   -1      24       3.470722    phosphoserine phosphatase
ECs5345   1       25       2.921551    unknown domain/lipoate-protein ligase A fusion
ECs5344   1       34       2.040885    hypothetical protein
ECs5343   0       30       3.096765    purine nucleoside phosphorylase
ECs5342   0       26       3.538326    phosphopentomutase
ECs5341   -1      19       3.490807    thymidine phosphorylase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs5353   VACCYTOTOXIN  29     0.014    Helicobacter pylori vacuolating cytotoxin signature.
		>VACCYTOTOXIN#Helicobacter pylori vacuolating cytotoxin signature.

Length = 1291

Score = 29.2 bits (65), Expect = 0.014
Identities = 14/45 (31%), Positives = 20/45 (44%), Gaps = 4/45 (8%)

Query: 145 PLLVSHGIALGCLVSTILGLPAWAERRLRLRNCSISRVDYQESLW 189
P +V GIA G V T+ GL W ++ N D + +W
Sbjct: 42 PAIVG-GIATGAAVGTVSGLLGWGLKQAEEAN---KTPDKPDKVW 82
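
The statistics lines that precede each alignment in this listing (`Score = ... bits (...), Expect = ...` and `Identities = a/b (p%)`) are regular enough to read mechanically. The following is a minimal Python sketch, not part of PredictBias: the function name `parse_hit_stats` is hypothetical, and comparing the E-value with `<=` against the 0.05 cutoff from the input parameters is an assumption about how the cutoff is applied.

```python
import re

# Patterns for the two statistics lines shown above each alignment.
STATS_RE = re.compile(
    r"Score\s*=\s*(?P<bits>[\d.]+)\s*bits\s*\((?P<raw>\d+)\),\s*"
    r"Expect\s*=\s*(?P<evalue>[\d.eE+-]+)"
)
IDENT_RE = re.compile(
    r"Identities\s*=\s*(?P<num>\d+)/(?P<den>\d+)\s*\((?P<pct>\d+)%\)"
)

EVALUE_CUTOFF = 0.05  # "E-value (RPSBlast)" from the input parameters


def parse_hit_stats(text):
    """Extract score, E-value and identity counts from one hit (hypothetical helper)."""
    stats = STATS_RE.search(text)
    ident = IDENT_RE.search(text)
    if stats is None or ident is None:
        raise ValueError("statistics lines not found")
    return {
        "bits": float(stats["bits"]),
        "raw_score": int(stats["raw"]),
        "evalue": float(stats["evalue"]),
        "identities": (int(ident["num"]), int(ident["den"])),
        "identity_pct": int(ident["pct"]),
    }


# Statistics lines copied from the ECs5353 / VACCYTOTOXIN hit.
sample = (
    "Score = 29.2 bits (65), Expect = 0.014\n"
    "Identities = 14/45 (31%), Positives = 20/45 (44%), Gaps = 4/45 (8%)\n"
)

hit = parse_hit_stats(sample)
# Assumption: a hit is reported when its E-value does not exceed the cutoff.
passes = hit["evalue"] <= EVALUE_CUTOFF
print(hit, passes)
```

Keeping the identity counts as integers makes it easy to recompute the percentage and compare it with the value printed in the report.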


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs5348   LPSBIOSNTHSS  36     7e-05    Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein

signature.
Length = 166

Score = 36.3 bits (84), Expect = 7e-05
Identities = 22/152 (14%), Positives = 54/152 (35%), Gaps = 35/152 (23%)

Query: 71 GKFYPLHTGHIYLIQRACSQVDELHIIMGFDDTRDRALFEDSAMSQQPTVPDRLRWLLQT 130
G F P+ GH+ +I+R C D++++ A+ + +V +RL + +
Sbjct: 7 GSFDPITFGHLDIIERGCRLFDQVYV----------AVLRNPNKQPMFSVQERLEQIAKA 56

Query: 131 FKYQKNIRIHAFNEEGMEPYPHGWDVWSNGIKKFMAEKGI---------QPDLIYTSEEA 181
+ N ++ D + + ++ D + A
Sbjct: 57 IAHLPNAQV---------------DSFEGLTVNYARQRQAGAILRGLRVLSDFELELQMA 101

Query: 182 DAPQYMEHLGIETVLVDPKRTFMSISGAQIRE 213
+ + + +ETV + + +S + ++E
Sbjct: 102 NTNKTLAS-DLETVFLTTSTEYSFLSSSLVKE 132


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs5346   FLGMRINGFLIF  30     0.022    Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 29.5 bits (66), Expect = 0.022
Identities = 21/71 (29%), Positives = 33/71 (46%), Gaps = 2/71 (2%)

Query: 123 QIECIDEIAKLAGTGEMVAEVTERAMRGELDFTASLRSRVATLK-GADANILQQVRENLP 181
Q+ E AK A V + TE A+ L L+ R A + GA+ + Q++RE
Sbjct: 482 QLTRRVEEAKAAQEQAQVRQETEEAVEVRLSKDEQLQQRRANQRLGAEV-MSQRIREMSD 540

Query: 182 LMPGLTQLVLK 192
P + LV++
Sbjct: 541 NDPRVVALVIR 551


2     ECs5314  ECs5295  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
ECs5314   -1      15       -4.546495   hypothetical protein
ECs5313   -2      15       -3.143980   carbon starvation protein
ECs5312   -1      14       -5.201338   hypothetical protein
ECs5311   0       15       -6.305653   GTP-binding protein YjiA
ECs5310   0       18       -6.973944   hypothetical protein
ECs5309   0       14       -4.988846   hypothetical protein
ECs5308   0       13       -4.921037   type I restriction-modification enzyme R
ECs5307   -1      18       -5.893495   type I restriction modification enzyme M
ECs5306   -1      21       -4.729915   type I restriction-modification enzyme S
ECs5593   -1      18       -2.539715   endoribonuclease SymE
ECs5305   -1      15       -1.733442   hypothetical protein
ECs5304   0       18       0.007959    hypothetical protein
ECs5303   1       19       1.512405    hypothetical protein
ECs5302   1       18       1.851661    regulator
ECs5301   1       17       1.432836    hypothetical protein
ECs5300   0       18       -3.702761   transporter
ECs5299   0       18       -5.422768   hypothetical protein
ECs5298   -2      19       -5.148751   hypothetical protein
ECs5297   -2      24       -6.695565   hypothetical protein
ECs5296   -1      26       -8.186382   hypothetical protein
ECs5295   -1      21       -5.682645   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs5300   TCRTETB       52     3e-09    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 51.8 bits (124), Expect = 3e-09
Identities = 47/189 (24%), Positives = 76/189 (40%), Gaps = 5/189 (2%)

Query: 7 RHAATLFFPMALILYDFAAYLSTDLIQPGIINVVRDFNADVSLAPAAVSLYLAGGMALQW 66
RH L + L + + ++ P I N A + A L + G A+
Sbjct: 11 RHNQILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAV-- 68

Query: 67 LLGPLSDRIGRKPVLITGALIFTLACAATMFTTSMTQFLI-ARAIQGTSICFIATVGYVT 125
G LSD++G K +L+ G +I S LI AR IQG + V
Sbjct: 69 -YGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVV 127

Query: 126 VQEAFGQTKGIKLMAIITSIVLIAPIIGPLSGAALMHFVHWKVLFAIIAVMGFISFVGLL 185
V + K +I SIV + +GP G + H++HW L +I ++ I+ L+
Sbjct: 128 VARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLL-LIPMITIITVPFLM 186

Query: 186 LAMPETVKR 194
+ + V+
Sbjct: 187 KLLKKEVRI 195


3     ECs5279  ECs5270  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
ECs5279   2       24       -2.797723   D-mannose specific adhesin
ECs5278   1       23       -2.690966   protein FimG
ECs5277   0       26       -3.250254   protein FimF
ECs5276   1       27       -3.895622   protein FimD
ECs5275   1       28       -5.220931   protein FimC
ECs5274   0       28       -5.652644   protein FimI
ECs5273   -1      29       -5.763194   FimA
ECs5272   0       30       -6.308665   tyrosine recombinase
ECs5271   0       29       -5.409049   tyrosine recombinase
ECs5270   0       27       -4.317614   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs5279   SURFACELAYER  28     0.044    Lactobacillus surface layer protein signature.
		>SURFACELAYER#Lactobacillus surface layer protein signature.

Length = 439

Score = 28.1 bits (62), Expect = 0.044
Identities = 19/79 (24%), Positives = 32/79 (40%), Gaps = 1/79 (1%)

Query: 211 SQNLGYYLSGTTADAGNSIFTNTASFSPAQGVGVQLTRNGTIIPANNTVSLGAVGTSAVS 270
S+N G ++ +A+ N FT PA V V L ++G ++ + + +
Sbjct: 133 SENAGKEITIGSAN-PNVTFTEKTGDQPASTVKVTLDQDGVAKLSSVQIKNVYAIDTTYN 191

Query: 271 LGLTANYARTGGQVTAGNV 289
+ TG VT G V
Sbjct: 192 SNVNFYDVTTGATVTTGAV 210


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs5278   VACCYTOTOXIN  30     0.003    Helicobacter pylori vacuolating cytotoxin signature.
		>VACCYTOTOXIN#Helicobacter pylori vacuolating cytotoxin signature.

Length = 1291

Score = 30.4 bits (68), Expect = 0.003
Identities = 30/158 (18%), Positives = 49/158 (31%), Gaps = 9/158 (5%)

Query: 3 WRKRGYLLAAILALASATIQAADVTITVNGKVVAKPCTVSTTNATVDLGDLYSFSLMSAG 62
W R + A LA + +TI + VT VN + + + + G
Sbjct: 258 WMGRLQYVGAYLAPSYSTINTSKVTGEVNFNHLTVGDHNAAQAGIIASNKTH------IG 311

Query: 63 AASAWHDVALELTNCPVG--TSRVTASFSGAADSTGYYKNQGTAQNIQLELQDDSGNTLN 120
W L + P G + S + Q ++QN + N+
Sbjct: 312 TLDLWQSAGLNIIAPPEGGYKDKPNDKPSNTTQNNAKNDKQESSQNNSNTQVINPPNSAQ 371

Query: 121 TGATKTVQVDDSSQSAHFPLQVRALTVNGGATQGTIQA 158
+ QV D + V +N A GTI+
Sbjct: 372 KTEIQPTQVIDGPFAGGKNTVVNINRINTNA-DGTIRV 408


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs5276   PF00577       1089   0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 1089 bits (2819), Expect = 0.0
Identities = 869/878 (98%), Positives = 873/878 (99%)

Query: 1 MSYLNLRLYQRNTQCLHIRKHRLAGFFVRLFVACAFAVQAPLSSAELYFNPRFLADDPQA 60
MSYLNLRLYQRNTQCLHIRKHRLAGFFVRLFVACAFA QAPLSSAELYFNPRFLADDPQA
Sbjct: 1 MSYLNLRLYQRNTQCLHIRKHRLAGFFVRLFVACAFAAQAPLSSAELYFNPRFLADDPQA 60

Query: 61 VADLSRFENGQELPPGTYRVDIYLNNGYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLN 120
VADLSRFENGQELPPGTYRVDIYLNNGYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLN
Sbjct: 61 VADLSRFENGQELPPGTYRVDIYLNNGYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLN 120

Query: 121 TASVAGMNLLADDACVPLTTMVQDATAHLDVGQQRLNLTIPQAFMSNRARGYIPPELWDP 180
TASV+GMNLLADDACVPLT+M+ DATA LDVGQQRLNLTIPQAFMSNRARGYIPPELWDP
Sbjct: 121 TASVSGMNLLADDACVPLTSMIHDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDP 180

Query: 181 GINAGLLNYNFSGNSVQNRIGGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDRSSGSK 240
GINAGLLNYNFSGNSVQNRIGGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSD SSGSK
Sbjct: 181 GINAGLLNYNFSGNSVQNRIGGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSK 240

Query: 241 NKWQHINTWLERDIIPLRSRLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPV 300
NKWQHINTWLERDIIPLRSRLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPV
Sbjct: 241 NKWQHINTWLERDIIPLRSRLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPV 300

Query: 301 IHGIARGTAQVTIKQNGYDIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTV 360
IHGIARGTAQVTIKQNGYDIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTV
Sbjct: 301 IHGIARGTAQVTIKQNGYDIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTV 360

Query: 361 PYSSVPLLQREGHTRYSITAGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRY 420
PYSSVPLLQREGHTRYSITAGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRY
Sbjct: 361 PYSSVPLLQREGHTRYSITAGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRY 420

Query: 421 RAFNFGIGKNMGALGALSVDMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYR 480
RAFNFGIGKNMGALGALSVDMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYR
Sbjct: 421 RAFNFGIGKNMGALGALSVDMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYR 480

Query: 481 YSTSGYFNFADTTYSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRS 540
YSTSGYFNFADTTYSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGR+
Sbjct: 481 YSTSGYFNFADTTYSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRT 540

Query: 541 STLYLSGSHQTYWGTSNVDEQFQAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLARNVNI 600
STLYLSGSHQTYWGTSNVDEQFQAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLA NVNI
Sbjct: 541 STLYLSGSHQTYWGTSNVDEQFQAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNI 600

Query: 601 PFSHWLRSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGD 660
PFSHWLRSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGD
Sbjct: 601 PFSHWLRSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGD 660

Query: 661 GNSGSTGYATLNYRGGYGNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNDTVVL 720
GNSGSTGYATLNYRGGYGNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNDTVVL
Sbjct: 661 GNSGSTGYATLNYRGGYGNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNDTVVL 720

Query: 721 VKAPGAKDAKVENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVP 780
VKAPGAKDAKVENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVP
Sbjct: 721 VKAPGAKDAKVENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVP 780

Query: 781 TRGAIVRAEFKARVGIKLLMTLTHNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLA 840
TRGAIVRAEFKARVGIKLLMTLTHNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLA
Sbjct: 781 TRGAIVRAEFKARVGIKLLMTLTHNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLA 840

Query: 841 GKVQVKWGEEENAHCVANYQLPPESQQQLLTQLSAECR 878
GKVQVKWGEEENAHCVANYQLPPESQQQLLTQLSAECR
Sbjct: 841 GKVQVKWGEEENAHCVANYQLPPESQQQLLTQLSAECR 878


4     ECs5262  ECs5230  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
ECs5262   -1      19       -4.662924   hypothetical protein
ECs5261   0       28       -6.786967   ATP-dependent helicase
ECs5260   1       33       -8.022474   hypothetical protein
ECs5259   2       40       -11.497253  hypothetical protein
ECs5258   1       35       -11.058755  hypothetical protein
ECs5257   2       29       -8.253769   hypothetical protein
ECs5256   1       26       -6.318935   hypothetical protein
ECs5255   -1      21       -4.310285   hypothetical protein
ECs5254   0       26       -6.231529   hypothetical protein
ECs5591   0       30       -6.223278   hypothetical protein
ECs5253   -1      28       -5.664664   integrase
ECs5252   0       31       -6.344287   transcriptional regulator
ECs5251   -1      30       -6.569344   hypothetical protein
ECs5250   1       35       -6.904613   hypothetical protein
ECs5249   0       28       -1.388380   resolvase
ECs5248   0       23       -1.302186   hypothetical protein
ECs5247   0       25       -3.070478   hypothetical protein
ECs5246   -2      19       -1.623983   hypothetical protein
ECs5245   -1      18       -0.741176   hypothetical protein
ECs5244   -2      15       0.244788    transposase
ECs5243   -2      15       0.445703    transposase
ECs5242   -2      13       0.932025    integrase
ECs5241   -3      16       2.850182    *oxidoreductase
ECs5240   -1      22       3.008699    hypothetical protein
ECs5239   -2      19       0.692571    hypothetical protein
ECs5238   -2      17       0.285864    hypothetical protein
ECs5237   -1      18       -0.302799   leucyl aminopeptidase
ECs5236   -2      19       -1.123227   DNA polymerase III subunit chi
ECs5235   -1      17       -5.669558   valyl-tRNA synthetase
ECs5234   0       27       -10.272571  hypothetical protein
ECs5233   0       25       -8.218747   hypothetical protein
ECs5232   0       26       -7.355040   hypothetical protein
ECs5231   1       28       -7.592105   ornithine carbamoyltransferase subunit I
ECs5230   2       36       -9.307767   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs5261   RTXTOXIND     31     0.032    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 30.6 bits (69), Expect = 0.032
Identities = 26/163 (15%), Positives = 59/163 (36%), Gaps = 16/163 (9%)

Query: 325 RLASGAEEEAYRRLVESQFRDDDDEQAQSN---KGRLFKITLEKALFSSPMACASVVANR 381
+ S E L++ QF +++ Q + + A + + V +R
Sbjct: 177 QNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSR 236

Query: 382 LKRLESRKDHN--SQSQINELESLLLALNNIDASQFSKYQLLLDTIRKDLAWKANNTEDR 439
L S ++ + E E+ + N S+ L+ I ++ + ++
Sbjct: 237 LDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQ----LEQIESEIL----SAKEE 288

Query: 440 LVIFTESIKTLEFLEQ--QLRADLKLKDDQIATLRGDQGDTVL 480
+ T+ K E L++ Q ++ L ++A Q +V+
Sbjct: 289 YQLVTQLFKN-EILDKLRQTTDNIGLLTLELAKNEERQQASVI 330


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs5258   RTXTOXIND     32     0.008    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 31.7 bits (72), Expect = 0.008
Identities = 18/134 (13%), Positives = 50/134 (37%), Gaps = 13/134 (9%)

Query: 161 AQIKLLRTEISDSSQAQLANHTHFSNKLWEQLEQFADLMAKGATEQI-IDALRQVIIDFN 219
+ R E + A++ + + S +L+ F+ L+ K A + + ++
Sbjct: 207 LNLDKKRAER-LTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAV 265

Query: 220 QNLTEQFGENFKALDASVKKLVEWQGNYKTQIEQMSEQYQQSV-ESLVETKTAVAGIWEE 278
L + L+ +++ K + + +++ ++ + + L +T + + E
Sbjct: 266 NELRVYKSQ----LEQIESEILS----AKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLE 317

Query: 279 CK--EIPLAMSELR 290
E S +R
Sbjct: 318 LAKNEERQQASVIR 331


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs5257   OMPADOMAIN    58     4e-12    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 58.0 bits (140), Expect = 4e-12
Identities = 36/136 (26%), Positives = 53/136 (38%), Gaps = 28/136 (20%)

Query: 95 SPDVLFGLGSTELKPKFKLILDDFFPRYLKVLDNYQEHITEVRIEGHTSTDWTGTTNPDI 154
DVLF LKP+ + LD + L N V + G+ TD G+
Sbjct: 218 KSDVLFNFNKATLKPEGQAALD----QLYSQLSNLDPKDGSVVVLGY--TDRIGSD---- 267

Query: 155 AYFNNMALSQGRTRAVLQYVYDIKNIATHQQWVKSKFAAVGYSSAHPILDKTGKEDPNRS 214
AY N LS+ R ++V+ Y+ K I K +A G ++P+ T R+
Sbjct: 268 AY--NQGLSERRAQSVVDYLI-SKGIP------ADKISARGMGESNPVTGNTCDNVKQRA 318

Query: 215 ---------RRVTFKV 221
RRV +V
Sbjct: 319 ALIDCLAPDRRVEIEV 334


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs5255   THERMOLYSIN   28     0.007    Thermolysin metalloprotease (M4) family signature.
		>THERMOLYSIN#Thermolysin metalloprotease (M4) family signature.

Length = 544

Score = 28.1 bits (62), Expect = 0.007
Identities = 20/71 (28%), Positives = 32/71 (45%), Gaps = 9/71 (12%)

Query: 10 AGNVTFVHNGKAYVTGGVNQNIFNGYFEDLNEAGKDSAAIDKINAHYFDKKAEDYFFNKF 69
+G T+ + + G + + N +F A D+AA+D AHY+ DY+ N
Sbjct: 265 SGIFTYDGRNRTVLPGSLWADGDNQFF-----ASYDAAAVD---AHYYAGVVYDYYKNVH 316

Query: 70 -LLSFDPSTQQ 79
LS+D S
Sbjct: 317 GRLSYDGSNAA 327


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs5233   SACTRNSFRASE  32     5e-04    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 32.2 bits (73), Expect = 5e-04
Identities = 16/48 (33%), Positives = 19/48 (39%)

Query: 97 PAIRGKGLAKKLALKAMEEAREMGFKRCYLETTAFLKEAIGLYEHLGF 144
R KG+ L KA+E A+E F LET A Y F
Sbjct: 99 KDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHF 146


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs5230   TYPE4SSCAGX   30     0.038    Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein

signature.
Length = 522

Score = 29.8 bits (66), Expect = 0.038
Identities = 18/75 (24%), Positives = 37/75 (49%)

Query: 180 KEIARLLNNHQKLNNLQKLNNLQKLNNLQKLNNIQKLNNIQELNNSQELNNSQELNNSQE 239
+ + ++N Q L+N + L+ L K +L+ +++L ++QE + L +ELN Q
Sbjct: 181 ENLTNAMSNPQNLSNNKNLSELIKQQRENELDQMERLEDMQEQAQANALKQIEELNKKQA 240

Query: 240 LNNSQDLKNSQVSCK 254
+ ++S K
Sbjct: 241 EEAVRQRAKDKISIK 255


5     ECs5183  ECs5174  Y     N          Y                   Genomic Island
LocusTag  DNBias  CDNBias  %GCBias     Product
ECs5183   0       27       -3.724562   transposase
ECs5182   0       30       -4.894025   transposase
ECs5181   0       30       -3.815333   hypothetical protein
ECs5180   1       30       -3.799262   hypothetical protein
ECs5179   1       28       0.481060    50S ribosomal protein L9
ECs5178   -2      23       2.580818    30S ribosomal protein S18
ECs5177   -1      23       3.260870    primosomal replication protein N
ECs5176   -1      26       3.370225    30S ribosomal protein S6
ECs5175   -1      28       3.262617    hypothetical protein
ECs5174   -1      31       3.215366    L-ribulose-5-phosphate 4-epimerase
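
Island 5 is the first entry labeled "Genomic Island" rather than "Pathogenicity Island (biased-composition)". Across the eleven summary rows in this listing, the label tracks the Bias and Virulence flags, while the Insertion elements flag does not change it. The Python sketch below checks that observation against the rows as printed. It is an inference from this output only: `label_island` is a hypothetical function, not PredictBias's actual decision logic, and it returns `None` for flag combinations that do not occur here.

```python
# (S.No, Bias, Virulence, Insertion elements, Prediction), copied from the summary rows.
PAI = "Pathogenicity Island (biased-composition)"
GI = "Genomic Island"

ROWS = [
    (1, "Y", "Y", "N", PAI),
    (2, "Y", "Y", "N", PAI),
    (3, "Y", "Y", "N", PAI),
    (4, "Y", "Y", "Y", PAI),
    (5, "Y", "N", "Y", GI),
    (6, "Y", "Y", "N", PAI),
    (7, "Y", "Y", "Y", PAI),
    (8, "Y", "Y", "N", PAI),
    (9, "Y", "Y", "N", PAI),
    (10, "Y", "N", "N", GI),
    (11, "Y", "Y", "N", PAI),
]


def label_island(bias, virulence):
    """Hypothetical rule inferred from the rows above; not the tool's own code."""
    if bias == "Y" and virulence == "Y":
        return PAI
    if bias == "Y" and virulence == "N":
        return GI
    return None  # combination not present in this listing


# Collect the serial numbers of any rows the inferred rule fails to reproduce.
mismatches = [row[0] for row in ROWS if label_island(row[1], row[2]) != row[4]]
print(mismatches)
```

An empty `mismatches` list only shows that the rule is consistent with these eleven rows; it says nothing about genomes or flag combinations not shown here.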
6     ECs5162  ECs5142  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
ECs5162   1       17       -3.095711   synthetase/amidase
ECs5161   2       17       -2.990018   hypothetical protein
ECs5160   4       14       -0.182473   hypothetical protein
ECs5159   4       13       0.211207    hypothetical protein
ECs5158   3       19       1.188543    hypothetical protein
ECs5157   2       20       1.310293    hypothetical protein
ECs5156   4       24       2.337511    23S rRNA
ECs5155   4       24       2.457774    exoribonuclease R
ECs5154   4       23       1.984758    transcriptional repressor NsrR
ECs5153   4       26       2.092131    adenylosuccinate synthetase
ECs5152   2       19       1.992300    hypothetical protein
ECs5151   1       15       2.908862    FtsH protease regulator HflC
ECs5150   0       13       3.171297    FtsH protease regulator HflK
ECs5149   -1      13       2.725868    GTPase HflX
ECs5148   -2      13       3.559671    RNA-binding protein Hfq
ECs5147   -2      14       3.563114    tRNA delta(2)-isopentenylpyrophosphate
ECs5146   -2      15       3.687962    DNA mismatch repair protein
ECs5145   -2      15       3.111247    N-acetylmuramoyl-l-alanine amidase II
ECs5144   -3      13       2.919743    ATPase
ECs5142   -1      15       3.530035    hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs5158   PHPHTRNFRASE  33     0.001    Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase

signature.
Length = 572

Score = 32.8 bits (75), Expect = 0.001
Identities = 22/122 (18%), Positives = 47/122 (38%), Gaps = 12/122 (9%)

Query: 85 VNPSLINEVAEEIARLENLITAEEQVLSNLEVSRDGVEKAVAATAQRIAQFEQQMEVVKA 144
+ + I +V+ EI +L A E+ L +D E ++ A I F + V+
Sbjct: 29 IEKTSITDVSTEIEKLT---AALEKSKEELRAIKDQTEASMGADKAEI--FAAHLLVLDD 83

Query: 145 TEAMQRAQQAVTTSTVGASSSVSTAAESLKRLQTRQAERQARLDAAAQLEKVADGRDLDE 204
E + + + + A ++ ++ +D E+ AD RD+ +
Sbjct: 84 PELVDGIKGKIENEQMNAEYALKEVSD-------MFVSMFESMDNEYMKERAADIRDVSK 136

Query: 205 KL 206
++
Sbjct: 137 RV 138


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs5155   RTXTOXIND     31     0.028    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 30.6 bits (69), Expect = 0.028
Identities = 12/55 (21%), Positives = 24/55 (43%), Gaps = 1/55 (1%)

Query: 165 VVPDDSRLSFDILIPPDQIMGARMGFVVVVELTQRPTRRTKAV-GKIVEVLGDNM 218
+VP+D L L+ I +G ++++ P R + GK+ + D +
Sbjct: 359 IVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDAI 413


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs5150   cloacin       32     0.006    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 31.6 bits (71), Expect = 0.006
Identities = 25/81 (30%), Positives = 30/81 (37%), Gaps = 10/81 (12%)

Query: 17 GSSKPGGNSEGNGNKGGRDQGPPDLDDIFRKLSKKLGGLGGGKGTGSGGGSSSQGP---- 72
S G +SE N GG G G GGG GTG G S+ P
Sbjct: 33 ASDGSGWSSENNPWGGGSGSGIHWGGGSGHGNGGGNGNSGGGSGTG-GNLSAVAAPVAFG 91

Query: 73 -----RPQLGGRVVTIAAAAI 88
P GG V+I+A A+
Sbjct: 92 FPALSTPGAGGLAVSISAGAL 112


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs5149   SECA          32     0.005    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 32.2 bits (73), Expect = 0.005
Identities = 26/144 (18%), Positives = 54/144 (37%), Gaps = 6/144 (4%)

Query: 282 HVIDAADVRVQENIEAVNTVLEEIDAHEIPTLLVMNKIDMLEDFEPRIDRDEENK-PIRV 340
++D +DV N + IDA+ P L ++ + + R+ D + PI
Sbjct: 665 ELLDVSDVSETINSIREDVFKATIDAYIPPQSL--EEMWDIPGLQERLKNDFDLDLPIAE 722

Query: 341 WLSAQTGAGIPQLFQALTERLSGEVAQHTLRLPPQEGRLRSRFYQLQAIEKEWMEEDGSV 400
WL + L + + + + + + R + LQ ++ W E ++
Sbjct: 723 WLDKEPELHEETLRERILAQSIEVYQRKEEVVGAEMMRHFEKGVMLQTLDSLWKEHLAAM 782

Query: 401 SLQVRMPIVDWRRLCKQEPALIDY 424
+R I R +++P +Y
Sbjct: 783 D-YLRQGIH-LRGYAQKDP-KQEY 803


7     ECs5116  ECs5111  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
ECs5116   -1      19       -3.540580   transcriptional regulator
ECs5115   0       20       -4.423572   *CadC family transcriptional regulator
ECs5114   -1      23       -3.515611   lysine/cadaverine antiporter
ECs5113   -1      19       -3.864199   lysine decarboxylase 1
ECs5112   -2      12       -3.096075   peptide transporter
ECs5111   0       12       -3.924816   lysyl-tRNA synthetase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs5116   HTHTETR       47     1e-08    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 46.9 bits (111), Expect = 1e-08
Identities = 30/199 (15%), Positives = 57/199 (28%), Gaps = 14/199 (7%)

Query: 1 MGKTEENSVQ-REDVLGEALKLLELQGIANTTLEMVAERVDYPLDELRRFWPDKEAILYD 59
KT++ + + R+ +L AL+L QG+++T+L +A+ + + DK + +
Sbjct: 2 ARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSE 61

Query: 60 ALRYLSQQIDVWRRQLMLDETQTAEQKLLARYQALSECVKNNRYPGCLFIAACTFYPDPG 119
I + L + E + F+
Sbjct: 62 IWELSESNIGELELEYQAKFPGDPLSVLREILIHVLEST--VTEERRRLLMEIIFHKCEF 119

Query: 120 H----PIHQLADQQKSAAYDFTHELLTT-------LEVDDPAMVAKQMELVLEGCLSRML 168
+ Q +YD + L A M + G + L
Sbjct: 120 VGEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWL 179

Query: 169 VNRSQADVDTAHRLAEDIL 187
D+ R IL
Sbjct: 180 FAPQSFDLKKEARDYVAIL 198


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs5115   SYCDCHAPRONE  37     8e-05    Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD

chaperone signature.
Length = 168

Score = 36.8 bits (85), Expect = 8e-05
Identities = 16/97 (16%), Positives = 36/97 (37%), Gaps = 7/97 (7%)

Query: 391 PLDEKQLAALNTEIDNIVTLPELNNLS-----IIYQIKAVSALVKGKTDESYQAINTGID 445
++ A+ + + T+ LN +S +Y + A + GK +++++
Sbjct: 6 TDTQEYQLAMESFLKGGGTIAMLNEISSDTLEQLYSL-AFNQYQSGKYEDAHKVFQALCV 64

Query: 446 LEMSWLNYVL-LGKVYEMKGMNREAADAYLTAFNLRP 481
L+ + L LG + G A +Y +
Sbjct: 65 LDHYDSRFFLGLGACRQAMGQYDLAIHSYSYGAIMDI 101


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs5112   TCRTETA       30     0.020    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 30.2 bits (68), Expect = 0.020
Identities = 36/190 (18%), Positives = 66/190 (34%), Gaps = 14/190 (7%)

Query: 44 NHAISLFSAYA-SLVYVTPILGGWLADRLLGNRTAVIAGALLMTLGHVVLGIDTNSTFSL 102
H L + YA P+LG +DR G R ++ + + ++ + L
Sbjct: 43 AHYGILLALYALMQFACAPVLGAL-SDRF-GRRPVLLVSLAGAAVDYAIMAT-APFLWVL 99

Query: 103 YLALAIIICGYGLFKSNISCLLGELYDEND-HRRDGGFSLLYAAGNIGSIAAPIACGLAA 161
Y+ + G+ + + + D D R F + A G +A P+ GL
Sbjct: 100 YIGRIV----AGITGATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMG 155

Query: 162 QWYGWHVGFALAGGGMFIGLLIFLSGHRHFQSTRSMDKKALTSVKF-ALPVWSWLVVMLC 220
+ H F A + L FL+G + +++ L L + W M
Sbjct: 156 G-FSPHAPFFAAA---ALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTV 211

Query: 221 LAPVFFTLLL 230
+A + +
Sbjct: 212 VAALMAVFFI 221


8     ECs5090  ECs5072  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
ECs5090   2       38       5.584271    hypothetical protein
ECs5089   1       39       7.040656    hypothetical protein
ECs5088   1       43       8.455187    phosphonate/organophosphate ester transporter
ECs5087   2       42       8.992371    periplasmic binding protein component of Pn
ECs5086   0       42       10.006342   membrane channel protein component of Pn
ECs5085   0       40       9.812134    phosphonate metabolism transcriptional regulator
ECs5084   0       39       9.622013    protein PhnG
ECs5083   2       42       9.561597    carbon-phosphorus lyase complex subunit
ECs5082   2       40       9.195480    protein PhnI
ECs5081   1       41       8.736422    protein PhnJ
ECs5080   1       39       9.049654    phosphonate C-P lyase system protein PhnK
ECs5079   1       37       8.493992    phosphonate ABC transporter ATP-binding protein
ECs5078   2       31       7.286062    protein PhnM
ECs5077   2       27       6.292124    ribose 1,5-bisphosphokinase
ECs5076   1       26       5.884659    aminoalkylphosphonic acid N-acetyltransferase
ECs5075   1       26       5.636763    carbon-phosphorus lyase complex accessory
ECs5584   1       24       4.788350    hypothetical protein
ECs5074   1       22       4.655182    histidine protein kinase
ECs5073   0       23       4.445436    sugar ABC transporter ATP-binding protein
ECs5072   0       21       4.039665    carbohydrate ABC transporter permease
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs5088   PF05272       29     0.020    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.020
Identities = 12/22 (54%), Positives = 13/22 (59%)

Query: 32 MVALLGPSGSGKSTLLRHLSGL 53
V L G G GKSTL+ L GL
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGL 619


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs5079   PF05272       29     0.015    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.015
Identities = 17/70 (24%), Positives = 25/70 (35%), Gaps = 8/70 (11%)

Query: 36 CVVLHGHSGSGKSTLLRSLYANYLPDEGQIQIKHGDEWVDLVTAPARKVVEI------RK 89
VVL G G GKSTL+ +L + I G + + + E+ R+
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGTGKDSYEQIAGIV--AYELSEMTAFRR 655

Query: 90 TTVGWVSQFL 99
V F
Sbjct: 656 ADAEAVKAFF 665


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs5076   SACTRNSFRASE  32     3e-04    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 32.2 bits (73), Expect = 3e-04
Identities = 20/84 (23%), Positives = 32/84 (38%), Gaps = 5/84 (5%)

Query: 47 HLALLDGEVVGMIGLHLQFHLHHVNWIGEIQELVVMPQARGLNVGSKLLAWAEEEARQAG 106
L L+ +G I + + N I+++ V R VG+ LL A E A++
Sbjct: 68 FLYYLENNCIGRIKIRSNW-----NGYALIEDIAVAKDYRKKGVGTALLHKAIEWAKENH 122

Query: 107 AEMTELSTNVKRHDAHRFYLREGY 130
L T A FY + +
Sbjct: 123 FCGLMLETQDINISACHFYAKHHF 146


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs5584   RTXTOXIND     26     0.034    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 25.9 bits (57), Expect = 0.034
Identities = 17/107 (15%), Positives = 41/107 (38%), Gaps = 8/107 (7%)

Query: 11 TLLTLTTVPAQADIIDDTIGNIQ--------QAINDASNPDRGRDYEDSRDDGWQREVSD 62
LL LT + A+AD + +Q Q ++ + ++ + + + +Q +
Sbjct: 123 VLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEE 182

Query: 63 DRRRQYDDRRRQFEDRRRQLDDRQHQLNQERRQLEDEERRMEDEYGQ 109
+ R + QF + Q ++ L+++R + R+
Sbjct: 183 EVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENL 229


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs5074   HTHFIS        58     6e-11    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 58.3 bits (141), Expect = 6e-11
Identities = 21/81 (25%), Positives = 44/81 (54%), Gaps = 2/81 (2%)

Query: 643 VLVLEDEAAVRQTICEQLHLLGYLTLEASSGEQALDLLAASAEIDIFISDLMLPGGMSGA 702
+LV +D+AA+R + + L GY S+ +AA + D+ ++D+++P +
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAA-GDGDLVVTDVVMP-DENAF 63

Query: 703 EVVNAARKLYPHLTLLLISGQ 723
+++ +K P L +L++S Q
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQ 84


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs5072   PF00577       28     0.047    Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 28.3 bits (63), Expect = 0.047
Identities = 16/73 (21%), Positives = 27/73 (36%), Gaps = 1/73 (1%)

Query: 219 FVYGMSGLLSGLGGIMSASRLYSANGNLGMG-YELDAIAAVILGGTSFVGGIGTITGTLV 277
++G+ + GG A R + N +G L A++ + S + G V
Sbjct: 400 LLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALSVDMTQANSTLPDDSQHDGQSV 459

Query: 278 GALIIATLNNGMT 290
L +LN T
Sbjct: 460 RFLYNKSLNESGT 472


9     ECs5059  ECs5048  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
ECs5059   -1      19       3.665975    glutamate/aspartate:proton symporter
ECs5058   -1      18       4.096639    formate-dependent nitrite reductase complex
ECs5057   -1      15       3.723038    formate-dependent nitrite reductase complex
ECs5056   -3      18       4.204735    heme lyase subunit NrfE
ECs5055   -2      21       3.623263    hypothetical protein
ECs5054   -2      22       3.274980    NrfC
ECs5053   -1      15       -0.904510   cytochrome c nitrite reductase pentaheme
ECs5052   0       16       -0.103364   cytochrome c552
ECs5051   -1      16       0.454516    acetyl-CoA synthetase
ECs5050   -1      15       -0.834140   hypothetical protein
ECs5049   -1      14       -0.753524   acetate permease
ECs5048   0       15       -3.256726   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs5054   VACJLIPOPROT  30     0.006    VacJ lipoprotein signature.
		>VACJLIPOPROT#VacJ lipoprotein signature.

Length = 251

Score = 29.9 bits (67), Expect = 0.006
Identities = 6/21 (28%), Positives = 11/21 (52%)

Query: 179 FGNLDDPNSEISQLLRQKPTY 199
GNL++P ++ L+ P
Sbjct: 75 TGNLEEPAVMVNYFLQGDPYQ 95


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs5050   RTXTOXIND     27     0.020    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 26.7 bits (59), Expect = 0.020
Identities = 5/33 (15%), Positives = 13/33 (39%), Gaps = 1/33 (3%)

Query: 17 ELVEKR-QRFATILSIIMLAVYIGFILLIAFAP 48
EL+E R +++ ++ + +L
Sbjct: 47 ELIETPVSRRPRLVAYFIMGFLVIAFILSVLGQ 79


10    ECs5033  ECs5021  Y     N          N                   Genomic Island
LocusTag  DNBias  CDNBias  %GCBias     Product
ECs5033   -1      19       -3.676683   quinone oxidoreductase
ECs5032   1       20       -2.401291   phage shock protein G
ECs5031   0       17       -2.036435   tRNA-dihydrouridine synthase A
ECs5030   0       18       -2.777971   hypothetical protein
ECs5029   0       13       2.352276    zinc uptake transcriptional repressor
ECs5028   0       14       2.063694    stress-response protein
ECs5027   0       14       2.347668    DNA-damage-inducible SOS response protein
ECs5026   0       14       -3.407459   LexA repressor
ECs5025   -1      13       -3.009927   diacylglycerol kinase
ECs5024   -1      9        -2.600538   glycerol-3-phosphate acyltransferase
ECs5023   -1      11       -3.872278   4-hydroxybenzoate octaprenyltransferase
ECs5022   0       13       -3.810855   chorismate pyruvate lyase
ECs5021   -2      13       -3.093783   hypothetical protein
11  ECs5006  ECs4957  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
ECs5006   2       16       -1.396424   hypothetical protein
ECs5005   1       15       -0.863181   23S rRNA pseudouridine synthase F
ECs5004   -2      14       -1.822726   sor-operon regulator
ECs5003   -3      16       -1.802700   sorbitol-6-phosphate 2-dehydrogenase
ECs5002   -2      21       -2.287821   sorbose-permease PTS system IIA component
ECs5001   -1      25       -3.029184   sorbose-permease PTS system IIB component
ECs5000   0       28       -2.316196   sorbose-permease PTS system IIC component
ECs4998   4       34       -4.226950   DNA modification protein
ECs5581   5       26       -1.753519   hypothetical protein
ECs4997   4       27       -1.658902   translational regulator
ECs4996   4       28       -1.983199   hypothetical protein
ECs4995   3       27       -2.257144   hypothetical protein
ECs4994   2       28       -2.051846   hypothetical protein
ECs4993   2       28       -1.832290   hypothetical protein
ECs4992   1       27       -0.526745   DNA-invertase
ECs4991   1       26       0.383144    tail fiber protein
ECs4990   2       23       1.403106    tail fiber assembly protein
ECs4989   4       24       3.833361    tail fiber
ECs4988   6       24       4.937196    hypothetical protein
ECs4987   6       24       6.215439    hypothetical protein
ECs4986   7       25       6.418140    hypothetical protein
ECs4985   7       25       6.348577    hypothetical protein
ECs4984   6       24       6.074204    tail protein
ECs4983   5       24       5.732259    DNA circulation protein
ECs4982   4       23       5.976170    tape measure protein
ECs4981   2       23       5.198199    hypothetical protein
ECs4980   2       24       5.202755    hypothetical protein
ECs4979   2       22       5.083377    tail sheath protein
ECs4978   -1      22       5.664676    hypothetical protein
ECs4977   0       22       5.622132    hypothetical protein
ECs4976   1       20       5.171449    hypothetical protein
ECs4975   2       22       5.586936    hypothetical protein
ECs4974   2       23       5.269577    major head subunit
ECs4973   2       24       5.557778    protease
ECs4972   3       25       5.473710    virion morphogenesis protein
ECs4971   2       25       5.390298    hypothetical protein
ECs4970   3       28       5.895410    hypothetical protein
ECs4969   1       26       5.510236    portal protein
ECs4968   2       28       5.946389    hypothetical protein
ECs4967   2       26       5.167199    hypothetical protein
ECs4966   1       28       5.232533    hypothetical protein
ECs4965   1       26       3.896929    C4-type zinc finger TraR
ECs4964   2       24       2.663743    hypothetical protein
ECs4962   2       23       -0.452766   endolysin
ECs4961   3       27       -3.050412   transcriptional regulator
ECs4960   6       32       -4.956583   hypothetical protein
ECs4959   4       32       -7.092992   hypothetical protein
ECs4958   4       28       -4.978123   hypothetical protein
ECs4957   2       26       -4.691825   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs5003   DHBDHDRGNASE  117    1e-33    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 117 bits (293), Expect = 1e-33
Identities = 80/272 (29%), Positives = 128/272 (47%), Gaps = 27/272 (9%)

Query: 7 LQDKIIIVTGGASGIGLAIVEELLAQGANVQMVDIHG-------GDGQYEGHKGYQFWPT 59
++ KI +TG A GIG A+ L +QGA++ VD + + E F P
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAF-PA 64

Query: 60 DISSAKEVNHTVAEIIQRFGRIDGLVNNAGVNFPRLLVDEKAPAGQYELNEAAFEKMVNI 119
D+ + ++ A I + G ID LVN AGV P L+ + L++ +E ++
Sbjct: 65 DVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLI---------HSLSDEEWEATFSV 115

Query: 120 NQKGVFLMSQAVARQMVKQHDGVIVNVSSESGLEGSEGQSCYAATKAALNSFTRSWSKEL 179
N GVF S++V++ M+ + G IV V S + YA++KAA FT+ EL
Sbjct: 116 NSTGVFNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLEL 175

Query: 180 GKHGIRVVGIAPGILEKTGLRTPEYEEALAWTRNITVEQLREGYT---KNAIPIGRAGRL 236
++ IR ++PG E + W EQ+ +G K IP+ + +
Sbjct: 176 AEYNIRCNIVSPGSTETDMQWS-------LWADENGAEQVIKGSLETFKTGIPLKKLAKP 228

Query: 237 AEIADFVCYLLSERASYITGVTTNIAGGKTRG 268
++IAD V +L+S +A +IT + GG T G
Sbjct: 229 SDIADAVLFLVSGQAGHITMHNLCVDGGATLG 260


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
ECs4982   OMADHESIN  31     0.016    Yersinia outer membrane adhesin signature.
		>OMADHESIN#Yersinia outer membrane adhesin signature.

Length = 455

Score = 31.0 bits (69), Expect = 0.016
Identities = 22/59 (37%), Positives = 29/59 (49%)

Query: 562 GGGALWAGAMAAPVLLDGSASATDKGEAVGSLAGSIAGGALGAAAGPVGIAIGSTVGSY 620
G G L A A + G+ + KG AV AGSIA G A GP+ A+G + +Y
Sbjct: 59 GAGGLNASAKGIHSIAIGATAEAAKGAAVAVGAGSIATGVNSVAIGPLSKALGDSAVTY 117


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
ECs4975   IGASERPTASE  26     0.048    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 26.2 bits (57), Expect = 0.048
Identities = 9/34 (26%), Positives = 17/34 (50%)

Query: 95 EAEAEAEAEAEAEAEAGQDAPRKKTGNKAEQARA 128
A E E +A+ E E Q+ P+ + +Q ++
Sbjct: 1103 TATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQS 1136


12  ECs4877  ECs4857  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
ECs4877   -2      20       3.170861    PEP-protein phosphotransferase system enzyme I
ECs4875   -3      18       2.732076    fructose-6-phosphate aldolase
ECs4874   -2      20       4.041918    glycerol dehydrogenase
ECs4873   -2      19       4.118336    hypothetical protein
ECs4872   -2      20       4.185375    hypothetical protein
ECs4871   -1      20       4.064416    hydroperoxidase HPI(I)
ECs4870   -1      18       3.560257    5,10-methylenetetrahydrofolate reductase
ECs4869   0       17       3.223683    bifunctional aspartate kinase II/homoserine
ECs4868   -1      22       4.154217    cystathionine gamma-synthase
ECs4867   -1      25       4.102921    transcriptional repressor protein MetJ
ECs4866   -1      21       4.609227    peptidoglycan peptidase
ECs4865   -2      19       4.382540    hypothetical protein
ECs5565   -2      18       4.843587    hypothetical protein
ECs4864   -2      18       5.059182    protein RhsH
ECs4863   1       14       2.931156    50S ribosomal protein L31
ECs4862   0       13       3.007540    primosome assembly protein PriA
ECs4861   -1      16       1.775963    DNA-binding transcriptional regulator CytR
ECs4860   -1      15       0.238737    cell division protein FtsN
ECs4859   -2      16       -2.351731   ATP-dependent protease peptidase subunit
ECs4858   -2      14       -2.730008   ATP-dependent protease ATP-binding protein HslU
ECs4857   -1      15       -3.508064   1,4-dihydroxy-2-naphthoate
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs4877   PHPHTRNFRASE  622    0.0      Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase signature.

Length = 572

Score = 622 bits (1606), Expect = 0.0
Identities = 196/569 (34%), Positives = 317/569 (55%), Gaps = 6/569 (1%)

Query: 119 RARTVCSGSAGGILTPISSLDLNALGNLPAAKDVDAEQSALENGLTLV---LKNIEFRLL 175
+ + + S I L+ N + DV E L L L+ I+ +
Sbjct: 4 KITGIAASSGVAIAKAFIHLEPNVDIEKTSITDVSTEIEKLTAALEKSKEELRAIKDQTE 63

Query: 176 DSDGATSA-ILEAHRSLAGDTSLHEHLLAGVSAGLSCAE-AIVASANHFCEEFSRSSSSY 233
S GA A I AH + D L + + + AE A+ ++ F F + Y
Sbjct: 64 ASMGADKAEIFAAHLLVLDDPELVDGIKGKIENEQMNAEYALKEVSDMFVSMFESMDNEY 123

Query: 234 LQERALDVRDVCFQLLQQIYGEQRFPAPGKLTQPAICMADELTPSQFLELDKNHLKGLLL 293
++ERA D+RDV ++L + G + + + + + +A++LTPS +L+K +KG
Sbjct: 124 MKERAADIRDVSKRVLGHLIGVET-GSLATIAEETVIIAEDLTPSDTAQLNKQFVKGFAT 182

Query: 294 KSGGTTSHTVILARSFNIPTLVGVDIDALTPWQHQTIYIDGNAGAIVVEPGEAVARYYQQ 353
GG TSH+ I++RS IP +VG + +DG G ++V P E + Y++
Sbjct: 183 DIGGRTSHSAIMSRSLEIPAVVGTKEVTEKIQHGDMVIVDGIEGIVIVNPTEEEVKAYEE 242

Query: 354 EARVQDALREQQRVWLTQQARTADGIRIEIAANIAHSVEAQAAFGNGAEGVGLFRTEMLY 413
+ + +++ + + + T DG +E+AANI + NG EG+GL+RTE LY
Sbjct: 243 KRAAFEKQKQEWAKLVGEPSTTKDGAHVELAANIGTPKDVDGVLANGGEGIGLYRTEFLY 302

Query: 414 MDRTSAPGESELYNIFCQALESANGRSIIVRTMDIGGDKPVDYLNIPAEANPFLGYRAVR 473
MDR P E E + + + ++ +G+ +++RT+DIGGDK + YL +P E NPFLG+RA+R
Sbjct: 303 MDRDQLPTEEEQFEAYKEVVQRMDGKPVVIRTLDIGGDKELSYLQLPKELNPFLGFRAIR 362

Query: 474 IYEEYASLFTTQLRSILRASAHGSLKIMIPMISSMEEILWVKEKLAEAKQQLRNEHIPFD 533
+ E +F TQLR++LRAS +G+LK+M PMI+++EE+ K + E K +L +E +
Sbjct: 363 LCLEKQDIFRTQLRALLRASTYGNLKVMFPMIATLEELRQAKAIMQEEKDKLLSEGVDVS 422

Query: 534 EKIQLGIMLEVPSVMFIIDQCCEEIDFFSIGSNDLTQYLLAVDRDNAKVTRHYNSLNPAF 593
+ I++GIM+E+PS + +E+DFFSIG+NDL QY +A DR N +V+ Y +PA
Sbjct: 423 DSIEVGIMVEIPSTAVAANLFAKEVDFFSIGTNDLIQYTMAADRMNERVSYLYQPYHPAI 482

Query: 594 LRALDYAVQAVHRQGKWIGLCGELGAKGSVLPLLVGLGLDELSMSAPSIPAAKARMAQLD 653
LR +D ++A H +GKW+G+CGE+ +PLL+GLGLDE SMSA SI A++++ +L
Sbjct: 483 LRLVDMVIKAAHSEGKWVGMCGEMAGDEVAIPLLLGLGLDEFSMSATSILPARSQLLKLS 542

Query: 654 SRECRKLLNQAMACRTSLEVEHLLAQFRM 682
E + +A+ T+ EVE L+ + +
Sbjct: 543 KEELKPFAQKALMLDTAEEVEQLVKKTYL 571


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
ECs4860   IGASERPTASE  41     4e-06    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 41.2 bits (96), Expect = 4e-06
Identities = 32/155 (20%), Positives = 64/155 (41%), Gaps = 5/155 (3%)

Query: 114 LTPEQRQLLEQMQADMRQQPTQLVEVPWNEQTPEQRQQTLQRQRQAQQLAEQQRLAQQSR 173
+ +QAD+ P+ E+ ++ P + +AE + Q+S+
Sbjct: 992 VDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSK--QESK 1049

Query: 174 TTEQSWQQQT-RTSQAAPVQAQPRQSKPASTQQPYQDLLQTPAHTTAQSKPQQAAPVTRA 232
T E++ Q T T+Q V + + + A+TQ + T ++ ++ A V +
Sbjct: 1050 TVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKE 1109

Query: 233 ADAPKPTAEKKDERRWMVQCGSFRGAEQAETVRAQ 267
A T + ++ + Q + EQ+ETV+ Q
Sbjct: 1110 EKAKVETEKTQEVPKVTSQVSPKQ--EQSETVQPQ 1142


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
ECs4858   HTHFIS  30     0.017    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 30.2 bits (68), Expect = 0.017
Identities = 11/36 (30%), Positives = 18/36 (50%), Gaps = 3/36 (8%)

Query: 49 TPKNILMIGPTGVGKTEIAR---RLAKLANAPFIKV 81
T +++ G +G GK +AR K N PF+ +
Sbjct: 159 TDLTLMITGESGTGKELVARALHDYGKRRNGPFVAI 194


13  ECs4818  ECs4813  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias     Product
ECs4818   -2      19       -3.228359   formate dehydrogenase-O subunit gamma
ECs4817   -1      28       -7.113313   formate dehydrogenase accessory protein FdhE
ECs4816   -1      31       -6.962726   hypothetical protein
ECs4815   -1      28       -5.750679   hypothetical protein
ECs4814   -2      23       -4.774843   hypothetical protein
ECs4813   -2      20       -4.362412   hypothetical protein
14  ECs4800  ECs4784  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
ECs4800   -1      11       -3.505225   alpha-glucosidase
ECs4799   0       12       -5.114748   permease
ECs4798   1       14       -4.515847   permease
ECs4797   1       20       -2.741377   outer membrane porin L
ECs4796   0       20       -1.216731   resistance protein
ECs4795   1       19       0.832870    hypothetical protein
ECs4794   1       23       2.089623    transcriptional regulator
ECs4793   2       24       2.467421    GTP-binding protein
ECs4792   0       18       2.427892    glutamine synthetase
ECs4791   0       15       1.814816    nitrogen regulation protein NR(II)
ECs4790   1       15       0.958029    nitrogen regulation protein NR(I)
ECs4789   0       13       -0.957744   coproporphyrinogen III oxidase
ECs4788   -1      13       -3.466033   hypothetical protein
ECs4787   -2      13       -3.838959   ribosome biogenesis GTP-binding protein YsxC
ECs4786   -2      14       -4.141936   DNA polymerase I
ECs5562   -2      20       -7.044478   hypothetical protein
ECs4785   -3      19       -5.731279   acyltransferase
ECs4784   -2      20       -5.327092   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs4799   TCRTETA  34     0.001    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 33.6 bits (77), Expect = 0.001
Identities = 31/160 (19%), Positives = 57/160 (35%), Gaps = 8/160 (5%)

Query: 188 QLGYIFAATLFSLFGLLFMWICYSGVKERYVETQPANPAQKPGLLQSFRAIAGNRPLFIL 247
+ AA +L GL F+ C+ + E +P + L SFR G + L
Sbjct: 160 HAPFFAAA---ALNGLNFLTGCFLLPESHKGERRPLR-REALNPLASFRWARGMTVVAAL 215

Query: 248 CIANLCTLGAFNVKLAIQVYYTQYVLN-DPILLSYM--GFFSMGCIFIGVFLMPGAVRRF 304
V A+ V + + + D + F + + + R
Sbjct: 216 MAVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGISLAAFGILHSLA-QAMITGPVAARL 274

Query: 305 GKKKVYIGGLLIWVLGDLLNYFFSGGSVSFVAFSCLAFFG 344
G+++ + G++ G +L F + G ++F LA G
Sbjct: 275 GERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLASGG 314


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs4796   TCRTETB  30     0.024    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 29.8 bits (67), Expect = 0.024
Identities = 31/161 (19%), Positives = 64/161 (39%), Gaps = 15/161 (9%)

Query: 227 NVFFVYAVYCGLTFFIPFLKNIYLLP----------VALVGAYGIINQYCLKMIGGPIGG 276
N+ F+ V CG F + ++P A +G+ I +I G IGG
Sbjct: 255 NIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYIGG 314

Query: 277 MISDKILKSPSKYLCYTFIISTAALVLLIMLPHESMPVYLGMACTLGFGAIVFTQRAVFF 336
++ D+ + P L + + + L E+ ++ + G + FT+
Sbjct: 315 ILVDR--RGPLYVLNIGVTFLSVSFLTASFLL-ETTSWFMTIIIVFVLGGLSFTK--TVI 369

Query: 337 APIGEAKIAENKTGAAMALGSFIGYAPAMFCFSLYGYILDL 377
+ I + + + + GA M+L +F + ++ G +L +
Sbjct: 370 STIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLSI 410


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
ECs4793   TCRTETOQM  180    4e-51    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 180 bits (458), Expect = 4e-51
Identities = 97/445 (21%), Positives = 170/445 (38%), Gaps = 81/445 (18%)

Query: 4 KLRNIAIIAHVDHGKTTLVDKLLQQSGTFDSRAETQE--RVMDSNDLEKERGITILAKNT 61
K+ NI ++AHVD GKTTL + LL SG + D+ LE++RGITI T
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 62 AIKWNDYRINIVDTPGHADFGGEVERVMSMVDSVLLVVDAFDGPMPQTRFVTKKAFAYGL 121
+ +W + ++NI+DTPGH DF EV R +S++D +L++ A DG QTR + G+
Sbjct: 62 SFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGI 121

Query: 122 KPIVVINKVDRPGARPDWVVDQVFD-------------LFVNLDATDEQLD--------- 159
I INK+D+ G V + + L+ N+ T+
Sbjct: 122 PTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEG 181

Query: 160 --------------------------------FPIVYASALNGIAGLDHEDMAEDMTPLY 187
FP+ + SA N I G+D+ L
Sbjct: 182 NDDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNI-GIDN---------LI 231

Query: 188 QAIVDHVPAPDVDLDGPFQMQISQLDYNSYVGVIGIGRIKRGKVKPNQQVTIIDSEGKTR 247
+ I + + ++ +++Y+ + R+ G + V I + E
Sbjct: 232 EVITNKFYSSTHRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEKEKI-- 289

Query: 248 NAKVGKVLGHLGLERIETDLAEAGDIVAITGLGELNISDTVCDTQNVEALPALSVDEPTV 307
K+ ++ + E + D A +G+IV + L ++ + DT+ + + P +
Sbjct: 290 --KITEMYTSINGELCKIDKAYSGEIVILQNEF-LKLNSVLGDTKLLPQRERIENPLPLL 346

Query: 308 SMFFCVNTSPFCGKEGKFVTSRQILDRLNKELVHNVALRVEETEDADAFRVSGRGELHLS 367
+ + D L LR +S G++ +
Sbjct: 347 QTTVEPSKPQQREMLLDALLEISDSDPL---------LRYYVDSATHEIILSFLGKVQME 397

Query: 368 VLIENMRRE-GFELAVSRPKVIFRE 391
V ++ + E+ + P VI+ E
Sbjct: 398 VTCALLQEKYHVEIEIKEPTVIYME 422



Score = 32.5 bits (74), Expect = 0.005
Identities = 13/75 (17%), Positives = 29/75 (38%), Gaps = 1/75 (1%)

Query: 398 EPYENVTLDVEEQHQGSVMQALGERKGDLKNMNPDGKGRVRLDYVIPSRGLIGFRSEFMT 457
EPY + + +++ + ++ + V L IP+R + +RS+
Sbjct: 537 EPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQLKN-NEVILSGEIPARCIQEYRSDLTF 595

Query: 458 MTSGTGLLYSTFSHY 472
T+G + + Y
Sbjct: 596 FTNGRSVCLTELKGY 610


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs4791   PF06580  28     0.042    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 28.3 bits (63), Expect = 0.042
Identities = 34/190 (17%), Positives = 72/190 (37%), Gaps = 41/190 (21%)

Query: 171 IIEQADRLRNLVDRL---LGPQLPGTRVTE-SIHKVAERV---VTLVSMELPDNVRLIRD 223
I+E + R ++ L + L + + S+ V + L S++ D ++
Sbjct: 186 ILEDPTKAREMLTSLSELMRYSLRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQ 245

Query: 224 YDPSLPELAHDPDQIEQVLLN-IVRNALQ---ALGPEGGEIILRTRTAFQLTLHGERYRL 279
+P++ ++ Q+ +L+ +V N ++ A P+GG+I+L+
Sbjct: 246 INPAIMDV-----QVPPMLVQTLVENGIKHGIAQLPQGGKILLKGT------KDNGTVT- 293

Query: 280 AARIDVEDNGPGIPPHLQDTLFYPMVSGREGGTGLGLSIARNLIDQHSGK---IEFTSWP 336
++VE+ G + ++ TG GL R + G I+ +
Sbjct: 294 ---LEVENTGSLALKNTKE------------STGTGLQNVRERLQMLYGTEAQIKLSEKQ 338

Query: 337 GHTEFSVYLP 346
G V +P
Sbjct: 339 GKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
ECs4790   HTHFIS  602    0.0      FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 602 bits (1553), Expect = 0.0
Identities = 206/478 (43%), Positives = 300/478 (62%), Gaps = 11/478 (2%)

Query: 1 MQRGIVWVVDDDSSIRWVLERALAGAGLTCTTFENGAEVLEALASKTPDVLLSDIRMPGM 60
M + V DDD++IR VL +AL+ AG N A + +A+ D++++D+ MP
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 DGLALLKQIKQRHPMLPVIIMTAHSDLDAAVSAYQQGAFDYLPKPFDIDEAVALVERAIS 120
+ LL +IK+ P LPV++M+A + A+ A ++GA+DYLPKPFD+ E + ++ RA++
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 121 HYQEQQQPRNVQLNGPTTDIIGEAPAMQDVFRIIGRLSRSSISVLINGESGTGKELVAHA 180
+ + ++G + AMQ+++R++ RL ++ ++++I GESGTGKELVA A
Sbjct: 121 EPKRRPSKLEDDSQDGM-PLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARA 179

Query: 181 LHRHSPRAKAPFIALNMAAIPKDLIESELFGHEKGAFTGANTIRQGRFEQADGGTLFLDE 240
LH + R PF+A+NMAAIP+DLIESELFGHEKGAFTGA T GRFEQA+GGTLFLDE
Sbjct: 180 LHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDE 239

Query: 241 IGDMPLDVQTRLLRVLADGQFYRVGGYAPVKVDVRIIAATHQNLEQRVQEGKFREDLFHR 300
IGDMP+D QTRLLRVL G++ VGG P++ DVRI+AAT+++L+Q + +G FREDL++R
Sbjct: 240 IGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYR 299

Query: 301 LNVIRVHLPPLRERREDIPRLARHFLQVAARELGVEAKLLHPETEAALTRLAWPGNVRQL 360
LNV+ + LPPLR+R EDIP L RHF+Q A +E G++ K E + WPGNVR+L
Sbjct: 300 LNVVPLRLPPLRDRAEDIPDLVRHFVQQAEKE-GLDVKRFDQEALELMKAHPWPGNVREL 358

Query: 361 ENTCRWLTVMAAGQEVLIQDLPGELFESTVAESTSQMQPDSWATLLAQWADRALRS---- 416
EN R LT + + + + EL + S + ++Q + +R
Sbjct: 359 ENLVRRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFAS 418

Query: 417 -----GHQNLLSEAQPELERTLLTTALRHTQGHKQEAARLLGWGRNTLTRKLKELGME 469
L E+E L+ AL T+G++ +AA LLG RNTL +K++ELG+
Sbjct: 419 FGDALPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGVS 476


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs4788   SECA  30     0.004    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 30.2 bits (68), Expect = 0.004
Identities = 11/71 (15%), Positives = 29/71 (40%)

Query: 13 AKARRKTREELNQEARDRKRQKKRRGHAPGSRAAGGNNTSGSKGQNAPKDPRIGSKTPIP 72
+K + + EE+ + + R+ + +R ++ + + + ++G P P
Sbjct: 827 SKVQVRMPEEVEELEQQRRMEAERLAQMQQLSHQDDDSAAAAALAAQTGERKVGRNDPCP 886

Query: 73 LGVTEKVTKQH 83
G +K + H
Sbjct: 887 CGSGKKYKQCH 897


15  ECs4752  ECs4743  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
ECs4752   1       15       -3.208789   ATP-dependent DNA helicase RecQ
ECs4751   0       13       -5.135312   phospholipase A
ECs4750   0       16       -5.944719   hypothetical protein
ECs4749   -1      15       -6.262488   hypothetical protein
ECs4748   -1      9        -2.496248   hypothetical protein
ECs4747   -2      10       -0.998917   hypothetical protein
ECs4746   -2      16       2.029381    magnesium/nickel/cobalt transporter CorA
ECs4745   -2      16       2.475151    hypothetical protein
ECs4744   -2      17       3.352796    hypothetical protein
ECs4743   -1      20       4.394383    DNA-dependent helicase II
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs4751   PHPHLIPASEA1  499    0.0      Bacterial phospholipase A1 protein signature.
		>PHPHLIPASEA1#Bacterial phospholipase A1 protein signature.

Length = 289

Score = 499 bits (1286), Expect = 0.0
Identities = 289/289 (100%), Positives = 289/289 (100%)

Query: 1 MRTLQGWLLPVFMLPMAVYAQEATVKEVHDAPAVRGSIIANMLQEHDNPFTLYPYDTNYL 60
MRTLQGWLLPVFMLPMAVYAQEATVKEVHDAPAVRGSIIANMLQEHDNPFTLYPYDTNYL
Sbjct: 1 MRTLQGWLLPVFMLPMAVYAQEATVKEVHDAPAVRGSIIANMLQEHDNPFTLYPYDTNYL 60

Query: 61 IYTQTSDLNKEAIASYDWAENARKDEVKFQLSLAFPLWRGILGPNSVLGASYTQKSWWQL 120
IYTQTSDLNKEAIASYDWAENARKDEVKFQLSLAFPLWRGILGPNSVLGASYTQKSWWQL
Sbjct: 61 IYTQTSDLNKEAIASYDWAENARKDEVKFQLSLAFPLWRGILGPNSVLGASYTQKSWWQL 120

Query: 121 SNSEESSPFRETNYEPQLFLGFATDYRFAGWTLRDVEMGYNHDSNGRSDPTSRSWNRLYT 180
SNSEESSPFRETNYEPQLFLGFATDYRFAGWTLRDVEMGYNHDSNGRSDPTSRSWNRLYT
Sbjct: 121 SNSEESSPFRETNYEPQLFLGFATDYRFAGWTLRDVEMGYNHDSNGRSDPTSRSWNRLYT 180

Query: 181 RLMAENGNWLVEVKPWYVVGNTDDNPDITKYMGYYQLKIGYHLGDAVLSAKGQYNWNTGY 240
RLMAENGNWLVEVKPWYVVGNTDDNPDITKYMGYYQLKIGYHLGDAVLSAKGQYNWNTGY
Sbjct: 181 RLMAENGNWLVEVKPWYVVGNTDDNPDITKYMGYYQLKIGYHLGDAVLSAKGQYNWNTGY 240

Query: 241 GGAELGLSYPITKHVRLYTQVYSGYGESLIDYNFNQTRVGVGVMLNDLF 289
GGAELGLSYPITKHVRLYTQVYSGYGESLIDYNFNQTRVGVGVMLNDLF
Sbjct: 241 GGAELGLSYPITKHVRLYTQVYSGYGESLIDYNFNQTRVGVGVMLNDLF 289


16  ECs4682  ECs4637  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
ECs4682   2       21       -0.568469   16S rRNA methyltransferase GidB
ECs4681   3       34       0.639347    F0F1 ATP synthase subunit I
ECs4680   2       33       0.896065    ATP synthase F0F1 subunit A
ECs4679   4       41       1.911030    ATP synthase F0F1 subunit C
ECs4678   4       39       1.999245    ATP synthase F0F1 subunit B
ECs4677   3       35       1.937153    ATP synthase F0F1 subunit delta
ECs4676   3       37       2.120392    ATP synthase F0F1 subunit alpha
ECs4675   2       29       1.407289    ATP synthase F0F1 subunit gamma
ECs4674   2       28       0.383252    ATP synthase F0F1 subunit beta
ECs4673   -1      20       -0.827180   ATP synthase F0F1 subunit epsilon
ECs4672   -1      14       -1.928014   bifunctional N-acetylglucosamine-1-phosphate
ECs4671   0       12       -3.003940   glucosamine--fructose-6-phosphate
ECs4670   1       21       -5.464591   type 1 fimbrial protein
ECs4669   0       15       -4.528793   fimbrial chaperone
ECs4668   0       9        -2.914492   hypothetical protein
ECs4667   -1      9        -1.849249   outer membrane usher protein
ECs4666   -2      13       -0.888776   fimbrial protein
ECs4665   -3      19       -0.064881   fimbrial protein
ECs4664   -2      29       1.614178    phosphate ABC transporter substrate-binding
ECs4663   -2      27       1.164199    phosphate transporter permease PstC
ECs4662   0       21       -9.982965   phosphate transporter permease PtsA
ECs4661   1       27       -13.032930  phosphate transporter ATP-binding protein
ECs4660   4       40       -16.841223  transcriptional regulator PhoU
ECs4659   6       53       -20.582519  hypothetical protein
ECs5542   7       52       -20.983816  hypothetical protein
ECs4657   5       46       -19.092822  hypothetical protein
ECs4656   1       29       -12.872126  hypothetical protein
ECs4655   0       25       -11.582729  hypothetical protein
ECs4654   0       21       -9.315510   hypothetical protein
ECs4653   -1      16       -6.747744   hypothetical protein
ECs4652   -3      13       0.887666    6-phosphogluconate phosphatase
ECs4651   -3      12       -0.149169   membrane/transport protein
ECs4650   -2      12       -0.837154   hypothetical protein
ECs4649   -2      12       -1.245642   hypothetical protein
ECs4648   -1      15       -5.490944   DNA-binding transcriptional regulator YidZ
ECs4647   0       18       -6.868914   multidrug efflux system protein MdtL
ECs4646   0       16       -5.756112   tryptophan permease TnaB
ECs4645   -1      13       -4.002368   tryptophanase
ECs4644   -1      15       -4.282870   tryptophanase leader peptide
ECs4643   -1      15       -4.025742   hypothetical protein
ECs4642   -1      18       1.968264    hypothetical protein
ECs4641   0       19       2.891601    tRNA modification GTPase TrmE
ECs4640   1       21       2.467311    inner membrane protein translocase component
ECs4639   3       23       2.904864    ribonuclease P
ECs4638   3       22       2.625182    50S ribosomal protein L34
ECs4637   2       21       2.240861    chromosomal replication initiation protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
ECs4678   IGASERPTASE  27     0.028    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 27.3 bits (60), Expect = 0.028
Identities = 20/101 (19%), Positives = 37/101 (36%), Gaps = 18/101 (17%)

Query: 31 AAIEKRQKEIADGLASAERAHKDLDLAKASATDQLKKAKAEAQVIIEQ--ANKRRSQILD 88
+EK +++ + A K+ + T + A++ ++ Q K + +
Sbjct: 1049 KTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEK 1108

Query: 89 EAKAEAEQERTKIVA----------------QAQAEIEAER 113
E KA+ E E+T+ V Q QAE E
Sbjct: 1109 EEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPAREN 1149


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
ECs4672   RTXTOXINA  29     0.048    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 29.2 bits (65), Expect = 0.048
Identities = 23/80 (28%), Positives = 31/80 (38%), Gaps = 10/80 (12%)

Query: 367 LGDAEIGDNVNIGAGTITCNYDGANKFKTIIGDDVFVGSDTQLVAPVTVGKGATIAAGTT 426
LGD + D V + AG+ N G DV T G AT A T
Sbjct: 616 LGDGD--DKVFLSAGSA--NIYAGK------GHDVVYYDKTDTGYLTIDGTKATEAGNYT 665

Query: 427 VTRNVGENALAISRVPQTQK 446
VTR +G + + V + Q+
Sbjct: 666 VTRVLGGDVKVLQEVVKEQE 685


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs4667   PF00577  769    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 769 bits (1986), Expect = 0.0
Identities = 329/875 (37%), Positives = 481/875 (54%), Gaps = 58/875 (6%)

Query: 6 LFITLASGICLLCSISAFARDSLFNPRLLELDHPADNIDIHQFNRSNTLPAGTYKVDVMI 65
F+ L + + FNPR L D P D+ +F LP GTY+VD+ +
Sbjct: 26 FFVRLFVACAFAAQAPLSSAELYFNPRFLADD-PQAVADLSRFENGQELPPGTYRVDIYL 84

Query: 66 NGMLFERQEVKFAQDNPDAELHPCYVAIKNVLATYGIKVDAIKSLANVDDKTCVNPVPLI 125
N ++V F + + + PC LA+ G+ ++ + + D CV +I
Sbjct: 85 NNGYMATRDVTFNTGDSEQGIVPCLTR--AQLASMGLNTASVSGMNLLADDACVPLTSMI 142

Query: 126 DGATWLLDASKLALNITIPQIYLNNAVNGYISPSRWDQGINAMMMNYDFSASHTIRSNYD 185
AT LD + LN+TIPQ +++N GYI P WD GINA ++NY+FS + ++
Sbjct: 143 HDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSV-QNRIG 201

Query: 186 DDDDSYYLNLRNGINLGAWRFRNYSTLN------SYDGNVDYHSVSNYIQRDIMALRSQI 239
+ YLNL++G+N+GAWR R+ +T + S + ++ +++RDI+ LRS++
Sbjct: 202 GNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRSRL 261

Query: 240 MIGDTWTASDVFDSTQVRGVRLYTDDDMLPSSQNGFAPVVHGIAKTNATVIIKQNGYVIY 299
+GD +T D+FD RG +L +DD+MLP SQ GFAPV+HGIA+ A V IKQNGY IY
Sbjct: 262 TLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIY 321

Query: 300 QSAVPQGAFALTDLNTTSSGGDLDVTIKEEDGSEQHFIQPFTSLAILKREGQTDVDLSIG 359
S VP G F + D+ + GDL VTIKE DGS Q F P++S+ +L+REG T ++ G
Sbjct: 322 NSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITAG 381

Query: 360 EVR--DESGFTPEVLQLQAMHGFPLGITLYGGTQLANDYASAALGIGKDMGALGAISFDV 417
E R + P Q +HG P G T+YGGTQLA+ Y + GIGK+MGALGA+S D+
Sbjct: 382 EYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALSVDM 441

Query: 418 THARSQFDYDDNESGQSYRFLYSKRFEDTNTTFRLVGYRYSMEGFYTLNEWVSRQDNDSD 477
T A S D GQS RFLY+K ++ T +LVGYRYS G++ + + N +
Sbjct: 442 TQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGYN 501

Query: 478 -----------------FWVTGNRRSRFEGTWTQSFTPGWGNIYLTFSRQEYWQTDEVER 520
+ + N+R + + T TQ +YL+ S Q YW T V+
Sbjct: 502 IETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQL-GRTSTLYLSGSHQTYWGTSNVDE 560

Query: 521 LLQFGYNNNWRNISWNVSWNYTDSIKRSLGNHHDDNNDDFGKEQIFMFSMSIPLSCWMED 580
Q G N + +I+W +S++ T ++ G++Q+ +++IP S W+
Sbjct: 561 QFQAGLNTAFEDINWTLSYSLT----KNAWQK--------GRDQMLALNVNIPFSHWLRS 608

Query: 581 --------SYVNYSLTQNNHHESTMQVGLNGTMLEGRNLSYNVQESWMHSPDDSYSGNAG 632
+ +YS++ + + T G+ GT+LE NLSY+VQ + D SG+ G
Sbjct: 609 DSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGY-AGGGDGNSGSTG 667

Query: 633 ---MTYDGTYGSVNGSYSWSRDSQHFDYGARGGVLVHSDGVTFSQELGETVALVKAPGAE 689
+ Y G YG+ N YS S D + YG GGVL H++GVT Q L +TV LVKAPGA+
Sbjct: 668 YATLNYRGGYGNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNDTVVLVKAPGAK 727

Query: 690 GLSIENATGISTDWRGYTVKTQLSPYDENRVALNSDYFSKANIELENTVINLVPTRGAVV 749
+EN TG+ TDWRGY V + Y ENRVAL+++ + N++L+N V N+VPTRGA+V
Sbjct: 728 DAKVENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLA-DNVDLDNAVANVVPTRGAIV 786

Query: 750 KAEFVTHVGYRVLFNVRQVNGKPIMFGAMATASLETGTVTGIVGDNGELYLSGMPEKGEF 809
+AEF VG ++L + N KP+ FGAM T E+ +GIV DNG++YLSGMP G+
Sbjct: 787 RAEFKARVGIKLLMTLTH-NNKPLPFGAMVT--SESSQSSGIVADNGQVYLSGMPLAGKV 843

Query: 810 LLSWGQAADEKCKAAYHITHKPDDTSLVQMDAICR 844
+ WG+ + C A Y + + L Q+ A CR
Sbjct: 844 QVKWGEEENAHCVANYQLPPESQQQLLTQLSAECR 878


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs4665   FIMBRIALPAPF  32     0.002    Escherichia coli: P pili tip fibrillum papF protein...
		>FIMBRIALPAPF#Escherichia coli: P pili tip fibrillum papF protein signature.

Length = 167

Score = 32.0 bits (72), Expect = 0.002
Identities = 38/169 (22%), Positives = 67/169 (39%), Gaps = 11/169 (6%)

Query: 189 LYANISSTTTRGEAIAKVRISGSLTAPQSCQINAGQVIYFDFDTIPASEFSSTAGQAITS 248
L+ ++ T+ A ++ I G++ P C IN GQ I DF I ++ G+
Sbjct: 6 LFISLLLTSVAVLADVQINIRGNVYIP-PCTINNGQNIVVDFGNINPEHVDNSRGE---- 60

Query: 249 RKITKTVSIECTGMGYERTQKVDASFTGTNRSSDDTMVATDNADVGIKIYNKSNAEVSVN 308
+TK +SI C KV + G + + ++AT+ GI +Y +
Sbjct: 61 --VTKNISISCPYKSGSLWIKVTGNTMGVGQ---NNVLATNITHFGIALYQGKGMSTPLT 115

Query: 309 NGKLPADMGNTTI-FGRKNGSVTFSAAPASFTGARPQPGVFNATATLTI 356
G + T + TF++ P G F TA++++
Sbjct: 116 LGNGSGNGYRVTAGLDTARSTFTFTSVPFRNGSGILNGGDFRTTASMSM 164


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs4647   TCRTETA  54     3e-10    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 54.4 bits (131), Expect = 3e-10
Identities = 61/257 (23%), Positives = 97/257 (37%), Gaps = 10/257 (3%)

Query: 2 SRFLICSFALVLLYPAGIDMYLVGLPRIAADLNASEAQLHIAFSVYLAGMAAAML----F 57
+R LI + V L GI + + LP + DL S + + + LA A
Sbjct: 4 NRPLIVILSTVALDAVGIGLIMPVLPGLLRDLVHSN-DVTAHYGILLALYALMQFACAPV 62

Query: 58 AGKVADRSGRKPVAIPGAALFIITSVFCSLAETSTLFLAGRFLQGLGAGCCYVVAFAILR 117
G ++DR GR+PV + A + + A + GR + G+ G VA A +
Sbjct: 63 LGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGI-TGATGAVAGAYIA 121

Query: 118 DTLDDRRRAKVLSLLNGITCIIPVLAPVLGHLIMLKFPWQSLFWTMAIMGIAVLMLSLFI 177
D D RA+ ++ V PVLG L M F + F+ A + + F+
Sbjct: 122 DITDGDERARHFGFMSACFGFGMVAGPVLGGL-MGGFSPHAPFFAAAALNGLNFLTGCFL 180

Query: 178 LKETRPAAPAASDKSRENSESLLNRFFLSRVVITTLSVSVILTFVNTSPVLLMEIMGFER 237
L E+ + N + VV ++V I+ V P L I G +R
Sbjct: 181 LPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGEDR 240

Query: 238 GEYATIMALTAGVSMTV 254
+ G+S+
Sbjct: 241 FHWDATT---IGISLAA 254


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
ECs4642   FLGHOOKFLIE  25     0.019    Flagellar hook-basal body complex protein FliE signa...
		>FLGHOOKFLIE#Flagellar hook-basal body complex protein FliE signature.

Length = 103

Score = 25.4 bits (55), Expect = 0.019
Identities = 9/44 (20%), Positives = 16/44 (36%)

Query: 22 LKDVMMQLEAKNNEGKYVISKANGNPVFKELFWKAIDEFNFPQE 65
++ V+ QL+A + S F A+D + Q
Sbjct: 6 IEGVISQLQATAMSARAQESLPQPTISFAGQLHAALDRISDTQT 49


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
ECs4640   60KDINNERMP  874    0.0      60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 874 bits (2260), Expect = 0.0
Identities = 547/548 (99%), Positives = 548/548 (100%)

Query: 1 MDSQRNLLVIALLFVSFMIWQAWEQDKNPQPQAQQTTQTTTTAAGSAADQGVPASGQGKL 60
MDSQRNLLVIALLFVSFMIWQAWEQDKNPQPQAQQTTQTTTTAAGSAADQGVPASGQGKL
Sbjct: 1 MDSQRNLLVIALLFVSFMIWQAWEQDKNPQPQAQQTTQTTTTAAGSAADQGVPASGQGKL 60

Query: 61 ISVKTDVLDLTINTRGGDVEQALLPAYPKELNSTQPFQLLETSPQFIYQAQSGLTGRDGP 120
ISVKTDVLDLTINTRGGDVEQALLPAYPKELNSTQPFQLLETSPQFIYQAQSGLTGRDGP
Sbjct: 61 ISVKTDVLDLTINTRGGDVEQALLPAYPKELNSTQPFQLLETSPQFIYQAQSGLTGRDGP 120

Query: 121 DNPANGPRPLYNVEKDAYVLAEGQNELQVPMTYTDAAGNTFTKTFVLKRGDYAVNVNYNV 180
DNPANGPRPLYNVEKDAYVLAEGQNELQVPMTYTDAAGNTFTKTFVLKRGDYAVNVNYNV
Sbjct: 121 DNPANGPRPLYNVEKDAYVLAEGQNELQVPMTYTDAAGNTFTKTFVLKRGDYAVNVNYNV 180

Query: 181 QNAGEKPLEISTFGQLKQSITLPPHLDTGSSNFALHTFRGAAYSTPDEKYEKYKFDTIAD 240
QNAGEKPLEIS+FGQLKQSITLPPHLDTGSSNFALHTFRGAAYSTPDEKYEKYKFDTIAD
Sbjct: 181 QNAGEKPLEISSFGQLKQSITLPPHLDTGSSNFALHTFRGAAYSTPDEKYEKYKFDTIAD 240

Query: 241 NENLNISSKGGWVAMLQQYFATAWIPHNDGTNNFYTANLGNGIAAIGYKSQPVLVQPGQT 300
NENLNISSKGGWVAMLQQYFATAWIPHNDGTNNFYTANLGNGIAAIGYKSQPVLVQPGQT
Sbjct: 241 NENLNISSKGGWVAMLQQYFATAWIPHNDGTNNFYTANLGNGIAAIGYKSQPVLVQPGQT 300

Query: 301 GAMNSTLWVGPEIQDKMAAVAPHLDLTVDYGWLWFISQPLFKLLKWIHSFVGNWGFSIII 360
GAMNSTLWVGPEIQDKMAAVAPHLDLTVDYGWLWFISQPLFKLLKWIHSFVGNWGFSIII
Sbjct: 301 GAMNSTLWVGPEIQDKMAAVAPHLDLTVDYGWLWFISQPLFKLLKWIHSFVGNWGFSIII 360

Query: 361 ITFIVRGIMYPLTKAQYTSMAKMRMLQPKIQAMRERLGDDKQRISQEMMALYKAEKVNPL 420
ITFIVRGIMYPLTKAQYTSMAKMRMLQPKIQAMRERLGDDKQRISQEMMALYKAEKVNPL
Sbjct: 361 ITFIVRGIMYPLTKAQYTSMAKMRMLQPKIQAMRERLGDDKQRISQEMMALYKAEKVNPL 420

Query: 421 GGCFPLLIQMPIFLALYYMLMGSVELRQAPFALWIHDLSAQDPYYILPILMGVTMFFIQK 480
GGCFPLLIQMPIFLALYYMLMGSVELRQAPFALWIHDLSAQDPYYILPILMGVTMFFIQK
Sbjct: 421 GGCFPLLIQMPIFLALYYMLMGSVELRQAPFALWIHDLSAQDPYYILPILMGVTMFFIQK 480

Query: 481 MSPTTVTDPMQQKIMTFMPVIFTVFFLWFPSGLVLYYIVSNLVTIIQQQLIYRGLEKRGL 540
MSPTTVTDPMQQKIMTFMPVIFTVFFLWFPSGLVLYYIVSNLVTIIQQQLIYRGLEKRGL
Sbjct: 481 MSPTTVTDPMQQKIMTFMPVIFTVFFLWFPSGLVLYYIVSNLVTIIQQQLIYRGLEKRGL 540

Query: 541 HSREKKKS 548
HSREKKKS
Sbjct: 541 HSREKKKS 548


17  ECs4599  ECs4540  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
ECs4599   1       18       -3.103300   hypothetical protein
ECs4598   1       20       -3.368060   DNA-binding protein
ECs4597   0       20       -3.152850   hypothetical protein
ECs4596   0       21       -2.906196   ribonucleoside transporter
ECs5540   0       26       -7.339483   hypothetical protein
ECs4595   0       32       -8.282803   hypothetical protein
ECs4594   2       35       -9.293101   permease transporter
ECs4593   4       44       -12.057364  hypothetical protein
ECs4592   4       46       -13.347586  hypothetical protein
ECs4591   6       49       -14.732984  hypothetical protein
ECs4590   7       51       -16.414200  protein EspG
ECs4588   9       51       -19.213295  hypothetical protein
ECs4587   9       52       -19.795351  hypothetical protein
ECs4586   9       53       -19.455690  hypothetical protein
ECs4585   9       51       -18.931035  hypothetical protein
ECs4584   7       52       -19.261636  hypothetical protein
ECs4583   7       51       -18.367479  type III secretion system protein
ECs4582   7       50       -17.951096  EscS
ECs4581   8       50       -18.055484  EscT
ECs4580   5       47       -16.675178  secretion system apparatus protein SsaU
ECs4579   4       47       -16.255180  hypothetical protein
ECs4578   4       44       -15.214344  negative regulator GrlR
ECs4577   3       44       -14.354901  hypothetical protein
ECs4576   3       45       -13.382067  CesD
ECs4575   2       45       -13.846926  EscC
ECs4574   2       43       -12.345273  SepD
ECs4573   2       41       -10.709467  EscJ
ECs4572   2       42       -10.651001  hypothetical protein
ECs4571   3       42       -11.420631  SepZ
ECs4570   2       43       -12.332796  hypothetical protein
ECs4569   3       43       -11.637633  hypothetical protein
ECs4568   4       43       -11.963515  EscN
ECs4567   7       47       -15.070716  hypothetical protein
ECs4566   6       44       -13.544856  hypothetical protein
ECs4565   5       43       -8.734516   SepQ
ECs4564   5       43       -8.428188   hypothetical protein
ECs5539   2       38       -8.495107   hypothetical protein
ECs4563   2       37       -9.050648   hypothetical protein
ECs4562   3       38       -9.589192   hypothetical protein
ECs4561   3       39       -9.859670   hypothetical protein
ECs4560   2       38       -10.705008  protein CesT
ECs4559   2       38       -9.534758   gamma intimin
ECs4558   3       39       -10.015661  EscD
ECs4557   4       42       -9.625848   SepL
ECs4556   1       41       -7.922943   protein EspA
ECs4555   4       40       -5.193177   protein EspD
ECs4554   3       40       -5.338675   protein EspB
ECs4553   1       24       -0.172057   hypothetical protein
ECs4552   1       22       2.050669    protein EscF
ECs4551   1       23       2.905788    hypothetical protein
ECs4550   1       24       4.279905    protein EspF
ECs4548   2       25       4.313421    hypothetical protein
ECs4547   4       26       4.211880    hypothetical protein
ECs4546   4       26       2.040885    hypothetical protein
ECs4545   4       23       1.094585    hypothetical protein
ECs4542   5       25       0.917568    hypothetical protein
ECs4541   4       25       0.945630    hypothetical protein
ECs4540   4       23       0.742659    hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs4596  TCRTETB  34  8e-04  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 34.1 bits (78), Expect = 8e-04
Identities = 29/163 (17%), Positives = 60/163 (36%), Gaps = 18/163 (11%)

Query: 43 LTPMAQDLGISEG-----VAGQSVTVTAFVAMFASLFITQTIQATDRRYVVILFAVLLTL 97
L +A D +T + A++ L +D+ + L + +
Sbjct: 37 LPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKL--------SDQLGIKRLLLFGIII 88

Query: 98 SCL--LVSFAN--SFSLLLIGRACLGLALGGFWAMSASLTMRLVPPRTVPKALSVIFGAV 153
+C ++ F FSLL++ R G F A+ + R +P KA +I V
Sbjct: 89 NCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIV 148

Query: 154 SIALVIAAPLGCFLGELIGWRNVFNAAAAMGVLCIFWIIKSLP 196
++ + +G + I W + ++ + +++K L
Sbjct: 149 AMGEGVGPAIGGMIAHYIHWSYLLLIPMIT-IITVPFLMKLLK 190


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs4590  PF06872  724  0.0  EspG protein
		>PF06872#EspG protein

Length = 398

Score = 724 bits (1871), Expect = 0.0
Identities = 293/397 (73%), Positives = 338/397 (85%)

Query: 1 MILVAKLFITNQIGESLMINGLNNDSASLVLDAAMKVNSGFKKSWDEMSCAEKLFKVLSF 60
MILV K+F+ ++ + M+NGLNN+SASLVLDA +K+NS +KK W+EM+CAEKL K+L+
Sbjct: 1 MILVIKIFVIDETERAFMLNGLNNNSASLVLDATIKINSDYKKPWNEMTCAEKLLKILTL 60

Query: 61 GLWNPTYSRSERQSFQELLTVLEPVYPLPNELGRVSARFSDGSSLRISVTNSELVEAEIR 120
GLWNP YS+ ERQ FQ LLTVLEPV P NELGRV A+FSDGSSLRISVTNSEL+EAEI
Sbjct: 61 GLWNPKYSQDERQQFQGLLTVLEPVSPAHNELGRVYAKFSDGSSLRISVTNSELIEAEIH 120

Query: 121 TANNEKITVLLESNEQNRLLQSLPIDRHMPYIQVHRALSEMDLTDTTSMRNLLGFTSKLS 180
T NNEK VLLE+NEQNRLLQSLPI+RHMPYIQVH L + +LTD SM LL FTSKLS
Sbjct: 121 TPNNEKFLVLLEANEQNRLLQSLPINRHMPYIQVHHTLPQEELTDLLSMHKLLSFTSKLS 180

Query: 181 TTLIPHNAQTDPLSGPTPFSSIFMDTCRGLGNAKLSLNGVDIPANAQKLLRDALGLKDTH 240
TLIPHN QTDPLSG TPFS++FMDT RGLGN+KLSLNGVDIPA+AQKLLR+ LGLKDT+
Sbjct: 181 ATLIPHNNQTDPLSGLTPFSTVFMDTSRGLGNSKLSLNGVDIPADAQKLLRNTLGLKDTN 240

Query: 241 SSPTRNVIDHGISRHDAEQIARESSGSDKQKAEVVEFLCHPEAATAICSAFYQSFNVPAL 300
SSP NVI +GI RH AEQI +ESS +++QKA VV+FLC PEA TAICSAFYQSFNVPAL
Sbjct: 241 SSPDLNVIRNGIPRHYAEQIVKESSSTNEQKAAVVDFLCQPEAPTAICSAFYQSFNVPAL 300

Query: 301 TLTHERISKASEYNAERSLDTPNACINISISQSSDGNIYVTSHTGVLIMAPEDRPNEMGM 360
LTH RIS+AS YNA+RSLD PNACINISI+QSS+G+I+VTSHTGVLIMAPEDRPN++GM
Sbjct: 301 MLTHVRISQASAYNAQRSLDMPNACINISITQSSEGSIHVTSHTGVLIMAPEDRPNQLGM 360

Query: 361 LTNRTSYEVPQGVKCIIDEMVSALQPRYAASETYLQN 397
LTNRTSYEVP GVKC +EM L+ +YA+SETYL N
Sbjct: 361 LTNRTSYEVPPGVKCEPNEMARMLKAKYASSETYLNN 397


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs4583  TYPE3IMPPROT  222  5e-76  Type III secretion system inner membrane P protein ...
		>TYPE3IMPPROT#Type III secretion system inner membrane P protein family signature.

Length = 224

Score = 222 bits (568), Expect = 5e-76
Identities = 89/212 (41%), Positives = 136/212 (64%), Gaps = 9/212 (4%)

Query: 12 IFLIIVFFLLSLLPIFVGIGTSFLKISIVLGILKNALGIQQVPPNMALTSVSLILTMFIM 71
I LI + +LLP + GT F+K SIV +++NALG+QQ+P NM L V+L+L+MF+M
Sbjct: 5 ISLIALLAFSTLLPFIIASGTCFVKFSIVFVMVRNALGLQQIPSNMTLNGVALLLSMFVM 64

Query: 72 SPIILQINDNISQEPINYTDSDFFQKVDEKILSPYRGFLEKNTEKDNVEFFERAAQKKLG 131
PI+ E + + D K ++ L YR +L K ++++ V+FFE A K+
Sbjct: 65 WPIMHDAYVYFEDEDVTFNDISSLSKHVDEGLDGYRDYLIKYSDRELVQFFENAQLKRQY 124

Query: 132 NETI---------LKKDSLFILLPAFTMGQLEAAFKIGFLLYLPFIAIDLIISNILLALG 182
E ++K S+F LLPA+ + ++++AFKIGF LYLPF+ +DL++S++LLALG
Sbjct: 125 GEETETVKRDKDEIEKPSIFALLPAYALSEIKSAFKIGFYLYLPFVVVDLVVSSVLLALG 184

Query: 183 MMMVSPVTISIPFKILLFILVGGWQKLFEFLL 214
MMM+SPVTIS P K++LF+ + GW L + L+
Sbjct: 185 MMMMSPVTISTPIKLVLFVALDGWTLLSKGLI 216


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs4582  TYPE3IMQPROT  69  2e-19  Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.

Length = 86

Score = 69.0 bits (169), Expect = 2e-19
Identities = 25/78 (32%), Positives = 45/78 (57%)

Query: 7 VQLCVQTFWIIFILSLPTVIAASVIGIIISLVQAITQLQDQTLPFLLKIIAVFATLALTY 66
V + +++ ILS I A++IG+++ L Q +TQLQ+QTLPF +K++ V L L
Sbjct: 5 VFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLFLLS 64

Query: 67 HWMGTTIINFSSIIFEMI 84
W G ++++ + +
Sbjct: 65 GWYGEVLLSYGRQVIFLA 82


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs4581  TYPE3IMRPROT  155  1e-48  Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 155 bits (394), Expect = 1e-48
Identities = 46/230 (20%), Positives = 102/230 (44%), Gaps = 4/230 (1%)

Query: 11 SFYCILRPLGMFIILPIFSTGVLLSNFIRNSIMIAFTLPIIVENYTFSEKLPSGIFQLTG 70
F+ +LR L + PI S + R + +A + + + +P F
Sbjct: 16 YFWPLLRVLALISTAPILSERSVPK---RVKLGLAMMITFAIAPSLPANDVPVFSFFALW 72

Query: 71 IALKEISIGFFIGLSFTILFWAIDAAGQIIDTLRGSTISSIFNPSISDSSSITGVILYQF 130
+A+++I IG +G + F A+ AG+II G + ++ +P+ + + I+
Sbjct: 73 LAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHLNMPVLARIMDML 132

Query: 131 ISVIFVIHGGIQSILDKLYLSYEILPLQADIAFNRALIDFLFSLWDSFIKLMLSFSVPMI 190
++F+ G ++ L ++ LP+ + + A + + F+ L ++P+I
Sbjct: 133 ALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIFL-NGLMLALPLI 191

Query: 191 IGIFLCDMGFGFLNKTAPQLNVFTLSLPVKSLIAIFILLLVIHVFPDFIT 240
+ ++ G LN+ APQL++F + P+ + I ++ ++ + F
Sbjct: 192 TLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFCE 241


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs4580  TYPE3IMSPROT  376  e-132  Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 376 bits (967), Expect = e-132
Identities = 123/339 (36%), Positives = 195/339 (57%), Gaps = 4/339 (1%)

Query: 2 SEKTEKPTPKKLRDLKKKGDVTKSEEVMAAVQSLILFSFFSLYGMS--FFVDIVGLVNTT 59
EKTE+PTPKK+RD +KKG V KS+EV++ LI+ L G+S +F L+
Sbjct: 3 GEKTEQPTPKKIRDARKKGQVAKSKEVVSTA--LIVALSAMLMGLSDYYFEHFSKLMLIP 60

Query: 60 IDSLNRPFLYAIREILGAVLNIFLLYILPISLIVFVGTVTTGVSQIGFIFAVEKIKPSAQ 119
+ PF A+ ++ VL F P+ + + + + V Q GF+ + E IKP +
Sbjct: 61 AEQSYLPFSQALSYVVDNVLLEFFYLCFPLLTVAALMAIASHVVQYGFLISGEAIKPDIK 120

Query: 120 KISVKNNLKNIFSVKSIFELLKSVFKLVIIVLIFYFMGHSYANEFANFTGLNAYQALVVV 179
KI+ K IFS+KS+ E LKS+ K+V++ ++ + + ++
Sbjct: 121 KINPIEGAKRIFSIKSLVEFLKSILKVVLLSILIWIIIKGNLVTLLQLPTCGIECITPLL 180

Query: 180 AFFVFLLWKGVLFGYLLFSVFDFWFQKHEGLKKMKMSKDEVKREAKDTDGNPEIKGERRR 239
+ L G+++ S+ D+ F+ ++ +K++KMSKDE+KRE K+ +G+PEIK +RR+
Sbjct: 181 GQILRQLMVICTVGFVVISIADYAFEYYQYIKELKMSKDEIKREYKEMEGSPEIKSKRRQ 240

Query: 240 LHSEIQSGSLANNIKKSTVIVKNPTHIAICLYYKLGETPLPLVIETGKDAKALQIIKLAE 299
H EIQS ++ N+K+S+V+V NPTHIAI + YK GETPLPLV DA+ + K+AE
Sbjct: 241 FHQEIQSRNMRENVKRSSVVVANPTHIAIGILYKRGETPLPLVTFKYTDAQVQTVRKIAE 300

Query: 300 LYDIPVIEDIPLARTLYKNIHKGQYITEDFFEPVAQLIR 338
+P+++ IPLAR LY + YI + E A+++R
Sbjct: 301 EEGVPILQRIPLARALYWDALVDHYIPAEQIEATAEVLR 339


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs4579  OMPTIN  26  0.048  Omptin serine protease signature.
		>OMPTIN#Omptin serine protease signature.

Length = 317

Score = 26.5 bits (58), Expect = 0.048
Identities = 13/38 (34%), Positives = 17/38 (44%), Gaps = 4/38 (10%)

Query: 115 AYNAGYFNTPNAVELRRQYAMKIYKTYNKLKNNEQIID 152
A NAGY+ TPNA + Y + K N + D
Sbjct: 254 AVNAGYYVTPNA----KVYVEGAWNRVTNKKGNTSLYD 287


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs4576  SYCDCHAPRONE  139  4e-45  Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 139 bits (352), Expect = 4e-45
Identities = 33/142 (23%), Positives = 63/142 (44%)

Query: 6 SSLEDIYDFYQDGGTLASLTNLTQQDLNDLHSYAYTAYQSGDVITARNLFHLLTYLEHWN 65
+ F + GGT+A L ++ L L+S A+ YQSG A +F L L+H++
Sbjct: 10 EYQLAMESFLKGGGTIAMLNEISSDTLEQLYSLAFNQYQSGKYEDAHKVFQALCVLDHYD 69

Query: 66 YDYTLSLGLCHQRLSNHEDAQLCFARCATLVMQDPRASYYSGISYLLVGNKKMAKKAFKA 125
+ L LG C Q + ++ A ++ A + +++PR +++ L G A+
Sbjct: 70 SRFFLGLGACRQAMGQYDLAIHSYSYGAIMDIKEPRFPFHAAECLLQKGELAEAESGLFL 129

Query: 126 CLMWCNEKEKYTTYKENIKKLL 147
+K ++ + +L
Sbjct: 130 AQELIADKTEFKELSTRVSSML 151


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs4575  TYPE3OMGPROT  559  0.0  Type III secretion system outer membrane G protein ...
		>TYPE3OMGPROT#Type III secretion system outer membrane G protein family signature.

Length = 607

Score = 559 bits (1441), Expect = 0.0
Identities = 153/494 (30%), Positives = 262/494 (53%), Gaps = 24/494 (4%)

Query: 30 KSEYFIITKSSPVRAILNDFAANYSIPVFISSSVNDDFSGEIKNEKPVKVLEKLSKLYHL 89
Y + K +R +L DF ANY V +S +ND SG+ +++ P L+ ++ LY+L
Sbjct: 33 PIPYVYVAKGESLRDLLTDFGANYDATVVVSDKINDKVSGQFEHDNPQDFLQHIASLYNL 92

Query: 90 TWYYDENILYIYKTNEISRSIITPTYLDIDSLLKYLSDTISVNKNSCNVRKITTFNSIEV 149
WYYD N+LYI+K +E++ +I + L + L + + R + + V
Sbjct: 93 VWYYDGNVLYIFKNSEVASRLIRLQESEAAELKQALQR-SGIWEPRFGWRPDASNRLVYV 151

Query: 150 RGVPECIKYITSLSESLDKEAQSKAKNKD--VVKVFKLNYASATDITYKYRDQNVVVPGV 207
G P ++ + + +L+++ Q +++ +++F L YASA+D T YRD V PGV
Sbjct: 152 SGPPRYLELVEQTAAALEQQTQIRSEKTGALAIEIFPLKYASASDRTIHYRDDEVAAPGV 211

Query: 208 VSILKTMASNGSLP--STGKGAVERSGNLFDNSVTISADPRLNAVVVKDREITMDIYQQL 265
+IL+ + S+ ++ + + ++ + ADP LNA++V+D M +YQ+L
Sbjct: 212 ATILQRVLSDATIQQVTVDNQRIPQAATRASAQARVEADPSLNAIIVRDSPERMPMYQRL 271

Query: 266 ISELDIEQRQIEISVSIIDVDANDLQQLGVNWSGTLNAGQGTIA--------FNSSTAQA 317
I LD +IE+++SI+D++A+ L +LGV+W + G N ++ A
Sbjct: 272 IHALDKPSARIEVALSIVDINADQLTELGVDWRVGIRTGNNHQVVIKTTGDQSNIASNGA 331

Query: 318 NISSSVISNASNFMIRVNALQQNSKAKILSQPSIITLNNMQAILDKNVTFYTKVSGEKVA 377
S + RVN L+ A+++S+P+++T N QA++D + T+Y KV+G++VA
Sbjct: 332 LGSLVDARGLDYLLARVNLLENEGSAQVVSRPTLLTQENAQAVIDHSETYYVKVTGKEVA 391

Query: 378 SLESITSGTLLRVTPRILDDSSNSLTGKRRERVRLLLDIQDGNQSTNQSNAQDASSTLPE 437
L+ IT GT+LR+TPR+L S + L L I+DGNQ N S + +P
Sbjct: 392 ELKGITYGTMLRMTPRVLTQGDKS-------EISLNLHIEDGNQKPNSSGIE----GIPT 440

Query: 438 VQNSEMTTEATLSAGESLLLGGFIQDKESSSKDGIPLLSDIPVIGSLFSSTVKQKHSVVR 497
+ + + T A + G+SL++GG +D+ S + +PLL DIP IG+LF + VR
Sbjct: 441 ISRTVVDTVARVGHGQSLIIGGIYRDELSVALSKVPLLGDIPYIGALFRRKSELTRRTVR 500

Query: 498 LFLIKATPIKSASS 511
LF+I+ I +
Sbjct: 501 LFIIEPRIIDEGIA 514


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs4573  FLGMRINGFLIF  56  1e-11  Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 55.7 bits (134), Expect = 1e-11
Identities = 32/166 (19%), Positives = 58/166 (34%), Gaps = 10/166 (6%)

Query: 22 EQLYTGLTEKEANQMQALLLSNDVNVSKEMDKSGNMTLSVEKEDFVRAITILNNNGFPKK 81
L++ L++++ + A L N+ + V + L G PK
Sbjct: 51 RTLFSNLSDQDGGAIVAQLTQM--NIPYRFANGSG-AIEVPADKVHELRLRLAQQGLPKG 107

Query: 82 KFADIEVIFPPSQLVASPSQENAKINYLKEQDIERLLSKIPGVIDCSVSLNVNNN----- 136
E + + S E E ++ R + + V V L +
Sbjct: 108 GAVGFE-LLDQEKFGISQFSEQVNYQRALEGELARTIETLGPVKSARVHLAMPKPSLFVR 166

Query: 137 ESQPSSAAVLVISSPEVNLAPSVIQ-IKNLVKNSVDDLKLENISVV 181
E + SA+V V P L I + +LV ++V L N+++V
Sbjct: 167 EQKSPSASVTVTLEPGRALDEGQISAVVHLVSSAVAGLPPGNVTLV 212


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs4563  PF06704  36  6e-06  DspF/AvrF protein
		>PF06704#DspF/AvrF protein

Length = 129

Score = 36.4 bits (84), Expect = 6e-06
Identities = 21/119 (17%), Positives = 51/119 (42%), Gaps = 4/119 (3%)

Query: 3 EKFRTDLAHTFGIALEEQTDVLSFHDNDGHEW-ILECASQSEILFFYCYLLNSESIQINS 61
+ L G +L Q V + +D+ +E ++E SE++ F+C + S +
Sbjct: 9 SRLIKSLGAQLGTSLTAQNGVCALYDSQDNEAAVIEMPDHSEMVIFHCRVGRSPDRAADL 68

Query: 62 ILEMNSNRELLGMF--FLSLKDDNILLNIAFPADKIDITEFANLMENGYLLKNEIIRSL 118
++ N ++ M + ++ ++ L +D +F + G++++ R+L
Sbjct: 69 QKLLSLNFDVARMHGSWFAVDQGDVRLCAQRELAVLDEAQFCDTA-RGFIVQAREARAL 126


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs4561  TRNSINTIMINR  731  0.0  Translocated intimin receptor (Tir) signature.
		>TRNSINTIMINR#Translocated intimin receptor (Tir) signature.

Length = 549

Score = 731 bits (1887), Expect = 0.0
Identities = 328/566 (57%), Positives = 390/566 (68%), Gaps = 25/566 (4%)

Query: 1 MPIGNLGHNPNVNNSIPPAPPLPSQTDGA--GGRGQLINSTGPLGSRALFTPVRNSMADS 58
MPIGNLG+N N N+ IPPAPPLPSQTDGA GG G LI+STG LGSR+LF+P+RNSMADS
Sbjct: 1 MPIGNLGNNVNGNHLIPPAPPLPSQTDGAARGGTGHLISSTGALGSRSLFSPLRNSMADS 60

Query: 59 GDNRASDVPGLPVNPMRLAA--SEITLNDGFEVLHDHGPLDTLNRQIGSSVFRVETQEDG 116
D+R D+PGLP NP RLAA SE L GFEVLHD GPLD LN QIG S FRVE Q DG
Sbjct: 61 VDSR--DIPGLPTNPSRLAAATSETCLLGGFEVLHDKGPLDILNTQIGPSAFRVEVQADG 118

Query: 117 KHIAVGQRNGVETSVVLSDQEYARLQSIDPEGKDKFVFTGGRGGAGHAMVTVASDITEAR 176
H A+G++NG+E SV LS QE++ LQSID EGK++FVFTGGRGG+GH MVTVASDI EAR
Sbjct: 119 THAAIGEKNGLEVSVTLSPQEWSSLQSIDTEGKNRFVFTGGRGGSGHPMVTVASDIAEAR 178

Query: 177 QRILELLEPKGTGESK-GAGESKGVGELRESNSGAENTTETQTSTSTSSLRSDPKLWLAL 235
+IL L+P G + +++ VG S +ET TST+ SS+RSDPK W+++
Sbjct: 179 TKILAKLDPDNHGGRQPKDVDTRSVGVGSASGIDDGVVSETHTSTTNSSVRSDPKFWVSV 238

Query: 236 GTVATGLIGLAATGIVQALALTPEPDSPTTTDPDAAASATETATRDQLTKEAFQNPDNQK 295
G +A GL GLAATGI QALALTPEPD PTTTDPD AA+A E+AT+DQLT+EAF+NP+NQK
Sbjct: 239 GAIAAGLAGLAATGIAQALALTPEPDDPTTTDPDQAANAAESATKDQLTQEAFKNPENQK 298

Query: 296 VNIDELGNAIPSGVLKDDVVANIEEQAKAAGEEAKQQAIENNAQAQKKYDEQQAKRQEEL 355
VNID GNAIPSG LKDD+V I +QAK AGE A+QQA+E+NAQAQ++Y++Q A+RQEEL
Sbjct: 299 VNIDANGNAIPSGELKDDIVEQIAQQAKEAGEVARQQAVESNAQAQQRYEDQHARRQEEL 358

Query: 356 KVSSGAGYGLSGALILGGGIGVAVTAALHRKNQPVEQTTTTTTTTTTTSARTVENKPANN 415
++SSG GYGLS ALI+ GGIG VT ALHR+NQP EQTTTTTT TV +
Sbjct: 359 QLSSGIGYGLSSALIVAGGIGAGVTTALHRRNQPAEQTTTTTT-------HTVVQQQTGG 411

Query: 416 TPAQGNVDTPGSEDTMESRRSSMASTSSTFFDTSSIGTVQNPYADV---KTSLHDSQVPT 472
P P RR S S +ST + SS V NPYA+V + SL Q
Sbjct: 412 IPQHKVALMPQERRRFSDRRDSQGSVASTHWSDSS-SEVVNPYAEVGGARNSLSAHQPEE 470

Query: 473 SNSNTSVQNMGNTDSVVYSTIQHPPRDTTDNGARLLGNPSAGIQSTYARLALSGGLRHDM 532
+ + G YS IQ+ G RL+G P GIQSTYA LA SGGLR M
Sbjct: 471 HIYDEVAADPG------YSVIQNFSGSGPVTG-RLIGTPGQGIQSTYALLANSGGLRLGM 523

Query: 533 GGLTGGSNSAVNTSNNPPAPGSHRFV 558
GGLT G +AV++ N P PG RFV
Sbjct: 524 GGLTSGGETAVSSVNAAPTPGPVRFV 549


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs4560  PF05932  122  4e-39  Tir chaperone protein (CesT)
		>PF05932#Tir chaperone protein (CesT)

Length = 127

Score = 122 bits (309), Expect = 4e-39
Identities = 24/125 (19%), Positives = 52/125 (41%), Gaps = 5/125 (4%)

Query: 1 MSSRS-ELLLEKFAEKIGIGSISFNENRLCSFAIDEIYYISLS-DANDEYMMIYGVCGKF 58
MS+ + LL+ F+ + + + F+++ C+ ID + ++LS D E +++ G+
Sbjct: 1 MSNLFYKTLLDDFSRSLEMQPLVFDDHGTCNMIIDNTFALTLSCDYARERLLLIGLLEPH 60

Query: 59 PTDNSNFALEILNANLWFAENGGPYLCYEAGAQSLLLALRFPLDDATPEKLENEIEVVVK 118
+L L N GP L + + P + + L+ E+ +++
Sbjct: 61 KD---IPQQCLLAGALNPLLNAGPGLGLDEKSGLYHAYQSIPREKLSVPTLKREMAGLLE 117

Query: 119 SMENL 123
M
Sbjct: 118 WMRGW 122


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs4559  INTIMIN  1459  0.0  Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 1459 bits (3777), Expect = 0.0
Identities = 780/942 (82%), Positives = 837/942 (88%), Gaps = 11/942 (1%)

Query: 1 MITHGCYTRTRHKHKLKKTLIMLSAGLGLFFYVNQNSFANGENYFKLGSDSKLLTHDSYQ 60
MITHG Y RTRHKHKLKKT IMLSAGLGLFFYVNQNSFANGENYFKLGSDSKLLTH+SYQ
Sbjct: 1 MITHGFYARTRHKHKLKKTFIMLSAGLGLFFYVNQNSFANGENYFKLGSDSKLLTHNSYQ 60

Query: 61 NRLFYTLKTGETVADLSKSQDINLSTIWSLNKHLYSSESEMMKAAPGQQIILPLKKLPFE 120
NRLFYTLKTGETVADLSKSQDINLSTIWSLNKHLYSSESEMMKA PGQQIILPLKKLPFE
Sbjct: 61 NRLFYTLKTGETVADLSKSQDINLSTIWSLNKHLYSSESEMMKAEPGQQIILPLKKLPFE 120

Query: 121 YSALPLLGSAPLVAAGGVAGHTNKLTKMSPDVTKSNMTDDKALNYAAQQAASLGSQLQSR 180
YSALPLLGSAPLVAAGGVAGHTNKLTKMSPDVTKSNMTDDKALNYAAQQAASLGSQLQSR
Sbjct: 121 YSALPLLGSAPLVAAGGVAGHTNKLTKMSPDVTKSNMTDDKALNYAAQQAASLGSQLQSR 180

Query: 181 SLNGDYAKDTALGIAGNQASSQLQAWLQHYGTAEVNLQSGNNFDGSSLDFLLPFYDSEKM 240
SLNGDYAKDTALGIAGNQASSQLQAWLQHYGTAEVNLQSGNNFDGSSLDFLLPFYDSEKM
Sbjct: 181 SLNGDYAKDTALGIAGNQASSQLQAWLQHYGTAEVNLQSGNNFDGSSLDFLLPFYDSEKM 240

Query: 241 LAFGQVGARYIDSRFTANLGAGQRFFLPANMLGYNVFIDQDFSGDNTRLGIGGEYWRDYF 300
LAFGQVGARYIDSRFTANLGAGQRFFLP NMLGYNVFIDQDFSGDNTRLGIGGEYWRDYF
Sbjct: 241 LAFGQVGARYIDSRFTANLGAGQRFFLPENMLGYNVFIDQDFSGDNTRLGIGGEYWRDYF 300

Query: 301 KSSVNGYFRMSGWHESYNKKDYDERPANGFDIRFNGYLPSYPALGAKLIYEQYYGDNVAL 360
KSSVNGYFRMSGWHESYNKKDYDERPANGFDIRFNGYLPSYPALGAKL+YEQYYGDNVAL
Sbjct: 301 KSSVNGYFRMSGWHESYNKKDYDERPANGFDIRFNGYLPSYPALGAKLMYEQYYGDNVAL 360

Query: 361 FNSDKLQSNPGAATVGVNYTPIPLVTMGIDYRHGTGNENDLLYSMQFRYQFDKSWSQQIE 420
FNSDKLQSNPGAATVGVNYTPIPLVTMGIDYRHGTGNENDLLYSMQFRYQFDK WSQQIE
Sbjct: 361 FNSDKLQSNPGAATVGVNYTPIPLVTMGIDYRHGTGNENDLLYSMQFRYQFDKPWSQQIE 420

Query: 421 PQYVNELRTLSGSRYDLVQRNNNIILEYKKQDILSLNIPHDINGTEHSTQKIQLIVKSKY 480
PQYVNELRTLSGSRYDLVQRNNNIILEYKKQDILSLNIPHDINGTE STQKIQLIVKSKY
Sbjct: 421 PQYVNELRTLSGSRYDLVQRNNNIILEYKKQDILSLNIPHDINGTERSTQKIQLIVKSKY 480

Query: 481 GLDRIVWDDSALRSQGGQIQHSGSQSAQDYQAILPAYVQGGSNIYKVTARAYDRNGNSSN 540
GLDRIVWDDSALRSQGGQIQHSGSQSAQDYQAILPAYVQGGSN+YKVTARAYDRNGNSSN
Sbjct: 481 GLDRIVWDDSALRSQGGQIQHSGSQSAQDYQAILPAYVQGGSNVYKVTARAYDRNGNSSN 540

Query: 541 NVQLTITVLSNGQVVDQVGVTDFTADKTSAKADNADTITYTATVKKNGVAQANVPVSFNI 600
NV LTITVLSNGQVVDQVGVTDFTADKTSAKAD + ITYTATVKKNGVAQANVPVSFNI
Sbjct: 541 NVLLTITVLSNGQVVDQVGVTDFTADKTSAKADGTEAITYTATVKKNGVAQANVPVSFNI 600

Query: 601 VSGTATLGANSAKTDANGKATVTLKSSTPGQVVVSAKTAEMTSALNASAVIFFDQTKASI 660
VSGTA L ANSA T+ +GKATVTLKS PGQVVVSAKTAEMTSALNA+AVIF DQTKASI
Sbjct: 601 VSGTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAKTAEMTSALNANAVIFVDQTKASI 660

Query: 661 TEIKADKTTAVANGKDAIKYTVKVMKNGQPVNNQSVTFSTNFGMFNGKSQTQATTGNDGR 720
TEIKADKTTAVANG+DAI YTVKVMK +PV+NQ VTF+T G + + T +G
Sbjct: 661 TEIKADKTTAVANGQDAITYTVKVMKGDKPVSNQEVTFTTTLGKLS---NSTEKTDTNGY 717

Query: 721 ATITLTSSSAGKATVSATVSDGA-EVKATEVTFFDELKID-NKVDIIGNNVRGELPNIWL 778
A +TLTS++ GK+ VSA VSD A +VKA EV FF L ID ++I+G V+G+LP +WL
Sbjct: 718 AKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFFTTLTIDDGNIEIVGTGVKGKLPTVWL 777

Query: 779 QYGQFKLKASGGDGTYSWYSENTSIATVDA-SGKVTLNGKGSVVIKATSGDKQTVSYTIK 837
QYGQ LKASGG+G Y+W S N +IA+VDA SG+VTL KG+ I S D QT +YTI
Sbjct: 778 QYGQVNLKASGGNGKYTWRSANPAIASVDASSGQVTLKEKGTTTISVISSDNQTATYTIA 837

Query: 838 APSYMI--KVDKQAYYADAMSICKNL---LPSTQTVLSDIYDSWGAANKYSHYSSMNSIT 892
P+ +I + K+ Y DA++ CKN LPS+Q L +++ +WGAANKY +Y S +I
Sbjct: 838 TPNSLIVPNMSKRVTYNDAVNTCKNFGGKLPSSQNELENVFKAWGAANKYEYYKSSQTII 897

Query: 893 AWIKQTSSEQRSGVSSTYNLITQNPLPGVNVNTPNVYAVCVE 934
+W++QT+ + +SGV+STY+L+ QNPL + + N YA CV+
Sbjct: 898 SWVQQTAQDAKSGVASTYDLVKQNPLNNIKASESNAYATCVK 939


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs4557  PF07201  28  0.047  Hypersensitivity response secretion protein HrpJ
		>PF07201#Hypersensitivity response secretion protein HrpJ

Length = 293

Score = 28.3 bits (63), Expect = 0.047
Identities = 43/225 (19%), Positives = 73/225 (32%), Gaps = 23/225 (10%)

Query: 39 SPLINLQNELAMITSSSLSETIEGLSLGYRK---GSARKEEEGSTIEKLLNDMQELLTLT 95
+ ++ E+ SE E LSL RK AR + + + L+ + EL
Sbjct: 47 QSIADMAEEVTF----VFSERKE-LSLDKRKLSDSQARVSDVEEQVNQYLSKVPEL---E 98

Query: 96 DSDKIKELS--LKNSGL--LEQHDPTLAMFGNMPKGEIVALISSLLQSK--FVKIELKKK 149
+ EL L NS L Q L P + L K L
Sbjct: 99 QKQNVSELLSLLSNSPNISLSQLKAYLEGKSEEPSEQFKMLCGLRDALKGRPELAHLSHL 158

Query: 150 YARLLLDLLGEDDWELAL-----LSWLGVGELNQEGIQKIKKLYEKAKDEDSENGASLLD 204
+ L+ + E + L + +Q ++ Y A + ++
Sbjct: 159 VEQALVSMAEEQGETIVLGARITPEAYRESQSGVNPLQPLRDTYRDAV-MGYQGIYAIWS 217

Query: 205 WFMEIKDLPEREKHLKVIIRALSFDLSYMSSFEDKVKTSSIISDL 249
+ + + + + +ALS DL S + K +ISDL
Sbjct: 218 DLQKRFPNGDIDSVILFLQKALSADLQSQQSGSGREKLGIVISDL 262


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs4555  BACINVASINB  30  0.020  Salmonella/Shigella invasin protein B signature.
		>BACINVASINB#Salmonella/Shigella invasin protein B signature.

Length = 593

Score = 29.7 bits (66), Expect = 0.020
Identities = 29/102 (28%), Positives = 52/102 (50%), Gaps = 9/102 (8%)

Query: 112 MMMVTLLSLDTSAQKVSSLKNSNEIY---MDGQTKALENKTQEYKKQLEEQQKAEEKSQK 168
M+M + + SL+N ++ +G+ +E K+ E++ EE +KAEE ++
Sbjct: 258 MLMAMFIEI-VGKNTEESLQNDLALFNALQEGRQAEMEKKSAEFQ---EETRKAEETNRI 313

Query: 169 SKIVGQVFGWLGVALTAVAAVFNPALWAVVAIGATAMALQTA 210
+G+V G L ++ VAAVF A +A+ A +A+ A
Sbjct: 314 MGCIGKVLGALLTIVSVVAAVFTGG--ASLALAAVGLAVMVA 353


18  ECs4509  ECs4500  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
ECs4509  1  19  -5.128684  phosphopantetheine adenylyltransferase
ECs4508  1  25  -7.520211  3-deoxy-D-manno-octulosonic-acid transferase
ECs4507  1  35  -12.127524  lipopolysaccharide core biosynthesis protein
ECs4506  1  40  -13.963631  glucosyltransferase I
ECs4505  3  45  -16.457715  RfaP protein
ECs4504  3  45  -15.595407  UDP-D-galactose:(glucosyl)lipopolysaccharide-
ECs4503  2  36  -11.752739  lipopolysaccharide core biosynthesis protein
ECs4502  1  26  -9.051321  UDP-glucose:(galactosyl) LPS
ECs4501  0  25  -7.904874  lipopolysaccharide 1,2-N-
ECs4500  -1  15  -4.651828  lipid A-core:surface polymer ligase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs4509  LPSBIOSNTHSS  247  2e-87  Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein signature.

Length = 166

Score = 247 bits (631), Expect = 2e-87
Identities = 77/154 (50%), Positives = 111/154 (72%)

Query: 5 AIYPGTFDPITNGHIDIVTRATQMFDHVILAIAASPSKKPMFTLEERVALAQQATAHLGN 64
AIYPG+FDPIT GH+DI+ R ++FD V +A+ +P+K+PMF+++ER+ +A AHL N
Sbjct: 3 AIYPGSFDPITFGHLDIIERGCRLFDQVYVAVLRNPNKQPMFSVQERLEQIAKAIAHLPN 62

Query: 65 VEVVGFSDLMANFARNQHATVLIRGLRAVADFEYEMQLAHMNRHLMPELESVFLMPSKEW 124
+V F L N+AR + A ++RGLR ++DFE E+Q+A+ N+ L +LE+VFL S E+
Sbjct: 63 AQVDSFEGLTVNYARQRQAGAILRGLRVLSDFELELQMANTNKTLASDLETVFLTTSTEY 122

Query: 125 SFISSSLVKEVARHQGDVTHFLPENVHQALMAKL 158
SF+SSSLVKEVAR G+V HF+P +V AL +
Sbjct: 123 SFLSSSLVKEVARFGGNVEHFVPSHVAAALYDQF 156


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs4504  RTXTOXINA  33  0.003  Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 32.6 bits (74), Expect = 0.003
Identities = 25/117 (21%), Positives = 45/117 (38%), Gaps = 10/117 (8%)

Query: 60 HVFTDYISDKDKLYFSDL-------AKQYNSRINIYVINCDKLKSLPSTKNWTYATYFRF 112
H+ D +DKL +D+ ++ N I + S+ T+ +F
Sbjct: 860 HIIDDDGGKEDKLSLADIDFRDVAFKREGNDLIMYKGEG--NVLSIGHKNGITFRNWFEK 917

Query: 113 IIADYFYHKHEKILYLDADIACKGSIKELLDYQFSTNEIAAVVAERDVEWWQNRASV 169
D H+ E+I I S+K+ L+YQ N A+ V D + ++ +
Sbjct: 918 ESGDISNHEIEQIFDKSGRIITPDSLKKALEYQ-QRNNKASYVYGNDALAYGSQGDL 973


19  ECs4393  ECs4377  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
ECs4393  -2  20  -5.632329  multidrug efflux system protein MdtE
ECs5533  2  24  -9.311457  hypothetical protein
ECs4392  0  25  -8.380379  hypothetical protein
ECs4391  3  24  -4.475500  acid-resistance membrane protein
ECs4390  2  23  -2.570716  acid stress chaperone HdeA
ECs4389  1  21  -1.801285  acid-resistance protein
ECs4388  0  20  -0.661964  Mg(2+) transport ATPase
ECs4387  -1  20  0.394426  hemin importer ATP-binding protein
ECs4386  -1  20  0.185016  hemin permease
ECs4385  -3  15  -0.296354  ShuY-like protein
ECs4384  -2  14  -1.284475  ShuX-like protein
ECs4383  -3  14  -2.302794  coproporphyrinogen III oxidase
ECs4382  -2  15  -3.231181  hemin binding protein
ECs4381  -1  18  -6.568405  hypothetical protein
ECs4380  -1  16  -5.643375  heme utilization/transport protein
ECs4379  -2  19  -5.445960  hypothetical protein
ECs4378  -2  18  -5.893156  hypothetical protein
ECs4377  -1  14  -3.094065  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs4393  RTXTOXIND  51  3e-09  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 51.4 bits (123), Expect = 3e-09
Identities = 41/218 (18%), Positives = 70/218 (32%), Gaps = 33/218 (15%)

Query: 97 LQAELNSAKGSLAKALSTASNARITFNRQASLLKTNYVSR-QDYDT-ARTQLNEAEANVT 154
+ + A L S + K Y Q + +L + N+
Sbjct: 257 QENKYVEAVNELRVYKSQLEQIE----SEILSAKEEYQLVTQLFKNEILDKLRQTTDNIG 312

Query: 155 VAKAAVEQATINLQYANVTSPITGVSGKSSV-TVGALVTANQADSLVTVQRLDPIYVDLT 213
+ + + Q + + +P++ + V T G +VT + +V V D + V
Sbjct: 313 LLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAET-LMVIVPEDDTLEVTAL 371

Query: 214 QSVQDFLRMKEEVASGQIKQVQGSTPVQLNLE--NGKRY-SQTGTLK--FSDPTVDETTG 268
+D I + + +E RY G +K D D+ G
Sbjct: 372 VQNKD------------IGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDAIEDQRLG 419

Query: 269 SVT--LRAI------FPNPNGDLLPGMYVTALVDEGSR 298
V + +I N N L GM VTA + G R
Sbjct: 420 LVFNVIISIEENCLSTGNKNIPLSSGMAVTAEIKTGMR 457



Score = 32.5 bits (74), Expect = 0.003
Identities = 22/139 (15%), Positives = 48/139 (34%), Gaps = 24/139 (17%)

Query: 53 PGRTVPY-EVAEIRPQVGGIIIKRNFI-EGDKVNQGDSLYQIDPAPLQAELNSAKGSLAK 110
G+ EI+P I+ K + EG+ V +GD L ++ +A+ + SL +
Sbjct: 87 NGKLTHSGRSKEIKPIENSIV-KEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQ 145

Query: 111 ALSTASNAR-------------ITFNRQASL--------LKTNYVSRQDYDTARTQLNEA 149
A + + + + L+ + ++ + T + Q +
Sbjct: 146 ARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQK 205

Query: 150 EANVTVAKAAVEQATINLQ 168
E N+ +A +
Sbjct: 206 ELNLDKKRAERLTVLARIN 224


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs4387  PF05272  28  0.029  Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 28.5 bits (63), Expect = 0.029
Identities = 10/23 (43%), Positives = 13/23 (56%), Gaps = 1/23 (4%)

Query: 28 EIVAIL-GPNGAGKSTLLRQLTG 49
+ +L G G GKSTL+ L G
Sbjct: 596 DYSVVLEGTGGIGKSTLINTLVG 618


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs4382  FERRIBNDNGPP  31  0.004  Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 31.5 bits (71), Expect = 0.004
Identities = 41/210 (19%), Positives = 72/210 (34%), Gaps = 25/210 (11%)

Query: 29 VKRKKLFTAVLALSWAF--------SVTAAERIVVAGGSLTELIYAMGAGERVVGVDETT 80
+ R++L TA +ALS + RIV EL+ A+G GV +T
Sbjct: 7 ISRRRLLTA-MALSPLLWQMNTAHAAAIDPNRIVALEWLPVELLLALGIVP--YGVADTI 63

Query: 81 SY------PPETAKLPHIGYWKQLSSEGILSLRPDSVITWQDAGPQIVLDQL-RAQKVNV 133
+Y PP + +G + + E + ++P ++ GP + L R
Sbjct: 64 NYRLWVSEPPLPDSVIDVGLRTEPNLELLTEMKPSFMVWSAGYGP--SPEMLARIAPGRG 121

Query: 134 VTLPRVPATLQQMYANIRQLAKTLQVPEQGEALVTQISQRLERVQQNVATKKAPVKAMFI 193
L ++ ++A L + E + Q + ++ + A +
Sbjct: 122 FNFSDGKQPLAMARKSLTEMADLLNLQSAAETHLAQYEDFIRSMKPRFVKRGARPLLLTT 181

Query: 194 LSAGGSAPQ--VAGKGSVADAILSLAGAEN 221
L V G S+ IL G N
Sbjct: 182 LI---DPRHMLVFGPNSLFQEILDEYGIPN 208


20  ECs4367  ECs4316  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
ECs4367  -1  18  -4.763276  hypothetical protein
ECs4366  0  23  -6.984857  universal stress protein UspB
ECs4365  -1  20  -5.193889  low-affinity phosphate transport protein
ECs4364  -1  16  -3.528755  hypothetical protein
ECs4363  1  17  -3.771301  hypothetical protein
ECs4362  1  17  -3.683417  hypothetical protein
ECs4361  -1  14  -0.811106  hypothetical protein
ECs4360  0  17  2.488543  hypothetical protein
ECs4359  1  17  1.233101  ABC transporter ATP-binding protein
ECs4358  0  16  -0.338269  transporter
ECs4357  1  16  -1.267228  HicA-like protein
ECs4356  1  16  -1.224622  HicB-like protein
ECs4355  2  17  -0.814085  fructose-1,6-bisphosphate aldolase
ECs4354  2  19  -2.021864  phosphotransferase system HPr enzyme
ECs4353  1  17  -1.445754  sugar kinase
ECs4352  2  21  -1.755328  phosphotransferase system enzyme IIC
ECs4351  1  21  0.354060  phosphotransferase system enzyme IIB
ECs4350  -1  20  2.711130  phosphotransferase system enzyme IIA
ECs4349  0  22  4.066321  GntR family transcriptional regulator
ECs4348  -1  25  5.393498  nickel responsive regulator
ECs4347  -1  24  5.446251  nickel transporter ATP-binding protein NikE
ECs4346  1  21  5.117885  nickel transporter ATP-binding protein NikD
ECs4345  0  21  4.672961  nickel transporter permease NikC
ECs4344  -1  20  4.272103  nickel transporter permease NikB
ECs4343  -1  20  4.761460  periplasmic binding protein for nickel
ECs4342  0  21  5.818522  4'-phosphopantetheinyl transferase
ECs4341  1  21  6.113295  3-oxoacyl-ACP synthase
ECs4340  1  25  6.378626  3-ketoacyl-ACP reductase
ECs4339  1  25  6.274832  beta-hydroxydecanoyl-ACP dehydrase
ECs4338  2  23  6.096232  3-oxoacyl-ACP synthase
ECs4337  2  24  5.403556  lipoprotein
ECs4336  2  19  5.069838  hypothetical protein
ECs4335  1  18  4.310446  hypothetical protein
ECs4334  2  19  4.097145  hypothetical protein
ECs4333  3  18  4.051977  hypothetical protein
ECs4332  1  17  2.141832  (3R)-hydroxymyristoyl-ACP dehydratase
ECs4331  2  15  2.213355  surfactin synthetase
ECs4330  0  16  -0.813570  hypothetical protein
ECs4329  0  19  -2.167959  acyl carrier protein
ECs4328  0  18  -1.656859  acyl carrier protein
ECs4327  2  18  -0.224057  acyltransferase
ECs4326  1  16  -0.109024  hypothetical protein
ECs4325  1  14  -0.570175  O-methyltransferase
ECs4324  1  15  0.809049  lipoprotein
ECs4323  -1  15  3.383002  hypothetical protein
ECs4322  -2  15  4.213026  major facilitator superfamily transporter
ECs4321  -1  15  3.638715  hypothetical protein
ECs4320  -1  14  3.437628  hypothetical protein
ECs4319  0  16  3.879751  sulfur transfer protein SirA
ECs4318  2  12  3.490345  zinc/cadmium/mercury/lead-transporting ATPase
ECs4317  2  12  1.887548  hypothetical protein
ECs4316  2  14  1.477402  receptor
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs4364  ALARACEMASE  29  0.033  Alanine racemase signature.
		>ALARACEMASE#Alanine racemase signature.

Length = 356

Score = 29.0 bits (65), Expect = 0.033
Identities = 23/98 (23%), Positives = 38/98 (38%), Gaps = 18/98 (18%)

Query: 226 ENLLFTHRGLSGPAVLQISSYWQPGEFVSINLLPDVDLETFL--NEQRNAHPNQSLKNTL 283
E + RG GP +L + ++ + + + L T + N Q A N LK L
Sbjct: 63 EAITLRERGWKGP-ILMLEGFFHAQD---LEIYDQHRLTTCVHSNWQLKALQNARLKAPL 118

Query: 284 AVHL------------PKRLVERLQQLGQIPDVSLKQL 309
++L P R++ QQL + +V L
Sbjct: 119 DIYLKVNSGMNRLGFQPDRVLTVWQQLRAMANVGEMTL 156


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs4360  RTXTOXIND  83  8e-20  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 83.3 bits (206), Expect = 8e-20
Identities = 72/408 (17%), Positives = 138/408 (33%), Gaps = 81/408 (19%)

Query: 6 RHLAWWGVGLLAVAAIVAWWLLRPAGVP-EGFAVSNGRIEATEVDIASKIAGRIDTILVK 64
R +A++ +G L +A I++ G +GR + I + I+VK
Sbjct: 58 RLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKE----IKPIENSIVKEIIVK 113

Query: 65 EGQFVREGEVLAKMDTRV----------------LQEQRLEAI----------------- 91
EG+ VR+G+VL K+ L++ R + +
Sbjct: 114 EGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDE 173

Query: 92 -------------------AQIKEAQSAVAAAQALLEQRQSETRAAQSLVNQRQAELDSV 132
Q Q+ + L+++++E + +N+ +
Sbjct: 174 PYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVE 233

Query: 133 AKRHTRSRSLAQRGAISAQQLDDDRAAAESARAALESAKAQVSASKAAIEAARTNIIQ-- 190
R SL + AI+ + + A L K+Q+ ++ I +A+
Sbjct: 234 KSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVT 293

Query: 191 -----------AQTRVEAAQATERRIAADID--DSELKAPRDGRV-QYRVAEPGEVLAAG 236
QT T + S ++AP +V Q +V G V+
Sbjct: 294 QLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTA 353

Query: 237 GRVLNMVDLSDVY-MTFFLPTEQAGTLKLGGEARLILDAAPDLRIPATISFVASVAQFTP 295
++ +V D +T + + G + +G A + ++A P R V V
Sbjct: 354 ETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYG---YLVGKVKNINL 410

Query: 296 KTVETSDERLKLMFRVKARIPPELLQQHLEYV--KTGLPGVAWVRVNE 341
+E D+RL L+F V I L + + +G+ A ++
Sbjct: 411 DAIE--DQRLGLVFNVIISIEENCLSTGNKNIPLSSGMAVTAEIKTGM 456


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs4359  PF05272  30  0.043  Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.4 bits (68), Expect = 0.043
Identities = 9/26 (34%), Positives = 14/26 (53%)

Query: 20 ARCMVGLIGPDGVGKSSLLSLISGAR 45
V L G G+GKS+L++ + G
Sbjct: 595 FDYSVVLEGTGGIGKSTLINTLVGLD 620


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs4358  ABC2TRNSPORT  51  2e-09  ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 51.1 bits (122), Expect = 2e-09
Identities = 41/171 (23%), Positives = 73/171 (42%), Gaps = 7/171 (4%)

Query: 200 REREHGTVEHLLVMPITPFEIMMAKI-WSMGLVVLVVSGLSLVLMVKGVLGVPIEGSIPL 258
R T E +L + +I++ ++ W+ L +G+ +V G + L
Sbjct: 93 RMEGQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGAGIGVVAAALGY----TQWLSLL 148

Query: 259 FMLGV-ALSLFATTSIGIFMGTIARSMPQLGLLVILVLLPLQMLSGGSTPRESMPQMVQD 317
+ L V AL+ A S+G+ + +A S LV+ P+ LSG P + +P + Q
Sbjct: 149 YALPVIALTGLAFASLGMVVTALAPSYDYFIFYQTLVITPILFLSGAVFPVDQLPIVFQT 208

Query: 318 IMLIMPTTHFVSLAQAILYRGAGFEIVWPQFLTLMAIGGAFF-TIALLRFR 367
+P +H + L + I+ ++ + I FF + ALLR R
Sbjct: 209 AARFLPLSHSIDLIRPIMLGHPVVDVCQHVGALCIYIVIPFFLSTALLRRR 259


ORFs having significant similarity with Known Virulence factors
LocusTag Hits          Score  E-value  Comments
ECs4340  DHBDHDRGNASE  93     5e-25    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 93.2 bits (231), Expect = 5e-25
Identities = 63/251 (25%), Positives = 119/251 (47%), Gaps = 15/251 (5%)

Query: 3 RSVLVTGASKGIGRAIACQLAADGFNI-GVHYHRDATGAQETLNAIVANGGNGRLLSFDV 61
+ +TGA++GIG A+A LA+ G +I V Y+ + + ++ A + DV
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVS--SLKAEARHAEAFPADV 66

Query: 62 ANREQCREVLEHEIAQHGAWYGVVSNAGIARDAAFPALSDDDWDAVIHTNLDSFYNVIQP 121
+ E+ + G +V+ AG+ R +LSD++W+A N +N +
Sbjct: 67 RDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRS 126

Query: 122 CIMPMIGARQGGRIITLSSVSGVMGNRGQVNYSAAKAGIIGATKALAIELAKRKITVNCI 181
+ + R+ G I+T+ S + Y+++KA + TK L +ELA+ I N +
Sbjct: 127 -VSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIV 185

Query: 182 APGLIDTGMIEM-------EESALKEAMSM----IPMKRMGQAEEVAGLASYLMSDIAGY 230
+PG +T M E +K ++ IP+K++ + ++A +L+S AG+
Sbjct: 186 SPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGH 245

Query: 231 VTRQVISINGG 241
+T + ++GG
Sbjct: 246 ITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag Hits          Score  E-value  Comments
ECs4336  ACRIFLAVINRP  49     7e-08    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 48.7 bits (116), Expect = 7e-08
Identities = 37/167 (22%), Positives = 77/167 (46%), Gaps = 12/167 (7%)

Query: 223 YSDYASQQAKQDISTLGVATLLGVILLIVAVFRSLRPLLLCVISIGIGALAGTVATLLIF 282
+ + + + TL A +L V L++ +++R L+ I++ + L GT A L F
Sbjct: 329 TTPFVQLSIHEVVKTLFEAIML-VFLVMYLFLQNMRATLIPTIAVPV-VLLGTFAILAAF 386

Query: 283 G-ELHLMTLVMSMSVIGISADYTLYYL--TERMVHGNDVSPWQ----SLAKVRNALLLAL 335
G ++ +T+ + IG+ D + + ER++ + + P + S+++++ AL+
Sbjct: 387 GYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIA 446

Query: 336 LTTVAAYL-IMMLAPFPGI--RQMAIFAAVGLSASCLTVLFWHPWLC 379
+ A ++ + G RQ +I ++ S L L P LC
Sbjct: 447 MVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALSVLVALILTPALC 493



Score = 41.7 bits (98), Expect = 1e-05
Identities = 35/199 (17%), Positives = 71/199 (35%), Gaps = 31/199 (15%)

Query: 569 LVPVEGVKSSALMQEIATYYPCGIAWV---DRKSTFDELFALYRYVLTGLLLVALAVIAC 625
L + +K A + E+ ++P G+ + D +++ V T + L +
Sbjct: 300 LDTAKAIK--AKLAELQPFFPQGMKVLYPYDTTPFVQL--SIHEVVKTLFEAIMLVFLVM 355

Query: 626 GAVARLGWRKGLISLVPSVLSLGCGLAVLAMSGQAVNLFSLLALVLVLGIGI-------- 677
+ R LI + + L A+LA G ++N ++ +VL +G+ +
Sbjct: 356 YLFLQ-NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVE 414

Query: 678 NYTLFFSNPRGTPLT-----------SLLAIALAMLTTLLTLGMLVFSATQAISSFGIVL 726
N + P +L+ IA+ + + + S F I +
Sbjct: 415 NVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITI 474

Query: 727 VSGI----FTAFLLSPLAM 741
VS + A +L+P
Sbjct: 475 VSAMALSVLVALILTPALC 493



Score = 36.7 bits (85), Expect = 4e-04
Identities = 38/224 (16%), Positives = 75/224 (33%), Gaps = 33/224 (14%)

Query: 544 EWLASPASEGWRLLWLTLENGESGVLV---PVEGVKSS---ALMQEIATYYPCGIAWVDR 597
W+ L NG + + G S ALM+ +A+ P GI D
Sbjct: 807 HWVYGSPR-------LERYNGLPSMEIQGEAAPGTSSGDAMALMENLASKLPAGI-GYDW 858

Query: 598 KSTFDELFALYRYVLTGLLLVALAVIACGAVARLGWRKGLISLVPSVLSLGCGLAVLAMS 657
+ + + + V C A W + ++ L + L +
Sbjct: 859 TGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLF 918

Query: 658 GQAVNLFSLLALVLVLGIG-------INYTLFFSNPRGTPL---------TSLLAIALAM 701
Q +++ ++ L+ +G+ + + G + L I +
Sbjct: 919 NQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTS 978

Query: 702 LTTLLTLGMLVFSAT---QAISSFGIVLVSGIFTAFLLSPLAMP 742
L +L + L S A ++ GI ++ G+ +A LL+ +P
Sbjct: 979 LAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIFFVP 1022


ORFs having significant similarity with Known Virulence factors
LocusTag Hits          Score  E-value  Comments
ECs4322  TCRTETA       51     6e-09    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 50.6 bits (121), Expect = 6e-09
Identities = 79/398 (19%), Positives = 146/398 (36%), Gaps = 32/398 (8%)

Query: 13 LRLNLRILSIVMFNFASYLTIGLPLAVLPGYVHDVM--GFSAFWAGLVISLQYFATLLSR 70
++ N ++ I+ + IGL + VLPG + D++ G++++L
Sbjct: 1 MKPNRPLIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACA 60

Query: 71 PHAGRYADLLGPKKIVVFGLCGCFLSGLGYLTAGLTASLPVISLLLLCLGRVILGI-GQS 129
P G +D G + +++ L G + + Y L V L +GR++ GI G +
Sbjct: 61 PVLGALSDRFGRRPVLLVSLAG---AAVDYAIMATAPFLWV-----LYIGRIVAGITGAT 112

Query: 130 FAGTGSTLWGVGVVGSL--HIGRVISWNDIVTYGAMAMGAPLGVVFYHWGGLQALALIIM 187
A G+ + + H G + + +G +G H A AL +
Sbjct: 113 GAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGL 172

Query: 188 GVALVAILLAIPRPTVK--ASKGKPLPFRAVLGRVWLYGMALALA-----SAGFGVIATF 240
LL + + P + + +A +A V A
Sbjct: 173 NFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAAL 232

Query: 241 ITLFYDVK-GWDGAAFALTLFSCAFVGT---RLLFPNGINRIGGLNVAMICFSVEIIGLL 296
+F + + WD ++L + + + ++ R+G M+ + G +
Sbjct: 233 WVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYI 292

Query: 297 LVGVATMPWMAKIG-VLLAGAGFSLVFPALGVVAVKAVPQQNQGAALATYTVFMDLSLGV 355
L+ AT WMA VLLA G + PAL + + V ++ QG + L+ +
Sbjct: 293 LLAFATRGWMAFPIMVLLASGGIGM--PALQAMLSRQVDEERQGQLQGSLAALTSLT-SI 349

Query: 356 TGPLAGLVMSWAGVPV----IYLAAAGLVAIALLLTWR 389
GPL + A + ++A A L + L R
Sbjct: 350 VGPLLFTAIYAASITTWNGWAWIAGAALYLLCLPALRR 387


ORFs having significant similarity with Known Virulence factors
LocusTag Hits          Score  E-value  Comments
ECs4319  PF01206       105    3e-34    SirA family protein
		>PF01206#SirA family protein

Length = 76

Score = 105 bits (265), Expect = 3e-34
Identities = 24/72 (33%), Positives = 41/72 (56%)

Query: 9 DHTLDALGLRCPEPVMMVRKTVRNMQPGETLLIIADDPATTRDIPGFCTFMEHELVAKET 68
D +LDA GL CP P++ +KT+ M GE L ++A DP + +D F HEL+ ++
Sbjct: 5 DQSLDATGLNCPLPILKAKKTLATMNAGEVLYVMATDPGSVKDFESFSKQTGHELLEQKE 64

Query: 69 DGLPYRYLIRKG 80
+ Y + +++
Sbjct: 65 EDGTYHFRLKRA 76


21  ECs4306  ECs4291  Y        Y        N        Pathogenicity Island (biased-composition)
LocusTag DNBias  CDNBias  %GCBias     Product
ECs4306  -1      23       3.904332    hypothetical protein
ECs4305  -1      21       3.153190    periplasmic binding protein of high-affinity
ECs4304  -1      22       3.023963    branched-chain amino acid transporter permease
ECs4303  -1      22       3.029580    leucine/isoleucine/valine transporter permease
ECs4302  -2      23       2.408240    ABC transporter ATP-binding protein
ECs4301  -1      23       3.146541    ABC transporter ATP-binding protein
ECs4300  -2      21       3.085301    hypothetical protein
ECs4299  -2      22       3.376879    glycerol-3-phosphate transporter periplasmic
ECs4298  -2      21       3.620203    glycerol-3-phosphate transporter permease
ECs4297  -1      17       1.517603    glycerol-3-phosphate transporter membrane
ECs4296  0       16       -2.085608   glycerol-3-phosphate transporter ATP-binding
ECs4295  0       18       -3.803139   glycerophosphodiester phosphodiesterase
ECs4294  0       17       -4.411001   hypothetical protein
ECs4293  0       15       -4.004171   gamma-glutamyltranspeptidase
ECs4292  0       17       -5.228783   hypothetical protein
ECs4291  0       16       -3.372629   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag Hits          Score  E-value  Comments
ECs4299  MALTOSEBP     39     2e-05    Maltose binding protein signature.
		>MALTOSEBP#Maltose binding protein signature.

Length = 396

Score = 39.3 bits (91), Expect = 2e-05
Identities = 39/160 (24%), Positives = 66/160 (41%), Gaps = 14/160 (8%)

Query: 134 GHLLSQPFNSSTPVLYYNKDAFKKAGLDPEQPPKTWQDLADYAAKLKASGMKCGYASGWQ 193
G L++ P L YNKD PPKTW+++ +LKA G + +
Sbjct: 127 GKLIAYPIAVEALSLIYNKDLLP-------NPPKTWEEIPALDKELKAKGKSALMFNLQE 179

Query: 194 GWIQLENFSAWNGLPFASKNNGFDGTDAVLEF--NKPEQVKHIAMLEEMNKKGDFSYVGR 251
+ +A G F +N +D D ++ K + +++ + D Y
Sbjct: 180 PYFTWPLIAADGGYAFKYENGKYDIKDVGVDNAGAKAGLTFLVDLIKNKHMNADTDY--- 236

Query: 252 KDESTEKFYNGDCAMTTASSGSLANIREYAKFNYGVGMMP 291
+ F G+ AMT + +NI + +K NYGV ++P
Sbjct: 237 -SIAEAAFNKGETAMTINGPWAWSNI-DTSKVNYGVTVLP 274


ORFs having significant similarity with Known Virulence factors
LocusTag Hits          Score  E-value  Comments
ECs4296  PF05272       31     0.010    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.8 bits (69), Expect = 0.010
Identities = 11/35 (31%), Positives = 19/35 (54%)

Query: 33 IVMVGPSGCGKSTLLRMVAGLERVTEGDICINDQR 67
+V+ G G GKSTL+ + GL+ ++ I +
Sbjct: 599 VVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGTGK 633


ORFs having significant similarity with Known Virulence factors
LocusTag Hits          Score  E-value  Comments
ECs4295  PF04619       30     0.004    Dr-family adhesin
		>PF04619#Dr-family adhesin

Length = 160

Score = 30.3 bits (68), Expect = 0.004
Identities = 13/63 (20%), Positives = 23/63 (36%), Gaps = 4/63 (6%)

Query: 29 VGAKYGHKMIEFDAKLSKDGEIFLLHDDNLERTSNGWGVAGELNWQD----LLRVDAGSW 84
+G ++ D + G+ FL+ D+N ++ W + D GSW
Sbjct: 70 LGCDARQVALKADTDNFEQGKFFLISDNNRDKLYVNIRPTDNSAWTTDNGVFYKNDVGSW 129

Query: 85 YGK 87
G
Sbjct: 130 GGI 132


ORFs having significant similarity with Known Virulence factors
LocusTag Hits          Score  E-value  Comments
ECs4293  NAFLGMOTY     32     0.007    Sodium-type flagellar protein MotY precursor signature.
		>NAFLGMOTY#Sodium-type flagellar protein MotY precursor signature.

Length = 293

Score = 31.6 bits (71), Expect = 0.007
Identities = 27/82 (32%), Positives = 37/82 (45%), Gaps = 17/82 (20%)

Query: 276 RTPISGDYRGYQVYSMPPPSSGGIHIVQILNI--LENFDMKKYGF-GSADAMQIMAEAEK 332
R P+ G+ R + SMPPP G H +I N+ + FD G+ G A I++E EK
Sbjct: 77 RRPM-GETRNVSLISMPPPWRPGEHADRITNLKFFKQFD----GYVGGQTAWGILSELEK 131

Query: 333 YAYADRSEYLGDPDFVKVPWQA 354
Y P F WQ+
Sbjct: 132 GRY---------PTFSYQDWQS 144


22  ECs4096  ECs4084  Y        Y        N        Pathogenicity Island (biased-composition)
LocusTag DNBias  CDNBias  %GCBias     Product
ECs4096  -1      18       3.965990    N-acetylmannosamine-6-phosphate 2-epimerase
ECs4095  -2      17       3.778063    N-acetylmannosamine kinase
ECs4094  -1      16       2.441681    hypothetical protein
ECs4093  -1      16       2.640663    hypothetical protein
ECs4092  -1      17       3.019617    glutamate synthase subunit beta
ECs4091  -2      15       2.680716    glutamate synthase subunit alpha
ECs4090  -2      15       0.101070    hypothetical protein
ECs4089  0       14       -0.467997   aerobic respiration control sensor protein ArcB
ECs4088  -1      16       0.032878    isoprenoid biosynthesis protein with
ECs4087  0       18       -0.788521   monofunctional biosynthetic peptidoglycan
ECs4086  3       16       -0.273721   hypothetical protein
ECs4085  3       17       0.297196    phosphohistidinoprotein-hexose
ECs4084  2       17       0.135591    hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag Hits          Score  E-value  Comments
ECs4089  HTHFIS        65     6e-13    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 64.9 bits (158), Expect = 6e-13
Identities = 26/115 (22%), Positives = 45/115 (39%), Gaps = 4/115 (3%)

Query: 528 VLLVEDIELNVIVARSVLEKLGNSVDVAMTGKAALEMFKPGEYDLVLLDIQLPDMTGLDI 587
+L+ +D V L + G V + G+ DLV+ D+ +PD D+
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 588 SRELTKRYPREDLPPLVALTA-NVLKDKQEYLNAGMDDVLSKPLSVPALTAMIKK 641
+ K P P++ ++A N + G D L KP + L +I +
Sbjct: 66 LPRIKKARPD---LPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR 117


23  ECs4026  ECs4015  Y        Y        Y        Pathogenicity Island (biased-composition)
LocusTag DNBias  CDNBias  %GCBias     Product
ECs4026  1       21       -3.004364   fimbrial protein
ECs4025  1       22       -3.264902   transposase
ECs4024  -1      20       -4.082658   transposase
ECs4023  -1      21       -4.753073   hypothetical protein
ECs4022  -2      17       -4.023333   outer membrane protein
ECs4021  -1      17       -1.710493   chaperone
ECs4020  -1      13       0.892523    fimbrial-like protein
ECs4019  -1      15       1.750082    PTS system N-acetylgalactosamine-specific
ECs4018  0       13       2.103699    PTS system N-acetylgalactosamine-specific
ECs4017  0       15       3.245998    tagatose-bisphosphate aldolase
ECs4016  0       15       3.336760    tagatose-6-phosphate aldose/ketose isomerase
ECs4015  0       14       3.071437    N-acetylgalactosamine-6-phosphate deacetylase
ORFs having significant similarity with Known Virulence factors
LocusTag Hits          Score  E-value  Comments
ECs4022  PF00577       778    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 778 bits (2011), Expect = 0.0
Identities = 321/849 (37%), Positives = 470/849 (55%), Gaps = 48/849 (5%)

Query: 31 SGMLCTTANAEEYYFDPIMLETTKSGMQTTDLSRFSKKYAQLPGTYQVDIWLNKKKVSQK 90
+ ++ E YF+P L DLSRF PGTY+VDI+LN ++ +
Sbjct: 35 AFAAQAPLSSAELYFNPRFLAD--DPQAVADLSRFENGQELPPGTYRVDIYLNNGYMATR 92

Query: 91 KITFTAN-AEQLLQPQFTVEQLRELGIKVDEIPALAEKDDDSVINSLEQIIPGTAAEFDF 149
+TF +EQ + P T QL +G+ + + DD+ + L +I A+ D
Sbjct: 93 DVTFNTGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDACVP-LTSMIHDATAQLDV 151

Query: 150 NHQRLNLSIPQIALYRDARGYVSPSRWDDGIPTLFTNYSFTGSDNRYRQGNRSQRQYLNM 209
QRLNL+IPQ + ARGY+ P WD GI NY+F+G+ + R G S YLN+
Sbjct: 152 GQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSVQNRIGGNSHYAYLNL 211

Query: 210 QNGANFGPWRLRNYSTWTRNDQASS------WNTISSYLQRDIKALKSQLLLGESATSGS 263
Q+G N G WRLR+ +TW+ N SS W I+++L+RDI L+S+L LG+ T G
Sbjct: 212 QSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRSRLTLGDGYTQGD 271

Query: 264 IFSSYNFTGVQLASDDNMLPNSQRGFAPTVRGIANSSAIVTIRQNGYVIYQSNVPAGAFE 323
IF NF G QLASDDNMLP+SQRGFAP + GIA +A VTI+QNGY IY S VP G F
Sbjct: 272 IFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIYNSTVPPGPFT 331

Query: 324 INDLYPSSNSGDLEVTIEESDGTQRRFIQPYSSLPMMQRPGHLKYSATAGRYRADANSDS 383
IND+Y + NSGDL+VTI+E+DG+ + F PYSS+P++QR GH +YS TAG YR+
Sbjct: 332 INDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITAGEYRSGNAQQE 391

Query: 384 KEPEFAEATAIYGLNNTFTLYGGLLGSEDYYALGIGIGGTLGALGALSMDINRADTQFDN 443
K P F ++T ++GL +T+YGG ++ Y A GIG +GALGALS+D+ +A++ +
Sbjct: 392 K-PRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALSVDMTQANSTLPD 450

Query: 444 QHSFHGYQWRTQYIKDIPETNTNIAVSYYRYTNDGYFSFDEA------------------ 485
G R Y K + E+ TNI + YRY+ GYF+F +
Sbjct: 451 DSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGYNIETQDGVIQ 510

Query: 486 ----NTRNWDYNSRQKSEIQFNISQTIFDGVSLYASGSQQDYWGNNEKNRNISVGVSGQQ 541
T ++ ++ ++Q ++Q + +LY SGS Q YWG + + G++
Sbjct: 511 VKPKFTDYYNLAYNKRGKLQLTVTQQLGRTSTLYLSGSHQTYWGTSNVDEQFQAGLNTAF 570

Query: 542 WGIGYSLNYQYSRYTDQN-NDRALSLNLSIPLERWLPRSR--------VSYQMTSQKDRP 592
I ++L+Y ++ Q D+ L+LN++IP WL SY M+ +
Sbjct: 571 EDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRSDSKSQWRHASASYSMSHDLNGR 630

Query: 593 TQHEMRLDGSLLDDGRLSYSLEQSLDDDNNHNS----SVNASYRSPYGTFSAGYSYGNDS 648
+ + G+LL+D LSYS++ + NS +YR YG + GYS+ +D
Sbjct: 631 MTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGYGNANIGYSHSDDI 690

Query: 649 SQYNYGVTGGVVIHPHGVTLSQYLGNAFALIDANGASGVRIQNYPGIATDPFGYAVVPYL 708
Q YGV+GGV+ H +GVTL Q L + L+ A GA +++N G+ TD GYAV+PY
Sbjct: 691 KQLYYGVSGGVLAHANGVTLGQPLNDTVVLVKAPGAKDAKVENQTGVRTDWRGYAVLPYA 750

Query: 709 TTYQENRLSVDTTQLPDNVDLEQTTQFVVPNRGAMVAARFNANIGYRVLVTVSDRNGKPL 768
T Y+ENR+++DT L DNVDL+ VVP RGA+V A F A +G ++L+T+ N KPL
Sbjct: 751 TEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAEFKARVGIKLLMTL-THNNKPL 809

Query: 769 PFGALASNDDTGQQSIVDEGGILYLSGISSKSQSWTVRWGNQADQQCQFAFSTPDSEPTT 828
PFGA+ +++ + IV + G +YLSG+ + V+WG + + C + P
Sbjct: 810 PFGAMVTSESSQSSGIVADNGQVYLSGMPLAGK-VQVKWGEEENAHCVANYQLPPESQQQ 868

Query: 829 SVLQGTAQC 837
+ Q +A+C
Sbjct: 869 LLTQLSAEC 877


24  ECs4005  ECs3999  Y        Y        N        Pathogenicity Island (biased-composition)
LocusTag DNBias  CDNBias  %GCBias     Product
ECs4005  -1      18       -5.225917   transporter
ECs4004  1       24       -7.417944   5-keto-4-deoxy-D-glucarate aldolase
ECs4003  0       27       -9.100411   2-hydroxy-3-oxopropionate reductase
ECs4002  1       29       -10.089363  glycerate kinase
ECs4001  1       26       -10.673735  hypothetical protein
ECs4000  -1      20       -6.860083   hypothetical protein
ECs3999  -1      14       -3.456079   TdcR family transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag Hits          Score  E-value  Comments
ECs4005  TCRTETA       37     1e-04    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 37.1 bits (86), Expect = 1e-04
Identities = 60/365 (16%), Positives = 112/365 (30%), Gaps = 55/365 (15%)

Query: 52 AVSMGYIFSAFGWAYLLMQIPGGWLLDKFGSKKVYTYSLFFWSLFTFLQGFVDMFPLAWA 111
G + + + G L D+FG + V SL ++ + + +
Sbjct: 42 TAHYGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYI 101

Query: 112 GISMFFMRFMLGFSEAPSFPANARIVAAWFPTKER----GTASAIFNSAQYFSLALFSPL 167
G R + G + A +A ER G SA F + P+
Sbjct: 102 G------RIVAGITGAT-GAVAGAYIADITDGDERARHFGFMSACFGFG-----MVAGPV 149

Query: 168 LGWLTFAWGWEHVFTVMGVIG---FVLTALWIKLIHNPTDHPRMSAEELKFISENGAVVD 224
LG L + F + F+ + H P S
Sbjct: 150 LGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPESHKGERRPLRREALNPLAS------- 202

Query: 225 MDHKKPGSAAASGPKLHYIKQLLSNRMMLGVFFGQYFINTITWFFLTWFPIYLVQEKGMS 284
+ + + ++ VFF + + + I+
Sbjct: 203 ---------------FRWARGMTVVAALMAVFFIMQLVGQV---PAALWVIFGEDRFHWD 244

Query: 285 ILKVGLVASIPALCGFAGGVLGGVFSDYLIKRGLSLTLARKLPIVLGMLLAST--IILCN 342
+G+ + A G+L + + ++ L + ++LGM+ T I+L
Sbjct: 245 ATTIGISLA-------AFGILHSLAQAMITGP-VAARLGERRALMLGMIADGTGYILLAF 296

Query: 343 YTNNTTLVVMLMALAFFGKGFGALGWPVISDTAPKEIVGLCGGVFNVFGNVASIVTPLVI 402
T +++ LA G G AL ++S +E G G ++ SIV PL+
Sbjct: 297 ATRGWMAFPIMVLLASGGIGMPALQ-AMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLF 355

Query: 403 GYLVS 407
+ +
Sbjct: 356 TAIYA 360


ORFs having significant similarity with Known Virulence factors
LocusTag Hits          Score  E-value  Comments
ECs4004  PHPHTRNFRASE  34     6e-04    Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase signature.

Length = 572

Score = 34.0 bits (78), Expect = 6e-04
Identities = 15/82 (18%), Positives = 31/82 (37%), Gaps = 12/82 (14%)

Query: 144 KNITILVQIESQQGVDNVDAIAATEGVDGIFVGPSDLA----------AALGHLGNASHP 193
+I + + +E + A + VD +G +DL + +L HP
Sbjct: 423 DSIEVGIMVEIPSTAVAANLFA--KEVDFFSIGTNDLIQYTMAADRMNERVSYLYQPYHP 480

Query: 194 DVQKAIQHIFNRASAHGKPSGI 215
+ + + + A + GK G+
Sbjct: 481 AILRLVDMVIKAAHSEGKWVGM 502


25  ECs3882  ECs3850  Y        Y        Y        Pathogenicity Island (biased-composition)
LocusTag DNBias  CDNBias  %GCBias     Product
ECs3882  2       25       2.582566    hydrogenase 2 small subunit
ECs3881  3       26       2.646973    hydrogenase 2 protein HybA
ECs3880  3       24       2.527485    hydrogenase 2 b cytochrome subunit
ECs3879  0       20       2.479667    hydrogenase 2 large subunit
ECs3878  -1      17       2.027920    hydrogenase 2 maturation endopeptidase
ECs3877  -2      16       1.458254    hydrogenase 2-specific chaperone
ECs3876  -3      15       1.246776    hydrogenase nickel incorporation protein HybF
ECs3875  -2      12       1.599169    hydrogenase 2 accessory protein HypG
ECs3874  -3      12       1.561590    glutathione S-transferase
ECs3873  -1      13       2.546243    bifunctional glutathionylspermidine
ECs3872  -1      16       3.445379    low-affinity phosphate transport protein
ECs3871  3       21       4.508862    hypothetical protein
ECs3869  3       20       4.689305    hypothetical protein
ECs3868  3       21       4.543184    hypothetical protein
ECs3866  2       21       3.608833    hypothetical protein
ECs3865  2       21       -2.391533   hypothetical protein
ECs3864  3       29       -6.962816   hypothetical protein
ECs3863  4       38       -11.321603  transposase
ECs3862  4       39       -12.280628  transposase
ECs3859  6       45       -14.865466  hypothetical protein
ECs3858  9       52       -16.686679  hypothetical protein
ECs3857  9       52       -15.924884  hypothetical protein
ECs3855  7       46       -12.260501  enterotoxin
ECs5518  4       27       -1.518634   hypothetical protein
ECs3854  4       27       -0.246941   hypothetical protein
ECs3852  4       27       -0.360893   hypothetical protein
ECs3850  3       26       -1.319866   virulence-related membrane protein
ORFs having significant similarity with Known Virulence factors
LocusTag Hits          Score  E-value  Comments
ECs3877  DPTHRIATOXIN  28     0.020    Diphtheria toxin signature.
		>DPTHRIATOXIN#Diphtheria toxin signature.

Length = 567

Score = 27.8 bits (61), Expect = 0.020
Identities = 15/48 (31%), Positives = 24/48 (50%), Gaps = 1/48 (2%)

Query: 92 MTFTVGELDGVSQYLSCSLMSPLSHSMSIEEG-QRLTDDCARMILSLP 138
+ V + + + L SL PL + EE +R D +R++LSLP
Sbjct: 124 LALKVDNAETIKKELGLSLTEPLMEQVGTEEFIKRFGDGASRVVLSLP 171


ORFs having significant similarity with Known Virulence factors
LocusTag Hits          Score  E-value  Comments
ECs3872  TYPE3IMSPROT  38     7e-05    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 37.8 bits (88), Expect = 7e-05
Identities = 26/194 (13%), Positives = 58/194 (29%), Gaps = 40/194 (20%)

Query: 12 TGLLLLLALAFVLFYEAINGFHDTANAVATVIY------TRAMQPQLAVVMAAFFNFFGV 65
L++AL+ +L + F + + ++A+ + V+ FF
Sbjct: 30 VSTALIVALSAMLMGLSDYYFEHFSKLMLIPAEQSYLPFSQALSYVVDNVLLEFFYLCFP 89

Query: 66 LLGGLSVAYAIVHML-------------------PTDLLLNMGSTHGLAMVFSMLLAAII 106
LL ++ H++ P + + S L +L ++
Sbjct: 90 LLTVAALMAIASHVVQYGFLISGEAIKPDIKKINPIEGAKRIFSIKSLVEFLKSILKVVL 149

Query: 107 WNLGTWFFGLPASSSHTLIGAIIGIGLTNALLTGSSVMDALNLREVTKIFSSLIVSPIVG 166
++ W + ++ L + L + +I L+V VG
Sbjct: 150 LSILIWIIIKG------NLVTLLQ-------LPTCGIECITPL--LGQILRQLMVICTVG 194

Query: 167 LVIAGGLIFLLRRY 180
V+ + Y
Sbjct: 195 FVVISIADYAFEYY 208


ORFs having significant similarity with Known Virulence factors
LocusTag Hits          Score  E-value  Comments
ECs3850  ENTEROVIROMP  123    4e-38    Enterobacterial virulence outer membrane protein si...
		>ENTEROVIROMP#Enterobacterial virulence outer membrane protein signature.

Length = 171

Score = 123 bits (309), Expect = 4e-38
Identities = 62/183 (33%), Positives = 91/183 (49%), Gaps = 26/183 (14%)

Query: 2 ILYVMSGSRLADNHTLSAGYAQSKVQDFKN-IKGVNLQYRYEWD-SPVSVVGSFSYMKGD 59
+L +G+ +A T++ GYAQS Q N + G NL+YRYE D SP+ V+GSF+Y +
Sbjct: 13 VLAFTAGTSVAATSTVTGGYAQSDAQGQMNKMGGFNLKYRYEEDNSPLGVIGSFTYTEKS 72

Query: 60 WADSHRDEADDFYRHQADIKYYSFLAGPAYRLNDYISFYGLVGISHTKAKGDYEWRNSVG 119
R + Y +YY AGPAYR+ND+ S YG+VG+ + K +
Sbjct: 73 -----RTASSGDY---NKNQYYGITAGPAYRINDWASIYGVVGVGYGKFQ---------- 114

Query: 120 ADESDGYLSESVSKKSTDFAYAAGVIINPWGNMSVNVGYEGTKADIYGKHSVNGFTVGVG 179
E Y F+Y AG+ NP N++++ YE ++ V + GVG
Sbjct: 115 TTEYPTY---KHDTSDYGFSYGAGLQFNPMENVALDFSYEQSRI---RSVDVGTWIAGVG 168

Query: 180 YRF 182
YRF
Sbjct: 169 YRF 171


26  ECs3740  ECs3703  Y        Y        Y        Pathogenicity Island (biased-composition)
LocusTag DNBias  CDNBias  %GCBias     Product
ECs3740  -2      18       -3.082323   xanthine dehydrogenase subunit XdhB
ECs3739  -1      24       -5.174202   xanthine dehydrogenase subunit XdhA
ECs3738  2       38       -10.616106  lipoprotein
ECs3737  4       42       -13.022200  *hypothetical protein
ECs3736  5       44       -13.673776  hypothetical protein
ECs3735  3       44       -12.669787  hypothetical protein
ECs3734  3       43       -12.694400  EivF
ECs3733  3       43       -12.768154  EivG
ECs3732  3       42       -12.852139  EivE
ECs3731  4       44       -12.911889  EivA
ECs3730  3       44       -12.904103  ATP synthase SpaL
ECs3729  3       47       -15.442340  EivI
ECs3728  5       48       -16.408164  hypothetical protein
ECs3727  5       50       -16.626739  EivJ
ECs3726  5       50       -17.398535  surface presentation of antigens protein SpaO
ECs3725  4       49       -17.460393  surface presentation of antigens protein SpaP
ECs3724  5       51       -17.058324  EpaQ
ECs3721  6       51       -16.947572  surface presentation of antigens protein SpaS
ECs3720  5       54       -16.424061  transcriptional regulator
ECs3719  6       53       -16.544305  EprH
ECs3718  6       54       -16.375723  EprI
ECs3717  6       53       -16.429736  EprJ
ECs3716  6       52       -16.997828  EprK
ECs3715  6       52       -17.678796  hypothetical protein
ECs3714  7       51       -17.499182  hypothetical protein
ECs3713  6       49       -17.411316  hypothetical protein
ECs3712  7       50       -17.509244  hypothetical protein
ECs3711  7       50       -18.018029  hypothetical protein
ECs3710  7       51       -18.311457  hypothetical protein
ECs3709  5       46       -17.011481  transcriptional regulator
ECs3708  3       45       -16.513638  hypothetical protein
ECs3707  2       37       -13.494221  hypothetical protein
ECs3706  0       31       -9.122492   hypothetical protein
ECs3705  1       25       -7.225830   hypothetical protein
ECs3704  1       20       -5.036157   sensory transducer
ECs3703  0       15       -3.057496   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag Hits          Score  E-value  Comments
ECs3738  RTXTOXIND     37     4e-05    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 37.1 bits (86), Expect = 4e-05
Identities = 18/82 (21%), Positives = 31/82 (37%), Gaps = 12/82 (14%)

Query: 160 AAGAGKVVYVGNQLRGYGNLIMIKHSEDYITAYAHNDTMLVNNGQSVKAGQKIATMGSTD 219
A GK+ + G IK E+ I ++V G+SV+ G + + +
Sbjct: 84 ATANGKLTHSGRSK-------EIKPIENSIV-----KEIIVKEGESVRKGDVLLKLTALG 131

Query: 220 AASVRLHFQIRYRATAIDPLRY 241
A + L Q ++ RY
Sbjct: 132 AEADTLKTQSSLLQARLEQTRY 153


ORFs having significant similarity with Known Virulence factors
LocusTag Hits          Score  E-value  Comments
ECs3733  TYPE3OMGPROT  448    e-154    Type III secretion system outer membrane G protein ...
		>TYPE3OMGPROT#Type III secretion system outer membrane G protein family signature.

Length = 607

Score = 448 bits (1155), Expect = e-154
Identities = 158/536 (29%), Positives = 271/536 (50%), Gaps = 54/536 (10%)

Query: 34 YVANKENLRSFFETVSSYAGKPTIVSKLAMKKQISGNFDLTEPYALIERLSAQMGLIWYD 93
YVA E+LR + +VS K +SG F+ P ++ +++ L+WY
Sbjct: 38 YVAKGESLRDLLTDFGANYDATVVVSDKINDK-VSGQFEHDNPQDFLQHIASLYNLVWYY 96

Query: 94 DGKAIYIYDSSEMRNALINLRKVSTNEFNNFLKKSGLYNSRYEIKGD-GNGTFYVSGPPV 152
DG +YI+ +SE+ + LI L++ E L++SG++ R+ + D N YVSGPP
Sbjct: 97 DGNVLYIFKNSEVASRLIRLQESEAAELKQALQRSGIWEPRFGWRPDASNRLVYVSGPPR 156

Query: 153 YVDLVVNAAKLMEQNSD--GIEIGRNKVGIIHLVNTFVNDRTYELRGEKIVIPGMAKVLS 210
Y++LV A +EQ + + G + I L +DRT R +++ PG+A +L
Sbjct: 157 YLELVEQTAAALEQQTQIRSEKTGALAIEIFPLKYASASDRTIHYRDDEVAAPGVATILQ 216

Query: 211 TLLNNNIKQSTGVNVLSEISSRQQLKNVSRMPPFPGAEEDDDLQVEKIISTAGAPETDDI 270
+L++ ++ QQ+ ++ P A +
Sbjct: 217 RVLSD--------------ATIQQVTVDNQRIPQ-----------------AATRASAQA 245

Query: 271 QIIAYPDTNSLLVKGTVSQVDFIEKLVATLDIPKRHIELSLWIIDIDKTDLEQLGADWSG 330
++ A P N+++V+ + ++ ++L+ LD P IE++L I+DI+ L +LG DW
Sbjct: 246 RVEADPSLNAIIVRDSPERMPMYQRLIHALDKPSARIEVALSIVDINADQLTELGVDWRV 305

Query: 331 TIKIGSSLSASFNNSG----------SISTLDG---TQFIATIQALAQKRRAAVVARPVV 377
I+ G++ +G S +D +A + L + A VV+RP +
Sbjct: 306 GIRTGNNHQVVIKTTGDQSNIASNGALGSLVDARGLDYLLARVNLLENEGSAQVVSRPTL 365

Query: 378 LTQENIPAIFDNNRTFYTKLVGERTAELDEVTYGTMISVLPRFAARN---QIELLLNIED 434
LTQEN A+ D++ T+Y K+ G+ AEL +TYGTM+ + PR + +I L L+IED
Sbjct: 366 LTQENAQAVIDHSETYYVKVTGKEVAELKGITYGTMLRMTPRVLTQGDKSEISLNLHIED 425

Query: 435 GNEINSDKTNVDDLPQVGRTLISTIARVPQGKSLLIGGYTRDTNTYESRKIPILGSIPFI 494
GN+ + + ++ +P + RT++ T+ARV G+SL+IGG RD + K+P+LG IP+I
Sbjct: 426 GNQ-KPNSSGIEGIPTISRTVVDTVARVGHGQSLIIGGIYRDELSVALSKVPLLGDIPYI 484

Query: 495 GKLFGYEGTNANNIVRVFLIEPREIDERMMNNANEAAVDARAITQQMAKNKEINDE 550
G LF + VR+F+IEPR IDE + ++ A + + + + EI+++
Sbjct: 485 GALFRRKSELTRRTVRLFIIEPRIIDEGIAHHL--ALGNGQDLRTGILTVDEISNQ 538


ORFs having significant similarity with Known Virulence factors
LocusTag Hits          Score  E-value  Comments
ECs3732  INVEPROTEIN   240    2e-78    Salmonella/Shigella invasion protein E (InvE) signat...
		>INVEPROTEIN#Salmonella/Shigella invasion protein E (InvE) signature.

Length = 372

Score = 240 bits (613), Expect = 2e-78
Identities = 128/321 (39%), Positives = 195/321 (60%)

Query: 14 AREVSRLEDIITEDNEDIEAEMPKMRDDPAGKEARFLQATDEMSAALTQFMKKKIYEEQL 73
+R+ S + D + E + P + +F+Q+TDEMSAAL QF ++ YE++
Sbjct: 16 SRQASHQDATQHTDAQQAEIQQAAEDSSPGAEVQKFVQSTDEMSAALAQFRNRRDYEKKS 75

Query: 74 ANFLDGEEYVLEDQPIEKTDKVMEALKAATTHDYEVYSFAKKLFPDESDLVVVLRAILRK 133
+N + E VLED+ + K ++++ + + A+ LFPD SDLV+VLR +LR+
Sbjct: 76 SNLSNSFERVLEDEALPKAKQILKLISVHGGALEDFLRQARSLFPDPSDLVLVLRELLRR 135

Query: 134 KQISENVRLNAEALLRKVNQETTKKFINSGINSALKAKLFGQALSLNPKLLRASYRQFLM 193
K + E VR E+LL+ V ++T K + +GIN ALKA+LFG+ LSL P LLRASYRQF+
Sbjct: 136 KDLEEIVRKKLESLLKHVEEQTDPKTLKAGINCALKARLFGKTLSLKPGLLRASYRQFIQ 195

Query: 194 AEDDAVDTYVEWIGSYGYQNRMLVTKFIKETLFSDINALDASCSSLEFGMFLNKLSQLLS 253
+E V+ Y +WI SYGYQ R++V FI+ +L +DI+A DASCS LEFG L +L+QL
Sbjct: 196 SESHEVEIYSDWIASYGYQRRLVVLDFIEGSLLTDIDANDASCSRLEFGQLLRRLTQLKM 255

Query: 254 LQSAEALFLKTLMNNPIIKKFISAEDYWIFFLISLIKFPETAEELLNNALVTLPADANYK 313
L+SA+ LF+ TL++ K F + E W+ ++SL++ P + LL + + ++K
Sbjct: 256 LRSADLLFVSTLLSYSFTKAFNAEESSWLLLMLSLLQQPHEVDSLLADIIGLNALLLSHK 315

Query: 314 DKTLLLKAIYSGCTNLPFSLF 334
+ L+ Y C +P SLF
Sbjct: 316 EHASFLQIFYQVCKAIPSSLF 336


ORFs having significant similarity with Known Virulence factors
LocusTag Hits          Score  E-value  Comments
ECs3731  VACCYTOTOXIN  31     0.019    Helicobacter pylori vacuolating cytotoxin signature.
		>VACCYTOTOXIN#Helicobacter pylori vacuolating cytotoxin signature.

Length = 1291

Score = 31.2 bits (70), Expect = 0.019
Identities = 18/60 (30%), Positives = 31/60 (51%), Gaps = 3/60 (5%)

Query: 597 EIEDRIRDGVRPTAGGTFLNLDASEAEMILDNFKLAL---SGINIPIKDIILLGSVDIRR 653
EI +R+ G A T L L ASE +N +++L + +N+ + L+G+V + R
Sbjct: 202 EINNRVGSGAGRKASSTVLTLQASEGITSRENAEISLYDGATLNLASNSVKLMGNVWMGR 261


ORFs having significant similarity with Known Virulence factors
LocusTag Hits          Score  E-value  Comments
ECs3729  SSPAMPROTEIN  35     2e-05    Salmonella surface presentation of antigen gene typ...
		>SSPAMPROTEIN#Salmonella surface presentation of antigen gene type M signature.

Length = 147

Score = 34.7 bits (79), Expect = 2e-05
Identities = 31/101 (30%), Positives = 56/101 (55%)

Query: 2 QLKNLQSLLDMKELLGEVVFRQDIFYSLRKVTVIQQQIAEINLEKQKIAERRKILNKEIV 61
Q+ L+ LLD + R++I+ LRK +++++QI ++ L+ +I E+R L K+
Sbjct: 45 QIAGLKLLLDTLRAENRQLSREEIYALLRKQSIVRRQIKDLELQIIQIQEKRSELEKKRE 104

Query: 62 QQQAQRKHWWLKGEKYDRLKKRIKKQLLNQMLYQDELEQEE 102
+ Q + K+W K Y R R K+ + + + Q+E E EE
Sbjct: 105 EFQEKSKYWLRKEGNYQRWIIRQKRLYIQREIQQEEAESEE 145


ORFs having significant similarity with Known Virulence factors
LocusTag Hits          Score  E-value  Comments
ECs3727  SSPANPROTEIN  49     2e-09    Salmonella invasion protein InvJ signature.
		>SSPANPROTEIN#Salmonella invasion protein InvJ signature.

Length = 336

Score = 49.4 bits (117), Expect = 2e-09
Identities = 31/75 (41%), Positives = 44/75 (58%), Gaps = 3/75 (4%)

Query: 121 ENELTYQFQRWGQNHTVRILESSEG-IRLKPSDTLVSDRLHEAQHNDVTAQRWVLTEQDE 179
++ LTY+FQRWG +++V I G L PS+T V RLH+ N QRW LT +D+
Sbjct: 260 DSSLTYRFQRWGNDYSVNIQARQAGEFSLIPSNTQVEHRLHDQWQNG-NPQRWHLT-RDD 317

Query: 180 RQGQRHQPHEEQENE 194
+Q + Q H +Q E
Sbjct: 318 QQNPQQQQHRQQSGE 332


ORFs having significant similarity with Known Virulence factors
LocusTag Hits          Score  E-value  Comments
ECs3726  TYPE3OMOPROT  156    1e-47    Type III secretion system outer membrane O protein ...
		>TYPE3OMOPROT#Type III secretion system outer membrane O protein family signature.

Length = 303

Score = 156 bits (395), Expect = 1e-47
Identities = 91/292 (31%), Positives = 136/292 (46%), Gaps = 13/292 (4%)

Query: 35 KENGEDVALLMPEFSAKWLPIAEESGSWSGWVLLREIFPLISAELAGMALMPETERLIGE 94
+ +G + L P W+ +++ WS W+ + +S LAG A+ E L+
Sbjct: 23 QRHGREATLEYPTRQGMWVRLSDAEKRWSAWIKPGDWLEHVSPALAGAAVSAGAEHLVVP 82

Query: 95 WLSLSSSPLNLKYPELKYNRLCVGKVFDGVLSPAQPLIRIWTGELNLWLDKVTVCQYENA 154
WL+ + P L P L RLCV G P L+ I + LW + +
Sbjct: 83 WLAATERPFELPVPHLSCRRLCVENPVPGSALPEGKLLHIMSDRGGLWFEHLPELPAVGG 142

Query: 155 PTLDKKSLYWPIHFVIGFSKTCYRTIVDIEVGDVLLISNNMAYAVIYNTKICDLIYPEEL 214
K L WP+ FVIG S T + I +GDVLLI + A +Y
Sbjct: 143 GRP--KMLRWPLRFVIGSSDTQRSLLGRIGIGDVLLIRTSRA-----------EVYCYAK 189

Query: 215 KMADHFQYEEDFETDDFDIKKSESEIYDENDEQMINSFEELPVKIEFVLGKKIMNLYEID 274
K+ + E + DI+ E E + + +LPVK+EFVL +K + L E++
Sbjct: 190 KLGHFNRVEGGIIVETLDIQHIEEENNTTETAETLPGLNQLPVKLEFVLYRKNVTLAELE 249

Query: 275 ELCAKRIISLLPESEKNIEIRVNGALTGYGELVEVDDKLGVEIHSWLSGHNN 326
+ ++++SL +E N+EI NG L G GELV+++D LGVEIH WLS N
Sbjct: 250 AMGQQQLLSLPTNAELNVEIMANGVLLGNGELVQMNDTLGVEIHEWLSESGN 301


ORFs having significant similarity with Known Virulence factors
LocusTag Hits          Score  E-value  Comments
ECs3725  TYPE3IMPPROT  226    2e-77    Type III secretion system inner membrane P protein ...
		>TYPE3IMPPROT#Type III secretion system inner membrane P protein family signature.

Length = 224

Score = 226 bits (577), Expect = 2e-77
Identities = 151/223 (67%), Positives = 181/223 (81%), Gaps = 5/223 (2%)

Query: 1 MSNSISLIAILSLFTLLPFIIASGTCFIKFSIVFVIVRNALGLQQVPSNMTLNGVALLLS 60
M N ISLIA+L+ TLLPFIIASGTCF+KFSIVFV+VRNALGLQQ+PSNMTLNGVALLLS
Sbjct: 1 MGNDISLIALLAFSTLLPFIIASGTCFVKFSIVFVMVRNALGLQQIPSNMTLNGVALLLS 60

Query: 61 MFVMMPVGKEIYYNSQNENLSFNNVASVVNFVETGMSGYKSYLIKYSEPELVSFFEKIQK 120
MFVM P+ + Y ++E+++FN+++S+ V+ G+ GY+ YLIKYS+ ELV FFE Q
Sbjct: 61 MFVMWPIMHDAYVYFEDEDVTFNDISSLSKHVDEGLDGYRDYLIKYSDRELVQFFENAQL 120

Query: 121 VNSSEDNEEIIDDD-----NISIFSLLPAYALSEIKSAFIIGFYIYLPFVVVDLVISSVL 175
+ E + D SIF+LLPAYALSEIKSAF IGFY+YLPFVVVDLV+SSVL
Sbjct: 121 KRQYGEETETVKRDKDEIEKPSIFALLPAYALSEIKSAFKIGFYLYLPFVVVDLVVSSVL 180

Query: 176 LTLGMMMMSPVTISTPIKLILFVAMDGWTMLSKGLILQYFDLS 218
L LGMMMMSPVTISTPIKL+LFVA+DGWT+LSKGLILQY D++
Sbjct: 181 LALGMMMMSPVTISTPIKLVLFVALDGWTLLSKGLILQYMDIA 223


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs3724   TYPE3IMQPROT  79     4e-23    Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein

family signature.
Length = 86

Score = 78.7 bits (194), Expect = 4e-23
Identities = 59/86 (68%), Positives = 73/86 (84%)

Query: 1 MDDIVFAGNRALYLILVMSAGPIAVATFVGLLVGLFQTVTQLQEQTLPFGVKLLCVSICF 60
MDD+VFAGN+ALYL+L++S P VAT +GLLVGLFQTVTQLQEQTLPFG+KLL V +C
Sbjct: 1 MDDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCL 60

Query: 61 FLMSGWYGEKLYSFGIEMLNLAFARG 86
FL+SGWYGE L S+G +++ LA A+G
Sbjct: 61 FLLSGWYGEVLLSYGRQVIFLALAKG 86


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs3721   TYPE3IMSPROT  310    e-106    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein

family signature.
Length = 354

Score = 310 bits (796), Expect = e-106
Identities = 112/340 (32%), Positives = 185/340 (54%), Gaps = 5/340 (1%)

Query: 2 ANKTEKPTQKKLQDASKKGQILKSRDLTVSVIMLVG--TLYLGYVFDVHHIMSILEYILD 59
KTE+PT KK++DA KKGQ+ KS+++ + +++ L + H ++ +
Sbjct: 3 GEKTEQPTPKKIRDARKKGQVAKSKEVVSTALIVALSAMLMGLSDYYFEHFSKLMLIPAE 62

Query: 60 HNAKPDIWD---YFKAMGIGWLKTIIPFLLVCMFTTILVSWFQSKMQLATEAVKLKFDSL 116
+ P + + + P L V I Q ++ EA+K +
Sbjct: 63 QSYLPFSQALSYVVDNVLLEFFYLCFPLLTVAALMAIASHVVQYGFLISGEAIKPDIKKI 122

Query: 117 NPVNGLKRIFGLKTVKEFVKAILYIIFFALEIKVFWSNHKSLLFKTLDGDIISLLSDWGE 176
NP+ G KRIF +K++ EF+K+IL ++ ++ I + + L + I + G+
Sbjct: 123 NPIEGAKRIFSIKSLVEFLKSILKVVLLSILIWIIIKGNLVTLLQLPTCGIECITPLLGQ 182

Query: 177 MLFLLILYCLGSMIIVLIFDFIAEYFLFMKDMKMDKQEVKREYKEQEGNPEIKSKRRERH 236
+L L++ C +++ I D+ EY+ ++K++KM K E+KREYKE EG+PEIKSKRR+ H
Sbjct: 183 ILRQLMVICTVGFVVISIADYAFEYYQYIKELKMSKDEIKREYKEMEGSPEIKSKRRQFH 242

Query: 237 QEILSEQLKSDVSNSRLMIANPTHIAIGIYFKPHLSPIPLISVRETNEVALAVRKYAKEI 296
QEI S ++ +V S +++ANPTHIAIGI +K +P+PL++ + T+ VRK A+E
Sbjct: 243 QEIQSRNMRENVKRSSVVVANPTHIAIGILYKRGETPLPLVTFKYTDAQVQTVRKIAEEE 302

Query: 297 GIPIITDKKLARKIYATHRRYDYVSFENIDEILRLLLWLE 336
G+PI+ LAR +Y Y+ E I+ +L WLE
Sbjct: 303 GVPILQRIPLARALYWDALVDHYIPAEQIEATAEVLRWLE 342


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs3716   FLGMRINGFLIF  35     3e-04    Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 34.6 bits (79), Expect = 3e-04
Identities = 22/126 (17%), Positives = 49/126 (38%), Gaps = 5/126 (3%)

Query: 4 ISLLLFILLLCGCKQQE-LLNHLDQQQANDVLAVLQRHNINAEKKDQGKTGFSIYVEPTD 62
+++++ ++L L ++L Q ++A L + NI + I V
Sbjct: 35 VAIVVAMVLWAKTPDYRTLFSNLSDQDGGAIVAQLTQMNIPYRFANGSGA---IEVPADK 91

Query: 63 FASAVDWLKIYNLPGKPDIQISQMFPADALVSSPRAEKARLYSAIEQRLEQSLKIMDGIV 122
L LP + + + S +E+ A+E L ++++ + +
Sbjct: 92 VHELRLRLAQQGLPKGGAVGFE-LLDQEKFGISQFSEQVNYQRALEGELARTIETLGPVK 150

Query: 123 SSRVHV 128
S+RVH+
Sbjct: 151 SARVHL 156


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs3708   SYCDCHAPRONE  75     1e-19    Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD

chaperone signature.
Length = 168

Score = 75.0 bits (184), Expect = 1e-19
Identities = 25/143 (17%), Positives = 58/143 (40%), Gaps = 5/143 (3%)

Query: 22 ALSKGENLALLHGLTPDILDRIYAYAFDYHEKGNVTDAEIYYKLLCIYAFENHEYLKGFA 81
L G +A+L+ ++ D L+++Y+ AF+ ++ G DA ++ LC+ + + G
Sbjct: 18 FLKGGGTIAMLNEISSDTLEQLYSLAFNQYQSGKYEDAHKVFQALCVLDHYDSRFFLGLG 77

Query: 82 SVCQSKKKYQQAYDLYKLSYNYSPYDDYSVIYRMGQCQIGAKNIDNAMQCFYH----IIN 137
+ Q+ +Y A Y + + +C + + A + I +
Sbjct: 78 ACRQAMGQYDLAIHSYSYGAIMDI-KEPRFPFHAAECLLQKGELAEAESGLFLAQELIAD 136

Query: 138 NCEDASVKSKAQAYIELLTDNSE 160
E + ++ + +E + E
Sbjct: 137 KTEFKELSTRVSSMLEAIKLKKE 159


27  ECs3594  ECs3565  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag DNBias  CDNBias  %GCBias     Product
ECs3594  -1      16       3.197208    MarR family transcriptional regulator
ECs3593  0       18       3.156307    phenylacrylic acid decarboxylase
ECs3592  -1      16       2.229749    4-hydroxybenzoate decarboxylase
ECs3591  -1      14       1.956843    hypothetical protein
ECs3590  -2      14       2.274879    serine/threonine-specific protein phosphatase 2
ECs3589  -2      16       3.362684    DNA mismatch repair protein MutS
ECs3588  0       16       2.729108    hypothetical protein
ECs3587  0       17       2.863768    formate hydrogenlyase transcriptional activator
ECs3586  1       21       3.540395    hypothetical protein
ECs3585  2       21       3.188679    hypothetical protein
ECs3584  2       23       4.153300    hydrogenase assembly chaperone
ECs3583  2       22       4.584166    hydrogenase nickel incorporation protein HypB
ECs3582  0       25       4.794218    hydrogenase nickel incorporation protein
ECs3581  0       26       4.985681    formate hydrogenlyase regulatory protein HycA
ECs3580  -1      26       5.513731    formate hydrogenlyase subunit-7 component B
ECs3579  -1      27       5.269329    formate hydrogenlyase subunit 3
ECs3578  -1      29       4.642759    membrane-spanning protein of hydrogenase 3
ECs3577  -1      25       3.272627    hydrogenase 3 large subunit
ECs3576  -1      22       3.201546    formate hydrogenlyase complex iron-sulfur
ECs3575  -1      19       2.654831    formate hydrogenlyase subunit-7 component G
ECs3574  -2      18       2.095553    formate hydrogenlyase maturation protein
ECs3573  0       16       3.369897    hydrogenase 3 maturation protease
ECs3572  1       16       3.309127    6-phospho-beta-glucosidase
ECs3571  0       16       3.556574    PTS system cellobiose/arbutin/salicin-specific
ECs3570  1       18       3.758856    ascBF operon repressor
ECs3569  1       17       4.356258    electron transport protein HydN
ECs3568  0       16       3.813448    transcriptional regulatory protein
ECs3567  1       17       2.975093    nitric oxide reductase
ECs3566  1       15       2.715004    anaerobic nitric oxide reductase
ECs3565  1       18       3.064671    anaerobic nitric oxide reductase transcriptional
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
ECs3587   HTHFIS  389    e-131    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 389 bits (1000), Expect = e-131
Identities = 140/373 (37%), Positives = 203/373 (54%), Gaps = 39/373 (10%)

Query: 350 YQEIHRLKERLVDENLALTEQLNNVDSEFGEIIGRSEAMYSVLKQVEMVAQSDSTVLILG 409
E+ + R + E +L + + ++GRS AM + + + + Q+D T++I G
Sbjct: 108 LTELIGIIGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITG 167

Query: 410 ETGTGKELIARAIHNLSGRNNRRMVKMNCAAMPAGLLESDLFGHERGAFTGASAQRIGRF 469
E+GTGKEL+ARA+H+ R N V +N AA+P L+ES+LFGHE+GAFTGA + GRF
Sbjct: 168 ESGTGKELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRF 227

Query: 470 ELADKSSLFLDEVGDMPLELQPKLLRVLQEQEFERLGSNKIIQTDVRLIAATNRDLKKMV 529
E A+ +LFLDE+GDMP++ Q +LLRVLQ+ E+ +G I++DVR++AATN+DLK+ +
Sbjct: 228 EQAEGGTLFLDEIGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSI 287

Query: 530 ADREFRSDLYYRLNVFPIHLPPLRERPEDIPLLAKAFTFKIARRLGRNIDSIPAETLRTL 589
FR DLYYRLNV P+ LPPLR+R EDIP L + F + A + G ++ E L +
Sbjct: 288 NQGLFREDLYYRLNVVPLRLPPLRDRAEDIPDLVRHFV-QQAEKEGLDVKRFDQEALELM 346

Query: 590 SNMEWPGNVRELENVIERAVLLTRGNVLQLSL---------------------PDIALPE 628
WPGNVRELEN++ R L +V+ + +++ +
Sbjct: 347 KAHPWPGNVRELENLVRRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQ 406

Query: 629 PETPPAATVVAQEG--------------EDEYQLIVRVLKETNGVVAGPKGAAQRLGLKR 674
A G E EY LI+ L T G AA LGL R
Sbjct: 407 AVEENMRQYFASFGDALPPSGLYDRVLAEMEYPLILAALTATRGNQI---KAADLLGLNR 463

Query: 675 TTLLSRMKRLGID 687
TL +++ LG+
Sbjct: 464 NTLRKKIRELGVS 476


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
ECs3584   TYPE4SSCAGA  27     0.012    Type IV secretion system CagA exotoxin signature.
		>TYPE4SSCAGA#Type IV secretion system CagA exotoxin signature.

Length = 1147

Score = 27.0 bits (59), Expect = 0.012
Identities = 19/75 (25%), Positives = 37/75 (49%), Gaps = 8/75 (10%)

Query: 12 IDGNQAKVD--VCGIQRDVDLTLVGSCDENGQPRVGQWVLVHVGFAMSVINEAEARDTLD 69
I GNQ + D G+ D L ++NG+P G W+ + + F + ++ ++ D +
Sbjct: 171 IIGNQIRTDQKFMGV-FDESLKERQEAEKNGEPTGGDWLDIFLSF---IFDKKQSSDVKE 226

Query: 70 ALQN--MFDVEPDVG 82
A+ + V+PD+
Sbjct: 227 AINQEPVPHVQPDIA 241


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs3570   HTHTETR  28     0.035    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 28.4 bits (63), Expect = 0.035
Identities = 17/93 (18%), Positives = 29/93 (31%), Gaps = 7/93 (7%)

Query: 3 TTMLEVAKRAGVSKATVSRVLSG-----NGYVSQETKDRVFQAVEESGYRPNLLARNLSA 57
T++ E+AK AGV++ + + + +E P L
Sbjct: 32 TSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIGELELEYQAKFPGDPLSVLRE 91

Query: 58 KSTQTLGLVVTNTLYHGIYFSELLFHAARMAEE 90
L VT + E++FH E
Sbjct: 92 ILIHVLESTVTEERRRLLM--EIIFHKCEFVGE 122


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
ECs3565   HTHFIS  373    e-127    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 373 bits (960), Expect = e-127
Identities = 125/388 (32%), Positives = 195/388 (50%), Gaps = 33/388 (8%)

Query: 149 IAALAAGALS----------NALLIEQLESQNMLPGDAAPFEAVKQTQMIGLSPGMTQLK 198
I A GA +I + ++ ++ ++G S M ++
Sbjct: 91 IKASEKGAYDYLPKPFDLTELIGIIGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIY 150

Query: 199 KEIEIVAASDLNVLISGETGTGKELVAKAIHEASPRAVNPLVYLNCAALPESVAESELFG 258
+ + + +DL ++I+GE+GTGKELVA+A+H+ R P V +N AA+P + ESELFG
Sbjct: 151 RVLARLMQTDLTLMITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFG 210

Query: 259 HVKGAFTGAISNRSGKFEMADNGTLFLDEIGELSLALQAKLLRVLQYGDIQRVGDDRSLR 318
H KGAFTGA + +G+FE A+ GTLFLDEIG++ + Q +LLRVLQ G+ VG +R
Sbjct: 211 HEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDMPMDAQTRLLRVLQQGEYTTVGGRTPIR 270

Query: 319 VDVRVLAATNRDLREEVLAGRFRADLFHRLSVFPLSVPPLRERGDDVILLAGYFCEQCRL 378
DVR++AATN+DL++ + G FR DL++RL+V PL +PPLR+R +D+ L +F +Q
Sbjct: 271 SDVRIVAATNKDLKQSINQGLFREDLYYRLNVVPLRLPPLRDRAEDIPDLVRHFVQQAE- 329

Query: 379 RQGLSRVVLSAGARNLLQHYSFPGNVRELEHAIHRAVVLARATRSGDEVIL-----EAQH 433
++GL A L++ + +PGNVRELE+ + R L E+I E
Sbjct: 330 KEGLDVKRFDQEALELMKAHPWPGNVRELENLVRRLTALYPQDVITREIIENELRSEIPD 389

Query: 434 FAFPEVTLPPPEVAAVPVVKQNLR-----------------EATEAFQRETIRQALAQNH 476
+ ++ V++N+R + I AL
Sbjct: 390 SPIEKAAARSGSLSISQAVEENMRQYFASFGDALPPSGLYDRVLAEMEYPLILAALTATR 449

Query: 477 HNWAACARMLETDVANLHRLAKRLGLKD 504
N A +L + L + + LG+
Sbjct: 450 GNQIKAADLLGLNRNTLRKKIRELGVSV 477


28  ECs3532  ECs3481  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag DNBias  CDNBias  %GCBias     Product
ECs3532  1       23       -3.068114   hypothetical protein
ECs3531  -1      20       -2.936949   hypothetical protein
ECs3530  -1      18       -1.805046   DNA binding protein
ECs3529  2       16       -1.014779   hypothetical protein
ECs3528  1       18       1.904810    hypothetical protein
ECs3527  2       20       2.954507    hypothetical protein
ECs3526  2       20       3.591568    LysM domain/BON superfamily protein
ECs3525  1       21       3.765369    DNA-binding transcriptional regulator CsiR
ECs3524  2       16       1.602660    gamma-aminobutyrate transporter
ECs3523  1       12       -0.588549   4-aminobutyrate aminotransferase
ECs3522  1       14       -4.899308   succinate-semialdehyde dehydrogenase I
ECs3521  0       20       -5.192691   hydroxyglutarate oxidase
ECs3520  1       25       -6.914434   hypothetical protein
ECs3519  2       30       -7.844107   hypothetical protein
ECs3518  4       32       -8.478646   hypothetical protein
ECs3517  4       33       -9.457128*  hypothetical protein
ECs3515  4       35       -10.182112  ABC transporter ATP-binding protein
ECs3514  10      51       -17.331065  hypothetical protein
ECs3513  10      54       -18.872168  DNA binding protein
ECs3512  11      55       -18.519991  site specific recombinase
ECs3511  9       52       -18.016471  hypothetical protein
ECs3510  9       53       -16.643721  hypothetical protein
ECs3509  7       47       -13.197976  hypothetical protein
ECs3508  6       40       -11.634596  hypothetical protein
ECs3507  3       38       -9.288718   hypothetical protein
ECs3506  1       34       -8.233288   lipoprotein
ECs3504  1       36       -8.015666   hypothetical protein
ECs3503  0       21       -3.029341   hypothetical protein
ECs3502  2       22       -2.427635   serine/threonine protein phosphatase
ECs3501  3       24       -1.938476   antitermination protein
ECs3500  3       23       -1.891993   hypothetical protein
ECs3499  4       23       1.723086    hypothetical protein
ECs3498  3       21       2.666028    hypothetical protein
ECs3497  4       22       1.882646    holin
ECs3496  2       20       1.903597    hypothetical protein
ECs3494  2       23       2.499831    hypothetical protein
ECs3493  2       21       0.423337    hypothetical protein
ECs3492  2       28       -5.045013   hypothetical protein
ECs3491  3       36       -7.084512   transposase
ECs3490  6       45       -10.581146  transposase
ECs3489  7       47       -11.650737  hypothetical protein
ECs3488  4       43       -11.511688  hypothetical protein
ECs3487  4       39       -10.362739  hypothetical protein
ECs3486  2       30       -7.188189   hypothetical protein
ECs3485  0       23       -6.056555   chaperone-like protein
ECs3483  1       15       -1.569375   DNA damage-inducible protein
ECs3482  1       14       -1.226539   SsrA-binding protein
ECs3481  2       14       -1.299622   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs3515   PRTACTNFAMLY  235    6e-66    Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 235 bits (601), Expect = 6e-66
Identities = 216/886 (24%), Positives = 345/886 (38%), Gaps = 91/886 (10%)

Query: 765 NDGGTLDVREKGSATGIQQSSQGAL-VATTRATRVTGTRADGVAFSIEQGAANNILLANG 823
N+ + E+ IQ S G + A+ +V+G +A G+ + A + NG
Sbjct: 37 NNQSIVKTGERQHGIHIQGSDPGGVRTASGTTIKVSGRQAQGILL---ENPAAELQFRNG 93

Query: 824 GVLT----VESDTSSDKTQVNTGGREIVKTKATATGTTLTGGEQ----IVEGVANETTIN 875
V + + V ++V AT T + V G + +I
Sbjct: 94 SVTSSGQLSDDGIRRFLGTVTVKAGKLVADHATLANVGDTWDDDGIALYVAGEQAQASIA 153

Query: 876 DGGIQTVS-------ANGEAIKTTINEGGTLTVNDNGKATDIVQNSGAALQTSTANGIEI 928
D +Q AN ++ I +GG + + S L+ + +
Sbjct: 154 DSTLQGAGGVQIERGANVTVQRSAIVDGGLHIGALQSLQPEDLPPSRVVLRDTNVTAVPA 213

Query: 929 SGTHQY------------GTFSISGNLATNMLLENGGNLLVLAGTEARDSTVG------- 969
SG G G A ++ L A D+ G
Sbjct: 214 SGAPAAVSVLGASELTLDGGHITGGRAAGVAAMQGAVVHLQRATIRRGDAPAGGAVPGGA 273

Query: 970 -KGGAMQNQGQDSATKVNSGGQYTL---GRSKDEFQALARAEDLQVA-----GGTAIVYA 1020
GGA+ G Y + G S + Q++ A +L A G V
Sbjct: 274 VPGGAVPGGFGPGGFGPVLDGWYGVDVSGSSVELAQSIVEAPELGAAIRVGRGARVTVSG 333

Query: 1021 GTLA--DASVSGATGSLSLMTPRDNVTPVKLEGAIRITDSATLTIGNGVDTTLADLTAA- 1077
G+L+ +V G+ P+ + L+ A L L A
Sbjct: 334 GSLSAPHGNVIETGGARRFA-PQAAPLSITLQAGAHAQGKALLYRVLPEPVKLTLTGGAD 392

Query: 1078 SRGSVWLNSNNSCAGTSNCEYR---------------VNSLLLNDGNVYLSAQTA----- 1117
++G + S GTS V+SL +++ ++ +
Sbjct: 393 AQGDIVATELPSIPGTSIGPLDVALASQARWTGATRAVDSLSIDNATWVMTDNSNVGALR 452

Query: 1118 ---------APATTNGIYNTLTTNELSGSGNFYLHTNVAGSRGDQLVVNNNATGNFKIFV 1168
G + LT N L+GSG F ++ D+LVV +A+G +++V
Sbjct: 453 LASDGSVDFQQPAEAGRFKVLTVNTLAGSGLFRMNVFADLGLSDKLVVMQDASGQHRLWV 512

Query: 1169 QDTGVSPQSDDAMTLVKT-GGGDASFSLGNTGGFVDLGTYEYVLKSDGNSNWNLTNDVKP 1227
+++G P S + + LV+T G A+F+L N G VD+GTY Y L ++GN W+L P
Sbjct: 513 RNSGSEPASANTLLLVQTPLGSAATFTLANKDGKVDIGTYRYRLAANGNGQWSLVGAKAP 572

Query: 1228 NPDPNPNPNPNPKPDPKPDPKPDPKPDPTPEPTPTPVPEKRITPSTAAVLNMAATLP-LV 1286
P P P P P P P P+P P+ P P+P + + AAV L +
Sbjct: 573 ---PAPKPAPQPGPQPPQPPQPQPEA-PAPQPPAG---RELSAAANAAVNTGGVGLASTL 625

Query: 1287 FDAELNSIRERLNIMKASPHNNNVWGATYNTRNNVTTDAGAGFEQTLTGMTVGIDSPNDI 1346
+ AE N++ +RL ++ +P WG + R + AG F+Q + G +G D +
Sbjct: 626 WYAESNALSKRLGELRLNPDAGGAWGRGFAQRQQLDNRAGRRFDQKVAGFELGADHAVAV 685

Query: 1347 PEGIATLGAFMGYSHSHIGFDRGGHGSVGSYSLGGYASWEHESGFYLDGVVKLNRFESNV 1406
G LG GY+ GF G G S +GGYA++ +SGFYLD ++ +R E++
Sbjct: 686 AGGRWHLGGLAGYTRGDRGFTGDGGGHTDSVHVGGYATYIADSGFYLDATLRASRLENDF 745

Query: 1407 AGKMSSGGAANGSYHSNGLGGHIETGMRFT-DGNWNLTPYASLTGFTADNPEYHLSNGME 1465
S G A G Y ++G+G +E G RFT W L P A L F A Y +NG+
Sbjct: 746 KVAGSDGYAVKGKYRTHGVGASLEAGRRFTHADGWFLEPQAELAVFRAGGGAYRAANGLR 805

Query: 1466 SKSVDTRSIYRELGATLSYNMRLGNGMEIEPWLKAAVRKEFVDDNRVKVNNDGNFVNDLS 1525
+ S+ LG + + L G +++P++KA+V +EF V N + +L
Sbjct: 806 VRDEGGSSVLGRLGLEVGKRIELAGGRQVQPYIKASVLQEFDGAGTVHTNGIAH-RTELR 864

Query: 1526 GRRGIYQAGIKASFSSTLSGHLGVGYSHGAGVESPWNAVAGVNWSF 1571
G R G+ A+ S + YS G + PW AG +S+
Sbjct: 865 GTRAELGLGMAAALGRGHSLYASYEYSKGPKLAMPWTFHAGYRYSW 910


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
ECs3507   TONBPROTEIN  61     7e-13    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 61.2 bits (148), Expect = 7e-13
Identities = 22/56 (39%), Positives = 31/56 (55%), Gaps = 2/56 (3%)

Query: 338 SPLPEPEPEPEPEPEPEP-EPEPEPEPEPEPEPE-PEPEPEPEPEPEPEPEPEPEP 391
+P P+ P EPEPEPEP PEP E P +P+P+P+P+P+P +
Sbjct: 51 TPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKV 106



Score = 61.2 bits (148), Expect = 8e-13
Identities = 25/55 (45%), Positives = 38/55 (69%), Gaps = 1/55 (1%)

Query: 339 PLPEPEPEPEPEPEPEPEPEPE-PEPEPEPEPEPEPEPEPEPEPEPEPEPEPEPI 392
P PEP EPEPEPEP PEP E P +P+P+P+P+P+P + + +P+ + +P+
Sbjct: 63 PPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPV 117



Score = 60.4 bits (146), Expect = 2e-12
Identities = 22/64 (34%), Positives = 31/64 (48%), Gaps = 2/64 (3%)

Query: 330 IETSLDFSSPLPEPEPEPEPEPEPEPEP-EPEPEPEPEPEPEPE-PEPEPEPEPEPEPEP 387
+ + P P+ P EPEPEPEP PEP E P +P+P+P+P+P
Sbjct: 41 PAQPISVTMVTPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKP 100

Query: 388 EPEP 391
+P
Sbjct: 101 KPVK 104



Score = 60.4 bits (146), Expect = 2e-12
Identities = 23/63 (36%), Positives = 35/63 (55%), Gaps = 2/63 (3%)

Query: 330 IETSLDFSSPLPEPEPEPEPEPEP-EPEPEPEPEPEPEPE-PEPEPEPEPEPEPEPEPEP 387
I ++ + L P+ P EPEPEPEP PEP E P +P+P+P+P+P+P
Sbjct: 45 ISVTMVTPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVK 104

Query: 388 EPE 390
+ +
Sbjct: 105 KVQ 107



Score = 58.1 bits (140), Expect = 8e-12
Identities = 23/62 (37%), Positives = 37/62 (59%), Gaps = 1/62 (1%)

Query: 339 PLPEPEPEPEPEPEPEPE-PEPEPEPEPEPEPEPEPEPEPEPEPEPEPEPEPEPIRSSLK 397
P+ EPEPEPEP PEP E P +P+P+P+P+P+P + + +P+ + +P S +
Sbjct: 67 PVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVESRPASPFE 126

Query: 398 EN 399

Sbjct: 127 NT 128



Score = 40.0 bits (93), Expect = 9e-06
Identities = 11/56 (19%), Positives = 24/56 (42%)

Query: 339 PLPEPEPEPEPEPEPEPEPEPEPEPEPEPEPEPEPEPEPEPEPEPEPEPEPEPIRS 394
P +P+P+P+P+P+P + + +P+ + +P P P +
Sbjct: 84 EAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVESRPASPFENTAPARLTSSTAT 139



Score = 39.2 bits (91), Expect = 2e-05
Identities = 13/30 (43%), Positives = 14/30 (46%), Gaps = 1/30 (3%)

Query: 363 PEPEPEPEPEPEPEPEPEPEPEPEPEPEPI 392
P P+ P P EPEPEPEP P
Sbjct: 52 PADLEPPQAVQPPPE-PVVEPEPEPEPIPE 80



Score = 38.8 bits (90), Expect = 2e-05
Identities = 13/62 (20%), Positives = 28/62 (45%)

Query: 339 PLPEPEPEPEPEPEPEPEPEPEPEPEPEPEPEPEPEPEPEPEPEPEPEPEPEPIRSSLKE 398
P+ +P+P+P+P+P+P + + +P+ + +P P P ++ K
Sbjct: 86 PVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVESRPASPFENTAPARLTSSTATAATSKP 145

Query: 399 NT 400
T
Sbjct: 146 VT 147



Score = 33.0 bits (75), Expect = 0.002
Identities = 10/63 (15%), Positives = 23/63 (36%)

Query: 339 PLPEPEPEPEPEPEPEPEPEPEPEPEPEPEPEPEPEPEPEPEPEPEPEPEPEPIRSSLKE 398
P+P+P+P+P+P + + +P+ + +P P P +
Sbjct: 90 EKPKPKPKPKPKPVKKVQEQPKRDVKPVESRPASPFENTAPARLTSSTATAATSKPVTSV 149

Query: 399 NTE 401
+
Sbjct: 150 ASG 152


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
ECs3503   TYPE4SSCAGX  29     0.014    Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein

signature.
Length = 522

Score = 29.0 bits (64), Expect = 0.014
Identities = 15/46 (32%), Positives = 32/46 (69%), Gaps = 2/46 (4%)

Query: 27 EICGTKIALERRSKEREKAEKAEKAAEKKRRREEQKQKDKLKIQKL 72
E+ K ALE+ + +E+A+KA+K +K+ +R+E++ K++ ++ L
Sbjct: 140 ELEEQKKALEKEKEAKEQAQKAQK--DKREKRKEERAKNRANLENL 183


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs3489   FbpA_PF05833  30     0.002    Fibronectin-binding protein
		>FbpA_PF05833#Fibronectin-binding protein

Length = 577

Score = 30.2 bits (68), Expect = 0.002
Identities = 11/45 (24%), Positives = 16/45 (35%), Gaps = 1/45 (2%)

Query: 31 AVIQAEQENDMNILKKLMQRLCGCGKHDD-REHGELLTAQLRLGP 74
++ K L L C D + +GELLTA +
Sbjct: 306 KIVMNNINRCTKKDKILNNTLKKCEDKDIFKLYGELLTANIYALK 350


29  ECs3399  ECs3394  Y  N  N  Genomic Island
LocusTag DNBias  CDNBias  %GCBias     Product
ECs3399  2       31       1.476261    inositol monophosphatase
ECs3398  2       26       1.039865    ATP synthase subunit beta
ECs3397  2       26       2.642119    DNA-binding transcriptional regulator IscR
ECs3396  3       26       2.600187    cysteine desulfurase
ECs3395  3       21       2.712056    scaffold protein
ECs3394  0       19       3.652196    iron-sulfur cluster assembly protein
30  ECs3373  ECs3367  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag DNBias  CDNBias  %GCBias     Product
ECs3373  2       22       1.312321    GTP-binding protein EngA
ECs3372  2       23       1.105166    hypothetical protein
ECs3371  2       24       1.152104    exodeoxyribonuclease VII large subunit
ECs3370  2       30       0.187082    inosine 5'-monophosphate dehydrogenase
ECs3369  -2      14       -2.850301   GMP synthase
ECs3368  -1      14       -3.802786   hypothetical protein
ECs3367  -3      11       -3.572734   outer membrane lipoprotein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
ECs3368   IGASERPTASE  28     0.024    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 28.1 bits (62), Expect = 0.024
Identities = 19/124 (15%), Positives = 40/124 (32%), Gaps = 6/124 (4%)

Query: 34 QQGKNEEQRQHDEWVAERNREIQQEKQRRANAQAAANKRAATAAANKKARQDKLDAEATA 93
Q + ++ + + + E+ Q Q K AT +KA+ + +
Sbjct: 1064 QNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVET-EKTQEV 1122

Query: 94 DKKRDQSYEDELRSLEIQKQKLALAKEEARVKRENEFIDQELKHKAAQTDVVQSEADANR 153
K Q + +S +Q Q + + V I + D Q + +
Sbjct: 1123 PKVTSQVSPKQEQSETVQPQAEPARENDPTVN-----IKEPQSQTNTTADTEQPAKETSS 1177

Query: 154 NMTE 157
N+ +
Sbjct: 1178 NVEQ 1181


31  ECs3325  ECs3312  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag DNBias  CDNBias  %GCBias     Product
ECs3325  2       18       3.430025    malic enzyme
ECs3324  2       21       4.210307    hypothetical protein
ECs3323  1       19       4.489568    hypothetical protein
ECs3322  3       21       5.353182    ethanolamine utilization protein EutQ
ECs3321  2       18       5.923014    hypothetical protein
ECs3320  4       20       6.211216    phosphotransacetylase
ECs3319  2       20       5.552614    detox protein
ECs3318  2       21       5.900894    detox protein
ECs3317  1       22       5.725191    EutE
ECs3316  0       22       5.653212    EutJ
ECs3315  -1      22       5.347334    hypothetical protein
ECs3314  -2      22       5.025187    EutH
ECs3313  -2      19       4.694678    reactivating factor for ethanolamine ammonia
ECs3312  -2      17       3.687052    ethanolamine ammonia-lyase heavy chain
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs3316   SHAPEPROTEIN  51     2e-09    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein

signature.
Length = 347

Score = 50.5 bits (121), Expect = 2e-09
Identities = 33/116 (28%), Positives = 50/116 (43%), Gaps = 9/116 (7%)

Query: 63 VRDGIVWDFFGAVTIVRRHLD-TLEQQFGRRFSHAATSFPPGTDP---RISINVLESAGL 118
++DG++ DFF +++ + F R P G R + AG
Sbjct: 76 MKDGVIADFFVTEKMLQHFIKQVHSNSFMRPSPRVLVCVPVGATQVERRAIRESAQGAGA 135

Query: 119 EVSHVLDEPTAVA---DLLQLDNAG--VVDIGGGTTGIAIVKKGKVTYSADEATGG 169
+++EP A A L + G VVDIGGGTT +A++ V YS+ GG
Sbjct: 136 REVFLIEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVAVISLNGVVYSSSVRIGG 191


32  ECs3268  ECs3231  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag DNBias  CDNBias  %GCBias     Product
ECs3268  -1      15       3.375459    glucokinase
ECs3267  0       16       3.693364    PTS system enzyme IIB component
ECs3266  0       17       3.093181    transporter
ECs3265  0       19       2.406531    aminopeptidase
ECs3264  0       18       2.187055    exoaminopeptidase
ECs3263  0       16       1.381509    PTS system enzyme IIA component
ECs3262  0       15       0.041195    AraC family transcriptional regulator
ECs3261  0       13       -0.921003   2-component transcriptional regulator
ECs3260  -1      12       -1.016424   sensor protein
ECs3259  -2      16       -2.017107   aminotransferase
ECs3258  0       26       -4.332962   lipid A biosynthesis palmitoleoyl
ECs3257  1       31       -5.548872   hypothetical protein
ECs3256  1       33       -5.876018   hypothetical protein
ECs3255  2       30       -5.417877   hypothetical protein
ECs3254  0       32       -7.586014   formyl-coenzyme A transferase
ECs3253  0       33       -8.395133   oxalyl-CoA decarboxylase
ECs3252  0       34       -9.456454   transporter YfdV
ECs3251  0       34       -9.872795   hypothetical protein
ECs3250  -1      31       -8.472888   hypothetical protein
ECs3249  -1      30       -7.885994   EvgA family transcriptional regulator
ECs3248  -1      26       -4.944211   EvgA family transcriptional regulator
ECs3247  -1      26       -3.698470   multidrug resistance protein K
ECs3246  -1      24       -3.407288   multidrug resistance protein Y
ECs3245  0       22       -1.957815   D-serine dehydratase
ECs3244  1       26       -2.543437   sucrose operon repressor
ECs3243  1       32       -6.116754   sucrose-6 phosphate hydrolase
ECs3242  2       36       -7.988368   aminoimidazole riboside kinase
ECs3241  2       33       -9.320672   galactoside permease
ECs3240  3       38       -9.426202   resolvase
ECs3239  3       36       -8.829856   hypothetical protein
ECs3238  3       39       -8.937124   hypothetical protein
ECs3236  2       34       -5.529162   replication protein
ECs3235  1       28       -4.678016   hypothetical protein
ECs3234  0       25       -4.089754   hypothetical protein
ECs3233  0       25       -6.270031   DNA transfer protein
ECs3232  0       21       -5.738771   acyltransferase
ECs3231  -1      17       -4.702311   prophage Sf6-like integrase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs3263   PHPHTRNFRASE  614    0.0      Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase

signature.
Length = 572

Score = 614 bits (1585), Expect = 0.0
Identities = 202/567 (35%), Positives = 331/567 (58%), Gaps = 8/567 (1%)

Query: 117 LYGNVLASGVGVGTLTLLQSDSLDSYRAIPA-SAQDSTRLEHSLATLAEQLNQQLRERDG 175
+ G +SGV + + ++D + + + +L +L E+L + +
Sbjct: 5 ITGIAASSGVAIAKAFIHLEPNVDIEKTSITDVSTEIEKLTAALEKSKEELRAIKDQTEA 64

Query: 176 ----ESKTILSAHLSLIQDDEFAGNIRRLMTEQHQGLGAAIIRNMEQVCAKLSASASDYL 231
+ I +AHL ++ D E I+ + + A+ + + + ++Y+
Sbjct: 65 SMGADKAEIFAAHLLVLDDPELVDGIKGKIENEQMNAEYALKEVSDMFVSMFESMDNEYM 124

Query: 232 RERVSDIRDISEQLL-HITWPELKPRNNLVLEKPTILVAEDLTPSQFLSLDLKNLAGMIL 290
+ER +DIRD+S+++L H+ E + + T+++AEDLTPS L+ + + G
Sbjct: 125 KERAADIRDVSKRVLGHLIGVETGSLATIA--EETVIIAEDLTPSDTAQLNKQFVKGFAT 182

Query: 291 EKTGRTSHTLILARASAIPVLSGLPLDAIARYAGQPAVLDAQCGVLAINPNDAVSGYYQV 350
+ GRTSH+ I++R+ IP + G G ++D G++ +NP + Y+
Sbjct: 183 DIGGRTSHSAIMSRSLEIPAVVGTKEVTEKIQHGDMVIVDGIEGIVIVNPTEEEVKAYEE 242

Query: 351 AQTLADKRQKQQAQAAAQLAYSRDNKRIDIAANIGTALEAPGAFANGAEGVGLFRTEMLY 410
+ +K++++ A+ + + ++D +++AANIGT + G ANG EG+GL+RTE LY
Sbjct: 243 KRAAFEKQKQEWAKLVGEPSTTKDGAHVELAANIGTPKDVDGVLANGGEGIGLYRTEFLY 302

Query: 411 MDRDSEPDEQEQFEAYQQVLLAAGDKPIIFRTMDIGGDKSIPYLNIPQEENPFLGYRAVR 470
MDRD P E+EQFEAY++V+ KP++ RT+DIGGDK + YL +P+E NPFLG+RA+R
Sbjct: 303 MDRDQLPTEEEQFEAYKEVVQRMDGKPVVIRTLDIGGDKELSYLQLPKELNPFLGFRAIR 362

Query: 471 IYPEFAGLFRTQLRAILRAASFGNAQLMIPMVHGLDQILWVKGEIQKAIVELKRDGLRHA 530
+ E +FRTQLRA+LRA+++GN ++M PM+ L+++ K +Q+ +L +G+ +
Sbjct: 363 LCLEKQDIFRTQLRALLRASTYGNLKVMFPMIATLEELRQAKAIMQEEKDKLLSEGVDVS 422

Query: 531 ETITLGIMVEVPSVCYIIDHFCDEVDFFSIGSNDMTQYLYAVDRNNPRVSPLYNPITPSF 590
++I +GIMVE+PS + F EVDFFSIG+ND+ QY A DR N RVS LY P P+
Sbjct: 423 DSIEVGIMVEIPSTAVAANLFAKEVDFFSIGTNDLIQYTMAADRMNERVSYLYQPYHPAI 482

Query: 591 LRMLQQIVTTAHQRGKWVGICGELGGESRYLPLLLGLGLDELSMSSPRIPAVKSQLRQLD 650
LR++ ++ AH GKWVG+CGE+ G+ +PLLLGLGLDE SMS+ I +SQL +L
Sbjct: 483 LRLVDMVIKAAHSEGKWVGMCGEMAGDEVAIPLLLGLGLDEFSMSATSILPARSQLLKLS 542

Query: 651 SEACRELARQACECRSAQEIEALLTAF 677
E + A++A +A+E+E L+
Sbjct: 543 KEELKPFAQKALMLDTAEEVEQLVKKT 569


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
ECs3261   HTHFIS  55     5e-11    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 55.2 bits (133), Expect = 5e-11
Identities = 21/132 (15%), Positives = 57/132 (43%), Gaps = 6/132 (4%)

Query: 2 KVIIVEDEFLAQQELSWLIKEHSQMEIVGTFDDGLDVLKFLQHNRVDAIFLDINIPSLDG 61
+++ +D+ + L+ + V + + +++ D + D+ +P +
Sbjct: 5 TILVADDDAAIRTVLNQALS--RAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENA 62

Query: 62 V-LLAQNISQFAHKPFIVFITAWK--EHAVEAFELEAFDYILKPYQESRITGMLQKLEAA 118
LL + P +V ++A A++A E A+DY+ KP+ + + G++ + A
Sbjct: 63 FDLLPRIKKARPDLPVLV-MSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121

Query: 119 WQQQQTSSTPAA 130
+++ + +
Sbjct: 122 PKRRPSKLEDDS 133


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs3260   PF06580  223    3e-70    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 223 bits (570), Expect = 3e-70
Identities = 60/207 (28%), Positives = 102/207 (49%), Gaps = 11/207 (5%)

Query: 348 RAEQLREMANKAELRALQSKINPHFLFNALNAISSSIRLNPDTARQLIFNLSRYLRYNIE 407
++ MA +A+L AL+++INPHF+FNALN I + I +P AR+++ +LS +RY++
Sbjct: 150 DQWKMASMAQEAQLMALKAQINPHFMFNALNNIRALILEDPTKAREMLTSLSELMRYSLR 209

Query: 408 LKDDEQIDIKKELYQIKDYIAIEQARFGDKLTVIYDIDEEV-NCCIPSLLIQPLVENAIV 466
+ Q+ + EL + Y+ + +F D+L I+ + + +P +L+Q LVEN I
Sbjct: 210 YSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQINPAIMDVQVPPMLVQTLVENGIK 269

Query: 467 HGIQPCKGKGVVTISVAECGNRVRIAVRDTGHGIDPKVIERVEANEMPGNKIGLLNVHHR 526
HGI G + + + V + V +TG E GL NV R
Sbjct: 270 HGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKE--------STGTGLQNVRER 321

Query: 527 VKLLYGE--GLHIRRLEPGTEIAFYIP 551
+++LYG + + + IP
Sbjct: 322 LQMLYGTEAQIKLSEKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
ECs3249   HTHFIS  76     2e-16    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 76.4 bits (188), Expect = 2e-16
Identities = 30/105 (28%), Positives = 51/105 (48%)

Query: 960 SILIADDHPTNRLLLKRQLNLLGYDVDEATDGVQALHKVSMQHYDLLITDVNMPNVDGFE 1019
+IL+ADD R +L + L+ GYDV ++ ++ DL++TDV MP+ + F+
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 1020 LTRKLREQNSSLPIWGLTANAQANEREKGLNCGMNLCLFKPLTLD 1064
L ++++ LP+ ++A K G L KP L
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLT 109


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
ECs3248   HTHFIS  49     3e-09    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 49.1 bits (117), Expect = 3e-09
Identities = 22/148 (14%), Positives = 53/148 (35%), Gaps = 31/148 (20%)

Query: 4 IIIDDHPLAIAAIRNLLIKNDIEILAELTEGGSAVQRVETLKPDIVIIDVDIPGVNGIQV 63
++ DD + L + ++ + + + + D+V+ DV +P N +
Sbjct: 7 LVADDDAAIRTVLNQALSRAGYDVRIT-SNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 64 LETLRKRQYSGIIIIVSAKNDHFYGKHCADAGANGFVSKKEGMNNIIAAIEAAKNGYCYF 123
L ++K + ++++SA+N + AI+A++ G +
Sbjct: 66 LPRIKKARPDLPVLVMSAQNT------------------------FMTAIKASEKGAYDY 101

Query: 124 ---PFSLNRFVGSLTSDQQKLDSLSKQE 148
PF L + + L ++
Sbjct: 102 LPKPFDLTE---LIGIIGRALAEPKRRP 126


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
ECs3247   RTXTOXIND  79     4e-18    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 78.7 bits (194), Expect = 4e-18
Identities = 63/412 (15%), Positives = 122/412 (29%), Gaps = 96/412 (23%)

Query: 13 RRKYFSLLVIVLFIAFSGAYAYWSMELEDMISTDDAYVT-GNADPISAQVSGSVTVVNHK 71
RR I+ F+ + + ++E + + + G + I + V + K
Sbjct: 55 RRPRLVAYFIMGFLVIAFILSVLG-QVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVK 113

Query: 72 DTNYVRQGDILVSLDKTDATIALNKA---------------------------------- 97
+ VR+GD+L+ L A K
Sbjct: 114 EGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDE 173

Query: 98 ------------------KNNLANIVRQTNKLYLQDKQYSAEVASARIQ---YQQSLEDY 136
K + Q + L + AE + + Y+
Sbjct: 174 PYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVE 233

Query: 137 NRRV----PLAKQGVISKE----------TLEHTKDTLISSKAALNAAIQAYKANKALVM 182
R+ L + I+K + S + + I + K LV
Sbjct: 234 KSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVT 293

Query: 183 N-------TPLNR-QPQVVEAADATKEAWLALKRTDIKSPVTGYIAQRSVQ-VGETVSPG 233
L + + + + + I++PV+ + Q V G V+
Sbjct: 294 QLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTA 353

Query: 234 QSLMAVVPARQ-MWVNANFKETQLTDVRIGQSVNIISDLYGENVVFHGRVTGINMGTGNA 292
++LM +VP + V A + + + +GQ+ I + F G +G
Sbjct: 354 ETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVE------AFPYTRYGYLVGK--- 404

Query: 293 FSLLPAQNATGNWIKIVQRVPVEVSLDPKELMEH----PLRIGLSMTATIDT 340
+ + +V V +S++ L PL G+++TA I T
Sbjct: 405 VKNINLDAIEDQRLGLVFNVI--ISIEENCLSTGNKNIPLSSGMAVTAEIKT 454


ORFs having significant similarity with Known Virulence factors
LocusTagHitsScoreE-valueComments
ECs3246TCRTETB1201e-31 Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 120 bits (302), Expect = 1e-31
Identities = 97/408 (23%), Positives = 169/408 (41%), Gaps = 25/408 (6%)

Query: 19 VTIALSLATFMQMLDSTISNVAIPTISGFLGASTDEGTWVITSFGVANAIAIPVTGRLAQ 78
+ I L + +F +L+ + NV++P I+ WV T+F + +I V G+L+
Sbjct: 15 ILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSD 74

Query: 79 RIGELRLFLLSVTFFSLSSLMCSLSIN-LDVLIFFRVVQGLMAGPLIPLSQSLLLRNYPP 137
++G RL L + S++ + + +LI R +QG A L ++ R P
Sbjct: 75 QLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPK 134

Query: 138 EKRTFALALWSMTVIIAPICGPILGGYICDNFSWGWIFLINVPMGIIVLTLCLTLLKGRE 197
E R A L V + GP +GG I W +L+ +PM I+ L L +E
Sbjct: 135 ENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWS--YLLLIPMITIITVPFLMKLLKKE 192

Query: 198 TETSPVKMNLPGLTLLVLGVGGLQIMLDKGRDLDWFNSSTIIILTVVSVIFLISLVIWES 257
++ G+ L+ +G+ + ML F +S I +VSV+ + V
Sbjct: 193 VRIKG-HFDIKGIILMSVGI--VFFML--------FTTSYSISFLIVSVLSFLIFVKHIR 241

Query: 258 TSENPILDLSLFKSRNFTIGIVSITCAYLFYSGAIVLMPQLLQETMGYNAIWAGLAYAPI 317
+P +D L K+ F IG++ + +G + ++P ++++ + G
Sbjct: 242 KVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFP 301

Query: 318 GIMPLLISPLIG-----RYGNKIDMRVLVTFSFLMYAVCYYWRSVTFMPTIDFTGIILPQ 372
G M ++I IG R G + + VTF +V + S T F II+
Sbjct: 302 GTMSVIIFGYIGGILVDRRGPLYVLNIGVTFL----SVSFLTASFLLETTSWFMTIIIVF 357

Query: 373 FFQGFAVACFFLPLTTISFSGLPDNKFANASSMSNFFRTLSGSVGTSL 420
G + ++TI S L + S+ NF LS G ++
Sbjct: 358 VLGGLSFTK--TVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAI 403


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs3233  RTXTOXINA  28  0.037  Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 28.0 bits (62), Expect = 0.037
Identities = 26/129 (20%), Positives = 55/129 (42%), Gaps = 4/129 (3%)

Query: 93 GQARYQSLAAAEATGGLGSTATGNQLAAIAPTLGQNWLS--GQMNNYNNLANIGLGALTG 150
G+ Q + A A GL ++A L A A TL + LS + + I +
Sbjct: 284 GKGISQYIIAQRAAQGLSTSAAAAGLIASAVTLAISPLSFLSIADKFKRANKIEEYSQRF 343

Query: 151 QANAGQNYANNVSQLYQQQAAASAANANKPSGLQSFATGAIGGAASGAMIGSAVPVIGTG 210
+ G + + ++ +++ A A+ + L S + I AA+ +++G+ V +
Sbjct: 344 KK-LGYDGDSLLAAFHKETGAIDASLTTISTVLASVS-SGISAAATTSLVGAPVSALVGA 401

Query: 211 IGALAGGVI 219
+ + G++
Sbjct: 402 VTGIISGIL 410


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs3232  TCRTETB  35  4e-04  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 34.9 bits (80), Expect = 4e-04
Identities = 26/120 (21%), Positives = 55/120 (45%), Gaps = 7/120 (5%)

Query: 229 LSFAEISSVV------FILMSVYMVAKSQNSNGLQYDA-LFIPSMAFSILVFSFNGGIIS 281
LS AEI SV+ +++ Y+ + G Y + + ++ S L SF S
Sbjct: 289 LSTAEIGSVIIFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTS 348

Query: 282 KIISNKVMILLGDASFSFYLVHTIVISTLSKFFNVSGLGAISVIKFIVMALFASLFISIM 341
++ ++ +LG SF+ ++ TIV S+L + +G+ ++ F+ ++ ++
Sbjct: 349 WFMTIIIVFVLGGLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLL 408


33  ECs3172  ECs3143  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag DNBias  CDNBias    %GCBias  Product
ECs3172       0       28   4.071408  NADH dehydrogenase subunit A
ECs3171       0       29   4.060578  NADH dehydrogenase subunit B
ECs3170       0       30   4.274095  bifunctional NADH:ubiquinone oxidoreductase
ECs3169       0       30   3.986044  NADH dehydrogenase subunit E
ECs3168       0       30   3.895247  NADH dehydrogenase I subunit F
ECs3167      -1       30   4.169833  NADH dehydrogenase subunit G
ECs3166       0       30   3.719494  NADH dehydrogenase subunit H
ECs3165       0       30   4.317335  NADH dehydrogenase subunit I
ECs3164       1       27   3.603918  NADH dehydrogenase subunit J
ECs3163       0       24   3.041152  NADH dehydrogenase subunit K
ECs3162      -2       17  -0.412624  NADH dehydrogenase subunit L
ECs3161      -1       14  -2.401232  NADH dehydrogenase subunit M
ECs3160      -1       17  -4.323181  NADH dehydrogenase subunit N
ECs3159       1       27  -7.823973  hypothetical protein
ECs3158      -1       22  -5.209211  hypothetical protein
ECs3157      -1       15  -1.568220  deubiquitinase
ECs3156      -1       11   2.808277  ribonuclease Z
ECs3155      -1       14   3.549530  hypothetical protein
ECs3154       0       14   4.719468  hypothetical protein
ECs3153       0       14   4.866872  menaquinone-specific isochorismate synthase
ECs3152       0       14   4.910088  2-succinyl-5-enolpyruvyl-6-hydroxy-3-
ECs3151      -1       13   3.528370  acyl-CoA thioester hydrolase
ECs3150       0       14   3.084779  1,4-dihydroxy-2-naphthoyl-CoA synthase
ECs3149       0       15   1.800723  O-succinylbenzoate synthase
ECs3148       0       16   0.692608  2-succinylbenzoate-CoA ligase
ECs3147      -1       16  -0.518055  polymyxin resistance protein B
ECs3146       0       16  -0.398609  4-amino-4-deoxy-L-arabinose-phospho-UDP
ECs5486      -1       17  -0.224824  4-amino-4-deoxy-L-arabinose-phosphoundecaprenol
ECs3145       0       18  -1.002436  4-amino-4-deoxy-L-arabinose lipid A transferase
ECs3144       1       20  -1.245588  4-deoxy-4-formamido-L-arabinose-
ECs3143       2       18  -1.362304  bifunctional UDP-glucuronic acid
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs3171  FLGBIOSNFLIP  29  0.018  Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP signature.

Length = 245

Score = 28.6 bits (64), Expect = 0.018
Identities = 18/56 (32%), Positives = 25/56 (44%), Gaps = 3/56 (5%)

Query: 68 MVTSFT---AVHDVARFGAEVLRASPRQADLMVVAGTCFTKMAPVIQRLYDQMLEP 120
M+TSFT V + R A P Q L + F M+PVI ++Y +P
Sbjct: 60 MMTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQP 115


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs3155  AUTOINDCRSYN  35  6e-05  Autoinducer synthesis protein signature.
		>AUTOINDCRSYN#Autoinducer synthesis protein signature.

Length = 216

Score = 34.8 bits (80), Expect = 6e-05
Identities = 14/79 (17%), Positives = 32/79 (40%), Gaps = 12/79 (15%)

Query: 1 MIEWQDLHHSELSVSQLYALLQLRCAVFV--------VEQNCPYQDIDGDDLTGDNRHIL 52
M+E D++H+ LS ++ L LR F + D + + ++
Sbjct: 1 MLEIFDVNHTLLSETKSGELFTLRKETFKDRLNWAVQCTDGMEFDQYDNN----NTTYLF 56

Query: 53 GWKNDELVAYARILKSDDD 71
G K++ ++ R +++
Sbjct: 57 GIKDNTVICSLRFIETKYP 75


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs3148  ACETATEKNASE  30  0.016  Acetate kinase family signature.
		>ACETATEKNASE#Acetate kinase family signature.

Length = 400

Score = 30.2 bits (68), Expect = 0.016
Identities = 19/124 (15%), Positives = 47/124 (37%), Gaps = 20/124 (16%)

Query: 339 EMHNGKLTIVG-----RLDNLFFSGGEGIQTEEVERVIAAHPAVLQVFIVPVADKEF--- 390
E +G + G +++ + + ++++ + H +++ + + + ++
Sbjct: 19 ESKDGNVLAKGLAERIGINDSLLTHNANGEKIKIKKDMKDHKDAIKLVLDALVNSDYGVI 78

Query: 391 ---------GHRPVAVMEYDHESVDLSEWVKDKLARFQQPVRWLTLPPELKNGGIKISRQ 441
GHR V EY SV +++ V + + L P + GIK Q
Sbjct: 79 KDMSEIDAVGHRVVHGGEYFTSSVLITDDVLKAITDC-IELAPLHNPANI--EGIKACTQ 135

Query: 442 ALKE 445
+ +
Sbjct: 136 IMPD 139


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs5486  BCTERIALGSPC  28  0.008  Bacterial general secretion pathway protein C signa...
		>BCTERIALGSPC#Bacterial general secretion pathway protein C signature.

Length = 272

Score = 28.0 bits (62), Expect = 0.008
Identities = 12/31 (38%), Positives = 18/31 (58%), Gaps = 1/31 (3%)

Query: 34 KHIVLWLGLALACLGLAMVLWLLVL-QNVPV 63
+ I+ +L + L C LAM+ W + L N PV
Sbjct: 15 RRILFYLLMLLFCQQLAMIFWRIGLPDNAPV 45


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs3143  NUCEPIMERASE  116  8e-31  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 116 bits (293), Expect = 8e-31
Identities = 74/361 (20%), Positives = 137/361 (37%), Gaps = 60/361 (16%)

Query: 317 RVLILGVNGFIGNHLTERLLREDHYEVYGLDIGSD--------AISRFLNHPHFHFVEGD 368
+ L+ G GFIG H+++RLL H +V G+D +D A L P F F + D
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGH-QVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKID 60

Query: 369 ISIHSEWIE--YHVKKCDVVLPLVAIATPIEYT-RNPLRVFELDFEENLRIIRYCVKYR- 424
++ E + + + V + Y+ NP + + L I+ C +
Sbjct: 61 LADR-EGMTDLFASGHFERVFISPHRLA-VRYSLENPHAYADSNLTGFLNILEGCRHNKI 118

Query: 425 KRIIFPSTSEVYGMCSDKYFDEDHSNLIVGPVNKPRWIYSVSKQLLDRVIWAYGEKEGLQ 484
+ +++ S+S VYG+ F D V+ P +Y+ +K+ + + Y GL
Sbjct: 119 QHLLYASSSSVYGLNRKMPFSTDD------SVDHPVSLYAATKKANELMAHTYSHLYGLP 172

Query: 485 FTLFRPFNWMGPRLDNLNAARIGSSRAITQLILNLVEGSPIKLIDGGKQKRCFTDIRDGI 544
T R F GP A+ + ++EG I + + GK KR FT I D
Sbjct: 173 ATGLRFFTVYGPWGR--------PDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIA 224

Query: 545 EALYRIIEN---------------AGNRCDGEIINIGNPENEASIEELGEMLLASFEKHP 589
EA+ R+ + A + + NIGN + + + L +
Sbjct: 225 EAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVE-LMDYIQALEDALGIEA 283

Query: 590 LRHHFPPFAGFRVVESSSYYGKGYQDVEHRKPSIRNAHHCLDWEPKIDMQETIDETLDFF 649
++ P G DV + + + + P+ +++ + ++++
Sbjct: 284 KKNMLPLQPG---------------DVLETSADTKALYEVIGFTPETTVKDGVKNFVNWY 328

Query: 650 L 650

Sbjct: 329 R 329


34  ECs3099  ECs3087  Y  N  N  Genomic Island
LocusTag DNBias  CDNBias    %GCBias  Product
ECs3099       0       20   3.636553  malate:quinone oxidoreductase
ECs3098      -1       21   4.219233  ecotin
ECs3097      -1       22   4.766930  ferredoxin-type protein
ECs3096      -1       23   4.339550  assembly protein for periplasmic nitrate
ECs3095      -1       20   4.336322  nitrate reductase catalytic subunit
ECs3094      -1       17   4.030608  quinol dehydrogenase periplasmic component
ECs3093      -1       15   3.360291  quinol dehydrogenase membrane component
ECs3092      -1       16   3.230286  citrate reductase cytochrome c-type subunit
ECs3091      -1       16   2.944582  cytochrome c-type protein NapC
ECs3090       0       18   4.426562  cytochrome c biogenesis protein CcmA
ECs3089       1       20   4.353468  heme exporter protein B
ECs3088       0       22   3.964281  heme exporter protein C
ECs3087       0       20   3.515643  heme exporter protein C
35  ECs3041  ECs2945  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag DNBias  CDNBias    %GCBias  Product
ECs3041       3       17  -2.809529  galactose/methyl galactoside transporter
ECs3040       1       14  -0.791292  beta-methylgalactoside transporter inner
ECs3039       2       16  -0.329001  dihydropyrimidine dehydrogenase
ECs3038       2       17  -0.412438  oxidoreductase
ECs3037       0       15  -0.774434  hypothetical protein
ECs3036       0       15   1.273890  hypothetical protein
ECs3035       0       17   2.457920  cytidine deaminase
ECs3034       0       19   2.497338  hypothetical protein
ECs3033      -1       19   2.556558  hypothetical protein
ECs3032      -1       19   3.756682  regulatory protein
ECs3031      -1       19   4.364049  transporter
ECs3030      -2       17   3.899582  gentisate 1,2-dioxygenase
ECs3029      -1       17   3.764413  isomerase
ECs3028      -1       18   3.589738  glutathione-S-transferase
ECs3027      -1       17   3.479963  salicylate hydroxylase
ECs3026       0       16   1.993112  tRNA-dihydrouridine synthase C
ECs3025       0       12   1.527465  multidrug resistance outer membrane protein
ECs3024      -2       15   1.443763  acetoin dehydrogenase
ECs3023      -3       14   1.291416  hypothetical protein
ECs3022      -3       14   1.861228  hypothetical protein
ECs3021      -2       15   2.093117  D-alanyl-D-alanine endopeptidase
ECs3020      -2       13   2.710611  D-lactate dehydrogenase
ECs3019       0       16   3.197300  beta-D-glucoside glucohydrolase
ECs3018       2       15   0.912529  transport system permease
ECs3017       1       16   0.522176  transport system permease
ECs3016       0       18  -0.866678  ABC transporter ATP-binding protein
ECs3015       2       22  -2.008426  transport system permease
ECs3014       3       26  -3.777168  transcriptional regulator
ECs3013       3       31  -5.129638  integrase
ECs3012       1       28  -3.137782  excisionase
ECs5482      -1       24  -4.258329  hypothetical protein
ECs3011       0       27  -4.639420  hypothetical protein
ECs3010       1       29  -5.059424  hypothetical protein
ECs3009       2       32  -5.554794  hypothetical protein
ECs5481       2       35  -5.911645  hypothetical protein
ECs3007       2       29  -4.094013  hypothetical protein
ECs3006       2       28  -1.666076  C4-type zinc finger protein
ECs3005       1       28  -2.166275  hypothetical protein
ECs3004       2       29  -2.214965  hypothetical protein
ECs3003       3       31  -3.618742  hypothetical protein
ECs3002       3       29  -5.153562  exonuclease
ECs3001       3       31  -6.164217  hypothetical protein
ECs2998       7       42  -9.582375  Kil protein
ECs2997       7       41  -9.674480  regulatory protein cIII
ECs2996       6       39  -7.373377  single-stranded DNA binding protein
ECs2995       8       39  -6.218799  superinfection exclusion protein
ECs2993       8       38  -4.518353  regulatory protein
ECs2992       6       33  -4.445336  hypothetical protein
ECs2991       3       28  -1.872210  hypothetical protein
ECs2990       3       29  -1.045503  prophage repressor CI
ECs2989       2       26  -2.029829  regulatory protein
ECs2988       1       27  -1.883372  regulatory protein CII
ECs2987       1       25  -1.158409  phage replication protein O
ECs2986       1       30  -0.241872  phage replication protein P
ECs2985       2       32  -1.412015  Ren protein
ECs2984       2       34  -3.039212  hypothetical protein
ECs2983       3       38  -2.649356  hypothetical protein
ECs2982       4       37  -2.868419  hypothetical protein
ECs2981       3       37  -4.241579  DNA methylase
ECs2980       0       32  -4.237383  protein NinE
ECs2979       2       35  -6.222759  hypothetical protein
ECs2978       2       36  -6.412057  hypothetical protein
ECs2977       1       26  -1.996983  hypothetical protein
ECs2976       2       28  -1.732840  hypothetical protein
ECs2975       2       28  -1.048665  antitermination protein
ECs2974       3       29  -1.027506  Shiga toxin I subunit A
ECs2973       3       30   2.083474  Shiga toxin I subunit B
ECs2972       3       26   2.105548  hypothetical protein
ECs2971       3       22   0.884569  hypothetical protein
ECs2970       1       21  -0.002214  hypothetical protein
ECs2969       2       21  -1.008354  holin
ECs2968       2       19   2.086553  endolysin
ECs2967       2       22   2.124907  antirepressor protein
ECs2966       2       23   3.297875  endopeptidase
ECs5480       4       23   2.875133  hypothetical protein
ECs2964       4       23   3.206133  hypothetical protein
ECs2963       3       23   3.614541  terminase large subunit
ECs2962       4       22   2.940019  hypothetical protein
ECs2961       5       22   3.130925  portal protein
ECs2960       6       22   3.278014  protease/scaffold protein
ECs2959       4       25   4.381738  transposase
ECs2958       4       27   4.168382  transposase
ECs2957       4       25   4.234105  hypothetical protein
ECs2956       4       26   3.162902  hypothetical protein
ECs2955       3       28   3.086393  hypothetical protein
ECs2954       4       26   4.525175  minor tail protein
ECs2953       4       26   4.384017  minor tail protein U
ECs2952       3       27   5.196295  hypothetical protein
ECs2951       4       29   5.878021  minor tail protein
ECs2950       4       30   6.901270  minor tail protein
ECs2949       3       27   6.613943  tail length tape measure protein
ECs2948       4       30   7.393011  minor tail protein
ECs2947       5       30   6.941766  minor tail protein
ECs2946       4       26   6.085051  tail assembly protein
ECs2945       0       17   3.380024  tail assembly protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs3041  PF05272  32  0.007  Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 32.0 bits (72), Expect = 0.007
Identities = 21/74 (28%), Positives = 28/74 (37%), Gaps = 17/74 (22%)

Query: 24 PGVKALDNVNLKVRPHSIHALMGENGAGKSTLLKCLFGIYKKDSGTILFQGKEIDFHSAK 83
PG K D + L G G GKSTL+ L G+ F D + K
Sbjct: 591 PGCKF-DYSVV---------LEGTGGIGKSTLINTLVGLD-------FFSDTHFDIGTGK 633

Query: 84 EALENGISMVHQEL 97
++ E +V EL
Sbjct: 634 DSYEQIAGIVAYEL 647


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs3038  AEROLYSIN  29  0.029  Aerolysin signature.
		>AEROLYSIN#Aerolysin signature.

Length = 493

Score = 29.2 bits (65), Expect = 0.029
Identities = 11/35 (31%), Positives = 17/35 (48%), Gaps = 1/35 (2%)

Query: 359 QRNTIKTQNYQ-TRDPQVFAAGDIVEGDKTVVYAV 392
+ IK N+ DP F GD+ + D+ +V V
Sbjct: 189 DKTAIKVSNFAYNLDPDSFKHGDVTQSDRQLVKTV 223


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs3031  TCRTETB  51  4e-09  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 51.4 bits (123), Expect = 4e-09
Identities = 59/402 (14%), Positives = 140/402 (34%), Gaps = 19/402 (4%)

Query: 22 RVIICCFLVVMLDGFDTAAIGFIAPDIRTHWQLTAGDLAPLFGAGLLGLTAGALLCGPLS 81
+++I ++ + + PDI + + A +L + G + G LS
Sbjct: 14 QILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLS 73

Query: 82 DRFGRKRVIELCVFLFGALSLASAFS-PDLQTLVFLRFLTGLGLGGAMPNTIT-MTSEYL 139
D+ G KR++ + + S+ L+ RF+ G G A P + + + Y+
Sbjct: 74 DQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAG-AAAFPALVMVVVARYI 132

Query: 140 PARRRGALVTLMFCGFTLGSAFGGIVSAQLVPVIGWHGILVLGGVLPLMLFVALLVVLPE 199
P RG L+ +G G + + I W +L++ + + + L+ +L +
Sbjct: 133 PKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPMITIITVPF-LMKLLKK 191

Query: 200 SPRWQVRRQLPQAVI-----------AKTVSAITRERYVDTHFYLIESASVTKGSIRQLF 248
R + + ++ + S V + ++
Sbjct: 192 EVRIKGHFDIKGIILMSVGIVFFMLFTTSYSISFLIVSVLSFLIFVKHIRKVTDPFVDPG 251

Query: 249 MGRQLPITLMLWVVF--FMSLLIIYLLSSWMPTLLNHRGIDLQHASWVTAAFQIGGTLGA 306
+G+ +P + + F ++ + +M ++ + + G
Sbjct: 252 LGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGY 311

Query: 307 LALGVLMDKFNPFRVLTLSYAIGAICIVMIGLSQDG-LWLMALAIFGTGIGISGSQVGLN 365
+ G+L+D+ P VL + ++ + + W M + I G+S ++ ++
Sbjct: 312 IG-GILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGLSFTKTVIS 370

Query: 366 ALTATLYPTQSRATGVSWSNAIGRCGAIVGSLSGGVMMAMNF 407
+ ++ Q G+S N G G ++++
Sbjct: 371 TIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLSIPL 412



Score = 44.1 bits (104), Expect = 8e-07
Identities = 42/200 (21%), Positives = 79/200 (39%), Gaps = 5/200 (2%)

Query: 251 RQLPITLMLWVVFFMSLLIIYLLSSWMPTLLNHRGIDLQHASWVTAAFQIGGTLGALALG 310
R I + L ++ F S+L +L+ +P + N +WV AF + ++G G
Sbjct: 11 RHNQILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYG 70

Query: 311 VLMDKFNPFRVLTLSYAIGAICIVMIGLSQDGLWLMALAIFGTGIGISGSQVGLNALTAT 370
L D+ R+L I V+ + L+ +A F G G + + + A
Sbjct: 71 KLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVAR 130

Query: 371 LYPTQSRATGVSWSNAIGRCGAIVGSLSGGVMMAMNFSFDTLFFIIAVPAAISAVMLTLL 430
P ++R +I G VG GG M+A + L I I+ + + L
Sbjct: 131 YIPKENRGKAFGLIGSIVAMGEGVGPAIGG-MIAHYIHWSYLLLI----PMITIITVPFL 185

Query: 431 ITVVRQSTSVPDSLPRAGVV 450
+ ++++ + G++
Sbjct: 186 MKLLKKEVRIKGHFDIKGII 205


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs3026  SHAPEPROTEIN  29  0.030  Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 28.6 bits (64), Expect = 0.030
Identities = 32/127 (25%), Positives = 53/127 (41%), Gaps = 5/127 (3%)

Query: 122 GAKAMREAVPAHLPVSVKVRLGWDSGEK-KFEIADAVQQAGATELVVHGRTKEQGY-RAE 179
G EA+ ++ + +G + E+ K EI A E+ V GR +G R
Sbjct: 190 GGDRFDEAIINYVRRNYGSLIGEATAERIKHEIGSAYPGDEVREIEVRGRNLAEGVPRGF 249

Query: 180 HIDWQAIGE-IRQRLNIPVIANGEIWDWQSAQECMAISGCDSVMIGRGALNIPNLSRVVK 238
++ I E +++ L V A + + IS V+ G GAL + NL R++
Sbjct: 250 TLNSNEILEALQEPLTGIVSAVMVALEQCPPELASDISERGMVLTGGGAL-LRNLDRLL- 307

Query: 239 YNEPRMP 245
E +P
Sbjct: 308 MEETGIP 314


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs3024  DHBDHDRGNASE  113  1e-32  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 113 bits (284), Expect = 1e-32
Identities = 71/253 (28%), Positives = 116/253 (45%), Gaps = 12/253 (4%)

Query: 3 QVAIITASDSGIGKECALLLAQQGFDIGITWHSDEEGAKDTAREVVSHGVRAEIVQLDLG 62
++A IT + GIG+ A LA QG I ++ E+ K + AE D+
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKA-EARHAEAFPADVR 67

Query: 63 NLPEGAQALEKLIQRLGRIDVLVNNAGAMTKVPFLDMAFDEWRKIFTVDVDGAFLCSQIA 122
+ + ++ + +G ID+LVN AG + ++ +EW F+V+ G F S+
Sbjct: 68 DSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSV 127

Query: 123 ARQMVKQGQGGRIINITSVHEHTPLPDASAYTAAKHALGGLTKAMALELVRHKILVNAVA 182
++ M+ + + G I+ + S P +AY ++K A TK + LEL + I N V+
Sbjct: 128 SKYMMDR-RSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVS 186

Query: 183 PGAIATPM-------NGMDDSDVKPDAEP---SIPLRRFGATHEIASLVAWLCSEGANYT 232
PG+ T M + +K E IPL++ +IA V +L S A +
Sbjct: 187 PGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGHI 246

Query: 233 TGQSLIVDGGFML 245
T +L VDGG L
Sbjct: 247 TMHNLCVDGGATL 259


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs3023  BCTERIALGSPF  29  0.019  Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 28.6 bits (64), Expect = 0.019
Identities = 5/33 (15%), Positives = 16/33 (48%), Gaps = 2/33 (6%)

Query: 164 WLHNLDQHLKHW-VWLILVVVL-VVGVRWWLKR 194
L + ++ + W++L ++ + R L++
Sbjct: 215 VLMGMSDAVRTFGPWMLLALLAGFMAFRVMLRQ 247


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs3021  BLACTAMASEA  44  3e-07  Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 44.0 bits (104), Expect = 3e-07
Identities = 43/195 (22%), Positives = 77/195 (39%), Gaps = 18/195 (9%)

Query: 4 MPKFRVSLFSLALMLAVPLAPQAVAKTAAATTASQPEIASGSAMI-VDLNTNKVIYSNHP 62
M R+ + SL + +PLA A + S+ +++ MI +DL + + + +
Sbjct: 1 MRYIRLCIISL--LATLPLAVHASPQPLEQIKLSESQLSGRVGMIEMDLASGRTLTAWRA 58

Query: 63 DLVRPIASISKLMTAMVVLDARLPLDEKLKVDISQTPEMKGVYSRV---RLNSEISRKDM 119
D P+ S K++ VL DE+L+ I + YS V L ++ ++
Sbjct: 59 DERFPMMSTFKVVLCGAVLARVDAGDEQLERKIHYRQQDLVDYSPVSEKHLADGMTVGEL 118

Query: 120 LLLALMSSENRAAASLAHHYPGGYKAFIKAMNAKAKSLGMNNTRFV--EPTGLS-----V 172
A+ S+N +AA+L GG + A + +G N TR E
Sbjct: 119 CAAAITMSDN-SAANLLLATVGG----PAGLTAFLRQIGDNVTRLDRWETELNEALPGDA 173

Query: 173 HNVSTARDLTKLLIA 187
+ +T + L
Sbjct: 174 RDTTTPASMAATLRK 188


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs3007  GPOSANCHOR  29  0.033  Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 28.9 bits (64), Expect = 0.033
Identities = 13/80 (16%), Positives = 32/80 (40%), Gaps = 2/80 (2%)

Query: 79 NVALALLDERERNQQYIKRRDQENEEIALTVGKLRVELEAAKSKLNEQREYYEGVIADGS 138
N + A + + + + E ++ L ++ + L+ RE + + A+
Sbjct: 274 NFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEAEHQ 333

Query: 139 KRIAELEKQCAEWERKALSN 158
K E + + +E R++L
Sbjct: 334 K--LEEQNKISEASRQSLRR 351


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs2996  UREASE  27  0.014  Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 27.4 bits (61), Expect = 0.014
Identities = 18/66 (27%), Positives = 26/66 (39%), Gaps = 7/66 (10%)

Query: 57 IMLAQHALLIAISSDLNAYGVVCEFDWN----DGNGQEGWPSMDGSEGIRITD---IDTS 109
+ LA L I + D +G +F DG GQ G+ IT+ +D
Sbjct: 22 VRLADTELFIEVEKDFTTHGEEVKFGGGKVIRDGMGQSQVTREGGAVDTVITNALILDHW 81

Query: 110 GIFDSD 115
GI +D
Sbjct: 82 GIVKAD 87


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs2986  FLGMOTORFLIG  29  0.019  Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 28.6 bits (64), Expect = 0.019
Identities = 17/77 (22%), Positives = 27/77 (35%), Gaps = 11/77 (14%)

Query: 2 KNIAAQMVNFDREQM-----------RRIANNMPEQYDEKPQVQQVAQIINGVFSQLLAT 50
N+A ++ DR +++A+ E Y V V +IIN +
Sbjct: 165 TNVARRIALMDRTSPEVVREVERVLEKKLASLSSEDYTSAGGVDNVVEIINMADRKTEKF 224

Query: 51 FPASLANRDQNEVNEIR 67
SL D EI+
Sbjct: 225 IIESLEEEDPELAEEIK 241


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs2979  60KDINNERMP  28  0.014  60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 28.0 bits (62), Expect = 0.014
Identities = 9/37 (24%), Positives = 16/37 (43%)

Query: 78 MTYMKAYQKAWKEHRDRYQQDMEKLESENMELRRKLG 114
M M+ Q + R+R D +++ E M L +
Sbjct: 380 MAKMRMLQPKIQAMRERLGDDKQRISQEMMALYKAEK 416


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs2976  HTHFIS  27  0.004  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 27.1 bits (60), Expect = 0.004
Identities = 10/30 (33%), Positives = 17/30 (56%), Gaps = 4/30 (13%)

Query: 10 DMLVEAYE----NQTEVARILNCSRNTVRK 35
+++ A NQ + A +L +RNT+RK
Sbjct: 439 PLILAALTATRGNQIKAADLLGLNRNTLRK 468


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs2974  SHIGARICIN  120  3e-34  Ribosome inactivating protein family signature.
		>SHIGARICIN#Ribosome inactivating protein family signature.

Length = 289

Score = 120 bits (303), Expect = 3e-34
Identities = 49/283 (17%), Positives = 112/283 (39%), Gaps = 40/283 (14%)

Query: 3 IIIFRVLTFFFVIFSVNVVAKE----FTLDFSTAKTYVDSLNVIRSAIGTPLQTISSGGT 58
+I F V + + + A E F L +T+ +Y ++ +R A+ +
Sbjct: 1 MIRFLVFSLLILTLFLTAPAVEGDVSFRLSGATSSSYGVFISNLRKALPYERKL-----Y 55

Query: 59 SLLMIDSGTGDNLFAVDVRGIDPEEGRFNNLRLIVERNNLYVTGFVNRTNNVFYRFADF- 117
+ ++ S + + + + + + ++ N+YV G+ + Y F +
Sbjct: 56 DIPLLRSTLPGSQRYALIHLTNYADE---TISVAIDVTNVYVMGYRA--GDTSYFFNEAS 110

Query: 118 ----SHVTFPGTTA-VTLSGDSSYTTLQRVAGISRTGMQINRHSLTTSYLDLMSHSGTSL 172
+ F VTL +Y LQ AG R + + +L ++ L ++
Sbjct: 111 ATEAAKYVFKDAKRKVTLPYSGNYERLQIAAGKIRENIPLGLPALDSAITTLFYYNA--- 167

Query: 173 TQSVARAMLRFVTVTAEALRFRQIQRGFRTTLDDLSGRSYVMTAEDVDLTLNWGRLSSVL 232
S A A++ + T+EA R++ I++ +D +++ + + L +W LS +
Sbjct: 168 -NSAASALMVLIQSTSEAARYKFIEQQIGKRVDK----TFLPSLAIISLENSWSALSKQI 222

Query: 233 PDYHGQDSV----------RVGRISFGSINA--ILGSVALILN 263
+ + R++ +++A + ++AL+LN
Sbjct: 223 QIASTNNGQFETPVVLINAQNQRVTITNVDAGVVTSNIALLLN 265


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs2973  FLGMOTORFLIM  26  0.024  Flagellar motor switch protein FliM signature.
		>FLGMOTORFLIM#Flagellar motor switch protein FliM signature.

Length = 344

Score = 26.0 bits (57), Expect = 0.024
Identities = 7/36 (19%), Positives = 17/36 (47%)

Query: 38 DTFTVKVGDKELFTNRWNLQSLLLSAQITGMTVTIK 73
D F + +G+++ F + + ++AQI +
Sbjct: 293 DPFVLSIGNRKKFLCQPGVVGKKIAAQILERIESTS 328


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs2952  INTIMIN  31  0.006  Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 30.8 bits (69), Expect = 0.006
Identities = 23/119 (19%), Positives = 44/119 (36%), Gaps = 17/119 (14%)

Query: 134 KEVITRTVKVTNVGKPSVAEERSEITPATAIKVTP-------------TSGTVAKGKTTT 180
++ IT TVKV KP +E + T + + TS T K +
Sbjct: 675 QDAITYTVKVMKGDKPVSNQEVTFTTTLGKLSNSTEKTDTNGYAKVTLTSTTPGKSLVSA 734

Query: 181 LT--VSFEPESATDKTFRAVSADPSKATI--SVKDMTITVNGVATGKVQIPVVSGNGQF 235
V+ + ++ + F ++ D I + + + G+V + GNG++
Sbjct: 735 RVSDVAVDVKAPEVEFFTTLTIDDGNIEIVGTGVKGKLPTVWLQYGQVNLKASGGNGKY 793


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs2949  LCRVANTIGEN  34  0.002  Low calcium response V antigen signature.
		>LCRVANTIGEN#Low calcium response V antigen signature.

Length = 326

Score = 34.3 bits (78), Expect = 0.002
Identities = 25/101 (24%), Positives = 46/101 (45%), Gaps = 4/101 (3%)

Query: 529 DLWKAESQYAVL-KEAATKRQLSEQEKSLLAHKDETLEYKRQLAELG---DKVEYQKRLN 584
+++KA ++Y +L K T Q+ EK +++ KD ++ LG + Y K N
Sbjct: 205 EIFKASAEYKILEKMPQTTIQVDGSEKKIVSIKDFLGSENKRTGALGNLKNSYSYNKDNN 264

Query: 585 ELAQQAVRFEEQQSAKQAAISAKARGLTDRQAQRESEAQRL 625
EL+ A ++ +S K L+D ++ S + L
Sbjct: 265 ELSHFATTCSDKSRPLNDLVSQKTTQLSDITSRFNSAIEAL 305


36  ECs2920  ECs2912  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag DNBias  CDNBias    %GCBias  Product
ECs2920      -2       14  -4.360579  methionyl-tRNA synthetase
ECs2919       1       28  -7.700944  ATPase
ECs2918       2       33  -9.347871  hypothetical protein
ECs2917       2       29  -8.130499  fimbrial-like protein
ECs2916       1       26  -7.466162  chaperone
ECs2915       1       26  -7.048916  outer membrane protein
ECs2914       1       22  -7.644790  type-1 fimbrial protein
ECs2913       0       21  -4.893198  hypothetical protein
ECs2912       2       21  -3.195592  nickel/cobalt efflux protein RcnA
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs2915  PF00577  714  0.0  Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 714 bits (1844), Expect = 0.0
Identities = 241/843 (28%), Positives = 395/843 (46%), Gaps = 35/843 (4%)

Query: 2 LRMTPLASAI---VALLLGIEAHAAEETFDTHFMMGGMKGEQVTNLRL--DDNQPLPGQY 56
R+ + A +AE F+ F+ + V +L + + PG Y
Sbjct: 21 HRLAGFFVRLFVACAFAAQAPLSSAELYFNPRFLADD--PQAVADLSRFENGQELPPGTY 78

Query: 57 DIDIYVNKQWRGKYEIIVKDNPHET----CLTREIVKRLGIN-----SDNFARENQCLTF 107
+DIY+N + ++ E CLTR + +G+N N ++ C+
Sbjct: 79 RVDIYLNNGYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDACVPL 138

Query: 108 EQLVQGGSYSWDIGIFRLDLAVPQAWVEELENGYVPPENWERGINAFYTSYYVSQYYSDY 167
++ + D+G RL+L +PQA++ GY+PPE W+ GINA +Y S
Sbjct: 139 TSMIHDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSVQN 198

Query: 168 KASGNSKSTYVRFNSGLNLLGWQLHSDASFSKTDNNP-----GEWKSNTLYLEHGFSQIL 222
+ GNS Y+ SGLN+ W+L + ++S ++ +W+ +LE +
Sbjct: 199 RIGGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLR 258

Query: 223 GTLRIGDMYTSADIFDSVRFTGVRLFRDMQMLPNSKQNFTPRVQGIAQSNALVTIEQNGF 282
L +GD YT DIFD + F G +L D MLP+S++ F P + GIA+ A VTI+QNG+
Sbjct: 259 SRLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGY 318

Query: 283 VVYQKEVPPGPFSISDLQLAGGGADLDVSVKEADGSVTTYLVPYAAVPNMLQPGVSKYDF 342
+Y VPPGPF+I+D+ AG DL V++KEADGS + VPY++VP + + G ++Y
Sbjct: 319 DIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSI 378

Query: 343 AAGRSHIEGASKQSD-FVQAGYQYGFNNLLTLYGGTMVANNYYAFTLGTGWNT-RIGAIS 400
AG A ++ F Q+ +G T+YGGT +A+ Y AF G G N +GA+S
Sbjct: 379 TAGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALS 438

Query: 401 VDATKSHSKQDNGDVFDGQSYQIAYNKFLSQTSTRFGLAAWRYSSRDYRTFNDYVWANNK 460
VD T+++S + DGQS + YNK L+++ T L +RYS+ Y F D ++
Sbjct: 439 VDMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMN 498

Query: 461 DNYRRDKNDVYDI----ADYYQNDFGRKNSFSANMSQSLPEGWGSVSLSTLWRDYWGRSG 516
++ V + DYY + ++ ++Q L ++ LS + YWG S
Sbjct: 499 GYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGR-TSTLYLSGSHQTYWGTSN 557

Query: 517 SSKDYQLSYSNNWRRISYTLAASQAYDENHAE-EKRFNIFISIPFD--WGDDVTTPRRQI 573
+ +Q + + I++TL+ S + ++ + ++IPF D + R
Sbjct: 558 VDEQFQAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRSDSKSQWRHA 617

Query: 574 YMSNSTTFDDQGFASNNTGLSGTVGNRDQFNYGINLSHQHQGN---ETTAGANLTWTAPA 630
S S + D G +N G+ GT+ + +Y + + G+ +T A L +
Sbjct: 618 SASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGY 677

Query: 631 ATVNGSYSQSSTYRQVGASVSGGLVAWSGGVNLANRLSETFAVMHAPGIKDAYVNGQKYR 690
N YS S +Q+ VSGG++A + GV L L++T ++ APG KDA V Q
Sbjct: 678 GNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNDTVVLVKAPGAKDAKVENQTGV 737

Query: 691 TTNCNGVVVYDGLTPYRENHLMMDVSQSDSETELRGNRKMTAPYRGAVVLVDFDTDQRKP 750
T+ G V T YREN + +D + +L P RGA+V +F +
Sbjct: 738 RTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAEFKA-RVGI 796

Query: 751 WFIKALRSDGQPLTFGYEVNDMHGHNIGVVGQGSQIFIRTNEIPPAVNVAIDKQQGLSCT 810
+ L + +PL FG V + G+V Q+++ + V V +++ C
Sbjct: 797 KLLMTLTHNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLAGKVQVKWGEEENAHCV 856

Query: 811 ITF 813
+
Sbjct: 857 ANY 859


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs2913  TYPE3OMGPROT  28  0.008  Type III secretion system outer membrane G protein ...
		>TYPE3OMGPROT#Type III secretion system outer membrane G protein family signature.

Length = 607

Score = 27.9 bits (62), Expect = 0.008
Identities = 13/42 (30%), Positives = 21/42 (50%), Gaps = 1/42 (2%)

Query: 6 KMLLGALLLVTSAAWAAPATAGSTNTSGISKYE-LSSFIADF 46
++L G LLL++S +WA ++K E L + DF
Sbjct: 11 RVLTGTLLLLSSYSWAQELDWLPIPYVYVAKGESLRDLLTDF 52


37  ECs2899  ECs2884  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag DNBias  CDNBias    %GCBias  Product
ECs2899       3       19  -2.358668  tagatose-bisphosphate aldolase
ECs2898       3       19  -3.382913  tagatose 6-phosphate kinase 1
ECs2897       3       20  -3.316649  PTS system galactitol-specific transporter
ECs2896       3       20  -3.501330  PTS system galactitol-specific transporter
ECs2895       3       20  -3.338425  PTS system galactitol-specific enzyme IIC
ECs2894       0       13  -2.517447  galactitol-1-phosphate dehydrogenase
ECs2893      -1       13  -2.318576  split galactitol utilization operon repressor
ECs2892      -3       11  -0.922396  lipid kinase
ECs2891      -2       10   0.295849  hypothetical protein
ECs2890      -2       14   2.120437  hypothetical protein
ECs2889      -2       18   3.285465  hypothetical protein
ECs2888      -2       19   3.887611  hypothetical protein
ECs2887      -2       18   4.050261  BaeR family transcriptional regulator
ECs2886      -3       17   3.959565  signal transduction histidine-protein kinase
ECs2885      -3       17   3.764950  multidrug efflux system protein MdtE
ECs2884      -3       13   3.099938  multidrug efflux system subunit MdtC
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs2894  DHBDHDRGNASE  34  7e-04  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 33.9 bits (77), Expect = 7e-04
Identities = 22/92 (23%), Positives = 36/92 (39%), Gaps = 2/92 (2%)

Query: 156 AQGCENKNVIIIGAGT-IGLLAIQCAVALGAKSVTAIDISSEKLALAKSFGAMQTFNSSE 214
A+G E K I GA IG + + GA + A+D + EKL S + ++
Sbjct: 3 AKGIEGKIAFITGAAQGIGEAVARTLASQGAH-IAAVDYNPEKLEKVVSSLKAEARHAEA 61

Query: 215 MSAPQMQSVLRELRFNQLILETAGVPQTVELA 246
A S + ++ E + V +A
Sbjct: 62 FPADVRDSAAIDEITARIEREMGPIDILVNVA 93


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs2887  HTHFIS  76  4e-18  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 76.0 bits (187), Expect = 4e-18
Identities = 28/136 (20%), Positives = 65/136 (47%), Gaps = 1/136 (0%)

Query: 11 PRILIVEDEPKLGQLLIDYLRAASYAPTLISHGDQVLSYVRQTPPDLILLDLMLPGTDGL 70
IL+ +D+ + +L L A Y + S+ + ++ DL++ D+++P +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 71 TLCREIR-RFSDIPIVMVTAKIEEIDRLLGLEIGADDYICKPYSPREVVARVKTILRRCK 129
L I+ D+P+++++A+ + + E GA DY+ KP+ E++ + L K
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 130 PQRELQQQDAESPLII 145
+ + D++ + +
Sbjct: 124 RRPSKLEDDSQDGMPL 139


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2886   BCTERIALGSPF  31     0.009    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F

signature.
Length = 408

Score = 31.3 bits (71), Expect = 0.009
Identities = 27/93 (29%), Positives = 34/93 (36%), Gaps = 27/93 (29%)

Query: 173 LATLLAALATFLLA-------------RGLLAPVKRLVDGTHKLAAGDFTTRVTPTSEDE 219
LATL+AA A L+A V+ V H LA + P S +
Sbjct: 77 LATLVAASMPLEEALDAVAKQSEKPHLSQLMAAVRSKVMEGHSLAD---AMKCFPGSFER 133

Query: 220 L-----------GKLAQDFNQLASTLEKNQQMR 241
L G L N+LA E+ QQMR
Sbjct: 134 LYCAMVAAGETSGHLDAVLNRLADYTEQRQQMR 166


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2885   TCRTETB       126    8e-34    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 126 bits (317), Expect = 8e-34
Identities = 97/429 (22%), Positives = 188/429 (43%), Gaps = 23/429 (5%)

Query: 20 FMQSLDTTIVNTALPSMAQSLGESPLHMHMVIVSYVLTVAVMLPASGWLADKVGVRNIFF 79
F L+ ++N +LP +A + P + V +++LT ++ G L+D++G++ +
Sbjct: 24 FFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLL 83

Query: 80 TAIVLFTLGSLFCALSGTLNELL-LARALQGVGGAMMVPVGRLTVMKIVPREQYMAAMTF 138
I++ GS+ + + LL +AR +QG G A + + V + +P+E A
Sbjct: 84 FGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGL 143

Query: 139 VTLPGQVGPLLGPALGGLLVEYASWHWIFLINIPVGIIGAIATLM-LMPNYTMQTRRFDL 197
+ +G +GPA+GG++ Y HW +L+ IP+ I + LM L+ FD+
Sbjct: 144 IGSIVAMGEGVGPAIGGMIAHY--IHWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDI 201

Query: 198 SGFLLLAVGMAVLTLALDGSKGTGLSPLAITGLVAVGVVALVLYLLHARNNNRALFSLKL 257
G +L++VG+ L + + V V++ ++++ H R L
Sbjct: 202 KGIILMSVGIVFFMLF---------TTSYSISFLIVSVLSFLIFVKHIRKVTDPFVDPGL 252

Query: 258 FRTRTFSLGLAGSFAGRIGSGMLPFMTPVFLQIGLGFSPFHAG-LMMIPMVLGSMGMKRI 316
+ F +G+ M P ++ S G +++ P + + I
Sbjct: 253 GKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYI 312

Query: 317 VVQVVNRFGYRRVLVATTLGLSLVTLLFMTTALL----GWYYVLPFVLFLQGMVNSTRFS 372
+V+R G VL +G++ +++ F+T + L W+ + V L G+ S +
Sbjct: 313 GGILVDRRGPLYVL---NIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGL--SFTKT 367

Query: 373 SMNTLTLKDLPDNLASSGNSLLSMIMQLSMSIGVTIAGLLLGLFGSQHVSVDSSTTQTVF 432
++T+ L A +G SLL+ LS G+ I G LL + + Q+ +
Sbjct: 368 VISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLSIPLLDQRLLPMEVDQSTY 427

Query: 433 MYTWLSMAF 441
+Y+ L + F
Sbjct: 428 LYSNLLLLF 436


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2884   ACRIFLAVINRP  922    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 922 bits (2384), Expect = 0.0
Identities = 289/1035 (27%), Positives = 507/1035 (48%), Gaps = 36/1035 (3%)

Query: 6 LFIYRPVATILLSVAITLCGILGFRMLPVAPLPQVDFPVIMVSASLPGASPETMASSVAT 65
FI RP+ +L++ + + G L LPVA P + P + VSA+ PGA +T+ +V
Sbjct: 4 FFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTVTQ 63

Query: 66 PLERSLGRIAGVSEMTSSS-SLGSTRIILQFDFDRDINGAARDVQAAINAAQSLLPSGMP 124
+E+++ I + M+S+S S GS I L F D + A VQ + A LLP +
Sbjct: 64 VIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQEVQ 123

Query: 125 SRPTYRKANPSDAPIMILTLTSDT--YSQGELYDFASTQLAPTISQIDGVGDVDVGGSSL 182
+ S + +M+ SD +Q ++ D+ ++ + T+S+++GVGDV + G+
Sbjct: 124 -QQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGAQY 182

Query: 183 PAVRVGLNPQALFNQGVSLDDVRTAISNANVRKPQG------ALEDGTHRWQIQTNDELK 236
A+R+ L+ L ++ DV + N + G AL I K
Sbjct: 183 -AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFK 241

Query: 237 TAAEYQPLIIHYN-NGGAVRLGDVATVTDSVQDVRNAGMTNAKPAILLMIRKLPEANIIQ 295
E+ + + N +G VRL DVA V ++ N KPA L I+ AN +
Sbjct: 242 NPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANALD 301

Query: 296 TVDSIRAKLPELQETIPAAIDLQIAQDRSPTIRASLEEVEQTLIISVALVILVVFLFLRS 355
T +I+AKL ELQ P + + D +P ++ S+ EV +TL ++ LV LV++LFL++
Sbjct: 302 TAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQN 361

Query: 356 GRATIIPAVAVPVSLIGTFAAMYLCGFSLNNLSLMALTIATGFVVDDAIVVLENIARHL- 414
RAT+IP +AVPV L+GTFA + G+S+N L++ + +A G +VDDAIVV+EN+ R +
Sbjct: 362 MRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMM 421

Query: 415 EAGMKPLQAALQGTREVGFTVLSMSLSLVAVFLPLLLMGGLPGRLLREFAVTLSVAIGIS 474
E + P +A + ++ ++ +++ L AVF+P+ GG G + R+F++T+ A+ +S
Sbjct: 422 EDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALS 481

Query: 475 LLVSLTLTPMMCGWMLKASKPREQKRLRGFG----RMLVALQQGYGKSLKWVLNHTRLVG 530
+LV+L LTP +C +LK + GF Y S+ +L T
Sbjct: 482 VLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRYL 541

Query: 531 MVLLGTIALNIWLYISIPKTFFPEQDTGVLMGGIQADQSISFQ----AMRGKLQDFMKII 586
++ +A + L++ +P +F PE+D GV + IQ + + + ++K
Sbjct: 542 LIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKNE 601

Query: 587 RD-DPAVDNVTGFT-GGSRVNSGMMFITLKPRDERS---ETAQQIIDRLRVKLAKEPGAN 641
+ +V V GF+ G N+GM F++LKP +ER+ +A+ +I R +++L K
Sbjct: 602 KANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDGF 661

Query: 642 LFLMAVQDIRVGGRQSNASYQYTLLSDDLAALREWEPKIRKKLATL-----PELADVNSD 696
+ + I G + ++ L D + + R +L + L V +
Sbjct: 662 VIPFNMPAIVELGTATGFDFE---LIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPN 718

Query: 697 QQDNGAEMNLVYDRDTMARLGIDVQAANSLLNNAFGQRQISTIYQPMNQYKVVMEVDPRY 756
++ A+ L D++ LG+ + N ++ A G ++ K+ ++ D ++
Sbjct: 719 GLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKF 778

Query: 757 TQDISALEKMFVINNEGKAIPLSYFAKWQPANAPLSVNHQGLSAASTISFNLPTGKSLSD 816
++K++V + G+ +P S F + + I G S D
Sbjct: 779 RMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGD 838

Query: 817 ASAAIDRAMTQLGVPSTVRGSFAGTAQVFQETMNSQVILIIAAIATVYIVLGILYESYVH 876
A A ++ ++L P+ + + G + + + N L+ + V++ L LYES+
Sbjct: 839 AMALMENLASKL--PAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSI 896

Query: 877 PLTILSTLPSAGVGALLALELFNAPFSLIALIGIMLLIGIVKKNAIMMVDFALEAQRHGN 936
P++++ +P VG LLA LFN + ++G++ IG+ KNAI++V+FA +
Sbjct: 897 PVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEG 956

Query: 937 LTPQEAIFQACLLRFRPIMMTTLAALFGALPLVLSGGDGSELRQPLGITIVGGLVMSQLL 996
EA A +R RPI+MT+LA + G LPL +S G GS + +GI ++GG+V + LL
Sbjct: 957 KGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLL 1016

Query: 997 TLYTTPVVYLFFDRL 1011
++ PV ++ R
Sbjct: 1017 AIFFVPVFFVVIRRC 1031


38  ECs2859  ECs2824  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
ECs2859   -1      22       3.172716    colanic acid biosynthesis acetyltransferase
ECs2858   -1      24       3.754271    GDP-D-mannose dehydratase
ECs2857   0       24       3.663529    GDP-fucose synthetase chain A
ECs2856   0       24       3.274685    GDP-mannose mannosyl hydrolase
ECs2855   0       24       3.100538    glycosyl transferase family protein
ECs2854   0       24       2.973978    mannose-1-phosphate guanylyltransferase
ECs2853   -1      21       1.511683    phosphomannomutase
ECs2852   -1      17       0.202770    UDP-glucose lipid carrier transferase
ECs2851   -2      15       -0.378619   colanic acid exporter
ECs2850   -3      16       -3.532157   pyruvyl transferase
ECs2849   -2      21       -8.039600   colanic acid biosynthesis glycosyl transferase
ECs2848   0       31       -11.961457  colanic acid biosynthesis protein
ECs2847   2       44       -15.640784  UDP-galactose 4-epimerase
ECs2846   4       51       -17.409226  UTP-glucose-1-phosphate uridylyltransferase
ECs2845   7       60       -20.228240  glycosyl transferase family protein
ECs2844   7       60       -18.770512  O antigen polymerase
ECs2843   6       60       -17.897592  glycosyl transferase family protein
ECs2842   4       58       -16.649775  O antigen flippase
ECs2841   4       55       -15.374856  perosamine synthetase
ECs2840   4       55       -14.723114  glycosyl transferase family protein
ECs2839   0       36       -9.367071   GDP-D-mannose dehydratase
ECs5479   -1      30       -8.741977   hypothetical protein
ECs2838   -1      34       -9.397862   fucose synthetase
ECs2837   -1      23       -6.619646   GDP-L-fucose pathway enzyme
ECs2836   -1      22       -6.343585   mannose-1-phosphate guanylyltransferase
ECs2835   -1      16       -4.276369   phosphomannomutase
ECs2833   0       19       -5.406295   H repeat-containing protein
ECs2831   -1      17       -4.425539   acetyltransferase
ECs2830   -2      17       -2.067176   6-phosphogluconate dehydrogenase
ECs2829   -3      18       -1.644790   UDP-glucose 6-dehydrogenase
ECs2828   -2      19       0.694207    regulator of length of O-antigen component of
ECs2827   -1      26       2.524450    bifunctional phosphoribosyl-AMP
ECs2826   0       23       3.346201    imidazole glycerol phosphate synthase subunit
ECs2825   -1      23       3.669858    1-(5-phosphoribosyl)-5-[(5-
ECs2824   -2      19       3.112154    imidazole glycerol phosphate synthase subunit
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2858   NUCEPIMERASE  104    1e-27    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 104 bits (262), Expect = 1e-27
Identities = 76/353 (21%), Positives = 122/353 (34%), Gaps = 42/353 (11%)

Query: 6 LITGVTGQDGSYLAEFLLEKGYEVHGIKRRASSFNTERVDHIYQDPH--------TCNPK 57
L+TG G G ++++ LLE G++V GI + N Y D P
Sbjct: 4 LVTGAAGFIGFHVSKRLLEAGHQVVGI----DNLND------YYDVSLKQARLELLAQPG 53

Query: 58 FHLHYGDLSDTSNLTRILREVQPDEVYNLGAMSHVAVSFESPEYTADVDAMGTLRLLEAI 117
F H DL+D +T + + V+ V S E+P AD + G L +LE
Sbjct: 54 FQFHKIDLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGC 113

Query: 118 RFLGLEKKTRFYQASTSELYGLVQEIPQKETTPF-YPRSPYAVAKLYAYWITVNYRESYG 176
R ++ AS+S +YGL +++P +P S YA K + Y YG
Sbjct: 114 RHNKIQ---HLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYG 170

Query: 177 MYACNGILFNHESPRRGETFVTRKITRAIANIAQGLESCLYLGNMDSLRDWGHAKDYVKM 236
+ A F P K T+A+ G +Y RD+ + D +
Sbjct: 171 LPATGLRFFTVYGPWGRPDMALFKFTKAMLE---GKSIDVY-NYGKMKRDFTYIDDIAEA 226

Query: 237 QWMMLQQEQPEDFVIATGVQYSVRQFVEMAAAQLGIKLRFEGTGVEEKGIVVSVTGHDAP 296
+ D +G + E + DA
Sbjct: 227 IIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIG------NSSPVELMDYIQAL-EDAL 279

Query: 297 GVKPGDVIIAVDPRY--FRPAEVETLLGDPTKAHEKLGWKPEITLREMVSEMV 347
G++ +P +V D +E +G+ PE T+++ V V
Sbjct: 280 GIE-------AKKNMLPLQPGDVLETSADTKALYEVIGFTPETTVKDGVKNFV 325


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2857   NUCEPIMERASE  87     1e-21    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 87.1 bits (216), Expect = 1e-21
Identities = 66/344 (19%), Positives = 132/344 (38%), Gaps = 47/344 (13%)

Query: 5 RIFIAGHRGMVGSAIRRQLEQRG-------------DVEL------VLRTRD----ELNL 41
+ + G G +G + ++L + G DV L +L +++L
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 42 LDSRAVHDFFASERIDQVYLAAAKVGGIVANNTYPADFIYQNMMIESNIIHAAHQNDVNK 101
D + D FAS ++V+++ + + + P + N+ NI+ N +
Sbjct: 62 ADREGMTDLFASGHFERVFISPHR-LAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQH 120

Query: 102 LLFLGSSCIYPKLAKQPMAESELLQGTLEPTNEPYAIAKIAGIKLCESYNRQYGRDYRSV 161
LL+ SS +Y K P + P + YA K A + +Y+ YG +
Sbjct: 121 LLYASSSSVYGLNRKMPFSTD---DSVDHPVS-LYAATKKANELMAHTYSHLYGLPATGL 176

Query: 162 MPTNLYGPHDNFHPSNSHVIPALLRRFHEATAQNAPDVVVWGSGTPMREFLHVDDMAAAS 221
+YGP P L +F +A + + V+ G R+F ++DD+A A
Sbjct: 177 RFFTVYGPWGR--PD------MALFKFTKAMLEGKS-IDVYNYGKMKRDFTYIDDIAEAI 227

Query: 222 IHVMELAH----EVWLENTQPMLSH-----INVGTGVDCTIRELAQTIAKVVGYKGRVVF 272
I + ++ + +E P S N+G + + Q + +G + +
Sbjct: 228 IRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAKKNM 287

Query: 273 DASKPDGTPRKLLDVTRLHQ-LGWYHEISLEAGLASTYQWFLEN 315
+P D L++ +G+ E +++ G+ + W+ +
Sbjct: 288 LPLQPGDVLETSADTKALYEVIGFTPETTVKDGVKNFVNWYRDF 331


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2847   NUCEPIMERASE  94     5e-24    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 93.7 bits (233), Expect = 5e-24
Identities = 70/334 (20%), Positives = 126/334 (37%), Gaps = 62/334 (18%)

Query: 4 NVLLIGASGFVGT----RLLE----------------TAIADFNIKNLDKQQSHFYPEIT 43
L+ GA+GF+G RLLE ++ ++ L + F+
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHK--- 58

Query: 44 QIGDVRDQQALDQALA--GFDTVVLLAAEH--RDDVSPTSLYYDVNVQGTRNVLAAMEKN 99
D+ D++ + A F+ V + R + Y D N+ G N+L N
Sbjct: 59 --IDLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHN 116

Query: 100 GVKNIIFTSSVAVYGLNKHNP-DENHPHD-PFNHYGKSKWQAEEVLREWYNKA---PTER 154
++++++ SS +VYGLN+ P + D P + Y +K +A E++ Y+ P
Sbjct: 117 KIQHLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATK-KANELMAHTYSHLYGLPA-- 173

Query: 155 SLTIIRPTVIFGERNRGN--VYNLLKQIAGGKFMMV-GAGTNYKSMAYVGNIVEFIKYKL 211
T +R ++G R + ++ K + GK + V G + Y+ +I E I
Sbjct: 174 --TGLRFFTVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIRLQ 231

Query: 212 KNVA-----------------AGYEVYNYVDKPDLNMNQLVAEVEQSLNKKIPSMHLPYP 254
+ A Y VYN + + + + +E +L + LP
Sbjct: 232 DVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAKKNMLPLQ 291

Query: 255 LGMLGGYCFDI--LSKITGKKYAVS-SVRVKKFC 285
G + D L ++ G + VK F
Sbjct: 292 PGDVLETSADTKALYEVIGFTPETTVKDGVKNFV 325


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2839   NUCEPIMERASE  105    8e-28    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 105 bits (263), Expect = 8e-28
Identities = 73/353 (20%), Positives = 121/353 (34%), Gaps = 42/353 (11%)

Query: 6 LITGVTGQDGSYLAEFLLDKGYEVHGIKRRASSFNTERIDHIYQDPH--------GSNPN 57
L+TG G G ++++ LL+ G++V GI + N Y D + P
Sbjct: 4 LVTGAAGFIGFHVSKRLLEAGHQVVGI----DNLND------YYDVSLKQARLELLAQPG 53

Query: 58 FHLHYGDLTDSSNLTRILKEVQPDEVYNLAAMSHVAVSFESPEYTADVDAIGTLRLLEAI 117
F H DL D +T + + V+ V S E+P AD + G L +LE
Sbjct: 54 FQFHKIDLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGC 113

Query: 118 RFLGLENKTRFYQASTSELYGLVQEIPQKESTPF-YPRSPYAVAKLYAYWITVNYRESYG 176
R ++ AS+S +YGL +++P +P S YA K + Y YG
Sbjct: 114 RHNKIQ---HLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYG 170

Query: 177 IYACNGILFNHESPRRGETFVTRKITRGLANIAQGLESCLYLGNMDSLRDWGHAKDYVRM 236
+ A F P K T+ + +G +Y RD+ + D
Sbjct: 171 LPATGLRFFTVYGPWGRPDMALFKFTK---AMLEGKSIDVY-NYGKMKRDFTYIDDIAEA 226

Query: 237 QWLMLQQEQPEDFVIATGVQYSVRQFVEMAAAQLGIKMSFVGKGIEEKGIVDSVEGQDAP 296
+ D +G +E + ++E DA
Sbjct: 227 IIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIG-----NSSPVELMDYIQALE--DAL 279

Query: 297 GVKPGDVIVAVDPRY--FRPAEVDTLLGDPSKANLKLGWRPEITLAEMISEMV 347
G++ +P +V D +G+ PE T+ + + V
Sbjct: 280 GIE-------AKKNMLPLQPGDVLETSADTKALYEVIGFTPETTVKDGVKNFV 325


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2838   NUCEPIMERASE  94     5e-24    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 93.7 bits (233), Expect = 5e-24
Identities = 64/347 (18%), Positives = 129/347 (37%), Gaps = 53/347 (15%)

Query: 5 RIFIAGHQGMVGSAITRRLKQRD-------------DVEL------VLRTRD----ELNL 41
+ + G G +G +++RL + DV L +L +++L
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 42 LDSSAVLDFFSSQKIDQVYLAAAKVGGILANSSYPADFIYENIMIEANVIHAAHKNNVNK 101
D + D F+S ++V+++ + + + P + N+ N++ N +
Sbjct: 62 ADREGMTDLFASGHFERVFISPHR-LAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQH 120

Query: 102 LLFLGSSCIYPKLAHQPIMEDELLQGKLEPTNEP---YAIAKIAGIKLCESYNRQFGRDY 158
LL+ SS +Y P D + + P YA K A + +Y+ +G
Sbjct: 121 LLYASSSSVYGLNRKMPFSTD-------DSVDHPVSLYAATKKANELMAHTYSHLYGLPA 173

Query: 159 RSVMPTNLYGPNDNFHPSNSHVIPALLRRFHDAVENNSPNVVVWGSGTPKREFLHVDDMA 218
+ +YGP P L +F A+ + V+ G KR+F ++DD+A
Sbjct: 174 TGLRFFTVYGPWGR--PD------MALFKFTKAMLEGKS-IDVYNYGKMKRDFTYIDDIA 224

Query: 219 SASIYVMEMPYDIWQKNTK---------VMLSHINIGTGIDCTICELAETIAKVVGYKGH 269
A I + ++ + T NIG + + + + +G +
Sbjct: 225 EAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAK 284

Query: 270 ITFDTTKPDGAPRKLLDVTLLHQ-LGWNHKITLHKGLENTYNWFLEN 315
+P D L++ +G+ + T+ G++N NW+ +
Sbjct: 285 KNMLPLQPGDVLETSADTKALYEVIGFTPETTVKDGVKNFVNWYRDF 331


39  ECs2809  ECs2680  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
ECs2809   2       24       0.200654    hypothetical protein
ECs2808   6       27       2.422196    hypothetical protein
ECs2807   7       29       3.195529    hypothetical protein
ECs2806   6       30       4.201831    hypothetical protein
ECs2805   5       30       2.328601    structural protein
ECs2804   5       26       0.131696    hypothetical protein
ECs2803   3       23       0.826693    DNA repair protein
ECs2802   3       22       0.595722    antirestriction protein
ECs2801   2       23       0.511826    hypothetical protein
ECs2800   2       22       -0.024563   hypothetical protein
ECs2799   2       22       0.928199    hypothetical protein
ECs2798   3       23       -0.881738   hypothetical protein
ECs2797   2       21       0.290067    hypothetical protein
ECs2796   2       21       0.368250    hypothetical protein
ECs2795   2       21       0.185449    transposase
ECs2794   1       21       -0.232361   transposase
ECs2792   1       19       -0.064556   hypothetical protein
ECs2791   1       19       3.023594    hypothetical protein
ECs2790   1       22       0.464758    hypothetical protein
ECs2789   1       27       -0.638950   hypothetical protein
ECs2788   0       27       -1.238758   adenosylcobinamide kinase/adenosylcobinamide
ECs2787   -1      27       -2.683532   adenosylcobinamide-GDP ribazoletransferase
ECs2786   -2      27       -2.982748   nicotinate-nucleotide--dimethylbenzimidazole
ECs2785   -3      25       -3.258206   hypothetical protein
ECs2784   -3      28       -5.021798   *nitrogen assimilation transcriptional regulator
ECs2783   -2      23       -3.979277   transcriptional regulator Cbl
ECs2780   -1      19       -3.058835   **hypothetical protein
ECs2779   -1      19       -2.738010   AMP nucleosidase
ECs2778   -1      19       -3.315951   shikimate transporter
ECs2777   0       20       -3.658454   hypothetical protein
ECs2776   0       18       -2.684051   hypothetical protein
ECs2775   0       18       -2.960118   hypothetical protein
ECs2774   2       23       -4.855605   *hypothetical protein
ECs2773   1       23       -5.123578   *integrase
ECs2772   1       21       -4.056464   hypothetical protein
ECs2771   2       23       -4.659118   hypothetical protein
ECs2770   3       26       -4.917114   hypothetical protein
ECs2769   8       31       -6.254719   hypothetical protein
ECs2768   3       28       -2.576930   cell division inhibition protein
ECs2767   3       28       0.002865    hypothetical protein
ECs2766   2       27       0.246080    repressor protein
ECs2765   2       26       1.450666    cell division control protein
ECs2764   2       26       1.524224    hypothetical protein
ECs2763   1       30       0.799234    hypothetical protein
ECs2762   1       30       -1.133543   phage replication protein
ECs2761   2       30       -3.818703   hypothetical protein
ECs2760   5       33       -5.526069   hypothetical protein
ECs2759   4       31       -4.403411   hypothetical protein
ECs2758   4       29       -3.690518   hypothetical protein
ECs2757   2       25       -0.752995   hypothetical protein
ECs2756   0       25       0.552798    hypothetical protein
ECs2755   -1      25       -0.180200   hypothetical protein
ECs2754   -2      23       -0.138666   prophage maintenance protein
ECs2753   -1      22       -0.065489   hypothetical protein
ECs2752   0       23       1.433486    hypothetical protein
ECs2751   1       24       1.442474    crossover junction endodeoxyribonuclease
ECs2750   0       23       1.477340    antiterminator
ECs2749   3       24       2.860761    ***hypothetical protein
ECs2748   4       25       2.670333    hypothetical protein
ECs2746   3       22       2.795643    hypothetical protein
ECs2745   3       23       1.270369    transposase
ECs2744   1       24       0.777691    transposase
ECs2743   2       23       -1.059061   holin
ECs2742   0       26       -1.795847   hypothetical protein
ECs2741   -1      24       -2.225915   endolysin
ECs2740   1       21       -1.137176   hypothetical protein
ECs2739   1       22       -0.636387   endopeptidase
ECs5470   2       24       1.134763    hypothetical protein
ECs2737   2       23       3.536728    transcriptional regulator
ECs2736   2       24       3.927178    terminase small subunit
ECs2735   1       23       4.543244    terminase large subunit
ECs2734   -1      26       5.706868    head completion protein
ECs2733   0       26       5.394095    portal protein
ECs2732   1       23       5.453450    head-tail preconnector protein
ECs2731   2       24       3.299797    head decoration protein
ECs2730   2       27       4.083797    major head protein
ECs2729   3       29       5.055210    hypothetical protein
ECs2728   5       32       6.431495    minor tail protein
ECs2727   4       32       6.822065    minor tail protein
ECs2724   4       31       7.553492    minor tail protein
ECs2723   4       31       7.130325    minor tail protein
ECs2722   4       29       5.215309    tail assembly protein
ECs2721   3       27       1.511943    tail assembly protein
ECs2718   3       31       -0.777879   outer membrane protein
ECs2717   4       36       -2.902186   tail fiber protein
ECs2716   6       42       -9.292889   hypothetical protein
ECs2715   5       41       -8.867456   EspF-like protein
ECs2714   2       29       -8.204790   hypothetical protein
ECs2713   1       25       -6.516585   hypothetical protein
ECs2712   1       27       -6.100236   cytochrome
ECs2711   0       31       -7.228493   hypothetical protein
ECs2710   0       27       -5.735515   sulfite oxidase subunit YedZ
ECs2709   -1      28       -6.254719   sulfite oxidase subunit YedY
ECs2708   0       29       -8.202555   5-hydroxyisourate hydrolase
ECs2707   0       28       -7.739066   transcriptional regulatory protein YedW
ECs2706   -1      26       -7.435544   2-component sensor protein
ECs2705   -3      20       -3.632025   chaperone protein HchA
ECs2704   -3      13       -2.292523   hypothetical protein
ECs2703   -2      12       -0.991670   hypothetical protein
ECs2702   0       14       0.398077    outer membrane protein
ECs2701   2       18       1.532819    hypothetical protein
ECs2700   2       18       1.805731    hypothetical protein
ECs2699   0       16       1.075025    DNA cytosine methylase
ECs2698   -2      17       1.113004    DNA mismatch endonuclease
ECs2697   -3      16       0.559240    hypothetical protein
ECs2696   -2      16       0.289734    hypothetical protein
ECs2695   -1      19       -2.340349   hypothetical protein
ECs2694   -2      16       -2.739008   hypothetical protein
ECs2693   -1      21       -4.182513   mannosyl-3-phosphoglycerate phosphatase
ECs2692   -1      20       -3.970371   hypothetical protein
ECs2691   0       17       -3.074322   hypothetical protein
ECs2690   0       16       -2.395259   transcriptional regulator for ctr capsule
ECs2689   -1      17       0.576501    flagellar biosynthesis protein FliR
ECs2688   -2      21       1.823596    flagellar biosynthesis protein FliQ
ECs2687   -1      17       2.388252    flagellar biosynthesis protein FliP
ECs2686   -1      17       2.384514    flagellar biosynthesis protein FliO
ECs2685   -1      19       3.510490    flagellar motor switch protein FliN
ECs2684   0       17       3.773591    flagellar motor switch protein FliM
ECs2683   1       18       4.342646    flagellar basal body-associated protein FliL
ECs2682   0       15       4.159917    flagellar hook-length control protein
ECs2681   0       17       4.245621    flagellar biosynthesis chaperone
ECs2680   -1      15       3.986095    flagellum-specific ATP synthase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2809   FbpA_PF05833  28     0.012    Fibronectin-binding protein
		>FbpA_PF05833#Fibronectin-binding protein

Length = 577

Score = 27.5 bits (61), Expect = 0.012
Identities = 13/83 (15%), Positives = 33/83 (39%), Gaps = 6/83 (7%)

Query: 16 RLFRRKNKLQREIQDVEKKIRDNQKRVLLLDNLSDYIKPGMSVEAIQGIIASMKGDYEDR 75
+++ NKL++ + +++ N++ + L ++ I + + I+ I E
Sbjct: 385 SYYKKYNKLKKSEEAANEQLLQNEEELNYLYSVLTNINNADNYDEIEEIKK------ELI 438

Query: 76 VDDYIIKNAELSKERRDISKKLK 98
YI ++ SK +
Sbjct: 439 ETGYIKFKKIYKSKKSKTSKPMH 461


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2778   TCRTETB       33     0.002    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 32.9 bits (75), Expect = 0.002
Identities = 38/259 (14%), Positives = 92/259 (35%), Gaps = 18/259 (6%)

Query: 79 LGGVIFGHFGDRLGRKRMLMLTVWMMGIATALIGILPSFSTIGWWAPILLVTLRAIQGFA 138
+G ++G D+LG KR+L+ + + + + + SF ++ I+ ++ A
Sbjct: 64 IGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSL----LIMARFIQGAGAAA 119

Query: 139 VGGEWGGAALLSVESAPKNKK-AFYSSGVQVGYGVGLLLSTGLVSLISMMTTDEQFLSWG 197
+ + K S V +G GVG + + I
Sbjct: 120 FPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYI------------H 167

Query: 198 WRIPFLFSIVLVLGALWVRNGMEESAEFEQQQYNQAAAKKRIPVIEALLRHPGAFLKIIA 257
W L ++ ++ ++ +++ + ++ I + ++
Sbjct: 168 WSYLLLIPMITIITVPFLMKLLKKEVR-IKGHFDIKGIILMSVGIVFFMLFTTSYSISFL 226

Query: 258 LRLCELLTMYIVTAFALNYSTQNMGLPRELFLNIGLLVGGLSCLTIPCFAWLADRFGRRR 317
+ +++ + + GL + + IG+L GG+ T+ F + +
Sbjct: 227 IVSVLSFLIFVKHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDV 286

Query: 318 VYITGALIGTLSAFPFFMA 336
++ A IG++ FP M+
Sbjct: 287 HQLSTAEIGSVIIFPGTMS 305


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2776   INTIMIN       103    2e-24    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 103 bits (259), Expect = 2e-24
Identities = 35/177 (19%), Positives = 65/177 (36%), Gaps = 20/177 (11%)

Query: 2 GVHTAEATLPNGNNDTKIVNIAPDASNAQVTLNIPAQQVVTNNSDSVQLTATVK-DPSNH 60
G A + + D K + +T++ ++V T ++ N
Sbjct: 728 GKSLVSARVSDVAVDVKAPEVE---FFTTLTIDDGNIEIVGTGVKGKLPTVWLQYGQVNL 784

Query: 61 PVAGITVNFTMPQDVAANFTLENNGIAITQANGEAHVTLKGKKAGTHTVTATLSNNNTSD 120
+G +T + N IA A+ VTLK K GT T++ +SD
Sbjct: 785 KASGGNGKYT--------WRSANPAIASVDAS-SGQVTLKEK--GTTTISVI-----SSD 828

Query: 121 SQPVTFVADKTSALVVLQISKNEITGNGVDSATLTATVKDQFDNEVNNLPVTFSTAS 177
+Q T+ ++L+V +SK + V++ NE+ N+ + A+
Sbjct: 829 NQTATYTIATPNSLIVPNMSKRVTYNDAVNTCKNFGGKLPSSQNELENVFKAWGAAN 885



Score = 84.4 bits (208), Expect = 2e-18
Identities = 98/393 (24%), Positives = 136/393 (34%), Gaps = 54/393 (13%)

Query: 100 KGKKAGTHTVTAT-LSNNNTSDSQPVT-FVADKTSALVVLQISKNEITGNGVDSATLTAT 157
G + +T T LSN D VT F ADKTSA +G ++ T TAT
Sbjct: 535 NGNSSNNVLLTITVLSNGQVVDQVGVTDFTADKTSAK-----------ADGTEAITYTAT 583

Query: 158 VKDQFDNEVNNLPVTFSTASSGLTLTPGESNTNESGIAQATLAGVAFGEQTVTASLANNG 217
VK + N PV+F+ S L+ +NTN SG A TL G+ V+A A
Sbjct: 584 VKKNGVAQANV-PVSFNIVSGTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAKTAEMT 642

Query: 218 ASDNKTVHFIGDTAAAKIIELTPVPDSIIAGTPQNSSGSVITATV-VDNNGFPVKGVTVN 276
++ N D A I E+ + +A IT TV V PV V
Sbjct: 643 SALNANAVIFVDQTKASITEIKADKTTAVANGQ-----DAITYTVKVMKGDKPVSNQEVT 697

Query: 277 FTSNAATAEMTNGGQAVTNEQGKATVTYTNTRSSIESGARPDTVEASLENGSSTLSTSIN 336
FT+ + T+ G A VT T+T G S +S +
Sbjct: 698 FTTTLGKLSNS---TEKTDTNGYAKVTLTSTTP-----------------GKSLVSARV- 736

Query: 337 VNADASTAHLTLLQALFDTVSAGDTTNLYIEVKDNYGNGVPQQ--EVTLSVSPSEGVTPS 394
+D + F T++ D N+ I G GV + V L
Sbjct: 737 --SDVAVDVKAPEVEFFTTLTI-DDGNIEIV-----GTGVKGKLPTVWLQYGQVNLKASG 788

Query: 395 NNAIYTTNHDGNFYASFTATKAGV---YQVTATLENGDSMQQTVTYVPNVANAEISLAAS 451
N YT AS A+ V + T T+ S QT TY N+ I S
Sbjct: 789 GNGKYTWRSANPAIASVDASSGQVTLKEKGTTTISVISSDNQTATYTIATPNSLIVPNMS 848

Query: 452 KDPVIANNNDLTTLTATVADTEGNAIANSEVTF 484
K + + + N + N +
Sbjct: 849 KRVTYNDAVNTCKNFGGKLPSSQNELENVFKAW 881



Score = 72.8 bits (178), Expect = 7e-15
Identities = 88/467 (18%), Positives = 166/467 (35%), Gaps = 40/467 (8%)

Query: 895 SGGKVRTNSSGQA--------PVVLTSNKVGTYTVTASFHNGVT----IQTQTIVKVTGN 942
GG+++ + S A V + V T A NG + + T T++
Sbjct: 495 QGGQIQHSGSQSAQDYQAILPAYVQGGSNVYKVTARAYDRNGNSSNNVLLTITVLSNGQV 554

Query: 943 SSTAHVASFIADPSTIAATNSDLSTLKATVEDGSGNLIEGLTVYFALKSGSATLTSLTAV 1002
V F AD ++ A ++ T ATV+ +G + V F + SG+A L++ +A
Sbjct: 555 VDQVGVTDFTADKTSAKADGTEAITYTATVKK-NGVAQANVPVSFNIVSGTAVLSANSAN 613

Query: 1003 TDQNGIATTSVRGAITGSVTVSAVTTAGGMQTVDITLVAGPADASQSVLKNNRSSLKGDF 1062
T+ +G AT +++ G V VSA TA ++ V S+ +
Sbjct: 614 TNGSGKATVTLKSDKPGQVVVSA-KTAEMTSALNANAVIFVDQTKASITEIKADKTTAVA 672

Query: 1063 TDSAELHLVLHDISGNPIKVSEGLEFVQSGTNAPYVQVSAIDYSKNFSGEYKATVTGGGE 1122
+ + + G+ ++ + F + +S + +G K T+T
Sbjct: 673 NGQDAITYTVKVMKGDKPVSNQEVTFTTTLGK-----LSNSTEKTDTNGYAKVTLTSTTP 727

Query: 1123 GIATLIPVLNGVHQAGLSTTIQFTRAEDKIMSGTVLVNGANLPTTTFPSQGFTGAYYQLN 1182
G + + ++ V + ++F I G + + G + P+ L
Sbjct: 728 GKSLVSARVSDVAVDVKAPEVEF-FTTLTIDDGNIEIVGTGV-KGKLPTVWLQYGQVNL- 784

Query: 1183 NDNFAPGKTAADYEFSSSASWVDVDATGKVTFKNVGSKWERITATPKTGGPSYIYEIRVK 1242
+ G + ++ A ++G+VT K G+ I+ + Y I
Sbjct: 785 --KASGGNGKYTWRSANPAIASVDASSGQVTLKEKGT--TTISVISSDNQTA-TYTIATP 839

Query: 1243 SWWVNAG-DAFMIYSLAENFCSSNGYTLPLGDHLNHSRSRGIGSLYSEWGDMGHYTTEAG 1301
+ + + Y+ A N C + G LP + + +++ WG Y
Sbjct: 840 NSLIVPNMSKRVTYNDAVNTCKNFGGKLPSSQNE-------LENVFKAWGAANKYEYYKS 892

Query: 1302 FHSNMYW---SSSPANSNEQYVVSLATGDQSVFEKLGF--AYATCYK 1343
+ + W ++ A S L + K AYATC K
Sbjct: 893 SQTIISWVQQTAQDAKSGVASTYDLVKQNPLNNIKASESNAYATCVK 939



Score = 65.5 bits (159), Expect = 1e-12
Identities = 58/264 (21%), Positives = 87/264 (32%), Gaps = 18/264 (6%)

Query: 374 NGVPQQEVTLSVSPSEGVTPSNNAIYTTNHDGNFYASFTATKAGVYQVTATLENGDSMQQ 433
NGV Q V +S + G + TN G + + K G V+A S
Sbjct: 587 NGVAQANVPVSFNIVSGTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAKTAEMTSALN 646

Query: 434 T--VTYVPNVANAEISLAASKDPVIANNNDLTTLTATVADTEGNAIANSEVTFTLPEDVR 491
V +V + + A K +AN D T T V ++N EVTFT
Sbjct: 647 ANAVIFVDQTKASITEIKADKTTAVANGQDAITYTVKV-MKGDKPVSNQEVTFT------ 699

Query: 492 ANFTLGDGGKVVTDTEGKAKVTLKGTKAGAHTVTASMAGGKSE--QLVVNFIADTLTAQV 549
TDT G AKVTL T G V+A ++ + V F
Sbjct: 700 TTLGKLSNSTEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFFTTLTIDDG 759

Query: 550 NLNVTEDNFIANNVGMTRLQATVTDGNGN-PLANEAVTFTLPADVSASFTLGQGGSAITD 608
N+ + + V + G N + +T + A ++ S
Sbjct: 760 NIEI-----VGTGVKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPAIASV-DASSGQVT 813

Query: 609 INGKAEVTLSGTKSGTYPVTVSVN 632
+ K T+S S T ++
Sbjct: 814 LKEKGTTTISVISSDNQTATYTIA 837



Score = 57.8 bits (139), Expect = 2e-10
Identities = 49/199 (24%), Positives = 82/199 (41%), Gaps = 5/199 (2%)

Query: 549 VNLNVTEDNFIANNVGMTRLQATVTDGNGNPLANEAVTFTLPADVSASFTLGQGGSAITD 608
+ + + A+ ATV NG AN V+F + VS + L SA T+
Sbjct: 561 TDFTADKTSAKADGTEAITYTATVKK-NGVAQANVPVSFNI---VSGTAVLSAN-SANTN 615

Query: 609 INGKAEVTLSGTKSGTYPVTVSVNNYGVSDTKQVTLIADAGTAKLASLTSVYSFVVSTTE 668
+GKA VTL K G V+ + + D A + + + + V+ +
Sbjct: 616 GSGKATVTLKSDKPGQVVVSAKTAEMTSALNANAVIFVDQTKASITEIKADKTTAVANGQ 675

Query: 669 GATMTASVTDANGNPVEGIKVNFRGTSVTLSSTSVETDDRGFAEILVTSTEVGLKTVSAS 728
A PV +V F T LS+++ +TD G+A++ +TST G VSA
Sbjct: 676 DAITYTVKVMKGDKPVSNQEVTFTTTLGKLSNSTEKTDTNGYAKVTLTSTTPGKSLVSAR 735

Query: 729 LADKPTEVISRLLNAKADI 747
++D +V + + +
Sbjct: 736 VSDVAVDVKAPEVEFFTTL 754



Score = 55.8 bits (134), Expect = 1e-09
Identities = 47/185 (25%), Positives = 70/185 (37%), Gaps = 5/185 (2%)

Query: 842 VIDQKLTLSASSPLIGVNSPTGATLTATLTSA-NGTPVEGQVINFSVTPEGATLSGGKVR 900
V+DQ ++ + +T T T NG ++F++ A LS
Sbjct: 554 VVDQVGVTDFTADKTSAKADGTEAITYTATVKKNGVAQANVPVSFNIVSGTAVLSANSAN 613

Query: 901 TNSSGQAPVVLTSNKVGTYTVTASFHNGVTIQTQTIVKVTGNSSTAHVASFIADPSTIAA 960
TN SG+A V L S+K G V+A + V + + + A + AD +T A
Sbjct: 614 TNGSGKATVTLKSDKPGQVVVSAKTAEMTSALNANAV-IFVDQTKASITEIKADKTTAVA 672

Query: 961 TNSDLSTLKATVEDGSGNLIEGLTVYFALKSGSATLTSLTAVTDQNGIATTSVRGAITGS 1020
D T V G + V F G + + T TD NG A ++ G
Sbjct: 673 NGQDAITYTVKVMKG-DKPVSNQEVTFTTTLGKLSNS--TEKTDTNGYAKVTLTSTTPGK 729

Query: 1021 VTVSA 1025
VSA
Sbjct: 730 SLVSA 734


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2775   INTIMIN       717    0.0      Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 717 bits (1852), Expect = 0.0
Identities = 221/790 (27%), Positives = 353/790 (44%), Gaps = 70/790 (8%)

Query: 148 QQIASTSQQIGSLLAEDMNSEQAANMARGWASSQASGAMTDWLSRFGTARITLGVDEDFS 207
QQ AS Q+ S +N + A + A G A +QAS + WL +GTA + L +F
Sbjct: 168 QQAASLGSQLQS---RSLNGDYAKDTALGIAGNQASSQLQAWLQHYGTAEVNLQSGNNFD 224

Query: 208 LKNSQFDFLHPWYETPDNLFFSQHTLHRTDERTQINNGLGWRHFTPTWMSGINFFFDHDL 267
S DFL P+Y++ L F Q D R N G G R F P M G N F D D
Sbjct: 225 --GSSLDFLLPFYDSEKMLAFGQVGARYIDSRFTANLGAGQRFFLPENMLGYNVFIDQDF 282

Query: 268 SRYHSRAGIGAEYWRDYLKLSSNGYLRLTNWRSAPELDNDYEARPANGWDVRAEGWLPAW 327
S ++R GIG EYWRDY K S NGY R++ W + DY+ RPANG+D+R G+LP++
Sbjct: 283 SGDNTRLGIGGEYWRDYFKSSVNGYFRMSGWHESYN-KKDYDERPANGFDIRFNGYLPSY 341

Query: 328 PHLGGKLVYEQYYGDEVALFDKDDRQSNPHAITAGLNYTPFPLMTFSAEQRQGKQGENDT 387
P LG KL+YEQYYGD VALF+ D QSNP A T G+NYTP PL+T + R G END
Sbjct: 342 PALGAKLMYEQYYGDNVALFNSDKLQSNPGAATVGVNYTPIPLVTMGIDYRHGTGNENDL 401

Query: 388 RFAVDFTWQPGSAMQKQLDPNEVDARRSLAGSRFDLVDRNNNIVLEYRKKELVRLTLTDP 447
+++ F +Q +Q++P V+ R+L+GSR+DLV RNNNI+LEY+K++++ L +
Sbjct: 402 LYSMQFRYQFDKPWSQQIEPQYVNELRTLSGSRYDLVQRNNNIILEYKKQDILSLNIPHD 461

Query: 448 VTGKSGEVKSLVSSLQTKYALKGYNVEATALEAAGGKVVTTG----KDILVTLPAYRFTS 503
+ G + + +++KY L + +AL + GG++ +G +D LPAY
Sbjct: 462 INGTERSTQKIQLIVKSKYGLDRIVWDDSALRSQGGQIQHSGSQSAQDYQAILPAY---- 517

Query: 504 TPETDNTWPIEVTAEDVKGNFSNREQ-SMVVVQAPTLSQKDSSVSLSSQTLSADSHSTAT 562
N + + A D GN SN ++ V+ + + ++ SA + T
Sbjct: 518 VQGGSNVYKVTARAYDRNGNSSNNVLLTITVLSNGQVVDQVGVTDFTADKTSAKADGTEA 577

Query: 563 LTFIAH------DAAGNPVIGLVLSTRHEGVQDITLSDWKDNGDGSYTQILTTGAMSGTL 616
+T+ A A PV ++S G ++ + NG G T L + +
Sbjct: 578 ITYTATVKKNGVAQANVPVSFNIVS----GTAVLSANSANTNGSGKATVTLKSDKPGQVV 633

Query: 617 TLMPQLNGVDAAKAPAVVNIISVSSSRTHSSIKIDKDRYLSGNPIEVTVELR-DENDKPV 675
A A AV I + + + IK DK ++ +T ++ + DKPV
Sbjct: 634 VSAKTAEMTSALNANAV--IFVDQTKASITEIKADKTTAVANGQDAITYTVKVMKGDKPV 691

Query: 676 KEQKQQLNTAVSIDNVKPGVTTDWKETADGVYKATYTAYTKGSGL-TAKLLMQNWNEDLH 734
Q+ T + K +T+ K +G K T T+ T G L +A++ +
Sbjct: 692 SNQEVTFTTTLG----KLSNSTE-KTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAP 746

Query: 735 TAGFIIDANPQSAKIATLSASNNGVLANENAANTVSVNVADEGSNPINDHTVTFAVLSGS 794
F I + G L + + ++
Sbjct: 747 EVEFFTTLTIDDGNIEIVGTGVKGKLPTV---------------------WLQYGQVNLK 785

Query: 795 ATSFNNQNTAKTDVNGLATFDLKSSK---QEDNTVEVTLENGVKQTLIVSFVGDSSTAQV 851
A+ N + T ++ +A+ D S + +E T +++ + QT ++ + + +
Sbjct: 786 ASGGNGKYTWRSANPAIASVDASSGQVTLKEKGTTTISVISSDNQT--ATYTIATPNSLI 843

Query: 852 DLQKSKNEVVADGNDSATMTATVRDAKGNLLNDVKVTF----------NVNSAAAKLSQT 901
SK D ++ + N L +V + + + + + QT
Sbjct: 844 VPNMSKRVTYNDAVNTCKNFGGKLPSSQNELENVFKAWGAANKYEYYKSSQTIISWVQQT 903

Query: 902 EVNSHDGIAT 911
++ G+A+
Sbjct: 904 AQDAKSGVAS 913



Score = 132 bits (334), Expect = 2e-33
Identities = 83/397 (20%), Positives = 151/397 (38%), Gaps = 36/397 (9%)

Query: 923 TVTASVSSGSQANQQVIFIGDQSTAALTLSVPSGDITVT-------NTAPLHMTATLQDK 975
T A +G+ +N ++ I S + V D T T + TAT++ K
Sbjct: 528 TARAYDRNGNSSNNVLLTITVLSNGQVVDQVGVTDFTADKTSAKADGTEAITYTATVK-K 586

Query: 976 NGNPLKDKEITFSVPNDVASRFSISNSGKGMTDSNGTAIASLTGTLAGTHMITARLANSN 1035
NG + ++F++ + A + S + T+ +G A +L G +++A+ A
Sbjct: 587 NGVAQANVPVSFNIVSGTAVLSANSAN----TNGSGKATVTLKSDKPGQVVVSAKTAEMT 642

Query: 1036 VS-DTQPMTFVADKDRAVVVLQTSKAEIIGNGVDETTLTATVKDPFDNVVKNLSVVFRTS 1094
+ + + FV ++ ++ K + NG D T T V D V N V F T+
Sbjct: 643 SALNANAVIFVDQTKASITEIKADKTTAVANGQDAITYTVKVMKG-DKPVSNQEVTFTTT 701

Query: 1095 PADTQLSLNARNTNENGIAEVTLKGTVLGVHTAEAILLNGNRDTKIVNIAPDASNAQVTL 1154
+LS + T+ NG A+VTL T G A + + D K + +T+
Sbjct: 702 LG--KLSNSTEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVE---FFTTLTI 756

Query: 1155 NIPAQQVVTNNSDSVQLTATVK-DPSNHPVAGITVNFTMPQDVAANFTLENNGIAITQAN 1213
+ ++V T ++ N +G +T + N IA A+
Sbjct: 757 DDGNIEIVGTGVKGKLPTVWLQYGQVNLKASGGNGKYT--------WRSANPAIASVDAS 808

Query: 1214 GEAHVTLKGKKAGTHTVTATLGNNNASDAQPVTFVADKDSAVVVLQTSKAEIIGNGVDET 1273
VTLK K GT T++ +SD Q T+ ++++V SK + V+
Sbjct: 809 -SGQVTLKEK--GTTTISVI-----SSDNQTATYTIATPNSLIVPNMSKRVTYNDAVNTC 860

Query: 1274 TLTATVKDPFDNAVKDLQVTFSTNPADTQLSQSKSNT 1310
N ++++ + S++
Sbjct: 861 KNFGGKLPSSQNELENVFKAWGAANKYEYYKSSQTII 897



Score = 123 bits (310), Expect = 2e-30
Identities = 80/367 (21%), Positives = 145/367 (39%), Gaps = 29/367 (7%)

Query: 830 LENGVKQTLIVSFVGDSSTAQ--VDLQKSKNEVVADGNDSATMTATVRDAKGNLLNDVKV 887
N V T+ V G D K ADG ++ T TATV+ N V V
Sbjct: 538 SSNNVLLTITVLSNGQVVDQVGVTDFTADKTSAKADGTEAITYTATVKKNGVAQAN-VPV 596

Query: 888 TFNVNSAAAKLSQTEVNSH-DGIATATLTSLKNGDYTVTASVSSGSQA-NQQVIFIGDQS 945
+FN+ S A LS N++ G AT TL S K G V+A + + A N + DQ+
Sbjct: 597 SFNIVSGTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAKTAEMTSALNANAVIFVDQT 656

Query: 946 TAALTLSVPSGDITVTNTAPLHMTATLQ-DKNGNPLKDKEITFSVPNDVASRFSISNSGK 1004
A++T + + T +T T++ K P+ ++E+TF+ + ++
Sbjct: 657 KASIT-EIKADKTTAVANGQDAITYTVKVMKGDKPVSNQEVTFT------TTLGKLSNST 709

Query: 1005 GMTDSNGTAIASLTGTLAGTHMITARLANSNVSDTQPMTFVADKDRAVVVLQTSKAEIIG 1064
TD+NG A +LT T G +++AR+++ V P + + + EI+G
Sbjct: 710 EKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEV----EFFTTLTIDDGNIEIVG 765

Query: 1065 NGVDETTLTATVK-DPFDNVVKNLSVVFRTSPADTQLSLNARNTNENGIAEVTLKGTVLG 1123
GV T ++ + + + A+ ++ ++ +VTLK G
Sbjct: 766 TGVKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPAIASVDASS-----GQVTLKEK--G 818

Query: 1124 VHTAEAILLNGNRDTKIVNIAPDASNAQVTLNIPAQQVVTNNSDSVQLTATVKDPSNHPV 1183
T I + D + N+ + N+ + + ++ + S + +
Sbjct: 819 TTTISVI----SSDNQTATYTIATPNSLIVPNMSKRVTYNDAVNTCKNFGGKLPSSQNEL 874

Query: 1184 AGITVNF 1190
+ +
Sbjct: 875 ENVFKAW 881



Score = 34.7 bits (79), Expect = 0.003
Identities = 31/111 (27%), Positives = 45/111 (40%), Gaps = 11/111 (9%)

Query: 1225 AGTHTVTATLGNNNASDAQPVTF---------VADKDSAVVVLQTSKAEIIGNGVDETTL 1275
+ + VTA + N + + V V D+ V K +G + T
Sbjct: 522 SNVYKVTARAYDRNGNSSNNVLLTITVLSNGQVVDQVG-VTDFTADKTSAKADGTEAITY 580

Query: 1276 TATVKDPFDNAVKDLQVTFSTNPADTQLSQSKSNTNDSGVAEVTFKGTGFG 1326
TATVK A ++ V+F+ LS + +NTN SG A VT K G
Sbjct: 581 TATVKKN-GVAQANVPVSFNIVSGTAVLSANSANTNGSGKATVTLKSDKPG 630


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2754   HOKGEFTOXIC   65     2e-18    Hok/Gef cell toxic protein family signature.
		>HOKGEFTOXIC#Hok/Gef cell toxic protein family signature.

Length = 52

Score = 64.8 bits (158), Expect = 2e-18
Identities = 19/46 (41%), Positives = 32/46 (69%)

Query: 23 QKAMLIALIVICLTVIVTALVTRKDLCEVRIRTGQTEVAVFTAYEP 68
+ +++ ++++CLT+++ +TRK LCE+R R G EVA F AYE
Sbjct: 5 RSSLVWCVLIVCLTLLIFTYLTRKSLCEIRYRDGYREVAAFMAYES 50


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2729   INTIMIN       33     0.001    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 32.7 bits (74), Expect = 0.001
Identities = 23/119 (19%), Positives = 45/119 (37%), Gaps = 17/119 (14%)

Query: 104 KEVITRTVKVTNVGKPSVAEERSKITPVSAIKVTP-------------TSGTVAKGKTTT 150
++ IT TVKV KP +E + T + + + TS T K +
Sbjct: 675 QDAITYTVKVMKGDKPVSNQEVTFTTTLGKLSNSTEKTDTNGYAKVTLTSTTPGKSLVSA 734

Query: 151 LT--VSFEPESATDKTFRAVSADPSKATI--SVKDMTITVNGVATGKVQIPVVSGNGQF 205
V+ + ++ + F ++ D I + + + G+V + GNG++
Sbjct: 735 RVSDVAVDVKAPEVEFFTTLTIDDGNIEIVGTGVKGKLPTVWLQYGQVNLKASGGNGKY 793


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2718   ENTEROVIROMP  138    4e-44    Enterobacterial virulence outer membrane protein si...
		>ENTEROVIROMP#Enterobacterial virulence outer membrane protein signature.

Length = 171

Score = 138 bits (350), Expect = 4e-44
Identities = 61/200 (30%), Positives = 98/200 (49%), Gaps = 30/200 (15%)

Query: 1 MRKLYAAILSAAICLAVSGAPAWASEQQATLSAGYLHARTSAPGSDNLNGINVKYRYEFT 60
M+K+ AA+ +G + +T++ GY + + + G N+KYRYE
Sbjct: 1 MKKIACLSALAAVLAFTAGT---SVAATSTVTGGYAQSDAQGQMNK-MGGFNLKYRYEED 56

Query: 61 DT-LGLVTSFSYAGDKNRQLTRYSDTRWHEDSVRNRWFSVMAGPSVRVNEWFSAYAMAGV 119
++ LG++ SF+Y T S T D +N+++ + AGP+ R+N+W S Y + GV
Sbjct: 57 NSPLGVIGSFTY--------TEKSRTASSGDYNKNQYYGITAGPAYRINDWASIYGVVGV 108

Query: 120 AYSRVSTFSGDYLRVTDNKGKTHDVLTGSDDGRHSNTSLAWGAGVQFNPTESVAIDIAYE 179
Y + T T+ HD S+ ++GAG+QFNP E+VA+D +YE
Sbjct: 109 GYGKFQT--------TEYPTYKHD---------TSDYGFSYGAGLQFNPMENVALDFSYE 151

Query: 180 GSGSGDWRTDGFIVGVGYKF 199
S +I GVGY+F
Sbjct: 152 QSRIRSVDVGTWIAGVGYRF 171


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2717   CHANLCOLICIN  30     0.014    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 30.4 bits (68), Expect = 0.014
Identities = 31/118 (26%), Positives = 50/118 (42%), Gaps = 10/118 (8%)

Query: 130 SARNAGISASQAEESAANADTSAGDASESARQAAESAAAAKQSEEASSSSASAAAQKASE 189
S G S++E SAA T+ ++ + AE AA AK A+A AQ ++
Sbjct: 34 SGGGGGKGGSKSESSAAIHATAKWSTAQLKKTQAEQAARAK---------AAAEAQAKAK 84

Query: 190 SSQSAADAELSKKTAESAAGNAARDATTATEKARESAESAQSAEQSRIA-AEEAVNRI 246
+++ A L E+ NA+R + +A E+ R+A AEE +
Sbjct: 85 ANRDALTQRLKDIVNEALRHNASRTPSATELAHANNAAMQAEDERLRLAKAEEKARKE 142



Score = 30.0 bits (67), Expect = 0.020
Identities = 31/147 (21%), Positives = 51/147 (34%), Gaps = 6/147 (4%)

Query: 101 EALRRFELMVEEAARHAEEAKKNAGEAETSARNAGISASQAEESAANADTSAGDASESAR 160
E R+ E+A + AE+ +K E + + ++AEE A + A E
Sbjct: 137 EKARKEAEAAEKAFQEAEQRRKEIER-EKAETERQLKLAEAEEKRLAALSEEAKAVE--- 192

Query: 161 QAAESAAAAKQSEEASSSSASAAAQKASESSQSAADAELSKKTAESAAGNAARDATTATE 220
A+ +A QSE SS A DAE+ + A +
Sbjct: 193 -IAQKKLSAAQSEVVKMDGEIKTLNSRLSSSIHARDAEMKTLAGKRNELAQASAKYKELD 251

Query: 221 KARES-AESAQSAEQSRIAAEEAVNRI 246
+ + + A Q+R E R+
Sbjct: 252 ELVKKLSPRANDPLQNRPFFEATRRRV 278



Score = 28.9 bits (64), Expect = 0.040
Identities = 28/162 (17%), Positives = 58/162 (35%), Gaps = 18/162 (11%)

Query: 101 EALRRFELMVEEAARHAEEAKKNAGEA---------ETSARNAGISAS----QAEESA-A 146
EA + + E A+ E A+K A E N+ +S+S AE A
Sbjct: 175 EAEEKRLAALSEEAKAVEIAQKKLSAAQSEVVKMDGEIKTLNSRLSSSIHARDAEMKTLA 234

Query: 147 NADTSAGDASESARQ--AAESAAAAKQSEEASSSSASAAAQKASESSQSAADAELSKKTA 204
AS ++ + + ++ + A ++ + + + + +
Sbjct: 235 GKRNELAQASAKYKELDELVKKLSPRANDPLQNRPFFEATRRRVGAGKIREEKQKQVTAS 294

Query: 205 ESAAGNAARDATTATEKARESAESAQSAEQSRIA-AEEAVNR 245
E+ N T +KA + ++A +R+ AEE + +
Sbjct: 295 ETRI-NRINADITQIQKAISQVSNNRNAGIARVHEAEENLKK 335


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2715   IGASERPTASE   30     0.013    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 30.4 bits (68), Expect = 0.013
Identities = 32/243 (13%), Positives = 65/243 (26%), Gaps = 23/243 (9%)

Query: 44 SPSSSSISATTLFRAPNAHSAS----FHRQSTAESSLHQQLPNVRQRLIQHLAEHGIKPA 99
P+ ++ S TT A N+ S + Q E++ + + + A
Sbjct: 1027 PPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVA 1086

Query: 100 RSMAEHIPPAPNWPAPPPPVQNE-QSRPLPDVAQRLVQHLAEHGIQPARNMAEHIPPAPN 158
+S +E V+ E +++ + Q + + ++ + P + +E + P
Sbjct: 1087 QSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQ--VSPKQEQSETVQPQAE 1144

Query: 159 WPAPPLPVQNEQSRPLPDVAQRLVQHLAEHGIQPARSMAEHIPPAPNWPAPPPPVQNEQS 218
P N + E Q + P PV +
Sbjct: 1145 PARENDPTVN----------------IKEPQSQTNTTADTEQPAKETSSNVEQPVTESTT 1188

Query: 219 RPLPDVAQRLMQHLAEHGIQPARNMAEHIPPAPNWPAPTPPVQNEQSRPLPDVAQRLMQH 278
+ ++ QP N P V + R
Sbjct: 1189 VNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVA 1248

Query: 279 LAE 281
L +
Sbjct: 1249 LCD 1251


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2707   HTHFIS        82     2e-20    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 82.2 bits (203), Expect = 2e-20
Identities = 30/117 (25%), Positives = 60/117 (51%), Gaps = 1/117 (0%)

Query: 2 KILLIEDNQRTQEWVTQGLSEAGYVIDAVSDGRDGLYLALKDDYALIILDIMLPGMDGWQ 61
IL+ +D+ + + Q LS AGY + S+ D L++ D+++P + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 ILQTLRTA-KQTPVICLTARDSVDDRVRGLDSGANDYLVKPFSFSELLARVRAQLRQ 117
+L ++ A PV+ ++A+++ ++ + GA DYL KPF +EL+ + L +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2706   PF06580       32     0.005    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 31.8 bits (72), Expect = 0.005
Identities = 35/181 (19%), Positives = 61/181 (33%), Gaps = 37/181 (20%)

Query: 290 ENILFLARADKNNVLVKLDSLS----------------LNKEVENLLDYL--EYLSDEKE 331
NI L D L SLS L E+ + YL + E
Sbjct: 180 NNIRALILEDPTKAREMLTSLSELMRYSLRYSNARQVSLADELTVVDSYLQLASIQFEDR 239

Query: 332 ICFKVECNQQIFADKI---LLQRMLSNLIVNAIRYSPEKSRIHITSFLDTNSYLNIDIAS 388
+ F+ + N I ++ L+Q ++ N I + I P+ +I + D N + +++ +
Sbjct: 240 LQFENQINPAIMDVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKD-NGTVTLEVEN 298

Query: 389 PGAKINEPEKLFRRFWRGDNSRHSVGQGLGLSLVKA-IAELHGGSATYHYLNKHNVFRIT 447
G+ + K G GL V+ + L+G A K
Sbjct: 299 TGSLALKNTKE--------------STGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAM 344

Query: 448 L 448
+
Sbjct: 345 V 345


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2703   ECOLIPORIN    238    2e-80    E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 238 bits (608), Expect = 2e-80
Identities = 117/196 (59%), Positives = 139/196 (70%), Gaps = 24/196 (12%)

Query: 1 MSTSYDFDFGLSLGAAYSNSDRTDNQVHKGTHNTRYGDRFDATAGGETAEAWTVGAKYDA 60
+ST+YD G S GAAY+ SDRT+ QV+ G AGG+ A+AWT G KYDA
Sbjct: 207 ISTTYDIGMGFSAGAAYTTSDRTNEQVNAGGTI----------AGGDKADAWTAGLKYDA 256

Query: 61 NNVYLAAMYAEPRNMTGYGDADA-----IANKTQNFEVVAQYQFDFGLRPSIAYLQSKGK 115
NN+YLA MY+E RNMT YG D +ANKTQNFEV AQYQFDFGLRP++++L SKGK
Sbjct: 257 NNIYLATMYSETRNMTPYGKTDKGYDGGVANKTQNFEVTAQYQFDFGLRPAVSFLMSKGK 316

Query: 116 DLGGVNSDNFDSQGNHHYTNKDLVKYVDIGMTYYFNKNMSTYVDYKINLLDNDDDFYKEN 175
DL N + +KDLVKY D+G TYYFNKN STYVDYKINLLD+DD FYK+
Sbjct: 317 DLTY---------NNVNGDDKDLVKYADVGATYYFNKNFSTYVDYKINLLDDDDPFYKDA 367

Query: 176 GIATDDIVAVGLVYQF 191
GI+TDDIVA+G+VYQF
Sbjct: 368 GISTDDIVALGMVYQF 383


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2702   ECOLIPORIN    303    e-106    E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 303 bits (778), Expect = e-106
Identities = 143/204 (70%), Positives = 160/204 (78%), Gaps = 2/204 (0%)

Query: 1 MKRKVLAMLVPALLVAGAANAAEIYNKDGNKLDLYGKVAGLHYFSDDASSDGDMSYARIG 60
MKRKVLA+++PALL AGAA+AAEIYNKDGNKLDLYGKV GLHYFSDD+S DGD +Y R+G
Sbjct: 1 MKRKVLALVIPALLAAGAAHAAEIYNKDGNKLDLYGKVDGLHYFSDDSSKDGDQTYMRVG 60

Query: 61 FKGETQIADQFTGYGQWEFNIGANGPESDKGNTATRLAFAGFGFGQNGTFDYGRNYGVVY 120
FKGETQI DQ TGYGQWE+N+ AN E + N+ TRLAFAG FG G+FDYGRNYGV+Y
Sbjct: 61 FKGETQINDQLTGYGQWEYNVQANTTEGEGANSWTRLAFAGLKFGDYGSFDYGRNYGVLY 120

Query: 121 DVEAWTDMLPEFGGDTYAGADNFMNGRANSVATYRNNGFFGQVDGLNFALQYQGNNEKSG 180
DVE WTDMLPEFGGD+Y ADN+M GRAN VATYRN FFG VDGLNFALQYQG NE
Sbjct: 121 DVEGWTDMLPEFGGDSYTYADNYMTGRANGVATYRNTDFFGLVDGLNFALQYQGKNESQS 180

Query: 181 LFDQEGSGNG--NGRKLAKENGDG 202
D N NG + +NGDG
Sbjct: 181 ADDVNIGTNNRNNGDDIRYDNGDG 204


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2700   CARBMTKINASE  35     2e-04    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 34.8 bits (80), Expect = 2e-04
Identities = 22/92 (23%), Positives = 36/92 (39%), Gaps = 9/92 (9%)

Query: 37 AQKLAADDDVDMLVILTACYFHDIVSLAKNHPQRQSSSILAAEETRRLLREEFVQFPA-- 94
+KLA + + D+ +ILT + +L + Q + EE R+ E F A
Sbjct: 219 GEKLAEEVNADIFMILTDV---NGAALYYGTEKEQWLREVKVEELRKYYEEG--HFKAGS 273

Query: 95 --EKIEAVCHAIAAHSFSAQIAPLTTEAKIVQ 124
K+ A I A IA L + ++
Sbjct: 274 MGPKVLAAIRFIEWGGERAIIAHLEKAVEALE 305


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2699   PF05272       29     0.045    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.045
Identities = 20/62 (32%), Positives = 29/62 (46%), Gaps = 15/62 (24%)

Query: 320 AKYILTPVLWKYLYRYAKKHQARGNGFGYGMVYPNNPQSVTRTLSARYYKDGAEILIDRG 379
A+Y + PVLW Y+ R+ K + G+ VY +R +DG+E RG
Sbjct: 166 ARYQVGPVLWGYVVRFIK---SDGDKLTLPYVY------------SRSQRDGSEAWKWRG 210

Query: 380 WD 381
WD
Sbjct: 211 WD 212


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2689   TYPE3IMRPROT  203    3e-67    Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 203 bits (518), Expect = 3e-67
Identities = 260/261 (99%), Positives = 261/261 (100%)

Query: 1 MLQVTSEQWLSWLSLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA 60
MLQVTSEQWLSWL+LYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA
Sbjct: 1 MLQVTSEQWLSWLNLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA 60

Query: 61 NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHL 120
NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHL
Sbjct: 61 NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHL 120

Query: 121 NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIF 180
NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIF
Sbjct: 121 NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIF 180

Query: 181 LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC 240
LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC
Sbjct: 181 LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC 240

Query: 241 EHLFSEIFNLLADIISELPLI 261
EHLFSEIFNLLADIISELPLI
Sbjct: 241 EHLFSEIFNLLADIISELPLI 261


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2688   TYPE3IMQPROT  67     1e-18    Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.

Length = 86

Score = 67.1 bits (164), Expect = 1e-18
Identities = 22/78 (28%), Positives = 42/78 (53%)

Query: 4 ESVMMMGTEAMKVALALAAPLLLVALVTGLIISILQAATQINEMTLSFIPKIIAVFIAII 63
+ ++ G +A+ + L L+ +VA + GL++ + Q TQ+ E TL F K++ V + +
Sbjct: 2 DDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLF 61

Query: 64 IAGPWMLNLLLDYVRTLF 81
+ W +LL Y R +
Sbjct: 62 LLSGWYGEVLLSYGRQVI 79


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2687   FLGBIOSNFLIP  334    e-119    Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP signature.

Length = 245

Score = 334 bits (858), Expect = e-119
Identities = 245/245 (100%), Positives = 245/245 (100%)

Query: 1 MRRLLSVAPVLLWLITPLAFAQLPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILLM 60
MRRLLSVAPVLLWLITPLAFAQLPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILLM
Sbjct: 1 MRRLLSVAPVLLWLITPLAFAQLPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILLM 60

Query: 61 MTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEK 120
MTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEK
Sbjct: 61 MTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEK 120

Query: 121 ISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELK 180
ISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELK
Sbjct: 121 ISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELK 180

Query: 181 TAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLA 240
TAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLA
Sbjct: 181 TAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLA 240

Query: 241 QSFYS 245
QSFYS
Sbjct: 241 QSFYS 245


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2685   FLGMOTORFLIN  212    1e-74    Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 212 bits (542), Expect = 1e-74
Identities = 125/137 (91%), Positives = 134/137 (97%)

Query: 1 MSDMNNPADDNNGAMDDLWAEALSEQKSTSSKSAADAVFQQFGGGDVSGTLQDIDLIMDI 60
MSDMNNP+D+N GA+DDLWA+AL+EQK+T++KSAADAVFQQ GGGDVSG +QDIDLIMDI
Sbjct: 1 MSDMNNPSDENTGALDDLWADALNEQKATTTKSAADAVFQQLGGGDVSGAMQDIDLIMDI 60

Query: 61 PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV 120
PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV
Sbjct: 61 PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV 120

Query: 121 RITDIITPSERMRRLSR 137
RITDIITPSERMRRLSR
Sbjct: 121 RITDIITPSERMRRLSR 137


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2684   FLGMOTORFLIM  381    e-135    Flagellar motor switch protein FliM signature.
		>FLGMOTORFLIM#Flagellar motor switch protein FliM signature.

Length = 344

Score = 381 bits (979), Expect = e-135
Identities = 85/324 (26%), Positives = 148/324 (45%), Gaps = 10/324 (3%)

Query: 5 ILSQAEIDALLNGDS--EVKDEPTASVSGESDIRPYDPNTQRRVVRERLQALEIINERFA 62
+LSQ EID LL S + E +S I YD + +E+++ L +++E FA
Sbjct: 4 VLSQDEIDQLLTAISSGDASIEDARPISDTRKITLYDFRRPDKFSKEQMRTLSLMHETFA 63

Query: 63 RHFRMGLFNLLRRSPDITVGAIRIQPYHEFARNLPVPTNLNLIHLKPLRGTGLVVFSPSL 122
R L LR + V ++ Y EF R++P P+ L +I + PL+G ++ PS+
Sbjct: 64 RLTTTSLSAQLRSMVHVHVASVDQLTYEEFIRSIPTPSTLAVITMDPLKGNAVLEVDPSI 123

Query: 123 VFIAVDNLFGGDGRFPTKVEGREFTHTEQRVINRMLKLALEGYSDAWKAINPLEVEYVRS 182
F +D LFGG G+ KV+ R+ T E V+ ++ L ++W + L +
Sbjct: 124 TFSIIDRLFGGTGQ-AAKVQ-RDLTDIENSVMEGVIVRILANVRESWTQVIDLRPRLGQI 181

Query: 183 EMQVKFTNITTSPNDIVVNTPFHVEIGNLTGEFNICLPFSMIEPLRELLVNPPLENS--R 240
E +F I P+++VV ++G G N C+P+ IEP+ L + +S R
Sbjct: 182 ETNPQFAQI-VPPSEMVVLVTLETKVGEEEGMMNFCIPYITIEPIISKLSSQFWFSSVRR 240

Query: 241 NEDQNWRDNLVRQVQHSQLELVANFADISLRLSQILKLKPGDVLPIEKP---DRIIAHVD 297
+ + L ++ +++VA + L + IL L+ GD++ + D + +
Sbjct: 241 SSTTQYMGVLRDKLSTVDMDVVAEVGSLRLSVRDILGLRVGDIIRLHDTHVGDPFVLSIG 300

Query: 298 GVPVLTSQYGTLNGQYALRIEHLI 321
Q G + + A +I I
Sbjct: 301 NRKKFLCQPGVVGKKIAAQILERI 324


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2682   FLGHOOKFLIK   470    e-168    Flagellar hook-length control protein signature.
		>FLGHOOKFLIK#Flagellar hook-length control protein signature.

Length = 375

Score = 470 bits (1209), Expect = e-168
Identities = 369/375 (98%), Positives = 369/375 (98%)

Query: 1 MIRLAPLITADVDTTTLPGGKASDAAQDFLALLSEALAGETTTDKAAPQLLVATDKPTTK 60
MIRLAPLITADVDTTTLPGGKASDAAQDFLALLSEALAGETTTDKAAPQLLVATDKPTTK
Sbjct: 1 MIRLAPLITADVDTTTLPGGKASDAAQDFLALLSEALAGETTTDKAAPQLLVATDKPTTK 60

Query: 61 GEPLISDIVSDAQQANLLIPVDETPPVINDEQSTSTPLTTAQTMALAAVADKNTTKDEKA 120
GEPLISDIVSDAQQANLLIPVDETPPVINDEQSTSTPLTTAQTMALAAVADKNTTKDEKA
Sbjct: 61 GEPLISDIVSDAQQANLLIPVDETPPVINDEQSTSTPLTTAQTMALAAVADKNTTKDEKA 120

Query: 121 DDLNEDVTASLSALFAMLPGFDNTPKVTDVPSTVLPAEKPTLFTKLTSAQLTTAQPDDAP 180
DDLNEDVTASLSALFAMLPGFDNTPKVTD PSTVLP EKPTLFTKLTS QLTTAQPDDAP
Sbjct: 121 DDLNEDVTASLSALFAMLPGFDNTPKVTDAPSTVLPTEKPTLFTKLTSEQLTTAQPDDAP 180

Query: 181 GTPAQPLTPQVAEAQSKAEVISTPSPVTAAASPLITPHQTQPLPTVAAPVLSAPLGSHEW 240
GTPAQPLTP VAEAQSKAEVISTPSPVTAAASPLITPHQTQPLPTVAAPVLSAPLGSHEW
Sbjct: 181 GTPAQPLTPLVAEAQSKAEVISTPSPVTAAASPLITPHQTQPLPTVAAPVLSAPLGSHEW 240

Query: 241 QQSLSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQMVSSHQHVRAALEAA 300
QQSLSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQMVS HQHVRAALEAA
Sbjct: 241 QQSLSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQMVSPHQHVRAALEAA 300

Query: 301 LPVLRTQLAESGIQLGQSNISGESFSGQQQAASQQQQSQRTVNHEPLAGEDDDTLPVPVS 360
LPVLRTQLAESGIQLGQSNISGESFSGQQQAASQQQQSQRT NHEPLAGEDDDTLPVPVS
Sbjct: 301 LPVLRTQLAESGIQLGQSNISGESFSGQQQAASQQQQSQRTANHEPLAGEDDDTLPVPVS 360

Query: 361 LQGRVTGNSGVDIFA 375
LQGRVTGNSGVDIFA
Sbjct: 361 LQGRVTGNSGVDIFA 375


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2681   FLGFLIJ       202    2e-70    Flagellar FliJ protein signature.
		>FLGFLIJ#Flagellar FliJ protein signature.

Length = 147

Score = 202 bits (515), Expect = 2e-70
Identities = 146/147 (99%), Positives = 147/147 (100%)

Query: 1 MAEHGALATLKDLAEKEVEDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRNNLNSDMSAG 60
MAEHGALATLKDLAEKEVEDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRNNLNSDMSAG
Sbjct: 1 MAEHGALATLKDLAEKEVEDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRNNLNSDMSAG 60

Query: 61 MTSNRWINYQQFIQTLEKAITQHRQQLNQWTQKVDIALNSWREKKQRLQAWQTLQERQST 120
+TSNRWINYQQFIQTLEKAITQHRQQLNQWTQKVDIALNSWREKKQRLQAWQTLQERQST
Sbjct: 61 ITSNRWINYQQFIQTLEKAITQHRQQLNQWTQKVDIALNSWREKKQRLQAWQTLQERQST 120

Query: 121 AALLAENRLDQKKMDEFAQRAAMRKPE 147
AALLAENRLDQKKMDEFAQRAAMRKPE
Sbjct: 121 AALLAENRLDQKKMDEFAQRAAMRKPE 147


40  ECs2650  ECs2614  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
ECs2650   2       17       1.033219   phosphatidylglycerophosphate synthetase
ECs2646   2       19       0.662902   ***tail protein
ECs2645   -1      17       -2.118362  tail sheath protein
ECs2644   -2      20       -3.404624  tail tube protein
ECs2643   -1      22       -2.956518  tail protein
ECs2642   -1      24       -2.727041  phage tail protein
ECs2639   0       18       -1.307780  tail protein
ECs2638   -2      20       2.077249   hypothetical protein
ECs2637   0       23       3.736082   transposase
ECs2636   0       25       3.701122   transposase
ECs2635   0       27       3.821547   plasmid partition protein
ECs2634   0       27       4.104846   plasmid partition protein
ECs2633   1       29       4.141048   phage replication protein
ECs2632   5       32       -0.664398  hypothetical protein
ECs2631   5       31       -1.825295  derepression protein
ECs2629   2       29       -3.223737  hypothetical protein
ECs2628   0       32       -4.512479  hypothetical protein
ECs2627   -1      32       -6.297322  hypothetical protein
ECs2626   0       32       -6.377300  hypothetical protein
ECs2625   -2      33       -6.351717  hypothetical protein
ECs2624   0       35       -7.677217  hypothetical protein
ECs2623   -1      42       -8.893174  hypothetical protein
ECs2622   2       48       -8.762291  DNA-binding protein
ECs5465   1       42       -6.631066  hypothetical protein
ECs2620   -1      35       -5.192876  transcriptional regulator
ECs2619   -1      33       -4.958701  hypothetical protein
ECs2618   1       26       -4.519223  hypothetical protein
ECs2617   0       22       -3.242964  integrase
ECs2616   0       15       -2.020730  hypothetical protein
ECs2615   1       17       -3.063932  tyrosine-specific transporter
ECs2614   -1      16       -3.140516  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2639   PF03944       28     0.016    delta endotoxin
		>PF03944#delta endotoxin

Length = 633

Score = 28.1 bits (62), Expect = 0.016
Identities = 11/31 (35%), Positives = 18/31 (58%)

Query: 74 TQAYTGRPWPLIDGVGQIYGMYVLTGTNTTR 104
TQ++T + WP + + Q+ YVL G + R
Sbjct: 285 TQSFTSQDWPFLYSLFQVNSNYVLNGFSGAR 315


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2619   STREPKINASE   28     0.033    Streptococcus streptokinase protein signature.
		>STREPKINASE#Streptococcus streptokinase protein signature.

Length = 440

Score = 27.8 bits (61), Expect = 0.033
Identities = 17/41 (41%), Positives = 21/41 (51%), Gaps = 3/41 (7%)

Query: 15 RDFHEKNIQ--IERYDGSHTVNIAIPSNNDDDDRPLLKAQR 53
R + EK IQ + D +TV P N DDD RP LK +
Sbjct: 170 RPYKEKPIQNQAKSVDVEYTVQF-TPLNPDDDFRPGLKDTK 209


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2616   SECA          60     8e-13    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 60.3 bits (146), Expect = 8e-13
Identities = 27/70 (38%), Positives = 31/70 (44%), Gaps = 5/70 (7%)

Query: 155 RVEKMSPEAFEESVDAIRLAALDLH---AYWMAHPQEKAVQQPI--KAEEKPGRNDPCPC 209
+V+ PE EE R+ A L A E K GRNDPCPC
Sbjct: 828 KVQVRMPEEVEELEQQRRMEAERLAQMQQLSHQDDDSAAAAALAAQTGERKVGRNDPCPC 887

Query: 210 GSGKKFKQCC 219
GSGKK+KQC
Sbjct: 888 GSGKKYKQCH 897


41  ECs2507  ECs2480  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
ECs2507   -1      19       -3.913035  leucine export protein LeuE
ECs2506   -1      20       -2.402366  hypothetical protein
ECs2505   -1      17       -0.959858  hypothetical protein
ECs2504   -1      18       -0.902550  hypothetical protein
ECs2503   -1      18       -0.171564  hypothetical protein
ECs2502   0       20       0.272890   hypothetical protein
ECs2501   -1      21       0.149191   hypothetical protein
ECs2500   -2      20       -1.807869  amino acid/amine transport protein
ECs2499   -2      22       -5.646050  AraC family transcriptional regulator
ECs2498   -3      18       -5.445288  hypothetical protein
ECs2497   -1      14       -4.930440  hypothetical protein
ECs2496   -1      12       -4.375796  hypothetical protein
ECs2495   -2      12       -4.204874  hypothetical protein
ECs2494   -2      11       -3.573868  hypothetical protein
ECs2493   0       18       -1.450818  hypothetical protein
ECs2492   1       20       -1.421882  hypothetical protein
ECs2491   0       25       -1.596828  scaffolding protein in the formation of a
ECs2490   -1      22       -1.911457  aldehyde reductase
ECs2489   -1      19       -2.291556  hypothetical protein
ECs2488   -2      18       -2.937228  glyceraldehyde-3-phosphate dehydrogenase
ECs2487   0       19       -3.704491  methionine sulfoxide reductase B
ECs2486   0       20       -4.152005  hypothetical protein
ECs2485   0       19       -3.996038  oxidoreductase
ECs2484   -1      22       -5.099458  transporter
ECs2483   -2      20       -4.566510  oxidoreductase
ECs2482   -2      19       -4.025742  aldolase
ECs2481   -2      20       -4.203654  kinase
ECs2480   -1      19       -3.026803  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2504   HTHTETR       30     6e-04    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 30.0 bits (67), Expect = 6e-04
Identities = 9/37 (24%), Positives = 17/37 (45%), Gaps = 5/37 (13%)

Query: 4 LSWIIFGLIAGILAKWIMPG-----KDGGGFFMTILL 35
+ I+ G I+G++ W+ K ++ ILL
Sbjct: 163 AAIIMRGYISGLMENWLFAPQSFDLKKEARDYVAILL 199


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2498   PRTACTNFAMLY  28     0.021    Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 27.7 bits (61), Expect = 0.021
Identities = 18/61 (29%), Positives = 26/61 (42%)

Query: 49 QGLSIGIIILTIGVMAPIASGTLPPSTLIHSFLNWKSLVAIAVGVIVSWLGGRGVTLMGS 108
Q +I L IG + + LPPS ++ N ++ A VS LG +TL G
Sbjct: 174 QRSAIVDGGLHIGALQSLQPEDLPPSRVVLRDTNVTAVPASGAPAAVSVLGASELTLDGG 233

Query: 109 Q 109

Sbjct: 234 H 234


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2489   INVEPROTEIN   29     0.023    Salmonella/Shigella invasion protein E (InvE) signat...
		>INVEPROTEIN#Salmonella/Shigella invasion protein E (InvE) signature.

Length = 372

Score = 28.9 bits (64), Expect = 0.023
Identities = 18/81 (22%), Positives = 34/81 (41%), Gaps = 13/81 (16%)

Query: 158 ETTSALHTYFNVGDIAKVSVSGLGDRFIDKVNDAKED-----------VLTDGIQTFPDR 206
E ++AL + N D K S S L + F ++V + + V ++ F +
Sbjct: 57 EMSAALAQFRNRRDYEKKS-SNLSNSF-ERVLEDEALPKAKQILKLISVHGGALEDFLRQ 114

Query: 207 TDRVYLNPQDCSVINDEALNR 227
++ +P D ++ E L R
Sbjct: 115 ARSLFPDPSDLVLVLRELLRR 135


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2484   TCRTETB       31     0.011    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 31.0 bits (70), Expect = 0.011
Identities = 33/142 (23%), Positives = 48/142 (33%), Gaps = 23/142 (16%)

Query: 71 MFLGALVGGIIGDKTGRRNAFILYEAIHIASMVVGAFSPNMDF-LIACRFVMGVGLGALL 129
+G V G + D+ G + + I+ V+G + LI RF+ G G A
Sbjct: 62 FSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFP 121

Query: 130 VTLFAGFTEYMPGRNR----GTWSSRVSFIGNWSYPLCSLIAMGLTPLISA----EWNWR 181
+ Y+P NR G S V+ + G+ P I +W
Sbjct: 122 ALVMVVVARYIPKENRGKAFGLIGSIVA------------MGEGVGPAIGGMIAHYIHWS 169

Query: 182 VQLLIPAILSLIATALAWRYFP 203
LLIP I I T
Sbjct: 170 YLLLIPMI--TIITVPFLMKLL 189


42  ECs2464  ECs2425  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias    Product
ECs2464   2       14       2.941110   cytochrome oxidase
ECs2463   1       13       2.734957   thiosulfate sulfur transferase
ECs2462   2       13       2.104030   ABC transporter ATP-binding protein
ECs2461   1       15       1.662385   transport system permease
ECs2460   1       15       0.767064   ABC transporter substrate-binding protein
ECs2459   -1      13       0.451915   hypothetical protein
ECs2458   -1      13       0.680791   hypothetical protein
ECs2457   0       11       2.224709   hypothetical protein
ECs2456   0       12       3.077053   hypothetical protein
ECs2455   -1      12       3.701672   exonuclease III
ECs2454   -1      11       3.558761   bifunctional succinylornithine
ECs2453   0       12       3.361930   arginine succinyltransferase
ECs2452   0       12       2.722135   succinylglutamic semialdehyde dehydrogenase
ECs2451   0       12       0.761108   succinylarginine dihydrolase
ECs2450   -1      13       -0.355323  succinylglutamate desuccinylase
ECs2449   0       15       -1.773325  hypothetical protein
ECs2448   1       15       -2.102037  hypothetical protein
ECs2447   0       18       -2.403478  nucleotide excision repair endonuclease
ECs2446   0       17       -4.552486  NAD synthetase
ECs2445   0       17       -4.951842  OsmE family transcriptional regulator
ECs2444   0       18       -4.593045  PTS system N,N'-diacetylchitobiose-specific
ECs2443   -2      12       -2.994581  PTS system N,N'-diacetylchitobiose-specific
ECs2442   -2      14       -2.930798  PTS system N,N'-diacetylchitobiose-specific
ECs2441   -2      16       -4.693278  DNA-binding transcriptional regulator ChbR
ECs2440   -3      13       -2.821607  6-phospho-beta-glucosidase
ECs2439   -2      14       -2.257779  hypothetical protein
ECs2438   -2      16       -2.082059  hydroperoxidase II
ECs2437   -1      18       -3.859905  cell division modulator
ECs2436   -1      18       -3.367497  hypothetical protein
ECs2435   0       16       -1.218621  hypothetical protein
ECs2434   -1      18       -1.181111  hypothetical protein
ECs2433   -1      19       -1.397876  2-deoxyglucose-6-phosphate phosphatase
ECs2432   -1      26       -5.527366  hypothetical protein
ECs2431   -2      19       -4.222728  hypothetical protein
ECs2430   -1      19       -4.729461  hypothetical protein
ECs2429   -2      18       -4.495772  6-phosphofructokinase
ECs2428   -1      22       -5.541008  hypothetical protein
ECs2427   0       21       -4.622793  hypothetical protein
ECs2426   3       28       0.269572   threonyl-tRNA synthetase
ECs2425   4       28       0.933826   translation initiation factor IF-3
43  ECs2292  ECs2276  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
ECs2292   -3      21       -3.723127  hypothetical protein
ECs2291   -1      19       -3.477587  hypothetical protein
ECs2290   0       20       -2.971370  spermidine N1-acetyltransferase
ECs2289   0       23       -3.234615  hypothetical protein
ECs2288   1       22       -3.520988  hypothetical protein
ECs2287   1       24       -3.940362  excisionase
ECs2286   1       24       -3.675953  hypothetical protein
ECs2285   1       37       -6.391400  hypothetical protein
ECs2284   1       35       -5.576592  inhibitor of cell division
ECs5458   3       33       -5.239561  hypothetical protein
ECs2283   3       38       -5.351638  hypothetical protein
ECs2282   1       29       -4.041304  hypothetical protein
ECs2281   0       24       -2.554585  hypothetical protein
ECs2280   2       20       -0.971555  hypothetical protein
ECs2279   2       20       -1.027506  hypothetical protein
ECs2278   2       18       0.630048   hypothetical protein
ECs2277   2       19       0.852202   hypothetical protein
ECs2276   2       21       -0.223601  replication protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2290   SACTRNSFRASE  40     1e-06    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 40.3 bits (94), Expect = 1e-06
Identities = 22/112 (19%), Positives = 47/112 (41%), Gaps = 4/112 (3%)

Query: 34 FEEPYEAFVELSDLYDKHIHDQSERRFVVECDGEKAGLVELVEINHVHRRAEFQ-IIISP 92
F +PY E D+ ++ ++ + F+ + G +++ + + A + I ++
Sbjct: 42 FSKPYFKQYEDDDMDVSYVEEEGKAAFLYYLENNCIGRIKIRS--NWNGYALIEDIAVAK 99

Query: 93 EYQGKGLATRAAKLAMDYGFTVLNLYKLYLIVDKENEKAIHIYRKLGFSVEG 144
+Y+ KG+ T A+++ + L L N A H Y K F +
Sbjct: 100 DYRKKGVGTALLHKAIEWAKEN-HFCGLMLETQDINISACHFYAKHHFIIGA 150


44  ECs2265  ECs2201  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
ECs2265   2       25       2.645639   ***hypothetical protein
ECs2264   3       25       2.292752   hypothetical protein
ECs2263   2       25       2.385711   hypothetical protein
ECs2262   2       23       1.529813   hypothetical protein
ECs5452   1       21       0.050498   hypothetical protein
ECs2261   0       24       -0.709416  holin
ECs2260   1       22       -0.111536  hypothetical protein
ECs2259   1       23       0.822742   endolysin
ECs2258   1       26       0.748372   antirepressor protein
ECs2257   3       26       4.386679   endopeptidase
ECs2255   4       27       5.984749   hypothetical protein
ECs2254   5       29       6.627269   Dnase
ECs2253   4       29       6.034535   hypothetical protein
ECs2252   5       29       5.972103   terminase small subunit
ECs2251   6       29       6.264815   terminase large subunit
ECs2250   6       30       6.326224   major head protein/prohead protease
ECs5451   6       30       5.564948   hypothetical protein
ECs2248   6       29       5.120471   portal protein
ECs2247   7       32       5.576131   hypothetical protein
ECs2246   7       33       6.003204   head-tail adaptor
ECs2245   4       32       5.704073   hypothetical protein
ECs2244   5       34       5.557322   hypothetical protein
ECs2243   5       34       5.424846   major tail subunit
ECs2242   5       34       5.780424   tail assembly chaperone
ECs2241   4       33       6.089084   tail protein
ECs2240   3       31       5.903287   tail length tape measure protein
ECs2239   4       30       6.682621   minor tail protein
ECs2238   3       31       6.272780   minor tail protein
ECs2237   2       27       4.390564   tail assembly protein
ECs2236   1       26       2.002334   tail assembly protein
ECs2232   1       27       -1.374032  outer membrane protein
ECs2231   2       25       0.665789   tail fiber protein
ECs2230   0       22       -1.334781  hypothetical protein
ECs2229   1       22       -1.130210  hypothetical protein
ECs2227   1       23       0.843546   hypothetical protein
ECs2226   1       23       2.720600   hypothetical protein
ECs2224   2       24       4.255118   hypothetical protein
ECs2223   2       22       1.903387   hypothetical protein
ECs2222   1       22       0.064612   hypothetical protein
ECs2221   0       18       -0.847689  transposase
ECs2220   0       21       -2.295267  transposase
ECs2219   -1      23       -3.406308  transposase
ECs2218   1       30       -4.933530  hypothetical protein
ECs2217   1       32       -5.348494  integrase
ECs2216   1       32       -6.419763  exonuclease
ECs2215   1       36       -8.323071  hypothetical protein
ECs2214   1       36       -8.204557  cell division inhibitor
ECs2213   1       36       -7.279260  hypothetical protein
ECs2212   1       30       -6.541584  hypothetical protein
ECs2211   1       26       -3.010910  hypothetical protein
ECs2210   2       25       0.221239   hypothetical protein
ECs2209   2       26       0.912966   repressor protein
ECs2208   1       24       2.201364   regulatory protein
ECs2207   2       27       2.351658   hypothetical protein
ECs2206   2       27       1.602759   hypothetical protein
ECs2205   1       28       -0.312277  replication protein
ECs2204   1       27       -2.158929  hypothetical protein
ECs2203   3       30       -4.073046  hypothetical protein
ECs2202   3       28       -5.298123  hypothetical protein
ECs2201   3       27       -4.717707  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs2232  ENTEROVIROMP  138  4e-44  Enterobacterial virulence outer membrane protein si...
		>ENTEROVIROMP#Enterobacterial virulence outer membrane protein signature.

Length = 171

Score = 138 bits (350), Expect = 4e-44
Identities = 61/200 (30%), Positives = 98/200 (49%), Gaps = 30/200 (15%)

Query: 1 MRKLYAAILSAAICLAVSGAPAWASEQQATLSAGYLHARTSAPGSDNLNGINVKYRYEFT 60
M+K+ AA+ +G + +T++ GY + + + G N+KYRYE
Sbjct: 1 MKKIACLSALAAVLAFTAGT---SVAATSTVTGGYAQSDAQGQMNK-MGGFNLKYRYEED 56

Query: 61 DT-LGLVTSFSYAGDKNRQLTRYSDTRWHEDSVRNRWFSVMAGPSVRVNEWFSAYAMAGV 119
++ LG++ SF+Y T S T D +N+++ + AGP+ R+N+W S Y + GV
Sbjct: 57 NSPLGVIGSFTY--------TEKSRTASSGDYNKNQYYGITAGPAYRINDWASIYGVVGV 108

Query: 120 AYSRVSTFSGDYLRVTDNKGKTHDVLTGSDDGRHSNTSLAWGAGVQFNPTESVAIDIAYE 179
Y + T T+ HD S+ ++GAG+QFNP E+VA+D +YE
Sbjct: 109 GYGKFQT--------TEYPTYKHD---------TSDYGFSYGAGLQFNPMENVALDFSYE 151

Query: 180 GSGSGDWRTDGFIVGVGYKF 199
S +I GVGY+F
Sbjct: 152 QSRIRSVDVGTWIAGVGYRF 171


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs2231  CHANLCOLICIN  30  0.018  Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 30.0 bits (67), Expect = 0.018
Identities = 31/118 (26%), Positives = 50/118 (42%), Gaps = 10/118 (8%)

Query: 130 SARNAGISASQAEESAANADTSAGDASESARQAAESAAAAKQSEEASSSSASAAAQKASE 189
S G S++E SAA T+ ++ + AE AA AK A+A AQ ++
Sbjct: 34 SGGGGGKGGSKSESSAAIHATAKWSTAQLKKTQAEQAARAK---------AAAEAQAKAK 84

Query: 190 SSQSAADAELSKKTAESAAGNAARDATTATEKARESAESAQSAEQSRIA-AEEAVNRI 246
+++ A L E+ NA+R + +A E+ R+A AEE +
Sbjct: 85 ANRDALTQRLKDIVNEALRHNASRTPSATELAHANNAAMQAEDERLRLAKAEEKARKE 142



Score = 29.7 bits (66), Expect = 0.025
Identities = 31/147 (21%), Positives = 51/147 (34%), Gaps = 6/147 (4%)

Query: 101 EALRRFELMVEEAARHAEEAKKNAGEAETSARNAGISASQAEESAANADTSAGDASESAR 160
E R+ E+A + AE+ +K E + + ++AEE A + A E
Sbjct: 137 EKARKEAEAAEKAFQEAEQRRKEIER-EKAETERQLKLAEAEEKRLAALSEEAKAVE--- 192

Query: 161 QAAESAAAAKQSEEASSSSASAAAQKASESSQSAADAELSKKTAESAAGNAARDATTATE 220
A+ +A QSE SS A DAE+ + A +
Sbjct: 193 -IAQKKLSAAQSEVVKMDGEIKTLNSRLSSSIHARDAEMKTLAGKRNELAQASAKYKELD 251

Query: 221 KARES-AESAQSAEQSRIAAEEAVNRI 246
+ + + A Q+R E R+
Sbjct: 252 ELVKKLSPRANDPLQNRPFFEATRRRV 278


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs2209  PF07675  28  0.025  Cleaved Adhesin
		>PF07675#Cleaved Adhesin

Length = 1358

Score = 27.8 bits (61), Expect = 0.025
Identities = 17/73 (23%), Positives = 30/73 (41%), Gaps = 4/73 (5%)

Query: 78 IPANTFAVVLESDSMSTSGGGVSIPNGSTVFVDPDRIVQPGNIVLALPKGTTTPVIRKLE 137
I A+ + V S G GV+ +G +I + GN + + + PVI++++
Sbjct: 267 IQASAGSYVAISKDGVLYGTGVANASGVATVNMTKQITENGNYDVVITRSNYLPVIKQIQ 326

Query: 138 IEGPDILLVPTNP 150
P P P
Sbjct: 327 AGEPS----PYQP 335


45  ECs2192  ECs2154  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
ECs2192  2  24  0.799262  hypothetical protein
ECs2191  2  24  0.429723  transcriptional regulator
ECs2190  1  26  2.210183  ***hypothetical protein
ECs2189  3  21  1.374234  hypothetical protein
ECs5446  1  23  -0.876235  hypothetical protein
ECs2188  1  24  -0.986354  holin
ECs2187  1  25  -2.461783  hypothetical protein
ECs2186  0  24  -1.965666  endolysin
ECs2185  0  19  -0.543577  antirepressor protein
ECs2184  1  20  0.367173  endopeptidase
ECs2182  0  22  2.431128  transcriptional regulator
ECs2181  1  21  4.208282  hypothetical protein
ECs2180  1  22  5.096026  terminase small subunit
ECs2179  1  23  5.573794  terminase large subunit
ECs2178  4  25  6.775509  head-to-tail joining protein
ECs2177  4  24  6.857395  portal protein
ECs2176  3  23  6.020364  minor capsid protein
ECs2175  3  22  4.042258  head decoration protein
ECs2174  3  22  3.525632  major capsid protein
ECs2173  3  26  2.255175  DNA-packaging protein
ECs2172  2  26  2.182628  tail attachment protein
ECs2171  3  25  4.073114  minor tail protein
ECs2170  3  26  4.593798  minor tail protein
ECs2169  3  26  5.063926  hypothetical protein
ECs2168  5  30  5.595125  minor tail protein
ECs2167  5  30  6.613106  minor tail protein
ECs2166  4  30  7.113567  tail length tape measure protein
ECs2165  6  35  6.598409  minor tail protein
ECs2164  6  33  6.500219  minor tail protein
ECs5445  5  33  6.553575  hypothetical protein
ECs2163  5  33  6.036903  tail assembly protein
ECs2162  5  30  4.435659  tail assembly protein
ECs2161  5  26  2.329448  host specificity protein
ECs2160  3  35  -4.082432  outer host membrane protein
ECs2159  5  40  -5.886354  tail fiber protein
ECs2157  4  36  -8.215435  hypothetical protein
ECs2156  1  31  -6.566213  hypothetical protein
ECs2155  0  27  -5.901062  hypothetical protein
ECs2154  -2  22  -4.464963  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs2169  INTIMIN  31  0.006  Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 30.8 bits (69), Expect = 0.006
Identities = 23/119 (19%), Positives = 44/119 (36%), Gaps = 17/119 (14%)

Query: 134 KEVITRTVKVTNVGKPSVAEERSEITPATAIKVTP-------------TSGTVAKGKTTT 180
++ IT TVKV KP +E + T + + TS T K +
Sbjct: 675 QDAITYTVKVMKGDKPVSNQEVTFTTTLGKLSNSTEKTDTNGYAKVTLTSTTPGKSLVSA 734

Query: 181 LT--VSFEPESATDKTFRAVSADPSKATI--SVKDMTITVNGVATGKVQIPVVSGNGQF 235
V+ + ++ + F ++ D I + + + G+V + GNG++
Sbjct: 735 RVSDVAVDVKAPEVEFFTTLTIDDGNIEIVGTGVKGKLPTVWLQYGQVNLKASGGNGKY 793


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs2166  LCRVANTIGEN  34  0.002  Low calcium response V antigen signature.
		>LCRVANTIGEN#Low calcium response V antigen signature.

Length = 326

Score = 34.3 bits (78), Expect = 0.002
Identities = 25/101 (24%), Positives = 46/101 (45%), Gaps = 4/101 (3%)

Query: 518 DLWKAESQYAVL-KEAATKRQLSEQEKSLLAHKDETLEYKRQLAELG---DKVEYQKRLN 573
+++KA ++Y +L K T Q+ EK +++ KD ++ LG + Y K N
Sbjct: 205 EIFKASAEYKILEKMPQTTIQVDGSEKKIVSIKDFLGSENKRTGALGNLKNSYSYNKDNN 264

Query: 574 ELAQQAVRFEEQQSAKQAAISAKARGLTDRQAQRESEAQRL 614
EL+ A ++ +S K L+D ++ S + L
Sbjct: 265 ELSHFATTCSDKSRPLNDLVSQKTTQLSDITSRFNSAIEAL 305


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs2160  ENTEROVIROMP  138  4e-44  Enterobacterial virulence outer membrane protein si...
		>ENTEROVIROMP#Enterobacterial virulence outer membrane protein signature.

Length = 171

Score = 138 bits (350), Expect = 4e-44
Identities = 61/200 (30%), Positives = 98/200 (49%), Gaps = 30/200 (15%)

Query: 1 MRKLYAAILSAAICLAVSGAPAWASEQQATLSAGYLHARTSAPGSDNLNGINVKYRYEFT 60
M+K+ AA+ +G + +T++ GY + + + G N+KYRYE
Sbjct: 1 MKKIACLSALAAVLAFTAGT---SVAATSTVTGGYAQSDAQGQMNK-MGGFNLKYRYEED 56

Query: 61 DT-LGLVTSFSYAGDKNRQLTRYSDTRWHEDSVRNRWFSVMAGPSVRVNEWFSAYAMAGV 119
++ LG++ SF+Y T S T D +N+++ + AGP+ R+N+W S Y + GV
Sbjct: 57 NSPLGVIGSFTY--------TEKSRTASSGDYNKNQYYGITAGPAYRINDWASIYGVVGV 108

Query: 120 AYSRVSTFSGDYLRVTDNKGKTHDVLTGSDDGRHSNTSLAWGAGVQFNPTESVAIDIAYE 179
Y + T T+ HD S+ ++GAG+QFNP E+VA+D +YE
Sbjct: 109 GYGKFQT--------TEYPTYKHD---------TSDYGFSYGAGLQFNPMENVALDFSYE 151

Query: 180 GSGSGDWRTDGFIVGVGYKF 199
S +I GVGY+F
Sbjct: 152 QSRIRSVDVGTWIAGVGYRF 171


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs2159  IGASERPTASE  31  0.007  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 30.8 bits (69), Expect = 0.007
Identities = 16/83 (19%), Positives = 33/83 (39%), Gaps = 3/83 (3%)

Query: 165 SAAAAKQSEEASSSSASAAAQKASESSQSAADAELSKKTAESAAGNAARMQRPQQKKPGS 224
+ Q++ S S + + E+ +T E+ A N+ + + +K
Sbjct: 998 TTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKN--- 1054

Query: 225 QQKAHSQREQSRIAAEEAVNRIP 247
+Q A Q+R A+EA + +
Sbjct: 1055 EQDATETTAQNREVAKEAKSNVK 1077


46  ECs2117  ECs2100  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
ECs2117  2  19  -2.927755  lipoprotein
ECs2116  0  20  -3.749732  ABC transporter ATP-binding protein
ECs2115  0  28  -5.923397  hypothetical protein
ECs2113  3  30  -6.354438  type 1 fimbrial protein
ECs2112  1  25  -4.715398  fimbrial chaperone protein
ECs2110  1  25  -4.935089  outer membrane protein
ECs2109  0  25  -5.613961  fimbrial-like protein
ECs2108  0  26  -6.123221  fimbrial-like protein
ECs2107  0  25  -7.129212  adhesin
ECs2106  -1  24  -7.375559  oxidoreductase
ECs2105  -1  21  -8.517969  hypothetical protein
ECs2104  -2  20  -7.502192  transcriptional regulator YdeO
ECs2103  -2  15  -5.975428  sulfatase
ECs2102  -2  14  -5.343420  hypothetical protein
ECs2101  -2  12  -3.894610  ABC transporter ATP-binding protein
ECs2100  -2  12  -3.619996  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs2116  PRTACTNFAMLY  114  4e-29  Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 114 bits (286), Expect = 4e-29
Identities = 118/467 (25%), Positives = 176/467 (37%), Gaps = 59/467 (12%)

Query: 22 NGLMTFNATLGGDNSPTDKMNVKGDTQGNTRVRVDNIGGVGAQTVNGIELIEVGGNSAGN 81
+GL N D +DK+ V D G R+ V N G + N + L++ SA
Sbjct: 481 SGLFRMNVFA--DLGLSDKLVVMQDASGQHRLWVRNSGS-EPASANTLLLVQTPLGSAAT 537

Query: 82 FALTT--GTVEAGAYVYTLAKGKGNDEKNWYLTSKWDGVTPADTPDPINNPPVVDPEGPS 139
F L G V+ G Y Y LA N W L P P P PP P
Sbjct: 538 FTLANKDGKVDIGTYRYRLAA---NGNGQWSLVGAKAPPAPKPAPQPGPQPPQPPQPQPE 594

Query: 140 --VYRPEAGSYIS----------NIAAANSLF---SHRLHDRLGEPQYTDSLHSQDSASS 184
+P AG +S + A++L+ S+ L RLGE L A
Sbjct: 595 APAPQPPAGRELSAAANAAVNTGGVGLASTLWYAESNALSKRLGE------LRLNPDAGG 648

Query: 185 MWMRHVGGHERSSAGDGQLNTQANRYVLQLGGDLAQWSSNAQDRWHLGVMAGYANQHSNT 244
W R ++ G+ Q +LG D A + A RWHLG +AGY
Sbjct: 649 AWGRGFAQRQQLDNRAGRRFDQ-KVAGFELGADHA--VAVAGGRWHLGGLAGYTR----- 700

Query: 245 QSNRVGYKSDGRISGYSAGLYATWYQNDANKTGAYVDSWALYNWFDNSV---SSDNRSAD 301
G+ DG G++ ++ Y +G Y+D+ + +N SD +
Sbjct: 701 --GDRGFTGDG--GGHTDSVHVGGYATYIADSGFYLDATLRASRLENDFKVAGSDGYAVK 756

Query: 302 -DYDSRGVTASVEGGYTFEAGTCSGSEGTLNTWYVQPQAQITWMGVKDSDHARKDGTRIE 360
Y + GV AS+E G F + W+++PQA++ + +G R+
Sbjct: 757 GKYRTHGVGASLEAGRRFTHA---------DGWFLEPQAELAVFRAGGGAYRAANGLRVR 807

Query: 361 TEGDGNVQTRLGVKTYLNSHHQRDDGKQREFQPYIEANWINNSK-VYAVKMNGQTVSRDG 419
EG +V RLG L + + R+ QPYI+A+ + V NG +
Sbjct: 808 DEGGSSVLGRLG----LEVGKRIELAGGRQVQPYIKASVLQEFDGAGTVHTNGIAHRTEL 863

Query: 420 ARNLGEVRTGVEAKVNNNLSLWGNVGVQLGDKGYSDTQGMLGVKYSW 466
E+ G+ A + SL+ + G K G +YSW
Sbjct: 864 RGTRAELGLGMAAALGRGHSLYASYEYSKGPKLAMPWTFHAGYRYSW 910


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs2110  PF00577  388  e-130  Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 388 bits (997), Expect = e-130
Identities = 217/382 (56%), Positives = 292/382 (76%)

Query: 1 MSGYTVKPPTGDSNEQTQFIDYFNLFYSKRDQEQISISQQLGNYGATFFSASRQSYWNTS 60
M+GY ++ G + +F DY+NL Y+KR + Q++++QQLG + S S Q+YW TS
Sbjct: 497 MNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRTSTLYLSGSHQTYWGTS 556

Query: 61 RSDQQISFGLNVPFGDITTSLNYSYSNNIWQNDRDHLLAFTLNVPFSHWMRTDSQSAFRN 120
D+Q GLN F DI +L+YS + N WQ RD +LA +N+PFSHW+R+DS+S +R+
Sbjct: 557 NVDEQFQAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRSDSKSQWRH 616

Query: 121 SNASYSMSNDLKGGMTNLSGVYGTLLPDNNLNYSVQVGNTHGGNTSSGTSGYSTLNYRGA 180
++ASYSMS+DL G MTNL+GVYGTLL DNNL+YSVQ G GG+ +SG++GY+TLNYRG
Sbjct: 617 ASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGG 676

Query: 181 YGNTNVGYSRSGDSSQIYYGMSGGIIAHADGITFGQPLGDTMVLVKAPGADNVKIENQTG 240
YGN N+GYS S D Q+YYG+SGG++AHA+G+T GQPL DT+VLVKAPGA + K+ENQTG
Sbjct: 677 YGNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNDTVVLVKAPGAKDAKVENQTG 736

Query: 241 IHTDWRGYAILPFATEYRENRVALNANSLADNVELDETVVTVIPTHGAIARATFNAQIGG 300
+ TDWRGYA+LP+ATEYRENRVAL+ N+LADNV+LD V V+PT GAI RA F A++G
Sbjct: 737 VRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAEFKARVGI 796

Query: 301 KVLMTLKYGNKSVPFGAIVTHGENKNGSIVAENGQVYLTGLPQSGKLQVSWGNDKNSNCI 360
K+LMTL + NK +PFGA+VT +++ IVA+NGQVYL+G+P +GK+QV WG ++N++C+
Sbjct: 797 KLLMTLTHNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLAGKVQVKWGEEENAHCV 856

Query: 361 VDYKLPEVSPGTLLNQQTAICR 382
+Y+LP S LL Q +A CR
Sbjct: 857 ANYQLPPESQQQLLTQLSAECR 878


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs2108  FIMBRIALPAPF  33  2e-04  Escherichia coli: P pili tip fibrillum papF protein...
		>FIMBRIALPAPF#Escherichia coli: P pili tip fibrillum papF protein signature.

Length = 167

Score = 33.1 bits (75), Expect = 2e-04
Identities = 29/93 (31%), Positives = 47/93 (50%), Gaps = 7/93 (7%)

Query: 16 LLTATLQAADVTITVNGRVVAKPCTIQT-KEANVNLGDLYTRNLQQPGSASGWHNITLSL 74
LLT+ ADV I + G V PCTI + V+ G++ N + ++ G +S+
Sbjct: 11 LLTSVAVLADVQINIRGNVYIPPCTINNGQNIVVDFGNI---NPEHVDNSRGEVTKNISI 67

Query: 75 TDCPAETSAVTAIVTGSTDNTGYYKNEGTAENI 107
+ CP ++ ++ VTG+T G +N A NI
Sbjct: 68 S-CPYKSGSLWIKVTGNTMGVG--QNNVLATNI 97


47  ECs2078  ECs2061  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
ECs2078  0  17  -6.307169  nitrate-inducible formate dehydrogenase-N alpha
ECs2077  2  30  -9.887684  hypothetical protein
ECs2076  -1  19  -5.292066  outer membrane porin protein
ECs2075  -1  18  -3.708909  IpaH-like protein
ECs2074  -2  15  -2.236188  hypothetical protein
ECs2073  -2  14  -0.925288  hypothetical protein
ECs2072  -3  16  1.708664  nitrite extrusion protein 2
ECs2071  -2  16  2.202292  cryptic nitrate reductase 2 subunit alpha
ECs2070  0  13  1.468323  cryptic nitrate reductase 2 subunit beta
ECs2069  0  15  0.448739  cryptic nitrate reductase 2 subunit delta
ECs2068  0  18  -1.202032  cryptic nitrate reductase 2 subunit gamma
ECs2067  0  22  -3.632899  hypothetical protein
ECs2066  2  27  4.888543  N-hydroxyarylamine O-acetyltransferase
ECs2065  2  29  5.081546  hypothetical protein
ECs2064  2  29  4.648084  4-oxalocrotonate tautomerase
ECs5442  2  27  4.282463  hypothetical protein
ECs5441  2  24  4.019956  hypothetical protein
ECs2061  2  22  4.068760  protein RhsE
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs2076  ECOLIPORIN  477  e-172  E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 477 bits (1230), Expect = e-172
Identities = 225/386 (58%), Positives = 272/386 (70%), Gaps = 23/386 (5%)

Query: 1 MKLKIVAVVVTGLLAANVAHAAEVYNKDGNKLDLYGKVTALRYFTDDKRDDGDKTYARLG 60
MK K++A+V+ LLAA AHAAE+YNKDGNKLDLYGKV L YF+DD DGD+TY R+G
Sbjct: 1 MKRKVLALVIPALLAAGAAHAAEIYNKDGNKLDLYGKVDGLHYFSDDSSKDGDQTYMRVG 60

Query: 61 FKGETQINDQMIGFGHWEYDFKGYNDEANGSRDNKTRLAYAGLKISEFGSLDYGRNYGVG 120
FKGETQINDQ+ G+G WEY+ + E G+ ++ TRLA+AGLK ++GS DYGRNYGV
Sbjct: 61 FKGETQINDQLTGYGQWEYNVQANTTEGEGA-NSWTRLAFAGLKFGDYGSFDYGRNYGVL 119

Query: 121 YDIGSWTDMLPEFGGDTWSQKDVFMTYRTTGVATYRNYDFFGLIEGLNFAAQYQGKNER- 179
YD+ WTDMLPEFGGD+++ D +MT R GVATYRN DFFGL++GLNFA QYQGKNE
Sbjct: 120 YDVEGWTDMLPEFGGDSYTYADNYMTGRANGVATYRNTDFFGLVDGLNFALQYQGKNESQ 179

Query: 180 -------TDNSHLYGADYTRANGDGFGISSTYVYD-GFGIGAVYTKSDRTNAQERAAANP 231
N+ G D NGDGFGIS+TY GF GA YT SDRTN Q A
Sbjct: 180 SADDVNIGTNNRNNGDDIRYDNGDGFGISTTYDIGMGFSAGAAYTTSDRTNEQVNAGGT- 238

Query: 232 LNASGKNAELWATGIKYDANNIYFAANYAETLNMTTYG------DGYISNKAQSFEVVAQ 285
A G A+ W G+KYDANNIY A Y+ET NMT YG DG ++NK Q+FEV AQ
Sbjct: 239 -IAGGDKADAWTAGLKYDANNIYLATMYSETRNMTPYGKTDKGYDGGVANKTQNFEVTAQ 297

Query: 286 YQFDFGLRPSLAYLKSKGIDLGR----YGDQDMIEYIDVGATYFFNKNMSTYVDYKINLI 341
YQFDFGLRP++++L SKG DL D+D+++Y DVGATY+FNKN STYVDYKINL+
Sbjct: 298 YQFDFGLRPAVSFLMSKGKDLTYNNVNGDDKDLVKYADVGATYYFNKNFSTYVDYKINLL 357

Query: 342 DESD-FTRAVDIRTDNIVATGITYQF 366
D+ D F + I TD+IVA G+ YQF
Sbjct: 358 DDDDPFYKDAGISTDDIVALGMVYQF 383


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs2065  IGASERPTASE  27  0.024  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 27.3 bits (60), Expect = 0.024
Identities = 7/29 (24%), Positives = 13/29 (44%)

Query: 119 WQFDDDKLNTLHHLGAGTFVTSGKRVTAG 147
W+ + + + L +G GT + G G
Sbjct: 437 WKVHNPQYDRLAKIGKGTLIVEGTGDNKG 465


48  ECs2024  ECs2011  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias  Product
ECs2024  -2  16  -3.193565  hypothetical protein
ECs2023  -1  18  -4.218961  cytochrome b561
ECs2022  -1  21  -5.333315  glyceraldehyde 3-phosphate dehydrogenase C
ECs2021  -1  26  -7.147935  aldehyde dehydrogenase
ECs2020  0  14  -2.946622  hypothetical protein
ECs2019  0  13  -2.947262  hypothetical protein
ECs2018  -1  11  -2.091347  hypothetical protein
ECs2017  -2  10  -1.003764  hypothetical protein
ECs2016  -2  11  -1.021631  hypothetical protein
ECs2015  -3  11  -0.589938  ATP-dependent RNA helicase HrpA
ECs2014  -3  26  -4.474979  azoreductase
ECs2013  -3  27  -4.219842  hypothetical protein
ECs2012  -2  16  -2.643774  hypothetical protein
ECs2011  -1  15  -3.089991  phosphatidate cytidylyltransferase
49  ECs1997  ECs1930  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
ECs1997  7  41  -3.625388  filament protein
ECs1996  5  36  -2.444584  hypothetical protein
ECs1995  4  24  2.618430  hypothetical protein
ECs1994  4  24  3.845406  hypothetical protein
ECs1993  4  25  4.815839  hypothetical protein
ECs1992  4  25  5.491574  tail fiber protein
ECs1991  4  26  4.922803  outer membrane protein
ECs1990  5  28  5.143003  host specificity protein
ECs1989  4  32  5.337403  copper/zinc-superoxide dismutase
ECs1988  4  33  5.949927  hypothetical protein
ECs1987  4  33  6.375454  tail assembly protein
ECs1986  5  34  6.077462  tail assembly protein
ECs1985  5  33  5.451280  minor tail protein
ECs1984  4  33  5.473063  minor tail protein
ECs1983  4  31  5.530739  tail length tape measure protein
ECs1982  6  32  5.446986  hypothetical protein
ECs1981  7  31  5.185811  tail assembly chaperon
ECs1980  8  29  5.083442  major tail protein
ECs1979  6  28  5.315052  hypothetical protein
ECs1978  6  27  5.113340  hypothetical protein
ECs1977  6  29  6.158785  head-tail adaptor
ECs1976  6  29  6.117157  hypothetical protein
ECs1975  4  28  5.917934  hypothetical protein
ECs1974  4  29  6.014193  portal protein
ECs5438  5  28  6.478459  hypothetical protein
ECs1972  5  27  5.849215  phage major head protein/prohead protease
ECs1971  4  25  4.240892  terminase large subunit
ECs1970  2  25  0.689362  terminase small subunit
ECs1968  3  23  0.759853  DNase
ECs5437  1  22  -0.661938  hypothetical protein
ECs1967  1  23  -0.945934  hypothetical protein
ECs1966  2  22  1.728674  endopeptidase
ECs1965  3  24  1.460716  antirepressor protein
ECs1964  1  26  2.257718  endolysin
ECs1963  1  27  0.848207  hypothetical protein
ECs1962  1  25  1.028703  holin
ECs1961  -1  22  0.611620  hypothetical protein
ECs1960  0  28  -8.043859  hypothetical protein
ECs1959  4  35  -11.198781  hypothetical protein
ECs1958  6  44  -13.010252  ***antitermination protein
ECs1957  7  47  -13.344640  hypothetical protein
ECs1956  7  47  -13.776010  hypothetical protein
ECs1955  8  52  -15.220634  hypothetical protein
ECs1954  8  51  -14.479280  hypothetical protein
ECs1953  4  43  -10.303327  methyltransferase
ECs1952  -1  29  -3.931447  hypothetical protein
ECs1951  -2  24  -2.293150  hypothetical protein
ECs1950  0  25  -2.091081  hypothetical protein
ECs1949  0  23  -1.204649  hypothetical protein
ECs1948  0  23  -1.354935  hypothetical protein
ECs1946  1  24  -2.378543  hypothetical protein
ECs1945  1  26  -2.901257  replication protein
ECs1944  2  31  -4.741717  hypothetical protein
ECs1943  1  31  -8.339005  hypothetical protein
ECs1942  1  36  -10.869882  regulatory protein
ECs1941  3  34  -11.076315  transcriptional regulator
ECs1940  0  30  -10.771774  hypothetical protein
ECs1939  1  27  -5.242553  hypothetical protein
ECs1937  1  26  -5.383617  phage superinfection exclusion protein
ECs1936  1  28  -4.446108  FtsZ inhibitor protein
ECs5433  1  28  -4.606960  hypothetical protein
ECs1935  2  27  -4.332202  hypothetical protein
ECs1934  2  24  -4.701307  exonuclease VIII
ECs1933  1  19  -6.210806  recombination and repair protein RecT
ECs1932  -1  19  -4.241718  restriction alleviation and modification
ECs1931  -2  17  -3.235469  hypothetical protein
ECs1930  -2  17  -3.143382  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs1992  IGASERPTASE  38  6e-05  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 38.1 bits (88), Expect = 6e-05
Identities = 34/218 (15%), Positives = 68/218 (31%), Gaps = 23/218 (10%)

Query: 3 EDSQPGTLNDFLGAMSEDDVRPEALRRFELMVEEAARHAEEAKKNAGEAETSARNAGISA 62
+ + T N+ + E + R + A ++ ET A N+ +
Sbjct: 993 DTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSE----TTETVAENSKQES 1048

Query: 63 SQAEESAANADTSAGDASESARQAA-ESAAAAKQSEDASSSSASAAAQKASESSQSAAEA 121
E++ +A + E A++A A + +E A S S + Q + E
Sbjct: 1049 KTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEK 1108

Query: 122 ------------ELSRKTAESAAGNAARDAT-TATEKARE-----SAESAQSAEQSRIAA 163
E+ + T++ + + E ARE + + QS +
Sbjct: 1109 EEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADT 1168

Query: 164 EEAVNRIPTVVGPPGPKGEQGPAGPQGPKGDKGERGDT 201
E+ + V P + G + + T
Sbjct: 1169 EQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPAT 1206


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs1991  ENTEROVIROMP  136  3e-43  Enterobacterial virulence outer membrane protein si...
		>ENTEROVIROMP#Enterobacterial virulence outer membrane protein signature.

Length = 171

Score = 136 bits (344), Expect = 3e-43
Identities = 57/193 (29%), Positives = 93/193 (48%), Gaps = 27/193 (13%)

Query: 8 ILSAAICLTVSGAPAWASEQQATLSAGYLHVSTNAPGSDNLNGINVKYRYEFTDT-LGLV 66
+A+ ++ + +T++ GY + + G N+KYRYE ++ LG++
Sbjct: 5 ACLSALAAVLAFTAGTSVAATSTVTGGYAQSDAQGQMNK-MGGFNLKYRYEEDNSPLGVI 63

Query: 67 TSFSYAGDRNRQITRYSDTRWHEDSVRNRWFSVMAGPSVRVNEWFSAYAMAGVAYSRVST 126
SF+Y T S T D +N+++ + AGP+ R+N+W S Y + GV Y + T
Sbjct: 64 GSFTY--------TEKSRTASSGDYNKNQYYGITAGPAYRINDWASIYGVVGVGYGKFQT 115

Query: 127 FSGDYLRVTDNKGKTHDVLTGSDDGRHSNTSLAWGAGVQFNPTESVAIDIAYEGSGSGDW 186
T+ HD S+ ++GAG+QFNP E+VA+D +YE S
Sbjct: 116 --------TEYPTYKHD---------TSDYGFSYGAGLQFNPMENVALDFSYEQSRIRSV 158

Query: 187 RTDGFIVGVGYKF 199
+I GVGY+F
Sbjct: 159 DVGTWIAGVGYRF 171


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs1990  SURFACELAYER  34  0.004  Lactobacillus surface layer protein signature.
		>SURFACELAYER#Lactobacillus surface layer protein signature.

Length = 439

Score = 33.9 bits (77), Expect = 0.004
Identities = 34/143 (23%), Positives = 45/143 (31%), Gaps = 30/143 (20%)

Query: 993 SVNANSGALNNVTINQNCTIKGMLEATQV----RGDF---------VKAVSKAFPKKVGT 1039
+ + L NVT + +K L+A ++ G F VKA S K
Sbjct: 235 AAQYDKKQLTNVTFDTETAVKDALKAQKIEVSSVGYFKAPHTFTVNVKATSNKNGKSATL 294

Query: 1040 WGNTETPNGTVTVTISDDHNFDRQIIIPPIIFNGIAYDDPGSGNNPGGTRYTGYGFEVRK 1099
PN V S I+ N YD + G R
Sbjct: 295 PVTVTVPNVADPVVPSQSKT---------IMHNAYFYDKDA--------KRVGTDKVTRY 337

Query: 1100 NGVLIASRETKGAIPGSYSAVID 1122
N V +A TK A SY VI+
Sbjct: 338 NTVTVAMNTTKLANGISYYEVIE 360


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs1983  CHANLCOLICIN  32  0.018  Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 31.6 bits (71), Expect = 0.018
Identities = 45/240 (18%), Positives = 93/240 (38%), Gaps = 24/240 (10%)

Query: 356 RAQQAVAAARGTEMQIAAEARLAATQERLN-------RNIAARSAAQNALNSTTAVGSRL 408
+A+QA A E Q A+A A +RL R+ A+R+ + L +
Sbjct: 66 QAEQAARAKAAAEAQAKAKANRDALTQRLKDIVNEALRHNASRTPSATELAHANNAAMQA 125

Query: 409 MSGALGLVGGVPGLVMLGAAAWYTLYQNQEQARESARQYALTIDEIAHKTPSMSLPEASD 468
L L AA + +++ +E R+ A T ++ + ++
Sbjct: 126 EDERLRLAKAEEKARKEAEAAEKAFQEAEQRRKEIEREKAETERQL----------KLAE 175

Query: 469 NEGRTRAALTEQNRLID-------EQASRVKSLQEKAQSIQDVLAGLEDRRVALIRQQAA 521
E + AAL+E+ + ++ S V + + +++ L+ R A ++ A
Sbjct: 176 AEEKRLAALSEEAKAVEIAQKKLSAAQSEVVKMDGEIKTLNSRLSSSIHARDAEMKTLAG 235

Query: 522 EQNKVYQSMLVMNGQHTEFNRLLGLGNELLQQRQGLVNVPLRLPQATLDDKQQSALTKTE 581
++N++ Q+ +L N+ LQ R R+ + +++Q +T +E
Sbjct: 236 KRNELAQASAKYKELDELVKKLSPRANDPLQNRPFFEATRRRVGAGKIREEKQKQVTASE 295


50  ECs1885  ECs1878  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
ECs1885  2  11  -0.832222  thiosulfate:cyanide sulfurtransferase
ECs1884  2  13  1.581016  peripheral inner membrane phage-shock protein
ECs1883  2  19  3.052845  PspC family transcriptional regulator
ECs1882  2  21  4.051841  phage shock protein B
ECs1881  1  20  3.916107  phage shock protein PspA
ECs1880  0  18  4.226422  phage shock protein operon transcriptional
ECs1879  -1  18  4.146858  4-aminobutyrate transaminase
ECs1878  -3  16  3.248642  oxidoreductase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs1882  MPTASEINHBTR  25  0.030  Metalloprotease inhibitor signature.
		>MPTASEINHBTR#Metalloprotease inhibitor signature.

Length = 122

Score = 24.6 bits (53), Expect = 0.030
Identities = 7/43 (16%), Positives = 17/43 (39%)

Query: 30 SGRSELSQSEQQRLAQLADEAKRMRERIQALESILDAEHPNWR 72
+G+ + + A A++A + + E L + +W
Sbjct: 37 AGQLGIEATGSGVCAGPAEQANALAGDVACAEQWLGDKPVSWS 79


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs1880  HTHFIS  342  e-118  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 342 bits (880), Expect = e-118
Identities = 126/341 (36%), Positives = 182/341 (53%), Gaps = 23/341 (6%)

Query: 6 DNLLGEANSFLEVLEQVSHLAPLDKPVLIIGERGTGKELIASRLHYLSSRWQGPFISLNC 65
L+G + + E+ ++ L D ++I GE GTGKEL+A LH R GPF+++N
Sbjct: 137 MPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALHDYGKRRNGPFVAINM 196

Query: 66 AALNENLLDSELFGHEAGAFTGAQKRHPGRFERADGGTLFLDELATAPMMVQEKLLRVIE 125
AA+ +L++SELFGHE GAFTGAQ R GRFE+A+GGTLFLDE+ PM Q +LLRV++
Sbjct: 197 AAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDMPMDAQTRLLRVLQ 256

Query: 126 YGELERVGGSQPLQVNVRLVCATNADLPAMVNEGTFRADLLDRLAFDVVQLPPLRERESD 185
GE VGG P++ +VR+V ATN DL +N+G FR DL RL ++LPPLR+R D
Sbjct: 257 QGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRLNVVPLRLPPLRDRAED 316

Query: 186 IMLMAEHFAIQMCREIKLPLFPGFTERARETLLNYRWPGNIRELKNVVERSVYRHGTSDY 245
I + HF Q +E F + A E + + WPGN+REL+N+V R +
Sbjct: 317 IPDLVRHFVQQAEKEGLDVK--RFDQEALELMKAHPWPGNVRELENLVRRLTALYPQDVI 374

Query: 246 PLDDIIID---PFKRRPPEDAIAVSETTSLPTLPLD------------------LREFQM 284
+ I + P E A A S + S+ +
Sbjct: 375 TREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFASFGDALPPSGLYDRVLA 434

Query: 285 QQEKELLQLSLQQGKYNQKRAAELLGLTYHQFRALLKKHQI 325
+ E L+ +L + NQ +AA+LLGL + R +++ +
Sbjct: 435 EMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGV 475


51  ECs1833  ECs1733  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
ECs1833  0  15  -3.073361  tryptophan synthase subunit beta
ECs1832  0  27  -6.746831  tryptophan synthase subunit alpha
ECs1831  2  40  -10.550925  hypothetical protein
ECs1830  4  39  -9.745925  hypothetical protein
ECs1829  3  35  -7.853940  hypothetical protein
ECs1825  1  23  -1.459898  bfpT-regulated chaperone-like protein
ECs1824  2  24  -1.737751  hypothetical protein
ECs1821  3  27  -3.047123  hypothetical protein
ECs1820  3  27  -2.218727  hypothetical protein
ECs1819  5  34  -4.206037  hypothetical protein
ECs1818  5  36  -4.752135  hypothetical protein
ECs1815  7  44  -4.978123  hypothetical protein
ECs1814  5  36  -2.983988  hypothetical protein
ECs1813  4  23  2.261460  integrase
ECs1812  4  22  2.887197  hypothetical protein
ECs1809  5  28  5.613400  hypothetical protein
ECs1808  4  29  5.834674  tail fiber protein
ECs1807  4  29  5.569322  outer membrane protein
ECs1806  5  31  5.778600  host specificity protein
ECs1805  3  32  4.973405  minor tail protein
ECs1804  4  32  5.194519  minor tail protein
ECs1803  3  31  5.293454  tail length tape measure protein
ECs1802  5  31  5.082345  hypothetical protein
ECs1801  6  30  4.728306  tail assembly chaperone
ECs1800  5  28  4.686909  major tail subunit
ECs1799  6  30  6.438042  hypothetical protein
ECs1798  6  29  6.301822  hypothetical protein
ECs1797  5  29  6.035081  head-tail adaptor
ECs1796  5  28  5.918744  hypothetical protein
ECs1795  5  28  5.886477  portal protein
ECs1793  5  27  5.829231  major head protein/prohead protease
ECs1792  4  25  4.212595  terminase large subunit
ECs1791  1  24  0.648412  terminase small subunit
ECs1789  3  23  0.759853  Dnase
ECs5431  1  22  -0.578291  hypothetical protein
ECs1788  0  24  -0.691804  hypothetical protein
ECs1786  3  22  1.753755  endopeptidase
ECs1785  2  25  1.522532  antirepressor protein
ECs1784  3  28  1.993464  endolysin
ECs1783  3  27  0.851743  hypothetical protein
ECs1782  3  27  1.015887  holin
ECs1781  2  23  1.313447  hypothetical protein
ECs1780  1  22  0.203973  **DNA methylase
ECs1779  -1  24  -1.322902  lipoprotein
ECs1778  -1  25  -0.937945  hypothetical protein
ECs1777  -1  23  -1.884845  endonuclease
ECs1776  0  21  -0.341252  hypothetical protein
ECs1775  2  26  -1.675885  hypothetical protein
ECs1774  1  24  -2.375658  hypothetical protein
ECs1773  0  25  -1.659245  prophage maintenance protein
ECs1772  2  26  -2.246870  colonization factor
ECs1769  3  27  -1.284853  phage replication protein
ECs1768  2  28  -3.136784  hypothetical protein
ECs1767  2  32  -5.734625  hypothetical protein
ECs1766  4  35  -7.571334  DNA-binding transcriptional regulator DicC
ECs1765  3  36  -9.323971  transcriptional repressor DicA
ECs5428  1  38  -9.945698  hypothetical protein
ECs1764  1  24  -5.037359  hypothetical protein
ECs1763  1  26  -4.956756  hypothetical protein
ECs1762  1  23  -4.409509  hypothetical protein
ECs1761  1  21  -3.594708  DicB
ECs1760  0  18  -3.017556  hypothetical protein
ECs1759  0  16  -3.038729  exonuclease
ECs1758  -1  16  -3.643710  excisionase
ECs1757  0  16  -2.567388  integrase
ECs1756  1  15  -1.659765  outer membrane protein W
ECs1755  -2  19  -4.022129  hypothetical protein
ECs1754  -2  22  -4.320040  intracellular septation protein A
ECs1753  -1  22  -2.777036  acyl-CoA thioesterase
ECs1752  -1  20  -2.898528  transporter
ECs1751  -2  18  -3.177605  hypothetical protein
ECs1750  -1  15  -2.588892  voltage-gated potassium channel
ECs5426  -1  11  -0.285877  hypothetical protein
ECs1749  -1  12  -0.935570  cardiolipin synthetase
ECs1748  -2  12  -1.911526  dsDNA-mimic protein
ECs1747  -1  13  -1.946634  hypothetical protein
ECs1746  -2  23  -1.798636  oligopeptide ABC transporter ATP-binding
ECs1745  -1  24  -2.657756  hypothetical protein
ECs1744  -1  27  -3.200732  oligopeptide transporter permease
ECs1743  -1  28  -3.145442  oligopeptide transport periplasmic binding
ECs1742  -3  25  -2.990546  hypothetical protein
ECs1741  -3  24  -2.694060  bifunctional acetaldehyde-CoA/alcohol
ECs1740  -1  25  -4.156834  thymidine kinase
ECs1739  -2  20  -3.411806  global DNA-binding transcriptional dual
ECs1738  -2  20  -2.724412  UTP-glucose-1-phosphate uridylyltransferase
ECs1737  -3  15  -1.300781  response regulator of RpoS
ECs1736  -1  11  0.067889  hypothetical protein
ECs1735  -1  17  1.107096  hypothetical protein
ECs1734  -1  25  2.533108  formyltetrahydrofolate deformylase
ECs1733  -1  26  3.055450  **hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs1808  CHANLCOLICIN  32  0.007  Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 31.6 bits (71), Expect = 0.007
Identities = 33/170 (19%), Positives = 63/170 (37%), Gaps = 34/170 (20%)

Query: 101 EALRRFELMVEEAARHAEEAKKNAGEAETSARNAGISASQAEENAANADTSAGDASESAR 160
EA + + E A+ E A+K A+ + + E N+ S+ S AR
Sbjct: 175 EAEEKRLAALSEEAKAVEIAQKKLSAAQ-----SEVVKMDGEIKTLNSRLSS---SIHAR 226

Query: 161 QAAESAAAAKQSEEASSSSA--------SAAAQKASESLQS----------------ATD 196
A A K++E A +S+ + +A++ LQ+ +
Sbjct: 227 DAEMKTLAGKRNELAQASAKYKELDELVKKLSPRANDPLQNRPFFEATRRRVGAGKIREE 286

Query: 197 AELSKKTAESAAGNAARDATTAAEKARESAESAQSAEQSRIA-AEEAVNR 245
+ +E+ N T +KA + ++A +R+ AEE + +
Sbjct: 287 KQKQVTASETRI-NRINADITQIQKAISQVSNNRNAGIARVHEAEENLKK 335



Score = 30.8 bits (69), Expect = 0.012
Identities = 35/149 (23%), Positives = 52/149 (34%), Gaps = 10/149 (6%)

Query: 101 EALRRFELMVEEAARHAEEAKKNAGEAETSARNAGISASQAEENAANADTSAGDASESAR 160
E R+ E+A + AE+ +K E + + ++AEE A + A E
Sbjct: 137 EKARKEAEAAEKAFQEAEQRRKEIER-EKAETERQLKLAEAEEKRLAALSEEAKAVE--- 192

Query: 161 QAAESAAAAKQSEEASSSSASAAAQKASESLQSATDAE---LSKKTAESAAGNAARDATT 217
A+ +A QSE S A DAE L+ K E A +A
Sbjct: 193 -IAQKKLSAAQSEVVKMDGEIKTLNSRLSSSIHARDAEMKTLAGKRNELAQASAKYKELD 251

Query: 218 AAEKARESAESAQSAEQSRIAAEEAVNRI 246
K + A Q+R E R+
Sbjct: 252 ELVKK--LSPRANDPLQNRPFFEATRRRV 278


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
ECs1807	ENTEROVIROMP	137	1e-43	Enterobacterial virulence outer membrane protein si...
		>ENTEROVIROMP#Enterobacterial virulence outer membrane protein signature.

Length = 171

Score = 137 bits (347), Expect = 1e-43
Identities = 62/195 (31%), Positives = 98/195 (50%), Gaps = 29/195 (14%)

Query: 7 VILSAVVWQVAAATPASAAEHQSTLSAGYLHASTNVPG-SDDLNGINVKYRYEFMDA-LG 64
+ + + V A T ++ ST++ GY A ++ G + + G N+KYRYE ++ LG
Sbjct: 4 IACLSALAAVLAFTAGTSVAATSTVTGGY--AQSDAQGQMNKMGGFNLKYRYEEDNSPLG 61

Query: 65 LITSFSYANAEDEQKTRYSDTRWHEDSVRNRWFSVMAGPSVRVNEWFSAYAMAGVAYSRV 124
+I SF+Y T S T D +N+++ + AGP+ R+N+W S Y + GV Y +
Sbjct: 62 VIGSFTY--------TEKSRTASSGDYNKNQYYGITAGPAYRINDWASIYGVVGVGYGKF 113

Query: 125 STFYGDYLRVTDNKGKTHDVLTGSDDGRHSNTSLAWGAGVQFNPTESVTIDLAYEGSGSG 184
T T+ HD S+ ++GAG+QFNP E+V +D +YE S
Sbjct: 114 QT--------TEYPTYKHD---------TSDYGFSYGAGLQFNPMENVALDFSYEQSRIR 156

Query: 185 DWRSDAFIVGIGYRF 199
+I G+GYRF
Sbjct: 157 SVDVGTWIAGVGYRF 171


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
ECs1803	CHANLCOLICIN	32	0.015	Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 32.0 bits (72), Expect = 0.015
Identities = 45/240 (18%), Positives = 93/240 (38%), Gaps = 24/240 (10%)

Query: 356 RAQQAVAAARGTEMQIAAEARLAATQERLN-------RNIAARSAAQNALNSTTAVGSRL 408
+A+QA A E Q A+A A +RL R+ A+R+ + L +
Sbjct: 66 QAEQAARAKAAAEAQAKAKANRDALTQRLKDIVNEALRHNASRTPSATELAHANNAAMQA 125

Query: 409 MSGALGLVGGVPGLVMLGAAAWYTLYQNQEQARESARQYALTIDEIAHKTPSMSLPEASD 468
L L AA + +++ +E R+ A T ++ + ++
Sbjct: 126 EDERLRLAKAEEKARKEAEAAEKAFQEAEQRRKEIEREKAETERQL----------KLAE 175

Query: 469 NEGRTRAALTEQNRLID-------EQASRVKSLQEKAQSIQDVLAGLEDRRVALIRQQAA 521
E + AAL+E+ + ++ S V + + +++ L+ R A ++ A
Sbjct: 176 AEEKRLAALSEEAKAVEIAQKKLSAAQSEVVKMDGEIKTLNSRLSSSIHARDAEMKTLAG 235

Query: 522 EQNKVYQSMLVMNGQHTEFNRLLGLGNELLQQRQGLVNVPLRLPQATLDDKQQSALTKTE 581
++N++ Q+ +L N+ LQ R R+ + +++Q +T +E
Sbjct: 236 KRNELAQASAKYKELDELVKKLSPRANDPLQNRPFFEATRRRVGAGKIREEKQKQVTASE 295


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
ECs1773	HOKGEFTOXIC	59	3e-16	Hok/Gef cell toxic protein family signature.
		>HOKGEFTOXIC#Hok/Gef cell toxic protein family signature.

Length = 52

Score = 59.5 bits (144), Expect = 3e-16
Identities = 18/46 (39%), Positives = 31/46 (67%)

Query: 23 QKAMLIALIVICLIVIVTALVTRKDLCEVRIRTGQTEVAVFTAYEP 68
+ +++ ++++CL +++ +TRK LCE+R R G EVA F AYE
Sbjct: 5 RSSLVWCVLIVCLTLLIFTYLTRKSLCEIRYRDGYREVAAFMAYES 50


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
ECs1752	TONBPROTEIN	256	1e-88	Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 256 bits (655), Expect = 1e-88
Identities = 236/239 (98%), Positives = 236/239 (98%)

Query: 1 MTLDLPRRFPWPTLLSVCIHGAVVAGLLYTSVHQVIELPAPAQPISVTMVAPADLEPPQA 60
MTLDLPRRFPWPTLLSVCIHGAVVAGLLYTSVHQVIELPAPAQPISVTMV PADLEPPQA
Sbjct: 1 MTLDLPRRFPWPTLLSVCIHGAVVAGLLYTSVHQVIELPAPAQPISVTMVTPADLEPPQA 60

Query: 61 VQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQQKRDVKPVESR 120
VQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQ KRDVKPVESR
Sbjct: 61 VQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVESR 120

Query: 121 PASPFENTAPARPTSSTATAATSKPVTSVASGPRALSRNQPQYPARAQALRIEGQVKVKF 180
PASPFENTAPAR TSSTATAATSKPVTSVASGPRALSRNQPQYPARAQALRIEGQVKVKF
Sbjct: 121 PASPFENTAPARLTSSTATAATSKPVTSVASGPRALSRNQPQYPARAQALRIEGQVKVKF 180

Query: 181 DVTPDGRVDNVQILSAKPANMFEREVKNAMRRWRYEPGKPGSGIVVNILFKINGTTEIQ 239
DVTPDGRVDNVQILSAKPANMFEREVKNAMRRWRYEPGKPGSGIVVNILFKINGTTEIQ
Sbjct: 181 DVTPDGRVDNVQILSAKPANMFEREVKNAMRRWRYEPGKPGSGIVVNILFKINGTTEIQ 239


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
ECs1751	adhesinmafb	31	4e-04	Neisseria meningitidis: adhesin MafB signature.
		>adhesinmafb#Neisseria meningitidis: adhesin MafB signature.

Length = 467

Score = 31.2 bits (70), Expect = 4e-04
Identities = 16/57 (28%), Positives = 20/57 (35%), Gaps = 2/57 (3%)

Query: 41 GPMPAVDSNDPGAAGFTGSTVIAEFESLEAAQAWADADPYVAAGVYEHVSVKPFKKV 97
P+PA G GS E + EA W +P A V +V KV
Sbjct: 268 APLPA--EGKFAVIGGLGSVAGFEKNTREAVDRWIQENPNAAETVEAVFNVAAAAKV 322


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
ECs1747	HTHFIS	31	0.008	FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 30.6 bits (69), Expect = 0.008
Identities = 9/16 (56%), Positives = 11/16 (68%)

Query: 55 VVGESGCGKSTFARAI 70
+ GESG GK ARA+
Sbjct: 165 ITGESGTGKELVARAL 180


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
ECs1737	HTHFIS	90	7e-22	FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 89.5 bits (222), Expect = 7e-22
Identities = 40/152 (26%), Positives = 64/152 (42%), Gaps = 3/152 (1%)

Query: 10 ILIVEDEQVFRSLLDSWFSSLGATTVLAADGVDALELLGGFTPDLMICDIAMPRMNGLKL 69
IL+ +D+ R++L+ S G + ++ + DL++ D+ MP N L
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 70 LEHIRNRGDQTPVLVISATENMADIAKALRLGVEDVLLKPVKDLNRLREMVFACLYPSMF 129
L I+ PVLV+SA KA G D L KP DL L ++ L +
Sbjct: 66 LPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPF-DLTELIGIIGRAL--AEP 122

Query: 130 NSRVEEEERLFRDWDAMVDNPAAAAKLLQELQ 161
R + E +D +V AA ++ + L
Sbjct: 123 KRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLA 154


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
ECs1735	SECA	57	2e-12	SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 57.2 bits (138), Expect = 2e-12
Identities = 16/28 (57%), Positives = 20/28 (71%)

Query: 125 IDGTRPQFGRNDPCPCGSGKKFKKCCGQ 152
+ GRNDPCPCGSGKK+K+C G+
Sbjct: 872 AQTGERKVGRNDPCPCGSGKKYKQCHGR 899


52	ECs1680	ECs1611	Y	Y	Y	Pathogenicity Island (biased-composition)
LocusTag	DNBias	CDNBias	%GCBias	Product
ECs1680	-3	16	-3.046439	disulfide bond formation protein B
ECs1679	-3	18	-3.890299	DNA polymerase V subunit UmuC
ECs1678	-3	19	-5.386287	DNA polymerase V subunit UmuD
ECs1677	-1	18	-6.128021	hemolysin E
ECs1676	-3	19	-4.994980	hypothetical protein
ECs1675	-1	22	-3.971258	hypothetical protein
ECs1674	-1	21	-4.509655	hypothetical protein
ECs1673	-1	21	-4.555998	hypothetical protein
ECs1672	-1	18	-3.582059	hypothetical protein
ECs1671	-1	16	-1.658155	hypothetical protein
ECs1670	-2	15	-0.522865	septum formation inhibitor
ECs1669	0	17	-1.589049	cell division inhibitor MinD
ECs1668	2	21	-3.213575	cell division topological specificity factor
ECs1667	3	27	-4.808323	hypothetical protein
ECs1666	2	23	-2.241573	transposase
ECs1665	2	23	-2.389349	transposase
ECs1664	3	25	-2.514027	hypothetical protein
ECs1663	3	22	-1.987809	outer membrane protease
ECs1662	1	22	-1.464366	hypothetical protein
ECs1660	1	21	2.186628	hypothetical protein
ECs1659	3	29	-5.035430	hypothetical protein
ECs1658	6	34	-8.457928	hypothetical protein
ECs1657	6	42	-9.733824	hypothetical protein
ECs1656	5	42	-9.235229	hypothetical protein
ECs1655	2	29	-2.737980	hypothetical protein
ECs1654	2	26	-2.019978	hypothetical protein
ECs1653	0	18	1.137672	hypothetical protein
ECs1652	1	19	2.372870	catalase
ECs1651	1	22	3.925725	tail fiber assembly protein
ECs1650	1	21	4.479241	tail fiber protein
ECs1649	3	25	5.214304	membrane protein
ECs1648	2	26	5.765265	host specificity protein
ECs1647	2	26	6.458341	tail assembly protein
ECs1646	2	27	5.950269	tail assembly protein
ECs1645	2	27	5.829138	minor tail protein
ECs1644	2	27	5.811620	minor tail protein
ECs1643	1	26	5.537099	tail length tape measure protein
ECs1642	2	25	4.571057	minor tail protein
ECs1641	3	25	4.648433	minor tail protein
ECs1640	2	23	4.921734	major tail protein V
ECs1639	1	26	4.817795	minor tail protein
ECs1638	2	25	6.445462	minor tail protein
ECs1637	4	25	6.936855	minor capsid protein
ECs1636	4	25	6.816434	DNA packaging protein
ECs1635	1	23	5.651749	major capsid protein
ECs1634	1	23	5.382510	major capsid protein
ECs1633	1	22	4.756176	minor capsid protein
ECs1632	1	21	2.786130	portal protein
ECs1631	1	19	-1.757976	head-to-tail joining protein
ECs1630	0	19	-2.542332	terminase large subunit
ECs1629	2	29	-6.767646	terminase small subunit
ECs1628	4	33	-7.943564	hypothetical protein
ECs1627	2	29	-6.461746	hypothetical protein
ECs1626	2	28	-5.417081	hypothetical protein
ECs1625	4	28	-2.717397	Bor protein
ECs1623	5	29	-2.107141	endopeptidase
ECs1624	2	28	-1.690831	lipoprotein Rz1
ECs1622	1	24	-1.951808	endolysin
ECs1621	1	27	-2.151376	holin
ECs1620	2	25	-1.786835	antitermination protein
ECs5421	3	35	-8.613254	hypothetical protein
ECs1619	3	37	-8.951380	endodeoxyribonuclease RUS
ECs1618	3	42	-9.515611	hypothetical protein
ECs1617	1	39	-8.145363	prophage protein NinE
ECs1616	1	35	-7.771177	hypothetical protein
ECs1615	1	38	-8.354747	hypothetical protein
ECs1614	0	29	-4.989413	multidrug efflux protein
ECs1613	0	22	-3.606234	Ren protein
ECs1612	-1	22	-3.254995	replication protein P
ECs1611	-1	21	-3.758396	replication protein
ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
ECs1667	PRTACTNFAMLY	126	3e-35	Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 126 bits (317), Expect = 3e-35
Identities = 75/235 (31%), Positives = 111/235 (47%), Gaps = 2/235 (0%)

Query: 1 MGIDSRNDIPEGIATLGAFMGYSHSHIGFDRGGHGSVDSYSLGGYASWEHESGFYLDGVV 60
+G D + G LG GY+ GF G G DS +GGYA++ +SGFYLD +
Sbjct: 677 LGADHAVAVAGGRWHLGGLAGYTRGDRGFTGDGGGHTDSVHVGGYATYIADSGFYLDATL 736

Query: 61 KLNRFESNVAGKMSSGGAANGSYHSNGLGGHIETGMRFT-DGNWNLTPYASLTGFTADNP 119
+ +R E++ S G A G Y ++G+G +E G RFT W L P A L F A
Sbjct: 737 RASRLENDFKVAGSDGYAVKGKYRTHGVGASLEAGRRFTHADGWFLEPQAELAVFRAGGG 796

Query: 120 EYHLSNGMESKSVDTRSIYRELGATLSYNMRLGNGMEVEPWLKAAVRKEFVDDNRVKVNS 179
Y +NG+ + S+ LG + + L G +V+P++KA+V +EF V N
Sbjct: 797 AYRAANGLRVRDEGGSSVLGRLGLEVGKRIELAGGRQVQPYIKASVLQEFDGAGTVHTNG 856

Query: 180 DGNFVNDLSGRRGIYQAGIKASFSSTLSGHLGVGYSNGAGMESPWNAVAGVNWSF 234
+ +L G R G+ A+ S + YS G + PW AG +S+
Sbjct: 857 IAH-RTELRGTRAELGLGMAAALGRGHSLYASYEYSKGPKLAMPWTFHAGYRYSW 910


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
ECs1664	HTHTETR	28	0.022	TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 28.1 bits (62), Expect = 0.022
Identities = 9/41 (21%), Positives = 21/41 (51%), Gaps = 2/41 (4%)

Query: 3 KRAKNQIVDSDIARLLLKLRKSRNLTVTELAQRSGVSQAMI 43
+ + I+D A L + + ++ E+A+ +GV++ I
Sbjct: 10 QETRQHILDV--ALRLFSQQGVSSTSLGEIAKAAGVTRGAI 48


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
ECs1663	OMPTIN	527	0.0	Omptin serine protease signature.
		>OMPTIN#Omptin serine protease signature.

Length = 317

Score = 527 bits (1358), Expect = 0.0
Identities = 313/317 (98%), Positives = 316/317 (99%)

Query: 1 MRAKLLGIVLTTPIAISSFASTETLSFTPDNINADISLGTLSGKTKERVYLAEEGGRKVS 60
MRAKLLGIVLTTPIAISSFASTETLSFTPDNINADISLGTLSGKTKERVYLAEEGGRKVS
Sbjct: 1 MRAKLLGIVLTTPIAISSFASTETLSFTPDNINADISLGTLSGKTKERVYLAEEGGRKVS 60

Query: 61 QLDWKFNNAAIIKGAINWDLMPQISIGAAGWTTLGSRGGNMVDQDWMDSSNPGTWTDESR 120
QLDWKFNNAAIIKGAINWDLMPQISIGAAGWTTLGSRGGNMVDQDWMDSSNPGTWTDESR
Sbjct: 61 QLDWKFNNAAIIKGAINWDLMPQISIGAAGWTTLGSRGGNMVDQDWMDSSNPGTWTDESR 120

Query: 121 HPDTQLNYANEFDLNIKGWLLNEPNYRLGLMAGYQESRYSFTARGGSYIYSSEEGFRDDI 180
HPDTQLNYANEFDLNIKGWLLNEPNYRLGLMAGYQESRYSFTARGGSYIYSSEEGFRDDI
Sbjct: 121 HPDTQLNYANEFDLNIKGWLLNEPNYRLGLMAGYQESRYSFTARGGSYIYSSEEGFRDDI 180

Query: 181 GSFPNGERAIGYKQRFKMPYIGLTGSYRYEDFELGGTFKYSGWVEASDNDEHYDPGKRIT 240
GSFPNGERAIGYKQRFKMPYIGLTGSYRYEDFELGGTFKYSGWVE+SDNDEHYDPGKRIT
Sbjct: 181 GSFPNGERAIGYKQRFKMPYIGLTGSYRYEDFELGGTFKYSGWVESSDNDEHYDPGKRIT 240

Query: 241 YRSKVKDQNYYSVSVNAGYYVTPNAKVYVEGTWNRVTNKKGNTSLYDHNDNTSDYSKNGA 300
YRSKVKDQNYYSV+VNAGYYVTPNAKVYVEG WNRVTNKKGNTSLYDHN+NTSDYSKNGA
Sbjct: 241 YRSKVKDQNYYSVAVNAGYYVTPNAKVYVEGAWNRVTNKKGNTSLYDHNNNTSDYSKNGA 300

Query: 301 GIENYNFITTAGLKYTF 317
GIENYNFITTAGLKYTF
Sbjct: 301 GIENYNFITTAGLKYTF 317


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
ECs1650	CHANLCOLICIN	44	2e-06	Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 44.3 bits (104), Expect = 2e-06
Identities = 54/319 (16%), Positives = 118/319 (36%)

Query: 152 ARAASTSAGQAASSAQSASSSAGTASTKATEASKSAAAAESSKSAAATSAGAAKTSETNA 211
+ S S AA A + S+A T+A +A+++ AAAE+ A A + +
Sbjct: 39 GKGGSKSESSAAIHATAKWSTAQLKKTQAEQAARAKAAAEAQAKAKANRDALTQRLKDIV 98

Query: 212 AVSQQSAATSASTATTKASEAASSARDASASKEAAKSSETSAASSASSAASSATAAGNSA 271
+ + A+ +AT A ++ + AK+ E + + ++ + A
Sbjct: 99 NEALRHNASRTPSATELAHANNAAMQAEDERLRLAKAEEKARKEAEAAEKAFQEAEQRRK 158

Query: 272 KAAKTSETNAKSSETAAEQSASAAAGSKTAAALSASAASTSAGQASASATAAGKSAESAA 331
+ + + + A + AA S+ A A+ + SA Q+ ++
Sbjct: 159 EIEREKAETERQLKLAEAEEKRLAALSEEAKAVEIAQKKLSAAQSEVVKMDGEIKTLNSR 218

Query: 332 SSASTATTKAGEATEQASAAASSASAAKTSETNAKASETSAESSKTAAASSASSAASSAS 391
S+S A T + ++AK E + + S ++ A
Sbjct: 219 LSSSIHARDAEMKTLAGKRNELAQASAKYKELDELVKKLSPRANDPLQNRPFFEATRRRV 278

Query: 392 SASASKDEATRQASAAKSSATTASTKATEAAGSATAAAQSKSTAESAATRAETAAKRAED 451
A ++E +Q +A+++ + T+ + + + +++ + AE K+A++
Sbjct: 279 GAGKIREEKQKQVTASETRINRINADITQIQKAISQVSNNRNAGIARVHEAEENLKKAQN 338

Query: 452 IASAVALEDASTTKKGIVQ 470
++DA Q
Sbjct: 339 NLLNSQIKDAVDATVSFYQ 357



Score = 31.6 bits (71), Expect = 0.020
Identities = 58/332 (17%), Positives = 111/332 (33%), Gaps = 22/332 (6%)

Query: 313 AGQASASATAAGKSAESAASSA----STATTKAGEATEQASAAASSASAAKTSETNAKAS 368
+G KS SAA A STA K +A + A A A++ + AK +
Sbjct: 32 SGSGGGGGKGGSKSESSAAIHATAKWSTAQLKKTQAEQAARAKAAAEAQAKAKANRDALT 91

Query: 369 E---------TSAESSKTAAASSASSAASSASSASASKDEATRQASAAKSSATTASTKAT 419
+ +S+T +A+ + A ++A A + + A+ A A
Sbjct: 92 QRLKDIVNEALRHNASRTPSATELAHANNAAMQAEDERLRLAKAEEKARKEAEAAEKAFQ 151

Query: 420 EAAGSATAAAQSKSTAESAATRAETAAKRAEDIASA-----VALEDASTTKKGIVQLSSA 474
EA + K+ E AE KR ++ +A + S + +V++
Sbjct: 152 EAEQRRKEIEREKAETERQLKLAEAEEKRLAALSEEAKAVEIAQKKLSAAQSEVVKMDGE 211

Query: 475 TNSTSESLAATPKAVKAAYELANGKYTAQDATTAQKGIVQLSNATNSTSEMLAATPKSVK 534
+ + L+++ A A + GK +A+ + + A P +
Sbjct: 212 IKTLNSRLSSSIHARDAEMKTLAGKRNELAQASAK---YKELDELVKKLSPRANDPLQNR 268

Query: 535 AAYDLANGKYTAQDAT-TAQKGIVQLSSATNSASETLAATPKAVKAANDNANGRVPSARK 593
++ + A QK + + N + + KA+ ++N N + +
Sbjct: 269 PFFEATRRRVGAGKIREEKQKQVTASETRINRINADITQIQKAISQVSNNRNAGIARVHE 328

Query: 594 VNGKALSSDITLTPKDIGTLNSTTMSFSGGAG 625
+ L I T+SF
Sbjct: 329 AEENLKKAQNNLLNSQIKDAVDATVSFYQTLT 360


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
ECs1649	ENTEROVIROMP	139	2e-44	Enterobacterial virulence outer membrane protein si...
		>ENTEROVIROMP#Enterobacterial virulence outer membrane protein signature.

Length = 171

Score = 139 bits (352), Expect = 2e-44
Identities = 64/200 (32%), Positives = 102/200 (51%), Gaps = 30/200 (15%)

Query: 1 MRKLYAAILSAAICLAVSGAPAWASEHQSTLSAGYLHARTNVPGSDDLNGINVKYRYEFT 60
M+K+ A + + A LA + + A+ ST++ GY + + + G N+KYRYE
Sbjct: 1 MKKI-ACLSALAAVLAFTAGTSVAA--TSTVTGGYAQSDAQGQMNK-MGGFNLKYRYEED 56

Query: 61 DT-LGLVTSFSYAGDKNRQLTRYSDTRWHEDSVRNRWFSVMAGPSVRVNEWFSAYAMAGV 119
++ LG++ SF+Y T S T D +N+++ + AGP+ R+N+W S Y + GV
Sbjct: 57 NSPLGVIGSFTY--------TEKSRTASSGDYNKNQYYGITAGPAYRINDWASIYGVVGV 108

Query: 120 AYSRVSTFSGDYLRVTDNKGKTHDVLTGSDDGRHSNTSLAWGAGVQFNPTESVAIDIAYE 179
Y + T T+ HD S+ ++GAG+QFNP E+VA+D +YE
Sbjct: 109 GYGKFQT--------TEYPTYKHD---------TSDYGFSYGAGLQFNPMENVALDFSYE 151

Query: 180 GSGSGDWRTDGFIVGVGYKF 199
S +I GVGY+F
Sbjct: 152 QSRIRSVDVGTWIAGVGYRF 171


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
ECs1647	PF06291	28	0.015	Lambda prophage Bor protein
		>PF06291#Lambda prophage Bor protein

Length = 102

Score = 27.7 bits (61), Expect = 0.015
Identities = 13/40 (32%), Positives = 19/40 (47%), Gaps = 5/40 (12%)

Query: 135 MTGILFSLGASMVLGGVAQML-----APKARTPRTQTTDN 169
M +LFS +M++ G AQ P A TP+ T +
Sbjct: 6 MKKMLFSAALAMLITGCAQQTFTVGNKPTAVTPKETITHH 45


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
ECs1643	GPOSANCHOR	33	0.007	Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 32.7 bits (74), Expect = 0.007
Identities = 45/277 (16%), Positives = 84/277 (30%), Gaps = 21/277 (7%)

Query: 238 LTAMARQFHNVTAEQIAYVAQLQRSGEEAGALQAANEAATKGFDDQTRRLKENMGTLETW 297
+A + A A A L+++ E A A+ A K + + L+ LE
Sbjct: 139 DSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKA 198

Query: 298 ADRTARAFKSMWDAVLDIGRP-DTAQEMLIKAEAAFKKADDIWNLRKDDYFVNDEARARY 356
+ + + + E A + A + + +A
Sbjct: 199 LEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAAL 258

Query: 357 WDDR---EKARLALEAARKKAEQQSQQDKNAQQQSDTEASRLKYTEEA-----QKAYERL 408
+ EKA + + + + + E + L++ + Q L
Sbjct: 259 EARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDL 318

Query: 409 QTPLEKYTARQEELNKALKDGKI-------LQADYNTLMAAAKKDYEATLKKPKQ----S 457
E + E K + KI L+ D + AKK EA +K ++ S
Sbjct: 319 DASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASR-EAKKQLEAEHQKLEEQNKIS 377

Query: 458 GVKVSAGDRQEDSAHAALLTLQAELRMLEKHAGANEK 494
+ R D++ A ++ L A EK
Sbjct: 378 EASRQSLRRDLDASREAKKQVEKALEEANSKLAALEK 414


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
ECs1640	INTIMIN	28	0.029	Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 28.5 bits (63), Expect = 0.029
Identities = 32/202 (15%), Positives = 61/202 (30%), Gaps = 29/202 (14%)

Query: 66 DWAATGQGQKSAGDTSFT----LAWMPGEQGQQALLAWFNEGDTRAYKIRFPNGTVDVFR 121
G G+ + S + + AL + A I +
Sbjct: 611 SANTNGSGKATVTLKSDKPGQVVVSAKTAEMTSAL-------NANAV-IFVDQTKASITE 662

Query: 122 GWVSSIGKAVTAKEVITRTVKVTNVGRPSMAEDRSTVTAATGMTVTPASTSVVKGQSTTL 181
++ IT TVKV +P ++ + T ++ + T TL
Sbjct: 663 IKADKTTAVANGQDAITYTVKVMKGDKPVSNQEVTFTTTLGKLSNSTEKTDTNGYAKVTL 722

Query: 182 T---------------VAFQPEGATDKSFRAVSADKTKATVSVSGMTITVKG--VAAGKV 224
T VA + + F ++ D + +G+ + + G+V
Sbjct: 723 TSTTPGKSLVSARVSDVAVDVKAPEVEFFTTLTIDDGNIEIVGTGVKGKLPTVWLQYGQV 782

Query: 225 NIPVVSGNGEFAAVAEINVTAS 246
N+ GNG++ + AS
Sbjct: 783 NLKASGGNGKYTWRSANPAIAS 804


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
ECs1625	PF06291	170	4e-59	Lambda prophage Bor protein
		>PF06291#Lambda prophage Bor protein

Length = 102

Score = 170 bits (432), Expect = 4e-59
Identities = 91/97 (93%), Positives = 93/97 (95%)

Query: 1 MKKMLLATALALLITGCAQQTFTVQNKPAAVTPKETITHHFFVSGIGQKKTVDAAKICGG 60
MKKML + ALA+LITGCAQQTFTV NKP AVTPKETITHHFFVSGIGQKKTVDAAKICGG
Sbjct: 6 MKKMLFSAALAMLITGCAQQTFTVGNKPTAVTPKETITHHFFVSGIGQKKTVDAAKICGG 65

Query: 61 AENVVKTETQQTFVNGLLGFITLGIYTPLEARVYCSQ 97
AENVVKTETQQTFVNGLLGFITLGIYTPLEARVYCSQ
Sbjct: 66 AENVVKTETQQTFVNGLLGFITLGIYTPLEARVYCSQ 102


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
ECs1612	FLGMOTORFLIG	28	0.040	Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 27.8 bits (62), Expect = 0.040
Identities = 17/77 (22%), Positives = 27/77 (35%), Gaps = 11/77 (14%)

Query: 2 KNIAAQMVNFDREQM-----------RRIANNMPEQYDEKPQVQQVAQIINGVFSQLLAT 50
N+A ++ DR +++A+ E Y V V +IIN +
Sbjct: 165 TNVARRIALMDRTSPEVVREVERVLEKKLASLSSEDYTSAGGVDNVVEIINMADRKTEKF 224

Query: 51 FPASLANRDQNELNEIR 67
SL D EI+
Sbjct: 225 IIESLEEEDPELAEEIK 241


53	ECs1597	ECs1504	Y	Y	Y	Pathogenicity Island (biased-composition)
LocusTag	DNBias	CDNBias	%GCBias	Product
ECs1597	0	29	3.709820	terminase small subunit
ECs1596	1	27	3.789993	hypothetical protein
ECs1595	-1	24	2.608941	hypothetical protein
ECs1594	-1	24	1.703999	hypothetical protein
ECs1593	-1	24	1.906273	head-tail adaptor
ECs1592	-2	23	2.197449	head portal protein
ECs1591	0	28	0.644772	prohead protease
ECs1590	0	29	-1.316562	major head protein
ECs1589	2	33	-1.963261	hypothetical protein
ECs1588	2	33	-2.412334	transcriptional activator
ECs1587	3	34	-3.910998	single stranded DNA-binding protein
ECs1586	2	36	-7.573644	hypothetical protein
ECs1585	4	37	-7.245883	hypothetical protein
ECs1581	3	36	-5.804747	hypothetical protein
ECs1580	4	39	-6.099067	hypothetical protein
ECs1579	3	30	-2.763358	hypothetical protein
ECs1578	-1	19	-0.924967	hypothetical protein
ECs1577	-3	14	-0.795875	Icd-like protein
ECs1576	-3	13	-0.875834	hypothetical protein
ECs5420	-3	11	-0.436807	hypothetical protein
ECs1574	-3	10	-0.962933	integrase
ECs1573	-2	10	-3.526647	hypothetical protein
ECs1572	-1	14	-5.277395	peptidase T
ECs1571	1	21	-6.076389	putrescine/spermidine ABC transporter ATPase
ECs1570	3	26	-7.609537	spermidine/putrescine ABC transporter membrane
ECs1569	2	30	-6.289089	hypothetical protein
ECs1568	6	43	-13.161644	hypothetical protein
ECs1567	8	51	-14.457501	hypothetical protein
ECs5419	8	49	-13.854406	hypothetical protein
ECs1565	7	46	-12.394599	transposase
ECs1564	6	45	-12.130828	insertion element IS2 transposase InsD
ECs1561	7	48	-13.834977	hypothetical protein
ECs1560	4	33	-7.129900	secreted effector protein
ECs1559	1	24	0.919313	tail assembly protein
ECs1558	2	25	1.375554	tail assembly protein
ECs1557	3	26	0.615097	antirepressor protein
ECs1556	6	33	3.523749	regulatory protein
ECs1555	7	37	5.420427	minor tail protein
ECs1554	6	35	5.253137	minor tail protein
ECs1551	7	32	5.327796	phage tail protein
ECs1550	8	30	4.834523	tail assembly chaperone
ECs1549	5	29	4.401275	major tail subunit
ECs1548	6	30	4.912196	hypothetical protein
ECs1547	5	31	5.863418	hypothetical protein
ECs1546	6	30	5.924476	head-tail adapotor
ECs1545	4	30	5.694087	hypothetical protein
ECs1544	4	29	5.758022	portal protein
ECs5418	5	28	6.438443	hypothetical protein
ECs1543	5	27	5.973106	major head protein/prohead proteinase
ECs1542	4	27	3.710660	large terminase subunit
ECs1541	2	32	1.142666	terminase small subunit
ECs1540	5	31	-0.957877	Dnase
ECs5417	0	26	-1.072399	hypothetical protein
ECs1539	-1	24	-2.999129	hypothetical protein
ECs1538	-1	21	-1.111642	hypothetical protein
ECs1537	0	22	-0.988289	hypothetical protein
ECs1536	2	21	-0.966249	hypothetical protein
ECs1534	2	23	0.040603	endopeptidase
ECs1533	2	25	-1.172646	antirepressor protein
ECs1532	5	29	1.596748	endolysin
ECs1531	5	28	1.063942	hypothetical protein
ECs1530	5	25	1.724973	holin
ECs1529	3	23	0.509056	hypothetical protein
ECs1528	2	23	0.315260	hypothetical protein
ECs1527	2	22	0.920891	hypothetical protein
ECs1525	0	22	0.051444	hypothetical protein
ECs1526	-1	23	0.299654	hypothetical protein
ECs1524	-1	23	-0.397420	antitermination protein
ECs1523	-1	26	-0.068965	crossover junction endodeoxyribonuclease
ECs1522	0	26	-0.376674	hypothetical protein
ECs1521	3	32	-2.729603	hypothetical protein
ECs1520	3	31	-3.700910	prophage maintenance protein
ECs1518	2	30	-3.336955	hypothetical protein
ECs1517	4	28	-2.784165	hypothetical protein
ECs1516	2	23	-0.939981	hypothetical protein
ECs1515	2	25	0.535895	hypothetical protein
ECs1514	2	27	2.035259	hypothetical protein
ECs1513	3	25	2.536600	hypothetical protein
ECs1512	3	25	1.501245	hypothetical protein
ECs1511	3	24	0.116955	hypothetical protein
ECs1510	2	24	-0.285567	replication protein
ECs1509	3	25	-3.216499	hypothetical protein
ECs1508	2	26	-5.760519	hypothetical protein
ECs1507	1	31	-6.910238	hypothetical protein
ECs1506	2	32	-6.510616	phage repressor
ECs1505	-2	24	-4.913626	hypothetical protein
ECs1504	-2	19	-3.998705	hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
ECs1571	PF05272	30	0.017	Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.0 bits (67), Expect = 0.017
Identities = 10/36 (27%), Positives = 19/36 (52%), Gaps = 1/36 (2%)

Query: 46 LTLLGPSGCGKTTVLRLIAGLE-TVDSGRIMLDNED 80
+ L G G GK+T++ + GL+ D+ + +D
Sbjct: 599 VVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGTGKD 634


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
ECs1520	HOKGEFTOXIC	65	2e-18	Hok/Gef cell toxic protein family signature.
		>HOKGEFTOXIC#Hok/Gef cell toxic protein family signature.

Length = 52

Score = 64.8 bits (158), Expect = 2e-18
Identities = 18/46 (39%), Positives = 32/46 (69%)

Query: 23 QKAMLIALIVICITVIVTALVTRKDLCEVRIRTGQTEVAVFTAYEP 68
+ +++ ++++C+T+++ +TRK LCE+R R G EVA F AYE
Sbjct: 5 RSSLVWCVLIVCLTLLIFTYLTRKSLCEIRYRDGYREVAAFMAYES 50


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
ECs1518	FLGMRINGFLIF	32	0.001	Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 32.2 bits (73), Expect = 0.001
Identities = 15/58 (25%), Positives = 27/58 (46%), Gaps = 6/58 (10%)

Query: 22 GSNVVLPAEEAEELARIALASLAAVSDERAAYELFMEKRFG-----ESVDRRRAKNGD 74
+ +PA++ E R+ LA +EL +++FG E V+ +RA G+
Sbjct: 82 SGAIEVPADKVHE-LRLRLAQQGLPKGGAVGFELLDQEKFGISQFSEQVNYQRALEGE 138


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
ECs1504	HTHFIS	27	0.031	FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 27.5 bits (61), Expect = 0.031
Identities = 18/99 (18%), Positives = 38/99 (38%), Gaps = 10/99 (10%)

Query: 17 AELTAAMTAIRETA--QIAKLMNEAKTQAEVNAAIGELNSKLASIQRECVSLVELVGTYQ 74
A+ A + A + K + + + A+ E + + ++ + + LVG
Sbjct: 85 NTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKRRPSKLEDDSQDGMPLVGRSA 144

Query: 75 EINASLKAKIAEFENFEAQTEGYILSQLESGTFVYSKEV 113
+ + +A QT+ ++ ESGT KE+
Sbjct: 145 AMQE-IYRVLARL----MQTDLTLMITGESGT---GKEL 175


54	ECs1467	ECs1450	Y	Y	N	Pathogenicity Island (biased-composition)
LocusTag	DNBias	CDNBias	%GCBias	Product
ECs1467	4	19	1.681649	50S ribosomal protein L32
ECs1466	3	16	1.317835	hypothetical protein
ECs1465	2	14	1.314712	Maf-like protein
ECs1464	1	14	1.748089	23S rRNA pseudouridylate synthase C
ECs1463	0	12	2.037239	hypothetical protein
ECs1462	0	13	2.307826	ribonuclease E
ECs1461	-1	9	1.352107	flagellar hook-associated protein FlgL
ECs1460	-1	13	2.498670	flagellar hook-associated protein FlgK
ECs1459	0	14	2.665398	flagellar rod assembly protein/muramidase FlgJ
ECs1458	2	14	2.619663	flagellar basal body P-ring protein
ECs1457	3	16	2.502625	flagellar basal body L-ring protein
ECs1456	2	16	2.589317	flagellar basal body rod protein FlgG
ECs1455	0	16	2.316563	flagellar basal body rod protein FlgF
ECs1454	1	16	1.019702	flagellar hook protein FlgE
ECs1453	1	19	0.726273	flagellar basal body rod modification protein
ECs1452	0	13	0.674867	flagellar basal body rod protein FlgC
ECs1451	2	16	1.132988	flagellar basal-body rod protein FlgB
ECs1450	2	16	1.000890	flagellar basal body P-ring biosynthesis protein
ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
ECs1462	IGASERPTASE	64	3e-12	IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 64.3 bits (156), Expect = 3e-12
Identities = 47/288 (16%), Positives = 84/288 (29%), Gaps = 36/288 (12%)

Query: 513 PSEEEFAERKRPEQPALATFAMPDVPPAPT-PAEPAAPVVAPAPKAAPATPATPAQPGLL 571
P E+ + DVP P+ E A AP P APATP+
Sbjct: 983 PEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETT----- 1037

Query: 572 SRFFGALKALFSGGEETKPTEQPAPKAEAKPERQQDRRKPRQNNRRDRNERRDTRSER-- 629
ET + Q QN + + + ++
Sbjct: 1038 ---------------ETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQT 1082

Query: 630 TEGSDNREENRRNRRQAQQQTAETREGRQQAEVTEKARTADEQQAPRRERSRRRNDDKRQ 689
E + + E + + ++TA + + TEK + + + + + + Q
Sbjct: 1083 NEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQ 1142

Query: 690 AQ---QEAKALNVEEQSVQETEQEERVRPVQPRRKQRQLNQKVRYEQSV--AEETVVAPV 744
A+ + +N++E Q + +P + + Q V +V V P
Sbjct: 1143 AEPARENDPTVNIKEPQSQTNTTADTEQPA--KETSSNVEQPVTESTTVNTGNSVVENPE 1200

Query: 745 AEETVAAEPIVQEAPA------PRTELVKVPLPVVAQTAPEQQEENNA 786
+P V + R + VP V T A
Sbjct: 1201 NTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVA 1248



Score = 63.5 bits (154), Expect = 4e-12
Identities = 46/261 (17%), Positives = 81/261 (31%), Gaps = 26/261 (9%)

Query: 551 VAPAPKAAPATPATPAQPGLLSRFFGALKALFSGGEETKPTEQP-APKAEAKPERQQDRR 609
P + S E + E P P A A P
Sbjct: 991 TVDTTNITTPNNIQADVPSVPSN----------NEEIARVDEAPVPPPAPATPSETT--- 1037

Query: 610 KPRQNNRRDRNERRDTRSERTEGSDNREENRRNRRQAQQQTAETREGRQQAEV------T 663
N ++++++ D E +NR A++ + + Q EV T
Sbjct: 1038 -----ETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSET 1092

Query: 664 EKARTADEQQAPRRERSRRRNDDKRQAQQEAKALNVEEQSVQETEQEERVRPVQPRRKQR 723
++ +T + ++ E+ + + + Q+ K + + QE + + + R
Sbjct: 1093 KETQTTETKETATVEKEEKAKVETEKTQEVPK-VTSQVSPKQEQSETVQPQAEPARENDP 1151

Query: 724 QLNQKVRYEQSVAEETVVAPVAEETVAAEPIVQEAPAPRTELVKVPLPVVAQTAPEQQEE 783
+N K Q+ P E + E V E+ T V P A Q
Sbjct: 1152 TVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTV 1211

Query: 784 NNADNRDNGGMPRRSRRSPRH 804
N+ + RRS RS H
Sbjct: 1212 NSESSNKPKNRHRRSVRSVPH 1232


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
ECs1461	FLAGELLIN	46	1e-07	Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 45.8 bits (108), Expect = 1e-07
Identities = 41/226 (18%), Positives = 81/226 (35%), Gaps = 9/226 (3%)

Query: 7 MMYQQNMRGITNSQAEWMKYGEQMSTGKRVVNPSDDPIAASQAVVLSQAQAQNSQYTLAR 66
++ Q N+ +S + + E++S+G R+ + DD + A + +Q +
Sbjct: 11 LLTQNNLNKSQSSLSSAI---ERLSSGLRINSAKDDAAGQAIANRFTSNIKGLTQASRNA 67

Query: 67 TFATQKVSLEESVLSQVTTAIQNAQEKIVYASNGTLSDNDRASLATDIQGLRDQLLNLAN 126
E L+++ +Q +E V A+NGT SD+D S+ +IQ +++ ++N
Sbjct: 68 NDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEEIDRVSN 127

Query: 127 TTDGNGRYIFAGYKTETAPFSEVNGDYVGGTESIKQQVDASRSMVIGHTGDKIFDSITSN 186
T NG + + +G E+I + +G G + +
Sbjct: 128 QTQFNGVKVLSQDNQMKIQVGANDG------ETITIDLQKIDVKSLGLDGFNVNGPKEAT 181

Query: 187 AVAEPDGSASETNLFAMLDSAIAALKTPVADSEADKETAAAALDKT 232
+ T A + + TA DK
Sbjct: 182 VGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKV 227


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
ECs1460	FLGHOOKAP1	677	0.0	Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 677 bits (1747), Expect = 0.0
Identities = 541/546 (99%), Positives = 543/546 (99%)

Query: 2 SSLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVS 61
SSLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVS
Sbjct: 1 SSLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVS 60

Query: 62 GVQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTLV 121
GVQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTLV
Sbjct: 61 GVQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTLV 120

Query: 122 SNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLND 181
SNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLND
Sbjct: 121 SNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLND 180

Query: 182 QISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGSTA 241
QISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGSTA
Sbjct: 181 QISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGSTA 240

Query: 242 RQLAAVPSSADPSRTTVAYVDGTAGNIEIPEKLLNTGSLGGILTFRSQDLDQTRNTLGQL 301
RQLAAVPSSADPSRTTVAYVDGTAGNIEIPEKLLNTGSLGGILTFRSQDLDQTRNTLGQL
Sbjct: 241 RQLAAVPSSADPSRTTVAYVDGTAGNIEIPEKLLNTGSLGGILTFRSQDLDQTRNTLGQL 300

Query: 302 ALAFAEAFNSQHKAGFDANGDEGEDFFAIGKPAVLQNTKNNGNVAIGATVTDASAVLATD 361
ALAFAEAFN+QHKAGFDANGD GEDFFAIGKPAVLQNTKN G+VAIGATVTDASAVLATD
Sbjct: 301 ALAFAEAFNTQHKAGFDANGDAGEDFFAIGKPAVLQNTKNKGDVAIGATVTDASAVLATD 360

Query: 362 YKISFDNNQWQVTRLASNTTFTVTPDANGKVAFDGLELTFTGTPAVNDSFTLKPVSDAIV 421
YKISFDNNQWQVTRLASNTTFTVTPDANGKVAFDGLELTFTGTPAVNDSFTLKPVSDAIV
Sbjct: 361 YKISFDNNQWQVTRLASNTTFTVTPDANGKVAFDGLELTFTGTPAVNDSFTLKPVSDAIV 420

Query: 422 NMDVLITDEAKIAMASEEDAGDSDNRNGQALLDLQSNSKTVGGAKSFNDAYASLVSDIGN 481
NMDVLITDEAKIAMASEEDAGDSDNRNGQALLDLQSNSKTVGGAKSFNDAYASLVSDIGN
Sbjct: 421 NMDVLITDEAKIAMASEEDAGDSDNRNGQALLDLQSNSKTVGGAKSFNDAYASLVSDIGN 480

Query: 482 KTATLKTSSTTQGNVVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFD 541
KTATLKTSS TQGNVVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFD
Sbjct: 481 KTATLKTSSATQGNVVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFD 540

Query: 542 ALINIR 547
ALINIR
Sbjct: 541 ALINIR 546


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
ECs1459	FLGFLGJ	508	0.0	Flagellar protein FlgJ signature.
		>FLGFLGJ#Flagellar protein FlgJ signature.

Length = 313

Score = 508 bits (1308), Expect = 0.0
Identities = 311/313 (99%), Positives = 311/313 (99%)

Query: 1 MISDSKLLASAAWDAQSLNELKAKASEDPAANIRPVARQVEGMFVQMMLKSMRDALPKDG 60
MISDSKLLASAAWDAQSLNELKAKA EDPAANIRPVARQVEGMFVQMMLKSMRDALPKDG
Sbjct: 1 MISDSKLLASAAWDAQSLNELKAKAGEDPAANIRPVARQVEGMFVQMMLKSMRDALPKDG 60

Query: 61 LFSSEHTRLYTSMYDQQIAQQMTTGKGLGLAEMMVKQMTPEQPLPEESTPAAPMKFPLET 120
LFSSEHTRLYTSMYDQQIAQQMT GKGLGLAEMMVKQMTPEQPLPEESTPAAPMKFPLET
Sbjct: 61 LFSSEHTRLYTSMYDQQIAQQMTAGKGLGLAEMMVKQMTPEQPLPEESTPAAPMKFPLET 120

Query: 121 VVRYQNQALSQLVQKAVPRNYDDSLPGDSKAFLAQLSLPAQLASQQSGVPHHLILAQAAL 180
VVRYQNQALSQLVQKAVPRNYDDSLPGDSKAFLAQLSLPAQLASQQSGVPHHLILAQAAL
Sbjct: 121 VVRYQNQALSQLVQKAVPRNYDDSLPGDSKAFLAQLSLPAQLASQQSGVPHHLILAQAAL 180

Query: 181 ESGWGQRQIRRENGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKKVKAKFRVYSSYL 240
ESGWGQRQIRRENGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKKVKAKFRVYSSYL
Sbjct: 181 ESGWGQRQIRRENGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKKVKAKFRVYSSYL 240

Query: 241 EALSDYVGLLTRNPRYAAVTTAASAEQGAQALQDAGYATDPHYARKLTNMIQQMKSISDK 300
EALSDYVGLLTRNPRYAAVTTAASAEQGAQALQDAGYATDPHYARKLTNMIQQMKSISDK
Sbjct: 241 EALSDYVGLLTRNPRYAAVTTAASAEQGAQALQDAGYATDPHYARKLTNMIQQMKSISDK 300

Query: 301 VSKTYSMNIDNLF 313
VSKTYSMNIDNLF
Sbjct: 301 VSKTYSMNIDNLF 313


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
ECs1458	FLGPRINGFLGI	426	e-152	Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 426 bits (1097), Expect = e-152
Identities = 156/363 (42%), Positives = 213/363 (58%), Gaps = 9/363 (2%)

Query: 5 FLSALILLLVTTAAQAERIRDLTSVQGVRQNSLIGYGLVVGLDGTGDQTTQTPFTTQTLN 64
F + L A RI+D+ S+Q R N LIGYGLVVGL GTGD +PFT Q++
Sbjct: 13 FSALPFLSTPPAQADTSRIKDIASLQAGRDNQLIGYGLVVGLQGTGDSLRSSPFTEQSMR 72

Query: 65 NMLSQLGITVPTGTNMQLKNVAAVMVTASLPPFGRQGQTIDVVVSSMGNAKSLRGGTLLM 124
ML LGIT G + KN+AAVMVTA+LPPF G +DV VSS+G+A SLRGG L+M
Sbjct: 73 AMLQNLGITTQGGQS-NAKNIAAVMVTANLPPFASPGSRVDVTVSSLGDATSLRGGNLIM 131

Query: 125 TPLKGVDSQVYALAQGNILVGGAGASAGGSSVQVNQLNGGRITNGAVIERELPSQFGVGN 184
T L G D Q+YA+AQG ++V G A +++ R+ NGA+IERELPS+F
Sbjct: 132 TSLSGADGQIYAVAQGALIVNGFSAQGDAATLTQGVTTSARVPNGAIIERELPSKFKDSV 191

Query: 185 TLNLQLNDEDFSMAQQIADTINRVR----GYGSATALDARTIQVRVPSGNSSQVRFLADI 240
L LQL + DFS A ++AD +N G A D++ I V+ P + R +A+I
Sbjct: 192 NLVLQLRNPDFSTAVRVADVVNAFARARYGDPIAEPRDSQEIAVQKPRV-ADLTRLMAEI 250

Query: 241 QNMQVNVTPQDAKVVINSRTGSVVMNREVTLDSCAIAQGNLSVTVNRQANVSQPDTPFGG 300
+N+ V T AKVVIN RTG++V+ +V + A++ G L+V V V QP PF
Sbjct: 251 ENLTVE-TDTPAKVVINERTGTIVIGADVRISRVAVSYGTLTVQVTESPQVIQP-APFSR 308

Query: 301 GQTVVTPQTQIDLRQSGGSLQSVRSSASLNNVVRALNALGATPMDLMSILQSMQSAGCLR 360
GQT V PQT I Q G + ++ L +V LN++G +++ILQ ++SAG L+
Sbjct: 309 GQTAVQPQTDIMAMQEGSKV-AIVEGPDLRTLVAGLNSIGLKADGIIAILQGIKSAGALQ 367

Query: 361 AKL 363
A+L
Sbjct: 368 AEL 370


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
ECs1457	FLGLRINGFLGH	349	e-126	Flagellar L-ring protein signature.
		>FLGLRINGFLGH#Flagellar L-ring protein signature.

Length = 232

Score = 349 bits (897), Expect = e-126
Identities = 232/232 (100%), Positives = 232/232 (100%)

Query: 1 MQKNAAHTYAISSLLVLSLTGCAWIPSTPLVQGATSAQPVPGPTPVANGSIFQSAQPINY 60
MQKNAAHTYAISSLLVLSLTGCAWIPSTPLVQGATSAQPVPGPTPVANGSIFQSAQPINY
Sbjct: 1 MQKNAAHTYAISSLLVLSLTGCAWIPSTPLVQGATSAQPVPGPTPVANGSIFQSAQPINY 60

Query: 61 GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYLQGLFGNA 120
GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYLQGLFGNA
Sbjct: 61 GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYLQGLFGNA 120

Query: 121 RADVEASGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRF 180
RADVEASGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRF
Sbjct: 121 RADVEASGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRF 180

Query: 181 SGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSPM 232
SGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSPM
Sbjct: 181 SGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSPM 232


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
ECs1456	FLGHOOKAP1	44	4e-07	Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 43.8 bits (103), Expect = 4e-07
Identities = 18/81 (22%), Positives = 36/81 (44%), Gaps = 14/81 (17%)

Query: 3 SSLWIAKTGLDAQQTNMDVIANNLANVSTNGFKRQRAVFEDLLYQTIRQPGAQSSEQTTL 62
S + A +GL+A Q ++ +NN+++ + G+ RQ + + +TL
Sbjct: 2 SLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTI--------------MAQANSTL 47

Query: 63 PSGLQIGTGVRPVATERLHSQ 83
+G +G GV +R +
Sbjct: 48 GAGGWVGNGVYVSGVQREYDA 68



Score = 41.1 bits (96), Expect = 3e-06
Identities = 11/41 (26%), Positives = 21/41 (51%)

Query: 220 ETSNVNVAEELVNMIQVQRAYEINSKAVSTTDQMLQKLTQL 260
S VN+ EE N+ + Q+ Y N++ + T + + L +
Sbjct: 505 SISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
ECs1454	FLGHOOKAP1	41	4e-06	Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 41.5 bits (97), Expect = 4e-06
Identities = 17/49 (34%), Positives = 29/49 (59%)

Query: 353 TLTNGALEASNVDLSKELVNMIVAQRNYQSNAQTIKTQDQILNTLVNLR 401
L+N S V+L +E N+ Q+ Y +NAQ ++T + I + L+N+R
Sbjct: 498 QLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINIR 546



Score = 37.2 bits (86), Expect = 1e-04
Identities = 22/56 (39%), Positives = 30/56 (53%), Gaps = 4/56 (7%)

Query: 6 AVSGLNAAATNLDVIGNNIANSATYGFKSGTASFAD----MFAGSKVGLGVKVAGI 57
A+SGLNAA L+ NNI++ G+ T A + AG VG GV V+G+
Sbjct: 7 AMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVSGV 62


55	ECs1426	ECs1298	Y	Y	Y	Pathogenicity Island (biased-composition)
LocusTag	DNBias	CDNBias	%GCBias	Product
ECs1426	-1	17	-3.484192	glucan biosynthesis protein G
ECs1425	0	21	-3.818703	glucans biosynthesis protein
ECs1424	-1	25	-3.462463	synthase
ECs1423	3	34	-5.929280	hypothetical protein
ECs1422	1	34	-8.152338	hypothetical protein
ECs1421	0	36	-8.347044	autoagglutination protein
ECs1420	0	32	-8.514289	cryptic curlin major subunit
ECs1419	1	27	-7.601952	curlin minor subunit
ECs1418	1	21	-5.541319	hypothetical protein
ECs1417	-1	18	-3.767081	DNA-binding transcriptional regulator CsgD
ECs1416	-2	16	-2.483910	curli assembly protein CsgE
ECs1415	-1	17	-1.422238	curli assembly protein CsgF
ECs1414	-1	14	-0.996020	protein CsgG
ECs1413	1	15	-0.914482	hypothetical protein
ECs1412	0	15	-1.066451	oxidoreductase component
ECs1411	1	17	-1.368852	hydrolase
ECs1410	3	21	-0.157516	dehydrogenase
ECs1409	7	24	0.171577	*hypothetical protein
ECs1408	7	28	3.100418	hypothetical protein
ECs1407	7	29	3.485000	hypothetical protein
ECs1406	7	30	3.729211	hypothetical protein
ECs1405	7	29	3.504581	hypothetical protein
ECs1404	6	24	3.108296	hypothetical protein
ECs1403	6	24	3.044303	DNA repair protein
ECs1402	5	22	1.647391	hypothetical protein
ECs1401	6	22	2.894009	hypothetical protein
ECs1400	6	22	3.215816	hypothetical protein
ECs1399	6	22	2.999622	hypothetical protein
ECs1398	6	21	2.808837	hypothetical protein
ECs1397	6	21	2.700216	hypothetical protein
ECs1396	7	22	2.154017	AidA-I
ECs1395	4	24	-1.434786	hypothetical protein
ECs1394	6	26	-3.552547	hypothetical protein
ECs1393	5	24	-1.717570	hypothetical protein
ECs1392	4	28	-1.190730	hypothetical protein
ECs1389	4	28	-1.008261	hypothetical protein
ECs1388	5	26	1.018944	transcriptional regulator
ECs1387	6	26	0.697014	hypothetical protein
ECs1386	4	24	0.513383	hypothetical protein
ECs1382	5	26	0.554284	HecB-like protein
ECs1381	4	22	0.672824	transposase
ECs1380	4	22	0.569925	transposase
ECs5415	5	24	-2.051294	hypothetical protein
ECs1379	4	25	-4.485953	hypothetical protein
ECs1378	3	26	-4.063091	hypothetical protein
ECs1377	2	22	-3.028806	hypothetical protein
ECs1376	2	32	-6.211000	hypothetical protein
ECs1375	3	31	-7.200346	hypothetical protein
ECs1374	3	28	-5.616498	hypothetical protein
ECs1373	2	25	-3.959065	hypothetical protein
ECs1372	1	28	-4.740028	transposase
ECs1370	2	36	-6.687757	glucosyl-transferase
ECs5414	3	31	-7.789014	hypothetical protein
ECs1367	5	24	-2.997804	hypothetical protein
ECs1366	5	23	-2.985436	hypothetical protein
ECs1365	5	23	-2.998665	hypothetical protein
ECs1364	5	20	-2.458482	hypothetical protein
ECs1362	6	21	-1.467485	hypothetical protein
ECs1360	6	21	0.009154	bifunctional enterobactin receptor/adhesin
ECs1359	2	27	-0.297825	hypothetical protein
ECs1357	1	24	0.055365	hypothetical protein
ECs1356	2	26	0.139248	protein TerE
ECs1355	1	21	-0.074054	protein TerD
ECs1354	2	20	0.361993	protein TerC
ECs1353	3	21	0.597273	protein TerB
ECs1352	2	22	0.842622	protein TerA
ECs1351	2	26	0.901209	protein TerZ
ECs1350	2	25	1.435028	hypothetical protein
ECs1349	2	26	2.077080	hypothetical protein
ECs1348	2	26	1.471921	hypothetical protein
ECs1347	1	25	1.670682	hypothetical protein
ECs1346	2	21	2.014174	hypothetical protein
ECs1345	2	20	1.958814	hypothetical protein
ECs1344	3	22	1.430048	hypothetical protein
ECs1343	3	23	1.817483	TerW protein
ECs1340	4	24	2.839815	hypothetical protein
ECs1339	5	25	2.640193	hypothetical protein
ECs1338	3	26	2.337148	hypothetical protein
ECs1337	2	26	1.228773	hypothetical protein
ECs1336	2	24	0.982142	hypothetical protein
ECs1335	3	24	0.304400	hypothetical protein
ECs1334	0	23	0.723152	hypothetical protein
ECs5413	-1	19	1.289342	hypothetical protein
ECs1330	-1	18	1.978398	50S ribosomal protein L31
ECs1328	0	19	2.570326	hypothetical protein
ECs1327	1	21	3.209698	protein UreG
ECs1326	0	20	3.135756	UreF
ECs1325	0	19	1.774013	urease accessory protein UreE
ECs1324	2	19	0.985308	urease subunit alpha
ECs1323	5	34	-7.593753	urease subunit beta
ECs1322	5	35	-8.891167	urease subunit gamma
ECs1321	6	39	-9.717707	hypothetical protein
ECs1320	9	45	-10.961863	hypothetical protein
ECs1319	7	39	-8.772161	hypothetical protein
ECs1317	7	37	-8.038920	outer membrane protein
ECs1316	5	29	-1.870720	diacylglycerol kinase
ECs1315	4	25	2.299983	hypothetical protein
ECs1312	4	23	0.382237	complement resistance protein
ECs1311	3	21	0.103925	transposase
ECs1309	3	22	-1.766002	hypothetical protein
ECs1308	3	22	-2.129468	hypothetical protein
ECs1307	3	24	-2.818943	hypothetical protein
ECs1305	4	30	-5.820991	hypothetical protein
ECs1304	5	38	-12.227645	hypothetical protein
ECs1303	3	38	-10.577374	hypothetical protein
ECs5412	2	37	-11.106437	hypothetical protein
ECs1302	1	37	-10.989570	regulatory protein
ECs1301	-1	29	-8.537143	transposase
ECs1300	-1	27	-8.219291	hypothetical protein
ECs1299	-2	21	-4.001945	integrase
ECs1298	0	19	-3.578087	hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
ECs1393	cdtoxina	28	0.013	Cytolethal distending toxin A signature.
		>cdtoxina#Cytolethal distending toxin A signature.

Length = 258

Score = 27.7 bits (61), Expect = 0.013
Identities = 15/61 (24%), Positives = 24/61 (39%), Gaps = 5/61 (8%)

Query: 74 VELLPVEITPDEQKEPVAAIAPSLSTSTQTSVSAGSCKVEFRHGNMTLENPSPELLTLLI 133
VE P +PDE P+ P+L T+ + ++L N +LT+
Sbjct: 40 VEGGPTVPSPDEPGLPLPGPGPALPTNGAIPIPEPGTAPA-----VSLMNMDGSVLTMWS 94

Query: 134 R 134
R
Sbjct: 95 R 95


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
ECs1355	PF07824	28	0.014	Type III secretion chaperone
		>PF07824#Type III secretion chaperone

Length = 120

Score = 28.0 bits (62), Expect = 0.014
Identities = 10/36 (27%), Positives = 15/36 (41%)

Query: 132 VNDDNQTEVARYDLTEDASTETAMLFGELYRHNGEW 167
+D+ + +AR DLT E + E Y W
Sbjct: 73 TDDEGGSLIARLDLTGINEFEDIYVNTEYYISRVRW 108


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
ECs1352	TYPE4SSCAGA	30	0.023	Type IV secretion system CagA exotoxin signature.
		>TYPE4SSCAGA#Type IV secretion system CagA exotoxin signature.

Length = 1147

Score = 29.7 bits (66), Expect = 0.023
Identities = 21/54 (38%), Positives = 29/54 (53%), Gaps = 4/54 (7%)

Query: 270 KTDGVVTIHVPDQPPIETRLTEGENRRTLCAIARLVNE--NGAIK-VERINQYF 320
K D V + PDQ PI + + +NR+ I++L E N AIK + NQYF
Sbjct: 31 KVDNAVASYDPDQKPIVDK-NDRDNRQAFEGISQLREEYSNKAIKNPTKKNQYF 83


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
ECs1339	PF02370	36	1e-04	M protein repeat
		>PF02370#M protein repeat

Length = 168

Score = 35.9 bits (82), Expect = 1e-04
Identities = 20/103 (19%), Positives = 47/103 (45%), Gaps = 3/103 (2%)

Query: 18 EQAEALRQKDQQLSLVEETEAFLRSALARAEEKIEEEEREIEHLRAQIEKLRRMLFGTRS 77
+ +++ R+ D Q + LR + ++KIEE E+E + + + E+ + +
Sbjct: 38 DSSDSKRENDPQYRALMGENQDLRKREGQYQDKIEELEKERKEKQERPERREKFERQHQD 97

Query: 78 EKLQREVEQAEAQLKQREQESDRYSGREDDPQVPRQLRQSRHR 120
+ Q + ++ + + +Q E E + + Q+ RQ +R
Sbjct: 98 KHYQEQQKKHQQEQQQLEAEKQKL---AKEKQISDASRQGLNR 137


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
ECs5413	HOKGEFTOXIC	34	2e-06	Hok/Gef cell toxic protein family signature.
		>HOKGEFTOXIC#Hok/Gef cell toxic protein family signature.

Length = 52

Score = 33.6 bits (77), Expect = 2e-06
Identities = 14/48 (29%), Positives = 28/48 (58%), Gaps = 2/48 (4%)

Query: 1 MPQKTIIVGML--CLTMLLTVWVLHASPCEFRVSFMWSEIAAFLQCKP 46
+P+ +++ +L CLT+L+ ++ S CE R + E+AAF+ +
Sbjct: 3 LPRSSLVWCVLIVCLTLLIFTYLTRKSLCEIRYRDGYREVAAFMAYES 50


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
ECs1324	UREASE	1081	0.0	Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 1081 bits (2797), Expect = 0.0
Identities = 397/570 (69%), Positives = 462/570 (81%), Gaps = 2/570 (0%)

Query: 1 MMSNISRQAYADMFGPTTGDKIRLADTELWIEVEDDLTTYGEEVKFGGGKVIRDGMGQGQ 60
M +SR AYA+MFGPT GDK+RLADTEL+IEVE D TT+GEEVKFGGGKVIRDGMGQ Q
Sbjct: 1 MSYRMSRAAYANMFGPTVGDKVRLADTELFIEVEKDFTTHGEEVKFGGGKVIRDGMGQSQ 60

Query: 61 ML-SAGCADLVLTNALIIDYWGIVKADIGVKDGRIFAIGKAGNPDIQPNVTIPIGVSTEI 119
+ G D V+TNALI+D+WGIVKADIG+KDGRI AIGKAGNPD+QP VTI +G TE+
Sbjct: 61 VTREGGAVDTVITNALILDHWGIVKADIGLKDGRIAAIGKAGNPDMQPGVTIIVGPGTEV 120

Query: 120 IAAEGRIVTAGGVDTHIHWICPQQAEEALTSGITTMIGGGTGPTAGSNATTCTPGPWYIY 179
IA EG+IVTAGG+D+HIH+ICPQQ EEAL SG+T M+GGGTGP G+ ATTCTPGPW+I
Sbjct: 121 IAGEGKIVTAGGMDSHIHFICPQQIEEALMSGLTCMLGGGTGPAHGTLATTCTPGPWHIA 180

Query: 180 QMLQAADSLPVNIGLLGKGNCSNPDALREQVAAGVIGLKIHEDWGATPAVINCALTVADE 239
+M++AAD+ P+N+ GKGN S P AL E V G LK+HEDWG TPA I+C L+VADE
Sbjct: 181 RMIEAADAFPMNLAFAGKGNASLPGALVEMVLGGATSLKLHEDWGTTPAAIDCCLSVADE 240

Query: 240 MDVQVALHSDTLNESGFVEDTLTAIGGRTIHTFHTEGAGGGHAPDIITACAHPNILPSST 299
DVQV +H+DTLNESGFVEDT+ AI GRTIH +HTEGAGGGHAPDII C PN++PSST
Sbjct: 241 YDVQVMIHTDTLNESGFVEDTIAAIKGRTIHAYHTEGAGGGHAPDIIRICGQPNVIPSST 300

Query: 300 NPTLPYTVNTIDEHLDMLMVCHHLDPDIAEDVAFAESRIRQETIAAEDVLHDLGAFSLTS 359
NPT PYTVNT+ EHLDMLMVCHHL P I ED+AFAESRIR+ETIAAED+LHD+GAFS+ S
Sbjct: 301 NPTRPYTVNTLAEHLDMLMVCHHLSPTIPEDIAFAESRIRKETIAAEDILHDIGAFSIIS 360

Query: 360 SDSQAMGRVGEVVLRTWQVAHRMKVQRGPLPEESGDNDNVRVKRYIAKYTINPALTHGIA 419
SDSQAMGRVGEV +RTWQ A +MK QRG L EE+GDNDN RVKRYIAKYTINPA+ HG++
Sbjct: 361 SDSQAMGRVGEVAIRTWQTADKMKRQRGRLKEETGDNDNFRVKRYIAKYTINPAIAHGLS 420

Query: 420 HEVGSIEVGKLADLVLWSPAFFGVKPATIVKGGMIAMAPMGDINGSIPTPQPVHYRPMFA 479
HE+GS+EVGK ADLVLW+PAFFGVKP ++ GG IA APMGD N SIPTPQPVHYRPMF
Sbjct: 421 HEIGSLEVGKRADLVLWNPAFFGVKPDMVLLGGTIAAAPMGDPNASIPTPQPVHYRPMFG 480

Query: 480 ALGSARHRCRVTFLSQAAAANGVAEQLNLHSTTAVVKGCR-TVQKADMRHNSLLPDITVD 538
A G +R VTF+SQA+ G+A +L + V+ R + KA M HNSL P I VD
Sbjct: 481 AYGRSRTNSSVTFVSQASLDAGLAGRLGVAKELVAVQNTRGGIGKASMIHNSLTPHIEVD 540

Query: 539 SQTYEVRINGELITSEPADILPMAQRYFLF 568
+TYEVR +GEL+T EPA +LPMAQRYFLF
Sbjct: 541 PETYEVRADGELLTCEPATVLPMAQRYFLF 570


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
ECs1302	HTHFIS	26	0.043	FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 25.6 bits (56), Expect = 0.043
Identities = 6/15 (40%), Positives = 13/15 (86%)

Query: 21 SQLLGISRSTIYEKM 35
+ LLG++R+T+ +K+
Sbjct: 456 ADLLGLNRNTLRKKI 470


56	ECs1288	ECs1163	Y	Y	Y	Pathogenicity Island (biased-composition)
LocusTag	DNBias	CDNBias	%GCBias	Product
ECs1288	0	15	-5.365618	glycine cleavage system protein T
ECs1287	2	12	-4.288084	acyl carrier protein
ECs1286	2	12	-5.325650	(3R)-hydroxymyristoyl-ACP dehydratase
ECs1285	2	12	-5.094664	3-oxoacyl-ACP reductase
ECs1284	2	14	-5.759536	holo-ACP synthase
ECs1283	1	13	-4.241666	hemolysin activator-like protein
ECs1282	2	14	-3.164690	hemagglutinin/hemolysin-like protein
ECs1281	2	23	-3.833658	hypothetical protein
ECs1280	1	21	-2.979649	major pilin protein
ECs1279	1	24	-4.972247	chaperone protein
ECs1278	1	26	-5.857077	outer membrane usher protein
ECs1277	2	37	-10.882169	outer membrane protein
ECs1276	1	40	-13.409496	chaperone protein
ECs1275	-1	36	-11.112645	oxidoreductase
ECs1274	-1	35	-10.383216	transcriptional regulator
ECs1273	-1	32	-8.915630	FidL-like protein
ECs1272	-1	31	-8.361043	rtn-like protein
ECs1271	-2	27	-6.077229	hypothetical protein
ECs1270	-2	22	-3.745729	outer membrane protein PgaA
ECs1269	-2	17	-3.056750	outer membrane N-deacetylase
ECs1268	-1	11	-0.816427	N-glycosyltransferase
ECs1267	-1	13	-0.174688	PGA biosynthesis protein
ECs1266	-2	13	0.810099	hypothetical protein
ECs1265	0	13	0.325340	hypothetical protein
ECs1264	-2	12	2.111470	hypothetical protein
ECs1263	-2	11	2.262931	cytochrome
ECs1262	-2	12	2.396306	hypothetical protein
ECs1261	-1	15	3.187557	major sodium/proline symporter
ECs5411	0	14	3.290832	hypothetical protein
ECs1260	-1	13	4.126216	trifunctional transcriptional regulator/proline
ECs1259	0	17	3.403915	tet operon regulator
ECs1258	-1	21	4.517572	hypothetical protein
ECs1257	0	17	3.926273	synthetase
ECs1256	-2	15	2.784693	hypothetical protein
ECs1255	-3	14	2.583332	acetyltransferase
ECs1254	-3	15	1.283138	hypothetical protein
ECs1253	-3	13	0.116216	hypothetical protein
ECs1252	-2	16	-0.942249	transporter
ECs1251	6	29	-3.220246	anti-repressor protein
ECs1250	7	29	-2.572108	C4-type zinc finger TraR
ECs1249	6	29	-3.486150	hypothetical protein
ECs1248	5	32	-3.502145	hypothetical protein
ECs1247	1	20	1.645330	hypothetical protein
ECs1246	0	18	1.312199	hypothetical protein
ECs1245	0	19	1.511918	MokW protein
ECs1244	-1	19	1.692220	hypothetical protein
ECs1243	0	19	1.974513	hypothetical protein
ECs1242	0	19	1.976111	hypothetical protein
ECs1241	3	30	-0.939815	hypothetical protein
ECs1240	3	32	-2.207310	hypothetical protein
ECs1239	4	33	-3.142559	hypothetical protein
ECs1238	2	32	-3.639739	hypothetical protein
ECs1237	3	29	-3.072791	hypothetical protein
ECs1236	4	27	-3.459972	outer membrane protein
ECs1235	5	26	-3.550208	hypothetical protein
ECs1234	4	26	-2.592894	outer membrane protein
ECs1233	5	28	1.769715	tail tip fiber protein
ECs1232	6	30	2.425323	hypothetical protein
ECs1231	8	31	3.279743	outer membrane protein
ECs1230	8	33	3.733117	hypothetical protein
ECs1229	9	32	4.308426	hypothetical protein
ECs1228	8	28	3.680880	tail fiber protein
ECs1227	6	26	0.779452	hypothetical protein
ECs1226	4	24	0.862126	hypothetical protein
ECs1225	4	22	0.138089	hypothetical protein
ECs1224	3	20	-0.324658	hypothetical protein
ECs1223	2	20	-0.452310	hypothetical protein
ECs1222	2	21	-0.960079	hypothetical protein
ECs1221	2	21	-1.097474	portal protein
ECs1220	0	19	-2.610796	terminase large subunit
ECs1219	1	20	-2.022019	small subunit terminase
ECs1218	1	20	-1.322209	hypothetical protein
ECs1217	1	20	-0.836967	Bor protein
ECs1215	3	20	0.365934	endopeptidase
ECs1214	3	22	0.164734	antirepressor protein
ECs1213	5	23	2.258556	endolysin
ECs1212	4	23	2.653733	holin
ECs1211	4	21	1.787347	hypothetical protein
ECs1210	4	25	-0.045408	hypothetical protein
ECs1209	3	26	0.064043	transposase
ECs1208	4	26	-0.214361	transposase
ECs1207	3	28	-1.083279	hypothetical protein
ECs1206	2	33	-4.298475	Shiga toxin 2 subunit B
ECs1205	0	31	-3.532811	Shiga toxin 2 subunit A
ECs1204	1	28	-2.891717	***hypothetical protein
ECs1203	1	29	-3.365332	antitermination protein Q
ECs1202	2	31	-3.058931	hypothetical protein
ECs1201	2	30	-2.265522	protein NinG
ECs1200	2	31	-2.293230	DNA-binding protein
ECs1199	1	33	-2.156299	antirepressor protein
ECs1197	3	32	-1.591880	protein NinE
ECs1196	3	30	-2.529970	DNA methylase
ECs1195	2	26	-2.316972	hypothetical protein
ECs1194	2	27	-3.851111	hypothetical protein
ECs1193	2	27	-4.126408	hypothetical protein
ECs1192	2	27	-4.440007	hypothetical protein
ECs1191	3	30	-4.265143	hypothetical protein
ECs1190	3	30	-4.045226	replication protein P
ECs1189	5	35	-6.213503	replication protein O
ECs1188	6	37	-6.499645	hypothetical protein
ECs1187	6	35	-6.315390	hypothetical protein
ECs1186	6	36	-5.873646	CRO
ECs1185	5	35	-6.483500	cI repressor protein
ECs1184	5	35	-8.260349	hypothetical protein
ECs1183	3	40	-8.411172	hypothetical protein
ECs1182	2	41	-8.456959	hypothetical protein
ECs1181	3	38	-7.715809	hypothetical protein
ECs1180	4	28	-5.164046	hypothetical protein
ECs1179	4	27	-3.318897	hypothetical protein
ECs1178	3	28	-1.844630	regulatory protein cIII
ECs1177	3	27	-1.783185	Kil protein
ECs1176	2	27	-1.058931	hypothetical protein
ECs1175	2	28	-0.767597	recombination protein Bet
ECs1174	1	31	-2.438441	exonuclease
ECs1173	4	31	-3.274960	hypothetical protein
ECs1172	5	30	-3.218864	hypothetical protein
ECs1171	4	28	-4.091063	hypothetical protein
ECs1170	4	30	-3.924560	C4-type zinc finger TraR
ECs1169	2	28	-4.068691	hypothetical protein
ECs1168	-2	29	-4.198553	hypothetical protein
ECs1167	-1	28	-5.402015	hypothetical protein
ECs1166	-1	29	-5.962204	hypothetical protein
ECs1165	-2	30	-4.914932	hypothetical protein
ECs1164	-3	26	-3.792029	hypothetical protein
ECs1163	-2	26	-3.742577	hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
ECs1285	DHBDHDRGNASE	117	9e-35	2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 117 bits (294), Expect = 9e-35
Identities = 55/192 (28%), Positives = 100/192 (52%), Gaps = 10/192 (5%)

Query: 4 DLACPQSVSALCEQIERQAGKIDVLVNNAGIVKDSLFASMSYEDFTQVIETNMFSIFRLT 63
D+ ++ + +IER+ G ID+LVN AG+++ L S+S E++ N +F +
Sbjct: 65 DVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNAS 124

Query: 64 KDALMLLRAAENPAIINVASIAALIPSVGQANYSASKGAILGFTRTLAAEMAPWGVRVNA 123
+ + + +I+ V S A +P A Y++SK A + FT+ L E+A + +R N
Sbjct: 125 RSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNI 184

Query: 124 VAPGMIESKMVKKV------SRAVVRAVTST----IPLRRLGKCEEVANTIVFLSSSASS 173
V+PG E+ M + + V++ T IPL++L K ++A+ ++FL S +
Sbjct: 185 VSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAG 244

Query: 174 YIVGQTIVIDGG 185
+I + +DGG
Sbjct: 245 HITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
ECs1282	PF05860	64	2e-14	haemagglutination activity domain.
		>PF05860#haemagglutination activity domain.

Length = 117

Score = 64.4 bits (157), Expect = 2e-14
Identities = 23/126 (18%), Positives = 48/126 (38%), Gaps = 21/126 (16%)

Query: 37 KNGTVYNANGVPVVDINKPNGSGLSHNIWDNLNVDKNGVVFNNSANESSTSLAGNIQGNS 96
N + +++ GS L H+ + +V +G F N+
Sbjct: 11 INSNITTEGNTRIIERGTQAGSNLFHS-FQEFSVPTSGTAFFNNP--------------- 54

Query: 97 NLTSGSAKVILNEVTSKNPSTINGMMEVAGDKADLIIANPNGITVNGGGSINTGKLTLTT 156
+ + I++ VT + S I+G++ A+L + NPNGI ++ G + +
Sbjct: 55 ----TNIQNIISRVTGGSVSNIDGLIRANA-TANLFLINPNGIIFGQNARLDIGGSFVGS 109

Query: 157 GTPDIQ 162
++
Sbjct: 110 TANRLK 115


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
ECs1279	SECA	29	0.022	SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 28.7 bits (64), Expect = 0.022
Identities = 19/66 (28%), Positives = 27/66 (40%), Gaps = 14/66 (21%)

Query: 170 VTNPTGYYVTIRAAELLNNGKKVPLANSVMIAPQSTTEW-----TLPSGISVAPGAQIHL 224
V + V + +LN IA T E TLP+ ++ G +H+
Sbjct: 78 VFGMRHFDVQLLGGMVLNERC---------IAEMRTGEGKTLTATLPAYLNALTGKGVHV 128

Query: 225 VTVNDY 230
VTVNDY
Sbjct: 129 VTVNDY 134


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
ECs1278	PF00577	717	0.0	Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 717 bits (1851), Expect = 0.0
Identities = 246/848 (29%), Positives = 404/848 (47%), Gaps = 44/848 (5%)

Query: 28 ATPSDEDNYTFDPQLFRGSRFSQSSLAKLTTRESVAPGNYKMDIYTNNKLSGSWNVTFKE 87
P F+P+ + + L++ + + PG Y++DIY NN + +VTF
Sbjct: 39 QAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPGTYRVDIYLNNGYMATRDVTFNT 98

Query: 88 AADG-RVLPCLTPEVADAIGLKTGEDKGEK---DPVCTFAKELAPGITSQTQLSQLRLDL 143
++PCLT ++GL T G D C + T+Q + Q RL+L
Sbjct: 99 GDSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDACVPLTSMIHDATAQLDVGQQRLNL 158

Query: 144 SVPQSQLISRPRGYVPPSELDTGASLAFMNYIANYYNVAYSGQNAHSQRSLWASFNGGIN 203
++PQ+ + +R RGY+PP D G + +NY + + + + + + G+N
Sbjct: 159 TIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNS--VQNRIGGNSHYAYLNLQSGLN 216

Query: 204 LGAWQYRQLSNMTW-----DNDKGNQWNNIRSYLQRPLPAINSQLMMGQLITSGRFFSGL 258
+GAW+ R + ++ + N+W +I ++L+R + + S+L +G T G F G+
Sbjct: 217 IGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRSRLTLGDGYTQGDIFDGI 276

Query: 259 SYHGVSLATDERMLPDSMRGYAPTIRGVAATNARVSVMQNGHEIYQTTVAPGPFEINDLY 318
++ G LA+D+ MLPDS RG+AP I G+A A+V++ QNG++IY +TV PGPF IND+Y
Sbjct: 277 NFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIYNSTVPPGPFTINDIY 336

Query: 319 PTSYSGDLDVTVTEANGAVSRFSVPFSAVPESMRPGTSRYNVEVGKTQDSG---DDSMFG 375
SGDL VT+ EA+G+ F+VP+S+VP R G +RY++ G+ + + F
Sbjct: 337 AAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITAGEYRSGNAQQEKPRFF 396

Query: 376 DLTWQHGMTNTLTFNSGSRIADGYQALMLGGVYGS-SLGAFGANLTWSHARVPESEAQSG 434
T HG+ T G+++AD Y+A G +LGA ++T +++ +P+ G
Sbjct: 397 QSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALSVDMTQANSTLPDDSQHDG 456

Query: 435 WMSQLTWSKTFQPTSTTVSLAGYRYSTSGYRDLADVLGERHAASNKQSWD---------- 484
+ ++K+ + T + L GYRYSTSGY + AD R N ++ D
Sbjct: 457 QSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGYNIETQDGVIQVKPKFT 516

Query: 485 ---SSQWRQQSRFDLTLSQSLANYGNLFVSGSTQNYRGGKSRDTQLQLGYSNSFSHGISM 541
+ + ++ + LT++Q L L++SGS Q Y G + D Q Q G + +F I+
Sbjct: 517 DYYNLAYNKRGKLQLTVTQQLGRTSTLYLSGSHQTYWGTSNVDEQFQAGLNTAF-EDINW 575

Query: 542 NLSVGRQRMGGYKDNSDDMQTVTSLSFSFPLGG-------NGPRVPSLSNSWTHSTDGSS 594
LS + D +L+ + P + R S S S +H +G
Sbjct: 576 TLSYSLTKNAW--QKGRDQM--LALNVNIPFSHWLRSDSKSQWRHASASYSMSHDLNGRM 631

Query: 595 QLQSSLTGMLDEAQTTNYSLNV---MRDQQYKQTTLSGNMQKRFSQTTVGLNASKGQDYW 651
+ + G L E +YS+ +T + R + S D
Sbjct: 632 TNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGYGNANIGYSHSDDIK 691

Query: 652 QASGNVQGAMAVHGGGITFGPYLGETFALVEAKGAEGAKVYNSSQLEINDSGYALVPAVT 711
Q V G + H G+T G L +T LV+A GA+ AKV N + + + GYA++P T
Sbjct: 692 QLYYGVSGGVLAHANGVTLGQPLNDTVVLVKAPGAKDAKVENQTGVRTDWRGYAVLPYAT 751

Query: 712 PYRYNRISLDPQGMDGDAELVDSERQVAPVAGAAVKVIFRTRPGKALLIKSRMADGSELP 771
YR NR++LD + + +L ++ V P GA V+ F+ R G LL+ + LP
Sbjct: 752 EYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAEFKARVGIKLLMTLT-HNNKPLP 810

Query: 772 MGADVLDENNTVVGIAGQGGQIYLRTEQTKGHLSVRWGEGANDSCQLPFDISGKDSNSPI 831
GA V E++ GI GQ+YL G + V+WGE N C + + + +
Sbjct: 811 FGAMVTSESSQSSGIVADNGQVYLSGMPLAGKVQVKWGEEENAHCVANYQLPPESQQQLL 870

Query: 832 IRLNETCQ 839
+L+ C+
Sbjct: 871 TQLSAECR 878


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
ECs1275	DHBDHDRGNASE	103	7e-29	2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 103 bits (259), Expect = 7e-29
Identities = 70/255 (27%), Positives = 118/255 (46%), Gaps = 11/255 (4%)

Query: 15 LHNKVAIVTGAAGELGRGLCSALAKAGANLLLVDIK-EPDNRYLKHLTHEGVEVEFMTID 73
+ K+A +TGAA +G + LA GA++ VD E + + L E E D
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPAD 65

Query: 74 ITKPDASCTIINRCLERFGQLDILVNNAGVCNINRPIDFNRNDWDPMINLNLNAAFDMSQ 133
+ A I R G +DILVN AGV + +W+ ++N F+ S+
Sbjct: 66 VRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASR 125

Query: 134 AALNIFVPQRKGKIINMCSVLSFHGGRWSPG-YAATKHALAGLTKAYADDFAEYNIQING 192
+ + +R G I+ + S + R S YA++K A TK + AEYNI+ N
Sbjct: 126 SVSKYMMDRRSGSIVTVGSNPA-GVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNI 184

Query: 193 IAPGYYVSEMTAIIYNNPKIKE-LIKGR-------IPAQRWGRAQDLMGAMVFLASAASD 244
++PG ++M ++ + E +IKG IP ++ + D+ A++FL S +
Sbjct: 185 VSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAG 244

Query: 245 YVNGQLLVIDGGYSI 259
++ L +DGG ++
Sbjct: 245 HITMHNLCVDGGATL 259


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
ECs1273	TRNSINTIMINR	30	0.004	Translocated intimin receptor (Tir) signature.
		>TRNSINTIMINR#Translocated intimin receptor (Tir) signature.

Length = 549

Score = 30.1 bits (67), Expect = 0.004
Identities = 13/40 (32%), Positives = 21/40 (52%), Gaps = 1/40 (2%)

Query: 5 YFLFAGIILCAFIAAILSHIAFHHANEPAEQNISCNAHVI 44
Y L + +I+ I A ++ A H N+PAEQ + H +
Sbjct: 366 YGLSSALIVAGGIGAGVT-TALHRRNQPAEQTTTTTTHTV 404


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
ECs1271	BINARYTOXINA	30	0.025	Clostridial binary toxin A signature.
		>BINARYTOXINA#Clostridial binary toxin A signature.

Length = 454

Score = 29.6 bits (66), Expect = 0.025
Identities = 22/77 (28%), Positives = 36/77 (46%), Gaps = 6/77 (7%)

Query: 335 DQVIKTVVNIIGKSIRPDDLLA--RVGGEEFGVLLTDIDTERAKALAERIRENVERLTGD 392
D + + N + + P +L+ R G +EFG+ LT + + K E I E+ G
Sbjct: 313 DSKVNNIENALKLTPIPSNLIVYRRSGPQEFGLTLTSPEYDFNK--IENIDAFKEKWEGK 370

Query: 393 NPEYAIPQKVTISIGAV 409
Y P ++ SIG+V
Sbjct: 371 VITY--PNFISTSIGSV 385


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
ECs1270	ARGDEIMINASE	30	0.047	Bacterial arginine deiminase signature.
		>ARGDEIMINASE#Bacterial arginine deiminase signature.

Length = 409

Score = 29.8 bits (67), Expect = 0.047
Identities = 27/183 (14%), Positives = 61/183 (33%), Gaps = 23/183 (12%)

Query: 450 WPRAAENELKK-AEVIEPRNINLEVEQAWTALTLQEWQQA--AVLTHDVVEREPQDPGVV 506
+ A E + A +++ + +E + + L ++ ++E E + +
Sbjct: 47 YLEVARQEHEVFASILKNNLVEIEYIEDLISEVLVSSVALENKFISQFILEAEIKTDFTI 106

Query: 507 -RLK---RAVDVHNLAELRIAGSTGIDAEGPDSGKHDVDLTTIVYS---PPLKDNWRGFA 559
LK ++ + N+ I+G E + DL P+ + F
Sbjct: 107 NLLKDYFSSLTIDNMISKMISGVVT--EELKNYTSSLDDLVNGANLFIIDPMPNVL--FT 162

Query: 560 GFGYADGQFSEGKGIVRDWLAGVEWRSRNIWLEAEYAERVFNHEHKPGARLSGWYDFNDN 619
D S G G+ + + + R E +AE +F + + W + +
Sbjct: 163 ----RDPFASIGNGVT---INKMFTKVRQ--RETIFAEYIFKYHPVYKENVPIWLNRWEE 213

Query: 620 WRI 622
+
Sbjct: 214 ASL 216


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
ECs1259	HTHTETR	66	2e-15	TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 65.8 bits (160), Expect = 2e-15
Identities = 32/165 (19%), Positives = 61/165 (36%), Gaps = 8/165 (4%)

Query: 10 GKRSRAVSAKKKAILSAALDTFSQFGFHGTRLEQIAELAGVSKTNLLYYFPSKEALYIAV 69
K + ++ IL AL FSQ G T L +IA+ AGV++ + ++F K L+ +
Sbjct: 3 RKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEI 62

Query: 70 LRQILDIWLAPLKAFREDF--APLAAIKEYIRLKLEVSRDYPQASRLF-CMEMLAGAPLL 126
++ F PL+ ++E + LE + + L +
Sbjct: 63 WELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGE 122

Query: 127 MDELTGDLKSLIDEKSALIAGWVKSG-----KLAPIDPQHLIFMI 166
M + ++L E I +K A + + ++
Sbjct: 123 MAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIM 167


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
ECs1257	ISCHRISMTASE	73	3e-17	Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 72.7 bits (178), Expect = 3e-17
Identities = 43/176 (24%), Positives = 70/176 (39%), Gaps = 23/176 (13%)

Query: 13 TFDPQQSALIVVDMQNAYATPGGYLDLAGFDVSTTRPVIANIQTAVTAARAAGMLIIWFQ 72
DP ++ L++ DMQN + +D S + ANI+ G+ +++
Sbjct: 25 VPDPNRAVLLIHDMQNYF------VDAFTAGASPVTELSANIRKLKNQCVQLGIPVVY-- 76

Query: 73 NGWDEQYVEAGGPGSPNFHKSNALKTMRKQPQLQGKLLAKGSWDYQLVDELVPQPGDIVL 132
PGS N L G L G ++ +++ EL P+ D+VL
Sbjct: 77 ---------TAQPGSQNPDDRALLTDF------WGPGLNSGPYEEKIITELAPEDDDLVL 121

Query: 133 PKPRYSGFFNTPLDSILRSRGIRHLVFTSIATNVCVESTLRDGFFLEYFGVVLEDA 188
K RYS F T L ++R G L+ T I ++ T + F + + DA
Sbjct: 122 TKWRYSAFKRTNLLEMMRKEGRDQLIITGIYAHIGCLVTACEAFMEDIKAFFVGDA 177


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
ECs1245	HOKGEFTOXIC	62	2e-17	Hok/Gef cell toxic protein family signature.
		>HOKGEFTOXIC#Hok/Gef cell toxic protein family signature.

Length = 52

Score = 62.2 bits (151), Expect = 2e-17
Identities = 20/48 (41%), Positives = 34/48 (70%)

Query: 23 QKAMLIALIVICLTVIVTALVTRKDLCEVRIRTGQTEVAVFVDYESEK 70
+ +++ ++++CLT+++ +TRK LCE+R R G EVA F+ YES K
Sbjct: 5 RSSLVWCVLIVCLTLLIFTYLTRKSLCEIRYRDGYREVAAFMAYESGK 52


ORFs having significant similarity with Known Virulence factors
LocusTag	Hits	Score	E-value	Comments
ECs1242	IGASERPTASE	33	0.016	IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 33.5 bits (76), Expect = 0.016
Identities = 44/299 (14%), Positives = 84/299 (28%), Gaps = 49/299 (16%)

Query: 426 YRGRRQAAEETAMRDAETVQQDD----------AAPQPESVDPVAQQRESMQGMNREQLL 475
R Q + T + +Q D A V P A S +
Sbjct: 985 VEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENS 1044

Query: 476 EQYADADMATEGDASAAHR--REAASQLLNELDEQTKRQAVMN---ELKAKPRSELLEEY 530
+Q + E DA+ RE A + + + T+ V E K +E E
Sbjct: 1045 KQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETA 1104

Query: 531 RRLSQKEGRTETEEQQ-----------FQAIREVIRPQQEVTPEAQSQP------ENAED 573
+++ + ETE+ Q Q E ++PQ E P ++ P ++
Sbjct: 1105 TVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAE--PARENDPTVNIKEPQSQT 1162

Query: 574 GNGSIYPTVRFRDPNEVRIEINGNGASRPAERIEK---------VRPDNRYFTDEKSAMG 624
+ + V + + + + +P + K
Sbjct: 1163 NTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNR 1222

Query: 625 SDVFRNAAATGLKPSVVKKGENQYAVEMD------NPAFSEDVATETINTLADGERIAD 677
+ ++P+ + D N S+ A L G+ ++
Sbjct: 1223 HRRSVRSVPHNVEPATTSSNDRSTVALCDLTSTNTNAVLSDARAKAQFVALNVGKAVSQ 1281


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs1236   ENTEROVIROMP  113    2e-33    Enterobacterial virulence outer membrane protein si...
		>ENTEROVIROMP#Enterobacterial virulence outer membrane protein signature.

Length = 171

Score = 113 bits (283), Expect = 2e-33
Identities = 46/167 (27%), Positives = 73/167 (43%), Gaps = 31/167 (18%)

Query: 79 SGYEGKDKNPQGINIRYRYEITDD-FGVITSFTWTRSLTNSQTFIDVQSADHTRKIKNPA 137
S +G+ G N++YRYE + GVI SFT+T K+
Sbjct: 35 SDAQGQMNKMGGFNLKYRYEEDNSPLGVIGSFTYTE--------------------KSRT 74

Query: 138 ASARTDIRANYWSLLAGPSWRVNQYMSLYAMAGMGVAKVSADLKIKDNINSSGGFSESNS 197
AS+ + Y+ + AGP++R+N + S+Y + G+G K + + +
Sbjct: 75 ASSGDYNKNQYYGITAGPAYRINDWASIYGVVGVGYGKFQT----------TEYPTYKHD 124

Query: 198 TKKTSLAWAAGAQFNLNESVTLDVAYEGSGSGDWRTSGVTAGIGLKF 244
T ++ AG QFN E+V LD +YE S AG+G +F
Sbjct: 125 TSDYGFSYGAGLQFNPMENVALDFSYEQSRIRSVDVGTWIAGVGYRF 171


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs1228   CHANLCOLICIN  35     0.001    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 34.7 bits (79), Expect = 0.001
Identities = 30/111 (27%), Positives = 48/111 (43%), Gaps = 4/111 (3%)

Query: 130 KNTQATQSKESAAASAKSASDSAK--TATSRAAEAGQKATDATEAATRAVTAAGNAEESS 187
K TQA Q+ + AA+ A A T R + +A + T + T +A ++
Sbjct: 63 KKTQAEQAARAKAAAEAQAKAKANRDALTQRLKDIVNEALRHNASRTPSATELAHANNAA 122

Query: 188 TRAGESEKAAGADAEKARQHAEKARLAQESAGEILKRAEAATVSAEEARRM 238
+A + EKAR+ AE A A + A + +R E AE R++
Sbjct: 123 MQAEDERLRLAKAEEKARKEAEAAEKAFQEAEQ--RRKEIEREKAETERQL 171


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs1221   CHANLCOLICIN  33     0.004    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 33.1 bits (75), Expect = 0.004
Identities = 21/101 (20%), Positives = 46/101 (45%), Gaps = 10/101 (9%)

Query: 613 QEVAAQQQALQQQQAELQMREMAGRVAKLEADAARAHAAAQRDNASAQREVALTQGQRYV 672
+E A ++A Q+ AE + +E+ + +A+ R A+ + +R AL++ + V
Sbjct: 141 KEAEAAEKAFQE--AEQRRKEIE----REKAETERQLKLAEAEE---KRLAALSEEAKAV 191

Query: 673 DALNQAHTAEIITGVQNMEQEQDVLQQQMLYTLQQRMNEMS 713
+ + + + V M+ E L ++ ++ R EM
Sbjct: 192 EIAQKK-LSAAQSEVVKMDGEIKTLNSRLSSSIHARDAEMK 231


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs1219   RTXTOXIND     31     0.008    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 30.6 bits (69), Expect = 0.008
Identities = 17/91 (18%), Positives = 29/91 (31%), Gaps = 25/91 (27%)

Query: 179 KILKAEQALDRNIARIESIERSLL----------------TLDVLAETAPKLRADRERIN 222
K ++A L +++E IE +L LD L +T + +
Sbjct: 260 KYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELA 319

Query: 223 AARDKLRAETDILTNQRRGVVTPVSDIVSSL 253
++ Q + PVS V L
Sbjct: 320 KNEERQ---------QASVIRAPVSVKVQQL 341


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs1217   PF06291       163    3e-56    Lambda prophage Bor protein
		>PF06291#Lambda prophage Bor protein

Length = 102

Score = 163 bits (413), Expect = 3e-56
Identities = 89/97 (91%), Positives = 91/97 (93%)

Query: 1 MKKMLLATALALLITGCAQQTFTVQNKQTAVAPKETITHHFFVSGIGQKKTVDAAKICGG 60
MKKML + ALA+LITGCAQQTFTV NK TAV PKETITHHFFVSGIGQKKTVDAAKICGG
Sbjct: 6 MKKMLFSAALAMLITGCAQQTFTVGNKPTAVTPKETITHHFFVSGIGQKKTVDAAKICGG 65

Query: 61 TENVVKTETQQTFVNGLLGFITLGIYTPLEARVYCSQ 97
ENVVKTETQQTFVNGLLGFITLGIYTPLEARVYCSQ
Sbjct: 66 AENVVKTETQQTFVNGLLGFITLGIYTPLEARVYCSQ 102


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs1205   SHIGARICIN    144    4e-43    Ribosome inactivating protein family signature.
		>SHIGARICIN#Ribosome inactivating protein family signature.

Length = 289

Score = 144 bits (364), Expect = 4e-43
Identities = 53/277 (19%), Positives = 115/277 (41%), Gaps = 35/277 (12%)

Query: 4 ILFKWVLCLLLGFSSVSYSREFTIDFSTQQSYVSSLNSIRTEISTPLEHISQGTTSVSVI 63
+ +L L L +V F + +T SY ++++R + + + ++
Sbjct: 6 VFSLLILTLFLTAPAVEGDVSFRLSGATSSSYGVFISNLRKALPYERK-----LYDIPLL 60

Query: 64 NHTPPGSYFAVDIRGLDVYQARFDHLRLIIEQNNLYVAGFVNTATNTFYRFSDFT----- 118
T PGS I L Y + + + I+ N+YV G+ A +T Y F++ +
Sbjct: 61 RSTLPGSQRYALIH-LTNYAD--ETISVAIDVTNVYVMGY--RAGDTSYFFNEASATEAA 115

Query: 119 HISVPGVT-TVSMTTDSSYTTLQRVAALERSGMQISRHSLVSSYLALMEFSGNTMTRDAS 177
V++ +Y LQ A R + + +L S+ L ++ N+ A+
Sbjct: 116 KYVFKDAKRKVTLPYSGNYERLQIAAGKIRENIPLGLPALDSAITTLFYYNANS----AA 171

Query: 178 RAVLRFVTVTAEALRFRQIQREFRQALSETAPVYTMTPGDVDLTLNWGRISNVLPEYRGE 237
A++ + T+EA R++ I+++ + + +T + + + L +W +S +
Sbjct: 172 SALMVLIQSTSEAARYKFIEQQIGKRVDKT---FLPSLAIISLENSWSALSKQIQIASTN 228

Query: 238 DGV----------RVGRISFNNISA--ILGTVAVILN 262
+G + R++ N+ A + +A++LN
Sbjct: 229 NGQFETPVVLINAQNQRVTITNVDAGVVTSNIALLLN 265


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs1202   HTHFIS        27     0.004    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 27.1 bits (60), Expect = 0.004
Identities = 10/30 (33%), Positives = 17/30 (56%), Gaps = 4/30 (13%)

Query: 10 DMLVEAYE----NQTEVARILNCSRNTVRK 35
+++ A NQ + A +L +RNT+RK
Sbjct: 439 PLILAALTATRGNQIKAADLLGLNRNTLRK 468


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs1190   DNABINDINGHU  31     0.002    Prokaryotic integration host factor signature.
		>DNABINDINGHU#Prokaryotic integration host factor signature.

Length = 91

Score = 30.8 bits (70), Expect = 0.002
Identities = 10/49 (20%), Positives = 20/49 (40%), Gaps = 3/49 (6%)

Query: 128 MTEATEL---LYSRNGMTATQKYEAIQAIFTQLTDHAKTGSRRGLRSFG 173
M +L + +T A+ A+F+ ++ + G + L FG
Sbjct: 1 MANKQDLIAKVAEATELTKKDSAAAVDAVFSAVSSYLAKGEKVQLIGFG 49


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs1179   UREASE        29     0.006    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 28.6 bits (64), Expect = 0.006
Identities = 18/66 (27%), Positives = 26/66 (39%), Gaps = 7/66 (10%)

Query: 57 IMLAQHALLIAISSDLNAYGVVCEFDWN----DGNGQEGWPPMDGSEGIRITD---IDTS 109
+ LA L I + D +G +F DG GQ G+ IT+ +D
Sbjct: 22 VRLADTELFIEVEKDFTTHGEEVKFGGGKVIRDGMGQSQVTREGGAVDTVITNALILDHW 81

Query: 110 GIFDSD 115
GI +D
Sbjct: 82 GIVKAD 87


57  ECs1134  ECs1058  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
ECs1134   -1      19       3.599609    third cytochrome oxidase subunit I
ECs1133   -1      18       3.880274    hydrogenase-1 operon protein HyaF
ECs1132   -1      20       3.792488    hydrogenase-1 operon protein HyaE
ECs1131   -1      19       3.327888    hydrogenase 1 maturation protease
ECs1130   0       16       2.601425    hydrogenase 1 b-type cytochrome subunit
ECs1129   1       15       2.664508    hydrogenase 1 large subunit
ECs1128   5       23       3.581547    hydrogenase-1 small subunit
ECs1127   5       29       3.665511    *hypothetical protein
ECs1125   5       25       5.095120    hypothetical protein
ECs1126   5       24       5.128354    EspF-like protein
ECs1124   5       26       5.192216    hypothetical protein
ECs1123   5       27       5.814828    tail fiber protein
ECs1122   5       28       5.178586    outer membrane protein
ECs1121   5       29       5.415451    host specificity protein
ECs1120   4       32       4.797923    copper/zinc-superoxide dismutase
ECs1119   3       31       6.371964    hypothetical protein
ECs1118   4       30       6.556865    tail assembly protein
ECs1117   4       31       5.904055    tail assembly protein
ECs1116   2       28       5.651220    minor tail protein
ECs1115   1       26       5.315036    minor tail protein
ECs1114   1       25       5.201414    tail length tape measure protein
ECs1113   1       24       5.272652    minor tail protein
ECs1112   -1      25       5.362786    minor tail protein
ECs1111   -1      26       5.677940    hypothetical protein
ECs1110   1       23       4.543244    major head protein
ECs1109   1       23       5.179136    head decoration protein
ECs1108   2       23       5.305505    head-tail preconnector protein
ECs1107   1       23       4.099749    portal protein
ECs5406   2       21       3.376578    hypothetical protein
ECs1106   2       23       3.227005    terminase large subunit
ECs1104   3       24       3.038673    hypothetical protein
ECs1103   4       26       -2.090721   hypothetical protein
ECs1102   5       31       -5.636259   hypothetical protein
ECs1101   2       27       -4.431858   hypothetical protein
ECs1100   3       28       -4.524692   holin
ECs1099   1       25       -3.308815   hypothetical protein
ECs1097   2       22       -0.963591   hypothetical protein
ECs1098   2       19       0.580341    hypothetical protein
ECs1096   3       21       2.914481    endolysin
ECs1094   3       21       1.103836    hypothetical protein
ECs1093   3       20       -1.464208   endopeptidase
ECs1090   3       24       -2.588186   transposase
ECs1089   2       24       -2.811341   transposase
ECs1088   3       22       -2.278203   hypothetical protein
ECs1087   1       31       -5.150537   ***transcriptional regulator
ECs1085   1       31       -3.915321   hypothetical protein
ECs1084   1       30       -4.618646   anti-termination protein
ECs1083   0       28       -3.965873   crossover junction endodeoxyribonuclease
ECs1082   0       28       -3.401431   hypothetical protein
ECs1081   0       28       -4.115084   hypothetical protein
ECs1080   1       26       -2.978622   prophage maintenance protein
ECs1077   1       25       -3.501866   hypothetical protein
ECs1076   2       20       -0.441513   hypothetical protein
ECs1075   3       19       -0.462851   hypothetical protein
ECs1074   4       23       -1.468112   replication protein
ECs1073   5       28       -2.401566   hypothetical protein
ECs1072   2       35       -5.105762   hypothetical protein
ECs1071   2       32       -4.667243   hypothetical protein
ECs1070   3       41       -7.611868   hypothetical protein
ECs1069   3       39       -8.460156   regulatory protein
ECs1068   2       40       -7.837148   hypothetical protein
ECs1067   1       37       -8.146440   hypothetical protein
ECs1065   1       36       -9.009131   hypothetical protein
ECs1063   1       23       -4.897802   hypothetical protein
ECs1061   1       22       -4.703209   hypothetical protein
ECs1060   0       18       -3.886989   hypothetical protein
ECs1059   0       18       -4.122684   cell division inhibition protein
ECs1058   0       17       -3.572914   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs1123   IGASERPTASE   43     3e-06    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 42.7 bits (100), Expect = 3e-06
Identities = 46/289 (15%), Positives = 91/289 (31%), Gaps = 30/289 (10%)

Query: 11 LKDGTGKPVENCTIQLKARRNSATVVVNTVASENPDE-AGRYSMDVEYGQYSVILLVEGF 69
+ D TG+P N A + + ++ D A +Y + G+Y + +
Sbjct: 928 VADKTGEPNHNELTLFDASKAQRDHLNVSLVGNTVDLGAWKYKLRNVNGRYDL------Y 981

Query: 70 PPSHAGTITVYEDSQPGTLNDFLGAMSEDDVRPEALRRFELMVEEAARHAEEAKKNAGEA 129
P + + T N+ + E + R + A ++
Sbjct: 982 NPEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSE----TT 1037

Query: 130 ETSARNAGISASQAEESAANADTSAGDASESARQAA-ESAAAAKQSEEASSSSASAAAQK 188
ET A N+ + E++ +A + E A++A A + +E A S S + Q
Sbjct: 1038 ETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQT 1097

Query: 189 ASESSQSAAEA------------ELSRKTAESAAGNAARDAT-TATEKARE-----SAES 230
+ E E+ + T++ + + E ARE + +
Sbjct: 1098 TETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKE 1157

Query: 231 AQSAEQSRIAAEEAVNRIPTVVGPPGPKGEPGPAGPQGPKGDKGERGDT 279
QS + E+ + V P + G + + T
Sbjct: 1158 PQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPAT 1206


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs1122   ENTEROVIROMP  137    2e-43    Enterobacterial virulence outer membrane protein si...
		>ENTEROVIROMP#Enterobacterial virulence outer membrane protein signature.

Length = 171

Score = 137 bits (346), Expect = 2e-43
Identities = 64/200 (32%), Positives = 102/200 (51%), Gaps = 30/200 (15%)

Query: 1 MRKLYAAILSAAICLAVSGAPAWASEHQSTLSAGYLHVSTNVPGSDELNGINVKYRYEFT 60
M+K+ A + + A LA + + A+ ST++ GY + + G N+KYRYE
Sbjct: 1 MKKI-ACLSALAAVLAFTAGTSVAA--TSTVTGGYAQSDAQGQMNK-MGGFNLKYRYEED 56

Query: 61 DT-LGMVTSFSYAGDKNRQLTRYSDTRWHEDSVRNRWFSVMAGPSVRVNEWFSAYAMAGV 119
++ LG++ SF+Y T S T D +N+++ + AGP+ R+N+W S Y + GV
Sbjct: 57 NSPLGVIGSFTY--------TEKSRTASSGDYNKNQYYGITAGPAYRINDWASIYGVVGV 108

Query: 120 AYSRVSTFSGDYLRVTDNKGKTHDVLTGSDDGRHSNTSLAWGAGVQFNPIESVAIDIAYE 179
Y + T T+ HD S+ ++GAG+QFNP+E+VA+D +YE
Sbjct: 109 GYGKFQT--------TEYPTYKHD---------TSDYGFSYGAGLQFNPMENVALDFSYE 151

Query: 180 GSGSGDWRTDGFIVGVGYKF 199
S +I GVGY+F
Sbjct: 152 QSRIRSVDVGTWIAGVGYRF 171


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs1114   GPOSANCHOR    30     0.042    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 30.0 bits (67), Expect = 0.042
Identities = 55/235 (23%), Positives = 90/235 (38%), Gaps = 23/235 (9%)

Query: 362 AWNDRENARLGLAAATLQSDMEKAGELAARDR--AERDASQLKYTGEAQKAYERLLTPLE 419
++ ++A++ A + + EL + + L
Sbjct: 239 NFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKA 298

Query: 420 KYTARQEELNKALKDGKILRADYNTLMAAAKKDYESTLKKPKSSGVKVSAGERQE----- 474
+ + LN + LR D + AKK E+ +K + K+S RQ
Sbjct: 299 DLEHQSQVLNANRQS---LRRDLDA-SREAKKQLEAEHQKLEE-QNKISEASRQSLRRDL 353

Query: 475 DQAHAALLALETELRTLEKHSGANEKISQQ-RRDLWKA-ENQYAVLKE-AATKRQLSEQE 531
D + A LE E + LE+ + +E Q RRDL + E + V K +L+ E
Sbjct: 354 DASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQVEKALEEANSKLAALE 413

Query: 532 KF---LLAHKDETLEYKRQLAELGDKVEHQ-KRLNE-LAQQAVRFEEQQSAKQAA 581
K L K T + K AEL K+E + K L E LA+QA + ++ K +
Sbjct: 414 KLNKELEESKKLTEKEK---AELQAKLEAEAKALKEKLAKQAEELAKLRAGKASD 465


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs1111   INTIMIN       32     0.001    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 32.3 bits (73), Expect = 0.001
Identities = 23/119 (19%), Positives = 45/119 (37%), Gaps = 17/119 (14%)

Query: 85 KEVITRTVKVTNVGKPSVAEERSKITPVSAIKVTP-------------TSGTVAKGKTTT 131
++ IT TVKV KP +E + T + + + TS T K +
Sbjct: 675 QDAITYTVKVMKGDKPVSNQEVTFTTTLGKLSNSTEKTDTNGYAKVTLTSTTPGKSLVSA 734

Query: 132 LT--VSFEPESATDKTFRAVSADPSKATI--SVKDMTITVNGVATGKVQIPVVSGNGQF 186
V+ + ++ + F ++ D I + + + G+V + GNG++
Sbjct: 735 RVSDVAVDVKAPEVEFFTTLTIDDGNIEIVGTGVKGKLPTVWLQYGQVNLKASGGNGKY 793


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs1080   HOKGEFTOXIC   66     6e-19    Hok/Gef cell toxic protein family signature.
		>HOKGEFTOXIC#Hok/Gef cell toxic protein family signature.

Length = 52

Score = 66.4 bits (162), Expect = 6e-19
Identities = 19/46 (41%), Positives = 32/46 (69%)

Query: 23 QKAMLIALIVICLTVIVTALVTRKDLCEVRVRTGQTEVAVFTAYEP 68
+ +++ ++++CLT+++ +TRK LCE+R R G EVA F AYE
Sbjct: 5 RSSLVWCVLIVCLTLLIFTYLTRKSLCEIRYRDGYREVAAFMAYES 50


58  ECs0857  ECs0806  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
ECs0857   -1      13       3.483415    excinuclease ABC subunit B
ECs0856   1       14       4.071349    dithiobiotin synthetase
ECs0855   0       14       2.692604    biotin biosynthesis protein BioC
ECs0854   -1      14       2.117978    8-amino-7-oxononanoate synthase
ECs0853   -1      19       -1.644790   biotin synthase
ECs0852   1       26       -3.901269   adenosylmethionine--8-amino-7-oxononanoate
ECs0851   6       45       -9.720144   kinase inhibitor protein
ECs0850   9       49       -10.534512  hypothetical protein
ECs0849   8       42       -6.095732   hypothetical protein
ECs0848   6       37       -5.127671   hypothetical protein
ECs0847   3       23       0.023190    hypothetical protein
ECs0846   2       25       1.718809    hypothetical protein
ECs0845   4       27       4.975419    hypothetical protein
ECs0844   3       26       5.280189    tail fiber protein
ECs0843   3       25       4.769135    outer membrane protein
ECs0842   2       22       3.887927    host specificity protein
ECs0841   3       24       3.440703    tail assembly protein
ECs0840   3       27       2.789800    tail assembly protein
ECs0839   3       25       2.494879    minor tail protein
ECs0838   4       26       2.167722    minor tail protein
ECs0837   4       26       2.476488    tail length tape measure protein
ECs0836   6       28       3.027763    minor tail protein
ECs0835   5       26       2.508275    minor tail protein
ECs0834   5       20       1.314712    major tail protein
ECs0833   5       22       0.850220    minor tail protein
ECs0832   3       22       1.388395    minor tail protein
ECs0831   3       22       1.115806    hypothetical protein
ECs0830   3       19       1.746683    hypothetical protein
ECs0829   4       19       1.704379    protease/scaffold protein
ECs0828   3       20       2.158342    hypothetical protein
ECs0827   3       19       1.917638    portal protein
ECs0826   3       20       1.240863    hypothetical protein
ECs0825   2       18       0.922202    terminase large subunit
ECs0824   2       25       -1.597307   hypothetical protein
ECs0823   1       29       -6.170745   hypothetical protein
ECs0822   1       31       -6.057627   hypothetical protein
ECs0820   2       34       -7.801454   endopeptidase
ECs0819   2       32       -8.570330   endolysin
ECs0818   2       33       -8.654860   holin
ECs0816   3       33       -8.567465   hypothetical protein
ECs0815   3       30       -6.896763   anti-termination protein
ECs0814   2       26       -5.815304   outer membrane protein
ECs0813   2       24       -2.048830   serine/threonin protein phosphatase
ECs0812   1       27       -0.860713   protein NinG
ECs0811   1       27       -0.987856   protein NinF
ECs0810   0       27       -1.168600   protein NinE
ECs5398   2       27       -1.060405   hypothetical protein
ECs0809   2       31       -4.533679   exonuclease
ECs0808   4       38       -7.531033   hypothetical protein
ECs0807   5       39       -7.394144   hypothetical protein
ECs0806   2       22       -2.819032   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0848   YERSSTKINASE  29     0.027    Yersinia serine/threonine protein kinase signature.
		>YERSSTKINASE#Yersinia serine/threonine protein kinase signature.

Length = 732

Score = 28.9 bits (64), Expect = 0.027
Identities = 19/66 (28%), Positives = 32/66 (48%), Gaps = 3/66 (4%)

Query: 200 RMDKINGESLLNISSLPAQAEHAIYDMFDRLEQKGILFVDTTETNVLYDRAKNEFNPIDI 259
+ KIN E+ A H + D+ + L + G++ D NV++DRA E ID+
Sbjct: 234 KQGKINSEAYWGTIKFIA---HRLLDVTNHLAKAGVVHNDIKPGNVVFDRASGEPVVIDL 290

Query: 260 SSYNVS 265
++ S
Sbjct: 291 GLHSRS 296


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0844   CHANLCOLICIN  33     0.002    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 33.1 bits (75), Expect = 0.002
Identities = 36/130 (27%), Positives = 55/130 (42%), Gaps = 15/130 (11%)

Query: 131 SARNAGISASKAEASAANADTSAEDASESARQAAESAASAKKSEEASSSSAS-------- 182
S G SK+E+SAA T+ ++ + AE AA AK + EA + + +
Sbjct: 34 SGGGGGKGGSKSESSAAIHATAKWSTAQLKKTQAEQAARAKAAAEAQAKAKANRDALTQR 93

Query: 183 ------EAAQKASESLQSATDAELSKKTAESAAGNAARDATTSTEKARESAESAQSAEQS 236
EA + + SAT+ + A A R A + EKAR+ AE+A+ A Q
Sbjct: 94 LKDIVNEALRHNASRTPSATELAHANNAAMQAEDERLRLA-KAEEKARKEAEAAEKAFQE 152

Query: 237 RIAAEDAVNR 246
+ R
Sbjct: 153 AEQRRKEIER 162


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0843   ENTEROVIROMP  144    2e-46    Enterobacterial virulence outer membrane protein si...
		>ENTEROVIROMP#Enterobacterial virulence outer membrane protein signature.

Length = 171

Score = 144 bits (365), Expect = 2e-46
Identities = 66/201 (32%), Positives = 100/201 (49%), Gaps = 32/201 (15%)

Query: 1 MRKVCAAILSAAICLAVSGVPAWASEHQSTLSAGYLHASTDAPG-SDDLNGINVKYRYEF 59
M+K+ AA+ +G A ST++ GY A +DA G + + G N+KYRYE
Sbjct: 1 MKKIACLSALAAVLAFTAGTSVAA---TSTVTGGY--AQSDAQGQMNKMGGFNLKYRYEE 55

Query: 60 TDT-LGLITSFSYANAEDEQKTHYSDTRWHEDYVRNRWFSVMAGPSVRVNEWFSAYAMAG 118
++ LG+I SF+Y T S T DY +N+++ + AGP+ R+N+W S Y + G
Sbjct: 56 DNSPLGVIGSFTY--------TEKSRTASSGDYNKNQYYGITAGPAYRINDWASIYGVVG 107

Query: 119 VAYSRVSTFSGDYFRVTDNKRKTHDVLTGSDDARYSNTSLAWGAGVQFNPTESVAVDVAY 178
V Y + T + S+ ++GAG+QFNP E+VA+D +Y
Sbjct: 108 VGYGKFQT-------------TEYPTYKHDT----SDYGFSYGAGLQFNPMENVALDFSY 150

Query: 179 EGSGSGDWRTDGFIVGVGYKF 199
E S +I GVGY+F
Sbjct: 151 EQSRIRSVDVGTWIAGVGYRF 171


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0842   SURFACELAYER  33     0.005    Lactobacillus surface layer protein signature.
		>SURFACELAYER#Lactobacillus surface layer protein signature.

Length = 439

Score = 33.5 bits (76), Expect = 0.005
Identities = 34/143 (23%), Positives = 45/143 (31%), Gaps = 30/143 (20%)

Query: 965 SVNANSGTLNNVTVNENCTIKGMLEATQV----RGDF---------VKAVSKSFPKQAGT 1011
+ + L NVT + +K L+A ++ G F VKA S K A
Sbjct: 235 AAQYDKKQLTNVTFDTETAVKDALKAQKIEVSSVGYFKAPHTFTVNVKATSNKNGKSATL 294

Query: 1012 WGNTETPNGTVTVTISDDHNFDRQIIIPPIIFNGIAYSDPGSGNNPGGTRYTGYGFEVRK 1071
PN V S I+ N Y + G R
Sbjct: 295 PVTVTVPNVADPVVPSQSKT---------IMHNAYFYDKDA--------KRVGTDKVTRY 337

Query: 1072 NGVLIASRETKGAIPGSYSAVID 1094
N V +A TK A SY VI+
Sbjct: 338 NTVTVAMNTTKLANGISYYEVIE 360


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0837   cloacin       44     3e-06    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 43.5 bits (102), Expect = 3e-06
Identities = 34/142 (23%), Positives = 62/142 (43%), Gaps = 4/142 (2%)

Query: 519 DQQRLNDLQEKKRQKDLQDAK--EQAERNYQEQQKRRNAENAALNRMNETEAARHQREIA 576
DQ + +E +RQ++ E AERNY+ + N N + R E +A Q +
Sbjct: 294 DQVKQRQDEENRRQQEWDATHPVEAAERNYERARAELNQANEDVARNQERQAKAVQVYNS 353

Query: 577 RINAMQYADQAVRDA-AIQRENERYEKALASGKKKTRETRNDEATRLLLQYSQQQAQVEG 635
R + + A++ + DA A ++ R+ +G + + +A R + +QA +
Sbjct: 354 RKSELDAANKTLADAIAEIKQFNRFAHDPMAGGHRMWQMAGLKAQRAQTDVNNKQAAFDA 413

Query: 636 QIAAARQSAGIATERMTEARKQ 657
A + A A E+RK+
Sbjct: 414 -AAKEKSDADAALSSAMESRKK 434


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0812   TYPE4SSCAGX   29     0.012    Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein signature.

Length = 522

Score = 29.0 bits (64), Expect = 0.012
Identities = 14/41 (34%), Positives = 30/41 (73%), Gaps = 2/41 (4%)

Query: 36 KIALERRSKEREKAEKAEKAAEKKRRREEQKQKDKLKIQKL 76
K ALE+ + +E+A+KA+K +K+ +R+E++ K++ ++ L
Sbjct: 145 KKALEKEKEAKEQAQKAQK--DKREKRKEERAKNRANLENL 183


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0808   TCRTETB       24     0.037    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 24.1 bits (52), Expect = 0.037
Identities = 7/23 (30%), Positives = 11/23 (47%)

Query: 10 VGTITFVYSVTKRGWVFPGLSVI 32
VG + F+ T F +SV+
Sbjct: 209 VGIVFFMLFTTSYSISFLIVSVL 231


59  ECs0779  ECs5391  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
ECs0779   2       17       -0.420888   protein PnuC
ECs0778   2       19       0.074108    quinolinate synthetase
ECs0777   2       20       0.044262    ******tol-pal system protein YbgF
ECs0776   2       22       0.221788    peptidoglycan-associated outer membrane
ECs0775   2       18       0.157955    translocation protein TolB
ECs0774   6       16       -0.458893   cell envelope integrity inner membrane protein
ECs0773   0       19       0.062527    colicin uptake protein TolR
ECs0772   0       22       0.512631    colicin uptake protein TolQ
ECs0771   2       20       -2.136960   acyl-CoA thioester hydrolase YbgC
ECs0770   1       21       -3.360476   hypothetical protein
ECs5391   2       22       -3.324377   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0777   ACRIFLAVINRP  29     0.015    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 29.4 bits (66), Expect = 0.015
Identities = 14/72 (19%), Positives = 28/72 (38%), Gaps = 4/72 (5%)

Query: 24 AFAQAPISSVGSGSVEDRVIQLERISNAHSQLLTQLQQQLS---DNQSDIDSLRGQIQEN 80
F I +G+ + D + ++ H L Q L + + + S+R E+
Sbjct: 664 PFNMPAIVELGTATGFDFELI-DQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNGLED 722

Query: 81 QYQLNQVVERQK 92
Q V+++K
Sbjct: 723 TAQFKLEVDQEK 734


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0776   OMPADOMAIN    116    5e-34    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 116 bits (292), Expect = 5e-34
Identities = 35/119 (29%), Positives = 54/119 (45%), Gaps = 4/119 (3%)

Query: 55 EEQARLQMQQLQQNNIVYFDLDKYDIRSDFAQMLDAHANFLRSN--PSYKVTVEGHADER 112
+Q + + V F+ +K ++ + LD + L + V V G+ D
Sbjct: 205 APAPEVQTKHFTLKSDVLFNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRI 264

Query: 113 GTPEYNISLGERRANAVKMYLQGKGVSADQISIVSYGKEKPAVLGHDEAAYSKNRRAVL 171
G+ YN L ERRA +V YL KG+ AD+IS G+ P V G+ K R A++
Sbjct: 265 GSDAYNQGLSERRAQSVVDYLISKGIPADKISARGMGESNP-VTGN-TCDNVKQRAALI 321


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0774   IGASERPTASE   54     6e-10    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 54.3 bits (130), Expect = 6e-10
Identities = 33/215 (15%), Positives = 69/215 (32%), Gaps = 1/215 (0%)

Query: 66 RMQSQESSAKRSDEQRKMKEQQAAEELREKQAAEQERLKQLEKERLAAQEQKKQAEEAAK 125
R ++E+ + + + Q+ E +E Q E + +EKE A E +K E
Sbjct: 1066 REVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKV 1125

Query: 126 QAELKQKQAEEAAAKAAADAKAKAEADDKAAEEAAKKAAADAKKKAEAEAAKAAAEAQKK 185
+++ KQ + + A+ + + +E + A + A+ + E
Sbjct: 1126 TSQVSPKQEQSETVQPQAEPARENDP-TVNIKEPQSQTNTTADTEQPAKETSSNVEQPVT 1184

Query: 186 AEAAAAALKKKAEAAEAAAAEARKKAAAEKAAADKKAAEKAAAEKAAADKKAAAEKAAAD 245
E E + +++ K + + + + A +
Sbjct: 1185 ESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDR 1244

Query: 246 KKAAAAKAAAEKAAAAKAAAEADDIFGELSSGKNA 280
A + A + A A F L+ GK
Sbjct: 1245 STVALCDLTSTNTNAVLSDARAKAQFVALNVGKAV 1279



Score = 53.9 bits (129), Expect = 7e-10
Identities = 26/169 (15%), Positives = 57/169 (33%), Gaps = 2/169 (1%)

Query: 99 EQERLKQLEKERLAAQEQKKQAEEAAKQAELKQKQAEEAAAKAAADAKAKAEADDKAAEE 158
E E+ Q QA+ + + ++ + A +E + AE
Sbjct: 984 EVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAEN 1043

Query: 159 AAKKAAADAKKKAEAEAAKAAAEAQKKAEAAAAALKKKAEAAEAAAAEARKKAAAEKAAA 218
+ +++ +K E +A + A+ ++ A+ A + +K + E A + + K
Sbjct: 1044 SKQESK--TVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETK 1101

Query: 219 DKKAAEKAAAEKAAADKKAAAEKAAADKKAAAAKAAAEKAAAAKAAAEA 267
+ EK K +K K + ++ + A A
Sbjct: 1102 ETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPAREND 1150



Score = 52.0 bits (124), Expect = 3e-09
Identities = 23/162 (14%), Positives = 58/162 (35%), Gaps = 8/162 (4%)

Query: 86 QQAAEELREKQAAEQERLKQLEKERLAAQEQKKQAEEAAKQAELKQKQAEEAAAKAAADA 145
+ AE +++ + K + + ++ A+EA + + E A + +
Sbjct: 1038 ETVAENSKQESKTVE---KNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKE 1094

Query: 146 KAKAEADDKAAEEAAKKAAADAKKKAEAEAAKAAAEAQKKAEAAAAALKKKAEAAEAAAA 205
E + A E +KA + +K E K ++ K E + + A E
Sbjct: 1095 TQTTETKETATVEKEEKAKVETEKT--QEVPKVTSQVSPKQEQSETVQPQAEPARENDPT 1152

Query: 206 EARKKAAAEKAAADKKAAEKAAAEKAAADKKAAAEKAAADKK 247
K+ ++ + A + A++ +++ + ++
Sbjct: 1153 VNIKEPQSQT---NTTADTEQPAKETSSNVEQPVTESTTVNT 1191



Score = 52.0 bits (124), Expect = 3e-09
Identities = 24/192 (12%), Positives = 66/192 (34%), Gaps = 5/192 (2%)

Query: 68 QSQESSAKRSDEQRKMKEQQAAEELREKQAAEQERLKQLEKERLAAQEQKKQAEEAAKQA 127
Q+ S ++E+ ++ +E ++ + +K E+ A +
Sbjct: 1004 QADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKN--EQDATET 1061

Query: 128 ELKQKQ-AEEAAAKAAADAKAKAEADDKAAEEAAKKAAADAKKKAEAEAAKAAAEAQKKA 186
+ ++ A+EA + A+ + E +E + + + KA E +K
Sbjct: 1062 TAQNREVAKEAKSNVKANTQT-NEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQ 1120

Query: 187 EAAAAALKKKAEAAEAAAAEARKKAAAEKAAADKKAAEKAAAEKAAADKKAAAEKAAADK 246
E + + ++ + + + A E E + AD + A++ +++
Sbjct: 1121 EVPKVTSQVSPKQEQSETVQPQAEPARENDPT-VNIKEPQSQTNTTADTEQPAKETSSNV 1179

Query: 247 KAAAAKAAAEKA 258
+ ++
Sbjct: 1180 EQPVTESTTVNT 1191



Score = 51.6 bits (123), Expect = 4e-09
Identities = 31/187 (16%), Positives = 58/187 (31%), Gaps = 8/187 (4%)

Query: 87 QAAEELREKQAAEQERLKQLEKERLAAQEQKKQAEEAAKQAELKQKQAEEAAAKAAADAK 146
QA E R+ + A + E A+ KQ + K DA
Sbjct: 1004 QADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAEN----SKQESKTVEKNEQDAT 1059

Query: 147 AKAEADDKAAEEAAKKAAADAKKKAEAEAAKAAAEAQKKA--EAAAAALKKKAEAAEAAA 204
+ + A+EA A+ + A++ E Q E A ++KA+
Sbjct: 1060 ETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKT 1119

Query: 205 AEARKKAAAEKAAADKKAAEKAAAEKAAADKKA--AAEKAAADKKAAAAKAAAEKAAAAK 262
E K + ++ + AE A + E + A + A++ ++
Sbjct: 1120 QEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNV 1179

Query: 263 AAAEADD 269
+
Sbjct: 1180 EQPVTES 1186



Score = 50.1 bits (119), Expect = 1e-08
Identities = 29/251 (11%), Positives = 84/251 (33%), Gaps = 19/251 (7%)

Query: 51 DAVMVDSGAVVEQYKRMQSQESSAKRSDEQRKMKEQQAAE-ELREKQAAEQER------L 103
D V A + ++ ++K+ + + EQ A E + ++ A++ +
Sbjct: 1021 DEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANT 1080

Query: 104 KQLEKERLAAQEQKKQAEEAAKQAELKQKQAE--------EAAAKAAADAKAK---AEAD 152
+ E + ++ ++ Q K+ +K+ + + K + K +E
Sbjct: 1081 QTNEVAQSGSETKETQ-TTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETV 1139

Query: 153 DKAAEEAAKKAAADAKKKAEAEAAKAAAEAQKKAEAAAAALKKKAEAAEAAAAEARKKAA 212
AE A + K+ +++ A Q E ++ + E+ + +
Sbjct: 1140 QPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENP 1199

Query: 213 AEKAAADKKAAEKAAAEKAAADKKAAAEKAAADKKAAAAKAAAEKAAAAKAAAEADDIFG 272
A + + + ++ + ++ A ++ +++ A + +
Sbjct: 1200 ENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCDLTSTNTNA 1259

Query: 273 ELSSGKNAPKT 283
LS + +
Sbjct: 1260 VLSDARAKAQF 1270


60  ECs0730  ECs0718  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
ECs0730   -1      19       4.568835    hypothetical protein
ECs0729   0       20       5.582550    protein RhsC
ECs0728   -2      17       4.455787    hypothetical protein
ECs0727   -2      17       4.646344    potassium-transporting ATPase subunit F
ECs0726   -2      14       3.021471    potassium-transporting ATPase subunit A
ECs0725   -2      12       2.364048    potassium-transporting ATPase subunit B
ECs0724   -3      10       2.115104    potassium-transporting ATPase subunit C
ECs0723   -3      10       1.834094    sensor protein KdpD
ECs0722   -2      15       -0.099119   KdpE family transcriptional regulator
ECs0721   -2      17       -0.503438   ornithine decarboxylase
ECs0720   -1      19       0.271586    putrescine transporter
ECs0719   0       22       0.300237    phosphoglucomutase
ECs0718   2       19       -1.695142   replication initiation regulator SeqA
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0723   PF06580       32     0.012    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 31.8 bits (72), Expect = 0.012
Identities = 10/48 (20%), Positives = 21/48 (43%), Gaps = 4/48 (8%)

Query: 785 LLENAVKYAGAQAE----IGIDAHVEGENLQLDVWDNGPGLPPGQEQT 828
L+EN +K+ AQ I + + + L+V + G +++
Sbjct: 263 LVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKES 310


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0722   HTHFIS        92     1e-23    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 91.8 bits (228), Expect = 1e-23
Identities = 35/125 (28%), Positives = 58/125 (46%), Gaps = 1/125 (0%)

Query: 2 TNVLIVEDEQAIRRFLRTALEGDGMRVYEAETLQRGLLEAATRKPDLIILDLGLPDGDGI 61
+L+ +D+ AIR L AL G V A DL++ D+ +PD +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 62 EFIRDLRQWSA-VPVIVLSARSEESDKIAALDAGADDYLSKPFGIGELQARLRVALRRHS 120
+ + +++ +PV+V+SA++ I A + GA DYL KPF + EL + AL
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 121 ATAAP 125
+
Sbjct: 124 RRPSK 128


61  ECs0647  ECs0629  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
ECs0647   -2      14       -3.159942   oxidoreductase
ECs0646   -2      18       -3.769332   hypothetical protein
ECs0645   -2      16       -3.967701   alkyl hydroperoxide reductase
ECs0644   -2      14       -3.998407   alkyl hydroperoxide reductase
ECs0643   -2      17       -2.731558   disulfide isomerase/thiol-disulfide oxidase
ECs0642   -2      19       -2.356055   LysR family transcriptional regulator
ECs0641   -2      14       1.610479    hypothetical protein
ECs0640   -1      17       3.196225    hypothetical protein
ECs0639   -1      17       4.435700    aminotransferase
ECs0638   -1      19       4.711455    hypothetical protein
ECs5382   -2      21       4.691577    hypothetical protein
ECs0637   -2      20       4.807756    carbon starvation protein
ECs0636   -2      16       4.506520    hypothetical protein
ECs0635   -1      16       4.764301    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase
ECs0634   -1      16       5.266846    2,3-dihydro-2,3-dihydroxybenzoate synthetase
ECs0633   0       15       5.540967    2,3-dihydroxybenzoate-AMP ligase
ECs0632   0       15       5.518811    isochorismate synthase
ECs0631   1       14       3.374703    iron-enterobactin transporter periplasmic
ECs0630   1       14       3.855706    enterobactin exporter EntS
ECs0629   0       12       3.417951    iron-enterobactin transporter membrane protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0643   BCTLIPOCALIN  29     0.014    Bacterial lipocalin signature.
		>BCTLIPOCALIN#Bacterial lipocalin signature.

Length = 171

Score = 28.8 bits (64), Expect = 0.014
Identities = 18/98 (18%), Positives = 39/98 (39%), Gaps = 13/98 (13%)

Query: 30 QGITIIKTFDAPGGMKGYLGKYQDMGVTIYLTPDGKHAISG--YMYNEKGENLSNTLIEK 87
+ + + F+ YLGK+ ++ + G ++ + N+ G ++ N
Sbjct: 21 ESVKPVSDFEL----NNYLGKWYEVARLDHSFERGLSQVTAEYRVRNDGGISVLN----- 71

Query: 88 EIYAPAGREMWQRMEQSHWLLDGKKDAPVIVYVFADPF 125
Y+ + W+ E + ++G D + V F PF
Sbjct: 72 RGYSEE-KGEWKEAEGKAYFVNGSTDGYLKVSFFG-PF 107


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0635   DHBDHDRGNASE  362    e-130    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 362 bits (930), Expect = e-130
Identities = 110/258 (42%), Positives = 150/258 (58%), Gaps = 20/258 (7%)

Query: 5 GKNVWVTGAGKGIGYATALAFVEAGAKVTGFD---------------QAFAQEQYPFATE 49
GK ++TGA +GIG A A GA + D +A E +P
Sbjct: 8 GKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFP---- 63

Query: 50 VMDVADAAQVAQVCQRLLAETERLDVLVNAAGILRMGATDQLSKEDWQQTFAVNVGGAFN 109
DV D+A + ++ R+ E +D+LVN AG+LR G LS E+W+ TF+VN G FN
Sbjct: 64 -ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFN 122

Query: 110 LFQQTMNQFRRQRGGAIVTVASDAAHTPRIGMSAYGASKAALKSLALSVGLELAGSGVRC 169
+ +R G+IVTV S+ A PR M+AY +SKAA +GLELA +RC
Sbjct: 123 ASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRC 182

Query: 170 NVVSPGSTDTDMQRTLWVSDDAEEQRIRGFGEQFKLGIPLGKIARPQEIANTILFLASDL 229
N+VSPGST+TDMQ +LW ++ EQ I+G E FK GIPL K+A+P +IA+ +LFL S
Sbjct: 183 NIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQ 242

Query: 230 ASHITLQDIVVDGGSTLG 247
A HIT+ ++ VDGG+TLG
Sbjct: 243 AGHITMHNLCVDGGATLG 260


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0634   ISCHRISMTASE  444    e-161    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 444 bits (1142), Expect = e-161
Identities = 146/299 (48%), Positives = 195/299 (65%), Gaps = 18/299 (6%)

Query: 1 MAIPKLQAYALPESHDIPQNKVDWAFEPQRAALLIHDMQDYFVSFWGENCPMMEQVIANI 60
MAIP +Q Y +P + D+PQNKV W +P RA LLIHDMQ+YFV + + ++ ANI
Sbjct: 1 MAIPAIQPYQMPTASDMPQNKVSWVPDPNRAVLLIHDMQNYFVDAFTAGASPVTELSANI 60

Query: 61 AALRDYCKQHNIPVYYTAQPKEQSDEDRALLNDMWGPGLTRSPEQQKVVDRLTPDADDTV 120
L++ C Q IPV YTAQP Q+ +DRALL D WGPGL P ++K++ L P+ DD V
Sbjct: 61 RKLKNQCVQLGIPVVYTAQPGSQNPDDRALLTDFWGPGLNSGPYEEKIITELAPEDDDLV 120

Query: 121 LVKWRYSAFHRSPLEQMLKESGRNQLIITGVYAHIGCMTTATDAFMRDIKPFMVADALAD 180
L KWRYSAF R+ L +M+++ GR+QLIITG+YAHIGC+ TA +AFM DIK F V DA+AD
Sbjct: 121 LTKWRYSAFKRTNLLEMMRKEGRDQLIITGIYAHIGCLVTACEAFMEDIKAFFVGDAVAD 180

Query: 181 FSRDEHLMSLKYVAGRSGRVVMTEELL------PAPIPASKA-----------ALREVIL 223
FS ++H M+L+Y AGR VMT+ LL PA + + A +R+ I
Sbjct: 181 FSLEKHQMALEYAAGRCAFTVMTDSLLDQLQNAPADVQKTSANTGKKNVFTCENIRKQIA 240

Query: 224 PLLDESDEPFDDD-NLIDYGLDSVRMMALAARWRKVHGDIDFVMLAKNPTIDAWWKLLS 281
LL E+ E D +L+D GLDSVR+M L +WR+ ++ FV LA+ PTI+ W KLL+
Sbjct: 241 ELLQETPEDITDQEDLLDRGLDSVRIMTLVEQWRREGAEVTFVELAERPTIEEWQKLLT 299


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0631   FERRIBNDNGPP  64     1e-13    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 63.8 bits (155), Expect = 1e-13
Identities = 61/285 (21%), Positives = 101/285 (35%), Gaps = 35/285 (12%)

Query: 40 HTLESQPQRIVSTSVTLTGSLLAIDAPVIASGATTPNNRVADDQGFLRQWSKVAKERKLQ 99
H P RIV+ LLA+ VAD + R W E L
Sbjct: 29 HAAAIDPNRIVALEWLPVELLLALGIVPYG---------VADTINY-RLW---VSEPPLP 75

Query: 100 RLYIG-----EPSTEAVAAQMPDLILISATGGDSALALYDQLSTIAPTLIINYDDKSWQA 154
I EP+ E + P ++ SA G S + L+ IAP N+ D
Sbjct: 76 DSVIDVGLRTEPNLELLTEMKPSFMVWSAGYGPS----PEMLARIAPGRGFNFSDGKQPL 131

Query: 155 L-----LTQLGEITGHEKQAAERIAQFDKQLAAAKEQIKLPPQPVTAIVYTAAAHSANLW 209
LT++ ++ + A +AQ++ + + K + + ++
Sbjct: 132 AMARKSLTEMADLLNLQSAAETHLAQYEDFIRSMKPRFVKRGARPLLLTTLIDPRHMLVF 191

Query: 210 TPESAQGQMLEQLGFTLAKLPAGLNASQSQGKRHDIIQLGGENLAAGLNGESLFLFAGDQ 269
P S ++L++ G NA Q + + + LAA + + L +
Sbjct: 192 GPNSLFQEILDEYGIP--------NAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDHDNS 243

Query: 270 KDADAIYANPLLAHLPAVQNKQVYALGTETFRLDYYSAMQVLERL 314
KD DA+ A PL +P V+ + + F SAM + L
Sbjct: 244 KDMDALMATPLWQAMPFVRAGRFQRVPAVWFYGATLSAMHFVRVL 288


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0630   TCRTETA       36     2e-04    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 36.0 bits (83), Expect = 2e-04
Identities = 82/394 (20%), Positives = 145/394 (36%), Gaps = 38/394 (9%)

Query: 27 FISIVSLGLLGVAVPVQIQMMTHSTWQV---GLSVTLTGGAMFVGLMVGGVLADRYERKK 83
+ V +GL+ +P ++ + HS G+ + L F V G L+DR+ R+
Sbjct: 15 ALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRP 74

Query: 84 VILLARGTCGIGFIGLCLNALL--PEPSLLAIYLLGLWDGFFASLGVTALLAATPALVGR 141
V+L + G ++ + P L +Y+ + G + G A A +
Sbjct: 75 VLL-------VSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVA-GAYIADITDG 126

Query: 142 ENLMQAGAITMLTVRLGSVISPMIGGLLLATGGVAWNYGLAAAGTFITLLPLLSLPALPP 201
+ + G V P++GGL+ GG + + AA L L LP
Sbjct: 127 DERARHFGFMSACFGFGMVAGPVLGGLM---GGFSPHAPFFAAAALNGLNFLTGCFLLPE 183

Query: 202 PPQPREHPLK----SLLAGFRFLLASPLVGGIALLGGLLTMAS----AVRVLYPALADNW 253
+ PL+ + LA FR+ +V + + ++ + A+ V++ D +
Sbjct: 184 SHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFG--EDRF 241

Query: 254 QMSAAQIGFLYAAIP-LGAAIGALTSGKLAHSARPGLLMLLSTLGS---FLAIGLFGLMP 309
A IG AA L + A+ +G +A ++L + ++ +
Sbjct: 242 HWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGW 301

Query: 310 MWILGVVCLALFGWLSAVSSLLQYTMLQTQTPEAMLGRINGLWTAQNVTGDAIGAALLGG 369
M +V LA G ML Q E G++ G A +G L
Sbjct: 302 MAFPIMVLLASGGIGMPALQ----AMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTA 357

Query: 370 LGAMMTPVASASASGFGLLIIGVLLLLVLVELRR 403
+ A + + +G+ + L LL L LRR
Sbjct: 358 IYA----ASITTWNGWAWIAGAALYLLCLPALRR 387


62  ECs0610  ECs0559  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
ECs0610   -1      22       4.230711   copper/silver efflux system outer membrane
ECs0609   0       23       3.736514   CusR family transcriptional regulator
ECs0608   1       23       3.500785   sensor kinase CusS
ECs0607   3       25       3.333424   hypothetical protein
ECs0606   3       25       1.070599   hypothetical protein
ECs0605   0       20       0.931012   Rhs core protein with extension
ECs0604   -2      12       -1.326012  hypothetical protein
ECs0603   -2      11       -0.132739  Rhs core protein
ECs5380   -1      11       -0.909049  hypothetical protein
ECs0602   -1      11       -1.482640  H repeat-containing protein
ECs0601   0       12       -0.278261  bacteriophage N4 adsorption protein B
ECs0600   0       16       -1.474495  bacteriophage N4 receptor, outer membrane
ECs0599   2       23       -4.759031  hypothetical protein
ECs0598   2       26       -6.069855  EnvY
ECs0597   2       28       -6.240471  *transcriptional regulator FimZ
ECs0596   2       24       -4.069472  fimbrial protein
ECs0595   1       23       -3.948313  protein SfmH
ECs0594   1       20       -3.039797  outer membrane protein
ECs0593   -1      16       -0.484752  chaperone
ECs0592   2       19       0.665195   fimbrial-like protein
ECs0591   3       20       1.826442   bifunctional 5,10-methylene-tetrahydrofolate
ECs0590   3       21       2.109432   hypothetical protein
ECs0589   2       20       2.755720   hypothetical protein
ECs0588   0       19       3.339450   cysteinyl-tRNA synthetase
ECs0587   1       15       3.725827   peptidyl-prolyl cis-trans isomerase B
ECs0586   1       16       4.828380   UDP-2,3-diacylglucosamine hydrolase
ECs0585   2       16       5.068872   5-(carboxyamino)imidazole ribonucleotide mutase
ECs0584   1       17       4.357057   5-(carboxyamino)imidazole ribonucleotide
ECs0583   1       17       3.425717   carbamate kinase
ECs0582   2       16       2.121602   carboxylase
ECs0581   3       16       2.216999   hypothetical protein
ECs0580   4       18       1.304570   acyl-CoA synthetase FdrA
ECs0579   5       16       -0.487383  ureidoglycolate dehydrogenase
ECs0578   5       17       -1.473808  allantoate amidohydrolase
ECs0577   4       17       -1.765814  hypothetical protein
ECs0576   4       19       -1.947820  glycerate kinase
ECs0575   1       17       -2.058142  purine permease YbbY
ECs0572   -1      17       -1.973904  allantoin permease
ECs0571   -2      18       -1.305875  hypothetical protein
ECs0570   -3      16       -0.285327  2-hydroxy-3-oxopropionate reductase
ECs0569   -3      13       0.515074   hydroxypyruvate isomerase
ECs0568   -3      11       1.396306   glyoxylate carboligase
ECs0567   -2      15       -0.506702  DNA-binding transcriptional repressor AllR
ECs0566   1       25       4.987794   ureidoglycolate hydrolase
ECs0565   -1      23       5.612143   AllS family transcriptional regulator
ECs0564   0       24       6.126341   tRNA 2-selenouridine synthase
ECs0563   0       25       6.107284   hypothetical protein
ECs0561   0       25       6.152253   hypothetical protein
ECs0560   0       26       6.805841   protein RhsD
ECs0559   1       19       3.441054   oxidoreductase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0610   RTXTOXIND     38     9e-05    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 37.5 bits (87), Expect = 9e-05
Identities = 24/182 (13%), Positives = 59/182 (32%), Gaps = 13/182 (7%)

Query: 254 QAQTVNSDSLQSVKLPA-GLSSQILLQRPDIMEAEHALM-----AANANIGAARAAFFPS 307
+ +S + +K + +I+++ + + L+ A A+ ++
Sbjct: 87 NGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQS----- 141

Query: 308 ISLTSGISTASSDLSSLFNASSGMWNFIPKIEIPIFNAGRNQANLDIAEIRQQQSVVNYE 367
SL + + + + P F + L + + ++Q
Sbjct: 142 -SLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQN 200

Query: 368 QKIQNAFKEVADALALRQSLNDQISAQQRYLASLQITLQRARALYQHGAVSYLEVLDAER 427
QK Q + A R ++ +I+ + + L +L A++ VL+ E
Sbjct: 201 QKYQ-KELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQEN 259

Query: 428 SL 429

Sbjct: 260 KY 261


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0609   HTHFIS        86     2e-21    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 85.7 bits (212), Expect = 2e-21
Identities = 35/117 (29%), Positives = 62/117 (52%)

Query: 2 KLLIVEDEKKTGEYLTKGLTEAGFVVDLADNGLNGYHLAMTGDYDLIILDIMLPDVNGWD 61
+L+ +D+ L + L+ AG+ V + N + GD DL++ D+++PD N +D
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 IVRMLRSANKGMPILLLTALGTIEHRVKGLELGADDYLVKPFAFAELLARVRTLLRR 118
++ ++ A +P+L+++A T +K E GA DYL KPF EL+ + L
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0608   PF06580       30     0.018    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 30.2 bits (68), Expect = 0.018
Identities = 30/183 (16%), Positives = 67/183 (36%), Gaps = 34/183 (18%)

Query: 306 EELTRMAKMVSDML-FLAQADNNQLIPEKKMLNLADEVGKVFDFFEALAEDR-GVELQFV 363
+ M +S+++ + + N + + LADE+ V + + LA + LQF
Sbjct: 191 TKAREMLTSLSELMRYSLRYSNARQVS------LADELTVVDSYLQ-LASIQFEDRLQFE 243

Query: 364 GDECQVAGDPLMLRRALSNLLSNALRY----TPPGEAIVVRCQTVDHLVQVIVENPGTPI 419
D + + L+ N +++ P G I+++ + V + VEN G+
Sbjct: 244 NQINPAIMDVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLA 303

Query: 420 APEHLPRLFDRFYRVDPSRQRKGEGSGIGLAIVK---SIVVAHKGTVAVTSNARGTRFVI 476
E +G GL V+ ++ + + ++ ++
Sbjct: 304 LKN------------------TKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMV 345

Query: 477 VLP 479
++P
Sbjct: 346 LIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0597   HTHFIS        61     4e-13    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 60.6 bits (147), Expect = 4e-13
Identities = 26/122 (21%), Positives = 55/122 (45%), Gaps = 2/122 (1%)

Query: 1 MKPTSVIIMDTHPIIRMSIEVLLQKNSELQIVLKTDDYRITIDYLRTRPVDLIIMDIDLP 60
M ++++ D IR + L + V T + ++ DL++ D+ +P
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGY--DVRITSNAATLWRWIAAGDGDLVVTDVVMP 58

Query: 61 GTDGFTFLKRIKQIQSTVKVLFLSSKSECFYAGRAIQAGANGFVSKCNDQNDIFHAVQMI 120
+ F L RIK+ + + VL +S+++ A +A + GA ++ K D ++ +
Sbjct: 59 DENAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRA 118

Query: 121 LS 122
L+
Sbjct: 119 LA 120


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0594   PF00577       825    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 825 bits (2132), Expect = 0.0
Identities = 403/856 (47%), Positives = 572/856 (66%), Gaps = 20/856 (2%)

Query: 20 ICYSSLAILPSFLSYAESYFNPAFLLENGTFVADLSRFERGNHQPAGVYRVDLWRNDEFI 79
+ A + LS AE YFNP FL ++ VADLSRFE G P G YRVD++ N+ ++
Sbjct: 31 FVACAFA-AQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPGTYRVDIYLNNGYM 89

Query: 80 GSQDIVFESTTENTGDKSGGLMPCFNQVLLERIGLNSSAFPELAQQQNNKCINLLKAVPD 139
++D+ F NTGD G++PC + L +GLN+++ + ++ C+ L + D
Sbjct: 90 ATRDVTF-----NTGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDACVPLTSMIHD 144

Query: 140 ATINFDFAAMRLNITIPQIALLSSAHGYIPPEEWDEGIPALLLNYNFTGN----RGNGND 195
AT D RLN+TIPQ + + A GYIPPE WD GI A LLNYNF+GN R GN
Sbjct: 145 ATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSVQNRIGGNS 204

Query: 196 SYFFSEL-SGINIGPWRLRNNGSWNYFRGNG--YHSEQWNNIGTWVQRAIIPLKSELVMG 252
Y + L SG+NIG WRLR+N +W+Y + +W +I TW++R IIPL+S L +G
Sbjct: 205 HYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRSRLTLG 264

Query: 253 DGNTGSDIFDGVGFRGVRLYSSDNMYPDSQQGFAPTVRGIARTAAQLTIRQNGFIIYQSY 312
DG T DIFDG+ FRG +L S DNM PDSQ+GFAP + GIAR AQ+TI+QNG+ IY S
Sbjct: 265 DGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIYNST 324

Query: 313 VSPGAFEITDLHPTSSNGDLDVTIDERDGNQQNYTIPYSTVPILQREGRFKFDLTAGDFR 372
V PG F I D++ ++GDL VTI E DG+ Q +T+PYS+VP+LQREG ++ +TAG++R
Sbjct: 325 VPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITAGEYR 384

Query: 373 SGNSQQSSPFFFQGTALGGLPQEFTAYGGTQLSANYTAFLLGLGRNLGNWGAVSLDVTHA 432
SGN+QQ P FFQ T L GLP +T YGGTQL+ Y AF G+G+N+G GA+S+D+T A
Sbjct: 385 SGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALSVDMTQA 444

Query: 433 RSQLADDSRHEGDSIRFLYAKSMNTFGTNFQLMGYRYSTQGFYTLDDVAYRRMEGYEYDY 492
S L DDS+H+G S+RFLY KS+N GTN QL+GYRYST G++ D Y RM GY
Sbjct: 445 NSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGY-NIE 503

Query: 493 DYDGEHRDEPIIVNYHNLRFSRKDRLQLNISQSLNDFGSLYISGTHQKYWNTSDSDTWYQ 552
DG + +P +Y+NL ++++ +LQL ++Q L +LY+SG+HQ YW TS+ D +Q
Sbjct: 504 TQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRTSTLYLSGSHQTYWGTSNVDEQFQ 563

Query: 553 VGYTSSWVGISYSLSFSWNESVGIPDNERIVGLNVSVPFNVLTKRRYTRENALDRAYASF 612
G +++ I+++LS+S ++ ++++ LNV++PF+ R ++ A AS+
Sbjct: 564 AGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWL--RSDSKSQWRHASASY 621

Query: 613 NANRNSNGQNSWLAGVGGTLLEGHNLSYHVSQG----DTSNNGYTGSATANWQAAYGTLG 668
+ + + NG+ + LAGV GTLLE +NLSY V G N+G TG AT N++ YG
Sbjct: 622 SMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGYGNAN 681

Query: 669 VGYNYDRDQHDVNWQLSGGVVGHENGITLSQPLGDTNVLIKAPGAGGVRIENQTGILTDW 728
+GY++ D + + +SGGV+ H NG+TL QPL DT VL+KAPGA ++ENQTG+ TDW
Sbjct: 682 IGYSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNDTVVLVKAPGAKDAKVENQTGVRTDW 741

Query: 729 RGYAVMPYATVYRYNRIALDTNTMGNSIDVEKNISSVVPTQGALVRANFDTRIGVRALIT 788
RGYAV+PYAT YR NR+ALDTNT+ +++D++ +++VVPT+GA+VRA F R+G++ L+T
Sbjct: 742 RGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAEFKARVGIKLLMT 801

Query: 789 VTQGGKPVPFGSLVRENSTGITSMVGDDGQVYLSGAPLSGELLVQWGDGANSRCIAHYVL 848
+T KP+PFG++V S+ + +V D+GQVYLSG PL+G++ V+WG+ N+ C+A+Y L
Sbjct: 802 LTHNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLAGKVQVKWGEEENAHCVANYQL 861

Query: 849 PKQSLQQAVTVISAVC 864
P +S QQ +T +SA C
Sbjct: 862 PPESQQQLLTQLSAEC 877


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0583   CARBMTKINASE  381    e-136    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 381 bits (980), Expect = e-136
Identities = 125/310 (40%), Positives = 174/310 (56%), Gaps = 16/310 (5%)

Query: 2 KTLVVALGGNALLQRGEALTAENQYRNIASAVPALARL-ARSYRLAIVHGNGPQVGLLAL 60
K +V+ALGGNAL QRG+ + E N+ +A + AR Y + I HGNGPQVG L L
Sbjct: 3 KRVVIALGGNALQQRGQKGSYEEMMDNVRKTARQIAEIIARGYEVVITHGNGPQVGSLLL 62

Query: 61 QNLAWKE---VEPYPLDVLVAESQGMIGYMLAQSLSAQPQM----PPVTTVRTRIEVSPD 113
A + + P+DV A SQG IGYM+ Q+L + + V T+ T+ V +
Sbjct: 63 HMDAGQATYGIPAQPMDVAGAMSQGWIGYMIQQALKNELRKRGMEKKVVTIITQTIVDKN 122

Query: 114 DPAFLQPEKFIGPVYQPEEQEALEAAYGWQMKRD-GKYLRRVVASPQPRKILDSEAIELL 172
DPAF P K +GP Y E + L GW +K D G+ RRVV SP P+ +++E I+ L
Sbjct: 123 DPAFQNPTKPVGPFYDEETAKRLAREKGWIVKEDSGRGWRRVVPSPDPKGHVEAETIKKL 182

Query: 173 LKEGHVVICSGGGGVPVTDDG---AGSEAVIDKDLAAALLAEQINADGLVILTDADAVYE 229
++ G +VI SGGGGVPV + G EAVIDKDLA LAE++NAD +ILTD +
Sbjct: 183 VERGVIVIASGGGGVPVILEDGEIKGVEAVIDKDLAGEKLAEEVNADIFMILTDVNGAAL 242

Query: 230 NWGMPQQRAIRHATPDELAPFAKAD----GSMGPKVTAVSGYVRSRGKPAWIGALSRIEE 285
+G +++ +R +EL + + GSMGPKV A ++ G+ A I L + E
Sbjct: 243 YYGTEKEQWLREVKVEELRKYYEEGHFKAGSMGPKVLAAIRFIEWGGERAIIAHLEKAVE 302

Query: 286 TLAGEAGTCI 295
L G+ GT +
Sbjct: 303 ALEGKTGTQV 312


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0567   PF09025       28     0.020    YopR Core
		>PF09025#YopR Core

Length = 143

Score = 28.1 bits (62), Expect = 0.020
Identities = 17/61 (27%), Positives = 25/61 (40%), Gaps = 8/61 (13%)

Query: 126 EAVLIGQLECKSMVRMCAPLGSR--------LPLHASGAGKALLYPLAEEELMSIILQTG 177
+ + +LE K+M+R PLG + L G L LA EL +I G
Sbjct: 68 QGLEADRLELKAMLRAELPLGRQQQTFLLQLLGAVEHAPGGEYLAQLARRELQVLIPLNG 127

Query: 178 L 178
+
Sbjct: 128 M 128


63  ECs0548  ECs0541  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
ECs0548   4       31       7.805284   adhesin
ECs0546   4       31       8.137908   lipoprotein
ECs0545   4       30       7.965051   DNA-binding transcriptional regulator CueR
ECs0544   4       29       7.730210   membrane fusion protein of a transport system
ECs0543   4       29       7.642712   ABC transporter ATP-binding protein
ECs0542   4       28       7.708506   hypothetical protein
ECs0541   1       15       4.505730   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0548   PF03895       55     3e-12    Serum resistance protein DsrA.
		>PF03895#Serum resistance protein DsrA.

Length = 79

Score = 55.2 bits (133), Expect = 3e-12
Identities = 21/78 (26%), Positives = 34/78 (43%), Gaps = 1/78 (1%)

Query: 262 RKEANAGTASAIAIASQPQVKTGDVMMVSAGAGTFNGESAVSVGTSFNAGTHTVLKAGIS 321
KE G A+ A++ Q VSA G + ++A+++G KAG++
Sbjct: 2 SKELQTGLANQSALSMLVQPNGVGKTSVSAAVGGYRDKTALAIGVGSRITDRFTAKAGVA 61

Query: 322 ADTQS-DFGAGVGVGYSF 338
+T + G VGY F
Sbjct: 62 FNTYNGGMSYGASVGYEF 79


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0544   RTXTOXIND     257    1e-83    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 257 bits (657), Expect = 1e-83
Identities = 103/434 (23%), Positives = 168/434 (38%), Gaps = 56/434 (12%)

Query: 11 LTEPRLPRSALAV-RVTAVMLLCFLGWAWYFQLDEVTTGSGTVEPSGREQVVQSLEGGIL 69
L E + R V L+ + Q++ V T +G + SGR + ++ +E I+
Sbjct: 48 LIETPVSRRPRLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIV 107

Query: 70 YHLDVKVGDIVEQGQPLAQLNRTKTESDVQEAMSRLYAALATSARLRAEVSNK------P 123
+ VK G+ V +G L +L E+D + S L A R + +
Sbjct: 108 KEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPE 167

Query: 124 LVFPDEL----------------------NKFPELIESETALYNTR--RDGLNKATTGLT 159
L PDE + + E L R R +
Sbjct: 168 LKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYE 227

Query: 160 QGISLVNRELAMTQPLVKQGAASSVEVLRLQRQANELEN--------------------- 198
+ L L+ + A + VL + + E N
Sbjct: 228 NLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKE 287

Query: 199 KLSDVRTQYYVQAREELAKANAEVETQRSVIRGREDSLTRLNFTAPVRGIVQDIDVTTVG 258
+ V + + ++L + + + E+ APV VQ + V T G
Sbjct: 288 EYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEG 347

Query: 259 GVIAPGGKLMTIVPLDEQLLIEAKISPRDVAFIHPGQKSLVKITAYDYSIYGGLPGEVAV 318
GV+ LM IVP D+ L + A + +D+ FI+ GQ +++K+ A+ Y+ YG L G+V
Sbjct: 348 GVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKN 407

Query: 319 ISPDTVQDEVRRDVYYYRVYIRTFSNHLENKSKQQFPIFPGMVATVDIRTGKKSVLDYLL 378
I+ D ++D+ R + V I N L +K P+ GM T +I+TG +SV+ YLL
Sbjct: 408 INLDAIEDQ--RLGLVFNVIISIEENCLSTGNK-NIPLSSGMAVTAEIKTGMRSVISYLL 464

Query: 379 KPF-NKAQEALRER 391
P E+LRER
Sbjct: 465 SPLEESVTESLRER 478


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0542   CABNDNGRPT    45     1e-05    NodO calcium binding signature.
		>CABNDNGRPT#NodO calcium binding signature.

Length = 479

Score = 44.6 bits (105), Expect = 1e-05
Identities = 43/221 (19%), Positives = 68/221 (30%), Gaps = 12/221 (5%)

Query: 5048 GTASNNGKFVGTGYNDTFFATAGTDTYDGSGGWVYSSGTGTWLANGGMDVVDFRLSTVGV 5107
T + F D + AT + S + T + ++ +
Sbjct: 265 RTGDSVYGFNSNTDRDFYTATDSSKALIFSVWDAGGTDTFDFSGYSNNQRINLNEGSFSD 324

Query: 5108 TANLSSTAAQATGFNTSTFTNIEGISGSNFNDILTGSSGDNQLEGRGGNDTLNIGNGGHD 5167
L + A G IE G + NDIL G+S DN L+G GND L G G
Sbjct: 325 VGGLKGNVSIAHG------VTIENAIGGSGNDILVGNSADNILQGGAGNDVLYGGAG--A 376

Query: 5168 TLLYKLLNASDATGGNGSDVVNGFTVGTWEGTADTDRIDIRELLQGSGYTG-NGKASYVN 5226
LY G+G D + D+ID+ + + +
Sbjct: 377 DTLYGGAGRDTFVYGSGQDSTVAAYDWIADFQKGIDKIDLSAFRNEGQLSFVQDQFTGKG 436

Query: 5227 GVATLDAQAGNIGDFVKVTQS---GSDTIVQIDRDGTGGTF 5264
L A N + + ++ D +V+I
Sbjct: 437 QEVMLQWDAANSITNLWLHEAGHSSVDFLVRIVGQAAQSDI 477


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0541   INTIMIN       37     5e-04    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 37.4 bits (86), Expect = 5e-04
Identities = 63/372 (16%), Positives = 115/372 (30%), Gaps = 44/372 (11%)

Query: 707 QTVTVTLNGQTYQGVVQPDGTWSVTVPAANVGALADGNA--TVTASVNDVAGNPSSVSRV 764
T+TV NGQ V D T A A ADG T TA+V ++V
Sbjct: 544 LTITVLSNGQVVDQVGVTDFT------ADKTSAKADGTEAITYTATVKKNGVAQANVPVS 597

Query: 765 ALVDATPPVVTINPVATDNVINTPEHAQAQIISGTVTGAQAGDIVTVTLNNVDYTTVVDG 824
+ + V++ N T+ ++ V A+ ++ + N + VD
Sbjct: 598 FNIVSGTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAKTAEMTSAL--NANAVIFVDQ 655

Query: 825 SGNWSLGVPASVVSGLADGSYPVSVSVTDKAGNTGSQSLTVTVNTAAPLIGINSIAGDDV 884
+ + A + +A+G ++ +V G+ + VT T +
Sbjct: 656 TKASITEIKADKTTAVANGQDAITYTVKVMKGDKPVSNQEVTFTTTLGKLSN-------- 707

Query: 885 INASEKGADLQITGTSDQPVNTAITVTLNGQNYTTTTDASGNWSVTVPASAVTALGQANY 944
T +D +T+T + + + +V V A V
Sbjct: 708 -----------STEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFF--TTL 754

Query: 945 TVTAAVTSDIGNSATASHNVLVDSALPGVTINPVATDDIINAAEAGVAQTISGQVTGAED 1004
T+ +G V LP V + + + + + D
Sbjct: 755 TIDDGNIEIVGTG--------VKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPAIASVD 806

Query: 1005 GDTVTITL---GGNTYTATVGSN--LTWSVDVPAADIQALGNGDLTVNASVTNQNGNTGS 1059
+ +TL G T + N T+++ P + I + +T N +V G
Sbjct: 807 ASSGQVTLKEKGTTTISVISSDNQTATYTIATPNSLIVPNMSKRVTYNDAVNTCKNFGGK 866

Query: 1060 GTRDITIDANLP 1071
N+
Sbjct: 867 LPSSQNELENVF 878



Score = 35.0 bits (80), Expect = 0.002
Identities = 81/416 (19%), Positives = 139/416 (33%), Gaps = 61/416 (14%)

Query: 841 ADGSYPVSVSVTDKAGN-TGSQSLTVTVNTAAPLIGINSIAGDDVINASEKGADLQITGT 899
Y V+ D+ GN + + LT+TV + + D + + AD GT
Sbjct: 521 GSNVYKVTARAYDRNGNSSNNVLLTITV-LSNGQVVDQVGVTDFTADKTSAKAD----GT 575

Query: 900 SDQPVNTAITVTLNGQNYTTTTDASGNWSVTVPASAVTALGQANYTVTAAVTSDIGNSAT 959
AIT YT T +G VP S G A + +A T + +
Sbjct: 576 ------EAIT-------YTATVKKNGVAQANVPVSFNIVSGTAVLSANSANT-----NGS 617

Query: 960 ASHNVLVDSALPGVTINPVATDDIINAAEAGVAQTISGQVTGAEDGDTVTITLGGNTYTA 1019
V + S PG + T ++ +A A A Q I T A
Sbjct: 618 GKATVTLKSDKPGQVVVSAKTAEMTSALNAN-AVIFVDQTK----ASITEIKADKTTAVA 672

Query: 1020 TVGSNLTWSVDVPAADIQALGNGDLTVNASVTNQNG----NTGSGTRDITIDANLPG--- 1072
+T++V V D + + N ++T ++ + +G +T+ + PG
Sbjct: 673 NGQDAITYTVKVMKGD-KPVSNQEVTFTTTLGKLSNSTEKTDTNGYAKVTLTSTTPGKSL 731

Query: 1073 --LRVDTVAGDDVVNIIEHGQALVVTGSS-----SGLAESTP----------LTVTINNV 1115
RV VA D +E L + + +G+ P L + N
Sbjct: 732 VSARVSDVAVDVKAPEVEFFTTLTIDDGNIEIVGTGVKGKLPTVWLQYGQVNLKASGGNG 791

Query: 1116 EYTTAVQADGSWSVGVTAAQVSAWPAGTVNIAVSGESSAGNSVSITHPVTVDLTPAAITI 1175
+YT SV ++ QV+ GT I+V + + +I TP ++ +
Sbjct: 792 KYTWRSANPAIASVDASSGQVTLKEKGTTTISVISSDNQTATYTIA-------TPNSLIV 844

Query: 1176 NTIATDDVINAAEKGADLTLSGTTTNVEPGQTVTVTFGGKNYTASVASDGSWTATV 1231
++ N A ++ + V +G N S + + V
Sbjct: 845 PNMSKRVTYNDAVNTCKNFGGKLPSSQNELENVFKAWGAANKYEYYKSSQTIISWV 900


64  ECs0529  ECs0523  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
ECs0529   2       23       1.506353   acetylesterase
ECs0528   2       24       3.055257   ferrochelatase
ECs0527   4       28       3.102685   adenylate kinase
ECs0526   3       24       3.246802   heat shock protein 90
ECs0525   3       17       4.596541   recombination protein RecR
ECs0524   4       16       4.108106   hypothetical protein
ECs0523   3       16       2.417078   DNA polymerase III subunits gamma and tau
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0526   FRAGILYSIN    32     0.009    Fragilysin metallopeptidase (M10C) enterotoxin signat...
		>FRAGILYSIN#Fragilysin metallopeptidase (M10C) enterotoxin signature.

Length = 405

Score = 31.6 bits (71), Expect = 0.009
Identities = 24/108 (22%), Positives = 46/108 (42%), Gaps = 12/108 (11%)

Query: 422 RMKEGQEK--IYYITADSYAAAKSSPHLELLRKKGIEVLLLSDRIDEWMMNYLTEFDGKP 479
R+ G++K +I D +A + + G + ++ + + MMN + EF P
Sbjct: 99 RLFNGRDKDSTSFILGDEFAVLR-------FYRNGESISYIAYK-EAQMMNEIAEFYAAP 150

Query: 480 FQSVSKV--DESLEKLADEVDESAKEAEKALTPFIDRVKALLGERVKD 525
F+ + E+ E + D SA + ++ ID+ K +L D
Sbjct: 151 FKKTRAINEKEAFECIYDSRTRSAGKDIVSVKINIDKAKKILNLPECD 198


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0523   IGASERPTASE   41     2e-05    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 40.8 bits (95), Expect = 2e-05
Identities = 40/251 (15%), Positives = 77/251 (30%), Gaps = 31/251 (12%)

Query: 404 PLPETTSQVLAARQQLQRVQGATKAKKSEPAA----ATRARPVNNAALERLASVTDRVQA 459
P E +Q + + + P+ AR + A + A T
Sbjct: 983 PEVEKRNQTVDTTN----ITTPNNIQADVPSVPSNNEEIARV-DEAPVPPPAPATPSETT 1037

Query: 460 RPVPSALEKAPAKKEAYRWKATTPVMQQKE--------VVATPKALKKA---LEHEKTPE 508
V ++ E AT Q +E V A + + A E ++T
Sbjct: 1038 ETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQT 1097

Query: 509 LAAKLAA---------EAIERDAWAAQVSQLSLPKLVEQVALNAWKE-ESDNAVCLHLRS 558
K A E+ +V+ PK + + E +N ++++
Sbjct: 1098 TETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKE 1157

Query: 559 SQRHLNNRGAQQKLAEALS-MLKGSTVELTIVEDDNPAVRTPLEWRQAIYEEKLAQARES 617
Q N ++ A+ S ++ E T V N V P A + + +
Sbjct: 1158 PQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSN 1217

Query: 618 IIADNNIQTLR 628
+ + +++R
Sbjct: 1218 KPKNRHRRSVR 1228


65  ECs0476  ECs0464  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
ECs0476   0       13       -3.350178  exodeoxyribonuclease VII small subunit
ECs0475   0       12       -2.688294  geranyltranstransferase
ECs0474   0       13       -3.147004  1-deoxy-D-xylulose-5-phosphate synthase
ECs0473   -1      14       -4.787531  NAD(P)H-dependent xylose reductase
ECs0472   -2      16       -4.085019  hypothetical protein
ECs0471   -1      14       2.302912   phosphatidylglycerophosphatase A
ECs0470   -2      14       1.471536   thiamine monophosphate kinase
ECs0469   0       20       0.448233   transcription antitermination protein NusB
ECs0468   -1      17       -0.170330  6,7-dimethyl-8-ribityllumazine synthase
ECs0467   1       16       -0.688847  bifunctional
ECs0466   4       20       -3.282345  transcriptional regulator NrdR
ECs0465   3       23       -2.623383  hypothetical protein
ECs0464   3       28       -0.661828  nucleoside-specific channel-forming protein Tsx
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0472   BONTOXILYSIN  31     0.020    Bontoxilysin signature.
		>BONTOXILYSIN#Bontoxilysin signature.

Length = 1196

Score = 30.6 bits (69), Expect = 0.020
Identities = 24/115 (20%), Positives = 47/115 (40%), Gaps = 12/115 (10%)

Query: 414 SRKKYNEFFKYIQAEAKQYFKDQYKLTKNDYLKKVPLTAQLIAKYKMDDQLDQLLVTREI 473
S K N I ++ YFK Y + + + +Q +++ Q ++ +E
Sbjct: 642 SFKDLNNKLYEIYSKNIVYFKKIYFSFLDQWWTEY--YSQY---FELICMAKQSILAQE- 695

Query: 474 QDEIKSKIQDKIDELSKNLFNT-----MTETIENNFDDIFRQQSENMSNYYEFVD 523
+K +Q+K +LSK + ET E F D+ + +M+ F++
Sbjct: 696 -SLVKQIVQNKFTDLSKASIPPDTLKLIRETTEKTFIDLSNESQISMNRVDNFLN 749


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0464   CHANNELTSX    527    0.0      Nucleoside-specific channel-forming protein Tsx signa...
		>CHANNELTSX#Nucleoside-specific channel-forming protein Tsx signature.

Length = 294

Score = 527 bits (1358), Expect = 0.0
Identities = 257/294 (87%), Positives = 273/294 (92%)

Query: 1 MKKTLLAAGAVLALSSSFTVNAAENDKPQYLSDWWHQSVNVVGSYHTRFGPQIRNDTYLE 60
MKKTLLAAGAV+ALS++F AAENDKPQYLSDWWHQSVNVVGSYHTRFGPQIRNDTYLE
Sbjct: 1 MKKTLLAAGAVVALSTTFAAGAAENDKPQYLSDWWHQSVNVVGSYHTRFGPQIRNDTYLE 60

Query: 61 YEAFAKKDWFDFYGYADAPVFFGGNSDAKGIWNHGSPLFMEIEPRFSIDKLTNTDLSFGP 120
YEAFAKKDWFDFYGY DAPVFFGGNS AKGIWN GSPLFMEIEPRFSIDKLTNTDLSFGP
Sbjct: 61 YEAFAKKDWFDFYGYIDAPVFFGGNSTAKGIWNKGSPLFMEIEPRFSIDKLTNTDLSFGP 120

Query: 121 FKEWYFANNYIYDMGRNKDGRQSTWYMGLGTDIDTGLPMSLSMNVYAKYQWQNYGAANEN 180
FKEWYFANNYIYDMGRN QSTWYMGLGTDIDTGLPMSLS+NVYAKYQWQNYGA+NEN
Sbjct: 121 FKEWYFANNYIYDMGRNDSQEQSTWYMGLGTDIDTGLPMSLSLNVYAKYQWQNYGASNEN 180

Query: 181 EWDGYRFKIKYFVPITDLWGGQLSYIGFTNFDWGSDLGDDSGNAINGIKTRTNNSIASSH 240
EWDGYRFK+KYFVP+TDLWGG LSYIGFTNFDWGSDLGDD+ +NG RT+NSIASSH
Sbjct: 181 EWDGYRFKVKYFVPLTDLWGGSLSYIGFTNFDWGSDLGDDNFYDLNGKHARTSNSIASSH 240

Query: 241 ILALNYDHWHYSVVARYWHDGGQWNDDAELNFGNGNFNVRSTGWGGYLVVGYNF 294
ILALNY HWHYS+VARY+H+GGQW DDA+LNFG+G F+VRSTGWGGY VVGYNF
Sbjct: 241 ILALNYAHWHYSIVARYFHNGGQWADDAKLNFGDGPFSVRSTGWGGYFVVGYNF 294


66  ECs0429  ECs0421  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
ECs0429   2       18       -2.314161  hypothetical protein
ECs0428   2       19       -1.874290  hypothetical protein
ECs0427   2       19       -1.875028  transporter
ECs0426   1       18       -1.058931  beta-lactam binding protein AmpH
ECs0425   2       20       -1.182814  DNA-binding transcriptional regulator
ECs0424   2       22       0.463274   flagellin structural protein
ECs0423   0       20       3.877539   delta-aminolevulinic acid dehydratase
ECs0422   1       22       3.648063   taurine dioxygenase
ECs0421   1       20       3.838974   taurine transporter subunit
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0424   PRTACTNFAMLY  121    4e-30    Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 121 bits (305), Expect = 4e-30
Identities = 102/445 (22%), Positives = 169/445 (37%), Gaps = 59/445 (13%)

Query: 584 TINGNGDNDNTASIEAGQNEVDNNGDHVAAATGNYKVRIDNATGAGSIADYNGNELIYVN 643
T+ G+G + G ++ A+G +++ + N+ GS L+
Sbjct: 477 TLAGSGLFRMNVFADLGLSDKLVVMQD---ASGQHRLWVRNS---GSEPASANTLLLVQT 530

Query: 644 DKNSNATFSAAN---KADLGAYTYQAEQRGNTV--------------------------- 673
S ATF+ AN K D+G Y Y+ GN
Sbjct: 531 PLGSAATFTLANKDGKVDIGTYRYRLAANGNGQWSLVGAKAPPAPKPAPQPGPQPPQPPQ 590

Query: 674 ---------VLQQMELTDYANMALSIP--SANTNIWNLEQDTVGTRLTNSRHGLADNGGA 722
EL+ AN A++ + +W E + + RL R D GGA
Sbjct: 591 PQPEAPAPQPPAGRELSAAANAAVNTGGVGLASTLWYAESNALSKRLGELRL-NPDAGGA 649

Query: 723 WVSYFGGNFNGDNGTIN-YDQDVNGIMVGVDTKIDGNNAKWIVGAAAGFAKGDMN---DR 778
W F DN +DQ V G +G D + +W +G AG+ +GD D
Sbjct: 650 WGRGFAQRQQLDNRAGRRFDQKVAGFELGADHAVAVAGGRWHLGGLAGYTRGDRGFTGDG 709

Query: 779 SGQVDQDSQTAYIYSSAHFANNVF-VDGSLSYSHFNNDLSATMSNGTYVDGSTNSDAWGF 837
G D Y + + A++ F +D +L S ND S+G V G + G
Sbjct: 710 GGHTDSVHVGGY---ATYIADSGFYLDATLRASRLENDFKVAGSDGYAVKGKYRTHGVGA 766

Query: 838 GLKAGYDFKLGDAGYVTPYGSISGLFQSGDDYQLSNDMKVDGQSYDSMRYELGVDAGYTF 897
L+AG F D ++ P ++ G Y+ +N ++V + S+ LG++ G
Sbjct: 767 SLEAGRRFTHADGWFLEPQAELAVFRAGGGAYRAANGLRVRDEGGSSVLGRLGLEVGKRI 826

Query: 898 TYSEDQALTPYFKLAYVYDDSNNDNDVNGDSIDNGTEGSAVRV--GLGTQFSFTKNFSAY 955
+ + + PY K + + + + V+ + I + TE R GLG + + S Y
Sbjct: 827 ELAGGRQVQPYIKASVL-QEFDGAGTVHTNGIAHRTELRGTRAELGLGMAAALGRGHSLY 885

Query: 956 TDANYLGGGDVDQDWSANVGVKYTW 980
Y G + W+ + G +Y+W
Sbjct: 886 ASYEYSKGPKLAMPWTFHAGYRYSW 910


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0423   BINARYTOXINB  30     0.015    Binary toxin B family signature.
		>BINARYTOXINB#Binary toxin B family signature.

Length = 764

Score = 30.0 bits (67), Expect = 0.015
Identities = 19/69 (27%), Positives = 30/69 (43%)

Query: 254 DIVRELRERTELPIGAYQVSGEYAMIKFAALAGAIDEEKVVLESLGSIKRAGADLIFSYF 313
+ EL + +L + QV G A F +D E L I+ A +IF+
Sbjct: 466 NQFLELEKTKQLRLDTDQVYGNIATYNFENGRVRVDTGSNWSEVLPQIQETTARIIFNGK 525

Query: 314 ALDLAEKKI 322
L+L E++I
Sbjct: 526 DLNLVERRI 534


67  ECs0394  ECs0383  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
ECs0394   0       15       3.521004    cyanate transporter
ECs0393   -1      20       3.848865    cyanate hydratase
ECs0392   -1      19       4.192910    carbonic anhydrase
ECs0391   -1      20       4.120330    DNA-binding transcriptional regulator CynR
ECs0390   0       22       4.221272    cytosine deaminase
ECs0389   0       22       4.484712    cytosine permease
ECs0388   1       23       4.172365    propionyl-CoA synthetase
ECs0387   0       17       2.687879    2-methylcitrate dehydratase
ECs0386   0       15       -0.122889   2-methylcitrate synthase
ECs0385   -1      15       -0.752611   2-methylisocitrate lyase
ECs0384   -1      16       -1.532304   regulator for prp operon
ECs0383   0       18       -3.117560   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0386   PHPHTRNFRASE  30     0.023    Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase signature.

Length = 572

Score = 29.8 bits (67), Expect = 0.023
Identities = 11/33 (33%), Positives = 19/33 (57%), Gaps = 1/33 (3%)

Query: 65 LIHGKLPTRDE-LAAYKTKLKALRGLPANVRTV 96
+ +LPT +E AYK ++ + G P +RT+
Sbjct: 303 MDRDQLPTEEEQFEAYKEVVQRMDGKPVVIRTL 335


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0384   HTHFIS        338    e-113    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 338 bits (868), Expect = e-113
Identities = 122/401 (30%), Positives = 200/401 (49%), Gaps = 54/401 (13%)

Query: 164 DLAEEAGMTGIFIYSAATVRQAFSDALDMTRMSLRHNTHDATRNALRTRYVLGDMLGQSP 223
A +A G + Y ++ + + +L ++ ++G+S
Sbjct: 88 MTAIKASEKGAYDYLPKPFDL--TELIGIIGRALAEPKRRPSK-LEDDSQDGMPLVGRSA 144

Query: 224 QMEQVRQTILLYARSSAAVLIEGETGTGKELAAQAIHREYFARHDARQGKKSHPFVAVNC 283
M+++ + + ++ ++I GE+GTGKEL A+A+H + R+ PFVA+N
Sbjct: 145 AMQEIYRVLARLMQTDLTLMITGESGTGKELVARALHD-----YGKRRNG---PFVAINM 196

Query: 284 GAIAESLLEAELFGYEEGAFTGSRRGGRAGLFEIAHGGTLFLDEIGEMPLPLQTRLLRVL 343
AI L+E+ELFG+E+GAFTG++ G FE A GGTLFLDEIG+MP+ QTRLLRVL
Sbjct: 197 AAIPRDLIESELFGHEKGAFTGAQTR-STGRFEQAEGGTLFLDEIGDMPMDAQTRLLRVL 255

Query: 344 EEKEVTRVGGHQPVPVDVRVISATHCNLEEDMRQGQFRRDLFYRLSILRLQLPPLRERVA 403
++ E T VGG P+ DVR+++AT+ +L++ + QG FR DL+YRL+++ L+LPPLR+R
Sbjct: 256 QQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRLNVVPLRLPPLRDRAE 315

Query: 404 DILPLAESFLKVSLAALSAPFSAALRQGLQASETVLVHYDWPGNIRELRNMMERLALFLS 463
DI L F++ ++ L+ + + WPGN+REL N++ RL
Sbjct: 316 DIPDLVRHFVQ-QAEKEGLDVKRFDQEALEL----MKAHPWPGNVRELENLVRRLTALYP 370

Query: 464 VEP-TPDLTPQFLQLLLPELARESAKIPAPRLLTP------------------------- 497
+ T ++ L+ +P+ E A + L
Sbjct: 371 QDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFASFGDALPPSGLYD 430

Query: 498 -----------QQALEKFNGDKTAAANYLGISRTTFWRRLK 527
AL G++ AA+ LG++R T ++++
Sbjct: 431 RVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIR 471


68  ECs5377  ECs0276  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
ECs5377   2       24       4.169163    hypothetical protein
ECs0373   1       22       3.797702    deaminase
ECs0372   1       18       2.602889    carbamate kinase
ECs0371   2       18       2.158806    hypothetical protein
ECs0370   1       15       0.611172    hypothetical protein
ECs0369   -3      11       -1.644790   oxidoreductase subunit
ECs0368   -1      11       -1.442087   hypothetical protein
ECs0367   -2      9        -0.705333   transcription factor
ECs0366   -2      11       0.805416    hypothetical protein
ECs0363   -2      12       1.513831    hypothetical protein
ECs0362   -2      14       2.130863    AidA-I adhesin-like protein
ECs0360   0       25       4.575075    choline transport protein BetT
ECs0359   1       22       3.991175    transcriptional regulator BetI
ECs0358   0       18       2.532479    betaine aldehyde dehydrogenase
ECs0357   -2      18       -1.898363   choline dehydrogenase
ECs5376   4       35       -6.337682   hypothetical protein
ECs0353   3       34       -7.798887   hypothetical protein
ECs0352   3       34       -8.111971   hypothetical protein
ECs0351   2       29       -6.394207   hypothetical protein
ECs0350   1       26       -5.061091   adhesin
ECs0349   1       20       -3.135523   hypothetical protein
ECs0348   0       20       -2.788241   hypothetical protein
ECs0347   -1      21       -2.542079   hypothetical protein
ECs0346   0       22       -2.961826   hypothetical protein
ECs0345   1       25       -3.876674   hypothetical protein
ECs0344   1       28       -5.466034   dehydrogenase subunit
ECs0343   1       29       -6.631257   AraC family transcriptional regulator
ECs0342   -1      22       -3.376075   pyridine nucleotide-disulfide oxidoreductase
ECs0341   -1      20       -2.678070   hypothetical protein
ECs0340   0       20       -2.736493   hypothetical protein
ECs0338   -1      22       -3.051634   reductase
ECs0337   -1      23       -2.805527   transcriptional regulator
ECs0336   0       22       -2.879811   invasin
ECs0335   0       22       -2.990205   oxidoreductase
ECs0334   0       22       -2.571247   hypothetical protein
ECs0333   1       21       -1.663729   transcriptional regulator
ECs0332   0       19       -0.637643   hypothetical protein
ECs0331   1       19       0.401079    NADH-dependent flavin oxidoreductase
ECs0330   2       22       3.102265    hypothetical protein
ECs0329   4       24       -4.534367   hypothetical protein
ECs0328   4       24       -4.812834   hypothetical protein
ECs0327   3       20       -3.301232   50S ribosomal protein L31 type B
ECs5375   1       21       0.405348    50S ribosomal protein L36
ECs0326   1       20       0.464711    hypothetical protein
ECs0324   2       20       0.731945    regulator
ECs0323   1       21       1.817516    hypothetical protein
ECs0322   0       19       1.092213    hypothetical protein
ECs0321   0       18       1.075832    hypothetical protein
ECs0320   -1      17       1.500242    receptor
ECs0319   -1      19       4.511691    hypothetical protein
ECs0318   -1      20       5.592510    ferredoxin
ECs0317   -1      20       5.917218    hypothetical protein
ECs0316   -2      20       6.331100    xanthine dehydrogenase iron-sulfur-binding
ECs0315   -1      20       6.065840    hypothetical protein
ECs0314   -1      17       4.615519    hypothetical protein
ECs0313   -1      16       2.312460    hypothetical protein
ECs0311   -1      16       0.681724    transporter
ECs0310   0       23       -4.771764   hypothetical protein
ECs0309   1       26       -3.932042   transcriptional regulator
ECs0307   1       27       -4.411699   hypothetical protein
ECs0306   1       26       -3.641024   oxidoreductase
ECs0305   1       27       -3.345470   hypothetical protein
ECs0304   1       27       -3.298570   hypothetical protein
ECs0303   1       23       0.267355    DNA primase
ECs0302   1       23       -0.661828   Cnr-like protein
ECs0301   2       23       -1.187897   hypothetical protein
ECs0300   1       22       -1.255382   CI repressor
ECs0299   2       25       -5.428259   DNA binding protein
ECs5374   3       33       -10.296321  hypothetical protein
ECs0298   3       37       -12.256247  head size determination protein
ECs0297   5       47       -16.002514  polarity suppression protein
ECs0296   6       55       -19.744121  Ogr family transcription activator
ECs0295   9       57       -20.685886  hypothetical protein
ECs0294   8       52       -16.824277  hypothetical protein
ECs0293   7       48       -14.243753  hypothetical protein
ECs0292   7       48       -13.250513  hypothetical protein
ECs0291   7       44       -11.610552  hypothetical protein
ECs0290   6       39       -8.798292   hypothetical protein
ECs0289   4       37       -7.415934   integrase
ECs0288   3       37       -7.616758   hypothetical protein
ECs0287   3       36       -7.124631   transcriptional regulator
ECs0285   0       30       -4.373589   hypothetical protein
ECs0284   0       26       -3.051192   DNA-invertase
ECs0283   0       26       -3.744096   tail fiber protein
ECs0282   1       29       -4.307512   hypothetical protein
ECs0281   2       31       -4.991510   hypothetical protein
ECs0280   4       34       -5.243275   tail fiber protein
ECs0276   3       37       -5.368194   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0372   CARBMTKINASE  429    e-154    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 429 bits (1105), Expect = e-154
Identities = 140/315 (44%), Positives = 201/315 (63%), Gaps = 3/315 (0%)

Query: 1 MKELVVVAIGGNSIIKDNASQSIEHQAEAVKAVADTVLEMLASDYDIVLTHGNGPQVGLD 60
M + VV+A+GGN++ + S E + V+ A + E++A Y++V+THGNGPQVG
Sbjct: 1 MGKRVVIALGGNALQQRGQKGSYEEMMDNVRKTARQIAEIIARGYEVVITHGNGPQVGSL 60

Query: 61 LRRAEIAHEREGLPLTPLANCVADTQGGIGYLIQQALNNRLARHG-EKKAVTVVTQVEVD 119
L + G+P P+ A +QG IGY+IQQAL N L + G EKK VT++TQ VD
Sbjct: 61 LLHMDAGQATYGIPAQPMDVAGAMSQGWIGYMIQQALKNELRKRGMEKKVVTIITQTIVD 120

Query: 120 KNDPGFAHPTKPIGAFFSESQRDELQKANPDWRFVEDAGRGYRRVVASPEPKRIVEAPAI 179
KNDP F +PTKP+G F+ E L + W ED+GRG+RRVV SP+PK VEA I
Sbjct: 121 KNDPAFQNPTKPVGPFYDEETAKRLAR-EKGWIVKEDSGRGWRRVVPSPDPKGHVEAETI 179

Query: 180 KALIQQGFVVIGAGGGGIPVVRTDAGDYQSVDAVIDKDLSTALLAREIHADILVITTGVE 239
K L+++G +VI +GGGG+PV+ D G+ + V+AVIDKDL+ LA E++ADI +I T V
Sbjct: 180 KKLVERGVIVIASGGGGVPVILED-GEIKGVEAVIDKDLAGEKLAEEVNADIFMILTDVN 238

Query: 240 KVCIHFGKPEKQALDRVDIATMTRYMQEGHFPPGSMLPKIIASLTFLEQGGKEVIITTPE 299
+++G ++Q L V + + +Y +EGHF GSM PK++A++ F+E GG+ II E
Sbjct: 239 GAALYYGTEKEQWLREVKVEELRKYYEEGHFKAGSMGPKVLAAIRFIEWGGERAIIAHLE 298

Query: 300 CLPAALRGETGTHII 314
AL G+TGT ++
Sbjct: 299 KAVEALEGKTGTQVL 313


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0362   PRTACTNFAMLY  131    6e-33    Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 131 bits (330), Expect = 6e-33
Identities = 146/614 (23%), Positives = 232/614 (37%), Gaps = 97/614 (15%)

Query: 736 GTLNGSADSLLSLNGGSLTVTNG------GTSTGSLTGSGELNIQGGTL----------- 778
G + LS G++ T G + S+T + QG L
Sbjct: 326 GARVTVSGGSLSAPHGNVIETGGARRFAPQAAPLSITLQAGAHAQGKALLYRVLPEPVKL 385

Query: 779 ----------DIAGDNSNLTANVNIANSANVLVSHAQGLGSANVENNGTLALNNSAEKRA 828
DI +I L S A+ G+ + +L+++N+
Sbjct: 386 TLTGGADAQGDIVATELPSIPGTSIGPLDVALASQARWTGATRAVD--SLSIDNATWVMT 443

Query: 829 AASVNYALGGNLTNNGTLMTGMSGQQAGNVLVVKGNYHGNNGQLVMNTVLNGDDSVTDKL 888
S AL L ++G++ + AG V+ N +G MN D ++DKL
Sbjct: 444 DNSNVGAL--RLASDGSVDFQQPAE-AGRFKVLTVNTLAGSGLFRMNV--FADLGLSDKL 498

Query: 889 VVEGDTSGTTAVTVNNAGGTGAKTLNGIELIHVDGKSEGEFVQA---GRIVAGAYDYTLA 945
VV D SG + V N+G + N + L+ S F A G++ G Y Y LA
Sbjct: 499 VVMQDASGQHRLWVRNSGS-EPASANTLLLVQTPLGSAATFTLANKDGKVDIGTYRYRLA 557

Query: 946 RGQGANSGNWYLTSGSDSPELQPEPDPMPNPEPNPNPEPN-PNPTPTPGPDLNVDNDLRP 1004
+G W L P +P P P P P P P+P P P P G +L+
Sbjct: 558 ---ANGNGQWSLVGAKAPPAPKPAPQPGPQPPQPPQPQPEAPAPQPPAGRELS------- 607

Query: 1005 EAGSYIANLAAANTMFTTRLHERLGNTYYTDMVTGEQKQTTMWMRHEGGHNKWRDGSGQL 1064
AAAN T +Y + ++ + + + G W G Q
Sbjct: 608 ---------AAANAAVNTGGVGLASTLWYAESNALSKRLGELRLNPDAG-GAWGRGFAQR 657

Query: 1065 KTQSNRYV---------LQLGGDVAQWSQNGSDRWHVGVMAGYGNSDSKTISSRTGYRAK 1115
+ NR +LG D A G RWH+G +AGY D G+
Sbjct: 658 QQLDNRAGRRFDQKVAGFELGADHAVAVAGG--RWHLGGLAGYTRGDRGFTGDGGGH--- 712

Query: 1116 ASVNGYSTGLYATWYADDESRNGAYLDSWAQYSWFDN--TVKGDDLQS--ESYKSKGFTA 1171
+ G YAT+ AD +G YLD+ + S +N V G D + Y++ G A
Sbjct: 713 --TDSVHVGGYATYIAD----SGFYLDATLRASRLENDFKVAGSDGYAVKGKYRTHGVGA 766

Query: 1172 SLEAGYKHKLAEFNGSQGTRNEWYVQPQAQVTWMGVKADKHRESNGTLVHSNGDGNVQTR 1231
SLEAG + A + W+++PQA++ +R +NG V G +V R
Sbjct: 767 SLEAGRRFTHA---------DGWFLEPQAELAVFRAGGGAYRAANGLRVRDEGGSSVLGR 817

Query: 1232 LGVKTWLKSHHKMDDGKSREFQPFVEVNWLHNSKDFST-SMDGVSVTQDGARNIAEIKTG 1290
LG L+ +++ R+ QP+++ + L T +G++ + AE+ G
Sbjct: 818 LG----LEVGKRIELAGGRQVQPYIKASVLQEFDGAGTVHTNGIAHRTELRGTRAELGLG 873

Query: 1291 VEGQLNANLNVWGN 1304
+ L +++ +
Sbjct: 874 MAAALGRGHSLYAS 887


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0359   HTHTETR       65     2e-15    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 65.4 bits (159), Expect = 2e-15
Identities = 32/172 (18%), Positives = 59/172 (34%), Gaps = 15/172 (8%)

Query: 16 RRRQLIDATLEAINEVGMHDATIAQIARRAGVSTGIISHYFRDKNGLLEATMRDITSQLR 75
R+ ++D L ++ G+ ++ +IA+ AGV+ G I +F+DK+ L S +
Sbjct: 12 TRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIG 71

Query: 76 DAVLNRLHALPQGSAEQRLQAIVGGNFDETQVSSAAMKAWLAFWASSMHQP-------ML 128
+ L P G L+ I+ + T V+ + +
Sbjct: 72 ELELEYQAKFP-GDPLSVLREILIHVLEST-VTEERRRLLMEIIFHKCEFVGEMAVVQQA 129

Query: 129 YRLQQVSSRRLLSNLVSEFRRE---LPRHQAQEAGYGLAALIDGL---WLRA 174
R + S + + + A + I GL WL A
Sbjct: 130 QRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLFA 181


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0350   IGASERPTASE   30     0.013    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 30.4 bits (68), Expect = 0.013
Identities = 20/109 (18%), Positives = 42/109 (38%), Gaps = 4/109 (3%)

Query: 252 ASAQGTGSATQNLNLSVADSTIYSDVLALSESENSAATTTNVNMNVARSYWEGNAYTFNS 311
A+ +A+ + + T + + + TT ++ S+ N
Sbjct: 761 ANITSNITASNKAQVHIGYKTGDTVCVRSDYTGYVTCTTDKLSDKALNSF---NPTNLRG 817

Query: 312 GDKAGSNLDINLSDSSVWKGKVSGAGNASVSLQNESVWNVTGSSTVDAL 360
+ + L ++++ G + GN+ V L S W++TG+S V L
Sbjct: 818 NVNLTESANFVLGKANLF-GTIQSRGNSQVRLTENSHWHLTGNSDVHQL 865


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0349   PRTACTNFAMLY  28     0.002    Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 28.5 bits (63), Expect = 0.002
Identities = 13/48 (27%), Positives = 22/48 (45%)

Query: 35 IDGAAFRVGAGVQADITKNMGAYASLDYTKGDDIENPLQGVVGINVTW 82
+ G +G G+ A + + YAS +Y+KG + P G +W
Sbjct: 863 LRGTRAELGLGMAAALGRGHSLYASYEYSKGPKLAMPWTFHAGYRYSW 910


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0337   HTHTETR       28     0.029    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 28.4 bits (63), Expect = 0.029
Identities = 12/42 (28%), Positives = 19/42 (45%)

Query: 14 RQKILQQLLEWIECNLEHPISIEDIAQKSGYSRRNIQLLFRN 55
RQ IL L S+ +IA+ +G +R I F++
Sbjct: 13 RQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKD 54


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0336   INTIMIN       549    e-178    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 549 bits (1415), Expect = e-178
Identities = 226/818 (27%), Positives = 357/818 (43%), Gaps = 49/818 (5%)

Query: 41 PVMAARAQHAVQPRLSMGNTTVTADNNVEKNVASFAANAGTFLSSQPDS-----DATRNF 95
P++AA +L+ + VT N + ++AA L SQ S D ++
Sbjct: 131 PLVAAGGVAGHTNKLTKMSPDVTKSNMTDDKALNYAAQQAASLGSQLQSRSLNGDYAKDT 190

Query: 96 ITGMATAKANQEIQEWLGKYGTARVKLNVDKDFSLKDSSLEMLYPIYDTPTNMLFTQGAI 155
G+A +A+ ++Q WL YGTA V L +F SSL+ L P YD+ + F Q
Sbjct: 191 ALGIAGNQASSQLQAWLQHYGTAEVNLQSGNNFD--GSSLDFLLPFYDSEKMLAFGQVGA 248

Query: 156 HRTDDRTQSNIGFGWRHFSGNDWMAGVNTFIDHDLSRSHTRIGVGAEYWRDYLKLSANGY 215
D R +N+G G R F M G N FID D S +TR+G+G EYWRDY K S NGY
Sbjct: 249 RYIDSRFTANLGAGQRFFLPE-NMLGYNVFIDQDFSGDNTRLGIGGEYWRDYFKSSVNGY 307

Query: 216 IRASGWKKSPDIEDYQERPANGWDIRAEGYLPAWPQLGASLMYEQYYGDEVGLFGKDKRQ 275
R SGW +S + +DY ERPANG+DIR GYLP++P LGA LMYEQYYGD V LF DK Q
Sbjct: 308 FRMSGWHESYNKKDYDERPANGFDIRFNGYLPSYPALGAKLMYEQYYGDNVALFNSDKLQ 367

Query: 276 KDPHAISAEVTYTPVPLLTLSAGHKQGKSGENDTRFGLEVNYRIGEPLAKQLDTDSIRER 335
+P A + V YTP+PL+T+ ++ G END + ++ Y+ +P ++Q++ + E
Sbjct: 368 SNPGAATVGVNYTPIPLVTMGIDYRHGTGNENDLLYSMQFRYQFDKPWSQQIEPQYVNEL 427

Query: 336 RVLAGSRYDLVERNNNIVLEYRKSEVIRIALPERIEGKGGQTLSLGLVVSKATHGLKNVQ 395
R L+GSRYDLV+RNNNI+LEY+K +++ + +P I G T + L+V K+ +GL +
Sbjct: 428 RTLSGSRYDLVQRNNNIILEYKKQDILSLNIPHDINGTERSTQKIQLIV-KSKYGLDRIV 486

Query: 396 WEAPSLLAEGGKITGQGSQ----WQVTLPAYRPGKDNYYAISAVAYDNKGNTSKRVQTEV 451
W+ +L ++GG+I GSQ +Q LPAY G N Y ++A AYD GN+S V +
Sbjct: 487 WDDSALRSQGGQIQHSGSQSAQDYQAILPAYVQGGSNVYKVTARAYDRNGNSSNNVLLTI 546

Query: 452 VITGAGMSADRTALTLDGQSRIQMLANGNEQKPLVLSLRDAEGQPVTGMKDQIKTELTFK 511
+ G D+ +T + A+G E +++ Q ++F
Sbjct: 547 TVLSNGQVVDQVGVTDFTADKTSAKADGTEAITYTATVKKNG-------VAQANVPVSFN 599

Query: 512 PAGNIVTRSLKATKSQAKPTLGEFTETEAGVYQSVFTTGTQSGEATITVSVDGMSKTVTA 571
S + + G + G+ ++ M+ + A
Sbjct: 600 IVSGTAVLSANSANTNGS-----------GKATVTLKSDK-PGQVVVSAKTAEMTSALNA 647

Query: 572 ELRATMMDVANSTLSANEPSGDVVADGQQAYTLTLTAVDSEGNPVTGEASRLRFVPQDTN 631
+ S VA+GQ A T T+ V PV+ + T
Sbjct: 648 NAVIFVDQTKASITEIKADKTTAVANGQDAITYTVK-VMKGDKPVSNQEVTF-----TTT 701

Query: 632 GVTVGAIS--EIKPGVYSAAVSSTRAGNVVVRAFSEQYQLGTLQQTLKFVAGP-LDAAHS 688
+ + G ++ST G +V A + ++F +D +
Sbjct: 702 LGKLSNSTEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFFTTLTIDDGNI 761

Query: 689 SITLNPDKPVVGGTVTAIWTVKDAYDNPVTSLTPE---APSLAGAAAEGSTASGWTNNGD 745
I V G + +W + + + + A+ +++ T
Sbjct: 762 EIVGTG----VKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPAIASVDASSGQVTLKEK 817

Query: 746 GTWTAQITLGSTAGELEVMPKLNGQNAAANAAKVTVVADALSSNQSKVSVAEDHVKAGES 805
GT T + + N N +K DA+++ ++ E+
Sbjct: 818 GTTTISVISSDNQTATYTIATPNSL-IVPNMSKRVTYNDAVNTCKNFGGKLPSSQNELEN 876

Query: 806 TTVTLVAKDAHGNAISGLALSASLTGTASEGATVSSWT 843
A + + S + + + TA + + + T
Sbjct: 877 VFKAWGAANKYEYYKSSQTIISWVQQTAQDAKSGVAST 914



Score = 74.0 bits (181), Expect = 4e-15
Identities = 76/421 (18%), Positives = 140/421 (33%), Gaps = 45/421 (10%)

Query: 976 NGQNAVAQPLVLNVAGDAS-KAEIRDMTVKVNNQLANGQSANQITLTVVDTYGNPLQGQE 1034
N N V + + G + + D T + A+G A T TV G
Sbjct: 537 NSSNNVLLTITVLSNGQVVDQVGVTDFTADKTSAKADGTEAITYTATVKKN-GVAQANVP 595

Query: 1035 VTLTLPQGVTSKTGNTVTTNAAGKADIELMSTVAGEHNISASVNGAQKTV---TVKFNAD 1091
V+ + G + N+ TN +GKA + L S G+ +SA + V F
Sbjct: 596 VSFNIVSGTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAKTAEMTSALNANAVIFVDQ 655

Query: 1092 ASTGQANLQVDAAAQKVANGKDAFTLTANVEDKNGNPVPGSLVTFNLPRGVKPLTGDNVW 1151
++ D VANG+DA T T V K PV VTF G + +
Sbjct: 656 TKASITEIKADKTTA-VANGQDAITYTVKV-MKGDKPVSNQEVTFTTTLGKLSNSTE--- 710

Query: 1152 VKANDEGKAELQVVSVTAGTYEITASAGNSQPSNTQTITFVADKATATVSGIEVIGNYAL 1211
K + G A++ + S T G ++A + T IE++G
Sbjct: 711 -KTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFFTTLTIDDGNIEIVGT--- 766

Query: 1212 ADGNAKQTYKVTVTDANNNLLK---DSEVTLTASPANLVLTPNGTAKTNEQGQAIFTATT 1268
G + V + NL + + T ++ + + + + + T +
Sbjct: 767 --GVKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPAIASVDASSGQVTLKEKGTTTISV 824

Query: 1269 TVAAKYTLTAKVSQADGQESTKTAESKFVADDTNAVLTASSDVTSLVADGISTAKLEVTL 1328
+ T T + N+++ + D ++T K
Sbjct: 825 ISSDNQTAT------------------YTIATPNSLIVPNMSKRVTYNDAVNTCKNFGGK 866

Query: 1329 MSANNPVGGNMWVDIKTPEGVTEKDYQFLPSKNDHFVSGKITRTFSTSKPGVYTFTFNAL 1388
+ ++ N++ Y++ S + + +T +K GV + T++ +
Sbjct: 867 LPSSQNELENVFKAWGAANK-----YEYYKSSQT--IISWVQQTAQDAKSGVAS-TYDLV 918

Query: 1389 T 1389

Sbjct: 919 K 919



Score = 74.0 bits (181), Expect = 4e-15
Identities = 72/347 (20%), Positives = 114/347 (32%), Gaps = 46/347 (13%)

Query: 905 KTTTELTFTVK----DAYGNPVTGLKPDAPVFSGAASTGSERPSAGNWTEKGNGVYVSTL 960
T TVK PV+ + SG A SA + G+G TL
Sbjct: 575 TEAITYTATVKKNGVAQANVPVSFN-----IVSGTAV-----LSANSANTNGSGKATVTL 624

Query: 961 TLGSAAGQLSVMPRVNGQNAVAQPLVLNVAGDASKAEIRDMTVKVNNQLANGQSANQITL 1020
+ +A+ V+ V D +KA I ++ +ANGQ A IT
Sbjct: 625 KSDKPGQVVVSAKTAEMTSALNANAVIFV--DQTKASITEIKADKTTAVANGQDA--ITY 680

Query: 1021 TVVDTY-GNPLQGQEVTLTLPQGVTSKTGNTVTTNAAGKADIELMSTVAGEHNISASVNG 1079
TV P+ QEVT T G S + T T+ G A + L ST G+ +SA V+
Sbjct: 681 TVKVMKGDKPVSNQEVTFTTTLGKLSNS--TEKTDTNGYAKVTLTSTTPGKSLVSARVSD 738

Query: 1080 AQ---KTVTVKFNADASTGQANLQVDAAAQKVANGKDAFTLTANVEDKNGN-PVPGSLVT 1135
K V+F + N+++ V G T ++ N G
Sbjct: 739 VAVDVKAPEVEFFTTLTIDDGNIEI------VGTGVKGKLPTVWLQYGQVNLKASGGNGK 792

Query: 1136 FNLPRGVKPLTGDNVWVKANDEGKAELQVVSVTAGTYEITASAGNSQPSNTQTITFVADK 1195
+ N + + D QV GT I+ + ++Q T+
Sbjct: 793 YT-------WRSANPAIASVDASSG--QVTLKEKGTTTISVISSDNQT-----ATYTIAT 838

Query: 1196 ATATVSGIEVIGNYALADGNAKQTYKVTVTDANNNLLKDSEVTLTAS 1242
+ + + D ++ N L++ A+
Sbjct: 839 PNSLIV-PNMSKRVTYNDAVNTCKNFGGKLPSSQNELENVFKAWGAA 884



Score = 51.2 bits (122), Expect = 3e-08
Identities = 71/405 (17%), Positives = 126/405 (31%), Gaps = 56/405 (13%)

Query: 768 NGQNAAANAAKVTVVADALSSNQSKV---SVAEDHVKAGESTTVTLVA------KDAHGN 818
NG ++ +TV+++ +Q V + + KA + +T A
Sbjct: 535 NGNSSNNVLLTITVLSNGQVVDQVGVTDFTADKTSAKADGTEAITYTATVKKNGVAQANV 594

Query: 819 AISGLALSASLTGTASEGATVSSWTEKGNGSYVATLTTGGKTGELRVMPLFNGQPAATEA 878
+S +S GTA A +S G+G TL + + A
Sbjct: 595 PVSFNIVS----GTAVLSA--NSANTNGSGKATVTLKSDKPGQVVVSAKTAEMTSALN-- 646

Query: 879 AQLTVIAGEMSSANSTLVADNKAPTVKTTTELTFTVKDAY-GNPVTGLKPDAPVFSGAAS 937
A + + ++ + + AD +T+TVK PV+ + +
Sbjct: 647 ANAVIFVDQTKASITEIKADKTTAVANGQDAITYTVKVMKGDKPVSNQEV-------TFT 699

Query: 938 TGSERPSAGNWTEKGNGVYVSTLTLGSAAGQLSVMPRVNGQN-AVAQPLVLNVAG---DA 993
T + S NG TLT + G+ V RV+ V P V D
Sbjct: 700 TTLGKLSNSTEKTDTNGYAKVTLT-STTPGKSLVSARVSDVAVDVKAPEVEFFTTLTIDD 758

Query: 994 SKAEIRDMTVK---VNNQLANGQSANQI-------TLTVVDTYGNPLQGQEVTLTLPQGV 1043
EI VK L GQ + T + + +TL
Sbjct: 759 GNIEIVGTGVKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPAIASVDASSGQVTL---- 814

Query: 1044 TSKTGNTVT---------TNAAGKADIELMSTVAGEHNISASVNGAQKTVTVKFNADAST 1094
K T++ T + ++ ++ + +VN + +
Sbjct: 815 KEKGTTTISVISSDNQTATYTIATPNSLIVPNMSKRVTYNDAVNTCKNFGGKL--PSSQN 872

Query: 1095 GQANLQVD-AAAQKVANGKDAFTLTANVEDKNGNPVPGSLVTFNL 1138
N+ AA K K + T+ + V+ + G T++L
Sbjct: 873 ELENVFKAWGAANKYEYYKSSQTIISWVQQTAQDAKSGVASTYDL 917



Score = 44.3 bits (104), Expect = 4e-06
Identities = 53/303 (17%), Positives = 91/303 (30%), Gaps = 63/303 (20%)

Query: 1035 VTLTLPQGVTSKTGNTVTTNAAGKADIELMSTVAGEHNISASVNGAQKTVTVKFNADAST 1094
++L +P + +T K+ L V + + + G Q ++ + S
Sbjct: 454 LSLNIPHDINGTERSTQKIQLIVKSKYGLDRIVWDDSALRSQ--GGQ----IQHSGSQSA 507

Query: 1095 GQANLQVDAAAQKVANGKDAFTLTANVEDKNGNPVPGSLVTFNLPRGVKPLTGDNVWVKA 1154
+ A Q +N + +TA D+NG
Sbjct: 508 QDYQAILPAYVQGGSN---VYKVTARAYDRNG---------------------------- 536

Query: 1155 NDEGKAELQVVSVTAGTYEITASAGNSQPSNTQTITFVADKATATVSGIEVIGNYALADG 1214
N L + ++ G + F ADK + A ADG
Sbjct: 537 NSSNNVLLTITVLSNGQVVDQVGVTD----------FTADKTS------------AKADG 574

Query: 1215 NAKQTYKVTVTDANNNLLKDSEVTLTASPANLVLTPNGTAKTNEQGQAIFTATTTVAAKY 1274
TY TV S + +A TN G+A T + +
Sbjct: 575 TEAITYTATVKKNGVAQANVPVSFNIVS--GTAVLSANSANTNGSGKATVTLKSDKPGQV 632

Query: 1275 TLTAKVSQADGQESTKTAESKFVADDTNAVLTASSDVTSLVADGISTAKLEVTLMSANNP 1334
++AK A+ + FV ++ +D T+ VA+G V +M + P
Sbjct: 633 VVSAK--TAEMTSALNANAVIFVDQTKASITEIKADKTTAVANGQDAITYTVKVMKGDKP 690

Query: 1335 VGG 1337
V
Sbjct: 691 VSN 693


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0321   PF00577       63     4e-12    Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 62.9 bits (153), Expect = 4e-12
Identities = 30/247 (12%), Positives = 72/247 (29%), Gaps = 23/247 (9%)

Query: 487 TLNLNSLWSKLGTFSISYNDDRRYNSHYYTADYYQNVYSGTFGSLGLRAGIQRYNNGDSN 546
L + + T +S + Y + +Q + F + N
Sbjct: 530 QLTVTQQLGRTSTLYLSG-SHQTYWGTSNVDEQFQAGLNTAFEDINWTLSYSLTKNAWQK 588

Query: 547 ANTGKYIALDLSLPLGNWFSAGMTHQNGYTMANLSARKQFDEGT------------IRTV 594
+ +AL++++P +W + Q + A+ S + +
Sbjct: 589 -GRDQMLALNVNIPFSHWLRSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNL 647

Query: 595 GANLSRAISGDTGDDKTLSGGAYAQFDARYASGTLNVNSAADGYVNTNLTANGSVGWQGK 654
++ +G + +G A + Y + + S +D +G V
Sbjct: 648 SYSVQTGYAGGGDGNSGSTGYATLNYRGGYGNANIGY-SHSDDIKQLYYGVSGGVLAHAN 706

Query: 655 NIAASGRTDGNAGVIFNTGLED---DGQISAKINGRIFPLNGKRNYLPLSPYGRYEVELQ 711
+ + ++ G +D + Q + + R G + Y V L
Sbjct: 707 GVTLGQPLNDTVVLVKAPGAKDAKVENQTGVRTDWR-----GYAVLPYATEYRENRVALD 761

Query: 712 NSKNSLD 718
+ + +
Sbjct: 762 TNTLADN 768


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0311   TCRTETB       34     0.001    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 34.1 bits (78), Expect = 0.001
Identities = 36/194 (18%), Positives = 73/194 (37%), Gaps = 8/194 (4%)

Query: 2 STLSQTPSPHHQHAYWGGIFAMTLCVFVLIASEFMPVSLLTPIARDLGVTEGLAGRGIAI 61
++ SQ+ H+Q W I L F ++ + VSL IA D
Sbjct: 3 TSYSQSNLRHNQILIWLCI----LSFFSVLNEMVLNVSLPD-IANDFNKPPASTNWVNTA 57

Query: 62 SGALAVLTSLTLSTLAGKMNRKFLLLGMTVLMAVSGLIIALATSYLMYMV-GRAMIGVAI 120
+ + L+ ++ K LLL ++ +I + S+ ++ R + G
Sbjct: 58 FMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGA 117

Query: 121 GGFWSMSAATAIRLVPQHQVTRALAIFNAGNALATVVAAPLGSYLGATVGWRGAFLCLVP 180
F ++ R +P+ +A + + A+ V +G + + W ++L L+P
Sbjct: 118 AAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHW--SYLLLIP 175

Query: 181 MAVVAFIWQCISLP 194
M + + + L
Sbjct: 176 MITIITVPFLMKLL 189


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0306   DHBDHDRGNASE  82     6e-21    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 82.4 bits (203), Expect = 6e-21
Identities = 56/190 (29%), Positives = 88/190 (46%), Gaps = 2/190 (1%)

Query: 3 KVILITGASSGIGEGIARELGMTGAKVLLGARRVERIEAIATEICRAGGIAKARELDVTD 62
K+ ITGA+ GIGE +AR L GA + E++E + + + A+A DV D
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRD 68

Query: 63 RQSMADFVQAALDSWGRVDVLINNAGVMPLSPLAAGKQDEWALTIDVNIKGVLWGIGAVL 122
++ + G +D+L+N AGV+ + + +EW T VN GV +V
Sbjct: 69 SAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVS 128

Query: 123 PVMEAQGSGQIINLGSIGALSVVPTGAVYCASKFAVR--AISDGLRQESSKIRVTCVNPG 180
M + SG I+ +GS A + A Y +SK A GL IR V+PG
Sbjct: 129 KYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSPG 188

Query: 181 VVESELASTI 190
E+++ ++
Sbjct: 189 STETDMQWSL 198


69  ECs0262  ECs0212  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
ECs0262   2       18       -2.028815   hypothetical protein
ECs0261   2       19       1.125938    hypothetical protein
ECs0260   2       19       1.504816    toxin YafO
ECs0259   1       22       2.224802    antitoxin of the YafO-YafN toxin-antitoxin
ECs0258   1       24       2.649689    DNA polymerase IV
ECs0256   1       24       1.859484    hypothetical protein
ECs0257   0       22       1.219408    FhiA protein
ECs0255   2       17       -2.002165   hypothetical protein
ECs0254   2       16       -1.406695   lipoprotein
ECs0253   -1      16       1.174201    protein DinJ
ECs0252   -2      16       1.202590    hypothetical protein
ECs0251   -2      17       2.041607    hypothetical protein
ECs0250   -2      15       -0.469267   amidotransferase
ECs0249   -1      14       -1.732417   phosphoheptose isomerase
ECs0248   -2      16       -2.626292   acyl-CoA dehydrogenase
ECs0247   0       26       -3.573702   C-lysozyme inhibitor
ECs0246   0       26       -5.122145   hypothetical protein
ECs0245   2       31       -7.700658   H repeat-containing protein
ECs0244   1       32       -5.886975   hypothetical protein
ECs0243   1       31       -5.940819   hypothetical protein
ECs0242   3       31       2.447172    Rhs core protein
ECs0241   2       29       2.306259    H repeat-containing protein
ECs0240   4       32       4.099835    hypothetical protein
ECs0239   4       30       5.084625    hypothetical protein
ECs0238   3       29       5.055099    hypothetical protein
ECs0237   4       34       6.326786    RhsG core protein with extension
ECs0236   2       19       1.761759    protein VgrG
ECs0235   0       20       0.158283    hypothetical protein
ECs0234   1       20       0.741991    hypothetical protein
ECs0233   1       20       1.099447    hypothetical protein
ECs0232   2       19       2.149638    hypothetical protein
ECs0231   1       18       2.128507    hypothetical protein
ECs0230   2       19       2.934303    hypothetical protein
ECs0229   1       18       2.891353    hypothetical protein
ECs0228   1       22       5.041219    hypothetical protein
ECs0227   0       26       6.014325    hypothetical protein
ECs0226   0       26       5.904055    lipoprotein
ECs0225   0       25       6.425236    hypothetical protein
ECs0224   1       25       6.285226    hypothetical protein
ECs0223   1       23       6.080078    ATP-dependent Clp proteinase ATP-binding chain
ECs0222   1       18       3.622044    hypothetical protein
ECs0221   0       15       0.946544    hypothetical protein
ECs0220   0       17       -0.491111   hypothetical protein
ECs0218   0       17       -2.434716   IcmF-like protein
ECs0217   1       27       -7.781210   hypothetical protein
ECs0216   1       35       -11.352107  Hcp-like protein
ECs0214   0       30       -8.299466   hypothetical protein
ECs0213   -2      30       -7.539099   hypothetical protein
ECs0212   -2      23       -4.022461   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0261   SACTRNSFRASE  30     0.003    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 29.5 bits (66), Expect = 0.003
Identities = 16/56 (28%), Positives = 27/56 (48%), Gaps = 8/56 (14%)

Query: 77 YIDMLFVDPEYTRRGVASALL---KPWIKSESEL-----TVDASITAKPFFERYGF 124
I+ + V +Y ++GV +ALL W K T D +I+A F+ ++ F
Sbjct: 91 LIEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHF 146


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0256   OMPADOMAIN    39     9e-06    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 39.1 bits (91), Expect = 9e-06
Identities = 30/118 (25%), Positives = 46/118 (38%), Gaps = 22/118 (18%)

Query: 121 FERGSAQIMPFFKTLLVELAPVFDSL---DNKIIITGHTDAM---AYKNNIYNNWNLSGD 174
F A + P + L +L +L D +++ G+TD + AY N LS
Sbjct: 223 FNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAY------NQGLSER 276

Query: 175 RALSARRVLEEAGMPEDKVMQVS-----AMADQMLLDAKNPQS-----AGNRRIEIMV 222
RA S L G+P DK+ + + K + A +RR+EI V
Sbjct: 277 RAQSVVDYLISKGIPADKISARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEV 334


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0252   ENTSNTHTASED  27     0.010    Enterobactin synthetase component D signature.
		>ENTSNTHTASED#Enterobactin synthetase component D signature.

Length = 234

Score = 26.5 bits (58), Expect = 0.010
Identities = 6/23 (26%), Positives = 10/23 (43%)

Query: 39 AVYKDHPLQGSWKGYRDAHVEPD 61
+VYK + + G+ A V
Sbjct: 153 SVYKAFSDRVTLPGFNSAKVTSL 175


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0237   OUTRSURFACE   39     1e-04    Outer surface protein signature.
		>OUTRSURFACE#Outer surface protein signature.

Length = 273

Score = 38.7 bits (90), Expect = 1e-04
Identities = 42/199 (21%), Positives = 75/199 (37%), Gaps = 38/199 (19%)

Query: 395 RVTITDSLNRR--EVLYTEGEGGLKRVVKKEHADGSITRSEYDEAGRL--KAQTDAAGRR 450
++TI D L++ E+ +G+ + R V + D + T ++E G L K T G +
Sbjct: 87 KLTIADDLSKTTFELFKEDGKTLVSRKVSSK--DKTSTDEMFNEKGELSAKTMTRENGTK 144

Query: 451 TEYSLHMASGAVTAVTGPDGRTVRYGYNSQRQVTSVTYPDGLRSSREYDEKGRLAAETSR 510
EY+ + G A T+ + + ++ +G + L+ E ++
Sbjct: 145 LEYTEMKSDGTGKAKEVLKNFTLEGKVANDK--VTLEVKEGTVT---------LSKEIAK 193

Query: 511 SGETTRYSYDDPASELPTGIQDATGSTKQMA-WSRYGQLLTFTDCSGYTTRYEYDRYGQQ 569
SGE T D + T +TK+ W LT + S TT+
Sbjct: 194 SGEVTVALND----------TNTTQATKKTGAWDSKTSTLTISVNSKKTTQL-------- 235

Query: 570 IAVHREEGISTYSSYNPRG 588
V ++ T Y+ G
Sbjct: 236 --VFTKQDTITVQKYDSAG 252



Score = 33.0 bits (75), Expect = 0.007
Identities = 30/139 (21%), Positives = 52/139 (37%), Gaps = 39/139 (28%)

Query: 549 LTFTDCSGYTTRYEYDRYGQQIA---VHREEGISTYSSYNPRGQLVSQKDAQGRETRYEY 605
LT D TT + G+ + V ++ ST +N +G+L ++ + T+ EY
Sbjct: 88 LTIADDLSKTTFELFKEDGKTLVSRKVSSKDKTSTDEMFNEKGELSAKTMTRENGTKLEY 147

Query: 606 SAAGDLTAIVAPDGSRSEIQYDAWGKAVST-------------------TQGGLTRSMGY 646
+E++ D GKA +G +T S
Sbjct: 148 ----------------TEMKSDGTGKAKEVLKNFTLEGKVANDKVTLEVKEGTVTLSKEI 191

Query: 647 DAAGRITV-LTNENGSQST 664
+G +TV L + N +Q+T
Sbjct: 192 AKSGEVTVALNDTNTTQAT 210


70  ECs0162  ECs0143  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
ECs0162   0       15       4.163760    vitamin B12-transporter protein BtuF
ECs0161   1       16       4.404706    hypothetical protein
ECs0160   0       17       4.739139    iron-sulfur cluster insertion protein ErpA
ECs0159   0       15       3.721262    chloride channel protein
ECs0158   -2      17       3.780611    glutamate-1-semialdehyde aminotransferase
ECs0157   -2      17       3.702865    iron-hydroxamate transporter permease
ECs0156   -1      13       3.000019    iron-hydroxamate transporter substrate-binding
ECs0155   -1      12       3.076484    iron-hydroxamate transporter ATP-binding
ECs0154   0       12       2.454325    ferrichrome outer membrane transporter
ECs0153   -1      15       3.124352    penicillin-binding protein 1b
ECs5364   0       17       2.974827    hypothetical protein
ECs0152   0       14       3.252634    ATP-dependent RNA helicase HrpB
ECs0151   -1      15       1.817691    2'-5' RNA ligase
ECs0150   0       14       0.582553    sugar fermentation stimulation protein A
ECs0149   0       14       0.408508    RNA polymerase-binding transcription factor
ECs0148   1       14       -0.819266   glutamyl-Q tRNA(Asp) synthetase
ECs0147   3       16       -1.496050   poly(A) polymerase
ECs0146   4       19       -3.420893   2-amino-4-hydroxy-6-
ECs0145   4       18       -3.686325   fimbrial-like protein
ECs0144   3       17       -4.284015   chaperone protein EcpD
ECs0143   3       17       -3.544535   outer membrane usher protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0162   FERRIBNDNGPP  48     1e-08    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 48.0 bits (114), Expect = 1e-08
Identities = 42/202 (20%), Positives = 74/202 (36%), Gaps = 18/202 (8%)

Query: 7 RALVALSFLAPL-WLNAAP--------RVITLSPANTELAFAAGITPVGVSSYSDY---- 53
R L+ L+PL W R++ L EL A GI P GV+ +Y
Sbjct: 10 RRLLTAMALSPLLWQMNTAHAAAIDPNRIVALEWLPVELLLALGIVPYGVADTINYRLWV 69

Query: 54 --PPQAQKIEQVSTWQGMNLERIVALKPDLVIAWRG-GNAERQVDQLASLGIKVMWVDAT 110
PP + V NLE + +KP ++ G G + + ++A
Sbjct: 70 SEPPLPDSVIDVGLRTEPNLELLTEMKPSFMVWSAGYGPSPEMLARIAPGRGFNFSDGKQ 129

Query: 111 NIEQIANALRQLAPWSPQPDKAEQAAQSLLDQYAQLKAQYADKPKKRVFLQFGINPP--F 168
+ +L ++A AE D +K ++ + + + L I+P
Sbjct: 130 PLAMARKSLTEMADLLNLQSAAETHLAQYEDFIRSMKPRFVKRGARPLLLTTLIDPRHML 189

Query: 169 TSGKESIQNQVLEVCGGENIFK 190
G S+ ++L+ G N ++
Sbjct: 190 VFGPNSLFQEILDEYGIPNAWQ 211


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0156   FERRIBNDNGPP  511    0.0      Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 511 bits (1316), Expect = 0.0
Identities = 294/296 (99%), Positives = 295/296 (99%)

Query: 1 MSGLPLISRRRLLTAMALSPLLWQMNTAHAAAIDPNRIVALEWLPVELLLALGIVPYGVA 60
MSGLPLISRRRLLTAMALSPLLWQMNTAHAAAIDPNRIVALEWLPVELLLALGIVPYGVA
Sbjct: 1 MSGLPLISRRRLLTAMALSPLLWQMNTAHAAAIDPNRIVALEWLPVELLLALGIVPYGVA 60

Query: 61 DTINYRLWVSEPPLPESVIDVGLRTEPNLELLTEMKPSFMVWSAGYGPSPEMLARIAPGR 120
DTINYRLWVSEPPLP+SVIDVGLRTEPNLELLTEMKPSFMVWSAGYGPSPEMLARIAPGR
Sbjct: 61 DTINYRLWVSEPPLPDSVIDVGLRTEPNLELLTEMKPSFMVWSAGYGPSPEMLARIAPGR 120

Query: 121 GFNFSDGKQPLAMARKSLTEMADLLNLQSAAETHLAQYEDFIRSMNPRFVKRGARPLLLT 180
GFNFSDGKQPLAMARKSLTEMADLLNLQSAAETHLAQYEDFIRSM PRFVKRGARPLLLT
Sbjct: 121 GFNFSDGKQPLAMARKSLTEMADLLNLQSAAETHLAQYEDFIRSMKPRFVKRGARPLLLT 180

Query: 181 TLIDPRHMLVFGPNSLFQEILDEYGIPNAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDH 240
TLIDPRHMLVFGPNSLFQEILDEYGIPNAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDH
Sbjct: 181 TLIDPRHMLVFGPNSLFQEILDEYGIPNAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDH 240

Query: 241 DNSKDMDALMATPLWQAMPFVRAGRFQRVPAVWFYGATLSAMHFVRVLDNAIGGKA 296
DNSKDMDALMATPLWQAMPFVRAGRFQRVPAVWFYGATLSAMHFVRVLDNAIGGKA
Sbjct: 241 DNSKDMDALMATPLWQAMPFVRAGRFQRVPAVWFYGATLSAMHFVRVLDNAIGGKA 296


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs0143   PF00577  742    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 742 bits (1916), Expect = 0.0
Identities = 262/889 (29%), Positives = 425/889 (47%), Gaps = 45/889 (5%)

Query: 1 MYQFTHQKSRIPKKTLLA-----ACCALFYSSNGAAADTVEYDSSFLMGTGASTIDVKRY 55
+YQ Q I K L F + ++ + ++ FL + D+ R+
Sbjct: 8 LYQRNTQCLHIRKHRLAGFFVRLFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRF 67

Query: 56 AQGNPTPPGLYNVRVFVNGQATSSLEIPFV-DIGENSAAACLTHKNLAQLHIKQPEQPVT 114
G PPG Y V +++N ++ ++ F E CLT LA + +
Sbjct: 68 ENGQELPPGTYRVDIYLNNGYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLNTASVSGM 127

Query: 115 LLAREGEEEDCLDLAKSYEKADVCFDGSDQFLDLTIPQAYVLKSYGGYVDPSLWESGINA 174
L ++ C+ L A D Q L+LTIPQA++ GY+ P LW+ GINA
Sbjct: 128 NLL---ADDACVPLTSMIHDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINA 184

Query: 175 ATLAYTLNAYHTSSDND-NSDSVYGAFNSGINLGAWHFRARGNYNWTTDNGS-----DFD 228
L Y + + NS Y SG+N+GAW R +++ + + S +
Sbjct: 185 GLLNYNFSGNSVQNRIGGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQ 244

Query: 229 FQDRYLQRDIPAIRSQIIMGDAYTTGETFDSVNVRGVRLYSDSRMLPSALASYAPTIRGV 288
+ +L+RDI +RS++ +GD YT G+ FD +N RG +L SD MLP + +AP I G+
Sbjct: 245 HINTWLERDIIPLRSRLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGI 304

Query: 289 ANSNAKVTVTQSGYKIYETTVPPGEFVIDDISPSGFGSELVVTIEEADGSKRTFTQPFSS 348
A A+VT+ Q+GY IY +TVPPG F I+DI +G +L VTI+EADGS + FT P+SS
Sbjct: 305 ARGTAQVTIKQNGYDIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSS 364

Query: 349 VVQMQRPGVGRWDFSAGKV-IDDSLRSEPNMGQASYYYGLNNLFTGYTGIQFTDNNYLAG 407
V +QR G R+ +AG+ ++ + +P Q++ +GL +T Y G Q D Y A
Sbjct: 365 VPLLQREGHTRYSITAGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLAD-RYRAF 423

Query: 408 LLGVGINT-SIGAFAVDVTHSRAEIPDDKTYQGQSYRVTWNKLFQDTGTSFNLAAYRYST 466
G+G N ++GA +VD+T + + +PDD + GQS R +NK ++GT+ L YRYST
Sbjct: 424 NFGIGKNMGALGALSVDMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYST 483

Query: 467 QDYLGLHDALVLIDDAKHL--------SADEDKNTMQTYSRMKNQFTVSINQPLNIAYED 518
Y D + ++ + + + + +++ Q L
Sbjct: 484 SGYFNFADTTYSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLG----R 539

Query: 519 YGSLFISGSWTYYWAANNSRTEYNVGYSKSVSWGSFSVNLQRSWNE-DGEKDDAMYVSVS 577
+L++SGS YW +N ++ G + + +++++ + N +D + ++V+
Sbjct: 540 TSTLYLSGSHQTYWGTSNVDEQFQAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVN 599

Query: 578 VPIENILGGKRKSS-GFRNLNTQLNTDFDGSHQLNVNSSGNT-ENNLVNYSVNAGYSLDK 635
+P + L KS + + ++ D +G G E+N ++YSV GY+
Sbjct: 600 IPFSHWLRSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGG 659

Query: 636 NAGDLASVGGYLNYESGLGGISASASATSDNSQQYSISTDGGFVLHSGGLTFTNNSFSSN 695
+ ++ LNY G G + S SD+ +Q GG + H+ G+T N
Sbjct: 660 DGNSGSTGYATLNYRGGYGNANIGYSH-SDDIKQLYYGVSGGVLAHANGVTLGQP---LN 715

Query: 696 DTLVLINALGAKGARINNSNN-EIDRWGYAVTSSVSPYRENRVGLNIETLENDVELKSTS 754
DT+VL+ A GAK A++ N D GYAV + YRENRV L+ TL ++V+L +
Sbjct: 716 DTVVLVKAPGAKDAKVENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAV 775

Query: 755 ATTVPRSGSVVLTRFETDEGRSAVLNITAANGKSIPFAAEVYQGE-VMIGSMGQGGQAFV 813
A VP G++V F+ G ++ +T N K +PF A V G + GQ ++
Sbjct: 776 ANVVPTRGAIVRAEFKARVGIKLLMTLT-HNNKPLPFGAMVTSESSQSSGIVADNGQVYL 834

Query: 814 RGINDSGELIVRWYENNQTIDCKLHYQFPAQPQTQGSTNTLLLNNLTCQ 862
G+ +G++ V+W E C +YQ P + Q Q L + C+
Sbjct: 835 SGMPLAGKVQVKWGEEENAH-CVANYQLPPESQQQL----LTQLSAECR 878


71  ECs0074  ECs0067  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
ECs0074   0       15       3.173912    transcriptional regulator SgrR
ECs0073   -1      15       2.748819    hypothetical protein
ECs0072   0       17       4.628745    thiamine transporter substrate binding subunit
ECs0071   -1      18       4.949231    thiamine transporter membrane protein
ECs0070   -2      17       4.145695    thiamine transporter ATP-binding protein
ECs0069   -2      17       3.978310    hypothetical protein
ECs0068   -3      19       3.894162    AraC family transcriptional regulator
ECs0067   -2      20       4.032978    ribulokinase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs0071   PF06580  32     0.005    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 32.1 bits (73), Expect = 0.005
Identities = 17/80 (21%), Positives = 29/80 (36%), Gaps = 5/80 (6%)

Query: 4 RRQPLIPGWLIPGVSAATLVVAVALAAFLALWWNAPQGDWSAVWQDS-YLWHVVRFSFWQ 62
R GWL + L V A +W+ A +++W+ ++
Sbjct: 60 RSFIKRQGWLKLNMGQIILRVLPACVVIGMVWFVAN----TSIWRLLAFINTKPVAFTLP 115

Query: 63 AFLSALLSVVPAIFLARALY 82
LS + +VV F+ LY
Sbjct: 116 LALSIIFNVVVVTFMWSLLY 135


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs0068   PF05616  29     0.022    Neisseria meningitidis TspB protein
		>PF05616#Neisseria meningitidis TspB protein

Length = 501

Score = 28.9 bits (64), Expect = 0.022
Identities = 26/118 (22%), Positives = 47/118 (39%), Gaps = 21/118 (17%)

Query: 82 YGRHPEAREWYHQWVYFRPRAYWHEWLNWPSIFANTGFFRPDEAHQPHFSDLFGQ-IINA 140
Y R PE +E + R YW + N P ++ +F+ + +F G ++
Sbjct: 158 YSRFPEVKELMESQMERLARPYWEKLRNRPDMY----YFKNYNFKRCYFGLNGGDCLVAK 213

Query: 141 G-----------QGEGRYSELLAINLLEQLLLRRMEA-----INESLHPPMDNRVREA 182
G QG +Y E + LE++L +++A I + +P +V A
Sbjct: 214 GDDGRTFISFSLQGNSKYKEEMDAKKLEEILSLKVDANPDKYIKATGYPGYSEKVEVA 271


72  ECs0036  ECs0016  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
ECs0036   -1      24       3.382511    carbamoyl phosphate synthase large subunit
ECs0035   -3      15       2.642119    carbamoyl phosphate synthase small subunit
ECs0034   -1      21       3.424977    4-hydroxy-tetrahydrodipicolinate reductase
ECs0033   -2      21       3.469365    ribonucleoside hydrolase RihC
ECs0032   -1      22       3.117115    4-hydroxy-3-methylbut-2-enyl diphosphate
ECs0031   -1      23       2.620567    FKBP-type peptidyl-prolyl cis-trans isomerase
ECs0030   -2      14       -2.528813   lipoprotein signal peptidase
ECs0029   -2      15       -3.724094   isoleucyl-tRNA synthetase
ECs0028   -1      32       -11.040516  bifunctional riboflavin kinase/FMN
ECs0027   1       39       -13.042639  hypothetical protein
ECs0026   2       40       -13.506810  30S ribosomal protein S20
ECs0025   2       41       -13.989223  hypothetical protein
ECs0024   1       36       -10.926558  fimbrial protein
ECs0023   1       30       -8.745235   fimbrial chaperone
ECs0022   1       27       -7.665545   outer membrane usher protein
ECs0021   -2      16       -3.015134   hypothetical protein
ECs0019   -1      21       -1.465579   hypothetical protein
ECs0018   0       24       -0.337259   transcriptional activator NhaR
ECs0017   0       24       0.231112    pH-dependent sodium/proton antiporter
ECs0016   2       31       0.458770    Gef protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
ECs0036   HTHFIS  32     0.009    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 32.5 bits (74), Expect = 0.009
Identities = 20/110 (18%), Positives = 38/110 (34%), Gaps = 18/110 (16%)

Query: 34 CKALREEGYRVILVNS-----------NPATIMTDPEMADATYIEPIHWEVVRKIIEKER 82
+AL GY V + ++ + ++TD M D + + I+K R
Sbjct: 20 NQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD------LLPRIKKAR 73

Query: 83 PDAVLPTMGGQTALNCALELERQGVLEEFGVTM-IGATADAIDKAEDRRR 131
PD + M Q A++ +G + + I +A +
Sbjct: 74 PDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0031   INFPOTNTIATR  31     0.002    Macrophage infectivity potentiator signature.
		>INFPOTNTIATR#Macrophage infectivity potentiator signature.

Length = 233

Score = 30.7 bits (69), Expect = 0.002
Identities = 14/32 (43%), Positives = 19/32 (59%)

Query: 8 NSAVLVHFTLKLDDGTTAESTRNNGKPALFRL 39
+ V V +T L DGT +ST GKPA F++
Sbjct: 144 SDTVTVEYTGTLIDGTVFDSTEKAGKPATFQV 175


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs0022   PF00577  656    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 656 bits (1695), Expect = 0.0
Identities = 243/833 (29%), Positives = 397/833 (47%), Gaps = 38/833 (4%)

Query: 7 WVAAIIFLYSFPGYAEETFDTHFMIGGMKGEKVSEYHFDNKQP-LPGNYELDFYVNNQWR 65
+VA + AE F+ F+ F+N Q PG Y +D Y+NN +
Sbjct: 31 FVACAFAAQAPLSSAELYFNPRFLADD-PQAVADLSRFENGQELPPGTYRVDIYLNNGYM 89

Query: 66 GKQDITI----PESPVKPCLPKVLLTKLGVKTGNLNT-----EDNCILLDKAVHGGQYQW 116
+D+T E + PCL + L +G+ T +++ +D C+ L +H Q
Sbjct: 90 ATRDVTFNTGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDACVPLTSMIHDATAQL 149

Query: 117 DISEHRLNLTVPQAYINELERGYVPPESWDRGIDAFYTSYNLSQYRSYDSNNNSNTASYG 176
D+ + RLNLT+PQA+++ RGY+PPE WD GI+A +YN S + ++ +Y
Sbjct: 150 DVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSVQNRIGGNSHYAYL 209

Query: 177 RFNSGLNLFSWQLHSDASYSKPD-----DMKGTWQSNTLYLEHGWSQILSTVQIGENYTS 231
SGLN+ +W+L + ++S K WQ +LE + S + +G+ YT
Sbjct: 210 NLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRSRLTLGDGYTQ 269

Query: 232 SLIFDSLRFSGIRLFRDMQMLPDSMQSFTPLVQGVAQSNALITVSQNGYIIYQKEVPPGP 291
IFD + F G +L D MLPDS + F P++ G+A+ A +T+ QNGY IY VPPGP
Sbjct: 270 GDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIYNSTVPPGP 329

Query: 292 FTIADLQLSGSGSDLDVSIKEADGSVRSFLVPYSSVPNMLQPGISNFDFIAGRSKIYGVK 351
FTI D+ +G+ DL V+IKEADGS + F VPYSSVP + + G + + AG + +
Sbjct: 330 FTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITAGEYRSGNAQ 389

Query: 352 NQED-FLEANYIYGLNNLLTLYGGTILSDNYNAITLGNGWNT-PLGAISFDATRSSSKLN 409
++ F ++ ++GL T+YGGT L+D Y A G G N LGA+S D T+++S L
Sbjct: 390 QEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALSVDMTQANSTLP 449

Query: 410 NDITHEGTSYQVAYNKYLVQTATRFSVAAWRYASQDYRTFSDHLYENDKINHQSDYDDFY 469
+D H+G S + YNK L ++ T + +RY++ Y F+D Y + D
Sbjct: 450 DDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGYNIETQDGVI 509

Query: 470 DIG------------RKNSLSANIMQPLSNNLGNVSLSALWRNYWGRSGNAKDYQFSYSN 517
+ ++ L + Q L + LS + YWG S + +Q +
Sbjct: 510 QVKPKFTDYYNLAYNKRGKLQLTVTQQL-GRTSTLYLSGSHQTYWGTSNVDEQFQAGLNT 568

Query: 518 SWQRISYTFSASQSYDENDKEEER-FNLFISIPF--YWGDDIAKTRHQINLSNSTSFSKD 574
+++ I++T S S + + K ++ L ++IPF + D + S S S +
Sbjct: 569 AFEDINWTLSYSLTKNAWQKGRDQMLALNVNIPFSHWLRSDSKSQWRHASASYSMSHDLN 628

Query: 575 GYSSNNTGITGIAGEHDQLNYGI---YVNQQQQNNDTSLGTNLSWRTPIATIDGSYSHSK 631
G +N G+ G E + L+Y + Y N+ ++ L++R + YSHS
Sbjct: 629 GRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGYGNANIGYSHSD 688

Query: 632 NAWQSGGSISSGLVVWPGGINITNQLSDTFAILDAPGLEGAHINGQKYNRTNSKGQVVYD 691
+ Q +S G++ G+ + L+DT ++ APG + A + Q RT+ +G V
Sbjct: 689 DIKQLYYGVSGGVLAHANGVTLGQPLNDTVVLVKAPGAKDAKVENQTGVRTDWRGYAVLP 748

Query: 692 LMIPHRENHLVLDTANSESETELQGNRQIIAPYRGAVSYVQFTTDQRKPWYIQALRPDGS 751
+REN + LDT +L + P RGA+ +F + L +
Sbjct: 749 YATEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAEFKARVGIKLLMT-LTHNNK 807

Query: 752 PLTFGYDVLDLQENNIGVVGQGSRLFIRVDEIPTGIKVALNDEQNLFCTITFQ 804
PL FG V + G+V ++++ + ++V +E+N C +Q
Sbjct: 808 PLPFGAMVTSESSQSSGIVADNGQVYLSGMPLAGKVQVKWGEEENAHCVANYQ 860


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
ECs0016   HOKGEFTOXIC  60     2e-16    Hok/Gef cell toxic protein family signature.
		>HOKGEFTOXIC#Hok/Gef cell toxic protein family signature.

Length = 52

Score = 59.8 bits (145), Expect = 2e-16
Identities = 18/46 (39%), Positives = 29/46 (63%)

Query: 23 HKVMIVALIVICITAVVAALVTRKDLCEVHIRTGQTEVAVFTAYES 68
++ ++++C+T ++ +TRK LCE+ R G EVA F AYES
Sbjct: 5 RSSLVWCVLIVCLTLLIFTYLTRKSLCEIRYRDGYREVAAFMAYES 50


73  ECs5359  ECs5353  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
ECs5359   -2      13       0.534628    two-component response regulator
ECs5358   -2      16       1.175723    hypothetical protein
ECs5357   -2      15       2.424474    sensory histidine kinase CreC
ECs5356   -1      15       2.301831    DNA-binding response regulator CreB
ECs5355   0       14       2.443260    hypothetical protein
ECs5354   -1      15       3.150649    right origin-binding protein
ECs5353   -1      17       2.646343    phosphoglycerate mutase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
ECs5359   HTHFIS  82     4e-20    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 81.8 bits (202), Expect = 4e-20
Identities = 30/122 (24%), Positives = 60/122 (49%), Gaps = 1/122 (0%)

Query: 1 MQTPHILIVEDELVTRNTLKSIFEAEGYDVFEATDGAEMHQILSEYDINLVIMDINLPGK 60
M IL+ +D+ R L GYDV ++ A + + ++ D +LV+ D+ +P +
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 NGLLLARELRE-QANVALMFLTGRDNEVDKILGLEIGADDYITKPFNPRELTIRARNLLS 119
N L +++ + ++ ++ ++ ++ + I E GA DY+ KPF+ EL L+
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 120 RT 121

Sbjct: 121 EP 122


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs5357   PF06580  31     0.012    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 30.6 bits (69), Expect = 0.012
Identities = 40/182 (21%), Positives = 72/182 (39%), Gaps = 40/182 (21%)

Query: 312 LRQARLENRQEVVLTAVDVAALFR---RVSEARTVQLAE--KNITLHVM--------PTE 358
+R LE+ + ++ L R R S AR V LA+ + ++ +
Sbjct: 182 IRALILEDPTKAREMLTSLSELMRYSLRYSNARQVSLADELTVVDSYLQLASIQFEDRLQ 241

Query: 359 VNVAAEPALLEQALGNLL-----DNA----IDFTPKSGRITLSAEVDQEHVALKVLDTGS 409
PA+++ + +L +N I P+ G+I L D V L+V +TGS
Sbjct: 242 FENQINPAIMDVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGS 301

Query: 410 GIPDYALSRIFERFYSLPRANGQKSSGLGLAFVSE-VARLFNGEVTLR-NVQEGGVLASL 467
N ++S+G GL V E + L+ E ++ + ++G V A +
Sbjct: 302 LALK----------------NTKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMV 345

Query: 468 RL 469
+
Sbjct: 346 LI 347


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
ECs5356   HTHFIS  87     6e-22    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 86.8 bits (215), Expect = 6e-22
Identities = 33/139 (23%), Positives = 60/139 (43%)

Query: 1 MQRETVWLVEDEQGIADTLVYMLQQEGFAVEVFERGLPVLDKARQQVPDVMILDVGLPDI 60
M T+ + +D+ I L L + G+ V + + D+++ DV +PD
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 SGFELCRQLLALHPALPVLFLTARSEEVDRLLGLEIGADDYVAKPFSPREVCARVRTLLR 120
+ F+L ++ P LPVL ++A++ + + E GA DY+ KPF E+ + L
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 121 RVKKFSTPSPVIRIGHFEL 139
K+ + L
Sbjct: 121 EPKRRPSKLEDDSQDGMPL 139


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs5353   VACCYTOTOXIN  29     0.014    Helicobacter pylori vacuolating cytotoxin signature.
		>VACCYTOTOXIN#Helicobacter pylori vacuolating cytotoxin signature.

Length = 1291

Score = 29.2 bits (65), Expect = 0.014
Identities = 14/45 (31%), Positives = 20/45 (44%), Gaps = 4/45 (8%)

Query: 145 PLLVSHGIALGCLVSTILGLPAWAERRLRLRNCSISRVDYQESLW 189
P +V GIA G V T+ GL W ++ N D + +W
Sbjct: 42 PAIVG-GIATGAAVGTVSGLLGWGLKQAEEAN---KTPDKPDKVW 82


74  ECs5280  ECs5276  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
ECs5280   1       15       -1.993592   fructuronate transporter
ECs5279   2       24       -2.797723   D-mannose specific adhesin
ECs5278   1       23       -2.690966   protein FimG
ECs5277   0       26       -3.250254   protein FimF
ECs5276   1       27       -3.895622   protein FimD
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs5280   PF06580  31     0.008    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 31.4 bits (71), Expect = 0.008
Identities = 10/49 (20%), Positives = 25/49 (51%)

Query: 230 LVPLIPAIIMISTTIANIWLVKDTPAWEVVNFIGSSPIAMFIAMVVAFV 278
+ +I ++ I +W V +T W ++ FI + P+A + + ++ +
Sbjct: 73 MGQIILRVLPACVVIGMVWFVANTSIWRLLAFINTKPVAFTLPLALSII 121


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs5279   SURFACELAYER  28     0.044    Lactobacillus surface layer protein signature.
		>SURFACELAYER#Lactobacillus surface layer protein signature.

Length = 439

Score = 28.1 bits (62), Expect = 0.044
Identities = 19/79 (24%), Positives = 32/79 (40%), Gaps = 1/79 (1%)

Query: 211 SQNLGYYLSGTTADAGNSIFTNTASFSPAQGVGVQLTRNGTIIPANNTVSLGAVGTSAVS 270
S+N G ++ +A+ N FT PA V V L ++G ++ + + +
Sbjct: 133 SENAGKEITIGSAN-PNVTFTEKTGDQPASTVKVTLDQDGVAKLSSVQIKNVYAIDTTYN 191

Query: 271 LGLTANYARTGGQVTAGNV 289
+ TG VT G V
Sbjct: 192 SNVNFYDVTTGATVTTGAV 210


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs5278   VACCYTOTOXIN  30     0.003    Helicobacter pylori vacuolating cytotoxin signature.
		>VACCYTOTOXIN#Helicobacter pylori vacuolating cytotoxin signature.

Length = 1291

Score = 30.4 bits (68), Expect = 0.003
Identities = 30/158 (18%), Positives = 49/158 (31%), Gaps = 9/158 (5%)

Query: 3 WRKRGYLLAAILALASATIQAADVTITVNGKVVAKPCTVSTTNATVDLGDLYSFSLMSAG 62
W R + A LA + +TI + VT VN + + + + G
Sbjct: 258 WMGRLQYVGAYLAPSYSTINTSKVTGEVNFNHLTVGDHNAAQAGIIASNKTH------IG 311

Query: 63 AASAWHDVALELTNCPVG--TSRVTASFSGAADSTGYYKNQGTAQNIQLELQDDSGNTLN 120
W L + P G + S + Q ++QN + N+
Sbjct: 312 TLDLWQSAGLNIIAPPEGGYKDKPNDKPSNTTQNNAKNDKQESSQNNSNTQVINPPNSAQ 371

Query: 121 TGATKTVQVDDSSQSAHFPLQVRALTVNGGATQGTIQA 158
+ QV D + V +N A GTI+
Sbjct: 372 KTEIQPTQVIDGPFAGGKNTVVNINRINTNA-DGTIRV 408


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs5276   PF00577  1089   0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 1089 bits (2819), Expect = 0.0
Identities = 869/878 (98%), Positives = 873/878 (99%)

Query: 1 MSYLNLRLYQRNTQCLHIRKHRLAGFFVRLFVACAFAVQAPLSSAELYFNPRFLADDPQA 60
MSYLNLRLYQRNTQCLHIRKHRLAGFFVRLFVACAFA QAPLSSAELYFNPRFLADDPQA
Sbjct: 1 MSYLNLRLYQRNTQCLHIRKHRLAGFFVRLFVACAFAAQAPLSSAELYFNPRFLADDPQA 60

Query: 61 VADLSRFENGQELPPGTYRVDIYLNNGYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLN 120
VADLSRFENGQELPPGTYRVDIYLNNGYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLN
Sbjct: 61 VADLSRFENGQELPPGTYRVDIYLNNGYMATRDVTFNTGDSEQGIVPCLTRAQLASMGLN 120

Query: 121 TASVAGMNLLADDACVPLTTMVQDATAHLDVGQQRLNLTIPQAFMSNRARGYIPPELWDP 180
TASV+GMNLLADDACVPLT+M+ DATA LDVGQQRLNLTIPQAFMSNRARGYIPPELWDP
Sbjct: 121 TASVSGMNLLADDACVPLTSMIHDATAQLDVGQQRLNLTIPQAFMSNRARGYIPPELWDP 180

Query: 181 GINAGLLNYNFSGNSVQNRIGGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDRSSGSK 240
GINAGLLNYNFSGNSVQNRIGGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSD SSGSK
Sbjct: 181 GINAGLLNYNFSGNSVQNRIGGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSK 240

Query: 241 NKWQHINTWLERDIIPLRSRLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPV 300
NKWQHINTWLERDIIPLRSRLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPV
Sbjct: 241 NKWQHINTWLERDIIPLRSRLTLGDGYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPV 300

Query: 301 IHGIARGTAQVTIKQNGYDIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTV 360
IHGIARGTAQVTIKQNGYDIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTV
Sbjct: 301 IHGIARGTAQVTIKQNGYDIYNSTVPPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTV 360

Query: 361 PYSSVPLLQREGHTRYSITAGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRY 420
PYSSVPLLQREGHTRYSITAGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRY
Sbjct: 361 PYSSVPLLQREGHTRYSITAGEYRSGNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRY 420

Query: 421 RAFNFGIGKNMGALGALSVDMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYR 480
RAFNFGIGKNMGALGALSVDMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYR
Sbjct: 421 RAFNFGIGKNMGALGALSVDMTQANSTLPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYR 480

Query: 481 YSTSGYFNFADTTYSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRS 540
YSTSGYFNFADTTYSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGR+
Sbjct: 481 YSTSGYFNFADTTYSRMNGYNIETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRT 540

Query: 541 STLYLSGSHQTYWGTSNVDEQFQAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLARNVNI 600
STLYLSGSHQTYWGTSNVDEQFQAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLA NVNI
Sbjct: 541 STLYLSGSHQTYWGTSNVDEQFQAGLNTAFEDINWTLSYSLTKNAWQKGRDQMLALNVNI 600

Query: 601 PFSHWLRSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGD 660
PFSHWLRSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGD
Sbjct: 601 PFSHWLRSDSKSQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGD 660

Query: 661 GNSGSTGYATLNYRGGYGNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNDTVVL 720
GNSGSTGYATLNYRGGYGNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNDTVVL
Sbjct: 661 GNSGSTGYATLNYRGGYGNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNDTVVL 720

Query: 721 VKAPGAKDAKVENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVP 780
VKAPGAKDAKVENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVP
Sbjct: 721 VKAPGAKDAKVENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVP 780

Query: 781 TRGAIVRAEFKARVGIKLLMTLTHNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLA 840
TRGAIVRAEFKARVGIKLLMTLTHNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLA
Sbjct: 781 TRGAIVRAEFKARVGIKLLMTLTHNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLA 840

Query: 841 GKVQVKWGEEENAHCVANYQLPPESQQQLLTQLSAECR 878
GKVQVKWGEEENAHCVANYQLPPESQQQLLTQLSAECR
Sbjct: 841 GKVQVKWGEEENAHCVANYQLPPESQQQLLTQLSAECR 878


75  ECs5261  ECs5255  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
ECs5261   0       28       -6.786967   ATP-dependent helicase
ECs5260   1       33       -8.022474   hypothetical protein
ECs5259   2       40       -11.497253  hypothetical protein
ECs5258   1       35       -11.058755  hypothetical protein
ECs5257   2       29       -8.253769   hypothetical protein
ECs5256   1       26       -6.318935   hypothetical protein
ECs5255   -1      21       -4.310285   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
ECs5261   RTXTOXIND  31     0.032    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 30.6 bits (69), Expect = 0.032
Identities = 26/163 (15%), Positives = 59/163 (36%), Gaps = 16/163 (9%)

Query: 325 RLASGAEEEAYRRLVESQFRDDDDEQAQSN---KGRLFKITLEKALFSSPMACASVVANR 381
+ S E L++ QF +++ Q + + A + + V +R
Sbjct: 177 QNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSR 236

Query: 382 LKRLESRKDHN--SQSQINELESLLLALNNIDASQFSKYQLLLDTIRKDLAWKANNTEDR 439
L S ++ + E E+ + N S+ L+ I ++ + ++
Sbjct: 237 LDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQ----LEQIESEIL----SAKEE 288

Query: 440 LVIFTESIKTLEFLEQ--QLRADLKLKDDQIATLRGDQGDTVL 480
+ T+ K E L++ Q ++ L ++A Q +V+
Sbjct: 289 YQLVTQLFKN-EILDKLRQTTDNIGLLTLELAKNEERQQASVI 330


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
ECs5258   RTXTOXIND  32     0.008    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 31.7 bits (72), Expect = 0.008
Identities = 18/134 (13%), Positives = 50/134 (37%), Gaps = 13/134 (9%)

Query: 161 AQIKLLRTEISDSSQAQLANHTHFSNKLWEQLEQFADLMAKGATEQI-IDALRQVIIDFN 219
+ R E + A++ + + S +L+ F+ L+ K A + + ++
Sbjct: 207 LNLDKKRAER-LTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAV 265

Query: 220 QNLTEQFGENFKALDASVKKLVEWQGNYKTQIEQMSEQYQQSV-ESLVETKTAVAGIWEE 278
L + L+ +++ K + + +++ ++ + + L +T + + E
Sbjct: 266 NELRVYKSQ----LEQIESEILS----AKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLE 317

Query: 279 CK--EIPLAMSELR 290
E S +R
Sbjct: 318 LAKNEERQQASVIR 331


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
ECs5257   OMPADOMAIN  58     4e-12    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 58.0 bits (140), Expect = 4e-12
Identities = 36/136 (26%), Positives = 53/136 (38%), Gaps = 28/136 (20%)

Query: 95 SPDVLFGLGSTELKPKFKLILDDFFPRYLKVLDNYQEHITEVRIEGHTSTDWTGTTNPDI 154
DVLF LKP+ + LD + L N V + G+ TD G+
Sbjct: 218 KSDVLFNFNKATLKPEGQAALD----QLYSQLSNLDPKDGSVVVLGY--TDRIGSD---- 267

Query: 155 AYFNNMALSQGRTRAVLQYVYDIKNIATHQQWVKSKFAAVGYSSAHPILDKTGKEDPNRS 214
AY N LS+ R ++V+ Y+ K I K +A G ++P+ T R+
Sbjct: 268 AY--NQGLSERRAQSVVDYLI-SKGIP------ADKISARGMGESNPVTGNTCDNVKQRA 318

Query: 215 ---------RRVTFKV 221
RRV +V
Sbjct: 319 ALIDCLAPDRRVEIEV 334


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
ECs5255   THERMOLYSIN  28     0.007    Thermolysin metalloprotease (M4) family signature.
		>THERMOLYSIN#Thermolysin metalloprotease (M4) family signature.

Length = 544

Score = 28.1 bits (62), Expect = 0.007
Identities = 20/71 (28%), Positives = 32/71 (45%), Gaps = 9/71 (12%)

Query: 10 AGNVTFVHNGKAYVTGGVNQNIFNGYFEDLNEAGKDSAAIDKINAHYFDKKAEDYFFNKF 69
+G T+ + + G + + N +F A D+AA+D AHY+ DY+ N
Sbjct: 265 SGIFTYDGRNRTVLPGSLWADGDNQFF-----ASYDAAAVD---AHYYAGVVYDYYKNVH 316

Query: 70 -LLSFDPSTQQ 79
LS+D S
Sbjct: 317 GRLSYDGSNAA 327


76  ECs5233  ECs5226  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
ECs5233   0       25       -8.218747   hypothetical protein
ECs5232   0       26       -7.355040   hypothetical protein
ECs5231   1       28       -7.592105   ornithine carbamoyltransferase subunit I
ECs5230   2       36       -9.307767   hypothetical protein
ECs5229   -2      17       -0.157801   hypothetical protein
ECs5228   -2      18       -0.208275   hypothetical protein
ECs5226   -2      19       0.065865    oxidoreductase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs5233   SACTRNSFRASE  32     5e-04    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 32.2 bits (73), Expect = 5e-04
Identities = 16/48 (33%), Positives = 19/48 (39%)

Query: 97 PAIRGKGLAKKLALKAMEEAREMGFKRCYLETTAFLKEAIGLYEHLGF 144
R KG+ L KA+E A+E F LET A Y F
Sbjct: 99 KDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHF 146


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
ECs5230   TYPE4SSCAGX  30     0.038    Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein signature.

Length = 522

Score = 29.8 bits (66), Expect = 0.038
Identities = 18/75 (24%), Positives = 37/75 (49%)

Query: 180 KEIARLLNNHQKLNNLQKLNNLQKLNNLQKLNNIQKLNNIQELNNSQELNNSQELNNSQE 239
+ + ++N Q L+N + L+ L K +L+ +++L ++QE + L +ELN Q
Sbjct: 181 ENLTNAMSNPQNLSNNKNLSELIKQQRENELDQMERLEDMQEQAQANALKQIEELNKKQA 240

Query: 240 LNNSQDLKNSQVSCK 254
+ ++S K
Sbjct: 241 EEAVRQRAKDKISIK 255


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs5228   HTHTETR  50     9e-10    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 49.6 bits (118), Expect = 9e-10
Identities = 20/100 (20%), Positives = 38/100 (38%), Gaps = 6/100 (6%)

Query: 5 KQSRVPGRPRRFAPEQAVSAAKVLFHQKGFDAVSVAEVTDYLGINPPSLYAAFGSKAGLF 64
++++ + R + + A LF Q+G + S+ E+ G+ ++Y F K+ LF
Sbjct: 3 RKTKQEAQETR---QHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLF 59

Query: 65 SRVLNEYVGTEAIPLVDILRDDRPVGECLAEVLKEAARRY 104
S + L + P VL+E
Sbjct: 60 SEIWELSESN-IGELELEYQAKFP--GDPLSVLREILIHV 96


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs5226   DHBDHDRGNASE  87     1e-22    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 86.6 bits (214), Expect = 1e-22
Identities = 67/250 (26%), Positives = 114/250 (45%), Gaps = 24/250 (9%)

Query: 6 GKTVLILGGSRGIGAAIVRRFVTDGANVRFTYAGSKDAAERLAQETGATAVFT-----DS 60
GK I G ++GIG A+ R + GA++ + + E++ A A D
Sbjct: 8 GKIAFITGAAQGIGEAVARTLASQGAHI-AAVDYNPEKLEKVVSSLKAEARHAEAFPADV 66

Query: 61 ADRDAVIDVV----RKSGALDILVVNAGIGVFGDALELNADDIDRLFKINIHAPYHASVE 116
D A+ ++ R+ G +DILV AG+ G L+ ++ + F +N ++AS
Sbjct: 67 RDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRS 126

Query: 117 AARQMP--EGGRILIIGSVNGDRMPVAGMAAYAASKSALQGMARGLARDFGPRGITINVV 174
++ M G I+ +GS N +P MAAYA+SK+A + L + I N+V
Sbjct: 127 VSKYMMDRRSGSIVTVGS-NPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIV 185

Query: 175 QPGPIDTDA--------NPANGPMRDMLHSF---MAIKRHGQPEEVAGMVAWLAGPEASF 223
PG +TD N A ++ L +F + +K+ +P ++A V +L +A
Sbjct: 186 SPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGH 245

Query: 224 VTGAMHTIDG 233
+T +DG
Sbjct: 246 ITMHNLCVDG 255


77  ECs5116  ECs5106  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
ECs5116   -1      19       -3.540580   transcriptional regulator
ECs5115   0       20       -4.423572   *CadC family transcriptional regulator
ECs5114   -1      23       -3.515611   lysine/cadaverine antiporter
ECs5113   -1      19       -3.864199   lysine decarboxylase 1
ECs5112   -2      12       -3.096075   peptide transporter
ECs5111   0       12       -3.924816   lysyl-tRNA synthetase
ECs5110   1       13       -2.646125   hypothetical protein
ECs5109   0       18       -0.429869   hypothetical protein
ECs5108   -1      18       0.395204    hypothetical protein
ECs5107   -1      14       -0.587879   sensory histidine kinase DcuS
ECs5106   -1      14       -0.040060   DcuR family transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs5116   HTHTETR  47     1e-08    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 46.9 bits (111), Expect = 1e-08
Identities = 30/199 (15%), Positives = 57/199 (28%), Gaps = 14/199 (7%)

Query: 1 MGKTEENSVQ-REDVLGEALKLLELQGIANTTLEMVAERVDYPLDELRRFWPDKEAILYD 59
KT++ + + R+ +L AL+L QG+++T+L +A+ + + DK + +
Sbjct: 2 ARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSE 61

Query: 60 ALRYLSQQIDVWRRQLMLDETQTAEQKLLARYQALSECVKNNRYPGCLFIAACTFYPDPG 119
I + L + E + F+
Sbjct: 62 IWELSESNIGELELEYQAKFPGDPLSVLREILIHVLEST--VTEERRRLLMEIIFHKCEF 119

Query: 120 H----PIHQLADQQKSAAYDFTHELLTT-------LEVDDPAMVAKQMELVLEGCLSRML 168
+ Q +YD + L A M + G + L
Sbjct: 120 VGEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWL 179

Query: 169 VNRSQADVDTAHRLAEDIL 187
D+ R IL
Sbjct: 180 FAPQSFDLKKEARDYVAIL 198


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs5115   SYCDCHAPRONE  37     8e-05    Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 36.8 bits (85), Expect = 8e-05
Identities = 16/97 (16%), Positives = 36/97 (37%), Gaps = 7/97 (7%)

Query: 391 PLDEKQLAALNTEIDNIVTLPELNNLS-----IIYQIKAVSALVKGKTDESYQAINTGID 445
++ A+ + + T+ LN +S +Y + A + GK +++++
Sbjct: 6 TDTQEYQLAMESFLKGGGTIAMLNEISSDTLEQLYSL-AFNQYQSGKYEDAHKVFQALCV 64

Query: 446 LEMSWLNYVL-LGKVYEMKGMNREAADAYLTAFNLRP 481
L+ + L LG + G A +Y +
Sbjct: 65 LDHYDSRFFLGLGACRQAMGQYDLAIHSYSYGAIMDI 101


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs5112   TCRTETA  30     0.020    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 30.2 bits (68), Expect = 0.020
Identities = 36/190 (18%), Positives = 66/190 (34%), Gaps = 14/190 (7%)

Query: 44 NHAISLFSAYA-SLVYVTPILGGWLADRLLGNRTAVIAGALLMTLGHVVLGIDTNSTFSL 102
H L + YA P+LG +DR G R ++ + + ++ + L
Sbjct: 43 AHYGILLALYALMQFACAPVLGAL-SDRF-GRRPVLLVSLAGAAVDYAIMAT-APFLWVL 99

Query: 103 YLALAIIICGYGLFKSNISCLLGELYDEND-HRRDGGFSLLYAAGNIGSIAAPIACGLAA 161
Y+ + G+ + + + D D R F + A G +A P+ GL
Sbjct: 100 YIGRIV----AGITGATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMG 155

Query: 162 QWYGWHVGFALAGGGMFIGLLIFLSGHRHFQSTRSMDKKALTSVKF-ALPVWSWLVVMLC 220
+ H F A + L FL+G + +++ L L + W M
Sbjct: 156 G-FSPHAPFFAAA---ALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTV 211

Query: 221 LAPVFFTLLL 230
+A + +
Sbjct: 212 VAALMAVFFI 221


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs5109   SACTRNSFRASE  26     0.012    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 26.4 bits (58), Expect = 0.012
Identities = 9/28 (32%), Positives = 16/28 (57%)

Query: 32 LAIIEHTDVDESLKGQGIGKQLVAKVVE 59
A+IE V + + +G+G L+ K +E
Sbjct: 89 YALIEDIAVAKDYRKKGVGTALLHKAIE 116


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs5107   PF06580  41     7e-06    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 41.0 bits (96), Expect = 7e-06
Identities = 21/99 (21%), Positives = 38/99 (38%), Gaps = 18/99 (18%)

Query: 442 LIENALE-ALGP-EPGGEISVTLHYRHGWLHCEVNDDGPGIAPDKIDHIFDKGVSTKGSE 499
L+EN ++ + GG+I + +G + EV + G +
Sbjct: 263 LVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKN------------TKES 310

Query: 500 RGVGLALVKQQVENLGG---SIAVESEPGIFTQFFVQIP 535
G GL V+++++ L G I + + G V IP
Sbjct: 311 TGTGLQNVRERLQMLYGTEAQIKLSEKQGKVN-AMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
ECs5106   HTHFIS  70     4e-16    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 70.2 bits (172), Expect = 4e-16
Identities = 31/109 (28%), Positives = 50/109 (45%), Gaps = 4/109 (3%)

Query: 4 VLIIDDDAMVAELNRRYVAQIPGFQCCGTASTLEKAKEIIFNSDTPIDLILLDIYMQKEN 63
+L+ DDDA + + + +++ G+ S I + DL++ D+ M EN
Sbjct: 6 ILVADDDAAIRTVLNQALSRA-GYDVR-ITSNAATLWRWI--AAGDGDLVVTDVVMPDEN 61

Query: 64 GLDLLPVLHNARCKSDVIVISSAADAATIKDSLHYGVVDYLIKPFQASR 112
DLLP + AR V+V+S+ T + G DYL KPF +
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTE 110


78  ECs5095  ECs5088  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
ECs5095   -3      14       0.606254   DNA-binding transcriptional regulator BasR
ECs5094   -2      14       0.399017   sensor protein BasS/PmrB
ECs5093   -2      14       0.863301   proline/glycine betaine transporter
ECs5092   1       17       1.867679   hypothetical protein
ECs5091   1       22       2.606007   hypothetical protein
ECs5090   2       38       5.584271   hypothetical protein
ECs5089   1       39       7.040656   hypothetical protein
ECs5088   1       43       8.455187   phosphonate/organophosphate ester transporter
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
ECs5095   HTHFIS  91     2e-23    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 90.7 bits (225), Expect = 2e-23
Identities = 41/121 (33%), Positives = 60/121 (49%)

Query: 2 KILIVEDDTLLLQGLILAAQTEGYACDGVTTARMAEQSLEAGHYSLVVLDLGLPDEDGLH 61
IL+ +DD + L A GY + A + + AG LVV D+ +PDE+
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 FLARIRQKKYTLPVLILTARDTLTDKIAGLDVGADDYLVKPFALEELHARIRALLRRHNN 121
L RI++ + LPVL+++A++T I + GA DYL KPF L EL I L
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 122 Q 122
+
Sbjct: 125 R 125


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs5094   PF06580  37     1e-04    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 36.8 bits (85), Expect = 1e-04
Identities = 40/182 (21%), Positives = 80/182 (43%), Gaps = 34/182 (18%)

Query: 181 ARLDQMMESVSQLLQLARAGQSFSSGNYQHVKLLEDV-ILPSYDELSTML--DQRQQTLL 237
+ +M+ S+S+L++ S N + V L +++ ++ SY +L+++ D+ Q
Sbjct: 191 TKAREMLTSLSELMR-----YSLRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQ 245

Query: 238 LPESAADITVQGDATLLRMLLRNLVENAHRY----SPQGSNIMIKLQEDGGAV-MAVEDE 292
+ + D+ V ML++ LVEN ++ PQG I++K +D G V + VE+
Sbjct: 246 INPAIMDVQV------PPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENT 299

Query: 293 GPGIDESKCGELSKAFVRMDSRYGGIGLGLSIV-SRITQLHHGQFFLQNRQETSGTRAWV 351
G + + G GL V R+ L+ + ++ ++ A V
Sbjct: 300 GSLA--------------LKNTKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMV 345

Query: 352 RL 353
+
Sbjct: 346 LI 347


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs5093   TCRTETA  43     2e-06    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 43.3 bits (102), Expect = 2e-06
Identities = 57/290 (19%), Positives = 105/290 (36%), Gaps = 55/290 (18%)

Query: 85 FFGMLGDKYGRQKILAITIVIMSISTFCIGLIPSYDTIGIWAPILLLICKMAQGFSVGGE 144
G L D++GR+ +L +++ ++ + P +W +L I ++ G + G
Sbjct: 62 VLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPF-----LW---VLYIGRIVAGIT-GAT 112

Query: 145 YTGASIFVAEYSPDRKR----GFMGSWLDFGSIAGFVLGAGVVVLISTIVGEANFLDWGW 200
A ++A+ + +R GFM + FG +AG VLG G++ S
Sbjct: 113 GAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLG-GLMGGFSP------------ 159

Query: 201 RIPFFIALPLGIIGLYLRHALEETPAFQQHVDKLEQGDREGLQDGPKVSFKEIATKYWRS 260
PFF A L + L K E+ P SF+ W
Sbjct: 160 HAPFFAAAALNGLNFLTGCFLLPESH------KGERRPLRREALNPLASFR------WAR 207

Query: 261 LLTCIGLVIATNVTYYML----LTYMPSYLSHNLHYS-EDHGVLIIIAIMIGMLFVQPVM 315
+T + ++A ++ + H+ G+ + ++ L +
Sbjct: 208 GMTVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMIT 267

Query: 316 GLLSDRFGRRPFVLLG----SVALFVLA--------IPAFILINSNVIGL 353
G ++ R G R ++LG +LA P +L+ S IG+
Sbjct: 268 GPVAARLGERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLASGGIGM 317



Score = 39.0 bits (91), Expect = 4e-05
Identities = 39/164 (23%), Positives = 73/164 (44%), Gaps = 16/164 (9%)

Query: 286 LSHNLHYSEDHGVLI-IIAIMIGMLFVQPVMGLLSDRFGRRPFVLLGSVALFVLAIPAFI 344
L H+ + +G+L+ + A+M PV+G LSDRFGRRP +L+ L A+ I
Sbjct: 35 LVHSNDVTAHYGILLALYALM--QFACAPVLGALSDRFGRRPVLLVS---LAGAAVDYAI 89

Query: 345 LINSNVIGLIFAGLLMLAVILNCFTGVMASTLPAMFPTHIR---YSALAAAFNISVLVAG 401
+ + + +++ G ++A I V + + + R + ++A F +VAG
Sbjct: 90 MATAPFLWVLYIG-RIVAGITGATGAVAGAYIADITDGDERARHFGFMSACFGFG-MVAG 147

Query: 402 LTPTLAAWLVESSQNLMMPAYYLMVVAVVGLITG-VTMKETANR 444
P L + S + P + + + +TG + E+
Sbjct: 148 --PVLGGLMGGFSPH--APFFAAAALNGLNFLTGCFLLPESHKG 187


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs5088   PF05272  29     0.020    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.020
Identities = 12/22 (54%), Positives = 13/22 (59%)

Query: 32 MVALLGPSGSGKSTLLRHLSGL 53
V L G G GKSTL+ L GL
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGL 619


79  ECs5079  ECs5067  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
ECs5079   1       37       8.493992   phosphonate ABC transporter ATP-binding protein
ECs5078   2       31       7.286062   protein PhnM
ECs5077   2       27       6.292124   ribose 1,5-bisphosphokinase
ECs5076   1       26       5.884659   aminoalkylphosphonic acid N-acetyltransferase
ECs5075   1       26       5.636763   carbon-phosphorus lyase complex accessory
ECs5584   1       24       4.788350   hypothetical protein
ECs5074   1       22       4.655182   histidine protein kinase
ECs5073   0       23       4.445436   sugar ABC transporter ATP-binding protein
ECs5072   0       21       4.039665   carbohydrate ABC transporter permease
ECs5071   -1      18       2.850771   carbohydrate binding protein
ECs5070   0       16       2.234754   hypothetical protein
ECs5069   0       15       2.651620   hypothetical protein
ECs5068   -2      16       2.642137   sugar kinase
ECs5067   -2      16       2.477074   regulatory protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs5079   PF05272  29     0.015    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.015
Identities = 17/70 (24%), Positives = 25/70 (35%), Gaps = 8/70 (11%)

Query: 36 CVVLHGHSGSGKSTLLRSLYANYLPDEGQIQIKHGDEWVDLVTAPARKVVEI------RK 89
VVL G G GKSTL+ +L + I G + + + E+ R+
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGTGKDSYEQIAGIV--AYELSEMTAFRR 655

Query: 90 TTVGWVSQFL 99
V F
Sbjct: 656 ADAEAVKAFF 665


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs5076   SACTRNSFRASE  32     3e-04    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 32.2 bits (73), Expect = 3e-04
Identities = 20/84 (23%), Positives = 32/84 (38%), Gaps = 5/84 (5%)

Query: 47 HLALLDGEVVGMIGLHLQFHLHHVNWIGEIQELVVMPQARGLNVGSKLLAWAEEEARQAG 106
L L+ +G I + + N I+++ V R VG+ LL A E A++
Sbjct: 68 FLYYLENNCIGRIKIRSNW-----NGYALIEDIAVAKDYRKKGVGTALLHKAIEWAKENH 122

Query: 107 AEMTELSTNVKRHDAHRFYLREGY 130
L T A FY + +
Sbjct: 123 FCGLMLETQDINISACHFYAKHHF 146


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
ECs5584   RTXTOXIND  26     0.034    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 25.9 bits (57), Expect = 0.034
Identities = 17/107 (15%), Positives = 41/107 (38%), Gaps = 8/107 (7%)

Query: 11 TLLTLTTVPAQADIIDDTIGNIQ--------QAINDASNPDRGRDYEDSRDDGWQREVSD 62
LL LT + A+AD + +Q Q ++ + ++ + + + +Q +
Sbjct: 123 VLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEE 182

Query: 63 DRRRQYDDRRRQFEDRRRQLDDRQHQLNQERRQLEDEERRMEDEYGQ 109
+ R + QF + Q ++ L+++R + R+
Sbjct: 183 EVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENL 229


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
ECs5074   HTHFIS  58     6e-11    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 58.3 bits (141), Expect = 6e-11
Identities = 21/81 (25%), Positives = 44/81 (54%), Gaps = 2/81 (2%)

Query: 643 VLVLEDEAAVRQTICEQLHLLGYLTLEASSGEQALDLLAASAEIDIFISDLMLPGGMSGA 702
+LV +D+AA+R + + L GY S+ +AA + D+ ++D+++P +
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAA-GDGDLVVTDVVMP-DENAF 63

Query: 703 EVVNAARKLYPHLTLLLISGQ 723
+++ +K P L +L++S Q
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQ 84


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs5072   PF00577  28     0.047    Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 28.3 bits (63), Expect = 0.047
Identities = 16/73 (21%), Positives = 27/73 (36%), Gaps = 1/73 (1%)

Query: 219 FVYGMSGLLSGLGGIMSASRLYSANGNLGMG-YELDAIAAVILGGTSFVGGIGTITGTLV 277
++G+ + GG A R + N +G L A++ + S + G V
Sbjct: 400 LLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALSVDMTQANSTLPDDSQHDGQSV 459

Query: 278 GALIIATLNNGMT 290
L +LN T
Sbjct: 460 RFLYNKSLNESGT 472


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
ECs5071   SUBTILISIN  29     0.027    Subtilisin serine protease family (S8) signature.
		>SUBTILISIN#Subtilisin serine protease family (S8) signature.

Length = 326

Score = 28.7 bits (64), Expect = 0.027
Identities = 15/65 (23%), Positives = 24/65 (36%), Gaps = 5/65 (7%)

Query: 55 KLAGDNVKVTLVSSGYDLGQQVSQIDNFIAANVDMIIL---NAADSKGIGPAVKRAKDAG 111
L +KV + I I VD+I + D + AVK+A +
Sbjct: 111 DLLI--IKVLNKQGSGQYDWIIQGIYYAIEQKVDIISMSLGGPEDVPELHEAVKKAVASQ 168

Query: 112 IVVVA 116
I+V+
Sbjct: 169 ILVMC 173


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
ECs5067   HTHFIS  91     2e-23    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 91.4 bits (227), Expect = 2e-23
Identities = 36/120 (30%), Positives = 56/120 (46%), Gaps = 1/120 (0%)

Query: 2 KPVVLVVDDDTAICALLQDVLSEHVFTVSVCHTGQEAILRIEGDPDIALVVLDMMLPDTN 61
+LV DDD AI +L LS + V + I LVV D+++PD N
Sbjct: 3 GATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGD-GDLVVTDVVMPDEN 61

Query: 62 GLRVLQQIQKLRPTLPVVMLTGMGSESDVVVGLEMGADDYICKPFTPRVVVARLKAVLRR 121
+L +I+K RP LPV++++ + + E GA DY+ KPF ++ + L
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121


80  ECs5018  ECs5011  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
ECs5018   -2      19       0.542239   maltose/maltodextrin transporter ATP-binding
ECs5017   -2      16       0.700500   maltose ABC transporter periplasmic protein
ECs5016   -2      14       1.512517   maltose transporter membrane protein
ECs5015   -2      13       1.762120   maltose transporter permease
ECs5014   -2      11       1.817023   D-xylose transporter XylE
ECs5013   -1      18       2.301212   phosphate-starvation-inducible protein PsiE
ECs5012   -1      16       2.605619   hypothetical protein
ECs5011   0       18       2.391588   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs5018   PF05272  35     6e-04    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 34.7 bits (79), Expect = 6e-04
Identities = 13/35 (37%), Positives = 18/35 (51%)

Query: 32 VVFVGPSGCGKSTLLRMIAGLETITSGDLFIGEKR 66
VV G G GKSTL+ + GL+ + IG +
Sbjct: 599 VVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGTGK 633


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
ECs5017   MALTOSEBP  756    0.0      Maltose binding protein signature.
		>MALTOSEBP#Maltose binding protein signature.

Length = 396

Score = 756 bits (1953), Expect = 0.0
Identities = 396/396 (100%), Positives = 396/396 (100%)

Query: 1 MKIKTGARILALSALTTMMFSASALAKIEEGKLVIWINGDKGYNGLAEVGKKFEKDTGIK 60
MKIKTGARILALSALTTMMFSASALAKIEEGKLVIWINGDKGYNGLAEVGKKFEKDTGIK
Sbjct: 1 MKIKTGARILALSALTTMMFSASALAKIEEGKLVIWINGDKGYNGLAEVGKKFEKDTGIK 60

Query: 61 VTVEHPDKLEEKFPQVAATGDGPDIIFWAHDRFGGYAQSGLLAEITPDKAFQDKLYPFTW 120
VTVEHPDKLEEKFPQVAATGDGPDIIFWAHDRFGGYAQSGLLAEITPDKAFQDKLYPFTW
Sbjct: 61 VTVEHPDKLEEKFPQVAATGDGPDIIFWAHDRFGGYAQSGLLAEITPDKAFQDKLYPFTW 120

Query: 121 DAVRYNGKLIAYPIAVEALSLIYNKDLLPNPPKTWEEIPALDKELKAKGKSALMFNLQEP 180
DAVRYNGKLIAYPIAVEALSLIYNKDLLPNPPKTWEEIPALDKELKAKGKSALMFNLQEP
Sbjct: 121 DAVRYNGKLIAYPIAVEALSLIYNKDLLPNPPKTWEEIPALDKELKAKGKSALMFNLQEP 180

Query: 181 YFTWPLIAADGGYAFKYENGKYDIKDVGVDNAGAKAGLTFLVDLIKNKHMNADTDYSIAE 240
YFTWPLIAADGGYAFKYENGKYDIKDVGVDNAGAKAGLTFLVDLIKNKHMNADTDYSIAE
Sbjct: 181 YFTWPLIAADGGYAFKYENGKYDIKDVGVDNAGAKAGLTFLVDLIKNKHMNADTDYSIAE 240

Query: 241 AAFNKGETAMTINGPWAWSNIDTSKVNYGVTVLPTFKGQPSKPFVGVLSAGINAASPNKE 300
AAFNKGETAMTINGPWAWSNIDTSKVNYGVTVLPTFKGQPSKPFVGVLSAGINAASPNKE
Sbjct: 241 AAFNKGETAMTINGPWAWSNIDTSKVNYGVTVLPTFKGQPSKPFVGVLSAGINAASPNKE 300

Query: 301 LAKEFLENYLLTDEGLEAVNKDKPLGAVALKSYEEELAKDPRIAATMENAQKGEIMPNIP 360
LAKEFLENYLLTDEGLEAVNKDKPLGAVALKSYEEELAKDPRIAATMENAQKGEIMPNIP
Sbjct: 301 LAKEFLENYLLTDEGLEAVNKDKPLGAVALKSYEEELAKDPRIAATMENAQKGEIMPNIP 360

Query: 361 QMSAFWYAVRTAVINAASGRQTVDEALKDAQTRITK 396
QMSAFWYAVRTAVINAASGRQTVDEALKDAQTRITK
Sbjct: 361 QMSAFWYAVRTAVINAASGRQTVDEALKDAQTRITK 396


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
ECs5016   FLGHOOKAP1  31     0.012    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 31.1 bits (70), Expect = 0.012
Identities = 22/124 (17%), Positives = 43/124 (34%), Gaps = 21/124 (16%)

Query: 128 GDEWQLALSDGETGKNYLSDAFKFGREQKLQLKETTAQPEGERANLRVITQNRQALSDIT 187
++WQ+ T DA L+L T + L+ + A+ ++
Sbjct: 367 NNQWQVTRLASNTTFTVTPDANGKVAFDGLELTFTGTPAVNDSFTLKPV---SDAIVNMD 423

Query: 188 AILPDGNKVMMSSLRQFSGTQPLYTLDGDGTLTNNQSGVKYRPNNQ--------IGFYQS 239
++ D K+ M+S GD N Q+ + + N++ Y S
Sbjct: 424 VLITDEAKIAMAS----------EEDAGDSDNRNGQALLDLQSNSKTVGGAKSFNDAYAS 473

Query: 240 ITAD 243
+ +D
Sbjct: 474 LVSD 477


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs5014   TCRTETA  36     4e-04    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 35.6 bits (82), Expect = 4e-04
Identities = 20/87 (22%), Positives = 42/87 (48%), Gaps = 3/87 (3%)

Query: 279 VIGVMLSIFQQFVGINVVLYYAPEVFKTLGASTDIALLQTIIVGVINLTFTVLAIMT--- 335
+I ++ ++ VGI +++ P + + L S D+ I++ + L A +
Sbjct: 7 LIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGAL 66

Query: 336 VDKFGRKPLQIIGALGMAIGMFSLGTA 362
D+FGR+P+ ++ G A+ + TA
Sbjct: 67 SDRFGRRPVLLVSLAGAAVDYAIMATA 93


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs5011   CHANLCOLICIN  30     0.007    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 30.4 bits (68), Expect = 0.007
Identities = 21/95 (22%), Positives = 38/95 (40%), Gaps = 3/95 (3%)

Query: 20 AAGTVKVFSNGSSEAKTLTGAEHLIDLVGQPRLANSWWPGAVISEELATAAALRQQQALL 79
A + + + LT + L D+V + N+ + A AA++ + L
Sbjct: 73 AKAAAEAQAKAKANRDALT--QRLKDIVNEALRHNASRTPSATELAHANNAAMQAEDERL 130

Query: 80 TRLAEQGADSSADDAAAINALRQQIQALKVTGRQK 114
RLA+ + + AA A ++ Q K R+K
Sbjct: 131 -RLAKAEEKARKEAEAAEKAFQEAEQRRKEIEREK 164


81  ECs4933  ECs4923  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
ECs4933   -1      18       2.719126   isocitrate lyase
ECs4932   1       17       2.824717   malate synthase
ECs4931   1       17       3.565789   homoserine O-succinyltransferase
ECs4930   0       18       4.189706   hypothetical protein
ECs4929   1       17       4.059908   *bifunctional
ECs4928   1       15       3.607623   phosphoribosylamine--glycine ligase
ECs4927   1       13       1.906074   transcriptional regulatory protein ZraR
ECs4926   0       13       1.837242   sensor protein ZraS
ECs4925   0       15       2.182768   zinc resistance protein
ECs4924   -1      15       2.101218   hypothetical protein
ECs4923   0       15       1.678159   transcriptional regulator HU subunit alpha
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs4933   BINARYTOXINB  34     9e-04    Binary toxin B family signature.
		>BINARYTOXINB#Binary toxin B family signature.

Length = 764

Score = 34.3 bits (78), Expect = 9e-04
Identities = 14/58 (24%), Positives = 22/58 (37%)

Query: 289 ETSTPDLELARRFAEAIHAKYPGKLLAYNCSPSFNWQKNLDDNTIASFQQQLSDMGYK 346
ET+ PD+ L A P L Y + N D T + + QL+++
Sbjct: 544 ETTKPDMTLKEALKIAFGFNEPNGNLQYQGKDITEFDFNFDQQTSQNIKNQLAELNAT 601


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs4930   SACTRNSFRASE  34     1e-04    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 33.8 bits (77), Expect = 1e-04
Identities = 16/54 (29%), Positives = 21/54 (38%), Gaps = 5/54 (9%)

Query: 78 IDPDVRGCGVGRMLVKHALSMAPE-----LTTNVNEQNEQAVGFYKKVGFKVTG 126
+ D R GVG L+ A+ A E L + N A FY K F +
Sbjct: 97 VAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFIIGA 150


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
ECs4927   HTHFIS  518    0.0      FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 518 bits (1336), Expect = 0.0
Identities = 184/468 (39%), Positives = 254/468 (54%), Gaps = 35/468 (7%)

Query: 8 ILVVDDDISHCTILQALLRGWGYNVALANSGRQALEQVRERVFDLVLCDVRMAEMDGIAT 67
ILV DDD + T+L L GY+V + ++ + DLV+ DV M + +
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 68 LKEIKALNPAIPVLIMTAYSSVETAVEALKTGALDYLIKPLDFDNLQATLEKALAHTHII 127
L IK P +PVL+M+A ++ TA++A + GA DYL KP D L + +ALA
Sbjct: 66 LPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKRR 125

Query: 128 DAETPAVTASQFGMVGKSPAMQHLLSEIALVAPSEATVLIHGDSGTGKELVARAIHASSA 187
++ + +VG+S AMQ + +A + ++ T++I G+SGTGKELVARA+H
Sbjct: 126 PSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALHDYGK 185

Query: 188 RSEKPLVTLNCAALNESLLESELFGHEKGAFTGADKRREGRFVEADGGTLFLDEIGDISP 247
R P V +N AA+ L+ESELFGHEKGAFTGA R GRF +A+GGTLFLDEIGD+
Sbjct: 186 RRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDMPM 245

Query: 248 MMLVRLLRAIQEREVQRVGSNQTISVDVRLIAATHRDLAAEVNAGRFRQDLYYRLNVVAI 307
RLLR +Q+ E VG I DVR++AAT++DL +N G FR+DLYYRLNVV +
Sbjct: 246 DAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRLNVVPL 305

Query: 308 EVPSLRQRREDIPLLAGHFLQRFAERNRKAVKGFTPQAMDLLIHYDWPGNIRELENAVER 367
+P LR R EDIP L HF+Q+ + VK F +A++L+ + WPGN+RELEN V R
Sbjct: 306 RLPPLRDRAEDIPDLVRHFVQQAEKEGLD-VKRFDQEALELMKAHPWPGNVRELENLVRR 364

Query: 368 AVVLLTGEYISERELPLAI--------------------------ASTPIPLAQSLDIQP 401
L + I+ + + + A D P
Sbjct: 365 LTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFASFGDALP 424

Query: 402 --------LVEVEKEVILAALEKTGGNKTEAARQLGITRKTLLAKLSR 441
L E+E +ILAAL T GN+ +AA LG+ R TL K+
Sbjct: 425 PSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRE 472


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs4926   PF06580  35     6e-04    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 34.8 bits (80), Expect = 6e-04
Identities = 49/262 (18%), Positives = 103/262 (39%), Gaps = 43/262 (16%)

Query: 197 ILFALATVLLA-SVLSFFW-YRRYLRSRQLLQDEMKRKEKLVALGHLAAGV-AHEIRNPL 253
I+F + V S+L F W + + + ++ Q +M + L L A + H + N L
Sbjct: 120 IIFNVVVVTFMWSLLYFGWHFFKNYKQAEIDQWKMASMAQEAQLMALKAQINPHFMFNAL 179

Query: 254 SSIKGLAKYFAERAPAGGEAHQLAQVM---AKEADRLNRVVSELLELVKPTHLALQAVDL 310
++I+ L +A L+++M + ++ +++ L +V ++L L ++
Sbjct: 180 NNIRALILEDPTKAREM--LTSLSELMRYSLRYSNARQVSLADELTVVD-SYLQLASIQF 236

Query: 311 NTLINHSLQLVSQDANCREIQLRFTANDTLPEIQADPDRLTQVLL-NLYLNAIQAIGQHG 369
+ Q+ + ++Q+ P L Q L+ N + I + Q G
Sbjct: 237 EDRLQFENQI---NPAIMDVQV--------------PPMLVQTLVENGIKHGIAQLPQGG 279

Query: 370 VISVTASESGAGVKISVTDSGKGIAADQLEAIFTPYFTTKAEGTGLGLAVVHNIVEQHGG 429
I + ++ V + V ++G + E TG GL V ++ G
Sbjct: 280 KILLKGTKDNGTVTLEVENTGSLALKNT------------KESTGTGLQNVRERLQMLYG 327

Query: 430 ---TIQVASLEGKGARFTLWLP 448
I+++ +GK + +P
Sbjct: 328 TEAQIKLSEKQGKV-NAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs4923   DNABINDINGHU  120    2e-39    Prokaryotic integration host factor signature.
		>DNABINDINGHU#Prokaryotic integration host factor signature.

Length = 91

Score = 120 bits (302), Expect = 2e-39
Identities = 50/89 (56%), Positives = 66/89 (74%)

Query: 2 NKTQLIDVIAEKAELSKTQAKAALESTLAAITESLKEGDAVQLVGFGTFKVNHRAERTGR 61
NK LI +AE EL+K + AA+++ +A++ L +G+ VQL+GFG F+V RA R GR
Sbjct: 3 NKQDLIAKVAEATELTKKDSAAAVDAVFSAVSSYLAKGEKVQLIGFGNFEVRERAARKGR 62

Query: 62 NPQTGKEIKIAAANVPAFVSGKALKDAVK 90
NPQTG+EIKI A+ VPAF +GKALKDAVK
Sbjct: 63 NPQTGEEIKIKASKVPAFKAGKALKDAVK 91


82  ECs4796  ECs4788  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
ECs4796   0       20       -1.216731  resistance protein
ECs4795   1       19       0.832870   hypothetical protein
ECs4794   1       23       2.089623   transcriptional regulator
ECs4793   2       24       2.467421   GTP-binding protein
ECs4792   0       18       2.427892   glutamine synthetase
ECs4791   0       15       1.814816   nitrogen regulation protein NR(II)
ECs4790   1       15       0.958029   nitrogen regulation protein NR(I)
ECs4789   0       13       -0.957744  coproporphyrinogen III oxidase
ECs4788   -1      13       -3.466033  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs4796   TCRTETB  30     0.024    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 29.8 bits (67), Expect = 0.024
Identities = 31/161 (19%), Positives = 64/161 (39%), Gaps = 15/161 (9%)

Query: 227 NVFFVYAVYCGLTFFIPFLKNIYLLP----------VALVGAYGIINQYCLKMIGGPIGG 276
N+ F+ V CG F + ++P A +G+ I +I G IGG
Sbjct: 255 NIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYIGG 314

Query: 277 MISDKILKSPSKYLCYTFIISTAALVLLIMLPHESMPVYLGMACTLGFGAIVFTQRAVFF 336
++ D+ + P L + + + L E+ ++ + G + FT+
Sbjct: 315 ILVDR--RGPLYVLNIGVTFLSVSFLTASFLL-ETTSWFMTIIIVFVLGGLSFTK--TVI 369

Query: 337 APIGEAKIAENKTGAAMALGSFIGYAPAMFCFSLYGYILDL 377
+ I + + + + GA M+L +F + ++ G +L +
Sbjct: 370 STIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLSI 410


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
ECs4793   TCRTETOQM  180    4e-51    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 180 bits (458), Expect = 4e-51
Identities = 97/445 (21%), Positives = 170/445 (38%), Gaps = 81/445 (18%)

Query: 4 KLRNIAIIAHVDHGKTTLVDKLLQQSGTFDSRAETQE--RVMDSNDLEKERGITILAKNT 61
K+ NI ++AHVD GKTTL + LL SG + D+ LE++RGITI T
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 62 AIKWNDYRINIVDTPGHADFGGEVERVMSMVDSVLLVVDAFDGPMPQTRFVTKKAFAYGL 121
+ +W + ++NI+DTPGH DF EV R +S++D +L++ A DG QTR + G+
Sbjct: 62 SFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGI 121

Query: 122 KPIVVINKVDRPGARPDWVVDQVFD-------------LFVNLDATDEQLD--------- 159
I INK+D+ G V + + L+ N+ T+
Sbjct: 122 PTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEG 181

Query: 160 --------------------------------FPIVYASALNGIAGLDHEDMAEDMTPLY 187
FP+ + SA N I G+D+ L
Sbjct: 182 NDDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNI-GIDN---------LI 231

Query: 188 QAIVDHVPAPDVDLDGPFQMQISQLDYNSYVGVIGIGRIKRGKVKPNQQVTIIDSEGKTR 247
+ I + + ++ +++Y+ + R+ G + V I + E
Sbjct: 232 EVITNKFYSSTHRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEKEKI-- 289

Query: 248 NAKVGKVLGHLGLERIETDLAEAGDIVAITGLGELNISDTVCDTQNVEALPALSVDEPTV 307
K+ ++ + E + D A +G+IV + L ++ + DT+ + + P +
Sbjct: 290 --KITEMYTSINGELCKIDKAYSGEIVILQNEF-LKLNSVLGDTKLLPQRERIENPLPLL 346

Query: 308 SMFFCVNTSPFCGKEGKFVTSRQILDRLNKELVHNVALRVEETEDADAFRVSGRGELHLS 367
+ + D L LR +S G++ +
Sbjct: 347 QTTVEPSKPQQREMLLDALLEISDSDPL---------LRYYVDSATHEIILSFLGKVQME 397

Query: 368 VLIENMRRE-GFELAVSRPKVIFRE 391
V ++ + E+ + P VI+ E
Sbjct: 398 VTCALLQEKYHVEIEIKEPTVIYME 422



Score = 32.5 bits (74), Expect = 0.005
Identities = 13/75 (17%), Positives = 29/75 (38%), Gaps = 1/75 (1%)

Query: 398 EPYENVTLDVEEQHQGSVMQALGERKGDLKNMNPDGKGRVRLDYVIPSRGLIGFRSEFMT 457
EPY + + +++ + ++ + V L IP+R + +RS+
Sbjct: 537 EPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQLKN-NEVILSGEIPARCIQEYRSDLTF 595

Query: 458 MTSGTGLLYSTFSHY 472
T+G + + Y
Sbjct: 596 FTNGRSVCLTELKGY 610


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs4791   PF06580  28     0.042    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 28.3 bits (63), Expect = 0.042
Identities = 34/190 (17%), Positives = 72/190 (37%), Gaps = 41/190 (21%)

Query: 171 IIEQADRLRNLVDRL---LGPQLPGTRVTE-SIHKVAERV---VTLVSMELPDNVRLIRD 223
I+E + R ++ L + L + + S+ V + L S++ D ++
Sbjct: 186 ILEDPTKAREMLTSLSELMRYSLRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQ 245

Query: 224 YDPSLPELAHDPDQIEQVLLN-IVRNALQ---ALGPEGGEIILRTRTAFQLTLHGERYRL 279
+P++ ++ Q+ +L+ +V N ++ A P+GG+I+L+
Sbjct: 246 INPAIMDV-----QVPPMLVQTLVENGIKHGIAQLPQGGKILLKGT------KDNGTVT- 293

Query: 280 AARIDVEDNGPGIPPHLQDTLFYPMVSGREGGTGLGLSIARNLIDQHSGK---IEFTSWP 336
++VE+ G + ++ TG GL R + G I+ +
Sbjct: 294 ---LEVENTGSLALKNTKE------------STGTGLQNVRERLQMLYGTEAQIKLSEKQ 338

Query: 337 GHTEFSVYLP 346
G V +P
Sbjct: 339 GKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
ECs4790   HTHFIS  602    0.0      FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 602 bits (1553), Expect = 0.0
Identities = 206/478 (43%), Positives = 300/478 (62%), Gaps = 11/478 (2%)

Query: 1 MQRGIVWVVDDDSSIRWVLERALAGAGLTCTTFENGAEVLEALASKTPDVLLSDIRMPGM 60
M + V DDD++IR VL +AL+ AG N A + +A+ D++++D+ MP
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 DGLALLKQIKQRHPMLPVIIMTAHSDLDAAVSAYQQGAFDYLPKPFDIDEAVALVERAIS 120
+ LL +IK+ P LPV++M+A + A+ A ++GA+DYLPKPFD+ E + ++ RA++
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 121 HYQEQQQPRNVQLNGPTTDIIGEAPAMQDVFRIIGRLSRSSISVLINGESGTGKELVAHA 180
+ + ++G + AMQ+++R++ RL ++ ++++I GESGTGKELVA A
Sbjct: 121 EPKRRPSKLEDDSQDGM-PLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARA 179

Query: 181 LHRHSPRAKAPFIALNMAAIPKDLIESELFGHEKGAFTGANTIRQGRFEQADGGTLFLDE 240
LH + R PF+A+NMAAIP+DLIESELFGHEKGAFTGA T GRFEQA+GGTLFLDE
Sbjct: 180 LHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDE 239

Query: 241 IGDMPLDVQTRLLRVLADGQFYRVGGYAPVKVDVRIIAATHQNLEQRVQEGKFREDLFHR 300
IGDMP+D QTRLLRVL G++ VGG P++ DVRI+AAT+++L+Q + +G FREDL++R
Sbjct: 240 IGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYR 299

Query: 301 LNVIRVHLPPLRERREDIPRLARHFLQVAARELGVEAKLLHPETEAALTRLAWPGNVRQL 360
LNV+ + LPPLR+R EDIP L RHF+Q A +E G++ K E + WPGNVR+L
Sbjct: 300 LNVVPLRLPPLRDRAEDIPDLVRHFVQQAEKE-GLDVKRFDQEALELMKAHPWPGNVREL 358

Query: 361 ENTCRWLTVMAAGQEVLIQDLPGELFESTVAESTSQMQPDSWATLLAQWADRALRS---- 416
EN R LT + + + + EL + S + ++Q + +R
Sbjct: 359 ENLVRRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFAS 418

Query: 417 -----GHQNLLSEAQPELERTLLTTALRHTQGHKQEAARLLGWGRNTLTRKLKELGME 469
L E+E L+ AL T+G++ +AA LLG RNTL +K++ELG+
Sbjct: 419 FGDALPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGVS 476


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs4788   SECA  30     0.004    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 30.2 bits (68), Expect = 0.004
Identities = 11/71 (15%), Positives = 29/71 (40%)

Query: 13 AKARRKTREELNQEARDRKRQKKRRGHAPGSRAAGGNNTSGSKGQNAPKDPRIGSKTPIP 72
+K + + EE+ + + R+ + +R ++ + + + ++G P P
Sbjct: 827 SKVQVRMPEEVEELEQQRRMEAERLAQMQQLSHQDDDSAAAAALAAQTGERKVGRNDPCP 886

Query: 73 LGVTEKVTKQH 83
G +K + H
Sbjct: 887 CGSGKKYKQCH 897


83  ECs4606  ECs4602  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
ECs4606   -1      15       2.342138   UhpA family transcriptional regulator
ECs4605   -2      15       2.009185   sensory histidine kinase UhpB
ECs4604   -1      13       1.032400   regulatory protein UhpC
ECs4603   -2      14       -0.043106  sugar phosphate antiporter
ECs4602   -1      18       -1.339705  cryptic adenine deaminase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
ECs4606   HTHFIS  61     2e-13    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 61.4 bits (149), Expect = 2e-13
Identities = 29/174 (16%), Positives = 59/174 (33%), Gaps = 20/174 (11%)

Query: 2 ITVALIDDHLIVRSGFAQLLGLEPDLQVVAEFGSGREALAGLPGRGVQVCICDISMPDIS 61
T+ + DD +R+ Q L V + + + + D+ MPD +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRA-GYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 62 GLELLSQLPK---GMATIMLSVHDSPALVEQALNAGARGFLSKRCSPDELIAAVHTVATG 118
+LL ++ K + +++S ++ +A GA +L K ELI +
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR---- 117

Query: 119 GCYLTPDIAIKLASGRQDPLTKRERQVAEKLAQG---MAVKEIAAELGLSPKTV 169
A+ R L + + + + + A L + T+
Sbjct: 118 --------ALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTL 163


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs4605   PF06580  38     7e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 37.9 bits (88), Expect = 7e-05
Identities = 28/142 (19%), Positives = 56/142 (39%), Gaps = 11/142 (7%)

Query: 365 LRPRQLDDLTLEQAIRSLMREMELEGRGIVSHLEWRIEESALSENQRVTLFRVCQEGLNN 424
LR ++L + + ++L L++ + + + +V + Q + N
Sbjct: 208 LRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQINPAIMDVQVPPM-LVQTLVEN 266

Query: 425 IVKHA-----DASAVTLQGWLQDERLMLVIEDDGSGLPPGSGQ-QGFGLTGMRERVTALG 478
+KH + L+G + + L +E+ GS + + G GL +RER+ L
Sbjct: 267 GIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKESTGTGLQNVRERLQMLY 326

Query: 479 G---TLTISCLHG-TRVSVSLP 496
G + +S G V +P
Sbjct: 327 GTEAQIKLSEKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs4604   TCRTETB  41     1e-05    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 40.6 bits (95), Expect = 1e-05
Identities = 65/408 (15%), Positives = 137/408 (33%), Gaps = 60/408 (14%)

Query: 30 RHILLTIWLGYALFY--FTRKSFNAAVPEILANGVLSRSDIGLLATLFYITYGVSKFVSG 87
RH + IWL F+ N ++P+I + + + T F +T+ + V G
Sbjct: 11 RHNQILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYG 70

Query: 88 IVSDRSNARYFMGIGLIATGIINILFGFSTSLWAFAVLWVLNAFFQGWGS---PVCARLL 144
+SD+ + + G+I +++ S F L ++ F QG G+ P ++
Sbjct: 71 KLSDQLGIKRLLLFGIIINCFGSVIGFVGHS---FFSLLIMARFIQGAGAAAFPALVMVV 127

Query: 145 TAWY-SRTERGGWWALWNTAHNVGGALIPIVMAAAALHYGWRAGMMIAGCMAIVVGIFLC 203
A Y + RG + L + +G + P + A + W ++ M ++ +
Sbjct: 128 VARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHW--SYLLLIPMITIITVPF- 184

Query: 204 WRLRDRPQALGLPAVGEWRHDALEIAQQQEGAGLTRKEILTKYVLLNPYIWLLSFCYVLV 263
+ L +I G L I+ + Y VL
Sbjct: 185 --------LMKLLKKEVRIKGHFDIK----GIILMSVGIVFFMLFTTSYSISFLIVSVLS 232

Query: 264 YVV-----RAAINDWGNLYMSETLGVDLVTANTAVTMFELGGFI-----------GALVA 307
+++ R + + + + + + + + + GF+ A
Sbjct: 233 FLIFVKHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTA 292

Query: 308 GWGSDKLFNGNRGPMNLIFAAGILL-SVGSLWLMPFASYVMQATCFFTIGFFVFGPQMLI 366
GS +F G + + GIL+ G L+++ + + F T F + +
Sbjct: 293 EIGSVIIFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFL-SVSFLTASFLLETTSWFM 351

Query: 367 ---------GMAAAECS---------HKEAAGAATGFVGLFAYLGASL 396
G++ + ++ AGA + ++L
Sbjct: 352 TIIIVFVLGGLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGT 399


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs4603   TCRTETB  34     0.001    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 34.1 bits (78), Expect = 0.001
Identities = 28/168 (16%), Positives = 61/168 (36%), Gaps = 17/168 (10%)

Query: 49 FNIAQNDMISTYGLSMTQLGMIGLGFSITYGVGKTLVSYYADGKNTKQFLPFMLILSAIC 108
N++ D+ + + + F +T+ +G + +D K+ L F +I++ C
Sbjct: 33 LNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIIN--C 90

Query: 109 MLGFSASMGSGSVSLFLMIAFYALSGFFQSTGGSCSYSTI----TKWTPRRKRGTFLGFW 164
+G SL +M + F Q G + + + ++ P+ RG G
Sbjct: 91 FGSVIGFVGHSFFSLLIM------ARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLI 144

Query: 165 NISHNLGGAGAAGVALFGANYLFDGHVIGMFIFPSIIALIVGFIGLRY 212
+G + A+Y+ + + + P + I+ L
Sbjct: 145 GSIVAMGEGVGPAIGGMIAHYIHWSY---LLLIP--MITIITVPFLMK 187


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
ECs4602   UREASE  38     9e-05    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 38.2 bits (89), Expect = 9e-05
Identities = 28/105 (26%), Positives = 41/105 (39%), Gaps = 17/105 (16%)

Query: 22 AVSRGDAVADYIIDNVSILDLINGGEISGPIVIKGRYIAGVG----------AEYTDAPA 71
V+R D +I N ILD + G + I +K IA +G P
Sbjct: 60 QVTREGGAVDTVITNALILD--HWGIVKADIGLKDGRIAAIGKAGNPDMQPGVTIIVGPG 117

Query: 72 LQRIDAHGATAVPGFIDAHLHIESSMMTPVTFETATLPRGLTTVI 116
+ I G G +D+H+H + P E A L GLT ++
Sbjct: 118 TEVIAGEGKIVTAGGMDSHIH----FICPQQIEEA-LMSGLTCML 157


84  ECs4583  ECs4573  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
ECs4583   7       51       -18.367479  type III secretion system protein
ECs4582   7       50       -17.951096  EscS
ECs4581   8       50       -18.055484  EscT
ECs4580   5       47       -16.675178  secretion system apparatus protein SsaU
ECs4579   4       47       -16.255180  hypothetical protein
ECs4578   4       44       -15.214344  negative regulator GrlR
ECs4577   3       44       -14.354901  hypothetical protein
ECs4576   3       45       -13.382067  CesD
ECs4575   2       45       -13.846926  EscC
ECs4574   2       43       -12.345273  SepD
ECs4573   2       41       -10.709467  EscJ
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs4583   TYPE3IMPPROT  222    5e-76    Type III secretion system inner membrane P protein ...
		>TYPE3IMPPROT#Type III secretion system inner membrane P protein family signature.

Length = 224

Score = 222 bits (568), Expect = 5e-76
Identities = 89/212 (41%), Positives = 136/212 (64%), Gaps = 9/212 (4%)

Query: 12 IFLIIVFFLLSLLPIFVGIGTSFLKISIVLGILKNALGIQQVPPNMALTSVSLILTMFIM 71
I LI + +LLP + GT F+K SIV +++NALG+QQ+P NM L V+L+L+MF+M
Sbjct: 5 ISLIALLAFSTLLPFIIASGTCFVKFSIVFVMVRNALGLQQIPSNMTLNGVALLLSMFVM 64

Query: 72 SPIILQINDNISQEPINYTDSDFFQKVDEKILSPYRGFLEKNTEKDNVEFFERAAQKKLG 131
PI+ E + + D K ++ L YR +L K ++++ V+FFE A K+
Sbjct: 65 WPIMHDAYVYFEDEDVTFNDISSLSKHVDEGLDGYRDYLIKYSDRELVQFFENAQLKRQY 124

Query: 132 NETI---------LKKDSLFILLPAFTMGQLEAAFKIGFLLYLPFIAIDLIISNILLALG 182
E ++K S+F LLPA+ + ++++AFKIGF LYLPF+ +DL++S++LLALG
Sbjct: 125 GEETETVKRDKDEIEKPSIFALLPAYALSEIKSAFKIGFYLYLPFVVVDLVVSSVLLALG 184

Query: 183 MMMVSPVTISIPFKILLFILVGGWQKLFEFLL 214
MMM+SPVTIS P K++LF+ + GW L + L+
Sbjct: 185 MMMMSPVTISTPIKLVLFVALDGWTLLSKGLI 216


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs4582   TYPE3IMQPROT  69     2e-19    Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.

Length = 86

Score = 69.0 bits (169), Expect = 2e-19
Identities = 25/78 (32%), Positives = 45/78 (57%)

Query: 7 VQLCVQTFWIIFILSLPTVIAASVIGIIISLVQAITQLQDQTLPFLLKIIAVFATLALTY 66
V + +++ ILS I A++IG+++ L Q +TQLQ+QTLPF +K++ V L L
Sbjct: 5 VFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLFLLS 64

Query: 67 HWMGTTIINFSSIIFEMI 84
W G ++++ + +
Sbjct: 65 GWYGEVLLSYGRQVIFLA 82


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs4581   TYPE3IMRPROT  155    1e-48    Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 155 bits (394), Expect = 1e-48
Identities = 46/230 (20%), Positives = 102/230 (44%), Gaps = 4/230 (1%)

Query: 11 SFYCILRPLGMFIILPIFSTGVLLSNFIRNSIMIAFTLPIIVENYTFSEKLPSGIFQLTG 70
F+ +LR L + PI S + R + +A + + + +P F
Sbjct: 16 YFWPLLRVLALISTAPILSERSVPK---RVKLGLAMMITFAIAPSLPANDVPVFSFFALW 72

Query: 71 IALKEISIGFFIGLSFTILFWAIDAAGQIIDTLRGSTISSIFNPSISDSSSITGVILYQF 130
+A+++I IG +G + F A+ AG+II G + ++ +P+ + + I+
Sbjct: 73 LAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHLNMPVLARIMDML 132

Query: 131 ISVIFVIHGGIQSILDKLYLSYEILPLQADIAFNRALIDFLFSLWDSFIKLMLSFSVPMI 190
++F+ G ++ L ++ LP+ + + A + + F+ L ++P+I
Sbjct: 133 ALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIFL-NGLMLALPLI 191

Query: 191 IGIFLCDMGFGFLNKTAPQLNVFTLSLPVKSLIAIFILLLVIHVFPDFIT 240
+ ++ G LN+ APQL++F + P+ + I ++ ++ + F
Sbjct: 192 TLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFCE 241


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs4580   TYPE3IMSPROT  376    e-132    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 376 bits (967), Expect = e-132
Identities = 123/339 (36%), Positives = 195/339 (57%), Gaps = 4/339 (1%)

Query: 2 SEKTEKPTPKKLRDLKKKGDVTKSEEVMAAVQSLILFSFFSLYGMS--FFVDIVGLVNTT 59
EKTE+PTPKK+RD +KKG V KS+EV++ LI+ L G+S +F L+
Sbjct: 3 GEKTEQPTPKKIRDARKKGQVAKSKEVVSTA--LIVALSAMLMGLSDYYFEHFSKLMLIP 60

Query: 60 IDSLNRPFLYAIREILGAVLNIFLLYILPISLIVFVGTVTTGVSQIGFIFAVEKIKPSAQ 119
+ PF A+ ++ VL F P+ + + + + V Q GF+ + E IKP +
Sbjct: 61 AEQSYLPFSQALSYVVDNVLLEFFYLCFPLLTVAALMAIASHVVQYGFLISGEAIKPDIK 120

Query: 120 KISVKNNLKNIFSVKSIFELLKSVFKLVIIVLIFYFMGHSYANEFANFTGLNAYQALVVV 179
KI+ K IFS+KS+ E LKS+ K+V++ ++ + + ++
Sbjct: 121 KINPIEGAKRIFSIKSLVEFLKSILKVVLLSILIWIIIKGNLVTLLQLPTCGIECITPLL 180

Query: 180 AFFVFLLWKGVLFGYLLFSVFDFWFQKHEGLKKMKMSKDEVKREAKDTDGNPEIKGERRR 239
+ L G+++ S+ D+ F+ ++ +K++KMSKDE+KRE K+ +G+PEIK +RR+
Sbjct: 181 GQILRQLMVICTVGFVVISIADYAFEYYQYIKELKMSKDEIKREYKEMEGSPEIKSKRRQ 240

Query: 240 LHSEIQSGSLANNIKKSTVIVKNPTHIAICLYYKLGETPLPLVIETGKDAKALQIIKLAE 299
H EIQS ++ N+K+S+V+V NPTHIAI + YK GETPLPLV DA+ + K+AE
Sbjct: 241 FHQEIQSRNMRENVKRSSVVVANPTHIAIGILYKRGETPLPLVTFKYTDAQVQTVRKIAE 300

Query: 300 LYDIPVIEDIPLARTLYKNIHKGQYITEDFFEPVAQLIR 338
+P+++ IPLAR LY + YI + E A+++R
Sbjct: 301 EEGVPILQRIPLARALYWDALVDHYIPAEQIEATAEVLR 339
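
The E-value for this hit is printed without a mantissa ("Expect = e-132"), which most numeric parsers reject. The Python sketch below reads a "Score = ..., Expect = ..." line and compares the E-value with the 0.05 RPSBlast threshold listed in the input parameters. The names `HSP`, `parse_expect`, `parse_hsp` and `E_VALUE_CUTOFF` are illustrative, and reading "e-132" as 1e-132 is an assumption based on the usual BLAST report convention.

```python
import re

# Matches "Score = <bits> bits (<raw>), Expect = <e-value>".
HSP = re.compile(r"Score = ([\d.]+) bits \((\d+)\), Expect = (\S+)")

# Threshold taken from the "E-value (RPSBlast)" input parameter of this report.
E_VALUE_CUTOFF = 0.05


def parse_expect(text):
    """Convert an E-value string to float, restoring a dropped mantissa."""
    # Assumption: "e-132" is shorthand for 1e-132.
    return float("1" + text if text.startswith("e") else text)


def parse_hsp(line):
    """Return (bit_score, raw_score, e_value), or None if the line does not match."""
    m = HSP.search(line)
    if m is None:
        return None
    return float(m.group(1)), int(m.group(2)), parse_expect(m.group(3))


for line in ("Score = 376 bits (967), Expect = e-132",
             "Score = 26.5 bits (58), Expect = 0.048"):
    bits, raw, e_value = parse_hsp(line)
    print(bits, raw, e_value, e_value <= E_VALUE_CUTOFF)
```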


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
ECs4579   OMPTIN  26     0.048    Omptin serine protease signature.
		>OMPTIN#Omptin serine protease signature.

Length = 317

Score = 26.5 bits (58), Expect = 0.048
Identities = 13/38 (34%), Positives = 17/38 (44%), Gaps = 4/38 (10%)

Query: 115 AYNAGYFNTPNAVELRRQYAMKIYKTYNKLKNNEQIID 152
A NAGY+ TPNA + Y + K N + D
Sbjct: 254 AVNAGYYVTPNA----KVYVEGAWNRVTNKKGNTSLYD 287


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs4576   SYCDCHAPRONE  139    4e-45    Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 139 bits (352), Expect = 4e-45
Identities = 33/142 (23%), Positives = 63/142 (44%)

Query: 6 SSLEDIYDFYQDGGTLASLTNLTQQDLNDLHSYAYTAYQSGDVITARNLFHLLTYLEHWN 65
+ F + GGT+A L ++ L L+S A+ YQSG A +F L L+H++
Sbjct: 10 EYQLAMESFLKGGGTIAMLNEISSDTLEQLYSLAFNQYQSGKYEDAHKVFQALCVLDHYD 69

Query: 66 YDYTLSLGLCHQRLSNHEDAQLCFARCATLVMQDPRASYYSGISYLLVGNKKMAKKAFKA 125
+ L LG C Q + ++ A ++ A + +++PR +++ L G A+
Sbjct: 70 SRFFLGLGACRQAMGQYDLAIHSYSYGAIMDIKEPRFPFHAAECLLQKGELAEAESGLFL 129

Query: 126 CLMWCNEKEKYTTYKENIKKLL 147
+K ++ + +L
Sbjct: 130 AQELIADKTEFKELSTRVSSML 151


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs4575   TYPE3OMGPROT  559    0.0      Type III secretion system outer membrane G protein ...
		>TYPE3OMGPROT#Type III secretion system outer membrane G protein family signature.

Length = 607

Score = 559 bits (1441), Expect = 0.0
Identities = 153/494 (30%), Positives = 262/494 (53%), Gaps = 24/494 (4%)

Query: 30 KSEYFIITKSSPVRAILNDFAANYSIPVFISSSVNDDFSGEIKNEKPVKVLEKLSKLYHL 89
Y + K +R +L DF ANY V +S +ND SG+ +++ P L+ ++ LY+L
Sbjct: 33 PIPYVYVAKGESLRDLLTDFGANYDATVVVSDKINDKVSGQFEHDNPQDFLQHIASLYNL 92

Query: 90 TWYYDENILYIYKTNEISRSIITPTYLDIDSLLKYLSDTISVNKNSCNVRKITTFNSIEV 149
WYYD N+LYI+K +E++ +I + L + L + + R + + V
Sbjct: 93 VWYYDGNVLYIFKNSEVASRLIRLQESEAAELKQALQR-SGIWEPRFGWRPDASNRLVYV 151

Query: 150 RGVPECIKYITSLSESLDKEAQSKAKNKD--VVKVFKLNYASATDITYKYRDQNVVVPGV 207
G P ++ + + +L+++ Q +++ +++F L YASA+D T YRD V PGV
Sbjct: 152 SGPPRYLELVEQTAAALEQQTQIRSEKTGALAIEIFPLKYASASDRTIHYRDDEVAAPGV 211

Query: 208 VSILKTMASNGSLP--STGKGAVERSGNLFDNSVTISADPRLNAVVVKDREITMDIYQQL 265
+IL+ + S+ ++ + + ++ + ADP LNA++V+D M +YQ+L
Sbjct: 212 ATILQRVLSDATIQQVTVDNQRIPQAATRASAQARVEADPSLNAIIVRDSPERMPMYQRL 271

Query: 266 ISELDIEQRQIEISVSIIDVDANDLQQLGVNWSGTLNAGQGTIA--------FNSSTAQA 317
I LD +IE+++SI+D++A+ L +LGV+W + G N ++ A
Sbjct: 272 IHALDKPSARIEVALSIVDINADQLTELGVDWRVGIRTGNNHQVVIKTTGDQSNIASNGA 331

Query: 318 NISSSVISNASNFMIRVNALQQNSKAKILSQPSIITLNNMQAILDKNVTFYTKVSGEKVA 377
S + RVN L+ A+++S+P+++T N QA++D + T+Y KV+G++VA
Sbjct: 332 LGSLVDARGLDYLLARVNLLENEGSAQVVSRPTLLTQENAQAVIDHSETYYVKVTGKEVA 391

Query: 378 SLESITSGTLLRVTPRILDDSSNSLTGKRRERVRLLLDIQDGNQSTNQSNAQDASSTLPE 437
L+ IT GT+LR+TPR+L S + L L I+DGNQ N S + +P
Sbjct: 392 ELKGITYGTMLRMTPRVLTQGDKS-------EISLNLHIEDGNQKPNSSGIE----GIPT 440

Query: 438 VQNSEMTTEATLSAGESLLLGGFIQDKESSSKDGIPLLSDIPVIGSLFSSTVKQKHSVVR 497
+ + + T A + G+SL++GG +D+ S + +PLL DIP IG+LF + VR
Sbjct: 441 ISRTVVDTVARVGHGQSLIIGGIYRDELSVALSKVPLLGDIPYIGALFRRKSELTRRTVR 500

Query: 498 LFLIKATPIKSASS 511
LF+I+ I +
Sbjct: 501 LFIIEPRIIDEGIA 514


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs4573   FLGMRINGFLIF  56     1e-11    Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 55.7 bits (134), Expect = 1e-11
Identities = 32/166 (19%), Positives = 58/166 (34%), Gaps = 10/166 (6%)

Query: 22 EQLYTGLTEKEANQMQALLLSNDVNVSKEMDKSGNMTLSVEKEDFVRAITILNNNGFPKK 81
L++ L++++ + A L N+ + V + L G PK
Sbjct: 51 RTLFSNLSDQDGGAIVAQLTQM--NIPYRFANGSG-AIEVPADKVHELRLRLAQQGLPKG 107

Query: 82 KFADIEVIFPPSQLVASPSQENAKINYLKEQDIERLLSKIPGVIDCSVSLNVNNN----- 136
E + + S E E ++ R + + V V L +
Sbjct: 108 GAVGFE-LLDQEKFGISQFSEQVNYQRALEGELARTIETLGPVKSARVHLAMPKPSLFVR 166

Query: 137 ESQPSSAAVLVISSPEVNLAPSVIQ-IKNLVKNSVDDLKLENISVV 181
E + SA+V V P L I + +LV ++V L N+++V
Sbjct: 167 EQKSPSASVTVTLEPGRALDEGQISAVVHLVSSAVAGLPPGNVTLV 212


85  ECs4563  ECs4555  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
ECs4563   2       37       -9.050648   hypothetical protein
ECs4562   3       38       -9.589192   hypothetical protein
ECs4561   3       39       -9.859670   hypothetical protein
ECs4560   2       38       -10.705008  protein CesT
ECs4559   2       38       -9.534758   gamma intimin
ECs4558   3       39       -10.015661  EscD
ECs4557   4       42       -9.625848   SepL
ECs4556   1       41       -7.922943   protein EspA
ECs4555   4       40       -5.193177   protein EspD
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs4563   PF06704  36     6e-06    DspF/AvrF protein
		>PF06704#DspF/AvrF protein

Length = 129

Score = 36.4 bits (84), Expect = 6e-06
Identities = 21/119 (17%), Positives = 51/119 (42%), Gaps = 4/119 (3%)

Query: 3 EKFRTDLAHTFGIALEEQTDVLSFHDNDGHEW-ILECASQSEILFFYCYLLNSESIQINS 61
+ L G +L Q V + +D+ +E ++E SE++ F+C + S +
Sbjct: 9 SRLIKSLGAQLGTSLTAQNGVCALYDSQDNEAAVIEMPDHSEMVIFHCRVGRSPDRAADL 68

Query: 62 ILEMNSNRELLGMF--FLSLKDDNILLNIAFPADKIDITEFANLMENGYLLKNEIIRSL 118
++ N ++ M + ++ ++ L +D +F + G++++ R+L
Sbjct: 69 QKLLSLNFDVARMHGSWFAVDQGDVRLCAQRELAVLDEAQFCDTA-RGFIVQAREARAL 126


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs4561   TRNSINTIMINR  731    0.0      Translocated intimin receptor (Tir) signature.
		>TRNSINTIMINR#Translocated intimin receptor (Tir) signature.

Length = 549

Score = 731 bits (1887), Expect = 0.0
Identities = 328/566 (57%), Positives = 390/566 (68%), Gaps = 25/566 (4%)

Query: 1 MPIGNLGHNPNVNNSIPPAPPLPSQTDGA--GGRGQLINSTGPLGSRALFTPVRNSMADS 58
MPIGNLG+N N N+ IPPAPPLPSQTDGA GG G LI+STG LGSR+LF+P+RNSMADS
Sbjct: 1 MPIGNLGNNVNGNHLIPPAPPLPSQTDGAARGGTGHLISSTGALGSRSLFSPLRNSMADS 60

Query: 59 GDNRASDVPGLPVNPMRLAA--SEITLNDGFEVLHDHGPLDTLNRQIGSSVFRVETQEDG 116
D+R D+PGLP NP RLAA SE L GFEVLHD GPLD LN QIG S FRVE Q DG
Sbjct: 61 VDSR--DIPGLPTNPSRLAAATSETCLLGGFEVLHDKGPLDILNTQIGPSAFRVEVQADG 118

Query: 117 KHIAVGQRNGVETSVVLSDQEYARLQSIDPEGKDKFVFTGGRGGAGHAMVTVASDITEAR 176
H A+G++NG+E SV LS QE++ LQSID EGK++FVFTGGRGG+GH MVTVASDI EAR
Sbjct: 119 THAAIGEKNGLEVSVTLSPQEWSSLQSIDTEGKNRFVFTGGRGGSGHPMVTVASDIAEAR 178

Query: 177 QRILELLEPKGTGESK-GAGESKGVGELRESNSGAENTTETQTSTSTSSLRSDPKLWLAL 235
+IL L+P G + +++ VG S +ET TST+ SS+RSDPK W+++
Sbjct: 179 TKILAKLDPDNHGGRQPKDVDTRSVGVGSASGIDDGVVSETHTSTTNSSVRSDPKFWVSV 238

Query: 236 GTVATGLIGLAATGIVQALALTPEPDSPTTTDPDAAASATETATRDQLTKEAFQNPDNQK 295
G +A GL GLAATGI QALALTPEPD PTTTDPD AA+A E+AT+DQLT+EAF+NP+NQK
Sbjct: 239 GAIAAGLAGLAATGIAQALALTPEPDDPTTTDPDQAANAAESATKDQLTQEAFKNPENQK 298

Query: 296 VNIDELGNAIPSGVLKDDVVANIEEQAKAAGEEAKQQAIENNAQAQKKYDEQQAKRQEEL 355
VNID GNAIPSG LKDD+V I +QAK AGE A+QQA+E+NAQAQ++Y++Q A+RQEEL
Sbjct: 299 VNIDANGNAIPSGELKDDIVEQIAQQAKEAGEVARQQAVESNAQAQQRYEDQHARRQEEL 358

Query: 356 KVSSGAGYGLSGALILGGGIGVAVTAALHRKNQPVEQTTTTTTTTTTTSARTVENKPANN 415
++SSG GYGLS ALI+ GGIG VT ALHR+NQP EQTTTTTT TV +
Sbjct: 359 QLSSGIGYGLSSALIVAGGIGAGVTTALHRRNQPAEQTTTTTT-------HTVVQQQTGG 411

Query: 416 TPAQGNVDTPGSEDTMESRRSSMASTSSTFFDTSSIGTVQNPYADV---KTSLHDSQVPT 472
P P RR S S +ST + SS V NPYA+V + SL Q
Sbjct: 412 IPQHKVALMPQERRRFSDRRDSQGSVASTHWSDSS-SEVVNPYAEVGGARNSLSAHQPEE 470

Query: 473 SNSNTSVQNMGNTDSVVYSTIQHPPRDTTDNGARLLGNPSAGIQSTYARLALSGGLRHDM 532
+ + G YS IQ+ G RL+G P GIQSTYA LA SGGLR M
Sbjct: 471 HIYDEVAADPG------YSVIQNFSGSGPVTG-RLIGTPGQGIQSTYALLANSGGLRLGM 523

Query: 533 GGLTGGSNSAVNTSNNPPAPGSHRFV 558
GGLT G +AV++ N P PG RFV
Sbjct: 524 GGLTSGGETAVSSVNAAPTPGPVRFV 549


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs4560   PF05932  122    4e-39    Tir chaperone protein (CesT)
		>PF05932#Tir chaperone protein (CesT)

Length = 127

Score = 122 bits (309), Expect = 4e-39
Identities = 24/125 (19%), Positives = 52/125 (41%), Gaps = 5/125 (4%)

Query: 1 MSSRS-ELLLEKFAEKIGIGSISFNENRLCSFAIDEIYYISLS-DANDEYMMIYGVCGKF 58
MS+ + LL+ F+ + + + F+++ C+ ID + ++LS D E +++ G+
Sbjct: 1 MSNLFYKTLLDDFSRSLEMQPLVFDDHGTCNMIIDNTFALTLSCDYARERLLLIGLLEPH 60

Query: 59 PTDNSNFALEILNANLWFAENGGPYLCYEAGAQSLLLALRFPLDDATPEKLENEIEVVVK 118
+L L N GP L + + P + + L+ E+ +++
Sbjct: 61 KD---IPQQCLLAGALNPLLNAGPGLGLDEKSGLYHAYQSIPREKLSVPTLKREMAGLLE 117

Query: 119 SMENL 123
M
Sbjct: 118 WMRGW 122


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs4559   INTIMIN  1459   0.0      Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 1459 bits (3777), Expect = 0.0
Identities = 780/942 (82%), Positives = 837/942 (88%), Gaps = 11/942 (1%)

Query: 1 MITHGCYTRTRHKHKLKKTLIMLSAGLGLFFYVNQNSFANGENYFKLGSDSKLLTHDSYQ 60
MITHG Y RTRHKHKLKKT IMLSAGLGLFFYVNQNSFANGENYFKLGSDSKLLTH+SYQ
Sbjct: 1 MITHGFYARTRHKHKLKKTFIMLSAGLGLFFYVNQNSFANGENYFKLGSDSKLLTHNSYQ 60

Query: 61 NRLFYTLKTGETVADLSKSQDINLSTIWSLNKHLYSSESEMMKAAPGQQIILPLKKLPFE 120
NRLFYTLKTGETVADLSKSQDINLSTIWSLNKHLYSSESEMMKA PGQQIILPLKKLPFE
Sbjct: 61 NRLFYTLKTGETVADLSKSQDINLSTIWSLNKHLYSSESEMMKAEPGQQIILPLKKLPFE 120

Query: 121 YSALPLLGSAPLVAAGGVAGHTNKLTKMSPDVTKSNMTDDKALNYAAQQAASLGSQLQSR 180
YSALPLLGSAPLVAAGGVAGHTNKLTKMSPDVTKSNMTDDKALNYAAQQAASLGSQLQSR
Sbjct: 121 YSALPLLGSAPLVAAGGVAGHTNKLTKMSPDVTKSNMTDDKALNYAAQQAASLGSQLQSR 180

Query: 181 SLNGDYAKDTALGIAGNQASSQLQAWLQHYGTAEVNLQSGNNFDGSSLDFLLPFYDSEKM 240
SLNGDYAKDTALGIAGNQASSQLQAWLQHYGTAEVNLQSGNNFDGSSLDFLLPFYDSEKM
Sbjct: 181 SLNGDYAKDTALGIAGNQASSQLQAWLQHYGTAEVNLQSGNNFDGSSLDFLLPFYDSEKM 240

Query: 241 LAFGQVGARYIDSRFTANLGAGQRFFLPANMLGYNVFIDQDFSGDNTRLGIGGEYWRDYF 300
LAFGQVGARYIDSRFTANLGAGQRFFLP NMLGYNVFIDQDFSGDNTRLGIGGEYWRDYF
Sbjct: 241 LAFGQVGARYIDSRFTANLGAGQRFFLPENMLGYNVFIDQDFSGDNTRLGIGGEYWRDYF 300

Query: 301 KSSVNGYFRMSGWHESYNKKDYDERPANGFDIRFNGYLPSYPALGAKLIYEQYYGDNVAL 360
KSSVNGYFRMSGWHESYNKKDYDERPANGFDIRFNGYLPSYPALGAKL+YEQYYGDNVAL
Sbjct: 301 KSSVNGYFRMSGWHESYNKKDYDERPANGFDIRFNGYLPSYPALGAKLMYEQYYGDNVAL 360

Query: 361 FNSDKLQSNPGAATVGVNYTPIPLVTMGIDYRHGTGNENDLLYSMQFRYQFDKSWSQQIE 420
FNSDKLQSNPGAATVGVNYTPIPLVTMGIDYRHGTGNENDLLYSMQFRYQFDK WSQQIE
Sbjct: 361 FNSDKLQSNPGAATVGVNYTPIPLVTMGIDYRHGTGNENDLLYSMQFRYQFDKPWSQQIE 420

Query: 421 PQYVNELRTLSGSRYDLVQRNNNIILEYKKQDILSLNIPHDINGTEHSTQKIQLIVKSKY 480
PQYVNELRTLSGSRYDLVQRNNNIILEYKKQDILSLNIPHDINGTE STQKIQLIVKSKY
Sbjct: 421 PQYVNELRTLSGSRYDLVQRNNNIILEYKKQDILSLNIPHDINGTERSTQKIQLIVKSKY 480

Query: 481 GLDRIVWDDSALRSQGGQIQHSGSQSAQDYQAILPAYVQGGSNIYKVTARAYDRNGNSSN 540
GLDRIVWDDSALRSQGGQIQHSGSQSAQDYQAILPAYVQGGSN+YKVTARAYDRNGNSSN
Sbjct: 481 GLDRIVWDDSALRSQGGQIQHSGSQSAQDYQAILPAYVQGGSNVYKVTARAYDRNGNSSN 540

Query: 541 NVQLTITVLSNGQVVDQVGVTDFTADKTSAKADNADTITYTATVKKNGVAQANVPVSFNI 600
NV LTITVLSNGQVVDQVGVTDFTADKTSAKAD + ITYTATVKKNGVAQANVPVSFNI
Sbjct: 541 NVLLTITVLSNGQVVDQVGVTDFTADKTSAKADGTEAITYTATVKKNGVAQANVPVSFNI 600

Query: 601 VSGTATLGANSAKTDANGKATVTLKSSTPGQVVVSAKTAEMTSALNASAVIFFDQTKASI 660
VSGTA L ANSA T+ +GKATVTLKS PGQVVVSAKTAEMTSALNA+AVIF DQTKASI
Sbjct: 601 VSGTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAKTAEMTSALNANAVIFVDQTKASI 660

Query: 661 TEIKADKTTAVANGKDAIKYTVKVMKNGQPVNNQSVTFSTNFGMFNGKSQTQATTGNDGR 720
TEIKADKTTAVANG+DAI YTVKVMK +PV+NQ VTF+T G + + T +G
Sbjct: 661 TEIKADKTTAVANGQDAITYTVKVMKGDKPVSNQEVTFTTTLGKLS---NSTEKTDTNGY 717

Query: 721 ATITLTSSSAGKATVSATVSDGA-EVKATEVTFFDELKID-NKVDIIGNNVRGELPNIWL 778
A +TLTS++ GK+ VSA VSD A +VKA EV FF L ID ++I+G V+G+LP +WL
Sbjct: 718 AKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFFTTLTIDDGNIEIVGTGVKGKLPTVWL 777

Query: 779 QYGQFKLKASGGDGTYSWYSENTSIATVDA-SGKVTLNGKGSVVIKATSGDKQTVSYTIK 837
QYGQ LKASGG+G Y+W S N +IA+VDA SG+VTL KG+ I S D QT +YTI
Sbjct: 778 QYGQVNLKASGGNGKYTWRSANPAIASVDASSGQVTLKEKGTTTISVISSDNQTATYTIA 837

Query: 838 APSYMI--KVDKQAYYADAMSICKNL---LPSTQTVLSDIYDSWGAANKYSHYSSMNSIT 892
P+ +I + K+ Y DA++ CKN LPS+Q L +++ +WGAANKY +Y S +I
Sbjct: 838 TPNSLIVPNMSKRVTYNDAVNTCKNFGGKLPSSQNELENVFKAWGAANKYEYYKSSQTII 897

Query: 893 AWIKQTSSEQRSGVSSTYNLITQNPLPGVNVNTPNVYAVCVE 934
+W++QT+ + +SGV+STY+L+ QNPL + + N YA CV+
Sbjct: 898 SWVQQTAQDAKSGVASTYDLVKQNPLNNIKASESNAYATCVK 939


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs4557   PF07201  28     0.047    Hypersensitivity response secretion protein HrpJ
		>PF07201#Hypersensitivity response secretion protein HrpJ

Length = 293

Score = 28.3 bits (63), Expect = 0.047
Identities = 43/225 (19%), Positives = 73/225 (32%), Gaps = 23/225 (10%)

Query: 39 SPLINLQNELAMITSSSLSETIEGLSLGYRK---GSARKEEEGSTIEKLLNDMQELLTLT 95
+ ++ E+ SE E LSL RK AR + + + L+ + EL
Sbjct: 47 QSIADMAEEVTF----VFSERKE-LSLDKRKLSDSQARVSDVEEQVNQYLSKVPEL---E 98

Query: 96 DSDKIKELS--LKNSGL--LEQHDPTLAMFGNMPKGEIVALISSLLQSK--FVKIELKKK 149
+ EL L NS L Q L P + L K L
Sbjct: 99 QKQNVSELLSLLSNSPNISLSQLKAYLEGKSEEPSEQFKMLCGLRDALKGRPELAHLSHL 158

Query: 150 YARLLLDLLGEDDWELAL-----LSWLGVGELNQEGIQKIKKLYEKAKDEDSENGASLLD 204
+ L+ + E + L + +Q ++ Y A + ++
Sbjct: 159 VEQALVSMAEEQGETIVLGARITPEAYRESQSGVNPLQPLRDTYRDAV-MGYQGIYAIWS 217

Query: 205 WFMEIKDLPEREKHLKVIIRALSFDLSYMSSFEDKVKTSSIISDL 249
+ + + + + +ALS DL S + K +ISDL
Sbjct: 218 DLQKRFPNGDIDSVILFLQKALSADLQSQQSGSGREKLGIVISDL 262


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
ECs4555   BACINVASINB  30     0.020    Salmonella/Shigella invasin protein B signature.
		>BACINVASINB#Salmonella/Shigella invasin protein B signature.

Length = 593

Score = 29.7 bits (66), Expect = 0.020
Identities = 29/102 (28%), Positives = 52/102 (50%), Gaps = 9/102 (8%)

Query: 112 MMMVTLLSLDTSAQKVSSLKNSNEIY---MDGQTKALENKTQEYKKQLEEQQKAEEKSQK 168
M+M + + SL+N ++ +G+ +E K+ E++ EE +KAEE ++
Sbjct: 258 MLMAMFIEI-VGKNTEESLQNDLALFNALQEGRQAEMEKKSAEFQ---EETRKAEETNRI 313

Query: 169 SKIVGQVFGWLGVALTAVAAVFNPALWAVVAIGATAMALQTA 210
+G+V G L ++ VAAVF A +A+ A +A+ A
Sbjct: 314 MGCIGKVLGALLTIVSVVAAVFTGG--ASLALAAVGLAVMVA 353


86  ECs4437  ECs4432  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
ECs4437   -1      12       1.266909   outer membrane lipoprotein
ECs4436   -1      12       0.103091   biotin sulfoxide reductase
ECs4435   -1      14       -2.718891  hypothetical protein
ECs4434   0       14       -3.242100  3-methyladenine DNA glycosylase
ECs4433   -1      16       -3.221510  lipase
ECs4432   -2      15       -2.781551  resistance protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
ECs4437   OMPADOMAIN  113    2e-32    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 113 bits (285), Expect = 2e-32
Identities = 41/122 (33%), Positives = 62/122 (50%), Gaps = 11/122 (9%)

Query: 108 LNMPNNVTFDSSSATLKPAGANTLTGVAMVLKEY--PKTAVNVIGYTDSTGGHDLNMRLS 165
+ ++V F+ + ATLKP G L + L +V V+GYTD G N LS
Sbjct: 215 FTLKSDVLFNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAYNQGLS 274

Query: 166 QQRADSVASALITQGVDASRIRTQGLGPANPIASNSTAEGK---------AQNRRVEITL 216
++RA SV LI++G+ A +I +G+G +NP+ N+ K A +RRVEI +
Sbjct: 275 ERRAQSVVDYLISKGIPADKISARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEV 334

Query: 217 SP 218

Sbjct: 335 KG 336


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs4435   SACTRNSFRASE  33     2e-04    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 33.0 bits (75), Expect = 2e-04
Identities = 16/52 (30%), Positives = 22/52 (42%), Gaps = 5/52 (9%)

Query: 76 VAPKAVRRGIGKALMQYV-----QQRYPHLMLEVYQKNQPAIDFYQAQGFHI 122
VA ++G+G AL+ + + LMLE N A FY F I
Sbjct: 97 VAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFII 148


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs4433   ECOLNEIPORIN  27     0.045    E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 27.5 bits (61), Expect = 0.045
Identities = 22/117 (18%), Positives = 47/117 (40%), Gaps = 16/117 (13%)

Query: 119 SMYNEFGDSTTTLTDPLWHASVSTLGWRVDSRLGDLRPWAQISYNQQFGENIWKAQSGLS 178
S+ + D+ + H S + + + R G++ P ++SY F +
Sbjct: 228 SVAVQQQDAKLV-EENYSHNSQTEVAATLAYRFGNVTP--RVSYAHGFKGSF-------- 276

Query: 179 RMTATNQNGNWLDVTVGADMLLNQNIAAYAA---LSQAENTTNNSDYLYTMGVSARF 232
ATN N ++ V VGA+ ++ +A + L + + + +G+ +F
Sbjct: 277 --DATNYNNDYDQVVVGAEYDFSKRTSALVSAGWLQEGKGESKFVSTAGGVGLRHKF 331


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs4432   TCRTETA  43     2e-06    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 42.5 bits (100), Expect = 2e-06
Identities = 47/275 (17%), Positives = 94/275 (34%), Gaps = 32/275 (11%)

Query: 44 PVSQVAFSFGLLSLGLAIS----SSVAGKLQERFGVKRVTIASGILLGLGFFLTAHSDNL 99
+ V +G+L A+ + V G L +RFG + V + S + + + A + L
Sbjct: 37 HSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFL 96

Query: 100 MMLWLS---AGVLVGLADGAGYLL----TLSNCVKWFPERKGLISAFAIGSYGLGSLGFK 152
+L++ AG+ AG + + F + LG
Sbjct: 97 WVLYIGRIVAGITGATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGG---- 152

Query: 153 FIDTQLLETVGLEKTFVIWGAIALVMIVFGATLMKDAPKQEVKTSNGVVEKDYTLAESMR 212
L+ F A+ + + G L+ ++ K E + R
Sbjct: 153 -----LMGGFSPHAPFFAAAALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWAR 207

Query: 213 --KPQYWMLAVMFLTACMSG----LYVIGVAKDIAQSLAHLDVVSAANAVTVISIAN-LS 265
++AV F+ + L+VI + H D + ++ I + L+
Sbjct: 208 GMTVVAALMAVFFIMQLVGQVPAALWVI-----FGEDRFHWDATTIGISLAAFGILHSLA 262

Query: 266 GRLVLGILSDKIARIRVITIGQVISLVGMAALLFA 300
++ G ++ ++ R + +G + G L FA
Sbjct: 263 QAMITGPVAARLGERRALMLGMIADGTGYILLAFA 297



Score = 36.7 bits (85), Expect = 1e-04
Identities = 37/155 (23%), Positives = 63/155 (40%), Gaps = 9/155 (5%)

Query: 241 AQSLAHLDVVSAANAVTVISIANLSGRLVLGILSDKIARIRVITIGQVISLVGMAALLFA 300
AH ++ A A+ + A + G L SD+ R V+ + + V A + A
Sbjct: 39 NDVTAHYGILLALYALMQFACAPVLGAL-----SDRFGRRPVLLVSLAGAAVDYAIMATA 93

Query: 301 PLNAVTFFAAIACVAFNFGGTITVFPSLVSEFFGLNNLAKNYGVIYLGFGIGSICGSIIA 360
P V + I VA G T V + +++ + A+++G + FG G + G ++
Sbjct: 94 PFLWVLYIGRI--VAGITGATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLG 151

Query: 361 SLFGGFYVK--FYVIFALLILSLALSTTIRQPEQK 393
L GGF F+ AL L+ + K
Sbjct: 152 GLMGGFSPHAPFFAAAALNGLNFLTGCFLLPESHK 186
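
This hit reports two separate alignments (two "Score = ..." blocks) against the same signature. To compare which regions each one covers, the start and end coordinates on the "Query:" and "Sbjct:" rows can be collected. The Python sketch below does this for the first and last row pairs of the second alignment. The names `COORD` and `hsp_ranges` are illustrative and not part of the tool.

```python
import re

# Matches "Query: <start> <sequence> <end>" and "Sbjct: <start> <sequence> <end>".
COORD = re.compile(r"^(Query|Sbjct): (\d+) (\S+) (\d+)$")


def hsp_ranges(lines):
    """Return {"Query": (start, end), "Sbjct": (start, end)} over all rows of one alignment."""
    ranges = {}
    for line in lines:
        m = COORD.match(line.strip())
        if m is None:
            continue  # match lines and blank lines carry no coordinates
        name, start, end = m.group(1), int(m.group(2)), int(m.group(4))
        lo, hi = ranges.get(name, (start, end))
        ranges[name] = (min(lo, start), max(hi, end))
    return ranges


rows = [
    "Query: 241 AQSLAHLDVVSAANAVTVISIANLSGRLVLGILSDKIARIRVITIGQVISLVGMAALLFA 300",
    "Sbjct: 39 NDVTAHYGILLALYALMQFACAPVLGAL-----SDRFGRRPVLLVSLAGAAVDYAIMATA 93",
    "Query: 361 SLFGGFYVK--FYVIFALLILSLALSTTIRQPEQK 393",
    "Sbjct: 152 GLMGGFSPHAPFFAAAALNGLNFLTGCFLLPESHK 186",
]
ranges = hsp_ranges(rows)
print(ranges["Query"], ranges["Sbjct"])
```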


87  ECs4364  ECs4358  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
ECs4364   -1      16       -3.528755  hypothetical protein
ECs4363   1       17       -3.771301  hypothetical protein
ECs4362   1       17       -3.683417  hypothetical protein
ECs4361   -1      14       -0.811106  hypothetical protein
ECs4360   0       17       2.488543   hypothetical protein
ECs4359   1       17       1.233101   ABC transporter ATP-binding protein
ECs4358   0       16       -0.338269  transporter
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
ECs4364   ALARACEMASE  29     0.033    Alanine racemase signature.
		>ALARACEMASE#Alanine racemase signature.

Length = 356

Score = 29.0 bits (65), Expect = 0.033
Identities = 23/98 (23%), Positives = 38/98 (38%), Gaps = 18/98 (18%)

Query: 226 ENLLFTHRGLSGPAVLQISSYWQPGEFVSINLLPDVDLETFL--NEQRNAHPNQSLKNTL 283
E + RG GP +L + ++ + + + L T + N Q A N LK L
Sbjct: 63 EAITLRERGWKGP-ILMLEGFFHAQD---LEIYDQHRLTTCVHSNWQLKALQNARLKAPL 118

Query: 284 AVHL------------PKRLVERLQQLGQIPDVSLKQL 309
++L P R++ QQL + +V L
Sbjct: 119 DIYLKVNSGMNRLGFQPDRVLTVWQQLRAMANVGEMTL 156


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
ECs4360   RTXTOXIND  83     8e-20    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 83.3 bits (206), Expect = 8e-20
Identities = 72/408 (17%), Positives = 138/408 (33%), Gaps = 81/408 (19%)

Query: 6 RHLAWWGVGLLAVAAIVAWWLLRPAGVP-EGFAVSNGRIEATEVDIASKIAGRIDTILVK 64
R +A++ +G L +A I++ G +GR + I + I+VK
Sbjct: 58 RLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKE----IKPIENSIVKEIIVK 113

Query: 65 EGQFVREGEVLAKMDTRV----------------LQEQRLEAI----------------- 91
EG+ VR+G+VL K+ L++ R + +
Sbjct: 114 EGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDE 173

Query: 92 -------------------AQIKEAQSAVAAAQALLEQRQSETRAAQSLVNQRQAELDSV 132
Q Q+ + L+++++E + +N+ +
Sbjct: 174 PYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVE 233

Query: 133 AKRHTRSRSLAQRGAISAQQLDDDRAAAESARAALESAKAQVSASKAAIEAARTNIIQ-- 190
R SL + AI+ + + A L K+Q+ ++ I +A+
Sbjct: 234 KSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVT 293

Query: 191 -----------AQTRVEAAQATERRIAADID--DSELKAPRDGRV-QYRVAEPGEVLAAG 236
QT T + S ++AP +V Q +V G V+
Sbjct: 294 QLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTA 353

Query: 237 GRVLNMVDLSDVY-MTFFLPTEQAGTLKLGGEARLILDAAPDLRIPATISFVASVAQFTP 295
++ +V D +T + + G + +G A + ++A P R V V
Sbjct: 354 ETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYG---YLVGKVKNINL 410

Query: 296 KTVETSDERLKLMFRVKARIPPELLQQHLEYV--KTGLPGVAWVRVNE 341
+E D+RL L+F V I L + + +G+ A ++
Sbjct: 411 DAIE--DQRLGLVFNVIISIEENCLSTGNKNIPLSSGMAVTAEIKTGM 456


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs4359   PF05272  30     0.043    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.4 bits (68), Expect = 0.043
Identities = 9/26 (34%), Positives = 14/26 (53%)

Query: 20 ARCMVGLIGPDGVGKSSLLSLISGAR 45
V L G G+GKS+L++ + G
Sbjct: 595 FDYSVVLEGTGGIGKSTLINTLVGLD 620


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs4358   ABC2TRNSPORT  51     2e-09    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 51.1 bits (122), Expect = 2e-09
Identities = 41/171 (23%), Positives = 73/171 (42%), Gaps = 7/171 (4%)

Query: 200 REREHGTVEHLLVMPITPFEIMMAKI-WSMGLVVLVVSGLSLVLMVKGVLGVPIEGSIPL 258
R T E +L + +I++ ++ W+ L +G+ +V G + L
Sbjct: 93 RMEGQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGAGIGVVAAALGY----TQWLSLL 148

Query: 259 FMLGV-ALSLFATTSIGIFMGTIARSMPQLGLLVILVLLPLQMLSGGSTPRESMPQMVQD 317
+ L V AL+ A S+G+ + +A S LV+ P+ LSG P + +P + Q
Sbjct: 149 YALPVIALTGLAFASLGMVVTALAPSYDYFIFYQTLVITPILFLSGAVFPVDQLPIVFQT 208

Query: 318 IMLIMPTTHFVSLAQAILYRGAGFEIVWPQFLTLMAIGGAFF-TIALLRFR 367
+P +H + L + I+ ++ + I FF + ALLR R
Sbjct: 209 AARFLPLSHSIDLIRPIMLGHPVVDVCQHVGALCIYIVIPFFLSTALLRRR 259


88  ECs4299  ECs4290  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
ECs4299   -2      22       3.376879   glycerol-3-phosphate transporter periplasmic
ECs4298   -2      21       3.620203   glycerol-3-phosphate transporter permease
ECs4297   -1      17       1.517603   glycerol-3-phosphate transporter membrane
ECs4296   0       16       -2.085608  glycerol-3-phosphate transporter ATP-binding
ECs4295   0       18       -3.803139  glycerophosphodiester phosphodiesterase
ECs4294   0       17       -4.411001  hypothetical protein
ECs4293   0       15       -4.004171  gamma-glutamyltranspeptidase
ECs4292   0       17       -5.228783  hypothetical protein
ECs4291   0       16       -3.372629  hypothetical protein
ECs4290   -1      15       1.362285   acetyltransferase YhhY
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
ECs4299   MALTOSEBP  39     2e-05    Maltose binding protein signature.
		>MALTOSEBP#Maltose binding protein signature.

Length = 396

Score = 39.3 bits (91), Expect = 2e-05
Identities = 39/160 (24%), Positives = 66/160 (41%), Gaps = 14/160 (8%)

Query: 134 GHLLSQPFNSSTPVLYYNKDAFKKAGLDPEQPPKTWQDLADYAAKLKASGMKCGYASGWQ 193
G L++ P L YNKD PPKTW+++ +LKA G + +
Sbjct: 127 GKLIAYPIAVEALSLIYNKDLLP-------NPPKTWEEIPALDKELKAKGKSALMFNLQE 179

Query: 194 GWIQLENFSAWNGLPFASKNNGFDGTDAVLEF--NKPEQVKHIAMLEEMNKKGDFSYVGR 251
+ +A G F +N +D D ++ K + +++ + D Y
Sbjct: 180 PYFTWPLIAADGGYAFKYENGKYDIKDVGVDNAGAKAGLTFLVDLIKNKHMNADTDY--- 236

Query: 252 KDESTEKFYNGDCAMTTASSGSLANIREYAKFNYGVGMMP 291
+ F G+ AMT + +NI + +K NYGV ++P
Sbjct: 237 -SIAEAAFNKGETAMTINGPWAWSNI-DTSKVNYGVTVLP 274


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs4296   PF05272  31     0.010    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.8 bits (69), Expect = 0.010
Identities = 11/35 (31%), Positives = 19/35 (54%)

Query: 33 IVMVGPSGCGKSTLLRMVAGLERVTEGDICINDQR 67
+V+ G G GKSTL+ + GL+ ++ I +
Sbjct: 599 VVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGTGK 633


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs4295   PF04619  30     0.004    Dr-family adhesin
		>PF04619#Dr-family adhesin

Length = 160

Score = 30.3 bits (68), Expect = 0.004
Identities = 13/63 (20%), Positives = 23/63 (36%), Gaps = 4/63 (6%)

Query: 29 VGAKYGHKMIEFDAKLSKDGEIFLLHDDNLERTSNGWGVAGELNWQD----LLRVDAGSW 84
+G ++ D + G+ FL+ D+N ++ W + D GSW
Sbjct: 70 LGCDARQVALKADTDNFEQGKFFLISDNNRDKLYVNIRPTDNSAWTTDNGVFYKNDVGSW 129

Query: 85 YGK 87
G
Sbjct: 130 GGI 132


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
ECs4293   NAFLGMOTY  32     0.007    Sodium-type flagellar protein MotY precursor signature.
		>NAFLGMOTY#Sodium-type flagellar protein MotY precursor signature.

Length = 293

Score = 31.6 bits (71), Expect = 0.007
Identities = 27/82 (32%), Positives = 37/82 (45%), Gaps = 17/82 (20%)

Query: 276 RTPISGDYRGYQVYSMPPPSSGGIHIVQILNI--LENFDMKKYGF-GSADAMQIMAEAEK 332
R P+ G+ R + SMPPP G H +I N+ + FD G+ G A I++E EK
Sbjct: 77 RRPM-GETRNVSLISMPPPWRPGEHADRITNLKFFKQFD----GYVGGQTAWGILSELEK 131

Query: 333 YAYADRSEYLGDPDFVKVPWQA 354
Y P F WQ+
Sbjct: 132 GRY---------PTFSYQDWQS 144


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs4290   SACTRNSFRASE  36     1e-05    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 36.5 bits (84), Expect = 1e-05
Identities = 21/92 (22%), Positives = 33/92 (35%), Gaps = 16/92 (17%)

Query: 55 VACIDGDVVGHLTIDVQQRPRRSHVADFGICVDARWKNRGVASALMREMIE------MCD 108
+ ++ + +G + I + + D + D R K GV +AL+ + IE C
Sbjct: 69 LYYLENNCIGRIKIR-SNWNGYALIEDIAVAKDYRKK--GVGTALLHKAIEWAKENHFCG 125

Query: 109 NWLRVDRIELTVFVDNAPAIKVYKKFGFEIEG 140
L I N A Y K F I
Sbjct: 126 LMLETQDI-------NISACHFYAKHHFIIGA 150


89  ECs4206  ECs4197  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
ECs4206   -2      14       2.899235   phosphoribulokinase
ECs4205   -1      15       2.985328   hypothetical protein
ECs4204   -2      15       3.229035   hydrolase
ECs4203   -1      15       3.289608   ABC transporter ATP-binding protein
ECs4202   -1      16       1.746263   glutathione-regulated potassium-efflux system
ECs4201   0       15       0.366306   glutathione-regulated potassium-efflux system
ECs5532   2       22       -0.758628  hypothetical protein
ECs4200   3       21       -0.626361  FKBP-type peptidylprolyl isomerase
ECs4199   3       18       -1.573105  hypothetical protein
ECs4198   2       22       -1.373235  FKBP-type peptidylprolyl isomerase
ECs4197   2       24       -0.915143  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs4206   PF07299  32     0.002    Fibronectin-binding protein (FBP)
		>PF07299#Fibronectin-binding protein (FBP)

Length = 219

Score = 31.8 bits (72), Expect = 0.002
Identities = 10/46 (21%), Positives = 21/46 (45%), Gaps = 2/46 (4%)

Query: 71 PEANDFGLLEQTFIEYGQSGKGKSRKYLHTYDEAVPWNQVPGTFTP 116
P+ + + E ++ KG SRK++ ++ + + GTF
Sbjct: 112 PDMEELDMKELSY--LSWIDKGSSRKFIIAKNDKNKFVGLQGTFQS 155


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
ECs4203   GPOSANCHOR  33     0.005    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 32.7 bits (74), Expect = 0.005
Identities = 28/152 (18%), Positives = 54/152 (35%), Gaps = 22/152 (14%)

Query: 504 KVEPFDGDLEDYQQWLSDVQKQENQTDEAPKENANSAQARKDQKRREAELRAQTQPLRKE 563
+ D + ++ E + + ++ R+ +R R + L E
Sbjct: 272 AMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEAE 331

Query: 564 IARLEKEME---------------------KLNAQLAQAEEKLGDSELYDQSRKAELTAC 602
+LE++ + +L A+ + EE+ SE QS + +L A
Sbjct: 332 HQKLEEQNKISEASRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDAS 391

Query: 603 LQQQASAKSGLEECEMAWLEAQEQLEQMLLEG 634
+ + + LEE L A E+L + L E
Sbjct: 392 REAKKQVEKALEEANSK-LAALEKLNKELEES 422



Score = 32.0 bits (72), Expect = 0.008
Identities = 13/125 (10%), Positives = 39/125 (31%), Gaps = 7/125 (5%)

Query: 513 EDYQQWLSDVQKQENQTDEAPKENANSAQARKDQKRREAELRAQTQPLRKEIARLEKEME 572
+ + ++ + E A A + D ++ + +++
Sbjct: 127 KALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFST-------ADSAKIK 179

Query: 573 KLNAQLAQAEEKLGDSELYDQSRKAELTACLQQQASAKSGLEECEMAWLEAQEQLEQMLL 632
L A+ A E + + E + TA + + ++ + ++ LE +
Sbjct: 180 TLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMN 239

Query: 633 EGQSN 637
++
Sbjct: 240 FSTAD 244


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs4202   ISCHRISMTASE  32     0.001    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 31.9 bits (72), Expect = 0.001
Identities = 32/135 (23%), Positives = 51/135 (37%), Gaps = 16/135 (11%)

Query: 12 YAHPESQDSVANRVLLKPATQLSNVTVHDLYAHYPDFFIDIPREQALLREHEVIVFQH-- 69
Y P + D N+V P + + +HD+ ++ D F L + +
Sbjct: 9 YQMPTASDMPQNKVSWVPDPNRAVLLIHDMQNYFVDAFTAGASPVTELSANIRKLKNQCV 68

Query: 70 ----PLYTYSCPALLKEWLDRVLSRGFASGPGGNQLAGKYWRSVITTGEPESA------Y 119
P+ + P DR L F GPG N +G Y +IT PE +
Sbjct: 69 QLGIPVVYTAQPGSQNP-DDRALLTDFW-GPGLN--SGPYEEKIITELAPEDDDLVLTKW 124

Query: 120 RYDALNRYPMSDVLR 134
RY A R + +++R
Sbjct: 125 RYSAFKRTNLLEMMR 139


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
ECs4201   60KDINNERMP  31     0.021    60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 30.7 bits (69), Expect = 0.021
Identities = 13/69 (18%), Positives = 29/69 (42%), Gaps = 6/69 (8%)

Query: 261 TAIDPFKGLLLG---LFFISVGMSLNLGVLYTHL-LWVVISVVVLVAVKILVLYLLARLY 316
A+ P L + L+FIS + L +++ + W +++ V+ ++ L
Sbjct: 318 AAVAPHLDLTVDYGWLWFISQPLFKLLKWIHSFVGNWGFSIIIITFIVRGIMYPLTKA-- 375

Query: 317 GVRSSERMQ 325
S +M+
Sbjct: 376 QYTSMAKMR 384


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs4198   INFPOTNTIATR  132    5e-40    Macrophage infectivity potentiator signature.
		>INFPOTNTIATR#Macrophage infectivity potentiator signature.

Length = 233

Score = 132 bits (334), Expect = 5e-40
Identities = 79/226 (34%), Positives = 124/226 (54%), Gaps = 9/226 (3%)

Query: 28 AAKPATTADSKAAFKNDDQKSAYALGASLGRYMENSLKEQEKLGIKLDKDQLIAGVQDAF 87
A A A + D K +Y++GA LG K + GI ++ D L G+QD
Sbjct: 14 AMSTAMAATDATSLTTDKDKLSYSIGADLG-------KNFKNQGIDINPDVLAKGMQDGM 66

Query: 88 A-DKSKLSDQEIEQTLQAFEARVKSSAQAKMEKDAADNEAKGKEYREKFAKEKGVKTSST 146
+ + L++++++ L F+ + + A+ K A +N+AKG + + G+ +
Sbjct: 67 SGAQLILTEEQMKDVLSKFQKDLMAKRSAEFNKKAEENKAKGDAFLSANKSKPGIVVLPS 126

Query: 147 GLVYQVVEAGKGEAPKDSDTVVVNYKGTLIDGKEFDNSYTRGEPLSFRLDGVIPGWTEGL 206
GL Y++++AG G P SDTV V Y GTLIDG FD++ G+P +F++ VIPGWTE L
Sbjct: 127 GLQYKIIDAGTGAKPGKSDTVTVEYTGTLIDGTVFDSTEKAGKPATFQVSQVIPGWTEAL 186

Query: 207 KNIKKGGKIKLVIPPELAYGKAGVPG-IPPNSTLVFDVELLDVKPA 251
+ + G ++ +P +LAYG V G I PN TL+F + L+ VK A
Sbjct: 187 QLMPAGSTWEVFVPADLAYGPRSVGGPIGPNETLIFKIHLISVKKA 232


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs4197   ACRIFLAVINRP  29     0.022    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 29.0 bits (65), Expect = 0.022
Identities = 14/62 (22%), Positives = 29/62 (46%), Gaps = 1/62 (1%)

Query: 160 ASSVEDLVTQTLEFTIEEVNADRNV-SNNAKNRQIVLNLYEKGIFDIKDAINQVADRLNI 218
A +V+D VTQ +E + ++ + S + + + L + D A QV ++L +
Sbjct: 54 AQTVQDTVTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQL 113

Query: 219 SK 220
+
Sbjct: 114 AT 115


90  ECs4191  ECs4188  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
ECs4191   3       45       -1.882576  elongation factor G
ECs4190   2       30       -1.981311  elongation factor Tu
ECs5531   0       27       -3.364367  bacterioferritin-associated ferredoxin
ECs4189   3       31       -2.036727  bacterioferritin
ECs4188   6       38       -0.706666  HopD
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
ECs4191   TCRTETOQM  613    0.0      Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 613 bits (1583), Expect = 0.0
Identities = 178/698 (25%), Positives = 304/698 (43%), Gaps = 81/698 (11%)

Query: 9 RYRNIGISAHIDAGKTTTTERILFYTGVNHKIGEVHDGAATMDWMEQEQERGITITSAAT 68
+ NIG+ AH+DAGKTT TE +L+ +G ++G V G D E++RGITI + T
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 69 TAFWSGMAKQYEPHRINIIDTPGHVDFTIEVERSMRVLDGAVMVYCAVGGVQPQSETVWR 128
+ W ++NIIDTPGH+DF EV RS+ VLDGA+++ A GVQ Q+ ++
Sbjct: 62 SFQWEN-------TKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFH 114

Query: 129 QANKYKVPRIAFVNKMDRMGANFLKVVNQIKTRLGANPVPLQLAIGAEEHFTGVVDLVKM 188
K +P I F+NK+D+ G + V IK +L A V Q V M
Sbjct: 115 ALRKMGIPTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQ----------KVELYPNM 164

Query: 189 KAINWNDADQGVTFEYEDIPADMVELANEWHQNLIESAAEASEELMEKYLGGEELTEAEI 248
N+ +++Q ++ E +++L+EKY+ G+ L E+
Sbjct: 165 CVTNFTESEQ------------------------WDTVIEGNDDLLEKYMSGKSLEALEL 200

Query: 249 KGALRQRVLNNEIILVTCGSAFKNKGVQAMLDAVIDYLPSPVDVPAINGILDDGKDTPAE 308
+ R N + V GSA N G+ +++ + + S
Sbjct: 201 EQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVITNKFYSSTH----------------- 243

Query: 309 RHASDDEPFSALAFKIATDPFVGNLTFFRVYSGVVNSGDTVLNSVKAARERFGRIVQMHA 368
FKI L + R+YSGV++ D+V S K + + +
Sbjct: 244 ---RGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEK-EKIKITEMYTSIN 299

Query: 369 NKREEIKEVRAGDIAAAIG----LKDVTTGDTLCDPDAPIILERMEFPEPVISIAVEPKT 424
+ +I + +G+I L V GDT P ER+E P P++ VEP
Sbjct: 300 GELCKIDKAYSGEIVILQNEFLKLNSV-LGDTKLLPQR----ERIENPLPLLQTTVEPSK 354

Query: 425 KADQEKMGLALGRLAKEDPSFRVWTDEESNQTIIAGMGELHLDIIVDRMKREFNVEANVG 484
+E + AL ++ DP R + D +++ I++ +G++ +++ ++ +++VE +
Sbjct: 355 PQQREMLLDALLEISDSDPLLRYYVDSATHEIILSFLGKVQMEVTCALLQEKYHVEIEIK 414

Query: 485 KPQVAYRETIRQKVTDVEGKHAKQSGGRGQYGHVVIDMYPLEPGSNPKGYEFINDIKGGV 544
+P V Y E +K E + + + + + PL GS G ++ + + G
Sbjct: 415 EPTVIYMERPLKK---AEYTIHIEVPPNPFWASIGLSVSPLPLGS---GMQYESSVSLGY 468

Query: 545 IPGEYIPAVDKGIQEQLKAGPLAGYPVVDMGIRLHFGSYHDVDSSELAFKLAASIAFKEG 604
+ + AV +GI+ + G L G+ V D I +G Y+ S+ F++ A I ++
Sbjct: 469 LNQSFQNAVMEGIRYGCEQG-LYGWNVTDCKICFKYGLYYSPVSTPADFRMLAPIVLEQV 527

Query: 605 FKKAKPVLLEPIMKVEVETPEENTGDVIGDLSRRRGMLKGQESEVTGVKIHAEVPLSEMF 664
KKA LLEP + ++ P+E D + + + + V + E+P +
Sbjct: 528 LKKAGTELLEPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQLKNNEVILSGEIPARCIQ 587

Query: 665 GYATQLRSLTKGRASYTMEFLKYDEAPSNVAQAVIEAR 702
Y + L T GR+ E Y + V + R
Sbjct: 588 EYRSDLTFFTNGRSVCLTELKGYHVT---TGEPVCQPR 622


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
ECs4190   TCRTETOQM  80     3e-18    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 79.5 bits (196), Expect = 3e-18
Identities = 57/198 (28%), Positives = 87/198 (43%), Gaps = 13/198 (6%)

Query: 13 VNVGTIGHVDHGKTTLTAAI------TTVLAKTYGGAARAFDQIDNAPEEKARGITINTS 66
+N+G + HVD GKTTLT ++ T L G R DN E+ RGITI T
Sbjct: 4 INIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRT----DNTLLERQRGITIQTG 59

Query: 67 HVEYDTPTRHYAHVDCPGHADYVKNMITGAAQMDGAILVVAATDGPMPQTREHILLGRQV 126
+ +D PGH D++ + + +DGAIL+++A DG QTR R++
Sbjct: 60 ITSFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKM 119

Query: 127 GVPYIIVFLNKCDMVDDEELLELVEMEVRELLSQYDFPGDDTPIVRGSALKALEGDAEWE 186
G+P I F+NK D + L V +++E LS + + +W+
Sbjct: 120 GIP-TIFFINKIDQNGID--LSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWD 176

Query: 187 AKILELAGFLDSYIPEPE 204
I L+ Y+
Sbjct: 177 TVIEGNDDLLEKYMSGKS 194


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
ECs4189   HELNAPAPROT  38     3e-06    Helicobacter neutrophil-activating protein A family ...
		>HELNAPAPROT#Helicobacter neutrophil-activating protein A family signature.

Length = 153

Score = 38.3 bits (89), Expect = 3e-06
Identities = 19/103 (18%), Positives = 43/103 (41%), Gaps = 10/103 (9%)

Query: 44 EYHESIDEMKHADKYIERILFLEGIPN--LQDLGKL------GIGEDVEEMLQSDLRLEL 95
E ++ E D ER+L + G P +++ + G EM+Q+ +
Sbjct: 52 ELYDHAAE--TVDTIAERLLAIGGQPVATVKEYTEHASITDGGNETSASEMVQALVNDYK 109

Query: 96 EGAKDLREAIAYADSVHDYVSRDMMIEILADEEGHIDWLETEL 138
+ + + + I A+ D + D+ + ++ + E + L + L
Sbjct: 110 QISSESKFVIGLAEENQDNATADLFVGLIEEVEKQVWMLSSYL 152


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs4188   PREPILNPTASE  141    1e-44    Type IV prepilin cysteine protease (C20) family sig...
		>PREPILNPTASE#Type IV prepilin cysteine protease (C20) family signature.

Length = 290

Score = 141 bits (358), Expect = 1e-44
Identities = 65/142 (45%), Positives = 84/142 (59%), Gaps = 2/142 (1%)

Query: 4 TLPFLILYACLSALLFFWDAKHGLLPDRFTCPLLWSGLLFYQVCHPDGLADALWGAIIGY 63
TL L+L L AL F D LLPD+ T PLLW GLLF + L DA+ GA+ GY
Sbjct: 134 TLAALLLTWVLVALTFI-DLDKMLLPDQLTLPLLWGGLLFNLLGGFVSLGDAVIGAMAGY 192

Query: 64 GTFAVIYWGYRILRHKEGLGYGDVKFLAALGAWHSWAFLPRLVFLAASFACGAVVIGLLM 123
+YW +++L KEG+GYGD K LAALGAW W LP +V L +S + IGL++
Sbjct: 193 LVLWSLYWAFKLLTGKEGMGYGDFKLLAALGAWLGWQALP-IVLLLSSLVGAFMGIGLIL 251

Query: 124 RGKESLKNPLPFGPFLAAAGFV 145
P+PFGP+LA AG++
Sbjct: 252 LRNHHQSKPIPFGPYLAIAGWI 273


91  ECs4140  ECs4133  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
ECs4140   -3      18       -3.127903  hypothetical protein
ECs4137   -3      16       -2.572724  transmembrane protein affects septum formation
ECs4136   -2      15       -2.251555  acrEF/envCD operon repressor
ECs4135   -2      13       -0.078938  transporter
ECs4134   -3      13       0.524522   methyltransferase
ECs4133   -3      16       1.611472   DNA-binding protein Fis
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits      Score  E-value  Comments
ECs4140   adhesinb  28     0.004    Adhesin B signature.
		>adhesinb#Adhesin B signature.

Length = 310

Score = 27.5 bits (61), Expect = 0.004
Identities = 14/68 (20%), Positives = 26/68 (38%), Gaps = 10/68 (14%)

Query: 1 MKR---LIPVALLTALLAGCAHDSPCVPVYDDQGRLVHTNTCMKGTTQDNWETAGAIAGG 57
MK+ L+ + L LA C+ + +V TN+ + T++ IAG
Sbjct: 1 MKKCRFLVLLLLAFVGLAACSSQKSSTETGSSKLNVVATNSIIADITKN-------IAGD 53

Query: 58 AAAVAGLT 65
+ +
Sbjct: 54 KINLHSIV 61


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
ECs4137   RTXTOXIND  43     1e-06    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 43.3 bits (102), Expect = 1e-06
Identities = 38/217 (17%), Positives = 70/217 (32%), Gaps = 38/217 (17%)

Query: 98 ATYQANYDSAKGELAKSEAAAAIAHLTVKRYVPLVGTKYISQQEYDQAIADA-RQADAAV 156
K +L + E+ A + Q + I D RQ +
Sbjct: 262 VEAVNELRVYKSQLEQIESEILSAKEEYQLV----------TQLFKNEILDKLRQTTDNI 311

Query: 157 IAAKATVESARINLAYTKVTAPISGRIGK-STVTEGALVTNGQTTELATVQQLDPIYVDV 215
+ + + AP+S ++ + TEG +VT +T + V + D + V
Sbjct: 312 GLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETL-MVIVPEDDTLEVTA 370

Query: 216 TQSSND--FMRLKQSVEQGNLHKENATSNVELVMENGQTYP-LKGTLQ--FSDVTVDEST 270
+ D F+ + Q+ +++ Y L G ++ D D+
Sbjct: 371 LVQNKDIGFINVGQNAI------------IKVEAFPYTRYGYLVGKVKNINLDAIEDQRL 418

Query: 271 GSIT--LRAV------FPNPQHTLLPGMFVRARIDEG 299
G + + ++ N L GM V A I G
Sbjct: 419 GLVFNVIISIEENCLSTGNKNIPLSSGMAVTAEIKTG 455



Score = 34.4 bits (79), Expect = 7e-04
Identities = 22/127 (17%), Positives = 43/127 (33%), Gaps = 13/127 (10%)

Query: 46 TAPLEVKTELPGR-TNAYRIAEVRPQVSGIVLNRNFTEGSDVQAGQSLYQIDPATYQANY 104
+E+ G+ T++ R E++P + IV EG V+ G L ++ +A
Sbjct: 77 LGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEA-- 134

Query: 105 DSAKGELAKSEAAAAIAHLTVKRYVPLVGTKYISQQEYDQAIADARQADAAVIAAKATVE 164
+ K++++ A L RY L E ++ +
Sbjct: 135 -----DTLKTQSSLLQARLEQTRYQIL-----SRSIELNKLPELKLPDEPYFQNVSEEEV 184

Query: 165 SARINLA 171
+L
Sbjct: 185 LRLTSLI 191



Score = 29.0 bits (65), Expect = 0.031
Identities = 11/34 (32%), Positives = 15/34 (44%), Gaps = 1/34 (2%)

Query: 65 AEVRPQVSGIVLNRN-FTEGSDVQAGQSLYQIDP 97
+ +R VS V TEG V ++L I P
Sbjct: 328 SVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVP 361


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs4136   HTHTETR  127    6e-39    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 127 bits (321), Expect = 6e-39
Identities = 78/209 (37%), Positives = 122/209 (58%), Gaps = 3/209 (1%)

Query: 1 MAKRTKAEALKTRQELIETAIAQFAQHGVSKTTLNDIADAANVTRGAIYWHFENKTQLFN 60
MA++TK EA +TRQ +++ A+ F+Q GVS T+L +IA AA VTRGAIYWHF++K+ LF+
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60

Query: 61 EMW-LQQPSLRELIQEHLTAGLEHDPFQQLREKLIVGLQYIAKIPRQQALLKILYHKCEF 119
E+W L + ++ EL E A DP LRE LI L+ R++ L++I++HKCEF
Sbjct: 61 EIWELSESNIGELELE-YQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEF 119

Query: 120 NDEM-LAEGVIREKMGFNPQTLREVLQACQQQGCVANNLDLDVVMIIIDGAFSGIVQNWL 178
EM + + R + + + L+ C + + +L II+ G SG+++NWL
Sbjct: 120 VGEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWL 179

Query: 179 MNMAGYDLYKQAPALVDNVLRMFMPDENI 207
+DL K+A V +L M++ +
Sbjct: 180 FAPQSFDLKKEARDYVAILLEMYLLCPTL 208


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs4133   DNABINDNGFIS  157    3e-54    DNA-binding protein FIS signature.
		>DNABINDNGFIS#DNA-binding protein FIS signature.

Length = 98

Score = 157 bits (399), Expect = 3e-54
Identities = 98/98 (100%), Positives = 98/98 (100%)

Query: 1 MFEQRVNSDVLTVSTVNSQDQVTQKPLRDSVKQALKNYFAQLNGQDVNDLYELVLAEVEQ 60
MFEQRVNSDVLTVSTVNSQDQVTQKPLRDSVKQALKNYFAQLNGQDVNDLYELVLAEVEQ
Sbjct: 1 MFEQRVNSDVLTVSTVNSQDQVTQKPLRDSVKQALKNYFAQLNGQDVNDLYELVLAEVEQ 60

Query: 61 PLLDMVMQYTRGNQTRAALMMGINRGTLRKKLKKYGMN 98
PLLDMVMQYTRGNQTRAALMMGINRGTLRKKLKKYGMN
Sbjct: 61 PLLDMVMQYTRGNQTRAALMMGINRGTLRKKLKKYGMN 98


92  ECs4114  ECs4107  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
ECs4114   -2      13       -0.133352  p-hydroxybenzoic acid efflux subunit AaeA
ECs4113   -3      13       -0.313840  p-hydroxybenzoic acid efflux subunit AaeB
ECs4112   -2      17       -0.607352  hypothetical protein
ECs4111   -2      18       0.172994   hypothetical protein
ECs4110   -2      13       0.428086   arginine repressor ArgR
ECs4109   -2      14       0.631770   malate dehydrogenase
ECs4108   -2      13       0.578128   serine endoprotease
ECs4107   0       16       0.647736   serine endoprotease
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
ECs4114   RTXTOXIND  51     2e-09    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 51.4 bits (123), Expect = 2e-09
Identities = 28/147 (19%), Positives = 54/147 (36%), Gaps = 15/147 (10%)

Query: 99 LAQEKRQEAGRRNRLGVQ-AMSREEIDQANNVLQT-VLHQLAKAQAT-------RDLAKL 149
E R + ++ + ++EE + + +L +L + +
Sbjct: 264 AVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEE 323

Query: 150 DLERTVIRAPADGWVTNLNVYT-GEFITRGSTAVALVKQNSFY-VLAYMEETKLEGVRPG 207
+ +VIRAP V L V+T G +T T + +V ++ V A ++ + + G
Sbjct: 324 RQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVG 383

Query: 208 YRAEIT----PLGSNKVLKGTVDSVAA 230
A I P L G V ++
Sbjct: 384 QNAIIKVEAFPYTRYGYLVGKVKNINL 410



Score = 47.9 bits (114), Expect = 3e-08
Identities = 29/163 (17%), Positives = 58/163 (35%), Gaps = 17/163 (10%)

Query: 6 RKFSRTAITVVLVILAFIAIFNAWVYYTE----SPWTRDARFSADVVAIAPDVSGLITQV 61
SR V I+ F+ I + + S I P + ++ ++
Sbjct: 51 TPVSRRPRLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEI 110

Query: 62 NVH-NQLVKKGQVLFTIDQPR-------YQKALEEAQADVAYYQVLAQEKRQEAGRRNRL 113
V + V+KG VL + Q +L +A+ + YQ+L++ E + L
Sbjct: 111 IVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRS--IELNKLPEL 168

Query: 114 GVQAMSREEIDQANNVL---QTVLHQLAKAQATRDLAKLDLER 153
+ + VL + Q + Q + +L+L++
Sbjct: 169 KLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDK 211


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs4110   ARGREPRESSOR  169    4e-57    Bacterial arginine repressor signature.
		>ARGREPRESSOR#Bacterial arginine repressor signature.

Length = 149

Score = 169 bits (430), Expect = 4e-57
Identities = 44/141 (31%), Positives = 71/141 (50%), Gaps = 5/141 (3%)

Query: 15 KALLKEEKFSSQGEIVAALQEQGFDNINQSKVSRMLTKFGAVRTRNAKMEMVYCLPAELG 74
+ ++ + +Q E+V L++ G+ N+ Q+ VSR + + V+ Y LPA+
Sbjct: 11 REIITANEIETQDELVDILKKDGY-NVTQATVSRDIKELHLVKVPTNNGSYKYSLPADQR 69

Query: 75 VPTTSSPLKNLV---LDIDYNDAVVVIHTSPGAAQLIARLLDSLGKAEGILGTIAGDDTI 131
S ++L+ + ID ++V+ T PG AQ I L+D+L E I+GTI GDDTI
Sbjct: 70 FNPLSKLKRSLMDAFVKIDSASHLIVLKTMPGNAQAIGALMDNLDWEE-IMGTICGDDTI 128

Query: 132 FTTPANGFTVKDLYEAILELF 152
K + + ILEL
Sbjct: 129 LIICRTHDDTKVVQKKILELL 149


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs4109   DHBDHDRGNASE  28     0.045    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 28.1 bits (62), Expect = 0.045
Identities = 37/167 (22%), Positives = 61/167 (36%), Gaps = 27/167 (16%)

Query: 3 VAVLGAAGGIGQALALLLKTQLPSGSELSLYDIAPVTPGVAVDLSHIPTAVKIKGFSGED 62
+ GAA GIG+A+A L G+ ++ D P V S A + F +
Sbjct: 11 AFITGAAQGIGEAVARTL---ASQGAHIAAVDYNP-EKLEKVVSSLKAEARHAEAFPADV 66

Query: 63 ATPA------------LEGADVVLISAGVARK------PGMDRSDLFNVNAGIVKNLVQQ 104
A + D+++ AGV R + F+VN+ V N +
Sbjct: 67 RDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRS 126

Query: 105 VAKTCPK----ACIGIITNPVNTT-VAIAAEVLKKAGVYDKNKLFGV 146
V+K + + + +NP ++AA KA K G+
Sbjct: 127 VSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGL 173


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
ECs4108   V8PROTEASE  53     8e-10    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 52.7 bits (126), Expect = 8e-10
Identities = 31/160 (19%), Positives = 59/160 (36%), Gaps = 26/160 (16%)

Query: 77 RTLGSGVIMDQRGYIITNKHVINDADQIIVALQ------------DGRVFEALLVGSDSL 124
+ SGV++ + ++TNKHV++ AL+ +G +
Sbjct: 101 TFIASGVVVG-KDTLLTNKHVVDATHGDPHALKAFPSAINQDNYPNGGFTAEQITKYSGE 159

Query: 125 TDLAVLKI-------NATGGLPTIPINARRVPHIGDVVLAIGNPYNLGQTITQGIISATG 177
DLA++K + + ++ + + G P + T + G
Sbjct: 160 GDLAIVKFSPNEQNKHIGEVVKPATMSNNAETQVNQNITVTGYPGD-KPVATMW--ESKG 216

Query: 178 RIGLNPTGRQNFLQTDASINHGNSGGALVNSLGELMGINT 217
+I + +Q D S GNSG + N E++GI+
Sbjct: 217 KI---TYLKGEAMQYDLSTTGGNSGSPVFNEKNEVIGIHW 253


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
ECs4107   V8PROTEASE  72     6e-16    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 72.0 bits (176), Expect = 6e-16
Identities = 32/184 (17%), Positives = 63/184 (34%), Gaps = 38/184 (20%)

Query: 90 GLGSGVIINASKGYVLTNNHVINQAQKISIQL------------NDGREFDAKLIGSDDQ 137
+ SGV++ K +LTN HV++ L +G ++ +
Sbjct: 102 FIASGVVVG--KDTLLTNKHVVDATHGDPHALKAFPSAINQDNYPNGGFTAEQITKYSGE 159

Query: 138 SDIALLQIQN-------PSKLTQIAIADSDKLRVGDFAVAVGNPFGLGQTATSGIVSALG 190
D+A+++ + ++++ + +V G P V+ +
Sbjct: 160 GDLAIVKFSPNEQNKHIGEVVKPATMSNNAETQVNQNITVTGYPGDKP-------VATMW 212

Query: 191 RSGLNLEGLEN-FIQTDASINRGNSGGALLNLNGELIGINTAILAPGGGSVGIGFAIPSN 249
S + L+ +Q D S GNSG + N E+IGI+ G+
Sbjct: 213 ESKGKITYLKGEAMQYDLSTTGGNSGSPVFNEKNEVIGIHWG---------GVPNEFNGA 263

Query: 250 MART 253
+
Sbjct: 264 VFIN 267


93  ECs3738  ECs3716  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
ECs3738   2       38       -10.616106  lipoprotein
ECs3737   4       42       -13.022200  *hypothetical protein
ECs3736   5       44       -13.673776  hypothetical protein
ECs3735   3       44       -12.669787  hypothetical protein
ECs3734   3       43       -12.694400  EivF
ECs3733   3       43       -12.768154  EivG
ECs3732   3       42       -12.852139  EivE
ECs3731   4       44       -12.911889  EivA
ECs3730   3       44       -12.904103  ATP synthase SpaL
ECs3729   3       47       -15.442340  EivI
ECs3728   5       48       -16.408164  hypothetical protein
ECs3727   5       50       -16.626739  EivJ
ECs3726   5       50       -17.398535  surface presentation of antigens protein SpaO
ECs3725   4       49       -17.460393  surface presentation of antigens protein SpaP
ECs3724   5       51       -17.058324  EpaQ
ECs3721   6       51       -16.947572  surface presentation of antigens protein SpaS
ECs3720   5       54       -16.424061  transcriptional regulator
ECs3719   6       53       -16.544305  EprH
ECs3718   6       54       -16.375723  EprI
ECs3717   6       53       -16.429736  EprJ
ECs3716   6       52       -16.997828  EprK
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
ECs3738   RTXTOXIND  37     4e-05    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 37.1 bits (86), Expect = 4e-05
Identities = 18/82 (21%), Positives = 31/82 (37%), Gaps = 12/82 (14%)

Query: 160 AAGAGKVVYVGNQLRGYGNLIMIKHSEDYITAYAHNDTMLVNNGQSVKAGQKIATMGSTD 219
A GK+ + G IK E+ I ++V G+SV+ G + + +
Sbjct: 84 ATANGKLTHSGRSK-------EIKPIENSIV-----KEIIVKEGESVRKGDVLLKLTALG 131

Query: 220 AASVRLHFQIRYRATAIDPLRY 241
A + L Q ++ RY
Sbjct: 132 AEADTLKTQSSLLQARLEQTRY 153


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs3733   TYPE3OMGPROT  448    e-154    Type III secretion system outer membrane G protein ...
		>TYPE3OMGPROT#Type III secretion system outer membrane G protein family signature.

Length = 607

Score = 448 bits (1155), Expect = e-154
Identities = 158/536 (29%), Positives = 271/536 (50%), Gaps = 54/536 (10%)

Query: 34 YVANKENLRSFFETVSSYAGKPTIVSKLAMKKQISGNFDLTEPYALIERLSAQMGLIWYD 93
YVA E+LR + +VS K +SG F+ P ++ +++ L+WY
Sbjct: 38 YVAKGESLRDLLTDFGANYDATVVVSDKINDK-VSGQFEHDNPQDFLQHIASLYNLVWYY 96

Query: 94 DGKAIYIYDSSEMRNALINLRKVSTNEFNNFLKKSGLYNSRYEIKGD-GNGTFYVSGPPV 152
DG +YI+ +SE+ + LI L++ E L++SG++ R+ + D N YVSGPP
Sbjct: 97 DGNVLYIFKNSEVASRLIRLQESEAAELKQALQRSGIWEPRFGWRPDASNRLVYVSGPPR 156

Query: 153 YVDLVVNAAKLMEQNSD--GIEIGRNKVGIIHLVNTFVNDRTYELRGEKIVIPGMAKVLS 210
Y++LV A +EQ + + G + I L +DRT R +++ PG+A +L
Sbjct: 157 YLELVEQTAAALEQQTQIRSEKTGALAIEIFPLKYASASDRTIHYRDDEVAAPGVATILQ 216

Query: 211 TLLNNNIKQSTGVNVLSEISSRQQLKNVSRMPPFPGAEEDDDLQVEKIISTAGAPETDDI 270
+L++ ++ QQ+ ++ P A +
Sbjct: 217 RVLSD--------------ATIQQVTVDNQRIPQ-----------------AATRASAQA 245

Query: 271 QIIAYPDTNSLLVKGTVSQVDFIEKLVATLDIPKRHIELSLWIIDIDKTDLEQLGADWSG 330
++ A P N+++V+ + ++ ++L+ LD P IE++L I+DI+ L +LG DW
Sbjct: 246 RVEADPSLNAIIVRDSPERMPMYQRLIHALDKPSARIEVALSIVDINADQLTELGVDWRV 305

Query: 331 TIKIGSSLSASFNNSG----------SISTLDG---TQFIATIQALAQKRRAAVVARPVV 377
I+ G++ +G S +D +A + L + A VV+RP +
Sbjct: 306 GIRTGNNHQVVIKTTGDQSNIASNGALGSLVDARGLDYLLARVNLLENEGSAQVVSRPTL 365

Query: 378 LTQENIPAIFDNNRTFYTKLVGERTAELDEVTYGTMISVLPRFAARN---QIELLLNIED 434
LTQEN A+ D++ T+Y K+ G+ AEL +TYGTM+ + PR + +I L L+IED
Sbjct: 366 LTQENAQAVIDHSETYYVKVTGKEVAELKGITYGTMLRMTPRVLTQGDKSEISLNLHIED 425

Query: 435 GNEINSDKTNVDDLPQVGRTLISTIARVPQGKSLLIGGYTRDTNTYESRKIPILGSIPFI 494
GN+ + + ++ +P + RT++ T+ARV G+SL+IGG RD + K+P+LG IP+I
Sbjct: 426 GNQ-KPNSSGIEGIPTISRTVVDTVARVGHGQSLIIGGIYRDELSVALSKVPLLGDIPYI 484

Query: 495 GKLFGYEGTNANNIVRVFLIEPREIDERMMNNANEAAVDARAITQQMAKNKEINDE 550
G LF + VR+F+IEPR IDE + ++ A + + + + EI+++
Sbjct: 485 GALFRRKSELTRRTVRLFIIEPRIIDEGIAHHL--ALGNGQDLRTGILTVDEISNQ 538


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
ECs3732   INVEPROTEIN  240    2e-78    Salmonella/Shigella invasion protein E (InvE) signat...
		>INVEPROTEIN#Salmonella/Shigella invasion protein E (InvE) signature.

Length = 372

Score = 240 bits (613), Expect = 2e-78
Identities = 128/321 (39%), Positives = 195/321 (60%)

Query: 14 AREVSRLEDIITEDNEDIEAEMPKMRDDPAGKEARFLQATDEMSAALTQFMKKKIYEEQL 73
+R+ S + D + E + P + +F+Q+TDEMSAAL QF ++ YE++
Sbjct: 16 SRQASHQDATQHTDAQQAEIQQAAEDSSPGAEVQKFVQSTDEMSAALAQFRNRRDYEKKS 75

Query: 74 ANFLDGEEYVLEDQPIEKTDKVMEALKAATTHDYEVYSFAKKLFPDESDLVVVLRAILRK 133
+N + E VLED+ + K ++++ + + A+ LFPD SDLV+VLR +LR+
Sbjct: 76 SNLSNSFERVLEDEALPKAKQILKLISVHGGALEDFLRQARSLFPDPSDLVLVLRELLRR 135

Query: 134 KQISENVRLNAEALLRKVNQETTKKFINSGINSALKAKLFGQALSLNPKLLRASYRQFLM 193
K + E VR E+LL+ V ++T K + +GIN ALKA+LFG+ LSL P LLRASYRQF+
Sbjct: 136 KDLEEIVRKKLESLLKHVEEQTDPKTLKAGINCALKARLFGKTLSLKPGLLRASYRQFIQ 195

Query: 194 AEDDAVDTYVEWIGSYGYQNRMLVTKFIKETLFSDINALDASCSSLEFGMFLNKLSQLLS 253
+E V+ Y +WI SYGYQ R++V FI+ +L +DI+A DASCS LEFG L +L+QL
Sbjct: 196 SESHEVEIYSDWIASYGYQRRLVVLDFIEGSLLTDIDANDASCSRLEFGQLLRRLTQLKM 255

Query: 254 LQSAEALFLKTLMNNPIIKKFISAEDYWIFFLISLIKFPETAEELLNNALVTLPADANYK 313
L+SA+ LF+ TL++ K F + E W+ ++SL++ P + LL + + ++K
Sbjct: 256 LRSADLLFVSTLLSYSFTKAFNAEESSWLLLMLSLLQQPHEVDSLLADIIGLNALLLSHK 315

Query: 314 DKTLLLKAIYSGCTNLPFSLF 334
+ L+ Y C +P SLF
Sbjct: 316 EHASFLQIFYQVCKAIPSSLF 336


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs3731   VACCYTOTOXIN  31     0.019    Helicobacter pylori vacuolating cytotoxin signature.
		>VACCYTOTOXIN#Helicobacter pylori vacuolating cytotoxin signature.

Length = 1291

Score = 31.2 bits (70), Expect = 0.019
Identities = 18/60 (30%), Positives = 31/60 (51%), Gaps = 3/60 (5%)

Query: 597 EIEDRIRDGVRPTAGGTFLNLDASEAEMILDNFKLAL---SGINIPIKDIILLGSVDIRR 653
EI +R+ G A T L L ASE +N +++L + +N+ + L+G+V + R
Sbjct: 202 EINNRVGSGAGRKASSTVLTLQASEGITSRENAEISLYDGATLNLASNSVKLMGNVWMGR 261


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs3729   SSPAMPROTEIN  35     2e-05    Salmonella surface presentation of antigen gene typ...
		>SSPAMPROTEIN#Salmonella surface presentation of antigen gene type M signature.

Length = 147

Score = 34.7 bits (79), Expect = 2e-05
Identities = 31/101 (30%), Positives = 56/101 (55%)

Query: 2 QLKNLQSLLDMKELLGEVVFRQDIFYSLRKVTVIQQQIAEINLEKQKIAERRKILNKEIV 61
Q+ L+ LLD + R++I+ LRK +++++QI ++ L+ +I E+R L K+
Sbjct: 45 QIAGLKLLLDTLRAENRQLSREEIYALLRKQSIVRRQIKDLELQIIQIQEKRSELEKKRE 104

Query: 62 QQQAQRKHWWLKGEKYDRLKKRIKKQLLNQMLYQDELEQEE 102
+ Q + K+W K Y R R K+ + + + Q+E E EE
Sbjct: 105 EFQEKSKYWLRKEGNYQRWIIRQKRLYIQREIQQEEAESEE 145


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs3727   SSPANPROTEIN  49     2e-09    Salmonella invasion protein InvJ signature.
		>SSPANPROTEIN#Salmonella invasion protein InvJ signature.

Length = 336

Score = 49.4 bits (117), Expect = 2e-09
Identities = 31/75 (41%), Positives = 44/75 (58%), Gaps = 3/75 (4%)

Query: 121 ENELTYQFQRWGQNHTVRILESSEG-IRLKPSDTLVSDRLHEAQHNDVTAQRWVLTEQDE 179
++ LTY+FQRWG +++V I G L PS+T V RLH+ N QRW LT +D+
Sbjct: 260 DSSLTYRFQRWGNDYSVNIQARQAGEFSLIPSNTQVEHRLHDQWQNG-NPQRWHLT-RDD 317

Query: 180 RQGQRHQPHEEQENE 194
+Q + Q H +Q E
Sbjct: 318 QQNPQQQQHRQQSGE 332


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs3726   TYPE3OMOPROT  156    1e-47    Type III secretion system outer membrane O protein ...
		>TYPE3OMOPROT#Type III secretion system outer membrane O protein family signature.

Length = 303

Score = 156 bits (395), Expect = 1e-47
Identities = 91/292 (31%), Positives = 136/292 (46%), Gaps = 13/292 (4%)

Query: 35 KENGEDVALLMPEFSAKWLPIAEESGSWSGWVLLREIFPLISAELAGMALMPETERLIGE 94
+ +G + L P W+ +++ WS W+ + +S LAG A+ E L+
Sbjct: 23 QRHGREATLEYPTRQGMWVRLSDAEKRWSAWIKPGDWLEHVSPALAGAAVSAGAEHLVVP 82

Query: 95 WLSLSSSPLNLKYPELKYNRLCVGKVFDGVLSPAQPLIRIWTGELNLWLDKVTVCQYENA 154
WL+ + P L P L RLCV G P L+ I + LW + +
Sbjct: 83 WLAATERPFELPVPHLSCRRLCVENPVPGSALPEGKLLHIMSDRGGLWFEHLPELPAVGG 142

Query: 155 PTLDKKSLYWPIHFVIGFSKTCYRTIVDIEVGDVLLISNNMAYAVIYNTKICDLIYPEEL 214
K L WP+ FVIG S T + I +GDVLLI + A +Y
Sbjct: 143 GRP--KMLRWPLRFVIGSSDTQRSLLGRIGIGDVLLIRTSRA-----------EVYCYAK 189

Query: 215 KMADHFQYEEDFETDDFDIKKSESEIYDENDEQMINSFEELPVKIEFVLGKKIMNLYEID 274
K+ + E + DI+ E E + + +LPVK+EFVL +K + L E++
Sbjct: 190 KLGHFNRVEGGIIVETLDIQHIEEENNTTETAETLPGLNQLPVKLEFVLYRKNVTLAELE 249

Query: 275 ELCAKRIISLLPESEKNIEIRVNGALTGYGELVEVDDKLGVEIHSWLSGHNN 326
+ ++++SL +E N+EI NG L G GELV+++D LGVEIH WLS N
Sbjct: 250 AMGQQQLLSLPTNAELNVEIMANGVLLGNGELVQMNDTLGVEIHEWLSESGN 301


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs3725   TYPE3IMPPROT  226    2e-77    Type III secretion system inner membrane P protein ...
		>TYPE3IMPPROT#Type III secretion system inner membrane P protein family signature.

Length = 224

Score = 226 bits (577), Expect = 2e-77
Identities = 151/223 (67%), Positives = 181/223 (81%), Gaps = 5/223 (2%)

Query: 1 MSNSISLIAILSLFTLLPFIIASGTCFIKFSIVFVIVRNALGLQQVPSNMTLNGVALLLS 60
M N ISLIA+L+ TLLPFIIASGTCF+KFSIVFV+VRNALGLQQ+PSNMTLNGVALLLS
Sbjct: 1 MGNDISLIALLAFSTLLPFIIASGTCFVKFSIVFVMVRNALGLQQIPSNMTLNGVALLLS 60

Query: 61 MFVMMPVGKEIYYNSQNENLSFNNVASVVNFVETGMSGYKSYLIKYSEPELVSFFEKIQK 120
MFVM P+ + Y ++E+++FN+++S+ V+ G+ GY+ YLIKYS+ ELV FFE Q
Sbjct: 61 MFVMWPIMHDAYVYFEDEDVTFNDISSLSKHVDEGLDGYRDYLIKYSDRELVQFFENAQL 120

Query: 121 VNSSEDNEEIIDDD-----NISIFSLLPAYALSEIKSAFIIGFYIYLPFVVVDLVISSVL 175
+ E + D SIF+LLPAYALSEIKSAF IGFY+YLPFVVVDLV+SSVL
Sbjct: 121 KRQYGEETETVKRDKDEIEKPSIFALLPAYALSEIKSAFKIGFYLYLPFVVVDLVVSSVL 180

Query: 176 LTLGMMMMSPVTISTPIKLILFVAMDGWTMLSKGLILQYFDLS 218
L LGMMMMSPVTISTPIKL+LFVA+DGWT+LSKGLILQY D++
Sbjct: 181 LALGMMMMSPVTISTPIKLVLFVALDGWTLLSKGLILQYMDIA 223


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs3724   TYPE3IMQPROT  79     4e-23    Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.

Length = 86

Score = 78.7 bits (194), Expect = 4e-23
Identities = 59/86 (68%), Positives = 73/86 (84%)

Query: 1 MDDIVFAGNRALYLILVMSAGPIAVATFVGLLVGLFQTVTQLQEQTLPFGVKLLCVSICF 60
MDD+VFAGN+ALYL+L++S P VAT +GLLVGLFQTVTQLQEQTLPFG+KLL V +C
Sbjct: 1 MDDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCL 60

Query: 61 FLMSGWYGEKLYSFGIEMLNLAFARG 86
FL+SGWYGE L S+G +++ LA A+G
Sbjct: 61 FLLSGWYGEVLLSYGRQVIFLALAKG 86


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs3721   TYPE3IMSPROT  310    e-106    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 310 bits (796), Expect = e-106
Identities = 112/340 (32%), Positives = 185/340 (54%), Gaps = 5/340 (1%)

Query: 2 ANKTEKPTQKKLQDASKKGQILKSRDLTVSVIMLVG--TLYLGYVFDVHHIMSILEYILD 59
KTE+PT KK++DA KKGQ+ KS+++ + +++ L + H ++ +
Sbjct: 3 GEKTEQPTPKKIRDARKKGQVAKSKEVVSTALIVALSAMLMGLSDYYFEHFSKLMLIPAE 62

Query: 60 HNAKPDIWD---YFKAMGIGWLKTIIPFLLVCMFTTILVSWFQSKMQLATEAVKLKFDSL 116
+ P + + + P L V I Q ++ EA+K +
Sbjct: 63 QSYLPFSQALSYVVDNVLLEFFYLCFPLLTVAALMAIASHVVQYGFLISGEAIKPDIKKI 122

Query: 117 NPVNGLKRIFGLKTVKEFVKAILYIIFFALEIKVFWSNHKSLLFKTLDGDIISLLSDWGE 176
NP+ G KRIF +K++ EF+K+IL ++ ++ I + + L + I + G+
Sbjct: 123 NPIEGAKRIFSIKSLVEFLKSILKVVLLSILIWIIIKGNLVTLLQLPTCGIECITPLLGQ 182

Query: 177 MLFLLILYCLGSMIIVLIFDFIAEYFLFMKDMKMDKQEVKREYKEQEGNPEIKSKRRERH 236
+L L++ C +++ I D+ EY+ ++K++KM K E+KREYKE EG+PEIKSKRR+ H
Sbjct: 183 ILRQLMVICTVGFVVISIADYAFEYYQYIKELKMSKDEIKREYKEMEGSPEIKSKRRQFH 242

Query: 237 QEILSEQLKSDVSNSRLMIANPTHIAIGIYFKPHLSPIPLISVRETNEVALAVRKYAKEI 296
QEI S ++ +V S +++ANPTHIAIGI +K +P+PL++ + T+ VRK A+E
Sbjct: 243 QEIQSRNMRENVKRSSVVVANPTHIAIGILYKRGETPLPLVTFKYTDAQVQTVRKIAEEE 302

Query: 297 GIPIITDKKLARKIYATHRRYDYVSFENIDEILRLLLWLE 336
G+PI+ LAR +Y Y+ E I+ +L WLE
Sbjct: 303 GVPILQRIPLARALYWDALVDHYIPAEQIEATAEVLRWLE 342


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs3716   FLGMRINGFLIF  35     3e-04    Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 34.6 bits (79), Expect = 3e-04
Identities = 22/126 (17%), Positives = 49/126 (38%), Gaps = 5/126 (3%)

Query: 4 ISLLLFILLLCGCKQQE-LLNHLDQQQANDVLAVLQRHNINAEKKDQGKTGFSIYVEPTD 62
+++++ ++L L ++L Q ++A L + NI + I V
Sbjct: 35 VAIVVAMVLWAKTPDYRTLFSNLSDQDGGAIVAQLTQMNIPYRFANGSGA---IEVPADK 91

Query: 63 FASAVDWLKIYNLPGKPDIQISQMFPADALVSSPRAEKARLYSAIEQRLEQSLKIMDGIV 122
L LP + + + S +E+ A+E L ++++ + +
Sbjct: 92 VHELRLRLAQQGLPKGGAVGFE-LLDQEKFGISQFSEQVNYQRALEGELARTIETLGPVK 150

Query: 123 SSRVHV 128
S+RVH+
Sbjct: 151 SARVHL 156


94  ECs3549  ECs3543  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
ECs3549   -2      13       1.986495   S-ribosylhomocysteinase
ECs3548   -2      13       2.480982   multidrug resistance protein B
ECs3547   -1      14       1.563442   multidrug resistance secretion protein
ECs3546   -2      14       2.645639   transcriptional repressor MprA
ECs3545   -1      14       1.887819   hypothetical protein
ECs3544   -2      13       1.024481   hypothetical protein
ECs3543   -2      12       0.838325   transporter
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
ECs3549   LUXSPROTEIN  292    e-105    Bacterial autoinducer-2 (AI-2) production protein Lu...
		>LUXSPROTEIN#Bacterial autoinducer-2 (AI-2) production protein LuxS signature.

Length = 171

Score = 292 bits (748), Expect = e-105
Identities = 130/170 (76%), Positives = 147/170 (86%)

Query: 2 PLLDSFTVDHTRMEAPAVRVAKTMNTPHGDAITVFDLRFCVPNKEVMPERGIHTLEHLFA 61
PLLDSFTVDHTRM APAVRVAKTM TP GD ITVFDLRF PNK+++ E+GIHTLEHL+A
Sbjct: 1 PLLDSFTVDHTRMNAPAVRVAKTMQTPKGDTITVFDLRFTAPNKDILSEKGIHTLEHLYA 60

Query: 62 GFMRNHLNGNGVEIIDISPMGCRTGFYMSLIGTPDEQRVADVWKAAMEDVLKVQDQNQIP 121
GFMRNHLNG+ VEIIDISPMGCRTGFYMSLIGTP EQ+VAD W AAMEDVLKV++QN+IP
Sbjct: 61 GFMRNHLNGDSVEIIDISPMGCRTGFYMSLIGTPSEQQVADAWIAAMEDVLKVENQNKIP 120

Query: 122 ELNVYQCGTYQMHSLQEAQDIARSILERDVRINSNEELALPKEKLQELHI 171
ELN YQCGT MHSL EA+ IA++ILE V +N N+ELALP+ L+EL I
Sbjct: 121 ELNEYQCGTAAMHSLDEAKQIAKNILEVGVAVNKNDELALPESMLRELRI 170


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs3548   TCRTETB  132    9e-36    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 132 bits (333), Expect = 9e-36
Identities = 97/405 (23%), Positives = 169/405 (41%), Gaps = 23/405 (5%)

Query: 17 IALSLATFMQVLDSTIANVAIPTIAGNLGSSLSQGTWVITSFGVANAISIPLTGWLAKRV 76
I L + +F VL+ + NV++P IA + + WV T+F + +I + G L+ ++
Sbjct: 17 IWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQL 76

Query: 77 GEVKLFLWSTIAFAIASWACGVS-SSLNMLIFFRVIQGIVAGPLIPLSQSLLLNNYPPAK 135
G +L L+ I S V S ++LI R IQG A L ++ P
Sbjct: 77 GIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKEN 136

Query: 136 RSIALALWSMTVIVAPICGPILGGYISDNYHWGWIFFINVPIGVAVVLMTLQTLRGRETR 195
R A L V + GP +GG I+ HW + + +P+ + + L L +E R
Sbjct: 137 RGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWS--YLLLIPMITIITVPFLMKLLKKEVR 194

Query: 196 TERRRIDAVGLALLVIGIGSLQIMLDRGKELDWFSSQEIIILTVVAVVAICFLIVWELTD 255
+ D G+ L+ +GI + ML F++ I +V+V++ +
Sbjct: 195 I-KGHFDIKGIILMSVGI--VFFML--------FTTSYSISFLIVSVLSFLIFVKHIRKV 243

Query: 256 DNPIVDLSLFKSRNFTIGCLCISLAYMLYFGAIVLLPQLLQEVYGYTATWAGLASAPVGI 315
+P VD L K+ F IG LC + + G + ++P ++++V+ + G G
Sbjct: 244 TDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGT 303

Query: 316 IPVILS-PIIGRFAHKLDMRRLVTFSFIMYAVCFYWRAYTFEPGMDFGASAWPQFIQGF- 373
+ VI+ I G + ++ +V F ++ S + I F
Sbjct: 304 MSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFL-----LETTSWFMTIIIVFV 358

Query: 374 --AVACFFMPLTTITLSGLPPERLAAASSLSNFTRTLAGSIGTSI 416
++ ++TI S L + A SL NFT L+ G +I
Sbjct: 359 LGGLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAI 403


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
ECs3547   RTXTOXIND  74     2e-16    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 74.1 bits (182), Expect = 2e-16
Identities = 64/412 (15%), Positives = 117/412 (28%), Gaps = 97/412 (23%)

Query: 25 LLLTLLFIIIAVAIGIYWFLVLRHFEETDDA----YVAGNQIQIMSQVSGSVTKVWADNT 80
L FI+ + I VL E A +G +I + V ++
Sbjct: 57 PRLVAYFIMGFLVIAFILS-VLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEG 115

Query: 81 DFVKEGDVLVTLDPTD-------------------------------------------- 96
+ V++GDVL+ L
Sbjct: 116 ESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPY 175

Query: 97 ---ARQAFEKAKTALASSVRQTHQQMINSKQ------------LQANIEVQKIALAKAQS 141
+ T+L T Q K+ + A I + +S
Sbjct: 176 FQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKS 235

Query: 142 DYNRRVPLGNANLIGREELQHARDAVTSAQAQLDVAIQQYNANQAMILGTKLEDQPAVQQ 201
+ L + I + + + A +L V Q ++ IL K E Q Q
Sbjct: 236 RLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQL 295

Query: 202 AATEVRN------------------AWLALERTRIVSPMTGYVSRRAVQ-PGAQISPTTP 242
E+ + + + I +P++ V + V G ++
Sbjct: 296 FKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAET 355

Query: 243 LMAVVPA-TNMWVDANFKETQIANMRIGQPVTITTDIYGDDVKY---TGKVVGLDMGTGS 298
LM +VP + V A + I + +GQ I + + +Y GKV + +
Sbjct: 356 LMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAF-PYTRYGYLVGKVKNI-----N 409

Query: 299 AFSLLPAQNATGNWIKVVQRLPVRIELDQKQLEQYPLRIGLSTLVSVNTTNR 350
++ G V+ + + PL G++ + T R
Sbjct: 410 LDAIE--DQRLGLVFNVIISIEENCLST--GNKNIPLSSGMAVTAEIKTGMR 457


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs3546   PF05272  28     0.018    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 28.5 bits (63), Expect = 0.018
Identities = 23/94 (24%), Positives = 36/94 (38%), Gaps = 12/94 (12%)

Query: 23 PYQEILLTRLCMHMQSKLLENRNKMLKAQGINETLFMALITLESQENHSIQPSELSCALG 82
P QE+ L + + L R A+G + + T + ++L ALG
Sbjct: 756 PEQELRLVETGVQGRLWALLTREGAPAAEGAAQKGYSVNTTFVTI-------ADLVQALG 808

Query: 83 -----SSRTNATRIADELEKRGWIERRESDNDRR 111
SS ++ D L + GW RE+ RR
Sbjct: 809 ADPGKSSPMLEGQVRDWLNENGWEYLRETSGQRR 842


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs3543   TCRTETB  45     4e-07    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 44.9 bits (106), Expect = 4e-07
Identities = 32/165 (19%), Positives = 70/165 (42%), Gaps = 2/165 (1%)

Query: 34 LDTIARNFSLSASSAGFIVTAAQLGYAAGLLFLVPLGDMFERRRLIVSMTLLAAGGMLIT 93
L IA +F+ +S ++ TA L ++ G L D +RL++ ++ G +I
Sbjct: 37 LPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIG 96

Query: 94 ASSQSLA-MMILGTALTGLFSVVAQILVPLA-ATLASPDKRGKVVGTIMSGLLLGILLAR 151
S ++I+ + G + LV + A + RGK G I S + +G +
Sbjct: 97 FVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGP 156

Query: 152 TVAGLLANLGGWRTVFWVASVLMALMALALWRGLPQMKSETHLNY 196
+ G++A+ W + + + + + + +++ + H +
Sbjct: 157 AIGGMIAHYIHWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDI 201


95  ECs3249  ECs3246  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
ECs3249   -1      30       -7.885994  EvgA family transcriptional regulator
ECs3248   -1      26       -4.944211  EvgA family transcriptional regulator
ECs3247   -1      26       -3.698470  multidrug resistance protein K
ECs3246   -1      24       -3.407288  multidrug resistance protein Y
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
ECs3249   HTHFIS  76     2e-16    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 76.4 bits (188), Expect = 2e-16
Identities = 30/105 (28%), Positives = 51/105 (48%)

Query: 960 SILIADDHPTNRLLLKRQLNLLGYDVDEATDGVQALHKVSMQHYDLLITDVNMPNVDGFE 1019
+IL+ADD R +L + L+ GYDV ++ ++ DL++TDV MP+ + F+
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 1020 LTRKLREQNSSLPIWGLTANAQANEREKGLNCGMNLCLFKPLTLD 1064
L ++++ LP+ ++A K G L KP L
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLT 109


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
ECs3248   HTHFIS  49     3e-09    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 49.1 bits (117), Expect = 3e-09
Identities = 22/148 (14%), Positives = 53/148 (35%), Gaps = 31/148 (20%)

Query: 4 IIIDDHPLAIAAIRNLLIKNDIEILAELTEGGSAVQRVETLKPDIVIIDVDIPGVNGIQV 63
++ DD + L + ++ + + + + D+V+ DV +P N +
Sbjct: 7 LVADDDAAIRTVLNQALSRAGYDVRIT-SNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 64 LETLRKRQYSGIIIIVSAKNDHFYGKHCADAGANGFVSKKEGMNNIIAAIEAAKNGYCYF 123
L ++K + ++++SA+N + AI+A++ G +
Sbjct: 66 LPRIKKARPDLPVLVMSAQNT------------------------FMTAIKASEKGAYDY 101

Query: 124 ---PFSLNRFVGSLTSDQQKLDSLSKQE 148
PF L + + L ++
Sbjct: 102 LPKPFDLTE---LIGIIGRALAEPKRRP 126


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
ECs3247   RTXTOXIND  79     4e-18    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 78.7 bits (194), Expect = 4e-18
Identities = 63/412 (15%), Positives = 122/412 (29%), Gaps = 96/412 (23%)

Query: 13 RRKYFSLLVIVLFIAFSGAYAYWSMELEDMISTDDAYVT-GNADPISAQVSGSVTVVNHK 71
RR I+ F+ + + ++E + + + G + I + V + K
Sbjct: 55 RRPRLVAYFIMGFLVIAFILSVLG-QVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVK 113

Query: 72 DTNYVRQGDILVSLDKTDATIALNKA---------------------------------- 97
+ VR+GD+L+ L A K
Sbjct: 114 EGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDE 173

Query: 98 ------------------KNNLANIVRQTNKLYLQDKQYSAEVASARIQ---YQQSLEDY 136
K + Q + L + AE + + Y+
Sbjct: 174 PYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVE 233

Query: 137 NRRV----PLAKQGVISKE----------TLEHTKDTLISSKAALNAAIQAYKANKALVM 182
R+ L + I+K + S + + I + K LV
Sbjct: 234 KSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVT 293

Query: 183 N-------TPLNR-QPQVVEAADATKEAWLALKRTDIKSPVTGYIAQRSVQ-VGETVSPG 233
L + + + + + I++PV+ + Q V G V+
Sbjct: 294 QLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTA 353

Query: 234 QSLMAVVPARQ-MWVNANFKETQLTDVRIGQSVNIISDLYGENVVFHGRVTGINMGTGNA 292
++LM +VP + V A + + + +GQ+ I + F G +G
Sbjct: 354 ETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVE------AFPYTRYGYLVGK--- 404

Query: 293 FSLLPAQNATGNWIKIVQRVPVEVSLDPKELMEH----PLRIGLSMTATIDT 340
+ + +V V +S++ L PL G+++TA I T
Sbjct: 405 VKNINLDAIEDQRLGLVFNVI--ISIEENCLSTGNKNIPLSSGMAVTAEIKT 454


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs3246   TCRTETB  120    1e-31    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 120 bits (302), Expect = 1e-31
Identities = 97/408 (23%), Positives = 169/408 (41%), Gaps = 25/408 (6%)

Query: 19 VTIALSLATFMQMLDSTISNVAIPTISGFLGASTDEGTWVITSFGVANAIAIPVTGRLAQ 78
+ I L + +F +L+ + NV++P I+ WV T+F + +I V G+L+
Sbjct: 15 ILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSD 74

Query: 79 RIGELRLFLLSVTFFSLSSLMCSLSIN-LDVLIFFRVVQGLMAGPLIPLSQSLLLRNYPP 137
++G RL L + S++ + + +LI R +QG A L ++ R P
Sbjct: 75 QLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPK 134

Query: 138 EKRTFALALWSMTVIIAPICGPILGGYICDNFSWGWIFLINVPMGIIVLTLCLTLLKGRE 197
E R A L V + GP +GG I W +L+ +PM I+ L L +E
Sbjct: 135 ENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWS--YLLLIPMITIITVPFLMKLLKKE 192

Query: 198 TETSPVKMNLPGLTLLVLGVGGLQIMLDKGRDLDWFNSSTIIILTVVSVIFLISLVIWES 257
++ G+ L+ +G+ + ML F +S I +VSV+ + V
Sbjct: 193 VRIKG-HFDIKGIILMSVGI--VFFML--------FTTSYSISFLIVSVLSFLIFVKHIR 241

Query: 258 TSENPILDLSLFKSRNFTIGIVSITCAYLFYSGAIVLMPQLLQETMGYNAIWAGLAYAPI 317
+P +D L K+ F IG++ + +G + ++P ++++ + G
Sbjct: 242 KVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFP 301

Query: 318 GIMPLLISPLIG-----RYGNKIDMRVLVTFSFLMYAVCYYWRSVTFMPTIDFTGIILPQ 372
G M ++I IG R G + + VTF +V + S T F II+
Sbjct: 302 GTMSVIIFGYIGGILVDRRGPLYVLNIGVTFL----SVSFLTASFLLETTSWFMTIIIVF 357

Query: 373 FFQGFAVACFFLPLTTISFSGLPDNKFANASSMSNFFRTLSGSVGTSL 420
G + ++TI S L + S+ NF LS G ++
Sbjct: 358 VLGGLSFTK--TVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAI 403


96  ECs3026  ECs3021  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
ECs3026   0       16       1.993112   tRNA-dihydrouridine synthase C
ECs3025   0       12       1.527465   multidrug resistance outer membrane protein
ECs3024   -2      15       1.443763   acetoin dehydrogenase
ECs3023   -3      14       1.291416   hypothetical protein
ECs3022   -3      14       1.861228   hypothetical protein
ECs3021   -2      15       2.093117   D-alanyl-D-alanine endopeptidase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs3026   SHAPEPROTEIN  29     0.030    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 28.6 bits (64), Expect = 0.030
Identities = 32/127 (25%), Positives = 53/127 (41%), Gaps = 5/127 (3%)

Query: 122 GAKAMREAVPAHLPVSVKVRLGWDSGEK-KFEIADAVQQAGATELVVHGRTKEQGY-RAE 179
G EA+ ++ + +G + E+ K EI A E+ V GR +G R
Sbjct: 190 GGDRFDEAIINYVRRNYGSLIGEATAERIKHEIGSAYPGDEVREIEVRGRNLAEGVPRGF 249

Query: 180 HIDWQAIGE-IRQRLNIPVIANGEIWDWQSAQECMAISGCDSVMIGRGALNIPNLSRVVK 238
++ I E +++ L V A + + IS V+ G GAL + NL R++
Sbjct: 250 TLNSNEILEALQEPLTGIVSAVMVALEQCPPELASDISERGMVLTGGGAL-LRNLDRLL- 307

Query: 239 YNEPRMP 245
E +P
Sbjct: 308 MEETGIP 314


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs3024   DHBDHDRGNASE  113    1e-32    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 113 bits (284), Expect = 1e-32
Identities = 71/253 (28%), Positives = 116/253 (45%), Gaps = 12/253 (4%)

Query: 3 QVAIITASDSGIGKECALLLAQQGFDIGITWHSDEEGAKDTAREVVSHGVRAEIVQLDLG 62
++A IT + GIG+ A LA QG I ++ E+ K + AE D+
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKA-EARHAEAFPADVR 67

Query: 63 NLPEGAQALEKLIQRLGRIDVLVNNAGAMTKVPFLDMAFDEWRKIFTVDVDGAFLCSQIA 122
+ + ++ + +G ID+LVN AG + ++ +EW F+V+ G F S+
Sbjct: 68 DSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSV 127

Query: 123 ARQMVKQGQGGRIINITSVHEHTPLPDASAYTAAKHALGGLTKAMALELVRHKILVNAVA 182
++ M+ + + G I+ + S P +AY ++K A TK + LEL + I N V+
Sbjct: 128 SKYMMDR-RSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVS 186

Query: 183 PGAIATPM-------NGMDDSDVKPDAEP---SIPLRRFGATHEIASLVAWLCSEGANYT 232
PG+ T M + +K E IPL++ +IA V +L S A +
Sbjct: 187 PGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGHI 246

Query: 233 TGQSLIVDGGFML 245
T +L VDGG L
Sbjct: 247 TMHNLCVDGGATL 259


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs3023   BCTERIALGSPF  29     0.019    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 28.6 bits (64), Expect = 0.019
Identities = 5/33 (15%), Positives = 16/33 (48%), Gaps = 2/33 (6%)

Query: 164 WLHNLDQHLKHW-VWLILVVVL-VVGVRWWLKR 194
L + ++ + W++L ++ + R L++
Sbjct: 215 VLMGMSDAVRTFGPWMLLALLAGFMAFRVMLRQ 247


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
ECs3021   BLACTAMASEA  44     3e-07    Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 44.0 bits (104), Expect = 3e-07
Identities = 43/195 (22%), Positives = 77/195 (39%), Gaps = 18/195 (9%)

Query: 4 MPKFRVSLFSLALMLAVPLAPQAVAKTAAATTASQPEIASGSAMI-VDLNTNKVIYSNHP 62
M R+ + SL + +PLA A + S+ +++ MI +DL + + + +
Sbjct: 1 MRYIRLCIISL--LATLPLAVHASPQPLEQIKLSESQLSGRVGMIEMDLASGRTLTAWRA 58

Query: 63 DLVRPIASISKLMTAMVVLDARLPLDEKLKVDISQTPEMKGVYSRV---RLNSEISRKDM 119
D P+ S K++ VL DE+L+ I + YS V L ++ ++
Sbjct: 59 DERFPMMSTFKVVLCGAVLARVDAGDEQLERKIHYRQQDLVDYSPVSEKHLADGMTVGEL 118

Query: 120 LLLALMSSENRAAASLAHHYPGGYKAFIKAMNAKAKSLGMNNTRFV--EPTGLS-----V 172
A+ S+N +AA+L GG + A + +G N TR E
Sbjct: 119 CAAAITMSDN-SAANLLLATVGG----PAGLTAFLRQIGDNVTRLDRWETELNEALPGDA 173

Query: 173 HNVSTARDLTKLLIA 187
+ +T + L
Sbjct: 174 RDTTTPASMAATLRK 188


97  ECs2979  ECs2973  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
ECs2979   2       35       -6.222759  hypothetical protein
ECs2978   2       36       -6.412057  hypothetical protein
ECs2977   1       26       -1.996983  hypothetical protein
ECs2976   2       28       -1.732840  hypothetical protein
ECs2975   2       28       -1.048665  antitermination protein
ECs2974   3       29       -1.027506  Shiga toxin I subunit A
ECs2973   3       30       2.083474   Shiga toxin I subunit B
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
ECs2979   60KDINNERMP  28     0.014    60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 28.0 bits (62), Expect = 0.014
Identities = 9/37 (24%), Positives = 16/37 (43%)

Query: 78 MTYMKAYQKAWKEHRDRYQQDMEKLESENMELRRKLG 114
M M+ Q + R+R D +++ E M L +
Sbjct: 380 MAKMRMLQPKIQAMRERLGDDKQRISQEMMALYKAEK 416


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
ECs2976   HTHFIS  27     0.004    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 27.1 bits (60), Expect = 0.004
Identities = 10/30 (33%), Positives = 17/30 (56%), Gaps = 4/30 (13%)

Query: 10 DMLVEAYE----NQTEVARILNCSRNTVRK 35
+++ A NQ + A +L +RNT+RK
Sbjct: 439 PLILAALTATRGNQIKAADLLGLNRNTLRK 468


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
ECs2974   SHIGARICIN  120    3e-34    Ribosome inactivating protein family signature.
		>SHIGARICIN#Ribosome inactivating protein family signature.

Length = 289

Score = 120 bits (303), Expect = 3e-34
Identities = 49/283 (17%), Positives = 112/283 (39%), Gaps = 40/283 (14%)

Query: 3 IIIFRVLTFFFVIFSVNVVAKE----FTLDFSTAKTYVDSLNVIRSAIGTPLQTISSGGT 58
+I F V + + + A E F L +T+ +Y ++ +R A+ +
Sbjct: 1 MIRFLVFSLLILTLFLTAPAVEGDVSFRLSGATSSSYGVFISNLRKALPYERKL-----Y 55

Query: 59 SLLMIDSGTGDNLFAVDVRGIDPEEGRFNNLRLIVERNNLYVTGFVNRTNNVFYRFADF- 117
+ ++ S + + + + + + ++ N+YV G+ + Y F +
Sbjct: 56 DIPLLRSTLPGSQRYALIHLTNYADE---TISVAIDVTNVYVMGYRA--GDTSYFFNEAS 110

Query: 118 ----SHVTFPGTTA-VTLSGDSSYTTLQRVAGISRTGMQINRHSLTTSYLDLMSHSGTSL 172
+ F VTL +Y LQ AG R + + +L ++ L ++
Sbjct: 111 ATEAAKYVFKDAKRKVTLPYSGNYERLQIAAGKIRENIPLGLPALDSAITTLFYYNA--- 167

Query: 173 TQSVARAMLRFVTVTAEALRFRQIQRGFRTTLDDLSGRSYVMTAEDVDLTLNWGRLSSVL 232
S A A++ + T+EA R++ I++ +D +++ + + L +W LS +
Sbjct: 168 -NSAASALMVLIQSTSEAARYKFIEQQIGKRVDK----TFLPSLAIISLENSWSALSKQI 222

Query: 233 PDYHGQDSV----------RVGRISFGSINA--ILGSVALILN 263
+ + R++ +++A + ++AL+LN
Sbjct: 223 QIASTNNGQFETPVVLINAQNQRVTITNVDAGVVTSNIALLLN 265


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2973   FLGMOTORFLIM  26     0.024    Flagellar motor switch protein FliM signature.
		>FLGMOTORFLIM#Flagellar motor switch protein FliM signature.

Length = 344

Score = 26.0 bits (57), Expect = 0.024
Identities = 7/36 (19%), Positives = 17/36 (47%)

Query: 38 DTFTVKVGDKELFTNRWNLQSLLLSAQITGMTVTIK 73
D F + +G+++ F + + ++AQI +
Sbjct: 293 DPFVLSIGNRKKFLCQPGVVGKKIAAQILERIESTS 328


98  ECs2942  ECs2934  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
ECs2942   -2      14       1.758244   outer membrane protein Lom
ECs2941   -2      16       1.103157   tail fiber protein
ECs2940   -2      14       -2.655279  hypothetical protein
ECs2939   -1      14       -2.396000  hypothetical protein
ECs2937   -3      12       -1.239887  2-component sensor protein
ECs2936   0       12       0.415355   two-component response-regulatory protein YehT
ECs2935   0       12       1.172111   hypothetical protein
ECs2934   0       14       1.441034   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2942   ENTEROVIROMP  142    2e-45    Enterobacterial virulence outer membrane protein si...
		>ENTEROVIROMP#Enterobacterial virulence outer membrane protein signature.

Length = 171

Score = 142 bits (359), Expect = 2e-45
Identities = 63/195 (32%), Positives = 99/195 (50%), Gaps = 29/195 (14%)

Query: 7 VILSAVVWQVAAATPASAAEHQSTLSAGYLHASTNVPG-SDDLNGINVKYRYEFMDA-LG 64
+ + + V A T ++ ST++ GY A ++ G + + G N+KYRYE ++ LG
Sbjct: 4 IACLSALAAVLAFTAGTSVAATSTVTGGY--AQSDAQGQMNKMGGFNLKYRYEEDNSPLG 61

Query: 65 LITSFSYANAEDEQKTRYSDTRWHEDSVRNRWFSVMAGPSVRVNEWFSAYAMAGVAYSRV 124
+I SF+Y T S T D +N+++ + AGP+ R+N+W S Y + GV Y +
Sbjct: 62 VIGSFTY--------TEKSRTASSGDYNKNQYYGITAGPAYRINDWASIYGVVGVGYGKF 113

Query: 125 STFSGDYLRVTDNKGKTHDVLTGSDDGRHSNTSLAWGAGVQFNPTESVAIDIAYEGSGSG 184
T T+ HD S+ ++GAG+QFNP E+VA+D +YE S
Sbjct: 114 QT--------TEYPTYKHD---------TSDYGFSYGAGLQFNPMENVALDFSYEQSRIR 156

Query: 185 DWRTDGFIVGVGYKF 199
+I GVGY+F
Sbjct: 157 SVDVGTWIAGVGYRF 171


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
ECs2941   IGASERPTASE  39     4e-05    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 38.9 bits (90), Expect = 4e-05
Identities = 46/289 (15%), Positives = 91/289 (31%), Gaps = 30/289 (10%)

Query: 9 LKDGTGKPVENCTIQLKARRNSATVVVNTVASENPDE-AGRYSMDVEYGQYSVILLVEGF 67
+ D TG+P N A + + ++ D A +Y + G+Y + +
Sbjct: 928 VADKTGEPNHNELTLFDASKAQRDHLNVSLVGNTVDLGAWKYKLRNVNGRYDL------Y 981

Query: 68 PPSHAGTITVYEDSQPGTLNDFLGAMSEDDVRPEALRRFELMVEEAARHAEEAKKNAGEA 127
P + + T N+ + E + R + A ++
Sbjct: 982 NPEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSE----TT 1037

Query: 128 ETSARNAGISASQAEESAANADTSAGDALESARQAA-ESAAAAKQSEDASSSSASAAAQK 186
ET A N+ + E++ +A + E A++A A + +E A S S + Q
Sbjct: 1038 ETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQT 1097

Query: 187 ASESSQSAAEA------------ELSRKTAESAAGNAARDAT-TATEKARE-----SAES 228
+ E E+ + T++ + + E ARE + +
Sbjct: 1098 TETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKE 1157

Query: 229 AQSAEQSRIAAEEAVNRIPTVVGPPGPKGEQGPAGPQGPKGDKGERGDT 277
QS + E+ + V P + G + + T
Sbjct: 1158 PQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPAT 1206


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs2937   PF06580  219    8e-69    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 219 bits (560), Expect = 8e-69
Identities = 63/216 (29%), Positives = 115/216 (53%), Gaps = 3/216 (1%)

Query: 343 LGEGIAQLLSAQILAGQYERQKAMLTQSEIKLLHAQVNPHFLFNALNTIKAVIRRDSEQA 402
L G + + + +M ++++ L AQ+NPHF+FNALN I+A+I D +A
Sbjct: 134 LYFGWHFFKNYKQAEIDQWKMASMAQEAQLMALKAQINPHFMFNALNNIRALILEDPTKA 193

Query: 403 SQLVQYLSTFFRKNLKR-PSEFVTLADEIEHVNAYLQIEKARFQSRLQVNIAIPQELSQQ 461
+++ LS R +L+ + V+LADE+ V++YLQ+ +F+ RLQ I +
Sbjct: 194 REMLTSLSELMRYSLRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQINPAIMDV 253

Query: 462 QLPAFTLQPIVENAIKHGTSQLLDTGRVAISVRREGQHLMLEIEDNAGL-YQPVTNASGL 520
Q+P +Q +VEN IKHG +QL G++ + ++ + LE+E+ L + ++G
Sbjct: 254 QVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKESTGT 313

Query: 521 GMNLVDKRLRERFGDDYGISVACEPDSYTRITLRLP 556
G+ V +RL+ +G + I ++ + + +P
Sbjct: 314 GLQNVRERLQMLYGTEAQIKLSEKQGKVN-AMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
ECs2936   HTHFIS  71     1e-16    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 71.4 bits (175), Expect = 1e-16
Identities = 41/177 (23%), Positives = 77/177 (43%), Gaps = 12/177 (6%)

Query: 2 IKVLIVDDEPLARENLRVFLQEQSDIEIVGECSNAVEGIGAVHKLRPDVLFLDIQMPRIS 61
+L+ DD+ R L L ++ ++ SNA + D++ D+ MP +
Sbjct: 4 ATILVADDDAAIRTVLNQAL-SRAGYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 62 GLEMVGMLDPEHRPYI--VFLTAFD--EYAIKAFEEHAFDYLLKPIDEARLEKTLARLRQ 117
+++ + + RP + + ++A + AIKA E+ A+DYL KP D L + R
Sbjct: 62 AFDLLPRIK-KARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 118 ERSKQDVSLLPENQQALKFIPCTGHSRIYLLQMKDVAFVSSRMSGVYVT--SHEGKE 172
E ++ L ++Q + + G S + +A + + +T S GKE
Sbjct: 121 EPKRRPSKLEDDSQDGMPLV---GRSAAMQEIYRVLARLMQTDLTLMITGESGTGKE 174


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs2934   INTIMIN  27     0.027    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 27.3 bits (60), Expect = 0.027
Identities = 19/94 (20%), Positives = 31/94 (32%)

Query: 40 LNGTEIAITYVYKGDKVLKQSSETKIQFASIGATTKEDAAKTLEPLSAKYKNIAGVEEKL 99
+ + AITY K K K S ++ F + KT AK + K
Sbjct: 671 VANGQDAITYTVKVMKGDKPVSNQEVTFTTTLGKLSNSTEKTDTNGYAKVTLTSTTPGKS 730

Query: 100 TYTDTYAQENVTIDMEKVDFKALQGISGINVSAE 133
+ + V + +V+F I N+
Sbjct: 731 LVSARVSDVAVDVKAPEVEFFTTLTIDDGNIEIV 764


99  ECs2887  ECs2878  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
ECs2887   -2      18       4.050261   BaeR family transcriptional regulator
ECs2886   -3      17       3.959565   signal transduction histidine-protein kinase
ECs2885   -3      17       3.764950   multidrug efflux system protein MdtE
ECs2884   -3      13       3.099938   multidrug efflux system subunit MdtC
ECs2883   -3      13       2.729591   multidrug efflux system subunit MdtB
ECs2882   -1      13       1.870548   multidrug efflux system subunit MdtA
ECs2881   -1      14       1.005444   hypothetical protein
ECs2880   -1      14       1.740754   hypothetical protein
ECs2879   -2      11       1.907350   chaperonin
ECs2878   -2      14       2.061905   chaperone
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
ECs2887   HTHFIS  76     4e-18    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 76.0 bits (187), Expect = 4e-18
Identities = 28/136 (20%), Positives = 65/136 (47%), Gaps = 1/136 (0%)

Query: 11 PRILIVEDEPKLGQLLIDYLRAASYAPTLISHGDQVLSYVRQTPPDLILLDLMLPGTDGL 70
IL+ +D+ + +L L A Y + S+ + ++ DL++ D+++P +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 71 TLCREIR-RFSDIPIVMVTAKIEEIDRLLGLEIGADDYICKPYSPREVVARVKTILRRCK 129
L I+ D+P+++++A+ + + E GA DY+ KP+ E++ + L K
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 130 PQRELQQQDAESPLII 145
+ + D++ + +
Sbjct: 124 RRPSKLEDDSQDGMPL 139


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2886   BCTERIALGSPF  31     0.009    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 31.3 bits (71), Expect = 0.009
Identities = 27/93 (29%), Positives = 34/93 (36%), Gaps = 27/93 (29%)

Query: 173 LATLLAALATFLLA-------------RGLLAPVKRLVDGTHKLAAGDFTTRVTPTSEDE 219
LATL+AA A L+A V+ V H LA + P S +
Sbjct: 77 LATLVAASMPLEEALDAVAKQSEKPHLSQLMAAVRSKVMEGHSLAD---AMKCFPGSFER 133

Query: 220 L-----------GKLAQDFNQLASTLEKNQQMR 241
L G L N+LA E+ QQMR
Sbjct: 134 LYCAMVAAGETSGHLDAVLNRLADYTEQRQQMR 166


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs2885   TCRTETB  126    8e-34    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 126 bits (317), Expect = 8e-34
Identities = 97/429 (22%), Positives = 188/429 (43%), Gaps = 23/429 (5%)

Query: 20 FMQSLDTTIVNTALPSMAQSLGESPLHMHMVIVSYVLTVAVMLPASGWLADKVGVRNIFF 79
F L+ ++N +LP +A + P + V +++LT ++ G L+D++G++ +
Sbjct: 24 FFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLL 83

Query: 80 TAIVLFTLGSLFCALSGTLNELL-LARALQGVGGAMMVPVGRLTVMKIVPREQYMAAMTF 138
I++ GS+ + + LL +AR +QG G A + + V + +P+E A
Sbjct: 84 FGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGL 143

Query: 139 VTLPGQVGPLLGPALGGLLVEYASWHWIFLINIPVGIIGAIATLM-LMPNYTMQTRRFDL 197
+ +G +GPA+GG++ Y HW +L+ IP+ I + LM L+ FD+
Sbjct: 144 IGSIVAMGEGVGPAIGGMIAHY--IHWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDI 201

Query: 198 SGFLLLAVGMAVLTLALDGSKGTGLSPLAITGLVAVGVVALVLYLLHARNNNRALFSLKL 257
G +L++VG+ L + + V V++ ++++ H R L
Sbjct: 202 KGIILMSVGIVFFMLF---------TTSYSISFLIVSVLSFLIFVKHIRKVTDPFVDPGL 252

Query: 258 FRTRTFSLGLAGSFAGRIGSGMLPFMTPVFLQIGLGFSPFHAG-LMMIPMVLGSMGMKRI 316
+ F +G+ M P ++ S G +++ P + + I
Sbjct: 253 GKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYI 312

Query: 317 VVQVVNRFGYRRVLVATTLGLSLVTLLFMTTALL----GWYYVLPFVLFLQGMVNSTRFS 372
+V+R G VL +G++ +++ F+T + L W+ + V L G+ S +
Sbjct: 313 GGILVDRRGPLYVL---NIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGL--SFTKT 367

Query: 373 SMNTLTLKDLPDNLASSGNSLLSMIMQLSMSIGVTIAGLLLGLFGSQHVSVDSSTTQTVF 432
++T+ L A +G SLL+ LS G+ I G LL + + Q+ +
Sbjct: 368 VISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLSIPLLDQRLLPMEVDQSTY 427

Query: 433 MYTWLSMAF 441
+Y+ L + F
Sbjct: 428 LYSNLLLLF 436


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2884   ACRIFLAVINRP  922    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 922 bits (2384), Expect = 0.0
Identities = 289/1035 (27%), Positives = 507/1035 (48%), Gaps = 36/1035 (3%)

Query: 6 LFIYRPVATILLSVAITLCGILGFRMLPVAPLPQVDFPVIMVSASLPGASPETMASSVAT 65
FI RP+ +L++ + + G L LPVA P + P + VSA+ PGA +T+ +V
Sbjct: 4 FFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTVTQ 63

Query: 66 PLERSLGRIAGVSEMTSSS-SLGSTRIILQFDFDRDINGAARDVQAAINAAQSLLPSGMP 124
+E+++ I + M+S+S S GS I L F D + A VQ + A LLP +
Sbjct: 64 VIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQEVQ 123

Query: 125 SRPTYRKANPSDAPIMILTLTSDT--YSQGELYDFASTQLAPTISQIDGVGDVDVGGSSL 182
+ S + +M+ SD +Q ++ D+ ++ + T+S+++GVGDV + G+
Sbjct: 124 -QQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGAQY 182

Query: 183 PAVRVGLNPQALFNQGVSLDDVRTAISNANVRKPQG------ALEDGTHRWQIQTNDELK 236
A+R+ L+ L ++ DV + N + G AL I K
Sbjct: 183 -AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFK 241

Query: 237 TAAEYQPLIIHYN-NGGAVRLGDVATVTDSVQDVRNAGMTNAKPAILLMIRKLPEANIIQ 295
E+ + + N +G VRL DVA V ++ N KPA L I+ AN +
Sbjct: 242 NPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANALD 301

Query: 296 TVDSIRAKLPELQETIPAAIDLQIAQDRSPTIRASLEEVEQTLIISVALVILVVFLFLRS 355
T +I+AKL ELQ P + + D +P ++ S+ EV +TL ++ LV LV++LFL++
Sbjct: 302 TAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQN 361

Query: 356 GRATIIPAVAVPVSLIGTFAAMYLCGFSLNNLSLMALTIATGFVVDDAIVVLENIARHL- 414
RAT+IP +AVPV L+GTFA + G+S+N L++ + +A G +VDDAIVV+EN+ R +
Sbjct: 362 MRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMM 421

Query: 415 EAGMKPLQAALQGTREVGFTVLSMSLSLVAVFLPLLLMGGLPGRLLREFAVTLSVAIGIS 474
E + P +A + ++ ++ +++ L AVF+P+ GG G + R+F++T+ A+ +S
Sbjct: 422 EDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALS 481

Query: 475 LLVSLTLTPMMCGWMLKASKPREQKRLRGFG----RMLVALQQGYGKSLKWVLNHTRLVG 530
+LV+L LTP +C +LK + GF Y S+ +L T
Sbjct: 482 VLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRYL 541

Query: 531 MVLLGTIALNIWLYISIPKTFFPEQDTGVLMGGIQADQSISFQ----AMRGKLQDFMKII 586
++ +A + L++ +P +F PE+D GV + IQ + + + ++K
Sbjct: 542 LIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKNE 601

Query: 587 RD-DPAVDNVTGFT-GGSRVNSGMMFITLKPRDERS---ETAQQIIDRLRVKLAKEPGAN 641
+ +V V GF+ G N+GM F++LKP +ER+ +A+ +I R +++L K
Sbjct: 602 KANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDGF 661

Query: 642 LFLMAVQDIRVGGRQSNASYQYTLLSDDLAALREWEPKIRKKLATL-----PELADVNSD 696
+ + I G + ++ L D + + R +L + L V +
Sbjct: 662 VIPFNMPAIVELGTATGFDFE---LIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPN 718

Query: 697 QQDNGAEMNLVYDRDTMARLGIDVQAANSLLNNAFGQRQISTIYQPMNQYKVVMEVDPRY 756
++ A+ L D++ LG+ + N ++ A G ++ K+ ++ D ++
Sbjct: 719 GLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKF 778

Query: 757 TQDISALEKMFVINNEGKAIPLSYFAKWQPANAPLSVNHQGLSAASTISFNLPTGKSLSD 816
++K++V + G+ +P S F + + I G S D
Sbjct: 779 RMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGD 838

Query: 817 ASAAIDRAMTQLGVPSTVRGSFAGTAQVFQETMNSQVILIIAAIATVYIVLGILYESYVH 876
A A ++ ++L P+ + + G + + + N L+ + V++ L LYES+
Sbjct: 839 AMALMENLASKL--PAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSI 896

Query: 877 PLTILSTLPSAGVGALLALELFNAPFSLIALIGIMLLIGIVKKNAIMMVDFALEAQRHGN 936
P++++ +P VG LLA LFN + ++G++ IG+ KNAI++V+FA +
Sbjct: 897 PVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEG 956

Query: 937 LTPQEAIFQACLLRFRPIMMTTLAALFGALPLVLSGGDGSELRQPLGITIVGGLVMSQLL 996
EA A +R RPI+MT+LA + G LPL +S G GS + +GI ++GG+V + LL
Sbjct: 957 KGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLL 1016

Query: 997 TLYTTPVVYLFFDRL 1011
++ PV ++ R
Sbjct: 1017 AIFFVPVFFVVIRRC 1031


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2883   ACRIFLAVINRP  917    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 917 bits (2372), Expect = 0.0
Identities = 298/1036 (28%), Positives = 513/1036 (49%), Gaps = 29/1036 (2%)

Query: 13 SRLFIMRPVATTLLMVAILLAGIIGYRALPVSALPEVDYPTIQVVTLYPGASPDVMTSAV 72
+ FI RP+ +L + +++AG + LPV+ P + P + V YPGA + V
Sbjct: 2 ANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTV 61

Query: 73 TAPLERQFGQMSGLKQMSSQS-SGGASVITLQFQLTLPLDVAEQEVQAAINAATNLLPSD 131
T +E+ + L MSS S S G+ ITL FQ D+A+ +VQ + AT LLP +
Sbjct: 62 TQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQE 121

Query: 132 LPNPPVYSKVNPADPPIMTLAVTSTAMPMTQVE--DMVETRVAQKISQISGVGLVTLSGG 189
+ + S + +M S TQ + D V + V +S+++GVG V L G
Sbjct: 122 VQQQGI-SVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 190 QRPAVRVKLNAQAIAALGLTSETVRTAITGANVNSAKGSLDGP------SRAVTLSANDQ 243
Q A+R+ L+A + LT V + N A G L G ++ A +
Sbjct: 181 QY-AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 244 MQSAEEYRQLII-AYQNGAPIRLGDVATVEQGAENSWLGAWANKEQAIVMNVQRQPGANI 302
++ EE+ ++ + +G+ +RL DVA VE G EN + A N + A + ++ GAN
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANA 299

Query: 303 ISTADSIRQMLPQLTESLPKSVKVTVLSDRTTNIRASVDDTQFELMMAIALVVMIIYLFL 362
+ TA +I+ L +L P+ +KV D T ++ S+ + L AI LV +++YLFL
Sbjct: 300 LDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFL 359

Query: 363 RNIPATIIPGVAVPLSLIGTFAVMVFLDFSINNLTLMALTIATGFVVDDAIVVIENISRY 422
+N+ AT+IP +AVP+ L+GTFA++ +SIN LT+ + +A G +VDDAIVV+EN+ R
Sbjct: 360 QNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERV 419

Query: 423 I-EKGEKPLAAALKGAGEIGFTIISLTFSLIAVLIPLLFMGDIVGRLFREFAITLAVAIL 481
+ E P A K +I ++ + L AV IP+ F G G ++R+F+IT+ A+
Sbjct: 420 MMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMA 479

Query: 482 ISAVVSLTLTPMMCARML---SQESLRKQNRFSRASEKMFDRIIAAYGRGLAKVLNHPWL 538
+S +V+L LTP +CA +L S E + F FD + Y + K+L
Sbjct: 480 LSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 539 TLSVALSTLLLSVLLWVFIPKGFFPVQDNGIIQGTLQAPQSSSFANMAQRQRQVADVILQ 598
L + + V+L++ +P F P +D G+ +Q P ++ + QV D L+
Sbjct: 540 YLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLK 599

Query: 599 DPA--VQSLTSFVGVDGTNPSLNSARLQINLKPLDERDDR---VQKVIARLQTAVDKVPG 653
+ V+S+ + G + + N+ ++LKP +ER+ + VI R + + K+
Sbjct: 600 NEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIR- 658

Query: 654 VDLFLQPTQDLTIDTQVSRTQYQFTLQ---ATSLDALSTWVPQLMEKLQQLP-QISDVSS 709
D F+ P I + T + F L DAL+ QL+ Q P + V
Sbjct: 659 -DGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRP 717

Query: 710 DWQDKGLVAYVNVDRDSASRLGISMADVDNALYNAFGQRLISTIYTQANQYRVVLEHNTE 769
+ + + VD++ A LG+S++D++ + A G ++ + ++ ++ + +
Sbjct: 718 NGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAK 777

Query: 770 NTPGLAALDTIRLTSSDGGVVPLSSIAKIEQRFAPLSINHLDQFPVTTISFNVPDNYSLG 829
+D + + S++G +VP S+ + + + P I S G
Sbjct: 778 FRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSG 837

Query: 830 DAVQAIMDTEKTLNLPVDITTQFQGSTLAFQSALGSTVWLIVAAVVAMYIVLGILYESFI 889
DA A+M+ + LP I + G + + + L+ + V +++ L LYES+
Sbjct: 838 DA-MALMENLAS-KLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWS 895

Query: 890 HPITILSTLPTAGVGALLALMIAGSELDVIAIIGIILLIGIVKKNAIMMIDFALAAEREQ 949
P++++ +P VG LLA + + DV ++G++ IG+ KNAI++++FA ++
Sbjct: 896 IPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKE 955

Query: 950 GMSPRDAIYQACLLRFRPILMTTLAALLGALPLMLSTGVGAELRRPLGIGMVGGLIVSQV 1009
G +A A +R RPILMT+LA +LG LPL +S G G+ + +GIG++GG++ + +
Sbjct: 956 GKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATL 1015

Query: 1010 LTLFTTPVIYLLFDRL 1025
L +F PV +++ R
Sbjct: 1016 LAIFFVPVFFVVIRRC 1031


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2882   RTXTOXIND     48     4e-08    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 47.9 bits (114), Expect = 4e-08
Identities = 48/369 (13%), Positives = 105/369 (28%), Gaps = 87/369 (23%)

Query: 4 SYKSRWVIVIVVVIAAIAAFWFWQGRNDSQSAAPG-----ATKQAQQSPAGGRRG---MR 55
S + R V ++ IA G+ + + A G + + ++
Sbjct: 54 SRRPRLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVK 113

Query: 56 SG-------PLA---PVQAATAVEQAVPRYLTGLGTITAANTVTVRSRVDG--QLMALHF 103
G L + A + L T ++ ++ +L
Sbjct: 114 EGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDE 173

Query: 104 QEGQQVKAGDLLAEI------------DPSQFKVALAQAQGQLAKDKATLTNARR----- 146
Q V ++L Q ++ L + + + A +
Sbjct: 174 PYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVE 233

Query: 147 --DLARYQQLAKTNLVSRQELDAQQALVSETEGTIKADEASVA----------------- 187
L + L +++ + Q+ E ++ ++ +
Sbjct: 234 KSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVT 293

Query: 188 --------------------------SAQLQLDWSRITAPVDGRV-GLKQVDVGNQISSG 220
+ + S I APV +V LK G +++
Sbjct: 294 QLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTA 353

Query: 221 DTTGIVVITQTHPIDLLFTLPESDIATVVQAQKAGKPLVVEAWDRTNSKKL-SEGTLLSL 279
+T +V++ + +++ + DI + Q A + VEA+ T L + ++L
Sbjct: 354 ETL-MVIVPEDDTLEVTALVQNKDIGFINVGQNA--IIKVEAFPYTRYGYLVGKVKNINL 410

Query: 280 DNQIDATTG 288
D D G
Sbjct: 411 DAIEDQRLG 419


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2878   SHAPEPROTEIN  51     4e-09    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 50.9 bits (122), Expect = 4e-09
Identities = 33/129 (25%), Positives = 57/129 (44%), Gaps = 20/129 (15%)

Query: 132 AMMLH-IRQQAQAQLPEAITQAVIGRPINFQGLGGDEANAQAQGILERAAKRAGFRDVVF 190
M+ H I+Q + ++ P+ + E A + +A+ AG R+V
Sbjct: 89 KMLQHFIKQVHSNSFMRPSPRVLVCVPVGATQV---ERRA-----IRESAQGAGAREVFL 140

Query: 191 QYEPVAAGLDYEATLQEEKRVLVVDIGGGTTDCSLLLMGPQWRSRLDREASLLGHSGCRI 250
EP+AA + + E +VVDIGGGTT+ +++ + ++ S RI
Sbjct: 141 IEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVAVISLN-----------GVVYSSSVRI 189

Query: 251 GGNDLDIAL 259
GG+ D A+
Sbjct: 190 GGDRFDEAI 198



Score = 33.2 bits (76), Expect = 0.002
Identities = 32/137 (23%), Positives = 55/137 (40%), Gaps = 23/137 (16%)

Query: 332 RLSYRLV---RSAEECKIALSSV--AETRASLPFISDELAT------LISQQGLESALSQ 380
R +Y + +AE K + S + + LA ++ + AL +
Sbjct: 203 RRNYGSLIGEATAERIKHEIGSAYPGDEVREIEVRGRNLAEGVPRGFTLNSNEILEALQE 262

Query: 381 PLARIQEQVQLALDNAQEKPDV--------IYLTGGSARSPLIKKALAEQLPGIPIAGGD 432
PL I V +AL+ Q P++ + LTGG A + + L E+ GIP+ +
Sbjct: 263 PLTGIVSAVMVALE--QCPPELASDISERGMVLTGGGALLRNLDRLLMEET-GIPVVVAE 319

Query: 433 D-FGSVTAGLARWAEVV 448
D V G + E++
Sbjct: 320 DPLTCVARGGGKALEMI 336


100  ECs2707  ECs2699  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
ECs2707   0       28       -7.739066  transcriptional regulatory protein YedW
ECs2706   -1      26       -7.435544  2-component sensor protein
ECs2705   -3      20       -3.632025  chaperone protein HchA
ECs2704   -3      13       -2.292523  hypothetical protein
ECs2703   -2      12       -0.991670  hypothetical protein
ECs2702   0       14       0.398077   outer membrane protein
ECs2701   2       18       1.532819   hypothetical protein
ECs2700   2       18       1.805731   hypothetical protein
ECs2699   0       16       1.075025   DNA cytosine methylase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2707   HTHFIS        82     2e-20    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 82.2 bits (203), Expect = 2e-20
Identities = 30/117 (25%), Positives = 60/117 (51%), Gaps = 1/117 (0%)

Query: 2 KILLIEDNQRTQEWVTQGLSEAGYVIDAVSDGRDGLYLALKDDYALIILDIMLPGMDGWQ 61
IL+ +D+ + + Q LS AGY + S+ D L++ D+++P + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 ILQTLRTA-KQTPVICLTARDSVDDRVRGLDSGANDYLVKPFSFSELLARVRAQLRQ 117
+L ++ A PV+ ++A+++ ++ + GA DYL KPF +EL+ + L +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2706   PF06580       32     0.005    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 31.8 bits (72), Expect = 0.005
Identities = 35/181 (19%), Positives = 61/181 (33%), Gaps = 37/181 (20%)

Query: 290 ENILFLARADKNNVLVKLDSLS----------------LNKEVENLLDYL--EYLSDEKE 331
NI L D L SLS L E+ + YL + E
Sbjct: 180 NNIRALILEDPTKAREMLTSLSELMRYSLRYSNARQVSLADELTVVDSYLQLASIQFEDR 239

Query: 332 ICFKVECNQQIFADKI---LLQRMLSNLIVNAIRYSPEKSRIHITSFLDTNSYLNIDIAS 388
+ F+ + N I ++ L+Q ++ N I + I P+ +I + D N + +++ +
Sbjct: 240 LQFENQINPAIMDVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKD-NGTVTLEVEN 298

Query: 389 PGAKINEPEKLFRRFWRGDNSRHSVGQGLGLSLVKA-IAELHGGSATYHYLNKHNVFRIT 447
G+ + K G GL V+ + L+G A K
Sbjct: 299 TGSLALKNTKE--------------STGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAM 344

Query: 448 L 448
+
Sbjct: 345 V 345


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2703   ECOLIPORIN    238    2e-80    E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 238 bits (608), Expect = 2e-80
Identities = 117/196 (59%), Positives = 139/196 (70%), Gaps = 24/196 (12%)

Query: 1 MSTSYDFDFGLSLGAAYSNSDRTDNQVHKGTHNTRYGDRFDATAGGETAEAWTVGAKYDA 60
+ST+YD G S GAAY+ SDRT+ QV+ G AGG+ A+AWT G KYDA
Sbjct: 207 ISTTYDIGMGFSAGAAYTTSDRTNEQVNAGGTI----------AGGDKADAWTAGLKYDA 256

Query: 61 NNVYLAAMYAEPRNMTGYGDADA-----IANKTQNFEVVAQYQFDFGLRPSIAYLQSKGK 115
NN+YLA MY+E RNMT YG D +ANKTQNFEV AQYQFDFGLRP++++L SKGK
Sbjct: 257 NNIYLATMYSETRNMTPYGKTDKGYDGGVANKTQNFEVTAQYQFDFGLRPAVSFLMSKGK 316

Query: 116 DLGGVNSDNFDSQGNHHYTNKDLVKYVDIGMTYYFNKNMSTYVDYKINLLDNDDDFYKEN 175
DL N + +KDLVKY D+G TYYFNKN STYVDYKINLLD+DD FYK+
Sbjct: 317 DLTY---------NNVNGDDKDLVKYADVGATYYFNKNFSTYVDYKINLLDDDDPFYKDA 367

Query: 176 GIATDDIVAVGLVYQF 191
GI+TDDIVA+G+VYQF
Sbjct: 368 GISTDDIVALGMVYQF 383


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2702   ECOLIPORIN    303    e-106    E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 303 bits (778), Expect = e-106
Identities = 143/204 (70%), Positives = 160/204 (78%), Gaps = 2/204 (0%)

Query: 1 MKRKVLAMLVPALLVAGAANAAEIYNKDGNKLDLYGKVAGLHYFSDDASSDGDMSYARIG 60
MKRKVLA+++PALL AGAA+AAEIYNKDGNKLDLYGKV GLHYFSDD+S DGD +Y R+G
Sbjct: 1 MKRKVLALVIPALLAAGAAHAAEIYNKDGNKLDLYGKVDGLHYFSDDSSKDGDQTYMRVG 60

Query: 61 FKGETQIADQFTGYGQWEFNIGANGPESDKGNTATRLAFAGFGFGQNGTFDYGRNYGVVY 120
FKGETQI DQ TGYGQWE+N+ AN E + N+ TRLAFAG FG G+FDYGRNYGV+Y
Sbjct: 61 FKGETQINDQLTGYGQWEYNVQANTTEGEGANSWTRLAFAGLKFGDYGSFDYGRNYGVLY 120

Query: 121 DVEAWTDMLPEFGGDTYAGADNFMNGRANSVATYRNNGFFGQVDGLNFALQYQGNNEKSG 180
DVE WTDMLPEFGGD+Y ADN+M GRAN VATYRN FFG VDGLNFALQYQG NE
Sbjct: 121 DVEGWTDMLPEFGGDSYTYADNYMTGRANGVATYRNTDFFGLVDGLNFALQYQGKNESQS 180

Query: 181 LFDQEGSGNG--NGRKLAKENGDG 202
D N NG + +NGDG
Sbjct: 181 ADDVNIGTNNRNNGDDIRYDNGDG 204


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2700   CARBMTKINASE  35     2e-04    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 34.8 bits (80), Expect = 2e-04
Identities = 22/92 (23%), Positives = 36/92 (39%), Gaps = 9/92 (9%)

Query: 37 AQKLAADDDVDMLVILTACYFHDIVSLAKNHPQRQSSSILAAEETRRLLREEFVQFPA-- 94
+KLA + + D+ +ILT + +L + Q + EE R+ E F A
Sbjct: 219 GEKLAEEVNADIFMILTDV---NGAALYYGTEKEQWLREVKVEELRKYYEEG--HFKAGS 273

Query: 95 --EKIEAVCHAIAAHSFSAQIAPLTTEAKIVQ 124
K+ A I A IA L + ++
Sbjct: 274 MGPKVLAAIRFIEWGGERAIIAHLEKAVEALE 305


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2699   PF05272       29     0.045    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.045
Identities = 20/62 (32%), Positives = 29/62 (46%), Gaps = 15/62 (24%)

Query: 320 AKYILTPVLWKYLYRYAKKHQARGNGFGYGMVYPNNPQSVTRTLSARYYKDGAEILIDRG 379
A+Y + PVLW Y+ R+ K + G+ VY +R +DG+E RG
Sbjct: 166 ARYQVGPVLWGYVVRFIK---SDGDKLTLPYVY------------SRSQRDGSEAWKWRG 210

Query: 380 WD 381
WD
Sbjct: 211 WD 212


101  ECs2689  ECs2662  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
ECs2689   -1      17       0.576501   flagellar biosynthesis protein FliR
ECs2688   -2      21       1.823596   flagellar biosynthesis protein FliQ
ECs2687   -1      17       2.388252   flagellar biosynthesis protein FliP
ECs2686   -1      17       2.384514   flagellar biosynthesis protein FliO
ECs2685   -1      19       3.510490   flagellar motor switch protein FliN
ECs2684   0       17       3.773591   flagellar motor switch protein FliM
ECs2683   1       18       4.342646   flagellar basal body-associated protein FliL
ECs2682   0       15       4.159917   flagellar hook-length control protein
ECs2681   0       17       4.245621   flagellar biosynthesis chaperone
ECs2680   -1      15       3.986095   flagellum-specific ATP synthase
ECs2679   0       13       2.709173   flagellar assembly protein H
ECs2678   -1      13       1.919757   flagellar motor switch protein G
ECs2677   -1      13       1.748149   flagellar MS-ring protein
ECs2676   1       14       1.528287   flagellar hook-basal body protein FliE
ECs2675   0       13       0.207062   hypothetical protein
ECs2670   0       14       0.546991   hypothetical protein
ECs2669   -1      15       0.223235   hypothetical protein
ECs2668   -2      19       0.530768   hypothetical protein
ECs2667   -1      14       -1.096657  hypothetical protein
ECs2666   0       13       -1.148128  alpha-amylase
ECs2665   1       14       -1.253643  flagellar biosynthesis protein FliT
ECs2664   0       13       -1.450992  flagellar protein FliS
ECs2663   0       12       -1.172933  flagellar capping protein
ECs2662   -1      15       -1.181564  flagellin
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2689   TYPE3IMRPROT  203    3e-67    Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 203 bits (518), Expect = 3e-67
Identities = 260/261 (99%), Positives = 261/261 (100%)

Query: 1 MLQVTSEQWLSWLSLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA 60
MLQVTSEQWLSWL+LYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA
Sbjct: 1 MLQVTSEQWLSWLNLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA 60

Query: 61 NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHL 120
NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHL
Sbjct: 61 NDVPVFSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHL 120

Query: 121 NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIF 180
NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIF
Sbjct: 121 NMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIF 180

Query: 181 LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC 240
LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC
Sbjct: 181 LNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFC 240

Query: 241 EHLFSEIFNLLADIISELPLI 261
EHLFSEIFNLLADIISELPLI
Sbjct: 241 EHLFSEIFNLLADIISELPLI 261


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2688   TYPE3IMQPROT  67     1e-18    Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.

Length = 86

Score = 67.1 bits (164), Expect = 1e-18
Identities = 22/78 (28%), Positives = 42/78 (53%)

Query: 4 ESVMMMGTEAMKVALALAAPLLLVALVTGLIISILQAATQINEMTLSFIPKIIAVFIAII 63
+ ++ G +A+ + L L+ +VA + GL++ + Q TQ+ E TL F K++ V + +
Sbjct: 2 DDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLF 61

Query: 64 IAGPWMLNLLLDYVRTLF 81
+ W +LL Y R +
Sbjct: 62 LLSGWYGEVLLSYGRQVI 79


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2687   FLGBIOSNFLIP  334    e-119    Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP signature.

Length = 245

Score = 334 bits (858), Expect = e-119
Identities = 245/245 (100%), Positives = 245/245 (100%)

Query: 1 MRRLLSVAPVLLWLITPLAFAQLPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILLM 60
MRRLLSVAPVLLWLITPLAFAQLPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILLM
Sbjct: 1 MRRLLSVAPVLLWLITPLAFAQLPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILLM 60

Query: 61 MTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEK 120
MTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEK
Sbjct: 61 MTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEK 120

Query: 121 ISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELK 180
ISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELK
Sbjct: 121 ISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELK 180

Query: 181 TAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLA 240
TAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLA
Sbjct: 181 TAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLA 240

Query: 241 QSFYS 245
QSFYS
Sbjct: 241 QSFYS 245


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2685   FLGMOTORFLIN  212    1e-74    Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 212 bits (542), Expect = 1e-74
Identities = 125/137 (91%), Positives = 134/137 (97%)

Query: 1 MSDMNNPADDNNGAMDDLWAEALSEQKSTSSKSAADAVFQQFGGGDVSGTLQDIDLIMDI 60
MSDMNNP+D+N GA+DDLWA+AL+EQK+T++KSAADAVFQQ GGGDVSG +QDIDLIMDI
Sbjct: 1 MSDMNNPSDENTGALDDLWADALNEQKATTTKSAADAVFQQLGGGDVSGAMQDIDLIMDI 60

Query: 61 PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV 120
PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV
Sbjct: 61 PVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGV 120

Query: 121 RITDIITPSERMRRLSR 137
RITDIITPSERMRRLSR
Sbjct: 121 RITDIITPSERMRRLSR 137


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2684   FLGMOTORFLIM  381    e-135    Flagellar motor switch protein FliM signature.
		>FLGMOTORFLIM#Flagellar motor switch protein FliM signature.

Length = 344

Score = 381 bits (979), Expect = e-135
Identities = 85/324 (26%), Positives = 148/324 (45%), Gaps = 10/324 (3%)

Query: 5 ILSQAEIDALLNGDS--EVKDEPTASVSGESDIRPYDPNTQRRVVRERLQALEIINERFA 62
+LSQ EID LL S + E +S I YD + +E+++ L +++E FA
Sbjct: 4 VLSQDEIDQLLTAISSGDASIEDARPISDTRKITLYDFRRPDKFSKEQMRTLSLMHETFA 63

Query: 63 RHFRMGLFNLLRRSPDITVGAIRIQPYHEFARNLPVPTNLNLIHLKPLRGTGLVVFSPSL 122
R L LR + V ++ Y EF R++P P+ L +I + PL+G ++ PS+
Sbjct: 64 RLTTTSLSAQLRSMVHVHVASVDQLTYEEFIRSIPTPSTLAVITMDPLKGNAVLEVDPSI 123

Query: 123 VFIAVDNLFGGDGRFPTKVEGREFTHTEQRVINRMLKLALEGYSDAWKAINPLEVEYVRS 182
F +D LFGG G+ KV+ R+ T E V+ ++ L ++W + L +
Sbjct: 124 TFSIIDRLFGGTGQ-AAKVQ-RDLTDIENSVMEGVIVRILANVRESWTQVIDLRPRLGQI 181

Query: 183 EMQVKFTNITTSPNDIVVNTPFHVEIGNLTGEFNICLPFSMIEPLRELLVNPPLENS--R 240
E +F I P+++VV ++G G N C+P+ IEP+ L + +S R
Sbjct: 182 ETNPQFAQI-VPPSEMVVLVTLETKVGEEEGMMNFCIPYITIEPIISKLSSQFWFSSVRR 240

Query: 241 NEDQNWRDNLVRQVQHSQLELVANFADISLRLSQILKLKPGDVLPIEKP---DRIIAHVD 297
+ + L ++ +++VA + L + IL L+ GD++ + D + +
Sbjct: 241 SSTTQYMGVLRDKLSTVDMDVVAEVGSLRLSVRDILGLRVGDIIRLHDTHVGDPFVLSIG 300

Query: 298 GVPVLTSQYGTLNGQYALRIEHLI 321
Q G + + A +I I
Sbjct: 301 NRKKFLCQPGVVGKKIAAQILERI 324


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2682   FLGHOOKFLIK   470    e-168    Flagellar hook-length control protein signature.
		>FLGHOOKFLIK#Flagellar hook-length control protein signature.

Length = 375

Score = 470 bits (1209), Expect = e-168
Identities = 369/375 (98%), Positives = 369/375 (98%)

Query: 1 MIRLAPLITADVDTTTLPGGKASDAAQDFLALLSEALAGETTTDKAAPQLLVATDKPTTK 60
MIRLAPLITADVDTTTLPGGKASDAAQDFLALLSEALAGETTTDKAAPQLLVATDKPTTK
Sbjct: 1 MIRLAPLITADVDTTTLPGGKASDAAQDFLALLSEALAGETTTDKAAPQLLVATDKPTTK 60

Query: 61 GEPLISDIVSDAQQANLLIPVDETPPVINDEQSTSTPLTTAQTMALAAVADKNTTKDEKA 120
GEPLISDIVSDAQQANLLIPVDETPPVINDEQSTSTPLTTAQTMALAAVADKNTTKDEKA
Sbjct: 61 GEPLISDIVSDAQQANLLIPVDETPPVINDEQSTSTPLTTAQTMALAAVADKNTTKDEKA 120

Query: 121 DDLNEDVTASLSALFAMLPGFDNTPKVTDVPSTVLPAEKPTLFTKLTSAQLTTAQPDDAP 180
DDLNEDVTASLSALFAMLPGFDNTPKVTD PSTVLP EKPTLFTKLTS QLTTAQPDDAP
Sbjct: 121 DDLNEDVTASLSALFAMLPGFDNTPKVTDAPSTVLPTEKPTLFTKLTSEQLTTAQPDDAP 180

Query: 181 GTPAQPLTPQVAEAQSKAEVISTPSPVTAAASPLITPHQTQPLPTVAAPVLSAPLGSHEW 240
GTPAQPLTP VAEAQSKAEVISTPSPVTAAASPLITPHQTQPLPTVAAPVLSAPLGSHEW
Sbjct: 181 GTPAQPLTPLVAEAQSKAEVISTPSPVTAAASPLITPHQTQPLPTVAAPVLSAPLGSHEW 240

Query: 241 QQSLSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQMVSSHQHVRAALEAA 300
QQSLSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQMVS HQHVRAALEAA
Sbjct: 241 QQSLSQHISLFTRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQMVSPHQHVRAALEAA 300

Query: 301 LPVLRTQLAESGIQLGQSNISGESFSGQQQAASQQQQSQRTVNHEPLAGEDDDTLPVPVS 360
LPVLRTQLAESGIQLGQSNISGESFSGQQQAASQQQQSQRT NHEPLAGEDDDTLPVPVS
Sbjct: 301 LPVLRTQLAESGIQLGQSNISGESFSGQQQAASQQQQSQRTANHEPLAGEDDDTLPVPVS 360

Query: 361 LQGRVTGNSGVDIFA 375
LQGRVTGNSGVDIFA
Sbjct: 361 LQGRVTGNSGVDIFA 375


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2681   FLGFLIJ       202    2e-70    Flagellar FliJ protein signature.
		>FLGFLIJ#Flagellar FliJ protein signature.

Length = 147

Score = 202 bits (515), Expect = 2e-70
Identities = 146/147 (99%), Positives = 147/147 (100%)

Query: 1 MAEHGALATLKDLAEKEVEDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRNNLNSDMSAG 60
MAEHGALATLKDLAEKEVEDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRNNLNSDMSAG
Sbjct: 1 MAEHGALATLKDLAEKEVEDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRNNLNSDMSAG 60

Query: 61 MTSNRWINYQQFIQTLEKAITQHRQQLNQWTQKVDIALNSWREKKQRLQAWQTLQERQST 120
+TSNRWINYQQFIQTLEKAITQHRQQLNQWTQKVDIALNSWREKKQRLQAWQTLQERQST
Sbjct: 61 ITSNRWINYQQFIQTLEKAITQHRQQLNQWTQKVDIALNSWREKKQRLQAWQTLQERQST 120

Query: 121 AALLAENRLDQKKMDEFAQRAAMRKPE 147
AALLAENRLDQKKMDEFAQRAAMRKPE
Sbjct: 121 AALLAENRLDQKKMDEFAQRAAMRKPE 147


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2679   FLGFLIH       374    e-135    Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 374 bits (961), Expect = e-135
Identities = 226/228 (99%), Positives = 227/228 (99%)

Query: 1 MSDNLPWKTWTPDDLAPPQAEFVPMVESEETIIEEAEPSLEQQLAQLQMQAHEQGYQAGI 60
MSDNLPWKTWTPDDLAPPQAEFVP+VE EETIIEEAEPSLEQQLAQLQMQAHEQGYQAGI
Sbjct: 1 MSDNLPWKTWTPDDLAPPQAEFVPIVEPEETIIEEAEPSLEQQLAQLQMQAHEQGYQAGI 60

Query: 61 AEGRQQGHKQGYQEGLAQGLEQGLAEAKSQQAPIHARMQQLVSEFQTTLDALDSVIASRL 120
AEGRQQGHKQGYQEGLAQGLEQGLAEAKSQQAPIHARMQQLVSEFQTTLDALDSVIASRL
Sbjct: 61 AEGRQQGHKQGYQEGLAQGLEQGLAEAKSQQAPIHARMQQLVSEFQTTLDALDSVIASRL 120

Query: 121 MQMALEAARQVIGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGAT 180
MQMALEAARQVIGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGAT
Sbjct: 121 MQMALEAARQVIGQTPTVDNSALIKQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGAT 180

Query: 181 LSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATRWQELCRLAAPGVV 228
LSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATRWQELCRLAAPGVV
Sbjct: 181 LSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATRWQELCRLAAPGVV 228


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2678   FLGMOTORFLIG  341    e-119    Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 341 bits (876), Expect = e-119
Identities = 117/329 (35%), Positives = 197/329 (59%), Gaps = 2/329 (0%)

Query: 1 MSNLTGTDKSVILLMTIGEDRAAEVFKHLSQREVQTLSAAMANVTQISNKQLTDVLAEFE 60
+S LTG K+ ILL++IG + +++VFK+LSQ E+++L+ +A + I+++ +VL EF+
Sbjct: 12 VSALTGKQKAAILLVSIGSEISSKVFKYLSQEEIESLTFEIAKLETITSELKDNVLLEFK 71

Query: 61 QEAEQFAALNINANDYLRSVLVKALGEERAASLLEDILETRDTASGIETLNFMEPQSAAD 120
+ + DY R +L K+LG ++A ++ + L + + E + +P + +
Sbjct: 72 ELMMAQEFIQKGGIDYARELLEKSLGTQKAVDIINN-LGSALQSRPFEFVRRADPANILN 130

Query: 121 LIRDEHPQIIATILVHLKRAQAADILALFDERLRHDVMLRIATFGGVQPAALAELTEVLN 180
I+ EHPQ IA IL +L +A+ IL+ ++ +V RIA P + E+ VL
Sbjct: 131 FIQQEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEVVREVERVLE 190

Query: 181 GLLDGQ-NLKRSKMGGVRTAAEIINLMKTQQEEAVITAVREFDGELAQKIIDEMFLFENL 239
L + + GGV EIIN+ + E+ +I ++ E D ELA++I +MF+FE++
Sbjct: 191 KKLASLSSEDYTSAGGVDNVVEIINMADRKTEKFIIESLEEEDPELAEEIKKKMFVFEDI 250

Query: 240 VDVDDRSIQRLLQEVDSESLLIALKGAEQPLREKFLRNMSQRAADILRDDLANRGPVRLS 299
V +DDRSIQR+L+E+D + L ALK + P++EK +NMS+RAA +L++D+ GP R
Sbjct: 251 VLLDDRSIQRVLREIDGQELAKALKSVDIPVQEKIFKNMSKRAASMLKEDMEFLGPTRRK 310

Query: 300 QVENEQKAILLIVRRLAETGEMVIGSGED 328
VE Q+ I+ ++R+L E GE+VI G +
Sbjct: 311 DVEESQQKIVSLIRKLEEQGEIVISRGGE 339


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2677   FLGMRINGFLIF  752    0.0      Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 752 bits (1943), Expect = 0.0
Identities = 478/555 (86%), Positives = 515/555 (92%), Gaps = 5/555 (0%)

Query: 3 ATAAQTKSLEWLNRLRANPKIPLIVAGSAAVAVMVALILWAKAPDYRTLFSNLSDQDGGA 62
+TA Q K LEWLNRLRANP+IPLIVAGSAAVA++VA++LWAK PDYRTLFSNLSDQDGGA
Sbjct: 5 STATQPKPLEWLNRLRANPRIPLIVAGSAAVAIVVAMVLWAKTPDYRTLFSNLSDQDGGA 64

Query: 63 IVSQLTQMNIPYRFSEASGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQ 122
IV+QLTQMNIPYRF+ SGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQ
Sbjct: 65 IVAQLTQMNIPYRFANGSGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQ 124

Query: 123 FSEQVNYQRALEGELSRTIETIGPVKGARVHLAMPKPSLFVREQKSPSASVTVNLLPGRA 182
FSEQVNYQRALEGEL+RTIET+GPVK ARVHLAMPKPSLFVREQKSPSASVTV L PGRA
Sbjct: 125 FSEQVNYQRALEGELARTIETLGPVKSARVHLAMPKPSLFVREQKSPSASVTVTLEPGRA 184

Query: 183 LDEGQISAIVHLVSSAVAGLPPGNVTLVDQGGHLLTQSNTSWRDLNDAQLKYASDVEGRI 242
LDEGQISA+VHLVSSAVAGLPPGNVTLVDQ GHLLTQSNTS RDLNDAQLK+A+DVE RI
Sbjct: 185 LDEGQISAVVHLVSSAVAGLPPGNVTLVDQSGHLLTQSNTSGRDLNDAQLKFANDVESRI 244

Query: 243 QRRIEAILSPIVGNGNIHAQVTAQLDFASKEQTEEQYRPNGDESQAALRSRQLNESEQSG 302
QRRIEAILSPIVGNGN+HAQVTAQLDFA+KEQTEE Y PNGD S+A LRSRQLN SEQ G
Sbjct: 245 QRRIEAILSPIVGNGNVHAQVTAQLDFANKEQTEEHYSPNGDASKATLRSRQLNISEQVG 304

Query: 303 SGYPGGVPGALSNQPAPANNAPISTPPANQNNRQQ--QASTTSNS---GPRSTQRNETSN 357
+GYPGGVPGALSNQPAP N API+TPP NQ N Q Q ST++NS GPRSTQRNETSN
Sbjct: 305 AGYPGGVPGALSNQPAPPNEAPIATPPTNQQNAQNTPQTSTSTNSNSAGPRSTQRNETSN 364

Query: 358 YEVDRTIRHTKMNVGDVQRLSVAVVVNYKTLPDGKPLPLSNEQMKQIEDLTREAMGFSEK 417
YEVDRTIRHTKMNVGD++RLSVAVVVNYKTL DGKPLPL+ +QMKQIEDLTREAMGFS+K
Sbjct: 365 YEVDRTIRHTKMNVGDIERLSVAVVVNYKTLADGKPLPLTADQMKQIEDLTREAMGFSDK 424

Query: 418 RGDSLNVVNSPFNSSDESGGELPFWQQQAFIDQLLAAGRWLLVLLVAWLLWRKAVRPQLT 477
RGD+LNVVNSPF++ D +GGELPFWQQQ+FIDQLLAAGRWLLVL+VAW+LWRKAVRPQLT
Sbjct: 425 RGDTLNVVNSPFSAVDNTGGELPFWQQQSFIDQLLAAGRWLLVLVVAWILWRKAVRPQLT 484

Query: 478 RRAEAMKAVQQQAQAREEVEDAVEVRLSKDEQLQQRRANQRLGAEVMSQRIREMSDNDPR 537
RR E KA Q+QAQ R+E E+AVEVRLSKDEQLQQRRANQRLGAEVMSQRIREMSDNDPR
Sbjct: 485 RRVEEAKAAQEQAQVRQETEEAVEVRLSKDEQLQQRRANQRLGAEVMSQRIREMSDNDPR 544

Query: 538 VVALVIRQWINNDHE 552
VVALVIRQW++NDHE
Sbjct: 545 VVALVIRQWMSNDHE 559


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2676   FLGHOOKFLIE   117    5e-38    Flagellar hook-basal body complex protein FliE signa...
		>FLGHOOKFLIE#Flagellar hook-basal body complex protein FliE signature.

Length = 103

Score = 117 bits (294), Expect = 5e-38
Identities = 103/103 (100%), Positives = 103/103 (100%)

Query: 2 SAIQGIEGVISQLQATAMSARAQESLPQPTISFAGQLHAALDRISDTQTAARTQAEKFTL 61
SAIQGIEGVISQLQATAMSARAQESLPQPTISFAGQLHAALDRISDTQTAARTQAEKFTL
Sbjct: 1 SAIQGIEGVISQLQATAMSARAQESLPQPTISFAGQLHAALDRISDTQTAARTQAEKFTL 60

Query: 62 GEPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV 104
GEPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV
Sbjct: 61 GEPGVALNDVMTDMQKASVSMQMGIQVRNKLVAAYQEVMSMQV 103


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2669   PF01206       93     6e-29    SirA family protein
		>PF01206#SirA family protein

Length = 76

Score = 92.5 bits (230), Expect = 6e-29
Identities = 16/71 (22%), Positives = 37/71 (52%)

Query: 7 DYRLDMVGEPCPYPAVATLEAMPQLKKGEILEVVSDCPQSINNIPLDARNHGYTVLDIQQ 66
D LD G CP P + + + + GE+L V++ P S+ + ++ G+ +L+ ++
Sbjct: 5 DQSLDATGLNCPLPILKAKKTLATMNAGEVLYVMATDPGSVKDFESFSKQTGHELLEQKE 64

Query: 67 DGPTIRYLIQK 77
+ T + +++
Sbjct: 65 EDGTYHFRLKR 75


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2668   RTXTOXIND     30     0.018    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 30.2 bits (68), Expect = 0.018
Identities = 10/57 (17%), Positives = 17/57 (29%), Gaps = 2/57 (3%)

Query: 164 RFTLLPIFRIPVKMQKVAAASPLTQKPDQARRRF--RLGMLVFFGMLGWALLTAMNQ 218
R L R + + + A L + P R R M ++L +
Sbjct: 26 RKQLDTPVREKDENEFLPAHLELIETPVSRRPRLVAYFIMGFLVIAFILSVLGQVEI 82


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2663   TYPE3OMBPROT  33     0.003    Type III secretion system outer membrane B protein ...
		>TYPE3OMBPROT#Type III secretion system outer membrane B protein family signature.

Length = 538

Score = 32.7 bits (74), Expect = 0.003
Identities = 24/72 (33%), Positives = 37/72 (51%), Gaps = 2/72 (2%)

Query: 211 NGMEVSVAAQNAQLTVNNVAIENSSNTISDALENITLNLNDVTTGNQTLTITQDTSKAQT 270
N E +VAA+N + + A+ + +S AL T++L V+T LT T T ++
Sbjct: 236 NSSERAVAARNKAEELVSAALYSRPELLSQALSGKTVDLKIVSTS--LLTPTSLTGGEES 293

Query: 271 AIKDWVNAYNSL 282
+KD VNA L
Sbjct: 294 MLKDQVNALKGL 305


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2662   FLAGELLIN     228    6e-70    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 228 bits (582), Expect = 6e-70
Identities = 253/583 (43%), Positives = 306/583 (52%), Gaps = 76/583 (13%)

Query: 2 AQVINTNSLSLITQNNINKNQSALSSSIERLSSGLRINSAKDDAAGQAIANRFTSNIKGL 61
AQVINTNSLSL+TQNN+NK+QS+LSS+IERLSSGLRINSAKDDAAGQAIANRFTSNIKGL
Sbjct: 1 AQVINTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGL 60

Query: 62 TQAARNANDGISVAQTTEGALSEINNNLQRIRELTVQATTGTNSDSDLDSIQDEIKSRLD 121
TQA+RNANDGIS+AQTTEGAL+EINNNLQR+REL+VQAT GTNSDSDL SIQDEI+ RL+
Sbjct: 61 TQASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLE 120

Query: 122 EIDRVSGQTQFNGVNVLAKDGSMKIQVGANDGETITIDLKKIDSDTLGLNGFNVNGKGTI 181
EIDRVS QTQFNGV VL++D MKIQVGANDGETITIDL+KID +LGL+GFNVNG
Sbjct: 121 EIDRVSNQTQFNGVKVLSQDNQMKIQVGANDGETITIDLQKIDVKSLGLDGFNVNG---- 176

Query: 182 TNKAATVSDLTSAGAKLNTTTGLYDLKTENTLLTTDAAFDKLGNGDKVTVGGVDYTYNAK 241
+ + + TG D + NA
Sbjct: 177 ----PKEATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAA 232

Query: 242 SGDFTTTKSTAGTGVDAAAQAADSASKRDALAATLHADVGKSVNGSYTTKDGTVSFETDS 301
+G TT + T VD +A +A A K G D
Sbjct: 233 NGQLTTDDAENNTAVDLFKTTKSTAGTAEAKAIA------------GAIKGGKEGDTFDY 280

Query: 302 AGNITIGGSQAYVDDAGNLTTNNAGSAAKADMKALLKAASEGSDGASLTFNGTEYTIAKA 361
G ++ D G ++T
Sbjct: 281 KGVTFTIDTKTGNDGNGKVSTT-------------------------------------- 302

Query: 362 TPATTTPVAPLIPGGITYQATVSKDVVLSETKAAAATSSITFNSGVLSKTIGFTAGESSD 421
I G + AA SS + V++ F ++
Sbjct: 303 -----------INGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKTKNE 351

Query: 422 AAKSYVDDKGGITNVADYTVSYSVNKDNGSVTVAGYASATDTNKDYAPAIGTAVNVNSAG 481
+AK + A+ +
Sbjct: 352 SAKLSDLEANNAVKGE-------SKITVNGAEYTANAAGDKVTLAGKTMFIDKTASGVST 404

Query: 482 KITTETTSAGSATTNPLAALDDAISSIDKFRSSLGAIQNRLDSAVTNLNNTTTNLSEAQS 541
I + +A +T NPLA++D A+S +D RSSLGAIQNR DSA+TNL NT TNL+ A+S
Sbjct: 405 LINEDAAAAKKSTANPLASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARS 464

Query: 542 RIQDADYATEVSNMSKAQIIQQAGNSVLAKANQVPQQVLSLLQ 584
RI+DADYATEVSNMSKAQI+QQAG SVLA+ANQVPQ VLSLL+
Sbjct: 465 RIEDADYATEVSNMSKAQILQQAGTSVLAQANQVPQNVLSLLR 507


102  ECs2600  ECs2592  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
ECs2600   -1      13       1.097658   flagellar motor protein MotA
ECs2599   -1      13       1.419962   flagellar motor protein MotB
ECs2598   0       11       1.547256   chemotaxis protein CheA
ECs2597   0       13       1.362295   purine-binding chemotaxis protein
ECs2596   1       13       1.527513   methyl-accepting chemotaxis protein II
ECs2595   0       17       2.085243   methyl-accepting protein IV
ECs2594   0       15       2.372055   chemotaxis methyltransferase CheR
ECs2593   0       14       2.333478   chemotaxis-specific methylesterase
ECs2592   -1      16       0.736162   chemotaxis regulatory protein CheY
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2600   PF05844       33     0.001    YopD protein
		>PF05844#YopD protein

Length = 295

Score = 33.1 bits (75), Expect = 0.001
Identities = 12/28 (42%), Positives = 22/28 (78%), Gaps = 2/28 (7%)

Query: 76 MDLLALLYRLMAKSRQMGMFSLERDIEN 103
++LL +L+R+ K+R++G+ L+RD EN
Sbjct: 74 VELLLILFRIAQKARELGV--LQRDNEN 99


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs2599   PF05272       30     0.010    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.4 bits (68), Expect = 0.010
Identities = 22/93 (23%), Positives = 35/93 (37%), Gaps = 11/93 (11%)

Query: 46 LISISSPKELIQIAEYFRTPLATAVTGGDRISNSESPIPGGGDDYTQSQGEVNKQPNIEE 105
L +SSP A P + G + ++ PGGGDD GE +++
Sbjct: 384 LADVSSPTAAAGGAGGGEPPKKRDPSAG---AGTDPGGPGGGDD-----GEDPFGEWLDD 435

Query: 106 LKKRM---EQSRLRKLRGDLDQLIESDPKLRAL 135
R+ + L+ R L + + S P L
Sbjct: 436 EVARLRLRGRWLLKPRRAALIEALRSAPALAGC 468


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs2598   PF06580  43     4e-06    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 42.5 bits (100), Expect = 4e-06
Identities = 23/151 (15%), Positives = 49/151 (32%), Gaps = 52/151 (34%)

Query: 361 ELDKSLIERIIDPLT--HLVRNSLDHGIELPEKRLAAGKNSVGNLILSAEHQGGNICIEV 418
+++ ++++ + P+ LV N + HGI G ++L G + +EV
Sbjct: 245 QINPAIMDVQVPPMLVQTLVENGIKHGIA--------QLPQGGKILLKGTKDNGTVTLEV 296

Query: 419 TDDGAGLNRERILAKAASQGLTVSENMSDDEVAMLIFAPGFSTAEQVTDVSGRGVGMDVV 478
+ G+ + G G+ V
Sbjct: 297 ENTGSLALKNTK--------------------------------------ESTGTGLQNV 318

Query: 479 KRNIQEMGG---HVEIQSKQGTGTTIRILLP 506
+ +Q + G +++ KQG +L+P
Sbjct: 319 RERLQMLYGTEAQIKLSEKQG-KVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
ECs2593   HTHFIS  65     8e-14    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 65.2 bits (159), Expect = 8e-14
Identities = 35/188 (18%), Positives = 72/188 (38%), Gaps = 23/188 (12%)

Query: 1 MSKIRVLSVDDSALMRQIMTEIINSHSDMEMVATAPDPLVARDLIKKFNPDVLTLDVEMP 60
M+ +L DD A +R ++ + ++ V + I + D++ DV MP
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAG--YDVRITSNAATLWRWIAAGDGDLVVTDVVMP 58

Query: 61 RMDGLDFLEKLMRLRPMPVVMVSSLTGKGS-EVTLRALELGAIDFVTKPQLGIREGMLAY 119
+ D L ++ + RP V+V ++ + + ++A E GA D++ KP + E +
Sbjct: 59 DENAFDLLPRIKKARPDLPVLV--MSAQNTFMTAIKASEKGAYDYLPKP-FDLTELIGII 115

Query: 120 SEMIAEKVRTAAKASLAAHKPLSAPTTLKAGPLLSSEKLIAIGASTGGTEAIRHVLQPLP 179
+AE R +K + + +G S E R + + +
Sbjct: 116 GRALAEPKRRPSKLEDDSQDGMP-----------------LVGRSAAMQEIYRVLARLMQ 158

Query: 180 LSSPALLI 187
++
Sbjct: 159 TDLTLMIT 166


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
ECs2592   HTHFIS  90     4e-24    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 89.5 bits (222), Expect = 4e-24
Identities = 30/105 (28%), Positives = 51/105 (48%), Gaps = 3/105 (2%)

Query: 7 KFLVVDDFSTMRRIVRNLLKELGFNNVEEAEDGVDALNKLQAGGYGFVISDWNMPNMDGL 66
LV DD + +R ++ L G++ V + + AG V++D MP+ +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYD-VRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 67 ELLKTIRADGAMSALPVLMVTAEAKKENIIAAAQAGASGYVVKPF 111
+LL I+ LPVL+++A+ I A++ GA Y+ KPF
Sbjct: 64 DLLPRIKKARPD--LPVLVMSAQNTFMTAIKASEKGAYDYLPKPF 106


103  ECs1868  ECs1858  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
ECs1868   -1      14       -0.252000  peptide ABC transporter ATP-binding protein
ECs1867   -1      14       -0.339050  peptide ABC transporter ATP-binding protein
ECs1866   -1      13       -0.163309  membrane transport protein
ECs1865   -1      12       0.128523   outer membrane channel protein
ECs1864   -2      15       -0.027399  multidrug-efflux transport protein
ECs1863   -1      16       -0.841017  multidrug-efflux transport protein
ECs1862   -1      17       -0.998795  transcription regulatory protein
ECs1861   -1      17       -0.934941  enoyl-ACP reductase
ECs1860   -1      17       -1.160294  oxidoreductase
ECs1859   0       17       -1.366703  exoribonuclease II
ECs1858   1       19       -1.966448  RNase II stability modulator
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
ECs1868   HTHFIS  31     0.007    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 31.0 bits (70), Expect = 0.007
Identities = 9/16 (56%), Positives = 14/16 (87%)

Query: 38 LVGESGSGKSLIAKAI 53
+ GESG+GK L+A+A+
Sbjct: 165 ITGESGTGKELVARAL 180


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs1866   TCRTETA  67     2e-14    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 67.2 bits (164), Expect = 2e-14
Identities = 68/312 (21%), Positives = 114/312 (36%), Gaps = 18/312 (5%)

Query: 5 SLSWALILGLLAGIGPMCTDLYLPALPEMSEQLAATTTITQLTLTASLIGLGVGQLLFGP 64
L L L +G L +P LP + L + +T L + Q P
Sbjct: 6 PLIVILSTVALDAVG---IGLIMPVLPGLLRDLVHSNDVTA-HYGILLALYALMQFACAP 61

Query: 65 ----LSDKIGRKRPLILSLLLFIVSSILCATTNNIYWLVVWRFIQGIAGAGGSVLSRSIA 120
LSD+ GR+ L++SL V + AT ++ L + R + GI GA G+V IA
Sbjct: 62 VLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIA 121

Query: 121 RDKYQGVTLTQFFALLMTVNGLAPVLSPVLGGYIVSTFDWRTLFWVMAEISTVLLLGCVL 180
D G + F + G V PVLGG + F F+ A ++ + L
Sbjct: 122 -DITDGDERARHFGFMSACFGFGMVAGPVLGGLM-GGFSPHAPFFAAAALNGLNFLTGCF 179

Query: 181 FINETLPENKRGSSL----LLTGRSVVQNRRFMRFCLIQSFMLAGLFAYIGSSSFVL--Q 234
+ E+ +R L + + + F + L + ++ +V+ +
Sbjct: 180 LLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFF-IMQLVGQVPAALWVIFGE 238

Query: 235 KEFGFSPMQFSLVFGLNGI-GLIIASWIFSRLARRINAMTLLRGGLIAAILCALLTVLCA 293
F + + GI + + I +A R+ L G+IA +L
Sbjct: 239 DRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFAT 298

Query: 294 WTQLPIPALVAL 305
+ P +V L
Sbjct: 299 RGWMAFPIMVLL 310


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
ECs1865   RTXTOXIND  31     0.008    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 31.3 bits (71), Expect = 0.008
Identities = 26/166 (15%), Positives = 49/166 (29%), Gaps = 11/166 (6%)

Query: 70 DVQKAIADIDSARALYGQTNASLFPTVNAALSSTRSRSLANGTETTAEADGTVSSFTLDL 129
A AD ++ Q +RS L E + + + +
Sbjct: 128 TALGAEADTLKTQSSLLQARL----EQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEE 183

Query: 130 FGRNQSLSRAARETWLASEFTAQNTRLTLIAEISTAWLTLAADNSNLALAKETMTSAENS 189
R SL + TW ++ + AE T + + + K
Sbjct: 184 VLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKS-------R 236

Query: 190 LKIIQRQQQVGTAAATDVSEAMSVYQQARASVASYQTQVMQDKNAL 235
L A V E + Y +A + Y++Q+ Q ++ +
Sbjct: 237 LDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEI 282


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs1864   ACRIFLAVINRP  1105   0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 1105 bits (2860), Expect = 0.0
Identities = 550/983 (55%), Positives = 713/983 (72%), Gaps = 8/983 (0%)

Query: 3 SRFFVRRPVFAWVIAILIMLAGILAIRTLPVAQYPDVAPPTIKISATYTGASAETLENSV 62
+ FF+RRP+FAWV+AI++M+AG LAI LPVAQYP +APP + +SA Y GA A+T++++V
Sbjct: 2 ANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTV 61

Query: 63 TQVIEQQLTGLDNLLYFSSTSSSDGSVSINVTFEQGTDPDTAQ--VQNKIQQAESRLPSE 120
TQVIEQ + G+DNL+Y SSTS S GSV+I +TF+ GTDPD AQ VQNK+Q A LP E
Sbjct: 62 TQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQE 121

Query: 121 VQQTGVTVEKSQSNFLLIAAVYDTTDKASSSDIADWLVSNVQDPLARVEGVGSLQVFGAE 180
VQQ G++VEKS S++L++A + DI+D++ SNV+D L+R+ GVG +Q+FGA+
Sbjct: 122 VQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGAQ 181

Query: 181 YAMRIWLDPAKLASYSLMPSDVQSAIEAQNVQVTAGKIGALPSPNTQQLTATVRAQSRLQ 240
YAMRIWLD L Y L P DV + ++ QN Q+ AG++G P+ QQL A++ AQ+R +
Sbjct: 182 YAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFK 241

Query: 241 TVDQFKNIIVKSQSDSAVVRIKDVARVEMGSEDYTAIGKLNGHPSAGVAVMLSPGANALN 300
++F + ++ SD +VVR+KDVARVE+G E+Y I ++NG P+AG+ + L+ GANAL+
Sbjct: 242 NPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANALD 301

Query: 301 TATLVKDKIAEFQRNMPQGYDIAYPKDSTEFIKISVEDVIQTLFEAIVLVVCVMYLFLQN 360
TA +K K+AE Q PQG + YP D+T F+++S+ +V++TLFEAI+LV VMYLFLQN
Sbjct: 302 TAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQN 361

Query: 361 LRATLIPALAVPVVLLGTFGVLALFGYSINTLTLFAMVLAIGLLVDDAIVVVENVERIMR 420
+RATLIP +AVPVVLLGTF +LA FGYSINTLT+F MVLAIGLLVDDAIVVVENVER+M
Sbjct: 362 MRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMM 421

Query: 421 DKGLPAREATEKSMGEISGALVAIALVLSAVFLPMAFFGGSTGVIYRQFSITIISAMLLS 480
+ LP +EATEKSM +I GALV IA+VLSAVF+PMAFFGGSTG IYRQFSITI+SAM LS
Sbjct: 422 EDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALS 481

Query: 481 VVVALTLTPALCGSVL----QHVPPHKKGFFGAFDRFYRRTEDKYQRGVIYVLRRAARTM 536
V+VAL LTPALC ++L +K GFFG F+ + + + Y V +L R +
Sbjct: 482 VLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRYL 541

Query: 537 GLYLVLGGGMALMMWKLPGSFLPTEDQGEIMVQYTLPAGATAARTAEVNRQIVDWFLINE 596
+Y ++ GM ++ +LP SFLP EDQG + LPAGAT RT +V Q+ D++L NE
Sbjct: 542 LIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKNE 601

Query: 597 KANTDVIFTVDGFSFSGSGQNTGMAFVSLKNWSQRKGAENTAQAIALRATKELGTIRDAT 656
KAN + +FTV+GFSFSG QN GMAFVSLK W +R G EN+A+A+ RA ELG IRD
Sbjct: 602 KANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDGF 661

Query: 657 VFAMTPPAVDGLGQSNGFTFELLANGGTDRETLLQMRNQLIEKANQSP-ELHSVRANDLP 715
V PA+ LG + GF FEL+ G + L Q RNQL+ A Q P L SVR N L
Sbjct: 662 VIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNGLE 721

Query: 716 QMPQLQVDIDSNKAVSLGLSLNDVTDTLSSAWGGTYVNDFIDRGRVKKVYIQGDSEFRSA 775
Q ++++D KA +LG+SL+D+ T+S+A GGTYVNDFIDRGRVKK+Y+Q D++FR
Sbjct: 722 DTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFRML 781

Query: 776 PSDLGKWFVRGSDNAMTPFSAFATTRWLYGPERLVRYNGSAAYEIQGENATGFSSGDAMT 835
P D+ K +VR ++ M PFSAF T+ W+YG RL RYNG + EIQGE A G SSGDAM
Sbjct: 782 PEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDAMA 841

Query: 836 KMEELANSLPAGTTWAWSGLSLQEKLASGQALSLYAVSILVVFLCLAALYESWSVPFSVI 895
ME LA+ LPAG + W+G+S QE+L+ QA +L A+S +VVFLCLAALYESWS+P SV+
Sbjct: 842 LMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVSVM 901

Query: 896 LVIPLGLLGAALAAWMRDLNNDVYFQVALLTTIGLSSKNAILIVEFA-EAAVAEGYSLSR 954
LV+PLG++G LAA + + NDVYF V LLTTIGLS+KNAILIVEFA + EG +
Sbjct: 902 LVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGVVE 961

Query: 955 AALRAAQTRLRPIIMTSLAFIAG 977
A L A + RLRPI+MTSLAFI G
Sbjct: 962 ATLMAVRMRLRPILMTSLAFILG 984



Score = 89.9 bits (223), Expect = 3e-20
Identities = 76/502 (15%), Positives = 164/502 (32%), Gaps = 26/502 (5%)

Query: 6 FVRRPVFAWVIAILIMLAGILAIRTLPVAQYPDVAPPTIKISA-TYTGASAETLENSVTQ 64
+ +I LI+ ++ LP + P+ GA+ E + + Q
Sbjct: 533 ILGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQ 592

Query: 65 VIEQQLTGLDNLLY-------FSSTSSSDGSVSINVTFEQGTDPDTAQ--VQNKIQQAES 115
V + L + FS + + + V+ + + + + + I +A+
Sbjct: 593 VTDYYLKNEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKM 652

Query: 116 RLPSEVQQTGVTVEKSQSNFLLIAAVYDTTDKASSSDIADWLVSNVQDPLARVEGVGS-- 173
L + L A +D + D L L +
Sbjct: 653 ELGKIRDGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASL 712

Query: 174 ----LQVFGAEYAMRIWLDPAKLASYSLMPSDVQSAIEAQNVQVTAGKIGALPSPNTQQL 229
++ +D K + + SD+ I + ++L
Sbjct: 713 VSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDF--IDRGRVKKL 770

Query: 230 TATVRAQSRLQTVDQFKNIIVKSQSDSAVVRIKDVARVEMGSEDYTAIGKLNGHPSAGVA 289
A+ R + + V+S ++ +V + + NG PS +
Sbjct: 771 YVQADAKFR-MLPEDVDKLYVRS-ANGEMVPFSAFTTSHWV-YGSPRLERYNGLPSMEIQ 827

Query: 290 VMLSPGANALNTATLVKDKIAEFQRNMPQGYDIAYPKDSTEFIKISVEDVIQTLFEAIVL 349
+PG + A + + +A +P G + S + S + + V+
Sbjct: 828 GEAAPGTS-SGDAMALMENLAS---KLPAGIGYDWTGMSYQERL-SGNQAPALVAISFVV 882

Query: 350 VVCVMYLFLQNLRATLIPALAVPVVLLGTFGVLALFGYSINTLTLFAMVLAIGLLVDDAI 409
V + ++ + L VP+ ++G LF + + ++ IGL +AI
Sbjct: 883 VFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAI 942

Query: 410 VVVENVERIMRDKGLPAREATEKSMGEISGALVAIALVLSAVFLPMAFFGGSTGVIYRQF 469
++VE + +M +G EAT ++ ++ +L LP+A G+
Sbjct: 943 LIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAV 1002

Query: 470 SITIISAMLLSVVVALTLTPAL 491
I ++ M+ + ++A+ P
Sbjct: 1003 GIGVMGGMVSATLLAIFFVPVF 1024


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
ECs1863   RTXTOXIND  48     3e-08    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 47.9 bits (114), Expect = 3e-08
Identities = 28/133 (21%), Positives = 56/133 (42%), Gaps = 10/133 (7%)

Query: 41 PVSVVSELTGR-TSAALSAEVRPQVGGIIQKRLFKEGDLVKAGQPLYQIDAASYQAAWNE 99
V +V+ G+ T + S E++P I+++ + KEG+ V+ G L ++ A +A +
Sbjct: 79 QVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLK 138

Query: 100 ARAALQQAQALVKADCQKAQRYARLVKENGVSQQDADDAQSTCAQDKASV--------AA 151
+++L QA+ Q R L K + D Q+ ++ + +
Sbjct: 139 TQSSLLQARLEQ-TRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFST 197

Query: 152 KKAALETARINLD 164
+ +NLD
Sbjct: 198 WQNQKYQKELNLD 210



Score = 32.1 bits (73), Expect = 0.004
Identities = 17/114 (14%), Positives = 37/114 (32%), Gaps = 5/114 (4%)

Query: 83 QPLYQIDAASYQAAWN--EARAALQQAQALVKADCQKAQRYARLVKEN--GVSQQDADDA 138
L A + A + K+ ++ + KE V+Q ++
Sbjct: 241 SSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEI 300

Query: 139 QSTCAQDKASVAAKKAALETARINLDWTTVTAPISGRI-GISSVTPGALVTASQ 191
Q ++ L + + AP+S ++ + T G +VT ++
Sbjct: 301 LDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAE 354


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs1862   HTHTETR  55     8e-12    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 55.4 bits (133), Expect = 8e-12
Identities = 17/65 (26%), Positives = 33/65 (50%)

Query: 1 MTSKLEIRHKQRQDEIINAARRCFRRCGFHAASMSQIASEAQLSVGQIYRYFANKDAIIE 60
M K + ++ + I++ A R F + G + S+ +IA A ++ G IY +F +K +
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60

Query: 61 EMVRR 65
E+
Sbjct: 61 EIWEL 65


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs1861   DHBDHDRGNASE  50     1e-09    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 50.4 bits (120), Expect = 1e-09
Identities = 51/260 (19%), Positives = 98/260 (37%), Gaps = 22/260 (8%)

Query: 4 LSGKRILVTGVASKLSIAYGIAQAMHREGAEL-AFTYQNDKLKGRVEEFAAQLGSDIVLQ 62
+ GK +TG A I +A+ + +GA + A Y +KL+ V A+
Sbjct: 6 IEGKIAFITGAAQ--GIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFP 63

Query: 63 CDVAEDASIDTMFAELGKVWPKFDGFVHSIGF---APGDQLDGDYVNAVTREGFKIAHDI 119
DV + A+ID + A + + D V+ G L + A F +
Sbjct: 64 ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEAT----FSVN--- 116

Query: 120 SSYSFVAMAKACRSMLNP-GSALLTLSYLGAERAIPNYNVMGLAKASLEANVRYMANAMG 178
S+ F A + M++ +++T+ A + +KA+ + + +
Sbjct: 117 STGVFNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELA 176

Query: 179 PEGVRVNAISAGPIRTLAASGI--------KDFRKMLAHCEAVTPIRRTVTIEDVGNSAA 230
+R N +S G T + + + L + P+++ D+ ++
Sbjct: 177 EYNIRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVL 236

Query: 231 FLCSDLSAGISGEVVHVDGG 250
FL S + I+ + VDGG
Sbjct: 237 FLVSGQAGHITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs1858   PF08280  31     0.018    M protein trans-acting positive regulator
		>PF08280#M protein trans-acting positive regulator

Length = 530

Score = 31.0 bits (70), Expect = 0.018
Identities = 21/105 (20%), Positives = 36/105 (34%), Gaps = 2/105 (1%)

Query: 526 PIDVELTESCLIENDELALSVIQQFSQLGAQVHLDDFGTAYSSLSQLARFPIDAIKLDQV 585
P+ V S I L S + FS + + ++ Q+ D +
Sbjct: 425 PLVVVFVASNFINAHLLTDSFPRYFSDKS--IDFHSYYLLQDNVYQIPDLKPDLVITHSQ 482

Query: 586 FVRDIHKQPVSQSLVRAIVAVAQALNLQVIAEGVESAKEDAFLTK 630
+ +H + V I L++Q + V+ K A LTK
Sbjct: 483 LIPFVHHELTKGIAVAEISFDESILSIQELMYQVKEEKFQADLTK 527


104  ECs1728  ECs1720  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
ECs1728   -2      12       0.481556   nitrite extrusion protein
ECs5423   -1      15       0.507362   hypothetical protein
ECs1727   -2      15       0.841729   nitrate/nitrite sensor protein NarX
ECs1726   -3      15       -0.450172  transcriptional regulator NarL
ECs1725   -2      13       -0.459851  hypothetical protein
ECs1724   -3      11       -1.411432  hypothetical protein
ECs1723   -3      12       -1.839437  cation transport regulator
ECs1722   -2      13       -1.550093  cation transport regulator
ECs1721   -3      13       -0.758265  calcium/sodium:proton antiporter
ECs1720   0       15       0.381204   2-dehydro-3-deoxyphosphooctonate aldolase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs1728   ACRIFLAVINRP  31     0.010    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 31.3 bits (71), Expect = 0.010
Identities = 35/166 (21%), Positives = 60/166 (36%), Gaps = 22/166 (13%)

Query: 258 IMSLLYLATFGSFIGFSAGFAMLSKTQFPDVQILQYAFFGPFIGALARSA---GGALSDR 314
I+S + L+ + I A A L K + + FFG F S ++
Sbjct: 474 IVSAMALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKI 533

Query: 315 LGGTRVTLVNFILMAIFSGLLFLTLPTD----GQGGSFMAFFAVFLALFLTAGLGSGSTF 370
LG T L+ + L+ +LFL LP+ G F+ L +G+T
Sbjct: 534 LGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTM----------IQLPAGATQ 583

Query: 371 QMISVIFRKLTMDRVKAEGGSDER-----AMREAATDTAAALGFIS 411
+ + ++T +K E + E + A + F+S
Sbjct: 584 ERTQKVLDQVTDYYLKNEKANVESVFTVNGFSFSGQAQNAGMAFVS 629


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs1727   PF06580  53     1e-09    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 53.3 bits (128), Expect = 1e-09
Identities = 36/172 (20%), Positives = 73/172 (42%), Gaps = 23/172 (13%)

Query: 424 PESSRELLSQIRNELNASWAQLRELLTTFRLQLTEPGLRPALEASCEEYSAKFGFPVKLD 483
P +RE+L+ + + S + +LT +++ + S +F ++ +
Sbjct: 190 PTKAREMLTSLSELMRYSLRYSNARQVSLADELT------VVDSYLQLASIQFEDRLQFE 243

Query: 484 YQLPPRL----VPSHQAIHLLQIAREALSNALKH-----SQASEVVVTVAQNDNQVKLTV 534
Q+ P + VP L+Q E N +KH Q ++++ +++ V L V
Sbjct: 244 NQINPAIMDVQVPPM----LVQTLVE---NGIKHGIAQLPQGGKILLKGTKDNGTVTLEV 296

Query: 535 QDNGCGVPENAIRSNHYGMIIMRDRAQSLRG-DCRVRRRESGGTEVVVTFIP 585
++ G +N S G+ +R+R Q L G + +++ E G + IP
Sbjct: 297 ENTGSLALKNTKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
ECs1726   HTHFIS  74     2e-17    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 73.7 bits (181), Expect = 2e-17
Identities = 32/117 (27%), Positives = 56/117 (47%), Gaps = 2/117 (1%)

Query: 7 ATILLIDDHPMLRTGVKQLISMAPDITVVGEASNGEQGIELAESLDPDLILLDLNMPGMN 66
ATIL+ DD +RT + Q +S A + SN + D DL++ D+ MP N
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRI--TSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 67 GLETLDKLREKSLSGRIVVFSVSNHEEDVVTALKRGADGYLLKDMEPEDLLKALHQA 123
+ L ++++ ++V S N + A ++GA YL K + +L+ + +A
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRA 118


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs1725   INTIMIN  258    8e-80    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 258 bits (660), Expect = 8e-80
Identities = 120/378 (31%), Positives = 196/378 (51%), Gaps = 21/378 (5%)

Query: 32 GEQAKAFALGKVRDALSQQVNQHVESWLSPWGNASVDVKVDNEGHFTGSRGSWFVPLQDN 91
G+ AK ALG + Q + +++WL +G A V+++ N F GS + +P D+
Sbjct: 184 GDYAKDTALGIAGN----QASSQLQAWLQHYGTAEVNLQSGNN--FDGSSLDFLLPFYDS 237

Query: 92 DRYLTWSQLGLTQQDDGLVSNVGVGQRWARGNWLVGYNTFYDNLLDENLQRAGFGAEAWG 151
++ L + Q+G D +N+G GQR+ ++GYN F D + R G G E W
Sbjct: 238 EKMLAFGQVGARYIDSRFTANLGAGQRFFLPENMLGYNVFIDQDFSGDNTRLGIGGEYWR 297

Query: 152 EYLRLSANFYQPFAAWHE--QTATQEQRMARGYDLTARMRMPFYQHLNTSVSVEQYFGDR 209
+Y + S N Y + WHE ++R A G+D+ +P Y L + EQY+GD
Sbjct: 298 DYFKSSVNGYFRMSGWHESYNKKDYDERPANGFDIRFNGYLPSYPALGAKLMYEQYYGDN 357

Query: 210 VDLFNSGTGYHNPVALSLGLNYTPVPLVTVTAQHKQGESGENQNNLGLNLNYRFGVPLKK 269
V LFNS NP A ++G+NYTP+PLVT+ ++ G EN + Y+F P +
Sbjct: 358 VALFNSDKLQSNPGAATVGVNYTPIPLVTMGIDYRHGTGNENDLLYSMQFRYQFDKPWSQ 417

Query: 270 QLSAGEVAESQSLRGSRYDNPQRNNLPTLEYRQRKTLTVFLATPPWDLKPGETVPLKLQI 329
Q+ V E ++L GSRYD QRNN LEY+++ L++ + + T ++L +
Sbjct: 418 QIEPQYVNELRTLSGSRYDLVQRNNNIILEYKKQDILSLNI-PHDINGTERSTQKIQLIV 476

Query: 330 RSRYGIRQLIWQGDTQILS-----LTPGAQANSAEGWTLIMPDWQNGEGASNHWRLSVVV 384
+S+YG+ +++W D+ + S G+Q SA+ + I+P + +G SN ++++
Sbjct: 477 KSKYGLDRIVWD-DSALRSQGGQIQHSGSQ--SAQDYQAILPAYV--QGGSNVYKVTARA 531

Query: 385 EDNQGQRVSSNEITLTLV 402
D G SSN + LT+
Sbjct: 532 YDRNGN--SSNNVLLTIT 547


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs1720   TRNSINTIMINR  29     0.033    Translocated intimin receptor (Tir) signature.
		>TRNSINTIMINR#Translocated intimin receptor (Tir) signature.

Length = 549

Score = 28.5 bits (63), Expect = 0.033
Identities = 35/157 (22%), Positives = 58/157 (36%), Gaps = 23/157 (14%)

Query: 82 QELKQTFGVKIITDVHEPSQAQPVADVVDVIQLPAFLARQTDLVEAMAKTGAVINVKKPQ 141
Q +QT T V + + P V + Q + + D ++A T
Sbjct: 391 QPAEQTTTTTTHTVVQQQTGGIPQHKVALMPQERRRFSDRRDSQGSVASTH--------- 441

Query: 142 FVSPGQMGNIVDKFKEGGNEKVILCDRGA-NFGYDNLVVDMLGFSIMKKVSGNSPVIFDV 200
+V+ + E G + L YD + D G+S+++ SG+ P V
Sbjct: 442 --WSDSSSEVVNPYAEVGGARNSLSAHQPEEHIYDEVAADP-GYSVIQNFSGSGP----V 494

Query: 201 THALQCRDPFGAASGGRRAQVAELA-RAGMAVGLAGL 236
T L G G ++ A LA G+ +G+ GL
Sbjct: 495 TGRL-----IGTPGQGIQSTYALLANSGGLRLGMGGL 526


105  ECs1707  ECs1697  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
ECs1707   -1      17       -1.224929  hypothetical protein
ECs1706   -2      16       -0.296221  hypothetical protein
ECs1705   -2      16       0.248513   dihydroxyacetone kinase subunit DhaK
ECs1704   -2      19       0.377525   dihydroxyacetone kinase subunit DhaL
ECs1703   -2      19       0.234622   dihydroxyacetone kinase subunit DhaM
ECs1700   -3      17       -0.038572  hypothetical protein
ECs1699   -2      16       0.733718   ABC transporter ATP-binding protein
ECs1698   -2      16       0.256281   ABC transporter permease
ECs1697   -3      15       -0.133455  ferric enterobactin transport ATP-binding
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs1707   PRTACTNFAMLY  44     2e-06    Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 43.5 bits (102), Expect = 2e-06
Identities = 117/548 (21%), Positives = 198/548 (36%), Gaps = 92/548 (16%)

Query: 14 RLAELKIRSPSIQLIKFGAIGLNAILFSPLLIAADTGSQYGTNITINDGDRI---TGDTA 70
+ A L+ + ++ L GA ++ I Q+G +I +D + +G T
Sbjct: 10 KAAPLRRTTLAMALGALGAAPAAHADWNNQSIVKTGERQHGIHIQGSDPGGVRTASGTTI 69

Query: 71 DPSGN-LYGVMTPAGNTPGNINLGNDVTVN---VNDASGYAKGIIIQGKNSSLTANRLTV 126
SG G++ N + N + ++D + K L A+ T+
Sbjct: 70 KVSGRQAQGILLE--NPAAELQFRNGSVTSSGQLSDDGIRRFLGTVTVKAGKLVADHATL 127

Query: 127 DVVGQT---SAIGINLIGDYTHADLGTGSTIKSNDDGIIIGHSSTLTATQFTIENSNGIG 183
VG T I + + G+ A + ST++ G+ I + +T + I + G+
Sbjct: 128 ANVGDTWDDDGIALYVAGEQAQASIAD-STLQGAG-GVQIERGANVTVQRSAIVD-GGLH 184

Query: 184 LTINDYGTSVDLGSGSKIKTDGS-TGVYIGGLNGNNANGAARFTATDLTID---VQGYSA 239
+ DL + D + T V G + A++LT+D + G A
Sbjct: 185 IGALQSLQPEDLPPSRVVLRDTNVTAVPASGAPA----AVSVLGASELTLDGGHITGGRA 240

Query: 240 MGINVQKNSVVDLGTNSSIKTSGDNAHGLWSFGQVSANAL-------TVDVTGAAANGVE 292
G+ + +VV L ++I+ A G G V A+ GV+
Sbjct: 241 AGVAAMQGAVVHL-QRATIRRGDAPAGGAVPGGAVPGGAVPGGFGPGGFGPVLDGWYGVD 299

Query: 293 VRGGTTTIGADSHISSAQGGGLVTSGSDATINFSG---TAAQRNSIFSGGSYGASAQTAT 349
V G + + A S + + + G + G A + SG +A N I +GG+ + Q A
Sbjct: 300 VSGSSVEL-AQSIVEAPELGAAIRVGRGARVTVSGGSLSAPHGNVIETGGARRFAPQAAP 358

Query: 350 AVINMQNTDITVDRNGSLALGLWALSGGRITGDSLAITGAAGARGIYAMTNSQIDLTSDL 409
I +Q G+ A G L L +TG A A+G T + +
Sbjct: 359 LSITLQA--------GAHAQGKALLYRVLPEPVKLTLTGGADAQGDIVATELPSIPGTSI 410

Query: 410 VIDMSTPDQMAIATQHDDGYAASRINASGRMLINGSVLSKGGLINLDMHPGSVWTGSSLS 469
P +A+A+ + WTG++
Sbjct: 411 -----GPLDVALAS------------------------------------QARWTGAT-- 427

Query: 470 DNVNGGKLDVAMNNSVWNVTSNSNLDTLAL-SHSTVDFASHGSTAGTFTTLNVENLSGNS 528
V+ +D N+ W +T NSN+ L L S +VDF + AG F L V L+G+
Sbjct: 428 RAVDSLSID----NATWVMTDNSNVGALRLASDGSVDFQQ-PAEAGRFKVLTVNTLAGSG 482

Query: 529 TFIMRADV 536
F M
Sbjct: 483 LFRMNVFA 490


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
ECs1704   adhesinmafb  32     0.002    Neisseria meningitidis: adhesin MafB signature.
		>adhesinmafb#Neisseria meningitidis: adhesin MafB signature.

Length = 467

Score = 31.6 bits (71), Expect = 0.002
Identities = 10/47 (21%), Positives = 25/47 (53%)

Query: 138 VESLRQSSEQNLSVPAALEAASSIAEFAAQSTITMQARKGRASYLGE 184
E++ + ++N + +EA ++A A + + A+ G+A+ G+
Sbjct: 293 REAVDRWIQENPNAAETVEAVFNVAAAAKVAKLAKAAKPGKAAVSGD 339


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs1703   PHPHTRNFRASE  141    1e-38    Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase signature.

Length = 572

Score = 141 bits (357), Expect = 1e-38
Identities = 63/206 (30%), Positives = 102/206 (49%), Gaps = 1/206 (0%)

Query: 259 GKAFYYQPVLCTVQAKSPLTVEEEQERLRQAIDFTLLDLMTLTAKAEASGLDDIAAIFSG 318
KAF + ++ S V E E+L A++ + +L + + EAS D A IF+
Sbjct: 17 AKAFIHLEPNVDIEKTSITDVSTEIEKLTAALEKSKEELRAIKDQTEASMGADKAEIFAA 76

Query: 319 HHTLLDDPELLAAASELLQHEHCTAEYAWQQVLKELSQQYQQLDDEYLQARYIDVDDLLH 378
H +LDDPEL+ +++E AEYA ++V ++ +D+EY++ R D+ D+
Sbjct: 77 HLLVLDDPELVDGIKGKIENEQMNAEYALKEVSDMFVSMFESMDNEYMKERAADIRDVSK 136

Query: 379 RTLVHLT-QTKEELPQFNSPTILLAENIYPSPVLQLDPAVVKGICLSAGSPVSHSALIAR 437
R L HL L T+++AE++ PS QL+ VKG G SHSA+++R
Sbjct: 137 RVLGHLIGVETGSLATIAEETVIIAEDLTPSDTAQLNKQFVKGFATDIGGRTSHSAIMSR 196

Query: 438 ELGIGWICQQGEKLYAIQPEEKLTLD 463
L I + E IQ + + +D
Sbjct: 197 SLEIPAVVGTKEVTEKIQHGDMVIVD 222


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
ECs1699   BACINVASINB  29     0.044    Salmonella/Shigella invasin protein B signature.
		>BACINVASINB#Salmonella/Shigella invasin protein B signature.

Length = 593

Score = 28.6 bits (63), Expect = 0.044
Identities = 21/51 (41%), Positives = 29/51 (56%), Gaps = 2/51 (3%)

Query: 55 LLLAVAPEKMVGFSSFDFARQALIPLSEHIRQLPRLGRLAGRASTLSLEGL 105
L + VA E + + F +QAL P+ EH+ L L L G+A T +LEGL
Sbjct: 348 LAVMVADEIVKAATGVSFIQQALNPIMEHV--LKPLMELIGKAITKALEGL 396


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
ECs1697   LCRVANTIGEN  30     0.010    Low calcium response V antigen signature.
		>LCRVANTIGEN#Low calcium response V antigen signature.

Length = 326

Score = 29.7 bits (66), Expect = 0.010
Identities = 19/63 (30%), Positives = 28/63 (44%), Gaps = 7/63 (11%)

Query: 193 LMSTHHPLHANAIADSIIQVEPDGRVTQGLPTEQLTTNKLAAL------YRVSADQIHHH 246
+ H L A+ I D I++V D G +L +LA L Y V +I+ H
Sbjct: 119 MAVMHFSLTADRIDDDILKVIVDSMNHHGDARSKL-REELAELTAELKIYSVIQAEINKH 177

Query: 247 LSA 249
LS+
Sbjct: 178 LSS 180


106  ECs1650  ECs1643  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
ECs1650   1       21       4.479241   tail fiber protein
ECs1649   3       25       5.214304   membrane protein
ECs1648   2       26       5.765265   host specificity protein
ECs1647   2       26       6.458341   tail assembly protein
ECs1646   2       27       5.950269   tail assembly protein
ECs1645   2       27       5.829138   minor tail protein
ECs1644   2       27       5.811620   minor tail protein
ECs1643   1       26       5.537099   tail length tape measure protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs1650   CHANLCOLICIN  44     2e-06    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 44.3 bits (104), Expect = 2e-06
Identities = 54/319 (16%), Positives = 118/319 (36%)

Query: 152 ARAASTSAGQAASSAQSASSSAGTASTKATEASKSAAAAESSKSAAATSAGAAKTSETNA 211
+ S S AA A + S+A T+A +A+++ AAAE+ A A + +
Sbjct: 39 GKGGSKSESSAAIHATAKWSTAQLKKTQAEQAARAKAAAEAQAKAKANRDALTQRLKDIV 98

Query: 212 AVSQQSAATSASTATTKASEAASSARDASASKEAAKSSETSAASSASSAASSATAAGNSA 271
+ + A+ +AT A ++ + AK+ E + + ++ + A
Sbjct: 99 NEALRHNASRTPSATELAHANNAAMQAEDERLRLAKAEEKARKEAEAAEKAFQEAEQRRK 158

Query: 272 KAAKTSETNAKSSETAAEQSASAAAGSKTAAALSASAASTSAGQASASATAAGKSAESAA 331
+ + + + A + AA S+ A A+ + SA Q+ ++
Sbjct: 159 EIEREKAETERQLKLAEAEEKRLAALSEEAKAVEIAQKKLSAAQSEVVKMDGEIKTLNSR 218

Query: 332 SSASTATTKAGEATEQASAAASSASAAKTSETNAKASETSAESSKTAAASSASSAASSAS 391
S+S A T + ++AK E + + S ++ A
Sbjct: 219 LSSSIHARDAEMKTLAGKRNELAQASAKYKELDELVKKLSPRANDPLQNRPFFEATRRRV 278

Query: 392 SASASKDEATRQASAAKSSATTASTKATEAAGSATAAAQSKSTAESAATRAETAAKRAED 451
A ++E +Q +A+++ + T+ + + + +++ + AE K+A++
Sbjct: 279 GAGKIREEKQKQVTASETRINRINADITQIQKAISQVSNNRNAGIARVHEAEENLKKAQN 338

Query: 452 IASAVALEDASTTKKGIVQ 470
++DA Q
Sbjct: 339 NLLNSQIKDAVDATVSFYQ 357



Score = 31.6 bits (71), Expect = 0.020
Identities = 58/332 (17%), Positives = 111/332 (33%), Gaps = 22/332 (6%)

Query: 313 AGQASASATAAGKSAESAASSA----STATTKAGEATEQASAAASSASAAKTSETNAKAS 368
+G KS SAA A STA K +A + A A A++ + AK +
Sbjct: 32 SGSGGGGGKGGSKSESSAAIHATAKWSTAQLKKTQAEQAARAKAAAEAQAKAKANRDALT 91

Query: 369 E---------TSAESSKTAAASSASSAASSASSASASKDEATRQASAAKSSATTASTKAT 419
+ +S+T +A+ + A ++A A + + A+ A A
Sbjct: 92 QRLKDIVNEALRHNASRTPSATELAHANNAAMQAEDERLRLAKAEEKARKEAEAAEKAFQ 151

Query: 420 EAAGSATAAAQSKSTAESAATRAETAAKRAEDIASA-----VALEDASTTKKGIVQLSSA 474
EA + K+ E AE KR ++ +A + S + +V++
Sbjct: 152 EAEQRRKEIEREKAETERQLKLAEAEEKRLAALSEEAKAVEIAQKKLSAAQSEVVKMDGE 211

Query: 475 TNSTSESLAATPKAVKAAYELANGKYTAQDATTAQKGIVQLSNATNSTSEMLAATPKSVK 534
+ + L+++ A A + GK +A+ + + A P +
Sbjct: 212 IKTLNSRLSSSIHARDAEMKTLAGKRNELAQASAK---YKELDELVKKLSPRANDPLQNR 268

Query: 535 AAYDLANGKYTAQDAT-TAQKGIVQLSSATNSASETLAATPKAVKAANDNANGRVPSARK 593
++ + A QK + + N + + KA+ ++N N + +
Sbjct: 269 PFFEATRRRVGAGKIREEKQKQVTASETRINRINADITQIQKAISQVSNNRNAGIARVHE 328

Query: 594 VNGKALSSDITLTPKDIGTLNSTTMSFSGGAG 625
+ L I T+SF
Sbjct: 329 AEENLKKAQNNLLNSQIKDAVDATVSFYQTLT 360


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs1649   ENTEROVIROMP  139    2e-44    Enterobacterial virulence outer membrane protein si...
		>ENTEROVIROMP#Enterobacterial virulence outer membrane protein signature.

Length = 171

Score = 139 bits (352), Expect = 2e-44
Identities = 64/200 (32%), Positives = 102/200 (51%), Gaps = 30/200 (15%)

Query: 1 MRKLYAAILSAAICLAVSGAPAWASEHQSTLSAGYLHARTNVPGSDDLNGINVKYRYEFT 60
M+K+ A + + A LA + + A+ ST++ GY + + + G N+KYRYE
Sbjct: 1 MKKI-ACLSALAAVLAFTAGTSVAA--TSTVTGGYAQSDAQGQMNK-MGGFNLKYRYEED 56

Query: 61 DT-LGLVTSFSYAGDKNRQLTRYSDTRWHEDSVRNRWFSVMAGPSVRVNEWFSAYAMAGV 119
++ LG++ SF+Y T S T D +N+++ + AGP+ R+N+W S Y + GV
Sbjct: 57 NSPLGVIGSFTY--------TEKSRTASSGDYNKNQYYGITAGPAYRINDWASIYGVVGV 108

Query: 120 AYSRVSTFSGDYLRVTDNKGKTHDVLTGSDDGRHSNTSLAWGAGVQFNPTESVAIDIAYE 179
Y + T T+ HD S+ ++GAG+QFNP E+VA+D +YE
Sbjct: 109 GYGKFQT--------TEYPTYKHD---------TSDYGFSYGAGLQFNPMENVALDFSYE 151

Query: 180 GSGSGDWRTDGFIVGVGYKF 199
S +I GVGY+F
Sbjct: 152 QSRIRSVDVGTWIAGVGYRF 171


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs1647   PF06291  28     0.015    Lambda prophage Bor protein
		>PF06291#Lambda prophage Bor protein

Length = 102

Score = 27.7 bits (61), Expect = 0.015
Identities = 13/40 (32%), Positives = 19/40 (47%), Gaps = 5/40 (12%)

Query: 135 MTGILFSLGASMVLGGVAQML-----APKARTPRTQTTDN 169
M +LFS +M++ G AQ P A TP+ T +
Sbjct: 6 MKKMLFSAALAMLITGCAQQTFTVGNKPTAVTPKETITHH 45


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
ECs1643   GPOSANCHOR  33     0.007    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 32.7 bits (74), Expect = 0.007
Identities = 45/277 (16%), Positives = 84/277 (30%), Gaps = 21/277 (7%)

Query: 238 LTAMARQFHNVTAEQIAYVAQLQRSGEEAGALQAANEAATKGFDDQTRRLKENMGTLETW 297
+A + A A A L+++ E A A+ A K + + L+ LE
Sbjct: 139 DSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKA 198

Query: 298 ADRTARAFKSMWDAVLDIGRP-DTAQEMLIKAEAAFKKADDIWNLRKDDYFVNDEARARY 356
+ + + + E A + A + + +A
Sbjct: 199 LEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAAL 258

Query: 357 WDDR---EKARLALEAARKKAEQQSQQDKNAQQQSDTEASRLKYTEEA-----QKAYERL 408
+ EKA + + + + + E + L++ + Q L
Sbjct: 259 EARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDL 318

Query: 409 QTPLEKYTARQEELNKALKDGKI-------LQADYNTLMAAAKKDYEATLKKPKQ----S 457
E + E K + KI L+ D + AKK EA +K ++ S
Sbjct: 319 DASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASR-EAKKQLEAEHQKLEEQNKIS 377

Query: 458 GVKVSAGDRQEDSAHAALLTLQAELRMLEKHAGANEK 494
+ R D++ A ++ L A EK
Sbjct: 378 EASRQSLRRDLDASREAKKQVEKALEEANSKLAALEK 414


107  ECs1462  ECs1454  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
ECs1462   0       13       2.307826    ribonuclease E
ECs1461   -1      9        1.352107    flagellar hook-associated protein FlgL
ECs1460   -1      13       2.498670    flagellar hook-associated protein FlgK
ECs1459   0       14       2.665398    flagellar rod assembly protein/muramidase FlgJ
ECs1458   2       14       2.619663    flagellar basal body P-ring protein
ECs1457   3       16       2.502625    flagellar basal body L-ring protein
ECs1456   2       16       2.589317    flagellar basal body rod protein FlgG
ECs1455   0       16       2.316563    flagellar basal body rod protein FlgF
ECs1454   1       16       1.019702    flagellar hook protein FlgE
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
ECs1462   IGASERPTASE  64     3e-12    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 64.3 bits (156), Expect = 3e-12
Identities = 47/288 (16%), Positives = 84/288 (29%), Gaps = 36/288 (12%)

Query: 513 PSEEEFAERKRPEQPALATFAMPDVPPAPT-PAEPAAPVVAPAPKAAPATPATPAQPGLL 571
P E+ + DVP P+ E A AP P APATP+
Sbjct: 983 PEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETT----- 1037

Query: 572 SRFFGALKALFSGGEETKPTEQPAPKAEAKPERQQDRRKPRQNNRRDRNERRDTRSER-- 629
ET + Q QN + + + ++
Sbjct: 1038 ---------------ETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQT 1082

Query: 630 TEGSDNREENRRNRRQAQQQTAETREGRQQAEVTEKARTADEQQAPRRERSRRRNDDKRQ 689
E + + E + + ++TA + + TEK + + + + + + Q
Sbjct: 1083 NEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQ 1142

Query: 690 AQ---QEAKALNVEEQSVQETEQEERVRPVQPRRKQRQLNQKVRYEQSV--AEETVVAPV 744
A+ + +N++E Q + +P + + Q V +V V P
Sbjct: 1143 AEPARENDPTVNIKEPQSQTNTTADTEQPA--KETSSNVEQPVTESTTVNTGNSVVENPE 1200

Query: 745 AEETVAAEPIVQEAPA------PRTELVKVPLPVVAQTAPEQQEENNA 786
+P V + R + VP V T A
Sbjct: 1201 NTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVA 1248



Score = 63.5 bits (154), Expect = 4e-12
Identities = 46/261 (17%), Positives = 81/261 (31%), Gaps = 26/261 (9%)

Query: 551 VAPAPKAAPATPATPAQPGLLSRFFGALKALFSGGEETKPTEQP-APKAEAKPERQQDRR 609
P + S E + E P P A A P
Sbjct: 991 TVDTTNITTPNNIQADVPSVPSN----------NEEIARVDEAPVPPPAPATPSETT--- 1037

Query: 610 KPRQNNRRDRNERRDTRSERTEGSDNREENRRNRRQAQQQTAETREGRQQAEV------T 663
N ++++++ D E +NR A++ + + Q EV T
Sbjct: 1038 -----ETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSET 1092

Query: 664 EKARTADEQQAPRRERSRRRNDDKRQAQQEAKALNVEEQSVQETEQEERVRPVQPRRKQR 723
++ +T + ++ E+ + + + Q+ K + + QE + + + R
Sbjct: 1093 KETQTTETKETATVEKEEKAKVETEKTQEVPK-VTSQVSPKQEQSETVQPQAEPARENDP 1151

Query: 724 QLNQKVRYEQSVAEETVVAPVAEETVAAEPIVQEAPAPRTELVKVPLPVVAQTAPEQQEE 783
+N K Q+ P E + E V E+ T V P A Q
Sbjct: 1152 TVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTV 1211

Query: 784 NNADNRDNGGMPRRSRRSPRH 804
N+ + RRS RS H
Sbjct: 1212 NSESSNKPKNRHRRSVRSVPH 1232


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
ECs1461   FLAGELLIN  46     1e-07    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 45.8 bits (108), Expect = 1e-07
Identities = 41/226 (18%), Positives = 81/226 (35%), Gaps = 9/226 (3%)

Query: 7 MMYQQNMRGITNSQAEWMKYGEQMSTGKRVVNPSDDPIAASQAVVLSQAQAQNSQYTLAR 66
++ Q N+ +S + + E++S+G R+ + DD + A + +Q +
Sbjct: 11 LLTQNNLNKSQSSLSSAI---ERLSSGLRINSAKDDAAGQAIANRFTSNIKGLTQASRNA 67

Query: 67 TFATQKVSLEESVLSQVTTAIQNAQEKIVYASNGTLSDNDRASLATDIQGLRDQLLNLAN 126
E L+++ +Q +E V A+NGT SD+D S+ +IQ +++ ++N
Sbjct: 68 NDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEEIDRVSN 127

Query: 127 TTDGNGRYIFAGYKTETAPFSEVNGDYVGGTESIKQQVDASRSMVIGHTGDKIFDSITSN 186
T NG + + +G E+I + +G G + +
Sbjct: 128 QTQFNGVKVLSQDNQMKIQVGANDG------ETITIDLQKIDVKSLGLDGFNVNGPKEAT 181

Query: 187 AVAEPDGSASETNLFAMLDSAIAALKTPVADSEADKETAAAALDKT 232
+ T A + + TA DK
Sbjct: 182 VGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKV 227


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
ECs1460   FLGHOOKAP1  677    0.0      Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 677 bits (1747), Expect = 0.0
Identities = 541/546 (99%), Positives = 543/546 (99%)

Query: 2 SSLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVS 61
SSLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVS
Sbjct: 1 SSLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVS 60

Query: 62 GVQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTLV 121
GVQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTLV
Sbjct: 61 GVQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTLV 120

Query: 122 SNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLND 181
SNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLND
Sbjct: 121 SNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLND 180

Query: 182 QISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGSTA 241
QISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGSTA
Sbjct: 181 QISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGSTA 240

Query: 242 RQLAAVPSSADPSRTTVAYVDGTAGNIEIPEKLLNTGSLGGILTFRSQDLDQTRNTLGQL 301
RQLAAVPSSADPSRTTVAYVDGTAGNIEIPEKLLNTGSLGGILTFRSQDLDQTRNTLGQL
Sbjct: 241 RQLAAVPSSADPSRTTVAYVDGTAGNIEIPEKLLNTGSLGGILTFRSQDLDQTRNTLGQL 300

Query: 302 ALAFAEAFNSQHKAGFDANGDEGEDFFAIGKPAVLQNTKNNGNVAIGATVTDASAVLATD 361
ALAFAEAFN+QHKAGFDANGD GEDFFAIGKPAVLQNTKN G+VAIGATVTDASAVLATD
Sbjct: 301 ALAFAEAFNTQHKAGFDANGDAGEDFFAIGKPAVLQNTKNKGDVAIGATVTDASAVLATD 360

Query: 362 YKISFDNNQWQVTRLASNTTFTVTPDANGKVAFDGLELTFTGTPAVNDSFTLKPVSDAIV 421
YKISFDNNQWQVTRLASNTTFTVTPDANGKVAFDGLELTFTGTPAVNDSFTLKPVSDAIV
Sbjct: 361 YKISFDNNQWQVTRLASNTTFTVTPDANGKVAFDGLELTFTGTPAVNDSFTLKPVSDAIV 420

Query: 422 NMDVLITDEAKIAMASEEDAGDSDNRNGQALLDLQSNSKTVGGAKSFNDAYASLVSDIGN 481
NMDVLITDEAKIAMASEEDAGDSDNRNGQALLDLQSNSKTVGGAKSFNDAYASLVSDIGN
Sbjct: 421 NMDVLITDEAKIAMASEEDAGDSDNRNGQALLDLQSNSKTVGGAKSFNDAYASLVSDIGN 480

Query: 482 KTATLKTSSTTQGNVVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFD 541
KTATLKTSS TQGNVVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFD
Sbjct: 481 KTATLKTSSATQGNVVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFD 540

Query: 542 ALINIR 547
ALINIR
Sbjct: 541 ALINIR 546


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs1459   FLGFLGJ  508    0.0      Flagellar protein FlgJ signature.
		>FLGFLGJ#Flagellar protein FlgJ signature.

Length = 313

Score = 508 bits (1308), Expect = 0.0
Identities = 311/313 (99%), Positives = 311/313 (99%)

Query: 1 MISDSKLLASAAWDAQSLNELKAKASEDPAANIRPVARQVEGMFVQMMLKSMRDALPKDG 60
MISDSKLLASAAWDAQSLNELKAKA EDPAANIRPVARQVEGMFVQMMLKSMRDALPKDG
Sbjct: 1 MISDSKLLASAAWDAQSLNELKAKAGEDPAANIRPVARQVEGMFVQMMLKSMRDALPKDG 60

Query: 61 LFSSEHTRLYTSMYDQQIAQQMTTGKGLGLAEMMVKQMTPEQPLPEESTPAAPMKFPLET 120
LFSSEHTRLYTSMYDQQIAQQMT GKGLGLAEMMVKQMTPEQPLPEESTPAAPMKFPLET
Sbjct: 61 LFSSEHTRLYTSMYDQQIAQQMTAGKGLGLAEMMVKQMTPEQPLPEESTPAAPMKFPLET 120

Query: 121 VVRYQNQALSQLVQKAVPRNYDDSLPGDSKAFLAQLSLPAQLASQQSGVPHHLILAQAAL 180
VVRYQNQALSQLVQKAVPRNYDDSLPGDSKAFLAQLSLPAQLASQQSGVPHHLILAQAAL
Sbjct: 121 VVRYQNQALSQLVQKAVPRNYDDSLPGDSKAFLAQLSLPAQLASQQSGVPHHLILAQAAL 180

Query: 181 ESGWGQRQIRRENGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKKVKAKFRVYSSYL 240
ESGWGQRQIRRENGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKKVKAKFRVYSSYL
Sbjct: 181 ESGWGQRQIRRENGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKKVKAKFRVYSSYL 240

Query: 241 EALSDYVGLLTRNPRYAAVTTAASAEQGAQALQDAGYATDPHYARKLTNMIQQMKSISDK 300
EALSDYVGLLTRNPRYAAVTTAASAEQGAQALQDAGYATDPHYARKLTNMIQQMKSISDK
Sbjct: 241 EALSDYVGLLTRNPRYAAVTTAASAEQGAQALQDAGYATDPHYARKLTNMIQQMKSISDK 300

Query: 301 VSKTYSMNIDNLF 313
VSKTYSMNIDNLF
Sbjct: 301 VSKTYSMNIDNLF 313


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs1458   FLGPRINGFLGI  426    e-152    Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 426 bits (1097), Expect = e-152
Identities = 156/363 (42%), Positives = 213/363 (58%), Gaps = 9/363 (2%)

Query: 5 FLSALILLLVTTAAQAERIRDLTSVQGVRQNSLIGYGLVVGLDGTGDQTTQTPFTTQTLN 64
F + L A RI+D+ S+Q R N LIGYGLVVGL GTGD +PFT Q++
Sbjct: 13 FSALPFLSTPPAQADTSRIKDIASLQAGRDNQLIGYGLVVGLQGTGDSLRSSPFTEQSMR 72

Query: 65 NMLSQLGITVPTGTNMQLKNVAAVMVTASLPPFGRQGQTIDVVVSSMGNAKSLRGGTLLM 124
ML LGIT G + KN+AAVMVTA+LPPF G +DV VSS+G+A SLRGG L+M
Sbjct: 73 AMLQNLGITTQGGQS-NAKNIAAVMVTANLPPFASPGSRVDVTVSSLGDATSLRGGNLIM 131

Query: 125 TPLKGVDSQVYALAQGNILVGGAGASAGGSSVQVNQLNGGRITNGAVIERELPSQFGVGN 184
T L G D Q+YA+AQG ++V G A +++ R+ NGA+IERELPS+F
Sbjct: 132 TSLSGADGQIYAVAQGALIVNGFSAQGDAATLTQGVTTSARVPNGAIIERELPSKFKDSV 191

Query: 185 TLNLQLNDEDFSMAQQIADTINRVR----GYGSATALDARTIQVRVPSGNSSQVRFLADI 240
L LQL + DFS A ++AD +N G A D++ I V+ P + R +A+I
Sbjct: 192 NLVLQLRNPDFSTAVRVADVVNAFARARYGDPIAEPRDSQEIAVQKPRV-ADLTRLMAEI 250

Query: 241 QNMQVNVTPQDAKVVINSRTGSVVMNREVTLDSCAIAQGNLSVTVNRQANVSQPDTPFGG 300
+N+ V T AKVVIN RTG++V+ +V + A++ G L+V V V QP PF
Sbjct: 251 ENLTVE-TDTPAKVVINERTGTIVIGADVRISRVAVSYGTLTVQVTESPQVIQP-APFSR 308

Query: 301 GQTVVTPQTQIDLRQSGGSLQSVRSSASLNNVVRALNALGATPMDLMSILQSMQSAGCLR 360
GQT V PQT I Q G + ++ L +V LN++G +++ILQ ++SAG L+
Sbjct: 309 GQTAVQPQTDIMAMQEGSKV-AIVEGPDLRTLVAGLNSIGLKADGIIAILQGIKSAGALQ 367

Query: 361 AKL 363
A+L
Sbjct: 368 AEL 370


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs1457   FLGLRINGFLGH  349    e-126    Flagellar L-ring protein signature.
		>FLGLRINGFLGH#Flagellar L-ring protein signature.

Length = 232

Score = 349 bits (897), Expect = e-126
Identities = 232/232 (100%), Positives = 232/232 (100%)

Query: 1 MQKNAAHTYAISSLLVLSLTGCAWIPSTPLVQGATSAQPVPGPTPVANGSIFQSAQPINY 60
MQKNAAHTYAISSLLVLSLTGCAWIPSTPLVQGATSAQPVPGPTPVANGSIFQSAQPINY
Sbjct: 1 MQKNAAHTYAISSLLVLSLTGCAWIPSTPLVQGATSAQPVPGPTPVANGSIFQSAQPINY 60

Query: 61 GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYLQGLFGNA 120
GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYLQGLFGNA
Sbjct: 61 GYQPLFEDRRPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYLQGLFGNA 120

Query: 121 RADVEASGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRF 180
RADVEASGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRF
Sbjct: 121 RADVEASGGNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRF 180

Query: 181 SGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSPM 232
SGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSPM
Sbjct: 181 SGVVNPRTISGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLNLSPM 232


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
ECs1456   FLGHOOKAP1  44     4e-07    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 43.8 bits (103), Expect = 4e-07
Identities = 18/81 (22%), Positives = 36/81 (44%), Gaps = 14/81 (17%)

Query: 3 SSLWIAKTGLDAQQTNMDVIANNLANVSTNGFKRQRAVFEDLLYQTIRQPGAQSSEQTTL 62
S + A +GL+A Q ++ +NN+++ + G+ RQ + + +TL
Sbjct: 2 SLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTI--------------MAQANSTL 47

Query: 63 PSGLQIGTGVRPVATERLHSQ 83
+G +G GV +R +
Sbjct: 48 GAGGWVGNGVYVSGVQREYDA 68



Score = 41.1 bits (96), Expect = 3e-06
Identities = 11/41 (26%), Positives = 21/41 (51%)

Query: 220 ETSNVNVAEELVNMIQVQRAYEINSKAVSTTDQMLQKLTQL 260
S VN+ EE N+ + Q+ Y N++ + T + + L +
Sbjct: 505 SISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
ECs1454   FLGHOOKAP1  41     4e-06    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 41.5 bits (97), Expect = 4e-06
Identities = 17/49 (34%), Positives = 29/49 (59%)

Query: 353 TLTNGALEASNVDLSKELVNMIVAQRNYQSNAQTIKTQDQILNTLVNLR 401
L+N S V+L +E N+ Q+ Y +NAQ ++T + I + L+N+R
Sbjct: 498 QLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINIR 546



Score = 37.2 bits (86), Expect = 1e-04
Identities = 22/56 (39%), Positives = 30/56 (53%), Gaps = 4/56 (7%)

Query: 6 AVSGLNAAATNLDVIGNNIANSATYGFKSGTASFAD----MFAGSKVGLGVKVAGI 57
A+SGLNAA L+ NNI++ G+ T A + AG VG GV V+G+
Sbjct: 7 AMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVSGV 62


108  ECs1285  ECs1270  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
ECs1285   2       12       -5.094664   3-oxoacyl-ACP reductase
ECs1284   2       14       -5.759536   holo-ACP synthase
ECs1283   1       13       -4.241666   hemolysin activator-like protein
ECs1282   2       14       -3.164690   hemagglutinin/hemolysin-like protein
ECs1281   2       23       -3.833658   hypothetical protein
ECs1280   1       21       -2.979649   major pilin protein
ECs1279   1       24       -4.972247   chaperone protein
ECs1278   1       26       -5.857077   outer membrane usher protein
ECs1277   2       37       -10.882169  outer membrane protein
ECs1276   1       40       -13.409496  chaperone protein
ECs1275   -1      36       -11.112645  oxidoreductase
ECs1274   -1      35       -10.383216  transcriptional regulator
ECs1273   -1      32       -8.915630   FidL-like protein
ECs1272   -1      31       -8.361043   rtn-like protein
ECs1271   -2      27       -6.077229   hypothetical protein
ECs1270   -2      22       -3.745729   outer membrane protein PgaA
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs1285   DHBDHDRGNASE  117    9e-35    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 117 bits (294), Expect = 9e-35
Identities = 55/192 (28%), Positives = 100/192 (52%), Gaps = 10/192 (5%)

Query: 4 DLACPQSVSALCEQIERQAGKIDVLVNNAGIVKDSLFASMSYEDFTQVIETNMFSIFRLT 63
D+ ++ + +IER+ G ID+LVN AG+++ L S+S E++ N +F +
Sbjct: 65 DVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNAS 124

Query: 64 KDALMLLRAAENPAIINVASIAALIPSVGQANYSASKGAILGFTRTLAAEMAPWGVRVNA 123
+ + + +I+ V S A +P A Y++SK A + FT+ L E+A + +R N
Sbjct: 125 RSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNI 184

Query: 124 VAPGMIESKMVKKV------SRAVVRAVTST----IPLRRLGKCEEVANTIVFLSSSASS 173
V+PG E+ M + + V++ T IPL++L K ++A+ ++FL S +
Sbjct: 185 VSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAG 244

Query: 174 YIVGQTIVIDGG 185
+I + +DGG
Sbjct: 245 HITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs1282   PF05860  64     2e-14    haemagglutination activity domain.
		>PF05860#haemagglutination activity domain.

Length = 117

Score = 64.4 bits (157), Expect = 2e-14
Identities = 23/126 (18%), Positives = 48/126 (38%), Gaps = 21/126 (16%)

Query: 37 KNGTVYNANGVPVVDINKPNGSGLSHNIWDNLNVDKNGVVFNNSANESSTSLAGNIQGNS 96
N + +++ GS L H+ + +V +G F N+
Sbjct: 11 INSNITTEGNTRIIERGTQAGSNLFHS-FQEFSVPTSGTAFFNNP--------------- 54

Query: 97 NLTSGSAKVILNEVTSKNPSTINGMMEVAGDKADLIIANPNGITVNGGGSINTGKLTLTT 156
+ + I++ VT + S I+G++ A+L + NPNGI ++ G + +
Sbjct: 55 ----TNIQNIISRVTGGSVSNIDGLIRANA-TANLFLINPNGIIFGQNARLDIGGSFVGS 109

Query: 157 GTPDIQ 162
++
Sbjct: 110 TANRLK 115


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs1279   SECA  29     0.022    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 28.7 bits (64), Expect = 0.022
Identities = 19/66 (28%), Positives = 27/66 (40%), Gaps = 14/66 (21%)

Query: 170 VTNPTGYYVTIRAAELLNNGKKVPLANSVMIAPQSTTEW-----TLPSGISVAPGAQIHL 224
V + V + +LN IA T E TLP+ ++ G +H+
Sbjct: 78 VFGMRHFDVQLLGGMVLNERC---------IAEMRTGEGKTLTATLPAYLNALTGKGVHV 128

Query: 225 VTVNDY 230
VTVNDY
Sbjct: 129 VTVNDY 134


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs1278   PF00577  717    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 717 bits (1851), Expect = 0.0
Identities = 246/848 (29%), Positives = 404/848 (47%), Gaps = 44/848 (5%)

Query: 28 ATPSDEDNYTFDPQLFRGSRFSQSSLAKLTTRESVAPGNYKMDIYTNNKLSGSWNVTFKE 87
P F+P+ + + L++ + + PG Y++DIY NN + +VTF
Sbjct: 39 QAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPGTYRVDIYLNNGYMATRDVTFNT 98

Query: 88 AADG-RVLPCLTPEVADAIGLKTGEDKGEK---DPVCTFAKELAPGITSQTQLSQLRLDL 143
++PCLT ++GL T G D C + T+Q + Q RL+L
Sbjct: 99 GDSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDACVPLTSMIHDATAQLDVGQQRLNL 158

Query: 144 SVPQSQLISRPRGYVPPSELDTGASLAFMNYIANYYNVAYSGQNAHSQRSLWASFNGGIN 203
++PQ+ + +R RGY+PP D G + +NY + + + + + + G+N
Sbjct: 159 TIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNS--VQNRIGGNSHYAYLNLQSGLN 216

Query: 204 LGAWQYRQLSNMTW-----DNDKGNQWNNIRSYLQRPLPAINSQLMMGQLITSGRFFSGL 258
+GAW+ R + ++ + N+W +I ++L+R + + S+L +G T G F G+
Sbjct: 217 IGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRSRLTLGDGYTQGDIFDGI 276

Query: 259 SYHGVSLATDERMLPDSMRGYAPTIRGVAATNARVSVMQNGHEIYQTTVAPGPFEINDLY 318
++ G LA+D+ MLPDS RG+AP I G+A A+V++ QNG++IY +TV PGPF IND+Y
Sbjct: 277 NFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIYNSTVPPGPFTINDIY 336

Query: 319 PTSYSGDLDVTVTEANGAVSRFSVPFSAVPESMRPGTSRYNVEVGKTQDSG---DDSMFG 375
SGDL VT+ EA+G+ F+VP+S+VP R G +RY++ G+ + + F
Sbjct: 337 AAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITAGEYRSGNAQQEKPRFF 396

Query: 376 DLTWQHGMTNTLTFNSGSRIADGYQALMLGGVYGS-SLGAFGANLTWSHARVPESEAQSG 434
T HG+ T G+++AD Y+A G +LGA ++T +++ +P+ G
Sbjct: 397 QSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALSVDMTQANSTLPDDSQHDG 456

Query: 435 WMSQLTWSKTFQPTSTTVSLAGYRYSTSGYRDLADVLGERHAASNKQSWD---------- 484
+ ++K+ + T + L GYRYSTSGY + AD R N ++ D
Sbjct: 457 QSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGYNIETQDGVIQVKPKFT 516

Query: 485 ---SSQWRQQSRFDLTLSQSLANYGNLFVSGSTQNYRGGKSRDTQLQLGYSNSFSHGISM 541
+ + ++ + LT++Q L L++SGS Q Y G + D Q Q G + +F I+
Sbjct: 517 DYYNLAYNKRGKLQLTVTQQLGRTSTLYLSGSHQTYWGTSNVDEQFQAGLNTAF-EDINW 575

Query: 542 NLSVGRQRMGGYKDNSDDMQTVTSLSFSFPLGG-------NGPRVPSLSNSWTHSTDGSS 594
LS + D +L+ + P + R S S S +H +G
Sbjct: 576 TLSYSLTKNAW--QKGRDQM--LALNVNIPFSHWLRSDSKSQWRHASASYSMSHDLNGRM 631

Query: 595 QLQSSLTGMLDEAQTTNYSLNV---MRDQQYKQTTLSGNMQKRFSQTTVGLNASKGQDYW 651
+ + G L E +YS+ +T + R + S D
Sbjct: 632 TNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNYRGGYGNANIGYSHSDDIK 691

Query: 652 QASGNVQGAMAVHGGGITFGPYLGETFALVEAKGAEGAKVYNSSQLEINDSGYALVPAVT 711
Q V G + H G+T G L +T LV+A GA+ AKV N + + + GYA++P T
Sbjct: 692 QLYYGVSGGVLAHANGVTLGQPLNDTVVLVKAPGAKDAKVENQTGVRTDWRGYAVLPYAT 751

Query: 712 PYRYNRISLDPQGMDGDAELVDSERQVAPVAGAAVKVIFRTRPGKALLIKSRMADGSELP 771
YR NR++LD + + +L ++ V P GA V+ F+ R G LL+ + LP
Sbjct: 752 EYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAEFKARVGIKLLMTLT-HNNKPLP 810

Query: 772 MGADVLDENNTVVGIAGQGGQIYLRTEQTKGHLSVRWGEGANDSCQLPFDISGKDSNSPI 831
GA V E++ GI GQ+YL G + V+WGE N C + + + +
Sbjct: 811 FGAMVTSESSQSSGIVADNGQVYLSGMPLAGKVQVKWGEEENAHCVANYQLPPESQQQLL 870

Query: 832 IRLNETCQ 839
+L+ C+
Sbjct: 871 TQLSAECR 878


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs1275   DHBDHDRGNASE  103    7e-29    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 103 bits (259), Expect = 7e-29
Identities = 70/255 (27%), Positives = 118/255 (46%), Gaps = 11/255 (4%)

Query: 15 LHNKVAIVTGAAGELGRGLCSALAKAGANLLLVDIK-EPDNRYLKHLTHEGVEVEFMTID 73
+ K+A +TGAA +G + LA GA++ VD E + + L E E D
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPAD 65

Query: 74 ITKPDASCTIINRCLERFGQLDILVNNAGVCNINRPIDFNRNDWDPMINLNLNAAFDMSQ 133
+ A I R G +DILVN AGV + +W+ ++N F+ S+
Sbjct: 66 VRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASR 125

Query: 134 AALNIFVPQRKGKIINMCSVLSFHGGRWSPG-YAATKHALAGLTKAYADDFAEYNIQING 192
+ + +R G I+ + S + R S YA++K A TK + AEYNI+ N
Sbjct: 126 SVSKYMMDRRSGSIVTVGSNPA-GVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNI 184

Query: 193 IAPGYYVSEMTAIIYNNPKIKE-LIKGR-------IPAQRWGRAQDLMGAMVFLASAASD 244
++PG ++M ++ + E +IKG IP ++ + D+ A++FL S +
Sbjct: 185 VSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAG 244

Query: 245 YVNGQLLVIDGGYSI 259
++ L +DGG ++
Sbjct: 245 HITMHNLCVDGGATL 259


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs1273   TRNSINTIMINR  30     0.004    Translocated intimin receptor (Tir) signature.
		>TRNSINTIMINR#Translocated intimin receptor (Tir) signature.

Length = 549

Score = 30.1 bits (67), Expect = 0.004
Identities = 13/40 (32%), Positives = 21/40 (52%), Gaps = 1/40 (2%)

Query: 5 YFLFAGIILCAFIAAILSHIAFHHANEPAEQNISCNAHVI 44
Y L + +I+ I A ++ A H N+PAEQ + H +
Sbjct: 366 YGLSSALIVAGGIGAGVT-TALHRRNQPAEQTTTTTTHTV 404


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs1271   BINARYTOXINA  30     0.025    Clostridial binary toxin A signature.
		>BINARYTOXINA#Clostridial binary toxin A signature.

Length = 454

Score = 29.6 bits (66), Expect = 0.025
Identities = 22/77 (28%), Positives = 36/77 (46%), Gaps = 6/77 (7%)

Query: 335 DQVIKTVVNIIGKSIRPDDLLA--RVGGEEFGVLLTDIDTERAKALAERIRENVERLTGD 392
D + + N + + P +L+ R G +EFG+ LT + + K E I E+ G
Sbjct: 313 DSKVNNIENALKLTPIPSNLIVYRRSGPQEFGLTLTSPEYDFNK--IENIDAFKEKWEGK 370

Query: 393 NPEYAIPQKVTISIGAV 409
Y P ++ SIG+V
Sbjct: 371 VITY--PNFISTSIGSV 385


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs1270   ARGDEIMINASE  30     0.047    Bacterial arginine deiminase signature.
		>ARGDEIMINASE#Bacterial arginine deiminase signature.

Length = 409

Score = 29.8 bits (67), Expect = 0.047
Identities = 27/183 (14%), Positives = 61/183 (33%), Gaps = 23/183 (12%)

Query: 450 WPRAAENELKK-AEVIEPRNINLEVEQAWTALTLQEWQQA--AVLTHDVVEREPQDPGVV 506
+ A E + A +++ + +E + + L ++ ++E E + +
Sbjct: 47 YLEVARQEHEVFASILKNNLVEIEYIEDLISEVLVSSVALENKFISQFILEAEIKTDFTI 106

Query: 507 -RLK---RAVDVHNLAELRIAGSTGIDAEGPDSGKHDVDLTTIVYS---PPLKDNWRGFA 559
LK ++ + N+ I+G E + DL P+ + F
Sbjct: 107 NLLKDYFSSLTIDNMISKMISGVVT--EELKNYTSSLDDLVNGANLFIIDPMPNVL--FT 162

Query: 560 GFGYADGQFSEGKGIVRDWLAGVEWRSRNIWLEAEYAERVFNHEHKPGARLSGWYDFNDN 619
D S G G+ + + + R E +AE +F + + W + +
Sbjct: 163 ----RDPFASIGNGVT---INKMFTKVRQ--RETIFAEYIFKYHPVYKENVPIWLNRWEE 213

Query: 620 WRI 622
+
Sbjct: 214 ASL 216


109  ECs0955  ECs0947  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
ECs0955   -1      14       2.375021    dTDP-glucose enzyme
ECs0954   0       14       1.968840    nucleotide di-P-sugar epimerase or dehydratase
ECs0953   -2      15       0.869513    regulator
ECs0952   -1      18       0.241708    hypothetical protein
ECs0951   1       19       0.185165    hypothetical protein
ECs0948   0       18       -0.118538   lipoprotein
ECs0947   -2      18       -0.113472   arginine transporter ATP-binding protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0955   NUCEPIMERASE  54     6e-10    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 53.6 bits (129), Expect = 6e-10
Identities = 29/125 (23%), Positives = 51/125 (40%), Gaps = 17/125 (13%)

Query: 4 RILVLGASGYIGQHLVRTLSQQGHQILA---------AARHVDRLAKLQLANVSCHKVDL 54
+ LV GA+G+IG H+ + L + GHQ++ + RL L HK+DL
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 55 SWPDNLPALLQD--IDTVYFLVH------SMGEGGDFIAQERQVALNFRDALREVPVKQL 106
+ + + L + V+ H S+ + LN + R ++ L
Sbjct: 62 ADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQHL 121

Query: 107 IFLSS 111
++ SS
Sbjct: 122 LYASS 126


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0954   NUCEPIMERASE  76     1e-17    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 75.6 bits (186), Expect = 1e-17
Identities = 70/363 (19%), Positives = 123/363 (33%), Gaps = 65/363 (17%)

Query: 13 MKVLVTGATSGLGRNAVEFLCQKGISVRA---------TGRNEAMGKLLEKMGAEFVPAD 63
MK LVTGA +G + + L + G V +A +LL + G +F D
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKID 60

Query: 64 LTELVSSQAKVMLAGIDTLWHCS-------SFTSPWGTQQAFDLANVRATRRLGEWAVAW 116
L + + ++ S +P A+ +N+ + E
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPH----AYADSNLTGFLNILEGCRHN 116

Query: 117 GVRNFIHISSPSLYFDYHHHRDIKEDFRPHRFANEFARSKAASEEVINMLSQANPQTRFT 176
+++ ++ SS S+Y + D + +A +K A+E + + S T
Sbjct: 117 KIQHLLYASSSSVYGL-NRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSH-LYGLPAT 174

Query: 177 ILRPQSLFGPHDK--VFIPRLAHMMHHYGSILLPHGGSALVDMTYYENAVHAMWLASQEA 234
LR +++GP + + + + M SI + + G D TY ++ A+
Sbjct: 175 GLRFFTVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIRLQDVI 234

Query: 235 CDKLPS--------------GRVYNITNGEHRTLRSIVQKLIDELNIDCRIRSVPYPMLD 280
RVYNI N L +Q L D L I+ + +P D
Sbjct: 235 PHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAKKNMLPLQPGD 294

Query: 281 MIARSMERLGRKSAKEPPLTHYGVSKLNFDFTLDITRAQEELGYQPVITLDEGIEKTAAW 340
+ T D E +G+ P T+ +G++ W
Sbjct: 295 V----------------LETS-----------ADTKALYEVIGFTPETTVKDGVKNFVNW 327

Query: 341 LRD 343
RD
Sbjct: 328 YRD 330


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
ECs0953   ECOLIPORIN  29     0.025    E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 28.7 bits (64), Expect = 0.025
Identities = 20/54 (37%), Positives = 27/54 (50%), Gaps = 9/54 (16%)

Query: 2 RRVFWLIAVALLLAGCAGEKGIVEKEGYQLDTRRQAQAAYPRIKVLVIHYTADD 55
R+V L+ ALL AG A I K+G +LD Y ++ L HY +DD
Sbjct: 3 RKVLALVIPALLAAGAAHAAEIYNKDGNKLDL-------YGKVDGL--HYFSDD 47


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs0947   PF05272  30     0.010    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.0 bits (67), Expect = 0.010
Identities = 9/18 (50%), Positives = 12/18 (66%)

Query: 31 LVLLGPSGAGKSSLLRVL 48
+VL G G GKS+L+ L
Sbjct: 599 VVLEGTGGIGKSTLINTL 616


110  ECs0927  ECs0919  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
ECs0927   0       12       0.037485    hypothetical protein
ECs0926   0       13       0.161924    DeoR family transcriptional regulator
ECs0925   1       15       -0.022010   DeoR family transcriptional regulator
ECs0924   0       15       -0.857952   hypothetical protein
ECs0923   -1      14       0.140924    hypothetical protein
ECs0922   0       12       0.080594    proton motive force efflux pump
ECs0921   -1      10       -0.300360   undecaprenyl pyrophosphate phosphatase
ECs0920   -1      10       -0.065843   DeoR family transcriptional regulator
ECs0919   0       9        -0.683061   D-alanyl-D-alanine carboxypeptidase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs0927   TCRTETA  32     0.006    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 32.1 bits (73), Expect = 0.006
Identities = 21/106 (19%), Positives = 34/106 (32%), Gaps = 6/106 (5%)

Query: 394 LMIGMITFQFSTFSFGMGNAAGLLFAGIML-GFMRANHPTFG-YIPQ--GALSMVKEFGL 449
L++ + +L+ G ++ G A G YI + FG
Sbjct: 76 LLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIADITDGDERARHFGF 135

Query: 450 MVFMAGVGLSAGSGINNGLGAIGGQM--LIAGLIVSLVPVVICFLF 493
M G G+ AG + +G A + L + CFL
Sbjct: 136 MSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLL 181


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs0926   HTHTETR  50     6e-10    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 49.6 bits (118), Expect = 6e-10
Identities = 14/83 (16%), Positives = 30/83 (36%), Gaps = 4/83 (4%)

Query: 2 RRANDPQRREKIIQATLEAVKLYGIHAVTHRKIATLAGVPLGSMTYYFSGIDELLLEAFS 61
+ + R+ I+ L G+ + + +IA AGV G++ ++F +L E +
Sbjct: 5 TKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIW- 63

Query: 62 SFTEIMSRQYQAFFSDVSDAQGA 84
E+ +
Sbjct: 64 ---ELSESNIGELELEYQAKFPG 83


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs0925   TCRTETB  34     0.001    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 33.7 bits (77), Expect = 0.001
Identities = 34/150 (22%), Positives = 65/150 (43%), Gaps = 6/150 (4%)

Query: 218 LLIGVVVLAMAFAEGSANDWL-PLLMVDGHGFSP-TSGSLIYAGFTLGMTVGRFTGGWFI 275
+IGV+ + F + + P +M D H S GS+I T+ + + + GG +
Sbjct: 258 FMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYIGGILV 317

Query: 276 DRYSRVAVVR-ASALM--GALGIGLIIFVDSAWVA-GVSVVLWGLGASLGFPLTISAASD 331
DR + V+ + L ++ S ++ + VL GL + TI ++S
Sbjct: 318 DRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGLSFTKTVISTIVSSSL 377

Query: 332 TGPDAPTRVSVVATTGYLAFLVGPPLLGYL 361
+A +S++ T +L+ G ++G L
Sbjct: 378 KQQEAGAGMSLLNFTSFLSEGTGIAIVGGL 407


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs0922   TCRTETA  40     1e-05    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 39.8 bits (93), Expect = 1e-05
Identities = 58/269 (21%), Positives = 106/269 (39%), Gaps = 23/269 (8%)

Query: 71 LLGPLSDRIGRRPVMLAGVVWFIVTCLAILLAQNIEQFTLLRFLQGISLCFIGAVGYAAI 130
+LG LSDR GRRPV+L + V + A + + R + GI+ GAV A I
Sbjct: 62 VLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGA-TGAVAGAYI 120

Query: 131 QESFEEAVCIKITALMANVALIAPLLGPLVG---AAWIHVLPWEGMFVLFAALAAISFFG 187
+ + + M+ + GP++G + P F AAL ++F
Sbjct: 121 ADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAP----FFAAAALNGLNFLT 176

Query: 188 LQRAMPETATRIGEKLSLKELGRDYKLVLKNG-RFVAGALALGFVSLPLLAWIAQSP--I 244
+PE+ L + L G VA +A+ F ++ + Q P +
Sbjct: 177 GCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFF----IMQLVGQVPAAL 232

Query: 245 IIITGEQLSSYEYGLLQVPIFGALIAGNL----LLARLTSRRTVRSLIIMGGWPIMIGLL 300
+I GE ++ + + + I +L + + +R R +++G G +
Sbjct: 233 WVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYI 292

Query: 301 VAAAATVISSHAYLWMTAGLSIYAFGIGL 329
+ A AT ++ + + + GIG+
Sbjct: 293 LLAFAT----RGWMAFPIMVLLASGGIGM 317


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
ECs0919   BLACTAMASEA  43     8e-07    Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 43.2 bits (102), Expect = 8e-07
Identities = 41/201 (20%), Positives = 64/201 (31%), Gaps = 34/201 (16%)

Query: 16 AFLFLFAPTAFAAEQTVEAPSVDARAW----------ILMDYASGKVLAEGNADEKLDPA 65
+ L A A P + I MD ASG+ L ADE+
Sbjct: 7 CIISLLATLPLAV-HASPQPLEQIKLSESQLSGRVGMIEMDLASGRTLTAWRADERFPMM 65

Query: 66 SLTKIMTSYVVGQALKADKIKLTDMVTVGKDAWATGNPALRGSSVMFLKPGDQVSVADLN 125
S K++ V + A +L + + +P V D ++V +L
Sbjct: 66 STFKVVLCGAVLARVDAGDEQLERKIHYRQQDLVDYSP------VSEKHLADGMTVGELC 119

Query: 126 KGVIIQSGNDACIALADYVAGSQESFIGLMNGYAKKLGLTNTT---FQTVHGLDAPGQF- 181
I S N A L V G + + +++G T ++T PG
Sbjct: 120 AAAITMSDNSAANLLLATVGGPAG-----LTAFLRQIGDNVTRLDRWETELNEALPGDAR 174

Query: 182 --STARDMA------LLGKAL 194
+T MA L + L
Sbjct: 175 DTTTPASMAATLRKLLTSQRL 195


111  ECs0875  ECs0870  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
ECs0875   -2      19       3.668910    ATP-dependent RNA helicase RhlE
ECs0874   -2      21       3.394736    DNA-binding transcriptional regulator
ECs0873   -2      21       3.509682    hypothetical protein
ECs0872   -2      21       3.024379    ABC transporter ATP-binding protein
ECs0871   -2      21       2.479564    hypothetical protein
ECs0870   -1      18       0.617871    hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
ECs0875   SECA  30     0.025    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 29.8 bits (67), Expect = 0.025
Identities = 20/67 (29%), Positives = 34/67 (50%), Gaps = 4/67 (5%)

Query: 246 QQVLVFTRTKHGANHLAEQLNKDGIRSAAIHG-NKSQGARTRALADFKSGDIRVLVATDI 304
Q VLV T + + ++ +L K GI+ ++ + A A A + + V +AT++
Sbjct: 450 QPVLVGTISIEKSELVSNELTKAGIKHNVLNAKFHANEAAIVAQAGYPAA---VTIATNM 506

Query: 305 AARGLDI 311
A RG DI
Sbjct: 507 AGRGTDI 513


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs0874   HTHTETR  73     6e-18    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 73.1 bits (179), Expect = 6e-18
Identities = 33/214 (15%), Positives = 77/214 (35%), Gaps = 17/214 (7%)

Query: 13 KGEQAKKQLIAAALAQFGEYGMNATT-REIAAQAGQNIAAITYYFGSKEDLYLACAQWIA 71
+ ++ ++ ++ AL F + G+++T+ EIA AG AI ++F K DL+ +
Sbjct: 8 EAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSE 67

Query: 72 DFIGEQFRPHAEEAERLFAQPQPDRAAIRELILRACRNMIKLLTQDDTVNLSKFISREQL 131
IGE E + P + +RE+++ + + + + + F E +
Sbjct: 68 SNIGELEL---EYQAKFPGDP---LSVLREILIHVLESTVTEERRRLLMEII-FHKCEFV 120

Query: 132 SPTAAYHLVHEQVISPLHSHLTRLIAAWTGCDANDTRMILHTHALIGEILAFRLGKETIL 191
A + + + + + +A L T + + G
Sbjct: 121 GEMAVVQQAQRNLCLESYDRIEQTLKH--CIEAKMLPADLMTRRAAIIMRGYISG----- 173

Query: 192 LRTGWTAFDEEKTELINQTVTCHIDLILQGLSQR 225
L W + + + ++ ++L+
Sbjct: 174 LMENWLFAPQSFD--LKKEARDYVAILLEMYLLC 205


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
ECs0873   RTXTOXIND  63     6e-13    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 62.5 bits (152), Expect = 6e-13
Identities = 42/259 (16%), Positives = 92/259 (35%), Gaps = 25/259 (9%)

Query: 83 ALMQAKAGVSVAQAQYDLMLAGYRDEEIAQAAAAVKQAQAAYDYAQNFYNRQQGLWKSRT 142
Q + + +A+ +LA E + + + + L +
Sbjct: 201 QKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENK 260

Query: 143 ISA--NDLENARSSRDQAQATLKSAQDKLRQYRSGNREQ---DIAQAKASLEQAQAQLAQ 197
N+L +S +Q ++ + SA+++ + + + + Q ++ +LA+
Sbjct: 261 YVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAK 320

Query: 198 AELNLQDSTLIAPSDGTLLTRAV-EPGTVLNEGGTVFTVSLT-RPVWVRAYVDERNLDQA 255
E Q S + AP + V G V+ T+ + + V A V +++
Sbjct: 321 NEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFI 380

Query: 256 QPGRKVLLYTDGRPDKPYH---GQIGFVSPTAEFTPKTVETPDLRTDLVYRLRIVVT--- 309
G+ ++ + P Y G++ ++ A D R LV+ + I +
Sbjct: 381 NVGQNAIIKVEAFPYTRYGYLVGKVKNINLDA--------IEDQRLGLVFNVIISIEENC 432

Query: 310 ----DADDALRQGMPVTVQ 324
+ + L GM VT +
Sbjct: 433 LSTGNKNIPLSSGMAVTAE 451


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs0872   PF05272  32     0.012    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 31.6 bits (71), Expect = 0.012
Identities = 20/90 (22%), Positives = 28/90 (31%), Gaps = 21/90 (23%)

Query: 293 TPRFEDAFIDLLGGAGTSESPLGAILHTVEGTPGETVIEAKELTKKFGDFAATDHVNFAV 352
PR E + +LG P + + + K HV +
Sbjct: 547 VPRLEKWLVHVLGKTPDDYKP-------------RRLRYLQLVGKYI----LMGHVARVM 589

Query: 353 KRGEIFG----LLGPNGAGKSTTFKMMCGL 378
+ G F L G G GKST + GL
Sbjct: 590 EPGCKFDYSVVLEGTGGIGKSTLINTLVGL 619



Score = 29.7 bits (66), Expect = 0.046
Identities = 11/23 (47%), Positives = 13/23 (56%)

Query: 34 YVTGLVGPDGAGKTTLMRMLAGL 56
Y L G G GK+TL+ L GL
Sbjct: 597 YSVVLEGTGGIGKSTLINTLVGL 619


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0870   ABC2TRNSPORT  47     3e-08    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 47.2 bits (112), Expect = 3e-08
Identities = 36/146 (24%), Positives = 63/146 (43%), Gaps = 5/146 (3%)

Query: 197 AREREQGTLDQLLVSPLTTWQIFIGKAVPALIVATFQATIVLAIGIWAYQIPFAGSLALF 256
R Q T + +L + L I +G+ A A IG+ A + + L+L
Sbjct: 92 GRMEGQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGA---GIGVVAAALGYTQWLSLL 148

Query: 257 YFTMVI--YGLSLVGFGLLISSLCSTQQQAFIGVFVFMMPAILLSGYVSPVENMPVWLQN 314
Y VI GL+ G+++++L + + + P + LSG V PV+ +P+ Q
Sbjct: 149 YALPVIALTGLAFASLGMVVTALAPSYDYFIFYQTLVITPILFLSGAVFPVDQLPIVFQT 208

Query: 315 LTWINPIRHFTDITKQIYLKDASLDI 340
P+ H D+ + I L +D+
Sbjct: 209 AARFLPLSHSIDLIRPIMLGHPVVDV 234


112  ECs0848  ECs0837  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
ECs0848   6       37       -5.127671  hypothetical protein
ECs0847   3       23       0.023190   hypothetical protein
ECs0846   2       25       1.718809   hypothetical protein
ECs0845   4       27       4.975419   hypothetical protein
ECs0844   3       26       5.280189   tail fiber protein
ECs0843   3       25       4.769135   outer membrane protein
ECs0842   2       22       3.887927   host specificity protein
ECs0841   3       24       3.440703   tail assembly protein
ECs0840   3       27       2.789800   tail assembly protein
ECs0839   3       25       2.494879   minor tail protein
ECs0838   4       26       2.167722   minor tail protein
ECs0837   4       26       2.476488   tail length tape measure protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0848   YERSSTKINASE  29     0.027    Yersinia serine/threonine protein kinase signature.
		>YERSSTKINASE#Yersinia serine/threonine protein kinase signature.

Length = 732

Score = 28.9 bits (64), Expect = 0.027
Identities = 19/66 (28%), Positives = 32/66 (48%), Gaps = 3/66 (4%)

Query: 200 RMDKINGESLLNISSLPAQAEHAIYDMFDRLEQKGILFVDTTETNVLYDRAKNEFNPIDI 259
+ KIN E+ A H + D+ + L + G++ D NV++DRA E ID+
Sbjct: 234 KQGKINSEAYWGTIKFIA---HRLLDVTNHLAKAGVVHNDIKPGNVVFDRASGEPVVIDL 290

Query: 260 SSYNVS 265
++ S
Sbjct: 291 GLHSRS 296


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0844   CHANLCOLICIN  33     0.002    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 33.1 bits (75), Expect = 0.002
Identities = 36/130 (27%), Positives = 55/130 (42%), Gaps = 15/130 (11%)

Query: 131 SARNAGISASKAEASAANADTSAEDASESARQAAESAASAKKSEEASSSSAS-------- 182
S G SK+E+SAA T+ ++ + AE AA AK + EA + + +
Sbjct: 34 SGGGGGKGGSKSESSAAIHATAKWSTAQLKKTQAEQAARAKAAAEAQAKAKANRDALTQR 93

Query: 183 ------EAAQKASESLQSATDAELSKKTAESAAGNAARDATTSTEKARESAESAQSAEQS 236
EA + + SAT+ + A A R A + EKAR+ AE+A+ A Q
Sbjct: 94 LKDIVNEALRHNASRTPSATELAHANNAAMQAEDERLRLA-KAEEKARKEAEAAEKAFQE 152

Query: 237 RIAAEDAVNR 246
+ R
Sbjct: 153 AEQRRKEIER 162


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0843   ENTEROVIROMP  144    2e-46    Enterobacterial virulence outer membrane protein si...
		>ENTEROVIROMP#Enterobacterial virulence outer membrane protein signature.

Length = 171

Score = 144 bits (365), Expect = 2e-46
Identities = 66/201 (32%), Positives = 100/201 (49%), Gaps = 32/201 (15%)

Query: 1 MRKVCAAILSAAICLAVSGVPAWASEHQSTLSAGYLHASTDAPG-SDDLNGINVKYRYEF 59
M+K+ AA+ +G A ST++ GY A +DA G + + G N+KYRYE
Sbjct: 1 MKKIACLSALAAVLAFTAGTSVAA---TSTVTGGY--AQSDAQGQMNKMGGFNLKYRYEE 55

Query: 60 TDT-LGLITSFSYANAEDEQKTHYSDTRWHEDYVRNRWFSVMAGPSVRVNEWFSAYAMAG 118
++ LG+I SF+Y T S T DY +N+++ + AGP+ R+N+W S Y + G
Sbjct: 56 DNSPLGVIGSFTY--------TEKSRTASSGDYNKNQYYGITAGPAYRINDWASIYGVVG 107

Query: 119 VAYSRVSTFSGDYFRVTDNKRKTHDVLTGSDDARYSNTSLAWGAGVQFNPTESVAVDVAY 178
V Y + T + S+ ++GAG+QFNP E+VA+D +Y
Sbjct: 108 VGYGKFQT-------------TEYPTYKHDT----SDYGFSYGAGLQFNPMENVALDFSY 150

Query: 179 EGSGSGDWRTDGFIVGVGYKF 199
E S +I GVGY+F
Sbjct: 151 EQSRIRSVDVGTWIAGVGYRF 171


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0842   SURFACELAYER  33     0.005    Lactobacillus surface layer protein signature.
		>SURFACELAYER#Lactobacillus surface layer protein signature.

Length = 439

Score = 33.5 bits (76), Expect = 0.005
Identities = 34/143 (23%), Positives = 45/143 (31%), Gaps = 30/143 (20%)

Query: 965 SVNANSGTLNNVTVNENCTIKGMLEATQV----RGDF---------VKAVSKSFPKQAGT 1011
+ + L NVT + +K L+A ++ G F VKA S K A
Sbjct: 235 AAQYDKKQLTNVTFDTETAVKDALKAQKIEVSSVGYFKAPHTFTVNVKATSNKNGKSATL 294

Query: 1012 WGNTETPNGTVTVTISDDHNFDRQIIIPPIIFNGIAYSDPGSGNNPGGTRYTGYGFEVRK 1071
PN V S I+ N Y + G R
Sbjct: 295 PVTVTVPNVADPVVPSQSKT---------IMHNAYFYDKDA--------KRVGTDKVTRY 337

Query: 1072 NGVLIASRETKGAIPGSYSAVID 1094
N V +A TK A SY VI+
Sbjct: 338 NTVTVAMNTTKLANGISYYEVIE 360


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs0837   cloacin  44     3e-06    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 43.5 bits (102), Expect = 3e-06
Identities = 34/142 (23%), Positives = 62/142 (43%), Gaps = 4/142 (2%)

Query: 519 DQQRLNDLQEKKRQKDLQDAK--EQAERNYQEQQKRRNAENAALNRMNETEAARHQREIA 576
DQ + +E +RQ++ E AERNY+ + N N + R E +A Q +
Sbjct: 294 DQVKQRQDEENRRQQEWDATHPVEAAERNYERARAELNQANEDVARNQERQAKAVQVYNS 353

Query: 577 RINAMQYADQAVRDA-AIQRENERYEKALASGKKKTRETRNDEATRLLLQYSQQQAQVEG 635
R + + A++ + DA A ++ R+ +G + + +A R + +QA +
Sbjct: 354 RKSELDAANKTLADAIAEIKQFNRFAHDPMAGGHRMWQMAGLKAQRAQTDVNNKQAAFDA 413

Query: 636 QIAAARQSAGIATERMTEARKQ 657
A + A A E+RK+
Sbjct: 414 -AAKEKSDADAALSSAMESRKK 434


113  ECs0635  ECs0630  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
ECs0635   -1      16       4.764301   2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase
ECs0634   -1      16       5.266846   2,3-dihydro-2,3-dihydroxybenzoate synthetase
ECs0633   0       15       5.540967   2,3-dihydroxybenzoate-AMP ligase
ECs0632   0       15       5.518811   isochorismate synthase
ECs0631   1       14       3.374703   iron-enterobactin transporter periplasmic
ECs0630   1       14       3.855706   enterobactin exporter EntS
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0635   DHBDHDRGNASE  362    e-130    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 362 bits (930), Expect = e-130
Identities = 110/258 (42%), Positives = 150/258 (58%), Gaps = 20/258 (7%)

Query: 5 GKNVWVTGAGKGIGYATALAFVEAGAKVTGFD---------------QAFAQEQYPFATE 49
GK ++TGA +GIG A A GA + D +A E +P
Sbjct: 8 GKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFP---- 63

Query: 50 VMDVADAAQVAQVCQRLLAETERLDVLVNAAGILRMGATDQLSKEDWQQTFAVNVGGAFN 109
DV D+A + ++ R+ E +D+LVN AG+LR G LS E+W+ TF+VN G FN
Sbjct: 64 -ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFN 122

Query: 110 LFQQTMNQFRRQRGGAIVTVASDAAHTPRIGMSAYGASKAALKSLALSVGLELAGSGVRC 169
+ +R G+IVTV S+ A PR M+AY +SKAA +GLELA +RC
Sbjct: 123 ASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRC 182

Query: 170 NVVSPGSTDTDMQRTLWVSDDAEEQRIRGFGEQFKLGIPLGKIARPQEIANTILFLASDL 229
N+VSPGST+TDMQ +LW ++ EQ I+G E FK GIPL K+A+P +IA+ +LFL S
Sbjct: 183 NIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQ 242

Query: 230 ASHITLQDIVVDGGSTLG 247
A HIT+ ++ VDGG+TLG
Sbjct: 243 AGHITMHNLCVDGGATLG 260


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0634   ISCHRISMTASE  444    e-161    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 444 bits (1142), Expect = e-161
Identities = 146/299 (48%), Positives = 195/299 (65%), Gaps = 18/299 (6%)

Query: 1 MAIPKLQAYALPESHDIPQNKVDWAFEPQRAALLIHDMQDYFVSFWGENCPMMEQVIANI 60
MAIP +Q Y +P + D+PQNKV W +P RA LLIHDMQ+YFV + + ++ ANI
Sbjct: 1 MAIPAIQPYQMPTASDMPQNKVSWVPDPNRAVLLIHDMQNYFVDAFTAGASPVTELSANI 60

Query: 61 AALRDYCKQHNIPVYYTAQPKEQSDEDRALLNDMWGPGLTRSPEQQKVVDRLTPDADDTV 120
L++ C Q IPV YTAQP Q+ +DRALL D WGPGL P ++K++ L P+ DD V
Sbjct: 61 RKLKNQCVQLGIPVVYTAQPGSQNPDDRALLTDFWGPGLNSGPYEEKIITELAPEDDDLV 120

Query: 121 LVKWRYSAFHRSPLEQMLKESGRNQLIITGVYAHIGCMTTATDAFMRDIKPFMVADALAD 180
L KWRYSAF R+ L +M+++ GR+QLIITG+YAHIGC+ TA +AFM DIK F V DA+AD
Sbjct: 121 LTKWRYSAFKRTNLLEMMRKEGRDQLIITGIYAHIGCLVTACEAFMEDIKAFFVGDAVAD 180

Query: 181 FSRDEHLMSLKYVAGRSGRVVMTEELL------PAPIPASKA-----------ALREVIL 223
FS ++H M+L+Y AGR VMT+ LL PA + + A +R+ I
Sbjct: 181 FSLEKHQMALEYAAGRCAFTVMTDSLLDQLQNAPADVQKTSANTGKKNVFTCENIRKQIA 240

Query: 224 PLLDESDEPFDDD-NLIDYGLDSVRMMALAARWRKVHGDIDFVMLAKNPTIDAWWKLLS 281
LL E+ E D +L+D GLDSVR+M L +WR+ ++ FV LA+ PTI+ W KLL+
Sbjct: 241 ELLQETPEDITDQEDLLDRGLDSVRIMTLVEQWRREGAEVTFVELAERPTIEEWQKLLT 299


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0631   FERRIBNDNGPP  64     1e-13    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 63.8 bits (155), Expect = 1e-13
Identities = 61/285 (21%), Positives = 101/285 (35%), Gaps = 35/285 (12%)

Query: 40 HTLESQPQRIVSTSVTLTGSLLAIDAPVIASGATTPNNRVADDQGFLRQWSKVAKERKLQ 99
H P RIV+ LLA+ VAD + R W E L
Sbjct: 29 HAAAIDPNRIVALEWLPVELLLALGIVPYG---------VADTINY-RLW---VSEPPLP 75

Query: 100 RLYIG-----EPSTEAVAAQMPDLILISATGGDSALALYDQLSTIAPTLIINYDDKSWQA 154
I EP+ E + P ++ SA G S + L+ IAP N+ D
Sbjct: 76 DSVIDVGLRTEPNLELLTEMKPSFMVWSAGYGPS----PEMLARIAPGRGFNFSDGKQPL 131

Query: 155 L-----LTQLGEITGHEKQAAERIAQFDKQLAAAKEQIKLPPQPVTAIVYTAAAHSANLW 209
LT++ ++ + A +AQ++ + + K + + ++
Sbjct: 132 AMARKSLTEMADLLNLQSAAETHLAQYEDFIRSMKPRFVKRGARPLLLTTLIDPRHMLVF 191

Query: 210 TPESAQGQMLEQLGFTLAKLPAGLNASQSQGKRHDIIQLGGENLAAGLNGESLFLFAGDQ 269
P S ++L++ G NA Q + + + LAA + + L +
Sbjct: 192 GPNSLFQEILDEYGIP--------NAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDHDNS 243

Query: 270 KDADAIYANPLLAHLPAVQNKQVYALGTETFRLDYYSAMQVLERL 314
KD DA+ A PL +P V+ + + F SAM + L
Sbjct: 244 KDMDALMATPLWQAMPFVRAGRFQRVPAVWFYGATLSAMHFVRVL 288


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs0630   TCRTETA  36     2e-04    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 36.0 bits (83), Expect = 2e-04
Identities = 82/394 (20%), Positives = 145/394 (36%), Gaps = 38/394 (9%)

Query: 27 FISIVSLGLLGVAVPVQIQMMTHSTWQV---GLSVTLTGGAMFVGLMVGGVLADRYERKK 83
+ V +GL+ +P ++ + HS G+ + L F V G L+DR+ R+
Sbjct: 15 ALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRP 74

Query: 84 VILLARGTCGIGFIGLCLNALL--PEPSLLAIYLLGLWDGFFASLGVTALLAATPALVGR 141
V+L + G ++ + P L +Y+ + G + G A A +
Sbjct: 75 VLL-------VSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVA-GAYIADITDG 126

Query: 142 ENLMQAGAITMLTVRLGSVISPMIGGLLLATGGVAWNYGLAAAGTFITLLPLLSLPALPP 201
+ + G V P++GGL+ GG + + AA L L LP
Sbjct: 127 DERARHFGFMSACFGFGMVAGPVLGGLM---GGFSPHAPFFAAAALNGLNFLTGCFLLPE 183

Query: 202 PPQPREHPLK----SLLAGFRFLLASPLVGGIALLGGLLTMAS----AVRVLYPALADNW 253
+ PL+ + LA FR+ +V + + ++ + A+ V++ D +
Sbjct: 184 SHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFG--EDRF 241

Query: 254 QMSAAQIGFLYAAIP-LGAAIGALTSGKLAHSARPGLLMLLSTLGS---FLAIGLFGLMP 309
A IG AA L + A+ +G +A ++L + ++ +
Sbjct: 242 HWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGW 301

Query: 310 MWILGVVCLALFGWLSAVSSLLQYTMLQTQTPEAMLGRINGLWTAQNVTGDAIGAALLGG 369
M +V LA G ML Q E G++ G A +G L
Sbjct: 302 MAFPIMVLLASGGIGMPALQ----AMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTA 357

Query: 370 LGAMMTPVASASASGFGLLIIGVLLLLVLVELRR 403
+ A + + +G+ + L LL L LRR
Sbjct: 358 IYA----ASITTWNGWAWIAGAALYLLCLPALRR 387


114  ECs0613  ECs0608  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
ECs0613   0       19       2.209795   inner membrane component for iron transport
ECs0612   -2      18       2.376715   copper/silver efflux system membrane fusion
ECs0611   -1      18       2.032982   copper-binding protein
ECs0610   -1      22       4.230711   copper/silver efflux system outer membrane
ECs0609   0       23       3.736514   CusR family transcriptional regulator
ECs0608   1       23       3.500785   sensor kinase CusS
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0613   ACRIFLAVINRP  694    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 694 bits (1793), Expect = 0.0
Identities = 213/1058 (20%), Positives = 439/1058 (41%), Gaps = 54/1058 (5%)

Query: 1 MIEWIIRRSVANRFLVLMGALFLSIWGTWTIINTPVDALPDLSDVQVIIKTSYPGQAPQI 60
M + IRR + A+ L + G I+ PV P ++ V + +YPG Q
Sbjct: 1 MANFFIRR----PIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQT 56

Query: 61 VENQVTYPLTTTMLSVPGAKTVRGFSQ-FGDSYVYVIFEDGTDPYWARSRVLEYLNQVQG 119
V++ VT + M + + S G + + F+ GTDP A+ +V L
Sbjct: 57 VQDTVTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATP 116

Query: 120 KLPAGVSAELGP-DATGVGWIYEYALVDRSGKHDLADLRSLQDWFLKYELKTIPDVAEVA 178
LP V + + + ++ V + D+ +K L + V +V
Sbjct: 117 LLPQEVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQ 176

Query: 179 SVGGVVKEYQVVIDPQRLAQYGISLAEVKSALDASNQEAGGSSIELA------EAEYMVR 232
G ++ +D L +Y ++ +V + L N + + + +
Sbjct: 177 LFGAQ-YAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASII 235

Query: 233 ASGYLQTLDDFNHIVLKASENGVPVYLRDVAKIQVGPEMRRGIAELNG-EVVGGVVILRS 291
A + ++F + L+ + +G V L+DVA++++G E IA +NG G + L +
Sbjct: 236 AQTRFKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLAT 295

Query: 292 GKNAREVIAAVKDKLETLKSSLPEGVEIVTTYDRSQLIDRAIDNLSGKLLEEFIVVAVVC 351
G NA + A+K KL L+ P+G++++ YD + + +I + L E ++V +V
Sbjct: 296 GANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVM 355

Query: 352 ALFLWHVRSALVAIISLPLGLCIAFIVMHFQGLNANIMSLGGIAIAVGAMVDAAIVMIEN 411
LFL ++R+ L+ I++P+ L F ++ G + N +++ G+ +A+G +VD AIV++EN
Sbjct: 356 YLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVEN 415

Query: 412 AHKRLEEWQHQHPDATLDNKTRWQVITNASVEVGPALFISLLIITLSFIPIFTLEGQEGR 471
+ + E D + + ++ AL ++++ FIP+ G G
Sbjct: 416 VERVMME----------DKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGA 465

Query: 472 LFGPLAFTKTYAMAGAALLAIVVIPILMGYWIRGKIPPESSNPLNRF----------LIR 521
++ + T AMA + L+A+++ P L ++ + E F +
Sbjct: 466 IYRQFSITIVSAMALSVLVALILTPALCATLLKP-VSAEHHENKGGFFGWFNTTFDHSVN 524

Query: 522 VYHPLLLKVLHWPKTTLLVAALSVLTVLWPLNKVGGEFLPQINEGDLLYMPSTLPGISAA 581
Y + K+L LL+ AL V ++ ++ FLP+ ++G L M G +
Sbjct: 525 HYTNSVGKILGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQE 584

Query: 582 EAASMLQKTDKLIM--SVPEVARVFGKTGKAETATDSAPLEMVETTIQLKPQEQW-RPGM 638
+L + + V VF G + + + LKP E+
Sbjct: 585 RTQKVLDQVTDYYLKNEKANVESVFTVNGFSFSGQAQN---AGMAFVSLKPWEERNGDEN 641

Query: 639 TMDKIIEELDNTVRLPGLANLWVPPIRNRIDMLSTGIKSPIGIKVSGTVLADI-DAMAEQ 697
+ + +I + + + +++ + I +G + A +
Sbjct: 642 SAEAVIHRAKMELGKIRDGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQL 701

Query: 698 IEEVARTVPGVASALAERLEGGRYINVEINREKAARYGMTVADVQLFVTSAVGGAMVGET 757
+ A+ + S LE +E+++EKA G++++D+ +++A+GG V +
Sbjct: 702 LGMAAQHPASLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDF 761

Query: 758 VEGIARYPINLRYPQSWRDSPQALRQLPILTPMKQQITLADVADVKVSTGPSMLKTENAR 817
++ + ++ +R P+ + +L + + + + + G L+ N
Sbjct: 762 IDRGRVKKLYVQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGL 821

Query: 818 PTSWIYIDARDRDMVSVVHDLQKAIAEKVQLKPGTSVAFSGQFELLERANHKLKLMVPMT 877
P+ I +A L + +A K L G ++G + ++ +V ++
Sbjct: 822 PSMEIQGEAAPGTSSGDAMALMENLASK--LPAGIGYDWTGMSYQERLSGNQAPALVAIS 879

Query: 878 LMIIFVLLYLAFRRVGEALLIISSVPFALVGGIWLLWWMGFHLSVATGTGFIALAGVAAE 937
+++F+ L + + ++ VP +VG + V G + G++A+
Sbjct: 880 FVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAK 939

Query: 938 FGVVMLMYLRHAIEAEPSLNNPQTFSEQKLDEALHHGAVLRVRPKAMTVAVIIAGLLPIL 997
++++ + + +E E + + EA +R+RP MT I G+LP+
Sbjct: 940 NAILIVEFAKDLMEKE----------GKGVVEATLMAVRMRLRPILMTSLAFILGVLPLA 989

Query: 998 WGTGAGSEVMSRIAAPMIGGMITAPLLSLFIIPAAYKL 1035
GAGS + + ++GGM++A LL++F +P + +
Sbjct: 990 ISNGAGSGAQNAVGIGVMGGMVSATLLAIFFVPVFFVV 1027


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
ECs0610   RTXTOXIND  38     9e-05    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 37.5 bits (87), Expect = 9e-05
Identities = 24/182 (13%), Positives = 59/182 (32%), Gaps = 13/182 (7%)

Query: 254 QAQTVNSDSLQSVKLPA-GLSSQILLQRPDIMEAEHALM-----AANANIGAARAAFFPS 307
+ +S + +K + +I+++ + + L+ A A+ ++
Sbjct: 87 NGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQS----- 141

Query: 308 ISLTSGISTASSDLSSLFNASSGMWNFIPKIEIPIFNAGRNQANLDIAEIRQQQSVVNYE 367
SL + + + + P F + L + + ++Q
Sbjct: 142 -SLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQN 200

Query: 368 QKIQNAFKEVADALALRQSLNDQISAQQRYLASLQITLQRARALYQHGAVSYLEVLDAER 427
QK Q + A R ++ +I+ + + L +L A++ VL+ E
Sbjct: 201 QKYQ-KELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQEN 259

Query: 428 SL 429

Sbjct: 260 KY 261


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
ECs0609   HTHFIS  86     2e-21    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 85.7 bits (212), Expect = 2e-21
Identities = 35/117 (29%), Positives = 62/117 (52%)

Query: 2 KLLIVEDEKKTGEYLTKGLTEAGFVVDLADNGLNGYHLAMTGDYDLIILDIMLPDVNGWD 61
+L+ +D+ L + L+ AG+ V + N + GD DL++ D+++PD N +D
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 IVRMLRSANKGMPILLLTALGTIEHRVKGLELGADDYLVKPFAFAELLARVRTLLRR 118
++ ++ A +P+L+++A T +K E GA DYL KPF EL+ + L
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs0608   PF06580  30     0.018    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 30.2 bits (68), Expect = 0.018
Identities = 30/183 (16%), Positives = 67/183 (36%), Gaps = 34/183 (18%)

Query: 306 EELTRMAKMVSDML-FLAQADNNQLIPEKKMLNLADEVGKVFDFFEALAEDR-GVELQFV 363
+ M +S+++ + + N + + LADE+ V + + LA + LQF
Sbjct: 191 TKAREMLTSLSELMRYSLRYSNARQVS------LADELTVVDSYLQ-LASIQFEDRLQFE 243

Query: 364 GDECQVAGDPLMLRRALSNLLSNALRY----TPPGEAIVVRCQTVDHLVQVIVENPGTPI 419
D + + L+ N +++ P G I+++ + V + VEN G+
Sbjct: 244 NQINPAIMDVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLA 303

Query: 420 APEHLPRLFDRFYRVDPSRQRKGEGSGIGLAIVK---SIVVAHKGTVAVTSNARGTRFVI 476
E +G GL V+ ++ + + ++ ++
Sbjct: 304 LKN------------------TKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMV 345

Query: 477 VLP 479
++P
Sbjct: 346 LIP 348


115  ECs0548  ECs0538  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
ECs0548   4       31       7.805284   adhesin
ECs0546   4       31       8.137908   lipoprotein
ECs0545   4       30       7.965051   DNA-binding transcriptional regulator CueR
ECs0544   4       29       7.730210   membrane fusion protein of a transport system
ECs0543   4       29       7.642712   ABC transporter ATP-binding protein
ECs0542   4       28       7.708506   hypothetical protein
ECs0541   1       15       4.505730   hypothetical protein
ECs0540   0       13       2.628362   outer membrane transport protein
ECs0539   -1      13       2.994385   amino acid/amine transport protein
ECs0538   -2      12       2.215015   glutaminase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs0548   PF03895  55     3e-12    Serum resistance protein DsrA.
		>PF03895#Serum resistance protein DsrA.

Length = 79

Score = 55.2 bits (133), Expect = 3e-12
Identities = 21/78 (26%), Positives = 34/78 (43%), Gaps = 1/78 (1%)

Query: 262 RKEANAGTASAIAIASQPQVKTGDVMMVSAGAGTFNGESAVSVGTSFNAGTHTVLKAGIS 321
KE G A+ A++ Q VSA G + ++A+++G KAG++
Sbjct: 2 SKELQTGLANQSALSMLVQPNGVGKTSVSAAVGGYRDKTALAIGVGSRITDRFTAKAGVA 61

Query: 322 ADTQS-DFGAGVGVGYSF 338
+T + G VGY F
Sbjct: 62 FNTYNGGMSYGASVGYEF 79


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
ECs0544   RTXTOXIND  257    1e-83    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 257 bits (657), Expect = 1e-83
Identities = 103/434 (23%), Positives = 168/434 (38%), Gaps = 56/434 (12%)

Query: 11 LTEPRLPRSALAV-RVTAVMLLCFLGWAWYFQLDEVTTGSGTVEPSGREQVVQSLEGGIL 69
L E + R V L+ + Q++ V T +G + SGR + ++ +E I+
Sbjct: 48 LIETPVSRRPRLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIV 107

Query: 70 YHLDVKVGDIVEQGQPLAQLNRTKTESDVQEAMSRLYAALATSARLRAEVSNK------P 123
+ VK G+ V +G L +L E+D + S L A R + +
Sbjct: 108 KEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPE 167

Query: 124 LVFPDEL----------------------NKFPELIESETALYNTR--RDGLNKATTGLT 159
L PDE + + E L R R +
Sbjct: 168 LKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYE 227

Query: 160 QGISLVNRELAMTQPLVKQGAASSVEVLRLQRQANELEN--------------------- 198
+ L L+ + A + VL + + E N
Sbjct: 228 NLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKE 287

Query: 199 KLSDVRTQYYVQAREELAKANAEVETQRSVIRGREDSLTRLNFTAPVRGIVQDIDVTTVG 258
+ V + + ++L + + + E+ APV VQ + V T G
Sbjct: 288 EYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEG 347

Query: 259 GVIAPGGKLMTIVPLDEQLLIEAKISPRDVAFIHPGQKSLVKITAYDYSIYGGLPGEVAV 318
GV+ LM IVP D+ L + A + +D+ FI+ GQ +++K+ A+ Y+ YG L G+V
Sbjct: 348 GVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKN 407

Query: 319 ISPDTVQDEVRRDVYYYRVYIRTFSNHLENKSKQQFPIFPGMVATVDIRTGKKSVLDYLL 378
I+ D ++D+ R + V I N L +K P+ GM T +I+TG +SV+ YLL
Sbjct: 408 INLDAIEDQ--RLGLVFNVIISIEENCLSTGNK-NIPLSSGMAVTAEIKTGMRSVISYLL 464

Query: 379 KPF-NKAQEALRER 391
P E+LRER
Sbjct: 465 SPLEESVTESLRER 478


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
ECs0542   CABNDNGRPT  45     1e-05    NodO calcium binding signature.
		>CABNDNGRPT#NodO calcium binding signature.

Length = 479

Score = 44.6 bits (105), Expect = 1e-05
Identities = 43/221 (19%), Positives = 68/221 (30%), Gaps = 12/221 (5%)

Query: 5048 GTASNNGKFVGTGYNDTFFATAGTDTYDGSGGWVYSSGTGTWLANGGMDVVDFRLSTVGV 5107
T + F D + AT + S + T + ++ +
Sbjct: 265 RTGDSVYGFNSNTDRDFYTATDSSKALIFSVWDAGGTDTFDFSGYSNNQRINLNEGSFSD 324

Query: 5108 TANLSSTAAQATGFNTSTFTNIEGISGSNFNDILTGSSGDNQLEGRGGNDTLNIGNGGHD 5167
L + A G IE G + NDIL G+S DN L+G GND L G G
Sbjct: 325 VGGLKGNVSIAHG------VTIENAIGGSGNDILVGNSADNILQGGAGNDVLYGGAG--A 376

Query: 5168 TLLYKLLNASDATGGNGSDVVNGFTVGTWEGTADTDRIDIRELLQGSGYTG-NGKASYVN 5226
LY G+G D + D+ID+ + + +
Sbjct: 377 DTLYGGAGRDTFVYGSGQDSTVAAYDWIADFQKGIDKIDLSAFRNEGQLSFVQDQFTGKG 436

Query: 5227 GVATLDAQAGNIGDFVKVTQS---GSDTIVQIDRDGTGGTF 5264
L A N + + ++ D +V+I
Sbjct: 437 QEVMLQWDAANSITNLWLHEAGHSSVDFLVRIVGQAAQSDI 477


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs0541   INTIMIN  37     5e-04    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 37.4 bits (86), Expect = 5e-04
Identities = 63/372 (16%), Positives = 115/372 (30%), Gaps = 44/372 (11%)

Query: 707 QTVTVTLNGQTYQGVVQPDGTWSVTVPAANVGALADGNA--TVTASVNDVAGNPSSVSRV 764
T+TV NGQ V D T A A ADG T TA+V ++V
Sbjct: 544 LTITVLSNGQVVDQVGVTDFT------ADKTSAKADGTEAITYTATVKKNGVAQANVPVS 597

Query: 765 ALVDATPPVVTINPVATDNVINTPEHAQAQIISGTVTGAQAGDIVTVTLNNVDYTTVVDG 824
+ + V++ N T+ ++ V A+ ++ + N + VD
Sbjct: 598 FNIVSGTAVLSANSANTNGSGKATVTLKSDKPGQVVVSAKTAEMTSAL--NANAVIFVDQ 655

Query: 825 SGNWSLGVPASVVSGLADGSYPVSVSVTDKAGNTGSQSLTVTVNTAAPLIGINSIAGDDV 884
+ + A + +A+G ++ +V G+ + VT T +
Sbjct: 656 TKASITEIKADKTTAVANGQDAITYTVKVMKGDKPVSNQEVTFTTTLGKLSN-------- 707

Query: 885 INASEKGADLQITGTSDQPVNTAITVTLNGQNYTTTTDASGNWSVTVPASAVTALGQANY 944
T +D +T+T + + + +V V A V
Sbjct: 708 -----------STEKTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFF--TTL 754

Query: 945 TVTAAVTSDIGNSATASHNVLVDSALPGVTINPVATDDIINAAEAGVAQTISGQVTGAED 1004
T+ +G V LP V + + + + + D
Sbjct: 755 TIDDGNIEIVGTG--------VKGKLPTVWLQYGQVNLKASGGNGKYTWRSANPAIASVD 806

Query: 1005 GDTVTITL---GGNTYTATVGSN--LTWSVDVPAADIQALGNGDLTVNASVTNQNGNTGS 1059
+ +TL G T + N T+++ P + I + +T N +V G
Sbjct: 807 ASSGQVTLKEKGTTTISVISSDNQTATYTIATPNSLIVPNMSKRVTYNDAVNTCKNFGGK 866

Query: 1060 GTRDITIDANLP 1071
N+
Sbjct: 867 LPSSQNELENVF 878



Score = 35.0 bits (80), Expect = 0.002
Identities = 81/416 (19%), Positives = 139/416 (33%), Gaps = 61/416 (14%)

Query: 841 ADGSYPVSVSVTDKAGN-TGSQSLTVTVNTAAPLIGINSIAGDDVINASEKGADLQITGT 899
Y V+ D+ GN + + LT+TV + + D + + AD GT
Sbjct: 521 GSNVYKVTARAYDRNGNSSNNVLLTITV-LSNGQVVDQVGVTDFTADKTSAKAD----GT 575

Query: 900 SDQPVNTAITVTLNGQNYTTTTDASGNWSVTVPASAVTALGQANYTVTAAVTSDIGNSAT 959
AIT YT T +G VP S G A + +A T + +
Sbjct: 576 ------EAIT-------YTATVKKNGVAQANVPVSFNIVSGTAVLSANSANT-----NGS 617

Query: 960 ASHNVLVDSALPGVTINPVATDDIINAAEAGVAQTISGQVTGAEDGDTVTITLGGNTYTA 1019
V + S PG + T ++ +A A A Q I T A
Sbjct: 618 GKATVTLKSDKPGQVVVSAKTAEMTSALNAN-AVIFVDQTK----ASITEIKADKTTAVA 672

Query: 1020 TVGSNLTWSVDVPAADIQALGNGDLTVNASVTNQNG----NTGSGTRDITIDANLPG--- 1072
+T++V V D + + N ++T ++ + +G +T+ + PG
Sbjct: 673 NGQDAITYTVKVMKGD-KPVSNQEVTFTTTLGKLSNSTEKTDTNGYAKVTLTSTTPGKSL 731

Query: 1073 --LRVDTVAGDDVVNIIEHGQALVVTGSS-----SGLAESTP----------LTVTINNV 1115
RV VA D +E L + + +G+ P L + N
Sbjct: 732 VSARVSDVAVDVKAPEVEFFTTLTIDDGNIEIVGTGVKGKLPTVWLQYGQVNLKASGGNG 791

Query: 1116 EYTTAVQADGSWSVGVTAAQVSAWPAGTVNIAVSGESSAGNSVSITHPVTVDLTPAAITI 1175
+YT SV ++ QV+ GT I+V + + +I TP ++ +
Sbjct: 792 KYTWRSANPAIASVDASSGQVTLKEKGTTTISVISSDNQTATYTIA-------TPNSLIV 844

Query: 1176 NTIATDDVINAAEKGADLTLSGTTTNVEPGQTVTVTFGGKNYTASVASDGSWTATV 1231
++ N A ++ + V +G N S + + V
Sbjct: 845 PNMSKRVTYNDAVNTCKNFGGKLPSSQNELENVFKAWGAANKYEYYKSSQTIISWV 900


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
ECs0540   RTXTOXIND  32     0.006    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 31.7 bits (72), Expect = 0.006
Identities = 22/165 (13%), Positives = 57/165 (34%), Gaps = 20/165 (12%)

Query: 199 DELQAQTRIAGMRSTLEQYQAQMASAKAQLAVLTGVQPEAIAAP----PAELAEQPVSLK 254
L A+ +S+L Q + + + + + + P ++E+ V
Sbjct: 128 TALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVL-- 185

Query: 255 NIDYQSIPLVLAAENLRQSAQYGVEKTKAQYWPTLSIQGGKTRYQTSDRSYWDDQLQLNV 314
+ L+ + Q+ +Y E + R + ++ +L+
Sbjct: 186 ----RLTSLIKEQFSTWQNQKYQKELNLDKK--RAERLTVLARINRYENLSRVEKSRLDD 239

Query: 315 NAPLYQGGAVS--------AQVQQAEGQQKISASQVEQAKLDVLQ 351
+ L A++ + +A + ++ SQ+EQ + ++L
Sbjct: 240 FSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILS 284


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
ECs0538   BLACTAMASEA  29     0.021    Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 29.0 bits (65), Expect = 0.021
Identities = 11/43 (25%), Positives = 19/43 (44%)

Query: 38 GQLAAVAIVTSDGNVYSAGDSDYRFALESISKVCTLALALEDV 80
G++ + + + G +A +D RF + S KV L V
Sbjct: 38 GRVGMIEMDLASGRTLTAWRADERFPMMSTFKVVLCGAVLARV 80


116  ECs0523  ECs0510  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
ECs0523   3       16       2.417078   DNA polymerase III subunits gamma and tau
ECs0522   1       15       0.165647   adenine phosphoribosyltransferase
ECs0521   1       13       0.410659   hypothetical protein
ECs0520   0       16       0.996678   primosomal replication protein N''
ECs0519   0       15       0.137110   hypothetical protein
ECs0518   0       16       -0.074782  potassium efflux protein KefA
ECs0517   1       15       -0.553229  AcrR family transcriptional regulator
ECs0516   1       17       -0.244018  acridine efflux pump
ECs0515   -1      12       -1.597352  acridine efflux pump
ECs0514   1       19       -4.961117  hypothetical protein
ECs0513   1       19       -3.465281  hemolysin expression-modulating protein
ECs0512   -1      18       -2.134228  maltose O-acetyltransferase
ECs0511   -1      15       -1.370392  hypothetical protein
ECs0510   -3      11       0.302484   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
ECs0523   IGASERPTASE  41     2e-05    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 40.8 bits (95), Expect = 2e-05
Identities = 40/251 (15%), Positives = 77/251 (30%), Gaps = 31/251 (12%)

Query: 404 PLPETTSQVLAARQQLQRVQGATKAKKSEPAA----ATRARPVNNAALERLASVTDRVQA 459
P E +Q + + + P+ AR + A + A T
Sbjct: 983 PEVEKRNQTVDTTN----ITTPNNIQADVPSVPSNNEEIARV-DEAPVPPPAPATPSETT 1037

Query: 460 RPVPSALEKAPAKKEAYRWKATTPVMQQKE--------VVATPKALKKA---LEHEKTPE 508
V ++ E AT Q +E V A + + A E ++T
Sbjct: 1038 ETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQT 1097

Query: 509 LAAKLAA---------EAIERDAWAAQVSQLSLPKLVEQVALNAWKE-ESDNAVCLHLRS 558
K A E+ +V+ PK + + E +N ++++
Sbjct: 1098 TETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKE 1157

Query: 559 SQRHLNNRGAQQKLAEALS-MLKGSTVELTIVEDDNPAVRTPLEWRQAIYEEKLAQARES 617
Q N ++ A+ S ++ E T V N V P A + + +
Sbjct: 1158 PQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSN 1217

Query: 618 IIADNNIQTLR 628
+ + +++R
Sbjct: 1218 KPKNRHRRSVR 1228


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
ECs0518   RTXTOXIND  32     0.017    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 31.7 bits (72), Expect = 0.017
Identities = 19/125 (15%), Positives = 40/125 (32%), Gaps = 6/125 (4%)

Query: 28 QNTAFARASSNGDLPTKADLQAQLDSLNKQKDLSAQDKLVQQDLTDTLATLDKIDRVKEE 87
N RA L + + L L+ + A L++ ++ E
Sbjct: 207 LNLDKKRAERLTVLARINRYENLSRVEKSR--LDDFSSLLHKQAIAKHAVLEQENKYVEA 264

Query: 88 TVQLRQKVAEAPEKMRQATAALTALSDVDND--EETRKIL--STLSLRQLETRVAQALDD 143
+LR ++ + + +A V E L +T ++ L +A+ +
Sbjct: 265 VNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEER 324

Query: 144 LQNAQ 148
Q +
Sbjct: 325 QQASV 329


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs0517   HTHTETR  221    1e-75    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 221 bits (564), Expect = 1e-75
Identities = 214/215 (99%), Positives = 214/215 (99%)

Query: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60
MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60

Query: 61 EIWELSESNIGELELEYQAKFPSDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV 120
EIWELSESNIGELELEYQAKFP DPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV
Sbjct: 61 EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV 120

Query: 121 GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF 180
GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF
Sbjct: 121 GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF 180

Query: 181 APQSFDLKKEARDYVAILLEMYLLCPTLRNPATNE 215
APQSFDLKKEARDYVAILLEMYLLCPTLRNPATNE
Sbjct: 181 APQSFDLKKEARDYVAILLEMYLLCPTLRNPATNE 215


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
ECs0516   RTXTOXIND  44     6e-07    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 44.0 bits (104), Expect = 6e-07
Identities = 33/212 (15%), Positives = 71/212 (33%), Gaps = 23/212 (10%)

Query: 100 TYQATYDSAKGDLAKAQAAANIAQLTVNRYQKLLGTQYISKQEYDQALADAQQANAAVTA 159
+ Y A +L + + Q+ Q +++ ++ L +Q +
Sbjct: 256 EQENKYVEAVNELR--VYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGL 313

Query: 160 AKAAVETARINLAYTKVTSPISGRIGKSNV-TEGALVQNGQATALATVQQLDPIYVDVTQ 218
+ + + +P+S ++ + V TEG +V + T + V + D + V
Sbjct: 314 LTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAE-TLMVIVPEDDTLEVTALV 372

Query: 219 SSNDFLRLKQELA----------NGTLKQENGKAKVSLITSDGIKFPQDGTLEFSDVTVD 268
+ D + KV I D I+ + G + ++++
Sbjct: 373 QNKDIGFINVGQNAIIKVEAFPYTRYGYLV---GKVKNINLDAIEDQRLGLVFNVIISIE 429

Query: 269 QTTGSITLRAIFPNPDHTLLPGMFVRARLEEG 300
+ S + I L GM V A ++ G
Sbjct: 430 ENCLSTGNKNIP------LSSGMAVTAEIKTG 455



Score = 34.4 bits (79), Expect = 8e-04
Identities = 24/125 (19%), Positives = 43/125 (34%), Gaps = 13/125 (10%)

Query: 49 PLQITTELPGR-TSAYRIAEVRPQVSGIILKRNFKEGSDIEAGVSLYQIDPATYQATYDS 107
++I G+ T + R E++P + I+ + KEG + G L ++ +A
Sbjct: 79 QVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEA---- 134

Query: 108 AKGDLAKAQAAANIAQLTVNRYQKLLGTQYISKQEYDQALADAQQANAAVTAAKAAVETA 167
D K Q++ A+L RYQ L E ++
Sbjct: 135 ---DTLKTQSSLLQARLEQTRYQILS-----RSIELNKLPELKLPDEPYFQNVSEEEVLR 186

Query: 168 RINLA 172
+L
Sbjct: 187 LTSLI 191


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0515   ACRIFLAVINRP  1369   0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 1369 bits (3545), Expect = 0.0
Identities = 801/1033 (77%), Positives = 915/1033 (88%), Gaps = 1/1033 (0%)

Query: 1 MPNFFIDRPIFAWVIAIIIMLAGGLAILKLPVAQYPTIAPPAVTISASYPGADAKTVQDT 60
M NFFI RPIFAWV+AII+M+AG LAIL+LPVAQYPTIAPPAV++SA+YPGADA+TVQDT
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VTQVIEQNMNGIDNLMYMSSNSDSTGTVQITLTFESGTDADIAQVQVQNKLQLAMPLLPQ 120
VTQVIEQNMNGIDNLMYMSS SDS G+V ITLTF+SGTD DIAQVQVQNKLQLA PLLPQ
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 121 EVQQQGVSIEKSSSSFLMVVGVINTDGTMTQEDISDYVAANMKDAISRTSGVGDVQLFGS 180
EVQQQG+S+EKSSSS+LMV G ++ + TQ+DISDYVA+N+KD +SR +GVGDVQLFG+
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 181 QYAMRIWMNPNELNKFQLTPVDVITAIKAQNAQVAAGQLGGTPPVKGQQLNASIIAQTRL 240
QYAMRIW++ + LNK++LTPVDVI +K QN Q+AAGQLGGTP + GQQLNASIIAQTR
Sbjct: 181 QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRF 240

Query: 241 TSTEEFGKILLKVNQDGSRVLLRDVAKIELGGENYDIIAEFNGQPASGLGIKLATGANAL 300
+ EEFGK+ L+VN DGS V L+DVA++ELGGENY++IA NG+PA+GLGIKLATGANAL
Sbjct: 241 KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL 300

Query: 301 DTAAAIRAELAKMEPFFPSGLKIVYPYDTTPFVKISIHEVVKTLVEAIILVFLVMYLFLQ 360
DTA AI+A+LA+++PFFP G+K++YPYDTTPFV++SIHEVVKTL EAI+LVFLVMYLFLQ
Sbjct: 301 DTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQ 360

Query: 361 NFRATLIPTIAVPVVLLGTFAVLAAFGFSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420
N RATLIPTIAVPVVLLGTFA+LAAFG+SINTLTMFGMVLAIGLLVDDAIVVVENVERVM
Sbjct: 361 NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420

Query: 421 AEEGLPPKEATRKSMGQIQGALVGIAMVLSAVFVPMAFFGGSTGAIYRQFSITIVSAMAL 480
E+ LPPKEAT KSM QIQGALVGIAMVLSAVF+PMAFFGGSTGAIYRQFSITIVSAMAL
Sbjct: 421 MEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL 480

Query: 481 SVLVALILTPALCATMLKPIAKGDHGEGKKGFFGWFNRMFEKSTHHYTDSVGGILRSTGR 540
SVLVALILTPALCAT+LKP++ H E K GFFGWFN F+ S +HYT+SVG IL STGR
Sbjct: 481 SVLVALILTPALCATLLKPVSAE-HHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 541 YLVLYLIIVVGMAYLFVRLPSSFLPDEDQGVFMTMVQLPAGATQERTQKVLNEVTHYYLT 600
YL++Y +IV GM LF+RLPSSFLP+EDQGVF+TM+QLPAGATQERTQKVL++VT YYL
Sbjct: 540 YLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLK 599

Query: 601 KEKNNVESVFAVNGFGFAGRGQNTGIAFVSLKDWADRPGEENKVEAITMRATRAFSQIKD 660
EK NVESVF VNGF F+G+ QN G+AFVSLK W +R G+EN EA+ RA +I+D
Sbjct: 600 NEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRD 659

Query: 661 AMVFAFNLPAIVELGTATGFDFELIDQAGLGHEKLTQARNQLLAEAAKHPDMLTSVRPNG 720
V FN+PAIVELGTATGFDFELIDQAGLGH+ LTQARNQLL AA+HP L SVRPNG
Sbjct: 660 GFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNG 719

Query: 721 LEDTPQFKIDIDQEKAQALGVSINDINTTLGAAWGGSYVNDFIDRGRVKKVYVMSEAKYR 780
LEDT QFK+++DQEKAQALGVS++DIN T+ A GG+YVNDFIDRGRVKK+YV ++AK+R
Sbjct: 720 LEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFR 779

Query: 781 MLPDDIGDWYVRAADGQMVPFSAFSSSRWEYGSPRLERYNGLPSMEILGQAAPGKSTGEA 840
MLP+D+ YVR+A+G+MVPFSAF++S W YGSPRLERYNGLPSMEI G+AAPG S+G+A
Sbjct: 780 MLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDA 839

Query: 841 MELMEQLASKLPTGVGYDWTGMSYQERLSGNQAPSLYAISLIVVFLCLAALYESWSIPFS 900
M LME LASKLP G+GYDWTGMSYQERLSGNQAP+L AIS +VVFLCLAALYESWSIP S
Sbjct: 840 MALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVS 899

Query: 901 VMLVVPLGVIGALLAATFRGLTNDVYFQVGLLTTIGLSAKNAILIVEFAKDLMDKEGKGL 960
VMLVVPLG++G LLAAT NDVYF VGLLTTIGLSAKNAILIVEFAKDLM+KEGKG+
Sbjct: 900 VMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGV 959

Query: 961 IEATLDAVRMRLRPILMTSLAFILGVMPLVISTGAGSGAQNAVGTGVMGGMVTATVLAIF 1020
+EATL AVRMRLRPILMTSLAFILGV+PL IS GAGSGAQNAVG GVMGGMV+AT+LAIF
Sbjct: 960 VEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIF 1019

Query: 1021 FVPVFFVVVRRRF 1033
FVPVFFVV+RR F
Sbjct: 1020 FVPVFFVVIRRCF 1032


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0510   BCTERIALGSPF  30     0.028    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F

signature.
Length = 408

Score = 29.8 bits (67), Expect = 0.028
Identities = 31/137 (22%), Positives = 54/137 (39%), Gaps = 24/137 (17%)

Query: 247 IWLPLGLVIGLLAAMFVLRILRRIQSPHHRLQDAIENRDICVHYQPIVSLANGKIVGAEA 306
W+ L L+ G +A +LR R+ + + P++ G+I
Sbjct: 228 PWMLLALLAGFMAFRVMLR------QEKRRVS-----FHRRLLHLPLI----GRIARGLN 272

Query: 307 LARWPQTDGSWLSPDSFIPLAQQTGLS-EPLTLLIIRSVFEDMGDWLRQHPQQHISINLE 365
AR+ +T + S +PL Q +S + ++ R D +R+ H + LE
Sbjct: 273 TARYARTLSILNA--SAVPLLQAMRISGDVMSNDYARHRLSLATDAVREGVSLHKA--LE 328

Query: 366 STVLTSEKIPQLLREMI 382
T L P ++R MI
Sbjct: 329 QTAL----FPPMMRHMI 341


117  ECs0500  ECs0487  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
ECs0500   -2      14       0.220310   hypothetical protein
ECs0499   0       14       0.246797   hypothetical protein
ECs0498   0       18       0.005956   7-cyano-7-deazaguanine synthase QueC
ECs0497   1       21       0.020349   hypothetical protein
ECs0496   0       21       0.238241   hypothetical protein
ECs0495   3       28       -0.033982  peptidyl-prolyl cis-trans isomerase
ECs0494   3       27       -0.310155  transcriptional regulator HU subunit beta
ECs0493   3       28       -0.233937  DNA-binding ATP-dependent protease La
ECs0492   0       22       0.123295   ATP-dependent protease ATP-binding protein ClpX
ECs0491   -1      21       -0.004821  ATP-dependent Clp protease proteolytic subunit
ECs0490   -1      24       0.457448   trigger factor
ECs0489   -2      21       0.834967   transcriptional regulator BolA
ECs0488   -1      22       0.897298   hypothetical protein
ECs0487   0       20       1.154314   muropeptide transporter
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
ECs0500   HTHFIS  29     0.020    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 29.0 bits (65), Expect = 0.020
Identities = 12/64 (18%), Positives = 24/64 (37%), Gaps = 10/64 (15%)

Query: 197 LTVLTQHLGLSLRDCMAFGDAMNDREMLGSVGSGFIMGN----------AMPQLRAELPH 246
TVL Q L + D +A + + ++ + +P+++ P
Sbjct: 16 RTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDLLPRIKKARPD 75

Query: 247 LPVI 250
LPV+
Sbjct: 76 LPVL 79


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs0497   PF08280  28     0.018    M protein trans-acting positive regulator
		>PF08280#M protein trans-acting positive regulator

Length = 530

Score = 27.5 bits (61), Expect = 0.018
Identities = 24/138 (17%), Positives = 41/138 (29%), Gaps = 20/138 (14%)

Query: 1 MQTQIKVRGYHLDVYQHVNNARYL-------EFLEEARWDGLENSDSFHWMTAH------ 47
+Q I + Y N Y E++ + N FH +
Sbjct: 361 LQHFIPETNLFVSPYYKGNQKLYTSLKLIVEEWMAKLPGKRYLNHKHFHLFCHYVEQILR 420

Query: 48 ------NIAFVVVN-ININYRRPAVLSDLLTITSQLQQLNGKSGILSQVITLEPEGQVVA 100
+ FV N IN + + + + Q+ L+P+ +
Sbjct: 421 NIQPPLVVVFVASNFINAHLLTDSFPRYFSDKSIDFHSYYLLQDNVYQIPDLKPDLVITH 480

Query: 101 DALITFVCIDLKTQKALA 118
LI FV +L A+A
Sbjct: 481 SQLIPFVHHELTKGIAVA 498


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0494   DNABINDINGHU  117    3e-38    Prokaryotic integration host factor signature.
		>DNABINDINGHU#Prokaryotic integration host factor signature.

Length = 91

Score = 117 bits (294), Expect = 3e-38
Identities = 49/88 (55%), Positives = 67/88 (76%)

Query: 2 NKSQLIDKIAAGADISKAAAGRALDAIIASVTESLKEGDDVALVGFGTFAVKERAARTGR 61
NK LI K+A +++K + A+DA+ ++V+ L +G+ V L+GFG F V+ERAAR GR
Sbjct: 3 NKQDLIAKVAEATELTKKDSAAAVDAVFSAVSSYLAKGEKVQLIGFGNFEVRERAARKGR 62

Query: 62 NPQTGKEITIAAAKVPSFRAGKALKDAV 89
NPQTG+EI I A+KVP+F+AGKALKDAV
Sbjct: 63 NPQTGEEIKIKASKVPAFKAGKALKDAV 90


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
ECs0493   GPOSANCHOR  34     0.002    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 34.3 bits (78), Expect = 0.002
Identities = 24/90 (26%), Positives = 49/90 (54%), Gaps = 10/90 (11%)

Query: 191 ERLEYLMAMMESEIDLLQVEKRIRNRVKKQMEKSQREYYLNEQMKAIQKELGEMDDAPD- 249
LE A +E + +L R +++ ++ S+ +Q++A ++L E + +
Sbjct: 291 AALEAEKADLEHQSQVLNAN---RQSLRRDLDASREAK---KQLEAEHQKLEEQNKISEA 344

Query: 250 ENEALKRKIDAAKMPKEAKEKAEAELQKLK 279
++L+R +DA++ EAK++ EAE QKL+
Sbjct: 345 SRQSLRRDLDASR---EAKKQLEAEHQKLE 371


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
ECs0492   HTHFIS  29     0.043    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 29.0 bits (65), Expect = 0.043
Identities = 16/73 (21%), Positives = 29/73 (39%), Gaps = 13/73 (17%)

Query: 60 ERSALPTPHEIRNHLDDYVIGQEQAKKVLAVAVYNHYKRLRNGDTSNGVELGKSNILLIG 119
E P+ E + ++G+ A + +Y RL D +++ G
Sbjct: 121 EPKRRPSKLEDDSQDGMPLVGRSAAMQ----EIYRVLARLMQTD---------LTLMITG 167

Query: 120 PTGSGKTLLAETL 132
+G+GK L+A L
Sbjct: 168 ESGTGKELVARAL 180


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs0488   PF06291  27     0.029    Lambda prophage Bor protein
		>PF06291#Lambda prophage Bor protein

Length = 102

Score = 26.5 bits (58), Expect = 0.029
Identities = 11/34 (32%), Positives = 18/34 (52%)

Query: 3 KKILFPLVALFMLAGCARPPTTIEVSPTITLPQQ 36
KK+LF ++ GCA+ T+ PT P++
Sbjct: 7 KKMLFSAALAMLITGCAQQTFTVGNKPTAVTPKE 40


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs0487   TCRTETA  39     3e-05    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 39.0 bits (91), Expect = 3e-05
Identities = 71/347 (20%), Positives = 135/347 (38%), Gaps = 20/347 (5%)

Query: 62 KFLWSPLMDRYTPPFFGRRRGWLLATQILLLVAIAAMGFLEPGTQLRWMAALAVVIAFCS 121
+F +P++ + F RR LL + V A M W+ + ++A +
Sbjct: 56 QFACAPVLGALSDRF--GRRPVLLVSLAGAAVDYAIMAT----APFLWVLYIGRIVAGIT 109

Query: 122 ASQDIVFDAWKTDVLPAEERGAGAAISVLGYRLGMLVSGGLALWLADKWLGWQGMYWLMA 181
+ V A+ D+ +ER + GM+ L + ++ A
Sbjct: 110 GATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGG--FSPHAPFFAAA 167

Query: 182 AL-LIPCIIATLLAPEP--TDTIPVPKTLEQAVVAPLRDFFGRNNAWLILLLIVLYKLGD 238
AL + + L PE + P+ + + + A L+ + ++ +G
Sbjct: 168 ALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQ 227

Query: 239 AFAMSLTTTFLIRGVGFDAGEVGVVNKTLGLLATIVGALYGGILMQRLSLFRALLIFGIL 298
A +L F +DA +G+ G+L ++ A+ G + RL RAL+ G++
Sbjct: 228 VPA-ALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALM-LGMI 285

Query: 299 QGASNAGYWLLSITDKHLYSMGAAVFFENLCGGMGTSAFVALLMTLCNKSFSATQFALLS 358
A GY LL+ + + V GG+G A A+L ++ L+
Sbjct: 286 --ADGTGYILLAFATRGWMAFPIMVLL--ASGGIGMPALQAMLSRQVDEERQGQLQGSLA 341

Query: 359 ALSAVGRVYVGPVAGWFVEAHGWSTF--YLFSVAAAVPGLILLLVCR 403
AL+++ + VGP+ + A +T+ + + AA+ L L + R
Sbjct: 342 ALTSLTSI-VGPLLFTAIYAASITTWNGWAWIAGAALYLLCLPALRR 387


118  ECs0450  ECs0444  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
ECs0450   1       14       1.944682   phosphate regulon sensor protein
ECs0449   2       14       1.757449   transcriptional regulator PhoB
ECs0448   2       14       1.188987   exonuclease SbcD
ECs0447   3       16       0.924834   exonuclease SbcC
ECs0446   1       18       -0.415182  MFS transport protein AraJ
ECs0444   -1      19       -1.075962  fructokinase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs0450   PF06580  34     0.001    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 34.1 bits (78), Expect = 0.001
Identities = 19/105 (18%), Positives = 33/105 (31%), Gaps = 26/105 (24%)

Query: 325 LVYNAVNH----TPEGTHITVRWQRVPHGAEFSVEDNGPGIAPEHIPRLTERFYRVDKAR 380
LV N + H P+G I ++ + VE+ G
Sbjct: 263 LVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKN---------------- 306

Query: 381 SRQTGGSGLGLAIVKHAVNH---HESRLNIESTVGKGTRFSFVIP 422
+G GL V+ + E+++ + GK +IP
Sbjct: 307 --TKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAM-VLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
ECs0449   HTHFIS  95     1e-24    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 94.9 bits (236), Expect = 1e-24
Identities = 33/149 (22%), Positives = 62/149 (41%), Gaps = 9/149 (6%)

Query: 4 RILVVEDEAPIREMVCFVLEQNGFQPVEAEDYDSAVNQLNEPWPDLILLDWMLPGGSGIQ 63
ILV +D+A IR ++ L + G+ + + + DL++ D ++P +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 64 FIKHLKRESMTRDIPVVMLTARGEEEDRVRGLETGADDYITKPFSPKELVARIKAVMRRI 123
+ +K+ D+PV++++A+ ++ E GA DY+ KPF EL+ I +
Sbjct: 65 LLPRIKKARP--DLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA-- 120

Query: 124 SPMAVEEVIEMQGLSLDPTSHRVMAGEEP 152
E L D + G
Sbjct: 121 -----EPKRRPSKLEDDSQDGMPLVGRSA 144


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
ECs0448   FRAGILYSIN  30     0.022    Fragilysin metallopeptidase (M10C) enterotoxin signat...
		>FRAGILYSIN#Fragilysin metallopeptidase (M10C) enterotoxin

signature.
Length = 405

Score = 29.7 bits (66), Expect = 0.022
Identities = 13/70 (18%), Positives = 23/70 (32%), Gaps = 4/70 (5%)

Query: 149 KQQHLLAAITDYYQQHYADACKLRGDQPLPIIATGHLTTVGASKSDAVRDIYIGTLDAFP 208
K+ ++ I ++Y + + + I T D + + I A
Sbjct: 135 KEAQMMNEIAEFYAAPFKKTRAINEKEAFECI-YDSRTRSA--GKD-IVSVKINIDKAKK 190

Query: 209 AQNFPPADYI 218
N P DYI
Sbjct: 191 ILNLPECDYI 200


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
ECs0447   RTXTOXIND  39     7e-05    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D

signature.
Length = 478

Score = 39.4 bits (92), Expect = 7e-05
Identities = 34/199 (17%), Positives = 71/199 (35%), Gaps = 14/199 (7%)

Query: 671 QQEAQSWQQRQNELTALQNRIQQLTPILETLPQSDDLPHSEETVALDNWRQVHEQCLALH 730
+ + Q + Q R Q L+ +E + E + +V +
Sbjct: 133 EADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIK 192

Query: 731 SQQQTLQQQDVLAAQSLQKAQAQFDTAL--------QASVFDDQQAFLAALMDEQTLTQL 782
Q T Q Q +L K +A+ T L + V + ++L+ +Q +
Sbjct: 193 EQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIA-- 250

Query: 783 EQLKQNLENQRRQAQTLVTQTAETLAQHQQHRPDGLALTVTVEQIQQEL-AQTHQKLREN 841
K + Q + V + +Q +Q + L+ + + Q + KLR+
Sbjct: 251 ---KHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQT 307

Query: 842 TTSQGEIRQQLKQDADNRQ 860
T + G + +L ++ + +Q
Sbjct: 308 TDNIGLLTLELAKNEERQQ 326



Score = 39.4 bits (92), Expect = 7e-05
Identities = 25/204 (12%), Positives = 59/204 (28%), Gaps = 18/204 (8%)

Query: 487 EARIKTLEAQRAQLQAGQPCPLCGSTSHPAVEAYQALEPGVNQSRLLALENEVKKLGEEG 546
EA ++ Q + Q ++E + E + +E + L
Sbjct: 133 EADTLKTQSSLLQARLEQ---TRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLT- 188

Query: 547 AALRGQLDALTKQLQRDENEAQSLRQDEQALTQQWQAVTASLNITLQPQDDIQPWLDAQD 606
+ ++ Q Q + E R + + + + DD L Q
Sbjct: 189 SLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQA 248

Query: 607 -------EHERQL-RLLSQRHELQGQIAAHNQQIIQYQQQIEQRQQQLLTALAGYALTLP 658
E E + +++ + Q+ +I+ +++ + Q L
Sbjct: 249 IAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLF------KNEILD 302

Query: 659 QEDEEESWLATRQQEAQSWQQRQN 682
+ + + E ++RQ
Sbjct: 303 KLRQTTDNIGLLTLELAKNEERQQ 326



Score = 32.9 bits (75), Expect = 0.006
Identities = 16/150 (10%), Positives = 42/150 (28%), Gaps = 5/150 (3%)

Query: 731 SQQQTLQQQDVLAAQSLQKAQAQFDTA----LQASVFDDQQAFLAALMDEQTLTQLEQLK 786
+ Q + A + Q + L D+ F +E+ L +K
Sbjct: 134 ADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVS-EEEVLRLTSLIK 192

Query: 787 QNLENQRRQAQTLVTQTAETLAQHQQHRPDGLALTVTVEQIQQELAQTHQKLRENTTSQG 846
+ + Q + A+ + L L + ++
Sbjct: 193 EQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKH 252

Query: 847 EIRQQLKQDADNRQQQQTLLQQIAQMTQQV 876
+ +Q + + + + Q+ Q+ ++
Sbjct: 253 AVLEQENKYVEAVNELRVYKSQLEQIESEI 282


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs0446   TCRTETA  51     4e-09    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 51.0 bits (122), Expect = 4e-09
Identities = 74/356 (20%), Positives = 126/356 (35%), Gaps = 35/356 (9%)

Query: 33 ILSLALGTFGLGMAEFGIMGVLTELAHNVGISIPAAGH---MISYYALGVVVGAPIIALF 89
+ ++AL G+G+ IM VL L ++ S H +++ YAL AP++
Sbjct: 11 LSTVALDAVGIGL----IMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGAL 66

Query: 90 SSRYSLKHILLFLVALCVIGNAMFTLSSSYLMLAIGRLVSGFPHGAFFGVGAIVLSKIIK 149
S R+ + +LL +A + A+ + +L IGR+V+G GA + I
Sbjct: 67 SDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIAD--IT 124

Query: 150 PGKVTAAVAGMVSGMTVANLLGIPLGTYLSQEFSWRYTFLLIAVFNIAVMASVYFWVPDI 209
G A G +S ++ P+ L FS F A N + F +P+
Sbjct: 125 DGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPES 184

Query: 210 RDEAKGKLREQ----------FHFLRSPAPWLI--FAATMFGNAGVFAWFSYVKPYMMFI 257
+ LR + + A + F + G W +
Sbjct: 185 HKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFG------E 238

Query: 258 SGFSETAMTFIMMLVGLGM---VLGNMLSGRISGRYSPLRIAAVTDFIIVLALLMLFFCG 314
F A T + L G+ + M++G ++ R R + ++L F
Sbjct: 239 DRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFAT 298

Query: 315 GMKTTSLIFAFICCAGLFALSAPLQILLLQNAKGGELLGAAGGQIAF--NLGSAVG 368
I + G+ LQ +L + E G G +A +L S VG
Sbjct: 299 RGWMAFPIMVLLASGGIG--MPALQAMLSRQV-DEERQGQLQGSLAALTSLTSIVG 351


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0444   ACETATEKNASE  30     0.015    Acetate kinase family signature.
		>ACETATEKNASE#Acetate kinase family signature.

Length = 400

Score = 29.8 bits (67), Expect = 0.015
Identities = 17/69 (24%), Positives = 29/69 (42%), Gaps = 10/69 (14%)

Query: 187 FISGTGFATDYRRLSGHALKGSEIIRLVEESDPVAELALRRYELRLAKSLAHVVNILDP- 245
+G ++D+R L A + D A+LAL + R+ K++ +
Sbjct: 273 VYGISGISSDFRDLEDAAF---------KNGDKRAQLALNVFAYRVKKTIGSYAAAMGGV 323

Query: 246 DVIVLGGGM 254
DVIV G+
Sbjct: 324 DVIVFTAGI 332


119  ECs0424  ECs0413  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias   Product
ECs0424   2       22       0.463274  flagellin structural protein
ECs0423   0       20       3.877539  delta-aminolevulinic acid dehydratase
ECs0422   1       22       3.648063  taurine dioxygenase
ECs0421   1       20       3.838974  taurine transporter subunit
ECs0420   1       18       2.698159  taurine transporter ATP-binding protein
ECs0419   -1      20       2.851303  taurine transporter substrate binding subunit
ECs0418   -2      19       2.513691  transcriptional regulator
ECs0417   -2      19       2.491063  sensor histidine protein kinase
ECs0416   -2      16       2.474570  regulatory protein
ECs0415   -2      16       1.519767  periplasmic-iron-binding protein
ECs0414   -3      14       1.360119  ferric transport system permease
ECs0413   -2      11       1.311559  ferric transporter ATP-binding protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0424   PRTACTNFAMLY  121    4e-30    Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 121 bits (305), Expect = 4e-30
Identities = 102/445 (22%), Positives = 169/445 (37%), Gaps = 59/445 (13%)

Query: 584 TINGNGDNDNTASIEAGQNEVDNNGDHVAAATGNYKVRIDNATGAGSIADYNGNELIYVN 643
T+ G+G + G ++ A+G +++ + N+ GS L+
Sbjct: 477 TLAGSGLFRMNVFADLGLSDKLVVMQD---ASGQHRLWVRNS---GSEPASANTLLLVQT 530

Query: 644 DKNSNATFSAAN---KADLGAYTYQAEQRGNTV--------------------------- 673
S ATF+ AN K D+G Y Y+ GN
Sbjct: 531 PLGSAATFTLANKDGKVDIGTYRYRLAANGNGQWSLVGAKAPPAPKPAPQPGPQPPQPPQ 590

Query: 674 ---------VLQQMELTDYANMALSIP--SANTNIWNLEQDTVGTRLTNSRHGLADNGGA 722
EL+ AN A++ + +W E + + RL R D GGA
Sbjct: 591 PQPEAPAPQPPAGRELSAAANAAVNTGGVGLASTLWYAESNALSKRLGELRL-NPDAGGA 649

Query: 723 WVSYFGGNFNGDNGTIN-YDQDVNGIMVGVDTKIDGNNAKWIVGAAAGFAKGDMN---DR 778
W F DN +DQ V G +G D + +W +G AG+ +GD D
Sbjct: 650 WGRGFAQRQQLDNRAGRRFDQKVAGFELGADHAVAVAGGRWHLGGLAGYTRGDRGFTGDG 709

Query: 779 SGQVDQDSQTAYIYSSAHFANNVF-VDGSLSYSHFNNDLSATMSNGTYVDGSTNSDAWGF 837
G D Y + + A++ F +D +L S ND S+G V G + G
Sbjct: 710 GGHTDSVHVGGY---ATYIADSGFYLDATLRASRLENDFKVAGSDGYAVKGKYRTHGVGA 766

Query: 838 GLKAGYDFKLGDAGYVTPYGSISGLFQSGDDYQLSNDMKVDGQSYDSMRYELGVDAGYTF 897
L+AG F D ++ P ++ G Y+ +N ++V + S+ LG++ G
Sbjct: 767 SLEAGRRFTHADGWFLEPQAELAVFRAGGGAYRAANGLRVRDEGGSSVLGRLGLEVGKRI 826

Query: 898 TYSEDQALTPYFKLAYVYDDSNNDNDVNGDSIDNGTEGSAVRV--GLGTQFSFTKNFSAY 955
+ + + PY K + + + + V+ + I + TE R GLG + + S Y
Sbjct: 827 ELAGGRQVQPYIKASVL-QEFDGAGTVHTNGIAHRTELRGTRAELGLGMAAALGRGHSLY 885

Query: 956 TDANYLGGGDVDQDWSANVGVKYTW 980
Y G + W+ + G +Y+W
Sbjct: 886 ASYEYSKGPKLAMPWTFHAGYRYSW 910


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
ECs0423   BINARYTOXINB  30     0.015    Binary toxin B family signature.
		>BINARYTOXINB#Binary toxin B family signature.

Length = 764

Score = 30.0 bits (67), Expect = 0.015
Identities = 19/69 (27%), Positives = 30/69 (43%)

Query: 254 DIVRELRERTELPIGAYQVSGEYAMIKFAALAGAIDEEKVVLESLGSIKRAGADLIFSYF 313
+ EL + +L + QV G A F +D E L I+ A +IF+
Sbjct: 466 NQFLELEKTKQLRLDTDQVYGNIATYNFENGRVRVDTGSNWSEVLPQIQETTARIIFNGK 525

Query: 314 ALDLAEKKI 322
L+L E++I
Sbjct: 526 DLNLVERRI 534


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
ECs0418   HTHFIS  73     3e-17    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 72.9 bits (179), Expect = 3e-17
Identities = 31/116 (26%), Positives = 53/116 (45%), Gaps = 4/116 (3%)

Query: 2 IRVVLVDDHVVVRSGFAQLLSLED-DLEVIGQYSSAAQAWSALIRDDVNVAVIDIAMPDE 60
+++ DD +R+ Q LS D+ + +AA W + D ++ V D+ MPDE
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITS---NAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 NGLSLLKRLRAQKPQFRAIILSIYDAPTFVQSALDAGASGYLTKRCGPEELVQAVR 116
N LL R++ +P +++S + A + GA YL K EL+ +
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIG 116


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs0417   PF06580  49     1e-08    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 49.5 bits (118), Expect = 1e-08
Identities = 42/205 (20%), Positives = 79/205 (38%), Gaps = 43/205 (20%)

Query: 337 QSQLVKRARDPAQIQSAASQIN-------------------ELARRIHLSTRQLLR-QLR 376
Q ++ A++ AQ+ + +QIN AR + S +L+R LR
Sbjct: 151 QWKMASMAQE-AQLMALKAQINPHFMFNALNNIRALILEDPTKAREMLTSLSELMRYSLR 209

Query: 377 PPALDELTFREALLHL-----INEFAFSERGIHCQFAYQLNSTPENETVRFTLYRLLQEL 431
+++ + L + + F +R ++ P V+ L+Q L
Sbjct: 210 YSNARQVSLADELTVVDSYLQLASIQFEDR-----LQFENQINPAIMDVQVPPM-LVQTL 263

Query: 432 LNNICKHA-----EASEVTIILRQQGEVLHLEVSDNGVGIA--SGKMAGFGIQGMRERVS 484
+ N KH + ++ + + + LEV + G + + G G+Q +RER+
Sbjct: 264 VENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKESTGTGLQNVRERLQ 323

Query: 485 ALGGD---LTLE-KQHGTRVIVNLP 505
L G + L KQ +V +P
Sbjct: 324 MLYGTEAQIKLSEKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs0416   TCRTETA  40     2e-05    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 39.8 bits (93), Expect = 2e-05
Identities = 62/399 (15%), Positives = 122/399 (30%), Gaps = 31/399 (7%)

Query: 38 VNYVLPALQTDLGLD---KGDIGLLGSLFYLSYGLSKFTAGLWHDSHGQRGFMGVGLFAT 94
+ VLP L DL G+L +L+ L G D G+R + V L
Sbjct: 24 IMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGA 83

Query: 95 GLLNVVFAFGESLTLLLVVWTLNGFFQGWGWPPCARLLTHWYSRNERGFWWGCWNMSINI 154
+ + A L +L + + G G + +ER +G +
Sbjct: 84 AVDYAIMATAPFLWVLYIGRIVAGITGATG-AVAGAYIADITDGDERARHFGFMSACFGF 142

Query: 155 GGAIIPLISAFAAHWWGWQAAMLTPGIISMALGIWLTLQLKGTPQEEGLPTVGHWRHDPL 214
G P++ + A ++ + L + + E P
Sbjct: 143 GMVAGPVLGGLMGG-FSPHAPFFAAAALNGLNFLTGCFLLPESHKGERRP---------- 191

Query: 215 ELRQEQQSPPMGLWQMLRTTMLQNPLIWLLGVSYVLVYVIRIALNDWGNIWLTESHGVNL 274
LR+E +P R + L+ V +++ V ++ W + +
Sbjct: 192 -LRREALNPLAS----FRWARGMTVVAALMAVFFIMQLVGQVPAALWV---IFGEDRFHW 243

Query: 275 LSANATVMLFEVGGLLGALFAGWGSDLLFSGQRAPMILLFTLGLMVSVAALWLAPVHHYA 334
+ + L G+L +L + + + L+ + + L +
Sbjct: 244 DATTIGISL-AAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGWM 302

Query: 335 LLAVCFFTVGFFVFGPQMLIGLAAVECGHK--AAAGSITGFLGLFAYLGAALAGWPLSLV 392
+ + P + L+ + GS+ L + +G L +
Sbjct: 303 AFPIMVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYAAS 362

Query: 393 IERYGWPGMFSLLSVAAVLMGLLLMPLLMAGITTTHARR 431
I W G +A + LL +P L G+ + +R
Sbjct: 363 ITT--WNG---WAWIAGAALYLLCLPALRRGLWSGAGQR 396


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
ECs0413   PF05272  32     0.004    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 32.0 bits (72), Expect = 0.004
Identities = 21/90 (23%), Positives = 30/90 (33%), Gaps = 22/90 (24%)

Query: 34 MVTLLGPSGCGKTTILRLVAGLEKPSEGQIFIDGEDVTHRSI-QQRDICMVFQSYALFPH 92
V L G G GK+T++ + GL F D TH I +D
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGL------DFFSD----THFDIGTGKDSYEQIAGIVA--- 644

Query: 93 MSLGENVGYGLKMLGVSRSEVKQRVKEALA 122
L E M R++ + VK +
Sbjct: 645 YELSE-------MTAFRRADA-EAVKAFFS 666



 
Contact Sachin Pundhir for Bugs/Comments.