PredictBias

Identification of genomic and pathogenicity islands in prokaryotic genomes
 
A) Input parameters
Genome: NC_009480.gbk
Threshold dinucleotide bias: 2
Threshold codon bias: 4
Threshold %GC bias: 3
E-value (RPSBlast): 0.05
Genome (non-pathogenic): (none given)
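The E-value setting above acts as a cutoff on the RPS-BLAST hits reported in section C. The following is a minimal sketch of such a filter, assuming (this page does not say) that a hit is kept when its E-value is at or below the cutoff. The function name and the last entry are hypothetical; the other values are copied from the hit tables further down.

```python
# Cutoff taken from the "E-value (RPSBlast)" input parameter.
E_VALUE_CUTOFF = 0.05

# (locus tag, profile name, E-value); the first three rows come from this page.
hits = [
    ("CMM_0001", "HTHFIS", 0.037),
    ("CMM_0023", "HTHFIS", 3e-13),
    ("CMM_0146", "PF06776", 0.048),
    ("EXAMPLE_X", "HYPOTHETICAL", 0.2),  # hypothetical hit, for illustration only
]


def filter_hits(hits, cutoff=E_VALUE_CUTOFF):
    """Keep hits whose E-value is at or below the cutoff (assumed rule)."""
    return [hit for hit in hits if hit[2] <= cutoff]


kept = filter_hits(hits)
for locus, profile, evalue in kept:
    print(locus, profile, evalue)
```

Under this assumption every hit listed on this page passes, since the largest reported E-value is 0.048.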
 
 
C) Potential GIs and PAIs in AM711867
S.No  Start     End       Bias  Virulence  Insertion elements  Prediction
1     CMM_0001  CMM_0011  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias     Product
CMM_0001   2       7        -4.237679   chromosomal replication initiator protein dnaA
CMM_0002   0       6        -3.661918   DNA polymerase III, beta chain
CMM_0003   0       6        -3.822374   6-phosphogluconate dehydrogenase
CMM_0004   0       6        -3.745009   DNA replication and repair protein RecF
CMM_0005   1       6        -4.560483   conserved hypothetical protein
CMM_0006   1       8        -4.540975   DNA gyrase subunit B
CMM_0007   1       9        -4.333549   DNA gyrase subunit A
CMM_0008   4       9        -3.921772   putative integral membrane protein
CMM_0009   4       9        -3.173983   **hypothetical protein
CMM_0010   2       8        -2.244835   peptidyl-prolyl cis-trans isomerase
CMM_0011   2       8        -1.011118   conserved membrane protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0001  HTHFIS        29     0.037    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 29.4 bits (66), Expect = 0.037
Identities = 16/65 (24%), Positives = 22/65 (33%), Gaps = 3/65 (4%)

Query: 127 PSRGDSRLNPKYGFDTFVIGGSNRFAHAAAVAVAEAPAKAYNPLFIYGDSGLGKTHLLHA 186
P R S+L ++G S V L I G+SG GK + A
Sbjct: 122 PKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDL--TLMITGESGTGKELVARA 179

Query: 187 IGHYA 191
+ H
Sbjct: 180 L-HDY 183
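Each alignment carries a statistics line of the form "Identities = 16/65 (24%), ...". The sketch below parses such a line and recomputes the percentages. The truncation rule is an observation from this listing (16/65 is 24.6%, printed as 24%), not something the page states, and the function names are hypothetical.

```python
import re

# Statistics line copied from the CMM_0001 / HTHFIS alignment above.
STATS = "Identities = 16/65 (24%), Positives = 22/65 (33%), Gaps = 3/65 (4%)"


def parse_stats(line):
    """Return {label: (count, alignment_length, printed_percent)}."""
    pattern = r"(\w+) = (\d+)/(\d+) \((\d+)%\)"
    return {
        label: (int(count), int(total), int(percent))
        for label, count, total, percent in re.findall(pattern, line)
    }


def truncated_percent(count, total):
    """Percentage with the fractional part dropped (integer division)."""
    return count * 100 // total


stats = parse_stats(STATS)
for label, (count, total, printed) in stats.items():
    print(label, truncated_percent(count, total), printed)
```

The same pattern also handles lines without a Gaps field, such as "Identities = 10/24 (41%), Positives = 14/24 (58%)".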


2     CMM_0021  CMM_0114  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias     Product
CMM_0021   0       10       3.523045    putative two-component system response
CMM_0022   0       14       3.477969    putative two-component system sensor kinase
CMM_0023   -1      17       2.815843    conserved hypothetical protein
CMM_0024   -1      19       2.794993    ATP-dependent DNA helicase
CMM_0025   0       17       2.473798    hypothetical protein
CMM_0026   1       16       2.823104    putative glutaminase
CMM_0027   2       13       3.094022    putative hydrolase/acyltransferase
CMM_0028   1       13       3.428175    putative membrane protein
CMM_0029   0       9        2.774592    conserved hypothetical protein
CMM_0030   0       9        2.438301    putative sugar phosphate isomerase/epimerase
CMM_0031   -1      16       -0.507080   putative dehydrogenase/oxidoreductase
CMM_0032   -1      27       -2.501592   putative transcriptional regulator, LacI family
CMM_0033   1       32       -4.262278   hypothetical protein
CMM_0034   1       33       -5.327494   hypothetical protein
CMM_0035   3       40       -6.103952   putative extracellular serine protease, family
CMM_0036   8       62       -10.081674  hypothetical protein
CMM_PS_02  10      50       -9.353153   putative extracellular serine protease
CMM_PS_03  15      64       -11.602194  conserved hypothetical protein, putative
CMM_PS_04  13      56       -10.603466  conserved hypothetical protein, putative
CMM_0039   12      46       -9.285433   hypothetical protein predicted by
CMM_0041   13      52       -10.414450  hypothetical protein, putative phage protein
CMM_0042   8       43       -9.242124   hypothetical protein
CMM_0043   8       43       -9.360578   hypothetical protein
CMM_0044   8       46       -8.753776   putative beta-N-acetylglucosaminidase
CMM_0045   7       47       -8.813866   conserved hypothetical protein, putative
CMM_PS_05  6       43       -8.012799   putative extracellular serine protease, family
CMM_0046   6       33       -6.395729   putative extracellular serine protease, family
CMM_0047   7       34       -6.090514   hypothetical protein
CMM_0048   10      45       -8.891189   conserved hypothetical protein
CMM_0049   10      51       -9.912178   hypothetical protein
CMM_0050   9       52       -10.254653  hypothetical protein
CMM_0051   7       53       -9.982586   hypothetical protein
CMM_0052   7       52       -9.207562   putative extracellular serine protease
CMM_0053   3       37       -6.895094   hypothetical protein
CMM_0054   2       30       -5.585437   putative cell filamentation protein
CMM_0055   2       29       -4.378436   conserved hypothetical protein
CMM_0056   2       25       -2.958704   putative hydrolase
CMM_0057   0       19       -2.595579   conserved hypothetical protein, putative
CMM_0059   5       25       -4.519781   partitioning protein
CMM_0060   5       22       -3.564811   putative ATPase
CMM_0061   7       32       -5.677876   hypothetical protein predicted by
CMM_PS_09  10      43       -9.033482   Na+/H+ antiporter, NhaA family
CMM_PS_10  12      54       -11.578214  subtilisin-like serine protease, peptidase
CMM_0062   9       42       -8.630472   conserved hypothetical protein
CMM_0063   9       44       -8.956768   putative acyl-CoA synthetase (AMP-forming)
CMM_0064   6       41       -8.095594   conserved hypothetical protein
CMM_0065   4       23       -3.631660   putative extracellular serine protease
CMM_0066   1       17       -1.743461   hypothetical protein
CMM_0068   5       27       -2.619882   hypothetical protein
CMM_0070   9       41       -6.027244   putative NADH oxidase
CMM_PS_25  11      60       -10.403417  Conserved hypothetical protein
CMM_0071   11      60       -10.404495  conserved hypothetical protein
CMM_PS_11  11      57       -10.508256  putative short-chain dehydrogenase/reductase
CMM_0072   12      56       -12.285435  putative beta-glucosidase
CMM_0073   8       41       -8.605249   putative beta-glucosidase/beta-xylosidase
CMM_0074   4       29       -5.169571   putative sugar ABC transporter, permease
CMM_0075   4       29       -5.064257   putative sugar ABC transporter, permease
CMM_0076   2       23       -4.600621   putative sugar ABC transporter, binding protein
CMM_0077   0       27       -4.875449   transcriptional regulator, TetR family
CMM_0079   3       29       -6.613535   putative beta-glucosidase
CMM_0080   4       29       -6.987419   putative LacI-family transcriptional regulator
CMM_0082   4       26       -6.834484   tomatinase, endo-1,4-beta-glycosidase
CMM_0083   4       23       -6.867291   putative two-component system response
CMM_0084   3       25       -7.017651   putative two-component system sensor kinase
CMM_0085   3       26       -7.194950   putative transport protein, RND family
CMM_0086   2       28       -6.978495   putative cytochrome P450
CMM_0087   0       32       -6.290160   putative 3Fe-4S ferredoxin
CMM_0088   0       30       -5.810026   putative ferredoxin reductase
CMM_0089   1       32       -5.787476   putative beta-glucosidase, glycosyl hydrolase
CMM_0090   1       30       -5.729917   putative LacI-family transcriptional regulator
CMM_0091   1       32       -4.962276   beta-glucosidase, glycosyl hydrolase family 3
CMM_0092   0       30       -4.790859   beta-glucosidase, glycosyl hydrolase family 1
CMM_0093   0       30       -4.945054   putative beta-galactosidase
CMM_0094   -1      32       -5.183465   putative beta-xylosidase, glycosyl hydrolase
CMM_0095   -1      31       -5.184079   putative sugar permease (MFS superfamily)
CMM_0096   -1      30       -5.560377   putative alpha-rhamnosidase
CMM_0097   0       30       -5.671188   putative alpha-glucosidase, glycosyl hydrolase
CMM_0098   0       31       -5.941712   putative sugar ABC transporter, permease
CMM_0099   0       30       -6.024179   putative sugar ABC transporter, permease
CMM_0100   1       31       -6.661209   putative sugar ABC transporter, binding protein
CMM_0101   3       28       -6.568595   putative xylosidase, glycosyl hydrolase family
CMM_0102   3       25       -6.261331   putative sugar phosphate isomerase/epimerase
CMM_0103   4       24       -6.216969   putative dehydrogenase/oxidoreductase
CMM_0104   4       22       -6.112646   putative hydrolase
CMM_0105   5       22       -6.176355   hypothetical protein
CMM_0106   4       18       -5.139189   putative dienelactone hydrolase
CMM_0107   1       14       -4.052680   putative MFS-type efflux protein
CMM_0108   0       12       -2.869279   putative TetR-family transcriptional regulator
CMM_0109   -1      10       -1.617182   transcriptional regulator, TetR family
CMM_0110   -3      12       -0.076422   putative urea amidolyase
CMM_0111   -1      9        2.294968    conserved hypothetical protein
CMM_0112   2       9        2.325495    conserved hypothetical protein
CMM_0113   2       10       2.206587    putative amino acid permease, APC family
CMM_0114   2       11       1.975572    putative acyl-CoA synthetase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0023  HTHFIS        61     3e-13    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 61.4 bits (149), Expect = 3e-13
Identities = 25/111 (22%), Positives = 52/111 (46%), Gaps = 3/111 (2%)

Query: 19 SVVVIDDESLVRSGIAMVLGASPRIAVRAAVSSDTAVATVREHAPDVVLLDIRMPAPDGL 78
+++V DD++ +R+ + L + VR ++ T + D+V+ D+ MP +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAG-YDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 79 AILAELMAE-PRPPAVAMLTTFDTDDQVLEALHRGASGFLLKDTDPEQLAR 128
+L + P P V +++ +T ++A +GA +L K D +L
Sbjct: 64 DLLPRIKKARPDLP-VLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIG 113


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0041  V8PROTEASE    46     9e-08    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 45.8 bits (108), Expect = 9e-08
Identities = 40/280 (14%), Positives = 80/280 (28%), Gaps = 38/280 (13%)

Query: 32 IGGESARAASSASGPASGASDTGMSVSDADAAESENYWTPDKIAAAVPADGVTVTPQKQS 91
+ + A+ S PA+ A + + +S TP + +
Sbjct: 11 LFVATLTTATLVSSPAANALSSKAMDNHPQQTQSSKQQTPKIQKGGNLKP--LEQREHAN 68

Query: 92 PSLGASAVTSVFE----PVYWIGRLYYTAGGIDYACTASSIKSDSKLVIATAGHCLYHK- 146
L + + + + + A + + + D+ + T H +
Sbjct: 69 VILPNNDRHQITDTTNGHYAPVTYIQVEAPTGTFIASGVVVGKDT---LLTNKHVVDATH 125

Query: 147 GEFSTNLRFIPAWDGANKPLLTWGANDYQVPRA------WRYQEDQQHDAGFVQLKPQRS 200
G+ F A + N P + A ++ ++Q+ +KP
Sbjct: 126 GDPHALKAFPSAINQDNYPNGGFTAEQITKYSGEGDLAIVKFSPNEQNKHIGEVVKP--- 182

Query: 201 WLGAKQYLADRAGATATNFGLAKTGLHYEAFGYEQVSGFTSHPLLTCSGNGYRRFSQFSL 260
AT +N + + GY + G +
Sbjct: 183 -------------ATMSNNAETQVNQNITVTGY--PGDKPVATMWESKGKITY--LKGEA 225

Query: 261 LSINDCAMVGGGSGGPVYHESGKGVNGTQVGVVSSVIPQG 300
+ D + GG SG PV++E + V G G V +
Sbjct: 226 MQY-DLSTTGGNSGSPVFNEKNE-VIGIHWGGVPNEFNGA 263


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0044  V8PROTEASE    46     8e-08    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 46.2 bits (109), Expect = 8e-08
Identities = 48/290 (16%), Positives = 92/290 (31%), Gaps = 32/290 (11%)

Query: 39 GGVIQGGALSAAPALAATPDVSAVSAASLGSRAFSDEEMTSTSDYWTVDRLQRA-----I 93
G ++ +L A AT VS+ +A +L S+A + + S ++Q+ +
Sbjct: 3 GKFLKVSSLFVATLTTAT-LVSSPAANALSSKAMDNHPQQTQSSKQQTPKIQKGGNLKPL 61

Query: 94 PDDGSASVVGASDAGSGVVHSVSGSITATAWVGKIAFRRGGLDRLCSASAVHSDSGYLVA 153
A+V+ ++ + + +G V I + S V D +
Sbjct: 62 EQREHANVILPNNDRHQITDTTNGHYAP---VTYIQVEAPTGTFIASGVVVGKD---TLL 115

Query: 154 TAGHCLLNDDQTATGVTFVP-GWDGKNMPYGTWIAKFYSVTPEWRVRADDSHDVGFIKVQ 212
T H + + P + N P G + A+ + D+ +K
Sbjct: 116 TNKHVVDATHGDPHALKAFPSAINQDNYPNGGFTAEQITKYSG-------EGDLAIVKFS 168

Query: 213 PI-GSKSLADTVSALRV-NFSLAKPQLHYFSLAYGNIGGGFQAKPLSTCVGPAYRLHDEQ 270
P +K + + V + N + + + Y G + G L E
Sbjct: 169 PNEQNKHIGEVVKPATMSNNAETQVNQNITVTGY---PGDKPVATMWESKGKITYLKGEA 225

Query: 271 SLAMIGCKAVGGMSGGPVYHASTEEPRGTQVGVIGRNVETAYGDAIAFTP 320
GG SG PV++ +G+ V + A+
Sbjct: 226 MQ--YDLSTTGGNSGSPVFNEK-----NEVIGIHWGGVPNEFNGAVFINE 268


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0047  TCRTETOQM     26     0.027    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family

signature.
Length = 639

Score = 26.4 bits (58), Expect = 0.027
Identities = 9/36 (25%), Positives = 15/36 (41%), Gaps = 4/36 (11%)

Query: 5 NDLAARAERGELTPIPGTDLHGAAAANAGRAMLMDA 40
+ + R L P+ HG+A N G L++
Sbjct: 202 QEESIRFHNCSLFPV----YHGSAKNNIGIDNLIEV 233


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0052  V8PROTEASE    35     3e-04    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 34.6 bits (79), Expect = 3e-04
Identities = 38/186 (20%), Positives = 60/186 (32%), Gaps = 45/186 (24%)

Query: 81 VVTANHCV-ARIGERVFVRNELESRHGHVPPIGT-----VYWRSDDVDLALIKIDPIVHV 134
++T H V A G+ ++ + + P G + S + DLA++K P
Sbjct: 114 LLTNKHVVDATHGDPHALKAFPSAINQDNYPNGGFTAEQITKYSGEGDLAIVKFSPN--- 170

Query: 135 SYTCGSSSHGAPHCLPVTTWTPNALPRVLTASLRMRSIYAQPVIGYDNPGLNEAFATSGS 194
+ + T + NA +V V GY PG ++ AT
Sbjct: 171 -----EQNKHIGEVVKPATMSNNAETQVNQNI---------TVTGY--PG-DKPVATMWE 213

Query: 195 TTGVQVNWRNLSVRAWPPGFRDPRSGDQAASSTTDFLLPGDSGGPVFNPDTGMLYGIMTD 254
+ G L A G+SG PVFN + + GI
Sbjct: 214 SKGKI---TYLKGEAM---------------QYDLSTTGGNSGSPVFN-EKNEVIGIHWG 254

Query: 255 QVPHRL 260
VP+
Sbjct: 255 GVPNEF 260


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0053  V8PROTEASE    32     0.003    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 31.5 bits (71), Expect = 0.003
Identities = 10/24 (41%), Positives = 14/24 (58%)

Query: 235 PGDSGGPVVNYSAELLGIISSVLP 258
G+SG PV N E++GI +P
Sbjct: 234 GGNSGSPVFNEKNEVIGIHWGGVP 257


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0059  V8PROTEASE    32     0.003    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 31.9 bits (72), Expect = 0.003
Identities = 11/26 (42%), Positives = 16/26 (61%)

Query: 228 PGDSGGPVVSKDRRLLGIISGDVPNT 253
G+SG PV ++ ++GI G VPN
Sbjct: 234 GGNSGSPVFNEKNEVIGIHWGGVPNE 259


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0062  IGASERPTASE   27     0.020    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 27.3 bits (60), Expect = 0.020
Identities = 17/54 (31%), Positives = 23/54 (42%)

Query: 70 GNIRVTFSGAYTVREFQGDKGTILQHKVFVDEIGASFFGQDVTIARRTASTSTT 123
GN+ VTF G F GT L + V++ G+ AR A S+T
Sbjct: 665 GNLNVTFKGKSEQNRFLLTGGTNLNGDLTVEKGTLFLSGRPTPHARDIAGISST 718


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0070  SUBTILISIN    109    5e-28    Subtilisin serine protease family (S8) signature.
		>SUBTILISIN#Subtilisin serine protease family (S8) signature.

Length = 326

Score = 109 bits (274), Expect = 5e-28
Identities = 52/254 (20%), Positives = 80/254 (31%), Gaps = 76/254 (29%)

Query: 181 AEHAGAGTVIGDIDSGIAPDNPSFAGKPLGSAPGAEPYRDGTGIAFRKADGSVFHGTCET 240
+ G G + +D+G D+P +
Sbjct: 36 NQTRGRGVKVAVLDTGCDADHPDLKAR--------------------------------- 62

Query: 241 GDGFTAADCSTKVIGARSFVSGRDASGVPLGPQERRSARDTNGHGSHTASTAAGDADVPA 300
+IG R+F + + +D NGHG+H A T A +
Sbjct: 63 ------------IIGGRNFTDDDEG--------DPEIFKDYNGHGTHVAGTIAATENENG 102

Query: 301 VIHGRTLDTIAGVAPAARIAAYKVCWDGPDPTVETDDGCAASDIIAAIDQATADGVDVIN 360
V+ GVAP A + KV II I A VD+I+
Sbjct: 103 VV---------GVAPEADLLIIKVLNK--------QGSGQYDWIIQGIYYAIEQKVDIIS 145

Query: 361 MSLGGDGPSPDEEQRALLGAASAGIFVAASAGNSGPDASTVSNL-----EPWVTTVAASS 415
MSLGG P+ + A+ A ++ I V +AGN G L V +V A +
Sbjct: 146 MSLGGPEDVPELHE-AVKKAVASQILVMCAAGNEGDGDDRTDELGYPGCYNEVISVGAIN 204

Query: 416 VPDNYAATLTLGDG 429
+ + +
Sbjct: 205 FDRHASEFSNSNNE 218



Score = 60.6 bits (147), Expect = 1e-11
Identities = 23/148 (15%), Positives = 47/148 (31%), Gaps = 31/148 (20%)

Query: 534 VAYAAKAGATATLTAGDTSGVERPAPQVTGFSSRGPDPADGADIIKPDITAPGSGIPAAY 593
+ Y ++ A + + FS+ + D+ APG I +
Sbjct: 188 LGYPGCYNEVISVGAINFDR------HASEFSNSNNEV---------DLVAPGEDILSTV 232

Query: 594 KDVDGRPGFAALSGTSMSSPHIAGFALVYLG-----IHPTASPSEIKSAMMTTATDTVDA 648
+A SGTSM++PH+AG + + E+ + ++ +
Sbjct: 233 PG----GKYATFSGTSMATPHVAGALALIKQLANASFERDLTEPELYAQLIKRTIPLGN- 287

Query: 649 QGKAVTDPFAQGAGEIASARYLHPGLVY 676
P +G G + ++
Sbjct: 288 ------SPKMEGNGLLYLTAVEELSRIF 309
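Where a hit has more than one aligned segment, as here, the summary row appears to report the best-scoring segment, with the bit score rounded to a whole number (29.4 is listed as 29 and 38.5 as 39 elsewhere on this page). The following is a minimal sketch of that reading. The helper names are hypothetical, and the rule is inferred from the values shown here rather than taken from PredictBias documentation.

```python
import math

# (bit score, E-value as printed) for the two SUBTILISIN segments of CMM_0070.
hsps = [(109.0, "5e-28"), (60.6, "1e-11")]


def round_half_up(x):
    """Round to the nearest integer, with .5 going up (38.5 -> 39)."""
    return math.floor(x + 0.5)


def summarize(hsps):
    """Return (rounded score, E-value) of the highest-scoring segment."""
    best_score, best_evalue = max(hsps, key=lambda hsp: hsp[0])
    return round_half_up(best_score), best_evalue


print(summarize(hsps))
```

Python's built-in `round` uses round-half-to-even, so it would give 38 for 38.5; the explicit helper is used to match the values printed in the summary rows.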


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0071  V8PROTEASE    32     0.002    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 32.3 bits (73), Expect = 0.002
Identities = 26/168 (15%), Positives = 45/168 (26%), Gaps = 25/168 (14%)

Query: 123 CTATVINTPDGDTIATAASCLIDRGAKTVHRLATFVPSTDHAAAPQGIWPIRVAEATPQW 182
+ V+ DT+ T +D H L F + + P G +
Sbjct: 104 ASGVVVGK---DTLLTNKHV-VDATHGDPHALKAFPSAINQDNYPNGGFTAEQITKYS-- 157

Query: 183 NATGATSADVAFAKV-AGPTGTHLKDAAGSVPAVF-DTITPPATELLRVFAYPQEKPYSG 240
D+A K H+ + PA + + + V YP +KP +
Sbjct: 158 -----GEGDLAIVKFSPNEQNKHIGEVVK--PATMSNNAETQVNQNITVTGYPGDKPVAT 210

Query: 241 QKLVGCGAIAHRLAAPSAARE---IGCDINDGAMGGPALSPAGELRSV 285
+ E G G P + E+ +
Sbjct: 211 MW-------ESKGKITYLKGEAMQYDLSTTGGNSGSPVFNEKNEVIGI 251


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0075  V8PROTEASE    46     1e-07    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 45.8 bits (108), Expect = 1e-07
Identities = 47/277 (16%), Positives = 90/277 (32%), Gaps = 37/277 (13%)

Query: 6 KNRFSVLALLLLVVMVGQSSVARPAQAAGPTGTAFSDDTGQSFSDAEASDAVRSWTPEAL 65
K +F ++ L + + + V+ PA A + A + Q+ S + TP+
Sbjct: 2 KGKFLKVSSLFVATLTTATLVSSPAANA-LSSKAMDNHPQQTQSSKQQ-------TPKIQ 53

Query: 66 AAASDLDRPSDAVNAPVTGATEQALITQ-GAGTFEPVYWIGRIYFDVDGRQYSCSGSSIR 124
+ + ++ IT G + PV I + + SG +
Sbjct: 54 KGGNLKPLEQREHANVILPNNDRHQITDTTNGHYAPV---TYIQVEAPTGTFIASGVVVG 110

Query: 125 SDSQLVVATAAHCL-YDHGEWSTRVVFIPAWDGANKPLGVWGAFFYAVSRDWRTTEDPGH 183
D+ + T H + HG+ F A + N P G + + T
Sbjct: 111 KDT---LLTNKHVVDATHGDPHALKAFPSAINQDNYPNGG-----FTAEQ--ITKYSGEG 160

Query: 184 DAAFIKMAPKTAWDGSKEYLASKAG-APAPTFSAAMPGLHFEAFGYRPLGGYVPAPLYTC 242
D A +K +P + +++ A + + GY G A ++
Sbjct: 161 DLAIVKFSP----NEQNKHIGEVVKPATMSNNAETQVNQNITVTGYP--GDKPVATMWES 214

Query: 243 AGEGRHFRGFASIPEYEIA-NCDPPGGASGGPVYHAS 278
G+ + + + GG SG PV++
Sbjct: 215 KGK------ITYLKGEAMQYDLSTTGGNSGSPVFNEK 245


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0081  DHBDHDRGNASE  82     2e-20    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase

signature.
Length = 261

Score = 81.6 bits (201), Expect = 2e-20
Identities = 48/190 (25%), Positives = 84/190 (44%), Gaps = 1/190 (0%)

Query: 3 IHDKVFVVTGGGNGMGRQVALEIMRRGGHVAAVDLNENGLADTLQLAQSSTGRLTVHTAN 62
I K+ +TG G+G VA + +G H+AAVD N L + ++ A+
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPAD 65

Query: 63 VTDRNAIAVLRDEVLMQHPHVDGVINIAGIIHSFAPFEELPIDVTQRVMDVNFWGTVNVC 122
V D AI + + + +D ++N+AG++ L + + VN G N
Sbjct: 66 VRDSAAIDEITARIEREMGPIDILVNVAGVLR-PGLIHSLSDEEWEATFSVNSTGVFNAS 124

Query: 123 RAFLPILRSRPSAALVNMSSLSALIPFASQTFYGASKGAVKQFSEGLYAEALGTGLTVST 182
R+ + R S ++V + S A +P S Y +SK A F++ L E + +
Sbjct: 125 RSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNI 184

Query: 183 IFPGNISTNL 192
+ PG+ T++
Sbjct: 185 VSPGSTETDM 194


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0084  SECFTRNLCASE  29     0.031    Bacterial translocase SecF protein signature.
		>SECFTRNLCASE#Bacterial translocase SecF protein signature.

Length = 333

Score = 28.7 bits (64), Expect = 0.031
Identities = 10/40 (25%), Positives = 18/40 (45%), Gaps = 1/40 (2%)

Query: 92 DIDLMKAAWNSFFVSAVTAVAVVFFSTLAGFAFAKLRFRG 131
+ D + W +F + V +A V + G F + F+G
Sbjct: 13 NFDFFRWQWATFGAAIVMMIASVILPLVIGLNFG-IDFKG 51


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0087  HTHTETR       67     9e-16    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 67.3 bits (164), Expect = 9e-16
Identities = 30/181 (16%), Positives = 64/181 (35%), Gaps = 12/181 (6%)

Query: 8 RGPYRKGVERRREIVAAAAQLFSESGYTHASMRELAKRVGLSQALLLHYFSDKEDLLVEV 67
R ++ E R+ I+ A +LFS+ G + S+ E+AK G+++ + +F DK DL E+
Sbjct: 3 RKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEI 62

Query: 68 LNLRDASVAEYLADIADSDIAT-----RSRKVARHAAEHEGLTSLYIALSAEAIDPEHPA 122
L ++++ E + R + + +
Sbjct: 63 WELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGE 122

Query: 123 HEYFADHYRSAQEQTR-------EPGADAGVVPAGVSPELVTALGIAVMDGLQVQRQYRP 175
R+ ++ + +A ++PA + + + GL + P
Sbjct: 123 MAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLFAP 182

Query: 176 D 176

Sbjct: 183 Q 183


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0089  HTHTETR       28     0.036    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 28.4 bits (63), Expect = 0.036
Identities = 18/130 (13%), Positives = 40/130 (30%), Gaps = 4/130 (3%)

Query: 11 TLRDIARIAGVSVPTVSKVLNGRGDVSAETRERVQE---ATARVGYRRMPSAAVDALIAE 67
+L +IA+ AGV+ + + D+ +E E + + P + L
Sbjct: 33 SLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIGELELEYQAKFPGDPLSVLREI 92

Query: 68 QLVDLVLPAVGDSWASALIGGVERVAAQNSLDLVIIMVRAGESQGRSWVERLVDH-RSRG 126
+ L + + + + +V R + +E+ + H
Sbjct: 93 LIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQAQRNLCLESYDRIEQTLKHCIEAK 152

Query: 127 ALIAVVQPTA 136
L A +
Sbjct: 153 MLPADLMTRR 162


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0091  HTHFIS        66     1e-14    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 65.6 bits (160), Expect = 1e-14
Identities = 36/164 (21%), Positives = 63/164 (38%), Gaps = 5/164 (3%)

Query: 4 VVLVDDQAMIRAGLRGILEDAGIRVIGEAADGRSAFAVIRGSRPDVVLMDLRMPILDGVG 63
+++ DD A IR L L AG V ++ + + I D+V+ D+ MP +
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 64 ATAALRA-DPDLNAVRILVLTTFDGDEEVLAALRAGADGFLAKSADSESLIAAVEAVAKG 122
++ PDL +LV++ + + A GA +L K D LI +
Sbjct: 65 LLPRIKKARPDL---PVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121

Query: 123 DTSLSAGAAKTVVSDLRRRGRSATDVELIARTATLTHRETDVVL 166
+ + GRSA E+ A L + +++
Sbjct: 122 PKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMI 165


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0093  ACRIFLAVINRP  46     6e-07    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 45.6 bits (108), Expect = 6e-07
Identities = 40/254 (15%), Positives = 90/254 (35%), Gaps = 33/254 (12%)

Query: 177 TELVGALVAFLVLLLTFGSLVAAGANMLGALVGVGVGVLGILAFSAIAPIG---SVTPIL 233
T ++ FLV+ L ++ A + + V V+ + F+ +A G + +
Sbjct: 343 TLFEAIMLVFLVMYLFLQNMRAT------LIPTIAVPVVLLGTFAILAAFGYSINTLTMF 396

Query: 234 AVMLGLAVGIDYCLFVLSRFRAELRSGRL-VQDAIGRATATAGSSVVFAGATVIIALVGL 292
++L + + +D + V+ + +L ++A ++ + ++V + + +
Sbjct: 397 GMVLAIGLLVDDAIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPM 456

Query: 293 TVVGI---PFLGEMGIAAAFAVAVAVLMSLTLLPALLSWM----GRRALGRRERASTRSA 345
G + I A+A++VL++L L PAL + + +
Sbjct: 457 AFFGGSTGAIYRQFSITIVSAMALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFN 516

Query: 346 GGSPRWTT------TWIGAVLRRPVIATFAVVGGLAVVALPLLG----------MQTSLV 389
I R ++ +V G+ V+ L L T +
Sbjct: 517 TTFDHSVNHYTNSVGKILGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQ 576

Query: 390 IPGGEDPDSTQRAA 403
+P G + TQ+
Sbjct: 577 LPAGATQERTQKVL 590



Score = 42.9 bits (101), Expect = 4e-06
Identities = 39/188 (20%), Positives = 76/188 (40%), Gaps = 25/188 (13%)

Query: 498 STAIALDSDEQLRSALIAYVALIVGLSFLLLVLLFRSFLVPLIATGGFLLSLGAALGSTV 557
+ + L E +++ A + +V L L + R+ L+P IA + L LG+
Sbjct: 330 TPFVQLSIHEVVKTLFEAIM--LVFLVMYLFLQNMRATLIPTIA---VPVVL---LGTFA 381

Query: 558 AVFQWGWLDPIVQAPQGNPLLSLLPIIVTGILFGLAMDYQVFLVSRIHEAHR-HGTSPRE 616
+ +G+ L++ +++ GL +D + +V + P+E
Sbjct: 382 ILAAFGY---------SINTLTMFGMVLA---IGLLVDDAIVVVENVERVMMEDKLPPKE 429

Query: 617 AVRAGFQQSAPVVVAAAAIMAAVFFGFALSPSS---LVGSIALALAVGVLADALLVRMIL 673
A Q +V A +++AVF A S + ++ + + ++LV +IL
Sbjct: 430 ATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL-SVLVALIL 488

Query: 674 VPALLTLL 681
PAL L
Sbjct: 489 TPALCATL 496



Score = 33.3 bits (76), Expect = 0.004
Identities = 33/178 (18%), Positives = 70/178 (39%), Gaps = 22/178 (12%)

Query: 511 SALIAYVALIVGLSFLLLVLLFRSFLVPLIATGGFLLSLG-AALGSTVAVFQWGWLDPIV 569
+ A VA+ + FL L L+ S+ +P+ +L + +G +A +
Sbjct: 870 NQAPALVAISFVVVFLCLAALYESWSIPVS----VMLVVPLGIVGVLLAATLFN------ 919

Query: 570 QAPQGNPLLSLLPIIVTGILFGLAMDYQVFLVSRIHEAHRH-GTSPREAV-RAGFQQSAP 627
Q N + ++ ++ GL+ + +V + G EA A + P
Sbjct: 920 ---QKNDVYFMVGLLT---TIGLSAKNAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRP 973

Query: 628 VVV-AAAAIMAAVFFGFALSPSSLVGSIALALAV-GVLADALLVRMILVPALLTLLGR 683
+++ + A I+ + + S + A+ + V G + A L+ + VP ++ R
Sbjct: 974 ILMTSLAFILGVLPLAISNGAGSGAQN-AVGIGVMGGMVSATLLAIFFVPVFFVVIRR 1030


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0104  TCRTETA       48     3e-08    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 48.3 bits (115), Expect = 3e-08
Identities = 63/291 (21%), Positives = 109/291 (37%), Gaps = 14/291 (4%)

Query: 90 QLIEPEGKEAALGLIVGIGALFNLILGPIFGVLSDATRLRWGRRRPFLVFGLGVAALAAC 149
L+ A G+++ + AL P+ G LSD R+GRR P L+ L AA+
Sbjct: 34 DLVHSNDVTAHYGILLALYALMQFACAPVLGALSD----RFGRR-PVLLVSLAGAAVDYA 88

Query: 150 LIAIAPSVPLVLAGWIIAQVAASAISAALNPTLPERVPAEQRGKLGALSGVAASIAGVSA 209
++A AP + ++ G I+A + + + A + + ++R + V+
Sbjct: 89 IMATAPFLWVLYIGRIVAGITGATGAVAGA-YIADITDGDERARHFGFMSACFGFGMVAG 147

Query: 210 TLVGSLLTG---DLVLLFLLPVAVLAVGVVLWLFVVPDAPAPAGAQRPIGAALRSLLFDP 266
++G L+ G + L +L +R L S +
Sbjct: 148 PVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWAR 207

Query: 267 RKHPDFAWVWIGKFCLQIGLAFFTTYQLYFLLDRLGFTAEEAGRQLAVVGGISLLATMLF 326
A + F +Q+ + F DR + A G LA G + LA +
Sbjct: 208 GMTV-VAALMAVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMI 266

Query: 327 TIAGGTLSDRLRRRKPFIYLASAMIAAGMITAAFSSDFVIYAVAGALLAAG 377
T G ++ RL R+ + L G I AF++ + LLA+G
Sbjct: 267 T---GPVAARLGERR-ALMLGMIADGTGYILLAFATRGWMAFPIMVLLASG 313


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0109  MALTOSEBP     32     0.005    Maltose binding protein signature.
		>MALTOSEBP#Maltose binding protein signature.

Length = 396

Score = 32.0 bits (72), Expect = 0.005
Identities = 53/245 (21%), Positives = 95/245 (38%), Gaps = 14/245 (5%)

Query: 109 PDIVQVEYRALPSLVVAGVVKDITDDVADAKGDVADNIWDLTTFDDRVYGVPQDIGPAMF 168
PDI+ + +G++ +IT D A + + WD ++ ++ P +
Sbjct: 83 PDIIFWAHDRFGGYAQSGLLAEITPDKA-FQDKLYPFTWDAVRYNGKLIAYPIAVEALSL 141

Query: 169 TYRKDLLEQYGVEVPTTWAEYADAAEKIHTADPTVYISSFDPGEFQFFAAQAAQAGAEWW 228
Y KDLL P TW E +++ + + + F + AA G +
Sbjct: 142 IYNKDLLPN----PPKTWEEIPALDKELKAKGKSALMFNLQEPYFTW-PLIAADGGYAFK 196

Query: 229 TNDG--DTWKVGIDSEESLATADFWQDLVERDLVKVEALVTPEWNAEINDGKVLSWAAAS 286
+G D VG+D+ + A F DL++ + + + A N G+
Sbjct: 197 YENGKYDIKDVGVDNAGAKAGLTFLVDLIKNKHMNADTDYSIA-EAAFNKGETAMTINGP 255

Query: 287 WVPSVINAVAPDTAGKWESAPLPQWTEGDASVPFIGGSAYFVPEKSSDSEAAAKFATWLS 346
W S I+ + + LP + +G S PF+G + + S + E A +F
Sbjct: 256 WAWSNIDT----SKVNYGVTVLPTF-KGQPSKPFVGVLSAGINAASPNKELAKEFLENYL 310

Query: 347 TSDEG 351
+DEG
Sbjct: 311 LTDEG 315


3     CMM_0125  CMM_0162  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias     Product
CMM_0125   3       10       3.544443    hypothetical protein
CMM_0126   3       9        3.170497    putative MFS transporter
CMM_0127   0       8        1.646542    conserved hypothetical protein
CMM_0128   1       8        1.094354    conserved hypothetical protein
CMM_0129   1       9        2.037828    putative secreted protein
CMM_0130   3       21       -2.865689   hypothetical protein
CMM_0131   7       34       -6.863480   hypothetical protein
CMM_0132   6       28       -5.833017   conserved hypothetical protein
CMM_0133   7       30       -6.305976   hypothetical protein
CMM_0134   7       32       -6.312764   hypothetical protein, putative peptidase
CMM_0135   8       36       -8.203978   conserved hypothetical protein
CMM_0136   6       20       -5.053521   hypothetical protein
CMM_0137   -1      9        -0.149443   hypothetical protein
CMM_0138   -1      12       -0.324546   putative cell wall surface anchor family
CMM_0139   1       14       -0.205530   chaperone (HSP70)
CMM_0140   2       19       -0.792944   heat shock chaperone
CMM_0141   4       19       -0.066505   chaperone, curved DNA-binding protein
CMM_0142   2       17       0.624212    heat shock regulator, transcriptional
CMM_0143   2       17       1.440037    putative methyltransferase
CMM_0144   3       14       1.729855    conserved hypothetical protein
CMM_0145   5       8        2.094369    transcriptional regulator, Cro/CI family
CMM_0146   3       9        0.925404    putative membrane protein
CMM_0147   2       9        0.360096    conserved hypothetical protein
CMM_0148   2       8        0.161087    hypothetical protein
CMM_0149   2       7        -0.144116   putative transcriptional regulator, LacI-family
CMM_0150   1       6        0.497333    conserved hypothetical protein
CMM_0151   1       11       -0.256523   conserved hypothetical protein
CMM_0152   -1      10       1.856420    hypothetical protein
CMM_0153   -2      11       2.382649    putative Fe3+-siderophore ABC transporter,ATPase
CMM_0154   0       13       2.423407    putative Fe3+-siderophore ABC
CMM_0155   0       11       1.969383    putative Fe3+-siderophore ABC
CMM_0156   2       10       1.681007    putative membrane protein
CMM_0157   3       8        1.320934    putative two-component system sensor kinase
CMM_0158   4       8        2.272391    putative two-component system response
CMM_0159   4       9        2.723358    putative tautomerase
CMM_0160   2       8        3.126753    putative acetyltransferase
CMM_0161   2       7        2.617465    conserved hypothetical protein
CMM_0162   1       10       3.359534    putative sulfate MFS permease
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0125  ENTSNTHTASED  88     3e-23    Enterobactin synthetase component D signature.
		>ENTSNTHTASED#Enterobactin synthetase component D signature.

Length = 234

Score = 87.8 bits (217), Expect = 3e-23
Identities = 38/192 (19%), Positives = 65/192 (33%), Gaps = 13/192 (6%)

Query: 36 ESDAVATALPGRRAEFVTGRVLARRALAALGRRPGSIPVARDGAPVWPAGIVGSITHCVG 95
D + +A R+AE + GR+ A AL +G R P+WP G+ GSI+HC
Sbjct: 35 HHDRLRSAGRKRKAEHLAGRIAAVHALREVGVRTVPGM-GDKRQPLWPDGLFGSISHCAT 93

Query: 96 LRACAVGRRDEHAGIGIDATPARPLPGGVLARVADLGSAPVAAGLDALRATGVEAPGSVL 155
+ R+ IGID +A + ++
Sbjct: 94 TALAVISRQ----RIGIDIEKIMSQHTAT--ELAPSIIDSDERQILQASLLPFPLALTLA 147

Query: 156 FAAAEAVAKARTSAQGGWHGIDGAEIVLHPDGSFAVR-----ARRGPDFTGTGRWAVAGG 210
F+A E+V KA + G + A++ ++ A + T W
Sbjct: 148 FSAKESVYKAFSDRV-TLPGFNSAKVTSLTATHISLHLLPAFAATMAERTVRTEWFQRDN 206

Query: 211 LALAGIALDEHW 222
+ ++
Sbjct: 207 SVITLVSAITRV 218


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0129  PYOCINKILLER  31     0.005    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 30.5 bits (68), Expect = 0.005
Identities = 12/49 (24%), Positives = 19/49 (38%), Gaps = 2/49 (4%)

Query: 38 PGPAPAAPPAAATAAPSAPPTTAPSDPPTSAPTVVQGLGAVPTRVAIPA 86
P APP T P++PP +P ++ P V + + P
Sbjct: 403 PSTTAEAPPLILTWTPASPP--GNQNPSSTTPVVPKPVPVYEGATLTPV 449


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0146  PF06776       27     0.048    Invasion associated locus B
		>PF06776#Invasion associated locus B

Length = 214

Score = 27.2 bits (60), Expect = 0.048
Identities = 19/61 (31%), Positives = 24/61 (39%), Gaps = 5/61 (8%)

Query: 1 MPRHRAAPDAPARPVPH---GRRAARRPVRRRRTLLVTAIAVTLALAGAGTAYAVFADRA 57
+P +A PA P RR ARR R AIA L+ + A A A R+
Sbjct: 21 VPALKAIQMGPAELSPMLASCRRLARRNGARLMLAGAMAIA--LSFGWSDRADAQGAVRS 78

Query: 58 T 58

Sbjct: 79 V 79


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0148  cloacin       34     5e-04    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 34.3 bits (78), Expect = 5e-04
Identities = 27/81 (33%), Positives = 40/81 (49%), Gaps = 15/81 (18%)

Query: 183 GTGQDPDPSTGGGTGTGSGSGSG-SGGGSGSGTS------------PAGSAPAADAPRDL 229
G G GGG+G G+G G+G SGGGSG+G + PA S P A
Sbjct: 47 GGGSGSGIHWGGGSGHGNGGGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLA-- 104

Query: 230 LPVTGRDIASALAGALLALLG 250
+ ++ +++A+A + AL G
Sbjct: 105 VSISAGALSAAIADIMAALKG 125


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0150  INTIMIN       39     2e-04    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 38.5 bits (89), Expect = 2e-04
Identities = 75/408 (18%), Positives = 127/408 (31%), Gaps = 40/408 (9%)

Query: 400 GLFLVGTSVTCTGTHVVTPAEAMSGTLVNTASARGVNTLLASVTSNSSSVTAQIVAPAPA 459
G S + + PA G+ V +AR + +SN+ +T +++
Sbjct: 497 GQIQHSGSQSAQDYQAILPAYVQGGSNVYKVTARAYDR--NGNSSNNVLLTITVLSNGQV 554

Query: 460 LALTKTGTLTDSNGNGRADVGERIAYSFVAQNSG----NVSLYTVAVADPRVTGISPAST 515
+ T + +AD E I Y+ + +G NV + V+ V + A+T
Sbjct: 555 VDQVGVTDFTADKTSAKADGTEAITYTATVKKNGVAQANVPVSFNIVSGTAVLSANSANT 614

Query: 516 TLAPGASQTFTSAA--YTVTQADVDAATPIVNT-ATVSGKTFAGVAAPTASSSTSTPVSG 572
+ A+ T S V A T +N A + + T+ +G
Sbjct: 615 NGSGKATVTLKSDKPGQVVVSAKTAEMTSALNANAVIFVDQTKASITEIKADKTTAVANG 674

Query: 573 SAALTLTKGATLTGGSKAGSTVAYSFSIRNTGTVPLTGVALTDPLPGLSAVTYTWPGTAG 632
A+T T + V + T L+ G + VT T T G
Sbjct: 675 QDAITYTVKVMKGDKPVSNQEVTF-----TTTLGKLSNSTEKTDTNGYAKVTLT-STTPG 728

Query: 633 TLAAGATATATASYTVRQADVDAGQIANTATVRGASSGGTQAQATATRTLTLDRTATLAF 692
+ +A + DV A ++ T+ L T L +
Sbjct: 729 K------SLVSARVSDVAVDVKAPEVEFFTTLTIDDGNIEIVGTGVKGKLP---TVWLQY 779

Query: 693 TKTATPGNVPAAGGVVTYAFRLQNTGSTTLTSVSIADPRAGVSALSYTWPGTAGTLAPGQ 752
N+ A+GG Y +R N ++ D +G L T++
Sbjct: 780 G----QVNLKASGGNGKYTWRSANPA------IASVDASSGQVTLKEK---GTTTISVIS 826

Query: 753 VVTATATYTATTADVAAGSIVNTATATATAPTGQVSGTATATVLAVAD 800
TATYT T + IV + T + L +
Sbjct: 827 SDNQTATYTIATPN---SLIVPNMSKRVTYNDAVNTCKNFGGKLPSSQ 871


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
CMM_0151 | SHAPEPROTEIN | 138 | 2e-38 | Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 138 bits (350), Expect = 2e-38
Identities = 81/371 (21%), Positives = 147/371 (39%), Gaps = 69/371 (18%)

Query: 2 ARAVGIDLGTTNSVVSVLEGG----EPTVIA-NAEGARTTPSVVAFTKDGEVLVGETAKR 56
+ + IDLGT N+++ V G EP+V+A + A + SV A VG AK+
Sbjct: 10 SNDLSIDLGTANTLIYVKGQGIVLNEPSVVAIRQDRAGSPKSVAA--------VGHDAKQ 61

Query: 57 QNVTNVDRTISSVKRHMGTDWTVGIDDKKYTSQELSARILGKLKRDAEQYLGDSVTDAVI 116
I++++ G+ + ++++ + ++ ++ ++
Sbjct: 62 MLGRT-PGNIAAIR-----PMKDGVIADFFVTEKMLQHFIKQVHSNS---FMRPSPRVLV 112

Query: 117 TVPAYFNDAERQATKEAGEIAGLNVLRIINEPTAAALAYGLDRGKEDELILVFDLGGGTF 176
VP ER+A +E+ + AG + +I EP AAA+ GL E +V D+GGGT
Sbjct: 113 CVPVGATQVERRAIRESAQGAGAREVFLIEEPMAAAIGAGLPV-SEATGSMVVDIGGGTT 171

Query: 177 DVSLLEVGKDDDFSTIQVRSTAGDNRLGGDDWDQRIVDHLVKRFKESTGVDVSNDKIAKQ 236
+V+++ + + R+GGD +D+ I++++ + + G
Sbjct: 172 EVAVISLNG---------VVYSSSVRIGGDRFDEAIINYVRRNYGSLIG----------- 211

Query: 237 RLKEAAEQAKKELSSS----TSTSIQLPYLSLTENGPANLDETLTRAKFEELTNDL---- 288
+ AE+ K E+ S+ I++ +L E P + E L L
Sbjct: 212 --EATAERIKHEIGSAYPGDEVREIEVRGRNLAEGVPRGFTLN-SNEILEALQEPLTGIV 268

Query: 289 ------LERTRKPFEDVIREAGVSVGDVAHVVLVGGSTRMPAVVDLVKKLTGGKEPNKGV 342
LE+ I E G +VL GG + + L+ + T G
Sbjct: 269 SAVMVALEQCPPELASDISERG--------MVLTGGGALLRNLDRLLMEET-GIPVVVAE 319

Query: 343 NPDEVVAVGAA 353
+P VA G
Sbjct: 320 DPLTCVARGGG 330


4 | CMM_0186 | CMM_0196 | Y | Y | N | Pathogenicity Island (biased-composition)
LocusTag | DNBias | CDNBias | %GCBias | Product
CMM_01862131.574291hypothetical protein
CMM_01874131.382353putative glycosyl hydrolase, family 2
CMM_01883121.006354putative dehydrogenase/oxidoreductase
CMM_01892102.312886putative drug exporter, RND family
CMM_0190283.021138putative transcriptional regulator, TetR-family
CMM_0191-270.720861putative membrane protein
CMM_0192-27-0.6123363-hydroxyacyl-Coenzyme A dehydrogenase
CMM_0193-17-1.706442conserved hypothetical protein
CMM_0194-18-2.165745putative membrane-associated phospholipid
CMM_019509-3.119475hypothetical membrane protein
CMM_0196011-3.866025conserved hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
CMM_0186 | SALSPVBPROT | 30 | 0.027 | Salmonella virulence plasmid 65kDa B protein signature.
		>SALSPVBPROT#Salmonella virulence plasmid 65kDa B protein signature.

Length = 591

Score = 29.7 bits (66), Expect = 0.027
Identities = 25/75 (33%), Positives = 33/75 (44%), Gaps = 12/75 (16%)

Query: 276 PEIVASEEEARRVVDASRHPTDEQLRA--------IALGWSSDLETD----LTGLDLDRP 323
PE A E R ++ P+DE+ + IA G SS ETD GL LD+P
Sbjct: 419 PETQAKETLLSRDYLSTNEPSDEEFKNAMSVYINDIAEGLSSLPETDHRVVYRGLKLDKP 478

Query: 324 VDRGVFGEHVSHGTI 338
V E+ + G I
Sbjct: 479 ALSDVLKEYTTIGNI 493


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
CMM_0187 | HTHTETR | 50 | 9e-10 | TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 50.0 bits (119), Expect = 9e-10
Identities = 23/96 (23%), Positives = 45/96 (46%), Gaps = 2/96 (2%)

Query: 19 RERRRMATTTEISEAALALFEQRGMAATTIHDIAQAAGVSDRTCFRYFPSKEESVLTLHP 78
++ T I + AL LF Q+G+++T++ +IA+AAGV+ + +F K + +
Sbjct: 5 TKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWE 64

Query: 79 VFDAPLDAWLADV--DRGSAPLPQLEAVYERVLASL 112
+ ++ + + PL L + VL S
Sbjct: 65 LSESNIGELELEYQAKFPGDPLSVLREILIHVLEST 100


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
CMM_0188 | TCRTETB | 138 | 8e-38 | Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 138 bits (348), Expect = 8e-38
Identities = 92/409 (22%), Positives = 186/409 (45%), Gaps = 20/409 (4%)

Query: 50 LVVAAFVVILNETIMSVALPSLMADLDITTATAQWLTTGFMLTMAVVIPATGFILQRFST 109
L + +F +LNE +++V+LP + D + A+ W+ T FMLT ++ G + +
Sbjct: 19 LCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGI 78

Query: 110 RQVFGAAMTLFSIGTLIAAIA-PGFGILLVGRIVQASGTAIMMPLLFTTVLNLVPAARRG 168
+++ + + G++I + F +L++ R +Q +G A L+ V +P RG
Sbjct: 79 KRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRG 138

Query: 169 RLMGVISIVIAVAPAIGPTVSGLILSSLSWRWMFWIVLPIALVALTLGLWKITNLTTPRK 228
+ G+I ++A+ +GP + G+I + W ++ ++P+ + L K+ K
Sbjct: 139 KAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLL--LIPMITIITVPFLMKLLKKEVRIK 196

Query: 229 LPFDILSVVLSTLAFGGLIFGLSSLGESAEGDAPLPLWIPITVGVLALAAFITRQIALQR 288
FDI ++L ++ + +S + V VL+ F+ ++
Sbjct: 197 GHFDIKGIILMSVGIVFFMLFTTSY-----------SISFLIVSVLSFLIFVKHI---RK 242

Query: 289 QDRALMDLRTFRSRPFVVAIIMVSVSMMALFGSLIVLPLYLQNVLELGTLETG-LLLLPG 347
+D ++ PF++ ++ + + G + ++P +++V +L T E G +++ PG
Sbjct: 243 VTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPG 302

Query: 348 GALMAILSPIVGRLFDRVGPRPLVIPGAIIVSIALWGMTTMLHEGTSIGWVIAVHLVLNA 407
+ I I G L DR GP ++ G +S++ + L E TS I + VL
Sbjct: 303 TMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTA-SFLLETTSWFMTIIIVFVL-G 360

Query: 408 GLAFMFTPLLTSALGSLPPRLYSHGSATVSTMQQLAGAAGTALFVTVLT 456
GL+F T + T SL + G + ++ L+ G A+ +L+
Sbjct: 361 GLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLS 409


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
CMM_0190 | TETREPRESSOR | 41 | 9e-07 | Tetracycline repressor protein signature.
		>TETREPRESSOR#Tetracycline repressor protein signature.

Length = 218

Score = 41.1 bits (96), Expect = 9e-07
Identities = 17/44 (38%), Positives = 29/44 (65%)

Query: 4 AGLDAATVTEAGAALADEIGLAGLSMGAVAERLGVKTPSLYKHV 47
A L+ +V +A L +E G+ GL+ +A++LG++ P+LY HV
Sbjct: 2 ARLNRESVIDAALELLNETGIDGLTTRKLAQKLGIEQPTLYWHV 45


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
CMM_0193 | HTHTETR | 65 | 6e-15 | TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 65.0 bits (158), Expect = 6e-15
Identities = 33/191 (17%), Positives = 62/191 (32%), Gaps = 15/191 (7%)

Query: 49 GVRRRAEIVAAAVRVFGVRGYGAATIKEIADEVGVSPAAVLRYFR-KEELLTEVL-RQWD 106
R I+ A+R+F +G + ++ EIA GV+ A+ +F+ K +L +E+
Sbjct: 9 AQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSES 68

Query: 107 RQQPFVSEAAPGLPA---------LRAFVDLMRYHVEHRGFLELYLTFATETSDATHPAH 157
E P L ++ R +E+ +
Sbjct: 69 NIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMA-VVQ 127

Query: 158 EYMRGRYARTIAQIRRRIGEAVALGQVPPMDDATLDYESACFLAILDGLEIQWIHNP-SL 216
+ R + +I + + + +P + GL W+ P S
Sbjct: 128 QAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAII--MRGYISGLMENWLFAPQSF 185

Query: 217 DLPALVGEYVE 227
DL +YV
Sbjct: 186 DLKKEARDYVA 196


5 | CMM_0209 | CMM_0222 | Y | Y | N | Pathogenicity Island (biased-composition)
LocusTag | DNBias | CDNBias | %GCBias | Product
CMM_0209213-1.983670putative phosphatase
CMM_0210111-0.869581putative glycosyl transferase
CMM_0211010-0.052974putative sarcosine oxidase
CMM_0212214-0.083578hypothetical protein
CMM_02132150.071067putative acetyl xylan esterase
CMM_02144150.772016putative transcriptional regulator, TetR family
CMM_02153140.783832putative acyl esterase
CMM_02164150.521816putative peptide ABC transporter, ATP-binding
CMM_02175170.262363putative peptide ABC transporter, permease
CMM_02185111.404688putative peptide ABC transporter, permease
CMM_0219372.938093putative peptide ABC transporter,
CMM_0220282.792163hypothetical protein
CMM_0221293.284079hypothetical protein
CMM_0222073.009238conserved membrane protein
ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
CMM_0215 | HTHTETR | 57 | 3e-12 | TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 56.6 bits (136), Expect = 3e-12
Identities = 29/101 (28%), Positives = 47/101 (46%), Gaps = 2/101 (1%)

Query: 7 DTRERLAAAALELFRERGFAETTVPEITARAGLTTRTFFRHFADKREVLFA--EEDELPA 64
+TR+ + AL LF ++G + T++ EI AG+T + HF DK ++ E E
Sbjct: 11 ETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNI 70

Query: 65 LVTRIIREASPDRSPLQVVEDGFPEVVASQFSEGRATLLAR 105
+ +A PL V+ + V+ S +E R LL
Sbjct: 71 GELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLME 111


6 | CMM_0240 | CMM_0260 | Y | Y | N | Pathogenicity Island (biased-composition)
LocusTag | DNBias | CDNBias | %GCBias | Product
CMM_02402102.492569conserved hypothetical protein
CMM_02411146-8.316713putative transcriptional regulator, MerR family
CMM_02421040-7.123587conserved hypothetical protein
CMM_02431040-6.700904putative membrane protein involved in chromosome
CMM_02441038-6.428382putative membrane protein involved in chromosome
CMM_02451035-5.773936putative acetyltransferase
CMM_02461037-6.267834putative NTP pyrophosphohydrolases
CMM_0247172.391122putative D-alanine--D-alanine ligase
CMM_0248183.268187conserved hypothetical protein
CMM_0249092.614525conserved hypothetical protein
CMM_0250091.657051conserved membrane protein
CMM_0251-181.606590Putative transcriptional regulator, RpiR family
CMM_0252-181.513449putative hydroxyacid dehydrogenase
CMM_0253-192.011217putative aldehyde dehydrogenase
CMM_0254091.335460putative sugar ABC transporter, substrate
CMM_02551101.601955putative sugar ABC transporter, permease
CMM_02560113.013517putative sugar ABC transporter, permease
CMM_02571103.026001trehalose utilization-related protein
CMM_02582113.556278putative dehydrogenase
CMM_02592134.061353putative ROK-family transcriptional regulator
CMM_02602143.145626conserved hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
CMM_0240 | DHBDHDRGNASE | 90 | 2e-23 | 2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 90.1 bits (223), Expect = 2e-23
Identities = 55/202 (27%), Positives = 74/202 (36%), Gaps = 13/202 (6%)

Query: 5 LITGCSTGLGRAFAVEALERGHDVVVTARDAANAQDLADTYPEHALA-----LDLDVTDP 59
ITG + G+G A A +G + A D + A A DV D
Sbjct: 12 FITGAAQGIGEAVARTLASQGAHI--AAVDYNPEKLEKVVSSLKAEARHAEAFPADVRDS 69

Query: 60 AQVSLAVDEATARFGGVDVLVNNAGYGYRAAVEEGDDEDVARLFDTQFHGSVRMIKAVLP 119
A + G +D+LVN AG + DE+ F G ++V
Sbjct: 70 AAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVSK 129

Query: 120 GMRERRSGTIVNLSSIGAARTGAGSGYYGAVKAAIEQMTMALRTELEPLGIVATVVAPGS 179
M +RRSG+IV + S A Y + KAA T L EL I +V+PGS
Sbjct: 130 YMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSPGS 189

Query: 180 FRTDFSGRSLTQSSTVIDDYAE 201
TD Q S D+
Sbjct: 190 TETDM------QWSLWADENGA 205


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
CMM_0241 | TCRTETB | 25 | 0.044 | Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 25.2 bits (55), Expect = 0.044
Identities = 14/48 (29%), Positives = 22/48 (45%), Gaps = 9/48 (18%)

Query: 26 SLGITFAMLGLVLMLTLDDTRVAGLPFIILGITFFVMSVRPKRARTRP 73
S+GI F ML T + F+I+ + F++ V+ R T P
Sbjct: 208 SVGIVFFMLF---------TTSYSISFLIVSVLSFLIFVKHIRKVTDP 246


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
CMM_0244 | HTHFIS | 79 | 4e-19 | FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 79.1 bits (195), Expect = 4e-19
Identities = 29/112 (25%), Positives = 58/112 (51%)

Query: 25 RVFVVDEEAPITQLLSLALRMEGWDVRVFATGRAAIDAAVEAAPDAILLDMTLPDVSGVE 84
+ V D++A I +L+ AL G+DVR+ + D ++ D+ +PD + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 85 VVGELRRAGVASPVLFLTGRDSLEDRLAAFGAGADDYVTKPFGLEEVVETLR 136
++ +++A PVL ++ +++ + A GA DY+ KPF L E++ +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIG 116


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
CMM_0245 | GPOSANCHOR | 41 | 9e-06 | Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 40.8 bits (95), Expect = 9e-06
Identities = 26/89 (29%), Positives = 33/89 (37%), Gaps = 17/89 (19%)

Query: 346 ELTAADAGGPVIPGVPTPTPTTPSIDPAGTAPTPITKP------AAQHGDTLPVTGTDG- 398
EL AG P P ++ G AP TKP + LP TG
Sbjct: 454 ELAKLRAGKASDSQTPDAKPGNKAVPGKGQAPQAGTKPNQNKAPMKETKRQLPSTGETAN 513

Query: 399 ----AAALGLGGLGTLLALVGAGALVARR 423
AAAL T++A G A+V R+
Sbjct: 514 PFFTAAAL------TVMATAGVAAVVKRK 536


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
CMM_0249 | DPTHRIATOXIN | 30 | 0.010 | Diphtheria toxin signature.
		>DPTHRIATOXIN#Diphtheria toxin signature.

Length = 567

Score = 29.7 bits (66), Expect = 0.010
Identities = 38/141 (26%), Positives = 57/141 (40%), Gaps = 18/141 (12%)

Query: 12 SRRVIAGAAVGLALALSAPLAASAHVELDASSTAPATLSVLTFAVGHGCEGSATTSLAIR 71
SR++ A +G L + AP SAH D + + + F+ HG +
Sbjct: 9 SRKLFASILIGALLGIGAP--PSAHAGADDVVDSSKSFVMENFSSYHGTK---------- 56

Query: 72 FPTDVQAVKPTLAPGWSVAEQEAADATTVTYTADTPLPDALRATVQVEALLPVDGAAGDV 131
P V +++ + S + D Y+ D DA +V E P+ G AG V
Sbjct: 57 -PGYVDSIQKGIQKPKSGTQGNYDDDWKGFYSTDNKY-DAAGYSVDNEN--PLSGKAGGV 112

Query: 132 --VTFPTLQTCVAGSTDWAET 150
VT+P L +A D AET
Sbjct: 113 VKVTYPGLTKVLALKVDNAET 133


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
CMM_0255 | TCRTETB | 139 | 2e-38 | Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 139 bits (352), Expect = 2e-38
Identities = 83/397 (20%), Positives = 162/397 (40%), Gaps = 16/397 (4%)

Query: 25 FMAALDSLVMTFALPIIKQDLDATVEQLQWFVNSYSVVFTAFMLPVAALGDRIGRRRVFL 84
F + L+ +V+ +LP I D + W ++ + F+ L D++G +R+ L
Sbjct: 24 FFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLL 83

Query: 85 AGVALFTLASMGAALSTSS-DMLIAMRALQGLGAAGIVPLSLALVSASTTPKVRPIAIGV 143
G+ + S+ + S +LI R +QG GAA L + +V+ + R A G+
Sbjct: 84 FGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGL 143

Query: 144 WGGVNGLGIAAGPIIGGAVVEGFTWPGIFWLNVP-IGVVTIALVALALRESKGTGLGFDG 202
G + +G GP IGG + W + L +P I ++T+ + L++ FD
Sbjct: 144 IGSIVAMGEGVGPAIGGMIAHYIHWS--YLLLIPMITIITVPFLMKLLKKEVRIKGHFDI 201

Query: 203 AGSVLGIVFTLPLIWAVVEGEERGWSAPEIIGAFAVSAAGLAAFIWHEHRSPRAFLPLRF 262
G +L V + + I VS F+ H + F+
Sbjct: 202 KGIILMSVGIVFFMLFTTSYS---------ISFLIVSVLSFLIFVKHIRKVTDPFVDPGL 252

Query: 263 FRERAFALANVGGLLFSAGVFGAIFLLSQFLQVSMGYGALEAGLRAA-PWTLAPMVVAPL 321
+ F + + G + V G + ++ ++ E G P T++ ++ +
Sbjct: 253 GKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYI 312

Query: 322 SGFIVQRFGVKPVLITGLALQAGAILVIASVVEAQTEYGTVVVPMVVAGIGMGLTLAPLA 381
G +V R G VL G+ + + L + ++E + + T+++ V+ G+ T+ ++
Sbjct: 313 GGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGLSFTKTV--IS 370

Query: 382 NAVLTGRRDSEHGIASSVNSTLRQLGMAVGIALATAI 418
V + + E G S+ + L GIA+ +
Sbjct: 371 TIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGL 407


7 | CMM_0274 | CMM_0283 | Y | Y | N | Pathogenicity Island (biased-composition)
LocusTag | DNBias | CDNBias | %GCBias | Product
CMM_02741103.468552putative RNA polymerase ECF-subfamily sigma
CMM_02750103.309852conserved membrane protein
CMM_0276192.797079hypothetical protein
CMM_0277393.882626putative MFS permease
CMM_0278283.589287conserved hypothetical protein
CMM_0279283.613380putative transcriptional regulator, LacI family
CMM_02802123.754926putative sugar ABC transporter,
CMM_02813123.782230putative sugar ABC transporter, permease
CMM_02823163.755898putative sugar ABC transporter, permease
CMM_02832162.431217putative beta-glycosidase
ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
CMM_0275 | BACYPHPHTASE | 29 | 0.030 | Salmonella/Yersinia modular tyrosine phosphatase si...
		>BACYPHPHTASE#Salmonella/Yersinia modular tyrosine phosphatase signature.

Length = 468

Score = 29.4 bits (65), Expect = 0.030
Identities = 26/74 (35%), Positives = 34/74 (45%), Gaps = 4/74 (5%)

Query: 4 HETGDAARAPEPLASVVDPARSVPLAPDPRSRGTTPDHVRRSNLATVLQIVHETGPASRS 63
H G R E L S +DP R+ PL P R T+ H AT V GP +R+
Sbjct: 140 HAPGTPVR--EGLRSHLDP-RTPPLPPRERPH-TSGHHGAGEARATAPSTVSPYGPEARA 195

Query: 64 ELTRETGLNRSTIA 77
EL+ R+T+A
Sbjct: 196 ELSSRLTTLRNTLA 209


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
CMM_0277 | RTXTOXINA | 29 | 0.041 | Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 28.8 bits (64), Expect = 0.041
Identities = 7/27 (25%), Positives = 16/27 (59%)

Query: 18 ADRAVPGVRGALLALAAGVAAAALGLL 44
D ++ + L ++++G++AAA L
Sbjct: 364 IDASLTTISTVLASVSSGISAAATTSL 390


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
CMM_0280 | PF07299 | 36 | 3e-05 | Fibronectin-binding protein (FBP)
		>PF07299#Fibronectin-binding protein (FBP)

Length = 219

Score = 36.0 bits (83), Expect = 3e-05
Identities = 20/110 (18%), Positives = 42/110 (38%), Gaps = 25/110 (22%)

Query: 34 AWDRLD-----FLGWRDPKAPQRGIVVAPVGDELIGVLLQQSATAPRS-RAQCTWCQ--- 84
+ LD +L W D + ++ I+ ++ +G+ Q + ++ C+ C
Sbjct: 113 DMEELDMKELSYLSWIDKGSSRKFIIAKNDKNKFVGL---QGTFQSLNKKSICSLCHGHE 169

Query: 85 DVRLPNPVVFYGARRAGAAG---RNGNTVGTLVCTD-FECNANVRKPRPI 130
+V + F + G + GN +C D CN N++ +
Sbjct: 170 EVGM-----FLVEIKGDIPGTFVKKGN----YICKDGVACNQNMKSLDKL 210


8 | CMM_0319 | CMM_0339 | Y | Y | N | Pathogenicity Island (biased-composition)
LocusTag | DNBias | CDNBias | %GCBias | Product
CMM_03193133.659556putative membrane protein
CMM_03203114.269261putative membrane protein
CMM_03212114.299157hypothetical protein
CMM_03224115.241864putative alkaline shock protein
CMM_0323395.707036putative transcriptional regulator, AraC family
CMM_0324395.506961conserved exported protein
CMM_0325284.914202conserved membrane protein
CMM_0326185.206727putative alpha-mannosidase
CMM_0327185.224910putative 6-phospho-beta-glucosidase
CMM_0328175.244840putative transcriptional regulator, DeoR family
CMM_0329175.054596putative carbohydrate kinase
CMM_0330184.834710putative sugar ABC transporter, substrate
CMM_0331095.157915putative sugar ABC transporter, permease
CMM_0332-1104.291314putative sugar ABC transporter, permease
CMM_0333-183.270429putative glucosamine-6-phosphate isomerase
CMM_0334-193.008252beta-galactosidase
CMM_0335092.411608putative acyl-CoA thioesterase
CMM_0336071.935792conserved hypothetical protein
CMM_0337191.825466putative transcriptional regulator, LacI family
CMM_03382120.862095putative sugar ABC transporter,
CMM_03393102.109240putative sugar ABC transporter, permease
ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
CMM_0323 | SACTRNSFRASE | 38 | 1e-05 | Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 38.0 bits (88), Expect = 1e-05
Identities = 14/68 (20%), Positives = 27/68 (39%), Gaps = 3/68 (4%)

Query: 197 RLALVGSWGGLFAV---ATRPEARRRGLSRAAMAAAVDAGLDRGITALWLQVVAENDGAR 253
R+ + +W G + A + R++G+ A + A++ + L L+ N A
Sbjct: 79 RIKIRSNWNGYALIEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISAC 138

Query: 254 ALYGGLGF 261
Y F
Sbjct: 139 HFYAKHHF 146


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
CMM_0331 | ISCHRISMTASE | 58 | 1e-10 | Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 58.5 bits (141), Expect = 1e-10
Identities = 25/65 (38%), Positives = 39/65 (60%)

Query: 13 SLRAQVARALRLDPADVGLDDDLVDLGLESTALIRLAGRWRRDGLAADFSRLAADPTIRV 72
++R Q+A L+ P D+ +DL+D GL+S ++ L +WRR+G F LA PTI
Sbjct: 234 NIRKQIAELLQETPEDITDQEDLLDRGLDSVRIMTLVEQWRREGAEVTFVELAERPTIEE 293

Query: 73 WARML 77
W ++L
Sbjct: 294 WQKLL 298


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
CMM_0333 | ENTSNTHTASED | 97 | 2e-26 | Enterobactin synthetase component D signature.
		>ENTSNTHTASED#Enterobactin synthetase component D signature.

Length = 234

Score = 96.6 bits (240), Expect = 2e-26
Identities = 54/222 (24%), Positives = 77/222 (34%), Gaps = 32/222 (14%)

Query: 24 MHPAEAGAVARAVPSRRREFAATRACARTALAALVAEAARGGGASRAAADAPGSSPVAIP 83
+ + A R+ E A R A AL + G V
Sbjct: 31 LWLPHHDRLRSAGRKRKAEHLAGRIAAVHALREV------------------GVRTVPGM 72

Query: 84 KGRGGDPVWPRGVVGSLTHCAGYRAAVVAGIDALRTIGIDAEPHAPLPREARDIVGLAGE 143
G P+WP G+ GS++HCA AV+ + + IGID E A ++ +
Sbjct: 73 -GDKRQPLWPDGLFGSISHCATTALAVI----SRQRIGIDIEKIMSQH-TATELAPSIID 126

Query: 144 LHPHPPLGAG----VHADCVLFSAKESVGKAHFARYREWLGFADLHVTLHPDGAFTARRS 199
L A A + FSAKESV KA F+ GF VT +
Sbjct: 127 SDERQILQASLLPFPLALTLAFSAKESVYKA-FSDRVTLPGFNSAKVTSLTATHISLHLL 185

Query: 200 AP--GPIPFPAYRGGWCVTEGIVLTCAWLAVPRIPSAVARPA 239
+ R W + V+T A+ R+P + PA
Sbjct: 186 PAFAATMAERTVRTEWFQRDNSVIT-LVSAITRVPHDRSAPA 226


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
CMM_0335 | cloacin | 34 | 0.002 | Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 33.5 bits (76), Expect = 0.002
Identities = 18/40 (45%), Positives = 21/40 (52%)

Query: 564 ASLGSVSSSMHGTYSGSSSSGGSTGGTTSGGGGGGGGGGG 603
AS GS SS + + G S SG GG + G GGG G G
Sbjct: 33 ASDGSGWSSENNPWGGGSGSGIHWGGGSGHGNGGGNGNSG 72



Score = 32.4 bits (73), Expect = 0.006
Identities = 18/61 (29%), Positives = 21/61 (34%), Gaps = 9/61 (14%)

Query: 544 ARDRPGWYAGQTPFSPAVFAASLGSVSSSMHGTYSGSSSSGGSTGGTTSGGGGGGGGGGG 603
A D GW + P+ G S S GS G G + GG G GG
Sbjct: 33 ASDGSGWSSENNPW---------GGGSGSGIHWGGGSGHGNGGGNGNSGGGSGTGGNLSA 83

Query: 604 V 604
V
Sbjct: 84 V 84


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
CMM_0337 | BLACTAMASEA | 34 | 0.001 | Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 33.6 bits (77), Expect = 0.001
Identities = 20/92 (21%), Positives = 42/92 (45%), Gaps = 4/92 (4%)

Query: 183 PIGSIVKLYVLGALVQAVEEGRIGWDDPLTVT-DDVRSLPSGELQDAPTGTVVSVRDTAE 241
P+ S K+ + GA++ V+ G + + D+ + + + ++V +
Sbjct: 63 PMMSTFKVVLCGAVLARVDAGDEQLERKIHYRQQDL--VDYSPVSEKHLADGMTVGELCA 120

Query: 242 KMIAISDNTATDMLIQAV-GREAVEAALVDLG 272
I +SDN+A ++L+ V G + A L +G
Sbjct: 121 AAITMSDNSAANLLLATVGGPAGLTAFLRQIG 152


9 | CMM_0459 | CMM_0464 | Y | Y | N | Pathogenicity Island (biased-composition)
LocusTag | DNBias | CDNBias | %GCBias | Product
CMM_0459291.454411hypothetical protein
CMM_0460291.555826putative membrane protein
CMM_04612151.853770putative ATP-dependent DNA helicase
CMM_04621123.060065putative adenine-specific DNA-modification
CMM_04634122.933008putative membrane protein
CMM_04643121.863550putative membrane protein
ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
CMM_0463 | SACTRNSFRASE | 34 | 1e-04 | Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 33.8 bits (77), Expect = 1e-04
Identities = 19/76 (25%), Positives = 26/76 (34%), Gaps = 9/76 (11%)

Query: 72 VGVGSAWVALDDPNGRPDTAFLFELLVDPSRRGCGYGRALLAAVEEATRAAGAPALALNV 131
+ + S W NG A + ++ V R G G ALL E + L L
Sbjct: 80 IKIRSNW------NGY---ALIEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLET 130

Query: 132 FGANRVAIALYASAGY 147
N A YA +
Sbjct: 131 QDINISACHFYAKHHF 146


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
CMM_0464 | NUCEPIMERASE | 58 | 9e-12 | Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 58.3 bits (141), Expect = 9e-12
Identities = 78/366 (21%), Positives = 125/366 (34%), Gaps = 97/366 (26%)

Query: 8 RALVLGGTGAIGGATAERLARDGWSV---DV--TGRDPLAMPARLTDL---GVRFHALDR 59
+ LV G G IG ++RL G V D D ARL L G +FH +D
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 60 ADACGIERLTGDG-VDLLVDLVAFTAADVQA------------------LLPVMRASG-- 98
AD G+ L G + + V+ +L R +
Sbjct: 62 ADREGMTDLFASGHFERVFISPH--RLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQ 119

Query: 99 SVVVASSRAVHVDDAGRHVNGDEPPRFPVPIPEDNPTLPPAAPGTDPFTREGYAPSKVAV 158
++ ASS +V+ + +P D+ P + YA +K A
Sbjct: 120 HLLYASSSSVYGLNR------------KMPFSTDDSVDHPVSL---------YAATKKAN 158

Query: 159 ERAALDS----GLPVTVIRPSKVHGRWAR-NARTRAVVERMLAGADTIELADRGASVDHL 213
E A GLP T +R V+G W R + + ML G +I++ + G
Sbjct: 159 ELMAHTYSHLYGLPATGLRFFTVYGPWGRPDMALFKFTKAMLEG-KSIDVYNYGKMKRDF 217

Query: 214 TAAANAAALVARVADVPGA-------------------RILHSADPDPLTAAEIVAVIAE 254
T + A + R+ DV R+ + + P+ + + + +
Sbjct: 218 TYIDDIAEAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALED 277

Query: 255 ELGWRGRI--VPLEPGVAGGAHPWAAAHPIVLDTRASL-----ALGYAPVGPGAVLLRAE 307
LG + +PL+PG VL+T A +G+ P ++
Sbjct: 278 ALGIEAKKNMLPLQPG-------------DVLETSADTKALYEVIGFTPETTVKDGVKNF 324

Query: 308 VAWIRD 313
V W RD
Sbjct: 325 VNWYRD 330


10 | CMM_0472 | CMM_0480 | Y | Y | N | Pathogenicity Island (biased-composition)
LocusTag | DNBias | CDNBias | %GCBias | Product
CMM_0472293.136072putative acetyltransferase
CMM_04732102.228085conserved hypothetical protein
CMM_0474492.233440putative serine protease, family S53
CMM_04764122.774113transcriptional regulator, TetR family
CMM_04775113.625103putative phospholipase C
CMM_0478493.439727DNA repair system specific for alkylated DNA
CMM_0479392.905397putative membrane protein
CMM_0480282.918625putative lipoate-protein ligase A
ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
CMM_0473 | HTHFIS | 79 | 7e-19 | FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 79.5 bits (196), Expect = 7e-19
Identities = 36/138 (26%), Positives = 59/138 (42%), Gaps = 2/138 (1%)

Query: 1 MDSGRVALVIEDDGDIRQLLEVVLRQGGFEVHSAGTATEGVRLAEEVSPDVITLDVGLPD 60
M + LV +DD IR +L L + G++V A R D++ DV +PD
Sbjct: 1 MTGATI-LVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPD 59

Query: 61 FDGFEAARRIR-LVSDAYIVMLTAQGEEVDTLLGLEAGADDYIVKPFRPRELRARISAMM 119
+ F+ RI+ D +++++AQ + + E GA DY+ KPF EL I +
Sbjct: 60 ENAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRAL 119

Query: 120 RRPRGGGDATATPAAGIP 137
P+ +
Sbjct: 120 AEPKRRPSKLEDDSQDGM 137


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
CMM_0474 | BACINVASINB | 31 | 0.002 | Salmonella/Shigella invasin protein B signature.
		>BACINVASINB#Salmonella/Shigella invasin protein B signature.

Length = 593

Score = 31.3 bits (70), Expect = 0.002
Identities = 38/110 (34%), Positives = 56/110 (50%), Gaps = 13/110 (11%)

Query: 89 TLVSMLVAAVVGGS---IAAMGLGLIFVTDACDVDAYV----CRDSLFTIGYGIAVAGPL 141
T+VS++ A GG+ +AA+GL ++ V D V A + +L I V PL
Sbjct: 326 TIVSVVAAVFTGGASLALAAVGLAVM-VADEI-VKAATGVSFIQQALNPIME--HVLKPL 381

Query: 142 F-LTGIAVIVALVGMIRGRTRPWLVLLIGVGASLAAYVLGAVLVLVAVPG 190
L G A+ AL G+ + + I VGA +AA + AV+V+VAV G
Sbjct: 382 MELIGKAITKALEGLGVDKKTAEMAGSI-VGAIVAAIAMVAVIVVVAVVG 430


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
CMM_0477 | FLGMOTORFLIN | 29 | 0.007 | Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 29.5 bits (66), Expect = 0.007
Identities = 15/59 (25%), Positives = 28/59 (47%)

Query: 151 LALTEGAVRTLVRQAGDEVPGVLIGRCTLDGEVTRAGEPVRVALTMSVVWGDPLPELAQ 209
L LT+G+V L AG+ + ++ G GEV + V +T + + + L++
Sbjct: 79 LRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGVRITDIITPSERMRRLSR 137


11 | CMM_0497 | CMM_0504 | Y | Y | N | Pathogenicity Island (biased-composition)
LocusTag | DNBias | CDNBias | %GCBias | Product
CMM_0497283.060907putative metal-dependent hydrolase
CMM_0498283.513167hypothetical protein
CMM_0499373.042246putative oxidoreductase
CMM_0500363.079017putative 1-acylglycerol-3-phosphate
CMM_0501362.763318putative tRNA processing ribonuclease BN
CMM_0502171.076331Deoxyribodipyrimidine photo-lyase
CMM_050319-0.107933putative glycosyl transferase
CMM_050429-0.404904transcriptional regulator, MarR family
ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
CMM_0500 | SACTRNSFRASE | 32 | 5e-04 | Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 32.2 bits (73), Expect = 5e-04
Identities = 16/67 (23%), Positives = 26/67 (38%), Gaps = 4/67 (5%)

Query: 97 MALVPSARGRGLGRALLVALVAAVRASGAPAVSLSVEDGNDRARALYDSLGFVAVGREGG 156
+A+ R +G+G ALL + + + + L +D N A Y F G
Sbjct: 95 IAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHF----IIGA 150

Query: 157 SDVLLLR 163
D +L
Sbjct: 151 VDTMLYS 157


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
CMM_0501 | PF05616 | 41 | 3e-05 | Neisseria meningitidis TspB protein
		>PF05616#Neisseria meningitidis TspB protein

Length = 501

Score = 40.5 bits (94), Expect = 3e-05
Identities = 25/79 (31%), Positives = 29/79 (36%), Gaps = 3/79 (3%)

Query: 217 AAAAAVPVPTPSASPTPAPTPVPDPDPTPAPGPTPAPDPSPAPAPGRVATGVTVTAPDGT 276
+A A P P SP P P P+ P P P PDP P G T PD
Sbjct: 319 SAEAPNAQPLPEVSPAENPANNPAPNENPGTRPNPEPDPDLNPDANPDTDGQPGTRPDSP 378

Query: 277 RVIDGQPRVDGLDRSATAD 295
V D R +G R +
Sbjct: 379 AVPD---RPNGRHRKERKE 394



Score = 35.5 bits (81), Expect = 0.001
Identities = 26/95 (27%), Positives = 33/95 (34%), Gaps = 6/95 (6%)

Query: 224 VPTPSASPTPAPTPVPDPDPTPAPGPTPAPDPSPAPAPGRVATGVTVTAPDGTRVIDGQP 283
+P P +P A P P P +P PA +P+P PG PD D P
Sbjct: 310 IPRPDLTPGSAEAPNAQPLPEVSPAENPANNPAPNENPGTRPN----PEPDPDLNPDANP 365

Query: 284 RVDGLDRSATADSLTASDGSPEHAASVHVETGADG 318
DG + T A P + G DG
Sbjct: 366 DTDG--QPGTRPDSPAVPDRPNGRHRKERKEGEDG 398



Score = 34.3 bits (78), Expect = 0.002
Identities = 23/82 (28%), Positives = 31/82 (37%), Gaps = 6/82 (7%)

Query: 212 GSVTCAAAAAVPVPTPSASPTPAPTPVPDPDPTPAPGPTPAPDPSPAP----APGRVATG 267
GS A +P +P+ +P P P +P P P P P +P P PG
Sbjct: 318 GSAEAPNAQPLPEVSPAENPANNPAPNENPGTRPNPEPDPDLNPDANPDTDGQPGTRPDS 377

Query: 268 VTVTAPDGTRVIDGQPRVDGLD 289
V PD + R +G D
Sbjct: 378 PAV--PDRPNGRHRKERKEGED 397


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
CMM_0502 | SUBTILISIN | 48 | 6e-08 | Subtilisin serine protease family (S8) signature.
		>SUBTILISIN#Subtilisin serine protease family (S8) signature.

Length = 326

Score = 47.9 bits (114), Expect = 6e-08
Identities = 43/195 (22%), Positives = 68/195 (34%), Gaps = 31/195 (15%)

Query: 243 VDDVPAEDRGSGATIAIVAAYDDPDTQADTDTYSLA---VGEPAFTAGQYRDHPSASPRT 299
V + RG G +A++ DT D D L +G FT D
Sbjct: 31 APAVWNQTRGRGVKVAVL------DTGCDADHPDLKARIIGGRNFTDDDEGDPEIFKDYN 84

Query: 300 G---ICGGPTAWTDEQHLDVQAVHAMAPDATVS----YWGADDCTSTSLYTRILDAAEDG 352
G G A T+ ++ V +AP+A + + I A E
Sbjct: 85 GHGTHVAGTIAATENEN----GVVGVAPEADLLIIKVLNKQGSGQYDWIIQGIYYAIEQK 140

Query: 353 PDVISLSFGAMEGLDTADDRELLNRVLVEAASRDVSVFASTGNDGDYSGVGDHGGNATVA 412
D+IS+S G E + L+ + +A + + V + GN+GD +
Sbjct: 141 VDIISMSLGGPEDVPE------LHEAVKKAVASQILVMCAAGNEGDG-----DDRTDELG 189

Query: 413 SPASSPYVTAVGATS 427
P V +VGA +
Sbjct: 190 YPGCYNEVISVGAIN 204


ORFs having significant similarity with Known Virulence factors
LocusTag | Hits | Score | E-value | Comments
CMM_0503 | HTHTETR | 50 | 1e-09 | TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 49.6 bits (118), Expect = 1e-09
Identities = 33/208 (15%), Positives = 65/208 (31%), Gaps = 13/208 (6%)

Query: 1 MNQRQAAVSRARREIMEAAGAQFAAHGYEGTSFSRVAEAMGKPKSAIGYHLFASKESLAG 60
+ + R+ I++ A F+ G TS +A+A G + AI Y F K L
Sbjct: 2 ARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAI-YWHFKDKSDLFS 60

Query: 61 AVVEDQEDRWLRIEAALDRP------GALHELIVFLLTGASTVEVCPVAAGAIRLLQDMP 114
+ E E +E L E+++ +L T E + I +
Sbjct: 61 EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV 120

Query: 115 RLGLAVERR-----FDVWRFTREHLEAELAVRDIRAG-DLDAVVDVLLSATFGVLSYRSP 168
V++ + + + L+ + + + A ++ G++
Sbjct: 121 GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF 180

Query: 169 RLAEHDSAERLRSLWIPLLVGLGIDDAD 196
D + R LL +
Sbjct: 181 APQSFDLKKEARDYVAILLEMYLLCPTL 208


12 | CMM_0569 | CMM_0574 | Y | N | N | Genomic Island
LocusTag | DNBias | CDNBias | %GCBias | Product
CMM_0569  2  12  -3.050161  conserved hypothetical protein
CMM_0570  2  19  -4.271942  putative aldose-1-epimerase
CMM_0571  3  18  -4.182880  putative glyoxylase family protein
CMM_0572  3  19  -2.772365  putative transcription regulator, MarR family
CMM_0573  2  18  -2.230934  hypothetical protein
CMM_0574  2  20  -2.549857  putative levanase/invertase
13  CMM_0606  CMM_0627  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
CMM_0606  2  8  0.631554  putative membrane protein
CMM_0607  2  7  1.592790  putative transcriptional regulator, TetR family
CMM_0608  3  10  1.744472  putative efflux MFS permease
CMM_0609  2  6  0.905390  putative membrane protein
CMM_0610  0  6  1.721696  putative monooxygenase
CMM_0611  1  7  1.886389  putative monooxygenase
CMM_0612  2  8  2.375433  putative Zn-dependant quinone oxidoreductase
CMM_0613  2  9  2.323393  putative serine O-acetyltransferase
CMM_0614  -2  9  1.266566  putative RNA methyltransferase
CMM_0615  -1  9  2.195914  conserved hypothetical protein
CMM_0616  0  8  2.020381  hypothetical protein
CMM_0617  0  9  2.405892  putative endoribonuclease L-PSP
CMM_0618  0  10  1.985682  putative adenosine deaminase
CMM_0619  -2  8  2.508572  hypothetical protein
CMM_0620  0  8  3.841812  conserved hypothetical protein
CMM_0621  0  8  3.563458  NADP-dependent alcohol dehydrogenase
CMM_0622  -1  7  3.084366  putative carboxylesterase, type B
CMM_0623  6  26  -5.013327  putative flavin-dependant reductase
CMM_0624  7  28  -5.637021  putative secreted phosphoesterase
CMM_0625  9  40  -9.672422  conserved hypothetical protein, putative
CMM_0626  8  36  -8.832663  hypothetical protein
CMM_PS_14  9  36  -8.668717  hypothetical protein
CMM_0627  8  35  -8.314419  conserved membrane protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CMM_0627  BINARYTOXINB  40  1e-04  Binary toxin B family signature.
		>BINARYTOXINB#Binary toxin B family signature.

Length = 764

Score = 40.1 bits (93), Expect = 1e-04
Identities = 24/103 (23%), Positives = 49/103 (47%), Gaps = 12/103 (11%)

Query: 1313 SLRLTGLVSFPETGQYTFRTTSDDGVRVWLDDVLIIDQWTAQLPTDSTAQAISVTAGEAR 1372
S +G + ++ +YTF T++D+ V +W+DD +I++ S + I + G
Sbjct: 91 SAIWSGFIKVKKSDEYTFATSADNHVTMWVDDQEVINK-------ASNSNKIRLEKGRLY 143

Query: 1373 RIRIEYFE---VDLSAMLQLKWATPSQAGFTIVPGSSLR-PDY 1411
+I+I+Y + +L W SQ ++ +L+ P+
Sbjct: 144 QIKIQYQRENPTEKGLDFKLYWTD-SQNKKEVISSDNLQLPEL 185


14  CMM_0651  CMM_0681  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
CMM_0651  2  11  1.770105  putative hydrolase
CMM_0652  2  10  0.311849  putative transcriptional regulator, Cro/CI
CMM_0653  3  12  0.113289  hypothetical protein
CMM_0654  2  11  0.730939  two component system, sensor kinase
CMM_0655  2  9  0.557514  putative two-component system response
CMM_0656  3  11  0.700994  hypothetical protein
CMM_0657  10  52  -10.724400  putative acetyltransferase
CMM_0658  8  45  -10.461381  hypothetical protein
CMM_0659  9  40  -8.513563  hypothetical protein
CMM_0660  11  43  -8.958252  conserved hypothetical protein
CMM_PS_15  10  43  -9.179269  putative beta lactamase/penicillin-binding
CMM_0661  10  42  -8.884660  putative two-component system, sensor kinase
CMM_0662  -2  7  0.581764  putative two-component system response
CMM_0663  -1  9  1.758357  stress-induced protein, putative organic
CMM_0664  -2  9  0.627075  conserved hypothetical protein
CMM_0665  -1  8  0.936258  conserved hypothetical protein
CMM_0666  -2  10  1.506633  conserved hypothetical protein
CMM_0667  -2  10  1.771962  hypothetical protein
CMM_0668  2  10  3.011217  conserved membrane protein, putative
CMM_0669  2  9  3.141603  putative pyruvyl-transferase
CMM_0670  1  9  3.062527  putative membrane protein, possibly a
CMM_0671  3  7  4.243251  putative mannosyltransferase
CMM_0672  3  9  4.804619  putative glycosyltransferase
CMM_0673  1  10  3.758344  putative undecaprenyl-phosphate sugar
CMM_0675  -1  11  2.861892  putative protein tyrosine kinase
CMM_0676  0  12  2.650573  putative acyltransferase
CMM_0677  1  13  1.824650  putative membrane-bound acyltransferase
CMM_0678  2  13  1.597466  putative esterase
CMM_0679  2  13  1.487021  conserved hypothetical protein
CMM_0680  2  12  1.644403  putative hydrolase
CMM_0681  2  16  1.466849  phosphoribosylformylglycinamidine synthase II
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CMM_0671  ABC2TRNSPORT  39  3e-05  ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 39.1 bits (91), Expect = 3e-05
Identities = 29/115 (25%), Positives = 48/115 (41%), Gaps = 5/115 (4%)

Query: 563 AIVAGALGFRIAHPLAMYGTMVLASITFAAIILALNVLLGSVGQ----FLGLVLMVVQLV 618
+VA ALG+ +Y V+A A L + V + F +++ L
Sbjct: 133 GVVAAALGY-TQWLSLLYALPVIALTGLAFASLGMVVTALAPSYDYFIFYQTLVITPILF 191

Query: 619 TAGGTFPWQTLPGPLAALHHVLPMSFAVDALRQLMYGGDLGQAAQDAGVLALWLV 673
+G FP LP LP+S ++D +R +M G + Q G L +++V
Sbjct: 192 LSGAVFPVDQLPIVFQTAARFLPLSHSIDLIRPIMLGHPVVDVCQHVGALCIYIV 246


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CMM_0673  HTHTETR  50  2e-09  TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 49.6 bits (118), Expect = 2e-09
Identities = 34/168 (20%), Positives = 62/168 (36%), Gaps = 16/168 (9%)

Query: 7 ARAPRKDAATNRQALVDAAVVALDRDP--DASLETIAAAAGLSRRAVYGHFATRDDLVRE 64
AR +++A RQ ++D A+ + SL IA AAG++R A+Y HF + DL E
Sbjct: 2 ARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSE 61

Query: 65 VLQRGARRVVEALAGIAHDDSRIHIALIGARLWAEVEQVRVMARV--AVRGPHAREVGAE 122
+ + + E ++++ L +E R + +
Sbjct: 62 IWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVG 121

Query: 123 LAPLRAELQR------------VVERGIAAGELRGDIPAPTLARLIEG 158
+ + QR ++ I A L D+ A ++ G
Sbjct: 122 EMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRG 169


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CMM_0674  RTXTOXINA  30  0.036  Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 29.5 bits (66), Expect = 0.036
Identities = 26/113 (23%), Positives = 46/113 (40%), Gaps = 10/113 (8%)

Query: 18 NTPHADPAGIEAALVASPTAASDAASLLAPSPSGGSSIASASDATPDTASLGTGPLSLVA 77
N P+ D G V+ +A A+ +L+ + + + A+A L T ++
Sbjct: 231 NLPNLDNIGAGLDTVSGILSAISASFILSNADADTRTKAAAG------VELTT---KVLG 281

Query: 78 GAGAAVMGMAADSRAAAAATTAMPAATDAVAAPVDTATAPAPAAATTEPFLRA 130
G + RAA +T+ AA +A+ V A +P + + F RA
Sbjct: 282 NVGKGISQYIIAQRAAQGLSTSAAAAG-LIASAVTLAISPLSFLSIADKFKRA 333


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CMM_0678  YERSSTKINASE  30  0.032  Yersinia serine/threonine protein kinase signature.
		>YERSSTKINASE#Yersinia serine/threonine protein kinase signature.

Length = 732

Score = 30.5 bits (68), Expect = 0.032
Identities = 41/139 (29%), Positives = 57/139 (41%), Gaps = 28/139 (20%)

Query: 281 LGRLDGMAGVP------SIVETTEVDGHRYLATTRLDGDMLWQWQGKVNPLIRPGSTADE 334
L + GMA VP + EVDG R T R D W+ QGK+N G+
Sbjct: 193 LANVHGMAVVPYGNRKEEALLMDEVDGWRCSDTLRTLADS-WK-QGKINSEAYWGTIK-- 248

Query: 335 RAEFARRAMRLTRSVERLVAEMHRRGVTHGDLHPGNIL-ATDDDEARLIDFEV-ARSGTD 392
A R + +T + + GV H D+ PGN++ E +ID + +RSG
Sbjct: 249 --FIAHRLLDVTN-------HLAKAGVVHNDIKPGNVVFDRASGEPVVIDLGLHSRSGEQ 299

Query: 393 A-------AAPRRAMGGLG 404
AP +G LG
Sbjct: 300 PKGFTESFKAPELGVGNLG 318


15  CMM_0694  CMM_0705  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
CMM_0694  3  9  1.767859  putative integral membrane protein
CMM_0695  3  9  1.712742  conserved hypothetical protein
CMM_0696  0  10  3.081498  conserved membrane protein
CMM_0697  0  10  3.226598  putative membrane protein
CMM_0698  0  9  3.129479  putative phosphoribulokinase
CMM_0699  0  9  3.982482  putative dioxygenase
CMM_0700  -1  10  3.530898  putative monooxygenase
CMM_0701  -1  10  3.730646  putative transcriptional regulator, MarR family
CMM_0702  -1  11  3.912042  putative glycosyl transferase
CMM_0703  0  10  3.541182  conserved hypothetical protein, acyl-CoA
CMM_0704  -1  11  3.426272  hypothetical protein
CMM_0705  -1  11  3.191398  phosphoribosylaminoimidazolesuccinocarboxyamide
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CMM_0700  HTHFIS  80  1e-19  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 79.9 bits (197), Expect = 1e-19
Identities = 39/127 (30%), Positives = 64/127 (50%), Gaps = 1/127 (0%)

Query: 5 ARVLVVEDDAAIRAAVVSTLTAERFVVRGLASGVDLEEEVKGFLPDLVVLDWMLPGPSGI 64
A +LV +DDAAIR + L+ + VR ++ L + DLVV D ++P +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 65 QLAERIR-RWSDASVIMLTARDAVEDRLRGFGQGVDDYIVKPFALAELVARVGAVLRRRG 123
L RI+ D V++++A++ ++ +G DY+ KPF L EL+ +G L
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 124 RLASVVE 130
R S +E
Sbjct: 124 RRPSKLE 130


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CMM_0701  GPOSANCHOR  33  0.001  Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 32.7 bits (74), Expect = 0.001
Identities = 11/60 (18%), Positives = 14/60 (23%), Gaps = 1/60 (1%)

Query: 156 PATPPTPPTNADGTPATPPAPPADGCAPPAPPVGADGKAPTPPAGPAAPATTPGSSEGST 215
A + TP P A AP G P P + E +
Sbjct: 455 LAKLRAGKASDSQTPDAKPGNKAVPGKGQAPQAGTKPNQNKAPMKETK-RQLPSTGETAN 513


16  CMM_0733  CMM_0751  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
CMM_0733  8  28  -7.147860  putative oxidoreductase
CMM_0734  8  29  -7.352875  putative transcriptional regulator, Cro/CI
CMM_0735  8  31  -7.832208  hypothetical membrane protein
CMM_0736  8  30  -7.910632  putative transcriptional regulator, GntR family
CMM_0737  8  28  -7.383063  putative triosephosphate isomerase
CMM_0738  8  33  -7.019345  putative ribose 5-phosphate isomerase
CMM_0739  -1  9  0.565926  putative dihydroxyacetone kinase
CMM_0740  -1  10  0.714324  putative transcriptional regulator, GntR-family
CMM_0741  0  10  0.655883  putative sugar ABC transporter ATP-binding
CMM_0742  0  11  1.764098  putative sugar ABC transporter, permease
CMM_0743  1  10  1.687688  putative sugar ABC transporter, solute-binding
CMM_0744  1  11  1.552380  putative myo-inositol dehydrogenase
CMM_0745  1  11  3.101307  putative sugar epimerase
CMM_0746  2  10  4.240883  putative extracellular nuclease/phosphatase
CMM_0747  2  9  4.317629  putative thioredoxin
CMM_0748  0  11  3.855500  putative ATP-dependent DNA helicase
CMM_0749  0  11  3.604451  conserved hypothetical protein
CMM_0750  0  9  3.685821  putative oligopeptide ABC transporter,
CMM_0751  -1  9  3.326568  putative oligopeptide ABC transporter, permease
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CMM_0739  SACTRNSFRASE  41  4e-07  Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 41.1 bits (96), Expect = 4e-07
Identities = 16/57 (28%), Positives = 25/57 (43%), Gaps = 3/57 (5%)

Query: 90 ITPDARGRAAGVADMLLDAVIAWARDRPNAQALRLEVHEDNPRARAYYERRGFVLTG 146
+ D R + GV LL I WA++ L LE + N A +Y + F++
Sbjct: 97 VAKDYRKK--GVGTALLHKAIEWAKENH-FCGLMLETQDINISACHFYAKHHFIIGA 150


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CMM_0747  PRTACTNFAMLY  28  0.038  Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 28.5 bits (63), Expect = 0.038
Identities = 33/108 (30%), Positives = 41/108 (37%), Gaps = 2/108 (1%)

Query: 211 ITDATKAVATSQAALPEDACTAVYALPEYAGSTANLAQVALDSDNVFGDDGGELQLGTVT 270
I D + Q+ PED + L + TA A A + +V G L G +T
Sbjct: 178 IVDGGLHIGALQSLQPEDLPPSRVVLRD-TNVTAVPASGAPAAVSVLGASELTLDGGHIT 236

Query: 271 GDATAGYRVALTARVDTA-TTPTAGSAPSGGGGGGMGGPAGTPPGGDR 317
G AG A V T G AP+GG G P G PGG
Sbjct: 237 GGRAAGVAAMQGAVVHLQRATIRRGDAPAGGAVPGGAVPGGAVPGGFG 284


17  CMM_0800  CMM_0808  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias  Product
CMM_0800  2  11  0.251873  conserved hypothetical protein
CMM_0801  1  10  0.711027  pantoate--beta-alanine ligase
CMM_0802  1  10  1.911034  hypothetical protein
CMM_0803  1  10  2.667105  hypothetical protein
CMM_0804  1  9  4.076012  ATP-dependent protease, ATPase subunit
CMM_0805  0  8  3.792932  putative UDP-glucose 4-epimerase
CMM_0806  0  7  3.150811  putative oxidoreductase
CMM_0807  1  7  3.923455  50S ribosomal protein L10
CMM_0808  -2  7  3.029032  50S ribosomal protein L7/L12
18  CMM_0819  CMM_0825  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
CMM_0819  2  8  1.549094  putative cystathionine beta-synthase
CMM_0820  5  9  0.448183  Ribonuclease H
CMM_0821  4  9  0.119352  putative transcriptional regulator, LacI-family
CMM_0822  4  9  0.255129  putative L-ribulokinase
CMM_0823  5  7  -0.918476  putative L-ribulose-5-phosphate 4-epimerase
CMM_0824  6  7  -0.781680  putative L-arabinose isomerase
CMM_0825  5  7  -0.566477  putative sugar ABC transporter,
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CMM_0819  PF03544  35  9e-04  Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 35.0 bits (80), Expect = 9e-04
Identities = 18/91 (19%), Positives = 28/91 (30%)

Query: 270 AVRPLTAPVAAPAPTPTPAPAPTTPAPGAGSGSGSSDAEQGVSLGDARTGAGSAPVGSTA 329
AV+P PV P P P P P P AP + + +
Sbjct: 65 AVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVEQPKRDVKPVESR 124

Query: 330 YGVPSDAVFVAPTGSNGGSGSKSSPYATIQR 360
P + A S+ + + S P ++
Sbjct: 125 PASPFENTAPARPTSSTATAATSKPVTSVAS 155


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CMM_0820  PF03544  37  2e-04  Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 37.3 bits (86), Expect = 2e-04
Identities = 20/121 (16%), Positives = 32/121 (26%), Gaps = 8/121 (6%)

Query: 242 VWSYLSSSSPTQAVAFDDVTVRPLTTAASPMPTPTPTPTPTP----TPTPTPTAPAPTSS 297
+ P A V P P P P P P P P AP
Sbjct: 35 TSVHQVIELPAPAQPISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVI 94

Query: 298 DPEQTVPRGDARPG----TGSGSAAVGTTTYPAPADGVYVSPTGSNTGAGTKASPYASIQ 353
+ + P+ +P + +P + + S+T + P S+
Sbjct: 95 EKPKPKPKPKPKPVKKVEQPKRDVKPVESRPASPFENTAPARPTSSTATAATSKPVTSVA 154

Query: 354 K 354

Sbjct: 155 S 155


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CMM_0825  MICOLLPTASE  101  2e-23  Microbial collagenase metalloprotease (M9) signature.
		>MICOLLPTASE#Microbial collagenase metalloprotease (M9) signature.

Length = 1104

Score = 101 bits (253), Expect = 2e-23
Identities = 38/143 (26%), Positives = 58/143 (40%), Gaps = 8/143 (5%)

Query: 1093 QDVTVKAAPANVAPTAVVTATATDLTA---KLDGSASTDADGTVASYAWDFGDGSTGTGP 1149
T N P AV+ + ++ + DG+ S D DG + +Y WDFGDG
Sbjct: 762 NTDTNTDVHVNKEPKAVIKSDSSVIVEEEINFDGTESKDEDGEIKAYEWDFGDGEKSNEA 821

Query: 1150 TPTHAYAAGGTYTVALTVTDDKGLTGTASTQVTVVA----PPVN-REPTAVIASTTADLV 1204
TH Y G Y V LTVTD+ G T S ++ VV +N EP
Sbjct: 822 KATHKYNKTGEYEVKLTVTDNNGGINTESKKIKVVEDKPVEVINESEPNNDFEKANQIAK 881

Query: 1205 ANLDGRASSDPDGTIASWAWEFG 1227
+N+ + + + + ++
Sbjct: 882 SNMLVKGTLSEEDYSDKYYFDVA 904



Score = 92.1 bits (228), Expect = 2e-20
Identities = 36/133 (27%), Positives = 54/133 (40%), Gaps = 4/133 (3%)

Query: 1572 NQAPTAAFTSTANGLTA---SFDGSGSTDADGTVASYAWAFGDGTTGTGRTATHAYAAAG 1628
N+ P A S ++ + +FDG+ S D DG + +Y W FGDG ATH Y G
Sbjct: 772 NKEPKAVIKSDSSVIVEEEINFDGTESKDEDGEIKAYEWDFGDGEKSNEAKATHKYNKTG 831

Query: 1629 TYAVSLTVTDDKGLVSAKKDGTVQVS-APVVTPANQAPTAAFTSTAKDLTASFDASTSTD 1687
Y V LTVTD+ G ++ + V PV P F + ++ +
Sbjct: 832 EYEVKLTVTDNNGGINTESKKIKVVEDKPVEVINESEPNNDFEKANQIAKSNMLVKGTLS 891

Query: 1688 ADGTVASYAWAFG 1700
+ Y +
Sbjct: 892 EEDYSDKYYFDVA 904



Score = 90.9 bits (225), Expect = 4e-20
Identities = 41/142 (28%), Positives = 60/142 (42%), Gaps = 8/142 (5%)

Query: 1180 QVTVVAPPVNREPTAVIASTTADLVA---NLDGRASSDPDGTIASWAWEFGDGTTGAGAS 1236
T VN+EP AVI S ++ +V N DG S D DG I ++ W+FGDG A
Sbjct: 763 TDTNTDVHVNKEPKAVIKSDSSVIVEEEINFDGTESKDEDGEIKAYEWDFGDGEKSNEAK 822

Query: 1237 IAHPYAKAGTYTVALTVTDDKGATGRTTASVTVTA----PPVNQA-PVAAFTSTVANLVA 1291
H Y K G Y V LTVTD+ G + + V +N++ P F +
Sbjct: 823 ATHKYNKTGEYEVKLTVTDNNGGINTESKKIKVVEDKPVEVINESEPNNDFEKANQIAKS 882

Query: 1292 SLDASTSSDPDGTVASYAWAFG 1313
++ + + Y +
Sbjct: 883 NMLVKGTLSEEDYSDKYYFDVA 904



Score = 83.6 bits (206), Expect = 6e-18
Identities = 49/220 (22%), Positives = 73/220 (33%), Gaps = 39/220 (17%)

Query: 1012 TGRVPNQAPKAAFTQTADFLTA---SFDATGSTDGDGTVTGYAWDFGDGVQASGATQSHT 1068
T N+ PKA + + +FD T S D DG + Y WDFGDG +++ A +H
Sbjct: 767 TDVHVNKEPKAVIKSDSSVIVEEEINFDGTESKDEDGEIKAYEWDFGDGEKSNEAKATHK 826

Query: 1069 YAAAGTYPVVLTVTDDRGTTNRTQQDVT-VKAAPANV----APTAVVTATATDLTAKLDG 1123
Y G Y V LTVTD+ G N + + V+ P V P + +
Sbjct: 827 YNKTGEYEVKLTVTDNNGGINTESKKIKVVEDKPVEVINESEPNNDFEKANQIAKSNMLV 886

Query: 1124 SASTDADGTVASYAWDF-------------------------GDGST-GTGPTPTHAYAA 1157
+ + Y +D GD + T
Sbjct: 887 KGTLSEEDYSDKYYFDVAKKGNVKITLNNLNSVGITWTLYKEGDLNNYVLYATGNDGTVL 946

Query: 1158 GGTYTVA-----LTVTDDKGLTGTASTQVTVVAPPVNREP 1192
G T+ L+V +GT + V +E
Sbjct: 947 KGEKTLEPGRYYLSVYTYDNQSGTYTVNVKGNLKNEVKET 986



Score = 82.1 bits (202), Expect = 2e-17
Identities = 35/127 (27%), Positives = 52/127 (40%), Gaps = 14/127 (11%)

Query: 1268 TVTAPPVNQAPVAAFTSTVANLVA---SLDASTSSDPDGTVASYAWAFGDGTTGTGRTTT 1324
T T VN+ P A S + +V + D + S D DG + +Y W FGDG T
Sbjct: 765 TNTDVHVNKEPKAVIKSDSSVIVEEEINFDGTESKDEDGEIKAYEWDFGDGEKSNEAKAT 824

Query: 1325 HAYAAAGTFAVSLTVTDDKGLATTTTSQVTV-----------QAPASNVLAQDSFGRAVA 1373
H Y G + V LTVTD+ G T + ++ V P ++ + ++
Sbjct: 825 HKYNKTGEYEVKLTVTDNNGGINTESKKIKVVEDKPVEVINESEPNNDFEKANQIAKSNM 884

Query: 1374 TGWGTAD 1380
GT
Sbjct: 885 LVKGTLS 891



Score = 78.6 bits (193), Expect = 2e-16
Identities = 29/76 (38%), Positives = 38/76 (50%), Gaps = 3/76 (3%)

Query: 1662 NQAPTAAFTSTAKDLTA---SFDASTSTDADGTVASYAWAFGDGTTGTGKTATHAYAAAG 1718
N+ P A S + + +FD + S D DG + +Y W FGDG ATH Y G
Sbjct: 772 NKEPKAVIKSDSSVIVEEEINFDGTESKDEDGEIKAYEWDFGDGEKSNEAKATHKYNKTG 831

Query: 1719 TYAVSLTVTDDKGLAS 1734
Y V LTVTD+ G +
Sbjct: 832 EYEVKLTVTDNNGGIN 847


19  CMM_0845  CMM_0880  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
CMM_0845  4  10  3.889194  putative malate:quinone oxidoreductase
CMM_0846  3  9  3.654074  conserved hypothetical protein
CMM_0847  3  8  3.379480  putative two-component system, response
CMM_0848  3  9  3.096482  putative tyrosine-protein kinase
CMM_0849  0  12  1.800240  hypothetical protein
CMM_0850  0  14  1.468189  hypothetical protein
CMM_0851  0  14  1.172218  hypothetical protein
CMM_0852  0  15  -1.924232  hypothetical protein
CMM_0853  0  14  -2.827011  conserved membrane protein
CMM_0854  1  11  -3.377672  putative flavin-dependent
CMM_0855  2  10  -4.252335  putative penicillin-binding protein
CMM_0856  4  10  -4.195742  putative acyltransferase
CMM_0857  2  7  -2.429192  putative membrane protein
CMM_0858  -1  7  -0.606744  *putative phosphohydrolase
CMM_0859  0  6  0.100548  putative penicillin-binding protein
CMM_0860  0  7  0.488373  conserved hypothetical protein
CMM_0861  -1  7  -0.009923  putative endoribonuclease, putative translation
CMM_0862  -1  7  -0.953656  acetyl-coenzyme A synthetase
CMM_0863  0  9  -2.574591  conserved hypothetical protein
CMM_0864  0  10  -2.710871  conserved membrane protein
CMM_0865  -1  11  -3.096559  hypothetical protein
CMM_0866  -1  12  -3.506640  hypothetical protein
CMM_0867  -1  10  -3.075092  hypothetical protein
CMM_0868  -3  10  -2.051996  DNA topoisomerase I
CMM_0869  -3  8  -1.372191  DNA polymerase III, delta' subunit
CMM_0870  -2  9  -0.977123  *D-alanyl-D-alanine carboxypeptidase
CMM_0871  -1  11  -0.880627  putative pyrazinamidase / nicotinamidase
CMM_0872  -1  11  -0.839933  putative succinate-semialdehyde dehydrogenase
CMM_0873  -2  12  -0.987041  putative tRNA/rRNA methyltransferase
CMM_0874  -1  14  -1.773328  hypothetical protein
CMM_0875  -1  15  -3.175273  hypothetical protein
CMM_0876  0  15  -4.125123  putative multidrug export ABC transporter, fused
CMM_0877  1  14  -4.849069  putative multidrug export ABC transporter,
CMM_0878  0  11  -4.158922  putative multidrug export ABC
CMM_0879  -1  9  -3.765459  putative multidrug export ABC
CMM_0880  -1  7  -3.242677  Putative MFS-type efflux transporter
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CMM_0849  RTXTOXINA  32  8e-04  Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 32.2 bits (73), Expect = 8e-04
Identities = 14/50 (28%), Positives = 25/50 (50%), Gaps = 4/50 (8%)

Query: 84 KASA---LAGALLTGITGGVLAFVLARPVLPGASSVGLAVAGTVGAVVLL 130
KA+A L +L + G+ +++A+ G S+ A AG + + V L
Sbjct: 268 KAAAGVELTTKVLGNVGKGISQYIIAQRAAQGLSTSA-AAAGLIASAVTL 316


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CMM_0857  HTHFIS  37  4e-04  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 36.7 bits (85), Expect = 4e-04
Identities = 30/154 (19%), Positives = 60/154 (38%), Gaps = 16/154 (10%)

Query: 515 VIGQEEAISALSKTIRRTRAGLKDPRRPSGSFIFAGPTGVGKTELAKALAEFLFDDEDAL 574
++G+ A+ + + + R + + G +G GK +A+AL ++
Sbjct: 139 LVGRSAAMQEIYRVLARLMQT-------DLTLMITGESGTGKELVARALHDYGKRRNGPF 191

Query: 575 ISLDMSEYGEKHTVSRLFGAPPG-FVGFEEGGQLTEKVRRKPFSVVLFDEIEKAHPDIFN 633
++++M+ S LFG G F G + + + + DEI D
Sbjct: 192 VAINMAAIPRDLIESELFGHEKGAFTGAQTRST--GRFEQAEGGTLFLDEIGDMPMDAQT 249

Query: 634 SLLQILEEG---RLTDGQGRVVDFKNTVIIMTTN 664
LL++L++G + D + I+ TN
Sbjct: 250 RLLRVLQQGEYTTVGGRTPIRSDVR---IVAATN 280



Score = 31.0 bits (70), Expect = 0.021
Identities = 17/60 (28%), Positives = 33/60 (55%), Gaps = 8/60 (13%)

Query: 173 RNLTQAARDGKLDPVIGREKEIERVMQILSRRSKNN-PVLI-GEPGVGKTAVVEGLAQAI 230
L ++DG P++GR ++ + ++L+R + + ++I GE G GK V A+A+
Sbjct: 127 SKLEDDSQDG--MPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELV----ARAL 180


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CMM_0858  NUCEPIMERASE  99  4e-26  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 99.1 bits (247), Expect = 4e-26
Identities = 70/338 (20%), Positives = 117/338 (34%), Gaps = 78/338 (23%)

Query: 1 MRIAVTGGSGKLGRHVVADLRAHGHEVTNID----------------QVGERGSGYVRVD 44
M+ VTG +G +G HV L GH+V ID + + G + ++D
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKID 60

Query: 45 TTDYGQVVDALFGVQDLHDGFDAVVHLAAIPA--PAILSDVATFHNNMLTSFNVFQAARR 102
D + + LF F+ V A ++ + A +N+ N+ + R
Sbjct: 61 LADR-EGMTDLFASG----HFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRH 115

Query: 103 AGIRKIVYASSETVLGLPFDVPPPYIPVDEEYPA-QPNSTYSLVKHLEEQMAI---ELCR 158
I+ ++YASS +V GL +P + P S Y+ K E MA L
Sbjct: 116 NKIQHLLYASSSSVYGL-----NRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLY- 169

Query: 159 WDPELQVTALRFSNVM---------------DVDDYEEFPGFDDDALARKWNLWGYIDGR 203
L T LRF V + + + ++ + R + YID
Sbjct: 170 ---GLPATGLRFFTVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFT---YID-- 221

Query: 204 DGAQAVRKALEHDGTGFDRFIVANADTVMSRSSAEL----------------AAEVFPGV 247
D A+A+ + + ++ V S + + A E G+
Sbjct: 222 DIAEAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGI 281

Query: 248 EVTKELGE------HETLLSIDKARRILGYEPEHTWRD 279
E K + ET ++G+ PE T +D
Sbjct: 282 EAKKNMLPLQPGDVLETSADTKALYEVIGFTPETTVKD 319


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CMM_0864  TCRTETA  63  7e-13  Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 62.9 bits (153), Expect = 7e-13
Identities = 73/358 (20%), Positives = 134/358 (37%), Gaps = 21/358 (5%)

Query: 17 RSVFAVAFACVIAFMGIGLVDPILPAIASSLDATATEAE---LLFTSYLLVTGLAMLVTS 73
R + + + +GIGL+ P+LP + L + +L Y L+ V
Sbjct: 5 RPLIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLG 64

Query: 74 WISSRIGAKRTLLIGLAIIVVFAAAAGLSQDVEQVIGFRAGWGLGNALFISTALATIVGS 133
+S R G + LL+ LA V A + + + R G+ A + A A I
Sbjct: 65 ALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATG-AVAGAYIADI 123

Query: 134 ASGGT-ASAIMLYEAALGLGIAVGPLLGGLLGSWSWRGPFFGTATLMAVGFVAILALLGK 192
G A A G G+ GP+LGGL+G +S PFF A L + F+ LL +
Sbjct: 124 TDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPE 183

Query: 193 DDAP-RAPMRLSAP--------LRALRT--PALAVLAAAALFYNIGFFVLLAYTPFPLGF 241
R P+R A R + +AV L + + + + +
Sbjct: 184 SHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHW 243

Query: 242 DAIGLGLTFFGWGVGLAITSVLVAPLLTRRMARTSVLRLVLPLLAADLAAAGLVVRSSVG 301
DA +G++ +G+ ++ ++ + R+ L L + R +
Sbjct: 244 DATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGWMA 303

Query: 302 L--VVCVIVGGLLLGVLNTVLTECVMEATDHPRSVASSAYSSVRFLGGAIAPPAATEL 357
+V + GG+ + L +L+ + + + + +++ L + P T +
Sbjct: 304 FPIMVLLASGGIGMPALQAMLSR---QVDEERQGQLQGSLAALTSLTSIVGPLLFTAI 358


20  CMM_0889  CMM_0897  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
CMM_0889  2  9  2.201587  putative membrane protein
CMM_0890  2  6  2.035718  putative MFS permease
CMM_0891  1  6  1.582859  putative MFS permease
CMM_0892  4  11  0.656388  hypothetical protein
CMM_0893  3  11  0.446843  putative peptide ABC transporter, permease
CMM_0894  3  12  -0.267793  putative peptide ABC transporter,
CMM_0895  1  11  0.432258  putative MFS permease
CMM_0896  1  12  0.546305  putative peptide ABC transporter, ATP-binding
CMM_0897  2  12  0.267209  putative peptide ABC transporter, ATP-binding
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CMM_0889  TCRTETA  58  2e-11  Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 57.9 bits (140), Expect = 2e-11
Identities = 78/368 (21%), Positives = 133/368 (36%), Gaps = 34/368 (9%)

Query: 18 IVYIIVLGALVGLGPFTIDLYLPAFPVIKDQFGVSDAAVQLTLTGTTIGFALGQLVVGP- 76
++ V VG+G L +P P + S+ + +AL Q P
Sbjct: 9 VILSTVALDAVGIG-----LIMPVLPGLLRDLVHSNDVTAHYGILLAL-YALMQFACAPV 62

Query: 77 ---WSDRVGRRLPLIVATSLHILASLGAALAPDVTVLLVFRILQGAGAAGGAVVAMAMVR 133
SDR GRR L+V+ + + A AP + VL + RI+ G A GAV A +
Sbjct: 63 LGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAG-AYIA 121

Query: 134 DLFGGRPLVRMLSRLALVTGLAPILAPVIGSQLLRFVDWRGVFYALTAYAILVVIAVTFF 193
D+ G R ++ G + PV+G L+ F+A A L + F
Sbjct: 122 DITDGDERARHFGFMSACFGFGMVAGPVLGG-LMGGFSPHAPFFAAAALNGLNFLTGCFL 180

Query: 194 IVETLPKDRVRVEEKG-TLLRRYRSVLGDR-------VFVGVALIGGMQFAGLFSYLSSS 245
+ E+ +R + + L +R G VF + L+G + A L+
Sbjct: 181 LPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQV-PAALWVIFG-- 237

Query: 246 SFLFQDVYGFDAQQFGILF-GINSLGVVIGNQIAARLTKVIGPQWILAGVVTVQFLSSAT 304
+D + +DA GI L + I + +G + + + ++ T
Sbjct: 238 ----EDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGER----RALMLGMIADGT 289

Query: 305 IVLLGTFTDAGLLGTLIPLFFFILACGFGFPCVQVLGLVNHGHEAGTAASLLGAVNFGLA 364
+L F G + P+ + + G G P +Q + E A L
Sbjct: 290 GYILLAFATRGWM--AFPIMVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLT 347

Query: 365 GAISPIVG 372
+ P++
Sbjct: 348 SIVGPLLF 355


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CMM_0894  NUCEPIMERASE  32  0.001  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 32.4 bits (74), Expect = 0.001
Identities = 25/135 (18%), Positives = 47/135 (34%), Gaps = 21/135 (15%)

Query: 1 MRIIIAGGHGQIARLLERRLADQGHQPVGI---------VRNPDHASDLADAGAEALVLD 51
M+ ++ G G I + +RL + GHQ VGI LA G + +D
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKID 60

Query: 52 LE-KSGVDQVAEALRGADAVVFAAGGG-------PDSGPERKLTIDRDGAILLADAAERA 103
L + G+ + + + P + + LT G + + +
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLT----GFLNILEGCRHN 116

Query: 104 GVTRYVMISAMAVDG 118
+ + S+ +V G
Sbjct: 117 KIQHLLYASSSSVYG 131


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CMM_0896  ACETATEKNASE  492  e-177  Acetate kinase family signature.
		>ACETATEKNASE#Acetate kinase family signature.

Length = 400

Score = 492 bits (1269), Expect = e-177
Identities = 191/392 (48%), Positives = 261/392 (66%), Gaps = 5/392 (1%)

Query: 4 VLVVNSGSSSFKYQLIEMDTESVLASGLVERIGEPTGSTRHKAGGDSWERELPIADHTAG 63
+LV+N GSSS KYQLIE +VLA GL ERIG H A G+ + + + DH
Sbjct: 3 ILVINCGSSSLKYQLIESKDGNVLAKGLAERIGINDSLLTHNANGEKIKIKKDMKDHKDA 62

Query: 64 FQAMLDAF--AEHGPSLEEEPPVAVGHRVVHGGDVFVEPTVVTDDVKADIDDLSALAPLH 121
+ +LDA +++G + AVGHRVVHGG+ F ++TDDV I D LAPLH
Sbjct: 63 IKLVLDALVNSDYGVIKDMSEIDAVGHRVVHGGEYFTSSVLITDDVLKAITDCIELAPLH 122

Query: 122 NPGALQGIAAAQTAFPDVPHVAVFDTAFHQTLPAEAYTYAIDRELAAAHRIRRYGFHGTS 181
NP ++GI A PDVP VAVFDTAFHQT+P AY Y I E ++IR+YGFHGTS
Sbjct: 123 NPANIEGIKACTQIMPDVPMVAVFDTAFHQTMPDYAYLYPIPYEYYTKYKIRKYGFHGTS 182

Query: 182 HKFVSEAAARLLGKPLEETRIIVLHLGNGASAAAVQGGRSIDTSMGLTPLEGLVMGTRSG 241
HK+VS+ AA +L KP+E +II HLGNG+S AAV+ G+SIDTSMG TPLEGL MGTRSG
Sbjct: 183 HKYVSQRAAEILNKPIESLKIITCHLGNGSSIAAVKNGKSIDTSMGFTPLEGLAMGTRSG 242

Query: 242 DIDPAILFHLARHTDLGLDDLETLLNRRSGLLGLTGLG-DMRDVQRAAAD-GDEAAQTAL 299
IDP+I+ +L ++ +++ +LN++SG+ G++G+ D RD++ AA GD+ AQ AL
Sbjct: 243 SIDPSIISYLMEKENISAEEVVNILNKKSGVYGISGISSDFRDLEDAAFKNGDKRAQLAL 302

Query: 300 GVYRHRIRHYVGAYAAQLGGVDAVVFTAGVGENNPLVRRRSLAGLEFMGIGIDDDRNELI 359
V+ +R++ +G+YAA +GGVD +VFTAG+GEN P +R L GLEF+G +D ++N++
Sbjct: 303 NVFAYRVKKTIGSYAAAMGGVDVIVFTAGIGENGPEIREFILDGLEFLGFKLDKEKNKVR 362

Query: 360 SSEARFVSPEGSPVAVLVIPTDEELEIARQSL 391
E +S S V V+V+PT+EE IA+ +
Sbjct: 363 GEE-AIISTADSKVNVMVVPTNEEYMIAKDTE 393


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CMM_0897  ICENUCLEATIN  128  1e-31  Ice nucleation protein signature.
		>ICENUCLEATIN#Ice nucleation protein signature.

Length = 1258

Score = 128 bits (322), Expect = 1e-31
Identities = 142/480 (29%), Positives = 218/480 (45%), Gaps = 17/480 (3%)

Query: 1066 GAPAGDATVEFDVGVDGEEGVDSDQTAAIDVDGDGTDGTD-----GAADAAGTDATDAAG 1120
G D+T+ G G +S Q A G G+D G+ AG D++ AG
Sbjct: 200 GTAGADSTLVAGYGSTQTAGEESSQMAGYGSTQTGMKGSDLTAGYGSTGTAGDDSSLIAG 259

Query: 1121 TDATDAAGTDATDAAGTDANDAAGTDATDAAGTDATDAAGTDATDAAGADATDAAGTDAT 1180
+T AG D++ AG + A + AG +T AG D++ AG +T AG ++T
Sbjct: 260 YGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEEST 319

Query: 1181 DAAGTDATDAAGTDATDAAGTDATDAAGTDATDAAGADATDAAGTDATDAAGTDATDAAG 1240
AG +T A + AG +T AG D++ AG +T AG D++ AG +T A
Sbjct: 320 QTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQ 379

Query: 1241 TDAADAAGTDATDAAGTDATDAAGTDATDAAGTDATDAAGTDATDAAGTDAADAAGTDAT 1300
+ AG +T AG D++ AG +T AG ++T AG +T A + AG +T
Sbjct: 380 KGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGST 439

Query: 1301 DAAGTDATDAAGTDAADAAGTDAADAAGTDATDAAGTDATDAAGTDATDAAGTDATDAAG 1360
AG D++ AG + AG D++ AG +T A + AG +T AG +++ AG
Sbjct: 440 GTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAG 499

Query: 1361 TDATDAAGTDAADAAGTDATDAAD----------ATDAAGTDATDAAGTDATDAAGTDAT 1410
+T AG + AG +T A +T AG +++ AG +T A ++
Sbjct: 500 YGSTQTAGYGSTLTAGYGSTQTAQNESDLITGYGSTSTAGANSSLIAGYGSTQTASYNSV 559

Query: 1411 DAAGTDATDAAGTDAADAAGTDAADAAGTDATDAAGTDATDAAGTDAADAAGTDATDAAG 1470
AG +T A + AG + AG+D++ AG +T A ++ AG +T A
Sbjct: 560 LTAGYGSTQTAREGSDLTAGYGSTGTAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAR 619

Query: 1471 TDATDAAGTDATDAAGTDAADAAGTDATDAA--DATDAAGTDATDAADATDAADATDGST 1528
+ G +T AG D++ AG +T A ++ AG +T A A GST
Sbjct: 620 EQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTAGYGST 679



Score = 127 bits (319), Expect = 3e-31
Identities = 131/435 (30%), Positives = 204/435 (46%), Gaps = 12/435 (2%)

Query: 1106 GAADAAGTDATDAAGTDATDAAGTDATDAAGTDANDAAGTDATDAAGTDATDAAGTDATD 1165
G+ AG D+T AG +T AG +++ AG + + AG +T AG D++
Sbjct: 197 GSTGTAGADSTLVAGYGSTQTAGEESSQMAGYGSTQTGMKGSDLTAGYGSTGTAGDDSSL 256

Query: 1166 AAGADATDAAGTDATDAAGTDATDAAGTDATDAAGTDATDAAGTDATDAAGADATDAAGT 1225
AG +T AG D++ AG +T A + AG +T AG D++ AG +T AG
Sbjct: 257 IAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGE 316

Query: 1226 DATDAAGTDATDAAGTDAADAAGTDATDAAGTDATDAAGTDATDAAGTDATDAAGTDATD 1285
++T AG +T A + AG +T AG D++ AG +T AG D++ AG +T
Sbjct: 317 ESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQ 376

Query: 1286 AAGTDAADAAGTDATDAAGTDATDAAGTDAADAAGTDAADAAGTDATDAAGTDATDAAGT 1345
A + AG +T AG D++ AG + AG ++ AG +T A + AG
Sbjct: 377 TAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGY 436

Query: 1346 DATDAAGTDATDAAGTDATDAAGTDAADAAGTDATDAAD----------ATDAAGTDATD 1395
+T AG D++ AG +T AG D++ AG +T A +T AG +++
Sbjct: 437 GSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTSTAGYESSL 496

Query: 1396 AAGTDATDAAGTDATDAAGTDATDAAGTDAADAAGTDAADAAGTDATDAAGTDATDAAGT 1455
AG +T AG +T AG +T A ++ G + AG +++ AG +T A
Sbjct: 497 IAGYGSTQTAGYGSTLTAGYGSTQTAQNESDLITGYGSTSTAGANSSLIAGYGSTQTASY 556

Query: 1456 DAADAAGTDATDAAGTDATDAAGTDATDAAGTDAADAAGTDATDAA--DATDAAGTDATD 1513
++ AG +T A + AG +T AG+D++ AG +T A ++ AG +T
Sbjct: 557 NSVLTAGYGSTQTAREGSDLTAGYGSTGTAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQ 616

Query: 1514 AADATDAADATDGST 1528
A GST
Sbjct: 617 TAREQSVLTTGYGST 631



Score = 123 bits (309), Expect = 4e-30
Identities = 137/453 (30%), Positives = 212/453 (46%), Gaps = 9/453 (1%)

Query: 1085 GVDSDQTAAIDVD---GDGTDGTDGAADA--AGTDATDAAGTDATDAAGTDATDAAGTDA 1139
G S QTA D G G+ GT GA + AG +T AG ++T AG +T A +
Sbjct: 275 GYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGS 334

Query: 1140 NDAAGTDATDAAGTDATDAAGTDATDAAGADATDAAGTDATDAAGTDATDAAGTDATDAA 1199
+ AG +T AG D++ AG +T AG D++ AG +T A + AG +T A
Sbjct: 335 DLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTA 394

Query: 1200 GTDATDAAGTDATDAAGADATDAAGTDATDAAGTDATDAAGTDAADAAGTDATDAAGTDA 1259
G D++ AG +T AG ++T AG +T A + AG + AG D++ AG +
Sbjct: 395 GADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGS 454

Query: 1260 TDAAGTDATDAAGTDATDAAGTDATDAAGTDAADAAGTDATDAAGTDATDAAGTDAADAA 1319
T AG D++ AG +T A + AG + AG +++ AG +T AG + A
Sbjct: 455 TQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTA 514

Query: 1320 GTDAADAAGTDATDAAGTDATDAAGTDATDAAGTDATDAAGTDATDAAGTDAADAA--GT 1377
G + A ++ G +T AG +++ AG +T A ++ AG + A G+
Sbjct: 515 GYGSTQTAQNESDLITGYGSTSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQTAREGS 574

Query: 1378 DATDAADATDAAGTDATDAAGTDATDAAGTDATDAAGTDATDAAGTDAADAAGTDAADAA 1437
D T +T AG+D++ AG +T A ++ AG +T A + G + A
Sbjct: 575 DLTAGYGSTGTAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTA 634

Query: 1438 GTDATDAAGTDATDAAGTDAADAAGTDATDAAGTDATDAAGTDATDAAGTDAADAAGTDA 1497
G D++ AG +T AG ++ AG +T A + AG +T AG D++ AG +
Sbjct: 635 GADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGADSSLIAGYGS 694

Query: 1498 TDAA--DATDAAGTDATDAADATDAADATDGST 1528
T A ++ AG +T A + GST
Sbjct: 695 TQTAGYNSILTAGYGSTQTAQEGSDLTSGYGST 727



Score = 119 bits (298), Expect = 8e-29
Identities = 127/451 (28%), Positives = 202/451 (44%), Gaps = 15/451 (3%)

Query: 1066 GAPAGDATVEFDVGVDGEEGVDSDQTAAIDVDGDGTDGTD-----GAADAAGTDATDAAG 1120
G D+++ G G DS TA G+D G+ AG D++ AG
Sbjct: 344 GTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAG 403

Query: 1121 TDATDAAGTDATDAAGTDANDAAGTDATDAAGTDATDAAGTDATDAAGADATDAAGTDAT 1180
+T AG ++T AG + A + AG +T AG D++ AG +T AG D++
Sbjct: 404 YGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSS 463

Query: 1181 DAAGTDATDAAGTDATDAAGTDATDAAGTDATDAAGADATDAAGTDATDAAGTDATDAAG 1240
AG +T A + AG +T AG +++ AG +T AG +T AG +T A
Sbjct: 464 LTAGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTAGYGSTQTAQ 523

Query: 1241 TDAADAAGTDATDAAGTDATDAAGTDATDAAGTDATDAAGTDATDAAGTDAADAAGTDAT 1300
++ G +T AG +++ AG +T A ++ AG +T A + AG +T
Sbjct: 524 NESDLITGYGSTSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQTAREGSDLTAGYGST 583

Query: 1301 DAAGTDATDAAGTDAADAAGTDAADAAGTDATDAAGTDATDAAGTDATDAAGTDATDAAG 1360
AG+D++ AG + A ++ AG +T A + G +T AG D++ AG
Sbjct: 584 GTAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGADSSLIAG 643

Query: 1361 TDATDAAGTDAADAAGTDATDAAD----------ATDAAGTDATDAAGTDATDAAGTDAT 1410
+T AG ++ AG +T A +T AG D++ AG +T AG ++
Sbjct: 644 YGSTQTAGYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGADSSLIAGYGSTQTAGYNSI 703

Query: 1411 DAAGTDATDAAGTDAADAAGTDAADAAGTDATDAAGTDATDAAGTDAADAAGTDATDAAG 1470
AG +T A + +G + AG D++ AG +T A ++ AG +T A
Sbjct: 704 LTAGYGSTQTAQEGSDLTSGYGSTSTAGADSSLIAGYGSTQTASYHSSLTAGYGSTQTAR 763

Query: 1471 TDATDAAGTDATDAAGTDAADAAGTDATDAA 1501
+ G +T AG D++ AG +T A
Sbjct: 764 EQSVLTTGYGSTSTAGADSSLIAGYGSTQTA 794



Score = 100 bits (249), Expect = 5e-23
Identities = 106/394 (26%), Positives = 177/394 (44%), Gaps = 2/394 (0%)

Query: 1106 GAADAAGTDATDAAGTDATDAAGTDATDAAGTDANDAAGTDATDAAGTDATDAAGTDATD 1165
G+ A + G +T AG D++ AG + AG ++ AG +T A ++
Sbjct: 805 GSTQTAQERSDLTTGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDL 864

Query: 1166 AAGADATDAAGTDATDAAGTDATDAAGTDATDAAGTDATDAAGTDATDAAGADATDAAGT 1225
G +T AG D++ AG +T AG ++ AG +T A ++ G +T AG
Sbjct: 865 TTGYGSTSTAGYDSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGY 924

Query: 1226 DATDAAGTDATDAAGTDAADAAGTDATDAAGTDATDAAGTDATDAAGTDATDAAGTDATD 1285
+++ AG +T A + AG ++ A ++ AG +T AG D++ AG +T
Sbjct: 925 ESSLIAGYGSTQTASFKSTLMAGYGSSQTAREQSSLTAGYGSTSMAGYDSSLIAGYGSTQ 984

Query: 1286 AAGTDAADAAGTDATDAAGTDATDAAGTDAADAAGTDAADAAGTDATDAAGTDATDAAGT 1345
AG + AG +T A +T AG + AG D++ AG ++ +G + AG
Sbjct: 985 TAGYQSTLTAGYGSTQTAEHSSTLTAGYGSTATAGADSSLIAGYGSSLTSGIRSFLTAGY 1044

Query: 1346 DATDAAGTDATDAAGTDATDAAGTDAADAAGTDATDAADATDA--AGTDATDAAGTDATD 1403
+T +G + AG ++ +G ++ AG + A + AG ++T G +
Sbjct: 1045 GSTLISGLRSVLTAGYGSSLISGRRSSLTAGYGSNQIASHRSSLIAGPESTQITGNRSML 1104

Query: 1404 AAGTDATDAAGTDATDAAGTDAADAAGTDAADAAGTDATDAAGTDATDAAGTDAADAAGT 1463
AG ++ AG +T +G D+ AG AG D+T AG + AG ++ AG
Sbjct: 1105 IAGKGSSQTAGYRSTLISGADSVQMAGERGKLIAGADSTQTAGDRSKLLAGNNSYLTAGD 1164

Query: 1464 DATDAAGTDATDAAGTDATDAAGTDAADAAGTDA 1497
+ AG D AG + AG ++ AG +
Sbjct: 1165 RSKLTAGNDCILMAGDRSKLTAGINSILTAGCRS 1198



Score = 95.6 bits (237), Expect = 1e-21
Identities = 107/365 (29%), Positives = 161/365 (44%), Gaps = 9/365 (2%)

Query: 1166 AAGADATDAAGTDATDAAGTDATDAAGTDATDAAGTDATDAAGTDATDAAGADATDAAGT 1225
A D T TD DA ++ T + A +T +
Sbjct: 114 ACTEMQAGPGSPDVTSEVKVGNRSLPVTDDIDA-TIESGSTQPTQTIEIATYGSTLSGTH 172

Query: 1226 DATDAAGTDATDAAGTDAADAAGTDATDAAGTDATDAAGTDATDAAGTDATDAAGTDATD 1285
+ AG +T+ AG + AG +T AG D+T AG +T AG +++ AG +T
Sbjct: 173 QSQLIAGYGSTETAGDSSTLIAGYGSTGTAGADSTLVAGYGSTQTAGEESSQMAGYGSTQ 232

Query: 1286 AAGTDAADAAGTDATDAAGTDATDAAGTDAADAAGTDAADAAGTDATDAAGTDATDAAGT 1345
+ AG +T AG D++ AG + AG D++ AG +T A + AG
Sbjct: 233 TGMKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGY 292

Query: 1346 DATDAAGTDATDAAGTDATDAAGTDAADAAGTDATDAADATDAAGTDATDAAGTDATDAA 1405
+T AG D++ AG +T AG ++ AG +T A G+D T AG +T A
Sbjct: 293 GSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQK----GSDLT--AGYGSTGTA 346

Query: 1406 GTDATDAAGTDATDAAGTDAADAAGTDAADAAGTDATDAAGTDATDAAGTDAADAAGTDA 1465
G D++ AG +T AG D++ AG + A + AG +T AG D++ AG +
Sbjct: 347 GDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGYGS 406

Query: 1466 TDAAGTDATDAAGTDATDAA--GTDAADAAGTDATDAADATDAAGTDATDAADATDAADA 1523
T AG ++T AG +T A G+D G+ T D++ AG +T A + A
Sbjct: 407 TQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTA 466

Query: 1524 TDGST 1528
GST
Sbjct: 467 GYGST 471



Score = 70.6 bits (172), Expect = 5e-14
Identities = 83/306 (27%), Positives = 127/306 (41%), Gaps = 15/306 (4%)

Query: 1246 AAGTDATDAAGTDATDAAGTDATDAAGTDATDAAGTDATDAAGTDAADAAGTDATDAAGT 1305
A D T TD DA ++ T + A +T +
Sbjct: 114 ACTEMQAGPGSPDVTSEVKVGNRSLPVTDDIDA-TIESGSTQPTQTIEIATYGSTLSGTH 172

Query: 1306 DATDAAGTDAADAAGTDAADAAGTDATDAAGTDATDAAGTDATDAAGTDATDAAGTDATD 1365
+ AG + + AG + AG +T AG D+T AG +T AG +++ AG +T
Sbjct: 173 QSQLIAGYGSTETAGDSSTLIAGYGSTGTAGADSTLVAGYGSTQTAGEESSQMAGYGSTQ 232

Query: 1366 AA--GTDAADAAGTDATDAADATDAAGTDATDAAGTDATDAAGTDATDAAGTDATDAAGT 1423
G+D G+ T D++ AG +T AG D++ AG +T A + AG
Sbjct: 233 TGMKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGY 292

Query: 1424 DAADAAGTDAADAAGTDATDAAGTDATDAAGTDAADA--------AGTDATDAAGTDATD 1475
+ AG D++ AG +T AG ++T AG + AG +T AG D++
Sbjct: 293 GSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSL 352

Query: 1476 AAGTDATDAAGTDAADAAGTDATDAA----DATDAAGTDATDAADATDAADATDGSTDGG 1531
AG +T AG D++ AG +T A D T G+ T AD++ A T G
Sbjct: 353 IAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGE 412

Query: 1532 DAPTRA 1537
++ A
Sbjct: 413 ESTQTA 418


21  CMM_0953  CMM_0959  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
CMM_0953   1       7        3.636389   putative glutaryl-CoA dehydrogenase
CMM_0954   3       8        5.286673   putative mannose-6-phosphate isomerase
CMM_0955   3       8        5.181634   transcriptional regulator, whiB homolog
CMM_0956   0       7        3.838372   putative glycosyl transferase
CMM_0957   1       9        3.506386   putative secreted protein
CMM_0958   5       10       3.605773   conserved hypothetical protein
CMM_0959   3       10       3.241302   conserved hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
CMM_0957   TCRTETA       42     4e-06    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 42.1 bits (99), Expect = 4e-06
Identities = 65/318 (20%), Positives = 109/318 (34%), Gaps = 34/318 (10%)

Query: 18 PAAVLLSSQLVFNLGFYAVVPFLAVVMRDDLGLGALA--IGLVLGARTFSQQGLFLLGGM 75
P V+LS+ + +G ++P L ++RD + + G++L Q + G
Sbjct: 6 PLIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGA 65

Query: 76 LADRFGPRTLIAAGCVVRVSGYLGLALAAGLPGFLVGAILTGLGGALFSPALQSLVAAAD 135
L+DRFG R ++ Y +A A L +G I+ G+ GA + A + D
Sbjct: 66 LSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIADITD 125

Query: 136 ARSRPTRRPGRPSLFAALVLVGEIGAAVGPLAGAALLGLGFSATVLVGAALFAAVGAALW 195
R F + G GP+ G + G AA +
Sbjct: 126 GDE-------RARHFGFMSACFGFGMVAGPVLGGLMGGFS-PHAPFFAAAALNGLNFLTG 177

Query: 196 CVIPADAGRAVAASRGEPRAGSEPVGGAAAVPAASADRW-------AAVRDRRFLAFSAL 248
C + ++ + R R P+ + + + A +
Sbjct: 178 CFLLPESHK--GERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVI 235

Query: 249 FAVDLVAYNQLYLGLPLELARAGAGNAAVGSAFLAVSLLTIALQWPAALLAKRLGPGRAL 308
F D ++ +G+ L A + S A+ +A RLG RAL
Sbjct: 236 FGEDRFHWDATTIGISL------AAFGILHSLAQAMI---------TGPVAARLGERRAL 280

Query: 309 ALGFGTIATGFAALALAS 326
LG TG+ LA A+
Sbjct: 281 MLGMIADGTGYILLAFAT 298


22  CMM_1017  CMM_1031  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
CMM_1017   2       9        -0.977885  putative multisubunit Na+/H+ antiporter,
CMM_1018   3       10       -1.372006  putative multisubunit Na+/H+ antiporter,
CMM_1019   3       8        -1.526548  dihydroxy-acid dehydratase
CMM_1020   3       7        -1.194488  putative acetolactate synthase large subunit
CMM_1021   2       7        -0.555929  putative acetolactate synthase, small
CMM_1022   2       7        0.240663   putative membrane protein
CMM_1023   -1      7        0.300940   putative membrane protein
CMM_1024   -2      7        0.511217   conserved membrane protein
CMM_1025   -2      8        2.049679   D-3-phosphoglycerate dehydrogenase
CMM_1026   0       9        2.768624   putative transcriptional regulator, TetR family
CMM_1027   2       7        3.289434   putative efflux MFS permease
CMM_1028   2       10       3.346466   hypothetical membrane protein
CMM_1029   1       10       3.065627   3-isopropylmalate dehydrogenase
CMM_1030   1       9        3.117304   branched-chain amino acid aminotransferase
CMM_1031   2       12       1.700485   conserved hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
CMM_1028   NUCEPIMERASE  138    2e-40    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 138 bits (349), Expect = 2e-40
Identities = 77/341 (22%), Positives = 136/341 (39%), Gaps = 42/341 (12%)

Query: 1 MTWLVTGGAGYIGSHIVSAFARAGIDTVVLDDLSSGH---------ASFVPDGVPFHRGS 51
M +LVTG AG+IG H+ AG V +D+L+ + G FH+
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKID 60

Query: 52 VLDRELLASVLGSGDIEGVVHVAGYKYAGV--SVQRPLHTYEQNVTATAVLLEEMERAGV 109
+ DRE + + SG E V V S++ P + N+T +LE +
Sbjct: 61 LADREGMTDLFASGHFERVFISPHR--LAVRYSLENPHAYADSNLTGFLNILEGCRHNKI 118

Query: 110 DSIVFSSSAAVYG-TPDVDLVDERTPKAPESPYGESKLIGEWLLRDQGIAKGLRHASLRY 168
++++SS++VYG + + + P S Y +K E + GL LR+
Sbjct: 119 QHLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYGLPATGLRF 178

Query: 169 FNVVGS-GDEGYFDTSPHNLFPLVFDALLDGRTPRINGGDYPTPDGTCVRDYIHVVDLAA 227
F V G G P A+L+G++ + G RD+ ++ D+A
Sbjct: 179 FTVYGPWGR-------PDMALFKFTKAMLEGKSIDVYN------YGKMKRDFTYIDDIAE 225

Query: 228 SHVAAARRL---------EAGEPVEP-----VYCLGSGAGVSVREIMTAIARATGIDFEP 273
+ + + E G P VY +G+ + V + + + A+ A GI+ +
Sbjct: 226 AIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAKK 285

Query: 274 EVRDRRPGDPARIVASGELAARDIDWSMRHSLDDMVTSAWD 314
+ +PGD A + I ++ ++ D V + +
Sbjct: 286 NMLPLQPGDVLETSADTKALYEVIGFTPETTVKDGVKNFVN 326


23  CMM_1055  CMM_1060  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
CMM_1055   2       12       -0.056254  putative short chain alcohol dehydrogenase
CMM_1056   3       13       0.210291   hypothetical protein
CMM_1057   3       10       -0.218396  putative pyridine nucleotide-disulphide
CMM_1058   2       9        -1.093721  putative enoyl-CoA hydratase
CMM_1059   2       10       -1.999835  putative F420-dependent NADP reductase
CMM_1060   2       16       -1.106053  conserved hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
CMM_1055   INTIMIN       27     0.045    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 27.3 bits (60), Expect = 0.045
Identities = 18/82 (21%), Positives = 30/82 (36%), Gaps = 3/82 (3%)

Query: 36 SAPAQASSAVAGSTPA-TSTATTATVRGTVRKRFDVDDFFGRQPCSSQDLPPSGPLLENL 94
SAP A+ VAG T T + T + ++ +Q S S L +
Sbjct: 129 SAPLVAAGGVAGHTNKLTKMSPDVTKSNMTDDK--ALNYAAQQAASLGSQLQSRSLNGDY 186

Query: 95 TRCVIEILAGARELDQIARWVS 116
+ +AG + Q+ W+
Sbjct: 187 AKDTALGIAGNQASSQLQAWLQ 208


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
CMM_1059   PF06580       52     3e-09    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 51.8 bits (124), Expect = 3e-09
Identities = 38/222 (17%), Positives = 81/222 (36%), Gaps = 37/222 (16%)

Query: 286 ELRHQERELITKDATIREIHHRVK-----NNLQTVASLLRIQARRSHTEEAREALGHAQR 340
E+ + + ++A + + ++ N L + +L+ +ARE L
Sbjct: 148 EIDQWKMASMAQEAQLMALKAQINPHFMFNALNNIRALILEDP-----TKAREMLTS--- 199

Query: 341 RVGAIAVVHDTLSEGLNQNVDFDAVFDRVLLLIAEVA------SAHNTRVHPKIVGSFGV 394
LSE + ++ + R + L E+ + + ++ +
Sbjct: 200 -----------LSELMRYSLRYSN--ARQVSLADELTVVDSYLQLASIQFEDRLQFENQI 246

Query: 395 LPSAYATPL-ALALTELVTNAVEHGLAGRS--GEVAIEAARTEETLTVSVRDDG-VGLPE 450
P+ + + + LV N ++HG+A G++ ++ + T+T+ V + G + L
Sbjct: 247 NPAIMDVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKN 306

Query: 451 GKVGTGLGTQIVRTLIQGELGGTIDWHTLM-GSGTEVTIEVP 491
K TG G Q VR +Q G + +P
Sbjct: 307 TKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMVLIP 348


24  CMM_1105  CMM_1111  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
CMM_1105   3       21       -6.144026  putative Mg2+ transporter, MgtE family
CMM_1106   4       26       -6.868462  conserved membrane protein
CMM_1107   4       26       -7.218359  putative Xaa-Pro aminopeptidase
CMM_1108   4       24       -8.037266  putative membrane protein with hydrolase
CMM_1109   5       28       -8.562334  conserved hypothetical protein
CMM_1110   4       29       -8.475603  putative ATP-dependent RNA helicase
CMM_1111   0       8        -3.241403  conserved hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
CMM_1109   ACETATEKNASE  29     0.032    Acetate kinase family signature.
		>ACETATEKNASE#Acetate kinase family signature.

Length = 400

Score = 28.6 bits (64), Expect = 0.032
Identities = 14/32 (43%), Positives = 18/32 (56%), Gaps = 4/32 (12%)

Query: 184 DIIVFTNGVAE----IDEAGERGLASLGIRVD 211
D+IVFT G+ E I E GL LG ++D
Sbjct: 324 DVIVFTAGIGENGPEIREFILDGLEFLGFKLD 355


25  CMM_1161  CMM_1174  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
CMM_1161   2       12       -3.462324  putative sugar ABC transporter, ATP-binding
CMM_1162   2       11       -4.265755  putative sugar ABC transporter, permease
CMM_1163   1       9        -4.428542  hypothetical protein
CMM_1164   0       10       -4.063175  putative signal peptidase I
CMM_1165   0       10       -3.916643  hypothetical protein
CMM_1166   0       8        -3.508037  hypothetical secreted protein
CMM_1167   -1      8        -3.574919  hypothetical secreted protein
CMM_1168   -2      8        -2.309295  putative two-component system, sensor histidine
CMM_1169   -1      7        -1.609995  putative choline/glycine/betaine
CMM_1170   4       10       2.417501   putative diacylglycerol kinase
CMM_1171   3       11       2.634743   putative ATPase
CMM_1172   4       11       2.561152   putative fatty acid desaturase
CMM_1173   3       10       2.523052   putative Zn-dependent alcohol dehydrogenase
CMM_1174   2       9        2.152190   putative membrane protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
CMM_1165   FLGMRINGFLIF  30     0.006    Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 29.9 bits (67), Expect = 0.006
Identities = 26/123 (21%), Positives = 49/123 (39%), Gaps = 5/123 (4%)

Query: 10 AGEEAPSILLPAVYDIVWSAVVFVVLLVVIWKYALPRVYAMLDGRTEAIAGGIEKAERAQ 69
G E P + D + +A ++++LVV W V L R E E+A+ Q
Sbjct: 442 TGGELPFWQQQSFIDQLLAAGRWLLVLVVAWILWRKAVRPQLTRRVEEAKAAQEQAQVRQ 501

Query: 70 AEADAAKAELTAQLAEARAEAGRIREQARVDATAIAAEIKEQATADAARITASAQQQIEA 129
+A + L++ R R R+ A ++ I+E + D + +Q +
Sbjct: 502 ETEEAVEVR----LSKDEQLQQR-RANQRLGAEVMSQRIREMSDNDPRVVALVIRQWMSN 556

Query: 130 ERQ 132
+ +
Sbjct: 557 DHE 559


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
CMM_1172   RTXTOXINA     33     0.001    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 32.6 bits (74), Expect = 0.001
Identities = 20/75 (26%), Positives = 31/75 (41%), Gaps = 9/75 (12%)

Query: 24 EYRRYHDEEWGRPLHGDRPLFEKLCLEGFQA--GLSWITILRKRPRFREVFHGFDVDAVA 81
+YR Y + L E L + G + +F ++FHG D D +
Sbjct: 696 QYRSYEFTH-----INGKNLTETDNLYSVEELIGTTRADKF-FGSKFTDIFHGADGDDLI 749

Query: 82 AMDDGDVERLMGDAG 96
+DG+ +RL GD G
Sbjct: 750 EGNDGN-DRLYGDKG 763


26  CMM_1210  CMM_1229  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
CMM_1210   4       10       2.227149   ribonuclease PH (tRNA nucleotidyltransferase)
CMM_1211   4       9        2.005902   putative xanthosine triphosphate
CMM_1212   5       9        1.975028   putative cobalt-zinc-cadmium efflux permease,CDF
CMM_1213   5       9        2.026674   conserved membrane protein
CMM_1214   5       8        2.938533   *putative xylulose kinase
CMM_1215   5       8        3.244028   putative ATPase, type II/type IV pathway
CMM_1216   5       12       3.797269   conserved membrane protein, putative pilus
CMM_1217   4       13       4.067684   conserved membrane protein, putative pilus
CMM_1218   4       12       4.072564   conserved membrane protein
CMM_1219   3       14       3.977517   conserved secreted protein
CMM_1220   0       10       2.123299   conserved hypothetical protein
CMM_1221   1       8        1.767550   putative secreted protein
CMM_1222   1       8        2.327944   putative MFS permease
CMM_1223   1       8        1.711133   peptide chain release factor RF-2
CMM_1224   3       8        2.617406   putative ABC transporter, ATP-binding protein
CMM_1225   3       9        2.952976   putative ABC transporter, permease
CMM_1226   6       9        4.846300   tmRNA (SsrA)-binding protein
CMM_1227   4       8        4.274254   putative iron ABC transporter, permease
CMM_1228   4       9        3.288995   putative iron ABC transporter, ATPase component
CMM_1229   1       8        3.348352   putative iron ABC transporter, substrate-binding
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
CMM_1215   PF05616       48     2e-07    Neisseria meningitidis TspB protein
		>PF05616#Neisseria meningitidis TspB protein

Length = 501

Score = 47.8 bits (113), Expect = 2e-07
Identities = 24/69 (34%), Positives = 28/69 (40%), Gaps = 2/69 (2%)

Query: 289 PTPTPTVAPTQAPTPTPTPTVAPTQAPTPTPTPTVAPTQAPTPTPTPTPTPTVAPTT--A 346
P P T +AP P P V+P + P P P P P P P P P P T
Sbjct: 311 PRPDLTPGSAEAPNAQPLPEVSPAENPANNPAPNENPGTRPNPEPDPDLNPDANPDTDGQ 370

Query: 347 PTPAPTTPA 355
P P +PA
Sbjct: 371 PGTRPDSPA 379



Score = 44.0 bits (103), Expect = 3e-06
Identities = 27/87 (31%), Positives = 31/87 (35%), Gaps = 2/87 (2%)

Query: 255 VTVTGTKTGYSTTTRTSDAMSVPAPTPTATPTTAPT--PTPTVAPTQAPTPTPTPTVAPT 312
V V T S T D +P P T AP P P V+P + P P P P
Sbjct: 289 VQVVATFGRDSQGNTTVDVQVIPRPDLTPGSAEAPNAQPLPEVSPAENPANNPAPNENPG 348

Query: 313 QAPTPTPTPTVAPTQAPTPTPTPTPTP 339
P P P P + P P P P
Sbjct: 349 TRPNPEPDPDLNPDANPDTDGQPGTRP 375



Score = 38.2 bits (88), Expect = 2e-04
Identities = 24/67 (35%), Positives = 27/67 (40%), Gaps = 3/67 (4%)

Query: 294 TVAPTQAPTPTPTPTVAPTQAPTPTPTPTVAPTQAPTPTPTPTPTPTVAPTTAPTPAPTT 353
TV P P TP A +AP P P V+P + P P P P P P P
Sbjct: 304 TVDVQVIPRPDLTPGSA--EAPNAQPLPEVSPAENPANNPAPNENPGTRPNPEPDP-DLN 360

Query: 354 PAAAPFT 360
P A P T
Sbjct: 361 PDANPDT 367



Score = 32.0 bits (72), Expect = 0.013
Identities = 16/55 (29%), Positives = 18/55 (32%), Gaps = 2/55 (3%)

Query: 279 PTPTATPTTAPT--PTPTVAPTQAPTPTPTPTVAPTQAPTPTPTPTVAPTQAPTP 331
P P +P P P P P P P P P + P P P P P
Sbjct: 327 PLPEVSPAENPANNPAPNENPGTRPNPEPDPDLNPDANPDTDGQPGTRPDSPAVP 381


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
CMM_1219   HTHTETR       40     2e-06    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 40.0 bits (93), Expect = 2e-06
Identities = 25/166 (15%), Positives = 54/166 (32%), Gaps = 5/166 (3%)

Query: 8 ARREDELAEAVWRVIRRDGASGVSVRTVAAEAGLSTGSLRHSFPSRIDLVAHATGLVARR 67
+ + R+ + G S S+ +A AG++ G++ F + DL + L
Sbjct: 10 QETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESN 69

Query: 68 IDGRIRSR-----RTDPDARRRAVRILAEHLPLDDGRRGEAEVTAALLADAASHPRLREV 122
I R + + E ++ RR E+ +++
Sbjct: 70 IGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQA 129

Query: 123 RAAAHAAARETCLEQLAQLRAAGLLRPDADPEAEADHLQALLAGLA 168
+ + + + L A +L D A ++ ++GL
Sbjct: 130 QRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLM 175


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
CMM_1222   RTXTOXINA     32     0.003    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 31.9 bits (72), Expect = 0.003
Identities = 46/206 (22%), Positives = 78/206 (37%), Gaps = 55/206 (26%)

Query: 65 GLGD-----SDTDPAPRDLDRLADDLEAVIAAF-------PHRRLVLAGHSWGGQVV--- 109
G+G+ + D LD ++ L A+ A+F R AG +V+
Sbjct: 224 GVGNKLQNLPNLDNIGAGLDTVSGILSAISASFILSNADADTRTKAAAGVELTTKVLGNV 283

Query: 110 -----RVVAARRIARGLATAGVVLVDPSDERARGYGSAAARIGFATQAALVVPLARLGLL 164
+ + A+R A+GL+T+ +AA I A A + PL+ L +
Sbjct: 284 GKGISQYIIAQRAAQGLSTS---------------AAAAGLIASAVTLA-ISPLSFLSIA 327

Query: 165 RRLHRM-----------ALAGLPEPLLAAAADAAGSVRAARATAAEQRHLLPGLADIAGI 213
+ R L + LLAA G++ A+ T + LA ++
Sbjct: 328 DKFKRANKIEEYSQRFKKLGYDGDSLLAAFHKETGAIDASLTTIST------VLASVSSG 381

Query: 214 LGSAAA--LPGIPVRVISGTTVGPLT 237
+ +AA L G PV + G G ++
Sbjct: 382 ISAAATTSLVGAPVSALVGAVTGIIS 407


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
CMM_1229   SECGEXPORT    30     0.008    Protein-export SecG membrane protein signature.
		>SECGEXPORT#Protein-export SecG membrane protein signature.

Length = 110

Score = 29.9 bits (67), Expect = 0.008
Identities = 15/51 (29%), Positives = 26/51 (50%)

Query: 292 ISADGLVEMVDVTRMTGADAGDTLAASGFGIAFAGAGGGAFVRQDVGIVAS 342
I A GLV ++ + + GAD G + A F +G G F+ + ++A+
Sbjct: 11 IVAIGLVGLIMLQQGKGADMGASFGAGASATLFGSSGSGNFMTRMTALLAT 61


27  CMM_1347  CMM_1359  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
CMM_1347   3       9        3.442475   *NH3-dependent NAD+ synthetase
CMM_1348   2       7        3.854518   hypothetical protein
CMM_1349   2       7        3.506386   hypothetical protein
CMM_1350   2       8        3.091464   putative peptide methionine sulfoxide reductase
CMM_1351   1       6        2.354661   hypothetical protein
CMM_1352   1       6        2.496159   hypothetical protein
CMM_1353   2       7        1.857658   putative transcriptional regulator, MarR-family
CMM_1354   0       8        0.287951   putative secreted protein
CMM_1355   -3      11       -0.216325  putative ABC transporter, ATP-binding protein
CMM_1356   2       6        1.455776   putative thioesterase
CMM_1357   2       7        1.573257   putative acyl-CoA thioesterase II
CMM_1358   2       8        1.086084   conserved hypothetical protein
CMM_1359   2       8        1.115594   hemoglobin-like protein, truncated hemoglobin
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
CMM_1351   HTHFIS        37     9e-05    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 36.7 bits (85), Expect = 9e-05
Identities = 32/138 (23%), Positives = 57/138 (41%), Gaps = 16/138 (11%)

Query: 20 QDATRMALTVFLSGGHLLIEDVPGVGKTMLAKALARSVD------CTVNRIQFTPDLLPS 73
Q+ R+ + + L+I G GK ++A+AL +N DL+ S
Sbjct: 147 QEIYRVLARLMQTDLTLMITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPRDLIES 206

Query: 74 DVTGVS------VYSQADHRFEFQPGAVFANIVIGDEINRASPKTQSALLECMEEGQVTV 127
++ G +++ RFE G + + DEI Q+ LL +++G+ T
Sbjct: 207 ELFGHEKGAFTGAQTRSTGRFEQAEGGT---LFL-DEIGDMPMDAQTRLLRVLQQGEYTT 262

Query: 128 DGVTHPLQQPFTVVATQN 145
G P++ +VA N
Sbjct: 263 VGGRTPIRSDVRIVAATN 280


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
CMM_1353   PF05616       31     0.025    Neisseria meningitidis TspB protein
		>PF05616#Neisseria meningitidis TspB protein

Length = 501

Score = 30.9 bits (69), Expect = 0.025
Identities = 19/61 (31%), Positives = 23/61 (37%), Gaps = 7/61 (11%)

Query: 611 TPGRGQAA-----PYAQPSAAPTAAPTPSASPSAPTEATPSATPAPTASAAPGTSGTAGV 665
TPG +A P P+ P P P+ +P T P P A P T G G
Sbjct: 316 TPGSAEAPNAQPLPEVSPAENPANNPAPNENPG--TRPNPEPDPDLNPDANPDTDGQPGT 373

Query: 666 R 666
R
Sbjct: 374 R 374


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
CMM_1355   LPSBIOSNTHSS  165    3e-55    Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein signature.

Length = 166

Score = 165 bits (420), Expect = 3e-55
Identities = 69/155 (44%), Positives = 97/155 (62%), Gaps = 5/155 (3%)

Query: 4 IAVVPGSFDPVTLGHLDVIRRAARLYDELVVLVVHNPGKTPMLPLGERVALIERVIRDAG 63
A+ PGSFDP+T GHLD+I R RL+D++ V V+ NP K PM + ER+ I + I
Sbjct: 2 NAIYPGSFDPITFGHLDIIERGCRLFDQVYVAVLRNPNKQPMFSVQERLEQIAKAIAH-- 59

Query: 64 LPDTVRVDSWGAGLLVDYCRQVGATVLVKGVRSQLDVTYETPMALVNRDLA-DVETVLLL 122
LP+ +VDS+ GL V+Y RQ A +++G+R D E MA N+ LA D+ETV L
Sbjct: 60 LPN-AQVDSF-EGLTVNYARQRQAGAILRGLRVLSDFELELQMANTNKTLASDLETVFLT 117

Query: 123 PDPAHAHVSSSLVRQVEALGGDVTPYVPAAVADAL 157
++ +SSSLV++V GG+V +VP+ VA AL
Sbjct: 118 TSTEYSFLSSSLVKEVARFGGNVEHFVPSHVAAAL 152


28  CMM_1433  CMM_1438  Y  N  N  Genomic Island
LocusTag   DNBias  CDNBias  %GCBias    Product
CMM_1433   3       12       3.375202   putative monooxygenase
CMM_1434   5       11       3.334298   putative polyprenyltransferase
CMM_1435   5       10       2.707749   putative acetyltransferase
CMM_1436   3       9        0.368615   putative N-acetyltransferase
CMM_1437   3       9        -0.255363  hypothetical protein
CMM_1438   2       8        -2.186278  cytochrome bd-type menaquinol oxidase subunit I
29  CMM_1832  CMM_1840  Y  N  Y  Genomic Island
LocusTag   DNBias  CDNBias  %GCBias    Product
CMM_1832   1       7        -3.320355  putative GTP-binding protein
CMM_1833   0       10       -3.194655  putative cytidylate kinase
CMM_1834   1       14       -3.476004  putative prephenate dehydrogenase
CMM_1835   0       15       -3.043559  putative pseudouridine synthase
CMM_1836   -1      16       -3.095321  conserved hypothetical protein, putative
CMM_1837   1       14       -4.174544  conserved hypothetical protein
CMM_1838   2       14       -3.997878  putative ATPase involved in partitioning
CMM_PS_17  2       12       -5.083250  integrase/recombinase
CMM_1839   1       8        -4.418945  putative NTP pyrophosphatase
CMM_1840   1       8        -3.863553  putative CTP synthase
30  CMM_1895  CMM_1917  Y  N  N  Genomic Island
LocusTag   DNBias  CDNBias  %GCBias    Product
CMM_1895   1       8        3.173161   conserved hypothetical protein
CMM_1896   0       8        3.851789   putative 2-oxoglutarate/malate translocator,DASS
CMM_1897   -1      6        3.923784   putative ATP-dependent DNA helicase
CMM_1898   -2      8        3.540407   putative membrane protein
CMM_1899   -2      8        3.458508   5-methyltetrahydropteroyltriglutamate-
CMM_1900   0       7        3.457534   adenine phosphoribosyltransferase
CMM_1901   1       7        3.597782   putative membrane protein
CMM_1902   1       8        3.386949   putative 2,5-diketo-D-gluconic acid reductase
CMM_1903   0       8        3.336986   putative lipoprotein signal peptidase
CMM_1904   -2      8        1.189501   hypothetical protein
CMM_1905   -1      11       1.358775   putative cysteine peptidase, family C56
CMM_1906   -2      7        2.025390   conserved hypothetical protein
CMM_1907   -2      6        2.080320   putative DNA-binding ferritin-like protein
CMM_1908   -2      6        2.416991   short chain dehydrogenase/oxidoreductase
CMM_1909   -2      6        2.314591   hypothetical protein
CMM_1910   0       7        3.385374   putative two-component system sensor kinase
CMM_1911   1       10       3.398861   putative two-component system response
CMM_1912   0       9        2.964869   D-alanyl-D-alanine carboxypeptidase
CMM_1913   2       7        3.151582   putative DNA-invertase/recombinase
CMM_1914   4       8        2.772052   hypothetical protein
CMM_1915   4       9        2.885223   conserved membrane protein
CMM_1916   3       9        2.080355   putative FAD/FMN-containing dehydrogenase
CMM_1917   2       8        1.962464   putative DNA glycosylase
31  CMM_1953  CMM_1971  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
CMM_1953   2       8        3.645275   hypothetical protein
CMM_1954   0       8        3.037570   putative sugar ABC transporter, binding protein
CMM_1955   -1      8        1.924468   putative sugar ABC transporter, permease
CMM_1956   -1      9        1.568128   putative sugar ABC transporter, permease
CMM_1957   -1      12       0.701241   beta-galactosidase
CMM_1958   0       12       -0.004436  putative sugar alcohol dehydrogenase
CMM_1959   -1      12       -0.298854  putative glutaredoxin
CMM_1960   -2      12       -0.350852  conserved hypothetical protein, putative
CMM_1961   1       15       0.519721   hypothetical membrane protein
CMM_1962   3       16       -0.000211  putative protease II (Oligopeptidase B)
CMM_1963   2       11       2.155935   thiamine-phosphate
CMM_1964   1       11       2.333331   conserved hypothetical protein, putative fusion
CMM_1965   1       13       2.418441   putative sugar MFS permease
CMM_1966   -1      21       -0.992622  hypothetical membrane protein
CMM_1967   0       23       -1.642010  putative DNA helicase
CMM_1968   0       19       -1.243299  putative glutathione S-transferase
CMM_1969   1       21       -4.164471  putative esterase
CMM_1970   0       20       -5.470635  putative membrane protein
CMM_1971   0       17       -4.202869  conserved hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
CMM_1958   CHANLCOLICIN  30     0.013    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 29.7 bits (66), Expect = 0.013
Identities = 15/48 (31%), Positives = 24/48 (50%), Gaps = 5/48 (10%)

Query: 88 VVGAVVAVLVGLVVGTLLGV-----VAGTVGGIVDDVLMRLVDVLLAI 130
V VVA+L L+ GT LG+ V G + +D + ++ +L I
Sbjct: 475 GVSYVVALLFSLLAGTTLGIWGIAIVTGILCSYIDKNKLNTINEVLGI 522


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
CMM_1961   SACTRNSFRASE  31     0.001    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 31.5 bits (71), Expect = 0.001
Identities = 17/65 (26%), Positives = 23/65 (35%), Gaps = 7/65 (10%)

Query: 95 INVVASAHGTGAGQALFDAAV---GDSPA---YLWAADDNPRAAAFYRRNGFARDGGVER 148
I V G G AL A+ ++ L D N A FY ++ F G V+
Sbjct: 95 IAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHF-IIGAVDT 153

Query: 149 QTYQG 153
Y
Sbjct: 154 MLYSN 158


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
CMM_1965   HTHFIS        50     2e-09    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 49.8 bits (119), Expect = 2e-09
Identities = 20/105 (19%), Positives = 42/105 (40%), Gaps = 4/105 (3%)

Query: 8 IRVVVVDDQSLVRDGFARIVDAQPDMAAVGVCADGEQAVARVVELRPDVVLMDVRMPVLD 67
++V DD + +R + + V + ++ + D+V+ DV MP +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAG--YDVRITSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 68 GIEATRRLVEGGRAEGTRILGLTTHDTDSYAIRMLRAGAVGFLLK 112
+ R+ + +L ++ +T AI+ GA +L K
Sbjct: 62 AFDLLPRIKK--ARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPK 104


ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
CMM_1969   PF02370       36     8e-05    M protein repeat
		>PF02370#M protein repeat

Length = 168

Score = 36.2 bits (83), Expect = 8e-05
Identities = 16/67 (23%), Positives = 26/67 (38%), Gaps = 3/67 (4%)

Query: 64 RNAQAAEAAHDKAMRAAEAAQAKA-VRAAEAEQQRVAKVAQR-EAAMQAKAAERDAVARD 121
+ Q +K R + + + + EQQ++ Q+ Q A R + RD
Sbjct: 79 KEKQERPERREKFERQHQDKHYQEQQKKHQQEQQQLEAEKQKLAKEKQISDASRQGLNRD 138

Query: 122 KSA-RAA 127
A RAA
Sbjct: 139 LEASRAA 145


32  CMM_2004  CMM_2012  Y  N  N  Genomic Island
LocusTag   DNBias  CDNBias  %GCBias    Product
CMM_2004   -1      10       -3.118920  hypothetical protein
CMM_2005   0       11       -3.462116  putative 1-deoxy-D-xylulose 5-phosphate
CMM_2006   3       14       -3.713655  putative peptidylprolyl isomerase
CMM_2007   4       16       -2.750780  conserved hypothetical protein
CMM_2008   2       11       -0.931812  putative 4-aminobutyrate aminotransferase
CMM_2009   2       13       -0.843364  putative asparaginase II
CMM_2010   1       13       -0.267685  conserved hypothetical protein
CMM_2011   -1      11       -0.250687  conserved membrane protein containing
CMM_2012   2       10       1.069510   conserved hypothetical protein
33  CMM_2160  CMM_2169  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
CMM_2160   0       6        3.032660   hypothetical protein
CMM_2161   0       7        4.223612   putative permease
CMM_2162   2       7        4.833984   conserved membrane protein
CMM_2163   1       8        4.730765   hypothetical protein
CMM_2164   4       8        4.880717   conserved hypothetical protein
CMM_2165   4       8        5.017666   hypothetical protein
CMM_2166   3       8        3.678385   hypothetical protein
CMM_2167   3       9        3.070172   putative membrane-bound oxidoreductase
CMM_2168   3       10       2.537155   putative two-component system sensor kinase
CMM_2169   3       10       2.780319   putative two-component system response
ORFs having significant similarity with Known Virulence factors
LocusTag   Hits          Score  E-value  Comments
CMM_2161   INFPOTNTIATR  62     2e-13    Macrophage infectivity potentiator signature.
		>INFPOTNTIATR#Macrophage infectivity potentiator signature.

Length = 233

Score = 62.3 bits (151), Expect = 2e-13
Identities = 38/102 (37%), Positives = 56/102 (54%), Gaps = 1/102 (0%)

Query: 236 LKVGDGAEVTDGASVTVQYTGINWNTKKVFDSSWDRGQSATFVTSQVIPGFTKALVGQKV 295
+ G GA+ +VTV+YTG + VFDS+ G+ ATF SQVIPG+T+AL
Sbjct: 133 IDAGTGAKPGKSDTVTVEYTGTLID-GTVFDSTEKAGKPATFQVSQVIPGWTEALQLMPA 191

Query: 296 GSQVIAIIPPADGYGDKGQGTDIGGTDTIVFVVDILGTQPAA 337
GS +P YG + G IG +T++F + ++ + AA
Sbjct: 192 GSTWEVFVPADLAYGPRSVGGPIGPNETLIFKIHLISVKKAA 233


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CMM_2168  HTHFIS  35  4e-04  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 34.8 bits (80), Expect = 4e-04
Identities = 41/182 (22%), Positives = 67/182 (36%), Gaps = 39/182 (21%)

Query: 2 TMTADEARAFQDEFARLVRNVEQV-----LLGKS----HVVRLAFTAMVTGGHLLLEDVP 52
+ RA + R + + L+G+S + R+ M T L++
Sbjct: 110 ELIGIIGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGES 169

Query: 53 GTGKTSLARAMAQTVDGTHSRVQFTPDV------LP-------------GDITGVSVYDQ 93
GTGK +ARA+ + + + P V +P G TG +
Sbjct: 170 GTGKELVARALHD-----YGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQ--TR 222

Query: 94 RTGEFEFHRGPVFASIVLADEINRASPKTQSALLEVMEEGRVTVDGTPYDVGHPFMVIAT 153
TG FE G ++ L DEI Q+ LL V+++G T G + ++A
Sbjct: 223 STGRFEQAEG---GTLFL-DEIGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAA 278

Query: 154 QN 155
N
Sbjct: 279 TN 280


34  CMM_2199  CMM_2222  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
CMM_2199  2  8  2.206409  conserved hypothetical protein
CMM_2200  2  7  2.312420  putative
CMM_2201  1  8  2.691169  putative dimethyladenosine transferase
CMM_2202  2  9  3.069831  putative citrate synthase
CMM_2203  2  8  2.960337  putative dihydrolipoamide dehydrogenase (E3)
CMM_2204  1  9  2.748194  putative NAD(P)H oxidoreductase
CMM_2205  0  17  0.831217  putative Zn-dependant alcohol dehydrogenase
CMM_2206  -1  21  -1.011092  putative transcriptional regulator, TetR family
CMM_2207  -1  16  -3.726878  putative membrane protein
CMM_2208  0  15  -4.246492  putative SAM-dependent methyltransferase
CMM_2209  -1  19  -4.815293  putative phosphatase
CMM_2210  1  22  -5.428705  putative methylisocitrate lyase/phosphonomutase
CMM_2211  0  18  -4.545680  putative 2-methylcitrate dehydratase
CMM_2212  0  10  -3.928683  putative duplicated acetyltransferase
CMM_2213  0  13  -2.681504  hypothetical protein, putative perforin
CMM_2214  0  11  -3.134867  conserved hypothetical protein
CMM_2215  1  10  -2.337535  putative sugar acetyltransferase
CMM_2216  3  16  -4.544805  putative transcriptional regulator, TetR family
CMM_2217  6  22  -6.221109  putative membrane-bound tyrosin-protein
CMM_2218  5  24  -6.675544  hypothetical secreted protein
CMM_2219  6  24  -6.923388  conserved membrane protein
CMM_2220  6  20  -5.252379  putative permease, DMT family
CMM_2221  6  19  -5.907405  conserved hypothetical protein
CMM_2222  3  12  -2.971205  putative methyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CMM_2204  PF06776  30  0.036  Invasion associated locus B
		>PF06776#Invasion associated locus B

Length = 214

Score = 29.9 bits (67), Expect = 0.036
Identities = 12/99 (12%), Positives = 22/99 (22%), Gaps = 11/99 (11%)

Query: 23 RRRGIRALAVTAGLATALSLAGLPAHAATHASAPAAARAAAPAAATPPAPAAVEDATF-- 80
+ I+ A + A A A A A + + +
Sbjct: 23 ALKAIQMGPAELSPMLASCRRLARRNGARLMLAGAMAIALSFGWSDRADAQGAVRSVHGD 82

Query: 81 ------TPGEARLD---TSGEVIQAHGGQIVPSVDEAGD 110
TP A+ + V+ +V
Sbjct: 83 WQIRCDTPPGAKAEQCALIQSVVAEDRSNAGLTVIILKT 121


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CMM_2207  PREPILNPTASE  28  0.027  Type IV prepilin cysteine protease (C20) family sig...
		>PREPILNPTASE#Type IV prepilin cysteine protease (C20) family signature.

Length = 290

Score = 27.8 bits (62), Expect = 0.027
Identities = 6/32 (18%), Positives = 9/32 (28%), Gaps = 5/32 (15%)

Query: 52 CPACGSRLGVRYP-----SGWLCAVCEWRHGD 78
C C + + RYP + L
Sbjct: 99 CRGCQAPISARYPLVELLTALLSVAVAMTLAP 130


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CMM_2221  TCRTETOQM  29  0.022  Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 28.7 bits (64), Expect = 0.022
Identities = 25/118 (21%), Positives = 35/118 (29%), Gaps = 17/118 (14%)

Query: 56 IAVALLARSAKSSARRTEVEVIASGVEARNTEYMARYSHDPNP-------PEESQPAASP 108
+ ALL E VI + EY PNP P S
Sbjct: 398 VTCALLQEKYHVEIEIKEPTVIYMERPLKKAEYTIHIEVPPNPFWASIGLSVSPLPLGSG 457

Query: 109 LSNNSAV----------PAIDFGVVNDRNQGRYRWKLSNLSSTIVAKNVHISGRTPAD 156
+ S+V A+ G+ QG Y W +++ + TPAD
Sbjct: 458 MQYESSVSLGYLNQSFQNAVMEGIRYGCEQGLYGWNVTDCKICFKYGLYYSPVSTPAD 515


35  CMM_2276  CMM_2303  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
CMM_2276  2  8  1.556876  putative acetyltransferase
CMM_2277  0  8  1.258959  putative UTP-glucose-1-phosphate
CMM_2278  -2  8  0.365068  putative 5-formyltetrahydrofolate cyclo-ligase
CMM_2279  -1  7  1.164887  conserved hypothetical protein
CMM_2280  -3  6  -0.522548  putative large-conductance mechanosensitive
CMM_2281  1  16  -4.256356  conserved hypothetical protein
CMM_2282  3  22  -6.479801  *putative transcriptional regulator, Cro/CI
CMM_2283  4  29  -7.951455  conserved hypothetical protein
CMM_2284  6  34  -9.582974  conserved hypothetical protein, putative
CMM_2285  7  39  -11.531348  putative cytosine/purine permease, NCS1 family
CMM_2286  8  51  -13.853978  putative hydantoinase
CMM_2287  9  53  -14.265283  conserved hypothetical protein
CMM_2288  9  54  -13.952479  conserved hypothetical protein
CMM_2289  10  52  -13.518571  putative ATPase
CMM_2290  12  57  -13.584579  conserved hypothetical protein
CMM_2291  11  56  -13.136370  conserved hypothetical protein, putative
CMM_2292  11  57  -13.956868  hypothetical protein
CMM_2293  11  50  -13.169188  putative transcriptional activator
CMM_2294  10  49  -13.679259  putative serine protease, family S9A
CMM_2295  9  50  -13.961174  putative restriction endonuclease
CMM_2296  8  45  -13.158507  putative secreted hydrolase
CMM_2297  7  47  -14.430156  putative iron-dependent repressor
CMM_2298  8  44  -14.411062  conserved hypothetical protein
CMM_2299  10  44  -13.531201  putative cold shock protein
CMM_2300  8  38  -11.423742  hypothetical membrane protein
CMM_2301  5  20  -6.100939  putative ATP-dependent DNA helicase
CMM_2302  5  21  -6.043967  putative two component system response
CMM_2303  2  11  -1.965093  putative two-component system sensor kinase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CMM_2288  YERSSTKINASE  29  0.012  Yersinia serine/threonine protein kinase signature.
		>YERSSTKINASE#Yersinia serine/threonine protein kinase signature.

Length = 732

Score = 28.6 bits (63), Expect = 0.012
Identities = 22/63 (34%), Positives = 30/63 (47%), Gaps = 3/63 (4%)

Query: 59 LSNAVDLDPQRIPPTGWSGTARDLSKYMNLDLLMLLNTVRIDAECDPDISVFEVRPATRP 118
LSN PQ P GW G LS +L+ + + T + AE + IS+ E + R
Sbjct: 104 LSNLFGAKPQTELPLGWKG--EPLSGAPDLEGMRVAETDKF-AEGESHISIIETKDKQRL 160

Query: 119 VAK 121
VAK
Sbjct: 161 VAK 163


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CMM_2292  IGASERPTASE  30  0.047  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 29.6 bits (66), Expect = 0.047
Identities = 26/126 (20%), Positives = 39/126 (30%), Gaps = 18/126 (14%)

Query: 477 KAYAPNVEVEA---EANTEATNVAAEPSEPGDAVDDSGASTPPQAEDASEA--------- 524
P V + + +E AEP+ D PQ++ + A
Sbjct: 1119 TQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTV---NIKEPQSQTNTTADTEQPAKET 1175

Query: 525 SAVSEPPRTADAPLAKESPAVADVPQAT--DALQGTGASEVSGSPAVADASDASAVSSAP 582
S+ E P T +V + P+ T Q T SE S P +V
Sbjct: 1176 SSNVEQPVTESTT-VNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNV 1234

Query: 583 ESHHTA 588
E T+
Sbjct: 1235 EPATTS 1240


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CMM_2295  BCTERIALGSPG  43  5e-08  Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 43.0 bits (101), Expect = 5e-08
Identities = 14/48 (29%), Positives = 29/48 (60%)

Query: 8 LQKGKSDRGFTLIELVIVVAVIGILAAIAIPAYGSIQATARANTATAN 55
++ RGFTL+E+++V+ +IG+LA++ +P + A A ++
Sbjct: 1 MRATDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSD 48


36  CMM_PS_21  CMM_2328  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
CMM_PS_21  4  9  4.074327  putative sugar ABC transporter, ATP-binding
CMM_2311  3  11  3.780958  conserved hypothetical protein
CMM_2312  4  10  3.438733  putative tRNA/rRNA methyltransferase
CMM_2313  2  9  2.748804  2-C-methyl-D-erythritol 2,4-cyclodiphosphate
CMM_2314  3  9  2.918155  putative transcriptional regulator, CarD family
CMM_2315  2  13  0.994172  hypothetical protein
CMM_2316  2  10  0.905012  putative two-component system response
CMM_2317  1  10  0.816367  putative two-component system sensor kinase
CMM_2318  1  10  0.827721  putative phosphate transport system regulator
CMM_2319  2  11  0.621410  phosphoglycerate mutase
CMM_2320  2  10  1.576752  putative membrane protein
CMM_2321  2  7  1.432446  putative aminomethyltransferase
CMM_2323  2  8  1.500683  conserved hypothetical protein
CMM_2324  2  8  1.185497  hypothetical protein
CMM_2325  2  10  0.576720  putative membrane protein
CMM_2326  1  12  -0.098396  transcriptional regulator
CMM_2327  3  14  0.446582  putative acetyltransferase
CMM_2328  2  11  0.825458  putative NTP pyrophosphohydrolase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CMM_2314  ACETATEKNASE  29  0.030  Acetate kinase family signature.
		>ACETATEKNASE#Acetate kinase family signature.

Length = 400

Score = 29.4 bits (66), Expect = 0.030
Identities = 13/63 (20%), Positives = 27/63 (42%), Gaps = 4/63 (6%)

Query: 270 GIDLLVTTGGVSAGAYEVVRDVLEGGVEFVSVAVQP---GGPQGLGTAEAGGARIPVVAF 326
G+D++V T G+ E+ +L+ G+EF+ + +++ V+
Sbjct: 322 GVDVIVFTAGIGENGPEIREFILD-GLEFLGFKLDKEKNKVRGEEAIISTADSKVNVMVV 380

Query: 327 PGN 329
P N
Sbjct: 381 PTN 383


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CMM_2318  NUCEPIMERASE  167  4e-51  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 167 bits (424), Expect = 4e-51
Identities = 83/363 (22%), Positives = 142/363 (39%), Gaps = 67/363 (18%)

Query: 7 LVTGGAGFIGGAIVQAALEEGRRVRVLDSLRADVHGGDPEI---------DPRVELVRGD 57
LVTG AGFIG + + LE G +V +D+L D + D + P + + D
Sbjct: 4 LVTGAAGFIGFHVSKRLLEAGHQVVGIDNL-NDYY--DVSLKQARLELLAQPGFQFHKID 60

Query: 58 VTDPDAVAGALD--GVDVVCHQAAKVGLGVDFLDAPDYVTTNDGGTAVLLAAMTRAGIDR 115
+ D + + + V ++ + + Y +N G +L I
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQH 120

Query: 116 LVLASSMVVYGEGAYAGADGPVRPPARRVADLDAGMFDPVDPATGEPLVPQLIGEDVPLD 175
L+ ASS VYG P + + V
Sbjct: 121 LLYASSSSVYGLNRK-------------------------MPFSTDDSVDH--------- 146

Query: 176 PRNVYATTKLAQENLASSWTRATGGRAAALRYHNVYGP-GMPQNTPYAGVASLFRSALAR 234
P ++YA TK A E +A +++ G A LR+ VYGP G P + F A+
Sbjct: 147 PVSLYAATKKANELMAHTYSHLYGLPATGLRFFTVYGPWGRPDMALF-----KFTKAMLE 201

Query: 235 GEAPRVFEDGRQRRDFVHVRDVAGANLTAL--------AWTADREAGS-----FRAFNVG 281
G++ V+ G+ +RDF ++ D+A A + WT + + +R +N+G
Sbjct: 202 GKSIDVYNYGKMKRDFTYIDDIAEAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIG 261

Query: 282 SGTVHTIGEMAEALAREAGGSAPVTTGEYRLGDVRHITASSDRLRTELGWEPRMTFEEGM 341
+ + + + +AL G A + GDV +A + L +G+ P T ++G+
Sbjct: 262 NSSPVELMDYIQALEDALGIEAKKNMLPLQPGDVLETSADTKALYEVIGFTPETTVKDGV 321

Query: 342 REF 344
+ F
Sbjct: 322 KNF 324


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CMM_2320  DHBDHDRGNASE  103  1e-28  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 103 bits (258), Expect = 1e-28
Identities = 73/255 (28%), Positives = 117/255 (45%), Gaps = 9/255 (3%)

Query: 9 KVVLITGGGSGLGRAAAVRLAAEGAQLALVDISEGGLVDTVAA--VMAATPDAAILTVLA 66
K+ ITG G+G A A LA++GA +A VD + L V++ A +A A
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEA----FPA 64

Query: 67 DVSKESDVDSYVHQTVERFGRIDGFFNNAGIEGRQNLTEDFTAAEFDKVVAINLRGVFLG 126
DV + +D + G ID N AG+ R L + E++ ++N GVF
Sbjct: 65 DVRDSAAIDEITARIEREMGPIDILVNVAGVL-RPGLIHSLSDEEWEATFSVNSTGVFNA 123

Query: 127 LEKVLAVMREQGSGMVVNTASVGGIRGVGNQSGYAAAKHGVVGLTRNSAVEYGEFGIRIN 186
V M ++ SG +V S + + YA++K V T+ +E E+ IR N
Sbjct: 124 SRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCN 183

Query: 187 AIAPGAIWTPMVEASMKQISADDP--RGAAEQFIQGNPTKRYGEAEEIASVVAFLLSDDA 244
++PG+ T M + + + +G+ E F G P K+ + +IA V FL+S A
Sbjct: 184 IVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQA 243

Query: 245 AYVNAAVLPIDGGQS 259
++ L +DGG +
Sbjct: 244 GHITMHNLCVDGGAT 258


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CMM_2321  BACINVASINB  34  0.001  Salmonella/Shigella invasin protein B signature.
		>BACINVASINB#Salmonella/Shigella invasin protein B signature.

Length = 593

Score = 33.6 bits (76), Expect = 0.001
Identities = 19/81 (23%), Positives = 41/81 (50%), Gaps = 4/81 (4%)

Query: 58 LGLAIGIADAVVQATGAGLLLEIVLQAVLMSAIVVPAVVLLRRRLDRRSLASIGLSRRIG 117
+GLA+ +AD +V+A ++ L + M ++ P + L+ + + ++L +G+ ++
Sbjct: 346 VGLAVMVADEIVKAATGVSFIQQALNPI-MEHVLKPLMELIGKAIT-KALEGLGVDKKTA 403

Query: 118 RPIALGVGVGAVTGAVVWVPA 138
G VGA+ A+ V
Sbjct: 404 EMA--GSIVGAIVAAIAMVAV 422


37  CMM_2340  CMM_2356  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
CMM_2340  -1  13  3.055077  hypothetical protein predicted by
CMM_2341  1  13  3.261435  putative aldose-1-epimerase
CMM_2342  3  11  4.259844  putative DeoR-family transcriptional regulator
CMM_2343  0  11  4.050030  putative acetyltransferase/ siderophore binding
CMM_2344  1  10  3.606614  putative bifunctional methylenetetrahydrofolate
CMM_2345  1  9  3.623462  serine hydroxymethyltransferase
CMM_2346  2  10  3.798784  putative glycosidase
CMM_2347  3  11  4.112504  hypothetical protein
CMM_2348  4  12  4.242616  putative MFS permease
CMM_2349  3  11  3.557301  putative oxidoreductase
CMM_2350  4  10  4.002702  putative manganese transporter, NRAMP family
CMM_2351  1  7  2.715518  hypothetical protein
CMM_2352  -1  6  2.288228  putative sugar MFS-permease
CMM_2353  0  8  2.624659  putative formyltetrahydrofolate deformylase
CMM_2354  1  8  2.204611  putative ABC-transporter, permease component
CMM_2355  1  9  2.385181  putative serine protease, peptidase family S8A
CMM_2356  2  10  2.177884  putative subtilisin-like serine
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CMM_2349  FERRIBNDNGPP  60  1e-12  Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 60.3 bits (146), Expect = 1e-12
Identities = 41/187 (21%), Positives = 73/187 (39%), Gaps = 20/187 (10%)

Query: 71 PKRVVVLEPVALDTSVALGVIPVGAAVLNETAGVPAYLGADAADAEV--VGTVPEPGVER 128
P R+V LE + ++ +ALG++P G A +T ++ V VG EP +E
Sbjct: 35 PNRIVALEWLPVELLLALGIVPYGVA---DTINYRLWVSEPPLPDSVIDVGLRTEPNLEL 91

Query: 129 IVALAPDLIIGTESRHAAIHDQLASIAPTVFLAS-----QADPWRDNVALVGEALGRADD 183
+ + P ++ + + + + LA IAP R ++ + + L
Sbjct: 92 LTEMKPSFMVWS-AGYGPSPEMLARIAPGRGFNFSDGKQPLAMARKSLTEMADLLNLQSA 150

Query: 184 AAALLGDYQARCDAIAQEYAVAGTT----AQMIRPRDGLLTLYGPESFAGSTLECAGFTT 239
A L Y+ ++ + G +I PR L ++GP S L+ G
Sbjct: 151 AETHLAQYEDFIRSMKPRFVKRGARPLLLTTLIDPRHML--VFGPNSLFQEILDEYGI-- 206

Query: 240 PERDWEG 246
W+G
Sbjct: 207 -PNAWQG 212


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CMM_2355  cloacin  29  0.042  Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 28.5 bits (63), Expect = 0.042
Identities = 14/38 (36%), Positives = 16/38 (42%)

Query: 2 AGPARLSGAGAGGRGSGSGSGSGSGDGPGGRATIHDVA 39
+G G+G G G SG GSG G A VA
Sbjct: 52 SGIHWGGGSGHGNGGGNGNSGGGSGTGGNLSAVAAPVA 89


38  CMM_2456  CMM_2478  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
CMM_2456  2  9  2.168205  conserved hypothetical protein
CMM_2457  1  7  2.330789  hypothetical protein
CMM_2458  0  7  2.634303  conserved hypothetical protein
CMM_2459  -1  6  2.383597  hypothetical protein
CMM_2460  -1  9  1.227970  hypothetical protein
CMM_2461  2  8  1.107312  conserved hypothetical protein
CMM_2462  3  10  1.311217  putative ATP-dependent helicase
CMM_2463  3  9  1.153939  putative transcriptional regulator, TetR family
CMM_2464  -1  8  0.648945  conserved hypothetical protein
CMM_2465  -1  7  0.177884  conserved hypothetical protein, putative
CMM_2466  -1  7  -0.309938  putative ATP-dependent helicase
CMM_2467  -2  7  -0.718271  putative endonuclease VIII/DNA glycosylase
CMM_2468  3  9  1.804750  putative cationic amino acid permease, APC
CMM_2469  2  9  1.472659  putative RNA polymerase sigma factor,
CMM_2470  3  10  0.816698  putative two-component system, sensor
CMM_2471  0  10  0.331037  hypothetical protein
CMM_2472  0  10  0.392442  hypothetical protein
CMM_2473  0  12  -0.395692  putative transcriptional regulator
CMM_2474  0  13  -2.698896  putative amidase
CMM_2475  1  15  -2.858959  conserved hypothetical protein, putative
CMM_2476  1  14  -1.900130  putative penicillin-binding protein
CMM_2477  3  16  -2.662000  putative transcriptional regulator, MarR-family
CMM_2478  4  15  -2.015492  putative monooxygenase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CMM_2456  PF03309  30  0.017  Bvg accessory factor
		>PF03309#Bvg accessory factor

Length = 271

Score = 30.1 bits (68), Expect = 0.017
Identities = 17/68 (25%), Positives = 28/68 (41%), Gaps = 6/68 (8%)

Query: 1 MRIGIDVGGTNTDAVLMHGD----RVVVGIKSSTTEDVTSGIVGALAELDRQHPFDPADI 56
M + IDV T+T L+ G +VV + T +VT+ + +D D +
Sbjct: 1 MLLAIDVRNTHTVVGLISGSGDHAKVVQQWRIRTEPEVTADELALT--IDGLIGDDAERL 58

Query: 57 DGVMIGTT 64
G +T
Sbjct: 59 TGASGLST 66


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CMM_2457  PF09025  28  0.046  YopR Core
		>PF09025#YopR Core

Length = 143

Score = 27.7 bits (61), Expect = 0.046
Identities = 25/90 (27%), Positives = 35/90 (38%), Gaps = 3/90 (3%)

Query: 18 HLPAIARGAAILGTGGGGDPYIGRLLAGEAIKELGPIPLADPFDLPDDAVVIPVAMMGAP 77
+LPA+ + A GG P GR LAG LG L F P + + A
Sbjct: 22 NLPAVDQVLAFEQALGGEPPAAGRRLAGLENGALGE-RLLQRFAQPLQGLEADRLELKA- 79

Query: 78 TVMVEKLPTVEQLQGAILALARYLGVTPTH 107
++ +LP Q Q +L L + P
Sbjct: 80 -MLRAELPLGRQQQTFLLQLLGAVEHAPGG 108


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CMM_2460  RTXTOXIND  31  0.017  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 30.6 bits (69), Expect = 0.017
Identities = 25/206 (12%), Positives = 70/206 (33%), Gaps = 10/206 (4%)

Query: 166 EAVHERIEARRAIDAELAAARAELVDVTATAELQGVGLFEHDHPAESSAELASRLEALRY 225
A+ + + + L A + + ++ L E P E + S E LR
Sbjct: 128 TALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRL 187

Query: 226 T--IKNAVRDKRAVTATSGFTFNGSEAQGRRFVSDMSKVLLRAYNAEAENAVKATRAGNL 283
T IK + + A+ ++ +++ + ++ ++
Sbjct: 188 TSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQ 247

Query: 284 HVAQNRLTKAAEQIARSGTMIDLRIQDGYHELRLEEL--QLASAHLRVLQAEKEMERERR 341
+A++ + + + + + ++ +LE++ ++ SA + + E
Sbjct: 248 AIAKHAVLEQENKYV------EAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEIL 301

Query: 342 AELREQAKASAELQAERERLDKERAH 367
+LR+ L E + ++ +
Sbjct: 302 DKLRQTTDNIGLLTLELAKNEERQQA 327



Score = 29.4 bits (66), Expect = 0.037
Identities = 14/113 (12%), Positives = 30/113 (26%), Gaps = 5/113 (4%)

Query: 86 AELDDYRARAQTEAEAARRTGADAASALVAAARAERDRILAEAGAERLRAEQEAGAVRLR 145
+L A A T +T + A + R + E + +
Sbjct: 125 LKLTALGAEADT-----LKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNV 179

Query: 146 AEEEAGATRRRAEAELQAVDEAVHERIEARRAIDAELAAARAELVDVTATAEL 198
+EEE + + +++ AE A + + +
Sbjct: 180 SEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRV 232


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CMM_2463  PF03544  40  9e-06  Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 40.0 bits (93), Expect = 9e-06
Identities = 23/121 (19%), Positives = 29/121 (23%)

Query: 264 EPTTPAPTPTVPTPTPTVPTPTPTPTVPTPTPTAPTPTPTPTAPTPTPTAPTPTPTAPTP 323
+P + P P P P V P P P AP P P
Sbjct: 48 QPISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKP 107

Query: 324 TPTAPTPTPTAPTPTPTVPTPRATAPTPTPTVPTPTVTVPTPTATVPTPTPTVGTPTPGT 383
P +P PT T T P +V + + P
Sbjct: 108 VKKVEQPKRDVKPVESRPASPFENTAPARPTSSTATAATSKPVTSVASGPRALSRNQPQY 167

Query: 384 P 384
P
Sbjct: 168 P 168



Score = 36.1 bits (83), Expect = 2e-04
Identities = 21/116 (18%), Positives = 25/116 (21%), Gaps = 1/116 (0%)

Query: 273 TVPTPTPTVPTPTPTPTVPTPTPTAPTPTPTPTAPTPTPTAPTPTPTAPTPTPTAPTPTP 332
T +P P P T A P P P P P P P P
Sbjct: 35 TSVHQVIELPAP-AQPISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVV 93

Query: 333 TAPTPTPTVPTPRATAPTPTPTVPTPTVTVPTPTATVPTPTPTVGTPTPGTPTPTA 388
P P+ P V + T + T T
Sbjct: 94 IEKPKPKPKPKPKPVKKVEQPKRDVKPVESRPASPFENTAPARPTSSTATAATSKP 149



Score = 35.3 bits (81), Expect = 3e-04
Identities = 23/114 (20%), Positives = 28/114 (24%), Gaps = 2/114 (1%)

Query: 288 PTVPTPTPTAPTPTPTPTAPTPTPTAPTPTPTAPTPTPTAPTPTPTAPTPTPTVPTPRAT 347
+P P + T A P A P P P P P P P V +
Sbjct: 41 IELPAP-AQPISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVV-IEKPK 98

Query: 348 APTPTPTVPTPTVTVPTPTATVPTPTPTVGTPTPGTPTPTAGTPTPTAGTPTPT 401
P V P P PT+ T T P +
Sbjct: 99 PKPKPKPKPVKKVEQPKRDVKPVESRPASPFENTAPARPTSSTATAATSKPVTS 152



Score = 34.2 bits (78), Expect = 8e-04
Identities = 23/117 (19%), Positives = 27/117 (23%), Gaps = 1/117 (0%)

Query: 277 PTPTVPTPTPTPTVPTPTPTAPTPTPTPTAPTPTPTAPTPTPTAPTPTPTAPTPTPTAPT 336
P P P + T P P P P P P P P P
Sbjct: 44 PAPAQP-ISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPK 102

Query: 337 PTPTVPTPRATAPTPTPTVPTPTVTVPTPTATVPTPTPTVGTPTPGTPTPTAGTPTP 393
P P V + + TA + T T T A P
Sbjct: 103 PKPKPVKKVEQPKRDVKPVESRPASPFENTAPARPTSSTATAATSKPVTSVASGPRA 159



Score = 33.0 bits (75), Expect = 0.002
Identities = 19/111 (17%), Positives = 22/111 (19%)

Query: 300 PTPTPTAPTPTPTAPTPTPTAPTPTPTAPTPTPTAPTPTPTVPTPRATAPTPTPTVPTPT 359
P P P P P P P P P P P
Sbjct: 47 AQPISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPK 106

Query: 360 VTVPTPTATVPTPTPTVGTPTPGTPTPTAGTPTPTAGTPTPTSTTATPIVP 410
+P T A + TA T T+ P
Sbjct: 107 PVKKVEQPKRDVKPVESRPASPFENTAPARPTSSTATAATSKPVTSVASGP 157



Score = 31.5 bits (71), Expect = 0.005
Identities = 18/113 (15%), Positives = 22/113 (19%)

Query: 316 PTPTAPTPTPTAPTPTPTAPTPTPTVPTPRATAPTPTPTVPTPTVTVPTPTATVPTPTPT 375
P + T A P A P P P P P P V P P P
Sbjct: 47 AQPISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPK 106

Query: 376 VGTPTPGTPTPTAGTPTPTAGTPTPTSTTATPIVPGGTTPGGSLPVTGYDPTP 428
+ A T+ + P
Sbjct: 107 PVKKVEQPKRDVKPVESRPASPFENTAPARPTSSTATAATSKPVTSVASGPRA 159


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CMM_2475  HTHFIS  110  3e-30  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 110 bits (276), Expect = 3e-30
Identities = 39/121 (32%), Positives = 61/121 (50%)

Query: 3 DGPKILIVDDEPNIRDLLTTSLRFAGFAVRAVGNGAQAISAVLEEEPDLIILDVMLPDMN 62
G IL+ DD+ IR +L +L AG+ VR N A + + DL++ DV++PD N
Sbjct: 2 TGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 63 GFGVTKRLRAAGYTAPILFLTAKDDTEDKITGLTVGGDDYVTKPFSLDEIVARIKAILRR 122
F + R++ A P+L ++A++ I G DY+ KPF L E++ I L
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121

Query: 123 T 123

Sbjct: 122 P 122


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CMM_2477  56KDTSANTIGN  25  0.042  Rickettsia 56kDa type-specific antigen protein sign...
		>56KDTSANTIGN#Rickettsia 56kDa type-specific antigen protein signature.

Length = 533

Score = 25.3 bits (55), Expect = 0.042
Identities = 19/69 (27%), Positives = 36/69 (52%), Gaps = 9/69 (13%)

Query: 21 SIGRIQAEVAGLHGQLADLQGSWTGSAATAFQGVV---------AEWKGTQQRVEEALAS 71
SI +IQ+++ L L +L+ S+ G AF + A+ + Q + ++A A+
Sbjct: 296 SIEQIQSKIQELGDTLEELRDSFDGYINNAFVNQIHLNFVMPPQAQQQQGQGQQQQAQAT 355

Query: 72 INQALSAAA 80
+A++AAA
Sbjct: 356 AQEAVAAAA 364


39  CMM_2560  CMM_2570  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
CMM_2560  2  7  -1.795330  hypothetical protein
CMM_2561  1  7  -3.967995  putative transcriptional regulator, ArsR family
CMM_2562  2  7  -3.763255  hypothetical protein
CMM_2563  2  7  -4.228667  putative acetyltransferase
CMM_2564  0  7  -2.410753  putative two-component system, sensor kinase
CMM_2565  -1  7  -2.052260  putative two-component system, response
CMM_2566  -2  7  -0.277164  hypothetical membrane protein
CMM_2567  -1  8  1.464715  conserved membrane protein, putative exporter
CMM_2568  1  12  2.506102  conserved hypothetical protein
CMM_2569  1  9  3.062511  putative multidrug ABC transporter, permease
CMM_2570  2  11  3.074535  putative ABC transporter, ATP-binding protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CMM_2562  PF05272  35  5e-04  Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 35.4 bits (81), Expect = 5e-04
Identities = 18/41 (43%), Positives = 18/41 (43%), Gaps = 2/41 (4%)

Query: 144 SPDGAAAGAAGSTP--APSASAGAGTSTGGETAAPDTETGA 182
SP AA GA G P SAGAGT GG D E
Sbjct: 389 SPTAAAGGAGGGEPPKKRDPSAGAGTDPGGPGGGDDGEDPF 429


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CMM_2569  PF03309  29  0.032  Bvg accessory factor
		>PF03309#Bvg accessory factor

Length = 271

Score = 29.0 bits (65), Expect = 0.032
Identities = 7/38 (18%), Positives = 14/38 (36%)

Query: 129 DEVTAAVAGWNLAPFPEAEVVQGTAETFDAGGVDGVYL 166
D + +A ++ V G++ D G +L
Sbjct: 111 DRIVNCLAAYHKYGTAAIVVDFGSSICVDVVSAKGEFL 148


40  CMM_2596  CMM_2628  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
CMM_2596  -1  12  -3.125146  putative ferredoxin/ferredoxin-NADP reductase
CMM_2597  2  16  -4.008375  putative polyprenyl diphosphate synthase
CMM_2598  3  17  -5.035231  putative menaquinone biosynthesis
CMM_2599  5  21  -4.181765  putative sugar MFS permease
CMM_2600  4  19  -4.630395  putative isochorismate synthase
CMM_2601  4  20  -4.586019  menaquinone biosynthesis bifunctional protein
CMM_2602  4  20  -6.041554  hypothetical membrane protein
CMM_2603  4  19  -6.468071  hypothetical membrane protein
CMM_2604  5  21  -6.581162  hypothetical protein
CMM_2605  5  22  -6.703973  putative 1,4-dihydroxy-2-naphthoate
CMM_2606  4  21  -5.999868  putative mutase
CMM_2607  4  19  -5.925035  putative metallopeptidase, family M13
CMM_2608  4  17  -5.797881  putative transcriptional regulator, Cro/CI
CMM_2609  7  18  -5.175057  putative xanthine/uracil family permease, NCS2
CMM_2610  6  16  -5.693860  conserved hypothetical protein, putative
CMM_2611  5  17  -5.238018  putative cytosine permease, NCS1 family
CMM_2612  7  19  -5.579634  putative transcriptional regulator, GntR-family
CMM_2613  6  19  -5.892464  putative phosphatase
CMM_2614  5  15  -4.492805  putative homocysteine S-methyltransferase
CMM_2615  2  13  -5.102302  putative nucleoside-diphosphate-sugar epimerase
CMM_2616  1  12  -5.612893  putative transcriptional regulator
CMM_2617  1  12  -5.840323  putative MFS permease
CMM_2618  1  11  -5.842547  putative aldo/keto reductase
CMM_2619  1  11  -5.110404  putative transcriptional regulator, LysR-family
CMM_2620  0  14  -4.486989  putative permease, DMT family
CMM_2621  1  13  -3.358514  putative membrane protein
CMM_2622  1  12  -1.302625  conserved hypothetical protein, putative
CMM_2623  1  10  -1.021682  putative sugar kinase
CMM_2624  0  9  -0.380801  putative transcriptional regulator, LacI-family
CMM_2625  1  12  -2.612651  putative sugar ABC transporter, substrate
CMM_2626  2  13  -4.053788  putative sugar ABC transporter, permease
CMM_2627  2  12  -3.908705  putative sugar ABC transporter, ATP-binding
CMM_2628  2  12  -3.463029  conserved hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CMM_2598  SECYTRNLCASE  519  0.0  Preprotein translocase SecY subunit signature.
		>SECYTRNLCASE#Preprotein translocase SecY subunit signature.

Length = 437

Score = 519 bits (1338), Expect = 0.0
Identities = 296/444 (66%), Positives = 358/444 (80%), Gaps = 11/444 (2%)

Query: 1 MLSAVVRIFRTPDLRRKIGFTLGIIALFRLGSFIPAPFVDFANVQSCLAANQGTSGLYEL 60
ML+A R FRTPDLR+K+ FTL II ++R+G+ IP P VD+ NVQ C+ G GL+ L
Sbjct: 1 MLTAFARAFRTPDLRKKLLFTLAIIVVYRVGTHIPIPGVDYKNVQQCVREASGNQGLFGL 60

Query: 61 VNLFSGGALLKLSIFALGIMPYITASIIVQLLRVVIPHFDTLYKEGQSGQAKLTQYTRYL 120
VN+FSGGALL+++IFALGIMPYITASII+QLL VVIP + L KEGQ+G AK+TQYTRYL
Sbjct: 61 VNMFSGGALLQITIFALGIMPYITASIILQLLTVVIPRLEALKKEGQAGTAKITQYTRYL 120

Query: 121 TIALAVLQSTTLITVARSGALFGQTNVSACTQLVTNDAWYAIMLMVITMTAGTGLIMWMG 180
T+ALA+LQ T L+ ARS LFG+ +V Q+V + + + + MVI MTAGT ++MW+G
Sbjct: 121 TVALAILQGTGLVATARSAPLFGRCSVGG--QIVPDQSIFTTITMVICMTAGTCVVMWLG 178

Query: 181 ELITERGIGNGMSLLIFTSVAAAFPTSLIAIQQ----SRGWEVFLLVIAVGLLVVAAVVY 236
ELIT+RGIGNGMS+L+F S+AA FP++L AI++ + GW F VIAVGL++VA VV+
Sbjct: 179 ELITDRGIGNGMSILMFISIAATFPSALWAIKKQGTLAGGWIEFGTVIAVGLIMVALVVF 238

Query: 237 VEQSQRRIPVQYAKRMVGRRTYGGNNTYIPIKVNMAGVVPVIFASSLLYLPALVAQFNQP 296
VEQ+QRRIPVQYAKRM+GRR+YGG +TYIP+KVN AGV+PVIFASSLLY+PALVAQF
Sbjct: 239 VEQAQRRIPVQYAKRMIGRRSYGGTSTYIPLKVNQAGVIPVIFASSLLYIPALVAQFAGG 298

Query: 297 PVGQPPAPWVQWITDNLTTGDHPLYMAMYFLLIVGFTYFYVAITFNPEEVADNMKKYGGF 356
G W W+ NLT GDHP+Y+ YFLLIV F +FYVAI+FNPEEVADNMKKYGGF
Sbjct: 299 NSG-----WKSWVEQNLTKGDHPIYIVTYFLLIVFFAFFYVAISFNPEEVADNMKKYGGF 353

Query: 357 IPGIRAGRPTAEYLDYVLTRITLPGSLYLGLIALLPLIALSLVGANQNFPFGGASILIVV 416
IPGIRAGRPTAEYL YVL RIT PGSLYLGLIAL+P +AL GA+QNFPFGG SILI+V
Sbjct: 354 IPGIRAGRPTAEYLSYVLNRITWPGSLYLGLIALVPTMALVGFGASQNFPFGGTSILIIV 413

Query: 417 GVGLETVKQIDSQLQQRHYEGLLK 440
GVGLETVKQI+SQLQQR+YEG L+
Sbjct: 414 GVGLETVKQIESQLQQRNYEGFLR 437


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CMM_2611  SECA  32  0.003  SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 31.8 bits (72), Expect = 0.003
Identities = 19/65 (29%), Positives = 31/65 (47%), Gaps = 10/65 (15%)

Query: 74 TARPGIVIGRRGVEAERIRADLEKLTGKQIQLNILEVKNPEAEAQLVAQG-------IAE 126
+P +++G +E + ++ LT I+ N+L K EA +VAQ IA
Sbjct: 448 KGQP-VLVGTISIEKSELVSNE--LTKAGIKHNVLNAKFHANEAAIVAQAGYPAAVTIAT 504

Query: 127 QLAGR 131
+AGR
Sbjct: 505 NMAGR 509


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CMM_2617  BINARYTOXINB  28  0.041  Binary toxin B family signature.
		>BINARYTOXINB#Binary toxin B family signature.

Length = 764

Score = 27.7 bits (61), Expect = 0.041
Identities = 20/80 (25%), Positives = 32/80 (40%), Gaps = 1/80 (1%)

Query: 35 QITPNVVTQVRTPEVDGYGAIQIAYGQIDPRKADKPSTGHFDKAGVTPRRHLTEVRTADF 94
+I NV + R P V Y + + I K + ST + D T ++ + RT
Sbjct: 271 RIDKNVSPEARHPLVAAYPIVHVDMENIILSKNEDQSTQNTDSQTRTISKNTSTSRT-HT 329

Query: 95 AEYALGQEITVGAFEPGTKV 114
+E E+ F+ G V
Sbjct: 330 SEVHGNAEVHASFFDIGGSV 349


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CMM_2620  TCRTETOQM  74  3e-16  Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 73.7 bits (181), Expect = 3e-16
Identities = 51/158 (32%), Positives = 81/158 (51%), Gaps = 18/158 (11%)

Query: 13 VNIGTIGHVDHGKTTLT-------AAISKVLADKYPSATNVQRDFASIDSAPEERQRGIT 65
+NIG + HVD GKTTLT AI+++ +V + D+ ERQRGIT
Sbjct: 4 INIGVLAHVDAGKTTLTESLLYNSGAITEL--------GSVDKGTTRTDNTLLERQRGIT 55

Query: 66 INISHVEYETPKRHYAHVDAPGHADYIKNMITGAAQMDGAILVVAATDGPMAQTREHVLL 125
I ++ +D PGH D++ + + +DGAIL+++A DG AQTR
Sbjct: 56 IQTGITSFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHA 115

Query: 126 AKQVGVPYLLVALNKSDMVDDEEILELVELEVRELLSS 163
+++G+P + +NK D + L V +++E LS+
Sbjct: 116 LRKMGIP-TIFFINKIDQNGID--LSTVYQDIKEKLSA 150


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CMM_2621  TCRTETOQM  589  0.0  Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 589 bits (1521), Expect = 0.0
Identities = 172/693 (24%), Positives = 302/693 (43%), Gaps = 78/693 (11%)

Query: 11 KVRNIGIMAHIDAGKTTTTERILYYTGITHKIGEVHDGAATMDWMAQEQERGITITSAAT 70
K+ NIG++AH+DAGKTT TE +LY +G ++G V G D E++RGITI + T
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 71 TCFWNKNQINIIDTPGHVDFTVEVERSLRVLDGAVAVFDGKEGVEPQSETVWRQADKYDV 130
+ W ++NIIDTPGH+DF EV RSL VLDGA+ + K+GV+ Q+ ++ K +
Sbjct: 62 SFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGI 121

Query: 131 PRICFVNKMDKLGADFYFTVDTIINRLGAKPLVIQLPIGSEGGFEGVIDLVEMRALTWRG 190
P I F+NK+D+ G D I +L A+ ++ Q
Sbjct: 122 PTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQ------------------------- 156

Query: 191 DSKGDVELGAKYDIEEIPADLQDKADEYRAKLLETVAETDDSLLEKYFGGEELTVAEIKA 250
VEL + Q +TV E +D LLEKY G+ L E++
Sbjct: 157 ----KVELYPNMCVTNFTESEQ----------WDTVIEGNDDLLEKYMSGKSLEALELEQ 202

Query: 251 AIRKLTVNSEIYPVLCGSAFKNRGVQPMLDAVIDYLPSPLDVPPMEGHDVRDEEKIIIRK 310
N ++PV GSA N G+ +++ + + S
Sbjct: 203 EESIRFHNCSLFPVYHGSAKNNIGIDNLIEVITNKFYSSTH------------------- 243

Query: 311 PDSTEPFSALAFKVAVHPFFGRLTYVRVYSGTIASGSQVINSTKGKKERIGKIFQMHSNK 370
FK+ RL Y+R+YSG + V S K K +I +++ + +
Sbjct: 244 -RGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEKEKI-KITEMYTSINGE 301

Query: 371 ENPVDSVTAGHIYAVIG----LKDTTTGDTLCDPQDQIVLESMTFPEPVIEVAIEPKTKA 426
+D +G I + L GDT PQ E + P P+++ +EP
Sbjct: 302 LCKIDKAYSGEIVILQNEFLKLNSVL-GDTKLLPQR----ERIENPLPLLQTTVEPSKPQ 356

Query: 427 DQEKLGVAIQKLAEEDPTFRTEQNQETGQTVIKGMGELHLDILVDRMKREFNVEANVGKP 486
+E L A+ ++++ DP R + T + ++ +G++ +++ ++ +++VE + +P
Sbjct: 357 QREMLLDALLEISDSDPLLRYYVDSATHEIILSFLGKVQMEVTCALLQEKYHVEIEIKEP 416

Query: 487 QVAYRETIRGTVDKHDFTHKKQTGGSGQFAKIQIKIEPMEVTAEKTYEFDNKVTGGRVPR 546
V Y E + +T + + +A I + + P+ + + YE + V+ G + +
Sbjct: 417 TVIYMERPLKKAE---YTIHIEVPPNPFWASIGLSVSPLPLGSGMQYE--SSVSLGYLNQ 471

Query: 547 EYIPSVDAGIQDALQVGILAGYPMVGVKATLLDGAAHDVDSSEMAFKIAGSMAFKEAARK 606
+ +V GI+ + G L G+ + K G + S+ F++ + ++ +K
Sbjct: 472 SFQNAVMEGIRYGCEQG-LYGWNVTDCKICFKYGLYYSPVSTPADFRMLAPIVLEQVLKK 530

Query: 607 AKPVLLEPLMAVEVRTPEEYMGDVIGDLNSRRGQIQAMEDASGVKVITANVPLSEMFGYV 666
A LLEP ++ ++ P+EY+ D I + + +++ +P + Y
Sbjct: 531 AGTELLEPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQLKNNEVILSGEIPARCIQEYR 590

Query: 667 GDLRSKTSGRAVYSMSFGSYAEVPKAVADEIVQ 699
DL T+GR+V Y + + Q
Sbjct: 591 SDLTFFTNGRSVCLTELKGYHV---TTGEPVCQ 620


41  CMM_2668  CMM_2682  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias     Product
CMM_2668   0       18       -3.068452   putative short chain
CMM_2669   -1      20       -1.501361   putative transcriptional regulator, TetR family
CMM_PS_23  -1      13       0.040253    conserved hypothetical protein
CMM_2670   0       11       0.040870    hypothetical membrane protein
CMM_2671   2       9        1.606568    hypothetical membrane protein
CMM_2672   2       9        2.273460    putative L-alanine dehydrogenase
CMM_2673   2       9        3.240122    putative leucine-responsive regulatory
CMM_2674   3       11       3.504317    putative deoxynucleotide triphosphate deaminase
CMM_2675   3       9        3.411938*   hypothetical protein
CMM_2676   1       9        3.617398    hypothetical protein
CMM_2677   1       14       3.057712    putative ABC transporter, ATP-binding protein
CMM_2678   1       11       3.406840    hypothetical membrane protein
CMM_2679   3       11       2.862463    hypothetical membrane protein
CMM_2680   4       8        3.464033    putative transcriptional regulator, GntR family
CMM_2681   4       8        3.773912    putative gluconokinase
CMM_2682   4       9        3.692418    putative gluconate permease, GntP family
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
CMM_2676  BLACTAMASEA  272    6e-93    Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 272 bits (696), Expect = 6e-93
Identities = 96/275 (34%), Positives = 139/275 (50%), Gaps = 11/275 (4%)

Query: 42 APSASAAPAVDQAAADAAFTALEGRFGARLGVHAVDTGTGAEVS-WRADERFAYASTIKA 100
A A A Q E + R+G+ +D +G ++ WRADERF ST K
Sbjct: 13 ATLPLAVHASPQPLEQ--IKLSESQLSGRVGMIEMDLASGRTLTAWRADERFPMMSTFKV 70

Query: 101 PLAAALLDRVGIAG--MDRAVPIEAADILSYAPVTETRVGGTMTLRELAEAAMTRSDNTA 158
L A+L RV ++R + D++ Y+PV+E + MT+ EL AA+T SDN+A
Sbjct: 71 VLCGAVLARVDAGDEQLERKIHYRQQDLVDYSPVSEKHLADGMTVGELCAAAITMSDNSA 130

Query: 159 ANLLLEALGGQAELDAALTALGDDTTVVSRTEPDLNEAIPGDDRDTTTPRAAAALLRAYA 218
ANLLL +GG A L A L +GD+ T + R E +LNEA+PGD RDTTTP + AA LR
Sbjct: 131 ANLLLATVGGPAGLTAFLRQIGDNVTRLDRWETELNEALPGDARDTTTPASMAATLRKLL 190

Query: 219 LGDPGAIADPLDADERALFTGWLKATQTGATLVRAELPADWTVGDKSGLGEYASRGDVAV 278
L A + W+ + L+R+ LPA W + DK+G GE +RG VA+
Sbjct: 191 TS------QRLSARSQRQLLQWMVDDRVAGPLIRSVLPAGWFIADKTGAGERGARGIVAL 244

Query: 279 IWRPDAAPIVIAVHSSKDQQDADADDALISGAARA 313
+ + A ++ ++ + I+G A
Sbjct: 245 LGPNNKAERIVVIYLRDTPASMAERNQQIAGIGAA 279


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
CMM_2677  HTHTETR  47     9e-09    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 46.9 bits (111), Expect = 9e-09
Identities = 23/137 (16%), Positives = 48/137 (35%), Gaps = 1/137 (0%)

Query: 14 RDARRAELLDAAVRVMAVAGVAGASTRAITAEAGLAHGAFHYCFGVREELLGALLRQEVD 73
R +LD A+R+ + GV+ S I AG+ GA ++ F + +L +
Sbjct: 9 AQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSES 68

Query: 74 AVVAQLEAAEDPAGAPLGDVVAATLRAELDRVRREPDRQRVLIDLAATVQRIPALADLPA 133
+ + V+ L L+ E R+ ++ + + + +A +
Sbjct: 69 NIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQ 128

Query: 134 WEHGRYVEETRRRFTAA 150
+ E+ R
Sbjct: 129 AQRN-LCLESYDRIEQT 144


42  CMM_2734  CMM_2741  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias     Product
CMM_2734   2       10       1.871085    50S ribosomal protein L9
CMM_2735   2       10       1.703063    30S ribosomal protein S18
CMM_2736   3       12       2.843426    single-strand binding protein
CMM_2737   3       13       3.427368    30S ribosomal protein S6
CMM_2738   2       11       3.023594    putative RNA nucleotidyltransferase
CMM_2739   3       12       2.894730    conserved hypothetical protein
CMM_2740   2       9        2.339817    conserved membrane protein, MOP family
CMM_2741   2       8        1.641403    putative thioredoxin reductase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
CMM_2736  BACINVASINB  33     0.002    Salmonella/Shigella invasin protein B signature.
		>BACINVASINB#Salmonella/Shigella invasin protein B signature.

Length = 593

Score = 33.2 bits (75), Expect = 0.002
Identities = 32/110 (29%), Positives = 54/110 (49%), Gaps = 13/110 (11%)

Query: 263 VLVFLASFIPVVGAIVSGAFAVVIALVFVGPLQAVIMLAAVIGVHLLES-------HVLQ 315
VL L + + VV A+ +G ++ +A V + + A ++ A GV ++ HVL+
Sbjct: 320 VLGALLTIVSVVAAVFTGGASLALAAVGLAVMVADEIVKAATGVSFIQQALNPIMEHVLK 379

Query: 316 PLV--MGGAV--HVHPLAVVLSVA--AGSYVGGVAGALFAVPFVATVNVM 359
PL+ +G A+ + L V A AGS VG + A+ V + V V+
Sbjct: 380 PLMELIGKAITKALEGLGVDKKTAEMAGSIVGAIVAAIAMVAVIVVVAVV 429


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
CMM_2740  HTHFIS  51     7e-10    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 51.0 bits (122), Expect = 7e-10
Identities = 35/165 (21%), Positives = 62/165 (37%), Gaps = 8/165 (4%)

Query: 2 IRIVIADDHPVVRAGL-HAVLDAAADIDVIGEAATPAEAVALAASEDPDLVLMDLQFGQE 60
I++ADD +R L A+ A D+ + + A A+ D DLV+ D+
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRI---TSNAATLWRWIAAGDGDLVVTDVVMPD- 59

Query: 61 RTGADATRQIRAAEAAPYVLILTNYDSDGDILSAVEAGASGYLLKDAPPAELLAAVRAAA 120
D +I+ A VL+++ ++ + A E GA YL K EL+ + A
Sbjct: 60 ENAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRA- 118

Query: 121 AGESALAPAVASRLLARMRAPRVSLSSREIEVLRLVADGASNTDV 165
+ + + S ++ + V TD+
Sbjct: 119 --LAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDL 161


43  CMM_2783  CMM_2792  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias     Product
CMM_2783   2       8        -1.923565
CMM_2784   3       8        -1.885074
CMM_2785   2       10       -2.642955
CMM_2786   3       13       -2.113783
CMM_2787   0       12       -1.603998
CMM_2788   0       12       -0.899208
CMM_2789   -1      11       -0.149940
CMM_2790   1       13       0.454959
CMM_2791   3       16       0.799208
CMM_2792   2       11       0.001994
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_2785  SACTRNSFRASE  43     8e-08    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 43.4 bits (102), Expect = 8e-08
Identities = 18/69 (26%), Positives = 25/69 (36%), Gaps = 12/69 (17%)

Query: 104 EIELSKVYVEAGSHGVGVARPLMAETLRVARELAGERGLGADAGIWLGVNEHNARAIRFY 163
+I ++K Y GV L+ A E A E G+ L + N A FY
Sbjct: 94 DIAVAKDY-----RKKGVGTALLH----KAIEWAKENHFC---GLMLETQDINISACHFY 141

Query: 164 ERSGFHIVG 172
+ F I
Sbjct: 142 AKHHFIIGA 150


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_2786  ARGREPRESSOR  27     0.037    Bacterial arginine repressor signature.
		>ARGREPRESSOR#Bacterial arginine repressor signature.

Length = 149

Score = 27.1 bits (60), Expect = 0.037
Identities = 24/131 (18%), Positives = 45/131 (34%), Gaps = 23/131 (17%)

Query: 1 MAKSKAYRAAAEKIDPAKAYTASEAVELARETGSSKFDSTVEVALK-------------- 46
M K + + E I + T E V++ ++ G + +TV +K
Sbjct: 1 MNKGQRHIKIREIITANEIETQDELVDILKKDGYNVTQATVSRDIKELHLVKVPTNNGSY 60

Query: 47 -------LGVDPRKADQMVRGTVILPHGTGKTARVIVFATGPAAEAAIAAGADEVGGDEL 99
+P ++ R + + +IV T P AI A D + +E+
Sbjct: 61 KYSLPADQRFNP--LSKLKRSLMDAFVKIDSASHLIVLKTMPGNAQAIGALMDNLDWEEI 118

Query: 100 IEKVAGGYTSF 110
+ + G T
Sbjct: 119 MGTICGDDTIL 129


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_2789  SECETRNLCASE  31     2e-04    Bacterial translocase SecE signature.
		>SECETRNLCASE#Bacterial translocase SecE signature.

Length = 127

Score = 31.4 bits (71), Expect = 2e-04
Identities = 18/44 (40%), Positives = 27/44 (61%)

Query: 32 VLFIKQVVQELKKVVTPTRKELLTFTGVVLAFVIVMMVIVSLLD 75
V F ++ E++KV+ PTR+E L T +V A VM +I+ LD
Sbjct: 69 VAFAREARTEVRKVIWPTRQETLHTTLIVAAVTAVMSLILWGLD 112


44  CMM_2823  CMM_2865  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias     Product
CMM_2823   1       6        3.047189
CMM_2824   0       8        2.974611
CMM_2825   0       9        2.894785
CMM_2826   1       9        3.074101
CMM_2827   4       10       3.146302
CMM_2828   3       10       2.837897
CMM_2829   2       9        2.338439
CMM_2830   1       10       2.968360
CMM_2831   1       9        2.863716
CMM_2832   1       13       2.858577
CMM_2833   1       12       3.421665
CMM_2834   -2      13       2.962331
CMM_2835   0       12       3.772185
CMM_2836   0       9        3.029345
CMM_2837   0       7        1.247204
CMM_2838   -1      6        0.409923
CMM_2839   0       6        -0.368894
CMM_2840   1       8        -0.381777
CMM_2841   3       18       -4.953748
CMM_2842   4       26       -7.159723
CMM_2843   5       30       -7.148062
CMM_2844   6       20       -6.248820
CMM_2845   7       22       -6.033275
CMM_2846   6       21       -5.159853
CMM_2847   0       7        -1.234716
CMM_2848   -2      7        0.306332
CMM_2849   -2      7        1.244714
CMM_2850   3       9        3.318811
CMM_2851   2       9        2.900524
CMM_2852   3       10       2.676166
CMM_2853   3       7        3.267691
CMM_2854   4       7        3.064409
CMM_2855   4       11       2.887993
CMM_2856   4       10       2.457356
CMM_2857   4       16       2.970872
CMM_2858   4       13       3.041786
CMM_2859   3       11       2.414477
CMM_2860   3       11       1.675876
CMM_2861   2       9        1.715921
CMM_2862   1       9        1.120854
CMM_2863   4       9        1.244415
CMM_2864   3       9        1.405526
CMM_2865   2       10       1.700742
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_2825  ISCHRISMTASE  44     2e-07    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 44.2 bits (104), Expect = 2e-07
Identities = 29/110 (26%), Positives = 44/110 (40%), Gaps = 3/110 (2%)

Query: 130 EIVPEVAPLPGEPVIDKPGKGAFYATDLDLVLRAGGIDRLILTGITTDVCVSTTMREAND 189
+I+ E+AP + V+ K AF T+L ++R G D+LI+TGI + T EA
Sbjct: 107 KIITELAPEDDDLVLTKWRYSAFKRTNLLEMMRKEGRDQLIITGIYAHIGCLVTACEAFM 166

Query: 190 RGYECLVLSDCTGATDPANHDAALRMVTMQGGVFGAVAASGDVLAALAGA 239
+ + D H AL G + +L L A
Sbjct: 167 EDIKAFFVGDAVADFSLEKHQMALEYAA---GRCAFTVMTDSLLDQLQNA 213


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_2831  NUCEPIMERASE  43     5e-07    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 43.2 bits (102), Expect = 5e-07
Identities = 33/182 (18%), Positives = 60/182 (32%), Gaps = 49/182 (26%)

Query: 1 MTIVVTAATGRLGSRIVASLLARG--AAASDVLA----TARRPEALADLAAQGVRTARLE 54
M +VT A G +G + LL G D L + + L LA G + +++
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKID 60

Query: 55 YTDADSVAAAIQPGD-TLVLVSGSEVGQR-------------VPQHTTVIEAAKAAGVGR 100
D + + G V +S + R + ++E + +
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQH 120

Query: 101 ILYTSVLRASTTELFIAGEH-------------------KATEEVLAAS-----GVPVTL 136
+LY AS++ ++ K E++A + G+P T
Sbjct: 121 LLY-----ASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYGLPATG 175

Query: 137 LR 138
LR
Sbjct: 176 LR 177


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
CMM_2833  TCRTETB  30     0.019    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 29.8 bits (67), Expect = 0.019
Identities = 29/157 (18%), Positives = 58/157 (36%), Gaps = 22/157 (14%)

Query: 269 GASAFVAGLGFISLQGMQFIG--RMLGDRMVDRFGQRAIARLGGVLVLVGMGAALAFP-- 324
S G I G + +G +VDR G + +G + V A
Sbjct: 288 QLSTAEIGSVII-FPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLET 346

Query: 325 -------SVVGTIVGFGVAGFGVATLIPAAMQAADELPG---------FKPGTGLTIVGW 368
+V + G ++T++ ++++ + G GTG+ IVG
Sbjct: 347 TSWFMTIIIVFVLGGLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGG 406

Query: 369 LLRLGFLISPPVVGAIADASSLRFGLIFIPAAGLLVL 405
LL + L ++ D S+ + + + +G++V+
Sbjct: 407 LLSIPLL-DQRLLPMEVDQSTYLYSNLLLLFSGIIVI 442


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
CMM_2844  TYPE4SSCAGA  29     0.027    Type IV secretion system CagA exotoxin signature.
		>TYPE4SSCAGA#Type IV secretion system CagA exotoxin signature.

Length = 1147

Score = 28.9 bits (64), Expect = 0.027
Identities = 15/51 (29%), Positives = 31/51 (60%), Gaps = 4/51 (7%)

Query: 139 RREARAELTELGISTLQDVTVPVENLSGGQRQAVAVARAAAFGSKVVVLDE 189
R + +L+++G+S Q++ ++NL+ QAV+ A+A FG+ +D+
Sbjct: 946 RHDKVDDLSKVGLSRNQELAQKIDNLN----QAVSEAKAGFFGNLEQTIDK 992


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_2848  DHBDHDRGNASE  96     9e-26    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 95.5 bits (237), Expect = 9e-26
Identities = 73/249 (29%), Positives = 111/249 (44%), Gaps = 10/249 (4%)

Query: 7 KTALVTGGGSGIGAAISRALAAEGASVVVTDIQLDAAERVVAEIEGAGGTATAFRQDTAK 66
K A +TG GIG A++R LA++GA + D + E+VV+ ++ A AF D
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRD 68

Query: 67 AEDSEAAVAHAVGTYGALHLAVNNAGISAPAADIGDYEISAWDRTRAIDLDGVFYGLRYQ 126
+ + A G + + VN AG+ P I W+ T +++ GVF R
Sbjct: 69 SAAIDEITARIEREMGPIDILVNVAGVLRPGL-IHSLSDEEWEATFSVNSTGVFNASRSV 127

Query: 127 VPAMVEAGGGAIVNMSSVLGSVGFAQNAAYVASKHALVGLTKVAALEYTARGVRTNAVGP 186
M++ G+IV + S V AAY +SK A V TK LE +R N V P
Sbjct: 128 SKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSP 187

Query: 187 GFIDTPLVRSSLSAD---------ALAYLESQHATGRLGTDKEVAALVLFLLSDDASFIS 237
G +T + S + + +L ++ +L ++A VLFL+S A I+
Sbjct: 188 GSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGHIT 247

Query: 238 GSYHLVDGG 246
VDGG
Sbjct: 248 MHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
CMM_2849  RTXTOXINA  31     0.014    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 31.5 bits (71), Expect = 0.014
Identities = 29/108 (26%), Positives = 42/108 (38%), Gaps = 31/108 (28%)

Query: 55 AGRALSRRTLLGGAGAGALAILVAQNAAA----PGAAAAQAASNLPFTAITPVDAAVDQF 110
AG L+ + +LG G G ++AQ AA AAA AS + AI+P+
Sbjct: 271 AGVELTTK-VLGNVGKGISQYIIAQRAAQGLSTSAAAAGLIASAVTL-AISPLS------ 322

Query: 111 TVPTGYRWQPIIRWGDPLFSYADDFDADNQTAKLASR--QFGYNNDYL 156
S AD F N+ + + R + GY+ D L
Sbjct: 323 -----------------FLSIADKFKRANKIEEYSQRFKKLGYDGDSL 353


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
CMM_2850  HELNAPAPROT  28     0.005    Helicobacter neutrophil-activating protein A family ...
		>HELNAPAPROT#Helicobacter neutrophil-activating protein A family signature.

Length = 153

Score = 28.3 bits (63), Expect = 0.005
Identities = 6/53 (11%), Positives = 16/53 (30%), Gaps = 3/53 (5%)

Query: 32 WPRTDPSLPAELVDALESHLGRGRSGVRTAKRAGDD--AGIATARERVSLAKH 82
+ + LV+ + + + A+ D+ A + + K
Sbjct: 93 NETSASEMVQALVNDYKQISSESKFVIGLAEENQDNATADLFVGLIE-EVEKQ 144


45  CMM_2896  CMM_2908  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag   DNBias  CDNBias  %GCBias     Product
CMM_2896   7       34       -9.341969
CMM_2897   7       35       -10.009117
CMM_2898   7       40       -11.276492
CMM_2899   8       48       -12.223314
CMM_2900   7       45       -10.782212
CMM_2901   7       45       -10.219235
CMM_PS_24  1       19       -2.204300
CMM_2902   0       18       -2.108155
CMM_2903   1       11       -1.606605
CMM_2904   1       9        -0.677054
CMM_2905   2       8        0.319524
CMM_2906   2       8        -0.198169
CMM_2907   1       7        -0.373738
CMM_2908   2       5        0.203139
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_2896  PREPILNPTASE  28     0.030    Type IV prepilin cysteine protease (C20) family sig...
		>PREPILNPTASE#Type IV prepilin cysteine protease (C20) family signature.

Length = 290

Score = 27.8 bits (62), Expect = 0.030
Identities = 19/58 (32%), Positives = 28/58 (48%), Gaps = 6/58 (10%)

Query: 17 LALVGAL--VPVLDLVTGLLAVVGAVFGILALTRKRRRNSRPLAITGLALNVVALAAW 72
LA +GA L +V L ++VGA GI + + S+P+ G L A+A W
Sbjct: 219 LAALGAWLGWQALPIVLLLSSLVGAFMGIGLILLRNHHQSKPIPF-GPYL---AIAGW 272


46  CMM_0012  CMM_0017  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias     Product
CMM_0012   1       7        -1.338834   conserved membrane protein
CMM_0013   0       7        -1.398937   putative sortase
CMM_0014   -1      8        -1.196477   para-aminobenzoate synthetase component II
CMM_0015   -2      9        -1.056706   penicillin-binding protein
CMM_0016   -3      9        -1.140459   putative cell division membrane protein
CMM_0017   -3      7        0.131898    putative serine/threonine protein phosphatase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
CMM_0012  60KDINNERMP  28     0.004    60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 27.6 bits (61), Expect = 0.004
Identities = 9/48 (18%), Positives = 21/48 (43%), Gaps = 6/48 (12%)

Query: 12 QERVRDEDAPNP-----VWFKPIMFGFMLVGL-AWLIVYYVSLNAFPI 53
+++ +P + F P++F + + L++YY+ N I
Sbjct: 478 IQKMSPTTVTDPMQQKIMTFMPVIFTVFFLWFPSGLVLYYIVSNLVTI 525


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0015  YERSSTKINASE  33     0.003    Yersinia serine/threonine protein kinase signature.
		>YERSSTKINASE#Yersinia serine/threonine protein kinase signature.

Length = 732

Score = 33.2 bits (75), Expect = 0.003
Identities = 17/43 (39%), Positives = 27/43 (62%), Gaps = 1/43 (2%)

Query: 122 IMGQVLTALEYSHRAGVVHRDIKPGNIMVT-PTGQVKVMDFGI 163
I ++L + +AGVVH DIKPGN++ +G+ V+D G+
Sbjct: 250 IAHRLLDVTNHLAKAGVVHNDIKPGNVVFDRASGEPVVIDLGL 292


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0016  YERSSTKINASE  35     0.001    Yersinia serine/threonine protein kinase signature.
		>YERSSTKINASE#Yersinia serine/threonine protein kinase signature.

Length = 732

Score = 34.7 bits (79), Expect = 0.001
Identities = 44/164 (26%), Positives = 73/164 (44%), Gaps = 22/164 (13%)

Query: 58 FRAEARHAALVNHEGIANVFDYGEEDGSAFLVMELVPGEALSTILE------RERVLSTD 111
++ +H L N G+A V YG A L+M+ V G S L ++ ++++
Sbjct: 184 YKTAGKHPNLANVHGMA-VVPYGNRKEEA-LLMDEVDGWRCSDTLRTLADSWKQGKINSE 241

Query: 112 K---VLDIIAQTALAL--HAAHQAGLVHRDIKPGNLLIT-PDGRVKITDFGIARIADQVP 165
+ IA L + H A +AG+VH DIKPGN++ G + D G+ + + P
Sbjct: 242 AYWGTIKFIAHRLLDVTNHLA-KAGVVHNDIKPGNVVFDRASGEPVVIDLGLHSRSGEQP 300

Query: 166 LTATGQVMGTVQYLSPEQASGH-PASPSTDVYSMGIVAYECLAG 208
T + +PE G+ AS +DV+ + C+ G
Sbjct: 301 KGFTE------SFKAPELGVGNLGASEKSDVFLVVSTLLHCIEG 338


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
CMM_0017  PF06872  32     0.007    EspG protein
		>PF06872#EspG protein

Length = 398

Score = 31.6 bits (71), Expect = 0.007
Identities = 18/51 (35%), Positives = 28/51 (54%), Gaps = 5/51 (9%)

Query: 262 CGPEAEVSIATALRLSCNIPFAQLGIALGSERIAKMAEAFGYGKSIDVPMA 312
C PEA +I +A S N+P + L RI++ A A+ +S+D+P A
Sbjct: 279 CQPEAPTAICSAFYQSFNVP----ALMLTHVRISQ-ASAYNAQRSLDMPNA 324


47  CMM_0084  CMM_0093  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias     Product
CMM_0084   3       25       -7.017651   putative two-component system sensor kinase
CMM_0085   3       26       -7.194950   putative transport protein, RND family
CMM_0086   2       28       -6.978495   putative cytochrome P450
CMM_0087   0       32       -6.290160   putative 3Fe-4S ferredoxin
CMM_0088   0       30       -5.810026   putative ferredoxin reductase
CMM_0089   1       32       -5.787476   putative beta-glucosidase, glycosyl hydrolase
CMM_0090   1       30       -5.729917   putative LacI-family transcriptional regulator
CMM_0091   1       32       -4.962276   beta-glucosidase, glycosyl hydrolase family 3
CMM_0092   0       30       -4.790859   beta-glucosidase, glycosyl hydrolase family 1
CMM_0093   0       30       -4.945054   putative beta-galactosidase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0084  SECFTRNLCASE  29     0.031    Bacterial translocase SecF protein signature.
		>SECFTRNLCASE#Bacterial translocase SecF protein signature.

Length = 333

Score = 28.7 bits (64), Expect = 0.031
Identities = 10/40 (25%), Positives = 18/40 (45%), Gaps = 1/40 (2%)

Query: 92 DIDLMKAAWNSFFVSAVTAVAVVFFSTLAGFAFAKLRFRG 131
+ D + W +F + V +A V + G F + F+G
Sbjct: 13 NFDFFRWQWATFGAAIVMMIASVILPLVIGLNFG-IDFKG 51


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
CMM_0087  HTHTETR  67     9e-16    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 67.3 bits (164), Expect = 9e-16
Identities = 30/181 (16%), Positives = 64/181 (35%), Gaps = 12/181 (6%)

Query: 8 RGPYRKGVERRREIVAAAAQLFSESGYTHASMRELAKRVGLSQALLLHYFSDKEDLLVEV 67
R ++ E R+ I+ A +LFS+ G + S+ E+AK G+++ + +F DK DL E+
Sbjct: 3 RKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEI 62

Query: 68 LNLRDASVAEYLADIADSDIAT-----RSRKVARHAAEHEGLTSLYIALSAEAIDPEHPA 122
L ++++ E + R + + +
Sbjct: 63 WELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGE 122

Query: 123 HEYFADHYRSAQEQTR-------EPGADAGVVPAGVSPELVTALGIAVMDGLQVQRQYRP 175
R+ ++ + +A ++PA + + + GL + P
Sbjct: 123 MAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLFAP 182

Query: 176 D 176

Sbjct: 183 Q 183


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
CMM_0089  HTHTETR  28     0.036    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 28.4 bits (63), Expect = 0.036
Identities = 18/130 (13%), Positives = 40/130 (30%), Gaps = 4/130 (3%)

Query: 11 TLRDIARIAGVSVPTVSKVLNGRGDVSAETRERVQE---ATARVGYRRMPSAAVDALIAE 67
+L +IA+ AGV+ + + D+ +E E + + P + L
Sbjct: 33 SLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIGELELEYQAKFPGDPLSVLREI 92

Query: 68 QLVDLVLPAVGDSWASALIGGVERVAAQNSLDLVIIMVRAGESQGRSWVERLVDH-RSRG 126
+ L + + + + +V R + +E+ + H
Sbjct: 93 LIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQAQRNLCLESYDRIEQTLKHCIEAK 152

Query: 127 ALIAVVQPTA 136
L A +
Sbjct: 153 MLPADLMTRR 162


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
CMM_0091  HTHFIS  66     1e-14    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 65.6 bits (160), Expect = 1e-14
Identities = 36/164 (21%), Positives = 63/164 (38%), Gaps = 5/164 (3%)

Query: 4 VVLVDDQAMIRAGLRGILEDAGIRVIGEAADGRSAFAVIRGSRPDVVLMDLRMPILDGVG 63
+++ DD A IR L L AG V ++ + + I D+V+ D+ MP +
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 64 ATAALRA-DPDLNAVRILVLTTFDGDEEVLAALRAGADGFLAKSADSESLIAAVEAVAKG 122
++ PDL +LV++ + + A GA +L K D LI +
Sbjct: 65 LLPRIKKARPDL---PVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121

Query: 123 DTSLSAGAAKTVVSDLRRRGRSATDVELIARTATLTHRETDVVL 166
+ + GRSA E+ A L + +++
Sbjct: 122 PKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMI 165


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0093  ACRIFLAVINRP  46     6e-07    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 45.6 bits (108), Expect = 6e-07
Identities = 40/254 (15%), Positives = 90/254 (35%), Gaps = 33/254 (12%)

Query: 177 TELVGALVAFLVLLLTFGSLVAAGANMLGALVGVGVGVLGILAFSAIAPIG---SVTPIL 233
T ++ FLV+ L ++ A + + V V+ + F+ +A G + +
Sbjct: 343 TLFEAIMLVFLVMYLFLQNMRAT------LIPTIAVPVVLLGTFAILAAFGYSINTLTMF 396

Query: 234 AVMLGLAVGIDYCLFVLSRFRAELRSGRL-VQDAIGRATATAGSSVVFAGATVIIALVGL 292
++L + + +D + V+ + +L ++A ++ + ++V + + +
Sbjct: 397 GMVLAIGLLVDDAIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPM 456

Query: 293 TVVGI---PFLGEMGIAAAFAVAVAVLMSLTLLPALLSWM----GRRALGRRERASTRSA 345
G + I A+A++VL++L L PAL + + +
Sbjct: 457 AFFGGSTGAIYRQFSITIVSAMALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFN 516

Query: 346 GGSPRWTT------TWIGAVLRRPVIATFAVVGGLAVVALPLLG----------MQTSLV 389
I R ++ +V G+ V+ L L T +
Sbjct: 517 TTFDHSVNHYTNSVGKILGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQ 576

Query: 390 IPGGEDPDSTQRAA 403
+P G + TQ+
Sbjct: 577 LPAGATQERTQKVL 590



Score = 42.9 bits (101), Expect = 4e-06
Identities = 39/188 (20%), Positives = 76/188 (40%), Gaps = 25/188 (13%)

Query: 498 STAIALDSDEQLRSALIAYVALIVGLSFLLLVLLFRSFLVPLIATGGFLLSLGAALGSTV 557
+ + L E +++ A + +V L L + R+ L+P IA + L LG+
Sbjct: 330 TPFVQLSIHEVVKTLFEAIM--LVFLVMYLFLQNMRATLIPTIA---VPVVL---LGTFA 381

Query: 558 AVFQWGWLDPIVQAPQGNPLLSLLPIIVTGILFGLAMDYQVFLVSRIHEAHR-HGTSPRE 616
+ +G+ L++ +++ GL +D + +V + P+E
Sbjct: 382 ILAAFGY---------SINTLTMFGMVLA---IGLLVDDAIVVVENVERVMMEDKLPPKE 429

Query: 617 AVRAGFQQSAPVVVAAAAIMAAVFFGFALSPSS---LVGSIALALAVGVLADALLVRMIL 673
A Q +V A +++AVF A S + ++ + + ++LV +IL
Sbjct: 430 ATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL-SVLVALIL 488

Query: 674 VPALLTLL 681
PAL L
Sbjct: 489 TPALCATL 496



Score = 33.3 bits (76), Expect = 0.004
Identities = 33/178 (18%), Positives = 70/178 (39%), Gaps = 22/178 (12%)

Query: 511 SALIAYVALIVGLSFLLLVLLFRSFLVPLIATGGFLLSLG-AALGSTVAVFQWGWLDPIV 569
+ A VA+ + FL L L+ S+ +P+ +L + +G +A +
Sbjct: 870 NQAPALVAISFVVVFLCLAALYESWSIPVS----VMLVVPLGIVGVLLAATLFN------ 919

Query: 570 QAPQGNPLLSLLPIIVTGILFGLAMDYQVFLVSRIHEAHRH-GTSPREAV-RAGFQQSAP 627
Q N + ++ ++ GL+ + +V + G EA A + P
Sbjct: 920 ---QKNDVYFMVGLLT---TIGLSAKNAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRP 973

Query: 628 VVV-AAAAIMAAVFFGFALSPSSLVGSIALALAV-GVLADALLVRMILVPALLTLLGR 683
+++ + A I+ + + S + A+ + V G + A L+ + VP ++ R
Sbjct: 974 ILMTSLAFILGVLPLAISNGAGSGAQN-AVGIGVMGGMVSATLLAIFFVPVFFVVIRR 1030


48  CMM_0117  CMM_0125  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias     Product
CMM_0117   -2      8        1.196048    putative RNA polymerase ECF-subfamily sigma
CMM_0118   -3      11       0.913788    conserved hypothetical protein
CMM_0119   -2      9        1.406959    putative sortase
CMM_0120   -1      10       1.477719    hypothetical protein
CMM_0121   0       10       1.616121    hypothetical protein
CMM_0122   0       10       2.007300    hypothetical protein
CMM_0123   0       7        1.914456    putative pyroglutamylpeptidase
CMM_0124   1       7        2.873239    conserved hypothetical protein
CMM_0125   3       10       3.544443    hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
CMM_0117  TCRTETB  119    1e-31    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 119 bits (301), Expect = 1e-31
Identities = 89/406 (21%), Positives = 155/406 (38%), Gaps = 18/406 (4%)

Query: 27 LCLAFFVEMVDNTLLSIALPTIGRALDSGTTGLQWVTSAYSLTFGGLLLTAGSAADRFGR 86
LC+ F +++ +L+++LP I + WV +A+ LTF G +D+ G
Sbjct: 19 LCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGI 78

Query: 87 RRVLLIGLAAFGLISV-AVVLVTDIGQLIALRAALGAAAAAMAPVTMSLIFRLFDDEKLR 145
+R+LL G+ SV V + LI R GA AAA + M ++ R E R
Sbjct: 79 KRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKEN-R 137

Query: 146 MRSITIVMVVGMSGFVLGPLLGGSILGAVSWQWLLVINAPLALLVWIGVRLGIPADRRDD 205
++ ++ + G +GP +GG I + W +LL+I + V ++L R
Sbjct: 138 GKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPMITIITVPFLMKLLKKEVRIKG 197

Query: 206 LTSERLDLPGTVLTVAAIGLGCYTLTSGVERGWLAPVTLACALGAVAAVAGFVLRERRTA 265
D+ G +L I TS + + FV R+
Sbjct: 198 ----HFDIKGIILMSVGIVFFMLFTTSYS---ISFLIVSVLSFLI------FVKHIRKVT 244

Query: 266 SPMIDLAIFRAGPVRGAALTQLGASVAFASILFGLILHFQYAYGWSPMRAGLANL-PIIV 324
P +D + + P L A + + + + S G + P +
Sbjct: 245 DPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTM 304

Query: 325 TMIAASPIAERLASRLGHRMACLVGTGFLVGGLVGLAWAVEHGYLAIMASMIVFTIG-LR 383
++I I L R G +G FL + ++ +E M +IVF +G L
Sbjct: 305 SVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLE-TTSWFMTIIIVFVLGGLS 363

Query: 384 TIMTICAVALVEAMPANRTSIGAALNDTAQELGTSLGTAVVGTLIA 429
T+ + + ++ G +L + L G A+VG L++
Sbjct: 364 FTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLS 409


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0118  TETREPRESSOR  54     5e-11    Tetracycline repressor protein signature.
		>TETREPRESSOR#Tetracycline repressor protein signature.

Length = 218

Score = 54.1 bits (130), Expect = 5e-11
Identities = 46/224 (20%), Positives = 86/224 (38%), Gaps = 39/224 (17%)

Query: 28 SLDTVLAEAIGILDESGERALTFRALAARLGGGVASIYWYVASRDELLEKVTEEVMGRVL 87
+ ++V+ A+ +L+E+G LT R LA +LG ++YW+V ++ LL+ + E++ R
Sbjct: 5 NRESVIDAALELLNETGIDGLTTRKLAQKLGIEQPTLYWHVKNKRALLDALAVEILARHH 64

Query: 88 ADTEPLTHGSDPVDNVRAVALALFDELV-----------RRPWFGQYMLRNNGLQPNSML 136
+ P G +R A++ L+ RP
Sbjct: 65 DYSLPAA-GESWQSFLRNNAMSFRRALLRYRDGAKVHLGTRP------------DEKQYD 111

Query: 137 MYERIGQQLLGLDLTSRQRFHAVSSIVSYVVGVAADLAEPPPQEFLDSGLDREGFLGTLA 196
E + + + R +A+S++ + +G L+++ L
Sbjct: 112 TVETQLRFMTENGFSLRDGLYAISAVSHFTLGAV---------------LEQQEHTAALT 156

Query: 197 DRWRELDPDEYPFAHDAAGEMATHDDLDVFRSGLDLLLAGVRQQ 240
DR D + P +A M + D F GL+ L+ G Q
Sbjct: 157 DRPAAPDENLPPLLREALQIMDSDDGEQAFLHGLESLIRGFEVQ 200


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
CMM_0119  HTHTETR  65     5e-15    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 65.0 bits (158), Expect = 5e-15
Identities = 32/131 (24%), Positives = 56/131 (42%), Gaps = 5/131 (3%)

Query: 26 REQILDAAAALFAEHGLSGTSTRAIAERVGIRQASLYYHFAGKDDLLVELLTRSVRPSLD 85
R+ ILD A LF++ G+S TS IA+ G+ + ++Y+HF K DL E+ S +
Sbjct: 13 RQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIGE 72

Query: 86 VVRGIEDLVPDRATAAAALHALVMVDFRTLADTPHN---IGTLYLLPEVQGERYDGFRTQ 142
+ E + L +++ + + ++ E GE + Q
Sbjct: 73 LEL--EYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQAQ 130

Query: 143 RRELQDAYGRI 153
R ++Y RI
Sbjct: 131 RNLCLESYDRI 141


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
CMM_0120  RTXTOXIND  31     0.025    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 31.3 bits (71), Expect = 0.025
Identities = 7/28 (25%), Positives = 12/28 (42%)

Query: 1195 VVGAPVAGRVLDVHIRPGEQVAPGQVLA 1222
+ V ++ ++ GE V G VL
Sbjct: 98 EIKPIENSIVKEIIVKEGESVRKGDVLL 125


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0124  NUCEPIMERASE  36     7e-04    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 35.9 bits (83), Expect = 7e-04
Identities = 38/220 (17%), Positives = 69/220 (31%), Gaps = 52/220 (23%)

Query: 720 TVLLTGANGYLGRFLAIDWLERLAATGGTLVCIVRGADDADARRRLEAAFAADPAFARRF 779
L+TGA G++G ++ +RL G V G D+ + D + +
Sbjct: 2 KYLVTGAAGFIGFHVS----KRLLEAGHQ----VVGIDNLNDYY--------DVSLKQAR 45

Query: 780 AELSGSLEVLAGDVSEHRLGLDDERWIDLAAR---VDLVAHAAALVNHVLPYS-----AL 831
EL H++ L D + + V + + + YS A
Sbjct: 46 LELLAQ-----PGFQFHKIDLADREGMTDLFASGHFERVFISPHRLA--VRYSLENPHAY 98

Query: 832 FGPNVVGTAEAIRLAIAAGSVPVTFVSSVAVAGGARPGATADAEPSAPGALDEHADIRAT 891
N+ G + + + SS +V G R + +
Sbjct: 99 ADSNLTGFLNILEGCRHNKIQHLLYASSSSVYGLNRKMPFSTDDSV-------------- 144

Query: 892 IPEWAVGDEYANGYGASKWASEVLLREAHEHHGVPVAVFR 931
D + Y A+K A+E++ +G+P R
Sbjct: 145 -------DHPVSLYAATKKANELMAHTYSHLYGLPATGLR 177


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0125  ENTSNTHTASED  88     3e-23    Enterobactin synthetase component D signature.
		>ENTSNTHTASED#Enterobactin synthetase component D signature.

Length = 234

Score = 87.8 bits (217), Expect = 3e-23
Identities = 38/192 (19%), Positives = 65/192 (33%), Gaps = 13/192 (6%)

Query: 36 ESDAVATALPGRRAEFVTGRVLARRALAALGRRPGSIPVARDGAPVWPAGIVGSITHCVG 95
D + +A R+AE + GR+ A AL +G R P+WP G+ GSI+HC
Sbjct: 35 HHDRLRSAGRKRKAEHLAGRIAAVHALREVGVRTVPGM-GDKRQPLWPDGLFGSISHCAT 93

Query: 96 LRACAVGRRDEHAGIGIDATPARPLPGGVLARVADLGSAPVAAGLDALRATGVEAPGSVL 155
+ R+ IGID +A + ++
Sbjct: 94 TALAVISRQ----RIGIDIEKIMSQHTAT--ELAPSIIDSDERQILQASLLPFPLALTLA 147

Query: 156 FAAAEAVAKARTSAQGGWHGIDGAEIVLHPDGSFAVR-----ARRGPDFTGTGRWAVAGG 210
F+A E+V KA + G + A++ ++ A + T W
Sbjct: 148 FSAKESVYKAFSDRV-TLPGFNSAKVTSLTATHISLHLLPAFAATMAERTVRTEWFQRDN 206

Query: 211 LALAGIALDEHW 222
+ ++
Sbjct: 207 SVITLVSAITRV 218


49  CMM_0146  CMM_0151  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
CMM_0146   3       9        0.925404   putative membrane protein
CMM_0147   2       9        0.360096   conserved hypothetical protein
CMM_0148   2       8        0.161087   hypothetical protein
CMM_0149   2       7        -0.144116  putative transcriptional regulator, LacI-family
CMM_0150   1       6        0.497333   conserved hypothetical protein
CMM_0151   1       11       -0.256523  conserved hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
CMM_0146  PF06776  27     0.048    Invasion associated locus B
		>PF06776#Invasion associated locus B

Length = 214

Score = 27.2 bits (60), Expect = 0.048
Identities = 19/61 (31%), Positives = 24/61 (39%), Gaps = 5/61 (8%)

Query: 1 MPRHRAAPDAPARPVPH---GRRAARRPVRRRRTLLVTAIAVTLALAGAGTAYAVFADRA 57
+P +A PA P RR ARR R AIA L+ + A A A R+
Sbjct: 21 VPALKAIQMGPAELSPMLASCRRLARRNGARLMLAGAMAIA--LSFGWSDRADAQGAVRS 78

Query: 58 T 58

Sbjct: 79 V 79


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
CMM_0148  cloacin  34     5e-04    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 34.3 bits (78), Expect = 5e-04
Identities = 27/81 (33%), Positives = 40/81 (49%), Gaps = 15/81 (18%)

Query: 183 GTGQDPDPSTGGGTGTGSGSGSG-SGGGSGSGTS------------PAGSAPAADAPRDL 229
G G GGG+G G+G G+G SGGGSG+G + PA S P A
Sbjct: 47 GGGSGSGIHWGGGSGHGNGGGNGNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLA-- 104

Query: 230 LPVTGRDIASALAGALLALLG 250
+ ++ +++A+A + AL G
Sbjct: 105 VSISAGALSAAIADIMAALKG 125


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
CMM_0150  INTIMIN  39     2e-04    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 38.5 bits (89), Expect = 2e-04
Identities = 75/408 (18%), Positives = 127/408 (31%), Gaps = 40/408 (9%)

Query: 400 GLFLVGTSVTCTGTHVVTPAEAMSGTLVNTASARGVNTLLASVTSNSSSVTAQIVAPAPA 459
G S + + PA G+ V +AR + +SN+ +T +++
Sbjct: 497 GQIQHSGSQSAQDYQAILPAYVQGGSNVYKVTARAYDR--NGNSSNNVLLTITVLSNGQV 554

Query: 460 LALTKTGTLTDSNGNGRADVGERIAYSFVAQNSG----NVSLYTVAVADPRVTGISPAST 515
+ T + +AD E I Y+ + +G NV + V+ V + A+T
Sbjct: 555 VDQVGVTDFTADKTSAKADGTEAITYTATVKKNGVAQANVPVSFNIVSGTAVLSANSANT 614

Query: 516 TLAPGASQTFTSAA--YTVTQADVDAATPIVNT-ATVSGKTFAGVAAPTASSSTSTPVSG 572
+ A+ T S V A T +N A + + T+ +G
Sbjct: 615 NGSGKATVTLKSDKPGQVVVSAKTAEMTSALNANAVIFVDQTKASITEIKADKTTAVANG 674

Query: 573 SAALTLTKGATLTGGSKAGSTVAYSFSIRNTGTVPLTGVALTDPLPGLSAVTYTWPGTAG 632
A+T T + V + T L+ G + VT T T G
Sbjct: 675 QDAITYTVKVMKGDKPVSNQEVTF-----TTTLGKLSNSTEKTDTNGYAKVTLT-STTPG 728

Query: 633 TLAAGATATATASYTVRQADVDAGQIANTATVRGASSGGTQAQATATRTLTLDRTATLAF 692
+ +A + DV A ++ T+ L T L +
Sbjct: 729 K------SLVSARVSDVAVDVKAPEVEFFTTLTIDDGNIEIVGTGVKGKLP---TVWLQY 779

Query: 693 TKTATPGNVPAAGGVVTYAFRLQNTGSTTLTSVSIADPRAGVSALSYTWPGTAGTLAPGQ 752
N+ A+GG Y +R N ++ D +G L T++
Sbjct: 780 G----QVNLKASGGNGKYTWRSANPA------IASVDASSGQVTLKEK---GTTTISVIS 826

Query: 753 VVTATATYTATTADVAAGSIVNTATATATAPTGQVSGTATATVLAVAD 800
TATYT T + IV + T + L +
Sbjct: 827 SDNQTATYTIATPN---SLIVPNMSKRVTYNDAVNTCKNFGGKLPSSQ 871


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0151  SHAPEPROTEIN  138    2e-38    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 138 bits (350), Expect = 2e-38
Identities = 81/371 (21%), Positives = 147/371 (39%), Gaps = 69/371 (18%)

Query: 2 ARAVGIDLGTTNSVVSVLEGG----EPTVIA-NAEGARTTPSVVAFTKDGEVLVGETAKR 56
+ + IDLGT N+++ V G EP+V+A + A + SV A VG AK+
Sbjct: 10 SNDLSIDLGTANTLIYVKGQGIVLNEPSVVAIRQDRAGSPKSVAA--------VGHDAKQ 61

Query: 57 QNVTNVDRTISSVKRHMGTDWTVGIDDKKYTSQELSARILGKLKRDAEQYLGDSVTDAVI 116
I++++ G+ + ++++ + ++ ++ ++
Sbjct: 62 MLGRT-PGNIAAIR-----PMKDGVIADFFVTEKMLQHFIKQVHSNS---FMRPSPRVLV 112

Query: 117 TVPAYFNDAERQATKEAGEIAGLNVLRIINEPTAAALAYGLDRGKEDELILVFDLGGGTF 176
VP ER+A +E+ + AG + +I EP AAA+ GL E +V D+GGGT
Sbjct: 113 CVPVGATQVERRAIRESAQGAGAREVFLIEEPMAAAIGAGLPV-SEATGSMVVDIGGGTT 171

Query: 177 DVSLLEVGKDDDFSTIQVRSTAGDNRLGGDDWDQRIVDHLVKRFKESTGVDVSNDKIAKQ 236
+V+++ + + R+GGD +D+ I++++ + + G
Sbjct: 172 EVAVISLNG---------VVYSSSVRIGGDRFDEAIINYVRRNYGSLIG----------- 211

Query: 237 RLKEAAEQAKKELSSS----TSTSIQLPYLSLTENGPANLDETLTRAKFEELTNDL---- 288
+ AE+ K E+ S+ I++ +L E P + E L L
Sbjct: 212 --EATAERIKHEIGSAYPGDEVREIEVRGRNLAEGVPRGFTLN-SNEILEALQEPLTGIV 268

Query: 289 ------LERTRKPFEDVIREAGVSVGDVAHVVLVGGSTRMPAVVDLVKKLTGGKEPNKGV 342
LE+ I E G +VL GG + + L+ + T G
Sbjct: 269 SAVMVALEQCPPELASDISERG--------MVLTGGGALLRNLDRLLMEET-GIPVVVAE 319

Query: 343 NPDEVVAVGAA 353
+P VA G
Sbjct: 320 DPLTCVARGGG 330


50  CMM_0165  CMM_0171  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
CMM_0165   -2      7        1.729537   conserved hypothetical protein
CMM_0166   -3      6        1.253736   putative Mn2+/Zn2+ ABC-type
CMM_0167   -2      7        1.987309   conserved hypothetical protein
CMM_0167a  -1      7        -0.938877  NAD-dependent aldehyde dehydrogenase
CMM_0168   -1      7        -0.665438  hypothetical protein
CMM_0169   1       8        -0.868219  putative glycosyltransferase
CMM_0170   1       7        -0.408931  hypothetical protein
CMM_0171   0       9        0.151273   O-acetylhomoserine (thiol)-lyase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
CMM_0165  PF05272  29     0.030    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 28.9 bits (64), Expect = 0.030
Identities = 12/28 (42%), Positives = 14/28 (50%), Gaps = 4/28 (14%)

Query: 51 RPGC----VTALVGPNGSGKSTLLRTMA 74
PGC L G G GKSTL+ T+
Sbjct: 590 EPGCKFDYSVVLEGTGGIGKSTLINTLV 617


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0166  FERRIBNDNGPP  154    5e-47    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 154 bits (391), Expect = 5e-47
Identities = 80/329 (24%), Positives = 132/329 (40%), Gaps = 40/329 (12%)

Query: 1 MITRRRTLAMTALAAATALTLTACGTTEEASTGAGTTPAGEQITLTDGTGAEVTLDGPAT 60
+I+RRR L AL+ A A
Sbjct: 6 LISRRRLLTAMALSPL-------LWQMNTAHAAAID----------------------PN 36

Query: 61 KVVGTEWNVVENLVSLGVDPVGVADVAGYSAWSSAVPLVNEPADIGTRGEPSVETIASLA 120
++V EW VE L++LG+ P GVAD Y W S PL + D+G R EP++E + +
Sbjct: 37 RIVALEWLPVELLLALGIVPYGVADTINYRLWVSEPPLPDSVIDVGLRTEPNLELLTEMK 96

Query: 121 PDLIVATTDLPADAITQLKAIAPVLQVNSADGSKQIQQSEDNLELIAKATGTEDKATEVI 180
P +V + + L IAP N +DG + + + +L +A + A +
Sbjct: 97 PSFMVWSAGY-GPSPEMLARIAPGRGFNFSDGKQPLAMARKSLTEMADLLNLQSAAETHL 155

Query: 181 GAYDQAVTDAKAKLDAAGLAGSKFLFADAYVDAGAVAIRPFGTGSLIGDVTTELGLENAW 240
Y+ + K + G + L +D + + FG SL ++ E G+ NAW
Sbjct: 156 AQYEDFIRSMKPRFVKRGA---RPLLLTTLIDPRHMLV--FGPNSLFQEILDEYGIPNAW 210

Query: 241 TGEVDPAYGLGSTDVEGLTTVGDVQFLYNSNSTQGDDPFASTLAGNAVWQSLPFVTAGDV 300
GE + +G + ++ L DV L + L +WQ++PFV AG
Sbjct: 211 QGETN-FWGSTAVSIDRLAAYKDVDVLCFD---HDNSKDMDALMATPLWQAMPFVRAGRF 266

Query: 301 HRMPDGIWAFGGPASMTAYAKAVSDLLAG 329
R+P +W +G S + + + + + G
Sbjct: 267 QRVPA-VWFYGATLSAMHFVRVLDNAIGG 294


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
CMM_0169  HTHFIS  51     8e-10    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 51.0 bits (122), Expect = 8e-10
Identities = 26/99 (26%), Positives = 40/99 (40%), Gaps = 4/99 (4%)

Query: 9 RIRAVVVDDAVLLREGLARVLVEAGIDVVAQHADAAGFLAALADDAPDVVVMDVRMPPTF 68
+V DD +R L + L AG DV ++AA +A D+VV DV MP
Sbjct: 3 GATILVADDDAAIRTVLNQALSRAGYDVRIT-SNAATLWRWIAAGDGDLVVTDVVMPD-- 59

Query: 69 SDEGIRATVEARRRVPGIGVLLLSQYVEAAYAEEVFRSG 107
+ ++ P + VL++S A + G
Sbjct: 60 -ENAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKG 97


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0171  SACTRNSFRASE  38     6e-06    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 37.6 bits (87), Expect = 6e-06
Identities = 18/80 (22%), Positives = 37/80 (46%), Gaps = 7/80 (8%)

Query: 59 AVADGALVGFAHVLEI-DGLAHLEQVSVPPEHGRRGRGRALVEASADEARRRGHRRITLR 117
+ +G + +G A +E ++V ++ ++G G AL+ + + A+ + L
Sbjct: 70 YYLENNCIGRIKIRSNWNGYALIEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLE 129

Query: 118 TFADVPWNATA---YARAGF 134
T D+ N +A YA+ F
Sbjct: 130 T-QDI--NISACHFYAKHHF 146


51  CMM_0186  CMM_0193  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
CMM_0186   2       13       1.574291   hypothetical protein
CMM_0187   4       13       1.382353   putative glycosyl hydrolase, family 2
CMM_0188   3       12       1.006354   putative dehydrogenase/oxidoreductase
CMM_0189   2       10       2.312886   putative drug exporter, RND family
CMM_0190   2       8        3.021138   putative transcriptional regulator, TetR-family
CMM_0191   -2      7        0.720861   putative membrane protein
CMM_0192   -2      7        -0.612336  3-hydroxyacyl-Coenzyme A dehydrogenase
CMM_0193   -1      7        -1.706442  conserved hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
CMM_0186  SALSPVBPROT  30     0.027    Salmonella virulence plasmid 65kDa B protein signature.
		>SALSPVBPROT#Salmonella virulence plasmid 65kDa B protein signature.

Length = 591

Score = 29.7 bits (66), Expect = 0.027
Identities = 25/75 (33%), Positives = 33/75 (44%), Gaps = 12/75 (16%)

Query: 276 PEIVASEEEARRVVDASRHPTDEQLRA--------IALGWSSDLETD----LTGLDLDRP 323
PE A E R ++ P+DE+ + IA G SS ETD GL LD+P
Sbjct: 419 PETQAKETLLSRDYLSTNEPSDEEFKNAMSVYINDIAEGLSSLPETDHRVVYRGLKLDKP 478

Query: 324 VDRGVFGEHVSHGTI 338
V E+ + G I
Sbjct: 479 ALSDVLKEYTTIGNI 493


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
CMM_0187  HTHTETR  50     9e-10    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 50.0 bits (119), Expect = 9e-10
Identities = 23/96 (23%), Positives = 45/96 (46%), Gaps = 2/96 (2%)

Query: 19 RERRRMATTTEISEAALALFEQRGMAATTIHDIAQAAGVSDRTCFRYFPSKEESVLTLHP 78
++ T I + AL LF Q+G+++T++ +IA+AAGV+ + +F K + +
Sbjct: 5 TKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWE 64

Query: 79 VFDAPLDAWLADV--DRGSAPLPQLEAVYERVLASL 112
+ ++ + + PL L + VL S
Sbjct: 65 LSESNIGELELEYQAKFPGDPLSVLREILIHVLEST 100


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
CMM_0188  TCRTETB  138    8e-38    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 138 bits (348), Expect = 8e-38
Identities = 92/409 (22%), Positives = 186/409 (45%), Gaps = 20/409 (4%)

Query: 50 LVVAAFVVILNETIMSVALPSLMADLDITTATAQWLTTGFMLTMAVVIPATGFILQRFST 109
L + +F +LNE +++V+LP + D + A+ W+ T FMLT ++ G + +
Sbjct: 19 LCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGI 78

Query: 110 RQVFGAAMTLFSIGTLIAAIA-PGFGILLVGRIVQASGTAIMMPLLFTTVLNLVPAARRG 168
+++ + + G++I + F +L++ R +Q +G A L+ V +P RG
Sbjct: 79 KRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRG 138

Query: 169 RLMGVISIVIAVAPAIGPTVSGLILSSLSWRWMFWIVLPIALVALTLGLWKITNLTTPRK 228
+ G+I ++A+ +GP + G+I + W ++ ++P+ + L K+ K
Sbjct: 139 KAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLL--LIPMITIITVPFLMKLLKKEVRIK 196

Query: 229 LPFDILSVVLSTLAFGGLIFGLSSLGESAEGDAPLPLWIPITVGVLALAAFITRQIALQR 288
FDI ++L ++ + +S + V VL+ F+ ++
Sbjct: 197 GHFDIKGIILMSVGIVFFMLFTTSY-----------SISFLIVSVLSFLIFVKHI---RK 242

Query: 289 QDRALMDLRTFRSRPFVVAIIMVSVSMMALFGSLIVLPLYLQNVLELGTLETG-LLLLPG 347
+D ++ PF++ ++ + + G + ++P +++V +L T E G +++ PG
Sbjct: 243 VTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPG 302

Query: 348 GALMAILSPIVGRLFDRVGPRPLVIPGAIIVSIALWGMTTMLHEGTSIGWVIAVHLVLNA 407
+ I I G L DR GP ++ G +S++ + L E TS I + VL
Sbjct: 303 TMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTA-SFLLETTSWFMTIIIVFVL-G 360

Query: 408 GLAFMFTPLLTSALGSLPPRLYSHGSATVSTMQQLAGAAGTALFVTVLT 456
GL+F T + T SL + G + ++ L+ G A+ +L+
Sbjct: 361 GLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLS 409


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0190  TETREPRESSOR  41     9e-07    Tetracycline repressor protein signature.
		>TETREPRESSOR#Tetracycline repressor protein signature.

Length = 218

Score = 41.1 bits (96), Expect = 9e-07
Identities = 17/44 (38%), Positives = 29/44 (65%)

Query: 4 AGLDAATVTEAGAALADEIGLAGLSMGAVAERLGVKTPSLYKHV 47
A L+ +V +A L +E G+ GL+ +A++LG++ P+LY HV
Sbjct: 2 ARLNRESVIDAALELLNETGIDGLTTRKLAQKLGIEQPTLYWHV 45


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
CMM_0193  HTHTETR  65     6e-15    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 65.0 bits (158), Expect = 6e-15
Identities = 33/191 (17%), Positives = 62/191 (32%), Gaps = 15/191 (7%)

Query: 49 GVRRRAEIVAAAVRVFGVRGYGAATIKEIADEVGVSPAAVLRYFR-KEELLTEVL-RQWD 106
R I+ A+R+F +G + ++ EIA GV+ A+ +F+ K +L +E+
Sbjct: 9 AQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSES 68

Query: 107 RQQPFVSEAAPGLPA---------LRAFVDLMRYHVEHRGFLELYLTFATETSDATHPAH 157
E P L ++ R +E+ +
Sbjct: 69 NIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMA-VVQ 127

Query: 158 EYMRGRYARTIAQIRRRIGEAVALGQVPPMDDATLDYESACFLAILDGLEIQWIHNP-SL 216
+ R + +I + + + +P + GL W+ P S
Sbjct: 128 QAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAII--MRGYISGLMENWLFAPQSF 185

Query: 217 DLPALVGEYVE 227
DL +YV
Sbjct: 186 DLKKEARDYVA 196


52  CMM_0238  CMM_0245  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
CMM_0238   -3      7        1.180041   hypothetical protein
CMM_0239   0       7        1.696816   putative MFS permease
CMM_0240   2       10       2.492569   conserved hypothetical protein
CMM_0241   11      46       -8.316713  putative transcriptional regulator, MerR family
CMM_0242   10      40       -7.123587  conserved hypothetical protein
CMM_0243   10      40       -6.700904  putative membrane protein involved in chromosome
CMM_0244   10      38       -6.428382  putative membrane protein involved in chromosome
CMM_0245   10      35       -5.773936  putative acetyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
CMM_0238  TCRTETB  71     2e-15    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 70.7 bits (173), Expect = 2e-15
Identities = 47/182 (25%), Positives = 84/182 (46%), Gaps = 3/182 (1%)

Query: 23 LLTVALIVGISPFATDLYIPGLPAIAADLDASTASVQLTLTAFLVAFAVGQLVIGPISDG 82
L+ + ++ S + LP IA D + AS TAF++ F++G V G +SD
Sbjct: 16 LIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQ 75

Query: 83 AGRRRILLAGTALFTLASIVCAIS-PDATTLILARVVQGLGGAAAAVSARAMVGDASTGA 141
G +R+LL G + S++ + + LI+AR +QG G AA +V
Sbjct: 76 LGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKE 135

Query: 142 LRSRLFATLAVVNSVGPVVAPLVGGVVLTVWSWRAAFVVLAALGLALTIAAARLLPETLV 201
R + F + + ++G V P +GG++ W ++++L + +T+ L + V
Sbjct: 136 NRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHW--SYLLLIPMITIITVPFLMKLLKKEV 193

Query: 202 RT 203
R
Sbjct: 194 RI 195


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0240  DHBDHDRGNASE  90     2e-23    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 90.1 bits (223), Expect = 2e-23
Identities = 55/202 (27%), Positives = 74/202 (36%), Gaps = 13/202 (6%)

Query: 5 LITGCSTGLGRAFAVEALERGHDVVVTARDAANAQDLADTYPEHALA-----LDLDVTDP 59
ITG + G+G A A +G + A D + A A DV D
Sbjct: 12 FITGAAQGIGEAVARTLASQGAHI--AAVDYNPEKLEKVVSSLKAEARHAEAFPADVRDS 69

Query: 60 AQVSLAVDEATARFGGVDVLVNNAGYGYRAAVEEGDDEDVARLFDTQFHGSVRMIKAVLP 119
A + G +D+LVN AG + DE+ F G ++V
Sbjct: 70 AAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVSK 129

Query: 120 GMRERRSGTIVNLSSIGAARTGAGSGYYGAVKAAIEQMTMALRTELEPLGIVATVVAPGS 179
M +RRSG+IV + S A Y + KAA T L EL I +V+PGS
Sbjct: 130 YMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSPGS 189

Query: 180 FRTDFSGRSLTQSSTVIDDYAE 201
TD Q S D+
Sbjct: 190 TETDM------QWSLWADENGA 205


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
CMM_0241  TCRTETB  25     0.044    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 25.2 bits (55), Expect = 0.044
Identities = 14/48 (29%), Positives = 22/48 (45%), Gaps = 9/48 (18%)

Query: 26 SLGITFAMLGLVLMLTLDDTRVAGLPFIILGITFFVMSVRPKRARTRP 73
S+GI F ML T + F+I+ + F++ V+ R T P
Sbjct: 208 SVGIVFFMLF---------TTSYSISFLIVSVLSFLIFVKHIRKVTDP 246


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
CMM_0244  HTHFIS  79     4e-19    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 79.1 bits (195), Expect = 4e-19
Identities = 29/112 (25%), Positives = 58/112 (51%)

Query: 25 RVFVVDEEAPITQLLSLALRMEGWDVRVFATGRAAIDAAVEAAPDAILLDMTLPDVSGVE 84
+ V D++A I +L+ AL G+DVR+ + D ++ D+ +PD + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 85 VVGELRRAGVASPVLFLTGRDSLEDRLAAFGAGADDYVTKPFGLEEVVETLR 136
++ +++A PVL ++ +++ + A GA DY+ KPF L E++ +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIG 116


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
CMM_0245  GPOSANCHOR  41     9e-06    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 40.8 bits (95), Expect = 9e-06
Identities = 26/89 (29%), Positives = 33/89 (37%), Gaps = 17/89 (19%)

Query: 346 ELTAADAGGPVIPGVPTPTPTTPSIDPAGTAPTPITKP------AAQHGDTLPVTGTDG- 398
EL AG P P ++ G AP TKP + LP TG
Sbjct: 454 ELAKLRAGKASDSQTPDAKPGNKAVPGKGQAPQAGTKPNQNKAPMKETKRQLPSTGETAN 513

Query: 399 ----AAALGLGGLGTLLALVGAGALVARR 423
AAAL T++A G A+V R+
Sbjct: 514 PFFTAAAL------TVMATAGVAAVVKRK 536


53  CMM_0296  CMM_0303  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
CMM_0296   -3      9        0.061592   conserved hypothetical protein
CMM_0297   -3      11       1.022220   putative thioredoxin reductase
CMM_0298   -2      11       1.858868   ATPase with chaperone activity, ATP-binding
CMM_0299   0       10       3.069082   putative transcriptional regulator
CMM_0300   0       8        3.347475   conserved hypothetical protein
CMM_0301   0       9        2.641851   hypothetical protein
CMM_0302   -1      11       1.974632   putative oxidoreductase
CMM_0303   -1      10       1.421391   putative dipeptidase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
CMM_0296  MALTOSEBP  44     5e-07    Maltose binding protein signature.
		>MALTOSEBP#Maltose binding protein signature.

Length = 396

Score = 44.3 bits (104), Expect = 5e-07
Identities = 51/206 (24%), Positives = 83/206 (40%), Gaps = 31/206 (15%)

Query: 5 KILTGSAALLVGALALAGCSGSGSGSGDDGGPVEMTLWHNSTTG------PGKAFWDKTT 58
KI TG+ L + AL S S ++G ++ +W N G GK F T
Sbjct: 2 KIKTGARILALSALTTMMFSASALAKIEEG---KLVIWINGDKGYNGLAEVGKKFEKDTG 58

Query: 59 ADFNAAHPGVTVTPTSIQNEDLDGKLQTALNSGDAPDIFLQRGGGKLAATVAAGQVMDIT 118
HP + L+ K +GD PDI + +G + +IT
Sbjct: 59 IKVTVEHP-----------DKLEEKFPQVAATGDGPDIIFW-AHDRFGGYAQSGLLAEIT 106

Query: 119 DGISADVKGQIAQSAFDANSIDGKAYAMPVAVLPSGIFYSQDLFTAAGITETPKTMDELD 178
+ ++ +DA +GK A P+AV + Y++DL + PKT +E+
Sbjct: 107 P--DKAFQDKLYPFTWDAVRYNGKLIAYPIAVEALSLIYNKDL-----LPNPPKTWEEIP 159

Query: 179 AAVEKLKATGVAPIALGAKD---AWP 201
A ++LKA G + + ++ WP
Sbjct: 160 ALDKELKAKGKSALMFNLQEPYFTWP 185


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0300  NUCEPIMERASE  32     0.002    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 32.1 bits (73), Expect = 0.002
Identities = 40/179 (22%), Positives = 63/179 (35%), Gaps = 46/179 (25%)

Query: 2 IVITGATGALNGATAEHLLRTVPAAGIAVVARD-----------PASARSFADRGVEVRR 50
++TGA G + ++ LL AG VV D A A G + +
Sbjct: 3 YLVTGAAGFIGFHVSKRLLE----AGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHK 58

Query: 51 GDYADAASLPTAFA--GADRLLLVSAS--------DPGADA---VALHRAAVEAATEAGV 97
D AD + FA +R+ + +P A A + +E +
Sbjct: 59 IDLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKI 118

Query: 98 GRILYTSHQ---GAAPDSPFA-------PARDHAATEG---ILAAS-----GIPWTALR 138
+LY S G PF+ P +AAT+ ++A + G+P T LR
Sbjct: 119 QHLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYGLPATGLR 177


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
CMM_0301  HTHTETR  55     3e-11    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 54.6 bits (131), Expect = 3e-11
Identities = 28/127 (22%), Positives = 43/127 (33%), Gaps = 3/127 (2%)

Query: 19 TRDRIVDVAADLLRASGRAAVTTRAVSEAAGVQAPTIYRLFGDKEGLLDAVAEREMTRFS 78
TR I+DVA L G ++ + +++AAGV IY F DK L + E +
Sbjct: 12 TRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIG 71

Query: 79 AAKAEAVRAAAAGAADAVDDLRAGWDATIAFGLANPDLYALMSDPARGPGSPAVRAGVAL 138
+ E A D + LR + + LM A V
Sbjct: 72 ELELE---YQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQ 128

Query: 139 LEERVHR 145
+ +
Sbjct: 129 AQRNLCL 135


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0303  DHBDHDRGNASE  56     1e-11    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 55.8 bits (134), Expect = 1e-11
Identities = 51/194 (26%), Positives = 75/194 (38%), Gaps = 25/194 (12%)

Query: 5 LITGSNRGLGLETARRLVEAGHTVHA----GMRDTADGDAARAVGAHPVQ--LDVDDQAS 58
ITG+ +G+G AR L G + A + + +A H DV D A+
Sbjct: 12 FITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRDSAA 71

Query: 59 VERAIASLPE----LDVLVNNAGILGTAQGVDDLTPEAMLAVLQTNVVAVVRVTQAALPL 114
++ A + +D+LVN AG+L + L+ E A N V +++
Sbjct: 72 IDEITARIEREMGPIDILVNVAGVLRPGL-IHSLSDEEWEATFSVNSTGVFNASRSVSKY 130

Query: 115 LRASDAPVIVNVAS-GVGWPRALRGEDTDERHVLTIPYATSKAALITATVQYAKNLPAF- 172
+ + IV V S G PR YA+SKAA + T L +
Sbjct: 131 MMDRRSGSIVTVGSNPAGVPRTSMA-----------AYASSKAAAVMFTKCLGLELAEYN 179

Query: 173 -RINATDPGYTATD 185
R N PG T TD
Sbjct: 180 IRCNIVSPGSTETD 193


54  CMM_0331  CMM_0337  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
CMM_0331   0       9        5.157915   putative sugar ABC transporter, permease
CMM_0332   -1      10       4.291314   putative sugar ABC transporter, permease
CMM_0333   -1      8        3.270429   putative glucosamine-6-phosphate isomerase
CMM_0334   -1      9        3.008252   beta-galactosidase
CMM_0335   0       9        2.411608   putative acyl-CoA thioesterase
CMM_0336   0       7        1.935792   conserved hypothetical protein
CMM_0337   1       9        1.825466   putative transcriptional regulator, LacI family
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0331  ISCHRISMTASE  58     1e-10    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 58.5 bits (141), Expect = 1e-10
Identities = 25/65 (38%), Positives = 39/65 (60%)

Query: 13 SLRAQVARALRLDPADVGLDDDLVDLGLESTALIRLAGRWRRDGLAADFSRLAADPTIRV 72
++R Q+A L+ P D+ +DL+D GL+S ++ L +WRR+G F LA PTI
Sbjct: 234 NIRKQIAELLQETPEDITDQEDLLDRGLDSVRIMTLVEQWRREGAEVTFVELAERPTIEE 293

Query: 73 WARML 77
W ++L
Sbjct: 294 WQKLL 298


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0333  ENTSNTHTASED  97     2e-26    Enterobactin synthetase component D signature.
		>ENTSNTHTASED#Enterobactin synthetase component D signature.

Length = 234

Score = 96.6 bits (240), Expect = 2e-26
Identities = 54/222 (24%), Positives = 77/222 (34%), Gaps = 32/222 (14%)

Query: 24 MHPAEAGAVARAVPSRRREFAATRACARTALAALVAEAARGGGASRAAADAPGSSPVAIP 83
+ + A R+ E A R A AL + G V
Sbjct: 31 LWLPHHDRLRSAGRKRKAEHLAGRIAAVHALREV------------------GVRTVPGM 72

Query: 84 KGRGGDPVWPRGVVGSLTHCAGYRAAVVAGIDALRTIGIDAEPHAPLPREARDIVGLAGE 143
G P+WP G+ GS++HCA AV+ + + IGID E A ++ +
Sbjct: 73 -GDKRQPLWPDGLFGSISHCATTALAVI----SRQRIGIDIEKIMSQH-TATELAPSIID 126

Query: 144 LHPHPPLGAG----VHADCVLFSAKESVGKAHFARYREWLGFADLHVTLHPDGAFTARRS 199
L A A + FSAKESV KA F+ GF VT +
Sbjct: 127 SDERQILQASLLPFPLALTLAFSAKESVYKA-FSDRVTLPGFNSAKVTSLTATHISLHLL 185

Query: 200 AP--GPIPFPAYRGGWCVTEGIVLTCAWLAVPRIPSAVARPA 239
+ R W + V+T A+ R+P + PA
Sbjct: 186 PAFAATMAERTVRTEWFQRDNSVIT-LVSAITRVPHDRSAPA 226


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
CMM_0335  cloacin  34     0.002    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 33.5 bits (76), Expect = 0.002
Identities = 18/40 (45%), Positives = 21/40 (52%)

Query: 564 ASLGSVSSSMHGTYSGSSSSGGSTGGTTSGGGGGGGGGGG 603
AS GS SS + + G S SG GG + G GGG G G
Sbjct: 33 ASDGSGWSSENNPWGGGSGSGIHWGGGSGHGNGGGNGNSG 72



Score = 32.4 bits (73), Expect = 0.006
Identities = 18/61 (29%), Positives = 21/61 (34%), Gaps = 9/61 (14%)

Query: 544 ARDRPGWYAGQTPFSPAVFAASLGSVSSSMHGTYSGSSSSGGSTGGTTSGGGGGGGGGGG 603
A D GW + P+ G S S GS G G + GG G GG
Sbjct: 33 ASDGSGWSSENNPW---------GGGSGSGIHWGGGSGHGNGGGNGNSGGGSGTGGNLSA 83

Query: 604 V 604
V
Sbjct: 84 V 84


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
CMM_0337  BLACTAMASEA  34     0.001    Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 33.6 bits (77), Expect = 0.001
Identities = 20/92 (21%), Positives = 42/92 (45%), Gaps = 4/92 (4%)

Query: 183 PIGSIVKLYVLGALVQAVEEGRIGWDDPLTVT-DDVRSLPSGELQDAPTGTVVSVRDTAE 241
P+ S K+ + GA++ V+ G + + D+ + + + ++V +
Sbjct: 63 PMMSTFKVVLCGAVLARVDAGDEQLERKIHYRQQDL--VDYSPVSEKHLADGMTVGELCA 120

Query: 242 KMIAISDNTATDMLIQAV-GREAVEAALVDLG 272
I +SDN+A ++L+ V G + A L +G
Sbjct: 121 AAITMSDNSAANLLLATVGGPAGLTAFLRQIG 152


55  CMM_0400  CMM_0409  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias    Product
CMM_0400   -1      11       0.704143   putative sugar ABC transporter, binding protein
CMM_0401   0       11       0.473694   putative sugar uptake ABC transporter, ATPase
CMM_0402   0       10       0.607163   transcriptional regulator, LacI family
CMM_0403   0       8        1.091574   manganese-containing catalase
CMM_0404   1       8        1.562448   hypothetical protein
CMM_0405   1       7        1.712482   putative phosphoketolase
CMM_0406   1       8        1.163163   hypothetical secreted protein, putative
CMM_0407   2       12       1.444790   putative hemagglutinin/hemolysin-related
CMM_0408   2       12       0.392170   conserved hypothetical protein
CMM_0409   3       14       -0.270906  putative iron-siderophore ABC
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0400  DHBDHDRGNASE  132    9e-40    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 132 bits (334), Expect = 9e-40
Identities = 78/255 (30%), Positives = 123/255 (48%), Gaps = 31/255 (12%)

Query: 8 RVVVVSGAAQGIGRAIAERFLEEGCRVFGLDL----------------RFRDALPEGVTA 51
++ ++GAAQGIG A+A +G + +D R +A P
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFP----- 63

Query: 52 IVADVTDQASVRAAIAQVVEAAGRVDVLVNNAGINVEGSVETIDPARFQAAFDVNVGGVF 111
ADV D A++ A++ G +D+LVN AG+ G + ++ ++A F VN GVF
Sbjct: 64 --ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVF 121

Query: 112 LLSQAVIPAMKAAGGGRIINAASFAAVIPSVGAAAYGASKAAVVQFTRVLASELGPWGIT 171
S++V M G I+ S A +P AAY +SKAA V FT+ L EL + I
Sbjct: 122 NASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIR 181

Query: 172 VNAYAPGMIPTAM--------NGFAEMPEPAQDRLLDTLSIRRWERPDDVADLLVFLASD 223
N +PG T M NG ++ + + + + +++ +P D+AD ++FL S
Sbjct: 182 CNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSG 241

Query: 224 RAGYITGTLVDVSGG 238
+AG+IT + V GG
Sbjct: 242 QAGHITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0401  DHBDHDRGNASE  67     3e-15    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 67.0 bits (163), Expect = 3e-15
Identities = 50/235 (21%), Positives = 94/235 (40%), Gaps = 24/235 (10%)

Query: 26 RTLWISGAGSGVGRATAVAAARDGWRLVLSGRRREALEETRALVDDLGSTAVVAPLDVTD 85
+ +I+GA G+G A A A G + E LE+ + + A P DV D
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRD 68

Query: 86 DAALAAVV----ADLDRLDGVVVAAGL--NAPRRSWAEQDVADFDRIVATNLTGPAHQVA 139
AA+ + ++ +D +V AG+ S ++++ + +T + + V+
Sbjct: 69 SAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVS 128

Query: 140 AALPLLRASGGTVVLVSSYAAWTHSPGAGVAYSASKTALGTLVRDLNAQEAGAGIRATHL 199
++ G++V V S A AY++SK A + L + A IR +
Sbjct: 129 KY--MMDRRSGSIVTVGSNPAGVPRTSMA-AYASSKAAAVMFTKCLGLELAEYNIRCNIV 185

Query: 200 CPGTIDSDFLALRPTVPDAAERA---------------AMLTPDDVARAAMFVLA 239
PG+ ++D + AE+ + P D+A A +F+++
Sbjct: 186 SPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVS 240


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
CMM_0403  TCRTETB  29     0.050    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 28.7 bits (64), Expect = 0.050
Identities = 25/123 (20%), Positives = 47/123 (38%), Gaps = 6/123 (4%)

Query: 104 VSDRSSARWFLPIGLLLSALANLVVAFVPAVGASVALFAVVMIVNGFFQGMGWPPSGRTL 163
+SD+ + L G++++ +++ FV S+ + A + G +P +
Sbjct: 72 LSDQLGIKRLLLFGIIINCFGSVI-GFVGHSFFSLLIMARFIQGAG---AAAFPALVMVV 127

Query: 164 VHWFSTSE-RGGKTAIWNVAHNVGGAGAGGLAGLAIATFGTWQSAFWFPAIVCIVIALIA 222
V + E RG + + G G G G IA + W P I I + +
Sbjct: 128 VARYIPKENRGKAFGLIG-SIVAMGEGVGPAIGGMIAHYIHWSYLLLIPMITIITVPFLM 186

Query: 223 FVL 225
+L
Sbjct: 187 KLL 189


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
CMM_0404  HTHFIS  53     2e-10    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 53.3 bits (128), Expect = 2e-10
Identities = 26/116 (22%), Positives = 47/116 (40%), Gaps = 5/116 (4%)

Query: 20 RVLLVDDQALVRAGIRVILESEDGIEVVGEAADGEEGVRLAAALAPDVICMDVQMPRVDG 79
+L+ DD A +R + L G +V ++ R AA D++ DV MP +
Sbjct: 5 TILVADDDAAIRTVLNQALSRA-GYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDENA 62

Query: 80 LEATRRIVA-DPTIGAAVLMLTTFDREDYLFTALDAGASGFVLKSASPESLVEAVH 134
+ RI P + VL+++ + A + GA ++ K L+ +
Sbjct: 63 FDLLPRIKKARPDL--PVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIG 116


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
CMM_0406  TCRTETB  131    2e-35    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 131 bits (332), Expect = 2e-35
Identities = 99/450 (22%), Positives = 182/450 (40%), Gaps = 55/450 (12%)

Query: 32 RAWQALIVLLAGMFIALLDTTIVNVALPTIRTSLDASESTLSWIISGYALAFGLALIPAG 91
R Q LI L F ++L+ ++NV+LP I + ++ +W+ + + L F + G
Sbjct: 11 RHNQILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYG 70

Query: 92 RLGDRYGHKWVFVTGIALFTLASLACGVAQDDLQLVI-ARVVQGLAGGLFVPAVTAFIQL 150
+L D+ G K + + GI + S+ V L+I AR +QG F V +
Sbjct: 71 KLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVAR 130

Query: 151 LFPPQARGKAFAIMGAVIGVSSALGPIVGGLIIQAAGEESGWRLVFFVNLPVGLATVIAA 210
P + RGKAF ++G+++ + +GP +GG+I W + + +P+ T+I
Sbjct: 131 YIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIA----HYIHWS--YLLLIPM--ITIITV 182

Query: 211 IFLLPSRQVAEQVAGDVRAQSGQRAGTKQGAAAPAKAPAASGVDLVGILLVSAGLVALLV 270
FL+ + + + G D+ GI+L+S G+V ++
Sbjct: 183 PFLM-----------KLLKKEVRIKGH---------------FDIKGIILMSVGIVFFML 216

Query: 271 PLIDGQDQGWPLWTYLSLAGGVVLLALFGAWEVMQTKRSKGVLVPPHLFTHPAFTGGVIL 330
++L ++ V+ +F V ++ V P L + F GV+
Sbjct: 217 FTTSY------SISFLIVS--VLSFLIF----VKHIRKVTDPFVDPGLGKNIPFMIGVLC 264

Query: 331 AMVYFAAFTSIFFTISILWQSGLGNSALESGLVSI-PFAIGSIVGSSQSNRLTKRLG-RT 388
+ F + + + S E G V I P + I+ L R G
Sbjct: 265 GGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYIGGILVDRRGPLY 324

Query: 389 VLVIGTALVSVGLIWLWLVLLNTQAADLTSWMLLVPLLLAGIGNGLFIAPNAQFIVATVD 448
VL IG +SV + +L + TSW + + ++ G + + +++
Sbjct: 325 VLNIGVTFLSVSFLTASFLL------ETTSWFMTIIIVFVLGGLSFTKTVISTIVSSSLK 378

Query: 449 PAEAGAASGVIGTVQRVGSAVGIAVIGSVL 478
EAGA ++ + GIA++G +L
Sbjct: 379 QQEAGAGMSLLNFTSFLSEGTGIAIVGGLL 408


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0409  DHBDHDRGNASE  60     4e-13    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 60.4 bits (146), Expect = 4e-13
Identities = 45/190 (23%), Positives = 77/190 (40%), Gaps = 14/190 (7%)

Query: 3 IQDQVALVTGANRGIGRTFVEELLARGARKVYATARRPEAIDVPGV---------EVLRL 53
I+ ++A +TGA +GIG L ++GA + A PE ++ E
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGA-HIAAVDYNPEKLEKVVSSLKAEARHAEAFPA 64

Query: 54 DLADPASVDAA----AAAAQDVTLVVNNAGISTGAALITGDMAEIRREMDTHFYGTLGVI 109
D+ D A++D + ++VN AG+ + + E + G
Sbjct: 65 DVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNAS 124

Query: 110 RAFAPVLAANGGGGIVNILSALSWFSTSGAGGYAAAKAAEWNMTNAVRLELAEQGTLVQG 169
R+ + + G IV + S + + YA++KAA T + LELAE
Sbjct: 125 RSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNI 184

Query: 170 VHLGAADTDI 179
V G+ +TD+
Sbjct: 185 VSPGSTETDM 194


56  CMM_0457  CMM_0464  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
CMM_0457  0       9        -0.543296  putative N-acetylglucosamine-6-phosphate
CMM_0458  1       10       -0.368797  putative 3',5'-cyclic-nucleotide
CMM_0459  2       9        1.454411   hypothetical protein
CMM_0460  2       9        1.555826   putative membrane protein
CMM_0461  2       15       1.853770   putative ATP-dependent DNA helicase
CMM_0462  1       12       3.060065   putative adenine-specific DNA-modification
CMM_0463  4       12       2.933008   putative membrane protein
CMM_0464  3       12       1.863550   putative membrane protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
CMM_0457  HTHTETR  51     5e-10    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 50.8 bits (121), Expect = 5e-10
Identities = 31/183 (16%), Positives = 66/183 (36%), Gaps = 6/183 (3%)

Query: 11 RERRRVETSAKLTTLARRLTAQHGLAGFTVEEVCEQAGVSRRTFFNYFASKEDALFGRSA 70
++ ET + +A RL +Q G++ ++ E+ + AGV+R + +F K D LF
Sbjct: 5 TKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSD-LFSEIW 63

Query: 71 RVDTADLEEAFVAAGDPEDPALSPTLLDDLADLCLERWARLDTDGTSMQDMHAAFRREPG 130
+ +++ E + L + L + H
Sbjct: 64 ELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEM 123

Query: 131 LLVRALEAAVEDEAKDVLLVERREGLPAGDLRASVVVQ-----IMGALGRASAREFLAPG 185
+V+ + + E+ D + + + A L A ++ + + G + AP
Sbjct: 124 AVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLFAPQ 183

Query: 186 NAD 188
+ D
Sbjct: 184 SFD 186


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
CMM_0458  TCRTETB  148    3e-41    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 148 bits (375), Expect = 3e-41
Identities = 97/430 (22%), Positives = 178/430 (41%), Gaps = 23/430 (5%)

Query: 2 TATAPAPLLLTQRRIWIIFSALIAGMLLSSLDQTIVSTAMPTIVGELGGVDHQV-WITTA 60
T+ + + L Q IW+ + S L++ +++ ++P I + W+ TA
Sbjct: 3 TSYSQSNLRHNQILIWLCILSF-----FSVLNEMVLNVSLPDIANDFNKPPASTNWVNTA 57

Query: 61 YLLATTIVMPIYGKFGDVLGRRNLFLAAIAIFTLASVGCAFAGDFWSFVVF-RAAQGLGG 119
++L +I +YGK D LG + L L I I SV F+S ++ R QG G
Sbjct: 58 FMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGA 117

Query: 120 GGLMILSQSIIADIVPANQRGKYLGPLGGIFGLSAVGGPLLGGFFVDHLTWQWAFYINIP 179
L ++A +P RGK G +G I + GP +GG ++ W++ + IP
Sbjct: 118 AAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYI--HWSYLLLIP 175

Query: 180 IGIAAFVVAFITLTLPSKRATKRIDVAGVVLLSLATTCLIFFTDFGGDKAYGWGSLATWA 239
+ V + L R D+ G++L+S+ + FT +Y L
Sbjct: 176 MITIITVPFLMKLLKKEVRIKGHFDIKGIILMSVGIVFFMLFT-----TSYSISFLIVS- 229

Query: 240 WGLGLVVAASLFVFTESRADDPVIPLSLFRNPVFVNATAIGLALGIGMFAAIGFVPTFLQ 299
V++ +FV + DP + L +N F+ G + + + VP ++
Sbjct: 230 -----VLSFLIFVKHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMK 284

Query: 300 MSTGTSAAVSG-LLLLPMMVGLIGMSITSGILISRTGRYRIFPIVGTLLTMLALVLMTSL 358
S A G +++ P + +I GIL+ R G + +G ++ + + L
Sbjct: 285 DVHQLSTAEIGSVIIFPGTMSVIIFGYIGGILVDRRGPLYVL-NIGVTFLSVSFLTASFL 343

Query: 359 TAQTPVWLICVFLFIFGLGLGLIMQVVVLVVQNAVPAAQIGTATSTNNYFREVGAALGTA 418
T ++ + +F+ G GL V+ +V +++ + G S N+ + G A
Sbjct: 344 LETTSWFMTIIIVFVLG-GLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIA 402

Query: 419 VFGTLFTTRL 428
+ G L + L
Sbjct: 403 IVGGLLSIPL 412


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0463  SACTRNSFRASE  34     1e-04    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 33.8 bits (77), Expect = 1e-04
Identities = 19/76 (25%), Positives = 26/76 (34%), Gaps = 9/76 (11%)

Query: 72 VGVGSAWVALDDPNGRPDTAFLFELLVDPSRRGCGYGRALLAAVEEATRAAGAPALALNV 131
+ + S W NG A + ++ V R G G ALL E + L L
Sbjct: 80 IKIRSNW------NGY---ALIEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLET 130

Query: 132 FGANRVAIALYASAGY 147
N A YA +
Sbjct: 131 QDINISACHFYAKHHF 146


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0464  NUCEPIMERASE  58     9e-12    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 58.3 bits (141), Expect = 9e-12
Identities = 78/366 (21%), Positives = 125/366 (34%), Gaps = 97/366 (26%)

Query: 8 RALVLGGTGAIGGATAERLARDGWSV---DV--TGRDPLAMPARLTDL---GVRFHALDR 59
+ LV G G IG ++RL G V D D ARL L G +FH +D
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 60 ADACGIERLTGDG-VDLLVDLVAFTAADVQA------------------LLPVMRASG-- 98
AD G+ L G + + V+ +L R +
Sbjct: 62 ADREGMTDLFASGHFERVFISPH--RLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQ 119

Query: 99 SVVVASSRAVHVDDAGRHVNGDEPPRFPVPIPEDNPTLPPAAPGTDPFTREGYAPSKVAV 158
++ ASS +V+ + +P D+ P + YA +K A
Sbjct: 120 HLLYASSSSVYGLNR------------KMPFSTDDSVDHPVSL---------YAATKKAN 158

Query: 159 ERAALDS----GLPVTVIRPSKVHGRWAR-NARTRAVVERMLAGADTIELADRGASVDHL 213
E A GLP T +R V+G W R + + ML G +I++ + G
Sbjct: 159 ELMAHTYSHLYGLPATGLRFFTVYGPWGRPDMALFKFTKAMLEG-KSIDVYNYGKMKRDF 217

Query: 214 TAAANAAALVARVADVPGA-------------------RILHSADPDPLTAAEIVAVIAE 254
T + A + R+ DV R+ + + P+ + + + +
Sbjct: 218 TYIDDIAEAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALED 277

Query: 255 ELGWRGRI--VPLEPGVAGGAHPWAAAHPIVLDTRASL-----ALGYAPVGPGAVLLRAE 307
LG + +PL+PG VL+T A +G+ P ++
Sbjct: 278 ALGIEAKKNMLPLQPG-------------DVLETSADTKALYEVIGFTPETTVKDGVKNF 324

Query: 308 VAWIRD 313
V W RD
Sbjct: 325 VNWYRD 330


57  CMM_0468  CMM_0477  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias   Product
CMM_0468  -2      8        0.195568  conserved hypothetical protein
CMM_0469  0       7        1.118600  conserved hypothetical protein
CMM_0470  0       7        1.399369  conserved hypothetical protein
CMM_0471  0       6        1.378797  putative lysophospholipase
CMM_0472  2       9        3.136072  putative acetyltransferase
CMM_0473  2       10       2.228085  conserved hypothetical protein
CMM_0474  4       9        2.233440  putative serine protease, family S53
CMM_0476  4       12       2.774113  transcriptional regulator, TetR family
CMM_0477  5       11       3.625103  putative phospholipase C
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0469  DHBDHDRGNASE  112    1e-31    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 112 bits (281), Expect = 1e-31
Identities = 80/255 (31%), Positives = 117/255 (45%), Gaps = 14/255 (5%)

Query: 56 LTGRKVLITGADSGIGKAVAIAFAREGADIALNFLDEELEDARDTASTIEADGRTAALVP 115
+ G+ ITGA GIG+AVA A +GA IA +D E S+++A+ R A P
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAA--VDYNPEKLEKVVSSLKAEARHAEAFP 63

Query: 116 GDISDETACQDIVQASVAALGGVDCLVMVAGYQRNEDDILDLDSEQLDRTMKTNVYSLFW 175
D+ D A +I +G +D LV VAG R I L E+ + T N +F
Sbjct: 64 ADVRDSAAIDEITARIEREMGPIDILVNVAGVLR-PGLIHSLSDEEWEATFSVNSTGVFN 122

Query: 176 LSKAVIPHLP--KGGSIITTSSSQAYQPSPDKIDYAVSKGAIRNFTQGLAQQLAPKGIRV 233
S++V ++ + GSI+T S+ A P YA SK A FT+ L +LA IR
Sbjct: 123 ASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRC 182

Query: 234 NSVAPGPFWTVLQPVGQSASDVEE-----FGSQSVYGRP----GQPAEIAATYVFLASQE 284
N V+PG T +Q + + E G P +P++IA +FL S +
Sbjct: 183 NIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQ 242

Query: 285 SSFTSGETIAVTGGT 299
+ + + V GG
Sbjct: 243 AGHITMHNLCVDGGA 257


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
CMM_0471  PF06580  32     0.006    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 32.1 bits (73), Expect = 0.006
Identities = 18/105 (17%), Positives = 32/105 (30%), Gaps = 26/105 (24%)

Query: 458 LLSNAVKY----TPEGGTVITEVGVDGDHFRLCVTDDGVGMSAEDTSQLFTRFFRTNSAR 513
L+ N +K+ P+GG ++ + D L V + G
Sbjct: 263 LVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKE------------- 309

Query: 514 ASTVAGVGLGLSITRSIVEA---HDGSIEVESAVGTGTTMRVRLP 555
G GL R ++ + I++ G V +P
Sbjct: 310 -----STGTGLQNVRERLQMLYGTEAQIKLSEKQG-KVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
CMM_0473  HTHFIS  79     7e-19    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 79.5 bits (196), Expect = 7e-19
Identities = 36/138 (26%), Positives = 59/138 (42%), Gaps = 2/138 (1%)

Query: 1 MDSGRVALVIEDDGDIRQLLEVVLRQGGFEVHSAGTATEGVRLAEEVSPDVITLDVGLPD 60
M + LV +DD IR +L L + G++V A R D++ DV +PD
Sbjct: 1 MTGATI-LVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPD 59

Query: 61 FDGFEAARRIR-LVSDAYIVMLTAQGEEVDTLLGLEAGADDYIVKPFRPRELRARISAMM 119
+ F+ RI+ D +++++AQ + + E GA DY+ KPF EL I +
Sbjct: 60 ENAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRAL 119

Query: 120 RRPRGGGDATATPAAGIP 137
P+ +
Sbjct: 120 AEPKRRPSKLEDDSQDGM 137


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
CMM_0474  BACINVASINB  31     0.002    Salmonella/Shigella invasin protein B signature.
		>BACINVASINB#Salmonella/Shigella invasin protein B signature.

Length = 593

Score = 31.3 bits (70), Expect = 0.002
Identities = 38/110 (34%), Positives = 56/110 (50%), Gaps = 13/110 (11%)

Query: 89 TLVSMLVAAVVGGS---IAAMGLGLIFVTDACDVDAYV----CRDSLFTIGYGIAVAGPL 141
T+VS++ A GG+ +AA+GL ++ V D V A + +L I V PL
Sbjct: 326 TIVSVVAAVFTGGASLALAAVGLAVM-VADEI-VKAATGVSFIQQALNPIME--HVLKPL 381

Query: 142 F-LTGIAVIVALVGMIRGRTRPWLVLLIGVGASLAAYVLGAVLVLVAVPG 190
L G A+ AL G+ + + I VGA +AA + AV+V+VAV G
Sbjct: 382 MELIGKAITKALEGLGVDKKTAEMAGSI-VGAIVAAIAMVAVIVVVAVVG 430


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0477  FLGMOTORFLIN  29     0.007    Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 29.5 bits (66), Expect = 0.007
Identities = 15/59 (25%), Positives = 28/59 (47%)

Query: 151 LALTEGAVRTLVRQAGDEVPGVLIGRCTLDGEVTRAGEPVRVALTMSVVWGDPLPELAQ 209
L LT+G+V L AG+ + ++ G GEV + V +T + + + L++
Sbjct: 79 LRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGVRITDIITPSERMRRLSR 137


58  CMM_0500  CMM_0503  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
CMM_0500  3       6        3.079017   putative 1-acylglycerol-3-phosphate
CMM_0501  3       6        2.763318   putative tRNA processing ribonuclease BN
CMM_0502  1       7        1.076331   Deoxyribodipyrimidine photo-lyase
CMM_0503  1       9        -0.107933  putative glycosyl transferase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0500  SACTRNSFRASE  32     5e-04    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 32.2 bits (73), Expect = 5e-04
Identities = 16/67 (23%), Positives = 26/67 (38%), Gaps = 4/67 (5%)

Query: 97 MALVPSARGRGLGRALLVALVAAVRASGAPAVSLSVEDGNDRARALYDSLGFVAVGREGG 156
+A+ R +G+G ALL + + + + L +D N A Y F G
Sbjct: 95 IAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHF----IIGA 150

Query: 157 SDVLLLR 163
D +L
Sbjct: 151 VDTMLYS 157


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
CMM_0501  PF05616  41     3e-05    Neisseria meningitidis TspB protein
		>PF05616#Neisseria meningitidis TspB protein

Length = 501

Score = 40.5 bits (94), Expect = 3e-05
Identities = 25/79 (31%), Positives = 29/79 (36%), Gaps = 3/79 (3%)

Query: 217 AAAAAVPVPTPSASPTPAPTPVPDPDPTPAPGPTPAPDPSPAPAPGRVATGVTVTAPDGT 276
+A A P P SP P P P+ P P P PDP P G T PD
Sbjct: 319 SAEAPNAQPLPEVSPAENPANNPAPNENPGTRPNPEPDPDLNPDANPDTDGQPGTRPDSP 378

Query: 277 RVIDGQPRVDGLDRSATAD 295
V D R +G R +
Sbjct: 379 AVPD---RPNGRHRKERKE 394



Score = 35.5 bits (81), Expect = 0.001
Identities = 26/95 (27%), Positives = 33/95 (34%), Gaps = 6/95 (6%)

Query: 224 VPTPSASPTPAPTPVPDPDPTPAPGPTPAPDPSPAPAPGRVATGVTVTAPDGTRVIDGQP 283
+P P +P A P P P +P PA +P+P PG PD D P
Sbjct: 310 IPRPDLTPGSAEAPNAQPLPEVSPAENPANNPAPNENPGTRPN----PEPDPDLNPDANP 365

Query: 284 RVDGLDRSATADSLTASDGSPEHAASVHVETGADG 318
DG + T A P + G DG
Sbjct: 366 DTDG--QPGTRPDSPAVPDRPNGRHRKERKEGEDG 398



Score = 34.3 bits (78), Expect = 0.002
Identities = 23/82 (28%), Positives = 31/82 (37%), Gaps = 6/82 (7%)

Query: 212 GSVTCAAAAAVPVPTPSASPTPAPTPVPDPDPTPAPGPTPAPDPSPAP----APGRVATG 267
GS A +P +P+ +P P P +P P P P P +P P PG
Sbjct: 318 GSAEAPNAQPLPEVSPAENPANNPAPNENPGTRPNPEPDPDLNPDANPDTDGQPGTRPDS 377

Query: 268 VTVTAPDGTRVIDGQPRVDGLD 289
V PD + R +G D
Sbjct: 378 PAV--PDRPNGRHRKERKEGED 397


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
CMM_0502  SUBTILISIN  48     6e-08    Subtilisin serine protease family (S8) signature.
		>SUBTILISIN#Subtilisin serine protease family (S8) signature.

Length = 326

Score = 47.9 bits (114), Expect = 6e-08
Identities = 43/195 (22%), Positives = 68/195 (34%), Gaps = 31/195 (15%)

Query: 243 VDDVPAEDRGSGATIAIVAAYDDPDTQADTDTYSLA---VGEPAFTAGQYRDHPSASPRT 299
V + RG G +A++ DT D D L +G FT D
Sbjct: 31 APAVWNQTRGRGVKVAVL------DTGCDADHPDLKARIIGGRNFTDDDEGDPEIFKDYN 84

Query: 300 G---ICGGPTAWTDEQHLDVQAVHAMAPDATVS----YWGADDCTSTSLYTRILDAAEDG 352
G G A T+ ++ V +AP+A + + I A E
Sbjct: 85 GHGTHVAGTIAATENEN----GVVGVAPEADLLIIKVLNKQGSGQYDWIIQGIYYAIEQK 140

Query: 353 PDVISLSFGAMEGLDTADDRELLNRVLVEAASRDVSVFASTGNDGDYSGVGDHGGNATVA 412
D+IS+S G E + L+ + +A + + V + GN+GD +
Sbjct: 141 VDIISMSLGGPEDVPE------LHEAVKKAVASQILVMCAAGNEGDG-----DDRTDELG 189

Query: 413 SPASSPYVTAVGATS 427
P V +VGA +
Sbjct: 190 YPGCYNEVISVGAIN 204


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
CMM_0503  HTHTETR  50     1e-09    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 49.6 bits (118), Expect = 1e-09
Identities = 33/208 (15%), Positives = 65/208 (31%), Gaps = 13/208 (6%)

Query: 1 MNQRQAAVSRARREIMEAAGAQFAAHGYEGTSFSRVAEAMGKPKSAIGYHLFASKESLAG 60
+ + R+ I++ A F+ G TS +A+A G + AI Y F K L
Sbjct: 2 ARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAI-YWHFKDKSDLFS 60

Query: 61 AVVEDQEDRWLRIEAALDRP------GALHELIVFLLTGASTVEVCPVAAGAIRLLQDMP 114
+ E E +E L E+++ +L T E + I +
Sbjct: 61 EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFV 120

Query: 115 RLGLAVERR-----FDVWRFTREHLEAELAVRDIRAG-DLDAVVDVLLSATFGVLSYRSP 168
V++ + + + L+ + + + A ++ G++
Sbjct: 121 GEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLF 180

Query: 169 RLAEHDSAERLRSLWIPLLVGLGIDDAD 196
D + R LL +
Sbjct: 181 APQSFDLKKEARDYVAILLEMYLLCPTL 208


59  CMM_0631  CMM_0637  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias   Product
CMM_0631  -2      11       0.870312  hypothetical membrane protein
CMM_0632  -2      9        2.161512  putative exonuclease
CMM_0633  -1      5        2.702629  putative dsDNA exonuclease subunit
CMM_0634  -2      7        2.721252  putative serine/threonine-protein kinase
CMM_0635  -3      8        1.720781  putative acyl-CoA ligase/aldehyde dehydrogenase
CMM_0636  -1      8        1.975759  3-oxoacyl-[acyl-carrier-protein] synthase
CMM_0637  -2      9        2.230963  putative acetolactate synthase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
CMM_0632  PF06776  32     0.001    Invasion associated locus B
		>PF06776#Invasion associated locus B

Length = 214

Score = 32.2 bits (73), Expect = 0.001
Identities = 22/65 (33%), Positives = 27/65 (41%), Gaps = 9/65 (13%)

Query: 65 PRGPATRMRHPMPARLYP------RPARRRPAALAAAAAVIAVVCFGVAQPAGARPEPLP 118
P A +M PA L P R ARR A L A A+ + FG + A A+
Sbjct: 22 PALKAIQM---GPAELSPMLASCRRLARRNGARLMLAGAMAIALSFGWSDRADAQGAVRS 78

Query: 119 HSGDW 123
GDW
Sbjct: 79 VHGDW 83


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0635  NUCEPIMERASE  87     8e-22    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 87.1 bits (216), Expect = 8e-22
Identities = 69/340 (20%), Positives = 116/340 (34%), Gaps = 74/340 (21%)

Query: 1 MKVLVTGSRGKVGRAAVEALVAAGHDVTGVDLVRPVFD--------AGVVVPG-RYVMAD 51
MK LVTG+ G +G + L+ AGH V G+D + +D + PG ++ D
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKID 60

Query: 52 LTDAGSAFALVA--GMDAVVHVAA---IPQPTGNPAHVVLQTNLMSTFHMIEAAVRFGVP 106
L D L A + V + NP H +NL +++E +
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENP-HAYADSNLTGFLNILEGCRHNKIQ 119

Query: 107 RFVNISSESIVGNFFPERPFLPDYAPVDEEHPL-RPQDPYALSKAFGEQLMDAAVRRSDI 165
+ SS S+ G L P + + P YA +K E + +
Sbjct: 120 HLLYASSSSVYG--------LNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYGL 171

Query: 166 RVISLRPSTVHN-----DDNYAS--------------NLGKQVRDASVLTANLWSYIDAD 206
LR TV+ D N GK RD ++YID
Sbjct: 172 PATGLRFFTVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRD--------FTYID-- 221

Query: 207 DLADAIV-----------------LSVASDLPGHEVFYIAAADNAGGHDFAAELRRHYGD 249
D+A+AI+ + A+ + + V+ I + D+ L G
Sbjct: 222 DIAEAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGI 281

Query: 250 AIELRAIER----VDSSGISTAKARRLLGWEPKRSWRDHL 285
+ + V + T ++G+ P+ + +D +
Sbjct: 282 EAKKNMLPLQPGDVLETSADTKALYEVIGFTPETTVKDGV 321


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
CMM_0637  RTXTOXINA  33     0.003    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 33.0 bits (75), Expect = 0.003
Identities = 35/157 (22%), Positives = 59/157 (37%), Gaps = 44/157 (28%)

Query: 277 LAALSAALSAAPS----AHPDPDERWRAASAAGWTNLVLGLASAALAAVIVA-------- 324
L +S LSA + ++ D D R +AA+ T VLG ++ I+A
Sbjct: 242 LDTVSGILSAISASFILSNADADTRTKAAAGVELTTKVLGNVGKGISQYIIAQRAAQGLS 301

Query: 325 --GP-AGVVAAAAGLALAP-----------------SLA-----------SSLASAMREP 353
AG++A+A LA++P + S LA+ +E
Sbjct: 302 TSAAAAGLIASAVTLAISPLSFLSIADKFKRANKIEEYSQRFKKLGYDGDSLLAAFHKET 361

Query: 354 GAHLPAIATFVVAASGITVGGLGAAFCALVAGVLVHL 390
GA ++ T + ++ G+ AA + G V
Sbjct: 362 GAIDASLTTISTVLASVS-SGISAAATTSLVGAPVSA 397


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0638  PYOCINKILLER  38     5e-05    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 38.2 bits (88), Expect = 5e-05
Identities = 26/70 (37%), Positives = 40/70 (57%), Gaps = 3/70 (4%)

Query: 211 SELQAQLTTLRDSRISVEEGFAIGERKRQEEAAAEARRQADARAAAAAAAAAANAAARPP 270
S LQ ++ TL ++ S+E A K +E+AAAEA+R+A+ +A AA AAN A P
Sbjct: 198 SSLQIRMNTLTAAKASIE---AAAANKAREQAAAEAKRKAEEQARQQAAIRAANTYAMPA 254

Query: 271 ANAPRPPSSG 280
+ ++G
Sbjct: 255 NGSVVATAAG 264


60  CMM_0670  CMM_0678  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias   Product
CMM_0670  1       9        3.062527  putative membrane protein, possibly a
CMM_0671  3       7        4.243251  putative mannosyltransferase
CMM_0672  3       9        4.804619  putative glycosyltransferase
CMM_0673  1       10       3.758344  putative undecaprenyl-phosphate sugar
CMM_0675  -1      11       2.861892  putative protein tyrosine kinase
CMM_0676  0       12       2.650573  putative acyltransferase
CMM_0677  1       13       1.824650  putative membrane-bound acyltransferase
CMM_0678  2       13       1.597466  putative esterase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0671  ABC2TRNSPORT  39     3e-05    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 39.1 bits (91), Expect = 3e-05
Identities = 29/115 (25%), Positives = 48/115 (41%), Gaps = 5/115 (4%)

Query: 563 AIVAGALGFRIAHPLAMYGTMVLASITFAAIILALNVLLGSVGQ----FLGLVLMVVQLV 618
+VA ALG+ +Y V+A A L + V + F +++ L
Sbjct: 133 GVVAAALGY-TQWLSLLYALPVIALTGLAFASLGMVVTALAPSYDYFIFYQTLVITPILF 191

Query: 619 TAGGTFPWQTLPGPLAALHHVLPMSFAVDALRQLMYGGDLGQAAQDAGVLALWLV 673
+G FP LP LP+S ++D +R +M G + Q G L +++V
Sbjct: 192 LSGAVFPVDQLPIVFQTAARFLPLSHSIDLIRPIMLGHPVVDVCQHVGALCIYIV 246


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
CMM_0673  HTHTETR  50     2e-09    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 49.6 bits (118), Expect = 2e-09
Identities = 34/168 (20%), Positives = 62/168 (36%), Gaps = 16/168 (9%)

Query: 7 ARAPRKDAATNRQALVDAAVVALDRDP--DASLETIAAAAGLSRRAVYGHFATRDDLVRE 64
AR +++A RQ ++D A+ + SL IA AAG++R A+Y HF + DL E
Sbjct: 2 ARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSE 61

Query: 65 VLQRGARRVVEALAGIAHDDSRIHIALIGARLWAEVEQVRVMARV--AVRGPHAREVGAE 122
+ + + E ++++ L +E R + +
Sbjct: 62 IWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVG 121

Query: 123 LAPLRAELQR------------VVERGIAAGELRGDIPAPTLARLIEG 158
+ + QR ++ I A L D+ A ++ G
Sbjct: 122 EMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRG 169


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
CMM_0674  RTXTOXINA  30     0.036    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 29.5 bits (66), Expect = 0.036
Identities = 26/113 (23%), Positives = 46/113 (40%), Gaps = 10/113 (8%)

Query: 18 NTPHADPAGIEAALVASPTAASDAASLLAPSPSGGSSIASASDATPDTASLGTGPLSLVA 77
N P+ D G V+ +A A+ +L+ + + + A+A L T ++
Sbjct: 231 NLPNLDNIGAGLDTVSGILSAISASFILSNADADTRTKAAAG------VELTT---KVLG 281

Query: 78 GAGAAVMGMAADSRAAAAATTAMPAATDAVAAPVDTATAPAPAAATTEPFLRA 130
G + RAA +T+ AA +A+ V A +P + + F RA
Sbjct: 282 NVGKGISQYIIAQRAAQGLSTSAAAAG-LIASAVTLAISPLSFLSIADKFKRA 333


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0678  YERSSTKINASE  30     0.032    Yersinia serine/threonine protein kinase signature.
		>YERSSTKINASE#Yersinia serine/threonine protein kinase signature.

Length = 732

Score = 30.5 bits (68), Expect = 0.032
Identities = 41/139 (29%), Positives = 57/139 (41%), Gaps = 28/139 (20%)

Query: 281 LGRLDGMAGVP------SIVETTEVDGHRYLATTRLDGDMLWQWQGKVNPLIRPGSTADE 334
L + GMA VP + EVDG R T R D W+ QGK+N G+
Sbjct: 193 LANVHGMAVVPYGNRKEEALLMDEVDGWRCSDTLRTLADS-WK-QGKINSEAYWGTIK-- 248

Query: 335 RAEFARRAMRLTRSVERLVAEMHRRGVTHGDLHPGNIL-ATDDDEARLIDFEV-ARSGTD 392
A R + +T + + GV H D+ PGN++ E +ID + +RSG
Sbjct: 249 --FIAHRLLDVTN-------HLAKAGVVHNDIKPGNVVFDRASGEPVVIDLGLHSRSGEQ 299

Query: 393 A-------AAPRRAMGGLG 404
AP +G LG
Sbjct: 300 PKGFTESFKAPELGVGNLG 318


61  CMM_0818  CMM_0825  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
CMM_0818  1       8        0.962103   conserved hypothetical protein
CMM_0819  2       8        1.549094   putative cystathionine beta-synthase
CMM_0820  5       9        0.448183   Ribonuclease H
CMM_0821  4       9        0.119352   putative transcriptional regulator, LacI-family
CMM_0822  4       9        0.255129   putative L-ribulokinase
CMM_0823  5       7        -0.918476  putative L-ribulose-5-phosphate 4-epimerase
CMM_0824  6       7        -0.781680  putative L-arabinose isomerase
CMM_0825  5       7        -0.566477  putative sugar ABC transporter,
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0818  LPSBIOSNTHSS  33     2e-04    Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein signature.

Length = 166

Score = 33.3 bits (76), Expect = 2e-04
Identities = 21/90 (23%), Positives = 35/90 (38%), Gaps = 12/90 (13%)

Query: 8 AGAFDLFHVGHLNILKHAKSRCDFLIAGVVSDEMLELNKGITPVVPLAERLEIV----SH 63
G+FD GHL+I++ D + V+ N P+ + ERLE + +H
Sbjct: 6 PGSFDPITFGHLDIIERGCRLFDQVYVAVLR------NPNKQPMFSVQERLEQIAKAIAH 59

Query: 64 ISYVDQARAETLPDKVDTWREVGFDVFFKG 93
+ E L V+ R+ +G
Sbjct: 60 LPNAQVDSFEGL--TVNYARQRQAGAILRG 87


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
CMM_0819  PF03544  35     9e-04    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 35.0 bits (80), Expect = 9e-04
Identities = 18/91 (19%), Positives = 28/91 (30%)

Query: 270 AVRPLTAPVAAPAPTPTPAPAPTTPAPGAGSGSGSSDAEQGVSLGDARTGAGSAPVGSTA 329
AV+P PV P P P P P P AP + + +
Sbjct: 65 AVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVEQPKRDVKPVESR 124

Query: 330 YGVPSDAVFVAPTGSNGGSGSKSSPYATIQR 360
P + A S+ + + S P ++
Sbjct: 125 PASPFENTAPARPTSSTATAATSKPVTSVAS 155


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
CMM_0820  PF03544  37     2e-04    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 37.3 bits (86), Expect = 2e-04
Identities = 20/121 (16%), Positives = 32/121 (26%), Gaps = 8/121 (6%)

Query: 242 VWSYLSSSSPTQAVAFDDVTVRPLTTAASPMPTPTPTPTPTP----TPTPTPTAPAPTSS 297
+ P A V P P P P P P P P AP
Sbjct: 35 TSVHQVIELPAPAQPISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVI 94

Query: 298 DPEQTVPRGDARPG----TGSGSAAVGTTTYPAPADGVYVSPTGSNTGAGTKASPYASIQ 353
+ + P+ +P + +P + + S+T + P S+
Sbjct: 95 EKPKPKPKPKPKPVKKVEQPKRDVKPVESRPASPFENTAPARPTSSTATAATSKPVTSVA 154

Query: 354 K 354

Sbjct: 155 S 155


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
CMM_0825  MICOLLPTASE  101    2e-23    Microbial collagenase metalloprotease (M9) signature.
		>MICOLLPTASE#Microbial collagenase metalloprotease (M9) signature.

Length = 1104

Score = 101 bits (253), Expect = 2e-23
Identities = 38/143 (26%), Positives = 58/143 (40%), Gaps = 8/143 (5%)

Query: 1093 QDVTVKAAPANVAPTAVVTATATDLTA---KLDGSASTDADGTVASYAWDFGDGSTGTGP 1149
T N P AV+ + ++ + DG+ S D DG + +Y WDFGDG
Sbjct: 762 NTDTNTDVHVNKEPKAVIKSDSSVIVEEEINFDGTESKDEDGEIKAYEWDFGDGEKSNEA 821

Query: 1150 TPTHAYAAGGTYTVALTVTDDKGLTGTASTQVTVVA----PPVN-REPTAVIASTTADLV 1204
TH Y G Y V LTVTD+ G T S ++ VV +N EP
Sbjct: 822 KATHKYNKTGEYEVKLTVTDNNGGINTESKKIKVVEDKPVEVINESEPNNDFEKANQIAK 881

Query: 1205 ANLDGRASSDPDGTIASWAWEFG 1227
+N+ + + + + ++
Sbjct: 882 SNMLVKGTLSEEDYSDKYYFDVA 904



Score = 92.1 bits (228), Expect = 2e-20
Identities = 36/133 (27%), Positives = 54/133 (40%), Gaps = 4/133 (3%)

Query: 1572 NQAPTAAFTSTANGLTA---SFDGSGSTDADGTVASYAWAFGDGTTGTGRTATHAYAAAG 1628
N+ P A S ++ + +FDG+ S D DG + +Y W FGDG ATH Y G
Sbjct: 772 NKEPKAVIKSDSSVIVEEEINFDGTESKDEDGEIKAYEWDFGDGEKSNEAKATHKYNKTG 831

Query: 1629 TYAVSLTVTDDKGLVSAKKDGTVQVS-APVVTPANQAPTAAFTSTAKDLTASFDASTSTD 1687
Y V LTVTD+ G ++ + V PV P F + ++ +
Sbjct: 832 EYEVKLTVTDNNGGINTESKKIKVVEDKPVEVINESEPNNDFEKANQIAKSNMLVKGTLS 891

Query: 1688 ADGTVASYAWAFG 1700
+ Y +
Sbjct: 892 EEDYSDKYYFDVA 904



Score = 90.9 bits (225), Expect = 4e-20
Identities = 41/142 (28%), Positives = 60/142 (42%), Gaps = 8/142 (5%)

Query: 1180 QVTVVAPPVNREPTAVIASTTADLVA---NLDGRASSDPDGTIASWAWEFGDGTTGAGAS 1236
T VN+EP AVI S ++ +V N DG S D DG I ++ W+FGDG A
Sbjct: 763 TDTNTDVHVNKEPKAVIKSDSSVIVEEEINFDGTESKDEDGEIKAYEWDFGDGEKSNEAK 822

Query: 1237 IAHPYAKAGTYTVALTVTDDKGATGRTTASVTVTA----PPVNQA-PVAAFTSTVANLVA 1291
H Y K G Y V LTVTD+ G + + V +N++ P F +
Sbjct: 823 ATHKYNKTGEYEVKLTVTDNNGGINTESKKIKVVEDKPVEVINESEPNNDFEKANQIAKS 882

Query: 1292 SLDASTSSDPDGTVASYAWAFG 1313
++ + + Y +
Sbjct: 883 NMLVKGTLSEEDYSDKYYFDVA 904



Score = 83.6 bits (206), Expect = 6e-18
Identities = 49/220 (22%), Positives = 73/220 (33%), Gaps = 39/220 (17%)

Query: 1012 TGRVPNQAPKAAFTQTADFLTA---SFDATGSTDGDGTVTGYAWDFGDGVQASGATQSHT 1068
T N+ PKA + + +FD T S D DG + Y WDFGDG +++ A +H
Sbjct: 767 TDVHVNKEPKAVIKSDSSVIVEEEINFDGTESKDEDGEIKAYEWDFGDGEKSNEAKATHK 826

Query: 1069 YAAAGTYPVVLTVTDDRGTTNRTQQDVT-VKAAPANV----APTAVVTATATDLTAKLDG 1123
Y G Y V LTVTD+ G N + + V+ P V P + +
Sbjct: 827 YNKTGEYEVKLTVTDNNGGINTESKKIKVVEDKPVEVINESEPNNDFEKANQIAKSNMLV 886

Query: 1124 SASTDADGTVASYAWDF-------------------------GDGST-GTGPTPTHAYAA 1157
+ + Y +D GD + T
Sbjct: 887 KGTLSEEDYSDKYYFDVAKKGNVKITLNNLNSVGITWTLYKEGDLNNYVLYATGNDGTVL 946

Query: 1158 GGTYTVA-----LTVTDDKGLTGTASTQVTVVAPPVNREP 1192
G T+ L+V +GT + V +E
Sbjct: 947 KGEKTLEPGRYYLSVYTYDNQSGTYTVNVKGNLKNEVKET 986



Score = 82.1 bits (202), Expect = 2e-17
Identities = 35/127 (27%), Positives = 52/127 (40%), Gaps = 14/127 (11%)

Query: 1268 TVTAPPVNQAPVAAFTSTVANLVA---SLDASTSSDPDGTVASYAWAFGDGTTGTGRTTT 1324
T T VN+ P A S + +V + D + S D DG + +Y W FGDG T
Sbjct: 765 TNTDVHVNKEPKAVIKSDSSVIVEEEINFDGTESKDEDGEIKAYEWDFGDGEKSNEAKAT 824

Query: 1325 HAYAAAGTFAVSLTVTDDKGLATTTTSQVTV-----------QAPASNVLAQDSFGRAVA 1373
H Y G + V LTVTD+ G T + ++ V P ++ + ++
Sbjct: 825 HKYNKTGEYEVKLTVTDNNGGINTESKKIKVVEDKPVEVINESEPNNDFEKANQIAKSNM 884

Query: 1374 TGWGTAD 1380
GT
Sbjct: 885 LVKGTLS 891



Score = 78.6 bits (193), Expect = 2e-16
Identities = 29/76 (38%), Positives = 38/76 (50%), Gaps = 3/76 (3%)

Query: 1662 NQAPTAAFTSTAKDLTA---SFDASTSTDADGTVASYAWAFGDGTTGTGKTATHAYAAAG 1718
N+ P A S + + +FD + S D DG + +Y W FGDG ATH Y G
Sbjct: 772 NKEPKAVIKSDSSVIVEEEINFDGTESKDEDGEIKAYEWDFGDGEKSNEAKATHKYNKTG 831

Query: 1719 TYAVSLTVTDDKGLAS 1734
Y V LTVTD+ G +
Sbjct: 832 EYEVKLTVTDNNGGIN 847


62  CMM_0894  CMM_0902  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
CMM_0894  3       12       -0.267793  putative peptide ABC transporter,
CMM_0895  1       11       0.432258   putative MFS permease
CMM_0896  1       12       0.546305   putative peptide ABC transporter, ATP-binding
CMM_0897  2       12       0.267209   putative peptide ABC transporter, ATP-binding
CMM_0898  -3      13       0.449958   Riboflavin synthase beta chain
CMM_0899  -1      12       -0.268035  3,4-dihydroxy-2-butanone 4-phosphate
CMM_0900  -2      13       -0.100449  diaminohydroxyphosphoribosylaminopyrimidine
CMM_0901  -2      11       -1.240008  putative hydrolase
CMM_0902  -2      10       -1.297592  putative exonuclease III
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0894  NUCEPIMERASE  32     0.001    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 32.4 bits (74), Expect = 0.001
Identities = 25/135 (18%), Positives = 47/135 (34%), Gaps = 21/135 (15%)

Query: 1 MRIIIAGGHGQIARLLERRLADQGHQPVGI---------VRNPDHASDLADAGAEALVLD 51
M+ ++ G G I + +RL + GHQ VGI LA G + +D
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKID 60

Query: 52 LE-KSGVDQVAEALRGADAVVFAAGGG-------PDSGPERKLTIDRDGAILLADAAERA 103
L + G+ + + + P + + LT G + + +
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLT----GFLNILEGCRHN 116

Query: 104 GVTRYVMISAMAVDG 118
+ + S+ +V G
Sbjct: 117 KIQHLLYASSSSVYG 131


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0896  ACETATEKNASE  492    e-177    Acetate kinase family signature.
		>ACETATEKNASE#Acetate kinase family signature.

Length = 400

Score = 492 bits (1269), Expect = e-177
Identities = 191/392 (48%), Positives = 261/392 (66%), Gaps = 5/392 (1%)

Query: 4 VLVVNSGSSSFKYQLIEMDTESVLASGLVERIGEPTGSTRHKAGGDSWERELPIADHTAG 63
+LV+N GSSS KYQLIE +VLA GL ERIG H A G+ + + + DH
Sbjct: 3 ILVINCGSSSLKYQLIESKDGNVLAKGLAERIGINDSLLTHNANGEKIKIKKDMKDHKDA 62

Query: 64 FQAMLDAF--AEHGPSLEEEPPVAVGHRVVHGGDVFVEPTVVTDDVKADIDDLSALAPLH 121
+ +LDA +++G + AVGHRVVHGG+ F ++TDDV I D LAPLH
Sbjct: 63 IKLVLDALVNSDYGVIKDMSEIDAVGHRVVHGGEYFTSSVLITDDVLKAITDCIELAPLH 122

Query: 122 NPGALQGIAAAQTAFPDVPHVAVFDTAFHQTLPAEAYTYAIDRELAAAHRIRRYGFHGTS 181
NP ++GI A PDVP VAVFDTAFHQT+P AY Y I E ++IR+YGFHGTS
Sbjct: 123 NPANIEGIKACTQIMPDVPMVAVFDTAFHQTMPDYAYLYPIPYEYYTKYKIRKYGFHGTS 182

Query: 182 HKFVSEAAARLLGKPLEETRIIVLHLGNGASAAAVQGGRSIDTSMGLTPLEGLVMGTRSG 241
HK+VS+ AA +L KP+E +II HLGNG+S AAV+ G+SIDTSMG TPLEGL MGTRSG
Sbjct: 183 HKYVSQRAAEILNKPIESLKIITCHLGNGSSIAAVKNGKSIDTSMGFTPLEGLAMGTRSG 242

Query: 242 DIDPAILFHLARHTDLGLDDLETLLNRRSGLLGLTGLG-DMRDVQRAAAD-GDEAAQTAL 299
IDP+I+ +L ++ +++ +LN++SG+ G++G+ D RD++ AA GD+ AQ AL
Sbjct: 243 SIDPSIISYLMEKENISAEEVVNILNKKSGVYGISGISSDFRDLEDAAFKNGDKRAQLAL 302

Query: 300 GVYRHRIRHYVGAYAAQLGGVDAVVFTAGVGENNPLVRRRSLAGLEFMGIGIDDDRNELI 359
V+ +R++ +G+YAA +GGVD +VFTAG+GEN P +R L GLEF+G +D ++N++
Sbjct: 303 NVFAYRVKKTIGSYAAAMGGVDVIVFTAGIGENGPEIREFILDGLEFLGFKLDKEKNKVR 362

Query: 360 SSEARFVSPEGSPVAVLVIPTDEELEIARQSL 391
E +S S V V+V+PT+EE IA+ +
Sbjct: 363 GEE-AIISTADSKVNVMVVPTNEEYMIAKDTE 393


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0897  ICENUCLEATIN  128    1e-31    Ice nucleation protein signature.
		>ICENUCLEATIN#Ice nucleation protein signature.

Length = 1258

Score = 128 bits (322), Expect = 1e-31
Identities = 142/480 (29%), Positives = 218/480 (45%), Gaps = 17/480 (3%)

Query: 1066 GAPAGDATVEFDVGVDGEEGVDSDQTAAIDVDGDGTDGTD-----GAADAAGTDATDAAG 1120
G D+T+ G G +S Q A G G+D G+ AG D++ AG
Sbjct: 200 GTAGADSTLVAGYGSTQTAGEESSQMAGYGSTQTGMKGSDLTAGYGSTGTAGDDSSLIAG 259

Query: 1121 TDATDAAGTDATDAAGTDANDAAGTDATDAAGTDATDAAGTDATDAAGADATDAAGTDAT 1180
+T AG D++ AG + A + AG +T AG D++ AG +T AG ++T
Sbjct: 260 YGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEEST 319

Query: 1181 DAAGTDATDAAGTDATDAAGTDATDAAGTDATDAAGADATDAAGTDATDAAGTDATDAAG 1240
AG +T A + AG +T AG D++ AG +T AG D++ AG +T A
Sbjct: 320 QTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQ 379

Query: 1241 TDAADAAGTDATDAAGTDATDAAGTDATDAAGTDATDAAGTDATDAAGTDAADAAGTDAT 1300
+ AG +T AG D++ AG +T AG ++T AG +T A + AG +T
Sbjct: 380 KGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGST 439

Query: 1301 DAAGTDATDAAGTDAADAAGTDAADAAGTDATDAAGTDATDAAGTDATDAAGTDATDAAG 1360
AG D++ AG + AG D++ AG +T A + AG +T AG +++ AG
Sbjct: 440 GTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAG 499

Query: 1361 TDATDAAGTDAADAAGTDATDAAD----------ATDAAGTDATDAAGTDATDAAGTDAT 1410
+T AG + AG +T A +T AG +++ AG +T A ++
Sbjct: 500 YGSTQTAGYGSTLTAGYGSTQTAQNESDLITGYGSTSTAGANSSLIAGYGSTQTASYNSV 559

Query: 1411 DAAGTDATDAAGTDAADAAGTDAADAAGTDATDAAGTDATDAAGTDAADAAGTDATDAAG 1470
AG +T A + AG + AG+D++ AG +T A ++ AG +T A
Sbjct: 560 LTAGYGSTQTAREGSDLTAGYGSTGTAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAR 619

Query: 1471 TDATDAAGTDATDAAGTDAADAAGTDATDAA--DATDAAGTDATDAADATDAADATDGST 1528
+ G +T AG D++ AG +T A ++ AG +T A A GST
Sbjct: 620 EQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTAGYGST 679



Score = 127 bits (319), Expect = 3e-31
Identities = 131/435 (30%), Positives = 204/435 (46%), Gaps = 12/435 (2%)

Query: 1106 GAADAAGTDATDAAGTDATDAAGTDATDAAGTDANDAAGTDATDAAGTDATDAAGTDATD 1165
G+ AG D+T AG +T AG +++ AG + + AG +T AG D++
Sbjct: 197 GSTGTAGADSTLVAGYGSTQTAGEESSQMAGYGSTQTGMKGSDLTAGYGSTGTAGDDSSL 256

Query: 1166 AAGADATDAAGTDATDAAGTDATDAAGTDATDAAGTDATDAAGTDATDAAGADATDAAGT 1225
AG +T AG D++ AG +T A + AG +T AG D++ AG +T AG
Sbjct: 257 IAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGE 316

Query: 1226 DATDAAGTDATDAAGTDAADAAGTDATDAAGTDATDAAGTDATDAAGTDATDAAGTDATD 1285
++T AG +T A + AG +T AG D++ AG +T AG D++ AG +T
Sbjct: 317 ESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQ 376

Query: 1286 AAGTDAADAAGTDATDAAGTDATDAAGTDAADAAGTDAADAAGTDATDAAGTDATDAAGT 1345
A + AG +T AG D++ AG + AG ++ AG +T A + AG
Sbjct: 377 TAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGY 436

Query: 1346 DATDAAGTDATDAAGTDATDAAGTDAADAAGTDATDAAD----------ATDAAGTDATD 1395
+T AG D++ AG +T AG D++ AG +T A +T AG +++
Sbjct: 437 GSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTSTAGYESSL 496

Query: 1396 AAGTDATDAAGTDATDAAGTDATDAAGTDAADAAGTDAADAAGTDATDAAGTDATDAAGT 1455
AG +T AG +T AG +T A ++ G + AG +++ AG +T A
Sbjct: 497 IAGYGSTQTAGYGSTLTAGYGSTQTAQNESDLITGYGSTSTAGANSSLIAGYGSTQTASY 556

Query: 1456 DAADAAGTDATDAAGTDATDAAGTDATDAAGTDAADAAGTDATDAA--DATDAAGTDATD 1513
++ AG +T A + AG +T AG+D++ AG +T A ++ AG +T
Sbjct: 557 NSVLTAGYGSTQTAREGSDLTAGYGSTGTAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQ 616

Query: 1514 AADATDAADATDGST 1528
A GST
Sbjct: 617 TAREQSVLTTGYGST 631



Score = 123 bits (309), Expect = 4e-30
Identities = 137/453 (30%), Positives = 212/453 (46%), Gaps = 9/453 (1%)

Query: 1085 GVDSDQTAAIDVD---GDGTDGTDGAADA--AGTDATDAAGTDATDAAGTDATDAAGTDA 1139
G S QTA D G G+ GT GA + AG +T AG ++T AG +T A +
Sbjct: 275 GYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGS 334

Query: 1140 NDAAGTDATDAAGTDATDAAGTDATDAAGADATDAAGTDATDAAGTDATDAAGTDATDAA 1199
+ AG +T AG D++ AG +T AG D++ AG +T A + AG +T A
Sbjct: 335 DLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTA 394

Query: 1200 GTDATDAAGTDATDAAGADATDAAGTDATDAAGTDATDAAGTDAADAAGTDATDAAGTDA 1259
G D++ AG +T AG ++T AG +T A + AG + AG D++ AG +
Sbjct: 395 GADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGS 454

Query: 1260 TDAAGTDATDAAGTDATDAAGTDATDAAGTDAADAAGTDATDAAGTDATDAAGTDAADAA 1319
T AG D++ AG +T A + AG + AG +++ AG +T AG + A
Sbjct: 455 TQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTA 514

Query: 1320 GTDAADAAGTDATDAAGTDATDAAGTDATDAAGTDATDAAGTDATDAAGTDAADAA--GT 1377
G + A ++ G +T AG +++ AG +T A ++ AG + A G+
Sbjct: 515 GYGSTQTAQNESDLITGYGSTSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQTAREGS 574

Query: 1378 DATDAADATDAAGTDATDAAGTDATDAAGTDATDAAGTDATDAAGTDAADAAGTDAADAA 1437
D T +T AG+D++ AG +T A ++ AG +T A + G + A
Sbjct: 575 DLTAGYGSTGTAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTA 634

Query: 1438 GTDATDAAGTDATDAAGTDAADAAGTDATDAAGTDATDAAGTDATDAAGTDAADAAGTDA 1497
G D++ AG +T AG ++ AG +T A + AG +T AG D++ AG +
Sbjct: 635 GADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGADSSLIAGYGS 694

Query: 1498 TDAA--DATDAAGTDATDAADATDAADATDGST 1528
T A ++ AG +T A + GST
Sbjct: 695 TQTAGYNSILTAGYGSTQTAQEGSDLTSGYGST 727



Score = 119 bits (298), Expect = 8e-29
Identities = 127/451 (28%), Positives = 202/451 (44%), Gaps = 15/451 (3%)

Query: 1066 GAPAGDATVEFDVGVDGEEGVDSDQTAAIDVDGDGTDGTD-----GAADAAGTDATDAAG 1120
G D+++ G G DS TA G+D G+ AG D++ AG
Sbjct: 344 GTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAG 403

Query: 1121 TDATDAAGTDATDAAGTDANDAAGTDATDAAGTDATDAAGTDATDAAGADATDAAGTDAT 1180
+T AG ++T AG + A + AG +T AG D++ AG +T AG D++
Sbjct: 404 YGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSS 463

Query: 1181 DAAGTDATDAAGTDATDAAGTDATDAAGTDATDAAGADATDAAGTDATDAAGTDATDAAG 1240
AG +T A + AG +T AG +++ AG +T AG +T AG +T A
Sbjct: 464 LTAGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTAGYGSTQTAQ 523

Query: 1241 TDAADAAGTDATDAAGTDATDAAGTDATDAAGTDATDAAGTDATDAAGTDAADAAGTDAT 1300
++ G +T AG +++ AG +T A ++ AG +T A + AG +T
Sbjct: 524 NESDLITGYGSTSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQTAREGSDLTAGYGST 583

Query: 1301 DAAGTDATDAAGTDAADAAGTDAADAAGTDATDAAGTDATDAAGTDATDAAGTDATDAAG 1360
AG+D++ AG + A ++ AG +T A + G +T AG D++ AG
Sbjct: 584 GTAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGADSSLIAG 643

Query: 1361 TDATDAAGTDAADAAGTDATDAAD----------ATDAAGTDATDAAGTDATDAAGTDAT 1410
+T AG ++ AG +T A +T AG D++ AG +T AG ++
Sbjct: 644 YGSTQTAGYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGADSSLIAGYGSTQTAGYNSI 703

Query: 1411 DAAGTDATDAAGTDAADAAGTDAADAAGTDATDAAGTDATDAAGTDAADAAGTDATDAAG 1470
AG +T A + +G + AG D++ AG +T A ++ AG +T A
Sbjct: 704 LTAGYGSTQTAQEGSDLTSGYGSTSTAGADSSLIAGYGSTQTASYHSSLTAGYGSTQTAR 763

Query: 1471 TDATDAAGTDATDAAGTDAADAAGTDATDAA 1501
+ G +T AG D++ AG +T A
Sbjct: 764 EQSVLTTGYGSTSTAGADSSLIAGYGSTQTA 794



Score = 100 bits (249), Expect = 5e-23
Identities = 106/394 (26%), Positives = 177/394 (44%), Gaps = 2/394 (0%)

Query: 1106 GAADAAGTDATDAAGTDATDAAGTDATDAAGTDANDAAGTDATDAAGTDATDAAGTDATD 1165
G+ A + G +T AG D++ AG + AG ++ AG +T A ++
Sbjct: 805 GSTQTAQERSDLTTGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDL 864

Query: 1166 AAGADATDAAGTDATDAAGTDATDAAGTDATDAAGTDATDAAGTDATDAAGADATDAAGT 1225
G +T AG D++ AG +T AG ++ AG +T A ++ G +T AG
Sbjct: 865 TTGYGSTSTAGYDSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGY 924

Query: 1226 DATDAAGTDATDAAGTDAADAAGTDATDAAGTDATDAAGTDATDAAGTDATDAAGTDATD 1285
+++ AG +T A + AG ++ A ++ AG +T AG D++ AG +T
Sbjct: 925 ESSLIAGYGSTQTASFKSTLMAGYGSSQTAREQSSLTAGYGSTSMAGYDSSLIAGYGSTQ 984

Query: 1286 AAGTDAADAAGTDATDAAGTDATDAAGTDAADAAGTDAADAAGTDATDAAGTDATDAAGT 1345
AG + AG +T A +T AG + AG D++ AG ++ +G + AG
Sbjct: 985 TAGYQSTLTAGYGSTQTAEHSSTLTAGYGSTATAGADSSLIAGYGSSLTSGIRSFLTAGY 1044

Query: 1346 DATDAAGTDATDAAGTDATDAAGTDAADAAGTDATDAADATDA--AGTDATDAAGTDATD 1403
+T +G + AG ++ +G ++ AG + A + AG ++T G +
Sbjct: 1045 GSTLISGLRSVLTAGYGSSLISGRRSSLTAGYGSNQIASHRSSLIAGPESTQITGNRSML 1104

Query: 1404 AAGTDATDAAGTDATDAAGTDAADAAGTDAADAAGTDATDAAGTDATDAAGTDAADAAGT 1463
AG ++ AG +T +G D+ AG AG D+T AG + AG ++ AG
Sbjct: 1105 IAGKGSSQTAGYRSTLISGADSVQMAGERGKLIAGADSTQTAGDRSKLLAGNNSYLTAGD 1164

Query: 1464 DATDAAGTDATDAAGTDATDAAGTDAADAAGTDA 1497
+ AG D AG + AG ++ AG +
Sbjct: 1165 RSKLTAGNDCILMAGDRSKLTAGINSILTAGCRS 1198



Score = 95.6 bits (237), Expect = 1e-21
Identities = 107/365 (29%), Positives = 161/365 (44%), Gaps = 9/365 (2%)

Query: 1166 AAGADATDAAGTDATDAAGTDATDAAGTDATDAAGTDATDAAGTDATDAAGADATDAAGT 1225
A D T TD DA ++ T + A +T +
Sbjct: 114 ACTEMQAGPGSPDVTSEVKVGNRSLPVTDDIDA-TIESGSTQPTQTIEIATYGSTLSGTH 172

Query: 1226 DATDAAGTDATDAAGTDAADAAGTDATDAAGTDATDAAGTDATDAAGTDATDAAGTDATD 1285
+ AG +T+ AG + AG +T AG D+T AG +T AG +++ AG +T
Sbjct: 173 QSQLIAGYGSTETAGDSSTLIAGYGSTGTAGADSTLVAGYGSTQTAGEESSQMAGYGSTQ 232

Query: 1286 AAGTDAADAAGTDATDAAGTDATDAAGTDAADAAGTDAADAAGTDATDAAGTDATDAAGT 1345
+ AG +T AG D++ AG + AG D++ AG +T A + AG
Sbjct: 233 TGMKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGY 292

Query: 1346 DATDAAGTDATDAAGTDATDAAGTDAADAAGTDATDAADATDAAGTDATDAAGTDATDAA 1405
+T AG D++ AG +T AG ++ AG +T A G+D T AG +T A
Sbjct: 293 GSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQK----GSDLT--AGYGSTGTA 346

Query: 1406 GTDATDAAGTDATDAAGTDAADAAGTDAADAAGTDATDAAGTDATDAAGTDAADAAGTDA 1465
G D++ AG +T AG D++ AG + A + AG +T AG D++ AG +
Sbjct: 347 GDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGYGS 406

Query: 1466 TDAAGTDATDAAGTDATDAA--GTDAADAAGTDATDAADATDAAGTDATDAADATDAADA 1523
T AG ++T AG +T A G+D G+ T D++ AG +T A + A
Sbjct: 407 TQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTA 466

Query: 1524 TDGST 1528
GST
Sbjct: 467 GYGST 471



Score = 70.6 bits (172), Expect = 5e-14
Identities = 83/306 (27%), Positives = 127/306 (41%), Gaps = 15/306 (4%)

Query: 1246 AAGTDATDAAGTDATDAAGTDATDAAGTDATDAAGTDATDAAGTDAADAAGTDATDAAGT 1305
A D T TD DA ++ T + A +T +
Sbjct: 114 ACTEMQAGPGSPDVTSEVKVGNRSLPVTDDIDA-TIESGSTQPTQTIEIATYGSTLSGTH 172

Query: 1306 DATDAAGTDAADAAGTDAADAAGTDATDAAGTDATDAAGTDATDAAGTDATDAAGTDATD 1365
+ AG + + AG + AG +T AG D+T AG +T AG +++ AG +T
Sbjct: 173 QSQLIAGYGSTETAGDSSTLIAGYGSTGTAGADSTLVAGYGSTQTAGEESSQMAGYGSTQ 232

Query: 1366 AA--GTDAADAAGTDATDAADATDAAGTDATDAAGTDATDAAGTDATDAAGTDATDAAGT 1423
G+D G+ T D++ AG +T AG D++ AG +T A + AG
Sbjct: 233 TGMKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGY 292

Query: 1424 DAADAAGTDAADAAGTDATDAAGTDATDAAGTDAADA--------AGTDATDAAGTDATD 1475
+ AG D++ AG +T AG ++T AG + AG +T AG D++
Sbjct: 293 GSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSL 352

Query: 1476 AAGTDATDAAGTDAADAAGTDATDAA----DATDAAGTDATDAADATDAADATDGSTDGG 1531
AG +T AG D++ AG +T A D T G+ T AD++ A T G
Sbjct: 353 IAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGE 412

Query: 1532 DAPTRA 1537
++ A
Sbjct: 413 ESTQTA 418


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
CMM_0900  IGASERPTASE  45     1e-06    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 45.1 bits (106), Expect = 1e-06
Identities = 56/312 (17%), Positives = 97/312 (31%), Gaps = 48/312 (15%)

Query: 414 PAPVATPAPVATPAPVAAADPAPDAKPTETGTSAPAASAPDEPSSSSDTSAPPAEAPSAT 473
+ TP + P ++ A+ E P AP PS +++T A ++ S T
Sbjct: 994 TTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPP---APATPSETTETVAENSKQESKT 1050

Query: 474 APAAPVTFEQLRDSWPSVVEAVEKAKRSAWLVAVTATPRALADDVLTLSFVSANDAERFK 533
EQ +A E ++ VA A A+ N+ +
Sbjct: 1051 VEKN----EQ---------DATETTAQNR-EVAKEAKSNVKAN-------TQTNEVAQSG 1089

Query: 534 ERGAPGQGVSDILRTAILDVLGIRVKFIARVEPHGGAGAP--AGTPAPTGGGSASPAPEA 591
Q T + + + A+VE P +P S + P+A
Sbjct: 1090 SETKETQ------TTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQA 1143

Query: 592 SRPAASPATSTGTRPEGGSASTTSAAVSPTASTAVTTPAASPPATKAPPARTTPAGGGWA 651
PA + + +TT+ P T+ P T++ T +
Sbjct: 1144 -EPARENDPTVNIKEPQSQTNTTADTEQP---AKETSSNVEQPVTESTTVNTGN-----S 1194

Query: 652 TVAIPTSDPGAAEAPAVRA-----PASRPERSAPAAPAAPAAPAAPPTAPAAAPRATAPS 706
V P + A P V + P +R RS + P ++ + A
Sbjct: 1195 VVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATT--SSNDRSTVALCDL 1252

Query: 707 VPARASSVVPDA 718
++V+ DA
Sbjct: 1253 TSTNTNAVLSDA 1264



Score = 36.2 bits (83), Expect = 6e-04
Identities = 37/190 (19%), Positives = 59/190 (31%), Gaps = 11/190 (5%)

Query: 313 PEDELERMRAQAVAFGAVELSRAADVVNAALTEMTGATSPRLHLELLVARVLVPASDDTH 372
P + VA + + S+ + TE T A+ V A+ T+
Sbjct: 1028 PAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNR----EVAKEAKSNVKANTQTN 1083

Query: 373 RGALARVERLERRVGVADAGADPAPAAVAPVATPTPAPAAAPAPVATPAPVATPAPVAAA 432
A + E E + A A V T +P + A
Sbjct: 1084 EVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQA 1143

Query: 433 DPAPDAKPTETGTSAPAASAPDEPSSSSDTSAPPAEAPSATAPAAPVTFEQLRDSWPSVV 492
+PA + PT + ++++DT P E S PVT ++ SVV
Sbjct: 1144 EPARENDPTVN-----IKEPQSQTNTTADTEQPAKETSSNVEQ--PVTESTTVNTGNSVV 1196

Query: 493 EAVEKAKRSA 502
E E +
Sbjct: 1197 ENPENTTPAT 1206



Score = 33.1 bits (75), Expect = 0.005
Identities = 30/233 (12%), Positives = 54/233 (23%), Gaps = 21/233 (9%)

Query: 594 PAASPATSTGTRPEGGSASTTSAAVSPTASTAVTTPAASPPATKAPPARTTPAGGGWATV 653
+ T + + S + + + A A PPA TP+
Sbjct: 993 DTTNITTPNNIQADVPSVPSNNEEI-----------ARVDEAPVPPPAPATPSETTETVA 1041

Query: 654 AIPTSDPGAAEAPAVRAPASRPERSAPAAPAAPAAPAAPPTAPAAAPRATAPSVPARASS 713
+ E A + + A A A T A + +
Sbjct: 1042 ENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETK 1101

Query: 714 VVPDAH-------VPDFEEPEPDEFGPAEPGWATGGASPDSAPPVARSVPAQQQQPAAAA 766
+ + P P A P + P + +
Sbjct: 1102 ETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQ 1161

Query: 767 SGTGSA---PRPDTAAAAPAASPAAAPQRYGESVVRELLQATFIEEKPVERKA 816
+ T + P +T++ + G SVV T +P
Sbjct: 1162 TNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSE 1214



Score = 30.8 bits (69), Expect = 0.030
Identities = 27/139 (19%), Positives = 42/139 (30%), Gaps = 22/139 (15%)

Query: 398 AAVAPVATPTPAPAAAPAPVATPAPVATPAPVAAADPAPDAKPTETGTSAPA---ASAPD 454
V +P + P A PA P T T PA +S +
Sbjct: 1123 PKVTSQVSPKQEQSETVQPQAEPAR--ENDPTVNIKEPQSQTNTTADTEQPAKETSSNVE 1180

Query: 455 EPSSSSDTSAP---PAEAPSATAPAAPVTFEQLRDSWPSVVEAVEKAKRSAWLVAVTATP 511
+P + S T E P T PA + P+V ++ +V + P
Sbjct: 1181 QPVTESTTVNTGNSVVENPENTTPA---------TTQPTVNSESSNKPKNRHRRSVRSVP 1231

Query: 512 RALADDVLTLSFVSANDAE 530
+ + S+ND
Sbjct: 1232 HN-----VEPATTSSNDRS 1245



Score = 30.4 bits (68), Expect = 0.038
Identities = 28/189 (14%), Positives = 51/189 (26%), Gaps = 3/189 (1%)

Query: 570 AGAPAGTPAPTGGGSASPAPEASRPAASPATSTGTRPEGGSASTTSAAVSPTASTAVTTP 629
A P+ P+ + EA P +PAT + T S + T
Sbjct: 1005 ADVPS---VPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATET 1061

Query: 630 AASPPATKAPPARTTPAGGGWATVAIPTSDPGAAEAPAVRAPASRPERSAPAAPAAPAAP 689
A A VA S+ + + A+ +
Sbjct: 1062 TAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQE 1121

Query: 690 AAPPTAPAAAPRATAPSVPARASSVVPDAHVPDFEEPEPDEFGPAEPGWATGGASPDSAP 749
T+ + + + +V +A + + +EP+ A+ S +
Sbjct: 1122 VPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQ 1181

Query: 750 PVARSVPAQ 758
PV S
Sbjct: 1182 PVTESTTVN 1190


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_0902  CARBMTKINASE  30     0.016    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 30.2 bits (68), Expect = 0.016
Identities = 20/107 (18%), Positives = 35/107 (32%), Gaps = 21/107 (19%)

Query: 112 RIVDVTPGRVRDALGEGAIAIVAGFQGFNRGTGDITTLGRGGS----------DTTAVAL 161
V+ ++ + G I I +G G + + G D L
Sbjct: 172 GHVEAET--IKKLVERGVIVIASG-------GGGVPVILEDGEIKGVEAVIDKDLAGEKL 222

Query: 162 AAALGADVCEIYTDVDGIFTADPRVVPLARKIDRITSEEMLELAASG 208
A + AD+ I TDV+G + + + EE+ + G
Sbjct: 223 AEEVNADIFMILTDVNGAALYYGT--EKEQWLREVKVEELRKYYEEG 267


63  CMM_0991  CMM_0996  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
CMM_0991  -2      10       1.350912   conserved hypothetical protein
CMM_0992  -1      9        1.586835   hypothetical protein
CMM_0993  -1      9        1.601364   putative L-asparagine permease, APC family
CMM_0994  -3      9        0.081423   hypothetical protein
CMM_0995  -2      7        0.434106   putative MFS permease
CMM_0996  -2      7        0.745862   putative transcriptional regulator, MarR family
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
CMM_0991  RTXTOXIND  31     0.019    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 30.6 bits (69), Expect = 0.019
Identities = 16/42 (38%), Positives = 26/42 (61%), Gaps = 1/42 (2%)

Query: 513 GSVDTSGGGSVTSPMQATVVK-LAVEEGQQVVKGDLLVVLEA 553
G + SG P++ ++VK + V+EG+ V KGD+L+ L A
Sbjct: 88 GKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTA 129


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
CMM_0994  PF01206  27     0.035    SirA family protein
		>PF01206#SirA family protein

Length = 76

Score = 27.0 bits (60), Expect = 0.035
Identities = 11/46 (23%), Positives = 22/46 (47%), Gaps = 4/46 (8%)

Query: 358 PVV---DALAELRPAQVVYVACDPVALARDVALFAER-GYELRSVR 399
P++ LA + +V+YV +D F+++ G+EL +
Sbjct: 18 PILKAKKTLATMNAGEVLYVMATDPGSVKDFESFSKQTGHELLEQK 63


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
CMM_0995  HTHFIS  44     2e-07    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 44.4 bits (105), Expect = 2e-07
Identities = 26/115 (22%), Positives = 48/115 (41%), Gaps = 4/115 (3%)

Query: 19 RVAVVDDHESVRLGLKAACLDAGFEFILAAANARELVEGLVGRECDVVVLDLSLGDGSSV 78
+ V DD ++R L A AG++ + +NA L + + D+VV D+ + D ++
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYD-VRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 79 TDNVKA--AQGTGAAVLVHSIADRVASVREALAAGAAGVIPKSSATQTVMAAVAT 131
D + VLV S + + +A GA +PK ++ +
Sbjct: 64 -DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR 117


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
CMM_0996  PF06580  36     2e-04    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 36.0 bits (83), Expect = 2e-04
Identities = 16/87 (18%), Positives = 36/87 (41%), Gaps = 7/87 (8%)

Query: 335 NSLQHAGTSAETITRTLVISGHGDEG-VAIDVVDDGVGFDQRRIPTERLGLRVSIKERVA 393
N ++H G + +++ G D G V ++V + G + + GL+ +++ER+
Sbjct: 266 NGIKH-GIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKESTGTGLQ-NVRERLQ 323

Query: 394 QAGG---LVTIDSAPGEGTAVRIRWPA 417
G + + G+ + P
Sbjct: 324 MLYGTEAQIKLSEKQGKVN-AMVLIPG 349


64  CMM_1037  CMM_1049  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
CMM_1037  -2      7        1.580400   putative K+ channel protein, beta subunit
CMM_1038  -2      7        0.877118   putative potassium channel protein subunit
CMM_1039  0       7        2.357167   putative transcriptional regulator, TetR family
CMM_1040  0       7        1.890977   putative short chain dehydrogenase
CMM_1041  1       7        2.817456   conserved membrane protein, TerC family
CMM_1042  1       7        3.494102   putative membrane protein
CMM_1043  2       10       3.991634   putative Zn-dependant hydrolase
CMM_1044  0       10       3.581825*  putative serine protease, family S51
CMM_1045  0       8        2.010360   putative uracil-DNA glycosylase
CMM_1046  1       8        2.118782   putative acetyltransferase
CMM_1047  0       8        2.192738   putative membrane bound protease
CMM_1048  0       9        1.209064   putative hexulose-6-phosphate synthase
CMM_1049  0       14       -0.811116  putative 6-phospho-3-hexuloisomerase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
CMM_1037  RTXTOXINA  31     0.025    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 31.1 bits (70), Expect = 0.025
Identities = 11/49 (22%), Positives = 23/49 (46%), Gaps = 1/49 (2%)

Query: 520 LDKTGTVTTGVMSLVRAVAAAGVDADELVRVAAALEARSEHPVARAVVE 568
L K G + + + + A + + DEL++ + S +A+A +E
Sbjct: 139 LGKAGGILSTFQNFL-GTALSSMKIDELIKKQKSGGNVSSSELAKASIE 186


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
CMM_1042  TCRTETA  53     7e-10    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 53.3 bits (128), Expect = 7e-10
Identities = 58/311 (18%), Positives = 110/311 (35%), Gaps = 21/311 (6%)

Query: 41 LGLFVALLPPIIVSLALKVAEVAPDDTAGTLSLVLGLGALVALVVNPLAGRLSDRTPGRF 100
+GL + +LP ++ L V +D ++L L AL+ P+ G LSD RF
Sbjct: 21 IGLIMPVLPGLLRDL------VHSNDVTAHYGILLALYALMQFACAPVLGALSD----RF 70

Query: 101 GMRRPWIVGGVVLGYGALILLTQATTVLALVGAWML--VQGAFNAAIAALIA-VMADSAR 157
G RRP ++ + ++ A + L ++ + GA A A IA + R
Sbjct: 71 G-RRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIADITDGDER 129

Query: 158 PRNRGRVAAAIGVAQNGSLVVGTFIVQLFTTTTQQVLVPGAIGVAVVLAFALAFRDRVLT 217
R+ G ++A G V+G + + A+ L +
Sbjct: 130 ARHFGFMSACFGFGMVAGPVLGGLMGGF--SPHAPFFAAAALNGLNFLTGCFLLPESHKG 187

Query: 218 ERPTSRLSLKELLGSFVFDPRRNPDFGWAWLMRFLLTASAVTATNYLAFYLIDDLGVAQA 277
ER R L SF + + F++ + D
Sbjct: 188 ERRPLRREALNPLASFRWARGMTV-VAALMAVFFIMQLVGQVPAALWVIFGEDRFHWDAT 246

Query: 278 DVANAVFVATLFNVIGVVSTTFVAGWLSDRLGRRKVFVAAAALVAVIGLVILALAPSLAV 337
+ F ++ ++ + G ++ RLG R+ + + G ++LA A +
Sbjct: 247 TIG---ISLAAFGILHSLAQAMITGPVAARLGERR-ALMLGMIADGTGYILLAFATRGWM 302

Query: 338 VYVAQLVIGAG 348
+ +++ +G
Sbjct: 303 AFPIMVLLASG 313


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
CMM_1043  HTHTETR  74     1e-18    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 74.3 bits (182), Expect = 1e-18
Identities = 39/206 (18%), Positives = 80/206 (38%), Gaps = 12/206 (5%)

Query: 8 RRYAKGAARRREILEAALALIAERGYSASSLQEIADAVGISKAGVLHYFDSREALIAAVL 67
+ + R+ IL+ AL L +++G S++SL EIA A G+++ + +F + L + +
Sbjct: 4 KTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIW 63

Query: 68 EERDAHSAADYRE------AVPDGDPADMVGMLLRASSHNADTPGLVALYSRLVVDAAGA 121
E +++ E P +++ +L ++ L+ +
Sbjct: 64 ELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEM 123

Query: 122 EHPA---HSYIADRYDRVVGSVAAQVRALGVELPAGLDPDSFARVAVAVSDGLQLQWSYR 178
+ + YDR+ ++ + A LPA L A + GL W +
Sbjct: 124 AVVQQAQRNLCLESYDRIEQTLKHCIEA--KMLPADLMTRRAAIIMRGYISGLMENWLFA 181

Query: 179 PE-IDMRDALERAIRALSGGVLPLPS 203
P+ D++ + L L P+
Sbjct: 182 PQSFDLKKEARDYVAILLEMYLLCPT 207


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_1044  CHLAMIDIAOM6  33     0.003    Chlamydia cysteine-rich outer membrane protein 6 si...
		>CHLAMIDIAOM6#Chlamydia cysteine-rich outer membrane protein 6 signature.

Length = 547

Score = 32.7 bits (74), Expect = 0.003
Identities = 26/88 (29%), Positives = 39/88 (44%), Gaps = 7/88 (7%)

Query: 78 VTNRGSRTLRGVVRDAWQPSAGASSTRDRV------RIPAGERRAIRLSLTPTRRGERRT 131
+ N+G+ T R VV + P A S+ RV + GE R I + P +RG R T
Sbjct: 233 IVNQGTATARNVVVENPVPDGYAHSSGQRVLTFTLGDMQPGEHRTITVEFCPLKRG-RAT 291

Query: 132 ERVTIRSAGPLGLAARQATLLSPGAVRV 159
T+ G A T+++ V+V
Sbjct: 292 NIATVSYCGGHKNTASVTTVINEPCVQV 319


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
CMM_1045  HTHFIS  43     1e-06    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 42.5 bits (100), Expect = 1e-06
Identities = 38/151 (25%), Positives = 66/151 (43%), Gaps = 17/151 (11%)

Query: 17 GKAVVGQDGAVTGMI--IALLARG--HVLLEGVPGVAKTLLVRALSEA-LRLDTARVQFT 71
G +VG+ A+ + +A L + +++ G G K L+ RAL + R + V
Sbjct: 136 GMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALHDYGKRRNGPFVAIN 195

Query: 72 PDLMPGDITGSLVYDSREGAFS---------FRRGPVFTSILLADEINRTPPKTQSALLE 122
+P D+ S ++ +GAF+ F + T L DEI P Q+ LL
Sbjct: 196 MAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGT--LFLDEIGDMPMDAQTRLLR 253

Query: 123 AMEERQVTVDGESHALP-DPFLVAATQNPVE 152
+++ + T G + D +VAAT ++
Sbjct: 254 VLQQGEYTTVGGRTPIRSDVRIVAATNKDLK 284


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
CMM_1049  HTHFIS  91     2e-23    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 91.0 bits (226), Expect = 2e-23
Identities = 33/139 (23%), Positives = 60/139 (43%), Gaps = 1/139 (0%)

Query: 3 ARILVVDDDTALAEMIGIVLRTEGFEPSFCGDGGQALAAFHDAKPDLVLLDLMLPGLDGI 62
A ILV DDD A+ ++ L G++ + DLV+ D+++P +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 63 QVCDLIRAE-SGVPIIMLTAKSDTADVVKGLESGADDYIVKPFNPKELVARIRTRLRPAA 121
+ I+ +P+++++A++ +K E GA DY+ KPF+ EL+ I L
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 122 AASPGLLQVGDLVVDVEGH 140
L + + G
Sbjct: 124 RRPSKLEDDSQDGMPLVGR 142


65  CMM_1054  CMM_1061  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
CMM_1054  -1      8        -1.646730  putative siderophore-interacting protein
CMM_1055  2       12       -0.056254  putative short chain alcohol dehydrogenase
CMM_1056  3       13       0.210291   hypothetical protein
CMM_1057  3       10       -0.218396  putative pyridine nucleotide-disulphide
CMM_1058  2       9        -1.093721  putative enoyl-CoA hydratase
CMM_1059  2       10       -1.999835  putative F420-dependent NADP reductase
CMM_1060  2       16       -1.106053  conserved hypothetical protein
CMM_1061  1       14       -0.073460  conserved hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CMM_1054  SECA  1125   0.0      SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 1125 bits (2912), Expect = 0.0
Identities = 416/876 (47%), Positives = 563/876 (64%), Gaps = 38/876 (4%)

Query: 1 MASVLEKVLRVGEGRTLRKLQNYAKAVNQLEEDFTHLTDEELKNETVELRERHANGESLD 60
+ +L KV RTLR+++ +N +E + L+DEELK +T E R R GE L+
Sbjct: 2 LIKLLTKVFGSRNDRTLRRMRKVVNIINAMEPEMEKLSDEELKGKTAEFRARLEKGEVLE 61

Query: 61 DLLPEAFAAVREASRRTLGLRHFDVQIMGGAALHLGNIAEMKTGEGKTLVATLPAYLNAI 120
+L+PEAFA VREAS+R G+RHFDVQ++GG L+ IAEM+TGEGKTL ATLPAYLNA+
Sbjct: 62 NLIPEAFAVVREASKRVFGMRHFDVQLLGGMVLNERCIAEMRTGEGKTLTATLPAYLNAL 121

Query: 121 ASRGVHVITVNDYLASYQSELMGRVFRALGMTTGVILAGQTPQQRREQYAADITYGTNNE 180
+GVHV+TVNDYLA +E +F LG+T G+ L G +RE YAADITYGTNNE
Sbjct: 122 TGKGVHVVTVNDYLAQRDAENNRPLFEFLGLTVGINLPGMPAPAKREAYAADITYGTNNE 181

Query: 181 FGFDYLRDNMAWQASDMVQRGHFFAVVDEVDSILIDEARTPLIISGPSAGDANRWFTEFA 240
+GFDYLRDNMA+ + VQR +A+VDEVDSILIDEARTPLIISGP+ + +
Sbjct: 182 YGFDYLRDNMAFSPEERVQRKLHYALVDEVDSILIDEARTPLIISGPAEDSSE-MYKRVN 240

Query: 241 TVAKRLV-----------PEVDYEVDEKKRTVGVLEAGIEKVEDHLGI-------DNLYE 282
+ L+ E + VDEK R V + E G+ +E+ L ++LY
Sbjct: 241 KIIPHLIRQEKEDSETFQGEGHFSVDEKSRQVNLTERGLVLIEELLVKEGIMDEGESLYS 300

Query: 283 SANTPLISFLNNSIKAKALFKKDKDYVVMNGEVLIVDEHTGRILMGRRYNEGIHQAIEAK 342
AN L+ + +++A ALF +D DY+V +GEV+IVDEHTGR + GRR+++G+HQA+EAK
Sbjct: 301 PANIMLMHHVTAALRAHALFTRDVDYIVKDGEVIIVDEHTGRTMQGRRWSDGLHQAVEAK 360

Query: 343 EGVAVKAENQTLATVTLQNYFRLYKKLSGMTGTAETEAAEFMSTYKLGVVPIPTNRPMQR 402
EGV ++ ENQTLA++T QNYFRLY+KL+GMTGTA+TEA EF S YKL V +PTNRPM R
Sbjct: 361 EGVQIQNENQTLASITFQNYFRLYEKLAGMTGTADTEAFEFSSIYKLDTVVVPTNRPMIR 420

Query: 403 KDQSDLIYKNEKAKFEQVVEDIAERHAAGQPVLVGTTSVEKSEYLSKLLAKKGVRHEVLN 462
KD DL+Y E K + ++EDI ER A GQPVLVGT S+EKSE +S L K G++H VLN
Sbjct: 421 KDLPDLVYMTEAEKIQAIIEDIKERTAKGQPVLVGTISIEKSELVSNELTKAGIKHNVLN 480

Query: 463 AKNHAREAAIVAQAGRLGSVTVATNMAGRGTDIMLGGNAEFLAVAAMNARGLSPVETPEQ 522
AK HA EAAIVAQAG +VT+ATNMAGRGTDI+LGG+ + VAA+
Sbjct: 481 AKFHANEAAIVAQAGYPAAVTIATNMAGRGTDIVLGGSWQA-EVAAL------------- 526

Query: 523 YETEWDDVFADVKAEVDEEAAKVIEAGGLYVLGTERHESRRIDNQLRGRSGRQGDPGESR 582
E + +KA+ V+EAGGL+++GTERHESRRIDNQLRGRSGRQGD G SR
Sbjct: 527 -ENPTAEQIEKIKADWQVRHDAVLEAGGLHIIGTERHESRRIDNQLRGRSGRQGDAGSSR 585

Query: 583 FYLSLTDDLMRLFNNGAAASLMGRDSVPDDVAIESKVVSRAIRSAQGQVEARNAEIRKNV 642
FYLS+ D LMR+F + + +M + + AIE V++AI +AQ +VE+RN +IRK +
Sbjct: 586 FYLSMEDALMRIFASDRVSGMMRKLGMKPGEAIEHPWVTKAIANAQRKVESRNFDIRKQL 645

Query: 643 LKYDDVLNRQREAIYGDRRHILEGDDLQERSQRFLEAVIDDVLDSHIG-EGNGDDWDFDA 701
L+YDDV N QR AIY R +L+ D+ E E V +D++I + + WD
Sbjct: 646 LEYDDVANDQRRAIYSQRNELLDVSDVSETINSIREDVFKATIDAYIPPQSLEEMWDIPG 705

Query: 702 LWTELKTLYPISITIDEVITEAGSKGRVNRDFVRREILSDAKLAYSKREEQLGGAAMREL 761
L LK + + + I E + + ++ + +R IL+ + Y ++EE +G MR
Sbjct: 706 LQERLKNDFDLDLPIAEWLDKEPE---LHEETLRERILAQSIEVYQRKEEVVGAEMMRHF 762

Query: 762 ERRVVLSVIDRRWREHLYEMDYLKDGIGLRAMAQRDPLVEYQREGFALFQQMMGAIREET 821
E+ V+L +D W+EHL MDYL+ GI LR AQ+DP EY+RE F++F M+ +++ E
Sbjct: 763 EKGVMLQTLDSLWKEHLAAMDYLRQGIHLRGYAQKDPKQEYKRESFSMFAAMLESLKYEV 822

Query: 822 VGFLFNLEVEVQAPADAESVGPRIQAKGLAANQATA 857
+ L ++V + + R++A+ LA Q +
Sbjct: 823 ISTLSKVQVRMPEEVEELEQQRRMEAERLAQMQQLS 858


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
CMM_1055  INTIMIN  27     0.045    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 27.3 bits (60), Expect = 0.045
Identities = 18/82 (21%), Positives = 30/82 (36%), Gaps = 3/82 (3%)

Query: 36 SAPAQASSAVAGSTPA-TSTATTATVRGTVRKRFDVDDFFGRQPCSSQDLPPSGPLLENL 94
SAP A+ VAG T T + T + ++ +Q S S L +
Sbjct: 129 SAPLVAAGGVAGHTNKLTKMSPDVTKSNMTDDK--ALNYAAQQAASLGSQLQSRSLNGDY 186

Query: 95 TRCVIEILAGARELDQIARWVS 116
+ +AG + Q+ W+
Sbjct: 187 AKDTALGIAGNQASSQLQAWLQ 208


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
CMM_1059  PF06580  52     3e-09    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 51.8 bits (124), Expect = 3e-09
Identities = 38/222 (17%), Positives = 81/222 (36%), Gaps = 37/222 (16%)

Query: 286 ELRHQERELITKDATIREIHHRVK-----NNLQTVASLLRIQARRSHTEEAREALGHAQR 340
E+ + + ++A + + ++ N L + +L+ +ARE L
Sbjct: 148 EIDQWKMASMAQEAQLMALKAQINPHFMFNALNNIRALILEDP-----TKAREMLTS--- 199

Query: 341 RVGAIAVVHDTLSEGLNQNVDFDAVFDRVLLLIAEVA------SAHNTRVHPKIVGSFGV 394
LSE + ++ + R + L E+ + + ++ +
Sbjct: 200 -----------LSELMRYSLRYSN--ARQVSLADELTVVDSYLQLASIQFEDRLQFENQI 246

Query: 395 LPSAYATPL-ALALTELVTNAVEHGLAGRS--GEVAIEAARTEETLTVSVRDDG-VGLPE 450
P+ + + + LV N ++HG+A G++ ++ + T+T+ V + G + L
Sbjct: 247 NPAIMDVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKN 306

Query: 451 GKVGTGLGTQIVRTLIQGELGGTIDWHTLM-GSGTEVTIEVP 491
K TG G Q VR +Q G + +P
Sbjct: 307 TKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_1061  ACRIFLAVINRP  28     0.017    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 27.9 bits (62), Expect = 0.017
Identities = 23/134 (17%), Positives = 50/134 (37%), Gaps = 29/134 (21%)

Query: 19 RSPAVVLLTALIALEALGMAGVTALLVVDLLTSTPSSVASAVALIALAALAAVLLASVVR 78
P V+L + + + + A + + V + I L+A A+L+ +
Sbjct: 895 SIPVSVMLVVPLGI----VGVLLAATLFNQKNDVYFMVG-LLTTIGLSAKNAILIVEFAK 949

Query: 79 GILR-------------GRSWVRP-----AAVTWQVLQIAVGAGSLQGADARQDLG---- 116
++ R +RP A VL +A+ G+ G+ A+ +G
Sbjct: 950 DLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGA--GSGAQNAVGIGVM 1007

Query: 117 WGLIVPSVLVLVLL 130
G++ ++L + +
Sbjct: 1008 GGMVSATLLAIFFV 1021


66  CMM_1109  CMM_1115  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
CMM_1109  5       28       -8.562334  conserved hypothetical protein
CMM_1110  4       29       -8.475603  putative ATP-dependent RNA helicase
CMM_1111  0       8        -3.241403  conserved hypothetical protein
CMM_1112  -1      9        -2.745441  putative membrane protein
CMM_1113  -1      9        -2.381453  conserved hypothetical protein
CMM_1114  -1      8        -0.934004  putative ATP-dependant DNA helicase/nuclease
CMM_1115  -2      8        -0.584610  putative ATP-dependent DNA helicase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_1109  ACETATEKNASE  29     0.032    Acetate kinase family signature.
		>ACETATEKNASE#Acetate kinase family signature.

Length = 400

Score = 28.6 bits (64), Expect = 0.032
Identities = 14/32 (43%), Positives = 18/32 (56%), Gaps = 4/32 (12%)

Query: 184 DIIVFTNGVAE----IDEAGERGLASLGIRVD 211
D+IVFT G+ E I E GL LG ++D
Sbjct: 324 DVIVFTAGIGENGPEIREFILDGLEFLGFKLD 355


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
CMM_1112  PF06340  32     0.004    Vibrio cholerae toxin co-regulated pilus biosynthesis pr...
		>PF06340#Vibrio cholerae toxin co-regulated pilus biosynthesis protein F (TcpF)

Length = 338

Score = 31.9 bits (72), Expect = 0.004
Identities = 22/89 (24%), Positives = 36/89 (40%), Gaps = 12/89 (13%)

Query: 69 GEDRDGLEIFTKVYFPT------GAKGHNDTGLSRKHIMASID-----GSLERLQTDHVD 117
G RD LE KV FPT A ++ ++ G + LQ ++D
Sbjct: 58 GMSRDYLENCVKVSFPTSQDMFYDAYSSTESDGAKTRTKEDFSARLLAGDYDSLQKLYID 117

Query: 118 LYQAHR-YDYETPLEETMQAFADVVRQGK 145
Y A +D+E P + ++ + +GK
Sbjct: 118 FYLAQTTFDWEIPTRDQIETLVNYANEGK 146


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
CMM_1114    HTHTETR    61    9e-14    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 61.2 bits (148), Expect = 9e-14
Identities = 27/174 (15%), Positives = 58/174 (33%), Gaps = 7/174 (4%)

Query: 3 EPRITERGRRTRQRIIEATGEQILATGIGGTTLDDVRAATLTSKSQLFHYFPGGKIELVR 62
+ + + TRQ I++ G+ T+L ++ A ++ ++ +F K +L
Sbjct: 2 ARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHF-KDKSDLFS 60

Query: 63 EVAEWEGRQLLEAQEPHIHDLGSWESWEAWRTGL-----VDYYIGRGRWACPIGSLATQA 117
E+ E + E + + R L R R I +
Sbjct: 61 EIWELSESNIGELELEYQAKFPGD-PLSVLREILIHVLESTVTEERRRLLMEIIFHKCEF 119

Query: 118 AAVDAELERTIAASMRAWRAFLARGVERMREAGLVAATADPERIATVILAAIQG 171
A +++ + + ++ EA ++ A R A ++ I G
Sbjct: 120 VGEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISG 173


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
CMM_1115    DHBDHDRGNASE    107    6e-30    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 107 bits (267), Expect = 6e-30
Identities = 85/253 (33%), Positives = 120/253 (47%), Gaps = 13/253 (5%)

Query: 17 LAGRTAVITGSTSGIGEAVARVLAASGAEVVVSGRDADRARAVVAAITAAGGTAHAVPAD 76
+ G+ A ITG+ GIGEAVAR LA+ GA + + ++ VV+++ A A A PAD
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPAD 65

Query: 77 LGGGYDGIRAFARDAAAALGGRVDILVNNAGVYPVGATAGLADDDLDGILAVNVRAPHVL 136
+ AR G +DILVN AGV G L+D++ + +VN
Sbjct: 66 VRDSAAIDEITARIEREM--GPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNA 123

Query: 137 VAELAPAMAERGTGAIVDVGSWMSRVGIPFGAAYTASKAAIEQMTRTWAAEYGPRGVRVN 196
++ M +R +G+IV VGS + V AAY +SKAA T+ E +R N
Sbjct: 124 SRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCN 183

Query: 197 SVAPGATSTP-------GNAADADVLAAMTAGTVAGVPV----RPVDIAFAVRFLVSDEA 245
V+PG+T T V+ G+P+ +P DIA AV FLVS +A
Sbjct: 184 IVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQA 243

Query: 246 AFVHGARLDVDGG 258
+ L VDGG
Sbjct: 244 GHITMHNLCVDGG 256


67    CMM_1136    CMM_1147    N    Y    N    Pathogenicity Island (unbiased-composition)
LocusTag    DNBias    CDNBias    %GCBias    Product
CMM_1136181.907126putative transcriptional regulator, TetR family
CMM_1137162.728104putative carboxylesterase
CMM_1138283.164991hypothetical protein
CMM_1139-171.745469putative hydrolase
CMM_1140-170.762073putative acetyltransferase
CMM_1141081.454411hypothetical membrane protein
CMM_1142-171.153294putative ATP-dependent RNA helicase
CMM_1143-161.172795putative methyltransferase
CMM_1145-270.831667putative transcriptional regulator, Cro/CI
CMM_1146-171.257613hypothetical membrane protein
CMM_1147-270.831519hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
CMM_1137    cloacin    45    9e-07    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 44.7 bits (105), Expect = 9e-07
Identities = 29/78 (37%), Positives = 34/78 (43%), Gaps = 8/78 (10%)

Query: 610 GGGMMGGRGGGDAFGGAIMGGIIGGLLSGGGGWGGG--------GGGGGFSGGGFGGGGG 661
GG G G + G I GG G + GG G G GGG G GG G
Sbjct: 3 GGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGH 62

Query: 662 FSGGGFGGGGGGGFSGGD 679
+GGG G GGG +GG+
Sbjct: 63 GNGGGNGNSGGGSGTGGN 80



Score = 43.9 bits (103), Expect = 2e-06
Identities = 28/73 (38%), Positives = 32/73 (43%), Gaps = 6/73 (8%)

Query: 613 MMGGRGGGDAFGGAIMGGIIGGLLSGGGGWGGGGGGGGFS------GGGFGGGGGFSGGG 666
M GG G G G G I G +G G GG G G+S GGG G G + GG
Sbjct: 1 MSGGDGRGHNTGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGS 60

Query: 667 FGGGGGGGFSGGD 679
G GGG + G
Sbjct: 61 GHGNGGGNGNSGG 73



Score = 42.4 bits (99), Expect = 5e-06
Identities = 27/71 (38%), Positives = 29/71 (40%), Gaps = 6/71 (8%)

Query: 603 GGGYGGYGGGMMGGRGGGDAFGGAIMGGIIGGLLSGGGGWGGGGGGGGFSGGGFGGGGGF 662
G G G+ G G D G + GG G WGGG G G GGG G
Sbjct: 17 SGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGHGN------GGGNGN 70

Query: 663 SGGGFGGGGGG 673
SGGG G GG
Sbjct: 71 SGGGSGTGGNL 81



Score = 35.1 bits (80), Expect = 0.001
Identities = 25/73 (34%), Positives = 31/73 (42%), Gaps = 3/73 (4%)

Query: 599 DLDQGGGYGGYGGGMMGGRGGG---DAFGGAIMGGIIGGLLSGGGGWGGGGGGGGFSGGG 655
+++ G G GGG G G + +GG GI G SG G GG G GG SG G
Sbjct: 19 NINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGHGNGGGNGNSGGGSGTG 78

Query: 656 FGGGGGFSGGGFG 668
+ FG
Sbjct: 79 GNLSAVAAPVAFG 91


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
CMM_1142    DHBDHDRGNASE    122    4e-36    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 122 bits (308), Expect = 4e-36
Identities = 72/241 (29%), Positives = 117/241 (48%), Gaps = 15/241 (6%)

Query: 11 LRGRTALVTGAASGIGAAIARHFVLAGAEVVLLDLSPAVHEEARRIGAVGAVTA-----D 65
+ G+ A +TGAA GIG A+AR GA + +D +P E+ A A D
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPAD 65

Query: 66 VSDEARVAGAVAEAGRMLGRIDVLVNSHGILTEAPVAEMTLATWQRTIDVDLTSVFLLTR 125
V D A + A R +G ID+LVN G+L + ++ W+ T V+ T VF +R
Sbjct: 66 VRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASR 125

Query: 126 AVLPGMLERRDGRIITVASQLGQKGGVGLAHYAAAKAGVIAFTKSLALEVSGSNVLANVI 185
+V M++RR G I+TV S +A YA++KA + FTK L LE++ N+ N++
Sbjct: 126 SVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIV 185

Query: 186 APGPISTPLVDAISVDWKDAKRR----------ELPLGRFGTVDEVAPTAVLLAADPDGN 235
+PG T + ++ D A++ +PL + ++A + L + G+
Sbjct: 186 SPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGH 245

Query: 236 L 236
+
Sbjct: 246 I 246


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
CMM_1143    DHBDHDRGNASE    116    1e-33    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 116 bits (292), Expect = 1e-33
Identities = 73/253 (28%), Positives = 111/253 (43%), Gaps = 11/253 (4%)

Query: 7 VALVTGGASGIGRDTAVRLAARGDRVVVGRFPGDPHDAGATLEAVRAVGGTGIAVDLDVA 66
+A +TG A GIG A LA++G + +P + +++A A DV
Sbjct: 10 IAFITGAAQGIGEAVARTLASQGAHIAA--VDYNPEKLEKVVSSLKAEARHAEAFPADVR 67

Query: 67 STPSVDAFVRAALEEYGQVDHVIAAAGILRRAPLGAMTDERWDEVLGVDLGGVMRVIRAA 126
+ ++D E G +D ++ AG+LR + +++DE W+ V+ GV R+
Sbjct: 68 DSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSV 127

Query: 127 EPLL--GRGSSIVAVSSIAGGVYGWGDHAHYATAKAGVVGLVRSAAVELAPRGIRANTVI 184
+ R SIV V S GV A YA++KA V + +ELA IR N V
Sbjct: 128 SKYMMDRRSGSIVTVGSNPAGVPRTSMAA-YASSKAAAVMFTKCLGLELAEYNIRCNIVS 186

Query: 185 PGLIETPQSLDAVNSLGPDGLRRAGDR------IPAGRVGRPEEVASVIAFLASEDAGYV 238
PG ET G IP ++ +P ++A + FL S AG++
Sbjct: 187 PGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGHI 246

Query: 239 TGQTLTVDGGLTI 251
T L VDGG T+
Sbjct: 247 TMHNLCVDGGATL 259


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
CMM_1144    TCRTETA    32    0.005    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 32.1 bits (73), Expect = 0.005
Identities = 32/146 (21%), Positives = 62/146 (42%), Gaps = 9/146 (6%)

Query: 44 FLAWTIAVYDFILFGTL---LPDISRDFGWDTSESLLVSTLVSVGTAVVVL---LVGPMV 97
+ + D + G + LP + RD + L+++ + ++G +
Sbjct: 8 IVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGALS 67

Query: 98 DRLGRRIGMVVSVTGTALSSGATALSAGAVSLVGIRSISGLGLAEQSINATYLNEIYEQT 157
DR GRR ++VS+ G A+ A + L R ++G+ A ++ Y+ +I T
Sbjct: 68 DRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIADI---T 124

Query: 158 EDERIRRNKGFVYAMVQTGWPLGALL 183
+ + R+ GF+ A G G +L
Sbjct: 125 DGDERARHFGFMSACFGFGMVAGPVL 150



Score = 30.2 bits (68), Expect = 0.016
Identities = 30/132 (22%), Positives = 45/132 (34%), Gaps = 8/132 (6%)

Query: 329 GWLGDRLGRRNVIVGGWLVAGLAFAVMLLGPDDPTFVMGAYMVGLFFLLGPYAAILFFQA 388
G L DR GRR V++ A + +A+M P +G + G+ G A
Sbjct: 64 GALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIADI 123

Query: 389 GCFDSDCRATG--SSFIGAMSQPGAIIGGFLLTGLTASAMSFGQAALWVGAGGILASALV 446
D R G S+ G G ++GG + G + F AA L
Sbjct: 124 TDGDERARHFGFMSACFGFGMVAGPVLGGLM--GGFSPHAPFFAAAAL----NGLNFLTG 177

Query: 447 MLLAKPTGETRE 458
L + +
Sbjct: 178 CFLLPESHKGER 189


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
CMM_1147    PF06438    31    0.006    Heme acquisition protein HasAp
		>PF06438#Heme acquisition protein HasAp

Length = 205

Score = 30.7 bits (69), Expect = 0.006
Identities = 13/52 (25%), Positives = 21/52 (40%), Gaps = 2/52 (3%)

Query: 237 ATEEVLRA-GHAPDALFATND-FAAIGAMGALHDAGLSVPDDVALVGYNDTP 286
A + + A A D + N F + A G H + +V +VG + P
Sbjct: 147 ALQGQIDALLKAVDPSLSINSTFDQLAAAGVAHATPAAAAAEVGVVGVQELP 198


68    CMM_1237    CMM_1242    N    Y    N    Pathogenicity Island (unbiased-composition)
LocusTag    DNBias    CDNBias    %GCBias    Product
CMM_1237082.3326582-oxoglutarate dehydrogenase E1 component
CMM_1238-161.486827conserved membrane protein
CMM_1239-18-0.687667conserved membrane protein
CMM_1240-27-1.480472putative oxidoreductase
CMM_1241-17-1.317002conserved hypothetical protein
CMM_1242-28-2.178779conserved hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
CMM_1237    NUCEPIMERASE    41    7e-06    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 40.5 bits (95), Expect = 7e-06
Identities = 17/80 (21%), Positives = 28/80 (35%), Gaps = 13/80 (16%)

Query: 28 LVVGATGISGSALVDQLTAEGWDVLAL-------------SRRAGADRPGVRWISADLRS 74
LV GA G G + +L G V+ + +R +PG ++ DL
Sbjct: 4 LVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDLAD 63

Query: 75 ADDLRRALAGEQPSHVFFTA 94
+ + A VF +
Sbjct: 64 REGMTDLFASGHFERVFISP 83


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
CMM_1240    ABC2TRNSPORT    33    7e-04    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 33.4 bits (76), Expect = 7e-04
Identities = 46/197 (23%), Positives = 72/197 (36%), Gaps = 13/197 (6%)

Query: 49 PVIMLSIFSVAFGSSGNLGTAPDGSGGVSAAAYYLPGMIA-AGILLSGVQNLAVDIAMER 107
P+I L FG LG GGVS A+ GM+A + + + + +
Sbjct: 42 PLIYL------FGLGAGLGVMVGRVGGVSYTAFLAAGMVATSAMTAATFETIYAAFGRME 95

Query: 108 SDGTLKRLAGSPLPVLSYFIGKGGQVIVTSLLQVVVLLLVARFAFGVELPTDAGRWATFA 167
T + + + L + +G+ + L + +VA + + A
Sbjct: 96 GQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGAGIGVVAAALGYTQWLSLLYALPVIA 155

Query: 168 WVYGLGITSSAVLGIALSRIPRSGA-SATAVITPIVLVLQFISGVYLTFTMLPTWLQDVA 226
GL S ++ AL+ T VITPI+ F+SG LP Q A
Sbjct: 156 LT-GLAFASLGMVVTALAPSYDYFIFYQTLVITPIL----FLSGAVFPVDQLPIVFQTAA 210

Query: 227 AFLPLKWMAQGMRAVFL 243
FLPL +R + L
Sbjct: 211 RFLPLSHSIDLIRPIML 227


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
CMM_1241    PF06580    45    4e-07    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 44.9 bits (106), Expect = 4e-07
Identities = 55/336 (16%), Positives = 111/336 (33%), Gaps = 51/336 (15%)

Query: 83 ALVFLALVIPAATWGVASLSSFAVVQCVLCPLVWLLLDRVRDAVIGTLVLTGSIAVGFVV 142
+L+ L L ++ + ++ + ++ +L VIG + + ++ ++
Sbjct: 49 SLMGLVLTHAYRSF----IKRQGWLKLNMGQIILRVLPAC--VVIGMVWFVANTSIWRLL 102

Query: 143 GFGDLPGALPTMALSQGLSLVGTIALGLWITRIADLSRERLELLEGLRAAQAQVEELGRE 202
F + T+ L+ + + +W + + + Q ++ + +E
Sbjct: 103 AFINTKPVAFTLPLALSIIFNVVVVTFMW--SLLYFGWHFFKNYKQAEIDQWKMASMAQE 160

Query: 203 AGTARERERLSADIH-----DTVAQDLTGLVMLAQRGRRELRGGDADATGATLAELEEGA 257
A + L A I+ + L + L D L L E
Sbjct: 161 A----QLMALKAQINPHFMFNA----LNNIRALILE--------DPTKAREMLTSLSELM 204

Query: 258 RDALTQTRA-IVAATAPMELTDG-LGQALARLGERLSREAGIPVEVRADPGVGSVDRDAE 315
R +L + A V+ + + D L A + +RL E I + D +
Sbjct: 205 RYSLRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQINPAIM----------DVQ 254

Query: 316 VVLLRCAQEGLANVRRHA-----DASAVELVLDRDGGDVVLVIRDDGRGF-DPARASGGY 369
V + Q + N +H + L +D G V L + + G + S G
Sbjct: 255 VPPM-LVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKESTGT 313

Query: 370 GLDGMRRRLAAAGG---RLDVASGPDGTRLTARIPA 402
GL +R RL G ++ ++ IP
Sbjct: 314 GLQNVRERLQMLYGTEAQIKLSEKQGKVNAMVLIPG 349


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
CMM_1242    HTHFIS    66    9e-15    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 66.0 bits (161), Expect = 9e-15
Identities = 32/116 (27%), Positives = 52/116 (44%), Gaps = 4/116 (3%)

Query: 6 IRVAVVDDHPVVRGGLAALLASAD-DIDVVGQAADGKAAVELAIAERPDVVLMDLRMPVL 64
+ V DD +R L L+ A D+ + AA + A D+V+ D+ MP
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIA---AGDGDLVVTDVVMPDE 60

Query: 65 DGVGATARIREEAPDVRVLVLTTYETDASILTAIEAGASGYLLKAAPEEEILAGVR 120
+ RI++ PD+ VLV++ T + + A E GA YL K E++ +
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIG 116


69    CMM_1330    CMM_1338    N    Y    N    Pathogenicity Island (unbiased-composition)
LocusTag    DNBias    CDNBias    %GCBias    Product
CMM_1330-190.449699conserved hypothetical protein
CMM_1331-17-0.028116DNA polymerase IV
CMM_1332-170.679757putative membrane protein
CMM_1333-180.647969UDP-glucose 6-dehydrogenase
CMM_1334-29-0.692390maltooligosyl trehalose trehalohydrolase
CMM_1335-28-0.433425maltooligosyltrehalose synthase
CMM_1336-28-0.903315putative glucan debranching enzyme
CMM_1337-29-1.147194conserved hypothetical protein
CMM_1338-19-1.809077putative ABC-type transporter, permease
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
CMM_1330    TCRTETA    53    1e-09    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 52.5 bits (126), Expect = 1e-09
Identities = 83/401 (20%), Positives = 134/401 (33%), Gaps = 37/401 (9%)

Query: 18 VALLALAMGGFAIGTTEFVAMGLLPQLAADLLPEVAARSTEAANAQAGTLISAYALGVVV 77
V L +A+ IG + M +LP L DL+ + A G L++ YAL
Sbjct: 9 VILSTVALDAVGIG----LIMPVLPGLLRDLVH------SNDVTAHYGILLALYALMQFA 58

Query: 78 GAPTIAAASARAPRRKLLLWLLLAFTLGTVLSAILPSFGLVVLARFVAGLPHGAYFGIAS 137
AP + A S R RR +LL L + + A P ++ + R VAG+ GA +A
Sbjct: 59 CAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGIT-GATGAVAG 117

Query: 138 LVAAQLMGEGKRARGVAFVLAGLTIANVIGVPIVTWIGQNAGWRVAYLVVAAIFAATFVA 197
A + +RAR F+ A V G + +G + AA+ F+
Sbjct: 118 AYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGG-FSPHAPFFAAAALNGLNFLT 176

Query: 198 VFLAVPAQAGNPEATLRRE----------LRAFTRLQVWLALLIGAIGFGGFFAVYTFVS 247
+P LRRE R T + +A+ G A
Sbjct: 177 GCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPA--ALWV 234

Query: 248 PMVTEVTGLPEWSVPLALVVVGL------GMTAGNLAGGWWADRDVKAALLSLFGLLIVS 301
+ ++ ++L G+ M G +A +R AL+
Sbjct: 235 IFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVA-ARLGER---RALMLGMIADGTG 290

Query: 302 LVGLVLTASNPVGLFGFLFLIGGSAAALSPGIQIRL-MDVAHDSQSIAAALNHSALNTGN 360
+ L + + L G P +Q L V + Q + + +
Sbjct: 291 YILLAFATRGWMAFPIMVLLASGGIG--MPALQAMLSRQVDEERQGQLQGSLAALTSLTS 348

Query: 361 AVGAALGGVTVAAGLGYTSPAIVGVGLSIAGLLIALASFGL 401
VG L AA + + G ++ L + GL
Sbjct: 349 IVGPLLFTAIYAASITTWNGWAWIAGAALYLLCLPALRRGL 389


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
CMM_1331    HTHFIS    39    1e-05    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 38.7 bits (90), Expect = 1e-05
Identities = 24/118 (20%), Positives = 48/118 (40%), Gaps = 5/118 (4%)

Query: 8 ITLAIVDDHRMLLGALTEWIRNAASDIEMVAAVSTWPDLLTHPRFPVDVVLLDLDLKDNL 67
T+ + DD + L + + A D+ + + +T + D+V+ D+ + D
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIA--AGDGDLVVTDVVMPDEN 61

Query: 68 PISLKIATLKT--TGVKTVLMSTYSEPNVVREALASGALGYLVKSEDASMIVDAIRLA 123
L + +K + ++MS + +A GA YL K D + ++ I A
Sbjct: 62 AFDL-LPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRA 118


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
CMM_1334    HTHFIS    43    5e-07    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 42.9 bits (101), Expect = 5e-07
Identities = 31/141 (21%), Positives = 51/141 (36%), Gaps = 7/141 (4%)

Query: 13 QRVRVALVDDHVLLLDGLSARLSRPRTGVEVVATSPTWNGLVRDDRFPEAFDVVVLDLAL 72
+ + DD + L+ LSR G +V TS D+VV D+ +
Sbjct: 2 TGATILVADDDAAIRTVLNQALSR--AGYDVRITSNAATLWRWIAA--GDGDLVVTDVVM 57

Query: 73 --RDEVPVAQKIRTLTGAGLTSVLLSTHADPSTIHGAMRAGASAVVPKAESSEELIASIH 130
+ + +I+ L +++S T A GA +PK ELI I
Sbjct: 58 PDENAFDLLPRIKKA-RPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIG 116

Query: 131 AAADGTPRQSALVQQAMQDFH 151
A R+ + ++ QD
Sbjct: 117 RALAEPKRRPSKLEDDSQDGM 137


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
CMM_1337    DHBDHDRGNASE    72    5e-17    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 72.0 bits (176), Expect = 5e-17
Identities = 55/190 (28%), Positives = 82/190 (43%), Gaps = 11/190 (5%)

Query: 5 KRVVVTGASSGIGAATVRLFRSRGWDVVGVARREDRLRAL--AEETGATYAVADLTVQAD 62
K +TGA+ GIG A R S+G + V ++L + + + A +A A D
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRD 68

Query: 63 VDALRDH----LRETGHVHALVNNAGGAVGTDSVEGGSSEDWAWMYEINVLAVRRVTSAL 118
A+ + RE G + LVN G + + S E+W + +N V + ++
Sbjct: 69 SAAIDEITARIEREMGPIDILVN-VAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSV 127

Query: 119 LPLLRAGVPEGGSADIVTVSSIAAHVPYEGGGGYNAAKAAAHAMLGVLRLELAGEPIRVI 178
+ GS IVTV S A VP Y ++KAAA L LELA IR
Sbjct: 128 SKYMMD--RRSGS--IVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCN 183

Query: 179 EIAPGQVRTE 188
++PG T+
Sbjct: 184 IVSPGSTETD 193


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
CMM_1338    HTHTETR    55    1e-11    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 55.4 bits (133), Expect = 1e-11
Identities = 21/92 (22%), Positives = 35/92 (38%)

Query: 22 RRGPRTSGDARASIIEAARMLFIESGADRVSARRIAAAAGVDPSLVRYYFGSLEALLEEA 81
R+ + + + R I++ A LF + G S IA AAGV + ++F L E
Sbjct: 3 RKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEI 62

Query: 82 LRPSEDLIAPYLRLRELPVEERGAALVAAALH 113
SE I + +++ L
Sbjct: 63 WELSESNIGELELEYQAKFPGDPLSVLREILI 94


70    CMM_1456    CMM_1467    N    Y    N    Pathogenicity Island (unbiased-composition)
LocusTag    DNBias    CDNBias    %GCBias    Product
CMM_145618-1.508609putative phosphatase
CMM_1457091.780608oxygen-independent coproporphyrinogen III
CMM_14580111.827257conserved hypothetical protein
CMM_1459-1111.510794conserved hypothetical protein
CMM_1460-2101.295775putative transcriptional regulator, HrcA family
CMM_1461-2101.005650chaperone
CMM_1462-390.429963conserved hypothetical protein
CMM_1463112-3.285233putative HIT family hydrolase
CMM_1464-19-2.604161putative phosphate starvation-induced ATPase
CMM_1465-19-2.163145putative metal-dependent hydrolase
CMM_1466010-1.861890conserved membrane protein
CMM_1467010-1.190061GTP-binding protein, era family
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
CMM_1456    PF03544    32    0.003    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 32.3 bits (73), Expect = 0.003
Identities = 21/122 (17%), Positives = 37/122 (30%), Gaps = 12/122 (9%)

Query: 317 PVQQPEQPAPRKRAARKVAQRQPGTPAGSTSPVVRGAAGSRPDPRSTQMIPAQDPAPARD 376
P Q + P + P P + + + +P P+ + P RD
Sbjct: 62 PPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEK----PKPKPKPKPKPVKKVEQPKRD 117

Query: 377 ADPDEDEVTATWTVLPEDPASAEPSTSTSADADTGSGSEDAAPPKPPRAPRQPRKPATPP 436
P E + + P+ TS+ A + + PRA + +P P
Sbjct: 118 VKPVESRPASP-------FENTAPARPTSSTATAATSKPVTSVASGPRALSRN-QPQYPA 169

Query: 437 EE 438

Sbjct: 170 RA 171


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
CMM_1461    HELNAPAPROT    75    5e-20    Helicobacter neutrophil-activating protein A family ...
		>HELNAPAPROT#Helicobacter neutrophil-activating protein A family signature.

Length = 153

Score = 74.9 bits (184), Expect = 5e-20
Identities = 34/153 (22%), Positives = 68/153 (44%), Gaps = 10/153 (6%)

Query: 12 ASADVAAGVAQFLSPVVVNLQALAVNGKQAHWHVRGANFIGVHEFLDVLVAHAQDWADTA 71
+ V L+ + N L + HW+V+G +F +HE + L HA + DT
Sbjct: 5 NAKTNQTLVENSLNTQLSNWFLLYSKLHRFHWYVKGPHFFTLHEKFEELYDHAAETVDTI 64

Query: 72 AERVVALGLPIDARIETVAASTTTAPLTPGFRPSSATIAEVIAQIDATMELVNRAVQEL- 130
AER++A+G + TV T A +T G + + +E++ + + ++ + +
Sbjct: 65 AERLLAIGG---QPVATVKEYTEHASITDG--GNETSASEMVQALVNDYKQISSESKFVI 119

Query: 131 ----GEIDVNSQDVAIEIARGLEKDRWFLFAHI 159
D + D+ + + +EK W L +++
Sbjct: 120 GLAEENQDNATADLFVGLIEEVEKQVWMLSSYL 152


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
CMM_1462    RTXTOXINA    33    0.020    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 32.6 bits (74), Expect = 0.020
Identities = 45/229 (19%), Positives = 84/229 (36%), Gaps = 54/229 (23%)

Query: 300 AGLTVAAGLLVGAVPT-------TDAGTRVALALAAASAVALVHAAPRSRAAALPDALER 352
AGL +G+L + D T+ A + + V + +
Sbjct: 240 AGLDTVSGILSAISASFILSNADADTRTKAAAGVELTTKVL-----GNVGKGISQYIIAQ 294

Query: 353 RVLRVVGVATATAAVVAGAASAAAPDAVPALPLLVAAVASGAH-AWALTRPAFRAELLGR 411
R + + + A A ++A A + A PL ++A A + + R + LG
Sbjct: 295 RAAQGLSTSAAAAGLIASAVTLAIS------PLSFLSIADKFKRANKIEEYSQRFKKLG- 347

Query: 412 VTDGDA----DPRLTGDDVPPSAEASVELVAAPDDATDPAPTSADPLRVLAAAVAGVGAS 467
DGD+ + TG + +AS+ ++ VLA+ +G+ A+
Sbjct: 348 -YDGDSLLAAFHKETG-----AIDASLTTIST----------------VLASVSSGISAA 385

Query: 468 AA-----VPVTALMGGPALLTFCMQLVAAAVVAAALDVAARRLHDPVVA 511
A PV+AL+G +T + + A A + A ++ D +
Sbjct: 386 ATTSLVGAPVSALVGA---VTGIISGILEASKQAMFEHVASKMADVIAE 431


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
CMM_1463    IGASERPTASE    32    0.005    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 32.3 bits (73), Expect = 0.005
Identities = 22/81 (27%), Positives = 31/81 (38%), Gaps = 5/81 (6%)

Query: 408 VDADGKVVDVTEFTKPVVRDADAVS-----EEPADADAEAVVADAPAEEAAEAPAAEEAP 462
V+ + VD T T P AD S EE A D V APA + E
Sbjct: 985 VEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENS 1044

Query: 463 AEKPKKKAPAKKKASEKAADS 483
++ K ++ A+E A +
Sbjct: 1045 KQESKTVEKNEQDATETTAQN 1065


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
CMM_1467    HTHFIS    31    0.012    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 30.6 bits (69), Expect = 0.012
Identities = 12/42 (28%), Positives = 20/42 (47%), Gaps = 3/42 (7%)

Query: 118 KSNILLIGPTGCGKTYLAQTL---AKRLNVPFAVADATALTE 156
+++ G +G GK +A+ L KR N PF + A+
Sbjct: 160 DLTLMITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPR 201


71    CMM_1490    CMM_1496    N    Y    Y    Pathogenicity Island (unbiased-composition)
LocusTag    DNBias    CDNBias    %GCBias    Product
CMM_1490-210-0.187845hypothetical protein
CMM_1491-180.802968*putative polypeptide deformylase
CMM_14922100.840178conserved membrane protein, putative transporter
CMM_14932110.758965putative glycosyltransferase
CMM_14941100.622168*putative membrane-bound acyltransferase
CMM_14951110.661664putative GDP-mannose 4,6-dehydratase
CMM_14962112.328228putative glycosyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
CMM_1490    TYPE3IMRPROT    26    0.019    Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 25.9 bits (57), Expect = 0.019
Identities = 10/30 (33%), Positives = 13/30 (43%)

Query: 33 AGEIIVRQRGTHFHPGVNVGRGGDDTLFAL 62
AGEII Q G F V+ + + A
Sbjct: 98 AGEIIGLQMGLSFATFVDPASHLNMPVLAR 127


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
CMM_1492    CARBMTKINASE    40    4e-06    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 40.2 bits (94), Expect = 4e-06
Identities = 28/125 (22%), Positives = 47/125 (37%), Gaps = 8/125 (6%)

Query: 134 LPIVNENDTVATHEIRFGDNDRLAALVARLVDADLLLLLSDVDALYTRPPEEPGARRIEH 193
+P++ E+ + E D D +A V+AD+ ++L+DV+ + +
Sbjct: 197 VPVILEDGEIKGVEAVI-DKDLAGEKLAEEVNADIFMILTDVNGAA-LYYGTEKEQWLRE 254

Query: 194 VGFGDELDGVEIGSTGTGVGTGGAVTKVAAA-RLAAEAGTGVLLTSTAQVHSALAGEHVG 252
V + E G G KV AA R G ++ + AL G+ G
Sbjct: 255 VKVEELRKYYEEG----HFKAGSMGPKVLAAIRFIEWGGERAIIAHLEKAVEALEGK-TG 309

Query: 253 TWFAP 257
T P
Sbjct: 310 TQVLP 314


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
CMM_1495    LPSBIOSNTHSS    31    0.001    Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein signature.

Length = 166

Score = 31.3 bits (71), Expect = 0.001
Identities = 19/68 (27%), Positives = 28/68 (41%), Gaps = 5/68 (7%)

Query: 11 IGVMGGTFDPIHNGHLVAASEVQQHLQL-DEVIFVPTGQPWQKQTVTDGEHRYLMTVIAT 69
+ G+FDPI GHL +++ +L D+V P KQ + + R A
Sbjct: 2 NAIYPGSFDPITFGHL---DIIERGCRLFDQVYVAVLRNP-NKQPMFSVQERLEQIAKAI 57

Query: 70 AANPRFTV 77
A P V
Sbjct: 58 AHLPNAQV 65


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
CMM_1496    IGASERPTASE    37    3e-04    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 36.6 bits (84), Expect = 3e-04
Identities = 45/292 (15%), Positives = 80/292 (27%), Gaps = 22/292 (7%)

Query: 69 ERPATPTTADAEASSTSRGDGSGSRPDEATTPPAVPLSAPGIGPDTSAILLSGGVLTRRQ 128
TP A+ S + +R DEA PP P + P + ++ +
Sbjct: 995 TNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPAT-----PSETTETVAENSKQESK 1049

Query: 129 LRAIRE--AEEAAREQRGSEHDEAQSTPAGSETGSTPSDVEPTAAAVDDAPPSSSETSVP 186
E A E + R + + A ++T T ++
Sbjct: 1050 TVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKE 1109

Query: 187 PYARYGRGSRATPRPRAPYGSAYRRGTAAEATGAPAADAPAADVEAQEKASDATVPADRE 246
A+ S + + A A V +E S AD E
Sbjct: 1110 EKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTE 1169

Query: 247 ATPAETPSAWPFAPIVPSGDAAGRDVDASGEQPTVPPLAAHSRRSDEDPRPTPSLDATPW 306
PA+ S+ P+ + ++ E P + +PT + +++
Sbjct: 1170 Q-PAKETSSNVEQPV--TESTTVNTGNSVVENPEN--------TTPATTQPTVNSESSNK 1218

Query: 307 VTGSAEPALEPAFGTRASAAEDLPDDRTETPDADSASTVPAPTDSAAPDARA 358
++ A D T T++ DARA
Sbjct: 1219 PKNRHRRSVRSVPHNVEPATTSSNDRSTVA----LCDLTSTNTNAVLSDARA 1266


72    CMM_1553    CMM_1562    N    Y    N    Pathogenicity Island (unbiased-composition)
LocusTag    DNBias    CDNBias    %GCBias    Product
CMM_1553-111-0.594702putative 3-oxoacyl-CoA thiolase
CMM_1554010-0.685893putative ribonuclease D
CMM_1555211-0.991130conserved hypothetical protein
CMM_15565111.109508putative pyrimidine reductase
CMM_15574101.205570conserved hypothetical protein putatively
CMM_15582100.684004Thiosulfate sulfurtransferase
CMM_15591100.416044putative ATPase
CMM_1560-19-0.022116putative ammonium transporter, Amt family
CMM_1561-180.413894probable oxidoreductase
CMM_1562-19-0.325191hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
CMM_1553    UREASE    32    0.005    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 32.0 bits (73), Expect = 0.005
Identities = 18/54 (33%), Positives = 28/54 (51%), Gaps = 2/54 (3%)

Query: 348 IRAITANPAAILRLDHRVGALTPGLDADVVIWSGDPLDVQSRAEHVIIGGSTVY 401
I T NPA L H +G+L G AD+V+W +P + + V++GG+
Sbjct: 406 IAKYTINPAIAHGLSHEIGSLEVGKRADLVLW--NPAFFGVKPDMVLLGGTIAA 457



Score = 29.7 bits (67), Expect = 0.026
Identities = 17/49 (34%), Positives = 28/49 (57%), Gaps = 8/49 (16%)

Query: 40 LRDGLVAAVGRAGDVEVPEGAT--------VIDASGRWVLPGFVEAHGH 80
L+DG +AA+G+AG+ ++ G T VI G+ V G +++H H
Sbjct: 90 LKDGRIAAIGKAGNPDMQPGVTIIVGPGTEVIAGEGKIVTAGGMDSHIH 138


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
CMM_1554    NUCEPIMERASE    34    4e-04    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 34.4 bits (79), Expect = 4e-04
Identities = 53/290 (18%), Positives = 89/290 (30%), Gaps = 57/290 (19%)

Query: 14 VVAGASGFIGQVLVRELGDEGYLVRTI-------------------GRSGADACWGD--- 51
+V GA+GFIG + + L + G+ V I + G D
Sbjct: 4 LVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDLAD 63

Query: 52 AEGIRRLLD--GADLLVNLAGKSVNCRYGAANRREILRSRVETTAELARAVADSARPVPM 109
EG+ L + + + RY N S + + + +
Sbjct: 64 REGMTDLFASGHFERVFISPHRL-AVRYSLENPHAYADSNLTGFLNILEGCRHNK--IQH 120

Query: 110 WINASTATVYRHATDRPQTESTGELGDDFSPSVGRAWERAFFAPELPATRRVALRIAIVL 169
+ AS+++VY P + + S +A A +A + +
Sbjct: 121 LLYASSSSVYGLNRKMPFSTD-DSVDHPVSL----------YAATKKANELMAHTYSHLY 169

Query: 170 GRDGALQPLLGLARFGLGGPQLDGRFPGRP----SRIRAGAHHGYQPT---HGRQVFSWL 222
G P GL F + GP GRP + G +G+ +
Sbjct: 170 G-----LPATGLRFFTVYGPW------GRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFT 218

Query: 223 HIDDLV-GIIRFVRDTPTLEGPVNASSPAPVTNRRLMKVLRRAVGMPVGL 271
+IDD+ IIR P + + P + +V PV L
Sbjct: 219 YIDDIAEAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVEL 268


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
CMM_1555    TCRTETOQM    111    7e-28    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 111 bits (280), Expect = 7e-28
Identities = 52/141 (36%), Positives = 76/141 (53%), Gaps = 7/141 (4%)

Query: 19 IRNFCIIAHIDHGKSTLADRMLQMTGVVD---SRSMRAQYLDRMDIERERGITIKSQAVR 75
I N ++AH+D GK+TL + +L +G + S D +ER+RGITI++
Sbjct: 3 IINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITS 62

Query: 76 MPWELDGQTYALNMIDTPGHVDFSYEVSRSLAACEGAILLVDAAQGIEAQTLANLYLALE 135
WE +N+IDTPGH+DF EV RSL+ +GAILL+ A G++AQT + +
Sbjct: 63 FQWE----NTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRK 118

Query: 136 NDLTIIPVLNKIDLPAADPDK 156
+ I +NKID D
Sbjct: 119 MGIPTIFFINKIDQNGIDLST 139



Score = 88.8 bits (220), Expect = 2e-20
Identities = 50/246 (20%), Positives = 95/246 (38%), Gaps = 21/246 (8%)

Query: 147 IDLPAADPDKYAAELASL-IGGDPSDVLRVSGKTGAGVEDLLDRVSRTIPAPVGDPDAAA 205
+ + + + E + V S K G+++L++ ++ + +
Sbjct: 190 MSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVITNKFYSSTHRGQSEL 249

Query: 206 RAMIFDSVYDAYRGVVTYVRMIDGKLSPREKISMMSTRATHEILEIGVS-SPEPTPSDGL 264
+F Y R + Y+R+ G L R+ + +S + +I E+ S + E D
Sbjct: 250 CGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSV-RISEKEKIKITEMYTSINGELCKIDKA 308

Query: 265 GVGEVGYL---ITGVKDVRQSKVGDTVTTAARPATEALPGYTEPLPMVFSGLYPIDGSDY 321
GE+ L + V +GDT R E PLP++ + + P
Sbjct: 309 YSGEIVILQNEFLKLNSV----LGDTKLLPQRERIEN------PLPLLQTTVEPSKPQQR 358

Query: 322 PDLRDALDKLKLSDAAL-VYEPETSVALGFGFRCGFLGLLHLEIITERLSREFGLDLITT 380
L DAL ++ SD L Y + + FLG + +E+ L ++ +++
Sbjct: 359 EMLLDALLEISDSDPLLRYYVDSATHEIIL----SFLGKVQMEVTCALLQEKYHVEIEIK 414

Query: 381 APSVIY 386
P+VIY
Sbjct: 415 EPTVIY 420



Score = 32.9 bits (75), Expect = 0.004
Identities = 15/77 (19%), Positives = 29/77 (37%), Gaps = 2/77 (2%)

Query: 414 EPVVKAAILAPKDYVGTIMELCQSRRGILLGMEYLGEDRVEVRYTMPLGEIVFDFFDNLK 473
EP + I AP++Y+ ++ + L + V + +P I ++ +L
Sbjct: 537 EPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQ-LKNNEVILSGEIPARCI-QEYRSDLT 594

Query: 474 SKTAGYASLDYEPAGSQ 490
T G + E G
Sbjct: 595 FFTNGRSVCLTELKGYH 611


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
CMM_1556    ECOLNEIPORIN    29    0.022    E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 28.6 bits (64), Expect = 0.022
Identities = 11/30 (36%), Positives = 15/30 (50%), Gaps = 1/30 (3%)

Query: 1 MRRSIIALVIGAALAAGGA-VALPAPADAL 29
M++S+IAL + A A A V L A
Sbjct: 1 MKKSLIALTLAALPVAAMADVTLYGTIKAG 30


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_1560  PYOCINKILLER  31     0.004    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 30.9 bits (69), Expect = 0.004
Identities = 15/53 (28%), Positives = 24/53 (45%), Gaps = 2/53 (3%)

Query: 149 EARIRSDLTRRAADALAGASVVIIGDTPADGVAAEAAGFPFVAVATGAYDVGQ 201
EA+ +++ R A+ A+ + V A AAG + VA GA + Q
Sbjct: 229 EAKRKAEEQARQQAAIRAANTYAMPANG--SVVATAAGRGLIQVAQGAASLAQ 279


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
CMM_1562  PF07675  28     0.019    Cleaved Adhesin
		>PF07675#Cleaved Adhesin

Length = 1358

Score = 27.8 bits (61), Expect = 0.019
Identities = 26/95 (27%), Positives = 37/95 (38%), Gaps = 19/95 (20%)

Query: 34 EPNLPGQDAAWLGDPVSGLGIILQRADAPKAGKNR-----------VHLDLAPDDRTRDQ 82
EP+ P Q + L G + L + DAP A K + + + P + R
Sbjct: 329 EPS-PYQPVSNLTATAQGQKVTL-KWDAPSAKKAEGSREVKRIGDGLFVTIEPANDVRAN 386

Query: 83 EVDRVLALGAALVADRRNADGSGWVVLADPEGNEF 117
E VLA AD D +G+ L D + N F
Sbjct: 387 EAKVVLA------ADNVWGDNTGYQFLLDADHNTF 415


73  CMM_1942  CMM_1949  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
CMM_1942  -2      7        1.367699   hypothetical protein
CMM_1943  -2      9        1.142123   hypothetical protein
CMM_1944  -1      7        1.008112   hypothetical protein
CMM_1945  -1      8        0.568666   hypothetical protein
CMM_1946  -2      10       0.319007   putative short chain
CMM_1947  -3      9        -0.131632  hypothetical protein
CMM_1948  -3      9        -0.005299  hypothetical protein
CMM_1949  -3      8        0.228328   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
CMM_1942  V8PROTEASE  42     2e-06    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 41.9 bits (98), Expect = 2e-06
Identities = 35/179 (19%), Positives = 57/179 (31%), Gaps = 28/179 (15%)

Query: 123 DPVSHIGVVAYVVDGKEMSCTANAVESANGLTVATAGHCA-FPGKDPSKMVFVPGYVKGQ 181
PV++I V A + V T+ T H DP + P +
Sbjct: 88 APVTYIQVEA---PTGTFIASGVVV---GKDTLLTNKHVVDATHGDPHALKAFPSAINQD 141

Query: 182 --PYTVWPVTSVTLPAGWRETLDPARDTAFLTVGS-PDGRTLTEAVGASPVEFHQPRT-- 236
P + +T +G D A + + + E V + + +
Sbjct: 142 NYPNGGFTAEQITKYSG-------EGDLAIVKFSPNEQNKHIGEVVKPATMSNNAETQVN 194

Query: 237 HYTTVIGYPAVGRFTGDAPF--LCSGFARATHLEGQTGQELDCDMKEGASGAPFLDGSG 293
TV GYP GD P + + T+L+G + D G SG+P +
Sbjct: 195 QNITVTGYP------GDKPVATMWESKGKITYLKG-EAMQYDLSTTGGNSGSPVFNEKN 246


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
CMM_1947  V8PROTEASE  32     0.003    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 31.9 bits (72), Expect = 0.003
Identities = 30/178 (16%), Positives = 54/178 (30%), Gaps = 21/178 (11%)

Query: 118 APVPHMGRLFVHRDGEDFSCSANVVESANRSTIATAGHCLTVRQEFSTDMVFYPRYEEGS 177
+ + V F S VV + T+ T H + + +P
Sbjct: 85 GHYAPVTYIQVEAPTGTFIASGVVV---GKDTLLTNKHVVDATHGDPHALKAFPSAINQD 141

Query: 178 PSLGAFPVVGGNVTIGWYERNDDDQAEDTSFLAVAHDDEGDDVQSVAGASPVRFFA--PA 235
+P G T + + D + + + +++ + V + + A
Sbjct: 142 ----NYP--NGGFTAEQITKYSGE--GDLAIVKFSPNEQNKHIGEVVKPATMSNNAETQV 193

Query: 236 AQEVSMYGYPAAGRFDGGELERCAG---LGHVYTEMQIDLGCDMTGGVSGGPILEGDG 290
Q +++ GYP G + G MQ D TGG SG P+
Sbjct: 194 NQNITVTGYP--GDKPVATMWESKGKITYLKGE-AMQYD--LSTTGGNSGSPVFNEKN 246


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
CMM_1948  V8PROTEASE  43     1e-06    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 42.7 bits (100), Expect = 1e-06
Identities = 32/252 (12%), Positives = 71/252 (28%), Gaps = 26/252 (10%)

Query: 30 SSATAAERPAAPASVTLDASSATVGIHVSSEDAAEAVDFWTPERRAAAIDADAPAPADGT 89
++ T A ++PA+ L + + + ++ TP+ +
Sbjct: 14 ATLTTATLVSSPAANALSSKAMD-----NHPQQTQSSKQQTPKIQKGGNLKPLEQREHAN 68

Query: 90 SDSGAVATDAVADSAHATQIEPIPHMGRIFYTQAGKGYACSANVVESANRSTIATAGHCL 149
+ + T + I + S VV + T+ T H +
Sbjct: 69 V---ILPNNDRHQITDTTN-GHYAPVTYIQVEAPTGTFIASGVVV---GKDTLLTNKHVV 121

Query: 150 TQK-QVFSDHIVFYPAYDHGESQYGAWPVITGYVPSGWYQRNDDDQGDDSSFMAVKRDDS 208
F A + G + SG D + + ++
Sbjct: 122 DATHGDPHALKAFPSAINQDNYPNGGFTAEQITKYSG---------EGDLAIVKFSPNEQ 172

Query: 209 GQDVQSAVGASPVLF--DQPGAEHASAYGYPAAGRFDGESLQWCSGQGEAVSAEQIALPC 266
+ + V + + + ++ + GYP G ++ G+ + E +
Sbjct: 173 NKHIGEVVKPATMSNNAETQVNQNITVTGYP--GDKPVATMWESKGKITYLKGEAMQYDL 230

Query: 267 DMNAGTSGGPIL 278
G SG P+
Sbjct: 231 STTGGNSGSPVF 242


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
CMM_1949  TCRTETB  44     1e-06    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 43.7 bits (103), Expect = 1e-06
Identities = 27/101 (26%), Positives = 47/101 (46%), Gaps = 9/101 (8%)

Query: 98 IGGIVAGHLGDRIGRKRMLVYSLVLMGIASTLIGVLPTYATIGLASVVGLVLLRLVQGIA 157
IG V G L D++G KR+L++ +++ S + V ++ ++ L++ R +QG
Sbjct: 64 IGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSL-------LIMARFIQGAG 116

Query: 158 AGAEWGGSALLSVEHAPAHRRGLFGAFTQMGSAGGMLLATG 198
A A ++ + P RG AF +GS M G
Sbjct: 117 AAAFPALVMVVVARYIPKENRG--KAFGLIGSIVAMGEGVG 155


74  CMM_2058  CMM_2066  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
CMM_2058  0       8        -1.089593  hypothetical protein with peptidoglycan binding
CMM_2059  0       8        -1.865300  hypothetical protein
CMM_2060  0       9        -1.018222  putative gamma-aminobutyrate permease, APC
CMM_2061  0       9        -0.374307  conserved hypothetical protein, phosphoglycerate
CMM_2062  0       8        1.283575   conserved hypothetical protein
CMM_2063  1       8        2.197916   conserved hypothetical protein
CMM_2064  1       8        2.854164   hypothetical protein
CMM_2065  1       6        2.968705   hypothetical protein
CMM_2066  2       8        2.854995   putative GTP-binding protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_2058  BCTERIALGSPF  29     0.013    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 29.0 bits (65), Expect = 0.013
Identities = 22/104 (21%), Positives = 47/104 (45%), Gaps = 12/104 (11%)

Query: 17 VVVALAVVVVVVGFVLDQLSKRWAVDALGGGETIPLFPTARFALVYNPGVSFGMGAEVGP 76
VVA+AVV +++ V+ ++ +++ + +PL R + G+S + GP
Sbjct: 180 TVVAIAVVSILLSVVVPKVVEQFI----HMKQALPLS--TRVLM----GMSDAVRT-FGP 228

Query: 77 LLTVGIMALALGLAVWVGWQIRHRASLLQVLLLSAVLAGALGNL 120
+ + ++A + V + Q + R S + LL ++ L
Sbjct: 229 WMLLALLAGFMAFRVMLR-QEKRRVSFHRRLLHLPLIGRIARGL 271


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
CMM_2062  HELNAPAPROT  107    2e-32    Helicobacter neutrophil-activating protein A family ...
		>HELNAPAPROT#Helicobacter neutrophil-activating protein A family signature.

Length = 153

Score = 107 bits (269), Expect = 2e-32
Identities = 39/146 (26%), Positives = 63/146 (43%), Gaps = 2/146 (1%)

Query: 34 TASEQLHESMQKVLVDLIELHIQGKQAHWNVVGKNFRDLHLQLDEIIDSAREFSDDLAER 93
T + S+ L + L+ + + HW V G +F LH + +E+ D A E D +AER
Sbjct: 8 TNQTLVENSLNTQLSNWFLLYSKLHRFHWYVKGPHFFTLHEKFEELYDHAAETVDTIAER 67

Query: 94 MRALHATPDGRSDTVAETTSLPEYPQGEVDTAETVDLVTQRLEAAVHTMREVHDDVDEE- 152
+ A+ P E S+ + E +E V + + + V +E
Sbjct: 68 LLAIGGQPVATVKEYTEHASITDGG-NETSASEMVQALVNDYKQISSESKFVIGLAEENQ 126

Query: 153 DPTTADLLHGFITALEQYAWMVSAEN 178
D TADL G I +E+ WM+S+
Sbjct: 127 DNATADLFVGLIEEVEKQVWMLSSYL 152


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_2063  DHBDHDRGNASE  113    2e-32    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 113 bits (283), Expect = 2e-32
Identities = 80/259 (30%), Positives = 121/259 (46%), Gaps = 7/259 (2%)

Query: 4 GITGKTALITGADSGIGWETARILLSEGATVVLSDQDQGSLDEAAAKLDGGDRV-HAFAA 62
GI GK A ITGA GIG AR L S+GA + D + L++ + L R AF A
Sbjct: 5 GIEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPA 64

Query: 63 DVTSVESLAALHDKVQEAVGDIDILVQSAGITGAQGLFHEIDDEGWTNTIEVDLMGPVRL 122
DV ++ + +++ +G IDILV AG+ GL H + DE W T V+ G
Sbjct: 65 DVRDSAAIDEITARIEREMGPIDILVNVAGVL-RPGLIHSLSDEEWEATFSVNSTGVFNA 123

Query: 123 VKRFLPSLRKGGWGRIVFLASEDAVQPYDDELPYCAAKAGILALSKGLSRSYAKEGLLVN 182
+ + G IV + S A P Y ++KA + +K L A+ + N
Sbjct: 124 SRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCN 183

Query: 183 AVSPAFIHTPMTDAMMEKRADQLGTSTDDAIESFLDEERPYMELKRRGEPAEVANVVAFL 242
VSP T M ++ AD+ G + I+ L+ + + LK+ +P+++A+ V FL
Sbjct: 184 IVSPGSTETDMQWSLW---ADENG--AEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFL 238

Query: 243 CSDLASFVNGSNYRVDSGS 261
S A + N VD G+
Sbjct: 239 VSGQAGHITMHNLCVDGGA 257


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
CMM_2065  PF06580  36     2e-04    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 36.0 bits (83), Expect = 2e-04
Identities = 25/92 (27%), Positives = 35/92 (38%), Gaps = 16/92 (17%)

Query: 316 NAMRHG----EPGGRIRLRETWRTADVVLEVENPVALRGPAPDHVDPLGLGALRVGTGVE 371
N ++HG GG+I L+ T V LEVEN G GTG++
Sbjct: 266 NGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENT----GSLALKNTKES-----TGTGLQ 316

Query: 372 GMRARLAAVGGD---LEAEPVDDLFTARARIP 400
+R RL + G ++ A IP
Sbjct: 317 NVRERLQMLYGTEAQIKLSEKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
CMM_2066  HTHFIS  56     2e-11    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 56.0 bits (135), Expect = 2e-11
Identities = 20/105 (19%), Positives = 39/105 (37%), Gaps = 4/105 (3%)

Query: 2 IRIVLVDDQELFRGGVRVALDAQPDLEVVGEAGDGRQGLAVIDEVRPDVVLLDMRMPVMD 61
I++ DD R + AL ++ +V + I D+V+ D+ MP +
Sbjct: 4 ATILVADDDAAIRTVLNQAL-SRAGYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 62 GLETVRALFDGTRDAPPRVIVLTTFALDRASATAIRGGASGFLLK 106
+ + + V+V++ + A GA +L K
Sbjct: 62 AFDLLPRI--KKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPK 104


75  CMM_2088  CMM_2093  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag   DNBias  CDNBias  %GCBias   Product
CMM_2088   0       8        2.765029  hypothetical protein
CMM_2089   0       7        2.866752  putative amino acid permease, APC family
CMM_PS_18  0       7        2.325377  conserved hypothetical protein
CMM_2090   0       7        1.413817  conserved hypothetical protein
CMM_2091   0       8        1.335866  putative anti-sigma regulatory factor (Ser/Thr
CMM_2092   0       8        1.411108  putative anti-sigma factor antagonist
CMM_2093   1       9        1.301744  putative glycosyl transferase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_2088  PHPHTRNFRASE  33     0.001    Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase signature.

Length = 572

Score = 33.2 bits (76), Expect = 0.001
Identities = 27/115 (23%), Positives = 46/115 (40%), Gaps = 9/115 (7%)

Query: 48 DPYMRGRMNDVPSYVPPFQLDEAMTGSAVGRVVESRSDDLEVGTLVSHSLGWRDVAQGPA 107
+ YM+ R D+ + + + G +G S + E +++ L D AQ
Sbjct: 121 NEYMKERAADIR------DVSKRVLGHLIGVETGSLATIAEETVIIAEDLTPSDTAQLNK 174

Query: 108 AGFRPVPEVPGVASSAHLGVLGLT-GLTAYVGLTRIA-SIQEGDVVFVSGAAGAV 160
+ G +S H ++ + + A VG + IQ GD+V V G G V
Sbjct: 175 QFVKGFATDIGGRTS-HSAIMSRSLEIPAVVGTKEVTEKIQHGDMVIVDGIEGIV 228


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_2090  DHBDHDRGNASE  79     6e-20    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 79.3 bits (195), Expect = 6e-20
Identities = 53/196 (27%), Positives = 75/196 (38%), Gaps = 11/196 (5%)

Query: 8 VALVTGATRGIGRAVAEDLGRTHRVIVHGRDRDAVDALAASLPDAVGWAADLAAGGLAD- 66
+A +TGA +GIG AVA L I S A A+ + D
Sbjct: 10 IAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRDS 69

Query: 67 ---------LVPELDRLDVLVHSAGVIGGDAVAGTPVDEWRRVFEVNVFAVAEVTRVLLP 117
+ E+ +D+LV+ AGV+ + +EW F VN V +R +
Sbjct: 70 AAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVSK 129

Query: 118 ALRAAK-GQVVLVNSGSGFTANPTGGVYAGSKFALRALGDALREEERPHGVRVSSVHPGR 176
+ + G +V V S + YA SK A L E + +R + V PG
Sbjct: 130 YMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSPGS 189

Query: 177 VATDMQRELRAKEGGE 192
TDMQ L A E G
Sbjct: 190 TETDMQWSLWADENGA 205


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
CMM_2093  PF04183  477    e-162    IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 477 bits (1229), Expect = e-162
Identities = 152/573 (26%), Positives = 255/573 (44%), Gaps = 30/573 (5%)

Query: 236 LRPDVWAAATRHLVRKALAEFAHELLIAPERVDPELPAGAPRRPHDPRRWADYRVASADG 295
+ W R LV K L+E +E + E + Y +
Sbjct: 1 MNHKDWDLVNRRLVAKMLSELEYEQVFHAESQGDDR----------------YCINLPGA 44

Query: 296 RSAYAYRARVLELDHWDVDEASIRRTVEGQPVELDATDLVLDLRDRLGITDEVLPVYLDE 355
+ + + A +D ++R + A L++ L+ L ++D + ++ +
Sbjct: 45 Q--WRFIAERGIWGWLWIDAQTLRC----ADEPVLAQTLLMQLKQVLSMSDATVAEHMQD 98

Query: 356 IQSTLSA-AAFARLRDVPDARGLLTASYAEVESTMDEGHPCFVATNGRIGFDLDDHDRYA 414
+ +TL + R A L+ + ++ + GHP FV GR G+ + +RYA
Sbjct: 99 LYATLLGDLQLLKARRGLSASDLINLNADRLQCLL-SGHPKFVFNKGRRGWGKEALERYA 157

Query: 415 PEAGADVRILWLAVHERLARFTAIEGLDREAFLDAELGASARARFRARMESLGIDPAERV 474
PE R+ WLAV + +D L A + ARF + G+D +
Sbjct: 158 PEYANTFRLHWLAVKREHMIWRCDNEMDIHQLLTAAMDPQEFARFSQVWQENGLDH-NWL 216

Query: 475 LVPVHPWQWENVVTVTFAGLVARRDILLLGTGDDEYGAQQSIRTWANRTTPERCYVKTSL 534
+PVHPWQW+ + F A ++ LG D++ AQQS+RT N + +K L
Sbjct: 217 PLPVHPWQWQQKIATDFIADFAEGRMVSLGEFGDQWLAQQSLRTLTNASRRGGLDIKLPL 276

Query: 535 SILNMGFTRGLSPAYMAVTPAINGWVHALVTGDAEFDRLGFGVLREVAAVGVRDERVESA 594
+I N RG+ Y+A P + W+ + DA + G +L E AA V E +
Sbjct: 277 TIYNTSCYRGIPGRYIAAGPLASRWLQQVFATDATLVQSGAVILGEPAAGYVSHEGYAAL 336

Query: 595 LPPGHSHGKMLSALWRESPVPGLAEGERLMSMTSLLHVDAHGDTVLGALIDASGIGAAAW 654
+ + +ML +WRE+P L E + M +L+ D + + GA ID SG+ A W
Sbjct: 337 ARAPYRYQEMLGVIWRENPCRWLKPDESPVLMATLMECDENNQPLAGAYIDRSGLDAETW 396

Query: 655 LRRWLDAYLVPLAHALIAHDLAFMPHGENVILVLRDHVVVRVLMKDIAEEVAL----FDM 710
L + +VPL H L + +A + HG+N+ L +++ V RVL+KD ++ L F
Sbjct: 397 LTQLFRVVVVPLYHLLCRYGVALIAHGQNITLAMKEGVPQRVLLKDFQGDMRLVKEEFPE 456

Query: 711 ERELPEDVRRIRMEIPEEERTLTVFTDVMDGFLRFAAALLEDRDDLGPEGLWRVAAEALA 770
LP++VR + + + + T LRF + L+ R + +++ A L+
Sbjct: 457 MDSLPQEVRDVTSRLSADYLIHDLQTGHFVTVLRFISPLM-VRLGVPERRFYQLLAAVLS 515

Query: 771 DHERAHPELAERFARFDLFAPSFDRSCLNRLQL 803
D+ + HP+++ERFA F LF P R LN ++L
Sbjct: 516 DYMKKHPQMSERFALFSLFRPQIIRVVLNPVKL 548


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
CMM_2094  PF05616  29     0.030    Neisseria meningitidis TspB protein
		>PF05616#Neisseria meningitidis TspB protein

Length = 501

Score = 29.3 bits (65), Expect = 0.030
Identities = 16/49 (32%), Positives = 24/49 (48%)

Query: 172 GVVHSSGYLGAKAALQERDAITVVGSGQSAAEIYRDLLEDVDSRGYRLD 220
GV+ G L A A+ VG+ A ++Y ED+ +RGY+ D
Sbjct: 66 GVLAGVGKLARLGAKFSTRAVPYVGTALLAHDVYETFKEDIQARGYQYD 114


76  CMM_2153  CMM_2161  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
CMM_2153  -2      10       1.928559   putative UDP-glucose 4-epimerase
CMM_2154  -2      9        1.775204   hypothetical protein
CMM_2155  -1      8        -0.087896  putative short chain
CMM_2156  0       7        0.211248   hypothetical membrane protein
CMM_2157  0       6        1.011728   hypothetical protein
CMM_2158  -1      7        1.100555   hypothetical protein
CMM_2159  -1      8        0.823773   hypothetical protein
CMM_2160  0       6        3.032660   hypothetical protein
CMM_2161  0       7        4.223612   putative permease
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_2153  NUCEPIMERASE  92     3e-23    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 91.8 bits (228), Expect = 3e-23
Identities = 69/359 (19%), Positives = 121/359 (33%), Gaps = 71/359 (19%)

Query: 1 MIVLVTGASGMLGRAVAERLAAAGHAVRAF---------QRQPSGLAASGTEPVPGSVVD 51
M LVTGA+G +G V++RL AGH V + + L
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQP----GFQF 56

Query: 52 LRGSVTDQASVARAVD--GVDAVVHLAAKVSLA---GDPDDFRAVNVEGTRGLLRAARAA 106
+ + D+ + + V ++++ +P + N+ G +L R
Sbjct: 57 HKIDLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHN 116

Query: 107 GVTRFVHVSSPSVAHTGLSITGDGAGPAD-PVRARGDYARTKAEGELIALAADDP-AMRV 164
+ ++ SS SV + D PV YA TK EL+A +
Sbjct: 117 KIQHLLYASSSSVYGLNRKMPFSTDDSVDHPVSL---YAATKKANELMAHTYSHLYGLPA 173

Query: 165 LAVRPHLVWGP-GDTQL-VARIVDRASRGR-LPLLGHGAALIDTVYRDNAADAIVAAL-- 219
+R V+GP G + + + G+ + + +G D Y D+ A+AI+
Sbjct: 174 TGLRFFTVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIRLQDV 233

Query: 220 --------------AAADTAHGRAYVVTNGEPRPVAELLAGMCRAAGVPAPRIRVPAALA 265
AA A R Y + N P + + + + A G+ A + +P
Sbjct: 234 IPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAKKNMLPLQP- 292

Query: 266 RAAGGAVERVWAVRPGSDEPPMTRFLAEQLSTAHWFDQRETRRALGWTPAVSLDEGFER 324
G V A D + +G+TP ++ +G +
Sbjct: 293 ----GDVLETSA------------------------DTKALYEVIGFTPETTVKDGVKN 323


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_2157  BCTERIALGSPC  30     0.016    Bacterial general secretion pathway protein C signa...
		>BCTERIALGSPC#Bacterial general secretion pathway protein C signature.

Length = 272

Score = 30.3 bits (68), Expect = 0.016
Identities = 14/47 (29%), Positives = 22/47 (46%), Gaps = 1/47 (2%)

Query: 221 AGLQPGDTIVSIDGSPVTAWDQVTSTVQASAG-KELDVVVERGGARQ 266
GLQ D V+++G + +Q ++ A + VER G RQ
Sbjct: 216 VGLQDNDMAVALNGLDLRDAEQAKKAMERMADVHNFTLTVERDGQRQ 262


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_2158  YERSSTKINASE  42     3e-06    Yersinia serine/threonine protein kinase signature.
		>YERSSTKINASE#Yersinia serine/threonine protein kinase signature.

Length = 732

Score = 42.4 bits (99), Expect = 3e-06
Identities = 49/182 (26%), Positives = 74/182 (40%), Gaps = 27/182 (14%)

Query: 74 HPGLVTLFDVGDDVVADRVLAFIVMEIVDG-------STLADRMKEGPLPGP----EVAR 122
HP L + + +R ++M+ VDG TLAD K+G + +
Sbjct: 190 HPNLANVHGMAVVPYGNRKEEALLMDEVDGWRCSDTLRTLADSWKQGKINSEAYWGTIKF 249

Query: 123 IGGILSDALGYIHRRGVVHRDVKPANVLLARPEDDDEPAVAKLTDFGIARLVDGTRLTST 182
I L D ++ + GVVH D+KP NV+ R EP V L G S
Sbjct: 250 IAHRLLDVTNHLAKAGVVHNDIKPGNVVFDRAS--GEPVVIDL----------GLHSRSG 297

Query: 183 GSIIG-TVSYLSPEQALGEEVGAP--TDVYALGLVLLECLTGRRTFPGTAAESTMARVVR 239
G T S+ +PE +G +GA +DV+ + LL C+ G P + +
Sbjct: 298 EQPKGFTESFKAPELGVG-NLGASEKSDVFLVVSTLLHCIEGFEKNPEIKPNQGLRFITS 356

Query: 240 DP 241
+P
Sbjct: 357 EP 358


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
CMM_2159  PF05616  37     4e-05    Neisseria meningitidis TspB protein
		>PF05616#Neisseria meningitidis TspB protein

Length = 501

Score = 36.6 bits (84), Expect = 4e-05
Identities = 20/81 (24%), Positives = 32/81 (39%), Gaps = 3/81 (3%)

Query: 115 DAAETTPTPSPSPSASPSPSPSPSRTASPSPSPSPSPSRTATPSPPATPSPAPGGGNGNG 174
+A P P SP+ +P+ +P+P+ P+P P P +P P G +
Sbjct: 321 EAPNAQPLPEVSPAENPANNPAPNENPGTRPNPEPDPDLNPDANPDTDGQP---GTRPDS 377

Query: 175 DGGPGNGNGGGNGDGGAGDPG 195
P NG + G+ G
Sbjct: 378 PAVPDRPNGRHRKERKEGEDG 398


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_2161  INFPOTNTIATR  62     2e-13    Macrophage infectivity potentiator signature.
		>INFPOTNTIATR#Macrophage infectivity potentiator signature.

Length = 233

Score = 62.3 bits (151), Expect = 2e-13
Identities = 38/102 (37%), Positives = 56/102 (54%), Gaps = 1/102 (0%)

Query: 236 LKVGDGAEVTDGASVTVQYTGINWNTKKVFDSSWDRGQSATFVTSQVIPGFTKALVGQKV 295
+ G GA+ +VTV+YTG + VFDS+ G+ ATF SQVIPG+T+AL
Sbjct: 133 IDAGTGAKPGKSDTVTVEYTGTLID-GTVFDSTEKAGKPATFQVSQVIPGWTEALQLMPA 191

Query: 296 GSQVIAIIPPADGYGDKGQGTDIGGTDTIVFVVDILGTQPAA 337
GS +P YG + G IG +T++F + ++ + AA
Sbjct: 192 GSTWEVFVPADLAYGPRSVGGPIGPNETLIFKIHLISVKKAA 233


77  CMM_2260  CMM_2267  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
CMM_2260  -2      11       1.428419   putative ribonucleoside-diphosphate reductase
CMM_2261  -1      11       0.825868   putative MFS permease
CMM_2262  1       9        1.052690   hypothetical protein
CMM_2263  1       8        0.188382   putative aldo/keto reductase
CMM_2264  1       7        0.506319   hypothetical membrane protein
CMM_2265  0       7        -0.077880  putative secreted serine protease, family S1C
CMM_2266  -1      8        0.416527   putative alpha-L-arabinofuranosidase
CMM_2267  -2      7        -0.235566  putative L-arabinose ABC transporter, permease
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
CMM_2260  SUBTILISIN  165    1e-49    Subtilisin serine protease family (S8) signature.
		>SUBTILISIN#Subtilisin serine protease family (S8) signature.

Length = 326

Score = 165 bits (419), Expect = 1e-49
Identities = 89/282 (31%), Positives = 131/282 (46%), Gaps = 19/282 (6%)

Query: 56 EQAWQTTRGEGVKVAVIDTGVDASVADLRGAVAGGTDVSGVGSADGTKPVGASSEHGTMV 115
W TRG GVKVAV+DTG DA DL+ + GG + + D + + HGT V
Sbjct: 32 PAVWNQTRGRGVKVAVLDTGCDADHPDLKARIIGGRNFTDDDEGD-PEIFKDYNGHGTHV 90

Query: 116 ASLLAGRGTGTGSGVIGVAPAASVLSVSVALGGPTPGARDEDAQIADAVRWAVDNGASVI 175
A +A T +GV+GVAP A +L + V L G D I + +A++ +I
Sbjct: 91 AGTIAA--TENENGVVGVAPEADLLIIKV-LNKQGSGQYD---WIIQGIYYAIEQKVDII 144

Query: 176 NMSLTRNSLDWPESWDRAFLYAYEHDVVVVAAAGNRGSG---TTEVGAPATIPGVLAVAG 232
+MSL A A ++V+ AAGN G G T E+G P V++V
Sbjct: 145 SMSLGGPEDV--PELHEAVKKAVASQILVMCAAGNEGDGDDRTDELGYPGCYNEVISVGA 202

Query: 233 VDRSGAASFDASSQGITIAVAAPSEQLVGVEPGGRYVQWSGTSGAAPLVSGVVALVRAAH 292
++ AS + S+ + + AP E ++ PGG+Y +SGTS A P V+G +AL++
Sbjct: 203 INFDRHAS-EFSNSNNEVDLVAPGEDILSTVPGGKYATFSGTSMATPHVAGALALIKQLA 261

Query: 293 PELKADDVVERVLATARQK------GQPEIYGRGLVDAAAAV 328
D+ E L K P++ G GL+ A
Sbjct: 262 NASFERDLTEPELYAQLIKRTIPLGNSPKMEGNGLLYLTAVE 303


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
CMM_2262  RTXTOXIND  28     0.025    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 27.9 bits (62), Expect = 0.025
Identities = 8/51 (15%), Positives = 19/51 (37%), Gaps = 5/51 (9%)

Query: 68 YLEQRQQLSSLQSAVDAQKGTIAQLQDQRARYDD-----PAFLKAQVRDRL 113
LEQ + + + K + Q++ + + K ++ D+L
Sbjct: 254 VLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKL 304


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
CMM_2265  HTHTETR  38     2e-05    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 37.7 bits (87), Expect = 2e-05
Identities = 15/78 (19%), Positives = 26/78 (33%)

Query: 8 RRQDLLAHILDHLRAHPLQSVTFRGLADALGESTFVLVYHFGSKERLLEAAMDAIDHRQA 67
RQ +L L + S + +A A G + + +HF K L + +
Sbjct: 12 TRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIG 71

Query: 68 EMVEGDPRRIRPDELREW 85
E+ + D L
Sbjct: 72 ELELEYQAKFPGDPLSVL 89


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_2267  PYOCINKILLER  33     0.002    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 33.2 bits (75), Expect = 0.002
Identities = 25/80 (31%), Positives = 41/80 (51%), Gaps = 6/80 (7%)

Query: 79 AGGLNSVSRALVPAIAAVGGVV-----VPALVYLAITAGSGLERGWPVPTATDIAFALGV 133
A G S+++A+ AIA +G V+ V A+ + ++T S W T + +ALG+
Sbjct: 271 AQGAASLAQAISDAIAVLGRVLASAPSVMAVGFASLTYSSRTAEQWQDQTPDSVRYALGM 330

Query: 134 LAVFGRGLPAAVRVFLLALA 153
A GLP +V + +A A
Sbjct: 331 DAA-KLGLPPSVNLNAVAKA 349


78  CMM_2312  CMM_2319  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias   Product
CMM_2312  4       10       3.438733  putative tRNA/rRNA methyltransferase
CMM_2313  2       9        2.748804  2-C-methyl-D-erythritol 2,4-cyclodiphosphate
CMM_2314  3       9        2.918155  putative transcriptional regulator, CarD family
CMM_2315  2       13       0.994172  hypothetical protein
CMM_2316  2       10       0.905012  putative two-component system response
CMM_2317  1       10       0.816367  putative two-component system sensor kinase
CMM_2318  1       10       0.827721  putative phosphate transport system regulator
CMM_2319  2       11       0.621410  phosphoglycerate mutase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_2314  ACETATEKNASE  29     0.030    Acetate kinase family signature.
		>ACETATEKNASE#Acetate kinase family signature.

Length = 400

Score = 29.4 bits (66), Expect = 0.030
Identities = 13/63 (20%), Positives = 27/63 (42%), Gaps = 4/63 (6%)

Query: 270 GIDLLVTTGGVSAGAYEVVRDVLEGGVEFVSVAVQP---GGPQGLGTAEAGGARIPVVAF 326
G+D++V T G+ E+ +L+ G+EF+ + +++ V+
Sbjct: 322 GVDVIVFTAGIGENGPEIREFILD-GLEFLGFKLDKEKNKVRGEEAIISTADSKVNVMVV 380

Query: 327 PGN 329
P N
Sbjct: 381 PTN 383


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_2318  NUCEPIMERASE  167    4e-51    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 167 bits (424), Expect = 4e-51
Identities = 83/363 (22%), Positives = 142/363 (39%), Gaps = 67/363 (18%)

Query: 7 LVTGGAGFIGGAIVQAALEEGRRVRVLDSLRADVHGGDPEI---------DPRVELVRGD 57
LVTG AGFIG + + LE G +V +D+L D + D + P + + D
Sbjct: 4 LVTGAAGFIGFHVSKRLLEAGHQVVGIDNL-NDYY--DVSLKQARLELLAQPGFQFHKID 60

Query: 58 VTDPDAVAGALD--GVDVVCHQAAKVGLGVDFLDAPDYVTTNDGGTAVLLAAMTRAGIDR 115
+ D + + + V ++ + + Y +N G +L I
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQH 120

Query: 116 LVLASSMVVYGEGAYAGADGPVRPPARRVADLDAGMFDPVDPATGEPLVPQLIGEDVPLD 175
L+ ASS VYG P + + V
Sbjct: 121 LLYASSSSVYGLNRK-------------------------MPFSTDDSVDH--------- 146

Query: 176 PRNVYATTKLAQENLASSWTRATGGRAAALRYHNVYGP-GMPQNTPYAGVASLFRSALAR 234
P ++YA TK A E +A +++ G A LR+ VYGP G P + F A+
Sbjct: 147 PVSLYAATKKANELMAHTYSHLYGLPATGLRFFTVYGPWGRPDMALF-----KFTKAMLE 201

Query: 235 GEAPRVFEDGRQRRDFVHVRDVAGANLTAL--------AWTADREAGS-----FRAFNVG 281
G++ V+ G+ +RDF ++ D+A A + WT + + +R +N+G
Sbjct: 202 GKSIDVYNYGKMKRDFTYIDDIAEAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIG 261

Query: 282 SGTVHTIGEMAEALAREAGGSAPVTTGEYRLGDVRHITASSDRLRTELGWEPRMTFEEGM 341
+ + + + +AL G A + GDV +A + L +G+ P T ++G+
Sbjct: 262 NSSPVELMDYIQALEDALGIEAKKNMLPLQPGDVLETSADTKALYEVIGFTPETTVKDGV 321

Query: 342 REF 344
+ F
Sbjct: 322 KNF 324


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_2320  DHBDHDRGNASE  103    1e-28    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 103 bits (258), Expect = 1e-28
Identities = 73/255 (28%), Positives = 117/255 (45%), Gaps = 9/255 (3%)

Query: 9 KVVLITGGGSGLGRAAAVRLAAEGAQLALVDISEGGLVDTVAA--VMAATPDAAILTVLA 66
K+ ITG G+G A A LA++GA +A VD + L V++ A +A A
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEA----FPA 64

Query: 67 DVSKESDVDSYVHQTVERFGRIDGFFNNAGIEGRQNLTEDFTAAEFDKVVAINLRGVFLG 126
DV + +D + G ID N AG+ R L + E++ ++N GVF
Sbjct: 65 DVRDSAAIDEITARIEREMGPIDILVNVAGVL-RPGLIHSLSDEEWEATFSVNSTGVFNA 123

Query: 127 LEKVLAVMREQGSGMVVNTASVGGIRGVGNQSGYAAAKHGVVGLTRNSAVEYGEFGIRIN 186
V M ++ SG +V S + + YA++K V T+ +E E+ IR N
Sbjct: 124 SRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCN 183

Query: 187 AIAPGAIWTPMVEASMKQISADDP--RGAAEQFIQGNPTKRYGEAEEIASVVAFLLSDDA 244
++PG+ T M + + + +G+ E F G P K+ + +IA V FL+S A
Sbjct: 184 IVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQA 243

Query: 245 AYVNAAVLPIDGGQS 259
++ L +DGG +
Sbjct: 244 GHITMHNLCVDGGAT 258


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
CMM_2321  BACINVASINB  34     0.001    Salmonella/Shigella invasin protein B signature.
		>BACINVASINB#Salmonella/Shigella invasin protein B signature.

Length = 593

Score = 33.6 bits (76), Expect = 0.001
Identities = 19/81 (23%), Positives = 41/81 (50%), Gaps = 4/81 (4%)

Query: 58 LGLAIGIADAVVQATGAGLLLEIVLQAVLMSAIVVPAVVLLRRRLDRRSLASIGLSRRIG 117
+GLA+ +AD +V+A ++ L + M ++ P + L+ + + ++L +G+ ++
Sbjct: 346 VGLAVMVADEIVKAATGVSFIQQALNPI-MEHVLKPLMELIGKAIT-KALEGLGVDKKTA 403

Query: 118 RPIALGVGVGAVTGAVVWVPA 138
G VGA+ A+ V
Sbjct: 404 EMA--GSIVGAIVAAIAMVAV 422


79  CMM_2427  CMM_2444  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
CMM_2427  -1      8        -3.548955  50S ribosomal protein L16
CMM_2428  -1      9        -3.119761  30S ribosomal protein S3
CMM_2429  -2      10       -1.200105  50S ribosomal protein L22
CMM_2430  -2      8        0.655126   30S ribosomal protein S19
CMM_2431  -3      7        -0.174798  50S ribosomal protein L2
CMM_2432  -3      8        -1.168445  50S ribosomal protein L23
CMM_2433  -1      8        -1.748878  50S ribosomal protein L4
CMM_2434  -2      8        -1.877143  50S ribosomal protein L3
CMM_2435  -1      10       -2.378210  30S ribosomal protein S10
CMM_2436  -1      8        -2.157033  putative topoisomerase IB
CMM_2437  -3      5        0.104332   elongation factor EF-Tu
CMM_2438  -2      3        0.501863   elongation factor EF-G
CMM_2439  -2      5        1.717914   30S ribosomal protein S7
CMM_2440  -1      6        1.370866   30S ribosomal protein S12
CMM_2441  -2      7        0.696312   hypothetical membrane protein
CMM_2442  -1      6        0.312448   hypothetical membrane protein
CMM_2443  0       9        -1.201285  putative polar amino acid ABC transporter,
CMM_2444  -1      9        -1.382836  putative polar amino acid ABC
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_2427  SACTRNSFRASE  41     6e-07    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 40.7 bits (95), Expect = 6e-07
Identities = 13/52 (25%), Positives = 22/52 (42%)

Query: 103 VVVDPAHRGTGLARVLMAELERVARERGARRLILQTGDRQPDAIQLYATAGW 154
+ V +R G+ L+ + A+E L+L+T D A YA +
Sbjct: 95 IAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHF 146


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
CMM_2431  adhesinmafb  28     0.021    Neisseria meningitidis: adhesin MafB signature.
		>adhesinmafb#Neisseria meningitidis: adhesin MafB signature.

Length = 467

Score = 27.7 bits (61), Expect = 0.021
Identities = 31/102 (30%), Positives = 39/102 (38%), Gaps = 8/102 (7%)

Query: 6 TPLTTRRLRAGIAGLGLVLGLALAGCSSPAADDAATPTPSASASAEAASPTPEATAEGDA 65
PL A I GLG V G A D P+A+ + EA A A A
Sbjct: 268 APLPAEGKFAVIGGLGSVAGFEKNTRE--AVDRWIQENPNAAETVEAVFNV--AAAAKVA 323

Query: 66 GSGDAAAPGSREAIAAKTRDIACGLKDKSTLEESDVQAFRDL 107
AA PG AA + D A K K L +S Q +++
Sbjct: 324 KLAKAAKPGK----AAVSGDFADSYKKKLALSDSARQLYQNA 361


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
CMM_2432  TCRTETA  29     0.022    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 29.4 bits (66), Expect = 0.022
Identities = 14/40 (35%), Positives = 21/40 (52%)

Query: 197 IIARVPLASGLLSGRYTADTTFAETDHRNFNRGGAAFDVG 236
I+A + A+G ++G Y AD T + R+F A F G
Sbjct: 104 IVAGITGATGAVAGAYIADITDGDERARHFGFMSACFGFG 143


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
CMM_2433  PF05616  29     0.009    Neisseria meningitidis TspB protein
		>PF05616#Neisseria meningitidis TspB protein

Length = 501

Score = 29.3 bits (65), Expect = 0.009
Identities = 20/58 (34%), Positives = 24/58 (41%), Gaps = 4/58 (6%)

Query: 36 PRFDPAFQRASGPAPAPHPRTDPQAGADVPGPTSTTGNPETHETPA--PSVPRAADRP 91
P PA A+ PAP +P T P D + NP+T P P P DRP
Sbjct: 329 PEVSPAENPANNPAPNENPGTRPNPEPD--PDLNPDANPDTDGQPGTRPDSPAVPDRP 384


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
CMM_2434  V8PROTEASE  64     1e-13    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 64.3 bits (156), Expect = 1e-13
Identities = 37/193 (19%), Positives = 64/193 (33%), Gaps = 40/193 (20%)

Query: 95 ATTAQKAGVVTIDSALTYENAAGAGTGIILSSDGTILTNNHVVSGATSIRVTVEST---- 150
T A V I +G + T+LTN HVV +++
Sbjct: 82 TTNGHYAPVTYIQVEAPTGTFIASGVVV---GKDTLLTNKHVVDATHGDPHALKAFPSAI 138

Query: 151 -------GKAYVGKVVGTDATNDVAVLKLED-------ASGLTPAKLDTDG-VQVGEAVT 195
G ++ D+A++K + PA + + QV + +T
Sbjct: 139 NQDNYPNGGFTAEQITKYSGEGDLAIVKFSPNEQNKHIGEVVKPATMSNNAETQVNQNIT 198

Query: 196 GVGNAGG--TGTLTAATGQVTALGQTITTQSEGTAAGETLTDLIQTDAAIVSGDSGGPLV 253
G G T+ + G++T L +Q D + G+SG P+
Sbjct: 199 VTGYPGDKPVATMWESKGKITYLKGEA----------------MQYDLSTTGGNSGSPVF 242

Query: 254 DAEGEVVGIDTAA 266
+ + EV+GI
Sbjct: 243 NEKNEVIGIHWGG 255


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
CMM_2437  PF06580  31     0.006    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 31.0 bits (70), Expect = 0.006
Identities = 29/152 (19%), Positives = 55/152 (36%), Gaps = 10/152 (6%)

Query: 66 QSLTGANSELIGFANYFEAFGDSQMWRSLGNTVVFTIASTIPLLVVGLVLALLVNLGLPG 125
Q + L GF F + S S+ +F IA ++ LV+ +
Sbjct: 16 QGIGWGVYTLTGFG--FASLYGSPKLHSM----IFNIAISLMGLVLTHAYRSFIK-RQGW 68

Query: 126 QWLWRLAFFLPFLLASTVVSLFWLWMYNPQLGVVNAIAGAFGLPQPAWLQDSNLAMTSVV 185
L L L A V+ + W ++ N + + A P L + + +VV
Sbjct: 69 LKLNMGQIILRVLPACVVIGMVW-FVANTSIWRLLAFINT--KPVAFTLPLALSIIFNVV 125

Query: 186 VTTVWWTVGFNFLIYLAALQNIPDQQYEAAAL 217
V T W++ + + + Q++ A++
Sbjct: 126 VVTFMWSLLYFGWHFFKNYKQAEIDQWKMASM 157


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
CMM_2438  MALTOSEBP  41     6e-06    Maltose binding protein signature.
		>MALTOSEBP#Maltose binding protein signature.

Length = 396

Score = 41.3 bits (96), Expect = 6e-06
Identities = 54/232 (23%), Positives = 81/232 (34%), Gaps = 27/232 (11%)

Query: 97 KLAMASVGGRAPDVAIMHAARVPGFAPGGLLDPWDTDRLAELGVTQADFEPRVWDKGVVD 156
K + G PD+ R G+A GLL D+ Q P WD +
Sbjct: 72 KFPQVAATGDGPDIIFWAHDRFGGYAQSGLLAEITPDK-----AFQDKLYPFTWDAVRYN 126

Query: 157 GKLYSIALDSHPFILMYNTDVAAQAGVLGGDGQLEEITSPDRFLEVMRAMQAVTGEHALS 216
GKL + + L+YN D+ EEI + D+ L+ G+ AL
Sbjct: 127 GKLIAYPIAVEALSLIYNKDLLPNP-----PKTWEEIPALDKELKA-------KGKSALM 174

Query: 217 YGYLGDGAQMWRLF-----YTFYKQMGGDMELPTGGEVVYDRDKAVASLEYIQTLLDGTI 271
+ L + W L Y F K G ++ G D A A L ++ L+
Sbjct: 175 FN-LQEPYFTWPLIAADGGYAF-KYENGKYDIKDVG---VDNAGAKAGLTFLVDLIKNKH 229

Query: 272 ATPSGDAGTAIAEFAGAKSGAIVTGVWELPTFQKAKVPFDAMPIPNLFGTPA 323
D A A F ++ + G W +KV + +P G P+
Sbjct: 230 MNADTDYSIAEAAFNKGETAMTINGPWAWSNIDTSKVNYGVTVLPTFKGQPS 281


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
CMM_2443  cloacin  34     0.002    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 33.5 bits (76), Expect = 0.002
Identities = 22/91 (24%), Positives = 30/91 (32%)

Query: 368 GDTGGLVMSDWVTPEQAKLDAMAPILHPATGGSGSGSGSSSGSGSSSGSGSSSGSTPAPQ 427
G S W + +H G G + SG SG+G + + AP
Sbjct: 29 VGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGHGNGGGNGNSGGGSGTGGNLSAVAAPV 88

Query: 428 PQPQPAPSTSGAVTATWQPGGSWSSGYVADL 458
PA ST GA S +AD+
Sbjct: 89 AFGFPALSTPGAGGLAVSISAGALSAAIADI 119


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_2444  PYOCINKILLER  29     0.043    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 28.6 bits (63), Expect = 0.043
Identities = 36/124 (29%), Positives = 48/124 (38%), Gaps = 9/124 (7%)

Query: 42 TLRILAETAEVPDAVRAETNARSVVEQQRA-LRRAAEEAEAAARARDAAAQRALPKVAPV 100
+L+I T A A EQ A +R AEE A AA A+P V
Sbjct: 199 SLQIRMNTLTAAKASIEAAAANKAREQAAAEAKRKAEEQARQQAAIRAANTYAMPANGSV 258

Query: 101 SATSPSAATRLRRSRLAATAILALALVGVIAGIAQVAGAGAWTLLVISSLATFASLAFLQ 160
AT AA R A LA A+ IA + +V + + V FASL +
Sbjct: 259 VAT---AAGRGLIQVAQGAASLAQAISDAIAVLGRVLASAPSVMAV-----GFASLTYSS 310

Query: 161 RMSQ 164
R ++
Sbjct: 311 RTAE 314


80  CMM_2456  CMM_2463  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
CMM_2456292.168205conserved hypothetical protein
CMM_2457172.330789hypothetical protein
CMM_2458072.634303conserved hypothetical protein
CMM_2459-162.383597hypothetical protein
CMM_2460-191.227970hypothetical protein
CMM_2461281.107312conserved hypothetical protein
CMM_24623101.311217putative ATP-dependent helicase
CMM_2463391.153939putative transcriptional regulator, TetR family
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
CMM_2456  PF03309  30     0.017    Bvg accessory factor
		>PF03309#Bvg accessory factor

Length = 271

Score = 30.1 bits (68), Expect = 0.017
Identities = 17/68 (25%), Positives = 28/68 (41%), Gaps = 6/68 (8%)

Query: 1 MRIGIDVGGTNTDAVLMHGD----RVVVGIKSSTTEDVTSGIVGALAELDRQHPFDPADI 56
M + IDV T+T L+ G +VV + T +VT+ + +D D +
Sbjct: 1 MLLAIDVRNTHTVVGLISGSGDHAKVVQQWRIRTEPEVTADELALT--IDGLIGDDAERL 58

Query: 57 DGVMIGTT 64
G +T
Sbjct: 59 TGASGLST 66


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
CMM_2457  PF09025  28     0.046    YopR Core
		>PF09025#YopR Core

Length = 143

Score = 27.7 bits (61), Expect = 0.046
Identities = 25/90 (27%), Positives = 35/90 (38%), Gaps = 3/90 (3%)

Query: 18 HLPAIARGAAILGTGGGGDPYIGRLLAGEAIKELGPIPLADPFDLPDDAVVIPVAMMGAP 77
+LPA+ + A GG P GR LAG LG L F P + + A
Sbjct: 22 NLPAVDQVLAFEQALGGEPPAAGRRLAGLENGALGE-RLLQRFAQPLQGLEADRLELKA- 79

Query: 78 TVMVEKLPTVEQLQGAILALARYLGVTPTH 107
++ +LP Q Q +L L + P
Sbjct: 80 -MLRAELPLGRQQQTFLLQLLGAVEHAPGG 108


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
CMM_2460  RTXTOXIND  31     0.017    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 30.6 bits (69), Expect = 0.017
Identities = 25/206 (12%), Positives = 70/206 (33%), Gaps = 10/206 (4%)

Query: 166 EAVHERIEARRAIDAELAAARAELVDVTATAELQGVGLFEHDHPAESSAELASRLEALRY 225
A+ + + + L A + + ++ L E P E + S E LR
Sbjct: 128 TALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRL 187

Query: 226 T--IKNAVRDKRAVTATSGFTFNGSEAQGRRFVSDMSKVLLRAYNAEAENAVKATRAGNL 283
T IK + + A+ ++ +++ + ++ ++
Sbjct: 188 TSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQ 247

Query: 284 HVAQNRLTKAAEQIARSGTMIDLRIQDGYHELRLEEL--QLASAHLRVLQAEKEMERERR 341
+A++ + + + + + ++ +LE++ ++ SA + + E
Sbjct: 248 AIAKHAVLEQENKYV------EAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEIL 301

Query: 342 AELREQAKASAELQAERERLDKERAH 367
+LR+ L E + ++ +
Sbjct: 302 DKLRQTTDNIGLLTLELAKNEERQQA 327



Score = 29.4 bits (66), Expect = 0.037
Identities = 14/113 (12%), Positives = 30/113 (26%), Gaps = 5/113 (4%)

Query: 86 AELDDYRARAQTEAEAARRTGADAASALVAAARAERDRILAEAGAERLRAEQEAGAVRLR 145
+L A A T +T + A + R + E + +
Sbjct: 125 LKLTALGAEADT-----LKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNV 179

Query: 146 AEEEAGATRRRAEAELQAVDEAVHERIEARRAIDAELAAARAELVDVTATAEL 198
+EEE + + +++ AE A + + +
Sbjct: 180 SEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRV 232


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
CMM_2463  PF03544  40     9e-06    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 40.0 bits (93), Expect = 9e-06
Identities = 23/121 (19%), Positives = 29/121 (23%)

Query: 264 EPTTPAPTPTVPTPTPTVPTPTPTPTVPTPTPTAPTPTPTPTAPTPTPTAPTPTPTAPTP 323
+P + P P P P V P P P AP P P
Sbjct: 48 QPISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKP 107

Query: 324 TPTAPTPTPTAPTPTPTVPTPRATAPTPTPTVPTPTVTVPTPTATVPTPTPTVGTPTPGT 383
P +P PT T T P +V + + P
Sbjct: 108 VKKVEQPKRDVKPVESRPASPFENTAPARPTSSTATAATSKPVTSVASGPRALSRNQPQY 167

Query: 384 P 384
P
Sbjct: 168 P 168



Score = 36.1 bits (83), Expect = 2e-04
Identities = 21/116 (18%), Positives = 25/116 (21%), Gaps = 1/116 (0%)

Query: 273 TVPTPTPTVPTPTPTPTVPTPTPTAPTPTPTPTAPTPTPTAPTPTPTAPTPTPTAPTPTP 332
T +P P P T A P P P P P P P P
Sbjct: 35 TSVHQVIELPAP-AQPISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVV 93

Query: 333 TAPTPTPTVPTPRATAPTPTPTVPTPTVTVPTPTATVPTPTPTVGTPTPGTPTPTA 388
P P+ P V + T + T T
Sbjct: 94 IEKPKPKPKPKPKPVKKVEQPKRDVKPVESRPASPFENTAPARPTSSTATAATSKP 149



Score = 35.3 bits (81), Expect = 3e-04
Identities = 23/114 (20%), Positives = 28/114 (24%), Gaps = 2/114 (1%)

Query: 288 PTVPTPTPTAPTPTPTPTAPTPTPTAPTPTPTAPTPTPTAPTPTPTAPTPTPTVPTPRAT 347
+P P + T A P A P P P P P P P V +
Sbjct: 41 IELPAP-AQPISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVV-IEKPK 98

Query: 348 APTPTPTVPTPTVTVPTPTATVPTPTPTVGTPTPGTPTPTAGTPTPTAGTPTPT 401
P V P P PT+ T T P +
Sbjct: 99 PKPKPKPKPVKKVEQPKRDVKPVESRPASPFENTAPARPTSSTATAATSKPVTS 152



Score = 34.2 bits (78), Expect = 8e-04
Identities = 23/117 (19%), Positives = 27/117 (23%), Gaps = 1/117 (0%)

Query: 277 PTPTVPTPTPTPTVPTPTPTAPTPTPTPTAPTPTPTAPTPTPTAPTPTPTAPTPTPTAPT 336
P P P + T P P P P P P P P P
Sbjct: 44 PAPAQP-ISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPK 102

Query: 337 PTPTVPTPRATAPTPTPTVPTPTVTVPTPTATVPTPTPTVGTPTPGTPTPTAGTPTP 393
P P V + + TA + T T T A P
Sbjct: 103 PKPKPVKKVEQPKRDVKPVESRPASPFENTAPARPTSSTATAATSKPVTSVASGPRA 159



Score = 33.0 bits (75), Expect = 0.002
Identities = 19/111 (17%), Positives = 22/111 (19%)

Query: 300 PTPTPTAPTPTPTAPTPTPTAPTPTPTAPTPTPTAPTPTPTVPTPRATAPTPTPTVPTPT 359
P P P P P P P P P P P
Sbjct: 47 AQPISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPK 106

Query: 360 VTVPTPTATVPTPTPTVGTPTPGTPTPTAGTPTPTAGTPTPTSTTATPIVP 410
+P T A + TA T T+ P
Sbjct: 107 PVKKVEQPKRDVKPVESRPASPFENTAPARPTSSTATAATSKPVTSVASGP 157



Score = 31.5 bits (71), Expect = 0.005
Identities = 18/113 (15%), Positives = 22/113 (19%)

Query: 316 PTPTAPTPTPTAPTPTPTAPTPTPTVPTPRATAPTPTPTVPTPTVTVPTPTATVPTPTPT 375
P + T A P A P P P P P P V P P P
Sbjct: 47 AQPISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPK 106

Query: 376 VGTPTPGTPTPTAGTPTPTAGTPTPTSTTATPIVPGGTTPGGSLPVTGYDPTP 428
+ A T+ + P
Sbjct: 107 PVKKVEQPKRDVKPVESRPASPFENTAPARPTSSTATAATSKPVTSVASGPRA 159


81  CMM_2531  CMM_2536  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
CMM_2531  0       8        0.426851   putative acetyltransferase
CMM_2532  1       7        -0.539363  conserved hypothetical protein
CMM_2533  1       7        -0.510882  putative sugar ABC transporter, permease
CMM_2534  2       6        -0.083398  putative sugar ABC transporter, permease
CMM_2535  2       5        -0.156268  conserved membrane protein, putative
CMM_2536  -2      7        -1.038935  putative sugar ABC transporter, substrate
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
CMM_2531  TCRTETA  51     5e-09    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 50.6 bits (121), Expect = 5e-09
Identities = 62/278 (22%), Positives = 100/278 (35%), Gaps = 10/278 (3%)

Query: 74 GLLVSVFAVAVVVSSAPLAALTRRVPRHALLLAVLAVFATSNLLTALAPTFELVTATRVL 133
G+L++++A+ + L AL+ R R +LL LA A + A AP ++ R++
Sbjct: 46 GILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIV 105

Query: 134 GGLAHGLFWAVVGAYSAHLVPPALIGRAVALTVGGGSLAFVLGVPVGTAAGQAFGWRAAF 193
G+ AV GAY A + R V G +G G F A F
Sbjct: 106 AGITGATG-AVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGG-FSPHAPF 163

Query: 194 AGIGVLTLLGIVLVWRFLPAVGRPERAAPVDAPSGGRVGIRQRMAGQGLGGVLLVCLTAM 253
L L + LP + ER R + ++ V
Sbjct: 164 FAAAALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFI-- 221

Query: 254 VIMIGHYGFYTYVDPFVTRVMGIPEQQLSAMLFGYGIAGAVGLLVAGTVFASR--PQLGV 311
+ ++G +V F + L +GI ++ + A+R + +
Sbjct: 222 MQLVGQVPAALWV-IFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRAL 280

Query: 312 LAGLAAAAAGVSALAFVPGLWAVAIPGFLLWGLAFGAI 349
+ G+ A G LAF W +A P +L LA G I
Sbjct: 281 MLGMIADGTGYILLAFATRGW-MAFPIMVL--LASGGI 315


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_2533  ABC2TRNSPORT  36     9e-05    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 36.1 bits (83), Expect = 9e-05
Identities = 59/258 (22%), Positives = 98/258 (37%), Gaps = 23/258 (8%)

Query: 29 WIAT-RRNLIKIKR-LPDILVFTLIQPIMFVLLFSFVYAGAID-IPGSDYKEFIMAGIFA 85
WIA RRN I K+ L+ L +P++++ + + G Y F+ AG+ A
Sbjct: 16 WIAVWRRNYIAWKKAALASLLGHLAEPLIYLFGLGAGLGVMVGRVGGVSYTAFLAAGMVA 75

Query: 86 QTVVFGSTYSG--SAMANDLKDGIIDRFRTLPMSPSAVLVGR----TNGDLLINVFSMVV 139
+ + +T+ +A + + +++G L VV
Sbjct: 76 TSAMTAATFETIYAAFGRMEGQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGAGIGVV 135

Query: 140 MLATGFVVGWRVRSSLLEALAGVALLLLFAYALSWVMAFLGMSVRSPEVINNASFLVLFP 199
A G+ SLL AL +AL L +L V+ L S + LV+ P
Sbjct: 136 AAALGYTQWL----SLLYALPVIALTGLAFASLGMVVTALA---PSYDYFIFYQTLVITP 188

Query: 200 LTFISNAFVPSDSLPTPLRVFAEYNPVSSVVLAARQLFGNVPAGTPVPDAWTLQNPVATT 259
+ F+S A P D LP + A + P+S + L + G PV D V
Sbjct: 189 ILFLSGAVFPVDQLPIVFQTAARFLPLSHSI----DLIRPIMLGHPVVDVCQH---VGAL 241

Query: 260 LIWAGVLLVVAVPLAVRK 277
I+ + ++ L R+
Sbjct: 242 CIYIVIPFFLSTALLRRR 259


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
CMM_2535  SUBTILISIN  107    3e-27    Subtilisin serine protease family (S8) signature.
		>SUBTILISIN#Subtilisin serine protease family (S8) signature.

Length = 326

Score = 107 bits (269), Expect = 3e-27
Identities = 54/302 (17%), Positives = 85/302 (28%), Gaps = 86/302 (28%)

Query: 179 PDRIYHPTSTPAADFLGLTGADGVWAKTGGQEEAGEGAVIGVIDTGIAPENPAFAGEPLG 238
+ + A VW +T G G + V+DTG ++P
Sbjct: 11 QVIKQEQQVNEIPRGVEMIQAPAVWNQT-----RGRGVKVAVLDTGCDADHPDLKA---- 61

Query: 239 TTAGAEPYRDGSAIAYAKGDGTTFRGACQTGEQFTAADCSTKIVGARYFVTGFGQENIGT 298
+I+G R F
Sbjct: 62 -----------------------------------------RIIGGRNFTDD-------- 72

Query: 299 AATGEYVSPRDGDGHGSHTASTAAGEADVTATIDGNDLGEISGVAPASKIAAYKVCWSGP 358
G+ +D +GHG+H A T A + + GVAP + + KV
Sbjct: 73 -DEGDPEIFKDYNGHGTHVAGTIAA---------TENENGVVGVAPEADLLIIKVLNK-- 120

Query: 359 DPAVQTDDGCAGADLVAAIEQATKDGVDVINYSIGGGSARTTFSATDSAFLGAASAGIFV 418
++ I A + VD+I+ S+GG A A ++ I V
Sbjct: 121 ------QGSGQYDWIIQGIYYAIEQKVDIISMSLGGPEDVPEL---HEAVKKAVASQILV 171

Query: 419 AASAGNSGPGASTLD-----NASPWITTVAASTVAGNFEATAQLGDGQAFA--GSSITVT 471
+AGN G G D + +V A + + + G I T
Sbjct: 172 MCAAGNEGDGDDRTDELGYPGCYNEVISVGAINFDRHASEFSNSNNEVDLVAPGEDILST 231

Query: 472 EP 473
P
Sbjct: 232 VP 233



Score = 67.6 bits (165), Expect = 6e-14
Identities = 28/142 (19%), Positives = 46/142 (32%), Gaps = 28/142 (19%)

Query: 580 DNTTGVSAPTP--QVAGFSSRGPVLADGSDILKPDVTAPGVSIIAATNNAEGEDPTFALL 637
+ V A + FS+ + D+ APG I++ +A
Sbjct: 195 NEVISVGAINFDRHASEFSNSNN---------EVDLVAPGEDILSTVPGG-----KYATF 240

Query: 638 SGTSMAAPHVAGLALLYLG-----EHPKATPAEIKSAMMTTAYDTVDEDGGKVTDPFTQG 692
SGTSMA PHVAG L T E+ + ++ + P +G
Sbjct: 241 SGTSMATPHVAGALALIKQLANASFERDLTEPELYAQLIKRTIPLGNS-------PKMEG 293

Query: 693 AGHVDARRYLDPGLLYLNDRAD 714
G + + ++ R
Sbjct: 294 NGLLYLTAVEELSRIFDTQRVA 315


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
CMM_2536  SUBTILISIN  110    3e-28    Subtilisin serine protease family (S8) signature.
		>SUBTILISIN#Subtilisin serine protease family (S8) signature.

Length = 326

Score = 110 bits (277), Expect = 3e-28
Identities = 57/297 (19%), Positives = 86/297 (28%), Gaps = 80/297 (26%)

Query: 147 LSVEPDRTLHTTSTPDSRFLGLEGDDGLWSKVGGSDKAGEGTVIGVLDTGIAPDNPSFAG 206
+ R + + +W++ G G + VLDTG D+P
Sbjct: 7 IIPYQVIKQEQQVNEIPRGVEMIQAPAVWNQT-----RGRGVKVAVLDTGCDADHPDLKA 61

Query: 207 KPLGSTPGADPYLDGSRIDFRKGDGTVFHGTCETGDGFTADDCSTKIVGARSFEAGRAAS 266
+ I+G R+F
Sbjct: 62 R---------------------------------------------IIGGRNFTDDD--E 74

Query: 267 GDPNGPQEKLSPLDTAGHGSHTASTAAGDAGVAATAGTIQETIAGIAPAAKIAAYKVCWS 326
GDP D GHG+H A T A + + G+AP A + KV
Sbjct: 75 GDPE------IFKDYNGHGTHVAGTIAATEN--------ENGVVGVAPEADLLIIKVLNK 120

Query: 327 GPDPSKETDDGCELSDIVAGIEQATADGVDVLNMSLGGPGKTEDAFQRALLGAADAGIFV 386
+ I+ GI A VD+++MSLGGP A+ A + I V
Sbjct: 121 --------QGSGQYDWIIQGIYYAIEQKVDIISMSLGGPEDV-PELHEAVKKAVASQILV 171

Query: 387 AASAGNSGPDAGTVA-----NTEPWVTTVAASSVPRNYSGTVTLGSGAKFAGASATV 438
+AGN G V +V A + R+ S + +
Sbjct: 172 MCAAGNEGDGDDRTDELGYPGCYNEVISVGAINFDRHASEFSNSNNEVDLVAPGEDI 228



Score = 57.2 bits (138), Expect = 1e-10
Identities = 23/123 (18%), Positives = 41/123 (33%), Gaps = 26/123 (21%)

Query: 559 QVAGFSSRGASEDVDGGDTIKPDITAPGVGILAAISDDGGKPAFAPYSGTSMSSPHIAGF 618
+ FS+ D+ APG IL+ + +A +SGTSM++PH+AG
Sbjct: 208 HASEFSNSNNE----------VDLVAPGEDILSTVPGGK----YATFSGTSMATPHVAGA 253

Query: 619 GLVYLG-----VHPKASPAEVKSALMTTATDTLDANGKPATDPFAQGAGQIAPDRFLNPG 673
+ + E+ + L+ P +G G +
Sbjct: 254 LALIKQLANASFERDLTEPELYAQLIKRTIPL-------GNSPKMEGNGLLYLTAVEELS 306

Query: 674 LYY 676
+
Sbjct: 307 RIF 309


82  CMM_2569  CMM_2574  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
CMM_2569193.062511putative multidrug ABC transporter, permease
CMM_25702113.074535putative ABC transporter, ATP-binding protein
CMM_25710112.218935conserved hypothetical protein, putative
CMM_2572-191.455862conserved hypothetical protein
CMM_2573-280.938777putative zinc-dependant oxidoreductase
CMM_2574-27-0.379629putative carboxylesterase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
CMM_2569  PF03309  29     0.032    Bvg accessory factor
		>PF03309#Bvg accessory factor

Length = 271

Score = 29.0 bits (65), Expect = 0.032
Identities = 7/38 (18%), Positives = 14/38 (36%)

Query: 129 DEVTAAVAGWNLAPFPEAEVVQGTAETFDAGGVDGVYL 166
D + +A ++ V G++ D G +L
Sbjct: 111 DRIVNCLAAYHKYGTAAIVVDFGSSICVDVVSAKGEFL 148


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_2571  SACTRNSFRASE  42     2e-06    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 41.9 bits (98), Expect = 2e-06
Identities = 18/57 (31%), Positives = 28/57 (49%)

Query: 71 ADVQTIAVADGSRGRGIGRALLTRLVAEAHVRGAREVLLEVRADNPVAQALYSSLGF 127
A ++ IAVA R +G+G ALL + + A ++LE + N A Y+ F
Sbjct: 90 ALIEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHF 146


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
CMM_2573  ALARACEMASE  278    3e-91    Alanine racemase signature.
		>ALARACEMASE#Alanine racemase signature.

Length = 356

Score = 278 bits (713), Expect = 3e-91
Identities = 109/370 (29%), Positives = 175/370 (47%), Gaps = 25/370 (6%)

Query: 17 RAVVDLDAIRHNVRTLAALAAPARTMVAVKADAYGHGALQVARAALEAGAESLAVLDVAS 76
+A +DL A++ N+ + A AR VKA+AYGHG ++ A +L L+
Sbjct: 6 QASLDLQALKQNLSIVRQAATHARVWSVVKANAYGHGIERIWSAIGATDGFALLNLE--E 63

Query: 77 AVELRRSGITARLL---AWLHGVDTDFRVAVENGIDLGVSALWELERIAAAGRATGIRAR 133
A+ LR G +L + H D + ++ + V + W+L+ A
Sbjct: 64 AITLRERGWKGPILMLEGFFHA--QDLEIYDQHRLTTCVHSNWQLK--ALQNARLKAPLD 119

Query: 134 VHLKADTGLSRNGATPELWPDLVRAAVAADAAGELTLHALWSHLADA-SPEDDDAALARF 192
++LK ++G++R G P+ + + A GE+TL SH A+A P+ A+AR
Sbjct: 120 IYLKVNSGMNRLGFQPDRVLTVWQQLRAMANVGEMTL---MSHFAEAEHPDGISGAMARI 176

Query: 193 HEAVRVAEELGARPVEKHLAASSAGIRLPAARFDMVRFGIAVYGISPFDDRS-GRDLGLI 251
+A E L + L+ S+A + P A FD VR GI +YG SP + GL
Sbjct: 177 EQAA---EGL---ECRRSLSNSAATLWHPEAHFDWVRPGIILYGASPSGQWRDIANTGLR 230

Query: 252 PAMTLEADVVSVKRVEAGHGVSYGLDHRTAGPSTLVLVPLGYADGIPRIAAPRASVLLNG 311
P MTL ++++ V+ ++AG V YG + + +V GYADG PR A VL++G
Sbjct: 231 PVMTLSSEIIGVQTLKAGERVGYGGRYTARDEQRIGIVAAGYADGYPRHAPTGTPVLVDG 290

Query: 312 RRFPVAGRIAMDQLVLDVGDLP-VEVGDTAVILGPGDRGEPTAEEWAGWAETIGDEIVTR 370
R G ++MD L +D+ P +G + G E ++ A A T+G E++
Sbjct: 291 VRTMTVGTVSMDMLAVDLTPCPQAGIGTPVELWGK----EIKIDDVAAAAGTVGYELMCA 346

Query: 371 VGPRVDRVHL 380
+ RV V +
Sbjct: 347 LALRVPVVTV 356


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
CMM_2574  ALARACEMASE  284    9e-96    Alanine racemase signature.
		>ALARACEMASE#Alanine racemase signature.

Length = 356

Score = 284 bits (728), Expect = 9e-96
Identities = 105/369 (28%), Positives = 165/369 (44%), Gaps = 24/369 (6%)

Query: 16 EARIDTGAISANVRALRAATGAPLVMAVVKADGYGHGAVASARAALAGGADRLGVVDVRE 75
+A +D A+ N+ +R A V +VVKA+ YGHG A A D ++++ E
Sbjct: 6 QASLDLQALKQNLSIVRQAATHARVWSVVKANAYGHGIERIWSAIGA--TDGFALLNLEE 63

Query: 76 ALALRAAGIDAPVLT-WMHAPGADFATAIEADVDLGLNSLRQVREVAEAARRVGRTAEVH 134
A+ LR G P+L D + + ++S Q++ + A R+ +++
Sbjct: 64 AITLRERGWKGPILMLEGFFHAQDLEIYDQHRLTTCVHSNWQLKALQNA--RLKAPLDIY 121

Query: 135 LKVDTGLGRNGVTPAEWPAVVAEVQALVAEGRIHLGGVFSHLANAGREED-RAQVRAFHQ 193
LKV++G+ R G P V+ Q L A + + SH A A + + Q
Sbjct: 122 LKVNSGMNRLGFQPDR---VLTVWQQLRAMANVGEMTLMSHFAEAEHPDGISGAMARIEQ 178

Query: 194 SVDVVRAAGLEPGIRHLAATAGALRVPEARLDMVRLGIGIYGISPLDGV-TSADLGLVPA 252
+ + + R L+ +A L PEA D VR GI +YG SP A+ GL P
Sbjct: 179 AAEGL------ECRRSLSNSAATLWHPEAHFDWVRPGIILYGASPSGQWRDIANTGLRPV 232

Query: 253 MTLVGSVVAVKRVPADTGVSYGYTYRTSSATTLALVSLGFADGVPRLASNRAPVAIHGAR 312
MTL ++ V+ + A V YG Y + +V+ G+ADG PR A PV + G R
Sbjct: 233 MTLSSEIIGVQTLKAGERVGYGGRYTARDEQRIGIVAAGYADGYPRHAPTGTPVLVDGVR 292

Query: 313 FRVSGRIAMDQFVVDVGDGVVDGIPVAVGDDAVLFGDPATGAPSVEEWAEATGTIGYEIV 372
G ++MD VD+ +G L+G +++ A A GT+GYE++
Sbjct: 293 TMTVGTVSMDMLAVDLTPCP----QAGIGTPVELWGK----EIKIDDVAAAAGTVGYELM 344

Query: 373 ARVAGRVTR 381
+A RV
Sbjct: 345 CALALRVPV 353


83  CMM_2844  CMM_2850  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
CMM_2844620-6.248820
CMM_2845722-6.033275
CMM_2846621-5.159853
CMM_284707-1.234716
CMM_2848-270.306332
CMM_2849-271.244714
CMM_2850393.318811
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
CMM_2844  TYPE4SSCAGA  29     0.027    Type IV secretion system CagA exotoxin signature.
		>TYPE4SSCAGA#Type IV secretion system CagA exotoxin signature.

Length = 1147

Score = 28.9 bits (64), Expect = 0.027
Identities = 15/51 (29%), Positives = 31/51 (60%), Gaps = 4/51 (7%)

Query: 139 RREARAELTELGISTLQDVTVPVENLSGGQRQAVAVARAAAFGSKVVVLDE 189
R + +L+++G+S Q++ ++NL+ QAV+ A+A FG+ +D+
Sbjct: 946 RHDKVDDLSKVGLSRNQELAQKIDNLN----QAVSEAKAGFFGNLEQTIDK 992


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_2848  DHBDHDRGNASE  96     9e-26    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 95.5 bits (237), Expect = 9e-26
Identities = 73/249 (29%), Positives = 111/249 (44%), Gaps = 10/249 (4%)

Query: 7 KTALVTGGGSGIGAAISRALAAEGASVVVTDIQLDAAERVVAEIEGAGGTATAFRQDTAK 66
K A +TG GIG A++R LA++GA + D + E+VV+ ++ A AF D
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRD 68

Query: 67 AEDSEAAVAHAVGTYGALHLAVNNAGISAPAADIGDYEISAWDRTRAIDLDGVFYGLRYQ 126
+ + A G + + VN AG+ P I W+ T +++ GVF R
Sbjct: 69 SAAIDEITARIEREMGPIDILVNVAGVLRPGL-IHSLSDEEWEATFSVNSTGVFNASRSV 127

Query: 127 VPAMVEAGGGAIVNMSSVLGSVGFAQNAAYVASKHALVGLTKVAALEYTARGVRTNAVGP 186
M++ G+IV + S V AAY +SK A V TK LE +R N V P
Sbjct: 128 SKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSP 187

Query: 187 GFIDTPLVRSSLSAD---------ALAYLESQHATGRLGTDKEVAALVLFLLSDDASFIS 237
G +T + S + + +L ++ +L ++A VLFL+S A I+
Sbjct: 188 GSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGHIT 247

Query: 238 GSYHLVDGG 246
VDGG
Sbjct: 248 MHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
CMM_2849  RTXTOXINA  31     0.014    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 31.5 bits (71), Expect = 0.014
Identities = 29/108 (26%), Positives = 42/108 (38%), Gaps = 31/108 (28%)

Query: 55 AGRALSRRTLLGGAGAGALAILVAQNAAA----PGAAAAQAASNLPFTAITPVDAAVDQF 110
AG L+ + +LG G G ++AQ AA AAA AS + AI+P+
Sbjct: 271 AGVELTTK-VLGNVGKGISQYIIAQRAAQGLSTSAAAAGLIASAVTL-AISPLS------ 322

Query: 111 TVPTGYRWQPIIRWGDPLFSYADDFDADNQTAKLASR--QFGYNNDYL 156
S AD F N+ + + R + GY+ D L
Sbjct: 323 -----------------FLSIADKFKRANKIEEYSQRFKKLGYDGDSL 353


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
CMM_2850  HELNAPAPROT  28     0.005    Helicobacter neutrophil-activating protein A family ...
		>HELNAPAPROT#Helicobacter neutrophil-activating protein A family signature.

Length = 153

Score = 28.3 bits (63), Expect = 0.005
Identities = 6/53 (11%), Positives = 16/53 (30%), Gaps = 3/53 (5%)

Query: 32 WPRTDPSLPAELVDALESHLGRGRSGVRTAKRAGDD--AGIATARERVSLAKH 82
+ + LV+ + + + A+ D+ A + + K
Sbjct: 93 NETSASEMVQALVNDYKQISSESKFVIGLAEENQDNATADLFVGLIE-EVEKQ 144


84  CMM_2892  CMM_2896  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
CMM_2892-1111.714396
CMM_28931100.691648
CMM_28940100.634219
CMM_2895-210-1.084056
CMM_2896734-9.341969
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_2892  ACRIFLAVINRP  50     2e-08    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 50.2 bits (120), Expect = 2e-08
Identities = 36/177 (20%), Positives = 60/177 (33%), Gaps = 17/177 (9%)

Query: 506 ASSERDQGLIIPIILAIVFVILGLLLRSLVAPVLLIASVLATFFASLGAANVLFQQVLGF 565
S ++ I +VF+ L L S PV ++ V L AA LF Q
Sbjct: 866 RLSGNQAPALVAISFVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAAT-LFNQKNDV 924

Query: 566 PAFDANVVLFAFLFLVALGVDYNIFLVTRAREERRLHGTREGMVRALASTGGVITSAGIL 625
+ L + L I +V A++ G V + IL
Sbjct: 925 YF------MVGLLTTIGLSAKNAILIVEFAKDLMEKEGKG---VVEATLMAVRMRLRPIL 975

Query: 626 LAAVFAVLGVLPVVALTQIGV-------IVCIGVLLDTLVVRTLLVPALVFLLGDRF 675
+ ++ +LGVLP+ G I +G ++ ++ VP ++ F
Sbjct: 976 MTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIFFVPVFFVVIRRCF 1032



Score = 41.4 bits (97), Expect = 1e-05
Identities = 46/221 (20%), Positives = 84/221 (38%), Gaps = 23/221 (10%)

Query: 117 LVAVPLDASAVADDVAGTAATLRTTAADGLPDGLVAQLTGPVGFQADISNAFAGADFRLL 176
L ++ + A +G A L A LP G+ TG + N L+
Sbjct: 821 LPSMEIQGEAAPGTSSGDAMALMENLASKLPAGIGYDWTGMSYQERLSGNQAP----ALV 876

Query: 177 LVTVLVVAVLLIVTYRS-----PVLWIVPLVVVGAADGLARVVVTALADPLGITIDASIG 231
++ +VV + L Y S V+ +VPL +VG L + + +G
Sbjct: 877 AISFVVVFLCLAALYESWSIPVSVMLVVPLGIVGV---LLAATLFNQK----NDVYFMVG 929

Query: 232 GILSVLVFGAGTNYALLLVARYREELTRQ-EDRHAAMLTAVTSAGPAIAASGGTVALSLI 290
+ G A+L+V ++ + ++ + A L AV I + L ++
Sbjct: 930 LLT---TIGLSAKNAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVL 986

Query: 291 TLLLAELSGN---RALGFACAIGVLIAIAAALLVLPAALVV 328
L ++ +G+ A+G G++ A A+ +P VV
Sbjct: 987 PLAISNGAGSGAQNAVGIGVMGGMVSATLLAIFFVPVFFVV 1027


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_2893  DHBDHDRGNASE  107    2e-30    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 107 bits (269), Expect = 2e-30
Identities = 75/252 (29%), Positives = 123/252 (48%), Gaps = 15/252 (5%)

Query: 8 DGRTAIVTGAGSGIGRASALRFAREGARVIAADVSAERLDALV------AEHPDLELVPV 61
+G+ A +TGA GIG A A A +GA + A D + E+L+ +V A H +
Sbjct: 7 EGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAE-AFPAD 65

Query: 62 VGDITTEDGVHAIVAAADGRVDVLANVAGIMDGFLPAAEVDDRTWDLVFAVNVTAVMRLT 121
V D D + A + G +D+L NVAG++ + D W+ F+VN T V +
Sbjct: 66 VRDSAAIDEITARIEREMGPIDILVNVAGVLR-PGLIHSLSDEEWEATFSVNSTGVFNAS 124

Query: 122 RAVLPLMVEAGKGAIVNIASEAALRGSAAGLAYTASKHAVAGMTKNTAVLYAGQGIRVNA 181
R+V M++ G+IV + S A + AY +SK A TK + A IR N
Sbjct: 125 RSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNI 184

Query: 182 VAPGATNTAIEAPMRSERAG---RVLGPIMQTIVPAPV----EADEMANSVIWLASDQSS 234
V+PG+T T ++ + ++ G + G + P+ + ++A++V++L S Q+
Sbjct: 185 VSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAG 244

Query: 235 NVTGVILASDGG 246
++T L DGG
Sbjct: 245 HITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
CMM_2894  HTHTETR  57     3e-12    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 56.6 bits (136), Expect = 3e-12
Identities = 26/116 (22%), Positives = 46/116 (39%)

Query: 21 QDLARVAIGLFAERGYDAVSMADIAEAAGVGRRSLFRYFPSKADLVWGGADVVGAELERL 80
Q + VA+ LF+++G + S+ +IA+AAGV R +++ +F K+DL ++ + + L
Sbjct: 14 QHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIGEL 73

Query: 81 LAEGPGDRPLGDAYADAFVAALGPRFADMGMTRVRLRIIDAHPDLHSRSSPRITAA 136
E P + R L I H + A
Sbjct: 74 ELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQA 129


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
CMM_2896  PREPILNPTASE  28     0.030    Type IV prepilin cysteine protease (C20) family sig...
		>PREPILNPTASE#Type IV prepilin cysteine protease (C20) family signature.

Length = 290

Score = 27.8 bits (62), Expect = 0.030
Identities = 19/58 (32%), Positives = 28/58 (48%), Gaps = 6/58 (10%)

Query: 17 LALVGAL--VPVLDLVTGLLAVVGAVFGILALTRKRRRNSRPLAITGLALNVVALAAW 72
LA +GA L +V L ++VGA GI + + S+P+ G L A+A W
Sbjct: 219 LAALGAWLGWQALPIVLLLSSLVGAFMGIGLILLRNHHQSKPIPF-GPYL---AIAGW 272



 
Contact Sachin Pundhir for Bugs/Comments.