PredictBias

identification of genomic and pathogenicity islands in prokaryotic genome
 
A) Input parameters
Genome: Staph_aureus_NZ_CP007454.gb
Threshold dinucleotide bias: 2
Threshold codon bias: 4
Threshold %GC bias: 3
E-value (RPSBlast): 0.05
Genome (non-pathogenic):
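
The E-value (RPSBlast) parameter above appears to act as the cutoff for the virulence-factor hits reported in section C, since every hit listed there has an Expect value below 0.05. As a minimal sketch, assuming the hit blocks keep the BLAST-style text layout seen on this page ("Score = ... bits (...), Expect = ..."), the Python snippet below extracts those summary lines and keeps the ones at or below a cutoff. The names `parse_hsps` and `passes_cutoff` are hypothetical helpers, and treating the cutoff as inclusive is an assumption rather than documented PredictBias behaviour.

```python
import re

# Matches BLAST-style summary lines such as
#   "Score = 31.1 bits (70), Expect = 0.001"
# Assumption: Expect is a plain or exponent-style number with a leading digit.
HSP_LINE = re.compile(
    r"Score\s*=\s*([\d.]+)\s*bits\s*\((\d+)\),\s*"
    r"Expect\s*=\s*(\d*\.?\d+(?:[eE][-+]?\d+)?)"
)


def parse_hsps(report_text):
    """Return a list of (bit_score, raw_score, e_value) tuples, one per Score/Expect line."""
    hsps = []
    for match in HSP_LINE.finditer(report_text):
        bits, raw, expect = match.groups()
        hsps.append((float(bits), int(raw), float(expect)))
    return hsps


def passes_cutoff(hsps, e_value_cutoff=0.05):
    """Keep entries whose E-value is at or below the cutoff (inclusive by assumption)."""
    return [hsp for hsp in hsps if hsp[2] <= e_value_cutoff]


# Three summary lines copied from the hit reports on this page.
sample = """
Score = 31.1 bits (70), Expect = 0.001
Score = 25.0 bits (54), Expect = 0.041
Score = 84.6 bits (209), Expect = 5e-21
"""

kept = passes_cutoff(parse_hsps(sample))
for bits, raw, e_value in kept:
    print(f"{bits:6.1f} bits  raw={raw:<4d} E={e_value:g}")
```

Lowering `e_value_cutoff` (for example to 0.01) is a simple way to see which of the weaker hits, such as the 25.0-bit match, would drop out under a stricter setting.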
 
C) Potential GIs and PAIs in NZ_CP007454
S.No  Start         End           Bias  Virulence  Insertion elements  Prediction
1     CH52_RS00550  CH52_RS00610  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
CH52_RS00550  2       11       -1.041407  isochorismate synthase MenF
CH52_RS15065  3       15       -0.149115  hypothetical protein
CH52_RS00555  3       16       -0.566528  1,4-dihydroxy-2-naphthoate
CH52_RS00560  0       16       -2.052116  GNAT family N-acetyltransferase
CH52_RS00565  -1      17       -3.104822  TM2 domain-containing protein
CH52_RS00570  -2      18       -2.807701  hypothetical protein
CH52_RS00575  0       14       -4.428744  ABC transporter substrate-binding protein
CH52_RS00580  -1      14       -5.169771  DoxX family protein
CH52_RS14225  0       16       -5.661872  hypothetical protein
CH52_RS00590  1       17       -5.524347  ABC transporter ATP-binding protein
CH52_RS00595  1       17       -5.585720  YxeA family protein
CH52_RS00600  0       15       -5.701257  bacteriocin-associated integral membrane family
CH52_RS14230  -2      18       -3.773541  lactococcin 972 family bacteriocin
CH52_RS00605  -2      17       -4.112913  hypothetical protein
CH52_RS00610  -3      18       -3.730691  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS00560  SACTRNSFRASE  31     0.001    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 31.1 bits (70), Expect = 0.001
Identities = 18/68 (26%), Positives = 27/68 (39%), Gaps = 3/68 (4%)

Query: 101 LPVKEAKDDEYYIETIATFAAYRGRGIATKLLTSLLESNTHVKWS---LNCDINNEAALK 157
+ ++ + IE IA YR +G+ T LL +E + L N +A
Sbjct: 80 IKIRSNWNGYALIEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACH 139

Query: 158 LYKKVGFI 165
Y K FI
Sbjct: 140 FYAKHHFI 147


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS00575  FERRIBNDNGPP  85     5e-21    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 84.6 bits (209), Expect = 5e-21
Identities = 46/253 (18%), Positives = 104/253 (41%), Gaps = 27/253 (10%)

Query: 48 NPKRVVVLEYSFADYLAALDMKPVGIADDGSTK------NITKSVRDKIGAYESVGSRPQ 101
+P R+V LE+ + L AL + P G+AD + + + SV D VG R +
Sbjct: 34 DPNRIVALEWLPVELLLALGIVPYGVADTINYRLWVSEPPLPDSVID-------VGLRTE 86

Query: 102 PNMEVISKLKPDLIIADVSRHKKIKSELSKIAPTIMLVSGTGDYNANI--EAFKTVAKAV 159
PN+E+++++KP ++ + + L++IAP G + ++ +A +
Sbjct: 87 PNLELLTEMKPSFMVWS-AGYGPSPEMLARIAPGRGFNFSDGKQPLAMARKSLTEMADLL 145

Query: 160 GKEKEGEKRLEKHDKILAEIRKKIEQSTLKSAFAFGISRA-GMFINNEDTFMGQFLIKMG 218
+ E L +++ + ++ + + + + M + ++ + L + G
Sbjct: 146 NLQSAAETHLAQYEDFIRSMKPRFVKRGARPLLLTTLIDPRHMLVFGPNSLFQEILDEYG 205

Query: 219 IQPEVTKDKTTHVGERKGGPYIYLNNEELANI-NPKVMILATDGKTDKNRTKFIDPAVWK 277
I P + +T G ++ + LA + V+ D D + + +W+
Sbjct: 206 I-PNAWQGETNFWG------STAVSIDRLAAYKDVDVLCFDHDNSKDMD--ALMATPLWQ 256

Query: 278 SLKAVKDNKVYDV 290
++ V+ + V
Sbjct: 257 AMPFVRAGRFQRV 269


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS14230  PRTACTNFAMLY  25     0.041    Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 25.0 bits (54), Expect = 0.041
Identities = 13/34 (38%), Positives = 18/34 (52%), Gaps = 4/34 (11%)

Query: 7 ALGLSTAAYASTEYAEGGT----WSHGVGSKYVW 36
ALG + YAS EY++G W+ G +Y W
Sbjct: 877 ALGRGHSLYASYEYSKGPKLAMPWTFHAGYRYSW 910


2     CH52_RS00840  CH52_RS00870  Y     N          N                   Genomic Island
LocusTag      DNBias  CDNBias  %GCBias    Product
CH52_RS00840  2       14       1.297166   DUF3899 domain-containing protein
CH52_RS00845  2       14       -0.780289  beta-ketoacyl-ACP synthase II
CH52_RS00850  4       19       -4.095165  ketoacyl-ACP synthase III
CH52_RS00855  4       19       -6.222433  YjzD family protein
CH52_RS00860  3       17       -6.432317  MAP domain-containing protein
CH52_RS00865  1       13       -4.185550  YbhB/YbcL family Raf kinase inhibitor-like
CH52_RS00870  1       11       -3.996170  membrane protein
3     CH52_RS01175  CH52_RS01515  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
CH52_RS01175  2       16       -5.306373  CNNM domain-containing protein
CH52_RS01180  2       20       -7.007144  hypothetical protein
CH52_RS01185  5       21       -7.043248  hypothetical protein
CH52_RS01190  5       21       -6.428379  hypothetical protein
CH52_RS01195  4       20       -4.263963  hypothetical protein
CH52_RS01200  2       20       -2.619732  hypothetical protein
CH52_RS01205  3       20       -0.371595  hypothetical protein
CH52_RS01210  5       20       -0.264218  hypothetical protein
CH52_RS01215  2       17       0.216123   staphylokinase
CH52_RS01220  3       18       1.129832   CHAP domain-containing protein
CH52_RS01225  2       18       1.142132   phage holin
CH52_RS01235  3       21       1.194051   hypothetical protein
CH52_RS01240  1       22       1.643531   BppU family phage baseplate upper protein
CH52_RS01245  2       23       2.079270   N-acetylglucosaminidase
CH52_RS01250  3       21       1.877869   DUF2951 domain-containing protein
CH52_RS01255  3       21       1.421359   XkdX family protein
CH52_RS01260  2       20       1.421359   DUF2977 domain-containing protein
CH52_RS01265  2       19       1.301537   BppU family phage baseplate upper protein
CH52_RS01270  2       18       1.191410   hypothetical protein
CH52_RS01275  2       18       1.082763   SGNH/GDSL hydrolase family protein
CH52_RS01280  3       18       0.684690   phage tail family protein
CH52_RS01285  3       18       1.347643   hypothetical protein
CH52_RS01290  1       21       1.371361   hypothetical protein
CH52_RS01295  1       22       1.509130   tail assembly chaperone
CH52_RS01300  0       23       1.211862   phage major tail protein, TP901-1 family
CH52_RS01305  1       19       0.770232   hypothetical protein
CH52_RS01310  1       20       0.932642   HK97 gp10 family phage protein
CH52_RS01315  2       20       0.916929   hypothetical protein
CH52_RS01320  3       19       1.017577   phage head-tail connector protein
CH52_RS01325  4       17       1.265851   hypothetical protein
CH52_RS01330  4       18       0.798783   phage major capsid protein
CH52_RS01335  3       21       0.559839   DUF4355 domain-containing protein
CH52_RS15075  3       22       0.711977   hypothetical protein
CH52_RS01345  4       22       0.554843   minor capsid protein
CH52_RS01350  4       22       0.517759   phage portal protein
CH52_RS01355  4       24       -0.006867  PBSX family phage terminase large subunit
CH52_RS01360  2       27       0.436961   terminase small subunit
CH52_RS01365  7       27       -0.117584  RinA family phage transcriptional activator
CH52_RS01370  7       26       -0.452727  hypothetical protein
CH52_RS01375  5       26       0.331163   hypothetical protein
CH52_RS01380  5       26       0.822056   transcriptional activator RinB
CH52_RS01385  6       31       2.059792   hypothetical protein
CH52_RS01395  3       28       3.036784   DUF1381 domain-containing protein
CH52_RS01400  4       29       2.711307   dUTPase
CH52_RS01405  4       31       2.850606   DUF1024 family protein
CH52_RS01410  4       33       2.544445   phi PVL orf 51-like protein
CH52_RS01415  6       33       2.597844   hypothetical protein
CH52_RS01420  4       33       2.141223   DUF3113 family protein
CH52_RS01425  6       32       1.315661   DUF1064 domain-containing protein
CH52_RS01430  4       32       1.572887   DUF3269 family protein
CH52_RS01435  3       29       1.021466   hypothetical protein
CH52_RS01440  3       28       1.090554   DnaB helicase C-terminal domain-containing
CH52_RS01445  1       26       0.411511   hypothetical protein
CH52_RS01450  2       26       0.152385   DnaD domain protein
CH52_RS01455  -1      27       1.065518   putative HNHc nuclease
CH52_RS01460  0       27       0.464143   single-stranded DNA-binding protein
CH52_RS01465  3       35       1.261148   ATP-binding protein
CH52_RS01470  7       38       1.604338   DUF2483 family protein
CH52_RS01475  7       36       0.324650   DUF1108 family protein
CH52_RS01480  6       33       1.322168   DUF1270 family protein
CH52_RS01485  6       29       0.776258   hypothetical protein
CH52_RS01490  5       26       0.297478   hypothetical protein
CH52_RS01495  6       25       -0.837420  hypothetical protein
CH52_RS01500  5       23       -1.116086  phage antirepressor KilAC domain-containing
CH52_RS01505  4       20       -0.857986  helix-turn-helix transcriptional regulator
CH52_RS01510  3       16       -0.952099  XRE family transcriptional regulator
CH52_RS01515  2       16       -0.480625  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS01285  GPOSANCHOR    50     3e-08    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 50.4 bits (120), Expect = 3e-08
Identities = 39/189 (20%), Positives = 76/189 (40%), Gaps = 34/189 (17%)

Query: 8 ATIEASVAKFKRQIDSAVKSVQRFKRVADQTKDVELNADDKKLQKTIKVAKKSLDAFSNK 67
T+EA A + + K+++ + + +K + A +
Sbjct: 249 KTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADL-------E 301

Query: 68 KVKAKLDASIQDLQQKVLESNFELDKLNSKEVTPEIKLQKQKLT--KDIAEA-----EAK 120
L+A+ Q L++ L++ S+E +++ + QKL I+EA
Sbjct: 302 HQSQVLNANRQSLRRD-LDA--------SREAKKQLEAEHQKLEEQNKISEASRQSLRRD 352

Query: 121 L--SELEKKRVNIDVNADNSKFNRVLKVSKASLEALNRS-----KAKAIIDVDNGVANSK 173
L S KK+ + A++ K K+S+AS ++L R +AK ++ ANSK
Sbjct: 353 LDASREAKKQ----LEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQVEKALEEANSK 408

Query: 174 IKRTKEELK 182
+ ++ K
Sbjct: 409 LAALEKLNK 417



Score = 49.7 bits (118), Expect = 5e-08
Identities = 36/134 (26%), Positives = 64/134 (47%), Gaps = 23/134 (17%)

Query: 7 KATIEASVAKFKRQIDSAVKSVQRFKRVADQTKDV--ELNADDKKLQKTIKVAKKSLDAF 64
KA +EA A + Q + Q +R D +++ +L A+ +KL++ K+++ S
Sbjct: 290 KAALEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASR--- 346

Query: 65 SNKKVKAKLDASIQDLQQKVLESNFELDKLNSKEVTPEIKLQ------------KQKLTK 112
+ ++ LDAS + +K LE+ E KL + E Q K+++ K
Sbjct: 347 --QSLRRDLDASRE--AKKQLEA--EHQKLEEQNKISEASRQSLRRDLDASREAKKQVEK 400

Query: 113 DIAEAEAKLSELEK 126
+ EA +KL+ LEK
Sbjct: 401 ALEEANSKLAALEK 414



Score = 49.7 bits (118), Expect = 6e-08
Identities = 27/193 (13%), Positives = 71/193 (36%), Gaps = 23/193 (11%)

Query: 11 EASVAKFKRQIDSAVKSVQRFKRVADQTKDVELNADDKKLQKTIKVAKKSLDAFSNKKVK 70
A + + + + K ++ + IK + A ++
Sbjct: 210 SAKIKTLEAEKAAL----AARKADLEKALE-GAMNFSTADSAKIKTLEAEKAALEARQ-- 262

Query: 71 AKLDASIQDLQQKVLESNFELDKLNSKEVTPEIKLQKQKLTKDIAEAEAK--LSELEKKR 128
A+L+ +++ + ++ L +++ E + + + A + +L+ R
Sbjct: 263 AELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDASR 322

Query: 129 VNI-DVNADNSKFNRVLKVSKASLEALNRS-----KAKAIIDVDNGVANSKIKRTKEELK 182
+ A++ K K+S+AS ++L R +AK ++ ++ ++ +E+
Sbjct: 323 EAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQLE-------AEHQKLEEQ-N 374

Query: 183 SIPNKTRSRLDVD 195
I +R L D
Sbjct: 375 KISEASRQSLRRD 387



Score = 41.2 bits (96), Expect = 2e-05
Identities = 25/183 (13%), Positives = 49/183 (26%), Gaps = 16/183 (8%)

Query: 18 KRQIDSAVKSVQRFKRVADQTKDV--ELNADDKKLQKTIKVAKKSLDAFSNKKVKAKLDA 75
K D + + K + E + ++L+ +K+L+ N A
Sbjct: 84 KDHNDELTEELSNAKEKLRKNDKSLSEKASKIQELEARKADLEKALEGAMNFS--TADSA 141

Query: 76 SIQDL---QQKVLESN--FELDKLNSKEVTPEIKLQKQKLTKDIAEAEAKLSELEKK--- 127
I+ L + + E + + + + L + A EA+ +ELEK
Sbjct: 142 KIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEG 201

Query: 128 --RVNIDVNADNSKFNRVLKVSKASLEALN--RSKAKAIIDVDNGVANSKIKRTKEELKS 183
+ +A A L A D+ +
Sbjct: 202 AMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEAR 261

Query: 184 IPN 186

Sbjct: 262 QAE 264


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS01315  YERSSTKINASE  27     0.012    Yersinia serine/threonine protein kinase signature.
		>YERSSTKINASE#Yersinia serine/threonine protein kinase signature.

Length = 732

Score = 27.0 bits (59), Expect = 0.012
Identities = 12/29 (41%), Positives = 17/29 (58%)

Query: 61 RIKESISYPVSHVLINGIRYKIIDTRIYR 89
RI + PV + I G RY+IID ++ R
Sbjct: 22 RISQHWQNPVGELNIGGKRYRIIDNQVLR 50


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS01335  IGASERPTASE   27     0.048    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 27.3 bits (60), Expect = 0.048
Identities = 30/156 (19%), Positives = 53/156 (33%), Gaps = 8/156 (5%)

Query: 49 QQKKVDEILERRVAHEKKKADEYAKEKAEEAAKEAAKLAKMNKDQKDEYERKQLEKELEQ 108
Q +V + + + E A + EE AK + + + KQ + E Q
Sbjct: 1081 QTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQ 1140

Query: 109 LRSEKQLNEMRSEARKMLSEAEVDSSDEVVNLVVTDTAEQTKLNVEA--FSNAVKKAVNE 166
++E + K ++D A++T NVE + N
Sbjct: 1141 PQAEPARENDPTVNIKEPQSQTNTTADT------EQPAKETSSNVEQPVTESTTVNTGNS 1194

Query: 167 AVKVNARQSPLTGGDSFNHSSKNKPQNLAEIARQKR 202
V+ +P T + N S NKP+N + +
Sbjct: 1195 VVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSV 1230


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS01415  PF06580       26     0.029    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 26.4 bits (58), Expect = 0.029
Identities = 13/77 (16%), Positives = 26/77 (33%), Gaps = 9/77 (11%)

Query: 26 LTPGMVAKRVRGGWALLEALHAPYGMRLAEYKEIVLARIMQREAREREI----ARQRRKE 81
L + K V L ++ + + + + ++ EI +E
Sbjct: 101 LLAFINTKPVAFTLPLALSIIFNVVVVTFMWSLLYFGWHFFKNYKQAEIDQWKMASMAQE 160

Query: 82 AELR----KKKPH-LFN 93
A+L + PH +FN
Sbjct: 161 AQLMALKAQINPHFMFN 177


4     CH52_RS01640  CH52_RS01780  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
CH52_RS01640  0       14       -3.096584  GNAT family N-acetyltransferase
CH52_RS01645  3       16       -3.039602  LysE/ArgO family amino acid transporter
CH52_RS01650  1       16       -2.763262  phosphoglycerate mutase family protein
CH52_RS01655  1       17       -4.685340  hypothetical protein
CH52_RS01660  1       18       -4.268912  sterile alpha motif-like domain-containing
CH52_RS01665  2       20       -3.348420  hypothetical protein
CH52_RS01670  2       20       -1.426726  hypothetical protein
CH52_RS01675  1       18       -1.667261  hypothetical protein
CH52_RS01680  2       16       -1.545871  hypothetical protein
CH52_RS01685  1       15       -2.095242  hypothetical protein
CH52_RS01690  4       15       1.418273   cold-shock protein
CH52_RS01695  4       14       1.260732   thermonuclease family protein
CH52_RS01700  4       15       0.800258   hypothetical protein
CH52_RS01705  4       16       0.930411   extracellular matrix protein-binding adhesin
CH52_RS01710  7       15       0.546326   von Willebrand factor binding protein Vwb
CH52_RS01715  7       18       1.958078   MSCRAMM family adhesin clumping factor ClfA
CH52_RS01720  5       17       -4.576641  N-acetyltransferase
CH52_RS01725  6       16       -4.571379  hypothetical protein
CH52_RS01730  3       12       -1.457852  hypothetical protein
CH52_RS01735  1       12       -1.034056  DUF5067 domain-containing protein
CH52_RS15080  -2      14       -0.264671  hypothetical protein
CH52_RS01745  -1      14       -0.522598  hypothetical protein
CH52_RS01755  -1      16       0.911654   SsrA-binding protein SmpB
CH52_RS01760  0       18       1.335559   ribonuclease R
CH52_RS01765  1       27       1.953762   carboxylesterase
CH52_RS01770  1       31       2.085909   preprotein translocase subunit SecG
CH52_RS01775  2       35       2.679869   hypothetical protein
CH52_RS01780  2       30       3.163870   phosphopyruvate hydratase
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS01640  SACTRNSFRASE  31     0.001    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 31.1 bits (70), Expect = 0.001
Identities = 19/91 (20%), Positives = 34/91 (37%), Gaps = 6/91 (6%)

Query: 53 IVFGCYENETLIATAALEQI--RYVGKEHKSLIKYNFVTNNDKSINSELINFIINYARQN 110
F Y I + Y E ++ K K + + L++ I +A++N
Sbjct: 66 AAFLYYLENNCIGRIKIRSNWNGYALIEDIAVAK----DYRKKGVGTALLHKAIEWAKEN 121

Query: 111 NYESLLTSIVSNNIGAKVFYSALGFDILGFE 141
++ L+ NI A FY+ F I +
Sbjct: 122 HFCGLMLETQDINISACHFYAKHHFIIGAVD 152


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS01675  PF05704       28     0.035    Capsular polysaccharide synthesis protein
		>PF05704#Capsular polysaccharide synthesis protein

Length = 307

Score = 27.5 bits (61), Expect = 0.035
Identities = 13/69 (18%), Positives = 24/69 (34%), Gaps = 7/69 (10%)

Query: 116 EWVKKNYENTNHRYLVTLNLNSK-------KFTYCTKIIYQAYKFGVSEKSVKSYGLHII 168
W + Y N + +++ N + + YK + +Y HI
Sbjct: 239 YWKEIPYVNNVNPHMLQYLGNLPYDNSMFNYIKSTSPVQKLTYKLDYNNLKRNTYYDHIF 298

Query: 169 SPYAIKDNF 177
S +KDN+
Sbjct: 299 SIDKLKDNY 307


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS01710  IGASERPTASE   31     0.016    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 30.8 bits (69), Expect = 0.016
Identities = 28/162 (17%), Positives = 52/162 (32%), Gaps = 4/162 (2%)

Query: 261 ALKLKADTEAAKNDVSKRSKRSLNTQNNKST-TQEISEEQKAEYQRKSEALKERFINRQK 319
+KA+T+ + S + T K T T E E+ K E ++ E K
Sbjct: 1073 KSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVT-SQVSP 1131

Query: 320 SKNESVVSLIDDEDDNENDRQLVVSAPSKKPTTPTTYTETTTQVPMPTVERQTQQQIVYK 379
+ +S E END + + P + T + + + T+ V
Sbjct: 1132 KQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNT 1191

Query: 380 TPKPLAGLNGESHDFTTTHQSPTTSNHTHNNVVEFEETSALP 421
+ N E+ TT + + + ++P
Sbjct: 1192 GNSVVE--NPENTTPATTQPTVNSESSNKPKNRHRRSVRSVP 1231


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS01715  ICENUCLEATIN  42     1e-05    Ice nucleation protein signature.
		>ICENUCLEATIN#Ice nucleation protein signature.

Length = 1258

Score = 42.4 bits (99), Expect = 1e-05
Identities = 72/369 (19%), Positives = 131/369 (35%), Gaps = 6/369 (1%)

Query: 561 SDSDPGSDSGSDSNSDSGSDSGSDSTSDSGSDSASDSDSASDSDSASDSDSASDSDSASD 620
S G+DS + S +G +ST +G S + SD + S + DS+
Sbjct: 294 STGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLI 353

Query: 621 SDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSDSDSDSDSD 680
+ S + DS + S + SD + S + +DS+ + S + +
Sbjct: 354 AGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEE 413

Query: 681 SDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSD 740
S + S + SD + S + DS + S + DS + S
Sbjct: 414 STQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQT 473

Query: 741 SDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSD 800
+ SD + S S + +S + S + S + S + ++SD +
Sbjct: 474 AQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTAGYGSTQTAQNESDLITGYG 533

Query: 801 SDSDSDSDSD------SDSDSDSDSDSASDSDSDSDSESDSDSDSDSDSDSDSDSDSDSD 854
S S + ++S S + +S + S + SD + S + SDS
Sbjct: 534 STSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQTAREGSDLTAGYGSTGTAGSDSSII 593

Query: 855 SESDSDSDSDSDSESDSDSDSDSDSDSASDSDSGSDSDSSSDSDSDSTSDTGSDNDSDSD 914
+ S + S + S + S +G S S++ +DS + GS + +
Sbjct: 594 AGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYN 653

Query: 915 SNSDSESGS 923
S + GS
Sbjct: 654 SILTAGYGS 662



Score = 42.4 bits (99), Expect = 1e-05
Identities = 74/379 (19%), Positives = 135/379 (35%), Gaps = 6/379 (1%)

Query: 552 GEIEPIPEDSDSDPGSDSGSDSNSDSGSDSGSDSTSDSGSDSASDSDSASDSDSASDSDS 611
G + E+S G S + S +G ST +G DS+ + S + DS
Sbjct: 309 GSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSL 368

Query: 612 ASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDS 671
+ S + SD + S + +DS+ + S + +S + S +
Sbjct: 369 TAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQK 428

Query: 672 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 731
SD + S + DS + S + DS + S + SD + S S
Sbjct: 429 GSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTS 488

Query: 732 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSD----- 786
+ +S + S + S + S + ++SD + S S + ++S
Sbjct: 489 TAGYESSLIAGYGSTQTAGYGSTLTAGYGSTQTAQNESDLITGYGSTSTAGANSSLIAGY 548

Query: 787 -SDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSASDSDSDSDSESDSDSDSDSDSDS 845
S + +S + S + SD + S + SDS + S + S
Sbjct: 549 GSTQTASYNSVLTAGYGSTQTAREGSDLTAGYGSTGTAGSDSSIIAGYGSTQTASYHSSL 608

Query: 846 DSDSDSDSDSESDSDSDSDSDSESDSDSDSDSDSDSASDSDSGSDSDSSSDSDSDSTSDT 905
+ S + S + S S + +DS + S +G +S ++ S T+
Sbjct: 609 TAGYGSTQTAREQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQE 668

Query: 906 GSDNDSDSDSNSDSESGSN 924
GSD + S S + + S+
Sbjct: 669 GSDLTAGYGSTSTAGADSS 687



Score = 41.7 bits (97), Expect = 2e-05
Identities = 72/379 (18%), Positives = 132/379 (34%), Gaps = 6/379 (1%)

Query: 552 GEIEPIPEDSDSDPGSDSGSDSNSDSGSDSGSDSTSDSGSDSASDSDSASDSDSASDSDS 611
G + EDS G S + S +G ST +G+DS+ + S + +S
Sbjct: 261 GSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQ 320

Query: 612 ASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDS 671
+ S + SD + S + DS+ + S + DS + S +
Sbjct: 321 TAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQK 380

Query: 672 DSDSDSDSDSDSDSDSDSD------SDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 725
SD + S + +DS S + +S + S + SD + S
Sbjct: 381 GSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTG 440

Query: 726 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 785
+ DS + S + DS + S + SD + S S + +S +
Sbjct: 441 TAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAGY 500

Query: 786 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSASDSDSDSDSESDSDSDSDSDSDS 845
S + S + S + ++SD + S S + ++S + S + +S
Sbjct: 501 GSTQTAGYGSTLTAGYGSTQTAQNESDLITGYGSTSTAGANSSLIAGYGSTQTASYNSVL 560

Query: 846 DSDSDSDSDSESDSDSDSDSDSESDSDSDSDSDSDSASDSDSGSDSDSSSDSDSDSTSDT 905
+ S + SD + S + SDS + S + S ++ S T+
Sbjct: 561 TAGYGSTQTAREGSDLTAGYGSTGTAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTARE 620

Query: 906 GSDNDSDSDSNSDSESGSN 924
S + S S + + S+
Sbjct: 621 QSVLTTGYGSTSTAGADSS 639



Score = 41.7 bits (97), Expect = 2e-05
Identities = 72/369 (19%), Positives = 132/369 (35%), Gaps = 6/369 (1%)

Query: 561 SDSDPGSDSGSDSNSDSGSDSGSDSTSDSGSDSASDSDSASDSDSASDSDSASDSDSASD 620
S G DS + S +G DS+ +G S + SD + S + +DS+
Sbjct: 342 STGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLI 401

Query: 621 SDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSDSDSDSDSD 680
+ S + +S + S + SD + S + DS+ + S + D
Sbjct: 402 AGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGED 461

Query: 681 SDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSD 740
S + S + SD + S S + +S + S + S + S
Sbjct: 462 SSLTAGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTAGYGSTQT 521

Query: 741 SDSDSDSDSDSDSDSDSDSDSD------SDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSD 794
+ ++SD + S S + ++S S + +S + S + SD +
Sbjct: 522 AQNESDLITGYGSTSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQTAREGSDLTAGYG 581

Query: 795 SDSDSDSDSDSDSDSDSDSDSDSDSDSASDSDSDSDSESDSDSDSDSDSDSDSDSDSDSD 854
S + SDS + S + S + S + S + S S + +DS
Sbjct: 582 STGTAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGADSSLI 641

Query: 855 SESDSDSDSDSDSESDSDSDSDSDSDSASDSDSGSDSDSSSDSDSDSTSDTGSDNDSDSD 914
+ S + +S + S + SD +G S S++ +DS + GS + +
Sbjct: 642 AGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGADSSLIAGYGSTQTAGYN 701

Query: 915 SNSDSESGS 923
S + GS
Sbjct: 702 SILTAGYGS 710



Score = 40.9 bits (95), Expect = 3e-05
Identities = 76/379 (20%), Positives = 138/379 (36%), Gaps = 6/379 (1%)

Query: 552 GEIEPIPEDSDSDPGSDSGSDSNSDSGSDSGSDSTSDSGSDSA------SDSDSASDSDS 605
G + EDS G S + S +G ST +G+DS+ S + +S
Sbjct: 357 GSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQ 416

Query: 606 ASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDS 665
+ S + SD + S + DS+ + S + DS + S +
Sbjct: 417 TAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQK 476

Query: 666 ASDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 725
SD + S S + +S + S + S + S + ++SD + S S
Sbjct: 477 GSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTAGYGSTQTAQNESDLITGYGSTS 536

Query: 726 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 785
+ ++S + S + +S + S + SD + S + SDS +
Sbjct: 537 TAGANSSLIAGYGSTQTASYNSVLTAGYGSTQTAREGSDLTAGYGSTGTAGSDSSIIAGY 596

Query: 786 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSASDSDSDSDSESDSDSDSDSDSDS 845
S + S + S + S + S S + +DS + S + +S
Sbjct: 597 GSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYNSIL 656

Query: 846 DSDSDSDSDSESDSDSDSDSDSESDSDSDSDSDSDSASDSDSGSDSDSSSDSDSDSTSDT 905
+ S ++ SD + S S + +DS + S +G +S ++ S T+
Sbjct: 657 TAGYGSTQTAQEGSDLTAGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQE 716

Query: 906 GSDNDSDSDSNSDSESGSN 924
GSD S S S + + S+
Sbjct: 717 GSDLTSGYGSTSTAGADSS 735



Score = 40.5 bits (94), Expect = 3e-05
Identities = 74/372 (19%), Positives = 133/372 (35%), Gaps = 2/372 (0%)

Query: 552 GEIEPIPEDSDSDPGSDSGSDSNSDSGSDSGSDSTSDSGSDSASDSDSASDSDSASDSDS 611
G + + SD G S + DS +G ST +G DS+ + S + SD
Sbjct: 229 GSTQTGMKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDL 288

Query: 612 ASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDS 671
+ S + + S + S + +S + S + SD + S +
Sbjct: 289 TAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGD 348

Query: 672 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 731
DS + S + DS + S + SD + S + +DS + S
Sbjct: 349 DSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQ 408

Query: 732 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 791
+ +S + S + SD + S + DS + S + DS +
Sbjct: 409 TAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGY 468

Query: 792 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSASDSDSDSDSESDSDSDSDSDSDSDSDSDS 851
S + SD + S S + +S + S + S + S + ++SD
Sbjct: 469 GSTQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTAGYGSTQTAQNESDL 528

Query: 852 DSDSESDSDSDSDSDSESDSDSDSDSDSDSASDSDSGSDSDSSSDSDSDSTSDTGSDNDS 911
+ S S + ++S + S + +S +G S ++ SD T+ GS +
Sbjct: 529 ITGYGSTSTAGANSSLIAGYGSTQTASYNSV--LTAGYGSTQTAREGSDLTAGYGSTGTA 586

Query: 912 DSDSNSDSESGS 923
SDS+ + GS
Sbjct: 587 GSDSSIIAGYGS 598



Score = 40.5 bits (94), Expect = 4e-05
Identities = 74/369 (20%), Positives = 135/369 (36%), Gaps = 6/369 (1%)

Query: 561 SDSDPGSDSGSDSNSDSGSDSGSDSTSDSGSDSASDSDSASDSDSASDSDSASDSDSA-- 618
S G+DS + S +G +ST +G S + SD + S + DS+
Sbjct: 390 STGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLI 449

Query: 619 ----SDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSDSD 674
S + DS + S + SD + S S + +S+ + S +
Sbjct: 450 AGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYG 509

Query: 675 SDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSD 734
S + S + ++SD + S S + ++S + S + +S + S
Sbjct: 510 STLTAGYGSTQTAQNESDLITGYGSTSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQT 569

Query: 735 SDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSD 794
+ SD + S + SDS + S + S + S + S +
Sbjct: 570 AREGSDLTAGYGSTGTAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYG 629

Query: 795 SDSDSDSDSDSDSDSDSDSDSDSDSDSASDSDSDSDSESDSDSDSDSDSDSDSDSDSDSD 854
S S + +DS + S + +S + S ++ SD + S S + +DS
Sbjct: 630 STSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGADSSLI 689

Query: 855 SESDSDSDSDSDSESDSDSDSDSDSDSASDSDSGSDSDSSSDSDSDSTSDTGSDNDSDSD 914
+ S + +S + S + SD SG S S++ +DS + GS +
Sbjct: 690 AGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTSGYGSTSTAGADSSLIAGYGSTQTASYH 749

Query: 915 SNSDSESGS 923
S+ + GS
Sbjct: 750 SSLTAGYGS 758



Score = 40.1 bits (93), Expect = 5e-05
Identities = 71/378 (18%), Positives = 130/378 (34%), Gaps = 4/378 (1%)

Query: 552 GEIEPIPEDSDSDPGSDSGSDSNSDSGSDSGSDSTSDSGSDSASDSDSASDSDSASDSDS 611
G E + S G S + +DS +G ST +G +S+ + S SD
Sbjct: 181 GSTETAGDSSTLIAGYGSTGTAGADSTLVAGYGSTQTAGEESSQMAGYGSTQTGMKGSDL 240

Query: 612 ASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDS 671
+ S + S + S + DS+ + S + SD + S + +
Sbjct: 241 TAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGA 300

Query: 672 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 731
DS + S + +S + S + SD + S + DS + S
Sbjct: 301 DSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQ 360

Query: 732 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 791
+ DS + S + SD + S + +DS + S + +S +
Sbjct: 361 TAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGY 420

Query: 792 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSASDSDSDSDSESDSDSDSDSDSDSDSDSDS 851
S + SD + S + DS + S + +S + S + SD
Sbjct: 421 GSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDL 480

Query: 852 DSDSESDSDSDSDSDSESDSDSDSDSDSDSASDSDSGSDSDSSSDSD----SDSTSDTGS 907
+ S S + +S + S + S + GS + ++SD STS G+
Sbjct: 481 TAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTAGYGSTQTAQNESDLITGYGSTSTAGA 540

Query: 908 DNDSDSDSNSDSESGSNN 925
++ + S + N+
Sbjct: 541 NSSLIAGYGSTQTASYNS 558



Score = 39.0 bits (90), Expect = 1e-04
Identities = 65/354 (18%), Positives = 119/354 (33%)

Query: 570 GSDSNSDSGSDSGSDSTSDSGSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDS 629
GS + S + S + +S + S + +DS + S + +S
Sbjct: 165 GSTLSGTHQSQLIAGYGSTETAGDSSTLIAGYGSTGTAGADSTLVAGYGSTQTAGEESSQ 224

Query: 630 ASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSDSDSDSDSDSDSDSDSDS 689
+ S SD + S + DS+ + S + DS + S +
Sbjct: 225 MAGYGSTQTGMKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQK 284

Query: 690 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 749
SD + S + +DS + S + +S + S + SD + S
Sbjct: 285 GSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTG 344

Query: 750 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 809
+ DS + S + DS + S + SD + S + +DS +
Sbjct: 345 TAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGY 404

Query: 810 DSDSDSDSDSDSASDSDSDSDSESDSDSDSDSDSDSDSDSDSDSDSESDSDSDSDSDSES 869
S + +S + S ++ SD + S + DS + S + DS
Sbjct: 405 GSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSL 464

Query: 870 DSDSDSDSDSDSASDSDSGSDSDSSSDSDSDSTSDTGSDNDSDSDSNSDSESGS 923
+ S + SD +G S S++ +S + GS + S + GS
Sbjct: 465 TAGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTAGYGS 518



Score = 38.2 bits (88), Expect = 2e-04
Identities = 68/346 (19%), Positives = 125/346 (36%)

Query: 552 GEIEPIPEDSDSDPGSDSGSDSNSDSGSDSGSDSTSDSGSDSASDSDSASDSDSASDSDS 611
G + E+S G S + S +G ST +G DS+ + S + DS
Sbjct: 405 GSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSL 464

Query: 612 ASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDS 671
+ S + SD + S S + +S+ + S + S + S + +
Sbjct: 465 TAGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTAGYGSTQTAQN 524

Query: 672 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 731
+SD + S S + ++S + S + +S + S + SD + S
Sbjct: 525 ESDLITGYGSTSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQTAREGSDLTAGYGSTG 584

Query: 732 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 791
+ SDS + S + S + S + S + S S + +DS +
Sbjct: 585 TAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGADSSLIAGY 644

Query: 792 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSASDSDSDSDSESDSDSDSDSDSDSDSDSDS 851
S + +S + S + SD + S S + ++S + S + +S
Sbjct: 645 GSTQTAGYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGADSSLIAGYGSTQTAGYNSIL 704

Query: 852 DSDSESDSDSDSDSDSESDSDSDSDSDSDSASDSDSGSDSDSSSDS 897
+ S + SD S S S + +DS+ + GS +S S
Sbjct: 705 TAGYGSTQTAQEGSDLTSGYGSTSTAGADSSLIAGYGSTQTASYHS 750



Score = 36.7 bits (84), Expect = 5e-04
Identities = 75/372 (20%), Positives = 135/372 (36%), Gaps = 2/372 (0%)

Query: 552 GEIEPIPEDSDSDPGSDSGSDSNSDSGSDSGSDSTSDSGSDSASDSDSASDSDSASDSDS 611
G + + SD G S + DS +G ST +G DS+ + S + SD
Sbjct: 421 GSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDL 480

Query: 612 ASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDS 671
+ S S + S + S + S + S + ++SD + S S + +
Sbjct: 481 TAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTAGYGSTQTAQNESDLITGYGSTSTAGA 540

Query: 672 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 731
+S + S + +S + S + SD + S + SDS + S
Sbjct: 541 NSSLIAGYGSTQTASYNSVLTAGYGSTQTAREGSDLTAGYGSTGTAGSDSSIIAGYGSTQ 600

Query: 732 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 791
+ S + S + S + S S + +DS + S + +S +
Sbjct: 601 TASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGY 660

Query: 792 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSASDSDSDSDSESDSDSDSDSDSDSDSDSDS 851
S + SD + S S + +DS + S + S + S + SD
Sbjct: 661 GSTQTAQEGSDLTAGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDL 720

Query: 852 DSDSESDSDSDSDSDSESDSDSDSDSDSDSASDSDSGSDSDSSSDSDSDSTSDTGSDNDS 911
S S S + +DS + S + S+ +G S ++ S T+ GS + +
Sbjct: 721 TSGYGSTSTAGADSSLIAGYGSTQTASYHSS--LTAGYGSTQTAREQSVLTTGYGSTSTA 778

Query: 912 DSDSNSDSESGS 923
+DS+ + GS
Sbjct: 779 GADSSLIAGYGS 790



Score = 36.7 bits (84), Expect = 5e-04
Identities = 70/375 (18%), Positives = 131/375 (34%), Gaps = 6/375 (1%)

Query: 560 DSDSDPGSDSGSDSNSDSGSD--SGSDSTSDSGSDSASDSDSASDSDSASDSDSASDSDS 617
S + GS + S +G ST +G+DS + S + +S + S
Sbjct: 171 THQSQLIAGYGSTETAGDSSTLIAGYGSTGTAGADSTLVAGYGSTQTAGEESSQMAGYGS 230

Query: 618 ASDSDSASDSDSASDSDSASDSDSA----SDSDSASDSDSASDSDSASDSDSASDSDSDS 673
SD + S + DS+ S + DS+ + S + SD +
Sbjct: 231 TQTGMKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTA 290

Query: 674 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 733
S + +DS + S + +S + S + SD + S + DS
Sbjct: 291 GYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDS 350

Query: 734 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 793
+ S + DS + S + SD + S + +DS + S +
Sbjct: 351 SLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTA 410

Query: 794 DSDSDSDSDSDSDSDSDSDSDSDSDSDSASDSDSDSDSESDSDSDSDSDSDSDSDSDSDS 853
+S + S + SD + S + DS + S + DS + S
Sbjct: 411 GEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGS 470

Query: 854 DSESDSDSDSDSDSESDSDSDSDSDSDSASDSDSGSDSDSSSDSDSDSTSDTGSDNDSDS 913
+ SD + S S + +S + S + S+ + ST +++D +
Sbjct: 471 TQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTAGYGSTQTAQNESDLIT 530

Query: 914 DSNSDSESGSNNNVV 928
S S +G+N++++
Sbjct: 531 GYGSTSTAGANSSLI 545



Score = 36.7 bits (84), Expect = 6e-04
Identities = 75/373 (20%), Positives = 135/373 (36%), Gaps = 2/373 (0%)

Query: 552 GEIEPIPEDSDSDPGSDSGSDSNSDSGSDSGSDSTSDSGSDSASDSDSASDSDSASDSDS 611
G + + SD G S S + +S +G ST +G S + S + ++SD
Sbjct: 469 GSTQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTAGYGSTQTAQNESDL 528

Query: 612 ASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDS 671
+ S S + + S + S + +S + S + SD + S + S
Sbjct: 529 ITGYGSTSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQTAREGSDLTAGYGSTGTAGS 588

Query: 672 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 731
DS + S + S + S + S + S S + +DS + S
Sbjct: 589 DSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGADSSLIAGYGSTQ 648

Query: 732 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 791
+ +S + S + SD + S S + +DS + S + +S +
Sbjct: 649 TAGYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGY 708

Query: 792 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSASDSDSDSDSESDSDSDSDSDSDSDSDSDS 851
S + SD S S S + +DS + S + S + S + S
Sbjct: 709 GSTQTAQEGSDLTSGYGSTSTAGADSSLIAGYGSTQTASYHSSLTAGYGSTQTAREQSVL 768

Query: 852 DSDSESDSDSDSDSDSESDSDSDSDSDSDSASDSDSGSDSDSSSDSDSDSTSDTGSDNDS 911
+ S S + +DS + S + S +G S ++ SD T+ GS + +
Sbjct: 769 TTGYGSTSTAGADSSLIAGYGSTQTAGYHSI--LTAGYGSTQTAQERSDLTTGYGSTSTA 826

Query: 912 DSDSNSDSESGSN 924
+DS+ + GS
Sbjct: 827 GADSSLIAGYGST 839



Score = 36.7 bits (84), Expect = 6e-04
Identities = 67/346 (19%), Positives = 124/346 (35%)

Query: 552 GEIEPIPEDSDSDPGSDSGSDSNSDSGSDSGSDSTSDSGSDSASDSDSASDSDSASDSDS 611
G + EDS G S + S +G STS +G +S+ + S + S
Sbjct: 453 GSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTL 512

Query: 612 ASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDS 671
+ S + + SD + S S + ++S+ + S ++ +S + S +
Sbjct: 513 TAGYGSTQTAQNESDLITGYGSTSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQTARE 572

Query: 672 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 731
SD + S + SDS + S + S + S + S + S S
Sbjct: 573 GSDLTAGYGSTGTAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTS 632

Query: 732 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 791
+ +DS + S + +S + S + SD + S S + +DS +
Sbjct: 633 TAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGADSSLIAGY 692

Query: 792 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSASDSDSDSDSESDSDSDSDSDSDSDSDSDS 851
S + +S + S + SD S S S + ++S + S + S
Sbjct: 693 GSTQTAGYNSILTAGYGSTQTAQEGSDLTSGYGSTSTAGADSSLIAGYGSTQTASYHSSL 752

Query: 852 DSDSESDSDSDSDSDSESDSDSDSDSDSDSASDSDSGSDSDSSSDS 897
+ S + S + S S + +DS+ + GS + S
Sbjct: 753 TAGYGSTQTAREQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYHS 798



Score = 33.2 bits (75), Expect = 0.007
Identities = 56/335 (16%), Positives = 107/335 (31%)

Query: 574 NSDSGSDSGSDSTSDSGSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDS 633
S +D + + + S ++ D D+ +S S +
Sbjct: 97 TKTSAMQFILHHRADYVACTEMQAGPGSPDVTSEVKVGNRSLPVTDDIDATIESGSTQPT 156

Query: 634 DSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSDSDSDSDSDSDSDSDSDSDSDS 693
+ + S S + S + +S + S + +DS + S
Sbjct: 157 QTIEIATYGSTLSGTHQSQLIAGYGSTETAGDSSTLIAGYGSTGTAGADSTLVAGYGSTQ 216

Query: 694 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 753
+ +S + S SD + S + DS + S + DS +
Sbjct: 217 TAGEESSQMAGYGSTQTGMKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGY 276

Query: 754 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 813
S + SD + S + +DS + S + +S + S + SD
Sbjct: 277 GSTQTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDL 336

Query: 814 DSDSDSDSASDSDSDSDSESDSDSDSDSDSDSDSDSDSDSDSESDSDSDSDSDSESDSDS 873
+ S + DS + S + DS + S ++ SD + S + +
Sbjct: 337 TAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGA 396

Query: 874 DSDSDSDSASDSDSGSDSDSSSDSDSDSTSDTGSD 908
DS + S +G +S ++ S T+ GSD
Sbjct: 397 DSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSD 431



Score = 33.2 bits (75), Expect = 0.007
Identities = 66/339 (19%), Positives = 125/339 (36%), Gaps = 2/339 (0%)

Query: 561 SDSDPGSDSGSDSNSDSGSD--SGSDSTSDSGSDSASDSDSASDSDSASDSDSASDSDSA 618
S + GS + + SD +G STS +G++S+ + S ++ +S + S
Sbjct: 508 YGSTLTAGYGSTQTAQNESDLITGYGSTSTAGANSSLIAGYGSTQTASYNSVLTAGYGST 567

Query: 619 SDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSDSDSDSD 678
+ SD + S + SDS+ + S ++ S + S + S +
Sbjct: 568 QTAREGSDLTAGYGSTGTAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTG 627

Query: 679 SDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSD 738
S S + +DS + S + +S + S + SD + S S + +DS
Sbjct: 628 YGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGADSS 687

Query: 739 SDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSD 798
+ S + +S + S + SD S S S + +DS + S +
Sbjct: 688 LIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTSGYGSTSTAGADSSLIAGYGSTQTAS 747

Query: 799 SDSDSDSDSDSDSDSDSDSDSDSASDSDSDSDSESDSDSDSDSDSDSDSDSDSDSDSESD 858
S + S + S + S S + ++S + S + S + S
Sbjct: 748 YHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYHSILTAGYGST 807

Query: 859 SDSDSDSDSESDSDSDSDSDSDSASDSDSGSDSDSSSDS 897
+ SD + S S + +DS+ + GS + +S
Sbjct: 808 QTAQERSDLTTGYGSTSTAGADSSLIAGYGSTQTAGYNS 846



Score = 32.4 bits (73), Expect = 0.010
Identities = 67/360 (18%), Positives = 131/360 (36%)

Query: 569 SGSDSNSDSGSDSGSDSTSDSGSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSD 628
+G S +G S + ++S + S S + ++S+ + S ++ +S +
Sbjct: 506 AGYGSTLTAGYGSTQTAQNESDLITGYGSTSTAGANSSLIAGYGSTQTASYNSVLTAGYG 565

Query: 629 SASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSDSDSDSDSDSDSDSDSD 688
S + SD + S + SDS+ + S ++ S + S + S
Sbjct: 566 STQTAREGSDLTAGYGSTGTAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLT 625

Query: 689 SDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSD 748
+ S S + +DS + S + +S + S + SD + S S + +D
Sbjct: 626 TGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGAD 685

Query: 749 SDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSD 808
S + S + +S + S + SD S S S + +DS + S
Sbjct: 686 SSLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTSGYGSTSTAGADSSLIAGYGSTQT 745

Query: 809 SDSDSDSDSDSDSASDSDSDSDSESDSDSDSDSDSDSDSDSDSDSDSESDSDSDSDSDSE 868
+ S + S + S + S S + +DS + S + S +
Sbjct: 746 ASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYHSILTAGYG 805

Query: 869 SDSDSDSDSDSDSASDSDSGSDSDSSSDSDSDSTSDTGSDNDSDSDSNSDSESGSNNNVV 928
S + SD + S S + +DSS + ST G ++ + S + N+++
Sbjct: 806 STQTAQERSDLTTGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLT 865



Score = 32.4 bits (73), Expect = 0.011
Identities = 74/372 (19%), Positives = 135/372 (36%), Gaps = 2/372 (0%)

Query: 552 GEIEPIPEDSDSDPGSDSGSDSNSDSGSDSGSDSTSDSGSDSASDSDSASDSDSASDSDS 611
G + +SD G S S + ++S +G ST + +S + S + SD
Sbjct: 517 GSTQTAQNESDLITGYGSTSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQTAREGSDL 576

Query: 612 ASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDS 671
+ S + S S + S + S+ + S + S + S S + +
Sbjct: 577 TAGYGSTGTAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGA 636

Query: 672 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 731
DS + S + +S + S + SD + S S + +DS + S
Sbjct: 637 DSSLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGADSSLIAGYGSTQ 696

Query: 732 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 791
+ +S + S + SD S S S + +DS + S + S +
Sbjct: 697 TAGYNSILTAGYGSTQTAQEGSDLTSGYGSTSTAGADSSLIAGYGSTQTASYHSSLTAGY 756

Query: 792 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSASDSDSDSDSESDSDSDSDSDSDSDSDSDS 851
S + S + S S + +DS + S + S + S + SD
Sbjct: 757 GSTQTAREQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYHSILTAGYGSTQTAQERSDL 816

Query: 852 DSDSESDSDSDSDSDSESDSDSDSDSDSDSASDSDSGSDSDSSSDSDSDSTSDTGSDNDS 911
+ S S + +DS + S + +S +G S ++ +SD T+ GS + +
Sbjct: 817 TTGYGSTSTAGADSSLIAGYGSTQTAGYNSI--LTAGYGSTQTAQENSDLTTGYGSTSTA 874

Query: 912 DSDSNSDSESGS 923
DS+ + GS
Sbjct: 875 GYDSSLIAGYGS 886



Score = 31.3 bits (70), Expect = 0.025
Identities = 77/387 (19%), Positives = 138/387 (35%), Gaps = 14/387 (3%)

Query: 552 GEIEPIPEDSDSDPGSDSGSDSNSDSGSDSGSDSTSDSGSDSA--------------SDS 597
G + E SD G S + SDS +G ST + S+ S
Sbjct: 565 GSTQTAREGSDLTAGYGSTGTAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVL 624

Query: 598 DSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDS 657
+ S S + +DS+ + S + +S + S + SD + S S + +
Sbjct: 625 TTGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGA 684

Query: 658 DSASDSDSASDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 717
DS+ + S + +S + S + SD S S S + +DS + S
Sbjct: 685 DSSLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTSGYGSTSTAGADSSLIAGYGSTQ 744

Query: 718 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 777
+ S + S + S + S S + +DS + S + S +
Sbjct: 745 TASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYHSILTAGY 804

Query: 778 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSASDSDSDSDSESDSDS 837
S + SD + S S + +DS + S + +S + S ++ +SD
Sbjct: 805 GSTQTAQERSDLTTGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDL 864

Query: 838 DSDSDSDSDSDSDSDSDSESDSDSDSDSDSESDSDSDSDSDSDSASDSDSGSDSDSSSDS 897
+ S S + DS + S + +S + S + SD +G S S++
Sbjct: 865 TTGYGSTSTAGYDSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGY 924

Query: 898 DSDSTSDTGSDNDSDSDSNSDSESGSN 924
+S + GS + S + GS+
Sbjct: 925 ESSLIAGYGSTQTASFKSTLMAGYGSS 951



Score = 30.9 bits (69), Expect = 0.032
Identities = 67/360 (18%), Positives = 128/360 (35%)

Query: 569 SGSDSNSDSGSDSGSDSTSDSGSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSD 628
+ +S +G S + S + S + SDS+ + S ++ S +
Sbjct: 554 ASYNSVLTAGYGSTQTAREGSDLTAGYGSTGTAGSDSSIIAGYGSTQTASYHSSLTAGYG 613

Query: 629 SASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSDSDSDSDSDSDSDSDSD 688
S + S + S S + +DS+ + S + +S + S + SD
Sbjct: 614 STQTAREQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLT 673

Query: 689 SDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSD 748
+ S S + +DS + S + +S + S + SD S S S + +D
Sbjct: 674 AGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTSGYGSTSTAGAD 733

Query: 749 SDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSD 808
S + S + S + S + S + S S + +DS + S
Sbjct: 734 SSLIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGADSSLIAGYGSTQT 793

Query: 809 SDSDSDSDSDSDSASDSDSDSDSESDSDSDSDSDSDSDSDSDSDSDSESDSDSDSDSDSE 868
+ S + S + SD + S S + +DS + S + +S +
Sbjct: 794 AGYHSILTAGYGSTQTAQERSDLTTGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYG 853

Query: 869 SDSDSDSDSDSDSASDSDSGSDSDSSSDSDSDSTSDTGSDNDSDSDSNSDSESGSNNNVV 928
S + +SD + S S + DSS + ST G ++ + S + N+++
Sbjct: 854 STQTAQENSDLTTGYGSTSTAGYDSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLT 913



Score = 30.9 bits (69), Expect = 0.035
Identities = 68/339 (20%), Positives = 125/339 (36%), Gaps = 2/339 (0%)

Query: 561 SDSDPGSDSGSDSNSDSGSD--SGSDSTSDSGSDSASDSDSASDSDSASDSDSASDSDSA 618
+S + GS + GSD +G ST +GSDS+ + S ++ S + S
Sbjct: 556 YNSVLTAGYGSTQTAREGSDLTAGYGSTGTAGSDSSIIAGYGSTQTASYHSSLTAGYGST 615

Query: 619 SDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSASDSDSDSDSDSD 678
+ S + S S + +DS+ + S + +S + S + SD +
Sbjct: 616 QTAREQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTAG 675

Query: 679 SDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSD 738
S S + +DS + S + +S + S + SD S S S + +DS
Sbjct: 676 YGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTSGYGSTSTAGADSS 735

Query: 739 SDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSD 798
+ S + S + S + S + S S + +DS + S +
Sbjct: 736 LIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGADSSLIAGYGSTQTAG 795

Query: 799 SDSDSDSDSDSDSDSDSDSDSDSASDSDSDSDSESDSDSDSDSDSDSDSDSDSDSDSESD 858
S + S + SD + S S + ++S + S + +S + S
Sbjct: 796 YHSILTAGYGSTQTAQERSDLTTGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGST 855

Query: 859 SDSDSDSDSESDSDSDSDSDSDSASDSDSGSDSDSSSDS 897
+ +SD + S S + DS+ + GS + +S
Sbjct: 856 QTAQENSDLTTGYGSTSTAGYDSSLIAGYGSTQTAGYNS 894


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits         Score  E-value  Comments
CH52_RS01720  ALARACEMASE  27     0.049    Alanine racemase signature.
		>ALARACEMASE#Alanine racemase signature.

Length = 356

Score = 26.7 bits (59), Expect = 0.049
Identities = 13/37 (35%), Positives = 19/37 (51%), Gaps = 2/37 (5%)

Query: 135 MYDIYP-PYDGIPDEAFLI-KELKVNSLAGKTGTINY 169
D+ P P GI L KE+K++ +A GT+ Y
Sbjct: 305 AVDLTPCPQAGIGTPVELWGKEIKIDDVAAAAGTVGY 341


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits        Score  E-value  Comments
CH52_RS01770  SECGEXPORT  44     1e-09    Protein-export SecG membrane protein signature.
		>SECGEXPORT#Protein-export SecG membrane protein signature.

Length = 110

Score = 44.2 bits (104), Expect = 1e-09
Identities = 23/76 (30%), Positives = 44/76 (57%), Gaps = 4/76 (5%)

Query: 1 MHTFLIVLLIIDCIALITVVLLQEGKSSGLSGAISGGAE-QLFGKQKQRGVDLFLNRLTI 59
M+ L+V+ +I I L+ +++LQ+GK + + + GA LFG G F+ R+T
Sbjct: 1 MYEALLVVFLIVAIGLVGLIMLQQGKGADMGASFGAGASATLFG---SSGSGNFMTRMTA 57

Query: 60 ILSILFFVLMICISYL 75
+L+ LFF++ + + +
Sbjct: 58 LLATLFFIISLVLGNI 73


5  CH52_RS02310  CH52_RS02400  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
CH52_RS02310  2       13       -2.020707  TIGR00730 family Rossman fold protein
CH52_RS02315  2       14       -3.083115  GNAT family N-acetyltransferase
CH52_RS02320  4       11       -3.014596  hypothetical protein
CH52_RS02325  1       12       -2.774596  hypothetical protein
CH52_RS02330  2       12       -2.835066  GNAT family N-acetyltransferase
CH52_RS02335  2       13       -2.973012  DUF1129 family protein
CH52_RS02340  1       13       -2.132489  DUF456 domain-containing protein
CH52_RS02345  0       14       -1.207840  sugar efflux transporter
CH52_RS02350  1       18       -2.167671  LysR family transcriptional regulator
CH52_RS02355  -1      16       -4.474258  DUF402 domain-containing protein
CH52_RS02360  0       16       -4.496660  hypothetical protein
CH52_RS02365  -1      12       -2.977947  cupin domain-containing protein
CH52_RS02370  0       9        -2.066903  YebC/PmpR family DNA-binding transcriptional
CH52_RS02375  0       10       -3.031606  HTH-type transcriptional regulator SarX
CH52_RS02380  -1      8        -3.227831  AraC family transcriptional regulator
CH52_RS02385  0       9        -1.808617  Bax inhibitor-1 family protein
CH52_RS02390  0       9        -1.705643  LysM peptidoglycan-binding domain-containing
CH52_RS02395  -3      8        -2.653755  inorganic phosphate transporter
CH52_RS02400  -3      11       -3.585160  DUF47 domain-containing protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS02330  SACTRNSFRASE  35     5e-05    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 34.9 bits (80), Expect = 5e-05
Identities = 22/102 (21%), Positives = 35/102 (34%), Gaps = 1/102 (0%)

Query: 42 EMICSRLEHTNDKIYIYENEGQLIAFIWGHFSNEKSMVNIELLYVEPQFRKLGIATQLKI 101
+M S +E ++Y E I I SN IE + V +RK G+ T L
Sbjct: 54 DMDVSYVEEEGKAAFLYYLENNCIGRIKIR-SNWNGYALIEDIAVAKDYRKKGVGTALLH 112

Query: 102 ALEKWAKTMNAKRISSTIHKNNLPMISLNKDLGYQVSHVKMY 143
+WAK + + N+ + + V
Sbjct: 113 KAIEWAKENHFCGLMLETQDINISACHFYAKHHFIIGAVDTM 154


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
CH52_RS02345  TCRTETA  57     6e-11    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 56.8 bits (137), Expect = 6e-11
Identities = 73/365 (20%), Positives = 134/365 (36%), Gaps = 41/365 (11%)

Query: 11 KNYKLFVA--NMFLLGMGIAVTVPYLVLFATKDLGMTTNQ---YGLLLASAAISQFTVNS 65
N L V + L +GI + +P L +DL + + YG+LLA A+ QF
Sbjct: 3 PNRPLIVILSTVALDAVGIGLIMPVLP-GLLRDLVHSNDVTAHYGILLALYALMQFACAP 61

Query: 66 IIARFSDTHHFNRKIIIILALLMGALGFSIYFFVDTIWLFILLYAIFQGLFAPAMPQLYA 125
++ SD F R+ +++++L A+ ++I +W+ + + I G+
Sbjct: 62 VLGALSD--RFGRRPVLLVSLAGAAVDYAIMATAPFLWV-LYIGRIVAGITGATGA---V 115

Query: 126 SARESINVSSSKDRAQFANTVLRSMFSLGFLFGPFIGAQLIGLKGYAGLFGGTISIILFT 185
+ +++ +RA+ + + F G + GP +G + G +A F L
Sbjct: 116 AGAYIADITDGDERARHFG-FMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNF 174

Query: 186 LVLQVFFYKDLNIKHPISTQQHVEKIAPNMFKDKTL--------LLPFIAFILLHIGQWM 237
L + ++ + + A N L + FI+ +GQ
Sbjct: 175 LTGCFLLPESHK-----GERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVP 229

Query: 238 YTMNMPLFVTDYLKENEQHVGYLASLCAGLEVPFMIIL-GVLSSRLHTRTLLIYGAIFGG 296
+ +F D + +G + L ++ G +++RL R L+ G I G
Sbjct: 230 AAL-WVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADG 288

Query: 297 LFYFSIGVFKNFYMMLAGQVFLAIFLAVLLGIGISYFQDILPDFPGYASTLFSNAMVIGQ 356
Y + +M F + L GIG +P S GQ
Sbjct: 289 TGYILLAFATRGWM-----AFPIMVLLASGGIG-------MPALQAMLSRQVDEERQ-GQ 335

Query: 357 LGGNL 361
L G+L
Sbjct: 336 LQGSL 340



Score = 48.7 bits (116), Expect = 2e-08
Identities = 44/186 (23%), Positives = 73/186 (39%), Gaps = 13/186 (6%)

Query: 215 MFKDKTLLLPFIAFILLHIGQWMYTMNMPLFVTDYLKENEQ--HVGYLASLCAGLEVPFM 272
M ++ L++ L +G + +P + D + N+ H G L +L A ++
Sbjct: 1 MKPNRPLIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACA 60

Query: 273 IILGVLSSRLHTRTLLIYGAIFGGLFYFSIGVFKNFYMMLAGQVFLAIFLAVLLGIGISY 332
+LG LS R R +L+ + Y + +++ G++ I A G +Y
Sbjct: 61 PVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAG-AY 119

Query: 333 FQDILPD-----FPGYASTLFSNAMVIGQLGGNLLGGAMSHWVGLENVFFVSAASIMLGM 387
DI G+ S F MV G + G L+GG H FF +AA L
Sbjct: 120 IADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPH-----APFFAAAALNGLNF 174

Query: 388 ILIFFT 393
+ F
Sbjct: 175 LTGCFL 180


6  CH52_RS02545  CH52_RS02640  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
CH52_RS02545  2       11       -2.034777  M50 family metallopeptidase
CH52_RS02550  2       12       -1.977001  metal-dependent transcriptional regulator
CH52_RS02555  1       15       -1.753774  metal ABC transporter ATP-binding protein
CH52_RS02560  1       15       -1.819357  metal ABC transporter permease
CH52_RS02565  0       16       -1.810571  metal ABC transporter substrate-binding protein
CH52_RS02570  -2      17       -2.349602  ISL3 family transposase
CH52_RS15095  -1      18       -1.568685  Na+/H+ antiporter Mnh2 subunit G
CH52_RS02575  0       17       -1.613941  Na+/H+ antiporter Mnh2 subunit F
CH52_RS02580  -1      14       -1.478009  Na+/H+ antiporter Mnh2 subunit E
CH52_RS02585  2       12       -1.683993  Na+/H+ antiporter Mnh2 subunit D
CH52_RS02590  1       12       -1.915274  Na+/H+ antiporter Mnh2 subunit C
CH52_RS02595  0       12       -1.432114  Na+/H+ antiporter Mnh2 subunit B
CH52_RS02600  1       15       -2.043154  Na+/H+ antiporter Mnh2 subunit A
CH52_RS02605  1       13       -2.092064  tyrosine-type recombinase/integrase
CH52_RS02610  -1      13       -2.516836  DUF1659 domain-containing protein
CH52_RS02615  -3      14       -3.492953  DUF2922 domain-containing protein
CH52_RS02620  -1      14       -4.353788  DMT family transporter
CH52_RS02625  0       14       -4.254113  global transcriptional regulator SarA
CH52_RS02630  -1      12       -3.762535  alpha/beta hydrolase
CH52_RS02635  -1      13       -3.876210  hypothetical protein
CH52_RS02640  1       14       -3.272612  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
CH52_RS02555  PF05272  30     0.008    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.4 bits (68), Expect = 0.008
Identities = 13/38 (34%), Positives = 20/38 (52%), Gaps = 2/38 (5%)

Query: 35 GPNGAGKSSLIKSLIGE--FNATGTKLLYNKPIQQQLQ 70
G G GKS+LI +L+G F+ T + K +Q+
Sbjct: 603 GTGGIGKSTLINTLVGLDFFSDTHFDIGTGKDSYEQIA 640


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits      Score  E-value  Comments
CH52_RS02565  adhesinb  354    e-125    Adhesin B signature.
		>adhesinb#Adhesin B signature.

Length = 310

Score = 354 bits (910), Expect = e-125
Identities = 148/308 (48%), Positives = 216/308 (70%), Gaps = 6/308 (1%)

Query: 1 MKKL--VPLLLALLLLVAACGTGGKQSSDKSNGKLKVVTTNSILYDMAKNVGGDNVDIHS 58
MKK + LLL + +AAC + K S++ + KL VV TNSI+ D+ KN+ GD +++HS
Sbjct: 1 MKKCRFLVLLLLAFVGLAACSSQ-KSSTETGSSKLNVVATNSIIADITKNIAGDKINLHS 59

Query: 59 IVPVGQDPHEYEVKPKDIKKLTDADVILYNGLNLETG-NGWFEKALEQAGKSLKDKKVIA 117
IVPVGQDPHEYE P+D+KK + AD+I YNG+NLETG N WF K +E A K ++K A
Sbjct: 60 IVPVGQDPHEYEPLPEDVKKTSQADLIFYNGINLETGGNAWFTKLVENA-KKKENKDYYA 118

Query: 118 VSKDVKPIYLNGEEGNKDKQDPHAWLSLDNGIKYVKTIQQTFIDNDKKHKADYEKQGNKY 177
VS+ V IYL G+ K K+DPHAWL+L+NGI Y + I + + D +K YEK Y
Sbjct: 119 VSEGVDVIYLEGQS-EKGKEDPHAWLNLENGIIYAQNIAKRLSEKDPANKETYEKNLKAY 177

Query: 178 IAQLEKLNNDSKDKFNDIPKEQRAMITSEGAFKYFSKQYGITPGYIWEINTEKQGTPEQM 237
+ +L L+ ++K+KFN+IP E++ ++TSEG FKYFSK Y + YIWEINTE++GTP+Q+
Sbjct: 178 VEKLSALDKEAKEKFNNIPGEKKMIVTSEGCFKYFSKAYNVPSAYIWEINTEEEGTPDQI 237

Query: 238 RQAIEFVKKHKLKHLLVETSVDKKAMESLSEETKKDIFGEVYTDSIGKEGTKGDSYYKMM 297
+ +E ++K K+ L VE+SVD + M+++S++T I+ +++TDS+ ++G +GDSYY MM
Sbjct: 238 KTLVEKLRKTKVPSLFVESSVDDRPMKTVSKDTNIPIYAKIFTDSVAEKGEEGDSYYSMM 297

Query: 298 KSNIETVH 305
K N+E +
Sbjct: 298 KYNLEKIA 305


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
CH52_RS02585  TCRTETB  27     0.012    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 27.2 bits (60), Expect = 0.012
Identities = 16/73 (21%), Positives = 38/73 (52%)

Query: 5 ITHIMIISSLIIFGIALIICLFRLIKGPTTADRVVTFDTTSAVVMSIVGVLSVLMGTVSF 64
I H + S L++ + II + L+K R+ +++ VG++ ++ T S+
Sbjct: 162 IAHYIHWSYLLLIPMITIITVPFLMKLLKKEVRIKGHFDIKGIILMSVGIVFFMLFTTSY 221

Query: 65 LDSIMLIAIISFV 77
S ++++++SF+
Sbjct: 222 SISFLIVSVLSFL 234


7  CH52_RS03580  CH52_RS03775  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
CH52_RS03580  2       17       2.837962   carboxylesterase
CH52_RS03585  3       17       3.143359   phosphatase PAP2 family protein
CH52_RS14360  2       18       3.368140   hypothetical protein
CH52_RS03595  2       19       4.051205   DUF2294 domain-containing protein
CH52_RS03600  3       19       4.204012   YbcC family protein
CH52_RS03605  1       17       2.627543   NADH dehydrogenase subunit 5
CH52_RS14850  2       22       0.681295   phenol-soluble modulin PSM-alpha-1
CH52_RS14855  1       18       0.303663   phenol-soluble modulin PSM-alpha-2
CH52_RS14365  3       14       -0.493848  phenol-soluble modulin PSM-alpha-3
CH52_RS14860  4       14       -1.296074  phenol-soluble modulin PSM-alpha-4
CH52_RS03615  3       14       -1.533146  GTP-binding protein
CH52_RS03620  9       18       -3.274870  hypothetical protein
CH52_RS03625  11      19       -3.427523  hypothetical protein
CH52_RS03630  11      19       -3.483123  hypothetical protein
CH52_RS03635  12      20       -3.332368  membrane protein
CH52_RS03640  11      20       -3.391588  tandem-type lipoprotein
CH52_RS03645  13      21       -3.596220  tandem-type lipoprotein
CH52_RS03650  12      21       -3.594054  tandem-type lipoprotein
CH52_RS03655  13      22       -3.833732  tandem-type lipoprotein
CH52_RS03660  13      20       -4.062487  tandem-type lipoprotein
CH52_RS03665  10      18       -3.780874  tandem-type lipoprotein
CH52_RS03670  10      13       -3.248289  tandem-type lipoprotein
CH52_RS03675  8       12       -3.195653  tandem-type lipoprotein
CH52_RS03680  8       14       -3.429692  tandem-type lipoprotein
CH52_RS03685  2       9        -1.047999  tandem-type lipoprotein Lpl1
CH52_RS03690  2       10       -0.460886  myeloperoxidase inhibitor SPIN
CH52_RS03695  3       10       -0.986276  FKLRK protein
CH52_RS03700  0       14       -1.149537  superantigen-like protein SSL11
CH52_RS03705  1       15       -1.348614  restriction endonuclease subunit S
CH52_RS03710  1       15       -0.826505  type I restriction-modification system subunit
CH52_RS03715  1       17       -3.397902  hypothetical protein
CH52_RS03720  2       16       -2.900242  superantigen-like protein SSL10
CH52_RS03725  3       17       -2.733709  superantigen-like protein SSL9
CH52_RS03730  2       17       -1.579017  superantigen-like protein SSL8
CH52_RS03735  0       16       -1.436624  superantigen-like protein SSL7
CH52_RS03745  1       18       -1.345084  superantigen-like protein SSL5
CH52_RS03750  -2      14       -0.751505  superantigen-like protein SSL4
CH52_RS03755  -4      16       -0.930011  hypothetical protein
CH52_RS03760  -3      16       -1.006008  superantigen-like protein SSL3
CH52_RS03765  -1      15       -2.411405  superantigen-like protein SSL2
CH52_RS03770  -1      18       -2.769344  superantigen-like protein SSL1
CH52_RS03775  1       18       -3.384304  SDR family oxidoreductase
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits      Score  E-value  Comments
CH52_RS03625  adhesinb  27     0.014    Adhesin B signature.
		>adhesinb#Adhesin B signature.

Length = 310

Score = 27.1 bits (60), Expect = 0.014
Identities = 21/94 (22%), Positives = 40/94 (42%), Gaps = 14/94 (14%)

Query: 14 DISTTVETLNLISKMEAQKENIRSVIAPEHKHKYKDIENGLKGEE---KVLIEQMAQHCE 70
+S V+ + L + E KE+ H + ++ENG+ + K L E+ + E
Sbjct: 118 AVSEGVDVIYLEGQSEKGKED---------PHAWLNLENGIIYAQNIAKRLSEKDPANKE 168

Query: 71 AFKANFKGAAQ--GDWVKSAMSEIDSIKDDLKKI 102
++ N K + K A + ++I + K I
Sbjct: 169 TYEKNLKAYVEKLSALDKEAKEKFNNIPGEKKMI 202


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS03650  BCTERIALGSPC  35     3e-04    Bacterial general secretion pathway protein C signa...
		>BCTERIALGSPC#Bacterial general secretion pathway protein C signature.

Length = 272

Score = 34.6 bits (79), Expect = 3e-04
Identities = 18/83 (21%), Positives = 34/83 (40%), Gaps = 9/83 (10%)

Query: 181 INENVPSYDAKFKMSNKDENVKQLRSRYNIPTDKAPVLKMHIDGDLKGSSVGYKKLEIDF 240
+NE VP Y+AK D V Q + RY + + + S G +++
Sbjct: 124 VNEEVPGYNAKIVSIRPDRVVLQYQGRYEV---------LGLYSQEDSGSDGVPGAQVNE 174

Query: 241 SKGEKSDLSVIDSLNFQPAKVDE 263
+++ ++ D ++F P D
Sbjct: 175 QLQQRASTTMSDYVSFSPIMNDN 197


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS03665  BCTERIALGSPC  31     0.002    Bacterial general secretion pathway protein C signa...
		>BCTERIALGSPC#Bacterial general secretion pathway protein C signature.

Length = 272

Score = 31.5 bits (71), Expect = 0.002
Identities = 17/83 (20%), Positives = 34/83 (40%), Gaps = 9/83 (10%)

Query: 177 INSNVPSYDAKFKMSNKDENVKQLRSRYNIPTDKAPILKMHIDGDLKGSSVGYKKLEIDF 236
+N VP Y+AK D V Q + RY + + + S G +++
Sbjct: 124 VNEEVPGYNAKIVSIRPDRVVLQYQGRYEV---------LGLYSQEDSGSDGVPGAQVNE 174

Query: 237 SKEENSELSIVDSLNFQPAKNKD 259
++ + ++ D ++F P N +
Sbjct: 175 QLQQRASTTMSDYVSFSPIMNDN 197


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS03700  TOXICSSTOXIN  108    4e-31    Staphylococcal toxic shock syndrome toxin signature.
		>TOXICSSTOXIN#Staphylococcal toxic shock syndrome toxin signature.

Length = 234

Score = 108 bits (270), Expect = 4e-31
Identities = 47/225 (20%), Positives = 86/225 (38%), Gaps = 19/225 (8%)

Query: 16 LTTGMITTTAQPVKASTLEVRSQAT-------QDLSEYYKGRGFELTNVTGYKYG-NKVT 67
L T PV S+ ++ A +DL ++Y TN +
Sbjct: 15 LLLATTATDFTPVPLSSNQIIKTAKASTNDNIKDLLDWYSSGSDTFTNSEVLDNSLGSMR 74

Query: 68 FIDNSQQIDVTLTGNE----KLTVKDDDEVSNVDVFVVREGSDKSAITTSIGGITKTNGT 123
+ I + + + T + +++ + S+ + I I G+T T
Sbjct: 75 IKNTDGSISLIIFPSPYYSPAFTKGEKVDLNTKRTKKSQHTSEGTYIHFQISGVTNTE-- 132

Query: 124 QHKDTVQNVNLSVSKSTGQHTTSVTSEYYSIYKEEISLKELDFKLRKHLIDKHDLYKTEP 183
T + L V K G+ S K+++++ LDF++R L H LY++
Sbjct: 133 -KLPTPIELPLKV-KVHGK--DSPLKYGPKFDKKQLAISTLDFEIRHQLTQIHGLYRSSD 188

Query: 184 KDSKI-RITMKNGGYYTFELNKKLQPHRMGDTIDSRNIEKIEVNL 227
K +ITM +G Y +L+KK + + I+ I+ IE +
Sbjct: 189 KTGGYWKITMNDGSTYQSDLSKKFEYNTEKPPINIDEIKTIEAEI 233


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS03720  TOXICSSTOXIN  193    4e-64    Staphylococcal toxic shock syndrome toxin signature.
		>TOXICSSTOXIN#Staphylococcal toxic shock syndrome toxin signature.

Length = 234

Score = 193 bits (491), Expect = 4e-64
Identities = 51/202 (25%), Positives = 92/202 (45%), Gaps = 10/202 (4%)

Query: 31 KQNQKSVNKHDKEALYRYYTGKTMEMKNISALKHGKNNLRFKFRGIKIQVLLPGNDKSKF 90
K + S N + K+ L Y +G + N L + ++R K I +++ +
Sbjct: 36 KTAKASTNDNIKDLLDWYSSG-SDTFTNSEVLDNSLGSMRIKNTDGSISLIIFPSPYYSP 94

Query: 91 QQRSYEGLDVFFVQEKRDKHD-----IFYTVGGVIQNNKTSGVVSAPILNISKEKGEDAF 145
E +D+ + K+ +H I + + GV K + P L + K G+D+
Sbjct: 95 AFTKGEKVDLNTKRTKKSQHTSEGTYIHFQISGVTNTEKLPTPIELP-LKV-KVHGKDSP 152

Query: 146 VKGYPYYIKKEKITLKELDYKLRKHLIEKYGLYKTISKDGRV-KISLKDGSFYNLDLRSK 204
+K Y K+++ + LD+++R L + +GLY++ K G KI++ DGS Y DL K
Sbjct: 153 LK-YGPKFDKKQLAISTLDFEIRHQLTQIHGLYRSSDKTGGYWKITMNDGSTYQSDLSKK 211

Query: 205 LKFKYMGEVIESKQIKDIEVNL 226
++ I +IK IE +
Sbjct: 212 FEYNTEKPPINIDEIKTIEAEI 233


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS03725  TOXICSSTOXIN  130    1e-39    Staphylococcal toxic shock syndrome toxin signature.
		>TOXICSSTOXIN#Staphylococcal toxic shock syndrome toxin signature.

Length = 234

Score = 130 bits (329), Expect = 1e-39
Identities = 39/197 (19%), Positives = 69/197 (35%), Gaps = 15/197 (7%)

Query: 43 INMLHQYYSEESFESTNISVKSEDYYGSNVLNFNQRNKTFKVFLLGDDKNKY------KE 96
I L +YS S TN V + + + + + K
Sbjct: 46 IKDLLDWYSSGSDTFTNSEVLD---NSLGSMRIKNTDGSISLIIFPSPYYSPAFTKGEKV 102

Query: 97 KTHGLDVFAVPELIDIKGGIYSVGGITKKNVRSVFGFVSNPSLQVKKVDAKHGFSINELF 156
+ + + + G+T + P L+VK F
Sbjct: 103 DLNTKRTKKSQHTSEGTYIHFQISGVTNTEKLP--TPIELP-LKVKVHGKDSPLKYGPKF 159

Query: 157 FIQKEEVSLKELDFKIRKMLVEKYRLYKGAS-DKGRIVINMKDEKKYVIDLSEKLSFDRM 215
K+++++ LDF+IR L + + LY+ + G I M D Y DLS+K ++
Sbjct: 160 --DKKQLAISTLDFEIRHQLTQIHGLYRSSDKTGGYWKITMNDGSTYQSDLSKKFEYNTE 217

Query: 216 FDVMDSKQIKNIEVNLN 232
++ +IK IE +N
Sbjct: 218 KPPINIDEIKTIEAEIN 234


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS03730  TOXICSSTOXIN  125    2e-37    Staphylococcal toxic shock syndrome toxin signature.
		>TOXICSSTOXIN#Staphylococcal toxic shock syndrome toxin signature.

Length = 234

Score = 125 bits (314), Expect = 2e-37
Identities = 47/199 (23%), Positives = 73/199 (36%), Gaps = 19/199 (9%)

Query: 42 DTNKLHQYYSGPSYELTNV--------SGQSQGYYDSNVLLFNQQNQKFQVFLLGKDENK 93
+ L +YS S TN S + + S L+ F G+
Sbjct: 45 NIKDLLDWYSSGSDTFTNSEVLDNSLGSMRIKNTDGSISLIIFPSPYYSPAFTKGE---- 100

Query: 94 YKEKTHGLDVFAVPELVDLDGRIFSVSGVTKKNVKSIFESLRTPNLLVKKIDDKDGFSID 153
K + + F +SGVT L L K+ KD +
Sbjct: 101 -KVDLNTKRTKKSQHTSEGTYIHFQISGVTNTEKLPTPIELP----LKVKVHGKDSP-LK 154

Query: 154 EFFFIQKEEVSLKELDFKIRKLLIKKYKLYEGSA-DKGRIVINMKDENKYEIDLSDKLDF 212
K+++++ LDF+IR L + + LY S G I M D + Y+ DLS K ++
Sbjct: 155 YGPKFDKKQLAISTLDFEIRHQLTQIHGLYRSSDKTGGYWKITMNDGSTYQSDLSKKFEY 214

Query: 213 ERMADVINSEQIKNIEVNL 231
IN ++IK IE +
Sbjct: 215 NTEKPPINIDEIKTIEAEI 233


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS03735  TOXICSSTOXIN  192    1e-63    Staphylococcal toxic shock syndrome toxin signature.
		>TOXICSSTOXIN#Staphylococcal toxic shock syndrome toxin signature.

Length = 234

Score = 192 bits (488), Expect = 1e-63
Identities = 49/197 (24%), Positives = 82/197 (41%), Gaps = 16/197 (8%)

Query: 42 DIKDLYRYYSSESFEFSNI--------SGKVENYNGSNVVRFNQEKQNHQLFLLGKDKDK 93
+IKDL +YSS S F+N S +++N +GS + F G+
Sbjct: 45 NIKDLLDWYSSGSDTFTNSEVLDNSLGSMRIKNTDGSISLIIFPSPYYSPAFTKGE---- 100

Query: 94 YKKGLEGQNVFVVKELIDPNGRLSTVGGVTKKNNKSSETNTHLFVNKVYGGNLDASIDSF 153
K L + + + + GVT + L V KV+G +
Sbjct: 101 -KVDLNTKRTKKSQHTSEGTYIHFQISGVTNTEKLPTPIELPLKV-KVHGKDSPLKYGP- 157

Query: 154 LINKEEVSLKELDFKIRKQLVEKYGLYKGTTKYGKI-TINLKDEKKEVIDLGDKLQFERM 212
+K+++++ LDF+IR QL + +GLY+ + K G I + D DL K ++
Sbjct: 158 KFDKKQLAISTLDFEIRHQLTQIHGLYRSSDKTGGYWKITMNDGSTYQSDLSKKFEYNTE 217

Query: 213 GDVLNSKDIQNIAVTIN 229
+N +I+ I IN
Sbjct: 218 KPPINIDEIKTIEAEIN 234


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS03745  TOXICSSTOXIN  135    2e-41    Staphylococcal toxic shock syndrome toxin signature.
		>TOXICSSTOXIN#Staphylococcal toxic shock syndrome toxin signature.

Length = 234

Score = 135 bits (340), Expect = 2e-41
Identities = 49/201 (24%), Positives = 73/201 (36%), Gaps = 14/201 (6%)

Query: 39 NVTKDIFDLRDYYSGASKELKNVTGYRYSKGGKHYLIFDKNRKFTRVQIFGKDIERFKAR 98
+ +I DL D+YS S N S G + + IF
Sbjct: 41 STNDNIKDLLDWYSSGSDTFTNSEVLDNSLGS---MRIKNTDGSISLIIFPSPYYSPAFT 97

Query: 99 KNPGLDI-----FVVKEAENRNGTVFSYGGVTKKNQDAYYDYINAPRFQIKRDEGDGIAT 153
K +D+ + F GVT + I P +K D
Sbjct: 98 KGEKVDLNTKRTKKSQHTSEGTYIHFQISGVTNTEKLP--TPIELPLK-VKVHGKDSPLK 154

Query: 154 YGRVHYIYKEEISLKELDFKLRQYLIQNFDLYKKFPKDSKI-KVIMKDGGYYTFELNKKL 212
YG K+++++ LDF++R L Q LY+ K K+ M DG Y +L+KK
Sbjct: 155 YG--PKFDKKQLAISTLDFEIRHQLTQIHGLYRSSDKTGGYWKITMNDGSTYQSDLSKKF 212

Query: 213 QTNRMSDVIDGRNIEKIEANI 233
+ N I+ I+ IEA I
Sbjct: 213 EYNTEKPPINIDEIKTIEAEI 233


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS03750  TOXICSSTOXIN  101    8e-28    Staphylococcal toxic shock syndrome toxin signature.
		>TOXICSSTOXIN#Staphylococcal toxic shock syndrome toxin signature.

Length = 234

Score = 101 bits (252), Expect = 8e-28
Identities = 45/216 (20%), Positives = 81/216 (37%), Gaps = 13/216 (6%)

Query: 82 TKVETPQSPTTKQVPTEINPKFKDLRAYYTKPSLEFKNEIGIILKKWTTIRFMNIVPDYF 141
T V + K N KDL +Y+ S F N ++ ++R N
Sbjct: 25 TPVPLSSNQIIKTAKASTNDNIKDLLDWYSSGSDTFTN-SEVLDNSLGSMRIKNTDGSI- 82

Query: 142 IYKIALVGKDDKKYDEGVHRNVDVFVVLEEKNKYGVE----RYSVGGITKSNSKKVDHKA 197
+ + VD+ +K+++ E + + G+T + +
Sbjct: 83 --SLIIFPSPYYSPAFTKGEKVDLNTKRTKKSQHTSEGTYIHFQISGVTNTEKLPTPIEL 140

Query: 198 GVRITKEDNKGTISHDVSEFKITKEQISLKELDFKLRKQLIENHNLYGNV--GSGKIVIN 255
+++ + + K K+Q+++ LDF++R QL + H LY + G I
Sbjct: 141 PLKVKVHGKDSPLKYG---PKFDKKQLAISTLDFEIRHQLTQIHGLYRSSDKTGGYWKIT 197

Query: 256 MKNGGKYTFELHKKLQENRMADVIDGTNIDNIEVNI 291
M +G Y +L KK + N I+ I IE I
Sbjct: 198 MNDGSTYQSDLSKKFEYNTEKPPINIDEIKTIEAEI 233


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS03760  TOXICSSTOXIN  94     2e-24    Staphylococcal toxic shock syndrome toxin signature.
		>TOXICSSTOXIN#Staphylococcal toxic shock syndrome toxin signature.

Length = 234

Score = 93.6 bits (232), Expect = 2e-24
Identities = 44/223 (19%), Positives = 78/223 (34%), Gaps = 21/223 (9%)

Query: 140 TPQPMQSTKSDTPQSPTIKQAQTDMTPKYEDLRAYYTKPSFEFEKQFGFLLKPWTTVRFM 199
TP P+ S + IK A+ +DL +Y+ S F L ++R
Sbjct: 25 TPVPLSSNQ-------IIKTAKASTNDNIKDLLDWYSSGSDTF-TNSEVLDNSLGSMRIK 76

Query: 200 NVIPNRFIYKIALVGKDEKKYKDGPYDNIDV-----FIVLEDNKYQLKKYSVGGITKTNS 254
N + + + + +D+ ++ + + G+T T
Sbjct: 77 NTDGSI---SLIIFPSPYYSPAFTKGEKVDLNTKRTKKSQHTSEGTYIHFQISGVTNTEK 133

Query: 255 KKVDHKAELSVTKKDNQGMISRDVSEYMITKEEISLKELDFKLRKQLIEKHNLYGNM--G 312
+ L V + K+++++ LDF++R QL + H LY +
Sbjct: 134 LPTPIELPLKVKVHGKDSPLKYG---PKFDKKQLAISTLDFEIRHQLTQIHGLYRSSDKT 190

Query: 313 SGTIVIKMKNGGKYTFELHKKLQEHRMADVIEGTNIDKIEVNI 355
G I M +G Y +L KK + + I I IE I
Sbjct: 191 GGYWKITMNDGSTYQSDLSKKFEYNTEKPPINIDEIKTIEAEI 233


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS03765  TOXICSSTOXIN  87     5e-23    Staphylococcal toxic shock syndrome toxin signature.
		>TOXICSSTOXIN#Staphylococcal toxic shock syndrome toxin signature.

Length = 234

Score = 87.0 bits (215), Expect = 5e-23
Identities = 39/203 (19%), Positives = 77/203 (37%), Gaps = 22/203 (10%)

Query: 37 ISENSKKLKAYYTQPSIEYKNVTGYISFIQPSIKFMNIIDGNSVNNLALIGKDKQHYHTG 96
++N K L +Y+ S + N + S+ M I + + +L +
Sbjct: 42 TNDNIKDLLDWYSSGSDTFTN----SEVLDNSLGSMRIKNTDGSISLIIFPSPYYSPAFT 97

Query: 97 VHRNLNIFYVN-----EDKRFEGAKYSIGGITSANDKA--VDLIAEARVIKADHIGEYDY 149
+++ + I G+T+ ++L + +V D +Y
Sbjct: 98 KGEKVDLNTKRTKKSQHTSEGTYIHFQISGVTNTEKLPTPIELPLKVKVHGKDSPLKYGP 157

Query: 150 DFFPFKIDKEAMSLKEIDFKLRKYLIDNYGLYGEMST----GKITVKKKYYGKYTFELDK 205
F DK+ +++ +DF++R L +GLY KIT+ Y +L K
Sbjct: 158 KF-----DKKQLAISTLDFEIRHQLTQIHGLYRSSDKTGGYWKITMNDG--STYQSDLSK 210

Query: 206 KLQEDRMSDVINVTDIDRIEIKV 228
K + + IN+ +I IE ++
Sbjct: 211 KFEYNTEKPPINIDEIKTIEAEI 233


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS03770  TOXICSSTOXIN  95     3e-26    Staphylococcal toxic shock syndrome toxin signature.
		>TOXICSSTOXIN#Staphylococcal toxic shock syndrome toxin signature.

Length = 234

Score = 95.5 bits (237), Expect = 3e-26
Identities = 46/214 (21%), Positives = 84/214 (39%), Gaps = 11/214 (5%)

Query: 18 TGVITSNVQSVQAKTEVKQQSESELKHYYNKPVLERKNVTGYKYTEKGKDYIDVIVDNQY 77
T V S+ Q ++ + +L +Y+ N + + + + +
Sbjct: 25 TPVPLSSNQIIKTAKASTNDNIKDLLDWYSSGSDTFTNS---EVLDNSLGSMRIKNTDGS 81

Query: 78 SQISLVGSDKDKFKDGDNSNIDVFILREGDSRQATN-----YSIGGVTKTNSQPFIDYIH 132
+ + S +D+ R S+ + + I GVT T P I
Sbjct: 82 ISLIIFPSPYYSPAFTKGEKVDLNTKRTKKSQHTSEGTYIHFQISGVTNTEKLP--TPIE 139

Query: 133 TPILEIKKGKEEPQSSLYQIYKEDISLKELDYRLRERAIKQHGLYSNGLKQGQI-TITMK 191
P+ GK+ P + K+ +++ LD+ +R + + HGLY + K G ITM
Sbjct: 140 LPLKVKVHGKDSPLKYGPKFDKKQLAISTLDFEIRHQLTQIHGLYRSSDKTGGYWKITMN 199

Query: 192 DGKSHTIDLSQKLEKERMGDSIDGRQIQKILVEM 225
DG ++ DLS+K E I+ +I+ I E+
Sbjct: 200 DGSTYQSDLSKKFEYNTEKPPINIDEIKTIEAEI 233


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS03775  NUCEPIMERASE  30     0.009    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 30.1 bits (68), Expect = 0.009
Identities = 28/167 (16%), Positives = 61/167 (36%), Gaps = 32/167 (19%)

Query: 1 MNIMLTGATGHLGTHITNQAIANHIDHFHIGVRNVEKVPD----------DWRGKVSVRQ 50
M ++TGA G +G H++ + + H +G+ N+ D + +
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEA--GHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHK 58

Query: 51 LDYFNQESMVEAFK--GMDTVVFI-------PSIIHP-SFKRIPEV--ENLVYAAKQSGV 98
+D ++E M + F + V S+ +P ++ N++ + + +
Sbjct: 59 IDLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKI 118

Query: 99 AHIIFIG---YYADQHNNPFHMS-----PYFGYASRLLSTSGIDYTY 137
H+++ Y PF P YA+ + + +TY
Sbjct: 119 QHLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTY 165


8  CH52_RS03820  CH52_RS14380  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
CH52_RS03820  3       20       2.541477   xanthine phosphoribosyltransferase
CH52_RS03825  1       19       2.825085   general stress protein
CH52_RS03830  2       20       3.255176   hypothetical protein
CH52_RS03835  2       20       4.427907   hypothetical protein
CH52_RS03840  1       21       4.258644   hypothetical protein
CH52_RS03845  1       20       3.479163   L-cystine transporter
CH52_RS03850  -1      22       2.134956   NADPH-dependent oxidoreductase
CH52_RS03855  0       21       1.241308   alkyl hydroperoxide reductase subunit C
CH52_RS03860  0       20       0.430632   alkyl hydroperoxide reductase subunit F
CH52_RS03865  -1      13       -1.933025  NDxxF motif lipoprotein
CH52_RS03870  -1      13       -2.330965  hypothetical protein
CH52_RS03875  1       16       -2.204741  phosphoglycerate mutase family protein
CH52_RS03880  0       15       -2.084652  hypothetical protein
CH52_RS03885  2       15       -2.586407  GlsB/YeaQ/YmgE family stress response membrane
CH52_RS03890  2       18       -1.690687  helix-turn-helix domain-containing protein
CH52_RS03895  2       18       -3.127810  PepSY domain-containing protein
CH52_RS03900  1       19       -2.565812  YxeA family protein
CH52_RS03905  1       18       -1.460222  staphylococcal enterotoxin-like toxin X
CH52_RS03910  2       21       -0.882839  Abi family protein
CH52_RS14380  2       22       -0.356442  30S ribosomal protein S18
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS03900  adhesinb      32     0.001    Adhesin B signature.
		>adhesinb#Adhesin B signature.

Length = 310

Score = 31.7 bits (72), Expect = 0.001
Identities = 34/166 (20%), Positives = 59/166 (35%), Gaps = 18/166 (10%)

Query: 2 KLKSLAVLSMSAVVLTACGNDTPKDETKSTESNTNQDTNTTKDV---IALKDVKTS---- 54
K + L +L ++ V L AC + ET S++ N + D+ IA +
Sbjct: 3 KCRFLVLLLLAFVGLAACSSQKSSTETGSSKLNVVATNSIIADITKNIAGDKINLHSIVP 62

Query: 55 ----PEDAVKKAEETYKGQKLK-----GISFENSNGEWAYKVTQQ-KSGEESEVLVADKN 104
P + E+ K + GI+ E W K+ + K E + +
Sbjct: 63 VGQDPHEYEPLPEDVKKTSQADLIFYNGINLETGGNAWFTKLVENAKKKENKDYYAVSEG 122

Query: 105 KKVINKKTEKE-DTVNENDNFKYSDAIDYKKAIKEGQKEFDGDIKE 149
VI + + E + + + I Y + I + E D KE
Sbjct: 123 VDVIYLEGQSEKGKEDPHAWLNLENGIIYAQNIAKRLSEKDPANKE 168


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS03905  PF04335       27     0.011    VirB8 type IV secretion protein
		>PF04335#VirB8 type IV secretion protein

Length = 227

Score = 27.5 bits (61), Expect = 0.011
Identities = 14/63 (22%), Positives = 21/63 (33%), Gaps = 9/63 (14%)

Query: 9 TYLALTIIGAYAALFILKTIDSHGITDQFNPLVKEDDSYVKTTEVSTRMDDQLRSYTQSA 68
LA + A AAL LKT++ P V D ++ ++ A
Sbjct: 43 GALATAGVVAVAALTPLKTVE---------PYVITVDRNTGEASIAAKLHGDATITYDEA 93

Query: 69 FNK 71
K
Sbjct: 94 VRK 96


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS03910  TOXICSSTOXIN  48     4e-09    Staphylococcal toxic shock syndrome toxin signature.
		>TOXICSSTOXIN#Staphylococcal toxic shock syndrome toxin signature.

Length = 234

Score = 48.1 bits (114), Expect = 4e-09
Identities = 30/117 (25%), Positives = 48/117 (41%), Gaps = 12/117 (10%)

Query: 74 TINGKSNKSRNWVYSERPLNENQVRIHLEGTYTVADRVYTPKRNITLNKEVVTLKELDHI 133
I+G +N + E PL V++H + + Y PK +K+ + + LD
Sbjct: 124 QISGVTNTEKLPTPIELPLK---VKVHGKD----SPLKYGPK----FDKKQLAISTLDFE 172

Query: 134 IRFAHIS-YGLYMGEHLPKGNIVINTKDGGKYTLESHKELQKDRENVKINTADIKNV 189
IR +GLY G I DG Y + K+ + + E IN +IK +
Sbjct: 173 IRHQLTQIHGLYRSSDKTGGYWKITMNDGSTYQSDLSKKFEYNTEKPPINIDEIKTI 229


9  CH52_RS04135  CH52_RS04170  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
CH52_RS04135  1       16       3.828654   protein-ADP-ribose hydrolase
CH52_RS04140  1       12       3.258641   glycine cleavage system protein H
CH52_RS04145  -1      11       3.686570   LLM class flavin-dependent oxidoreductase
CH52_RS04150  -1      11       3.497660   hypothetical protein
CH52_RS04160  -1      13       3.280853   NADH-dependent flavin oxidoreductase
CH52_RS04165  0       14       2.881724   alpha/beta hydrolase
CH52_RS04170  -2      13       3.022965   YSIRK domain-containing triacylglycerol lipase
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS04170  GPOSANCHOR    47     1e-07    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 47.4 bits (112), Expect = 1e-07
Identities = 41/309 (13%), Positives = 85/309 (27%), Gaps = 12/309 (3%)

Query: 1 MLRGQEERKYSIRKYSIGVVSVLAATMFVVSSHEAQASEKTPTSNAAAQKETLNQPGEQG 60
M + R YS+RK G SV A + + +E + + +
Sbjct: 1 MTKNNTNRHYSLRKLKTGTASVAVALTVLGAGLVVNTNEVSAVATRSQTDTLEKVQERAD 60

Query: 61 NAITSHQMQSGKQLDDMHKENGKSGTVTEGKDTLQSSKHQSTQNSKTIRTQ---NDNQVK 117
+ K D E + L ++K + +N K++ +
Sbjct: 61 KFEIENNTLKLKNSDLSFNNKALKDHNDELTEELSNAKEKLRKNDKSLSEKASKIQELEA 120

Query: 118 QDSERQGSKQSHQN------NATNNTERQNDQVQNTHHAERNGSQSTTSQSNDVDKSQPS 171
+ ++ + + + N E + + + + S +
Sbjct: 121 RKADLEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKT 180

Query: 172 IPAQKVLPNHDKAAPTSTTPPSNDKTAPKSTKAQDATTDKHPNQQDTHQPAHQIIDAKQD 231
+ A+K +A + + + S K + +K + A
Sbjct: 181 LEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNF 240

Query: 232 DTVRQSEQKPQVGDLSKHIDGQNSPEKPTDKNTDNKQLIKDALQAPKTRSTTNAAADAKK 291
T ++ K + + Q EK KT AA +A+K
Sbjct: 241 STADSAKIKTLEAEKAALEARQAELEK---ALEGAMNFSTADSAKIKTLEAEKAALEAEK 297

Query: 292 VRPLKANQV 300
+QV
Sbjct: 298 ADLEHQSQV 306


10  CH52_RS04250  CH52_RS04305  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
CH52_RS04250  2       14       -2.924256  formate/nitrite transporter family protein
CH52_RS04255  9       18       -3.876522  DUF4064 domain-containing protein
CH52_RS04260  11      18       -4.170977  DUF4467 domain-containing protein
CH52_RS04265  9       21       -3.988725  TIGR01741 family protein
CH52_RS04270  7       22       -4.438300  TIGR01741 family protein
CH52_RS04275  4       18       -4.984686  TIGR01741 family protein
CH52_RS04280  6       21       -4.671292  TIGR01741 family protein
CH52_RS04285  3       18       -2.193635  TIGR01741 family protein
CH52_RS04290  2       19       -1.536931  hypothetical protein
CH52_RS04295  2       18       -1.731172  DUF5080 family protein
CH52_RS04300  2       17       -0.726186  DUF5079 family protein
CH52_RS04305  3       18       0.029662   TIGR01741 family protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS04260  ECOLIPORIN    27     0.027    E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 26.8 bits (59), Expect = 0.027
Identities = 11/32 (34%), Positives = 21/32 (65%)

Query: 1 MKRILVVFLMLAIILAGCSNKGEKYQKDIDKV 32
MKR ++ ++ A++ AG ++ E Y KD +K+
Sbjct: 1 MKRKVLALVIPALLAAGAAHAAEIYNKDGNKL 32


11  CH52_RS04565  CH52_RS04640  Y  N  N  Genomic Island
LocusTag      DNBias  CDNBias  %GCBias    Product
CH52_RS04565  1       15       3.602914   nucleoside hydrolase
CH52_RS04570  1       15       3.618648   glucose-specific PTS transporter subunit IIBC
CH52_RS04575  0       18       2.944347   L-lactate dehydrogenase
CH52_RS04585  -2      13       0.090437   hypothetical protein
CH52_RS04590  -1      12       -0.558585  FAD-binding oxidoreductase
CH52_RS04595  -2      12       0.599175   hypothetical protein
CH52_RS04600  -2      10       1.241751   DUF488 domain-containing protein
CH52_RS04605  -2      11       2.188505   ABC transporter substrate-binding protein
CH52_RS04610  -2      12       2.647875   PrsW family intramembrane metalloprotease
CH52_RS04615  -1      14       4.463536   acyl CoA:acetate/3-ketoacid CoA transferase
CH52_RS04620  0       16       5.004279   acyl--CoA ligase
CH52_RS04625  1       17       5.410945   acyl-CoA dehydrogenase family protein
CH52_RS04630  1       20       5.911368   3-hydroxyacyl-CoA dehydrogenase/enoyl-CoA
CH52_RS04635  -3      15       3.960498   thiolase family protein
CH52_RS04640  -3      16       3.370389   hypothetical protein
12  CH52_RS15120  CH52_RS04935  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
CH52_RS15120  5       22       -7.559608  ABC transporter ATP-binding protein
CH52_RS04785  3       20       -5.833928  stage II sporulation protein M
CH52_RS04790  2       19       -5.107870  hypothetical protein
CH52_RS04795  1       20       -4.770035  putative cyclic bacteriocin
CH52_RS04800  0       15       -1.554356  ABC transporter permease
CH52_RS04805  0       14       -2.008806  ABC transporter ATP-binding protein
CH52_RS04810  1       16       -1.463649  efflux RND transporter periplasmic adaptor
CH52_RS04815  -2      15       -0.217986  hypothetical protein
CH52_RS04820  -1      14       0.868971   type I restriction endonuclease subunit R
CH52_RS04825  -1      15       1.426930   hypothetical protein
CH52_RS04830  0       13       3.003430   MurR/RpiR family transcriptional regulator
CH52_RS04835  0       13       3.326790   PTS transporter subunit EIIC
CH52_RS04840  1       12       3.575262   N-acetylmuramic acid 6-phosphate etherase
CH52_RS04845  0       14       3.242332   DUF871 domain-containing protein
CH52_RS04850  -1      16       2.314222   glucose-specific PTS transporter subunit IIBC
CH52_RS04855  -1      17       2.688146   hypothetical protein
CH52_RS04860  -1      13       1.997420   alpha-keto acid decarboxylase family protein
CH52_RS04865  -1      14       2.407585   isochorismatase family protein
CH52_RS04870  0       15       2.065119   branched-chain amino acid transport system II
CH52_RS04875  2       15       2.144089   ornithine--oxo-acid transaminase
CH52_RS04880  2       14       2.567582   N-acetyl-gamma-glutamyl-phosphate reductase
CH52_RS04885  3       13       1.880780   bifunctional glutamate
CH52_RS04890  2       14       1.616083   acetylglutamate kinase
CH52_RS04895  2       13       1.316000   YagU family protein
CH52_RS04900  2       14       1.244805   4'-phosphopantetheinyl transferase superfamily
CH52_RS04905  2       15       1.296375   non-ribosomal peptide synthetase
CH52_RS04915  -2      11       1.299690   MFS transporter
CH52_RS04920  -1      12       1.948374   NAD-dependent formate dehydrogenase
CH52_RS04925  -1      14       1.859640   DUF2294 domain-containing protein
CH52_RS04930  0       14       1.912696   acyl-CoA/acyl-ACP dehydrogenase
CH52_RS04935  3       15       0.865829   ABC transporter permease
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS04835  DNABINDINGHU  30     0.002    Prokaryotic integration host factor signature.
		>DNABINDINGHU#Prokaryotic integration host factor signature.

Length = 91

Score = 30.4 bits (69), Expect = 0.002
Identities = 16/75 (21%), Positives = 28/75 (37%), Gaps = 15/75 (20%)

Query: 86 ELIENESVETLKNKMIARATNTMRFVATNIMDAQIDAICDVLKNARTIFLFGFGASSLTI 145
+LI +A AT + + +DA A+ L + L GFG +
Sbjct: 6 DLIA----------KVAEATELTKKDSAAAVDAVFSAVSSYLAKGEKVQLIGFGNFEVR- 54

Query: 146 GDLFQKLSRIGLNVR 160
++ +R G N +
Sbjct: 55 ----ERAARKGRNPQ 65


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS04870  ISCHRISMTASE  60     4e-13    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 59.6 bits (144), Expect = 4e-13
Identities = 31/99 (31%), Positives = 51/99 (51%)

Query: 86 LDKRDDDFVIDKRHFSAFVGTDLDLQLRRRGIDTIVLGGVATHIGVDTTARDAYQLNYNQ 145
L DDD V+ K +SAF T+L +R+ G D +++ G+ HIG TA +A+ +
Sbjct: 112 LAPEDDDLVLTKWRYSAFKRTNLLEMMRKEGRDQLIITGIYAHIGCLVTACEAFMEDIKA 171

Query: 146 FFVTDMMSAQNETLHQFPIDNVFPLMGQTITTNDFLNIL 184
FFV D ++ + HQ ++ T+ T+ L+ L
Sbjct: 172 FFVGDAVADFSLEKHQMALEYAAGRCAFTVMTDSLLDQL 210


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS04895  CARBMTKINASE  32     0.002    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 31.7 bits (72), Expect = 0.002
Identities = 23/84 (27%), Positives = 41/84 (48%), Gaps = 7/84 (8%)

Query: 155 INADTLAYFIASSLKAPIYV-LSNIAGVLIN-----DVVIPQLPLVDIHQYIEHGD-IYG 207
I+ D +A + A I++ L+++ G + + + ++ + ++ +Y E G G
Sbjct: 213 IDKDLAGEKLAEEVNADIFMILTDVNGAALYYGTEKEQWLREVKVEELRKYYEEGHFKAG 272

Query: 208 GMIPKVLDAKNAIENGCPKVIIAS 231
M PKVL A IE G + IIA
Sbjct: 273 SMGPKVLAAIRFIEWGGERAIIAH 296


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS04905  ENTSNTHTASED  29     0.009    Enterobactin synthetase component D signature.
		>ENTSNTHTASED#Enterobactin synthetase component D signature.

Length = 234

Score = 29.2 bits (65), Expect = 0.009
Identities = 15/57 (26%), Positives = 27/57 (47%), Gaps = 5/57 (8%)

Query: 84 GQP-----IYVSLSYSYPYIVCVVDKEPVGIDIEKISQRLDWRTLVTCFSTNEAHQI 135
QP ++ S+S+ + V+ ++ +GIDIEKI + L ++ QI
Sbjct: 76 RQPLWPDGLFGSISHCATTALAVISRQRIGIDIEKIMSQHTATELAPSIIDSDERQI 132


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS04910  NUCEPIMERASE  53     8e-09    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 52.9 bits (127), Expect = 8e-09
Identities = 54/266 (20%), Positives = 101/266 (37%), Gaps = 55/266 (20%)

Query: 2046 NTLLTGATGFLGAYLIEALQGYSHRIYCFIRADNEEIAWYKLMTNLNDYFS----EETVE 2101
L+TGA GF+G ++ + L H++ + D NLNDY+ + +E
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQV---VGID-----------NLNDYYDVSLKQARLE 47

Query: 2102 MM----LSNIEVIVGDFECMDDVVLPENMDTIIH----AGARTDHFGDDDEFEKVNVQGT 2153
++ ++ + D E M D+ + + + R + + N+ G
Sbjct: 48 LLAQPGFQFHKIDLADREGMTDLFASGHFERVFISPHRLAVR-YSLENPHAYADSNLTGF 106

Query: 2154 VDVIRLAQQHH-ARLIYVSTISV-GTYFDIDTEDVTFSEADVYKGQLLTSPYTRSKFYSE 2211
++++ + + L+Y S+ SV G + FS D + S Y +K +E
Sbjct: 107 LNILEGCRHNKIQHLLYASSSSVYG-----LNRKMPFSTDDSVDHPV--SLYAATKKANE 159

Query: 2212 LKVLEAVNN-GLDGRIVRVGNLTSPYNGRWHM------RNIKTNRFSMVMNDLLQLDCIG 2264
L + GL +R + P+ GR M + + + V N
Sbjct: 160 LMAHTYSHLYGLPATGLRFFTVYGPW-GRPDMALFKFTKAMLEGKSIDVYNY-------- 210

Query: 2265 VSMAEMPVDFSFVDTTARQIVALAQV 2290
+M DF+++D A I+ L V
Sbjct: 211 ---GKMKRDFTYIDDIAEAIIRLQDV 233


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS04915  TCRTETA       32     0.004    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 32.1 bits (73), Expect = 0.004
Identities = 61/337 (18%), Positives = 127/337 (37%), Gaps = 33/337 (9%)

Query: 7 TLKVRLISNFLQLIITTAFIPFIALYLTDMLS----QSIVGIYLVGLVVLKFPLSIISGY 62
L V L + L + +P + L D++ + GI L +++F + + G
Sbjct: 6 PLIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGA 65

Query: 63 LIEIFPKKLLVLIYQATMVIMLVFMGVFGSHQLWQI-IGFCVAYAIFTIVWGLQFPVMDT 121
L + F ++ ++L+ A + M + LW + IG VA + G V
Sbjct: 66 LSDRFGRRPVLLVSLAGAAVDYAIMAT--APFLWVLYIGRIVAG-----ITGATGAVAGA 118

Query: 122 LIMDAITEDVEHYIYKISYWMTNLSVAIGALLGGLMYGYSMLLLFLIAACIFLIVLFILY 181
I D D + + G +LGGLM G+S F AA + +
Sbjct: 119 YIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGC 178

Query: 182 IWLPQDRNQVKQSDDKRHASRYQKLQIMNIFRSYKLVLKDRNYMLLISGFSIIMMGEFSI 241
LP+ ++ + + + L+ F + ++G+
Sbjct: 179 FLLPESHKGERRPLRREALNPLASFRWARGMTVVAA--------LMAVFFIMQLVGQVPA 230

Query: 242 SSYIAIRLKDQF--ETISIGSYDITGAKMLAILLMINTVVVILLTYSISKVVLKIDFKKA 299
+ + I +D+F + +IG LA +++++ ++T ++ ++ ++A
Sbjct: 231 ALW-VIFGEDRFHWDATTIGI-------SLAAFGILHSLAQAMITGPVAA---RLGERRA 279

Query: 300 LITGLLIYIVGYSGLTYLNQFGLLVVFMIIATVGEII 336
L+ G++ GY L + + + M++ G I
Sbjct: 280 LMLGMIADGTGYILLAFATRGWMAFPIMVLLASGGIG 316


13  CH52_RS05120  CH52_RS05260  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
CH52_RS05120  4       17       1.873831   deoxyribose-phosphate aldolase
CH52_RS05125  3       17       1.079031   tetracycline efflux MFS transporter Tet(38)
CH52_RS05130  3       16       0.887822   purine-nucleoside phosphorylase
CH52_RS05140  3       15       -0.634564  GntR family transcriptional regulator
CH52_RS15125  3       14       -0.874000  LPXTG cell wall anchor domain-containing
CH52_RS05145  2       14       -0.595981  superoxide dismutase
CH52_RS05150  1       12       -0.987091  lipopolysaccharide biosynthesis protein
CH52_RS05155  1       13       -1.460442  O-antigen ligase family protein
CH52_RS05160  -1      12       0.103299   glycosyltransferase family 4 protein
CH52_RS05165  -2      11       0.743553   sugar transferase
CH52_RS05170  -2      14       1.395731   NAD-dependent epimerase/dehydratase family
CH52_RS05175  -2      14       1.748055   hypothetical protein
CH52_RS05180  0       15       3.386883   (S)-acetoin forming diacetyl reductase
CH52_RS05185  0       14       3.282014   MFS transporter
CH52_RS05190  -1      15       2.698681   bifunctional transcriptional
CH52_RS05200  1       17       3.550130   staphyloferrin B biosynthesis decarboxylase
CH52_RS05205  1       16       3.336793   staphyloferrin B biosynthesis citrate synthase
CH52_RS05210  2       16       3.453927   3-(L-alanin-3-ylcarbamoyl)-2-[(2-
CH52_RS05215  1       15       3.157247   L-2,3-diaminopropanoate--citrate ligase SbnE
CH52_RS05220  0       11       2.807616   staphyloferrin B export MFS transporter
CH52_RS05225  -1      11       3.113918   staphyloferrin B biosynthesis protein SbnC
CH52_RS05230  -2      9        2.662489   N-[(2S)-2-amino-2-carboxyethyl]-L-glutamate
CH52_RS05235  -3      10       1.596073   2,3-diaminopropionate biosynthesis protein SbnA
CH52_RS05240  1       18       1.998882   staphyloferrin B ABC transporter
CH52_RS05245  1       16       1.581049   staphyloferrin B ABC transporter permease
CH52_RS05250  1       17       1.618038   staphyloferrin B ABC transporter permease
CH52_RS05255  1       12       -0.024014  HTH-type transcriptional regulator SarS
CH52_RS05260  2       11       -0.871879  staphylococcal protein A
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS05125  TCRTETB       165    5e-48    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 165 bits (418), Expect = 5e-48
Identities = 103/397 (25%), Positives = 194/397 (48%), Gaps = 2/397 (0%)

Query: 15 LLFLFVFSLVIDNSFKLISVAIADDLNISVTTVSWQATLAGLVIGIGAVVYASLSDAISI 74
L L FS++ + + IA+D N + +W T L IG VY LSD + I
Sbjct: 19 LCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGI 78

Query: 75 RTLFIYGVILIIIGSIIGYIFQHQFPLLLVGRIIQTAGLAAAETLYVIYVAKYLSKEDQK 134
+ L ++G+I+ GS+IG++ F LL++ R IQ AG AA L ++ VA+Y+ KE++
Sbjct: 79 KRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRG 138

Query: 135 TYLGLSTSSYSLSLVIGTLSGGFISTYLHWTNMFLIALIVVFTLPFLFKLLPKENNTNKA 194
GL S ++ +G GG I+ Y+HW+ + LI +I + T+PFL KLL KE K
Sbjct: 139 KAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPMITIITVPFLMKLLKKEVRI-KG 197

Query: 195 HLDFVGLILVATIATTVMLFITNFNWLYMIGALIAIIVFALYIKNAQRPLVNKSFFQNKR 254
H D G+IL++ MLF T+++ ++I ++++ ++F +I+ P V+ +N
Sbjct: 198 HFDIKGIILMSVGIVFFMLFTTSYSISFLIVSVLSFLIFVKHIRKVTDPFVDPGLGKNIP 257

Query: 255 YASFLFIVFVMYAIQLGYIFTFPFIMEQIYHLQLDTT-SLLLVPGYIVAVIVGALSGKIG 313
+ + +++ G++ P++M+ ++ L S+++ PG + +I G + G +
Sbjct: 258 FMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFGYIGGILV 317

Query: 314 EYLNSKQAIITAIILIALSLILPAFAVGNHISIFVISMIFFAGSFALMYAPLLNEAIKTI 373
+ + + +++S + +F + I ++F G + + ++
Sbjct: 318 DRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGLSFTKTVISTIVSSSL 377

Query: 374 DLNMTGVAIGFYNLIINVAVSVGIAIAAALIDFKALN 410
G + N ++ GIAI L+ L+
Sbjct: 378 KQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLSIPLLD 414


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS05175  NUCEPIMERASE  217    9e-71    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 217 bits (554), Expect = 9e-71
Identities = 79/327 (24%), Positives = 139/327 (42%), Gaps = 33/327 (10%)

Query: 6 RVLITGGAGFIGSHLVDDL-QQDYDVYVLDNYRTG-----KRENIKSLADDHVF--ELDI 57
+ L+TG AGFIG H+ L + + V +DN K+ ++ LA ++D+
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKIDL 61

Query: 58 REYDAVEQIMKTYQFDYVIHLAALVSVAESVEKPILSQEINVVATLRLLEIIKKYNSHIK 117
+ + + + + F+ V ++V S+E P + N+ L +LE + I+
Sbjct: 62 ADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNK--IQ 119

Query: 118 RFIFASSAAVYGDLPDLPKSDQSLI-LPLSPYAIDKYYGERTTLNYCSLYNIPTAVVKFF 176
++ASS++VYG +P S + P+S YA K E Y LY +P ++FF
Sbjct: 120 HLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYGLPATGLRFF 179

Query: 177 NVFGPRQDPKSQYSGVISKMFDSFEHNKPFTFFGDGLQTRDFVYVYDVVQSVRLIMEH-- 234
V+GP P M K + G RDF Y+ D+ +++ + +
Sbjct: 180 TVYGPWGRPDMALFKFTKAML----EGKSIDVYNYGKMKRDFTYIDDIAEAIIRLQDVIP 235

Query: 235 ---------------KDAIGHGYNIGTGTFTNLLEVYRIIGELYGKSVEHEFKEARKGDI 279
A YNIG + L++ + + + G + + GD+
Sbjct: 236 HADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAKKNMLPLQPGDV 295

Query: 280 KHSYADISNL-KALGFVPKYTVETGLK 305
+ AD L + +GF P+ TV+ G+K
Sbjct: 296 LETSADTKALYEVIGFTPETTVKDGVK 322


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS05185  DHBDHDRGNASE  128    4e-38    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 128 bits (323), Expect = 4e-38
Identities = 66/250 (26%), Positives = 113/250 (45%), Gaps = 2/250 (0%)

Query: 5 KVALVTGGAQGIGFKIAERLVEDGFKVAVVDFNEEGAKAAALKLSSDGTKAIAIKADVSN 64
K+A +TG AQGIG +A L G +A VD+N E + L ++ A A ADV +
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRD 68

Query: 65 RDDVFNAVRQTAAQFGDFHVMVNNAGLGPTTPIDTITEEQFKTVYGVNVAGVLWGIQAAH 124
+ + + G ++VN AG+ I ++++E+++ + VN GV ++
Sbjct: 69 SAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVS 128

Query: 125 EQFKKFNHGGKIINATSQAGVEGNPGLSLYCSTKFAVRGLTQVAAQDLASEGITVNAFAP 184
+ G I+ S ++ Y S+K A T+ +LA I N +P
Sbjct: 129 KYMMD-RRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSP 187

Query: 185 GIVQTPMMESIAVATAEEAGKPEAWGWEQFTSQIALGRVSQPEDVSNVVSFLAGKDSDYI 244
G +T M S+ + E F + I L ++++P D+++ V FL + +I
Sbjct: 188 GSTETDMQWSLWADENGAEQVIKGSL-ETFKTGIPLKKLAKPSDIADAVLFLVSGQAGHI 246

Query: 245 TGQTIIVDGG 254
T + VDGG
Sbjct: 247 TMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS05215  PF04183       511    e-178    IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 511 bits (1317), Expect = e-178
Identities = 145/592 (24%), Positives = 257/592 (43%), Gaps = 40/592 (6%)

Query: 25 VNQTILNRVKTRVMHQLVSSLIYENIVVYKASYQDGVGHFTIEGHDSEYRFTAEKTHSFD 84
+N + V R++ +++S L YE + + A Q G + I +++RF AE+ +
Sbjct: 1 MNHKDWDLVNRRLVAKMLSELEYEQV--FHAESQ-GDDRYCINLPGAQWRFIAERG-IWG 56

Query: 85 RIRITSPIERVVGDEADTTTDYTQLLREAVFTFPKNDEKLEQFIVELLQTELKDTQSMQY 144
+ I + R AD LL + +D + + + +L T L D Q ++
Sbjct: 57 WLWIDAQTLRC----ADEPVLAQTLLMQLKQVLSMSDATVAEHMQDLYATLLGDLQLLKA 112

Query: 145 RESNPPATPETFN-DYEFYAMEGHQYHPSYKSRLGFTLSDNLKFGPDFVPNVKLQWLAID 203
R + N D + GH K R G+ ++ P++ +L WLA+
Sbjct: 113 RRGLSASDLINLNADRLQCLLSGHPKFVFNKGRRGWGKEALERYAPEYANTFRLHWLAVK 172

Query: 204 KDKVETTVSRNVVVNEMLRQQVGDKTYEHFVQQIEASGKHVNDVEMIPVHPWQFEHVIQV 263
++ + + ++++L + + + F Q + +G N + +PVHPWQ++ I
Sbjct: 173 REHMIWRCDNEMDIHQLLTAAMDPQEFARFSQVWQENGLDHNWL-PLPVHPWQWQQKIAT 231

Query: 264 DLAEERLNGTVLWLGESDELYHPQQSIRTMSPIDTT-KYYLKVPISITNTSTKRVLAPHT 322
D + G ++ LGE + + QQS+RT++ +K+P++I NTS R +
Sbjct: 232 DFIADFAEGRMVSLGEFGDQWLAQQSLRTLTNASRRGGLDIKLPLTIYNTSCYRGIPGRY 291

Query: 323 IENAAQITDWLKQIQQQDMYLKDE----LKTVFLGEVLGQSYLNTQLSPYKQTQVYGALG 378
I + WL+Q+ D L L G V + Y +PY+ ++ LG
Sbjct: 292 IAAGPLASRWLQQVFATDATLVQSGAVILGEPAAGYVSHEGYAALARAPYRYQEM---LG 348

Query: 379 VIWRENIYHMLIDEEDAIPFNALYASDKDGLPFIEKWIKQYG--SEAWTKQFLAVAIRPM 436
VIWREN L +E + L D++ P +I + G +E W Q V + P+
Sbjct: 349 VIWRENPCRWLKPDESPVLMATLMECDENNQPLAGAYIDRSGLDAETWLTQLFRVVVVPL 408

Query: 437 IHMLYYHGIAFESHAQNMMLIHENGWPTRIALKDFHDGVRFKREHLSEAASHLTLKPMPE 496
H+L +G+A +H QN+ L + G P R+ LKDF +R +E E S +P+
Sbjct: 409 YHLLCRYGVALIAHGQNITLAMKEGVPQRVLLKDFQGDMRLVKEEFPEMDS------LPQ 462

Query: 497 AHKKVNSNSFIETDDERLVRDFLH---DAFFFINIAEIILFIEKQYGIDEQRQWQWVKDI 553
+ V S RL D+L F+ + I + + G+ E+R +Q + +
Sbjct: 463 EVRDVTS---------RLSADYLIHDLQTGHFVTVLRFISPLMVRLGVPERRFYQLLAAV 513

Query: 554 IEAYQEAFPELNN-YQHFDLFEPTIQVEKLTTRRL-LSDSELRIHHVTNPLG 603
+ Y + P+++ + F LF P I L +L D + + N L
Sbjct: 514 LSDYMKKHPQMSERFALFSLFRPQIIRVVLNPVKLTWPDLDGGSRMLPNYLE 565


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS05220  PF04183       301    4e-97    IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 301 bits (772), Expect = 4e-97
Identities = 118/540 (21%), Positives = 212/540 (39%), Gaps = 63/540 (11%)

Query: 3 NKELIQHAAYAAIERILNEYFREENLYQVPPQNHQWSIQLSELE-TLTGEFRYWSAMGHH 61
N + + ++L+E E+ + + ++ I L + E W G
Sbjct: 2 NHKDWDLVNRRLVAKMLSELEYEQVFHAESQGDDRYCINLPGAQWRFIAERGIW---GW- 57

Query: 62 MYHPEVWLIDGKSKKITTYKEAIARILQHMAQSADNQTA-VQQHMAQIMSDI--DNSIHR 118
ID ++ + +L + Q A V +HM + + + D + +
Sbjct: 58 ------LWIDAQTLRCADEPVLAQTLLMQLKQVLSMSDATVAEHMQDLYATLLGDLQLLK 111

Query: 119 TARYLQSNTIDYVEDRYIVSEQSLYLGHPFHPTPKSASGFSEADLEKYAPECHTSFQLHY 178
R L ++ + + Q L GHP K G+ + LE+YAPE +F+LH+
Sbjct: 112 ARRGLSASDL---INLNADRLQCLLSGHPKFVFNKGRRGWGKEALERYAPEYANTFRLHW 168

Query: 179 LAVHQD-------------VLLTRYVEGKEDQVEKVLYQLADIDISEIPKDFILLPTHPY 225
LAV ++ LLT ++ +E ++Q +D +++ LP HP+
Sbjct: 169 LAVKREHMIWRCDNEMDIHQLLTAAMDPQEFARFSQVWQENGLD-----HNWLPLPVHPW 223

Query: 226 QINVLRQHPQYMQYSEQGLIKDLGVSGDLVYPTSSVRTVF--SKALNIYLKLPIHVKITN 283
Q ++ +G + LG GD S+RT+ S+ + +KLP+ + T+
Sbjct: 224 QWQQK-IATDFIADFAEGRMVSLGEFGDQWLAQQSLRTLTNASRRGGLDIKLPLTIYNTS 282

Query: 284 FIRTNDLEQIERTIDAAQVIASVKDE-----------VETPHFKLMFEEGYRALLPNPLG 332
R I A++ + V + P + EGY AL P
Sbjct: 283 CYRGIPGRYIAAGPLASRWLQQVFATDATLVQSGAVILGEPAAGYVSHEGYAALARAPYR 342

Query: 333 QTVEPEMDLLTNSAMIVREGIPNY-HADKDIHVLASLFETMPD-SPISKLAQVIEQSGLA 390
EM +I RE + D+ ++A+L E + P+ I++SGL
Sbjct: 343 YQ---EM-----LGVIWRENPCRWLKPDESPVLMATLMECDENNQPL--AGAYIDRSGLD 392

Query: 391 PEAWLECYLDRTLLPILKLFSNTGISLEAHVQNTLIELKDGIPDVCFVRDLEG-ICLSRT 449
E WL ++P+ L G++L AH QN + +K+G+P ++D +G + L +
Sbjct: 393 AETWLTQLFRVVVVPLYHLLCRYGVALIAHGQNITLAMKEGVPQRVLLKDFQGDMRLVKE 452

Query: 450 IATEKQLVPNVVAASSPVVYAHDEAWHRLKYYVVVNHLGHLVSTIGKATRNEVVLWQLVA 509
E +P V + + A D H L+ V L + + + E +QL+A
Sbjct: 453 EFPEMDSLPQEVRDVTSRLSA-DYLIHDLQTGHFVTVLRFISPLMVRLGVPERRFYQLLA 511


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS05225  TCRTETA       80     2e-18    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 79.9 bits (197), Expect = 2e-18
Identities = 71/372 (19%), Positives = 149/372 (40%), Gaps = 24/372 (6%)

Query: 13 ILWLSQFIAIAGLTVLVPLLPIYMASLQNLSVVEIQLWSGIAIAAPAVTTMIASPIWGKL 72
++ + + G+ +++P+LP + L + ++ GI +A A+ +P+ G L
Sbjct: 9 VILSTVALDAVGIGLIMPVLPGLLRDL--VHSNDVTAHYGILLALYALMQFACAPVLGAL 66

Query: 73 GDKISRKWMVLRALLGLAVCLFLMALCTTPLQFVLVRLLQGLFGGVVDASSAFASAEAPA 132
D+ R+ ++L +L G AV +MA + R++ G+ G + A+ +
Sbjct: 67 SDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIADITDG 126

Query: 133 EDRGKVLGRLQSSVSAGSLVGPLIGGVTASILGFSALLMSIAVITFIVCIFGALKLIETT 192
++R + G + + G + GP++GG+ A + A + + + G L E+
Sbjct: 127 DERARHFGFMSACFGFGMVAGPVLGGLMGGF-SPHAPFFAAAALNGLNFLTGCFLLPESH 185

Query: 193 HMPKSQTPNINKGIRRSFQCLLCTQQTCRFIIVGVLANFAMYGMLTALSPLASSVNHTAI 252
+ SF+ + V A A++ ++ + + +++
Sbjct: 186 KGERRPLRREALNPLASFR--------WARGMTVVAALMAVFFIMQLVGQVPAALWVIFG 237

Query: 253 DDR-----SVIGFLQSAF-WTASILSAPLWGRFNDKSYVKSVYIFATIACGCSAILQGLA 306
+DR + IG +AF S+ A + G + + + IA G IL A
Sbjct: 238 EDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFA 297

Query: 307 TNIEFLMAARILQGLTYSAL--IQSVMFVVVNACHQ-QLKGTFVGTTNSMLVVGQIIGSL 363
T +L + +Q+++ V+ Q QL+G+ T+ + I+G L
Sbjct: 298 TRGWMAFPIMVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTS----LTSIVGPL 353

Query: 364 SGAAITSYTTPA 375
AI + +
Sbjct: 354 LFTAIYAASITT 365


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS05230  PF04183       316    e-103    IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 316 bits (812), Expect = e-103
Identities = 119/527 (22%), Positives = 208/527 (39%), Gaps = 46/527 (8%)

Query: 79 RVSKQPLTAAEFWQTIANMNCDLSHEWEVARVEEGLTTAATQLAKQLSELDLASHPFV-- 136
R + +P+ A + + +S +++ T L + L++ +
Sbjct: 66 RCADEPVLAQTLLMQLKQVL-SMSDATVAEHMQDLYATLLGDLQLLKARRGLSASDLINL 124

Query: 137 -MSEQFASLKDRPFHPLAKEKRGLREADYQVYQAELNQSFPLMVAAVKKTHMIHGDTANI 195
L P K +RG + + Y E +F L AVK+ HMI +
Sbjct: 125 NADRLQCLLSGHPKFVFNKGRRGWGKEALERYAPEYANTFRLHWLAVKREHMIWRCDNEM 184

Query: 196 DELENLTVPIKEQA----TDMLNDQGLSIDDYVLFPVHPWQYQHILPNVFATEISEKLVV 251
D + LT + Q + + + GL +++ PVHPWQ+Q + F + +E +V
Sbjct: 185 DIHQLLTAAMDPQEFARFSQVWQENGLD-HNWLPLPVHPWQWQQKIATDFIADFAEGRMV 243

Query: 252 LLPLKFGD-YLSSSSMRSLIDIGAPYN-HVKVPFAMQSLGALRLTPTRYMKNGEQAEQLL 309
L +FGD +L+ S+R+L + +K+P + + R P RY+ G A + L
Sbjct: 244 SLG-EFGDQWLAQQSLRTLTNASRRGGLDIKLPLTIYNTSCYRGIPGRYIAAGPLASRWL 302

Query: 310 RQLIEKDEALAKYVMV-CDETA-------WWSYMGQDNDIFKDQLGHLTVQLRKYPEVLA 361
+Q+ D L + V E A ++ + + +++ LG V R+ P
Sbjct: 303 QQVFATDATLVQSGAVILGEPAAGYVSHEGYAALARAPYRYQEMLG---VIWRENPCRWL 359

Query: 362 KNDTQQLVSMAALAANDRTLYQMICGKDNISKNDVMTLFEDIAQVFLKVTLSFM-QYGAL 420
K D + V MA L D + + S D T + +V + + +YG
Sbjct: 360 KPD-ESPVLMATLMECDENNQPLAGAYIDRSGLDAETWLTQLFRVVVVPLYHLLCRYGVA 418

Query: 421 PELHGQNILLSFEDGRVQKCVLRD-HDTVRIYKPWLTAHQLSLPKYV--VREDTPNTLIN 477
HGQNI L+ ++G Q+ +L+D +R+ K SLP+ V V +
Sbjct: 419 LIAHGQNITLAMKEGVPQRVLLKDFQGDMRLVKEEFPEMD-SLPQEVRDVTSRLSADYLI 477

Query: 478 EDLETFFAYFQTLAVSVNLYAIIDAIQDLFGVSEHELMSLLKQILKNEVATISWVTTDQL 537
DL+T V + I + GV E LL +L + + Q+
Sbjct: 478 HDLQTGHF--------VTVLRFISPLMVRLGVPERRFYQLLAAVLSDYMK-----KHPQM 524

Query: 538 AVRHILFDKQTWPFKQILLP---LLY-QRDSGGGSMPSGLTTVPNPM 580
+ R LF +++L L + D G +P+ L + NP+
Sbjct: 525 SERFALFSLFRPQIIRVVLNPVKLTWPDLDGGSRMLPNYLEDLQNPL 571


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS05235  SYCECHAPRONE  31     0.002    Gram-negative bacterial type III secretion SycE cha...
		>SYCECHAPRONE#Gram-negative bacterial type III secretion SycE chaperone signature.

Length = 130

Score = 31.2 bits (70), Expect = 0.002
Identities = 14/33 (42%), Positives = 16/33 (48%), Gaps = 1/33 (3%)

Query: 25 VDALTEALTAHAHNDFVQ-PLKPYLRQDPENGH 56
+D E T +HN F Q LKP L D GH
Sbjct: 54 LDNNDEKETLLSHNIFSQDILKPILSWDEVGGH 86


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS05245  FERRIBNDNGPP  70     7e-16    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 70.4 bits (172), Expect = 7e-16
Identities = 47/191 (24%), Positives = 78/191 (40%), Gaps = 38/191 (19%)

Query: 53 PKRVVTLYQGATDVAVSLGVKPVGAVES-----WTQKPKFEYIKNDLKDTKI-VGQEPAP 106
P R+V L ++ ++LG+ P G ++ W +P L D+ I VG P
Sbjct: 35 PNRIVALEWLPVELLLALGIVPYGVADTINYRLWVSEPP-------LPDSVIDVGLRTEP 87

Query: 107 NLEEISKLKPDLIVASKVRNEKVYDQLSKIAPTVSTDTVFKFKD----------TTKLMG 156
NLE ++++KP +V S + L++IAP F F D + M
Sbjct: 88 NLELLTEMKPSFMVWS-AGYGPSPEMLARIAPGR----GFNFSDGKQPLAMARKSLTEMA 142

Query: 157 KALGKEKEAEDLLKKYDDKVAAFQKDAKAKY--KDAWPLKASVVNF-RADHTRIYA-GGY 212
L + AE L +Y+D F + K ++ + A PL + H ++
Sbjct: 143 DLLNLQSAAETHLAQYED----FIRSMKPRFVKRGARPL--LLTTLIDPRHMLVFGPNSL 196

Query: 213 AGEILNDLGFK 223
EIL++ G
Sbjct: 197 FQEILDEYGIP 207


14  CH52_RS05370  CH52_RS05475  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
CH52_RS05370  3       12       -0.033363  tRNA-dihydrouridine synthase
CH52_RS15025  5       14       -1.374260  persulfide dioxygenase-sulfurtransferase CstB
CH52_RS05385  3       13       -1.106990  persulfide response sulfurtransferase CstA
CH52_RS05390  7       18       -4.227659  persulfide-sensing transcriptional repressor
CH52_RS05395  8       20       -5.196275  sulfite exporter TauE/SafE family protein
CH52_RS05400  8       20       -5.645807  TIGR04141 family sporadically distributed
CH52_RS05405  5       22       -4.610305  hypothetical protein
CH52_RS05410  -2      16       -0.230984  hypothetical protein
CH52_RS05415  -2      13       -0.074462  23S rRNA
CH52_RS05420  0       18       2.770968   LPXTG-anchored adenosine synthase AdsA
CH52_RS05425  1       18       2.734737   MBL fold metallo-hydrolase
CH52_RS05430  2       18       2.655806   two-component system regulatory protein YycI
CH52_RS05435  4       17       2.457944   two-component system activity regulator YycH
CH52_RS05440  7       19       1.755894   cell wall metabolism sensor histidine kinase
CH52_RS05445  7       18       2.248626   response regulator YycF
CH52_RS05450  7       17       2.326325   **adenylosuccinate synthase
CH52_RS05455  5       15       2.272511   replicative DNA helicase
CH52_RS05460  4       15       2.170384   50S ribosomal protein L9
CH52_RS05475  2       14       2.065119   cyclic-di-AMP phosphodiesterase GdpP
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
CH52_RS05390  PF01206  61     4e-14    SirA family protein
		>PF01206#SirA family protein

Length = 76

Score = 60.5 bits (147), Expect = 4e-14
Identities = 20/70 (28%), Positives = 37/70 (52%)

Query: 118 KQFNYRGFQCPGPIVKISQEMKNIEVGDQIEVKVTDPGFPSDIKSWVKQTRHTLVKLDEN 177
+ + G CP PI+K + + + G+ + V TDPG D +S+ KQT H L++ E
Sbjct: 6 QSLDATGLNCPLPILKAKKTLATMNAGEVLYVMATDPGSVKDFESFSKQTGHELLEQKEE 65

Query: 178 NNGINAIIQK 187
+ + +++
Sbjct: 66 DGTYHFRLKR 75


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits    Score  E-value  Comments
CH52_RS05460  HTHFIS  94     2e-24    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 93.7 bits (233), Expect = 2e-24
Identities = 31/124 (25%), Positives = 64/124 (51%), Gaps = 1/124 (0%)

Query: 4 KVVVVDDEKPIADILEFNLKKEGYDVYCAYDGNDAVDLIYEEEPDIVLLDIMLPGRDGME 63
++V DD+ I +L L + GYDV + I + D+V+ D+++P + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 64 VCREVRKKYE-MPIIMLTAKDSEIDKVLGLELGADDYVTKPFSTRELIARVKANLRRHYS 122
+ ++K +P+++++A+++ + + E GA DY+ KPF ELI + L
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 123 QPAQ 126
+P++
Sbjct: 125 RPSK 128


15  CH52_RS05825  CH52_RS05850  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias   Product
CH52_RS05825  12      16       3.809646  hypothetical protein
CH52_RS05830  10      15       2.812317  hypothetical protein
CH52_RS05835  10      14       2.520282  flavin reductase family protein
CH52_RS05840  8       13       2.600737  serine-rich repeat glycoprotein adhesin SasA
CH52_RS05845  7       13       2.525363  accessory Sec system protein translocase subunit
CH52_RS05850  5       13       2.086362  accessory Sec system protein Asp1
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS05830  ENTEROTOXINA  28     0.005    Heat-labile enterotoxin A chain signature.
		>ENTEROTOXINA#Heat-labile enterotoxin A chain signature.

Length = 258

Score = 28.4 bits (63), Expect = 0.005
Identities = 17/54 (31%), Positives = 27/54 (50%), Gaps = 2/54 (3%)

Query: 30 IELFEHTFGLQKELVKYVGIAEATTAALYSASFINKNISRLASLSTIGILSVAA 83
I L++H G Q V+Y +T+ +L SA ++I L+ ST I +A
Sbjct: 57 INLYDHARGTQTGFVRYDDGYVSTSLSLRSAHLAGQSI--LSGYSTYYIYVIAT 108


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS05850  ICENUCLEATIN  65     3e-12    Ice nucleation protein signature.
		>ICENUCLEATIN#Ice nucleation protein signature.

Length = 1258

Score = 64.8 bits (157), Expect = 3e-12
Identities = 239/1048 (22%), Positives = 416/1048 (39%), Gaps = 4/1048 (0%)

Query: 687 ATQDNSGNAVTNTVTGLPSGLTFDSTNNTISGTPTNIGTSTISIVSTDASGNKTTTTFKY 746
+ + +T + S T+ +TI ST + T+
Sbjct: 107 HHRADYVACTEMQAGPGSPDVTSEVKVGNRSLPVTDDIDATIESGSTQPTQTIEIATYGS 166

Query: 747 EVTRNSMSDSVSTSGSTQQSQSVSTSKADSQSASTSTSGSIVVSTSASTSKSTSVSLSDS 806
++ S ++ GST+ + ST A S T+ + S +V+ ST + S +
Sbjct: 167 TLSGTHQSQLIAGYGSTETAGDSSTLIAGYGSTGTAGADSTLVAGYGSTQTAGEESSQMA 226

Query: 807 VSASKSLSTSESNSVSSSTSTSLVNSQSVSSSMSGSVSKSTSLSDSISNSNSTEKSESLS 866
S S+ + ST S + GS + S + ST+ ++ S
Sbjct: 227 GYGSTQTGMKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGS 286

Query: 867 TSTSDSLRTSTSLSDSLSMSTSGSLSKSQSLSTSISGSSSTSASLSDSTSNAISTSTSLS 926
T+ T T+ +DS ++ GS + ST +G ST + S A ST +
Sbjct: 287 DLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTA 346

Query: 927 ESASTSDSISISNSIANSQSASTSKSDSQSTSISLSTSDSKSMSTSESLSDSTSTSGSVS 986
S+ + S A S+ T+ S T+ S + ST + +DS+ +G
Sbjct: 347 GDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAG--Y 404

Query: 987 GSLSIAASQSVSTSTSDSMSTSEIVSDSISTSGSLSASDSKSMSVSSSMSTSQSGSTSES 1046
GS A +S T+ S T++ SD + GS + S ++ ST +G S
Sbjct: 405 GSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSL 464

Query: 1047 LSDSQSTSDSDSKSLSLSTSQSGSTSTSTSTSASVRTSESQSTSGSMSASQSDSMSISTS 1106
+ ST + S + S ST+ S+ + S + GS + S + +
Sbjct: 465 TAGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTAGYGSTQTAQN 524

Query: 1107 FSDSTSDSKSASTASSESISQSASTSTSGSVSTSTSLSTSNSERTSTSVSDSTSLSTSES 1166
SD + S STA + S + ST + S + S +T+ SD T+ S
Sbjct: 525 ESDLITGYGSTSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQTAREGSDLTAGYGSTG 584

Query: 1167 DSISESTSTSDSISEAISASESTSISLSESNSTSDSESQSASAFLSESLSESTSESTSES 1226
+ S+S+ + S ++ S+ + S T+ +S + + S S + ++S+ +
Sbjct: 585 TAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGY--GSTSTAGADSSLIA 642

Query: 1227 VSSSTSESTSLSDSTSESGSTSTSLSNSTSGSASISTSTSISESTSTFKSESVSTSLSMS 1286
ST + S T+ GST T+ S + STST+ ++S+ S T+ S
Sbjct: 643 GYGSTQTAGYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGADSSLIAGYGSTQTAGYNS 702

Query: 1287 TSTSLSNSTSLSTSLSDSTSDSKSDSLSTSMSTSDSISTSKSDSISTSTSLSGSTSESES 1346
T+ ST + SD TS S S + + S+ + S + S+ +G S +
Sbjct: 703 ILTAGYGSTQTAQEGSDLTSGYGSTSTAGADSSLIAGYGSTQTASYHSSLTAGYGSTQTA 762

Query: 1347 DSTSSSESKSDSTSMSISMSQSTSGSTSTSTSTSLSDSTSTSLSLSASMNQSGVDSNSAS 1406
S + STS + + S +G ST T+ S T+ S + +S + + S
Sbjct: 763 REQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYHSILTAGYGSTQTAQERSDLTTGYGS 822

Query: 1407 QSASNSTSTSTSESDSQSTSTYTSQSTSQSESTSTSTSLSDSTSISKSTSQSGSTSTSAS 1466
S + + S+ + S T+ Y S T+ ST T+ SD T+ STS +G S+ +
Sbjct: 823 TSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYDSSLIA 882

Query: 1467 LSGSESESDSQSISTSASESTSESASTSLSDSTSTSNSGSASTSTSLSNSASASESDSSS 1526
GS + SI T+ ST + S + S S + S+ ++ S + S
Sbjct: 883 GYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYESSLIAGYGSTQTASFKS 942

Query: 1527 TSLSDSTSASMQSSESDSQSTSASLSDSLSTSTSNRMSTIASLSTSVSTSESGSTSESTS 1586
T ++ S+ +S + S S + S+ + ST +G S T+
Sbjct: 943 TLMAGYGSSQTAREQSSLTAGYGSTSMAGYDSSLIAGYGSTQTAGYQSTLTAGYGSTQTA 1002

Query: 1587 ESDSTSTSLSDSQSTSRSTSASGSASTSTSTSDSRSTSASTSTSMRTSTSDSQSMSLSTS 1646
E ST T+ S +T+ + S+ + S+ TS RS + S S S + S
Sbjct: 1003 EHSSTLTAGYGSTATAGADSSLIAGYGSSLTSGIRSFLTAGYGSTLISGLRSVLTAGYGS 1062

Query: 1647 TSTSMSDSTSLSDSVSDSTSDSTSASTSGSMSVSISLSDSSNISGSNSTSTSLSTSDSMS 1706
+ S S+ + S+ + S+ +G S I+ + S I+G S+ T+ S +S
Sbjct: 1063 SLISGRRSSLTAGYGSNQIASHRSSLIAGPESTQITGNRSMLIAGKGSSQTAGYRSTLIS 1122

Query: 1707 GSVSVSTSTSLSDSISGSTSVSDSSSTS 1734
G+ SV + I+G+ S + S
Sbjct: 1123 GADSVQMAGERGKLIAGADSTQTAGDRS 1150



Score = 59.4 bits (143), Expect = 2e-10
Identities = 233/976 (23%), Positives = 399/976 (40%), Gaps = 18/976 (1%)

Query: 789 VSTSASTSKSTSVSLSDSVSASKSLSTSESNSVSSSTSTSLVNSQSVSSSMSGSVSKSTS 848
V+ + + S ++ V + ++ S S +Q++ + GS T
Sbjct: 113 VACTEMQAGPGSPDVTSEVKVGNRSLPVTDDIDATIESGSTQPTQTIEIATYGSTLSGTH 172

Query: 849 LSDSISNSNSTEKSESLSTSTSDSLRTSTSLSDSLSMSTSGSLSKSQSLSTSISGSSSTS 908
S I+ STE + ST + T T+ +DS ++ GS + S+ ++G ST
Sbjct: 173 QSQLIAGYGSTETAGDSSTLIAGYGSTGTAGADSTLVAGYGSTQTAGEESSQMAGYGSTQ 232

Query: 909 ASLSDSTSNAISTSTSLSESASTSDSISISNSIANSQSASTSKSDSQSTSISLSTSDSKS 968
+ S A ST + S+ IA S T+ DS T+ ST ++
Sbjct: 233 TGMKGSDLTAGYGSTGTAGDDSSL--------IAGYGSTQTAGEDSSLTAGYGSTQTAQK 284

Query: 969 MSTSESLSDSTSTSGSVS------GSLSIAASQSVSTSTSDSMSTSEIVSDSISTSGSLS 1022
S + ST T+G+ S GS A +S T+ S T++ SD + GS
Sbjct: 285 GSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTG 344

Query: 1023 ASDSKSMSVSSSMSTSQSGSTSESLSDSQSTSDSDSKSLSLSTSQSGSTSTSTSTSASVR 1082
+ S ++ ST +G S + ST + S + S T+ + S+ +
Sbjct: 345 TAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGY 404

Query: 1083 TSESQSTSGSMSASQSDSMSISTSFSDSTSDSKSASTASSESISQSASTSTSGSVSTSTS 1142
S + S + S + SD T+ S TA +S + ST + S+
Sbjct: 405 GSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSL 464

Query: 1143 LSTSNSERTSTSVSDSTSLSTSESDSISESTSTSDSISEAISASESTSISLSESNSTSDS 1202
+ S +T+ SD T+ S S + ES+ + S + ST + S T+ +
Sbjct: 465 TAGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTAGYGSTQTAQN 524

Query: 1203 ESQSASAFLSESLSESTSESTSESVSSSTSESTSLSDSTSESGSTSTSLSNSTSGSASIS 1262
ES + + S S + + S+ + ST ++ S T+ GST T+ S + S
Sbjct: 525 ESDLITGY--GSTSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQTAREGSDLTAGYGS 582

Query: 1263 TSTSISESTSTFKSESVSTSLSMSTSTSLSNSTSLSTSLSDSTSDSKSDSLSTSMSTSDS 1322
T T+ S+S+ S T+ S+ T+ ST + S T+ S S + + S+ +
Sbjct: 583 TGTAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGADSSLIA 642

Query: 1323 ISTSKSDSISTSTSLSGSTSESESDSTSSSESKSDSTSMSISMSQSTSGSTSTSTSTSLS 1382
S + S +G S + S + STS + + S +G ST T+ S
Sbjct: 643 GYGSTQTAGYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGADSSLIAGYGSTQTAGYNS 702

Query: 1383 DSTSTSLSLSASMNQSGVDSNSASQSASNSTSTSTSESDSQSTSTYTSQSTSQSESTSTS 1442
T+ S + S + S S S + + S+ + S T++Y S T+ ST T+
Sbjct: 703 ILTAGYGSTQTAQEGSDLTSGYGSTSTAGADSSLIAGYGSTQTASYHSSLTAGYGSTQTA 762

Query: 1443 TSLSDSTSISKSTSQSGSTSTSASLSGSESESDSQSISTS--ASESTSESASTSLSDSTS 1500
S T+ STS +G+ S+ + GS + SI T+ S T++ S + S
Sbjct: 763 REQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYHSILTAGYGSTQTAQERSDLTTGYGS 822

Query: 1501 TSNSGSASTSTSLSNSASASESDSSSTSLSDSTSASMQSSESDSQSTSASLSDSLSTSTS 1560
TS +G+ S+ + S + +S T+ ST + ++S+ + S S + S+ +
Sbjct: 823 TSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYDSSLIA 882

Query: 1561 NRMSTIASLSTSVSTSESGSTSESTSESDSTSTSLSDSQSTSRSTSASGSASTSTSTSDS 1620
ST + S+ T+ GST + SD T+ S S + S+ +G ST T++ S
Sbjct: 883 GYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYESSLIAGYGSTQTASFKS 942

Query: 1621 RSTSASTSTSMRTSTSDSQSMSLSTSTSTSMSDSTSLSDSVSDSTSDSTSASTSGSMSVS 1680
+ S+ S + STS + S + S + ST + GS +
Sbjct: 943 TLMAGYGSSQTAREQSSLTAGYGSTSMAGYDSSLIAGYGSTQTAGYQSTLTAGYGSTQTA 1002

Query: 1681 ISLSDSSNISGSNSTSTSLSTSDSMSGSVSVSTSTSLSDSISGSTSVSDSSSTSTSTSLS 1740
S + GS +T+ + S+ + GS S S + GST +S S T+ S
Sbjct: 1003 EHSSTLTAGYGSTATAGADSSLIAGYGSSLTSGIRSFLTAGYGSTLISGLRSVLTAGYGS 1062

Query: 1741 DSMSQSQSTSTSASGS 1756
+S +S+ T+ GS
Sbjct: 1063 SLISGRRSSLTAGYGS 1078



Score = 59.0 bits (142), Expect = 2e-10
Identities = 239/1087 (21%), Positives = 429/1087 (39%), Gaps = 12/1087 (1%)

Query: 837 SSMSGSVSKSTSLSDSISNSNSTEKSESLSTSTSDSLRTSTSLSDSLSMSTSGSLSKSQS 896
+S + + + + + S +++ R+ D + SGS +Q+
Sbjct: 99 TSAMQFILHHRADYVACTEMQAGPGSPDVTSEVKVGNRSLPVTDDIDATIESGSTQPTQT 158

Query: 897 LSTSISGSSSTSASLSDSTSNAISTSTSLSESASTSDSISISNSIANSQSASTSKSDSQS 956
+ + GS+ + S + ST T+ +S I+ S + + ST + S
Sbjct: 159 IEIATYGSTLSGTHQSQLIAGYGSTETA----GDSSTLIAGYGSTGTAGADSTLVAGYGS 214

Query: 957 TSISLSTSDSKSMSTSESLSDSTSTSGSVSGSLSIAASQSVSTSTSDSMSTSEIVSDSIS 1016
T + S + S S + GS A S + S T+ S +
Sbjct: 215 TQTAGEESSQMAGYGSTQTGMKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTA 274

Query: 1017 TSGSLSASDSKSMSVSSSMSTSQSGSTSESLSDSQSTSDSDSKSLSLSTSQSGSTSTSTS 1076
GS + S + ST +G+ S ++ ST + +S + S T+ S
Sbjct: 275 GYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGS 334

Query: 1077 TSASVRTSESQSTSGSMSASQSDSMSISTSFSDSTSDSKSASTASSESISQSASTSTSGS 1136
+ S + S + S + S T+ S TA S + ST +
Sbjct: 335 DLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTA 394

Query: 1137 VSTSTSLSTSNSERTSTSVSDSTSLSTSESDSISESTSTSDSISEAISASESTSISLSES 1196
+ S+ ++ S +T+ S T+ S + S T+ S + +S+ I+ S
Sbjct: 395 GADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGS 454

Query: 1197 NSTSDSESQSASAFLSESLSESTSESTSESVSSSTSESTSLSDSTSESGSTSTSLSNSTS 1256
T+ +S + + S ++ S+ T+ S+ST+ S + S T+ S T+
Sbjct: 455 TQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTA 514

Query: 1257 GSASISTSTSISESTSTFKSESVSTSLSMSTSTSLSNSTSLSTSLSDSTSDSKSDSLSTS 1316
G S T+ + S+ + + S S + + S + S T+ S+ + S + S
Sbjct: 515 GYGSTQTAQNESDLITGYGSTSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQTAREGS 574

Query: 1317 MSTSDSISTSKSDSISTSTSLSGSTSESESDSTSSSESKSDSTSMSISMSQSTSGSTSTS 1376
T+ ST + S S+ + GST + S+ ++ S T+ S+ + GSTST+
Sbjct: 575 DLTAGYGSTGTAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTA 634

Query: 1377 TSTSLSDSTSTSLSLSASMNQSGVDSNSASQSASNSTSTSTSESDSQSTSTYTSQSTSQS 1436
+ S + S + + S + S T+ S S + + + + S
Sbjct: 635 GADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGADSSLIAGYGS 694

Query: 1437 ESTSTSTSLSDSTSISKSTSQSGSTSTSASLSGSESESDSQSISTSASESTSESASTSLS 1496
T+ S+ + S T+Q GS TS S S + +DS I+ S T+ S+ +
Sbjct: 695 TQTAGYNSILTAGYGSTQTAQEGSDLTSGYGSTSTAGADSSLIAGYGSTQTASYHSSLTA 754

Query: 1497 DSTSTSNSGSASTSTSLSNSASASESDSSSTSLSDSTSASMQSSESDSQSTSASLSDSLS 1556
ST + S T+ S S + +DSS + ST + S + S + S
Sbjct: 755 GYGSTQTAREQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYHSILTAGYGSTQTAQERS 814

Query: 1557 TSTSNRMSTIASLSTSVSTSESGSTSESTSESDSTSTSLSDSQSTSRSTSASGSASTSTS 1616
T+ ST + + S + GST + S T+ S + S +G STST+
Sbjct: 815 DLTTGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTA 874

Query: 1617 TSDSRSTSASTSTSMRTSTSDSQSMSLSTSTSTSMSDSTSLSDSVSDSTSDSTSASTSGS 1676
DS + ST S + ST T+ SD T+ S S + +S+ + GS
Sbjct: 875 GYDSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYESSLIAGYGS 934

Query: 1677 MSVSISLSDSSNISGSNSTSTSLSTSDSMSGSVSVSTSTSLSDSISGSTSVSDSSSTSTS 1736
+ S GS+ T+ S+ + GS S++ S + GST + ST T+
Sbjct: 935 TQTASFKSTLMAGYGSSQTAREQSSLTAGYGSTSMAGYDSSLIAGYGSTQTAGYQSTLTA 994

Query: 1737 TSLSDSMSQSQSTSTSASGSLSTS--ISTSMSMSASTSSSQSTSVSTSLSTSDSISDSTS 1794
S ++ ST T+ GS +T+ S+ ++ S+ +S S T+ S IS S
Sbjct: 995 GYGSTQTAEHSSTLTAGYGSTATAGADSSLIAGYGSSLTSGIRSFLTAGYGSTLISGLRS 1054

Query: 1795 ISISGSQSTVESESTSDSTSISDSESLSTSDSDSTSTSTSDSTSGSTSTSISESLSTSGS 1854
+ +G S++ S S T+ S +++ S + S +G+ S I+ G
Sbjct: 1055 VLTAGYGSSLISGRRSSLTAGYGSNQIASHRSSLIAGPESTQITGNRSMLIA------GK 1108

Query: 1855 GSTSVSDSTSMSESDSTSVSMSQDKSDSTSISDSESVSTSTSTSLSTSDSTSTSESLSTS 1914
GS+ + S S + SV M+ ++ + +DS + S L+ ++S T+ S
Sbjct: 1109 GSSQTAGYRSTLISGADSVQMAGERGKLIAGADSTQTAGDRSKLLAGNNSYLTAGDRSKL 1168

Query: 1915 MSGSQSI 1921
+G+ I
Sbjct: 1169 TAGNDCI 1175



Score = 57.8 bits (139), Expect = 5e-10
Identities = 207/898 (23%), Positives = 359/898 (39%), Gaps = 6/898 (0%)

Query: 1044 SESLSDSQSTSDSDSKSLSLSTSQSGSTSTSTSTSASVRTSESQSTSGSMSASQSDSMSI 1103
S S +Q+ + S T QS + ST + +S + GS + +DS +
Sbjct: 150 SGSTQPTQTIEIATYGSTLSGTHQSQLIAGYGSTETAGDSSTLIAGYGSTGTAGADSTLV 209

Query: 1104 STSFSDSTSDSKSASTASSESISQSASTSTSGSVSTSTSLSTSNSERTSTSVSDSTSLST 1163
+ S T+ +S+ A S S + ST + +S + S T+
Sbjct: 210 AGYGSTQTAGEESSQMAGYGSTQTGMKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGED 269

Query: 1164 SESDSISESTSTSDSISEAISASESTSISLSESNSTSDSESQSASAFLSESLSESTSEST 1223
S + ST T+ S+ + ST + ++S+ + S + S + S T
Sbjct: 270 SSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQT 329

Query: 1224 SESVSSSTSESTSLSDSTSESGSTSTSLSNSTSGSASISTSTSISESTSTFKSESVSTSL 1283
++ S T+ S + +S + S T+G S T+ S T+ S+ +
Sbjct: 330 AQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYG 389

Query: 1284 SMSTSTSLSNSTSLSTSLSDSTSDSKSDSLSTSMSTSDSISTSKSDSISTSTSLSGSTSE 1343
S T+ + S+ + S + +S + S T+ S + ST T+ S+
Sbjct: 390 STGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLI 449

Query: 1344 SESDSTSSSESKSDSTSMSISMSQSTSGSTSTSTSTSLSDSTSTSLSLSASMNQSGVDSN 1403
+ ST ++ S T+ S + GS T+ S S + S ++ +
Sbjct: 450 AGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYG 509

Query: 1404 SASQSASNSTSTSTSESDSQSTSTYTSQSTSQSESTSTSTSLSDSTSISKSTSQSGSTST 1463
S + ST T+ +ESD + TS + + S + S ++ S T+ GST T
Sbjct: 510 STLTAGYGSTQTAQNESDLITGYGSTSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQT 569

Query: 1464 SASLSGSESESDSQSISTSASESTSESASTSLSDSTSTSNSGSASTSTSLSNSASASESD 1523
+ S + S + S S + ST + S+ +G ST T+ S +
Sbjct: 570 AREGSDLTAGYGSTGTAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYG 629

Query: 1524 SSSTSLSDSTSASMQSSESDSQSTSASLSDSLSTSTSNRMSTIASLSTSVSTSESGST-- 1581
S+ST+ +DS+ + S + S + ST T+ S + + S ST+ + S+
Sbjct: 630 STSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGADSSLI 689

Query: 1582 ----SESTSESDSTSTSLSDSQSTSRSTSASGSASTSTSTSDSRSTSASTSTSMRTSTSD 1637
S T+ +S T+ S T++ S S STST+ + S+ + S +T++
Sbjct: 690 AGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTSGYGSTSTAGADSSLIAGYGSTQTASYH 749

Query: 1638 SQSMSLSTSTSTSMSDSTSLSDSVSDSTSDSTSASTSGSMSVSISLSDSSNISGSNSTST 1697
S + ST T+ S + S ST+ + S+ +G S + S +G ST T
Sbjct: 750 SSLTAGYGSTQTAREQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYHSILTAGYGSTQT 809

Query: 1698 SLSTSDSMSGSVSVSTSTSLSDSISGSTSVSDSSSTSTSTSLSDSMSQSQSTSTSASGSL 1757
+ SD +G S ST+ + S I+G S + S T+ S +Q S +G
Sbjct: 810 AQERSDLTTGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYG 869

Query: 1758 STSISTSMSMSASTSSSQSTSVSTSLSTSDSISDSTSISISGSQSTVESESTSDSTSISD 1817
STS + S + S T+ S+ T+ S T+ S + S ST+ S
Sbjct: 870 STSTAGYDSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYESSLI 929

Query: 1818 SESLSTSDSDSTSTSTSDSTSGSTSTSISESLSTSGSGSTSVSDSTSMSESDSTSVSMSQ 1877
+ ST + ST + S T+ S + GS S + DS+ ++ ST + Q
Sbjct: 930 AGYGSTQTASFKSTLMAGYGSSQTAREQSSLTAGYGSTSMAGYDSSLIAGYGSTQTAGYQ 989

Query: 1878 DKSDSTSISDSESVSTSTSTSLSTSDSTSTSESLSTSMSGSQSISDSTSTSMSGSTST 1935
+ S + +ST T+ S +T+ ++S + GS S S +G ST
Sbjct: 990 STLTAGYGSTQTAEHSSTLTAGYGSTATAGADSSLIAGYGSSLTSGIRSFLTAGYGST 1047



Score = 50.9 bits (121), Expect = 7e-08
Identities = 206/894 (23%), Positives = 361/894 (40%), Gaps = 6/894 (0%)

Query: 733 TDASGNKTTTTFKYEVTRNSMSDSVSTSGSTQQSQSVSTSKADSQSASTSTSGSIVVSTS 792
T G+ T + T + S ++ GSTQ + ST A S T+ GS + +
Sbjct: 281 TAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGY 340

Query: 793 ASTSKSTSVSLSDSVSASKSLSTSESNSVSSSTSTSLVNSQSVSSSMSGSVSKSTSLSDS 852
ST + S + S + +S+ + ST S ++ GS + + S
Sbjct: 341 GSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSL 400

Query: 853 ISNSNSTEKSESLSTSTSDSLRTSTSLSDSLSMSTSGSLSKSQSLSTSISGSSSTSASLS 912
I+ ST+ + ST T+ T T+ S + GS + S+ I+G ST +
Sbjct: 401 IAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGE 460

Query: 913 DSTSNAISTSTSLSESASTSDSISISNSIANSQSA------STSKSDSQSTSISLSTSDS 966
DS+ A ST ++ S + S S A +S+ ST + ST + S
Sbjct: 461 DSSLTAGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTAGYGSTQ 520

Query: 967 KSMSTSESLSDSTSTSGSVSGSLSIAASQSVSTSTSDSMSTSEIVSDSISTSGSLSASDS 1026
+ + S+ ++ STS + + S IA S T++ +S+ T+ S + GS +
Sbjct: 521 TAQNESDLITGYGSTSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQTAREGSDLTAGY 580

Query: 1027 KSMSVSSSMSTSQSGSTSESLSDSQSTSDSDSKSLSLSTSQSGSTSTSTSTSASVRTSES 1086
S + S S+ +G S + S+ + S + QS T+ STS + S
Sbjct: 581 GSTGTAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGADSSL 640

Query: 1087 QSTSGSMSASQSDSMSISTSFSDSTSDSKSASTASSESISQSASTSTSGSVSTSTSLSTS 1146
+ GS + +S+ + S T+ S TA S S + + S+ + ST +
Sbjct: 641 IAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGADSSLIAGYGSTQTAGY 700

Query: 1147 NSERTSTSVSDSTSLSTSESDSISESTSTSDSISEAISASESTSISLSESNSTSDSESQS 1206
NS T+ S T+ S+ S STST+ + S I+ ST + S+ T+ S
Sbjct: 701 NSILTAGYGSTQTAQEGSDLTSGYGSTSTAGADSSLIAGYGSTQTASYHSSLTAGYGSTQ 760

Query: 1207 ASAFLSESLSESTSESTSESVSSSTSESTSLSDSTSESGSTSTSLSNSTSGSASISTSTS 1266
+ S + S ST+ + SS + S + S T+ S T+ S T+
Sbjct: 761 TAREQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYHSILTAGYGSTQTAQERSDLTTGY 820

Query: 1267 ISESTSTFKSESVSTSLSMSTSTSLSNSTSLSTSLSDSTSDSKSDSLSTSMSTSDSISTS 1326
S ST+ S ++ S T+ S T+ S + +S + S ST+ S+
Sbjct: 821 GSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYDSSL 880

Query: 1327 KSDSISTSTSLSGSTSESESDSTSSSESKSDSTSMSISMSQSTSGSTSTSTSTSLSDSTS 1386
+ ST T+ S + ST +++ SD T+ S S + S+ + S ++
Sbjct: 881 IAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYESSLIAGYGSTQTASF 940

Query: 1387 TSLSLSASMNQSGVDSNSASQSASNSTSTSTSESDSQSTSTYTSQSTSQSESTSTSTSLS 1446
S ++ + S+ + STS + +S + T + QS T+ S
Sbjct: 941 KSTLMAGYGSSQTAREQSSLTAGYGSTSMAGYDSSLIAGYGSTQTAGYQSTLTAGYGSTQ 1000

Query: 1447 DSTSISKSTSQSGSTSTSASLSGSESESDSQSISTSASESTSESASTSLSDSTSTSNSGS 1506
+ S T+ GST+T+ + S + S S S T+ ST +S S +G
Sbjct: 1001 TAEHSSTLTAGYGSTATAGADSSLIAGYGSSLTSGIRSFLTAGYGSTLISGLRSVLTAGY 1060

Query: 1507 ASTSTSLSNSASASESDSSSTSLSDSTSASMQSSESDSQSTSASLSDSLSTSTSNRMSTI 1566
S+ S S+ + S+ + S+ + S + + S ++ S+ T+ ST+
Sbjct: 1061 GSSLISGRRSSLTAGYGSNQIASHRSSLIAGPESTQITGNRSMLIAGKGSSQTAGYRSTL 1120

Query: 1567 ASLSTSVSTSESGSTSESTSESDSTSTSLSDSQSTSRSTSASGSASTSTSTSDS 1620
S + SV + + ++S T+ S + + S +G S T+ +D
Sbjct: 1121 ISGADSVQMAGERGKLIAGADSTQTAGDRSKLLAGNNSYLTAGDRSKLTAGNDC 1174



Score = 34.3 bits (78), Expect = 0.006
Identities = 116/530 (21%), Positives = 203/530 (38%), Gaps = 6/530 (1%)

Query: 1486 STSESASTSLSDSTSTSNSGSASTSTSLSNSASASESDSSSTSLSDSTSASMQSSESDSQ 1545
S + +D + + + S +++ T D+T S + + +
Sbjct: 100 SAMQFILHHRADYVACTEMQAGPGSPDVTSEVKVGNRSLPVTDDIDATIESGSTQPTQTI 159

Query: 1546 STSASLSDSLSTSTSNRMSTIASLSTSVSTSE--SGSTSESTSESDSTSTSLSDSQSTSR 1603
+ S T S ++ S T+ +S +G S T+ +DST + S T+
Sbjct: 160 EIATYGSTLSGTHQSQLIAGYGSTETAGDSSTLIAGYGSTGTAGADSTLVAGYGSTQTAG 219

Query: 1604 STSASGSASTSTSTSDSRSTSASTSTSMRTSTSDSQSMSLSTSTSTSMSDSTSLSDSVSD 1663
S+ + ST T S + S T+ DS ++ ST T+ DS+ + S
Sbjct: 220 EESSQMAGYGSTQTGMKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGST 279

Query: 1664 STSDSTSASTSGSMSVSISLSDSSNISGSNSTSTSLSTSDSMSGSVSVSTSTSLSDSISG 1723
T+ S T+G S + +DSS I+G ST T+ S +G S T+ SD +G
Sbjct: 280 QTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAG 339

Query: 1724 STSVSDSSSTSTSTSLSDSMSQSQSTSTSASGSLSTSISTSMSMSASTSSSQSTSVSTSL 1783
S + S+ + S + S+ +G ST + S ++ S T+
Sbjct: 340 YGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQ----TAQKGSDLTAGYGSTGTAG 395

Query: 1784 STSDSISDSTSISISGSQSTVESESTSDSTSISDSESLSTSDSDSTSTSTSDSTSGSTST 1843
+ S I+ S +G +ST + S T+ S+ + S T+ S +G ST
Sbjct: 396 ADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGST 455

Query: 1844 SISESLSTSGSGSTSVSDSTSMSESDSTSVSMSQDKSDSTSISDSESVSTSTSTSLSTSD 1903
+ S+ +G S + S+ + S S +S+ I+ S T+ S T+
Sbjct: 456 QTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTAG 515

Query: 1904 STSTSESLSTSMSGSQSISDSTSTSMSGSTSTSESNSMHPSDSMSMHHTHSTSTSRLSSE 1963
ST + + S + S ST+ + S + S +S+ ST T+R S+
Sbjct: 516 YGSTQTAQNESDLITGYGSTSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQTAREGSD 575

Query: 1964 ATTSTSESQSTLSATSEVTKHNGTPAQSEKRLPDTGDSIKQNGLLGGVMT 2013
T + + S +S + + T S G Q V+T
Sbjct: 576 LTAGYGSTGTAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLT 625


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS05855  SECYTRNLCASE  130    4e-36    Preprotein translocase SecY subunit signature.
		>SECYTRNLCASE#Preprotein translocase SecY subunit signature.

Length = 437

Score = 130 bits (329), Expect = 4e-36
Identities = 93/440 (21%), Positives = 181/440 (41%), Gaps = 52/440 (11%)

Query: 4 LLQQYEYKIIYKRMLYTCFILFIYILGTNISI--VSYNDMQ------VKHESFFKIAISN 55
+ + + K++L+T I+ +Y +GT+I I V Y ++Q ++ F +
Sbjct: 5 FARAFRTPDLRKKLLFTLAIIVVYRVGTHIPIPGVDYKNVQQCVREASGNQGLFGLVNMF 64

Query: 56 MGGDVNTLNIFTLGLVPWLTSMIILMLISYRNMDKYMKQTSLEKHYKE------------ 103
GG + + IF LG++P++T+ IIL L++ + LE KE
Sbjct: 65 SGGALLQITIFALGIMPYITASIILQLLT-------VVIPRLEALKKEGQAGTAKITQYT 117

Query: 104 RILTLILSVIQSYFVIHEYVSKERVHQDN-------------IYLTILILVTGTMLLVWL 150
R LT+ L+++Q ++ S + + ++ + GT +++WL
Sbjct: 118 RYLTVALAILQGTGLVATARSAPLFGRCSVGGQIVPDQSIFTTITMVICMTAGTCVVMWL 177

Query: 151 ADKNSRYGIAGPMPIVMVSIIKSMMHQKMEYI------DASHIVIALLIILVIITLFILL 204
+ + GI M I+M I + + I I +I + +I + +++
Sbjct: 178 GELITDRGIGNGMSILMFISIAATFPSALWAIKKQGTLAGGWIEFGTVIAVGLIMVALVV 237

Query: 205 FIELVEVRIPYI----DLMNVSATNMKSYLSWKVNPAGSITLMMSISAFVFLKSGIHFIL 260
F+E + RIP + S +Y+ KVN AG I ++ + S F
Sbjct: 238 FVEQAQRRIPVQYAKRMIGRRSYGGTSTYIPLKVNQAGVIPVIFASSLLYIPALVAQFAG 297

Query: 261 SMFNKSISDDMPMLTFDSPVGISVYLVIQMLLGYFLSRFLINTKQKSKDFLKSGNYFSGV 320
+ + D P+ I Y ++ + +F N ++ + + K G + G+
Sbjct: 298 GNSGWKSWVEQNLTKGDHPIYIVTYFLLIVFFAFFYVAISFNPEEVADNMKKYGGFIPGI 357

Query: 321 KPGKDTERYLNYQARRVCWFGSALVTVIIGIPLYFTLFVPHLSTEIYFS-VQLIVLVYIS 379
+ G+ T YL+Y R+ W GS + +I +P L S F ++++V +
Sbjct: 358 RAGRPTAEYLSYVLNRITWPGSLYLGLIALVP-TMALVGFGASQNFPFGGTSILIIVGVG 416

Query: 380 INIAETIRTYLYFDKYKPFL 399
+ + I + L Y+ FL
Sbjct: 417 LETVKQIESQLQQRNYEGFL 436


16  CH52_RS06670  CH52_RS14875  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
CH52_RS06670  -1      10       3.155371   GntR family transcriptional regulator
CH52_RS06675  1       10       3.211568   gluconokinase
CH52_RS06680  0       9        3.258311   gluconate:H+ symporter
CH52_RS06685  1       10       2.136209   fibronectin-binding protein FnbA
CH52_RS06695  3       11       1.462305   fibronectin-binding protein FnbB
CH52_RS06700  8       15       2.048913   UTP--glucose-1-phosphate uridylyltransferase
CH52_RS06705  8       14       0.827368   HTH-type transcriptional regulator SarU
CH52_RS06710  10      17       -0.306638  HTH-type transcriptional regulator SarT
CH52_RS06715  10      20       -0.931382  E domain-containing protein
CH52_RS06720  10      20       -0.089484  hypothetical protein
CH52_RS06725  6       16       0.378377   hypothetical protein
CH52_RS06730  -1      16       -2.298353  hypothetical protein
CH52_RS06735  1       15       -3.844135  phospho-sugar mutase
CH52_RS06740  2       14       -3.507664  (deoxy)nucleoside triphosphate
CH52_RS06745  4       15       -3.474273  DUF3427 domain-containing protein
CH52_RS06750  5       15       -3.471665  tandem-type lipoprotein
CH52_RS06755  8       16       -3.698633  tandem-type lipoprotein
CH52_RS06760  7       17       -3.834200  tandem-type lipoprotein
CH52_RS06765  7       17       -2.365511  membrane protein
CH52_RS06770  7       18       -2.346233  hypothetical protein
CH52_RS06775  6       17       -2.004191  hypothetical protein
CH52_RS06780  4       18       -1.835781  tandem-type lipoprotein
CH52_RS06785  5       14       -2.618422  hypothetical protein
CH52_RS06790  5       18       -2.700255  hypothetical protein
CH52_RS14875  5       18       -2.590465  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
CH52_RS06700  PF03544  58     3e-11    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 58.1 bits (140), Expect = 3e-11
Identities = 23/115 (20%), Positives = 33/115 (28%), Gaps = 5/115 (4%)

Query: 869 TPPTPPTPEVPSEPETPMPPTPEVPSEPETPTPPTPEVPSEPETPTPPTPEVPSEPETPT 928
P P + + + + P + P EP P PE PE P + P
Sbjct: 45 APAQPISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPI--PEPPKEAPVVIEKPKPKPK 102

Query: 929 PPTPEVPSEPETPTPPTPEVPAEPGKPVPPAKEEPKKPSKPVEQGKVVTPVIEIN 983
P V + P P E P P +P+ PV +
Sbjct: 103 PKPKPV---KKVEQPKRDVKPVESRPASPFENTAPARPTSSTATAATSKPVTSVA 154



Score = 56.5 bits (136), Expect = 9e-11
Identities = 21/102 (20%), Positives = 30/102 (29%), Gaps = 7/102 (6%)

Query: 877 EVPSEPETPMPPTPEVPSEPETPTPPTPEVPSEPETPTPPTPEVPSEPETPTPPTPEVPS 936
+V P P + + + + P + P EP P PE PE P +
Sbjct: 39 QVIELPAPAQPISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPI--PEPPKEAPVVIEK 96

Query: 937 EPETPTPPTPEVPAEP-----GKPVPPAKEEPKKPSKPVEQG 973
P P V KPV P + + P
Sbjct: 97 PKPKPKPKPKPVKKVEQPKRDVKPVESRPASPFENTAPARPT 138



Score = 54.2 bits (130), Expect = 6e-10
Identities = 21/108 (19%), Positives = 31/108 (28%)

Query: 871 PTPPTPEVPSEPETPMPPTPEVPSEPETPTPPTPEVPSEPETPTPPTPEVPSEPETPTPP 930
P + P EP P PE EP P E P P P + +P+ P
Sbjct: 61 EPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVEQPKRDVKP 120

Query: 931 TPEVPSEPETPTPPTPEVPAEPGKPVPPAKEEPKKPSKPVEQGKVVTP 978
P+ P T P + + + + + P
Sbjct: 121 VESRPASPFENTAPARPTSSTATAATSKPVTSVASGPRALSRNQPQYP 168



Score = 53.4 bits (128), Expect = 9e-10
Identities = 24/120 (20%), Positives = 37/120 (30%), Gaps = 6/120 (5%)

Query: 861 QQTIEEDTTPPTPPTPEVPSEPETPMPPTPEVPSEPETPTPPTPEVPSEPETPTPPTPEV 920
+E PP P V EPE P P + P P P+
Sbjct: 57 PADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVEQPKR 116

Query: 921 PSEPETPTPPTPEVPSEPETPTPPTP------EVPAEPGKPVPPAKEEPKKPSKPVEQGK 974
+P P +P + P PT T V + P ++ +P+ P++
Sbjct: 117 DVKPVESRPASPFENTAPARPTSSTATAATSKPVTSVASGPRALSRNQPQYPARAQALRI 176



Score = 51.5 bits (123), Expect = 4e-09
Identities = 25/103 (24%), Positives = 36/103 (34%), Gaps = 2/103 (1%)

Query: 905 EVPSEPETPTPPTPEVPSEPETPTPPTPEVPSEPETPTPPTPEVPAEPGKPVPPAKEEPK 964
+V P P + + + + P + P EP P PE EP K P E+PK
Sbjct: 39 QVIELPAPAQPISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPK 98

Query: 965 KPSKPVEQGKVVTPVIEINEKVKAVAPTKKAQSKKSELPETGG 1007
KP K V V + VK V + + +
Sbjct: 99 PKPKPKP--KPVKKVEQPKRDVKPVESRPASPFENTAPARPTS 139



Score = 48.4 bits (115), Expect = 4e-08
Identities = 26/102 (25%), Positives = 35/102 (34%), Gaps = 7/102 (6%)

Query: 891 EVPSEPETPTPPTPEVPSEPETPTPPTPEVPSEPETPTPPTPEVPSEPETPTPPTPEVPA 950
+V P P + + + + P + P EP P PE PE P +
Sbjct: 39 QVIELPAPAQPISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPI--PEPPKEAPVVIE- 95

Query: 951 EPGKPVPPAKEEPKKPSKPVEQGKVVTPVIEINEKVKAVAPT 992
KP P K +P KP K VEQ K +E
Sbjct: 96 ---KPKPKPKPKP-KPVKKVEQPKRDVKPVESRPASPFENTA 133


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits         Score  E-value  Comments
CH52_RS06705  TONBPROTEIN  51     6e-09    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 50.8 bits (121), Expect = 6e-09
Identities = 23/67 (34%), Positives = 27/67 (40%)

Query: 828 EVPSEPETPTPPTPEVPSEPGEPTPPKPEVPSEPETPVPPTPEVPSEPGKPVPPAKEEPK 887
EP P PE EP P PE P E + P KPV +E+PK
Sbjct: 52 PADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPK 111

Query: 888 KPSKPVE 894
+ KPVE
Sbjct: 112 RDVKPVE 118



Score = 50.8 bits (121), Expect = 6e-09
Identities = 20/81 (24%), Positives = 23/81 (28%), Gaps = 4/81 (4%)

Query: 825 PTPEVPSE----PETPTPPTPEVPSEPGEPTPPKPEVPSEPETPVPPTPEVPSEPGKPVP 880
P P P P V P P+PE PE P + KP P
Sbjct: 39 PAPAQPISVTMVTPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKP 98

Query: 881 PAKEEPKKPSKPVEQGKVVTP 901
K K +P K V
Sbjct: 99 KPKPVKKVQEQPKRDVKPVES 119



Score = 50.0 bits (119), Expect = 1e-08
Identities = 17/84 (20%), Positives = 28/84 (33%), Gaps = 3/84 (3%)

Query: 822 PTPPTPEVPSEPETPTPPTPEVPSEPGEPTP---PKPEVPSEPETPVPPTPEVPSEPGKP 878
P V PE P PE P P + +P+ P +V +P +
Sbjct: 54 DLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRD 113

Query: 879 VPPAKEEPKKPSKPVEQGKVVTPV 902
V P + P P + ++ +
Sbjct: 114 VKPVESRPASPFENTAPARLTSST 137



Score = 45.0 bits (106), Expect = 5e-07
Identities = 22/105 (20%), Positives = 33/105 (31%), Gaps = 4/105 (3%)

Query: 810 EGQQTIEEDTTPPTPPTPEVPSEPETPTPPTPEVPSEPGEPTPPKPEVPSEPETPVPPTP 869
E Q ++ P P PE PE PP PKP+ + P
Sbjct: 56 EPPQAVQPPPEPVVEPEPEPEPIPE---PPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKR 112

Query: 870 EV-PSEPGKPVPPAKEEPKKPSKPVEQGKVVTPVIEINEKVKAVA 913
+V P E P P + + PV + +A++
Sbjct: 113 DVKPVESRPASPFENTAPARLTSSTATAATSKPVTSVASGPRALS 157



Score = 42.3 bits (99), Expect = 5e-06
Identities = 27/88 (30%), Positives = 31/88 (35%), Gaps = 12/88 (13%)

Query: 834 ETPTPPTP-------EVPSEPGEPTPPKPEVPSEPETPVPPTPEVPSEPGKPVPPAKEEP 886
E P P P EP + P PE EPE P PE P K P E+P
Sbjct: 37 ELPAPAQPISVTMVTPADLEPPQAVQPPPEPVVEPEPEPEPIPEPP----KEAPVVIEKP 92

Query: 887 KKPSKPVEQGKVVTPVIEINEKVKAVAP 914
K KP + V + VK V
Sbjct: 93 KPKPKPKPK-PVKKVQEQPKRDVKPVES 119



Score = 32.3 bits (73), Expect = 0.008
Identities = 17/81 (20%), Positives = 25/81 (30%), Gaps = 5/81 (6%)

Query: 853 PKPEVPSE----PETPVPPTPEVPSEPGKPVPPAKEEPKKPSKPVEQGKVVTPVIEINEK 908
P P P + P V P V P E P P ++ VV + K
Sbjct: 39 PAPAQPISVTMVTPADLEPPQAVQPPPEPVVEPEPEPEPIPEPP-KEAPVVIEKPKPKPK 97

Query: 909 VKAVAPTKQKQSKKSELPETG 929
K K ++ K ++
Sbjct: 98 PKPKPVKKVQEQPKRDVKPVE 118


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CH52_RS06725  V8PROTEASE  35  0.003  V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 34.6 bits (79), Expect = 0.003
Identities = 14/30 (46%), Positives = 18/30 (60%)

Query: 1268 EKPKDPKGPENPEKPSRPTHPSGPVNPNNP 1297
P +P P NP+ P+ P P+ P NPNNP
Sbjct: 290 NNPDNPDNPNNPDNPNNPDEPNNPDNPNNP 319



Score = 33.1 bits (75), Expect = 0.007
Identities = 13/30 (43%), Positives = 18/30 (60%)

Query: 1268 EKPKDPKGPENPEKPSRPTHPSGPVNPNNP 1297
+ P +P P+NP P P +P P NP+NP
Sbjct: 293 DNPDNPNNPDNPNNPDEPNNPDNPNNPDNP 322



Score = 32.3 bits (73), Expect = 0.011
Identities = 13/30 (43%), Positives = 19/30 (63%)

Query: 1268 EKPKDPKGPENPEKPSRPTHPSGPVNPNNP 1297
++P +P P+NP P P +P P NP+NP
Sbjct: 287 DQPNNPDNPDNPNNPDNPNNPDEPNNPDNP 316



Score = 31.1 bits (70), Expect = 0.030
Identities = 12/29 (41%), Positives = 20/29 (68%)

Query: 1268 EKPKDPKGPENPEKPSRPTHPSGPVNPNN 1296
+ P +P P NP++P+ P +P+ P NP+N
Sbjct: 296 DNPNNPDNPNNPDEPNNPDNPNNPDNPDN 324


17  CH52_RS07620  CH52_RS07695  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
CH52_RS07620  -1      8        3.002626   FosB1/FosB3 family fosfomycin resistance
CH52_RS07625  -2      7        2.737989   LysR family transcriptional regulator
CH52_RS07630  0       9        3.125658   urocanate hydratase
CH52_RS07635  0       11       2.018436   imidazolonepropionase
CH52_RS07640  -1      10       1.061084   M20 peptidase aminoacylase family protein
CH52_RS07645  0       9        0.845279   SDR family oxidoreductase
CH52_RS07650  0       10       -0.425847  hypothetical protein
CH52_RS07655  1       11       0.296199   Na+/H+ antiporter NhaC family protein
CH52_RS07660  0       12       -0.097067  SRPBCC domain-containing protein
CH52_RS07665  1       12       -0.153428  MurR/RpiR family transcriptional regulator
CH52_RS07670  2       12       0.298239   alpha-glucoside-specific PTS transporter subunit
CH52_RS07675  1       10       -0.856284  hypothetical protein
CH52_RS07680  3       11       0.136334   bile acid:sodium symporter family protein
CH52_RS07685  1       9        -1.073394  HAD family hydrolase
CH52_RS07690  2       10       -1.966027  hypothetical protein
CH52_RS07695  2       10       -2.236619  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CH52_RS07635  UREASE  31  0.007  Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 31.2 bits (71), Expect = 0.007
Identities = 18/63 (28%), Positives = 27/63 (42%), Gaps = 11/63 (17%)

Query: 27 LDELNVVKNGTVVIKDGKIVYAGQH-----TDDYD-----ATETIDASGKVVSPALVDAH 76
LD +VK + +KDG+I G+ TE I GK+V+ +D+H
Sbjct: 78 LDHWGIVK-ADIGLKDGRIAAIGKAGNPDMQPGVTIIVGPGTEVIAGEGKIVTAGGMDSH 136

Query: 77 THL 79
H
Sbjct: 137 IHF 139


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CH52_RS07645  DHBDHDRGNASE  93  2e-24  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 92.8 bits (230), Expect = 2e-24
Identities = 68/255 (26%), Positives = 112/255 (43%), Gaps = 15/255 (5%)

Query: 46 LQGYKMLVTGGDSAIGRAAAIAYAKEGADV-AINYLPSEEQDAQEVRQVIEESGQKAVLI 104
++G +TG IG A A A +GA + A++Y P + + +V ++ + A
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLE---KVVSSLKAEARHAEAF 62

Query: 105 PGDIRDEQFNYDLVEQAYQQLGGLDNVTLVAGHQQYHDDIHGFTTEAFTETFETNVYPLF 164
P D+RD ++ + +++G +D + VAG + IH + E + TF N +F
Sbjct: 63 PADVRDSAAIDEITARIEREMGPIDILVNVAGVLRP-GLIHSLSDEEWEATFSVNSTGVF 121

Query: 165 WTVQKALEYLKP--GASITTTSSVQGYNPSPILHDYAASKAAIISLTKSFSEELGPKGIR 222
+ +Y+ SI T S P + YA+SKAA + TK EL IR
Sbjct: 122 NASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIR 181

Query: 223 VNCVAPGPFWSPLQIS-----GGQPQ---SKIPTFGQKTPLGRAGQPVELCGTYVLLASE 274
N V+PG + +Q S G Q + TF PL + +P ++ + L S
Sbjct: 182 CNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSG 241

Query: 275 ESSYTTGQVFGVSGG 289
++ + T V GG
Sbjct: 242 QAGHITMHNLCVDGG 256


18  CH52_RS08450  CH52_RS14610  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
CH52_RS08450  2       14       -2.041317  UDPGP type 1 family protein
CH52_RS08455  3       13       -1.395785  hemolysin III family protein
CH52_RS08460  4       14       -1.366526  MFS transporter
CH52_RS08465  3       16       -1.429548  multidrug efflux transporter SepA
CH52_RS08470  2       16       -0.702363  multidrug efflux MFS transporter LmrS
CH52_RS08480  2       13       -0.306739  P-loop NTPase
CH52_RS08485  9       16       0.661242   ******arginase
CH52_RS08535  8       15       0.673663   diadenylate cyclase CdaA
CH52_RS08540  9       15       0.915463   YbbR-like domain-containing protein
CH52_RS08545  8       12       0.796183   phosphoglucosamine mutase
CH52_RS08550  7       13       1.244628   LPXTG-anchored DUF1542 repeat protein FmtB
CH52_RS08555  7       14       1.295909   mannitol-1-phosphate 5-dehydrogenase
CH52_RS08560  -1      13       0.871339   PTS sugar transporter subunit IIA
CH52_RS08565  -1      10       0.920484   BglG family transcription antiterminator
CH52_RS08570  -2      10       0.743817   PTS mannitol transporter subunit IICB
CH52_RS08575  0       13       1.039157   glutamine--fructose-6-phosphate transaminase
CH52_RS08580  -1      13       0.574407   ABC transporter ATP-binding protein
CH52_RS08585  0       13       -2.268767  Cof-type HAD-IIB family hydrolase
CH52_RS08590  0       12       -2.391233  hypothetical protein
CH52_RS14610  2       15       -2.708828  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CH52_RS08465  TCRTETB  102  8e-26  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 102 bits (256), Expect = 8e-26
Identities = 91/405 (22%), Positives = 175/405 (43%), Gaps = 14/405 (3%)

Query: 9 VIALILIMFMSAIESSIISLALPTIKQDLNA-GNLISLIFTAYFIALVIANPIVGELLSR 67
+I L ++ F S + +++++LP I D N + + TA+ + I + G+L +
Sbjct: 16 LIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQ 75

Query: 68 FKIIYVAIAGLLLFSIGSFMCGLS-TNFTMLIISRVIQGFGSGVLMSLSQIVPKLAFEIP 126
I + + G+++ GS + + + F++LI++R IQG G+ +L +V
Sbjct: 76 LGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKE 135

Query: 127 LRYKIMGIVGSVWGISSIIGPLLGGGILEFATWHWLFYINIPIAIIAIILVIWTFHFPEE 186
R K G++GS+ + +GP +GG I HW + + IP +I II V + ++
Sbjct: 136 NRGKAFGLIGSIVAMGEGVGPAIGGMIAH--YIHWSYLLLIP--MITIITVPFLMKLLKK 191

Query: 187 ETVAKSKFDTKGLTLFYVFIGLIMFALLNQQLLLLNFLSFILAIVVAMCLFKVEKHVSSP 246
E K FD KG+ L V I M + + I++++ + K + V+ P
Sbjct: 192 EVRIKGHFDIKGIILMSVGIVFFMLFTTSY-----SISFLIVSVLSFLIFVKHIRKVTDP 246

Query: 247 FLPVVEF-NRSITLVFITDLLTAICLMGFNLYIPVYLQEQLGLSPLQSG-LVIFPLSVAW 304
F+ N + + + + GF +P +++ LS + G ++IFP +++
Sbjct: 247 FVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSV 306

Query: 305 ITLNFNLYRIEAKLSRKVIYLLSFTLLLVSSIIISFGIKL-PVLIAFVLILAGLSFGYIY 363
I + + + + + T L VS + SF ++ + +++ +
Sbjct: 307 IIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGLSFTK 366

Query: 364 TKDSVIVQEETSPLQMKKMMSFYGLTKNLGASIGSTIMGYLYAIQ 408
T S IV + MS T L G I+G L +I
Sbjct: 367 TVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLSIP 411


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CH52_RS08475  TCRTETB  144  2e-40  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 144 bits (365), Expect = 2e-40
Identities = 98/416 (23%), Positives = 194/416 (46%), Gaps = 14/416 (3%)

Query: 7 TTRRRNFIVAVMLISAFVAILNQTLLNTALPSIMRELNINESTSQWLVTGFMLVNGVMIP 66
+ R N I+ + I +F ++LN+ +LN +LP I + N +++ W+ T FML +
Sbjct: 8 SNLRHNQILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTA 67

Query: 67 LTAYLMDRIKTRPLYLAAMGTFLLGSIVAALAPN-FGVLMLARVIQAMGAGVLMPLMQFT 125
+ L D++ + L L + GS++ + + F +L++AR IQ GA L+
Sbjct: 68 VYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVV 127

Query: 126 LFTLFSKEHRGFAMGLAGLVIQFAPAIGPTVTGLIIDQASWRVPFIIIVGIALVAFVFGL 185
+ KE+RG A GL G ++ +GP + G+I W +++++ + + V L
Sbjct: 128 VARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHW--SYLLLIPMITIITVPFL 185

Query: 186 VSISSYNEVKYTKLDKRSVMYSTIGFGLMLYAFSSAGDLGFTSPIVIGALIISMVIIYLF 245
+ + D + ++ ++G + FT+ I LI+S++ +F
Sbjct: 186 MKLLKKEVRIKGHFDIKGIILMSVGIVFFML---------FTTSYSISFLIVSVLSFLIF 236

Query: 246 IRRQFNITNVLLNLRVFKNRTFALCTISSMIIMMSMVGPALLIPLYVQNSLSLSALLSGL 305
++ +T+ ++ + KN F + + II ++ G ++P +++ LS G
Sbjct: 237 VKHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGS 296

Query: 306 VIM-PGAIINGIMSVFTGKFYDKYGPRPLIYTGFTILTITTIMLCFLHTDTSYTYLIVVY 364
VI+ PG + I G D+ GP ++ G T L+++ + FL TS+ I++
Sbjct: 297 VIIFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIV 356

Query: 365 AIRMFSVSLLMMPINTTGINSLRNEEISHGTAIMNFGRVMAGSLGTALMVTLMSFG 420
+ S I+T +SL+ +E G +++NF ++ G A++ L+S
Sbjct: 357 FVLGGL-SFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLSIP 411


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CH52_RS08555  IGASERPTASE  45  7e-06  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 44.7 bits (105), Expect = 7e-06
Identities = 57/316 (18%), Positives = 103/316 (32%), Gaps = 20/316 (6%)

Query: 2122 PQANNNSSADASTNSPTMDNDVTSKPEVESTNNG---TTDKPVTETDNATPAESTTNN-- 2176
P+ + +TN T +N P V S N + PV ATP+E+T
Sbjct: 983 PEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAE 1042

Query: 2177 ----NSTTTATNENAPTGSTATAPTTASTEAASSADSKDNASVNDSKQNAEVNNSAESQS 2232
S T NE T +TA A ++ + V S + + E++
Sbjct: 1043 NSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKE 1102

Query: 2233 TNGKVAQPKS--ENKAKAEKDGRDSTNQSMVESTTETLPSADITEPNVPSNTSKDKEEST 2290
T + K+ E + E S E + P A+ N P+ K+ + T
Sbjct: 1103 TATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQT 1162

Query: 2291 TNQTDAGQLKSETNVASNEADKSPSKADT----EVSNKPSTSASSEAKDKMTSTNVSQKD 2346
D Q ET+ + + +T + + +T A+++ S+N +
Sbjct: 1163 NTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNR 1222

Query: 2347 DTATADTNDTQKSVGPVANNKAKDMQTNDTQKSVGSAANNKATQNDGANASPATVSNG-S 2405
+ + ++N + S N + A A ++ G +
Sbjct: 1223 HRRSVRSVPHNVEPATTSSNDRS----TVALCDLTSTNTNAVLSDARAKAQFVALNVGKA 1278

Query: 2406 HSMHQDMLNVTKPEEN 2421
S H L + +
Sbjct: 1279 VSQHISQLEMNNEGQY 1294



Score = 42.0 bits (98), Expect = 4e-05
Identities = 47/318 (14%), Positives = 104/318 (32%), Gaps = 9/318 (2%)

Query: 1073 NNGSTTEEKEAAKQQVQTEKTAADAAIDAAHSNVEVEAAKNAEIAKI-EAIQPATTTKDN 1131
NG +++ QT T + ++V + N EIA++ EA P
Sbjct: 974 VNGRYDLYNPEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATP 1033

Query: 1132 AKQAIATKANERKTAIAQTQDITAEEIAAANADVDNAVTQANSNIEAANSQNDVDQAKTT 1191
++ N ++ ++T + ++ A +A SN++A N+V Q+ +
Sbjct: 1034 SETTETVAENSKQE--SKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSE 1091

Query: 1192 GETSIDQVTPTVNKKATARNEITAILNNKLQEIQATPDATDEEKQAADAEANTENGKANQ 1251
+ T T K TA E + ++ Q P T + + +
Sbjct: 1092 TKE-----TQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPA 1146

Query: 1252 AISAATTNAQVDEAKANAEAAINAVTPKVVKKQAAKDEIDQLQATQTNVINNDQNATNEE 1311
+ T N + +++ N A + T +V+ N +N T
Sbjct: 1147 RENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPAT 1206

Query: 1312 KEAAIQQLATAVTDAKNNITAATDDNGVDTAKDAGKNSIQSTQPATAVKSNAKNEVDQAV 1371
+ + ++ ++ + + + V+ A + + + +N + A
Sbjct: 1207 TQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDR-STVALCDLTSTNTNAVLSDAR 1265

Query: 1372 TTQNQAIDNTTGATTEEK 1389
N A ++
Sbjct: 1266 AKAQFVALNVGKAVSQHI 1283



Score = 33.9 bits (77), Expect = 0.010
Identities = 61/312 (19%), Positives = 105/312 (33%), Gaps = 18/312 (5%)

Query: 1021 DIDNATANTDVDNAKTTNEATIAAITPDANVKPAAKQAIADKVQAQETAIDANNGSTTEE 1080
D+ N TTN T I D P+ + IA +A S T E
Sbjct: 979 DLYNPEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTE 1038

Query: 1081 K--EAAKQQVQTEKTAADAAIDAAHSNVEVEAAKNAEIAKIEAIQPATTTKDNAKQAIAT 1138
E +KQ+ +T + A + N EV + + A T + Q+ +
Sbjct: 1039 TVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNV-------KANTQTNEVAQSGSE 1091

Query: 1139 KANERKTAIAQTQDITAEEIAAANADVDNAVTQANSNIEAANSQNDVDQAKTTGETSIDQ 1198
+ T +T + EE A + V + S + Q++ Q + D
Sbjct: 1092 TKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPAREND- 1150

Query: 1199 VTPTVN-KKATARNEITAILNNKLQEIQATPDATDEEKQAADAEANTENGKANQAISAAT 1257
PTVN K+ ++ TA +E + + E + + N + AT
Sbjct: 1151 --PTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENT--TPAT 1206

Query: 1258 TNAQVDEAKANAEAAINAVTPKVVKKQ---AAKDEIDQLQATQTNVINNDQNATNEEKEA 1314
T V+ +N + + + V A D+ ++ + + NA + A
Sbjct: 1207 TQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCDLTSTNTNAVLSDARA 1266

Query: 1315 AIQQLATAVTDA 1326
Q +A V A
Sbjct: 1267 KAQFVALNVGKA 1278



Score = 33.5 bits (76), Expect = 0.013
Identities = 45/305 (14%), Positives = 93/305 (30%), Gaps = 14/305 (4%)

Query: 804 KNEEIFKIENITDSTQTKMDAYKEVRQAATARKAQNATVSNATDEEVAEANAAVDAAQTE 863
E K E+ T + + A++A++ +N EVA++ + QT
Sbjct: 1039 TVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTT 1098

Query: 864 GLHDIQVVKSQQEVADTKAKVLDKINAIQTQAKVKPAADTEVENAYNTRKQEIQNSNAST 923
+ V+ +++ A + + ++ + +Q K V+ ++ N
Sbjct: 1099 ETKETATVEKEEK-AKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKE 1157

Query: 924 TEEKEAAYTELDAKKQEARTNLDAANTNSDVTTAKDNGIAAINQVQAATTKKSDAKAEIA 983
+ + + + +E +N++ T T N + + T + +E +
Sbjct: 1158 PQSQTNTTADTEQPAKETSSNVEQPVT-ESTTVNTGNSVVENPENTTPATTQPTVNSESS 1216

Query: 984 QKASERKTAIEAMNDSTTEEQQAAKDKVDQAVVTANADIDNATANTDVDNAKTTNEATIA 1043
K R + E V + N A AK A
Sbjct: 1217 NKPKNR-HRRSVRSVPHNVEPATTSSNDRSTVALCDLTSTNTNAVLSDARAKAQFVA--- 1272

Query: 1044 AITPDANVKPAAKQAIADKV---QAQETAIDANNGSTTEEKEAAKQQVQTEKTAADAAID 1100
NV A Q I+ + Q +N + ++ ++ T D
Sbjct: 1273 -----LNVGKAVSQHISQLEMNNEGQYNVWVSNTSMNKNYSSSQYRRFSSKSTQTQLGWD 1327

Query: 1101 AAHSN 1105
SN
Sbjct: 1328 QTISN 1332



Score = 32.3 bits (73), Expect = 0.034
Identities = 32/210 (15%), Positives = 63/210 (30%), Gaps = 4/210 (1%)

Query: 34 TTASAAEQNQPAQNQPAQPADANTQPNANAGAQANPAAQPANQGGQANPAGGAAQPAGQG 93
T + +Q P Q QP A + +P Q N QPA +
Sbjct: 1116 TEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKET 1175

Query: 94 NQADPNNAAQAQPGNQ--AAPANQAGQGNNQATPNNNATPANQTQPANAPA-AAQPAAPV 150
+ ++ N + N P N+ +N+ + + + + P
Sbjct: 1176 SSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVE 1235

Query: 151 AANAQTQDPNASNTGE-GSINTTLTFDDPAISTDENRQDPTVTVTDKVNGYSLINNGKIG 209
A + D + + S NT D + V+ ++ + N G+
Sbjct: 1236 PATTSSNDRSTVALCDLTSTNTNAVLSDARAKAQFVALNVGKAVSQHISQLEMNNEGQYN 1295

Query: 210 FVNSELRRSDMFDKNNPQNYQAKGNVAALG 239
S + + + + + +K LG
Sbjct: 1296 VWVSNTSMNKNYSSSQYRRFSSKSTQTQLG 1325


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CH52_RS08585  PF05272  29  0.017  Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.017
Identities = 17/58 (29%), Positives = 27/58 (46%), Gaps = 8/58 (13%)

Query: 32 ILYGLNGAGKTTLLNILNAYEPATTGGVNLFGKMPGKVGYSAETVRQHIGFVSHSLLE 89
+L G G GK+TL+N L G++ F +G ++ Q G V++ L E
Sbjct: 600 VLEGTGGIGKSTLINTL--------VGLDFFSDTHFDIGTGKDSYEQIAGIVAYELSE 649


19  CH52_RS09195  CH52_RS15160  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
CH52_RS09195  0       14       -3.411070  LytTR family DNA-binding domain-containing
CH52_RS09200  -1      14       -4.965112  sensor histidine kinase
CH52_RS09205  -1      12       -4.160939  cyclic lactone autoinducer peptide
CH52_RS09215  3       13       -1.200077  accessory gene regulator ArgB-like protein
CH52_RS09220  3       12       -1.818183  delta-hemolysin
CH52_RS15160  3       13       -0.720233  carbon-nitrogen family hydrolase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CH52_RS09205  HTHFIS  34  5e-04  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 33.7 bits (77), Expect = 5e-04
Identities = 19/135 (14%), Positives = 44/135 (32%), Gaps = 13/135 (9%)

Query: 2 KIFICEDDPKQRENMVTIIKNYIMIEEKPMEIALATDNPYEVLEQAKNMNDIGCYFLDIQ 61
I + +DD R + + + T N + D D+
Sbjct: 5 TILVADDDAAIRTVLNQAL-------SRAGYDVRITSNAATLWRWIAAG-DGDLVVTDVV 56

Query: 62 LSTDINGIKLGSEIRKHDPVGNIIFVTSHSELTYLTFVYKVAAMDFIFK----DDPAELR 117
+ D N L I+K P ++ +++ + + A D++ K + +
Sbjct: 57 MP-DENAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGII 115

Query: 118 TRIIDCLETAHTRLQ 132
R + + ++L+
Sbjct: 116 GRALAEPKRRPSKLE 130


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CH52_RS09220  PF04647  130  2e-40  Accessory gene regulator B
		>PF04647#Accessory gene regulator B

Length = 212

Score = 130 bits (329), Expect = 2e-40
Identities = 38/173 (21%), Positives = 72/173 (41%), Gaps = 7/173 (4%)

Query: 18 RNNLDHIQFLQVRLGMQIIVGNFFKILVTYSISIFLSVFLFTLVTHLSYMLIRYNAHGAH 77
+ ++R G+++ +G F+I++ ++ + + LS + R + GAH
Sbjct: 14 DRSDYPFNQEEIRYGIEVFLGTVFQIIIILLVAFVIGLAKEVAFCLLSAAVYRRFSGGAH 73

Query: 78 AKSSILCYIQSILTFVFVPYFLINIDINFTYLLALS--IIGLISVVIYAPAATKKQPIPI 135
+ C + S+L F + Y ID + LL L I L++++ P + I
Sbjct: 74 CEKYYRCTLTSLLVFNVLAYIAHLIDPAYFQLLILIAFITSLLALLFLVPVDNPRNLISN 133

Query: 136 KLVKRKKYLSIIMYLLVLILSLIIHPF-----YAQFMLLGILVESITLLPIFF 183
++ L M L+VL I A +LLG+L ++ TL +
Sbjct: 134 TEQRKTLKLKTSMVLMVLFGGSIGAYRLYTHQIALAILLGVLWQTFTLTALGH 186


20  CH52_RS09280  CH52_RS09705  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
CH52_RS09280  2       17       -2.533188  hypothetical protein
CH52_RS09285  3       22       -3.239134  hypothetical protein
CH52_RS09290  4       21       -3.533154  hypothetical protein
CH52_RS09295  4       21       -2.004539  helix-turn-helix domain-containing protein
CH52_RS09300  3       18       -1.353517  DUF739 family protein
CH52_RS09305  4       19       -0.983013  transcriptional regulator
CH52_RS09310  4       23       -1.069049  hypothetical protein
CH52_RS09315  4       22       -2.120483  phage antirepressor KilAC domain-containing
CH52_RS09320  2       23       -1.960146  hypothetical protein
CH52_RS09325  3       24       -1.759936  hypothetical protein
CH52_RS09330  4       30       -3.015188  hypothetical protein
CH52_RS09335  6       31       -2.459365  DUF771 domain-containing protein
CH52_RS09340  5       31       -1.142744  DUF1270 domain-containing protein
CH52_RS09345  2       29       -0.130953  DUF1108 family protein
CH52_RS09350  2       29       0.990017   DUF2483 domain-containing protein
CH52_RS09355  2       28       0.654218   ATP-binding protein
CH52_RS09360  2       26       0.383971   single-stranded DNA-binding protein
CH52_RS09365  2       23       0.125014   putative HNHc nuclease
CH52_RS09370  2       25       0.312105   hypothetical protein
CH52_RS09375  2       28       0.990259   hypothetical protein
CH52_RS15030  4       28       0.416532   conserved phage C-terminal domain-containing
CH52_RS09380  4       26       0.851125   ATP-binding protein
CH52_RS09385  2       28       1.367429   hypothetical protein
CH52_RS09390  4       33       2.606873   DUF3269 family protein
CH52_RS09395  4       34       3.234619   DUF3113 family protein
CH52_RS09400  4       31       1.858333   hypothetical protein
CH52_RS09405  4       32       1.730732   phi PVL orf 51-like protein
CH52_RS09410  3       33       1.726459   YopX family protein
CH52_RS09415  4       33       1.910866   acetyltransferase
CH52_RS09420  2       32       0.756373   DUF1024 family protein
CH52_RS09425  3       30       -0.271319  hypothetical protein
CH52_RS09435  3       32       0.852965   dUTP diphosphatase
CH52_RS09440  4       30       0.059792   hypothetical protein
CH52_RS09445  8       34       0.123311   DUF1381 domain-containing protein
CH52_RS09450  10      34       -0.397197  transcriptional activator RinB
CH52_RS09455  7       32       -0.272186  DUF1514 family protein
CH52_RS09460  5       31       -0.740303  hypothetical protein
CH52_RS09465  0       23       -0.217986  nucleoside triphosphate pyrophosphohydrolase
CH52_RS09475  0       23       -0.245375  HNH endonuclease
CH52_RS09480  -1      18       -0.150005  phage terminase small subunit P27 family
CH52_RS09485  -1      16       0.288065   phage terminase family protein
CH52_RS09490  1       13       0.277609   hypothetical protein
CH52_RS09495  1       12       0.307408   phage portal protein
CH52_RS09500  1       13       0.099727   HK97 family phage prohead protease
CH52_RS09505  2       11       0.242137   phage major capsid protein
CH52_RS09510  2       13       0.176881   hypothetical protein
CH52_RS09515  2       14       -0.359079  head-tail connector protein
CH52_RS09525  0       17       1.160593   head-tail adaptor protein
CH52_RS09530  0       19       1.232950   HK97 gp10 family phage protein
CH52_RS09535  2       15       3.198969   hypothetical protein
CH52_RS09540  3       15       2.962534   Ig-like domain-containing protein
CH52_RS09545  1       15       2.680565   hypothetical protein
CH52_RS09550  1       15       2.438543   hypothetical protein
CH52_RS09555  1       15       2.440676   phage tail tape measure protein
CH52_RS09560  1       16       2.397312   phage tail family protein
CH52_RS09565  1       17       1.170903   hypothetical protein
CH52_RS09570  1       18       1.112852   hypothetical protein
CH52_RS09575  4       24       -1.156953  hypothetical protein
CH52_RS09580  3       21       0.100994   DUF2951 domain-containing protein
CH52_RS09585  4       19       0.213669   putative holin-like toxin
CH52_RS14650  4       18       0.629472   hypothetical protein
CH52_RS09595  4       16       -0.593455  phage holin
CH52_RS09600  3       15       -1.051319  CHAP domain-containing protein
CH52_RS09605  3       14       -1.805287  staphylokinase
CH52_RS09610  4       18       -3.503010  chemotaxis-inhibiting protein CHIPS
CH52_RS14905  5       17       -4.103121  complement inhibitor SCIN-A
CH52_RS09620  4       14       -5.860619  hypothetical protein
CH52_RS09625  4       15       -5.970198  hypothetical protein
CH52_RS09630  0       13       -4.055705  phospholipase
CH52_RS09640  0       15       -3.602954  extracellular adherence protein Eap/Map
CH52_RS09645  0       15       -3.749424  MAP domain-containing protein
CH52_RS09650  -1      13       -3.573141  aminotransferase class I/II-fold pyridoxal
CH52_RS09655  0       14       -2.101766  hypothetical protein
CH52_RS09665  2       15       -4.239865  hypothetical protein
CH52_RS09670  0       13       -4.777986  GntR family transcriptional regulator
CH52_RS09675  1       12       -4.158039  phenol-soluble modulin export ABC transporter
CH52_RS09680  1       11       -4.586124  phenol-soluble modulin export ABC transporter
CH52_RS09685  0       11       -5.113978  phenol-soluble modulin export ABC transporter
CH52_RS09690  0       11       -4.677050  phenol-soluble modulin export ABC transporter
CH52_RS09695  0       11       -3.471731  thioredoxin family protein
CH52_RS09700  0       13       -3.159956  membrane protein
CH52_RS09705  0       15       -3.342364  DUF1700 domain-containing protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CH52_RS09415  PF06580  27  0.029  Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 26.8 bits (59), Expect = 0.029
Identities = 9/36 (25%), Positives = 18/36 (50%), Gaps = 5/36 (13%)

Query: 67 ERLEQARLERKLERKRKREAELR----RKKPH-LFN 97
+ +QA +++ +EA+L + PH +FN
Sbjct: 142 KNYKQAEIDQWKMASMAQEAQLMALKAQINPHFMFN 177


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CH52_RS09560  GPOSANCHOR  32  0.030  Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 31.6 bits (71), Expect = 0.030
Identities = 16/128 (12%), Positives = 47/128 (36%), Gaps = 7/128 (5%)

Query: 18 GFNRGVTGLNRQMKMVSRELSANLSQFSRYDNSLEKSKIKVEGLSKKQKVQAQITKELKD 77
G T + ++K + E +A ++ + ++ + + L + + K+L+
Sbjct: 271 GAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEA 330

Query: 78 SYDKLSKETG-------ENSAKTQAAAAKYNEAYAKLNQYERELNQATQELKDMQREQKA 130
+ KL ++ A+ + A+ + E + + + ++R+ A
Sbjct: 331 EHQKLEEQNKISEASRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDA 390

Query: 131 LNTAMGKL 138
A ++
Sbjct: 391 SREAKKQV 398


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CH52_RS09570  CHANLCOLICIN  40  8e-05  Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 39.7 bits (92), Expect = 8e-05
Identities = 41/217 (18%), Positives = 79/217 (36%), Gaps = 20/217 (9%)

Query: 588 AIEAARESTKEQLRDYVKTSDYKTDKDGIVERLDTA-EAERTTLKGEIKDKVTLNEYRNG 646
A+E A++ + VK + + A +AE TL G+ NE
Sbjct: 190 AVEIAQKKLSAAQSEVVKMDGEIKTLNSRLSSSIHARDAEMKTLAGK------RNELAQA 243

Query: 647 LEEQKQYTD--DQLSDLSNNPEIKASIEQANQEAQEALKSYIDAQDNLKEKESQAYADGK 704
+ K+ + +LS +N+P +A + A K + Q + E++
Sbjct: 244 SAKYKELDELVKKLSPRANDPLQNRPFFEATRRRVGAGKIREEKQKQVTASETRINRINA 303

Query: 705 ISEEEQRAIQDAQAKLEEAKQNAELKARNAEKKANAYTDNKVKESTDAQR---RTLT-RY 760
+ Q+AI N +K N ++++K++ DA +TLT +Y
Sbjct: 304 DITQIQKAISQVSNNRNAGIARVHEAEENLKKAQNNLLNSQIKDAVDATVSFYQTLTEKY 363

Query: 761 GSQIIQNGKEI-------KLRTTKEEFNATNRTLSNI 790
G + + +E+ K+ E A + +
Sbjct: 364 GEKYSKMAQELADKSKGKKIGNVNEALAAFEKYKDVL 400


21  CH52_RS10300  CH52_RS10325  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
CH52_RS10300  2       10       -2.876870  YlbF/YmcA family competence regulator
CH52_RS10305  3       9        -2.535518  hypothetical protein
CH52_RS10310  5       11       -2.392905  exonuclease SbcCD subunit D
CH52_RS10315  3       10       -3.010698  AAA family ATPase
CH52_RS10320  2       10       -2.731270  3'-5' exoribonuclease YhaM
CH52_RS10325  3       9        -2.577010  peptidylprolyl isomerase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CH52_RS10325  cloacin  36  7e-04  Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 35.8 bits (82), Expect = 7e-04
Identities = 40/175 (22%), Positives = 68/175 (38%), Gaps = 36/175 (20%)

Query: 239 KQKEVALHDHSQEWKSLEQQLNIEPITFPEKGVDR-YEKARAHKQSLERDIGLRNERLAQ 297
KQ++ + QEW + T P + +R YE+ARA D+ ER A+
Sbjct: 297 KQRQDEENRRQQEWDA----------THPVEAAERNYERARAELNQANEDVARNQERQAK 346

Query: 298 LKEEATQLEPVKQSDIDAF-ISLNQQENEIKNKEFELTAIE-------------KDIANK 343
A Q+ ++S++DA +L EI K+F A +
Sbjct: 347 ----AVQVYNSRKSELDAANKTLADAIAEI--KQFNRFAHDPMAGGHRMWQMAGLKAQRA 400

Query: 344 QRDKDELQANIGWSETHHDVDSSEAMKSYVSEQIKNKQEQAAYIKQLERSLEENK 398
Q D + QA + + ++A S E K K+++ + E +L + K
Sbjct: 401 QTDVNNKQA--AFDAAAKEKSDADAALSSAMESRKKKEDKK---RSAENNLNDEK 450


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CH52_RS10330  SSPANPROTEIN  29  0.035  Salmonella invasion protein InvJ signature.
		>SSPANPROTEIN#Salmonella invasion protein InvJ signature.

Length = 336

Score = 28.6 bits (63), Expect = 0.035
Identities = 12/31 (38%), Positives = 19/31 (61%)

Query: 146 PAASSHHHNFASGLSYHVLTMLRIAKSICDI 176
PA S HH+ SGL ++ + LRIA+ + +
Sbjct: 72 PAKSEHHNGNVSGLHHNGKSELRIAEKLLKV 102


22  CH52_RS10380  CH52_RS10565  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
CH52_RS10380  2       12       -2.682774  alpha/beta hydrolase
CH52_RS10385  3       13       -3.934384  ********staphylococcal enterotoxin type O
CH52_RS10395  5       18       -6.436603  staphylococcal enterotoxin type M
CH52_RS10440  7       20       -7.212328  staphylococcal enterotoxin type I
CH52_RS10445  8       20       -6.934678  staphylococcal enterotoxin type N
CH52_RS10450  9       19       -7.192623  staphylococcal enterotoxin type G
CH52_RS14910  8       17       -6.702866  DUF1828 domain-containing protein
CH52_RS10465  7       19       -6.874211  hypothetical protein
CH52_RS10470  7       21       -6.268499  hypothetical protein
CH52_RS10475  4       17       -4.922930  DUF1829 domain-containing protein
CH52_RS10480  3       14       -3.846104  bi-component leukocidin LukED subunit E
CH52_RS10485  4       14       -4.130453  bi-component leukocidin LukED subunit D
CH52_RS10490  3       16       -4.481810  DUF1828 domain-containing protein
CH52_RS15035  2       15       -3.698708  hypothetical protein
CH52_RS10495  2       12       -3.750974  DUF4888 domain-containing protein
CH52_RS10500  2       13       -3.987464  serine protease SplA
CH52_RS10510  0       13       -2.687475  serine protease SplB
CH52_RS10515  2       13       -1.509630  serine protease SplC
CH52_RS10520  3       12       -1.032747  serine protease SplD
CH52_RS10530  3       12       -0.647401  serine protease SplF
CH52_RS10535  -1      13       1.399501   hypothetical protein
CH52_RS10540  -1      13       0.155883   type I restriction-modification system subunit
CH52_RS10550  -1      14       -0.075637  restriction endonuclease subunit S
CH52_RS10555  1       17       0.004422   IS21 family transposase
CH52_RS10560  1       18       0.456551   hypothetical protein
CH52_RS10565  5       22       -3.392062  type I toxin-antitoxin system Fst family toxin
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CH52_RS10440  BACTRLTOXIN  171  7e-55  Bacterial toxin signature.
		>BACTRLTOXIN#Bacterial toxin signature.

Length = 266

Score = 171 bits (435), Expect = 7e-55
Identities = 91/270 (33%), Positives = 140/270 (51%), Gaps = 20/270 (7%)

Query: 3 NSKVMLNVLLLILNLIAICSVNNAYANEE-DPKIESLCKKSSVDPIALHNINDDYINNRF 61
++ ++ ++LI LI + S N A + DP + L K S + N+ Y ++
Sbjct: 2 YKRLFISRVILIFALILVISTPNVLAESQPDPMPDDLHKSSEFTG-TMGNMKYLYDDHYV 60

Query: 62 TTVKSIVSTTEKFLDFDLLFKSINWLDGISAEFKDLKVEFSSSAISKEFLGKTVDIYGVY 121
+ K V + +KFL DL++ D + +K E + ++K++ + VD+YG
Sbjct: 61 SATK--VKSVDKFLAHDLIYNI---SDKKLKNYDKVKTELLNEDLAKKYKDEVVDVYGSN 115

Query: 122 YKAHCH-------GEHQVDTACTYGGVTPHENNKLSEP--KNIGVAVYKDNVNVNTFIVT 172
Y +C+ G+ C YGG+T HE N +N+ V VY++ N +F V
Sbjct: 116 YYVNCYFSSKDNVGKVTGGKTCMYGGITKHEGNHFDNGNLQNVLVRVYENKRNTISFEVQ 175

Query: 173 TDKKKVTAQELDIKVRTKLNNAYKLYDRMTSDVQKGYIKFHSHSEHKESFYYDLFYIKGN 232
TDKK VTAQELDIK R L N LY+ +S + GYIKF ++ +F+YD+ G+
Sbjct: 176 TDKKSVTAQELDIKARNFLINKKNLYEFNSSPYETGYIKFIENNG--NTFWYDMMPAPGD 233

Query: 233 LPDQ--YLQIYNDNKTIDSSDYHIDVYLFT 260
DQ YL +YNDNKT+DS I+V+L T
Sbjct: 234 KFDQSKYLMMYNDNKTVDSKSVKIEVHLTT 263


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CH52_RS10445  BACTRLTOXIN  121  7e-36  Bacterial toxin signature.
		>BACTRLTOXIN#Bacterial toxin signature.

Length = 266

Score = 121 bits (306), Expect = 7e-36
Identities = 64/231 (27%), Positives = 111/231 (48%), Gaps = 36/231 (15%)

Query: 28 NLRNYYGSYPIEDHQSINPENNHLSHQLVFSMDNST------VTAEFKNVDDVKKFKNHA 81
N++ Y + + + + L+H L++++ + V E N D KK+K+
Sbjct: 50 NMKYLYDDHYVS-ATKVKSVDKFLAHDLIYNISDKKLKNYDKVKTELLNEDLAKKYKDEV 108

Query: 82 VDVYGLSYSGYCLKNKY------------IYGGVTFA-GDYLEKSRRIPINLWVNGEHQT 128
VDVYG +Y C + +YGG+T G++ + + + V +
Sbjct: 109 VDVYGSNYYVNCYFSSKDNVGKVTGGKTCMYGGITKHEGNHFDNGNLQNVLVRVYENKRN 168

Query: 129 ISTDKVSTNKKLVTAQEIDTKLRRYLQEEYNIYGFNDTNKGRNYGNKSKFSSGFNAGKIL 188
+ +V T+KK VTAQE+D K R +L + N+Y FN SS + G I
Sbjct: 169 TISFEVQTDKKSVTAQELDIKARNFLINKKNLYEFN--------------SSPYETGYIK 214

Query: 189 FHLNDGSSFSYDLFDT-GTGQAES-FLKIYNDNKTVETEKFHLDVEISYKD 237
F N+G++F YD+ G +S +L +YNDNKTV+++ ++V ++ K+
Sbjct: 215 FIENNGNTFWYDMMPAPGDKFDQSKYLMMYNDNKTVDSKSVKIEVHLTTKN 265


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CH52_RS10450  BACTRLTOXIN  108  2e-30  Bacterial toxin signature.
		>BACTRLTOXIN#Bacterial toxin signature.

Length = 266

Score = 108 bits (270), Expect = 2e-30
Identities = 54/227 (23%), Positives = 98/227 (43%), Gaps = 37/227 (16%)

Query: 30 VGNLRNFYTKHDYIDLKGVTDKNLPIANQLEFS------TGTNDLISESNNWDEISKFKG 83
+GN++ Y H K + +A+ L ++ + + +E N D K+K
Sbjct: 48 MGNMKYLYDDHYVSATKVKSVDKF-LAHDLIYNISDKKLKNYDKVKTELLNEDLAKKYKD 106

Query: 84 KKLDIFGIDY-------------NGPCKSKYMYGGATL-SGQYLNSARKIPINLWVNGKH 129
+ +D++G +Y MYGG T G + ++ + + V
Sbjct: 107 EVVDVYGSNYYVNCYFSSKDNVGKVTGGKTCMYGGITKHEGNHFDNGNLQNVLVRVYENK 166

Query: 130 KTISTDKIATNKKLVTAQEIDVKLRRYLQEEYNIYGHNNTGKGKEYGYKSKFYSGFNNGK 189
+ + ++ T+KK VTAQE+D+K R +L + N+Y N+ + G
Sbjct: 167 RNTISFEVQTDKKSVTAQELDIKARNFLINKKNLY-EFNSSP-------------YETGY 212

Query: 190 VLFHLNNEKSFSYDLF-YTGDGLPVS-FLKIYEDNKIIESEKFHLDV 234
+ F NN +F YD+ GD S +L +Y DNK ++S+ ++V
Sbjct: 213 IKFIENNGNTFWYDMMPAPGDKFDQSKYLMMYNDNKTVDSKSVKIEV 259


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CH52_RS10465  BACTRLTOXIN  155  9e-49  Bacterial toxin signature.
		>BACTRLTOXIN#Bacterial toxin signature.

Length = 266

Score = 155 bits (394), Expect = 9e-49
Identities = 76/265 (28%), Positives = 124/265 (46%), Gaps = 21/265 (7%)

Query: 2 RLFYIAAIII-TLLCLINNNYVNAEV----DKKDLKKKSDLDSSKLFNLTSYYTDITWQL 56
RLF I+I L+ +I+ V AE DL K S+ + + N+ Y D +
Sbjct: 4 RLFISRVILIFALILVISTPNVLAESQPDPMPDDLHKSSEF-TGTMGNMKYLYDDH--YV 60

Query: 57 DESNKISTDQLLNNTIILKNIDISVLKTSSLKVEFNSSDLANQFKGKNIDIYGLYFGNKC 116
+ S D+ L + +I D + +K E + DLA ++K + +D+YG + C
Sbjct: 61 SATKVKSVDKFLAHDLIYNISDKKLKNYDKVKTELLNEDLAKKYKDEVVDVYGSNYYVNC 120

Query: 117 -------VGLTEEKTSCLYGGVTIHDGNQLDEEKV--IGVNVFKDGVQQEGFVIKTKKAK 167
VG +C+YGG+T H+GN D + + V V+++ F ++T K
Sbjct: 121 YFSSKDNVGKVTGGKTCMYGGITKHEGNHFDNGNLQNVLVRVYENKRNTISFEVQTDKKS 180

Query: 168 VTVQELDTKVRFKLENLYKIYNKDTGNIQKGCIFFHSHNHQDQSFYYDLYNVKGSVG--A 225
VT QELD K R L N +Y ++ + G I F +N +F+YD+ G +
Sbjct: 181 VTAQELDIKARNFLINKKNLYEFNSSPYETGYIKFIENN--GNTFWYDMMPAPGDKFDQS 238

Query: 226 EFFQFYSDNRTVSSSNYHIDVFLYK 250
++ Y+DN+TV S + I+V L
Sbjct: 239 KYLMMYNDNKTVDSKSVKIEVHLTT 263


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
CH52_RS10470  BACTRLTOXIN  195  4e-64  Bacterial toxin signature.
		>BACTRLTOXIN#Bacterial toxin signature.

Length = 266

Score = 195 bits (497), Expect = 4e-64
Identities = 109/261 (41%), Positives = 155/261 (59%), Gaps = 11/261 (4%)

Query: 4 LSTVIIILILEIVFHNMN-YVNAQPDPKLDELNKVSDYKNNKGTMGNVMNLYTSPPVEGR 62
+S VI+I L +V N +QPDP D+L+K S++ GTMGN+ LY V
Sbjct: 7 ISRVILIFALILVISTPNVLAESQPDPMPDDLHKSSEFT---GTMGNMKYLYDDHYVSAT 63

Query: 63 GVINSRQFLSHDLIFPI---EYKSYNEVKTELENTELANNYKDKKVDIFGVPYFYTCIIP 119
V + +FL+HDLI+ I + K+Y++VKTEL N +LA YKD+ VD++G Y+ C
Sbjct: 64 KVKSVDKFLAHDLIYNISDKKLKNYDKVKTELLNEDLAKKYKDEVVDVYGSNYYVNCYFS 123

Query: 120 KSEPDINQNFGGCCMYGGLTF---NSSENERDKLITVQVTIDNRQSLGFTITTNKNMVTI 176
+ G CMYGG+T N +N + + V+V + R ++ F + T+K VT
Sbjct: 124 SKDNVGKVTGGKTCMYGGITKHEGNHFDNGNLQNVLVRVYENKRNTISFEVQTDKKSVTA 183

Query: 177 QELDYKARHWLTKEKKLYEFDGSAFESGYIKFTEKNNTSFWFDLFPKKELVPFVPYKFLN 236
QELD KAR++L +K LYEF+ S +E+GYIKF E N +FW+D+ P F K+L
Sbjct: 184 QELDIKARNFLINKKNLYEFNSSPYETGYIKFIENNGNTFWYDMMPAPGDK-FDQSKYLM 242

Query: 237 IYGDNKVVDSKSIKMEVFLNT 257
+Y DNK VDSKS+K+EV L T
Sbjct: 243 MYNDNKTVDSKSVKIEVHLTT 263


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS10495  BICOMPNTOXIN  433    e-156    Staphylococcal bi-component toxin signature.
		>BICOMPNTOXIN#Staphylococcal bi-component toxin signature.

Length = 315

Score = 433 bits (1116), Expect = e-156
Identities = 214/318 (67%), Positives = 256/318 (80%), Gaps = 10/318 (3%)

Query: 1 MFKKKMLAATLSVGLIAPLASPIQE-SRANTNIENIGDGA--EVIKRTEDVSSKKWGVTQ 57
M K K+L TLSV L+APLA+P+ E ++A + E+IG G+ E+IKRTED +S KWGVTQ
Sbjct: 1 MLKNKILTTTLSVSLLAPLANPLLENAKAANDTEDIGKGSDIEIIKRTEDKTSNKWGVTQ 60

Query: 58 NVQFDFVKDKKYNKDALIVKMQGFINSRTSFSDVKGSGYELTKRMIWPFQYNIGLTTKDP 117
N+QFDFVKDKKYNKDALI+KMQGFI+SRT++ + K + + K M WPFQYNIGL T D
Sbjct: 61 NIQFDFVKDKKYNKDALILKMQGFISSRTTYYNYKKTNH--VKAMRWPFQYNIGLKTNDK 118

Query: 118 NVSLINYLPKNKIETTDVGQTLGYNIGGNFQSAPSIGGNGSFNYSKTISYTQKSYVSEVD 177
VSLINYLPKNKIE+T+V QTLGYNIGGNFQSAPS+GGNGSFNYSK+ISYTQ++YVSEV+
Sbjct: 119 YVSLINYLPKNKIESTNVSQTLGYNIGGNFQSAPSLGGNGSFNYSKSISYTQQNYVSEVE 178

Query: 178 KQNSKSVKWGVKANEFVTPDGKKSAHDRYLFVQSPNGPTGSAREYFAPDNQLPPLVQSGF 237
+QNSKSV WGVKAN F T G+KSA D LFV + R+YF PD++LPPLVQSGF
Sbjct: 179 QQNSKSVLWGVKANSFATESGQKSAFDSDLFVGYKPH-SKDPRDYFVPDSELPPLVQSGF 237

Query: 238 NPSFITTLSHEKGSSDTSEFEISYGRNLDITYA----TLFPRTGIYAERKHNAFVNRNFV 293
NPSFI T+SHEKGSSDTSEFEI+YGRN+D+T+A T + + + R HNAFVNRN+
Sbjct: 238 NPSFIATVSHEKGSSDTSEFEITYGRNMDVTHAIKRSTHYGNSYLDGHRVHNAFVNRNYT 297

Query: 294 VRYEVNWKTHEIKVKGHN 311
V+YEVNWKTHEIKVKG N
Sbjct: 298 VKYEVNWKTHEIKVKGQN 315


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS10500  BICOMPNTOXIN  396    e-141    Staphylococcal bi-component toxin signature.
		>BICOMPNTOXIN#Staphylococcal bi-component toxin signature.

Length = 315

Score = 396 bits (1020), Expect = e-141
Identities = 96/329 (29%), Positives = 177/329 (53%), Gaps = 24/329 (7%)

Query: 1 MKMKKLVKSSVASSIALLLLSNTVDAAQHITPVSEKKVDDKITLYKTTATSDNDKLNISQ 60
M K++ ++++ S+ L + ++ A+ + I + K T ++K ++Q
Sbjct: 1 MLKNKILTTTLSVSLLAPLANPLLENAKAANDTEDIGKGSDIEIIKRTEDKTSNKWGVTQ 60

Query: 61 ILTFNFIKDKSYDKDTLVLKAAGNINSGYKKPNPKDYNYSQ-FYWGGKYNVSVSSESNDA 119
+ F+F+KDK Y+KD L+LK G I+S N K N+ + W +YN+ + + +
Sbjct: 61 NIQFDFVKDKKYNKDALILKMQGFISSRTTYYNYKKTNHVKAMRWPFQYNIGLKTN-DKY 119

Query: 120 VNVVDYAPKNQNEEFQVQQTLGYSYGGDINISNGLSGGLNGSKSFSETINYKQESYRTTI 179
V++++Y PKN+ E V QTLGY+ GG+ + L G NGS ++S++I+Y Q++Y + +
Sbjct: 120 VSLINYLPKNKIESTNVSQTLGYNIGGNFQSAPSLGG--NGSFNYSKSISYTQQNYVSEV 177

Query: 180 DRKTNHKSIGWGVEAHKIMNNGWGPYGRDSYDPTYGNELFLGGRQSSSNAGQNFLPTHQM 239
+++ N KS+ WGV+A+ + ++LF+G + S + F+P ++
Sbjct: 178 EQQ-NSKSVLWGVKANSFAT-------ESGQKSAFDSDLFVGYKPHSKDPRDYFVPDSEL 229

Query: 240 PLLARGNFNPEFISVLSHKQNDTKKSKIKVTYQREMD---------RYTNQWNRLHWIGN 290
P L + FNP FI+ +SH++ + S+ ++TY R MD Y N + H + N
Sbjct: 230 PPLVQSGFNPSFIATVSHEKGSSDTSEFEITYGRNMDVTHAIKRSTHYGNSYLDGHRVHN 289

Query: 291 NYKNQNTVTFTSTYEVDWQNHTVKLIGTD 319
+ N+N +T YEV+W+ H +K+ G +
Sbjct: 290 AFVNRN---YTVKYEVNWKTHEIKVKGQN 315


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS10530  V8PROTEASE    138    1e-41    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 138 bits (349), Expect = 1e-41
Identities = 64/212 (30%), Positives = 101/212 (47%), Gaps = 18/212 (8%)

Query: 36 EKNVKEITDATKAPYNSVVAFA--------GGTGVVVGKNTIVTNKHIAKSNDIFKNRVA 87
+ +ITD T Y V +GVVVGK+T++TNKH+ + + +
Sbjct: 73 NNDRHQITDTTNGHYAPVTYIQVEAPTGTFIASGVVVGKDTLLTNKHVVDATHGDPHALK 132

Query: 88 AHYS---SKGKGGGNYDVKDIVEYPGKEDLAIVHVHETSTEGLNFNKNVSYTKFAEGA-- 142
A S G + + I +Y G+ DLAIV + + + + V + A
Sbjct: 133 AFPSAINQDNYPNGGFTAEQITKYSGEGDLAIVK-FSPNEQNKHIGEVVKPATMSNNAET 191

Query: 143 KAKDRISVIGYPKGAQTKYKMFESTGTINHISGTFIEFDAYAQPGNSGSPVLNSKHELIG 202
+ I+V GYP G + M+ES G I ++ G +++D GNSGSPV N K+E+IG
Sbjct: 192 QVNQNITVTGYP-GDKPVATMWESKGKITYLKGEAMQYDLSTTGGNSGSPVFNEKNEVIG 250

Query: 203 ILYAGSGKDESEKNFGVYFTPQLKEFIQNNIE 234
I + G +E N V+ ++ F++ NIE
Sbjct: 251 IHWGGVP---NEFNGAVFINENVRNFLKQNIE 279


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS10535  V8PROTEASE    177    2e-56    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 177 bits (450), Expect = 2e-56
Identities = 65/230 (28%), Positives = 108/230 (46%), Gaps = 29/230 (12%)

Query: 29 EVQQTAKA-----ENNVTKIQDTNIFPYTGVVAFKS--------ATGFVVGKNTILTNKH 75
++Q A N+ +I DT Y V + A+G VVGK+T+LTNKH
Sbjct: 60 PLEQREHANVILPNNDRHQITDTTNGHYAPVTYIQVEAPTGTFIASGVVVGKDTLLTNKH 119

Query: 76 V-SKNYKVGDRITAHP---NSDKGNGGIYSIKKIINYPGKEDVSVIQVEERAIERGPKGF 131
V + + A P N D G ++ ++I Y G+ D+++++ +
Sbjct: 120 VVDATHGDPHALKAFPSAINQDNYPNGGFTAEQITKYSGEGDLAIVKFSPNEQNK----- 174

Query: 132 NFNDNVTPFKYAAGA--KAGERIKVIGYPHPYKNKYVLYESTGPVMSVEGSSIVYSAHTE 189
+ + V P + A + + I V GYP K ++ES G + ++G ++ Y T
Sbjct: 175 HIGEVVKPATMSNNAETQVNQNITVTGYPGD-KPVATMWESKGKITYLKGEAMQYDLSTT 233

Query: 190 SGNSGSPVLNSNNELVGIHFASDVKNDDNRNAYGVYFTPEIKKFIAENID 239
GNSGSPV N NE++GIH+ V N+ N V+ ++ F+ +NI+
Sbjct: 234 GGNSGSPVFNEKNEVIGIHWGG-VPNEFNG---AVFINENVRNFLKQNIE 279


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS10540  V8PROTEASE    178    7e-57    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 178 bits (452), Expect = 7e-57
Identities = 64/217 (29%), Positives = 106/217 (48%), Gaps = 23/217 (10%)

Query: 37 EKNVTQVKDTNNFPYNGVVSFK--------DATGFVIGKNTIITNKHV-SKDYKVGDRIT 87
+ Q+ DT N Y V + A+G V+GK+T++TNKHV + +
Sbjct: 73 NNDRHQITDTTNGHYAPVTYIQVEAPTGTFIASGVVVGKDTLLTNKHVVDATHGDPHALK 132

Query: 88 AHP---NGDKGNGGIYKIKSISDYPGDEDISVMNIEEQAVERGPKGFNFNENVQAFNFAK 144
A P N D G + + I+ Y G+ D++++ + + E V+ +
Sbjct: 133 AFPSAINQDNYPNGGFTAEQITKYSGEGDLAIVKFSPNEQNK-----HIGEVVKPATMSN 187

Query: 145 DA--KVDDKIKVIGYPLPAQNSFKQFESTGTIKRIKDNILNFDAYIEPGNSGSPVLNSNN 202
+A +V+ I V GYP +ES G I +K + +D GNSGSPV N N
Sbjct: 188 NAETQVNQNITVTGYPGDK-PVATMWESKGKITYLKGEAMQYDLSTTGGNSGSPVFNEKN 246

Query: 203 EVIGVVYGGIGKIGSEYNGAVYFTPQIKDFIQKHIEQ 239
EVIG+ +GG + +E+NGAV+ +++F++++IE
Sbjct: 247 EVIGIHWGG---VPNEFNGAVFINENVRNFLKQNIED 280


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS10545  V8PROTEASE    112    1e-31    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 112 bits (281), Expect = 1e-31
Identities = 58/227 (25%), Positives = 100/227 (44%), Gaps = 26/227 (11%)

Query: 30 IQQTAKA-----ENSVKLITNTNVAPYSGVTWMGA--------GTGFVVGNHTIITNKHV 76
++Q A N IT+T Y+ VT++ +G VVG T++TNKHV
Sbjct: 61 LEQREHANVILPNNDRHQITDTTNGHYAPVTYIQVEAPTGTFIASGVVVGKDTLLTNKHV 120

Query: 77 TYHM-KVGDEIKAHPNGFY--NNGGGLYKVTKIVDYPGKEDIAVVQVEEKSTQPKGRKFK 133
+KA P+ N G + +I Y G+ D+A+V+ + +
Sbjct: 121 VDATHGDPHALKAFPSAINQDNYPNGGFTAEQITKYSGEGDLAIVKFSP---NEQNKHIG 177

Query: 134 DFTSKFNIA--SEAKENEPISVIGYPNPNGNKLQMYESTGKVLSVNGNIVTSDAVVQPGS 191
+ ++ +E + N+ I+V GYP + M+ES GK+ + G + D G+
Sbjct: 178 EVVKPATMSNNAETQVNQNITVTGYP-GDKPVATMWESKGKITYLKGEAMQYDLSTTGGN 236

Query: 192 SGSPILNSKREAIGVMYASDKPTGESTRSFAVYFSPEIKKFIADNLD 238
SGSP+ N K E IG+ + AV+ + ++ F+ N++
Sbjct: 237 SGSPVFNEKNEVIGIHWGGVPNEFNG----AVFINENVRNFLKQNIE 279


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS10550  V8PROTEASE    115    6e-33    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 115 bits (290), Expect = 6e-33
Identities = 60/227 (26%), Positives = 103/227 (45%), Gaps = 26/227 (11%)

Query: 30 IQQTAKA-----ENTVKQITNTNVAPYSGVTWMGA--------GTGFVVGNHTIITNKHV 76
++Q A N QIT+T Y+ VT++ +G VVG T++TNKHV
Sbjct: 61 LEQREHANVILPNNDRHQITDTTNGHYAPVTYIQVEAPTGTFIASGVVVGKDTLLTNKHV 120

Query: 77 TYHM-KVGDEIKAHPNGFY--NNGGGLYKVTKIVDYPGKEDIAVVQVEEKSTQPKGRKFK 133
+KA P+ N G + +I Y G+ D+A+V+ + +
Sbjct: 121 VDATHGDPHALKAFPSAINQDNYPNGGFTAEQITKYSGEGDLAIVKFSP---NEQNKHIG 177

Query: 134 DFTSKFNIA--SEAKENEPISVIGYPNPNGNKLQMYESTGKVLSVNGNIVSSDAIIQPGS 191
+ ++ +E + N+ I+V GYP + M+ES GK+ + G + D G+
Sbjct: 178 EVVKPATMSNNAETQVNQNITVTGYP-GDKPVATMWESKGKITYLKGEAMQYDLSTTGGN 236

Query: 192 SGSPILNSKHEAIGVIYAGNKPSGESTRGFAVYFSPEIKKFIADNLD 238
SGSP+ N K+E IG+ + G AV+ + ++ F+ N++
Sbjct: 237 SGSPVFNEKNEVIGIHWGGVPNEF----NGAVFINENVRNFLKQNIE 279


23  CH52_RS14710  CH52_RS10715  Y  N  N  Genomic Island
LocusTag      DNBias  CDNBias  %GCBias    Product
CH52_RS14710  1       14       -3.134333  hypothetical protein
CH52_RS10690  3       15       -3.128884  competence protein ComK
CH52_RS10695  1       16       -3.417160  RNA polymerase sigma factor SigS
CH52_RS10700  1       14       -4.168849  hypothetical protein
CH52_RS10705  1       13       -3.762635  N-acetylglucosaminidase
CH52_RS10710  -1      16       -4.510669  arsenite efflux transporter membrane subunit
CH52_RS10715  0       19       -3.353034  metalloregulator ArsR/SmtB family transcription
24  CH52_RS10795  CH52_RS10910  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
CH52_RS10795  2       8        1.216423   rhodanese-like domain-containing protein
CH52_RS10800  1       8        1.373067   LPXTG-anchored repetitive surface protein SasC
CH52_RS10805  1       8        0.886162   NAD(P)/FAD-dependent oxidoreductase
CH52_RS10810  1       9        0.571610   polysaccharide biosynthesis protein
CH52_RS10820  2       9        0.455835   rRNA pseudouridine synthase
CH52_RS10825  -1      11       -0.503292  YtxH domain-containing protein
CH52_RS10830  -2      12       -0.835270  Mn(2+)-dependent dipeptidase Sapep
CH52_RS10835  -2      11       -0.757965  D-amino-acid transaminase
CH52_RS10840  -1      12       -0.056760  phosphotransferase family protein
CH52_RS10845  1       12       -0.094300  tRNA (guanosine(46)-N7)-methyltransferase TrmB
CH52_RS10850  0       12       -0.328654  MBL fold metallo-hydrolase
CH52_RS10855  2       11       -0.217986  PepSY domain-containing protein
CH52_RS10860  1       12       -0.959276  M42 family metallopeptidase
CH52_RS10865  -1      14       -1.018186  thioredoxin family protein
CH52_RS10870  0       11       -0.547451  DUF1444 domain-containing protein
CH52_RS10880  0       13       -0.758672  DUF4479 domain-containing tRNA-binding protein
CH52_RS10885  3       13       -0.703480  DNA translocase FtsK
CH52_RS10890  4       12       -0.556925  UDP-N-acetylmuramate--L-alanine ligase
CH52_RS10900  4       17       -1.138300  DUF948 domain-containing protein
CH52_RS10905  5       14       -0.629508  hypothetical protein
CH52_RS10910  2       12       -0.259574  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS10820  IGASERPTASE   37     0.001    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 37.4 bits (86), Expect = 0.001
Identities = 48/321 (14%), Positives = 90/321 (28%), Gaps = 18/321 (5%)

Query: 1323 QNQTNDQVDTTTNQAVNAIDNVEAEVVIKPKAIADIEKAVKEKQQQIDNSLDSTDNEKEV 1382
+ N VDTT N I V + IA +++A S + +
Sbjct: 985 VEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENS 1044

Query: 1383 ASQALAKEKEKALAAIDQAQTNSQVNQAATNGVSAIKIIQPETKVKPAAREKINQKANEL 1442
++ EK + A AQ +A + K E +
Sbjct: 1045 KQESKTVEKNEQDATETTAQNREVAKEA-----------KSNVKANTQTNEVAQSGSETK 1093

Query: 1443 RAKINQDKEATAEERQ----VALDKINEFVNQAMTDITNNRTNQQVDDTTSQALDSIALV 1498
+ + KE E++ V +K E ++ V A ++ V
Sbjct: 1094 ETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTV 1153

Query: 1499 APEHIVRAAARDAVKQQYEAKKQEIEQAEHATDEEKQVALNQLANNEKLALQNINQAVTN 1558
+ A +Q AK+ + T+ N + N + Q N
Sbjct: 1154 NIKEPQSQTNTTADTEQ-PAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVN 1212

Query: 1559 NDVKRVETNGIATLKGVQPHIVIKPEAQQAIKATAENQVESIKDTPHATVDELDEANQLI 1618
++ N PH V +P + + + +A + + Q +
Sbjct: 1213 SESSNKPKNRHRRSVRSVPHNV-EPATTSSNDRSTVALCDLTSTNTNAVLSDARAKAQFV 1271

Query: 1619 S-DTLKQAQQEIENTNQDAAV 1638
+ + K Q I +
Sbjct: 1272 ALNVGKAVSQHISQLEMNNEG 1292


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS10910  IGASERPTASE   40     1e-05    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 40.4 bits (94), Expect = 1e-05
Identities = 38/263 (14%), Positives = 95/263 (36%), Gaps = 22/263 (8%)

Query: 52 QKADDLKVKEQELSQKFEERKTQLEETVAYTKERVEGFLNKSKNEQAALKAQQAAIKEEA 111
Q++ ++ EQ+ ++ + + +E + K + +++ + Q KE A
Sbjct: 1046 QESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQT-NEVAQSGSETKETQTTETKETA 1104

Query: 112 SANNLSDTSQEAQEIQEAKREAQAEADKSVAVSNEESKASALKVQQAAIKEEASANNLSD 171
+ E + +A+ E +K+ V S+ S + Q ++ +A +D
Sbjct: 1105 T--------------VEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPAREND 1150

Query: 172 TSQEAQEIQEAKKEAQAETDKSAAVSNEEPKAVALKAQQAAIKEEASANNLSDTSQEAQE 231
+ +E Q ++ A+T++ A ++ + ++ N + T Q
Sbjct: 1151 PTVNIKEPQ-SQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQP 1209

Query: 232 VQEAKKEAQAEKDSDTSTKDASAAKVEVSKPESQAERLANAAKQKQAKLTPGSKESQLTE 291
+E + + + + E + + LT + + L++
Sbjct: 1210 ------TVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCDLTSTNTNAVLSD 1263

Query: 292 ALFAEKPVAKNDLKEIPQLVTKK 314
A + VA N K + Q +++
Sbjct: 1264 ARAKAQFVALNVGKAVSQHISQL 1286



Score = 38.1 bits (88), Expect = 8e-05
Identities = 47/332 (14%), Positives = 97/332 (29%), Gaps = 17/332 (5%)

Query: 101 KAQQAAIKEEASANNLSDTSQEAQEIQEAKREAQAEADKSVAVSNEESKASALKVQQAAI 160
K Q + N + + EA S+ + + +
Sbjct: 987 KRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQ 1046

Query: 161 KEEASANNLSDTSQEAQEIQEAKKEAQAET-DKSAAVSNEEPKAVALKAQQAAIKEEASA 219
+ + N D ++ + +E KEA++ + + + + Q KE A
Sbjct: 1047 ESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETA-- 1104

Query: 220 NNLSDTSQEAQEVQEAKKEAQAEKDSDTSTKDASAAKVEVSKPESQAERLANAAKQKQAK 279
+ E +E + + E E TS + E +P+++ R + +
Sbjct: 1105 ------TVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEP 1158

Query: 280 LTPGSKESQLTEALFAEKPVAKNDLKEIPQLVTKKNDVSETETVNIDNKDTVKQKEAKFE 339
+ + + TE E V N V E N T ++
Sbjct: 1159 QSQTNTTAD-TEQPAKETSSNVEQPVTESTTVNTGNSVVEN-PENTTPATTQPTVNSESS 1216

Query: 340 NGVITRKADEKTTNNTAVDKKSGKQSKKTTPSNKRNASKASTNKTSGQKKQHNKKSSQGA 399
N R + V+ + + ++T + S + S + ++
Sbjct: 1217 NKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCDLTSTNTNAVLS------DARAKAQF 1270

Query: 400 KKQSSSSKLTQKNNQTSNKNSKTTNAKSSNAS 431
+ ++Q +Q N N SN S
Sbjct: 1271 VALNVGKAVSQHISQLEMNNEGQYNVWVSNTS 1302



Score = 36.6 bits (84), Expect = 2e-04
Identities = 36/222 (16%), Positives = 72/222 (32%), Gaps = 3/222 (1%)

Query: 72 KTQLEETVAYTKERVEGFLNKSKNEQAALKAQQAAIKEEASANNLSD-TSQEAQEIQEAK 130
+ EE + V + +E A+ + + + N D T AQ + AK
Sbjct: 1011 PSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAK 1070

Query: 131 REAQAEADKSVAVSNEESKASALKVQQAAIKEEASANNLSDTSQEAQEIQEAKKEAQAET 190
+ +S + + Q KE A+ E ++ QE K +
Sbjct: 1071 EAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVS 1130

Query: 191 DKSAAVSNEEPKAVALKAQQAA--IKEEASANNLSDTSQEAQEVQEAKKEAQAEKDSDTS 248
K +P+A + IKE S N + +++ + + E + + +
Sbjct: 1131 PKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVN 1190

Query: 249 TKDASAAKVEVSKPESQAERLANAAKQKQAKLTPGSKESQLT 290
T ++ E + P + + + + K S S
Sbjct: 1191 TGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPH 1232


25  CH52_RS11600  CH52_RS11685  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
CH52_RS11600  -1      15       -3.060914  divalent metal cation transporter
CH52_RS11605  -2      15       -3.119396  Fic family protein
CH52_RS15005  -1      13       -1.889877  hypothetical protein
CH52_RS11620  0       12       -1.385195  5'-methylthioadenosine/S-adenosylhomocysteine
CH52_RS11630  -1      12       -0.770195  YqeG family HAD IIIA-type phosphatase
CH52_RS11635  0       12       -0.965023  ribosome biogenesis GTPase YqeH
CH52_RS11640  -1      11       -0.946141  shikimate dehydrogenase
CH52_RS11645  -1      14       -1.842534  ribosome assembly RNA-binding protein YhbY
CH52_RS11650  2       16       -2.246700  nicotinate (nicotinamide) nucleotide
CH52_RS11655  2       16       -1.817986  bis(5'-nucleosyl)-tetraphosphatase (symmetrical)
CH52_RS11660  -1      14       -3.613232  ribosome silencing factor
CH52_RS11665  -1      14       -4.349911  class I SAM-dependent methyltransferase
CH52_RS11670  -1      15       -4.523935  ComEA family DNA-binding protein
CH52_RS11675  -1      11       -3.263909  ComE operon protein 2
CH52_RS11680  -1      11       -2.834945  DNA internalization-related competence protein
CH52_RS11685  -2      13       -3.290308  DNA polymerase III subunit delta
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS11600  56KDTSANTIGN  32     0.006    Rickettsia 56kDa type-specific antigen protein sign...
		>56KDTSANTIGN#Rickettsia 56kDa type-specific antigen protein signature.

Length = 533

Score = 31.9 bits (72), Expect = 0.006
Identities = 19/55 (34%), Positives = 27/55 (49%)

Query: 205 GTVGGYITFAGAHRILDSGIKGKQYLPFVNQSAIAGILTTGIMRTLLFLAVLGVV 259
G VGG IT A + R+ + +GK++L G L G+ F A LGV+
Sbjct: 40 GVVGGMITGAESTRLDSTDSEGKKHLSLTTGLPFGGTLAAGMTIAPGFRAELGVM 94


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS11675  IGASERPTASE   28     0.027    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 28.5 bits (63), Expect = 0.027
Identities = 33/190 (17%), Positives = 71/190 (37%), Gaps = 27/190 (14%)

Query: 40 QDDYTSRNFENKDTALKQSTSE---------NNSLSKLEDVQVKDGDNSKNKGPVYVDVK 90
+DD+ +RNF+ + + S ++++ QV G + + V D
Sbjct: 733 EDDWINRNFKATTMNVTGNASLYSGRNVANITSNITASNKAQVHIGYKTGDTVCVRSDYT 792

Query: 91 GAVKHPNVYKMTSKDRVVDLLDKAQLLDDADVSRINLSEKLTDQKMIFIPHKGQKNVEPQ 150
G V D+ ++ + L +V+ + + + +F + + N + +
Sbjct: 793 GYV---TCTTDKLSDKALNSFNPTNL--RGNVNLTESANFVLGKANLFGTIQSRGNSQVR 847

Query: 151 IEVNS-------VHVKNGNTNNTKVNLNTASVSELMSVPGVGQAKANAIVEYRNQQGAFQ 203
+ NS V + N ++LN+A S +V N++ + G+F
Sbjct: 848 LTENSHWHLTGNSDVHQLDLANGHIHLNSADNSN--NVTKYNTLTVNSL----SGNGSFY 901

Query: 204 EIDDLKKVKG 213
+ DL +G
Sbjct: 902 YLTDLSNKQG 911


26  CH52_RS11870  CH52_RS11920  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
CH52_RS11870  -1      11       -3.183972  5-formyltetrahydrofolate cyclo-ligase
CH52_RS11875  0       14       -3.013840  rhomboid family intramembrane serine protease
CH52_RS11880  -1      12       -2.748527  YqgQ family protein
CH52_RS11885  1       15       -2.604051  ROK family glucokinase
CH52_RS11890  1       14       -2.520847  MTH1187 family thiamine-binding protein
CH52_RS11895  0       17       -4.022189  MBL fold metallo-hydrolase
CH52_RS11900  1       20       -4.240512  GspE/PulE family protein
CH52_RS11905  2       21       -5.241577  type II secretion system F family protein
CH52_RS11910  3       23       -6.524292  prepilin-type N-terminal cleavage/methylation
CH52_RS11915  4       22       -6.319376  type II secretion system GspH family protein
CH52_RS11920  3       17       -5.036862  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS11880  TCRTETA       33     0.002    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 33.3 bits (76), Expect = 0.002
Identities = 29/170 (17%), Positives = 54/170 (31%), Gaps = 51/170 (30%)

Query: 241 MLTVYFIAGLFGN--------FVSLSFNTTTISVGASGAIFGLIGSIFAMMY---VSKTF 289
++ V+FI L G F F+ ++G S A FG++ S+ M V+
Sbjct: 215 LMAVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARL 274

Query: 290 NKK----------MLGQLLIA-----------LVILVGVSLFMS------NINIVAHIGG 322
++ G +L+A +V+L + M + + G
Sbjct: 275 GERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLASGGIGMPALQAMLSRQVDEERQG 334

Query: 323 FIGGLLITL-----------IGYYYKVNRNIF--WILLIGMLVIFIALQI 359
+ G L L Y + + W + G + + L
Sbjct: 335 QLQGSLAALTSLTSIVGPLLFTAIYAASITTWNGWAWIAGAALYLLCLPA 384


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS11890  PF03309       30     0.012    Bvg accessory factor
		>PF03309#Bvg accessory factor

Length = 271

Score = 29.7 bits (67), Expect = 0.012
Identities = 32/154 (20%), Positives = 51/154 (33%), Gaps = 37/154 (24%)

Query: 5 ILAADVGGTTCKLGIFTPELEQ---LHKWSIHTD---TSDSTGYTLLKGIYDSFVEKVNE 58
+LA DV T +G+ + + + +W I T+ T+D + G+
Sbjct: 2 LLAIDVRNTHTVVGLISGSGDHAKVVQQWRIRTEPEVTADELA-LTIDGLI--------- 51

Query: 59 NNYNFSNVLGVGIG--VPGPVDFEKGTVNGAVNLYWPE------KVNVREIFEQFVDCPV 110
+ + G VP V E V + YWP + VR VD P
Sbjct: 52 -GDDAERLTGASGLSTVP-SVLHE---VRVMLEQYWPNVPHVLIEPGVRTGIPLLVDNPK 106

Query: 111 YVDND--ANIAALGEKHKGAGEGADDVVAITLGT 142
V D N A K+ + + G+
Sbjct: 107 EVGADRIVNCLAAYHKYGT------AAIVVDFGS 134


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS11900  SHIGARICIN    27     0.039    Ribosome inactivating protein family signature.
		>SHIGARICIN#Ribosome inactivating protein family signature.

Length = 289

Score = 27.5 bits (61), Expect = 0.039
Identities = 20/99 (20%), Positives = 38/99 (38%), Gaps = 11/99 (11%)

Query: 82 DFLKDPVKNGADKFKQYGLPIITSKVTPEK-------LNEGSTEIE-GFKFNVLHTPGHS 133
F+ + K + K Y +P++ S + + N I ++ G+
Sbjct: 39 VFISNLRKALPYERKLYDIPLLRSTLPGSQRYALIHLTNYADETISVAIDVTNVYVMGYR 98

Query: 134 PGSLTYVFDEFAVVG--DTLFNNGIGRTDL-YKGDYETL 169
G +Y F+E + +F + + L Y G+YE L
Sbjct: 99 AGDTSYFFNEASATEAAKYVFKDAKRKVTLPYSGNYERL 137


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS11910  BCTERIALGSPF  84     4e-20    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 84.1 bits (208), Expect = 4e-20
Identities = 65/347 (18%), Positives = 137/347 (39%), Gaps = 6/347 (1%)

Query: 14 KKRQLSKAQQIDLLSNLCNLLKYGFTLYQSFQFLNLQMTYKN-KQLGTTILSEISNGAPC 72
+K +LS + L L L+ L ++ + Q + QL + S++ G
Sbjct: 61 RKIRLSTSDLALLTRQLATLVAASMPLEEALDAVAKQSEKPHLSQLMAAVRSKVMEGHSL 120

Query: 73 NQIL-SLIGYSDTI-VMQVYLAERFGNIIDVLEETVNYMKVNRKSEQRLLKTLQYPLILV 130
+ G + + V E G++ VL +Y + ++ R+ + + YP +L
Sbjct: 121 ADAMKCFPGSFERLYCAMVAAGETSGHLDAVLNRLADYTEQRQQMRSRIQQAMIYPCVLT 180

Query: 131 SIFIAMIIILNLTVIPQFQQLYTSMNIQLSSFQKTLSFFITSLPTIIVVMLIIVSMLAII 190
+ IA++ IL V+P+ + + M L + L ++ T ML+ + +
Sbjct: 181 VVAIAVVSILLSVVVPKVVEQFIHMKQALPLSTRVLMGMSDAVRTFGPWMLLALLAGFMA 240

Query: 191 MKLIYNNLNMLNKIN-FVMKLPLISGYFQLFKTYFVTNELVLFYKNGITLQSIVDVYINH 249
+++ + ++ LPLI + T L + + + L + + +
Sbjct: 241 FRVMLRQEKRRVSFHRRLLHLPLIGRIARGLNTARYARTLSILNASAVPLLQAMRISGDV 300

Query: 250 SS-DPFRQFLGKYLLTYSEMGYGLPQILEKLKCFKPQLIKFVLQGEKRGKLEVELKLYSQ 308
S D R L E G L + LE+ F P + + GE+ G+L+ L+ +
Sbjct: 301 MSNDYARHRLSLATDAVRE-GVSLHKALEQTALFPPMMRHMIASGERSGELDSMLERAAD 359

Query: 309 ILVKQIEDKAIKQTQFLQPILFLILGLFIVAIYLVIMLPMFQMMQSI 355
++ + +P+L + + ++ I L I+ P+ Q+ +
Sbjct: 360 NQDREFSSQMTLALGLFEPLLVVSMAAVVLFIVLAILQPILQLNTLM 406


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS11915  BCTERIALGSPG  46     9e-10    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 46.4 bits (110), Expect = 9e-10
Identities = 19/76 (25%), Positives = 44/76 (57%), Gaps = 4/76 (5%)

Query: 3 KFLKKTQAFTLIEMLLVLLIISLLLILIIPNI--AKQTAHIQSTGCNAQVKMVNSQIEAY 60
+ K + FTL+E+++V++II +L L++PN+ K+ A Q + + + + ++ Y
Sbjct: 2 RATDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKA--VSDIVALENALDMY 59

Query: 61 ALKHNRNPSSIEDLIA 76
L ++ P++ + L +
Sbjct: 60 KLDNHHYPTTNQGLES 75


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS11920  BCTERIALGSPH  40     7e-07    Bacterial general secretion pathway protein H signa...
		>BCTERIALGSPH#Bacterial general secretion pathway protein H signature.

Length = 170

Score = 39.9 bits (93), Expect = 7e-07
Identities = 14/79 (17%), Positives = 38/79 (48%), Gaps = 4/79 (5%)

Query: 9 KQSAFTMIEMLVVMMLISIFLLLTMTSKGLSNLRVIDDEA-NIISFITELNYIKSQAIAN 67
+Q FT++EM+++++L+ + + + + S D A + F +L +++ + +
Sbjct: 2 RQRGFTLLEMMLILLLMGVSAGMVLLAFPAS---RDDSAAQTLARFEAQLRFVQQRGLQT 58

Query: 68 QGYINVRFYENSDTIKVIE 86
+ V + + V+E
Sbjct: 59 GQFFGVSVHPDRWQFLVLE 77


27  CH52_RS12465  CH52_RS12585  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
CH52_RS12465  9       11       2.108784   alanine dehydrogenase
CH52_RS12470  9       11       2.155376   bifunctional threonine ammonia-lyase/L-serine
CH52_RS12475  9       11       2.059419   amino acid permease
CH52_RS12480  9       11       2.065815   multidrug efflux MFS transporter NorB
CH52_RS12485  10      11       2.102069   hyperosmolarity resistance protein Ebh
CH52_RS12490  9       11       2.184751   ribonuclease HI family protein
CH52_RS12495  1       15       0.049451   zinc-finger domain-containing protein
CH52_RS12510  -1      12       0.199703   NifU N-terminal domain-containing protein
CH52_RS12515  -3      12       0.971454   virulence factor
CH52_RS12520  -1      12       0.417400   BrxA/BrxB family bacilliredoxin
CH52_RS12525  -1      12       0.673647   thymidylate synthase
CH52_RS12530  0       11       0.244222   dihydrofolate reductase
CH52_RS12535  1       12       -0.111062  fatty acid kinase binding subunit FakB2
CH52_RS12540  2       14       -0.151452  peptide-methionine (S)-S-oxide reductase MsrA
CH52_RS12545  2       13       -1.039494  peptide-methionine (R)-S-oxide reductase MsrB
CH52_RS12550  1       15       -0.679065  PTS glucose transporter subunit IIA
CH52_RS12555  1       16       -1.804160  YozE family protein
CH52_RS12560  0       15       -1.759936  S41 family peptidase
CH52_RS12565  0       15       -2.261781  GNAT family N-acetyltransferase
CH52_RS12575  0       17       -2.564002  undecaprenyldiphospho-muramoylpentapeptide
CH52_RS12580  -1      14       -2.266053  phosphatase PAP2 family protein
CH52_RS12585  1       13       -3.273817  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS12485  TCRTETB       118    4e-31    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 118 bits (296), Expect = 4e-31
Identities = 97/414 (23%), Positives = 177/414 (42%), Gaps = 18/414 (4%)

Query: 12 NNKLLIGIVLSVITFWLFAQSLVNVVPILEDSFNTDIGTVNIAVSITALFSGMFVVGAGG 71
+N++LI + + L L +P + + FN + N + L + G
Sbjct: 12 HNQILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGK 71

Query: 72 LADKYGRIKLTNIGIILNILGSLLIIIS-NIPLLLIIGRLIQGLSAACIMPATLSIIKSY 130
L+D+ G +L GII+N GS++ + + LLI+ R IQG AA + ++ Y
Sbjct: 72 LSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARY 131

Query: 131 YIGKDRQRALSYWSIGSWGGSGVCSFFGGAVATLLGWRWIFILSIIISLIALFLIKGTPE 190
++R +A G GV GG +A + W ++ ++ +I + FL+K +
Sbjct: 132 IPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPMITIITVPFLMKLLKK 191

Query: 191 TKSKSISLNKFDIKGLVLLVIMLLTLNILITKGSELGVTSLLFITLLAIAIGSFSLFIVL 250
FDIKG++L+ + ++ + T S I+ L +++ SF +F+
Sbjct: 192 EVRIK---GHFDIKGIILMSVGIVFFMLFTTSYS---------ISFLIVSVLSFLIFVKH 239

Query: 251 EKRATNPLIDFKLFKNKAYTGATASNFLLNG-VAGTLIVANTFVQRGLGYSSLQAGSLSI 309
++ T+P +D L KN + ++ G VAG + + ++ S+ + GS+ I
Sbjct: 240 IRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVII 299

Query: 310 TYLVM-VLIMIRVGEKLLQTLGCKKPMLIGTGVLIVGECLISLTFLPEILYVICCIIGYL 368
M V+I +G L+ G + IG L V ++ +FL E II
Sbjct: 300 FPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVS--FLTASFLLETTSWFMTIIIVF 357

Query: 369 FFGLGLGIYATPSTDTAIANAPLEKVGVAAGIYKMASALGGAFGVALSGAVYAI 422
G GL T + ++ ++ G + S L G+A+ G + +I
Sbjct: 358 VLG-GLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLSI 410


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS12490  GPOSANCHOR    46     9e-06    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 45.8 bits (108), Expect = 9e-06
Identities = 49/323 (15%), Positives = 96/323 (29%), Gaps = 9/323 (2%)

Query: 2582 TKVRAAQTKIDQAKALLQNKEDNSQLVTSKNNLQSSVNQVPSTAGMTQQSIDN------- 2634
T +A Q L + +E + N L+ + + + D
Sbjct: 37 TNEVSAVATRSQTDTLEKVQERADKFEIENNTLKLKNSDLSFNNKALKDHNDELTEELSN 96

Query: 2635 YNAKKREAETEITAAQRVIDNGDATAQQISDEKHRVDNALTALNQAKHDLTADTHALEQA 2694
K R+ + ++ I +A + N TA + L A+ AL
Sbjct: 97 AKEKLRKNDKSLSEKASKIQELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALAAR 156

Query: 2695 VQQLNRTGTTTGKKPASITAYNNSIRALQSDLTSAKNSANAIIQKPIRTVQEVQSALTNV 2754
L + + +A ++ A ++ L + + ++ + + + +
Sbjct: 157 KADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTL 216

Query: 2755 NRVNERLTQAINQLVPLADNSALRTAKTKLDEEINKSVTTDGMTQSSIQAYENAKRAGQT 2814
L L T +I ++ E A
Sbjct: 217 EAEKAALAARKADL--EKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMN 274

Query: 2815 ETTNAQNVINNGDATDQQIAAEKTKVEEKYNSLKQAIAGLTPDLAPLQTAKTQLQNDIDQ 2874
+T I +A + AEK +E + L L DL + AK QL+ + +
Sbjct: 275 FSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEAEHQK 334

Query: 2875 PTSTTGMTSASVAAFNDKLSAAR 2897
++ AS + L A+R
Sbjct: 335 LEEQNKISEASRQSLRRDLDASR 357



Score = 41.6 bits (97), Expect = 2e-04
Identities = 56/339 (16%), Positives = 103/339 (30%), Gaps = 24/339 (7%)

Query: 2732 SANAIIQKPIRTVQEVQSALTNVNRVNERLTQAINQLVPLAD-----NSALRTAKTKLDE 2786
+ + T+++VQ N L + L N L + E
Sbjct: 40 VSAVATRSQTDTLEKVQERADKFEIENNTLKLKNSDLSFNNKALKDHNDELTEELSNAKE 99

Query: 2787 EINKSVTTDGMTQSSIQAYENAKRAGQTETTNAQNVINNGDATDQQIAAEKTKVEEKYNS 2846
++ K+ + S IQ E K + A N A + + AEK + +
Sbjct: 100 KLRKNDKSLSEKASKIQELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALAARKAD 159

Query: 2847 LKQAIAGLTPDLAPLQTAKTQLQNDIDQPTSTTGMTSASVAAFNDKLSAARTKIQEIDRV 2906
L++A+ G L+ + A A L A +
Sbjct: 160 LEKALEGAMNFSTADSAKIKTLEAEKAA-------LEARQAELEKALEGAM------NFS 206

Query: 2907 LASHPDVATIRQNVTAANAAKTALDQARNGLTVDKAPLENAKNQLQHSIDTQTSTTGMTQ 2966
A + T+ A A K L++A G L+ + +
Sbjct: 207 TADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELE 266

Query: 2967 DSINAYNAKLTAARNKVQQINQVLAGSPTVDQINTNTSAANQAKSDLDHARQALTPDKAP 3026
++ TA K++ + + + L+ RQ+L D
Sbjct: 267 KALEGAMNFSTADSAKIKTLEA------EKAALEAEKADLEHQSQVLNANRQSLRRDLDA 320

Query: 3027 LQNAKTQLEQSINQPTDTTGMTTASLNAYNQKLQAARQK 3065
+ AK QLE + + ++ AS + + L A+R+
Sbjct: 321 SREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREA 359


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS12565  RTXTOXINA     25     0.022    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 25.3 bits (55), Expect = 0.022
Identities = 10/25 (40%), Positives = 15/25 (60%)

Query: 15 GRHDDKGRLAEEIFDDLAFPKHDDD 39
G +DK LA+ F D+AF + +D
Sbjct: 866 GGKEDKLSLADIDFRDVAFKREGND 890


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS12580  SACTRNSFRASE  32     5e-04    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 32.2 bits (73), Expect = 5e-04
Identities = 33/140 (23%), Positives = 54/140 (38%), Gaps = 19/140 (13%)

Query: 30 EQWDDQYPLLEHFEEDIAKDYLYVLEENDKIYGFIVVDQDQAEWYDDIDWPVNREGAFVI 89
+Q++D + + EE+ +LY LE + G I + N G +I
Sbjct: 48 KQYEDDDMDVSYVEEEGKAAFLYYLE--NNCIGRIKIRS-------------NWNGYALI 92

Query: 90 HRLTGSKEY--KGAATELFNYVIDVVKARGAEVILTDTFALNKPAQGLFAKFGFHKVGEQ 147
+ +K+Y KG T L + I+ K ++ +T +N A +AK F
Sbjct: 93 EDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFIIGAVD 152

Query: 148 LMEYP--PYDKGEPFYAYYK 165
M Y P + YYK
Sbjct: 153 TMLYSNFPTANEIAIFWYYK 172


28  CH52_RS12930  CH52_RS12975  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
CH52_RS12930  2       13       1.505035   HesB/YadR/YfhF family protein
CH52_RS12940  2       10       -0.875958  acyl-CoA thioesterase
CH52_RS12945  2       10       -1.160573  aconitate hydratase AcnA
CH52_RS12950  2       9        -1.224841  BCCT family transporter
CH52_RS12955  3       12       -2.595754  large conductance mechanosensitive channel
CH52_RS12960  5       14       -4.086530  SMC family ATPase
CH52_RS12965  3       12       -2.318403  exonuclease SbcCD subunit D
CH52_RS12970  2       11       -0.554443  CcdC family protein
CH52_RS12975  2       14       0.240309   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS12960  MECHCHANNEL   145    2e-48    Bacterial mechano-sensitive ion channel signature.
		>MECHCHANNEL#Bacterial mechano-sensitive ion channel signature.

Length = 136

Score = 145 bits (367), Expect = 2e-48
Identities = 62/131 (47%), Positives = 90/131 (68%), Gaps = 11/131 (8%)

Query: 1 MLKEFKEFALKGNVLDLAIAVVMGAAFNKIISSLVENIIMPLIGKIFGSVDFAK------ 54
++KEF+EFA++GNV+DLA+ V++GAAF KI+SSLV +IIMP +G + G +DF +
Sbjct: 3 IIKEFREFAMRGNVVDLAVGVIIGAAFGKIVSSLVADIIMPPLGLLIGGIDFKQFAVTLR 62

Query: 55 ----EWSFWGIKYGLFIQSVIDFIIIAFALFIFVKIANTL-MKKEEAEEEAVVEENVVLL 109
+ + YG+FIQ+V DF+I+AFA+F+ +K+ N L KKEE + VLL
Sbjct: 63 DAQGDIPAVVMHYGVFIQNVFDFLIVAFAIFMAIKLINKLNRKKEEPAAAPAPTKEEVLL 122

Query: 110 TEIRDLLREKK 120
TEIRDLL+E+
Sbjct: 123 TEIRDLLKEQN 133


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits            Score  E-value  Comments
CH52_RS12965  FbpA_PF05833    34     0.005    Fibronectin-binding protein
		>FbpA_PF05833#Fibronectin-binding protein

Length = 577

Score = 33.7 bits (77), Expect = 0.005
Identities = 39/249 (15%), Positives = 83/249 (33%), Gaps = 4/249 (1%)

Query: 241 LQARSKEILAFVNESKETAIKEYEIIEKKTLENNILKDNINQLNKNKIDFVQLKEQQPEI 300
I F E+ + + + +L N ID ++
Sbjct: 174 FDFSYDMIENFTKENSLQLNDNIFSKIFTGVSKTLSSEICFRLKNNSIDLSLSNLKEIVE 233

Query: 301 DEIEAKLKLLQDITNLLNYIENREKIETKIAN--SKKDISKTNNKILNLDCDKRNIDKEK 358
+ ++ + Y +N + N SK+D K + + K+K
Sbjct: 234 VCKDLFKEIQSNKFEFNCYTKNNSFVGFYCLNLMSKEDYKKIQYDSSSKLLENFYYAKDK 293

Query: 359 K--MLEENGDLIESKTSFIDKTRVLFNDINKYQQSYLNIECLITEGEQLGDELNNLIKGL 416
+ ++ DL + + I++ +N + + + GE L + L KGL
Sbjct: 294 SDRLKSKSSDLQKIVMNNINRCTKKDKILNNTLKKCEDKDIFKLYGELLTANIYALKKGL 353

Query: 417 EKVEDSIGNNESDYEKIIELNNAITNINNEINIIKENEKAKAELDKLLGSKQELENQINE 476
+E + +E+ I L+ T N + K+ K K + + E ++N
Sbjct: 354 SHIELANYYSENYDTVKITLDENKTPSQNVQSYYKKYNKLKKSEEAANEQLLQNEEELNY 413

Query: 477 ETTIMKNLE 485
+++ N+
Sbjct: 414 LYSVLTNIN 422


29  CH52_RS13075  CH52_RS13170  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
CH52_RS13075  -1      14       -3.189000  aspartate kinase
CH52_RS13080  -3      13       -4.634717  hypothetical protein
CH52_RS13085  -2      15       -4.937087  hypothetical protein
CH52_RS13090  -1      16       -5.440947  thermonuclease family protein
CH52_RS13095  -1      15       -4.811555  hypothetical protein
CH52_RS13100  -1      15       -5.002675  response regulator transcription factor
CH52_RS13105  -1      14       -4.189304  sensor histidine kinase
CH52_RS13110  1       14       -3.850747  ABC transporter permease
CH52_RS13115  2       15       -3.715327  ABC transporter ATP-binding protein
CH52_RS13120  2       16       -3.319374  cardiolipin synthase
CH52_RS13125  3       19       -4.341555  hypothetical protein
CH52_RS13130  3       19       -3.946329  low specificity L-threonine aldolase
CH52_RS13135  5       26       -7.376822  hypothetical protein
CH52_RS15180  6       25       -7.186472  hypothetical protein
CH52_RS13150  9       24       -9.149846  hypothetical protein
CH52_RS13160  6       25       -8.842017  hypothetical protein
CH52_RS13165  2       19       -2.775305  hypothetical protein
CH52_RS13170  2       19       -2.217170  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits            Score  E-value  Comments
CH52_RS13100  HTHFIS          62     9e-14    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 62.2 bits (151), Expect = 9e-14
Identities = 23/116 (19%), Positives = 53/116 (45%), Gaps = 2/116 (1%)

Query: 2 TSLIIAEDQNMLRQAMVQLIKLHGDFEILADTDNGLDAMKLIEEYNPNVVILDIEMPGMT 61
++++A+D +R + Q + G +++ T N + I + ++V+ D+ MP
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAG-YDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 62 GLEVLAEIRKKHLNIKVIIVTTFKRPGYFEKAVVNDVDAYVLKERSIEELVETINK 117
++L I+K ++ V++++ KA Y+ K + EL+ I +
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR 117
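
Each alignment is printed in blocks, so the overall aligned range has to be assembled from the first and last coordinates of the Query and Sbjct lines. A minimal Python sketch under stated assumptions: the regular expression and the function name aligned_ranges are hypothetical, and the input is the HTHFIS alignment above with only one match line kept to show that such lines are skipped.

```python
import re

# "Query: 2 SEQUENCE 61" or "Sbjct: 4 SEQUENCE 61"; match lines between
# the two have no label and are ignored.
LINE_RE = re.compile(r"^(Query|Sbjct):\s+(\d+)\s+(\S+)\s+(\d+)$")


def aligned_ranges(lines):
    """Return {label: (first_position, last_position)} over all blocks."""
    ranges = {}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue
        label, start, _sequence, end = match.groups()
        start, end = int(start), int(end)
        low, high = ranges.get(label, (start, end))
        ranges[label] = (min(low, start), max(high, end))
    return ranges


alignment = [
    "Query: 2 TSLIIAEDQNMLRQAMVQLIKLHGDFEILADTDNGLDAMKLIEEYNPNVVILDIEMPGMT 61",
    "++++A+D +R + Q + G +++ T N + I + ++V+ D+ MP",
    "Sbjct: 4 ATILVADDDAAIRTVLNQALSRAG-YDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDEN 61",
    "Query: 62 GLEVLAEIRKKHLNIKVIIVTTFKRPGYFEKAVVNDVDAYVLKERSIEELVETINK 117",
    "Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR 117",
]

ranges = aligned_ranges(alignment)
subject_length = 484  # "Length = 484" reported for HTHFIS
first, last = ranges["Sbjct"]
coverage = (last - first + 1) / subject_length
print(ranges, f"subject coverage = {coverage:.3f}")
```

Subject coverage is one way to judge how much of a signature a short hit actually spans; whether PredictBias applies any such filter is not stated in this report.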


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits            Score  E-value  Comments
CH52_RS13105  PF04647         33     0.001    Accessory gene regulator B
		>PF04647#Accessory gene regulator B

Length = 212

Score = 32.8 bits (75), Expect = 0.001
Identities = 18/112 (16%), Positives = 42/112 (37%), Gaps = 9/112 (8%)

Query: 35 WLYIISVIVFSLSYLILVIVNNRLNTLMFYILLIIHYFIICYFVFSVHPMLSLFFFYSAF 94
+ S++VF++ I +++ L+ I I + + V +P
Sbjct: 79 RCTLTSLLVFNVLAYIAHLIDPAYFQLLILIAFITSLLALLFLVPVDNP---------RN 129

Query: 95 AVPFTFKNNVKKTATNLFILTMIICTIITYLLYNNYFVAMMVYYVVISLIML 146
+ T + K T++ ++ + +I Y LY + ++ V+ L
Sbjct: 130 LISNTEQRKTLKLKTSMVLMVLFGGSIGAYRLYTHQIALAILLGVLWQTFTL 181


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits            Score  E-value  Comments
CH52_RS13110  ABC2TRNSPORT    29     0.016    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 29.1 bits (65), Expect = 0.016
Identities = 11/34 (32%), Positives = 15/34 (44%)

Query: 167 IVTIGLAVLGGLWFPINTFPNWLQHVAHVLPSYH 200
+V + L G FP++ P Q A LP H
Sbjct: 184 LVITPILFLSGAVFPVDQLPIVFQTAARFLPLSH 217
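
The input parameters of this report list an RPSBlast E-value of 0.05, and the Expect values are written in several notations (0.016, 9e-14, and the BLAST shorthand e-178, which stands for 1e-178). A minimal Python sketch for normalising and filtering them follows. The helper name parse_evalue and the use of "less than or equal" for the cutoff are assumptions made here, not documented PredictBias behaviour; the three hits are those reported for island 29 above.

```python
def parse_evalue(text):
    """Convert a BLAST Expect string to a float.

    Legacy BLAST output may drop the leading mantissa ("e-178"),
    which float() cannot parse on its own.
    """
    text = text.strip()
    if text.lower().startswith("e"):
        text = "1" + text
    return float(text)


# (locus tag, signature, Expect) for the island 29 hits.
hits = [
    ("CH52_RS13110", "ABC2TRNSPORT", "0.016"),
    ("CH52_RS13100", "HTHFIS", "9e-14"),
    ("CH52_RS13105", "PF04647", "0.001"),
]

THRESHOLD = 0.05  # E-value (RPSBlast) from the input parameters

# Keep hits at or below the threshold, most significant first.
kept = sorted(
    (hit for hit in hits if parse_evalue(hit[2]) <= THRESHOLD),
    key=lambda hit: parse_evalue(hit[2]),
)

for locus, signature, evalue in kept:
    print(f"{locus}\t{signature}\t{evalue}")
```

All three hits pass the 0.05 cutoff; sorting by E-value puts the HTHFIS match first and the marginal ABC2TRNSPORT match last.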


30  CH52_RS13430  CH52_RS13495  Y  N  N  Genomic Island
LocusTag      DNBias  CDNBias  %GCBias    Product
CH52_RS13430  3       22       0.778605   ribosome recycling factor
CH52_RS13435  4       28       1.546531   UMP kinase
CH52_RS13440  4       26       0.831175   translation elongation factor Ts
CH52_RS13445  6       20       0.653311   hypothetical protein
CH52_RS13450  3       13       0.868676   30S ribosomal protein S2
CH52_RS13455  3       12       0.633269   hypothetical protein
CH52_RS13460  1       13       0.881135   GTP-sensing pleiotropic transcriptional
CH52_RS13465  2       9        1.223842   ATP-dependent protease ATPase subunit HslU
CH52_RS13470  1       8        0.964447   ATP-dependent protease subunit HslV
CH52_RS13475  1       9        -0.664350  tyrosine recombinase XerC
CH52_RS13485  2       10       -0.974958  methylenetetrahydrofolate--tRNA-(uracil(54)-
CH52_RS13490  2       10       -1.070604  type I DNA topoisomerase
CH52_RS13495  2       11       -1.648671  DNA-processing protein DprA
31  CH52_RS00065  CH52_RS00115  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
CH52_RS00065  1       14       -2.658065  heme ABC transporter substrate-binding protein
CH52_RS15015  4       13       -1.652546  iron-regulated surface determinant protein IsdD
CH52_RS00080  5       15       -1.312078  heme uptake protein IsdC
CH52_RS00085  5       16       -0.602509  LPXTG-anchored heme-scavenging protein IsdA
CH52_RS00090  5       16       -0.673070  heme uptake protein IsdB
CH52_RS00100  -2      12       -1.413891  50S ribosomal protein L32
CH52_RS00110  -1      12       -2.008292  DUF177 domain-containing protein
CH52_RS00115  -2      13       -1.243215  pantetheine-phosphate adenylyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits            Score  E-value  Comments
CH52_RS00075  FERRIBNDNGPP    45     2e-07    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 44.6 bits (105), Expect = 2e-07
Identities = 44/274 (16%), Positives = 101/274 (36%), Gaps = 20/274 (7%)

Query: 5 KYLTILVISVVILTSCQSSSSQESTKSGEFRIVPTTVALTMTLDKLDLPIVG--KPTSYK 62
+ LT + +S ++ + ++ RIV L L + G +Y+
Sbjct: 11 RLLTAMALSPLLWQMNTAHAAAIDPN----RIVALEWLPVELLLALGIVPYGVADTINYR 66

Query: 63 ---TLPNRYKDVPEIGQPMEPNVEAVKKLKPTHVLSVSTIKDEMQPFYKQLNMKGYFYDF 119
+ P V ++G EPN+E + ++KP+ ++ + + + +G+ +
Sbjct: 67 LWVSEPPLPDSVIDVGLRTEPNLELLTEMKPSFMVWSAGYGPSPEMLARIAPGRGFNFSD 126

Query: 120 DS--LKGMQKSITQLGDQFNRKAQAKELNDHLNSVKQKIENKAAKQKKHPKVLILMGVPG 177
L +KS+T++ D N ++ A+ + ++ + K+ P +L + P
Sbjct: 127 GKQPLAMARKSLTEMADLLNLQSAAETHLAQYEDFIRSMKPRFVKRGARPLLLTTLIDPR 186

Query: 178 SYLVATDKSYIGDLVKIAGGENVIKVKDRQYISSNT---ENLLNINPDIILRLPHGMPEE 234
LV S +++ G N + + + S + L +L H ++
Sbjct: 187 HMLVFGPNSLFQEILDEYGIPNAWQ-GETNFWGSTAVSIDRLAAYKDVDVLCFDHDNSKD 245

Query: 235 VKKMFQKEFKQNDIWKHFKAVKNNHVYDLEEVPF 268
+ + +W+ V+ + V F
Sbjct: 246 MDAL-----MATPLWQAMPFVRAGRFQRVPAVWF 274


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits            Score  E-value  Comments
CH52_RS00090  IGASERPTASE     34     0.001    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 33.9 bits (77), Expect = 0.001
Identities = 27/132 (20%), Positives = 44/132 (33%), Gaps = 4/132 (3%)

Query: 184 ADAAKPNNVKPVQPKPAQPKTPTEQTKPVQPKVEKVKPTVTTTSKVEDNHSTKVVSTDTT 243
+ A+ + P PA P TE + K + + +V +
Sbjct: 1015 EEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKS 1074

Query: 244 KDQTKTQTAHTVKTAQTAQEQNKVQTPVKDVATAKSESNNQAVSDNKSQQTNKVTKHNET 303
+ TQT + AQ+ E + QT + V K+Q+ KVT +
Sbjct: 1075 NVKANTQTN---EVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTS-QVS 1130

Query: 304 PKQASKAKELPK 315
PKQ P+
Sbjct: 1131 PKQEQSETVQPQ 1142


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits            Score  E-value  Comments
CH52_RS00095  IGASERPTASE     36     6e-04    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 35.8 bits (82), Expect = 6e-04
Identities = 37/194 (19%), Positives = 71/194 (36%), Gaps = 15/194 (7%)

Query: 447 RIVDKEAFTKANTDKSNKKEQQDNSAKKEA---------TPATPSKPTPSPVEKESQKQD 497
+ VD T N +++ N+ + PATPS+ T + E Q+
Sbjct: 990 QTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESK 1049

Query: 498 SQKDDNKQLPSVEKENDASSESGKDKTPATKPT------KGEVESSSTTPTKVVSTTQNV 551
+ + + + +N ++ K A T E + + TT TK +T +
Sbjct: 1050 TVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKE 1109

Query: 552 AKPTTASSKTTKDVVQTSAGSSEAKDSAPLQKANIKNTNDGHTQSQNNKNTQENKAKSLP 611
K + KT + TS S + + S +Q + T + +Q N
Sbjct: 1110 EKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTE 1169

Query: 612 QTGEESNKDMTLPL 625
Q +E++ ++ P+
Sbjct: 1170 QPAKETSSNVEQPV 1183



Score = 30.0 bits (67), Expect = 0.035
Identities = 27/156 (17%), Positives = 45/156 (28%), Gaps = 5/156 (3%)

Query: 37 EAQAAAEETGGTNTEAQPKTEAVASPTTTSEKAPET-KPVANAVSVSNKEVEAPTSETKE 95
A EE TE + V S + ++ ET +P A ++ V +++
Sbjct: 1103 TATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQT 1162

Query: 96 AKEVKEVKAPKETKAVKPAAKATNNTYPILNQELREAIKNPAIKDKDHSAPNSRPIDFEM 155
+ KET + + T N + NP + P
Sbjct: 1163 NTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVE----NPENTTPATTQPTVNSESSNK 1218

Query: 156 KKENGEQQFYHYASSVKPARVIFTDSKPEIELGLQS 191
K + +V+PA D L S
Sbjct: 1219 PKNRHRRSVRSVPHNVEPATTSSNDRSTVALCDLTS 1254


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits            Score  E-value  Comments
CH52_RS00120  LPSBIOSNTHSS    219    1e-76    Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein signature.

Length = 166

Score = 219 bits (560), Expect = 1e-76
Identities = 77/155 (49%), Positives = 112/155 (72%)

Query: 5 IAVIPGSFDPITYGHLDIIERSTDRFDEIHVCVLKNSKKEGTFSLEERMDLIEQSVKHLP 64
A+ PGSFDPIT+GHLDIIER FD+++V VL+N K+ FS++ER++ I +++ HLP
Sbjct: 2 NAIYPGSFDPITFGHLDIIERGCRLFDQVYVAVLRNPNKQPMFSVQERLEQIAKAIAHLP 61

Query: 65 NVKVHQFSGLLVDYCEQVGAKTIIRGLRAVSDFEYELRLTSMNKKLNNEIETLYMMSSTN 124
N +V F GL V+Y Q A I+RGLR +SDFE EL++ + NK L +++ET+++ +ST
Sbjct: 62 NAQVDSFEGLTVNYARQRQAGAILRGLRVLSDFELELQMANTNKTLASDLETVFLTTSTE 121

Query: 125 YSFISSSIVKEVAAYRADISEFVPPYVEKALKKKF 159
YSF+SSS+VKEVA + ++ FVP +V AL +F
Sbjct: 122 YSFLSSSLVKEVARFGGNVEHFVPSHVAAALYDQF 156


32  CH52_RS03700  CH52_RS03775  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
CH52_RS03700  0       14       -1.149537  superantigen-like protein SSL11
CH52_RS03705  1       15       -1.348614  restriction endonuclease subunit S
CH52_RS03710  1       15       -0.826505  type I restriction-modification system subunit
CH52_RS03715  1       17       -3.397902  hypothetical protein
CH52_RS03720  2       16       -2.900242  superantigen-like protein SSL10
CH52_RS03725  3       17       -2.733709  superantigen-like protein SSL9
CH52_RS03730  2       17       -1.579017  superantigen-like protein SSL8
CH52_RS03735  0       16       -1.436624  superantigen-like protein SSL7
CH52_RS03745  1       18       -1.345084  superantigen-like protein SSL5
CH52_RS03750  -2      14       -0.751505  superantigen-like protein SSL4
CH52_RS03755  -4      16       -0.930011  hypothetical protein
CH52_RS03760  -3      16       -1.006008  superantigen-like protein SSL3
CH52_RS03765  -1      15       -2.411405  superantigen-like protein SSL2
CH52_RS03770  -1      18       -2.769344  superantigen-like protein SSL1
CH52_RS03775  1       18       -3.384304  SDR family oxidoreductase
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits            Score  E-value  Comments
CH52_RS03700  TOXICSSTOXIN    108    4e-31    Staphylococcal toxic shock syndrome toxin signature.
		>TOXICSSTOXIN#Staphylococcal toxic shock syndrome toxin signature.

Length = 234

Score = 108 bits (270), Expect = 4e-31
Identities = 47/225 (20%), Positives = 86/225 (38%), Gaps = 19/225 (8%)

Query: 16 LTTGMITTTAQPVKASTLEVRSQAT-------QDLSEYYKGRGFELTNVTGYKYG-NKVT 67
L T PV S+ ++ A +DL ++Y TN +
Sbjct: 15 LLLATTATDFTPVPLSSNQIIKTAKASTNDNIKDLLDWYSSGSDTFTNSEVLDNSLGSMR 74

Query: 68 FIDNSQQIDVTLTGNE----KLTVKDDDEVSNVDVFVVREGSDKSAITTSIGGITKTNGT 123
+ I + + + T + +++ + S+ + I I G+T T
Sbjct: 75 IKNTDGSISLIIFPSPYYSPAFTKGEKVDLNTKRTKKSQHTSEGTYIHFQISGVTNTE-- 132

Query: 124 QHKDTVQNVNLSVSKSTGQHTTSVTSEYYSIYKEEISLKELDFKLRKHLIDKHDLYKTEP 183
T + L V K G+ S K+++++ LDF++R L H LY++
Sbjct: 133 -KLPTPIELPLKV-KVHGK--DSPLKYGPKFDKKQLAISTLDFEIRHQLTQIHGLYRSSD 188

Query: 184 KDSKI-RITMKNGGYYTFELNKKLQPHRMGDTIDSRNIEKIEVNL 227
K +ITM +G Y +L+KK + + I+ I+ IE +
Sbjct: 189 KTGGYWKITMNDGSTYQSDLSKKFEYNTEKPPINIDEIKTIEAEI 233


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits            Score  E-value  Comments
CH52_RS03720  TOXICSSTOXIN    193    4e-64    Staphylococcal toxic shock syndrome toxin signature.
		>TOXICSSTOXIN#Staphylococcal toxic shock syndrome toxin signature.

Length = 234

Score = 193 bits (491), Expect = 4e-64
Identities = 51/202 (25%), Positives = 92/202 (45%), Gaps = 10/202 (4%)

Query: 31 KQNQKSVNKHDKEALYRYYTGKTMEMKNISALKHGKNNLRFKFRGIKIQVLLPGNDKSKF 90
K + S N + K+ L Y +G + N L + ++R K I +++ +
Sbjct: 36 KTAKASTNDNIKDLLDWYSSG-SDTFTNSEVLDNSLGSMRIKNTDGSISLIIFPSPYYSP 94

Query: 91 QQRSYEGLDVFFVQEKRDKHD-----IFYTVGGVIQNNKTSGVVSAPILNISKEKGEDAF 145
E +D+ + K+ +H I + + GV K + P L + K G+D+
Sbjct: 95 AFTKGEKVDLNTKRTKKSQHTSEGTYIHFQISGVTNTEKLPTPIELP-LKV-KVHGKDSP 152

Query: 146 VKGYPYYIKKEKITLKELDYKLRKHLIEKYGLYKTISKDGRV-KISLKDGSFYNLDLRSK 204
+K Y K+++ + LD+++R L + +GLY++ K G KI++ DGS Y DL K
Sbjct: 153 LK-YGPKFDKKQLAISTLDFEIRHQLTQIHGLYRSSDKTGGYWKITMNDGSTYQSDLSKK 211

Query: 205 LKFKYMGEVIESKQIKDIEVNL 226
++ I +IK IE +
Sbjct: 212 FEYNTEKPPINIDEIKTIEAEI 233


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits            Score  E-value  Comments
CH52_RS03725  TOXICSSTOXIN    130    1e-39    Staphylococcal toxic shock syndrome toxin signature.
		>TOXICSSTOXIN#Staphylococcal toxic shock syndrome toxin signature.

Length = 234

Score = 130 bits (329), Expect = 1e-39
Identities = 39/197 (19%), Positives = 69/197 (35%), Gaps = 15/197 (7%)

Query: 43 INMLHQYYSEESFESTNISVKSEDYYGSNVLNFNQRNKTFKVFLLGDDKNKY------KE 96
I L +YS S TN V + + + + + K
Sbjct: 46 IKDLLDWYSSGSDTFTNSEVLD---NSLGSMRIKNTDGSISLIIFPSPYYSPAFTKGEKV 102

Query: 97 KTHGLDVFAVPELIDIKGGIYSVGGITKKNVRSVFGFVSNPSLQVKKVDAKHGFSINELF 156
+ + + + G+T + P L+VK F
Sbjct: 103 DLNTKRTKKSQHTSEGTYIHFQISGVTNTEKLP--TPIELP-LKVKVHGKDSPLKYGPKF 159

Query: 157 FIQKEEVSLKELDFKIRKMLVEKYRLYKGAS-DKGRIVINMKDEKKYVIDLSEKLSFDRM 215
K+++++ LDF+IR L + + LY+ + G I M D Y DLS+K ++
Sbjct: 160 --DKKQLAISTLDFEIRHQLTQIHGLYRSSDKTGGYWKITMNDGSTYQSDLSKKFEYNTE 217

Query: 216 FDVMDSKQIKNIEVNLN 232
++ +IK IE +N
Sbjct: 218 KPPINIDEIKTIEAEIN 234


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits            Score  E-value  Comments
CH52_RS03730  TOXICSSTOXIN    125    2e-37    Staphylococcal toxic shock syndrome toxin signature.
		>TOXICSSTOXIN#Staphylococcal toxic shock syndrome toxin signature.

Length = 234

Score = 125 bits (314), Expect = 2e-37
Identities = 47/199 (23%), Positives = 73/199 (36%), Gaps = 19/199 (9%)

Query: 42 DTNKLHQYYSGPSYELTNV--------SGQSQGYYDSNVLLFNQQNQKFQVFLLGKDENK 93
+ L +YS S TN S + + S L+ F G+
Sbjct: 45 NIKDLLDWYSSGSDTFTNSEVLDNSLGSMRIKNTDGSISLIIFPSPYYSPAFTKGE---- 100

Query: 94 YKEKTHGLDVFAVPELVDLDGRIFSVSGVTKKNVKSIFESLRTPNLLVKKIDDKDGFSID 153
K + + F +SGVT L L K+ KD +
Sbjct: 101 -KVDLNTKRTKKSQHTSEGTYIHFQISGVTNTEKLPTPIELP----LKVKVHGKDSP-LK 154

Query: 154 EFFFIQKEEVSLKELDFKIRKLLIKKYKLYEGSA-DKGRIVINMKDENKYEIDLSDKLDF 212
K+++++ LDF+IR L + + LY S G I M D + Y+ DLS K ++
Sbjct: 155 YGPKFDKKQLAISTLDFEIRHQLTQIHGLYRSSDKTGGYWKITMNDGSTYQSDLSKKFEY 214

Query: 213 ERMADVINSEQIKNIEVNL 231
IN ++IK IE +
Sbjct: 215 NTEKPPINIDEIKTIEAEI 233


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits            Score  E-value  Comments
CH52_RS03735  TOXICSSTOXIN    192    1e-63    Staphylococcal toxic shock syndrome toxin signature.
		>TOXICSSTOXIN#Staphylococcal toxic shock syndrome toxin signature.

Length = 234

Score = 192 bits (488), Expect = 1e-63
Identities = 49/197 (24%), Positives = 82/197 (41%), Gaps = 16/197 (8%)

Query: 42 DIKDLYRYYSSESFEFSNI--------SGKVENYNGSNVVRFNQEKQNHQLFLLGKDKDK 93
+IKDL +YSS S F+N S +++N +GS + F G+
Sbjct: 45 NIKDLLDWYSSGSDTFTNSEVLDNSLGSMRIKNTDGSISLIIFPSPYYSPAFTKGE---- 100

Query: 94 YKKGLEGQNVFVVKELIDPNGRLSTVGGVTKKNNKSSETNTHLFVNKVYGGNLDASIDSF 153
K L + + + + GVT + L V KV+G +
Sbjct: 101 -KVDLNTKRTKKSQHTSEGTYIHFQISGVTNTEKLPTPIELPLKV-KVHGKDSPLKYGP- 157

Query: 154 LINKEEVSLKELDFKIRKQLVEKYGLYKGTTKYGKI-TINLKDEKKEVIDLGDKLQFERM 212
+K+++++ LDF+IR QL + +GLY+ + K G I + D DL K ++
Sbjct: 158 KFDKKQLAISTLDFEIRHQLTQIHGLYRSSDKTGGYWKITMNDGSTYQSDLSKKFEYNTE 217

Query: 213 GDVLNSKDIQNIAVTIN 229
+N +I+ I IN
Sbjct: 218 KPPINIDEIKTIEAEIN 234


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits            Score  E-value  Comments
CH52_RS03745  TOXICSSTOXIN    135    2e-41    Staphylococcal toxic shock syndrome toxin signature.
		>TOXICSSTOXIN#Staphylococcal toxic shock syndrome toxin signature.

Length = 234

Score = 135 bits (340), Expect = 2e-41
Identities = 49/201 (24%), Positives = 73/201 (36%), Gaps = 14/201 (6%)

Query: 39 NVTKDIFDLRDYYSGASKELKNVTGYRYSKGGKHYLIFDKNRKFTRVQIFGKDIERFKAR 98
+ +I DL D+YS S N S G + + IF
Sbjct: 41 STNDNIKDLLDWYSSGSDTFTNSEVLDNSLGS---MRIKNTDGSISLIIFPSPYYSPAFT 97

Query: 99 KNPGLDI-----FVVKEAENRNGTVFSYGGVTKKNQDAYYDYINAPRFQIKRDEGDGIAT 153
K +D+ + F GVT + I P +K D
Sbjct: 98 KGEKVDLNTKRTKKSQHTSEGTYIHFQISGVTNTEKLP--TPIELPLK-VKVHGKDSPLK 154

Query: 154 YGRVHYIYKEEISLKELDFKLRQYLIQNFDLYKKFPKDSKI-KVIMKDGGYYTFELNKKL 212
YG K+++++ LDF++R L Q LY+ K K+ M DG Y +L+KK
Sbjct: 155 YG--PKFDKKQLAISTLDFEIRHQLTQIHGLYRSSDKTGGYWKITMNDGSTYQSDLSKKF 212

Query: 213 QTNRMSDVIDGRNIEKIEANI 233
+ N I+ I+ IEA I
Sbjct: 213 EYNTEKPPINIDEIKTIEAEI 233


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits            Score  E-value  Comments
CH52_RS03750  TOXICSSTOXIN    101    8e-28    Staphylococcal toxic shock syndrome toxin signature.
		>TOXICSSTOXIN#Staphylococcal toxic shock syndrome toxin signature.

Length = 234

Score = 101 bits (252), Expect = 8e-28
Identities = 45/216 (20%), Positives = 81/216 (37%), Gaps = 13/216 (6%)

Query: 82 TKVETPQSPTTKQVPTEINPKFKDLRAYYTKPSLEFKNEIGIILKKWTTIRFMNIVPDYF 141
T V + K N KDL +Y+ S F N ++ ++R N
Sbjct: 25 TPVPLSSNQIIKTAKASTNDNIKDLLDWYSSGSDTFTN-SEVLDNSLGSMRIKNTDGSI- 82

Query: 142 IYKIALVGKDDKKYDEGVHRNVDVFVVLEEKNKYGVE----RYSVGGITKSNSKKVDHKA 197
+ + VD+ +K+++ E + + G+T + +
Sbjct: 83 --SLIIFPSPYYSPAFTKGEKVDLNTKRTKKSQHTSEGTYIHFQISGVTNTEKLPTPIEL 140

Query: 198 GVRITKEDNKGTISHDVSEFKITKEQISLKELDFKLRKQLIENHNLYGNV--GSGKIVIN 255
+++ + + K K+Q+++ LDF++R QL + H LY + G I
Sbjct: 141 PLKVKVHGKDSPLKYG---PKFDKKQLAISTLDFEIRHQLTQIHGLYRSSDKTGGYWKIT 197

Query: 256 MKNGGKYTFELHKKLQENRMADVIDGTNIDNIEVNI 291
M +G Y +L KK + N I+ I IE I
Sbjct: 198 MNDGSTYQSDLSKKFEYNTEKPPINIDEIKTIEAEI 233


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits            Score  E-value  Comments
CH52_RS03760  TOXICSSTOXIN    94     2e-24    Staphylococcal toxic shock syndrome toxin signature.
		>TOXICSSTOXIN#Staphylococcal toxic shock syndrome toxin signature.

Length = 234

Score = 93.6 bits (232), Expect = 2e-24
Identities = 44/223 (19%), Positives = 78/223 (34%), Gaps = 21/223 (9%)

Query: 140 TPQPMQSTKSDTPQSPTIKQAQTDMTPKYEDLRAYYTKPSFEFEKQFGFLLKPWTTVRFM 199
TP P+ S + IK A+ +DL +Y+ S F L ++R
Sbjct: 25 TPVPLSSNQ-------IIKTAKASTNDNIKDLLDWYSSGSDTF-TNSEVLDNSLGSMRIK 76

Query: 200 NVIPNRFIYKIALVGKDEKKYKDGPYDNIDV-----FIVLEDNKYQLKKYSVGGITKTNS 254
N + + + + +D+ ++ + + G+T T
Sbjct: 77 NTDGSI---SLIIFPSPYYSPAFTKGEKVDLNTKRTKKSQHTSEGTYIHFQISGVTNTEK 133

Query: 255 KKVDHKAELSVTKKDNQGMISRDVSEYMITKEEISLKELDFKLRKQLIEKHNLYGNM--G 312
+ L V + K+++++ LDF++R QL + H LY +
Sbjct: 134 LPTPIELPLKVKVHGKDSPLKYG---PKFDKKQLAISTLDFEIRHQLTQIHGLYRSSDKT 190

Query: 313 SGTIVIKMKNGGKYTFELHKKLQEHRMADVIEGTNIDKIEVNI 355
G I M +G Y +L KK + + I I IE I
Sbjct: 191 GGYWKITMNDGSTYQSDLSKKFEYNTEKPPINIDEIKTIEAEI 233


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits            Score  E-value  Comments
CH52_RS03765  TOXICSSTOXIN    87     5e-23    Staphylococcal toxic shock syndrome toxin signature.
		>TOXICSSTOXIN#Staphylococcal toxic shock syndrome toxin signature.

Length = 234

Score = 87.0 bits (215), Expect = 5e-23
Identities = 39/203 (19%), Positives = 77/203 (37%), Gaps = 22/203 (10%)

Query: 37 ISENSKKLKAYYTQPSIEYKNVTGYISFIQPSIKFMNIIDGNSVNNLALIGKDKQHYHTG 96
++N K L +Y+ S + N + S+ M I + + +L +
Sbjct: 42 TNDNIKDLLDWYSSGSDTFTN----SEVLDNSLGSMRIKNTDGSISLIIFPSPYYSPAFT 97

Query: 97 VHRNLNIFYVN-----EDKRFEGAKYSIGGITSANDKA--VDLIAEARVIKADHIGEYDY 149
+++ + I G+T+ ++L + +V D +Y
Sbjct: 98 KGEKVDLNTKRTKKSQHTSEGTYIHFQISGVTNTEKLPTPIELPLKVKVHGKDSPLKYGP 157

Query: 150 DFFPFKIDKEAMSLKEIDFKLRKYLIDNYGLYGEMST----GKITVKKKYYGKYTFELDK 205
F DK+ +++ +DF++R L +GLY KIT+ Y +L K
Sbjct: 158 KF-----DKKQLAISTLDFEIRHQLTQIHGLYRSSDKTGGYWKITMNDG--STYQSDLSK 210

Query: 206 KLQEDRMSDVINVTDIDRIEIKV 228
K + + IN+ +I IE ++
Sbjct: 211 KFEYNTEKPPINIDEIKTIEAEI 233


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits            Score  E-value  Comments
CH52_RS03770  TOXICSSTOXIN    95     3e-26    Staphylococcal toxic shock syndrome toxin signature.
		>TOXICSSTOXIN#Staphylococcal toxic shock syndrome toxin signature.

Length = 234

Score = 95.5 bits (237), Expect = 3e-26
Identities = 46/214 (21%), Positives = 84/214 (39%), Gaps = 11/214 (5%)

Query: 18 TGVITSNVQSVQAKTEVKQQSESELKHYYNKPVLERKNVTGYKYTEKGKDYIDVIVDNQY 77
T V S+ Q ++ + +L +Y+ N + + + + +
Sbjct: 25 TPVPLSSNQIIKTAKASTNDNIKDLLDWYSSGSDTFTNS---EVLDNSLGSMRIKNTDGS 81

Query: 78 SQISLVGSDKDKFKDGDNSNIDVFILREGDSRQATN-----YSIGGVTKTNSQPFIDYIH 132
+ + S +D+ R S+ + + I GVT T P I
Sbjct: 82 ISLIIFPSPYYSPAFTKGEKVDLNTKRTKKSQHTSEGTYIHFQISGVTNTEKLP--TPIE 139

Query: 133 TPILEIKKGKEEPQSSLYQIYKEDISLKELDYRLRERAIKQHGLYSNGLKQGQI-TITMK 191
P+ GK+ P + K+ +++ LD+ +R + + HGLY + K G ITM
Sbjct: 140 LPLKVKVHGKDSPLKYGPKFDKKQLAISTLDFEIRHQLTQIHGLYRSSDKTGGYWKITMN 199

Query: 192 DGKSHTIDLSQKLEKERMGDSIDGRQIQKILVEM 225
DG ++ DLS+K E I+ +I+ I E+
Sbjct: 200 DGSTYQSDLSKKFEYNTEKPPINIDEIKTIEAEI 233


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits            Score  E-value  Comments
CH52_RS03775  NUCEPIMERASE    30     0.009    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 30.1 bits (68), Expect = 0.009
Identities = 28/167 (16%), Positives = 61/167 (36%), Gaps = 32/167 (19%)

Query: 1 MNIMLTGATGHLGTHITNQAIANHIDHFHIGVRNVEKVPD----------DWRGKVSVRQ 50
M ++TGA G +G H++ + + H +G+ N+ D + +
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEA--GHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHK 58

Query: 51 LDYFNQESMVEAFK--GMDTVVFI-------PSIIHP-SFKRIPEV--ENLVYAAKQSGV 98
+D ++E M + F + V S+ +P ++ N++ + + +
Sbjct: 59 IDLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKI 118

Query: 99 AHIIFIG---YYADQHNNPFHMS-----PYFGYASRLLSTSGIDYTY 137
H+++ Y PF P YA+ + + +TY
Sbjct: 119 QHLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTY 165


33  CH52_RS04660  CH52_RS04680  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
CH52_RS04660  0       15       0.501815   formate C-acetyltransferase
CH52_RS04665  1       14       0.439991   ABC transporter substrate-binding protein
CH52_RS04670  1       17       1.515817   sensor histidine kinase
CH52_RS04675  1       18       0.807655   response regulator transcription factor
CH52_RS04680  -1      14       0.124648   hexose-6-phosphate:phosphate antiporter
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits            Score  E-value  Comments
CH52_RS04675  SHAPEPROTEIN    32     0.006    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 32.4 bits (74), Expect = 0.006
Identities = 18/54 (33%), Positives = 29/54 (53%), Gaps = 5/54 (9%)

Query: 257 AYLAAIKEQNGAAMSLGRTSTFLDIYAERDLKAGVITESEV-QEIIDHFIMKLR 309
+AA+ A LGRT +I A R +K GVI + V ++++ HFI ++
Sbjct: 50 KSVAAVGHD--AKQMLGRTPG--NIAAIRPMKDGVIADFFVTEKMLQHFIKQVH 99


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits            Score  E-value  Comments
CH52_RS04685  PF06580         147    6e-42    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 147 bits (372), Expect = 6e-42
Identities = 55/226 (24%), Positives = 109/226 (48%), Gaps = 16/226 (7%)

Query: 288 YIYDLFESNEQLIHSIEHTERRLRDIQLKEIERQFQPHFLFNTMQTIQYLITLSPKLAQT 347
+ + F++ +Q ++ QL ++ Q PHF+FN + I+ LI P A+
Sbjct: 136 FGWHFFKNYKQAEIDQWKMASMAQEAQLMALKAQINPHFMFNALNNIRALILEDPTKARE 195

Query: 348 VVQQLSQMLRYSLR-TNSHTVELNEELNYIEQYVAIQNIRFDDMIKLHIESSEEARHQTI 406
++ LS+++RYSLR +N+ V L +EL ++ Y+ + +I+F+D ++ + + +
Sbjct: 196 MLTSLSELMRYSLRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQINPAIMDVQV 255

Query: 407 GKMMLQPLIENAIKHGRDTESLDITIRLTLARQN--LHVLVCDNGIGMSSSRLQYVRQSL 464
M++Q L+EN IKHG I L + N + + V + G L+ ++S
Sbjct: 256 PPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLA----LKNTKES- 310

Query: 465 NNDVFDTKHLGLNHLHNKAMIQYGSHARLHIFSKRNQGTLICYKIP 510
GL ++ + + YG+ A++ + K+ + + IP
Sbjct: 311 -------TGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMVL-IP 348


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits            Score  E-value  Comments
CH52_RS04690  HTHFIS          81     2e-19    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 80.6 bits (199), Expect = 2e-19
Identities = 42/169 (24%), Positives = 72/169 (42%), Gaps = 12/169 (7%)

Query: 3 KVVICDDERIIREGLKQIIPWGDYHFNTIYTAKDGVEALSLIQQHQPELVITDIRMPRKN 62
+++ DD+ IR L Q + Y + + I +LV+TD+ MP +N
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGY---DVRITSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 63 GVDLLNDI--ALLDCNVIILSSYDDFEYMKAGIQHHVLDYLLKPVDHAQLEVILGRLVRT 120
DLL I A D V+++S+ + F + DYL KP D L ++G + R
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFD---LTELIGIIGRA 118

Query: 121 LLEQQSQNGRSLASCHDAFQPLLKVEYDDYYVNQIVDQIKQSYQTKVTV 169
L E + + + D PL+ + +I + + QT +T+
Sbjct: 119 LAEPKRRPSKLEDDSQD-GMPLVG---RSAAMQEIYRVLARLMQTDLTL 163


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits            Score  E-value  Comments
CH52_RS04695  TCRTETA         37     9e-05    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 37.5 bits (87), Expect = 9e-05
Identities = 53/361 (14%), Positives = 121/361 (33%), Gaps = 40/361 (11%)

Query: 30 AFFVVFFVYMAMYLIRNNFKAAQPFLKEEIGLSTLELGYIGL---AFSITYGLGKTLLGY 86
V + + LI P L ++ S + G+ +++ +LG
Sbjct: 10 ILSTVALDAVGIGLI----MPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGA 65

Query: 87 FVDGRNTKRIISFLLILSAITVLIMGFVLSYFGSVMGLLIVLWGLNGVFQSVGGPASYST 146
D + ++ L +A+ IM + F V+ + ++ G+ G G + +
Sbjct: 66 LSDRFGRRPVLLVSLAGAAVDYAIMAT--APFLWVLYIGRIVAGITG----ATGAVAGAY 119

Query: 147 ISRWAPRTKRGRYLGFWNTSHNIGGAIAGGVALWGANVFFHGNVIGMFIFPSVIALLIGI 206
I+ +R R+ GF + G + H F + + L +
Sbjct: 120 IADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAP----FFAAAALNGLNFL 175

Query: 207 ATLFIGKDDPEELGWNRAEEIWEEPVDKENIDSQGMTKWEIFKKYILGNPVIWILCVSNV 266
F+ + + + P+ +E ++ +W V ++ V +
Sbjct: 176 TGCFLLPE---------SHKGERRPLRREALNPLASFRWARGMT-----VVAALMAVFFI 221

Query: 267 FVYIVRIGIDNWAPLYVSEHLHFSKGDAVNTIFYFEI-GALVASLLWGYVSDLLKGRRAI 325
+ ++ W ++ + H+ ++ F I +L +++ G V+ L RRA+
Sbjct: 222 MQLVGQVPAALWV-IFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRAL 280

Query: 326 VAIGCMFMITFVVLFYTNATSVMMVNISLFALGALIFGPQLLIGVSLTGFVPKNAISVAN 385
+ +++L + + + L A G I P +L + +
Sbjct: 281 MLGMIADGTGYILLAFATRGWMAFPIMVLLASGG-IGMP------ALQAMLSRQVDEERQ 333

Query: 386 G 386
G
Sbjct: 334 G 334


34  CH52_RS04890  CH52_RS04915  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
CH52_RS04890  2       14       1.616083   acetylglutamate kinase
CH52_RS04895  2       13       1.316000   YagU family protein
CH52_RS04900  2       14       1.244805   4'-phosphopantetheinyl transferase superfamily
CH52_RS04905  2       15       1.296375   non-ribosomal peptide synthetase
CH52_RS04915  -2      11       1.299690   MFS transporter
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits            Score  E-value  Comments
CH52_RS04895  CARBMTKINASE    32     0.002    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 31.7 bits (72), Expect = 0.002
Identities = 23/84 (27%), Positives = 41/84 (48%), Gaps = 7/84 (8%)

Query: 155 INADTLAYFIASSLKAPIYV-LSNIAGVLIN-----DVVIPQLPLVDIHQYIEHGD-IYG 207
I+ D +A + A I++ L+++ G + + + ++ + ++ +Y E G G
Sbjct: 213 IDKDLAGEKLAEEVNADIFMILTDVNGAALYYGTEKEQWLREVKVEELRKYYEEGHFKAG 272

Query: 208 GMIPKVLDAKNAIENGCPKVIIAS 231
M PKVL A IE G + IIA
Sbjct: 273 SMGPKVLAAIRFIEWGGERAIIAH 296


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits            Score  E-value  Comments
CH52_RS04905  ENTSNTHTASED    29     0.009    Enterobactin synthetase component D signature.
		>ENTSNTHTASED#Enterobactin synthetase component D signature.

Length = 234

Score = 29.2 bits (65), Expect = 0.009
Identities = 15/57 (26%), Positives = 27/57 (47%), Gaps = 5/57 (8%)

Query: 84 GQP-----IYVSLSYSYPYIVCVVDKEPVGIDIEKISQRLDWRTLVTCFSTNEAHQI 135
QP ++ S+S+ + V+ ++ +GIDIEKI + L ++ QI
Sbjct: 76 RQPLWPDGLFGSISHCATTALAVISRQRIGIDIEKIMSQHTATELAPSIIDSDERQI 132


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits            Score  E-value  Comments
CH52_RS04910  NUCEPIMERASE    53     8e-09    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 52.9 bits (127), Expect = 8e-09
Identities = 54/266 (20%), Positives = 101/266 (37%), Gaps = 55/266 (20%)

Query: 2046 NTLLTGATGFLGAYLIEALQGYSHRIYCFIRADNEEIAWYKLMTNLNDYFS----EETVE 2101
L+TGA GF+G ++ + L H++ + D NLNDY+ + +E
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQV---VGID-----------NLNDYYDVSLKQARLE 47

Query: 2102 MM----LSNIEVIVGDFECMDDVVLPENMDTIIH----AGARTDHFGDDDEFEKVNVQGT 2153
++ ++ + D E M D+ + + + R + + N+ G
Sbjct: 48 LLAQPGFQFHKIDLADREGMTDLFASGHFERVFISPHRLAVR-YSLENPHAYADSNLTGF 106

Query: 2154 VDVIRLAQQHH-ARLIYVSTISV-GTYFDIDTEDVTFSEADVYKGQLLTSPYTRSKFYSE 2211
++++ + + L+Y S+ SV G + FS D + S Y +K +E
Sbjct: 107 LNILEGCRHNKIQHLLYASSSSVYG-----LNRKMPFSTDDSVDHPV--SLYAATKKANE 159

Query: 2212 LKVLEAVNN-GLDGRIVRVGNLTSPYNGRWHM------RNIKTNRFSMVMNDLLQLDCIG 2264
L + GL +R + P+ GR M + + + V N
Sbjct: 160 LMAHTYSHLYGLPATGLRFFTVYGPW-GRPDMALFKFTKAMLEGKSIDVYNY-------- 210

Query: 2265 VSMAEMPVDFSFVDTTARQIVALAQV 2290
+M DF+++D A I+ L V
Sbjct: 211 ---GKMKRDFTYIDDIAEAIIRLQDV 233


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits            Score  E-value  Comments
CH52_RS04915  TCRTETA         32     0.004    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 32.1 bits (73), Expect = 0.004
Identities = 61/337 (18%), Positives = 127/337 (37%), Gaps = 33/337 (9%)

Query: 7 TLKVRLISNFLQLIITTAFIPFIALYLTDMLS----QSIVGIYLVGLVVLKFPLSIISGY 62
L V L + L + +P + L D++ + GI L +++F + + G
Sbjct: 6 PLIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGA 65

Query: 63 LIEIFPKKLLVLIYQATMVIMLVFMGVFGSHQLWQI-IGFCVAYAIFTIVWGLQFPVMDT 121
L + F ++ ++L+ A + M + LW + IG VA + G V
Sbjct: 66 LSDRFGRRPVLLVSLAGAAVDYAIMAT--APFLWVLYIGRIVAG-----ITGATGAVAGA 118

Query: 122 LIMDAITEDVEHYIYKISYWMTNLSVAIGALLGGLMYGYSMLLLFLIAACIFLIVLFILY 181
I D D + + G +LGGLM G+S F AA + +
Sbjct: 119 YIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGC 178

Query: 182 IWLPQDRNQVKQSDDKRHASRYQKLQIMNIFRSYKLVLKDRNYMLLISGFSIIMMGEFSI 241
LP+ ++ + + + L+ F + ++G+
Sbjct: 179 FLLPESHKGERRPLRREALNPLASFRWARGMTVVAA--------LMAVFFIMQLVGQVPA 230

Query: 242 SSYIAIRLKDQF--ETISIGSYDITGAKMLAILLMINTVVVILLTYSISKVVLKIDFKKA 299
+ + I +D+F + +IG LA +++++ ++T ++ ++ ++A
Sbjct: 231 ALW-VIFGEDRFHWDATTIGI-------SLAAFGILHSLAQAMITGPVAA---RLGERRA 279

Query: 300 LITGLLIYIVGYSGLTYLNQFGLLVVFMIIATVGEII 336
L+ G++ GY L + + + M++ G I
Sbjct: 280 LMLGMIADGTGYILLAFATRGWMAFPIMVLLASGGIG 316


35  CH52_RS05180  CH52_RS05240  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
CH52_RS05180  0       15       3.386883   (S)-acetoin forming diacetyl reductase
CH52_RS05185  0       14       3.282014   MFS transporter
CH52_RS05190  -1      15       2.698681   bifunctional transcriptional
CH52_RS05200  1       17       3.550130   staphyloferrin B biosynthesis decarboxylase
CH52_RS05205  1       16       3.336793   staphyloferrin B biosynthesis citrate synthase
CH52_RS05210  2       16       3.453927   3-(L-alanin-3-ylcarbamoyl)-2-[(2-
CH52_RS05215  1       15       3.157247   L-2,3-diaminopropanoate--citrate ligase SbnE
CH52_RS05220  0       11       2.807616   staphyloferrin B export MFS transporter
CH52_RS05225  -1      11       3.113918   staphyloferrin B biosynthesis protein SbnC
CH52_RS05230  -2      9        2.662489   N-[(2S)-2-amino-2-carboxyethyl]-L-glutamate
CH52_RS05235  -3      10       1.596073   2,3-diaminopropionate biosynthesis protein SbnA
CH52_RS05240  1       18       1.998882   staphyloferrin B ABC transporter
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits            Score  E-value  Comments
CH52_RS05185  DHBDHDRGNASE    128    4e-38    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 128 bits (323), Expect = 4e-38
Identities = 66/250 (26%), Positives = 113/250 (45%), Gaps = 2/250 (0%)

Query: 5 KVALVTGGAQGIGFKIAERLVEDGFKVAVVDFNEEGAKAAALKLSSDGTKAIAIKADVSN 64
K+A +TG AQGIG +A L G +A VD+N E + L ++ A A ADV +
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRD 68

Query: 65 RDDVFNAVRQTAAQFGDFHVMVNNAGLGPTTPIDTITEEQFKTVYGVNVAGVLWGIQAAH 124
+ + + G ++VN AG+ I ++++E+++ + VN GV ++
Sbjct: 69 SAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVS 128

Query: 125 EQFKKFNHGGKIINATSQAGVEGNPGLSLYCSTKFAVRGLTQVAAQDLASEGITVNAFAP 184
+ G I+ S ++ Y S+K A T+ +LA I N +P
Sbjct: 129 KYMMD-RRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSP 187

Query: 185 GIVQTPMMESIAVATAEEAGKPEAWGWEQFTSQIALGRVSQPEDVSNVVSFLAGKDSDYI 244
G +T M S+ + E F + I L ++++P D+++ V FL + +I
Sbjct: 188 GSTETDMQWSLWADENGAEQVIKGSL-ETFKTGIPLKKLAKPSDIADAVLFLVSGQAGHI 246

Query: 245 TGQTIIVDGG 254
T + VDGG
Sbjct: 247 TMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
CH52_RS05215  PF04183  511    e-178    IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 511 bits (1317), Expect = e-178
Identities = 145/592 (24%), Positives = 257/592 (43%), Gaps = 40/592 (6%)

Query: 25 VNQTILNRVKTRVMHQLVSSLIYENIVVYKASYQDGVGHFTIEGHDSEYRFTAEKTHSFD 84
+N + V R++ +++S L YE + + A Q G + I +++RF AE+ +
Sbjct: 1 MNHKDWDLVNRRLVAKMLSELEYEQV--FHAESQ-GDDRYCINLPGAQWRFIAERG-IWG 56

Query: 85 RIRITSPIERVVGDEADTTTDYTQLLREAVFTFPKNDEKLEQFIVELLQTELKDTQSMQY 144
+ I + R AD LL + +D + + + +L T L D Q ++
Sbjct: 57 WLWIDAQTLRC----ADEPVLAQTLLMQLKQVLSMSDATVAEHMQDLYATLLGDLQLLKA 112

Query: 145 RESNPPATPETFN-DYEFYAMEGHQYHPSYKSRLGFTLSDNLKFGPDFVPNVKLQWLAID 203
R + N D + GH K R G+ ++ P++ +L WLA+
Sbjct: 113 RRGLSASDLINLNADRLQCLLSGHPKFVFNKGRRGWGKEALERYAPEYANTFRLHWLAVK 172

Query: 204 KDKVETTVSRNVVVNEMLRQQVGDKTYEHFVQQIEASGKHVNDVEMIPVHPWQFEHVIQV 263
++ + + ++++L + + + F Q + +G N + +PVHPWQ++ I
Sbjct: 173 REHMIWRCDNEMDIHQLLTAAMDPQEFARFSQVWQENGLDHNWL-PLPVHPWQWQQKIAT 231

Query: 264 DLAEERLNGTVLWLGESDELYHPQQSIRTMSPIDTT-KYYLKVPISITNTSTKRVLAPHT 322
D + G ++ LGE + + QQS+RT++ +K+P++I NTS R +
Sbjct: 232 DFIADFAEGRMVSLGEFGDQWLAQQSLRTLTNASRRGGLDIKLPLTIYNTSCYRGIPGRY 291

Query: 323 IENAAQITDWLKQIQQQDMYLKDE----LKTVFLGEVLGQSYLNTQLSPYKQTQVYGALG 378
I + WL+Q+ D L L G V + Y +PY+ ++ LG
Sbjct: 292 IAAGPLASRWLQQVFATDATLVQSGAVILGEPAAGYVSHEGYAALARAPYRYQEM---LG 348

Query: 379 VIWRENIYHMLIDEEDAIPFNALYASDKDGLPFIEKWIKQYG--SEAWTKQFLAVAIRPM 436
VIWREN L +E + L D++ P +I + G +E W Q V + P+
Sbjct: 349 VIWRENPCRWLKPDESPVLMATLMECDENNQPLAGAYIDRSGLDAETWLTQLFRVVVVPL 408

Query: 437 IHMLYYHGIAFESHAQNMMLIHENGWPTRIALKDFHDGVRFKREHLSEAASHLTLKPMPE 496
H+L +G+A +H QN+ L + G P R+ LKDF +R +E E S +P+
Sbjct: 409 YHLLCRYGVALIAHGQNITLAMKEGVPQRVLLKDFQGDMRLVKEEFPEMDS------LPQ 462

Query: 497 AHKKVNSNSFIETDDERLVRDFLH---DAFFFINIAEIILFIEKQYGIDEQRQWQWVKDI 553
+ V S RL D+L F+ + I + + G+ E+R +Q + +
Sbjct: 463 EVRDVTS---------RLSADYLIHDLQTGHFVTVLRFISPLMVRLGVPERRFYQLLAAV 513

Query: 554 IEAYQEAFPELNN-YQHFDLFEPTIQVEKLTTRRL-LSDSELRIHHVTNPLG 603
+ Y + P+++ + F LF P I L +L D + + N L
Sbjct: 514 LSDYMKKHPQMSERFALFSLFRPQIIRVVLNPVKLTWPDLDGGSRMLPNYLE 565


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
CH52_RS05220  PF04183  301    4e-97    IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 301 bits (772), Expect = 4e-97
Identities = 118/540 (21%), Positives = 212/540 (39%), Gaps = 63/540 (11%)

Query: 3 NKELIQHAAYAAIERILNEYFREENLYQVPPQNHQWSIQLSELE-TLTGEFRYWSAMGHH 61
N + + ++L+E E+ + + ++ I L + E W G
Sbjct: 2 NHKDWDLVNRRLVAKMLSELEYEQVFHAESQGDDRYCINLPGAQWRFIAERGIW---GW- 57

Query: 62 MYHPEVWLIDGKSKKITTYKEAIARILQHMAQSADNQTA-VQQHMAQIMSDI--DNSIHR 118
ID ++ + +L + Q A V +HM + + + D + +
Sbjct: 58 ------LWIDAQTLRCADEPVLAQTLLMQLKQVLSMSDATVAEHMQDLYATLLGDLQLLK 111

Query: 119 TARYLQSNTIDYVEDRYIVSEQSLYLGHPFHPTPKSASGFSEADLEKYAPECHTSFQLHY 178
R L ++ + + Q L GHP K G+ + LE+YAPE +F+LH+
Sbjct: 112 ARRGLSASDL---INLNADRLQCLLSGHPKFVFNKGRRGWGKEALERYAPEYANTFRLHW 168

Query: 179 LAVHQD-------------VLLTRYVEGKEDQVEKVLYQLADIDISEIPKDFILLPTHPY 225
LAV ++ LLT ++ +E ++Q +D +++ LP HP+
Sbjct: 169 LAVKREHMIWRCDNEMDIHQLLTAAMDPQEFARFSQVWQENGLD-----HNWLPLPVHPW 223

Query: 226 QINVLRQHPQYMQYSEQGLIKDLGVSGDLVYPTSSVRTVF--SKALNIYLKLPIHVKITN 283
Q ++ +G + LG GD S+RT+ S+ + +KLP+ + T+
Sbjct: 224 QWQQK-IATDFIADFAEGRMVSLGEFGDQWLAQQSLRTLTNASRRGGLDIKLPLTIYNTS 282

Query: 284 FIRTNDLEQIERTIDAAQVIASVKDE-----------VETPHFKLMFEEGYRALLPNPLG 332
R I A++ + V + P + EGY AL P
Sbjct: 283 CYRGIPGRYIAAGPLASRWLQQVFATDATLVQSGAVILGEPAAGYVSHEGYAALARAPYR 342

Query: 333 QTVEPEMDLLTNSAMIVREGIPNY-HADKDIHVLASLFETMPD-SPISKLAQVIEQSGLA 390
EM +I RE + D+ ++A+L E + P+ I++SGL
Sbjct: 343 YQ---EM-----LGVIWRENPCRWLKPDESPVLMATLMECDENNQPL--AGAYIDRSGLD 392

Query: 391 PEAWLECYLDRTLLPILKLFSNTGISLEAHVQNTLIELKDGIPDVCFVRDLEG-ICLSRT 449
E WL ++P+ L G++L AH QN + +K+G+P ++D +G + L +
Sbjct: 393 AETWLTQLFRVVVVPLYHLLCRYGVALIAHGQNITLAMKEGVPQRVLLKDFQGDMRLVKE 452

Query: 450 IATEKQLVPNVVAASSPVVYAHDEAWHRLKYYVVVNHLGHLVSTIGKATRNEVVLWQLVA 509
E +P V + + A D H L+ V L + + + E +QL+A
Sbjct: 453 EFPEMDSLPQEVRDVTSRLSA-DYLIHDLQTGHFVTVLRFISPLMVRLGVPERRFYQLLA 511


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
CH52_RS05225  TCRTETA  80     2e-18    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 79.9 bits (197), Expect = 2e-18
Identities = 71/372 (19%), Positives = 149/372 (40%), Gaps = 24/372 (6%)

Query: 13 ILWLSQFIAIAGLTVLVPLLPIYMASLQNLSVVEIQLWSGIAIAAPAVTTMIASPIWGKL 72
++ + + G+ +++P+LP + L + ++ GI +A A+ +P+ G L
Sbjct: 9 VILSTVALDAVGIGLIMPVLPGLLRDL--VHSNDVTAHYGILLALYALMQFACAPVLGAL 66

Query: 73 GDKISRKWMVLRALLGLAVCLFLMALCTTPLQFVLVRLLQGLFGGVVDASSAFASAEAPA 132
D+ R+ ++L +L G AV +MA + R++ G+ G + A+ +
Sbjct: 67 SDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIADITDG 126

Query: 133 EDRGKVLGRLQSSVSAGSLVGPLIGGVTASILGFSALLMSIAVITFIVCIFGALKLIETT 192
++R + G + + G + GP++GG+ A + A + + + G L E+
Sbjct: 127 DERARHFGFMSACFGFGMVAGPVLGGLMGGF-SPHAPFFAAAALNGLNFLTGCFLLPESH 185

Query: 193 HMPKSQTPNINKGIRRSFQCLLCTQQTCRFIIVGVLANFAMYGMLTALSPLASSVNHTAI 252
+ SF+ + V A A++ ++ + + +++
Sbjct: 186 KGERRPLRREALNPLASFR--------WARGMTVVAALMAVFFIMQLVGQVPAALWVIFG 237

Query: 253 DDR-----SVIGFLQSAF-WTASILSAPLWGRFNDKSYVKSVYIFATIACGCSAILQGLA 306
+DR + IG +AF S+ A + G + + + IA G IL A
Sbjct: 238 EDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFA 297

Query: 307 TNIEFLMAARILQGLTYSAL--IQSVMFVVVNACHQ-QLKGTFVGTTNSMLVVGQIIGSL 363
T +L + +Q+++ V+ Q QL+G+ T+ + I+G L
Sbjct: 298 TRGWMAFPIMVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTS----LTSIVGPL 353

Query: 364 SGAAITSYTTPA 375
AI + +
Sbjct: 354 LFTAIYAASITT 365


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
CH52_RS05230  PF04183  316    e-103    IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 316 bits (812), Expect = e-103
Identities = 119/527 (22%), Positives = 208/527 (39%), Gaps = 46/527 (8%)

Query: 79 RVSKQPLTAAEFWQTIANMNCDLSHEWEVARVEEGLTTAATQLAKQLSELDLASHPFV-- 136
R + +P+ A + + +S +++ T L + L++ +
Sbjct: 66 RCADEPVLAQTLLMQLKQVL-SMSDATVAEHMQDLYATLLGDLQLLKARRGLSASDLINL 124

Query: 137 -MSEQFASLKDRPFHPLAKEKRGLREADYQVYQAELNQSFPLMVAAVKKTHMIHGDTANI 195
L P K +RG + + Y E +F L AVK+ HMI +
Sbjct: 125 NADRLQCLLSGHPKFVFNKGRRGWGKEALERYAPEYANTFRLHWLAVKREHMIWRCDNEM 184

Query: 196 DELENLTVPIKEQA----TDMLNDQGLSIDDYVLFPVHPWQYQHILPNVFATEISEKLVV 251
D + LT + Q + + + GL +++ PVHPWQ+Q + F + +E +V
Sbjct: 185 DIHQLLTAAMDPQEFARFSQVWQENGLD-HNWLPLPVHPWQWQQKIATDFIADFAEGRMV 243

Query: 252 LLPLKFGD-YLSSSSMRSLIDIGAPYN-HVKVPFAMQSLGALRLTPTRYMKNGEQAEQLL 309
L +FGD +L+ S+R+L + +K+P + + R P RY+ G A + L
Sbjct: 244 SLG-EFGDQWLAQQSLRTLTNASRRGGLDIKLPLTIYNTSCYRGIPGRYIAAGPLASRWL 302

Query: 310 RQLIEKDEALAKYVMV-CDETA-------WWSYMGQDNDIFKDQLGHLTVQLRKYPEVLA 361
+Q+ D L + V E A ++ + + +++ LG V R+ P
Sbjct: 303 QQVFATDATLVQSGAVILGEPAAGYVSHEGYAALARAPYRYQEMLG---VIWRENPCRWL 359

Query: 362 KNDTQQLVSMAALAANDRTLYQMICGKDNISKNDVMTLFEDIAQVFLKVTLSFM-QYGAL 420
K D + V MA L D + + S D T + +V + + +YG
Sbjct: 360 KPD-ESPVLMATLMECDENNQPLAGAYIDRSGLDAETWLTQLFRVVVVPLYHLLCRYGVA 418

Query: 421 PELHGQNILLSFEDGRVQKCVLRD-HDTVRIYKPWLTAHQLSLPKYV--VREDTPNTLIN 477
HGQNI L+ ++G Q+ +L+D +R+ K SLP+ V V +
Sbjct: 419 LIAHGQNITLAMKEGVPQRVLLKDFQGDMRLVKEEFPEMD-SLPQEVRDVTSRLSADYLI 477

Query: 478 EDLETFFAYFQTLAVSVNLYAIIDAIQDLFGVSEHELMSLLKQILKNEVATISWVTTDQL 537
DL+T V + I + GV E LL +L + + Q+
Sbjct: 478 HDLQTGHF--------VTVLRFISPLMVRLGVPERRFYQLLAAVLSDYMK-----KHPQM 524

Query: 538 AVRHILFDKQTWPFKQILLP---LLY-QRDSGGGSMPSGLTTVPNPM 580
+ R LF +++L L + D G +P+ L + NP+
Sbjct: 525 SERFALFSLFRPQIIRVVLNPVKLTWPDLDGGSRMLPNYLEDLQNPL 571


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS05235  SYCECHAPRONE  31     0.002    Gram-negative bacterial type III secretion SycE cha...
		>SYCECHAPRONE#Gram-negative bacterial type III secretion SycE chaperone signature.

Length = 130

Score = 31.2 bits (70), Expect = 0.002
Identities = 14/33 (42%), Positives = 16/33 (48%), Gaps = 1/33 (3%)

Query: 25 VDALTEALTAHAHNDFVQ-PLKPYLRQDPENGH 56
+D E T +HN F Q LKP L D GH
Sbjct: 54 LDNNDEKETLLSHNIFSQDILKPILSWDEVGGH 86


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS05245  FERRIBNDNGPP  70     7e-16    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 70.4 bits (172), Expect = 7e-16
Identities = 47/191 (24%), Positives = 78/191 (40%), Gaps = 38/191 (19%)

Query: 53 PKRVVTLYQGATDVAVSLGVKPVGAVES-----WTQKPKFEYIKNDLKDTKI-VGQEPAP 106
P R+V L ++ ++LG+ P G ++ W +P L D+ I VG P
Sbjct: 35 PNRIVALEWLPVELLLALGIVPYGVADTINYRLWVSEPP-------LPDSVIDVGLRTEP 87

Query: 107 NLEEISKLKPDLIVASKVRNEKVYDQLSKIAPTVSTDTVFKFKD----------TTKLMG 156
NLE ++++KP +V S + L++IAP F F D + M
Sbjct: 88 NLELLTEMKPSFMVWS-AGYGPSPEMLARIAPGR----GFNFSDGKQPLAMARKSLTEMA 142

Query: 157 KALGKEKEAEDLLKKYDDKVAAFQKDAKAKY--KDAWPLKASVVNF-RADHTRIYA-GGY 212
L + AE L +Y+D F + K ++ + A PL + H ++
Sbjct: 143 DLLNLQSAAETHLAQYED----FIRSMKPRFVKRGARPL--LLTTLIDPRHMLVFGPNSL 196

Query: 213 AGEILNDLGFK 223
EIL++ G
Sbjct: 197 FQEILDEYGIP 207


36  CH52_RS05350  CH52_RS05385  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
CH52_RS05350  0       12       -1.082778  DNA2/NAM7 family helicase
CH52_RS05355  1       11       0.049027   hypothetical protein
CH52_RS05360  1       11       -0.034295  TfoX/Sxy family protein
CH52_RS05365  1       11       0.006659   DUF6007 family protein
CH52_RS05370  3       12       -0.033363  tRNA-dihydrouridine synthase
CH52_RS15025  5       14       -1.374260  persulfide dioxygenase-sulfurtransferase CstB
CH52_RS05385  3       13       -1.106990  persulfide response sulfurtransferase CstA
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits        Score  E-value  Comments
CH52_RS05350  GPOSANCHOR  39     7e-05    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 39.3 bits (91), Expect = 7e-05
Identities = 18/134 (13%), Positives = 50/134 (37%), Gaps = 5/134 (3%)

Query: 515 INSEKTSIEEQVYHLDNETLRDNKEIEDLDNRINYIVKQIETLNELIKSIKESNKGFINK 574
++ ++++ L E +++ D ++ +I+ L ++++ +G +N
Sbjct: 76 LSFNNKALKDHNDELTEELSNAKEKLRKNDKSLSEKASKIQELEARKADLEKALEGAMNF 135

Query: 575 LKAMFNSEEDESYKDHNKEKQQLLTQQLELEKCKKNKHEDLVSKLKEKEKLIKQLTKVQL 634
+ K EK L ++ +LEK + + + + L + ++
Sbjct: 136 ST-----ADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEA 190

Query: 635 QLDELNSQLQELEA 648
+ EL L+
Sbjct: 191 RQAELEKALEGAMN 204


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits         Score  E-value  Comments
CH52_RS05360  LCRVANTIGEN  25     0.037    Low calcium response V antigen signature.
		>LCRVANTIGEN#Low calcium response V antigen signature.

Length = 326

Score = 25.4 bits (55), Expect = 0.037
Identities = 12/38 (31%), Positives = 24/38 (63%)

Query: 9 DLFLNHVNSNAVKTRKMMGEYIIYYDSVVIGGLYDNRL 46
++F N V ++ ++ K + Y + D+++ GG YDN+L
Sbjct: 57 EVFANRVITDDIELLKKILAYFLPEDAILKGGHYDNQL 94


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits         Score  E-value  Comments
CH52_RS05365  60KDINNERMP  26     0.007    60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 26.4 bits (58), Expect = 0.007
Identities = 8/38 (21%), Positives = 16/38 (42%), Gaps = 4/38 (10%)

Query: 14 WDLFFAIPMFLLFAYL----PNYNFITIFLNIVIIIFF 47
W F + P+F L ++ N+ F I + ++
Sbjct: 332 WLWFISQPLFKLLKWIHSFVGNWGFSIIIITFIVRGIM 369


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
CH52_RS05390  PF01206  61     4e-14    SirA family protein
		>PF01206#SirA family protein

Length = 76

Score = 60.5 bits (147), Expect = 4e-14
Identities = 20/70 (28%), Positives = 37/70 (52%)

Query: 118 KQFNYRGFQCPGPIVKISQEMKNIEVGDQIEVKVTDPGFPSDIKSWVKQTRHTLVKLDEN 177
+ + G CP PI+K + + + G+ + V TDPG D +S+ KQT H L++ E
Sbjct: 6 QSLDATGLNCPLPILKAKKTLATMNAGEVLYVMATDPGSVKDFESFSKQTGHELLEQKEE 65

Query: 178 NNGINAIIQK 187
+ + +++
Sbjct: 66 DGTYHFRLKR 75


37  CH52_RS05790  CH52_RS05845  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
CH52_RS05790  -2      17       -1.189737  ica operon transcriptional regulator IcaR
CH52_RS05795  -1      16       -1.501087  Wzz/FepE/Etk N-terminal domain-containing
CH52_RS05800  -2      15       -0.352611  polysaccharide biosynthesis tyrosine autokinase
CH52_RS05805  -2      15       0.632205   tyrosine-protein phosphatase
CH52_RS05810  -2      15       0.113031   GNAT family N-acetyltransferase
CH52_RS05815  -2      14       0.040126   peptide-methionine (S)-S-oxide reductase
CH52_RS05820  -1      15       1.228295   flavin reductase family protein
CH52_RS05825  12      16       3.809646   hypothetical protein
CH52_RS05830  10      15       2.812317   hypothetical protein
CH52_RS05835  10      14       2.520282   flavin reductase family protein
CH52_RS05840  8       13       2.600737   serine-rich repeat glycoprotein adhesin SasA
CH52_RS05845  7       13       2.525363   accessory Sec system protein translocase subunit
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
CH52_RS05795  HTHTETR  68     2e-16    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 67.7 bits (165), Expect = 2e-16
Identities = 16/48 (33%), Positives = 31/48 (64%)

Query: 2 KDKIIDNAITLFSEKGYDGTTLDDIAKSVNIKKASLYYHFDSKKSIYE 49
+ I+D A+ LFS++G T+L +IAK+ + + ++Y+HF K ++
Sbjct: 13 RQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS05815  SACTRNSFRASE  45     1e-08    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 45.3 bits (107), Expect = 1e-08
Identities = 24/101 (23%), Positives = 46/101 (45%), Gaps = 5/101 (4%)

Query: 48 EKNDEVIGYIN--GPVIKERYISDDLFKNVSINNSEGGYISVLGLVVAPNYQGQGIAGRL 105
E +D + Y+ G Y+ ++ + I ++ GY + + VA +Y+ +G+ L
Sbjct: 51 EDDDMDVSYVEEEGKAAFLYYLENNCIGRIKIRSNWNGYALIEDIAVAKDYRKKGVGTAL 110

Query: 106 LNYFETLAKNHHRHGVTLTCRE---SLISFYEKYGYRNEGV 143
L+ AK +H G+ L ++ S FY K+ + V
Sbjct: 111 LHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFIIGAV 151


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS05825  NUCEPIMERASE  27     0.043    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 27.4 bits (61), Expect = 0.043
Identities = 9/32 (28%), Positives = 14/32 (43%)

Query: 23 IPRPIAFVTTLNQDASVNAAPFSFFNIVNNHP 54
IP T + + AP+ +NI N+ P
Sbjct: 234 IPHADTQWTVETGTPAASIAPYRVYNIGNSSP 265


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS05830  ENTEROTOXINA  28     0.005    Heat-labile enterotoxin A chain signature.
		>ENTEROTOXINA#Heat-labile enterotoxin A chain signature.

Length = 258

Score = 28.4 bits (63), Expect = 0.005
Identities = 17/54 (31%), Positives = 27/54 (50%), Gaps = 2/54 (3%)

Query: 30 IELFEHTFGLQKELVKYVGIAEATTAALYSASFINKNISRLASLSTIGILSVAA 83
I L++H G Q V+Y +T+ +L SA ++I L+ ST I +A
Sbjct: 57 INLYDHARGTQTGFVRYDDGYVSTSLSLRSAHLAGQSI--LSGYSTYYIYVIAT 108


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits         Score  E-value  Comments
CH52_RS05850  ICENUCLEATIN  65    3e-12    Ice nucleation protein signature.
		>ICENUCLEATIN#Ice nucleation protein signature.

Length = 1258

Score = 64.8 bits (157), Expect = 3e-12
Identities = 239/1048 (22%), Positives = 416/1048 (39%), Gaps = 4/1048 (0%)

Query: 687 ATQDNSGNAVTNTVTGLPSGLTFDSTNNTISGTPTNIGTSTISIVSTDASGNKTTTTFKY 746
+ + +T + S T+ +TI ST + T+
Sbjct: 107 HHRADYVACTEMQAGPGSPDVTSEVKVGNRSLPVTDDIDATIESGSTQPTQTIEIATYGS 166

Query: 747 EVTRNSMSDSVSTSGSTQQSQSVSTSKADSQSASTSTSGSIVVSTSASTSKSTSVSLSDS 806
++ S ++ GST+ + ST A S T+ + S +V+ ST + S +
Sbjct: 167 TLSGTHQSQLIAGYGSTETAGDSSTLIAGYGSTGTAGADSTLVAGYGSTQTAGEESSQMA 226

Query: 807 VSASKSLSTSESNSVSSSTSTSLVNSQSVSSSMSGSVSKSTSLSDSISNSNSTEKSESLS 866
S S+ + ST S + GS + S + ST+ ++ S
Sbjct: 227 GYGSTQTGMKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGS 286

Query: 867 TSTSDSLRTSTSLSDSLSMSTSGSLSKSQSLSTSISGSSSTSASLSDSTSNAISTSTSLS 926
T+ T T+ +DS ++ GS + ST +G ST + S A ST +
Sbjct: 287 DLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTA 346

Query: 927 ESASTSDSISISNSIANSQSASTSKSDSQSTSISLSTSDSKSMSTSESLSDSTSTSGSVS 986
S+ + S A S+ T+ S T+ S + ST + +DS+ +G
Sbjct: 347 GDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAG--Y 404

Query: 987 GSLSIAASQSVSTSTSDSMSTSEIVSDSISTSGSLSASDSKSMSVSSSMSTSQSGSTSES 1046
GS A +S T+ S T++ SD + GS + S ++ ST +G S
Sbjct: 405 GSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSL 464

Query: 1047 LSDSQSTSDSDSKSLSLSTSQSGSTSTSTSTSASVRTSESQSTSGSMSASQSDSMSISTS 1106
+ ST + S + S ST+ S+ + S + GS + S + +
Sbjct: 465 TAGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTAGYGSTQTAQN 524

Query: 1107 FSDSTSDSKSASTASSESISQSASTSTSGSVSTSTSLSTSNSERTSTSVSDSTSLSTSES 1166
SD + S STA + S + ST + S + S +T+ SD T+ S
Sbjct: 525 ESDLITGYGSTSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQTAREGSDLTAGYGSTG 584

Query: 1167 DSISESTSTSDSISEAISASESTSISLSESNSTSDSESQSASAFLSESLSESTSESTSES 1226
+ S+S+ + S ++ S+ + S T+ +S + + S S + ++S+ +
Sbjct: 585 TAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGY--GSTSTAGADSSLIA 642

Query: 1227 VSSSTSESTSLSDSTSESGSTSTSLSNSTSGSASISTSTSISESTSTFKSESVSTSLSMS 1286
ST + S T+ GST T+ S + STST+ ++S+ S T+ S
Sbjct: 643 GYGSTQTAGYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGADSSLIAGYGSTQTAGYNS 702

Query: 1287 TSTSLSNSTSLSTSLSDSTSDSKSDSLSTSMSTSDSISTSKSDSISTSTSLSGSTSESES 1346
T+ ST + SD TS S S + + S+ + S + S+ +G S +
Sbjct: 703 ILTAGYGSTQTAQEGSDLTSGYGSTSTAGADSSLIAGYGSTQTASYHSSLTAGYGSTQTA 762

Query: 1347 DSTSSSESKSDSTSMSISMSQSTSGSTSTSTSTSLSDSTSTSLSLSASMNQSGVDSNSAS 1406
S + STS + + S +G ST T+ S T+ S + +S + + S
Sbjct: 763 REQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYHSILTAGYGSTQTAQERSDLTTGYGS 822

Query: 1407 QSASNSTSTSTSESDSQSTSTYTSQSTSQSESTSTSTSLSDSTSISKSTSQSGSTSTSAS 1466
S + + S+ + S T+ Y S T+ ST T+ SD T+ STS +G S+ +
Sbjct: 823 TSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYDSSLIA 882

Query: 1467 LSGSESESDSQSISTSASESTSESASTSLSDSTSTSNSGSASTSTSLSNSASASESDSSS 1526
GS + SI T+ ST + S + S S + S+ ++ S + S
Sbjct: 883 GYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYESSLIAGYGSTQTASFKS 942

Query: 1527 TSLSDSTSASMQSSESDSQSTSASLSDSLSTSTSNRMSTIASLSTSVSTSESGSTSESTS 1586
T ++ S+ +S + S S + S+ + ST +G S T+
Sbjct: 943 TLMAGYGSSQTAREQSSLTAGYGSTSMAGYDSSLIAGYGSTQTAGYQSTLTAGYGSTQTA 1002

Query: 1587 ESDSTSTSLSDSQSTSRSTSASGSASTSTSTSDSRSTSASTSTSMRTSTSDSQSMSLSTS 1646
E ST T+ S +T+ + S+ + S+ TS RS + S S S + S
Sbjct: 1003 EHSSTLTAGYGSTATAGADSSLIAGYGSSLTSGIRSFLTAGYGSTLISGLRSVLTAGYGS 1062

Query: 1647 TSTSMSDSTSLSDSVSDSTSDSTSASTSGSMSVSISLSDSSNISGSNSTSTSLSTSDSMS 1706
+ S S+ + S+ + S+ +G S I+ + S I+G S+ T+ S +S
Sbjct: 1063 SLISGRRSSLTAGYGSNQIASHRSSLIAGPESTQITGNRSMLIAGKGSSQTAGYRSTLIS 1122

Query: 1707 GSVSVSTSTSLSDSISGSTSVSDSSSTS 1734
G+ SV + I+G+ S + S
Sbjct: 1123 GADSVQMAGERGKLIAGADSTQTAGDRS 1150



Score = 59.4 bits (143), Expect = 2e-10
Identities = 233/976 (23%), Positives = 399/976 (40%), Gaps = 18/976 (1%)

Query: 789 VSTSASTSKSTSVSLSDSVSASKSLSTSESNSVSSSTSTSLVNSQSVSSSMSGSVSKSTS 848
V+ + + S ++ V + ++ S S +Q++ + GS T
Sbjct: 113 VACTEMQAGPGSPDVTSEVKVGNRSLPVTDDIDATIESGSTQPTQTIEIATYGSTLSGTH 172

Query: 849 LSDSISNSNSTEKSESLSTSTSDSLRTSTSLSDSLSMSTSGSLSKSQSLSTSISGSSSTS 908
S I+ STE + ST + T T+ +DS ++ GS + S+ ++G ST
Sbjct: 173 QSQLIAGYGSTETAGDSSTLIAGYGSTGTAGADSTLVAGYGSTQTAGEESSQMAGYGSTQ 232

Query: 909 ASLSDSTSNAISTSTSLSESASTSDSISISNSIANSQSASTSKSDSQSTSISLSTSDSKS 968
+ S A ST + S+ IA S T+ DS T+ ST ++
Sbjct: 233 TGMKGSDLTAGYGSTGTAGDDSSL--------IAGYGSTQTAGEDSSLTAGYGSTQTAQK 284

Query: 969 MSTSESLSDSTSTSGSVS------GSLSIAASQSVSTSTSDSMSTSEIVSDSISTSGSLS 1022
S + ST T+G+ S GS A +S T+ S T++ SD + GS
Sbjct: 285 GSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTG 344

Query: 1023 ASDSKSMSVSSSMSTSQSGSTSESLSDSQSTSDSDSKSLSLSTSQSGSTSTSTSTSASVR 1082
+ S ++ ST +G S + ST + S + S T+ + S+ +
Sbjct: 345 TAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGY 404

Query: 1083 TSESQSTSGSMSASQSDSMSISTSFSDSTSDSKSASTASSESISQSASTSTSGSVSTSTS 1142
S + S + S + SD T+ S TA +S + ST + S+
Sbjct: 405 GSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSL 464

Query: 1143 LSTSNSERTSTSVSDSTSLSTSESDSISESTSTSDSISEAISASESTSISLSESNSTSDS 1202
+ S +T+ SD T+ S S + ES+ + S + ST + S T+ +
Sbjct: 465 TAGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTAGYGSTQTAQN 524

Query: 1203 ESQSASAFLSESLSESTSESTSESVSSSTSESTSLSDSTSESGSTSTSLSNSTSGSASIS 1262
ES + + S S + + S+ + ST ++ S T+ GST T+ S + S
Sbjct: 525 ESDLITGY--GSTSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQTAREGSDLTAGYGS 582

Query: 1263 TSTSISESTSTFKSESVSTSLSMSTSTSLSNSTSLSTSLSDSTSDSKSDSLSTSMSTSDS 1322
T T+ S+S+ S T+ S+ T+ ST + S T+ S S + + S+ +
Sbjct: 583 TGTAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGADSSLIA 642

Query: 1323 ISTSKSDSISTSTSLSGSTSESESDSTSSSESKSDSTSMSISMSQSTSGSTSTSTSTSLS 1382
S + S +G S + S + STS + + S +G ST T+ S
Sbjct: 643 GYGSTQTAGYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGADSSLIAGYGSTQTAGYNS 702

Query: 1383 DSTSTSLSLSASMNQSGVDSNSASQSASNSTSTSTSESDSQSTSTYTSQSTSQSESTSTS 1442
T+ S + S + S S S + + S+ + S T++Y S T+ ST T+
Sbjct: 703 ILTAGYGSTQTAQEGSDLTSGYGSTSTAGADSSLIAGYGSTQTASYHSSLTAGYGSTQTA 762

Query: 1443 TSLSDSTSISKSTSQSGSTSTSASLSGSESESDSQSISTS--ASESTSESASTSLSDSTS 1500
S T+ STS +G+ S+ + GS + SI T+ S T++ S + S
Sbjct: 763 REQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYHSILTAGYGSTQTAQERSDLTTGYGS 822

Query: 1501 TSNSGSASTSTSLSNSASASESDSSSTSLSDSTSASMQSSESDSQSTSASLSDSLSTSTS 1560
TS +G+ S+ + S + +S T+ ST + ++S+ + S S + S+ +
Sbjct: 823 TSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYDSSLIA 882

Query: 1561 NRMSTIASLSTSVSTSESGSTSESTSESDSTSTSLSDSQSTSRSTSASGSASTSTSTSDS 1620
ST + S+ T+ GST + SD T+ S S + S+ +G ST T++ S
Sbjct: 883 GYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYESSLIAGYGSTQTASFKS 942

Query: 1621 RSTSASTSTSMRTSTSDSQSMSLSTSTSTSMSDSTSLSDSVSDSTSDSTSASTSGSMSVS 1680
+ S+ S + STS + S + S + ST + GS +
Sbjct: 943 TLMAGYGSSQTAREQSSLTAGYGSTSMAGYDSSLIAGYGSTQTAGYQSTLTAGYGSTQTA 1002

Query: 1681 ISLSDSSNISGSNSTSTSLSTSDSMSGSVSVSTSTSLSDSISGSTSVSDSSSTSTSTSLS 1740
S + GS +T+ + S+ + GS S S + GST +S S T+ S
Sbjct: 1003 EHSSTLTAGYGSTATAGADSSLIAGYGSSLTSGIRSFLTAGYGSTLISGLRSVLTAGYGS 1062

Query: 1741 DSMSQSQSTSTSASGS 1756
+S +S+ T+ GS
Sbjct: 1063 SLISGRRSSLTAGYGS 1078



Score = 59.0 bits (142), Expect = 2e-10
Identities = 239/1087 (21%), Positives = 429/1087 (39%), Gaps = 12/1087 (1%)

Query: 837 SSMSGSVSKSTSLSDSISNSNSTEKSESLSTSTSDSLRTSTSLSDSLSMSTSGSLSKSQS 896
+S + + + + + S +++ R+ D + SGS +Q+
Sbjct: 99 TSAMQFILHHRADYVACTEMQAGPGSPDVTSEVKVGNRSLPVTDDIDATIESGSTQPTQT 158

Query: 897 LSTSISGSSSTSASLSDSTSNAISTSTSLSESASTSDSISISNSIANSQSASTSKSDSQS 956
+ + GS+ + S + ST T+ +S I+ S + + ST + S
Sbjct: 159 IEIATYGSTLSGTHQSQLIAGYGSTETA----GDSSTLIAGYGSTGTAGADSTLVAGYGS 214

Query: 957 TSISLSTSDSKSMSTSESLSDSTSTSGSVSGSLSIAASQSVSTSTSDSMSTSEIVSDSIS 1016
T + S + S S + GS A S + S T+ S +
Sbjct: 215 TQTAGEESSQMAGYGSTQTGMKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTA 274

Query: 1017 TSGSLSASDSKSMSVSSSMSTSQSGSTSESLSDSQSTSDSDSKSLSLSTSQSGSTSTSTS 1076
GS + S + ST +G+ S ++ ST + +S + S T+ S
Sbjct: 275 GYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGS 334

Query: 1077 TSASVRTSESQSTSGSMSASQSDSMSISTSFSDSTSDSKSASTASSESISQSASTSTSGS 1136
+ S + S + S + S T+ S TA S + ST +
Sbjct: 335 DLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTA 394

Query: 1137 VSTSTSLSTSNSERTSTSVSDSTSLSTSESDSISESTSTSDSISEAISASESTSISLSES 1196
+ S+ ++ S +T+ S T+ S + S T+ S + +S+ I+ S
Sbjct: 395 GADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGS 454

Query: 1197 NSTSDSESQSASAFLSESLSESTSESTSESVSSSTSESTSLSDSTSESGSTSTSLSNSTS 1256
T+ +S + + S ++ S+ T+ S+ST+ S + S T+ S T+
Sbjct: 455 TQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTA 514

Query: 1257 GSASISTSTSISESTSTFKSESVSTSLSMSTSTSLSNSTSLSTSLSDSTSDSKSDSLSTS 1316
G S T+ + S+ + + S S + + S + S T+ S+ + S + S
Sbjct: 515 GYGSTQTAQNESDLITGYGSTSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQTAREGS 574

Query: 1317 MSTSDSISTSKSDSISTSTSLSGSTSESESDSTSSSESKSDSTSMSISMSQSTSGSTSTS 1376
T+ ST + S S+ + GST + S+ ++ S T+ S+ + GSTST+
Sbjct: 575 DLTAGYGSTGTAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTA 634

Query: 1377 TSTSLSDSTSTSLSLSASMNQSGVDSNSASQSASNSTSTSTSESDSQSTSTYTSQSTSQS 1436
+ S + S + + S + S T+ S S + + + + S
Sbjct: 635 GADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGADSSLIAGYGS 694

Query: 1437 ESTSTSTSLSDSTSISKSTSQSGSTSTSASLSGSESESDSQSISTSASESTSESASTSLS 1496
T+ S+ + S T+Q GS TS S S + +DS I+ S T+ S+ +
Sbjct: 695 TQTAGYNSILTAGYGSTQTAQEGSDLTSGYGSTSTAGADSSLIAGYGSTQTASYHSSLTA 754

Query: 1497 DSTSTSNSGSASTSTSLSNSASASESDSSSTSLSDSTSASMQSSESDSQSTSASLSDSLS 1556
ST + S T+ S S + +DSS + ST + S + S + S
Sbjct: 755 GYGSTQTAREQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYHSILTAGYGSTQTAQERS 814

Query: 1557 TSTSNRMSTIASLSTSVSTSESGSTSESTSESDSTSTSLSDSQSTSRSTSASGSASTSTS 1616
T+ ST + + S + GST + S T+ S + S +G STST+
Sbjct: 815 DLTTGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTA 874

Query: 1617 TSDSRSTSASTSTSMRTSTSDSQSMSLSTSTSTSMSDSTSLSDSVSDSTSDSTSASTSGS 1676
DS + ST S + ST T+ SD T+ S S + +S+ + GS
Sbjct: 875 GYDSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYESSLIAGYGS 934

Query: 1677 MSVSISLSDSSNISGSNSTSTSLSTSDSMSGSVSVSTSTSLSDSISGSTSVSDSSSTSTS 1736
+ S GS+ T+ S+ + GS S++ S + GST + ST T+
Sbjct: 935 TQTASFKSTLMAGYGSSQTAREQSSLTAGYGSTSMAGYDSSLIAGYGSTQTAGYQSTLTA 994

Query: 1737 TSLSDSMSQSQSTSTSASGSLSTS--ISTSMSMSASTSSSQSTSVSTSLSTSDSISDSTS 1794
S ++ ST T+ GS +T+ S+ ++ S+ +S S T+ S IS S
Sbjct: 995 GYGSTQTAEHSSTLTAGYGSTATAGADSSLIAGYGSSLTSGIRSFLTAGYGSTLISGLRS 1054

Query: 1795 ISISGSQSTVESESTSDSTSISDSESLSTSDSDSTSTSTSDSTSGSTSTSISESLSTSGS 1854
+ +G S++ S S T+ S +++ S + S +G+ S I+ G
Sbjct: 1055 VLTAGYGSSLISGRRSSLTAGYGSNQIASHRSSLIAGPESTQITGNRSMLIA------GK 1108

Query: 1855 GSTSVSDSTSMSESDSTSVSMSQDKSDSTSISDSESVSTSTSTSLSTSDSTSTSESLSTS 1914
GS+ + S S + SV M+ ++ + +DS + S L+ ++S T+ S
Sbjct: 1109 GSSQTAGYRSTLISGADSVQMAGERGKLIAGADSTQTAGDRSKLLAGNNSYLTAGDRSKL 1168

Query: 1915 MSGSQSI 1921
+G+ I
Sbjct: 1169 TAGNDCI 1175



Score = 57.8 bits (139), Expect = 5e-10
Identities = 207/898 (23%), Positives = 359/898 (39%), Gaps = 6/898 (0%)

Query: 1044 SESLSDSQSTSDSDSKSLSLSTSQSGSTSTSTSTSASVRTSESQSTSGSMSASQSDSMSI 1103
S S +Q+ + S T QS + ST + +S + GS + +DS +
Sbjct: 150 SGSTQPTQTIEIATYGSTLSGTHQSQLIAGYGSTETAGDSSTLIAGYGSTGTAGADSTLV 209

Query: 1104 STSFSDSTSDSKSASTASSESISQSASTSTSGSVSTSTSLSTSNSERTSTSVSDSTSLST 1163
+ S T+ +S+ A S S + ST + +S + S T+
Sbjct: 210 AGYGSTQTAGEESSQMAGYGSTQTGMKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGED 269

Query: 1164 SESDSISESTSTSDSISEAISASESTSISLSESNSTSDSESQSASAFLSESLSESTSEST 1223
S + ST T+ S+ + ST + ++S+ + S + S + S T
Sbjct: 270 SSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQT 329

Query: 1224 SESVSSSTSESTSLSDSTSESGSTSTSLSNSTSGSASISTSTSISESTSTFKSESVSTSL 1283
++ S T+ S + +S + S T+G S T+ S T+ S+ +
Sbjct: 330 AQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYG 389

Query: 1284 SMSTSTSLSNSTSLSTSLSDSTSDSKSDSLSTSMSTSDSISTSKSDSISTSTSLSGSTSE 1343
S T+ + S+ + S + +S + S T+ S + ST T+ S+
Sbjct: 390 STGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLI 449

Query: 1344 SESDSTSSSESKSDSTSMSISMSQSTSGSTSTSTSTSLSDSTSTSLSLSASMNQSGVDSN 1403
+ ST ++ S T+ S + GS T+ S S + S ++ +
Sbjct: 450 AGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYG 509

Query: 1404 SASQSASNSTSTSTSESDSQSTSTYTSQSTSQSESTSTSTSLSDSTSISKSTSQSGSTST 1463
S + ST T+ +ESD + TS + + S + S ++ S T+ GST T
Sbjct: 510 STLTAGYGSTQTAQNESDLITGYGSTSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQT 569

Query: 1464 SASLSGSESESDSQSISTSASESTSESASTSLSDSTSTSNSGSASTSTSLSNSASASESD 1523
+ S + S + S S + ST + S+ +G ST T+ S +
Sbjct: 570 AREGSDLTAGYGSTGTAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYG 629

Query: 1524 SSSTSLSDSTSASMQSSESDSQSTSASLSDSLSTSTSNRMSTIASLSTSVSTSESGST-- 1581
S+ST+ +DS+ + S + S + ST T+ S + + S ST+ + S+
Sbjct: 630 STSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGADSSLI 689

Query: 1582 ----SESTSESDSTSTSLSDSQSTSRSTSASGSASTSTSTSDSRSTSASTSTSMRTSTSD 1637
S T+ +S T+ S T++ S S STST+ + S+ + S +T++
Sbjct: 690 AGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTSGYGSTSTAGADSSLIAGYGSTQTASYH 749

Query: 1638 SQSMSLSTSTSTSMSDSTSLSDSVSDSTSDSTSASTSGSMSVSISLSDSSNISGSNSTST 1697
S + ST T+ S + S ST+ + S+ +G S + S +G ST T
Sbjct: 750 SSLTAGYGSTQTAREQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYHSILTAGYGSTQT 809

Query: 1698 SLSTSDSMSGSVSVSTSTSLSDSISGSTSVSDSSSTSTSTSLSDSMSQSQSTSTSASGSL 1757
+ SD +G S ST+ + S I+G S + S T+ S +Q S +G
Sbjct: 810 AQERSDLTTGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYG 869

Query: 1758 STSISTSMSMSASTSSSQSTSVSTSLSTSDSISDSTSISISGSQSTVESESTSDSTSISD 1817
STS + S + S T+ S+ T+ S T+ S + S ST+ S
Sbjct: 870 STSTAGYDSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYESSLI 929

Query: 1818 SESLSTSDSDSTSTSTSDSTSGSTSTSISESLSTSGSGSTSVSDSTSMSESDSTSVSMSQ 1877
+ ST + ST + S T+ S + GS S + DS+ ++ ST + Q
Sbjct: 930 AGYGSTQTASFKSTLMAGYGSSQTAREQSSLTAGYGSTSMAGYDSSLIAGYGSTQTAGYQ 989

Query: 1878 DKSDSTSISDSESVSTSTSTSLSTSDSTSTSESLSTSMSGSQSISDSTSTSMSGSTST 1935
+ S + +ST T+ S +T+ ++S + GS S S +G ST
Sbjct: 990 STLTAGYGSTQTAEHSSTLTAGYGSTATAGADSSLIAGYGSSLTSGIRSFLTAGYGST 1047



Score = 50.9 bits (121), Expect = 7e-08
Identities = 206/894 (23%), Positives = 361/894 (40%), Gaps = 6/894 (0%)

Query: 733 TDASGNKTTTTFKYEVTRNSMSDSVSTSGSTQQSQSVSTSKADSQSASTSTSGSIVVSTS 792
T G+ T + T + S ++ GSTQ + ST A S T+ GS + +
Sbjct: 281 TAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGY 340

Query: 793 ASTSKSTSVSLSDSVSASKSLSTSESNSVSSSTSTSLVNSQSVSSSMSGSVSKSTSLSDS 852
ST + S + S + +S+ + ST S ++ GS + + S
Sbjct: 341 GSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSL 400

Query: 853 ISNSNSTEKSESLSTSTSDSLRTSTSLSDSLSMSTSGSLSKSQSLSTSISGSSSTSASLS 912
I+ ST+ + ST T+ T T+ S + GS + S+ I+G ST +
Sbjct: 401 IAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGE 460

Query: 913 DSTSNAISTSTSLSESASTSDSISISNSIANSQSA------STSKSDSQSTSISLSTSDS 966
DS+ A ST ++ S + S S A +S+ ST + ST + S
Sbjct: 461 DSSLTAGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTAGYGSTQ 520

Query: 967 KSMSTSESLSDSTSTSGSVSGSLSIAASQSVSTSTSDSMSTSEIVSDSISTSGSLSASDS 1026
+ + S+ ++ STS + + S IA S T++ +S+ T+ S + GS +
Sbjct: 521 TAQNESDLITGYGSTSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQTAREGSDLTAGY 580

Query: 1027 KSMSVSSSMSTSQSGSTSESLSDSQSTSDSDSKSLSLSTSQSGSTSTSTSTSASVRTSES 1086
S + S S+ +G S + S+ + S + QS T+ STS + S
Sbjct: 581 GSTGTAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGADSSL 640

Query: 1087 QSTSGSMSASQSDSMSISTSFSDSTSDSKSASTASSESISQSASTSTSGSVSTSTSLSTS 1146
+ GS + +S+ + S T+ S TA S S + + S+ + ST +
Sbjct: 641 IAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGADSSLIAGYGSTQTAGY 700

Query: 1147 NSERTSTSVSDSTSLSTSESDSISESTSTSDSISEAISASESTSISLSESNSTSDSESQS 1206
NS T+ S T+ S+ S STST+ + S I+ ST + S+ T+ S
Sbjct: 701 NSILTAGYGSTQTAQEGSDLTSGYGSTSTAGADSSLIAGYGSTQTASYHSSLTAGYGSTQ 760

Query: 1207 ASAFLSESLSESTSESTSESVSSSTSESTSLSDSTSESGSTSTSLSNSTSGSASISTSTS 1266
+ S + S ST+ + SS + S + S T+ S T+ S T+
Sbjct: 761 TAREQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYHSILTAGYGSTQTAQERSDLTTGY 820

Query: 1267 ISESTSTFKSESVSTSLSMSTSTSLSNSTSLSTSLSDSTSDSKSDSLSTSMSTSDSISTS 1326
S ST+ S ++ S T+ S T+ S + +S + S ST+ S+
Sbjct: 821 GSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYDSSL 880

Query: 1327 KSDSISTSTSLSGSTSESESDSTSSSESKSDSTSMSISMSQSTSGSTSTSTSTSLSDSTS 1386
+ ST T+ S + ST +++ SD T+ S S + S+ + S ++
Sbjct: 881 IAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYESSLIAGYGSTQTASF 940

Query: 1387 TSLSLSASMNQSGVDSNSASQSASNSTSTSTSESDSQSTSTYTSQSTSQSESTSTSTSLS 1446
S ++ + S+ + STS + +S + T + QS T+ S
Sbjct: 941 KSTLMAGYGSSQTAREQSSLTAGYGSTSMAGYDSSLIAGYGSTQTAGYQSTLTAGYGSTQ 1000

Query: 1447 DSTSISKSTSQSGSTSTSASLSGSESESDSQSISTSASESTSESASTSLSDSTSTSNSGS 1506
+ S T+ GST+T+ + S + S S S T+ ST +S S +G
Sbjct: 1001 TAEHSSTLTAGYGSTATAGADSSLIAGYGSSLTSGIRSFLTAGYGSTLISGLRSVLTAGY 1060

Query: 1507 ASTSTSLSNSASASESDSSSTSLSDSTSASMQSSESDSQSTSASLSDSLSTSTSNRMSTI 1566
S+ S S+ + S+ + S+ + S + + S ++ S+ T+ ST+
Sbjct: 1061 GSSLISGRRSSLTAGYGSNQIASHRSSLIAGPESTQITGNRSMLIAGKGSSQTAGYRSTL 1120

Query: 1567 ASLSTSVSTSESGSTSESTSESDSTSTSLSDSQSTSRSTSASGSASTSTSTSDS 1620
S + SV + + ++S T+ S + + S +G S T+ +D
Sbjct: 1121 ISGADSVQMAGERGKLIAGADSTQTAGDRSKLLAGNNSYLTAGDRSKLTAGNDC 1174



Score = 34.3 bits (78), Expect = 0.006
Identities = 116/530 (21%), Positives = 203/530 (38%), Gaps = 6/530 (1%)

Query: 1486 STSESASTSLSDSTSTSNSGSASTSTSLSNSASASESDSSSTSLSDSTSASMQSSESDSQ 1545
S + +D + + + S +++ T D+T S + + +
Sbjct: 100 SAMQFILHHRADYVACTEMQAGPGSPDVTSEVKVGNRSLPVTDDIDATIESGSTQPTQTI 159

Query: 1546 STSASLSDSLSTSTSNRMSTIASLSTSVSTSE--SGSTSESTSESDSTSTSLSDSQSTSR 1603
+ S T S ++ S T+ +S +G S T+ +DST + S T+
Sbjct: 160 EIATYGSTLSGTHQSQLIAGYGSTETAGDSSTLIAGYGSTGTAGADSTLVAGYGSTQTAG 219

Query: 1604 STSASGSASTSTSTSDSRSTSASTSTSMRTSTSDSQSMSLSTSTSTSMSDSTSLSDSVSD 1663
S+ + ST T S + S T+ DS ++ ST T+ DS+ + S
Sbjct: 220 EESSQMAGYGSTQTGMKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGST 279

Query: 1664 STSDSTSASTSGSMSVSISLSDSSNISGSNSTSTSLSTSDSMSGSVSVSTSTSLSDSISG 1723
T+ S T+G S + +DSS I+G ST T+ S +G S T+ SD +G
Sbjct: 280 QTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAG 339

Query: 1724 STSVSDSSSTSTSTSLSDSMSQSQSTSTSASGSLSTSISTSMSMSASTSSSQSTSVSTSL 1783
S + S+ + S + S+ +G ST + S ++ S T+
Sbjct: 340 YGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQ----TAQKGSDLTAGYGSTGTAG 395

Query: 1784 STSDSISDSTSISISGSQSTVESESTSDSTSISDSESLSTSDSDSTSTSTSDSTSGSTST 1843
+ S I+ S +G +ST + S T+ S+ + S T+ S +G ST
Sbjct: 396 ADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGST 455

Query: 1844 SISESLSTSGSGSTSVSDSTSMSESDSTSVSMSQDKSDSTSISDSESVSTSTSTSLSTSD 1903
+ S+ +G S + S+ + S S +S+ I+ S T+ S T+
Sbjct: 456 QTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTAG 515

Query: 1904 STSTSESLSTSMSGSQSISDSTSTSMSGSTSTSESNSMHPSDSMSMHHTHSTSTSRLSSE 1963
ST + + S + S ST+ + S + S +S+ ST T+R S+
Sbjct: 516 YGSTQTAQNESDLITGYGSTSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQTAREGSD 575

Query: 1964 ATTSTSESQSTLSATSEVTKHNGTPAQSEKRLPDTGDSIKQNGLLGGVMT 2013
T + + S +S + + T S G Q V+T
Sbjct: 576 LTAGYGSTGTAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLT 625


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS05855  SECYTRNLCASE  130    4e-36    Preprotein translocase SecY subunit signature.
		>SECYTRNLCASE#Preprotein translocase SecY subunit signature.

Length = 437

Score = 130 bits (329), Expect = 4e-36
Identities = 93/440 (21%), Positives = 181/440 (41%), Gaps = 52/440 (11%)

Query: 4 LLQQYEYKIIYKRMLYTCFILFIYILGTNISI--VSYNDMQ------VKHESFFKIAISN 55
+ + + K++L+T I+ +Y +GT+I I V Y ++Q ++ F +
Sbjct: 5 FARAFRTPDLRKKLLFTLAIIVVYRVGTHIPIPGVDYKNVQQCVREASGNQGLFGLVNMF 64

Query: 56 MGGDVNTLNIFTLGLVPWLTSMIILMLISYRNMDKYMKQTSLEKHYKE------------ 103
GG + + IF LG++P++T+ IIL L++ + LE KE
Sbjct: 65 SGGALLQITIFALGIMPYITASIILQLLT-------VVIPRLEALKKEGQAGTAKITQYT 117

Query: 104 RILTLILSVIQSYFVIHEYVSKERVHQDN-------------IYLTILILVTGTMLLVWL 150
R LT+ L+++Q ++ S + + ++ + GT +++WL
Sbjct: 118 RYLTVALAILQGTGLVATARSAPLFGRCSVGGQIVPDQSIFTTITMVICMTAGTCVVMWL 177

Query: 151 ADKNSRYGIAGPMPIVMVSIIKSMMHQKMEYI------DASHIVIALLIILVIITLFILL 204
+ + GI M I+M I + + I I +I + +I + +++
Sbjct: 178 GELITDRGIGNGMSILMFISIAATFPSALWAIKKQGTLAGGWIEFGTVIAVGLIMVALVV 237

Query: 205 FIELVEVRIPYI----DLMNVSATNMKSYLSWKVNPAGSITLMMSISAFVFLKSGIHFIL 260
F+E + RIP + S +Y+ KVN AG I ++ + S F
Sbjct: 238 FVEQAQRRIPVQYAKRMIGRRSYGGTSTYIPLKVNQAGVIPVIFASSLLYIPALVAQFAG 297

Query: 261 SMFNKSISDDMPMLTFDSPVGISVYLVIQMLLGYFLSRFLINTKQKSKDFLKSGNYFSGV 320
+ + D P+ I Y ++ + +F N ++ + + K G + G+
Sbjct: 298 GNSGWKSWVEQNLTKGDHPIYIVTYFLLIVFFAFFYVAISFNPEEVADNMKKYGGFIPGI 357

Query: 321 KPGKDTERYLNYQARRVCWFGSALVTVIIGIPLYFTLFVPHLSTEIYFS-VQLIVLVYIS 379
+ G+ T YL+Y R+ W GS + +I +P L S F ++++V +
Sbjct: 358 RAGRPTAEYLSYVLNRITWPGSLYLGLIALVP-TMALVGFGASQNFPFGGTSILIIVGVG 416

Query: 380 INIAETIRTYLYFDKYKPFL 399
+ + I + L Y+ FL
Sbjct: 417 LETVKQIESQLQQRNYEGFL 436


38  CH52_RS05865  CH52_RS05895  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
CH52_RS05865  -1      14       0.277064   accessory Sec system translocase SecA2
CH52_RS05870  1       16       -0.045334  accessory Sec system glycosyltransferase GtfA
CH52_RS05875  0       11       -0.103259  accessory Sec system glycosylation chaperone
CH52_RS05880  -1      9        0.155332   cell-wall-anchored protein SasF
CH52_RS05885  -2      8        0.208698   cysteine hydrolase
CH52_RS05890  -2      8        1.154160   amidase domain-containing protein
CH52_RS05895  -2      9        0.952122   YhgE/Pip domain-containing protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits  Score  E-value  Comments
CH52_RS05875  SECA  656    0.0      SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 656 bits (1695), Expect = 0.0
Identities = 287/835 (34%), Positives = 450/835 (53%), Gaps = 68/835 (8%)

Query: 10 NELRLKSIRKIVKRINTWSDEVKSYSDDALKQKTIEFKERLASGVDTLDTLLPEAYAVAR 69
N+ L+ +RK+V IN E++ SD+ LK KT EF+ RL G + L+ L+PEA+AV R
Sbjct: 14 NDRTLRRMRKVVNIINAMEPEMEKLSDEELKGKTAEFRARLEKG-EVLENLIPEAFAVVR 72

Query: 70 EASWRVLGMYPKEVQLIGAIVLHEGNIAEMQTGEGKTLTATMPLYLNALSGKGTYLITTN 129
EAS RV GM +VQL+G +VL+E IAEM+TGEGKTLTAT+P YLNAL+GKG +++T N
Sbjct: 73 EASKRVFGMRHFDVQLLGGMVLNERCIAEMRTGEGKTLTATLPAYLNALTGKGVHVVTVN 132

Query: 130 DYLAKRDFEEMQPLYEWLGLTASLGFVDIVDYEYQKGEKRNIYEHDIIYTTNGRLGFDYL 189
DYLA+RD E +PL+E+LGLT V I KR Y DI Y TN GFDYL
Sbjct: 133 DYLAQRDAENNRPLFEFLGLT-----VGINLPGMPAPAKREAYAADITYGTNNEYGFDYL 187

Query: 190 IDNLADSAEGKFLPQLNYGIIDEVDSIILDAAQTPLVISGAPRLQSNLFHIVKEFVDTLI 249
DN+A S E + +L+Y ++DEVDSI++D A+TPL+ISG S ++ V + + LI
Sbjct: 188 RDNMAFSPEERVQRKLHYALVDEVDSILIDEARTPLIISGPAEDSSEMYKRVNKIIPHLI 247

Query: 250 E-----------DVHFKMKKTKKEIWLLNQGIEAAQSYFNV-------EDLYSEQAMVLV 291
+ HF + + +++ L +G+ + E LYS ++L+
Sbjct: 248 RQEKEDSETFQGEGHFSVDEKSRQVNLTERGLVLIEELLVKEGIMDEGESLYSPANIMLM 307

Query: 292 RNINLALRAQYLFESNVDYFVYNGDIVLIDRITGRMLPGTKLQAGLHQAIEAKEGMEVST 351
++ ALRA LF +VDY V +G+++++D TGR + G + GLHQA+EAKEG+++
Sbjct: 308 HHVTAALRAHALFTRDVDYIVKDGEVIIVDEHTGRTMQGRRWSDGLHQAVEAKEGVQIQN 367

Query: 352 DKSVMATITFQNLFKLFESFSGMTATGKLGESEFFDLYSKIVVQAPTDKAIQRIDEPDKV 411
+ +A+ITFQN F+L+E +GMT T EF +Y V PT++ + R D PD V
Sbjct: 368 ENQTLASITFQNYFRLYEKLAGMTGTADTEAFEFSSIYKLDTVVVPTNRPMIRKDLPDLV 427

Query: 412 FRSVDEKNIAMIHDIVELHETGRPVLLITRTAEAAEYFSKVLFQMDIPNNLLIAQNVAKE 471
+ + EK A+I DI E G+PVL+ T + E +E S L + I +N+L A+ A E
Sbjct: 428 YMTEAEKIQAIIEDIKERTAKGQPVLVGTISIEKSELVSNELTKAGIKHNVLNAKFHANE 487

Query: 472 AQMIAEAGQIGSMTVATSMAGRGTDIKLG-----------------------------EG 502
A ++A+AG ++T+AT+MAGRGTDI LG +
Sbjct: 488 AAIVAQAGYPAAVTIATNMAGRGTDIVLGGSWQAEVAALENPTAEQIEKIKADWQVRHDA 547

Query: 503 VEALGGLAVIIHEHMENSRVDRQLRGRSGRQGDPGSSCIYISLDDYLVKRWSDSNLAENN 562
V GGL +I E E+ R+D QLRGRSGRQGD GSS Y+S++D L++ ++ ++
Sbjct: 548 VLEAGGLHIIGTERHESRRIDNQLRGRSGRQGDAGSSRFYLSMEDALMRIFASDRVSGMM 607

Query: 563 QLYSLDAQRLSQSNLFNRKVKQIVVKAQRISEEQGVKAREMANEFEKSISIQRDLVYEER 622
+ + + + + AQR E + R+ E++ + QR +Y +R
Sbjct: 608 RKLGMKPGEAIEHPWVTKAIA----NAQRKVESRNFDIRKQLLEYDDVANDQRRAIYSQR 663

Query: 623 NRVLEIDDAENRDFKALAKDVFEMFVNEE---KVLTKSRVVEYIYQNLSFQFNKDVACVN 679
N +L++ D ++ +DVF+ ++ + L + + + + L F+ D+
Sbjct: 664 NELLDVSDVSET-INSIREDVFKATIDAYIPPQSLEEMWDIPGLQERLKNDFDLDLPIAE 722

Query: 680 FKDKQAVVT------FLLEQFEKQLALNRKNMQSAYYYNIFVQKVFLKAIDSCWLEQVDY 733
+ DK+ + +L Q + ++ + A F + V L+ +DS W E +
Sbjct: 723 WLDKEPELHEETLRERILAQSIEVYQ-RKEEVVGAEMMRHFEKGVMLQTLDSLWKEHLAA 781

Query: 734 LQQLKASVNQRQNGQRNAIFEYHRVALDSFEVMTRNIKKRMVKNICQSMITFDKE 788
+ L+ ++ R Q++ EY R + F M ++K ++ + + + +E
Sbjct: 782 MDYLRQGIHLRGYAQKDPKQEYKRESFSMFAAMLESLKYEVISTLSKVQVRMPEE 836


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS05895  ISCHRISMTASE  77     3e-19    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 77.0 bits (189), Expect = 3e-19
Identities = 41/183 (22%), Positives = 77/183 (42%), Gaps = 10/183 (5%)

Query: 3 RKTALLVLDMQE----GIASSVPRIKNIIKANQRAIEAARQHRIPVIFIRLVLDKHFNDV 58
+ LL+ DMQ + + + ++ Q IPV++ ++ +D
Sbjct: 29 NRAVLLIHDMQNYFVDAFTAGASPVTELSANIRKLKNQCVQLGIPVVYTAQPGSQNPDDR 88

Query: 59 SSSNKVFSTIKAQGYAITEADASTRILEDLAPLEDEPIISKRRFSAFTGSYLEVYLRAND 118
+ + G + +I+ +LAP +D+ +++K R+SAF + L +R
Sbjct: 89 ALLTDFW------GPGLNSGPYEEKIITELAPEDDDLVLTKWRYSAFKRTNLLEMMRKEG 142

Query: 119 INHLVLTGVSTSGAVLSTALESVDKDYYITVLEDAVGDRSDDKHDFIIEQILSRSCDIES 178
+ L++TG+ L TA E+ +D + DAV D S +KH +E R
Sbjct: 143 RDQLIITGIYAHIGCLVTACEAFMEDIKAFFVGDAVADFSLEKHQMALEYAAGRCAFTVM 202

Query: 179 VES 181
+S
Sbjct: 203 TDS 205


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
CH52_RS05900  FLGFLGJ  64     5e-13    Flagellar protein FlgJ signature.
		>FLGFLGJ#Flagellar protein FlgJ signature.

Length = 313

Score = 63.6 bits (154), Expect = 5e-13
Identities = 50/176 (28%), Positives = 84/176 (47%), Gaps = 19/176 (10%)

Query: 304 SNNDDSGQFNVVDSKDTRQFVKSIAKDAHRIGQDNDIYASVMIAQAILESDSGRSALAKS 363
N DDS D++ F+ ++ A Q + + +++AQA LES G+ + +
Sbjct: 139 RNYDDSLPG------DSKAFLAQLSLPAQLASQQSGVPHHLILAQAALESGWGQRQIRRE 192

Query: 364 ---PNHNLFGIK--GAFEGNSVPFNTLEADGNKLYSINAGFRKYPSTKESLKDYSDLIKN 418
P++NLFG+K G ++G T E + + + A FR Y S E+L DY L+
Sbjct: 193 NGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKKVKAKFRVYSSYLEALSDYVGLLTR 252

Query: 419 GIDGNRTIYKPTWKSEADSYKDATSHLSKTYATDPNYAKKLNSIIKHYQLTQFDDE 474
+ + A + +DA YATDP+YA+KL ++I+ Q+ D+
Sbjct: 253 NPRYAAVTTAASAEQGAQALQDA------GYATDPHYARKLTNMIQ--QMKSISDK 300


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS05905  ABC2TRNSPORT  39     6e-05    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 39.1 bits (91), Expect = 6e-05
Identities = 37/172 (21%), Positives = 67/172 (38%), Gaps = 28/172 (16%)

Query: 817 NKHKSLESVLTTRQVFLGKAGFFIMLGML-----QALIVSVGDLLILKAGVESP---VLF 868
++ E++L T Q+ LG I+LG + +A + G ++ A + +L+
Sbjct: 95 EGQRTWEAMLYT-QLRLGD----IVLGEMAWAATKAALAGAGIGVVAAALGYTQWLSLLY 149

Query: 869 VLITI-FCSIIFNSIVYTCVSLLGNPGKAIAIVLLVLQIAG----GGGTFPIQTTPQFFQ 923
L I + F S+ +L P I L I G FP+ P FQ
Sbjct: 150 ALPVIALTGLAFASLGMVVTAL--APSYDYFIFYQTLVITPILFLSGAVFPVDQLPIVFQ 207

Query: 924 NISPYLPFTYAIDSLRETV-----GGIVPEILITKLIILTLFGIGFFVVGLI 970
+ +LP +++ID +R + + + + I+ F F L+
Sbjct: 208 TAARFLPLSHSIDLIRPIMLGHPVVDVCQHVGALCIYIVIPF---FLSTALL 256


39  CH52_RS05930  CH52_RS05970  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
CH52_RS05930  -3      9        0.068313   zinc metalloproteinase aureolysin
CH52_RS05935  -3      10       0.696912   arginine repressor
CH52_RS05940  -2      11       0.800266   hypothetical protein
CH52_RS05945  -2      12       0.551245   arginine deiminase
CH52_RS05950  0       14       2.144120   ornithine carbamoyltransferase
CH52_RS05955  -1      13       2.053223   arginine-ornithine antiporter
CH52_RS05960  -1      14       2.191827   carbamate kinase
CH52_RS05965  0       17       2.241907   Crp/Fnr family transcriptional regulator
CH52_RS05970  1       15       1.849961   MSCRAMM family adhesin clumping factor ClfB
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits         Score  E-value  Comments
CH52_RS05935  THERMOLYSIN  440    e-152    Thermolysin metalloprotease (M4) family signature.
		>THERMOLYSIN#Thermolysin metalloprotease (M4) family signature.

Length = 544

Score = 440 bits (1133), Expect = e-152
Identities = 173/480 (36%), Positives = 249/480 (51%), Gaps = 42/480 (8%)

Query: 64 NIYQDYAVTDVKTDKKGFTHYTLQPSVDGVHAPDKEVKVHADKSGKVVLING----DTDA 119
+ ++ K D+ G T + ++ + H + G++ ++G + D
Sbjct: 71 QARERLSLIGNKLDELGHTVMRFEQAIAASLCMGAVLVAHVN-DGELSSLSGTLIPNLDK 129

Query: 120 KKVKPTNKVTLSKDDAADKAFKAVKIDKHKAKNLKDKVIKENKVEIDGDSNKYVYNVELI 179
+ +K +++ + + K A ++ K + ++ + D ++ + Y V +
Sbjct: 130 RTLKTEAAISIQQAEMIAKQDVADRVTKERPAA-EEGKPTRLVIYPDEETPRLAYEVNVR 188

Query: 180 TVTPEISHWKVKIDAQTGEILEKMNLVKEA-----------AETGKGKGVLGDTKDINI- 227
+TP +W IDA G++L K N + EA + G G+GVLGD K IN
Sbjct: 189 FLTPVPGNWIYMIDAADGKVLNKWNQMDEAKPGGAQPVAGTSTVGVGRGVLGDQKYINTT 248

Query: 228 -NSIDGGFSLEDLTHQGKLSAFSFNDQTG-QATLITNEDENFVKDEQRAGVDANYYAKQT 285
+S G + L+D T + + ++T +L + D F A VDA+YYA
Sbjct: 249 YSSYYGYYYLQDNTRGSGIFTYDGRNRTVLPGSLWADGDNQFFASYDAAAVDAHYYAGVV 308

Query: 286 YDYYKDTFGRESYDNQGSPIVSLTHVNNYGGQDNRNNAAWIGDKMIYGDGDGRTFTSLSG 345
YDYYK+ GR SYD + I S H YG NNA W G +M+YGDGDG+TF SG
Sbjct: 309 YDYYKNVHGRLSYDGSNAAIRSTVH---YG--RGYNNAFWNGSQMVYGDGDGQTFLPFSG 363

Query: 346 ANDVVAHELTHGVTQETANLEYKDQSGALNESFSDVFGYFVD-----DEDFLMGEDVYTP 400
DVV HELTH VT TA L Y+++SGA+NE+ SD+FG V+ + D+ +GED+YTP
Sbjct: 364 GIDVVGHELTHAVTDYTAGLVYQNESGAINEAMSDIFGTLVEFYANRNPDWEIGEDIYTP 423

Query: 401 GKEGDALRSMSNPEQFGQPAHMKDYVFTEKDNGGVHTNSGIPNKAAYNVIQ--------- 451
G GDALRSMS+P ++G P H +DNGGVHTNSGI NKAAY + Q
Sbjct: 424 GVAGDALRSMSDPAKYGDPDHYSKRYTGTQDNGGVHTNSGIINKAAYLLSQGGVHYGVSV 483

Query: 452 -AIGKSKSEQIYYRALTEYLTSNSNFKDCKDALYQAAKDLYDEQTAE--QVYEAWNEVGV 508
IG+ K +I+YRAL YLT SNF + A QAA DLY + E V +A+N VGV
Sbjct: 484 TGIGRDKMGKIFYRALVYYLTPTSNFSQLRAACVQAAADLYGSTSQEVNSVKQAFNAVGV 543


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS05940  ARGREPRESSOR  82     7e-23    Bacterial arginine repressor signature.
		>ARGREPRESSOR#Bacterial arginine repressor signature.

Length = 149

Score = 82.2 bits (203), Expect = 7e-23
Identities = 38/147 (25%), Positives = 78/147 (53%), Gaps = 2/147 (1%)

Query: 1 MKKSKRLEIVSTIVKKHKIYKKEQIISYIEEYFGVRYSATTIAKDLKELNIYRVPIDCET 60
M K +R + I+ ++I +++++ +++ G + T+++D+KEL++ +VP + +
Sbjct: 1 MNKGQRHIKIREIITANEIETQDELVDILKKD-GYNVTQATVSRDIKELHLVKVPTNNGS 59

Query: 61 WIYKAINNQTEQEMREKFRHYCEHEVLSSIINGSYIIVKTSPGFAQGINYFIDQLNIEEI 120
+ Y ++ K + + I++KT PG AQ I +D L+ EEI
Sbjct: 60 YKY-SLPADQRFNPLSKLKRSLMDAFVKIDSASHLIVLKTMPGNAQAIGALMDNLDWEEI 118

Query: 121 LGTVSGNDTTLILTASNDMAEYVYAKL 147
+GT+ G+DT LI+ ++D + V K+
Sbjct: 119 MGTICGDDTILIICRTHDDTKVVQKKI 145


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS05950  ARGDEIMINASE  506    0.0      Bacterial arginine deiminase signature.
		>ARGDEIMINASE#Bacterial arginine deiminase signature.

Length = 409

Score = 506 bits (1305), Expect = 0.0
Identities = 193/409 (47%), Positives = 275/409 (67%), Gaps = 8/409 (1%)

Query: 5 PIKVNSEIGALKTVLLKRPGKELENLVPDYLDGLLFDDIPYLEVAQKEHDHFAQVLREEG 64
PI + SEIG LK VLL RPG+ELENL P + LFDDIPYLEVA++EH+ FA +L+
Sbjct: 7 PINIFSEIGRLKKVLLHRPGEELENLTPFIMKNFLFDDIPYLEVARQEHEVFASILKNNL 66

Query: 65 VEVLYLEKLAAESIENPQ-VRSEFIDDVLAESKKTILGHEEEIKTLFATLSNQELVDKIM 123
VE+ Y+E L +E + + + ++FI + E++ +K F++L+ ++ K++
Sbjct: 67 VEIEYIEDLISEVLVSSVALENKFISQFILEAEIKTDFTINLLKDYFSSLTIDNMISKMI 126

Query: 124 SGVRKEEINPKCTHLVEYMDDKYPFYLDPMPNLYFTRDPQASIGHGITINRMFWRARRRE 183
SGV EE+ + L + ++ F +DPMPN+ FTRDP ASIG+G+TIN+MF + R+RE
Sbjct: 127 SGVVTEELKNYTSSLDDLVNGANLFIIDPMPNVLFTRDPFASIGNGVTINKMFTKVRQRE 186

Query: 184 SIFIQYIVKHHPRFKDANIPIWLDRDCPFNIEGGDELVLSKDVLAIGVSERTSAQAIEKL 243
+IF +YI K+HP +K N+PIWL+R ++EGGDELVL+K +L IG+SERT A+++EKL
Sbjct: 187 TIFAEYIFKYHPVYK-ENVPIWLNRWEEASLEGGDELVLNKGLLVIGISERTEAKSVEKL 245

Query: 244 ARRIFENPQATFKKVVAIEIPTSRTFMHLDTVFTMIDYDKFTMHSAILKAEGNMNIFIIE 303
A +F+N + +F ++A +IP +R++MHLDTVFT IDY FT ++ + +I+++
Sbjct: 246 AISLFKN-KTSFDTILAFQIPKNRSYMHLDTVFTQIDYSVFTSFTSD---DMYFSIYVLT 301

Query: 304 YDDVNKDIAIK-QSSHLKDTLEDVLGIDDIQFIPTGNGDVIDGAREQWNDGSNTLCIRPG 362
Y+ + I IK + + +KD L LG I I GD+I GAREQWNDG+N L I PG
Sbjct: 302 YNPSSSKIHIKKEKARIKDVLSFYLG-RKIDIIKCAGGDLIHGAREQWNDGANVLAIAPG 360

Query: 363 VVVTYDRNYVSNDLLRQKGIKVIEISGSELVRGRGGPRCMSQPLFREDI 411
++ Y RN+V+N L + GIKV I SEL RGRGGPRCMS PL REDI
Sbjct: 361 EIIAYSRNHVTNKLFEENGIKVHRIPSSELSRGRGGPRCMSMPLIREDI 409


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS05965  CARBMTKINASE  388    e-138    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 388 bits (999), Expect = e-138
Identities = 137/314 (43%), Positives = 198/314 (63%), Gaps = 5/314 (1%)

Query: 1 MKEKIVIALGGNAIQT--KEATAEAQQTAIRRAMQNLKPLFDSPARIVISHGNGPQIGSL 58
M +++VIALGGNA+Q ++ + E +R+ + + + +VI+HGNGPQ+GSL
Sbjct: 1 MGKRVVIALGGNALQQRGQKGSYEEMMDNVRKTARQIAEIIARGYEVVITHGNGPQVGSL 60

Query: 59 LIQQAKSNSDT-TPAMPLDTCGAMSQGMIGYWLETEINRILTEMNSDRTVGTIVTRVEVD 117
L+ + PA P+D GAMSQG IGY ++ + L + ++ V TI+T+ VD
Sbjct: 61 LLHMDAGQATYGIPAQPMDVAGAMSQGWIGYMIQQALKNELRKRGMEKKVVTIITQTIVD 120

Query: 118 KDDPRFNNPTKPIGPFYTKEEVEELQKEQPDSVFKEDAGRGYRKVVASPLPQSILEHQLI 177
K+DP F NPTKP+GPFY +E + L +E + KED+GRG+R+VV SP P+ +E + I
Sbjct: 121 KNDPAFQNPTKPVGPFYDEETAKRLARE-KGWIVKEDSGRGWRRVVPSPDPKGHVEAETI 179

Query: 178 RTLADGKNIVIACGGGGIPVIKKENTYEGVEAVIDKDFASEKLATLIEADTLMILTNVEN 237
+ L + IVIA GGGG+PVI ++ +GVEAVIDKD A EKLA + AD MILT+V
Sbjct: 180 KKLVERGVIVIASGGGGVPVILEDGEIKGVEAVIDKDLAGEKLAEEVNADIFMILTDVNG 239

Query: 238 VFINFNEPNQQQIDDIDVATLKKYAAQGKFAEGSMLPKIEAAIRFVESGENKKVIITNLE 297
+ + +Q + ++ V L+KY +G F GSM PK+ AAIRF+E G ++ II +LE
Sbjct: 240 AALYYGTEKEQWLREVKVEELRKYYEEGHFKAGSMGPKVLAAIRFIEWG-GERAIIAHLE 298

Query: 298 QAYEALIGNKGTHI 311
+A EAL G GT +
Sbjct: 299 KAVEALEGKTGTQV 312


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
CH52_RS05975  PF05616  51     2e-08    Neisseria meningitidis TspB protein
		>PF05616#Neisseria meningitidis TspB protein

Length = 501

Score = 50.9 bits (121), Expect = 2e-08
Identities = 36/125 (28%), Positives = 57/125 (45%), Gaps = 14/125 (11%)

Query: 508 NVDPVTNRDYSIFGWNNENVVRYGGGSADGDSAVNPK-----DPTPG----PPVDPEPSP 558
N+ PVT+R+ N VV G + G++ V+ + D TPG P P P
Sbjct: 277 NMGPVTDRN-----GNPVQVVATFGRDSQGNTTVDVQVIPRPDLTPGSAEAPNAQPLPEV 331

Query: 559 DPEPEPTPDPEPSPDPEPEPSPDPDPDSDSDSDSGSDSDSGSDSDSESDSDSDSDSDSDS 618
P P +P P+ +P P+P+PDPD + D++ +D G+ DS + D +
Sbjct: 332 SPAENPANNPAPNENPGTRPNPEPDPDLNPDANPDTDGQPGTRPDSPAVPDRPNGRHRKE 391

Query: 619 DSDSE 623
+ E
Sbjct: 392 RKEGE 396



Score = 35.1 bits (80), Expect = 0.001
Identities = 18/63 (28%), Positives = 27/63 (42%), Gaps = 1/63 (1%)

Query: 538 DSAVNPK-DPTPGPPVDPEPSPDPEPEPTPDPEPSPDPEPEPSPDPDPDSDSDSDSGSDS 596
+ A NP + PG +PEP PD P+ PD + P P+ PD + +
Sbjct: 336 NPANNPAPNENPGTRPNPEPDPDLNPDANPDTDGQPGTRPDSPAVPDRPNGRHRKERKEG 395

Query: 597 DSG 599
+ G
Sbjct: 396 EDG 398


40  CH52_RS06280  CH52_RS14505  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
CH52_RS06280  -2      8        0.629921   SDR family NAD(P)-dependent oxidoreductase
CH52_RS06285  -3      13       0.516817   TetR/AcrR family transcriptional regulator
CH52_RS06290  -2      17       -0.289670  DUF2316 family protein
CH52_RS06295  -1      16       0.031836   NmrA/HSCARG family protein
CH52_RS06300  -1      18       -0.371773  VOC family protein
CH52_RS06305  -1      16       0.910762   DUF896 domain-containing protein
CH52_RS06310  2       14       2.536050   hypothetical protein
CH52_RS14505  1       12       1.245286   TetR/AcrR family transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS06285  DHBDHDRGNASE  70     2e-16    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 69.7 bits (170), Expect = 2e-16
Identities = 48/197 (24%), Positives = 76/197 (38%), Gaps = 18/197 (9%)

Query: 3 KIVLITGGNKGLGYASAEALKALGYKVYIGSRND---VRGQQASQKLGVHYVQ--LDVTS 57
KI ITG +G+G A A L + G + N + + + H DV
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRD 68

Query: 58 DYSVKNAYNMIAEKEGRLDILINNAGISGQFSAPSKLTPRDVEEVYQTNVFGIVRMMNTF 117
++ I + G +DIL+N AG+ + L+ + E + N G+ +
Sbjct: 69 SAAIDEITARIEREMGPIDILVNVAGVL-RPGLIHSLSDEEWEATFSVNSTGVFNASRSV 127

Query: 118 VPLLEKSEQPVVVNVSSGLGSFGMVTNPETAESKVNSLAYCSSKSAVTMLTLQYAKGLP- 176
+ +V V S P T+ + AY SSK+A M T L
Sbjct: 128 SKYMMDRRSGSIVTVGSNPAG-----VPRTSMA-----AYASSKAAAVMFTKCLGLELAE 177

Query: 177 -NMQINAADPGATNTDL 192
N++ N PG+T TD+
Sbjct: 178 YNIRCNIVSPGSTETDM 194


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
CH52_RS06290  HTHTETR  62     2e-14    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 62.3 bits (151), Expect = 2e-14
Identities = 25/80 (31%), Positives = 44/80 (55%)

Query: 1 MRKDAKENRQRIEEIAHKLFDEEGVENISMNRIAKELGIGMGTLYRHFKDKSDLCYYVIQ 60
+++A+E RQ I ++A +LF ++GV + S+ IAK G+ G +Y HFKDKSDL + +
Sbjct: 5 TKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWE 64

Query: 61 RDLDIFITHFKQIKDDYHSN 80
+ + + +
Sbjct: 65 LSESNIGELELEYQAKFPGD 84


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS06300  NUCEPIMERASE  36     2e-04    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 35.5 bits (82), Expect = 2e-04
Identities = 35/138 (25%), Positives = 53/138 (38%), Gaps = 35/138 (25%)

Query: 1 MKDILVIGATGKQGNAVVKQLLEDGWYVSAL--------TRNKNNRKLSDIGHPHLSIVE 52
MK LV GA G G V K+LLE G V + K R L + P +
Sbjct: 1 MK-YLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQAR-LELLAQPGFQFHK 58

Query: 53 GDLSD-----------------NVSLQSAMKGKYGLYSIQ-PIVKDDVSEELRQGMKIIE 94
DL+D + A++ YS++ P D + L + I+E
Sbjct: 59 IDLADREGMTDLFASGHFERVFISPHRLAVR-----YSLENPHAYADSN--LTGFLNILE 111

Query: 95 IAEQENIQHIVYSTAGGV 112
IQH++Y+++ V
Sbjct: 112 GCRHNKIQHLLYASSSSV 129


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS06305  TRNSINTIMINR  28     0.011    Translocated intimin receptor (Tir) signature.
		>TRNSINTIMINR#Translocated intimin receptor (Tir) signature.

Length = 549

Score = 28.2 bits (62), Expect = 0.011
Identities = 14/45 (31%), Positives = 23/45 (51%)

Query: 56 FQNVSQQSLNTEPNEVMISLGVNTNEEVDQLVNKVKEAGGAVVQE 100
F+N Q +N + N I G ++ V+Q+ + KEAG Q+
Sbjct: 291 FKNPENQKVNIDANGNAIPSGELKDDIVEQIAQQAKEAGEVARQQ 335


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
CH52_RS06320  HTHTETR  44     9e-08    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 43.8 bits (103), Expect = 9e-08
Identities = 33/200 (16%), Positives = 64/200 (32%), Gaps = 34/200 (17%)

Query: 5 KSIDPRIVRTKQLLVDAFLKISREKKLSQITVKDITDIATLNRATFYAHFTDKEDLLDYT 64
+ T+Q ++D L++ ++ +S ++ +I A + R Y HF DK DL
Sbjct: 3 RKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEI 62

Query: 65 LSV---TILKDLNDNLSISNVINEKVLRNIFISIASYIKDAAKSCELNSEAFCNKAHQRI 121
+ I + + + VLR I I + + L F
Sbjct: 63 WELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIF------HK 116

Query: 122 NNELEDIFAIM-LENSYPEHQRDIIVNS-------------------ASFLAAGISGLAL 161
+ ++ + + + D I + A + ISGL
Sbjct: 117 CEFVGEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLME 176

Query: 162 HWFNTSQ-----ETADVFID 176
+W Q + A ++
Sbjct: 177 NWLFAPQSFDLKKEARDYVA 196


41  CH52_RS07130  CH52_RS07155  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
CH52_RS07130  -1      15       -1.620674  pyridoxal phosphate-dependent aminotransferase
CH52_RS07135  0       17       -1.193962  6-carboxyhexanoate--CoA ligase
CH52_RS07140  1       19       -1.399420  QueT transporter family protein
CH52_RS07145  -1      18       -0.957429  bi-component gamma-hemolysin HlgAB/HlgCB subunit
CH52_RS07150  -1      17       -0.692408  bi-component gamma-hemolysin HlgCB subunit C
CH52_RS07155  0       16       -0.842205  bi-component gamma-hemolysin HlgAB subunit A
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS07130  CLENTEROTOXN  28     0.047    Clostridium enterotoxin signature.
		>CLENTEROTOXN#Clostridium enterotoxin signature.

Length = 319

Score = 28.5 bits (63), Expect = 0.047
Identities = 8/47 (17%), Positives = 15/47 (31%), Gaps = 3/47 (6%)

Query: 233 GGVILSSND---VKDMLINHGRPLIYSSSLPIYNLYFIKRNIEKLIN 276
IL+ N+ L I + + FI+ ++E
Sbjct: 59 SSQILNPNETGTFSQSLTKSKEVSINVNFSVGFTSEFIQASVEYGFG 105


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS07145  BICOMPNTOXIN  383    e-136    Staphylococcal bi-component toxin signature.
		>BICOMPNTOXIN#Staphylococcal bi-component toxin signature.

Length = 315

Score = 383 bits (985), Expect = e-136
Identities = 87/322 (27%), Positives = 160/322 (49%), Gaps = 18/322 (5%)

Query: 1 MKMNKLVKSSVATSMALLLLSGTANAEGKITPVSVKKVDDKVTLYKTTATADSDKFKISQ 60
M NK++ ++++ S+ L + + + K T S+K+ ++Q
Sbjct: 1 MLKNKILTTTLSVSLLAPLANPLLENAKAANDTEDIGKGSDIEIIKRTEDKTSNKWGVTQ 60

Query: 61 ILTFNFIKDKSYDKDTLVLKATGNINSGFVKPNPNDYDFSK-LYWGAKYNVSISSQSNDS 119
+ F+F+KDK Y+KD L+LK G I+S N + K + W +YN+ + + ++
Sbjct: 61 NIQFDFVKDKKYNKDALILKMQGFISSRTTYYNYKKTNHVKAMRWPFQYNIGLKT-NDKY 119

Query: 120 VNVVDYAPKNQNEEFQVQNTLGYTFGGDISISNGLSGGLNGNTAFSETINYKQESYRTTL 179
V++++Y PKN+ E V TLGY GG+ + L G NG+ +S++I+Y Q++Y + +
Sbjct: 120 VSLINYLPKNKIESTNVSQTLGYNIGGNFQSAPSLGG--NGSFNYSKSISYTQQNYVSEV 177

Query: 180 SRNTNYKNVGWGVEAHKIMNNGWGPYGRDSFHPTYGNELFLAGRQSSAYAGQNFIAQHQM 239
+ N K+V WGV+A+ + ++LF+ + S F+ ++
Sbjct: 178 EQQ-NSKSVLWGVKANSFATESGQ-------KSAFDSDLFVGYKPHSKDPRDYFVPDSEL 229

Query: 240 PLLSRSNFNPEFLSVLSHRQDGAKKSKITVTYQREMDL-----YQIRWNGFYWAGANYKN 294
P L +S FNP F++ +SH + + S+ +TY R MD+ + Y G N
Sbjct: 230 PPLVQSGFNPSFIATVSHEKGSSDTSEFEITYGRNMDVTHAIKRSTHYGNSYLDGHRVHN 289

Query: 295 -FKTRTFKSTYEIDWENHKVKL 315
F R + YE++W+ H++K+
Sbjct: 290 AFVNRNYTVKYEVNWKTHEIKV 311


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS07150  BICOMPNTOXIN  469    e-170    Staphylococcal bi-component toxin signature.
		>BICOMPNTOXIN#Staphylococcal bi-component toxin signature.

Length = 315

Score = 469 bits (1207), Expect = e-170
Identities = 314/315 (99%), Positives = 314/315 (99%)

Query: 1 MLKNKILATTLSVSLLAPLANPLLENAKAANDTEDIGKGSDIEIIKRTEDKTSNKWGVTQ 60
MLKNKIL TTLSVSLLAPLANPLLENAKAANDTEDIGKGSDIEIIKRTEDKTSNKWGVTQ
Sbjct: 1 MLKNKILTTTLSVSLLAPLANPLLENAKAANDTEDIGKGSDIEIIKRTEDKTSNKWGVTQ 60

Query: 61 NIQFDFVKDKKYNKDALILKMQGFISSRTTYYNYKKTNHVKAMRWPFQYNIGLKTNDKYV 120
NIQFDFVKDKKYNKDALILKMQGFISSRTTYYNYKKTNHVKAMRWPFQYNIGLKTNDKYV
Sbjct: 61 NIQFDFVKDKKYNKDALILKMQGFISSRTTYYNYKKTNHVKAMRWPFQYNIGLKTNDKYV 120

Query: 121 SLINYLPKNKIESTNVSQTLGYNIGGNFQSAPSLGGNGSFNYSKSISYTQQNYVSEVEQQ 180
SLINYLPKNKIESTNVSQTLGYNIGGNFQSAPSLGGNGSFNYSKSISYTQQNYVSEVEQQ
Sbjct: 121 SLINYLPKNKIESTNVSQTLGYNIGGNFQSAPSLGGNGSFNYSKSISYTQQNYVSEVEQQ 180

Query: 181 NSKSVLWGVKANSFATESGQKSAFDSDLFVGYKPHSKDPRDYFVPDSELPPLVQSGFNPS 240
NSKSVLWGVKANSFATESGQKSAFDSDLFVGYKPHSKDPRDYFVPDSELPPLVQSGFNPS
Sbjct: 181 NSKSVLWGVKANSFATESGQKSAFDSDLFVGYKPHSKDPRDYFVPDSELPPLVQSGFNPS 240

Query: 241 FIATVSHEKGSSDTSEFEITYGRNMDVTHAIKRSTHYGNSYLDGHRVHNAFVNRNYTVKY 300
FIATVSHEKGSSDTSEFEITYGRNMDVTHAIKRSTHYGNSYLDGHRVHNAFVNRNYTVKY
Sbjct: 241 FIATVSHEKGSSDTSEFEITYGRNMDVTHAIKRSTHYGNSYLDGHRVHNAFVNRNYTVKY 300

Query: 301 EVNWKTHEIKVKGQN 315
EVNWKTHEIKVKGQN
Sbjct: 301 EVNWKTHEIKVKGQN 315


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS07155  BICOMPNTOXIN  428    e-154    Staphylococcal bi-component toxin signature.
		>BICOMPNTOXIN#Staphylococcal bi-component toxin signature.

Length = 315

Score = 428 bits (1103), Expect = e-154
Identities = 213/312 (68%), Positives = 247/312 (79%), Gaps = 8/312 (2%)

Query: 1 MIKNKILTATLAVGLIAPLANPFIEISKAENKIEDIGQGA--EIIKRTQDITSKRLAITQ 58
M+KNKILT TL+V L+APLANP +E +KA N EDIG+G+ EIIKRT+D TS + +TQ
Sbjct: 1 MLKNKILTTTLSVSLLAPLANPLLENAKAANDTEDIGKGSDIEIIKRTEDKTSNKWGVTQ 60

Query: 59 NIQFDFVKDKKYNKDALVVKMQGFISSRTTYSDLKKYPYIKRMIWPFQYNISLKTKDSNV 118
NIQFDFVKDKKYNKDAL++KMQGFISSRTTY + KK ++K M WPFQYNI LKT D V
Sbjct: 61 NIQFDFVKDKKYNKDALILKMQGFISSRTTYYNYKKTNHVKAMRWPFQYNIGLKTNDKYV 120

Query: 119 DLINYLPKNKIDSADVSQKLGYNIGGNFQSAPSIGGSGSFNYSKTISYNQKNYVTEVESQ 178
LINYLPKNKI+S +VSQ LGYNIGGNFQSAPS+GG+GSFNYSK+ISY Q+NYV+EVE Q
Sbjct: 121 SLINYLPKNKIESTNVSQTLGYNIGGNFQSAPSLGGNGSFNYSKSISYTQQNYVSEVEQQ 180

Query: 179 NSKGVKWGVKANSFVTPNGQVSAYDQYLF-AQDPTGPAARDYFVPDNQLPPLIQSGFNPS 237
NSK V WGVKANSF T +GQ SA+D LF P RDYFVPD++LPPL+QSGFNPS
Sbjct: 181 NSKSVLWGVKANSFATESGQKSAFDSDLFVGYKPHSKDPRDYFVPDSELPPLVQSGFNPS 240

Query: 238 FITTLSHERGKGDKSEFEITYGRNMDATYA-----YVTRHRLAVDRKHDAFKNRNVTVKY 292
FI T+SHE+G D SEFEITYGRNMD T+A + L R H+AF NRN TVKY
Sbjct: 241 FIATVSHEKGSSDTSEFEITYGRNMDVTHAIKRSTHYGNSYLDGHRVHNAFVNRNYTVKY 300

Query: 293 EVNWKTHEVKIK 304
EVNWKTHE+K+K
Sbjct: 301 EVNWKTHEIKVK 312


42  CH52_RS07385  CH52_RS07425  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
CH52_RS07385  -2      10       -1.842901  TetR/AcrR family transcriptional regulator
CH52_RS07390  -1      14       -0.835270  YhgE/Pip domain-containing protein
CH52_RS07395  -2      11       0.367632   DUF2871 domain-containing protein
CH52_RS07405  -2      12       0.751504   hypothetical protein
CH52_RS07410  -2      14       0.432421   NAD(P)/FAD-dependent oxidoreductase
CH52_RS07415  -3      11       0.315653   GNAT family N-acetyltransferase
CH52_RS07420  -2      9        -0.355656  oxidoreductase
CH52_RS07425  -1      8        0.028820   GNAT family N-acetyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
CH52_RS07390  HTHTETR  51     4e-10    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 50.8 bits (121), Expect = 4e-10
Identities = 33/203 (16%), Positives = 76/203 (37%), Gaps = 21/203 (10%)

Query: 5 RRIRKTKSSIKQAFTKLLQEKDLEKITIRDITTRADINRGTFYLHYEDKYMLLADMEDEY 64
+ ++T+ I +L ++ + ++ +I A + RG Y H++DK L +++ +
Sbjct: 7 QEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELS 66

Query: 65 ISELTTY----------TQFDLLRGSSIEDIANTFVNNILKNIFQHIHDNLEFY---HTI 111
S + +LR I + +T + + + I EF +
Sbjct: 67 ESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVV 126

Query: 112 LQLECTSQLEL--KINEHIKNNMQR-YISINHSIGGVPEMYFYSYVSGATISIIKYWVMD 168
Q + LE +I + +K+ ++ + + + Y+SG + W+
Sbjct: 127 QQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAII-MRGYISGLMEN----WLFA 181

Query: 169 KQPISVDELAKHVHNIIFNGPLR 191
Q + + A+ I+ L
Sbjct: 182 PQSFDLKKEARDYVAILLEMYLL 204


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS07420  SACTRNSFRASE  33     2e-04    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 32.6 bits (74), Expect = 2e-04
Identities = 15/59 (25%), Positives = 30/59 (50%), Gaps = 9/59 (15%)

Query: 66 IVDIAVSKSYQGQDYGSLIMEHIMKYIKN-----VSVESAYVSLIADYPADKLYAKFGF 119
I DIAV+K Y+ + G+ ++ +++ K + +E+ +++ A YAK F
Sbjct: 92 IEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINI----SACHFYAKHHF 146


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits     Score  E-value  Comments
CH52_RS07425  PF03627  29     0.024    PapG
		>PF03627#PapG

Length = 336

Score = 29.1 bits (65), Expect = 0.024
Identities = 15/48 (31%), Positives = 19/48 (39%), Gaps = 5/48 (10%)

Query: 21 TFKQLSPTDLPKGDVLIKVHY-SGINYKDALATQDH----NAVVKSYP 63
FK P DLP GD + + Y SG+ A V K+ P
Sbjct: 155 IFKVALPADLPLGDYSVTIPYTSGMQRHFASYLGARFKIPYNVAKTLP 202


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS07430  SACTRNSFRASE  51     9e-11    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 51.5 bits (123), Expect = 9e-11
Identities = 24/112 (21%), Positives = 45/112 (40%), Gaps = 7/112 (6%)

Query: 40 FFKDNYTVEKFTQEINHVDSFHYFYQEDGANVGYIKMNINSAQTEEMGETYLEVQRIYFL 99
+FK + + + Y + +G IK+ N Y ++ I
Sbjct: 46 YFKQYEDDDMDVSYVEEEGKAAFLYYLENNCIGRIKIRSNWNG-------YALIEDIAVA 98

Query: 100 KDFQGGGRGSQLIELAEKIAQEHNKHKIWLGVWEHNPRAQAFYKRHGFKVVG 151
KD++ G G+ L+ A + A+E++ + L + N A FY +H F +
Sbjct: 99 KDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFIIGA 150


43  CH52_RS07480  CH52_RS07525  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
CH52_RS07480  -3      15       -1.423306  ABC transporter ATP-binding protein
CH52_RS07485  -1      13       -2.238832  YdcF family protein
CH52_RS07490  -1      12       -1.907483  MarR family transcriptional regulator
CH52_RS07500  -1      9        -1.310882  zinc ribbon domain-containing protein
CH52_RS07505  1       8        -1.352571  hypothetical protein
CH52_RS07510  1       8        -1.115682  multidrug efflux MFS transporter
CH52_RS07515  1       11       -1.469208  TetR/AcrR family transcriptional regulator
CH52_RS07520  0       12       -1.168104  HlyD family efflux transporter periplasmic
CH52_RS07525  1       12       -1.787492  DHA2 family efflux MFS transporter permease
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS07480  PF05272       29     0.014    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.014
Identities = 11/21 (52%), Positives = 14/21 (66%)

Query: 35 VILNGASGSGKTTLLTILGGL 55
V+L G G GK+TL+ L GL
Sbjct: 599 VVLEGTGGIGKSTLINTLVGL 619


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS07510  TCRTETA       65     1e-13    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 65.2 bits (159), Expect = 1e-13
Identities = 69/386 (17%), Positives = 141/386 (36%), Gaps = 16/386 (4%)

Query: 15 IIILGSLTAIGALSIDMFLPGLPDIRHDF---QTTTSNAQLTLSMFMIGLAFGNLFAGPI 71
+I++ S A+ A+ I + +P LP + D T++ + L+++ + G +
Sbjct: 7 LIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGAL 66

Query: 72 SDSTGRRKPLIIAMIIFTLASLGIVFVHNIWLMVALRFLQGVTGGAAAVISRAIASDMYS 131
SD GRR L++++ + + +W++ R + G+TG AV IA D+
Sbjct: 67 SDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIA-DITD 125

Query: 132 GNELTKFMALLMLVNGIAPVVAPTIGGIILNYSVWRMVFVILTIFGFVMVIGSLLKVPES 191
G+E + + G V P +GG++ +S F + + +PES
Sbjct: 126 GDERARHFGFMSACFGFGMVAGPVLGGLMGGFSP-HAPFFAAAALNGLNFLTGCFLLPES 184

Query: 192 LTVTNRESSSGLKTMFKNFKILLKTPRFVLPMLIQGMTFVILFTYISASPFII--QKIYG 249
R +F+ + V ++ + L + A+ ++I + +
Sbjct: 185 HKGERRPLRREALNPLASFR-WARGMTVVAALMAVFFI-MQLVGQVPAALWVIFGEDRFH 242

Query: 250 MTAIQFSWMFAGIGITLIISSQLTGYLVDFIDSQKLMRGMTMIQIIGVILVTIVLLNHWN 309
A A GI ++ + + ++ R M+ +I I+L
Sbjct: 243 WDATTIGISLAAFGILHSLAQ---AMITGPVAARLGERRALMLGMIADGTGYILLAFATR 299

Query: 310 FWILAIGFIILIAPVTGVATLGFTIAMDESSSGRGSSSSLLGLVQFLFGGVASPLVGVKG 369
W+ ++L + G+ L ++ +G L + L + PL+
Sbjct: 300 GWMAFPIMVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSL-TSIVGPLLFTAI 358

Query: 370 EDNPIPY---IIIIIATAVILIILQI 392
I I A+ L+ L
Sbjct: 359 YAASITTWNGWAWIAGAALYLLCLPA 384


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS07515  HTHTETR       45     4e-08    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 44.6 bits (105), Expect = 4e-08
Identities = 13/69 (18%), Positives = 24/69 (34%)

Query: 2 KRQAKIEIQNALVDLMAEYPFQEISTKMICAYCNINRSTFYDYYKDKFDLLDTINSKHKE 61
++ + I + + L ++ S I + R Y ++KDK DL I +
Sbjct: 9 AQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSES 68

Query: 62 KFQFLLSAL 70
L
Sbjct: 69 NIGELELEY 77


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS07520  RTXTOXIND     59     1e-12    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 59.1 bits (143), Expect = 1e-12
Identities = 26/133 (19%), Positives = 45/133 (33%), Gaps = 13/133 (9%)

Query: 87 MDLKMPQKGTIAKLD-GMEGSMVQAGNPIAYAYNLDD-LYVTANIDEKDIKDVEVGKDVD 144
++ P + +L EG +V + DD L VTA + KDI + VG++
Sbjct: 328 SVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAI 387

Query: 145 VTIDGQKAS----IKGKVDSIGKATAASFSLMPSSNSDGNYTKVSQVIPVKITLESEPSK 200
+ ++ + + GKV +I G V I +
Sbjct: 388 IKVEAFPYTRYGYLVGKVKNINLDAI-------EDQRLGLVFNVIISIEENCLSTGNKNI 440

Query: 201 QVVPGMNAEVKIH 213
+ GM +I
Sbjct: 441 PLSSGMAVTAEIK 453



Score = 32.5 bits (74), Expect = 0.001
Identities = 17/77 (22%), Positives = 35/77 (45%), Gaps = 2/77 (2%)

Query: 9 VITVVVLLAIGIAGFYFWNKTTSYVTTDNAKV--NGDQIKIASPASGQIKSLNVKQGDKL 66
++ ++ + IA V T N K+ +G +I + +K + VK+G+ +
Sbjct: 59 LVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESV 118

Query: 67 DKGDKVATVTVQGQDGE 83
KGD + +T G + +
Sbjct: 119 RKGDVLLKLTALGAEAD 135


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS07525  TCRTETB       158    2e-44    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 158 bits (402), Expect = 2e-44
Identities = 92/415 (22%), Positives = 187/415 (45%), Gaps = 16/415 (3%)

Query: 140 KILAALLFGMFIAILNQTLLNVALPKINTEFNISASTGQWLMTGFMLVNGILIPITAYLF 199
+IL L F ++LN+ +LNV+LP I +FN ++ W+ T FML I + L
Sbjct: 14 QILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLS 73

Query: 200 NKYSYRKLFLVALVLFTIGSLICAISMN-FPIMMVGRVLQAIGAGVLMPLGSIVIITIYP 258
++ ++L L +++ GS+I + + F ++++ R +Q GA L +V+ P
Sbjct: 74 DQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIP 133

Query: 259 PEKRGAAMGTMGIAMILAPAIGPTLSGYIVQNYHWNVMFYGMFIIGIIAILVGFVWFKLY 318
E RG A G +G + + +GP + G I HW+ + +I II + K
Sbjct: 134 KENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIP-MITIITVPFLMKLLKKE 192

Query: 319 QYTTNPKADIPGIIFSTIGFGALLYGFSEAGNKGWGSVEIETMFAIGIIFIILFVIRELR 378
DI GII ++G + + + ++ ++FV +
Sbjct: 193 VRIKGH-FDIKGIILMSVGIVFFMLF---TTSYSIS------FLIVSVLSFLIFVKHIRK 242

Query: 379 MKSPMLNLEVLKFPTFTLTTIINMVVMLSLYGGMILLPIYLQNLRGFSALDSG-LLLLPG 437
+ P ++ + K F + + ++ ++ G + ++P ++++ S + G +++ PG
Sbjct: 243 VTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPG 302

Query: 438 SLIMGLLGPFAGKLLDTIGLKPLAIFGIAVMTYATWELTKLNMDTP-YMTIMGIYVLRSF 496
++ + + G G L+D G + G+ ++ + + L T +MTI+ ++VL
Sbjct: 303 TMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVL--G 360

Query: 497 GMAFIMMPMVTAAINALPGRLASHGNAFLNTMRQLAGSIGTAILVTVMTTQTTQH 551
G++F + T ++L + A G + LN L+ G AI+ +++
Sbjct: 361 GLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLSIPLLDQ 415


44  CH52_RS08385  CH52_RS08415  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
CH52_RS08385  -3      8        -1.231499  Asp23/Gls24 family envelope stress response
CH52_RS08390  -3      8        -1.506875  D-ornithine--citrate ligase SfaD
CH52_RS08395  -2      9        -1.438600  staphyloferrin A export MFS transporter
CH52_RS08400  -3      9        -1.582261  staphyloferrin A synthetase SfaB
CH52_RS08405  -1      11       -1.353428  staphyloferrin A biosynthesis protein SfaC
CH52_RS08410  -2      11       -1.160747  hypothetical protein
CH52_RS08415  0       10       -0.344044  Fe(3+) dicitrate ABC transporter
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS08395  TCRTETOQM     29     0.012    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 28.7 bits (64), Expect = 0.012
Identities = 14/43 (32%), Positives = 21/43 (48%), Gaps = 5/43 (11%)

Query: 99 VDLKVILEYGE-----SAPKIFRKVTELVKEQVKYITGLDVVE 136
D K+ +YG S P FR + +V EQV G +++E
Sbjct: 495 TDCKICFKYGLYYSPVSTPADFRMLAPIVLEQVLKKAGTELLE 537


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS08400  PF04183       270    3e-84    IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 270 bits (691), Expect = 3e-84
Identities = 93/475 (19%), Positives = 181/475 (38%), Gaps = 45/475 (9%)

Query: 197 SEQAVIEGHPLHPGAKLRKGLNALQTFLYSSEFNQPIKLKIVLIHSKLSRTMSLSKDYDT 256
Q ++ GHP K R+G Y+ E+ +L + + + M D +
Sbjct: 128 RLQCLLSGHPKFVFNKGRRGWGKEALERYAPEYANTFRLHWLAVKRE---HMIWRCDNEM 184

Query: 257 TVHQLF-----PDLIKQLENEFTPNFNFNDYHIMIVHPWQLDDVLHSDYQAEVDKELIIE 311
+HQL P + + N +++ + VHPWQ + +D+ A+ + ++
Sbjct: 185 DIHQLLTAAMDPQEFARFSQVWQENGLDHNWLPLPVHPWQWQQKIATDFIADFAEGRMVS 244

Query: 312 AKHTLD-YYAGLSFRTLVPKYPAMSPHIKLSTNVHITGEIRTLSEQTTHNGPLMTRILND 370
D + A S RTL IKL ++ T R + + GPL +R L
Sbjct: 245 LGEFGDQWLAQQSLRTLTNASRRGGLDIKLPLTIYNTSCYRGIPGRYIAAGPLASRWLQQ 304

Query: 371 ILEKDVIFKSYASTIIDEVAGIHFYNEQDEVDYQTER--SEQLGTLFRKNIYQMIPQEVT 428
+ D + I+ E A + +E + E LG ++R+N + + + +
Sbjct: 305 VFATDATLVQSGAVILGEPAAGYVSHEGYAALARAPYRYQEMLGVIWRENPCRWLKPDES 364

Query: 429 PMIPSSLVATYPFNNESPIVTLIKRYQSAASLSDFESSAKSWIETYSKALLGLVIPLVTK 488
P++ ++L+ N P+ A + A++W+ + ++ + L+ +
Sbjct: 365 PVLMATLMECDE--NNQPLA--------GAYIDRSGLDAETWLTQLFRVVVVPLYHLLCR 414

Query: 489 YGIALEAHLQNAIATFRKDGLLDTMYIRDFEG-LRIDKAQLNEMGYSTSHFHEKSRILTD 547
YG+AL AH QN I K+G+ + ++DF+G +R+ K + EM S E + +
Sbjct: 415 YGVALIAHGQN-ITLAMKEGVPQRVLLKDFQGDMRLVKEEFPEMD---SLPQEVRDVTSR 470

Query: 548 SKTSVFNKAFYSTVQNHLGELILTISKASNDSNLERHMWYIVRDVLDNIFDQLVLSTHKS 607
+ + I + ER + ++ VL + + H
Sbjct: 471 LSADYLIHDLQTGHFVTVLRFISPLMVRLGVP--ERRFYQLLAAVLSDYMKK-----HPQ 523

Query: 608 NQVNENRINEIKDTMFAPFIDYKCVTTMRLE----DEAHHY--TYIK-VNNPLYR 655
+ +F P I + ++L D Y++ + NPL+
Sbjct: 524 MSERFALFS-----LFRPQIIRVVLNPVKLTWPDLDGGSRMLPNYLEDLQNPLWL 573


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS08405  TCRTETA       42     3e-06    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 41.7 bits (98), Expect = 3e-06
Identities = 53/340 (15%), Positives = 106/340 (31%), Gaps = 26/340 (7%)

Query: 6 FSSSFLLFLGNWIGQIGLNWFVLTTYHN--------AVYLGIVNFCRLVPILLLSVWAGA 57
S+ L +G IGL VL + GI+ + + GA
Sbjct: 11 LSTVALDAVG-----IGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGA 65

Query: 58 IADKYDKGRLLRITISSSFLVTAILCVLTYSFTAIPISVIIIYAT-LRGILSAVETPLRQ 116
++D++ + R + S A+ Y+ A + ++Y + ++ +
Sbjct: 66 LSDRFGR----RPVLLVSLAGAAV----DYAIMATAPFLWVLYIGRIVAGITGATGAVAG 117

Query: 117 AILPDLSDKISTTQAVSFHSFIINICRSIGPAIAGVILAVYHAPTTFLAQA--ICYFIAA 174
A + D++D + F S GP + G++ F A A F+
Sbjct: 118 AYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTG 177

Query: 175 LLCLPLHFKVTKIPEDATRYMPLKVIIDYFKLHMEGRQIFITSLLIMATGFSYTTLLPVL 234
LP K + P PL + + + ++ G L +
Sbjct: 178 CFLLPESHKGERRPLRREALNPLASFRWARGMTVVA-ALMAVFFIMQLVGQVPAALWVIF 236

Query: 235 TNKVFPGKSEIFGIAMTMCAIGGIIATLVL-PKVLKYIGMVNMYYLSSLLFGIALLGVVF 293
F + GI++ I +A ++ V +G L + G + + F
Sbjct: 237 GEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAF 296

Query: 294 HNIVIMFICITLIGLFSQWARTTNRVYFQNNVKDYERGKV 333
M I ++ + V + +G++
Sbjct: 297 ATRGWMAFPIMVLLASGGIGMPALQAMLSRQVDEERQGQL 336



Score = 30.6 bits (69), Expect = 0.012
Identities = 37/180 (20%), Positives = 71/180 (39%), Gaps = 21/180 (11%)

Query: 10 FLLFLGNWIGQIGLNWFVLTTYH----NAVYLGI-VNFCRLVPILLLSVWAGAIADKYDK 64
+ F+ +GQ+ +V+ +A +GI + ++ L ++ G +A + +
Sbjct: 217 AVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGE 276

Query: 65 GRLLRITISSSFLVTAILCVLTYSFTAIPISVIIIYATLRGILSAVETPLRQAILPDLSD 124
R L + + + +L T + A PI V++ + P QA+L D
Sbjct: 277 RRALMLGMIADGTGYILLAFATRGWMAFPIMVLLA-------SGGIGMPALQAMLSRQVD 329

Query: 125 KISTTQAVSFHSFIINICRSIGPAIAGVILAVYHAPT----TFLAQAICYFIAALLCLPL 180
+ Q + + ++ +GP + I A T ++A A Y LLCLP
Sbjct: 330 EERQGQLQGSLAALTSLTSIVGPLLFTAIYA-ASITTWNGWAWIAGAALY----LLCLPA 384


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS08410  PF04183       258    1e-80    IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 258 bits (661), Expect = 1e-80
Identities = 92/456 (20%), Positives = 176/456 (38%), Gaps = 56/456 (12%)

Query: 166 EGHPTHPLTKTKLPLTMEEVRAYAPEFEKEIPLQIMMIEKDHVVCTAMDGND--QFIIDE 223
GHP K + E + YAPE+ L + ++++H++ + D Q +
Sbjct: 134 SGHPKFVFNKGRRGWGKEALERYAPEYANTFRLHWLAVKREHMIWRCDNEMDIHQLLTAA 193

Query: 224 IIPEYYNQIRVFLKSLGLKSEDYRAILVHPWQYDHTIGKYFEAWIAKKILIPT-PFTILS 282
+ P+ + + + GL ++ + VHPWQ+ I F A A+ ++ F
Sbjct: 194 MDPQEFARFSQVWQENGL-DHNWLPLPVHPWQWQQKIATDFIADFAEGRMVSLGEFGDQW 252

Query: 283 KATLSFRTMSLIDKP--YHVKLPVDAQATSAVRTVSTVTTVDGPKLSYALQN-------- 332
A S RT++ + +KLP+ TS R + GP S LQ
Sbjct: 253 LAQQSLRTLTNASRRGGLDIKLPLTIYNTSCYRGIPGRYIAAGPLASRWLQQVFATDATL 312

Query: 333 ------MLNQYPGFKVAMEPFGEYANVDKDRARQLACIIRQKPE--IDGKGATVVSASLV 384
+L + V+ E + A L I R+ P + + V+ A+L+
Sbjct: 313 VQSGAVILGEPAAGYVSHEGYAALARAPYRYQEMLGVIWRENPCRWLKPDESPVLMATLM 372

Query: 385 NKNPIDQKVIVDSYLEWLNQGITKESITTFIERYAQALIPPLIAFIQNYGIALEAHMQNT 444
+ +Q + +Y++ G+ E+ ++ + + ++ PL + YG+AL AH QN
Sbjct: 373 ECDENNQPLA-GAYID--RSGLDAET---WLTQLFRVVVVPLYHLLCRYGVALIAHGQNI 426

Query: 445 VVNLGPHFDIQFLVRDLGGS-RI------DLETLQHRVSDI--KITNDSLIADSIDAVIA 495
+ + + L++D G R+ ++++L V D+ +++ D LI D
Sbjct: 427 TLAMKEGVPQRVLLKDFQGDMRLVKEEFPEMDSLPQEVRDVTSRLSADYLIHDLQTGHFV 486

Query: 496 KFQHAVIQNQMAELIHHFNQYDCVEETELFNIVQQVVA--HAINPTLPHANELKDILFGP 553
I V E + ++ V++ +P + L LF P
Sbjct: 487 TV---------LRFISPLMVRLGVPERRFYQLLAAVLSDYMKKHPQMSERFALFS-LFRP 536

Query: 554 TITVKALLNMRM-----ENKVKQYLNI--ELDNPIK 582
I L +++ + + N +L NP+
Sbjct: 537 QIIRVVLNPVKLTWPDLDGGSRMLPNYLEDLQNPLW 572


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS08415  ALARACEMASE   39     1e-05    Alanine racemase signature.
		>ALARACEMASE#Alanine racemase signature.

Length = 356

Score = 39.4 bits (92), Expect = 1e-05
Identities = 59/325 (18%), Positives = 119/325 (36%), Gaps = 33/325 (10%)

Query: 4 VNINISKIKYNAKVLQTVFQSKNIQFTPVIKCIAGDRTIVESLKALG-INHVAESRLDNI 62
++++ +K N +++ + + + V+K A I A+G + A L+
Sbjct: 7 ASLDLQALKQNLSIVRQA--ATHARVWSVVKANAYGHGIERIWSAIGATDGFALLNLEEA 64

Query: 63 ISIADQDLTYTLLRTPAKKEISDMIEKVDMSIQTELSTIHQINEVAEV-LGKKHKILLMV 121
I++ ++ +L D+ + T + + Q+ + L I L V
Sbjct: 65 ITLRERGWKGPILMLEGFFHAQDLEIYDQHRLTTCVHSNWQLKALQNARLKAPLDIYLKV 124

Query: 122 DWKDGREGVLTYDVLDYIKEIIHLKNIHFVGLAFNFMCFKSDAPSDDDIFMINRFVSAVE 181
+ R G VL +++ + N+ + L +F ++ P D + R A E
Sbjct: 125 NSGMNRLGFQPDRVLTVWQQLRAMANVGEMTLMSHFAE--AEHP-DGISGAMARIEQAAE 181

Query: 182 REIGYRLKIISGGNSSMLPQLLYNDLGKINELRIGETLFRGVDTTTNQAIAML-YQDAIT 240
+ R + + + P+ ++ +R G L+ + + IA + +T
Sbjct: 182 -GLECRRSLSNSAATLWHPEAHFD------WVRPGIILYGASPSGQWRDIANTGLRPVMT 234

Query: 241 LEAEILEIK-----PRVN-----TQTHESFLQAIVDIGYLD---TKVDNISPM---DQHI 284
L +EI+ ++ RV T E + IV GY D +P+
Sbjct: 235 LSSEIIGVQTLKAGERVGYGGRYTARDEQRI-GIVAAGYADGYPRHAPTGTPVLVDGVRT 293

Query: 285 NILGA-SSDHLMLDLNGQGHYQVGD 308
+G S D L +DL +G
Sbjct: 294 MTVGTVSMDMLAVDLTPCPQAGIGT 318


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS08425  FERRIBNDNGPP  96     5e-25    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 96.2 bits (239), Expect = 5e-25
Identities = 64/257 (24%), Positives = 107/257 (41%), Gaps = 24/257 (9%)

Query: 53 DAKRIVVLEYSFADALAALDVKPVGIADDGKKKRIIK--PVREKIGDYTSVGTRKQPNLE 110
D RIV LE+ + L AL + P G+AD + + P+ + + D VG R +PNLE
Sbjct: 34 DPNRIVALEWLPVELLLALGIVPYGVADTINYRLWVSEPPLPDSVID---VGLRTEPNLE 90

Query: 111 EISKLKPDLIIADSSRHKGINKELNKIAPTLSLKSFDGDYKQNI--NSFKTIAKALNKEK 168
++++KP ++ S+ + + L +IAP DG + S +A LN +
Sbjct: 91 LLTEMKPSFMVW-SAGYGPSPEMLARIAPGRGFNFSDGKQPLAMARKSLTEMADLLNLQS 149

Query: 169 EGEKRLAEHDKLINKYKDEIKFDRNQKVLPAVV---AKAGLLAHPNYSYVGQFLNELGFK 225
E LA+++ I K R + L + L+ PN S + L+E G
Sbjct: 150 AAETHLAQYEDFIRSMKPRF-VKRGARPLLLTTLIDPRHMLVFGPN-SLFQEILDEYGIP 207

Query: 226 NALSDDVTKGLSKYLKGPYLQLDTEHLADLNPERMIIMTDHAKKDSAEFKKLQEDATWKK 285
NA +G + + + + LA ++ DH +S + L W+
Sbjct: 208 NAW-----QGETNFWG--STAVSIDRLAAYKDVDVLCF-DHD--NSKDMDALMATPLWQA 257

Query: 286 LNAVKNNRVDIVDRDVW 302
+ V+ R V VW
Sbjct: 258 MPFVRAGRFQRVP-AVW 273


45  CH52_RS10385  CH52_RS10530  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
CH52_RS10385  3       13       -3.934384  ********staphylococcal enterotoxin type O
CH52_RS10395  5       18       -6.436603  staphylococcal enterotoxin type M
CH52_RS10440  7       20       -7.212328  staphylococcal enterotoxin type I
CH52_RS10445  8       20       -6.934678  staphylococcal enterotoxin type N
CH52_RS10450  9       19       -7.192623  staphylococcal enterotoxin type G
CH52_RS14910  8       17       -6.702866  DUF1828 domain-containing protein
CH52_RS10465  7       19       -6.874211  hypothetical protein
CH52_RS10470  7       21       -6.268499  hypothetical protein
CH52_RS10475  4       17       -4.922930  DUF1829 domain-containing protein
CH52_RS10480  3       14       -3.846104  bi-component leukocidin LukED subunit E
CH52_RS10485  4       14       -4.130453  bi-component leukocidin LukED subunit D
CH52_RS10490  3       16       -4.481810  DUF1828 domain-containing protein
CH52_RS15035  2       15       -3.698708  hypothetical protein
CH52_RS10495  2       12       -3.750974  DUF4888 domain-containing protein
CH52_RS10500  2       13       -3.987464  serine protease SplA
CH52_RS10510  0       13       -2.687475  serine protease SplB
CH52_RS10515  2       13       -1.509630  serine protease SplC
CH52_RS10520  3       12       -1.032747  serine protease SplD
CH52_RS10530  3       12       -0.647401  serine protease SplF
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS10440  BACTRLTOXIN   171    7e-55    Bacterial toxin signature.
		>BACTRLTOXIN#Bacterial toxin signature.

Length = 266

Score = 171 bits (435), Expect = 7e-55
Identities = 91/270 (33%), Positives = 140/270 (51%), Gaps = 20/270 (7%)

Query: 3 NSKVMLNVLLLILNLIAICSVNNAYANEE-DPKIESLCKKSSVDPIALHNINDDYINNRF 61
++ ++ ++LI LI + S N A + DP + L K S + N+ Y ++
Sbjct: 2 YKRLFISRVILIFALILVISTPNVLAESQPDPMPDDLHKSSEFTG-TMGNMKYLYDDHYV 60

Query: 62 TTVKSIVSTTEKFLDFDLLFKSINWLDGISAEFKDLKVEFSSSAISKEFLGKTVDIYGVY 121
+ K V + +KFL DL++ D + +K E + ++K++ + VD+YG
Sbjct: 61 SATK--VKSVDKFLAHDLIYNI---SDKKLKNYDKVKTELLNEDLAKKYKDEVVDVYGSN 115

Query: 122 YKAHCH-------GEHQVDTACTYGGVTPHENNKLSEP--KNIGVAVYKDNVNVNTFIVT 172
Y +C+ G+ C YGG+T HE N +N+ V VY++ N +F V
Sbjct: 116 YYVNCYFSSKDNVGKVTGGKTCMYGGITKHEGNHFDNGNLQNVLVRVYENKRNTISFEVQ 175

Query: 173 TDKKKVTAQELDIKVRTKLNNAYKLYDRMTSDVQKGYIKFHSHSEHKESFYYDLFYIKGN 232
TDKK VTAQELDIK R L N LY+ +S + GYIKF ++ +F+YD+ G+
Sbjct: 176 TDKKSVTAQELDIKARNFLINKKNLYEFNSSPYETGYIKFIENNG--NTFWYDMMPAPGD 233

Query: 233 LPDQ--YLQIYNDNKTIDSSDYHIDVYLFT 260
DQ YL +YNDNKT+DS I+V+L T
Sbjct: 234 KFDQSKYLMMYNDNKTVDSKSVKIEVHLTT 263


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS10445  BACTRLTOXIN   121    7e-36    Bacterial toxin signature.
		>BACTRLTOXIN#Bacterial toxin signature.

Length = 266

Score = 121 bits (306), Expect = 7e-36
Identities = 64/231 (27%), Positives = 111/231 (48%), Gaps = 36/231 (15%)

Query: 28 NLRNYYGSYPIEDHQSINPENNHLSHQLVFSMDNST------VTAEFKNVDDVKKFKNHA 81
N++ Y + + + + L+H L++++ + V E N D KK+K+
Sbjct: 50 NMKYLYDDHYVS-ATKVKSVDKFLAHDLIYNISDKKLKNYDKVKTELLNEDLAKKYKDEV 108

Query: 82 VDVYGLSYSGYCLKNKY------------IYGGVTFA-GDYLEKSRRIPINLWVNGEHQT 128
VDVYG +Y C + +YGG+T G++ + + + V +
Sbjct: 109 VDVYGSNYYVNCYFSSKDNVGKVTGGKTCMYGGITKHEGNHFDNGNLQNVLVRVYENKRN 168

Query: 129 ISTDKVSTNKKLVTAQEIDTKLRRYLQEEYNIYGFNDTNKGRNYGNKSKFSSGFNAGKIL 188
+ +V T+KK VTAQE+D K R +L + N+Y FN SS + G I
Sbjct: 169 TISFEVQTDKKSVTAQELDIKARNFLINKKNLYEFN--------------SSPYETGYIK 214

Query: 189 FHLNDGSSFSYDLFDT-GTGQAES-FLKIYNDNKTVETEKFHLDVEISYKD 237
F N+G++F YD+ G +S +L +YNDNKTV+++ ++V ++ K+
Sbjct: 215 FIENNGNTFWYDMMPAPGDKFDQSKYLMMYNDNKTVDSKSVKIEVHLTTKN 265


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS10450  BACTRLTOXIN   108    2e-30    Bacterial toxin signature.
		>BACTRLTOXIN#Bacterial toxin signature.

Length = 266

Score = 108 bits (270), Expect = 2e-30
Identities = 54/227 (23%), Positives = 98/227 (43%), Gaps = 37/227 (16%)

Query: 30 VGNLRNFYTKHDYIDLKGVTDKNLPIANQLEFS------TGTNDLISESNNWDEISKFKG 83
+GN++ Y H K + +A+ L ++ + + +E N D K+K
Sbjct: 48 MGNMKYLYDDHYVSATKVKSVDKF-LAHDLIYNISDKKLKNYDKVKTELLNEDLAKKYKD 106

Query: 84 KKLDIFGIDY-------------NGPCKSKYMYGGATL-SGQYLNSARKIPINLWVNGKH 129
+ +D++G +Y MYGG T G + ++ + + V
Sbjct: 107 EVVDVYGSNYYVNCYFSSKDNVGKVTGGKTCMYGGITKHEGNHFDNGNLQNVLVRVYENK 166

Query: 130 KTISTDKIATNKKLVTAQEIDVKLRRYLQEEYNIYGHNNTGKGKEYGYKSKFYSGFNNGK 189
+ + ++ T+KK VTAQE+D+K R +L + N+Y N+ + G
Sbjct: 167 RNTISFEVQTDKKSVTAQELDIKARNFLINKKNLY-EFNSSP-------------YETGY 212

Query: 190 VLFHLNNEKSFSYDLF-YTGDGLPVS-FLKIYEDNKIIESEKFHLDV 234
+ F NN +F YD+ GD S +L +Y DNK ++S+ ++V
Sbjct: 213 IKFIENNGNTFWYDMMPAPGDKFDQSKYLMMYNDNKTVDSKSVKIEV 259


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS10465  BACTRLTOXIN   155    9e-49    Bacterial toxin signature.
		>BACTRLTOXIN#Bacterial toxin signature.

Length = 266

Score = 155 bits (394), Expect = 9e-49
Identities = 76/265 (28%), Positives = 124/265 (46%), Gaps = 21/265 (7%)

Query: 2 RLFYIAAIII-TLLCLINNNYVNAEV----DKKDLKKKSDLDSSKLFNLTSYYTDITWQL 56
RLF I+I L+ +I+ V AE DL K S+ + + N+ Y D +
Sbjct: 4 RLFISRVILIFALILVISTPNVLAESQPDPMPDDLHKSSEF-TGTMGNMKYLYDDH--YV 60

Query: 57 DESNKISTDQLLNNTIILKNIDISVLKTSSLKVEFNSSDLANQFKGKNIDIYGLYFGNKC 116
+ S D+ L + +I D + +K E + DLA ++K + +D+YG + C
Sbjct: 61 SATKVKSVDKFLAHDLIYNISDKKLKNYDKVKTELLNEDLAKKYKDEVVDVYGSNYYVNC 120

Query: 117 -------VGLTEEKTSCLYGGVTIHDGNQLDEEKV--IGVNVFKDGVQQEGFVIKTKKAK 167
VG +C+YGG+T H+GN D + + V V+++ F ++T K
Sbjct: 121 YFSSKDNVGKVTGGKTCMYGGITKHEGNHFDNGNLQNVLVRVYENKRNTISFEVQTDKKS 180

Query: 168 VTVQELDTKVRFKLENLYKIYNKDTGNIQKGCIFFHSHNHQDQSFYYDLYNVKGSVG--A 225
VT QELD K R L N +Y ++ + G I F +N +F+YD+ G +
Sbjct: 181 VTAQELDIKARNFLINKKNLYEFNSSPYETGYIKFIENN--GNTFWYDMMPAPGDKFDQS 238

Query: 226 EFFQFYSDNRTVSSSNYHIDVFLYK 250
++ Y+DN+TV S + I+V L
Sbjct: 239 KYLMMYNDNKTVDSKSVKIEVHLTT 263


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS10470  BACTRLTOXIN   195    4e-64    Bacterial toxin signature.
		>BACTRLTOXIN#Bacterial toxin signature.

Length = 266

Score = 195 bits (497), Expect = 4e-64
Identities = 109/261 (41%), Positives = 155/261 (59%), Gaps = 11/261 (4%)

Query: 4 LSTVIIILILEIVFHNMN-YVNAQPDPKLDELNKVSDYKNNKGTMGNVMNLYTSPPVEGR 62
+S VI+I L +V N +QPDP D+L+K S++ GTMGN+ LY V
Sbjct: 7 ISRVILIFALILVISTPNVLAESQPDPMPDDLHKSSEFT---GTMGNMKYLYDDHYVSAT 63

Query: 63 GVINSRQFLSHDLIFPI---EYKSYNEVKTELENTELANNYKDKKVDIFGVPYFYTCIIP 119
V + +FL+HDLI+ I + K+Y++VKTEL N +LA YKD+ VD++G Y+ C
Sbjct: 64 KVKSVDKFLAHDLIYNISDKKLKNYDKVKTELLNEDLAKKYKDEVVDVYGSNYYVNCYFS 123

Query: 120 KSEPDINQNFGGCCMYGGLTF---NSSENERDKLITVQVTIDNRQSLGFTITTNKNMVTI 176
+ G CMYGG+T N +N + + V+V + R ++ F + T+K VT
Sbjct: 124 SKDNVGKVTGGKTCMYGGITKHEGNHFDNGNLQNVLVRVYENKRNTISFEVQTDKKSVTA 183

Query: 177 QELDYKARHWLTKEKKLYEFDGSAFESGYIKFTEKNNTSFWFDLFPKKELVPFVPYKFLN 236
QELD KAR++L +K LYEF+ S +E+GYIKF E N +FW+D+ P F K+L
Sbjct: 184 QELDIKARNFLINKKNLYEFNSSPYETGYIKFIENNGNTFWYDMMPAPGDK-FDQSKYLM 242

Query: 237 IYGDNKVVDSKSIKMEVFLNT 257
+Y DNK VDSKS+K+EV L T
Sbjct: 243 MYNDNKTVDSKSVKIEVHLTT 263


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS10495  BICOMPNTOXIN  433    e-156    Staphylococcal bi-component toxin signature.
		>BICOMPNTOXIN#Staphylococcal bi-component toxin signature.

Length = 315

Score = 433 bits (1116), Expect = e-156
Identities = 214/318 (67%), Positives = 256/318 (80%), Gaps = 10/318 (3%)

Query: 1 MFKKKMLAATLSVGLIAPLASPIQE-SRANTNIENIGDGA--EVIKRTEDVSSKKWGVTQ 57
M K K+L TLSV L+APLA+P+ E ++A + E+IG G+ E+IKRTED +S KWGVTQ
Sbjct: 1 MLKNKILTTTLSVSLLAPLANPLLENAKAANDTEDIGKGSDIEIIKRTEDKTSNKWGVTQ 60

Query: 58 NVQFDFVKDKKYNKDALIVKMQGFINSRTSFSDVKGSGYELTKRMIWPFQYNIGLTTKDP 117
N+QFDFVKDKKYNKDALI+KMQGFI+SRT++ + K + + K M WPFQYNIGL T D
Sbjct: 61 NIQFDFVKDKKYNKDALILKMQGFISSRTTYYNYKKTNH--VKAMRWPFQYNIGLKTNDK 118

Query: 118 NVSLINYLPKNKIETTDVGQTLGYNIGGNFQSAPSIGGNGSFNYSKTISYTQKSYVSEVD 177
VSLINYLPKNKIE+T+V QTLGYNIGGNFQSAPS+GGNGSFNYSK+ISYTQ++YVSEV+
Sbjct: 119 YVSLINYLPKNKIESTNVSQTLGYNIGGNFQSAPSLGGNGSFNYSKSISYTQQNYVSEVE 178

Query: 178 KQNSKSVKWGVKANEFVTPDGKKSAHDRYLFVQSPNGPTGSAREYFAPDNQLPPLVQSGF 237
+QNSKSV WGVKAN F T G+KSA D LFV + R+YF PD++LPPLVQSGF
Sbjct: 179 QQNSKSVLWGVKANSFATESGQKSAFDSDLFVGYKPH-SKDPRDYFVPDSELPPLVQSGF 237

Query: 238 NPSFITTLSHEKGSSDTSEFEISYGRNLDITYA----TLFPRTGIYAERKHNAFVNRNFV 293
NPSFI T+SHEKGSSDTSEFEI+YGRN+D+T+A T + + + R HNAFVNRN+
Sbjct: 238 NPSFIATVSHEKGSSDTSEFEITYGRNMDVTHAIKRSTHYGNSYLDGHRVHNAFVNRNYT 297

Query: 294 VRYEVNWKTHEIKVKGHN 311
V+YEVNWKTHEIKVKG N
Sbjct: 298 VKYEVNWKTHEIKVKGQN 315


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS10500  BICOMPNTOXIN  396    e-141    Staphylococcal bi-component toxin signature.
		>BICOMPNTOXIN#Staphylococcal bi-component toxin signature.

Length = 315

Score = 396 bits (1020), Expect = e-141
Identities = 96/329 (29%), Positives = 177/329 (53%), Gaps = 24/329 (7%)

Query: 1 MKMKKLVKSSVASSIALLLLSNTVDAAQHITPVSEKKVDDKITLYKTTATSDNDKLNISQ 60
M K++ ++++ S+ L + ++ A+ + I + K T ++K ++Q
Sbjct: 1 MLKNKILTTTLSVSLLAPLANPLLENAKAANDTEDIGKGSDIEIIKRTEDKTSNKWGVTQ 60

Query: 61 ILTFNFIKDKSYDKDTLVLKAAGNINSGYKKPNPKDYNYSQ-FYWGGKYNVSVSSESNDA 119
+ F+F+KDK Y+KD L+LK G I+S N K N+ + W +YN+ + + +
Sbjct: 61 NIQFDFVKDKKYNKDALILKMQGFISSRTTYYNYKKTNHVKAMRWPFQYNIGLKTN-DKY 119

Query: 120 VNVVDYAPKNQNEEFQVQQTLGYSYGGDINISNGLSGGLNGSKSFSETINYKQESYRTTI 179
V++++Y PKN+ E V QTLGY+ GG+ + L G NGS ++S++I+Y Q++Y + +
Sbjct: 120 VSLINYLPKNKIESTNVSQTLGYNIGGNFQSAPSLGG--NGSFNYSKSISYTQQNYVSEV 177

Query: 180 DRKTNHKSIGWGVEAHKIMNNGWGPYGRDSYDPTYGNELFLGGRQSSSNAGQNFLPTHQM 239
+++ N KS+ WGV+A+ + ++LF+G + S + F+P ++
Sbjct: 178 EQQ-NSKSVLWGVKANSFAT-------ESGQKSAFDSDLFVGYKPHSKDPRDYFVPDSEL 229

Query: 240 PLLARGNFNPEFISVLSHKQNDTKKSKIKVTYQREMD---------RYTNQWNRLHWIGN 290
P L + FNP FI+ +SH++ + S+ ++TY R MD Y N + H + N
Sbjct: 230 PPLVQSGFNPSFIATVSHEKGSSDTSEFEITYGRNMDVTHAIKRSTHYGNSYLDGHRVHN 289

Query: 291 NYKNQNTVTFTSTYEVDWQNHTVKLIGTD 319
+ N+N +T YEV+W+ H +K+ G +
Sbjct: 290 AFVNRN---YTVKYEVNWKTHEIKVKGQN 315


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS10530  V8PROTEASE    138    1e-41    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 138 bits (349), Expect = 1e-41
Identities = 64/212 (30%), Positives = 101/212 (47%), Gaps = 18/212 (8%)

Query: 36 EKNVKEITDATKAPYNSVVAFA--------GGTGVVVGKNTIVTNKHIAKSNDIFKNRVA 87
+ +ITD T Y V +GVVVGK+T++TNKH+ + + +
Sbjct: 73 NNDRHQITDTTNGHYAPVTYIQVEAPTGTFIASGVVVGKDTLLTNKHVVDATHGDPHALK 132

Query: 88 AHYS---SKGKGGGNYDVKDIVEYPGKEDLAIVHVHETSTEGLNFNKNVSYTKFAEGA-- 142
A S G + + I +Y G+ DLAIV + + + + V + A
Sbjct: 133 AFPSAINQDNYPNGGFTAEQITKYSGEGDLAIVK-FSPNEQNKHIGEVVKPATMSNNAET 191

Query: 143 KAKDRISVIGYPKGAQTKYKMFESTGTINHISGTFIEFDAYAQPGNSGSPVLNSKHELIG 202
+ I+V GYP G + M+ES G I ++ G +++D GNSGSPV N K+E+IG
Sbjct: 192 QVNQNITVTGYP-GDKPVATMWESKGKITYLKGEAMQYDLSTTGGNSGSPVFNEKNEVIG 250

Query: 203 ILYAGSGKDESEKNFGVYFTPQLKEFIQNNIE 234
I + G +E N V+ ++ F++ NIE
Sbjct: 251 IHWGGVP---NEFNGAVFINENVRNFLKQNIE 279


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS10535  V8PROTEASE    177    2e-56    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 177 bits (450), Expect = 2e-56
Identities = 65/230 (28%), Positives = 108/230 (46%), Gaps = 29/230 (12%)

Query: 29 EVQQTAKA-----ENNVTKIQDTNIFPYTGVVAFKS--------ATGFVVGKNTILTNKH 75
++Q A N+ +I DT Y V + A+G VVGK+T+LTNKH
Sbjct: 60 PLEQREHANVILPNNDRHQITDTTNGHYAPVTYIQVEAPTGTFIASGVVVGKDTLLTNKH 119

Query: 76 V-SKNYKVGDRITAHP---NSDKGNGGIYSIKKIINYPGKEDVSVIQVEERAIERGPKGF 131
V + + A P N D G ++ ++I Y G+ D+++++ +
Sbjct: 120 VVDATHGDPHALKAFPSAINQDNYPNGGFTAEQITKYSGEGDLAIVKFSPNEQNK----- 174

Query: 132 NFNDNVTPFKYAAGA--KAGERIKVIGYPHPYKNKYVLYESTGPVMSVEGSSIVYSAHTE 189
+ + V P + A + + I V GYP K ++ES G + ++G ++ Y T
Sbjct: 175 HIGEVVKPATMSNNAETQVNQNITVTGYPGD-KPVATMWESKGKITYLKGEAMQYDLSTT 233

Query: 190 SGNSGSPVLNSNNELVGIHFASDVKNDDNRNAYGVYFTPEIKKFIAENID 239
GNSGSPV N NE++GIH+ V N+ N V+ ++ F+ +NI+
Sbjct: 234 GGNSGSPVFNEKNEVIGIHWGG-VPNEFNG---AVFINENVRNFLKQNIE 279


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS10540  V8PROTEASE    178    7e-57    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 178 bits (452), Expect = 7e-57
Identities = 64/217 (29%), Positives = 106/217 (48%), Gaps = 23/217 (10%)

Query: 37 EKNVTQVKDTNNFPYNGVVSFK--------DATGFVIGKNTIITNKHV-SKDYKVGDRIT 87
+ Q+ DT N Y V + A+G V+GK+T++TNKHV + +
Sbjct: 73 NNDRHQITDTTNGHYAPVTYIQVEAPTGTFIASGVVVGKDTLLTNKHVVDATHGDPHALK 132

Query: 88 AHP---NGDKGNGGIYKIKSISDYPGDEDISVMNIEEQAVERGPKGFNFNENVQAFNFAK 144
A P N D G + + I+ Y G+ D++++ + + E V+ +
Sbjct: 133 AFPSAINQDNYPNGGFTAEQITKYSGEGDLAIVKFSPNEQNK-----HIGEVVKPATMSN 187

Query: 145 DA--KVDDKIKVIGYPLPAQNSFKQFESTGTIKRIKDNILNFDAYIEPGNSGSPVLNSNN 202
+A +V+ I V GYP +ES G I +K + +D GNSGSPV N N
Sbjct: 188 NAETQVNQNITVTGYPGDK-PVATMWESKGKITYLKGEAMQYDLSTTGGNSGSPVFNEKN 246

Query: 203 EVIGVVYGGIGKIGSEYNGAVYFTPQIKDFIQKHIEQ 239
EVIG+ +GG + +E+NGAV+ +++F++++IE
Sbjct: 247 EVIGIHWGG---VPNEFNGAVFINENVRNFLKQNIED 280


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS10545  V8PROTEASE    112    1e-31    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 112 bits (281), Expect = 1e-31
Identities = 58/227 (25%), Positives = 100/227 (44%), Gaps = 26/227 (11%)

Query: 30 IQQTAKA-----ENSVKLITNTNVAPYSGVTWMGA--------GTGFVVGNHTIITNKHV 76
++Q A N IT+T Y+ VT++ +G VVG T++TNKHV
Sbjct: 61 LEQREHANVILPNNDRHQITDTTNGHYAPVTYIQVEAPTGTFIASGVVVGKDTLLTNKHV 120

Query: 77 TYHM-KVGDEIKAHPNGFY--NNGGGLYKVTKIVDYPGKEDIAVVQVEEKSTQPKGRKFK 133
+KA P+ N G + +I Y G+ D+A+V+ + +
Sbjct: 121 VDATHGDPHALKAFPSAINQDNYPNGGFTAEQITKYSGEGDLAIVKFSP---NEQNKHIG 177

Query: 134 DFTSKFNIA--SEAKENEPISVIGYPNPNGNKLQMYESTGKVLSVNGNIVTSDAVVQPGS 191
+ ++ +E + N+ I+V GYP + M+ES GK+ + G + D G+
Sbjct: 178 EVVKPATMSNNAETQVNQNITVTGYP-GDKPVATMWESKGKITYLKGEAMQYDLSTTGGN 236

Query: 192 SGSPILNSKREAIGVMYASDKPTGESTRSFAVYFSPEIKKFIADNLD 238
SGSP+ N K E IG+ + AV+ + ++ F+ N++
Sbjct: 237 SGSPVFNEKNEVIGIHWGGVPNEFNG----AVFINENVRNFLKQNIE 279


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS10550  V8PROTEASE    115    6e-33    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 115 bits (290), Expect = 6e-33
Identities = 60/227 (26%), Positives = 103/227 (45%), Gaps = 26/227 (11%)

Query: 30 IQQTAKA-----ENTVKQITNTNVAPYSGVTWMGA--------GTGFVVGNHTIITNKHV 76
++Q A N QIT+T Y+ VT++ +G VVG T++TNKHV
Sbjct: 61 LEQREHANVILPNNDRHQITDTTNGHYAPVTYIQVEAPTGTFIASGVVVGKDTLLTNKHV 120

Query: 77 TYHM-KVGDEIKAHPNGFY--NNGGGLYKVTKIVDYPGKEDIAVVQVEEKSTQPKGRKFK 133
+KA P+ N G + +I Y G+ D+A+V+ + +
Sbjct: 121 VDATHGDPHALKAFPSAINQDNYPNGGFTAEQITKYSGEGDLAIVKFSP---NEQNKHIG 177

Query: 134 DFTSKFNIA--SEAKENEPISVIGYPNPNGNKLQMYESTGKVLSVNGNIVSSDAIIQPGS 191
+ ++ +E + N+ I+V GYP + M+ES GK+ + G + D G+
Sbjct: 178 EVVKPATMSNNAETQVNQNITVTGYP-GDKPVATMWESKGKITYLKGEAMQYDLSTTGGN 236

Query: 192 SGSPILNSKHEAIGVIYAGNKPSGESTRGFAVYFSPEIKKFIADNLD 238
SGSP+ N K+E IG+ + G AV+ + ++ F+ N++
Sbjct: 237 SGSPVFNEKNEVIGIHWGGVPNEF----NGAVFINENVRNFLKQNIE 279


46  CH52_RS11875  CH52_RS11915  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
CH52_RS11875  0       14       -3.013840  rhomboid family intramembrane serine protease
CH52_RS11880  -1      12       -2.748527  YqgQ family protein
CH52_RS11885  1       15       -2.604051  ROK family glucokinase
CH52_RS11890  1       14       -2.520847  MTH1187 family thiamine-binding protein
CH52_RS11895  0       17       -4.022189  MBL fold metallo-hydrolase
CH52_RS11900  1       20       -4.240512  GspE/PulE family protein
CH52_RS11905  2       21       -5.241577  type II secretion system F family protein
CH52_RS11910  3       23       -6.524292  prepilin-type N-terminal cleavage/methylation
CH52_RS11915  4       22       -6.319376  type II secretion system GspH family protein
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS11880  TCRTETA       33     0.002    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 33.3 bits (76), Expect = 0.002
Identities = 29/170 (17%), Positives = 54/170 (31%), Gaps = 51/170 (30%)

Query: 241 MLTVYFIAGLFGN--------FVSLSFNTTTISVGASGAIFGLIGSIFAMMY---VSKTF 289
++ V+FI L G F F+ ++G S A FG++ S+ M V+
Sbjct: 215 LMAVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARL 274

Query: 290 NKK----------MLGQLLIA-----------LVILVGVSLFMS------NINIVAHIGG 322
++ G +L+A +V+L + M + + G
Sbjct: 275 GERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLASGGIGMPALQAMLSRQVDEERQG 334

Query: 323 FIGGLLITL-----------IGYYYKVNRNIF--WILLIGMLVIFIALQI 359
+ G L L Y + + W + G + + L
Sbjct: 335 QLQGSLAALTSLTSIVGPLLFTAIYAASITTWNGWAWIAGAALYLLCLPA 384


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS11890  PF03309       30     0.012    Bvg accessory factor
		>PF03309#Bvg accessory factor

Length = 271

Score = 29.7 bits (67), Expect = 0.012
Identities = 32/154 (20%), Positives = 51/154 (33%), Gaps = 37/154 (24%)

Query: 5 ILAADVGGTTCKLGIFTPELEQ---LHKWSIHTD---TSDSTGYTLLKGIYDSFVEKVNE 58
+LA DV T +G+ + + + +W I T+ T+D + G+
Sbjct: 2 LLAIDVRNTHTVVGLISGSGDHAKVVQQWRIRTEPEVTADELA-LTIDGLI--------- 51

Query: 59 NNYNFSNVLGVGIG--VPGPVDFEKGTVNGAVNLYWPE------KVNVREIFEQFVDCPV 110
+ + G VP V E V + YWP + VR VD P
Sbjct: 52 -GDDAERLTGASGLSTVP-SVLHE---VRVMLEQYWPNVPHVLIEPGVRTGIPLLVDNPK 106

Query: 111 YVDND--ANIAALGEKHKGAGEGADDVVAITLGT 142
V D N A K+ + + G+
Sbjct: 107 EVGADRIVNCLAAYHKYGT------AAIVVDFGS 134


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS11900  SHIGARICIN    27     0.039    Ribosome inactivating protein family signature.
		>SHIGARICIN#Ribosome inactivating protein family signature.

Length = 289

Score = 27.5 bits (61), Expect = 0.039
Identities = 20/99 (20%), Positives = 38/99 (38%), Gaps = 11/99 (11%)

Query: 82 DFLKDPVKNGADKFKQYGLPIITSKVTPEK-------LNEGSTEIE-GFKFNVLHTPGHS 133
F+ + K + K Y +P++ S + + N I ++ G+
Sbjct: 39 VFISNLRKALPYERKLYDIPLLRSTLPGSQRYALIHLTNYADETISVAIDVTNVYVMGYR 98

Query: 134 PGSLTYVFDEFAVVG--DTLFNNGIGRTDL-YKGDYETL 169
G +Y F+E + +F + + L Y G+YE L
Sbjct: 99 AGDTSYFFNEASATEAAKYVFKDAKRKVTLPYSGNYERL 137


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS11910  BCTERIALGSPF  84     4e-20    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 84.1 bits (208), Expect = 4e-20
Identities = 65/347 (18%), Positives = 137/347 (39%), Gaps = 6/347 (1%)

Query: 14 KKRQLSKAQQIDLLSNLCNLLKYGFTLYQSFQFLNLQMTYKN-KQLGTTILSEISNGAPC 72
+K +LS + L L L+ L ++ + Q + QL + S++ G
Sbjct: 61 RKIRLSTSDLALLTRQLATLVAASMPLEEALDAVAKQSEKPHLSQLMAAVRSKVMEGHSL 120

Query: 73 NQIL-SLIGYSDTI-VMQVYLAERFGNIIDVLEETVNYMKVNRKSEQRLLKTLQYPLILV 130
+ G + + V E G++ VL +Y + ++ R+ + + YP +L
Sbjct: 121 ADAMKCFPGSFERLYCAMVAAGETSGHLDAVLNRLADYTEQRQQMRSRIQQAMIYPCVLT 180

Query: 131 SIFIAMIIILNLTVIPQFQQLYTSMNIQLSSFQKTLSFFITSLPTIIVVMLIIVSMLAII 190
+ IA++ IL V+P+ + + M L + L ++ T ML+ + +
Sbjct: 181 VVAIAVVSILLSVVVPKVVEQFIHMKQALPLSTRVLMGMSDAVRTFGPWMLLALLAGFMA 240

Query: 191 MKLIYNNLNMLNKIN-FVMKLPLISGYFQLFKTYFVTNELVLFYKNGITLQSIVDVYINH 249
+++ + ++ LPLI + T L + + + L + + +
Sbjct: 241 FRVMLRQEKRRVSFHRRLLHLPLIGRIARGLNTARYARTLSILNASAVPLLQAMRISGDV 300

Query: 250 SS-DPFRQFLGKYLLTYSEMGYGLPQILEKLKCFKPQLIKFVLQGEKRGKLEVELKLYSQ 308
S D R L E G L + LE+ F P + + GE+ G+L+ L+ +
Sbjct: 301 MSNDYARHRLSLATDAVRE-GVSLHKALEQTALFPPMMRHMIASGERSGELDSMLERAAD 359

Query: 309 ILVKQIEDKAIKQTQFLQPILFLILGLFIVAIYLVIMLPMFQMMQSI 355
++ + +P+L + + ++ I L I+ P+ Q+ +
Sbjct: 360 NQDREFSSQMTLALGLFEPLLVVSMAAVVLFIVLAILQPILQLNTLM 406


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS11915  BCTERIALGSPG  46     9e-10    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 46.4 bits (110), Expect = 9e-10
Identities = 19/76 (25%), Positives = 44/76 (57%), Gaps = 4/76 (5%)

Query: 3 KFLKKTQAFTLIEMLLVLLIISLLLILIIPNI--AKQTAHIQSTGCNAQVKMVNSQIEAY 60
+ K + FTL+E+++V++II +L L++PN+ K+ A Q + + + + ++ Y
Sbjct: 2 RATDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKA--VSDIVALENALDMY 59

Query: 61 ALKHNRNPSSIEDLIA 76
L ++ P++ + L +
Sbjct: 60 KLDNHHYPTTNQGLES 75


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS11920  BCTERIALGSPH  40     7e-07    Bacterial general secretion pathway protein H signa...
		>BCTERIALGSPH#Bacterial general secretion pathway protein H signature.

Length = 170

Score = 39.9 bits (93), Expect = 7e-07
Identities = 14/79 (17%), Positives = 38/79 (48%), Gaps = 4/79 (5%)

Query: 9 KQSAFTMIEMLVVMMLISIFLLLTMTSKGLSNLRVIDDEA-NIISFITELNYIKSQAIAN 67
+Q FT++EM+++++L+ + + + + S D A + F +L +++ + +
Sbjct: 2 RQRGFTLLEMMLILLLMGVSAGMVLLAFPAS---RDDSAAQTLARFEAQLRFVQQRGLQT 58

Query: 68 QGYINVRFYENSDTIKVIE 86
+ V + + V+E
Sbjct: 59 GQFFGVSVHPDRWQFLVLE 77


47  CH52_RS12555  CH52_RS12595  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
CH52_RS12555  1       16       -1.804160  YozE family protein
CH52_RS12560  0       15       -1.759936  S41 family peptidase
CH52_RS12565  0       15       -2.261781  GNAT family N-acetyltransferase
CH52_RS12575  0       17       -2.564002  undecaprenyldiphospho-muramoylpentapeptide
CH52_RS12580  -1      14       -2.266053  phosphatase PAP2 family protein
CH52_RS12585  1       13       -3.273817  hypothetical protein
CH52_RS12590  0       11       -0.449307  response regulator transcription factor ArlR
CH52_RS12595  0       10       -0.160855  sensor histidine kinase ArlS
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS12565  RTXTOXINA     25     0.022    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 25.3 bits (55), Expect = 0.022
Identities = 10/25 (40%), Positives = 15/25 (60%)

Query: 15 GRHDDKGRLAEEIFDDLAFPKHDDD 39
G +DK LA+ F D+AF + +D
Sbjct: 866 GGKEDKLSLADIDFRDVAFKREGND 890


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS12580  SACTRNSFRASE  32     5e-04    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 32.2 bits (73), Expect = 5e-04
Identities = 33/140 (23%), Positives = 54/140 (38%), Gaps = 19/140 (13%)

Query: 30 EQWDDQYPLLEHFEEDIAKDYLYVLEENDKIYGFIVVDQDQAEWYDDIDWPVNREGAFVI 89
+Q++D + + EE+ +LY LE + G I + N G +I
Sbjct: 48 KQYEDDDMDVSYVEEEGKAAFLYYLE--NNCIGRIKIRS-------------NWNGYALI 92

Query: 90 HRLTGSKEY--KGAATELFNYVIDVVKARGAEVILTDTFALNKPAQGLFAKFGFHKVGEQ 147
+ +K+Y KG T L + I+ K ++ +T +N A +AK F
Sbjct: 93 EDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFIIGAVD 152

Query: 148 LMEYP--PYDKGEPFYAYYK 165
M Y P + YYK
Sbjct: 153 TMLYSNFPTANEIAIFWYYK 172


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS12605  HTHFIS        93     5e-24    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 92.6 bits (230), Expect = 5e-24
Identities = 30/125 (24%), Positives = 63/125 (50%), Gaps = 4/125 (3%)

Query: 2 TQILIVEDEQNLARFLELELTHENYNVDTEYDGQDGLDKALSHYYDLIILDLMLPSINGL 61
IL+ +D+ + L L+ Y+V + + DL++ D+++P N
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 62 EICRKIRQQQS-TPIIIITAKSDTYDKVAGLDYGADDYIVKPFDIEELLARIRAIL---R 117
++ +I++ + P+++++A++ + + GA DY+ KPFD+ EL+ I L +
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 118 RQPQK 122
R+P K
Sbjct: 124 RRPSK 128


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS12610  PF06580       37     1e-04    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 37.2 bits (86), Expect = 1e-04
Identities = 31/185 (16%), Positives = 68/185 (36%), Gaps = 35/185 (18%)

Query: 277 IEEMNRIIKLVEELLELTKGDVNDISSEAQTVHINDE---IRSRIHSLKQLHPD-YQFDT 332
+E+ + +++ L EL + + S A+ V + DE + S + D QF+
Sbjct: 187 LEDPTKAREMLTSLSELMRYSLRY--SNARQVSLADELTVVDSYLQLASIQFEDRLQFEN 244

Query: 333 DLTSKNLEIKMKPHQFEQLFLIFIDNAIKYDVKNKK----IKVKTRLKNKQKIIEITDHG 388
+ +++++ P L ++N IK+ + I +K N +E+ + G
Sbjct: 245 QINPAIMDVQVPPM----LVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTG 300

Query: 389 IGIPEEDQDFIFDRFYRVDKSRSRSQGGNGLGLSIAQKIIQL---NGGSIKIKSEINKGT 445
+ ++ G GL ++ +Q+ IK+ + K
Sbjct: 301 SLALKNTKE------------------STGTGLQNVRERLQMLYGTEAQIKLSEKQGKVN 342

Query: 446 TFKII 450
+I
Sbjct: 343 AMVLI 347


48  CH52_RS13565  CH52_RS13595  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
CH52_RS13565  2       11       -0.320146  putative DNA-binding protein
CH52_RS13570  1       10       -0.016535  signal recognition particle-docking protein
CH52_RS13575  1       11       -0.050901  chromosome segregation protein SMC
CH52_RS13580  0       12       0.808757   ribonuclease III
CH52_RS13585  0       11       0.633927   acyl carrier protein
CH52_RS13595  0       10       0.604525   3-oxoacyl-[acyl-carrier-protein] reductase
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS13565  BONTOXILYSIN  26     0.037    Bontoxilysin signature.
		>BONTOXILYSIN#Bontoxilysin signature.

Length = 1196

Score = 26.0 bits (57), Expect = 0.037
Identities = 11/42 (26%), Positives = 23/42 (54%)

Query: 10 LRMNYLFDFYQSLLTNKQRNYLELFYLEDYSLSEIADTFNVS 51
L +NY + S++ ++ N L+ FY + Y + D +N++
Sbjct: 334 LNLNYFCQSFNSIIPDRFSNALKHFYRKQYYTMDYTDNYNIN 375


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS13570  SUBTILISIN    36     3e-04    Subtilisin serine protease family (S8) signature.
		>SUBTILISIN#Subtilisin serine protease family (S8) signature.

Length = 326

Score = 35.6 bits (82), Expect = 3e-04
Identities = 16/79 (20%), Positives = 29/79 (36%), Gaps = 11/79 (13%)

Query: 192 VGVNGVGKTTTIGKLAYRYKMEGKKVMLAAGDTFRAGAIDQLKVWGERVGVDVISQSEG- 250
GV GV + L + +L + + I Q + VD+IS S G
Sbjct: 101 NGVVGVAPEADL--LIIK--------VLNKQGSGQYDWIIQGIYYAIEQKVDIISMSLGG 150

Query: 251 SDPAAVMYDAINAAKNKGV 269
+ +++A+ A +
Sbjct: 151 PEDVPELHEAVKKAVASQI 169


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS13575  GPOSANCHOR    55     2e-09    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 54.7 bits (131), Expect = 2e-09
Identities = 53/326 (16%), Positives = 119/326 (36%), Gaps = 23/326 (7%)

Query: 170 KYKKRKAESLNKLDQTEDNLTRVEDILYDLEGRV-EPLKEEAAIAKEYKTLSHQMKHSDI 228
K K +E +K+ + E +E L + + E L+ + +
Sbjct: 103 KNDKSLSEKASKIQELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLE- 161

Query: 229 VVTVHDIDQYTNDNRQLDQRLNDLQGQQANKEADKQRLSQQIQQYKG-------KRHQLD 281
++ N + ++ L+ ++A EA + L + ++ K L+
Sbjct: 162 ----KALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLE 217

Query: 282 NDVESLNYQLVKATEAFEKYTGQLNVLEERKKNQSETNARYEEEQENLMELLENISNEIS 341
+ +L + +A E + K A E Q L + LE N +
Sbjct: 218 AEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFST 277

Query: 342 EAQDTYKSLKSKQKELNAVIRELEEQLYVSD----------EAHDEKLEEIKNEYYTLMS 391
K+L++++ L A +LE Q V + +A E ++++ E+ L
Sbjct: 278 ADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEAEHQKLEE 337

Query: 392 EQSDVNNDIRFLKHTIEENEAKKSRLDSRLVEVFEQLKDIQGQIKTTKKEYQQTNKELSA 451
+ + L+ ++ + K +L++ ++ EQ K + ++ +++ + +
Sbjct: 338 QNKISEASRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQ 397

Query: 452 VDKEIKNIEKDLTDTKKAQNEYEEKL 477
V+K ++ L +K E EE
Sbjct: 398 VEKALEEANSKLAALEKLNKELEESK 423



Score = 53.1 bits (127), Expect = 5e-09
Identities = 31/315 (9%), Positives = 94/315 (29%), Gaps = 18/315 (5%)

Query: 177 ESLNKLDQTEDNLTRVEDILYDLEGRVEPLKEEAAIAKEYKTLSHQMKHSDIVVTVHDID 236
E +K + + L L ++ +E + + I
Sbjct: 57 ERADKFEIENNTLKLKNSDLSFNNKALKDHNDELTEELSNAKEKLRKNDKSLSEKASKIQ 116

Query: 237 QYTNDNRQLDQRLNDLQGQQANKEADKQRLSQQIQQYKGKRHQLDNDVESLNYQLVKATE 296
+ L++ L A + L + ++ L+ +E +
Sbjct: 117 ELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSA 176

Query: 297 AFEKYTGQLNVLEERKKNQSETNARYEEEQENLMELLENISNEISEAQDTYKSLKSKQKE 356
+ + LE R+ + ++ + E + L+ +
Sbjct: 177 KIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEG 236

Query: 357 LNAVIRELEEQLYVSDEAHDEKLEEIKNEYYTLMSEQSDVNNDIRFLKHTIEENEAKKSR 416
++ + + + ++ L N I+ EA+K+
Sbjct: 237 AMNFSTADSAKI----KTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAA 292

Query: 417 LDSRLVEVFEQLKDIQGQIKTTKK--------------EYQQTNKELSAVDKEIKNIEKD 462
L++ ++ Q + + ++ ++ E+Q+ ++ + +++ +D
Sbjct: 293 LEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRD 352

Query: 463 LTDTKKAQNEYEEKL 477
L +++A+ + E +
Sbjct: 353 LDASREAKKQLEAEH 367


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS13585  ACRIFLAVINRP  26     0.012    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 26.3 bits (58), Expect = 0.012
Identities = 10/42 (23%), Positives = 17/42 (40%), Gaps = 2/42 (4%)

Query: 33 GADSLDIAELVMELEDEFGTEIPDEEAEKINTVGDAVKFINS 74
GA++LD A+ + E P + K+ D F+
Sbjct: 296 GANALDTAKAIKAKLAELQPFFP--QGMKVLYPYDTTPFVQL 335


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS13595  DHBDHDRGNASE  144    1e-44    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 144 bits (365), Expect = 1e-44
Identities = 85/250 (34%), Positives = 136/250 (54%), Gaps = 13/250 (5%)

Query: 3 KSALVTGASRGIGRSIALQLAEEGYNV-AVNYAGSKEKAEAVVEEIKAKGVDSFAIQANV 61
K A +TGA++GIG ++A LA +G ++ AV+Y + EK E VV +KA+ + A A+V
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDY--NPEKLEKVVSSLKAEARHAEAFPADV 66

Query: 62 ADADEVKAMIKEVVSQFGSLDVLVNNAGITRDNLLMRMKEQEWDDVIDTNLKGVFNCIQK 121
D+ + + + + G +D+LVN AG+ R L+ + ++EW+ N GVFN +
Sbjct: 67 RDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRS 126

Query: 122 ATPQMLRQRSGAIINLSSVVGAVGNPGQANYVATKAGVIGLTKSAARELASRGITVNAVA 181
+ M+ +RSG+I+ + S V A Y ++KA + TK ELA I N V+
Sbjct: 127 VSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVS 186

Query: 182 PGFIVSDMTDAL--SDELKEQML--------TQIPLARFGQDTDIANTVAFLASDKAKYI 231
PG +DM +L + EQ++ T IPL + + +DIA+ V FL S +A +I
Sbjct: 187 PGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGHI 246

Query: 232 TGQTIHVNGG 241
T + V+GG
Sbjct: 247 TMHNLCVDGG 256


49  CH52_RS13910  CH52_RS13945  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag      DNBias  CDNBias  %GCBias    Product
CH52_RS13910  1       11       -0.641939  carbamate kinase
CH52_RS13915  0       13       -1.950487  ornithine carbamoyltransferase
CH52_RS13920  0       14       -2.814655  superantigen-like protein SSL14
CH52_RS13925  2       17       -2.731553  superantigen-like protein SSL13
CH52_RS13930  3       17       -2.849565  superantigen-like protein SSL12
CH52_RS13935  3       18       -2.424722  hypothetical protein
CH52_RS13940  3       18       -2.446962  hypothetical protein
CH52_RS13945  2       17       -2.374043  alpha-hemolysin
ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS13915  CARBMTKINASE  388    e-138    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 388 bits (998), Expect = e-138
Identities = 144/311 (46%), Positives = 210/311 (67%), Gaps = 7/311 (2%)

Query: 3 KIVVALGGNALGK-----SPQEQLELVKNTAKSLVGLITKGHEIVISHGNGPQVGSINLG 57
++V+ALGGNAL + S +E ++ V+ TA+ + +I +G+E+VI+HGNGPQVGS+ L
Sbjct: 4 RVVIALGGNALQQRGQKGSYEEMMDNVRKTARQIAEIIARGYEVVITHGNGPQVGSLLLH 63

Query: 58 LNYAAEHNQGPAFPFAECGAMSQAYIGYQLQESLQNELHSIGMDKQVVTLVTQVEVDEND 117
++ PA P GAMSQ +IGY +Q++L+NEL GM+K+VVT++TQ VD+ND
Sbjct: 64 MDAGQATYGIPAQPMDVAGAMSQGWIGYMIQQALKNELRKRGMEKKVVTIITQTIVDKND 123

Query: 118 PAFNNPSKPIGLFYNKEEAEQIQKEKGFIFVEDAGRGYRRVVPSPQPISIIELESIKTLI 177
PAF NP+KP+G FY++E A+++ +EKG+I ED+GRG+RRVVPSP P +E E+IK L+
Sbjct: 124 PAFQNPTKPVGPFYDEETAKRLAREKGWIVKEDSGRGWRRVVPSPDPKGHVEAETIKKLV 183

Query: 178 KNDTLVIAAGGGGIPVIREQHDGFKGIDAVIDKDKTSALLGANIQCDQLIILTAIDYVYI 237
+ +VIA+GGGG+PVI E KG++AVIDKD L + D +ILT ++ +
Sbjct: 184 ERGVIVIASGGGGVPVILE-DGEIKGVEAVIDKDLAGEKLAEEVNADIFMILTDVNGAAL 242

Query: 238 NFNTENQQPLKTTNVDELKRYIDENQFAKGSMLPKIEAAISFIENNPKGSVLITSLNELD 297
+ TE +Q L+ V+EL++Y +E F GSM PK+ AAI FIE + ++ I L +
Sbjct: 243 YYGTEKEQWLREVKVEELRKYYEEGHFKAGSMGPKVLAAIRFIEWGGERAI-IAHLEKAV 301

Query: 298 AALEGKVGTVI 308
ALEGK GT +
Sbjct: 302 EALEGKTGTQV 312


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS13925  TOXICSSTOXIN  62     1e-13    Staphylococcal toxic shock syndrome toxin signature.
		>TOXICSSTOXIN#Staphylococcal toxic shock syndrome toxin signature.

Length = 234

Score = 61.6 bits (149), Expect = 1e-13
Identities = 62/222 (27%), Positives = 96/222 (43%), Gaps = 17/222 (7%)

Query: 2 KKNIMNKLVLSTALLLLGTTSTQLPKTPISFSSEAKAYNISENETNINELIKYYTQPHFS 61
KK +MN ++S LLL TT+T P+S + K S N+ NI +L+ +Y+ +
Sbjct: 3 KKLLMNFFIVSP--LLLATTATDFTPVPLSSNQIIKTAKASTND-NIKDLLDWYSSGSDT 59

Query: 62 LSGKWLWQKPNGSIHATLQTWVWYSHIQVFGSESWGNINQLRNKYVDIFGT---KDEDTV 118
+ + GS+ ++ + +F S + + + + VD+ K + T
Sbjct: 60 FTNSEVLDNSLGSMR--IKNTDGSISLIIFPSP-YYSPAFTKGEKVDLNTKRTKKSQHTS 116

Query: 119 EGYWTYDETFTGGVTPA-ATSSDKPYRLFLKYSDKQQTIIGGHEFYKGNKPVLTLKELDF 177
EG TY GVT + L +K K + G +F +K L + LDF
Sbjct: 117 EG--TYIHFQISGVTNTEKLPTPIELPLKVKVHGKDSPLKYGPKF---DKKQLAISTLDF 171

Query: 178 RIRQTLIKNKKLYNGEFNKGQI-KIT-ADGNNYTIDLSKKLK 217
IR L + LY G KIT DG+ Y DLSKK +
Sbjct: 172 EIRHQLTQIHGLYRSSDKTGGYWKITMNDGSTYQSDLSKKFE 213


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS13930  TOXICSSTOXIN  57     7e-12    Staphylococcal toxic shock syndrome toxin signature.
		>TOXICSSTOXIN#Staphylococcal toxic shock syndrome toxin signature.

Length = 234

Score = 56.6 bits (136), Expect = 7e-12
Identities = 55/228 (24%), Positives = 91/228 (39%), Gaps = 15/228 (6%)

Query: 16 LLLGTAFTQFPNTPINSSSEAKAYYINQNETNVNELTKYYSQKYLTFSNSTLWQKDNGTI 75
LLL T T F P++S+ K + N+ N+ +L +YS TF+NS + G++
Sbjct: 15 LLLATTATDFTPVPLSSNQIIKTAKASTND-NIKDLLDWYSSGSDTFTNSEVLDNSLGSM 73

Query: 76 HATLLQFSWYSHIQVYGPESWGNINQLRNKSVDIFGI---KDQETIDSFALSQETFTGGV 132
++ + S + P + + + + VD+ K Q T + + + GV
Sbjct: 74 R---IKNTDGSISLIIFPSPYYSPAFTKGEKVDLNTKRTKKSQHTSEGTYIHFQI--SGV 128

Query: 133 TPA-ATSNDKHYKLNVTYKDKAETFTGGFPVYEGNKPVLTLKELDFRIRQTLIKSKKLYN 191
T L V K + + + +K L + LDF IR L + LY
Sbjct: 129 TNTEKLPTPIELPLKV--KVHGKDSPLKYG-PKFDKKQLAISTLDFEIRHQLTQIHGLYR 185

Query: 192 NSYNKGQI-KITGTDNN-YTIDLSKRLPSTDANRYVKKPQNAKIEVIL 237
+S G KIT D + Y DLSK+ + + IE +
Sbjct: 186 SSDKTGGYWKITMNDGSTYQSDLSKKFEYNTEKPPINIDEIKTIEAEI 233


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS13935  TOXICSSTOXIN  50     2e-09    Staphylococcal toxic shock syndrome toxin signature.
		>TOXICSSTOXIN#Staphylococcal toxic shock syndrome toxin signature.

Length = 234

Score = 49.7 bits (118), Expect = 2e-09
Identities = 53/223 (23%), Positives = 89/223 (39%), Gaps = 12/223 (5%)

Query: 1 MSKNITKNIILTTTLLLLGTVLPQNQKPVFSFYSEAKAYSIGQDETNINELIKYYTQPHF 60
M+K + N + + LLL T P+ S A + D NI +L+ +Y+
Sbjct: 1 MNKKLLMNFFIVSPLLLATTATDFTPVPLSSNQIIKTAKASTND--NIKDLLDWYSSGSD 58

Query: 61 SFSNKWLYQYDNGNIYVELKRYSWSAHISLWGAESWGNINQLKDRYVDVFGLKD-KDTDQ 119
+F+N DN + +K S + ++ + + + K VD+ + K
Sbjct: 59 TFTN--SEVLDNSLGSMRIKNTDGSISLIIFPSP-YYSPAFTKGEKVDLNTKRTKKSQHT 115

Query: 120 LWWSYRETFTGGVTPAAK-PSDKTYNLFVQYKDKLQTIIGAHKIYQGNKPVLTLKEIDFR 178
+Y GVT K P+ L V+ K + K +K L + +DF
Sbjct: 116 SEGTYIHFQISGVTNTEKLPTPIELPLKVKVHGKDSPLKYGPKF---DKKQLAISTLDFE 172

Query: 179 AREALIKNKILY-TENRNKGKLKIT-GGGNNYTIDLSKRLHSD 219
R L + LY + ++ G KIT G+ Y DLSK+ +
Sbjct: 173 IRHQLTQIHGLYRSSDKTGGYWKITMNDGSTYQSDLSKKFEYN 215


ORFs having significant similarity with Known Virulence factors
LocusTag      Hits          Score  E-value  Comments
CH52_RS13950  BICOMPNTOXIN  313    e-109    Staphylococcal bi-component toxin signature.
		>BICOMPNTOXIN#Staphylococcal bi-component toxin signature.

Length = 315

Score = 313 bits (803), Expect = e-109
Identities = 72/318 (22%), Positives = 144/318 (45%), Gaps = 24/318 (7%)

Query: 9 VTTTLLLGSILMNPVANAADSDINIKTGTTDIGSNTTVKTGDLVTYDKEN--GMHKKVFY 66
+TTTL + L+ P+AN + T DIG + ++ N G+ + + +
Sbjct: 7 LTTTLSVS--LLAPLANPLLENAKAANDTEDIGKGSDIEIIKRTEDKTSNKWGVTQNIQF 64

Query: 67 SFIDDKNHNKKLLVIRTKGTIAGQYRVYSEEGANKS-GLAWPSAFKVQLQLPDNEVAQIS 125
F+ DK +NK L+++ +G I+ + Y+ + N + WP + + L+ D V+ I
Sbjct: 65 DFVKDKKYNKDALILKMQGFISSRTTYYNYKKTNHVKAMRWPFQYNIGLKTNDKYVSLI- 123

Query: 126 DYYPRNSIDTKEYMSTLTYGFNGNVTGDDTGKIGGLIGANVSIGHTLKYVQPDFKTILES 185
+Y P+N I++ TL Y GN + +GG N S ++ Y Q ++ + +E
Sbjct: 124 NYLPKNKIESTNVSQTLGYNIGGNFQSAPS--LGGNGSFNYS--KSISYTQQNYVSEVEQ 179

Query: 186 PTDKKVGWKVIFNNMVNQNWGPYDRDSWNPVYGNQLFMKTRNGSMKAAENFLDPNKASSL 245
K V W V N+ ++ + + LF+ + S + F+ ++ L
Sbjct: 180 QNSKSVLWGVKANSFATESG-------QKSAFDSDLFVGYKPHSKDPRDYFVPDSELPPL 232

Query: 246 LSSGFSPDFATVITMDRKASKQQTNIDVIYERVRD-----DYQLHWTSTNWKGTNTKDKW 300
+ SGF+P F ++ + K S + ++ Y R D H+ ++ G + +
Sbjct: 233 VQSGFNPSFIATVSHE-KGSSDTSEFEITYGRNMDVTHAIKRSTHYGNSYLDGHRVHNAF 291

Query: 301 TDRS-SERYKIDWEKEEM 317
+R+ + +Y+++W+ E+
Sbjct: 292 VNRNYTVKYEVNWKTHEI 309



 
Contact Sachin Pundhir for Bugs/Comments.