PredictBias

identification of genomic and pathogenicity islands in prokaryotic genome
 
A) Input parameters
Genome: col.gb
Threshold dinucleotide bias: 2
Threshold codon bias: 4
Threshold %GC bias: 3
E-value (RPSBlast): 0.05
Genome (non-pathogenic):
 
 
C) Potential GIs and PAIs in NC_002951
S.No  Start          End            Bias  Virulence  Insertion elements  Prediction
1     SACOL_RS00035  SACOL_RS00400  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
SACOL_RS00035  1       10       3.058066   RNA-binding protein
SACOL_RS00040  0       12       2.808933   DNA recombination protein RecF
SACOL_RS00045  0       13       3.169885   DNA gyrase subunit B
SACOL_RS00050  0       13       2.959061   DNA gyrase subunit A
SACOL_RS00055  -1      12       2.602116   carbohydrate kinase
SACOL_RS00060  -1      13       1.984719   histidine ammonia-lyase
SACOL_RS00065  -1      14       1.734646   serine--tRNA ligase
SACOL_RS00070  -1      14       1.593921   azaleucine resistance protein AzlC
SACOL_RS00075  1       14       2.139117   membrane protein
SACOL_RS00080  2       14       2.332737   homoserine O-acetyltransferase
SACOL_RS00085  4       16       2.414549   membrane protein
SACOL_RS00090  5       16       2.514244   DHH family phosphoesterase
SACOL_RS00095  7       17       2.336922   50S ribosomal protein L9
SACOL_RS00100  8       17       2.223004   replicative DNA helicase
SACOL_RS00105  8       18       1.688453   adenylosuccinate synthetase
SACOL_RS00120  4       17       2.542868   DNA-binding response regulator
SACOL_RS00125  2       18       2.713487   PAS domain-containing sensor histidine kinase
SACOL_RS00130  0       14       1.668918   hypothetical protein
SACOL_RS00135  -2      14       1.319784   hypothetical protein
SACOL_RS00140  -3      14       1.590520   metallo-hydrolase
SACOL_RS00145  -2      16       1.274260   multifunctional 2',3'-cyclic-nucleotide
SACOL_RS00150  4       16       -1.566366  23S rRNA
SACOL_RS00155  4       18       -2.925436  hypothetical protein
SACOL_RS00160  0       15       -2.312352  hypothetical protein
SACOL_RS00165  -2      13       -2.362821  transposase
SACOL_RS00170  -2      14       -2.500717  hypothetical protein
SACOL_RS00175  0       12       -3.131525  glycerophosphoryl diester phosphodiesterase
SACOL_RS00180  -1      11       -3.064462  hypothetical protein
SACOL_RS00185  0       12       -2.911905  PBP2a family beta-lactam-resistant peptidoglycan
SACOL_RS00190  2       16       -2.415805  beta-lactam sensor/signal transducer MecR1
SACOL_RS00195  0       21       -1.235425  type I restriction endonuclease
SACOL_RS00200  -1      19       -0.948162  transposase
SACOL_RS00205  -1      20       0.659081   hypothetical protein
SACOL_RS00210  -1      20       0.494480   hypothetical protein
SACOL_RS00215  -2      16       -1.250161  hypothetical protein
SACOL_RS00225  -3      12       -1.487202  recombinase RecA
SACOL_RS00230  -2      13       -2.122245  hypothetical protein
SACOL_RS00235  -1      16       -3.395314  hypothetical protein
SACOL_RS00240  2       19       1.323907   hypothetical protein
SACOL_RS00245  3       18       1.576379   Zn-dependent hydrolase
SACOL_RS00250  3       17       0.541492   dihydroneopterin aldolase
SACOL_RS00255  4       17       0.046184   hypothetical protein
SACOL_RS00260  3       16       -0.211469  membrane protein
SACOL_RS14325  4       17       -0.057025  YSIRK signal domain/LPXTG anchor domain surface
SACOL_RS00270  5       16       -6.031194  hypothetical protein
SACOL_RS00275  7       20       -6.636167  glycosyl transferase family 1
SACOL_RS00280  5       18       -5.823539  hypothetical protein
SACOL_RS00285  5       20       -5.737209  Mur ligase middle domain protein
SACOL_RS00290  6       19       -5.516022  hypothetical protein
SACOL_RS00295  7       20       -5.053191  hypothetical protein
SACOL_RS00300  5       19       -4.069080  hypothetical protein
SACOL_RS00305  2       12       -0.986508  hypothetical protein
SACOL_RS00315  4       14       -1.019121  membrane protein
SACOL_RS00320  0       11       0.398025   hypothetical protein
SACOL_RS00325  1       10       0.240317   dihydroneopterin aldolase
SACOL_RS00330  0       11       0.572445   Zn-dependent hydrolase
SACOL_RS00335  1       15       -0.806458  pyridine nucleotide-disulfide oxidoreductase
SACOL_RS00340  0       19       -1.047580  hypothetical protein
SACOL_RS00345  0       17       -2.030409  tRNA-dihydrouridine synthase
SACOL_RS00350  3       18       -4.853377  hypothetical protein
SACOL_RS00355  4       18       -5.187158  transcriptional regulator
SACOL_RS00360  2       16       -5.152377  MFS transporter
SACOL_RS00365  1       13       -3.779648  flavodoxin
SACOL_RS00370  1       14       -3.687172  LysR family transcriptional regulator
SACOL_RS00375  1       14       -2.863298  DUF4865 domain-containing protein
SACOL_RS00380  1       14       -2.633710  LysR family transcriptional regulator
SACOL_RS00385  1       15       -2.428100  hypothetical protein
SACOL_RS00390  2       15       -2.487254  ATPase
SACOL_RS00395  6       17       -2.723866  hypothetical protein
SACOL_RS00400  3       17       -1.773205  1-phosphatidylinositol phosphodiesterase
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS00075  BCTERIALGSPF  26     0.025    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 26.3 bits (58), Expect = 0.025
Identities = 21/119 (17%), Positives = 47/119 (39%), Gaps = 20/119 (16%)

Query: 7 MLILILLCGIVTLLIRIIP-----FIMISKVQLP----------DVVVRWLSFIPITLFT 51
+L ++ + + LL ++P FI K LP D V + ++ + L
Sbjct: 178 VLTVVAIAVVSILLSVVVPKVVEQFIH-MKQALPLSTRVLMGMSDAVRTFGPWMLLALLA 236

Query: 52 ALVIDSIIQQTPHG----EGYTLNIPYIIALIPTVILSIITRSLTITIISGIVIMAALR 106
+ ++ + L++P I + + + R+L+I S + ++ A+R
Sbjct: 237 GFMAFRVMLRQEKRRVSFHRRLLHLPLIGRIARGLNTARYARTLSILNASAVPLLQAMR 295
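
The Score, Expect, and Identities lines in alignment reports like the one above follow a regular layout, so they can be extracted programmatically. The following is a minimal Python sketch, assuming the layout shown on this page; the function name parse_hsp_stats and its dictionary keys are hypothetical and are not part of PredictBias or BLAST.

```python
import re

# Patterns for the statistics lines as they appear in this report
# (assumed layout; other BLAST versions may format these lines differently).
SCORE_RE = re.compile(
    r"Score\s*=\s*([\d.]+)\s*bits\s*\((\d+)\),\s*Expect\s*=\s*(\S+)"
)
IDENT_RE = re.compile(r"Identities\s*=\s*(\d+)/(\d+)")


def parse_hsp_stats(text):
    """Return one dict per Score line, with identity added when present."""
    hsps = []
    for line in text.splitlines():
        m = SCORE_RE.search(line)
        if m:
            expect = m.group(3)
            # Very small E-values are printed as "e-103"; prefix "1"
            # so that float() accepts the value.
            if expect.startswith("e"):
                expect = "1" + expect
            hsps.append({
                "bits": float(m.group(1)),
                "raw": int(m.group(2)),
                "expect": float(expect),
            })
            continue
        m = IDENT_RE.search(line)
        if m and hsps:
            # Attach the identity fraction to the most recent Score line.
            hsps[-1]["identity"] = int(m.group(1)) / int(m.group(2))
    return hsps


example = """Score = 26.3 bits (58), Expect = 0.025
Identities = 21/119 (17%), Positives = 47/119 (39%), Gaps = 20/119 (16%)"""
stats = parse_hsp_stats(example)
print(stats)
```

Run over the whole page, a parser along these lines would yield one record per alignment, which could then be filtered against the E-value (RPSBlast) threshold listed under the input parameters.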


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS00120  HTHFIS        94     2e-24    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 93.7 bits (233), Expect = 2e-24
Identities = 31/124 (25%), Positives = 64/124 (51%), Gaps = 1/124 (0%)

Query: 4 KVVVVDDEKPIADILEFNLKKEGYDVYCAYDGNDAVDLIYEEEPDIVLLDIMLPGRDGME 63
++V DD+ I +L L + GYDV + I + D+V+ D+++P + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 64 VCREVRKKYE-MPIIMLTAKDSEIDKVLGLELGADDYVTKPFSTRELIARVKANLRRHYS 122
+ ++K +P+++++A+++ + + E GA DY+ KPF ELI + L
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 123 QPAQ 126
+P++
Sbjct: 125 RPSK 128


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS00210  PHPHLIPASEA1  26     0.040    Bacterial phospholipase A1 protein signature.
		>PHPHLIPASEA1#Bacterial phospholipase A1 protein signature.

Length = 289

Score = 25.7 bits (56), Expect = 0.040
Identities = 14/43 (32%), Positives = 26/43 (60%), Gaps = 1/43 (2%)

Query: 4 NRYITRGISEHLSLDLQILLWNMVKERDNQPD-TDYLHIFRLK 45
NR TR ++E+ + +++ W +V D+ PD T Y+ ++LK
Sbjct: 176 NRLYTRLMAENGNWLVEVKPWYVVGNTDDNPDITKYMGYYQLK 218


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS00250  PF01206       63     4e-15    SirA family protein
		>PF01206#SirA family protein

Length = 76

Score = 63.2 bits (154), Expect = 4e-15
Identities = 19/70 (27%), Positives = 37/70 (52%)

Query: 119 KTFNYSNLQCPGPIVNISKEIKNIAIGDQIEVVVTDHGFLNDIKSWVKQTGHTLVRLNDF 178
++ + + L CP PI+ K + + G+ + V+ TD G + D +S+ KQTGH L+ +
Sbjct: 6 QSLDATGLNCPLPILKAKKTLATMNAGEVLYVMATDPGSVKDFESFSKQTGHELLEQKEE 65

Query: 179 GNEIRAIIQK 188
+++
Sbjct: 66 DGTYHFRLKR 75


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS00260  VACCYTOTOXIN  28     0.034    Helicobacter pylori vacuolating cytotoxin signature.
		>VACCYTOTOXIN#Helicobacter pylori vacuolating cytotoxin signature.

Length = 1291

Score = 28.5 bits (63), Expect = 0.034
Identities = 18/48 (37%), Positives = 27/48 (56%), Gaps = 7/48 (14%)

Query: 119 LALILMFIKVTPSTSHIKFNRVLLI--TIGGI-----IGLVSGIVGAG 159
LAL+ + +TP SH F ++I +GGI +G VSG++G G
Sbjct: 17 LALVGALVSITPQQSHAAFFTTVIIPAIVGGIATGAAVGTVSGLLGWG 64


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS14325  IGASERPTASE   71     3e-14    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 71.2 bits (174), Expect = 3e-14
Identities = 61/303 (20%), Positives = 103/303 (33%), Gaps = 25/303 (8%)

Query: 96 TEQASTEEKANTTEQASTEEKADTTEQATTEEAPKAEGTDKVETEEAPKAEETDKATEEA 155
T + N + ++ E A +EAP +E E K +
Sbjct: 991 TVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKT 1050

Query: 156 PKTEETDKATTEEAPKAEETDKATEEAPKTEETDKATTEEAPAAEETSKAATEEAPKAEE 215
+ E D ATE + E K A +T++ A + + +E
Sbjct: 1051 VEKNEQD---------------ATETTAQNREVAKEAKSNVKANTQTNEVA-QSGSETKE 1094

Query: 216 TSKAATEEAPKAEETEKTATEEAPKTEETDKVETEEAPKAEETSKAATEKAPKAEETNKV 275
T T+E E+ EK A E KT+E KV ++ +PK E++ + P E V
Sbjct: 1095 TQTTETKETATVEKEEK-AKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTV 1153

Query: 276 ETEEAPAAEETNKAATEETPAVEDTNAKSNSNAQPSETERTQVVDTVAKDLYKKSEVTEA 335
+E +TN A E PA +++SN + TE T V + ++
Sbjct: 1154 NIKEP--QSQTNTTADTEQPA-----KETSSNVEQPVTESTTVNTGNSVVENPENTTPAT 1206

Query: 336 EKAEIEKVLPKDISNLSNEEIKKIALSEVLKETANKENAQPRATFRSVSSNARTTNVNYS 395
+ + N ++ + V T + + A S+N +
Sbjct: 1207 TQPTVNSESSNKPKNRHRRSVRSVP-HNVEPATTSSNDRSTVALCDLTSTNTNAVLSDAR 1265

Query: 396 ATA 398
A A
Sbjct: 1266 AKA 1268



Score = 63.9 bits (155), Expect = 5e-12
Identities = 64/368 (17%), Positives = 124/368 (33%), Gaps = 18/368 (4%)

Query: 38 IFGVANDQAEAAEN--NTTQKQDDSSDASKVKGNVQTIEQSSANSNESDIPEQVDVTKDT 95
+ + N + E +TT ++ + V E+ + P +T
Sbjct: 977 RYDLYNPEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSET 1036

Query: 96 TEQASTEEKANTTEQASTEEKADTTEQATTEEAPKAEGTDKVETEEAPKAEETDKATEEA 155
TE + K + E+ A T E A +A+ K T+ A+ + T+E
Sbjct: 1037 TETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSE-TKET 1095

Query: 156 PKTEETDKATTEEAPKAEETDKATEEAPKTEETDKATTEEAPAAEETSKAATEEAP---- 211
TE + AT E+ KA+ + T+E PK E++ + ++ A E P
Sbjct: 1096 QTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNI 1155

Query: 212 --KAEETSKAATEEAPKAEETEKTATEEAPKTEETDKVETEEAPKAEETS----KAATEK 265
+T+ A E P E + T E P+ + +E
Sbjct: 1156 KEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSES 1215

Query: 266 APKAEETNKVETEEAPAAEETNKAATEETPAVEDTNAKS-NSNAQPSETERTQVVDTVAK 324
+ K + ++ P E ++ + V + S N+NA S+ +
Sbjct: 1216 SNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCDLTSTNTNAVLSDARAKAQFVALNV 1275

Query: 325 DLYKKSEVTEAEKAEIEKVLPKDISNLSNEEIKKIALSEVLKETANKENAQPRATFRSVS 384
+++ E + +SN + K S + ++K +++S
Sbjct: 1276 GKAVSQHISQLEMNNEG----QYNVWVSNTSMNKNYSSSQYRRFSSKSTQTQLGWDQTIS 1331

Query: 385 SNARTTNV 392
+N + V
Sbjct: 1332 NNVQLGGV 1339



Score = 55.8 bits (134), Expect = 1e-09
Identities = 50/274 (18%), Positives = 92/274 (33%), Gaps = 9/274 (3%)

Query: 143 PKAEETDKATEEAPKTEETDKATTEEAPKAEETDKA-TEEAPKTEETDKATTEEAPAAEE 201
P+ E+ ++ + T + + + + A +EAP +E E
Sbjct: 983 PEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAE 1042

Query: 202 TSKAATEEAPKAEETSKAATEEAPKAEETEKTATEEAPKTEETDKV--ETEEAPKAEETS 259
SK ++ K E+ + T + + + K+ + +T E + ET+E E
Sbjct: 1043 NSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKE 1102

Query: 260 KAATEKAPKA-EETNKVETEEAPAAEETNKAATEETPAVEDTNAKSN----SNAQPSETE 314
A EK KA ET K + ++ + K ET + A+ N + +P
Sbjct: 1103 TATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQT 1162

Query: 315 RTQVVDTVAKDLYKKSEVTEAEKAEIEKVLPKDISNLSNEEIKKIALSEVLKETANKENA 374
T + ++ + N N V E++NK
Sbjct: 1163 NTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTT-PATTQPTVNSESSNKPKN 1221

Query: 375 QPRATFRSVSSNARTTNVNYSATALRAAAQDTVT 408
+ R + RSV N + + + A T T
Sbjct: 1222 RHRRSVRSVPHNVEPATTSSNDRSTVALCDLTST 1255


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS00325  PF01206       62     1e-14    SirA family protein
		>PF01206#SirA family protein

Length = 76

Score = 61.7 bits (150), Expect = 1e-14
Identities = 21/70 (30%), Positives = 38/70 (54%)

Query: 119 KQFNYRGLQCPGPIVKISQEMKNIEVGDQIEVKVTDPGFPSDIKSWVKQTRHTLVKLDEN 178
+ + GL CP PI+K + + + G+ + V TDPG D +S+ KQT H L++ E
Sbjct: 6 QSLDATGLNCPLPILKAKKTLATMNAGEVLYVMATDPGSVKDFESFSKQTGHELLEQKEE 65

Query: 179 NNGINAIIQK 188
+ + +++
Sbjct: 66 DGTYHFRLKR 75


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS00350  60KDINNERMP   26     0.012    60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 26.1 bits (57), Expect = 0.012
Identities = 8/38 (21%), Positives = 16/38 (42%), Gaps = 4/38 (10%)

Query: 14 WDLFFAIPIFLLFAYL----PNYNFITIFLNIVIIIFF 47
W F + P+F L ++ N+ F I + ++
Sbjct: 332 WLWFISQPLFKLLKWIHSFVGNWGFSIIIITFIVRGIM 369


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS00360  TCRTETA       55     2e-10    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 54.8 bits (132), Expect = 2e-10
Identities = 54/283 (19%), Positives = 105/283 (37%), Gaps = 25/283 (8%)

Query: 45 LGIMLALNVLSGFLASPIIGGLADKYNRRNIILITYLLQVILYLLIVIALVMIGFETYLV 104
GI+LAL L F +P++G L+D++ RR ++L++ + Y ++ A + L
Sbjct: 45 YGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFL----WVLY 100

Query: 105 IGFAIVNGIGWTTYMATSRSLVKQILKPDQYTDANSLLEISLQTGMFIAGGLSGILYKIN 164
IG IV GI T A + + + I D+ + GM L G++ +
Sbjct: 101 IG-RIVAGITGAT-GAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFS 158

Query: 165 G---FTLIIAMTIMMFLISIFMLFRLHVDKPTHSEEESTNSLLQEYLLGWKFLKDNM--- 218
F A+ + FL F+L +H E L M
Sbjct: 159 PHAPFFAAAALNGLNFLTGCFLL------PESHKGERRPLRREALNPLASFRWARGMTVV 212

Query: 219 --MIFIFGVISIIPMVFTMIFNISLPGYVYNVLKLSSVQFGFSDMLYGI-GGLCAGLISA 275
++ +F ++ ++ V ++ I + + + G S +GI L +I+
Sbjct: 213 AALMAVFFIMQLVGQVPAALWVI----FGEDRFHWDATTIGISLAAFGILHSLAQAMITG 268

Query: 276 ILSKKISTKVLIFLLYFILVINSALFIWINSAFYLFIGSFILG 318
++ ++ + + L L + + F +L
Sbjct: 269 PVAARLGERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLA 311



Score = 29.0 bits (65), Expect = 0.040
Identities = 12/34 (35%), Positives = 18/34 (52%), Gaps = 4/34 (11%)

Query: 354 LLQSLIAPFLGRWINEINDKFGFYIILILSLLIF 387
L+Q AP LG +D+FG +L++SL
Sbjct: 54 LMQFACAPVLGAL----SDRFGRRPVLLVSLAGA 83


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS00390  GPOSANCHOR    39     8e-05    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 39.3 bits (91), Expect = 8e-05
Identities = 18/134 (13%), Positives = 50/134 (37%), Gaps = 5/134 (3%)

Query: 515 INSEKTSIEEQVYHLDNETLRDNKEIEDLDNRINYIVKQIETLNELIKSIKESNKGFINK 574
++ ++++ L E +++ D ++ +I+ L ++++ +G +N
Sbjct: 76 LSFNNKALKDHNDELTEELSNAKEKLRKNDKSLSEKASKIQELEARKADLEKALEGAMNF 135

Query: 575 LKAMFNSEEDESYKDHNKEKQQLLTQQLELEKCKKNKHEDLVSKLKEKEKLIKQLTKVQL 634
+ K EK L ++ +LEK + + + + L + ++
Sbjct: 136 ST-----ADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEA 190

Query: 635 QLDELNSQLQELEA 648
+ EL L+
Sbjct: 191 RQAELEKALEGAMN 204


2     SACOL_RS00450  SACOL_RS00515  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
SACOL_RS00450  2       12       -0.824998  membrane protein
SACOL_RS00455  1       13       0.027500   transcriptional regulator
SACOL_RS00460  1       17       1.753280   hypothetical protein
SACOL_RS00465  1       16       1.685410   L-lactate permease
SACOL_RS00470  1       19       2.085019   peptidoglycan-binding protein LysM
SACOL_RS00475  -3      10       1.724170   transcriptional regulator
SACOL_RS00480  -2      9        2.773143   iron ABC transporter permease
SACOL_RS00485  -1      11       3.224572   siderophore ABC transporter permease
SACOL_RS00490  0       11       2.982919   iron ABC transporter substrate-binding protein
SACOL_RS00495  1       15       3.402058   siderophore biosynthesis protein SbnA
SACOL_RS00500  2       16       3.716622   2,3-diaminopropionate biosynthesis protein SbnB
SACOL_RS00505  2       16       3.618573   siderophore biosynthesis protein SbnC
SACOL_RS00510  2       17       3.856488   MFS transporter
SACOL_RS00515  1       16       3.239949   siderophore biosynthesis protein SbnE
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS00490  FERRIBNDNGPP  70     7e-16    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 70.4 bits (172), Expect = 7e-16
Identities = 47/191 (24%), Positives = 78/191 (40%), Gaps = 38/191 (19%)

Query: 53 PKRVVTLYQGATDVAVSLGVKPVGAVES-----WTQKPKFEYIKNDLKDTKI-VGQEPAP 106
P R+V L ++ ++LG+ P G ++ W +P L D+ I VG P
Sbjct: 35 PNRIVALEWLPVELLLALGIVPYGVADTINYRLWVSEPP-------LPDSVIDVGLRTEP 87

Query: 107 NLEEISKLKPDLIVASKVRNEKVYDQLSKIAPTVSTDTVFKFKD----------TTKLMG 156
NLE ++++KP +V S + L++IAP F F D + M
Sbjct: 88 NLELLTEMKPSFMVWS-AGYGPSPEMLARIAPGR----GFNFSDGKQPLAMARKSLTEMA 142

Query: 157 KALGKEKEAEDLLKKYDDKVAAFQKDAKAKY--KDAWPLKASVVNF-RADHTRIYA-GGY 212
L + AE L +Y+D F + K ++ + A PL + H ++
Sbjct: 143 DLLNLQSAAETHLAQYED----FIRSMKPRFVKRGARPL--LLTTLIDPRHMLVFGPNSL 196

Query: 213 AGEILNDLGFK 223
EIL++ G
Sbjct: 197 FQEILDEYGIP 207


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS00500  SYCECHAPRONE  31     0.002    Gram-negative bacterial type III secretion SycE cha...
		>SYCECHAPRONE#Gram-negative bacterial type III secretion SycE chaperone signature.

Length = 130

Score = 31.2 bits (70), Expect = 0.002
Identities = 14/33 (42%), Positives = 16/33 (48%), Gaps = 1/33 (3%)

Query: 25 VDALTEALTAHAHNDFVQ-PLKPYLRQDPENGH 56
+D E T +HN F Q LKP L D GH
Sbjct: 54 LDNNDEKETLLSHNIFSQDILKPILSWDEVGGH 86


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS00505  PF04183       317    e-103    IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 317 bits (815), Expect = e-103
Identities = 119/527 (22%), Positives = 208/527 (39%), Gaps = 46/527 (8%)

Query: 79 RVSKQPLTAAEFWQTIANMNCDLSHEWEVARVEEGLTTAATQLAKQLSELDLASHPFV-- 136
R + +P+ A + + +S +++ T L + L++ +
Sbjct: 66 RCADEPVLAQTLLMQLKQVL-SMSDATVAEHMQDLYATLLGDLQLLKARRGLSASDLINL 124

Query: 137 -MSEQFASLKDRPFHPLAKEKRGLREADYQVYQAELNQSFPLMVAAVKKTHMIHGDTANI 195
L P K +RG + + Y E +F L AVK+ HMI +
Sbjct: 125 NADRLQCLLSGHPKFVFNKGRRGWGKEALERYAPEYANTFRLHWLAVKREHMIWRCDNEM 184

Query: 196 DELENLTVPIKEQA----TDMLNDQGLSIDDYVLFPVHPWQYQHILPNVFAKEISEKLVV 251
D + LT + Q + + + GL +++ PVHPWQ+Q + F + +E +V
Sbjct: 185 DIHQLLTAAMDPQEFARFSQVWQENGLD-HNWLPLPVHPWQWQQKIATDFIADFAEGRMV 243

Query: 252 LLPLKFGD-YLSSSSMRSLIDIGAPYN-HVKVPFAMQSLGALRLTPTRYMKNGEQAEQLL 309
L +FGD +L+ S+R+L + +K+P + + R P RY+ G A + L
Sbjct: 244 SLG-EFGDQWLAQQSLRTLTNASRRGGLDIKLPLTIYNTSCYRGIPGRYIAAGPLASRWL 302

Query: 310 RQLIEKDEALAKYVMV-CDETA-------WWSYMGQDNDIFKDQLGHLTVQLRKYPEVLA 361
+Q+ D L + V E A ++ + + +++ LG V R+ P
Sbjct: 303 QQVFATDATLVQSGAVILGEPAAGYVSHEGYAALARAPYRYQEMLG---VIWRENPCRWL 359

Query: 362 KNDTQQLVSMAALAANDRTLYQMICGKDNISKNDVMTLFEDIAQVFLKVTLSFM-QYGAL 420
K D + V MA L D + + S D T + +V + + +YG
Sbjct: 360 KPD-ESPVLMATLMECDENNQPLAGAYIDRSGLDAETWLTQLFRVVVVPLYHLLCRYGVA 418

Query: 421 PELHGQNILLSFEDGRVQKCVLRD-HDTVRIYKPWLTAHQLSLPKYV--VREDTPNTLIN 477
HGQNI L+ ++G Q+ +L+D +R+ K SLP+ V V +
Sbjct: 419 LIAHGQNITLAMKEGVPQRVLLKDFQGDMRLVKEEFPEMD-SLPQEVRDVTSRLSADYLI 477

Query: 478 EDLETFFAYFQTLAVSVNLYAIIDAIQDLFGVSEHELMSLLKQILKNEVATISWVTTDQL 537
DL+T V + I + GV E LL +L + + Q+
Sbjct: 478 HDLQTGHF--------VTVLRFISPLMVRLGVPERRFYQLLAAVLSDYMK-----KHPQM 524

Query: 538 AVRHILFDKQTWPFKQILLP---LLY-QRDSGGGSMPSGLTTVPNPM 580
+ R LF +++L L + D G +P+ L + NP+
Sbjct: 525 SERFALFSLFRPQIIRVVLNPVKLTWPDLDGGSRMLPNYLEDLQNPL 571


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS00510  TCRTETA       80     2e-18    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 79.9 bits (197), Expect = 2e-18
Identities = 71/372 (19%), Positives = 149/372 (40%), Gaps = 24/372 (6%)

Query: 13 ILWLSQFIAIAGLTVLVPLLPIYMASLQNLSVVEIQLWSGIAIAAPAVTTMIASPIWGKL 72
++ + + G+ +++P+LP + L + ++ GI +A A+ +P+ G L
Sbjct: 9 VILSTVALDAVGIGLIMPVLPGLLRDL--VHSNDVTAHYGILLALYALMQFACAPVLGAL 66

Query: 73 GDKISRKWMVLRALLGLAVCLFLMALCTTPLQFVLVRLLQGLFGGVVDASSAFASAEAPA 132
D+ R+ ++L +L G AV +MA + R++ G+ G + A+ +
Sbjct: 67 SDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIADITDG 126

Query: 133 EDRGKVLGRLQSSVSAGSLVGPLIGGVTASILGFSALLMSIAVITFIVCIFGALKLIETT 192
++R + G + + G + GP++GG+ A + A + + + G L E+
Sbjct: 127 DERARHFGFMSACFGFGMVAGPVLGGLMGGF-SPHAPFFAAAALNGLNFLTGCFLLPESH 185

Query: 193 HMPKSQTPNINKGIRRSFQCLLCTQQTCRFIIVGVLANFAMYGMLTALSPLASSVNHTAI 252
+ SF+ + V A A++ ++ + + +++
Sbjct: 186 KGERRPLRREALNPLASFR--------WARGMTVVAALMAVFFIMQLVGQVPAALWVIFG 237

Query: 253 DDR-----SVIGFLQSAF-WTASILSAPLWGRFNDKSYVKSVYIFATIACGCSAILQGLA 306
+DR + IG +AF S+ A + G + + + IA G IL A
Sbjct: 238 EDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFA 297

Query: 307 TNIEFLMAARILQGLTYSAL--IQSVMFVVVNACHQ-QLKGTFVGTTNSMLVVGQIIGSL 363
T +L + +Q+++ V+ Q QL+G+ T+ + I+G L
Sbjct: 298 TRGWMAFPIMVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTS----LTSIVGPL 353

Query: 364 SGAAITSYTTPA 375
AI + +
Sbjct: 354 LFTAIYAASITT 365


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS00515  PF04183       304    5e-98    IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 304 bits (779), Expect = 5e-98
Identities = 117/539 (21%), Positives = 211/539 (39%), Gaps = 61/539 (11%)

Query: 3 NKELIQHAAYAAIERILNEYFREENLYQVPPQNHQWSIQLSELE-TLTGEFRYWSAMGHH 61
N + + ++L+E E+ + + ++ I L + E W G
Sbjct: 2 NHKDWDLVNRRLVAKMLSELEYEQVFHAESQGDDRYCINLPGAQWRFIAERGIW---GW- 57

Query: 62 MYHPEVWLIDGKSKKITTYKEAIARILQHMAQSADNQTA-VQQHMAQIMSDI--DNSIHR 118
ID ++ + +L + Q A V +HM + + + D + +
Sbjct: 58 ------LWIDAQTLRCADEPVLAQTLLMQLKQVLSMSDATVAEHMQDLYATLLGDLQLLK 111

Query: 119 TARYLQSNTIDYVEDRYIVSEQSLYLGHPFHPTPKSASGFSEADLEKYAPECHTSFQLHY 178
R L ++ + + Q L GHP K G+ + LE+YAPE +F+LH+
Sbjct: 112 ARRGLSASDL---INLNADRLQCLLSGHPKFVFNKGRRGWGKEALERYAPEYANTFRLHW 168

Query: 179 LAVHQD-------------VLLTRYVEGKEDQVEKVLYQLADIDISEIPKDFILLPTHPY 225
LAV ++ LLT ++ +E ++Q +D +++ LP HP+
Sbjct: 169 LAVKREHMIWRCDNEMDIHQLLTAAMDPQEFARFSQVWQENGLD-----HNWLPLPVHPW 223

Query: 226 QINVLRQHPQYMQYSEQGLIKDLGVSGDSVYPTSSVRTVF--SKALNIYLKLPIHVKITN 283
Q ++ +G + LG GD S+RT+ S+ + +KLP+ + T+
Sbjct: 224 QWQQK-IATDFIADFAEGRMVSLGEFGDQWLAQQSLRTLTNASRRGGLDIKLPLTIYNTS 282

Query: 284 FIRTNDLEQIERTIDAAQVIASVKDE-----------VETPHFKLMFEEGYRALLPNPLG 332
R I A++ + V + P + EGY AL P
Sbjct: 283 CYRGIPGRYIAAGPLASRWLQQVFATDATLVQSGAVILGEPAAGYVSHEGYAALARAPYR 342

Query: 333 QTVEPEMDLLTNSAMIVREGIPNY-HADKDIHVLASLFETMPDSPMSKLSQVIEQSGLAP 391
EM +I RE + D+ ++A+L E ++ I++SGL
Sbjct: 343 YQ---EM-----LGVIWRENPCRWLKPDESPVLMATLMECDENN-QPLAGAYIDRSGLDA 393

Query: 392 EAWLECYLNRTLLPILKLFSNTGISLEAHVQNTLIELKDGIPDVCFVRDLEG-ICLSRTI 450
E WL ++P+ L G++L AH QN + +K+G+P ++D +G + L +
Sbjct: 394 ETWLTQLFRVVVVPLYHLLCRYGVALIAHGQNITLAMKEGVPQRVLLKDFQGDMRLVKEE 453

Query: 451 ATEKQLVPNVVAASSPVVYAHDEAWHRLKYYVVVNHLGHLVSTIGKATRNEVVLWQLVA 509
E +P V + + A D H L+ V L + + + E +QL+A
Sbjct: 454 FPEMDSLPQEVRDVTSRLSA-DYLIHDLQTGHFVTVLRFISPLMVRLGVPERRFYQLLA 511


3     SACOL_RS00570  SACOL_RS00595  Y     N          N                   Genomic Island
LocusTag       DNBias  CDNBias  %GCBias    Product
SACOL_RS00570  2       12       -0.517168  UDP-phosphate N-acetylgalactosaminyl-1-phosphate
SACOL_RS00575  2       11       -0.412210  glycosyl transferase family 1
SACOL_RS00580  3       12       -0.570854  ligase
SACOL_RS00585  3       11       0.817967   hypothetical protein
SACOL_RS00590  2       14       1.064274   Superoxide dismutase [Mn/Fe] 2
SACOL_RS00595  3       15       1.726907   hypothetical protein
4     SACOL_RS00785  SACOL_RS00890  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
SACOL_RS00785  3       16       1.096038   cation transporter
SACOL_RS00790  0       15       1.949406   hypothetical protein
SACOL_RS00795  -1      15       1.899052   hypothetical protein
SACOL_RS00800  -2      12       1.954314   sulfonate ABC transporter ATP-binding protein
SACOL_RS00805  -2      11       1.279246   hypothetical protein
SACOL_RS00810  1       14       1.399990   sulfonate ABC transporter permease
SACOL_RS00815  2       14       1.302222   hypothetical protein
SACOL_RS00820  2       14       1.298676   hypothetical protein
SACOL_RS00825  2       13       1.362998   formate dehydrogenase
SACOL_RS00830  2       14       1.698440   MFS transporter
SACOL_RS00835  3       13       2.032695   non-ribosomal peptide synthetase
SACOL_RS00840  2       15       2.912803   4'-phosphopantetheinyl transferase
SACOL_RS00845  2       16       2.444109   hypothetical protein
SACOL_RS00850  1       16       2.320169   acetylglutamate kinase
SACOL_RS00855  0       15       2.615794   arginine biosynthesis bifunctional protein ArgJ
SACOL_RS00860  0       14       2.156884   N-acetyl-gamma-glutamyl-phosphate reductase
SACOL_RS00865  -1      18       2.839823   ornithine aminotransferase 1
SACOL_RS00870  -1      17       2.422628   branched-chain amino acid transporter II carrier
SACOL_RS00875  1       15       3.368859   isochorismatase
SACOL_RS00880  1       13       3.755389   indolepyruvate decarboxylase
SACOL_RS00885  0       14       3.546275   hypothetical protein
SACOL_RS00890  0       14       3.219952   PTS glucose EIICBA component
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS00830  TCRTETA       32     0.004    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 32.1 bits (73), Expect = 0.004
Identities = 61/337 (18%), Positives = 127/337 (37%), Gaps = 33/337 (9%)

Query: 7 TLKVRLISNFLQLIITTAFIPFIALYLTDMLS----QSIVGIYLVGLVVLKFPLSIISGY 62
L V L + L + +P + L D++ + GI L +++F + + G
Sbjct: 6 PLIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGA 65

Query: 63 LIEIFPKKLLVLIYQATMVIMLVFMGVFGSHQLWQI-IGFCVAYAIFTIVWGLQFPVMDT 121
L + F ++ ++L+ A + M + LW + IG VA + G V
Sbjct: 66 LSDRFGRRPVLLVSLAGAAVDYAIMAT--APFLWVLYIGRIVAG-----ITGATGAVAGA 118

Query: 122 LIMDAITEDVEHYIYKISYWMTNLSVAIGALLGGLMYGYSMLLLFLIAACIFLIVLFILY 181
I D D + + G +LGGLM G+S F AA + +
Sbjct: 119 YIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGC 178

Query: 182 IWLPQDRNQVKQSDDKRHASRYQKLQIMNIFRSYKLVLKDRNYMLLISGFSIIMMGEFSI 241
LP+ ++ + + + L+ F + ++G+
Sbjct: 179 FLLPESHKGERRPLRREALNPLASFRWARGMTVVAA--------LMAVFFIMQLVGQVPA 230

Query: 242 SSYIAIRLKDQF--ETISIGSYDITGAKMLAILLMINTVVVILLTYSISKVVLKIDFKKA 299
+ + I +D+F + +IG LA +++++ ++T ++ ++ ++A
Sbjct: 231 ALW-VIFGEDRFHWDATTIGI-------SLAAFGILHSLAQAMITGPVAA---RLGERRA 279

Query: 300 LITGLLIYIVGYSGLTYLNQFGLLVVFMIIATVGEII 336
L+ G++ GY L + + + M++ G I
Sbjct: 280 LMLGMIADGTGYILLAFATRGWMAFPIMVLLASGGIG 316


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS00835  NUCEPIMERASE  52     2e-08    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 51.7 bits (124), Expect = 2e-08
Identities = 54/266 (20%), Positives = 101/266 (37%), Gaps = 55/266 (20%)

Query: 2046 NTLLTGATGFLGAYLIEVLQGYSHRIYCFIRADNEEIAWYKLMTNLNDYFS----EETVE 2101
L+TGA GF+G ++ + L H++ + D NLNDY+ + +E
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQV---VGID-----------NLNDYYDVSLKQARLE 47

Query: 2102 IM----LSNIEVIVGDFECMDDVVLPENMDTIIH----AGARTDHFGDDDEFEKVNVQGT 2153
++ ++ + D E M D+ + + + R + + N+ G
Sbjct: 48 LLAQPGFQFHKIDLADREGMTDLFASGHFERVFISPHRLAVR-YSLENPHAYADSNLTGF 106

Query: 2154 VDVIRLAQQHH-ARLIYVSTISV-GTYFDIDTEDVTFSEADVYKGQLLTSPYTRSKFYSE 2211
++++ + + L+Y S+ SV G + FS D + S Y +K +E
Sbjct: 107 LNILEGCRHNKIQHLLYASSSSVYG-----LNRKMPFSTDDSVDHPV--SLYAATKKANE 159

Query: 2212 LKVLEAVNN-GLDGRIVRVGNLTNPYNGRWHM------RNIKTNRFSMVMNDLLQLDCIG 2264
L + GL +R + P+ GR M + + + V N
Sbjct: 160 LMAHTYSHLYGLPATGLRFFTVYGPW-GRPDMALFKFTKAMLEGKSIDVYNY-------- 210

Query: 2265 VSMAEMPVDFSFVDTTARQIVALAQV 2290
+M DF+++D A I+ L V
Sbjct: 211 ---GKMKRDFTYIDDIAEAIIRLQDV 233


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS00840  ENTSNTHTASED  29     0.009    Enterobactin synthetase component D signature.
		>ENTSNTHTASED#Enterobactin synthetase component D signature.

Length = 234

Score = 29.2 bits (65), Expect = 0.009
Identities = 15/57 (26%), Positives = 27/57 (47%), Gaps = 5/57 (8%)

Query: 84 GQP-----IYVSLSYSYPYIVCVVDKEPVGIDIEKISQRLDWRTLVTCFSTNEAHQI 135
QP ++ S+S+ + V+ ++ +GIDIEKI + L ++ QI
Sbjct: 76 RQPLWPDGLFGSISHCATTALAVISRQRIGIDIEKIMSQHTATELAPSIIDSDERQI 132


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS00850  CARBMTKINASE  32     0.002    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 31.7 bits (72), Expect = 0.002
Identities = 23/84 (27%), Positives = 41/84 (48%), Gaps = 7/84 (8%)

Query: 155 INADTLAYFIASSLKAPIYV-LSNIAGVLIN-----DVVIPQLPLVDIHQYIEHGD-IYG 207
I+ D +A + A I++ L+++ G + + + ++ + ++ +Y E G G
Sbjct: 213 IDKDLAGEKLAEEVNADIFMILTDVNGAALYYGTEKEQWLREVKVEELRKYYEEGHFKAG 272

Query: 208 GMIPKVLDAKNAIENGCPKVIIAS 231
M PKVL A IE G + IIA
Sbjct: 273 SMGPKVLAAIRFIEWGGERAIIAH 296


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS00875  ISCHRISMTASE  60     4e-13    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 59.6 bits (144), Expect = 4e-13
Identities = 31/99 (31%), Positives = 51/99 (51%)

Query: 86 LDKRDDDFVIDKRHFSAFVGTDLDLQLRRRGIDTIVLGGVATHIGVDTTARDAYQLNYNQ 145
L DDD V+ K +SAF T+L +R+ G D +++ G+ HIG TA +A+ +
Sbjct: 112 LAPEDDDLVLTKWRYSAFKRTNLLEMMRKEGRDQLIITGIYAHIGCLVTACEAFMEDIKA 171

Query: 146 FFVTDMMSAQNETLHQFPIDNVFPLMGQTITTNDFLNIL 184
FFV D ++ + HQ ++ T+ T+ L+ L
Sbjct: 172 FFVGDAVADFSLEKHQMALEYAAGRCAFTVMTDSLLDQL 210


5     SACOL_RS01025  SACOL_RS01130  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
SACOL_RS01025  2       17       1.234019   DNA-binding response regulator
SACOL_RS01030  0       15       0.560136   histidine kinase
SACOL_RS01035  -1      15       0.752678   lipoprotein
SACOL_RS01040  -2      14       0.862063   formate acetyltransferase
SACOL_RS01045  -3      12       -0.127021  pyruvate formate-lyase-activating enzyme
SACOL_RS01050  -2      12       -0.891382  hypothetical protein
SACOL_RS01055  -1      12       1.487828   glycerophosphoryl diester phosphodiesterase
SACOL_RS01060  -1      14       3.588905   hypothetical protein
SACOL_RS01065  -2      16       4.262811   coagulase
SACOL_RS01070  0       17       5.100027   hypothetical protein
SACOL_RS01075  0       16       4.747161   hypothetical protein
SACOL_RS01080  0       14       4.424930   acetyl-CoA acetyltransferase
SACOL_RS01085  -2      12       2.628099   3-hydroxyacyl-CoA dehydrogenase
SACOL_RS01090  -2      11       2.193867   glutaryl-CoA dehydrogenase
SACOL_RS01095  -2      10       1.208875   long-chain-fatty-acid--CoA ligase
SACOL_RS01100  -2      11       0.565389   acyl CoA:acetate/3-ketoacid CoA transferase
SACOL_RS01105  -1      11       -0.463600  protease PrsW
SACOL_RS01110  -2      13       0.153641   nickel ABC transporter substrate-binding
SACOL_RS01115  0       16       1.453542   hypothetical protein
SACOL_RS01120  1       19       3.073142   hypothetical protein
SACOL_RS01125  0       15       4.064524   nitric oxide dioxygenase
SACOL_RS01130  1       15       4.374238   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS01025  HTHFIS        83     3e-20    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 82.6 bits (204), Expect = 3e-20
Identities = 42/169 (24%), Positives = 72/169 (42%), Gaps = 12/169 (7%)

Query: 3 KVVICDDERIIREGLKQIIPWGDYHFNTIYTAKDGVEALSLIQQHQPELVITDIRMPRKN 62
+++ DD+ IR L Q + Y + + I +LV+TD+ MP +N
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGY---DVRITSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 63 GVDLLNDI--AHLDCNVIILSSYDDFEYMKAGIQHHVLDYLLKPVDHAQLEVILGRLVRT 120
DLL I A D V+++S+ + F + DYL KP D L ++G + R
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFD---LTELIGIIGRA 118

Query: 121 LLEQQSQNGRSLASCHDAFQPLLKVEYDDYYVNQIVDQIKQSYQTKVTV 169
L E + + + D PL+ + +I + + QT +T+
Sbjct: 119 LAEPKRRPSKLEDDSQD-GMPLVG---RSAAMQEIYRVLARLMQTDLTL 163


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS01030  PF06580       147    5e-42    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 147 bits (372), Expect = 5e-42
Identities = 55/226 (24%), Positives = 109/226 (48%), Gaps = 16/226 (7%)

Query: 288 YIYDLFESNEQLIHSIEHTERRLRDIQLKEIERQFQPHFLFNTMQTIQYLITLSPKLAQT 347
+ + F++ +Q ++ QL ++ Q PHF+FN + I+ LI P A+
Sbjct: 136 FGWHFFKNYKQAEIDQWKMASMAQEAQLMALKAQINPHFMFNALNNIRALILEDPTKARE 195

Query: 348 VVQQLSQMLRYSLR-TNSHTVELNEELNYIEQYVAIQNIRFDDMIKLHIESSEEARHQTI 406
++ LS+++RYSLR +N+ V L +EL ++ Y+ + +I+F+D ++ + + +
Sbjct: 196 MLTSLSELMRYSLRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQINPAIMDVQV 255

Query: 407 GKMMLQPLIENAIKHGRDTESLDITIRLTLARQN--LHVLVCDNGIGMSSSRLQYVRQSL 464
M++Q L+EN IKHG I L + N + + V + G L+ ++S
Sbjct: 256 PPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLA----LKNTKES- 310

Query: 465 NNDVFDTKHLGLNHLHNKAMIQYGSHARLHIFSKRNQGTLICYKIP 510
GL ++ + + YG+ A++ + K+ + + IP
Sbjct: 311 -------TGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMVL-IP 348


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS01040  SHAPEPROTEIN  32     0.006    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 32.4 bits (74), Expect = 0.006
Identities = 18/54 (33%), Positives = 29/54 (53%), Gaps = 5/54 (9%)

Query: 257 AYLAAIKEQNGAAMSLGRTSTFLDIYAERDLKAGVITESEV-QEIIDHFIMKLR 309
+AA+ A LGRT +I A R +K GVI + V ++++ HFI ++
Sbjct: 50 KSVAAVGHD--AKQMLGRTPG--NIAAIRPMKDGVIADFFVTEKMLQHFIKQVH 99


6  SACOL_RS01405  SACOL_RS01495  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias    %GCBias  Product
SACOL_RS01405       3       17   0.264416  protein EsaC
SACOL_RS01410       2       15  -0.569653  virulence factor EsxB
SACOL_RS01415       2       17  -1.621500  DUF5081 domain-containing protein
SACOL_RS01420       2       17  -1.500223  hypothetical protein
SACOL_RS01425       3       16  -2.014063  hypothetical protein
SACOL_RS01430       4       18  -5.158844  hypothetical protein
SACOL_RS01435       4       20  -5.163201  DUF5079 domain-containing protein
SACOL_RS01440       6       20  -4.727443  DUF5080 domain-containing protein
SACOL_RS01445       5       20  -4.681591  hypothetical protein
SACOL_RS01450       7       19  -4.369799  hypothetical protein
SACOL_RS01455       7       21  -4.401295  hypothetical protein
SACOL_RS01460       6       22  -3.960350  hypothetical protein
SACOL_RS01465       7       21  -3.786577  DUF5079 domain-containing protein
SACOL_RS01470       9       23  -3.619404  hypothetical protein
SACOL_RS01475      12       26  -3.234411  hypothetical protein
SACOL_RS01485      12       21  -3.768420  hypothetical protein
SACOL_RS01490      10       18  -3.758814  hypothetical protein
SACOL_RS01495       3       14  -2.793416  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS01480  LUXSPROTEIN      28  0.019    Bacterial autoinducer-2 (AI-2) production protein Lu...
		>LUXSPROTEIN#Bacterial autoinducer-2 (AI-2) production protein LuxS signature.

Length = 171

Score = 27.6 bits (61), Expect = 0.019
Identities = 17/64 (26%), Positives = 31/64 (48%), Gaps = 7/64 (10%)

Query: 104 FTRDGKLNVSFDYIDWVNSEFGPMG-REHYYMYKKFGIWPEKEYAINWVEKIKDYVKEQD 162
F R+ S + ID PMG R +YM G E++ A W+ ++D +K ++
Sbjct: 62 FMRNHLNGDSVEIIDI-----SPMGCRTGFYM-SLIGTPSEQQVADAWIAAMEDVLKVEN 115

Query: 163 EAEL 166
+ ++
Sbjct: 116 QNKI 119


7  SACOL_RS01605  SACOL_RS01975  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias    %GCBias  Product
SACOL_RS01605       3       22  -3.718194  site-specific integrase
SACOL_RS01610       3       21  -2.763103  toxin MazF
SACOL_RS01615       5       26  -0.352430  hypothetical protein
SACOL_RS01620       5       27   0.330741  transcriptional regulator
SACOL_RS01625       6       29   0.309915  transcriptional regulator
SACOL_RS01630       5       30   0.454466  hypothetical protein
SACOL_RS01635       5       28  -0.062042  hypothetical protein
SACOL_RS01640       6       29  -2.905839  hypothetical protein
SACOL_RS01645       5       29  -3.043111  hypothetical protein
SACOL_RS01650       5       26  -3.616104  hypothetical protein
SACOL_RS01655       3       27  -2.798843  hypothetical protein
SACOL_RS01660       6       26  -3.039135  hypothetical protein
SACOL_RS01665       4       27  -2.453777  hypothetical protein
SACOL_RS01670       4       30  -0.783464  hypothetical protein
SACOL_RS01675       4       32   0.054743  hypothetical protein
SACOL_RS01680       6       34   1.586739  DUF1270 domain-containing protein
SACOL_RS01685       7       33   2.686844  hypothetical protein
SACOL_RS01690       5       28   2.868859  hypothetical protein
SACOL_RS01695       4       29   3.290104  hypothetical protein
SACOL_RS01700       5       30   3.346775  hypothetical protein
SACOL_RS01705       5       33   3.349260  Single-stranded DNA-binding protein 2
SACOL_RS01710       4       33   3.033585  hypothetical protein
SACOL_RS01715       4       31   2.833845  replication protein
SACOL_RS01720       4       34   3.274816  hypothetical protein
SACOL_RS01725       3       33   2.970162  DNA damage-inducible protein
SACOL_RS01730       3       32   2.227753  hypothetical protein
SACOL_RS01735       3       30   3.290727  hypothetical protein
SACOL_RS01740       3       30   3.359335  DNA N-6-adenine-methyltransferase
SACOL_RS01745       2       29   2.066581  hypothetical protein
SACOL_RS01750       1       28   1.586558  hypothetical protein
SACOL_RS01755       2       28   0.443633  transcriptional regulator
SACOL_RS01760       4       26   0.535755  hypothetical protein
SACOL_RS01765       5       25  -0.107332  hypothetical protein
SACOL_RS01770       5       25   0.576930  hypothetical protein
SACOL_RS01775       5       24   1.163441  hypothetical protein
SACOL_RS01780       3       23   1.088971  hypothetical protein
SACOL_RS01785       3       27   1.721577  hypothetical protein
SACOL_RS01790       3       28   2.036918  hypothetical protein
SACOL_RS01795       3       27   0.424937  dUTP pyrophosphatase
SACOL_RS01800       2       26  -1.251496  hypothetical protein
SACOL_RS01805       4       30  -1.339498  hypothetical protein
SACOL_RS01810       6       29  -1.912386  hypothetical protein
SACOL_RS01815       6       27  -2.261527  transcriptional activator RinB
SACOL_RS01820       3       23  -1.136699  hypothetical protein
SACOL_RS01825       2       21  -0.775381  helicase
SACOL_RS01830       3       17  -1.114888  transcriptional regulator
SACOL_RS01835       4       12  -1.145583  HNH endonuclease
SACOL_RS01840       3       12  -0.969085  terminase
SACOL_RS01845       3       12  -1.056064  terminase
SACOL_RS01850       4       12  -1.443528  phage portal protein
SACOL_RS01855       5       14  -2.108526  ATP-dependent Clp protease ClpP
SACOL_RS01860       3       16  -1.009073  phage capsid protein
SACOL_RS01865       5       16  -0.466183  phage head-tail adapter protein
SACOL_RS01870       6       17  -0.533688  hypothetical protein
SACOL_RS01875       7       17   0.266733  hypothetical protein
SACOL_RS01880       6       14   2.281456  hypothetical protein
SACOL_RS01885       6       12   2.128017  tail protein
SACOL_RS01890       5       14   1.218901  tail protein
SACOL_RS01895       6       15   1.135894  hypothetical protein
SACOL_RS01900       5       15   1.433093  hypothetical protein
SACOL_RS01905       4       15   1.326010  phage tail tape measure protein
SACOL_RS01910       3       17  -0.370164  holin
SACOL_RS01915       3       17  -0.141767  peptidase
SACOL_RS01920       4       18   0.710529  hypothetical protein
SACOL_RS01925       4       16   0.465861  hypothetical protein
SACOL_RS01930       2       14  -0.058312  hypothetical protein
SACOL_RS01935       2       16  -0.071373  hypothetical protein
SACOL_RS01940       1       13   0.296520  hypothetical protein
SACOL_RS01945      -1       10   2.078846  hypothetical protein
SACOL_RS01950       0       10   2.116538  phage holin
SACOL_RS01955       0        9   2.779310  amidase
SACOL_RS01960       0       11   3.397083  hypothetical protein
SACOL_RS01970       1       12   3.228351  alpha/beta hydrolase
SACOL_RS01975       0       16   3.962170  NADH-dependent flavin oxidoreductase
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS01850  STREPKINASE      34  8e-04    Streptococcus streptokinase protein signature.
		>STREPKINASE#Streptococcus streptokinase protein signature.

Length = 440

Score = 34.3 bits (78), Expect = 8e-04
Identities = 27/121 (22%), Positives = 55/121 (45%), Gaps = 3/121 (2%)

Query: 69 PLKMYEDYKVVNTEVSDLLTVSPNNSLSSFDFINQIETIRNEKGNAYVLIERD---IYHQ 125
PL +D++ + L T++ ++++S + + Q ++I N+ Y + ERD + H
Sbjct: 194 PLNPDDDFRPGLKDTKLLKTLAIGDTITSQELLAQAQSILNKNHPGYTIYERDSSIVTHD 253

Query: 126 PSKLFLLNPDVVEMLIENQSRELYYSIHAATGNKLIVHNMDMLHFKHIVASNMVQGISPI 185
+ P E ++RE Y I+ +G ++N D++ K+ V + P
Sbjct: 254 NDIFRTILPMDQEFTYRVKNREQAYRINKKSGLNEEINNTDLISEKYYVLKKGEKPYDPF 313

Query: 186 D 186
D
Sbjct: 314 D 314


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS01890  INTIMIN          31  0.002    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 30.8 bits (69), Expect = 0.002
Identities = 26/155 (16%), Positives = 55/155 (35%), Gaps = 23/155 (14%)

Query: 1 MTKTLKVYKGDDVVASEQGEGKVSVTLSNLEADTTYPKGTYQVAWEENGKESSKV----- 55
+T T+KV KGD V++++ ++ + + T G +V S V
Sbjct: 678 ITYTVKVMKGDKPVSNQEVTFTTTLGKLSNSTEKTDTNGYAKVTLTSTTPGKSLVSARVS 737

Query: 56 --------------DVPQFKTNPILVSGVSFTPETKSIMVNTDDNVEPNIAPSTATNKIL 101
I + G + ++ + ++ N
Sbjct: 738 DVAVDVKAPEVEFFTTLTIDDGNIEIVGTGVKGKLPTVWLQYG----QVNLKASGGNGKY 793

Query: 102 KYTSEHPEFVTVDENTGAIHGVAEGTSVITAMSTD 136
+ S +P +VD ++G + +GT+ I+ +S+D
Sbjct: 794 TWRSANPAIASVDASSGQVTLKEKGTTTISVISSD 828


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS01905  GPOSANCHOR       60  5e-11    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 60.5 bits (146), Expect = 5e-11
Identities = 37/235 (15%), Positives = 76/235 (32%), Gaps = 11/235 (4%)

Query: 72 YSQVEDELKQVNANYQKAKSSVKDVEKAYLKLVEANKKEKLALDKSKEALKSSNTELKKA 131
+ + LK N++ ++KD + + K++ DKS S EL+
Sbjct: 62 FEIENNTLKLKNSDLSFNNKALKDHNDELTEELSNAKEKLRKNDKSLSEKASKIQELEAR 121

Query: 132 ENQYKRTNQRKQDAYQ----KLKQLRDAEQKLKNSNQATTAQLKRASDAVQKQSAKHKAL 187
+ ++ + + K+K L + L L+ A + SAK K L
Sbjct: 122 KADLEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTL 181

Query: 188 VEQYKQEGNQVQKLKVQNDNLSKSNDKIESSYAKTNTKLKQTEKEFNDLNNTIKNHSANV 247
+ L+ + L K+ + + + K+K E E L + +
Sbjct: 182 EAEK-------AALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKAL 234

Query: 248 AKAETAVNKEKAALNNLERSIDKASSEMKTFNKEQMIAQSHFGKLASQADVMSKK 302
A + A + LE + K A + +++ + +
Sbjct: 235 EGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAE 289



Score = 58.2 bits (140), Expect = 2e-10
Identities = 33/239 (13%), Positives = 70/239 (29%), Gaps = 3/239 (1%)

Query: 28 RQLGVVNSEMKANLSAFDKSEKSMEKYQARIKGLNDRLKVQKKMYSQVEDELKQVNANYQ 87
L + NS++ N A + + + K + + EL+ A+ +
Sbjct: 67 NTLKLKNSDLSFNNKALKDHNDELTEELSNAKEKLRKNDKSLSEKASKIQELEARKADLE 126

Query: 88 KAKSSVKDVEKAYLKLVEANKKEKLALDKSKEALKSSNTELKKAENQYKRTNQRKQDAYQ 147
KA + K + L+ A N + + +
Sbjct: 127 KAL---EGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEA 183

Query: 148 KLKQLRDAEQKLKNSNQATTAQLKRASDAVQKQSAKHKALVEQYKQEGNQVQKLKVQNDN 207
+ L + +L+ + + S ++ A+ AL + ++ +
Sbjct: 184 EKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTA 243

Query: 208 LSKSNDKIESSYAKTNTKLKQTEKEFNDLNNTIKNHSANVAKAETAVNKEKAALNNLER 266
S +E+ A + + EK N SA + E +A +LE
Sbjct: 244 DSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEH 302



Score = 57.0 bits (137), Expect = 5e-10
Identities = 47/255 (18%), Positives = 95/255 (37%), Gaps = 3/255 (1%)

Query: 11 ELKLDHLGVQEGMKGLKRQLGVVNSEMKANLSAFDKSEKSMEKYQARIKGLNDRLKVQKK 70
L+ + ++ L++ L + A+ + E AR L L+
Sbjct: 180 TLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMN 239

Query: 71 MYSQVEDELKQVNANYQKAKSSVKDVEKAYLKLVEANKKEKLALDKSKEALKSSNTELKK 130
+ ++K + A ++ ++EKA + + + + + + E
Sbjct: 240 FSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKAD 299

Query: 131 AENQYKRTNQRKQDAYQKLKQLRDAEQKLKNSNQATTAQLKRASDAVQKQSAKHKALVEQ 190
E+Q + N +Q + L R+A+++L+ +Q Q K + + Q A E
Sbjct: 300 LEHQSQVLNANRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREA 359

Query: 191 YKQEGNQVQKLKVQNDNLSKSNDKIESSYAKTNTKLKQTEKEFNDLN---NTIKNHSANV 247
KQ + QKL+ QN S + + KQ EK + N ++ + +
Sbjct: 360 KKQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQVEKALEEANSKLAALEKLNKEL 419

Query: 248 AKAETAVNKEKAALN 262
+++ KEKA L
Sbjct: 420 EESKKLTEKEKAELQ 434



Score = 51.2 bits (122), Expect = 3e-08
Identities = 41/261 (15%), Positives = 87/261 (33%), Gaps = 10/261 (3%)

Query: 21 EGMKGLKRQLGVVNSEMKANLSAFDKSEKSMEKYQARIKGLNDRLKVQKKMYSQVEDELK 80
EG ++A +A + + +EK + + K + L
Sbjct: 165 EGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALA 224

Query: 81 QVNANYQKAKSSVKDVEKAYLKLVEANKKEKLALDKSKEALKSSNTELKKAENQYKRTNQ 140
A+ +KA + A ++ + EK AL+ + L+ + +
Sbjct: 225 ARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIK 284

Query: 141 RKQDAYQKLKQLRDAEQKLKNSNQATTAQLKRASDAVQKQSAKHKALVEQYKQEGNQVQK 200
+ L+ + L++ +Q A + + K L ++ QK
Sbjct: 285 TLEAEKAALEAE---KADLEHQSQVLNANRQSLRRDLDASREAKKQLEAEH-------QK 334

Query: 201 LKVQNDNLSKSNDKIESSYAKTNTKLKQTEKEFNDLNNTIKNHSANVAKAETAVNKEKAA 260
L+ QN S + + KQ E E L K A+ ++ + A
Sbjct: 335 LEEQNKISEASRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREA 394

Query: 261 LNNLERSIDKASSEMKTFNKE 281
+E+++++A+S++ K
Sbjct: 395 KKQVEKALEEANSKLAALEKL 415



Score = 35.8 bits (82), Expect = 0.002
Identities = 40/262 (15%), Positives = 92/262 (35%), Gaps = 14/262 (5%)

Query: 905 KGVSKETEKALEKYVHYSEENSRIMEKVRLNSGQISEDKAKKLLKIETDL-----SNNLI 959
K LE E +EK + S + K+ +E + +
Sbjct: 171 STADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADL 230

Query: 960 AEIEKRNKKELEKTQELIDKYSAF--DEQEKQNILTRTKEKNDLRIKKEQELNQKIKELK 1017
+ + I A + +Q L + E + + ++ K
Sbjct: 231 EKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEK 290

Query: 1018 EKALSDGQISENERKEIEK-LENQRRDITVKELSKTEKEQERILVRMQRNRNAYSIDEAS 1076
++ E++ + + ++ RRD+ +K + E E + Q + S
Sbjct: 291 AALEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLR 350

Query: 1077 KAIKEAEKARKARKKEVDKQYEDDVIAIKNNVNLSKSEKDKLLAIADQRHKDEVRKAKSK 1136
+ + + +A+K + E K E + I+ + +L + A K +V KA +
Sbjct: 351 RDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREA------KKQVEKALEE 404

Query: 1137 KDAVVDVVKKQNKDIDKEMDLS 1158
++ + ++K NK++++ L+
Sbjct: 405 ANSKLAALEKLNKELEESKKLT 426


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS01930  THERMOLYSIN      29  0.049    Thermolysin metalloprotease (M4) family signature.
		>THERMOLYSIN#Thermolysin metalloprotease (M4) family signature.

Length = 544

Score = 28.8 bits (64), Expect = 0.049
Identities = 24/206 (11%), Positives = 53/206 (25%), Gaps = 16/206 (7%)

Query: 197 ANSRISDLENKAQAYSRTFDEQKRYMDEKHEAFKQSVNSGGLVTSGSTSNWQKAKITKDD 256
L +A+ + + F+Q++ + G+ +D
Sbjct: 61 QEKNTFQLGGQARERLSLIGNKLDELGHTVMRFEQAIA--ASLCMGAV-----LVAHVND 113

Query: 257 GKIMQITGFDFNNPEQRIGDSTQFIYVSQA--INYPRDVSTNGTVEYLVVTSDYKRMTYR 314
G++ ++G N ++R + I + QA I R+
Sbjct: 114 GELSSLSGTLIPNLDKRTLKTEAAISIQQAEMIAKQDVADRVTKERPAAEEGKPTRLVIY 173

Query: 315 PNGTN-------KVFVKRKEAGSWSEWSELAINDYNTPFETVQSAQSKANMAESNAKLYA 367
P+ V G+W + A + + A+ +
Sbjct: 174 PDEETPRLAYEVNVRFLTPVPGNWIYMIDAADGKVLNKWNQMDEAKPGGAQPVAGTSTVG 233

Query: 368 DDKFNKRYSVIFDGTANGVGSTLYLN 393
+ + T + YL
Sbjct: 234 VGRGVLGDQKYINTTYSSYYGYYYLQ 259


8  SACOL_RS02185  SACOL_RS02295  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias    %GCBias  Product
SACOL_RS02185       2       20   2.177932  hypothetical protein
SACOL_RS02190       2       17   0.285536  GTP-binding protein YchF
SACOL_RS02195       2       20  -0.315665  hypothetical protein
SACOL_RS02200       2       22  -0.022837  30S ribosomal protein S6
SACOL_RS02205       0       19  -1.188614  single-stranded DNA-binding protein
SACOL_RS02210       1       20  -2.502306  30S ribosomal protein S18
SACOL_RS02215       2       20  -3.061798  CAAX amino protease
SACOL_RS02220       1       16  -2.046390  integrase
SACOL_RS02225       2       15  -1.767269  toxin
SACOL_RS02230       3       15  -2.388033  hypothetical protein
SACOL_RS02235       0       14  -1.973998  peptidase
SACOL_RS02240       1       15  -1.961636  transcriptional regulator
SACOL_RS02250      -2       13  -1.836687  membrane protein
SACOL_RS02255       0       19   0.478442  hypothetical protein
SACOL_RS02260       0       20   1.286838  phosphoglycerate mutase
SACOL_RS02265      -1       20   2.271887  hypothetical protein
SACOL_RS02270       1       19   3.522645  hypothetical protein
SACOL_RS02275       1       20   4.371903  hypothetical protein
SACOL_RS02280       2       20   4.649180  alkyl hydroperoxide reductase subunit F
SACOL_RS02285       2       21   3.588859  alkyl hydroperoxide reductase subunit C
SACOL_RS02290       2       19   3.190914  NADPH-dependent oxidoreductase
SACOL_RS02295       2       19   2.796524  L-cystine transporter tcyP
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS02230  TOXICSSTOXIN     47  1e-08    Staphylococcal toxic shock syndrome toxin signature.
		>TOXICSSTOXIN#Staphylococcal toxic shock syndrome toxin signature.

Length = 234

Score = 47.0 bits (111), Expect = 1e-08
Identities = 30/117 (25%), Positives = 48/117 (41%), Gaps = 12/117 (10%)

Query: 74 TINGKSNKSRNWVYSERPLNENQVRIHLEGTYTVAGRVYTPKRNITLNKEVVTLKELDHI 133
I+G +N + E PL V++H + + Y PK +K+ + + LD
Sbjct: 124 QISGVTNTEKLPTPIELPLK---VKVHGKDSPLK----YGPK----FDKKQLAISTLDFE 172

Query: 134 IRFAHIS-YGLYMGEHLPKGNIVINTKDGGKYTLESHKELQKDRENVKINTADIKNV 189
IR +GLY G I DG Y + K+ + + E IN +IK +
Sbjct: 173 IRHQLTQIHGLYRSSDKTGGYWKITMNDGSTYQSDLSKKFEYNTEKPPINIDEIKTI 229


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS02240  adhesinb         32  0.001    Adhesin B signature.
		>adhesinb#Adhesin B signature.

Length = 310

Score = 31.7 bits (72), Expect = 0.001
Identities = 34/166 (20%), Positives = 59/166 (35%), Gaps = 18/166 (10%)

Query: 2 KLKSLAVLSMSAVVLTACGNDTPKDETKSTESNTNQDTNTTKDV---IALKDVKTS---- 54
K + L +L ++ V L AC + ET S++ N + D+ IA +
Sbjct: 3 KCRFLVLLLLAFVGLAACSSQKSSTETGSSKLNVVATNSIIADITKNIAGDKINLHSIVP 62

Query: 55 ----PEDAVKKAEETYKGQKLK-----GISFENSNGEWAYKVTQQ-KSGEESEVLVADKN 104
P + E+ K + GI+ E W K+ + K E + +
Sbjct: 63 VGQDPHEYEPLPEDVKKTSQADLIFYNGINLETGGNAWFTKLVENAKKKENKDYYAVSEG 122

Query: 105 KKVINKKTEKE-DTMNENDNFKYSDAIDYKKAIKEGQKEFDGDIKE 149
VI + + E + + + I Y + I + E D KE
Sbjct: 123 VDVIYLEGQSEKGKEDPHAWLNLENGIIYAQNIAKRLSEKDPANKE 168


9  SACOL_RS02385  SACOL_RS02510  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias    %GCBias  Product
SACOL_RS02385       2       16   0.452688  hypothetical protein
SACOL_RS02390       1       14  -0.493358  hypothetical protein
SACOL_RS02395       1       12  -1.263049  hypothetical protein
SACOL_RS02400       3       10  -0.943277  hypothetical protein
SACOL_RS02405       2       10  -0.461347  hypothetical protein
SACOL_RS02410       3       10  -0.985397  type I restriction-modification system subunit
SACOL_RS02415       8       11  -3.256639  restriction endonuclease subunit S
SACOL_RS02420       7        9  -3.043643  hypothetical protein
SACOL_RS02425       8       10  -2.923666  hypothetical protein
SACOL_RS02430      10       14  -3.450081  hypothetical protein
SACOL_RS02435      12       13  -4.210867  lipoprotein
SACOL_RS02440      11       13  -4.005637  hypothetical protein
SACOL_RS02445      11       15  -4.051983  hypothetical protein
SACOL_RS02450      11       16  -3.874455  hypothetical protein
SACOL_RS02455       8       17  -3.775593  hypothetical protein
SACOL_RS02460       3       14  -1.830232  hypothetical protein
SACOL_RS02465       4       14  -0.422067  hypothetical protein
SACOL_RS02470       1       16   2.275577  hypothetical protein
SACOL_RS02475       2       16   4.069695  hypothetical protein
SACOL_RS02480       2       16   4.036724  hypothetical protein
SACOL_RS02485       2       16   3.857426  cobalamin synthesis protein CobW
SACOL_RS02490       3       17   3.570583  hypothetical protein
SACOL_RS02495       2       17   3.147059  NADH dehydrogenase subunit 5
SACOL_RS02500       2       16   3.118475  hypothetical protein
SACOL_RS02505       2       18   0.700050  hypothetical protein
SACOL_RS02510       2       17   1.832967  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS02390  TOXICSSTOXIN     97  3e-26    Staphylococcal toxic shock syndrome toxin signature.
		>TOXICSSTOXIN#Staphylococcal toxic shock syndrome toxin signature.

Length = 234

Score = 97.4 bits (242), Expect = 3e-26
Identities = 46/231 (19%), Positives = 85/231 (36%), Gaps = 21/231 (9%)

Query: 84 VTTPPSTNTPQPMQSTKSDTPQSPTTKQVPTEINPKFKDLRAYYTKPSLEFKNEIGIILK 143
+ T + TP P+ S + K N KDL +Y+ S F N ++
Sbjct: 17 LATTATDFTPVPLSSNQ-------IIKTAKASTNDNIKDLLDWYSSGSDTFTN-SEVLDN 68

Query: 144 KWTTIRFMNVVPDYFIYKIALVGKDDKKYGEGVHRNVDV-----FVVLEENNYNLEKYSV 198
++R N + + VD+ + + +
Sbjct: 69 SLGSMRIKNTDGSI---SLIIFPSPYYSPAFTKGEKVDLNTKRTKKSQHTSEGTYIHFQI 125

Query: 199 GGITKSNSKKVDHKAGVRITKEDNKGTISHDVSEFKITKEQISLKELDFKLRKQLIEKNN 258
G+T + + +++ + + K K+Q+++ LDF++R QL + +
Sbjct: 126 SGVTNTEKLPTPIELPLKVKVHGKDSPLKYG---PKFDKKQLAISTLDFEIRHQLTQIHG 182

Query: 259 LYGNV--GSGKIVIKMKNGGKYTFELHKKLQENRMADVINSEQIKNIEVNL 307
LY + G I M +G Y +L KK + N IN ++IK IE +
Sbjct: 183 LYRSSDKTGGYWKITMNDGSTYQSDLSKKFEYNTEKPPINIDEIKTIEAEI 233


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS02395  TOXICSSTOXIN    132  3e-40    Staphylococcal toxic shock syndrome toxin signature.
		>TOXICSSTOXIN#Staphylococcal toxic shock syndrome toxin signature.

Length = 234

Score = 132 bits (332), Expect = 3e-40
Identities = 39/197 (19%), Positives = 71/197 (36%), Gaps = 15/197 (7%)

Query: 43 INMLHQYYSEESFEPTNISVKSEDYYGSNVLNFKQRNKAFKVFLLGDDKNKY------KE 96
I L +YS S TN V + K + + + + K
Sbjct: 46 IKDLLDWYSSGSDTFTNSEVLD---NSLGSMRIKNTDGSISLIIFPSPYYSPAFTKGEKV 102

Query: 97 KTHGLDVFAVPELIDIKGGIYSVGGITKKNVRSVFGFVSNPSLQVKKVDAKNGFSINELF 156
+ + + + G+T + P L+VK + F
Sbjct: 103 DLNTKRTKKSQHTSEGTYIHFQISGVTNTEKLP--TPIELP-LKVKVHGKDSPLKYGPKF 159

Query: 157 FIQKEEVSLKELDFKIRKLLIEKYRLYKGTS-DKGRIVINMKDEKKHEIDLSEKLSFERM 215
K+++++ LDF+IR L + + LY+ + G I M D ++ DLS+K +
Sbjct: 160 --DKKQLAISTLDFEIRHQLTQIHGLYRSSDKTGGYWKITMNDGSTYQSDLSKKFEYNTE 217

Query: 216 FDVMDSKQIKNIEVNLN 232
++ +IK IE +N
Sbjct: 218 KPPINIDEIKTIEAEIN 234


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS02400  TOXICSSTOXIN    193  4e-64    Staphylococcal toxic shock syndrome toxin signature.
		>TOXICSSTOXIN#Staphylococcal toxic shock syndrome toxin signature.

Length = 234

Score = 193 bits (491), Expect = 4e-64
Identities = 51/202 (25%), Positives = 92/202 (45%), Gaps = 10/202 (4%)

Query: 31 KQNQKSVNKHDKEALYRYYTGKTMEMKNISALKHGKNNLRFKFRGIKIQVLLPGNDKSKF 90
K + S N + K+ L Y +G + N L + ++R K I +++ +
Sbjct: 36 KTAKASTNDNIKDLLDWYSSG-SDTFTNSEVLDNSLGSMRIKNTDGSISLIIFPSPYYSP 94

Query: 91 QQRSYEGLDVFFVQEKRDKHD-----IFYTVGGVIQNNKTSGVVSAPILNISKEKGEDAF 145
E +D+ + K+ +H I + + GV K + P L + K G+D+
Sbjct: 95 AFTKGEKVDLNTKRTKKSQHTSEGTYIHFQISGVTNTEKLPTPIELP-LKV-KVHGKDSP 152

Query: 146 VKGYPYYIKKEKITLKELDYKLRKHLIEKYGLYKTISKDGRV-KISLKDGSFYNLDLRSK 204
+K Y K+++ + LD+++R L + +GLY++ K G KI++ DGS Y DL K
Sbjct: 153 LK-YGPKFDKKQLAISTLDFEIRHQLTQIHGLYRSSDKTGGYWKITMNDGSTYQSDLSKK 211

Query: 205 LKFKYMGEVIESKQIKDIEVNL 226
++ I +IK IE +
Sbjct: 212 FEYNTEKPPINIDEIKTIEAEI 233


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS02420  TOXICSSTOXIN    108  2e-31    Staphylococcal toxic shock syndrome toxin signature.
		>TOXICSSTOXIN#Staphylococcal toxic shock syndrome toxin signature.

Length = 234

Score = 108 bits (272), Expect = 2e-31
Identities = 43/225 (19%), Positives = 79/225 (35%), Gaps = 21/225 (9%)

Query: 16 LTTGMITTTAQPVKASTLEVRSQAT-------QDLSEYYNRPFFEYTNQSGYKEEGKVTF 68
L T PV S+ ++ A +DL ++Y+ +TN
Sbjct: 15 LLLATTATDFTPVPLSSNQIIKTAKASTNDNIKDLLDWYSSGSDTFTNSEVLDNSLGSMR 74

Query: 69 TPNYQLIDVTLTGNEKQNF-------GEDISNVDIFVVRENSDRSGNTASIGGITKTNGS 121
N + D++ + S+ + I G+T T
Sbjct: 75 IKNTDGSISLIIFPSPYYSPAFTKGEKVDLNTKRTKKSQHTSEGTYIHFQISGVTNTE-- 132

Query: 122 NYIDKVKDVNLIITKNIDSVTSTSTSSTYTINKEEISLKELDFKLRKHLIDKHNLYKTEP 181
+ L + + S +K+++++ LDF++R L H LY++
Sbjct: 133 ---KLPTPIELPLKVKVHGKDS-PLKYGPKFDKKQLAISTLDFEIRHQLTQIHGLYRSSD 188

Query: 182 KDSKI-RITMKDGGFYTFELNKKLQTHRMGDVIDGRNIEKIEVNL 225
K +ITM DG Y +L+KK + + I+ I+ IE +
Sbjct: 189 KTGGYWKITMNDGSTYQSDLSKKFEYNTEKPPINIDEIKTIEAEI 233


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS02450  BCTERIALGSPC     33  6e-04    Bacterial general secretion pathway protein C signa...
		>BCTERIALGSPC#Bacterial general secretion pathway protein C signature.

Length = 272

Score = 33.4 bits (76), Expect = 6e-04
Identities = 18/83 (21%), Positives = 33/83 (39%), Gaps = 9/83 (10%)

Query: 185 INENVPSYDAKFKMSNKDENVKQLRSRYNIPTDKAPVLKMHIDGNLKGSSVGYKKLEIDF 244
+NE VP Y+AK D V Q + RY + + + S G +++
Sbjct: 124 VNEEVPGYNAKIVSIRPDRVVLQYQGRYEV---------LGLYSQEDSGSDGVPGAQVNE 174

Query: 245 SKGGKSDLSVIDSLNFQPAKVDE 267
++ ++ D ++F P D
Sbjct: 175 QLQQRASTTMSDYVSFSPIMNDN 197


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS02475  adhesinb         27  0.013    Adhesin B signature.
		>adhesinb#Adhesin B signature.

Length = 310

Score = 27.1 bits (60), Expect = 0.013
Identities = 21/94 (22%), Positives = 40/94 (42%), Gaps = 14/94 (14%)

Query: 14 DISTTVETLNLISKMEAQKENIRTVIAPEHKHKYKDIENGLKGEE---KVLIEQMAQHCE 70
+S V+ + L + E KE+ H + ++ENG+ + K L E+ + E
Sbjct: 118 AVSEGVDVIYLEGQSEKGKED---------PHAWLNLENGIIYAQNIAKRLSEKDPANKE 168

Query: 71 AFKANFKGAAQ--GDWVKSAMSEIDSIKDDLKKI 102
++ N K + K A + ++I + K I
Sbjct: 169 TYEKNLKAYVEKLSALDKEAKEKFNNIPGEKKMI 202


10  SACOL_RS03125  SACOL_RS03150  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias    %GCBias  Product
SACOL_RS03125       3       19   0.260470  deoxyguanosine kinase
SACOL_RS03130       4       21   1.013726  tRNA-specific adenosine deaminase
SACOL_RS03135       4       19   0.467252  haloacid dehalogenase
SACOL_RS03140       3       18   0.128013  FMN-dependent NADPH-azoreductase
SACOL_RS03145       3       18   0.311254  serine-aspartate repeat-containing protein C
SACOL_RS03150       2       16   0.164139  serine-aspartate repeat-containing protein D
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS03150  GPOSANCHOR       36  0.001    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 35.8 bits (82), Expect = 0.001
Identities = 43/234 (18%), Positives = 87/234 (37%), Gaps = 6/234 (2%)

Query: 22 KFSIRKYTVGTASILVGTTLI-FGLGNQ-EAKAAESTNKELNE--ATTSASDNQSSDKVD 77
+S+RK GTAS+ V T++ GL +A +T + + +D +
Sbjct: 9 HYSLRKLKTGTASVAVALTVLGAGLVVNTNEVSAVATRSQTDTLEKVQERADKFEIENNT 68

Query: 78 MQQLNQEDNTKNDNQKEMVSSQGNETTSNGNKLIEKESVQSTTGNKVEVSTAKSDE--QA 135
++ N + + N K+ E ++ KL + + S +K++ A+ + +A
Sbjct: 69 LKLKNSDLSFNNKALKDHNDELTEELSNAKEKLRKNDKSLSEKASKIQELEARKADLEKA 128

Query: 136 SPKSTNEDLNTKQTISNQEALQPDLQENKSVVNVQPTNEENKKVDAKTESTTLNVKSDAI 195
+ N I EA + L K+ + N + TL + A+
Sbjct: 129 LEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAAL 188

Query: 196 KSNDETLVDNNSNSNNENNADIILPKSTAPKRLNTRMRIAAVQPSSTEAKNVND 249
++ L + N + AD K+ ++ R A ++ + A N +
Sbjct: 189 EARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFST 242


11  SACOL_RS03310  SACOL_RS03370  Y  N  Y  Genomic Island
LocusTag       DNBias  CDNBias    %GCBias  Product
SACOL_RS03310       2       18  -3.526804  transcriptional regulator
SACOL_RS03315       5       20  -4.361063  hypothetical protein
SACOL_RS03320       7       20  -4.493949  transposase
SACOL_RS03325       9       23  -7.593468  hypothetical protein
SACOL_RS03330       7       21  -7.487405  hypothetical protein
SACOL_RS03335       6       22  -7.247990  hypothetical protein
SACOL_RS03340       4       22  -7.528368  hypothetical protein
SACOL_RS03345       3       20  -8.081667  hypothetical protein
SACOL_RS03350       4       17  -7.992353  hypothetical protein
SACOL_RS03355       4       17  -7.423691  hypothetical protein
SACOL_RS03360       4       19  -7.354984  hypothetical protein
SACOL_RS03365       0       15  -5.245884  hypothetical protein
SACOL_RS03370       0       15  -3.970466  hypothetical protein
12  SACOL_RS03445  SACOL_RS03500  Y  N  N  Genomic Island
LocusTag       DNBias  CDNBias    %GCBias  Product
SACOL_RS03445       0       14  -3.093397  enterobactin ABC transporter permease
SACOL_RS03450       0       13  -3.688810  haloacid dehalogenase
SACOL_RS03455      -1       12  -3.554770  alpha/beta hydrolase
SACOL_RS03460      -1       14  -4.058785  hypothetical protein
SACOL_RS03465      -1       14  -4.150542  hypothetical protein
SACOL_RS03470      -2       15  -3.317448  lysophospholipase
SACOL_RS03475      -1       14  -2.299754  transcriptional regulator
SACOL_RS03480       0       14  -1.897182  membrane protein
SACOL_RS03485       1       15  -1.836438  hypothetical protein
SACOL_RS03490       0       12  -1.284668  hypothetical protein
SACOL_RS03495       1       13  -1.804620  recombinase
SACOL_RS03500       2       12  -1.573339  cation:proton antiporter
13  SACOL_RS03680  SACOL_RS03785  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias    %GCBias  Product
SACOL_RS03680      -3       10  -3.423745  3-beta hydroxysteroid dehydrogenase
SACOL_RS03685      -3        7  -2.476367  DNA-binding response regulator
SACOL_RS03690      -1        9  -1.578641  sensor histidine kinase
SACOL_RS03695       0        9  -1.593086  ABC transporter ATP-binding protein
SACOL_RS03700      -1        8  -3.074983  ABC transporter permease
SACOL_RS03705       0       10  -2.938873  hypothetical protein
SACOL_RS03710       0        9  -1.991466  anion permease
SACOL_RS03715      -1       12  -2.847856  peptidase M23
SACOL_RS03720       0       16  -4.363604  hypothetical protein
SACOL_RS03725      -1       16  -4.318801  AraC family transcriptional regulator
SACOL_RS03730       1       18  -1.994124  transcriptional regulator
SACOL_RS03735       0       19  -0.643345  transcriptional regulator
SACOL_RS03740       1       14  -1.483084  cupin
SACOL_RS03745       3       13  -1.990893  hypothetical protein
SACOL_RS03750       3       11  -2.339365  hypothetical protein
SACOL_RS03755       3       14  -2.274514  LysR family transcriptional regulator
SACOL_RS03760       5       13  -2.721135  hypothetical protein
SACOL_RS03765       3       11  -2.988688  MFS transporter
SACOL_RS03770       3       14  -3.182105  membrane protein
SACOL_RS03775       2       13  -2.012094  hypothetical protein
SACOL_RS03780       1       16  -0.734870  GNAT family acetyltransferase
SACOL_RS03785       2       15  -0.683798  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS03685  HTHFIS           64  5e-14    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 63.7 bits (155), Expect = 5e-14
Identities = 26/111 (23%), Positives = 57/111 (51%), Gaps = 1/111 (0%)

Query: 3 ILLVEDDNTLFQELKKELEQWDFNVAGIEDFGKVMDTFESFNPEIVILDVQLPKYDGFYW 62
IL+ +DD + L + L + ++V + + + + ++V+ DV +P + F
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 63 CRKMREV-SNVPILFLSSRDNPMDQVMSMELGADDYMQKPFYTNVLIAKLQ 112
++++ ++P+L +S+++ M + + E GA DY+ KPF LI +
Sbjct: 66 LPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIG 116


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS03695  PF05272          36  1e-04    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 36.2 bits (83), Expect = 1e-04
Identities = 15/56 (26%), Positives = 26/56 (46%), Gaps = 8/56 (14%)

Query: 40 GPSGSGKTTLLNVLSSIDYISQGSITLKGKK--LEKLSNK------ELSDIRKHDI 87
G G GK+TL+N L +D+ S + K E+++ E++ R+ D
Sbjct: 603 GTGGIGKSTLINTLVGLDFFSDTHFDIGTGKDSYEQIAGIVAYELSEMTAFRRADA 658


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS03765  TCRTETA          57  5e-11    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 56.8 bits (137), Expect = 5e-11
Identities = 73/365 (20%), Positives = 134/365 (36%), Gaps = 41/365 (11%)

Query: 11 KNYKLFVA--NMFLLGMGIAVTVPYLVLFATKDLGMTTNQ---YGLLLASAAISQFTVNS 65
N L V + L +GI + +P L +DL + + YG+LLA A+ QF
Sbjct: 3 PNRPLIVILSTVALDAVGIGLIMPVLP-GLLRDLVHSNDVTAHYGILLALYALMQFACAP 61

Query: 66 IIARFSDTHHFNRKIIIILALLMGALGFSIYFFVDTIWLFILLYAIFQGLFAPAMPQLYA 125
++ SD F R+ +++++L A+ ++I +W+ + + I G+
Sbjct: 62 VLGALSD--RFGRRPVLLVSLAGAAVDYAIMATAPFLWV-LYIGRIVAGITGATGA---V 115

Query: 126 SARESINVSSSKDRAQFANTVLRSMFSLGFLFGPFIGAQLIGLKGYAGLFGGTISIILFT 185
+ +++ +RA+ + + F G + GP +G + G +A F L
Sbjct: 116 AGAYIADITDGDERARHFG-FMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNF 174

Query: 186 LVLQVFFYKDLNIKHPISTQQHVEKIAPNMFKDKTL--------LLPFIAFILLHIGQWM 237
L + ++ + + A N L + FI+ +GQ
Sbjct: 175 LTGCFLLPESHK-----GERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVP 229

Query: 238 YTMNMPLFVTDYLKENEQHVGYLASLCAGLEVPFMIIL-GVLSSRLQTRTLLIYGAIFGG 296
+ +F D + +G + L ++ G +++RL R L+ G I G
Sbjct: 230 AAL-WVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADG 288

Query: 297 LFYFSIGVFKNFYMMLAGQVFLAIFLAVLLGIGISYFQDILPDFPGYASTLFSNAMVIGQ 356
Y + +M F + L GIG +P S GQ
Sbjct: 289 TGYILLAFATRGWM-----AFPIMVLLASGGIG-------MPALQAMLSRQVDEERQ-GQ 335

Query: 357 LGGNL 361
L G+L
Sbjct: 336 LQGSL 340



Score = 49.1 bits (117), Expect = 2e-08
Identities = 44/186 (23%), Positives = 73/186 (39%), Gaps = 13/186 (6%)

Query: 215 MFKDKTLLLPFIAFILLHIGQWMYTMNMPLFVTDYLKENEQ--HVGYLASLCAGLEVPFM 272
M ++ L++ L +G + +P + D + N+ H G L +L A ++
Sbjct: 1 MKPNRPLIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACA 60

Query: 273 IILGVLSSRLQTRTLLIYGAIFGGLFYFSIGVFKNFYMMLAGQVFLAIFLAVLLGIGISY 332
+LG LS R R +L+ + Y + +++ G++ I A G +Y
Sbjct: 61 PVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAG-AY 119

Query: 333 FQDILPD-----FPGYASTLFSNAMVIGQLGGNLLGGAMSHWVGLENVFFVSAASIMLGM 387
DI G+ S F MV G + G L+GG H FF +AA L
Sbjct: 120 IADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPH-----APFFAAAALNGLNF 174

Query: 388 ILIFFT 393
+ F
Sbjct: 175 LTGCFL 180


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS03780  SACTRNSFRASE     35  7e-05    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 34.5 bits (79), Expect = 7e-05
Identities = 22/102 (21%), Positives = 35/102 (34%), Gaps = 1/102 (0%)

Query: 42 EMICSRLEHTNDKIYIYENEGQLIAFIWGHFSNEKSMVNIELLYVEPQFRKLGIATQLKI 101
+M S +E ++Y E I I SN IE + V +RK G+ T L
Sbjct: 54 DMDVSYVEEEGKAAFLYYLENNCIGRIKIR-SNWNGYALIEDIAVAKDYRKKGVGTALLH 112

Query: 102 ALEKWAKTMNAKRISNTIHKNNLPMISLNKDLGYQVSHVKMY 143
+WAK + + N+ + + V
Sbjct: 113 KAIEWAKENHFCGLMLETQDINISACHFYAKHHFIIGAVDTM 154


14  SACOL_RS04355  SACOL_RS04440  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias    %GCBias  Product
SACOL_RS04355       5       17  -3.821617  SsrA-binding protein
SACOL_RS04360       8       18  -5.438811  hypothetical protein
SACOL_RS04365       5       17  -4.097357  hypothetical protein
SACOL_RS04370       5       18   2.124446  hypothetical protein
SACOL_RS04375       5       15   0.658952  DUF5067 domain-containing protein
SACOL_RS04380       3       15   1.036310  hypothetical protein
SACOL_RS04385       2       14   1.187026  hypothetical protein
SACOL_RS04390       3       13   1.554364  acetyltransferase
SACOL_RS04395       3       14   1.710579  clumping factor A
SACOL_RS04400       1       14  -1.512370  coagulase
SACOL_RS04405       2       18  -0.696382  hypothetical protein
SACOL_RS04410       0       20  -0.641651  hypothetical protein
SACOL_RS04415       1       20  -1.269582  thermonuclease
SACOL_RS04420       2       21  -3.121824  cold-shock protein
SACOL_RS04425       1       18  -4.042517  hypothetical protein
SACOL_RS04430       1       17  -4.574686  hypothetical protein
SACOL_RS04435       1       17  -2.750503  hypothetical protein
SACOL_RS04440       3       17  -2.976772  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS04390  ALARACEMASE      27  0.049    Alanine racemase signature.
		>ALARACEMASE#Alanine racemase signature.

Length = 356

Score = 26.7 bits (59), Expect = 0.049
Identities = 13/37 (35%), Positives = 19/37 (51%), Gaps = 2/37 (5%)

Query: 135 MYDIYP-PYDGIPDEAFLI-KELKVNSLAGKTGTINY 169
D+ P P GI L KE+K++ +A GT+ Y
Sbjct: 305 AVDLTPCPQAGIGTPVELWGKEIKIDDVAAAAGTVGY 341


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS04395  ICENUCLEATIN  35     0.001    Ice nucleation protein signature.
		>ICENUCLEATIN#Ice nucleation protein signature.

Length = 1258

Score = 35.1 bits (80), Expect = 0.001
Identities = 56/298 (18%), Positives = 104/298 (34%)

Query: 566 GSDSGSDSNSDSGSDSGSDSTSDSGSDSASDSDSASDSDSASDSDSASDSDSASDSDSDN 625
GS + +S + GS T+ GSD + S + + S + S + +S
Sbjct: 357 GSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQ 416

Query: 626 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 685
+ S + SD + S + DS + S + DS + S +
Sbjct: 417 TAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQK 476

Query: 686 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 745
SD + S S + +S + S + S + S + ++SD + S S
Sbjct: 477 GSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTAGYGSTQTAQNESDLITGYGSTS 536

Query: 746 DSDSDSDSDSDSDSDSDSDSDSASDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 805
+ ++S + S + +S + S + SD + S + SDS +
Sbjct: 537 TAGANSSLIAGYGSTQTASYNSVLTAGYGSTQTAREGSDLTAGYGSTGTAGSDSSIIAGY 596

Query: 806 DSESDSDSESDSDSDSDSDSDSDSDSDSDSDSASDSDSGSDSDSSSDSDSESDSNSDS 863
S + S + S + S + S S +G+DS + S + +S
Sbjct: 597 GSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYNS 654



Score = 35.1 bits (80), Expect = 0.002
Identities = 62/321 (19%), Positives = 111/321 (34%), Gaps = 4/321 (1%)

Query: 552 GEIEPIPEDSDSDPGSDSGSDSNSDSGSDSGSDSTSDSGSDSASDSDSASDSDSASDSDS 611
G E + S G S + +DS +G ST +G +S+ + S SD
Sbjct: 181 GSTETAGDSSTLIAGYGSTGTAGADSTLVAGYGSTQTAGEESSQMAGYGSTQTGMKGSDL 240

Query: 612 ASDSDSASDSDSDNDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 671
+ S + DS + S + DS + S + SD + S
Sbjct: 241 TAGYGSTGTA----GDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTG 296

Query: 672 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 731
+ +DS + S + +S + S + SD + S + DS +
Sbjct: 297 TAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGY 356

Query: 732 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSASDSDSDSDSDSDSDSDSDSDSDS 791
S + DS + S + SD + S + +DS + S + +S
Sbjct: 357 GSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQ 416

Query: 792 DSDSDSDSDSDSDSDSESDSDSESDSDSDSDSDSDSDSDSDSDSDSASDSDSGSDSDSSS 851
+ S + SD + S + DS + S + DS+ + GS +
Sbjct: 417 TAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQK 476

Query: 852 DSDSESDSNSDSESGSNNNVV 872
SD + S S +G ++++
Sbjct: 477 GSDLTAGYGSTSTAGYESSLI 497



Score = 35.1 bits (80), Expect = 0.002
Identities = 59/314 (18%), Positives = 107/314 (34%)

Query: 552 GEIEPIPEDSDSDPGSDSGSDSNSDSGSDSGSDSTSDSGSDSASDSDSASDSDSASDSDS 611
G + E+S G S S +G ST +G DS+ + S + DS
Sbjct: 213 GSTQTAGEESSQMAGYGSTQTGMKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSL 272

Query: 612 ASDSDSASDSDSDNDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 671
+ S + +D + S + +DS + S + +S + S +
Sbjct: 273 TAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQK 332

Query: 672 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 731
SD + S + DS + S + DS + S + SD + S
Sbjct: 333 GSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTG 392

Query: 732 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSASDSDSDSDSDSDSDSDSDSDSDS 791
+ +DS + S + +S + S + SD + S + DS +
Sbjct: 393 TAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGY 452

Query: 792 DSDSDSDSDSDSDSDSESDSDSESDSDSDSDSDSDSDSDSDSDSDSASDSDSGSDSDSSS 851
S + DS + S ++ SD + S S + +S + S + S+
Sbjct: 453 GSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTL 512

Query: 852 DSDSESDSNSDSES 865
+ S + +ES
Sbjct: 513 TAGYGSTQTAQNES 526



Score = 35.1 bits (80), Expect = 0.002
Identities = 56/299 (18%), Positives = 105/299 (35%)

Query: 552 GEIEPIPEDSDSDPGSDSGSDSNSDSGSDSGSDSTSDSGSDSASDSDSASDSDSASDSDS 611
G + EDS G S + S +G ST +G+DS+ + S + +S
Sbjct: 261 GSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQ 320

Query: 612 ASDSDSASDSDSDNDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 671
+ S + +D + S + DS + S + DS + S +
Sbjct: 321 TAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQK 380

Query: 672 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 731
SD + S + +DS + S + +S + S + SD + S
Sbjct: 381 GSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTG 440

Query: 732 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSASDSDSDSDSDSDSDSDSDSDSDS 791
+ DS + S + DS + S + SD + S S + +S +
Sbjct: 441 TAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAGY 500

Query: 792 DSDSDSDSDSDSDSDSESDSDSESDSDSDSDSDSDSDSDSDSDSDSASDSDSGSDSDSS 850
S + S + S ++++SD + S S + ++S + S + +S
Sbjct: 501 GSTQTAGYGSTLTAGYGSTQTAQNESDLITGYGSTSTAGANSSLIAGYGSTQTASYNSV 559



Score = 35.1 bits (80), Expect = 0.002
Identities = 56/299 (18%), Positives = 104/299 (34%)

Query: 552 GEIEPIPEDSDSDPGSDSGSDSNSDSGSDSGSDSTSDSGSDSASDSDSASDSDSASDSDS 611
G + EDS G S + S +G ST +G+DS+ + S + +S
Sbjct: 357 GSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQ 416

Query: 612 ASDSDSASDSDSDNDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 671
+ S + +D + S + DS + S + DS + S +
Sbjct: 417 TAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQK 476

Query: 672 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 731
SD + S S + +S + S + S + S + ++SD + S S
Sbjct: 477 GSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTAGYGSTQTAQNESDLITGYGSTS 536

Query: 732 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSASDSDSDSDSDSDSDSDSDSDSDS 791
+ ++S + S + +S + S + SD + S + SDS +
Sbjct: 537 TAGANSSLIAGYGSTQTASYNSVLTAGYGSTQTAREGSDLTAGYGSTGTAGSDSSIIAGY 596

Query: 792 DSDSDSDSDSDSDSDSESDSDSESDSDSDSDSDSDSDSDSDSDSDSASDSDSGSDSDSS 850
S + S + S + S + S S + +DS + S + +S
Sbjct: 597 GSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYNSI 655



Score = 34.0 bits (77), Expect = 0.004
Identities = 61/321 (19%), Positives = 116/321 (36%), Gaps = 4/321 (1%)

Query: 552 GEIEPIPEDSDSDPGSDSGSDSNSDSGSDSGSDSTSDSGSDSASDSDSASDSDSASDSDS 611
G + + SD G S + DS +G ST +G DS+ + S + SD
Sbjct: 325 GSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDL 384

Query: 612 ASDSDSASDSDSDNDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 671
+ S + +DS + S + +S + S + SD + S
Sbjct: 385 TAGYGSTGTA----GADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTG 440

Query: 672 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 731
+ DS + S + DS + S + SD + S S + +S +
Sbjct: 441 TAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAGY 500

Query: 732 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSASDSDSDSDSDSDSDSDSDSDSDS 791
S + S + S + ++SD + S S + ++S + S + +S
Sbjct: 501 GSTQTAGYGSTLTAGYGSTQTAQNESDLITGYGSTSTAGANSSLIAGYGSTQTASYNSVL 560

Query: 792 DSDSDSDSDSDSDSDSESDSDSESDSDSDSDSDSDSDSDSDSDSDSASDSDSGSDSDSSS 851
+ S + SD + S + SDS + S + S+ + GS +
Sbjct: 561 TAGYGSTQTAREGSDLTAGYGSTGTAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTARE 620

Query: 852 DSDSESDSNSDSESGSNNNVV 872
S + S S +G++++++
Sbjct: 621 QSVLTTGYGSTSTAGADSSLI 641



Score = 33.2 bits (75), Expect = 0.006
Identities = 57/293 (19%), Positives = 103/293 (35%), Gaps = 2/293 (0%)

Query: 560 DSDSDPGSDSGSDSNSDSGSD--SGSDSTSDSGSDSASDSDSASDSDSASDSDSASDSDS 617
+S + GS + GSD +G ST +G DS+ + S + DS + S
Sbjct: 315 GEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGS 374

Query: 618 ASDSDSDNDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 677
+ +D + S + +DS + S + +S + S + SD +
Sbjct: 375 TQTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTA 434

Query: 678 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 737
S + DS + S + DS + S + SD + S S + +S
Sbjct: 435 GYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTSTAGYES 494

Query: 738 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSASDSDSDSDSDSDSDSDSDSDSDSDSDSDS 797
+ S + S + S + + SD + S S + ++S + S +
Sbjct: 495 SLIAGYGSTQTAGYGSTLTAGYGSTQTAQNESDLITGYGSTSTAGANSSLIAGYGSTQTA 554

Query: 798 DSDSDSDSDSESDSDSESDSDSDSDSDSDSDSDSDSDSDSASDSDSGSDSDSS 850
+S + S + SD + S + SDS + S + SS
Sbjct: 555 SYNSVLTAGYGSTQTAREGSDLTAGYGSTGTAGSDSSIIAGYGSTQTASYHSS 607



Score = 33.2 bits (75), Expect = 0.006
Identities = 57/298 (19%), Positives = 102/298 (34%)

Query: 566 GSDSGSDSNSDSGSDSGSDSTSDSGSDSASDSDSASDSDSASDSDSASDSDSASDSDSDN 625
GS + S + GS T+ GSD + S + S + S + DS
Sbjct: 309 GSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSL 368

Query: 626 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 685
+ S + SD + S + +DS + S + +S + S +
Sbjct: 369 TAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQK 428

Query: 686 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 745
SD + S + DS + S + DS + S + SD + S S
Sbjct: 429 GSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTS 488

Query: 746 DSDSDSDSDSDSDSDSDSDSDSASDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 805
+ +S + S + S + S + ++SD + S S + ++S +
Sbjct: 489 TAGYESSLIAGYGSTQTAGYGSTLTAGYGSTQTAQNESDLITGYGSTSTAGANSSLIAGY 548

Query: 806 DSESDSDSESDSDSDSDSDSDSDSDSDSDSDSASDSDSGSDSDSSSDSDSESDSNSDS 863
S + S + S + SD + S +GSDS + S ++ S
Sbjct: 549 GSTQTASYNSVLTAGYGSTQTAREGSDLTAGYGSTGTAGSDSSIIAGYGSTQTASYHS 606



Score = 33.2 bits (75), Expect = 0.007
Identities = 54/281 (19%), Positives = 101/281 (35%)

Query: 575 SDSGSDSGSDSTSDSGSDSASDSDSASDSDSASDSDSASDSDSASDSDSDNDSDSDSDSD 634
S + GSD T+ GS + DS+ + S + DS + S + SD
Sbjct: 422 STQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLT 481

Query: 635 SDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSD 694
+ S S + +S + S + S + S + ++SD + S S + ++
Sbjct: 482 AGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTAGYGSTQTAQNESDLITGYGSTSTAGAN 541

Query: 695 SDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSD 754
S + S + +S + S + SD + S + SDS + S
Sbjct: 542 SSLIAGYGSTQTASYNSVLTAGYGSTQTAREGSDLTAGYGSTGTAGSDSSIIAGYGSTQT 601

Query: 755 SDSDSDSDSDSDSASDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSESDSDSE 814
+ S + S + S + S S + +DS + S + +S +
Sbjct: 602 ASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYG 661

Query: 815 SDSDSDSDSDSDSDSDSDSDSDSASDSDSGSDSDSSSDSDS 855
S + SD + S S + + S +G S ++ +S
Sbjct: 662 STQTAQEGSDLTAGYGSTSTAGADSSLIAGYGSTQTAGYNS 702



Score = 32.8 bits (74), Expect = 0.007
Identities = 56/298 (18%), Positives = 97/298 (32%)

Query: 566 GSDSGSDSNSDSGSDSGSDSTSDSGSDSASDSDSASDSDSASDSDSASDSDSASDSDSDN 625
GS S + GS T+ S + S + + S + S + +S
Sbjct: 165 GSTLSGTHQSQLIAGYGSTETAGDSSTLIAGYGSTGTAGADSTLVAGYGSTQTAGEESSQ 224

Query: 626 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 685
+ S SD + S + DS + S + DS + S +
Sbjct: 225 MAGYGSTQTGMKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQK 284

Query: 686 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 745
SD + S + +DS + S + +S + S + SD + S
Sbjct: 285 GSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTG 344

Query: 746 DSDSDSDSDSDSDSDSDSDSDSASDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 805
+ DS + S + DS+ + S + SD + S + +DS +
Sbjct: 345 TAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGY 404

Query: 806 DSESDSDSESDSDSDSDSDSDSDSDSDSDSDSASDSDSGSDSDSSSDSDSESDSNSDS 863
S + ES + S + SD + S +G DS + S + DS
Sbjct: 405 GSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDS 462



Score = 32.8 bits (74), Expect = 0.007
Identities = 57/298 (19%), Positives = 104/298 (34%)

Query: 566 GSDSGSDSNSDSGSDSGSDSTSDSGSDSASDSDSASDSDSASDSDSASDSDSASDSDSDN 625
GS + S + GS T+ GSD + S + S + S + DS
Sbjct: 405 GSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSL 464

Query: 626 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 685
+ S + SD + S S + +S + S + S + S + +
Sbjct: 465 TAGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTAGYGSTQTAQN 524

Query: 686 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 745
+SD + S S + ++S + S + +S + S + SD + S
Sbjct: 525 ESDLITGYGSTSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQTAREGSDLTAGYGSTG 584

Query: 746 DSDSDSDSDSDSDSDSDSDSDSASDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 805
+ SDS + S + S+ + S + S + S S + +DS +
Sbjct: 585 TAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGADSSLIAGY 644

Query: 806 DSESDSDSESDSDSDSDSDSDSDSDSDSDSDSASDSDSGSDSDSSSDSDSESDSNSDS 863
S + S + S + SD + S S +G+DS + S + +S
Sbjct: 645 GSTQTAGYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGADSSLIAGYGSTQTAGYNS 702



Score = 32.8 bits (74), Expect = 0.008
Identities = 58/307 (18%), Positives = 108/307 (35%), Gaps = 2/307 (0%)

Query: 559 EDSDSDPGSDSGSDSNSDSGSDSGSDSTSDSGSDSASDSDSASDSDSASDSD--SASDSD 616
+DS G S + DS +G ST + S + S + +DS + S
Sbjct: 252 DDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGYGST 311

Query: 617 SASDSDSDNDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSD 676
+ +S + S + SD + S + DS + S + DS +
Sbjct: 312 QTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAG 371

Query: 677 SDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSD 736
S + SD + S + +DS + S + +S + S + SD
Sbjct: 372 YGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSD 431

Query: 737 SDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSASDSDSDSDSDSDSDSDSDSDSDSDSDSD 796
+ S + DS + S + DS+ + S + SD + S S +
Sbjct: 432 LTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTSTAG 491

Query: 797 SDSDSDSDSDSESDSDSESDSDSDSDSDSDSDSDSDSDSDSASDSDSGSDSDSSSDSDSE 856
+S + S + S + S + ++SD + S S +G++S + S
Sbjct: 492 YESSLIAGYGSTQTAGYGSTLTAGYGSTQTAQNESDLITGYGSTSTAGANSSLIAGYGST 551

Query: 857 SDSNSDS 863
++ +S
Sbjct: 552 QTASYNS 558



Score = 32.8 bits (74), Expect = 0.008
Identities = 56/299 (18%), Positives = 106/299 (35%)

Query: 552 GEIEPIPEDSDSDPGSDSGSDSNSDSGSDSGSDSTSDSGSDSASDSDSASDSDSASDSDS 611
G + E+S G S + S +G ST +G DS+ + S + DS
Sbjct: 405 GSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSL 464

Query: 612 ASDSDSASDSDSDNDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 671
+ S + +D + S S + +S + S + S + S + +
Sbjct: 465 TAGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTAGYGSTQTAQN 524

Query: 672 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 731
+SD + S S + ++S + S + +S + S + SD + S
Sbjct: 525 ESDLITGYGSTSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQTAREGSDLTAGYGSTG 584

Query: 732 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSASDSDSDSDSDSDSDSDSDSDSDS 791
+ SDS + S + S + S + S + S S + +DS +
Sbjct: 585 TAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGADSSLIAGY 644

Query: 792 DSDSDSDSDSDSDSDSESDSDSESDSDSDSDSDSDSDSDSDSDSDSASDSDSGSDSDSS 850
S + +S + S ++ SD + S S + +DS + S + +S
Sbjct: 645 GSTQTAGYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGADSSLIAGYGSTQTAGYNSI 703



Score = 31.6 bits (71), Expect = 0.018
Identities = 58/293 (19%), Positives = 109/293 (37%), Gaps = 6/293 (2%)

Query: 569 SGSDSNSDSG----SDSGSDSTSDSGSDSASDSDSASDSDSASDSDSASDSDSA--SDSD 622
+G+DS+ +G +G +ST +G S + SD + S + DS+ +
Sbjct: 394 AGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYG 453

Query: 623 SDNDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSD 682
S + DS + S + SD + S S + +S + S + S
Sbjct: 454 STQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLT 513

Query: 683 SDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSD 742
+ S + ++SD + S S + ++S + S + +S + S +
Sbjct: 514 AGYGSTQTAQNESDLITGYGSTSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQTAREG 573

Query: 743 SDSDSDSDSDSDSDSDSDSDSDSDSASDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSD 802
SD + S + SDS + S + S + S + S + S S
Sbjct: 574 SDLTAGYGSTGTAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTST 633

Query: 803 SDSDSESDSDSESDSDSDSDSDSDSDSDSDSDSDSASDSDSGSDSDSSSDSDS 855
+ +DS + S + +S + S + SD +G S S++ +DS
Sbjct: 634 AGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGADS 686



Score = 31.6 bits (71), Expect = 0.018
Identities = 60/313 (19%), Positives = 109/313 (34%), Gaps = 4/313 (1%)

Query: 561 SDSDPGSDSGSDSNSDSGSDSGSDSTSDSGSDSASDSDSASDSDSASDSDSASDSDSASD 620
S G DS + S +G DS+ +G S + SD + S + +DS+
Sbjct: 246 STGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLI 305

Query: 621 SDSDNDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSD 680
+ + + +S + S + SD + S + DS + S + D
Sbjct: 306 AGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGED 365

Query: 681 SDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSD 740
S + S + SD + S + +DS + S + +S + S
Sbjct: 366 SSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQT 425

Query: 741 SDSDSDSDSDSDSDSDSDSDSDSDSDSASDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSD 800
+ SD + S + DS + S + DS + S + SD +
Sbjct: 426 AQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYG 485

Query: 801 SDSDSDSESDSDSESDSDSDSDSDSDSDSDSDSDSDSASDSD----SGSDSDSSSDSDSE 856
S S + ES + S + S + S + ++SD GS S + ++S
Sbjct: 486 STSTAGYESSLIAGYGSTQTAGYGSTLTAGYGSTQTAQNESDLITGYGSTSTAGANSSLI 545

Query: 857 SDSNSDSESGSNN 869
+ S + N+
Sbjct: 546 AGYGSTQTASYNS 558



Score = 31.3 bits (70), Expect = 0.022
Identities = 57/328 (17%), Positives = 108/328 (32%)

Query: 545 PEQPDEPGEIEPIPEDSDSDPGSDSGSDSNSDSGSDSGSDSTSDSGSDSASDSDSASDSD 604
P PD E++ D+ +S S + + +T S S +
Sbjct: 122 PGSPDVTSEVKVGNRSLPVTDDIDATIESGSTQPTQTIEIATYGSTLSGTHQSQLIAGYG 181

Query: 605 SASDSDSASDSDSASDSDSDNDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSD 664
S + +S + S +DS + S + +S + S SD
Sbjct: 182 STETAGDSSTLIAGYGSTGTAGADSTLVAGYGSTQTAGEESSQMAGYGSTQTGMKGSDLT 241

Query: 665 SDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSD 724
+ S + DS + S + DS + S + SD + S + +D
Sbjct: 242 AGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGAD 301

Query: 725 SDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSASDSDSDSDSDSDSDSD 784
S + S + +S + S + SD + S + DS + S
Sbjct: 302 SSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQT 361

Query: 785 SDSDSDSDSDSDSDSDSDSDSDSESDSDSESDSDSDSDSDSDSDSDSDSDSDSASDSDSG 844
+ DS + S + SD + S + +DS + S + +S + G
Sbjct: 362 AGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYG 421

Query: 845 SDSDSSSDSDSESDSNSDSESGSNNNVV 872
S + SD + S +G +++++
Sbjct: 422 STQTAQKGSDLTAGYGSTGTAGDDSSLI 449



Score = 31.3 bits (70), Expect = 0.027
Identities = 59/292 (20%), Positives = 108/292 (36%), Gaps = 2/292 (0%)

Query: 561 SDSDPGSDSGSDSNSDSGSD--SGSDSTSDSGSDSASDSDSASDSDSASDSDSASDSDSA 618
DS + GS + GSD +G STS +G +S+ + S + S + S
Sbjct: 460 EDSSLTAGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTAGYGST 519

Query: 619 SDSDSDNDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSD 678
+ +++D + S S + ++S + S + +S + S + SD +
Sbjct: 520 QTAQNESDLITGYGSTSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQTAREGSDLTAG 579

Query: 679 SDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSD 738
S + SDS + S + S + S + S + S S + +DS
Sbjct: 580 YGSTGTAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGADSS 639

Query: 739 SDSDSDSDSDSDSDSDSDSDSDSDSDSDSASDSDSDSDSDSDSDSDSDSDSDSDSDSDSD 798
+ S + +S + S + SD + S S + +DS + S +
Sbjct: 640 LIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGADSSLIAGYGSTQTAG 699

Query: 799 SDSDSDSDSESDSDSESDSDSDSDSDSDSDSDSDSDSDSASDSDSGSDSDSS 850
+S + S ++ SD S S S + +DS + S + SS
Sbjct: 700 YNSILTAGYGSTQTAQEGSDLTSGYGSTSTAGADSSLIAGYGSTQTASYHSS 751



Score = 30.9 bits (69), Expect = 0.028
Identities = 59/315 (18%), Positives = 111/315 (35%), Gaps = 4/315 (1%)

Query: 561 SDSDPGSDSGSDSNSDSGSDSGSDSTSDSGSDSASDSDSASDSDSASDSDSASDSDSASD 620
S G+DS + S +G +ST +G S + SD + S + DS+
Sbjct: 294 STGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLI 353

Query: 621 SDSDNDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSD 680
+ + + DS + S + SD + S + +DS + S + +
Sbjct: 354 AGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEE 413

Query: 681 SDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSD 740
S + S + SD + S + DS + S + DS + S
Sbjct: 414 STQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQT 473

Query: 741 SDSDSDSDSDSDSDSDSDSDSDSDSDSASDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSD 800
+ SD + S S + +S + S + S + S + ++SD +
Sbjct: 474 AQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTAGYGSTQTAQNESDLITGYG 533

Query: 801 SDSDSDSESDSDSESDSDSDSDSDSDSDSDSDSDSDSASDSD----SGSDSDSSSDSDSE 856
S S + + S + S + +S + S + SD GS + SDS
Sbjct: 534 STSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQTAREGSDLTAGYGSTGTAGSDSSII 593

Query: 857 SDSNSDSESGSNNNV 871
+ S + ++++
Sbjct: 594 AGYGSTQTASYHSSL 608



Score = 30.5 bits (68), Expect = 0.042
Identities = 58/298 (19%), Positives = 103/298 (34%)

Query: 566 GSDSGSDSNSDSGSDSGSDSTSDSGSDSASDSDSASDSDSASDSDSASDSDSASDSDSDN 625
GS + S + GS T+ + SD + S S + + S + S + +S
Sbjct: 501 GSTQTAGYGSTLTAGYGSTQTAQNESDLITGYGSTSTAGANSSLIAGYGSTQTASYNSVL 560

Query: 626 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 685
+ S + SD + S + SDS + S + S + S +
Sbjct: 561 TAGYGSTQTAREGSDLTAGYGSTGTAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTARE 620

Query: 686 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 745
S + S S + +DS + S + +S + S + SD + S S
Sbjct: 621 QSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTAGYGSTS 680

Query: 746 DSDSDSDSDSDSDSDSDSDSDSASDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 805
+ +DS + S + +S + S + SD S S S + +DS +
Sbjct: 681 TAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTSGYGSTSTAGADSSLIAGY 740

Query: 806 DSESDSDSESDSDSDSDSDSDSDSDSDSDSDSASDSDSGSDSDSSSDSDSESDSNSDS 863
S + S + S + S + S S +G+DS + S + S
Sbjct: 741 GSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYHS 798


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits         Score  E-value  Comments
SACOL_RS04400  IGASERPTASE  30     0.022    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 30.4 bits (68), Expect = 0.022
Identities = 39/228 (17%), Positives = 78/228 (34%), Gaps = 14/228 (6%)

Query: 207 ERANKKAVNKRMLENKKEDLETIIDEFFSDIDKTRPNNIPVLEDEKQEEKNHKN---MAQ 263
E A N + E E D +T N V ++ K K + +AQ
Sbjct: 1035 ETTETVAENSKQESKTVEKNE-------QDATETTAQNREVAKEAKSNVKANTQTNEVAQ 1087

Query: 264 LKSDTEAAKSDESKRSKRSKRSLNTQNHKPASQEVSEQQKAEYDKRAEERKARFLDNQKI 323
S+T+ ++ E+K + ++ + +QEV + K+ + +
Sbjct: 1088 SGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPAR 1147

Query: 324 KKTPVVSLEYDFEHKQRIDNENDKKLVVSAPTKKPTSPTTYTETTTQV---PMPTVERQT 380
+ P V+++ + S+ ++P + +T T V P T T
Sbjct: 1148 ENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATT 1207

Query: 381 QQQIIYNAPKQLAGLNGES-HDFTTTHQSPTTSNHTHNNVVEFEETSA 427
Q + + + + S + TTS++ + V + TS
Sbjct: 1208 QPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCDLTST 1255


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
SACOL_RS04435  PF05704  28     0.035    Capsular polysaccharide synthesis protein
		>PF05704#Capsular polysaccharide synthesis protein

Length = 307

Score = 27.5 bits (61), Expect = 0.035
Identities = 13/69 (18%), Positives = 24/69 (34%), Gaps = 7/69 (10%)

Query: 116 EWVKKNYENTNHRYLVTLNLNSK-------KFTYCTKIIYQAYKFGVSEKSVKSYGLHII 168
W + Y N + +++ N + + YK + +Y HI
Sbjct: 239 YWKEIPYVNNVNPHMLQYLGNLPYDNSMFNYIKSTSPVQKLTYKLDYNNLKRNTYYDHIF 298

Query: 169 SPYAIKDNF 177
S +KDN+
Sbjct: 299 SIDKLKDNY 307


15  SACOL_RS04485  SACOL_RS04570  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
SACOL_RS04485  2       14       -1.148998  nitroreductase
SACOL_RS04490  2       17       -0.799852  thiol reductase thioredoxin
SACOL_RS04495  2       19       -1.272206  arsenate reductase
SACOL_RS04500  3       18       -1.488547  glycine cleavage system protein H
SACOL_RS04505  2       17       -0.639813  hypothetical protein
SACOL_RS04510  0       15       0.157452   hypothetical protein
SACOL_RS04515  0       11       -0.705776  hypothetical protein
SACOL_RS04520  -1      11       -1.118795  topoisomerase
SACOL_RS04525  0       11       -1.839747  thiol reductase thioredoxin
SACOL_RS04530  2       14       -2.117671  methionine import ATP-binding protein MetN 2
SACOL_RS04535  3       16       -3.002125  ABC transporter permease
SACOL_RS04540  3       18       -3.363252  methionine ABC transporter substrate-binding
SACOL_RS04545  4       22       -3.130647  site-specific integrase
SACOL_RS04550  6       21       -3.705431  exotoxin
SACOL_RS04555  4       19       -2.436649  enterotoxin
SACOL_RS04560  4       16       -1.574837  hypothetical protein
SACOL_RS04565  2       19       -1.352883  hypothetical protein
SACOL_RS04570  2       22       -2.562112  transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS04540  ADHESNFAMILY  34     5e-04    Adhesin family signature.
		>ADHESNFAMILY#Adhesin family signature.

Length = 309

Score = 34.1 bits (78), Expect = 5e-04
Identities = 16/48 (33%), Positives = 29/48 (60%)

Query: 1 MKKLFGLILVLTFAVVLAACGNGNKSGSDDKKITVGASPAPHAEILEK 48
MKKL L+++ A++L AC +G K + +K+ V A+ + A+I +
Sbjct: 1 MKKLGTLLVLFLSAIILVACASGKKDTTSGQKLKVVATNSIIADITKN 48


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits         Score  E-value  Comments
SACOL_RS04550  BACTRLTOXIN  101    5e-28    Bacterial toxin signature.
		>BACTRLTOXIN#Bacterial toxin signature.

Length = 266

Score = 101 bits (253), Expect = 5e-28
Identities = 59/232 (25%), Positives = 102/232 (43%), Gaps = 36/232 (15%)

Query: 29 IDNLRNFYTKKDFVDLKDVKDNDTPIANQLQFSNESYDLISESKDFNKFSN------FKG 82
+ N++ Y +V VK D +A+ L ++ L + K + N +K
Sbjct: 48 MGNMKYLYDDH-YVSATKVKSVDKFLAHDLIYNISDKKLKNYDKVKTELLNEDLAKKYKD 106

Query: 83 KKLDVFGISYNGQC-------------NTKYIYGGVT-ATNEYLDKSRNIPINIWINGNH 128
+ +DV+G +Y C +YGG+T + D + + + N
Sbjct: 107 EVVDVYGSNYYVNCYFSSKDNVGKVTGGKTCMYGGITKHEGNHFDNGNLQNVLVRVYENK 166

Query: 129 KTISTNKVSTNKKFVTAQEIDVKLRKYLQEEYNIYGHNGTKKGEEYGHKSKFYSGFNIGK 188
+ + +V T+KK VTAQE+D+K R +L + N+Y N + + G
Sbjct: 167 RNTISFEVQTDKKSVTAQELDIKARNFLINKKNLYEFNSSP--------------YETGY 212

Query: 189 VTFHLNNNDTFSYDLF-YTGDDGLPKSFLKIYEDNKTVESEKFHLDVDISYK 239
+ F NN +TF YD+ GD +L +Y DNKTV+S+ ++V ++ K
Sbjct: 213 IKFIENNGNTFWYDMMPAPGDKFDQSKYLMMYNDNKTVDSKSVKIEVHLTTK 264


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits         Score  E-value  Comments
SACOL_RS04555  BACTRLTOXIN  113    1e-32    Bacterial toxin signature.
		>BACTRLTOXIN#Bacterial toxin signature.

Length = 266

Score = 113 bits (284), Expect = 1e-32
Identities = 65/225 (28%), Positives = 102/225 (45%), Gaps = 37/225 (16%)

Query: 32 NLRNFYANYEPEKLQGVSSGNFSTSHQLEY------IDGKYTLYSQFHNEYEAKRLKDHK 85
N++ Y ++ + S F +H L Y + + ++ NE AK+ KD
Sbjct: 50 NMKYLYDDHYVSATKVKSVDKF-LAHDLIYNISDKKLKNYDKVKTELLNEDLAKKYKDEV 108

Query: 86 VDIFGISYSGLC-------------NTKYMYGGITLANQN-LDKPRNIPINLWVNGKQNT 131
VD++G +Y C MYGGIT N D + + V +
Sbjct: 109 VDVYGSNYYVNCYFSSKDNVGKVTGGKTCMYGGITKHEGNHFDNGNLQNVLVRVYENKRN 168

Query: 132 ISTDKVSTQKKEVTAQEIDIKLRKYLQNEYNIYGFNKTKKGQEYGYQSKFNSGFNKGKIT 191
+ +V T KK VTAQE+DIK R +L N+ N+Y FN +S + G I
Sbjct: 169 TISFEVQTDKKSVTAQELDIKARNFLINKKNLYEFN--------------SSPYETGYIK 214

Query: 192 FHLNNEPSFTYDLF-YTGTGQAES-FLKIYDDNKTIDTENFHLDV 234
F NN +F YD+ G +S +L +Y+DNKT+D+++ ++V
Sbjct: 215 FIENNGNTFWYDMMPAPGDKFDQSKYLMMYNDNKTVDSKSVKIEV 259


16  SACOL_RS05005  SACOL_RS05035  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
SACOL_RS05005  1       12       -4.069780  O-acetyltransferase
SACOL_RS05010  1       13       -4.178026  chaperone protein ClpB
SACOL_RS05015  4       18       -6.325606  LysR family transcriptional regulator
SACOL_RS05020  4       20       -6.017322  2-isopropylmalate synthase
SACOL_RS05025  5       18       -3.883373  hypothetical protein
SACOL_RS05030  3       15       -0.715029  membrane protein
SACOL_RS05035  2       13       1.434876   phosphatidylethanolamine-binding protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits         Score  E-value  Comments
SACOL_RS05010  IGASERPTASE  36     6e-04    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 36.2 bits (83), Expect = 6e-04
Identities = 17/143 (11%), Positives = 48/143 (33%), Gaps = 14/143 (9%)

Query: 420 QLEIEESALKNESDNASKQRLQELQEELANEKEKQAALQSRVESEKEKIANLQEKRAQLD 479
E+ +S + + Q + + ++EK + + + + + K+ Q +
Sbjct: 1082 TNEVAQSGSETKE----TQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSE 1137

Query: 480 ESRQALEDAQTNNNLEKAAELQYGTIPQLEKELRELEDNFQDEQGEDTDRMIREVVTDEE 539
+ E A+ N+ E Q + + ++ ++T + + VT+
Sbjct: 1138 TVQPQAEPARENDPTVNIKEPQ-------SQTNTTAD---TEQPAKETSSNVEQPVTEST 1187

Query: 540 IGDIVSQWTGIPVSKLVETEREK 562
+ + P + T +
Sbjct: 1188 TVNTGNSVVENPENTTPATTQPT 1210


17  SACOL_RS05265  SACOL_RS05330  Y  N  Y  Genomic Island
LocusTag       DNBias  CDNBias  %GCBias    Product
SACOL_RS05265  -1      18       -3.710935  **competence protein ComK
SACOL_RS05270  -1      19       -4.069080  hypothetical protein
SACOL_RS05275  -1      14       -5.536048  lipoate--protein ligase A
SACOL_RS05280  0       16       -6.090238  DUF2187 domain-containing protein
SACOL_RS05285  -1      16       -6.148497  hypothetical protein
SACOL_RS05290  0       16       -6.058639  putative holin-like toxin
SACOL_RS05295  1       16       -6.380500  hypothetical protein
SACOL_RS05300  1       16       -6.327917  hypothetical protein
SACOL_RS05305  0       19       -5.657804  hypothetical protein
SACOL_RS05310  0       14       -4.344187  ABC transporter ATP-binding protein
SACOL_RS05315  0       14       -4.174529  hypothetical protein
SACOL_RS05320  1       15       -4.288618  hypothetical protein
SACOL_RS05325  0       13       -3.364661  glycosyl transferase family 1
SACOL_RS05330  3       16       -0.489376  hypothetical protein
18  SACOL_RS05790  SACOL_RS05830  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
SACOL_RS05790  2       14       -1.014813  phosphopantetheine adenylyltransferase
SACOL_RS05795  6       17       -0.618338  hypothetical protein
SACOL_RS05800  5       18       -0.603877  DNA-binding protein
SACOL_RS05805  6       16       -1.602187  hypothetical protein
SACOL_RS05810  5       14       -1.606040  50S ribosomal protein L32
SACOL_RS05815  4       13       -1.607332  iron-regulated surface determinant protein B
SACOL_RS05820  1       14       -2.518758  iron-regulated surface determinant protein A
SACOL_RS05825  -2      14       -3.153235  iron-regulated surface determinant protein C
SACOL_RS05830  -1      15       -3.386823  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS05790  LPSBIOSNTHSS  219    1e-76    Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein signature.

Length = 166

Score = 219 bits (560), Expect = 1e-76
Identities = 77/155 (49%), Positives = 112/155 (72%)

Query: 5 IAVIPGSFDPITYGHLDIIERSTDRFDEIHVCVLKNSKKEGTFSLEERMDLIEQSVKHLP 64
A+ PGSFDPIT+GHLDIIER FD+++V VL+N K+ FS++ER++ I +++ HLP
Sbjct: 2 NAIYPGSFDPITFGHLDIIERGCRLFDQVYVAVLRNPNKQPMFSVQERLEQIAKAIAHLP 61

Query: 65 NVKVHQFSGLLVDYCEQVGAKTIIRGLRAVSDFEYELRLTSMNKKLNNEIETLYMMSSTN 124
N +V F GL V+Y Q A I+RGLR +SDFE EL++ + NK L +++ET+++ +ST
Sbjct: 62 NAQVDSFEGLTVNYARQRQAGAILRGLRVLSDFELELQMANTNKTLASDLETVFLTTSTE 121

Query: 125 YSFISSSIVKEVAAYRADISEFVPPYVEKALKKKF 159
YSF+SSS+VKEVA + ++ FVP +V AL +F
Sbjct: 122 YSFLSSSLVKEVARFGGNVEHFVPSHVAAALYDQF 156


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits         Score  E-value  Comments
SACOL_RS05815  IGASERPTASE  36     6e-04    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 35.8 bits (82), Expect = 6e-04
Identities = 37/194 (19%), Positives = 71/194 (36%), Gaps = 15/194 (7%)

Query: 447 RIVDKEAFTKANTDKSNKKEQQDNSAKKEA---------TPATPSKPTPSPVEKESQKQD 497
+ VD T N +++ N+ + PATPS+ T + E Q+
Sbjct: 990 QTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESK 1049

Query: 498 SQKDDNKQLPSVEKENDASSESGKDKTPATKPT------KGEVESSSTTPTKVVSTTQNV 551
+ + + + +N ++ K A T E + + TT TK +T +
Sbjct: 1050 TVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKE 1109

Query: 552 AKPTTASSKTTKDVVQTSAGSSEAKDSAPLQKANIKNTNDGHTQSQNNKNTQENKAKSLP 611
K + KT + TS S + + S +Q + T + +Q N
Sbjct: 1110 EKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTE 1169

Query: 612 QTGEESNKDMTLPL 625
Q +E++ ++ P+
Sbjct: 1170 QPAKETSSNVEQPV 1183


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits         Score  E-value  Comments
SACOL_RS05820  IGASERPTASE  34     8e-04    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 34.3 bits (78), Expect = 8e-04
Identities = 27/132 (20%), Positives = 44/132 (33%), Gaps = 4/132 (3%)

Query: 184 ADAAKPNNVKPVQPKPAQPKTPTEQTKPVQPKVEKVKPTVTTTSKVEDNHSTKVVSTDTT 243
+ A+ + P PA P TE + K + + +V +
Sbjct: 1015 EEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKS 1074

Query: 244 KDQTKTQTAHTVKTAQTAQEQNKVQTPVKDVATAKSESNNQAVSDNKSQQTNKVTKHNET 303
+ TQT + AQ+ E + QT + V K+Q+ KVT +
Sbjct: 1075 NVKANTQTN---EVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTS-QVS 1130

Query: 304 PKQASKAKELPK 315
PKQ P+
Sbjct: 1131 PKQEQSETVQPQ 1142


19  SACOL_RS05915  SACOL_RS06000  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
SACOL_RS05915  2       12       2.112272   succinate dehydrogenase cytochrome B558
SACOL_RS05920  2       11       2.132117   succinate dehydrogenase flavoprotein subunit
SACOL_RS05925  0       14       -0.232059  succinate dehydrogenase iron-sulfur subunit
SACOL_RS05930  -2      17       -1.033258  glutamate racemase
SACOL_RS05935  -1      17       -2.889106  non-canonical purine NTP pyrophosphatase
SACOL_RS05940  0       14       -4.598350  metal-dependent phosphodiesterase
SACOL_RS05945  2       17       -4.611836  hypothetical protein
SACOL_RS05950  3       16       -4.188029  fibrinogen-binding protein
SACOL_RS05955  0       18       -1.726379  hypothetical protein
SACOL_RS05960  3       21       -0.198035  FPRL1 inhibitory protein
SACOL_RS05965  2       24       -0.209425  hypothetical protein
SACOL_RS05970  4       26       1.235770   fibrinogen-binding protein
SACOL_RS05975  6       30       1.070005   fibrinogen-binding protein
SACOL_RS05980  6       29       1.825035   hypothetical protein
SACOL_RS05985  2       21       -0.516958  hypothetical protein
SACOL_RS05990  2       20       -2.788720  hypothetical protein
SACOL_RS05995  4       20       -2.782229  hypothetical protein
SACOL_RS06000  2       18       -2.571978  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SACOL_RS05920  MICOLLPTASE  30  0.027  Microbial collagenase metalloprotease (M9) signature.
		>MICOLLPTASE#Microbial collagenase metalloprotease (M9) signature.

Length = 1104

Score = 30.5 bits (68), Expect = 0.027
Identities = 22/127 (17%), Positives = 47/127 (37%), Gaps = 13/127 (10%)

Query: 378 DFSQHGGNRLGANSLLSAIYGGTVAGPNAIDYISNIDRSYTDMDESIFEKRKAEEQERFD 437
+ ++G N N++ + + G + I D T+ F R ER +
Sbjct: 246 NIDKYGSNYSKGNAVFNLMKGIDYYTNSVIYNTKGYDAKNTE-----FYNRIDPYMERLE 300

Query: 438 KLLAMR---GTENAYKLHRELGEIMTANVTVVRENEKLLETDKKIVELMKRYEDIDMEDT 494
L + +NA+ ++ L T + RE+ + + + + MK Y + +
Sbjct: 301 SLCTIGDKLNNDNAWLVNNALYY--TGRMGKFREDPSISQ--RALERAMKEYPYLSYQYI 356

Query: 495 QTWSNQA 501
+ +N
Sbjct: 357 E-AANDL 362


20  SACOL_RS06765  SACOL_RS06895  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
SACOL_RS06765219-1.913571glutamine synthetase
SACOL_RS06770220-2.456622hypothetical protein
SACOL_RS06775725-8.538704hypothetical protein
SACOL_RS06780521-8.741870hypothetical protein
SACOL_RS06785523-8.319892hypothetical protein
SACOL_RS06795422-6.425836hypothetical protein
SACOL_RS06800422-5.819697hypothetical protein
SACOL_RS06805422-5.561877hypothetical protein
SACOL_RS06810624-6.000868hypothetical protein
SACOL_RS06815425-5.892456phage head morphogenesis protein
SACOL_RS06820527-4.698150hypothetical protein
SACOL_RS06825328-4.417676hypothetical protein
SACOL_RS06830227-4.946041hypothetical protein
SACOL_RS06835427-6.187918hypothetical protein
SACOL_RS06840326-6.386401hypothetical protein
SACOL_RS06845220-3.075368hypothetical protein
SACOL_RS06850320-4.545858hypothetical protein
SACOL_RS06855115-3.446472hypothetical protein
SACOL_RS06860316-4.060091hypothetical protein
SACOL_RS06865115-3.882166hypothetical protein
SACOL_RS06870-115-4.177313threonine aldolase
SACOL_RS06875-117-5.040425hypothetical protein
SACOL_RS06880016-4.849228cardiolipin synthase
SACOL_RS06885-216-5.292517antibiotic ABC transporter ATP-binding protein
SACOL_RS06890-215-4.701589multidrug ABC transporter permease
SACOL_RS06895-213-4.368179two-component sensor histidine kinase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SACOL_RS06890  ABC2TRNSPORT  29  0.016  ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 28.7 bits (64), Expect = 0.016
Identities = 11/34 (32%), Positives = 15/34 (44%)

Query: 167 IVTIGLAVLGGLWFPINTFPNWLQHVAHVLPSYH 200
+V + L G FP++ P Q A LP H
Sbjct: 184 LVITPILFLSGAVFPVDQLPIVFQTAARFLPLSH 217


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SACOL_RS06895  PF04647  33  0.001  Accessory gene regulator B
		>PF04647#Accessory gene regulator B

Length = 212

Score = 32.8 bits (75), Expect = 0.001
Identities = 18/112 (16%), Positives = 42/112 (37%), Gaps = 9/112 (8%)

Query: 35 WLYIISVIVFSLSYLILVIVNNRLNTLMFYILLIIHYFIICYFVFSVHPMLSLFFFYSAF 94
+ S++VF++ I +++ L+ I I + + V +P
Sbjct: 79 RCTLTSLLVFNVLAYIAHLIDPAYFQLLILIAFITSLLALLFLVPVDNP---------RN 129

Query: 95 AVPFTFKNNVKKTATNLFILTMIICTIITYLLYNNYFVAMMVYYVVISLIML 146
+ T + K T++ ++ + +I Y LY + ++ V+ L
Sbjct: 130 LISNTEQRKTLKLKTSMVLMVLFGGSIGAYRLYTHQIALAILLGVLWQTFTL 181


21  SACOL_RS07005  SACOL_RS07045  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
SACOL_RS07005211-0.419757hypothetical protein
SACOL_RS07010312-2.236522transketolase
SACOL_RS07015414-4.050994hypothetical protein
SACOL_RS07020312-2.544544hypothetical protein
SACOL_RS0702529-1.167743membrane protein
SACOL_RS0703029-1.103476exonuclease sbcCD subunit D
SACOL_RS07035210-0.835801nuclease SbcCD subunit C
SACOL_RS070402131.354080large-conductance mechanosensitive channel
SACOL_RS070452131.576824glycine/betaine ABC transporter
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SACOL_RS07010  CHANLCOLICIN  30  0.041  Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 29.7 bits (66), Expect = 0.041
Identities = 18/58 (31%), Positives = 26/58 (44%), Gaps = 5/58 (8%)

Query: 296 QNTMLKRANEDESQ-----WNSLLEKYAETYPELAEEFKLAISGKLPKNYKDELPRFE 348
QN +L +D + +L EKY E Y ++A+E GK N + L FE
Sbjct: 337 QNNLLNSQIKDAVDATVSFYQTLTEKYGEKYSKMAQELADKSKGKKIGNVNEALAAFE 394


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SACOL_RS07035  FbpA_PF05833  36  0.001  Fibronectin-binding protein
		>FbpA_PF05833#Fibronectin-binding protein

Length = 577

Score = 35.6 bits (82), Expect = 0.001
Identities = 42/242 (17%), Positives = 86/242 (35%), Gaps = 16/242 (6%)

Query: 260 IKEFEIIEKKTLENNIL------------KDNINQLNKNKIDFVQLKEQQPEIEGIEAKL 307
I+ F L +NI + +L N ID ++ +
Sbjct: 181 IENFTKENSLQLNDNIFSKIFTGVSKTLSSEICFRLKNNSIDLSLSNLKEIVEVCKDLFK 240

Query: 308 KLLQDITNLLNYIENREKIETKIAN--SKKDISKTNNKILNLDCDKRNIDKEKK--MLEE 363
++ + Y +N + N SK+D K + + K+K + +
Sbjct: 241 EIQSNKFEFNCYTKNNSFVGFYCLNLMSKEDYKKIQYDSSSKLLENFYYAKDKSDRLKSK 300

Query: 364 NGDLIESKISFIDKTRVLFNDINKYQQSYLNIERLRTEGEQLGDELNDLIKGLETVEDSI 423
+ DL + ++ I++ +N + + + + GE L + L KGL +E +
Sbjct: 301 SSDLQKIVMNNINRCTKKDKILNNTLKKCEDKDIFKLYGELLTANIYALKKGLSHIELAN 360

Query: 424 GNNQSDYEKIIELNNTITNINNEINIIKENEKAKAELDKLLGSKQELENQINEETSILKN 483
+++ I L+ T N + K+ K K + + E ++N S+L N
Sbjct: 361 YYSENYDTVKITLDENKTPSQNVQSYYKKYNKLKKSEEAANEQLLQNEEELNYLYSVLTN 420

Query: 484 LE 485
+
Sbjct: 421 IN 422


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SACOL_RS07040  MECHCHANNEL  145  2e-48  Bacterial mechano-sensitive ion channel signature.
		>MECHCHANNEL#Bacterial mechano-sensitive ion channel signature.

Length = 136

Score = 145 bits (367), Expect = 2e-48
Identities = 62/131 (47%), Positives = 90/131 (68%), Gaps = 11/131 (8%)

Query: 1 MLKEFKEFALKGNVLDLAIAVVMGAAFNKIISSLVENIIMPLIGKIFGSVDFAK------ 54
++KEF+EFA++GNV+DLA+ V++GAAF KI+SSLV +IIMP +G + G +DF +
Sbjct: 3 IIKEFREFAMRGNVVDLAVGVIIGAAFGKIVSSLVADIIMPPLGLLIGGIDFKQFAVTLR 62

Query: 55 ----EWSFWGIKYGLFIQSVIDFIIIAFALFIFVKIANTL-MKKEEAEEEAVVEENVVLL 109
+ + YG+FIQ+V DF+I+AFA+F+ +K+ N L KKEE + VLL
Sbjct: 63 DAQGDIPAVVMHYGVFIQNVFDFLIVAFAIFMAIKLINKLNRKKEEPAAAPAPTKEEVLL 122

Query: 110 TEIRDLLREKK 120
TEIRDLL+E+
Sbjct: 123 TEIRDLLKEQN 133


22  SACOL_RS07485  SACOL_RS07510  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
SACOL_RS0748510112.448174zinc-finger domain-containing protein
SACOL_RS0749010112.335675membrane protein
SACOL_RS074959102.361168hypothetical protein
SACOL_RS075009102.306914hypothetical protein
SACOL_RS075059102.323734ribonuclease H
SACOL_RS075109102.281866hyperosmolarity resistance protein Ebh
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SACOL_RS07510  GPOSANCHOR  47  3e-06  Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 47.4 bits (112), Expect = 3e-06
Identities = 50/323 (15%), Positives = 96/323 (29%), Gaps = 9/323 (2%)

Query: 2582 TKVRAAQTKIDQAKALLQNKEDNSQLVTSKNNLQSSVNQVPSTAGMTQQSIDN------- 2634
T +A Q L + +E + N L+ + + + D
Sbjct: 37 TNEVSAVATRSQTDTLEKVQERADKFEIENNTLKLKNSDLSFNNKALKDHNDELTEELSN 96

Query: 2635 YNAKKREAETEITAAQRVIDNGDATAQQISDEKHRVDNALTALNQAKHDLTADTHALEQA 2694
K R+ + ++ I +A + N TA + L A+ AL
Sbjct: 97 AKEKLRKNDKSLSEKASKIQELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALAAR 156

Query: 2695 VQQLNRTGTTTGKKPASITAYNNSIRALQSDLTSAKNSANAIIQKPIRTVQEVQSALTNV 2754
L + + +A ++ A ++ L + + ++ + + + +
Sbjct: 157 KADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTL 216

Query: 2755 NRVNERLTQAINQLVPLADNSALKTAKTKLDEEINKSVTTDGMTQSSIQAYENAKRAGQT 2814
L L T +I ++ E A
Sbjct: 217 EAEKAALAARKADL--EKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMN 274

Query: 2815 ESTNAQNVINNGDATDQQIAAEKTKVEEKYNSLKQAIAGLTPDLAPLQTAKTQLQNDIDQ 2874
ST I +A + AEK +E + L L DL + AK QL+ + +
Sbjct: 275 FSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEAEHQK 334

Query: 2875 PTSTTGMTSASIAAFNEKLSAAR 2897
++ AS + L A+R
Sbjct: 335 LEEQNKISEASRQSLRRDLDASR 357



Score = 40.4 bits (94), Expect = 4e-04
Identities = 66/380 (17%), Positives = 122/380 (32%), Gaps = 36/380 (9%)

Query: 2732 SANAIIQKPIRTVQEVQSALTNVNRVNERLTQAINQLVPLAD-----NSALKTAKTKLDE 2786
+ + T+++VQ N L + L N L + E
Sbjct: 40 VSAVATRSQTDTLEKVQERADKFEIENNTLKLKNSDLSFNNKALKDHNDELTEELSNAKE 99

Query: 2787 EINKSVTTDGMTQSSIQAYENAKRAGQTESTNAQNVINNGDATDQQIAAEKTKVEEKYNS 2846
++ K+ + S IQ E K + A N A + + AEK + +
Sbjct: 100 KLRKNDKSLSEKASKIQELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALAARKAD 159

Query: 2847 LKQAIAGLTPDLAPLQTAKTQLQNDIDQPTSTTGMTSASIAAFNEKLSAARTKIQEIDRV 2906
L++A+ G L+ + A A + L A +
Sbjct: 160 LEKALEGAMNFSTADSAKIKTLEAEKAA-------LEARQAELEKALEGAM------NFS 206

Query: 2907 LASHPDVATIRQNVTAANAAKSALDQARNGLTVDKAPLENAKNQLQYSIDTQTSTTGMTQ 2966
A + T+ A A K+ L++A G L+ + +
Sbjct: 207 TADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELE 266

Query: 2967 DSINAYNAKLTAARNKIQQINQVLAGSPTVEQINTNTSTANQAKSDLDHARQALTPDKAP 3026
++ TA KI+ + + + L+ RQ+L D
Sbjct: 267 KALEGAMNFSTADSAKIKTLEA------EKAALEAEKADLEHQSQVLNANRQSLRRDLDA 320

Query: 3027 LQTAKTQLEQSINQPTDTTGMTTASLNAYNQKLQAAR----------QKLTEINQVLNGN 3076
+ AK QLE + + ++ AS + + L A+R QKL E N++
Sbjct: 321 SREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEA- 379

Query: 3077 PTVQNINDKVTEANQAKDQL 3096
+ Q++ + + +AK Q+
Sbjct: 380 -SRQSLRRDLDASREAKKQV 398



Score = 33.9 bits (77), Expect = 0.042
Identities = 52/379 (13%), Positives = 123/379 (32%), Gaps = 21/379 (5%)

Query: 8667 QQQALENQINNATTRGEVAQK-LTEAQALNQAMEALRNSIQDQQQTEAGSKFINEDKPQK 8725
L +++NA + K L+E + Q +EA + ++ + N
Sbjct: 86 HNDELTEELSNAKEKLRKNDKSLSEKASKIQELEARKADLEKALEGAM-----NFSTADS 140

Query: 8726 DAYQAAVQNAKDLINQTNNPTLDKAQVEQLTQAVNQAKDNLHGDQKLADDKQHAVTDLNQ 8785
+ L + + + A + L ++ + +Q + +
Sbjct: 141 AKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALE 200

Query: 8786 LNGLNNPQRQALESQINNAATRGEVAQKLAEAKALDQAMQALRNSIQDQQQTESG--SKF 8843
+ A + A + +A + A+ + + + + +T +
Sbjct: 201 GAMNFSTADSAKIKTL--EAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAAL 258

Query: 8844 INEDKPQKDAYQAAVQNAKDLINQTGNPTLDKSQVEQLTQAVTTAKDNLHGDQKLARDQQ 8903
+ A + A+ + + +K+ +E + L+ +++ R
Sbjct: 259 EARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDL 318

Query: 8904 QAVTTVNALPNLNHAQQQALTDAINAAPTRTEVAQHVQTATELDHAMETLKNKVDQVN-- 8961
A H + + A+ R + + + + E +E K+++ N
Sbjct: 319 DASREAKKQLEAEHQKLEEQNKISEAS--RQSLRRDLDASREAKKQLEAEHQKLEEQNKI 376

Query: 8962 ---TDKAQPNYTEASTDKKEAVDQALQAAESITDPTNGSNANKDAVDQVLTKLQEKENEL 9018
+ ++ +AS + K+ V++AL+ A S N + KL EKE
Sbjct: 377 SEASRQSLRRDLDASREAKKQVEKALEEANSKLAALEKLNKELEES----KKLTEKEKAE 432

Query: 9019 NGNERVAEAKTQAKQTIDQ 9037
+ AEAK ++ Q
Sbjct: 433 LQAKLEAEAKALKEKLAKQ 451


23  SACOL_RS08015  SACOL_RS08075  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
SACOL_RS08015-121-3.340507integrase
SACOL_RS08020-120-3.192064hypothetical protein
SACOL_RS08025021-3.240018hypothetical protein
SACOL_RS08030021-2.744249mannosyl-glycoprotein
SACOL_RS08035021-3.902577membrane protein
SACOL_RS08040121-3.521892cell division protein FtsK
SACOL_RS08045121-3.218920ATPase AAA
SACOL_RS08050421-3.648450hypothetical protein
SACOL_RS08055221-3.603414hypothetical protein
SACOL_RS08060118-3.565146conjugal transfer protein
SACOL_RS08065016-1.819566hypothetical protein
SACOL_RS08070-216-2.936004hypothetical protein
SACOL_RS08075-115-3.179681hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SACOL_RS08035  GPOSANCHOR  35  0.001  Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 34.7 bits (79), Expect = 0.001
Identities = 27/195 (13%), Positives = 56/195 (28%), Gaps = 11/195 (5%)

Query: 419 SAIGAGVMNRTQERFNKIRHEQAQNKKAKRENQRDEPAPPLQNDNDLRRRQQDKPMPLFI 478
G MN + K + + +KA E ++ E L+ + K L
Sbjct: 161 EKALEGAMNFSTADSAK--IKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEA 218

Query: 479 NKDNQKNGNKRREQQESMNGNDVKSASVESNANDYSKQPQKASQQEHQVRETRQRKDIQR 538
K E+ N + S + + +KA+ + Q + +
Sbjct: 219 EKAALAARKADLEKALEGAMNFSTADSAKIKT----LEAEKAALEARQAELEKALEGAMN 274

Query: 539 SPQVVNQPLNNENHSINRKEQKSVQTAYDTDVQKRQIQNATQNQQSRQSGNRNQPITRNS 598
+ + E + + Q+ NA + R + +
Sbjct: 275 FSTADSAKIKTLEAEKAALEAEKADLE-----HQSQVLNANRQSLRRDLDASREAKKQLE 329

Query: 599 QSKDRLKEQKDINKH 613
+L+EQ I++
Sbjct: 330 AEHQKLEEQNKISEA 344


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SACOL_RS08045  HTHFIS  32  0.013  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 31.7 bits (72), Expect = 0.013
Identities = 9/27 (33%), Positives = 17/27 (62%), Gaps = 1/27 (3%)

Query: 454 KGAVSDSPHVLITGQTGKGKSFLAKLL 480
+ +D ++ITG++G GK +A+ L
Sbjct: 155 RLMQTDLT-LMITGESGTGKELVARAL 180


24  SACOL_RS08125  SACOL_RS08175  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
SACOL_RS08125218-5.018512aminomethyltransferase
SACOL_RS08130322-6.372277shikimate kinase
SACOL_RS08135323-6.554051hypothetical protein
SACOL_RS08140221-5.182303competence protein ComGF
SACOL_RS08145119-4.142844hypothetical protein
SACOL_RS08150017-3.950406prepilin-type N-terminal cleavage/methylation
SACOL_RS08155113-2.340409competence protein ComGC
SACOL_RS08160215-2.445676competence protein ComGB
SACOL_RS08165-111-2.637873hypothetical protein
SACOL_RS08170013-2.914850hydroxyacylglutathione hydrolase
SACOL_RS08175-111-3.110818hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SACOL_RS08150  BCTERIALGSPH  41  4e-07  Bacterial general secretion pathway protein H signa...
		>BCTERIALGSPH#Bacterial general secretion pathway protein H signature.

Length = 170

Score = 40.7 bits (95), Expect = 4e-07
Identities = 14/79 (17%), Positives = 38/79 (48%), Gaps = 4/79 (5%)

Query: 5 KQSAFTMIEMLVVMMLISIFLLLTMTSKGLSNLRVIDDEA-NIISFITELNYIKSQAIAN 63
+Q FT++EM+++++L+ + + + + S D A + F +L +++ + +
Sbjct: 2 RQRGFTLLEMMLILLLMGVSAGMVLLAFPAS---RDDSAAQTLARFEAQLRFVQQRGLQT 58

Query: 64 QGYINVRFYENSDTIKVIE 82
+ V + + V+E
Sbjct: 59 GQFFGVSVHPDRWQFLVLE 77


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SACOL_RS08155  BCTERIALGSPG  46  9e-10  Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 46.4 bits (110), Expect = 9e-10
Identities = 19/76 (25%), Positives = 44/76 (57%), Gaps = 4/76 (5%)

Query: 3 KFLKKTQAFTLIEMLLVLLIISLLLILIIPNI--AKQTAHIQSTGCNAQVKMVNSQIEAY 60
+ K + FTL+E+++V++II +L L++PN+ K+ A Q + + + + ++ Y
Sbjct: 2 RATDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKA--VSDIVALENALDMY 59

Query: 61 ALKHNRNPSSIEDLIA 76
L ++ P++ + L +
Sbjct: 60 KLDNHHYPTTNQGLES 75


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SACOL_RS08160  BCTERIALGSPF  84  4e-20  Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 84.1 bits (208), Expect = 4e-20
Identities = 65/347 (18%), Positives = 137/347 (39%), Gaps = 6/347 (1%)

Query: 14 KKRQLSKAQQIDLLSNLCNLLKYGFTLYQSFQFLNLQMTYKN-KQLGTTILSEISNGAPC 72
+K +LS + L L L+ L ++ + Q + QL + S++ G
Sbjct: 61 RKIRLSTSDLALLTRQLATLVAASMPLEEALDAVAKQSEKPHLSQLMAAVRSKVMEGHSL 120

Query: 73 NQIL-SLIGYSDTI-VMQVYLAERFGNIIDVLEETVNYMKVNRKSEQRLLKTLQYPLILV 130
+ G + + V E G++ VL +Y + ++ R+ + + YP +L
Sbjct: 121 ADAMKCFPGSFERLYCAMVAAGETSGHLDAVLNRLADYTEQRQQMRSRIQQAMIYPCVLT 180

Query: 131 SIFIAMIIILNLTVIPQFQQLYTSMNIQLSSFQKTLSFFITSLPTIIVVMLIIVSMLAII 190
+ IA++ IL V+P+ + + M L + L ++ T ML+ + +
Sbjct: 181 VVAIAVVSILLSVVVPKVVEQFIHMKQALPLSTRVLMGMSDAVRTFGPWMLLALLAGFMA 240

Query: 191 MKLIYNNLNMLNKIN-FVMKLPLISGYFQLFKTYFVTNELVLFYKNGITLQSIVDVYINH 249
+++ + ++ LPLI + T L + + + L + + +
Sbjct: 241 FRVMLRQEKRRVSFHRRLLHLPLIGRIARGLNTARYARTLSILNASAVPLLQAMRISGDV 300

Query: 250 SS-DPFRQFLGKYLLTYSEMGYGLPQILEKLKCFKPQLIKFVLQGEKRGKLEVELKLYSQ 308
S D R L E G L + LE+ F P + + GE+ G+L+ L+ +
Sbjct: 301 MSNDYARHRLSLATDAVRE-GVSLHKALEQTALFPPMMRHMIASGERSGELDSMLERAAD 359

Query: 309 ILVKQIEDKAIKQTQFLQPILFLILGLFIVAIYLVIMLPMFQMMQSI 355
++ + +P+L + + ++ I L I+ P+ Q+ +
Sbjct: 360 NQDREFSSQMTLALGLFEPLLVVSMAAVVLFIVLAILQPILQLNTLM 406


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SACOL_RS08170  SHIGARICIN  27  0.039  Ribosome inactivating protein family signature.
		>SHIGARICIN#Ribosome inactivating protein family signature.

Length = 289

Score = 27.5 bits (61), Expect = 0.039
Identities = 20/99 (20%), Positives = 38/99 (38%), Gaps = 11/99 (11%)

Query: 82 DFLKDPVKNGADKFKQYGLPIITSKVTPEK-------LNEGSTEIE-GFKFNVLHTPGHS 133
F+ + K + K Y +P++ S + + N I ++ G+
Sbjct: 39 VFISNLRKALPYERKLYDIPLLRSTLPGSQRYALIHLTNYADETISVAIDVTNVYVMGYR 98

Query: 134 PGSLTYVFDEFAVVG--DTLFNNGIGRTDL-YKGDYETL 169
G +Y F+E + +F + + L Y G+YE L
Sbjct: 99 AGDTSYFFNEASATEAAKYVFKDAKRKVTLPYSGNYERL 137


25  SACOL_RS09140  SACOL_RS09165  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
SACOL_RS09140312-0.087288acetoin utilization protein AcuC
SACOL_RS09145614-0.408886catabolite control protein A
SACOL_RS09150516-0.927181hypothetical protein
SACOL_RS09155312-0.263095bifunctional 3-deoxy-7-phosphoheptulonate
SACOL_RS09160412-0.476711hypothetical protein
SACOL_RS09165312-0.595477hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SACOL_RS09165  IGASERPTASE  44  1e-06  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 43.9 bits (103), Expect = 1e-06
Identities = 46/299 (15%), Positives = 105/299 (35%), Gaps = 26/299 (8%)

Query: 72 KTQLEETVAYTKERVEGFLNKSKNEQAALKAQQAAIKEEASANNLSDTSQEAQEIQEAKR 131
+ EE + V + +E A+ + + + N D ++ + +E +
Sbjct: 1011 PSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAK 1070

Query: 132 EAQAEADKSVAVSNK-ESKAVALKAQQAAIKEEASANNLSDTSQEAQEIQEAKKEAQAET 190
EA++ + + +S + + Q KE A+ E +++A+ ET
Sbjct: 1071 EAKSNVKANTQTNEVAQSGSETKETQTTETKETAT--------------VEKEEKAKVET 1116

Query: 191 DKSAAVSNEEPKAVALKAQQAAIKEEASANNLSDTSQEAQEVQEAKKEAQAETDKSAAVS 250
+K+ V + + Q ++ +A +D + KE Q++T+ +A
Sbjct: 1117 EKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNI-------KEPQSQTNTTADTE 1169

Query: 251 NEEPKAVALKAQQAAIKEEASANNLSDISQEAQEVQEAKKEAQAEKDSDTLTKDASAAKV 310
P + + E + N + + + + A + +S K+ V
Sbjct: 1170 Q--PAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSV 1227

Query: 311 --EVSKPESQAERLANAAKQKQAKLTPGSKESQLTEALFAEKPVAKNDLKEIPQLVTKK 367
E + + LT + + L++A + VA N K + Q +++
Sbjct: 1228 RSVPHNVEPATTSSNDRSTVALCDLTSTNTNAVLSDARAKAQFVALNVGKAVSQHISQL 1286



Score = 38.9 bits (90), Expect = 6e-05
Identities = 56/323 (17%), Positives = 106/323 (32%), Gaps = 37/323 (11%)

Query: 185 EAQAETDKSAAVSNEEPKAVALKAQQAAIKEEASANNLSDTS-QEAQEVQEAKKEAQAET 243
E + +T + ++ + + + +E A + A + + A+
Sbjct: 986 EKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSK 1045

Query: 244 DKSAAVSNEEPKAVALKAQQAAIKEEASANNLSDISQ-EAQEVQEAKKEAQAEKDSDTLT 302
+S V E A AQ + +EA +N ++ E + KE Q + +T T
Sbjct: 1046 QESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETAT 1105

Query: 303 KDASA-AKVEVSKPESQAERLANAAKQKQAKLTPGSKESQLTEALFAEKPVAKNDLKEIP 361
+ AKVE K + + ++++P K+ Q +P +ND
Sbjct: 1106 VEKEEKAKVETEKTQEVP--------KVTSQVSP--KQEQSETVQPQAEPAREND----- 1150

Query: 362 QLVTKKNDVSETETVNIDNKDTVKQKEAKFENGVITRKADEKTTNNTAVDKKSGKQSKKT 421
TVNI + A E K T++ + + T
Sbjct: 1151 ------------PTVNIKEPQSQTNTTADTEQP-------AKETSSNVEQPVTESTTVNT 1191

Query: 422 TPSNKRNASKASTNKTSGQKKQHNKKSSQGAKKQSSSSKSTQKNNQTSNKNSKTTNAKSS 481
S N + T + + ++S S T++ N ++T A
Sbjct: 1192 GNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCD 1251

Query: 482 NASKTPNAKVEKAKSKIEKRTFN 504
S NA + A++K + N
Sbjct: 1252 LTSTNTNAVLSDARAKAQFVALN 1274



Score = 36.2 bits (83), Expect = 4e-04
Identities = 52/332 (15%), Positives = 104/332 (31%), Gaps = 20/332 (6%)

Query: 172 TSQEAQEIQEAKKEAQAETDKSAAVSNEEPKAVALKA-QQAAIKEEASANNLSDTSQEAQ 230
S + + A+ + + A +E + VA + Q++ E+ + T+Q +
Sbjct: 1008 PSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNRE 1067

Query: 231 EVQEAKKEAQAETDKSAAVSNEEPKAVALKAQQAAIKEEASANNLSDISQEAQEVQEAKK 290
+EAK +A T + + + + Q KE A + E +E + +
Sbjct: 1068 VAKEAKSNVKANTQTNEV---AQSGSETKETQTTETKETA--------TVEKEEKAKVET 1116

Query: 291 EAQAEKDSDTLTKDASAAKVEVSKPESQAERLANAAKQKQAKLTPGSKESQLTEALFAEK 350
E E T + E +P+++ R + + + + + TE E
Sbjct: 1117 EKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTAD-TEQPAKET 1175

Query: 351 PVAKNDLKEIPQLVTKKNDVSETETVNIDNKDTVKQKEAKFENGVITRKADEKTTNNTAV 410
V N V E N T ++ N R + V
Sbjct: 1176 SSNVEQPVTESTTVNTGNSVVEN-PENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNV 1234

Query: 411 DKKSGKQSKKTTPSNKRNASKASTNKTSGQKKQHNKKSSQGAKKQSSSSKSTQKNNQTSN 470
+ + + ++T + S + S + ++ + +Q +Q
Sbjct: 1235 EPATTSSNDRSTVALCDLTSTNTNAVLS------DARAKAQFVALNVGKAVSQHISQLEM 1288

Query: 471 KNSKTTNAKSSNASKTPNAKVEKAKSKIEKRT 502
N N SN S N + + K T
Sbjct: 1289 NNEGQYNVWVSNTSMNKNYSSSQYRRFSSKST 1320



Score = 35.4 bits (81), Expect = 6e-04
Identities = 56/356 (15%), Positives = 116/356 (32%), Gaps = 27/356 (7%)

Query: 118 DTSQEAQEIQEAKREAQAEADKSVAVSNKESKAVALKAQQAAIKEEASANNLSDTSQEAQ 177
+ Q + E A D++ A A ++ E S QE++
Sbjct: 1001 NNIQADVPSVPSNNEEIARVDEAPV----PPPAPATPSETTETVAENS-------KQESK 1049

Query: 178 EIQEAKKEAQAETDKSAAVSNEEPKAVALKAQQAAIKEEAS-ANNLSDTSQEAQEVQEAK 236
+++ +++A T ++ V+ E V Q + + S T + E +
Sbjct: 1050 TVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKE 1109

Query: 237 KEAQAETDKSAAVSNEEPKAVALKAQQAAIKEEASANNLSDISQEAQEVQEAKKEAQAEK 296
++A+ ET+K+ V + + Q ++ +A +D + +E Q +
Sbjct: 1110 EKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTE 1169

Query: 297 DSDTLTKDASAAKVEVSKPESQAERLANAAKQKQAKLTPGSKESQLTEALFAEKPVAKND 356
T V S + + + T + + +E+ K +
Sbjct: 1170 QPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATT---QPTVNSESSNKPKNRHRRS 1226

Query: 357 LKEIPQLVTKKNDVSETETVNIDNKDTVKQKEAKFEN-GVITRKADEKTTNNTAVDKKSG 415
++ +P V E T + +++ TV + N + A K K+
Sbjct: 1227 VRSVPHNV-------EPATTSSNDRSTVALCDLTSTNTNAVLSDARAKAQFVALNVGKAV 1279

Query: 416 KQSKKTTPSNKRNASKASTNKTSGQKK----QHNKKSSQGAKKQSSSSKSTQKNNQ 467
Q N + TS K Q+ + SS+ + Q ++ N Q
Sbjct: 1280 SQHISQLEMNNEGQYNVWVSNTSMNKNYSSSQYRRFSSKSTQTQLGWDQTISNNVQ 1335


26  SACOL_RS09350  SACOL_RS09390  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias  Product
SACOL_RS09350016-3.528848protein ArsC
SACOL_RS09355-116-4.269933autolysin
SACOL_RS09360113-3.617897hypothetical protein
SACOL_RS09365214-3.991794RNA polymerase sigma factor SigS
SACOL_RS09370216-3.237707competence protein ComK
SACOL_RS09375415-2.982731hypothetical protein
SACOL_RS09380112-3.140477CAAX amino protease
SACOL_RS09385215-3.008625transaldolase
SACOL_RS09390213-2.781291hypothetical protein
27  SACOL_RS09480  SACOL_RS09645  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias  Product
SACOL_RS09480022-3.507885hypothetical protein
SACOL_RS09485522-3.213193hypothetical protein
SACOL_RS09490722-3.733552hypothetical protein
SACOL_RS09495828-4.295513hypothetical protein
SACOL_RS09500826-3.016725hypothetical protein
SACOL_RS09505728-4.910199hypothetical protein
SACOL_RS095101021-7.715595hypothetical protein
SACOL_RS095151021-8.527737hypothetical protein
SACOL_RS095201018-7.584763transposase
SACOL_RS09525512-5.035942hypothetical protein
SACOL_RS09530512-5.150382hypothetical protein
SACOL_RS09535411-4.481271hypothetical protein
SACOL_RS09540-2130.133896specificity determinant HsdS
SACOL_RS09545-2130.704710type I restriction-modification system subunit
SACOL_RS09550-1151.577939hypothetical protein
SACOL_RS09555413-0.134105serine protease SplF
SACOL_RS09560411-0.130496serine protease SplE
SACOL_RS09565311-0.316975serine protease SplD
SACOL_RS09570113-0.816552serine protease SplC
SACOL_RS09575016-1.307964serine protease SplB
SACOL_RS09580112-2.321239serine protease SplA
SACOL_RS09585112-3.129050hypothetical protein
SACOL_RS09590-213-2.350420hypothetical protein
SACOL_RS09595-210-2.792382hypothetical protein
SACOL_RS09605010-4.377794hypothetical protein
SACOL_RS09610010-4.196871bacitracin ABC transporter ATP-binding protein
SACOL_RS09615-111-3.885636peptidase S8
SACOL_RS09620-113-4.585177enterotoxin
SACOL_RS09625-111-3.989410hypothetical protein
SACOL_RS09630-112-3.408570hypothetical protein
SACOL_RS09635316-1.805209lantibiotic epidermin
SACOL_RS09640215-3.053568transposase
SACOL_RS09645213-2.835774lantibiotic epidermin
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SACOL_RS09560  V8PROTEASE  115  6e-33  V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 115 bits (290), Expect = 6e-33
Identities = 60/227 (26%), Positives = 103/227 (45%), Gaps = 26/227 (11%)

Query: 30 IQQTAKA-----ENTVKQITNTNVAPYSGVTWMGA--------GTGFVVGNHTIITNKHV 76
++Q A N QIT+T Y+ VT++ +G VVG T++TNKHV
Sbjct: 61 LEQREHANVILPNNDRHQITDTTNGHYAPVTYIQVEAPTGTFIASGVVVGKDTLLTNKHV 120

Query: 77 TYHM-KVGDEIKAHPNGFY--NNGGGLYKVTKIVDYPGKEDIAVVQVEEKSTQPKGRKFK 133
+KA P+ N G + +I Y G+ D+A+V+ + +
Sbjct: 121 VDATHGDPHALKAFPSAINQDNYPNGGFTAEQITKYSGEGDLAIVKFSP---NEQNKHIG 177

Query: 134 DFTSKFNIA--SEAKENEPISVIGYPNPNGNKLQMYESTGKVLSVNGNIVSSDAIIQPGS 191
+ ++ +E + N+ I+V GYP + M+ES GK+ + G + D G+
Sbjct: 178 EVVKPATMSNNAETQVNQNITVTGYP-GDKPVATMWESKGKITYLKGEAMQYDLSTTGGN 236

Query: 192 SGSPILNSKHEAIGVIYAGNKPSGESTRGFAVYFSPEIKKFIADNLD 238
SGSP+ N K+E IG+ + G AV+ + ++ F+ N++
Sbjct: 237 SGSPVFNEKNEVIGIHWGGVPNEF----NGAVFINENVRNFLKQNIE 279


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SACOL_RS09565  V8PROTEASE  136  8e-41  V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 136 bits (344), Expect = 8e-41
Identities = 63/227 (27%), Positives = 107/227 (47%), Gaps = 27/227 (11%)

Query: 30 IQQTAKA-----EHNVKLIKNTNVAPYNGVVSIGS--------GTGFIVGKNTIVTNKHV 76
++Q A ++ I +T Y V I +G +VGK+T++TNKHV
Sbjct: 61 LEQREHANVILPNNDRHQITDTTNGHYAPVTYIQVEAPTGTFIASGVVVGKDTLLTNKHV 120

Query: 77 VAGMEIGAH-IIAHP---NGEYNNGGFYKVKKIVRYSGQEDIAILHVEDKAVHPKNRNFK 132
V H + A P N + G + ++I +YSG+ D+AI+ +N++
Sbjct: 121 VDATHGDPHALKAFPSAINQDNYPNGGFTAEQITKYSGEGDLAIVKFSPN---EQNKHIG 177

Query: 133 DYTGILKIA--SEAKENERISIVGYPEPYINKFQMYESTGKVLSVKGNMIITDAFVEPGN 190
+ ++ +E + N+ I++ GYP M+ES GK+ +KG + D GN
Sbjct: 178 EVVKPATMSNNAETQVNQNITVTGYPGDK-PVATMWESKGKITYLKGEAMQYDLSTTGGN 236

Query: 191 SGSAVFNSKYEVVGVHFGGNGPGNKSTKGYGVYFSPEIKKFIADNTD 237
SGS VFN K EV+G+H+GG + V+ + ++ F+ N +
Sbjct: 237 SGSPVFNEKNEVIGIHWGGVP----NEFNGAVFINENVRNFLKQNIE 279


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SACOL_RS09570  V8PROTEASE  112  1e-31  V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 112 bits (280), Expect = 1e-31
Identities = 58/227 (25%), Positives = 100/227 (44%), Gaps = 26/227 (11%)

Query: 30 IQQTAKA-----ENSVKLITNTNVAPYSGVTWMGA--------GTGFVVGNQTIITNKHV 76
++Q A N IT+T Y+ VT++ +G VVG T++TNKHV
Sbjct: 61 LEQREHANVILPNNDRHQITDTTNGHYAPVTYIQVEAPTGTFIASGVVVGKDTLLTNKHV 120

Query: 77 TYHM-KVGDEIKAHPNGFY--NNGGGLYKVTKIVDYPGKEDIAVVQVEEKSTQPKGRKFK 133
+KA P+ N G + +I Y G+ D+A+V+ + +
Sbjct: 121 VDATHGDPHALKAFPSAINQDNYPNGGFTAEQITKYSGEGDLAIVKFSP---NEQNKHIG 177

Query: 134 DFTSKFNIA--SEAKENEPISVIGYPNPNGNKLQMYESTGKVLSVNGNIVTSDAVVQPGS 191
+ ++ +E + N+ I+V GYP + M+ES GK+ + G + D G+
Sbjct: 178 EVVKPATMSNNAETQVNQNITVTGYP-GDKPVATMWESKGKITYLKGEAMQYDLSTTGGN 236

Query: 192 SGSPILNSKREAIGVMYASDKPTGESTRSFAVYFSPEIKKFIADNLD 238
SGSP+ N K E IG+ + AV+ + ++ F+ N++
Sbjct: 237 SGSPVFNEKNEVIGIHWGGVPNEFNG----AVFINENVRNFLKQNIE 279


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SACOL_RS09575  V8PROTEASE  179  4e-57  V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 179 bits (454), Expect = 4e-57
Identities = 63/217 (29%), Positives = 105/217 (48%), Gaps = 23/217 (10%)

Query: 37 EKNVTQVKDTNIFPYNGVVSFK--------DATGFVIGKNTIITNKHV-SKDYKVGDRIT 87
+ Q+ DT Y V + A+G V+GK+T++TNKHV + +
Sbjct: 73 NNDRHQITDTTNGHYAPVTYIQVEAPTGTFIASGVVVGKDTLLTNKHVVDATHGDPHALK 132

Query: 88 AHP---NGDKGNGGIYKIKSISDYPGDEDISVMNIEEQAVERGPKGFNFNENVQAFNFAK 144
A P N D G + + I+ Y G+ D++++ + + E V+ +
Sbjct: 133 AFPSAINQDNYPNGGFTAEQITKYSGEGDLAIVKFSPNEQNK-----HIGEVVKPATMSN 187

Query: 145 DA--KVDDKIKVIGYPLPAQNSFKQFESTGTIKRIKDNILNFDAYIEPGNSGSPVLNSNN 202
+A +V+ I V GYP +ES G I +K + +D GNSGSPV N N
Sbjct: 188 NAETQVNQNITVTGYPGDK-PVATMWESKGKITYLKGEAMQYDLSTTGGNSGSPVFNEKN 246

Query: 203 EVIGVVYGGIGKIGSEYNGAVYFTPQIKDFIQKHIEQ 239
EVIG+ +GG + +E+NGAV+ +++F++++IE
Sbjct: 247 EVIGIHWGG---VPNEFNGAVFINENVRNFLKQNIED 280


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SACOL_RS09580  V8PROTEASE  177  2e-56  V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 177 bits (450), Expect = 2e-56
Identities = 64/230 (27%), Positives = 108/230 (46%), Gaps = 29/230 (12%)

Query: 29 EVQQTAKA-----ENNVTKVKDTNIFPYTGVVAFKS--------ATGFVVGKNTILTNKH 75
++Q A N+ ++ DT Y V + A+G VVGK+T+LTNKH
Sbjct: 60 PLEQREHANVILPNNDRHQITDTTNGHYAPVTYIQVEAPTGTFIASGVVVGKDTLLTNKH 119

Query: 76 V-SKNYKVGDRITAHP---NSDKGNGGIYSIKKIINYPGKEDVSVIQVEERAIERGPKGF 131
V + + A P N D G ++ ++I Y G+ D+++++ +
Sbjct: 120 VVDATHGDPHALKAFPSAINQDNYPNGGFTAEQITKYSGEGDLAIVKFSPNEQNK----- 174

Query: 132 NFNDNVTPFKYAAGA--KAGERIKVIGYPHPYKNKYVLYESTGPVMSVEGSSIVYSAHTE 189
+ + V P + A + + I V GYP K ++ES G + ++G ++ Y T
Sbjct: 175 HIGEVVKPATMSNNAETQVNQNITVTGYPGD-KPVATMWESKGKITYLKGEAMQYDLSTT 233

Query: 190 SGNSGSPVLNSNNELVGIHFASDVKNDDNRNAYGVYFTPEIKKFIAENID 239
GNSGSPV N NE++GIH+ V N+ N V+ ++ F+ +NI+
Sbjct: 234 GGNSGSPVFNEKNEVIGIHWGG-VPNEFNG---AVFINENVRNFLKQNIE 279


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SACOL_RS09585  V8PROTEASE  139  7e-42  V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 139 bits (351), Expect = 7e-42
Identities = 66/212 (31%), Positives = 103/212 (48%), Gaps = 18/212 (8%)

Query: 39 EKNVKEITDATKEPYNSVVAF--------VGGTGVVVGKNTIVTNKHIAKSNDIFKNRVS 90
+ +ITD T Y V +GVVVGK+T++TNKH+ + + +
Sbjct: 73 NNDRHQITDTTNGHYAPVTYIQVEAPTGTFIASGVVVGKDTLLTNKHVVDATHGDPHALK 132

Query: 91 AHHS---SKGKGGGNYDVKDIVEYPGKEDLAIVHVHETSTEGLNFNKNVSYTKFADGA-- 145
A S G + + I +Y G+ DLAIV + + + + V ++ A
Sbjct: 133 AFPSAINQDNYPNGGFTAEQITKYSGEGDLAIVK-FSPNEQNKHIGEVVKPATMSNNAET 191

Query: 146 KVKDRISVIGYPKGAQTKYKMFESTGTINHISGTFMEFDAYAQPGNSGSPVLNSKHELIG 205
+V I+V GYP G + M+ES G I ++ G M++D GNSGSPV N K+E+IG
Sbjct: 192 QVNQNITVTGYP-GDKPVATMWESKGKITYLKGEAMQYDLSTTGGNSGSPVFNEKNEVIG 250

Query: 206 ILYAGSGKDESEKNFGVYFTPQLKEFIQNNIE 237
I + G +E N V+ ++ F++ NIE
Sbjct: 251 IHWGGVP---NEFNGAVFINENVRNFLKQNIE 279


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SACOL_RS09615  SUBTILISIN  160  2e-47  Subtilisin serine protease family (S8) signature.
		>SUBTILISIN#Subtilisin serine protease family (S8) signature.

Length = 326

Score = 160 bits (406), Expect = 2e-47
Identities = 83/351 (23%), Positives = 138/351 (39%), Gaps = 73/351 (20%)

Query: 110 SRQWDMNKITNNGASYDDLPKHANTKIAIIDTGVMKNHDDLKNNFSTDSKNLVPLNGFRG 169
+ I A ++ + K+A++DTG +H DLK + G R
Sbjct: 21 EIPRGVEMI-QAPAVWNQT-RGRGVKVAVLDTGCDADHPDLKAR----------IIGGRN 68

Query: 170 TEPEETGDVHDVNDRKGHGTMVSGQTSANG---KLIGVAPNNKFTMYRVFGSKKT-ELLW 225
++ GD D GHGT V+G +A ++GVAP + +V + + + W
Sbjct: 69 FTDDDEGDPEIFKDYNGHGTHVAGTIAATENENGVVGVAPEADLLIIKVLNKQGSGQYDW 128

Query: 226 VSKAIVQAANDGNQVINISVGSYIILDKNDHQTFRKDEKVEYDALQKAINYAKKKKSIVV 285
+ + I A +I++S+G + L +A+ A + +V+
Sbjct: 129 IIQGIYYAIEQKVDIISMSLGGP----------------EDVPELHEAVKKAVASQILVM 172

Query: 286 AAAGNDGIDVNDKQKLKLQREYQGNGEVKDVPASMDNVVTVGSTDQKSNLSEFSNFGMNY 345
AAGN+G + + P + V++VG+ + + SEFSN N
Sbjct: 173 CAAGNEG-------------DGDDRTDELGYPGCYNEVISVGAINFDRHASEFSNSN-NE 218

Query: 346 TDIAAPGGSFAYLNQFGVDKWMNEGYMHKENILTTANNGRYIYQAGTSLATPKVSGALAL 405
D+ APG E+IL+T G+Y +GTS+ATP V+GALAL
Sbjct: 219 VDLVAPG----------------------EDILSTVPGGKYATFSGTSMATPHVAGALAL 256

Query: 406 IIDKYHLEKHPD----KAIELLYQHGTSKNNKPFSRYGHGELDVYKALNVA 452
I + D + L + N P G+G L + ++
Sbjct: 257 IKQLANASFERDLTEPELYAQLIKRTIPLGNSPK-MEGNGLLYLTAVEELS 306


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
SACOL_RS09630 RTXTOXINA 31 0.019 Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 31.5 bits (71), Expect = 0.019
Identities = 22/104 (21%), Positives = 37/104 (35%), Gaps = 1/104 (0%)

Query: 813 SDYEFVSYEPEFFRYGGKNTINEIEAFFEYDTNLAVNIIENDFKFDRPYIVAISIMYLFE 872
+D E G KN I F + +++ + IE F I S+ E
Sbjct: 889 NDLIMYKGEGNVLSIGHKNGITFRNWFEKESGDISNHEIEQIFDKSGRIITPDSLKKALE 948

Query: 873 MFSISNEERMEIVNNYVPTSFKSKDIRPFKNELVTICNPANNFE 916
+ N + + N D+ P NE+ I + A +F+
Sbjct: 949 -YQQRNNKASYVYGNDALAYGSQGDLNPLINEISKIISAAGSFD 991


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
SACOL_RS09635 GALLIDERMIN 47 7e-12 Gallidermin signature.
		>GALLIDERMIN#Gallidermin signature.

Length = 52

Score = 47.4 bits (112), Expect = 7e-12
Identities = 29/46 (63%), Positives = 34/46 (73%), Gaps = 1/46 (2%)

Query: 2 EKVLDLDVQVKANNNSNDSAGDERITSHSLCTPGCAKTGSFNSFCC 47
++ DLDV+V A SNDS + RI S LCTPGCAKTGSFNS+CC
Sbjct: 8 NELFDLDVKVNAKE-SNDSGAEPRIASKFLCTPGCAKTGSFNSYCC 52


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
SACOL_RS09645 GALLIDERMIN 29 1e-04 Gallidermin signature.
		>GALLIDERMIN#Gallidermin signature.

Length = 52

Score = 29.3 bits (65), Expect = 1e-04
Identities = 16/38 (42%), Positives = 22/38 (57%), Gaps = 1/38 (2%)

Query: 2 EKVLDLDVQVKGNNNTNDSAGDERITSHLFCSFGCEKT 39
++ DLDV+V +NDS + RI S C+ GC KT
Sbjct: 8 NELFDLDVKVNAKE-SNDSGAEPRIASKFLCTPGCAKT 44


28 SACOL_RS09755 SACOL_RS09780 Y Y N Pathogenicity Island (biased-composition)
LocusTag DNBias CDNBias %GCBias Product
SACOL_RS09755 3 10 -2.547144 cell-cycle regulation protein HIT
SACOL_RS09760 2 10 -2.692424 hypothetical protein
SACOL_RS09765 2 9 -3.004584 hypothetical protein
SACOL_RS09770 4 11 -2.374800 foldase
SACOL_RS09775 2 9 -2.575353 3'-5' exoribonuclease YhaM
SACOL_RS09780 2 10 -2.879705 DNA double-strand break repair Rad50 ATPase
ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
SACOL_RS09775 SSPANPROTEIN 29 0.035 Salmonella invasion protein InvJ signature.
		>SSPANPROTEIN#Salmonella invasion protein InvJ signature.

Length = 336

Score = 28.6 bits (63), Expect = 0.035
Identities = 12/31 (38%), Positives = 19/31 (61%)

Query: 146 PAASSHHHNFASGLSYHVLTMLRIAKSICDI 176
PA S HH+ SGL ++ + LRIA+ + +
Sbjct: 72 PAKSEHHNGNVSGLHHNGKSELRIAEKLLKV 102


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
SACOL_RS09780 GPOSANCHOR 36 9e-04 Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 35.8 bits (82), Expect = 9e-04
Identities = 49/311 (15%), Positives = 104/311 (33%), Gaps = 39/311 (12%)

Query: 138 RNLNEKQLQDYLLQAGALGSTEFTSMREVINRKKDELYKKSGKNPIINQQIEQLKQLESQ 197
N + + D AL + E ++ K++L K ++++ ++++LE++
Sbjct: 66 NNTLKLKNSDLSFNNKAL-KDHNDELTEELSNAKEKLRKNDKS---LSEKASKIQELEAR 121

Query: 198 IREEEAKLETYHRLVDDRDKSSRRLENLKHNL--------NQLSKMHEEKQKEVALHDHS 249
+ E LE + LE K L L + A
Sbjct: 122 KADLEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTL 181

Query: 250 QEWKSLEQQLNIEPITFPEKGVDRYEKARAHKQSLERDIGLRNERLAQLKEEATQLEPVK 309
+ K+ + E E ++ A ++LE + R A L++
Sbjct: 182 EAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFS 241

Query: 310 QSDIDAFISLNQQENEIKNKEFELTAIEK-----------DIANKQRDKDELQ------- 351
+D +L ++ ++ ++ EL + I + +K L+
Sbjct: 242 TADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLE 301

Query: 352 -----SNIGWSETHHDVDSSEAMKSYVSEQIKNKQEQA----AYIKQLERSLEENKIEDN 402
N D+D+S K + + + +EQ A + L R L+ ++
Sbjct: 302 HQSQVLNANRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKK 361

Query: 403 AVHSELDSVEE 413
+ +E +EE
Sbjct: 362 QLEAEHQKLEE 372



Score = 31.6 bits (71), Expect = 0.020
Identities = 38/241 (15%), Positives = 78/241 (32%), Gaps = 17/241 (7%)

Query: 163 MREVINRKKDELYKKSGKNPIINQQIEQLKQLESQIREEEAKLET-YHRLVDDRDKSSRR 221
+ + + + S K + + L ++ + +
Sbjct: 195 LEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAE 254

Query: 222 LENLKHNLNQLSKMHEEKQKEVALHDHSQEWKSLEQQLNIEPITFPEKGVDRYEKARAHK 281
L+ +L K E + E+ + + A++
Sbjct: 255 KAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKA---ALEAEKADLEHQSQVLNANR 311

Query: 282 QSLERDIGLRNERLAQLKEEATQLEPVKQSDIDAFIS----LNQQENEIKNKEFELTAIE 337
QSL RD+ E QL+ E +LE + + S L+ K E E +E
Sbjct: 312 QSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQLEAEHQKLE 371

Query: 338 KDIANKQRDKDELQSNIGWSETHHDVDSSEAMKSYVSEQIKNKQEQAAYIKQLERSLEEN 397
+ + + L+ D+D+S K V + ++ + A +++L + LEE+
Sbjct: 372 EQNKISEASRQSLRR---------DLDASREAKKQVEKALEEANSKLAALEKLNKELEES 422

Query: 398 K 398
K
Sbjct: 423 K 423


29 SACOL_RS10390 SACOL_RS10565 Y Y Y Pathogenicity Island (biased-composition)
LocusTag DNBias CDNBias %GCBias Product
SACOL_RS10390 0 11 -3.244036 choloylglycine hydrolase
SACOL_RS10395 0 11 -4.493297 hypothetical protein
SACOL_RS10400 0 11 -5.003324 hypothetical protein
SACOL_RS10405 1 11 -4.545547 membrane protein
SACOL_RS10410 1 12 -4.071557 thioredoxin
SACOL_RS10415 0 13 -4.693998 membrane protein
SACOL_RS10420 2 15 -4.161386 ABC transporter ATP-binding protein
SACOL_RS10425 0 12 -2.605912 membrane protein
SACOL_RS10430 1 14 -1.991112 ABC transporter ATP-binding protein
SACOL_RS10435 -1 14 -3.641166 GntR family transcriptional regulator
SACOL_RS10440 0 15 -3.796396 hypothetical protein
SACOL_RS10445 -2 15 -2.740688 hypothetical protein
SACOL_RS10450 -1 13 -2.830704 hypothetical protein
SACOL_RS10455 0 14 -3.694076 MAP domain-containing protein
SACOL_RS10460 -1 13 -2.784333 MAP domain-containing protein
SACOL_RS10465 0 14 -2.325243 sphingomyelin phosphodiesterase
SACOL_RS10470 1 14 -2.901641 gamma-hemolysin subunit B
SACOL_RS10475 1 14 -3.710935 succinyl-diaminopimelate desuccinylase
SACOL_RS10480 1 17 -4.067500 hypothetical protein
SACOL_RS10485 2 19 -4.396476 ferrichrome-binding protein FhuD
SACOL_RS10490 3 18 -4.904134 potassium transporter KtrB
SACOL_RS10495 4 18 -4.448417 GNAT family acetyltransferase
SACOL_RS10500 5 22 -3.827840 hypothetical protein
SACOL_RS10505 0 13 -0.079669 phage terminase small subunit
SACOL_RS10510 0 12 0.876465 hypothetical protein
SACOL_RS10515 -2 10 0.656228 integrase
SACOL_RS10520 0 10 1.258238 molecular chaperone GroEL
SACOL_RS10525 0 11 0.998999 co-chaperone GroES
SACOL_RS10530 0 11 0.851495 CAAX amino protease
SACOL_RS10540 4 11 -1.587196 hypothetical protein
SACOL_RS10545 4 13 -1.158767 nitroreductase
SACOL_RS10550 -2 12 -4.442917 hydrolase in agr operon
SACOL_RS10555 0 13 -4.363235 delta-hemolysin
SACOL_RS10565 0 12 -3.538316 accessory gene regulator protein B
ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
SACOL_RS10475 BICOMPNTOXIN 217 1e-70 Staphylococcal bi-component toxin signature.
		>BICOMPNTOXIN#Staphylococcal bi-component toxin signature.

Length = 315

Score = 217 bits (553), Expect = 1e-70
Identities = 84/320 (26%), Positives = 145/320 (45%), Gaps = 18/320 (5%)

Query: 11 ICTLALSTTFTVLPATSFAKINSEIKQVSEKNLDGDTKMYTRTATTSDSQKNITQSLQFN 70
I T LS + A + + D ++ RT + ++ +TQ++QF+
Sbjct: 6 ILTTTLSVSLLAPLANPLLENAKAANDTEDIGKGSDIEIIKRTEDKTSNKWGVTQNIQFD 65

Query: 71 FLTEPNYDKETVFIKAKGTIGSGLRILDPNGY-WNSTLRWPGSYSVSIQNVDDNNNTNVT 129
F+ + Y+K+ + +K +G I S + +RWP Y++ ++ ++ ++
Sbjct: 66 FVKDKKYNKDALILKMQGFISSRTTYYNYKKTNHVKAMRWPFQYNIGLKT--NDKYVSLI 123

Query: 130 DFAPKNQDESREVKYTYGYKTGGDFSINRGGLTGNITKESNYSETISYQQPSYRTLLDQS 189
++ PKN+ ES V T GY GG+F L GN + NYS++ISY Q +Y + ++Q
Sbjct: 124 NYLPKNKIESTNVSQTLGYNIGGNFQSAPS-LGGNGSF--NYSKSISYTQQNYVSEVEQQ 180

Query: 190 TSHKGVGWKVEAHLINNMGHDHTRQLTNDSDNRTKSEIFSLTRNGNLWAKDNFTPKDKMP 249
K V W V+A+ + S++F + + +D F P ++P
Sbjct: 181 N-SKSVLWGVKANSFATESGQKSAF---------DSDLFVGYKPHSKDPRDYFVPDSELP 230

Query: 250 VTVSEGFNPEFLAVMSHDKKDKGKSQFVVHYKRSMDEFKIDWNRHGFWG-YWSGENHVDK 308
V GFNP F+A +SH+K S+F + Y R+MD + Y G +
Sbjct: 231 PLVQSGFNPSFIATVSHEKGSSDTSEFEITYGRNMDVTHAIKRSTHYGNSYLDGHRVHNA 290

Query: 309 -KEEKLSALYEVDWKTHNVK 327
+ YEV+WKTH +K
Sbjct: 291 FVNRNYTVKYEVNWKTHEIK 310


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
SACOL_RS10480 BICOMPNTOXIN 165 1e-50 Staphylococcal bi-component toxin signature.
		>BICOMPNTOXIN#Staphylococcal bi-component toxin signature.

Length = 315

Score = 165 bits (419), Expect = 1e-50
Identities = 99/343 (28%), Positives = 157/343 (45%), Gaps = 42/343 (12%)

Query: 4 KKRVLIASSLSCAILLLSAATTQANSAHKDSQDQNKKEHVDKSQQKDKRNVTNKDKNSTA 63
K ++ ++LS ++L A N+
Sbjct: 2 LKNKILTTTLSVSLLAPLANPLLENAKAA-----------------------------ND 32

Query: 64 PDDIGKNGKIT--KRTETVYDEKTNILQNLQFDFIDDPTYDKNVLLVKKQGSIHSNLKFE 121
+DIGK I KRTE K + QN+QFDF+ D Y+K+ L++K QG I S +
Sbjct: 33 TEDIGKGSDIEIIKRTEDKTSNKWGVTQNIQFDFVKDKKYNKDALILKMQGFISSRTTYY 92

Query: 122 SHKEEKNSNWLKYPSEYHVDFQVKRNRKTEILDQLPKNKISTAKVDSTFSYSSGGKFDST 181
++K+ + +++P +Y++ + ++ +++ LPKNKI + V T Y+ GG F S
Sbjct: 93 NYKKTNHVKAMRWPFQYNIGLKTN-DKYVSLINYLPKNKIESTNVSQTLGYNIGGNFQSA 151

Query: 182 KGIGRTSSNSYSKTISYNQQNYDTIASGKNNNWHVHWSVIANDLKYGGEVKNRNDELLFY 241
+G S +YSK+ISY QQNY + + N+ V W V AN K+ D LF
Sbjct: 152 PSLGGNGSFNYSKSISYTQQNYVSEVE-QQNSKSVLWGVKANSFATESGQKSAFDSDLFV 210

Query: 242 RNTRIATVENPELSFASKYRYPALVRSGFNPEFLTYLSNEK-SNEKTQFEVTYTRNQDIL 300
+ +P F P LV+SGFNP F+ +S+EK S++ ++FE+TY RN D+
Sbjct: 211 GYKPHSK--DPRDYFVPDSELPPLVQSGFNPSFIATVSHEKGSSDTSEFEITYGRNMDVT 268

Query: 301 KNR------PGIHYAPPILEKNKDGQRLIVTYEVDWKNKTVKV 337
+ + + V YEV+WK +KV
Sbjct: 269 HAIKRSTHYGNSYLDGHRVHNAFVNRNYTVKYEVNWKTHEIKV 311


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
SACOL_RS10495 FERRIBNDNGPP 60 1e-12 Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 60.3 bits (146), Expect = 1e-12
Identities = 48/248 (19%), Positives = 95/248 (38%), Gaps = 21/248 (8%)

Query: 48 PKRVAVLTGFYVGDFIKLGIKPIAVSDITK-DSSILKPYL-KGVDYIG---ENDVERVAK 102
P R+ L V + LGI P V+D + +P L V +G E ++E + +
Sbjct: 35 PNRIVALEWLPVELLLALGIVPYGVADTINYRLWVSEPPLPDSVIDVGLRTEPNLELLTE 94

Query: 103 AKPDLIVVDA-MDKNIKKYQKIAPTIPYTYNKYNH-----KEILKEIGKLTNNEDKAKKW 156
KP +V A + + +IAP + ++ ++ L E+ L N + A+
Sbjct: 95 MKPSFMVWSAGYGPSPEMLARIAPGRGFNFSDGKQPLAMARKSLTEMADLLNLQSAAETH 154

Query: 157 IEEWDDKTRKDKKEIQSKIGQATASVFEPDEKQIYIYNSTWGRGLDIVHDAFGMPMTKQY 216
+ +++D R K + + D + + ++ + D +G+P Q
Sbjct: 155 LAQYEDFIRSMKPRFVKRGARPLLLTTLIDPRHMLVFGP--NSLFQEILDEYGIPNAWQG 212

Query: 217 KDKLQEDKKGYASISKENISKYA-GDYIFLSKPSYGKFD-FEKTHTWQNIEAVKKGHVIS 274
+ + G ++S + ++ Y D + + D T WQ + V+ G
Sbjct: 213 ----ETNFWGSTAVSIDRLAAYKDVDVLCFDHDNSKDMDALMATPLWQAMPFVRAGRF-- 266

Query: 275 YKAEDYWF 282
+ WF
Sbjct: 267 QRVPAVWF 274


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
SACOL_RS10505 SACTRNSFRASE 27 0.026 Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 26.8 bits (59), Expect = 0.026
Identities = 16/61 (26%), Positives = 30/61 (49%), Gaps = 2/61 (3%)

Query: 76 EYMRILAFVIHSEFRKKGYGKRLLADSEEFSKRLNCKAITLNSGNRNERLSAHKLYSDNG 135
Y I + ++RKKG G LL + E++K + + L + + N +SA Y+ +
Sbjct: 88 GYALIEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDIN--ISACHFYAKHH 145

Query: 136 Y 136
+
Sbjct: 146 F 146


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
SACOL_RS10545 TONBPROTEIN 50 6e-09 Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 49.6 bits (118), Expect = 6e-09
Identities = 27/92 (29%), Positives = 36/92 (39%), Gaps = 5/92 (5%)

Query: 111 QNPSPNPKPDPDNPKPKPDPKPDPDKPKPNPDPKPDPDNPKPNPDPKPDPDKPK-PNPDP 169
+ P P +P+P+P+P P+ PK P P PKP P PKP + P D
Sbjct: 56 EPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPK-PKPKPKPKPVKKVQEQPKRDV 114

Query: 170 KP---DPDKPKPNPNPKPDPNKPNPNPSPDPD 198
KP P P N P + + P
Sbjct: 115 KPVESRPASPFENTAPARLTSSTATAATSKPV 146



Score = 46.1 bits (109), Expect = 7e-08
Identities = 32/110 (29%), Positives = 37/110 (33%), Gaps = 8/110 (7%)

Query: 98 QNPSTDSKPDPNNQNPSPNPKPDPDNPKPKPDPKPDPDKPKPNPDPKPDPDNPK-PNPDP 156
+ P P P P P+P P+ PK P P KPKP P PKP + P D
Sbjct: 56 EPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKP-KPKPKPKPKPVKKVQEQPKRDV 114

Query: 157 KP---DPDKPKPNPDPKPDPDKPKPNPNPKP---DPNKPNPNPSPDPDQP 200
KP P P N P KP + P P P
Sbjct: 115 KPVESRPASPFENTAPARLTSSTATAATSKPVTSVASGPRALSRNQPQYP 164



Score = 39.2 bits (91), Expect = 1e-05
Identities = 29/115 (25%), Positives = 35/115 (30%), Gaps = 6/115 (5%)

Query: 80 NSRDANPDSNNVKPDSNNQNPSTDSKPDPNNQNPSPNPKPDPDNPKPKPDPKPDPDKPK- 138
D P P P + +P P +P P PKPKP PKP +
Sbjct: 51 TPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPK-PKPKPKPKPVKKVQEQ 109

Query: 139 PNPDPKP---DPDNPKPNPDPKPDPDKPKPNPDPKPDPDKPK-PNPNPKPDPNKP 189
P D KP P +P N P KP P + P P
Sbjct: 110 PKRDVKPVESRPASPFENTAPARLTSSTATAATSKPVTSVASGPRALSRNQPQYP 164


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
SACOL_RS10565 PF04647 132 2e-41 Accessory gene regulator B
		>PF04647#Accessory gene regulator B

Length = 212

Score = 132 bits (335), Expect = 2e-41
Identities = 27/173 (15%), Positives = 68/173 (39%), Gaps = 7/173 (4%)

Query: 18 RNNLDHIQFLQVRLGMQVLAKNIGKLIVMYTIAYILNIFLFTLITNLTFYLIRRHAHGAH 77
+ ++R G++V + ++I++ +A+++ + L+ + RR + GAH
Sbjct: 14 DRSDYPFNQEEIRYGIEVFLGTVFQIIIILLVAFVIGLAKEVAFCLLSAAVYRRFSGGAH 73

Query: 78 APSSFWCYVESIILFILLPLVIVNFHINFLIMIILTVISLGVISV--YAPAATKKKPIPV 135
+ C + S+++F +L + + ++IL ++++ P + I
Sbjct: 74 CEKYYRCTLTSLLVFNVLAYIAHLIDPAYFQLLILIAFITSLLALLFLVPVDNPRNLISN 133

Query: 136 RLIKRKKYYAIIVSLTLFIITLII-----KEPFAQFIQLGIIIEAITLLPIFF 183
++ + L + I A I LG++ + TL +
Sbjct: 134 TEQRKTLKLKTSMVLMVLFGGSIGAYRLYTHQIALAILLGVLWQTFTLTALGH 186


30 SACOL_RS11180 SACOL_RS11340 Y Y Y Pathogenicity Island (biased-composition)
LocusTag DNBias CDNBias %GCBias Product
SACOL_RS11180 2 13 -1.901054 oxidoreductase
SACOL_RS11185 1 13 -4.142971 transcriptional regulator
SACOL_RS11190 1 13 -4.901089 cation transporter
SACOL_RS11195 0 15 -5.071257 lytic regulatory protein
SACOL_RS11200 -1 18 -2.895671 hypothetical protein
SACOL_RS11205 -2 11 -1.131712 resolvase
SACOL_RS11210 -2 10 -0.348644 hypothetical protein
SACOL_RS11215 -2 11 0.877139 hypothetical protein
SACOL_RS11220 -2 11 1.163518 haloacid dehalogenase
SACOL_RS11225 -1 13 1.071702 ABC transporter ATP-binding protein
SACOL_RS11230 7 14 1.421340 glutamine--fructose-6-phosphate
SACOL_RS11235 7 12 1.370501 PTS mannitol transporter subunit IICB
SACOL_RS11240 8 12 0.944829 PTS lactose transporter subunit IIB
SACOL_RS11245 9 14 1.076639 mannitol-specific phosphotransferase enzyme IIA
SACOL_RS11250 9 14 0.824791 mannitol-1-phosphate 5-dehydrogenase
SACOL_RS11255 7 13 0.612150 hypothetical protein
SACOL_RS11260 2 12 -1.330233 phosphoglucosamine mutase
SACOL_RS11265 2 14 -1.086572 hypothetical protein
SACOL_RS11270 1 15 -0.427844 ABC transporter permease
SACOL_RS11275 2 16 -1.073241 arginase
SACOL_RS11280 2 15 -1.349676 transposase
SACOL_RS11330 3 14 -1.088856 ******hypothetical protein
SACOL_RS11335 2 12 -1.123789 chromosome partitioning protein ParA
SACOL_RS11340 2 13 -1.789045 MFS transporter
ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
SACOL_RS11180 NUCEPIMERASE 36 6e-05 Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 36.3 bits (84), Expect = 6e-05
Identities = 32/164 (19%), Positives = 58/164 (35%), Gaps = 31/164 (18%)

Query: 1 MNILVIGANGGVGSLLVQQLAKENVPFTAGVRQSDQLN-------------ALKSQGMKA 47
M LV GA G +G + ++L + V D LN L G +
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAG----HQVVGIDNLNDYYDVSLKQARLELLAQPGFQF 56

Query: 48 ILVDVEN-DSIETLTETFKPFDKVIFSVGSGGNTGA----DKTIIVDLDGAVKSMIASKE 102
+D+ + + + L + F++V S + +L G + + +
Sbjct: 57 HKIDLADREGMTDLFASGH-FERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRH 115

Query: 103 ANIKHYVMVST---YDSRRQ---AFDDSGD--LKPYTIAKHYAD 138
I+H + S+ Y R+ + DDS D + Y K +
Sbjct: 116 NKIQHLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANE 159


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
SACOL_RS11240 HTHFIS 30 0.044 FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 29.8 bits (67), Expect = 0.044
Identities = 24/130 (18%), Positives = 50/130 (38%), Gaps = 12/130 (9%)

Query: 13 LLIKYHGQYITIHDIAQQLAVSSRTIHRELKGVEAYLTSFSLTLERANKKGLRIAGTDSD 72
L Y IT I + + S ++ A S SL++ +A ++ +R
Sbjct: 365 LTALYPQDVITREIIENE--LRSEIPDSPIEKAAA--RSGSLSISQAVEENMR-----QY 415

Query: 73 LNDLKQSIAQHQTIDLSVEE-QKVIIIYALIQAKEPVKQYSLAQEIGVSVQTLAKMLDDL 131
++ D + E + +I+ AL + + A +G++ TL K + +L
Sbjct: 416 FASFGDALPPSGLYDRVLAEMEYPLILAALTATRGNQIK--AADLLGLNRNTLRKKIREL 473

Query: 132 ELDLNKYQLS 141
+ + + S
Sbjct: 474 GVSVYRSSRS 483


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
SACOL_RS11255 IGASERPTASE 46 2e-06 IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 45.8 bits (108), Expect = 2e-06
Identities = 59/313 (18%), Positives = 105/313 (33%), Gaps = 20/313 (6%)

Query: 2139 PQANNNSSVDASTNSPTMDNDVTSKPEVESTNNG---TTDKPVTETDNATPAESTTNN-- 2193
P+ + +TN T +N P V S N + PV ATP+E+T
Sbjct: 983 PEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAE 1042

Query: 2194 ----NSTTTATNENAPTGSTATAPTTASTEAASSADSKDNASVNDSKQNAEVNNSAESQS 2249
S T NE T +TA A ++ + V S + + E++
Sbjct: 1043 NSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKE 1102

Query: 2250 TNDKVAQPKS--ENKAKAEKDGSDSTNQSMVESTTETLPSADITEPNVPSNTSKDKEEST 2307
T + K+ E + E S E + P A+ N P+ K+ + T
Sbjct: 1103 TATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQT 1162

Query: 2308 TNQTDAGQLKSET--NVASNEADKSPSKADTEVSNKPSTSASSEAKEKMTS------TNL 2359
D Q ET NV + + V P + + + + S N
Sbjct: 1163 NTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNR 1222

Query: 2360 SQKDDTATADTNDTQKSVGSAANNKATQNDGANASPATVSNGSNSANQDMLNVT-NTDDH 2418
++ + + + + + A + + + A +S+ A LNV H
Sbjct: 1223 HRRSVRSVPHNVEPATTSSNDRSTVALCDLTSTNTNAVLSDARAKAQFVALNVGKAVSQH 1282

Query: 2419 QAKTKSAQQGKVN 2431
++ + +G+ N
Sbjct: 1283 ISQLEMNNEGQYN 1295



Score = 37.4 bits (86), Expect = 0.001
Identities = 46/280 (16%), Positives = 92/280 (32%), Gaps = 6/280 (2%)

Query: 929 RKQEIQNSNASTTEEKQAAYTELDTKKQE-ARTNLDAANTNSDVTTAKDNSIAAINQVQ- 986
R Q + +N +T QA + + +E AR + + T ++ A N Q
Sbjct: 988 RNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQE 1047

Query: 987 AATTKKSDAKAEIAQKASERKTAIEAMNDSTTEEQQAAKDKVDQAVV-TANADIDNAAAN 1045
+ T +K++ A A R+ A EA ++ Q + T + A
Sbjct: 1048 SKTVEKNEQDATET-TAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATV 1106

Query: 1046 NDVDNAKTTNEATIAAITPDANVKPAAKQAIADKVQAQETAIDGNNGSTTEEKAAAKQQV 1105
+ AK E T + V P KQ ++ VQ Q N+ + ++ ++
Sbjct: 1107 EKEEKAKVETEKTQEVPKVTSQVSP--KQEQSETVQPQAEPARENDPTVNIKEPQSQTNT 1164

Query: 1106 QTEKTTADAAIDAAHTNAEVEAAKKAAIAKIEAIQPATTTKDNAKEAIATKANERKTAIA 1165
+ + E+ + TT + +N+ K
Sbjct: 1165 TADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHR 1224

Query: 1166 QTQDITAEEIAAANADVDNAVTQANSNIEAANSQNDVDQA 1205
++ + A ++ T A ++ + N+ + A
Sbjct: 1225 RSVRSVPHNVEPATTSSNDRSTVALCDLTSTNTNAVLSDA 1264



Score = 36.2 bits (83), Expect = 0.002
Identities = 33/231 (14%), Positives = 66/231 (28%), Gaps = 10/231 (4%)

Query: 36 ASAAEQNQPAQNQPAQPADANTQPNANAGAQANPTAQPAAPANQGQPAVQPANQGGQANP 95
E + A + + A T+ + + + QP +PA +
Sbjct: 1095 TQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVN 1154

Query: 96 AGGAAQPNTQPAGQGNQADPNNAAQAQPGNQATPANQAGQ--GNNQATPNNNATPANQTQ 153
A A ++ QP ++T N N + T P ++
Sbjct: 1155 IKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSE 1214

Query: 154 PANAPAA-------AQPAAPVAANAQTQDPNASNTGE-GSINTTLTFDDPAISTDENRQD 205
+N P + P A + D + + S NT D +
Sbjct: 1215 SSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCDLTSTNTNAVLSDARAKAQFVALN 1274

Query: 206 PTVTVTDKVNGYSLINNGKIGFVNSELRRSDMFDKNNPQNYQAKGNVAALG 256
V+ ++ + N G+ S + + + + + +K LG
Sbjct: 1275 VGKAVSQHISQLEMNNEGQYNVWVSNTSMNKNYSSSQYRRFSSKSTQTQLG 1325



Score = 36.2 bits (83), Expect = 0.002
Identities = 57/309 (18%), Positives = 101/309 (32%), Gaps = 12/309 (3%)

Query: 1038 DIDNAAANNDVDNAKTTNEATIAAITPDANVKPAAKQAIADKVQAQETAIDGNNGSTTEE 1097
D+ N TTN T I D P+ + IA +V +
Sbjct: 979 DLYNPEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIA-RVDEAPVPPPAPATPSETT 1037

Query: 1098 KAAAKQQVQTEKTTADAAIDAAHTNAEVEAAKKAAIAKIEAIQPATTTKDNAKEAIATKA 1157
+ A+ Q KT DA T A+ K A + ++A + E T+
Sbjct: 1038 ETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQT 1097

Query: 1158 NERKTAIAQTQDITAEEIAAANADVDNAVTQANSNIEAANSQNDVDQAKTTGENSIDQVT 1217
E K +T + EE A + V + S + Q++ Q + E + +
Sbjct: 1098 TETK----ETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQ--AEPARENDP 1151

Query: 1218 PTVNKKATARNEITAILNNKLQEIQATPDATDEEKQAADAEANTENGKANQAISAATTNA 1277
K+ ++ TA +E + + E + + N + ATT
Sbjct: 1152 TVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENT--TPATTQP 1209

Query: 1278 QVDEAKANAEAAINAVTPKVVKKQ---AAKDEIDQLQATQTNVINNDQNATTEEKEAAIQ 1334
V+ +N + + + V A D+ ++ + + NA + A Q
Sbjct: 1210 TVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCDLTSTNTNAVLSDARAKAQ 1269

Query: 1335 QLATAVTDA 1343
+A V A
Sbjct: 1270 FVALNVGKA 1278


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
SACOL_RS11340 TCRTETB 144 3e-40 Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 144 bits (364), Expect = 3e-40
Identities = 98/416 (23%), Positives = 194/416 (46%), Gaps = 14/416 (3%)

Query: 7 TTRRRNFIVAVMLISAFVAILNQTLLNTALPSIMRELNINESTSQWLVTGFMLVNGVMIP 66
+ R N I+ + I +F ++LN+ +LN +LP I + N +++ W+ T FML +
Sbjct: 8 SNLRHNQILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTA 67

Query: 67 LTAYLMDRIKTRPLYLAAMGTFLLGSIVAALAPN-FGVLMLARVIQAMGAGVLMPLMQFT 125
+ L D++ + L L + GS++ + + F +L++AR IQ GA L+
Sbjct: 68 VYGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVV 127

Query: 126 LFTLFSKEHRGFAMGLAGLVIQFAPAIGPTVTGLIIDQASWRVPFIIIVGIAILAFVFGL 185
+ KE+RG A GL G ++ +GP + G+I W +++++ + + V L
Sbjct: 128 VARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHW--SYLLLIPMITIITVPFL 185

Query: 186 VSISSYNEVKYTKLDKRSVMYSTIGFGLMLYAFSSAGDLGFTSPIVIGALILSMVIIYLF 245
+ + D + ++ ++G + FT+ I LI+S++ +F
Sbjct: 186 MKLLKKEVRIKGHFDIKGIILMSVGIVFFML---------FTTSYSISFLIVSVLSFLIF 236

Query: 246 IRRQFNITNALLNLRVFKNRTFALCTISSMIIMMSMVGPALLIPLYVQNSLSLSALLSGL 305
++ +T+ ++ + KN F + + II ++ G ++P +++ LS G
Sbjct: 237 VKHIRKVTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGS 296

Query: 306 VIM-PGAIINGIMSVFTGKFYDKYGPRPLIYTGFTILTITTIMLCFLHTDTSYTYLIVVY 364
VI+ PG + I G D+ GP ++ G T L+++ + FL TS+ I++
Sbjct: 297 VIIFPGTMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIV 356

Query: 365 AIRMFSVSLLMMPINTTGINSLRNEEISHGTAIMNFGRVMAGSLGTALMVTLMSFG 420
+ S I+T +SL+ +E G +++NF ++ G A++ L+S
Sbjct: 357 FVLGGL-SFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLSIP 411


31 SACOL_RS12115 SACOL_RS12205 Y Y N Pathogenicity Island (biased-composition)
LocusTag DNBias CDNBias %GCBias Product
SACOL_RS12115 -1 16 -3.470022 inositol monophosphatase
SACOL_RS12120 0 12 -2.532834 DNA-binding transcriptional regulator
SACOL_RS12125 1 11 -2.261101 transporter
SACOL_RS12130 1 11 -2.074205 CAAX amino protease
SACOL_RS12135 2 11 -1.796623 hypothetical protein
SACOL_RS12140 1 10 -0.908215 RpiR family transcriptional regulator
SACOL_RS12145 3 11 0.247278 alanine glycine permease
SACOL_RS12150 1 10 -0.746309 hypothetical protein
SACOL_RS12155 2 13 0.384673 hypothetical protein
SACOL_RS12160 1 12 -0.107332 hypothetical protein
SACOL_RS12165 0 12 -0.026719 haloacid dehalogenase
SACOL_RS12170 1 11 0.442314 sodium transporter
SACOL_RS12175 0 10 -0.201814 hypothetical protein
SACOL_RS12180 0 9 1.079980 PTS alpha-glucoside transporter subunit IIBC
SACOL_RS12185 -1 10 1.210498 RpiR family transcriptional regulator
SACOL_RS12190 -1 11 2.235586 hypothetical protein
SACOL_RS12195 0 9 3.324691 sodium:proton antiporter
SACOL_RS12200 -2 7 2.864366 hypothetical protein
SACOL_RS12205 -2 8 3.113280 oxidoreductase
ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
SACOL_RS12115 FLGFLIH 29 0.012 Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 29.4 bits (65), Expect = 0.012
Identities = 19/56 (33%), Positives = 26/56 (46%), Gaps = 5/56 (8%)

Query: 6 LQQIDKLICSWLKQIDNVIPQLIMEMTTETKRHRFDLVTNVD-----KQIQQQFQQ 56
+QQ+ + L +D+VI +M+M E R VD KQIQQ QQ
Sbjct: 98 MQQLVSEFQTTLDALDSVIASRLMQMALEAARQVIGQTPTVDNSALIKQIQQLLQQ 153


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
SACOL_RS12120 ARGREPRESSOR 28 0.020 Bacterial arginine repressor signature.
		>ARGREPRESSOR#Bacterial arginine repressor signature.

Length = 149

Score = 27.9 bits (62), Expect = 0.020
Identities = 19/55 (34%), Positives = 26/55 (47%), Gaps = 7/55 (12%)

Query: 1 MNKAERQNLIITAIQQNKKMTALELAKYC-----NVSKRTILRDIDDLENQGVKI 50
MNK +R I I N+ T EL NV++ T+ RDI +L VK+
Sbjct: 1 MNKGQRHIKIREIITANEIETQDELVDILKKDGYNVTQATVSRDIKEL--HLVKV 53


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
SACOL_RS12205 DHBDHDRGNASE 93 2e-24 2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 93.2 bits (231), Expect = 2e-24
Identities = 68/255 (26%), Positives = 112/255 (43%), Gaps = 15/255 (5%)

Query: 46 LQGYKILVTGGDSAIGRAAAIAYAKEGADV-AINYLPSEEQDAQEVRQVIEESGQKAVLI 104
++G +TG IG A A A +GA + A++Y P + + +V ++ + A
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLE---KVVSSLKAEARHAEAF 62

Query: 105 PGDIRDEQFNYDLVEQAYQQLGGLDNVTLVAGHQQYHDDIHGFTTEAFTETFETNVYPLF 164
P D+RD ++ + +++G +D + VAG + IH + E + TF N +F
Sbjct: 63 PADVRDSAAIDEITARIEREMGPIDILVNVAGVLRP-GLIHSLSDEEWEATFSVNSTGVF 121

Query: 165 WTVQKALEYLKP--GASITTTSSVQGYNPSPILHDYAASKAAIISLTKSFSEELGPKGIR 222
+ +Y+ SI T S P + YA+SKAA + TK EL IR
Sbjct: 122 NASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIR 181

Query: 223 VNCVAPGPFWSPLQIS-----GGQPQ---SKIPTFGQKTPLGRAGQPVELCGTYVLLASE 274
N V+PG + +Q S G Q + TF PL + +P ++ + L S
Sbjct: 182 CNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSG 241

Query: 275 ESSYTTGQVFGVSGG 289
++ + T V GG
Sbjct: 242 QAGHITMHNLCVDGG 256


32 SACOL_RS13045 SACOL_RS13135 Y Y N Pathogenicity Island (biased-composition)
LocusTag DNBias CDNBias %GCBias Product
SACOL_RS13045 3 15 -2.184626 hypothetical protein
SACOL_RS13050 4 16 -2.329554 hypothetical protein
SACOL_RS13055 3 16 -1.074966 hypothetical protein
SACOL_RS13060 7 17 -1.583244 hypothetical protein
SACOL_RS13065 7 18 -1.590679 hypothetical protein
SACOL_RS13070 6 17 -3.360951 hypothetical protein
SACOL_RS13075 7 16 -3.197626 hypothetical protein
SACOL_RS13080 3 14 -2.326912 hypothetical protein
SACOL_RS13085 3 14 -2.740085 hypothetical protein
SACOL_RS13090 2 14 -2.893115 helicase
SACOL_RS13095 1 15 -3.281091 DNA mismatch repair protein MutT
SACOL_RS13105 6 14 0.576569 phosphoglucomutase
SACOL_RS13110 9 19 -0.240002 hypothetical protein
SACOL_RS13115 9 19 -1.136992 hypothetical protein
SACOL_RS13120 9 16 -0.432880 hypothetical protein
SACOL_RS13125 7 13 0.736550 MFS transporter
SACOL_RS13130 8 13 1.749318 accumulation-associated protein
SACOL_RS13135 3 11 1.037322 transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
SACOL_RS13095 VACCYTOTOXIN 30 0.043 Helicobacter pylori vacuolating cytotoxin signature.
		>VACCYTOTOXIN#Helicobacter pylori vacuolating cytotoxin signature.

Length = 1291

Score = 30.4 bits (68), Expect = 0.043
Identities = 34/172 (19%), Positives = 61/172 (35%), Gaps = 29/172 (16%)

Query: 646 QHSIDPSVI------FSKFSNYYEFLVRYKKIDTLLTENESKNLVFFSRQIAPGLKRIDS 699
++S P+++ F + +E R IDTL + ++ G + +
Sbjct: 866 RYSATPNLVAINQHDFGTIESVFELANRSNDIDTLYANSGAQ-----------GRDLLQT 914

Query: 700 LVLEELLKNELTYDELKNKMLNEVKDITEDDIDTSLRILDFSFYNAGIEKIYGSPIIERN 759
L+++ + NE+ T I +G++ + S + N
Sbjct: 915 LLIDSH-DAGYARTMIDATSANEITKQLNTATTTLNNIASLEHKTSGLQTLSLSNAMILN 973

Query: 760 ERMIRLSDAFTN----------ALSNQTFNMFLEDLIELSKYNNEKYQKGKN 801
R++ LS TN AL +Q F LE E+ KY+K N
Sbjct: 974 SRLVNLSRRHTNHIDSFAKRLQALKDQRFAS-LESAAEVLYQFAPKYEKPTN 1024


ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
SACOL_RS13130 V8PROTEASE 35 0.002 V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 34.6 bits (79), Expect = 0.002
Identities = 14/30 (46%), Positives = 18/30 (60%)

Query: 1140 EKPKDPKGPENPEKPSRPTHPSGPVNPNNP 1169
P +P P NP+ P+ P P+ P NPNNP
Sbjct: 290 NNPDNPDNPNNPDNPNNPDEPNNPDNPNNP 319



Score = 33.1 bits (75), Expect = 0.006
Identities = 13/30 (43%), Positives = 18/30 (60%)

Query: 1140 EKPKDPKGPENPEKPSRPTHPSGPVNPNNP 1169
+ P +P P+NP P P +P P NP+NP
Sbjct: 293 DNPDNPNNPDNPNNPDEPNNPDNPNNPDNP 322



Score = 32.3 bits (73), Expect = 0.010
Identities = 13/30 (43%), Positives = 19/30 (63%)

Query: 1140 EKPKDPKGPENPEKPSRPTHPSGPVNPNNP 1169
++P +P P+NP P P +P P NP+NP
Sbjct: 287 DQPNNPDNPDNPNNPDNPNNPDEPNNPDNP 316



Score = 31.1 bits (70), Expect = 0.027
Identities = 12/29 (41%), Positives = 20/29 (68%)

Query: 1140 EKPKDPKGPENPEKPSRPTHPSGPVNPNN 1168
+ P +P P NP++P+ P +P+ P NP+N
Sbjct: 296 DNPNNPDNPNNPDEPNNPDNPNNPDNPDN 324


33 SACOL_RS13990 SACOL_RS14015 Y Y N Pathogenicity Island (biased-composition)
LocusTag DNBias CDNBias %GCBias Product
SACOL_RS13990 6 14 2.428600 accessory Sec system translocase SecA2
SACOL_RS13995 8 14 2.820490 accessory Sec system protein Asp3
SACOL_RS14000 9 14 2.873964 accessory Sec system protein Asp2
SACOL_RS14005 10 14 2.816965 accessory Sec system protein Asp1
SACOL_RS14010 11 16 3.102318 accessory Sec system protein translocase subunit
SACOL_RS14015 12 16 3.954522 serine-rich adhesin for platelets
ORFs having significant similarity with Known Virulence factors
LocusTag Hits Score E-value Comments
SACOL_RS13990 SECA 660 0.0 SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 660 bits (1703), Expect = 0.0
Identities = 288/835 (34%), Positives = 451/835 (54%), Gaps = 68/835 (8%)

Query: 10 NELRLKSIRKIVKRINTWSDEVKSYSDDALKQKTIEFKERLASGVDTLDTLLPEAYAVAR 69
N+ L+ +RK+V IN E++ SD+ LK KT EF+ RL G + L+ L+PEA+AV R
Sbjct: 14 NDRTLRRMRKVVNIINAMEPEMEKLSDEELKGKTAEFRARLEKG-EVLENLIPEAFAVVR 72

Query: 70 EASWRVLGMYPKEVQLIGAIVLHEGNIAEMQTGEGKTLTATMPLYLNALSGKGTYLITTN 129
EAS RV GM +VQL+G +VL+E IAEM+TGEGKTLTAT+P YLNAL+GKG +++T N
Sbjct: 73 EASKRVFGMRHFDVQLLGGMVLNERCIAEMRTGEGKTLTATLPAYLNALTGKGVHVVTVN 132

Query: 130 DYLAKRDFEEMQPLYEWLGLTASLGFVDIVDYEYQKGEKRNIYEHDIIYTTNGRLGFDYL 189
DYLA+RD E +PL+E+LGLT V I KR Y DI Y TN GFDYL
Sbjct: 133 DYLAQRDAENNRPLFEFLGLT-----VGINLPGMPAPAKREAYAADITYGTNNEYGFDYL 187

Query: 190 IDNLADSAEGKFLPQLNYGIIDEVDSIILDAAQTPLVISGAPRLQSNLFHIVKEFVDTLI 249
DN+A S E + +L+Y ++DEVDSI++D A+TPL+ISG S ++ V + + LI
Sbjct: 188 RDNMAFSPEERVQRKLHYALVDEVDSILIDEARTPLIISGPAEDSSEMYKRVNKIIPHLI 247

Query: 250 E-----------DVHFKMKKTKKEIWLLNQGIEAAQSYFNV-------EDLYSEQAMVLV 291
+ HF + + +++ L +G+ + E LYS ++L+
Sbjct: 248 RQEKEDSETFQGEGHFSVDEKSRQVNLTERGLVLIEELLVKEGIMDEGESLYSPANIMLM 307

Query: 292 RNINLALRAQYLFESNVDYFVYNGDIVLIDRITGRMLPGTKLQAGLHQAIEAKEGMEVST 351
++ ALRA LF +VDY V +G+++++D TGR + G + GLHQA+EAKEG+++
Sbjct: 308 HHVTAALRAHALFTRDVDYIVKDGEVIIVDEHTGRTMQGRRWSDGLHQAVEAKEGVQIQN 367

Query: 352 DKSVMATITFQNLFKLFESFSGMTATGKLGESEFFDLYSKIVVQVPTDKAIQRIDEPDKV 411
+ +A+ITFQN F+L+E +GMT T EF +Y V VPT++ + R D PD V
Sbjct: 368 ENQTLASITFQNYFRLYEKLAGMTGTADTEAFEFSSIYKLDTVVVPTNRPMIRKDLPDLV 427

Query: 412 FRSVDEKNIAMIHDIVELHETGRPVLLITRTAEAAEYFSKVLFQMDIPNNLLIAQNVAKE 471
+ + EK A+I DI E G+PVL+ T + E +E S L + I +N+L A+ A E
Sbjct: 428 YMTEAEKIQAIIEDIKERTAKGQPVLVGTISIEKSELVSNELTKAGIKHNVLNAKFHANE 487

Query: 472 AQMIAEAGQIGSMTVATSMAGRGTDIKLG-----------------------------EG 502
A ++A+AG ++T+AT+MAGRGTDI LG +
Sbjct: 488 AAIVAQAGYPAAVTIATNMAGRGTDIVLGGSWQAEVAALENPTAEQIEKIKADWQVRHDA 547

Query: 503 VEALGGLAVIIHEHMENSRVDRQLRGRSGRQGDPGSSCIYISLDDYLVKRWSDSNLAENN 562
V GGL +I E E+ R+D QLRGRSGRQGD GSS Y+S++D L++ ++ ++
Sbjct: 548 VLEAGGLHIIGTERHESRRIDNQLRGRSGRQGDAGSSRFYLSMEDALMRIFASDRVSGMM 607

Query: 563 QLYSLDAQRLSQSNLFNRKVKQIVVKAQRISEEQGVKAREMANEFEKSISIQRDLVYEER 622
+ + + + + AQR E + R+ E++ + QR +Y +R
Sbjct: 608 RKLGMKPGEAIEHPWVTKAIA----NAQRKVESRNFDIRKQLLEYDDVANDQRRAIYSQR 663

Query: 623 NRVLEIDDAENQDFKALAKDVFEMFVNEE---KVLTKSRVVEYIYQNLSFQFNKDVACVN 679
N +L++ D ++ +DVF+ ++ + L + + + + L F+ D+
Sbjct: 664 NELLDVSDVSET-INSIREDVFKATIDAYIPPQSLEEMWDIPGLQERLKNDFDLDLPIAE 722

Query: 680 FKDKQAVVT------FLLEQFEKQLALNRKNMQSAYYYNIFVQKVFLKAIDSCWLEQVDY 733
+ DK+ + +L Q + ++ + A F + V L+ +DS W E +
Sbjct: 723 WLDKEPELHEETLRERILAQSIEVYQ-RKEEVVGAEMMRHFEKGVMLQTLDSLWKEHLAA 781

Query: 734 LQQLKASVNQRQNGQRNAIFEYHRVALDSFEVMTRNIKKRMVKNICQSMITFDKE 788
+ L+ ++ R Q++ EY R + F M ++K ++ + + + +E
Sbjct: 782 MDYLRQGIHLRGYAQKDPKQEYKRESFSMFAAMLESLKYEVISTLSKVQVRMPEE 836


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS14010  SECYTRNLCASE  127    4e-35    Preprotein translocase SecY subunit signature.
		>SECYTRNLCASE#Preprotein translocase SecY subunit signature.

Length = 437

Score = 127 bits (322), Expect = 4e-35
Identities = 92/440 (20%), Positives = 179/440 (40%), Gaps = 52/440 (11%)

Query: 4 LLQQYEYKIIYKRMLYTCFILFIYILGTNISI--VSYNDMQ------VKHESFFKIAISN 55
+ + + K++L+T I+ +Y +GT+I I V Y ++Q ++ F +
Sbjct: 5 FARAFRTPDLRKKLLFTLAIIVVYRVGTHIPIPGVDYKNVQQCVREASGNQGLFGLVNMF 64

Query: 56 MGGDVNTLNIFTLGLGPWLTSMIILMLISYRNMDKYMKQTSLEKHYKE------------ 103
GG + + IF LG+ P++T+ IIL L++ + LE KE
Sbjct: 65 SGGALLQITIFALGIMPYITASIILQLLT-------VVIPRLEALKKEGQAGTAKITQYT 117

Query: 104 RILTLILSVIQSYFVIHEYVSKERVHQDN-------------IYLTILILVTGTMLLVWL 150
R LT+ L+++Q ++ S + + ++ + GT +++WL
Sbjct: 118 RYLTVALAILQGTGLVATARSAPLFGRCSVGGQIVPDQSIFTTITMVICMTAGTCVVMWL 177

Query: 151 ADKNSRYGIAGPMPIVMVSIIKSMMHQKMEYI------DASHIVIALLIILVIITLFILL 204
+ + GI M I+M I + + I I +I + +I + +++
Sbjct: 178 GELITDRGIGNGMSILMFISIAATFPSALWAIKKQGTLAGGWIEFGTVIAVGLIMVALVV 237

Query: 205 FIELVEVRIPYI----DLMNVSATNMKSYLSWKVNPAGSITLMMSISAFVFLKSGIHFIL 260
F+E + RIP + S +Y+ KVN AG I ++ + S F
Sbjct: 238 FVEQAQRRIPVQYAKRMIGRRSYGGTSTYIPLKVNQAGVIPVIFASSLLYIPALVAQFAG 297

Query: 261 SMFNKSISDDMPMLTFDSPVGISVYLVIQMLLGYFLSRFLINTKQKSKDFLKSGNYFSGV 320
+ + D P+ I Y ++ + +F N ++ + + K G + G+
Sbjct: 298 GNSGWKSWVEQNLTKGDHPIYIVTYFLLIVFFAFFYVAISFNPEEVADNMKKYGGFIPGI 357

Query: 321 KPGKDTERYLNYQARRVCWFGLALVTVIIGIPLYFTLFVPHLSTEIYFS-VQLIVLVYIS 379
+ G+ T YL+Y R+ W G + +I +P L S F ++++V +
Sbjct: 358 RAGRPTAEYLSYVLNRITWPGSLYLGLIALVP-TMALVGFGASQNFPFGGTSILIIVGVG 416

Query: 380 INIAETIRTYLYFDKYKPFL 399
+ + I + L Y+ FL
Sbjct: 417 LETVKQIESQLQQRNYEGFL 436


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS14015  ICENUCLEATIN  57     6e-10    Ice nucleation protein signature.
		>ICENUCLEATIN#Ice nucleation protein signature.

Length = 1258

Score = 57.5 bits (138), Expect = 6e-10
Identities = 226/999 (22%), Positives = 389/999 (38%), Gaps = 4/999 (0%)

Query: 1163 TSESDSISESTSTSDSISEAISASESTFISLSESNSTSDSESQSASAFLSESLSESTSES 1222
TS I + + +E + S ++ ES S +++
Sbjct: 99 TSAMQFILHHRADYVACTEMQAGPGSPDVTSEVKVGNRSLPVTDDIDATIESGSTQPTQT 158

Query: 1223 TSESVSSSTSESTSLSDSTSESGSTSTSLSNSTSGSTSISTSTSISESTSTFKSESVSTS 1282
+ ST T S + GST T+ +ST + ST T+ ++ST S T+
Sbjct: 159 IEIATYGSTLSGTHQSQLIAGYGSTETAGDSSTLIAGYGSTGTAGADSTLVAGYGSTQTA 218

Query: 1283 LSMSTSTSLSDSTSLSTSLSDSTSDSKSDSLSTSMST--SDSISTSKSDSISTSTSLSGS 1340
S+ + ST SD T+ S + S+ + ST + S+ T+ GS
Sbjct: 219 GEESSQMAGYGSTQTGMKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGS 278

Query: 1341 TSESKSDSTSMSISMSQSTSGSTSTSTSTSLSDSTSTSLSLSASMNQSGVDSNSASQSAS 1400
T ++ S + S T+G+ S+ + S T+ S + S + S +
Sbjct: 279 TQTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTA 338

Query: 1401 NSTSTSTSESDSQSTSSYTSQSTSQSESTSTSTSLSDSTSISKSTSQSGSVSTSASLSGS 1460
ST T+ DS + Y S T+ +S+ T+ S T+ S +G ST + + S
Sbjct: 339 GYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADS 398

Query: 1461 ESESDSQSISTSASESTSESASTSLSDSTSTSNSGSASTSTSLSNSASASESDLSSTSLS 1520
+ S T+ EST + S + S+ + ST + S+ + ST +
Sbjct: 399 SLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTA 458

Query: 1521 DSTSASMQSSESDSQSTSASLSDSLSTSTSNRMSTIASLSTSVSTSESGSTSESTSESDS 1580
S+ S + S + STS + ++ ST +G S T+ S
Sbjct: 459 GEDSSLTAGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTAGYGS 518

Query: 1581 TSTSLSDSQSTSRSTSASGSASTSTSTSDSRSTSASTSTSMRTSTSDSQSMSLSTSTSTS 1640
T T+ ++S + S S + + S+ + ST ++ S+ T+ S + S T+
Sbjct: 519 TQTAQNESDLITGYGSTSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQTAREGSDLTA 578

Query: 1641 MSDSTSLSDSVSDSTSDSTSASTSGSMSVSISLSDSTSTSTSASEVMSASISDSQSMSES 1700
ST + S S + S T+ S + ST T+ S + + S S + ++S
Sbjct: 579 GYGSTGTAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGADS 638

Query: 1701 VNDSESVSESNSESDSKSMSGSTSVSDSGSLSVSTSLRKSESVSESSSLSCSQSMSDSVS 1760
+ S + +S +G S + S T+ S S + + S + S +
Sbjct: 639 SLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGADSSLIAGYGSTQTA 698

Query: 1761 TSDSSSLSVSTSLRSSESVSESDSLSDSKSTSGSTSTSTSGSLSTSTSLSGSESVSESTS 1820
+S + S ++++ S+ S S ST+G+ S+ +G ST T+ S + S
Sbjct: 699 GYNSILTAGYGSTQTAQEGSDLTSGYGSTSTAGADSSLIAGYGSTQTASYHSSLTAGYGS 758

Query: 1821 LSDSISMSDSTSTSDSDSLSGSISLSGSTSLSTSDSLSDSKSLSSSQSMSGSESTSTSVS 1880
+ S T+ S S +G+ S + ST + S + S ++ S +
Sbjct: 759 TQTAREQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYHSILTAGYGSTQTAQERSDLTT 818

Query: 1881 DSQSSSTSNSQFDSMSISASESDSMSTSDSSSISGSNSTSTSLSTSDSMSGSVSVSTSTS 1940
S+ST+ DS I+ S + +S +G ST T+ SD +G S ST+
Sbjct: 819 GYGSTSTAG--ADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGY 876

Query: 1941 LSDSISGSTSVSDSSSTSTSTSLSDSMSQSQSTSTSASGSLSTSISTSMSMSASTSSSQS 2000
S I+G S + S T+ S +Q S +G STS + S + S
Sbjct: 877 DSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYESSLIAGYGSTQ 936

Query: 2001 TSVSTSLSTSDSISDSTSISISGSQSTVESESTSDSTSISDSESLSTSDSDSTSTSTSDS 2060
T+ S + S T+ S + S S + S + ST + ST T+
Sbjct: 937 TASFKSTLMAGYGSSQTAREQSSLTAGYGSTSMAGYDSSLIAGYGSTQTAGYQSTLTAGY 996

Query: 2061 TSGSTSTSISESLSTSGSGSTSVSDSTSMSESNSSSVSMSQDKSDSTSISDSESVSTSTS 2120
S T+ S + GS +T+ +DS+ ++ SS S + + S S S
Sbjct: 997 GSTQTAEHSSTLTAGYGSTATAGADSSLIAGYGSSLTSGIRSFLTAGYGSTLISGLRSVL 1056

Query: 2121 TSLSTSDSTSTSESLSTSMSGSQSISDSTSTSMSGSTST 2159
T+ S S S T+ GS I+ S+ ++G ST
Sbjct: 1057 TAGYGSSLISGRRSSLTAGYGSNQIASHRSSLIAGPEST 1095



Score = 56.7 bits (136), Expect = 1e-09
Identities = 237/1078 (21%), Positives = 428/1078 (39%), Gaps = 14/1078 (1%)

Query: 1082 RTSESQSTSGSMSASQSDSMSISTSFSDSTSDSKSASTASSESISQSASTSTSGSVSTST 1141
+TS Q + + + + S + S +T SGS +
Sbjct: 98 KTSAMQFILHHRADYVACTEMQAGPGSPDVTSEVKVGNRSLPVTDDIDATIESGSTQPTQ 157

Query: 1142 SLSTSNSERTSTSMSDSTSLSTSESDSISESTST--SDSISEAISASESTFISLSESNST 1199
++ + T + S ++ S + +ST + S + ++ST ++ S T
Sbjct: 158 TIEIATYGSTLSGTHQSQLIAGYGSTETAGDSSTLIAGYGSTGTAGADSTLVAGYGSTQT 217

Query: 1200 SDSESQSASAFLSESLSESTSESTSESVSSSTSESTSLSDSTSESGSTSTSLSNSTSGST 1259
+ ES + + S S+ T+ S+ T+ S + S T+ S+ T+G
Sbjct: 218 AGEESSQMAGYGSTQTGMKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYG 277

Query: 1260 SISTSTSISESTSTFKSESVSTSLSMSTSTSLSDSTSLSTSLSDSTSDSKSDSLSTSMST 1319
S T+ S+ T+ + S + + S + S T+ S + S + S T
Sbjct: 278 STQTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLT 337

Query: 1320 SDSISTSKSDSISTSTSLSGSTSESKSDSTSMSISMSQSTSGSTSTSTSTSLSDSTSTSL 1379
+ ST + S+ + GST + DS+ + S T+ S T+ S T+ +
Sbjct: 338 AGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGAD 397

Query: 1380 SLSASMNQSGVDSNSASQSASNSTSTSTSESDSQSTSSYTSQSTSQSESTSTSTSLSDST 1439
S + S + S + ST T++ S T+ Y S T+ +S+ + S T
Sbjct: 398 SSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQT 457

Query: 1440 SISKSTSQSGSVSTSASLSGSESESDSQSISTSASESTSESASTSLSDSTSTSNSGSAST 1499
+ S+ +G ST + GS+ + S ST+ ES+ + S + S +
Sbjct: 458 AGEDSSLTAGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTAGYG 517

Query: 1500 STSLSNSASASESDLSSTSLSDSTSASMQSSESDSQSTSASLSDSLSTSTSNRMSTIASL 1559
ST + + S + STS + + S+ + S ++ S+ + ST
Sbjct: 518 STQTAQNESDLITGYGSTSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQTAREGSDLT 577

Query: 1560 STSVSTSESGSTSESTSESDSTSTSLSDSQSTSRSTSASGSASTSTSTSDSRSTSASTST 1619
+ ST +GS S + ST T+ S T+ S + S T+ STS + +
Sbjct: 578 AGYGSTGTAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGAD 637

Query: 1620 SMRTSTSDSQSMSLSTSTSTSMSDSTSLSDSVSDSTSDSTSASTSGSMSVSISLSDSTST 1679
S + S + S T+ ST + SD T+ S ST+G+ S I+ ST T
Sbjct: 638 SSLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGADSSLIAGYGSTQT 697

Query: 1680 STSASEVMSASISDSQSMSESVNDSESVSESNSESDSKSMSGSTSVSDSGSLSVSTSLRK 1739
+ S + + S + S S S S + +DS ++G S + S T+
Sbjct: 698 AGYNSILTAGYGSTQTAQEGSDLTSGYGSTSTAGADSSLIAGYGSTQTASYHSSLTAGYG 757

Query: 1740 SESVSESSSLSCSQSMSDSVSTSDSSSLSVSTSLRSSESVSESDSLSDSKSTSGSTSTST 1799
S + S+ + S S + +DSS ++ S +++ S + S T+ S T
Sbjct: 758 STQTAREQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYHSILTAGYGSTQTAQERSDLT 817

Query: 1800 SGSLSTSTSLSGSESVSESTSLSDSISMSDSTSTSDSDSLSGSISLSGSTSLSTSDSLSD 1859
+G STST+ + S ++ S + S T+ S + S + STS + D
Sbjct: 818 TGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYD 877

Query: 1860 SKSLSSSQSMSGSESTSTSVSDSQSSSTSNSQFDSMSISASESDSMSTSDSSSISGSNST 1919
S ++ S + S + S+ T+ D + S S + S + GS T
Sbjct: 878 SSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYESSLIAGYGSTQT 937

Query: 1920 STSLSTSDSMSGSVSVSTSTSLSDSISGSTSVSDSSSTSTSTSLSDSMSQSQSTSTSASG 1979
++ ST + GS + S + GSTS++ S+ + S + QST T+ G
Sbjct: 938 ASFKSTLMAGYGSSQTAREQSSLTAGYGSTSMAGYDSSLIAGYGSTQTAGYQSTLTAGYG 997

Query: 1980 SLSTSISTSMSMSASTSSSQSTSVSTSLSTSDSISDST----------SISISGSQSTVE 2029
S T+ +S + S++ + + S+ ++ S S S ISG +S +
Sbjct: 998 STQTAEHSSTLTAGYGSTATAGADSSLIAGYGSSLTSGIRSFLTAGYGSTLISGLRSVLT 1057

Query: 2030 SESTSDSTSISDSESLSTSDSDSTSTSTSDSTSGSTSTSIS--ESLSTSGSGSTSVSDST 2087
+ S S S + S+ ++ S +G ST I+ S+ +G GS+ +
Sbjct: 1058 AGYGSSLISGRRSSLTAGYGSNQIASHRSSLIAGPESTQITGNRSMLIAGKGSSQTAGYR 1117

Query: 2088 SMSESNSSSVSMSQDKSDSTSISDSESVSTSTSTSLSTSDSTSTSESLSTSMSGSQSI 2145
S S + SV M+ ++ + +DS + S L+ ++S T+ S +G+ I
Sbjct: 1118 STLISGADSVQMAGERGKLIAGADSTQTAGDRSKLLAGNNSYLTAGDRSKLTAGNDCI 1175



Score = 53.2 bits (127), Expect = 1e-08
Identities = 226/1002 (22%), Positives = 400/1002 (39%), Gaps = 4/1002 (0%)

Query: 875 TSTSLSDSLSMSTSGSLSKSQSLSTSISGSSSTSASLSDSTSNAISTSTSLSESASTSDS 934
ST + ++ ++T GS S I+G ST + ST A ST + + ST +
Sbjct: 151 GSTQPTQTIEIATYGSTLSGTHQSQLIAGYGSTETAGDSSTLIAGYGSTGTAGADSTLVA 210

Query: 935 ISISNSIANSQSASTSKSDSQSTSISLSTSDSKSMSTSESLSDSTSTSGSVSGSLSIAAS 994
S A +S+ + S T + S + ST + DS+ +G GS A
Sbjct: 211 GYGSTQTAGEESSQMAGYGSTQTGMKGSDLTAGYGSTGTAGDDSSLIAGY--GSTQTAGE 268

Query: 995 QSVSTSTSDSMSTSEIVSDSISTSGSLSASDSKSMSVSSSMSTSQSGSTSESLSDSQSTS 1054
S T+ S T++ SD + GS + + S ++ ST +G S + ST
Sbjct: 269 DSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQ 328

Query: 1055 DSDSKSLSQSTSQSGSTSTSTSTSASVRTSESQSTSGSMSASQSDSMSISTSFSDSTSDS 1114
+ S + S T+ S+ + S + S + S + SD T+
Sbjct: 329 TAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGY 388

Query: 1115 KSASTASSESISQSASTSTSGSVSTSTSLSTSNSERTSTSMSDSTSLSTSESDSISESTS 1174
S TA ++S + ST + ST + S +T+ SD T+ S + +S+
Sbjct: 389 GSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSL 448

Query: 1175 TSDSISEAISASESTFISLSESNSTSDSESQSASAFLSESLSESTSESTSESVSSSTSES 1234
+ S + +S+ + S T+ S + + S S + S + S+ T+
Sbjct: 449 IAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGY 508

Query: 1235 TSLSDSTSESGSTSTSLSNSTSGSTSISTSTSISESTSTFKSESVSTSLSMSTSTSLSDS 1294
S + S T+ + S+ +G S ST+ + S + + S ++ S+ T+ S
Sbjct: 509 GSTLTAGYGSTQTAQNESDLITGYGSTSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQ 568

Query: 1295 TSLSTSLSDSTSDSKSDSLSTSMSTSDSISTSKSDSISTSTSLSGSTSESKSDSTSMSIS 1354
T+ S + S + S S + ST + S+ T+ GST ++ S +
Sbjct: 569 TAREGSDLTAGYGSTGTAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGY 628

Query: 1355 MSQSTSGSTSTSTSTSLSDSTSTSLSLSASMNQSGVDSNSASQSASNSTSTSTSESDSQS 1414
S ST+G+ S+ + S T+ S+ + S + S + STST+ +DS
Sbjct: 629 GSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGADSSL 688

Query: 1415 TSSYTSQSTSQSESTSTSTSLSDSTSISKSTSQSGSVSTSASLSGSESESDSQSISTSAS 1474
+ Y S T+ S T+ S T+ S SG STS + + S + S T++
Sbjct: 689 IAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTSGYGSTSTAGADSSLIAGYGSTQTASY 748

Query: 1475 ES--TSESASTSLSDSTSTSNSGSASTSTSLSNSASASESDLSSTSLSDSTSASMQSSES 1532
S T+ ST + S +G STST+ ++S+ + + T+ S + S
Sbjct: 749 HSSLTAGYGSTQTAREQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYHSILTAGYGSTQ 808

Query: 1533 DSQSTSASLSDSLSTSTSNRMSTIASLSTSVSTSESGSTSESTSESDSTSTSLSDSQSTS 1592
+Q S + STST+ S++ + S T+ S + S T+ SD +
Sbjct: 809 TAQERSDLTTGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGY 868

Query: 1593 RSTSASGSASTSTSTSDSRSTSASTSTSMRTSTSDSQSMSLSTSTSTSMSDSTSLSDSVS 1652
STS +G S+ + S T+ S S + S T+ S ST+ +S
Sbjct: 869 GSTSTAGYDSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYESSL 928

Query: 1653 DSTSDSTSASTSGSMSVSISLSDSTSTSTSASEVMSASISDSQSMSESVNDSESVSESNS 1712
+ ST ++ S ++ S T+ S+ S S + S + S +
Sbjct: 929 IAGYGSTQTASFKSTLMAGYGSSQTAREQSSLTAGYGSTSMAGYDSSLIAGYGSTQTAGY 988

Query: 1713 ESDSKSMSGSTSVSDSGSLSVSTSLRKSESVSESSSLSCSQSMSDSVSTSDSSSLSVSTS 1772
+S + GST ++ S + + + ++SS ++ S S S ++ ST
Sbjct: 989 QSTLTAGYGSTQTAEHSSTLTAGYGSTATAGADSSLIAGYGSSLTSGIRSFLTAGYGSTL 1048

Query: 1773 LRSSESVSESDSLSDSKSTSGSTSTSTSGSLSTSTSLSGSESVSESTSLSDSISMSDSTS 1832
+ SV + S S S+ T+ GS ++ S + EST ++ + SM +
Sbjct: 1049 ISGLRSVLTAGYGSSLISGRRSSLTAGYGSNQIASHRSSLIAGPESTQITGNRSMLIAGK 1108

Query: 1833 TSDSDSLSGSISLSGSTSLSTSDSLSDSKSLSSSQSMSGSES 1874
S + S +SG+ S+ + + + S +G S
Sbjct: 1109 GSSQTAGYRSTLISGADSVQMAGERGKLIAGADSTQTAGDRS 1150



Score = 52.1 bits (124), Expect = 3e-08
Identities = 174/773 (22%), Positives = 304/773 (39%), Gaps = 2/773 (0%)

Query: 1398 SASNSTSTSTSESDSQSTSSYTSQSTSQSESTSTSTSLSDSTSISKSTSQSGSVSTSASL 1457
+ + +E + S + + T D+T S ST + ++ +
Sbjct: 106 LHHRADYVACTEMQAGPGSPDVTSEVKVGNRSLPVTDDIDATIESGSTQPTQTIEIATYG 165

Query: 1458 SGSESESDSQSISTSASESTSESASTSLSDSTSTSNSGSASTSTSLSNSASASESDLSST 1517
S SQ I+ S T+ +ST ++ ST +G+ ST + S + + S
Sbjct: 166 STLSGTHQSQLIAGYGSTETAGDSSTLIAGYGSTGTAGADSTLVAGYGSTQTAGEESSQM 225

Query: 1518 SLSDSTSASMQSSESDSQSTSASLSDSLSTSTSNRMSTIASLSTSVSTSESGSTSESTSE 1577
+ ST M+ S+ + S + S+ + ST + S T+ GST +
Sbjct: 226 AGYGSTQTGMKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKG 285

Query: 1578 SDSTSTSLSDSQSTSRSTSASGSASTSTSTSDSRSTSASTSTSMRTSTSDSQSMSLSTST 1637
SD T+ S + + S+ +G ST T+ +S T+ ST SD + ST T
Sbjct: 286 SDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGT 345

Query: 1638 STSMSDSTSLSDSVSDSTSDSTSASTSGSMSVSISLSDSTSTSTSASEVMSASISDSQSM 1697
+ S + S + DS+ + GS + SD T+ S + S +
Sbjct: 346 AGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGYG 405

Query: 1698 SESVNDSESVSESNSESDSKSMSGSTSVSDSGSLSVSTSLRKSESVSESSSLSCSQSMSD 1757
S ES + S + GS + GS + + S+ + S
Sbjct: 406 STQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLT 465

Query: 1758 SVSTSDSSSLSVSTSLRSSESVSESDSLSDSKSTSGSTSTSTSGSLSTSTSLSGSESVSE 1817
+ S ++ S S S + S + GST T+ GS T+ S + +E
Sbjct: 466 AGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTAGYGSTQTAQNE 525

Query: 1818 STSLSDSISMSDSTSTSDSDSLSGSISLSGSTSLSTSDSLSDSKSLSSSQSMSGSESTST 1877
S ++ S S + + S + GS + S+ T+ S + S +G ST T
Sbjct: 526 SDLITGYGSTSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQTAREGSDLTAGYGSTGT 585

Query: 1878 SVSDSQSSSTSNSQFDSMSISASESDSMSTSDSSSISGSNSTSTSLSTSDSMSGSVSVST 1937
+ SDS + S + S+ + ST + S + S ST+ + S ++
Sbjct: 586 AGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGADSSLIAGYG 645

Query: 1938 STSLSDSISGSTSVSDSSSTSTSTSLSDSMSQSQSTSTSASGSLSTSISTSMSMSASTSS 1997
ST + S T+ S+ T+ S + S ST+ + S ++ ST + S +
Sbjct: 646 STQTAGYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGADSSLIAGYGSTQTAGYNSILT 705

Query: 1998 SQSTSVSTSLSTSDSISDSTSISISGSQSTVESESTSDSTSISDSESLSTSDSDSTSTST 2057
+ S T+ SD S S S +G+ S++ + S T+ S + S T+
Sbjct: 706 AGYGSTQTAQEGSDLTSGYGSTSTAGADSSLIAGYGSTQTASYHSSLTAGYGSTQTAREQ 765

Query: 2058 SDSTS--GSTSTSISESLSTSGSGSTSVSDSTSMSESNSSSVSMSQDKSDSTSISDSESV 2115
S T+ GSTST+ ++S +G GST + S+ + S +Q++SD T+ S S
Sbjct: 766 SVLTTGYGSTSTAGADSSLIAGYGSTQTAGYHSILTAGYGSTQTAQERSDLTTGYGSTST 825

Query: 2116 STSTSTSLSTSDSTSTSESLSTSMSGSQSISDSTSTSMSGSTSTSESNSMHPS 2168
+ + S+ ++ ST T+ S +G S + S + S S + + S
Sbjct: 826 AGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYDS 878



Score = 51.3 bits (122), Expect = 5e-08
Identities = 204/902 (22%), Positives = 352/902 (39%), Gaps = 4/902 (0%)

Query: 1247 TSTSLSNSTSGSTSISTSTSISESTSTFKSESVSTSLSMSTSTSLSDSTSLSTSLSDSTS 1306
T TS + + + ++ + + + D + S S +
Sbjct: 97 TKTSAMQFILHHRADYVACTEMQAGPGSPDVTSEVKVGNRSLPVTDDIDATIESGSTQPT 156

Query: 1307 DSKSDSLSTSMSTSDSISTSKSDSISTSTSLSGSTSESKSDSTSMSISMSQSTSGSTSTS 1366
+ + S + S + ST T+ ST + ST + + S +G ST
Sbjct: 157 QTIEIATYGSTLSGTHQSQLIAGYGSTETAGDSSTLIAGYGSTGTAGADSTLVAGYGSTQ 216

Query: 1367 TSTSLSDSTSTSLSLSASMNQSGVDSNSASQSASNSTSTSTSESDSQSTSSYTSQSTSQS 1426
T+ S + S M S + + S + S+ + S T+ S T+
Sbjct: 217 TAGEESSQMAGYGSTQTGMKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGY 276

Query: 1427 ESTSTSTSLSDSTSISKSTSQSGSVSTSASLSGSESESDSQSISTSA--SESTSESASTS 1484
ST T+ SD T+ ST +G+ S+ + GS + +S T+ S T++ S
Sbjct: 277 GSTQTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDL 336

Query: 1485 LSDSTSTSNSGSASTSTSLSNSASASESDLSSTSLSDSTSASMQSSESDSQSTSASLSDS 1544
+ ST +G S+ + S + D S T+ ST + + S+ + S + +
Sbjct: 337 TAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGA 396

Query: 1545 LSTSTSNRMSTIASLSTSVSTSESGSTSESTSESDSTSTSLSDSQSTSRSTSASGSASTS 1604
S+ + ST + S T+ GST + SD T+ S + S+ +G ST
Sbjct: 397 DSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQ 456

Query: 1605 TSTSDSRSTSASTSTSMRTSTSDSQSMSLSTSTSTSMSDSTSLSDSVSDSTSDSTSASTS 1664
T+ DS T+ ST SD + STST+ S + S + ST +
Sbjct: 457 TAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTAGY 516

Query: 1665 GSMSVSISLSDSTSTSTSASEVMSASISDSQSMSESVNDSESVSESNSESDSKSMSGSTS 1724
GS + + SD + S S + S + S SV + S + GS
Sbjct: 517 GSTQTAQNESDLITGYGSTSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQTAREGSDL 576

Query: 1725 VSDSGSLSVSTSLRKSESVSESSSLSCSQSMSDSVSTSDSSSLSVSTSLRSSESVSESDS 1784
+ GS + S + S+ + S + S ++ S S S + +
Sbjct: 577 TAGYGSTGTAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGA 636

Query: 1785 LSDSKSTSGSTSTSTSGSLSTSTSLSGSESVSESTSLSDSISMSDSTSTSDSDSLSGSIS 1844
S + GST T+ S+ T+ S + S + S S + + S + GS
Sbjct: 637 DSSLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGADSSLIAGYGSTQ 696

Query: 1845 LSGSTSLSTSDSLSDSKSLSSSQSMSGSESTSTSVSDSQSSSTSNSQFDSMSISASESDS 1904
+G S+ T+ S + S SG STST+ +DS + S + S+ +
Sbjct: 697 TAGYNSILTAGYGSTQTAQEGSDLTSGYGSTSTAGADSSLIAGYGSTQTASYHSSLTAGY 756

Query: 1905 MSTSDSSSISGSNSTSTSLSTSDSMSGSVSVSTSTSLSDSISGSTSVSDSSSTSTSTSLS 1964
ST + S + S ST+ + S ++ ST + S T+ S+ T+ S
Sbjct: 757 GSTQTAREQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYHSILTAGYGSTQTAQERSDL 816

Query: 1965 DSMSQSQSTSTSASGSLSTSISTSMSMSASTSSSQSTSVSTSLSTSDSISDSTSISISGS 2024
+ S ST+ + S ++ ST + S ++ S T+ SD + S S +G
Sbjct: 817 TTGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGY 876

Query: 2025 QSTVESESTSDSTSISDSESLSTSDSDSTSTSTSDSTSG--STSTSISESLSTSGSGSTS 2082
S++ + S T+ +S + S T+ SD T+G STST+ ES +G GST
Sbjct: 877 DSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYESSLIAGYGSTQ 936

Query: 2083 VSDSTSMSESNSSSVSMSQDKSDSTSISDSESVSTSTSTSLSTSDSTSTSESLSTSMSGS 2142
+ S + S ++++S T+ S S++ S+ ++ ST T+ ST +G
Sbjct: 937 TASFKSTLMAGYGSSQTAREQSSLTAGYGSTSMAGYDSSLIAGYGSTQTAGYQSTLTAGY 996

Query: 2143 QS 2144
S
Sbjct: 997 GS 998



Score = 50.9 bits (121), Expect = 7e-08
Identities = 241/1060 (22%), Positives = 428/1060 (40%), Gaps = 16/1060 (1%)

Query: 907 TSASLSDSTSNAISTSTSLSESASTSDSISISNSIANSQSASTSKSDSQSTSISLSTSDS 966
TSA A + + ++ S ++ + N T D+ S S + +
Sbjct: 99 TSAMQFILHHRADYVACTEMQAGPGSPDVTSEVKVGNRSLPVTDDIDATIESGSTQPTQT 158

Query: 967 KSMSTSESLSDSTSTSGSVSGSLSIAASQSVSTSTSDSMSTSEIVSDSISTSGSLSASDS 1026
++T S T S ++G S + ST + ST +DS +G S +
Sbjct: 159 IEIATYGSTLSGTHQSQLIAGYGSTETAGDSSTLIAGYGSTGTAGADSTLVAGYGSTQTA 218

Query: 1027 KSMSVSSSMSTSQSGSTSESLSDSQSTSDSDSKSLSQSTSQSGSTSTSTSTSASVRTSES 1086
S + S S + S + S + GST T+ S+ S
Sbjct: 219 GEESSQMAGYGSTQTGMKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGS 278

Query: 1087 QSTSGSMSASQSDSMSISTSFSDSTSDSKSASTASSESISQSASTSTSGSVSTSTSLSTS 1146
T+ S + S T+ +DS+ + ST ++ S + S + S T+
Sbjct: 279 TQTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTA 338

Query: 1147 NSERTSTSMSDSTSLSTSESDSISESTSTSDSISEAISASESTFISLSESNSTSDSESQS 1206
T T+ DS+ ++ S + S+ + + ++ + ST + + S
Sbjct: 339 GYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADS 398

Query: 1207 ASAFLSESLSESTSESTSESVSSSTSESTSLSDSTSESGSTSTSLSNSTSGSTSISTSTS 1266
+ S + EST + ST + SD T+ GST T+ +S+ + ST T+
Sbjct: 399 SLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTA 458

Query: 1267 ISESTSTFKSESVSTSLSMSTSTSLSDSTSLSTSLSDSTSDSKSDSLSTSMS--TSDSIS 1324
+S+ T S T+ S T+ STS + S + S + S T+ S
Sbjct: 459 GEDSSLTAGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTAGYGS 518

Query: 1325 TSKSDSISTSTSLSGSTSESKSDSTSMSISMSQSTSGSTSTSTSTSLSDSTSTSLSLSAS 1384
T + + S + GSTS + ++S+ ++ S T+ S T+ S T+ S +
Sbjct: 519 TQTAQNESDLITGYGSTSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQTAREGSDLTA 578

Query: 1385 MNQSGVDSNSASQSASNSTSTSTSESDSQSTSSYTSQSTSQSESTSTSTSLSDSTSISKS 1444
S + S S + ST T+ S T+ Y S T++ +S T+ S ST+ + S
Sbjct: 579 GYGSTGTAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGADS 638

Query: 1445 TSQSGSVSTSASLSGSESESDSQSISTSASESTSESASTSLSDSTSTSNSGSASTSTSLS 1504
+ +G ST + +S + S T++ S + STS +G+ S+ +
Sbjct: 639 SLIAGYGSTQT------AGYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGADSSLIAGY 692

Query: 1505 NSASASESDLSSTSLSDSTSASMQSSESDSQSTSASLSDSLSTSTSNRMSTIASLSTSVS 1564
S + + T+ ST + + S+ S S S + + S+ + ST + S
Sbjct: 693 GSTQTAGYNSILTAGYGSTQTAQEGSDLTSGYGSTSTAGADSSLIAGYGSTQTASYHSSL 752

Query: 1565 TSESGSTSESTSESDSTSTSLSDSQSTSRSTSASGSASTSTSTSDSRSTSASTSTSMRTS 1624
T+ GST + +S T+ S S + + S+ +G ST T+ S T+ ST +T+
Sbjct: 753 TAGYGSTQTAREQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYHSILTAGYGST--QTA 810

Query: 1625 TSDSQSMSLSTSTSTSMSDSTSLSDSVSDSTSDSTSASTSGSMSVSISLSDSTSTSTSAS 1684
S + STST+ +DS+ ++ S T+ S T+G S + +S T+ S
Sbjct: 811 QERSDLTTGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGS 870

Query: 1685 EVMSASISDSQSMSESVNDSESVSESNSESDSKSMSGSTSVSDSGSLSVSTSLRKSESVS 1744
+ S + S + S + S + S +G S ST+ +S ++
Sbjct: 871 TSTAGYDSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYESSLIA 930

Query: 1745 ESSSLSCSQSMSDSVSTSDSSSLSVSTSLRSSESVSESDSLSDSKSTSGSTSTSTSGSLS 1804
S + S ++ SS + S ++ S S + DS +G ST T+G S
Sbjct: 931 GYGSTQTASFKSTLMAGYGSSQTAREQSSLTAGYGSTSMAGYDSSLIAGYGSTQTAGYQS 990

Query: 1805 TSTSLSGSESVSESTSLSDSISMSDSTSTSDSDSLSGSISLSGSTSLSTSDSLSDSKSLS 1864
T T+ GS +E +S + S +T+ +DS ++G S S S + S +S
Sbjct: 991 TLTAGYGSTQTAEHSSTLTAGYGSTATAGADSSLIAGYGSSLTSGIRSFLTAGYGSTLIS 1050

Query: 1865 SSQSMSGSESTSTSVSDSQSSSTSN------SQFDSMSISASESDSMSTSDSSSISGSNS 1918
+S+ + S+ +S +SS T+ + S I+ ES ++ + S I+G S
Sbjct: 1051 GLRSVLTAGYGSSLISGRRSSLTAGYGSNQIASHRSSLIAGPESTQITGNRSMLIAGKGS 1110

Query: 1919 TSTSLSTSDSMSGSVSVSTSTSLSDSISGSTSVSDSSSTS 1958
+ T+ S +SG+ SV + I+G+ S + S
Sbjct: 1111 SQTAGYRSTLISGADSVQMAGERGKLIAGADSTQTAGDRS 1150



Score = 50.1 bits (119), Expect = 1e-07
Identities = 240/1078 (22%), Positives = 423/1078 (39%), Gaps = 22/1078 (2%)

Query: 753 MSDSVSTSGSTQQSQSVSTSKADSQSASTSTSGSIVVSTSASTSKSTSVSLSDSVSASKS 812
+D V+ + S + + + +T S S + ++ + S
Sbjct: 109 RADYVACTEMQAGPGSPDVTSEVKVGNRSLPVTDDIDATIESGSTQPTQTIEIATYGSTL 168

Query: 813 LSTSESNSVSSSTSTSLVNSQSVSSSMSDSASKSTSLSDSISNSSSTEKSESLSTSTSDS 872
T +S ++ ST S + S + + S ++ ST+ + S+ +
Sbjct: 169 SGTHQSQLIAGYGSTETAGDSSTLIAGYGSTGTAGADSTLVAGYGSTQTAGEESSQMAGY 228

Query: 873 LRTSTSLSDSLSMSTSGSLSKSQSLSTSISGSSSTSASLSDSTSNAISTSTSLSESASTS 932
T T + S + GS + S+ I+G ST + DS+ A ST ++ S
Sbjct: 229 GSTQTGMKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDL 288

Query: 933 DSISISNSIANSQSA------STSKSDSQSTSISLSTSDSKSMSTSESLSDSTSTSGSVS 986
+ S A + S+ ST + +ST + S + S+ + ST +
Sbjct: 289 TAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGD 348

Query: 987 GSLSIAASQSVSTSTSDSMSTSEIVSDSISTSGSLSASDSKSMSVSSSMSTSQSGSTSES 1046
S IA S T+ DS T+ S + GS + S + + S+ +G S
Sbjct: 349 DSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQ 408

Query: 1047 LSDSQSTSDSDSKSLSQSTSQSGSTSTSTSTSASVRTSESQSTSGSMSASQSDSMSISTS 1106
+ +ST + S + S T+ ST + S + GS + DS +
Sbjct: 409 TAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGY 468

Query: 1107 FSDSTSDSKSASTASSESISQSASTSTSGSVSTSTSLSTSNSERTSTSMSDSTSLSTSES 1166
S T+ S TA S S + S+ + ST + S T+ S T+ + S+
Sbjct: 469 GSTQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTAGYGSTQTAQNESDL 528

Query: 1167 DSISESTSTSDSISEAISASESTFISLSESNSTSDSESQSASAFLSESLSESTSESTSES 1226
+ STST+ + S I+ ST + S T+ S + S+ + S T+ S
Sbjct: 529 ITGYGSTSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQTAREGSDLTAGYGSTGTAGS 588

Query: 1227 VSSSTSESTSLSDSTSESGSTSTSLSNSTSGSTSISTSTSISESTSTFKSESVSTSLSMS 1286
SS + S ++ S T+ S T+ S+ T+ S ST+ S ++ S
Sbjct: 589 DSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGADSSLIAGYGSTQ 648

Query: 1287 TSTSLSDSTSLSTSLSDSTSDSKSDSLSTSMSTSDSISTSKSDSISTSTSLSGSTSESKS 1346
T+ S T+ S + S + S ST+ + S+ + ST T+ S +
Sbjct: 649 TAGYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGY 708

Query: 1347 DSTSMSISMSQSTSGSTSTS----------------TSTSLSDSTSTSLSLSASMNQSGV 1390
ST + S TSG STS T++ S T+ S + QS +
Sbjct: 709 GSTQTAQEGSDLTSGYGSTSTAGADSSLIAGYGSTQTASYHSSLTAGYGSTQTAREQSVL 768

Query: 1391 DSNSASQSASNSTSTSTSESDSQSTSSYTSQSTSQSESTSTSTSLSDSTSISKSTSQSGS 1450
+ S S + + S+ + S T+ Y S T+ ST T+ SD T+ STS +G+
Sbjct: 769 TTGYGSTSTAGADSSLIAGYGSTQTAGYHSILTAGYGSTQTAQERSDLTTGYGSTSTAGA 828

Query: 1451 VSTSASLSGSESESDSQSISTSASESTSESASTSLSDSTSTSNSGSASTSTSLSNSASAS 1510
S+ + GS + SI T+ ST + S + S S + S+ ++ S
Sbjct: 829 DSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYDSSLIAGYGSTQ 888

Query: 1511 ESDLSSTSLSDSTSASMQSSESDSQSTSASLSDSLSTSTSNRMSTIASLSTSVSTSESGS 1570
+ +S + S SD + S S + S+ ++ ST +G
Sbjct: 889 TAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYESSLIAGYGSTQTASFKSTLMAGY 948

Query: 1571 TSESTSESDSTSTSLSDSQSTSRSTSASGSASTSTSTSDSRSTSASTSTSMRTSTSDSQS 1630
S T+ S+ T+ S S + S+ + ST T+ +ST + S +T+ S
Sbjct: 949 GSSQTAREQSSLTAGYGSTSMAGYDSSLIAGYGSTQTAGYQSTLTAGYGSTQTAEHSSTL 1008

Query: 1631 MSLSTSTSTSMSDSTSLSDSVSDSTSDSTSASTSGSMSVSISLSDSTSTSTSASEVMSAS 1690
+ ST+T+ +DS+ ++ S TS S T+G S IS S T+ S ++S
Sbjct: 1009 TAGYGSTATAGADSSLIAGYGSSLTSGIRSFLTAGYGSTLISGLRSVLTAGYGSSLISGR 1068

Query: 1691 ISDSQSMSESVNDSESVSESNSESDSKSMSGSTSVSDSGSLSVSTSLRKSESVSESSSLS 1750
S + S + S + +S ++G+ S+ +G S T+ +S +S + S+
Sbjct: 1069 RSSLTAGYGSNQIASHRSSLIAGPESTQITGNRSMLIAGKGSSQTAGYRSTLISGADSVQ 1128

Query: 1751 CSQSMSDSVSTSDSSSLSVSTSLRSSESVSESDSLSDSKSTSGSTSTSTSGSLSTSTS 1808
+ ++ +DS+ + S + + S + SK T+G+ +G S T+
Sbjct: 1129 MAGERGKLIAGADSTQTAGDRSKLLAGNNSYLTAGDRSKLTAGNDCILMAGDRSKLTA 1186



Score = 49.0 bits (116), Expect = 2e-07
Identities = 209/939 (22%), Positives = 371/939 (39%), Gaps = 2/939 (0%)

Query: 733 TDASGNKTTTTFKYEVTRNSMSDSVSTSGSTQQSQSVSTSKADSQSASTSTSGSIVVSTS 792
T G+ T + T + S ++ GSTQ + ST A S T+ GS + +
Sbjct: 281 TAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGY 340

Query: 793 ASTSKSTSVSLSDSVSASKSLSTSESNSVSSSTSTSLVNSQSVSSSMSDSASKSTSLSDS 852
ST + S + S + +S+ + ST S ++ S + + S
Sbjct: 341 GSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSL 400

Query: 853 ISNSSSTEKSESLSTSTSDSLRTSTSLSDSLSMSTSGSLSKSQSLSTSISGSSSTSASLS 912
I+ ST+ + ST T+ T T+ S + GS + S+ I+G ST +
Sbjct: 401 IAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGE 460

Query: 913 DSTSNAISTSTSLSESASTSDSISISNSIANSQSASTSKSDSQSTSISLSTSDSKSMSTS 972
DS+ A ST ++ S + S S A +S+ + S T+ ST + ST
Sbjct: 461 DSSLTAGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTAGYGSTQ 520

Query: 973 ESLSDSTSTSGSVSGSLSIAASQSVSTSTSDSMSTSEIVSDSISTSGSLSASDSKSMSVS 1032
+ ++S +G GS S A + S + S T+ S + GS + S +
Sbjct: 521 TAQNESDLITGY--GSTSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQTAREGSDLTA 578

Query: 1033 SSMSTSQSGSTSESLSDSQSTSDSDSKSLSQSTSQSGSTSTSTSTSASVRTSESQSTSGS 1092
ST +GS S ++ ST + S + S T+ S + S S + + S
Sbjct: 579 GYGSTGTAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGADS 638

Query: 1093 MSASQSDSMSISTSFSDSTSDSKSASTASSESISQSASTSTSGSVSTSTSLSTSNSERTS 1152
+ S + S T+ S TA S + STS + + S+ ++ S +T+
Sbjct: 639 SLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGADSSLIAGYGSTQTA 698

Query: 1153 TSMSDSTSLSTSESDSISESTSTSDSISEAISASESTFISLSESNSTSDSESQSASAFLS 1212
S T+ S + S TS S + + ++S+ I+ S T+ S + + S
Sbjct: 699 GYNSILTAGYGSTQTAQEGSDLTSGYGSTSTAGADSSLIAGYGSTQTASYHSSLTAGYGS 758

Query: 1213 ESLSESTSESTSESVSSSTSESTSLSDSTSESGSTSTSLSNSTSGSTSISTSTSISESTS 1272
+ S T+ S+ST+ + S + S T+ S T+G S T+ S+ T+
Sbjct: 759 TQTAREQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYHSILTAGYGSTQTAQERSDLTT 818

Query: 1273 TFKSESVSTSLSMSTSTSLSDSTSLSTSLSDSTSDSKSDSLSTSMSTSDSISTSKSDSIS 1332
+ S S + + S + S T+ S+ + S + S T+ STS + S
Sbjct: 819 GYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYDS 878

Query: 1333 TSTSLSGSTSESKSDSTSMSISMSQSTSGSTSTSTSTSLSDSTSTSLSLSASMNQSGVDS 1392
+ + GST + +S + S T+ S T+ S ST+ S + S +
Sbjct: 879 SLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYESSLIAGYGSTQTA 938

Query: 1393 NSASQSASNSTSTSTSESDSQSTSSYTSQSTSQSESTSTSTSLSDSTSISKSTSQSGSVS 1452
+ S + S+ T+ S T+ Y S S + +S+ + S T+ +ST +G S
Sbjct: 939 SFKSTLMAGYGSSQTAREQSSLTAGYGSTSMAGYDSSLIAGYGSTQTAGYQSTLTAGYGS 998

Query: 1453 TSASLSGSESESDSQSISTSASESTSESASTSLSDSTSTSNSGSASTSTSLSNSASASES 1512
T + S + S +T+ ++S+ + S S S + ST +S S +
Sbjct: 999 TQTAEHSSTLTAGYGSTATAGADSSLIAGYGSSLTSGIRSFLTAGYGSTLISGLRSVLTA 1058

Query: 1513 DLSSTSLSDSTSASMQSSESDSQSTSASLSDSLSTSTSNRMSTIASLSTSVSTSESGSTS 1572
S+ +S S+ S+ ++ S + ST + ++ S+ +G S
Sbjct: 1059 GYGSSLISGRRSSLTAGYGSNQIASHRSSLIAGPESTQITGNRSMLIAGKGSSQTAGYRS 1118

Query: 1573 ESTSESDSTSTSLSDSQSTSRSTSASGSASTSTSTSDSRSTSASTSTSMRTSTSDSQSMS 1632
S +DS + + + + S + S + + S + S T+ +D M+
Sbjct: 1119 TLISGADSVQMAGERGKLIAGADSTQTAGDRSKLLAGNNSYLTAGDRSKLTAGNDCILMA 1178

Query: 1633 LSTSTSTSMSDSTSLSDSVSDSTSDSTSASTSGSMSVSI 1671
S T+ +S + S + S T+G SV I
Sbjct: 1179 GDRSKLTAGINSILTAGCRSKLIGSNGSTLTAGENSVLI 1217


34  SACOL_RS00490  SACOL_RS00520  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias   Product
SACOL_RS00490  0       11       2.982919  iron ABC transporter substrate-binding protein
SACOL_RS00495  1       15       3.402058  siderophore biosynthesis protein SbnA
SACOL_RS00500  2       16       3.716622  2,3-diaminopropionate biosynthesis protein SbnB
SACOL_RS00505  2       16       3.618573  siderophore biosynthesis protein SbnC
SACOL_RS00510  2       17       3.856488  MFS transporter
SACOL_RS00515  1       16       3.239949  siderophore biosynthesis protein SbnE
SACOL_RS00520  1       16       2.702973  siderophore biosynthesis protein SbnF
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS00490  FERRIBNDNGPP  70     7e-16    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 70.4 bits (172), Expect = 7e-16
Identities = 47/191 (24%), Positives = 78/191 (40%), Gaps = 38/191 (19%)

Query: 53 PKRVVTLYQGATDVAVSLGVKPVGAVES-----WTQKPKFEYIKNDLKDTKI-VGQEPAP 106
P R+V L ++ ++LG+ P G ++ W +P L D+ I VG P
Sbjct: 35 PNRIVALEWLPVELLLALGIVPYGVADTINYRLWVSEPP-------LPDSVIDVGLRTEP 87

Query: 107 NLEEISKLKPDLIVASKVRNEKVYDQLSKIAPTVSTDTVFKFKD----------TTKLMG 156
NLE ++++KP +V S + L++IAP F F D + M
Sbjct: 88 NLELLTEMKPSFMVWS-AGYGPSPEMLARIAPGR----GFNFSDGKQPLAMARKSLTEMA 142

Query: 157 KALGKEKEAEDLLKKYDDKVAAFQKDAKAKY--KDAWPLKASVVNF-RADHTRIYA-GGY 212
L + AE L +Y+D F + K ++ + A PL + H ++
Sbjct: 143 DLLNLQSAAETHLAQYED----FIRSMKPRFVKRGARPL--LLTTLIDPRHMLVFGPNSL 196

Query: 213 AGEILNDLGFK 223
EIL++ G
Sbjct: 197 FQEILDEYGIP 207


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS00500  SYCECHAPRONE  31     0.002    Gram-negative bacterial type III secretion SycE cha...
		>SYCECHAPRONE#Gram-negative bacterial type III secretion SycE chaperone signature.

Length = 130

Score = 31.2 bits (70), Expect = 0.002
Identities = 14/33 (42%), Positives = 16/33 (48%), Gaps = 1/33 (3%)

Query: 25 VDALTEALTAHAHNDFVQ-PLKPYLRQDPENGH 56
+D E T +HN F Q LKP L D GH
Sbjct: 54 LDNNDEKETLLSHNIFSQDILKPILSWDEVGGH 86


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
SACOL_RS00505  PF04183  317    e-103    IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 317 bits (815), Expect = e-103
Identities = 119/527 (22%), Positives = 208/527 (39%), Gaps = 46/527 (8%)

Query: 79 RVSKQPLTAAEFWQTIANMNCDLSHEWEVARVEEGLTTAATQLAKQLSELDLASHPFV-- 136
R + +P+ A + + +S +++ T L + L++ +
Sbjct: 66 RCADEPVLAQTLLMQLKQVL-SMSDATVAEHMQDLYATLLGDLQLLKARRGLSASDLINL 124

Query: 137 -MSEQFASLKDRPFHPLAKEKRGLREADYQVYQAELNQSFPLMVAAVKKTHMIHGDTANI 195
L P K +RG + + Y E +F L AVK+ HMI +
Sbjct: 125 NADRLQCLLSGHPKFVFNKGRRGWGKEALERYAPEYANTFRLHWLAVKREHMIWRCDNEM 184

Query: 196 DELENLTVPIKEQA----TDMLNDQGLSIDDYVLFPVHPWQYQHILPNVFAKEISEKLVV 251
D + LT + Q + + + GL +++ PVHPWQ+Q + F + +E +V
Sbjct: 185 DIHQLLTAAMDPQEFARFSQVWQENGLD-HNWLPLPVHPWQWQQKIATDFIADFAEGRMV 243

Query: 252 LLPLKFGD-YLSSSSMRSLIDIGAPYN-HVKVPFAMQSLGALRLTPTRYMKNGEQAEQLL 309
L +FGD +L+ S+R+L + +K+P + + R P RY+ G A + L
Sbjct: 244 SLG-EFGDQWLAQQSLRTLTNASRRGGLDIKLPLTIYNTSCYRGIPGRYIAAGPLASRWL 302

Query: 310 RQLIEKDEALAKYVMV-CDETA-------WWSYMGQDNDIFKDQLGHLTVQLRKYPEVLA 361
+Q+ D L + V E A ++ + + +++ LG V R+ P
Sbjct: 303 QQVFATDATLVQSGAVILGEPAAGYVSHEGYAALARAPYRYQEMLG---VIWRENPCRWL 359

Query: 362 KNDTQQLVSMAALAANDRTLYQMICGKDNISKNDVMTLFEDIAQVFLKVTLSFM-QYGAL 420
K D + V MA L D + + S D T + +V + + +YG
Sbjct: 360 KPD-ESPVLMATLMECDENNQPLAGAYIDRSGLDAETWLTQLFRVVVVPLYHLLCRYGVA 418

Query: 421 PELHGQNILLSFEDGRVQKCVLRD-HDTVRIYKPWLTAHQLSLPKYV--VREDTPNTLIN 477
HGQNI L+ ++G Q+ +L+D +R+ K SLP+ V V +
Sbjct: 419 LIAHGQNITLAMKEGVPQRVLLKDFQGDMRLVKEEFPEMD-SLPQEVRDVTSRLSADYLI 477

Query: 478 EDLETFFAYFQTLAVSVNLYAIIDAIQDLFGVSEHELMSLLKQILKNEVATISWVTTDQL 537
DL+T V + I + GV E LL +L + + Q+
Sbjct: 478 HDLQTGHF--------VTVLRFISPLMVRLGVPERRFYQLLAAVLSDYMK-----KHPQM 524

Query: 538 AVRHILFDKQTWPFKQILLP---LLY-QRDSGGGSMPSGLTTVPNPM 580
+ R LF +++L L + D G +P+ L + NP+
Sbjct: 525 SERFALFSLFRPQIIRVVLNPVKLTWPDLDGGSRMLPNYLEDLQNPL 571


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
SACOL_RS00510  TCRTETA  80     2e-18    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 79.9 bits (197), Expect = 2e-18
Identities = 71/372 (19%), Positives = 149/372 (40%), Gaps = 24/372 (6%)

Query: 13 ILWLSQFIAIAGLTVLVPLLPIYMASLQNLSVVEIQLWSGIAIAAPAVTTMIASPIWGKL 72
++ + + G+ +++P+LP + L + ++ GI +A A+ +P+ G L
Sbjct: 9 VILSTVALDAVGIGLIMPVLPGLLRDL--VHSNDVTAHYGILLALYALMQFACAPVLGAL 66

Query: 73 GDKISRKWMVLRALLGLAVCLFLMALCTTPLQFVLVRLLQGLFGGVVDASSAFASAEAPA 132
D+ R+ ++L +L G AV +MA + R++ G+ G + A+ +
Sbjct: 67 SDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIADITDG 126

Query: 133 EDRGKVLGRLQSSVSAGSLVGPLIGGVTASILGFSALLMSIAVITFIVCIFGALKLIETT 192
++R + G + + G + GP++GG+ A + A + + + G L E+
Sbjct: 127 DERARHFGFMSACFGFGMVAGPVLGGLMGGF-SPHAPFFAAAALNGLNFLTGCFLLPESH 185

Query: 193 HMPKSQTPNINKGIRRSFQCLLCTQQTCRFIIVGVLANFAMYGMLTALSPLASSVNHTAI 252
+ SF+ + V A A++ ++ + + +++
Sbjct: 186 KGERRPLRREALNPLASFR--------WARGMTVVAALMAVFFIMQLVGQVPAALWVIFG 237

Query: 253 DDR-----SVIGFLQSAF-WTASILSAPLWGRFNDKSYVKSVYIFATIACGCSAILQGLA 306
+DR + IG +AF S+ A + G + + + IA G IL A
Sbjct: 238 EDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFA 297

Query: 307 TNIEFLMAARILQGLTYSAL--IQSVMFVVVNACHQ-QLKGTFVGTTNSMLVVGQIIGSL 363
T +L + +Q+++ V+ Q QL+G+ T+ + I+G L
Sbjct: 298 TRGWMAFPIMVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTS----LTSIVGPL 353

Query: 364 SGAAITSYTTPA 375
AI + +
Sbjct: 354 LFTAIYAASITT 365


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
SACOL_RS00515  PF04183  304    5e-98    IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 304 bits (779), Expect = 5e-98
Identities = 117/539 (21%), Positives = 211/539 (39%), Gaps = 61/539 (11%)

Query: 3 NKELIQHAAYAAIERILNEYFREENLYQVPPQNHQWSIQLSELE-TLTGEFRYWSAMGHH 61
N + + ++L+E E+ + + ++ I L + E W G
Sbjct: 2 NHKDWDLVNRRLVAKMLSELEYEQVFHAESQGDDRYCINLPGAQWRFIAERGIW---GW- 57

Query: 62 MYHPEVWLIDGKSKKITTYKEAIARILQHMAQSADNQTA-VQQHMAQIMSDI--DNSIHR 118
ID ++ + +L + Q A V +HM + + + D + +
Sbjct: 58 ------LWIDAQTLRCADEPVLAQTLLMQLKQVLSMSDATVAEHMQDLYATLLGDLQLLK 111

Query: 119 TARYLQSNTIDYVEDRYIVSEQSLYLGHPFHPTPKSASGFSEADLEKYAPECHTSFQLHY 178
R L ++ + + Q L GHP K G+ + LE+YAPE +F+LH+
Sbjct: 112 ARRGLSASDL---INLNADRLQCLLSGHPKFVFNKGRRGWGKEALERYAPEYANTFRLHW 168

Query: 179 LAVHQD-------------VLLTRYVEGKEDQVEKVLYQLADIDISEIPKDFILLPTHPY 225
LAV ++ LLT ++ +E ++Q +D +++ LP HP+
Sbjct: 169 LAVKREHMIWRCDNEMDIHQLLTAAMDPQEFARFSQVWQENGLD-----HNWLPLPVHPW 223

Query: 226 QINVLRQHPQYMQYSEQGLIKDLGVSGDSVYPTSSVRTVF--SKALNIYLKLPIHVKITN 283
Q ++ +G + LG GD S+RT+ S+ + +KLP+ + T+
Sbjct: 224 QWQQK-IATDFIADFAEGRMVSLGEFGDQWLAQQSLRTLTNASRRGGLDIKLPLTIYNTS 282

Query: 284 FIRTNDLEQIERTIDAAQVIASVKDE-----------VETPHFKLMFEEGYRALLPNPLG 332
R I A++ + V + P + EGY AL P
Sbjct: 283 CYRGIPGRYIAAGPLASRWLQQVFATDATLVQSGAVILGEPAAGYVSHEGYAALARAPYR 342

Query: 333 QTVEPEMDLLTNSAMIVREGIPNY-HADKDIHVLASLFETMPDSPMSKLSQVIEQSGLAP 391
EM +I RE + D+ ++A+L E ++ I++SGL
Sbjct: 343 YQ---EM-----LGVIWRENPCRWLKPDESPVLMATLMECDENN-QPLAGAYIDRSGLDA 393

Query: 392 EAWLECYLNRTLLPILKLFSNTGISLEAHVQNTLIELKDGIPDVCFVRDLEG-ICLSRTI 450
E WL ++P+ L G++L AH QN + +K+G+P ++D +G + L +
Sbjct: 394 ETWLTQLFRVVVVPLYHLLCRYGVALIAHGQNITLAMKEGVPQRVLLKDFQGDMRLVKEE 453

Query: 451 ATEKQLVPNVVAASSPVVYAHDEAWHRLKYYVVVNHLGHLVSTIGKATRNEVVLWQLVA 509
E +P V + + A D H L+ V L + + + E +QL+A
Sbjct: 454 FPEMDSLPQEVRDVTSRLSA-DYLIHDLQTGHFVTVLRFISPLMVRLGVPERRFYQLLA 511


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
SACOL_RS00520  PF04183  511    e-178    IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 511 bits (1317), Expect = e-178
Identities = 145/592 (24%), Positives = 256/592 (43%), Gaps = 40/592 (6%)

Query: 1 MNQTILNRVKTRVMHQLVSSLIYENIVVYKASYQDGVGHFTIEGHDSEYRFTAEKTHSFD 60
MN + V R++ +++S L YE + + A Q G + I +++RF AE+ +
Sbjct: 1 MNHKDWDLVNRRLVAKMLSELEYEQV--FHAESQ-GDDRYCINLPGAQWRFIAERG-IWG 56

Query: 61 RIRITSPIERVVGDEADTTTDYTQLLREVVFTFPKNDEKLEQFIVELLQTELKDTQSMQY 120
+ I + R AD LL ++ +D + + + +L T L D Q ++
Sbjct: 57 WLWIDAQTLRC----ADEPVLAQTLLMQLKQVLSMSDATVAEHMQDLYATLLGDLQLLKA 112

Query: 121 RESNPPATPETFN-DYEFYAMEGHQYHPSYKSRLGFTLSDNLKFGPDFVPNVKLQWLAID 179
R + N D + GH K R G+ ++ P++ +L WLA+
Sbjct: 113 RRGLSASDLINLNADRLQCLLSGHPKFVFNKGRRGWGKEALERYAPEYANTFRLHWLAVK 172

Query: 180 KDKVETTVSRNVVVNEMLRQQVGDKTYEHFVQQIEASGKHVNDVEMIPVHPWQFEHVIQV 239
++ + + ++++L + + + F Q + +G N + +PVHPWQ++ I
Sbjct: 173 REHMIWRCDNEMDIHQLLTAAMDPQEFARFSQVWQENGLDHNWL-PLPVHPWQWQQKIAT 231

Query: 240 DLAEERLNGTVLWLGESDELYHPQQSIRTMSPIDTT-KYYLKVPISITNTSTKRVLAPHT 298
D + G ++ LGE + + QQS+RT++ +K+P++I NTS R +
Sbjct: 232 DFIADFAEGRMVSLGEFGDQWLAQQSLRTLTNASRRGGLDIKLPLTIYNTSCYRGIPGRY 291

Query: 299 IENAAQITDWLKQIQQQDMYLKDE----LKTVFLGEVLGQSYLNTQLSPYKQTQVYGALG 354
I + WL+Q+ D L L G V + Y +PY+ ++ LG
Sbjct: 292 IAAGPLASRWLQQVFATDATLVQSGAVILGEPAAGYVSHEGYAALARAPYRYQEM---LG 348

Query: 355 VIWRENIYHMLIDEEDAIPFNALYASDKDGVPFIENWIKQYG--SEAWTKQFLAVAIRPM 412
VIWREN L +E + L D++ P +I + G +E W Q V + P+
Sbjct: 349 VIWRENPCRWLKPDESPVLMATLMECDENNQPLAGAYIDRSGLDAETWLTQLFRVVVVPL 408

Query: 413 IHMLYYHGIAFESHAQNMMLIHENGWPTRIALKDFHDGVRFKREHLSEAASHLTLKPMPE 472
H+L +G+A +H QN+ L + G P R+ LKDF +R +E E S +P+
Sbjct: 409 YHLLCRYGVALIAHGQNITLAMKEGVPQRVLLKDFQGDMRLVKEEFPEMDS------LPQ 462

Query: 473 AHKKVNSNSFIETDDERLVRDFLH---DAFFFINIAEIILFIEKQYGIDEELQWQWVKGI 529
+ V S RL D+L F+ + I + + G+ E +Q + +
Sbjct: 463 EVRDVTS---------RLSADYLIHDLQTGHFVTVLRFISPLMVRLGVPERRFYQLLAAV 513

Query: 530 IEAYQEAFPELNN-YQHFDLFEPTIQVEKLTTRRL-LSDSELRIHHVTNPLG 579
+ Y + P+++ + F LF P I L +L D + + N L
Sbjct: 514 LSDYMKKHPQMSERFALFSLFRPQIIRVVLNPVKLTWPDLDGGSRMLPNYLE 565


35  SACOL_RS00830  SACOL_RS00850  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
SACOL_RS00830  2       14       1.698440   MFS transporter
SACOL_RS00835  3       13       2.032695   non-ribosomal peptide synthetase
SACOL_RS00840  2       15       2.912803   4'-phosphopantetheinyl transferase
SACOL_RS00845  2       16       2.444109   hypothetical protein
SACOL_RS00850  1       16       2.320169   acetylglutamate kinase
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
SACOL_RS00830  TCRTETA  32     0.004    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 32.1 bits (73), Expect = 0.004
Identities = 61/337 (18%), Positives = 127/337 (37%), Gaps = 33/337 (9%)

Query: 7 TLKVRLISNFLQLIITTAFIPFIALYLTDMLS----QSIVGIYLVGLVVLKFPLSIISGY 62
L V L + L + +P + L D++ + GI L +++F + + G
Sbjct: 6 PLIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGA 65

Query: 63 LIEIFPKKLLVLIYQATMVIMLVFMGVFGSHQLWQI-IGFCVAYAIFTIVWGLQFPVMDT 121
L + F ++ ++L+ A + M + LW + IG VA + G V
Sbjct: 66 LSDRFGRRPVLLVSLAGAAVDYAIMAT--APFLWVLYIGRIVAG-----ITGATGAVAGA 118

Query: 122 LIMDAITEDVEHYIYKISYWMTNLSVAIGALLGGLMYGYSMLLLFLIAACIFLIVLFILY 181
I D D + + G +LGGLM G+S F AA + +
Sbjct: 119 YIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGC 178

Query: 182 IWLPQDRNQVKQSDDKRHASRYQKLQIMNIFRSYKLVLKDRNYMLLISGFSIIMMGEFSI 241
LP+ ++ + + + L+ F + ++G+
Sbjct: 179 FLLPESHKGERRPLRREALNPLASFRWARGMTVVAA--------LMAVFFIMQLVGQVPA 230

Query: 242 SSYIAIRLKDQF--ETISIGSYDITGAKMLAILLMINTVVVILLTYSISKVVLKIDFKKA 299
+ + I +D+F + +IG LA +++++ ++T ++ ++ ++A
Sbjct: 231 ALW-VIFGEDRFHWDATTIGI-------SLAAFGILHSLAQAMITGPVAA---RLGERRA 279

Query: 300 LITGLLIYIVGYSGLTYLNQFGLLVVFMIIATVGEII 336
L+ G++ GY L + + + M++ G I
Sbjct: 280 LMLGMIADGTGYILLAFATRGWMAFPIMVLLASGGIG 316


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS00835  NUCEPIMERASE  52     2e-08    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 51.7 bits (124), Expect = 2e-08
Identities = 54/266 (20%), Positives = 101/266 (37%), Gaps = 55/266 (20%)

Query: 2046 NTLLTGATGFLGAYLIEVLQGYSHRIYCFIRADNEEIAWYKLMTNLNDYFS----EETVE 2101
L+TGA GF+G ++ + L H++ + D NLNDY+ + +E
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQV---VGID-----------NLNDYYDVSLKQARLE 47

Query: 2102 IM----LSNIEVIVGDFECMDDVVLPENMDTIIH----AGARTDHFGDDDEFEKVNVQGT 2153
++ ++ + D E M D+ + + + R + + N+ G
Sbjct: 48 LLAQPGFQFHKIDLADREGMTDLFASGHFERVFISPHRLAVR-YSLENPHAYADSNLTGF 106

Query: 2154 VDVIRLAQQHH-ARLIYVSTISV-GTYFDIDTEDVTFSEADVYKGQLLTSPYTRSKFYSE 2211
++++ + + L+Y S+ SV G + FS D + S Y +K +E
Sbjct: 107 LNILEGCRHNKIQHLLYASSSSVYG-----LNRKMPFSTDDSVDHPV--SLYAATKKANE 159

Query: 2212 LKVLEAVNN-GLDGRIVRVGNLTNPYNGRWHM------RNIKTNRFSMVMNDLLQLDCIG 2264
L + GL +R + P+ GR M + + + V N
Sbjct: 160 LMAHTYSHLYGLPATGLRFFTVYGPW-GRPDMALFKFTKAMLEGKSIDVYNY-------- 210

Query: 2265 VSMAEMPVDFSFVDTTARQIVALAQV 2290
+M DF+++D A I+ L V
Sbjct: 211 ---GKMKRDFTYIDDIAEAIIRLQDV 233


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS00840  ENTSNTHTASED  29     0.009    Enterobactin synthetase component D signature.
		>ENTSNTHTASED#Enterobactin synthetase component D signature.

Length = 234

Score = 29.2 bits (65), Expect = 0.009
Identities = 15/57 (26%), Positives = 27/57 (47%), Gaps = 5/57 (8%)

Query: 84 GQP-----IYVSLSYSYPYIVCVVDKEPVGIDIEKISQRLDWRTLVTCFSTNEAHQI 135
QP ++ S+S+ + V+ ++ +GIDIEKI + L ++ QI
Sbjct: 76 RQPLWPDGLFGSISHCATTALAVISRQRIGIDIEKIMSQHTATELAPSIIDSDERQI 132


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS00850  CARBMTKINASE  32     0.002    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 31.7 bits (72), Expect = 0.002
Identities = 23/84 (27%), Positives = 41/84 (48%), Gaps = 7/84 (8%)

Query: 155 INADTLAYFIASSLKAPIYV-LSNIAGVLIN-----DVVIPQLPLVDIHQYIEHGD-IYG 207
I+ D +A + A I++ L+++ G + + + ++ + ++ +Y E G G
Sbjct: 213 IDKDLAGEKLAEEVNADIFMILTDVNGAALYYGTEKEQWLREVKVEELRKYYEEGHFKAG 272

Query: 208 GMIPKVLDAKNAIENGCPKVIIAS 231
M PKVL A IE G + IIA
Sbjct: 273 SMGPKVLAAIRFIEWGGERAIIAH 296


36  SACOL_RS01020  SACOL_RS01040  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
SACOL_RS01020  1       18       1.587363   hexose phosphate transporter
SACOL_RS01025  2       17       1.234019   DNA-binding response regulator
SACOL_RS01030  0       15       0.560136   histidine kinase
SACOL_RS01035  -1      15       0.752678   lipoprotein
SACOL_RS01040  -2      14       0.862063   formate acetyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
SACOL_RS01020  TCRTETA  37     9e-05    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 37.5 bits (87), Expect = 9e-05
Identities = 53/361 (14%), Positives = 121/361 (33%), Gaps = 40/361 (11%)

Query: 30 AFFVVFFVYMAMYLIRNNFKAAQPFLKEEIGLSTLELGYIGL---AFSITYGLGKTLLGY 86
V + + LI P L ++ S + G+ +++ +LG
Sbjct: 10 ILSTVALDAVGIGLI----MPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGA 65

Query: 87 FVDGRNTKRIISFLLILSAITVLIMGFVLSYFGSVMGLLIVLWGLNGVFQSVGGPASYST 146
D + ++ L +A+ IM + F V+ + ++ G+ G G + +
Sbjct: 66 LSDRFGRRPVLLVSLAGAAVDYAIMAT--APFLWVLYIGRIVAGITG----ATGAVAGAY 119

Query: 147 ISRWAPRTKRGRYLGFWNTSHNIGGAIAGGVALWGANVFFHGNVIGMFIFPSVIALLIGI 206
I+ +R R+ GF + G + H F + + L +
Sbjct: 120 IADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAP----FFAAAALNGLNFL 175

Query: 207 ATLFIGKDDPEELGWNRAEEIWEEPVDKENIDSQGMTKWEIFKKYILGNPVIWILCVSNV 266
F+ + + + P+ +E ++ +W V ++ V +
Sbjct: 176 TGCFLLPE---------SHKGERRPLRREALNPLASFRWARGMT-----VVAALMAVFFI 221

Query: 267 FVYIVRIGIDNWAPLYVSEHLHFSKGDAVNTIFYFEI-GALVASLLWGYVSDLLKGRRAI 325
+ ++ W ++ + H+ ++ F I +L +++ G V+ L RRA+
Sbjct: 222 MQLVGQVPAALWV-IFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRAL 280

Query: 326 VAIGCMFMITFVVLFYTNATSVMMVNISLFALGALIFGPQLLIGVSLTGFVPKNAISVAN 385
+ +++L + + + L A G I P +L + +
Sbjct: 281 MLGMIADGTGYILLAFATRGWMAFPIMVLLASGG-IGMP------ALQAMLSRQVDEERQ 333

Query: 386 G 386
G
Sbjct: 334 G 334


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits    Score  E-value  Comments
SACOL_RS01025  HTHFIS  83     3e-20    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 82.6 bits (204), Expect = 3e-20
Identities = 42/169 (24%), Positives = 72/169 (42%), Gaps = 12/169 (7%)

Query: 3 KVVICDDERIIREGLKQIIPWGDYHFNTIYTAKDGVEALSLIQQHQPELVITDIRMPRKN 62
+++ DD+ IR L Q + Y + + I +LV+TD+ MP +N
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGY---DVRITSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 63 GVDLLNDI--AHLDCNVIILSSYDDFEYMKAGIQHHVLDYLLKPVDHAQLEVILGRLVRT 120
DLL I A D V+++S+ + F + DYL KP D L ++G + R
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFD---LTELIGIIGRA 118

Query: 121 LLEQQSQNGRSLASCHDAFQPLLKVEYDDYYVNQIVDQIKQSYQTKVTV 169
L E + + + D PL+ + +I + + QT +T+
Sbjct: 119 LAEPKRRPSKLEDDSQD-GMPLVG---RSAAMQEIYRVLARLMQTDLTL 163


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
SACOL_RS01030  PF06580  147    5e-42    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 147 bits (372), Expect = 5e-42
Identities = 55/226 (24%), Positives = 109/226 (48%), Gaps = 16/226 (7%)

Query: 288 YIYDLFESNEQLIHSIEHTERRLRDIQLKEIERQFQPHFLFNTMQTIQYLITLSPKLAQT 347
+ + F++ +Q ++ QL ++ Q PHF+FN + I+ LI P A+
Sbjct: 136 FGWHFFKNYKQAEIDQWKMASMAQEAQLMALKAQINPHFMFNALNNIRALILEDPTKARE 195

Query: 348 VVQQLSQMLRYSLR-TNSHTVELNEELNYIEQYVAIQNIRFDDMIKLHIESSEEARHQTI 406
++ LS+++RYSLR +N+ V L +EL ++ Y+ + +I+F+D ++ + + +
Sbjct: 196 MLTSLSELMRYSLRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQINPAIMDVQV 255

Query: 407 GKMMLQPLIENAIKHGRDTESLDITIRLTLARQN--LHVLVCDNGIGMSSSRLQYVRQSL 464
M++Q L+EN IKHG I L + N + + V + G L+ ++S
Sbjct: 256 PPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLA----LKNTKES- 310

Query: 465 NNDVFDTKHLGLNHLHNKAMIQYGSHARLHIFSKRNQGTLICYKIP 510
GL ++ + + YG+ A++ + K+ + + IP
Sbjct: 311 -------TGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMVL-IP 348


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS01040  SHAPEPROTEIN  32     0.006    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 32.4 bits (74), Expect = 0.006
Identities = 18/54 (33%), Positives = 29/54 (53%), Gaps = 5/54 (9%)

Query: 257 AYLAAIKEQNGAAMSLGRTSTFLDIYAERDLKAGVITESEV-QEIIDHFIMKLR 309
+AA+ A LGRT +I A R +K GVI + V ++++ HFI ++
Sbjct: 50 KSVAAVGHD--AKQMLGRTPG--NIAAIRPMKDGVIADFFVTEKMLQHFIKQVH 99


37  SACOL_RS02365  SACOL_RS02420  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
SACOL_RS02365  -3      15       -0.590423  NAD(P)-dependent oxidoreductase
SACOL_RS02370  1       16       -1.750189  hypothetical protein
SACOL_RS02375  1       15       -2.105904  hypothetical protein
SACOL_RS02380  1       16       -1.354214  hypothetical protein
SACOL_RS02385  2       16       0.452688   hypothetical protein
SACOL_RS02390  1       14       -0.493358  hypothetical protein
SACOL_RS02395  1       12       -1.263049  hypothetical protein
SACOL_RS02400  3       10       -0.943277  hypothetical protein
SACOL_RS02405  2       10       -0.461347  hypothetical protein
SACOL_RS02410  3       10       -0.985397  type I restriction-modification system subunit
SACOL_RS02415  8       11       -3.256639  restriction endonuclease subunit S
SACOL_RS02420  7       9        -3.043643  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS02365  NUCEPIMERASE  31     0.004    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 30.9 bits (70), Expect = 0.004
Identities = 29/167 (17%), Positives = 62/167 (37%), Gaps = 32/167 (19%)

Query: 1 MNIMLTGATGHLGTHITNQAIANHIDHFHIGVRNV----------EKVPEDWRGKVPVRQ 50
M ++TGA G +G H++ + + H +G+ N+ ++ + +
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEA--GHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHK 58

Query: 51 LDYFNQESMVEAFK--GMDTVVFI-------PSIIHP-SFKRIPEV--ENLVYAAKQSGV 98
+D ++E M + F + V S+ +P ++ N++ + + +
Sbjct: 59 IDLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKI 118

Query: 99 AHIIFIG---YYADQHNNPFHMS-----PYFGYAARLLATSGIDYTY 137
H+++ Y PF P YAA A + +TY
Sbjct: 119 QHLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTY 165


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS02370  TOXICSSTOXIN  89     6e-24    Staphylococcal toxic shock syndrome toxin signature.
		>TOXICSSTOXIN#Staphylococcal toxic shock syndrome toxin signature.

Length = 234

Score = 89.3 bits (221), Expect = 6e-24
Identities = 50/212 (23%), Positives = 86/212 (40%), Gaps = 14/212 (6%)

Query: 21 ITSNVQSVQAKAEVKQQSESELKHYYNKPILERKNVTGFKYTDEGKHYLEVTVGQQHSRI 80
++SN AKA + +L +Y+ N D + + +
Sbjct: 29 LSSNQIIKTAKASTNDNIK-DLLDWYSSGSDTFTNSEVL---DNSLGSMRIKNTDGSISL 84

Query: 81 TLLGSDKDKFKDGENSNIDVFILREGDSRQATN-----YSIGGVTKSNSVQYIDYINTPI 135
+ S + +D+ R S+ + + I GVT + + I P
Sbjct: 85 IIFPSPYYSPAFTKGEKVDLNTKRTKKSQHTSEGTYIHFQISGVTNTEKLP--TPIELP- 141

Query: 136 LEIKKDNEDV-LKDFYYISKEDISLKELDYRLRERAIKQHGLYSNGLKQGQI-TITMNDG 193
L++K +D LK K+ +++ LD+ +R + + HGLY + K G ITMNDG
Sbjct: 142 LKVKVHGKDSPLKYGPKFDKKQLAISTLDFEIRHQLTQIHGLYRSSDKTGGYWKITMNDG 201

Query: 194 TTHTIDLSQKLEKERMGESIDGTKINKILVEM 225
+T+ DLS+K E I+ +I I E+
Sbjct: 202 STYQSDLSKKFEYNTEKPPINIDEIKTIEAEI 233


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS02375  TOXICSSTOXIN  88     2e-23    Staphylococcal toxic shock syndrome toxin signature.
		>TOXICSSTOXIN#Staphylococcal toxic shock syndrome toxin signature.

Length = 234

Score = 88.2 bits (218), Expect = 2e-23
Identities = 38/203 (18%), Positives = 78/203 (38%), Gaps = 22/203 (10%)

Query: 37 ISENSKKLKAYYNQPSIEYKNVTGYISFIQPSIKFMNIIDGNSVNNIALIGKDKQHYHTG 96
++N K L +Y+ S + N + S+ M I + + ++ +
Sbjct: 42 TNDNIKDLLDWYSSGSDTFTN----SEVLDNSLGSMRIKNTDGSISLIIFPSPYYSPAFT 97

Query: 97 VHRNLNIFYVN-----EDKRFEGAKYSIGGITSANDKA--VDLIAEARVIKEDHTGEYDY 149
+++ + I G+T+ ++L + +V +D +Y
Sbjct: 98 KGEKVDLNTKRTKKSQHTSEGTYIHFQISGVTNTEKLPTPIELPLKVKVHGKDSPLKYGP 157

Query: 150 DFFPFKIDKEAMSLKEIDFKLRKYLIDNYGLYGEMST----GKITVKKKYYGKYTFELDK 205
F DK+ +++ +DF++R L +GLY KIT+ Y +L K
Sbjct: 158 KF-----DKKQLAISTLDFEIRHQLTQIHGLYRSSDKTGGYWKITMNDG--STYQSDLSK 210

Query: 206 KLQEDRMSDVINVTDIDRIEIKV 228
K + + IN+ +I IE ++
Sbjct: 211 KFEYNTEKPPINIDEIKTIEAEI 233


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS02380  TOXICSSTOXIN  81     7e-20    Staphylococcal toxic shock syndrome toxin signature.
		>TOXICSSTOXIN#Staphylococcal toxic shock syndrome toxin signature.

Length = 234

Score = 80.9 bits (199), Expect = 7e-20
Identities = 42/231 (18%), Positives = 81/231 (35%), Gaps = 25/231 (10%)

Query: 132 VTTPPSTNTPQPMQSTKSDTPQSPTIKQAQTDMTPKYEDLRAYYTKPSFEFEKQFGFMLK 191
+ T + TP P+ S + IK A+ +DL +Y+ S F +
Sbjct: 17 LATTATDFTPVPLSSNQ-------IIKTAKASTNDNIKDLLDWYSSGSDTF-TNSEVLDN 68

Query: 192 PWTTVRFMNVIPNRFIYKIALVGKDEKKYKDGPYDNIDV-----FIVLEDNKYQLKKYSV 246
++R N + + + + +D+ ++ + +
Sbjct: 69 SLGSMRIKNTDGSI---SLIIFPSPYYSPAFTKGEKVDLNTKRTKKSQHTSEGTYIHFQI 125

Query: 247 GGITKTNSKKVNHKVELSITKKDNQGMISRDVSEYMITKEEISLKELDFKLRKQLIEKHN 306
G+T T ++ L + + K+++++ LDF++R QL + H
Sbjct: 126 SGVTNTEKLPTPIELPLKVKVHGKDSPLKYG---PKFDKKQLAISTLDFEIRHQLTQIHG 182

Query: 307 LYGNM--GSGTIVIKMKNGGKYTFELHKKLQEHRMA----GTNIDNIEVNI 351
LY + G I M +G Y +L KK + + I IE I
Sbjct: 183 LYRSSDKTGGYWKITMNDGSTYQSDLSKKFEYNTEKPPINIDEIKTIEAEI 233


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS02390  TOXICSSTOXIN  97     3e-26    Staphylococcal toxic shock syndrome toxin signature.
		>TOXICSSTOXIN#Staphylococcal toxic shock syndrome toxin signature.

Length = 234

Score = 97.4 bits (242), Expect = 3e-26
Identities = 46/231 (19%), Positives = 85/231 (36%), Gaps = 21/231 (9%)

Query: 84 VTTPPSTNTPQPMQSTKSDTPQSPTTKQVPTEINPKFKDLRAYYTKPSLEFKNEIGIILK 143
+ T + TP P+ S + K N KDL +Y+ S F N ++
Sbjct: 17 LATTATDFTPVPLSSNQ-------IIKTAKASTNDNIKDLLDWYSSGSDTFTN-SEVLDN 68

Query: 144 KWTTIRFMNVVPDYFIYKIALVGKDDKKYGEGVHRNVDV-----FVVLEENNYNLEKYSV 198
++R N + + VD+ + + +
Sbjct: 69 SLGSMRIKNTDGSI---SLIIFPSPYYSPAFTKGEKVDLNTKRTKKSQHTSEGTYIHFQI 125

Query: 199 GGITKSNSKKVDHKAGVRITKEDNKGTISHDVSEFKITKEQISLKELDFKLRKQLIEKNN 258
G+T + + +++ + + K K+Q+++ LDF++R QL + +
Sbjct: 126 SGVTNTEKLPTPIELPLKVKVHGKDSPLKYG---PKFDKKQLAISTLDFEIRHQLTQIHG 182

Query: 259 LYGNV--GSGKIVIKMKNGGKYTFELHKKLQENRMADVINSEQIKNIEVNL 307
LY + G I M +G Y +L KK + N IN ++IK IE +
Sbjct: 183 LYRSSDKTGGYWKITMNDGSTYQSDLSKKFEYNTEKPPINIDEIKTIEAEI 233


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS02395  TOXICSSTOXIN  132    3e-40    Staphylococcal toxic shock syndrome toxin signature.
		>TOXICSSTOXIN#Staphylococcal toxic shock syndrome toxin signature.

Length = 234

Score = 132 bits (332), Expect = 3e-40
Identities = 39/197 (19%), Positives = 71/197 (36%), Gaps = 15/197 (7%)

Query: 43 INMLHQYYSEESFEPTNISVKSEDYYGSNVLNFKQRNKAFKVFLLGDDKNKY------KE 96
I L +YS S TN V + K + + + + K
Sbjct: 46 IKDLLDWYSSGSDTFTNSEVLD---NSLGSMRIKNTDGSISLIIFPSPYYSPAFTKGEKV 102

Query: 97 KTHGLDVFAVPELIDIKGGIYSVGGITKKNVRSVFGFVSNPSLQVKKVDAKNGFSINELF 156
+ + + + G+T + P L+VK + F
Sbjct: 103 DLNTKRTKKSQHTSEGTYIHFQISGVTNTEKLP--TPIELP-LKVKVHGKDSPLKYGPKF 159

Query: 157 FIQKEEVSLKELDFKIRKLLIEKYRLYKGTS-DKGRIVINMKDEKKHEIDLSEKLSFERM 215
K+++++ LDF+IR L + + LY+ + G I M D ++ DLS+K +
Sbjct: 160 --DKKQLAISTLDFEIRHQLTQIHGLYRSSDKTGGYWKITMNDGSTYQSDLSKKFEYNTE 217

Query: 216 FDVMDSKQIKNIEVNLN 232
++ +IK IE +N
Sbjct: 218 KPPINIDEIKTIEAEIN 234


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS02400  TOXICSSTOXIN  193    4e-64    Staphylococcal toxic shock syndrome toxin signature.
		>TOXICSSTOXIN#Staphylococcal toxic shock syndrome toxin signature.

Length = 234

Score = 193 bits (491), Expect = 4e-64
Identities = 51/202 (25%), Positives = 92/202 (45%), Gaps = 10/202 (4%)

Query: 31 KQNQKSVNKHDKEALYRYYTGKTMEMKNISALKHGKNNLRFKFRGIKIQVLLPGNDKSKF 90
K + S N + K+ L Y +G + N L + ++R K I +++ +
Sbjct: 36 KTAKASTNDNIKDLLDWYSSG-SDTFTNSEVLDNSLGSMRIKNTDGSISLIIFPSPYYSP 94

Query: 91 QQRSYEGLDVFFVQEKRDKHD-----IFYTVGGVIQNNKTSGVVSAPILNISKEKGEDAF 145
E +D+ + K+ +H I + + GV K + P L + K G+D+
Sbjct: 95 AFTKGEKVDLNTKRTKKSQHTSEGTYIHFQISGVTNTEKLPTPIELP-LKV-KVHGKDSP 152

Query: 146 VKGYPYYIKKEKITLKELDYKLRKHLIEKYGLYKTISKDGRV-KISLKDGSFYNLDLRSK 204
+K Y K+++ + LD+++R L + +GLY++ K G KI++ DGS Y DL K
Sbjct: 153 LK-YGPKFDKKQLAISTLDFEIRHQLTQIHGLYRSSDKTGGYWKITMNDGSTYQSDLSKK 211

Query: 205 LKFKYMGEVIESKQIKDIEVNL 226
++ I +IK IE +
Sbjct: 212 FEYNTEKPPINIDEIKTIEAEI 233


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS02420  TOXICSSTOXIN  108    2e-31    Staphylococcal toxic shock syndrome toxin signature.
		>TOXICSSTOXIN#Staphylococcal toxic shock syndrome toxin signature.

Length = 234

Score = 108 bits (272), Expect = 2e-31
Identities = 43/225 (19%), Positives = 79/225 (35%), Gaps = 21/225 (9%)

Query: 16 LTTGMITTTAQPVKASTLEVRSQAT-------QDLSEYYNRPFFEYTNQSGYKEEGKVTF 68
L T PV S+ ++ A +DL ++Y+ +TN
Sbjct: 15 LLLATTATDFTPVPLSSNQIIKTAKASTNDNIKDLLDWYSSGSDTFTNSEVLDNSLGSMR 74

Query: 69 TPNYQLIDVTLTGNEKQNF-------GEDISNVDIFVVRENSDRSGNTASIGGITKTNGS 121
N + D++ + S+ + I G+T T
Sbjct: 75 IKNTDGSISLIIFPSPYYSPAFTKGEKVDLNTKRTKKSQHTSEGTYIHFQISGVTNTE-- 132

Query: 122 NYIDKVKDVNLIITKNIDSVTSTSTSSTYTINKEEISLKELDFKLRKHLIDKHNLYKTEP 181
+ L + + S +K+++++ LDF++R L H LY++
Sbjct: 133 ---KLPTPIELPLKVKVHGKDS-PLKYGPKFDKKQLAISTLDFEIRHQLTQIHGLYRSSD 188

Query: 182 KDSKI-RITMKDGGFYTFELNKKLQTHRMGDVIDGRNIEKIEVNL 225
K +ITM DG Y +L+KK + + I+ I+ IE +
Sbjct: 189 KTGGYWKITMNDGSTYQSDLSKKFEYNTEKPPINIDEIKTIEAEI 233


38  SACOL_RS06010  SACOL_RS06045  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
SACOL_RS06010  1       17       -2.554079  alpha-hemolysin
SACOL_RS06015  0       14       -2.610176  hypothetical protein
SACOL_RS06020  0       14       -1.780972  hypothetical protein
SACOL_RS06025  1       11       -0.513620  hypothetical protein
SACOL_RS06030  1       11       -0.397355  hypothetical protein
SACOL_RS06035  1       12       -0.144410  hypothetical protein
SACOL_RS06040  1       13       0.057379   ornithine carbamoyltransferase
SACOL_RS06045  2       14       0.043044   carbamate kinase 1
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS06010  BICOMPNTOXIN  314    e-109    Staphylococcal bi-component toxin signature.
		>BICOMPNTOXIN#Staphylococcal bi-component toxin signature.

Length = 315

Score = 314 bits (805), Expect = e-109
Identities = 73/318 (22%), Positives = 145/318 (45%), Gaps = 24/318 (7%)

Query: 9 VTTTLLLGSILMNPVANAADSDINIKTGTTDIGSNTTVKTGDLVTYDKEN--GMHKKVFY 66
+TTTL + L+ P+AN + T DIG + ++ N G+ + + +
Sbjct: 7 LTTTLSVS--LLAPLANPLLENAKAANDTEDIGKGSDIEIIKRTEDKTSNKWGVTQNIQF 64

Query: 67 SFIDDKNHNKKLLVIRTKGTIAGQYRVYSEEGANKS-GLAWPSAFKVQLQLPDNEVAQIS 125
F+ DK +NK L+++ +G I+ + Y+ + N + WP + + L+ D V+ I
Sbjct: 65 DFVKDKKYNKDALILKMQGFISSRTTYYNYKKTNHVKAMRWPFQYNIGLKTNDKYVSLI- 123

Query: 126 DYYPRNSIDTKEYMSTLTYGFNGNVTGDDTGKIGGLIGANVSIGHTLKYVQPDFKTILES 185
+Y P+N I++ TL Y GN + +GG N S ++ Y Q ++ + +E
Sbjct: 124 NYLPKNKIESTNVSQTLGYNIGGNFQSAPS--LGGNGSFNYS--KSISYTQQNYVSEVEQ 179

Query: 186 PTDKKVGWKVIFNNMVNQNWGPYDRDSWNPVYGNQLFMKTRNGSMKAADNFLDPNKASSL 245
K V W V N+ ++ + + LF+ + S D F+ ++ L
Sbjct: 180 QNSKSVLWGVKANSFATESG-------QKSAFDSDLFVGYKPHSKDPRDYFVPDSELPPL 232

Query: 246 LSSGFSPDFATVITMDRKASKQQTNIDVIYERVRD-----DYQLHWTSTNWKGTNTKDKW 300
+ SGF+P F ++ + K S + ++ Y R D H+ ++ G + +
Sbjct: 233 VQSGFNPSFIATVSHE-KGSSDTSEFEITYGRNMDVTHAIKRSTHYGNSYLDGHRVHNAF 291

Query: 301 IDRS-SERYKIDWEKEEM 317
++R+ + +Y+++W+ E+
Sbjct: 292 VNRNYTVKYEVNWKTHEI 309


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS06025  TOXICSSTOXIN  49     3e-09    Staphylococcal toxic shock syndrome toxin signature.
		>TOXICSSTOXIN#Staphylococcal toxic shock syndrome toxin signature.

Length = 234

Score = 48.9 bits (116), Expect = 3e-09
Identities = 53/223 (23%), Positives = 87/223 (39%), Gaps = 12/223 (5%)

Query: 1 MSKNITKNIILTTTLLLLGTVLPQNQKPVFSFYSEAKAYSIGQDETNINELIKYYTQPHF 60
M+K + N + + LLL T P+ S A + D NI +L+ +Y+
Sbjct: 1 MNKKLLMNFFIVSPLLLATTATDFTPVPLSSNQIIKTAKASTND--NIKDLLDWYSSGSD 58

Query: 61 SFSNKWLYQYDNGNIYVELKRYSWSAHISLWGAESWGNINQLKDRYVDVFGLKD-KDTDQ 119
+F+N DN + +K S + ++ + + + K VD+ + K
Sbjct: 59 TFTN--SEVLDNSLGSMRIKNTDGSISLIIFPSP-YYSPAFTKGEKVDLNTKRTKKSQHT 115

Query: 120 LWWSYRETFTGGVTPAAK-PSDKTYNLFVQYKDKLQTIIGAHKIYQGNKPVLTLKEIDFR 178
+Y GVT K P+ L V+ K + K +K L + +DF
Sbjct: 116 SEGTYIHFQISGVTNTEKLPTPIELPLKVKVHGKDSPLKYGPKF---DKKQLAISTLDFE 172

Query: 179 AREALIKNKILYNENRNKGKL-KIT-GGGNNYTIDLSKRLHSD 219
R L + LY + G KIT G+ Y DLSK+ +
Sbjct: 173 IRHQLTQIHGLYRSSDKTGGYWKITMNDGSTYQSDLSKKFEYN 215


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS06030  TOXICSSTOXIN  58     3e-12    Staphylococcal toxic shock syndrome toxin signature.
		>TOXICSSTOXIN#Staphylococcal toxic shock syndrome toxin signature.

Length = 234

Score = 57.7 bits (139), Expect = 3e-12
Identities = 55/228 (24%), Positives = 92/228 (40%), Gaps = 15/228 (6%)

Query: 16 LLLGTASTQFPNTPINSSSEAKAYYINQNETNVNELTKYYSQKYLTFSNSTLWQKDNGTI 75
LLL T +T F P++S+ K + N+ N+ +L +YS TF+NS + G++
Sbjct: 15 LLLATTATDFTPVPLSSNQIIKTAKASTND-NIKDLLDWYSSGSDTFTNSEVLDNSLGSM 73

Query: 76 HATLLQFSWYSHIQVYGPESWGNINQLRNKSVDIFGI---KDQETIDSFALSQETFTGGV 132
++ + S + P + + + + VD+ K Q T + + + GV
Sbjct: 74 R---IKNTDGSISLIIFPSPYYSPAFTKGEKVDLNTKRTKKSQHTSEGTYIHFQI--SGV 128

Query: 133 TPA-ATSNDKHYKLNVTYKDKAETFTGGFPVYEGNKPVLTLKELDFRIRQTLIKSKKLYN 191
T L V K + + + +K L + LDF IR L + LY
Sbjct: 129 TNTEKLPTPIELPLKV--KVHGKDSPLKYG-PKFDKKQLAISTLDFEIRHQLTQIHGLYR 185

Query: 192 NSYNKGQI-KITGADNN-YTIDLSKRLPSTDANRYVKKPQNAKIEVIL 237
+S G KIT D + Y DLSK+ + + IE +
Sbjct: 186 SSDKTGGYWKITMNDGSTYQSDLSKKFEYNTEKPPINIDEIKTIEAEI 233


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS06035  TOXICSSTOXIN  61     2e-13    Staphylococcal toxic shock syndrome toxin signature.
		>TOXICSSTOXIN#Staphylococcal toxic shock syndrome toxin signature.

Length = 234

Score = 61.2 bits (148), Expect = 2e-13
Identities = 62/222 (27%), Positives = 96/222 (43%), Gaps = 17/222 (7%)

Query: 2 KKNIMNKLVLSTALLLLETTSTQLPKTPISFSSEAKAYNISENETNINELIKYYTQPHFS 61
KK +MN ++S LLL TT+T P+S + K S N+ NI +L+ +Y+ +
Sbjct: 3 KKLLMNFFIVSP--LLLATTATDFTPVPLSSNQIIKTAKASTND-NIKDLLDWYSSGSDT 59

Query: 62 LSGKWLWQKPNGSIHATLQTWVWYSHIQVFGSESWGNINQLRNKYVDIFGT---KDEDTV 118
+ + GS+ ++ + +F S + + + + VD+ K + T
Sbjct: 60 FTNSEVLDNSLGSMR--IKNTDGSISLIIFPSP-YYSPAFTKGEKVDLNTKRTKKSQHTS 116

Query: 119 EGYWTYDETFTGGVTPA-ATSSDKPYRLFLKYSDKQQTIIGGHEFYKGNKPVLTLKELDF 177
EG TY GVT + L +K K + G +F +K L + LDF
Sbjct: 117 EG--TYIHFQISGVTNTEKLPTPIELPLKVKVHGKDSPLKYGPKF---DKKQLAISTLDF 171

Query: 178 RIRQTLIKNKKLYNGEFNKGQI-KIT-ADGNNYTIDLSKKLK 217
IR L + LY G KIT DG+ Y DLSKK +
Sbjct: 172 EIRHQLTQIHGLYRSSDKTGGYWKITMNDGSTYQSDLSKKFE 213


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS06045  CARBMTKINASE  388    e-138    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 388 bits (998), Expect = e-138
Identities = 144/311 (46%), Positives = 210/311 (67%), Gaps = 7/311 (2%)

Query: 3 KIVVALGGNALGK-----SPQEQLELVKNTAKSLVGLITKGHEIVISHGNGPQVGSINLG 57
++V+ALGGNAL + S +E ++ V+ TA+ + +I +G+E+VI+HGNGPQVGS+ L
Sbjct: 4 RVVIALGGNALQQRGQKGSYEEMMDNVRKTARQIAEIIARGYEVVITHGNGPQVGSLLLH 63

Query: 58 LNYAAEHNQGPAFPFAECGAMSQAYIGYQLQESLQNELHSIGMDKQVVTLVTQVEVDEND 117
++ PA P GAMSQ +IGY +Q++L+NEL GM+K+VVT++TQ VD+ND
Sbjct: 64 MDAGQATYGIPAQPMDVAGAMSQGWIGYMIQQALKNELRKRGMEKKVVTIITQTIVDKND 123

Query: 118 PAFNNPSKPIGLFYNKEEAEQIQKEKGFIFVEDAGRGYRRVVPSPQPISIIELESIKTLI 177
PAF NP+KP+G FY++E A+++ +EKG+I ED+GRG+RRVVPSP P +E E+IK L+
Sbjct: 124 PAFQNPTKPVGPFYDEETAKRLAREKGWIVKEDSGRGWRRVVPSPDPKGHVEAETIKKLV 183

Query: 178 KNDTLVIAAGGGGIPVIREQHDGFKGIDAVIDKDKTSALLGANIQCDQLIILTAIDYVYI 237
+ +VIA+GGGG+PVI E KG++AVIDKD L + D +ILT ++ +
Sbjct: 184 ERGVIVIASGGGGVPVILE-DGEIKGVEAVIDKDLAGEKLAEEVNADIFMILTDVNGAAL 242

Query: 238 NFNTENQQPLKTTNVDELKRYIDENQFAKGSMLPKIEAAISFIENNPKGSVLITSLNELD 297
+ TE +Q L+ V+EL++Y +E F GSM PK+ AAI FIE + ++ I L +
Sbjct: 243 YYGTEKEQWLREVKVEELRKYYEEGHFKAGSMGPKVLAAIRFIEWGGERAI-IAHLEKAV 301

Query: 298 AALEGKVGTVI 308
ALEGK GT +
Sbjct: 302 EALEGKTGTQV 312


39  SACOL_RS06365  SACOL_RS06395  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
SACOL_RS06365  2       10       -0.107332  beta-ketoacyl-ACP reductase
SACOL_RS06370  3       10       -0.847357  hypothetical protein
SACOL_RS06375  1       10       -0.294372  acyl carrier protein
SACOL_RS06380  1       10       -0.213616  ribonuclease 3
SACOL_RS06385  1       11       -0.408743  chromosome segregation protein SMC
SACOL_RS06390  0       14       0.698037   signal recognition particle-docking protein
SACOL_RS06395  2       14       0.957094   DNA-binding protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS06365  DHBDHDRGNASE  144    1e-44    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 144 bits (365), Expect = 1e-44
Identities = 85/250 (34%), Positives = 136/250 (54%), Gaps = 13/250 (5%)

Query: 3 KSALVTGASRGIGRSIALQLAEEGYNV-AVNYAGSKEKAEAVVEEIKAKGVDSFAIQANV 61
K A +TGA++GIG ++A LA +G ++ AV+Y + EK E VV +KA+ + A A+V
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDY--NPEKLEKVVSSLKAEARHAEAFPADV 66

Query: 62 ADADEVKAMIKEVVSQFGSLDVLVNNAGITRDNLLMRMKEQEWDDVIDTNLKGVFNCIQK 121
D+ + + + + G +D+LVN AG+ R L+ + ++EW+ N GVFN +
Sbjct: 67 RDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRS 126

Query: 122 ATPQMLRQRSGAIINLSSVVGAVGNPGQANYVATKAGVIGLTKSAARELASRGITVNAVA 181
+ M+ +RSG+I+ + S V A Y ++KA + TK ELA I N V+
Sbjct: 127 VSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVS 186

Query: 182 PGFIVSDMTDAL--SDELKEQML--------TQIPLARFGQDTDIANTVAFLASDKAKYI 231
PG +DM +L + EQ++ T IPL + + +DIA+ V FL S +A +I
Sbjct: 187 PGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGHI 246

Query: 232 TGQTIHVNGG 241
T + V+GG
Sbjct: 247 TMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS06375  ACRIFLAVINRP  26     0.012    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 26.3 bits (58), Expect = 0.012
Identities = 10/42 (23%), Positives = 17/42 (40%), Gaps = 2/42 (4%)

Query: 33 GADSLDIAELVMELEDEFGTEIPDEEAEKINTVGDAVKFINS 74
GA++LD A+ + E P + K+ D F+
Sbjct: 296 GANALDTAKAIKAKLAELQPFFP--QGMKVLYPYDTTPFVQL 335


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score  E-value  Comments
SACOL_RS06385  GPOSANCHOR  54     2e-09    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 54.3 bits (130), Expect = 2e-09
Identities = 53/326 (16%), Positives = 119/326 (36%), Gaps = 23/326 (7%)

Query: 170 KYKKRKAESLNKLDQTEDNLTRVEDILYDLEGRV-EPLKEEAAIAKEYKTLSHQMKHSDI 228
K K +E +K+ + E +E L + + E L+ + +
Sbjct: 103 KNDKSLSEKASKIQELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLE- 161

Query: 229 VVTVHDIDQYTNDNRQLDQRLNDLQGQQANKEADKQRLSQQIQQYKG-------KRHQLD 281
++ N + ++ L+ ++A EA + L + ++ K L+
Sbjct: 162 ----KALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLE 217

Query: 282 NDVESLNYQLVKATEAFEKYTGQLNVLEERKKNQSETNARYEEEQENLMELLENISNEIS 341
+ +L + +A E + K A E Q L + LE N +
Sbjct: 218 AEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFST 277

Query: 342 EAQDTYKSLKSKQKELNAVIRELEEQLYVSD----------EAHDEKLEEIKNEYYTLMS 391
K+L++++ L A +LE Q V + +A E ++++ E+ L
Sbjct: 278 ADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEAEHQKLEE 337

Query: 392 EQSDVNNDIRFLKHTIEENEAKKSRLDSRLVEVFEQLKDIQGQIKTTKKEYQQTNKELSA 451
+ + L+ ++ + K +L++ ++ EQ K + ++ +++ + +
Sbjct: 338 QNKISEASRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQ 397

Query: 452 VDKEIKNIEKDLTDTKKAQNEYEEKL 477
V+K ++ L +K E EE
Sbjct: 398 VEKALEEANSKLAALEKLNKELEESK 423



Score = 52.8 bits (126), Expect = 6e-09
Identities = 31/315 (9%), Positives = 94/315 (29%), Gaps = 18/315 (5%)

Query: 177 ESLNKLDQTEDNLTRVEDILYDLEGRVEPLKEEAAIAKEYKTLSHQMKHSDIVVTVHDID 236
E +K + + L L ++ +E + + I
Sbjct: 57 ERADKFEIENNTLKLKNSDLSFNNKALKDHNDELTEELSNAKEKLRKNDKSLSEKASKIQ 116

Query: 237 QYTNDNRQLDQRLNDLQGQQANKEADKQRLSQQIQQYKGKRHQLDNDVESLNYQLVKATE 296
+ L++ L A + L + ++ L+ +E +
Sbjct: 117 ELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSA 176

Query: 297 AFEKYTGQLNVLEERKKNQSETNARYEEEQENLMELLENISNEISEAQDTYKSLKSKQKE 356
+ + LE R+ + ++ + E + L+ +
Sbjct: 177 KIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEG 236

Query: 357 LNAVIRELEEQLYVSDEAHDEKLEEIKNEYYTLMSEQSDVNNDIRFLKHTIEENEAKKSR 416
++ + + + ++ L N I+ EA+K+
Sbjct: 237 AMNFSTADSAKI----KTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAA 292

Query: 417 LDSRLVEVFEQLKDIQGQIKTTKK--------------EYQQTNKELSAVDKEIKNIEKD 462
L++ ++ Q + + ++ ++ E+Q+ ++ + +++ +D
Sbjct: 293 LEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRD 352

Query: 463 LTDTKKAQNEYEEKL 477
L +++A+ + E +
Sbjct: 353 LDASREAKKQLEAEH 367



Score = 33.9 bits (77), Expect = 0.004
Identities = 39/269 (14%), Positives = 89/269 (33%), Gaps = 26/269 (9%)

Query: 669 KSKSILSQKDELTTMRHQL----EDYLRQTESFEQQFKELKIKSDQLSELYFEKSQKHNT 724
K K++ ++K L + L E + + + + K L+ + L E +
Sbjct: 142 KIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEG 201

Query: 725 LKEQVHHFEMELDRLTTQETQIKNDHEEFEFEKNDGYT-SDKSRQTLSEKETYLESIKAS 783
++ L ++ + + E S + E +++A
Sbjct: 202 AMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEAR 261

Query: 784 LKRLEDEIERYT-----------KLSKEGKESVTKTQQTLHQKQS----------DLAVV 822
LE +E L E + HQ Q DL
Sbjct: 262 QAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDAS 321

Query: 823 KERIKTQQQTIDRLNNQNQQTKHQLKDVKEKIAFFNSDEVMGEQAFQNIKDQINGQQETR 882
+E K + +L QN+ ++ + ++ + + E Q +++Q + +R
Sbjct: 322 REAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASR 381

Query: 883 TRLSDELDKLKQQRIELNEQIDAQEAKLQ 911
L +LD ++ + ++ + ++ +KL
Sbjct: 382 QSLRRDLDASREAKKQVEKALEEANSKLA 410


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score  E-value  Comments
SACOL_RS06390  SUBTILISIN  36     3e-04    Subtilisin serine protease family (S8) signature.
		>SUBTILISIN#Subtilisin serine protease family (S8) signature.

Length = 326

Score = 35.6 bits (82), Expect = 3e-04
Identities = 16/79 (20%), Positives = 29/79 (36%), Gaps = 11/79 (13%)

Query: 192 VGVNGVGKTTTIGKLAYRYKMEGKKVMLAAGDTFRAGAIDQLKVWGERVGVDVISQSEG- 250
GV GV + L + +L + + I Q + VD+IS S G
Sbjct: 101 NGVVGVAPEADL--LIIK--------VLNKQGSGQYDWIIQGIYYAIEQKVDIISMSLGG 150

Query: 251 SDPAAVMYDAINAAKNKGV 269
+ +++A+ A +
Sbjct: 151 PEDVPELHEAVKKAVASQI 169


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS06395  BONTOXILYSIN  26     0.037    Bontoxilysin signature.
		>BONTOXILYSIN#Bontoxilysin signature.

Length = 1196

Score = 26.0 bits (57), Expect = 0.037
Identities = 11/42 (26%), Positives = 23/42 (54%)

Query: 10 LRMNYLFDFYQSLLTNKQRNYLELFYLEDYSLSEIADTFNVS 51
L +NY + S++ ++ N L+ FY + Y + D +N++
Sbjct: 334 LNLNYFCQSFNSIIPDRFSNALKHFYRKQYYTMDYTDNYNIN 375


40  SACOL_RS07370  SACOL_RS07400  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
SACOL_RS07370  -1      10       0.113968   hypothetical protein
SACOL_RS07375  -1      10       0.273057   hypothetical protein
SACOL_RS07380  0       10       0.262722   hypothetical protein
SACOL_RS07385  0       10       -0.034136  dihydrolipoyllysine-residue succinyltransferase
SACOL_RS07390  -1      10       -0.710377  2-oxoglutarate dehydrogenase E1 component
SACOL_RS07395  1       13       -2.600619  two-component sensor histidine kinase
SACOL_RS07400  -2      11       -2.168344  DNA-binding response regulator
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits    Score  E-value  Comments
SACOL_RS07370  HTHFIS  29     0.027    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 28.6 bits (64), Expect = 0.027
Identities = 23/100 (23%), Positives = 39/100 (39%), Gaps = 14/100 (14%)

Query: 12 TVFNDAKALFDLNKNILLKGPTGSGKTKLAETL---SEVVDTPMHQVNC---SVDLDTES 65
++ L + +++ G +G+GK +A L + + P +N DL
Sbjct: 148 EIYRVLARLMQTDLTLMITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPRDLIESE 207

Query: 66 LLGF-KTIKTNAEGQQEIVFVDGPVIKAMKEGHILYIDEI 104
L G K T A+ + F EG L++DEI
Sbjct: 208 LFGHEKGAFTGAQTRSTGRF-------EQAEGGTLFLDEI 240


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits       Score  E-value  Comments
SACOL_RS07385  RTXTOXIND  29     0.035    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 29.0 bits (65), Expect = 0.035
Identities = 21/163 (12%), Positives = 54/163 (33%), Gaps = 11/163 (6%)

Query: 46 EVVSEEAGVLSEQLASEGDTVEVGQAIAIIGEGSGNASKENSNDNTPQQNEETNNKKEET 105
E+ E ++ E + EG++V G + + A D Q+ + E+T
Sbjct: 98 EIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEA------DTLKTQSSLLQARLEQT 151

Query: 106 TNNSVDKAEVNQANDDNQQRINATPSARRYARENGVNLAEVSPKTNDVVRKEDIDKKQQA 165
+ ++ + N + ++ P + + E + L + + + + K+
Sbjct: 152 RYQILSRSI--ELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNL 209

Query: 166 PASTQTTQQASAKEEKKYNQYPTKPVIREKMSRRKKTAAKKLL 208
A+ + N V + ++ K+ +
Sbjct: 210 DKKRAERLTVLARINRYENL---SRVEKSRLDDFSSLLHKQAI 249


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
SACOL_RS07395  PF06580  37     1e-04    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 37.2 bits (86), Expect = 1e-04
Identities = 31/185 (16%), Positives = 68/185 (36%), Gaps = 35/185 (18%)

Query: 277 IEEMNRIIKLVEELLELTKGDVNDISSEAQTVHINDE---IRSRIHSLKQLHPD-YQFDT 332
+E+ + +++ L EL + + S A+ V + DE + S + D QF+
Sbjct: 187 LEDPTKAREMLTSLSELMRYSLRY--SNARQVSLADELTVVDSYLQLASIQFEDRLQFEN 244

Query: 333 DLTSKNLEIKMKPHQFEQLFLIFIDNAIKYDVKNKK----IKVKTRLKNKQKIIEITDHG 388
+ +++++ P L ++N IK+ + I +K N +E+ + G
Sbjct: 245 QINPAIMDVQVPPM----LVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTG 300

Query: 389 IGIPEEDQDFIFDRFYRVDKSRSRSQGGNGLGLSIAQKIIQL---NGGSIKIKSEINKGT 445
+ ++ G GL ++ +Q+ IK+ + K
Sbjct: 301 SLALKNTKE------------------STGTGLQNVRERLQMLYGTEAQIKLSEKQGKVN 342

Query: 446 TFKII 450
+I
Sbjct: 343 AMVLI 347


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits    Score  E-value  Comments
SACOL_RS07400  HTHFIS  93     5e-24    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 92.6 bits (230), Expect = 5e-24
Identities = 30/125 (24%), Positives = 63/125 (50%), Gaps = 4/125 (3%)

Query: 2 TQILIVEDEQNLARFLELELTHENYNVDTEYDGQDGLDKALSHYYDLIILDLMLPSINGL 61
IL+ +D+ + L L+ Y+V + + DL++ D+++P N
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 62 EICRKIRQQQS-TPIIIITAKSDTYDKVAGLDYGADDYIVKPFDIEELLARIRAIL---R 117
++ +I++ + P+++++A++ + + GA DY+ KPFD+ EL+ I L +
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 118 RQPQK 122
R+P K
Sbjct: 124 RRPSK 128


41  SACOL_RS08150  SACOL_RS08190  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
SACOL_RS08150  0       17       -3.950406  prepilin-type N-terminal cleavage/methylation
SACOL_RS08155  1       13       -2.340409  competence protein ComGC
SACOL_RS08160  2       15       -2.445676  competence protein ComGB
SACOL_RS08165  -1      11       -2.637873  hypothetical protein
SACOL_RS08170  0       13       -2.914850  hydroxyacylglutathione hydrolase
SACOL_RS08175  -1      11       -3.110818  hypothetical protein
SACOL_RS08180  -1      10       -1.933816  glucokinase
SACOL_RS08185  0       11       -2.131703  hypothetical protein
SACOL_RS08190  -1      11       -2.184763  rhomboid family intramembrane serine protease
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS08150  BCTERIALGSPH  41     4e-07    Bacterial general secretion pathway protein H signa...
		>BCTERIALGSPH#Bacterial general secretion pathway protein H signature.

Length = 170

Score = 40.7 bits (95), Expect = 4e-07
Identities = 14/79 (17%), Positives = 38/79 (48%), Gaps = 4/79 (5%)

Query: 5 KQSAFTMIEMLVVMMLISIFLLLTMTSKGLSNLRVIDDEA-NIISFITELNYIKSQAIAN 63
+Q FT++EM+++++L+ + + + + S D A + F +L +++ + +
Sbjct: 2 RQRGFTLLEMMLILLLMGVSAGMVLLAFPAS---RDDSAAQTLARFEAQLRFVQQRGLQT 58

Query: 64 QGYINVRFYENSDTIKVIE 82
+ V + + V+E
Sbjct: 59 GQFFGVSVHPDRWQFLVLE 77


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS08155  BCTERIALGSPG  46     9e-10    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 46.4 bits (110), Expect = 9e-10
Identities = 19/76 (25%), Positives = 44/76 (57%), Gaps = 4/76 (5%)

Query: 3 KFLKKTQAFTLIEMLLVLLIISLLLILIIPNI--AKQTAHIQSTGCNAQVKMVNSQIEAY 60
+ K + FTL+E+++V++II +L L++PN+ K+ A Q + + + + ++ Y
Sbjct: 2 RATDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKA--VSDIVALENALDMY 59

Query: 61 ALKHNRNPSSIEDLIA 76
L ++ P++ + L +
Sbjct: 60 KLDNHHYPTTNQGLES 75


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS08160  BCTERIALGSPF  84     4e-20    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 84.1 bits (208), Expect = 4e-20
Identities = 65/347 (18%), Positives = 137/347 (39%), Gaps = 6/347 (1%)

Query: 14 KKRQLSKAQQIDLLSNLCNLLKYGFTLYQSFQFLNLQMTYKN-KQLGTTILSEISNGAPC 72
+K +LS + L L L+ L ++ + Q + QL + S++ G
Sbjct: 61 RKIRLSTSDLALLTRQLATLVAASMPLEEALDAVAKQSEKPHLSQLMAAVRSKVMEGHSL 120

Query: 73 NQIL-SLIGYSDTI-VMQVYLAERFGNIIDVLEETVNYMKVNRKSEQRLLKTLQYPLILV 130
+ G + + V E G++ VL +Y + ++ R+ + + YP +L
Sbjct: 121 ADAMKCFPGSFERLYCAMVAAGETSGHLDAVLNRLADYTEQRQQMRSRIQQAMIYPCVLT 180

Query: 131 SIFIAMIIILNLTVIPQFQQLYTSMNIQLSSFQKTLSFFITSLPTIIVVMLIIVSMLAII 190
+ IA++ IL V+P+ + + M L + L ++ T ML+ + +
Sbjct: 181 VVAIAVVSILLSVVVPKVVEQFIHMKQALPLSTRVLMGMSDAVRTFGPWMLLALLAGFMA 240

Query: 191 MKLIYNNLNMLNKIN-FVMKLPLISGYFQLFKTYFVTNELVLFYKNGITLQSIVDVYINH 249
+++ + ++ LPLI + T L + + + L + + +
Sbjct: 241 FRVMLRQEKRRVSFHRRLLHLPLIGRIARGLNTARYARTLSILNASAVPLLQAMRISGDV 300

Query: 250 SS-DPFRQFLGKYLLTYSEMGYGLPQILEKLKCFKPQLIKFVLQGEKRGKLEVELKLYSQ 308
S D R L E G L + LE+ F P + + GE+ G+L+ L+ +
Sbjct: 301 MSNDYARHRLSLATDAVRE-GVSLHKALEQTALFPPMMRHMIASGERSGELDSMLERAAD 359

Query: 309 ILVKQIEDKAIKQTQFLQPILFLILGLFIVAIYLVIMLPMFQMMQSI 355
++ + +P+L + + ++ I L I+ P+ Q+ +
Sbjct: 360 NQDREFSSQMTLALGLFEPLLVVSMAAVVLFIVLAILQPILQLNTLM 406


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score  E-value  Comments
SACOL_RS08170  SHIGARICIN  27     0.039    Ribosome inactivating protein family signature.
		>SHIGARICIN#Ribosome inactivating protein family signature.

Length = 289

Score = 27.5 bits (61), Expect = 0.039
Identities = 20/99 (20%), Positives = 38/99 (38%), Gaps = 11/99 (11%)

Query: 82 DFLKDPVKNGADKFKQYGLPIITSKVTPEK-------LNEGSTEIE-GFKFNVLHTPGHS 133
F+ + K + K Y +P++ S + + N I ++ G+
Sbjct: 39 VFISNLRKALPYERKLYDIPLLRSTLPGSQRYALIHLTNYADETISVAIDVTNVYVMGYR 98

Query: 134 PGSLTYVFDEFAVVG--DTLFNNGIGRTDL-YKGDYETL 169
G +Y F+E + +F + + L Y G+YE L
Sbjct: 99 AGDTSYFFNEASATEAAKYVFKDAKRKVTLPYSGNYERL 137


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
SACOL_RS08180  PF03309  30     0.011    Bvg accessory factor
		>PF03309#Bvg accessory factor

Length = 271

Score = 30.1 bits (68), Expect = 0.011
Identities = 32/154 (20%), Positives = 51/154 (33%), Gaps = 37/154 (24%)

Query: 5 ILAADVGGTTCKLGIFTPELEQ---LHKWSIHTD---TSDSTGYTLLKGIYDSFVEKVNE 58
+LA DV T +G+ + + + +W I T+ T+D + G+
Sbjct: 2 LLAIDVRNTHTVVGLISGSGDHAKVVQQWRIRTEPEVTADELA-LTIDGLI--------- 51

Query: 59 NNYNFSNVLGVGIG--VPGPVDFEKGTVNGAVNLYWPE------KVNVREIFEQFVDCPV 110
+ + G VP V E V + YWP + VR VD P
Sbjct: 52 -GDDAERLTGASGLSTVP-SVLHE---VRVMLEQYWPNVPHVLIEPGVRTGIPLLVDNPK 106

Query: 111 YVDND--ANIAALGEKHKGAGEGADDVVAITLGT 142
V D N A K+ + + G+
Sbjct: 107 EVGADRIVNCLAAYHKYGT------AAIVVDFGS 134


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
SACOL_RS08190  TCRTETA  33     0.002    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 33.3 bits (76), Expect = 0.002
Identities = 29/170 (17%), Positives = 54/170 (31%), Gaps = 51/170 (30%)

Query: 241 MLTVYFIAGLFGN--------FVSLSFNTTTISVGASGAIFGLIGSIFAMMY---VSKTF 289
++ V+FI L G F F+ ++G S A FG++ S+ M V+
Sbjct: 215 LMAVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARL 274

Query: 290 NKK----------MLGQLLIA-----------LVILVGVSLFMS------NINIVAHIGG 322
++ G +L+A +V+L + M + + G
Sbjct: 275 GERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLASGGIGMPALQAMLSRQVDEERQG 334

Query: 323 FIGGLLITL-----------IGYYYKVNRNIF--WILLIGMLVIFIALQI 359
+ G L L Y + + W + G + + L
Sbjct: 335 QLQGSLAALTSLTSIVGPLLFTAIYAASITTWNGWAWIAGAALYLLCLPA 384


42  SACOL_RS09555  SACOL_RS09580  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
SACOL_RS09555  4       13       -0.134105  serine protease SplF
SACOL_RS09560  4       11       -0.130496  serine protease SplE
SACOL_RS09565  3       11       -0.316975  serine protease SplD
SACOL_RS09570  1       13       -0.816552  serine protease SplC
SACOL_RS09575  0       16       -1.307964  serine protease SplB
SACOL_RS09580  1       12       -2.321239  serine protease SplA
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score  E-value  Comments
SACOL_RS09560  V8PROTEASE  115    6e-33    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 115 bits (290), Expect = 6e-33
Identities = 60/227 (26%), Positives = 103/227 (45%), Gaps = 26/227 (11%)

Query: 30 IQQTAKA-----ENTVKQITNTNVAPYSGVTWMGA--------GTGFVVGNHTIITNKHV 76
++Q A N QIT+T Y+ VT++ +G VVG T++TNKHV
Sbjct: 61 LEQREHANVILPNNDRHQITDTTNGHYAPVTYIQVEAPTGTFIASGVVVGKDTLLTNKHV 120

Query: 77 TYHM-KVGDEIKAHPNGFY--NNGGGLYKVTKIVDYPGKEDIAVVQVEEKSTQPKGRKFK 133
+KA P+ N G + +I Y G+ D+A+V+ + +
Sbjct: 121 VDATHGDPHALKAFPSAINQDNYPNGGFTAEQITKYSGEGDLAIVKFSP---NEQNKHIG 177

Query: 134 DFTSKFNIA--SEAKENEPISVIGYPNPNGNKLQMYESTGKVLSVNGNIVSSDAIIQPGS 191
+ ++ +E + N+ I+V GYP + M+ES GK+ + G + D G+
Sbjct: 178 EVVKPATMSNNAETQVNQNITVTGYP-GDKPVATMWESKGKITYLKGEAMQYDLSTTGGN 236

Query: 192 SGSPILNSKHEAIGVIYAGNKPSGESTRGFAVYFSPEIKKFIADNLD 238
SGSP+ N K+E IG+ + G AV+ + ++ F+ N++
Sbjct: 237 SGSPVFNEKNEVIGIHWGGVPNEF----NGAVFINENVRNFLKQNIE 279


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score  E-value  Comments
SACOL_RS09565  V8PROTEASE  136    8e-41    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 136 bits (344), Expect = 8e-41
Identities = 63/227 (27%), Positives = 107/227 (47%), Gaps = 27/227 (11%)

Query: 30 IQQTAKA-----EHNVKLIKNTNVAPYNGVVSIGS--------GTGFIVGKNTIVTNKHV 76
++Q A ++ I +T Y V I +G +VGK+T++TNKHV
Sbjct: 61 LEQREHANVILPNNDRHQITDTTNGHYAPVTYIQVEAPTGTFIASGVVVGKDTLLTNKHV 120

Query: 77 VAGMEIGAH-IIAHP---NGEYNNGGFYKVKKIVRYSGQEDIAILHVEDKAVHPKNRNFK 132
V H + A P N + G + ++I +YSG+ D+AI+ +N++
Sbjct: 121 VDATHGDPHALKAFPSAINQDNYPNGGFTAEQITKYSGEGDLAIVKFSPN---EQNKHIG 177

Query: 133 DYTGILKIA--SEAKENERISIVGYPEPYINKFQMYESTGKVLSVKGNMIITDAFVEPGN 190
+ ++ +E + N+ I++ GYP M+ES GK+ +KG + D GN
Sbjct: 178 EVVKPATMSNNAETQVNQNITVTGYPGDK-PVATMWESKGKITYLKGEAMQYDLSTTGGN 236

Query: 191 SGSAVFNSKYEVVGVHFGGNGPGNKSTKGYGVYFSPEIKKFIADNTD 237
SGS VFN K EV+G+H+GG + V+ + ++ F+ N +
Sbjct: 237 SGSPVFNEKNEVIGIHWGGVP----NEFNGAVFINENVRNFLKQNIE 279


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score  E-value  Comments
SACOL_RS09570  V8PROTEASE  112    1e-31    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 112 bits (280), Expect = 1e-31
Identities = 58/227 (25%), Positives = 100/227 (44%), Gaps = 26/227 (11%)

Query: 30 IQQTAKA-----ENSVKLITNTNVAPYSGVTWMGA--------GTGFVVGNQTIITNKHV 76
++Q A N IT+T Y+ VT++ +G VVG T++TNKHV
Sbjct: 61 LEQREHANVILPNNDRHQITDTTNGHYAPVTYIQVEAPTGTFIASGVVVGKDTLLTNKHV 120

Query: 77 TYHM-KVGDEIKAHPNGFY--NNGGGLYKVTKIVDYPGKEDIAVVQVEEKSTQPKGRKFK 133
+KA P+ N G + +I Y G+ D+A+V+ + +
Sbjct: 121 VDATHGDPHALKAFPSAINQDNYPNGGFTAEQITKYSGEGDLAIVKFSP---NEQNKHIG 177

Query: 134 DFTSKFNIA--SEAKENEPISVIGYPNPNGNKLQMYESTGKVLSVNGNIVTSDAVVQPGS 191
+ ++ +E + N+ I+V GYP + M+ES GK+ + G + D G+
Sbjct: 178 EVVKPATMSNNAETQVNQNITVTGYP-GDKPVATMWESKGKITYLKGEAMQYDLSTTGGN 236

Query: 192 SGSPILNSKREAIGVMYASDKPTGESTRSFAVYFSPEIKKFIADNLD 238
SGSP+ N K E IG+ + AV+ + ++ F+ N++
Sbjct: 237 SGSPVFNEKNEVIGIHWGGVPNEFNG----AVFINENVRNFLKQNIE 279


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score  E-value  Comments
SACOL_RS09575  V8PROTEASE  179    4e-57    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 179 bits (454), Expect = 4e-57
Identities = 63/217 (29%), Positives = 105/217 (48%), Gaps = 23/217 (10%)

Query: 37 EKNVTQVKDTNIFPYNGVVSFK--------DATGFVIGKNTIITNKHV-SKDYKVGDRIT 87
+ Q+ DT Y V + A+G V+GK+T++TNKHV + +
Sbjct: 73 NNDRHQITDTTNGHYAPVTYIQVEAPTGTFIASGVVVGKDTLLTNKHVVDATHGDPHALK 132

Query: 88 AHP---NGDKGNGGIYKIKSISDYPGDEDISVMNIEEQAVERGPKGFNFNENVQAFNFAK 144
A P N D G + + I+ Y G+ D++++ + + E V+ +
Sbjct: 133 AFPSAINQDNYPNGGFTAEQITKYSGEGDLAIVKFSPNEQNK-----HIGEVVKPATMSN 187

Query: 145 DA--KVDDKIKVIGYPLPAQNSFKQFESTGTIKRIKDNILNFDAYIEPGNSGSPVLNSNN 202
+A +V+ I V GYP +ES G I +K + +D GNSGSPV N N
Sbjct: 188 NAETQVNQNITVTGYPGDK-PVATMWESKGKITYLKGEAMQYDLSTTGGNSGSPVFNEKN 246

Query: 203 EVIGVVYGGIGKIGSEYNGAVYFTPQIKDFIQKHIEQ 239
EVIG+ +GG + +E+NGAV+ +++F++++IE
Sbjct: 247 EVIGIHWGG---VPNEFNGAVFINENVRNFLKQNIED 280


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score  E-value  Comments
SACOL_RS09580  V8PROTEASE  177    2e-56    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 177 bits (450), Expect = 2e-56
Identities = 64/230 (27%), Positives = 108/230 (46%), Gaps = 29/230 (12%)

Query: 29 EVQQTAKA-----ENNVTKVKDTNIFPYTGVVAFKS--------ATGFVVGKNTILTNKH 75
++Q A N+ ++ DT Y V + A+G VVGK+T+LTNKH
Sbjct: 60 PLEQREHANVILPNNDRHQITDTTNGHYAPVTYIQVEAPTGTFIASGVVVGKDTLLTNKH 119

Query: 76 V-SKNYKVGDRITAHP---NSDKGNGGIYSIKKIINYPGKEDVSVIQVEERAIERGPKGF 131
V + + A P N D G ++ ++I Y G+ D+++++ +
Sbjct: 120 VVDATHGDPHALKAFPSAINQDNYPNGGFTAEQITKYSGEGDLAIVKFSPNEQNK----- 174

Query: 132 NFNDNVTPFKYAAGA--KAGERIKVIGYPHPYKNKYVLYESTGPVMSVEGSSIVYSAHTE 189
+ + V P + A + + I V GYP K ++ES G + ++G ++ Y T
Sbjct: 175 HIGEVVKPATMSNNAETQVNQNITVTGYPGD-KPVATMWESKGKITYLKGEAMQYDLSTT 233

Query: 190 SGNSGSPVLNSNNELVGIHFASDVKNDDNRNAYGVYFTPEIKKFIAENID 239
GNSGSPV N NE++GIH+ V N+ N V+ ++ F+ +NI+
Sbjct: 234 GGNSGSPVFNEKNEVIGIHWGG-VPNEFNG---AVFINENVRNFLKQNIE 279


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score  E-value  Comments
SACOL_RS09585  V8PROTEASE  139    7e-42    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 139 bits (351), Expect = 7e-42
Identities = 66/212 (31%), Positives = 103/212 (48%), Gaps = 18/212 (8%)

Query: 39 EKNVKEITDATKEPYNSVVAF--------VGGTGVVVGKNTIVTNKHIAKSNDIFKNRVS 90
+ +ITD T Y V +GVVVGK+T++TNKH+ + + +
Sbjct: 73 NNDRHQITDTTNGHYAPVTYIQVEAPTGTFIASGVVVGKDTLLTNKHVVDATHGDPHALK 132

Query: 91 AHHS---SKGKGGGNYDVKDIVEYPGKEDLAIVHVHETSTEGLNFNKNVSYTKFADGA-- 145
A S G + + I +Y G+ DLAIV + + + + V ++ A
Sbjct: 133 AFPSAINQDNYPNGGFTAEQITKYSGEGDLAIVK-FSPNEQNKHIGEVVKPATMSNNAET 191

Query: 146 KVKDRISVIGYPKGAQTKYKMFESTGTINHISGTFMEFDAYAQPGNSGSPVLNSKHELIG 205
+V I+V GYP G + M+ES G I ++ G M++D GNSGSPV N K+E+IG
Sbjct: 192 QVNQNITVTGYP-GDKPVATMWESKGKITYLKGEAMQYDLSTTGGNSGSPVFNEKNEVIG 250

Query: 206 ILYAGSGKDESEKNFGVYFTPQLKEFIQNNIE 237
I + G +E N V+ ++ F++ NIE
Sbjct: 251 IHWGGVP---NEFNGAVFINENVRNFLKQNIE 279


43  SACOL_RS09615  SACOL_RS09655  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
SACOL_RS09615  -1      11       -3.885636  peptidase S8
SACOL_RS09620  -1      13       -4.585177  enterotoxin
SACOL_RS09625  -1      11       -3.989410  hypothetical protein
SACOL_RS09630  -1      12       -3.408570  hypothetical protein
SACOL_RS09635  3       16       -1.805209  lantibiotic epidermin
SACOL_RS09640  2       15       -3.053568  transposase
SACOL_RS09645  2       13       -2.835774  lantibiotic epidermin
SACOL_RS09650  1       12       -2.994870  leucotoxin LukDv
SACOL_RS09655  1       11       -1.871364  gamma-hemolysin subunit A
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score  E-value  Comments
SACOL_RS09615  SUBTILISIN  160    2e-47    Subtilisin serine protease family (S8) signature.
		>SUBTILISIN#Subtilisin serine protease family (S8) signature.

Length = 326

Score = 160 bits (406), Expect = 2e-47
Identities = 83/351 (23%), Positives = 138/351 (39%), Gaps = 73/351 (20%)

Query: 110 SRQWDMNKITNNGASYDDLPKHANTKIAIIDTGVMKNHDDLKNNFSTDSKNLVPLNGFRG 169
+ I A ++ + K+A++DTG +H DLK + G R
Sbjct: 21 EIPRGVEMI-QAPAVWNQT-RGRGVKVAVLDTGCDADHPDLKAR----------IIGGRN 68

Query: 170 TEPEETGDVHDVNDRKGHGTMVSGQTSANG---KLIGVAPNNKFTMYRVFGSKKT-ELLW 225
++ GD D GHGT V+G +A ++GVAP + +V + + + W
Sbjct: 69 FTDDDEGDPEIFKDYNGHGTHVAGTIAATENENGVVGVAPEADLLIIKVLNKQGSGQYDW 128

Query: 226 VSKAIVQAANDGNQVINISVGSYIILDKNDHQTFRKDEKVEYDALQKAINYAKKKKSIVV 285
+ + I A +I++S+G + L +A+ A + +V+
Sbjct: 129 IIQGIYYAIEQKVDIISMSLGGP----------------EDVPELHEAVKKAVASQILVM 172

Query: 286 AAAGNDGIDVNDKQKLKLQREYQGNGEVKDVPASMDNVVTVGSTDQKSNLSEFSNFGMNY 345
AAGN+G + + P + V++VG+ + + SEFSN N
Sbjct: 173 CAAGNEG-------------DGDDRTDELGYPGCYNEVISVGAINFDRHASEFSNSN-NE 218

Query: 346 TDIAAPGGSFAYLNQFGVDKWMNEGYMHKENILTTANNGRYIYQAGTSLATPKVSGALAL 405
D+ APG E+IL+T G+Y +GTS+ATP V+GALAL
Sbjct: 219 VDLVAPG----------------------EDILSTVPGGKYATFSGTSMATPHVAGALAL 256

Query: 406 IIDKYHLEKHPD----KAIELLYQHGTSKNNKPFSRYGHGELDVYKALNVA 452
I + D + L + N P G+G L + ++
Sbjct: 257 IKQLANASFERDLTEPELYAQLIKRTIPLGNSPK-MEGNGLLYLTAVEELS 306


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits       Score  E-value  Comments
SACOL_RS09630  RTXTOXINA  31     0.019    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 31.5 bits (71), Expect = 0.019
Identities = 22/104 (21%), Positives = 37/104 (35%), Gaps = 1/104 (0%)

Query: 813 SDYEFVSYEPEFFRYGGKNTINEIEAFFEYDTNLAVNIIENDFKFDRPYIVAISIMYLFE 872
+D E G KN I F + +++ + IE F I S+ E
Sbjct: 889 NDLIMYKGEGNVLSIGHKNGITFRNWFEKESGDISNHEIEQIFDKSGRIITPDSLKKALE 948

Query: 873 MFSISNEERMEIVNNYVPTSFKSKDIRPFKNELVTICNPANNFE 916
+ N + + N D+ P NE+ I + A +F+
Sbjct: 949 -YQQRNNKASYVYGNDALAYGSQGDLNPLINEISKIISAAGSFD 991


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits         Score  E-value  Comments
SACOL_RS09635  GALLIDERMIN  47     7e-12    Gallidermin signature.
		>GALLIDERMIN#Gallidermin signature.

Length = 52

Score = 47.4 bits (112), Expect = 7e-12
Identities = 29/46 (63%), Positives = 34/46 (73%), Gaps = 1/46 (2%)

Query: 2 EKVLDLDVQVKANNNSNDSAGDERITSHSLCTPGCAKTGSFNSFCC 47
++ DLDV+V A SNDS + RI S LCTPGCAKTGSFNS+CC
Sbjct: 8 NELFDLDVKVNAKE-SNDSGAEPRIASKFLCTPGCAKTGSFNSYCC 52


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits         Score  E-value  Comments
SACOL_RS09645  GALLIDERMIN  29     1e-04    Gallidermin signature.
		>GALLIDERMIN#Gallidermin signature.

Length = 52

Score = 29.3 bits (65), Expect = 1e-04
Identities = 16/38 (42%), Positives = 22/38 (57%), Gaps = 1/38 (2%)

Query: 2 EKVLDLDVQVKGNNNTNDSAGDERITSHLFCSFGCEKT 39
++ DLDV+V +NDS + RI S C+ GC KT
Sbjct: 8 NELFDLDVKVNAKE-SNDSGAEPRIASKFLCTPGCAKT 44


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS09650  BICOMPNTOXIN  396    e-141    Staphylococcal bi-component toxin signature.
		>BICOMPNTOXIN#Staphylococcal bi-component toxin signature.

Length = 315

Score = 396 bits (1020), Expect = e-141
Identities = 97/329 (29%), Positives = 177/329 (53%), Gaps = 24/329 (7%)

Query: 1 MKMKKLVKSSVASSIALLLLSNTVDAAQHITPVSEKKVDDKITLYKTTATSDNDKLNISQ 60
M K++ ++++ S+ L + ++ A+ + I + K T ++K ++Q
Sbjct: 1 MLKNKILTTTLSVSLLAPLANPLLENAKAANDTEDIGKGSDIEIIKRTEDKTSNKWGVTQ 60

Query: 61 ILTFNFIKDKSYDKDTLVLKAAGNINSGYKKPNPKDYNYSQ-FYWGGKYNVSVSSESNDA 119
+ F+F+KDK Y+KD L+LK G I+S N K N+ + W +YN+ + + +
Sbjct: 61 NIQFDFVKDKKYNKDALILKMQGFISSRTTYYNYKKTNHVKAMRWPFQYNIGLKTN-DKY 119

Query: 120 VNVVDYAPKNQNEEFQVQQTLGYSYGGDINISNGLSGGLNGSKSFSETINYKQESYRTTI 179
V++++Y PKN+ E V QTLGY+ GG+ + L G NGS ++S++I+Y Q++Y + +
Sbjct: 120 VSLINYLPKNKIESTNVSQTLGYNIGGNFQSAPSLGG--NGSFNYSKSISYTQQNYVSEV 177

Query: 180 DRKTNHKSIGWGVEAHKIMNNGWGPYGRDSYDPTYGNELFLGGRQSSSNAGQNFLPTHQM 239
+++ N KS+ WGV+A+ + ++LF+G + S + F+P ++
Sbjct: 178 EQQ-NSKSVLWGVKANSFAT-------ESGQKSAFDSDLFVGYKPHSKDPRDYFVPDSEL 229

Query: 240 PLLARGNFNPEFISVLSHKQNDTKKSKIKVTYQREMD---------RYTNQWNRLHWVGN 290
P L + FNP FI+ +SH++ + S+ ++TY R MD Y N + H V N
Sbjct: 230 PPLVQSGFNPSFIATVSHEKGSSDTSEFEITYGRNMDVTHAIKRSTHYGNSYLDGHRVHN 289

Query: 291 NYKNQNTVTFTSTYEVDWQNHTVKLIGTD 319
+ N+N +T YEV+W+ H +K+ G +
Sbjct: 290 AFVNRN---YTVKYEVNWKTHEIKVKGQN 315


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS09655  BICOMPNTOXIN  433    e-156    Staphylococcal bi-component toxin signature.
		>BICOMPNTOXIN#Staphylococcal bi-component toxin signature.

Length = 315

Score = 433 bits (1116), Expect = e-156
Identities = 214/318 (67%), Positives = 256/318 (80%), Gaps = 10/318 (3%)

Query: 1 MFKKKMLAATLSVGLIAPLASPIQE-SRANTNIENIGDGA--EVIKRTEDVSSKKWGVTQ 57
M K K+L TLSV L+APLA+P+ E ++A + E+IG G+ E+IKRTED +S KWGVTQ
Sbjct: 1 MLKNKILTTTLSVSLLAPLANPLLENAKAANDTEDIGKGSDIEIIKRTEDKTSNKWGVTQ 60

Query: 58 NVQFDFVKDKKYNKDALIVKMQGFINSRTSFSDVKGSGYELTKRMIWPFQYNIGLTTKDP 117
N+QFDFVKDKKYNKDALI+KMQGFI+SRT++ + K + + K M WPFQYNIGL T D
Sbjct: 61 NIQFDFVKDKKYNKDALILKMQGFISSRTTYYNYKKTNH--VKAMRWPFQYNIGLKTNDK 118

Query: 118 NVSLINYLPKNKIETTDVGQTLGYNIGGNFQSAPSIGGNGSFNYSKTISYTQKSYVSEVD 177
VSLINYLPKNKIE+T+V QTLGYNIGGNFQSAPS+GGNGSFNYSK+ISYTQ++YVSEV+
Sbjct: 119 YVSLINYLPKNKIESTNVSQTLGYNIGGNFQSAPSLGGNGSFNYSKSISYTQQNYVSEVE 178

Query: 178 KQNSKSVKWGVKANEFVTPDGKKSAHDRYLFVQSPNGPTGSAREYFAPDNQLPPLVQSGF 237
+QNSKSV WGVKAN F T G+KSA D LFV + R+YF PD++LPPLVQSGF
Sbjct: 179 QQNSKSVLWGVKANSFATESGQKSAFDSDLFVGYKPH-SKDPRDYFVPDSELPPLVQSGF 237

Query: 238 NPSFITTLSHEKGSSDTSEFEISYGRNLDITYA----TLFPRTGIYAERKHNAFVNRNFV 293
NPSFI T+SHEKGSSDTSEFEI+YGRN+D+T+A T + + + R HNAFVNRN+
Sbjct: 238 NPSFIATVSHEKGSSDTSEFEITYGRNMDVTHAIKRSTHYGNSYLDGHRVHNAFVNRNYT 297

Query: 294 VRYEVNWKTHEIKVKGHN 311
V+YEVNWKTHEIKVKG N
Sbjct: 298 VKYEVNWKTHEIKVKGQN 315


44  SACOL_RS10470  SACOL_RS10495  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
SACOL_RS10470  1       14       -2.901641  gamma-hemolysin subunit B
SACOL_RS10475  1       14       -3.710935  succinyl-diaminopimelate desuccinylase
SACOL_RS10480  1       17       -4.067500  hypothetical protein
SACOL_RS10485  2       19       -4.396476  ferrichrome-binding protein FhuD
SACOL_RS10490  3       18       -4.904134  potassium transporter KtrB
SACOL_RS10495  4       18       -4.448417  GNAT family acetyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS10475  BICOMPNTOXIN  217    1e-70    Staphylococcal bi-component toxin signature.
		>BICOMPNTOXIN#Staphylococcal bi-component toxin signature.

Length = 315

Score = 217 bits (553), Expect = 1e-70
Identities = 84/320 (26%), Positives = 145/320 (45%), Gaps = 18/320 (5%)

Query: 11 ICTLALSTTFTVLPATSFAKINSEIKQVSEKNLDGDTKMYTRTATTSDSQKNITQSLQFN 70
I T LS + A + + D ++ RT + ++ +TQ++QF+
Sbjct: 6 ILTTTLSVSLLAPLANPLLENAKAANDTEDIGKGSDIEIIKRTEDKTSNKWGVTQNIQFD 65

Query: 71 FLTEPNYDKETVFIKAKGTIGSGLRILDPNGY-WNSTLRWPGSYSVSIQNVDDNNNTNVT 129
F+ + Y+K+ + +K +G I S + +RWP Y++ ++ ++ ++
Sbjct: 66 FVKDKKYNKDALILKMQGFISSRTTYYNYKKTNHVKAMRWPFQYNIGLKT--NDKYVSLI 123

Query: 130 DFAPKNQDESREVKYTYGYKTGGDFSINRGGLTGNITKESNYSETISYQQPSYRTLLDQS 189
++ PKN+ ES V T GY GG+F L GN + NYS++ISY Q +Y + ++Q
Sbjct: 124 NYLPKNKIESTNVSQTLGYNIGGNFQSAPS-LGGNGSF--NYSKSISYTQQNYVSEVEQQ 180

Query: 190 TSHKGVGWKVEAHLINNMGHDHTRQLTNDSDNRTKSEIFSLTRNGNLWAKDNFTPKDKMP 249
K V W V+A+ + S++F + + +D F P ++P
Sbjct: 181 N-SKSVLWGVKANSFATESGQKSAF---------DSDLFVGYKPHSKDPRDYFVPDSELP 230

Query: 250 VTVSEGFNPEFLAVMSHDKKDKGKSQFVVHYKRSMDEFKIDWNRHGFWG-YWSGENHVDK 308
V GFNP F+A +SH+K S+F + Y R+MD + Y G +
Sbjct: 231 PLVQSGFNPSFIATVSHEKGSSDTSEFEITYGRNMDVTHAIKRSTHYGNSYLDGHRVHNA 290

Query: 309 -KEEKLSALYEVDWKTHNVK 327
+ YEV+WKTH +K
Sbjct: 291 FVNRNYTVKYEVNWKTHEIK 310


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS10480  BICOMPNTOXIN  165    1e-50    Staphylococcal bi-component toxin signature.
		>BICOMPNTOXIN#Staphylococcal bi-component toxin signature.

Length = 315

Score = 165 bits (419), Expect = 1e-50
Identities = 99/343 (28%), Positives = 157/343 (45%), Gaps = 42/343 (12%)

Query: 4 KKRVLIASSLSCAILLLSAATTQANSAHKDSQDQNKKEHVDKSQQKDKRNVTNKDKNSTA 63
K ++ ++LS ++L A N+
Sbjct: 2 LKNKILTTTLSVSLLAPLANPLLENAKAA-----------------------------ND 32

Query: 64 PDDIGKNGKIT--KRTETVYDEKTNILQNLQFDFIDDPTYDKNVLLVKKQGSIHSNLKFE 121
+DIGK I KRTE K + QN+QFDF+ D Y+K+ L++K QG I S +
Sbjct: 33 TEDIGKGSDIEIIKRTEDKTSNKWGVTQNIQFDFVKDKKYNKDALILKMQGFISSRTTYY 92

Query: 122 SHKEEKNSNWLKYPSEYHVDFQVKRNRKTEILDQLPKNKISTAKVDSTFSYSSGGKFDST 181
++K+ + +++P +Y++ + ++ +++ LPKNKI + V T Y+ GG F S
Sbjct: 93 NYKKTNHVKAMRWPFQYNIGLKTN-DKYVSLINYLPKNKIESTNVSQTLGYNIGGNFQSA 151

Query: 182 KGIGRTSSNSYSKTISYNQQNYDTIASGKNNNWHVHWSVIANDLKYGGEVKNRNDELLFY 241
+G S +YSK+ISY QQNY + + N+ V W V AN K+ D LF
Sbjct: 152 PSLGGNGSFNYSKSISYTQQNYVSEVE-QQNSKSVLWGVKANSFATESGQKSAFDSDLFV 210

Query: 242 RNTRIATVENPELSFASKYRYPALVRSGFNPEFLTYLSNEK-SNEKTQFEVTYTRNQDIL 300
+ +P F P LV+SGFNP F+ +S+EK S++ ++FE+TY RN D+
Sbjct: 211 GYKPHSK--DPRDYFVPDSELPPLVQSGFNPSFIATVSHEKGSSDTSEFEITYGRNMDVT 268

Query: 301 KNR------PGIHYAPPILEKNKDGQRLIVTYEVDWKNKTVKV 337
+ + + V YEV+WK +KV
Sbjct: 269 HAIKRSTHYGNSYLDGHRVHNAFVNRNYTVKYEVNWKTHEIKV 311


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS10495  FERRIBNDNGPP  60     1e-12    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 60.3 bits (146), Expect = 1e-12
Identities = 48/248 (19%), Positives = 95/248 (38%), Gaps = 21/248 (8%)

Query: 48 PKRVAVLTGFYVGDFIKLGIKPIAVSDITK-DSSILKPYL-KGVDYIG---ENDVERVAK 102
P R+ L V + LGI P V+D + +P L V +G E ++E + +
Sbjct: 35 PNRIVALEWLPVELLLALGIVPYGVADTINYRLWVSEPPLPDSVIDVGLRTEPNLELLTE 94

Query: 103 AKPDLIVVDA-MDKNIKKYQKIAPTIPYTYNKYNH-----KEILKEIGKLTNNEDKAKKW 156
KP +V A + + +IAP + ++ ++ L E+ L N + A+
Sbjct: 95 MKPSFMVWSAGYGPSPEMLARIAPGRGFNFSDGKQPLAMARKSLTEMADLLNLQSAAETH 154

Query: 157 IEEWDDKTRKDKKEIQSKIGQATASVFEPDEKQIYIYNSTWGRGLDIVHDAFGMPMTKQY 216
+ +++D R K + + D + + ++ + D +G+P Q
Sbjct: 155 LAQYEDFIRSMKPRFVKRGARPLLLTTLIDPRHMLVFGP--NSLFQEILDEYGIPNAWQG 212

Query: 217 KDKLQEDKKGYASISKENISKYA-GDYIFLSKPSYGKFD-FEKTHTWQNIEAVKKGHVIS 274
+ + G ++S + ++ Y D + + D T WQ + V+ G
Sbjct: 213 ----ETNFWGSTAVSIDRLAAYKDVDVLCFDHDNSKDMDALMATPLWQAMPFVRAGRF-- 266

Query: 275 YKAEDYWF 282
+ WF
Sbjct: 267 QRVPAVWF 274


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS10505  SACTRNSFRASE  27     0.026    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 26.8 bits (59), Expect = 0.026
Identities = 16/61 (26%), Positives = 30/61 (49%), Gaps = 2/61 (3%)

Query: 76 EYMRILAFVIHSEFRKKGYGKRLLADSEEFSKRLNCKAITLNSGNRNERLSAHKLYSDNG 135
Y I + ++RKKG G LL + E++K + + L + + N +SA Y+ +
Sbjct: 88 GYALIEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDIN--ISACHFYAKHH 145

Query: 136 Y 136
+
Sbjct: 146 F 146


45  SACOL_RS11390  SACOL_RS11435  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
SACOL_RS11390  -3      8        -1.415349  ABC transporter substrate-binding protein
SACOL_RS11395  -2      9        -1.637576  hypothetical protein
SACOL_RS11400  -3      9        -1.708006  alanine racemase
SACOL_RS11405  -3      9        -1.748983  siderophore synthetase
SACOL_RS11410  -3      10       -1.933283  MFS transporter
SACOL_RS11415  1       12       -2.134036  hypothetical protein
SACOL_RS11420  4       20       -2.830643  transporter
SACOL_RS11425  0       15       -0.022007  transposase
SACOL_RS11430  -1      13       0.488332   transposase
SACOL_RS11435  -1      10       1.685005   alkaline shock protein 23
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS11390  FERRIBNDNGPP  96     5e-25    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 96.2 bits (239), Expect = 5e-25
Identities = 64/257 (24%), Positives = 107/257 (41%), Gaps = 24/257 (9%)

Query: 53 DAKRIVVLEYSFADALAALDVKPVGIADDGKKKRIIK--PVREKIGDYTSVGTRKQPNLE 110
D RIV LE+ + L AL + P G+AD + + P+ + + D VG R +PNLE
Sbjct: 34 DPNRIVALEWLPVELLLALGIVPYGVADTINYRLWVSEPPLPDSVID---VGLRTEPNLE 90

Query: 111 EISKLKPDLIIADSSRHKGINKELNKIAPTLSLKSFDGDYKQNI--NSFKTIAKALNKEK 168
++++KP ++ S+ + + L +IAP DG + S +A LN +
Sbjct: 91 LLTEMKPSFMVW-SAGYGPSPEMLARIAPGRGFNFSDGKQPLAMARKSLTEMADLLNLQS 149

Query: 169 EGEKRLAEHDKLINKYKDEIKFDRNQKVLPAVV---AKAGLLAHPNYSYVGQFLNELGFK 225
E LA+++ I K R + L + L+ PN S + L+E G
Sbjct: 150 AAETHLAQYEDFIRSMKPRF-VKRGARPLLLTTLIDPRHMLVFGPN-SLFQEILDEYGIP 207

Query: 226 NALSDDVTKGLSKYLKGPYLQLDTEHLADLNPERMIIMTDHAKKDSAEFKKLQEDATWKK 285
NA +G + + + + LA ++ DH +S + L W+
Sbjct: 208 NAW-----QGETNFWG--STAVSIDRLAAYKDVDVLCF-DHD--NSKDMDALMATPLWQA 257

Query: 286 LNAVKNNRVDIVDRDVW 302
+ V+ R V VW
Sbjct: 258 MPFVRAGRFQRVP-AVW 273


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits         Score  E-value  Comments
SACOL_RS11400  ALARACEMASE  39     1e-05    Alanine racemase signature.
		>ALARACEMASE#Alanine racemase signature.

Length = 356

Score = 39.4 bits (92), Expect = 1e-05
Identities = 59/325 (18%), Positives = 119/325 (36%), Gaps = 33/325 (10%)

Query: 4 VNINISKIKYNAKVLQTVFQSKNMQFTPVIKCIAGDRTIVESLKALG-INHVAESRLDNI 62
++++ +K N +++ + + + V+K A I A+G + A L+
Sbjct: 7 ASLDLQALKQNLSIVRQA--ATHARVWSVVKANAYGHGIERIWSAIGATDGFALLNLEEA 64

Query: 63 ISIADQDLTYTLLRTPAKKEISDMIEKVDMSIQTELSTIHQINEVAEV-LGKKHKILLMV 121
I++ ++ +L D+ + T + + Q+ + L I L V
Sbjct: 65 ITLRERGWKGPILMLEGFFHAQDLEIYDQHRLTTCVHSNWQLKALQNARLKAPLDIYLKV 124

Query: 122 DWKDGREGVLTYDVLDYIKEIIHLKNIHFVGLAFNFMCFKSDAPSDDDIFMINRFVSAVE 181
+ R G VL +++ + N+ + L +F ++ P D + R A E
Sbjct: 125 NSGMNRLGFQPDRVLTVWQQLRAMANVGEMTLMSHFAE--AEHP-DGISGAMARIEQAAE 181

Query: 182 REIGYRLKIISGGNSSMLPQLLYNDLGKINELRIGETLFRGVDTTTNQAIAML-YQDAIT 240
+ R + + + P+ ++ +R G L+ + + IA + +T
Sbjct: 182 -GLECRRSLSNSAATLWHPEAHFD------WVRPGIILYGASPSGQWRDIANTGLRPVMT 234

Query: 241 LEAEILEIK-----PRVN-----TQTHESFLQAIVDIGYLD---TKVDNISPM---DQHI 284
L +EI+ ++ RV T E + IV GY D +P+
Sbjct: 235 LSSEIIGVQTLKAGERVGYGGRYTARDEQRI-GIVAAGYADGYPRHAPTGTPVLVDGVRT 293

Query: 285 NILGA-SSDHLMLDLNGQGHYQVGD 308
+G S D L +DL +G
Sbjct: 294 MTVGTVSMDMLAVDLTPCPQAGIGT 318


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
SACOL_RS11405  PF04183  258    1e-80    IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 258 bits (661), Expect = 1e-80
Identities = 92/456 (20%), Positives = 176/456 (38%), Gaps = 56/456 (12%)

Query: 166 EGHPTHPLTKTKLPLTMEEVRAYAPEFEKEIPLQIMMIEKDHVVCTAMDGND--QFIIDE 223
GHP K + E + YAPE+ L + ++++H++ + D Q +
Sbjct: 134 SGHPKFVFNKGRRGWGKEALERYAPEYANTFRLHWLAVKREHMIWRCDNEMDIHQLLTAA 193

Query: 224 IIPEYYNQIRVFLKSLGLKSEDYRAILVHPWQYDHTIGKYFEAWIAKKILIPT-PFTILS 282
+ P+ + + + GL ++ + VHPWQ+ I F A A+ ++ F
Sbjct: 194 MDPQEFARFSQVWQENGL-DHNWLPLPVHPWQWQQKIATDFIADFAEGRMVSLGEFGDQW 252

Query: 283 KATLSFRTMSLIDKP--YHVKLPVDAQATSAVRTVSTVTTVDGPKLSYALQN-------- 332
A S RT++ + +KLP+ TS R + GP S LQ
Sbjct: 253 LAQQSLRTLTNASRRGGLDIKLPLTIYNTSCYRGIPGRYIAAGPLASRWLQQVFATDATL 312

Query: 333 ------MLNQYPGFKVAMEPFGEYANVDKDRARQLACIIRQKPE--IDGKGATVVSASLV 384
+L + V+ E + A L I R+ P + + V+ A+L+
Sbjct: 313 VQSGAVILGEPAAGYVSHEGYAALARAPYRYQEMLGVIWRENPCRWLKPDESPVLMATLM 372

Query: 385 NKNPIDQKVIVDSYLEWLNQGITKESITTFIERYAQALIPPLIAFIQNYGIALEAHMQNT 444
+ +Q + +Y++ G+ E+ ++ + + ++ PL + YG+AL AH QN
Sbjct: 373 ECDENNQPLA-GAYID--RSGLDAET---WLTQLFRVVVVPLYHLLCRYGVALIAHGQNI 426

Query: 445 VVNLGPHFDIQFLVRDLGGS-RI------DLETLQHRVSDI--KITNDSLIADSIDAVIA 495
+ + + L++D G R+ ++++L V D+ +++ D LI D
Sbjct: 427 TLAMKEGVPQRVLLKDFQGDMRLVKEEFPEMDSLPQEVRDVTSRLSADYLIHDLQTGHFV 486

Query: 496 KFQHAVIQNQMAELIHHFNQYDCVEETELFNIVQQVVA--HAINPTLPHANELKDILFGP 553
I V E + ++ V++ +P + L LF P
Sbjct: 487 TV---------LRFISPLMVRLGVPERRFYQLLAAVLSDYMKKHPQMSERFALFS-LFRP 536

Query: 554 TITVKALLNMRM-----ENKVKQYLNI--ELDNPIK 582
I L +++ + + N +L NP+
Sbjct: 537 QIIRVVLNPVKLTWPDLDGGSRMLPNYLEDLQNPLW 572


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
SACOL_RS11410  TCRTETA  41     8e-06    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 40.6 bits (95), Expect = 8e-06
Identities = 53/340 (15%), Positives = 105/340 (30%), Gaps = 26/340 (7%)

Query: 6 FSSSFLLFLGNWIGQIGLNWFVLTTYHN--------AVYLGIVNFCRLVPILLLSVWAGA 57
S+ L +G IGL VL + GI+ + + GA
Sbjct: 11 LSTVALDAVG-----IGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGA 65

Query: 58 IADKYDKGRLLRITISSSFLVTAILCVLTYSFTAIPISVIIIYAT-LRGILSAVETPLRQ 116
++D++ + R + S A+ Y+ A + ++Y + ++ +
Sbjct: 66 LSDRFGR----RPVLLVSLAGAAV----DYAIMATAPFLWVLYIGRIVAGITGATGAVAG 117

Query: 117 AILPDLSDKISTTQAVSFHSFIINICRSIGPAIAGVILAVYHAPTTFLAQA--ICYFIAV 174
A + D++D + F S GP + G++ F A A F+
Sbjct: 118 AYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTG 177

Query: 175 LLCLPLHFKVTKIPEDASRYMPLKVIIDYFKLHMEGRQIFITSLLIMATGFSYTTLLPVL 234
LP K + P PL + + + ++ G L +
Sbjct: 178 CFLLPESHKGERRPLRREALNPLASFRWARGMTVVA-ALMAVFFIMQLVGQVPAALWVIF 236

Query: 235 TNKVFPGKSEIFGIAMTMCAIGGIIATLVL-PKVLKYIGMVNMYYLSSFLFGIALLGVVF 293
F + GI++ I +A ++ V +G L G + + F
Sbjct: 237 GEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAF 296

Query: 294 HNIVIMFICITLIGLFSQWARTTNRVYFQNNVKDYERGKV 333
M I ++ + V + +G++
Sbjct: 297 ATRGWMAFPIMVLLASGGIGMPALQAMLSRQVDEERQGQL 336



Score = 30.6 bits (69), Expect = 0.012
Identities = 37/180 (20%), Positives = 71/180 (39%), Gaps = 21/180 (11%)

Query: 10 FLLFLGNWIGQIGLNWFVLTTYH----NAVYLGI-VNFCRLVPILLLSVWAGAIADKYDK 64
+ F+ +GQ+ +V+ +A +GI + ++ L ++ G +A + +
Sbjct: 217 AVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGE 276

Query: 65 GRLLRITISSSFLVTAILCVLTYSFTAIPISVIIIYATLRGILSAVETPLRQAILPDLSD 124
R L + + + +L T + A PI V++ + P QA+L D
Sbjct: 277 RRALMLGMIADGTGYILLAFATRGWMAFPIMVLLA-------SGGIGMPALQAMLSRQVD 329

Query: 125 KISTTQAVSFHSFIINICRSIGPAIAGVILAVYHAPT----TFLAQAICYFIAVLLCLPL 180
+ Q + + ++ +GP + I A T ++A A Y LLCLP
Sbjct: 330 EERQGQLQGSLAALTSLTSIVGPLLFTAIYA-ASITTWNGWAWIAGAALY----LLCLPA 384


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
SACOL_RS11415  PF04183  268    1e-83    IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 268 bits (687), Expect = 1e-83
Identities = 93/475 (19%), Positives = 181/475 (38%), Gaps = 45/475 (9%)

Query: 197 SEQAVIEGHPLHPGAKLRKGLNALQTFLYSSEFNQPIKLKIVLIHSKLSRTMSLSKDYDT 256
Q ++ GHP K R+G Y+ E+ +L + + + M D +
Sbjct: 128 RLQCLLSGHPKFVFNKGRRGWGKEALERYAPEYANTFRLHWLAVKRE---HMIWRCDNEM 184

Query: 257 TVHQLF-----PDLIKQLENEFTPKFNFNDYHIMIVHPWQLDDVLHSDYQAEVDKELIIE 311
+HQL P + + +++ + VHPWQ + +D+ A+ + ++
Sbjct: 185 DIHQLLTAAMDPQEFARFSQVWQENGLDHNWLPLPVHPWQWQQKIATDFIADFAEGRMVS 244

Query: 312 AKHTLD-YYAGLSFRTLVPKYPAMSPHIKLSTNVHITGEIRTLSEQTTHNGPLMTRILND 370
D + A S RTL IKL ++ T R + + GPL +R L
Sbjct: 245 LGEFGDQWLAQQSLRTLTNASRRGGLDIKLPLTIYNTSCYRGIPGRYIAAGPLASRWLQQ 304

Query: 371 ILEKDVIFKSYASTIIDEVAGIHFYNEQDEADYQTER--SEQLGTLFRKNIYQMIPQEVT 428
+ D + I+ E A + +E A + E LG ++R+N + + + +
Sbjct: 305 VFATDATLVQSGAVILGEPAAGYVSHEGYAALARAPYRYQEMLGVIWRENPCRWLKPDES 364

Query: 429 PLIPSSLVATYPFNNESPIVTLIKRYQSAASLSDFESSAKSWVETYSKALLGLVIPLVTK 488
P++ ++L+ N P+ A + A++W+ + ++ + L+ +
Sbjct: 365 PVLMATLMECDE--NNQPLA--------GAYIDRSGLDAETWLTQLFRVVVVPLYHLLCR 414

Query: 489 YGIALEAHLQNAIATFRKDGLLDTMYIRDFEG-LRIDKAQLNEMGYSTSHFHEKSRILTD 547
YG+AL AH QN I K+G+ + ++DF+G +R+ K + EM S E + +
Sbjct: 415 YGVALIAHGQN-ITLAMKEGVPQRVLLKDFQGDMRLVKEEFPEMD---SLPQEVRDVTSR 470

Query: 548 SKTSVFNKAFYSTVQNHLGELILTISKASNDSNLERHMWYIVRDVLDNIFDQLVLSTHKS 607
+ + I + ER + ++ VL + + H
Sbjct: 471 LSADYLIHDLQTGHFVTVLRFISPLMVRLGVP--ERRFYQLLAAVLSDYMKK-----HPQ 523

Query: 608 NQVNENRINEIKDTMFAPFIDYKCVTTMRLE----DEAHHY--TYIK-VNNPLYR 655
+ +F P I + ++L D Y++ + NPL+
Sbjct: 524 MSERFALFS-----LFRPQIIRVVLNPVKLTWPDLDGGSRMLPNYLEDLQNPLWL 573


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits       Score  E-value  Comments
SACOL_RS11435  TCRTETOQM  29     0.012    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 28.7 bits (64), Expect = 0.012
Identities = 14/43 (32%), Positives = 21/43 (48%), Gaps = 5/43 (11%)

Query: 99 VDLKVILEYGE-----SAPKIFRKVTELVKEQVKYITGLDVVE 136
D K+ +YG S P FR + +V EQV G +++E
Sbjct: 495 TDCKICFKYGLYYSPVSTPADFRMLAPIVLEQVLKKAGTELLE 537


46  SACOL_RS12325  SACOL_RS12340  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
SACOL_RS12325  -1      8        -1.269129  MFS transporter
SACOL_RS12330  -2      12       -1.825299  multidrug efflux protein
SACOL_RS12335  -2      14       -2.357894  TetR family transcriptional regulator
SACOL_RS12340  -2      14       -2.145375  Bcr/CflA family drug resistance efflux
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
SACOL_RS12325  TCRTETB  159    1e-44    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 159 bits (403), Expect = 1e-44
Identities = 92/415 (22%), Positives = 187/415 (45%), Gaps = 16/415 (3%)

Query: 140 KILAALLFGMFIAILNQTLLNVALPKINTEFNISASTGQWLMTGFMLVNGILIPITAYLF 199
+IL L F ++LN+ +LNV+LP I +FN ++ W+ T FML I + L
Sbjct: 14 QILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLS 73

Query: 200 NKYSYRKLFLVALVLFTIGSLICAISMN-FPIMMVGRVLQAIGAGVLMPLGSIVIITIYP 258
++ ++L L +++ GS+I + + F ++++ R +Q GA L +V+ P
Sbjct: 74 DQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIP 133

Query: 259 PEKRGAAMGTMGIAMILAPAIGPTLSGYIVQNYHWNVMFYGMFIIGIIAILIGFVWFKLY 318
E RG A G +G + + +GP + G I HW+ + +I II + K
Sbjct: 134 KENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIP-MITIITVPFLMKLLKKE 192

Query: 319 QYTTNPKADIPGIIFSTIGFGALLYGFSEAGNKGWGSVEIETMFAIGIIFIILFVIRELR 378
DI GII ++G + + + ++ ++FV +
Sbjct: 193 VRIKGH-FDIKGIILMSVGIVFFMLF---TTSYSIS------FLIVSVLSFLIFVKHIRK 242

Query: 379 MKSPMLNLEVLKFPTFTLTTIINMVVMLSLYGGMILLPIYLQNLRGFSALDSG-LLLLPG 437
+ P ++ + K F + + ++ ++ G + ++P ++++ S + G +++ PG
Sbjct: 243 VTDPFVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPG 302

Query: 438 SLIMGLLGPFAGKLLDTIGLKPLAIFGIAVMTYATWELTKLNMDTP-YMTIMGIYVLRSF 496
++ + + G G L+D G + G+ ++ + + L T +MTI+ ++VL
Sbjct: 303 TMSVIIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVL--G 360

Query: 497 GMAFIMMPMVTAAINALPGRLASHGNAFLNTMRQLAGSIGTAILVTVMTTQTTQH 551
G++F + T ++L + A G + LN L+ G AI+ +++
Sbjct: 361 GLSFTKTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLSIPLLDQ 415


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits       Score  E-value  Comments
SACOL_RS12330  RTXTOXIND  59     2e-12    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 59.1 bits (143), Expect = 2e-12
Identities = 26/133 (19%), Positives = 45/133 (33%), Gaps = 13/133 (9%)

Query: 87 MDLKMPQKGTIAKLD-GMEGSMVQAGNPIAYAYNLDD-LYVTANIDEKDIKDVEVGKDVD 144
++ P + +L EG +V + DD L VTA + KDI + VG++
Sbjct: 328 SVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAI 387

Query: 145 VTIDGQKAS----IKGKVDSIGKATAASFSLMPSSNSDGNYTKVSQVIPVKITLESEPSK 200
+ ++ + + GKV +I G V I +
Sbjct: 388 IKVEAFPYTRYGYLVGKVKNINLDAI-------EDQRLGLVFNVIISIEENCLSTGNKNI 440

Query: 201 QVVPGMNAEVKIH 213
+ GM +I
Sbjct: 441 PLSSGMAVTAEIK 453



Score = 31.7 bits (72), Expect = 0.002
Identities = 17/77 (22%), Positives = 35/77 (45%), Gaps = 2/77 (2%)

Query: 9 VITVVVLLAIGIAGFYFWNKTTSYVTTDNAKV--NGDQIKIASPASGQIKSLNVKQGDKL 66
++ ++ + IA V T N K+ +G +I + +K + VK+G+ +
Sbjct: 59 LVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESV 118

Query: 67 DKGDKVAIVTVQGQDGE 83
KGD + +T G + +
Sbjct: 119 RKGDVLLKLTALGAEAD 135


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
SACOL_RS12335  HTHTETR  45     3e-08    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 45.0 bits (106), Expect = 3e-08
Identities = 13/69 (18%), Positives = 24/69 (34%)

Query: 2 KRQAKIEIQNALVDLMAEYPFQEISTKMICAYCNINRSTFYDYYKDKFDLLDTINSKHKE 61
++ + I + + L ++ S I + R Y ++KDK DL I +
Sbjct: 9 AQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSES 68

Query: 62 KFQFLLSAL 70
L
Sbjct: 69 NIGELELEY 77


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
SACOL_RS12340  TCRTETA  65     1e-13    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 65.2 bits (159), Expect = 1e-13
Identities = 69/386 (17%), Positives = 141/386 (36%), Gaps = 16/386 (4%)

Query: 15 IIILGSLTAIGALSIDMFLPGLPDIRHDF---QTTTSNAQLTLSMFMIGLAFGNLFAGPI 71
+I++ S A+ A+ I + +P LP + D T++ + L+++ + G +
Sbjct: 7 LIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGAL 66

Query: 72 SDSTGRRKPLIIAMIIFTLASLGIVFVHNIWLMVALRFLQGVTGGAAAVISRAIASDMYS 131
SD GRR L++++ + + +W++ R + G+TG AV IA D+
Sbjct: 67 SDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIA-DITD 125

Query: 132 GNELTKFMALLMLVNGIAPVVAPTIGGIILNYSVWRMVFVILTIFGFVMVIGSLLKVPES 191
G+E + + G V P +GG++ +S F + + +PES
Sbjct: 126 GDERARHFGFMSACFGFGMVAGPVLGGLMGGFSP-HAPFFAAAALNGLNFLTGCFLLPES 184

Query: 192 LTVTNRESSSGLKTMFKNFKILLKTPRFVLPMLIQGMTFVILFTYISASPFII--QKIYG 249
R +F+ + V ++ + L + A+ ++I + +
Sbjct: 185 HKGERRPLRREALNPLASFR-WARGMTVVAALMAVFFI-MQLVGQVPAALWVIFGEDRFH 242

Query: 250 MTAIQFSWMFAGIGITLIISSQLTGYLVDFIDSQKLMRGMTMIQIIGVILVTIVLLNHWN 309
A A GI ++ + + ++ R M+ +I I+L
Sbjct: 243 WDATTIGISLAAFGILHSLAQ---AMITGPVAARLGERRALMLGMIADGTGYILLAFATR 299

Query: 310 FWILAIGFIILIAPVTGVATLGFTIAMDESSSGRGSSSSLLGLVQFLFGGVASPLVGVKG 369
W+ ++L + G+ L ++ +G L + L + PL+
Sbjct: 300 GWMAFPIMVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSL-TSIVGPLLFTAI 358

Query: 370 EDNPIPY---IIIIIATAVILIILQI 392
I I A+ L+ L
Sbjct: 359 YAASITTWNGWAWIAGAALYLLCLPA 384


47  SACOL_RS12690  SACOL_RS12720  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
SACOL_RS12690  1       19       -1.540353  hypothetical protein
SACOL_RS12695  0       16       -1.465445  gamma-hemolysin component A
SACOL_RS12700  0       15       -1.587947  gamma-hemolysin component C
SACOL_RS12705  -1      15       -1.489024  gamma-hemolysin component B
SACOL_RS12710  -3      15       -1.210142  hypothetical protein
SACOL_RS12715  -2      18       -1.176549  6-carboxyhexanoate--CoA ligase
SACOL_RS12720  -2      18       -1.021663  8-amino-7-oxononanoate synthase
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
SACOL_RS12695  PF05704  25     0.021    Capsular polysaccharide synthesis protein
		>PF05704#Capsular polysaccharide synthesis protein

Length = 307

Score = 24.8 bits (54), Expect = 0.021
Identities = 5/23 (21%), Positives = 11/23 (47%)

Query: 37 FLIDYFQNDNKLLNVFLIDDAVS 59
++ Y + K + ++ D VS
Sbjct: 206 SMVTYLKKKEKPADYYIFHDFVS 228


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS12700  BICOMPNTOXIN  428    e-154    Staphylococcal bi-component toxin signature.
		>BICOMPNTOXIN#Staphylococcal bi-component toxin signature.

Length = 315

Score = 428 bits (1103), Expect = e-154
Identities = 213/312 (68%), Positives = 247/312 (79%), Gaps = 8/312 (2%)

Query: 1 MIKNKILTATLAVGLIAPLANPFIEISKAENKIEDIGQGA--EIIKRTQDITSKRLAITQ 58
M+KNKILT TL+V L+APLANP +E +KA N EDIG+G+ EIIKRT+D TS + +TQ
Sbjct: 1 MLKNKILTTTLSVSLLAPLANPLLENAKAANDTEDIGKGSDIEIIKRTEDKTSNKWGVTQ 60

Query: 59 NIQFDFVKDKKYNKDALVVKMQGFISSRTTYSDLKKYPYIKRMIWPFQYNISLKTKDSNV 118
NIQFDFVKDKKYNKDAL++KMQGFISSRTTY + KK ++K M WPFQYNI LKT D V
Sbjct: 61 NIQFDFVKDKKYNKDALILKMQGFISSRTTYYNYKKTNHVKAMRWPFQYNIGLKTNDKYV 120

Query: 119 DLINYLPKNKIDSADVSQKLGYNIGGNFQSAPSIGGSGSFNYSKTISYNQKNYVTEVESQ 178
LINYLPKNKI+S +VSQ LGYNIGGNFQSAPS+GG+GSFNYSK+ISY Q+NYV+EVE Q
Sbjct: 121 SLINYLPKNKIESTNVSQTLGYNIGGNFQSAPSLGGNGSFNYSKSISYTQQNYVSEVEQQ 180

Query: 179 NSKGVKWGVKANSFVTPNGQVSAYDQYLF-AQDPTGPAARDYFVPDNQLPPLIQSGFNPS 237
NSK V WGVKANSF T +GQ SA+D LF P RDYFVPD++LPPL+QSGFNPS
Sbjct: 181 NSKSVLWGVKANSFATESGQKSAFDSDLFVGYKPHSKDPRDYFVPDSELPPLVQSGFNPS 240

Query: 238 FITTLSHERGKGDKSEFEITYGRNMDATYA-----YVTRHRLAVDRKHDAFKNRNVTVKY 292
FI T+SHE+G D SEFEITYGRNMD T+A + L R H+AF NRN TVKY
Sbjct: 241 FIATVSHEKGSSDTSEFEITYGRNMDVTHAIKRSTHYGNSYLDGHRVHNAFVNRNYTVKY 300

Query: 293 EVNWKTHEVKIK 304
EVNWKTHE+K+K
Sbjct: 301 EVNWKTHEIKVK 312
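
The Expect values in these reports are not always plain floats: very small values are printed in BLAST's shorthand as `e-154`, with no leading digit. The following is a minimal sketch, assuming Python, of reading a `Score = ..., Expect = ...` line and checking it against the 0.05 RPSBlast E-value listed under the page's input parameters. The helper names (`parse_expect`, `parse_stat_line`, `passes_cutoff`) are hypothetical and not part of PredictBias, and treating the cutoff as inclusive is an assumption.

```python
import re

# Matches lines such as "Score = 428 bits (1103), Expect = e-154".
STAT_LINE = re.compile(
    r"Score\s*=\s*([\d.]+)\s*bits\s*\((\d+)\),\s*Expect\s*=\s*(\S+)"
)

def parse_expect(text):
    """Convert a BLAST Expect string to float, handling the 'e-154' shorthand."""
    text = text.strip()
    if text.startswith("e"):
        # BLAST omits the mantissa for very small values; read it as 1e-154.
        text = "1" + text
    return float(text)

def parse_stat_line(line):
    """Return bit score, raw score and Expect from a statistics line, or None."""
    match = STAT_LINE.search(line)
    if match is None:
        return None
    return {
        "bits": float(match.group(1)),
        "raw": int(match.group(2)),
        "expect": parse_expect(match.group(3)),
    }

def passes_cutoff(line, cutoff=0.05):
    """Assumed rule: a hit is kept when its Expect is at or below the cutoff."""
    stats = parse_stat_line(line)
    return stats is not None and stats["expect"] <= cutoff
```

For example, `passes_cutoff("Score = 26.8 bits (59), Expect = 0.026")` evaluates to `True` under this assumed rule, while a line without a score, such as `Length = 315`, is simply skipped.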


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS12705  BICOMPNTOXIN  466    e-169    Staphylococcal bi-component toxin signature.
		>BICOMPNTOXIN#Staphylococcal bi-component toxin signature.

Length = 315

Score = 466 bits (1201), Expect = e-169
Identities = 314/315 (99%), Positives = 314/315 (99%)

Query: 1 MLKNKILTTTLSVSLLAPLANPLLENAKAANDTEDIGKGSDIEIIKRTEDKTSNKWGVTQ 60
MLKNKILTTTLSVSLLAPLANPLLENAKAANDTEDIGKGSDIEIIKRTEDKTSNKWGVTQ
Sbjct: 1 MLKNKILTTTLSVSLLAPLANPLLENAKAANDTEDIGKGSDIEIIKRTEDKTSNKWGVTQ 60

Query: 61 NIQFDFVKDKKYNKDALILKMQGFISSRTTYYNYKKTNHVKAMRWPFQYNIGLKTNDKYV 120
NIQFDFVKDKKYNKDALILKMQGFISSRTTYYNYKKTNHVKAMRWPFQYNIGLKTNDKYV
Sbjct: 61 NIQFDFVKDKKYNKDALILKMQGFISSRTTYYNYKKTNHVKAMRWPFQYNIGLKTNDKYV 120

Query: 121 SLINYLPKNKIESTNVSQILGYNIGGNFQSAPSLGGNGSFNYSKSISYTQQNYVSEVEQQ 180
SLINYLPKNKIESTNVSQ LGYNIGGNFQSAPSLGGNGSFNYSKSISYTQQNYVSEVEQQ
Sbjct: 121 SLINYLPKNKIESTNVSQTLGYNIGGNFQSAPSLGGNGSFNYSKSISYTQQNYVSEVEQQ 180

Query: 181 NSKSVLWGVKANSFATESGQKSAFDSDLFVGYKPHSKDPRDYFVPDSELPPLVQSGFNPS 240
NSKSVLWGVKANSFATESGQKSAFDSDLFVGYKPHSKDPRDYFVPDSELPPLVQSGFNPS
Sbjct: 181 NSKSVLWGVKANSFATESGQKSAFDSDLFVGYKPHSKDPRDYFVPDSELPPLVQSGFNPS 240

Query: 241 FIATVSHEKGSSDTSEFEITYGRNMDVTHAIKRSTHYGNSYLDGHRVHNAFVNRNYTVKY 300
FIATVSHEKGSSDTSEFEITYGRNMDVTHAIKRSTHYGNSYLDGHRVHNAFVNRNYTVKY
Sbjct: 241 FIATVSHEKGSSDTSEFEITYGRNMDVTHAIKRSTHYGNSYLDGHRVHNAFVNRNYTVKY 300

Query: 301 EVNWKTHEIKVKGQN 315
EVNWKTHEIKVKGQN
Sbjct: 301 EVNWKTHEIKVKGQN 315


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS12710  BICOMPNTOXIN  383    e-136    Staphylococcal bi-component toxin signature.
		>BICOMPNTOXIN#Staphylococcal bi-component toxin signature.

Length = 315

Score = 383 bits (985), Expect = e-136
Identities = 87/322 (27%), Positives = 160/322 (49%), Gaps = 18/322 (5%)

Query: 1 MKMNKLVKSSVATSMALLLLSGTANAEGKITPVSVKKVDDKVTLYKTTATADSDKFKISQ 60
M NK++ ++++ S+ L + + + K T S+K+ ++Q
Sbjct: 1 MLKNKILTTTLSVSLLAPLANPLLENAKAANDTEDIGKGSDIEIIKRTEDKTSNKWGVTQ 60

Query: 61 ILTFNFIKDKSYDKDTLVLKATGNINSGFVKPNPNDYDFSK-LYWGAKYNVSISSQSNDS 119
+ F+F+KDK Y+KD L+LK G I+S N + K + W +YN+ + + ++
Sbjct: 61 NIQFDFVKDKKYNKDALILKMQGFISSRTTYYNYKKTNHVKAMRWPFQYNIGLKT-NDKY 119

Query: 120 VNVVDYAPKNQNEEFQVQNTLGYTFGGDISISNGLSGGLNGNTAFSETINYKQESYRTTL 179
V++++Y PKN+ E V TLGY GG+ + L G NG+ +S++I+Y Q++Y + +
Sbjct: 120 VSLINYLPKNKIESTNVSQTLGYNIGGNFQSAPSLGG--NGSFNYSKSISYTQQNYVSEV 177

Query: 180 SRNTNYKNVGWGVEAHKIMNNGWGPYGRDSFHPTYGNELFLAGRQSSAYAGQNFIAQHQM 239
+ N K+V WGV+A+ + ++LF+ + S F+ ++
Sbjct: 178 EQQ-NSKSVLWGVKANSFATESGQ-------KSAFDSDLFVGYKPHSKDPRDYFVPDSEL 229

Query: 240 PLLSRSNFNPEFLSVLSHRQDGAKKSKITVTYQREMDL-----YQIRWNGFYWAGANYKN 294
P L +S FNP F++ +SH + + S+ +TY R MD+ + Y G N
Sbjct: 230 PPLVQSGFNPSFIATVSHEKGSSDTSEFEITYGRNMDVTHAIKRSTHYGNSYLDGHRVHN 289

Query: 295 -FKTRTFKSTYEIDWENHKVKL 315
F R + YE++W+ H++K+
Sbjct: 290 AFVNRNYTVKYEVNWKTHEIKV 311


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS12725  CLENTEROTOXN  28     0.048    Clostridium enterotoxin signature.
		>CLENTEROTOXN#Clostridium enterotoxin signature.

Length = 319

Score = 28.5 bits (63), Expect = 0.048
Identities = 8/47 (17%), Positives = 15/47 (31%), Gaps = 3/47 (6%)

Query: 233 GGVILSSND---VKDMLINHGRPLIYSSSLPIYNLYFIKRNIEKLIN 276
IL+ N+ L I + + FI+ ++E
Sbjct: 59 SSQILNPNETGTFSQSLTKSKEVSINVNFSVGFTSEFIQASVEYGFG 105


48  SACOL_RS13550  SACOL_RS13585  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
SACOL_RS13550  -1      16       0.082062   TetR family transcriptional regulator
SACOL_RS13555  -2      16       -0.107332  hypothetical protein
SACOL_RS13560  -2      13       0.727672   hypothetical protein
SACOL_RS13565  -2      8        0.846563   glyoxalase
SACOL_RS13570  -3      8        1.525032   hypothetical protein
SACOL_RS13575  -3      12       2.034195   hypothetical protein
SACOL_RS13580  -2      13       1.878803   TetR family transcriptional regulator
SACOL_RS13585  0       15       2.220225   short-chain dehydrogenase
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
SACOL_RS13550  HTHTETR  43     1e-07    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 43.5 bits (102), Expect = 1e-07
Identities = 33/200 (16%), Positives = 64/200 (32%), Gaps = 34/200 (17%)

Query: 11 KSIDPRIVRTKQLLVDAFLKISREKKLSQITVKDITDIATLNRATFYAHFTDKEDLLDYT 70
+ T+Q ++D L++ ++ +S ++ +I A + R Y HF DK DL
Sbjct: 3 RKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEI 62

Query: 71 LSV---TILKDLNDNLSISNVINEKVLRNIFISIASYIKDAAKSCELNSEAFCNKAHQRI 127
+ I + + + VLR I I + + L F
Sbjct: 63 WELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIF------HK 116

Query: 128 NNELEDIFAIM-LENSYPEHQRDIIVNS-------------------ASFLAAGISGLAL 167
+ ++ + + + D I + A + ISGL
Sbjct: 117 CEFVGEMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLME 176

Query: 168 HWFNTSQ-----ETADVFID 182
+W Q + A ++
Sbjct: 177 NWLFAPQSFDLKKEARDYVA 196


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS13565  TRNSINTIMINR  27     0.019    Translocated intimin receptor (Tir) signature.
		>TRNSINTIMINR#Translocated intimin receptor (Tir) signature.

Length = 549

Score = 27.4 bits (60), Expect = 0.019
Identities = 14/45 (31%), Positives = 23/45 (51%)

Query: 56 FQNVSQQSLNTEPNEVMISLGVNTNEEVDQLVNKVKEAGGTVVQE 100
F+N Q +N + N I G ++ V+Q+ + KEAG Q+
Sbjct: 291 FKNPENQKVNIDANGNAIPSGELKDDIVEQIAQQAKEAGEVARQQ 335


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS13570  NUCEPIMERASE  36     2e-04    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 35.5 bits (82), Expect = 2e-04
Identities = 35/138 (25%), Positives = 53/138 (38%), Gaps = 35/138 (25%)

Query: 1 MKDILVIGATGKQGNAVVKQLLEDGWYVSAL--------TRNKNNRKLSDIGHPHLSIVE 52
MK LV GA G G V K+LLE G V + K R L + P +
Sbjct: 1 MK-YLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQAR-LELLAQPGFQFHK 58

Query: 53 GDLSD-----------------NVSLQSAMKGKYGLYSIQ-PIVKDDVSEELRQGMKIIE 94
DL+D + A++ YS++ P D + L + I+E
Sbjct: 59 IDLADREGMTDLFASGHFERVFISPHRLAVR-----YSLENPHAYADSN--LTGFLNILE 111

Query: 95 IAEQENIQHIVYSTAGGV 112
IQH++Y+++ V
Sbjct: 112 GCRHNKIQHLLYASSSSV 129


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
SACOL_RS13580  HTHTETR  63     1e-14    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 62.7 bits (152), Expect = 1e-14
Identities = 25/80 (31%), Positives = 44/80 (55%)

Query: 1 MRKDAKENRQRIEEIAHKLFDEEGVENISMNRIAKELGIGMGTLYRHFKDKSDLCYYVIQ 60
+++A+E RQ I ++A +LF ++GV + S+ IAK G+ G +Y HFKDKSDL + +
Sbjct: 5 TKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWE 64

Query: 61 RDLDIFITHFKQIKDDYHSN 80
+ + + +
Sbjct: 65 LSESNIGELELEYQAKFPGD 84


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS13585  DHBDHDRGNASE  70     2e-16    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 69.7 bits (170), Expect = 2e-16
Identities = 48/197 (24%), Positives = 76/197 (38%), Gaps = 18/197 (9%)

Query: 3 KIVLITGGNKGLGYASAEALKALGYKVYIGSRND---VRGQQASQKLGVHYVQ--LDVTS 57
KI ITG +G+G A A L + G + N + + + H DV
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRD 68

Query: 58 DYSVKNAYNMIAEKEGRLDILINNAGISGQFSAPSKLTPRDVEEVYQTNVFGIVRMMNTF 117
++ I + G +DIL+N AG+ + L+ + E + N G+ +
Sbjct: 69 SAAIDEITARIEREMGPIDILVNVAGVL-RPGLIHSLSDEEWEATFSVNSTGVFNASRSV 127

Query: 118 VPLLEKSEQPVVVNVSSGLGSFGMVTNPETAESKVNSLAYCSSKSAVTMLTLQYAKGLP- 176
+ +V V S P T+ + AY SSK+A M T L
Sbjct: 128 SKYMMDRRSGSIVTVGSNPAG-----VPRTSMA-----AYASSKAAAVMFTKCLGLELAE 177

Query: 177 -NMQINAADPGATNTDL 192
N++ N PG+T TD+
Sbjct: 178 YNIRCNIVSPGSTETDM 194


49  SACOL_RS13885  SACOL_RS13930  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
SACOL_RS13885  0       14       2.355463   clumping factor B
SACOL_RS13890  1       15       2.371905   transcriptional regulator
SACOL_RS13895  -2      12       0.457023   carbamate kinase 2
SACOL_RS13900  -2      10       0.713592   arginine-ornithine antiporter
SACOL_RS13905  -3      10       0.647903   ornithine carbamoyltransferase
SACOL_RS13915  -3      10       0.098571   arginine deiminase
SACOL_RS13920  -1      10       -1.025495  hypothetical protein
SACOL_RS13925  -2      10       0.878687   ArgR family transcriptional regulator
SACOL_RS13930  -2      10       1.306727   aureolysin
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
SACOL_RS13890  PF05616  51     2e-08    Neisseria meningitidis TspB protein
		>PF05616#Neisseria meningitidis TspB protein

Length = 501

Score = 50.5 bits (120), Expect = 2e-08
Identities = 36/125 (28%), Positives = 57/125 (45%), Gaps = 14/125 (11%)

Query: 508 NVDPVTNRDYSIFGWNNENVVRYGGGSADGDSAVNPK-----DPTPG----PPVDPEPSP 558
N+ PVT+R+ N VV G + G++ V+ + D TPG P P P
Sbjct: 277 NMGPVTDRN-----GNPVQVVATFGRDSQGNTTVDVQVIPRPDLTPGSAEAPNAQPLPEV 331

Query: 559 DPEPEPTPDPEPSPDPEPEPSPDPDPDSDSDSDSGSDSDSGSDSDSESDSDSDSDSDSDS 618
P P +P P+ +P P+P+PDPD + D++ +D G+ DS + D +
Sbjct: 332 SPAENPANNPAPNENPGTRPNPEPDPDLNPDANPDTDGQPGTRPDSPAVPDRPNGRHRKE 391

Query: 619 DSDSE 623
+ E
Sbjct: 392 RKEGE 396



Score = 34.7 bits (79), Expect = 0.002
Identities = 18/63 (28%), Positives = 27/63 (42%), Gaps = 1/63 (1%)

Query: 538 DSAVNPK-DPTPGPPVDPEPSPDPEPEPTPDPEPSPDPEPEPSPDPDPDSDSDSDSGSDS 596
+ A NP + PG +PEP PD P+ PD + P P+ PD + +
Sbjct: 336 NPANNPAPNENPGTRPNPEPDPDLNPDANPDTDGQPGTRPDSPAVPDRPNGRHRKERKEG 395

Query: 597 DSG 599
+ G
Sbjct: 396 EDG 398


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS13900  CARBMTKINASE  387    e-138    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 387 bits (995), Expect = e-138
Identities = 137/314 (43%), Positives = 196/314 (62%), Gaps = 5/314 (1%)

Query: 1 MKEKIVIALGGNAIQT--TEATAEAQQTAIRCAMQNLKPLFDSPARIVISHGNGPQIGSL 58
M +++VIALGGNA+Q + + E +R + + + +VI+HGNGPQ+GSL
Sbjct: 1 MGKRVVIALGGNALQQRGQKGSYEEMMDNVRKTARQIAEIIARGYEVVITHGNGPQVGSL 60

Query: 59 LIQQAKSNSDT-TPAMPLDTCGAMSQGMIGYWLETEINRILTEMNSDRTVGTIVTRVEVD 117
L+ + PA P+D GAMSQG IGY ++ + L + ++ V TI+T+ VD
Sbjct: 61 LLHMDAGQATYGIPAQPMDVAGAMSQGWIGYMIQQALKNELRKRGMEKKVVTIITQTIVD 120

Query: 118 KDDPRFDNPTKPIGPFYTKEEVEELQKEQPDSVFKEDAGRGYRKVVASPLPQSILEHQLI 177
K+DP F NPTKP+GPFY +E + L +E + KED+GRG+R+VV SP P+ +E + I
Sbjct: 121 KNDPAFQNPTKPVGPFYDEETAKRLARE-KGWIVKEDSGRGWRRVVPSPDPKGHVEAETI 179

Query: 178 RTLADGKNIVIACGGGGIPVIKKENTYEGVEAVIDKDFASEKLATLIEADTLMILTNVEN 237
+ L + IVIA GGGG+PVI ++ +GVEAVIDKD A EKLA + AD MILT+V
Sbjct: 180 KKLVERGVIVIASGGGGVPVILEDGEIKGVEAVIDKDLAGEKLAEEVNADIFMILTDVNG 239

Query: 238 VFINFNEPNQQQIDDIDVATLKKYAAQGKFVEGSMLPKIEAAIRFVESGENKKVIITNLE 297
+ + +Q + ++ V L+KY +G F GSM PK+ AAIRF+E G ++ II +LE
Sbjct: 240 AALYYGTEKEQWLREVKVEELRKYYEEGHFKAGSMGPKVLAAIRFIEWG-GERAIIAHLE 298

Query: 298 QAYEALIGNKGTHI 311
+A EAL G GT +
Sbjct: 299 KAVEALEGKTGTQV 312


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS13915  ARGDEIMINASE  507    0.0      Bacterial arginine deiminase signature.
		>ARGDEIMINASE#Bacterial arginine deiminase signature.

Length = 409

Score = 507 bits (1308), Expect = 0.0
Identities = 193/409 (47%), Positives = 275/409 (67%), Gaps = 8/409 (1%)

Query: 5 PIKVNSEIGALKTVLLKRPGKELENLVPDYLDGLLFDDIPYLEVAQKEHDHFAQVLREEG 64
PI + SEIG LK VLL RPG+ELENL P + LFDDIPYLEVA++EH+ FA +L+
Sbjct: 7 PINIFSEIGRLKKVLLHRPGEELENLTPFIMKNFLFDDIPYLEVARQEHEVFASILKNNL 66

Query: 65 VEVLYLEKLAAESIENPQ-VRSEFIDDVLAESKKTILGHEEEIKALFATLSNQELVDKIM 123
VE+ Y+E L +E + + + ++FI + E++ +K F++L+ ++ K++
Sbjct: 67 VEIEYIEDLISEVLVSSVALENKFISQFILEAEIKTDFTINLLKDYFSSLTIDNMISKMI 126

Query: 124 SGVRKEEINPKCTHLVEYMDDKYPFYLDPMPNLYFTRDPQASIGHGITINRMFWRARRRE 183
SGV EE+ + L + ++ F +DPMPN+ FTRDP ASIG+G+TIN+MF + R+RE
Sbjct: 127 SGVVTEELKNYTSSLDDLVNGANLFIIDPMPNVLFTRDPFASIGNGVTINKMFTKVRQRE 186

Query: 184 SIFIQYIVKHHPRFKDANIPIWLDRDCPFNIEGGDELVLSKDVLAIGVSERTSAQAIEKL 243
+IF +YI K+HP +K N+PIWL+R ++EGGDELVL+K +L IG+SERT A+++EKL
Sbjct: 187 TIFAEYIFKYHPVYK-ENVPIWLNRWEEASLEGGDELVLNKGLLVIGISERTEAKSVEKL 245

Query: 244 ARRIFENPQATFKKVVAIEIPTSRTFMHLDTVFTMIDYDKFTMHSAILKAEGNMNIFIIE 303
A +F+N + +F ++A +IP +R++MHLDTVFT IDY FT ++ + +I+++
Sbjct: 246 AISLFKN-KTSFDTILAFQIPKNRSYMHLDTVFTQIDYSVFTSFTSD---DMYFSIYVLT 301

Query: 304 YDDVNKDIAIK-QSSHLKDTLEDVLGIDDIQFIPTGNGDVIDGAREQWNDGSNTLCIRPG 362
Y+ + I IK + + +KD L LG I I GD+I GAREQWNDG+N L I PG
Sbjct: 302 YNPSSSKIHIKKEKARIKDVLSFYLG-RKIDIIKCAGGDLIHGAREQWNDGANVLAIAPG 360

Query: 363 VVVTYDRNYVSNDLLRQKGIKVIEISGSELVRGRGGPRCMSQPLFREDI 411
++ Y RN+V+N L + GIKV I SEL RGRGGPRCMS PL REDI
Sbjct: 361 EIIAYSRNHVTNKLFEENGIKVHRIPSSELSRGRGGPRCMSMPLIREDI 409


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS13925  ARGREPRESSOR  82     7e-23    Bacterial arginine repressor signature.
		>ARGREPRESSOR#Bacterial arginine repressor signature.

Length = 149

Score = 82.2 bits (203), Expect = 7e-23
Identities = 38/147 (25%), Positives = 78/147 (53%), Gaps = 2/147 (1%)

Query: 1 MKKSKRLEIVSTIVKKHKIYKKEQIISYIEEYFGVRYSATTIAKDLKELNIYRVPIDCET 60
M K +R + I+ ++I +++++ +++ G + T+++D+KEL++ +VP + +
Sbjct: 1 MNKGQRHIKIREIITANEIETQDELVDILKKD-GYNVTQATVSRDIKELHLVKVPTNNGS 59

Query: 61 WIYKAINNQTEQEMREKFRHYCEHEVLSSIINGSYIIVKTSPGFAQGINYFIDQLNIEEI 120
+ Y ++ K + + I++KT PG AQ I +D L+ EEI
Sbjct: 60 YKY-SLPADQRFNPLSKLKRSLMDAFVKIDSASHLIVLKTMPGNAQAIGALMDNLDWEEI 118

Query: 121 LGTVSGNDTTLILTASNDMAEYVYAKL 147
+GT+ G+DT LI+ ++D + V K+
Sbjct: 119 MGTICGDDTILIICRTHDDTKVVQKKI 145


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits         Score  E-value  Comments
SACOL_RS13930  THERMOLYSIN  440    e-152    Thermolysin metalloprotease (M4) family signature.
		>THERMOLYSIN#Thermolysin metalloprotease (M4) family signature.

Length = 544

Score = 440 bits (1133), Expect = e-152
Identities = 173/480 (36%), Positives = 249/480 (51%), Gaps = 42/480 (8%)

Query: 64 NIYQDYAVTDVKTDKKGFTHYTLQPSVDGVHAPDKEVKVHADKSGKVVLING----DTDA 119
+ ++ K D+ G T + ++ + H + G++ ++G + D
Sbjct: 71 QARERLSLIGNKLDELGHTVMRFEQAIAASLCMGAVLVAHVN-DGELSSLSGTLIPNLDK 129

Query: 120 KKVKPTNKVTLSKDDAADKAFKAVKIDKNKAKNLKDKVIKENKVEIDGDSNKYVYNVELI 179
+ +K +++ + + K A ++ K + ++ + D ++ + Y V +
Sbjct: 130 RTLKTEAAISIQQAEMIAKQDVADRVTKERPAA-EEGKPTRLVIYPDEETPRLAYEVNVR 188

Query: 180 TVTPEISHWKVKIDAQTGEILEKMNLVKEA-----------AETGKGKGVLGDTKDINI- 227
+TP +W IDA G++L K N + EA + G G+GVLGD K IN
Sbjct: 189 FLTPVPGNWIYMIDAADGKVLNKWNQMDEAKPGGAQPVAGTSTVGVGRGVLGDQKYINTT 248

Query: 228 -NSIDGGFSLEDLTHQGKLSAFSFNDQTG-QATLITNEDENFVKDEQRAGVDANYYAKQT 285
+S G + L+D T + + ++T +L + D F A VDA+YYA
Sbjct: 249 YSSYYGYYYLQDNTRGSGIFTYDGRNRTVLPGSLWADGDNQFFASYDAAAVDAHYYAGVV 308

Query: 286 YDYYKDTFGRESYDNQGSPIVSLTHVNNYGGQDNRNNAAWIGDKMIYGDGDGRTFTSLSG 345
YDYYK+ GR SYD + I S H YG NNA W G +M+YGDGDG+TF SG
Sbjct: 309 YDYYKNVHGRLSYDGSNAAIRSTVH---YG--RGYNNAFWNGSQMVYGDGDGQTFLPFSG 363

Query: 346 ANDVVAHELTHGVTQETANLEYKDQSGALNESFSDVFGYFVD-----DEDFLMGEDVYTP 400
DVV HELTH VT TA L Y+++SGA+NE+ SD+FG V+ + D+ +GED+YTP
Sbjct: 364 GIDVVGHELTHAVTDYTAGLVYQNESGAINEAMSDIFGTLVEFYANRNPDWEIGEDIYTP 423

Query: 401 GKEGDALRSMSNPEQFGQPAHMKDYVFTEKDNGGVHTNSGIPNKAAYNVIQ--------- 451
G GDALRSMS+P ++G P H +DNGGVHTNSGI NKAAY + Q
Sbjct: 424 GVAGDALRSMSDPAKYGDPDHYSKRYTGTQDNGGVHTNSGIINKAAYLLSQGGVHYGVSV 483

Query: 452 -AIGKSKSEQIYYRALTEYLTSNSNFKDCKDALYQAAKDLYDEQTAE--QVYEAWNEVGV 508
IG+ K +I+YRAL YLT SNF + A QAA DLY + E V +A+N VGV
Sbjct: 484 TGIGRDKMGKIFYRALVYYLTPTSNFSQLRAACVQAAADLYGSTSQEVNSVKQAFNAVGV 543


50  SACOL_RS13960  SACOL_RS13990  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
SACOL_RS13960  -1      9        0.197217   phage infection protein
SACOL_RS13965  0       11       -0.034324  N-acetylmuramoyl-L-alanine amidase
SACOL_RS13970  0       15       0.088341   hypothetical protein
SACOL_RS13975  -1      14       0.439285   hypothetical protein
SACOL_RS13980  0       14       0.834651   accessory Sec system glycosylation chaperone
SACOL_RS13985  0       13       0.262435   accessory Sec system glycosyltransferase GtfA
SACOL_RS13990  6       14       2.428600   accessory Sec system translocase SecA2
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS13960  ABC2TRNSPORT  39     6e-05    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 39.1 bits (91), Expect = 6e-05
Identities = 37/172 (21%), Positives = 67/172 (38%), Gaps = 28/172 (16%)

Query: 817 NKHKSLESVLTTRQVFLGKAGFFIMLGML-----QALIVSVGDLLILKAGVESP---VLF 868
++ E++L T Q+ LG I+LG + +A + G ++ A + +L+
Sbjct: 95 EGQRTWEAMLYT-QLRLGD----IVLGEMAWAATKAALAGAGIGVVAAALGYTQWLSLLY 149

Query: 869 VLITI-FCSIIFNSIVYTCVSLLGNPGKAIAIVLLVLQIAG----GGGTFPIQTTPQFFQ 923
L I + F S+ +L P I L I G FP+ P FQ
Sbjct: 150 ALPVIALTGLAFASLGMVVTAL--APSYDYFIFYQTLVITPILFLSGAVFPVDQLPIVFQ 207

Query: 924 NISPYLPFTYAIDSLRETV-----GGIVPEILITKLIILTLFGIGFFVVGLI 970
+ +LP +++ID +R + + + + I+ F F L+
Sbjct: 208 TAARFLPLSHSIDLIRPIMLGHPVVDVCQHVGALCIYIVIPF---FLSTALL 256


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
SACOL_RS13965  FLGFLGJ  64     4e-13    Flagellar protein FlgJ signature.
		>FLGFLGJ#Flagellar protein FlgJ signature.

Length = 313

Score = 64.0 bits (155), Expect = 4e-13
Identities = 50/176 (28%), Positives = 84/176 (47%), Gaps = 19/176 (10%)

Query: 304 SNNDDSGQFNVVDSKDTRQFVKSIAKDAHRIGQDNDIYASVMIAQAILESDSGRSALAKS 363
N DDS D++ F+ ++ A Q + + +++AQA LES G+ + +
Sbjct: 139 RNYDDSLPG------DSKAFLAQLSLPAQLASQQSGVPHHLILAQAALESGWGQRQIRRE 192

Query: 364 ---PNHNLFGIK--GAFEGNSVPFNTLEADGNQLYSINAGFRKYPSTKESLKDYSDLIKN 418
P++NLFG+K G ++G T E + + + A FR Y S E+L DY L+
Sbjct: 193 NGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKKVKAKFRVYSSYLEALSDYVGLLTR 252

Query: 419 GIDGNRTIYKPTWKSEADSYKDATSHLSKTYATDPNYAKKLNSIIKHYQLTQFDDE 474
+ + A + +DA YATDP+YA+KL ++I+ Q+ D+
Sbjct: 253 NPRYAAVTTAASAEQGAQALQDA------GYATDPHYARKLTNMIQ--QMKSISDK 300


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS13970  ISCHRISMTASE  77     3e-19    Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 77.0 bits (189), Expect = 3e-19
Identities = 41/183 (22%), Positives = 77/183 (42%), Gaps = 10/183 (5%)

Query: 3 RKTALLVLDMQE----GIASSVPRIKNIIKANQRAIEAARQHRIPVIFIRLVLDKHFNDV 58
+ LL+ DMQ + + + ++ Q IPV++ ++ +D
Sbjct: 29 NRAVLLIHDMQNYFVDAFTAGASPVTELSANIRKLKNQCVQLGIPVVYTAQPGSQNPDDR 88

Query: 59 SSSNKVFSTIKAQGYAITEADASTRILEDLAPLEDEPIISKRRFSAFTGSYLEVYLRAND 118
+ + G + +I+ +LAP +D+ +++K R+SAF + L +R
Sbjct: 89 ALLTDFW------GPGLNSGPYEEKIITELAPEDDDLVLTKWRYSAFKRTNLLEMMRKEG 142

Query: 119 INHLVLTGVSTSGAVLSTALESVDKDYYITVLEDAVGDRSDDKHDFIIEQILSRSCDIES 178
+ L++TG+ L TA E+ +D + DAV D S +KH +E R
Sbjct: 143 RDQLIITGIYAHIGCLVTACEAFMEDIKAFFVGDAVADFSLEKHQMALEYAAGRCAFTVM 202

Query: 179 VES 181
+S
Sbjct: 203 TDS 205


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits  Score  E-value  Comments
SACOL_RS13990  SECA  660    0.0      SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 660 bits (1703), Expect = 0.0
Identities = 288/835 (34%), Positives = 451/835 (54%), Gaps = 68/835 (8%)

Query: 10 NELRLKSIRKIVKRINTWSDEVKSYSDDALKQKTIEFKERLASGVDTLDTLLPEAYAVAR 69
N+ L+ +RK+V IN E++ SD+ LK KT EF+ RL G + L+ L+PEA+AV R
Sbjct: 14 NDRTLRRMRKVVNIINAMEPEMEKLSDEELKGKTAEFRARLEKG-EVLENLIPEAFAVVR 72

Query: 70 EASWRVLGMYPKEVQLIGAIVLHEGNIAEMQTGEGKTLTATMPLYLNALSGKGTYLITTN 129
EAS RV GM +VQL+G +VL+E IAEM+TGEGKTLTAT+P YLNAL+GKG +++T N
Sbjct: 73 EASKRVFGMRHFDVQLLGGMVLNERCIAEMRTGEGKTLTATLPAYLNALTGKGVHVVTVN 132

Query: 130 DYLAKRDFEEMQPLYEWLGLTASLGFVDIVDYEYQKGEKRNIYEHDIIYTTNGRLGFDYL 189
DYLA+RD E +PL+E+LGLT V I KR Y DI Y TN GFDYL
Sbjct: 133 DYLAQRDAENNRPLFEFLGLT-----VGINLPGMPAPAKREAYAADITYGTNNEYGFDYL 187

Query: 190 IDNLADSAEGKFLPQLNYGIIDEVDSIILDAAQTPLVISGAPRLQSNLFHIVKEFVDTLI 249
DN+A S E + +L+Y ++DEVDSI++D A+TPL+ISG S ++ V + + LI
Sbjct: 188 RDNMAFSPEERVQRKLHYALVDEVDSILIDEARTPLIISGPAEDSSEMYKRVNKIIPHLI 247

Query: 250 E-----------DVHFKMKKTKKEIWLLNQGIEAAQSYFNV-------EDLYSEQAMVLV 291
+ HF + + +++ L +G+ + E LYS ++L+
Sbjct: 248 RQEKEDSETFQGEGHFSVDEKSRQVNLTERGLVLIEELLVKEGIMDEGESLYSPANIMLM 307

Query: 292 RNINLALRAQYLFESNVDYFVYNGDIVLIDRITGRMLPGTKLQAGLHQAIEAKEGMEVST 351
++ ALRA LF +VDY V +G+++++D TGR + G + GLHQA+EAKEG+++
Sbjct: 308 HHVTAALRAHALFTRDVDYIVKDGEVIIVDEHTGRTMQGRRWSDGLHQAVEAKEGVQIQN 367

Query: 352 DKSVMATITFQNLFKLFESFSGMTATGKLGESEFFDLYSKIVVQVPTDKAIQRIDEPDKV 411
+ +A+ITFQN F+L+E +GMT T EF +Y V VPT++ + R D PD V
Sbjct: 368 ENQTLASITFQNYFRLYEKLAGMTGTADTEAFEFSSIYKLDTVVVPTNRPMIRKDLPDLV 427

Query: 412 FRSVDEKNIAMIHDIVELHETGRPVLLITRTAEAAEYFSKVLFQMDIPNNLLIAQNVAKE 471
+ + EK A+I DI E G+PVL+ T + E +E S L + I +N+L A+ A E
Sbjct: 428 YMTEAEKIQAIIEDIKERTAKGQPVLVGTISIEKSELVSNELTKAGIKHNVLNAKFHANE 487

Query: 472 AQMIAEAGQIGSMTVATSMAGRGTDIKLG-----------------------------EG 502
A ++A+AG ++T+AT+MAGRGTDI LG +
Sbjct: 488 AAIVAQAGYPAAVTIATNMAGRGTDIVLGGSWQAEVAALENPTAEQIEKIKADWQVRHDA 547

Query: 503 VEALGGLAVIIHEHMENSRVDRQLRGRSGRQGDPGSSCIYISLDDYLVKRWSDSNLAENN 562
V GGL +I E E+ R+D QLRGRSGRQGD GSS Y+S++D L++ ++ ++
Sbjct: 548 VLEAGGLHIIGTERHESRRIDNQLRGRSGRQGDAGSSRFYLSMEDALMRIFASDRVSGMM 607

Query: 563 QLYSLDAQRLSQSNLFNRKVKQIVVKAQRISEEQGVKAREMANEFEKSISIQRDLVYEER 622
+ + + + + AQR E + R+ E++ + QR +Y +R
Sbjct: 608 RKLGMKPGEAIEHPWVTKAIA----NAQRKVESRNFDIRKQLLEYDDVANDQRRAIYSQR 663

Query: 623 NRVLEIDDAENQDFKALAKDVFEMFVNEE---KVLTKSRVVEYIYQNLSFQFNKDVACVN 679
N +L++ D ++ +DVF+ ++ + L + + + + L F+ D+
Sbjct: 664 NELLDVSDVSET-INSIREDVFKATIDAYIPPQSLEEMWDIPGLQERLKNDFDLDLPIAE 722

Query: 680 FKDKQAVVT------FLLEQFEKQLALNRKNMQSAYYYNIFVQKVFLKAIDSCWLEQVDY 733
+ DK+ + +L Q + ++ + A F + V L+ +DS W E +
Sbjct: 723 WLDKEPELHEETLRERILAQSIEVYQ-RKEEVVGAEMMRHFEKGVMLQTLDSLWKEHLAA 781

Query: 734 LQQLKASVNQRQNGQRNAIFEYHRVALDSFEVMTRNIKKRMVKNICQSMITFDKE 788
+ L+ ++ R Q++ EY R + F M ++K ++ + + + +E
Sbjct: 782 MDYLRQGIHLRGYAQKDPKQEYKRESFSMFAAMLESLKYEVISTLSKVQVRMPEE 836


51  SACOL_RS14010  SACOL_RS14070  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
SACOL_RS14010  11      16       3.102318   accessory Sec system protein translocase subunit
SACOL_RS14015  12      16       3.954522   serine-rich adhesin for platelets
SACOL_RS14020  -1      14       0.936510   flavin reductase
SACOL_RS14025  -1      14       0.041811   hypothetical protein
SACOL_RS14030  -2      14       0.124380   hypothetical protein
SACOL_RS14035  -2      15       0.596274   hypothetical protein
SACOL_RS14040  -2      14       -0.484284  hypothetical protein
SACOL_RS14045  -1      16       -1.499633  methionine sulfoxide reductase A
SACOL_RS14050  -2      16       -1.124281  GNAT family acetyltransferase
SACOL_RS14055  -2      17       -1.289365  capsular polysaccharide biosynthesis protein
SACOL_RS14060  -1      16       -1.904153  capsular polysaccharide biosynthesis protein
SACOL_RS14065  -1      14       -3.483218  capsular polysaccharide biosynthesis protein
SACOL_RS14070  -2      12       -1.844413  biofilm operon icaADBC HTH-type negative
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS14010  SECYTRNLCASE  127    4e-35    Preprotein translocase SecY subunit signature.
		>SECYTRNLCASE#Preprotein translocase SecY subunit signature.

Length = 437

Score = 127 bits (322), Expect = 4e-35
Identities = 92/440 (20%), Positives = 179/440 (40%), Gaps = 52/440 (11%)

Query: 4 LLQQYEYKIIYKRMLYTCFILFIYILGTNISI--VSYNDMQ------VKHESFFKIAISN 55
+ + + K++L+T I+ +Y +GT+I I V Y ++Q ++ F +
Sbjct: 5 FARAFRTPDLRKKLLFTLAIIVVYRVGTHIPIPGVDYKNVQQCVREASGNQGLFGLVNMF 64

Query: 56 MGGDVNTLNIFTLGLGPWLTSMIILMLISYRNMDKYMKQTSLEKHYKE------------ 103
GG + + IF LG+ P++T+ IIL L++ + LE KE
Sbjct: 65 SGGALLQITIFALGIMPYITASIILQLLT-------VVIPRLEALKKEGQAGTAKITQYT 117

Query: 104 RILTLILSVIQSYFVIHEYVSKERVHQDN-------------IYLTILILVTGTMLLVWL 150
R LT+ L+++Q ++ S + + ++ + GT +++WL
Sbjct: 118 RYLTVALAILQGTGLVATARSAPLFGRCSVGGQIVPDQSIFTTITMVICMTAGTCVVMWL 177

Query: 151 ADKNSRYGIAGPMPIVMVSIIKSMMHQKMEYI------DASHIVIALLIILVIITLFILL 204
+ + GI M I+M I + + I I +I + +I + +++
Sbjct: 178 GELITDRGIGNGMSILMFISIAATFPSALWAIKKQGTLAGGWIEFGTVIAVGLIMVALVV 237

Query: 205 FIELVEVRIPYI----DLMNVSATNMKSYLSWKVNPAGSITLMMSISAFVFLKSGIHFIL 260
F+E + RIP + S +Y+ KVN AG I ++ + S F
Sbjct: 238 FVEQAQRRIPVQYAKRMIGRRSYGGTSTYIPLKVNQAGVIPVIFASSLLYIPALVAQFAG 297

Query: 261 SMFNKSISDDMPMLTFDSPVGISVYLVIQMLLGYFLSRFLINTKQKSKDFLKSGNYFSGV 320
+ + D P+ I Y ++ + +F N ++ + + K G + G+
Sbjct: 298 GNSGWKSWVEQNLTKGDHPIYIVTYFLLIVFFAFFYVAISFNPEEVADNMKKYGGFIPGI 357

Query: 321 KPGKDTERYLNYQARRVCWFGLALVTVIIGIPLYFTLFVPHLSTEIYFS-VQLIVLVYIS 379
+ G+ T YL+Y R+ W G + +I +P L S F ++++V +
Sbjct: 358 RAGRPTAEYLSYVLNRITWPGSLYLGLIALVP-TMALVGFGASQNFPFGGTSILIIVGVG 416

Query: 380 INIAETIRTYLYFDKYKPFL 399
+ + I + L Y+ FL
Sbjct: 417 LETVKQIESQLQQRNYEGFL 436


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS14015  ICENUCLEATIN  57     6e-10    Ice nucleation protein signature.
		>ICENUCLEATIN#Ice nucleation protein signature.

Length = 1258

Score = 57.5 bits (138), Expect = 6e-10
Identities = 226/999 (22%), Positives = 389/999 (38%), Gaps = 4/999 (0%)

Query: 1163 TSESDSISESTSTSDSISEAISASESTFISLSESNSTSDSESQSASAFLSESLSESTSES 1222
TS I + + +E + S ++ ES S +++
Sbjct: 99 TSAMQFILHHRADYVACTEMQAGPGSPDVTSEVKVGNRSLPVTDDIDATIESGSTQPTQT 158

Query: 1223 TSESVSSSTSESTSLSDSTSESGSTSTSLSNSTSGSTSISTSTSISESTSTFKSESVSTS 1282
+ ST T S + GST T+ +ST + ST T+ ++ST S T+
Sbjct: 159 IEIATYGSTLSGTHQSQLIAGYGSTETAGDSSTLIAGYGSTGTAGADSTLVAGYGSTQTA 218

Query: 1283 LSMSTSTSLSDSTSLSTSLSDSTSDSKSDSLSTSMST--SDSISTSKSDSISTSTSLSGS 1340
S+ + ST SD T+ S + S+ + ST + S+ T+ GS
Sbjct: 219 GEESSQMAGYGSTQTGMKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGS 278

Query: 1341 TSESKSDSTSMSISMSQSTSGSTSTSTSTSLSDSTSTSLSLSASMNQSGVDSNSASQSAS 1400
T ++ S + S T+G+ S+ + S T+ S + S + S +
Sbjct: 279 TQTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTA 338

Query: 1401 NSTSTSTSESDSQSTSSYTSQSTSQSESTSTSTSLSDSTSISKSTSQSGSVSTSASLSGS 1460
ST T+ DS + Y S T+ +S+ T+ S T+ S +G ST + + S
Sbjct: 339 GYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADS 398

Query: 1461 ESESDSQSISTSASESTSESASTSLSDSTSTSNSGSASTSTSLSNSASASESDLSSTSLS 1520
+ S T+ EST + S + S+ + ST + S+ + ST +
Sbjct: 399 SLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTA 458

Query: 1521 DSTSASMQSSESDSQSTSASLSDSLSTSTSNRMSTIASLSTSVSTSESGSTSESTSESDS 1580
S+ S + S + STS + ++ ST +G S T+ S
Sbjct: 459 GEDSSLTAGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTAGYGS 518

Query: 1581 TSTSLSDSQSTSRSTSASGSASTSTSTSDSRSTSASTSTSMRTSTSDSQSMSLSTSTSTS 1640
T T+ ++S + S S + + S+ + ST ++ S+ T+ S + S T+
Sbjct: 519 TQTAQNESDLITGYGSTSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQTAREGSDLTA 578

Query: 1641 MSDSTSLSDSVSDSTSDSTSASTSGSMSVSISLSDSTSTSTSASEVMSASISDSQSMSES 1700
ST + S S + S T+ S + ST T+ S + + S S + ++S
Sbjct: 579 GYGSTGTAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGADS 638

Query: 1701 VNDSESVSESNSESDSKSMSGSTSVSDSGSLSVSTSLRKSESVSESSSLSCSQSMSDSVS 1760
+ S + +S +G S + S T+ S S + + S + S +
Sbjct: 639 SLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGADSSLIAGYGSTQTA 698

Query: 1761 TSDSSSLSVSTSLRSSESVSESDSLSDSKSTSGSTSTSTSGSLSTSTSLSGSESVSESTS 1820
+S + S ++++ S+ S S ST+G+ S+ +G ST T+ S + S
Sbjct: 699 GYNSILTAGYGSTQTAQEGSDLTSGYGSTSTAGADSSLIAGYGSTQTASYHSSLTAGYGS 758

Query: 1821 LSDSISMSDSTSTSDSDSLSGSISLSGSTSLSTSDSLSDSKSLSSSQSMSGSESTSTSVS 1880
+ S T+ S S +G+ S + ST + S + S ++ S +
Sbjct: 759 TQTAREQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYHSILTAGYGSTQTAQERSDLTT 818

Query: 1881 DSQSSSTSNSQFDSMSISASESDSMSTSDSSSISGSNSTSTSLSTSDSMSGSVSVSTSTS 1940
S+ST+ DS I+ S + +S +G ST T+ SD +G S ST+
Sbjct: 819 GYGSTSTAG--ADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGY 876

Query: 1941 LSDSISGSTSVSDSSSTSTSTSLSDSMSQSQSTSTSASGSLSTSISTSMSMSASTSSSQS 2000
S I+G S + S T+ S +Q S +G STS + S + S
Sbjct: 877 DSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYESSLIAGYGSTQ 936

Query: 2001 TSVSTSLSTSDSISDSTSISISGSQSTVESESTSDSTSISDSESLSTSDSDSTSTSTSDS 2060
T+ S + S T+ S + S S + S + ST + ST T+
Sbjct: 937 TASFKSTLMAGYGSSQTAREQSSLTAGYGSTSMAGYDSSLIAGYGSTQTAGYQSTLTAGY 996

Query: 2061 TSGSTSTSISESLSTSGSGSTSVSDSTSMSESNSSSVSMSQDKSDSTSISDSESVSTSTS 2120
S T+ S + GS +T+ +DS+ ++ SS S + + S S S
Sbjct: 997 GSTQTAEHSSTLTAGYGSTATAGADSSLIAGYGSSLTSGIRSFLTAGYGSTLISGLRSVL 1056

Query: 2121 TSLSTSDSTSTSESLSTSMSGSQSISDSTSTSMSGSTST 2159
T+ S S S T+ GS I+ S+ ++G ST
Sbjct: 1057 TAGYGSSLISGRRSSLTAGYGSNQIASHRSSLIAGPEST 1095



Score = 56.7 bits (136), Expect = 1e-09
Identities = 237/1078 (21%), Positives = 428/1078 (39%), Gaps = 14/1078 (1%)

Query: 1082 RTSESQSTSGSMSASQSDSMSISTSFSDSTSDSKSASTASSESISQSASTSTSGSVSTST 1141
+TS Q + + + + S + S +T SGS +
Sbjct: 98 KTSAMQFILHHRADYVACTEMQAGPGSPDVTSEVKVGNRSLPVTDDIDATIESGSTQPTQ 157

Query: 1142 SLSTSNSERTSTSMSDSTSLSTSESDSISESTST--SDSISEAISASESTFISLSESNST 1199
++ + T + S ++ S + +ST + S + ++ST ++ S T
Sbjct: 158 TIEIATYGSTLSGTHQSQLIAGYGSTETAGDSSTLIAGYGSTGTAGADSTLVAGYGSTQT 217

Query: 1200 SDSESQSASAFLSESLSESTSESTSESVSSSTSESTSLSDSTSESGSTSTSLSNSTSGST 1259
+ ES + + S S+ T+ S+ T+ S + S T+ S+ T+G
Sbjct: 218 AGEESSQMAGYGSTQTGMKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYG 277

Query: 1260 SISTSTSISESTSTFKSESVSTSLSMSTSTSLSDSTSLSTSLSDSTSDSKSDSLSTSMST 1319
S T+ S+ T+ + S + + S + S T+ S + S + S T
Sbjct: 278 STQTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLT 337

Query: 1320 SDSISTSKSDSISTSTSLSGSTSESKSDSTSMSISMSQSTSGSTSTSTSTSLSDSTSTSL 1379
+ ST + S+ + GST + DS+ + S T+ S T+ S T+ +
Sbjct: 338 AGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGAD 397

Query: 1380 SLSASMNQSGVDSNSASQSASNSTSTSTSESDSQSTSSYTSQSTSQSESTSTSTSLSDST 1439
S + S + S + ST T++ S T+ Y S T+ +S+ + S T
Sbjct: 398 SSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQT 457

Query: 1440 SISKSTSQSGSVSTSASLSGSESESDSQSISTSASESTSESASTSLSDSTSTSNSGSAST 1499
+ S+ +G ST + GS+ + S ST+ ES+ + S + S +
Sbjct: 458 AGEDSSLTAGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTAGYG 517

Query: 1500 STSLSNSASASESDLSSTSLSDSTSASMQSSESDSQSTSASLSDSLSTSTSNRMSTIASL 1559
ST + + S + STS + + S+ + S ++ S+ + ST
Sbjct: 518 STQTAQNESDLITGYGSTSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQTAREGSDLT 577

Query: 1560 STSVSTSESGSTSESTSESDSTSTSLSDSQSTSRSTSASGSASTSTSTSDSRSTSASTST 1619
+ ST +GS S + ST T+ S T+ S + S T+ STS + +
Sbjct: 578 AGYGSTGTAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGAD 637

Query: 1620 SMRTSTSDSQSMSLSTSTSTSMSDSTSLSDSVSDSTSDSTSASTSGSMSVSISLSDSTST 1679
S + S + S T+ ST + SD T+ S ST+G+ S I+ ST T
Sbjct: 638 SSLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGADSSLIAGYGSTQT 697

Query: 1680 STSASEVMSASISDSQSMSESVNDSESVSESNSESDSKSMSGSTSVSDSGSLSVSTSLRK 1739
+ S + + S + S S S S + +DS ++G S + S T+
Sbjct: 698 AGYNSILTAGYGSTQTAQEGSDLTSGYGSTSTAGADSSLIAGYGSTQTASYHSSLTAGYG 757

Query: 1740 SESVSESSSLSCSQSMSDSVSTSDSSSLSVSTSLRSSESVSESDSLSDSKSTSGSTSTST 1799
S + S+ + S S + +DSS ++ S +++ S + S T+ S T
Sbjct: 758 STQTAREQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYHSILTAGYGSTQTAQERSDLT 817

Query: 1800 SGSLSTSTSLSGSESVSESTSLSDSISMSDSTSTSDSDSLSGSISLSGSTSLSTSDSLSD 1859
+G STST+ + S ++ S + S T+ S + S + STS + D
Sbjct: 818 TGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYD 877

Query: 1860 SKSLSSSQSMSGSESTSTSVSDSQSSSTSNSQFDSMSISASESDSMSTSDSSSISGSNST 1919
S ++ S + S + S+ T+ D + S S + S + GS T
Sbjct: 878 SSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYESSLIAGYGSTQT 937

Query: 1920 STSLSTSDSMSGSVSVSTSTSLSDSISGSTSVSDSSSTSTSTSLSDSMSQSQSTSTSASG 1979
++ ST + GS + S + GSTS++ S+ + S + QST T+ G
Sbjct: 938 ASFKSTLMAGYGSSQTAREQSSLTAGYGSTSMAGYDSSLIAGYGSTQTAGYQSTLTAGYG 997

Query: 1980 SLSTSISTSMSMSASTSSSQSTSVSTSLSTSDSISDST----------SISISGSQSTVE 2029
S T+ +S + S++ + + S+ ++ S S S ISG +S +
Sbjct: 998 STQTAEHSSTLTAGYGSTATAGADSSLIAGYGSSLTSGIRSFLTAGYGSTLISGLRSVLT 1057

Query: 2030 SESTSDSTSISDSESLSTSDSDSTSTSTSDSTSGSTSTSIS--ESLSTSGSGSTSVSDST 2087
+ S S S + S+ ++ S +G ST I+ S+ +G GS+ +
Sbjct: 1058 AGYGSSLISGRRSSLTAGYGSNQIASHRSSLIAGPESTQITGNRSMLIAGKGSSQTAGYR 1117

Query: 2088 SMSESNSSSVSMSQDKSDSTSISDSESVSTSTSTSLSTSDSTSTSESLSTSMSGSQSI 2145
S S + SV M+ ++ + +DS + S L+ ++S T+ S +G+ I
Sbjct: 1118 STLISGADSVQMAGERGKLIAGADSTQTAGDRSKLLAGNNSYLTAGDRSKLTAGNDCI 1175



Score = 53.2 bits (127), Expect = 1e-08
Identities = 226/1002 (22%), Positives = 400/1002 (39%), Gaps = 4/1002 (0%)

Query: 875 TSTSLSDSLSMSTSGSLSKSQSLSTSISGSSSTSASLSDSTSNAISTSTSLSESASTSDS 934
ST + ++ ++T GS S I+G ST + ST A ST + + ST +
Sbjct: 151 GSTQPTQTIEIATYGSTLSGTHQSQLIAGYGSTETAGDSSTLIAGYGSTGTAGADSTLVA 210

Query: 935 ISISNSIANSQSASTSKSDSQSTSISLSTSDSKSMSTSESLSDSTSTSGSVSGSLSIAAS 994
S A +S+ + S T + S + ST + DS+ +G GS A
Sbjct: 211 GYGSTQTAGEESSQMAGYGSTQTGMKGSDLTAGYGSTGTAGDDSSLIAGY--GSTQTAGE 268

Query: 995 QSVSTSTSDSMSTSEIVSDSISTSGSLSASDSKSMSVSSSMSTSQSGSTSESLSDSQSTS 1054
S T+ S T++ SD + GS + + S ++ ST +G S + ST
Sbjct: 269 DSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQ 328

Query: 1055 DSDSKSLSQSTSQSGSTSTSTSTSASVRTSESQSTSGSMSASQSDSMSISTSFSDSTSDS 1114
+ S + S T+ S+ + S + S + S + SD T+
Sbjct: 329 TAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGY 388

Query: 1115 KSASTASSESISQSASTSTSGSVSTSTSLSTSNSERTSTSMSDSTSLSTSESDSISESTS 1174
S TA ++S + ST + ST + S +T+ SD T+ S + +S+
Sbjct: 389 GSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSL 448

Query: 1175 TSDSISEAISASESTFISLSESNSTSDSESQSASAFLSESLSESTSESTSESVSSSTSES 1234
+ S + +S+ + S T+ S + + S S + S + S+ T+
Sbjct: 449 IAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGY 508

Query: 1235 TSLSDSTSESGSTSTSLSNSTSGSTSISTSTSISESTSTFKSESVSTSLSMSTSTSLSDS 1294
S + S T+ + S+ +G S ST+ + S + + S ++ S+ T+ S
Sbjct: 509 GSTLTAGYGSTQTAQNESDLITGYGSTSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQ 568

Query: 1295 TSLSTSLSDSTSDSKSDSLSTSMSTSDSISTSKSDSISTSTSLSGSTSESKSDSTSMSIS 1354
T+ S + S + S S + ST + S+ T+ GST ++ S +
Sbjct: 569 TAREGSDLTAGYGSTGTAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGY 628

Query: 1355 MSQSTSGSTSTSTSTSLSDSTSTSLSLSASMNQSGVDSNSASQSASNSTSTSTSESDSQS 1414
S ST+G+ S+ + S T+ S+ + S + S + STST+ +DS
Sbjct: 629 GSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGADSSL 688

Query: 1415 TSSYTSQSTSQSESTSTSTSLSDSTSISKSTSQSGSVSTSASLSGSESESDSQSISTSAS 1474
+ Y S T+ S T+ S T+ S SG STS + + S + S T++
Sbjct: 689 IAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTSGYGSTSTAGADSSLIAGYGSTQTASY 748

Query: 1475 ES--TSESASTSLSDSTSTSNSGSASTSTSLSNSASASESDLSSTSLSDSTSASMQSSES 1532
S T+ ST + S +G STST+ ++S+ + + T+ S + S
Sbjct: 749 HSSLTAGYGSTQTAREQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYHSILTAGYGSTQ 808

Query: 1533 DSQSTSASLSDSLSTSTSNRMSTIASLSTSVSTSESGSTSESTSESDSTSTSLSDSQSTS 1592
+Q S + STST+ S++ + S T+ S + S T+ SD +
Sbjct: 809 TAQERSDLTTGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGY 868

Query: 1593 RSTSASGSASTSTSTSDSRSTSASTSTSMRTSTSDSQSMSLSTSTSTSMSDSTSLSDSVS 1652
STS +G S+ + S T+ S S + S T+ S ST+ +S
Sbjct: 869 GSTSTAGYDSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYESSL 928

Query: 1653 DSTSDSTSASTSGSMSVSISLSDSTSTSTSASEVMSASISDSQSMSESVNDSESVSESNS 1712
+ ST ++ S ++ S T+ S+ S S + S + S +
Sbjct: 929 IAGYGSTQTASFKSTLMAGYGSSQTAREQSSLTAGYGSTSMAGYDSSLIAGYGSTQTAGY 988

Query: 1713 ESDSKSMSGSTSVSDSGSLSVSTSLRKSESVSESSSLSCSQSMSDSVSTSDSSSLSVSTS 1772
+S + GST ++ S + + + ++SS ++ S S S ++ ST
Sbjct: 989 QSTLTAGYGSTQTAEHSSTLTAGYGSTATAGADSSLIAGYGSSLTSGIRSFLTAGYGSTL 1048

Query: 1773 LRSSESVSESDSLSDSKSTSGSTSTSTSGSLSTSTSLSGSESVSESTSLSDSISMSDSTS 1832
+ SV + S S S+ T+ GS ++ S + EST ++ + SM +
Sbjct: 1049 ISGLRSVLTAGYGSSLISGRRSSLTAGYGSNQIASHRSSLIAGPESTQITGNRSMLIAGK 1108

Query: 1833 TSDSDSLSGSISLSGSTSLSTSDSLSDSKSLSSSQSMSGSES 1874
S + S +SG+ S+ + + + S +G S
Sbjct: 1109 GSSQTAGYRSTLISGADSVQMAGERGKLIAGADSTQTAGDRS 1150



Score = 52.1 bits (124), Expect = 3e-08
Identities = 174/773 (22%), Positives = 304/773 (39%), Gaps = 2/773 (0%)

Query: 1398 SASNSTSTSTSESDSQSTSSYTSQSTSQSESTSTSTSLSDSTSISKSTSQSGSVSTSASL 1457
+ + +E + S + + T D+T S ST + ++ +
Sbjct: 106 LHHRADYVACTEMQAGPGSPDVTSEVKVGNRSLPVTDDIDATIESGSTQPTQTIEIATYG 165

Query: 1458 SGSESESDSQSISTSASESTSESASTSLSDSTSTSNSGSASTSTSLSNSASASESDLSST 1517
S SQ I+ S T+ +ST ++ ST +G+ ST + S + + S
Sbjct: 166 STLSGTHQSQLIAGYGSTETAGDSSTLIAGYGSTGTAGADSTLVAGYGSTQTAGEESSQM 225

Query: 1518 SLSDSTSASMQSSESDSQSTSASLSDSLSTSTSNRMSTIASLSTSVSTSESGSTSESTSE 1577
+ ST M+ S+ + S + S+ + ST + S T+ GST +
Sbjct: 226 AGYGSTQTGMKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKG 285

Query: 1578 SDSTSTSLSDSQSTSRSTSASGSASTSTSTSDSRSTSASTSTSMRTSTSDSQSMSLSTST 1637
SD T+ S + + S+ +G ST T+ +S T+ ST SD + ST T
Sbjct: 286 SDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGT 345

Query: 1638 STSMSDSTSLSDSVSDSTSDSTSASTSGSMSVSISLSDSTSTSTSASEVMSASISDSQSM 1697
+ S + S + DS+ + GS + SD T+ S + S +
Sbjct: 346 AGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGYG 405

Query: 1698 SESVNDSESVSESNSESDSKSMSGSTSVSDSGSLSVSTSLRKSESVSESSSLSCSQSMSD 1757
S ES + S + GS + GS + + S+ + S
Sbjct: 406 STQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLT 465

Query: 1758 SVSTSDSSSLSVSTSLRSSESVSESDSLSDSKSTSGSTSTSTSGSLSTSTSLSGSESVSE 1817
+ S ++ S S S + S + GST T+ GS T+ S + +E
Sbjct: 466 AGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTAGYGSTQTAQNE 525

Query: 1818 STSLSDSISMSDSTSTSDSDSLSGSISLSGSTSLSTSDSLSDSKSLSSSQSMSGSESTST 1877
S ++ S S + + S + GS + S+ T+ S + S +G ST T
Sbjct: 526 SDLITGYGSTSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQTAREGSDLTAGYGSTGT 585

Query: 1878 SVSDSQSSSTSNSQFDSMSISASESDSMSTSDSSSISGSNSTSTSLSTSDSMSGSVSVST 1937
+ SDS + S + S+ + ST + S + S ST+ + S ++
Sbjct: 586 AGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGADSSLIAGYG 645

Query: 1938 STSLSDSISGSTSVSDSSSTSTSTSLSDSMSQSQSTSTSASGSLSTSISTSMSMSASTSS 1997
ST + S T+ S+ T+ S + S ST+ + S ++ ST + S +
Sbjct: 646 STQTAGYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGADSSLIAGYGSTQTAGYNSILT 705

Query: 1998 SQSTSVSTSLSTSDSISDSTSISISGSQSTVESESTSDSTSISDSESLSTSDSDSTSTST 2057
+ S T+ SD S S S +G+ S++ + S T+ S + S T+
Sbjct: 706 AGYGSTQTAQEGSDLTSGYGSTSTAGADSSLIAGYGSTQTASYHSSLTAGYGSTQTAREQ 765

Query: 2058 SDSTS--GSTSTSISESLSTSGSGSTSVSDSTSMSESNSSSVSMSQDKSDSTSISDSESV 2115
S T+ GSTST+ ++S +G GST + S+ + S +Q++SD T+ S S
Sbjct: 766 SVLTTGYGSTSTAGADSSLIAGYGSTQTAGYHSILTAGYGSTQTAQERSDLTTGYGSTST 825

Query: 2116 STSTSTSLSTSDSTSTSESLSTSMSGSQSISDSTSTSMSGSTSTSESNSMHPS 2168
+ + S+ ++ ST T+ S +G S + S + S S + + S
Sbjct: 826 AGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYDS 878



Score = 51.3 bits (122), Expect = 5e-08
Identities = 204/902 (22%), Positives = 352/902 (39%), Gaps = 4/902 (0%)

Query: 1247 TSTSLSNSTSGSTSISTSTSISESTSTFKSESVSTSLSMSTSTSLSDSTSLSTSLSDSTS 1306
T TS + + + ++ + + + D + S S +
Sbjct: 97 TKTSAMQFILHHRADYVACTEMQAGPGSPDVTSEVKVGNRSLPVTDDIDATIESGSTQPT 156

Query: 1307 DSKSDSLSTSMSTSDSISTSKSDSISTSTSLSGSTSESKSDSTSMSISMSQSTSGSTSTS 1366
+ + S + S + ST T+ ST + ST + + S +G ST
Sbjct: 157 QTIEIATYGSTLSGTHQSQLIAGYGSTETAGDSSTLIAGYGSTGTAGADSTLVAGYGSTQ 216

Query: 1367 TSTSLSDSTSTSLSLSASMNQSGVDSNSASQSASNSTSTSTSESDSQSTSSYTSQSTSQS 1426
T+ S + S M S + + S + S+ + S T+ S T+
Sbjct: 217 TAGEESSQMAGYGSTQTGMKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGY 276

Query: 1427 ESTSTSTSLSDSTSISKSTSQSGSVSTSASLSGSESESDSQSISTSA--SESTSESASTS 1484
ST T+ SD T+ ST +G+ S+ + GS + +S T+ S T++ S
Sbjct: 277 GSTQTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDL 336

Query: 1485 LSDSTSTSNSGSASTSTSLSNSASASESDLSSTSLSDSTSASMQSSESDSQSTSASLSDS 1544
+ ST +G S+ + S + D S T+ ST + + S+ + S + +
Sbjct: 337 TAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGA 396

Query: 1545 LSTSTSNRMSTIASLSTSVSTSESGSTSESTSESDSTSTSLSDSQSTSRSTSASGSASTS 1604
S+ + ST + S T+ GST + SD T+ S + S+ +G ST
Sbjct: 397 DSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQ 456

Query: 1605 TSTSDSRSTSASTSTSMRTSTSDSQSMSLSTSTSTSMSDSTSLSDSVSDSTSDSTSASTS 1664
T+ DS T+ ST SD + STST+ S + S + ST +
Sbjct: 457 TAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTAGY 516

Query: 1665 GSMSVSISLSDSTSTSTSASEVMSASISDSQSMSESVNDSESVSESNSESDSKSMSGSTS 1724
GS + + SD + S S + S + S SV + S + GS
Sbjct: 517 GSTQTAQNESDLITGYGSTSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQTAREGSDL 576

Query: 1725 VSDSGSLSVSTSLRKSESVSESSSLSCSQSMSDSVSTSDSSSLSVSTSLRSSESVSESDS 1784
+ GS + S + S+ + S + S ++ S S S + +
Sbjct: 577 TAGYGSTGTAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGA 636

Query: 1785 LSDSKSTSGSTSTSTSGSLSTSTSLSGSESVSESTSLSDSISMSDSTSTSDSDSLSGSIS 1844
S + GST T+ S+ T+ S + S + S S + + S + GS
Sbjct: 637 DSSLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGADSSLIAGYGSTQ 696

Query: 1845 LSGSTSLSTSDSLSDSKSLSSSQSMSGSESTSTSVSDSQSSSTSNSQFDSMSISASESDS 1904
+G S+ T+ S + S SG STST+ +DS + S + S+ +
Sbjct: 697 TAGYNSILTAGYGSTQTAQEGSDLTSGYGSTSTAGADSSLIAGYGSTQTASYHSSLTAGY 756

Query: 1905 MSTSDSSSISGSNSTSTSLSTSDSMSGSVSVSTSTSLSDSISGSTSVSDSSSTSTSTSLS 1964
ST + S + S ST+ + S ++ ST + S T+ S+ T+ S
Sbjct: 757 GSTQTAREQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYHSILTAGYGSTQTAQERSDL 816

Query: 1965 DSMSQSQSTSTSASGSLSTSISTSMSMSASTSSSQSTSVSTSLSTSDSISDSTSISISGS 2024
+ S ST+ + S ++ ST + S ++ S T+ SD + S S +G
Sbjct: 817 TTGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGY 876

Query: 2025 QSTVESESTSDSTSISDSESLSTSDSDSTSTSTSDSTSG--STSTSISESLSTSGSGSTS 2082
S++ + S T+ +S + S T+ SD T+G STST+ ES +G GST
Sbjct: 877 DSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYESSLIAGYGSTQ 936

Query: 2083 VSDSTSMSESNSSSVSMSQDKSDSTSISDSESVSTSTSTSLSTSDSTSTSESLSTSMSGS 2142
+ S + S ++++S T+ S S++ S+ ++ ST T+ ST +G
Sbjct: 937 TASFKSTLMAGYGSSQTAREQSSLTAGYGSTSMAGYDSSLIAGYGSTQTAGYQSTLTAGY 996

Query: 2143 QS 2144
S
Sbjct: 997 GS 998



Score = 50.9 bits (121), Expect = 7e-08
Identities = 241/1060 (22%), Positives = 428/1060 (40%), Gaps = 16/1060 (1%)

Query: 907 TSASLSDSTSNAISTSTSLSESASTSDSISISNSIANSQSASTSKSDSQSTSISLSTSDS 966
TSA A + + ++ S ++ + N T D+ S S + +
Sbjct: 99 TSAMQFILHHRADYVACTEMQAGPGSPDVTSEVKVGNRSLPVTDDIDATIESGSTQPTQT 158

Query: 967 KSMSTSESLSDSTSTSGSVSGSLSIAASQSVSTSTSDSMSTSEIVSDSISTSGSLSASDS 1026
++T S T S ++G S + ST + ST +DS +G S +
Sbjct: 159 IEIATYGSTLSGTHQSQLIAGYGSTETAGDSSTLIAGYGSTGTAGADSTLVAGYGSTQTA 218

Query: 1027 KSMSVSSSMSTSQSGSTSESLSDSQSTSDSDSKSLSQSTSQSGSTSTSTSTSASVRTSES 1086
S + S S + S + S + GST T+ S+ S
Sbjct: 219 GEESSQMAGYGSTQTGMKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGS 278

Query: 1087 QSTSGSMSASQSDSMSISTSFSDSTSDSKSASTASSESISQSASTSTSGSVSTSTSLSTS 1146
T+ S + S T+ +DS+ + ST ++ S + S + S T+
Sbjct: 279 TQTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTA 338

Query: 1147 NSERTSTSMSDSTSLSTSESDSISESTSTSDSISEAISASESTFISLSESNSTSDSESQS 1206
T T+ DS+ ++ S + S+ + + ++ + ST + + S
Sbjct: 339 GYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADS 398

Query: 1207 ASAFLSESLSESTSESTSESVSSSTSESTSLSDSTSESGSTSTSLSNSTSGSTSISTSTS 1266
+ S + EST + ST + SD T+ GST T+ +S+ + ST T+
Sbjct: 399 SLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTA 458

Query: 1267 ISESTSTFKSESVSTSLSMSTSTSLSDSTSLSTSLSDSTSDSKSDSLSTSMS--TSDSIS 1324
+S+ T S T+ S T+ STS + S + S + S T+ S
Sbjct: 459 GEDSSLTAGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTAGYGS 518

Query: 1325 TSKSDSISTSTSLSGSTSESKSDSTSMSISMSQSTSGSTSTSTSTSLSDSTSTSLSLSAS 1384
T + + S + GSTS + ++S+ ++ S T+ S T+ S T+ S +
Sbjct: 519 TQTAQNESDLITGYGSTSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQTAREGSDLTA 578

Query: 1385 MNQSGVDSNSASQSASNSTSTSTSESDSQSTSSYTSQSTSQSESTSTSTSLSDSTSISKS 1444
S + S S + ST T+ S T+ Y S T++ +S T+ S ST+ + S
Sbjct: 579 GYGSTGTAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGADS 638

Query: 1445 TSQSGSVSTSASLSGSESESDSQSISTSASESTSESASTSLSDSTSTSNSGSASTSTSLS 1504
+ +G ST + +S + S T++ S + STS +G+ S+ +
Sbjct: 639 SLIAGYGSTQT------AGYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGADSSLIAGY 692

Query: 1505 NSASASESDLSSTSLSDSTSASMQSSESDSQSTSASLSDSLSTSTSNRMSTIASLSTSVS 1564
S + + T+ ST + + S+ S S S + + S+ + ST + S
Sbjct: 693 GSTQTAGYNSILTAGYGSTQTAQEGSDLTSGYGSTSTAGADSSLIAGYGSTQTASYHSSL 752

Query: 1565 TSESGSTSESTSESDSTSTSLSDSQSTSRSTSASGSASTSTSTSDSRSTSASTSTSMRTS 1624
T+ GST + +S T+ S S + + S+ +G ST T+ S T+ ST +T+
Sbjct: 753 TAGYGSTQTAREQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYHSILTAGYGST--QTA 810

Query: 1625 TSDSQSMSLSTSTSTSMSDSTSLSDSVSDSTSDSTSASTSGSMSVSISLSDSTSTSTSAS 1684
S + STST+ +DS+ ++ S T+ S T+G S + +S T+ S
Sbjct: 811 QERSDLTTGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGS 870

Query: 1685 EVMSASISDSQSMSESVNDSESVSESNSESDSKSMSGSTSVSDSGSLSVSTSLRKSESVS 1744
+ S + S + S + S + S +G S ST+ +S ++
Sbjct: 871 TSTAGYDSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYESSLIA 930

Query: 1745 ESSSLSCSQSMSDSVSTSDSSSLSVSTSLRSSESVSESDSLSDSKSTSGSTSTSTSGSLS 1804
S + S ++ SS + S ++ S S + DS +G ST T+G S
Sbjct: 931 GYGSTQTASFKSTLMAGYGSSQTAREQSSLTAGYGSTSMAGYDSSLIAGYGSTQTAGYQS 990

Query: 1805 TSTSLSGSESVSESTSLSDSISMSDSTSTSDSDSLSGSISLSGSTSLSTSDSLSDSKSLS 1864
T T+ GS +E +S + S +T+ +DS ++G S S S + S +S
Sbjct: 991 TLTAGYGSTQTAEHSSTLTAGYGSTATAGADSSLIAGYGSSLTSGIRSFLTAGYGSTLIS 1050

Query: 1865 SSQSMSGSESTSTSVSDSQSSSTSN------SQFDSMSISASESDSMSTSDSSSISGSNS 1918
+S+ + S+ +S +SS T+ + S I+ ES ++ + S I+G S
Sbjct: 1051 GLRSVLTAGYGSSLISGRRSSLTAGYGSNQIASHRSSLIAGPESTQITGNRSMLIAGKGS 1110

Query: 1919 TSTSLSTSDSMSGSVSVSTSTSLSDSISGSTSVSDSSSTS 1958
+ T+ S +SG+ SV + I+G+ S + S
Sbjct: 1111 SQTAGYRSTLISGADSVQMAGERGKLIAGADSTQTAGDRS 1150



Score = 50.1 bits (119), Expect = 1e-07
Identities = 240/1078 (22%), Positives = 423/1078 (39%), Gaps = 22/1078 (2%)

Query: 753 MSDSVSTSGSTQQSQSVSTSKADSQSASTSTSGSIVVSTSASTSKSTSVSLSDSVSASKS 812
+D V+ + S + + + +T S S + ++ + S
Sbjct: 109 RADYVACTEMQAGPGSPDVTSEVKVGNRSLPVTDDIDATIESGSTQPTQTIEIATYGSTL 168

Query: 813 LSTSESNSVSSSTSTSLVNSQSVSSSMSDSASKSTSLSDSISNSSSTEKSESLSTSTSDS 872
T +S ++ ST S + S + + S ++ ST+ + S+ +
Sbjct: 169 SGTHQSQLIAGYGSTETAGDSSTLIAGYGSTGTAGADSTLVAGYGSTQTAGEESSQMAGY 228

Query: 873 LRTSTSLSDSLSMSTSGSLSKSQSLSTSISGSSSTSASLSDSTSNAISTSTSLSESASTS 932
T T + S + GS + S+ I+G ST + DS+ A ST ++ S
Sbjct: 229 GSTQTGMKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDL 288

Query: 933 DSISISNSIANSQSA------STSKSDSQSTSISLSTSDSKSMSTSESLSDSTSTSGSVS 986
+ S A + S+ ST + +ST + S + S+ + ST +
Sbjct: 289 TAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGD 348

Query: 987 GSLSIAASQSVSTSTSDSMSTSEIVSDSISTSGSLSASDSKSMSVSSSMSTSQSGSTSES 1046
S IA S T+ DS T+ S + GS + S + + S+ +G S
Sbjct: 349 DSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQ 408

Query: 1047 LSDSQSTSDSDSKSLSQSTSQSGSTSTSTSTSASVRTSESQSTSGSMSASQSDSMSISTS 1106
+ +ST + S + S T+ ST + S + GS + DS +
Sbjct: 409 TAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGY 468

Query: 1107 FSDSTSDSKSASTASSESISQSASTSTSGSVSTSTSLSTSNSERTSTSMSDSTSLSTSES 1166
S T+ S TA S S + S+ + ST + S T+ S T+ + S+
Sbjct: 469 GSTQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTAGYGSTQTAQNESDL 528

Query: 1167 DSISESTSTSDSISEAISASESTFISLSESNSTSDSESQSASAFLSESLSESTSESTSES 1226
+ STST+ + S I+ ST + S T+ S + S+ + S T+ S
Sbjct: 529 ITGYGSTSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQTAREGSDLTAGYGSTGTAGS 588

Query: 1227 VSSSTSESTSLSDSTSESGSTSTSLSNSTSGSTSISTSTSISESTSTFKSESVSTSLSMS 1286
SS + S ++ S T+ S T+ S+ T+ S ST+ S ++ S
Sbjct: 589 DSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGADSSLIAGYGSTQ 648

Query: 1287 TSTSLSDSTSLSTSLSDSTSDSKSDSLSTSMSTSDSISTSKSDSISTSTSLSGSTSESKS 1346
T+ S T+ S + S + S ST+ + S+ + ST T+ S +
Sbjct: 649 TAGYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGY 708

Query: 1347 DSTSMSISMSQSTSGSTSTS----------------TSTSLSDSTSTSLSLSASMNQSGV 1390
ST + S TSG STS T++ S T+ S + QS +
Sbjct: 709 GSTQTAQEGSDLTSGYGSTSTAGADSSLIAGYGSTQTASYHSSLTAGYGSTQTAREQSVL 768

Query: 1391 DSNSASQSASNSTSTSTSESDSQSTSSYTSQSTSQSESTSTSTSLSDSTSISKSTSQSGS 1450
+ S S + + S+ + S T+ Y S T+ ST T+ SD T+ STS +G+
Sbjct: 769 TTGYGSTSTAGADSSLIAGYGSTQTAGYHSILTAGYGSTQTAQERSDLTTGYGSTSTAGA 828

Query: 1451 VSTSASLSGSESESDSQSISTSASESTSESASTSLSDSTSTSNSGSASTSTSLSNSASAS 1510
S+ + GS + SI T+ ST + S + S S + S+ ++ S
Sbjct: 829 DSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYDSSLIAGYGSTQ 888

Query: 1511 ESDLSSTSLSDSTSASMQSSESDSQSTSASLSDSLSTSTSNRMSTIASLSTSVSTSESGS 1570
+ +S + S SD + S S + S+ ++ ST +G
Sbjct: 889 TAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYESSLIAGYGSTQTASFKSTLMAGY 948

Query: 1571 TSESTSESDSTSTSLSDSQSTSRSTSASGSASTSTSTSDSRSTSASTSTSMRTSTSDSQS 1630
S T+ S+ T+ S S + S+ + ST T+ +ST + S +T+ S
Sbjct: 949 GSSQTAREQSSLTAGYGSTSMAGYDSSLIAGYGSTQTAGYQSTLTAGYGSTQTAEHSSTL 1008

Query: 1631 MSLSTSTSTSMSDSTSLSDSVSDSTSDSTSASTSGSMSVSISLSDSTSTSTSASEVMSAS 1690
+ ST+T+ +DS+ ++ S TS S T+G S IS S T+ S ++S
Sbjct: 1009 TAGYGSTATAGADSSLIAGYGSSLTSGIRSFLTAGYGSTLISGLRSVLTAGYGSSLISGR 1068

Query: 1691 ISDSQSMSESVNDSESVSESNSESDSKSMSGSTSVSDSGSLSVSTSLRKSESVSESSSLS 1750
S + S + S + +S ++G+ S+ +G S T+ +S +S + S+
Sbjct: 1069 RSSLTAGYGSNQIASHRSSLIAGPESTQITGNRSMLIAGKGSSQTAGYRSTLISGADSVQ 1128

Query: 1751 CSQSMSDSVSTSDSSSLSVSTSLRSSESVSESDSLSDSKSTSGSTSTSTSGSLSTSTS 1808
+ ++ +DS+ + S + + S + SK T+G+ +G S T+
Sbjct: 1129 MAGERGKLIAGADSTQTAGDRSKLLAGNNSYLTAGDRSKLTAGNDCILMAGDRSKLTA 1186



Score = 49.0 bits (116), Expect = 2e-07
Identities = 209/939 (22%), Positives = 371/939 (39%), Gaps = 2/939 (0%)

Query: 733 TDASGNKTTTTFKYEVTRNSMSDSVSTSGSTQQSQSVSTSKADSQSASTSTSGSIVVSTS 792
T G+ T + T + S ++ GSTQ + ST A S T+ GS + +
Sbjct: 281 TAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGY 340

Query: 793 ASTSKSTSVSLSDSVSASKSLSTSESNSVSSSTSTSLVNSQSVSSSMSDSASKSTSLSDS 852
ST + S + S + +S+ + ST S ++ S + + S
Sbjct: 341 GSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSL 400

Query: 853 ISNSSSTEKSESLSTSTSDSLRTSTSLSDSLSMSTSGSLSKSQSLSTSISGSSSTSASLS 912
I+ ST+ + ST T+ T T+ S + GS + S+ I+G ST +
Sbjct: 401 IAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGE 460

Query: 913 DSTSNAISTSTSLSESASTSDSISISNSIANSQSASTSKSDSQSTSISLSTSDSKSMSTS 972
DS+ A ST ++ S + S S A +S+ + S T+ ST + ST
Sbjct: 461 DSSLTAGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLTAGYGSTQ 520

Query: 973 ESLSDSTSTSGSVSGSLSIAASQSVSTSTSDSMSTSEIVSDSISTSGSLSASDSKSMSVS 1032
+ ++S +G GS S A + S + S T+ S + GS + S +
Sbjct: 521 TAQNESDLITGY--GSTSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQTAREGSDLTA 578

Query: 1033 SSMSTSQSGSTSESLSDSQSTSDSDSKSLSQSTSQSGSTSTSTSTSASVRTSESQSTSGS 1092
ST +GS S ++ ST + S + S T+ S + S S + + S
Sbjct: 579 GYGSTGTAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGADS 638

Query: 1093 MSASQSDSMSISTSFSDSTSDSKSASTASSESISQSASTSTSGSVSTSTSLSTSNSERTS 1152
+ S + S T+ S TA S + STS + + S+ ++ S +T+
Sbjct: 639 SLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGADSSLIAGYGSTQTA 698

Query: 1153 TSMSDSTSLSTSESDSISESTSTSDSISEAISASESTFISLSESNSTSDSESQSASAFLS 1212
S T+ S + S TS S + + ++S+ I+ S T+ S + + S
Sbjct: 699 GYNSILTAGYGSTQTAQEGSDLTSGYGSTSTAGADSSLIAGYGSTQTASYHSSLTAGYGS 758

Query: 1213 ESLSESTSESTSESVSSSTSESTSLSDSTSESGSTSTSLSNSTSGSTSISTSTSISESTS 1272
+ S T+ S+ST+ + S + S T+ S T+G S T+ S+ T+
Sbjct: 759 TQTAREQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYHSILTAGYGSTQTAQERSDLTT 818

Query: 1273 TFKSESVSTSLSMSTSTSLSDSTSLSTSLSDSTSDSKSDSLSTSMSTSDSISTSKSDSIS 1332
+ S S + + S + S T+ S+ + S + S T+ STS + S
Sbjct: 819 GYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYDS 878

Query: 1333 TSTSLSGSTSESKSDSTSMSISMSQSTSGSTSTSTSTSLSDSTSTSLSLSASMNQSGVDS 1392
+ + GST + +S + S T+ S T+ S ST+ S + S +
Sbjct: 879 SLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYESSLIAGYGSTQTA 938

Query: 1393 NSASQSASNSTSTSTSESDSQSTSSYTSQSTSQSESTSTSTSLSDSTSISKSTSQSGSVS 1452
+ S + S+ T+ S T+ Y S S + +S+ + S T+ +ST +G S
Sbjct: 939 SFKSTLMAGYGSSQTAREQSSLTAGYGSTSMAGYDSSLIAGYGSTQTAGYQSTLTAGYGS 998

Query: 1453 TSASLSGSESESDSQSISTSASESTSESASTSLSDSTSTSNSGSASTSTSLSNSASASES 1512
T + S + S +T+ ++S+ + S S S + ST +S S +
Sbjct: 999 TQTAEHSSTLTAGYGSTATAGADSSLIAGYGSSLTSGIRSFLTAGYGSTLISGLRSVLTA 1058

Query: 1513 DLSSTSLSDSTSASMQSSESDSQSTSASLSDSLSTSTSNRMSTIASLSTSVSTSESGSTS 1572
S+ +S S+ S+ ++ S + ST + ++ S+ +G S
Sbjct: 1059 GYGSSLISGRRSSLTAGYGSNQIASHRSSLIAGPESTQITGNRSMLIAGKGSSQTAGYRS 1118

Query: 1573 ESTSESDSTSTSLSDSQSTSRSTSASGSASTSTSTSDSRSTSASTSTSMRTSTSDSQSMS 1632
S +DS + + + + S + S + + S + S T+ +D M+
Sbjct: 1119 TLISGADSVQMAGERGKLIAGADSTQTAGDRSKLLAGNNSYLTAGDRSKLTAGNDCILMA 1178

Query: 1633 LSTSTSTSMSDSTSLSDSVSDSTSDSTSASTSGSMSVSI 1671
S T+ +S + S + S T+G SV I
Sbjct: 1179 GDRSKLTAGINSILTAGCRSKLIGSNGSTLTAGENSVLI 1217


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS14035  ENTEROTOXINA  28     0.006    Heat-labile enterotoxin A chain signature.
		>ENTEROTOXINA#Heat-labile enterotoxin A chain signature.

Length = 258

Score = 28.4 bits (63), Expect = 0.006
Identities = 17/54 (31%), Positives = 27/54 (50%), Gaps = 2/54 (3%)

Query: 30 IELFEHTFGLQKELVKYVGIAEATTAALYSASFINKNISRLASLSTIGILSVAA 83
I L++H G Q V+Y +T+ +L SA ++I L+ ST I +A
Sbjct: 57 INLYDHARGTQTGFVRYDDGYVSTSLSLRSAHLAGQSI--LSGYSTYYIYVIAT 108
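
The counts on the Identities/Positives/Gaps line can be cross-checked against the three alignment rows. The following is a minimal sketch, not part of PredictBias or BLAST; the function name `alignment_stats` is hypothetical. It relies on the usual BLAST convention that a letter on the match line marks an identical residue and `+` marks a conservative substitution. It counts characters on the match line instead of comparing columns, because the column spacing of the match line is not preserved in this report. The percentages are truncated, not rounded, which is what the figures in this report appear to do (for example 14/32 is shown as 43%).

```python
def alignment_stats(query, match_line, sbjct):
    """Recompute identities, positives and gaps for one aligned segment.

    query and sbjct are the aligned strings including '-' gap characters;
    match_line is the row printed between them (spacing may be collapsed).
    """
    if len(query) != len(sbjct):
        raise ValueError("aligned query and subject must have equal length")
    length = len(query)
    # Letters on the match line are identical residues.
    identities = sum(1 for ch in match_line if ch.isalpha())
    # '+' marks a positive-scoring substitution; positives include identities.
    positives = identities + match_line.count("+")
    # Gap characters can appear in either aligned row.
    gaps = query.count("-") + sbjct.count("-")

    def pct(n):
        return 100 * n // length  # truncated percentage

    return {
        "length": length,
        "identities": (identities, pct(identities)),
        "positives": (positives, pct(positives)),
        "gaps": (gaps, pct(gaps)),
    }


# Rows copied from the ENTEROTOXINA hit of SACOL_RS14035 above.
stats = alignment_stats(
    "IELFEHTFGLQKELVKYVGIAEATTAALYSASFINKNISRLASLSTIGILSVAA",
    "I L++H G Q V+Y +T+ +L SA ++I L+ ST I +A",
    "INLYDHARGTQTGFVRYDDGYVSTSLSLRSAHLAGQSI--LSGYSTYYIYVIAT",
)
print(stats)
```

For a segment that spans several Query/Sbjct row groups, the rows of each kind would need to be concatenated before calling the function.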


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS14040  NUCEPIMERASE  27     0.043    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 27.4 bits (61), Expect = 0.043
Identities = 9/32 (28%), Positives = 14/32 (43%)

Query: 23 IPRPIAFVTTLNQDASVNAAPFSFFNIVNNHP 54
IP T + + AP+ +NI N+ P
Sbjct: 234 IPHADTQWTVETGTPAASIAPYRVYNIGNSSP 265
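
The Score/Expect and Identities/Positives lines that precede each alignment have a regular shape, so they can be read into a small record. The following is a minimal sketch under that assumption; `parse_hit_header` and the two regular expressions are hypothetical helpers written for this report's layout, not an API of PredictBias or BLAST. The Gaps field is treated as optional because it is absent when an alignment has no gaps, as in the hit above.

```python
import re

# Patterns written against the two statistics lines shown in this report.
SCORE_RE = re.compile(
    r"Score\s*=\s*([\d.]+)\s*bits\s*\((\d+)\),\s*Expect\s*=\s*([\d.eE+-]+)"
)
COUNT_RE = re.compile(r"(Identities|Positives|Gaps)\s*=\s*(\d+)/(\d+)")


def parse_hit_header(score_line, counts_line):
    """Return bit score, raw score, E-value and the n/total count pairs."""
    m = SCORE_RE.search(score_line)
    if m is None:
        raise ValueError("not a Score/Expect line: %r" % score_line)
    hit = {
        "bits": float(m.group(1)),
        "raw_score": int(m.group(2)),
        "expect": float(m.group(3)),
    }
    # Each count is stored as (matches, alignment length); a missing
    # Gaps field simply leaves no "gaps" key in the record.
    for name, n, total in COUNT_RE.findall(counts_line):
        hit[name.lower()] = (int(n), int(total))
    return hit


# Lines copied from the NUCEPIMERASE hit of SACOL_RS14040 above.
example = parse_hit_header(
    "Score = 27.4 bits (61), Expect = 0.043",
    "Identities = 9/32 (28%), Positives = 14/32 (43%)",
)
print(example)
```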


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
SACOL_RS14050  SACTRNSFRASE  44     4e-08    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 43.8 bits (103), Expect = 4e-08
Identities = 23/101 (22%), Positives = 45/101 (44%), Gaps = 5/101 (4%)

Query: 48 EKNDEVIGYIN--GPVIKERYISDDLFKNVSTNNSEGGYISVLGLVVAPNYQGQGIAGRL 105
E +D + Y+ G Y+ ++ + ++ GY + + VA +Y+ +G+ L
Sbjct: 51 EDDDMDVSYVEEEGKAAFLYYLENNCIGRIKIRSNWNGYALIEDIAVAKDYRKKGVGTAL 110

Query: 106 LNYFETLAKNHHRHGVTLTCRE---SLISFYEKYGYRNEGV 143
L+ AK +H G+ L ++ S FY K+ + V
Sbjct: 111 LHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFIIGAV 151


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
SACOL_RS14070  HTHTETR  68     2e-16    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 67.7 bits (165), Expect = 2e-16
Identities = 16/48 (33%), Positives = 31/48 (64%)

Query: 2 KDKIIDNAITLFSEKGYDGTTLDDIAKSVNIKKASLYYHFDSKKSIYE 49
+ I+D A+ LFS++G T+L +IAK+ + + ++Y+HF K ++
Sbjct: 13 RQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60
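
The four virulence-factor hits listed above all fall below the E-value (RPSBlast) threshold of 0.05 given in the input parameters. The following is a minimal sketch of that kind of filter; the tuple layout, the function name `significant_hits`, and the choice to treat the threshold as inclusive are assumptions made for illustration, not a description of how PredictBias applies it.

```python
# (locus tag, hit name, E-value) copied from the hit tables above.
HITS = [
    ("SACOL_RS14035", "ENTEROTOXINA", 0.006),
    ("SACOL_RS14040", "NUCEPIMERASE", 0.043),
    ("SACOL_RS14050", "SACTRNSFRASE", 4e-08),
    ("SACOL_RS14070", "HTHTETR", 2e-16),
]


def significant_hits(hits, max_evalue=0.05):
    """Keep hits at or below max_evalue, most significant first."""
    kept = [h for h in hits if h[2] <= max_evalue]
    # A smaller E-value means a more significant hit, so sort ascending.
    return sorted(kept, key=lambda h: h[2])


default_pass = significant_hits(HITS)
strict_pass = significant_hits(HITS, max_evalue=0.01)
for locus, name, evalue in strict_pass:
    print(locus, name, evalue)
```

Tightening the threshold to 0.01 in this sketch drops only the NUCEPIMERASE hit (E-value 0.043).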



 
Contact Sachin Pundhir for Bugs/Comments.