PredictBias

Identification of genomic and pathogenicity islands in prokaryotic genomes
 
A) Input parameters
Genome: 2469.gbk
Threshold dinucleotide bias: 2
Threshold codon bias: 4
Threshold %GC bias: 3
E-value (RPSBlast): 0.05
Genome (non-pathogenic):
 
C) Potential GIs and PAIs in NC_008023
S.No  Start             End               Bias  Virulence  Insertion elements  Prediction
1     MGAS2096_Spy0108  MGAS2096_Spy0113  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag          DNBias  CDNBias  %GCBias    Product
MGAS2096_Spy0108  2       14       -1.854238  Hsp33-like chaperonin
MGAS2096_Spy0109  4       16       -2.369322  transcriptional regulator RofA
MGAS2096_Spy0110  4       17       -2.199866  fibronectin-binding protein
MGAS2096_Spy0111  3       16       -3.999860  sortase
MGAS2096_Spy0112  3       16       -4.013564  sortase
MGAS2096_Spy0113  1       15       -4.490641  fibronectin-binding protein
ORFs having significant similarity with Known Virulence factors
LocusTag          Hits          Score  E-value  Comments
MGAS2096_Spy0109  PF08280       809    0.0      M protein trans-acting positive regulator
		>PF08280#M protein trans-acting positive regulator

Length = 530

Score = 809 bits (2091), Expect = 0.0
Identities = 530/530 (100%), Positives = 530/530 (100%)

Query: 1 MISIFSLDRIEIGEYTYQRLIWLSKCRKRGPLSLIEKYLESSIESKCQLVVLFFKTSSLP 60
MISIFSLDRIEIGEYTYQRLIWLSKCRKRGPLSLIEKYLESSIESKCQLVVLFFKTSSLP
Sbjct: 1 MISIFSLDRIEIGEYTYQRLIWLSKCRKRGPLSLIEKYLESSIESKCQLVVLFFKTSSLP 60

Query: 61 ITEVAEKTGLTFLQLNHYCEELNAFFPDSLSMTIQKRMISCQFTHPSKETYLYQLYASSN 120
ITEVAEKTGLTFLQLNHYCEELNAFFPDSLSMTIQKRMISCQFTHPSKETYLYQLYASSN
Sbjct: 61 ITEVAEKTGLTFLQLNHYCEELNAFFPDSLSMTIQKRMISCQFTHPSKETYLYQLYASSN 120

Query: 121 VLQLLAFLIKNGSHSRPLTDFARSHFLSNSSAYRMREALIPLLRNFELKLSKNKIVGEEY 180
VLQLLAFLIKNGSHSRPLTDFARSHFLSNSSAYRMREALIPLLRNFELKLSKNKIVGEEY
Sbjct: 121 VLQLLAFLIKNGSHSRPLTDFARSHFLSNSSAYRMREALIPLLRNFELKLSKNKIVGEEY 180

Query: 181 RIRYLIALLYSKFGIKVYDLTQQDKNIIHSFLSHSSTHLKTSPWLSESFSFYDILLALSW 240
RIRYLIALLYSKFGIKVYDLTQQDKNIIHSFLSHSSTHLKTSPWLSESFSFYDILLALSW
Sbjct: 181 RIRYLIALLYSKFGIKVYDLTQQDKNIIHSFLSHSSTHLKTSPWLSESFSFYDILLALSW 240

Query: 241 KRHQFSVTIPQTRIFQQLKKLFVYDSLKKSSRDIIETYCQLNFSAGDLDYLYLIYITANN 300
KRHQFSVTIPQTRIFQQLKKLFVYDSLKKSSRDIIETYCQLNFSAGDLDYLYLIYITANN
Sbjct: 241 KRHQFSVTIPQTRIFQQLKKLFVYDSLKKSSRDIIETYCQLNFSAGDLDYLYLIYITANN 300

Query: 301 SFASLQWTPEHIRQCCQLFEENDTFRLLLNPIITLLPNLKEQKASLVKALMFFSKSFLFN 360
SFASLQWTPEHIRQCCQLFEENDTFRLLLNPIITLLPNLKEQKASLVKALMFFSKSFLFN
Sbjct: 301 SFASLQWTPEHIRQCCQLFEENDTFRLLLNPIITLLPNLKEQKASLVKALMFFSKSFLFN 360

Query: 361 LQHFIPETNLFVSPYYKGNQKLYTSLKLIVEEWMAKLPGKRYLNHKHFHLFCHYVEQILR 420
LQHFIPETNLFVSPYYKGNQKLYTSLKLIVEEWMAKLPGKRYLNHKHFHLFCHYVEQILR
Sbjct: 361 LQHFIPETNLFVSPYYKGNQKLYTSLKLIVEEWMAKLPGKRYLNHKHFHLFCHYVEQILR 420

Query: 421 NIQPPLVVVFVASNFINAHLLTDSFPRYFSDKSIDFHSYYLLQDNVYQIPDLKPDLVITH 480
NIQPPLVVVFVASNFINAHLLTDSFPRYFSDKSIDFHSYYLLQDNVYQIPDLKPDLVITH
Sbjct: 421 NIQPPLVVVFVASNFINAHLLTDSFPRYFSDKSIDFHSYYLLQDNVYQIPDLKPDLVITH 480

Query: 481 SQLIPFVHHELTKGIAVAEISFDESILSIQELMYQVKEEKFQADLTKQLT 530
SQLIPFVHHELTKGIAVAEISFDESILSIQELMYQVKEEKFQADLTKQLT
Sbjct: 481 SQLIPFVHHELTKGIAVAEISFDESILSIQELMYQVKEEKFQADLTKQLT 530


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits          Score  E-value  Comments
MGAS2096_Spy0110  PF03544       30     0.018    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 30.3 bits (68), Expect = 0.018
Identities = 19/107 (17%), Positives = 27/107 (25%), Gaps = 2/107 (1%)

Query: 279 TSEHNPKTPELDGTPIPEDPKRPDESSEPALPPLMPELDGEEVP--EVPSESLEPALPPL 336
TS H PI P + P PE E P E E + A +
Sbjct: 35 TSVHQVIELPAPAQPISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVI 94

Query: 337 MPELDGEEVPEVPSESLEPALPPLMPELDGEEVPEVPSESLEPALPP 383
+ P + +E + P P + P
Sbjct: 95 EKPKPKPKPKPKPVKKVEQPKRDVKPVESRPASPFENTAPARPTSST 141



Score = 30.3 bits (68), Expect = 0.021
Identities = 17/88 (19%), Positives = 25/88 (28%), Gaps = 7/88 (7%)

Query: 272 DPPKPGDTSEHNPKTPELDGTPIPEDPKRPDESSEPALPPLMPELDGEEVPEVPSESLEP 331
+PP+ PE + PIPE PK E P + P + +E
Sbjct: 61 EPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPK-------PKPKPKPVKKVEQ 113

Query: 332 ALPPLMPELDGEEVPEVPSESLEPALPP 359
+ P P + P
Sbjct: 114 PKRDVKPVESRPASPFENTAPARPTSST 141
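
The statistics line that follows each "Score = ..." line (Identities, Positives and, when present, Gaps) has a regular layout and can be read with a short script. The following is a minimal sketch, assuming the line format shown in this report; the function name `parse_alignment_stats` is hypothetical. The percentages printed in the report match integer truncation (19/107 is shown as 17%), so the sketch uses integer division.

```python
import re

# Matches "Identities = a/n (p%), Positives = b/n (q%)[, Gaps = g/n (r%)]".
_STATS_RE = re.compile(
    r"Identities = (\d+)/(\d+) \(\d+%\), "
    r"Positives = (\d+)/\d+ \(\d+%\)"
    r"(?:, Gaps = (\d+)/\d+ \(\d+%\))?"
)


def parse_alignment_stats(line):
    """Return counts from one statistics line, or None if the line does not match."""
    match = _STATS_RE.search(line)
    if match is None:
        return None
    identities, length, positives = (int(match.group(i)) for i in (1, 2, 3))
    # The Gaps field is absent in this report for ungapped alignments.
    gaps = int(match.group(4)) if match.group(4) else 0
    return {
        "identities": identities,
        "positives": positives,
        "gaps": gaps,
        "length": length,
        # Integer division reproduces the truncated percentage printed in the report.
        "identity_pct": identities * 100 // length,
    }


stats = parse_alignment_stats(
    "Identities = 19/107 (17%), Positives = 27/107 (25%), Gaps = 2/107 (1%)"
)
print(stats["identity_pct"])  # 17
```

Applied to every alignment in the report, this gives a quick way to separate near-identical hits from the short, low-identity matches.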


2     MGAS2096_Spy0141  MGAS2096_Spy0157  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag          DNBias  CDNBias  %GCBias    Product
MGAS2096_Spy0141  2       16       1.026119   adenylosuccinate synthetase
MGAS2096_Spy0142  0       16       0.083023   adenylosuccinate synthetase
MGAS2096_Spy0143  -1      15       -1.475142  nucleoside-binding protein
MGAS2096_Spy0144  -3      14       -1.599318  hypothetical protein
MGAS2096_Spy0145  -2      15       -1.521792  transcription antitermination protein NusG
MGAS2096_Spy0146  -1      13       -1.743103  NAD glycohydrolase
MGAS2096_Spy0147  0       13       -0.651208  hypothetical protein
MGAS2096_Spy0148  0       12       0.475468   streptolysin O
MGAS2096_Spy0149  1       22       5.244465   hypothetical protein
MGAS2096_Spy0150  1       20       4.451595   hypothetical protein
MGAS2096_Spy0151  -1      22       4.205608   hypothetical protein
MGAS2096_Spy0152  -1      21       4.137362   cystathionine beta-lyase
MGAS2096_Spy0153  -1      25       4.422936   cystathionine beta-lyase
MGAS2096_Spy0154  -1      26       4.807429   leucyl-tRNA synthetase
MGAS2096_Spy0155  -1      18       3.084265   transposase
MGAS2096_Spy0156  1       23       4.510849   PTS system ascorbate-specific transporter
MGAS2096_Spy0157  0       21       4.922796   PTS system 3-keto-L-gulonate specific
ORFs having significant similarity with Known Virulence factors
LocusTag          Hits          Score  E-value  Comments
MGAS2096_Spy0143  LIPPROTEIN48  59     9e-12    Mycoplasma P48 major surface lipoprotein signature.
		>LIPPROTEIN48#Mycoplasma P48 major surface lipoprotein signature.

Length = 428

Score = 58.9 bits (142), Expect = 9e-12
Identities = 70/291 (24%), Positives = 110/291 (37%), Gaps = 49/291 (16%)

Query: 27 KQTTDNSLKIAMITNQTGIDDKSFNQSAWEGLQAWGKENKLEKGKGYDYFQSANESEFTT 86
K LK +IT++ IDDKSFNQSA+E L+ + K G + S F +
Sbjct: 55 KNAELLKLKPVLITDEGKIDDKSFNQSAFEALK------AINKQTGIEINNVEPSSNFES 108

Query: 87 NLESAVTNGYNLVFGIGFPLHDAVEKVAANN----PDNHFAIVD---DVIKGQKNVASIT 139
SA++ G+ + GF ++++ + N I+ D+ K S+
Sbjct: 109 AYNSALSAGHKIWVLNGFKHQQSIKQYIDAHREELERNQIKIIGIDFDIETEYKWFYSLQ 168

Query: 140 FSDHEAAYLAGVAAAKTTKTKQ-----VGFVGGMEGDVVKRFEKGFEAGVKSVDDTIKVR 194
F+ E+A+ G A A + V GG V F +GF G+ + K
Sbjct: 169 FNIKESAFTTGYAIASWLSEQDESKRVVASFGGGAFPGVTTFNEGFAKGILYYNQKHKSS 228

Query: 195 VAYAGS-------FADAARGKTI-------AAAQYAEGADVIYHAAGGTGAGVFSEAKSI 240
Y S F + T+ A VI A G F +
Sbjct: 229 KIYHTSPVKLDSGFTAGEKMNTVINNVLSSTPADVKYNPHVILSVA---GPATFETVRLA 285

Query: 241 NEKRKEEDKVWVIGVDRDQSEDGKYTTKDGKSANFVLTSSIKEVGKALVKV 291
N+ + +VIGVD DQ +D +LTS +K + +A+ +
Sbjct: 286 NKGQ------YVIGVDSDQG-----MIQDKDR---ILTSVLKHIKQAVYET 322


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits          Score  E-value  Comments
MGAS2096_Spy0148  TACYTOLYSIN   893    0.0      Bacterial thiol-activated pore-forming cytolysin sig...
		>TACYTOLYSIN#Bacterial thiol-activated pore-forming cytolysin signature.

Length = 574

Score = 893 bits (2309), Expect = 0.0
Identities = 568/574 (98%), Positives = 571/574 (99%)

Query: 1 MKDMSNKKTFKKYSRVAGLLTAALIIGNLVTANAESNKQNTASTETTTTNEQPKPESSEL 60
MKDMSNKK FKKYSRVAGLLTAALI+GNLVTANA+SNKQNTA+TETTTTNEQPKPESSEL
Sbjct: 1 MKDMSNKKIFKKYSRVAGLLTAALIVGNLVTANADSNKQNTANTETTTTNEQPKPESSEL 60

Query: 61 TTEKAGQKMDDMLNSNDMIKLAPKEMPLESAEKEEKKSEDKKKSEEDHTEEINDKIYSLN 120
TTEKAGQKMDDMLNSNDMIKLAPKEMPLESAEKEEKKSED KKSEEDHTEEINDKIYSLN
Sbjct: 61 TTEKAGQKMDDMLNSNDMIKLAPKEMPLESAEKEEKKSEDNKKSEEDHTEEINDKIYSLN 120

Query: 121 YNELEVLAKNGETIENFVPKEGVKKADKFIVIERKKKNINTTPVDISIIDSVTDRTYPAA 180
YNELEVLAKNGETIENFVPKEGVKKADKFIVIERKKKNINTTPVDISIIDSVTDRTYPAA
Sbjct: 121 YNELEVLAKNGETIENFVPKEGVKKADKFIVIERKKKNINTTPVDISIIDSVTDRTYPAA 180

Query: 181 LQLANKGFTENKPDAVVTKRNPQKIHIDLPGMGDKATVEVNDPTYANVSTAIDNLVNQWH 240
LQLANKGFTENKPDAVVTKRNPQKIHIDLPGMGDKATVEVNDPTYANVSTAIDNLVNQWH
Sbjct: 181 LQLANKGFTENKPDAVVTKRNPQKIHIDLPGMGDKATVEVNDPTYANVSTAIDNLVNQWH 240

Query: 241 DNYSGGNTLPARTQYTESMVYSKSQIEAALNVNSKILDGTLGIDFKSISKGEKKVMIAAY 300
DNYSGGNTLPARTQYTESMVYSKSQIEAALNVNSKILDGTLGIDFKSISKGEKKVMIAAY
Sbjct: 241 DNYSGGNTLPARTQYTESMVYSKSQIEAALNVNSKILDGTLGIDFKSISKGEKKVMIAAY 300

Query: 301 KQIFYTVSANLPNNPADVFDKSVTFKELQRKGVSNEAPPLFVSNVAYGRTVFVKLETSSK 360
KQIFYTVSANLPNNPADVFDKSVT KELQRKGVSNEAPPLFVSNVAYGRTVFVKLETSSK
Sbjct: 301 KQIFYTVSANLPNNPADVFDKSVTLKELQRKGVSNEAPPLFVSNVAYGRTVFVKLETSSK 360

Query: 361 SNDVEAAFSAALKGTDVKTNGKYSDILENSSFTAVVLGGDAAEHNKVVTKDFDVIRNVIK 420
SNDVEAAFSAALKGTDVKTNGKYSDILENSSFTAVVLGGDAAEHNKVVTKDFDVIRNVIK
Sbjct: 361 SNDVEAAFSAALKGTDVKTNGKYSDILENSSFTAVVLGGDAAEHNKVVTKDFDVIRNVIK 420

Query: 421 DNATFSRKNPAYPISYTSVFLKNNKIAGVNNRSEYVETTSTEYTSGKINLSHQGAYVAQY 480
DNATFSRKNPAYPISYTSVFLKNNKIAGVNNRSEYVETTSTEYTSGKINLSHQGAYVAQY
Sbjct: 421 DNATFSRKNPAYPISYTSVFLKNNKIAGVNNRSEYVETTSTEYTSGKINLSHQGAYVAQY 480

Query: 481 EILWDEINYDDKGKEVITKRRWDNNWYSKTSPFSTVIPLGANSRNIRIMARECTGLAWEW 540
EILWDEINYDDKGKEVITKRRWDNNWYSKTSPFSTVIPLGANSRNIRIMARECTGLAWEW
Sbjct: 481 EILWDEINYDDKGKEVITKRRWDNNWYSKTSPFSTVIPLGANSRNIRIMARECTGLAWEW 540

Query: 541 WRKVIDERDVKLSKEINVNISGSTLSPYGSITYK 574
WRKVIDERDVKLSKEINVNISGSTLSPYGSITYK
Sbjct: 541 WRKVIDERDVKLSKEINVNISGSTLSPYGSITYK 574


3     MGAS2096_Spy0166  MGAS2096_Spy0175  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag          DNBias  CDNBias  %GCBias    Product
MGAS2096_Spy0166  -1      18       3.088320   glycine betaine transport ATP-binding protein
MGAS2096_Spy0167  -1      18       2.772933   glycine betaine transport ATP-binding protein
MGAS2096_Spy0168  -1      19       2.308703   glycine betaine-binding protein / glycine
MGAS2096_Spy0169  -2      13       0.056517   glycine betaine-binding protein / glycine
MGAS2096_Spy0170  -1      15       -0.783141  DNA polymerase I
MGAS2096_Spy0171  1       18       -3.548089  CoA binding protein
MGAS2096_Spy0172  2       19       -4.625083  ferric uptake regulation protein
MGAS2096_Spy0173  3       20       -4.528150  transcriptional regulatory protein
MGAS2096_Spy0174  3       21       -4.649693  Co-activator of prophage gene expression IbrA
MGAS2096_Spy0175  2       22       0.313697   Co-activator of prophage gene expression IbrB
ORFs having significant similarity with Known Virulence factors
LocusTag          Hits          Score  E-value  Comments
MGAS2096_Spy0173  PF05043       77     3e-20    Transcriptional activator
		>PF05043#Transcriptional activator

Length = 493

Score = 76.9 bits (189), Expect = 3e-20
Identities = 21/103 (20%), Positives = 46/103 (44%), Gaps = 7/103 (6%)

Query: 2 FLTWENLFLT-NQSRRRLKLLVV----ERSYNNVGNFLKEYFGGFFEIINFDDLRLTMVD 56
++L + Q++ +LK+LV+ + V L Y FE+ + +L L+
Sbjct: 386 ITHTKHLVINLLQNQPKLKVLVMSNFDQYHAKFVAETLSYYCSNNFELEVWTELELSKES 445

Query: 57 MVSLSSEYDVIVTDMILEQTMDSEILFFNQMAPSVVANRLTDM 99
+ S YD+I+++ I+ + +++ N + + L M
Sbjct: 446 LE--DSPYDIIISNFIIPPIENKRLIYSNNINTVSLIYLLNAM 486


4     MGAS2096_Spy0323  MGAS2096_Spy0328  Y     N          N                   Genomic Island
LocusTag          DNBias  CDNBias  %GCBias    Product
MGAS2096_Spy0323  0       20       -3.472113  hypothetical protein
MGAS2096_Spy0324  0       18       -3.102626  site-specific tyrosine recombinase XerD
MGAS2096_Spy0325  0       18       -3.248059  segregation and condensation protein A
MGAS2096_Spy0326  0       19       -3.328441  segregation and condensation protein B
MGAS2096_Spy0327  0       19       -3.029255  ribosomal large subunit pseudouridine synthase
MGAS2096_Spy0328  2       20       -2.880422  hypothetical protein
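
Across the islands listed in this report, the Prediction column tracks the Bias and Virulence flags: islands with both flags set to Y are labelled "Pathogenicity Island (biased-composition)", and islands with Bias Y and Virulence N are labelled "Genomic Island". A minimal sketch of that observed mapping follows. The function name `classify_island` is hypothetical, and the rule is inferred from the rows in this report, not taken from the PredictBias documentation. Other flag combinations do not appear here, so the sketch returns `None` for them.

```python
def classify_island(bias, virulence):
    """Map the Bias/Virulence flags to the Prediction label seen in this report.

    Assumption: the rule is inferred from the listed islands only. The
    Insertion elements flag is ignored because it does not change the
    label in any row shown.
    """
    if bias and virulence:
        return "Pathogenicity Island (biased-composition)"
    if bias:
        return "Genomic Island"
    return None  # combination not observed in this report


print(classify_island(bias=True, virulence=False))  # Genomic Island
```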
5     MGAS2096_Spy0351  MGAS2096_Spy0372  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag          DNBias  CDNBias  %GCBias    Product
MGAS2096_Spy0351  2       17       -0.676539  DNA polymerase III subunit delta'
MGAS2096_Spy0352  1       18       0.983826   tpl protein
MGAS2096_Spy0353  1       19       1.426939   putative cytoplasmic protein
MGAS2096_Spy0354  -1      18       2.072342   DNA replication intiation control protein YabA
MGAS2096_Spy0355  -1      16       3.026468   tetrapyrrole (Corrin/Porphyrin) methylase family
MGAS2096_Spy0356  -1      16       2.585198   hypothetical protein
MGAS2096_Spy0357  -1      17       2.850399   copper homeostasis protein cutC
MGAS2096_Spy0358  -1      16       2.381927   arsenate reductase family protein
MGAS2096_Spy0359  -1      17       2.919678   exodeoxyribonuclease III
MGAS2096_Spy0360  -1      18       2.816483   L-lactate oxidase
MGAS2096_Spy0361  0       20       2.907155   interleukin-8 protease
MGAS2096_Spy0362  0       22       3.630399   hypothetical protein
MGAS2096_Spy0363  -1      24       4.479367   hypothetical protein
MGAS2096_Spy0364  -1      26       5.098904   methionyl-tRNA synthetase
MGAS2096_Spy0365  0       32       6.216177   hypothetical protein
MGAS2096_Spy0366  -1      29       5.836060   ribonucleotide-diphosphate reductase subunit
MGAS2096_Spy0367  -1      30       5.834793   ribonucleotide reductase stimulatory protein
MGAS2096_Spy0368  -1      27       5.229872   ribonucleotide-diphosphate reductase subunit
MGAS2096_Spy0369  1       30       4.567706   hypothetical protein
MGAS2096_Spy0370  2       28       2.529163   C3 family ADP-ribosyltransferase
MGAS2096_Spy0371  4       26       -3.541726  hypothetical protein
MGAS2096_Spy0372  1       25       -3.706635  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag          Hits          Score  E-value  Comments
MGAS2096_Spy0361  SUBTILISIN    93     5e-22    Subtilisin serine protease family (S8) signature.
		>SUBTILISIN#Subtilisin serine protease family (S8) signature.

Length = 326

Score = 92.6 bits (230), Expect = 5e-22
Identities = 42/160 (26%), Positives = 64/160 (40%), Gaps = 24/160 (15%)

Query: 264 DIDWTQTDDDTKYESHGMHVTGIVAGNSKEAAATGERFLGIAPEAQVMFMRVFANDVMGS 323
+ D D HG HV G +A +G+APEA ++ ++V G
Sbjct: 74 EGDPEIFKDY---NGHGTHVAGTIAAT-----ENENGVVGVAPEADLLIIKVLNKQGSGQ 125

Query: 324 AESLFIKAIEDAVALGADVINLSLGTANGAQLSGSKPLMEAIEKAKKAGVSVVVAAGNER 383
+ + I+ I A+ D+I++SLG L EA++KA + + V+ AAGNE
Sbjct: 126 YDWI-IQGIYYAIEQKVDIISMSLGGP-----EDVPELHEAVKKAVASQILVMCAAGNEG 179

Query: 384 VYGSDHDDPLATNPDYGLVGSPSTGRTPTSVAAINSKWVI 423
D+ +G P SV AIN
Sbjct: 180 DGDDRTDE----------LGYPGCYNEVISVGAINFDRHA 209



Score = 79.1 bits (195), Expect = 2e-17
Identities = 36/147 (24%), Positives = 58/147 (39%), Gaps = 18/147 (12%)

Query: 561 FDSVVSKAPSQKGNEMNHFSNWGLTSDGYLKPDITAPGGDIYSTYNDNHYGSQTGTSMAS 620
++ V+S + FSN + D+ APG DI ST Y + +GTSMA+
Sbjct: 194 YNEVISVGAINFDRHASEFSNSNN------EVDLVAPGEDILSTVPGGKYATFSGTSMAT 247

Query: 621 PQIAGASLLVKQ-YLEKTQPNLPKEKIADIVKNLLMSNAQIHVNPETKTTTSPRQQGAGL 679
P +AGA L+KQ + +L + L+ SP+ +G GL
Sbjct: 248 PHVAGALALIKQLANASFERDL----TEPELYAQLIKRT-------IPLGNSPKMEGNGL 296

Query: 680 LNIDGAVTSGLYVTGKDNYGSISLGNI 706
L + + G +S ++
Sbjct: 297 LYLTAVEELSRIFDTQRVAGILSTASL 323



Score = 40.6 bits (95), Expect = 4e-05
Identities = 11/34 (32%), Positives = 18/34 (52%), Gaps = 1/34 (2%)

Query: 127 HDWVKTKGAWDKGYKGQGKVVAVIDTGIDPAHQS 160
+ ++ W++ G+G VAV+DTG D H
Sbjct: 26 VEMIQAPAVWNQTR-GRGVKVAVLDTGCDADHPD 58


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits          Score  E-value  Comments
MGAS2096_Spy0370  BINARYTOXINA  38     3e-05    Clostridial binary toxin A signature.
		>BINARYTOXINA#Clostridial binary toxin A signature.

Length = 454

Score = 37.7 bits (87), Expect = 3e-05
Identities = 42/170 (24%), Positives = 70/170 (41%), Gaps = 27/170 (15%)

Query: 88 INTSLDKTKGELSQLTPELRDQVAQLDAATHRLVIPWNIVVYRYVYETFLRDIGVSHADL 147
IN L + G L+ PEL +V ++ A IP N++VYR G L
Sbjct: 295 INNYL-ISNGPLNNPNPELDSKVNNIENALKLTPIPSNLIVYRRS--------GPQEFGL 345

Query: 148 TSYYRNHQFDPHILCKIK---------LGTRYTKHSFMSTT--ALKNGAMTHRPVEVRIC 196
T + F+ KI+ G T +F+ST+ ++ A R + +RI
Sbjct: 346 TLTSPEYDFN-----KIENIDAFKEKWEGKVITYPNFISTSIGSVNMSAFAKRKIILRIN 400

Query: 197 VKKGAKAAFVEPYSAVPSEVELLFPRGCQLEV--VGAYVSQDHKKLHIEA 244
+ K + A++ E E+L G + ++ V +Y KL ++A
Sbjct: 401 IPKDSPGAYLSAIPGYAGEYEVLLNHGSKFKINKVDSYKDGTVTKLILDA 450


6     MGAS2096_Spy0393  MGAS2096_Spy0402  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag          DNBias  CDNBias  %GCBias    Product
MGAS2096_Spy0393  5       28       1.523375   cell division protein ftsK
MGAS2096_Spy0394  3       22       1.495229   integral membrane protein
MGAS2096_Spy0395  3       22       2.381860   50S ribosomal protein L11
MGAS2096_Spy0396  3       19       2.000798   50S ribosomal protein L1
MGAS2096_Spy0397  0       17       3.000459   uridylate kinase
MGAS2096_Spy0398  -1      18       2.961566   ribosome recycling factor
MGAS2096_Spy0399  -1      15       2.769734   S1 RNA-binding domain-containing protein
MGAS2096_Spy0400  1       16       2.968992   methionine sulfoxide reductase A
MGAS2096_Spy0401  1       16       2.973667   hypothetical protein
MGAS2096_Spy0402  0       16       3.295592   surface antigen
ORFs having significant similarity with Known Virulence factors
LocusTag          Hits          Score  E-value  Comments
MGAS2096_Spy0402  IGASERPTASE   34     8e-04    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 34.3 bits (78), Expect = 8e-04
Identities = 41/209 (19%), Positives = 66/209 (31%), Gaps = 26/209 (12%)

Query: 89 QATTLTVQAPASSPASVSHVPSSEPLPQASATSQSTVPMAP------SATPSDVPTTPLA 142
QA +V + A V P P P + + TV T A
Sbjct: 1004 QADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTA 1063

Query: 143 SAKPDSFVTASS-ELTSSTNDVSTELSSESQKQPEVPQEAVPTPKAAETTEVEPKTDISE 201
+ + S+ + + TN+V+ S + Q +E T + E +VE E
Sbjct: 1064 QNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKET-ATVEKEEKAKVE-TEKTQE 1121

Query: 202 DPTSANRPVPNESASEEVSSAAPAQAPAE--------KEETSAPAAQKAVADTTS----- 248
P ++ P + SE V A + + +T+ A + A TS
Sbjct: 1122 VPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQ 1181

Query: 249 ----VATSNGLSYAPNHAYNPMNAGLQPQ 273
T N + + N A QP
Sbjct: 1182 PVTESTTVNTGNSVVENPENTTPATTQPT 1210



Score = 32.7 bits (74), Expect = 0.003
Identities = 29/168 (17%), Positives = 49/168 (29%), Gaps = 10/168 (5%)

Query: 89 QATTLTVQAPASSPASVSHVPSSEPLPQASATSQSTVPMAPSATPSDVPTTPLASAKPDS 148
+ +T Q S + P +EP + T P + + T +D P +
Sbjct: 1121 EVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADT-EQPAKETSSNV 1179

Query: 149 FVTASSELTSSTNDVSTELSSESQKQPEVPQEAVPTPKAAETTEVEPKTDISEDPTSANR 208
+ T +T + E PE A P + +PK S
Sbjct: 1180 EQPVTESTTVNTGNSVVE-------NPENTTPATTQPTVNSESSNKPKNRHRRSVRS--V 1230

Query: 209 PVPNESASEEVSSAAPAQAPAEKEETSAPAAQKAVADTTSVATSNGLS 256
P E A+ + + + A A VA + G +
Sbjct: 1231 PHNVEPATTSSNDRSTVALCDLTSTNTNAVLSDARAKAQFVALNVGKA 1278


7     MGAS2096_Spy0413  MGAS2096_Spy0422  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag          DNBias  CDNBias  %GCBias    Product
MGAS2096_Spy0413  3       26       -3.153117  transposase
MGAS2096_Spy0414  6       30       -3.884205  bacteriocin
MGAS2096_Spy0415  4       22       -1.538124  hypothetical protein
MGAS2096_Spy0416  3       21       -1.162179  hypothetical protein
MGAS2096_Spy0417  4       21       -1.888592  hypothetical protein
MGAS2096_Spy0418  2       20       -1.059855  hypothetical protein
MGAS2096_Spy0419  2       21       0.468621   hypothetical protein
MGAS2096_Spy0420  4       19       0.642661   hypothetical protein
MGAS2096_Spy0421  -1      20       -1.835713  hypothetical protein
MGAS2096_Spy0422  2       20       -0.780930  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag          Hits          Score  E-value  Comments
MGAS2096_Spy0415  PF05844       27     0.005    YopD protein
		>PF05844#YopD protein

Length = 295

Score = 26.9 bits (59), Expect = 0.005
Identities = 15/39 (38%), Positives = 20/39 (51%), Gaps = 2/39 (5%)

Query: 12 MASISGGNAPGDAVIGGLGGLASG--LKFCKLLHPVLAG 48
MA I+G A AV+G LG L +G + K L + G
Sbjct: 123 MAVIAGVGALASAVVGSLGALKNGKAISQEKTLQKNIDG 161


8     MGAS2096_Spy0447  MGAS2096_Spy0480  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag          DNBias  CDNBias  %GCBias    Product
MGAS2096_Spy0447  2       17       0.702152   daunorubicin resistance ATP-binding protein
MGAS2096_Spy0448  3       15       0.267241   daunorubicin resistance transmembrane protein
MGAS2096_Spy0449  1       13       0.460883   ABC transporter permease
MGAS2096_Spy0450  -1      11       -0.002633  dihydroxyacetone kinase
MGAS2096_Spy0451  -1      11       -0.161903  Acetyl-CoA acetyltransferase
MGAS2096_Spy0452  -2      11       -1.354223  long-chain-fatty-acid--CoA ligase
MGAS2096_Spy0453  1       13       -1.816139  putative cytoplasmic protein
MGAS2096_Spy0454  0       12       -2.422085  two-component response regulator VicR
MGAS2096_Spy0455  0       14       -3.661127  two-component sensor histidine kinase VicK
MGAS2096_Spy0456  1       16       -5.084992  Zn-dependent hydrolase
MGAS2096_Spy0457  1       18       -5.472658  ribonuclease III
MGAS2096_Spy0458  1       19       -6.183377  chromosome partition protein smc
MGAS2096_Spy0459  3       25       -9.196133  transcriptional regulator
MGAS2096_Spy0460  4       30       -9.686595  shikimate 5-dehydrogenase
MGAS2096_Spy0461  3       28       -9.610419  putative cytoplasmic protein
MGAS2096_Spy0462  4       27       -9.210017  hypothetical protein
MGAS2096_Spy0463  4       26       -9.685323  hypothetical protein
MGAS2096_Spy0464  2       27       -9.767196  S-adenosylmethionine synthetase
MGAS2096_Spy0465  3       26       -9.769417  hypothetical protein
MGAS2096_Spy0466  2       22       -8.813755  cell wall biosynthesis glycosyltransferase
MGAS2096_Spy0467  4       20       -6.995813  hypothetical protein
MGAS2096_Spy0468  3       19       -6.621752  UDP-glucose 6-dehydrogenase
MGAS2096_Spy0469  3       17       -4.352948  macrolide-efflux protein
MGAS2096_Spy0470  5       21       -2.397402  transcriptional regulator
MGAS2096_Spy0471  5       21       -1.846436  chromosome segregation ATPases
MGAS2096_Spy0472  3       21       -1.851782  chromosome segregation ATPases
MGAS2096_Spy0473  2       20       -3.670345  hypothetical protein
MGAS2096_Spy0474  1       19       -4.092648  hypothetical protein
MGAS2096_Spy0475  2       20       -5.575076  plasmid stabilization system antitoxin protein
MGAS2096_Spy0476  3       22       -6.373247  plasmid stabilization system toxin protein
MGAS2096_Spy0477  1       25       -7.099428  putative cytoplasmic protein
MGAS2096_Spy0478  -1      18       -5.337129  hypothetical protein
MGAS2096_Spy0479  0       19       -1.180422  transposase
MGAS2096_Spy0481  1       14       1.776009   transposase
MGAS2096_Spy0480  2       16       1.557924   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag          Hits          Score  E-value  Comments
MGAS2096_Spy0454  HTHFIS        92     1e-23    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 91.8 bits (228), Expect = 1e-23
Identities = 29/133 (21%), Positives = 65/133 (48%), Gaps = 1/133 (0%)

Query: 3 KILIVDDEKPISDIIKFNLTKEGYDIVTAFDGREAVTIFEEEKPDLIILDLMLPELDGLE 62
IL+ DD+ I ++ L++ GYD+ + DL++ D+++P+ + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 63 VAKEIRKT-SHVPIIMLSAKDSEFDKVIGLEIGADDYVTKPFSNRELLARVKAHLRRTET 121
+ I+K +P++++SA+++ + E GA DY+ KPF EL+ + L +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 122 IETAVAEENASSG 134
+ + +++
Sbjct: 125 RPSKLEDDSQDGM 137


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits          Score  E-value  Comments
MGAS2096_Spy0455  PF06580       44     5e-07    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 44.5 bits (105), Expect = 5e-07
Identities = 30/187 (16%), Positives = 72/187 (38%), Gaps = 34/187 (18%)

Query: 253 DETNRMMRMISDLL--NLSRIDNQVTQLAVEMTNFTAFITSILNRFDLVKNQHTGTGKVY 310
+ M+ +S+L+ +L + + LA E+T +++ +F +++ ++
Sbjct: 191 TKAREMLTSLSELMRYSLRYSNARQVSLADELTVVDSYLQLASIQF---EDRLQFENQIN 247

Query: 311 EIVRDYPITSVWLEIDNDKMTQVIENILNNAIKYSPDGGKITVRMKTTDTQLIISISDQG 370
+ D + + ++ ++EN + + I P GGKI ++ + + + + + G
Sbjct: 248 PAIMDVQVPPMLVQT-------LVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTG 300

Query: 371 LGIPKTDLPLIFDRFYRVDKARSRAQGGTGLGLAIAKEIIKQHHGF---IWAKSDYGKGS 427
K + TG GL +E ++ +G I GK
Sbjct: 301 SLALKNT------------------KESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKV- 341

Query: 428 TFTIVLP 434
+++P
Sbjct: 342 NAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits          Score  E-value  Comments
MGAS2096_Spy0458  GPOSANCHOR    48     2e-07    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 47.8 bits (113), Expect = 2e-07
Identities = 48/313 (15%), Positives = 95/313 (30%), Gaps = 10/313 (3%)

Query: 209 AKVAKQFLELDANRKQLQLDILVKDIDIAQERQTKDTEALAALQQDLASYYAKRQSMEED 268
+ VA + + Q + D + + + + + + AL+ + + +E
Sbjct: 41 SAVATRSQTDTLEKVQERADKFEIENNTLKLKNSDLSFNNKALKDHNDELTEELSNAKEK 100

Query: 269 YQKFKQKKQVISQESDQTQTTLLELTKLIADLEKQIELVKLESGQ---EAEKKAEAKKHL 325
+K + + + + + +L K + + E A K L
Sbjct: 101 LRKNDKSLSEKASKIQELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALAARKADL 160

Query: 326 EQLQEQLDGFQAEEKQRTEQLLHIDQQLCDVKQQLNELSNALERFSSDPDQLMETLREEF 385
E+ E F + + + L L + +L + FS+ ++TL E
Sbjct: 161 EKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEK 220

Query: 386 VLLMQKEAALSNQLTALKAHLDKEKQARQHKAQEYQLLVTKLDQLNDESQKAQAHYKAQK 445
L ++A L L + + E L + +L + A A
Sbjct: 221 AALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADS 280

Query: 446 EQVEMLLQNYQEGDKRVQELERDYQLKQERLFDLLDQ-------KKGKEARKASLESIQK 498
+++ L + +LE Q+ L KK EA LE K
Sbjct: 281 AKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEAEHQKLEEQNK 340

Query: 499 SHSQFYAGVRAVL 511
+R L
Sbjct: 341 ISEASRQSLRRDL 353



Score = 30.4 bits (68), Expect = 0.047
Identities = 30/163 (18%), Positives = 54/163 (33%), Gaps = 8/163 (4%)

Query: 676 ELEQISEELTRLVEQLKITEKEVAALQSDLIAKKEELTQLKLAGDQARLAEQRAQMAYQQ 735
LE L L+ + + AK + L K A E R +
Sbjct: 145 TLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAA------LEARQAELEKA 198

Query: 736 LQEKQEDSKALLAALDQSQTTHSDESLLAEQARIEEALTAIAKKKNALTCDIDDIKENKD 795
L+ S A A + + +L A +A +E+AL A + I ++ K
Sbjct: 199 LEGAMNFSTADSAKIKTLE--AEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKA 256

Query: 796 LIRQKTQNIHQALSQARLQERDLLNEKKFEQANQSRLRTQLKQ 838
+ + + +AL A + K +A ++ L +
Sbjct: 257 ALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKAD 299


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits          Score  E-value  Comments
MGAS2096_Spy0469  TCRTETA       38     4e-05    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 38.3 bits (89), Expect = 4e-05
Identities = 28/141 (19%), Positives = 59/141 (41%), Gaps = 13/141 (9%)

Query: 52 SVIGVLFNLFGGVIADSFKR----KKIIITTNILCGTACLVLSFLTKEQWLVYAIVLTNV 107
+ G+L +L +I ++ ++ I GT ++L+F T W+ + I+ V
Sbjct: 253 AAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFAT-RGWMAFPIM---V 308

Query: 108 ILAFMSAFSSPSYKAFTKEIVKKDSISQLNSLLETTSTVIKVTVPMVAIFLYKLLGIHGV 167
+LA P+ +A V ++ QL L +++ + P++ +Y +
Sbjct: 309 LLAS-GGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYA----ASI 363

Query: 168 LLLDGLSFLIAALLISFILPV 188
+G +++ A L LP
Sbjct: 364 TTWNGWAWIAGAALYLLCLPA 384


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits          Score  E-value  Comments
MGAS2096_Spy0471  PF05043       27     0.037    Transcriptional activator
		>PF05043#Transcriptional activator

Length = 493

Score = 26.8 bits (59), Expect = 0.037
Identities = 10/83 (12%), Positives = 29/83 (34%), Gaps = 10/83 (12%)

Query: 71 YLTNLPALAHDSLLLSN----VSYQAT-----EALLKLYDQSRSLNKQVFLAFDKASSYS 121
L+++ + D + S+ + S + F+ F++
Sbjct: 45 DLSHVKSAFPDLIFHSSTNGIRIINTDDSDIEMVYHHFFKHSTHFSILEFIFFNEGCQAE 104

Query: 122 PDANQL-LSENTVLRLSSNGNEL 143
+ +S +++ R+ S N++
Sbjct: 105 SICKEFYISSSSLYRIISQINKV 127


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits          Score  E-value  Comments
MGAS2096_Spy0472  GPOSANCHOR    34     0.001    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 33.9 bits (77), Expect = 0.001
Identities = 46/235 (19%), Positives = 86/235 (36%), Gaps = 34/235 (14%)

Query: 174 DNIARYKERLKDKSDQLTTFRNARKYTF------ISNLVGGKKQFEANVSEIKRLEYDLA 227
+ K L + L I L K EA +E+++
Sbjct: 214 KTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAM 273

Query: 228 HLQDTHQDKIDSDDIEKNQQKLQLRNTKLELESSLRDKQRRLKLLDIS------------ 275
+ KI + + EK + + + + + + ++Q + LD S
Sbjct: 274 NFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEAEHQ 333

Query: 276 -IEFGLYPTESDLTELQQYFPDTNLKKLYEVEAYHKKLETIL------------DSEFST 322
+E +E+ L++ D + + ++EA H+KLE D + S
Sbjct: 334 KLEEQNKISEASRQSLRRDL-DASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASR 392

Query: 323 E-RESLIAEIDELESQLTTLNQELQELGNIPNLS-SEYLENYSKLTATINALKEQ 375
E ++ + ++E S+L L + +EL L+ E E +KL A ALKE+
Sbjct: 393 EAKKQVEKALEEANSKLAALEKLNKELEESKKLTEKEKAELQAKLEAEAKALKEK 447


9     MGAS2096_Spy0551  MGAS2096_Spy0559  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag          DNBias  CDNBias  %GCBias    Product
MGAS2096_Spy0551  -2      15       -3.131616  hypothetical protein
MGAS2096_Spy0552  0       22       -4.741855  putative cytoplasmic protein
MGAS2096_Spy0553  2       28       -6.477720  site-specific recombinase
MGAS2096_Spy0554  3       32       -6.080400  phage protein
MGAS2096_Spy0555  3       24       -3.618255  phage transcriptional repressor
MGAS2096_Spy0556  3       21       0.101953   Cro family transcriptional regulator
MGAS2096_Spy0557  2       19       0.115127   phage protein
MGAS2096_Spy0558  3       17       -0.443040  phage protein
MGAS2096_Spy0559  3       17       0.119834   phage protein
ORFs having significant similarity with Known Virulence factors
LocusTag          Hits          Score  E-value  Comments
MGAS2096_Spy0555  HTHTETR       29     0.011    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 29.2 bits (65), Expect = 0.011
Identities = 11/77 (14%), Positives = 31/77 (40%), Gaps = 13/77 (16%)

Query: 21 SMSELARNVGIAKSTMSRYFNKTREFPLNRADDFAKALNISTEFLLGIDLNSEVDGSELL 80
S+ E+A+ G+ + + +F +++D F++ +S + ++L +
Sbjct: 33 SLGEIAKAAGVTRGAIYWHFK-------DKSDLFSEIWELSESNIGELELEYQAK----- 80

Query: 81 GIYRELEEQRRVIVLDT 97
+ R I++
Sbjct: 81 -FPGDPLSVLREILIHV 96


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits          Score  E-value  Comments
MGAS2096_Spy0556  TYPE3OMGPROT  25     0.030    Type III secretion system outer membrane G protein ...
		>TYPE3OMGPROT#Type III secretion system outer membrane G protein family signature.

Length = 607

Score = 25.2 bits (55), Expect = 0.030
Identities = 9/24 (37%), Positives = 16/24 (66%)

Query: 6 KRLKAERIASGMTQCEVAQSMGWK 29
K L +S +TQC++ +S+GW+
Sbjct: 562 KWLSQNNKSSYLTQCKMDKSLGWR 585


10    MGAS2096_Spy0568  MGAS2096_Spy0578  Y     N          N                   Genomic Island
LocusTag          DNBias  CDNBias  %GCBias    Product
MGAS2096_Spy0568  2       30       -0.752763  phage protein
MGAS2096_Spy0569  0       22       -0.565073  phage protein
MGAS2096_Spy0570  1       22       0.034055   hypothetical protein
MGAS2096_Spy0571  3       21       0.715523   phage protein
MGAS2096_Spy0572  3       21       0.886623   phage protein
MGAS2096_Spy0573  3       20       1.079130   ArpU family phage encoded transcriptional
MGAS2096_Spy0574  3       19       0.590550   chromosome partitioning protein parB /
MGAS2096_Spy0575  4       20       0.987539   phage protein
MGAS2096_Spy0576  4       21       0.545410   terminase large subunit
MGAS2096_Spy0577  3       21       0.817321   minor capsid protein
MGAS2096_Spy0578  2       19       0.430881   minor capsid protein
11    MGAS2096_Spy0614  MGAS2096_Spy0629  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag          DNBias  CDNBias  %GCBias    Product
MGAS2096_Spy0614  2       15       -0.170676  phosphatase
MGAS2096_Spy0615  3       16       -0.098733  DNA gyrase subunit B
MGAS2096_Spy0616  6       21       -0.226758  septation ring formation regulator EzrA
MGAS2096_Spy0617  8       30       0.017747   putative cytoplasmic protein
MGAS2096_Spy0618  8       32       0.299743   phosphopyruvate hydratase
MGAS2096_Spy0619  8       22       -0.780089  transposase
MGAS2096_Spy0620  7       20       -1.354468  transposase
MGAS2096_Spy0621  5       17       -1.564453  transcriptional regulator
MGAS2096_Spy0622  5       18       -2.064456  extracellular matrix binding protein
MGAS2096_Spy0623  1       20       -4.056243  streptolysin S
MGAS2096_Spy0624  0       19       -4.094994  streptolysin S biosynthesis protein SagB
MGAS2096_Spy0625  0       20       -4.459349  streptolysin S biosynthesis protein SagC
MGAS2096_Spy0626  0       20       -4.898958  streptolysin S biosynthesis protein SagD
MGAS2096_Spy0627  0       20       -5.918326  streptolysin S putative self-immunity protein
MGAS2096_Spy0628  -1      14       -4.055063  streptolysin S biosynthesis protein SagF
MGAS2096_Spy0629  -2      14       -4.229799  streptolysin S export ATP-binding protein SagG
ORFs having significant similarity with Known Virulence factors
LocusTag          Hits          Score  E-value  Comments
MGAS2096_Spy0621  PF08280       52     2e-10    M protein trans-acting positive regulator
		>PF08280#M protein trans-acting positive regulator

Length = 530

Score = 51.8 bits (124), Expect = 2e-10
Identities = 30/119 (25%), Positives = 50/119 (42%)

Query: 10 YLRRQQKRNKSFYNTLKTIVEEWMSAEGIVGKLPSYHLLLFTIQLEELLKTYLPPIPVYL 69
++ K N+ Y +LK IVEEWM+ L H LF +E++L+ PP+ V
Sbjct: 371 FVSPYYKGNQKLYTSLKLIVEEWMAKLPGKRYLNHKHFHLFCHYVEQILRNIQPPLVVVF 430

Query: 70 LTNNTAALDLMTNALSIYFPPAIATVMPVNVEIIPFKDIVKEKQSVIIADRQYLNLIQH 128
+ +N L+T++ YF + I K ++I Q + + H
Sbjct: 431 VASNFINAHLLTDSFPRYFSDKSIDFHSYYLLQDNVYQIPDLKPDLVITHSQLIPFVHH 489


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits          Score  E-value  Comments
MGAS2096_Spy0622  FLAGELLIN     32     0.019    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 32.3 bits (73), Expect = 0.019
Identities = 38/280 (13%), Positives = 58/280 (20%), Gaps = 24/280 (8%)

Query: 1175 KVEQAKLDAIKSVDDAQTADAINDALGKGIENINNQYQHGDGVDVRKATAKGD--LEKEA 1232
V K + + + D G G V A D A
Sbjct: 173 NVNGPKEATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAA 232

Query: 1233 AKVKALIAKDPTLTQADKDKQTAAVDAAKNTAIAAVDKATTADGVNQELGKGITAINKAY 1292
+ + A+ AIA K G T K
Sbjct: 233 NGQLTTDDAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTIDTKTG 292

Query: 1293 RPGEGVKARKEAAKADLEKEAAKVKALIAKDPTLTQADKDKQTAAVDAAKNTAIAAVDKA 1352
G G + + EK V + A + A + N DK
Sbjct: 293 NDGNGKVS----TTINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKT 348

Query: 1353 TTAEGINQELGKGITAINKAYRPGEGVKARKEAAKADLEKEAAKVKALITNDPTLTKADK 1412
+L ++ G +T A K
Sbjct: 349 KNESAKLSDLEANNAVKGESKITVNG-----------------AEYTANAAGDKVTLAGK 391

Query: 1413 AKQTGAVAKALKAAIAAVDKATTAEGINQELGKGITAINK 1452
A + I A + L +A++K
Sbjct: 392 TMFIDKTASGVSTLINEDA-AAAKKSTANPLASIDSALSK 430


ORFs having significant similarity with Known Virulence factors
LocusTagHitsScoreE-valueComments
MGAS2096_Spy0628TYPE3IMSPROT310.003 Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 31.3 bits (71), Expect = 0.003
Identities = 15/76 (19%), Positives = 32/76 (42%), Gaps = 1/76 (1%)

Query: 37 SYQDFLDVLLSLFQFVIIILVLFFYSATINLGEVLTFLTQTSWHWQILCYLVLYLMAIIE 96
S + ++ L S+ + V++ ++++ NL +L T L +L + +I
Sbjct: 133 SIKSLVEFLKSILKVVLLSILIWII-IKGNLVTLLQLPTCGIECITPLLGQILRQLMVIC 191

Query: 97 MTLLVLILIFDVLLQK 112
V+I I D +
Sbjct: 192 TVGFVVISIADYAFEY 207


12  MGAS2096_Spy0667  MGAS2096_Spy0681  Y  N  N  Genomic Island
LocusTag          DNBias  CDNBias  %GCBias    Product
MGAS2096_Spy0667  -2      16       -3.430763  hypothetical protein
MGAS2096_Spy0668  -3      16       -4.561058  alpha-D-GlcNAc alpha-1,2-L-rhamnosyltransferase
MGAS2096_Spy0669  -2      18       -6.040695  alpha-L-Rha alpha-1,3-L-rhamnosyltransferase
MGAS2096_Spy0670  -1      20       -6.359409  polysaccharide ABC transporter permease
MGAS2096_Spy0671  -1      21       -6.707835  polysaccharide export ATP-binding protein
MGAS2096_Spy0672  -1      23       -7.546602  glycosyltransferase
MGAS2096_Spy0673  0       22       -7.944933  alpha-L-Rha alpha-1,2-L-rhamnosyltransferase /
MGAS2096_Spy0674  1       20       -6.728096  phosphoglycerol transferase
MGAS2096_Spy0675  1       18       -6.495214  cell wall biosynthesis glycosyltransferase
MGAS2096_Spy0676  2       18       -6.690756  hypothetical protein
MGAS2096_Spy0677  2       18       -5.821875  transcriptional activator amrA
MGAS2096_Spy0678  2       17       -3.886871  hypothetical protein
MGAS2096_Spy0679  -1      17       -1.350219  peptidase T
MGAS2096_Spy0680  0       22       -1.567121  pore forming protein ebsA
MGAS2096_Spy0681  3       25       0.134844   ferredoxin
13  MGAS2096_Spy0703  MGAS2096_Spy0721  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag          DNBias  CDNBias  %GCBias    Product
MGAS2096_Spy0703  2       16       0.353026   lipoprotein signal peptidase
MGAS2096_Spy0704  2       16       0.752576   ribosomal large subunit pseudouridine synthase
MGAS2096_Spy0705  2       16       0.346225   ribosomal large subunit pseudouridine synthase
MGAS2096_Spy0706  -1      14       -0.593291  bifunctional pyrimidine regulatory protein
MGAS2096_Spy0707  -1      15       -1.168952  Uracil permease
MGAS2096_Spy0708  -2      16       -1.544755  aspartate carbamoyltransferase catalytic
MGAS2096_Spy0709  -3      18       -2.120362  carbamoyl phosphate synthase small subunit
MGAS2096_Spy0710  -1      19       -3.656207  hypothetical protein
MGAS2096_Spy0711  -2      18       -3.427222  carbamoyl phosphate synthase large subunit
MGAS2096_Spy0712  -1      18       -4.698591  periplasmic protein of efflux system
MGAS2096_Spy0713  -1      19       -5.433225  ABC transporter ATP-binding protein
MGAS2096_Spy0714  0       17       -4.714413  ABC transporter permease
MGAS2096_Spy0715  0       18       -4.768309  hypothetical protein
MGAS2096_Spy0716  -1      16       -3.908581  30S ribosomal protein S16
MGAS2096_Spy0717  0       16       -4.664468  RNA binding protein
MGAS2096_Spy0718  0       16       -4.705855  hypothetical protein
MGAS2096_Spy0719  -1      16       -4.325733  cell surface protein
MGAS2096_Spy0720  -2      18       -4.269203  putative cytoplasmic protein
MGAS2096_Spy0721  -2      15       -3.961884  cobalt-zinc-cadmium resistance protein czcD
ORFs having significant similarity with Known Virulence factors
LocusTag          Hits       Score  E-value  Comments
MGAS2096_Spy0712  RTXTOXIND  44     6e-07    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 44.4 bits (105), Expect = 6e-07
Identities = 21/112 (18%), Positives = 45/112 (40%), Gaps = 13/112 (11%)

Query: 170 QQLQDLNDAYADAQAEVNKAQIALNDTVVISSVSGTVVE-----VNNDIDPSSKNSQTLV 224
+L+ D E+ K + +V+ + VS V + + ++TL+
Sbjct: 302 DKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTT----AETLM 357

Query: 225 HVATEGQ-LQVKGTLTEYDLANVKVGQSVKIKSKVYSNQEW---TGKISYVS 272
+ E L+V + D+ + VGQ+ IK + + + GK+ ++
Sbjct: 358 VIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNIN 409



Score = 37.1 bits (86), Expect = 1e-04
Identities = 24/185 (12%), Positives = 53/185 (28%), Gaps = 29/185 (15%)

Query: 21 ITLVLIITGVVLWKQQQNTLTADIAKEPYSTVSVTEGSIASSTLLSGTVKALSEEYIYFD 80
++ + + + + V+ G + S S +K + +
Sbjct: 62 YFIMGFLVIAFIL--------SVLG--QVEIVATANGKLTHSGR-SKEIKPIENSIV--- 107

Query: 81 ANKGNDATVTVKVGDQVTQGQQLVQYNTTTA-------QSAYDTAVRSLNKIGRQINHLK 133
+ VK G+ V +G L++ A QS+ A + ++
Sbjct: 108 ------KEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIE 161

Query: 134 TYGVPAV--STETNKDEATGEETTTTVQPSAQQNANYKQQLQDLNDAYADAQAEVNKAQI 191
+P + E + EE +Q + ++ Q +AE
Sbjct: 162 LNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLA 221

Query: 192 ALNDT 196
+N
Sbjct: 222 RINRY 226


14  MGAS2096_Spy0775  MGAS2096_Spy0787  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag          DNBias  CDNBias  %GCBias    Product
MGAS2096_Spy0775  3       14       1.205897   orotidine 5'-phosphate decarboxylase
MGAS2096_Spy0776  1       12       1.151291   orotate phosphoribosyltransferase
MGAS2096_Spy0777  1       13       0.274173   amidase
MGAS2096_Spy0778  -2      12       0.211756   cystine-binding protein
MGAS2096_Spy0779  -2      11       0.782037   cystine transport system permease
MGAS2096_Spy0780  -3      10       0.952749   uracil-DNA glycosylase
MGAS2096_Spy0781  -3      11       0.503931   dihydroorotase
MGAS2096_Spy0782  1       13       0.504534   putative glycerol-3-phosphate acyltransferase
MGAS2096_Spy0783  2       14       0.934220   DNA topoisomerase IV subunit B
MGAS2096_Spy0784  4       18       0.216545   DNA topoisomerase IV subunit A
MGAS2096_Spy0785  5       23       -0.274587  branched-chain amino acid aminotransferase
MGAS2096_Spy0786  9       27       -0.538972  putative cytoplasmic protein
MGAS2096_Spy0787  5       22       0.114980   **30S ribosomal protein S1
ORFs having significant similarity with Known Virulence factors
LocusTag          Hits    Score  E-value  Comments
MGAS2096_Spy0781  UREASE  37     1e-04    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 37.4 bits (87), Expect = 1e-04
Identities = 21/81 (25%), Positives = 30/81 (37%), Gaps = 20/81 (24%)

Query: 20 ADVLIDGKQIVKIASA-----------IECQEAQVIDASGLIVAPGLVDIHVHFREPGQT 68
AD+ + +I I A I +VI G IV G +D H+HF P Q
Sbjct: 86 ADIGLKDGRIAAIGKAGNPDMQPGVTIIVGPGTEVIAGEGKIVTAGGMDSHIHFICPQQ- 144

Query: 69 HKEDIHTGALAAAAGGVTTVV 89
A G+T ++
Sbjct: 145 --------IEEALMSGLTCML 157


15  MGAS2096_Spy0841  MGAS2096_Spy0851  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag          DNBias  CDNBias  %GCBias    Product
MGAS2096_Spy0841  0       12       -3.451176  haloacid dehalogenase-like hydrolase
MGAS2096_Spy0842  1       14       -4.070515  hypothetical protein
MGAS2096_Spy0843  2       15       -4.173691  putative cytoplasmic protein
MGAS2096_Spy0844  4       21       -3.100221  putative cytoplasmic protein
MGAS2096_Spy0845  3       16       -0.007537  putative cytoplasmic protein
MGAS2096_Spy0846  1       15       2.516521   hypothetical protein
MGAS2096_Spy0847  1       16       4.165544   nucleoside diphosphate kinase
MGAS2096_Spy0848  -1      15       4.036425   nucleoside diphosphate kinase
MGAS2096_Spy0849  -1      15       3.847299   transposase
MGAS2096_Spy0850  -1      17       4.622362   GTP-binding protein LepA
MGAS2096_Spy0851  2       19       4.343335   collagen-like surface protein
ORFs having significant similarity with Known Virulence factors
LocusTag          Hits     Score  E-value  Comments
MGAS2096_Spy0845  PF04605  28     0.008    Virulence-associated protein D (VapD)
		>PF04605#Virulence-associated protein D (VapD)

Length = 125

Score = 27.5 bits (61), Expect = 0.008
Identities = 15/44 (34%), Positives = 23/44 (52%), Gaps = 5/44 (11%)

Query: 7 RMILMFDMPTDTAEE-----RKAYRKFRKFLLSEGFIMHQFSIY 45
R + FD+ T + E+ R+ Y +KF+L GF Q+S Y
Sbjct: 5 RKAINFDLSTKSLEKYFKDTREPYSLIKKFMLENGFEHRQYSGY 48


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits       Score  E-value  Comments
MGAS2096_Spy0850  TCRTETOQM  113    3e-28    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 113 bits (283), Expect = 3e-28
Identities = 51/156 (32%), Positives = 81/156 (51%), Gaps = 8/156 (5%)

Query: 12 KIRNFSIIAHIDHGKSTLADRILEK---TETVSSREMQAQLLDSMDLERERGITIKLNAI 68
KI N ++AH+D GK+TL + +L + S + D+ LER+RGITI+
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 69 ELNYTAKDGETYIFHLIDTPGHVDFTYEVSRSLAACEGAILVVDAAQGIEAQTLANVYLA 128
+ E ++IDTPGH+DF EV RSL+ +GAIL++ A G++AQT +
Sbjct: 62 SFQW-----ENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHAL 116

Query: 129 LDNDLEILPVINKIDLPAADPERVRHEVEDVIGLDA 164
+ + INKID D V ++++ + +
Sbjct: 117 RKMGIPTIFFINKIDQNGIDLSTVYQDIKEKLSAEI 152



Score = 93.4 bits (232), Expect = 6e-22
Identities = 44/214 (20%), Positives = 93/214 (43%), Gaps = 16/214 (7%)

Query: 171 SAKAGIGIEEILEQIVEKVPAPTGDVDAPLQALIFDSVYDAYRGVILQVRIVNGIVKPGD 230
SAK IGI+ ++E I K + T + L +F Y R + +R+ +G++ D
Sbjct: 220 SAKNNIGIDNLIEVITNKFYSSTHRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRD 279

Query: 231 KIQMMSNGKTFDVTEVGIFTP-KAVGRDFLATGDVGYVAASIKTVADTRVGDTVTLANNP 289
+++ K +TE+ + D +G++ + + +GDT L
Sbjct: 280 SVRISEKEKI-KITEMYTSINGELCKIDKAYSGEIVILQNEFLKLNSV-LGDTKLLPQRE 337

Query: 290 AKEALHGYKQMNPMVFAGIYPIESNKYNDLREALEKLQLNDASLQFE--PETSQALGFGF 347
E P++ + P + + L +AL ++ +D L++ T + +
Sbjct: 338 RIENPL------PLLQTTVEPSKPQQREMLLDALLEISDSDPLLRYYVDSATHEII---- 387

Query: 348 RCGFLGLLHMDVIQERLEREFNIDLIMTAPSVVY 381
FLG + M+V L+ ++++++ + P+V+Y
Sbjct: 388 -LSFLGKVQMEVTCALLQEKYHVEIEIKEPTVIY 420



Score = 43.3 bits (102), Expect = 3e-06
Identities = 21/104 (20%), Positives = 41/104 (39%), Gaps = 12/104 (11%)

Query: 393 VSNPSEFPDPTRVAFIE----------EPYVKAQIMVPQEFVGAVMELSQRKRGDFVTMD 442
VS P++F + + EPY+ +I PQE++ + + + V
Sbjct: 510 VSTPADFRMLAPIVLEQVLKKAGTELLEPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQ 569

Query: 443 YIDDNRVNVIYQIPLAEIVFDFFDKLKSSTRGYASFDYDMSEYR 486
+ +N V + +IP I ++ L T G + ++ Y
Sbjct: 570 -LKNNEVILSGEIPARCI-QEYRSDLTFFTNGRSVCLTELKGYH 611


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits        Score  E-value  Comments
MGAS2096_Spy0851  GPOSANCHOR  72     7e-16    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 72.4 bits (177), Expect = 7e-16
Identities = 36/105 (34%), Positives = 49/105 (46%), Gaps = 12/105 (11%)

Query: 298 QPGKPAPKTPEVPQKPDTAPHTPKTPQIPGQSKDVTPAPQSPSNRGLNKPQTQSGNQLAK 357
+ K A + ++ + TP P ++ +G NQ
Sbjct: 447 KLAKQAEELAKLRAGKASDSQTPDAK----------PGNKAVPGKGQAPQAGTKPNQ--N 494

Query: 358 TPAAHDTHRQLPATGETTNPFFTAAAVAIMTTAGVVAVAKRQENN 402
+T RQLP+TGET NPFFTAAA+ +M TAGV AV KR+E N
Sbjct: 495 KAPMKETKRQLPSTGETANPFFTAAALTVMATAGVAAVVKRKEEN 539


16  MGAS2096_Spy0864  MGAS2096_Spy0885  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag          DNBias  CDNBias  %GCBias    Product
MGAS2096_Spy0864  1       20       3.080899   dipeptidase PepV
MGAS2096_Spy0865  4       20       2.645015   tRNA
MGAS2096_Spy0866  5       23       2.529792   tRNA
MGAS2096_Spy0867  7       27       1.918497   50S ribosomal protein L10
MGAS2096_Spy0868  4       22       0.875325   50S ribosomal protein L7/L12
MGAS2096_Spy0869  2       28       -0.516762  hypothetical protein
MGAS2096_Spy0870  2       29       -2.052673  hypothetical protein
MGAS2096_Spy0871  3       30       -3.233322  hypothetical protein
MGAS2096_Spy0872  7       25       -4.906765  DNA-cytosine methyltransferase
MGAS2096_Spy0873  6       26       -8.011723  relaxase
MGAS2096_Spy0874  4       27       -9.057339  relaxase
MGAS2096_Spy0875  4       28       -9.043871  relaxase
MGAS2096_Spy0876  6       29       -9.652518  relaxase
MGAS2096_Spy0877  6       29       -9.674993  two-component response regulator
MGAS2096_Spy0878  5       29       -9.896239  lantibiotic biosynthesis sensor protein
MGAS2096_Spy0879  6       30       -9.962042  lantibiotic ABC transporter permease
MGAS2096_Spy0880  6       30       -9.700406  lanthionine synthetase
MGAS2096_Spy0881  6       27       -9.546928  Serine (threonine) dehydratase
MGAS2096_Spy0882  4       25       -8.609391  lantibiotic ABC transporter ATP-binding protein
MGAS2096_Spy0883  5       26       -9.057141  lantibiotic ABC transporter permease
MGAS2096_Spy0884  4       28       -7.936780  lantibiotic ABC transporter permease
MGAS2096_Spy0885  2       24       -4.986426  Cro/CI family transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag          Hits    Score  E-value  Comments
MGAS2096_Spy0877  HTHFIS  73     6e-17    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 72.6 bits (178), Expect = 6e-17
Identities = 29/130 (22%), Positives = 57/130 (43%), Gaps = 1/130 (0%)

Query: 3 KILAIDDDKEILKLMKTSLEIENYHVITCQEIELPIVFDDFKGYDLILLDIMMPNISGTE 62
IL DDD I ++ +L Y V + DL++ D++MP+ + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 63 FCYKIREE-VHSPIIFVSALDGDNEIVQALNIGGDDFIVKPFSLKQFVAKVNSHLKREER 121
+I++ P++ +SA + ++A G D++ KPF L + + + L +R
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 122 AKIKNEAEER 131
K E + +
Sbjct: 125 RPSKLEDDSQ 134


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits          Score  E-value  Comments
MGAS2096_Spy0883  ANTHRAXTOXNA  32     0.003    Anthrax toxin LF subunit signature.
		>ANTHRAXTOXNA#Anthrax toxin LF subunit signature.

Length = 800

Score = 31.6 bits (71), Expect = 0.003
Identities = 17/47 (36%), Positives = 29/47 (61%), Gaps = 7/47 (14%)

Query: 184 NKWYLFPYDWSLKLLEPMTRMRINSIPFGAEFVPDYSQIFISLFLGI 230
NK Y+ +W+ +P+T+ +IN+IP AEF+ + S I S +G+
Sbjct: 639 NKAYI---EWT----DPITKAKINTIPTSAEFIKNLSSIRRSSNVGV 678


17  MGAS2096_Spy0897  MGAS2096_Spy0914  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag          DNBias  CDNBias  %GCBias    Product
MGAS2096_Spy0897  2       15       0.308047   dihydroneopterin aldolase
MGAS2096_Spy0898  2       14       -0.218844  2-amino-4-hydroxy-6-
MGAS2096_Spy0899  2       14       -0.542582  UDP-N-acetylenolpyruvoylglucosamine reductase
MGAS2096_Spy0900  2       17       -1.176429  spermidine/putrescine ABC transporter
MGAS2096_Spy0901  1       16       -0.862260  spermidine/putrescine ABC transporter permease
MGAS2096_Spy0902  2       15       -0.347398  spermidine/putrescine ABC transporter permease
MGAS2096_Spy0903  2       16       0.646800   spermidine/putrescine-binding protein
MGAS2096_Spy0904  2       17       0.800697   transcriptional regulatory protein dpiA
MGAS2096_Spy0905  2       17       0.637920   sensor kinase dpiB
MGAS2096_Spy0906  2       15       0.033534   sensor kinase dpiB
MGAS2096_Spy0907  3       17       -0.893357  malate-sodium symport
MGAS2096_Spy0908  2       19       -1.968422  NAD-dependent malic enzyme
MGAS2096_Spy0909  1       20       -4.097039  Zn-dependent alcohol dehydrogenases and related
MGAS2096_Spy0910  2       23       -5.198161  acid phosphatase/phosphotransferase
MGAS2096_Spy0911  1       21       -4.819652  chloride channel protein
MGAS2096_Spy0912  0       20       -4.867044  lipase/acylhydrolase family protein
MGAS2096_Spy0913  0       17       -4.982460  hypothetical protein
MGAS2096_Spy0914  -1      16       -3.379053  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag          Hits      Score  E-value  Comments
MGAS2096_Spy0903  MYCMG045  37     1e-04    Hypothetical mycoplasma lipoprotein (MG045) signature.
		>MYCMG045#Hypothetical mycoplasma lipoprotein (MG045) signature.

Length = 483

Score = 36.6 bits (84), Expect = 1e-04
Identities = 24/82 (29%), Positives = 42/82 (51%), Gaps = 4/82 (4%)

Query: 31 SGSQSDKLVIYNWGDYIDPALLKKFTKETGIEVQYETFDSNEAMYTKIKQGGTTYDIAVP 90
S S V+ N+ YI P LL++ + + + T+ SNE + TY +AV
Sbjct: 21 SSCGSTTFVLANFESYISPLLLER--VQEKHPLTFLTYPSNEKLINGF--ANNTYSVAVA 76

Query: 91 SDYTIDKMIKENLLNKLDKSKL 112
S Y + ++I+ +LL+ +D S+
Sbjct: 77 STYAVSELIERDLLSPIDWSQF 98


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits    Score  E-value  Comments
MGAS2096_Spy0904  HTHFIS  67     3e-15    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 67.2 bits (164), Expect = 3e-15
Identities = 23/131 (17%), Positives = 50/131 (38%), Gaps = 2/131 (1%)

Query: 3 VLIIEDDPMVDFIHRNYLEKLNLFDRIISSDSMKAVQSILTDYAIDLILLDIHITDGNGI 62
+L+ +DD + + L + + + + + + DL++ D+ + D N
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGY--DVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 63 QFLEKWRAQHIPCEVIIISAANDGNIIRDGFHLGIIDYLIKPFTFERFQESIQQFVTHRE 122
L + + V+++SA N G DYL KPF I + + +
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 123 HLANQQLEQAQ 133
++ + +Q
Sbjct: 124 RRPSKLEDDSQ 134


18  MGAS2096_Spy0991  MGAS2096_Spy1006  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag          DNBias  CDNBias  %GCBias    Product
MGAS2096_Spy0991  2       18       -1.150549  luciferase-like monooxygenase
MGAS2096_Spy0992  -2      15       -1.371244  NADH-dependent flavin oxidoreductase yqiG
MGAS2096_Spy0993  -2      14       -1.866038  lipoate-protein ligase A
MGAS2096_Spy0994  -1      18       -2.075609  phosphopantothenate--cysteine ligase
MGAS2096_Spy0995  -2      19       -1.935682  phosphopantothenoylcysteine decarboxylase
MGAS2096_Spy0996  -3      20       -1.622349  integral membrane protein
MGAS2096_Spy0997  -2      21       -1.356767  phosphoglucomutase/phosphomannomutase
MGAS2096_Spy0998  -2      19       -1.870455  nucleoside transport system permease
MGAS2096_Spy0999  -3      20       -3.025177  nucleoside transport system permease
MGAS2096_Spy1000  -2      20       -3.175445  nucleoside transport ATP-binding protein
MGAS2096_Spy1001  0       22       -5.014446  nucleoside-binding protein
MGAS2096_Spy1002  0       22       -6.519576  cytidine deaminase
MGAS2096_Spy1003  0       19       -5.164320  16S rRNA m(2)G 1207 methyltransferase
MGAS2096_Spy1004  0       19       -5.244244  pantothenate kinase
MGAS2096_Spy1005  1       19       -4.700750  30S ribosomal protein S20
MGAS2096_Spy1006  -1      17       -4.352547  sensor protein ciaH
ORFs having significant similarity with Known Virulence factors
LocusTag          Hits          Score  E-value  Comments
MGAS2096_Spy1001  LIPPROTEIN48  66     4e-14    Mycoplasma P48 major surface lipoprotein signature.
		>LIPPROTEIN48#Mycoplasma P48 major surface lipoprotein signature.

Length = 428

Score = 65.8 bits (160), Expect = 4e-14
Identities = 76/299 (25%), Positives = 120/299 (40%), Gaps = 45/299 (15%)

Query: 36 DLKVAMVTDTGGVDDKSFNQSAWEGLQSWGKEMGLQKGTGFDYFQSTSESEYATNLDTAV 95
LK ++TD G +DDKSFNQSA+E L++ + K TG + S + + ++A+
Sbjct: 61 KLKPVLITDEGKIDDKSFNQSAFEALKA------INKQTGIEINNVEPSSNFESAYNSAL 114

Query: 96 SGGYQLIYGIGFALKDAIAKAAGD------NEGVKFVIIDDIIEGKDNV-ASVTFADHEA 148
S G+++ GF + +I + +K + ID IE + S+ F E+
Sbjct: 115 SAGHKIWVLNGFKHQQSIKQYIDAHREELERNQIKIIGIDFDIETEYKWFYSLQFNIKES 174

Query: 149 AYLAGIAAAKTTKTK-----TVGFVGGMEGTVITRFEKGFEAGVKS---------VDDTI 194
A+ G A A + V GG +T F +GF G+ + T
Sbjct: 175 AFTTGYAIASWLSEQDESKRVVASFGGGAFPGVTTFNEGFAKGILYYNQKHKSSKIYHTS 234

Query: 195 QVKVDYAGSFGDAAKGKTIAAAQYAAGADVIYQAAGG---TGAGVFNEAKAINEKRSEAD 251
VK+D +G I + ADV Y G F + N+ +
Sbjct: 235 PVKLD-SGFTAGEKMNTVINNVLSSTPADVKYNPHVILSVAGPATFETVRLANKGQ---- 289

Query: 252 KVWVIGVDRDQKDEGKYTSKDGKEANFVLASSIKEVGKAVQLINKQVADKKFPGGKTTV 310
+VIGVD DQ +D +L S +K + +AV + +K G K V
Sbjct: 290 --YVIGVDSDQG-----MIQDKDR---ILTSVLKHIKQAVYETLLDLILEKEEGYKPYV 338


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits     Score  E-value  Comments
MGAS2096_Spy1006  PF06580  39     2e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 39.1 bits (91), Expect = 2e-05
Identities = 15/75 (20%), Positives = 31/75 (41%), Gaps = 5/75 (6%)

Query: 312 YGKIFYFQNQVNRSLRMDKALLKQLITILFDNAIKY----TDKNGIIEIIVKTTDKNLLI 367
+ F+NQ+N ++ D + L+ L +N IK+ + G I + + + +
Sbjct: 236 FEDRLQFENQINPAIM-DVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTL 294

Query: 368 SVIDNGPGITDEEKK 382
V + G K+
Sbjct: 295 EVENTGSLALKNTKE 309


19  MGAS2096_Spy1023  MGAS2096_Spy1030  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag          DNBias  CDNBias  %GCBias    Product
MGAS2096_Spy1023  2       16       -0.571401  integral membrane protein
MGAS2096_Spy1024  2       16       -0.669131  type I restriction-modification system
MGAS2096_Spy1025  1       15       -1.084606  ABC transporter permease
MGAS2096_Spy1026  -1      19       -2.701951  ABC transporter ATP-binding protein
MGAS2096_Spy1027  2       21       -4.357045  TetR family transcriptional regulator
MGAS2096_Spy1028  1       23       -5.148398  hypothetical protein
MGAS2096_Spy1029  0       23       -5.283049  NAD-dependent K+ or Na(+) uptake system
MGAS2096_Spy1030  0       25       -4.691869  Gls24 family general stress protein
ORFs having significant similarity with Known Virulence factors
LocusTag          Hits     Score  E-value  Comments
MGAS2096_Spy1026  PF05272  34     7e-04    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 33.5 bits (76), Expect = 7e-04
Identities = 18/41 (43%), Positives = 23/41 (56%), Gaps = 2/41 (4%)

Query: 36 KGELVVIL-GASGAGKSTVLNILGGMD-TVDAGQVIIDGKD 74
K + V+L G G GKST++N L G+D D I GKD
Sbjct: 594 KFDYSVVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGTGKD 634


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits     Score  E-value  Comments
MGAS2096_Spy1027  HTHTETR  42     5e-07    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 41.5 bits (97), Expect = 5e-07
Identities = 13/48 (27%), Positives = 25/48 (52%)

Query: 4 RHTETKAYVKTALTTLLTEQSFETLTVSDLTKKAGINRGTFYLHYTDK 51
ET+ ++ L ++Q + ++ ++ K AG+ RG Y H+ DK
Sbjct: 8 EAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDK 55


20  MGAS2096_Spy1112  MGAS2096_Spy1156  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag          DNBias  CDNBias  %GCBias    Product
MGAS2096_Spy1112  0       20       -4.084992  transcriptional regulator
MGAS2096_Spy1113  1       20       -3.919255  Na+ driven multidrug efflux pump
MGAS2096_Spy1114  2       24       -5.005515  hypothetical protein
MGAS2096_Spy1115  1       24       -4.396392  ferredoxin
MGAS2096_Spy1116  2       26       0.716312   TetR family transcriptional regulator
MGAS2096_Spy1117  2       27       1.998341   hypothetical protein
MGAS2096_Spy1118  1       27       3.548435   virginiamycin A acetyltransferase
MGAS2096_Spy1119  0       30       4.403390   hypothetical protein
MGAS2096_Spy1120  -2      37       5.333943   MerR family transcriptional regulator
MGAS2096_Spy1121  -2      37       5.085358   antigen-like protein
MGAS2096_Spy1122  -2      39       4.638125   sortase
MGAS2096_Spy1123  -2      39       5.205063   hypothetical protein
MGAS2096_Spy1124  -2      41       5.588067   adenine-specific methyltransferase
MGAS2096_Spy1125  -1      45       6.512657   TRSE protein
MGAS2096_Spy1126  0       42       7.058614   hypothetical protein
MGAS2096_Spy1127  0       44       8.015947   hypothetical protein
MGAS2096_Spy1128  2       46       8.403409   adenine-specific methyltransferase
MGAS2096_Spy1129  2       48       8.599800   aspartyl/glutamyl-tRNA amidotransferase subunit
MGAS2096_Spy1130  1       46       8.365111   TraG/TraD family protein
MGAS2096_Spy1131  1       38       5.395812   hypothetical protein
MGAS2096_Spy1132  2       33       2.051182   hypothetical protein
MGAS2096_Spy1133  1       30       0.820270   relaxase
MGAS2096_Spy1134  1       29       -0.882506  relaxosome component
MGAS2096_Spy1135  2       31       1.126592   LtrC
MGAS2096_Spy1136  3       31       0.472652   hypothetical protein
MGAS2096_Spy1137  2       39       3.968988   hypothetical protein
MGAS2096_Spy1138  2       51       7.997532   hypothetical protein
MGAS2096_Spy1139  2       54       9.016630   superfamily II DNA/RNA helicase
MGAS2096_Spy1140  2       56       9.348850   site-specific recombinase
MGAS2096_Spy1141  3       57       10.323553  putative cytoplasmic protein
MGAS2096_Spy1142  2       56       9.954690   chromosome partitioning protein parB
MGAS2096_Spy1143  1       53       9.193513   plasmid recombination protein Mob family
MGAS2096_Spy1144  1       41       5.856566   hypothetical protein
MGAS2096_Spy1145  2       36       4.184200   RNA polymerase ECF-type sigma factor
MGAS2096_Spy1146  0       36       5.602927   putative cytoplasmic protein
MGAS2096_Spy1147  0       35       5.297338   RNA polymerase sigma-B factor
MGAS2096_Spy1148  0       34       5.264834   hypothetical protein
MGAS2096_Spy1149  0       35       5.535537   tetracycline resistance protein tetM
MGAS2096_Spy1150  0       38       6.381140   TnpV
MGAS2096_Spy1151  0       39       6.687790   superfamily II DNA/RNA helicase
MGAS2096_Spy1152  1       39       6.663513   hypothetical protein
MGAS2096_Spy1154  1       40       6.940502   hypothetical protein
MGAS2096_Spy1153  0       40       6.732939   hypothetical protein
MGAS2096_Spy1155  0       34       5.679159   hypothetical protein
MGAS2096_Spy1156  0       27       4.683203   collagen adhesion protein
ORFs having significant similarity with Known Virulence factors
LocusTag          Hits     Score  E-value  Comments
MGAS2096_Spy1116  HTHTETR  57     4e-12    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 56.6 bits (136), Expect = 4e-12
Identities = 19/92 (20%), Positives = 44/92 (47%), Gaps = 1/92 (1%)

Query: 14 KNDLIEKGIELVNKNGINQLSLRKVAQACGVSHAAPYSHFSNKEELLQEMQLHITKKFTE 73
+ +++ + L ++ G++ SL ++A+A GV+ A Y HF +K +L E+ E
Sbjct: 13 RQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIGE 72

Query: 74 VLENTVSQYRGTPIFLLEFG-KAYISFFISRP 104
+ +++ G P+ +L + ++
Sbjct: 73 LELEYQAKFPGDPLSVLREILIHVLESTVTEE 104


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits         Score  E-value  Comments
MGAS2096_Spy1121  IGASERPTASE  36     4e-04    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 36.2 bits (83), Expect = 4e-04
Identities = 27/151 (17%), Positives = 48/151 (31%), Gaps = 6/151 (3%)

Query: 95 KAKDKAFEFKQKRMMKTVQSQRANAAGQNGSQFHYGTVSRSSASNTSVQRVKGARQTQKT 154
+ A KQ+ Q A + S A + Q + A+ +T
Sbjct: 1036 TTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKA---NTQTNEVAQSGSET 1092

Query: 155 IKQSARNSKNAIKTAAKGTVKT-TQKSVKTAQATSKAAIKTTQ--TTAKSAQAAAKASAK 211
+ +K + K T+K+ + + TS+ + K Q T A+ A +
Sbjct: 1093 KETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPT 1152

Query: 212 TAQKAAQAARATAKASAAAVKTTAKATVSAV 242
K Q+ T + K T+ V
Sbjct: 1153 VNIKEPQSQTNTTADTEQPAKETSSNVEQPV 1183


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits         Score  E-value  Comments
MGAS2096_Spy1132  IGASERPTASE  38     3e-05    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 38.1 bits (88), Expect = 3e-05
Identities = 25/150 (16%), Positives = 53/150 (35%), Gaps = 4/150 (2%)

Query: 120 RFSLASVDTATIVQEAERTKDGKSAKTGQPEPDIGVQEKAQKDQLLDALMGKPVQKEDNA 179
++ L +V+ + E K ++ T Q D + + + D A
Sbjct: 968 KYKLRNVNGRYDLYNPEVEKRNQTVDTTNITT----PNNIQADVPSVPSNNEEIARVDEA 1023

Query: 180 PNPSVAKTEKSPLSEPTLEKRSKSAEGATLNKEKPSVKEELRKIKESRKEQETEVSPTLN 239
P P A S +E E + ++ N++ + + + + + N
Sbjct: 1024 PVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTN 1083

Query: 240 QEKSSNQAKQPVKKTEHKQPARRKKPKKSK 269
+ S + + TE K+ A +K +K+K
Sbjct: 1084 EVAQSGSETKETQTTETKETATVEKEEKAK 1113



Score = 35.8 bits (82), Expect = 2e-04
Identities = 30/151 (19%), Positives = 52/151 (34%), Gaps = 13/151 (8%)

Query: 132 VQEAERTKDGKSAKT------GQPEPDIGVQEKAQKDQLLDALMGKPV---QKEDNAPN- 181
V++ +T D + T P +E A+ D+ E A N
Sbjct: 985 VEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENS 1044

Query: 182 PSVAKTEK---SPLSEPTLEKRSKSAEGATLNKEKPSVKEELRKIKESRKEQETEVSPTL 238
+KT + +E T + R + E + K E + E+++ Q TE T
Sbjct: 1045 KQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETA 1104

Query: 239 NQEKSSNQAKQPVKKTEHKQPARRKKPKKSK 269
EK + K E + + PK+ +
Sbjct: 1105 TVEKEEKAKVETEKTQEVPKVTSQVSPKQEQ 1135


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits          Score  E-value  Comments
MGAS2096_Spy1140  ADHESNFAMILY  30     0.030    Adhesin family signature.
		>ADHESNFAMILY#Adhesin family signature.

Length = 309

Score = 29.8 bits (67), Expect = 0.030
Identities = 20/100 (20%), Positives = 39/100 (39%), Gaps = 14/100 (14%)

Query: 398 NDEKAVRAIEKRLTDTDQSRAKALEKEQRKLNKRLAELDRL----FSSLYEDKVMERITE 453
N + I K+L+ D + + EK ++ +L +LD+ F+ + +K + +E
Sbjct: 146 NGIIFAKNIAKQLSAKDPNNKEFYEKNLKEYTDKLDKLDKESKDKFNKIPAEKKLIVTSE 205

Query: 454 RNFEMMSGKYQKEQLEIE----------ARLKEVTETLND 483
F+ S Y I ++K + E L
Sbjct: 206 GAFKYFSKAYGVPSAYIWEINTEEEGTPEQIKTLVEKLRQ 245


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits       Score  E-value  Comments
MGAS2096_Spy1149  TCRTETOQM  1075   0.0      Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 1075 bits (2781), Expect = 0.0
Identities = 494/638 (77%), Positives = 562/638 (88%)

Query: 1 MKIINLGILAHVDAGKTTLTESLLYTSGAIAELGSVDEGTTRTDTMNLERQRGITIQTAV 60
MKIIN+G+LAHVDAGKTTLTESLLY SGAI ELGSVD+GTTRTD LERQRGITIQT +
Sbjct: 1 MKIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGI 60

Query: 61 TSFQWEDVKVNIIDTPGHMDFLAEVYRSLSVLDGAVLLVSAKDGIQAQTRILFHALQTMK 120
TSFQWE+ KVNIIDTPGHMDFLAEVYRSLSVLDGA+LL+SAKDG+QAQTRILFHAL+ M
Sbjct: 61 TSFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMG 120

Query: 121 IPTIFFINKIDQEGIDLPMVYREMKAKLSSEIIVKQKVGQHPHINVTDNDDMEQWDAVIM 180
IPTIFFINKIDQ GIDL VY+++K KLS+EI++KQKV +P++ VT+ + EQWD VI
Sbjct: 121 IPTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIE 180

Query: 181 GNDELLEKYMSGKPFKMSELEQEENRRFQNGTLFPVYHGSAKNNLGIRQLIEVIASKFYS 240
GND+LLEKYMSGK + ELEQEE+ RF N +LFPVYHGSAKNN+GI LIEVI +KFYS
Sbjct: 181 GNDDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVITNKFYS 240

Query: 241 STPEGQSELCGQVFKIEYSEKRRRFVYVRIYSGTLHLRDVIRISEKEKIKITEMCVPTNG 300
ST GQSELCG+VFKIEYSEKR+R Y+R+YSG LHLRD +RISEKEKIKITEM NG
Sbjct: 241 STHRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEKEKIKITEMYTSING 300

Query: 301 ELYSSDTACSGDIVILPNDVLQLNSILGNEMLLPQRKFIENPLPMLQTTIAVKKSEQREI 360
EL D A SG+IVIL N+ L+LNS+LG+ LLPQR+ IENPLP+LQTT+ K +QRE+
Sbjct: 301 ELCKIDKAYSGEIVILQNEFLKLNSVLGDTKLLPQRERIENPLPLLQTTVEPSKPQQREM 360

Query: 361 LLGALTEISDGDPLLKYYVDTTTHEIILSFLGNVQMEVICAILEEKYHLEAEIKEPTVIY 420
LL AL EISD DPLL+YYVD+ THEIILSFLG VQMEV CA+L+EKYH+E EIKEPTVIY
Sbjct: 361 LLDALLEISDSDPLLRYYVDSATHEIILSFLGKVQMEVTCALLQEKYHVEIEIKEPTVIY 420

Query: 421 MERPLRKAEYTIHIEVPPNPFWASVGLSIEPLPIGSGVQYESRVSLGYLNQSFQNAVMEG 480
MERPL+KAEYTIHIEVPPNPFWAS+GLS+ PLP+GSG+QYES VSLGYLNQSFQNAVMEG
Sbjct: 421 MERPLKKAEYTIHIEVPPNPFWASIGLSVSPLPLGSGMQYESSVSLGYLNQSFQNAVMEG 480

Query: 481 VLYGCEQGLYGWKVTDCKICFEYGLYYSPVSTPADFRLLSPIVLEQALKKAGTELLEPYL 540
+ YGCEQGLYGW VTDCKICF+YGLYYSPVSTPADFR+L+PIVLEQ LKKAGTELLEPYL
Sbjct: 481 IRYGCEQGLYGWNVTDCKICFKYGLYYSPVSTPADFRMLAPIVLEQVLKKAGTELLEPYL 540

Query: 541 HFEIYAPQEYLSRAYHDAPRYCADIVSTQIKNDEVILKGEIPARCIQEYRNDLTYFTNGQ 600
F+IYAPQEYLSRAY DAP+YCA+IV TQ+KN+EVIL GEIPARCIQEYR+DLT+FTNG+
Sbjct: 541 SFKIYAPQEYLSRAYTDAPKYCANIVDTQLKNNEVILSGEIPARCIQEYRSDLTFFTNGR 600

Query: 601 GVCLTELKGYHPAIGKFICQPRRPNSRIDKVRHMFHKL 638
VCLTELKGYH G+ +CQPRRPNSRIDKVR+MF+K+
Sbjct: 601 SVCLTELKGYHVTTGEPVCQPRRPNSRIDKVRYMFNKI 638


21  MGAS2096_Spy1210  MGAS2096_Spy1222  Y  N  N  Genomic Island
LocusTag          DNBias  CDNBias  %GCBias    Product
MGAS2096_Spy1210  2       18       -0.622590  hypothetical protein
MGAS2096_Spy1211  1       18       -0.286485  superoxide dismutase
MGAS2096_Spy1212  3       19       -0.343809  DNA polymerase III subunit delta
MGAS2096_Spy1213  1       19       -0.000772  COME operon protein 3
MGAS2096_Spy1214  -1      17       -0.002833  COME operon protein 3
MGAS2096_Spy1215  -2      16       0.287487   COME operon protein 1
MGAS2096_Spy1217  0       14       0.330118   1-acyl-sn-glycerol-3-phosphate acyltransferase
MGAS2096_Spy1216  1       12       0.463441   methyltransferase
MGAS2096_Spy1218  3       15       0.769572   hypothetical protein
MGAS2096_Spy1219  3       17       1.585624   Kup system potassium uptake protein
MGAS2096_Spy1220  3       17       1.430575   Kup system potassium uptake protein
MGAS2096_Spy1221  3       16       0.883411   Kup system potassium uptake protein
MGAS2096_Spy1222  3       16       0.534055   ATP-dependent RNA helicase
22  MGAS2096_Spy1273  MGAS2096_Spy1292  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag          DNBias  CDNBias  %GCBias    Product
MGAS2096_Spy1273  2       25       1.792913   hypothetical protein
MGAS2096_Spy1274  2       24       1.938272   GTP-binding protein TypA/BipA
MGAS2096_Spy1275  0       13       -0.175886  GTP-binding protein TypA/BipA
MGAS2096_Spy1276  0       14       -0.759907  rhodanese-related sulfurtransferases
MGAS2096_Spy1277  0       14       -1.573701  glucokinase
MGAS2096_Spy1278  4       16       -3.406698  putative cytoplasmic protein
MGAS2096_Spy1279  2       16       -2.790999  non-specific DNA-binding protein Dps /
MGAS2096_Spy1280  2       19       -2.767852  Type 4 prepilin peptidase pilD
MGAS2096_Spy1281  1       18       -2.444211  ribosomal RNA large subunit methyltransferase N
MGAS2096_Spy1282  0       17       -2.424537  putative cytoplasmic protein
MGAS2096_Spy1283  -1      16       -1.146246  hypothetical protein
MGAS2096_Spy1284  -2      14       -0.164357  ribose operon repressor
MGAS2096_Spy1285  -1      14       0.759564   ATP-dependent endopeptidase Lon
MGAS2096_Spy1286  1       14       1.130968   phosphopantetheine adenylyltransferase
MGAS2096_Spy1287  2       18       1.782497   methyltransferase
MGAS2096_Spy1288  3       18       1.955690   asparagine synthetase AsnA
MGAS2096_Spy1289  3       24       1.937803   carbamate kinase
MGAS2096_Spy1290  1       20       1.166861   hypothetical protein
MGAS2096_Spy1291  3       23       0.855806   arginine/ornithine antiporter
MGAS2096_Spy1292  3       24       1.073104   ornithine carbamoyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag          Hits       Score  E-value  Comments
MGAS2096_Spy1274  TCRTETOQM  126    7e-33    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 126 bits (317), Expect = 7e-33
Identities = 76/407 (18%), Positives = 146/407 (35%), Gaps = 95/407 (23%)

Query: 1 MDTPGHADFGGEVERIMKMVDGVVLVVDAYEGTMPQTRFVLKKALEQNLIPIVVVNKIDK 60
+DTPGH DF EV R + ++DG +L++ A +G QTR + + + I +NKID+
Sbjct: 73 IDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGIPTIFFINKIDQ 132

Query: 61 PSARP-------------------------------------AEVVDEVLELFIELGADD 83
+ V E + +E
Sbjct: 133 NGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEGNDDLLEKYMSG 192

Query: 84 EQLE-----------------FPVVYASAINGTSSLSDDPADQEHTMAPIFDTIIDHIPA 126
+ LE FPV + SA N + + + I + +
Sbjct: 193 KSLEALELEQEESIRFHNCSLFPVYHGSAKNNIG------------IDNLIEVITNKFYS 240

Query: 127 PVDNSDEPLQFQVSLLDYNDFVGRIGIGRVFRGTVKVGDQVTLSKLDGTTKNFRVTKLFG 186
L +V ++Y++ R+ R++ G + + D V +S + ++T+++
Sbjct: 241 STHRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRIS----EKEKIKITEMYT 296

Query: 187 FFGLERREIQEAKAGDLIAVSGMEDIFVGETITPTDCVEALPILRIDEPTLQMTFLVNNS 246
E +I +A +G+++ + E + + + T + + P LQ T
Sbjct: 297 SINGELCKIDKAYSGEIVILQN-EFLKLNSVLGDTKLLPQRERIENPLPLLQTT------ 349

Query: 247 PFAGREGKWITSRKVEER--LLAELQT----DVSLRVDPTDSPDKWTVSGRGELHLSILI 300
+ K ++R LL L D LR + + +S G++ + +
Sbjct: 350 ---------VEPSKPQQREMLLDALLEISDSDPLLRYYVDSATHEIILSFLGKVQMEVTC 400

Query: 301 ETMRRE-GYELQVSRPEVIIKEIDGVKCEPFERVQIDTPEEYQGAII 346
++ + E+++ P VI E K E + I+ P A I
Sbjct: 401 ALLQEKYHVEIEIKEPTVIYMERPLKKAE--YTIHIEVPPNPFWASI 445



Score = 42.5 bits (100), Expect = 3e-06
Identities = 18/79 (22%), Positives = 31/79 (39%), Gaps = 1/79 (1%)

Query: 328 EPFERVQIDTPEEYQGAIIQSLSERKGDMLDMQMVGNGQTRLIFLIPARGLIGYSTEFLS 387
EP+ +I P+EY + +++D Q + N + L IPAR + Y ++
Sbjct: 537 EPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQ-LKNNEVILSGEIPARCIQEYRSDLTF 595

Query: 388 MTRGYGIMNHTFDQYLPVV 406
T G + Y
Sbjct: 596 FTNGRSVCLTELKGYHVTT 614


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
MGAS2096_Spy1275  TCRTETOQM  53  3e-12  Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 52.9 bits (127), Expect = 3e-12
Identities = 24/61 (39%), Positives = 35/61 (57%), Gaps = 2/61 (3%)

Query: 8 IRNVAIIAHVDHGKTTLVDELLKQSHTLDERKELQE--RAMDSNDLEKERGITILAKNTA 65
I N+ ++AHVD GKTTL + LL S + E + + D+ LE++RGITI T+
Sbjct: 3 IINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITS 62

Query: 66 V 66

Sbjct: 63 F 63


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
MGAS2096_Spy1277  PF03309  31  0.004  Bvg accessory factor
		>PF03309#Bvg accessory factor

Length = 271

Score = 31.3 bits (71), Expect = 0.004
Identities = 29/126 (23%), Positives = 43/126 (34%), Gaps = 14/126 (11%)

Query: 5 LLGIDLGGTTIKFGILTAAGEVQE---KWAIETNILEGGKHIVPDIVASIKHRLDLYGLS 61
LL ID+ T G+++ +G+ + +W I T + D +A L G
Sbjct: 2 LLAIDVRNTHTVVGLISGSGDHAKVVQQWRIRTE-----PEVTADELALTIDG--LIGDD 54

Query: 62 SADFVGIGMGSPGAVDRDTNTVTGAFNLNWKETQEVGSVVEKELGIPFAIDNDANVAALG 121
+ G S V + V W V GIP +DN V A
Sbjct: 55 AERLTGASGLS--TVPSVLHEVRVMLEQYWPNVPHVLIEPGVRTGIPLLVDNPKEVGA-- 110

Query: 122 ERWVGA 127
+R V
Sbjct: 111 DRIVNC 116


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
MGAS2096_Spy1279  HELNAPAPROT  151  1e-49  Helicobacter neutrophil-activating protein A family ...
		>HELNAPAPROT#Helicobacter neutrophil-activating protein A family signature.

Length = 153

Score = 151 bits (383), Expect = 1e-49
Identities = 49/154 (31%), Positives = 85/154 (55%), Gaps = 4/154 (2%)

Query: 19 KKEASKNEKT--KAVLNQAVADLSVAASIVHQVHWYMRGPGFLYLHPKMDELLDSLNANL 76
K E +K +T + LN +++ + S +H+ HWY++GP F LH K +EL D +
Sbjct: 2 KTENAKTNQTLVENSLNTQLSNWFLLYSKLHRFHWYVKGPHFFTLHEKFEELYDHAAETV 61

Query: 77 DEMSERLITIGGAPYSTLAEFSKHSKLDEAKGTYDKTVAQHLARLVEVYLYLSSLYQVGL 136
D ++ERL+ IGG P +T+ E+++H+ + + + + ++ + LV Y +SS + +
Sbjct: 62 DTIAERLLAIGGQPVATVKEYTEHASITDGGN--ETSASEMVQALVNDYKQISSESKFVI 119

Query: 137 DITDEEGDAGTNDLFTAAKTEAEKTIWMLQAERG 170
+ +E D T DLF E EK +WML + G
Sbjct: 120 GLAEENQDNATADLFVGLIEEVEKQVWMLSSYLG 153


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
MGAS2096_Spy1280  PREPILNPTASE  29  0.009  Type IV prepilin cysteine protease (C20) family sig...
		>PREPILNPTASE#Type IV prepilin cysteine protease (C20) family signature.

Length = 290

Score = 29.4 bits (66), Expect = 0.009
Identities = 42/160 (26%), Positives = 59/160 (36%), Gaps = 25/160 (15%)

Query: 70 SLIIILWASMVHWVSASYCYLLLFSLLFSLF--DWRSQ------EYPFILWLFSFVSLLL 121
+L+ + A + + LLL +L +L D P + F L
Sbjct: 118 ALLSVAVAMTLAPGWGTLAALLLTWVLVALTFIDLDKMLLPDQLTLPLLWGGLLFNLLGG 177

Query: 122 FYSIN---------YLSLILLLLGLLAHLRPFSIGAGDFFYLASLALVLDLTSLIWLIQL 172
F S+ YL L L +G GDF LA+L L +L ++ L
Sbjct: 178 FVSLGDAVIGAMAGYLVLWSLYWAFKLLTGKEGMGYGDFKLLAALGAWLGWQALPIVLLL 237

Query: 173 ASLAGITACLLL-------GIKRIPFIPYLSFGLFWIVLL 205
+SL G + L K IPF PYL+ WI LL
Sbjct: 238 SSLVGAFMGIGLILLRNHHQSKPIPFGPYLAIA-GWIALL 276


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
MGAS2096_Spy1284  HTHTETR  33  7e-04  TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 33.4 bits (76), Expect = 7e-04
Identities = 9/34 (26%), Positives = 19/34 (55%)

Query: 8 KLILQGGKAMVTIKQVAEEAGVSRSTVSRYISQK 41
+L Q G + ++ ++A+ AGV+R + + K
Sbjct: 22 RLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDK 55


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
MGAS2096_Spy1286  LPSBIOSNTHSS  153  2e-50  Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein signature.

Length = 166

Score = 153 bits (388), Expect = 2e-50
Identities = 58/157 (36%), Positives = 94/157 (59%), Gaps = 2/157 (1%)

Query: 5 IGLYTGSFDPVTNGHLDIVKRASGLFDQIYVGIFDNPTKKSYFKLEVRKAMLTQALADFT 64
+Y GSFDP+T GHLDI++R LFDQ+YV + NP K+ F ++ R + +A+A
Sbjct: 2 NAIYPGSFDPITFGHLDIIERGCRLFDQVYVAVLRNPNKQPMFSVQERLEQIAKAIAHLP 61

Query: 65 NVIVVTSHERLAIDVAKELRVTHLIRGLRNATDFEYEENLEYFNHLLAPNIETVYLISRN 124
N V + E L ++ A++ + ++RGLR +DFE E + N LA ++ETV+L +
Sbjct: 62 NAQVDSF-EGLTVNYARQRQAGAILRGLRVLSDFELELQMANTNKTLASDLETVFLTTST 120

Query: 125 KWQALSSSRVRELIHFQSSLEGLVPQSVIAQV-EKMN 160
++ LSSS V+E+ F ++E VP V A + ++ +
Sbjct: 121 EYSFLSSSLVKEVARFGGNVEHFVPSHVAAALYDQFH 157


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
MGAS2096_Spy1289  CARBMTKINASE  406  e-145  Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 406 bits (1046), Expect = e-145
Identities = 141/315 (44%), Positives = 204/315 (64%), Gaps = 6/315 (1%)

Query: 3 KQKIVVALGGNAIL--STDASAKAQQEALMSTSKSLVKLIKEGHEVIVTHGNGPQVGNLL 60
+++V+ALGGNA+ S + + + T++ + ++I G+EV++THGNGPQVG+LL
Sbjct: 2 GKRVVIALGGNALQQRGQKGSYEEMMDNVRKTARQIAEIIARGYEVVITHGNGPQVGSLL 61

Query: 61 LQQAAADSEKN-PAMPLDTCVAMTEGSIGFWLVNALDNELQAQGIQKEVAAVVTQVIVDA 119
L A + PA P+D AM++G IG+ + AL NEL+ +G++K+V ++TQ IVD
Sbjct: 62 LHMDAGQATYGIPAQPMDVAGAMSQGWIGYMIQQALKNELRKRGMEKKVVTIITQTIVDK 121

Query: 120 KDPAFENPTKPIGPFLTEEDAKKQMAESGASFKEDAGRGWRKVVPSPKPVGIKEANVIRS 179
DPAF+NPTKP+GPF EE AK+ E G KED+GRGWR+VVPSP P G EA I+
Sbjct: 122 NDPAFQNPTKPVGPFYDEETAKRLAREKGWIVKEDSGRGWRRVVPSPDPKGHVEAETIKK 181

Query: 180 LVDSGVVVVSAGGGGVPVVEDATSKSLTGVEAVIDKDFASQTLSELVDADLFIVLTGVDN 239
LV+ GV+V+++GGGGVPV+ + + GVEAVIDKD A + L+E V+AD+F++LT V+
Sbjct: 182 LVERGVIVIASGGGGVPVILED--GEIKGVEAVIDKDLAGEKLAEEVNADIFMILTDVNG 239

Query: 240 VYINFNKPDQAKLEEVTVSQMKEYITQDQFAPGSMLPKVEAAIAFVENKPNAKAIITSLE 299
+ + + L EV V ++++Y + F GSM PKV AAI F+E +AII LE
Sbjct: 240 AALYYGTEKEQWLREVKVEELRKYYEEGHFKAGSMGPKVLAAIRFIEW-GGERAIIAHLE 298

Query: 300 NIDNVLSANAGTQII 314
L GTQ++
Sbjct: 299 KAVEALEGKTGTQVL 313


23  MGAS2096_Spy1311  MGAS2096_Spy1345  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag          DNBias  CDNBias  %GCBias    Product
MGAS2096_Spy1311  0       19       3.857515   valyl-tRNA synthetase
MGAS2096_Spy1312  0       19       1.816481   hypothetical protein
MGAS2096_Spy1313  0       21       2.033396   ribosomal-protein-serine acetyltransferase
MGAS2096_Spy1314  -1      19       1.183271   putative cytoplasmic protein
MGAS2096_Spy1315  -2      19       1.046331   hypothetical protein
MGAS2096_Spy1316  -2      17       1.817133   *3-deoxy-7-phosphoheptulonate synthase
MGAS2096_Spy1317  -2      17       1.667378   3-dehydroquinate synthase
MGAS2096_Spy1318  -1      21       3.046919   hypothetical protein
MGAS2096_Spy1319  1       18       4.022302   putative acetate kinase
MGAS2096_Spy1320  2       17       3.238097   putative cytoplasmic protein
MGAS2096_Spy1321  0       18       2.964264   SAM-dependent methyltransferase
MGAS2096_Spy1322  1       18       2.595383   hypothetical protein
MGAS2096_Spy1323  1       15       2.264811   shikimate 5-dehydrogenase
MGAS2096_Spy1324  1       15       1.708968   beta-galactosidase
MGAS2096_Spy1325  0       16       -0.024525  two-component response regulator yesN
MGAS2096_Spy1326  -1      16       0.823756   two-component sensor kinase yesM
MGAS2096_Spy1327  1       19       0.668739   hypothetical protein
MGAS2096_Spy1328  2       17       2.099416   sugar-binding protein
MGAS2096_Spy1329  2       18       2.756980   sugar transport system permease
MGAS2096_Spy1330  1       17       3.046394   sugar transport system permease
MGAS2096_Spy1331  0       18       3.796725   RpiR family transcriptional regulator
MGAS2096_Spy1332  0       16       3.727519   hypothetical protein
MGAS2096_Spy1333  -2      17       4.115812   Beta-glucosidase
MGAS2096_Spy1334  -3      16       2.297112   hyaluronoglucosaminidase
MGAS2096_Spy1335  -3      15       2.276018   putative cytoplasmic protein
MGAS2096_Spy1336  -3      15       1.687938   GntR family transcriptional regulator
MGAS2096_Spy1337  -3      15       1.169077   hypothetical protein
MGAS2096_Spy1338  -3      14       0.613007   Alpha-mannosidase
MGAS2096_Spy1339  -3      13       -2.256794  sensory transduction protein kinase
MGAS2096_Spy1340  -1      16       -1.307459  tRNA (Uracil-5-) -methyltransferase
MGAS2096_Spy1341  2       19       -3.878295  recombination regulator RecX
MGAS2096_Spy1342  5       19       -4.511008  hypothetical protein
MGAS2096_Spy1343  1       19       -4.058002  hypothetical protein
MGAS2096_Spy1344  1       19       -4.313485  transposase
MGAS2096_Spy1345  0       21       -3.507857  ********hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
MGAS2096_Spy1311  RTXTOXIND  34  0.002  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 34.0 bits (78), Expect = 0.002
Identities = 12/73 (16%), Positives = 28/73 (38%), Gaps = 6/73 (8%)

Query: 805 YLPLADLLNVEEELVRLDKELAKWQKELDMVGKKLGNERFVANAKPEVVQKEKDKQADYQ 864
+ +L E + V EL ++ +L+ + ++ +AK E + + +
Sbjct: 248 AIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEI------LSAKEEYQLVTQLFKNEIL 301

Query: 865 AKYDATQERIAEM 877
K T + I +
Sbjct: 302 DKLRQTTDNIGLL 314


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
MGAS2096_Spy1325  HTHFIS  85  1e-19  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 84.9 bits (210), Expect = 1e-19
Identities = 31/133 (23%), Positives = 50/133 (37%), Gaps = 6/133 (4%)

Query: 4 KVLLVDDEYMILQGLTMIIDWQALGFEVVQTARSGKEALAYLTQYPVDVMISDVTMPGMT 63
+L+ DD+ I L + G++V + ++ D++++DV MP
Sbjct: 5 TILVADDDAAIRTVLNQAL--SRAGYDVR-ITSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 64 GLDLIEAAKTYHPQLQTLILSGYQEFSYVQKAMELETKGYLLKPVDKAELQAKMKQFKDC 123
DL+ K P L L++S F KA E YL KP D L +
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFD---LTELIGIIGRA 118

Query: 124 LDAQQAESIRQEA 136
L + + E
Sbjct: 119 LAEPKRRPSKLED 131


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
MGAS2096_Spy1326  PF06580  180  8e-54  Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 180 bits (458), Expect = 8e-54
Identities = 71/324 (21%), Positives = 133/324 (41%), Gaps = 34/324 (10%)

Query: 251 LSKAYRMQYNRSGDLLAYVAVRKSYLLAEAVRTVFVHGLVSLLLAWLLLQLL-FRVFRNY 309
L+ AYR R G L + + A + + V+ W LL + +
Sbjct: 55 LTHAYRSFIKRQG-WLKLNMGQIILRVLPACVVIGMVWFVANTSIWRLLAFINTKPVAFT 113

Query: 310 IQQVSEITDTVEMVAAGDLSLTIDNSHMELELYHISEAINQMLASIKAYIDEVYVLEVEQ 369
+ I V +V + M LY + +A ID+ +
Sbjct: 114 LPLALSIIFNVVVV-----------TFMWSLLYF---GWHFFKNYKQAEIDQWK-MASMA 158

Query: 370 RDAQMRALQSQINPHFLYNTLEYIRMYALSCQQEELADVIYAFASLLRNNI--SQDKMTT 427
++AQ+ AL++QINPHF++N L IR L + +++ + + L+R ++ S + +
Sbjct: 159 QEAQLMALKAQINPHFMFNALNNIRALILE-DPTKAREMLTSLSELMRYSLRYSNARQVS 217

Query: 428 LKEELAFCEKYIYLYQMRYPDSFAYHVKIDESIADLAIPKFVIQPLVENYFVHGIDYSRN 487
L +EL + Y+ L +++ D + +I+ +I D+ +P ++Q LVEN HGI
Sbjct: 218 LADELTVVDSYLQLASIQFEDRLQFENQINPAIMDVQVPPMLVQTLVENGIKHGIAQLPQ 277

Query: 488 DNALSIKALDETDHLLIQVLDNGRGISQERLADMEKRLQEHQTTGNSSIGLQNVYLRLFH 547
+ +K + + ++V + G L T ++ GLQNV RL
Sbjct: 278 GGKILLKGTKDNGTVTLEVENTG-------------SLALKNTKESTGTGLQNVRERLQM 324

Query: 548 HFRDRVSWSMAKEPNGGFIIQIRI 571
+ ++++ G + I
Sbjct: 325 LYGTEAQIKLSEKQ-GKVNAMVLI 347


24  MGAS2096_Spy1435  MGAS2096_Spy1457  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag          DNBias  CDNBias  %GCBias    Product
MGAS2096_Spy1435  2       15       0.411262   translation initiation factor IF-2
MGAS2096_Spy1436  3       14       -1.563354  hypothetical protein
MGAS2096_Spy1437  2       15       0.242290   putative cytoplasmic protein
MGAS2096_Spy1438  2       15       0.649041   transcription elongation factor NusA
MGAS2096_Spy1439  3       18       0.574817   hypothetical protein
MGAS2096_Spy1440  2       18       1.246017   phage protein
MGAS2096_Spy1441  2       19       0.896733   deoxyribonuclease precursor
MGAS2096_Spy1442  2       22       3.089448   phage-associated cell wall hydrolase
MGAS2096_Spy1443  4       20       1.923269   phage protein
MGAS2096_Spy1444  3       20       1.918531   phage protein
MGAS2096_Spy1445  2       19       1.621731   phage protein
MGAS2096_Spy1446  2       18       1.170320   phage protein
MGAS2096_Spy1447  2       17       2.435824   phage infection protein
MGAS2096_Spy1448  2       19       2.072917   phage protein
MGAS2096_Spy1449  2       19       2.054874   hyaluronoglucosaminidase
MGAS2096_Spy1450  2       17       2.106148   phage endopeptidase
MGAS2096_Spy1451  3       17       2.124141   phage protein
MGAS2096_Spy1452  4       17       2.579636   phage protein
MGAS2096_Spy1453  4       20       -0.729232  phage protein
MGAS2096_Spy1454  3       20       -0.661755  phage protein
MGAS2096_Spy1455  1       19       -0.050528  phage protein
MGAS2096_Spy1456  3       18       0.155272   phage protein
MGAS2096_Spy1457  3       17       0.253145   phage protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
MGAS2096_Spy1435  TCRTETOQM  82  5e-18  Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 81.8 bits (202), Expect = 5e-18
Identities = 46/139 (33%), Positives = 65/139 (46%), Gaps = 18/139 (12%)

Query: 461 IMGHVDHGKTTLLDTLRNSRVATGEAG------------------GITQHIGAYQIEEAG 502
++ HVD GKTTL ++L + A E G GIT G +
Sbjct: 8 VLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITSFQWEN 67

Query: 503 KKITFLDTPGHAAFTSMRARGASVTDITILIVAADDGVMPQTIEAINHSKAAGVPIIVAI 562
K+ +DTPGH F + R SV D IL+++A DGV QT + + G+P I I
Sbjct: 68 TKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGIPTIFFI 127

Query: 563 NKIDKPGANPERVIAELAE 581
NKID+ G + V ++ E
Sbjct: 128 NKIDQNGIDLSTVYQDIKE 146


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
MGAS2096_Spy1442  FLGFLGJ  92  4e-23  Flagellar protein FlgJ signature.
		>FLGFLGJ#Flagellar protein FlgJ signature.

Length = 313

Score = 92.5 bits (229), Expect = 4e-23
Identities = 44/125 (35%), Positives = 63/125 (50%), Gaps = 8/125 (6%)

Query: 23 SLTAAQAILESGWGKHA-------PHNALFGIKADASWTGKSFNTKTQEEYQPGIVTDIV 75
L AQA LESGWG+ P LFG+KA +W G T E Y+ G +
Sbjct: 172 HLILAQAALESGWGQRQIRRENGEPSYNLFGVKASGNWKGPVTEITTTE-YENGEAKKVK 230

Query: 76 DRFRAYDSWTDSIIDHGKFLNDNPRYQSVVGETDYKKACHAIKDAGYATASGYAELLIQL 135
+FR Y S+ +++ D+ L NPRY +V ++ A++DAGYAT YA L +
Sbjct: 231 AKFRVYSSYLEALSDYVGLLTRNPRYAAVTTAASAEQGAQALQDAGYATDPHYARKLTNM 290

Query: 136 IEEND 140
I++
Sbjct: 291 IQQMK 295


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
MGAS2096_Spy1449  PF07212  44  4e-07  Hyaluronoglucosaminidase
		>PF07212#Hyaluronoglucosaminidase

Length = 336

Score = 44.3 bits (104), Expect = 4e-07
Identities = 29/99 (29%), Positives = 45/99 (45%), Gaps = 16/99 (16%)

Query: 3 LEERIPIKVLFDRKDAAEWQKLNPVVDDGELVVELDTHRLKVGDGKLNYNDLPYYEGPQG 62
+ E IP++V F R A EW + + ++ + E+ E DT K GDGK ++ L Y P
Sbjct: 1 MTETIPLRVQFKRMTAEEWTRSDVILLESEIGFETDTGYAKFGDGKNQFSKLKYLNKP-- 58

Query: 63 ESITKVQLSENGDLSVWIGDKET--KLGNIKGQKGDKGT 99
DL + +ET K+ ++ K DK
Sbjct: 59 ------------DLGAFAQKEETNSKITKLESSKADKNA 85


25  MGAS2096_Spy1471  MGAS2096_Spy1486  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag          DNBias  CDNBias  %GCBias    Product
MGAS2096_Spy1471  3       26       3.452744   phage protein
MGAS2096_Spy1472  3       26       3.674217   phage protein
MGAS2096_Spy1473  3       25       3.483011   phage-related DNA helicase
MGAS2096_Spy1474  3       25       3.956563   hypothetical protein
MGAS2096_Spy1475  4       26       3.932865   DNA primase
MGAS2096_Spy1476  3       26       4.039815   phage-related DNA polymerase
MGAS2096_Spy1477  3       22       2.340083   phage protein
MGAS2096_Spy1478  3       21       1.484906   phage protein
MGAS2096_Spy1479  3       22       1.927203   phage protein
MGAS2096_Spy1480  5       21       -0.605233  phage protein
MGAS2096_Spy1481  4       22       -0.507504  phage protein
MGAS2096_Spy1482  6       19       -2.317861  phage protein
MGAS2096_Spy1483  2       20       -2.497803  phage protein
MGAS2096_Spy1484  4       21       -2.996441  phage protein
MGAS2096_Spy1485  4       21       -3.234482  phage protein
MGAS2096_Spy1486  2       23       -3.066526  phage protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
MGAS2096_Spy1479  MICOLLPTASE  29  0.036  Microbial collagenase metalloprotease (M9) signature.
		>MICOLLPTASE#Microbial collagenase metalloprotease (M9) signature.

Length = 1104

Score = 29.3 bits (65), Expect = 0.036
Identities = 16/72 (22%), Positives = 32/72 (44%), Gaps = 2/72 (2%)

Query: 123 TSDVVILADGVIEIIDLKYGKGMPVSANQNPQMGLYALGAYASYDMV--YDFDRIKMTII 180
+ + ++ D +E+I+ ANQ + + G + D Y FD K +
Sbjct: 850 SKKIKVVEDKPVEVINESEPNNDFEKANQIAKSNMLVKGTLSEEDYSDKYYFDVAKKGNV 909

Query: 181 QPRLDSVSSVDI 192
+ L++++SV I
Sbjct: 910 KITLNNLNSVGI 921


26  MGAS2096_Spy1576  MGAS2096_Spy1585  Y  N  N  Genomic Island
LocusTag          DNBias  CDNBias  %GCBias    Product
MGAS2096_Spy1576  2       26       0.350945   magnesium/cobalt transporter CorA
MGAS2096_Spy1577  0       24       0.465387   hypothetical protein
MGAS2096_Spy1578  0       23       -1.161455  30S ribosomal protein S18
MGAS2096_Spy1579  -1      18       -1.789730  single-stranded DNA-binding protein
MGAS2096_Spy1580  -1      16       -3.489176  30S ribosomal protein S6
MGAS2096_Spy1581  -1      14       -3.150294  hypothetical protein
MGAS2096_Spy1582  -1      14       -3.112677  A/G-specific adenine glycosylase
MGAS2096_Spy1583  -1      14       -3.883503  transcriptional regulator
MGAS2096_Spy1584  -1      14       -3.375452  thioredoxin
MGAS2096_Spy1585  -1      14       -3.128147  phosphatidylglycerophosphatase B
27  MGAS2096_Spy1646  MGAS2096_Spy1667  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag          DNBias  CDNBias  %GCBias    Product
MGAS2096_Spy1646  3       19       -6.487478  type I restriction-modification system
MGAS2096_Spy1647  3       18       -7.289792  Type I restriction-modification system
MGAS2096_Spy1649  4       24       -8.858357  hypothetical protein
MGAS2096_Spy1648  5       26       -8.659066  hypothetical protein
MGAS2096_Spy1650  5       24       -8.481314  transcriptional regulatory protein degU
MGAS2096_Spy1651  5       24       -8.412130  sensory transduction protein kinase
MGAS2096_Spy1652  3       17       -6.473881  ABC transporter permease
MGAS2096_Spy1653  1       18       -4.062825  ABC transporter ATP-binding protein
MGAS2096_Spy1654  1       18       -3.537373  lantibiotic ABC transporter ATP-binding protein
MGAS2096_Spy1655  2       20       -3.106886  Serine (threonine) dehydratase
MGAS2096_Spy1656  2       25       -1.603743  lantibiotic salivaricin A
MGAS2096_Spy1657  2       25       -1.561586  6-phospho-beta-galactosidase
MGAS2096_Spy1658  2       27       -1.487556  PTS system lactose-specific transporter subunit
MGAS2096_Spy1659  2       23       -2.678640  PTS system lactose-specific transporter subunit
MGAS2096_Spy1660  2       22       -2.908889  tagatose 1,6-diphosphate aldolase
MGAS2096_Spy1661  0       20       -3.474085  tagatose-6-phosphate kinase
MGAS2096_Spy1662  0       18       -2.720031  galactose-6-phosphate isomerase subunit LacB
MGAS2096_Spy1663  2       18       -2.732051  galactose-6-phosphate isomerase subunit LacA
MGAS2096_Spy1664  3       22       -1.772389  lactose phosphotransferase system repressor
MGAS2096_Spy1665  3       32       0.124840   DNA-damage-inducible protein J
MGAS2096_Spy1666  2       31       1.117602   putative cytoplasmic protein
MGAS2096_Spy1667  2       29       1.916819   DNA integration/recombination/inversion protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
MGAS2096_Spy1650  HTHFIS  46  3e-08  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 46.0 bits (109), Expect = 3e-08
Identities = 21/118 (17%), Positives = 51/118 (43%), Gaps = 6/118 (5%)

Query: 2 KILLIDDHRLFAKSIQLLFQQYD-EVDVIDTITSHFNDVTIDLSKYDIILLDINLTNISK 60
IL+ DD + + +V + + + + D+++ D+ +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWI--AAGDGDLVVTDVVM---PD 59

Query: 61 ENGLEIAKELIQSTPHLKVVMLTGYVKSIYRERAKKVGAYGFVDKNIDPKQLISILKK 118
EN ++ + ++ P L V++++ + +A + GAY ++ K D +LI I+ +
Sbjct: 60 ENAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR 117


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
MGAS2096_Spy1664  ARGREPRESSOR  30  0.006  Bacterial arginine repressor signature.
		>ARGREPRESSOR#Bacterial arginine repressor signature.

Length = 149

Score = 29.8 bits (67), Expect = 0.006
Identities = 21/85 (24%), Positives = 38/85 (44%), Gaps = 11/85 (12%)

Query: 1 MKKKERHEKILDILKVDGFIKVKDIIDEM-----NISDMTARRDLDTLADKGLL-IRTHG 54
M K +RH KI +I+ + +++D + N++ T RD+ L L+ + T+
Sbjct: 1 MNKGQRHIKIREIITANEIETQDELVDILKKDGYNVTQATVSRDIKEL---HLVKVPTNN 57

Query: 55 GAQYLDYSSAKDEGHEKTHTEKKVL 79
G+ YS D+ K+ L
Sbjct: 58 GSYK--YSLPADQRFNPLSKLKRSL 80


28  MGAS2096_Spy1679  MGAS2096_Spy1727  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag          DNBias  CDNBias  %GCBias    Product
MGAS2096_Spy1679  0       16       3.061790   hypothetical protein
MGAS2096_Spy1680  0       18       3.916896   putative cytoplasmic protein
MGAS2096_Spy1681  -1      19       3.724958   Serine acetyltransferase
MGAS2096_Spy1682  -1      17       3.046728   hypothetical protein
MGAS2096_Spy1683  -1      17       3.084749   polynucleotide phosphorylase/polyadenylase
MGAS2096_Spy1684  0       16       1.929522   putative translaldolase
MGAS2096_Spy1685  -1      17       2.011782   PTS system ascorbate-specific transporter
MGAS2096_Spy1686  -2      19       0.844092   PTS system transporter subunit IIB
MGAS2096_Spy1687  -1      17       1.120842   BigG family transcription antiterminator / PTS
MGAS2096_Spy1688  0       21       1.319488   hypothetical protein
MGAS2096_Spy1689  0       17       1.541180   30S ribosomal protein S15
MGAS2096_Spy1690  -2      17       3.426747   hypothetical protein
MGAS2096_Spy1691  -2      16       3.737271   putative transcriptional regulator
MGAS2096_Spy1692  -2      14       3.488157   peptide deformylase
MGAS2096_Spy1693  -1      14       3.360025   NADPH-dependent FMN reductase family protein
MGAS2096_Spy1694  0       15       3.210785   MarR family transcriptional regulator
MGAS2096_Spy1695  0       15       3.187004   DNA polymerase III PolC
MGAS2096_Spy1696  -2      13       2.272361   prolyl-tRNA synthetase
MGAS2096_Spy1697  -2      13       2.362198   M50 family membrane endopeptidase
MGAS2096_Spy1698  -1      13       2.497353   phosphatidate cytidylyltransferase
MGAS2096_Spy1699  -1      16       3.167322   undecaprenyl pyrophosphate synthase
MGAS2096_Spy1700  -1      15       3.497793   preprotein translocase subunit YajC
MGAS2096_Spy1701  -1      15       3.887861   thioredoxin
MGAS2096_Spy1703  -2      15       3.567886   pullulanase
MGAS2096_Spy1702  -2      18       3.486177   hypothetical protein
MGAS2096_Spy1704  -2      20       4.096323   glucan 1,6-alpha-glucosidase
MGAS2096_Spy1705  -1      19       5.183464   sugar ABC transporter ATP-binding protein
MGAS2096_Spy1706  -1      21       5.303109   hypothetical protein
MGAS2096_Spy1707  -1      23       5.476323   streptokinase
MGAS2096_Spy1708  0       24       6.174698   D-tyrosyl-tRNA(Tyr) deacylase
MGAS2096_Spy1709  0       21       5.934786   GTP pyrophosphokinase /
MGAS2096_Spy1710  2       20       5.964640   collagen-like surface protein
MGAS2096_Spy1711  0       20       4.849732   NrdI protein
MGAS2096_Spy1712  0       18       4.506456   NrdI protein
MGAS2096_Spy1713  -1      17       3.835515   endonuclease/exonuclease/phosphatase family
MGAS2096_Spy1714  -1      17       3.764514   PTS system glucose-specific transporter subunit
MGAS2096_Spy1715  -1      16       1.964164   16S ribosomal RNA methyltransferase RsmE
MGAS2096_Spy1716  0       18       1.184347   50S ribosomal protein L11 methyltransferase
MGAS2096_Spy1717  3       19       -1.214025  hypothetical protein
MGAS2096_Spy1718  5       21       -2.385300  RhuM
MGAS2096_Spy1719  1       22       1.618648   ArpU family phage encoded transcriptional
MGAS2096_Spy1720  0       23       2.189782   transposase
MGAS2096_Spy1721  0       25       3.079343   hypothetical protein
MGAS2096_Spy1722  -1      23       2.637013   hypothetical protein
MGAS2096_Spy1723  -2      22       2.530895   hypothetical protein
MGAS2096_Spy1724  -2      22       2.324946   Para-aminobenzoate synthetase component I /
MGAS2096_Spy1725  1       18       -1.269707  anthranilate synthase component II
MGAS2096_Spy1726  0       18       -2.660407  recombination factor protein RarA
MGAS2096_Spy1727  -2      15       -3.337505  *acetyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
MGAS2096_Spy1688  60KDINNERMP  25  0.019  60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 24.5 bits (53), Expect = 0.019
Identities = 8/17 (47%), Positives = 10/17 (58%)

Query: 19 FFIFDMLFVSFHIWLEW 35
+ +LFVSF IW W
Sbjct: 7 LLVIALLFVSFMIWQAW 23


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
MGAS2096_Spy1697  PF04605  30  0.008  Virulence-associated protein D (VapD)
		>PF04605#Virulence-associated protein D (VapD)

Length = 125

Score = 29.8 bits (67), Expect = 0.008
Identities = 8/44 (18%), Positives = 17/44 (38%), Gaps = 2/44 (4%)

Query: 227 INGYKVTSWNDLTEAV-DLATRD-LGPSQTIKVTYKSHQRLKTV 268
+ ++ L E + DL +D + +Q+LK +
Sbjct: 80 FDITEIGEQYSLKETIQDLCAKDFHQKLKEFTEKTPKNQKLKDL 123


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
MGAS2096_Spy1705  PF05272  35  6e-04  Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 34.7 bits (79), Expect = 6e-04
Identities = 14/56 (25%), Positives = 20/56 (35%), Gaps = 9/56 (16%)

Query: 34 IVFVGPSGCGKSTTLRMIAGLEDISEGELKIGGEVVNDKSPKDRDIAMVFQNYALY 89
+V G G GKST + + GL+ S+ IG +D Y
Sbjct: 599 VVLEGTGGIGKSTLINTLVGLDFFSDTHFDIG---------TGKDSYEQIAGIVAY 645


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
MGAS2096_Spy1706  HTHFIS  34  7e-04  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 34.0 bits (78), Expect = 7e-04
Identities = 10/30 (33%), Positives = 19/30 (63%)

Query: 243 ALWSEHGNLVQTAQRLYIHRNSLQYKLDKF 272
AL + GN ++ A L ++RN+L+ K+ +
Sbjct: 444 ALTATRGNQIKAADLLGLNRNTLRKKIREL 473


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
MGAS2096_Spy1707  STREPKINASE  792  0.0  Streptococcus streptokinase protein signature.
		>STREPKINASE#Streptococcus streptokinase protein signature.

Length = 440

Score = 792 bits (2047), Expect = 0.0
Identities = 378/440 (85%), Positives = 401/440 (91%)

Query: 1 MKNYLSFGMFALLFALTFGTVKPVQAIAGPEWLLDRPSVNNSQLVVSVAGTVEGTNQEIS 60
MKNYLSFGMFALLFALTFGTV VQAIAGPEWLLDRPSVNNSQLVVSVAGTVEGTNQ+IS
Sbjct: 1 MKNYLSFGMFALLFALTFGTVNSVQAIAGPEWLLDRPSVNNSQLVVSVAGTVEGTNQDIS 60

Query: 61 LKFFEIDLTSRPAHGGKTEQGLSPKSKPFATNSSAMPHKLEKADLLKAIQEQLIANVHSN 120
LKFFEIDLTSRPAHGGKTEQGLSPKSKPFAT+S AM HKLEKADLLKAIQEQLIANVHSN
Sbjct: 61 LKFFEIDLTSRPAHGGKTEQGLSPKSKPFATDSGAMSHKLEKADLLKAIQEQLIANVHSN 120

Query: 121 DGYFEVIDFASDATITDRNGKVYFADRDDSVTLPTQPVQEFLLSGHVRVRPYQPKAVHNS 180
D YFEVIDFASDATITDRNGKVYFAD+D SVTLPTQPVQEFLLSGHVRVRPY+ K + N
Sbjct: 121 DDYFEVIDFASDATITDRNGKVYFADKDGSVTLPTQPVQEFLLSGHVRVRPYKEKPIQNQ 180

Query: 181 AERVNVNYEVSFVSETGNLDFTPSLKERYHLTTLAVGDSLSSQELAAIAQFILSKEHPDY 240
A+ V+V Y V F + DF P LK+ L TLA+GD+++SQEL A AQ IL+K HP Y
Sbjct: 181 AKSVDVEYTVQFTPLNPDDDFRPGLKDTKLLKTLAIGDTITSQELLAQAQSILNKNHPGY 240

Query: 241 IITKRDSSIVTHDNDIFRTILPMDQEFTYHIKDREQAYKANSKTGIVEKTNNTDLISEKY 300
I +RDSSIVTHDNDIFRTILPMDQEFTY +K+REQAY+ N K+G+ E+ NNTDLISEKY
Sbjct: 241 TIYERDSSIVTHDNDIFRTILPMDQEFTYRVKNREQAYRINKKSGLNEEINNTDLISEKY 300

Query: 301 YVLKKGEEPYDPFDRSHLKLFTIKYVDVDTNELLKSEQLLTASERNLDFRDLYDPRDKAK 360
YVLKKGE+PYDPFDRSHLKLFTIKYVDVDTNELLKSEQLLTASERNLDFRDLYDPRDKAK
Sbjct: 301 YVLKKGEKPYDPFDRSHLKLFTIKYVDVDTNELLKSEQLLTASERNLDFRDLYDPRDKAK 360

Query: 361 LLYNNLDAFGIMDYTLTGKVEDNHNDTNRIITVYMGKRPEGENASYHLAYDKDRYTEEER 420
LLYNNLDAFGIMDYTLTGKVEDNH+DTNRIITVYMGKRPEGENASYHLAYDKDRYTEEER
Sbjct: 361 LLYNNLDAFGIMDYTLTGKVEDNHDDTNRIITVYMGKRPEGENASYHLAYDKDRYTEEER 420

Query: 421 EVYSYLRYTGTPIPDNPKDK 440
EVYSYLRYTGTPIPDNP DK
Sbjct: 421 EVYSYLRYTGTPIPDNPNDK 440


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
MGAS2096_Spy1710  GPOSANCHOR  60  9e-12  Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 59.7 bits (144), Expect = 9e-12
Identities = 42/124 (33%), Positives = 54/124 (43%), Gaps = 3/124 (2%)

Query: 305 EKAPEKSPEVTPTPEMPEQP---GEKAPEKSPEVTPTPEMPEQPGEKAPEKSKEVTPAPE 361
K E+S ++T + Q E K E+ + KA +
Sbjct: 416 NKELEESKKLTEKEKAELQAKLEAEAKALKEKLAKQAEELAKLRAGKASDSQTPDAKPGN 475

Query: 362 KPADKEANQTPERRNGNMAKTPVANNHRRLPATGEQANPFFTAAAVAVMTTAGVLAVTKR 421
K + N K P+ R+LP+TGE ANPFFTAAA+ VM TAGV AV KR
Sbjct: 476 KAVPGKGQAPQAGTKPNQNKAPMKETKRQLPSTGETANPFFTAAALTVMATAGVAAVVKR 535

Query: 422 KENN 425
KE N
Sbjct: 536 KEEN 539


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
MGAS2096_Spy1726  HTHFIS  34  0.001  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 33.7 bits (77), Expect = 0.001
Identities = 28/155 (18%), Positives = 46/155 (29%), Gaps = 38/155 (24%)

Query: 8 RMRPKTISEVIGQKHLVGEGKIIRRMVE-----ANRLSSMILYGPPGIGKTSIASAIAGT 62
R K + LVG ++ + ++++ G G GK +A A+
Sbjct: 124 RRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALHDY 183

Query: 63 TRYAFRTF--------------------------NATIDSKKRLQEIAEEAKFSGGLVLL 96
+ F A S R ++ + G L
Sbjct: 184 GKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQ-------AEGGTLF 236

Query: 97 LDEIHRLDKTKQDFLLPLLENGTIIMIGATTENPF 131
LDEI + Q LL +L+ G +G T
Sbjct: 237 LDEIGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRS 271


29  MGAS2096_Spy1748  MGAS2096_Spy1778  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag          DNBias  CDNBias  %GCBias    Product
MGAS2096_Spy1748  2       21       0.254514   transcriptional regulator
MGAS2096_Spy1749  0       22       0.875969   hypothetical protein
MGAS2096_Spy1750  -1      22       1.009644   immunogenic protein
MGAS2096_Spy1751  -1      22       1.002950   immunogenic secreted protein
MGAS2096_Spy1752  -1      22       -0.300766  two-component system histidine kinase
MGAS2096_Spy1753  -1      21       -0.517953  two-component response regulator
MGAS2096_Spy1754  -2      22       -0.282377  ABC transporter permease
MGAS2096_Spy1755  0       23       2.761563   ABC transporter ATP-binding protein
MGAS2096_Spy1756  1       23       3.644006   periplasmic protein of efflux system
MGAS2096_Spy1758  2       15       2.436585   hypothetical protein
MGAS2096_Spy1757  2       15       2.735028   hypothetical protein
MGAS2096_Spy1759  1       20       3.153015   hypothetical protein
MGAS2096_Spy1760  2       21       3.200448   fibronectin-binding protein
MGAS2096_Spy1761  1       20       1.875217   fibronectin-binding protein
MGAS2096_Spy1762  1       21       0.915864   fibronectin-binding protein
MGAS2096_Spy1763  2       32       2.281486   putative cytoplasmic protein
MGAS2096_Spy1764  3       34       1.001752   foldase PrsA
MGAS2096_Spy1765  0       20       -1.941244  hypothetical protein
MGAS2096_Spy1766  -1      22       -2.334470  streptopain
MGAS2096_Spy1767  0       21       -2.253239  streptopain
MGAS2096_Spy1768  0       19       -3.820090  hypothetical protein
MGAS2096_Spy1769  -1      20       -2.387948  hypothetical protein
MGAS2096_Spy1770  0       17       0.683502   transcriptional regulator
MGAS2096_Spy1771  1       18       3.432404   streptodornase
MGAS2096_Spy1772  0       23       4.355141   streptodornase
MGAS2096_Spy1773  2       24       4.424812   hypothetical protein
MGAS2096_Spy1774  3       24       4.399135   low temperature requirement C protein
MGAS2096_Spy1775  2       23       4.161858   glycerol dehydrogenase
MGAS2096_Spy1776  1       20       3.610243   fructose-6-phosphate aldolase
MGAS2096_Spy1777  0       21       3.417638   formate acetyltransferase
MGAS2096_Spy1778  2       18       2.220434   PTS system cellobiose-specific transporter
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
MGAS2096_Spy1748  PF05043  562  0.0  Transcriptional activator
		>PF05043#Transcriptional activator

Length = 493

Score = 562 bits (1449), Expect = 0.0
Identities = 109/473 (23%), Positives = 217/473 (45%), Gaps = 20/473 (4%)

Query: 34 ELSKALNISMLTLQTCLTNMQ-FMKEVGGITYKNGYITIWYHQHCGLQEVYQKALRHSQS 92
EL++ LN + ++ L++++ ++ + NG I ++ VY +HS
Sbjct: 30 ELAELLNCTERAVKDDLSHVKSAFPDLIFHSSTNGIRIINT-DDSDIEMVYHHFFKHSTH 88

Query: 93 FKLLETLFFRDFNSLEELAEELFVSISTLKRLIKKTNAYLTHTFGITILTSPVQVSGDEH 152
F +LE +FF + E + +E ++S S+L R+I + N + F + +PVQ+ G+E
Sbjct: 89 FSILEFIFFNEGCQAESICKEFYISSSSLYRIISQINKVIKRQFQFEVSLTPVQIIGNER 148

Query: 153 QIRLFYLKYFSEAYKISEWPFGEMLNLKNCERLLSLMIKEVDVRVNFTLFQHLKILSSVN 212
IR F+ +YFSE Y EWPF E + + +LL L+ KE +N + + LK+L N
Sbjct: 149 DIRYFFAQYFSEKYYFLEWPF-ENFSSEPLSQLLELVYKETSFPMNLSTHRMLKLLLVTN 207

Query: 213 LIRYYKGYSAVYDNKKTSHRFSQLIQSSLEIQDLSRLFYLKFGLYLDETTIAEMFSNHVN 272
L R G+ D + + + + I+ +++ F ++ + LDE + ++F ++
Sbjct: 208 LYRIKFGHFMEVDKDSFNDQSLDFLMQAEGIEGVAQSFESEYNISLDEEVVCQLFVSYFQ 267

Query: 273 DQLEIGYAF--DSIKQDSPTGCRKVTNWVHLL----DELEIRLNLSVTNKYEVAVILHNT 326
I + +K+DS V HLL D++ ++ + + NK + LHNT
Sbjct: 268 KMFFIDESLFMKCVKKDS-----YVEKSYHLLSDFIDQISVKYQIEIENKDNLIWHLHNT 322

Query: 327 TVLKEEDITANYLFFDYKKSYLNFYKQEHPHLYKACVAGVEKLMRSEKEPISKELTNQLI 386
L +++ ++ FD K + + ++ P + + + + S + N L
Sbjct: 323 AHLYRQELFTEFILFDQKGNTIRNFQNIFPKFVSDVKKELSHYLETLEVCSSSMMVNHLS 382

Query: 387 YAFFITWENSFLKVNQKDEKIRLLVI----ERSFNSVGNFLKKYIGEFFSITNFNELDAL 442
Y F ++ + + Q K+++LV+ + V L Y F + + EL+
Sbjct: 383 YTFITHTKHLVINLLQNQPKLKVLVMSNFDQYHAKFVAETLSYYCSNNFELEVWTELELS 442

Query: 443 TIDLEEIEKQYDVIVTDVMVGKSDELEIFFFYKMIPEAIIDKLNAFLNISFAD 495
LE + YD+I+++ ++ + + + + ++I LNA + I +
Sbjct: 443 KESLE--DSPYDIIISNFIIPPIENKRLIYSNNINTVSLIYLLNAMMFIRLDE 493


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
MGAS2096_Spy1751  PF03544  34  0.001  Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 33.8 bits (77), Expect = 0.001
Identities = 14/96 (14%), Positives = 28/96 (29%), Gaps = 7/96 (7%)

Query: 107 VTKQPDSSDQSTPS------PKDQSSQKESQNKDGRPTPSPDQQKDQTPDKTPEKSADKT 160
V + + + P P D ++ P P+ + + P+ E
Sbjct: 37 VHQVIELPAPAQPISVTMVAPADLE-PPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIE 95

Query: 161 PEKGPEKATEKTPEPNRDAPKPIQPPLAAAAPVFAP 196
K K K + + ++P + A F
Sbjct: 96 KPKPKPKPKPKPVKKVEQPKRDVKPVESRPASPFEN 131



Score = 33.8 bits (77), Expect = 0.001
Identities = 20/84 (23%), Positives = 30/84 (35%), Gaps = 6/84 (7%)

Query: 139 PSPDQQKD---QTPDKTPEK---SADKTPEKGPEKATEKTPEPNRDAPKPIQPPLAAAAP 192
P+P Q P P PE E PEP ++AP I+ P P
Sbjct: 44 PAPAQPISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKP 103

Query: 193 VFAPWRESDKDLSKLKPSSRSSAA 216
P ++ ++ +KP A+
Sbjct: 104 KPKPVKKVEQPKRDVKPVESRPAS 127


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits          Score  E-value  Comments
MGAS2096_Spy1752  MECHCHANNEL   32     0.002    Bacterial mechano-sensitive ion channel signature.
		>MECHCHANNEL#Bacterial mechano-sensitive ion channel signature.

Length = 136

Score = 32.1 bits (73), Expect = 0.002
Identities = 14/62 (22%), Positives = 28/62 (45%), Gaps = 8/62 (12%)

Query: 10 VINGLIIVVVTSILLVLYFAMPIYYTKVKDKEVKREFDQTSKQIKGKTVTEIRDILTKKI 69
V + LI+ ++ A+ + + KE +K+ +TEIRD+L ++
Sbjct: 82 VFDFLIVA------FAIFMAIKLINKLNRKKEEPAAAPAPTKEEV--LLTEIRDLLKEQN 133

Query: 70 NK 71
N+
Sbjct: 134 NR 135


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits          Score  E-value  Comments
MGAS2096_Spy1753  HTHFIS        83     1e-20    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 83.0 bits (205), Expect = 1e-20
Identities = 31/128 (24%), Positives = 55/128 (42%), Gaps = 1/128 (0%)

Query: 3 KILVVEDDDTISQVICEFLKANNYDPDCVFDGQAALDKWQTTSYDLIILDIMLPSLSGLE 62
ILV +DD I V+ + L YD + DL++ D+++P + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 63 VLKTIRKT-SDVPIIMLTALDDEYTQLVSFNHLISDYVTKPFSPLILIKRIENVLRVSTP 121
+L I+K D+P+++++A + T + + DY+ KPF LI I L
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 122 DEKRQIGD 129
+ D
Sbjct: 125 RPSKLEDD 132


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits          Score  E-value  Comments
MGAS2096_Spy1756  RTXTOXIND     54     4e-10    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 54.4 bits (131), Expect = 4e-10
Identities = 34/144 (23%), Positives = 55/144 (38%), Gaps = 10/144 (6%)

Query: 60 DISLTLAGEVTANNSSKVKIDSSKGEVKDVFVKKGDVVKVGQPLFSYETSQRLTAQSSEF 119
+I T G++T + SK VK++ VK+G+ V+ G L +LTA +E
Sbjct: 81 EIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLL------KLTALGAEA 134

Query: 120 DVQTKANQLQVAKTNAALKWETYNRKVNEINTLKSRYNTAPDESLLEQIRSAEDSVSQAL 179
D + Q + A L+ Y I K PDE + + E +L
Sbjct: 135 DTL----KTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSL 190

Query: 180 SDAKTADSDVKTAQIELDKANATA 203
+ + + Q EL+ A
Sbjct: 191 IKEQFSTWQNQKYQKELNLDKKRA 214



Score = 39.4 bits (92), Expect = 2e-05
Identities = 28/180 (15%), Positives = 61/180 (33%), Gaps = 16/180 (8%)

Query: 120 DVQTKANQLQVAKTNAALKWETYNRKVNEINTLKSRYNTAPDESL---LEQIRSAEDSVS 176
D + ++ +AK + Y VNE+ KS+ E L E + +
Sbjct: 239 DFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKN 298

Query: 177 QALSDAKTADSDVKTAQIELDKANATATMEKGKLEYDTVKSDTAGTIVSLNTDLPNQSKS 236
+ L + ++ +EL K + + +++ + + L
Sbjct: 299 EILDKLRQTTDNIGLLTLELAKNEE-------RQQASVIRAPVSVKVQQLKVHTEGGVV- 350

Query: 237 KKENETFMEII-DKSKMLVKGNISEFDRDKLKIGQKVEV-IDRKDNSK--KWTGKVTQVG 292
ET M I+ + + V + D + +GQ + ++ ++ GKV +
Sbjct: 351 -TTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNIN 409


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits          Score  E-value  Comments
MGAS2096_Spy1760  IGASERPTASE   45     6e-07    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 45.4 bits (107), Expect = 6e-07
Identities = 48/262 (18%), Positives = 80/262 (30%), Gaps = 13/262 (4%)

Query: 361 KPAEDERSSQATEPANPSKENSSTDASQGSHDSENPATDSPSQPQASDQGNHQSQVPNAK 420
+ QA P+ PS + PAT S + ++ +S+
Sbjct: 995 TNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKN 1054

Query: 421 PQTDKTDTP-NAPVPPQNTPNAPAIPQDTPKVPEAGGQSGPAGNAEEKAPDSGPKESGQ- 478
Q T N V + N A Q T +V ++G ++ E K + KE
Sbjct: 1055 EQDATETTAQNREVAKEAKSNVKANTQ-TNEVAQSGSETKETQTTETKETATVEKEEKAK 1113

Query: 479 ---SSSKESPSVGEDTTPNQPDVLVGGQREPIDITEDTITDAPPTVSGHNASTQPQSVVE 535
++E P V +P Q Q E + + + PTV+ +Q + +
Sbjct: 1114 VETEKTQEVPKVTSQVSPKQE------QSETVQPQAEPARENDPTVNIKEPQSQTNTTAD 1167

Query: 536 DTAPQRPDVLVGGQSEPIDITQDTQPGMSGSNDATVINEDTKPKRVFHFDNKEPQASEKA 595
P + Q T +T + N T+P NK ++
Sbjct: 1168 TEQPAKETSSNVEQPVTESTTVNTGNSVV-ENPENTTPATTQPTVNSESSNKPKNRHRRS 1226

Query: 596 AEQKLAPHDSHTTPQTSDDTAA 617
+ TT T A
Sbjct: 1227 VRSVPHNVEPATTSSNDRSTVA 1248



Score = 35.0 bits (80), Expect = 0.001
Identities = 23/162 (14%), Positives = 51/162 (31%), Gaps = 8/162 (4%)

Query: 293 LSEEAPNLRTAKEKVKTLKGILHDYYVDIKEPEKAKKYRVEDETLPSQPEAPAKPEAPAP 352
E +T + K V+ ++ ++ K + Q E PA
Sbjct: 1088 SGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPAR 1147

Query: 353 SPSP--------APGQKPAEDERSSQATEPANPSKENSSTDASQGSHDSENPATDSPSQP 404
P + A+ E+ ++ T ST + G+ ENP +P+
Sbjct: 1148 ENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATT 1207

Query: 405 QASDQGNHQSQVPNAKPQTDKTDTPNAPVPPQNTPNAPAIPQ 446
Q + ++ N ++ ++ N ++ + +
Sbjct: 1208 QPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVAL 1249



Score = 33.9 bits (77), Expect = 0.002
Identities = 34/217 (15%), Positives = 65/217 (29%), Gaps = 31/217 (14%)

Query: 322 KEPEKAKKYRVEDETLPSQPEAPAKPEAPAPSPSPAPGQKPAEDERSSQATEP-ANPSKE 380
K + ET + E AK E P + + + S+ +P A P++E
Sbjct: 1089 GSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARE 1148

Query: 381 NSSTDASQGSHDSENPATDSPSQPQASDQGNHQSQVPNAKPQTDKTDTPNAPVPPQNTPN 440
N T + + ++ + Q + + + + P + T T + P+NT
Sbjct: 1149 NDPTVNIK---EPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNT-GNSVVENPENTTP 1204

Query: 441 APAIPQDTPKVPEAGGQSGPAGNAEEKAPDSGPKESGQSSSKESPSVGEDTTPNQPDVLV 500
A P + PK + S + P E T + D
Sbjct: 1205 ATTQPTVNSESSNK------------------PKNRHRRSVRSVPHNVEPATTSSNDRST 1246

Query: 501 GGQREPIDITEDTITDAPPTVSGHNASTQPQSVVEDT 537
+ + + +A + Q V +
Sbjct: 1247 VALCDLTSTNTNAVLS--------DARAKAQFVALNV 1275


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits          Score  E-value  Comments
MGAS2096_Spy1761  GPOSANCHOR    36     1e-04    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 35.8 bits (82), Expect = 1e-04
Identities = 24/89 (26%), Positives = 36/89 (40%), Gaps = 12/89 (13%)

Query: 191 IEEDTKPKRFFHFDN-EPQAPEKPKEQPSNSLPQ----------APVYKAAHHLPASGDK 239
EE K + D+ P A K P AP+ + LP++G+
Sbjct: 452 AEELAKLRAGKASDSQTPDAKPGNKAVPGKGQAPQAGTKPNQNKAPMKETKRQLPSTGET 511

Query: 240 REASFTIVALTIIGAAGLLSKKRRDTEEN 268
FT ALT++ AG+ + +R EEN
Sbjct: 512 ANPFFTAAALTVMATAGVAAVVKR-KEEN 539


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits          Score  E-value  Comments
MGAS2096_Spy1762  IGASERPTASE   32     0.013    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 32.0 bits (72), Expect = 0.013
Identities = 25/145 (17%), Positives = 53/145 (36%), Gaps = 6/145 (4%)

Query: 24 APTVLGQEVSNTEASASSTASVDATASGTTASGASSEATVATTNGGSQSTQVAAETTPQP 83
PTV +E + + + T S + TV T N ++ + T QP
Sbjct: 1150 DPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQP 1209

Query: 84 QVQTSEQPAATSTSLASSTSSSEDKAPKAASTKSSSA-----TVASSSNGSNQGAGTEVE 138
V + + S S + P S+ S ++++N A + +
Sbjct: 1210 TVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCDLTSTNTNAVLSDARAKAQ 1269

Query: 139 PQMMDVEQYKVNKEKTELTVKDDNQ 163
++V + V++ ++L + ++ Q
Sbjct: 1270 FVALNVGKA-VSQHISQLEMNNEGQ 1293


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits          Score  E-value  Comments
MGAS2096_Spy1766  STREPTOPAIN   59     6e-14    Streptopain (C10) cysteine protease family signature.
		>STREPTOPAIN#Streptopain (C10) cysteine protease family signature.

Length = 398

Score = 59.3 bits (143), Expect = 6e-14
Identities = 39/102 (38%), Positives = 56/102 (54%), Gaps = 7/102 (6%)

Query: 2 EMHFVRTEPEARRIAETFCAENTQTKTPMRVQQLSYPSDTDHSGGEL-----YIYALSPA 56
+ +F R E EA+ A TF ++ K R + D + GGEL Y+Y +S
Sbjct: 28 DQNFARNEKEAKDSAITFIQKSAAIKAGARSAE-DIKLDKVNLGGELSGSNMYVYNISTG 86

Query: 57 GFIIVSGDTRAHTILGYSFDNNLDLN-HDNVRSMVEAYQKQI 97
GF+IVSGD R+ ILGYS + D N +N+ S +E+Y +QI
Sbjct: 87 GFVIVSGDKRSPEILGYSTSGSFDANGKENIASFMESYVEQI 128


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits          Score  E-value  Comments
MGAS2096_Spy1767  STREPTOPAIN   710    0.0      Streptopain (C10) cysteine protease family signature.
		>STREPTOPAIN#Streptopain (C10) cysteine protease family signature.

Length = 398

Score = 710 bits (1833), Expect = 0.0
Identities = 396/398 (99%), Positives = 397/398 (99%)

Query: 1 MNKKKLGIRLLSLLALGGFVLANPVFADQNFARNEKEAKDSAITFIQKSAAIKAGARSAE 60
MNKKKLG+RLLSLLALGGFVLANPVFADQNFARNEKEAKDSAITFIQKSAAIKAGARSAE
Sbjct: 1 MNKKKLGVRLLSLLALGGFVLANPVFADQNFARNEKEAKDSAITFIQKSAAIKAGARSAE 60

Query: 61 DIKLDKVNLGGELSGSNMYVYNISTGGFVIVSGDKRSPEILGYSTSGSFDANGKENIASF 120
DIKLDKVNLGGELSGSNMYVYNISTGGFVIVSGDKRSPEILGYSTSGSFDANGKENIASF
Sbjct: 61 DIKLDKVNLGGELSGSNMYVYNISTGGFVIVSGDKRSPEILGYSTSGSFDANGKENIASF 120

Query: 121 MESYVEQIKENKKLDTTYAGTAEIKQPVVKSLLDSKGIHYNQGNPYNLLTPVIEKVKPGE 180
MESYVEQIKENKKLDTTYAGTAEIKQPVVKSLLDSKGIHYNQGNPYNLLTPVIEKVKPGE
Sbjct: 121 MESYVEQIKENKKLDTTYAGTAEIKQPVVKSLLDSKGIHYNQGNPYNLLTPVIEKVKPGE 180

Query: 181 QSFVGQHAATGCVATATAQIMKYHNYPNKGLKDYTYTLSSNNPYFNHPKNLFAAISTRQY 240
QSFVGQHAATGCVATATAQIMKYHNYPNKGLKDYTYTLSSNNPYFNHPKNLFAAISTRQY
Sbjct: 181 QSFVGQHAATGCVATATAQIMKYHNYPNKGLKDYTYTLSSNNPYFNHPKNLFAAISTRQY 240

Query: 241 NWNNILPTYSGRESNVQKMAISELMADVGISVDMDYGPSSGSAGSSRVQRALKENFGYNQ 300
NWNNILPTYSGRESNVQKMAISELMADVGISVDMDYGPSSGSAGSSRVQRALKENFGYNQ
Sbjct: 241 NWNNILPTYSGRESNVQKMAISELMADVGISVDMDYGPSSGSAGSSRVQRALKENFGYNQ 300

Query: 301 SVHQINRSDFSKQDWEAQIDKELSQNQPVYYQGVGKVGGHAFVIDGADGRNFYHVNWGWG 360
SVHQINR DFSKQDWEAQIDKELSQNQPVYYQGVGKVGGHAFVIDGADGRNFYHVNWGWG
Sbjct: 301 SVHQINRGDFSKQDWEAQIDKELSQNQPVYYQGVGKVGGHAFVIDGADGRNFYHVNWGWG 360

Query: 361 GVSDGFFRLDALNPSALGTGGGAGGFNGYQSAVVGIKP 398
GVSDGFFRLDALNPSALGTGGGAGGFNGYQSAVVGIKP
Sbjct: 361 GVSDGFFRLDALNPSALGTGGGAGGFNGYQSAVVGIKP 398


30  MGAS2096_Spy1799  MGAS2096_Spy1810  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag          DNBias  CDNBias  %GCBias    Product
MGAS2096_Spy1799  0       21       3.269709   transcriptional regulator ctsR
MGAS2096_Spy1800  0       24       3.649292   cold shock protein
MGAS2096_Spy1801  0       24       3.866946   *peroxiredoxin
MGAS2096_Spy1802  0       24       4.861605   peroxiredoxin reductase (NAD(P)H) / NADH oxidase
MGAS2096_Spy1803  0       22       5.024648   imidazolonepropionase
MGAS2096_Spy1804  1       26       5.424171   urocanate hydratase
MGAS2096_Spy1805  -1      27       5.624719   glutamate formiminotransferase
MGAS2096_Spy1806  0       29       5.887842   formiminotetrahydrofolate cyclodeaminase
MGAS2096_Spy1807  0       23       4.667114   formate--tetrahydrofolate ligase
MGAS2096_Spy1808  -2      22       4.031387   putative cytoplasmic protein
MGAS2096_Spy1809  -2      23       3.813868   amino acid permease
MGAS2096_Spy1810  -1      18       3.525761   histidine ammonia-lyase
ORFs having significant similarity with Known Virulence factors
LocusTag          Hits          Score  E-value  Comments
MGAS2096_Spy1802  PF07212       30     0.021    Hyaluronoglucosaminidase
		>PF07212#Hyaluronoglucosaminidase

Length = 336

Score = 30.0 bits (67), Expect = 0.021
Identities = 34/145 (23%), Positives = 58/145 (40%), Gaps = 22/145 (15%)

Query: 242 GGQVMETVGIENMIGTLYT--EGPKLMAEVEAHTKSYDVDIIKAQLATSIEKKENIEVTL 299
G M+ G+E +GTL E P + A + + + +DI+K K++ + T
Sbjct: 205 NGSAMQIRGVEKALGTLKITHENPNVEANYDENAAALSIDIVK--------KQKGGKGTA 256

Query: 300 ANGAVLQAKTAILALGAKWRNINVPGEDEFRNKGVTYCPHCDGPLFEGKDVAVIGGGNSG 359
A G + + + + RN+ +D+F K DG + K + GN
Sbjct: 257 AQGIYINSTSGTTGKLLRIRNLG---DDKFYVKH-------DGGFYAKKTSQI--DGNLK 304

Query: 360 LEAALDLAGLAKHVYVLEFLPELKA 384
L+ A YV + +LKA
Sbjct: 305 LKNPTADDHAATKAYVDSEVKKLKA 329


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits          Score  E-value  Comments
MGAS2096_Spy1803  UREASE        47     7e-08    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 47.4 bits (113), Expect = 7e-08
Identities = 22/53 (41%), Positives = 32/53 (60%), Gaps = 6/53 (11%)

Query: 46 IAIKDGLIVALG-SGEPDAE-----LVGPQTIMRSYKGKIATPGIIDCHTHLV 92
I +KDG I A+G +G PD + +VGP T + + +GKI T G +D H H +
Sbjct: 88 IGLKDGRIAAIGKAGNPDMQPGVTIIVGPGTEVIAGEGKIVTAGGMDSHIHFI 140


31  MGAS2096_Spy1846  MGAS2096_Spy1857  Y  N  N  Genomic Island
LocusTag          DNBias  CDNBias  %GCBias    Product
MGAS2096_Spy1846  0       13       -3.077257  histidyl-tRNA synthetase
MGAS2096_Spy1847  2       19       -4.695993  50S ribosomal protein L32
MGAS2096_Spy1848  3       18       -4.438778  50S ribosomal protein L33
MGAS2096_Spy1849  4       20       -4.410788  cadmium resistance protein
MGAS2096_Spy1850  5       20       -4.665917  cadmium efflux system accessory protein
MGAS2096_Spy1851  5       21       -3.883766  hypothetical protein
MGAS2096_Spy1852  7       23       -2.102279  FtsK/SpoIIIE family protein
MGAS2096_Spy1853  7       22       -2.502542  hypothetical protein
MGAS2096_Spy1854  7       24       -2.880017  transcriptional regulator
MGAS2096_Spy1855  5       22       -1.372871  hypothetical protein
MGAS2096_Spy1856  3       17       -0.558624  integral membrane protein
MGAS2096_Spy1857  2       14       0.566523   phosphohydrolase
32  MGAS2096_Spy0247  MGAS2096_Spy0256  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag          DNBias  CDNBias  %GCBias    Product
MGAS2096_Spy0247  5       37       2.230825   surface exclusion protein
MGAS2096_Spy0248  10      53       3.270639   30S ribosomal protein S12
MGAS2096_Spy0249  7       44       3.145065   30S ribosomal protein S7
MGAS2096_Spy0250  5       40       3.154758   protein translation elongation factor G (EF-G)
MGAS2096_Spy0251  2       28       2.605266   protein translation elongation factor G (EF-G)
MGAS2096_Spy0252  0       17       1.722232   glyceraldehyde-3-phosphate dehydrogenase
MGAS2096_Spy0253  -2      14       1.113317   amino acid ABC transporter ATP-binding protein
MGAS2096_Spy0254  -1      14       0.413587   amino acid ABC transporter permease
MGAS2096_Spy0255  0       17       -0.290375  amino acid ABC transporter permease
MGAS2096_Spy0256  0       15       -0.437162  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag          Hits          Score  E-value  Comments
MGAS2096_Spy0247  GPOSANCHOR    44     2e-06    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 43.9 bits (103), Expect = 2e-06
Identities = 41/221 (18%), Positives = 75/221 (33%), Gaps = 20/221 (9%)

Query: 64 ETIQEAKATIDAVEKTLSQQKAELTELATALTKTTAEINHLKEQQDNEQKALTSAQEIYT 123
+ K D + + LS K +L + +L++ ++I L+ ++ + +KAL A T
Sbjct: 78 FNNKALKDHNDELTEELSNAKEKLRKNDKSLSEKASKIQELEARKADLEKALEGAMNFST 137

Query: 124 NTLASSE--------------------ETLLAQGAEHQRELTATETELHNAQADQHSKET 163
A + E + ++ E E +A Q E
Sbjct: 138 ADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEK 197

Query: 164 ALSEQKASISAETTRAQDLVEQVKTSEQNIAKLNAMISNPDAITKAAQTANDNTKALSSE 223
AL +A++ + + L + A L + + A +A +
Sbjct: 198 ALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAA 257

Query: 224 LEKAKADLENQKAKVKKQLTEELAAQKAALAEKEAELSRLK 264
LE +A+LE T + A K AEK A +
Sbjct: 258 LEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKA 298



Score = 40.4 bits (94), Expect = 3e-05
Identities = 42/207 (20%), Positives = 71/207 (34%), Gaps = 6/207 (2%)

Query: 64 ETIQEAKATIDAVEKTLSQQKAELTELATALTKTTAEINHLKEQQDNEQKALTSAQEIYT 123
E A KTL +KA L L K + + K L + +
Sbjct: 165 EGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALA 224

Query: 124 NTLASSEETLLAQGAE---HQRELTATETELHNAQADQHSKETALSEQKASISAETTRAQ 180
A E+ L ++ E E +A Q E AL +A++ + +
Sbjct: 225 ARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIK 284

Query: 181 DLVEQVKTSEQNIAKLNAMISNPDAITKAAQTANDNTKALSSELEKAKADLENQKA---K 237
L + E A L +A ++ + D ++ +LE LE Q
Sbjct: 285 TLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEA 344

Query: 238 VKKQLTEELAAQKAALAEKEAELSRLK 264
++ L +L A + A + EAE +L+
Sbjct: 345 SRQSLRRDLDASREAKKQLEAEHQKLE 371



Score = 37.7 bits (87), Expect = 2e-04
Identities = 62/310 (20%), Positives = 113/310 (36%), Gaps = 46/310 (14%)

Query: 5 ERMDLEQTKPNQVKQKIALTSTIALLSASVGVSHQVKADDRASGETKASNTHDDSLPKPE 64
+ DLE+ + A ++ I L A + + + A N K +
Sbjct: 156 RKADLEKALEGAMNFSTADSAKIKTLEAEKAAL-EARQAELEKALEGAMNFSTADSAKIK 214

Query: 65 TIQEAKATIDAVEKTLSQQKAELTELATALTKTTAEINHLKEQQDNEQKALTSAQEIYTN 124
T++ KA + A + L + +TA + + K + Q L A E N
Sbjct: 215 TLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMN 274

Query: 125 TLASSE---ETLLAQGAEHQRELTATETELHNAQADQHSKETALS--------------- 166
+ +TL A+ A + E E + A++ S L
Sbjct: 275 FSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEAEHQK 334

Query: 167 -----------------------EQKASISAETTRAQDLVEQVKTSEQNI-AKLNAMISN 202
E K + AE + ++ + + S Q++ L+A
Sbjct: 335 LEEQNKISEASRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREA 394

Query: 203 PDAITKAAQTAN---DNTKALSSELEKAKADLENQKAKVKKQLTEELAAQKAALAEKEAE 259
+ KA + AN + L+ ELE++K E +KA+++ +L E A K LA++ E
Sbjct: 395 KKQVEKALEEANSKLAALEKLNKELEESKKLTEKEKAELQAKLEAEAKALKEKLAKQAEE 454

Query: 260 LSRLKSSAPS 269
L++L++ S
Sbjct: 455 LAKLRAGKAS 464


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits          Score  E-value  Comments
MGAS2096_Spy0250  TCRTETOQM     109    2e-31    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 109 bits (274), Expect = 2e-31
Identities = 29/63 (46%), Positives = 41/63 (65%)

Query: 9 KTRNIGIMAHVDAGKTTTTERILYYTGKIHKIGETHEGASQMDWMEQEQERGITITSAAT 68
K NIG++AHVDAGKTT TE +LY +G I ++G +G ++ D E++RGITI + T
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 69 QLN 71

Sbjct: 62 SFQ 64


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits          Score  E-value  Comments
MGAS2096_Spy0251  TCRTETOQM     485    e-167    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 485 bits (1249), Expect = e-167
Identities = 142/594 (23%), Positives = 244/594 (41%), Gaps = 65/594 (10%)

Query: 1 MDFTIEVQRSLRVLDGAVTVLDSQSGVEPQTETVWRQATEYGVPRIVFANKMDKIGADFL 60
MDF EV RSL VLDGA+ ++ ++ GV+ QT ++ + G+P I F NK+D+ G D
Sbjct: 79 MDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGIPTIFFINKIDQNGIDLS 138

Query: 61 YSVQTLHDRLQANAHPIQLPIGAEDDFRGIIDLIKMKAEIYTNDLGTDILEEDIPEEYLE 120
Q + ++L A +IK K E+Y N T+ E +
Sbjct: 139 TVYQDIKEKLSAEI------------------VIKQKVELYPNMCVTNFTESE------- 173

Query: 121 QAQEYREKLIEAVAETDEDLMMKYLEGEEITNDELIAGIRKATINVEFFPVLCGSAFKNK 180
+ V E ++DL+ KY+ G+ + EL N FPV GSA N
Sbjct: 174 --------QWDTVIEGNDDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNI 225

Query: 181 GVQLMLDAVIAYLPSPLDIPAIKGVNPDTDAEEERPASDEEPFAALAFKIMTDPFVGRLT 240
G+ +++ + S + FKI RL
Sbjct: 226 GIDNLIEVITNKFYSSTH-------------------RGQSELCGKVFKIEYSEKRQRLA 266

Query: 241 FFRVYSGVLNSGSYVMNTSKGKRERIGRILQMHANSRQEIETVYAGDIAAAVG----LKD 296
+ R+YSGVL+ V + K K +I + +I+ Y+G+I L
Sbjct: 267 YIRLYSGVLHLRDSVRISEKEK-IKITEMYTSINGELCKIDKAYSGEIVILQNEFLKLNS 325

Query: 297 TTTGDSLTDEKAKVILESIEVPEPVIQLMVEPKSKADQDKMGVALQKLAEEDPTFRVETN 356
GD+ + E IE P P++Q VEP ++ + AL ++++ DP R +
Sbjct: 326 VL-GDTKLLPQR----ERIENPLPLLQTTVEPSKPQQREMLLDALLEISDSDPLLRYYVD 380

Query: 357 VETGETVIAGMGELHLDVLVDRMKREFKVEANVGAPQVSYRETFRASTQARGFFKRQSGG 416
T E +++ +G++ ++V ++ ++ VE + P V Y E R +A +
Sbjct: 381 SATHEIILSFLGKVQMEVTCALLQEKYHVEIEIKEPTVIYME--RPLKKAEYTIHIEVPP 438

Query: 417 KGQFGDVWIEFTPNEEGKGFEFENAIVGGVVPREFIPAVEKGLIESMANGVLAGYPMVDV 476
+ + + +P G G ++E+++ G + + F AV +G+ G L G+ + D
Sbjct: 439 NPFWASIGLSVSPLPLGSGMQYESSVSLGYLNQSFQNAVMEGIRYGCEQG-LYGWNVTDC 497

Query: 477 KAKLYDGSYHDVDSSETAFKIAASLALKEAAKSAQPAILEPMMLVTITAPEDNLGDVMGH 536
K G Y+ S+ F++ A + L++ K A +LEP + I AP++ L
Sbjct: 498 KICFKYGLYYSPVSTPADFRMLAPIVLEQVLKKAGTELLEPYLSFKIYAPQEYLSRAYTD 557

Query: 537 VTARRGRVDGMEAHGNSQIVRAYVPLAEMFGYATVLRSATQGRGTFMMVFDHYE 590
+ + N I+ +P + Y + L T GR + Y
Sbjct: 558 APKYCANIVDTQLKNNEVILSGEIPARCIQEYRSDLTFFTNGRSVCLTELKGYH 611


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits          Score  E-value  Comments
MGAS2096_Spy0254  TYPE3IMQPROT  27     0.024    Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.

Length = 86

Score = 27.4 bits (61), Expect = 0.024
Identities = 16/49 (32%), Positives = 28/49 (57%)

Query: 163 MTLLISMVGTITGLFIGLLIGIFRTAPKAKHKVAALGQKLFGWLLTIYI 211
+ L++S TI IGLL+G+F+T + + + G KL G L +++
Sbjct: 14 LVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLFL 62


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits          Score  E-value  Comments
MGAS2096_Spy0256  cloacin       32     0.010    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 31.6 bits (71), Expect = 0.010
Identities = 13/21 (61%), Positives = 13/21 (61%)

Query: 619 SGGGISGGGGFSGGGGGGGGG 639
SG G GG G SGGG G GG
Sbjct: 60 SGHGNGGGNGNSGGGSGTGGN 80


33  MGAS2096_Spy0810  MGAS2096_Spy0815  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag          DNBias  CDNBias  %GCBias    Product
MGAS2096_Spy0810  -1      15       0.416957   dTDP-4-dehydrorhamnose 3,5-epimerase
MGAS2096_Spy0811  -2      15       0.164273   dTDP-glucose 4,6-dehydratase
MGAS2096_Spy0812  -1      15       -0.226420  7,8-dihydro-8-oxoguanine-triphosphatase
MGAS2096_Spy0813  -1      13       -0.617513  hypothetical protein
MGAS2096_Spy0814  -2      12       -0.697442  hypothetical protein
MGAS2096_Spy0815  -3      11       -0.855448  fibronectin-binding protein / fibrinogen-binding
ORFs having significant similarity with Known Virulence factors
LocusTag          Hits          Score  E-value  Comments
MGAS2096_Spy0810  INTIMIN       27     0.043    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 27.0 bits (59), Expect = 0.043
Identities = 15/60 (25%), Positives = 23/60 (38%), Gaps = 5/60 (8%)

Query: 3 GLKKISKEKMLPIGFPERFFEEGKLQN-NVS----FSRQHVLRGLHAEPWDKYISVADDG 57
G + I +G +RFF + NV FS + G+ E W Y + +G
Sbjct: 247 GARYIDSRFTANLGAGQRFFLPENMLGYNVFIDQDFSGDNTRLGIGGEYWRDYFKSSVNG 306


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits          Score  E-value  Comments
MGAS2096_Spy0811  NUCEPIMERASE  132    4e-38    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 132 bits (334), Expect = 4e-38
Identities = 74/344 (21%), Positives = 136/344 (39%), Gaps = 44/344 (12%)

Query: 4 NIIVTGGAGFIGSNFVHY-VYNNHPDVHVTVLDKLT--YAGN--RANIEAILGDRVELVV 58
+VTG AGFIG + + H V +D L Y + +A +E + +
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGH---QVVGIDNLNDYYDVSLKQARLELLAQPGFQFHK 58

Query: 59 GDIADAELVDKLAA--KTDAIVHYAAESHNDNSLEDPSPFIHTNFIGTYTLLEAARKYDI 116
D+AD E + L A + + SLE+P + +N G +LE R I
Sbjct: 59 IDLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKI 118

Query: 117 RFHHV--STDEVYGDLPLREDLPGQGEGPGEKFTAETKYNPSSPYSSTKAASDLIVKAWV 174
+ H + S+ VYG L +P + + +P S Y++TK A++L+ +
Sbjct: 119 Q-HLLYASSSSVYG---LNRKMPFSTDDSVD--------HPVSLYAATKKANELMAHTYS 166

Query: 175 RSFGVKATISNCSNNYGPYQHIEKFIPRQITNILSGIKPKLYGEGKNVRDWIHTNDHSIG 234
+G+ AT YGP+ + + + +L G +Y GK RD+ + +D +
Sbjct: 167 HLYGLPATGLRFFTVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEA 226

Query: 235 VWAIL------------------TKGRIGETYLIGADGEKNNKEVLELILEKMGQPKDAY 276
+ + Y IG + ++ + + +G
Sbjct: 227 IIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAKK- 285

Query: 277 DHVTDRAGHDLRYAIDSTKLREELGWEPQFTNFSEGLEETIKWY 320
+ + + G L + D+ L E +G+ P+ T +G++ + WY
Sbjct: 286 NMLPLQPGDVLETSADTKALYEVIGFTPE-TTVKDGVKNFVNWY 328


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits          Score  E-value  Comments
MGAS2096_Spy0814  ANTHRAXTOXNA  29     0.031    Anthrax toxin LF subunit signature.
		>ANTHRAXTOXNA#Anthrax toxin LF subunit signature.

Length = 800

Score = 29.3 bits (65), Expect = 0.031
Identities = 42/178 (23%), Positives = 81/178 (45%), Gaps = 20/178 (11%)

Query: 31 KALKEDDADSLIALGEYLESIGFLPHAKRIYLQLADDYPELNINLAQIAAEDDAIEEAF- 89
+ L E++ +S+ + GE + P A R + + P+L IN+ A + +E +
Sbjct: 118 QDLSEEEKNSMNSRGEKV------PFASRFVFEKKRETPKLIINIKDYAINSEQSKEVYY 171

Query: 90 -----LYLDKVSKDS---PNYLSALLVMADLYDMEGLTEVAREKLLQAVSISPEPLVIFG 141
+ LD +SKD P +L+ + ++D D + + +K + + ++ + + I
Sbjct: 172 EIGKGISLDIISKDKSLDPEFLNLIKSLSD--DSDSSDLLFSQKFKEKLELNNKSIDINF 229

Query: 142 LAEIDMSLQY-FKEAIDYYAQLDNRQILELTGISTYQRIGRAYASLGKFEAAIEFLEK 198
+ E Q+ F A YY D+R +LEL ++ + + G FE E L+K
Sbjct: 230 IKENLTEFQHAFSLAFSYYFAPDHRTVLELYAPDMFEYMNK--LEKGGFEKISESLKK 285


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits          Score  E-value  Comments
MGAS2096_Spy0815  FbpA_PF05833  707    0.0      Fibronectin-binding protein
		>FbpA_PF05833#Fibronectin-binding protein

Length = 577

Score = 707 bits (1826), Expect = 0.0
Identities = 196/577 (33%), Positives = 325/577 (56%), Gaps = 32/577 (5%)

Query: 15 MSFDGFFLHHLTNELKENLLYGRIQKVNQPFERELVLTIRNHRKNYKLLLSAHPVFGRVQ 74
M+ DG FL+ + +ELK ++ G+I KVNQP + E++L IR R ++KLL+S+ + R+
Sbjct: 1 MALDGIFLYSIIDELKNTIINGKIDKVNQPEKDEIILNIRKGRLSFKLLISSSSNYPRIH 60

Query: 75 ITQADFQNPQVPNTFTMIMRKYLQGAVIEQLEQIDNDRIIEIKVSNKNEIGDAIQATLII 134
+T NP F M++RKY+ A I + QI+ DRI+ I + +E+G +LII
Sbjct: 61 LTDLTKPNPIKAPMFCMVLRKYISNAKIVDIHQINQDRIVVIDFESTDELGFNSIYSLII 120

Query: 135 EIMGKHSNIILVDRAENKIIESIKHVGFSQNSYRTILPGSTYIEPPKTAAVNPFTITD-- 192
EIMG+HSN+ L+ + +N I++SIKH+ N+YR+I PG Y+ PPK+ +NPF +
Sbjct: 121 EIMGRHSNMTLIRKRDNIIMDSIKHITPDINTYRSIYPGIEYVYPPKSPKLNPFDFSYDM 180

Query: 193 VPLFEILQTQELTVKSLQQHFQGLGRDTAKELAELLTTDKLKR---------------FR 237
+ F + +L + F G+ + + E+ L + + F+
Sbjct: 181 IENFTKENSLQLNDNIFSKIFTGVSKTLSSEICFRLKNNSIDLSLSNLKEIVEVCKDLFK 240

Query: 238 EFFARPTQANLTTASFEPVLF---------SDSHATFETLSEMLDHFYQDKAERDRINQQ 288
E + + N T + V F +++ S++L++FY K + DR+ +
Sbjct: 241 EIQSNKFEFNCYTKNNSFVGFYCLNLMSKEDYKKIQYDSSSKLLENFYYAKDKSDRLKSK 300

Query: 289 ASDLIHRVQTELDKNRNKLSKQEAELLATENAELFRQKGELLTTYLSLVPNNQDSVILDN 348
+SDL V +++ K L E+ ++F+ GELLT + + + L N
Sbjct: 301 SSDLQKIVMNNINRCTKKDKILNNTLKKCEDKDIFKLYGELLTANIYALKKGLSHIELAN 360

Query: 349 YYT--GEKIEIALDKALTPNQNAQRYFKKYQKLKEAVKHLSGLIADTKQSITYFESVDYN 406
YY+ + ++I LD+ TP+QN Q Y+KKY KLK++ + + + ++ + Y SV N
Sbjct: 361 YYSENYDTVKITLDENKTPSQNVQSYYKKYNKLKKSEEAANEQLLQNEEELNYLYSVLTN 420

Query: 407 LSQA-SIDDIEDIREELYQAGFLKSRQ--RDKRHKRKKPEQYLASDGKTILMVGRNNLQN 463
++ A + D+IE+I++EL + G++K ++ + K+ K KP +++ DG I VG+NN+QN
Sbjct: 421 INNADNYDEIEEIKKELIETGYIKFKKIYKSKKSKTSKPMHFISKDGIDIY-VGKNNIQN 479

Query: 464 EELTFKMAKKGELWFHAKDIPGSHVIIKDNLDPSDEVKTDAAELAAYYSKARLSNLVQVD 523
+ LT K A K ++WFH K+IPGSHVI+K+ +D + +AA LAAYYSK++ S+ V VD
Sbjct: 480 DYLTLKFANKHDIWFHTKNIPGSHVIVKNIMDIPESTLLEAANLAAYYSKSQNSSNVPVD 539

Query: 524 MIEAKKLHKPSGAKPGFVTYTGQKTLRVTPDQAKILS 560
E K + KP+GAKPG V Y+ +T+ VTP + +
Sbjct: 540 YTEVKNVKKPNGAKPGMVIYSTNQTIYVTPTNPNLKN 576


34  MGAS2096_Spy1269  MGAS2096_Spy1286  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag          DNBias  CDNBias  %GCBias    Product
MGAS2096_Spy1269  -2      13       0.026711   cell division protein ftsA
MGAS2096_Spy1268  -2      16       0.830469   hypothetical protein
MGAS2096_Spy1270  -2      16       1.047374   cell-division initiation protein DivIB
MGAS2096_Spy1271  -2      16       1.511638   undecaprenyldiphospho-muramoylpentapeptide
MGAS2096_Spy1272  0       18       1.903952   UDP-N-acetylmuramoyl-L-alanyl-D-glutamate
MGAS2096_Spy1273  2       25       1.792913   hypothetical protein
MGAS2096_Spy1274  2       24       1.938272   GTP-binding protein TypA/BipA
MGAS2096_Spy1275  0       13       -0.175886  GTP-binding protein TypA/BipA
MGAS2096_Spy1276  0       14       -0.759907  rhodanese-related sulfurtransferases
MGAS2096_Spy1277  0       14       -1.573701  glucokinase
MGAS2096_Spy1278  4       16       -3.406698  putative cytoplasmic protein
MGAS2096_Spy1279  2       16       -2.790999  non-specific DNA-binding protein Dps /
MGAS2096_Spy1280  2       19       -2.767852  Type 4 prepilin peptidase pilD
MGAS2096_Spy1281  1       18       -2.444211  ribosomal RNA large subunit methyltransferase N
MGAS2096_Spy1282  0       17       -2.424537  putative cytoplasmic protein
MGAS2096_Spy1283  -1      16       -1.146246  hypothetical protein
MGAS2096_Spy1284  -2      14       -0.164357  ribose operon repressor
MGAS2096_Spy1285  -1      14       0.759564   ATP-dependent endopeptidase Lon
MGAS2096_Spy1286  1       14       1.130968   phosphopantetheine adenylyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag          Hits          Score  E-value  Comments
MGAS2096_Spy1269  SHAPEPROTEIN  47     5e-08    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 47.4 bits (113), Expect = 5e-08
Identities = 42/191 (21%), Positives = 79/191 (41%), Gaps = 16/191 (8%)

Query: 170 RKTVERAGIKVENIIISPLAMAKTILNEGEREFGATVIDMGGGQTTVASMRAQELQYTNI 229
R++ + AG + +I P+A A G+ V+D+GGG T VA + + Y++
Sbjct: 127 RESAQGAGAREVFLIEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVAVISLNGVVYSSS 186

Query: 230 YAEGGEYITKDISKVLKTSLAI------AEALKFNFGQAEISEASITETVK-VDVV-GSE 281
GG+ + I ++ + AE +K G A + V+ ++ G
Sbjct: 187 VRIGGDRFDEAIINYVRRNYGSLIGEATAERIKHEIGSAYPGDEVREIEVRGRNLAEGVP 246

Query: 282 EPVEVTERYLSEIISARIRHILDRVKQDLER------GRLLDLPGGIVLIGGGAIMPGVV 335
+ + E + + I+ V LE+ + + G+VL GGGA++ +
Sbjct: 247 RGFTLNSNEILEALQEPLTGIVSAVMVALEQCPPELASDISE--RGMVLTGGGALLRNLD 304

Query: 336 EIAQEIFGVTV 346
+ E G+ V
Sbjct: 305 RLLMEETGIPV 315


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits          Score  E-value  Comments
MGAS2096_Spy1271  LIPPROTEIN48  31     0.010    Mycoplasma P48 major surface lipoprotein signature.
		>LIPPROTEIN48#Mycoplasma P48 major surface lipoprotein signature.

Length = 428

Score = 30.7 bits (69), Expect = 0.010
Identities = 19/99 (19%), Positives = 31/99 (31%), Gaps = 10/99 (10%)

Query: 154 FEQEDQLSKVKHLGAVTKVFKDANQMPESTQLE-AVKEYFSRDLKTLLFIGGSAGAHVFN 212
FE ++K + + N + S+ E A S K + G
Sbjct: 83 FEALKAINKQTGI--------EINNVEPSSNFESAYNSALSAGHKIWVLNGFKHQQS-IK 133

Query: 213 QFISDHPELKQRYNIINITGDPHLNELSSHLYRVDYVTD 251
Q+I H E +R I I D + Y + +
Sbjct: 134 QYIDAHREELERNQIKIIGIDFDIETEYKWFYSLQFNIK 172


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits          Score  E-value  Comments
MGAS2096_Spy1274  TCRTETOQM     126    7e-33    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 126 bits (317), Expect = 7e-33
Identities = 76/407 (18%), Positives = 146/407 (35%), Gaps = 95/407 (23%)

Query: 1 MDTPGHADFGGEVERIMKMVDGVVLVVDAYEGTMPQTRFVLKKALEQNLIPIVVVNKIDK 60
+DTPGH DF EV R + ++DG +L++ A +G QTR + + + I +NKID+
Sbjct: 73 IDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGIPTIFFINKIDQ 132

Query: 61 PSARP-------------------------------------AEVVDEVLELFIELGADD 83
+ V E + +E
Sbjct: 133 NGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEGNDDLLEKYMSG 192

Query: 84 EQLE-----------------FPVVYASAINGTSSLSDDPADQEHTMAPIFDTIIDHIPA 126
+ LE FPV + SA N + + + I + +
Sbjct: 193 KSLEALELEQEESIRFHNCSLFPVYHGSAKNNIG------------IDNLIEVITNKFYS 240

Query: 127 PVDNSDEPLQFQVSLLDYNDFVGRIGIGRVFRGTVKVGDQVTLSKLDGTTKNFRVTKLFG 186
L +V ++Y++ R+ R++ G + + D V +S + ++T+++
Sbjct: 241 STHRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRIS----EKEKIKITEMYT 296

Query: 187 FFGLERREIQEAKAGDLIAVSGMEDIFVGETITPTDCVEALPILRIDEPTLQMTFLVNNS 246
E +I +A +G+++ + E + + + T + + P LQ T
Sbjct: 297 SINGELCKIDKAYSGEIVILQN-EFLKLNSVLGDTKLLPQRERIENPLPLLQTT------ 349

Query: 247 PFAGREGKWITSRKVEER--LLAELQT----DVSLRVDPTDSPDKWTVSGRGELHLSILI 300
+ K ++R LL L D LR + + +S G++ + +
Sbjct: 350 ---------VEPSKPQQREMLLDALLEISDSDPLLRYYVDSATHEIILSFLGKVQMEVTC 400

Query: 301 ETMRRE-GYELQVSRPEVIIKEIDGVKCEPFERVQIDTPEEYQGAII 346
++ + E+++ P VI E K E + I+ P A I
Sbjct: 401 ALLQEKYHVEIEIKEPTVIYMERPLKKAE--YTIHIEVPPNPFWASI 445



Score = 42.5 bits (100), Expect = 3e-06
Identities = 18/79 (22%), Positives = 31/79 (39%), Gaps = 1/79 (1%)

Query: 328 EPFERVQIDTPEEYQGAIIQSLSERKGDMLDMQMVGNGQTRLIFLIPARGLIGYSTEFLS 387
EP+ +I P+EY + +++D Q + N + L IPAR + Y ++
Sbjct: 537 EPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQ-LKNNEVILSGEIPARCIQEYRSDLTF 595

Query: 388 MTRGYGIMNHTFDQYLPVV 406
T G + Y
Sbjct: 596 FTNGRSVCLTELKGYHVTT 614


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits          Score  E-value  Comments
MGAS2096_Spy1275  TCRTETOQM     53     3e-12    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 52.9 bits (127), Expect = 3e-12
Identities = 24/61 (39%), Positives = 35/61 (57%), Gaps = 2/61 (3%)

Query: 8 IRNVAIIAHVDHGKTTLVDELLKQSHTLDERKELQE--RAMDSNDLEKERGITILAKNTA 65
I N+ ++AHVD GKTTL + LL S + E + + D+ LE++RGITI T+
Sbjct: 3 IINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITS 62

Query: 66 V 66

Sbjct: 63 F 63


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits     Score  E-value  Comments
MGAS2096_Spy1277  PF03309  31     0.004    Bvg accessory factor
		>PF03309#Bvg accessory factor

Length = 271

Score = 31.3 bits (71), Expect = 0.004
Identities = 29/126 (23%), Positives = 43/126 (34%), Gaps = 14/126 (11%)

Query: 5 LLGIDLGGTTIKFGILTAAGEVQE---KWAIETNILEGGKHIVPDIVASIKHRLDLYGLS 61
LL ID+ T G+++ +G+ + +W I T + D +A L G
Sbjct: 2 LLAIDVRNTHTVVGLISGSGDHAKVVQQWRIRTE-----PEVTADELALTIDG--LIGDD 54

Query: 62 SADFVGIGMGSPGAVDRDTNTVTGAFNLNWKETQEVGSVVEKELGIPFAIDNDANVAALG 121
+ G S V + V W V GIP +DN V A
Sbjct: 55 AERLTGASGLS--TVPSVLHEVRVMLEQYWPNVPHVLIEPGVRTGIPLLVDNPKEVGA-- 110

Query: 122 ERWVGA 127
+R V
Sbjct: 111 DRIVNC 116


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits         Score  E-value  Comments
MGAS2096_Spy1279  HELNAPAPROT  151    1e-49    Helicobacter neutrophil-activating protein A family ...
		>HELNAPAPROT#Helicobacter neutrophil-activating protein A family signature.

Length = 153

Score = 151 bits (383), Expect = 1e-49
Identities = 49/154 (31%), Positives = 85/154 (55%), Gaps = 4/154 (2%)

Query: 19 KKEASKNEKT--KAVLNQAVADLSVAASIVHQVHWYMRGPGFLYLHPKMDELLDSLNANL 76
K E +K +T + LN +++ + S +H+ HWY++GP F LH K +EL D +
Sbjct: 2 KTENAKTNQTLVENSLNTQLSNWFLLYSKLHRFHWYVKGPHFFTLHEKFEELYDHAAETV 61

Query: 77 DEMSERLITIGGAPYSTLAEFSKHSKLDEAKGTYDKTVAQHLARLVEVYLYLSSLYQVGL 136
D ++ERL+ IGG P +T+ E+++H+ + + + + ++ + LV Y +SS + +
Sbjct: 62 DTIAERLLAIGGQPVATVKEYTEHASITDGGN--ETSASEMVQALVNDYKQISSESKFVI 119

Query: 137 DITDEEGDAGTNDLFTAAKTEAEKTIWMLQAERG 170
+ +E D T DLF E EK +WML + G
Sbjct: 120 GLAEENQDNATADLFVGLIEEVEKQVWMLSSYLG 153


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits          Score  E-value  Comments
MGAS2096_Spy1280  PREPILNPTASE  29     0.009    Type IV prepilin cysteine protease (C20) family sig...
		>PREPILNPTASE#Type IV prepilin cysteine protease (C20) family signature.

Length = 290

Score = 29.4 bits (66), Expect = 0.009
Identities = 42/160 (26%), Positives = 59/160 (36%), Gaps = 25/160 (15%)

Query: 70 SLIIILWASMVHWVSASYCYLLLFSLLFSLF--DWRSQ------EYPFILWLFSFVSLLL 121
+L+ + A + + LLL +L +L D P + F L
Sbjct: 118 ALLSVAVAMTLAPGWGTLAALLLTWVLVALTFIDLDKMLLPDQLTLPLLWGGLLFNLLGG 177

Query: 122 FYSIN---------YLSLILLLLGLLAHLRPFSIGAGDFFYLASLALVLDLTSLIWLIQL 172
F S+ YL L L +G GDF LA+L L +L ++ L
Sbjct: 178 FVSLGDAVIGAMAGYLVLWSLYWAFKLLTGKEGMGYGDFKLLAALGAWLGWQALPIVLLL 237

Query: 173 ASLAGITACLLL-------GIKRIPFIPYLSFGLFWIVLL 205
+SL G + L K IPF PYL+ WI LL
Sbjct: 238 SSLVGAFMGIGLILLRNHHQSKPIPFGPYLAIA-GWIALL 276


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits     Score  E-value  Comments
MGAS2096_Spy1284  HTHTETR  33     7e-04    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 33.4 bits (76), Expect = 7e-04
Identities = 9/34 (26%), Positives = 19/34 (55%)

Query: 8 KLILQGGKAMVTIKQVAEEAGVSRSTVSRYISQK 41
+L Q G + ++ ++A+ AGV+R + + K
Sbjct: 22 RLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDK 55


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits          Score  E-value  Comments
MGAS2096_Spy1286  LPSBIOSNTHSS  153    2e-50    Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein signature.

Length = 166

Score = 153 bits (388), Expect = 2e-50
Identities = 58/157 (36%), Positives = 94/157 (59%), Gaps = 2/157 (1%)

Query: 5 IGLYTGSFDPVTNGHLDIVKRASGLFDQIYVGIFDNPTKKSYFKLEVRKAMLTQALADFT 64
+Y GSFDP+T GHLDI++R LFDQ+YV + NP K+ F ++ R + +A+A
Sbjct: 2 NAIYPGSFDPITFGHLDIIERGCRLFDQVYVAVLRNPNKQPMFSVQERLEQIAKAIAHLP 61

Query: 65 NVIVVTSHERLAIDVAKELRVTHLIRGLRNATDFEYEENLEYFNHLLAPNIETVYLISRN 124
N V + E L ++ A++ + ++RGLR +DFE E + N LA ++ETV+L +
Sbjct: 62 NAQVDSF-EGLTVNYARQRQAGAILRGLRVLSDFELELQMANTNKTLASDLETVFLTTST 120

Query: 125 KWQALSSSRVRELIHFQSSLEGLVPQSVIAQV-EKMN 160
++ LSSS V+E+ F ++E VP V A + ++ +
Sbjct: 121 EYSFLSSSLVKEVARFGGNVEHFVPSHVAAALYDQFH 157


35  MGAS2096_Spy1294  MGAS2096_Spy1300  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag          DNBias  CDNBias  %GCBias   Product
MGAS2096_Spy1294  -2      19       1.849320  arginine deiminase
MGAS2096_Spy1295  -2      19       1.839183  Crp family transcriptional regulator
MGAS2096_Spy1296  -2      19       2.333853  arginine repressor ArgR
MGAS2096_Spy1297  -1      18       2.177626  hypothetical protein
MGAS2096_Spy1298  0       18       1.802201  putative cytoplasmic protein
MGAS2096_Spy1299  0       19       1.776934  two-component sensor kinase yesM
MGAS2096_Spy1300  0       15       0.166507  two-component response regulator yesN
ORFs having significant similarity with Known Virulence factors
LocusTag          Hits          Score  E-value  Comments
MGAS2096_Spy1294  ARGDEIMINASE  578    0.0      Bacterial arginine deiminase signature.
		>ARGDEIMINASE#Bacterial arginine deiminase signature.

Length = 409

Score = 578 bits (1492), Expect = 0.0
Identities = 191/410 (46%), Positives = 276/410 (67%), Gaps = 9/410 (2%)

Query: 5 TPIHVYSEIGKLKKVLLHRPGKEIENLMPDYLERLLFDDIPFLEDAQKEHDAFAQALRDE 64
PI+++SEIG+LKKVLLHRPG+E+ENL P ++ LFDDIP+LE A++EH+ FA L++
Sbjct: 6 NPINIFSEIGRLKKVLLHRPGEELENLTPFIMKNFLFDDIPYLEVARQEHEVFASILKNN 65

Query: 65 GIEVLYLETLAAESLVTP-EIREAFIDEYLSEANIRGRATKKAIRELLMAIEDNQELIEK 123
+E+ Y+E L +E LV+ + FI +++ EA I+ T +++ ++ +I K
Sbjct: 66 LVEIEYIEDLISEVLVSSVALENKFISQFILEAEIKTDFTINLLKDYFSSL-TIDNMISK 124

Query: 124 TMAGVQKSELPEIPASEKGLTDLVESNYPFAIDPMPNLYFTRDPFATIGTGVSLNHMFSE 183
++GV EL +S L DLV F IDPMPN+ FTRDPFA+IG GV++N MF++
Sbjct: 125 MISGVVTEELKNYTSS---LDDLVNGANLFIIDPMPNVLFTRDPFASIGNGVTINKMFTK 181

Query: 184 TRNRETLYGKYIFTHHPIYGGGKVPMVYDRNETTRIEGGDELVLSKDVLAVGISQRTDAA 243
R RET++ +YIF +HP+Y VP+ +R E +EGGDELVL+K +L +GIS+RT+A
Sbjct: 182 VRQRETIFAEYIFKYHPVYKE-NVPIWLNRWEEASLEGGDELVLNKGLLVIGISERTEAK 240

Query: 244 SIEKLLVNIFKQNLGFKKVLAFEFANNRKFMHLDTVFTMVDYDKFTIHPEIEGDLRVYSV 303
S+EKL +++FK F +LAF+ NR +MHLDTVFT +DY FT + +Y +
Sbjct: 241 SVEKLAISLFKNKTSFDTILAFQIPKNRSYMHLDTVFTQIDYSVFTSFTSDDMYFSIYVL 300

Query: 304 TYDNE--ELHIVEEKGDLAELLAANLGVEKVDLIRCGGDNLVAAGREQWNDGSNTLTIAP 361
TY+ ++HI +EK + ++L+ LG K+D+I+C G +L+ REQWNDG+N L IAP
Sbjct: 301 TYNPSSSKIHIKKEKARIKDVLSFYLG-RKIDIIKCAGGDLIHGAREQWNDGANVLAIAP 359

Query: 362 GVVVVYNRNTITNAILESKGLKLIKIHGSELVRGRGGPRCMSMPFEREDI 411
G ++ Y+RN +TN + E G+K+ +I SEL RGRGGPRCMSMP REDI
Sbjct: 360 GEIIAYSRNHVTNKLFEENGIKVHRIPSSELSRGRGGPRCMSMPLIREDI 409


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits          Score  E-value  Comments
MGAS2096_Spy1296  ARGREPRESSOR  123    4e-39    Bacterial arginine repressor signature.
		>ARGREPRESSOR#Bacterial arginine repressor signature.

Length = 149

Score = 123 bits (311), Expect = 4e-39
Identities = 60/146 (41%), Positives = 92/146 (63%), Gaps = 2/146 (1%)

Query: 1 MNKKETRHQLIRSLISETTIHTQQELQERLQKNGITITQATLSRDMKELNLVKVTSGNDT 60
MNK + RH IR +I+ I TQ EL + L+K+G +TQAT+SRD+KEL+LVKV + N +
Sbjct: 1 MNKGQ-RHIKIREIITANEIETQDELVDILKKDGYNVTQATVSRDIKELHLVKVPTNNGS 59

Query: 61 HYEALAISQTRWEH-RLRFYMEDALVMLKIVQHQIILKTLPGLAQSFGSILDAMQIPEIV 119
+ +L Q +L+ + DA V + H I+LKT+PG AQ+ G+++D + EI+
Sbjct: 60 YKYSLPADQRFNPLSKLKRSLMDAFVKIDSASHLIVLKTMPGNAQAIGALMDNLDWEEIM 119

Query: 120 ATVCGDDTCLIVCEDNEQAKACYETL 145
T+CGDDT LI+C ++ K + +
Sbjct: 120 GTICGDDTILIICRTHDDTKVVQKKI 145


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits     Score  E-value  Comments
MGAS2096_Spy1299  PF06580  183    6e-55    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 183 bits (466), Expect = 6e-55
Identities = 56/198 (28%), Positives = 98/198 (49%), Gaps = 9/198 (4%)

Query: 362 EKAIGQYRLQALASQINPHFLYNTLDTIIWMAEFNDSKRVVEVTKSLAKYFRLALNQGN- 420
+ +L AL +QINPHF++N L+ I + D + E+ SL++ R +L N
Sbjct: 155 ASMAQEAQLMALKAQINPHFMFNALNNIRALIL-EDPTKAREMLTSLSELMRYSLRYSNA 213

Query: 421 EYIRLADELDHVSQYLFIQKQRYGDKLSYEVQGLDVYADFVIPKLILQPLVENAIYHGIK 480
+ LADEL V YL + ++ D+L +E Q D +P +++Q LVEN I HGI
Sbjct: 214 RQVSLADELTVVDSYLQLASIQFEDRLQFENQINPAIMDVQVPPMLVQTLVENGIKHGIA 273

Query: 481 EVDRKGMIKVTVSDTAQHLVLTVWDNGKGIEDSSLTNSQSLLARGGVGLKNVDQRLKLHY 540
++ + G I + + + L V + G ++ ++ G GL+NV +RL++ Y
Sbjct: 274 QLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKEST-------GTGLQNVRERLQMLY 326

Query: 541 GEGYHMTIHSQSDQFTEI 558
G + + + + +
Sbjct: 327 GTEAQIKLSEKQGKVNAM 344


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits    Score  E-value  Comments
MGAS2096_Spy1300  HTHFIS  94     2e-24    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 94.1 bits (234), Expect = 2e-24
Identities = 42/165 (25%), Positives = 75/165 (45%), Gaps = 12/165 (7%)

Query: 3 SLLIVEDEYLVRQGIRSLVDFSQFKIDRVNEAENGQLAWDLFQKEPYDIVLTDINMPKLN 62
++L+ +D+ +R + + + + V N W D+V+TD+ MP N
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGY---DVRITSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 63 GIQLAELIKQESPQTHLVFLTGYDDFNYALSALKLGADDYLLKPFSKADVEDMLGKLRKK 122
L IK+ P ++ ++ + F A+ A + GA DYL KPF D+ +++G + +
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPF---DLTELIGIIGRA 118

Query: 123 LELSKKTETIQELVEQPQKEVSAIAMAIHE------RLADSDLTL 161
L K+ + E Q + + A+ E RL +DLTL
Sbjct: 119 LAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTL 163


36  MGAS2096_Spy1554  MGAS2096_Spy1561  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag          DNBias  CDNBias  %GCBias    Product
MGAS2096_Spy1554  -1      11       -1.300954  ferrichrome transport system permease fhuB
MGAS2096_Spy1555  -2      12       -1.754567  ferrichrome-binding protein
MGAS2096_Spy1556  -1      10       -1.136825  hypothetical protein
MGAS2096_Spy1557  -1      10       -1.016084  iron ABC transporter permease
MGAS2096_Spy1558  0       10       0.774639   immunogenic secreted protein
MGAS2096_Spy1559  0       10       1.423628   alanine racemase
MGAS2096_Spy1560  -2      9        0.609031   4'-phosphopantetheinyl transferase
MGAS2096_Spy1561  -2      9        0.626628   preprotein translocase subunit SecA
ORFs having significant similarity with Known Virulence factors
LocusTag          Hits          Score  E-value  Comments
MGAS2096_Spy1554  TYPE3IMSPROT  29     0.036    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 28.6 bits (64), Expect = 0.036
Identities = 19/76 (25%), Positives = 32/76 (42%), Gaps = 5/76 (6%)

Query: 264 LASVATSIVGVVSFLGL---IVPHMSRLLVGSKHQILIPFSALLGAFVFLLADTLGRSLA 320
+ S A + +GL H S+L++ Q +PFS L V + L
Sbjct: 29 VVSTALIVALSAMLMGLSDYYFEHFSKLMLIPAEQSYLPFSQALSYVVDNVLLEFF-YLC 87

Query: 321 YPLEISPAIIMSIVGG 336
+PL ++ A +M+I
Sbjct: 88 FPL-LTVAALMAIASH 102


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits          Score  E-value  Comments
MGAS2096_Spy1555  FERRIBNDNGPP  70     4e-16    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 70.4 bits (172), Expect = 4e-16
Identities = 55/265 (20%), Positives = 104/265 (39%), Gaps = 24/265 (9%)

Query: 27 VACVNQHPKTAKETEQQRIVATSVAVVDICDRLNLDLVGVCDSKLYTL----PKRYDAVK 82
+ A + RIVA V++ L + GV D+ Y L P D+V
Sbjct: 20 PLLWQMNTAHAAAIDPNRIVALEWLPVELLLALGIVPYGVADTINYRLWVSEPPLPDSVI 79

Query: 83 RVGLPMNPDIELIASLKPTWILSPNSLQEDLEPKYQKLDTEYGFLNLRSVEG------MY 136
VGL P++EL+ +KP++++ P + L +G
Sbjct: 80 DVGLRTEPNLELLTEMKPSFMV----WSAGYGPSPEMLARIAPGRGFNFSDGKQPLAMAR 135

Query: 137 QSIDDLGNLFQRQQEAKELRQQYQDYYRAFQAKRKGK-KKPKVLILMGLPGSYLVATNQS 195
+S+ ++ +L Q A+ QY+D+ R+ + + + +P +L + P LV S
Sbjct: 136 KSLTEMADLLNLQSAAETHLAQYEDFIRSMKPRFVKRGARPLLLTTLIDPRHMLVFGPNS 195

Query: 196 YVGNLLDLAGGENVYQ--SDEKEFLSVNPEDMLA-KEPDLILRTAHAIPDKVKVMFDKEF 252
+LD G N +Q ++ +V+ + + A K+ D++ D +M
Sbjct: 196 LFQEILDEYGIPNAWQGETNFWGSTAVSIDRLAAYKDVDVLCFDHDNSKDMDALM----- 250

Query: 253 AENDIWKHFTAVKEGKVYDLDNTLF 277
+W+ V+ G+ + F
Sbjct: 251 -ATPLWQAMPFVRAGRFQRVPAVWF 274


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits         Score  E-value  Comments
MGAS2096_Spy1558  TONBPROTEIN  30     0.014    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 30.3 bits (68), Expect = 0.014
Identities = 14/42 (33%), Positives = 17/42 (40%)

Query: 117 KPTDHSKPTDQPKPTDQPKPSPSKVDTAPASSLSRQLPEVRT 158
KP KP K +QPK V++ PAS P T
Sbjct: 93 KPKPKPKPKPVKKVQEQPKRDVKPVESRPASPFENTAPARLT 134


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits         Score  E-value  Comments
MGAS2096_Spy1559  ALARACEMASE  343    e-119    Alanine racemase signature.
		>ALARACEMASE#Alanine racemase signature.

Length = 356

Score = 343 bits (882), Expect = e-119
Identities = 121/368 (32%), Positives = 194/368 (52%), Gaps = 23/368 (6%)

Query: 7 RPTVARVNLQAIKENVASVQKHIPLGVKTYAVVKADAYGHGAVQVSKALLPQVDGYCVSN 66
RP A ++LQA+K+N++ V++ + ++VVKA+AYGHG ++ A+ DG+ + N
Sbjct: 3 RPIQASLDLQALKQNLSIVRQAAT-HARVWSVVKANAYGHGIERIWSAI-GATDGFALLN 60

Query: 67 LDEALQLRQAGIDKEILIL-GVLLPNELELAVANAITVTIAS---LDWIALARLEKKECQ 122
L+EA+ LR+ G IL+L G +LE+ + +T + S L + ARL+
Sbjct: 61 LEEAITLRERGWKGPILMLEGFFHAQDLEIYDQHRLTTCVHSNWQLKALQNARLKAP--- 117

Query: 123 GLKVHVKVDSGMGRIGLRSSKEVNLLIDSLKELGADVEGIFTHFATADEADDTKFNQQLQ 182
L +++KV+SGM R+G + + + + + +HFA A+ D +
Sbjct: 118 -LDIYLKVNSGMNRLGFQPDRVLTVWQQLRAMANVGEMTLMSHFAEAEHPDGIS--GAMA 174

Query: 183 FFKKLIAGLEDKPRLVHASNSATSIWHSDTIFNAVRLGIVSYGLNPSGS-DLSLPFPLQE 241
++ GL SNSA ++WH + F+ VR GI+ YG +PSG L+
Sbjct: 175 RIEQAAEGL---ECRRSLSNSAATLWHPEAHFDWVRPGIILYGASPSGQWRDIANTGLRP 231

Query: 242 ALSLESSLVHVKMISAGDTVGYGVTYTAKKSEYVGTVPIGYADGWTRNM-QGFSVLVDGQ 300
++L S ++ V+ + AG+ VGYG YTA+ + +G V GYADG+ R+ G VLVDG
Sbjct: 232 VMTLSSEIIGVQTLKAGERVGYGGRYTARDEQRIGIVAAGYADGYPRHAPTGTPVLVDGV 291

Query: 301 FCEIIGRVSMDQLTIRLSKA--YPLGTKVTLIGSNQQKNISTTDIANYRNTINYEVLCLL 358
+G VSMD L + L+ +GT V L G K I D+A T+ YE++C L
Sbjct: 292 RTMTVGTVSMDMLAVDLTPCPQAGIGTPVELWG----KEIKIDDVAAAAGTVGYELMCAL 347

Query: 359 SDRIPRIY 366
+ R+P +
Sbjct: 348 ALRVPVVT 355


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits  Score  E-value  Comments
MGAS2096_Spy1561  SECA  1052   0.0      SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 1052 bits (2723), Expect = 0.0
Identities = 394/903 (43%), Positives = 560/903 (62%), Gaps = 73/903 (8%)

Query: 1 MANILRKVIENDKG-ELRKLEKIAKKVESYADQMASLSDRDLQGKTLEFKERYQKGETLE 59
+ +L KV + LR++ K+ + + +M LSD +L+GKT EF+ R +KGE LE
Sbjct: 2 LIKLLTKVFGSRNDRTLRRMRKVVNIINAMEPEMEKLSDEELKGKTAEFRARLEKGEVLE 61

Query: 60 QLLPEAFAVVREAAKRVLGLFPYRVQIMGGIVLHNGDVPEMRTGEGKTLTATMPVYLNAI 119
L+PEAFAVVREA+KRV G+ + VQ++GG+VL+ + EMRTGEGKTLTAT+P YLNA+
Sbjct: 62 NLIPEAFAVVREASKRVFGMRHFDVQLLGGMVLNERCIAEMRTGEGKTLTATLPAYLNAL 121

Query: 120 AGEGVHVITVNEYLSTRDATEMGEVYSWLGLSVGINLAAKSPAEKREAYNCDITYSTNSE 179
G+GVHV+TVN+YL+ RDA ++ +LGL+VGINL KREAY DITY TN+E
Sbjct: 122 TGKGVHVVTVNDYLAQRDAENNRPLFEFLGLTVGINLPGMPAPAKREAYAADITYGTNNE 181

Query: 180 VGFDYLRDNMVVRQEDMVQRPLNFALVDEVDSVLIDEARTPLIVSGAVSSETNQLYIRAD 239
GFDYLRDNM E+ VQR L++ALVDEVDS+LIDEARTPLI+SG + ++Y R +
Sbjct: 182 YGFDYLRDNMAFSPEERVQRKLHYALVDEVDSILIDEARTPLIISGPAEDSS-EMYKRVN 240

Query: 240 MFVKTLT------------SVDYVIDVPTKTIGLSDSGIDKAESYFNLS-------NLYD 280
+ L + +D ++ + L++ G+ E +LY
Sbjct: 241 KIIPHLIRQEKEDSETFQGEGHFSVDEKSRQVNLTERGLVLIEELLVKEGIMDEGESLYS 300

Query: 281 IENVALTHFIDNALRANYIMLLDIDYVVSEDGEILIVDQFTGRTMEGRRFSDGLHQAIEA 340
N+ L H + ALRA+ + D+DY+V +DGE++IVD+ TGRTM+GRR+SDGLHQA+EA
Sbjct: 301 PANIMLMHHVTAALRAHALFTRDVDYIV-KDGEVIIVDEHTGRTMQGRRWSDGLHQAVEA 359

Query: 341 KEGVRIQEESKTSASITYQNMFRMYKKLAGMTGTAKTEEEEFREVYNMRIIPIPTNRPIA 400
KEGV+IQ E++T ASIT+QN FR+Y+KLAGMTGTA TE EF +Y + + +PTNRP+
Sbjct: 360 KEGVQIQNENQTLASITFQNYFRLYEKLAGMTGTADTEAFEFSSIYKLDTVVVPTNRPMI 419

Query: 401 RIDHTDLLYPTLESKFRAVVEDVKTRHAKGQPILVGTVAVETSDLISRKLVEAGIPHEVL 460
R D DL+Y T K +A++ED+K R AKGQP+LVGT+++E S+L+S +L +AGI H VL
Sbjct: 420 RKDLPDLVYMTEAEKIQAIIEDIKERTAKGQPVLVGTISIEKSELVSNELTKAGIKHNVL 479

Query: 461 NAKNHFKEAQIIMNAGQRGAVTIATNMAGRGTDIKLG----------------------- 497
NAK H EA I+ AG AVTIATNMAGRGTDI LG
Sbjct: 480 NAKFHANEAAIVAQAGYPAAVTIATNMAGRGTDIVLGGSWQAEVAALENPTAEQIEKIKA 539

Query: 498 ------EGVRELGGLCVIGTERHESRRIDNQLRGRSGRQGDPGESQFYLSLEDDLMRRFG 551
+ V E GGL +IGTERHESRRIDNQLRGRSGRQGD G S+FYLS+ED LMR F
Sbjct: 540 DWQVRHDAVLEAGGLHIIGTERHESRRIDNQLRGRSGRQGDAGSSRFYLSMEDALMRIFA 599

Query: 552 SDRIKAFLDRMKLDEEDTVIKSGMLGRQVESAQKRVEGNNYDTRKQVLQYDDVMREQREI 611
SDR+ + ++ + + I+ + + + +AQ++VE N+D RKQ+L+YDDV +QR
Sbjct: 600 SDRVSGMMRKLGM-KPGEAIEHPWVTKAIANAQRKVESRNFDIRKQLLEYDDVANDQRRA 658

Query: 612 IYANRRDVITANRDLGPEIKAMIKRTIDRAVDAHARSNR---KDAIDAIVTFARTSLVPE 668
IY+ R +++ + D+ I ++ + +DA+ I + + +
Sbjct: 659 IYSQRNELLDVS-DVSETINSIREDVFKATIDAYIPPQSLEEMWDIPGLQERLKNDFDLD 717

Query: 669 ESIS--AKELRGLKDDQIKEKLYQRALAIYDQQLSKLRDQEAIIEFQKVLILMIVDNKWT 726
I+ + L ++ ++E++ +++ +Y ++ + E + F+K ++L +D+ W
Sbjct: 718 LPIAEWLDKEPELHEETLRERILAQSIEVYQRKEEVVG-AEMMRHFEKGVMLQTLDSLWK 776

Query: 727 EHIDALDQLRNAVGLRGYAQNNPVVEYQAEGFKMFQDMIGAIEFDVTRTMMKAQIH-EQE 785
EH+ A+D LR + LRGYAQ +P EY+ E F MF M+ +++++V T+ K Q+ +E
Sbjct: 777 EHLAAMDYLRQGIHLRGYAQKDPKQEYKRESFSMFAAMLESLKYEVISTLSKVQVRMPEE 836

Query: 786 RERASQRATTAAPQNIQSQQSANTDD-------------LPKVERNEACPCGSGKKFKNC 832
E Q+ A + Q QQ ++ DD KV RN+ CPCGSGKK+K C
Sbjct: 837 VEELEQQRRMEAERLAQMQQLSHQDDDSAAAAALAAQTGERKVGRNDPCPCGSGKKYKQC 896

Query: 833 HGR 835
HGR
Sbjct: 897 HGR 899


37  MGAS2096_Spy1705  MGAS2096_Spy1710  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag          DNBias  CDNBias  %GCBias   Product
MGAS2096_Spy1705  -1      19       5.183464  sugar ABC transporter ATP-binding protein
MGAS2096_Spy1706  -1      21       5.303109  hypothetical protein
MGAS2096_Spy1707  -1      23       5.476323  streptokinase
MGAS2096_Spy1708  0       24       6.174698  D-tyrosyl-tRNA(Tyr) deacylase
MGAS2096_Spy1709  0       21       5.934786  GTP pyrophosphokinase /
MGAS2096_Spy1710  2       20       5.964640  collagen-like surface protein
ORFs having significant similarity with Known Virulence factors
LocusTag          Hits     Score  E-value  Comments
MGAS2096_Spy1705  PF05272  35     6e-04    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 34.7 bits (79), Expect = 6e-04
Identities = 14/56 (25%), Positives = 20/56 (35%), Gaps = 9/56 (16%)

Query: 34 IVFVGPSGCGKSTTLRMIAGLEDISEGELKIGGEVVNDKSPKDRDIAMVFQNYALY 89
+V G G GKST + + GL+ S+ IG +D Y
Sbjct: 599 VVLEGTGGIGKSTLINTLVGLDFFSDTHFDIG---------TGKDSYEQIAGIVAY 645


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits    Score  E-value  Comments
MGAS2096_Spy1706  HTHFIS  34     7e-04    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 34.0 bits (78), Expect = 7e-04
Identities = 10/30 (33%), Positives = 19/30 (63%)

Query: 243 ALWSEHGNLVQTAQRLYIHRNSLQYKLDKF 272
AL + GN ++ A L ++RN+L+ K+ +
Sbjct: 444 ALTATRGNQIKAADLLGLNRNTLRKKIREL 473


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits         Score  E-value  Comments
MGAS2096_Spy1707  STREPKINASE  792    0.0      Streptococcus streptokinase protein signature.
		>STREPKINASE#Streptococcus streptokinase protein signature.

Length = 440

Score = 792 bits (2047), Expect = 0.0
Identities = 378/440 (85%), Positives = 401/440 (91%)

Query: 1 MKNYLSFGMFALLFALTFGTVKPVQAIAGPEWLLDRPSVNNSQLVVSVAGTVEGTNQEIS 60
MKNYLSFGMFALLFALTFGTV VQAIAGPEWLLDRPSVNNSQLVVSVAGTVEGTNQ+IS
Sbjct: 1 MKNYLSFGMFALLFALTFGTVNSVQAIAGPEWLLDRPSVNNSQLVVSVAGTVEGTNQDIS 60

Query: 61 LKFFEIDLTSRPAHGGKTEQGLSPKSKPFATNSSAMPHKLEKADLLKAIQEQLIANVHSN 120
LKFFEIDLTSRPAHGGKTEQGLSPKSKPFAT+S AM HKLEKADLLKAIQEQLIANVHSN
Sbjct: 61 LKFFEIDLTSRPAHGGKTEQGLSPKSKPFATDSGAMSHKLEKADLLKAIQEQLIANVHSN 120

Query: 121 DGYFEVIDFASDATITDRNGKVYFADRDDSVTLPTQPVQEFLLSGHVRVRPYQPKAVHNS 180
D YFEVIDFASDATITDRNGKVYFAD+D SVTLPTQPVQEFLLSGHVRVRPY+ K + N
Sbjct: 121 DDYFEVIDFASDATITDRNGKVYFADKDGSVTLPTQPVQEFLLSGHVRVRPYKEKPIQNQ 180

Query: 181 AERVNVNYEVSFVSETGNLDFTPSLKERYHLTTLAVGDSLSSQELAAIAQFILSKEHPDY 240
A+ V+V Y V F + DF P LK+ L TLA+GD+++SQEL A AQ IL+K HP Y
Sbjct: 181 AKSVDVEYTVQFTPLNPDDDFRPGLKDTKLLKTLAIGDTITSQELLAQAQSILNKNHPGY 240

Query: 241 IITKRDSSIVTHDNDIFRTILPMDQEFTYHIKDREQAYKANSKTGIVEKTNNTDLISEKY 300
I +RDSSIVTHDNDIFRTILPMDQEFTY +K+REQAY+ N K+G+ E+ NNTDLISEKY
Sbjct: 241 TIYERDSSIVTHDNDIFRTILPMDQEFTYRVKNREQAYRINKKSGLNEEINNTDLISEKY 300

Query: 301 YVLKKGEEPYDPFDRSHLKLFTIKYVDVDTNELLKSEQLLTASERNLDFRDLYDPRDKAK 360
YVLKKGE+PYDPFDRSHLKLFTIKYVDVDTNELLKSEQLLTASERNLDFRDLYDPRDKAK
Sbjct: 301 YVLKKGEKPYDPFDRSHLKLFTIKYVDVDTNELLKSEQLLTASERNLDFRDLYDPRDKAK 360

Query: 361 LLYNNLDAFGIMDYTLTGKVEDNHNDTNRIITVYMGKRPEGENASYHLAYDKDRYTEEER 420
LLYNNLDAFGIMDYTLTGKVEDNH+DTNRIITVYMGKRPEGENASYHLAYDKDRYTEEER
Sbjct: 361 LLYNNLDAFGIMDYTLTGKVEDNHDDTNRIITVYMGKRPEGENASYHLAYDKDRYTEEER 420

Query: 421 EVYSYLRYTGTPIPDNPKDK 440
EVYSYLRYTGTPIPDNP DK
Sbjct: 421 EVYSYLRYTGTPIPDNPNDK 440


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits        Score  E-value  Comments
MGAS2096_Spy1710  GPOSANCHOR  60     9e-12    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 59.7 bits (144), Expect = 9e-12
Identities = 42/124 (33%), Positives = 54/124 (43%), Gaps = 3/124 (2%)

Query: 305 EKAPEKSPEVTPTPEMPEQP---GEKAPEKSPEVTPTPEMPEQPGEKAPEKSKEVTPAPE 361
K E+S ++T + Q E K E+ + KA +
Sbjct: 416 NKELEESKKLTEKEKAELQAKLEAEAKALKEKLAKQAEELAKLRAGKASDSQTPDAKPGN 475

Query: 362 KPADKEANQTPERRNGNMAKTPVANNHRRLPATGEQANPFFTAAAVAVMTTAGVLAVTKR 421
K + N K P+ R+LP+TGE ANPFFTAAA+ VM TAGV AV KR
Sbjct: 476 KAVPGKGQAPQAGTKPNQNKAPMKETKRQLPSTGETANPFFTAAALTVMATAGVAAVVKR 535

Query: 422 KENN 425
KE N
Sbjct: 536 KEEN 539


38  MGAS2096_Spy1748  MGAS2096_Spy1767  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag          DNBias  CDNBias  %GCBias    Product
MGAS2096_Spy1748  2       21       0.254514   transcriptional regulator
MGAS2096_Spy1749  0       22       0.875969   hypothetical protein
MGAS2096_Spy1750  -1      22       1.009644   immunogenic protein
MGAS2096_Spy1751  -1      22       1.002950   immunogenic secreted protein
MGAS2096_Spy1752  -1      22       -0.300766  two-component system histidine kinase
MGAS2096_Spy1753  -1      21       -0.517953  two-component response regulator
MGAS2096_Spy1754  -2      22       -0.282377  ABC transporter permease
MGAS2096_Spy1755  0       23       2.761563   ABC transporter ATP-binding protein
MGAS2096_Spy1756  1       23       3.644006   periplasmic protein of efflux system
MGAS2096_Spy1758  2       15       2.436585   hypothetical protein
MGAS2096_Spy1757  2       15       2.735028   hypothetical protein
MGAS2096_Spy1759  1       20       3.153015   hypothetical protein
MGAS2096_Spy1760  2       21       3.200448   fibronectin-binding protein
MGAS2096_Spy1761  1       20       1.875217   fibronectin-binding protein
MGAS2096_Spy1762  1       21       0.915864   fibronectin-binding protein
MGAS2096_Spy1763  2       32       2.281486   putative cytoplasmic protein
MGAS2096_Spy1764  3       34       1.001752   foldase PrsA
MGAS2096_Spy1765  0       20       -1.941244  hypothetical protein
MGAS2096_Spy1766  -1      22       -2.334470  streptopain
MGAS2096_Spy1767  0       21       -2.253239  streptopain
ORFs having significant similarity with Known Virulence factors
LocusTag          Hits     Score  E-value  Comments
MGAS2096_Spy1748  PF05043  562    0.0      Transcriptional activator
		>PF05043#Transcriptional activator

Length = 493

Score = 562 bits (1449), Expect = 0.0
Identities = 109/473 (23%), Positives = 217/473 (45%), Gaps = 20/473 (4%)

Query: 34 ELSKALNISMLTLQTCLTNMQ-FMKEVGGITYKNGYITIWYHQHCGLQEVYQKALRHSQS 92
EL++ LN + ++ L++++ ++ + NG I ++ VY +HS
Sbjct: 30 ELAELLNCTERAVKDDLSHVKSAFPDLIFHSSTNGIRIINT-DDSDIEMVYHHFFKHSTH 88

Query: 93 FKLLETLFFRDFNSLEELAEELFVSISTLKRLIKKTNAYLTHTFGITILTSPVQVSGDEH 152
F +LE +FF + E + +E ++S S+L R+I + N + F + +PVQ+ G+E
Sbjct: 89 FSILEFIFFNEGCQAESICKEFYISSSSLYRIISQINKVIKRQFQFEVSLTPVQIIGNER 148

Query: 153 QIRLFYLKYFSEAYKISEWPFGEMLNLKNCERLLSLMIKEVDVRVNFTLFQHLKILSSVN 212
IR F+ +YFSE Y EWPF E + + +LL L+ KE +N + + LK+L N
Sbjct: 149 DIRYFFAQYFSEKYYFLEWPF-ENFSSEPLSQLLELVYKETSFPMNLSTHRMLKLLLVTN 207

Query: 213 LIRYYKGYSAVYDNKKTSHRFSQLIQSSLEIQDLSRLFYLKFGLYLDETTIAEMFSNHVN 272
L R G+ D + + + + I+ +++ F ++ + LDE + ++F ++
Sbjct: 208 LYRIKFGHFMEVDKDSFNDQSLDFLMQAEGIEGVAQSFESEYNISLDEEVVCQLFVSYFQ 267

Query: 273 DQLEIGYAF--DSIKQDSPTGCRKVTNWVHLL----DELEIRLNLSVTNKYEVAVILHNT 326
I + +K+DS V HLL D++ ++ + + NK + LHNT
Sbjct: 268 KMFFIDESLFMKCVKKDS-----YVEKSYHLLSDFIDQISVKYQIEIENKDNLIWHLHNT 322

Query: 327 TVLKEEDITANYLFFDYKKSYLNFYKQEHPHLYKACVAGVEKLMRSEKEPISKELTNQLI 386
L +++ ++ FD K + + ++ P + + + + S + N L
Sbjct: 323 AHLYRQELFTEFILFDQKGNTIRNFQNIFPKFVSDVKKELSHYLETLEVCSSSMMVNHLS 382

Query: 387 YAFFITWENSFLKVNQKDEKIRLLVI----ERSFNSVGNFLKKYIGEFFSITNFNELDAL 442
Y F ++ + + Q K+++LV+ + V L Y F + + EL+
Sbjct: 383 YTFITHTKHLVINLLQNQPKLKVLVMSNFDQYHAKFVAETLSYYCSNNFELEVWTELELS 442

Query: 443 TIDLEEIEKQYDVIVTDVMVGKSDELEIFFFYKMIPEAIIDKLNAFLNISFAD 495
LE + YD+I+++ ++ + + + + ++I LNA + I +
Sbjct: 443 KESLE--DSPYDIIISNFIIPPIENKRLIYSNNINTVSLIYLLNAMMFIRLDE 493


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits     Score  E-value  Comments
MGAS2096_Spy1751  PF03544  34     0.001    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 33.8 bits (77), Expect = 0.001
Identities = 14/96 (14%), Positives = 28/96 (29%), Gaps = 7/96 (7%)

Query: 107 VTKQPDSSDQSTPS------PKDQSSQKESQNKDGRPTPSPDQQKDQTPDKTPEKSADKT 160
V + + + P P D ++ P P+ + + P+ E
Sbjct: 37 VHQVIELPAPAQPISVTMVAPADLE-PPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIE 95

Query: 161 PEKGPEKATEKTPEPNRDAPKPIQPPLAAAAPVFAP 196
K K K + + ++P + A F
Sbjct: 96 KPKPKPKPKPKPVKKVEQPKRDVKPVESRPASPFEN 131



Score = 33.8 bits (77), Expect = 0.001
Identities = 20/84 (23%), Positives = 30/84 (35%), Gaps = 6/84 (7%)

Query: 139 PSPDQQKD---QTPDKTPEK---SADKTPEKGPEKATEKTPEPNRDAPKPIQPPLAAAAP 192
P+P Q P P PE E PEP ++AP I+ P P
Sbjct: 44 PAPAQPISVTMVAPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKP 103

Query: 193 VFAPWRESDKDLSKLKPSSRSSAA 216
P ++ ++ +KP A+
Sbjct: 104 KPKPVKKVEQPKRDVKPVESRPAS 127


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits         Score  E-value  Comments
MGAS2096_Spy1752  MECHCHANNEL  32     0.002    Bacterial mechano-sensitive ion channel signature.
		>MECHCHANNEL#Bacterial mechano-sensitive ion channel signature.

Length = 136

Score = 32.1 bits (73), Expect = 0.002
Identities = 14/62 (22%), Positives = 28/62 (45%), Gaps = 8/62 (12%)

Query: 10 VINGLIIVVVTSILLVLYFAMPIYYTKVKDKEVKREFDQTSKQIKGKTVTEIRDILTKKI 69
V + LI+ ++ A+ + + KE +K+ +TEIRD+L ++
Sbjct: 82 VFDFLIVA------FAIFMAIKLINKLNRKKEEPAAAPAPTKEEV--LLTEIRDLLKEQN 133

Query: 70 NK 71
N+
Sbjct: 134 NR 135


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits    Score  E-value  Comments
MGAS2096_Spy1753  HTHFIS  83     1e-20    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 83.0 bits (205), Expect = 1e-20
Identities = 31/128 (24%), Positives = 55/128 (42%), Gaps = 1/128 (0%)

Query: 3 KILVVEDDDTISQVICEFLKANNYDPDCVFDGQAALDKWQTTSYDLIILDIMLPSLSGLE 62
ILV +DD I V+ + L YD + DL++ D+++P + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 63 VLKTIRKT-SDVPIIMLTALDDEYTQLVSFNHLISDYVTKPFSPLILIKRIENVLRVSTP 121
+L I+K D+P+++++A + T + + DY+ KPF LI I L
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 122 DEKRQIGD 129
+ D
Sbjct: 125 RPSKLEDD 132


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits       Score  E-value  Comments
MGAS2096_Spy1756  RTXTOXIND  54     4e-10    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 54.4 bits (131), Expect = 4e-10
Identities = 34/144 (23%), Positives = 55/144 (38%), Gaps = 10/144 (6%)

Query: 60 DISLTLAGEVTANNSSKVKIDSSKGEVKDVFVKKGDVVKVGQPLFSYETSQRLTAQSSEF 119
+I T G++T + SK VK++ VK+G+ V+ G L +LTA +E
Sbjct: 81 EIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLL------KLTALGAEA 134

Query: 120 DVQTKANQLQVAKTNAALKWETYNRKVNEINTLKSRYNTAPDESLLEQIRSAEDSVSQAL 179
D + Q + A L+ Y I K PDE + + E +L
Sbjct: 135 DTL----KTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSL 190

Query: 180 SDAKTADSDVKTAQIELDKANATA 203
+ + + Q EL+ A
Sbjct: 191 IKEQFSTWQNQKYQKELNLDKKRA 214



Score = 39.4 bits (92), Expect = 2e-05
Identities = 28/180 (15%), Positives = 61/180 (33%), Gaps = 16/180 (8%)

Query: 120 DVQTKANQLQVAKTNAALKWETYNRKVNEINTLKSRYNTAPDESL---LEQIRSAEDSVS 176
D + ++ +AK + Y VNE+ KS+ E L E + +
Sbjct: 239 DFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKN 298

Query: 177 QALSDAKTADSDVKTAQIELDKANATATMEKGKLEYDTVKSDTAGTIVSLNTDLPNQSKS 236
+ L + ++ +EL K + + +++ + + L
Sbjct: 299 EILDKLRQTTDNIGLLTLELAKNEE-------RQQASVIRAPVSVKVQQLKVHTEGGVV- 350

Query: 237 KKENETFMEII-DKSKMLVKGNISEFDRDKLKIGQKVEV-IDRKDNSK--KWTGKVTQVG 292
ET M I+ + + V + D + +GQ + ++ ++ GKV +
Sbjct: 351 -TTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNIN 409


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits         Score  E-value  Comments
MGAS2096_Spy1760  IGASERPTASE  45     6e-07    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 45.4 bits (107), Expect = 6e-07
Identities = 48/262 (18%), Positives = 80/262 (30%), Gaps = 13/262 (4%)

Query: 361 KPAEDERSSQATEPANPSKENSSTDASQGSHDSENPATDSPSQPQASDQGNHQSQVPNAK 420
+ QA P+ PS + PAT S + ++ +S+
Sbjct: 995 TNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKN 1054

Query: 421 PQTDKTDTP-NAPVPPQNTPNAPAIPQDTPKVPEAGGQSGPAGNAEEKAPDSGPKESGQ- 478
Q T N V + N A Q T +V ++G ++ E K + KE
Sbjct: 1055 EQDATETTAQNREVAKEAKSNVKANTQ-TNEVAQSGSETKETQTTETKETATVEKEEKAK 1113

Query: 479 ---SSSKESPSVGEDTTPNQPDVLVGGQREPIDITEDTITDAPPTVSGHNASTQPQSVVE 535
++E P V +P Q Q E + + + PTV+ +Q + +
Sbjct: 1114 VETEKTQEVPKVTSQVSPKQE------QSETVQPQAEPARENDPTVNIKEPQSQTNTTAD 1167

Query: 536 DTAPQRPDVLVGGQSEPIDITQDTQPGMSGSNDATVINEDTKPKRVFHFDNKEPQASEKA 595
P + Q T +T + N T+P NK ++
Sbjct: 1168 TEQPAKETSSNVEQPVTESTTVNTGNSVV-ENPENTTPATTQPTVNSESSNKPKNRHRRS 1226

Query: 596 AEQKLAPHDSHTTPQTSDDTAA 617
+ TT T A
Sbjct: 1227 VRSVPHNVEPATTSSNDRSTVA 1248



Score = 35.0 bits (80), Expect = 0.001
Identities = 23/162 (14%), Positives = 51/162 (31%), Gaps = 8/162 (4%)

Query: 293 LSEEAPNLRTAKEKVKTLKGILHDYYVDIKEPEKAKKYRVEDETLPSQPEAPAKPEAPAP 352
E +T + K V+ ++ ++ K + Q E PA
Sbjct: 1088 SGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPAR 1147

Query: 353 SPSP--------APGQKPAEDERSSQATEPANPSKENSSTDASQGSHDSENPATDSPSQP 404
P + A+ E+ ++ T ST + G+ ENP +P+
Sbjct: 1148 ENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATT 1207

Query: 405 QASDQGNHQSQVPNAKPQTDKTDTPNAPVPPQNTPNAPAIPQ 446
Q + ++ N ++ ++ N ++ + +
Sbjct: 1208 QPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVAL 1249



Score = 33.9 bits (77), Expect = 0.002
Identities = 34/217 (15%), Positives = 65/217 (29%), Gaps = 31/217 (14%)

Query: 322 KEPEKAKKYRVEDETLPSQPEAPAKPEAPAPSPSPAPGQKPAEDERSSQATEP-ANPSKE 380
K + ET + E AK E P + + + S+ +P A P++E
Sbjct: 1089 GSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARE 1148

Query: 381 NSSTDASQGSHDSENPATDSPSQPQASDQGNHQSQVPNAKPQTDKTDTPNAPVPPQNTPN 440
N T + + ++ + Q + + + + P + T T + P+NT
Sbjct: 1149 NDPTVNIK---EPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNT-GNSVVENPENTTP 1204

Query: 441 APAIPQDTPKVPEAGGQSGPAGNAEEKAPDSGPKESGQSSSKESPSVGEDTTPNQPDVLV 500
A P + PK + S + P E T + D
Sbjct: 1205 ATTQPTVNSESSNK------------------PKNRHRRSVRSVPHNVEPATTSSNDRST 1246

Query: 501 GGQREPIDITEDTITDAPPTVSGHNASTQPQSVVEDT 537
+ + + +A + Q V +
Sbjct: 1247 VALCDLTSTNTNAVLS--------DARAKAQFVALNV 1275


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits        Score  E-value  Comments
MGAS2096_Spy1761  GPOSANCHOR  36     1e-04    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 35.8 bits (82), Expect = 1e-04
Identities = 24/89 (26%), Positives = 36/89 (40%), Gaps = 12/89 (13%)

Query: 191 IEEDTKPKRFFHFDN-EPQAPEKPKEQPSNSLPQ----------APVYKAAHHLPASGDK 239
EE K + D+ P A K P AP+ + LP++G+
Sbjct: 452 AEELAKLRAGKASDSQTPDAKPGNKAVPGKGQAPQAGTKPNQNKAPMKETKRQLPSTGET 511

Query: 240 REASFTIVALTIIGAAGLLSKKRRDTEEN 268
FT ALT++ AG+ + +R EEN
Sbjct: 512 ANPFFTAAALTVMATAGVAAVVKR-KEEN 539


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits         Score  E-value  Comments
MGAS2096_Spy1762  IGASERPTASE  32     0.013    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 32.0 bits (72), Expect = 0.013
Identities = 25/145 (17%), Positives = 53/145 (36%), Gaps = 6/145 (4%)

Query: 24 APTVLGQEVSNTEASASSTASVDATASGTTASGASSEATVATTNGGSQSTQVAAETTPQP 83
PTV +E + + + T S + TV T N ++ + T QP
Sbjct: 1150 DPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQP 1209

Query: 84 QVQTSEQPAATSTSLASSTSSSEDKAPKAASTKSSSA-----TVASSSNGSNQGAGTEVE 138
V + + S S + P S+ S ++++N A + +
Sbjct: 1210 TVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCDLTSTNTNAVLSDARAKAQ 1269

Query: 139 PQMMDVEQYKVNKEKTELTVKDDNQ 163
++V + V++ ++L + ++ Q
Sbjct: 1270 FVALNVGKA-VSQHISQLEMNNEGQ 1293


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits         Score  E-value  Comments
MGAS2096_Spy1766  STREPTOPAIN  59     6e-14    Streptopain (C10) cysteine protease family signature.
		>STREPTOPAIN#Streptopain (C10) cysteine protease family signature.

Length = 398

Score = 59.3 bits (143), Expect = 6e-14
Identities = 39/102 (38%), Positives = 56/102 (54%), Gaps = 7/102 (6%)

Query: 2 EMHFVRTEPEARRIAETFCAENTQTKTPMRVQQLSYPSDTDHSGGEL-----YIYALSPA 56
+ +F R E EA+ A TF ++ K R + D + GGEL Y+Y +S
Sbjct: 28 DQNFARNEKEAKDSAITFIQKSAAIKAGARSAE-DIKLDKVNLGGELSGSNMYVYNISTG 86

Query: 57 GFIIVSGDTRAHTILGYSFDNNLDLN-HDNVRSMVEAYQKQI 97
GF+IVSGD R+ ILGYS + D N +N+ S +E+Y +QI
Sbjct: 87 GFVIVSGDKRSPEILGYSTSGSFDANGKENIASFMESYVEQI 128


ORFs having significant similarity with Known Virulence factors
LocusTag          Hits         Score  E-value  Comments
MGAS2096_Spy1767  STREPTOPAIN  710    0.0      Streptopain (C10) cysteine protease family signature.
		>STREPTOPAIN#Streptopain (C10) cysteine protease family signature.

Length = 398

Score = 710 bits (1833), Expect = 0.0
Identities = 396/398 (99%), Positives = 397/398 (99%)

Query: 1 MNKKKLGIRLLSLLALGGFVLANPVFADQNFARNEKEAKDSAITFIQKSAAIKAGARSAE 60
MNKKKLG+RLLSLLALGGFVLANPVFADQNFARNEKEAKDSAITFIQKSAAIKAGARSAE
Sbjct: 1 MNKKKLGVRLLSLLALGGFVLANPVFADQNFARNEKEAKDSAITFIQKSAAIKAGARSAE 60

Query: 61 DIKLDKVNLGGELSGSNMYVYNISTGGFVIVSGDKRSPEILGYSTSGSFDANGKENIASF 120
DIKLDKVNLGGELSGSNMYVYNISTGGFVIVSGDKRSPEILGYSTSGSFDANGKENIASF
Sbjct: 61 DIKLDKVNLGGELSGSNMYVYNISTGGFVIVSGDKRSPEILGYSTSGSFDANGKENIASF 120

Query: 121 MESYVEQIKENKKLDTTYAGTAEIKQPVVKSLLDSKGIHYNQGNPYNLLTPVIEKVKPGE 180
MESYVEQIKENKKLDTTYAGTAEIKQPVVKSLLDSKGIHYNQGNPYNLLTPVIEKVKPGE
Sbjct: 121 MESYVEQIKENKKLDTTYAGTAEIKQPVVKSLLDSKGIHYNQGNPYNLLTPVIEKVKPGE 180

Query: 181 QSFVGQHAATGCVATATAQIMKYHNYPNKGLKDYTYTLSSNNPYFNHPKNLFAAISTRQY 240
QSFVGQHAATGCVATATAQIMKYHNYPNKGLKDYTYTLSSNNPYFNHPKNLFAAISTRQY
Sbjct: 181 QSFVGQHAATGCVATATAQIMKYHNYPNKGLKDYTYTLSSNNPYFNHPKNLFAAISTRQY 240

Query: 241 NWNNILPTYSGRESNVQKMAISELMADVGISVDMDYGPSSGSAGSSRVQRALKENFGYNQ 300
NWNNILPTYSGRESNVQKMAISELMADVGISVDMDYGPSSGSAGSSRVQRALKENFGYNQ
Sbjct: 241 NWNNILPTYSGRESNVQKMAISELMADVGISVDMDYGPSSGSAGSSRVQRALKENFGYNQ 300

Query: 301 SVHQINRSDFSKQDWEAQIDKELSQNQPVYYQGVGKVGGHAFVIDGADGRNFYHVNWGWG 360
SVHQINR DFSKQDWEAQIDKELSQNQPVYYQGVGKVGGHAFVIDGADGRNFYHVNWGWG
Sbjct: 301 SVHQINRGDFSKQDWEAQIDKELSQNQPVYYQGVGKVGGHAFVIDGADGRNFYHVNWGWG 360

Query: 361 GVSDGFFRLDALNPSALGTGGGAGGFNGYQSAVVGIKP 398
GVSDGFFRLDALNPSALGTGGGAGGFNGYQSAVVGIKP
Sbjct: 361 GVSDGFFRLDALNPSALGTGGGAGGFNGYQSAVVGIKP 398



 
Contact Sachin Pundhir for Bugs/Comments.