PredictBias

Identification of genomic and pathogenicity islands in prokaryotic genomes
 
A) Input parameters
Genome: example.gbk
Threshold dinucleotide bias: 2
Threshold codon bias: 4
Threshold %GC bias: 3
E-value (RPSBlast): 0.05
Genome (non-pathogenic):
 
B) Compare a potential GI or PAI in related non-pathogenic sp. (phylogenetic tree)
 
C) Potential GIs and PAIs in Bacteria (download)
S.No  Start        End          Bias  Virulence  Insertion elements  Prediction
1     GBS_RS01035  GBS_RS01080  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
GBS_RS01035  2       13       1.907908   metal ABC transporter ATP-binding protein
GBS_RS01040  3       19       2.956740   metal ABC transporter permease
GBS_RS01045  3       20       3.034281   DNA/RNA non-specific endonuclease
GBS_RS01050  3       19       3.114657   tyrosine--tRNA ligase
GBS_RS01055  2       18       2.794325   penicillin-binding protein PBP1B
GBS_RS01060  3       22       2.983688   DNA-directed RNA polymerase subunit beta
GBS_RS01065  0       17       1.684812   DNA-directed RNA polymerase subunit beta'
GBS_RS01070  0       23       -3.027175  DUF1033 family protein
GBS_RS01075  4       24       -4.049186  type II secretion system F family protein
GBS_RS01080  0       18       -3.331919  prepilin-type N-terminal cleavage/methylation
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS01075  BCTERIALGSPF  81     5e-19    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F

signature.
Length = 408

Score = 81.0 bits (200), Expect = 5e-19
Identities = 58/300 (19%), Positives = 128/300 (42%), Gaps = 16/300 (5%)

Query: 80 DRMNKALLEGKDLSKMLGELG--FSDTVITQVALADLHGNISRSLLKIESYLANLLLVRK 137
+ ++EG L+ + F VA + G++ L ++ Y +R
Sbjct: 108 AAVRSKVMEGHSLADAMKCFPGSFERLYCAMVAAGETSGHLDAVLNRLADYTEQRQQMRS 167

Query: 138 KVIEVATYPLILLTFLVLIMIGLRHYLMPQLGEN--------NFATSLITNVPNIF---- 185
++ + YP +L + ++ L ++P++ E +T ++ + +
Sbjct: 168 RIQQAMIYPCVLTVVAIAVVSILLSVVVPKVVEQFIHMKQALPLSTRVLMGMSDAVRTFG 227

Query: 186 -LLLLAVVLIFSLIFYIVQKRLSRIKVACFLTTIPLVGSYVKLYLTAYYAREWGNLLSQG 244
+LLA++ F ++++ R+ L +PL+G + TA YAR L +
Sbjct: 228 PWMLLALLAGFMAFRVMLRQEKRRVSFHRRLLHLPLIGRIARGLNTARYARTLSILNASA 287

Query: 245 VELDQIVKVMQNQKSKLF-REIGYDMEEGFLSGKAFHQKVLDYPFFLTELSLMIEYGQVK 303
V L Q +++ + S + R + G + H+ + F + MI G+
Sbjct: 288 VPLLQAMRISGDVMSNDYARHRLSLATDAVREGVSLHKALEQTALFPPMMRHMIASGERS 347

Query: 304 AKLGTELDIYADEKWEDFFTKLARATQLIQPVIFIFVALIIVMIYAAMLLPMYQNMEILS 363
+L + L+ AD + +F +++ A L +P++ + +A +++ I A+L P+ Q ++S
Sbjct: 348 GELDSMLERAADNQDREFSSQMTLALGLFEPLLVVSMAAVVLFIVLAILQPILQLNTLMS 407
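
Each alignment above is preceded by two statistics lines (`Score = ... Expect = ...` and `Identities = ... Positives = ... Gaps = ...`). The following is a minimal sketch of how those lines could be read programmatically; the function name `parse_hit_stats` and the dictionary keys are illustrative assumptions, not part of PredictBias or of any BLAST library.

```python
import re

# Patterns for the two statistics lines printed before each alignment.
SCORE_RE = re.compile(
    r"Score\s*=\s*([\d.]+)\s*bits\s*\((\d+)\),\s*Expect\s*=\s*(\S+)"
)
FRACTION_RE = re.compile(r"(Identities|Positives|Gaps)\s*=\s*(\d+)/(\d+)")


def parse_hit_stats(text):
    """Extract bit score, raw score, E-value and match fractions from one hit.

    Hypothetical helper: the returned keys are chosen for illustration only.
    """
    score = SCORE_RE.search(text)
    if score is None:
        raise ValueError("no 'Score = ... Expect = ...' line found")
    stats = {
        "bits": float(score.group(1)),
        "raw_score": int(score.group(2)),
        "evalue": float(score.group(3)),
    }
    # The 'Gaps' field is omitted for ungapped alignments, so collect
    # whichever of the three fractions are actually present.
    for name, num, den in FRACTION_RE.findall(text):
        stats[name.lower()] = (int(num), int(den))
    return stats


example = """Score = 81.0 bits (200), Expect = 5e-19
Identities = 58/300 (19%), Positives = 128/300 (42%), Gaps = 16/300 (5%)"""

stats = parse_hit_stats(example)
identical, aligned = stats["identities"]
print(f"{stats['bits']} bits, E = {stats['evalue']:g}, "
      f"identity = {identical / aligned:.1%}")
```

Working from the raw counts rather than the printed percentages avoids the rounding in the report (58/300 is shown as 19%).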


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS01080  BCTERIALGSPG  46     1e-09    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G

signature.
Length = 145

Score = 46.4 bits (110), Expect = 1e-09
Identities = 21/64 (32%), Positives = 36/64 (56%), Gaps = 1/64 (1%)

Query: 9 KDKKVKAFTLLEMLVVLIIISVLLLLFVPNLSKQKESVTRTGNAAVVKVVESQAELFELQ 68
K + FTLLE++VV++II VL L VPNL KE + + + +E+ ++++L
Sbjct: 3 ATDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYKL- 61

Query: 69 ETGR 72
+
Sbjct: 62 DNHH 65
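
The input parameters list an RPSBlast E-value of 0.05, and every hit reported on this page has an E-value below that figure. As a rough sketch of how such a cutoff might be applied to a list of hits (the `Hit` type, the `filter_hits` function, the inclusive comparison, and the stricter 0.01 cutoff are all assumptions made for illustration; the page does not state how PredictBias applies the threshold):

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """One virulence-profile hit; field names are illustrative."""
    locus_tag: str
    profile: str
    evalue: float


def filter_hits(hits, max_evalue=0.05):
    """Keep hits whose E-value does not exceed max_evalue.

    Whether the real cutoff is inclusive is an assumption.
    """
    return [h for h in hits if h.evalue <= max_evalue]


# E-values copied from hit summaries reported on this page.
hits = [
    Hit("GBS_RS01075", "BCTERIALGSPF", 5e-19),
    Hit("GBS_RS01080", "BCTERIALGSPG", 1e-09),
    Hit("GBS_RS01385", "BINARYTOXINA", 0.042),
    Hit("GBS_RS01395", "BCTERIALGSPD", 0.006),
]

reported = filter_hits(hits)                 # cutoff from the input parameters
strict = filter_hits(hits, max_evalue=0.01)  # hypothetical stricter cutoff

for h in strict:
    print(h.locus_tag, h.profile, h.evalue)
```

With the stricter, hypothetical cutoff the weak BINARYTOXINA match (E = 0.042) drops out, which shows how sensitive the "Virulence" flag could be to this parameter.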


2     GBS_RS01295  GBS_RS11160  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
GBS_RS01295  2       19       2.918477   23S rRNA
GBS_RS01300  3       20       3.788162   NYN domain-containing protein
GBS_RS01305  1       26       4.186280   DegV family protein
GBS_RS01310  2       28       3.909762   helix-turn-helix domain-containing protein
GBS_RS01315  4       28       3.651171   IS30-like element ISSag9 family transposase
GBS_RS01320  1       25       3.411880   50S ribosomal protein L13
GBS_RS01325  1       25       2.625251   30S ribosomal protein S9
GBS_RS01330  2       29       1.261485   site-specific integrase
GBS_RS01335  4       23       -0.241252  helix-turn-helix transcriptional regulator
GBS_RS01340  3       24       -0.316409  hypothetical protein
GBS_RS01345  2       20       -1.128338  helix-turn-helix domain-containing protein
GBS_RS01350  2       18       -2.225981  replication initiator protein A
GBS_RS01355  1       17       -2.125368  hypothetical protein
GBS_RS01365  1       14       -2.204735  hypothetical protein
GBS_RS01370  0       13       -2.930986  hypothetical protein
GBS_RS01375  0       15       -3.653787  plasmid recombination protein
GBS_RS01380  1       23       -5.415437  hypothetical protein
GBS_RS01385  0       21       -5.064645  type II toxin-antitoxin system RelE/ParE family
GBS_RS01390  0       23       -4.802542  hypothetical protein
GBS_RS01395  1       24       -5.703797  WXG100 family type VII secretion target
GBS_RS01400  3       25       -5.628973  hypothetical protein
GBS_RS01405  2       21       -4.541761  hypothetical protein
GBS_RS01410  3       19       -3.841192  hypothetical protein
GBS_RS01415  4       20       -4.083205  hypothetical protein
GBS_RS01420  3       18       -3.371840  hypothetical protein
GBS_RS01425  1       17       -2.423799  hypothetical protein
GBS_RS01440  4       16       -2.045085  Rgg/GadR/MutR family transcriptional regulator
GBS_RS01445  3       16       -1.375055  MFS transporter
GBS_RS11160  2       17       -1.847789  transporter
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS01385  BINARYTOXINA  26     0.042    Clostridial binary toxin A signature.
		>BINARYTOXINA#Clostridial binary toxin A signature.

Length = 454

Score = 25.8 bits (56), Expect = 0.042
Identities = 16/88 (18%), Positives = 40/88 (45%), Gaps = 4/88 (4%)

Query: 18 LKEIRDYISQNYSSTSGQRKMEQIINDIEKLEVFPEVGFDADEKYGSEISNYHSTRGYTL 77
++ D++ ++ ++K + + + L+ + + +K +ISNY TR Y
Sbjct: 44 IERPEDFLKDKENAIQWEKKEAERVE--KNLDTLEKEALELYKKDSEQISNYSQTRQYF- 100

Query: 78 SKDYIVLYHIEEEENRVVIDYLLPTRSD 105
DY + + E+E + + + + + D
Sbjct: 101 -YDYQIESNPREKEYKNLRNAISKNKID 127


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS01395  BCTERIALGSPD  28     0.006    Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D

signature.
Length = 660

Score = 27.6 bits (61), Expect = 0.006
Identities = 12/40 (30%), Positives = 22/40 (55%)

Query: 57 ELSPKITQFAQLLEDINQQLLKVADVVEQTDSDIASQINK 96
++ P+I + +L +I Q++ VAD T SD+ + N
Sbjct: 489 KVKPQINEGDSVLLEIEQEVSSVADAASSTSSDLGATFNT 528


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS01445  TCRTETA       34     0.001    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 33.6 bits (77), Expect = 0.001
Identities = 44/186 (23%), Positives = 77/186 (41%), Gaps = 18/186 (9%)

Query: 20 VLYDYGNSTWIASMGGLGQKILGIYQIVELLVSIVLNPFGGALADRFQRRKILLITDAIC 79
+L D +S + + G+ +L +Y +++ + P GAL+DRF RR +LL++ A
Sbjct: 31 LLRDLVHSNDVTAHYGI---LLALYALMQFACA----PVLGALSDRFGRRPVLLVSLAGA 83

Query: 80 AIMCFLLSFIGDDKVMVYGLIVANAILAVSNAFSSPAYKSYIPEIVDKADIITYNANLET 139
A+ +++ V+ G IVA A +YI +I D + + +
Sbjct: 84 AVDYAIMATAPFLWVLYIGRIVAGITGAT-----GAVAGAYIADITDGDERARHFGFMSA 138

Query: 140 IVQIISVSSPVLGFLIFNNFGIRITLIVDA----ITFLISFLFLY-AIKVERVQLSKQEK 194
V+ PVLG L+ F A + FL L + K ER L ++
Sbjct: 139 CFGFGMVAGPVLGGLM-GGFSPHAPFFAAAALNGLNFLTGCFLLPESHKGERRPLRREAL 197

Query: 195 VAIKNI 200
+ +
Sbjct: 198 NPLASF 203


3     GBS_RS01575  GBS_RS01635  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
GBS_RS01575  -1      16       -3.090665  cell division protein FtsK
GBS_RS01580  0       18       -4.677152  hypothetical protein
GBS_RS01585  0       22       -5.869958  hypothetical protein
GBS_RS01590  0       25       -7.279065  hypothetical protein
GBS_RS01595  0       23       -7.154659  GNAT family N-acetyltransferase
GBS_RS01600  -1      20       -6.979304  GNAT family N-acetyltransferase
GBS_RS01605  0       20       -6.344416  GNAT family N-acetyltransferase
GBS_RS01610  2       19       -6.473961  sigma factor regulator N-terminal
GBS_RS01615  3       18       -5.752415  RNA polymerase sigma factor
GBS_RS01625  3       16       -4.504466  TetR/AcrR family transcriptional regulator
GBS_RS01630  1       13       -1.900579  ABC transporter permease
GBS_RS01635  2       13       -1.398831  ABC transporter ATP-binding protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS01575  TYPE3OMBPROT  32     0.005    Type III secretion system outer membrane B protein ...
		>TYPE3OMBPROT#Type III secretion system outer membrane B protein

family signature.
Length = 538

Score = 32.0 bits (72), Expect = 0.005
Identities = 19/75 (25%), Positives = 40/75 (53%), Gaps = 3/75 (4%)

Query: 238 DSTYLGIVDGKGADLYALGKLLQEELGEQIAIGSSPQMLAKLSREFVDIMNAR-FEVIKQ 296
D+ LG V+ + + L K++ EE+ Q+ I + ++ K+ + F D +N + + + +
Sbjct: 99 DTYKLGEVNKR--HINELNKVISEEIRAQLGIKNKKELQTKIKQIFTDYLNNKNWGPVNK 156

Query: 297 NSSLNADAYDLEMTP 311
N S + Y ++TP
Sbjct: 157 NISHHGKNYGFQLTP 171


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS01625  HTHTETR       63     2e-14    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 63.1 bits (153), Expect = 2e-14
Identities = 27/85 (31%), Positives = 40/85 (47%), Gaps = 6/85 (7%)

Query: 22 KQKVILSAIELFASQGFHGTSTAQLAKNAEVSQATIYKYFETKDKLLVFILELIVQTIGR 81
+Q ++ A+ LF+ QG TS ++AK A V++ IY +F+ K L I EL IG
Sbjct: 13 RQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIGE 72

Query: 82 PFFTELSTF------STKEELIHFF 100
+ F +E LIH
Sbjct: 73 LELEYQAKFPGDPLSVLREILIHVL 97


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS01630  ABC2TRNSPORT  54     2e-10    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein

signature.
Length = 262

Score = 53.8 bits (129), Expect = 2e-10
Identities = 44/166 (26%), Positives = 81/166 (48%), Gaps = 3/166 (1%)

Query: 197 RTSGTLDRLLATPVKRSDIVFGYMLSYGILAIIQTIVIVLSTIWLLDIQVVGSIFSVIIV 256
T + +L T ++ DIV G M A + I + L Q + ++++ ++
Sbjct: 95 EGQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGAGIGVVAAALGYTQWLSLLYALPVI 154

Query: 257 NFILALVALSLGILMSTLAKSEFQMMQFIPLIIMPQLFFSG-IIPLENMANWAQTVGKIL 315
L SLG++++ LA S + + L+I P LF SG + P++ + QT + L
Sbjct: 155 ALT-GLAFASLGMVVTALAPSYDYFIFYQTLVITPILFLSGAVFPVDQLPIVFQTAARFL 213

Query: 316 PLSYSGDALTKIIMYGQGLSNVSSNLLVLLLFLIILTIANIFGLKR 361
PLS+S D L + IM G + +V ++ L ++++I + L+R
Sbjct: 214 PLSHSID-LIRPIMLGHPVVDVCQHVGALCIYIVIPFFLSTALLRR 258


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS01635  PF05272       32     0.003    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 31.6 bits (71), Expect = 0.003
Identities = 19/90 (21%), Positives = 29/90 (32%), Gaps = 22/90 (24%)

Query: 35 LIGPSGAGKSTLIKTMLGME-KADKGTALVLDTQMPDRNILNQIGYMA-QSDALYESLTA 92
L G G GKSTLI T++G++ +D D Y YE
Sbjct: 601 LEGTGGIGKSTLINTLVGLDFFSD---THFDIGTGKD-------SYEQIAGIVAYE---- 646

Query: 93 LENLLFFGKMKGIQKTELKQQITHISKVVD 122
+M ++ + + S D
Sbjct: 647 ------LSEMTAFRRADAEAVKAFFSSRKD 670


4     GBS_RS02180  GBS_RS02455  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
GBS_RS02180  0       16       -4.374366  hypothetical protein
GBS_RS02185  0       17       -3.910223  HIT family protein
GBS_RS02190  2       16       -4.741785  hypothetical protein
GBS_RS02195  2       16       -4.623483  hypothetical protein
GBS_RS02200  2       17       -4.896241  ABC transporter ATP-binding protein
GBS_RS02205  1       15       -2.815483  ABC transporter permease
GBS_RS02210  4       17       -1.545578  hypothetical protein
GBS_RS02215  4       19       -0.998298  hypothetical protein
GBS_RS02220  2       19       0.279666   hypothetical protein
GBS_RS02225  1       21       1.465366   hypothetical protein
GBS_RS02230  2       23       1.971548   primase C-terminal domain-containing protein
GBS_RS02235  3       30       2.318786   hypothetical protein
GBS_RS02240  3       28       2.212591   hypothetical protein
GBS_RS02245  1       23       1.621487   hypothetical protein
GBS_RS02250  2       21       0.470853   hypothetical protein
GBS_RS02255  2       19       -0.115967  hypothetical protein
GBS_RS11685  4       18       -0.614843  hypothetical protein
GBS_RS02260  3       16       -1.198486  hypothetical protein
GBS_RS02265  2       16       -0.818401  hypothetical protein
GBS_RS02270  2       16       -1.526615  hypothetical protein
GBS_RS02275  3       17       -1.857011  hypothetical protein
GBS_RS02280  4       14       -0.416072  hypothetical protein
GBS_RS02285  2       16       -0.099272  hypothetical protein
GBS_RS02290  1       13       1.284438   hypothetical protein
GBS_RS02295  2       12       1.531768   DNA/RNA non-specific endonuclease
GBS_RS02300  3       12       2.206111   hypothetical protein
GBS_RS02305  2       13       2.520117   hypothetical protein
GBS_RS02310  2       14       2.694713   hypothetical protein
GBS_RS02315  1       10       2.612918   toprim domain-containing protein
GBS_RS02320  1       10       3.252943   DNA topoisomerase III
GBS_RS02325  5       11       2.489887   ATP-dependent Clp protease ATP-binding subunit
GBS_RS02330  8       11       2.283553   hypothetical protein
GBS_RS02335  7       11       2.039078   hypothetical protein
GBS_RS02340  4       10       2.530383   C39 family peptidase
GBS_RS02345  1       12       2.449980   LPXTG cell wall anchor domain-containing
GBS_RS02350  -1      13       3.007008   LPXTG cell wall anchor domain-containing
GBS_RS02355  -1      21       4.045099   single-stranded DNA-binding protein
GBS_RS02360  -1      22       3.933603   hypothetical protein
GBS_RS02365  -1      22       3.862787   type IV secretory system conjugative DNA
GBS_RS02370  0       21       3.951031   hypothetical protein
GBS_RS02375  0       22       4.341086   hypothetical protein
GBS_RS02380  1       20       3.997530   BRCT domain-containing protein
GBS_RS02385  1       21       3.950278   hypothetical protein
GBS_RS02390  1       22       3.498315   hypothetical protein
GBS_RS02395  1       21       3.785269   virulence factor
GBS_RS02400  1       23       3.324551   hypothetical protein
GBS_RS02405  0       24       3.263407   CHAP domain-containing protein
GBS_RS02410  0       27       1.336221   hypothetical protein
GBS_RS02415  -1      24       0.704312   AAA family ATPase
GBS_RS02420  -1      20       0.471644   hypothetical protein
GBS_RS02425  -1      18       0.158081   helix-turn-helix domain-containing protein
GBS_RS02430  0       14       1.017126   hypothetical protein
GBS_RS02435  0       13       0.816381   ISLre2 family transposase
GBS_RS02440  0       14       0.610884   phosphotransferase family protein
GBS_RS02445  2       15       2.301061   tRNA (guanosine(46)-N7)-methyltransferase TrmB
GBS_RS02455  3       17       2.253743   *ribosome maturation factor RimP
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS02300  HELNAPAPROT   28     0.012    Helicobacter neutrophil-activating protein A family ...
		>HELNAPAPROT#Helicobacter neutrophil-activating protein A family

signature.
Length = 153

Score = 27.9 bits (62), Expect = 0.012
Identities = 17/66 (25%), Positives = 25/66 (37%), Gaps = 11/66 (16%)

Query: 78 EELKNKVKLVSRQLDQLL------YLELTNFHALTKGNDFDIQDLESVHSRFDPMQHELI 131
E K LV L+ L Y +L FH KG F ++H +F+ +
Sbjct: 4 ENAKTNQTLVENSLNTQLSNWFLLYSKLHRFHWYVKGPHF-----FTLHEKFEELYDHAA 58

Query: 132 ARIDDI 137
+D I
Sbjct: 59 ETVDTI 64


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS02320  PF06580       37     2e-04    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 36.8 bits (85), Expect = 2e-04
Identities = 21/89 (23%), Positives = 44/89 (49%), Gaps = 12/89 (13%)

Query: 254 AQARQLKAISIVSSVEMEEKQ-RAAPH-LFN-LSDIQGLAAKQWGFEPTKTERLIESL-- 308
A+ Q K S+ ++ + + PH +FN L++I+ L + +PTK ++ SL
Sbjct: 147 AEIDQWKMASMAQEAQLMALKAQINPHFMFNALNNIRALILE----DPTKAREMLTSLSE 202

Query: 309 YLKKYLSYPRTDTRFIT-EEEFDYLKNYL 336
++ L ++ R ++ +E + +YL
Sbjct: 203 LMRYSLR--YSNARQVSLADELTVVDSYL 229


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS02325  HTHFIS        39     7e-05    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 38.7 bits (90), Expect = 7e-05
Identities = 34/171 (19%), Positives = 63/171 (36%), Gaps = 19/171 (11%)

Query: 279 LKGDQERLEGFKERLMNRVKGQEDAIEAVVDAVTIAQAGLQNEKRPLASFLFLGPTGVGK 338
K +LE + M V G+ A++ + +A+ + + + G +G GK
Sbjct: 122 PKRRPSKLEDDSQDGMPLV-GRSAAMQEIYR--VLARLMQTD-----LTLMITGESGTGK 173

Query: 339 TELAKAIAEALFDDEAAMIRFDMSEYKQKEDVTKLIGNRATRIKGQLTEGVKQKPYCV-- 396
+A+A+ + + +M+ + ++L G+ KG T +
Sbjct: 174 ELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFGHE----KGAFTGAQTRSTGRFEQ 229

Query: 397 -----LLLDEIEKAHSEVMDLFLQVLDDGRLTDSSGRLISFKNTIVIMTTN 442
L LDEI + L+VL G T GR + ++ TN
Sbjct: 230 AEGGTLFLDEIGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATN 280


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS02340  CHANLCOLICIN  34     0.002    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 34.3 bits (78), Expect = 0.002
Identities = 40/218 (18%), Positives = 78/218 (35%), Gaps = 16/218 (7%)

Query: 73 VVDEAQKQKDQSQQNLVKATSTVTEAEKVAAEATPEVVKEAIKAVTEAKEAVTDAEANVV 132
EA++++ + ++ + + AE E + E KAV A++ ++ A++ VV
Sbjct: 149 AFQEAEQRRKEIEREKAETERQLKLAEAE--EKRLAALSEEAKAVEIAQKKLSAAQSEVV 206

Query: 133 DAQKTEQKANQEVQSQAKTVDENVKVVADKESEVKQAEGVVTTAQEAIDSKTANTNAS-- 190
+ N + S D +K +A K +E+ QA E + + N
Sbjct: 207 KMDGEIKTLNSRLSSSIHARDAEMKTLAGKRNELAQASAKYKELDELVKKLSPRANDPLQ 266

Query: 191 ------------EAEKAVTEKQTKLETAETNLTEAQKQDAKIAEEKRLAEQEVVNKQLAV 238
A K EKQ ++ +ET + +I + V
Sbjct: 267 NRPFFEATRRRVGAGKIREEKQKQVTASETRINRINADITQIQKAISQVSNNRNAGIARV 326

Query: 239 TDTQTLLKKLVTEINNEKVSTSLENQAYFNQRDGSWAG 276
+ + LKK + N ++ +++ F Q G
Sbjct: 327 HEAEENLKKAQNNLLNSQIKDAVDATVSFYQTLTEKYG 364


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS02345  TONBPROTEIN   31     0.002    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 31.5 bits (71), Expect = 0.002
Identities = 21/78 (26%), Positives = 27/78 (34%)

Query: 41 TPPNSVDPSLPTDTTDPSTPVVPEKLPDTSKPAEPETPKEELPAPVDPTTPSAGKEDKQE 100
P SV P D P P + +P P+ APV P + K +
Sbjct: 42 AQPISVTMVTPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPK 101

Query: 101 TVPGQAETPKEEVKPENP 118
V E PK +VKP
Sbjct: 102 PVKKVQEQPKRDVKPVES 119


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS02350  GPOSANCHOR    45     9e-07    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 45.4 bits (107), Expect = 9e-07
Identities = 36/195 (18%), Positives = 77/195 (39%), Gaps = 14/195 (7%)

Query: 91 AQADLANQTQAVKDVTAKAQANTQAIKDATAENAKIDAENKAEAERVAKENKEGQAAVDA 150
A A+ +A++ + A++ IK AE A ++A + + A
Sbjct: 223 LAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAK 282

Query: 151 RNKAGQAAVDARNKAKQQAQDDQKAKIDAENKAESQRVSQLNAQNKAKIDAENKDAQAK- 209
A +A++ + Q ++A ++ + L+A +AK E + + +
Sbjct: 283 IKTLEAE--KAALEAEKADLEHQSQVLNANRQSLRR---DLDASREAKKQLEAEHQKLEE 337

Query: 210 ----ANATNAQLQKDYQAKLAEIKSVEAYNAGVRQRNKDAQAKADATNAQLQKDYQAKLA 265
+ A+ L++D A K +EA + + + + ++A+ L++D A
Sbjct: 338 QNKISEASRQSLRRDLDASREAKKQLEAEHQKL---EEQNKI-SEASRQSLRRDLDASRE 393

Query: 266 LYNQALKAKAEADKQ 280
Q KA EA+ +
Sbjct: 394 AKKQVEKALEEANSK 408



Score = 43.1 bits (101), Expect = 5e-06
Identities = 43/232 (18%), Positives = 83/232 (35%), Gaps = 4/232 (1%)

Query: 51 TQNQAETVTSTQLDKAVATAKKAAVAVTTTPAVNHATTTDAQADLANQTQAVKDVTAKAQ 110
+ A L+KA+ A + A + A +A A +A++ +
Sbjct: 148 AEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFST 207

Query: 111 ANTQAIKDATAENAKIDAEN-KAEAERVAKENKEGQAAVDARNKAGQAAVDARNKAKQQA 169
A++ IK AE A + A E N + + + A +A+ +
Sbjct: 208 ADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEK 267

Query: 170 QDDQKAKIDAENKAESQRVSQLNAQNKAKIDAENKDAQAKANATNAQLQKDYQAKLAEIK 229
+ + A+ + + A +A+ +Q NA L++D A K
Sbjct: 268 ALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVL-NANRQSLRRDLDASREAKK 326

Query: 230 SVEAYNAGVRQRNKDAQAKADATNAQLQKDYQAKLALYNQALKAKAEADKQS 281
+EA + + ++NK ++A + L +AK L +A K E +
Sbjct: 327 QLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQL--EAEHQKLEEQNKI 376



Score = 41.2 bits (96), Expect = 2e-05
Identities = 43/296 (14%), Positives = 78/296 (26%), Gaps = 23/296 (7%)

Query: 4 NTKGHGFFRKSKAYGLVCAIALA--GAFTLATSQVSADQVTTQATTQ-----------TV 50
NT H RK K A+AL GA + + + T T +
Sbjct: 5 NTNRHYSLRKLKTGTASVAVALTVLGAGLVVNTNEVSAVATRSQTDTLEKVQERADKFEI 64

Query: 51 TQNQAETVTSTQLDKAVATAKKAAVAVTTTPAVNHATTT--DAQADLANQTQAVKDVTAK 108
N + S A + ++ A++ Q ++ A
Sbjct: 65 ENNTLKLKNSDLSFNNKALKDHNDELTEELSNAKEKLRKNDKSLSEKASKIQELEARKAD 124

Query: 109 AQANTQAI----KDATAENAKIDAENKAEAERVAKENKEGQAAVDARNKAGQAAVDARNK 164
+ + +A+ ++AE A A R A K + A++ +
Sbjct: 125 LEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAE 184

Query: 165 AKQQAQDDQKAKIDAENKAESQRVSQLNAQNKAKIDAENKDAQAKANATNAQLQKDYQAK 224
+ + E + A +A A
Sbjct: 185 KAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTAD 244

Query: 225 LAEIKSVEAYNAGVRQRNKDAQAKADATNAQLQKDYQAKLALYNQALKAKAEADKQ 280
A+IK++EA A + R + + A A KA + +
Sbjct: 245 SAKIKTLEAEKAALEARQAELEKA----LEGAMNFSTADSAKIKTLEAEKAALEAE 296


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS02370  adhesinb      29     0.008    Adhesin B signature.
		>adhesinb#Adhesin B signature.

Length = 310

Score = 29.0 bits (65), Expect = 0.008
Identities = 11/31 (35%), Positives = 15/31 (48%)

Query: 3 MKKWFGLLFLGLALITLAACGQKTPEDIIKT 33
MKK L+ L LA + LAAC + +
Sbjct: 1 MKKCRFLVLLLLAFVGLAACSSQKSSTETGS 31


5     GBS_RS02765  GBS_RS02880  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
GBS_RS02765  2       13       -0.771526  cation transporter
GBS_RS02775  3       16       -1.456404  TetR/AcrR family transcriptional regulator
GBS_RS02785  4       17       -1.616195  YSIRK-targeted surface antigen transcriptional
GBS_RS11195  4       19       -1.340424  alpha-like surface protein
GBS_RS11200  2       23       -7.152458  type II toxin-antitoxin system RelB/DinJ family
GBS_RS02795  5       23       -6.568957  hypothetical protein
GBS_RS02800  4       20       -5.912850  hypothetical protein
GBS_RS02810  1       16       -3.315765  ImmA/IrrE family metallo-endopeptidase
GBS_RS02815  0       18       -0.557405  helix-turn-helix transcriptional regulator
GBS_RS02820  0       21       -2.059922  hypothetical protein
GBS_RS11205  1       22       -2.441395  putative holin-like toxin
GBS_RS02830  1       24       -4.264694  hypothetical protein
GBS_RS02835  1       24       -4.791855  hypothetical protein
GBS_RS02840  2       20       -4.337774  LPXTG cell wall anchor domain-containing
GBS_RS02845  2       17       -5.426594  hypothetical protein
GBS_RS02850  1       16       -3.815460  DUF771 domain-containing protein
GBS_RS11550  1       16       -2.918615  *hypothetical protein
GBS_RS02870  2       20       1.952677   cupin domain-containing protein
GBS_RS02875  2       19       3.085742   class I SAM-dependent methyltransferase
GBS_RS02880  2       16       2.058691   DUF1912 family protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS02780  HTHTETR       51     2e-10    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 51.2 bits (122), Expect = 2e-10
Identities = 14/64 (21%), Positives = 30/64 (46%)

Query: 5 RQIQKTKVAIYNAFISLLQENDYSKITVQDVIGLANVGRSTFYSHYESKEVLLKELCEDL 64
++ Q+T+ I + + L + S ++ ++ A V R Y H++ K L E+ E
Sbjct: 7 QEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELS 66

Query: 65 FHHL 68
++
Sbjct: 67 ESNI 70


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS11195  GPOSANCHOR    66     4e-13    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 66.2 bits (161), Expect = 4e-13
Identities = 25/71 (35%), Positives = 38/71 (53%), Gaps = 11/71 (15%)

Query: 1067 TPKPVPDKDKYDPTGKSQQVNGKGN-----------KLPATGESATPFFNVAALTIISSV 1115
TP P G++ Q K N +LP+TGE+A PFF AALT++++
Sbjct: 468 TPDAKPGNKAVPGKGQAPQAGTKPNQNKAPMKETKRQLPSTGETANPFFTAAALTVMATA 527

Query: 1116 GLLSVSKKKED 1126
G+ +V K+KE+
Sbjct: 528 GVAAVVKRKEE 538



Score = 44.7 bits (105), Expect = 2e-06
Identities = 14/55 (25%), Positives = 21/55 (38%)

Query: 14 QTKQRFSIKKFKFGAASVLIGLSFLGGVTQGNLNIFEESIVAASTIPGSAATLNT 68
T + +S++K K G ASV + L+ LG N N + T
Sbjct: 5 NTNRHYSLRKLKTGTASVAVALTVLGAGLVVNTNEVSAVATRSQTDTLEKVQERA 59


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS02805  ADHESNFAMILY  29     0.003    Adhesin family signature.
		>ADHESNFAMILY#Adhesin family signature.

Length = 309

Score = 29.4 bits (66), Expect = 0.003
Identities = 20/61 (32%), Positives = 25/61 (40%), Gaps = 14/61 (22%)

Query: 3 KKLVGFIVLALSTIILVACSNDSLEGE----------YYWINDARNQHMATIKGDKGYVE 52
KKL +VL LS IILVAC++ + I D I GDK +
Sbjct: 2 KKLGTLLVLFLSAIILVACASGKKDTTSGQKLKVVATNSIIADITKN----IAGDKIDLH 57

Query: 53 S 53
S
Sbjct: 58 S 58


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS02885  FLGMOTORFLIG  28     0.003    Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 28.2 bits (63), Expect = 0.003
Identities = 11/41 (26%), Positives = 22/41 (53%)

Query: 6 EFLKDFEEWLQSQISINQMAMDSAKKVLEEDKDERAADAYI 46
L +F+E + +Q I + +D A+++LE+ + A I
Sbjct: 65 NVLLEFKELMMAQEFIQKGGIDYARELLEKSLGTQKAVDII 105


6     GBS_RS03350  GBS_RS03460  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
GBS_RS03350  2       18       0.222923   zinc ABC transporter substrate-binding protein
GBS_RS03355  0       12       0.889761   type B 50S ribosomal protein L31
GBS_RS03360  1       11       1.193870   bifunctional oligoribonuclease/PAP phosphatase
GBS_RS03365  2       11       0.312882   adenosine deaminase
GBS_RS03370  2       15       -0.162175  flavodoxin
GBS_RS03375  3       14       0.191873   chorismate mutase
GBS_RS03380  3       14       0.692894   voltage-gated chloride channel family protein
GBS_RS03385  3       20       0.840483   50S ribosomal protein L19
GBS_RS11210  0       21       0.122547   *site-specific integrase
GBS_RS03395  1       25       1.380811   hypothetical protein
GBS_RS03400  0       18       -5.119845  helix-turn-helix domain-containing protein
GBS_RS11570  1       19       -6.853064  IS3 family transposase
GBS_RS11575  2       23       -7.248095  IS3 family transposase
GBS_RS11580  4       27       -8.130246  transposase
GBS_RS11220  6       28       -9.839579  NINE protein
GBS_RS03415  5       27       -9.460018  ABC transporter permease
GBS_RS03420  5       26       -8.336219  ABC transporter ATP-binding protein
GBS_RS03425  5       26       -8.272623  FtsX-like permease family protein
GBS_RS03430  2       19       -6.908648  response regulator transcription factor
GBS_RS03435  1       16       -5.911812  HAMP domain-containing histidine kinase
GBS_RS03440  -2      9        -0.592297  CsbD family protein
GBS_RS03445  0       10       -0.353878  hypothetical protein
GBS_RS11225  0       10       -0.122925  rod shape-determining protein RodA
GBS_RS03450  -1      8        -0.149717  HAD-IA family hydrolase
GBS_RS03455  0       8        -0.039850  DNA topoisomerase (ATP-hydrolyzing) subunit B
GBS_RS03460  2       17       1.236548   septation ring formation regulator EzrA
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS03350  ADHESNFAMILY  220    3e-70    Adhesin family signature.
		>ADHESNFAMILY#Adhesin family signature.

Length = 309

Score = 220 bits (563), Expect = 3e-70
Identities = 79/313 (25%), Positives = 149/313 (47%), Gaps = 12/313 (3%)

Query: 1 MRKKFLLLMSFVAMFA-AWQLVQVKQVWADSKLKVVTTFYPVYEFTKNVVGDKADVSMLI 59
M+K LL+ F++ K + KLKVV T + + TKN+ GDK D+ ++
Sbjct: 1 MKKLGTLLVLFLSAIILVACASGKKDTTSGQKLKVVATNSIIADITKNIAGDKIDLHSIV 60

Query: 60 KAGTEPHDFEPSTKNIAAIQDSNAFVYMDDNMETWAPKVA-KSVKSKKVTTIKGTGDMLL 118
G +PH++EP +++ +++ Y N+ET K V++ K T K +
Sbjct: 61 PIGQDPHEYEPLPEDVKKTSEADLIFYNGINLETGGNAWFTKLVENAKKTENKDY--FAV 118

Query: 119 TKGVEEEGEEHEGHGHEGHHHELDPHVWLSPERAISVVENIRNKFVKAYPKDAASFNKNA 178
+ GV+ EG +G DPH WL+ E I +NI + P + + KN
Sbjct: 119 SDGVDVI--YLEGQNEKGKE---DPHAWLNLENGIIFAKNIAKQLSAKDPNNKEFYEKNL 173

Query: 179 DAYIAKLKELDKEYKNGLSN--AKQKSFVTQHAAFGYMALDYGLNQVPIAGLTPDAEPSS 236
Y KL +LDKE K+ + A++K VT AF Y + YG+ I + + E +
Sbjct: 174 KEYTDKLDKLDKESKDKFNKIPAEKKLIVTSEGAFKYFSKAYGVPSAYIWEINTEEEGTP 233

Query: 237 KRLGELAKYIKKYNINYIYFEENASNKVAKTLADEVGVKTAVLSPLEGLSKKEMAAGEDY 296
+++ L + +++ + ++ E + ++ KT++ + + + ++++ G+ Y
Sbjct: 234 EQIKTLVEKLRQTKVPSLFVESSVDDRPMKTVSQDTNIPIYAQIFTDSIAEQG-KEGDSY 292

Query: 297 FSVMRRNLKVLKK 309
+S+M+ NL + +
Sbjct: 293 YSMMKYNLDKIAE 305


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS03420  PF05272       32     0.002    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 32.0 bits (72), Expect = 0.002
Identities = 11/20 (55%), Positives = 14/20 (70%)

Query: 36 IVGKSGTGKSTLLSLLAGLD 55
+ G G GKSTL++ L GLD
Sbjct: 601 LEGTGGIGKSTLINTLVGLD 620


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS03430  HTHFIS        75     5e-18    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 75.3 bits (185), Expect = 5e-18
Identities = 30/105 (28%), Positives = 51/105 (48%), Gaps = 2/105 (1%)

Query: 2 KILTVEDDKLIREGISEYLSEFGYTVIQAKDGREALSKFNSD-INLVILDIQIPFINGLE 60
IL +DD IR +++ LS GY V + + +LV+ D+ +P N +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 61 VLKEIRK-KSNLPILILTAFSDEEYKIDAFTNLVDGYVEKPFSLP 104
+L I+K + +LP+L+++A + I A Y+ KPF L
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLT 109


7     GBS_RS03870  GBS_RS04070  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
GBS_RS03870  2       17       -0.903294  amino acid ABC transporter ATP-binding protein
GBS_RS03875  4       17       -1.545578  hypothetical protein
GBS_RS03880  4       19       -0.998298  hypothetical protein
GBS_RS03885  2       19       0.279666   hypothetical protein
GBS_RS03890  1       21       1.465366   hypothetical protein
GBS_RS03895  2       23       1.971548   primase C-terminal domain-containing protein
GBS_RS03900  3       30       2.318786   hypothetical protein
GBS_RS03905  3       28       2.212591   hypothetical protein
GBS_RS03910  1       23       1.621487   hypothetical protein
GBS_RS03915  2       21       0.470853   hypothetical protein
GBS_RS03920  2       19       -0.115967  hypothetical protein
GBS_RS11695  4       18       -0.614843  hypothetical protein
GBS_RS03925  3       16       -1.198486  hypothetical protein
GBS_RS03930  2       16       -0.818401  hypothetical protein
GBS_RS03935  2       16       -1.526615  hypothetical protein
GBS_RS03940  3       17       -1.857011  hypothetical protein
GBS_RS03945  4       14       -0.416072  hypothetical protein
GBS_RS03950  2       16       -0.099272  hypothetical protein
GBS_RS03955  1       13       1.284438   hypothetical protein
GBS_RS03960  2       12       1.531768   DNA/RNA non-specific endonuclease
GBS_RS03965  3       12       2.206111   hypothetical protein
GBS_RS03970  2       13       2.520117   hypothetical protein
GBS_RS03975  2       14       2.694713   hypothetical protein
GBS_RS03980  1       10       2.612918   toprim domain-containing protein
GBS_RS03985  1       10       3.252943   DNA topoisomerase III
GBS_RS03990  5       11       2.489887   ATP-dependent Clp protease ATP-binding subunit
GBS_RS03995  8       11       2.283553   hypothetical protein
GBS_RS04000  7       11       2.039078   hypothetical protein
GBS_RS04005  4       10       2.530383   C39 family peptidase
GBS_RS04010  1       12       2.449980   LPXTG cell wall anchor domain-containing
GBS_RS04015  -1      13       3.007008   LPXTG cell wall anchor domain-containing
GBS_RS04020  -1      21       4.045099   single-stranded DNA-binding protein
GBS_RS04025  -1      22       3.933603   hypothetical protein
GBS_RS04030  -1      22       3.862787   type IV secretory system conjugative DNA
GBS_RS04035  0       21       3.951031   hypothetical protein
GBS_RS04040  0       22       4.341086   hypothetical protein
GBS_RS04045  1       20       3.997530   BRCT domain-containing protein
GBS_RS04050  1       21       3.950278   hypothetical protein
GBS_RS04055  1       22       3.498315   hypothetical protein
GBS_RS04060  1       21       3.785269   virulence factor
GBS_RS04065  1       23       3.324551   hypothetical protein
GBS_RS04070  0       24       3.263407   CHAP domain-containing protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS03870  PF05272       29     0.018    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.018
Identities = 17/38 (44%), Positives = 19/38 (50%), Gaps = 4/38 (10%)

Query: 27 EPG----QVVVLLGPSGSGKSTLIRTMNALESIDDGSL 60
EPG VVL G G GKSTLI T+ L+ D
Sbjct: 590 EPGCKFDYSVVLEGTGGIGKSTLINTLVGLDFFSDTHF 627


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS03965  HELNAPAPROT   28     0.012    Helicobacter neutrophil-activating protein A family ...
		>HELNAPAPROT#Helicobacter neutrophil-activating protein A family

signature.
Length = 153

Score = 27.9 bits (62), Expect = 0.012
Identities = 17/66 (25%), Positives = 25/66 (37%), Gaps = 11/66 (16%)

Query: 78 EELKNKVKLVSRQLDQLL------YLELTNFHALTKGNDFDIQDLESVHSRFDPMQHELI 131
E K LV L+ L Y +L FH KG F ++H +F+ +
Sbjct: 4 ENAKTNQTLVENSLNTQLSNWFLLYSKLHRFHWYVKGPHF-----FTLHEKFEELYDHAA 58

Query: 132 ARIDDI 137
+D I
Sbjct: 59 ETVDTI 64


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS03985  PF06580       37     2e-04    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 36.8 bits (85), Expect = 2e-04
Identities = 21/89 (23%), Positives = 44/89 (49%), Gaps = 12/89 (13%)

Query: 254 AQARQLKAISIVSSVEMEEKQ-RAAPH-LFN-LSDIQGLAAKQWGFEPTKTERLIESL-- 308
A+ Q K S+ ++ + + PH +FN L++I+ L + +PTK ++ SL
Sbjct: 147 AEIDQWKMASMAQEAQLMALKAQINPHFMFNALNNIRALILE----DPTKAREMLTSLSE 202

Query: 309 YLKKYLSYPRTDTRFIT-EEEFDYLKNYL 336
++ L ++ R ++ +E + +YL
Sbjct: 203 LMRYSLR--YSNARQVSLADELTVVDSYL 229


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS03990  HTHFIS        39     7e-05    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 38.7 bits (90), Expect = 7e-05
Identities = 34/171 (19%), Positives = 63/171 (36%), Gaps = 19/171 (11%)

Query: 279 LKGDQERLEGFKERLMNRVKGQEDAIEAVVDAVTIAQAGLQNEKRPLASFLFLGPTGVGK 338
K +LE + M V G+ A++ + +A+ + + + G +G GK
Sbjct: 122 PKRRPSKLEDDSQDGMPLV-GRSAAMQEIYR--VLARLMQTD-----LTLMITGESGTGK 173

Query: 339 TELAKAIAEALFDDEAAMIRFDMSEYKQKEDVTKLIGNRATRIKGQLTEGVKQKPYCV-- 396
+A+A+ + + +M+ + ++L G+ KG T +
Sbjct: 174 ELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFGHE----KGAFTGAQTRSTGRFEQ 229

Query: 397 -----LLLDEIEKAHSEVMDLFLQVLDDGRLTDSSGRLISFKNTIVIMTTN 442
L LDEI + L+VL G T GR + ++ TN
Sbjct: 230 AEGGTLFLDEIGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATN 280


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
GBS_RS04005    CHANLCOLICIN    34    0.002    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 34.3 bits (78), Expect = 0.002
Identities = 40/218 (18%), Positives = 78/218 (35%), Gaps = 16/218 (7%)

Query: 73 VVDEAQKQKDQSQQNLVKATSTVTEAEKVAAEATPEVVKEAIKAVTEAKEAVTDAEANVV 132
EA++++ + ++ + + AE E + E KAV A++ ++ A++ VV
Sbjct: 149 AFQEAEQRRKEIEREKAETERQLKLAEAE--EKRLAALSEEAKAVEIAQKKLSAAQSEVV 206

Query: 133 DAQKTEQKANQEVQSQAKTVDENVKVVADKESEVKQAEGVVTTAQEAIDSKTANTNAS-- 190
+ N + S D +K +A K +E+ QA E + + N
Sbjct: 207 KMDGEIKTLNSRLSSSIHARDAEMKTLAGKRNELAQASAKYKELDELVKKLSPRANDPLQ 266

Query: 191 ------------EAEKAVTEKQTKLETAETNLTEAQKQDAKIAEEKRLAEQEVVNKQLAV 238
A K EKQ ++ +ET + +I + V
Sbjct: 267 NRPFFEATRRRVGAGKIREEKQKQVTASETRINRINADITQIQKAISQVSNNRNAGIARV 326

Query: 239 TDTQTLLKKLVTEINNEKVSTSLENQAYFNQRDGSWAG 276
+ + LKK + N ++ +++ F Q G
Sbjct: 327 HEAEENLKKAQNNLLNSQIKDAVDATVSFYQTLTEKYG 364


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
GBS_RS04010    TONBPROTEIN    31    0.002    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 31.5 bits (71), Expect = 0.002
Identities = 21/78 (26%), Positives = 27/78 (34%)

Query: 41 TPPNSVDPSLPTDTTDPSTPVVPEKLPDTSKPAEPETPKEELPAPVDPTTPSAGKEDKQE 100
P SV P D P P + +P P+ APV P + K +
Sbjct: 42 AQPISVTMVTPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPK 101

Query: 101 TVPGQAETPKEEVKPENP 118
V E PK +VKP
Sbjct: 102 PVKKVQEQPKRDVKPVES 119


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
GBS_RS04015    GPOSANCHOR    45    9e-07    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 45.4 bits (107), Expect = 9e-07
Identities = 36/195 (18%), Positives = 77/195 (39%), Gaps = 14/195 (7%)

Query: 91 AQADLANQTQAVKDVTAKAQANTQAIKDATAENAKIDAENKAEAERVAKENKEGQAAVDA 150
A A+ +A++ + A++ IK AE A ++A + + A
Sbjct: 223 LAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAK 282

Query: 151 RNKAGQAAVDARNKAKQQAQDDQKAKIDAENKAESQRVSQLNAQNKAKIDAENKDAQAK- 209
A +A++ + Q ++A ++ + L+A +AK E + + +
Sbjct: 283 IKTLEAE--KAALEAEKADLEHQSQVLNANRQSLRR---DLDASREAKKQLEAEHQKLEE 337

Query: 210 ----ANATNAQLQKDYQAKLAEIKSVEAYNAGVRQRNKDAQAKADATNAQLQKDYQAKLA 265
+ A+ L++D A K +EA + + + + ++A+ L++D A
Sbjct: 338 QNKISEASRQSLRRDLDASREAKKQLEAEHQKL---EEQNKI-SEASRQSLRRDLDASRE 393

Query: 266 LYNQALKAKAEADKQ 280
Q KA EA+ +
Sbjct: 394 AKKQVEKALEEANSK 408



Score = 43.1 bits (101), Expect = 5e-06
Identities = 43/232 (18%), Positives = 83/232 (35%), Gaps = 4/232 (1%)

Query: 51 TQNQAETVTSTQLDKAVATAKKAAVAVTTTPAVNHATTTDAQADLANQTQAVKDVTAKAQ 110
+ A L+KA+ A + A + A +A A +A++ +
Sbjct: 148 AEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFST 207

Query: 111 ANTQAIKDATAENAKIDAEN-KAEAERVAKENKEGQAAVDARNKAGQAAVDARNKAKQQA 169
A++ IK AE A + A E N + + + A +A+ +
Sbjct: 208 ADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEK 267

Query: 170 QDDQKAKIDAENKAESQRVSQLNAQNKAKIDAENKDAQAKANATNAQLQKDYQAKLAEIK 229
+ + A+ + + A +A+ +Q NA L++D A K
Sbjct: 268 ALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVL-NANRQSLRRDLDASREAKK 326

Query: 230 SVEAYNAGVRQRNKDAQAKADATNAQLQKDYQAKLALYNQALKAKAEADKQS 281
+EA + + ++NK ++A + L +AK L +A K E +
Sbjct: 327 QLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQL--EAEHQKLEEQNKI 376



Score = 41.2 bits (96), Expect = 2e-05
Identities = 43/296 (14%), Positives = 78/296 (26%), Gaps = 23/296 (7%)

Query: 4 NTKGHGFFRKSKAYGLVCAIALA--GAFTLATSQVSADQVTTQATTQ-----------TV 50
NT H RK K A+AL GA + + + T T +
Sbjct: 5 NTNRHYSLRKLKTGTASVAVALTVLGAGLVVNTNEVSAVATRSQTDTLEKVQERADKFEI 64

Query: 51 TQNQAETVTSTQLDKAVATAKKAAVAVTTTPAVNHATTT--DAQADLANQTQAVKDVTAK 108
N + S A + ++ A++ Q ++ A
Sbjct: 65 ENNTLKLKNSDLSFNNKALKDHNDELTEELSNAKEKLRKNDKSLSEKASKIQELEARKAD 124

Query: 109 AQANTQAI----KDATAENAKIDAENKAEAERVAKENKEGQAAVDARNKAGQAAVDARNK 164
+ + +A+ ++AE A A R A K + A++ +
Sbjct: 125 LEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAE 184

Query: 165 AKQQAQDDQKAKIDAENKAESQRVSQLNAQNKAKIDAENKDAQAKANATNAQLQKDYQAK 224
+ + E + A +A A
Sbjct: 185 KAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTAD 244

Query: 225 LAEIKSVEAYNAGVRQRNKDAQAKADATNAQLQKDYQAKLALYNQALKAKAEADKQ 280
A+IK++EA A + R + + A A KA + +
Sbjct: 245 SAKIKTLEAEKAALEARQAELEKA----LEGAMNFSTADSAKIKTLEAEKAALEAE 296


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
GBS_RS04035    adhesinb    29    0.008    Adhesin B signature.
		>adhesinb#Adhesin B signature.

Length = 310

Score = 29.0 bits (65), Expect = 0.008
Identities = 11/31 (35%), Positives = 15/31 (48%)

Query: 3 MKKWFGLLFLGLALITLAACGQKTPEDIIKT 33
MKK L+ L LA + LAAC + +
Sbjct: 1 MKKCRFLVLLLLAFVGLAACSSQKSSTETGS 31


8    GBS_RS05015    GBS_RS05085    Y    Y    N    Pathogenicity Island (biased-composition)
LocusTag    DNBias    CDNBias    %GCBias    Product
GBS_RS05015-113-3.314507DUF2974 domain-containing protein
GBS_RS05020011-2.166949CatB-related O-acetyltransferase
GBS_RS05025-210-0.702979TVP38/TMEM64 family protein
GBS_RS05030-1110.164239hypothetical protein
GBS_RS05035-1110.254162ABC transporter ATP-binding protein
GBS_RS05040-1110.763288GntR family transcriptional regulator
GBS_RS05045-1110.755666DNA polymerase III subunit alpha
GBS_RS050502191.6924926-phosphofructokinase
GBS_RS050550170.239484pyruvate kinase
GBS_RS05060-214-1.675543signal peptidase I
GBS_RS05065-115-0.980499glutamine--fructose-6-phosphate transaminase
GBS_RS05070115-3.568001alkylphosphonate utilization protein
GBS_RS05075114-3.641149amino acid ABC transporter permease
GBS_RS05080014-3.427618amino acid ABC transporter ATP-binding protein
GBS_RS05085214-2.389091amino acid ABC transporter substrate-binding
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
GBS_RS05050    FbpA_PF05833    29    0.024    Fibronectin-binding protein
		>FbpA_PF05833#Fibronectin-binding protein

Length = 577

Score = 29.5 bits (66), Expect = 0.024
Identities = 10/37 (27%), Positives = 17/37 (45%), Gaps = 3/37 (8%)

Query: 137 IGFDTAVATAVENLDRLRDTSASHNRTFVVEVMGRNA 173
I D V E+ D L + ++E+MGR++
Sbjct: 94 INQDRIVVIDFESTDELGFN---SIYSLIIEIMGRHS 127


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
GBS_RS05080    PF05272    35    2e-04    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 35.0 bits (80), Expect = 2e-04
Identities = 14/43 (32%), Positives = 23/43 (53%), Gaps = 2/43 (4%)

Query: 29 ILSLVGPSGGGKTTLLRMLAGLE-KIDSGTIVHDGKEVSVDHL 70
+ L G G GK+TL+ L GL+ D+ + GK+ S + +
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGTGKD-SYEQI 639


9    GBS_RS05130    GBS_RS05445    Y    Y    Y    Pathogenicity Island (biased-composition)
LocusTag    DNBias    CDNBias    %GCBias    Product
GBS_RS051302160.807593FAD-dependent oxidoreductase
GBS_RS051352140.368886L-lactate dehydrogenase
GBS_RS051402140.646784DNA gyrase subunit A
GBS_RS05145-112-0.123099class A sortase
GBS_RS05150-1140.803768VOC family protein
GBS_RS05155-1140.124642DUF1002 domain-containing protein
GBS_RS05160-1160.334537cation:proton antiporter
GBS_RS05165-1160.609605glutamine-hydrolyzing GMP synthase
GBS_RS05170-112-0.937446GntR family transcriptional regulator
GBS_RS05175-111-1.209684methylenetetrahydrofolate--tRNA-(uracil(54)-
GBS_RS05180111-3.408168GNAT family N-acetyltransferase
GBS_RS05185110-3.205704O-sialoglycoprotein endopeptidase
GBS_RS05190110-3.302398PASTA domain-containing protein
GBS_RS05195110-3.061514ABC transporter ATP-binding protein
GBS_RS05200-18-2.836284ABC transporter permease
GBS_RS0520519-3.078667response regulator transcription factor
GBS_RS05210012-2.195608sensor histidine kinase
GBS_RS05215-210-1.313290tyrosine recombinase XerS
GBS_RS05220-111-0.811978peptide ABC transporter substrate-binding
GBS_RS05225-112-0.965098DUF3307 domain-containing protein
GBS_RS05230-116-0.265935SatD family protein
GBS_RS05235-1220.676019ISLre2 family transposase
GBS_RS052400271.336221hypothetical protein
GBS_RS052450243.263407helix-turn-helix domain-containing protein
GBS_RS052500233.324551hypothetical protein
GBS_RS052551213.785269AAA family ATPase
GBS_RS052601223.498315hypothetical protein
GBS_RS052651213.950278CHAP domain-containing protein
GBS_RS052750224.341086hypothetical protein
GBS_RS052800213.951031virulence factor
GBS_RS05285-2223.862787hypothetical protein
GBS_RS05290-1223.933603hypothetical protein
GBS_RS05295-1214.045099BRCT domain-containing protein
GBS_RS05300-1133.007008hypothetical protein
GBS_RS053051122.449980hypothetical protein
GBS_RS053104102.530383type IV secretory system conjugative DNA
GBS_RS053157112.039078hypothetical protein
GBS_RS053208112.283553single-stranded DNA-binding protein
GBS_RS053255112.489887LPXTG cell wall anchor domain-containing
GBS_RS053301103.252943LPXTG cell wall anchor domain-containing
GBS_RS053351102.612918C39 family peptidase
GBS_RS053402142.694713hypothetical protein
GBS_RS053452132.520117hypothetical protein
GBS_RS053502122.206111ATP-dependent Clp protease ATP-binding subunit
GBS_RS053552121.531768DNA topoisomerase III
GBS_RS053601131.284438toprim domain-containing protein
GBS_RS05365216-0.099272hypothetical protein
GBS_RS05370414-0.416072hypothetical protein
GBS_RS05375317-1.857011hypothetical protein
GBS_RS05380216-1.526615DNA/RNA non-specific endonuclease
GBS_RS05385216-0.818401hypothetical protein
GBS_RS05390316-1.198486hypothetical protein
GBS_RS05395418-0.614843hypothetical protein
GBS_RS05400219-0.115967hypothetical protein
GBS_RS054052210.470853hypothetical protein
GBS_RS054101231.621487hypothetical protein
GBS_RS054153282.212591hypothetical protein
GBS_RS117003302.318786hypothetical protein
GBS_RS054202231.971548hypothetical protein
GBS_RS054251211.465366hypothetical protein
GBS_RS054352190.279666hypothetical protein
GBS_RS05440419-0.998298hypothetical protein
GBS_RS05445417-1.545578hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
GBS_RS05180    SACTRNSFRASE    34    2e-04    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 34.1 bits (78), Expect = 2e-04
Identities = 21/72 (29%), Positives = 36/72 (50%), Gaps = 9/72 (12%)

Query: 139 YGETYAKQMYNDNRANL--LNNNAK-VYLAIKDNQIIGDV---TAWDYGEYVEIDDFYVQ 192
+ + Y KQ Y D+ ++ + K +L +N IG + + W+ Y I+D V
Sbjct: 42 FSKPYFKQ-YEDDDMDVSYVEEEGKAAFLYYLENNCIGRIKIRSNWN--GYALIEDIAVA 98

Query: 193 ESYRGQGIGTRL 204
+ YR +G+GT L
Sbjct: 99 KDYRKKGVGTAL 110


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
GBS_RS05210    HTHFIS    69    1e-15    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 68.7 bits (168), Expect = 1e-15
Identities = 33/124 (26%), Positives = 58/124 (46%), Gaps = 3/124 (2%)

Query: 4 DQGKIYIVEDDMTIVSLLKGHLS-ASYHVSSVSNFRDVKQEIIAFQPDLILMDITLPYFN 62
I + +DD I ++L LS A Y V SN + + I A DL++ D+ +P N
Sbjct: 2 TGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 63 GFYWTAELRK-FLTIPIIFISSSNDEMDMVMALNMGGDDFISKPFSL-AVLDAKLTAILR 120
F ++K +P++ +S+ N M + A G D++ KPF L ++ A+
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121

Query: 121 RSQQ 124
++
Sbjct: 122 PKRR 125


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
GBS_RS05305    adhesinb    29    0.008    Adhesin B signature.
		>adhesinb#Adhesin B signature.

Length = 310

Score = 29.0 bits (65), Expect = 0.008
Identities = 11/31 (35%), Positives = 15/31 (48%)

Query: 3 MKKWFGLLFLGLALITLAACGQKTPEDIIKT 33
MKK L+ L LA + LAAC + +
Sbjct: 1 MKKCRFLVLLLLAFVGLAACSSQKSSTETGS 31


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
GBS_RS05325    GPOSANCHOR    45    9e-07    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 45.4 bits (107), Expect = 9e-07
Identities = 36/195 (18%), Positives = 77/195 (39%), Gaps = 14/195 (7%)

Query: 91 AQADLANQTQAVKDVTAKAQANTQAIKDATAENAKIDAENKAEAERVAKENKEGQAAVDA 150
A A+ +A++ + A++ IK AE A ++A + + A
Sbjct: 223 LAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAK 282

Query: 151 RNKAGQAAVDARNKAKQQAQDDQKAKIDAENKAESQRVSQLNAQNKAKIDAENKDAQAK- 209
A +A++ + Q ++A ++ + L+A +AK E + + +
Sbjct: 283 IKTLEAE--KAALEAEKADLEHQSQVLNANRQSLRR---DLDASREAKKQLEAEHQKLEE 337

Query: 210 ----ANATNAQLQKDYQAKLAEIKSVEAYNAGVRQRNKDAQAKADATNAQLQKDYQAKLA 265
+ A+ L++D A K +EA + + + + ++A+ L++D A
Sbjct: 338 QNKISEASRQSLRRDLDASREAKKQLEAEHQKL---EEQNKI-SEASRQSLRRDLDASRE 393

Query: 266 LYNQALKAKAEADKQ 280
Q KA EA+ +
Sbjct: 394 AKKQVEKALEEANSK 408



Score = 43.1 bits (101), Expect = 5e-06
Identities = 43/232 (18%), Positives = 83/232 (35%), Gaps = 4/232 (1%)

Query: 51 TQNQAETVTSTQLDKAVATAKKAAVAVTTTPAVNHATTTDAQADLANQTQAVKDVTAKAQ 110
+ A L+KA+ A + A + A +A A +A++ +
Sbjct: 148 AEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFST 207

Query: 111 ANTQAIKDATAENAKIDAEN-KAEAERVAKENKEGQAAVDARNKAGQAAVDARNKAKQQA 169
A++ IK AE A + A E N + + + A +A+ +
Sbjct: 208 ADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEK 267

Query: 170 QDDQKAKIDAENKAESQRVSQLNAQNKAKIDAENKDAQAKANATNAQLQKDYQAKLAEIK 229
+ + A+ + + A +A+ +Q NA L++D A K
Sbjct: 268 ALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVL-NANRQSLRRDLDASREAKK 326

Query: 230 SVEAYNAGVRQRNKDAQAKADATNAQLQKDYQAKLALYNQALKAKAEADKQS 281
+EA + + ++NK ++A + L +AK L +A K E +
Sbjct: 327 QLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQL--EAEHQKLEEQNKI 376



Score = 41.2 bits (96), Expect = 2e-05
Identities = 43/296 (14%), Positives = 78/296 (26%), Gaps = 23/296 (7%)

Query: 4 NTKGHGFFRKSKAYGLVCAIALA--GAFTLATSQVSADQVTTQATTQ-----------TV 50
NT H RK K A+AL GA + + + T T +
Sbjct: 5 NTNRHYSLRKLKTGTASVAVALTVLGAGLVVNTNEVSAVATRSQTDTLEKVQERADKFEI 64

Query: 51 TQNQAETVTSTQLDKAVATAKKAAVAVTTTPAVNHATTT--DAQADLANQTQAVKDVTAK 108
N + S A + ++ A++ Q ++ A
Sbjct: 65 ENNTLKLKNSDLSFNNKALKDHNDELTEELSNAKEKLRKNDKSLSEKASKIQELEARKAD 124

Query: 109 AQANTQAI----KDATAENAKIDAENKAEAERVAKENKEGQAAVDARNKAGQAAVDARNK 164
+ + +A+ ++AE A A R A K + A++ +
Sbjct: 125 LEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAE 184

Query: 165 AKQQAQDDQKAKIDAENKAESQRVSQLNAQNKAKIDAENKDAQAKANATNAQLQKDYQAK 224
+ + E + A +A A
Sbjct: 185 KAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTAD 244

Query: 225 LAEIKSVEAYNAGVRQRNKDAQAKADATNAQLQKDYQAKLALYNQALKAKAEADKQ 280
A+IK++EA A + R + + A A KA + +
Sbjct: 245 SAKIKTLEAEKAALEARQAELEKA----LEGAMNFSTADSAKIKTLEAEKAALEAE 296


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
GBS_RS05330    TONBPROTEIN    31    0.002    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 31.5 bits (71), Expect = 0.002
Identities = 21/78 (26%), Positives = 27/78 (34%)

Query: 41 TPPNSVDPSLPTDTTDPSTPVVPEKLPDTSKPAEPETPKEELPAPVDPTTPSAGKEDKQE 100
P SV P D P P + +P P+ APV P + K +
Sbjct: 42 AQPISVTMVTPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPK 101

Query: 101 TVPGQAETPKEEVKPENP 118
V E PK +VKP
Sbjct: 102 PVKKVQEQPKRDVKPVES 119


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
GBS_RS05335    CHANLCOLICIN    34    0.002    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 34.3 bits (78), Expect = 0.002
Identities = 40/218 (18%), Positives = 78/218 (35%), Gaps = 16/218 (7%)

Query: 73 VVDEAQKQKDQSQQNLVKATSTVTEAEKVAAEATPEVVKEAIKAVTEAKEAVTDAEANVV 132
EA++++ + ++ + + AE E + E KAV A++ ++ A++ VV
Sbjct: 149 AFQEAEQRRKEIEREKAETERQLKLAEAE--EKRLAALSEEAKAVEIAQKKLSAAQSEVV 206

Query: 133 DAQKTEQKANQEVQSQAKTVDENVKVVADKESEVKQAEGVVTTAQEAIDSKTANTNAS-- 190
+ N + S D +K +A K +E+ QA E + + N
Sbjct: 207 KMDGEIKTLNSRLSSSIHARDAEMKTLAGKRNELAQASAKYKELDELVKKLSPRANDPLQ 266

Query: 191 ------------EAEKAVTEKQTKLETAETNLTEAQKQDAKIAEEKRLAEQEVVNKQLAV 238
A K EKQ ++ +ET + +I + V
Sbjct: 267 NRPFFEATRRRVGAGKIREEKQKQVTASETRINRINADITQIQKAISQVSNNRNAGIARV 326

Query: 239 TDTQTLLKKLVTEINNEKVSTSLENQAYFNQRDGSWAG 276
+ + LKK + N ++ +++ F Q G
Sbjct: 327 HEAEENLKKAQNNLLNSQIKDAVDATVSFYQTLTEKYG 364


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
GBS_RS05350    HTHFIS    39    7e-05    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 38.7 bits (90), Expect = 7e-05
Identities = 34/171 (19%), Positives = 63/171 (36%), Gaps = 19/171 (11%)

Query: 279 LKGDQERLEGFKERLMNRVKGQEDAIEAVVDAVTIAQAGLQNEKRPLASFLFLGPTGVGK 338
K +LE + M V G+ A++ + +A+ + + + G +G GK
Sbjct: 122 PKRRPSKLEDDSQDGMPLV-GRSAAMQEIYR--VLARLMQTD-----LTLMITGESGTGK 173

Query: 339 TELAKAIAEALFDDEAAMIRFDMSEYKQKEDVTKLIGNRATRIKGQLTEGVKQKPYCV-- 396
+A+A+ + + +M+ + ++L G+ KG T +
Sbjct: 174 ELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFGHE----KGAFTGAQTRSTGRFEQ 229

Query: 397 -----LLLDEIEKAHSEVMDLFLQVLDDGRLTDSSGRLISFKNTIVIMTTN 442
L LDEI + L+VL G T GR + ++ TN
Sbjct: 230 AEGGTLFLDEIGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATN 280


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
GBS_RS05355    PF06580    37    2e-04    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 36.8 bits (85), Expect = 2e-04
Identities = 21/89 (23%), Positives = 44/89 (49%), Gaps = 12/89 (13%)

Query: 254 AQARQLKAISIVSSVEMEEKQ-RAAPH-LFN-LSDIQGLAAKQWGFEPTKTERLIESL-- 308
A+ Q K S+ ++ + + PH +FN L++I+ L + +PTK ++ SL
Sbjct: 147 AEIDQWKMASMAQEAQLMALKAQINPHFMFNALNNIRALILE----DPTKAREMLTSLSE 202

Query: 309 YLKKYLSYPRTDTRFIT-EEEFDYLKNYL 336
++ L ++ R ++ +E + +YL
Sbjct: 203 LMRYSLR--YSNARQVSLADELTVVDSYL 229


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
GBS_RS05375    HELNAPAPROT    28    0.012    Helicobacter neutrophil-activating protein A family ...
		>HELNAPAPROT#Helicobacter neutrophil-activating protein A family signature.

Length = 153

Score = 27.9 bits (62), Expect = 0.012
Identities = 17/66 (25%), Positives = 25/66 (37%), Gaps = 11/66 (16%)

Query: 78 EELKNKVKLVSRQLDQLL------YLELTNFHALTKGNDFDIQDLESVHSRFDPMQHELI 131
E K LV L+ L Y +L FH KG F ++H +F+ +
Sbjct: 4 ENAKTNQTLVENSLNTQLSNWFLLYSKLHRFHWYVKGPHF-----FTLHEKFEELYDHAA 58

Query: 132 ARIDDI 137
+D I
Sbjct: 59 ETVDTI 64


10    GBS_RS05640    GBS_RS05740    Y    Y    N    Pathogenicity Island (biased-composition)
LocusTag    DNBias    CDNBias    %GCBias    Product
GBS_RS05640313-1.550965carbon starvation protein A
GBS_RS05645417-2.218813response regulator transcription factor LtdR
GBS_RS05650419-2.488056sensor histidine kinase
GBS_RS11300225-3.973443lipoprotein
GBS_RS11720323-4.931663hypothetical protein
GBS_RS05655223-5.139441hypothetical protein
GBS_RS05660120-1.714075hypothetical protein
GBS_RS11310-1161.635530hypothetical protein
GBS_RS05665-1172.516864hypothetical protein
GBS_RS05670-1202.611539CHAP domain-containing protein
GBS_RS056750201.189638hypothetical protein
GBS_RS056802212.394990hypothetical protein
GBS_RS056851160.016516DUF4176 domain-containing protein
GBS_RS05690-123-3.962833hypothetical protein
GBS_RS05695-124-4.970289hypothetical protein
GBS_RS05700-3120.515978hypothetical protein
GBS_RS05705-3131.575247hypothetical protein
GBS_RS05710-2110.740175TIGR04197 family type VII secretion effector
GBS_RS05715-2120.942419type VII secretion protein EssC
GBS_RS05720-2120.699911cell division protein FtsK
GBS_RS0573009-3.477562type VII secretion protein EssB
GBS_RS0573509-3.394166EsaB/YukD family protein
GBS_RS0574009-3.633266type VII secretion protein EssA
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
GBS_RS05645    HTHFIS    66    1e-14    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 66.0 bits (161), Expect = 1e-14
Identities = 35/140 (25%), Positives = 60/140 (42%), Gaps = 9/140 (6%)

Query: 2 KVLVVDDEPVARNELIYLLNKYDSNLVIAEAHDMATALAILLRETFDVALLDIHLRDDSG 61
+LV DD+ R L L++ ++ I + AT + D+ + D+ + D++
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITS--NAATLWRWIAAGDGDLVVTDVVMPDENA 62

Query: 62 LQLAEYINKM-PKPPLLIFATAYDQY--AIQAFEHDARDYLLKPYDFDRLKQAMDRVKGA 118
L I K P P+L+ +A + + AI+A E A DYL KP+D L + + + A
Sbjct: 63 FDLLPRIKKARPDLPVLVM-SAQNTFMTAIKASEKGAYDYLPKPFD---LTELIGIIGRA 118

Query: 119 LSTSTIIESVTSGPLFKQQY 138
L+ S
Sbjct: 119 LAEPKRRPSKLEDDSQDGMP 138


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
GBS_RS05650    PF06580    205    5e-63    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 205 bits (522), Expect = 5e-63
Identities = 59/211 (27%), Positives = 110/211 (52%), Gaps = 12/211 (5%)

Query: 366 QAEEATRLLQDAEMKSLQAQVNPHFLFNALNTIYGLIRMDSEKARKLVQDFSKVIRANLQ 425
+ + Q+A++ +L+AQ+NPHF+FNALN I LI D KAR+++ S+++R +L+
Sbjct: 150 DQWKMASMAQEAQLMALKAQINPHFMFNALNNIRALILEDPTKAREMLTSLSELMRYSLR 209

Query: 426 RAKQNLIPLHDELEQVNAYLALEEARFPNMVAFNLDNQTNSDDNLMIPPFTLQVLIENSY 485
+ + L DEL V++YL L +F + + F ++ +PP +Q L+EN
Sbjct: 210 YSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQINPAI-MDVQVPPMLVQTLVENGI 268

Query: 486 KHAFKHVNKNNQLKVTIARNNDRLHIIVQDNGIGIPKEKLITLGKKTQISKQGSGTAIEN 545
KH + + ++ + ++N + + V++ G K +K+ +GT ++N
Sbjct: 269 KHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKN-----------TKESTGTGLQN 317

Query: 546 LVRRLNIIYDGQASLKFESNDSGTCAIVNIP 576
+ RL ++Y +A +K A+V IP
Sbjct: 318 VRERLQMLYGTEAQIKLSEKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
GBS_RS05660    RTXTOXINA    27    0.021    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 26.9 bits (59), Expect = 0.021
Identities = 15/31 (48%), Positives = 20/31 (64%), Gaps = 3/31 (9%)

Query: 71 AIDSGVDALSGAAIGTLVGGPVGTVVGAVQG 101
++ SG+ A AA +LVG PV +VGAV G
Sbjct: 377 SVSSGISA---AATTSLVGAPVSALVGAVTG 404


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
GBS_RS05690    IGASERPTASE    30    0.004    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 29.6 bits (66), Expect = 0.004
Identities = 19/103 (18%), Positives = 32/103 (31%), Gaps = 1/103 (0%)

Query: 36 QATTVTDQTSEKAGENSSAEATSASESVSSEATTVVPEASLPSTDLHAQPADSKTAPTET 95
Q+ TV Q + + ++ A T P S + +
Sbjct: 1135 QSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNS 1194

Query: 96 VIEG-QTETNPTTSPDKQFLQQAIPKEKHRLGRKQIQANPKLQ 137
V+E + T TT P PK +HR + + N +
Sbjct: 1195 VVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPA 1237


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
GBS_RS05745    MYCMG045    28    0.022    Hypothetical mycoplasma lipoprotein (MG045) signature.
		>MYCMG045#Hypothetical mycoplasma lipoprotein (MG045) signature.

Length = 483

Score = 27.8 bits (61), Expect = 0.022
Identities = 18/58 (31%), Positives = 30/58 (51%), Gaps = 3/58 (5%)

Query: 27 KNNDIHFDTKRLESD---SQKNSGFQTELTKDLFKKEDLKKLDKNRQSKVSLQKKQSK 81
K N+ +K++ +D S+K + TE K L +KED +L++N + V KK
Sbjct: 389 KKNNAEMKSKQMSTDQMTSEKEFDYYTETLKALLEKEDSAELNENEKKLVETIKKAYT 446


11    GBS_RS05995    GBS_RS06120    Y    Y    N    Pathogenicity Island (biased-composition)
LocusTag    DNBias    CDNBias    %GCBias    Product
GBS_RS059952233.753492plasmid mobilization relaxosome protein MobC
GBS_RS060003233.916336hypothetical protein
GBS_RS060054254.268375hypothetical protein
GBS_RS060104254.183495toprim domain-containing protein
GBS_RS060155243.264821hypothetical protein
GBS_RS060203224.244064type IV secretory system conjugative DNA
GBS_RS060254233.752570DUF3801 domain-containing protein
GBS_RS060302223.540156hypothetical protein
GBS_RS060351213.425375hypothetical protein
GBS_RS060401213.313661hypothetical protein
GBS_RS060451213.660771CHAP domain-containing protein
GBS_RS060501211.865845hypothetical protein
GBS_RS060550222.339214DUF87 domain-containing protein
GBS_RS060602231.610210PrgI family protein
GBS_RS060651222.241321hypothetical protein
GBS_RS060705131.771157hypothetical protein
GBS_RS060754131.684037hypothetical protein
GBS_RS060809141.120236membrane protein
GBS_RS0608510151.141980hypothetical protein
GBS_RS0609010161.299196hypothetical protein
GBS_RS0609510161.045458LPXTG cell wall anchor domain-containing
GBS_RS061005140.573638LPXTG cell wall anchor domain-containing
GBS_RS061057140.234483LPXTG cell wall anchor domain-containing
GBS_RS061102230.454976hypothetical protein
GBS_RS06115121-0.144926helix-turn-helix transcriptional regulator
GBS_RS06120319-0.736871hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
GBS_RS06035    60KDINNERMP    28    0.003    60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 28.4 bits (63), Expect = 0.003
Identities = 10/40 (25%), Positives = 20/40 (50%), Gaps = 1/40 (2%)

Query: 29 IKQLEKAQESQDALQKKIKTLKDDIASLEGKRQQEIMAEY 68
Q + + LQ KI+ +++ + + + QE+MA Y
Sbjct: 374 KAQYTSMAKMR-MLQPKIQAMRERLGDDKQRISQEMMALY 412


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
GBS_RS06040    PF05844    27    0.048    YopD protein
		>PF05844#YopD protein

Length = 295

Score = 27.3 bits (60), Expect = 0.048
Identities = 21/80 (26%), Positives = 37/80 (46%), Gaps = 14/80 (17%)

Query: 28 GFGQMGNSS-QSKDRPKQTATKKTRTN-TITQTQVKEFLVAYYTKKDLEENRNRYKDYMT 85
F QM N+S Q + Q + ++ N TI Q+Q K+ +E+ + +M
Sbjct: 222 SFVQMANASVQVRQGESQASAREEEVNATIGQSQ----------KQKVEDQMSFDAGFMK 271

Query: 86 EGLYNATISEENKAQNQAYR 105
+ L I + ++ NQA+R
Sbjct: 272 DVL--QLIQQYTQSHNQAWR 289


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
GBS_RS06045    cloacin    32    0.016    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 31.6 bits (71), Expect = 0.016
Identities = 24/80 (30%), Positives = 34/80 (42%), Gaps = 15/80 (18%)

Query: 23 HRHNQSSARKAQKAAKGELQVATKELKEAREAYQKAQTDFRAGIIDDEALKKAQEKLKQA 82
HR Q + KAQ+A Q + A +A K ++D D AL A E K+
Sbjct: 387 HRMWQMAGLKAQRA-----QTDVNNKQAAFDAAAKEKSD------ADAALSSAMESRKKK 435

Query: 83 QAKYKDAK----KVKKKVRK 98
+ K + A+ K K RK
Sbjct: 436 EDKKRSAENNLNDEKNKPRK 455


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
GBS_RS06095    IGASERPTASE    53    5e-09    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 53.1 bits (127), Expect = 5e-09
Identities = 28/177 (15%), Positives = 61/177 (34%), Gaps = 5/177 (2%)

Query: 49 QTVTQNQAETVTSTQLDKAVDTAKKAAVAVTTTTAVNHATTTDAQADLANQTQAVKDVTA 108
QTV T + Q D + +A V + K +
Sbjct: 990 QTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESK 1049

Query: 109 KAQANTQAIKDATAENAKIDAENKAEAERVAKANKAGQAEVDARNKAGQAAVDARNKAKQ 168
+ N Q + TA+N ++ E K+ + + N+ Q+ + + K
Sbjct: 1050 TVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTET-----KETA 1104

Query: 169 QAQDDQKAKIDAENKAESQRVSQLNAQNKAKIDAENKDAQAKADATNAQLQKDYQTK 225
+ ++KAK++ E E +V+ + + + + A+ + K+ Q++
Sbjct: 1105 TVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQ 1161



Score = 36.6 bits (84), Expect = 6e-04
Identities = 49/276 (17%), Positives = 86/276 (31%), Gaps = 23/276 (8%)

Query: 39 ADQVTTQATTQTVTQNQAETVTSTQLDKAVDTAKKAAVAVTTTTAVNHATTTDAQADLAN 98
D+ ETV ++ T +K T TTA N +A++++
Sbjct: 1020 VDEAPVPPPAPATPSETTETVAENSKQES-KTVEKNEQDATETTAQNREVAKEAKSNVKA 1078

Query: 99 QTQAVKDVTAKAQANTQAIKDATAENAKIDAENKA--EAERVAKANKAG----------- 145
TQ + V + T E A ++ E KA E E+ + K
Sbjct: 1079 NTQTNE-VAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSE 1137

Query: 146 ----QAEVDARNKAGQAAVDARNKAKQQAQDDQKAKIDAENKAESQRVSQ-LNAQNKAKI 200
QAE N + +++ A +Q AK + N + S +N N
Sbjct: 1138 TVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVE 1197

Query: 201 DAENKDAQAKADATNAQLQKDYQTKLANIKSVEAYN---AGVRQRNKDAQAKADATNAQL 257
+ EN N++ + + +N A ++ A D T+
Sbjct: 1198 NPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCDLTSTNT 1257

Query: 258 QKDYQAKLALYNQALKAKAEADKQSINNVAFDIKAQ 293
A +A Q I+ + + + Q
Sbjct: 1258 NAVLSDARAKAQFVALNVGKAVSQHISQLEMNNEGQ 1293


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
GBS_RS06100    GPOSANCHOR    35    2e-04    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 35.0 bits (80), Expect = 2e-04
Identities = 17/62 (27%), Positives = 20/62 (32%), Gaps = 5/62 (8%)

Query: 38 PLITPTPVDPGTP-TDTTKPSTPVDPGTPTDTTKPSTPVDPGTPTDTTKPSTPVDSGTST 96
L D TP + P P TKP+ P T PST G +
Sbjct: 457 KLRAGKASDSQTPDAKPGNKAVPGKGQAPQAGTKPNQNKAPMKETKRQLPST----GETA 512

Query: 97 NP 98
NP
Sbjct: 513 NP 514


ORFs having significant similarity with Known Virulence factors
LocusTag    Hits    Score    E-value    Comments
GBS_RS06105    RTXTOXIND    48    1e-07    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 47.5 bits (113), Expect = 1e-07
Identities = 25/192 (13%), Positives = 57/192 (29%), Gaps = 1/192 (0%)

Query: 443 QSQALASQQVAQKQLNDAQAKATGLNAVTMQTPIAQANLIKAQSNLKDAQKRLAEAQASV 502
A A Q L A+ + T ++ + + +K E
Sbjct: 129 ALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLT 188

Query: 503 KLANQDNVKKQADLTKAESKLKDAQKQLAAAQAKLTTSKTKLNQLKQVLAEASQQVAQAN 562
L + Q + E L + + A++ + K L + S + +
Sbjct: 189 SLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQA 248

Query: 563 QDYKQAKDNLTQKTAYLTNLRNAQANLIKAQSDVAQAKDNLANKIAKLQREVA-YLQELK 621
+ + + LR ++ L + +S++ AK+ + E+ L++
Sbjct: 249 IAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTT 308

Query: 622 TKAVDAQSQYQK 633
+ K
Sbjct: 309 DNIGLLTLELAK 320



Score = 39.0 bits (91), Expect = 6e-05
Identities = 17/152 (11%), Positives = 62/152 (40%), Gaps = 12/152 (7%)

Query: 432 MTKNMANALTKQSQALASQQVAQKQLNDAQAKATGLNAVTMQTPIAQANLIKAQSNLKDA 491
+N++ + +L +Q + Q Q + L+ + A + + ++ +
Sbjct: 175 YFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELN-LDKKRAERLTVLARINRYENLSRVE 233

Query: 492 QKRLAEAQASVK---LANQDNVKKQADLTKAESKLKDAQKQLAAAQAKLTTSKTKLNQLK 548
+ RL + + + +A ++++ +A ++L+ + QL ++++ ++K + +
Sbjct: 234 KSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVT 293

Query: 549 QV--------LAEASQQVAQANQDYKQAKDNL 572
Q+ L + + + + + ++
Sbjct: 294 QLFKNEILDKLRQTTDNIGLLTLELAKNEERQ 325


12    GBS_RS06360    GBS_RS06460    Y    Y    N    Pathogenicity Island (biased-composition)
LocusTag    DNBias    CDNBias    %GCBias    Product
GBS_RS06360212-0.831293GntR family transcriptional regulator
GBS_RS06365314-0.490804Asp23/Gls24 family envelope stress response
GBS_RS06370518-0.818651CsbD family protein
GBS_RS06375621-0.493709Asp23/Gls24 family envelope stress response
GBS_RS06380421-0.748971DUF2273 domain-containing protein
GBS_RS06385-1140.096668alkaline shock response membrane anchor protein
GBS_RS06390-2140.107307GlsB/YeaQ/YmgE family stress response membrane
GBS_RS06395-113-0.171952GlsB/YeaQ/YmgE family stress response membrane
GBS_RS064000110.314502DNA helicase PcrA
GBS_RS0641009-0.411474PaaI family thioesterase
GBS_RS06420312-1.028477uracil permease
GBS_RS06425215-1.273079sodium:alanine symporter family protein
GBS_RS06430016-3.076191cation transporter
GBS_RS06435218-3.679064CidA/LrgA family protein
GBS_RS06440015-5.355966LrgB family protein
GBS_RS06445-115-5.688658DUF3862 domain-containing protein
GBS_RS11730016-6.529004hypothetical protein
GBS_RS06450-217-7.272544hypothetical protein
GBS_RS11735-317-6.289079hypothetical protein
GBS_RS06455117-3.726989hypothetical protein
GBS_RS06460117-3.044493helix-turn-helix domain-containing protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS06445  ADHESNFAMILY  29     0.014    Adhesin family signature.
		>ADHESNFAMILY#Adhesin family signature.

Length = 309

Score = 28.7 bits (64), Expect = 0.014
Identities = 12/35 (34%), Positives = 18/35 (51%)

Query: 4 LKKITVGTIVCLSFLGLTACSSSNTQQTSTSKSNV 38
+KK+ ++ LS + L AC+S TS K V
Sbjct: 1 MKKLGTLLVLFLSAIILVACASGKKDTTSGQKLKV 35


13  GBS_RS06540  GBS_RS06645  Y  N  N  Genomic Island
LocusTag     DNBias  CDNBias  %GCBias    Product
GBS_RS06540  0       13       -3.602492  acetyltransferase
GBS_RS06545  1       15       -4.377938  UDP-N-acetylglucosamine 2-epimerase
GBS_RS06550  1       19       -5.931339  N-acetylneuraminate synthase
GBS_RS06555  2       21       -6.417140  polysaccharide biosynthesis protein
GBS_RS06560  4       22       -6.619550  capsular polysaccharide biosynthesis protein
GBS_RS06565  2       18       -6.660906  glycosyltransferase family 2 protein
GBS_RS06570  2       17       -5.466696  glycosyltransferase family 2 protein
GBS_RS06575  1       16       -5.106700  hypothetical protein
GBS_RS06580  -1      15       -4.268622  multidrug MFS transporter
GBS_RS06585  -2      12       -4.617626  UDP-N-acetylglucosamine--LPS N-acetylglucosamine
GBS_RS06590  -2      10       -4.279736  sugar transferase
GBS_RS06595  -2      11       -3.759945  tyrosine-protein kinase
GBS_RS06600  -2      11       -3.360345  capsular polysaccharide biosynthesis protein
GBS_RS06605  -2      10       -2.633491  tyrosine-protein phosphatase CpsB
GBS_RS06610  -1      11       -1.082883  LCP family protein
GBS_RS06615  -2      10       -0.030028  LysR family transcriptional regulator
GBS_RS06620  -2      11       1.561274   hypothetical protein
GBS_RS06625  -1      12       2.648561   purine-nucleoside phosphorylase
GBS_RS06630  -1      11       2.116016   chloride channel protein
GBS_RS06635  0       13       2.348691   purine-nucleoside phosphorylase
GBS_RS06640  2       14       1.554374   arsenate reductase (glutaredoxin)
GBS_RS06645  2       13       1.292871   phosphopentomutase
14  GBS_RS06875  GBS_RS07245  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
GBS_RS06875  -1      18       4.104394   IS3-like element ISSag2 family transposase
GBS_RS11330  -1      13       2.889434   pneumococcal-type histidine triad protein
GBS_RS06895  -1      15       3.421058   metal ABC transporter substrate-binding
GBS_RS06900  -2      12       1.557480   C5a peptidase ScpA/B
GBS_RS06905  -3      11       -0.764859  ISLre2 family transposase
GBS_RS06910  -4      11       -1.181896  3'-5' exoribonuclease
GBS_RS06915  -1      15       -4.912475  hypothetical protein
GBS_RS11335  -1      16       -4.431343  site-specific integrase
GBS_RS06930  1       16       -5.674479  DUF3173 family protein
GBS_RS06935  2       19       -3.429467  replication protein
GBS_RS06940  3       23       0.046764   hypothetical protein
GBS_RS06945  4       25       0.584979   hypothetical protein
GBS_RS06950  5       17       -1.827230  hypothetical protein
GBS_RS06955  5       15       -2.256438  hypothetical protein
GBS_RS06965  4       14       -1.982929  hypothetical protein
GBS_RS06970  2       14       -1.931034  hypothetical protein
GBS_RS06975  2       13       -2.227149  Eco57I restriction-modification methylase
GBS_RS06980  2       13       -2.578484  site-specific integrase
GBS_RS06985  2       13       -1.881354  DUF3173 family protein
GBS_RS06990  0       13       -0.745021  helix-turn-helix transcriptional regulator
GBS_RS06995  -2      14       1.313954   aldose 1-epimerase family protein
GBS_RS07005  -1      15       2.187765   6-phospho-beta-galactosidase
GBS_RS07015  0       21       4.840439   PTS transporter subunit EIIC
GBS_RS07020  0       23       5.322969   PTS lactose/cellobiose transporter subunit IIA
GBS_RS07025  1       26       5.853209   transcription antiterminator
GBS_RS07030  0       26       5.231535   tagatose-bisphosphate aldolase
GBS_RS07035  2       24       3.533309   tagatose-6-phosphate kinase
GBS_RS07045  4       25       1.122616   galactose-6-phosphate isomerase subunit LacB
GBS_RS07050  3       18       -2.374298  galactose-6-phosphate isomerase subunit LacA
GBS_RS07055  3       15       -4.586448  DeoR/GlpR transcriptional regulator
GBS_RS07060  3       16       -4.451837  relaxase/mobilization nuclease domain-containing
GBS_RS07065  1       18       -4.596818  MobC family plasmid mobilization relaxosome
GBS_RS07070  2       19       -4.538295  hypothetical protein
GBS_RS07075  1       19       -4.254878  ATP-dependent helicase
GBS_RS07080  1       20       -2.258890  AAA family ATPase
GBS_RS07085  0       26       3.517796   zeta toxin family protein
GBS_RS07090  1       27       3.108518   helix-turn-helix transcriptional regulator
GBS_RS07095  2       26       2.813826   hypothetical protein
GBS_RS07100  2       27       3.136315   hypothetical protein
GBS_RS07105  2       27       4.218382   hypothetical protein
GBS_RS07110  2       27       4.248594   toprim domain-containing protein
GBS_RS07115  3       28       4.158891   hypothetical protein
GBS_RS07120  3       28       4.365555   hypothetical protein
GBS_RS07125  1       25       4.883217   hypothetical protein
GBS_RS07130  1       24       4.513658   DEAD/DEAH box helicase family protein
GBS_RS07135  0       21       3.368332   hypothetical protein
GBS_RS07140  1       22       4.203753   hypothetical protein
GBS_RS07145  1       23       4.370718   hypothetical protein
GBS_RS07150  1       24       4.307269   LPXTG cell wall anchor domain-containing
GBS_RS07155  3       26       3.809960   type IV toxin-antitoxin system AbiEi family
GBS_RS07160  3       28       4.599697   nucleotidyl transferase AbiEii/AbiGii toxin
GBS_RS07165  4       32       5.858810   CHAP domain-containing protein
GBS_RS07170  4       36       5.584994   DUF87 domain-containing protein
GBS_RS07175  2       36       5.925544   PrgI family protein
GBS_RS07180  2       35       5.911185   type IV secretion system protein
GBS_RS07185  4       34       5.444068   hypothetical protein
GBS_RS07190  4       33       4.921261   type IV secretory system conjugative DNA
GBS_RS07195  5       34       5.568642   hypothetical protein
GBS_RS07200  5       30       4.857556   CPBP family intramembrane metalloprotease
GBS_RS07205  5       29       4.886343   hypothetical protein
GBS_RS07210  3       25       4.799932   hypothetical protein
GBS_RS07215  2       20       4.729081   hypothetical protein
GBS_RS07220  1       14       3.637324   DNA (cytosine-5-)-methyltransferase
GBS_RS07225  2       16       3.372681   replication initiator protein A
GBS_RS11740  1       19       4.537337   hypothetical protein
GBS_RS07230  1       18       4.055003   50S ribosomal protein L7/L12
GBS_RS07235  -1      16       3.654661   50S ribosomal protein L10
GBS_RS07240  -2      17       3.336459   ATP-dependent Clp protease ATP-binding subunit
GBS_RS07245  -2      21       3.889375   homocysteine S-methyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
GBS_RS06900  PF05616  34     0.002    Neisseria meningitidis TspB protein
		>PF05616#Neisseria meningitidis TspB protein

Length = 501

Score = 34.3 bits (78), Expect = 0.002
Identities = 24/87 (27%), Positives = 35/87 (40%), Gaps = 2/87 (2%)

Query: 226 IPKKDLSPSELAAAQAYWSQKQGRGARPSDY-RPTPAPGRRKAPIPDVTPNPGQGHQPD- 283
IP+ DL+P A A + P++ P PG R P PD NP D
Sbjct: 310 IPRPDLTPGSAEAPNAQPLPEVSPAENPANNPAPNENPGTRPNPEPDPDLNPDANPDTDG 369

Query: 284 NGGYHPAPPRPNDASQNKHQRDEFKGK 310
G P P D +H+++ +G+
Sbjct: 370 QPGTRPDSPAVPDRPNGRHRKERKEGE 396


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS06905  ADHESNFAMILY  247    5e-83    Adhesin family signature.
		>ADHESNFAMILY#Adhesin family signature.

Length = 309

Score = 247 bits (631), Expect = 5e-83
Identities = 82/323 (25%), Positives = 143/323 (44%), Gaps = 34/323 (10%)

Query: 1 MKKGFFLMAMVVSLVMIAGCDKSANPKQPTQGMSVVTSFYPMYAMTKEVSGDLNDVR-MI 59
MKK L+ + +S +++ C Q + VV + + +TK ++GD D+ ++
Sbjct: 1 MKKLGTLLVLFLSAIILVACASGKKDTTSGQKLKVVATNSIIADITKNIAGDKIDLHSIV 60

Query: 60 QSGAGIHSFEPSVNDVAAIYDADLFVYHSHTLE----AWARDLDPNLKKSKVDVFEASKP 115
G H +EP DV +ADL Y+ LE AW L N KK++ + A
Sbjct: 61 PIGQDPHEYEPLPEDVKKTSEADLIFYNGINLETGGNAWFTKLVENAKKTENKDYFA--- 117

Query: 116 LTLDRVKGLEDMEVTQGIDPATLY--------DPHTWTDPVLAGEEAVNIAKELGRLDPK 167
V+ G+D L DPH W + A NIAK+L DP
Sbjct: 118 -------------VSDGVDVIYLEGQNEKGKEDPHAWLNLENGIIFAKNIAKQLSAKDPN 164

Query: 168 HKDSYTKKAKAFKKEAEQLTEEYTQKFKKVR--SKTFVTQHTAFSYLAKRFGLKQLGISG 225
+K+ Y K K + + ++L +E KF K+ K VT AF Y +K +G+ I
Sbjct: 165 NKEFYEKNLKEYTDKLDKLDKESKDKFNKIPAEKKLIVTSEGAFKYFSKAYGVPSAYIWE 224

Query: 226 ISPEQEPSPRQLKEIQDFVKEYNVKTIFAEDNVNPKIAHAIAKSTGAKVKT---LSPLEA 282
I+ E+E +P Q+K + + +++ V ++F E +V+ + +++ T + +
Sbjct: 225 INTEEEGTPEQIKTLVEKLRQTKVPSLFVESSVDDRPMKTVSQDTNIPIYAQIFTDSIAE 284

Query: 283 APSGNKTYLENLRANLEVLYQQL 305
+Y ++ NL+ + + L
Sbjct: 285 QGKEGDSYYSMMKYNLDKIAEGL 307


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits        Score  E-value  Comments
GBS_RS06910  SUBTILISIN  107    2e-27    Subtilisin serine protease family (S8) signature.
		>SUBTILISIN#Subtilisin serine protease family (S8) signature.

Length = 326

Score = 107 bits (269), Expect = 2e-27
Identities = 51/226 (22%), Positives = 85/226 (37%), Gaps = 47/226 (20%)

Query: 117 KAGKGAGTVVAVIDAGFDKNHEAWRLTDKAKARYQSKEDLEKAKKEHGITYGEWVNDKVA 176
+G G VAV+D G D +H DL KA+ G + +
Sbjct: 36 NQTRGRGVKVAVLDTGCDADHP----------------DL-KARIIGGRNFTDDDEGDPE 78

Query: 177 YYHDYSKDGKTAVDQEHGTHVSGILSGNAPSETKEPYRLEGAMPEAQLLLMRVEIVNGLA 236
+ DY+ HGTHV+G ++ + G PEA LL+++V G
Sbjct: 79 IFKDYNG---------HGTHVAGTIAATENE-----NGVVGVAPEADLLIIKVLNKQGSG 124

Query: 237 DYARNYAQAIRDAINLGAKVINMSFGNAALAYANLPDETKKAFDYAKSKGVSIVTSAGND 296
Y Q I AI +I+MS G E +A A + + ++ +AGN+
Sbjct: 125 QYD-WIIQGIYYAIEQKVDIISMSLGGPED-----VPELHEAVKKAVASQILVMCAAGNE 178

Query: 297 SSFGGKTRLPLADHPDYGVVGTPAAADSTLTVASYSPDKQLTETAT 342
+T +G P + ++V + + D+ +E +
Sbjct: 179 GDGDDRT----------DELGYPGCYNEVISVGAINFDRHASEFSN 214



Score = 79.9 bits (197), Expect = 5e-18
Identities = 37/139 (26%), Positives = 58/139 (41%), Gaps = 22/139 (15%)

Query: 457 NATPKVLPTASGTK---LSRFSSWGLTADGNIKPDIAAPGQDILSSVANNKYAKLSGTSM 513
+V+ + S FS+ + D+ APG+DILS+V KYA SGTSM
Sbjct: 192 GCYNEVISVGAINFDRHASEFSNSNN------EVDLVAPGEDILSTVPGGKYATFSGTSM 245

Query: 514 SAPLVAGIMGL-LQKQYETQYPDMTPSERLDLAKKVLMSSATALYDEDEKAYFSPRQQGA 572
+ P VAG + L Q + D+T E L+ L + SP+ +G
Sbjct: 246 ATPHVAGALALIKQLANASFERDLTEPE----LYAQLIKRTIPLGN-------SPKMEGN 294

Query: 573 GAVDAKKASA-ATMYVTDK 590
G + + ++ T +
Sbjct: 295 GLLYLTAVEELSRIFDTQR 313


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS07055  ARGREPRESSOR  29     0.009    Bacterial arginine repressor signature.
		>ARGREPRESSOR#Bacterial arginine repressor signature.

Length = 149

Score = 29.1 bits (65), Expect = 0.009
Identities = 17/48 (35%), Positives = 26/48 (54%), Gaps = 5/48 (10%)

Query: 1 MNKHQRLDELVKLVDRVGTVTVSEIMDNLK-----VSDMTVRRDLTEL 43
MNK QR ++ +++ T E++D LK V+ TV RD+ EL
Sbjct: 1 MNKGQRHIKIREIITANEIETQDELVDILKKDGYNVTQATVSRDIKEL 48


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
GBS_RS07065  PF07328  37     3e-06    T-DNA border endonuclease VirD1
		>PF07328#T-DNA border endonuclease VirD1

Length = 144

Score = 37.3 bits (86), Expect = 3e-06
Identities = 20/92 (21%), Positives = 39/92 (42%), Gaps = 13/92 (14%)

Query: 26 LSKSKCKTFSEYARKVLLNPNM----------NFLTVDTTTYQDLIFELKRIG---NNVN 72
+++++ F ++ LN N F+ D T + L + I N+N
Sbjct: 27 MTEAELAEFDAQIAELGLNRNRALRIAARRIGGFVENDAKTVELLRDMSRAIAGVATNIN 86

Query: 73 QIAHAVNQNHIISHEQFQELRQGMAELVSEVE 104
QIA A N+ H ++ F R+ + +S++
Sbjct: 87 QIAKAANRTHDPAYHSFMAERKVLGLELSKLS 118


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits        Score  E-value  Comments
GBS_RS07150  GPOSANCHOR  41     3e-05    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 41.2 bits (96), Expect = 3e-05
Identities = 53/365 (14%), Positives = 111/365 (30%), Gaps = 52/365 (14%)

Query: 13 QEKGEKYVFRKSKQYRTLCSVALGTVVMAFVALAGPMVQADEVGRTVATSVQTETNPATN 72
Y RK K SVA+ V+ AG +V +EV S
Sbjct: 4 NNTNRHYSLRKLKT--GTASVAVALTVLG----AGLVVNTNEVSAVATRSQTDTLEKVQE 57

Query: 73 LKENQPSPIAEQKDSLAATGQSTGTVTVTVPHDKVTQAVDKAKTEGIKAVQDKPMDLGNT 132
+ K + + + + + TE + ++K +
Sbjct: 58 RADKFEIENNTLKLKNSDLSFNNKAL----------KDHNDELTEELSNAKEKLRKNDKS 107

Query: 133 VSAAETSQQLKKAEEDATNQTTTISKTVEIYKSDKATYEAEKKWVEKRNEELTAAYDKAE 192
+S ++ + + K +E + A+ K +E L A E
Sbjct: 108 LSEKA------SKIQELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLE 161

Query: 193 QTGTGL-NHSVDTTVSELKSQDQNAHVTVNTQTVKSGDGTSVSGYQEYVKSVAAIDKKNK 251
+ G N S + + + A + ++ ++ G + + +A K +
Sbjct: 162 KALEGAMNFSTADSAKIKTLEAEKAALEARQAELE----KALEGAMNFSTADSAKIKTLE 217

Query: 252 ANLADYRTKKQAADAVVAKNQLIQKENEAGLAKAKAENEAIDRRNKEGQKAVDEANKAG- 310
A A +K + + + A + +AE A++ R E +KA++ A
Sbjct: 218 AEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFST 277

Query: 311 ----------------QAAVEQANQEKQKQAANRAFEIATITKRNKEREEVAKKENAAID 354
+A + Q ANR + + + +++ ++
Sbjct: 278 ADSAKIKTLEAEKAALEAEKADLEHQSQVLNANR--------QSLRRDLDASREAKKQLE 329

Query: 355 AYNAK 359
A + K
Sbjct: 330 AEHQK 334



Score = 39.7 bits (92), Expect = 9e-05
Identities = 48/266 (18%), Positives = 93/266 (34%), Gaps = 24/266 (9%)

Query: 104 HDKVTQAVDKAKTEGIKAVQDKPMDLGNTVSAAETSQQLKKAEEDATNQTTTISKTVEIY 163
+ + + A+ D + L + D
Sbjct: 185 KAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTAD 244

Query: 164 KSDKATYEAEKKWVEKRNEELTAAYDKAEQTGTGLNHSVDTTVSELKSQDQNAHVTVNTQ 223
+ T EAEK +E R EL A + A T + + T +E + + +
Sbjct: 245 SAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQS 304

Query: 224 TVKSGDGTSVSGYQEYVKSVAAIDKKNKANLADYRTKKQAADAV-------VAKNQLIQK 276
V + + S+ ++ S A K+ +A + + ++A + ++ +K
Sbjct: 305 QVLNANRQSLR--RDLDASREAK-KQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKK 361

Query: 277 ENEAGLAKAKAENEAIDRRNKEGQKAVD---EANKAGQAAVEQANQEKQKQAANRAFEIA 333
+ EA K + +N+ + + ++ +D EA K + A+E+AN + +A
Sbjct: 362 QLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQVEKALEEANSK-----------LA 410

Query: 334 TITKRNKEREEVAKKENAAIDAYNAK 359
+ K NKE EE K AK
Sbjct: 411 ALEKLNKELEESKKLTEKEKAELQAK 436


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits    Score  E-value  Comments
GBS_RS07240  HTHFIS  43     3e-06    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 43.3 bits (102), Expect = 3e-06
Identities = 51/253 (20%), Positives = 90/253 (35%), Gaps = 31/253 (12%)

Query: 419 VIGQNDAVEAVARAIRRNRAGFDDGNRPIGSFLFVGPTGVGKTELAKQLAFDMFGSKDAI 478
++G++ A++ + R + R + + + G +G GK +A+ L
Sbjct: 139 LVGRSAAMQEIYRVLAR----LMQTDLTL---MITGESGTGKELVARALHDYGKRRNGPF 191

Query: 479 VRLDMSEYNDRTAVSKLIGATAGYVGYDDNSNTLTERIRRNPYSIVLLDEIEKADPQVIT 538
V ++M+ S+L G G + T R + + LDEI T
Sbjct: 192 VAINMAAIPRDLIESELFGHEKG--AFTGAQTRSTGRFEQAEGGTLFLDEIGDMPMDAQT 249

Query: 539 LLLQVLDDGRLTDGQGNTINFKNTVIIATSNAGFGNEAFTGDSDKDLKIMERISPYFRPE 598
LL+VL G T G T + I+A +N KDLK FR +
Sbjct: 250 RLLRVLQQGEYTTVGGRTPIRSDVRIVAATN-------------KDLKQSIN-QGLFRED 295

Query: 599 FLNRFNGV-IEFSHLS--KDDLNEIVDLMLDEVNQTIGKKGIDLVVDENVKSHLIDLGYD 655
R N V + L +D+ ++V + + + G D+ + +
Sbjct: 296 LYYRLNVVPLRLPPLRDRAEDIPDLVRHFVQQAEK-EGLDVKRF--DQEALELM--KAHP 350

Query: 656 EAMGVRPLRRVIE 668
VR L ++
Sbjct: 351 WPGNVRELENLVR 363


15  GBS_RS07970  GBS_RS11360  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
GBS_RS07970  2       13       -3.330569  accessory Sec system protein Asp3
GBS_RS07975  2       13       -3.144325  accessory Sec system protein Asp2
GBS_RS07980  3       12       -3.794474  accessory Sec system protein Asp1
GBS_RS07985  2       14       -3.759699  accessory Sec system protein translocase subunit
GBS_RS07990  2       13       -3.138199  SP_1767 family glycosyltransferase
GBS_RS07995  2       19       1.750718   glycosyltransferase family 2 protein
GBS_RS08000  1       17       1.027226   glycosyl transferase family 8
GBS_RS08005  2       14       1.474278   glycosyl transferase
GBS_RS08010  1       15       1.990667   glycosyl transferase
GBS_RS08015  2       13       2.352805   sugar transferase
GBS_RS11360  2       14       2.402390   serine-rich repeat glycoprotein adhesin Srr1
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS07985  SECYTRNLCASE  140    1e-39    Preprotein translocase SecY subunit signature.
		>SECYTRNLCASE#Preprotein translocase SecY subunit signature.

Length = 437

Score = 140 bits (355), Expect = 1e-39
Identities = 85/393 (21%), Positives = 173/393 (44%), Gaps = 40/393 (10%)

Query: 6 IFEKNIILRKILITFSLIIIFLLGRYVPIPGVLISAYKG------QDNNFATLYSTVTGG 59
F + +K+L T ++I+++ +G ++PIPGV + + L + +GG
Sbjct: 8 AFRTPDLRKKLLFTLAIIVVYRVGTHIPIPGVDYKNVQQCVREASGNQGLFGLVNMFSGG 67

Query: 60 NLSQVGVFSLGIGPMMTTMILLRLFTIG--------KYSSGVSQKVQQFRQNVVMLVIAI 111
L Q+ +F+LGI P +T I+L+L T+ K + K+ Q+ + + +AI
Sbjct: 68 ALLQITIFALGIMPYITASIILQLLTVVIPRLEALKKEGQAGTAKITQY-TRYLTVALAI 126

Query: 112 IQGLAIAISF-------------QYHNGFSLTKLLLATMILVTGAYIISWIGNLNAEYGF 158
+QG + + Q S+ + + + G ++ W+G L + G
Sbjct: 127 LQGTGLVATARSAPLFGRCSVGGQIVPDQSIFTTITMVICMTAGTCVVMWLGELITDRGI 186

Query: 159 G-GMTILVVVGMLVGQFNNIPLIFEL-FQDGYQLAIILFLLWTLVAMYLMITFERSEYRI 216
G GM+IL+ + + + + I + G + + L+ + L++ E+++ RI
Sbjct: 187 GNGMSILMFISIAATFPSALWAIKKQGTLAGGWIEFGTVIAVGLIMVALVVFVEQAQRRI 246

Query: 217 PVM--RTSIHNRLVDDA--YMPIKVNASGGMAFMYVYTLLMFPQYIIILLRSIFPTNPDI 272
PV + I R Y+P+KVN +G + ++ +LL P + N
Sbjct: 247 PVQYAKRMIGRRSYGGTSTYIPLKVNQAGVIPVIFASSLLYIPALVA----QFAGGNSGW 302

Query: 273 TSY--NDYFSLSSIQGVVIYMILMLVLSVAFTFVNIDPTKISEAMRESGDFIPNYRPGKE 330
S+ + +V Y +L++ + + ++ +P ++++ M++ G FIP R G+
Sbjct: 303 KSWVEQNLTKGDHPIYIVTYFLLIVFFAFFYVAISFNPEEVADNMKKYGGFIPGIRAGRP 362

Query: 331 TQSYLSKICYLFGTFSGFFMAFLGGVPLLFALG 363
T YLS + ++ + VP + +G
Sbjct: 363 TAEYLSYVLNRITWPGSLYLGLIALVPTMALVG 395


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS11360  ICENUCLEATIN  57     3e-10    Ice nucleation protein signature.
		>ICENUCLEATIN#Ice nucleation protein signature.

Length = 1258

Score = 57.5 bits (138), Expect = 3e-10
Identities = 142/636 (22%), Positives = 259/636 (40%), Gaps = 1/636 (0%)

Query: 623 YGHTNISGDSDANAEIKLLSESASTSASTSASTSASMSASTSASTSASMSASTSASTSAS 682
YG T +G+ + +++ + +A ++ +A +S A ++ +A +S
Sbjct: 212 YGSTQTAGEESSQMAGYGSTQTGMKGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSS 271

Query: 683 MSAS-TSASTSASTSTSTSASTSASTSASTSASMSASTSASTSASTSASTSASTSASTSA 741
++A S T+ S T+ S T+ + S+ ++ S T+ S T+ S T+
Sbjct: 272 LTAGYGSTQTAQKGSDLTAGYGSTGTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQ 331

Query: 742 STSASTSASTSASTSASTSASTSASTSASTSASTSASMSASMSASTSASMSASTSASTSA 801
S T+ S T+ S+ + S T+ S+ + S T+ S T+ S
Sbjct: 332 KGSDLTAGYGSTGTAGDDSSLIAGYGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGST 391

Query: 802 STSASTSASTSASTSASMSASTSASTSASTSASMSASTSASTSASTSASTSASTSTSTSA 861
T+ + S+ + S + S T+ S + S T+ S T+ S+ +
Sbjct: 392 GTAGADSSLIAGYGSTQTAGEESTQTAGYGSTQTAQKGSDLTAGYGSTGTAGDDSSLIAG 451

Query: 862 STSASTSASTSASMSASTSASTSPSTSASTSASTSASTSASTSASTSASMSASTSASTSA 921
S T+ S+ + S T+ S T+ S ST+ S+ + S T+ S
Sbjct: 452 YGSTQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGST 511

Query: 922 SMSASTSASTSASTSASTSASMSASTSASTSASTSASMSASTSASTSASTSASTSASTST 981
+ S T+ + S + S ST+ + S+ + S T++ S T+ S T+
Sbjct: 512 LTAGYGSTQTAQNESDLITGYGSTSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQTAR 571

Query: 982 STSASTSASTSASTSASMSASTSASTSASTSASMSASTSASISASTSASMSASTSASTSA 1041
S T+ S T+ S S+ + S T++ S+ T+ S T+ S T+ S
Sbjct: 572 EGSDLTAGYGSTGTAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGST 631

Query: 1042 STSASTSASMSASTSASTSASTSASTSASMSASTSASTSASTSASTSASTSASTSASTSA 1101
ST+ + S+ ++ S T+ S T+ S T+ S T+ S ST+ + S+ +
Sbjct: 632 STAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGADSSLIAG 691

Query: 1102 STSSSTSASTSASTSASTSASMSASTSASTSASMSASTSASTSASTSASMSASTSSSTSA 1161
S+ T+ S T+ S + S TS S ST+ + S+ + S T+S S+
Sbjct: 692 YGSTQTAGYNSILTAGYGSTQTAQEGSDLTSGYGSTSTAGADSSLIAGYGSTQTASYHSS 751

Query: 1162 SMSASTSASMSASTSASTSASTSASMSASTSSSTSASMSASTSASMSASTSASTSASTSA 1221
+ S + S T+ S S + + SS + S T+ S T+ S T+
Sbjct: 752 LTAGYGSTQTAREQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYHSILTAGYGSTQTAQ 811

Query: 1222 SMSASTSASMSATTSASTSVSTSASTSASTSASTSS 1257
S T+ S +T+ + S + S T+ S
Sbjct: 812 ERSDLTTGYGSTSTAGADSSLIAGYGSTQTAGYNSI 847



Score = 55.1 bits (132), Expect = 2e-09
Identities = 145/618 (23%), Positives = 257/618 (41%), Gaps = 2/618 (0%)

Query: 642 SESASTSASTSASTSASMSASTSASTSASMSASTSASTSASMSASTSASTSASTSTSTSA 701
S + S+ + S + S + STS + S + ST + ST
Sbjct: 454 STQTAGEDSSLTAGYGSTQTAQKGSDLTAGYGSTSTAGYESSLIAGYGSTQTAGYGSTLT 513

Query: 702 STSASTSASTSAS--MSASTSASTSASTSASTSASTSASTSASTSASTSASTSASTSAST 759
+ ST + + S ++ S ST+ + S+ + S T++ S T+ S T+
Sbjct: 514 AGYGSTQTAQNESDLITGYGSTSTAGANSSLIAGYGSTQTASYNSVLTAGYGSTQTAREG 573

Query: 760 SASTSASTSASTSASTSASMSASMSASTSASMSASTSASTSASTSASTSASTSASTSASM 819
S T+ S T+ S S+ ++ S T++ S+ T+ S T+ S T+ S S
Sbjct: 574 SDLTAGYGSTGTAGSDSSIIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTST 633

Query: 820 SASTSASTSASTSASMSASTSASTSASTSASTSASTSTSTSASTSASTSASTSASMSAST 879
+ + S+ + S + S T+ S T+ S T+ S ST+ + S+ ++
Sbjct: 634 AGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTAGYGSTSTAGADSSLIAGYG 693

Query: 880 SASTSPSTSASTSASTSASTSASTSASTSASMSASTSASTSASMSASTSASTSASTSAST 939
S T+ S T+ S T+ S TS S ST+ + S+ ++ S T++ S+ T
Sbjct: 694 STQTAGYNSILTAGYGSTQTAQEGSDLTSGYGSTSTAGADSSLIAGYGSTQTASYHSSLT 753

Query: 940 SASMSASTSASTSASTSASMSASTSASTSASTSASTSASTSTSTSASTSASTSASTSASM 999
+ S T+ S T+ S ST+ + S+ + S T+ S T+ S T+
Sbjct: 754 AGYGSTQTAREQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYHSILTAGYGSTQTAQER 813

Query: 1000 SASTSASTSASTSASMSASTSASISASTSASMSASTSASTSASTSASTSASMSASTSAST 1059
S T+ S ST+ + S+ + S T+ S T+ S T+ S + S ST
Sbjct: 814 SDLTTGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTST 873

Query: 1060 SASTSASTSASMSASTSASTSASTSASTSASTSASTSASTSASTSSSTSASTSASTSAST 1119
+ S+ + S T+ S T+ S T+ S T+ S+ST+ S+ +
Sbjct: 874 AGYDSSLIAGYGSTQTAGYNSILTAGYGSTQTAQENSDLTTGYGSTSTAGYESSLIAGYG 933

Query: 1120 SASMSASTSASTSASMSASTSASTSASTSASMSASTSSSTSASMSASTSASMSASTSAST 1179
S ++ S + S+ T+ S+ T+ S S + S+ ++ S + S T
Sbjct: 934 STQTASFKSTLMAGYGSSQTAREQSSLTAGYGSTSMAGYDSSLIAGYGSTQTAGYQSTLT 993

Query: 1180 SASTSASMSASTSSSTSASMSASTSASMSASTSASTSASTSASMSASTSASMSATTSAST 1239
+ S + +S+ T+ S +T+ + S+ + S+ TS S T+ S S
Sbjct: 994 AGYGSTQTAEHSSTLTAGYGSTATAGADSSLIAGYGSSLTSGIRSFLTAGYGSTLISGLR 1053

Query: 1240 SVSTSASTSASTSASTSS 1257
SV T+ S+ S SS
Sbjct: 1054 SVLTAGYGSSLISGRRSS 1071



Score = 47.1 bits (111), Expect = 5e-07
Identities = 141/623 (22%), Positives = 256/623 (41%), Gaps = 4/623 (0%)

Query: 648 SASTSASTSASMSASTSASTSASMSASTSASTSASMSASTSASTSASTSTSTSASTSAST 707
S T++ S + S T+ S T+ S + S S+ + ST T++ S+ T
Sbjct: 550 STQTASYNSVLTAGYGSTQTAREGSDLTAGYGSTGTAGSDSSIIAGYGSTQTASYHSSLT 609

Query: 708 SASTSASMSASTSASTSASTSASTSASTSASTSASTSASTSASTSASTSASTSASTSAST 767
+ S + S T+ S ST+ + S+ + S T+ S T+ S T+
Sbjct: 610 AGYGSTQTAREQSVLTTGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQEG 669

Query: 768 SASTSASTSASMSASMSASTSASMSASTSASTSASTSASTSASTSASTSASMSASTSAST 827
S T+ S S + + S+ + S T+ S T+ S T+ S S S ST
Sbjct: 670 SDLTAGYGSTSTAGADSSLIAGYGSTQTAGYNSILTAGYGSTQTAQEGSDLTSGYGSTST 729

Query: 828 SASTSASMSASTSASTSASTSASTSASTSTSTSASTSASTSASTSASMSASTSASTSPST 887
+ + S+ ++ S T++ S+ T+ ST T+ S T+ S S + + S+ +
Sbjct: 730 AGADSSLIAGYGSTQTASYHSSLTAGYGSTQTAREQSVLTTGYGSTSTAGADSSLIAGYG 789

Query: 888 SASTSASTSASTSASTSASTSASMSASTSASTSASMSASTSASTSASTSASTSASMSAST 947
S T+ S T+ S T+ S T+ S S + + S+ + S T+ S T
Sbjct: 790 STQTAGYHSILTAGYGSTQTAQERSDLTTGYGSTSTAGADSSLIAGYGSTQTAGYNSILT 849

Query: 948 SASTSASTSASMSASTSASTSASTSASTSASTSTSTSASTSASTSASTSASMSASTSAST 1007
+ S T+ S T+ S ST+ S+ + S T+ S T+ S T+
Sbjct: 850 AGYGSTQTAQENSDLTTGYGSTSTAGYDSSLIAGYGSTQTAGYNSILTAGYGSTQTAQEN 909

Query: 1008 SASTSASMSASTSASISASTSASMSASTSASTSASTSASTSASMSASTSASTSASTSAST 1067
S T+ S ST+ S+ + S T++ S + S+ + S+ T+ S S
Sbjct: 910 SDLTTGYGSTSTAGYESSLIAGYGSTQTASFKSTLMAGYGSSQTAREQSSLTAGYGSTSM 969

Query: 1068 SASMSASTSASTSASTSASTSASTSASTSASTSASTSSSTSASTSASTSASTSASMSAST 1127
+ S+ + S T+ S T+ ST + SST T+ S +T+ + S+
Sbjct: 970 AGYDSSLIAGYGSTQTAGYQSTLTAGY--GSTQTAEHSST--LTAGYGSTATAGADSSLI 1025

Query: 1128 SASTSASMSASTSASTSASTSASMSASTSSSTSASMSASTSASMSASTSASTSASTSASM 1187
+ S+ S S T+ S +S S T+ S+ S S+ T+ S ++
Sbjct: 1026 AGYGSSLTSGIRSFLTAGYGSTLISGLRSVLTAGYGSSLISGRRSSLTAGYGSNQIASHR 1085

Query: 1188 SASTSSSTSASMSASTSASMSASTSASTSASTSASMSASTSASMSATTSASTSVSTSAST 1247
S+ + S ++ + S ++ S+ T+ S +S + S M+ + + S T
Sbjct: 1086 SSLIAGPESTQITGNRSMLIAGKGSSQTAGYRSTLISGADSVQMAGERGKLIAGADSTQT 1145

Query: 1248 SASTSASTSSSSSVTSNSSKEKV 1270
+ S + ++S + + K+
Sbjct: 1146 AGDRSKLLAGNNSYLTAGDRSKL 1168


16  GBS_RS08130  GBS_RS11750  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
GBS_RS08130  4       19       -2.780529  preprotein translocase subunit SecG
GBS_RS11365  4       18       -2.204768  50S ribosomal protein L33
GBS_RS08135  3       15       -2.736055  multidrug efflux MFS transporter
GBS_RS08140  2       16       -3.739685  hypothetical protein
GBS_RS08145  -2      12       -2.131172  ABC transporter ATP-binding protein
GBS_RS08150  -1      12       -1.980694  dephospho-CoA kinase
GBS_RS08155  0       14       -2.119267  DNA-formamidopyrimidine glycosylase
GBS_RS08160  3       18       -2.381445  quorum-sensing system transcriptional regulator
GBS_RS11630  3       20       -0.862051  peptide pheromone SHP2
GBS_RS08165  4       20       -0.655044  transglutaminase
GBS_RS11750  3       24       -2.469388  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits        Score  E-value  Comments
GBS_RS08130  SECGEXPORT  38     3e-07    Protein-export SecG membrane protein signature.
		>SECGEXPORT#Protein-export SecG membrane protein signature.

Length = 110

Score = 38.0 bits (88), Expect = 3e-07
Identities = 24/79 (30%), Positives = 39/79 (49%), Gaps = 4/79 (5%)

Query: 1 MYNLLLTILLVLSVLLIISIFMQPQKNPSSNV-FDSSGSEALFERSKARGFEAFMQRFTG 59
MY LL + L++++ L+ I +Q K F + S LF + G FM R T
Sbjct: 1 MYEALLVVFLIVAIGLVGLIMLQQGKGADMGASFGAGASATLF---GSSGSGNFMTRMTA 57

Query: 60 VLVFFWLLIGLVLSILSSH 78
+L + +I LVL ++S+
Sbjct: 58 LLATLFFIISLVLGNINSN 76


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
GBS_RS08135  TCRTETA  114    1e-30    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 114 bits (287), Expect = 1e-30
Identities = 75/341 (21%), Positives = 146/341 (42%), Gaps = 9/341 (2%)

Query: 15 LVMPFMVLYVEQLGAPSNKVEWYAGLSVSLSALSSALVAPLWGRLADKYGRKPMMVRAGL 74
L+MP + + L SN V + G+ ++L AL AP+ G L+D++GR+P+++ +
Sbjct: 23 LIMPVLPGLLRDLV-HSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLA 81

Query: 75 MMTFTMGGLAFIHSVTGLLILRILNGIFAGYVPNSTALIASQAPQEESGYALGTLATGVT 134
+A + L I RI+ GI + A IA +E G ++
Sbjct: 82 GAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIADITDGDERARHFGFMSACFG 141

Query: 135 GGMLIGPLLGGLLAEWFGIREVFLLVGTILLISTLMTIFMVKEDFKPISN---EETMPTT 191
GM+ GP+LGGL+ F F + ++ L F++ E K E +
Sbjct: 142 FGMVAGPVLGGLMGG-FSPHAPFFAAAALNGLNFLTGCFLLPESHKGERRPLRREALNPL 200

Query: 192 EVFKSVKSLQILIGLFVTSMIIQISAQSIAPILTLYIRHLGQTENLMFVSGLIVSGMGFS 251
F+ + + ++ L I+Q+ Q A + ++ + G+ ++ G
Sbjct: 201 ASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWDATTI--GISLAAFGIL 258

Query: 252 SILSSPKL-GRIGDRIGNHRLLLLALLYSFLMYVLCSLAQTSLQLGVIRFLYGFGTGALM 310
L+ + G + R+G R L+L ++ Y+L + A I L G G M
Sbjct: 259 HSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLASG-GIGM 317

Query: 311 PSINSILTKIAPRQGLSRIFSYNQMFSNLGQVLGPFVGSAV 351
P++ ++L++ + ++ ++L ++GP + +A+
Sbjct: 318 PALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAI 358



Score = 52.9 bits (127), Expect = 8e-10
Identities = 41/188 (21%), Positives = 75/188 (39%), Gaps = 8/188 (4%)

Query: 197 VKSLQILIGLFVTSMIIQISAQSIAPILTLYIRHLGQTENLMFVSGLIVSGMGFSSILSS 256
+K + LI + T + + I P+L +R L + ++ G++++ +
Sbjct: 1 MKPNRPLIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACA 60

Query: 257 PKLGRIGDRIGNHRLLLLALLYSFLMYVLCSLAQTSLQLGVIRFLYGFGTGALMPSINSI 316
P LG + DR G +LL++L + + Y + + A L + R + G TGA +
Sbjct: 61 PVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGI-TGATGAVAGAY 119

Query: 317 LTKIAPRQGLSRIFSYNQMFSNLGQVLGPFVG---SAVSIHLGFRWVFFVTSFIVLANFV 373
+ I +R F + G V GP +G S H FF + + NF+
Sbjct: 120 IADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAP----FFAAAALNGLNFL 175

Query: 374 WCFINFRK 381
+
Sbjct: 176 TGCFLLPE 183


17  GBS_RS09055  GBS_RS09115  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
GBS_RS09055  2       11       -0.018993  mannose-6-phosphate isomerase, class I
GBS_RS09060  1       10       0.199269   ROK family protein
GBS_RS09065  1       12       0.189151   PTS beta-glucoside transporter subunit IIBCA
GBS_RS09070  -2      12       -1.955797  sucrose-6-phosphate hydrolase
GBS_RS09075  -3      15       -2.907611  LacI family DNA-binding transcriptional
GBS_RS09080  -2      20       -4.000864  transcription antitermination factor NusB
GBS_RS09085  -2      21       -4.191993  Asp23/Gls24 family envelope stress response
GBS_RS09090  -2      23       -4.325478  elongation factor P
GBS_RS09095  0       26       -4.652113  ABC transporter ATP-binding protein
GBS_RS09100  2       26       -4.552422  ABC transporter ATP-binding protein
GBS_RS09105  2       29       -4.467656  ABC transporter ATP-binding protein
GBS_RS09110  4       30       -3.278839  energy-coupling factor transporter transmembrane
GBS_RS09115  3       30       -3.165700  MptD family putative ECF transporter S
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS09100  BCTERIALGSPF  30     0.032    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 29.8 bits (67), Expect = 0.032
Identities = 13/40 (32%), Positives = 21/40 (52%), Gaps = 1/40 (2%)

Query: 132 HMQPDMVNAMVYPVILLVVSFLVSISFGIVMVVSLVLGVF 171
M+ + AM+YP +L VV+ V + +VV V+ F
Sbjct: 164 QMRSRIQQAMIYPCVLTVVAIAVVS-ILLSVVVPKVVEQF 202


18  GBS_RS09420  GBS_RS09450  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
GBS_RS09420  2       29       1.367374   type I glutamate--ammonia ligase
GBS_RS09425  6       40       2.701414   MerR family transcriptional regulator
GBS_RS09430  7       43       3.035670   aromatic acid exporter family protein
GBS_RS09435  9       47       4.149083   phosphoglycerate kinase
GBS_RS09440  8       37       3.326158   5'-nucleotidase, lipoprotein e(P4) family
GBS_RS09445  9       38       3.773544   type I glyceraldehyde-3-phosphate dehydrogenase
GBS_RS09450  5       24       2.652667   elongation factor G
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits       Score  E-value  Comments
GBS_RS09450  TCRTETOQM  624    0.0      Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 624 bits (1610), Expect = 0.0
Identities = 180/671 (26%), Positives = 301/671 (44%), Gaps = 65/671 (9%)

Query: 9 KTRNIGIMAHVDAGKTTTTERILYYTGKIHKIGETHEGASQMDWMEQEQERGITITSAAT 68
K NIG++AHVDAGKTT TE +LY +G I ++G +G ++ D E++RGITI + T
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 69 TAQWDGHRVNIIDTPGHVDFTIEVQRSLRVLDGAVTVLDAQSGVEPQTETVWRQATEYGV 128
+ QW+ +VNIIDTPGH+DF EV RSL VLDGA+ ++ A+ GV+ QT ++ + G+
Sbjct: 62 SFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGI 121

Query: 129 PRIVFANKMDKIGADFLYSVQSLHDRLQANAHPIQLPIGSEDDFRGIIDLIKMKAEIYTN 188
P I F NK+D+ G D Q + ++L A +IK K E+Y N
Sbjct: 122 PTIFFINKIDQNGIDLSTVYQDIKEKLSAEI------------------VIKQKVELYPN 163

Query: 189 DLGTDILEEDIPAEYVDQANEYREKLVEAVADTDEDLMMKYLEGEEITNEELMAAIRRAT 248
T+ E + + V + ++DL+ KY+ G+ + EL
Sbjct: 164 MCVTNFTESE---------------QWDTVIEGNDDLLEKYMSGKSLEALELEQEESIRF 208

Query: 249 INVEFYPVLCGSAFKNKGVQLMLDAVIDYLPSPLDIPAIKGINPDTDEEETRPASDEEPF 308
N +PV GSA N G+ +++ + + S +
Sbjct: 209 HNCSLFPVYHGSAKNNIGIDNLIEVITNKFYSSTH-------------------RGQSEL 249

Query: 309 AALAFKIMTDPFVGRLTFFRVYSGVLNSGSYVLNTSKGKRERIGRILQMHANSRQEIETV 368
FKI RL + R+YSGVL+ V + K K +I + +I+
Sbjct: 250 CGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEKEK-IKITEMYTSINGELCKIDKA 308

Query: 369 YAGDIAAAVG----LKDTTTGDSLTDEKSKVILESIEVPEPVIQLMVEPKSKADQDKMGI 424
Y+G+I L GD+ + E IE P P++Q VEP ++ +
Sbjct: 309 YSGEIVILQNEFLKLNS-VLGDTKLLPQR----ERIENPLPLLQTTVEPSKPQQREMLLD 363

Query: 425 ALQKLAEEDPTFRVETNVETGETVISGMGELHLDVLVDRMKREFKVEANVGAPQVSYRET 484
AL ++++ DP R + T E ++S +G++ ++V ++ ++ VE + P V Y E
Sbjct: 364 ALLEISDSDPLLRYYVDSATHEIILSFLGKVQMEVTCALLQEKYHVEIEIKEPTVIYME- 422

Query: 485 FRASTQARGFFKRQSGGKGQFGDVWIEFTPNEEGKGFEFENAIVGGVVPREFIPAVEKGL 544
R +A + + + + +P G G ++E+++ G + + F AV +G+
Sbjct: 423 -RPLKKAEYTIHIEVPPNPFWASIGLSVSPLPLGSGMQYESSVSLGYLNQSFQNAVMEGI 481

Query: 545 VESMANGVLAGYPMVDVKAKLYDGSYHDVDSSETAFKIAASLALKEAAKSAQPAILEPMM 604
G L G+ + D K G Y+ S+ F++ A + L++ K A +LEP +
Sbjct: 482 RYGCEQG-LYGWNVTDCKICFKYGLYYSPVSTPADFRMLAPIVLEQVLKKAGTELLEPYL 540

Query: 605 LVTITAPEDNLGDVMGHVTARRGRVDGMEARGNTQVVRAFVPLAEMFGYATVLRSATQGR 664
I AP++ L + + + N ++ +P + Y + L T GR
Sbjct: 541 SFKIYAPQEYLSRAYTDAPKYCANIVDTQLKNNEVILSGEIPARCIQEYRSDLTFFTNGR 600

Query: 665 GTFMMVFDHYE 675
+ Y
Sbjct: 601 SVCLTELKGYH 611


19  GBS_RS10160  GBS_RS10360  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
GBS_RS10160  0       18       3.664763   phosphate ABC transporter permease subunit PstC
GBS_RS10170  1       20       3.944434   phosphate-binding protein
GBS_RS10175  1       25       4.999314   AAA family ATPase
GBS_RS10180  2       27       6.261578   16S rRNA (uracil(1498)-N(3))-methyltransferase
GBS_RS10185  2       27       5.391954   50S ribosomal protein L11 methyltransferase
GBS_RS10190  2       26       4.269485   MepB family protein
GBS_RS10195  0       22       4.379181   MerR family transcriptional regulator
GBS_RS10200  1       24       4.299701   GNAT family N-acetyltransferase
GBS_RS10205  1       23       3.236290   NUDIX domain-containing protein
GBS_RS10210  1       15       1.867464   DUF3013 family protein
GBS_RS10215  0       12       0.430238   GNAT family N-acetyltransferase
GBS_RS10220  -1      15       -1.402721  replication-associated recombination protein A
GBS_RS10230  2       20       -4.110625  *hypothetical protein
GBS_RS10235  1       23       -7.208748  hypothetical protein
GBS_RS10240  0       24       -7.373560  hypothetical protein
GBS_RS10245  2       27       -9.111750  hypothetical protein
GBS_RS11765  1       26       -9.180071  hypothetical protein
GBS_RS10255  1       24       -9.574793  helix-turn-helix transcriptional regulator
GBS_RS10260  0       24       -9.227161  hypothetical protein
GBS_RS10265  0       24       -7.361540  hypothetical protein
GBS_RS10270  2       22       -5.480365  WYL domain-containing protein
GBS_RS10275  3       18       -4.523365  hypothetical protein
GBS_RS10280  4       22       -4.813742  hypothetical protein
GBS_RS10285  1       25       -5.045423  WXG100 family type VII secretion target
GBS_RS10290  3       26       -6.403088  DUF1310 family protein
GBS_RS10295  4       24       -7.012149  hypothetical protein
GBS_RS10300  4       26       -9.453742  type II toxin-antitoxin system RelB/DinJ family
GBS_RS10305  4       23       -9.542164  hypothetical protein
GBS_RS10310  6       24       -8.705722  hypothetical protein
GBS_RS10315  6       22       -7.873558  ABC transporter permease
GBS_RS10320  5       20       -7.048170  ATP-binding cassette domain-containing protein
GBS_RS11505  4       19       -6.352238  epipeptide YydF family RiPP
GBS_RS10325  3       20       -6.365555  aminoglycoside 6-adenylyltransferase
GBS_RS10330  4       20       -4.855184  DUF2785 domain-containing protein
GBS_RS10335  3       18       -5.065153  DUF2812 domain-containing protein
GBS_RS10340  1       15       -4.749833  helix-turn-helix transcriptional regulator
GBS_RS10345  0       16       -4.808594  GNAT family N-acetyltransferase
GBS_RS10350  0       15       -4.393656  ABC transporter permease
GBS_RS10355  -1      13       -3.173213  ABC transporter ATP-binding protein
GBS_RS10360  -1      15       -3.316005  PLDc N-terminal domain-containing protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS10170  FbpA_PF05833  28     0.043    Fibronectin-binding protein
		>FbpA_PF05833#Fibronectin-binding protein

Length = 577

Score = 28.3 bits (63), Expect = 0.043
Identities = 25/112 (22%), Positives = 46/112 (41%), Gaps = 6/112 (5%)

Query: 168 PLDNAKAYQAKVSSGKVVIAGSSSVTPVMEKIKEAYHKVNAKVDVEIQQSDSSTGITSAI 227
P N ++Y K + K + + +E + + + V I +D+ I
Sbjct: 379 PSQNVQSYYKKYNKLKKSEEA---ANEQLLQNEEELNYLYS-VLTNINNADNYDEIEEIK 434

Query: 228 DGSADIG-MASRELDKTESSKGVKA-TVIATDGIAVVVNKKNKVNDLSTKQV 277
+ G + +++ K++ SK K I+ DGI + V K N ND T +
Sbjct: 435 KELIETGYIKFKKIYKSKKSKTSKPMHFISKDGIDIYVGKNNIQNDYLTLKF 486


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS10215  SACTRNSFRASE  42     2e-07    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 41.9 bits (98), Expect = 2e-07
Identities = 24/121 (19%), Positives = 50/121 (41%), Gaps = 9/121 (7%)

Query: 18 QNTGWT---ALTSPVYDRKWTESDLEKNLAN--GMSFFVAEVDDKIAGVLDFGPYYPFPA 72
+N WT S Y +++ + D++ + G + F+ +++ G + +
Sbjct: 31 ENGVWTYTEERFSKPYFKQYEDDDMDVSYVEEEGKAAFLYYLENNCIGRIKIRSNW---- 86

Query: 73 GKHVATFGILIAEPYQGQGLGKALLKALLTEAKAQGYIKIAMHVMGNNSRAISLYQKYGF 132
+ I +A+ Y+ +G+G ALL + AK + + + N A Y K+ F
Sbjct: 87 NGYALIEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHF 146

Query: 133 T 133

Sbjct: 147 I 147


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS10220  HTHFIS        31     0.009    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 31.0 bits (70), Expect = 0.009
Identities = 23/83 (27%), Positives = 32/83 (38%), Gaps = 9/83 (10%)

Query: 49 GIGKTSIASAIAGTTKYAFRTFNATVDSKKRLQEIAEEAKFSGGLVLLLDEIHRLDKTKQ 108
I + I S + G K AF A S R ++ + G L LDEI + Q
Sbjct: 198 AIPRDLIESELFGHEKGAFTG--AQTRSTGRFEQ-------AEGGTLFLDEIGDMPMDAQ 248

Query: 109 DFLLPLLENGNIIMIGATTENPF 131
LL +L+ G +G T
Sbjct: 249 TRLLRVLQQGEYTTVGGRTPIRS 271


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS10270  ARGREPRESSOR  30     0.006    Bacterial arginine repressor signature.
		>ARGREPRESSOR#Bacterial arginine repressor signature.

Length = 149

Score = 29.8 bits (67), Expect = 0.006
Identities = 20/54 (37%), Positives = 26/54 (48%), Gaps = 5/54 (9%)

Query: 1 MVKSSRLIDIWLYINNHKKFTTTELATK-----FNVSTRTIQRDIDELSLLGVP 49
M K R I I I ++ T EL +NV+ T+ RDI EL L+ VP
Sbjct: 1 MNKGQRHIKIREIITANEIETQDELVDILKKDGYNVTQATVSRDIKELHLVKVP 54


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS10285  BCTERIALGSPD  26     0.024    Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D signature.

Length = 660

Score = 26.0 bits (57), Expect = 0.024
Identities = 11/40 (27%), Positives = 21/40 (52%)

Query: 57 ELSPKITQFAQLLEDINQQLLKVADVVEQIDSDIASQINQ 96
++ P+I + +L +I Q++ VAD SD+ + N
Sbjct: 489 KVKPQINEGDSVLLEIEQEVSSVADAASSTSSDLGATFNT 528


20  GBS_RS10750  GBS_RS10780  Y  N  Y  Genomic Island
LocusTag     DNBias  CDNBias  %GCBias    Product
GBS_RS10750  2       15       -5.118500  hypothetical protein
GBS_RS10755  2       15       -5.804212  replication protein
GBS_RS10760  2       19       -6.191778  MerR family transcriptional regulator
GBS_RS10765  2       17       -5.996849  site-specific integrase
GBS_RS10770  1       20       -7.329137  TerB N-terminal domain-containing protein
GBS_RS10775  0       18       -5.099334  hypothetical protein
GBS_RS10780  -2      14       -3.076514  hypothetical protein
21  GBS_RS01845  GBS_RS01885  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
GBS_RS01845  -2      10       -0.053850  S-ribosylhomocysteine lyase
GBS_RS01850  -2      12       -0.568322  ribonuclease Y
GBS_RS01855  -2      13       -0.886592  ABC transporter ATP-binding protein
GBS_RS01860  -3      12       -0.892397  ABC transporter permease
GBS_RS01865  -1      11       -0.584953  sensor histidine kinase
GBS_RS01870  0       10       0.694261   response regulator transcription factor
GBS_RS01875  -1      10       0.692517   DUF1211 domain-containing protein
GBS_RS01880  -1      9        0.927296   guanylate kinase
GBS_RS01885  -1      9        0.802266   DNA-directed RNA polymerase subunit omega
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS01845  LUXSPROTEIN   157    6e-52    Bacterial autoinducer-2 (AI-2) production protein Lu...
		>LUXSPROTEIN#Bacterial autoinducer-2 (AI-2) production protein LuxS signature.

Length = 171

Score = 157 bits (398), Expect = 6e-52
Identities = 56/151 (37%), Positives = 83/151 (54%), Gaps = 4/151 (2%)

Query: 7 VESFELDHTIVKAPYVRLISEEVGPVGDIITNFDIRLIQPNENAIDTAGLHTIEHLLAKL 66
++SF +DHT + AP VR+ P GD IT FD+R PN++ + G+HT+EHL A
Sbjct: 3 LDSFTVDHTRMNAPAVRVAKTMQTPKGDTITVFDLRFTAPNKDILSEKGIHTLEHLYAGF 62

Query: 67 IRQRING----LIDCSPFGCRTGFHMIMWGKQDATEIAKVIKSSLEAIAGGVTWEDVPGT 122
+R +NG +ID SP GCRTGF+M + G ++A +++E + +P
Sbjct: 63 MRNHLNGDSVEIIDISPMGCRTGFYMSLIGTPSEQQVADAWIAAMEDVLKVENQNKIPEL 122

Query: 123 TIESCGNYKDHSLHSAQEWAKLILSQGISDN 153
CG HSL A++ AK IL G++ N
Sbjct: 123 NEYQCGTAAMHSLDEAKQIAKNILEVGVAVN 153


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS01850  TYPE4SSCAGX   35     7e-04    Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein signature.

Length = 522

Score = 35.1 bits (80), Expect = 7e-04
Identities = 43/153 (28%), Positives = 72/153 (47%), Gaps = 6/153 (3%)

Query: 30 KEAAELTLLNAEQDAVDLRGKAEIEAEHIRKAAERESKAHQKELLLEAKEEARKYREEIE 89
KEA L+ + L+ K I K E + KA +KE EAKE+A+K +++
Sbjct: 110 KEAVNFALMTRDYQEF-LKTKKLIVDAPDPKELEEQKKALEKEK--EAKEQAQKAQKDKR 166

Query: 90 KEFKSDRQELKQMEARLTDRASSLDRKDENLSNKEKMLDSKEQSLTDKSRHINEREQEIA 149
++ K +R + + LT+ S+ N + E + +E L R + +EQ A
Sbjct: 167 EKRKEERAKNRANLENLTNAMSNPQNLSNNKNLSELIKQQRENELDQMERLEDMQEQAQA 226

Query: 150 TLETKKVEELSR--IAELSQEEAKDIILADTEK 180
K++EEL++ E ++ AKD I T+K
Sbjct: 227 N-ALKQIEELNKKQAEEAVRQRAKDKISIKTDK 258


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS01855  PF05272       30     0.015    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.0 bits (67), Expect = 0.015
Identities = 13/32 (40%), Positives = 17/32 (53%)

Query: 33 CVALIGPNGAGKTTLMSTLLGDISISSGSLTI 64
V L G G GK+TL++TL+G S I
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGLDFFSDTHFDI 629


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS01870  HTHFIS        66     4e-15    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 66.0 bits (161), Expect = 4e-15
Identities = 24/119 (20%), Positives = 50/119 (42%), Gaps = 5/119 (4%)

Query: 2 KLLVAEDQSMLRDAMCQLL-LMEESVSTIDQAGNGGEAIAILSNKAIDVAILDVEMPILS 60
+LVA+D + +R + Q L V N ++ D+ + DV MP +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRI---TSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 61 GLDVLEWVRK-YQNVKVIIVTTFKRSGYFQRAIRSNVDAYVLKDRSVADLMKTIQKVLS 118
D+L ++K ++ V++++ +A Y+ K + +L+ I + L+
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS01885  IGASERPTASE   31     8e-04    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 30.8 bits (69), Expect = 8e-04
Identities = 18/76 (23%), Positives = 24/76 (31%), Gaps = 3/76 (3%)

Query: 24 QAKRAHELEAGEKATQDFKSVKSTLRALEEIESGNVVIHPDPSAKRASVRARIEAERLAK 83
Q + D SV S EEI + P P+ S AE +
Sbjct: 990 QTVDTTNITTPNNIQADVPSVPSNN---EEIARVDEAPVPPPAPATPSETTETVAENSKQ 1046

Query: 84 EEEERKIKEQIAKEKE 99
E + + EQ A E
Sbjct: 1047 ESKTVEKNEQDATETT 1062


22  GBS_RS02320  GBS_RS02370  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
GBS_RS02320  1       10       3.252943   DNA topoisomerase III
GBS_RS02325  5       11       2.489887   ATP-dependent Clp protease ATP-binding subunit
GBS_RS02330  8       11       2.283553   hypothetical protein
GBS_RS02335  7       11       2.039078   hypothetical protein
GBS_RS02340  4       10       2.530383   C39 family peptidase
GBS_RS02345  1       12       2.449980   LPXTG cell wall anchor domain-containing
GBS_RS02350  -1      13       3.007008   LPXTG cell wall anchor domain-containing
GBS_RS02355  -1      21       4.045099   single-stranded DNA-binding protein
GBS_RS02360  -1      22       3.933603   hypothetical protein
GBS_RS02365  -1      22       3.862787   type IV secretory system conjugative DNA
GBS_RS02370  0       21       3.951031   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS02320  PF06580       37     2e-04    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 36.8 bits (85), Expect = 2e-04
Identities = 21/89 (23%), Positives = 44/89 (49%), Gaps = 12/89 (13%)

Query: 254 AQARQLKAISIVSSVEMEEKQ-RAAPH-LFN-LSDIQGLAAKQWGFEPTKTERLIESL-- 308
A+ Q K S+ ++ + + PH +FN L++I+ L + +PTK ++ SL
Sbjct: 147 AEIDQWKMASMAQEAQLMALKAQINPHFMFNALNNIRALILE----DPTKAREMLTSLSE 202

Query: 309 YLKKYLSYPRTDTRFIT-EEEFDYLKNYL 336
++ L ++ R ++ +E + +YL
Sbjct: 203 LMRYSLR--YSNARQVSLADELTVVDSYL 229


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS02325  HTHFIS        39     7e-05    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 38.7 bits (90), Expect = 7e-05
Identities = 34/171 (19%), Positives = 63/171 (36%), Gaps = 19/171 (11%)

Query: 279 LKGDQERLEGFKERLMNRVKGQEDAIEAVVDAVTIAQAGLQNEKRPLASFLFLGPTGVGK 338
K +LE + M V G+ A++ + +A+ + + + G +G GK
Sbjct: 122 PKRRPSKLEDDSQDGMPLV-GRSAAMQEIYR--VLARLMQTD-----LTLMITGESGTGK 173

Query: 339 TELAKAIAEALFDDEAAMIRFDMSEYKQKEDVTKLIGNRATRIKGQLTEGVKQKPYCV-- 396
+A+A+ + + +M+ + ++L G+ KG T +
Sbjct: 174 ELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFGHE----KGAFTGAQTRSTGRFEQ 229

Query: 397 -----LLLDEIEKAHSEVMDLFLQVLDDGRLTDSSGRLISFKNTIVIMTTN 442
L LDEI + L+VL G T GR + ++ TN
Sbjct: 230 AEGGTLFLDEIGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATN 280


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS02340  CHANLCOLICIN  34     0.002    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 34.3 bits (78), Expect = 0.002
Identities = 40/218 (18%), Positives = 78/218 (35%), Gaps = 16/218 (7%)

Query: 73 VVDEAQKQKDQSQQNLVKATSTVTEAEKVAAEATPEVVKEAIKAVTEAKEAVTDAEANVV 132
EA++++ + ++ + + AE E + E KAV A++ ++ A++ VV
Sbjct: 149 AFQEAEQRRKEIEREKAETERQLKLAEAE--EKRLAALSEEAKAVEIAQKKLSAAQSEVV 206

Query: 133 DAQKTEQKANQEVQSQAKTVDENVKVVADKESEVKQAEGVVTTAQEAIDSKTANTNAS-- 190
+ N + S D +K +A K +E+ QA E + + N
Sbjct: 207 KMDGEIKTLNSRLSSSIHARDAEMKTLAGKRNELAQASAKYKELDELVKKLSPRANDPLQ 266

Query: 191 ------------EAEKAVTEKQTKLETAETNLTEAQKQDAKIAEEKRLAEQEVVNKQLAV 238
A K EKQ ++ +ET + +I + V
Sbjct: 267 NRPFFEATRRRVGAGKIREEKQKQVTASETRINRINADITQIQKAISQVSNNRNAGIARV 326

Query: 239 TDTQTLLKKLVTEINNEKVSTSLENQAYFNQRDGSWAG 276
+ + LKK + N ++ +++ F Q G
Sbjct: 327 HEAEENLKKAQNNLLNSQIKDAVDATVSFYQTLTEKYG 364


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS02345  TONBPROTEIN   31     0.002    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 31.5 bits (71), Expect = 0.002
Identities = 21/78 (26%), Positives = 27/78 (34%)

Query: 41 TPPNSVDPSLPTDTTDPSTPVVPEKLPDTSKPAEPETPKEELPAPVDPTTPSAGKEDKQE 100
P SV P D P P + +P P+ APV P + K +
Sbjct: 42 AQPISVTMVTPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPK 101

Query: 101 TVPGQAETPKEEVKPENP 118
V E PK +VKP
Sbjct: 102 PVKKVQEQPKRDVKPVES 119


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS02350  GPOSANCHOR    45     9e-07    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 45.4 bits (107), Expect = 9e-07
Identities = 36/195 (18%), Positives = 77/195 (39%), Gaps = 14/195 (7%)

Query: 91 AQADLANQTQAVKDVTAKAQANTQAIKDATAENAKIDAENKAEAERVAKENKEGQAAVDA 150
A A+ +A++ + A++ IK AE A ++A + + A
Sbjct: 223 LAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAK 282

Query: 151 RNKAGQAAVDARNKAKQQAQDDQKAKIDAENKAESQRVSQLNAQNKAKIDAENKDAQAK- 209
A +A++ + Q ++A ++ + L+A +AK E + + +
Sbjct: 283 IKTLEAE--KAALEAEKADLEHQSQVLNANRQSLRR---DLDASREAKKQLEAEHQKLEE 337

Query: 210 ----ANATNAQLQKDYQAKLAEIKSVEAYNAGVRQRNKDAQAKADATNAQLQKDYQAKLA 265
+ A+ L++D A K +EA + + + + ++A+ L++D A
Sbjct: 338 QNKISEASRQSLRRDLDASREAKKQLEAEHQKL---EEQNKI-SEASRQSLRRDLDASRE 393

Query: 266 LYNQALKAKAEADKQ 280
Q KA EA+ +
Sbjct: 394 AKKQVEKALEEANSK 408



Score = 43.1 bits (101), Expect = 5e-06
Identities = 43/232 (18%), Positives = 83/232 (35%), Gaps = 4/232 (1%)

Query: 51 TQNQAETVTSTQLDKAVATAKKAAVAVTTTPAVNHATTTDAQADLANQTQAVKDVTAKAQ 110
+ A L+KA+ A + A + A +A A +A++ +
Sbjct: 148 AEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFST 207

Query: 111 ANTQAIKDATAENAKIDAEN-KAEAERVAKENKEGQAAVDARNKAGQAAVDARNKAKQQA 169
A++ IK AE A + A E N + + + A +A+ +
Sbjct: 208 ADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEK 267

Query: 170 QDDQKAKIDAENKAESQRVSQLNAQNKAKIDAENKDAQAKANATNAQLQKDYQAKLAEIK 229
+ + A+ + + A +A+ +Q NA L++D A K
Sbjct: 268 ALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVL-NANRQSLRRDLDASREAKK 326

Query: 230 SVEAYNAGVRQRNKDAQAKADATNAQLQKDYQAKLALYNQALKAKAEADKQS 281
+EA + + ++NK ++A + L +AK L +A K E +
Sbjct: 327 QLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQL--EAEHQKLEEQNKI 376



Score = 41.2 bits (96), Expect = 2e-05
Identities = 43/296 (14%), Positives = 78/296 (26%), Gaps = 23/296 (7%)

Query: 4 NTKGHGFFRKSKAYGLVCAIALA--GAFTLATSQVSADQVTTQATTQ-----------TV 50
NT H RK K A+AL GA + + + T T +
Sbjct: 5 NTNRHYSLRKLKTGTASVAVALTVLGAGLVVNTNEVSAVATRSQTDTLEKVQERADKFEI 64

Query: 51 TQNQAETVTSTQLDKAVATAKKAAVAVTTTPAVNHATTT--DAQADLANQTQAVKDVTAK 108
N + S A + ++ A++ Q ++ A
Sbjct: 65 ENNTLKLKNSDLSFNNKALKDHNDELTEELSNAKEKLRKNDKSLSEKASKIQELEARKAD 124

Query: 109 AQANTQAI----KDATAENAKIDAENKAEAERVAKENKEGQAAVDARNKAGQAAVDARNK 164
+ + +A+ ++AE A A R A K + A++ +
Sbjct: 125 LEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAE 184

Query: 165 AKQQAQDDQKAKIDAENKAESQRVSQLNAQNKAKIDAENKDAQAKANATNAQLQKDYQAK 224
+ + E + A +A A
Sbjct: 185 KAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTAD 244

Query: 225 LAEIKSVEAYNAGVRQRNKDAQAKADATNAQLQKDYQAKLALYNQALKAKAEADKQ 280
A+IK++EA A + R + + A A KA + +
Sbjct: 245 SAKIKTLEAEKAALEARQAELEKA----LEGAMNFSTADSAKIKTLEAEKAALEAE 296


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS02370  adhesinb      29     0.008    Adhesin B signature.
		>adhesinb#Adhesin B signature.

Length = 310

Score = 29.0 bits (65), Expect = 0.008
Identities = 11/31 (35%), Positives = 15/31 (48%)

Query: 3 MKKWFGLLFLGLALITLAACGQKTPEDIIKT 33
MKK L+ L LA + LAAC + +
Sbjct: 1 MKKCRFLVLLLLAFVGLAACSSQKSSTETGS 31


23  GBS_RS02615  GBS_RS02705  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
GBS_RS02615  -2      13       -2.164063  membrane protein insertase YidC
GBS_RS02620  0       15       -2.458455  protein jag
GBS_RS02680  -2      11       -1.770829  ********DUF402 domain-containing protein
GBS_RS02685  -2      11       -1.904611  recombination regulator RecX
GBS_RS02690  -2      11       -0.961703  23S rRNA (uracil(1939)-C(5))-methyltransferase
GBS_RS02695  -2      12       -1.034810  nucleoside phosphorylase
GBS_RS02700  -1      10       -0.431213  GNAT family N-acetyltransferase
GBS_RS02705  -1      10       -0.318047  S8 family serine peptidase
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS02615  60KDINNERMP   153    5e-45    60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 153 bits (387), Expect = 5e-45
Identities = 67/207 (32%), Positives = 104/207 (50%), Gaps = 9/207 (4%)

Query: 37 IVYAFAKSIQWL-SFNHSIGLGIILFTLIIRAIMMPLYNMQMKSSQKMQEIQPRLKELQK 95
I K ++W+ SF + G II+ T I+R IM PL Q S KM+ +QP+++ +++
Sbjct: 336 ISQPLFKLLKWIHSFVGNWGFSIIIITFIVRGIMYPLTKAQYTSMAKMRMLQPKIQAMRE 395

Query: 96 KYPGKDPDSRLKLNDEMQSMYKAEGVNPYASVLPLLIQLPVLWALFQALTRVSFLKVGTF 155
+ + +++ EM ++YKAE VNP PLLIQ+P+ AL+ L L+ F
Sbjct: 396 RLGDD----KQRISQEMMALYKAEKVNPLGGCFPLLIQMPIFLALYYMLMGSVELRQAPF 451

Query: 156 LS--LELSQPDPYYILPVLAALFTFLSTWLTNKAAVEKNIALTLMTYVMPFIILVTSFNF 213
+LS DPYYILP+L + F ++ + + MP I V F
Sbjct: 452 ALWIHDLSAQDPYYILPILMGVTMFFIQKMSPTTVTDPMQQ--KIMTFMPVIFTVFFLWF 509

Query: 214 ASGVVLYWTVSNAFQVFQILLLNNPYK 240
SG+VLY+ VSN + Q L+ +
Sbjct: 510 PSGLVLYYIVSNLVTIIQQQLIYRGLE 536


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS02620  ANTHRAXTOXNA  28     0.042    Anthrax toxin LF subunit signature.
		>ANTHRAXTOXNA#Anthrax toxin LF subunit signature.

Length = 800

Score = 28.2 bits (62), Expect = 0.042
Identities = 31/170 (18%), Positives = 67/170 (39%), Gaps = 25/170 (14%)

Query: 77 VDVVEEYIEEVDETL---EKEDVSQPELPKIDDKNVVT---TSEAIEKI----DLLPNIE 126
V+ + E+ E D +++ ++ E K N+V T+E ++KI DLL I
Sbjct: 31 VNAMNEHYTESDIKRNHKTEKNKTEKEKFKDSINNLVKTEFTNETLDKIQQTQDLLKKIP 90

Query: 127 VAAAQVTKYVENIIYEMDLDATIETTTSKRQISLQIETPEAGRIIGYHGKVLKSLQLLAQ 186
++ + IY D+D LQ + E + G+ + A
Sbjct: 91 KDVLEIYSELGGEIYFTDIDLV-------EHKELQDLSEEEKNSMNSRGEKV----PFAS 139

Query: 187 NYLHDRFSKSFSVSINVHDYVEHRTETL---IDFSKKIARRVL-ETNEPY 232
++ ++ ++ + IN+ DY + ++ + K I+ ++ +
Sbjct: 140 RFVFEKKRETPKLIINIKDYAINSEQSKEVYYEIGKGISLDIISKDKSLD 189


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS02700  SACTRNSFRASE  28     0.015    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 27.6 bits (61), Expect = 0.015
Identities = 19/67 (28%), Positives = 30/67 (44%), Gaps = 5/67 (7%)

Query: 73 VIVNPDFRKKGLGNQLVEYAIKFSEANYPNKPIYAQAQ---AYLQDFYQSFGFQPVS-DI 128
+ V D+RKKG+G L+ AI++++ N + + Q FY F + D
Sbjct: 95 IAVAKDYRKKGVGTALLHKAIEWAKEN-HFCGLMLETQDINISACHFYAKHHFIIGAVDT 153

Query: 129 YLEDNIP 135
L N P
Sbjct: 154 MLYSNFP 160


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS02705  SUBTILISIN    98     4e-24    Subtilisin serine protease family (S8) signature.
		>SUBTILISIN#Subtilisin serine protease family (S8) signature.

Length = 326

Score = 98.0 bits (244), Expect = 4e-24
Identities = 53/228 (23%), Positives = 83/228 (36%), Gaps = 64/228 (28%)

Query: 170 RGKGNVVAIIDTGFDINHDIFRLDSPKDDKHSFKTKAEFEELKAKHNITYGKWVNDKIVF 229
RG+G VA++DTG D +H +I+
Sbjct: 39 RGRGVKVAVLDTGCDADHPDL---------------------------------KARIIG 65

Query: 230 AHNYANNTETVADIAAAMKDGYGSEAKNISHGTHVAGIFVGNSKRPAINGLLLEGAAPNA 289
N+ ++ E +I KD G HGTHVAG N + G AP A
Sbjct: 66 GRNFTDDDEGDPEIF---KDYNG-------HGTHVAGTIAATE-----NENGVVGVAPEA 110

Query: 290 QVLLMRIPDKIDSDKFGEAYAKAITDAVNLGAKTINMSLGKTADSLIALNDKVKLALKLA 349
+L++++ +K S + + I A+ I+MSLG D ++ A+K A
Sbjct: 111 DLLIIKVLNKQGSG-QYDWIIQGIYYAIEQKVDIISMSLGGPEDV-----PELHEAVKKA 164

Query: 350 SEKGVAVVVAAGNEGAFGMDYSKPLSTNPDYGTVNSPAISEDTLSVAS 397
+ V+ AAGNEG + P + +SV +
Sbjct: 165 VASQILVMCAAGNEGDGDDRTD----------ELGYPGCYNEVISVGA 202



Score = 62.2 bits (151), Expect = 4e-12
Identities = 26/99 (26%), Positives = 42/99 (42%), Gaps = 14/99 (14%)

Query: 553 PDVTASGFEIYSSTYNNQYQTMSGTSMASPHVAGLMTMLQSHLAEKYKGMNLDSKKLLEL 612
D+ A G +I S+ +Y T SGTSMA+PHVAG + +++ + L E
Sbjct: 219 VDLVAPGEDILSTVPGGKYATFSGTSMATPHVAGALALIKQ------LANASFERDLTEP 272

Query: 613 S-KNILMSSATALYSEEDKAFYSPRQQGAGVVDAEKAIQ 650
L+ L SP+ +G G++ +
Sbjct: 273 ELYAQLIKRTIPL-------GNSPKMEGNGLLYLTAVEE 304


24  GBS_RS03015  GBS_RS03050  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
GBS_RS03015  -3      14       0.214381   AMP-binding protein
GBS_RS03020  -4      13       0.249020   endonuclease III
GBS_RS03025  3       19       1.597053   prepilin peptidase
GBS_RS03030  3       21       0.948416   YqgQ family protein
GBS_RS03035  1       18       1.620460   ROK family glucokinase
GBS_RS03045  -1      12       -0.434349  rhodanese-like domain-containing protein
GBS_RS03050  -1      11       -0.062562  translational GTPase TypA
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS03020  PF05860       30     0.006    haemagglutination activity domain.
		>PF05860#haemagglutination activity domain.

Length = 117

Score = 29.8 bits (67), Expect = 0.006
Identities = 21/82 (25%), Positives = 30/82 (36%), Gaps = 24/82 (29%)

Query: 115 FASWSDF---------F----SIQNAYFSVTSNSRLFIQGDFSFTGNLNLVL-------- 153
F S+ +F F +IQN VT S I G NL L
Sbjct: 35 FHSFQEFSVPTSGTAFFNNPTNIQNIISRVTGGSVSNIDGLIRANATANLFLINPNGIIF 94

Query: 154 ---SLLLLGGTLVVTQKNSVKY 172
+ L +GG+ V + N +K+
Sbjct: 95 GQNARLDIGGSFVGSTANRLKF 116


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS03030  PREPILNPTASE  30     0.002    Type IV prepilin cysteine protease (C20) family sig...
		>PREPILNPTASE#Type IV prepilin cysteine protease (C20) family signature.

Length = 290

Score = 29.8 bits (67), Expect = 0.002
Identities = 17/67 (25%), Positives = 30/67 (44%), Gaps = 5/67 (7%)

Query: 66 IGSGDFLYLATIGLSLPLHQMFFIIQIGAWLGIIYCLVMRK-----MKKTIAFLPFLSIA 120
+G GDF LA +G L + ++ + + +G + + K I F P+L+IA
Sbjct: 211 MGYGDFKLLAALGAWLGWQALPIVLLLSSLVGAFMGIGLILLRNHHQSKPIPFGPYLAIA 270

Query: 121 YIIVTSY 127
I +
Sbjct: 271 GWIALLW 277


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS03040  PF03309       29     0.033    Bvg accessory factor
		>PF03309#Bvg accessory factor

Length = 271

Score = 28.6 bits (64), Expect = 0.033
Identities = 36/165 (21%), Positives = 53/165 (32%), Gaps = 29/165 (17%)

Query: 5 LLGIDLGGTTIKFGILTLEGEVQE---KWAIETNTLENGRHIVSDIVESLKHRLSLYGLT 61
LL ID+ T G+++ G+ + +W I T + +D + L G
Sbjct: 2 LLAIDVRNTHTVVGLISGSGDHAKVVQQWRIRTE-----PEVTADELALTID--GLIGDD 54

Query: 62 KDDFLGIGMGSPGAVDRTSKTVTGAFNLNWADTQEVGSVIEKEVGIPFFIDNDANVAALG 121
+ G S V V W + V GIP +DN V A
Sbjct: 55 AERLTGASGLS--TVPSVLHEVRVMLEQYWPNVPHVLIEPGVRTGIPLLVDNPKEVGA-- 110

Query: 122 ERWVGAGAN----NPDVVFVTLGTGVG-----------GGVIADG 151
+R V A + V G+ + GG IA G
Sbjct: 111 DRIVNCLAAYHKYGTAAIVVDFGSSICVDVVSAKGEFLGGAIAPG 155


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS03050  TCRTETOQM     185    8e-53    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 185 bits (470), Expect = 8e-53
Identities = 101/465 (21%), Positives = 196/465 (42%), Gaps = 73/465 (15%)

Query: 8 IRNVAIIAHVDHGKTTLVDELLKQSHTLDERKELEE--RAMDSNDIEKERGITILAKNTA 65
I N+ ++AHVD GKTTL + LL S + E +++ D+ +E++RGITI T+
Sbjct: 3 IINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITS 62

Query: 66 VAYNDVRINIMDTPGHADFGGEVERIMKMVDGVVLVVDAYEGTMPQTRFVLKKALEQNLI 125
+ + ++NI+DTPGH DF EV R + ++DG +L++ A +G QTR + + +
Sbjct: 63 FQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGIP 122

Query: 126 PIVVVNKIDKPSARPSEVVDEVLELF---------IELGADDDQLDFP--VVYASAING- 173
I +NKID+ S V ++ E +EL + +F + + I G
Sbjct: 123 TIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEGN 182

Query: 174 ----TSSMSDDPLDQEK------------TMAPIF--------------DTIIDHIPAPV 203
MS L+ + ++ P++ + I + +
Sbjct: 183 DDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVITNKFYSST 242

Query: 204 DNSEEPLQFQVSLLDYNDFVGRIGIGRVFRGTVKVGDQVTLSKLDGTTKNFRVTKLFGFF 263
+ L +V ++Y++ R+ R++ G + + D V +S + ++T+++
Sbjct: 243 HRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRIS----EKEKIKITEMYTSI 298

Query: 264 GLERKEIQEAKAGDLIAVSGMEDIFVGETVTPTDDIEPLPVLRIDEPTLQMTFLVNNSPF 323
E +I +A +G+++ + E + + + T + + P LQ T
Sbjct: 299 NGELCKIDKAYSGEIVILQN-EFLKLNSVLGDTKLLPQRERIENPLPLLQTT-------- 349

Query: 324 AGREGKWITSRKVEER--LLAELQT----DVSLRVDPTDSPDKWTVSGRGELHLSILIET 377
+ K ++R LL L D LR + + +S G++ + +
Sbjct: 350 -------VEPSKPQQREMLLDALLEISDSDPLLRYYVDSATHEIILSFLGKVQMEVTCAL 402

Query: 378 MRRE-GYELQVSRPEVIIKEIDGVQCEPFERVQIDTPEEYQGAII 421
++ + E+++ P VI E + E + I+ P A I
Sbjct: 403 LQEKYHVEIEIKEPTVIYMERPLKKAE--YTIHIEVPPNPFWASI 445



Score = 42.5 bits (100), Expect = 4e-06
Identities = 18/79 (22%), Positives = 31/79 (39%), Gaps = 1/79 (1%)

Query: 403 EPFERVQIDTPEEYQGAIIQSLSERKGDMLDMQMVGNGQTRLIFLIPARGLIGYSTEFLS 462
EP+ +I P+EY + +++D Q + N + L IPAR + Y ++
Sbjct: 537 EPYLSFKIYAPQEYLSRAYTDAPKYCANIVDTQ-LKNNEVILSGEIPARCIQEYRSDLTF 595

Query: 463 MTRGYGIMNHTFDQYLPVV 481
T G + Y
Sbjct: 596 FTNGRSVCLTELKGYHVTT 614


25  GBS_RS03175  GBS_RS03205  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
GBS_RS03175  -2      11       -1.762958  TlyA family rRNA
GBS_RS03180  -1      13       -1.776362  ArgR family transcriptional regulator
GBS_RS03185  -1      12       -1.874962  DNA repair protein RecN
GBS_RS03190  0       13       -1.488428  DegV family protein
GBS_RS03195  2       12       -1.854997  SGNH/GDSL hydrolase family protein
GBS_RS03200  2       14       -1.233139  YpmS family protein
GBS_RS03205  1       15       -0.437881  HU family DNA-binding protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS03175  BLACTAMASEA   28     0.033    Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 28.2 bits (63), Expect = 0.033
Identities = 6/27 (22%), Positives = 13/27 (48%), Gaps = 1/27 (3%)

Query: 102 QSGARL-VYAVDVGTNQLVWKLRQDHR 127
Q R+ + +D+ + + + R D R
Sbjct: 35 QLSGRVGMIEMDLASGRTLTAWRADER 61


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS03180  ARGREPRESSOR  86     4e-24    Bacterial arginine repressor signature.
		>ARGREPRESSOR#Bacterial arginine repressor signature.

Length = 149

Score = 85.7 bits (212), Expect = 4e-24
Identities = 41/153 (26%), Positives = 80/153 (52%), Gaps = 4/153 (2%)

Query: 1 MKKSERLNLIKQIVLNHAVETQHELLRRLEAYGVTLTQATISRDMNEIGIIKVPSAKGRY 60
M K +R I++I+ + +ETQ EL+ L+ G +TQAT+SRD+ E+ ++KVP+ G Y
Sbjct: 1 MNKGQRHIKIREIITANEIETQDELVDILKKDGYNVTQATVSRDIKELHLVKVPTNNGSY 60

Query: 61 IYGLSNENDPIFTTAVAKPIKTSILSISDKLLGLEQFININVIPGNSQLIKTFIMSHCQE 120
Y L + + + + + + + I I + +PGN+Q I + + E
Sbjct: 61 KYSLPADQRFNPLSKLKRSLMDAFVKIDSA----SHLIVLKTMPGNAQAIGALMDNLDWE 116

Query: 121 HIFSLTADDNSLLLIAKSEADADHIRQSMIAML 153
I D+++L+I ++ D +++ ++ +L
Sbjct: 117 EIMGTICGDDTILIICRTHDDTKVVQKKILELL 149


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS03190  BONTOXILYSIN  30     0.011    Bontoxilysin signature.
		>BONTOXILYSIN#Bontoxilysin signature.

Length = 1196

Score = 30.3 bits (68), Expect = 0.011
Identities = 16/65 (24%), Positives = 29/65 (44%), Gaps = 15/65 (23%)

Query: 207 KTFSKWL----DNFV-ESAQTR---KIAEIG--ISYCGKA----DMANNFKEKLAVLGAP 252
K + WL N+ + T+ + I + + GKA + +N+F E+ GA
Sbjct: 556 KKYYLWLKEVFKNYSFDINLTQEIDSMCGINEVVLWFGKALNILNTSNSFVEEYQDSGA- 614

Query: 253 ISVLE 257
IS++
Sbjct: 615 ISLIS 619


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS03205  DNABINDINGHU  125    2e-41    Prokaryotic integration host factor signature.
		>DNABINDINGHU#Prokaryotic integration host factor signature.

Length = 91

Score = 125 bits (315), Expect = 2e-41
Identities = 84/91 (92%), Positives = 88/91 (96%)

Query: 1 MANKQDLIAKVAEATELTKKDSAAAVDAVFAAVADYLAEGEKVQLIGFGNFEVRERAARK 60
MANKQDLIAKVAEATELTKKDSAAAVDAVF+AV+ YLA+GEKVQLIGFGNFEVRERAARK
Sbjct: 1 MANKQDLIAKVAEATELTKKDSAAAVDAVFSAVSSYLAKGEKVQLIGFGNFEVRERAARK 60

Query: 61 GRNPQTGAEIEIAASKVPAFKAGKALKDAVK 91
GRNPQTG EI+I ASKVPAFKAGKALKDAVK
Sbjct: 61 GRNPQTGEEIKIKASKVPAFKAGKALKDAVK 91


26  GBS_RS03985  GBS_RS04035  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
GBS_RS03985  1       10       3.252943   DNA topoisomerase III
GBS_RS03990  5       11       2.489887   ATP-dependent Clp protease ATP-binding subunit
GBS_RS03995  8       11       2.283553   hypothetical protein
GBS_RS04000  7       11       2.039078   hypothetical protein
GBS_RS04005  4       10       2.530383   C39 family peptidase
GBS_RS04010  1       12       2.449980   LPXTG cell wall anchor domain-containing
GBS_RS04015  -1      13       3.007008   LPXTG cell wall anchor domain-containing
GBS_RS04020  -1      21       4.045099   single-stranded DNA-binding protein
GBS_RS04025  -1      22       3.933603   hypothetical protein
GBS_RS04030  -1      22       3.862787   type IV secretory system conjugative DNA
GBS_RS04035  0       21       3.951031   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS03985  PF06580       37     2e-04    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 36.8 bits (85), Expect = 2e-04
Identities = 21/89 (23%), Positives = 44/89 (49%), Gaps = 12/89 (13%)

Query: 254 AQARQLKAISIVSSVEMEEKQ-RAAPH-LFN-LSDIQGLAAKQWGFEPTKTERLIESL-- 308
A+ Q K S+ ++ + + PH +FN L++I+ L + +PTK ++ SL
Sbjct: 147 AEIDQWKMASMAQEAQLMALKAQINPHFMFNALNNIRALILE----DPTKAREMLTSLSE 202

Query: 309 YLKKYLSYPRTDTRFIT-EEEFDYLKNYL 336
++ L ++ R ++ +E + +YL
Sbjct: 203 LMRYSLR--YSNARQVSLADELTVVDSYL 229


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS03990  HTHFIS        39     7e-05    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 38.7 bits (90), Expect = 7e-05
Identities = 34/171 (19%), Positives = 63/171 (36%), Gaps = 19/171 (11%)

Query: 279 LKGDQERLEGFKERLMNRVKGQEDAIEAVVDAVTIAQAGLQNEKRPLASFLFLGPTGVGK 338
K +LE + M V G+ A++ + +A+ + + + G +G GK
Sbjct: 122 PKRRPSKLEDDSQDGMPLV-GRSAAMQEIYR--VLARLMQTD-----LTLMITGESGTGK 173

Query: 339 TELAKAIAEALFDDEAAMIRFDMSEYKQKEDVTKLIGNRATRIKGQLTEGVKQKPYCV-- 396
+A+A+ + + +M+ + ++L G+ KG T +
Sbjct: 174 ELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFGHE----KGAFTGAQTRSTGRFEQ 229

Query: 397 -----LLLDEIEKAHSEVMDLFLQVLDDGRLTDSSGRLISFKNTIVIMTTN 442
L LDEI + L+VL G T GR + ++ TN
Sbjct: 230 AEGGTLFLDEIGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATN 280


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS04005  CHANLCOLICIN  34     0.002    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 34.3 bits (78), Expect = 0.002
Identities = 40/218 (18%), Positives = 78/218 (35%), Gaps = 16/218 (7%)

Query: 73 VVDEAQKQKDQSQQNLVKATSTVTEAEKVAAEATPEVVKEAIKAVTEAKEAVTDAEANVV 132
EA++++ + ++ + + AE E + E KAV A++ ++ A++ VV
Sbjct: 149 AFQEAEQRRKEIEREKAETERQLKLAEAE--EKRLAALSEEAKAVEIAQKKLSAAQSEVV 206

Query: 133 DAQKTEQKANQEVQSQAKTVDENVKVVADKESEVKQAEGVVTTAQEAIDSKTANTNAS-- 190
+ N + S D +K +A K +E+ QA E + + N
Sbjct: 207 KMDGEIKTLNSRLSSSIHARDAEMKTLAGKRNELAQASAKYKELDELVKKLSPRANDPLQ 266

Query: 191 ------------EAEKAVTEKQTKLETAETNLTEAQKQDAKIAEEKRLAEQEVVNKQLAV 238
A K EKQ ++ +ET + +I + V
Sbjct: 267 NRPFFEATRRRVGAGKIREEKQKQVTASETRINRINADITQIQKAISQVSNNRNAGIARV 326

Query: 239 TDTQTLLKKLVTEINNEKVSTSLENQAYFNQRDGSWAG 276
+ + LKK + N ++ +++ F Q G
Sbjct: 327 HEAEENLKKAQNNLLNSQIKDAVDATVSFYQTLTEKYG 364


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits         Score  E-value  Comments
GBS_RS04010  TONBPROTEIN  31     0.002    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 31.5 bits (71), Expect = 0.002
Identities = 21/78 (26%), Positives = 27/78 (34%)

Query: 41 TPPNSVDPSLPTDTTDPSTPVVPEKLPDTSKPAEPETPKEELPAPVDPTTPSAGKEDKQE 100
P SV P D P P + +P P+ APV P + K +
Sbjct: 42 AQPISVTMVTPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPK 101

Query: 101 TVPGQAETPKEEVKPENP 118
V E PK +VKP
Sbjct: 102 PVKKVQEQPKRDVKPVES 119


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits        Score  E-value  Comments
GBS_RS04015  GPOSANCHOR  45     9e-07    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 45.4 bits (107), Expect = 9e-07
Identities = 36/195 (18%), Positives = 77/195 (39%), Gaps = 14/195 (7%)

Query: 91 AQADLANQTQAVKDVTAKAQANTQAIKDATAENAKIDAENKAEAERVAKENKEGQAAVDA 150
A A+ +A++ + A++ IK AE A ++A + + A
Sbjct: 223 LAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAK 282

Query: 151 RNKAGQAAVDARNKAKQQAQDDQKAKIDAENKAESQRVSQLNAQNKAKIDAENKDAQAK- 209
A +A++ + Q ++A ++ + L+A +AK E + + +
Sbjct: 283 IKTLEAE--KAALEAEKADLEHQSQVLNANRQSLRR---DLDASREAKKQLEAEHQKLEE 337

Query: 210 ----ANATNAQLQKDYQAKLAEIKSVEAYNAGVRQRNKDAQAKADATNAQLQKDYQAKLA 265
+ A+ L++D A K +EA + + + + ++A+ L++D A
Sbjct: 338 QNKISEASRQSLRRDLDASREAKKQLEAEHQKL---EEQNKI-SEASRQSLRRDLDASRE 393

Query: 266 LYNQALKAKAEADKQ 280
Q KA EA+ +
Sbjct: 394 AKKQVEKALEEANSK 408



Score = 43.1 bits (101), Expect = 5e-06
Identities = 43/232 (18%), Positives = 83/232 (35%), Gaps = 4/232 (1%)

Query: 51 TQNQAETVTSTQLDKAVATAKKAAVAVTTTPAVNHATTTDAQADLANQTQAVKDVTAKAQ 110
+ A L+KA+ A + A + A +A A +A++ +
Sbjct: 148 AEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFST 207

Query: 111 ANTQAIKDATAENAKIDAEN-KAEAERVAKENKEGQAAVDARNKAGQAAVDARNKAKQQA 169
A++ IK AE A + A E N + + + A +A+ +
Sbjct: 208 ADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEK 267

Query: 170 QDDQKAKIDAENKAESQRVSQLNAQNKAKIDAENKDAQAKANATNAQLQKDYQAKLAEIK 229
+ + A+ + + A +A+ +Q NA L++D A K
Sbjct: 268 ALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVL-NANRQSLRRDLDASREAKK 326

Query: 230 SVEAYNAGVRQRNKDAQAKADATNAQLQKDYQAKLALYNQALKAKAEADKQS 281
+EA + + ++NK ++A + L +AK L +A K E +
Sbjct: 327 QLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQL--EAEHQKLEEQNKI 376



Score = 41.2 bits (96), Expect = 2e-05
Identities = 43/296 (14%), Positives = 78/296 (26%), Gaps = 23/296 (7%)

Query: 4 NTKGHGFFRKSKAYGLVCAIALA--GAFTLATSQVSADQVTTQATTQ-----------TV 50
NT H RK K A+AL GA + + + T T +
Sbjct: 5 NTNRHYSLRKLKTGTASVAVALTVLGAGLVVNTNEVSAVATRSQTDTLEKVQERADKFEI 64

Query: 51 TQNQAETVTSTQLDKAVATAKKAAVAVTTTPAVNHATTT--DAQADLANQTQAVKDVTAK 108
N + S A + ++ A++ Q ++ A
Sbjct: 65 ENNTLKLKNSDLSFNNKALKDHNDELTEELSNAKEKLRKNDKSLSEKASKIQELEARKAD 124

Query: 109 AQANTQAI----KDATAENAKIDAENKAEAERVAKENKEGQAAVDARNKAGQAAVDARNK 164
+ + +A+ ++AE A A R A K + A++ +
Sbjct: 125 LEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAE 184

Query: 165 AKQQAQDDQKAKIDAENKAESQRVSQLNAQNKAKIDAENKDAQAKANATNAQLQKDYQAK 224
+ + E + A +A A
Sbjct: 185 KAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTAD 244

Query: 225 LAEIKSVEAYNAGVRQRNKDAQAKADATNAQLQKDYQAKLALYNQALKAKAEADKQ 280
A+IK++EA A + R + + A A KA + +
Sbjct: 245 SAKIKTLEAEKAALEARQAELEKA----LEGAMNFSTADSAKIKTLEAEKAALEAE 296


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits      Score  E-value  Comments
GBS_RS04035  adhesinb  29     0.008    Adhesin B signature.
		>adhesinb#Adhesin B signature.

Length = 310

Score = 29.0 bits (65), Expect = 0.008
Identities = 11/31 (35%), Positives = 15/31 (48%)

Query: 3 MKKWFGLLFLGLALITLAACGQKTPEDIIKT 33
MKK L+ L LA + LAAC + +
Sbjct: 1 MKKCRFLVLLLLAFVGLAACSSQKSSTETGS 31


27  GBS_RS05305  GBS_RS05355  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias   Product
GBS_RS05305  1       12       2.449980  hypothetical protein
GBS_RS05310  4       10       2.530383  type IV secretory system conjugative DNA
GBS_RS05315  7       11       2.039078  hypothetical protein
GBS_RS05320  8       11       2.283553  single-stranded DNA-binding protein
GBS_RS05325  5       11       2.489887  LPXTG cell wall anchor domain-containing
GBS_RS05330  1       10       3.252943  LPXTG cell wall anchor domain-containing
GBS_RS05335  1       10       2.612918  C39 family peptidase
GBS_RS05340  2       14       2.694713  hypothetical protein
GBS_RS05345  2       13       2.520117  hypothetical protein
GBS_RS05350  2       12       2.206111  ATP-dependent Clp protease ATP-binding subunit
GBS_RS05355  2       12       1.531768  DNA topoisomerase III
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits      Score  E-value  Comments
GBS_RS05305  adhesinb  29     0.008    Adhesin B signature.
		>adhesinb#Adhesin B signature.

Length = 310

Score = 29.0 bits (65), Expect = 0.008
Identities = 11/31 (35%), Positives = 15/31 (48%)

Query: 3 MKKWFGLLFLGLALITLAACGQKTPEDIIKT 33
MKK L+ L LA + LAAC + +
Sbjct: 1 MKKCRFLVLLLLAFVGLAACSSQKSSTETGS 31


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits        Score  E-value  Comments
GBS_RS05325  GPOSANCHOR  45     9e-07    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 45.4 bits (107), Expect = 9e-07
Identities = 36/195 (18%), Positives = 77/195 (39%), Gaps = 14/195 (7%)

Query: 91 AQADLANQTQAVKDVTAKAQANTQAIKDATAENAKIDAENKAEAERVAKENKEGQAAVDA 150
A A+ +A++ + A++ IK AE A ++A + + A
Sbjct: 223 LAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAK 282

Query: 151 RNKAGQAAVDARNKAKQQAQDDQKAKIDAENKAESQRVSQLNAQNKAKIDAENKDAQAK- 209
A +A++ + Q ++A ++ + L+A +AK E + + +
Sbjct: 283 IKTLEAE--KAALEAEKADLEHQSQVLNANRQSLRR---DLDASREAKKQLEAEHQKLEE 337

Query: 210 ----ANATNAQLQKDYQAKLAEIKSVEAYNAGVRQRNKDAQAKADATNAQLQKDYQAKLA 265
+ A+ L++D A K +EA + + + + ++A+ L++D A
Sbjct: 338 QNKISEASRQSLRRDLDASREAKKQLEAEHQKL---EEQNKI-SEASRQSLRRDLDASRE 393

Query: 266 LYNQALKAKAEADKQ 280
Q KA EA+ +
Sbjct: 394 AKKQVEKALEEANSK 408



Score = 43.1 bits (101), Expect = 5e-06
Identities = 43/232 (18%), Positives = 83/232 (35%), Gaps = 4/232 (1%)

Query: 51 TQNQAETVTSTQLDKAVATAKKAAVAVTTTPAVNHATTTDAQADLANQTQAVKDVTAKAQ 110
+ A L+KA+ A + A + A +A A +A++ +
Sbjct: 148 AEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFST 207

Query: 111 ANTQAIKDATAENAKIDAEN-KAEAERVAKENKEGQAAVDARNKAGQAAVDARNKAKQQA 169
A++ IK AE A + A E N + + + A +A+ +
Sbjct: 208 ADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEK 267

Query: 170 QDDQKAKIDAENKAESQRVSQLNAQNKAKIDAENKDAQAKANATNAQLQKDYQAKLAEIK 229
+ + A+ + + A +A+ +Q NA L++D A K
Sbjct: 268 ALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVL-NANRQSLRRDLDASREAKK 326

Query: 230 SVEAYNAGVRQRNKDAQAKADATNAQLQKDYQAKLALYNQALKAKAEADKQS 281
+EA + + ++NK ++A + L +AK L +A K E +
Sbjct: 327 QLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQL--EAEHQKLEEQNKI 376



Score = 41.2 bits (96), Expect = 2e-05
Identities = 43/296 (14%), Positives = 78/296 (26%), Gaps = 23/296 (7%)

Query: 4 NTKGHGFFRKSKAYGLVCAIALA--GAFTLATSQVSADQVTTQATTQ-----------TV 50
NT H RK K A+AL GA + + + T T +
Sbjct: 5 NTNRHYSLRKLKTGTASVAVALTVLGAGLVVNTNEVSAVATRSQTDTLEKVQERADKFEI 64

Query: 51 TQNQAETVTSTQLDKAVATAKKAAVAVTTTPAVNHATTT--DAQADLANQTQAVKDVTAK 108
N + S A + ++ A++ Q ++ A
Sbjct: 65 ENNTLKLKNSDLSFNNKALKDHNDELTEELSNAKEKLRKNDKSLSEKASKIQELEARKAD 124

Query: 109 AQANTQAI----KDATAENAKIDAENKAEAERVAKENKEGQAAVDARNKAGQAAVDARNK 164
+ + +A+ ++AE A A R A K + A++ +
Sbjct: 125 LEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAE 184

Query: 165 AKQQAQDDQKAKIDAENKAESQRVSQLNAQNKAKIDAENKDAQAKANATNAQLQKDYQAK 224
+ + E + A +A A
Sbjct: 185 KAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTAD 244

Query: 225 LAEIKSVEAYNAGVRQRNKDAQAKADATNAQLQKDYQAKLALYNQALKAKAEADKQ 280
A+IK++EA A + R + + A A KA + +
Sbjct: 245 SAKIKTLEAEKAALEARQAELEKA----LEGAMNFSTADSAKIKTLEAEKAALEAE 296


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits         Score  E-value  Comments
GBS_RS05330  TONBPROTEIN  31     0.002    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 31.5 bits (71), Expect = 0.002
Identities = 21/78 (26%), Positives = 27/78 (34%)

Query: 41 TPPNSVDPSLPTDTTDPSTPVVPEKLPDTSKPAEPETPKEELPAPVDPTTPSAGKEDKQE 100
P SV P D P P + +P P+ APV P + K +
Sbjct: 42 AQPISVTMVTPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPK 101

Query: 101 TVPGQAETPKEEVKPENP 118
V E PK +VKP
Sbjct: 102 PVKKVQEQPKRDVKPVES 119


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS05335  CHANLCOLICIN  34     0.002    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 34.3 bits (78), Expect = 0.002
Identities = 40/218 (18%), Positives = 78/218 (35%), Gaps = 16/218 (7%)

Query: 73 VVDEAQKQKDQSQQNLVKATSTVTEAEKVAAEATPEVVKEAIKAVTEAKEAVTDAEANVV 132
EA++++ + ++ + + AE E + E KAV A++ ++ A++ VV
Sbjct: 149 AFQEAEQRRKEIEREKAETERQLKLAEAE--EKRLAALSEEAKAVEIAQKKLSAAQSEVV 206

Query: 133 DAQKTEQKANQEVQSQAKTVDENVKVVADKESEVKQAEGVVTTAQEAIDSKTANTNAS-- 190
+ N + S D +K +A K +E+ QA E + + N
Sbjct: 207 KMDGEIKTLNSRLSSSIHARDAEMKTLAGKRNELAQASAKYKELDELVKKLSPRANDPLQ 266

Query: 191 ------------EAEKAVTEKQTKLETAETNLTEAQKQDAKIAEEKRLAEQEVVNKQLAV 238
A K EKQ ++ +ET + +I + V
Sbjct: 267 NRPFFEATRRRVGAGKIREEKQKQVTASETRINRINADITQIQKAISQVSNNRNAGIARV 326

Query: 239 TDTQTLLKKLVTEINNEKVSTSLENQAYFNQRDGSWAG 276
+ + LKK + N ++ +++ F Q G
Sbjct: 327 HEAEENLKKAQNNLLNSQIKDAVDATVSFYQTLTEKYG 364


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits    Score  E-value  Comments
GBS_RS05350  HTHFIS  39     7e-05    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 38.7 bits (90), Expect = 7e-05
Identities = 34/171 (19%), Positives = 63/171 (36%), Gaps = 19/171 (11%)

Query: 279 LKGDQERLEGFKERLMNRVKGQEDAIEAVVDAVTIAQAGLQNEKRPLASFLFLGPTGVGK 338
K +LE + M V G+ A++ + +A+ + + + G +G GK
Sbjct: 122 PKRRPSKLEDDSQDGMPLV-GRSAAMQEIYR--VLARLMQTD-----LTLMITGESGTGK 173

Query: 339 TELAKAIAEALFDDEAAMIRFDMSEYKQKEDVTKLIGNRATRIKGQLTEGVKQKPYCV-- 396
+A+A+ + + +M+ + ++L G+ KG T +
Sbjct: 174 ELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFGHE----KGAFTGAQTRSTGRFEQ 229

Query: 397 -----LLLDEIEKAHSEVMDLFLQVLDDGRLTDSSGRLISFKNTIVIMTTN 442
L LDEI + L+VL G T GR + ++ TN
Sbjct: 230 AEGGTLFLDEIGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATN 280


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
GBS_RS05355  PF06580  37     2e-04    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 36.8 bits (85), Expect = 2e-04
Identities = 21/89 (23%), Positives = 44/89 (49%), Gaps = 12/89 (13%)

Query: 254 AQARQLKAISIVSSVEMEEKQ-RAAPH-LFN-LSDIQGLAAKQWGFEPTKTERLIESL-- 308
A+ Q K S+ ++ + + PH +FN L++I+ L + +PTK ++ SL
Sbjct: 147 AEIDQWKMASMAQEAQLMALKAQINPHFMFNALNNIRALILE----DPTKAREMLTSLSE 202

Query: 309 YLKKYLSYPRTDTRFIT-EEEFDYLKNYL 336
++ L ++ R ++ +E + +YL
Sbjct: 203 LMRYSLR--YSNARQVSLADELTVVDSYL 229


28  GBS_RS08680  GBS_RS08710  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
GBS_RS08680  0       9        0.664977   membrane protein insertase YidC
GBS_RS08685  0       10       0.415460   amino acid ABC transporter permease
GBS_RS08690  0       11       0.198272   transporter substrate-binding domain-containing
GBS_RS08695  0       12       0.295714   amidase
GBS_RS08700  0       10       -0.835302  transcription elongation factor GreA
GBS_RS08705  -1      9        -0.451527  endolytic transglycosylase MltG
GBS_RS08710  -1      11       -0.482932  GNAT family N-acetyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits         Score  E-value  Comments
GBS_RS08680  60KDINNERMP  119    2e-32    60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 119 bits (300), Expect = 2e-32
Identities = 61/226 (26%), Positives = 105/226 (46%), Gaps = 22/226 (9%)

Query: 34 YGVIWNTLGVPMANLITYFAQHQGLGFGVAIIIVTIIVRVIILPLGLYQSWKASYQA-EK 92
YG +W + P+ L+ + G +G +III+T IVR I+ PL KA Y + K
Sbjct: 330 YGWLW-FISQPLFKLLKWIHSFVG-NWGFSIIIITFIVRGIMYPLT-----KAQYTSMAK 382

Query: 93 MAYFKPLFEPINERLRNAKTQEEKLAAQTELMTAQRENGLSMFGGIGCLPLLIQMPFFSA 152
M +P + + ERL + +K E+M + ++ GG C PLLIQMP F A
Sbjct: 383 MRMLQPKIQAMRERLGD-----DKQRISQEMMALYKAEKVNPLGG--CFPLLIQMPIFLA 435

Query: 153 IFFAARYTPGVSSATFLG----LNLGQKSLTLTVIIAILYFVQSWLSMQGVPDEQRQQMK 208
+++ + + A F L+ L +++ + F +S V D +Q++
Sbjct: 436 LYYMLMGSVELRQAPFALWIHDLSAQDPYYILPILMGVTMFFIQKMSPTTVTDPMQQKI- 494

Query: 209 TMMYVMPIAMVFMSISLPASVALYWFIGGIFSIIQQLVTTYVLKPK 254
M MP+ + P+ + LY+ + + +IIQQ + L+ +
Sbjct: 495 --MTFMPVIFTVFFLWFPSGLVLYYIVSNLVTIIQQQLIYRGLEKR 538


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits    Score  E-value  Comments
GBS_RS08685  HTHFIS  31     0.002    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 31.3 bits (71), Expect = 0.002
Identities = 15/46 (32%), Positives = 24/46 (52%), Gaps = 5/46 (10%)

Query: 110 EIIRAALLAVDHGQWEAARALGLKTPTIYR-----GIIIPQATRIA 150
+I AAL A Q +AA LGL T+ + G+ + +++R A
Sbjct: 439 PLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGVSVYRSSRSA 484


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits         Score  E-value  Comments
GBS_RS08705  IGASERPTASE  47     3e-07    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 46.6 bits (110), Expect = 3e-07
Identities = 36/214 (16%), Positives = 72/214 (33%), Gaps = 22/214 (10%)

Query: 19 EQILAELEEANRLRKLREEELYQKEQEAKEAARRTAQLMADYEAQRLKDE-QEARAKALE 77
E E + + + E Q + AKEA E + E +E + +
Sbjct: 1042 ENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETK 1101

Query: 78 TKQRLEEQEKARIEAKLLAEAAREEERRQAEQALASQEEQVINQGMEPSRELDSGSKSSE 137
+E++EKA++E E +E + ++ + ++ + + EP+RE D E
Sbjct: 1102 ETATVEKEEKAKVE----TEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKE 1157

Query: 138 FRTTENVPDIDLKADKTDVATAVPNQETEEIFLVRATDIPTEGENVKLGETSELEPVAKE 197
++ N AD A + + + N + P +
Sbjct: 1158 PQSQTNTT-----ADTEQPAKETSSNVEQPV----TESTTVNTGNSVVENPENTTPATTQ 1208

Query: 198 PIRVEDLSKEEEDIALSAKNKHNKRERRQKADNV 231
P + S + ++ ++R R NV
Sbjct: 1209 PTVNSESSNKPKN--------RHRRSVRSVPHNV 1234



Score = 29.6 bits (66), Expect = 0.050
Identities = 42/220 (19%), Positives = 77/220 (35%), Gaps = 37/220 (16%)

Query: 36 EEELYQKEQEAKEAARRTAQL------MADYEAQRLKDEQEAR---------AKALETKQ 80
+LY E E + T + AD + +E+ AR A A ++
Sbjct: 977 RYDLYNPEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSET 1036

Query: 81 RLEEQEKARIEAKLLAEA---AREEERRQAEQALASQEEQVINQGMEPSRELDSGSKSSE 137
E ++ E+K + + A E + E A ++ N + S +K ++
Sbjct: 1037 TETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQ 1096

Query: 138 FRTTENVPDIDLKADKTDVATAVPNQETEEIFLVRATDIPTEGENVKLGETSELEPVAKE 197
T+ ++ K +K V T +T+E+ V + P + E SE E
Sbjct: 1097 TTETKETATVE-KEEKAKVETE----KTQEVPKVTSQVSPKQ-------EQSETVQPQAE 1144

Query: 198 PIRVEDLSKEEEDIALSAKNKHNKRERRQKADNVAKRIAR 237
P R E D ++ K ++ + AK +
Sbjct: 1145 PAR-------ENDPTVNIKEPQSQTNTTADTEQPAKETSS 1177


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS08710  SACTRNSFRASE  32     5e-04    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 32.2 bits (73), Expect = 5e-04
Identities = 23/88 (26%), Positives = 38/88 (43%), Gaps = 8/88 (9%)

Query: 58 EGPVIKGRYLTDDLFHKVSEIPVREG--GFIGITSLSIHPDFKGQGIGTALLAAMKDLVV 115
EG YL ++ + I +R G+ I +++ D++ +G+GTALL +
Sbjct: 63 EGKAAFLYYLENNC---IGRIKIRSNWNGYALIEDIAVAKDYRKKGVGTALLHKAIEWAK 119

Query: 116 SQERDGISLTCHDDLIS---FYEMNGFK 140
G+ L D IS FY + F
Sbjct: 120 ENHFCGLMLETQDINISACHFYAKHHFI 147


29  GBS_RS09945  GBS_RS09965  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias   Product
GBS_RS09945  -1      10       0.295722  response regulator
GBS_RS09950  -1      11       0.852499  UDP-glucose 4-epimerase GalE
GBS_RS09955  -2      11       0.731612  alpha-glucosidase
GBS_RS09960  -2      10       0.540781  sn-glycerol-3-phosphate ABC transporter
GBS_RS09965  -2      11       0.501668  helix-turn-helix domain-containing protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits    Score  E-value  Comments
GBS_RS09945  HTHFIS  78     9e-19    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 77.6 bits (191), Expect = 9e-19
Identities = 26/103 (25%), Positives = 49/103 (47%), Gaps = 2/103 (1%)

Query: 3 VLIIEDDPMVEFIHRNYLEKLNYFQNIYSTASQTQAIAYLNDIKIQLVLLDIHIKEGNGL 62
+L+ +DD + + L + Y ++ T++ ++ LV+ D+ + + N
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGY--DVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 63 ELLKLLRNQHQNTEVIVISAANEAHTVKEAFHLGIVDYLIKPF 105
+LL ++ + V+V+SA N T +A G DYL KPF
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPF 106


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS09950  NUCEPIMERASE  164    2e-50    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 164 bits (417), Expect = 2e-50
Identities = 76/332 (22%), Positives = 142/332 (42%), Gaps = 43/332 (12%)

Query: 1 MAVLILGGAGYIGSHMVDQLITQGKEKVIVVDNLVTGH-------RQAV--HSDAIFYEG 51
M L+ G AG+IG H+ +L+ G +V+ +DNL + R + F++
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAG-HQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKI 59

Query: 52 DLSDKTFMRQVFRENPDVDAVIHFAAFSLVAESMENPLKYFDNNTAGMIKLLEVMNECDI 111
DL+D+ M +F + V V S+ENP Y D+N G + +LE I
Sbjct: 60 DLADREGMTDLFASGH-FERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKI 118

Query: 112 KNIVFSSTAATYGIPEQVPILETAP-QNPINPYGESKLMMETIMKWADQAYGIKFVALRY 170
++++++S+++ YG+ ++P +P++ Y +K E + YG+ LR+
Sbjct: 119 QHLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYGLPATGLRF 178

Query: 171 FNVAGDKPDGSIGEDHKPETHLLPIILQVAQGVRDKIMIFGDDYNTPDGTNVRDYVHPFD 230
F V G P G +P+ L + +G I ++ G RD+ + D
Sbjct: 179 FTVYG--PWG------RPDMALFKFTKAMLEG--KSIDVYN------YGKMKRDFTYIDD 222

Query: 231 LADAHILAVDYLRQGNES---------------NVFNLGSSTGFSNLQMLEAARRITGKE 275
+A+A I D + + V+N+G+S+ + ++A G E
Sbjct: 223 IAEAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIE 282

Query: 276 IPAQKAARRPGDPDTLIASSEKARQILGWEPK 307
+PGD A ++ +++G+ P+
Sbjct: 283 AKKNMLPLQPGDVLETSADTKALYEVIGFTPE 314


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
GBS_RS09960  PF05272  35     5e-04    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 35.0 bits (80), Expect = 5e-04
Identities = 13/56 (23%), Positives = 19/56 (33%), Gaps = 9/56 (16%)

Query: 34 IVFVGPSGCGKSTTLRMIAGLEDISEGELKIDGEVVNDKSPKDRDIAMVFQNYALY 89
+V G G GKST + + GL+ S+ I +D Y
Sbjct: 599 VVLEGTGGIGKSTLINTLVGLDFFSDTHFDI---------GTGKDSYEQIAGIVAY 645


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits    Score  E-value  Comments
GBS_RS09965  HTHFIS  43     5e-07    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 43.3 bits (102), Expect = 5e-07
Identities = 14/54 (25%), Positives = 27/54 (50%), Gaps = 4/54 (7%)

Query: 215 LQHILDTSDTSAIIKALWQEQGNLAKTAKALFIHRNSLQYKLDKFTQSSGLNLK 268
+L + I+ AL +GN K A L ++RN+L+ K+ + G+++
Sbjct: 429 YDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIREL----GVSVY 478


30  GBS_RS10790  GBS_RS10820  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
GBS_RS10790  -1      12       -1.888643  response regulator transcription factor
GBS_RS10795  -2      10       -0.894077  HAMP domain-containing histidine kinase
GBS_RS10800  -3      11       -1.238917  YfcC family protein
GBS_RS10805  -2      11       -1.961861  carbamate kinase
GBS_RS10810  -1      11       -1.157024  ornithine carbamoyltransferase
GBS_RS10815  -1      12       -1.449115  GHKL domain-containing protein
GBS_RS10820  0       9        -1.901457  DNA-binding domain-containing protein
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits    Score  E-value  Comments
GBS_RS10800  HTHFIS  93     4e-24    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 93.0 bits (231), Expect = 4e-24
Identities = 33/123 (26%), Positives = 62/123 (50%)

Query: 2 KILVVEDEFDLNRSIVKLLKKQHYSVDSASNGEEALQFVSVAEYDVIILDVMMPKMDGFT 61
ILV +D+ + + + L + Y V SN ++++ + D+++ DV+MP + F
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 FLKLLRNKGSQVSILMLTARDAVEDRIAGLDFGADDYLVKPFEFGELMARIRAMLRRANR 121
L ++ + +L+++A++ I + GA DYL KPF+ EL+ I L R
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 122 QVS 124
+ S
Sbjct: 125 RPS 127


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS10815  CARBMTKINASE  376    e-133    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 376 bits (966), Expect = e-133
Identities = 142/311 (45%), Positives = 204/311 (65%), Gaps = 7/311 (2%)

Query: 3 KIVVALGGNALGN-----SPEEQLRLVKHTAKSLVALIKKGHEIVVSHGNGPQVGAINLG 57
++V+ALGGNAL S EE + V+ TA+ + +I +G+E+V++HGNGPQVG++ L
Sbjct: 4 RVVIALGGNALQQRGQKGSYEEMMDNVRKTARQIAEIIARGYEVVITHGNGPQVGSLLLH 63

Query: 58 MNFAAESGQGTNFPFPECGAMSQGYIGYHLQQSLLNELRQEGINKEVATIITQIEVDESD 117
M+ + P GAMSQG+IGY +QQ+L NELR+ G+ K+V TIITQ VD++D
Sbjct: 64 MDAGQATYGIPAQPMDVAGAMSQGWIGYMIQQALKNELRKRGMEKKVVTIITQTIVDKND 123

Query: 118 QAFSAPTKPIGTFYDKETSEKIAIEKGYTFVEDAGRGYRRVVASPEPKKIIEINSIKTLI 177
AF PTKP+G FYD+ET++++A EKG+ ED+GRG+RRVV SP+PK +E +IK L+
Sbjct: 124 PAFQNPTKPVGPFYDEETAKRLAREKGWIVKEDSGRGWRRVVPSPDPKGHVEAETIKKLV 183

Query: 178 ENDTLVIAGGGGGIPVI-NKGGYEGIAAVIDKDKSSALLAGELAADQLIILTAVDYVYTQ 236
E +VIA GGGG+PVI G +G+ AVIDKD + LA E+ AD +ILT V+
Sbjct: 184 ERGVIVIASGGGGVPVILEDGEIKGVEAVIDKDLAGEKLAEEVNADIFMILTDVNGAALY 243

Query: 237 FGKENQKALTEVNENQMIDYVNQGEFAKGSMLPKVIACMSFLDHNPKGTALITSLNGLED 296
+G E ++ L EV ++ Y +G F GSM PKV+A + F++ + A+I L +
Sbjct: 244 YGTEKEQWLREVKVEELRKYYEEGHFKAGSMGPKVLAAIRFIEWGGE-RAIIAHLEKAVE 302

Query: 297 ALDGKLGTRIT 307
AL+GK GT++
Sbjct: 303 ALEGKTGTQVL 313


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits     Score  E-value  Comments
GBS_RS10825  PF06580  41     6e-06    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 41.0 bits (96), Expect = 6e-06
Identities = 28/175 (16%), Positives = 58/175 (33%), Gaps = 32/175 (18%)

Query: 256 NIIKGLGTYF----SVKNESTMALKDIFQIVLSYTRSIIQFRHQDIIILENNKCNLIISN 311
++ L N ++L D +V SY + + +D + EN N I +
Sbjct: 195 EMLTSLSELMRYSLRYSNARQVSLADELTVVDSYL-QLASIQFEDRLQFENQ-INPAIMD 252

Query: 312 YYYLLTIISNIVLNAVE-AIDKQKK-GTISVHTEELEDFIKIEISDNGPGIPDKMKHMIF 369
++ +V N ++ I + + G I + + + +E+ + G K
Sbjct: 253 VQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKEST- 311

Query: 370 KPGFSTKFDANGDIYRGIGLSHVR----ILMEEQYQGTITVCPNQPNGTTFTLLF 420
G GL +VR +L + Q ++ + +L
Sbjct: 312 ----------------GTGLQNVRERLQMLYGTEAQIKLS---EKQGKVNAMVLI 347


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits    Score  E-value  Comments
GBS_RS10830  HTHFIS  65     4e-14    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 64.9 bits (158), Expect = 4e-14
Identities = 28/144 (19%), Positives = 63/144 (43%), Gaps = 9/144 (6%)

Query: 5 IIDDDPTITMILQDIIE-EDFNNTVVRVNNVSSKAYNELLIADVDIVLIDLLMPILDGVT 63
+ DDD I +L + + V +N ++ + + D D+V+ D++MP +
Sbjct: 8 VADDDAAIRTVLNQALSRAGY--DVRITSNAAT-LWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 64 LVQKIYKQRSDLKFIMISQVKDNDLRQEAYKAGIEFFINKPINIIEVKSVVKRVTDTI-- 121
L+ +I K R DL +++S +A + G ++ KP ++ E+ ++ R
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 122 ---EMQKKLNTIQNLLENTPSYQK 142
+++ L+ + + Q+
Sbjct: 125 RPSKLEDDSQDGMPLVGRSAAMQE 148


31  GBS_RS10990  GBS_RS11025  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
GBS_RS10990  2       15       -0.275057  arginine repressor
GBS_RS10995  -1      17       -0.351201  Crp/Fnr family transcriptional regulator
GBS_RS11000  -1      18       0.262732   B3/4 domain-containing protein
GBS_RS11005  -1      19       0.465916   arginine deiminase
GBS_RS11010  0       16       -0.213532  GNAT family N-acetyltransferase
GBS_RS11015  1       15       -0.315758  ornithine carbamoyltransferase
GBS_RS11020  -1      16       -0.629036  arginine-ornithine antiporter
GBS_RS11025  -1      11       -2.361813  carbamate kinase
ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS10990  ARGREPRESSOR  128    6e-41    Bacterial arginine repressor signature.
		>ARGREPRESSOR#Bacterial arginine repressor signature.

Length = 149

Score = 128 bits (324), Expect = 6e-41
Identities = 58/149 (38%), Positives = 95/149 (63%), Gaps = 4/149 (2%)

Query: 1 MNKKETRHQLIRSLVSETKVRTQHELRELLEKNGVSVTQATLSRDMKELNLIKVNESSDN 60
MNK + RH IR +++ ++ TQ EL ++L+K+G +VTQAT+SRD+KEL+L+KV N
Sbjct: 1 MNKGQ-RHIKIREIITANEIETQDELVDILKKDGYNVTQATVSRDIKELHLVKV---PTN 56

Query: 61 ATETYYEIHSISQKRWEERLRFYMEDALVMLFPVQNQIVLKTLPGLAQSFGSILDSILLP 120
Y + + + +L+ + DA V + + IVLKT+PG AQ+ G+++D++
Sbjct: 57 NGSYKYSLPADQRFNPLSKLKRSLMDAFVKIDSASHLIVLKTMPGNAQAIGALMDNLDWE 116

Query: 121 EILATVCGDDTCLIICNNSEEAQKCFEKL 149
EI+ T+CGDDT LIIC ++ + +K+
Sbjct: 117 EIMGTICGDDTILIICRTHDDTKVVQKKI 145


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS11005  ARGDEIMINASE  570    0.0      Bacterial arginine deiminase signature.
		>ARGDEIMINASE#Bacterial arginine deiminase signature.

Length = 409

Score = 570 bits (1471), Expect = 0.0
Identities = 193/408 (47%), Positives = 275/408 (67%), Gaps = 8/408 (1%)

Query: 6 PIHVFSEIGKLKKVMLHRPGKEIENLMPDYLERLLFDDIPFLEDAQKEHDAFAQALRNEG 65
PI++FSEIG+LKKV+LHRPG+E+ENL P ++ LFDDIP+LE A++EH+ FA L+N
Sbjct: 7 PINIFSEIGRLKKVLLHRPGEELENLTPFIMKNFLFDDIPYLEVARQEHEVFASILKNNL 66

Query: 66 VEVLYLENLAAESL-TNQEIREQFIDEYIGEANVRGRATKKAIRELLLNIKDNKELIEKT 124
VE+ Y+E+L +E L ++ + +FI ++I EA ++ T +++ + +I K
Sbjct: 67 VEIEYIEDLISEVLVSSVALENKFISQFILEAEIKTDFTINLLKDYFSS-LTIDNMISKM 125

Query: 125 MAGIQKSELPEIPSSEKGLTDLVESNYPFAIDPMPNLYFTRDPFATIGNGVSLNHMFSET 184
++G+ EL SS L DLV F IDPMPN+ FTRDPFA+IGNGV++N MF++
Sbjct: 126 ISGVVTEELKNYTSS---LDDLVNGANLFIIDPMPNVLFTRDPFASIGNGVTINKMFTKV 182

Query: 185 RNRETLYGKYIFTHHPEYGGKVPMVYEREETTRIEGGDELVLSKDVLAVGISQRTDAASI 244
R RET++ +YIF +HP Y VP+ R E +EGGDELVL+K +L +GIS+RT+A S+
Sbjct: 183 RQRETIFAEYIFKYHPVYKENVPIWLNRWEEASLEGGDELVLNKGLLVIGISERTEAKSV 242

Query: 245 EKLLVNIFKQNLGFKKVLAFEFANNRKFMHLDTVFTMVDYDKFTIHPEIEGDLRVYSVTY 304
EKL +++FK F +LAF+ NR +MHLDTVFT +DY FT + +Y +TY
Sbjct: 243 EKLAISLFKNKTSFDTILAFQIPKNRSYMHLDTVFTQIDYSVFTSFTSDDMYFSIYVLTY 302

Query: 305 ENQD--LHIEEEKGDLADLLAKNLGVEKVELIRCGGDNLVAAGREQWNDGSNTLTIAPGV 362
+HI++EK + D+L+ LG K+++I+C G +L+ REQWNDG+N L IAPG
Sbjct: 303 NPSSSKIHIKKEKARIKDVLSFYLG-RKIDIIKCAGGDLIHGAREQWNDGANVLAIAPGE 361

Query: 363 VIVYNRNTITNAILESKGLKLIKINGSELVRGRGGPRCMSMPFEREDL 410
+I Y+RN +TN + E G+K+ +I SEL RGRGGPRCMSMP RED+
Sbjct: 362 IIAYSRNHVTNKLFEENGIKVHRIPSSELSRGRGGPRCMSMPLIREDI 409


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS11010  AUTOINDCRSYN  30     0.003    Autoinducer synthesis protein signature.
		>AUTOINDCRSYN#Autoinducer synthesis protein signature.

Length = 216

Score = 29.8 bits (67), Expect = 0.003
Identities = 13/66 (19%), Positives = 22/66 (33%), Gaps = 7/66 (10%)

Query: 6 NHLPYQRAASLY-IRFSVFVIER-----NIKMEEEFDDNDEQDTIYAVLYDGKQPVSTGR 59
L ++ L+ +R F +R EFD D +T Y + + R
Sbjct: 10 TLLSETKSGELFTLRKETFK-DRLNWAVQCTDGMEFDQYDNNNTTYLFGIKDNTVICSLR 68

Query: 60 FLPETQ 65
F+
Sbjct: 69 FIETKY 74


ORFs having significant similarity with Known Virulence factors
LocusTag     Hits          Score  E-value  Comments
GBS_RS11025  CARBMTKINASE  412    e-148    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 412 bits (1061), Expect = e-148
Identities = 147/314 (46%), Positives = 211/314 (67%), Gaps = 6/314 (1%)

Query: 6 QKIVVALGGNAIL--STDASAKAQQEALINTSKSLVKLIKEGHDVIVTHGNGPQVGNLLL 63
+++V+ALGGNA+ S + + + T++ + ++I G++V++THGNGPQVG+LLL
Sbjct: 3 KRVVIALGGNALQQRGQKGSYEEMMDNVRKTARQIAEIIARGYEVVITHGNGPQVGSLLL 62

Query: 64 QQAASDSEKN-PAMPLDTCVAMTEGSIGFWLQNALNNELQEQGIDKEVATVVTQVIVDEK 122
A + PA P+D AM++G IG+ +Q AL NEL+++G++K+V T++TQ IVD+
Sbjct: 63 HMDAGQATYGIPAQPMDVAGAMSQGWIGYMIQQALKNELRKRGMEKKVVTIITQTIVDKN 122

Query: 123 DQAFTNPTKPIGPFLSEEDAKKQAQETGSKFKEDAGRGWRKVVPSPKPVGIKEASVIRRL 182
D AF NPTKP+GPF EE AK+ A+E G KED+GRGWR+VVPSP P G EA I++L
Sbjct: 123 DPAFQNPTKPVGPFYDEETAKRLAREKGWIVKEDSGRGWRRVVPSPDPKGHVEAETIKKL 182

Query: 183 VDSGVVVISAGGGGVPVIEDANTKALKGVEAVIDKDFASQTLSELVDADLFIVLTGVDNV 242
V+ GV+VI++GGGGVPVI + +KGVEAVIDKD A + L+E V+AD+F++LT V+
Sbjct: 183 VERGVIVIASGGGGVPVILEDG--EIKGVEAVIDKDLAGEKLAEEVNADIFMILTDVNGA 240

Query: 243 FVNFNKPNQEKLEEVTVSQMKQYITENQFAPGSMLPKVEAAIAFVENKPESRAIITSLEN 302
+ + ++ L EV V ++++Y E F GSM PKV AAI F+E RAII LE
Sbjct: 241 ALYYGTEKEQWLREVKVEELRKYYEEGHFKAGSMGPKVLAAIRFIEW-GGERAIIAHLEK 299

Query: 303 IDNVLAQNAGTQIV 316
L GTQ++
Sbjct: 300 AVEALEGKTGTQVL 313



 
Contact Sachin Pundhir for Bugs/Comments.