PredictBias

Identification of genomic and pathogenicity islands in prokaryotic genomes
 
A) Input parameters
Genome: NC_004116.gbk
Threshold dinucleotide bias: 2
Threshold codon bias: 4
Threshold %GC bias: 3
E-value (RPSBlast): 0.05
Genome (non-pathogenic):
 
 
C) Potential GIs and PAIs in NC_004116
S.No  Start    End      Bias  Virulence  Insertion elements  Prediction
1     SAG0154  SAG0160  Y     N          N                   Genomic Island
LocusTag     DNBias  CDNBias  %GCBias    Product
SAG0154      2       13       1.868992   adc operon repressor AdcR
SAG0155      3       19       3.182842   zinc ABC transporter ATP-binding protein
SAG0156      3       19       3.144197   zinc ABC transporter permease
SAG0158      3       18       3.083738   tyrosyl-tRNA synthetase
SAG0159      2       17       2.843534   penicillin-binding protein 1B
SAG0160      3       22       3.041539   DNA-directed RNA polymerase subunit beta
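
The per-ORF bias columns of an island can be summarized with a few lines of Python. The sketch below is illustrative only. The rows are transcribed from island 1 (SAG0154 to SAG0160), and the column split assumes a one-digit dinucleotide bias followed by a two-digit codon bias. The helper names `summarize` and `at_or_above` are hypothetical. This page does not describe how PredictBias applies its thresholds, so the comparison against 3 (the %GC bias threshold listed under the input parameters) is an example filter, not the tool's method.

```python
from statistics import mean

# Rows transcribed from island 1 (SAG0154 to SAG0160):
# (locus tag, dinucleotide bias, codon bias, %GC bias).
# The column split is an assumption about the flattened table.
ISLAND_1 = [
    ("SAG0154", 2, 13, 1.868992),
    ("SAG0155", 3, 19, 3.182842),
    ("SAG0156", 3, 19, 3.144197),
    ("SAG0158", 3, 18, 3.083738),
    ("SAG0159", 2, 17, 2.843534),
    ("SAG0160", 3, 22, 3.041539),
]


def summarize(rows):
    """Return the mean of each bias column over an island's ORFs."""
    return {
        "dn_bias": mean(r[1] for r in rows),
        "cdn_bias": mean(r[2] for r in rows),
        "gc_bias": mean(r[3] for r in rows),
    }


def at_or_above(rows, column, threshold):
    """Hypothetical helper: locus tags whose value in `column` is >= threshold."""
    return [r[0] for r in rows if r[column] >= threshold]


if __name__ == "__main__":
    print(summarize(ISLAND_1))
    # Example filter only: ORFs whose %GC bias (column index 3) is at least 3.
    print(at_or_above(ISLAND_1, 3, 3.0))
```

Keeping the rows as plain tuples makes it easy to apply the same two helpers to any other island in the listing.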
2     SAG0210  SAG0260  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
SAG0210      4       21       2.360645   hypothetical protein
SAG0211      6       27       2.518929   DegV family protein
SAG0212      3       31       3.191966   hypothetical protein
SAG0213      2       28       1.879415   hypothetical protein
SAG0214      4       27       1.582150   50S ribosomal protein L13
SAG0215      3       26       1.738246   30S ribosomal protein S9
SAG0216      3       27       0.492281   hypothetical protein
SAG0217      3       26       0.231549   phage integrase family site specific
SAG0218      4       21       -1.478961  Cro/CI family transcriptional regulator
SAG0219      0       22       0.342972   hypothetical protein
SAG0220      0       24       0.344502   hypothetical protein
SAG0221      2       21       -1.047716  hypothetical protein
SAG0222      1       19       -1.388130  hypothetical protein
SAG0223      0       19       -1.342592  hypothetical protein
SAG0224      0       16       -1.614055  replication initiation protein
SAG0225      1       13       -2.788383  hypothetical protein
SAG0226      1       16       -4.034451  recombination protein
SAG0227      1       17       -4.379457  hypothetical protein
SAG0228      -1      15       -4.009955  hypothetical protein
SAG0229      0       18       -4.749400  hypothetical protein
SAG0230      0       20       -4.869000  hypothetical protein
SAG0231      2       22       -5.411578  hypothetical protein
SAG0232      3       21       -4.472815  hypothetical protein
SAG0233      6       22       -5.759120  hypothetical protein
SAG0234      6       21       -5.250287  hypothetical protein
SAG0235      5       21       -4.644310  hypothetical protein
SAG0236      4       16       -3.353588  hypothetical protein
SAG0237      2       16       -2.567278  hypothetical protein
SAG0238      4       15       -2.174709  hypothetical protein
SAG0239      2       15       -1.555441  MutR family transcriptional regulator
SAG0240      3       15       -1.821963  transporter
SAG0241      1       15       -0.579156  amino acid ABC transporter permease
SAG0242      0       13       -0.920354  amino acid ABC transporter amino acid-binding
SAG0243      1       13       -0.571990  amino acid ABC transporter permease
SAG0244      0       14       -0.774779  amino acid ABC transporter ATP-binding protein
SAG0245      1       19       -2.510866  *************hypothetical protein
SAG0246      1       17       -2.026542  hypothetical protein
SAG0247      2       21       -4.092683  hypothetical protein
SAG0248      3       20       -5.413464  hypothetical protein
SAG0249      3       21       -5.986103  hypothetical protein
SAG0250      2       24       -6.813042  hypothetical protein
SAG0251      2       25       -6.947544  Cro/CI family transcriptional regulator
SAG0252      2       25       -7.076312  acetyltransferase
SAG0253      0       23       -7.419459  acetyltransferase
SAG0254      -1      19       -6.833681  acetyltransferase
SAG0255      1       19       -6.349834  hypothetical protein
SAG0256      2       16       -4.943706  ECF subfamily RNA polymerase sigma factor
SAG0257      1       14       -3.348406  lipoprotein
SAG0258      1       14       -3.665786  TetR family transcriptional regulator
SAG0259      1       17       -2.781937  ABC transporter
SAG0260      2       17       -1.534109  ABC transporter ATP-binding protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SAG0230   BCTERIALGSPD  28     0.007    Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D signature.

Length = 660

Score = 27.6 bits (61), Expect = 0.007
Identities = 12/40 (30%), Positives = 22/40 (55%)

Query: 57 ELSPKITQFAQLLEDINQQLLKVADVVEQTDSDIASQINK 96
++ P+I + +L +I Q++ VAD T SD+ + N
Sbjct: 489 KVKPQINEGDSVLLEIEQEVSSVADAASSTSSDLGATFNT 528
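
Each hit report repeats the bit score, E-value and identity counts in a fixed text layout, so they can be extracted with regular expressions. The following is a minimal sketch, assuming the `Score = ...` and `Identities = ...` lines look like those in the SAG0230 report above. `parse_hit` and `E_VALUE_CUTOFF` are hypothetical names, and the cutoff of 0.05 is the RPSBlast E-value listed under the input parameters.

```python
import re

# Two summary lines copied from the SAG0230 / BCTERIALGSPD report.
REPORT = """\
Score = 27.6 bits (61), Expect = 0.007
Identities = 12/40 (30%), Positives = 22/40 (55%)
"""

SCORE_RE = re.compile(
    r"Score\s*=\s*([\d.]+)\s*bits\s*\((\d+)\),\s*Expect\s*=\s*([\d.eE+-]+)"
)
IDENT_RE = re.compile(r"Identities\s*=\s*(\d+)/(\d+)")

# Assumed to correspond to the "E-value (RPSBlast)" input parameter.
E_VALUE_CUTOFF = 0.05


def parse_hit(text):
    """Extract bit score, raw score, E-value and identity fraction from a report."""
    score = SCORE_RE.search(text)
    ident = IDENT_RE.search(text)
    if score is None or ident is None:
        raise ValueError("report does not contain Score/Identities lines")
    matches, aligned = int(ident.group(1)), int(ident.group(2))
    return {
        "bits": float(score.group(1)),
        "raw_score": int(score.group(2)),
        "evalue": float(score.group(3)),
        "identity": matches / aligned,
    }


if __name__ == "__main__":
    hit = parse_hit(REPORT)
    print(hit)
    print("within cutoff:", hit["evalue"] <= E_VALUE_CUTOFF)
```

Because `float()` accepts both `0.007` and exponent forms such as `1e-14`, the same pattern covers the other reports in this listing.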


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SAG0240   TCRTETA       32     0.004    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 32.1 bits (73), Expect = 0.004
Identities = 43/183 (23%), Positives = 75/183 (40%), Gaps = 18/183 (9%)

Query: 23 DYGNSTWIASMGGLGQKILGIYQIVELLVSIVLNPFGGALADRFQRRKILLITDAICAIM 82
D +S + + G+ +L +Y +++ + P GAL+DRF RR +LL++ A A+
Sbjct: 34 DLVHSNDVTAHYGI---LLALYALMQFACA----PVLGALSDRFGRRPVLLVSLAGAAVD 86

Query: 83 CFLLSFIGDDKVMVYGLIVANAILAVSNAFSSPAYKSYIPEIVDKADIITYNANLETIVQ 142
+++ V+ G IVA A +YI +I D + + +
Sbjct: 87 YAIMATAPFLWVLYIGRIVAGITGAT-----GAVAGAYIADITDGDERARHFGFMSACFG 141

Query: 143 IISVSSPVLGFLIFNNFGIRITLIVDA----ITFLISFLFLY-AIKVERVQLSKQEKVAI 197
V+ PVLG L+ F A + FL L + K ER L ++ +
Sbjct: 142 FGMVAGPVLGGLM-GGFSPHAPFFAAAALNGLNFLTGCFLLPESHKGERRPLRREALNPL 200

Query: 198 KNI 200
+
Sbjct: 201 ASF 203


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SAG0258   HTHTETR       63     1e-14    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 63.1 bits (153), Expect = 1e-14
Identities = 27/85 (31%), Positives = 40/85 (47%), Gaps = 6/85 (7%)

Query: 22 KQKVILSAIELFASQGFHGTSTAQLAKNAEVSQATIYKYFETKDKLLVFILELIVQTIGR 81
+Q ++ A+ LF+ QG TS ++AK A V++ IY +F+ K L I EL IG
Sbjct: 13 RQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSESNIGE 72

Query: 82 PFFTELSTF------STKEELIHFF 100
+ F +E LIH
Sbjct: 73 LELEYQAKFPGDPLSVLREILIHVL 97


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SAG0259   ABC2TRNSPORT  55     7e-11    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 55.3 bits (133), Expect = 7e-11
Identities = 44/166 (26%), Positives = 81/166 (48%), Gaps = 3/166 (1%)

Query: 197 RTSGTLDRLLATPVKRSDIVFGYMLSYGILAIIQTIVIVLSTIWLLDIQVVGSIFSVIIV 256
T + +L T ++ DIV G M A + I + L Q + ++++ ++
Sbjct: 95 EGQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGAGIGVVAAALGYTQWLSLLYALPVI 154

Query: 257 NFILALVALSLGILMSTLAKSEFQMMQFIPLIIMPQLFFSG-IIPLENMASWAQTVGKIL 315
L SLG++++ LA S + + L+I P LF SG + P++ + QT + L
Sbjct: 155 ALT-GLAFASLGMVVTALAPSYDYFIFYQTLVITPILFLSGAVFPVDQLPIVFQTAARFL 213

Query: 316 PLSYSGDALTKIIMYGQGLPNVSSNLLVLLLFLIILTIANIFGLKR 361
PLS+S D L + IM G + +V ++ L ++++I + L+R
Sbjct: 214 PLSHSID-LIRPIMLGHPVVDVCQHVGALCIYIVIPFFLSTALLRR 258


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SAG0260   PF05272       32     0.003    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 31.6 bits (71), Expect = 0.003
Identities = 19/90 (21%), Positives = 29/90 (32%), Gaps = 22/90 (24%)

Query: 35 LIGPSGAGKSTLIKTMLGME-KADKGTALVLDTQMPDRNILNQIGYMA-QSDALYESLTG 92
L G G GKSTLI T++G++ +D D Y YE
Sbjct: 601 LEGTGGIGKSTLINTLVGLDFFSD---THFDIGTGKD-------SYEQIAGIVAYE---- 646

Query: 93 LENLLFFGKMKGIQKTELKQQITHISKVVD 122
+M ++ + + S D
Sbjct: 647 ------LSEMTAFRRADAEAVKAFFSSRKD 670


3     SAG0367  SAG0377  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
SAG0367      -2      11       -3.111697  acetyltransferase
SAG0368      -1      13       -3.121925  hypothetical protein
SAG0369      0       16       -4.575258  hypothetical protein
SAG0370      -1      15       -3.391141  HIT family protein
SAG0371      0       14       -3.455729  hypothetical protein
SAG0372      1       14       -2.436947  hypothetical protein
SAG0373      -2      12       -1.139122  ABC transporter ATP-binding protein
SAG0374      -2      11       -1.050932  ABC transporter permease
SAG0375      -1      14       0.691092   hypothetical protein
SAG0376      2       14       2.406184   tRNA (guanine-N(7)-)-methyltransferase
SAG0377      3       16       2.362622   *hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SAG0367   SACTRNSFRASE  60     9e-14    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 60.0 bits (145), Expect = 9e-14
Identities = 35/94 (37%), Positives = 43/94 (45%), Gaps = 8/94 (8%)

Query: 74 ICLIAKLKNKVIGLITIISQSD--IEIEHVGDLFIAVQKDYWGYGIGHILMEEAIEWASD 131
+ L+N IG I I S + IE IAV KDY G+G L+ +AIEWA +
Sbjct: 66 AAFLYYLENNCIGRIKIRSNWNGYALIED-----IAVAKDYRKKGVGTALLHKAIEWAKE 120

Query: 132 NDITRRLELSVQGRNERAIHLYQKFGFEIDGLQT 165
N L L Q N A H Y K F I + T
Sbjct: 121 NHFC-GLMLETQDINISACHFYAKHHFIIGAVDT 153


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SAG0368   PF00577       30     0.024    Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 29.8 bits (67), Expect = 0.024
Identities = 15/102 (14%), Positives = 25/102 (24%), Gaps = 13/102 (12%)

Query: 337 YGTTASNDSSTYSSTQENNYNTTPYSEAPP----SYS---GNTTYSSETNQTTHQNYYNS 389
+ +++ S ++ Y SYS G + +T N
Sbjct: 614 WRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATLNY 673

Query: 390 STPASN------YSSNTNTGQADSSGSVNNHNGAATPNPNTG 425
N +S + SG V H T
Sbjct: 674 RGGYGNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQPLN 715


4     SAG0428  SAG0445  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
SAG0428      3       20       1.636992   zinc-containing alcohol dehydrogenase
SAG0429      3       22       1.022481   aldo/keto reductase
SAG0430      4       25       1.201415   cation efflux system protein
SAG0431      5       25       1.191185   TetR family transcriptional regulator
SAG0432      4       25       1.524355   AraC family transcriptional regulator
SAG0433      4       31       2.927524   surface protein Rib
SAG0435      1       18       -1.418185  DNA-damage-inducible protein J
SAG0436      2       17       0.955964   hypothetical protein
SAG0437      3       16       0.959420   lipoprotein
SAG0440      0       20       4.339942   *hypothetical protein
SAG0441      0       18       4.502949   hypothetical protein
SAG0442      0       18       3.714155   acetyltransferase
SAG0443      0       19       3.611103   acetyltransferase
SAG0444      0       17       2.781811   hypothetical protein
SAG0445      1       17       3.190991   valyl-tRNA synthetase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SAG0431   HTHTETR       51     2e-10    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 51.2 bits (122), Expect = 2e-10
Identities = 14/64 (21%), Positives = 30/64 (46%)

Query: 5 RQIQKTKVAIYNAFISLLQENDYSKITVQDVIGLANVGRSTFYSHYESKEVLLKELCEDL 64
++ Q+T+ I + + L + S ++ ++ A V R Y H++ K L E+ E
Sbjct: 7 QEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELS 66

Query: 65 FHHL 68
++
Sbjct: 67 ESNI 70


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SAG0433   GPOSANCHOR    67     3e-13    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 67.0 bits (163), Expect = 3e-13
Identities = 27/98 (27%), Positives = 42/98 (42%), Gaps = 19/98 (19%)

Query: 1303 ATPGDKPAKVVVTYPDGSKDTVDVTVKVVDPRTDADKNDPAGKDQQVNGKGN-------- 1354
A ++ AK+ S+ D + G+ Q K N
Sbjct: 449 AKQAEELAKLRAGKASDSQTP--------DAKPGNKAVPGKGQAPQAGTKPNQNKAPMKE 500

Query: 1355 ---KLPATGENATPFFNVVALTIMSSVGLLSVSKKKED 1389
+LP+TGE A PFF ALT+M++ G+ +V K+KE+
Sbjct: 501 TKRQLPSTGETANPFFTAAALTVMATAGVAAVVKRKEE 538



Score = 63.5 bits (154), Expect = 4e-12
Identities = 12/56 (21%), Positives = 24/56 (42%)

Query: 14 QTKQRFSIKKFKFGAASVLIGISFLGGFTQGQFNISTDTVFAAEVISGSAVTLNTN 69
T + +S++K K G ASV + ++ LG N + ++ + V +
Sbjct: 5 NTNRHYSLRKLKTGTASVAVALTVLGAGLVVNTNEVSAVATRSQTDTLEKVQERAD 60


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SAG0437   ADHESNFAMILY  27     0.026    Adhesin family signature.
		>ADHESNFAMILY#Adhesin family signature.

Length = 309

Score = 26.7 bits (59), Expect = 0.026
Identities = 10/23 (43%), Positives = 16/23 (69%)

Query: 3 KKLLGLMILAISTVFLVACSTNS 25
KKL L++L +S + LVAC++
Sbjct: 2 KKLGTLLVLFLSAIILVACASGK 24


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SAG0440   FLGMOTORFLIG  28     0.003    Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 28.2 bits (63), Expect = 0.003
Identities = 11/41 (26%), Positives = 22/41 (53%)

Query: 6 EFLKDFEEWLQSQISINQMAMDSAKKVLEEDKDERAADAYI 46
L +F+E + +Q I + +D A+++LE+ + A I
Sbjct: 65 NVLLEFKELMMAQEFIQKGGIDYARELLEKSLGTQKAVDII 105


5     SAG0535  SAG0624  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
SAG0535      2       18       0.235572   zinc ABC transporter substrate-binding protein
SAG0536      0       11       0.902410   50S ribosomal protein L31
SAG0537      1       12       1.079411   DHH family protein
SAG0538      1       13       1.019190   adenosine deaminase
SAG0539      0       13       1.488506   flavodoxin
SAG0540      2       15       0.541900   hypothetical protein
SAG0541      3       15       0.912453   voltage-gated chloride channel family protein
SAG0542      6       16       -0.058089  IS1381 transposase protein A
SAG0543      6       19       -0.532760  IS1381 transposase protein B
SAG0544      7       20       -0.711911  50S ribosomal protein L19
SAG0545      4       22       -0.662920  *prophage LambdaSa1, site-specific recombinase
SAG0546      2       23       0.245943   hypothetical protein
SAG0547      3       25       -0.600878  hypothetical protein
SAG0548      3       25       -2.335809  prophage LambdaSa1, repressor protein
SAG0549      4       22       -3.189606  hypothetical protein
SAG0550      2       21       -2.249676  hypothetical protein
SAG0551      3       21       -3.080389  hypothetical protein
SAG0552      4       20       -3.674551  hypothetical protein
SAG0553      3       18       -2.553120  hypothetical protein
SAG0554      1       23       -0.975590  prophage LambdaSa1, Cro/CI family
SAG0555      0       25       -0.882915  prophage LambdaSa1, antirepressor
SAG0556      3       27       -1.143885  hypothetical protein
SAG0557      1       29       -1.233935  hypothetical protein
SAG0558      2       29       -0.912969  hypothetical protein
SAG0559      5       26       -1.974342  hypothetical protein
SAG0560      6       30       0.045825   hypothetical protein
SAG0561      6       24       0.464592   hypothetical protein
SAG0562      0       25       0.084565   hypothetical protein
SAG0563      1       24       0.189811   hypothetical protein
SAG0564      0       23       0.240800   hypothetical protein
SAG0565      0       24       0.560523   hypothetical protein
SAG0566      2       26       0.437233   prophage LambdaSa1, single-strand binding
SAG0567      4       30       -0.029440  prophage LambdaSa1, reverse
SAG0568      6       30       -0.118408  hypothetical protein
SAG0569      7       29       -0.008965  hypothetical protein
SAG0570      8       32       0.273536   hypothetical protein
SAG_RS02245  7       32       -0.439455  hypothetical protein
SAG0571      8       28       -1.688209  hypothetical protein
SAG0572      7       29       -0.346312  hypothetical protein
SAG0573      5       26       -1.900360  hypothetical protein
SAG0574      6       25       -1.705594  hypothetical protein
SAG0575      5       20       -2.625940  hypothetical protein
SAG0576      0       18       0.957987   hypothetical protein
SAG0577      0       16       1.201049   hypothetical protein
SAG0578      -1      16       1.874295   hypothetical protein
SAG0579      -1      16       2.647934   hypothetical protein
SAG0581      -1      15       2.852869   hypothetical protein
SAG0582      -2      15       3.323953   hypothetical protein
SAG0583      0       18       2.907370   hypothetical protein
SAG0585      0       18       3.508363   hypothetical protein
SAG0586      1       19       3.565175   hypothetical protein
SAG0587      0       21       2.960719   prophage LambdaSa1, structural protein
SAG0588      0       21       2.422959   hypothetical protein
SAG0589      2       21       2.071753   hypothetical protein
SAG0590      2       20       1.251690   hypothetical protein
SAG0591      3       21       1.901794   hypothetical protein
SAG0592      2       21       2.032837   hypothetical protein
SAG0593      2       21       2.947223   prophage LambdaSa1, structural protein
SAG0594      1       21       2.576089   hypothetical protein
SAG0595      1       21       2.475283   hypothetical protein
SAG0596      2       21       2.370139   prophage LambdaSa1, pblA protein, internal
SAG0597      2       21       2.300715   prophage LambdaSa1, minor structural protein
SAG0598      3       20       1.966231   prophage LambdaSa1, N-acetylmuramoyl-L-alanine
SAG0599      4       19       0.065031   prophage LambdaSa1, minor structural protein
SAG0600      -1      15       -2.549432  hypothetical protein
SAG0601      -1      15       -2.395537  hypothetical protein
SAG0602      -1      16       -2.390442  hypothetical protein
SAG0603      0       16       -2.607127  hypothetical protein
SAG0604      -1      16       -1.553153  prophage LambdaSa1, lysin
SAG0605      -1      17       -2.771230  hypothetical protein
SAG0606      3       24       -6.269505  hypothetical protein
SAG0607      4       25       -6.719200  hypothetical protein
SAG0608      4       25       -6.665263  hypothetical protein
SAG0610      5       27       -7.378160  hypothetical protein
SAG0612      6       29       -9.826930  hypothetical protein
SAG0613      6       29       -9.739990  transmembrane protein Vexp1
SAG0614      5       28       -8.566971  ABC transporter ATP-binding protein
SAG0615      3       22       -7.170545  transmembrane protein Vexp3
SAG0616      2       20       -6.669182  DNA-binding response regulator VncR
SAG0617      -1      13       -3.349557  sensor histidine kinase VncS
SAG0619      0       10       -0.341229  hypothetical protein
SAG0620      0       10       -0.110276  hypothetical protein
SAG0621      -1      8        -0.137068  rod shape-determining protein RodA
SAG0622      0       8        -0.027201  HAD superfamily hydrolase
SAG0623      2       16       1.233976   DNA gyrase subunit B
SAG0624      2       19       0.866879   septation ring formation regulator EzrA
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SAG0535   ADHESNFAMILY  220    3e-70    Adhesin family signature.
		>ADHESNFAMILY#Adhesin family signature.

Length = 309

Score = 220 bits (563), Expect = 3e-70
Identities = 79/313 (25%), Positives = 149/313 (47%), Gaps = 12/313 (3%)

Query: 1 MRKKFLLLMSFVAMFA-AWQLVQVKQVWADSKLKVVTTFYPVYEFTKNVVGDKADVSMLI 59
M+K LL+ F++ K + KLKVV T + + TKN+ GDK D+ ++
Sbjct: 1 MKKLGTLLVLFLSAIILVACASGKKDTTSGQKLKVVATNSIIADITKNIAGDKIDLHSIV 60

Query: 60 KAGTEPHDFEPSTKNIAAIQDSNAFVYMDDNMETWAPKVA-KSVKSKKVTTIKGTGDMLL 118
G +PH++EP +++ +++ Y N+ET K V++ K T K +
Sbjct: 61 PIGQDPHEYEPLPEDVKKTSEADLIFYNGINLETGGNAWFTKLVENAKKTENKDY--FAV 118

Query: 119 TKGVEEEGEEHEGHGHEGHHHELDPHVWLSPERAISVVENIRNKFVKAYPKDAASFNKNA 178
+ GV+ EG +G DPH WL+ E I +NI + P + + KN
Sbjct: 119 SDGVDVI--YLEGQNEKGKE---DPHAWLNLENGIIFAKNIAKQLSAKDPNNKEFYEKNL 173

Query: 179 DAYIAKLKELDKEYKNGLSN--AKQKSFVTQHAAFGYMALDYGLNQVPIAGLTPDAEPSS 236
Y KL +LDKE K+ + A++K VT AF Y + YG+ I + + E +
Sbjct: 174 KEYTDKLDKLDKESKDKFNKIPAEKKLIVTSEGAFKYFSKAYGVPSAYIWEINTEEEGTP 233

Query: 237 KRLGELAKYIKKYNINYIYFEENASNKVAKTLADEVGVKTAVLSPLEGLSKKEMAAGEDY 296
+++ L + +++ + ++ E + ++ KT++ + + + ++++ G+ Y
Sbjct: 234 EQIKTLVEKLRQTKVPSLFVESSVDDRPMKTVSQDTNIPIYAQIFTDSIAEQG-KEGDSY 292

Query: 297 FSVMRRNLKVLKK 309
+S+M+ NL + +
Sbjct: 293 YSMMKYNLDKIAE 305


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SAG0563   PF07212       25     0.016    Hyaluronoglucosaminidase
		>PF07212#Hyaluronoglucosaminidase

Length = 336

Score = 25.0 bits (54), Expect = 0.016
Identities = 9/17 (52%), Positives = 11/17 (64%)

Query: 29 ENIELKKQLKRLKAENW 45
E I L+ Q KR+ AE W
Sbjct: 3 ETIPLRVQFKRMTAEEW 19


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SAG0564   ANTHRAXTOXNA  27     0.039    Anthrax toxin LF subunit signature.
		>ANTHRAXTOXNA#Anthrax toxin LF subunit signature.

Length = 800

Score = 27.0 bits (59), Expect = 0.039
Identities = 23/89 (25%), Positives = 39/89 (43%), Gaps = 1/89 (1%)

Query: 71 QAEAKAEKYKETIRLAMELSQKKKVDAGMFKVSLRRSKKVEILDETKIPLDYMQEKIEYK 130
+ A E Y E+ ++K K + FK S+ K E +ET + Q+ ++
Sbjct: 30 EVNAMNEHYTESDIKRNHKTEKNKTEKEKFKDSINNLVKTEFTNETLDKIQQTQDLLKKI 89

Query: 131 PMKA-EISKALKSGIDISGVELIETESLQ 158
P EI L I + ++L+E + LQ
Sbjct: 90 PKDVLEIYSELGGEIYFTDIDLVEHKELQ 118


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SAG0567   ACRIFLAVINRP  29     0.042    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 29.0 bits (65), Expect = 0.042
Identities = 17/105 (16%), Positives = 36/105 (34%), Gaps = 16/105 (15%)

Query: 7 INIFEKVQVFQRKIYLSTKADNKRKFGVLYDKVYRKDILKVAWFYVKRNKGSAGIDDFTI 66
+++ + L + + GV + + A G ++DF
Sbjct: 713 VSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDI--NQTISTAL-------GGTYVNDFID 763

Query: 67 EEIEAYGVQKFLDEIEDQLRNKKYQPKAVKRVYIPKANGKKRPLG 111
V+K + + + R P+ V ++Y+ ANG+ P
Sbjct: 764 RGR----VKKLYVQADAKFRM---LPEDVDKLYVRSANGEMVPFS 801


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SAG0598   FLGFLGJ       33     0.006    Flagellar protein FlgJ signature.
		>FLGFLGJ#Flagellar protein FlgJ signature.

Length = 313

Score = 33.1 bits (75), Expect = 0.006
Identities = 34/143 (23%), Positives = 57/143 (39%), Gaps = 11/143 (7%)

Query: 299 IITQLYLESFWGDSTV----GKRDNNWAGMSGGAQTRPSGVKVTT---GMARPANEGGTY 351
I+ Q LES WG + G+ N G+ + ++TT +
Sbjct: 174 ILAQAALESGWGQRQIRRENGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKKVKAKF 233

Query: 352 MHYASVDDFLKDYTYLLAKQGIYNVVGKKNIAD-YTKGLFRAGGAKYDYAAAGYQSYTNL 410
Y+S + L DY LL + Y V A+ + L AG A + A + TN+
Sbjct: 234 RVYSSYLEALSDYVGLLTRNPRYAAVTTAASAEQGAQALQDAGYATDPHYA---RKLTNM 290

Query: 411 MTNIRNGINKVTGNILNTIDKLW 433
+ +++ +KV+ ID L+
Sbjct: 291 IQQMKSISDKVSKTYSMNIDNLF 313


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SAG0614   PF05272       32     0.002    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 32.0 bits (72), Expect = 0.002
Identities = 11/20 (55%), Positives = 14/20 (70%)

Query: 36 IVGKSGTGKSTLLSLLAGLD 55
+ G G GKSTL++ L GLD
Sbjct: 601 LEGTGGIGKSTLINTLVGLD 620


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SAG0616   HTHFIS        75     5e-18    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 75.3 bits (185), Expect = 5e-18
Identities = 30/105 (28%), Positives = 51/105 (48%), Gaps = 2/105 (1%)

Query: 2 KILTVEDDKLIREGISEYLSEFGYTVIQAKDGREALSKFNSD-INLVILDIQIPFINGLE 60
IL +DD IR +++ LS GY V + + +LV+ D+ +P N +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 61 VLKEIRK-KSNLPILILTAFSDEEYKIDAFTNLVDGYVEKPFSLP 104
+L I+K + +LP+L+++A + I A Y+ KPF L
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLT 109


6     SAG0892  SAG0900  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
SAG0892      2       12       -2.250688  HAD superfamily hydrolase
SAG0893      3       14       -2.229579  hypothetical protein
SAG0894      3       15       -1.629997  hypothetical protein
SAG0895      1       24       -1.097389  lipoyl-binding domain-containing protein
SAG0896      4       33       -0.089160  oxidoreductase
SAG0897      6       35       0.189760   hypothetical protein
SAG0898      12      52       3.337200   hypothetical protein
SAG0899      12      47       2.562187   hypothetical protein
SAG0900      5       33       3.279419   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SAG0896   PF04605       30     0.001    Virulence-associated protein D (VapD)
		>PF04605#Virulence-associated protein D (VapD)

Length = 125

Score = 29.8 bits (67), Expect = 0.001
Identities = 15/44 (34%), Positives = 24/44 (54%), Gaps = 5/44 (11%)

Query: 2 RMILMFDMPTETAEE-----RKAYRKFRKFLLSEGFIMHQFSVY 40
R + FD+ T++ E+ R+ Y +KF+L GF Q+S Y
Sbjct: 5 RKAINFDLSTKSLEKYFKDTREPYSLIKKFMLENGFEHRQYSGY 48


7     SAG0915  SAG0949  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
SAG0915      2       26       1.601360   Tn916, transposase
SAG0916      1       31       1.624420   Tn916, excisionase
SAG0917      0       31       1.610565   Tn916 hypothetical protein
SAG0918      1       21       0.273196   Tn916 hypothetical protein
SAG0919      1       21       0.089397   Tn916 hypothetical protein
SAG0920      3       23       0.091910   Tn916 hypothetical protein
SAG0921      2       23       1.671954   Tn916, transcriptional regulator
SAG0922      0       25       3.128164   Tn916 hypothetical protein
SAG0923      0       24       3.221969   tetracycline resistance protein
SAG0924      0       28       4.519630   tetracycline resistance determinant leader
SAG0925      0       28       4.523785   Tn916 hypothetical protein
SAG0926      0       29       5.168987   Tn916, NLP/P60 family protein
SAG0927      -1      28       4.630271   hypothetical protein
SAG0929      -1      28       4.171506   Tn916 hypothetical protein
SAG0930      0       29       4.351831   Tn916 hypothetical protein
SAG0931      1       30       4.124779   Tn916 hypothetical protein
SAG0932      1       27       3.805804   Tn916, transcriptional regulator
SAG0933      -2      15       0.567689   DNA translocase FtsK
SAG0934      -2      10       1.074301   Tn916 hypothetical protein
SAG0935      -1      12       1.192678   Tn916 hypothetical protein
SAG0936      -2      11       0.496121   Tn916 hypothetical protein
SAG0938      -2      12       0.230936   GntR family transcriptional regulator
SAG0939      -1      11       0.528926   DNA polymerase III DnaE
SAG0940      2       17       1.858106   6-phosphofructokinase
SAG0941      1       14       0.577105   pyruvate kinase
SAG0942      -2      13       -0.495415  signal peptidase I
SAG0943      -2      12       -0.039700  hypothetical protein
SAG0944      -3      13       -0.605234  glucosamine--fructose-6-phosphate
SAG0945      -2      12       -1.355345  IS1548 transposase
SAG0946      1       15       -3.615267  phnA protein
SAG0947      1       14       -3.710675  amino acid ABC transporter permease
SAG0948      0       14       -3.557434  amino acid ABC transporter ATP-binding protein
SAG0949      2       14       -2.449569  amino acid ABC transporter amino acid-binding
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SAG0923   TCRTETOQM     1117   0.0      Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 1117 bits (2891), Expect = 0.0
Identities = 622/639 (97%), Positives = 631/639 (98%)

Query: 1 MKIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGI 60
MKIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGI
Sbjct: 1 MKIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGI 60

Query: 61 TSFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMG 120
TSFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMG
Sbjct: 61 TSFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMG 120

Query: 121 IPTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIE 180
IPTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIE
Sbjct: 121 IPTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIE 180

Query: 181 GNDDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVITNKFYS 240
GNDDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVITNKFYS
Sbjct: 181 GNDDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNNIGIDNLIEVITNKFYS 240

Query: 241 STHRGPSELCGNVFKIEYTKKRQRLAYIRLYSGVLHLRDSVRVSEKEKIKVTEMYTSING 300
STHRG SELCG VFKIEY++KRQRLAYIRLYSGVLHLRDSVR+SEKEKIK+TEMYTSING
Sbjct: 241 STHRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEKEKIKITEMYTSING 300

Query: 301 ELCKIDRAYSGEIVILQNEFLKLNSVLGDTKLLPQRKKIENPHPLLQTTVEPSKPEQREM 360
ELCKID+AYSGEIVILQNEFLKLNSVLGDTKLLPQR++IENP PLLQTTVEPSKP+QREM
Sbjct: 301 ELCKIDKAYSGEIVILQNEFLKLNSVLGDTKLLPQRERIENPLPLLQTTVEPSKPQQREM 360

Query: 361 LLDALLEISDSDPLLRYYVDSTTHEIILSFLGKVQMEVISALLQEKYHVEIELKEPTVIY 420
LLDALLEISDSDPLLRYYVDS THEIILSFLGKVQMEV ALLQEKYHVEIE+KEPTVIY
Sbjct: 361 LLDALLEISDSDPLLRYYVDSATHEIILSFLGKVQMEVTCALLQEKYHVEIEIKEPTVIY 420

Query: 421 MERPLKNAEYTIHIEVPPNPFWASIGLSVSPLPLGSGMQYESSVSLGYLNQSFQNAVMEG 480
MERPLK AEYTIHIEVPPNPFWASIGLSVSPLPLGSGMQYESSVSLGYLNQSFQNAVMEG
Sbjct: 421 MERPLKKAEYTIHIEVPPNPFWASIGLSVSPLPLGSGMQYESSVSLGYLNQSFQNAVMEG 480

Query: 481 IRYGCEQGLYGWNVTDCKICFKYGLYYSPVSTPADFRMLAPIVLEQVLKKAGTELLEPYL 540
IRYGCEQGLYGWNVTDCKICFKYGLYYSPVSTPADFRMLAPIVLEQVLKKAGTELLEPYL
Sbjct: 481 IRYGCEQGLYGWNVTDCKICFKYGLYYSPVSTPADFRMLAPIVLEQVLKKAGTELLEPYL 540

Query: 541 SFKIYAPQEYLSRAYNDAPKYCANIVDTQLKNNEVILSGEIPARCIQEYRSDLTFFTNGR 600
SFKIYAPQEYLSRAY DAPKYCANIVDTQLKNNEVILSGEIPARCIQEYRSDLTFFTNGR
Sbjct: 541 SFKIYAPQEYLSRAYTDAPKYCANIVDTQLKNNEVILSGEIPARCIQEYRSDLTFFTNGR 600

Query: 601 SVCLTELKGYHVTTGEPVCQPRRPNSRIDKVRYMFNKIT 639
SVCLTELKGYHVTTGEPVCQPRRPNSRIDKVRYMFNKIT
Sbjct: 601 SVCLTELKGYHVTTGEPVCQPRRPNSRIDKVRYMFNKIT 639


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SAG0927   IGASERPTASE   39     7e-05    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 39.3 bits (91), Expect = 7e-05
Identities = 34/197 (17%), Positives = 72/197 (36%), Gaps = 18/197 (9%)

Query: 526 DTKDRMVDTASGLKEQVKDLPTNARYA-VYQGKSKVKENVRDLTSSISQTKADRASG--R 582
D + KE ++ N + V Q S+ KE T + + + +
Sbjct: 1057 DATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVET 1116

Query: 583 KEQQEQRRKT--IAKRRSEMEQVKQKKQPASSVHERPTTRQEQYHDEQTSKQSNIQTSYK 640
++ QE + T ++ ++ + E V+ + +PA PT ++ QT+ ++
Sbjct: 1117 EKTQEVPKVTSQVSPKQEQSETVQPQAEPARE--NDPTVNIKEP-QSQTNTTAD------ 1167

Query: 641 ESQQAKQERPAVKSDFSSPKVERQGNTVQEKTVQKPATSTTTADRTSQRPITKERPSTVQ 700
Q AK+ V+ + GN+V E P +T + + + +P
Sbjct: 1168 TEQPAKETSSNVEQPVTESTTVNTGNSVVE----NPENTTPATTQPTVNSESSNKPKNRH 1223

Query: 701 RVPLQNTRSRPPIKTAT 717
R +++ T +
Sbjct: 1224 RRSVRSVPHNVEPATTS 1240


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SAG0940   FbpA_PF05833  29     0.024    Fibronectin-binding protein
		>FbpA_PF05833#Fibronectin-binding protein

Length = 577

Score = 29.5 bits (66), Expect = 0.024
Identities = 10/37 (27%), Positives = 17/37 (45%), Gaps = 3/37 (8%)

Query: 137 IGFDTAVATAVENLDRLRDTSASHNRTFVVEVMGRNA 173
I D V E+ D L + ++E+MGR++
Sbjct: 94 INQDRIVVIDFESTDELGFN---SIYSLIIEIMGRHS 127


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SAG0948   PF05272       35     2e-04    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 35.0 bits (80), Expect = 2e-04
Identities = 14/43 (32%), Positives = 23/43 (53%), Gaps = 2/43 (4%)

Query: 29 ILSLVGPSGGGKTTLLRMLAGLE-KIDSGTIVHDGKEVSVDHL 70
+ L G G GK+TL+ L GL+ D+ + GK+ S + +
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGTGKD-SYEQI 639


8     SAG1015  SAG1036  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
SAG1015      4       13       -2.170919  carbon starvation protein CstA
SAG1016      5       18       -2.991790  response regulator
SAG1017      5       20       -3.659576  sensor histidine kinase
SAG1018      6       28       -6.450114  lipoprotein
SAG1019      6       27       -6.831834  hypothetical protein
SAG1020      4       26       -6.848266  lipoprotein
SAG1021      3       26       -5.818012  hypothetical protein
SAG1022      1       26       -8.733686  hypothetical protein
SAG1023      1       31       -8.503501  hypothetical protein
SAG1024      -1      26       -6.307635  lipoprotein
SAG1025      0       25       -5.917850  hypothetical protein
SAG1027      0       24       -4.992751  hypothetical protein
SAG1028      -3      12       0.561209   hypothetical protein
SAG1029      -3      14       1.923059   hypothetical protein
SAG1030      -2      12       0.996380   hypothetical protein
SAG1031      -2      13       1.204765   hypothetical protein
SAG1032      -2      13       0.947042   hypothetical protein
SAG1033      -2      10       -0.405234  DNA translocase FtsK
SAG1034      1       9        -3.455604  hypothetical protein
SAG1035      0       9        -3.381517  hypothetical protein
SAG1036      0       9        -3.620617  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SAG1016   HTHFIS        66     1e-14    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 66.0 bits (161), Expect = 1e-14
Identities = 35/140 (25%), Positives = 60/140 (42%), Gaps = 9/140 (6%)

Query: 2 KVLVVDDEPVARNELIYLLNKYDSNLVIAEAHDMATALAILLRETFDVALLDIHLRDDSG 61
+LV DD+ R L L++ ++ I + AT + D+ + D+ + D++
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITS--NAATLWRWIAAGDGDLVVTDVVMPDENA 62

Query: 62 LQLAEYINKM-PKPPLLIFATAYDQY--AIQAFEHDARDYLLKPYDFDRLKQAMDRVKGA 118
L I K P P+L+ +A + + AI+A E A DYL KP+D L + + + A
Sbjct: 63 FDLLPRIKKARPDLPVLVM-SAQNTFMTAIKASEKGAYDYLPKPFD---LTELIGIIGRA 118

Query: 119 LSTSTIIESVTSGPLFKQQY 138
L+ S
Sbjct: 119 LAEPKRRPSKLEDDSQDGMP 138


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SAG1017   PF06580       205    5e-63    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 205 bits (522), Expect = 5e-63
Identities = 59/211 (27%), Positives = 110/211 (52%), Gaps = 12/211 (5%)

Query: 366 QAEEATRLLQDAEMKSLQAQVNPHFLFNALNTIYGLIRMDSEKARKLVQDFSKVIRANLQ 425
+ + Q+A++ +L+AQ+NPHF+FNALN I LI D KAR+++ S+++R +L+
Sbjct: 150 DQWKMASMAQEAQLMALKAQINPHFMFNALNNIRALILEDPTKAREMLTSLSELMRYSLR 209

Query: 426 RAKQNLIPLHDELEQVNAYLALEEARFPNMVAFNLDNQTNSDDNLMIPPFTLQVLIENSY 485
+ + L DEL V++YL L +F + + F ++ +PP +Q L+EN
Sbjct: 210 YSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQINPAI-MDVQVPPMLVQTLVENGI 268

Query: 486 KHAFKHVNKNNQLKVTIARNNDRLHIIVQDNGIGIPKEKLITLGKKTQISKQGSGTAIEN 545
KH + + ++ + ++N + + V++ G K +K+ +GT ++N
Sbjct: 269 KHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKN-----------TKESTGTGLQN 317

Query: 546 LVRRLNIIYDGQASLKFESNDSGTCAIVNIP 576
+ RL ++Y +A +K A+V IP
Sbjct: 318 VRERLQMLYGTEAQIKLSEKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SAG1021   RTXTOXINA     27     0.021    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 26.9 bits (59), Expect = 0.021
Identities = 15/31 (48%), Positives = 20/31 (64%), Gaps = 3/31 (9%)

Query: 71 AIDSGVDALSGAAIGTLVGGPVGTVVGAVQG 101
++ SG+ A AA +LVG PV +VGAV G
Sbjct: 377 SVSSGISA---AATTSLVGAPVSALVGAVTG 404


9     SAG1132  SAG1144  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag     DNBias  CDNBias  %GCBias    Product
SAG1132      2       12       -0.729358  hypothetical protein
SAG1133      3       14       -0.443613  hypothetical protein
SAG1134      5       17       -0.806002  GntR family transcriptional regulator
SAG1135      6       20       -0.481060  gls24 protein
SAG1136      4       20       -0.871554  hypothetical protein
SAG1137      -1      13       -0.001343  gls24 protein
SAG1138      -2      13       -0.069636  hypothetical protein
SAG1139      -1      13       -0.244227  hypothetical protein
SAG1140      -1      10       0.346770   hypothetical protein
SAG1141      -1      9        -0.207137  hypothetical protein
SAG1142      0       9        -0.312730  ATP-dependent DNA helicase PcrA
SAG1143      3       11       -0.513564  hypothetical protein
SAG1144      3       12       -0.766807  uracil permease
ORFs having significant similarity with Known Virulence factors
LocusTagHitsScoreE-valueComments
SAG1134PF06580280.030 Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 27.9 bits (62), Expect = 0.030
Identities = 16/88 (18%), Positives = 37/88 (42%), Gaps = 9/88 (10%)

Query: 40 TIASTFNVSPETARKGLNILADLQILTLKHGSGAII-LSKEKAIEFLNQYETSHSVAILK 98
I + P AR+ L L++L +L++ + + L+ E + ++ Y + +
Sbjct: 181 NIRALILEDPTKAREMLTSLSELMRYSLRYSNARQVSLADE--LTVVDSY-----LQLAS 233

Query: 99 GKIRDNIKAQQQEMEELA-TLVDDFLLQ 125
+ D ++ + Q + V L+Q
Sbjct: 234 IQFEDRLQFENQINPAIMDVQVPPMLVQ 261


10  SAG1159  SAG1173  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SAG1159   0       15       -3.619901  neuD protein
SAG1160   2       17       -4.341529  UDP-N-acetylglucosamine-2-epimerase NeuC
SAG1161   2       19       -5.600294  N-acetyl neuramic acid synthetase NeuB
SAG1162   5       25       -6.708546  polysaccharide biosynthesis protein CpsL
SAG1163   5       28       -8.100899  polysaccharide biosynthesis protein CpsK(V)
SAG1164   5       28       -8.023347  glycosyl transferase CpsJ(V)
SAG1165   6       26       -7.059975  glycosyl transferase CpsO(V)
SAG1166   4       21       -6.790866  glycosyl transferase CpsN(V)
SAG1167   2       20       -6.069131  polysaccharide biosynthesis protein CpsM(V)
SAG1168   1       16       -5.738586  polysaccharide biosynthesis protein cpsH(V)
SAG1169   -1      15       -4.387823  glycosyl transferase CpsG(V)
SAG1170   -2      13       -4.552674  polysaccharide biosynthesis protein CpsF
SAG1171   -2      10       -4.231546  glycosyl transferase CpsE
SAG1172   -3      11       -3.605646  cpsD protein
SAG1173   -2      11       -3.143654  cpsC protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SAG1167  PF05704  49  4e-09  Capsular polysaccharide synthesis protein
		>PF05704#Capsular polysaccharide synthesis protein

Length = 307

Score = 49.1 bits (117), Expect = 4e-09
Identities = 24/97 (24%), Positives = 45/97 (46%), Gaps = 5/97 (5%)

Query: 1 MIPKVIHYCWFGG-NPLPDNLKKYIKTWREQCPDYEIIEWNEHNY----DVSKNVFMREA 55
M K I CW G P +++ + + ++ D+++I + +NY D+ + R
Sbjct: 66 MRQKYIFICWLQGIEKAPYIVQQCVASVKKNSGDFKVIIIDGNNYKEWVDIPDFLIKRWQ 125

Query: 56 YTKKNFAYVSDYARLDIIYTYGGFYLDTDVELLKSLD 92
K A+ SD RL ++ YGG ++D V + +
Sbjct: 126 EGKMLDAWFSDILRLFLLCKYGGLWIDATVYMFDKVP 162


11  SAG1228  SAG1248  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SAG1228   -2      21       4.805616   ISSdy1, transposase OrfA
SAG1229   -2      17       2.959376   ISSdy1, transposase OrfB
SAG1230   -2      14       1.941233   hypothetical protein
SAG1233   -2      14       1.660636   streptococcal histidine triad family protein
SAG1234   0       15       0.308852   laminin-binding surface protein
SAG1235   1       17       -0.409973  GBSi1, group II intron, maturase
SAG1237   -1      16       -2.240725  hypothetical protein
SAG1238   -1      18       -0.640072  hypothetical protein
SAG1239   -1      17       -3.805430  hypothetical protein
SAG1241   -1      14       -3.531961  IS3 family transposase OrfA
SAG1243   -1      15       -3.621451  ISSdy1, transposase OrfA
SAG1244   0       15       -4.013728  ISSdy1, transposase OrfB
SAG1245   2       16       -5.490780  hypothetical protein
SAG1246   2       16       -5.176422  hypothetical protein
SAG1247   2       19       -3.264837  phage integrase family site specific
SAG1248   2       20       -1.208254  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SAG1233  PF05616  34  0.002  Neisseria meningitidis TspB protein
		>PF05616#Neisseria meningitidis TspB protein

Length = 501

Score = 34.3 bits (78), Expect = 0.002
Identities = 24/87 (27%), Positives = 35/87 (40%), Gaps = 2/87 (2%)

Query: 226 IPKKDLSPSELAAAQAYWSQKQGRGARPSDY-RPTPAPGRRKAPIPDVTPNPGQGHQPD- 283
IP+ DL+P A A + P++ P PG R P PD NP D
Sbjct: 310 IPRPDLTPGSAEAPNAQPLPEVSPAENPANNPAPNENPGTRPNPEPDPDLNPDANPDTDG 369

Query: 284 NGGYHPAPPRPNDASQNKHQRDEFKGK 310
G P P D +H+++ +G+
Sbjct: 370 QPGTRPDSPAVPDRPNGRHRKERKEGE 396


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SAG1234  ADHESNFAMILY  246  7e-83  Adhesin family signature.
		>ADHESNFAMILY#Adhesin family signature.

Length = 309

Score = 246 bits (630), Expect = 7e-83
Identities = 82/323 (25%), Positives = 144/323 (44%), Gaps = 34/323 (10%)

Query: 1 MKKVFFLMAMVVSLVMIAGCDKSANPKQPTQGMSVVTSFYPMYAMTKEVSGDLNDVR-MI 59
MKK+ L+ + +S +++ C Q + VV + + +TK ++GD D+ ++
Sbjct: 1 MKKLGTLLVLFLSAIILVACASGKKDTTSGQKLKVVATNSIIADITKNIAGDKIDLHSIV 60

Query: 60 QSGAGIHSFEPSVNDVAAIYDADLFVYHSHTLE----AWARDLDPNLKKSKVNVFEASKP 115
G H +EP DV +ADL Y+ LE AW L N KK++ + A
Sbjct: 61 PIGQDPHEYEPLPEDVKKTSEADLIFYNGINLETGGNAWFTKLVENAKKTENKDYFA--- 117

Query: 116 LTLDRVKGLEDMEVTQGIDPATLY--------DPHTWTDPVLAGEEAVNIAKELGHLDPK 167
V+ G+D L DPH W + A NIAK+L DP
Sbjct: 118 -------------VSDGVDVIYLEGQNEKGKEDPHAWLNLENGIIFAKNIAKQLSAKDPN 164

Query: 168 HKDSYTKKAKAFKKEAEQLTEEYTQKFKKVR--SKTFVTQHTAFSYLAKRFGLKQLGISG 225
+K+ Y K K + + ++L +E KF K+ K VT AF Y +K +G+ I
Sbjct: 165 NKEFYEKNLKEYTDKLDKLDKESKDKFNKIPAEKKLIVTSEGAFKYFSKAYGVPSAYIWE 224

Query: 226 ISPEQEPSPRQLKEIQDFVKEYNVKTIFAEDNVNPKIAHAIAKSTGAKVKT---LSPLEA 282
I+ E+E +P Q+K + + +++ V ++F E +V+ + +++ T + +
Sbjct: 225 INTEEEGTPEQIKTLVEKLRQTKVPSLFVESSVDDRPMKTVSQDTNIPIYAQIFTDSIAE 284

Query: 283 APSGNKTYLENLRANLEVLYQQL 305
+Y ++ NL+ + + L
Sbjct: 285 QGKEGDSYYSMMKYNLDKIAEGL 307


12  SAG1271  SAG1305  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SAG1271   2       21       0.484559   hypothetical protein
SAG1272   3       23       1.307439   hypothetical protein
SAG1273   4       24       1.438852   hypothetical protein
SAG1274   5       25       1.956220   hypothetical protein
SAG1275   3       27       4.320988   hypothetical protein
SAG1276   3       27       4.197010   hypothetical protein
SAG1277   4       29       4.523474   hypothetical protein
SAG1278   3       26       4.928249   hypothetical protein
SAG1279   3       25       4.786328   hypothetical protein
SAG1280   2       24       4.275587   SNF2 family protein
SAG1281   3       23       4.129515   hypothetical protein
SAG1282   2       24       4.310391   calcium-binding protein
SAG1283   2       24       4.378291   agglutinin receptor
SAG1284   2       26       3.718952   abortive infection protein AbiGI
SAG1285   2       28       4.162664   abortive infection protein AbiGII
SAG1286   2       30       5.052091   Tn5252, Orf28
SAG1287   2       31       4.411912   Tn5252, Orf26
SAG1289   0       31       4.337526   Tn5252, Orf23
SAG1290   1       31       3.452567   hypothetical protein
SAG1292   3       32       4.955628   hypothetical protein
SAG1293   2       31       4.993845   protease
SAG1294   3       31       5.172917   hypothetical protein
SAG1295   4       32       5.441112   hypothetical protein
SAG1296   2       27       5.301286   hypothetical protein
SAG1297   2       23       5.217416   C-5 cytosine-specific DNA methylase
SAG1298   0       17       2.565647   hypothetical protein
SAG1299   1       17       2.610911   hypothetical protein
SAG1300   3       18       3.571301   hypothetical protein
SAG1301   1       18       4.669590   50S ribosomal protein L7/L12
SAG1302   0       17       4.157974   50S ribosomal protein L10
SAG1303   -2      17       3.960851   ATP-dependent Clp protease, ATP-binding subunit
SAG1304   1       21       4.826335   hypothetical protein
SAG1305   0       20       4.195749   homocysteine methyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SAG1283  TONBPROTEIN  35  0.002  Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 35.0 bits (80), Expect = 0.002
Identities = 8/29 (27%), Positives = 9/29 (31%)

Query: 1571 PQPEEPSPNQPTPPQPPIETIEPPVPASI 1599
QP +P P PI P I
Sbjct: 61 VQPPPEPVVEPEPEPEPIPEPPKEAPVVI 89


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SAG1303  HTHFIS  44  1e-06  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 44.1 bits (104), Expect = 1e-06
Identities = 51/253 (20%), Positives = 90/253 (35%), Gaps = 31/253 (12%)

Query: 419 VIGQNDAVEAVARAIRRNRAGFDDGNRPIGSFLFVGPTGVGKTELAKQLAFDMFGSKDAI 478
++G++ A++ + R + R + + + G +G GK +A+ L
Sbjct: 139 LVGRSAAMQEIYRVLAR----LMQTDLTL---MITGESGTGKELVARALHDYGKRRNGPF 191

Query: 479 VRLDMSEYNDRTAVSKLIGATAGYVGYDDNSNTLTERIRRNPYSIVLLDEIEKADPQVIT 538
V ++M+ S+L G G + T R + + LDEI T
Sbjct: 192 VAINMAAIPRDLIESELFGHEKG--AFTGAQTRSTGRFEQAEGGTLFLDEIGDMPMDAQT 249

Query: 539 LLLQVLDDGRLTDGQGNTINFKNTVIIATSNAGFGNEAFTGDSDKDLKIMERISPYFRPE 598
LL+VL G T G T + I+A +N KDLK FR +
Sbjct: 250 RLLRVLQQGEYTTVGGRTPIRSDVRIVAATN-------------KDLKQSIN-QGLFRED 295

Query: 599 FLNRFNGV-IEFSHLS--KDDLSEIVDLMLDEVNQTIGKKGIDLVVDENVKSHLIELGYD 655
R N V + L +D+ ++V + + + G D+ + +
Sbjct: 296 LYYRLNVVPLRLPPLRDRAEDIPDLVRHFVQQAEK-EGLDVKRF--DQEALELM--KAHP 350

Query: 656 EAMGVRPLRRVIE 668
VR L ++
Sbjct: 351 WPGNVRELENLVR 363


13  SAG1483  SAG1501  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SAG1483   4       19       -2.745672  preprotein translocase subunit SecG
SAG1484   4       18       -2.171790  50S ribosomal protein L33
SAG1485   3       16       -2.706099  multi-drug resistance protein
SAG1486   0       13       -2.997555  hypothetical protein
SAG1487   -2      11       -1.856781  ABC transporter ATP-binding protein
SAG1488   -1      12       -1.943264  dephospho-CoA kinase
SAG1489   1       14       -2.108597  formamidopyrimidine-DNA glycosylase
SAG1490   3       15       -2.691810  MutR family transcriptional regulator
SAG1491   4       17       -1.213402  hypothetical protein
SAG1492   6       27       -2.039645  hypothetical protein
SAG1493   5       24       -4.587205  hypothetical protein
SAG1494   1       14       -2.532384  hypothetical protein
SAG1495   1       12       -2.556321  CAAX amino terminal protease family protein
SAG1496   -2      10       -2.609510  hypothetical protein
SAG1497   -2      12       -3.109506  hypothetical protein
SAG1498   -2      13       -3.044536  hypothetical protein
SAG1499   -1      13       -2.440620  GTP-binding protein Era
SAG1500   0       17       -3.513098  diacylglycerol kinase
SAG1501   0       14       -3.043576  metalloprotease
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SAG1483  SECGEXPORT  38  3e-07  Protein-export SecG membrane protein signature.
		>SECGEXPORT#Protein-export SecG membrane protein signature.

Length = 110

Score = 38.0 bits (88), Expect = 3e-07
Identities = 24/79 (30%), Positives = 39/79 (49%), Gaps = 4/79 (5%)

Query: 1 MYNLLLTILLVLSVLLIISIFMQPQKNPSSNV-FDSSGSEALFERSKARGFEAFMQRFTG 59
MY LL + L++++ L+ I +Q K F + S LF + G FM R T
Sbjct: 1 MYEALLVVFLIVAIGLVGLIMLQQGKGADMGASFGAGASATLF---GSSGSGNFMTRMTA 57

Query: 60 VLVFFWLLIGLVLSILSSH 78
+L + +I LVL ++S+
Sbjct: 58 LLATLFFIISLVLGNINSN 76


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SAG1485  TCRTETA  114  1e-30  Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 114 bits (287), Expect = 1e-30
Identities = 75/341 (21%), Positives = 146/341 (42%), Gaps = 9/341 (2%)

Query: 15 LVMPFMVLYVEQLGAPSNKVEWYAGLSVSLSALSSALVAPLWGRLADKYGRKPMMVRAGL 74
L+MP + + L SN V + G+ ++L AL AP+ G L+D++GR+P+++ +
Sbjct: 23 LIMPVLPGLLRDLV-HSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLA 81

Query: 75 MMTFTMGGLAFIHSVTGLLILRILNGIFAGYVPNSTALIASQAPQEESGYALGTLATGVT 134
+A + L I RI+ GI + A IA +E G ++
Sbjct: 82 GAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIADITDGDERARHFGFMSACFG 141

Query: 135 GGMLIGPLLGGLLAEWFGIREVFLLVGTILLISTLMTIFMVKEDFKPISN---EETMPTT 191
GM+ GP+LGGL+ F F + ++ L F++ E K E +
Sbjct: 142 FGMVAGPVLGGLMGG-FSPHAPFFAAAALNGLNFLTGCFLLPESHKGERRPLRREALNPL 200

Query: 192 EVFKSVKSLQILIGLFVTSMIIQISAQSIAPILTLYIRHLGQTENLMFVSGLIVSGMGFS 251
F+ + + ++ L I+Q+ Q A + ++ + G+ ++ G
Sbjct: 201 ASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWDATTI--GISLAAFGIL 258

Query: 252 SILSSPKL-GRIGDRIGNHRLLLLALLYSFLMYVLCSLAQTSLQLGVIRFLYGFGTGALM 310
L+ + G + R+G R L+L ++ Y+L + A I L G G M
Sbjct: 259 HSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLASG-GIGM 317

Query: 311 PSINSILTKIAPRQGLSRIFSYNQMFSNLGQVLGPFVGSAV 351
P++ ++L++ + ++ ++L ++GP + +A+
Sbjct: 318 PALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAI 358



Score = 52.9 bits (127), Expect = 8e-10
Identities = 41/188 (21%), Positives = 75/188 (39%), Gaps = 8/188 (4%)

Query: 197 VKSLQILIGLFVTSMIIQISAQSIAPILTLYIRHLGQTENLMFVSGLIVSGMGFSSILSS 256
+K + LI + T + + I P+L +R L + ++ G++++ +
Sbjct: 1 MKPNRPLIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACA 60

Query: 257 PKLGRIGDRIGNHRLLLLALLYSFLMYVLCSLAQTSLQLGVIRFLYGFGTGALMPSINSI 316
P LG + DR G +LL++L + + Y + + A L + R + G TGA +
Sbjct: 61 PVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGI-TGATGAVAGAY 119

Query: 317 LTKIAPRQGLSRIFSYNQMFSNLGQVLGPFVG---SAVSIHLGFRWVFFVTSFIVLANFV 373
+ I +R F + G V GP +G S H FF + + NF+
Sbjct: 120 IADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAP----FFAAAALNGLNFL 175

Query: 374 WCFINFRK 381
+
Sbjct: 176 TGCFLLPE 183


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SAG1499  TCRTETOQM  33  0.001  Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 33.3 bits (76), Expect = 0.001
Identities = 31/137 (22%), Positives = 59/137 (43%), Gaps = 17/137 (12%)

Query: 3 FKSGFVAILGRPNVGKSTFLNHVMGQKIAIMSDKAQTTRNKIMGIYTTETEQIVFIDTPG 62
+ SG + LG + G + N ++ ++ I T+ + E ++ IDTPG
Sbjct: 25 YNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITS-------FQWENTKVNIIDTPG 77

Query: 63 IHKPKTALGDFMVESAYSTLREVETVLFMVPADEKRGKGDDMIIERLKAAKIPVILVINK 122
H DF+ E Y +L ++ + ++ A + ++ L+ IP I INK
Sbjct: 78 -HM------DFLAE-VYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGIPTIFFINK 129

Query: 123 IDK--VHPDQLLEQIDD 137
ID+ + + + I +
Sbjct: 130 IDQNGIDLSTVYQDIKE 146


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SAG1501  MALTOSEBP  29  0.007  Maltose binding protein signature.
		>MALTOSEBP#Maltose binding protein signature.

Length = 396

Score = 29.3 bits (65), Expect = 0.007
Identities = 19/60 (31%), Positives = 34/60 (56%), Gaps = 8/60 (13%)

Query: 31 NKEMAVTFVTNERSHELNLEYRDTDRPTDVISLEYKPEVDISFDEEDLAENPELAEMLED 90
NKE+A F+ N + LE + D+P ++L+ S+ EE+LA++P +A +E+
Sbjct: 298 NKELAKEFLENYLLTDEGLEAVNKDKPLGAVALK-------SY-EEELAKDPRIAATMEN 349


14  SAG1693  SAG1701  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias    Product
SAG1693   5       19       -1.749047  transcription antitermination protein NusB
SAG1694   4       21       -1.604003  hypothetical protein
SAG1695   3       22       -3.482550  elongation factor P
SAG1696   4       28       -6.406125  hypothetical protein
SAG1697   5       28       -7.498687  hypothetical protein
SAG1698   6       29       -8.248268  hypothetical protein
SAG1699   4       26       -5.709685  hypothetical protein
SAG1700   1       24       -3.635912  hypothetical protein
SAG1701   3       18       -0.392627  hypothetical protein
15  SAG1763  SAG1769  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SAG1763   2       28       1.436235   glutamine synthetase, type I
SAG1764   6       39       2.714063   GlnR family transcriptional regulator
SAG1765   7       41       3.048319   hypothetical protein
SAG1766   8       46       4.161732   phosphoglycerate kinase
SAG1767   8       37       3.303459   acid phosphatase
SAG1768   8       38       3.751289   glyceraldehyde-3-phosphate dehydrogenase
SAG1769   5       23       2.598583   elongation factor G
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SAG1769  TCRTETOQM  624  0.0  Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 624 bits (1610), Expect = 0.0
Identities = 180/671 (26%), Positives = 301/671 (44%), Gaps = 65/671 (9%)

Query: 9 KTRNIGIMAHVDAGKTTTTERILYYTGKIHKIGETHEGASQMDWMEQEQERGITITSAAT 68
K NIG++AHVDAGKTT TE +LY +G I ++G +G ++ D E++RGITI + T
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGIT 61

Query: 69 TAQWDGHRVNIIDTPGHVDFTIEVQRSLRVLDGAVTVLDAQSGVEPQTETVWRQATEYGV 128
+ QW+ +VNIIDTPGH+DF EV RSL VLDGA+ ++ A+ GV+ QT ++ + G+
Sbjct: 62 SFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGI 121

Query: 129 PRIVFANKMDKIGADFLYSVQSLHDRLQANAHPIQLPIGSEDDFRGIIDLIKMKAEIYTN 188
P I F NK+D+ G D Q + ++L A +IK K E+Y N
Sbjct: 122 PTIFFINKIDQNGIDLSTVYQDIKEKLSAEI------------------VIKQKVELYPN 163

Query: 189 DLGTDILEEDIPAEYVDQANEYREKLVEAVADTDEDLMMKYLEGEEITNEELMAAIRKAT 248
T+ E + + V + ++DL+ KY+ G+ + EL
Sbjct: 164 MCVTNFTESE---------------QWDTVIEGNDDLLEKYMSGKSLEALELEQEESIRF 208

Query: 249 INVEFYPVLCGSAFKNKGVQLMLDAVIDYLPSPLDIPAIKGINPDTDEEETRPASDEEPF 308
N +PV GSA N G+ +++ + + S +
Sbjct: 209 HNCSLFPVYHGSAKNNIGIDNLIEVITNKFYSSTH-------------------RGQSEL 249

Query: 309 AALAFKIMTDPFVGRLTFFRVYSGVLNSGSYVLNTSKGKRERIGRILQMHANSRQEIETV 368
FKI RL + R+YSGVL+ V + K K +I + +I+
Sbjct: 250 CGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEKEK-IKITEMYTSINGELCKIDKA 308

Query: 369 YAGDIAAAVG----LKDTTTGDSLTDEKSKVILESIEVPEPVIQLMVEPKSKADQDKMGI 424
Y+G+I L GD+ + E IE P P++Q VEP ++ +
Sbjct: 309 YSGEIVILQNEFLKLNS-VLGDTKLLPQR----ERIENPLPLLQTTVEPSKPQQREMLLD 363

Query: 425 ALQKLAEEDPTFRVETNVETGETVISGMGELHLDVLVDRMKREFKVEANVGAPQVSYRET 484
AL ++++ DP R + T E ++S +G++ ++V ++ ++ VE + P V Y E
Sbjct: 364 ALLEISDSDPLLRYYVDSATHEIILSFLGKVQMEVTCALLQEKYHVEIEIKEPTVIYME- 422

Query: 485 FRASTQARGFFKRQSGGKGQFGDVWIEFTPNEEGKGFEFENAIVGGVVPREFIPAVEKGL 544
R +A + + + + +P G G ++E+++ G + + F AV +G+
Sbjct: 423 -RPLKKAEYTIHIEVPPNPFWASIGLSVSPLPLGSGMQYESSVSLGYLNQSFQNAVMEGI 481

Query: 545 VESMANGVLAGYPMVDVKAKLYDGSYHDVDSSETAFKIAASLALKEAAKSAQPAILEPMM 604
G L G+ + D K G Y+ S+ F++ A + L++ K A +LEP +
Sbjct: 482 RYGCEQG-LYGWNVTDCKICFKYGLYYSPVSTPADFRMLAPIVLEQVLKKAGTELLEPYL 540

Query: 605 LVTITAPEDNLGDVMGHVTARRGRVDGMEARGNTQVVRAFVPLAEMFGYATVLRSATQGR 664
I AP++ L + + + N ++ +P + Y + L T GR
Sbjct: 541 SFKIYAPQEYLSRAYTDAPKYCANIVDTQLKNNEVILSGEIPARCIQEYRSDLTFFTNGR 600

Query: 665 GTFMMVFDHYE 675
+ Y
Sbjct: 601 SVCLTELKGYH 611


16  SAG1829  SAG1884  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SAG1829   2       24       2.573511   CtsR family transcriptional regulator
SAG1830   4       26       3.038170   hypothetical protein
SAG1831   5       27       2.969278   elongation factor Ts
SAG1832   3       25       5.111264   30S ribosomal protein S2
SAG1833   3       24       5.282022   alkyl hydroperoxide reductase
SAG1834   3       26       5.680152   alkyl hydroperoxide reductase
SAG1835   4       39       7.640467   *hypothetical protein
SAG1836   5       39       8.123702   hypothetical protein
SAG1837   3       34       7.647312   prophage LambdaSa2, lysin
SAG1838   4       33       6.735344   prophage LambdaSa2, holin
SAG1839   4       31       6.789841   hypothetical protein
SAG1840   4       30       6.787637   hypothetical protein
SAG1841   4       29       6.484975   hypothetical protein
SAG1842   3       28       6.357791   prophage LambdaSa2, PblB
SAG1843   3       28       5.639374   hypothetical protein
SAG1844   4       29       5.413912   hypothetical protein
SAG1845   6       32       3.523864   hypothetical protein
SAG1846   7       32       3.406402   hypothetical protein
SAG1847   5       27       3.705601   hypothetical protein
SAG1848   4       28       3.452971   hypothetical protein
SAG1849   5       30       3.851867   hypothetical protein
SAG1850   4       32       4.535267   hypothetical protein
SAG1851   4       32       4.297017   hypothetical protein
SAG1852   3       32       4.012252   hypothetical protein
SAG1853   3       34       3.652063   prophage LambdaSa2, protease
SAG1854   4       34       3.431459   hypothetical protein
SAG1855   4       32       2.827826   prophage LambdaSa2, terminase large subunit
SAG1856   4       32       1.313365   hypothetical protein
SAG1857   4       31       1.887925   prophage LambdaSa2, HNH endonuclease family
SAG1858   4       32       3.274881   hypothetical protein
SAG1859   4       35       3.712458   prophage LambdaSa2, site-specific recombinase
SAG1860   4       33       3.943261   hypothetical protein
SAG1861   4       36       4.165483   prophage LambdaSa2, Cro/CI family
SAG1862   4       36       4.264172   hypothetical protein
SAG1863   3       35       2.912432   prophage LambdaSa2, single-strand binding
SAG1864   3       31       3.431993   hypothetical protein
SAG1865   3       36       4.495100   hypothetical protein
SAG1866   2       34       3.756675   hypothetical protein
SAG1867   3       37       4.845032   hypothetical protein
SAG1868   3       37       4.597269   hypothetical protein
SAG1869   4       36       5.003867   prophage LambdaSa2, type II DNA modification
SAG1870   5       36       4.130908   prophage LambdaSa2, DNA replication protein
SAG1872   5       32       3.637303   hypothetical protein
SAG1873   5       32       4.240464   prophage LambdaSa2, replicative DNA helicase
SAG1874   6       24       1.516190   hypothetical protein
SAG1875   4       22       1.402808   hypothetical protein
SAG1876   3       23       0.947102   prophage LambdaSa2, HNH endonuclease family
SAG1877   5       22       1.460502   prophage LambdaSa2, antirepressor protein
SAG1878   6       26       0.518085   hypothetical protein
SAG1879   4       23       0.050092   hypothetical protein
SAG1880   3       26       0.378283   hypothetical protein
SAG1881   3       28       0.380182   hypothetical protein
SAG1882   0       17       -0.334327  prophage LambdaSa2, repressor protein
SAG1883   2       17       -0.966554  hypothetical protein
SAG1884   2       13       -0.469477  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SAG1833  MICOLLPTASE  28  0.022  Microbial collagenase metalloprotease (M9) signature.
		>MICOLLPTASE#Microbial collagenase metalloprotease (M9) signature.

Length = 1104

Score = 28.1 bits (62), Expect = 0.022
Identities = 12/67 (17%), Positives = 24/67 (35%), Gaps = 6/67 (8%)

Query: 119 RGTFIIDP--DGVIQMMEINADGIGRDASTLIDKVRAAQYIRQHTGEVCPAKWKEGAETL 176
R II D + GI TL++ +RA Y+ + ++ +
Sbjct: 137 RVQAIIYGLEDSGRTYTADDDKGI----PTLVEFLRAGYYLGFYNKQLSYLNTPQLKNEC 192

Query: 177 TPSLDLV 183
P++ +
Sbjct: 193 LPAMKAI 199


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SAG1834  PF07212  30  0.025  Hyaluronoglucosaminidase
		>PF07212#Hyaluronoglucosaminidase

Length = 336

Score = 30.0 bits (67), Expect = 0.025
Identities = 15/45 (33%), Positives = 24/45 (53%), Gaps = 2/45 (4%)

Query: 242 GGQVIETVGIENMIGTLYT--EGPKLMAQIEEHTKSYDIDIIKSQ 284
G ++ G+E +GTL E P + A +E+ + IDI+K Q
Sbjct: 205 NGSAMQIRGVEKALGTLKITHENPNVEANYDENAAALSIDIVKKQ 249


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SAG1844  GPOSANCHOR  33  0.008  Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 32.7 bits (74), Expect = 0.008
Identities = 28/219 (12%), Positives = 71/219 (32%), Gaps = 14/219 (6%)

Query: 33 LKKDFNNINRQLKMDPDNVDLLNRKLVNLQEQARVGAIKIAELKKQQKALGESEVGSAQW 92
LK ++++ K D+ D L +L N +E+ R ++E + + L + +
Sbjct: 69 LKLKNSDLSFNNKALKDHNDELTEELSNAKEKLRKNDKSLSEKASKIQELEARKADLEKA 128

Query: 93 NKLQLEIAK--VESQMKIVDKAMESTKKHIEDVGDPKSILNLNKELDNVAKELDIVNQKL 150
+ + + + + + + + +N + K L+ L
Sbjct: 129 LEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAAL 188

Query: 151 ELDPDNVELAEQKMKLLGKQSELAGDKVQELK------------KKQAALGDEKIGTEEW 198
E +E A + ++ K + A+ + +
Sbjct: 189 EARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKI 248

Query: 199 RQLQNEIGQAEVEVLKIDRAMDILGESSRSATGDIKEAT 237
+ L+ E E ++++A++ S + + IK
Sbjct: 249 KTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLE 287


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SAG1856  HTHFIS  28  0.019  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 27.9 bits (62), Expect = 0.019
Identities = 7/19 (36%), Positives = 14/19 (73%)

Query: 10 EKLGISRATLTRYRKKLGI 28
+ LG++R TL + ++LG+
Sbjct: 457 DLLGLNRNTLRKKIRELGV 475


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SAG1858  YERSSTKINASE  28  0.006  Yersinia serine/threonine protein kinase signature.
		>YERSSTKINASE#Yersinia serine/threonine protein kinase signature.

Length = 732

Score = 27.8 bits (61), Expect = 0.006
Identities = 14/45 (31%), Positives = 24/45 (53%)

Query: 12 LEAGSTLEIYLTKNDLEHIANGYEVTLDIKPNETVNKIVIKPSFV 56
L A +++L + L H G+E +IKPN+ + I +P+ V
Sbjct: 317 LGASEKSDVFLVVSTLLHCIEGFEKNPEIKPNQGLRFITSEPAHV 361


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SAG1860  FRAGILYSIN  27  0.024  Fragilysin metallopeptidase (M10C) enterotoxin signat...
		>FRAGILYSIN#Fragilysin metallopeptidase (M10C) enterotoxin signature.

Length = 405

Score = 27.3 bits (60), Expect = 0.024
Identities = 11/45 (24%), Positives = 20/45 (44%), Gaps = 4/45 (8%)

Query: 81 LEEFQNAISELLEVLEPDDKKIFHLRWG----EHTGYDWIQVWHI 121
LE F ++ + DD+ F +RWG + G W +++
Sbjct: 288 LEGFTASLKSNPKAEGYDDQIYFLIRWGTWDNKILGMSWFNSYNV 332


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SAG1865  FbpA_PF05833  24  0.049  Fibronectin-binding protein
		>FbpA_PF05833#Fibronectin-binding protein

Length = 577

Score = 24.4 bits (53), Expect = 0.049
Identities = 9/35 (25%), Positives = 16/35 (45%), Gaps = 6/35 (17%)

Query: 39 NALKYQLRYRKKNGLEDLKKARKNLDWLIEEMEKE 73
N Y +Y K LKK+ + + + + E+E
Sbjct: 382 NVQSYYKKYNK------LKKSEEAANEQLLQNEEE 410


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SAG1880  RTXTOXINA  25  0.015  Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 24.9 bits (54), Expect = 0.015
Identities = 7/30 (23%), Positives = 18/30 (60%), Gaps = 1/30 (3%)

Query: 18 FIELSKEFNKKARELEELAHELKSFDFEGE 47
F+ ++ +F + A ++EE + K ++G+
Sbjct: 323 FLSIADKFKR-ANKIEEYSQRFKKLGYDGD 351


17  SAG1964  SAG2036  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SAG1964   0       19       3.677412   phosphate ABC transporter permease
SAG1965   1       21       4.174742   phosphate ABC transporter permease
SAG1966   1       24       4.221039   hemolysin
SAG1967   2       26       4.962184   hypothetical protein
SAG1968   2       28       6.468879   16S ribosomal RNA methyltransferase RsmE
SAG1969   2       25       4.885290   50S ribosomal protein L11 methyltransferase
SAG1970   3       20       1.813881   hypothetical protein
SAG1971   3       23       3.264869   hypothetical protein
SAG1972   2       28       4.792934   MerR family transcriptional regulator
SAG1973   2       25       4.626867   acetyltransferase
SAG1974   2       24       3.309399   MutT/nudix family protein
SAG1975   1       21       2.943508   hypothetical protein
SAG1976   1       21       2.807783   hypothetical protein
SAG1977   1       20       1.923316   acetyltransferase
SAG1978   0       17       0.537224   recombination factor protein RarA
SAG1979   1       19       -0.534195  *hypothetical protein
SAG1980   2       17       -0.959246  ABC transporter ATP-binding protein
SAG1981   2       17       -2.665535  hypothetical protein
SAG1982   1       15       -2.606957  Cro/CI family transcriptional regulator
SAG1983   0       16       -3.137117  hypothetical protein
SAG1984   0       16       -3.153698  hypothetical protein
SAG1985   1       17       -3.601395  hypothetical protein
SAG1986   0       16       -3.414527  phage integrase family site specific
SAG1987   0       18       -4.894725  hypothetical protein
SAG1988   0       20       -5.376029  hypothetical protein
SAG1989   1       22       -6.858999  hypothetical protein
SAG1990   2       22       -7.121853  hypothetical protein
SAG1991   0       26       -6.211317  Cro/CI family transcriptional regulator
SAG1992   1       27       -6.277623  hypothetical protein
SAG1993   2       26       -3.849109  phage integrase family site specific
SAG1994   3       25       -2.854330  hypothetical protein
SAG1995   0       21       -2.614176  hypothetical protein
SAG1996   0       19       -1.811893  cell wall surface anchor family protein
SAG1997   -2      17       -1.742576  hypothetical protein
SAG1998   -2      15       -1.327168  hypothetical protein
SAG1999   -2      17       -1.918749  hypothetical protein
SAG2000   -2      17       -2.019180  hypothetical protein
SAG2001   -1      14       -2.214966  conjugal transfer protein, interruption-C
SAG2002   -1      15       -3.033644  IS1381 transposase protein B
SAG2003   -2      15       -4.235285  IS1381 transposase protein A
SAG2005   1       20       -5.691442  hypothetical protein
SAG2006   1       20       -5.056739  hypothetical protein
SAG2007   3       21       -4.866791  hypothetical protein
SAG2008   3       20       -4.919788  hypothetical protein
SAG2009   3       19       -6.762036  hypothetical protein
SAG2010   3       19       -6.955290  hypothetical protein
SAG2011   3       19       -6.748450  hypothetical protein
SAG2012   2       19       -6.160080  hypothetical protein
SAG2013   3       20       -6.056739  hypothetical protein
SAG2014   2       21       -5.737004  hypothetical protein
SAG2015   2       23       -4.500576  Cro/CI family transcriptional regulator
SAG2016   2       16       -2.662523  hypothetical protein
SAG2017   1       12       -1.447553  Cro/CI family transcriptional regulator
SAG2018   1       12       -0.863716  DNA translocase FtsK
SAG2019   3       14       -0.042646  hypothetical protein
SAG2020   1       12       -0.667086  hypothetical protein
SAG2021   1       12       -1.247162  cell wall surface anchor family protein
SAG2022   -1      14       -2.497526  ISL3 family transposase
SAG2023   -1      18       -4.068608  mercuric reductase
SAG2024   1       18       -6.893000  mercuric resistance operon regulatory protein
SAG2025   4       20       -6.846151  Mn2+/Fe2+ transporter
SAG2026   6       23       -7.921351  hypothetical protein
SAG2027   5       20       -7.035521  ABC transporter ATP-binding protein
SAG2028   4       19       -6.375393  hypothetical protein
SAG2029   3       20       -6.371975  streptomycin resistance protein
SAG2030   4       20       -4.853515  hypothetical protein
SAG2031   4       18       -5.065255  hypothetical protein
SAG2032   2       15       -4.746835  hypothetical protein
SAG2033   0       15       -4.780407  acetyltransferase
SAG2034   0       16       -4.362750  hypothetical protein
SAG2035   -1      13       -3.160564  ABC transporter ATP-binding protein
SAG2036   -2      15       -3.303356  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SAG1966  FbpA_PF05833  28  0.043  Fibronectin-binding protein
		>FbpA_PF05833#Fibronectin-binding protein

Length = 577

Score = 28.3 bits (63), Expect = 0.043
Identities = 25/112 (22%), Positives = 46/112 (41%), Gaps = 6/112 (5%)

Query: 168 PLDNAKAYQAKVSSGKVVIAGSSSVTPVMEKIKEAYHKVNAKVDVEIQQSDSSTGITSAI 227
P N ++Y K + K + + +E + + + V I +D+ I
Sbjct: 379 PSQNVQSYYKKYNKLKKSEEA---ANEQLLQNEEELNYLYS-VLTNINNADNYDEIEEIK 434

Query: 228 DGSADIG-MASRELDKTESSKGVKA-TVIATDGIAVVVNKKNKVNDLSTKQV 277
+ G + +++ K++ SK K I+ DGI + V K N ND T +
Sbjct: 435 KELIETGYIKFKKIYKSKKSKTSKPMHFISKDGIDIYVGKNNIQNDYLTLKF 486


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SAG1977  SACTRNSFRASE  42  2e-07  Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 41.9 bits (98), Expect = 2e-07
Identities = 24/121 (19%), Positives = 50/121 (41%), Gaps = 9/121 (7%)

Query: 18 QNTGWT---ALTSPVYDRKWTESDLEKNLAN--GMSFFVAEVDDKIAGVLDFGPYYPFPA 72
+N WT S Y +++ + D++ + G + F+ +++ G + +
Sbjct: 31 ENGVWTYTEERFSKPYFKQYEDDDMDVSYVEEEGKAAFLYYLENNCIGRIKIRSNW---- 86

Query: 73 GKHVATFGILIAEPYQGQGLGKALLKALLTEAKAQGYIKIAMHVMGNNSRAISLYQKYGF 132
+ I +A+ Y+ +G+G ALL + AK + + + N A Y K+ F
Sbjct: 87 NGYALIEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHF 146

Query: 133 T 133

Sbjct: 147 I 147


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SAG1978  HTHFIS  31  0.008  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 31.0 bits (70), Expect = 0.008
Identities = 23/83 (27%), Positives = 32/83 (38%), Gaps = 9/83 (10%)

Query: 49 GIGKTSIASAIAGTTKYAFRTFNATVDSKKRLQEIAEEAKFSGGLVLLLDEIHRLDKTKQ 108
I + I S + G K AF A S R ++ + G L LDEI + Q
Sbjct: 198 AIPRDLIESELFGHEKGAFTG--AQTRSTGRFEQ-------AEGGTLFLDEIGDMPMDAQ 248

Query: 109 DFLLPLLENGNIIMIGATTENPF 131
LL +L+ G +G T
Sbjct: 249 TRLLRVLQQGEYTTVGGRTPIRS 271


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SAG1991  HELNAPAPROT  27  0.048  Helicobacter neutrophil-activating protein A family ...
		>HELNAPAPROT#Helicobacter neutrophil-activating protein A family signature.

Length = 153

Score = 26.8 bits (59), Expect = 0.048
Identities = 12/51 (23%), Positives = 24/51 (47%), Gaps = 1/51 (1%)

Query: 51 NLDKIAEYFRATPTQLFGTIKEIELENSVLETDTYTSKADHILKSVKEFYE 101
+D IAE A Q T+KE E++ + + A +++++ Y+
Sbjct: 60 TVDTIAERLLAIGGQPVATVKEY-TEHASITDGGNETSASEMVQALVNDYK 109


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SAG2021  TONBPROTEIN  38  1e-04  Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 37.7 bits (87), Expect = 1e-04
Identities = 20/80 (25%), Positives = 28/80 (35%), Gaps = 2/80 (2%)

Query: 366 VVPNVVIPKQPTPPSTEKVTPEAEKPVPEKPVEPKFVTPTLKTYTPAQPKVKPHVSIPEK 425
V P + P Q P E V +P P + K +PK KP + E+
Sbjct: 50 VTPADLEPPQAVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQ 109

Query: 426 INYSVSVHPVLVPAANPSKA 445
V PV A+P +
Sbjct: 110 PKRDVK--PVESRPASPFEN 127



Score = 34.2 bits (78), Expect = 0.001
Identities = 20/107 (18%), Positives = 32/107 (29%), Gaps = 1/107 (0%)

Query: 365 DVVPNVVIPKQPTPPSTEKVTPEAEKPVPEKPVEPK-FVTPTLKTYTPAQPKVKPHVSIP 423
V P +P P P E PV + +PK P QPK
Sbjct: 60 AVQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVQEQPKRDVKPVES 119

Query: 424 EKINYSVSVHPVLVPAANPSKAVIDEAGQSVNGKTVLPNAELNYVAK 470
+ + P + ++ + A +G L + Y A+
Sbjct: 120 RPASPFENTAPARLTSSTATAATSKPVTSVASGPRALSRNQPQYPAR 166


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SAG2022  PF07520  29  0.036  Virulence protein SrfB
		>PF07520#Virulence protein SrfB

Length = 1041

Score = 29.2 bits (65), Expect = 0.036
Identities = 12/74 (16%), Positives = 26/74 (35%)

Query: 305 HLRYSQWRHRCMSSNSKDAYKDLVRAVDNWHVEIFNYFDKRLTNAYTESINSIIRQVERM 364
W + + ++ DL V +W E+F F + + S ++ E
Sbjct: 184 DPGAMSWFLQRLEADEDGNAVDLQLWVSDWLKEMFLDFKRAERPGRSISEENLPHMFEHW 243

Query: 365 GRGYSFDALRAKIL 378
R S+ + + +
Sbjct: 244 ARYLSYLQVIQRAV 257


18  SAG2109  SAG2117  Y  N  Y  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias    Product
SAG2109   2       31       4.103571   50S ribosomal protein L32
SAG2110   0       31       4.000153   50S ribosomal protein L33
SAG2111   2       32       4.010359   hypothetical protein
SAG2112   2       35       4.543209   phage integrase family site specific
SAG2113   4       31       2.998665   hypothetical protein
SAG2114   4       27       1.634832   hypothetical protein
SAG2115   6       22       -0.726944  hypothetical protein
SAG2116   5       20       -1.937130  hypothetical protein
SAG2117   2       19       -2.066256  hypothetical protein
19  SAG0431  SAG0440  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SAG0431   5       25       1.191185   TetR family transcriptional regulator
SAG0432   4       25       1.524355   AraC family transcriptional regulator
SAG0433   4       31       2.927524   surface protein Rib
SAG0435   1       18       -1.418185  DNA-damage-inducible protein J
SAG0436   2       17       0.955964   hypothetical protein
SAG0437   3       16       0.959420   lipoprotein
SAG0440   0       20       4.339942   *hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SAG0431  HTHTETR  51  2e-10  TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 51.2 bits (122), Expect = 2e-10
Identities = 14/64 (21%), Positives = 30/64 (46%)

Query: 5 RQIQKTKVAIYNAFISLLQENDYSKITVQDVIGLANVGRSTFYSHYESKEVLLKELCEDL 64
++ Q+T+ I + + L + S ++ ++ A V R Y H++ K L E+ E
Sbjct: 7 QEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELS 66

Query: 65 FHHL 68
++
Sbjct: 67 ESNI 70


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
SAG0433   GPOSANCHOR  67     3e-13    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 67.0 bits (163), Expect = 3e-13
Identities = 27/98 (27%), Positives = 42/98 (42%), Gaps = 19/98 (19%)

Query: 1303 ATPGDKPAKVVVTYPDGSKDTVDVTVKVVDPRTDADKNDPAGKDQQVNGKGN-------- 1354
A ++ AK+ S+ D + G+ Q K N
Sbjct: 449 AKQAEELAKLRAGKASDSQTP--------DAKPGNKAVPGKGQAPQAGTKPNQNKAPMKE 500

Query: 1355 ---KLPATGENATPFFNVVALTIMSSVGLLSVSKKKED 1389
+LP+TGE A PFF ALT+M++ G+ +V K+KE+
Sbjct: 501 TKRQLPSTGETANPFFTAAALTVMATAGVAAVVKRKEE 538



Score = 63.5 bits (154), Expect = 4e-12
Identities = 12/56 (21%), Positives = 24/56 (42%)

Query: 14 QTKQRFSIKKFKFGAASVLIGISFLGGFTQGQFNISTDTVFAAEVISGSAVTLNTN 69
T + +S++K K G ASV + ++ LG N + ++ + V +
Sbjct: 5 NTNRHYSLRKLKTGTASVAVALTVLGAGLVVNTNEVSAVATRSQTDTLEKVQERAD 60


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SAG0437   ADHESNFAMILY  27     0.026    Adhesin family signature.
		>ADHESNFAMILY#Adhesin family signature.

Length = 309

Score = 26.7 bits (59), Expect = 0.026
Identities = 10/23 (43%), Positives = 16/23 (69%)

Query: 3 KKLLGLMILAISTVFLVACSTNS 25
KKL L++L +S + LVAC++
Sbjct: 2 KKLGTLLVLFLSAIILVACASGK 24


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SAG0440   FLGMOTORFLIG  28     0.003    Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 28.2 bits (63), Expect = 0.003
Identities = 11/41 (26%), Positives = 22/41 (53%)

Query: 6 EFLKDFEEWLQSQISINQMAMDSAKKVLEEDKDERAADAYI 46
L +F+E + +Q I + +D A+++LE+ + A I
Sbjct: 65 NVLLEFKELMMAQEFIQKGGIDYARELLEKSLGTQKAVDII 105


20  SAG0495  SAG0505  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SAG0495   -3      13       0.822228   hypothetical protein
SAG0496   -1      12       -0.553122  exodeoxyribonuclease VII large subunit
SAG0497   -1      12       -0.820398  exodeoxyribonuclease VII small subunit
SAG0498   -3      11       -0.999581  geranyltranstransferase
SAG0499   -2      12       -1.769389  hemolysin A
SAG0500   -1      13       -1.785039  arginine repressor ArgR
SAG0501   -1      13       -1.884974  DNA repair protein RecN
SAG0502   0       13       -1.475779  DegV family protein
SAG0503   2       12       -1.817875  lipase/acylhydrolase
SAG0504   2       14       -1.198058  hypothetical protein
SAG0505   1       15       -0.398741  DNA-binding protein HU
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SAG0495   BINARYTOXINA  29     0.024    Clostridial binary toxin A signature.
		>BINARYTOXINA#Clostridial binary toxin A signature.

Length = 454

Score = 28.9 bits (64), Expect = 0.024
Identities = 23/125 (18%), Positives = 39/125 (31%), Gaps = 22/125 (17%)

Query: 153 QKEWERLSGIAVSQQTKENTQTALKSFPKGTILVAKSSHTRI-FQDLDEKEIIVGGPYQA 211
+WE+ V + + AL+ + K + ++ S TR F D + Y+
Sbjct: 57 AIQWEKKEAERVEKNLDTLEKEALELYKKDSEQISNYSQTRQYFYDYQIESNPREKEYK- 115

Query: 212 TGGMGDTLCGMIAGMLAQFKEA---SPLDKVSVGVYLHSAIAQGLSKEAYVVLPTTISDE 268
+ A + +DK Y S +KE IS E
Sbjct: 116 -----------------NLRNAISKNKIDKPINVYYFESPEKFAFNKEIRTENQNEISLE 158

Query: 269 IPKEM 273
E+
Sbjct: 159 KFNEL 163


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
SAG0499   BLACTAMASEA  28     0.033    Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 28.2 bits (63), Expect = 0.033
Identities = 6/27 (22%), Positives = 13/27 (48%), Gaps = 1/27 (3%)

Query: 102 QSGARL-VYAVDVGTNQLVWKLRQDHR 127
Q R+ + +D+ + + + R D R
Sbjct: 35 QLSGRVGMIEMDLASGRTLTAWRADER 61


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SAG0500   ARGREPRESSOR  86     4e-24    Bacterial arginine repressor signature.
		>ARGREPRESSOR#Bacterial arginine repressor signature.

Length = 149

Score = 85.7 bits (212), Expect = 4e-24
Identities = 41/153 (26%), Positives = 80/153 (52%), Gaps = 4/153 (2%)

Query: 1 MKKSERLNLIKQIVLNHAVETQHELLRRLEAYGVTLTQATISRDMNEIGIIKVPSAKGRY 60
M K +R I++I+ + +ETQ EL+ L+ G +TQAT+SRD+ E+ ++KVP+ G Y
Sbjct: 1 MNKGQRHIKIREIITANEIETQDELVDILKKDGYNVTQATVSRDIKELHLVKVPTNNGSY 60

Query: 61 IYGLSNENDPIFTTAVAKPIKTSILSISDKLLGLEQFININVIPGNSQLIKTFIMSHCQE 120
Y L + + + + + + + I I + +PGN+Q I + + E
Sbjct: 61 KYSLPADQRFNPLSKLKRSLMDAFVKIDSA----SHLIVLKTMPGNAQAIGALMDNLDWE 116

Query: 121 HIFSLTADDNSLLLIAKSEADADHIRQSMIAML 153
I D+++L+I ++ D +++ ++ +L
Sbjct: 117 EIMGTICGDDTILIICRTHDDTKVVQKKILELL 149


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SAG0502   BONTOXILYSIN  29     0.035    Bontoxilysin signature.
		>BONTOXILYSIN#Bontoxilysin signature.

Length = 1196

Score = 28.7 bits (64), Expect = 0.035
Identities = 16/65 (24%), Positives = 29/65 (44%), Gaps = 15/65 (23%)

Query: 207 KTFSKWL----DNFV-ESAQTR---KIAEIG--ISYCGKA----DMANNFREKLAVLGAP 252
K + WL N+ + T+ + I + + GKA + +N+F E+ GA
Sbjct: 556 KKYYLWLKEVFKNYSFDINLTQEIDSMCGINEVVLWFGKALNILNTSNSFVEEYQDSGA- 614

Query: 253 ISVLE 257
IS++
Sbjct: 615 ISLIS 619


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SAG0505   DNABINDINGHU  125    2e-41    Prokaryotic integration host factor signature.
		>DNABINDINGHU#Prokaryotic integration host factor signature.

Length = 91

Score = 125 bits (315), Expect = 2e-41
Identities = 84/91 (92%), Positives = 88/91 (96%)

Query: 1 MANKQDLIAKVAEATELTKKDSAAAVDAVFAAVADYLAEGEKVQLIGFGNFEVRERAARK 60
MANKQDLIAKVAEATELTKKDSAAAVDAVF+AV+ YLA+GEKVQLIGFGNFEVRERAARK
Sbjct: 1 MANKQDLIAKVAEATELTKKDSAAAVDAVFSAVSSYLAKGEKVQLIGFGNFEVRERAARK 60

Query: 61 GRNPQTGAEIEIAASKVPAFKAGKALKDAVK 91
GRNPQTG EI+I ASKVPAFKAGKALKDAVK
Sbjct: 61 GRNPQTGEEIKIKASKVPAFKAGKALKDAVK 91


21  SAG0712  SAG0724  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SAG0712   1       14       -2.025207  DNA-binding response regulator
SAG0713   0       15       -1.565534  hypothetical protein
SAG0714   -2      16       -1.216949  hypothetical protein
SAG0715   -2      14       -0.729448  amino acid ABC transporter permease
SAG0716   -3      11       0.047867   amino acid ABC transporter permease
SAG0717   -2      13       -0.284381  amino acid ABC transporter amino acid-binding
SAG0718   -1      12       -0.152766  amino acid ABC transporter ATP-binding protein
SAG0719   1       11       -0.347221  DNA-binding response regulator
SAG0720   1       11       -0.506462  sensory box histidine kinase
SAG0721   0       12       -0.454867  metallo-beta-lactamase superfamily protein
SAG0722   2       10       -0.138735  hypothetical protein
SAG0723   1       10       -0.289084  ribonuclease III
SAG0724   0       9        -0.414807  chromosome segregation protein SMC
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SAG0712   HTHFIS  74     1e-17    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 74.5 bits (183), Expect = 1e-17
Identities = 29/120 (24%), Positives = 58/120 (48%), Gaps = 4/120 (3%)

Query: 3 TVLVVQGDDETIELLRSYLEGALYKVVMASDGEEAFSLFQQHQIDLAIIDITLPKIDGYE 62
T+LV D +L L A Y V + S+ + DL + D+ +P + ++
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 63 LTRLIRQ-DSQIPIIMLAAKTTDMDRILGLNIGADDFITKPFN---SLEVLARINSQLRR 118
L I++ +P+++++A+ T M I GA D++ KPF+ + ++ R ++ +R
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
SAG0714   THERMOLYSIN  34     3e-04    Thermolysin metalloprotease (M4) family signature.
		>THERMOLYSIN#Thermolysin metalloprotease (M4) family signature.

Length = 544

Score = 33.8 bits (77), Expect = 3e-04
Identities = 23/110 (20%), Positives = 46/110 (41%), Gaps = 2/110 (1%)

Query: 74 TKSAEYSYHVDVKTGQILERDMDNNGFSKSTSQSSSSSSQKSHKIS-QEEAKKIAFKDAN 132
A HV+ L + N ++ ++ S Q++ I+ Q+ A ++ +
Sbjct: 102 CMGAVLVAHVNDGELSSLSGTLIPNLDKRTLKTEAAISIQQAEMIAKQDVADRVTKERPA 161

Query: 133 IEESEVSNLKIKEEIENGKSVYDIDFVDLKNK-NEVDYQIDAETGKIIER 181
EE + + L I + E + Y+++ L Y IDA GK++ +
Sbjct: 162 AEEGKPTRLVIYPDEETPRLAYEVNVRFLTPVPGNWIYMIDAADGKVLNK 211


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SAG0718   PF05272  29     0.018    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.3 bits (65), Expect = 0.018
Identities = 17/38 (44%), Positives = 19/38 (50%), Gaps = 4/38 (10%)

Query: 27 EPG----QVVVLLGPSGSGKSTLIRTMNALESIDDGSL 60
EPG VVL G G GKSTLI T+ L+ D
Sbjct: 590 EPGCKFDYSVVLEGTGGIGKSTLINTLVGLDFFSDTHF 627


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SAG0719   HTHFIS  91     2e-23    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 91.4 bits (227), Expect = 2e-23
Identities = 29/134 (21%), Positives = 65/134 (48%), Gaps = 1/134 (0%)

Query: 3 KILIVDDEKPISDIIKFNLTKEGYETATAFDGREALVQYAEFQPDLIILDLMLPELDGLE 62
IL+ DD+ I ++ L++ GY+ + A DL++ D+++P+ + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 63 VAKEVRKT-SHIPIIMLSAKDSEFDKVIGLEIGADDYVTKPFSNRELLARVKAHLRRTEN 121
+ ++K +P++++SA+++ + E GA DY+ KPF EL+ + L +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 122 IETAVAEESAQNAS 135
+ + ++S
Sbjct: 125 RPSKLEDDSQDGMP 138


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SAG0720   PF06580  34     0.001    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 33.7 bits (77), Expect = 0.001
Identities = 30/191 (15%), Positives = 67/191 (35%), Gaps = 42/191 (21%)

Query: 252 DETNRMMRMISDLL--SLSRIDNEVTHLDVEMTNFTAFMTSILNRFDQIRNQKTVTGKVY 309
+ M+ +S+L+ SL + L E+T +++ +F+ ++ ++
Sbjct: 191 TKAREMLTSLSELMRYSLRYSNARQVSLADELTVVDSYLQLASIQFE---DRLQFENQIN 247

Query: 310 EIVRDYPLKSIWVEIDTDKMTQVIDNILNNAVKY----SPDGGKITVNLRTTKTQMILSI 365
+ D + + ++ ++ N +K+ P GGKI + + L +
Sbjct: 248 PAIMDVQVPPM-----------LVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEV 296

Query: 366 SDQGLGIPKKDLPLIFDRFYRVDKARSRKQGGTGLGLSIAKEIVKQHKGF---IWAKSEY 422
+ G K + TG GL +E ++ G I +
Sbjct: 297 ENTGSLALKNT------------------KESTGTGLQNVRERLQMLYGTEAQIKLSEKQ 338

Query: 423 GKGSTFTIVLP 433
GK +++P
Sbjct: 339 GKV-NAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
SAG0724   GPOSANCHOR  50     4e-08    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 50.1 bits (119), Expect = 4e-08
Identities = 43/334 (12%), Positives = 115/334 (34%), Gaps = 21/334 (6%)

Query: 167 VLKYKTRKKETQSKLEQTQGNLDRLEDIIYELDMQVQPLEKQASIAKRFLVLDEERQGLH 226
+ K + ++ L + + LE +L+ ++ + +
Sbjct: 94 LSNAKEKLRKNDKSLSEKASKIQELEARKADLEKALEGAMNF--------STADSAKIKT 145

Query: 227 LSILIEDILQHQSDLTTVEEKLLTVRKELATYYQQRQSLEDENQSLKQKRHHLSEEIEAK 286
L + ++DL E + + + ++LE E +L+ ++ L + +E
Sbjct: 146 LEAEKAALAARKADLEKALEGAMNFSTADSA---KIKTLEAEKAALEARQAELEKALEGA 202

Query: 287 QLALLDVTKLKSDLERQIDLIRLESNQKAEKKEEAGQRLAELEIKAKDCSDQITQKNIEL 346
+ LE + A +K + + L + S +I E
Sbjct: 203 MNFSTADSAKIKTLEAEKA-------ALAARKADLEKALEGAMNFSTADSAKIKTLEAEK 255

Query: 347 TTLSEKIAQIRSEIVSTESSLERFSTNPDQI---IEKLREDFVTLMQEEADTSNALTALL 403
L + A++ + + S + L + L + + +L
Sbjct: 256 AALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLR 315

Query: 404 ADIENQKQASQAKSQEIQEVSKNLEVLKSNAKVALERFEAAKKNVRQLLSHYQDLGQTLQ 463
D++ ++A + E Q++ + ++ +++ + +A+++ +QL + +Q L + +
Sbjct: 316 RDLDASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQLEAEHQKLEEQNK 375

Query: 464 NLEGEYKNQQSILFDHLDEIKSKQARISSLESIL 497
E ++ + L + K + + S L
Sbjct: 376 ISEASRQSLRRDLDASREAKKQVEKALEEANSKL 409



Score = 42.7 bits (100), Expect = 8e-06
Identities = 40/261 (15%), Positives = 82/261 (31%), Gaps = 17/261 (6%)

Query: 674 KPELDNLKKELKQAQSKQLIQEKEVATLLEQLKEKQETLAQLKNDGEQARLEEQRADIEY 733
K +L K L + SK E A L + L+ + E+
Sbjct: 98 KEKLRKNDKSLSEKASKIQELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALAARK 157

Query: 734 QQLSEKLADLNKLYNGLQLSSGALEQTTSENEKNRLEKELEQFAIKKEELTTSIAQIKED 793
L + L ++ + + T E EK LE + E
Sbjct: 158 ADLEKALEGAMN-----FSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAK 212

Query: 794 KDSIQEKVNNLTTLLSEAQLEERDLLNEQKFERANCTRLEITLSEIKRDISNLQTLLSHQ 853
+++ + L ++ + +N + A LE + ++ + L
Sbjct: 213 IKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAEL------- 265

Query: 854 DSQLDKEELPRIEKQLLQVNNRRENDEEKLVSLRFELEDCEAALDDLAASLAKEGQKNES 913
++ L + + + E + +L E D E L A+ + ++
Sbjct: 266 -----EKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDA 320

Query: 914 LIRQQAQLESQCEQLSQQLMI 934
+ QLE++ ++L +Q I
Sbjct: 321 SREAKKQLEAEHQKLEEQNKI 341



Score = 31.6 bits (71), Expect = 0.021
Identities = 30/242 (12%), Positives = 87/242 (35%), Gaps = 5/242 (2%)

Query: 760 TTSENEKNRLEKELEQFAIKKEELTTSIAQIKEDKDSIQEKVNNLTTLLSEAQLEERDLL 819
+ + ++++ ++F I+ L + + + ++++ + LT LS A+ + R
Sbjct: 46 RSQTDTLEKVQERADKFEIENNTLKLKNSDLSFNNKALKDHNDELTEELSNAKEKLRKND 105

Query: 820 NEQKFERANCTRLEITLSEIKRDISNLQTLLSHQDSQLD-----KEELPRIEKQLLQVNN 874
+ + LE +++++ + + +++ K L + L +
Sbjct: 106 KSLSEKASKIQELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALE 165

Query: 875 RRENDEEKLVSLRFELEDCEAALDDLAASLAKEGQKNESLIRQQAQLESQCEQLSQQLMI 934
N + LE +AAL+ A L K + + + E L
Sbjct: 166 GAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAA 225

Query: 935 FSRQLSEDYQMTLDEAKVKANVLEDILMAREQLKSLQAKIKALGPVNIDAIAQFEEVHER 994
L + + ++ + + ++ + + L++ QA+++ ++ +
Sbjct: 226 RKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKT 285

Query: 995 LT 996
L
Sbjct: 286 LE 287


22  SAG1608  SAG1614  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SAG1608   0       9        0.548677   OxaA-like protein precursor
SAG1609   -1      11       0.306897   amino acid ABC transporter permease
SAG1610   -1      11       0.195396   amino acid ABC transporter substrate-binding
SAG1611   0       12       0.324490   amidase
SAG1612   1       12       -0.470874  transcription elongation factor GreA
SAG1613   0       10       -0.899127  hypothetical protein
SAG1614   0       11       -0.645598  acetyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
SAG1608   60KDINNERMP  120    1e-32    60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 120 bits (301), Expect = 1e-32
Identities = 60/226 (26%), Positives = 106/226 (46%), Gaps = 22/226 (9%)

Query: 34 YGVIWNTLGVPMANLITYFAQHQGLGFGVAIIIVTVIVRVVILPLGLYQSWKASYQA-EK 92
YG +W + P+ L+ + G +G +III+T IVR ++ PL KA Y + K
Sbjct: 330 YGWLW-FISQPLFKLLKWIHSFVG-NWGFSIIIITFIVRGIMYPLT-----KAQYTSMAK 382

Query: 93 MAYFKPLFEPINERLRNAKTQEEKLAAQTELMTAQRENGLSMFGGIGCLPLLIQMPFFSA 152
M +P + + ERL + +K E+M + ++ GG C PLLIQMP F A
Sbjct: 383 MRMLQPKIQAMRERLGD-----DKQRISQEMMALYKAEKVNPLGG--CFPLLIQMPIFLA 435

Query: 153 IFFAARYTPGVSSATFLG----LNLGQKSLTLTVIIAILYFVQSWLSMQGVPDEQRQQMK 208
+++ + + A F L+ L +++ + F +S V D +Q++
Sbjct: 436 LYYMLMGSVELRQAPFALWIHDLSAQDPYYILPILMGVTMFFIQKMSPTTVTDPMQQKI- 494

Query: 209 TMMYLMPIMMVFMSISLPASVALYWFIGGIFSIIQQLVTTYVLKPK 254
M MP++ + P+ + LY+ + + +IIQQ + L+ +
Sbjct: 495 --MTFMPVIFTVFFLWFPSGLVLYYIVSNLVTIIQQQLIYRGLEKR 538


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SAG1609   HTHFIS  32     0.002    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 31.7 bits (72), Expect = 0.002
Identities = 15/46 (32%), Positives = 24/46 (52%), Gaps = 5/46 (10%)

Query: 110 EIIRAALLAVDHGQWEAARALGLKTPTIYR-----GIIIPQATRIA 150
+I AAL A Q +AA LGL T+ + G+ + +++R A
Sbjct: 439 PLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGVSVYRSSRSA 484


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
SAG1613   IGASERPTASE  47     3e-07    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 46.6 bits (110), Expect = 3e-07
Identities = 36/214 (16%), Positives = 71/214 (33%), Gaps = 22/214 (10%)

Query: 19 EQILAELEEANRLRKLREEELYQKEQEAKEAARRTAQLMADYEAQRLKDE-REARAKALE 77
E E + + + E Q + AKEA E + E +E + +
Sbjct: 1042 ENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETK 1101

Query: 78 TKQRLEEQEKARIEAKLLAEAAREEERRQAEQALASQEEQVINQGMEPSRELDSGSKSSE 137
+E++EKA++E E +E + ++ + ++ + + EP+RE D E
Sbjct: 1102 ETATVEKEEKAKVE----TEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKE 1157

Query: 138 FRTTENVPDIDLKADKTDVATAVPNQETEEIFLVRATDIPTEGENVKLGEISELEPVAKE 197
++ N AD A + + + N + P +
Sbjct: 1158 PQSQTNTT-----ADTEQPAKETSSNVEQPV----TESTTVNTGNSVVENPENTTPATTQ 1208

Query: 198 PIRVEDLSKEEEGIALSAKNKHNKRERRQKADNV 231
P + S + + ++R R NV
Sbjct: 1209 PTVNSESSNKPK--------NRHRRSVRSVPHNV 1234


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SAG1614   SACTRNSFRASE  30     0.002    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 30.3 bits (68), Expect = 0.002
Identities = 16/56 (28%), Positives = 26/56 (46%), Gaps = 3/56 (5%)

Query: 88 ITSLSIHPDFKGQGIGTALLAAMKDLVVSQERDGISLTCHDDLIS---FYEMNGFK 140
I +++ D++ +G+GTALL + G+ L D IS FY + F
Sbjct: 92 IEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFI 147


23  SAG1922  SAG1926  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SAG1922   -1      11       0.336332   response regulator
SAG1923   -1      11       0.865148   UDP-glucose 4-epimerase
SAG1924   -2      11       0.744261   glucan 1,6-alpha-glucosidase
SAG1925   -2      10       0.572263   sugar ABC transporter ATP-binding protein
SAG1926   -2      11       0.536047   helix-turn-helix domain-containing protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SAG1922   HTHFIS  78     1e-18    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 77.6 bits (191), Expect = 1e-18
Identities = 26/103 (25%), Positives = 49/103 (47%), Gaps = 2/103 (1%)

Query: 3 VLIIEDDPMVEFIHRNYLEKLNYFQNIYSTASQTQAIAYLNDIKIQLVLLDIHIKEGNGL 62
+L+ +DD + + L + Y ++ T++ ++ LV+ D+ + + N
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGY--DVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 63 ELLKLLRNQHQNTEVIVISAANEAHTVKEAFHLGIVDYLIKPF 105
+LL ++ + V+V+SA N T +A G DYL KPF
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPF 106


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SAG1923   NUCEPIMERASE  164    2e-50    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 164 bits (417), Expect = 2e-50
Identities = 76/332 (22%), Positives = 142/332 (42%), Gaps = 43/332 (12%)

Query: 1 MAVLILGGAGYIGSHMVDQLITQGKEKVIVVDNLVTGH-------RQAV--HSDAIFYEG 51
M L+ G AG+IG H+ +L+ G +V+ +DNL + R + F++
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAG-HQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKI 59

Query: 52 DLSDKTFMRQVFRENPDVDAVIHFAAFSLVAESMENPLKYFDNNTAGMIKLLEVMNECDI 111
DL+D+ M +F + V V S+ENP Y D+N G + +LE I
Sbjct: 60 DLADREGMTDLFASGH-FERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKI 118

Query: 112 KNIVFSSTAATYGIPEQVPILETAP-QNPINPYGESKLMMETIMKWADQAYGIKFVALRY 170
++++++S+++ YG+ ++P +P++ Y +K E + YG+ LR+
Sbjct: 119 QHLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYGLPATGLRF 178

Query: 171 FNVAGDKPDGSIGEDHKPETHLLPIILQVAQGVRDKIMIFGDDYNTPDGTNVRDYVHPFD 230
F V G P G +P+ L + +G I ++ G RD+ + D
Sbjct: 179 FTVYG--PWG------RPDMALFKFTKAMLEG--KSIDVYN------YGKMKRDFTYIDD 222

Query: 231 LADAHILAVDYLRQGNES---------------NVFNLGSSTGFSNLQMLEAARRITGKE 275
+A+A I D + + V+N+G+S+ + ++A G E
Sbjct: 223 IAEAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIE 282

Query: 276 IPAQKAARRPGDPDTLIASSEKARQILGWEPK 307
+PGD A ++ +++G+ P+
Sbjct: 283 AKKNMLPLQPGDVLETSADTKALYEVIGFTPE 314


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SAG1925   PF05272  35     5e-04    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 35.0 bits (80), Expect = 5e-04
Identities = 13/56 (23%), Positives = 19/56 (33%), Gaps = 9/56 (16%)

Query: 34 IVFVGPSGCGKSTTLRMIAGLEDISEGELKIDGEVVNDKSPKDRDIAMVFQNYALY 89
+V G G GKST + + GL+ S+ I +D Y
Sbjct: 599 VVLEGTGGIGKSTLINTLVGLDFFSDTHFDI---------GTGKDSYEQIAGIVAY 645


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SAG1926   HTHFIS  43     5e-07    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 43.3 bits (102), Expect = 5e-07
Identities = 14/54 (25%), Positives = 27/54 (50%), Gaps = 4/54 (7%)

Query: 215 LQHILDTSDTSAIIKALWQEQGNLAKTAKALFIHRNSLQYKLDKFTQSSGLNLK 268
+L + I+ AL +GN K A L ++RN+L+ K+ + G+++
Sbjct: 429 YDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIREL----GVSVY 478


24  SAG2122  SAG2128  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SAG2122   -3      11       -1.196912  DNA-binding response regulator
SAG2123   -2      12       -1.920538  sensor histidine kinase
SAG2124   -1      11       -1.144375  hypothetical protein
SAG2125   -1      12       -1.452354  carbamate kinase
SAG2126   0       9        -1.901765  ornithine carbamoyltransferase
SAG2127   1       10       -2.318494  sensor histidine kinase
SAG2128   1       10       -1.993913  response regulator
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SAG2122   HTHFIS  93     4e-24    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 93.0 bits (231), Expect = 4e-24
Identities = 33/123 (26%), Positives = 62/123 (50%)

Query: 2 KILVVEDEFDLNRSIVKLLKKQHYSVDSASNGEEALQFVSVAEYDVIILDVMMPKMDGFT 61
ILV +D+ + + + L + Y V SN ++++ + D+++ DV+MP + F
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 FLKLLRNKGSQVSILMLTARDAVEDRIAGLDFGADDYLVKPFEFGELMARIRAMLRRANR 121
L ++ + +L+++A++ I + GA DYL KPF+ EL+ I L R
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 122 QVS 124
+ S
Sbjct: 125 RPS 127


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SAG2125   CARBMTKINASE  376    e-133    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 376 bits (966), Expect = e-133
Identities = 142/311 (45%), Positives = 204/311 (65%), Gaps = 7/311 (2%)

Query: 3 KIVVALGGNALGN-----SPEEQLRLVKHTAKSLVALIKKGHEIVVSHGNGPQVGAINLG 57
++V+ALGGNAL S EE + V+ TA+ + +I +G+E+V++HGNGPQVG++ L
Sbjct: 4 RVVIALGGNALQQRGQKGSYEEMMDNVRKTARQIAEIIARGYEVVITHGNGPQVGSLLLH 63

Query: 58 MNFAAESGQGTNFPFPECGAMSQGYIGYHLQQSLLNELRQEGINKEVATIITQIEVDESD 117
M+ + P GAMSQG+IGY +QQ+L NELR+ G+ K+V TIITQ VD++D
Sbjct: 64 MDAGQATYGIPAQPMDVAGAMSQGWIGYMIQQALKNELRKRGMEKKVVTIITQTIVDKND 123

Query: 118 QAFSAPTKPIGTFYDKETSEKIAIEKGYTFVEDAGRGYRRVVASPEPKKIIEINSIKTLI 177
AF PTKP+G FYD+ET++++A EKG+ ED+GRG+RRVV SP+PK +E +IK L+
Sbjct: 124 PAFQNPTKPVGPFYDEETAKRLAREKGWIVKEDSGRGWRRVVPSPDPKGHVEAETIKKLV 183

Query: 178 ENDTLVIAGGGGGIPVI-NKGGYEGIAAVIDKDKSSALLAGELAADQLIILTAVDYVYTQ 236
E +VIA GGGG+PVI G +G+ AVIDKD + LA E+ AD +ILT V+
Sbjct: 184 ERGVIVIASGGGGVPVILEDGEIKGVEAVIDKDLAGEKLAEEVNADIFMILTDVNGAALY 243

Query: 237 FGKENQKALTEVNENQMIDYVNQGEFAKGSMLPKVIACMSFLDHNPKGTALITSLNGLED 296
+G E ++ L EV ++ Y +G F GSM PKV+A + F++ + A+I L +
Sbjct: 244 YGTEKEQWLREVKVEELRKYYEEGHFKAGSMGPKVLAAIRFIEWGGE-RAIIAHLEKAVE 302

Query: 297 ALDGKLGTRIT 307
AL+GK GT++
Sbjct: 303 ALEGKTGTQVL 313


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SAG2127   PF06580  41     6e-06    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 41.0 bits (96), Expect = 6e-06
Identities = 28/175 (16%), Positives = 58/175 (33%), Gaps = 32/175 (18%)

Query: 256 NIIKGLGTYF----SVKNESTMALKDIFQIVLSYTRSIIQFRHQDIIILENNKCNLIISN 311
++ L N ++L D +V SY + + +D + EN N I +
Sbjct: 195 EMLTSLSELMRYSLRYSNARQVSLADELTVVDSYL-QLASIQFEDRLQFENQ-INPAIMD 252

Query: 312 YYYLLTIISNIVLNAVE-AIDKQKK-GTISVHTEELEDFIKIEISDNGPGIPDKMKHMIF 369
++ +V N ++ I + + G I + + + +E+ + G K
Sbjct: 253 VQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKEST- 311

Query: 370 KPGFSTKFDANGDIYRGIGLSHVR----ILMEEQYQGTITVCPNQPNGTTFTLLF 420
G GL +VR +L + Q ++ + +L
Sbjct: 312 ----------------GTGLQNVRERLQMLYGTEAQIKLS---EKQGKVNAMVLI 347


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SAG2128   HTHFIS  65     4e-14    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 64.9 bits (158), Expect = 4e-14
Identities = 28/144 (19%), Positives = 63/144 (43%), Gaps = 9/144 (6%)

Query: 5 IIDDDPTITMILQDIIE-EDFNNTVVRVNNVSSKAYNELLIADVDIVLIDLLMPILDGVT 63
+ DDD I +L + + V +N ++ + + D D+V+ D++MP +
Sbjct: 8 VADDDAAIRTVLNQALSRAGY--DVRITSNAAT-LWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 64 LVQKIYKQRSDLKFIMISQVKDNDLRQEAYKAGIEFFINKPINIIEVKSVVKRVTDTI-- 121
L+ +I K R DL +++S +A + G ++ KP ++ E+ ++ R
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 122 ---EMQKKLNTIQNLLENTPSYQK 142
+++ L+ + + Q+
Sbjct: 125 RPSKLEDDSQDGMPLVGRSAAMQE 148


25  SAG2160  SAG2167  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SAG2160   2       15       -0.250999  ArgR family transcriptional regulator
SAG2161   -2      16       -0.329400  Crp/Fnr family transcriptional regulator
SAG2162   -1      18       0.286337   hypothetical protein
SAG2163   -1      18       0.489644   arginine deiminase
SAG2164   0       16       -0.191302  acetyltransferase
SAG2165   1       15       -0.303109  ornithine carbamoyltransferase
SAG2166   -1      15       -0.616387  arginine/ornithine antiporter
SAG2167   -1      11       -2.349164  carbamate kinase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SAG2160   ARGREPRESSOR  128    6e-41    Bacterial arginine repressor signature.
		>ARGREPRESSOR#Bacterial arginine repressor signature.

Length = 149

Score = 128 bits (324), Expect = 6e-41
Identities = 58/149 (38%), Positives = 95/149 (63%), Gaps = 4/149 (2%)

Query: 1 MNKKETRHQLIRSLVSETKVRTQHELRELLEKNGVSVTQATLSRDMKELNLIKVNESSDN 60
MNK + RH IR +++ ++ TQ EL ++L+K+G +VTQAT+SRD+KEL+L+KV N
Sbjct: 1 MNKGQ-RHIKIREIITANEIETQDELVDILKKDGYNVTQATVSRDIKELHLVKV---PTN 56

Query: 61 ATETYYEIHSISQKRWEERLRFYMEDALVMLFPVQNQIVLKTLPGLAQSFGSILDSILLP 120
Y + + + +L+ + DA V + + IVLKT+PG AQ+ G+++D++
Sbjct: 57 NGSYKYSLPADQRFNPLSKLKRSLMDAFVKIDSASHLIVLKTMPGNAQAIGALMDNLDWE 116

Query: 121 EILATVCGDDTCLIICNNSEEAQKCFEKL 149
EI+ T+CGDDT LIIC ++ + +K+
Sbjct: 117 EIMGTICGDDTILIICRTHDDTKVVQKKI 145


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SAG2163   ARGDEIMINASE  570    0.0      Bacterial arginine deiminase signature.
		>ARGDEIMINASE#Bacterial arginine deiminase signature.

Length = 409

Score = 570 bits (1471), Expect = 0.0
Identities = 193/408 (47%), Positives = 275/408 (67%), Gaps = 8/408 (1%)

Query: 6 PIHVFSEIGKLKKVMLHRPGKEIENLMPDYLERLLFDDIPFLEDAQKEHDAFAQALRNEG 65
PI++FSEIG+LKKV+LHRPG+E+ENL P ++ LFDDIP+LE A++EH+ FA L+N
Sbjct: 7 PINIFSEIGRLKKVLLHRPGEELENLTPFIMKNFLFDDIPYLEVARQEHEVFASILKNNL 66

Query: 66 VEVLYLENLAAESL-TNQEIREQFIDEYIGEANVRGRATKKAIRELLLNIKDNKELIEKT 124
VE+ Y+E+L +E L ++ + +FI ++I EA ++ T +++ + +I K
Sbjct: 67 VEIEYIEDLISEVLVSSVALENKFISQFILEAEIKTDFTINLLKDYFSS-LTIDNMISKM 125

Query: 125 MAGIQKSELPEIPSSEKGLTDLVESNYPFAIDPMPNLYFTRDPFATIGNGVSLNHMFSET 184
++G+ EL SS L DLV F IDPMPN+ FTRDPFA+IGNGV++N MF++
Sbjct: 126 ISGVVTEELKNYTSS---LDDLVNGANLFIIDPMPNVLFTRDPFASIGNGVTINKMFTKV 182

Query: 185 RNRETLYGKYIFTHHPEYGGKVPMVYEREETTRIEGGDELVLSKDVLAVGISQRTDAASI 244
R RET++ +YIF +HP Y VP+ R E +EGGDELVL+K +L +GIS+RT+A S+
Sbjct: 183 RQRETIFAEYIFKYHPVYKENVPIWLNRWEEASLEGGDELVLNKGLLVIGISERTEAKSV 242

Query: 245 EKLLVNIFKQNLGFKKVLAFEFANNRKFMHLDTVFTMVDYDKFTIHPEIEGDLRVYSVTY 304
EKL +++FK F +LAF+ NR +MHLDTVFT +DY FT + +Y +TY
Sbjct: 243 EKLAISLFKNKTSFDTILAFQIPKNRSYMHLDTVFTQIDYSVFTSFTSDDMYFSIYVLTY 302

Query: 305 ENQD--LHIEEEKGDLADLLAKNLGVEKVELIRCGGDNLVAAGREQWNDGSNTLTIAPGV 362
+HI++EK + D+L+ LG K+++I+C G +L+ REQWNDG+N L IAPG
Sbjct: 303 NPSSSKIHIKKEKARIKDVLSFYLG-RKIDIIKCAGGDLIHGAREQWNDGANVLAIAPGE 361

Query: 363 VIVYNRNTITNAILESKGLKLIKINGSELVRGRGGPRCMSMPFEREDL 410
+I Y+RN +TN + E G+K+ +I SEL RGRGGPRCMSMP RED+
Sbjct: 362 IIAYSRNHVTNKLFEENGIKVHRIPSSELSRGRGGPRCMSMPLIREDI 409


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SAG2164   AUTOINDCRSYN  30     0.002    Autoinducer synthesis protein signature.
		>AUTOINDCRSYN#Autoinducer synthesis protein signature.

Length = 216

Score = 30.2 bits (68), Expect = 0.002
Identities = 12/64 (18%), Positives = 22/64 (34%), Gaps = 7/64 (10%)

Query: 1 MPYQRAASLY-IRFSVFVIER-----NIKMEEEFDDNDEQDTIYAVLYDGKQPVSTGRFL 54
+ ++ L+ +R F +R EFD D +T Y + + RF+
Sbjct: 12 LSETKSGELFTLRKETFK-DRLNWAVQCTDGMEFDQYDNNNTTYLFGIKDNTVICSLRFI 70

Query: 55 PETQ 58

Sbjct: 71 ETKY 74


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SAG2167   CARBMTKINASE  412    e-148    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 412 bits (1061), Expect = e-148
Identities = 147/314 (46%), Positives = 211/314 (67%), Gaps = 6/314 (1%)

Query: 6 QKIVVALGGNAIL--STDASAKAQQEALINTSKSLVKLIKEGHDVIVTHGNGPQVGNLLL 63
+++V+ALGGNA+ S + + + T++ + ++I G++V++THGNGPQVG+LLL
Sbjct: 3 KRVVIALGGNALQQRGQKGSYEEMMDNVRKTARQIAEIIARGYEVVITHGNGPQVGSLLL 62

Query: 64 QQAASDSEKN-PAMPLDTCVAMTEGSIGFWLQNALNNELQEQGIDKEVATVVTQVIVDEK 122
A + PA P+D AM++G IG+ +Q AL NEL+++G++K+V T++TQ IVD+
Sbjct: 63 HMDAGQATYGIPAQPMDVAGAMSQGWIGYMIQQALKNELRKRGMEKKVVTIITQTIVDKN 122

Query: 123 DQAFTNPTKPIGPFLSEEDAKKQAQETGSKFKEDAGRGWRKVVPSPKPVGIKEASVIRRL 182
D AF NPTKP+GPF EE AK+ A+E G KED+GRGWR+VVPSP P G EA I++L
Sbjct: 123 DPAFQNPTKPVGPFYDEETAKRLAREKGWIVKEDSGRGWRRVVPSPDPKGHVEAETIKKL 182

Query: 183 VDSGVVVISAGGGGVPVIEDANTKALKGVEAVIDKDFASQTLSELVDADLFIVLTGVDNV 242
V+ GV+VI++GGGGVPVI + +KGVEAVIDKD A + L+E V+AD+F++LT V+
Sbjct: 183 VERGVIVIASGGGGVPVILEDG--EIKGVEAVIDKDLAGEKLAEEVNADIFMILTDVNGA 240

Query: 243 FVNFNKPNQEKLEEVTVSQMKQYITENQFAPGSMLPKVEAAIAFVENKPESRAIITSLEN 302
+ + ++ L EV V ++++Y E F GSM PKV AAI F+E RAII LE
Sbjct: 241 ALYYGTEKEQWLREVKVEELRKYYEEGHFKAGSMGPKVLAAIRFIEW-GGERAIIAHLEK 299

Query: 303 IDNVLAQNAGTQIV 316
L GTQ++
Sbjct: 300 AVEALEGKTGTQVL 313



 
Contact Sachin Pundhir for Bugs/Comments.