PredictBias

Identification of genomic and pathogenicity islands in prokaryotic genomes
 
A) Input parameters
Genome: 2256.gbk
Threshold dinucleotide bias: 2
Threshold codon bias: 4
Threshold %GC bias: 3
E-value (RPSBlast): 0.05
Genome (non-pathogenic): (none given)
 
B) Compare a potential GI or PAI in related non-pathogenic sp. (phylogenetic tree)
 
C) Potential GIs and PAIs in NC_009438
S.No  Start          End            Bias  Virulence  Insertion elements  Prediction
1     Sputcn32_0074  Sputcn32_0079  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_0074  0       20       4.737984   hypothetical protein
Sputcn32_0075  1       22       5.891302   MarR family transcriptional regulator
Sputcn32_0076  1       21       5.573382   siderophore-interacting protein
Sputcn32_0077  0       21       5.166174   imidazolonepropionase
Sputcn32_0078  0       18       4.132416   histidine utilization repressor
Sputcn32_0079  0       18       3.664897   urocanate hydratase
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0077  UREASE        43     1e-06    Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 43.2 bits (102), Expect = 1e-06
Identities = 22/52 (42%), Positives = 29/52 (55%), Gaps = 8/52 (15%)

Query: 352 TLNAAKALGIEDTVGSLVVGKQADFCLWDIATPAQLAYSYGVNPCKDVVKNG 403
T+N A A G+ +GSL VGK+AD LW+ PA +GV P V+ G
Sbjct: 410 TINPAIAHGLSHEIGSLEVGKRADLVLWN---PAF----FGVKP-DMVLLGG 453



Score = 32.8 bits (75), Expect = 0.002
Identities = 19/61 (31%), Positives = 29/61 (47%), Gaps = 6/61 (9%)

Query: 23 YGAITNAAIAVKDGKIAWLGPRSE---LPAFDVL---SIPVYRGKGGWITPGLIDAHTHL 76
+ I A I +KDG+IA +G P ++ V G+G +T G +D+H H
Sbjct: 80 HWGIVKADIGLKDGRIAAIGKAGNPDMQPGVTIIVGPGTEVIAGEGKIVTAGGMDSHIHF 139

Query: 77 I 77
I
Sbjct: 140 I 140
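
The statistics lines in the alignments above (Score, Expect, Identities) follow the usual BLAST text layout, so they can be collected into records and filtered. What follows is a minimal sketch, not part of PredictBias: the names Hsp, parse_hsps and EVALUE_CUTOFF are hypothetical, the 0.05 cutoff is taken from the "E-value (RPSBlast)" input parameter in section A, and treating it as "keep hits whose E-value is at or below the cutoff" is an assumption about how the tool applies it.

```python
import re
from dataclasses import dataclass

# Patterns for the two statistics lines that precede each alignment block.
SCORE_RE = re.compile(
    r"Score\s*=\s*([\d.]+)\s*bits\s*\((\d+)\),\s*Expect\s*=\s*([\d.eE+-]+)"
)
IDENT_RE = re.compile(r"Identities\s*=\s*(\d+)/(\d+)")

# Assumption: mirrors the "E-value (RPSBlast)" parameter shown in section A.
EVALUE_CUTOFF = 0.05


@dataclass
class Hsp:
    """One alignment block (hypothetical record type for this sketch)."""
    bits: float
    raw_score: int
    evalue: float
    identical: int
    aligned: int

    @property
    def percent_identity(self) -> float:
        # Identities = identical / aligned, e.g. 22/52 is about 42%.
        return 100.0 * self.identical / self.aligned


def parse_hsps(text: str) -> list[Hsp]:
    """Pair each 'Score = ...' line with the 'Identities = ...' line after it."""
    hsps = []
    pending = None
    for line in text.splitlines():
        m = SCORE_RE.search(line)
        if m:
            pending = (float(m.group(1)), int(m.group(2)), float(m.group(3)))
            continue
        m = IDENT_RE.search(line)
        if m and pending is not None:
            hsps.append(Hsp(*pending, int(m.group(1)), int(m.group(2))))
            pending = None
    return hsps


# Statistics lines copied from the UREASE hit for Sputcn32_0077.
sample = """Score = 43.2 bits (102), Expect = 1e-06
Identities = 22/52 (42%), Positives = 29/52 (55%), Gaps = 8/52 (15%)

Score = 32.8 bits (75), Expect = 0.002
Identities = 19/61 (31%), Positives = 29/61 (47%), Gaps = 6/61 (9%)
"""

hsps = parse_hsps(sample)
kept = [h for h in hsps if h.evalue <= EVALUE_CUTOFF]
for h in kept:
    print(f"{h.bits} bits, E={h.evalue:g}, identity={h.percent_identity:.0f}%")
```

The same function can be run over any of the later alignment sections, since only the Score and Identities lines are read and the alignment rows themselves are ignored.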


2     Sputcn32_0108  Sputcn32_0119  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_0108  0       17       3.155440   hypothetical protein
Sputcn32_0109  -2      21       4.179776   glyoxalase/bleomycin resistance
Sputcn32_0110  -3      22       4.224450   hypothetical protein
Sputcn32_0111  -3      23       4.376900   LysR family transcriptional regulator
Sputcn32_0112  -2      21       3.802005   3-ketoacyl-ACP reductase
Sputcn32_0113  -2      20       2.413678   hypothetical protein
Sputcn32_0114  -2      16       2.106928   cytochrome c biogenesis protein, transmembrane
Sputcn32_0115  -1      15       0.675324   DSBA oxidoreductase
Sputcn32_0116  0       13       -0.107554  redoxin domain-containing protein
Sputcn32_0117  0       13       -0.146681  hypothetical protein
Sputcn32_0118  0       13       -0.021047  mechanosensitive ion channel protein MscS
Sputcn32_0119  2       14       0.230222   lysine exporter protein LysE/YggA
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0112  DHBDHDRGNASE  126    1e-37    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 126 bits (318), Expect = 1e-37
Identities = 85/257 (33%), Positives = 126/257 (49%), Gaps = 15/257 (5%)

Query: 3 SSNNLQGKVAFVQGGSRGIGAAIVKRLASDGAAVAFTYVSSEAQSQLLVDEVITHGGQAI 62
++ ++GK+AF+ G ++GIG A+ + LAS GA +A + E + +V + A
Sbjct: 2 NAKGIEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKL-EKVVSSLKAEARHAE 60

Query: 63 AIKADSTEPEAIRRAIRETKAHFGGLDIIVNNAGTLIWDSIENLTLEDWERTINTNVRSV 122
A AD + AI + G +DI+VN AG L I +L+ E+WE T + N V
Sbjct: 61 AFPADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGV 120

Query: 123 FVASQEAALHMND--GGRIINIGSTNAERIPFVGGAIYGMSKSALVGLAKGLARDLGPRA 180
F AS+ + +M D G I+ +GS N +P A Y SK+A V K L +L
Sbjct: 121 FNASRSVSKYMMDRRSGSIVTVGS-NPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYN 179

Query: 181 ITVNNIQPGPVDTDMN-----PDNGD------SSEPVKAMGALGRFGKAEEIASFVAFIA 229
I N + PG +TDM +NG S E K L + K +IA V F+
Sbjct: 180 IRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLV 239

Query: 230 GPEAGYITGASLMIDGG 246
+AG+IT +L +DGG
Sbjct: 240 SGQAGHITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0114  IGASERPTASE   33     0.005    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 33.1 bits (75), Expect = 0.005
Identities = 32/147 (21%), Positives = 50/147 (34%), Gaps = 17/147 (11%)

Query: 28 TGWL-VNDNHPPAKVRFMLTGEVDPATNTLPAVLEVQLEGDWKTYWRSPGEGGIAPTIKW 86
G L V + RF+LTG + + + L G + R GI+ T K
Sbjct: 664 NGNLNVTFKGKSEQNRFLLTGGTNLNGDLTVEKGTLFLSGRPTPHARD--IAGISSTKKD 721

Query: 87 DGSHNLQQV----DWRWPAPEEFSLLGLQTFGYKGNTTF----PLTIKVDDIAASTQLRG 138
+V DW T GN + + +I AS + +
Sbjct: 722 PHFAENNEVVVEDDW------INRNFKATTMNVTGNASLYSGRNVANITSNITASNKAQV 775

Query: 139 KVTLSTCTTICVLTDYQMSLDFTPNAL 165
+ T T+CV +DY + T + L
Sbjct: 776 HIGYKTGDTVCVRSDYTGYVTCTTDKL 802


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0117  RTXTOXIND     29     0.007    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 28.6 bits (64), Expect = 0.007
Identities = 8/53 (15%), Positives = 20/53 (37%), Gaps = 5/53 (9%)

Query: 78 AKIQALEAKLVERQNELTQTVKEGRPMDKVSKKQAKVAEVEAELAKAREELER 130
++I + + + + +DK+ + + + ELAK E +
Sbjct: 280 SEILSAKEEYQLVTQLFKNEI-----LDKLRQTTDNIGLLTLELAKNEERQQA 327


3     Sputcn32_0150  Sputcn32_0230  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_0150  3       20       -1.087216  RND family efflux transporter MFP subunit
Sputcn32_0151  4       23       -3.392152  copper-transporting ATPase domain-containing
Sputcn32_0152  3       21       -3.756349  hypothetical protein
Sputcn32_0153  3       22       -5.084004  phage integrase family site specific
Sputcn32_0154  4       23       -5.675961  site-specific DNA-methyltransferase, type I
Sputcn32_0155  3       21       -6.174030  ATPase AAA
Sputcn32_0156  2       19       -5.078923  hypothetical protein
Sputcn32_0157  2       19       -5.040015  HsdR family type I site-specific
Sputcn32_0158  1       20       -5.535566  KAP P-loop domain-containing protein
Sputcn32_0159  2       19       -4.401284  deoxyguanosinetriphosphate
Sputcn32_0160  1       16       -2.586070  restriction modification system DNA specificity
Sputcn32_0161  1       19       0.071907   type I restriction-modification system, M
Sputcn32_0162  1       23       1.369924   hypothetical protein
Sputcn32_0163  2       23       2.585023   hypothetical protein
Sputcn32_0164  3       26       3.261725   hypothetical protein
Sputcn32_0165  3       26       3.372344   ATPase AAA
Sputcn32_0166  3       27       3.898729   putative integrase protein
Sputcn32_0167  3       25       3.604986   resolvase domain-containing protein
Sputcn32_0168  2       22       3.555933   putative mercuric reductase
Sputcn32_0169  1       21       2.958912   mercuric transport periplasmic protein
Sputcn32_0170  -1      20       2.537064   mercuric transporter MerT
Sputcn32_0171  -1      22       1.379420   MerR family transcriptional regulator
Sputcn32_0172  0       21       0.290532   DNA-repair protein
Sputcn32_0173  -1      24       1.770718   integrase catalytic subunit
Sputcn32_0174  0       23       -2.572284  transposase IS3/IS911 family protein
Sputcn32_0175  2       24       -2.889081  putative prophage repressor
Sputcn32_0176  4       25       -3.130488  hypothetical protein
Sputcn32_0177  2       21       -2.144492  prevent-host-death family protein
Sputcn32_0178  1       19       -1.935516  phage integrase family protein
Sputcn32_0179  4       25       -4.913925  hypothetical protein
Sputcn32_0180  6       25       -2.309696  hypothetical protein
Sputcn32_0181  -1      15       -0.114908  hypothetical protein
Sputcn32_0182  -1      15       0.782762   putative transcriptional regulator
Sputcn32_0183  -1      16       0.499773   hypothetical protein
Sputcn32_0184  0       15       0.366332   hypothetical protein
Sputcn32_0185  0       14       0.290532   hypothetical protein
Sputcn32_0186  0       15       0.116123   sulfate transporter
Sputcn32_0187  2       18       -2.613698  UspA domain-containing protein
Sputcn32_0188  1       17       -2.629672  TraR/DksA family transcriptional regulator
Sputcn32_0189  1       18       -1.681074  resolvase domain-containing protein
Sputcn32_0190  1       18       0.505928   prevent-host-death family protein
Sputcn32_0191  1       19       1.091410   plasmid stabilization system protein
Sputcn32_0192  1       25       3.740110   transposase Tn3 family protein
Sputcn32_0193  4       42       9.013338   cointegrate resolution protein T
Sputcn32_0194  4       41       8.598707   MerR family transcriptional regulator
Sputcn32_0195  2       40       8.126906   cation efflux system permease
Sputcn32_0196  2       39       7.346193   lipoprotein signal peptidase
Sputcn32_0197  3       40       7.550920   transposase, IS204/IS1001/IS1096/IS1165 family
Sputcn32_0198  2       30       5.208991   iron-containing alcohol dehydrogenase
Sputcn32_0199  3       29       5.359081   aldehyde dehydrogenase
Sputcn32_0200  4       31       6.054015   hypothetical protein
Sputcn32_0201  4       32       7.080086   ethanolamine utilization protein
Sputcn32_0202  4       31       5.979273   propanediol utilization
Sputcn32_0203  3       29       5.610913   propanediol utilization protein
Sputcn32_0204  2       30       6.254902   microcompartments protein
Sputcn32_0205  2       30       6.075603   microcompartments protein
Sputcn32_0206  1       30       4.930283   MIP family channel protein
Sputcn32_0207  2       28       4.194569   glycyl-radical activating family protein
Sputcn32_0208  2       28       4.033887   pyruvate formate-lyase
Sputcn32_0209  1       28       4.114120   microcompartments protein
Sputcn32_0210  2       29       3.358462   microcompartments protein
Sputcn32_0211  2       29       3.675243   PTS system mannose/fructose/sorbose family
Sputcn32_0212  2       31       3.993179   PTS system mannose/fructose/sorbose family
Sputcn32_0213  4       30       4.699134   PTS system sorbose subfamily transporter subunit
Sputcn32_0214  3       30       2.962253   transposase, IS4 family protein
Sputcn32_0215  3       29       2.652846   IS1 transposase
Sputcn32_0216  2       27       3.967070   insertion element protein
Sputcn32_0217  2       27       3.672606   phage integrase family protein
Sputcn32_0218  3       38       7.088620   IS630 orf
Sputcn32_0219  2       39       6.596430   helix-turn-helix domain-containing protein
Sputcn32_0220  -1      35       7.574774   MerR family transcriptional regulator
Sputcn32_0221  -1      31       6.615080   cation efflux system permease
Sputcn32_0222  -1      29       6.173770   lipoprotein signal peptidase
Sputcn32_0223  0       29       6.102502   transposase, IS204/IS1001/IS1096/IS1165 family
Sputcn32_0224  2       23       4.206037   hypothetical protein
Sputcn32_0225  2       23       4.000029   CzcA family heavy metal efflux protein
Sputcn32_0226  3       21       2.755027   biotin/lipoyl attachment domain-containing
Sputcn32_0227  4       22       2.967685   outer membrane efflux protein
Sputcn32_0228  4       20       0.421848   hypothetical protein
Sputcn32_0229  2       20       -0.089803  cation efflux system permease
Sputcn32_0230  2       20       -0.613821  MerR family transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0150  RTXTOXIND     36     3e-04    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 36.0 bits (83), Expect = 3e-04
Identities = 29/153 (18%), Positives = 54/153 (35%), Gaps = 24/153 (15%)

Query: 301 QLAQSRQTQYRVPFYAQQDGVVETLSIR-DGMYIEPSTEAMSLV-DLSKVWVIADVFENE 358
+LA++ + Q A V+ L + +G + + M +V + + V A V +
Sbjct: 317 ELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKD 376

Query: 359 QSWIAKGQKAEVAVPAMNIR---GIEGTIDYIYPE------LDPVTRSLRVRVVLNNTNV 409
+I GQ A + V A + G + I + L V + + N +
Sbjct: 377 IGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDAIEDQRLGLVFNVII-SIEENCLST 435

Query: 410 DLRPKTLAKVSLFGGPNRDVLVIPQEALIQTGK 442
+ + L G + A I+TG
Sbjct: 436 GNKN-----IPLSSG-------MAVTAEIKTGM 456



Score = 35.2 bits (81), Expect = 6e-04
Identities = 29/117 (24%), Positives = 46/117 (39%), Gaps = 17/117 (14%)

Query: 202 LWKFVETVGQVDYNESQITH------IHARVTGWVEKLMVKSMGDTVKKGQLLYELYSP- 254
+ + V V ++TH I V++++VK G++V+KG +L +L +
Sbjct: 73 ILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKE-GESVRKGDVLLKLTALG 131

Query: 255 ---DLVNAQDDYLLALDTAKTSGNPRYQDLVRRAGL-RLSFLGFNDEQIKQLAQSRQ 307
D + Q L A RYQ L R L +L L DE Q +
Sbjct: 132 AEADTLKTQSSLLQARLEQT-----RYQILSRSIELNKLPELKLPDEPYFQNVSEEE 183


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0157  FLGHOOKAP1    31     0.028    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 31.1 bits (70), Expect = 0.028
Identities = 22/111 (19%), Positives = 38/111 (34%), Gaps = 7/111 (6%)

Query: 712 NRLHSKKQFGYLIDYRGILKELDTTIEKYQDLAERTQGGFDIDD--LKGLYNRMDTEYKK 769
+ L + G + G+ +E D I A+ G + + N + T
Sbjct: 45 STLGAGGWVGNGVYVSGVQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSS 104

Query: 770 LPGLHSDLWAIFDTVKNKQDGPALRQALAPKVNDVDGKLVDTNLKKRDDFY 820
L D + T+ + + PA RQAL K + + K D +
Sbjct: 105 LATQMQDFFTSLQTLVSNAEDPAARQALIGK-----SEGLVNQFKTTDQYL 150


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0180  MALTOSEBP     30     0.003    Maltose binding protein signature.
		>MALTOSEBP#Maltose binding protein signature.

Length = 396

Score = 30.1 bits (67), Expect = 0.003
Identities = 29/108 (26%), Positives = 47/108 (43%), Gaps = 16/108 (14%)

Query: 39 SGDNGSLHQFYVDNYVGRDFSVDRASGVMAG------ALKNSYVTEPQVIDFGSSENAYK 92
+ D G ++ Y +D VD A G AG +KN ++ D+ +E A+
Sbjct: 188 AADGGYAFKYENGKYDIKDVGVDNA-GAKAGLTFLVDLIKNKHMNADT--DYSIAEAAFN 244

Query: 93 VVATMRVDQGAGAGSNV------YALTI-EEFESGPEKPFVFLMNATV 133
T G A SN+ Y +T+ F+ P KPFV +++A +
Sbjct: 245 KGETAMTINGPWAWSNIDTSKVNYGVTVLPTFKGQPSKPFVGVLSAGI 292


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0193  RTXTOXIND     35     2e-04    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 34.8 bits (80), Expect = 2e-04
Identities = 9/99 (9%), Positives = 34/99 (34%), Gaps = 5/99 (5%)

Query: 44 QQLQAELRLSHQALSVKQQECTTLKAQTQQQSAELQHATQSVSKIEQQLLGIQNSQQQTE 103
+++ L + S Q + + ++ AE +++ E ++ S+
Sbjct: 182 EEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENL-SRVEKSRLDDF 240

Query: 104 QKLYRKD----TELNRLQKQHEDLQQQYAEAAAKVASLQ 138
L K + + ++ + + +++ ++
Sbjct: 241 SSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIE 279


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0225  ACRIFLAVINRP  751    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 751 bits (1940), Expect = 0.0
Identities = 220/1053 (20%), Positives = 426/1053 (40%), Gaps = 49/1053 (4%)

Query: 5 MLRLAIARRYLFLTLTLLIIAIGSWSYQQLPIDAVPDITNVQVQINTAAPGYSPLEAEQR 64
M I R L ++++ G+ + QLP+ P I V ++ PG +
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 65 ITYPVETALYGLPNLSYTRSLS-RYGLSQVTVVFEEGTDIYFARNLINTRLGAIKDMLPE 123
+T +E + G+ NL Y S S G +T+ F+ GTD A+ + +L +LP+
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 124 GIEPEMGPISTGLGEIFMYTVQAKPGALQQNGSPYDAMALREIQDWIIKPQLAQVKGVVE 183
++ + + M + + + +K L+++ GV +
Sbjct: 121 EVQQQGISVEKSSSSYLMVA------GFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGD 174

Query: 184 VNSIGGYNKQYHVLPDPLKLLNYGLSIKDIELALQANNDNRGAGYI------EREGMQLL 237
V G + D L Y L+ D+ L+ ND AG + + +
Sbjct: 175 VQLFGA-QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNAS 233

Query: 238 VRSPGQLTSLDDIANVII-TQYDTIPVKLSDVADVAIGKELRTGAATQDGKEAVLGTAMM 296
+ + + + ++ V + D V+L DVA V +G E A +GK A +
Sbjct: 234 IIAQTRFKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKL 293

Query: 297 LIDENSRTVARDVAQKLEQIKSSLPEGIIAEAVYDRTTLVDKAIATVSKNLLEGALLVIV 356
N+ A+ + KL +++ P+G+ YD T V +I V K L E +LV +
Sbjct: 294 ATGANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFL 353

Query: 357 VLFILLGNLRGALITAAVIPLSMLMTITGMVQAGVSANLMSLG--ALDFGLIVDGTVIIV 414
V+++ L N+R LI +P+ +L T + G S N +++ L GL+VD +++V
Sbjct: 354 VMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVV 413

Query: 415 ENAVRRLAQAQHNGRIQPLKERLNTVYLATAEVIRPSLFGVAIITIVYMPIFSLTGVEGK 474
EN R + + + + + +++ + +++ V++P+ G G
Sbjct: 414 ENVERVMMEDKLPPK--------EATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGA 465

Query: 475 MFHPMAATVVIALLSAMVLSLTIVPAAVAVFLNGKISEKESA----------VIRSAKTL 524
++ + T+V A+ +++++L + PA A L +E +
Sbjct: 466 IYRQFSITIVSAMALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNH 525

Query: 525 YAPLLALALKWRALVIGLASALVGVCLWLATTLGSEFVPQLDEGDIALHAMRIPGTGLEQ 584
Y + L + + + +V + L L S F+P+ D+G G E+
Sbjct: 526 YTNSVGKILGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQER 585

Query: 585 -AVAMQEILEQKIKTFAEVDKVFARIGTPEVATDPMPPNVADNFVILKPRSEWPNPDKTK 643
+ ++ + +K E V + + N FV LKP E + +
Sbjct: 586 TQKVLDQVTDYYLK--NEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSA 643

Query: 644 AQLVTEMEAALATLPGNNYEFTQPIQM-RFNELISGVRADLG-IKVFGDDLDQLVTSANQ 701
++ + L + F P M EL + D I G D L + NQ
Sbjct: 644 EAVIHRAKMELGKIRDG---FVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQ 700

Query: 702 ILQAVNKVQGA-ADTKVEQVTGLPTLSVIPNRTALARYGLNVVELQDWVAAAIGGTSAGI 760
+L + + + + + ++ G+++ ++ ++ A+GGT
Sbjct: 701 LLGMAAQHPASLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVND 760

Query: 761 LYEGDRRFELIVRLPETLRRDLDKLAVLPVPLPNGDFVPLQEVATLDLSPAPAQISRENG 820
+ R +L V+ R + + L V NG+ VP T ++ R NG
Sbjct: 761 FIDRGRVKKLYVQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNG 820

Query: 821 KRRVVVTANVRGRDLGSFVEEVKAQINRDVT-LPAGYWLDYGGTFEQLESASQRLSIVVP 879
+ + G+ + A + + LPAG D+ G Q + + +V
Sbjct: 821 LPSMEIQGEAAP---GTSSGDAMALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVA 877

Query: 880 VTLLLILGILVMAFASFKDALIIFSGVPLALTGGVLALYLRGMPLSISAGIGFIALSGVA 939
++ +++ L + S+ + + VPL + G +LA L + +G + G++
Sbjct: 878 ISFVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLS 937

Query: 940 VLNGLVMLSFIRDLWREKG-DLLLAITEGALTRLRPVLMTALVASLGFVPMAINIGTGAE 998
N ++++ F +DL ++G ++ A RLRP+LMT+L LG +P+AI+ G G+
Sbjct: 938 AKNAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSG 997

Query: 999 VQRPLATVVIGGIISSTLLTLLVLPVLYHWVHK 1031
Q + V+GG++S+TLL + +PV + + +
Sbjct: 998 AQNAVGIGVMGGMVSATLLAIFFVPVFFVVIRR 1030



Score = 77.2 bits (190), Expect = 2e-16
Identities = 64/354 (18%), Positives = 139/354 (39%), Gaps = 17/354 (4%)

Query: 699 ANQILQAVNKVQGAADTKVEQVTGLPTLSVIPNRTALARYGLNVVELQDWVAAA----IG 754
A+ + ++++ G D V+ + + + L +Y L V++ + +
Sbjct: 159 ASNVKDTLSRLNGVGD--VQLFGAQYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAA 216

Query: 755 GTSAGILYEGDRRFELIVRLPETLRRDLDKLAVLPVPLPNGDFVPLQEVATLDLSPAPAQ 814
G G ++ + + + V +G V L++VA ++L
Sbjct: 217 GQLGGTPALPGQQLNASIIAQTRFKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYN 276

Query: 815 I-SRENGKRRVVVTANVR-GRDLGSFVEEVKAQINR-DVTLPAGYWLDYGGTFEQLESAS 871
+ +R NGK + + G + + +KA++ P G + ++
Sbjct: 277 VIARINGKPAAGLGIKLATGANALDTAKAIKAKLAELQPFFPQG--MKVLYPYDTTPFVQ 334

Query: 872 QRLSIVVPVTLL--LILGILVMAF--ASFKDALIIFSGVPLALTGGVLALYLRGMPLSIS 927
+ VV TL ++L LVM + + LI VP+ L G L G ++
Sbjct: 335 LSIHEVV-KTLFEAIMLVFLVMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTL 393

Query: 928 AGIGFIALSGVAVLNGLVMLSFIRDLWREKGDLLLAITEGALTRLR-PVLMTALVASLGF 986
G + G+ V + +V++ + + E TE ++++++ ++ A+V S F
Sbjct: 394 TMFGMVLAIGLLVDDAIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVF 453

Query: 987 VPMAINIGTGAEVQRPLATVVIGGIISSTLLTLLVLPVLYHWVHKNDNRKQQTE 1040
+PMA G+ + R + ++ + S L+ L++ P L + K + +
Sbjct: 454 IPMAFFGGSTGAIYRQFSITIVSAMALSVLVALILTPALCATLLKPVSAEHHEN 507


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0226  RTXTOXIND     38     4e-05    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 38.3 bits (89), Expect = 4e-05
Identities = 22/128 (17%), Positives = 42/128 (32%), Gaps = 23/128 (17%)

Query: 166 IKAPIDGVVIQRSANV--GEIAQQQTLFSIA-DFGSLWADFRVYPSQQVAVATGQTVVIM 222
I+AP+ V Q + G + +TL I + +L V + GQ +I
Sbjct: 330 IRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIK 389

Query: 223 AADT-------EINGVIAHIIPSL-----TQPYQLARVKLDK----RDGK---LSSGQLI 263
+ + G + +I + +++ K LSSG +
Sbjct: 390 -VEAFPYTRYGYLVGKVKNINLDAIEDQRLGLVFNVIISIEENCLSTGNKNIPLSSGMAV 448

Query: 264 EGAVISGE 271
+ +G
Sbjct: 449 TAEIKTGM 456


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0230  BACYPHPHTASE  28     0.013    Salmonella/Yersinia modular tyrosine phosphatase si...
		>BACYPHPHTASE#Salmonella/Yersinia modular tyrosine phosphatase signature.

Length = 468

Score = 28.2 bits (62), Expect = 0.013
Identities = 14/68 (20%), Positives = 30/68 (44%)

Query: 61 ADIERLLFIKHCRSLDLSLAEIRQLLSLNDSPMSQCDDVNQMIEQHIEQVERRIADLTQL 120
D +L + HCR+ A++ + +NDS SQ + + + +++ + QL
Sbjct: 392 GDDSKLRPVIHCRAGVGRTAQLIGAMCMNDSRNSQLSVEDMVSQMRVQRNGIMVQKDEQL 451

Query: 121 NKQLKALR 128
+ +K
Sbjct: 452 DVLIKLAE 459


4     Sputcn32_0243  Sputcn32_0275  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_0243  2       14       -0.862950  ISSod10, transposase OrfB
Sputcn32_0244  1       13       -1.332068  ISSod10, transposase OrfA
Sputcn32_0246  2       13       -1.677092  transposase, IS4 family protein
Sputcn32_0247  3       16       -1.590930  hypothetical protein
Sputcn32_0248  3       15       -1.375073  hypothetical protein
Sputcn32_0249  3       16       -2.619147  CopA family copper resistance protein
Sputcn32_0250  0       24       -6.604089  copper resistance B
Sputcn32_0251  1       24       -7.175702  copper resistance protein CopC
Sputcn32_0252  2       20       -4.229437  copper resistance D domain-containing protein
Sputcn32_0253  3       25       -2.624749  hypothetical protein
Sputcn32_0254  3       26       -0.284789  hypothetical protein
Sputcn32_0255  3       25       1.106541   outer membrane porin
Sputcn32_0256  4       31       4.260538   hypothetical protein
Sputcn32_0257  3       31       5.388831   sensor kinase CusS
Sputcn32_0258  2       30       5.473849   DNA-binding transcriptional activator CusR
Sputcn32_0259  3       29       6.310613   copper/silver efflux system outer membrane
Sputcn32_0260  4       28       6.210821   copper-binding protein
Sputcn32_0261  4       29       6.104173   copper/silver efflux system membrane fusion
Sputcn32_0262  4       30       6.324657   CzcA family heavy metal efflux protein
Sputcn32_0263  2       29       5.034850   hypothetical protein
Sputcn32_0264  -1      20       2.864557   heavy metal translocating P-type ATPase
Sputcn32_0265  -2      16       -1.776736  hypothetical protein
Sputcn32_0266  -3      17       -2.824310  peptidase M23B
Sputcn32_0267  -1      19       -4.244806  hypothetical protein
Sputcn32_0269  -2      19       -4.234525  plastocyanin domain-containing protein
Sputcn32_0270  -2      17       -3.629567  copper-translocating P-type ATPase
Sputcn32_0271  1       22       -4.919584  MerR family transcriptional regulator
Sputcn32_0272  2       23       -3.917740  hypothetical protein
Sputcn32_0273  3       23       -2.963359  hypothetical protein
Sputcn32_0275  2       23       -1.874075  RNA-directed DNA polymerase
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0258  HTHFIS        89     1e-22    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 88.7 bits (220), Expect = 1e-22
Identities = 37/117 (31%), Positives = 61/117 (52%)

Query: 2 KILIVEDEIKTGEYLSKGLREAGFVVDHADNGLTGYHLAMTAEYDLVILDIMLPDVNGWD 61
IL+ +D+ L++ L AG+ V N T + + DLV+ D+++PD N +D
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 IIRMLRTAGKGMPVLLLTALGTIEHRVKGLELGADDYLVKPFAFAELLARVKTLLRR 118
++ ++ A +PVL+++A T +K E GA DYL KPF EL+ + L
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0259  RTXTOXIND     35     6e-04    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 35.2 bits (81), Expect = 6e-04
Identities = 19/157 (12%), Positives = 44/157 (28%), Gaps = 9/157 (5%)

Query: 294 ADANIGAARAAFFPSITLTSGLSASSTELSSLFTPGSGMWNFIPKIDIPVFNAGRNKANL 353
A+A+ +++ + + S + P + + ++ R + +
Sbjct: 132 AEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLI 191

Query: 354 KLAEIRQQQSVVNYEQKIQSAFKDVSDTFALRDSLSQQLESQQRYLDSLQITLQRARGLY 413
K Q E + + A + ++ LD L
Sbjct: 192 KEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYENLSRVEKSRLDDFSS-------LL 244

Query: 414 ASGAVSYIEVLDAERSLFATQQTILDLTYSRQVNEIN 450
A++ VL+ E + Y Q+ +I
Sbjct: 245 HKQAIAKHAVLEQENKYVEAVNEL--RVYKSQLEQIE 279


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0262  ACRIFLAVINRP  689    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 689 bits (1779), Expect = 0.0
Identities = 211/1060 (19%), Positives = 435/1060 (41%), Gaps = 54/1060 (5%)

Query: 1 MIEWIIRRSVANRFLVMMGALFLSIWGTWTIINTPVDALPDLSDVQVIIKTSYPGQAPQI 60
M + IRR + A+ L + G I+ PV P ++ V + +YPG Q
Sbjct: 1 MANFFIRR----PIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQT 56

Query: 61 VENQVTYPLTTTMLSVPGAKTVRGFSQ-FGDSYVYVIFEDGTDLYWARSRVLEYLNQVQG 119
V++ VT + M + + S G + + F+ GTD A+ +V L
Sbjct: 57 VQDTVTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATP 116

Query: 120 KLPAGVSS-EIGPDATGVGWIFEYALVDRSGKHDLSELRSLQDWFLKFELKTIPNVAEVA 178
LP V I + + ++ V + ++ +K L + V +V
Sbjct: 117 LLPQEVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQ 176

Query: 179 SVGGVVKQYQIQLDPVKLTQYGISLPEVKQALSSSNQEAGG------SSVEIAESEYMVR 232
G +I LD L +Y ++ +V L N + ++ + +
Sbjct: 177 LFGAQ-YAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASII 235

Query: 233 ASGYLQTIDDFKNIVLKTGENGVPVYLRDVARIQMGPEMRRGIAELNGQGEVAGGVVILR 292
A + ++F + L+ +G V L+DVAR+++G E IA +NG+ AG + L
Sbjct: 236 AQTRFKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGK-PAAGLGIKLA 294

Query: 293 SGKNAREVITAVRDKLDTLKASLPEGVEIVTTYDRSQLIDRAIDNLSYKLLEEFIVVAVV 352
+G NA + A++ KL L+ P+G++++ YD + + +I + L E ++V +V
Sbjct: 295 TGANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLV 354

Query: 353 CALFLWHVRSALVAIISLPLGLCIAFIVMHFQGLNANIMSLGGIAIAVGAMVDAAIVMIE 412
LFL ++R+ L+ I++P+ L F ++ G + N +++ G+ +A+G +VD AIV++E
Sbjct: 355 MYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVE 414

Query: 413 NAHKRLEEWDHQHPGEQIDNATRWKVITDASVEVGPALFISLLIITLSFIPIFTLEGQEG 472
N + + E D + + ++ AL ++++ FIP+ G G
Sbjct: 415 NVERVMME----------DKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTG 464

Query: 473 RLFGPLAFTKTYSMAGAAILAIIVIPILMGFWIRGKIPAETSNPLNRL----------LI 522
++ + T +MA + ++A+I+ P L ++ + AE +
Sbjct: 465 AIYRQFSITIVSAMALSVLVALILTPALCATLLKP-VSAEHHENKGGFFGWFNTTFDHSV 523

Query: 523 KAYHPLLLRVLHWPKTTLLVAALSIFTVIWPLSQVGGEFLPKINEGDLLYMPSTLPGVSP 582
Y + ++L LL+ AL + ++ ++ FLP+ ++G L M G +
Sbjct: 524 NHYTNSVGKILGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQ 583

Query: 583 GEAAALLQTTDKLIKT--VPEVASVFGKTGKAETATDSAPLEMMETTIQLKPEDQW-RPG 639
+L V SVF G + + + LKP ++
Sbjct: 584 ERTQKVLDQVTDYYLKNEKANVESVFTVNGFSFSGQAQN---AGMAFVSLKPWEERNGDE 640

Query: 640 MTIDKIIEELDRTVRLPGLANLWVPPIRNRIDMLSTGIKSPIGIKVSGTVLSDIDVTAQS 699
+ + +I + + + +++ + I +G +
Sbjct: 641 NSAEAVIHRAKMELGKIRDGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQ 700

Query: 700 IEAVAKTVPG-VVSALAERLEGGRYIDVDINREKASRYGMTVGDVQLFVSSAIGGAMVGE 758
+ +A P +VS LE +++++EKA G+++ D+ +S+A+GG V +
Sbjct: 701 LLGMAAQHPASLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVND 760

Query: 759 TVEGVARYPINIRYPQDYRNSPQALREMPILTPMKQQITLGDVADIKVVSGPTMLKTENA 818
++ + ++ +R P+ + ++ + + + + V G L+ N
Sbjct: 761 FIDRGRVKKLYVQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNG 820

Query: 819 RPASWIYVDARGRDMVSVVNDIKTAISEKVKLRPGTSVAFSGQFELLEHANKKLKLMVPM 878
P+ I +A + + KL G ++G + + +V +
Sbjct: 821 LPSMEIQGEAAPGTSSGDAMALMENL--ASKLPAGIGYDWTGMSYQERLSGNQAPALVAI 878

Query: 879 TVMIIFILLYLAFRRVDEALLILMSLPFALVGGIWFLYWQGFHMSVATGTGFIALAGVAA 938
+ +++F+ L + + +++ +P +VG + V G + G++A
Sbjct: 879 SFVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSA 938

Query: 939 EFGVVMLMYLRHAIEAHPELSRKETFTPEGLDEALYHGAVLRVRPKAMTVAVIVAGLLPI 998
+ ++++ + + +L KE EA +R+RP MT + G+LP+
Sbjct: 939 KNAILIVEFAK-------DLMEKEGKGVV---EATLMAVRMRLRPILMTSLAFILGVLPL 988

Query: 999 LWGTGAGSEVMSRIAAPMIGGMITAPLLSLFIIPAAYKLI 1038
GAGS + + ++GGM++A LL++F +P + +I
Sbjct: 989 AISNGAGSGAQNAVGIGVMGGMVSATLLAIFFVPVFFVVI 1028


5     Sputcn32_0291  Sputcn32_0313  Y     Y          N                   Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_0291  2       17       -2.977522  hypothetical protein
Sputcn32_0292  3       22       -3.197715  RNA polymerase factor sigma-32
Sputcn32_0293  4       25       -3.747508  fimbrial protein
Sputcn32_0295  5       18       -2.104479  hypothetical protein
Sputcn32_0296  6       17       -1.152783  hypothetical protein
Sputcn32_0297  5       16       -1.287120  fimbrial subunit
Sputcn32_0298  3       13       -2.652263  fimbrial protein
Sputcn32_0299  3       13       -3.199439  pili assembly chaperone
Sputcn32_0300  3       15       -3.740498  fimbrial biogenesis outer membrane usher
Sputcn32_0301  2       19       -5.707695  fimbrial subunit
Sputcn32_0302  2       17       -5.322534  pili assembly chaperone
Sputcn32_0303  1       17       -4.594739  multi-sensor hybrid histidine kinase
Sputcn32_0304  0       15       -3.391233  two component LuxR family transcriptional
Sputcn32_0305  0       11       -1.655885  response regulator receiver modulated
Sputcn32_0306  1       13       -1.329553  hypothetical protein
Sputcn32_0307  0       12       -0.887329  o-succinylbenzoate--CoA ligase
Sputcn32_0308  0       12       -1.128308  O-succinylbenzoate synthase
Sputcn32_0309  1       13       -1.241363  alpha/beta hydrolase fold domain-containing
Sputcn32_0310  1       12       -1.781292  2-succinyl-5-enolpyruvyl-6-hydroxy-3-
Sputcn32_0311  0       14       -3.407613  LysR family transcriptional regulator
Sputcn32_0312  3       15       -1.254921  hypothetical protein
Sputcn32_0313  2       15       -0.267909  formate-dependent nitrite reductase
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0300  PF00577       702    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 702 bits (1814), Expect = 0.0
Identities = 269/867 (31%), Positives = 418/867 (48%), Gaps = 52/867 (5%)

Query: 14 LRLLPLLLLSAPLYADDMEYTFDEALLLGPGYNSDYLQRLANGPDILPGQYQVDLFINGR 73
+RL +A E F+ L L R NG ++ PG Y+VD+++N
Sbjct: 28 VRLFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPGTYRVDIYLNNG 87

Query: 74 FVSRELLQFIELSDNK-VEPCIEEAIWTKAGVRPEFMAPAPITGG--C-SLGYTVKGNNF 129
+++ + F + + PC+ A G+ ++ + C L +
Sbjct: 88 YMATRDVTFNTGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDACVPLTSMIHDATA 147

Query: 130 SFDSGSLRLELTVPQAYMDNKPRGYISPDEWDSGETALFVNYSGNYYRSESSYGRRSTSE 189
D G RL LT+PQA+M N+ RGYI P+ WD G A +NY+ + ++ G S
Sbjct: 148 QLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSVQNRIGG--NSH 205

Query: 190 SGYLGLSSGFNLGLWRIRNQSTYQYRDSGFGSDSQ--FDSLRTYATRALPFWQSELSLGE 247
YL L SG N+G WR+R+ +T+ Y S S S+ + + T+ R + +S L+LG+
Sbjct: 206 YAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRSRLTLGD 265

Query: 248 LYTRSTVFGSISFKGFQIQSDTRMLPVSQRGYAPTVTGIAQTTAKVVIKQNGREIYQTTV 307
YT+ +F I+F+G Q+ SD MLP SQRG+AP + GIA+ TA+V IKQNG +IY +TV
Sbjct: 266 GYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIYNSTV 325

Query: 308 AAGPFEINDLYPTNYQGDLLVEITEADGRVSSFTVPFSAVPGSMRAGQSQYALSMGKSID 367
GPF IND+Y GDL V I EADG FTVP+S+VP R G ++Y+++ G+
Sbjct: 326 PPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITAGEYRS 385

Query: 368 TG---EDGYFTDFTYELGLSNAMTFNSGLRIASGYVAASAGSVFTT-PIGAIGATGVYSH 423
E F T GL T G ++A Y A + G +GA+ ++
Sbjct: 386 GNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALSVDMTQAN 445

Query: 424 SKLKPELGDVETDSGWRLGLNYSRAF-ESGTSVTLAGYKYSTEGFRELSDIFRQRAYLDN 482
S L D G + Y+++ ESGT++ L GY+YST G+ +D R N
Sbjct: 446 ST----LPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGYN 501

Query: 483 GY-------------DYSSENYLQKAEMVLSMSQHLGDWGSLAVSGSKRQYRDGRDDDES 529
DY + Y ++ ++ L+++Q LG +L +SGS + Y + DE
Sbjct: 502 IETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRTSTLYLSGSHQTYWGTSNVDEQ 561

Query: 530 YQMGYSNQFGLVSLGVNYSRQYLNQTENSIAGPIPVKTKDDVWSLSLSVPLG-------A 582
+Q G + F ++ ++YS + K +D + +L++++P
Sbjct: 562 FQAGLNTAFEDINWTLSYSL---TKNAWQ-------KGRDQMLALNVNIPFSHWLRSDSK 611

Query: 583 SSNHAASTGYSNSGDSN---NYYAGISGMLDDDRTLSYSLNASR-LDDNDFSGNSYSATL 638
S AS YS S D N AG+ G L +D LSYS+ + SG++ ATL
Sbjct: 612 SQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATL 671

Query: 639 NKRMSLASMGLNYSRSDSYQQWGGNIRGAVVVHEGGLTLGQSVGETFAIIEAPGAEGAAV 698
N R + + YS SD +Q + G V+ H G+TLGQ + +T +++APGA+ A V
Sbjct: 672 NYRGGYGNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNDTVVLVKAPGAKDAKV 731

Query: 699 KNNWGTYIDSNGYALMPSLMPYRSNDVSLDSANIDEDVELVDSRKTVTPYAGAAVKLKYE 758
+N G D GYA++P YR N V+LD+ + ++V+L ++ V P GA V+ +++
Sbjct: 732 ENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAEFK 791

Query: 759 TRKGKAALFIAKLDSGEVAPLGANILGSLGENIGTVGQAGLAYVRLAEPVGKLTLKWGER 818
R G L + + P GA + ++ G V G Y+ GK+ +KWGE
Sbjct: 792 ARVGIKLLMTLT-HNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLAGKVQVKWGEE 850

Query: 819 REDSCQIHYDLTEQPTDVRLHRLPTVC 845
C +Y L + L +L C
Sbjct: 851 ENAHCVANYQLPPESQQQLLTQLSAEC 877


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0303  HTHFIS        77     1e-16    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 77.2 bits (190), Expect = 1e-16
Identities = 33/157 (21%), Positives = 63/157 (40%), Gaps = 10/157 (6%)

Query: 960 HILIVDDHSANRLLLSQQLRYLGHSVDEANNGLEAIQLFRQHPYRIVLTDCNMPIMDGYE 1019
IL+ DD +A R +L+Q L G+ V +N + +V+TD MP + ++
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 1020 FSRRLRQFEQNNNLPAAVVLGYTANAQLEAKQACIDAGMNDCLFKPISLEDLRQKLESYC 1079
R++ + +LP V+ +A + G D L KP L +L +
Sbjct: 65 LLPRIK--KARPDLPVLVM---SAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIG--- 116

Query: 1080 QKLTLEHPQQAFLPEALTKLTG--GNTPLFEQLLKEL 1114
+ L + + L + G + +++ + L
Sbjct: 117 RALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVL 153


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits    Score  E-value  Comments
Sputcn32_0304  HTHFIS  86     6e-22    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 86.0 bits (213), Expect = 6e-22
Identities = 34/165 (20%), Positives = 74/165 (44%), Gaps = 11/165 (6%)

Query: 1 MKR-KILIVDDHPVVVLALKIILEQNGFEVIADTNNGVDALKLVKDLSPDAVILDIGIPQ 59
M IL+ DD + L L + G++V T+N + + D V+ D+ +P
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPD 59

Query: 60 LDGLEVIERSRKLANPPPILVLTAQPSDHFVVRCIQAGASGFVSKQKDMTEVTGALRAIL 119
+ +++ R +K P+LV++AQ + ++ + GA ++ K D+TE+ G + L
Sbjct: 60 ENAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRAL 119

Query: 120 S-------GHSYFPIFGNNIITQS--HQQEAELIKKLSTREMVVL 155
+ G ++ +S Q+ ++ +L ++ ++
Sbjct: 120 AEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLM 164


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits    Score  E-value  Comments
Sputcn32_0305  HTHFIS  50     8e-09    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 50.2 bits (120), Expect = 8e-09
Identities = 17/72 (23%), Positives = 28/72 (38%), Gaps = 2/72 (2%)

Query: 1 MGKIRVIVLEDHPFQRTVLEYNLASFPCVEVFSFGTAQDALAWLDIHHSADIVICDLMMT 60
M ++V +D RTVL S +V A W+ D+V+ D++M
Sbjct: 1 MTGATILVADDDAAIRTVLN-QALSRAGYDVRITSNAATLWRWIAAGDG-DLVVTDVVMP 58

Query: 61 GVDGLSFLRKAK 72
+ L + K
Sbjct: 59 DENAFDLLPRIK 70


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0310  BINARYTOXINA  33     0.002    Clostridial binary toxin A signature.
		>BINARYTOXINA#Clostridial binary toxin A signature.

Length = 454

Score = 33.5 bits (76), Expect = 0.002
Identities = 24/93 (25%), Positives = 41/93 (44%), Gaps = 10/93 (10%)

Query: 483 LNNDGGNIFNLL---PVPNEQVRNDYYRLSHGLEFGYAAAMFNLPYNQVDNLADFQDSYN 539
L++ NI N L P+P+ + YR S EFG +N+++N+ F++ +
Sbjct: 312 LDSKVNNIENALKLTPIPSNLI---VYRRSGPQEFGLTLTSPEYDFNKIENIDAFKEKWE 368

Query: 540 ----EALDFQGASIIEVNVSQTQASDQIAELNL 568
+F SI VN+S I +N+
Sbjct: 369 GKVITYPNFISTSIGSVNMSAFAKRKIILRINI 401


6  Sputcn32_0330  Sputcn32_0342  Y  N  N  Genomic Island
LocusTag       DNBias  CDNBias    %GCBias  Product
Sputcn32_0330       2       19   1.352805  TonB-dependent receptor
Sputcn32_0331       4       27   2.838370  formate dehydrogenase subunit gamma
Sputcn32_0332       3       27   2.736171  4Fe-4S ferredoxin
Sputcn32_0333       3       26   2.734181  molybdopterin oxidoreductase
Sputcn32_0334       2       23   2.210770  twin-arginine translocation pathway signal
Sputcn32_0335       0       20   1.726238  formate dehydrogenase subunit gamma
Sputcn32_0336       0       20   1.665426  4Fe-4S ferredoxin
Sputcn32_0337       0       19   1.436105  molybdopterin oxidoreductase
Sputcn32_0338      -1       16   0.385409  hypothetical protein
Sputcn32_0339       0       14   0.131233  cytoplasmic chaperone TorD family protein
Sputcn32_0340       0       16  -0.693023  4Fe-4S ferredoxin
Sputcn32_0341       2       18  -1.660515  hypothetical protein
Sputcn32_0342       2       17  -1.228056  hypothetical protein
7  Sputcn32_0354  Sputcn32_0360  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias    %GCBias  Product
Sputcn32_0354       1       23   3.670806  fumarate reductase flavoprotein subunit
Sputcn32_0355       2       30   4.372337  N-acetyltransferase GCN5
Sputcn32_0356      -1       35   5.440742  integral membrane sensor signal transduction
Sputcn32_0357      -2       32   5.150488  two component transcriptional regulator
Sputcn32_0358      -1       25   3.870210  peptidase
Sputcn32_0359      -1       23   3.928774  diheme cytochrome c
Sputcn32_0360      -1       19   3.455353  cytochrome c-type protein Shp
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0355  SACTRNSFRASE  30     0.006    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 29.5 bits (66), Expect = 0.006
Identities = 12/39 (30%), Positives = 21/39 (53%)

Query: 112 NVYVNANHRNKGLGKLLVNAVIEHARAIGLQKIYLFTAD 150
++ V ++R KG+G L++ IE A+ + L T D
Sbjct: 94 DIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQD 132


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits    Score  E-value  Comments
Sputcn32_0357  HTHFIS  75     5e-18    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 75.3 bits (185), Expect = 5e-18
Identities = 30/129 (23%), Positives = 60/129 (46%)

Query: 2 RLLLIEDDTDLVARLIPALNEAGYTVEHADNGIDGAFLGEEENFEAVILDLGLPGKPGLQ 61
+L+ +DD + L AL+ AGY V N + + V+ D+ +P +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 VLGQWRQKGLAMPVLILTARDAWHERVDGLKAGADDYLGKPFHIEELLARLEVLIRRHFG 121
+L + ++ +PVL+++A++ + + + GA DYL KPF + EL+ + +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 122 RADNVLQHA 130
R + +
Sbjct: 125 RPSKLEDDS 133


8  Sputcn32_0428  Sputcn32_0454  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias    %GCBias  Product
Sputcn32_0428       1       26   3.057421  plasmid stabilization system protein
Sputcn32_0429       1       24   4.519240  prevent-host-death family protein
Sputcn32_0430       1       23   4.708663  hypothetical protein
Sputcn32_0431       1       26   6.002431  3-oxoacyl-(acyl carrier protein) synthase II
Sputcn32_0432       2       25   5.724224  3-ketoacyl-ACP reductase
Sputcn32_0433       2       26   5.518573  thioester dehydrase family protein
Sputcn32_0434       2       25   5.392976  3-oxoacyl-ACP synthase
Sputcn32_0435       2       27   5.434529  hypothetical protein
Sputcn32_0436       2       25   5.461039  FAD dependent oxidoreductase
Sputcn32_0437       3       26   5.416837  hypothetical protein
Sputcn32_0438       1       26   5.017111  hypothetical protein
Sputcn32_0439       0       25   4.811530  thioesterase superfamily protein
Sputcn32_0440       1       25   4.798439  histidine ammonia-lyase
Sputcn32_0441       0       23   3.724971  glycosyl transferase family protein
Sputcn32_0442       1       24   3.404636  thioester dehydrase family protein
Sputcn32_0443       1       22   3.733309  hypothetical protein
Sputcn32_0444      -1       16   1.466556  hypothetical protein
Sputcn32_0445      -1       17   3.320835  acyl carrier protein
Sputcn32_0446      -1       14   3.068310  acyl carrier protein
Sputcn32_0447       0       14   2.771797  phospholipid/glycerol acyltransferase
Sputcn32_0448       0       13   1.698654  hypothetical protein
Sputcn32_0449       0       14   1.048503  hypothetical protein
Sputcn32_0450       2       16   2.570744  ATP-dependent DNA helicase RecG
Sputcn32_0451       1       16   0.911766  multiple drug resistance protein MarC
Sputcn32_0452       2       16   1.603406  OmpA/MotB domain-containing protein
Sputcn32_0453       2       17   1.539179  hypothetical protein
Sputcn32_0454       3       19   2.235822  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0432  DHBDHDRGNASE  105    2e-29    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 105 bits (262), Expect = 2e-29
Identities = 70/248 (28%), Positives = 114/248 (45%), Gaps = 15/248 (6%)

Query: 5 VLVTGSSRGIGKAIALKLAVAGFDIALHYHSNQTAADATAAEVRALGVNASLLKFDVADR 64
+TG+++GIG+A+A LA G IA N + + ++A +A DV D
Sbjct: 11 AFITGAAQGIGEAVARTLASQGAHIA-AVDYNPEKLEKVVSSLKAEARHAEAFPADVRDS 69

Query: 65 ATVRAAIEADIEANGAYYGVILNAGINRDTAFPAMTESEWDSVIHTNLDGFYNVIHPCVM 124
A + G ++ AG+ R ++++ EW++ N G +N V
Sbjct: 70 AAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRS-VS 128

Query: 125 PMVQGRKGGRIITLASVSGIAGNRGQVNYSASKAGIIGATKALSLELAKRKITVNCIAPG 184
+ R+ G I+T+ S Y++SKA + TK L LELA+ I N ++PG
Sbjct: 129 KYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSPG 188

Query: 185 LIETDM----------VADIPKDMVEQL---VPMRRMGKPSEIAALAAFLMSDDAAYITR 231
ETDM + K +E +P++++ KPS+IA FL+S A +IT
Sbjct: 189 STETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGHITM 248

Query: 232 QVISVNGG 239
+ V+GG
Sbjct: 249 HNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0437  ACRIFLAVINRP  34     0.003    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 33.7 bits (77), Expect = 0.003
Identities = 27/149 (18%), Positives = 53/149 (35%), Gaps = 25/149 (16%)

Query: 682 LLALALGIALLLFSLSFGIKKAAVVVAVPA--LAALLTLATLGITGSPLSLFHALALILV 739
A+ L ++ L +AVP L LA G + + L++F ++L
Sbjct: 345 FEAIMLVFLVMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMF---GMVLA 401

Query: 740 FGIGIDYSL----------------FFASAQQHGKAVMMAVFMSACSTLLAFGLLAFSQT 783
G+ +D ++ + ++ + A+ A F +AF
Sbjct: 402 IGLLVDDAIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGG 461

Query: 784 QA---IHYFGLTLSLGIGFTFLLSPLILT 809
F +T+ + + L++ LILT
Sbjct: 462 STGAIYRQFSITIVSAMALSVLVA-LILT 489



Score = 33.3 bits (76), Expect = 0.005
Identities = 19/122 (15%), Positives = 42/122 (34%), Gaps = 17/122 (13%)

Query: 676 RLLTLKLLALALGIALLLFSLSFGIKKAAVVVAVPALAALLTLATLGITGSPLSLFHALA 735
+ L ++ + + L L +L V+ V L + L + ++ +
Sbjct: 871 QAPALVAISFVV-VFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVG 929

Query: 736 LILVFGIGIDYSLFFAS-----AQQHGKAVMMAV-----------FMSACSTLLAFGLLA 779
L+ G+ ++ ++ GK V+ A M++ + +L LA
Sbjct: 930 LLTTIGLSAKNAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLA 989

Query: 780 FS 781
S
Sbjct: 990 IS 991


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits  Score  E-value  Comments
Sputcn32_0450  SECA  41     2e-05    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 40.6 bits (95), Expect = 2e-05
Identities = 30/84 (35%), Positives = 39/84 (46%), Gaps = 8/84 (9%)

Query: 294 MRLVQGDV-----GSGKTLVAAMAA-LQAIENGYQVAMMAPTELLAEQHATNFAAWFEPL 347
M L + + G GKTL A + A L A+ G V ++ + LA++ A N FE L
Sbjct: 92 MVLNERCIAEMRTGEGKTLTATLPAYLNALT-GKGVHVVTVNDYLAQRDAENNRPLFEFL 150

Query: 348 GLKVGW-LAGKLKGKARTQSLADI 370
GL VG L G R ADI
Sbjct: 151 GLTVGINLPGMPAPAKREAYAADI 174


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score  E-value  Comments
Sputcn32_0452  OMPADOMAIN  84     8e-20    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 83.8 bits (207), Expect = 8e-20
Identities = 39/118 (33%), Positives = 63/118 (53%), Gaps = 12/118 (10%)

Query: 374 LFDYDSERMLPESKPVLEVLATYLKQN--PALSFYVVGHTDDKGERSYNQSLSERRAAAV 431
LF+++ + PE + L+ L + L S V+G+TD G +YNQ LSERRA +V
Sbjct: 222 LFNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAYNQGLSERRAQSV 281

Query: 432 IKQLNEAFNIPSVQLTAHGNGEYSPVASNANDTGQRL---------NRRVELVLRSDK 480
+ L + IP+ +++A G GE +PV N D ++ +RRVE+ ++ K
Sbjct: 282 VDYL-ISKGIPADKISARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEVKGIK 338


9  Sputcn32_0490  Sputcn32_0498  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias    %GCBias  Product
Sputcn32_0490       2       16   1.182562  cell division protein FtsA
Sputcn32_0491       1       17   0.676669  cell division protein FtsZ
Sputcn32_0492       1       15   0.111119  UDP-3-O-[3-hydroxymyristoyl] N-acetylglucosamine
Sputcn32_0493       2       17  -0.842838  hypothetical protein
Sputcn32_0494       1       15  -0.826193  peptidase M23B
Sputcn32_0495       2       17  -0.956236  preprotein translocase subunit SecA
Sputcn32_0496       3       18  -1.440236  ***hypothetical protein
Sputcn32_0497       2       17  -1.291311  delta-aminolevulinic acid dehydratase
Sputcn32_0498       2       16  -2.168645  diguanylate cyclase
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0490  SHAPEPROTEIN  69     7e-15    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 68.6 bits (168), Expect = 7e-15
Identities = 51/221 (23%), Positives = 91/221 (41%), Gaps = 20/221 (9%)

Query: 150 SGMRMEAKVHIVTC----ANDMAKNITK-SVERCGLKVDDLVFSGIASADAVLTFDEKDL 204
S M ++ C A + + + S + G + L+ +A+A +
Sbjct: 100 SNSFMRPSPRVLVCVPVGATQVERRAIRESAQGAGAREVFLIEEPMAAAIGAGLPVSEAT 159

Query: 205 GVCIVDIGGGTTDIAVYTNGALRHCAVVPVAGNQVTNDIAKIFR------TPSSHAEQIK 258
G +VDIGGGTT++AV + + + + V + G++ I R + AE+IK
Sbjct: 160 GSMVVDIGGGTTEVAVISLNGVVYSSSVRIGGDRFDEAIINYVRRNYGSLIGEATAERIK 219

Query: 259 VQFACARSSMVSREDSIEVPS---VGGRPSR-SMSRHTLAEVVEPRYQELFELVLKELKD 314
+ A IEV G P +++ + + E ++ + V+ L+
Sbjct: 220 HEIGSAYPG--DEVREIEVRGRNLAEGVPRGFTLNSNEILEALQEPLTGIVSAVMVALEQ 277

Query: 315 SGLE---DQIAAGIVLTGGTASIQGVVDIAEATFGMPVRVA 352
E D G+VLTGG A ++ + + G+PV VA
Sbjct: 278 CPPELASDISERGMVLTGGGALLRNLDRLLMEETGIPVVVA 318


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits  Score  E-value  Comments
Sputcn32_0495  SECA  1323   0.0      SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 1323 bits (3425), Expect = 0.0
Identities = 659/909 (72%), Positives = 766/909 (84%), Gaps = 10/909 (1%)

Query: 1 MFGKLLTKVFGSRNDRTLKGLQKVVIKINALEADYEKLTDEELKAKTAEFRERLAAGETL 60
M KLLTKVFGSRNDRTL+ ++KVV INA+E + EKL+DEELK KTAEFR RL GE L
Sbjct: 1 MLIKLLTKVFGSRNDRTLRRMRKVVNIINAMEPEMEKLSDEELKGKTAEFRARLEKGEVL 60

Query: 61 DDIMAEAFATVREASKRVFEMRHFDVQLLGGMVLDSNRIAEMRTGEGKTLTATLPAYLNA 120
++++ EAFA VREASKRVF MRHFDVQLLGGMVL+ IAEMRTGEGKTLTATLPAYLNA
Sbjct: 61 ENLIPEAFAVVREASKRVFGMRHFDVQLLGGMVLNERCIAEMRTGEGKTLTATLPAYLNA 120

Query: 121 LTGKGVHVITVNDYLARRDAENNRPLFEFLGLTVGINVAGIGQQEKKAAYNADITYGTNN 180
LTGKGVHV+TVNDYLA+RDAENNRPLFEFLGLTVGIN+ G+ K+ AY ADITYGTNN
Sbjct: 121 LTGKGVHVVTVNDYLAQRDAENNRPLFEFLGLTVGINLPGMPAPAKREAYAADITYGTNN 180

Query: 181 EFGFDYLRDNMAFSPQDRVQRPLHYALIDEVDSILIDEARTPLIISGAAEDSSELYTKIN 240
E+GFDYLRDNMAFSP++RVQR LHYAL+DEVDSILIDEARTPLIISG AEDSSE+Y ++N
Sbjct: 181 EYGFDYLRDNMAFSPEERVQRKLHYALVDEVDSILIDEARTPLIISGPAEDSSEMYKRVN 240

Query: 241 TLIPNLIRQDKEDSEEYVGEGDYSIDEKAKQVHFTERGQEKVENLLIERGMLAEGDSLYS 300
+IP+LIRQ+KEDSE + GEG +S+DEK++QV+ TERG +E LL++ G++ EG+SLYS
Sbjct: 241 KIIPHLIRQEKEDSETFQGEGHFSVDEKSRQVNLTERGLVLIEELLVKEGIMDEGESLYS 300

Query: 301 AANISLLHHVNAALRAHTLFERDVDYIVQDGEVIIVDEHTGRTMPGRRWSEGLHQAVEAK 360
ANI L+HHV AALRAH LF RDVDYIV+DGEVIIVDEHTGRTM GRRWS+GLHQAVEAK
Sbjct: 301 PANIMLMHHVTAALRAHALFTRDVDYIVKDGEVIIVDEHTGRTMQGRRWSDGLHQAVEAK 360

Query: 361 EGVHIQNENQTLASITFQNYFRQYEKLAGMTGTADTEAFEFQHIYGLDTVVVPTNRPMVR 420
EGV IQNENQTLASITFQNYFR YEKLAGMTGTADTEAFEF IY LDTVVVPTNRPM+R
Sbjct: 361 EGVQIQNENQTLASITFQNYFRLYEKLAGMTGTADTEAFEFSSIYKLDTVVVPTNRPMIR 420

Query: 421 KDMADLVYLTANEKYQAIIKDIKDCRERGQPVLVGTVSIEQSELLARLMVQEKIPHQVLN 480
KD+ DLVY+T EK QAII+DIK+ +GQPVLVGT+SIE+SEL++ + + I H VLN
Sbjct: 421 KDLPDLVYMTEAEKIQAIIEDIKERTAKGQPVLVGTISIEKSELVSNELTKAGIKHNVLN 480

Query: 481 AKFHEKEAEIVAQAGRTGAVTIATNMAGRGTDIVLGGNWNMEIDALDNPTPEQKAKIKAD 540
AKFH EA IVAQAG AVTIATNMAGRGTDIVLGG+W E+ AL+NPT EQ KIKAD
Sbjct: 481 AKFHANEAAIVAQAGYPAAVTIATNMAGRGTDIVLGGSWQAEVAALENPTAEQIEKIKAD 540

Query: 541 WQVRHDAVVAAGGLHILGTERHESRRIDNQLRGRAGRQGDAGSSRFYLSMEDSLMRIFAS 600
WQVRHDAV+ AGGLHI+GTERHESRRIDNQLRGR+GRQGDAGSSRFYLSMED+LMRIFAS
Sbjct: 541 WQVRHDAVLEAGGLHIIGTERHESRRIDNQLRGRSGRQGDAGSSRFYLSMEDALMRIFAS 600

Query: 601 DRVSGMMKKLGMEEGEAIEHPWVSRAIENAQRKVEARNFDIRKQLLEFDDVANDQRQVVY 660
DRVSGMM+KLGM+ GEAIEHPWV++AI NAQRKVE+RNFDIRKQLLE+DDVANDQR+ +Y
Sbjct: 601 DRVSGMMRKLGMKPGEAIEHPWVTKAIANAQRKVESRNFDIRKQLLEYDDVANDQRRAIY 660

Query: 661 AQRNELMDAESIEDTIKNIQDDVIGAVIDQYIPPQSVEELWDVPGLEQRLNQEFMLKLPI 720
+QRNEL+D + +TI +I++DV A ID YIPPQS+EE+WD+PGL++RL +F L LPI
Sbjct: 661 SQRNELLDVSDVSETINSIREDVFKATIDAYIPPQSLEEMWDIPGLQERLKNDFDLDLPI 720

Query: 721 QEWLDKEDDLHEESLRERIITSWSDAYKAKEEMVGASVLRQFEKAVMLQTLDGLWKEHLA 780
EWLDKE +LHEE+LRERI+ + Y+ KEE+VGA ++R FEK VMLQTLD LWKEHLA
Sbjct: 721 AEWLDKEPELHEETLRERILAQSIEVYQRKEEVVGAEMMRHFEKGVMLQTLDSLWKEHLA 780

Query: 781 AMDHLRQGIHLRGYAQKNPKQEYKRESFELFQQLLNTLKHDVISVLSKVQVQAQSDVEEM 840
AMD+LRQGIHLRGYAQK+PKQEYKRESF +F +L +LK++VIS LSKVQV+ +VEE+
Sbjct: 781 AMDYLRQGIHLRGYAQKDPKQEYKRESFSMFAAMLESLKYEVISTLSKVQVRMPEEVEEL 840

Query: 841 EARRREEDAKIQRDYQHAAAEALVGGGDEDDESIAAHTPMIRDGD-KVGRNDPCPCGSGR 899
E +RR E + +DD+S AA + G+ KVGRNDPCPCGSG+
Sbjct: 841 EQQRRME---------AERLAQMQQLSHQDDDSAAAAALAAQTGERKVGRNDPCPCGSGK 891

Query: 900 KYKQCHGKL 908
KYKQCHG+L
Sbjct: 892 KYKQCHGRL 900


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
Sputcn32_0498  PF02370  30     0.014    M protein repeat
		>PF02370#M protein repeat

Length = 168

Score = 30.5 bits (68), Expect = 0.014
Identities = 13/69 (18%), Positives = 32/69 (46%)

Query: 387 DERKAKLRIQQEALKQAQKIRSAREEALKLEAETNEKLEQMVQERTLELEITLRELHEVN 446
D RK + + Q + + ++ + +E + E + ++ QE+ + + ++L
Sbjct: 59 DLRKREGQYQDKIEELEKERKEKQERPERREKFERQHQDKHYQEQQKKHQQEQQQLEAEK 118

Query: 447 QKLTEQSTI 455
QKL ++ I
Sbjct: 119 QKLAKEKQI 127


10  Sputcn32_0566  Sputcn32_0594  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias    %GCBias  Product
Sputcn32_0566       2       21  -1.888032  MSHA pilin protein MshB
Sputcn32_0567       2       22  -2.140770  methylation site containing protein
Sputcn32_0568       3       16  -0.554078  methylation site containing protein
Sputcn32_0569       3       15   0.219749  MSHA pilin protein MshC
Sputcn32_0570       2       16   0.820891  methylation site containing protein
Sputcn32_0571       2       16   0.947984  MSHA biogenesis protein MshO
Sputcn32_0572       2       16   0.950153  MSHA biogenesis protein MshP
Sputcn32_0573       0       15   0.599217  MSHA biogenesis protein MshQ
Sputcn32_0574      -2       17   0.450100  rod shape-determining protein MreB
Sputcn32_0575      -1       16   0.638782  rod shape-determining protein MreC
Sputcn32_0576      -1       13   0.794182  rod shape-determining protein MreD
Sputcn32_0577      -1       12   1.170492  maf protein
Sputcn32_0578      -1       12   1.852659  ribonuclease G
Sputcn32_0579       0       14   2.161493  hypothetical protein
Sputcn32_0580       1       16   3.432410  nitrilase/cyanide hydratase and apolipoprotein
Sputcn32_0581       0       15   2.800221  C69 family peptidase
Sputcn32_0582       1       16   2.172576  outer membrane efflux protein
Sputcn32_0583       0       15   2.817171  secretion protein HlyD family protein
Sputcn32_0584       0       15   2.218218  ABC transporter
Sputcn32_0585      -3       17   3.235261  ABC transporter
Sputcn32_0586      -2       17   3.760479  Ion transport protein
Sputcn32_0587      -2       14   3.770563  amino acid permease-associated protein
Sputcn32_0588      -1       18   3.837859  hypothetical protein
Sputcn32_0589      -2       17   3.467857  hypothetical protein
Sputcn32_0590      -3       16   3.310688  catalase
Sputcn32_0591      -1       17   2.047474  MgtC/SapB transporter
Sputcn32_0592      -1       17   1.751021  Ig domain-containing protein
Sputcn32_0593      -1       16   2.342309  hypothetical protein
Sputcn32_0594      -1       19   3.029087  C69 family peptidase
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0566  BCTERIALGSPG  44     6e-08    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 43.7 bits (103), Expect = 6e-08
Identities = 20/46 (43%), Positives = 31/46 (67%), Gaps = 2/46 (4%)

Query: 4 KQNGFSLIELVIVIVILGLLAATAIPRFLNVTD--DAQDASVDGVA 47
KQ GF+L+E+++VIVI+G+LA+ +P + + D Q A D VA
Sbjct: 6 KQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVA 51


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0567  BCTERIALGSPG  45     1e-08    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 44.9 bits (106), Expect = 1e-08
Identities = 14/31 (45%), Positives = 23/31 (74%)

Query: 2 MKRQQGFTLIELVVVIIILGILAVTAAPKFI 32
+Q+GFTL+E++VVI+I+G+LA P +
Sbjct: 4 TDKQRGFTLLEIMVVIVIIGVLASLVVPNLM 34


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0568  BCTERIALGSPG  49     4e-10    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 49.1 bits (117), Expect = 4e-10
Identities = 18/52 (34%), Positives = 28/52 (53%)

Query: 1 MKKQIGFTLIELVVVIIILGILAVTAAPKFINLQSDARASTVKGLEAAIKGA 52
KQ GFTL+E++VVI+I+G+LA P + + A A++ A
Sbjct: 4 TDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENA 55


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0569  BCTERIALGSPH  45     2e-08    Bacterial general secretion pathway protein H signa...
		>BCTERIALGSPH#Bacterial general secretion pathway protein H signature.

Length = 170

Score = 44.9 bits (106), Expect = 2e-08
Identities = 25/80 (31%), Positives = 40/80 (50%), Gaps = 1/80 (1%)

Query: 8 KQAGFTLVELVTTVILIGILSVTVLPRLFTQSSYSAFSLRNEFMAELRQVQQRALNNTDR 67
+Q GFTL+E++ ++L+G+ + VL SA F A+LR VQQR L +
Sbjct: 2 RQRGFTLLEMMLILLLMGVSAGMVLLAFPASRDDSAAQTLARFEAQLRFVQQRGLQT-GQ 60

Query: 68 CYRIAVSSVGYQVSQFATRD 87
+ ++V +Q RD
Sbjct: 61 FFGVSVHPDRWQFLVLEARD 80


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0570  BCTERIALGSPH  34     1e-04    Bacterial general secretion pathway protein H signa...
		>BCTERIALGSPH#Bacterial general secretion pathway protein H signature.

Length = 170

Score = 34.2 bits (78), Expect = 1e-04
Identities = 14/56 (25%), Positives = 29/56 (51%), Gaps = 4/56 (7%)

Query: 16 KGFTLIELVVGMLVIAIAIVM-LSSMLFPQADRAAKTLHRVRSA-ELA--HSVMNE 67
+GFTL+E+++ +L++ ++ M L + + D AA+TL R + +
Sbjct: 4 RGFTLLEMMLILLLMGVSAGMVLLAFPASRDDSAAQTLARFEAQLRFVQQRGLQTG 59


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0571  BCTERIALGSPG  33     6e-04    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 32.9 bits (75), Expect = 6e-04
Identities = 11/18 (61%), Positives = 17/18 (94%)

Query: 13 RGFTLVEMVTVILILGIL 30
RGFTL+E++ VI+I+G+L
Sbjct: 8 RGFTLLEIMVVIVIIGVL 25


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0574  SHAPEPROTEIN  556    0.0      Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 556 bits (1435), Expect = 0.0
Identities = 313/348 (89%), Positives = 331/348 (95%), Gaps = 1/348 (0%)

Query: 1 MFKKLRGIFSNDLSIDLGTANTLIYVRDEGIVLNEPSVVAIRGERNSSGQKSVAAVGTEA 60
M KK RG+FSNDLSIDLGTANTLIYV+ +GIVLNEPSVVAIR +R + KSVAAVG +A
Sbjct: 1 MLKKFRGMFSNDLSIDLGTANTLIYVKGQGIVLNEPSVVAIRQDR-AGSPKSVAAVGHDA 59

Query: 61 KQMLGRTPGNIQAIRPMKDGVIADFYVTEKMLQHFIKQVHNNSFFRPSPRVLVCVPVGAT 120
KQMLGRTPGNI AIRPMKDGVIADF+VTEKMLQHFIKQVH+NSF RPSPRVLVCVPVGAT
Sbjct: 60 KQMLGRTPGNIAAIRPMKDGVIADFFVTEKMLQHFIKQVHSNSFMRPSPRVLVCVPVGAT 119

Query: 121 QVERRAIRESAMGAGAREVYLIEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVAIISLN 180
QVERRAIRESA GAGAREV+LIEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVA+ISLN
Sbjct: 120 QVERRAIRESAQGAGAREVFLIEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVAVISLN 179

Query: 181 GVVYSSSVRIGGDKFDDAIINYVRRNYGSLIGEATAERIKHTIGTAYPGDEVLEIEVRGR 240
GVVYSSSVRIGGD+FD+AIINYVRRNYGSLIGEATAERIKH IG+AYPGDEV EIEVRGR
Sbjct: 180 GVVYSSSVRIGGDRFDEAIINYVRRNYGSLIGEATAERIKHEIGSAYPGDEVREIEVRGR 239

Query: 241 NLAEGVPRSFILNSNEILEALQEPLSGIVSAVMVALEQSPPELASDISERGMVLTGGGAL 300
NLAEGVPR F LNSNEILEALQEPL+GIVSAVMVALEQ PPELASDISERGMVLTGGGAL
Sbjct: 240 NLAEGVPRGFTLNSNEILEALQEPLTGIVSAVMVALEQCPPELASDISERGMVLTGGGAL 299

Query: 301 LRDLDRLLMQETGIPVMVADDPLTCVARGGGKALEMIDMHGGDLFSEE 348
LR+LDRLLM+ETGIPV+VA+DPLTCVARGGGKALEMIDMHGGDLFSEE
Sbjct: 300 LRNLDRLLMEETGIPVVVAEDPLTCVARGGGKALEMIDMHGGDLFSEE 347


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits         Score  E-value  Comments
Sputcn32_0579  TONBPROTEIN  30     0.042    Gram-negative bacterial tonB protein signature.
		>TONBPROTEIN#Gram-negative bacterial tonB protein signature.

Length = 239

Score = 30.3 bits (68), Expect = 0.042
Identities = 16/98 (16%), Positives = 31/98 (31%), Gaps = 9/98 (9%)

Query: 1265 EPVVEELERKSKEIEIPESILPIIGAGDKAMPTEGNNEQHTPKLGPKASEDMKLGEPL-- 1322
V E E + + I P P++ K P + K+ + D+K E
Sbjct: 65 PEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKP--KPKPKPVKKVQEQPKRDVKPVESRPA 122

Query: 1323 -PQGSQPDTMPPEPAI--KPIEP--TNSQDVKPIKQNQ 1355
P + +P + + + + +NQ
Sbjct: 123 SPFENTAPARLTSSTATAATSKPVTSVASGPRALSRNQ 160


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits       Score  E-value  Comments
Sputcn32_0583  RTXTOXIND  49     1e-08    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 48.7 bits (116), Expect = 1e-08
Identities = 36/231 (15%), Positives = 81/231 (35%), Gaps = 26/231 (11%)

Query: 6 MLALVALVALGLILAYGLKLAYSPQPSLLQGQI--EAREYNVSSKVPGRVEQVLVRRGDS 63
++ + + IL+ ++ + G++ R + V++++V+ G+S
Sbjct: 61 AYFIMGFLVIAFILSVLGQVE---IVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGES 117

Query: 64 VAEGDLLFAIHSPELDAKLMQAEGGRDAAQAKQLEANNGARSQEVMAAKEQWLKAQAAAT 123
V +GD+L + + +A ++ + A+ +Q +RS E+ E L +
Sbjct: 118 VRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQ 177

Query: 124 LAKTTYTRVENLFNEGVAARQKRDEAFTQWQAAKYTEQAALAMYQMAEEG--ARVETKAA 181
E + E F+ WQ KY ++ L + AR+
Sbjct: 178 NVSE---------EEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINRYEN 228

Query: 182 AAGNARM---------AEGAVKEVSAVMEDSQMRAPKSGEISEVLLQAGEL 223
+ + + A+ + AV+E E+ Q ++
Sbjct: 229 LSRVEKSRLDDFSSLLHKQAIAK-HAVLEQENKYVEAVNELRVYKSQLEQI 278


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits         Score  E-value  Comments
Sputcn32_0588  IGASERPTASE  29     0.040    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 28.9 bits (64), Expect = 0.040
Identities = 27/156 (17%), Positives = 55/156 (35%), Gaps = 20/156 (12%)

Query: 193 NSNGSRKRTTPQVNKPAQEIPAERMTP------------QVNKQMPEVQTEATAPQVNAL 240
N NG P+V K Q + +T N+++ V P A
Sbjct: 973 NVNGRYDLYNPEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPAT 1032

Query: 241 ANEQAPEILPIPVREVSRILAQALQTGTLSPEDERYIGQ-------LITQRTDLTQQEAE 293
+E E + ++ S+ + + Q T + R + + TQ ++ Q +E
Sbjct: 1033 PSETT-ETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSE 1091

Query: 294 QRVSQGFEKVQTTLKNAETTARQAADEARKASAYAA 329
+ +Q E +T E A+ ++ ++ +
Sbjct: 1092 TKETQTTETKETATVEKEEKAKVETEKTQEVPKVTS 1127


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0592  SURFACELAYER  32     0.010    Lactobacillus surface layer protein signature.
		>SURFACELAYER#Lactobacillus surface layer protein signature.

Length = 439

Score = 31.9 bits (72), Expect = 0.010
Identities = 35/173 (20%), Positives = 64/173 (36%), Gaps = 11/173 (6%)

Query: 315 NDGTITRVQTPASISFSSDCTTNNSATLDTPVTTLSGNASSTFQDTSCSGNSERNDQIIA 374
N T + + S S+ S T+ +L+G+ S+++ S + N ++
Sbjct: 41 NANTNAKYDVDVTPSISAIAAVAKSDTMPAIPGSLTGSISASYNGKSYTANLPKDSGNAT 100

Query: 375 TTVAGNQTLTASLPFTLQRQTLA----SLSFESAEPTQIRIKGAGGTGSSESSLVSFKVT 430
T + N T+ + + T+ S +F S G T S + V+F
Sbjct: 101 ITDSNNNTVKPAELEADKAYTVTVPDVSFNFGSEN------AGKEITIGSANPNVTFTEK 154

Query: 431 SANGQPAAQQEVSFTLDTVVGGLSFANGTANAQSLTNSQGIASVRVLSGTVPT 483
+ + QPA+ +V+ D V S A T + + V +G T
Sbjct: 155 TGD-QPASTVKVTLDQDGVAKLSSVQIKNVYAIDTTYNSNVNFYDVTTGATVT 206


11  Sputcn32_0604  Sputcn32_0617  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias    %GCBias  Product
Sputcn32_0604       1       12  -3.182003  OmpA domain-containing protein
Sputcn32_0605       2       12  -2.609991  MORN repeat-containing protein
Sputcn32_0606      -1       16   1.049774  hypothetical protein
Sputcn32_0607       0       18   1.805528  transposase Tn5 dimerisation subunit
Sputcn32_0608       0       18   1.947883  phosphoribosylaminoimidazolesuccinocarboxamide
Sputcn32_0609       0       16   1.249425  hypothetical protein
Sputcn32_0610      -1       17   2.908500  pentapeptide repeat-containing protein
Sputcn32_0612       0       20   4.177281  4Fe-4S ferredoxin
Sputcn32_0613       1       22   4.056022  polysulfide reductase, NrfD
Sputcn32_0614       0       20   3.713979  hypothetical protein
Sputcn32_0615       0       18   3.730223  transcriptional repressor protein MetJ
Sputcn32_0616       0       17   3.758121  O-succinylhomoserine (thiol)-lyase
Sputcn32_0617       0       15   3.182814  bifunctional aspartate kinase II/homoserine
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score  E-value  Comments
Sputcn32_0604  OMPADOMAIN  69     3e-16    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 69.2 bits (169), Expect = 3e-16
Identities = 51/204 (25%), Positives = 78/204 (38%), Gaps = 25/204 (12%)

Query: 1 MKNNIITLAAIISLVSFSTYSHANSSNNIDVAQNGIYVGANYGYLKVDGADDFD------ 54
MK I +A ++L F+T + A +N Y GA G+ + +
Sbjct: 1 MKKTAIAIA--VALAGFATVAQAAPKDNT------WYTGAKLGWSQYHDTGFINNNGPTH 52

Query: 55 DNSDVFQGLVGYRFNQYLALEGGYINFGKY----GNNLAKAETDGYTAGLKVMFPIVDRV 110
+N GY+ N Y+ E GY G+ + G K+ +PI D +
Sbjct: 53 ENQLGAGAFGGYQVNPYVGFEMGYDWLGRMPYKGSVENGAYKAQGVQLTAKLGYPITDDL 112

Query: 111 ELYAKAGQLWYSTDYKVVGFSGNKDDEGV--FAGAGVAFKVTDRFLINAEYTWYDAEINA 168
++Y + G + + D K G D GV GV + +T EY W + +A
Sbjct: 113 DIYTRLGGMVWRADTKSNV-YGKNHDTGVSPVFAGGVEYAITPEIATRLEYQWTNNIGDA 171

Query: 169 DNVANGANTETDFKQASLGVEYRF 192
+ T D SLGV YRF
Sbjct: 172 HTI----GTRPDNGMLSLGVSYRF 191


12  Sputcn32_0629  Sputcn32_0636  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias    %GCBias  Product
Sputcn32_0629       2       12  -0.455271  cytoplasmic chaperone TorD family protein
Sputcn32_0630       1       13   0.805268  hypothetical protein
Sputcn32_0631       2       15   1.798030  hypothetical protein
Sputcn32_0632       0       15   2.408536  TonB family protein
Sputcn32_0633       0       19   3.858577  hydratase/decarboxylase family protein
Sputcn32_0634      -1       16   3.286290  hypothetical protein
Sputcn32_0635      -1       18   3.771089  HAD family hydrolase
Sputcn32_0636       0       17   3.255967  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
Sputcn32_0632  PF03544  70     4e-16    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 70.4 bits (172), Expect = 4e-16
Identities = 26/89 (29%), Positives = 41/89 (46%), Gaps = 5/89 (5%)

Query: 281 KQEQQPIFRTNPKYPLSYAQQRKSGWVQLKFTVDEHGFVKNPEILASNGGILFEKESIKA 340
+ + R P+YP R G V++KF V G V N +IL++ +FE+E A
Sbjct: 154 ASGPRALSRNQPQYPARAQALRIEGQVKVKFDVTPDGRVDNVQILSAKPANMFEREVKNA 213

Query: 341 LDKWRYAPKFENGKAVEAQTSVQLDYTID 369
+ +WRY P V V + + I+
Sbjct: 214 MRRWRYEPGKPGSGIV-----VNILFKIN 237


13  Sputcn32_0659  Sputcn32_0667  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias    %GCBias  Product
Sputcn32_0659      -1       19   3.805163  hypothetical protein
Sputcn32_0660      -1       19   4.349017  hypothetical protein
Sputcn32_0661      -1       21   4.996653  methyl-accepting chemotaxis sensory transducer
Sputcn32_0662       0       24   5.377867  molybdopterin oxidoreductase
Sputcn32_0663       1       24   5.056398  polysulfide reductase, NrfD
Sputcn32_0664       2       24   4.749385  4Fe-4S ferredoxin
Sputcn32_0665       3       22   4.030256  histidine kinase
Sputcn32_0666       2       24   3.460763  response regulator receiver protein
Sputcn32_0667       0       22   3.546765  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
Sputcn32_0665  PF06580  30     0.031    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 29.8 bits (67), Expect = 0.031
Identities = 38/198 (19%), Positives = 72/198 (36%), Gaps = 33/198 (16%)

Query: 469 LQSVLTLIQQEVTRADSIISRLRNLLKK--RPVSKQPLYLHELVNETVPLLAYEFEQHQI 526
L ++ LI ++ T+A +++ L L++ R + + + L + + L Q +
Sbjct: 179 LNNIRALILEDPTKAREMLTSLSELMRYSLRYSNARQVSLADELTVVDSYLQLASIQFED 238

Query: 527 NLAVNVNGEPYLQSLDEVGMQQLLLN-LLKNAFDACVQRLELESSGTEQIITQKPYTPTI 585
L P ++ +V + +L+ L++N + I Q P I
Sbjct: 239 RLQFENQINP---AIMDVQVPPMLVQTLVENGI--------------KHGIAQLPQGGKI 281

Query: 586 DIDLRYQECTLLLTVTDNGTGLTEETSLLMQAFYSTKSEGLGLGLVICRDIAESHGGTFS 645
+ T+ L V + G+ + T +S G GL V R + +G
Sbjct: 282 LLKGTKDNGTVTLEVENTGSLALKNTK---------ESTGTGLQNVRER-LQMLYGTEAQ 331

Query: 646 L--ESAMGGGCQAQVAIP 661
+ G A V IP
Sbjct: 332 IKLSEKQGKV-NAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits    Score  E-value  Comments
Sputcn32_0666  HTHFIS  88     2e-22    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 87.6 bits (217), Expect = 2e-22
Identities = 31/150 (20%), Positives = 57/150 (38%), Gaps = 4/150 (2%)

Query: 7 VYLIDDDDSVRRSLRFMLESYGLKIIDFDSAEAFFTAVDLTLPGCALVDVRMPGLSGQQL 66
+ + DDD ++R L L G + +A + + + DV MP + L
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 67 HLELVAKNSPLAVIYLTGHGDVPMAVDALKLGAVDFFQKPADGAKLAEAVVKALEHT--- 123
+ L V+ ++ A+ A + GA D+ KP D +L + +AL
Sbjct: 66 LPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKRR 125

Query: 124 -KAHHQDNQYLETYQALTPREREILNLIAQ 152
D+Q + +EI ++A+
Sbjct: 126 PSKLEDDSQDGMPLVGRSAAMQEIYRVLAR 155


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score  E-value  Comments
Sputcn32_0667  GPOSANCHOR  52     9e-09    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 52.4 bits (125), Expect = 9e-09
Identities = 52/316 (16%), Positives = 101/316 (31%), Gaps = 10/316 (3%)

Query: 598 EYAASEQELRIRLSKAEEALQSAQELQTEAESQLISINGELDNLSRELTFARTAYKNSRD 657
EL LS A+E L+ + +E S++ + +L + L A
Sbjct: 82 ALKDHNDELTEELSNAKEKLRKNDKSLSEKASKIQELEARKADLEKALEGAMNFSTADSA 141

Query: 658 DLRRLFDEKRSEQDKINKALSERKAHAGQRLTQLDGELKQLKHQHELWLEDQKEQALEAR 717
++ L EK + KA G K + E + ++ LE
Sbjct: 142 KIKTLEAEK---AALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKA 198

Query: 718 MEKNAYWQEVIGVLDNQLGQIKATIEGRRESAKIEQKACETWYKNELKSRGVDEDNILKL 777
+E + L KA + R+ + + + + E L
Sbjct: 199 LEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAAL 258

Query: 778 KQQIRELETKISRAEQRRSDVLRFDDWY-----QHTWLIRKPKLQTQLSDVKR-AVSEID 831
+ + ELE + A + + Q+Q+ + R ++
Sbjct: 259 EARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDL 318

Query: 832 QQLKAKTLEVKTRRQKLDTELKASNAAQVEASENLTKLRAVMRKLAELKLPTNNEEAQGS 891
+ +++ QKL+ + K S A++ +L R ++L E + E+ + S
Sbjct: 319 DASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQL-EAEHQKLEEQNKIS 377

Query: 892 LGERLRQGEDLLLKRD 907
R DL R+
Sbjct: 378 EASRQSLRRDLDASRE 393



Score = 32.3 bits (73), Expect = 0.011
Identities = 51/346 (14%), Positives = 118/346 (34%), Gaps = 27/346 (7%)

Query: 360 WRTDVENLSERHKLQTEKHQDIEAAYNARRSKIGEQLNRELESLHSDQDKQREARDKQRE 419
+ + + K+ D+ A + N EL S+ ++ DK
Sbjct: 55 VQERADKFEIENNTLKLKNSDLSFNNKAL-----KDHNDELTEELSNAKEKLRKNDKSLS 109

Query: 420 VARADIDALEAQWRNQIDAGKASFSEQEYQFKLTAAELKLRVDGVTYTEEEKLSLAIFDE 479
+ I LEA+ + D KA + +A L + + ++
Sbjct: 110 EKASKIQELEAR---KADLEKALEGAMNFSTADSAKIKTLEAEKAALAARKADL----EK 162

Query: 480 RIHRADEEQESCNAKVERLASDERKLRSKRDQANEALRIASLRVNERQAELDELHHML-- 537
+ A + +AK++ L +++ L +++ + +AL A A++ L
Sbjct: 163 ALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAA 222

Query: 538 FPESHTLLEFLRKEAQGWEQSLGKVIAPELLHRTDLHPSVTGSGDTLFGVHLDLKAIDVP 597
LE + A + + I + L +L+ +
Sbjct: 223 LAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQA-----------ELEKA-LE 270

Query: 598 EYAASEQELRIRLSKAEEALQSAQELQTEAESQLISINGELDNLSRELTFARTAYKNSRD 657
++ E + + + + E Q +N +L R+L +R A K
Sbjct: 271 GAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEA 330

Query: 658 DLRRLFDEKRSEQDKINKALSERKAHAGQRLTQLDGELKQLKHQHE 703
+ ++L ++ + + ++L + + QL+ E ++L+ Q++
Sbjct: 331 EHQKLEEQNKI-SEASRQSLRRDLDASREAKKQLEAEHQKLEEQNK 375


14  Sputcn32_0698  Sputcn32_0703  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_0698  -1      20       -4.795461  LysR family transcriptional regulator
Sputcn32_0699  0       23       -5.063736  MATE efflux family protein
Sputcn32_0700  2       26       -6.232213  co-chaperonin GroES
Sputcn32_0701  2       22       -5.506859  chaperonin GroEL
Sputcn32_0702  2       21       -8.050666  hypothetical protein
Sputcn32_0703  1       20       -7.153126  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0699  SECFTRNLCASE  31     0.012    Bacterial translocase SecF protein signature.
		>SECFTRNLCASE#Bacterial translocase SecF protein signature.

Length = 333

Score = 30.6 bits (69), Expect = 0.012
Identities = 21/114 (18%), Positives = 45/114 (39%), Gaps = 14/114 (12%)

Query: 169 MSLAAIINLILDPLLIFGIGPFPRLEIQGAAIATLIAWVVALSLSSYLLIIKRKMLEWAV 228
+L A++ L+ D LL G+ +L+ +A L+ + S++ +++
Sbjct: 178 FALGAVVALVHDVLLTVGLFAVLQLKFDLTTVAALLT-ITGYSINDTVVV---------- 226

Query: 229 FDIDRMRANWSKLAHIAQPAALMNLINP-LANAVIMAMLAHIDHSAVAAFGAGT 281
DR+R N K + + +N L+ V+ M + + +G
Sbjct: 227 --FDRLRENLIKYKTMPLRDVMNLSVNETLSRTVMTGMTTLLALVPMLIWGGDV 278


15  Sputcn32_0751  Sputcn32_0765  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_0751  2       24       -0.128742  major facilitator superfamily transporter
Sputcn32_0752  2       21       -0.548208  hypothetical protein
Sputcn32_0753  2       17       0.840645   30S ribosomal protein S6
Sputcn32_0754  0       13       1.259981   primosomal replication protein N
Sputcn32_0755  -1      13       0.710457   30S ribosomal protein S18
Sputcn32_0756  -1      11       0.636116   50S ribosomal protein L9
Sputcn32_0757  -1      10       0.518945   hypothetical protein
Sputcn32_0758  -2      14       0.708524   replicative DNA helicase
Sputcn32_0759  -1      18       -0.768794  alanine racemase
Sputcn32_0760  1       19       -2.661588  putative chemotaxis protein CheX
Sputcn32_0761  3       18       -2.336109  tRNA-dihydrouridine synthase A
Sputcn32_0762  4       17       -3.064841  hypothetical protein
Sputcn32_0763  3       15       -2.713326  phage shock protein C, PspC
Sputcn32_0764  2       14       -2.467533  enoyl-CoA hydratase/isomerase
Sputcn32_0765  2       14       -2.440551  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0751  TCRTETB       40     1e-05    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 40.2 bits (94), Expect = 1e-05
Identities = 27/153 (17%), Positives = 54/153 (35%), Gaps = 1/153 (0%)

Query: 221 APAYASNLGLPPEKVATYMTATILAGLLAQWPMGKLSDIMPRSRLIRINCVLLGILALGI 280
P A++ PP TA +L + GKLSD + RL+ ++ ++
Sbjct: 37 LPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIG 96

Query: 281 ALTPYHPVISLVMTFLFGILGFTFYPLATALANSRVEQSERVGLSATILLTFGLGASIGP 340
+ + ++ F+ G F L + + + R I +G +GP
Sbjct: 97 FVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVGP 156

Query: 341 LIASTLMQWLGNNMLYGFMSACTLVLFLRLRYV 373
I + ++ + L T++ L +
Sbjct: 157 AIGGMIAHYIHWSYLLLIPMI-TIITVPFLMKL 188



Score = 32.5 bits (74), Expect = 0.003
Identities = 26/137 (18%), Positives = 55/137 (40%), Gaps = 12/137 (8%)

Query: 213 IVGSFYGLAPAYASNLGLPPEKVATYMTATILAGLLAQWPMGKLSDIMPRSRLIRINCVL 272
++ + L+ A ++ + P + I+ G + G L D ++ I
Sbjct: 282 MMKDVHQLSTAEIGSVIIFPG-----TMSVIIFGYIG----GILVDRRGPLYVLNIGVTF 332

Query: 273 LGILALGIALTPYHP--VISLVMTFLFGILGFTFYPLATALANSRVEQSERVGLSATILL 330
L + L + +++++ F+ G L FT ++T +++S +Q G+S
Sbjct: 333 LSVSFLTASFLLETTSWFMTIIIVFVLGGLSFTKTVISTIVSSSLKQQEAGAGMSLLNFT 392

Query: 331 TFGLGASIGPLIASTLM 347
+F L G I L+
Sbjct: 393 SF-LSEGTGIAIVGGLL 408


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0757  V8PROTEASE    47     1e-07    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 47.3 bits (112), Expect = 1e-07
Identities = 17/51 (33%), Positives = 26/51 (50%), Gaps = 1/51 (1%)

Query: 650 SVPVNFLS-SVDTTGGNSGSPVFNGKGELVGLNFDSTYEAITKDWFFNPTI 699
+ + + TTGGNSGSPVFN K E++G+++ F N +
Sbjct: 220 YLKGEAMQYDLSTTGGNSGSPVFNEKNEVIGIHWGGVPNEFNGAVFINENV 270


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0759  ALARACEMASE   433    e-154    Alanine racemase signature.
		>ALARACEMASE#Alanine racemase signature.

Length = 356

Score = 433 bits (1114), Expect = e-154
Identities = 155/350 (44%), Positives = 218/350 (62%), Gaps = 6/350 (1%)

Query: 6 RAEISSSALQNNLAVLRQQASASQVMAVVKANGYGHGLLNVANCLVNADGFGLARLEEAL 65
+A + AL+ NL+++RQ A+ ++V +VVKAN YGHG+ + + + DGF L LEEA+
Sbjct: 6 QASLDLQALKQNLSIVRQAATHARVWSVVKANAYGHGIERIWSAIGATDGFALLNLEEAI 65

Query: 66 ELRAGGVKARLLLLEGFFRSTDLPLLVAHDIDTVVHHESQIEMLEQVKLTKPVTVWLKVD 125
LR G K +L+LEGFF + DL + H + T VH Q++ L+ +L P+ ++LKV+
Sbjct: 66 TLRERGWKGPILMLEGFFHAQDLEIYDQHRLTTCVHSNWQLKALQNARLKAPLDIYLKVN 125

Query: 126 SGMHRLGVTPEQFATVYARLMACPNIAKPIHLMTHFACADEPDNHYTDVQMAAFNELTAG 185
SGM+RLG P++ TV+ +L A N+ + LM+HFA A+ PD MA + G
Sbjct: 126 SGMNRLGFQPDRVLTVWQQLRAMANV-GEMTLMSHFAEAEHPDGISG--AMARIEQAAEG 182

Query: 186 LPGFRTLANSAGALYWPKSQGDWIRPGIALYGVSPVA--GDCGTNHGLIPAMNLVSRLIA 243
L R+L+NSA L+ P++ DW+RPGI LYG SP D N GL P M L S +I
Sbjct: 183 LECRRSLSNSAATLWHPEAHFDWVRPGIILYGASPSGQWRDI-ANTGLRPVMTLSSEIIG 241

Query: 244 VRDHKANQPVGYGCYWTAKQDTRLGVVAIGYGDGYPRNAPEGTPVWVNGRRVPIVGRVSM 303
V+ KA + VGYG +TA+ + R+G+VA GY DGYPR+AP GTPV V+G R VG VSM
Sbjct: 242 VQTLKAGERVGYGGRYTARDEQRIGIVAAGYADGYPRHAPTGTPVLVDGVRTMTVGTVSM 301

Query: 304 DMLTVDLGQDATDKVGDDVLLWGQDLPVEEVAERIGTIAYELVTKLTPRV 353
DML VDL +G V LWG+++ +++VA GT+ YEL+ L RV
Sbjct: 302 DMLAVDLTPCPQAGIGTPVELWGKEIKIDDVAAAAGTVGYELMCALALRV 351


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0761  ANTHRAXTOXNA  29     0.025    Anthrax toxin LF subunit signature.
		>ANTHRAXTOXNA#Anthrax toxin LF subunit signature.

Length = 800

Score = 29.3 bits (65), Expect = 0.025
Identities = 29/116 (25%), Positives = 42/116 (36%), Gaps = 31/116 (26%)

Query: 137 DAMKQVVDIPVTVKTRIGIDE---------QDSYEFLT------YFIDIVNAKGCTDFTI 181
D++ +V T +T I + +D E + YF DI D
Sbjct: 61 DSINNLVKTEFTNETLDKIQQTQDLLKKIPKDVLEIYSELGGEIYFTDI-------DLVE 113

Query: 182 HARKAWLQGLSPKENR------EIPPLDYERVYQLKRDYPVLNISINGGVTTLEQA 231
H LQ LS +E E P V++ KR+ P L I+I EQ+
Sbjct: 114 HKE---LQDLSEEEKNSMNSRGEKVPFASRFVFEKKRETPKLIINIKDYAINSEQS 166


16  Sputcn32_0804  Sputcn32_0813  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_0804  2       15       -1.848485  hypothetical protein
Sputcn32_0805  3       14       -2.300135  N-acetyltransferase GCN5
Sputcn32_0806  1       16       -3.712311  invasion gene expression up-regulator, SirB
Sputcn32_0807  1       20       -4.295093  hypothetical protein
Sputcn32_0808  2       23       -5.614256  hypothetical protein
Sputcn32_0809  0       24       -6.459811  2-dehydro-3-deoxyphosphooctonate aldolase
Sputcn32_0810  0       26       -6.894121  flavocytochrome c
Sputcn32_0811  1       26       -6.796255  outer membrane porin
Sputcn32_0812  1       23       -5.619208  LysR family transcriptional regulator
Sputcn32_0813  -1      19       -4.996379  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0811  ECOLNEIPORIN  41     6e-06    E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 40.6 bits (95), Expect = 6e-06
Identities = 39/202 (19%), Positives = 70/202 (34%), Gaps = 7/202 (3%)

Query: 24 LDFYGRLWLGVANSSN----GLSGNEKVDGFSLENYASYLGVKGEYAAYDNFSLLYKFEA 79
+ YG + GV S + G G + + S +G KG+ + +++ E
Sbjct: 21 VTLYGTIKAGVETSRSVAHNGAQAASVETGTGIVDLGSKIGFKGQEDLGNGLKAIWQVEQ 80

Query: 80 GIESFDNDDSNIFKPRNAYLGFKTNYGSAVFGRNDTVFKSAEGKVDLFNITSSDMNMIIA 139
D R +++G K +G GR ++V K + + IA
Sbjct: 81 KASIAGTDSGW--GNRQSFIGLKGGFGKLRVGRLNSVLKDTGDINPWDSKSDYLGVNKIA 138

Query: 140 GNDRLGDSVTLNSAKFVGVTMGISYVFDKDFNQKNPELSDKQNNYAVSFTVGDASLKNKN 199
+ SV +S +F G++ + Y + + + N E NY K
Sbjct: 139 EPEARLISVRYDSPEFAGLSGSVQYALNDNAGRHNSESYHAGFNYKNGGFFVQYGGAYKR 198

Query: 200 HYISIAYADGLNLLEATRLVGG 221
H+ + + RLV G
Sbjct: 199 HHQVQENVNIEK-YQIHRLVSG 219


17  Sputcn32_0864  Sputcn32_0876  Y  N  N  Genomic Island
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_0864  0       21       -4.010139  hypothetical protein
Sputcn32_0865  -1      21       -4.147845  hypothetical protein
Sputcn32_0866  -2      15       -1.795613  acetyltransferase
Sputcn32_0867  -2      17       0.403234   hypothetical protein
Sputcn32_0868  -1      15       0.859109   hypothetical protein
Sputcn32_0869  -1      20       2.026546   hypothetical protein
Sputcn32_0870  1       25       3.321124   hypothetical protein
Sputcn32_0871  1       24       2.991885   S-adenosylmethionine synthetase
Sputcn32_0872  2       20       2.512349   transketolase
Sputcn32_0873  2       17       1.657404   erythrose 4-phosphate dehydrogenase
Sputcn32_0874  2       18       1.528656   phosphoglycerate kinase
Sputcn32_0875  2       15       0.635464   fructose-1,6-bisphosphate aldolase
Sputcn32_0876  2       13       -0.620458  hypothetical protein
18  Sputcn32_0927  Sputcn32_0933  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_0927  -1      20       3.040442   helix-turn-helix domain-containing protein
Sputcn32_0928  1       21       2.890795   GreA/GreB family elongation factor
Sputcn32_0929  1       22       3.375713   cytochrome d ubiquinol oxidase subunit II
Sputcn32_0930  1       21       3.323996   cytochrome bd ubiquinol oxidase subunit I
Sputcn32_0931  2       21       3.101809   GntR family transcriptional regulator
Sputcn32_0932  2       16       2.452813   O-acetylhomoserine/O-acetylserine sulfhydrylase
Sputcn32_0933  2       16       1.558412   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0933  IGASERPTASE   36     0.001    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 35.8 bits (82), Expect = 0.001
Identities = 27/152 (17%), Positives = 53/152 (34%), Gaps = 11/152 (7%)

Query: 6 DVAQLKAELAQLQSLHLSQQSSLSRQLAEFSTKLETLSQQIATEDATETTLNMAASASSI 65
+VAQ +E + Q+ + + E K+ET Q + ++ + S +
Sbjct: 1084 EVAQSGSETKETQTT---ETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQ 1140

Query: 66 ASVLTAADNAPTLT---YAAQTPTLEPASVAPA---PAEPNPWQQNAWQEDPWQRKTKNT 119
A +N PT+ +QT T + PA + + + +N
Sbjct: 1141 PQAEPARENDPTVNIKEPQSQTNT-TADTEQPAKETSSNVEQPVTESTTVNTGNSVVENP 1199

Query: 120 STEQTAKTEHQAQDQQLSDEVKLQASVQVASQ 151
A T+ + + S++ K + V S
Sbjct: 1200 ENTTPATTQPTV-NSESSNKPKNRHRRSVRSV 1230


19  Sputcn32_1010  Sputcn32_1024  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_1010  1       11       -3.029669  two component transcriptional regulator
Sputcn32_1011  1       12       -2.977634  hypothetical protein
Sputcn32_1012  3       14       -2.380652  MltA-interacting MipA family protein
Sputcn32_1013  2       15       -1.106109  hypothetical protein
Sputcn32_1014  2       14       -1.325492  GPR1/FUN34/yaaH family protein
Sputcn32_1015  -1      14       -1.705963  hypothetical protein
Sputcn32_1016  0       12       -1.724040  glyoxalase/bleomycin resistance
Sputcn32_1017  0       9        -1.219694  NADPH-dependent FMN reductase
Sputcn32_1018  1       10       -0.684916  putative transcriptional regulator
Sputcn32_1019  1       11       0.466437   pseudouridine synthase
Sputcn32_1020  1       12       0.567481   methyl-accepting chemotaxis sensory transducer
Sputcn32_1021  1       14       1.211858   ***putative lipoprotein
Sputcn32_1022  3       15       1.814036   RluA family pseudouridine synthase
Sputcn32_1023  2       16       1.388567   hypothetical protein
Sputcn32_1024  2       16       1.642400   ATPase
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1010  HTHFIS        75     9e-18    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 74.9 bits (184), Expect = 9e-18
Identities = 32/135 (23%), Positives = 58/135 (42%), Gaps = 2/135 (1%)

Query: 9 RVLLVEDDVRLANLIVDYLKSHGMHVEVERRGDTVLTRLINYKPDIILLDIMLPGMDGLS 68
+L+ +DD + ++ L G V + T+ + D+++ D+++P +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 69 LCEKLPDYFAG-PILLMSALGSTEDQIKGLELGADDYVVKPVDPALLVARIHNLL-RRQN 126
L ++ P+L+MSA + IK E GA DY+ KP D L+ I L +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 127 QPPQAESHCLTFGKL 141
+P + E L
Sbjct: 125 RPSKLEDDSQDGMPL 139


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1020  CHANLCOLICIN  31     0.013    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 31.2 bits (70), Expect = 0.013
Identities = 37/218 (16%), Positives = 81/218 (37%), Gaps = 6/218 (2%)

Query: 289 KRMQQQQAETEQTATAMNEMTATVAEVAQSAAAAADSAKDADTYAANGNSIVMQSIDSMS 348
K +++++AETE+ +A +++ A A + K + + + S
Sbjct: 158 KEIEREKAETERQLKLAEAEEKRLAALSEEAKAVEIAQKKLSAAQSEVVKMDGEIKTLNS 217

Query: 349 QLSDQIQKTAKVIGFLANESQNIGRVLDVIKSIAEQTNLLALNAAIE-AARAGEQGRGFA 407
+LS I + LA + + + K + E L+ A R +
Sbjct: 218 RLSSSIHARDAEMKTLAGKRNELAQASAKYKELDELVKKLSPRANDPLQNRPFFEATRRR 277

Query: 408 VVADEVRTLAQRTQKSTQE----IEAMIATLQQGVKEAVSAMEIGIHQVDDANDKANQAG 463
V A ++R Q+ +++ I A I +Q+ + + + GI +V +A + +A
Sbjct: 278 VGAGKIREEKQKQVTASETRINRINADITQIQKAISQVSNNRNAGIARVHEAEENLKKAQ 337

Query: 464 QALKEIVTSVDSITELNTHIATAAEEQSSVAESINRSI 501
L D++ + T E+ + + +
Sbjct: 338 NNLLNSQIK-DAVDATVSFYQTLTEKYGEKYSKMAQEL 374


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1024  HTHFIS        41     1e-05    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 41.4 bits (97), Expect = 1e-05
Identities = 35/180 (19%), Positives = 66/180 (36%), Gaps = 30/180 (16%)

Query: 552 LEGEREKLLQMEVALHER--VIGQNEAVDAVANAIRRSRAGLADPNRPIGSFLFLGPTGV 609
L + + ++E + ++G++ A+ + + R L + + + G +G
Sbjct: 119 LAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLAR----LMQTDLTL---MITGESGT 171

Query: 610 GKTELCKSLARFLFDTESALVRIDMSEFMEKHSVSRLVGAPPGYVGYEEGGYLTEAVRRK 669
GK + ++L + V I+M+ S L G +E+G + T A R
Sbjct: 172 GKELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFG-------HEKGAF-TGAQTRS 223

Query: 670 PYSV-------ILLDEVEKAHPDVFNILLQVLDDG---RLTDGQGRTVDFRNTVIIMTSN 719
+ LDE+ D LL+VL G + D R I+ +N
Sbjct: 224 TGRFEQAEGGTLFLDEIGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVR---IVAATN 280



Score = 31.0 bits (70), Expect = 0.020
Identities = 14/68 (20%), Positives = 29/68 (42%), Gaps = 3/68 (4%)

Query: 151 DPNAEDQRQALKKFTIDLTERAEQG-KLDPVIGRDDEIRRTIQVLQRRSKNN-PVLI-GE 207
+AL + ++ + P++GR ++ +VL R + + ++I GE
Sbjct: 109 TELIGIIGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGE 168

Query: 208 PGVGKTAI 215
G GK +
Sbjct: 169 SGTGKELV 176


20  Sputcn32_1058  Sputcn32_1084  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_1058  2       15       0.341151   riboflavin biosynthesis protein RibF
Sputcn32_1059  1       15       -0.392792  isoleucyl-tRNA synthetase
Sputcn32_1060  0       21       -1.882195  lipoprotein signal peptidase
Sputcn32_1061  1       26       -4.404385  peptidylprolyl isomerase, FKBP-type
Sputcn32_1062  1       28       -4.679074  4-hydroxy-3-methylbut-2-enyl diphosphate
Sputcn32_1063  1       30       -5.684045  type IV pilus modification protein PilV
Sputcn32_1064  2       31       -6.223236  prepilin-type cleavage/methylation-like protein
Sputcn32_1065  2       30       -6.417906  type IV pilus assembly protein PilX
Sputcn32_1066  2       29       -6.647229  type IV pilin biogenesis protein
Sputcn32_1067  1       24       -5.517864  methylation site containing protein
Sputcn32_1068  1       25       -5.879253  hypothetical protein
Sputcn32_1069  0       27       -5.709551  methylation site containing protein
Sputcn32_1070  -1      16       -2.805877  methylation site containing protein
Sputcn32_1071  -2      15       -1.849886  nitrogen regulatory protein P-II
Sputcn32_1072  -1      13       -1.012916  NADH dehydrogenase
Sputcn32_1073  -2      12       0.414929   regulatory protein, LacI
Sputcn32_1074  -2      11       1.434529   TonB-dependent receptor
Sputcn32_1075  -1      15       3.618608   beta-N-acetylhexosaminidase
Sputcn32_1076  -2      14       3.406099   BadF/BadG/BcrA/BcrD type ATPase
Sputcn32_1077  -1      16       3.035969   glutamine--fructose-6-phosphate transaminase
Sputcn32_1078  -1      16       3.490237   N-acetylglucosamine-6-phosphate deacetylase
Sputcn32_1079  -1      17       3.161155   hypothetical protein
Sputcn32_1080  -1      19       3.686325   glucose/galactose transporter
Sputcn32_1081  0       20       4.102943   hypothetical protein
Sputcn32_1082  1       19       4.021168   hypothetical protein
Sputcn32_1083  -1      18       4.337190   methylmalonate-semialdehyde dehydrogenase
Sputcn32_1084  -1      13       3.112326   beta alanine--pyruvate transaminase
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1063  BCTERIALGSPG  30     0.002    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 30.2 bits (68), Expect = 0.002
Identities = 9/24 (37%), Positives = 18/24 (75%), Gaps = 2/24 (8%)

Query: 13 QQGFSLIEVLVALVIL--VIGLIG 34
Q+GF+L+E++V +VI+ + L+
Sbjct: 7 QRGFTLLEIMVVIVIIGVLASLVV 30


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1064  PF05307       28     0.049    Bundlin
		>PF05307#Bundlin

Length = 193

Score = 27.8 bits (61), Expect = 0.049
Identities = 29/101 (28%), Positives = 46/101 (45%), Gaps = 11/101 (10%)

Query: 13 YQTGLSLVELMVAMVIGLFLTAGVFTMFSMSSSNVTTTSQFNQLQENGRIALAILERDLS 72
Y+ GLSL+E + + + +TAGV MF S++ + SQ N + E AI +
Sbjct: 10 YEKGLSLIESAMVLALAATVTAGV--MFYYQSASDSNKSQ-NAISEVMSATSAINGLYIG 66

Query: 73 QLGFMGDMTGTDFVLGSNTQVNIAAVANDCVGDGLNNATLP 113
Q + G L SN +N +A+ ++ N T P
Sbjct: 67 QTSYTG--------LNSNILLNTSAIPDNYKDTKNNKITNP 99


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1067  BCTERIALGSPG  58     6e-14    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 58.0 bits (140), Expect = 6e-14
Identities = 24/70 (34%), Positives = 43/70 (61%), Gaps = 2/70 (2%)

Query: 5 RGFTLIELMITVAIVGILAAIAYPSYIEYVTKSGRSEGVAAVMRVANLQEQYYLDNKAYA 64
RGFTL+E+M+ + I+G+LA++ P+ + K+ + + V+ ++ + N + Y LDN Y
Sbjct: 8 RGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYKLDNHHYP 67

Query: 65 TDMTKLGLSA 74
T T GL +
Sbjct: 68 T--TNQGLES 75


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1069  BCTERIALGSPG  36     2e-05    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 35.6 bits (82), Expect = 2e-05
Identities = 12/28 (42%), Positives = 19/28 (67%)

Query: 6 TGFTLVELMVTIAVAAILLSIGAPSLIS 33
GFTL+E+MV I + +L S+ P+L+
Sbjct: 8 RGFTLLEIMVVIVIIGVLASLVVPNLMG 35


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1070  BCTERIALGSPG  29     0.003    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 29.5 bits (66), Expect = 0.003
Identities = 10/27 (37%), Positives = 20/27 (74%)

Query: 5 QKGFSLIELITTLSISTILLTVGVPSL 31
Q+GF+L+E++ + I +L ++ VP+L
Sbjct: 7 QRGFTLLEIMVVIVIIGVLASLVVPNL 33


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1081  PF00577       39     1e-05    Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 38.7 bits (90), Expect = 1e-05
Identities = 16/102 (15%), Positives = 36/102 (35%), Gaps = 3/102 (2%)

Query: 104 LSYDITLY--RYNYSGESDLGYFEVTAGVEFKGFRL-AYWFTNDYGGSDLDYHYTELNYS 160
L+Y+ + + G S Y + +G+ +RL + + +
Sbjct: 187 LNYNFSGNSVQNRIGGNSHYAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHI 246

Query: 161 YTFVENWNLDLHYGYNAGDALDDGEGFDSYSDYSIGVSTEFA 202
T++E + L GD G+ FD + ++++
Sbjct: 247 NTWLERDIIPLRSRLTLGDGYTQGDIFDGINFRGAQLASDDN 288


21  Sputcn32_1319  Sputcn32_1337  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_1319  2       18       3.476972   sigma-54 dependent transcriptional regulator
Sputcn32_1320  3       21       4.490488   hypothetical protein
Sputcn32_1321  5       22       4.967951   hypothetical protein
Sputcn32_1322  4       22       4.982158   hypothetical protein
Sputcn32_1323  4       20       4.638065   2-nitropropane dioxygenase
Sputcn32_1324  4       19       4.058312   Beta-hydroxyacyl-(acyl-carrier-protein)
Sputcn32_1325  4       16       3.354751   omega-3 polyunsaturated fatty acid synthase
Sputcn32_1326  2       15       2.834361   erythronolide synthase
Sputcn32_1327  0       14       -1.397884  transcriptional regulator
Sputcn32_1328  0       17       -3.111811  4'-phosphopantetheinyl transferase
Sputcn32_1329  0       19       -3.168734  7-cyano-7-deazaguanine reductase
Sputcn32_1330  -1      20       -2.870717  SecY interacting protein Syd
Sputcn32_1331  0       20       -2.621406  hypothetical protein
Sputcn32_1332  2       23       -4.141730  hypothetical protein
Sputcn32_1333  3       25       -3.843160  hypothetical protein
Sputcn32_1334  1       22       -2.852721  hypothetical protein
Sputcn32_1335  2       21       -2.566584  N-acetyltransferase GCN5
Sputcn32_1336  3       21       -2.586670  hypothetical protein
Sputcn32_1337  2       18       -1.415471  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1319  HTHFIS        369    e-126    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 369 bits (948), Expect = e-126
Identities = 160/450 (35%), Positives = 224/450 (49%), Gaps = 38/450 (8%)

Query: 40 LAQLEANHVDVAVIEFPCLSHEDYLDLMK--NAAMSGIEFIFLSNGQPNPNLDKLMAQYA 97
+ A D+ V + + E+ DL+ A + + +S K + A
Sbjct: 40 WRWIAAGDGDLVVTDVV-MPDENAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGA 98

Query: 98 GYHLRKPYDMSLISSILIDFAQHLVASSPSRSSPFLSELDQYGLLVGSSAPMHKLYRTIR 157
+L KP+D++ + I +A R S + LVG SA M ++YR +
Sbjct: 99 YDYLPKPFDLTEL----IGIIGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLA 154

Query: 158 RVSVTDSNVLIIGESGVGKELVANTIHLASPRVNEPFIAINCGALSPELVDSELFGHVKG 217
R+ TD ++I GESG GKELVA +H R N PF+AIN A+ +L++SELFGH KG
Sbjct: 155 RLMQTDLTLMITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKG 214

Query: 218 AFTGANRDHKGVFEQAEGGTLFLDEITEMPIEHQVKLLRVLENNEYRPVGGDKVLHSDVR 277
AFTGA G FEQAEGGTLFLDEI +MP++ Q +LLRVL+ EY VGG + SDVR
Sbjct: 215 AFTGAQTRSTGRFEQAEGGTLFLDEIGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVR 274

Query: 278 IVAATNRDPLAAIEAGLFREDLYFRLAQFPIRVPPLRVRGEDIIGLAQHFLAYRNVAEKQ 337
IVAATN+D +I GLFREDLY+RL P+R+PPLR R EDI L +HF+
Sbjct: 275 IVAATNKDLKQSINQGLFREDLYYRLNVVPLRLPPLRDRAEDIPDLVRHFVQQAEKEGLD 334

Query: 338 CKVFSEESLRMIAANRWTGNVRELKHAVERAYILA-DQEILPSHLQLEQEIDNAKIAEEE 396
K F +E+L ++ A+ W GNVREL++ V R L I ++ E + E+
Sbjct: 335 VKRFDQEALELMKAHPWPGNVRELENLVRRLTALYPQDVITREIIENELRSEIPDSPIEK 394

Query: 397 VIIPQG------------------------------MRLDDLEKIAIYQALESSNGNKTD 426
G L ++E I AL ++ GN+
Sbjct: 395 AAARSGSLSISQAVEENMRQYFASFGDALPPSGLYDRVLAEMEYPLILAALTATRGNQIK 454

Query: 427 TAEQLGISVKTLYNKLSKYEQQVEAGEPTA 456
A+ LG++ TL K+ + V +A
Sbjct: 455 AADLLGLNRNTLRKKIRELGVSVYRSSRSA 484


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1326  PF03544       37     7e-04    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 36.9 bits (85), Expect = 7e-04
Identities = 22/126 (17%), Positives = 34/126 (26%), Gaps = 8/126 (6%)

Query: 1181 PAVASQPRATAPAPASVDPAPVAATTMPHNAAPVTQAVATEAVSTPV-APVVQTAPVAYS 1239
PA P+A P PV P A + P P + PV
Sbjct: 57 PADLEPPQA-----VQPPPEPVVEPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKV 111

Query: 1240 PAVTVQVAPAAPALVMPAVVMPEVTPAAPATSGLSAGLVQASEIESTMMAVVADKTGYPT 1299
V P P P + + ++ + + S A+ ++ YP
Sbjct: 112 EQPKRDVKPVESRPASPFENTAPARPTSSTATAATSK--PVTSVASGPRALSRNQPQYPA 169

Query: 1300 EMLELG 1305
L
Sbjct: 170 RAQALR 175


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1330  NUCEPIMERASE  30     0.007    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 30.1 bits (68), Expect = 0.007
Identities = 11/44 (25%), Positives = 18/44 (40%), Gaps = 1/44 (2%)

Query: 171 LTAFIEALSPRIAPPVKHEELPMPALDHPGIFANIKRMWQNLFG 214
L +I+AL + K LP+ D A+ K + + G
Sbjct: 268 LMDYIQALEDALGIEAKKNMLPLQPGDVLETSADTKAL-YEVIG 310


22  Sputcn32_1369  Sputcn32_1380  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_1369  2       20       0.577790   putative hydrolase
Sputcn32_1370  1       18       -0.210528  3'(2'),5'-bisphosphate nucleotidase
Sputcn32_1371  1       19       0.284720   fructokinase
Sputcn32_1372  2       20       -0.152816  hypothetical protein
Sputcn32_1373  2       22       -0.539930  hypothetical protein
Sputcn32_1374  3       18       -0.375232  hypothetical protein
Sputcn32_1375  3       17       -1.298592  N-acetyltransferase GCN5
Sputcn32_1376  3       18       -1.249848  hypothetical protein
Sputcn32_1378  2       18       -1.869010  IS91 family transposase
Sputcn32_1379  2       18       -2.210091  hypothetical protein
Sputcn32_1380  2       17       -1.880889  decaheme cytochrome c
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1370  STREPKINASE   29     0.034    Streptococcus streptokinase protein signature.
		>STREPKINASE#Streptococcus streptokinase protein signature.

Length = 440

Score = 28.5 bits (63), Expect = 0.034
Identities = 16/51 (31%), Positives = 24/51 (47%)

Query: 1 MKDIVVIDKRAIALTLTKQDLLEVVNIAKAAGDAIMAIYEQDDVVVSHKGD 51
+KD ++ AI T+T Q+LL IYE+D +V+H D
Sbjct: 205 LKDTKLLKTLAIGDTITSQELLAQAQSILNKNHPGYTIYERDSSIVTHDND 255


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1371  ACETATEKNASE  31     0.005    Acetate kinase family signature.
		>ACETATEKNASE#Acetate kinase family signature.

Length = 400

Score = 31.3 bits (71), Expect = 0.005
Identities = 16/62 (25%), Positives = 23/62 (37%), Gaps = 7/62 (11%)

Query: 204 SGALAKSG--ADIMALVDK----GDIIAMAAFERYVDRLARALAHVINLLDP-DAIVLGG 256
SG SG +D L D GD A A + R+ + + + D IV
Sbjct: 271 SGVYGISGISSDFRDLEDAAFKNGDKRAQLALNVFAYRVKKTIGSYAAAMGGVDVIVFTA 330

Query: 257 GM 258
G+
Sbjct: 331 GI 332


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1375  SACTRNSFRASE  29     0.007    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 28.8 bits (64), Expect = 0.007
Identities = 18/91 (19%), Positives = 33/91 (36%), Gaps = 1/91 (1%)

Query: 43 QGHTCRAIYQNDKPVGFFMWVSETPEKVSIWRFMVDQRYQQSGIGRVALNLALNEIKASD 102
+G Y + +G S I V + Y++ G+G L+ A+ E +
Sbjct: 63 EGKAAFLYYLENNCIGRIKIRSNWNGYALIEDIAVAKDYRKKGVGTALLHKAI-EWAKEN 121

Query: 103 KVKEIEICYDPENPVAKDFYSRFGFQEVGLD 133
+ + N A FY++ F +D
Sbjct: 122 HFCGLMLETQDINISACHFYAKHHFIIGAVD 152


23  Sputcn32_1412  Sputcn32_1427  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_1412  2       20       1.726161   PAS/PAC sensor-containing diguanylate cyclase
Sputcn32_1414  2       19       1.856082   hypothetical protein
Sputcn32_1415  3       16       2.903444   transcriptional regulator CadC
Sputcn32_1416  0       18       5.132410   hypothetical protein
Sputcn32_1417  -1      20       5.200538   hypothetical protein
Sputcn32_1418  1       23       4.874767   hypothetical protein
Sputcn32_1419  2       22       4.256496   TetR family transcriptional regulator
Sputcn32_1420  2       23       3.776831   secretion protein HlyD family protein
Sputcn32_1421  1       20       2.602848   ABC transporter-like protein
Sputcn32_1422  1       17       1.462219   ABC transporter
Sputcn32_1423  2       14       0.625484   acetyl-CoA hydrolase/transferase
Sputcn32_1424  1       15       -0.402009  hypothetical protein
Sputcn32_1425  -1      16       0.484960   hypothetical protein
Sputcn32_1426  1       15       0.125491   methyl-accepting chemotaxis sensory transducer
Sputcn32_1427  2       16       -1.283399  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1419  HTHTETR       77     2e-19    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 77.0 bits (189), Expect = 2e-19
Identities = 29/163 (17%), Positives = 60/163 (36%), Gaps = 6/163 (3%)

Query: 31 SSDARQRLIIAALSLFSHRSYPTVSTREIAREAGVDAALIRYYFGSKAGLFEQMVRETLE 90
+ + RQ ++ AL LFS + + S EIA+ AGV I ++F K+ LF ++ +
Sbjct: 9 AQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSES 68

Query: 91 PVLARLREISAAQAPNN---MSELMQTYYRVMAPNPGLPRLIMRVLQEGDGTEPYHIMLS 147
+ E A + + E++ L+ + + + ++
Sbjct: 69 NIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQ 128

Query: 148 VFDHILSLSRQWLESTL---VNAGYLKEGIDPDLVRLSFVSLM 187
++ S +E TL + A L + + +
Sbjct: 129 AQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYI 171


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1420  RTXTOXIND     57     3e-11    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 57.1 bits (138), Expect = 3e-11
Identities = 44/318 (13%), Positives = 98/318 (30%), Gaps = 77/318 (24%)

Query: 33 TVERDRLTLTAPVGELITQINVVEGQRVKAGEVLIQLDATSANA---------------- 76
T + ++ +I V EG+ V+ G+VL++L A A A
Sbjct: 91 THSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQ 150

Query: 77 ---RLALRQAELDQAKAKLSEAVTGARLE----------------------------DID 105
++ R EL++ + ++D
Sbjct: 151 TRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLD 210

Query: 106 RAKAVLDGANATVKEAQRAFERTN-------RLFATKVLS--------------QADLDT 144
+ +A A + + L + ++ +L
Sbjct: 211 KKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRV 270

Query: 145 ARAARDTSLAKQAEAQQSLRLLENGTRSE---QLAQAKAAVAAASASVAVEQKALADLSL 201
++ + ++ A++ +L+ ++E +L Q + + +A ++ +
Sbjct: 271 YKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVI 330

Query: 202 VAARDAVVDTLP-WREGDRIAAGTQLIGLLASDNPY-VRVYLPATWLDRVKVGDSVNILV 259
A V L EG + L+ ++ D+ V + + + VG + I V
Sbjct: 331 RAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKV 390

Query: 260 DG----REAPIAGTVRNI 273
+ R + G V+NI
Sbjct: 391 EAFPYTRYGYLVGKVKNI 408


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1421  adhesinb      31     0.007    Adhesin B signature.
		>adhesinb#Adhesin B signature.

Length = 310

Score = 30.6 bits (69), Expect = 0.007
Identities = 15/87 (17%), Positives = 28/87 (32%), Gaps = 12/87 (13%)

Query: 220 SPQQLMAAMGARVIEVSGDDLRT---------LKQSLMSESA---VLSAAQIGSRLRVLV 267
P+ + A +I +G +L T ++ + E+ +S L
Sbjct: 73 LPEDVKKTSQADLIFYNGINLETGGNAWFTKLVENAKKKENKDYYAVSEGVDVIYLEGQS 132

Query: 268 RSDIADPLAWLKPKIANRAMEEVRASL 294
DP AWL + + + L
Sbjct: 133 EKGKEDPHAWLNLENGIIYAQNIAKRL 159


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1422  ABC2TRNSPORT  39     2e-05    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 38.8 bits (90), Expect = 2e-05
Identities = 42/160 (26%), Positives = 70/160 (43%), Gaps = 11/160 (6%)

Query: 186 GVILTMTMVMFT----SAAIVREREQGNMEFLITTPVRPLELMLGKIVPYVLVGFVQVTI 241
G++ T M T AA R Q E ++ T +R +++LG++ +
Sbjct: 72 GMVATSAMTAATFETIYAAFGRMEGQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGAG 131

Query: 242 ILSAGHLLFDVP---IRGGLDSIALAAMLFICASLTLGLVISTMAKTQLQSMQMTVFVLL 298
I L + L IAL + F +LG+V++ +A + + V+
Sbjct: 132 IGVVAAALGYTQWLSLLYALPVIALTGLAFA----SLGMVVTALAPSYDYFIFYQTLVIT 187

Query: 299 PSILLSGFMFPFDAMPIAAQWIAEALPATHFMRMSRAIVL 338
P + LSG +FP D +PI Q A LP +H + + R I+L
Sbjct: 188 PILFLSGAVFPVDQLPIVFQTAARFLPLSHSIDLIRPIML 227


24  Sputcn32_1463  Sputcn32_1489  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_1463  -1      21       -8.829843  hypothetical protein
Sputcn32_1464  0       15       -6.436823  hypothetical protein
Sputcn32_1465  1       14       -4.164273  integrase catalytic subunit
Sputcn32_1466  3       16       -4.262878  transposase IS3/IS911 family protein
Sputcn32_1467  3       15       -3.868984  hypothetical protein
Sputcn32_1468  3       16       -3.615142  hypothetical protein
Sputcn32_1469  2       17       1.438722   pyridoxal-dependent decarboxylase
Sputcn32_1470  3       18       0.403093   glycerate kinase
Sputcn32_1471  0       14       -1.093940  gluconate transporter
Sputcn32_1472  0       11       -1.311672  catalase domain-containing protein
Sputcn32_1473  3       14       -1.688992  transcriptional regulator CdaR
Sputcn32_1474  6       20       -1.223174  phage SPO1 DNA polymerase domain-containing
Sputcn32_1475  6       20       -1.423797  hypothetical protein
Sputcn32_1476  5       18       -0.892550  outer membrane protein MtrB
Sputcn32_1477  4       17       -0.962896  cytochrome C family protein
Sputcn32_1478  4       17       -1.200619  decaheme cytochrome c
Sputcn32_1479  1       15       -1.604008  hypothetical protein
Sputcn32_1480  -2      15       -2.360339  FeoA family protein
Sputcn32_1481  -1      14       -2.337566  ferrous iron transport protein B
Sputcn32_1482  1       12       -2.398491  glutaminyl-tRNA synthetase
Sputcn32_1483  0       13       -1.308260  hypothetical protein
Sputcn32_1484  2       18       -1.047683  tRNA--hydroxylase
Sputcn32_1485  2       20       -0.099973  UDP-2,3-diacylglucosamine hydrolase
Sputcn32_1486  2       21       0.475563   cyclophilin type peptidyl-prolyl cis-trans
Sputcn32_1487  1       19       0.657564   cysteinyl-tRNA synthetase
Sputcn32_1488  2       20       0.515629   bifunctional 5,10-methylene-tetrahydrofolate
Sputcn32_1489  3       19       -0.019075  ***trigger factor
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1476  VACCYTOTOXIN  32     0.007    Helicobacter pylori vacuolating cytotoxin signature.
		>VACCYTOTOXIN#Helicobacter pylori vacuolating cytotoxin signature.

Length = 1291

Score = 32.3 bits (73), Expect = 0.007
Identities = 30/150 (20%), Positives = 57/150 (38%), Gaps = 23/150 (15%)

Query: 247 TALNYNGSTFK--NEYNQLSFDSAFNPNYVTAQNNTGTMALDPDNQSHTVSLMGQYNDNT 304
A G K ++ + + ++A N ++QNN+ T ++P N + + T
Sbjct: 324 IAPPEGGYKDKPNDKPSNTTQNNAKNDKQESSQNNSNTQVINPPNSAQKTEI-----QPT 378

Query: 305 NVISGRFLAG--------QMSQDQALVTDNYIYANQLATNAVDAKVDLLGINLK------ 350
VI G F G +++ + + L TNA + GINL
Sbjct: 379 QVIDGPFAGGKNTVVNINRINTNADGTIRVGGFKASLTTNAAHLHIGKGGINLSNQASGR 438

Query: 351 --LVSKVTNSLRLSGSYDYHDRDNNTLIEG 378
LV +T ++ + G +++ + G
Sbjct: 439 SLLVENLTGNITVDGPLRVNNQVGGYALAG 468


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1478  PF00577       30     0.043    Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 29.8 bits (67), Expect = 0.043
Identities = 28/92 (30%), Positives = 37/92 (40%), Gaps = 6/92 (6%)

Query: 358 TDSKGAPVDITALLP-QIQRVEIITNVGPNNITLSYFTKDSVIAVKNGVLDSNASIVDGK 416
TD +G V + + RV + TN +N+ L + V V + V K
Sbjct: 739 TDWRGYAV-LPYATEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAEFKARVGIK 797

Query: 417 LLYTTT---KPLPFGAAKTD-TDTSVTFVNWA 444
LL T T KPLPFGA T + S V
Sbjct: 798 LLMTLTHNNKPLPFGAMVTSESSQSSGIVADN 829


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1481  TCRTETOQM     43     6e-06    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 42.5 bits (100), Expect = 6e-06
Identities = 47/219 (21%), Positives = 84/219 (38%), Gaps = 62/219 (28%)

Query: 14 NAGKSTLFNAL---TGANQQVG---------NW------SGVTVEKKTGHFTLNGADVYL 55
+AGK+TL +L +GA ++G + G+T++ F V +
Sbjct: 13 DAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITSFQWENTKVNI 72

Query: 56 TDLPGIYDLLPAGNSCDCSLDEQIAQQYLAEQRIDGIINLVDA-------TNIERHLYLT 108
D PG D +A+ Y + +DG I L+ A T I L
Sbjct: 73 IDTPGHMDF--------------LAEVYRSLSVLDGAILLISAKDGVQAQTRI-----LF 113

Query: 109 AQLRELAIPMVVVLNKIDAAIKRGIKVD--LQKMSQELGCPVI---------GVCSRDPA 157
LR++ IP + +NKID GI + Q + ++L ++ +C +
Sbjct: 114 HALRKMGIPTIFFINKIDQN---GIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFT 170

Query: 158 DVEKVQAQVL---DLLQGRVSEAPL-MLDYDEQIEAGVQ 192
+ E+ + DLL+ +S L L+ +++
Sbjct: 171 ESEQWDTVIEGNDDLLEKYMSGKSLEALELEQEESIRFH 209


25  Sputcn32_1513  Sputcn32_1532  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_1513  0       20       -3.113808  ATP-dependent DNA helicase DinG
Sputcn32_1514  1       24       -3.667268  DNA polymerase II
Sputcn32_1515  4       29       -3.708320  porin
Sputcn32_1516  4       29       -3.176155  TonB-dependent receptor
Sputcn32_1517  4       26       -3.138047  hypothetical protein
Sputcn32_1518  3       26       -3.585968  MotA/TolQ/ExbB proton channel
Sputcn32_1519  1       22       -3.133129  MotA/TolQ/ExbB proton channel
Sputcn32_1520  1       23       -3.378576  biopolymer transport protein ExbD/TolR
Sputcn32_1521  2       23       -3.604331  TonB family protein
Sputcn32_1522  2       23       -4.296772  hypothetical protein
Sputcn32_1523  0       20       -3.383275  diguanylate cyclase
Sputcn32_1524  -1      18       -2.546203  hypothetical protein
Sputcn32_1525  2       18       -2.274634  PpiC-type peptidyl-prolyl cis-trans isomerase
Sputcn32_1526  -1      18       -1.223731  N-acetyltransferase GCN5
Sputcn32_1527  4       13       0.438425   RNA-binding S4 domain-containing protein
Sputcn32_1528  2       13       -0.156197  hypothetical protein
Sputcn32_1529  2       14       -0.464265  LysR family transcriptional regulator
Sputcn32_1530  2       14       -0.732352  hypothetical protein
Sputcn32_1531  2       12       -0.967244  nuclease SbcCD subunit D
Sputcn32_1532  2       13       -1.436783  SMC domain-containing protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1515  ECOLIPORIN    86     1e-20    E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 85.7 bits (212), Expect = 1e-20
Identities = 106/417 (25%), Positives = 167/417 (40%), Gaps = 53/417 (12%)

Query: 1 MNKTLVATALAAIFLAPSVSAIEIYKDDKNAVEIGGFIDVRVINTQGETEVVNG-ASRIN 59
M + ++A + A+ A + A EIY D N +++ G +D + ++ +G + +
Sbjct: 1 MKRKVLALVIPALLAAGAAHAAEIYNKDGNKLDLYGKVDGLHYFSDDSSK--DGDQTYMR 58

Query: 60 FGFSRE--LTNGWNAFAKLEWGVNPVGSSDIVYNNRFESVQDEFFYNRLGYAGLSHDQYG 117
GF E + + + + E+ V N E + RL +AGL YG
Sbjct: 59 VGFKGETQINDQLTGYGQWEYNVQ---------ANTTEGEGANSW-TRLAFAGLKFGDYG 108

Query: 118 TITIGKQWGAWYDVVYSTNYGFVWDGNTAGVYTFNKDDGAVNGVGRGDKTVQYRNA--FG 175
+ G+ +G YDV T+ + G++ Y N G NGV YRN FG
Sbjct: 109 SFDYGRNYGVLYDVEGWTDMLPEFGGDSYT-YADNYMTGRANGV------ATYRNTDFFG 161

Query: 176 ---DFSFAVQAQLKNSSFFTCDIENITENECEARWIE---GGADAQQVTYDYTYGGALTY 229
+FA+Q Q KN S D+ T N I G TYD G +
Sbjct: 162 LVDGLNFALQYQGKNESQSADDVNIGTNNRNNGDDIRYDNGDGFGISTTYDIGMGFSAGA 221

Query: 230 KVTEMLSVSAGVNRGEFDIDYGNGDQKTAVDLIYGAGITWGNFDNNGLYIAA------NV 283
T + VN G GD+ A + AG+ +D N +Y+A N+
Sbjct: 222 AYTTSDRTNEQVNAGG---TIAGGDKADA----WTAGL---KYDANNIYLATMYSETRNM 271

Query: 284 NKQENHDTDNIGRLIKDAQGAETLVSYKFDNGLRPFISYNVLDAGKDYVIQPNFNADPDD 343
D G + Q E Y+FD GLRP +S+ ++ GKD D D
Sbjct: 272 TPYGKTDKGYDGGVANKTQNFEVTAQYQFDFGLRPAVSF-LMSKGKDLTYNNVNGDDKDL 330

Query: 344 VFKRQFVVVGLHFVWDPNTVLYVEARKDYSDFTSTDKVQEARMSLSEDDGIAIGIRY 400
V ++ VG + ++ N YV+ + + D D + +S DD +A+G+ Y
Sbjct: 331 V---KYADVGATYYFNKNFSTYVDYKINLLD--DDDPFYKD-AGISTDDIVALGMVY 381


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1521  PF03544       101    1e-28    Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 101 bits (252), Expect = 1e-28
Identities = 36/169 (21%), Positives = 64/169 (37%), Gaps = 11/169 (6%)

Query: 39 TPVIEITMDRQDSKAQNKPRVVPKPPPPPEQPQKPDTTPPDTSSNID----TSMSFNMGG 94
P E + K PKP P P+ P S N
Sbjct: 75 EPEPEPEPIPEPPKEAPVVIEKPKPKPKPKPKPVKKVEQPKRDVKPVESRPASPFENTAP 134

Query: 95 VEAGGPSTG-FKLGNMMTRDGDATPIVRIEPQYPIAAARDGKEGWVQLRFTINELGGVDD 153
+ + + + R +PQYP A EG V+++F + G VD+
Sbjct: 135 ARPTSSTATAATSKPVTSVASGPRALSRNQPQYPARAQALRIEGQVKVKFDVTPDGRVDN 194

Query: 154 VEVIQAEPKRLFDKEAIRALKKWKYKPKIVDGKPLKQPGMTVQLDFTLD 202
V+++ A+P +F++E A+++W+Y+P G+ V + F ++
Sbjct: 195 VQILSAKPANMFEREVKNAMRRWRYEP------GKPGSGIVVNILFKIN 237


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1522  SYCDCHAPRONE  30     0.016    Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 29.5 bits (66), Expect = 0.016
Identities = 11/51 (21%), Positives = 21/51 (41%)

Query: 196 YFNQKKYKQAVGVLETMVPLFPEDGRLWVQLAQFYLMVEDYDKSLATYDLA 246
+ KY+ A V + + L D R ++ L + YD ++ +Y
Sbjct: 46 QYQSGKYEDAHKVFQALCVLDHYDSRFFLGLGACRQAMGQYDLAIHSYSYG 96


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1526  ADHESNFAMILY  28     0.023    Adhesin family signature.
		>ADHESNFAMILY#Adhesin family signature.

Length = 309

Score = 27.9 bits (62), Expect = 0.023
Identities = 16/51 (31%), Positives = 19/51 (37%), Gaps = 12/51 (23%)

Query: 30 DINLYVRSPEPEDVIRNKFIQRADTWFY-------GSGEWLTLVIETVNTG 73
D + Y P PEDV K AD FY G W T ++E
Sbjct: 65 DPHEY--EPLPEDV---KKTSEADLIFYNGINLETGGNAWFTKLVENAKKT 110


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1532  IGASERPTASE   47     3e-07    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 47.4 bits (112), Expect = 3e-07
Identities = 47/310 (15%), Positives = 100/310 (32%), Gaps = 22/310 (7%)

Query: 198 AADIRALVKDQRSRRDGILQSAGLASDDELSCELAK--LTPELETAQSAKEQALQQQQWV 255
+I+A V S + I + ++ T + Q +K +Q
Sbjct: 1000 PNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDAT 1059

Query: 256 IKTSDAAQHLLAEFAQFDALTQTVAALDAQQENMAAQTYKLNLAKQAQHMAPMLEVFVAR 315
T+ + + A TQT + E QT + K+ + + V
Sbjct: 1060 ETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTE---TKETATVEKEEKAKVET 1116

Query: 316 EQEAKAASLALEHAKTALIHAKQAFDNAELKTADL--PVLEASLLEQEQVKQQLNALGPQ 373
E+ + T+ + KQ A+ +++ Q + A Q
Sbjct: 1117 EKTQEVPK------VTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQ 1170

Query: 374 L-RELDRLSKTLEQEQAQLISAKTQLQNSKHELTTVVQKRRELESALPQLQANSETRLSL 432
+E + E + + + ++N ++ Q ES+ + +
Sbjct: 1171 PAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSV--- 1227

Query: 433 QQAHQQQQQLLSTYQQWQQVVARVSL----TQAKLVEAKVKGQQLSAQHQQAQVAYKALL 488
++ + +T + VA L T A L +A+ K Q ++ +A + + L
Sbjct: 1228 -RSVPHNVEPATTSSNDRSTVALCDLTSTNTNAVLSDARAKAQFVALNVGKAVSQHISQL 1286

Query: 489 LTWHQGQAAI 498
++GQ +
Sbjct: 1287 EMNNEGQYNV 1296



Score = 38.1 bits (88), Expect = 2e-04
Identities = 45/325 (13%), Positives = 106/325 (32%), Gaps = 14/325 (4%)

Query: 276 TQTVAALDAQQENMAAQTYKLNLAKQAQHMAPMLEVFVAREQEAKAASLALEHAKTALIH 335
T + Q + + + +A+ + P E A + + +KT +
Sbjct: 995 TNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKN 1054

Query: 336 AKQA----FDNAELKTADLPVLEASLLEQEQVKQQLNALGPQLRELDRLSKTLEQEQAQL 391
+ A N E+ ++A+ E + Q E + ++E+A++
Sbjct: 1055 EQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKV 1114

Query: 392 ISAKTQLQNSKHELTTVVQKRRELESALPQLQANSETRLSLQQAHQQQQQLLSTYQQWQQ 451
+ KTQ + Q++ E + ++ +++++ Q T Q ++
Sbjct: 1115 ETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKE 1174

Query: 452 VVARVSLTQAKLVEAKVKGQQLSAQHQQAQVAYKALLLTWHQGQAAILARQLQQDEPCPV 511
T + + + + ++ + + T + + + + V
Sbjct: 1175 -------TSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSV 1227

Query: 512 CGSQTHPQPAQSQEPLPSDEVLQLALEAETNAQEILSKARAEYRGLQTQLETLQQQAQ-- 569
+ +PA + S L TNA ++A+A++ L Q +Q
Sbjct: 1228 RSVPHNVEPATTSSNDRSTVALCDLTSTNTNAVLSDARAKAQFVALNVGKAVSQHISQLE 1287

Query: 570 -DLAVQLGTAVDIPLDEHTHTLSQY 593
+ Q V ++ SQY
Sbjct: 1288 MNNEGQYNVWVSNTSMNKNYSSSQY 1312


26  Sputcn32_1546  Sputcn32_1573  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_1546  0       16       -3.045290  hypothetical protein
Sputcn32_1547  0       15       -2.864139  carbon starvation protein CstA
Sputcn32_1548  0       16       -3.665837  putative two-component response-regulatory
Sputcn32_1549  0       17       -3.846819  signal transduction histidine kinase LytS
Sputcn32_1550  0       19       -3.876918  hypothetical protein
Sputcn32_1551  0       19       -3.271880  RepA domain-containing protein
Sputcn32_1552  -1      15       -2.533884  hypothetical protein
Sputcn32_1553  0       16       -2.181168  type IV pilus assembly PilZ
Sputcn32_1554  1       17       -2.372787  phosphate-starvation-inducible E
Sputcn32_1555  0       17       -2.194409  hypothetical protein
Sputcn32_1556  -1      16       -2.188110  DNA internalization-related competence protein
Sputcn32_1557  0       14       -2.332981  lipid A ABC exporter, fused ATPase and inner
Sputcn32_1558  -1      18       -4.061335  tetraacyldisaccharide 4'-kinase
Sputcn32_1559  1       20       -4.566371  hypothetical protein
Sputcn32_1560  1       19       -4.359208  putative lipoprotein
Sputcn32_1561  1       19       -4.347491  hypothetical protein
Sputcn32_1562  2       20       -4.246038  glutaredoxin
Sputcn32_1563  1       19       -3.457627  hypothetical protein
Sputcn32_1564  2       17       -2.331721  glyoxalase/bleomycin resistance
Sputcn32_1565  1       17       -2.708273  hypothetical protein
Sputcn32_1566  1       18       -2.959729  glyoxalase/bleomycin resistance
Sputcn32_1567  0       17       -3.196814  hypothetical protein
Sputcn32_1568  1       16       -0.940664  cytidine deaminase
Sputcn32_1569  2       16       -1.062021  exonuclease I
Sputcn32_1570  3       17       -0.341305  hypothetical protein
Sputcn32_1571  2       15       0.187409   23S rRNA methyltransferase A
Sputcn32_1572  3       15       0.798095   cold-shock DNA-binding domain-containing
Sputcn32_1573  3       14       1.016238   ribonuclease
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1548  HTHFIS        70     4e-16    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 70.2 bits (172), Expect = 4e-16
Identities = 35/109 (32%), Positives = 54/109 (49%), Gaps = 9/109 (8%)

Query: 5 LIVDDEPFAREELTELLSQEADLEIIGQCSNAIEALQIIAKEKPQLIFLDIQMPRISGME 64
L+ DD+ R L + LS+ + SNA + IA L+ D+ MP + +
Sbjct: 7 LVADDDAAIRTVLNQALSRAG--YDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 65 LIAML---DPDNLPKIVFVTAYDEF--AVKAFDNHAFDYLLKPIDADRL 108
L+ + PD LP +V ++A + F A+KA + A+DYL KP D L
Sbjct: 65 LLPRIKKARPD-LPVLV-MSAQNTFMTAIKASEKGAYDYLPKPFDLTEL 111


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1549  PF06580       227    1e-71    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 227 bits (580), Expect = 1e-71
Identities = 64/198 (32%), Positives = 113/198 (57%), Gaps = 3/198 (1%)

Query: 357 EQQQNLLTQAELKLLQAQVNPHFLFNALNTISAIIKRDPDMSKQLLQQLSQFLRINLKRT 416
+ ++ +A+L L+AQ+NPHF+FNALN I A+I DP ++++L LS+ +R +L+ +
Sbjct: 152 WKMASMAQEAQLMALKAQINPHFMFNALNNIRALILEDPTKAREMLTSLSELMRYSLRYS 211

Query: 417 TG-LVTLGDELDHIASYLTIEKARFIDKLQVNIAIPEPLYPCKVPAFTLQPIIENAVKHG 475
V+L DEL + SYL + +F D+LQ I + +VP +Q ++EN +KHG
Sbjct: 212 NARQVSLADELTVVDSYLQLASIQFEDRLQFENQINPAIMDVQVPPMLVQTLVENGIKHG 271

Query: 476 TSHMIEQGQIKVTGKLENNILELDVTDNAGL-YEPTSGSEGLGMNLVHKRIQNLFGEQYG 534
+ + + G+I + G +N + L+V + L + T S G G+ V +R+Q L+G +
Sbjct: 272 IAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKNTKESTGTGLQNVRERLQMLYGTEAQ 331

Query: 535 ITVECERDVYTRVIIRLP 552
I + E+ ++ +P
Sbjct: 332 IKLS-EKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1560  adhesinb      28     0.023    Adhesin B signature.
		>adhesinb#Adhesin B signature.

Length = 310

Score = 27.5 bits (61), Expect = 0.023
Identities = 14/42 (33%), Positives = 19/42 (45%)

Query: 3 LRSISLVALSCISLIACSSAPIDPTELAGKLKDRLTTDIKAD 44
R + L+ L+ + L ACSS + KL T I AD
Sbjct: 4 CRFLVLLLLAFVGLAACSSQKSSTETGSSKLNVVATNSIIAD 45


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1563  OMPADOMAIN    31     0.019    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 31.4 bits (71), Expect = 0.019
Identities = 23/90 (25%), Positives = 33/90 (36%), Gaps = 25/90 (27%)

Query: 975 KGALAEALVKLYEQEVKADPQLVKDTVLNDTKETSLSAEDILTRWHIALYNLSVKAQNVT 1034
K AL +LY Q DP+ VL T A YN
Sbjct: 231 KPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDA-----------YNQ-------- 271

Query: 1035 NGMLGDLAQERAKTVKAYLVDAKDISPERI 1064
L++ RA++V YL+ +K I ++I
Sbjct: 272 -----GLSERRAQSVVDYLI-SKGIPADKI 295


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1573  IGASERPTASE   62     2e-11    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 61.6 bits (149), Expect = 2e-11
Identities = 68/379 (17%), Positives = 111/379 (29%), Gaps = 36/379 (9%)

Query: 516 YK-RIEHPEAKLYEPR--KLERTAAPTPALKGFAAPQKVEQAPSPTVKIEAPQPSFFSKL 572
YK R + LY P K +T T V PS +I
Sbjct: 969 YKLRNVNGRYDLYNPEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVD------- 1021

Query: 573 IGTITALFASSDKAEPAKTTETK--NTDTNAANANRRNRRTDTRRPRNSQDADKAKEGNR 630
A A P++TTET N+ + + + +N + A +AK +
Sbjct: 1022 ----EAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVK 1077

Query: 631 EPRNRNAKKTAEPVAVATQERAVREKEDSAK-RPAKVETKPRVQAPKEVI------ADLE 683
N + TQ +E K AKVET+ + PK E
Sbjct: 1078 ANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSE 1137

Query: 684 ADAPKQEVARERRQRRNMRRKVRIDNGNNTPDNAIPIAPEEAAEVLAEIAAINAAASIDA 743
P+ E ARE N++ N + + + E +N S+
Sbjct: 1138 TVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVE 1197

Query: 744 KAEVAAEAPTIEPKA-------PRARRQPRKEAIPAQATLEAVA-EEGTPVETAPIETAS 795
E A T +P P+ R + ++P + + + V + + +
Sbjct: 1198 NPENTTPA-TTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCDLTSTN 1256

Query: 796 VDVVEIPVTVTADTSEMAEPLVVSQHNEAESAEDENTSA----DEQSKREQRDGQRRSRR 851
+ V A + VSQH +E + + Q R
Sbjct: 1257 TNAVLSDARAKAQFVALNVGKAVSQHISQLEMNNEGQYNVWVSNTSMNKNYSSSQYRRFS 1316

Query: 852 SPRHLRAAGQRRRRDEDDQ 870
S G + + Q
Sbjct: 1317 SKSTQTQLGWDQTISNNVQ 1335



Score = 39.7 bits (92), Expect = 8e-05
Identities = 48/238 (20%), Positives = 75/238 (31%), Gaps = 14/238 (5%)

Query: 875 PAQFVPNDELGADQEYPTEVTHSAHITGPSSAPT---VEAVKAETVEQAVTEVVAVVEYV 931
+ + AD P+ +++ I AP A +ET E + V
Sbjct: 994 TTNITTPNNIQADV--PSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTV 1051

Query: 932 APVTAPVSTTEIEKIAIADAPVAETPVIQKVAAEATITPVVPETTETQVLEPKIEETKAE 991
+ T + +A + + + ET ETQ E K T
Sbjct: 1052 EKNEQDATETTAQNREVAKEAKSNVKANTQ---TNEVAQSGSETKETQTTETKETAT--- 1105

Query: 992 TVEDIAEAKTEPQVVLQPASVVKATVVQDITKVPTKAVASAPMTKPAAIVK-PQPKVQTE 1050
VE +AK E + Q V + V + T + P + V +P+ QT
Sbjct: 1106 -VEKEEKAKVETEKT-QEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTN 1163

Query: 1051 ATNVNSQAADVTDAVVSKPKTTSRFGAMVSSDMTKPVVEVRTQVEVPKGREYDNTPSE 1108
T Q A T + V +P T S +S + P + E N P
Sbjct: 1164 TTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKN 1221


27  Sputcn32_1604  Sputcn32_1626  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_1604  2       19       -0.567363  MotA/TolQ/ExbB proton channel
Sputcn32_1605  2       17       -0.418648  biopolymer transport protein ExbD/TolR
Sputcn32_1606  1       14       -0.830497  TolA family protein
Sputcn32_1607  -1      12       -0.852026  translocation protein TolB
Sputcn32_1608  -1      11       -1.239229  OmpA domain-containing protein
Sputcn32_1609  -2      10       -1.211593  hypothetical protein
Sputcn32_1610  -3      9        -1.186752  ******hypothetical protein
Sputcn32_1611  -2      10       -0.935787  bifunctional 2',3'-cyclic nucleotide
Sputcn32_1612  -1      16       -1.281004  peptidyl-dipeptidase Dcp
Sputcn32_1613  1       18       -3.102930  prolyl 4-hydroxylase subunit alpha
Sputcn32_1614  2       19       -2.929787  N-acetyltransferase GCN5
Sputcn32_1615  3       19       -3.894435  N-acetyltransferase GCN5
Sputcn32_1616  3       19       -3.759942  hypothetical protein
Sputcn32_1617  0       14       -1.370145  hypothetical protein
Sputcn32_1618  0       12       -1.056262  hypothetical protein
Sputcn32_1619  0       13       -0.161660  hypothetical protein
Sputcn32_1620  2       16       1.390254   hypothetical protein
Sputcn32_1621  1       16       2.009839   hypothetical protein
Sputcn32_1622  1       14       1.444065   endonuclease/exonuclease/phosphatase
Sputcn32_1623  1       19       0.844628   hypothetical protein
Sputcn32_1624  1       22       0.672850   phage integrase family protein
Sputcn32_1625  1       22       0.233066   putative transposase
Sputcn32_1626  3       24       -1.444475  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1606  IGASERPTASE   61     3e-12    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 60.8 bits (147), Expect = 3e-12
Identities = 32/164 (19%), Positives = 61/164 (37%), Gaps = 4/164 (2%)

Query: 69 EAERREQ---ERQAELERKAQEAKQAREREQAQLKKLEQERKLQEIETQKANEAAKVAQV 125
E E+R Q Q + ++ ++++ + VA+
Sbjct: 984 EVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAEN 1043

Query: 126 KQQQEKEKAQKAEADRKLKEQERKVAEEAAQKAAEKRKAEEAAAAKAETERKQKEEAERK 185
+Q+ K + + + Q R+VA+EA + E A + +ET+ Q E +
Sbjct: 1044 SKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKET 1103

Query: 186 KKEEADRKLKEEAERKRK-ADAEAKARAQQEQEMADALAAEQAA 228
E + K K E E+ ++ ++ +QEQ AE A
Sbjct: 1104 ATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPAR 1147



Score = 58.9 bits (142), Expect = 1e-11
Identities = 34/250 (13%), Positives = 75/250 (30%), Gaps = 7/250 (2%)

Query: 36 EPVQQVSAPAVKAVMVDQQKVANQVEKIKQEKREAERREQERQAELERKAQEAKQARERE 95
E +Q S K + A E K+ K + Q E+ + E K+ + E
Sbjct: 1042 ENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQ--TNEVAQSGSETKETQTTE 1099

Query: 96 QAQLKKLEQERKLQEIETQKANEAAKVAQVKQQQEKEKAQKAEADRKLKEQERKVAEEAA 155
+ +E+E K + + +QV +QE+ + + +A+ +E
Sbjct: 1100 TKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAE---PARENDPTVNIK 1156

Query: 156 QKAAEKRKAEEAAAAKAETERKQKEEAERKKKEEADRKLKEEAERKRKADAEAKARAQQE 215
+ ++ + ET ++ + E E A + ++
Sbjct: 1157 EPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESS 1216

Query: 216 QEMADALAAEQAALSQTMNKQMQSEVGKYTAMIKSTIQRNLVVDESMRGKSCKVSVRLAN 275
+ + ++ + S + T + N + + K N
Sbjct: 1217 NKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCDLTSTN--TNAVLSDARAKAQFVALN 1274

Query: 276 DGFVISSQTD 285
G +S
Sbjct: 1275 VGKAVSQHIS 1284



Score = 57.4 bits (138), Expect = 3e-11
Identities = 34/185 (18%), Positives = 63/185 (34%), Gaps = 5/185 (2%)

Query: 61 EKIKQEKREAERREQERQAELERKAQEAKQAREREQAQLKKLE--QERKLQEIETQKANE 118
I+ + +E E A E + QE K E Q A E
Sbjct: 1001 NNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATE 1060

Query: 119 AAKVAQVKQQQEKEKAQKAEADRKLKEQERKVAEEAAQKAAEKRKAEEAAAAKAETERKQ 178
+ ++ K + ++ + + E + E E+ AK ETE+ Q
Sbjct: 1061 TTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQ 1120

Query: 179 KE---EAERKKKEEADRKLKEEAERKRKADAEAKARAQQEQEMADALAAEQAALSQTMNK 235
+ ++ K+E ++ +AE R+ D + Q Q A + A + + +
Sbjct: 1121 EVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVE 1180

Query: 236 QMQSE 240
Q +E
Sbjct: 1181 QPVTE 1185



Score = 35.8 bits (82), Expect = 3e-04
Identities = 31/201 (15%), Positives = 66/201 (32%), Gaps = 15/201 (7%)

Query: 53 QQKVANQVEKIKQEKREAERREQE---------RQAELERKAQEAKQAREREQAQLKKLE 103
+ K VEK ++ K E E+ ++ +Q + E +A+ ARE + K
Sbjct: 1099 ETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEP 1158

Query: 104 QERKLQEIETQKANE--AAKVAQVKQQQEKEKAQKAEADRKLKEQERKVAEEAAQKAAEK 161
Q + +T++ + ++ V Q + + + +++ K
Sbjct: 1159 QSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNK 1218

Query: 162 RKAEEAAAAKAETERKQKEEAERKKKEEADRKLKEEAERKRKADAEAKARAQQEQEMADA 221
K + ++ E + + + ARA+ + +
Sbjct: 1219 PKNRHRRSVRS---VPHNVEPATTSSNDRSTVALCDLTSTNTNAVLSDARAKAQFVALNV 1275

Query: 222 LAAEQAALSQT-MNKQMQSEV 241
A +SQ MN + Q V
Sbjct: 1276 GKAVSQHISQLEMNNEGQYNV 1296


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1608  OMPADOMAIN    105    8e-30    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 105 bits (264), Expect = 8e-30
Identities = 29/103 (28%), Positives = 51/103 (49%), Gaps = 4/103 (3%)

Query: 76 IYFDFDRSEVQTEFAAILEAHGTYLVEH--PSVRVLIEGHADERGTPEYNIALGERRAKA 133
+ F+F+++ ++ E A L+ + L V++ G+ D G+ YN L ERRA++
Sbjct: 221 VLFNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAYNQGLSERRAQS 280

Query: 134 VAKYLQGMGVQPSQMSVVSYGEEKPLDFSRTDEGFGKNRRAVL 176
V YL G+ ++S GE P+ + D K R A++
Sbjct: 281 VVDYLISKGIPADKISARGMGESNPVTGNTCDN--VKQRAALI 321


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1609  RTXTOXIND     28     0.038    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 27.9 bits (62), Expect = 0.038
Identities = 8/61 (13%), Positives = 22/61 (36%), Gaps = 3/61 (4%)

Query: 34 DRLARLERIVKSR---QQSELEMQRRLDTLQQEVLELRGLTEQQNYQIEQMQQRQRQLYD 90
RL ++ + + + LE + + E+ + EQ +I ++ + +
Sbjct: 235 SRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQ 294

Query: 91 D 91

Sbjct: 295 L 295


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1615  SACTRNSFRASE  41     3e-07    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 41.1 bits (96), Expect = 3e-07
Identities = 20/57 (35%), Positives = 30/57 (52%)

Query: 80 LILNDVFVTQHARCVGIGRALVQRAARYAKEQNISYLILETQRDNRRAQGLYEALGF 136
++ D+ V + R G+G AL+ +A +AKE + L+LETQ N A Y F
Sbjct: 90 ALIEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHF 146


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1622  MICOLLPTASE   70     4e-14    Microbial collagenase metalloprotease (M9) signature.
		>MICOLLPTASE#Microbial collagenase metalloprotease (M9) signature.

Length = 1104

Score = 69.7 bits (170), Expect = 4e-14
Identities = 34/127 (26%), Positives = 60/127 (47%), Gaps = 7/127 (5%)

Query: 784 APVASFTQVVNGATVQLTST-SSDSDGHIVSAEWNLGDNTVAVGNVVTHSYRQSGEYQVT 842
A + S + V+ + T S D DG I + EW+ GD + TH Y ++GEY+V
Sbjct: 777 AVIKSDSSVIVEEEINFDGTESKDEDGEIKAYEWDFGDGEKSNEAKATHKYNKTGEYEVK 836

Query: 843 LTVTDNDGLTQSISQKVTVVVEN-----VKKPPVAQIQRIN-LWLVDMFISTSYDTDGVI 896
LTVTDN+G + S+K+ VV + + P ++ N + +M + + +
Sbjct: 837 LTVTDNNGGINTESKKIKVVEDKPVEVINESEPNNDFEKANQIAKSNMLVKGTLSEEDYS 896

Query: 897 KQHKWTF 903
++ +
Sbjct: 897 DKYYFDV 903



Score = 40.1 bits (93), Expect = 4e-05
Identities = 18/55 (32%), Positives = 32/55 (58%), Gaps = 1/55 (1%)

Query: 889 SYDTDGVIKQHKWTFDNGTRAN-GQVVLRLARRGQHTVELTVKDNDKLTDTTTLT 942
S D DG IK ++W F +G ++N + + + G++ V+LTV DN+ +T +
Sbjct: 798 SKDEDGEIKAYEWDFGDGEKSNEAKATHKYNKTGEYEVKLTVTDNNGGINTESKK 852


28  Sputcn32_1654  Sputcn32_1681  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_1654  2       16       -1.484037  histidine triad (HIT) protein
Sputcn32_1655  1       15       -0.802088  hypothetical protein
Sputcn32_1656  1       16       -0.720317  hypothetical protein
Sputcn32_1657  1       14       -0.868130  purine nucleoside phosphorylase
Sputcn32_1658  1       15       -0.954895  hypothetical protein
Sputcn32_1659  -2      14       -0.678873  cyclic nucleotide-binding protein
Sputcn32_1660  -1      14       -0.135606  TonB-dependent receptor
Sputcn32_1661  1       17       -0.848111  hypothetical protein
Sputcn32_1662  2       13       -0.200684  nicotinamide mononucleotide transporter PnuC
Sputcn32_1663  3       14       -0.093356  aminoglycoside phosphotransferase
Sputcn32_1664  2       12       0.603998   hypothetical protein
Sputcn32_1665  2       10       0.865018   nitroreductase
Sputcn32_1666  1       14       0.746596   N-acetyltransferase GCN5
Sputcn32_1667  0       13       0.848432   gamma-glutamyltransferase
Sputcn32_1668  -1      13       0.747244   hypothetical protein
Sputcn32_1669  -1      14       -0.847403  lysine exporter protein LysE/YggA
Sputcn32_1670  -1      13       -1.810699  4-hydroxyphenylpyruvate dioxygenase
Sputcn32_1671  0       13       -1.820233  homogentisate 1,2-dioxygenase
Sputcn32_1672  1       15       -2.424340  LysR family transcriptional regulator
Sputcn32_1673  2       17       -2.998849  hypothetical protein
Sputcn32_1674  2       15       -3.074843  TonB-dependent receptor
Sputcn32_1675  1       12       -3.615820  hypothetical protein
Sputcn32_1676  1       12       -3.363796  hypothetical protein
Sputcn32_1677  1       11       -3.559369  hypothetical protein
Sputcn32_1678  0       12       -3.670360  hypothetical protein
Sputcn32_1679  0       11       -3.719992  AMP-binding protein
Sputcn32_1680  -1      13       -3.971386  putative PAS/PAC sensor protein
Sputcn32_1681  -1      13       -3.005827  disulfide bond formation protein B
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1680  RTXTOXIND     32     0.009    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 32.1 bits (73), Expect = 0.009
Identities = 19/108 (17%), Positives = 39/108 (36%), Gaps = 13/108 (12%)

Query: 610 KRQEQEQEKRFQNQAA-------HLQKQQSKMQIVNDENNALKQQLAEFNKAFEMQFQIN 662
R E+ + F + + +Q++K +E K QL + +I
Sbjct: 230 SRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIES------EIL 283

Query: 663 LSKAQSQKLMQNFLAEVITQIMQEQDRLLAQICQTQANGGDESQIAIT 710
+K + Q + Q F E++ ++ Q D + + N + I
Sbjct: 284 SAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIR 331


29  Sputcn32_1710  Sputcn32_1718  Y  N  N  Genomic Island
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_1710  2       23       4.299510   ThiJ/PfpI domain-containing protein
Sputcn32_1711  2       27       5.407052   hypothetical protein
Sputcn32_1712  2       31       6.307400   hypothetical protein
Sputcn32_1713  3       33       6.821672   3-oxoacid CoA-transferase subunit B
Sputcn32_1714  3       32       6.816398   3-oxoacid CoA-transferase subunit A
Sputcn32_1715  2       33       6.807227   pyruvate carboxyltransferase
Sputcn32_1716  1       26       5.930722   carbamoyl-phosphate synthase subunit L
Sputcn32_1717  2       23       5.252581   enoyl-CoA hydratase/isomerase
Sputcn32_1718  2       20       4.584586   propionyl-CoA carboxylase
30  Sputcn32_1810  Sputcn32_1821  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_1810  -1      14       3.517018   hypothetical protein
Sputcn32_1811  -1      16       4.517357   bifunctional acetaldehyde-CoA/alcohol
Sputcn32_1812  2       18       5.567410   hypothetical protein
Sputcn32_1813  2       19       6.005004   methyl-accepting chemotaxis sensory transducer
Sputcn32_1815  1       19       4.543817   *major facilitator superfamily transporter
Sputcn32_1816  1       17       2.159043   major facilitator superfamily transporter
Sputcn32_1817  0       15       -0.211692  beta-lactamase
Sputcn32_1818  -1      13       -2.336385  LysR family transcriptional regulator
Sputcn32_1819  -1      13       -3.013727  CRISPR-associated Cas1 family protein
Sputcn32_1820  -1      14       -3.873403  DEAD/DEAH box helicase
Sputcn32_1821  -2      11       -3.077661  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1815  TCRTETA       65     1e-13    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 64.8 bits (158), Expect = 1e-13
Identities = 87/391 (22%), Positives = 146/391 (37%), Gaps = 22/391 (5%)

Query: 8 LFIALGLFGLYAVEFG-VIGILPAIIQRHGITVAQA---GWLVALFAGVVAVCGPVMVLW 63
L + L L AV G ++ +LP +++ + G L+AL+A + C PV+
Sbjct: 7 LIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGAL 66

Query: 64 LARFERRKVLAVSLLIFSLCSLLSAWAPSFAVLMALRVPSALLHPVFFSVAFASALSLYP 123
RF RR VL VSL ++ + A AP VL R+ + + +VA A +
Sbjct: 67 SDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGAT-GAVAGAYIADITD 125

Query: 124 ADRAAHATSMAFLGTTLGLVLGVPLSTWIEATVSYEASFYFSAVVNLVAA-AGLWVMLPP 182
D A G+V G P+ + S A F+ +A +N + G +++
Sbjct: 126 GDERARHFGFMSACFGFGMVAG-PVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPES 184

Query: 183 RPKTRVASQQ---NPLLVLRTKRVWLAVATAVCMFAAMFSVYSYAAE----YLAREVHLG 235
R ++ NPL R R VA + +F M V A + H
Sbjct: 185 HKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWD 244

Query: 236 GEAISLLLVVFGVGGVLGN-LIAGRALGRQLAWTVLSYPIVLASAYSVLLVFSSASFAAM 294
I + L FG+ L +I G R L ++ +LL F++ + A
Sbjct: 245 ATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGWMAF 304

Query: 295 LPICLLWGAAHTSGLIVSQMWITSAAPDAPEFATSLYVSAANLGVVLGAFVGGSFIESVG 354
+ LL + M + + +L ++G + I +
Sbjct: 305 PIMVLLASGGIGMPAL-QAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFT-AIYAAS 362

Query: 355 MPGVIWSGWLF---AGLAVVFVLARRIKSWS 382
+ W+GW + A L ++ + A R WS
Sbjct: 363 IT--TWNGWAWIAGAALYLLCLPALRRGLWS 391


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1816  TCRTETB       46     2e-07    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 46.0 bits (109), Expect = 2e-07
Identities = 32/176 (18%), Positives = 71/176 (40%), Gaps = 1/176 (0%)

Query: 14 LLALALAGFVTILTEALPAGLLPQIGAGLGVSEALAGQLVTVYAIGSLLMAIPLMTITQG 73
L+ L + F ++L E + LP I A + T + + + ++
Sbjct: 16 LIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQ 75

Query: 74 VRRRPLLLAAIVGFAVANTVTTFSTSY-TLTLVARFLAGVAAGLLWALLAGYASRMVPEH 132
+ + LLL I+ + + S+ +L ++ARF+ G A AL+ +R +P+
Sbjct: 76 LGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKE 135

Query: 133 LKGRAIAIAMVGTPLALSLGVPAGTLLGNLVGWRVCFGIMSVLALMLIVWVRLKVP 188
+G+A + + +G G ++ + + W I + + + ++L
Sbjct: 136 NRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPMITIITVPFLMKLLKK 191


31  Sputcn32_1832  Sputcn32_1842  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_1832  3       20       2.228810   hydrogenase nickel incorporation protein HypA
Sputcn32_1833  3       18       1.548068   hydrogenase expression/formation protein HypE
Sputcn32_1834  5       18       0.992286   hydrogenase expression/formation protein HypD
Sputcn32_1835  5       17       0.344511   hydrogenase assembly chaperone HypC/HupF
Sputcn32_1836  3       14       0.109085   hydrogenase nickel incorporation protein HypB
Sputcn32_1837  3       12       -0.256727  CoA-binding domain-containing protein
Sputcn32_1838  2       12       -1.427113  hypothetical protein
Sputcn32_1839  1       11       -0.905651  ABC transporter-like protein
Sputcn32_1840  0       13       -0.856422  hypothetical protein
Sputcn32_1841  0       14       -0.961740  TonB-dependent siderophore receptor
Sputcn32_1842  2       13       -0.197773  peptidase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Sputcn32_1835  PF05211  25  0.027  Neuraminyllactose-binding hemagglutinin
		>PF05211#Neuraminyllactose-binding hemagglutinin

Length = 260

Score = 25.4 bits (55), Expect = 0.027
Identities = 10/25 (40%), Positives = 18/25 (72%)

Query: 52 VMNKIDKEDAQQSLELYQEIVTKLE 76
+M +IDK+ Q++LE YQ+ +L+
Sbjct: 231 IMQEIDKKLTQKNLESYQKDAKELK 255


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Sputcn32_1836  NUCEPIMERASE  28  0.045  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 27.8 bits (62), Expect = 0.045
Identities = 10/39 (25%), Positives = 19/39 (48%), Gaps = 1/39 (2%)

Query: 207 PYFDFNLAEARAQLKKLNPDTQIIEVSIKDGDSMQAVAQ 245
Y+D +L +AR +L P Q ++ + D + M +
Sbjct: 35 DYYDVSLKQARLELLA-QPGFQFHKIDLADREGMTDLFA 72


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Sputcn32_1839  PF05272  31  0.018  Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.8 bits (69), Expect = 0.018
Identities = 11/45 (24%), Positives = 20/45 (44%), Gaps = 4/45 (8%)

Query: 341 ILEAGAK----LAIIGENGVGKTTLLRCLVNELIHNEGTIKWSEN 381
++E G K + + G G+GK+TL+ LV ++
Sbjct: 588 VMEPGCKFDYSVVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGTG 632


32  Sputcn32_1873  Sputcn32_1889  Y  N  N  Genomic Island
LocusTag       DNBias  CDNBias  %GCBias     Product
Sputcn32_1873  0       17       -3.096905   Na+/H+ antiporter NhaC
Sputcn32_1874  1       17       -3.515650   hypothetical protein
Sputcn32_1875  0       16       -1.939284   elongation factor P
Sputcn32_1876  2       13       -1.732871   hypothetical protein
Sputcn32_1877  1       16       -0.000442   flavodoxin FldA
Sputcn32_1878  2       17       0.295904    LexA regulated protein
Sputcn32_1879  0       14       0.540404    hypothetical protein
Sputcn32_1880  0       14       1.390440    alpha/beta hydrolase fold domain-containing
Sputcn32_1881  0       17       2.157914    replication initiation regulator SeqA
Sputcn32_1882  0       17       3.029589    phosphoglucomutase
Sputcn32_1883  1       19       3.460501    peptide methionine sulfoxide reductase
Sputcn32_1884  2       19       3.201088    succinylglutamate desuccinylase
Sputcn32_1885  1       18       2.481872    3-methyl-2-oxobutanoate dehydrogenase
Sputcn32_1886  1       18       1.724224    transketolase, central region
Sputcn32_1887  0       14       0.042666    dihydrolipoamide acetyltransferase
Sputcn32_1888  0       12       -1.070700   quinolinate synthetase
Sputcn32_1889  2       15       -2.423820   glyceraldehyde-3-phosphate dehydrogenase, type
33  Sputcn32_1942  Sputcn32_2001  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias     Product
Sputcn32_1942  -2      20       -3.381999   peptidase M14, carboxypeptidase A
Sputcn32_1943  -2      22       -4.253041   hypothetical protein
Sputcn32_1944  -2      19       -4.052960   hypothetical protein
Sputcn32_1945  -3      19       -3.712511   aromatic amino acid aminotransferase
Sputcn32_1946  -2      19       -4.326137   hypothetical protein
Sputcn32_1947  -2      18       -4.018006   bax protein
Sputcn32_1948  -2      17       -3.568583   hypothetical protein
Sputcn32_1949  -1      16       -1.985986   C32 tRNA thiolase
Sputcn32_1950  0       16       -2.109145   universal stress protein UspE
Sputcn32_1951  1       14       -2.028860   fumarate/nitrate reduction transcriptional
Sputcn32_1952  1       13       -1.661575   hypothetical protein
Sputcn32_1953  0       16       -1.910413   cbb3-type cytochrome oxidase maturation protein
Sputcn32_1954  1       15       -1.343559   heavy metal translocating P-type ATPase
Sputcn32_1955  2       18       -2.054363   hypothetical protein
Sputcn32_1956  0       16       -1.403716   cytochrome c oxidase, cbb3-type subunit III
Sputcn32_1957  1       18       -1.277997   cbb3-type cytochrome oxidase subunit
Sputcn32_1958  0       18       -0.471666   cbb3-type cytochrome c oxidase subunit II
Sputcn32_1959  -1      18       0.027136    cbb3-type cytochrome c oxidase subunit I
Sputcn32_1960  0       17       -0.331416   hypothetical protein
Sputcn32_1962  0       18       0.141421    transposase Tn3 family protein
Sputcn32_1963  1       19       2.614994    transposase IS3/IS911 family protein
Sputcn32_1964  2       19       2.624641    integrase catalytic subunit
Sputcn32_1965  2       19       3.055467    bile acid:sodium symporter
Sputcn32_1966  2       21       3.038484    heavy metal translocating P-type ATPase
Sputcn32_1967  2       23       4.273572    hypothetical protein
Sputcn32_1968  2       22       4.060538    CzcA family heavy metal efflux protein
Sputcn32_1969  3       21       2.632345    biotin/lipoyl attachment domain-containing
Sputcn32_1970  4       22       2.849916    outer membrane efflux protein
Sputcn32_1971  3       21       1.077252    hypothetical protein
Sputcn32_1972  3       20       0.535185    cation efflux system permease
Sputcn32_1973  4       20       0.116900    MerR family transcriptional regulator
Sputcn32_1974  4       20       0.673672    cation diffusion facilitator family transporter
Sputcn32_1975  4       20       0.174120    hypothetical protein
Sputcn32_1977  4       19       -0.570187   hypothetical protein
Sputcn32_1978  4       18       0.097994    sulfatase
Sputcn32_1979  3       21       1.451193    sulfatase
Sputcn32_1980  3       24       2.227729    diacylglycerol kinase
Sputcn32_1981  3       25       2.532962    two component transcriptional regulator
Sputcn32_1982  2       24       3.782561    integral membrane sensor signal transduction
Sputcn32_1983  2       23       3.151176    bile acid:sodium symporter
Sputcn32_1984  2       18       -0.025009   cointegrate resolution protein T
Sputcn32_1985  1       18       -1.419398   hypothetical protein
Sputcn32_1986  0       17       -1.989258   hypothetical protein
Sputcn32_1987  1       18       -2.714212   phage integrase family protein
Sputcn32_1988  2       18       -3.929708   response regulator receiver modulated metal
Sputcn32_1989  1       18       -2.613698   transposase Tn3 family protein
Sputcn32_1990  0       15       0.116123    plasmid stabilization system protein
Sputcn32_1991  0       14       0.117775    prevent-host-death family protein
Sputcn32_1992  1       15       0.134378    resolvase domain-containing protein
Sputcn32_1993  1       16       0.603724    TraR/DksA family transcriptional regulator
Sputcn32_1994  -1      13       -0.248234   UspA domain-containing protein
Sputcn32_1995  -1      13       -0.765719   sulfate transporter
Sputcn32_1996  2       15       -2.551495   response regulator receiver protein
Sputcn32_1997  2       13       -2.428412   hypothetical protein
Sputcn32_1998  1       13       -2.221550   hypothetical protein
Sputcn32_1999  1       12       -2.508757   *inner membrane transport protein YdhC
Sputcn32_2000  -2      13       -2.159713   putative DNA-binding transcriptional regulator
Sputcn32_2001  -1      17       -3.227412   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Sputcn32_1942  PF07520  29  0.045  Virulence protein SrfB
		>PF07520#Virulence protein SrfB

Length = 1041

Score = 28.8 bits (64), Expect = 0.045
Identities = 8/40 (20%), Positives = 13/40 (32%)

Query: 107 AYFAPYSYERHLDLLSSAQLHPDVNLEHLGLTLDGRDITL 146
F S R + + + P+ + L LD R
Sbjct: 885 GAFQMKSTARFVGEMDTNGQIPEGRMLFEDLDLDARKSAQ 924


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Sputcn32_1947  FLGFLGJ  40  5e-06  Flagellar protein FlgJ signature.
		>FLGFLGJ#Flagellar protein FlgJ signature.

Length = 313

Score = 40.1 bits (93), Expect = 5e-06
Identities = 31/89 (34%), Positives = 40/89 (44%), Gaps = 12/89 (13%)

Query: 139 VPESMVLIQAANESGWGSSRFARE----GFNFFGEWCFSTGCGIVPSSR------GSGKM 188
VP ++L QAA ESGWG + RE +N FG G V G K
Sbjct: 169 VPHHLILAQAALESGWGQRQIRRENGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKK 228

Query: 189 HEVK--VFSSVDASVSSYMRNLNSNPAYA 215
+ K V+SS ++S Y+ L NP YA
Sbjct: 229 VKAKFRVYSSYLEALSDYVGLLTRNPRYA 257


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Sputcn32_1954  RTXTOXINA  31  0.025  Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 30.7 bits (69), Expect = 0.025
Identities = 37/161 (22%), Positives = 66/161 (40%), Gaps = 16/161 (9%)

Query: 635 EQLKRQGCSVSIASGDHSGHVYQLAKELGIEDVHSGLTPADKLA-----LVTELQKTSTV 689
E +K+Q +++S + + +L +L ++ V S + + L + L T +
Sbjct: 164 ELIKKQKSGGNVSSSELAKASIELINQL-VDTVASLNNNVNSFSQQLNTLGSVLSNTKHL 222

Query: 690 AMFGDGINDAPVL--AGADLSVAMGSGSAIAKNSADLILLGDHLSRFTQAVSVAKLTTQI 747
G+ + + P L GA L G SAI SA IL T+A + +LTT++
Sbjct: 223 NGVGNKLQNLPNLDNIGAGLDTVSGILSAI---SASFILSNADADTRTKAAAGVELTTKV 279

Query: 748 I----KQNLAWALGYNALILPLAVTGHVAPYIAAIGMSASS 784
+ K + + L+ + A IA+ A S
Sbjct: 280 LGNVGKGISQYIIA-QRAAQGLSTSAAAAGLIASAVTLAIS 319


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Sputcn32_1968  ACRIFLAVINRP  755  0.0  Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 755 bits (1951), Expect = 0.0
Identities = 225/1055 (21%), Positives = 429/1055 (40%), Gaps = 49/1055 (4%)

Query: 5 MLRLAIARRYLFLTLTLLIIAIGSWSYQQLPIDAVPDITNVQVQINTAAPGYSPLEAEQR 64
M I R L ++++ G+ + QLP+ P I V ++ PG +
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 65 ITYPVETALYGLPNLSYTRSLS-RYGLSQVTVVFEEGTDIYFARNLINTRLGAIKDMLPE 123
+T +E + G+ NL Y S S G +T+ F+ GTD A+ + +L +LP+
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 124 GIEPEMGPISTGLGEIFMYTVQAKPGALQQNGSPYDAMALREIQDWIIKPQLAQVKGVVE 183
++ + + M + + + +K L+++ GV +
Sbjct: 121 EVQQQGISVEKSSSSYLMVA------GFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGD 174

Query: 184 VNSIGGYNKQYHVLPDPLKLLNYGLSIKDVELALQANNDNRGAGYI------EREGMQLL 237
V G + D L Y L+ DV L+ ND AG + + +
Sbjct: 175 VQLFGA-QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNAS 233

Query: 238 VRSPGQLTSLDDIANVII-TQYDTIPVKLSDVADVAIGKELRTGAATQDGKEAVLGTAMM 296
+ + + + ++ V + D V+L DVA V +G E A +GK A +
Sbjct: 234 IIAQTRFKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKL 293

Query: 297 LIDENSRTVARDVAQKLEQIKSSLPEGIIAEAVYDRTTLVDKAIATVSKNLLEGALLVIV 356
N+ A+ + KL +++ P+G+ YD T V +I V K L E +LV +
Sbjct: 294 ATGANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFL 353

Query: 357 VLFILLGNLRAALITAAVIPLSMLMTITGMVQAGVSANLMSLG--ALDFGLIVDGTVIIV 414
V+++ L N+RA LI +P+ +L T + G S N +++ L GL+VD +++V
Sbjct: 354 VMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVV 413

Query: 415 ENAVRRLAQAQHNGRIQPLKERLNTVYLATAEVIRPSLFGVAIITIVYMPIFSLTGVEGK 474
EN R + + + + + +++ + +++ V++P+ G G
Sbjct: 414 ENVERVMMEDKLPPK--------EATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGA 465

Query: 475 MFHPMAATVVIALLSAMVLSLTIVPAAVAVFLNGKISEKESA----------VIRSAKTL 524
++ + T+V A+ +++++L + PA A L +E +
Sbjct: 466 IYRQFSITIVSAMALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNH 525

Query: 525 YAPLLALALKWRVLVIGLASALVGVCLWLATTLGSEFVPQLDEGDIALHAMRIPGTGLEQ 584
Y + L + + + +V + L L S F+P+ D+G G E+
Sbjct: 526 YTNSVGKILGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQER 585

Query: 585 -AVAMQEILEQKIKTFAEVDKVFARIGTSEVATDPMPPNVADNFVILKPRSEWPNPDKTK 643
+ ++ + +K E V + + + N FV LKP E + +
Sbjct: 586 TQKVLDQVTDYYLK--NEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSA 643

Query: 644 AQLVTEMEAALATLPGNNYEFTQPIQM-RFNELISGVRADLG-IKVFGDDLDQLVTSANQ 701
++ + L + F P M EL + D I G D L + NQ
Sbjct: 644 EAVIHRAKMELGKIRDG---FVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQ 700

Query: 702 ILQAVNKVQGA-ADIKVEQVTGLPTLSVIPNRTALARYGLNVVELQDWVSAAIGGTSAGI 760
+L + + ++ + + ++ G+++ ++ +S A+GGT
Sbjct: 701 LLGMAAQHPASLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVND 760

Query: 761 LYEGDRRFELIVRLPETLRRDLDKLAVLPVPLPNGDFVPLQEVATLDLSPAPAQISRENG 820
+ R +L V+ R + + L V NG+ VP T ++ R NG
Sbjct: 761 FIDRGRVKKLYVQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNG 820

Query: 821 KRRVVVTANVRGRDLGSFVEEVKAQINR-DVALPAGYWLDYGGTFEQLESASQRLSIVVP 879
+ + G+ + A + LPAG D+ G Q + + +V
Sbjct: 821 LPSMEIQGEAAP---GTSSGDAMALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVA 877

Query: 880 VTLLLILGILVMAFASFKDALIIFSGVPLALTGGVLALYLRGMPLSISAGIGFIALSGVA 939
++ +++ L + S+ + + VPL + G +LA L + +G + G++
Sbjct: 878 ISFVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLS 937

Query: 940 VLNGLVMLSFIRDLWREKG-DLLLAITEGALTRLRPVLMTALVASLGFVPMAINIGTGAE 998
N ++++ F +DL ++G ++ A RLRP+LMT+L LG +P+AI+ G G+
Sbjct: 938 AKNAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSG 997

Query: 999 VQRPLATVVIGGIISSTLLTLFVLPVLYHWLHSKF 1033
Q + V+GG++S+TLL +F +PV + + F
Sbjct: 998 AQNAVGIGVMGGMVSATLLAIFFVPVFFVVIRRCF 1032


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Sputcn32_1969  RTXTOXIND  39  1e-05  Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 39.4 bits (92), Expect = 1e-05
Identities = 21/129 (16%), Positives = 42/129 (32%), Gaps = 23/129 (17%)

Query: 156 TIKAPIDGVIIQRSAN-VGEVAQQ-QTLFSIA-DFGGLWADFRLYPSQQDAVATGQKVVI 212
I+AP+ + Q + G V +TL I + L + + GQ +I
Sbjct: 329 VIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAII 388

Query: 213 MAADT-------EINGVIAHIIPSL-----TQPYQLARVKLD-------NRDGKLSSGQL 253
+ + G + +I + ++ N++ LSSG
Sbjct: 389 K-VEAFPYTRYGYLVGKVKNINLDAIEDQRLGLVFNVIISIEENCLSTGNKNIPLSSGMA 447

Query: 254 IEGAVISGE 262
+ + +G
Sbjct: 448 VTAEIKTGM 456


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Sputcn32_1973  BACYPHPHTASE  28  0.013  Salmonella/Yersinia modular tyrosine phosphatase si...
		>BACYPHPHTASE#Salmonella/Yersinia modular tyrosine phosphatase signature.

Length = 468

Score = 28.2 bits (62), Expect = 0.013
Identities = 14/68 (20%), Positives = 30/68 (44%)

Query: 61 ADIERLLFIKHCRSLDLSLAEIRQLLSLNDSPMSQCDDVNQMIEQHIEQVERRIADLTQL 120
D +L + HCR+ A++ + +NDS SQ + + + +++ + QL
Sbjct: 392 GDDSKLRPVIHCRAGVGRTAQLIGAMCMNDSRNSQLSVEDMVSQMRVQRNGIMVQKDEQL 451

Query: 121 NKQLKALR 128
+ +K
Sbjct: 452 DVLIKLAE 459


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Sputcn32_1981  HTHFIS  90  7e-23  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 89.5 bits (222), Expect = 7e-23
Identities = 30/125 (24%), Positives = 58/125 (46%), Gaps = 2/125 (1%)

Query: 2 RILLVEDDQLLAQGLVTALERLHYRVEHCISGNQALQAAQNSLFDVMILDLGLPDGLALP 61
IL+ +DD + L AL R Y V + + D+++ D+ +PD A
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 VIKTLRASNQALPILVLTAWDHLETKIDVLDAGADDYVLKPCDVREIEARLR--VVHRRH 119
++ ++ + LP+LV++A + T I + GA DY+ KP D+ E+ + + +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 120 QQRQH 124
+ +
Sbjct: 125 RPSKL 129


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Sputcn32_1984  GPOSANCHOR  37  8e-05  Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 37.4 bits (86), Expect = 8e-05
Identities = 32/250 (12%), Positives = 78/250 (31%), Gaps = 6/250 (2%)

Query: 51 EDEKGRLDDLGALSDELGQLVGHLARRLQDEALQRVQHAEERHTLALQEKLQQLDTQGRE 110
++ + R DL + + +++ ++ A + L ++ L+
Sbjct: 116 QELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALAARKADL--EKALEGAMNFSTA 173

Query: 111 LNSSYEQCETLQQQLRDALAQIDGLKEARHQDALQIAAMKTEMHHLTLKLHDRDTQIQSL 170
++ + E + L A+++ E + +A + L R ++
Sbjct: 174 DSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKA 233

Query: 171 EAKHTHARQALEHYRESVKTNREQELQRHEHQVQQLQAELRTCQQIISVKQQDYTAVKEQ 230
+ A + E E E + +L+ L + ++ +
Sbjct: 234 LEGAMNFSTADSAKIK----TLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAE 289

Query: 231 AQELTLQLKYSQDAAAAFEKQLTFLKEDNNDLRRQQQINEAELSELKHAVGARMQETQAL 290
L + + + L+ D + R ++ EAE +L+ Q+L
Sbjct: 290 KAALEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSL 349

Query: 291 TEQLNTMRGA 300
L+ R A
Sbjct: 350 RRDLDASREA 359


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Sputcn32_1996  HTHFIS  36  6e-06  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 35.6 bits (82), Expect = 6e-06
Identities = 12/48 (25%), Positives = 17/48 (35%), Gaps = 5/48 (10%)

Query: 28 KVLVVDDEPDVHTVTKLALSRFRLDGRALSFINAYSGEQAKALLRQEQ 75
+LV DD+ + TV ALSR D R + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRI-----TSNAATLWRWIAAGD 47


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Sputcn32_1999  TCRTETB  67  3e-14  Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 67.2 bits (164), Expect = 3e-14
Identities = 83/389 (21%), Positives = 144/389 (37%), Gaps = 52/389 (13%)

Query: 10 NIKFFMFLFYLAMLSMLGFIATDMYLPAFKAIEGSLGSTPSQVAMSLTCFLAGLALGQLL 69
N++ L +L +LS + + + I P+ T F+ ++G +
Sbjct: 9 NLRHNQILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAV 68

Query: 70 YGPLVNKIGKRWALIFGLVLFAIASVFIANSDSILMLNI-ARFFQAIGACSAGVIWQAIV 128
YG L +++G + L+FG+++ SV S L I ARF Q GA + + +V
Sbjct: 69 YGKLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVV 128

Query: 129 VEQYDADKAQGIFSNIMPLVALSPALAPILGAYILNELGWR------------------- 169
+ F I +VA+ + P +G I + + W
Sbjct: 129 ARYIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHWSYLLLIPMITIITVPFLMKL 188

Query: 170 ---------------AIFISLCVIAFM----------LVLMTLYFVPSNKHQHKHSEHAA 204
I +S+ ++ FM L++ L F+ KH K
Sbjct: 189 LKKEVRIKGHFDIKGIILMSVGIVFFMLFTTSYSISFLIVSVLSFLIFVKHIRK-VTDPF 247

Query: 205 VSYGQILKNTRYLGNVVIFGACSGAFFAYLTVWPIVME-MHGYQATEIGLSFI-PQTIMF 262
V G + KN ++ V+ G G ++++ P +M+ +H EIG I P T+
Sbjct: 248 VDPG-LGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSV 306

Query: 263 IFGGYASKLLIKRIGATQTLSILLSIFAVCVISIIMFTLIYPAGSIFPLLISFSILAAAN 322
I GY +L+ R G L+I ++ +V ++ ++ L+
Sbjct: 307 IIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGLSFTK 366

Query: 323 GAIYPIVVNSALQQFSQNAAKAAGLQNFL 351
I IV +S Q Q A L NF
Sbjct: 367 TVISTIVSSSLKQ---QEAGAGMSLLNFT 392


34  Sputcn32_2160  Sputcn32_2189  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias     Product
Sputcn32_2160  2       17       3.364183    acylphosphatase
Sputcn32_2161  2       17       2.994361    hypothetical protein
Sputcn32_2162  2       20       3.507810    hypothetical protein
Sputcn32_2163  3       20       3.349272    hypothetical protein
Sputcn32_2164  2       20       3.148540    phage integrase family protein
Sputcn32_2165  2       19       3.026494    YD repeat-containing protein
Sputcn32_2166  -2      16       0.088835    hypothetical protein
Sputcn32_2167  -1      14       0.653508    YD repeat-containing protein
Sputcn32_2168  -1      14       -0.086735   hypothetical protein
Sputcn32_2169  0       14       1.640787    hypothetical protein
Sputcn32_2170  0       14       2.280048    beta-hexosaminidase
Sputcn32_2171  1       18       3.540061    L-serine dehydratase 1
Sputcn32_2172  2       18       3.302689    hypothetical protein
Sputcn32_2173  3       22       3.800102    carboxylesterase
Sputcn32_2174  4       23       4.595017    alcohol dehydrogenase
Sputcn32_2175  6       24       4.899937    LysR family transcriptional regulator
Sputcn32_2176  5       26       5.320445    FAD dependent oxidoreductase
Sputcn32_2177  4       27       4.916671    hypothetical protein
Sputcn32_2178  4       28       5.274981    ATP phosphoribosyltransferase
Sputcn32_2179  4       28       5.689571    histidinol dehydrogenase
Sputcn32_2180  3       28       4.968572    histidinol-phosphate aminotransferase
Sputcn32_2181  2       28       4.908180    imidazole glycerol-phosphate
Sputcn32_2182  0       18       2.731680    imidazole glycerol phosphate synthase subunit
Sputcn32_2183  0       18       2.332700    1-(5-phosphoribosyl)-5-[(5-
Sputcn32_2184  1       18       2.120865    imidazole glycerol phosphate synthase subunit
Sputcn32_2185  1       17       0.790261    bifunctional phosphoribosyl-AMP
Sputcn32_2186  1       16       0.551397    aromatic amino acid transporter
Sputcn32_2187  2       16       -0.849546   hypothetical protein
Sputcn32_2188  2       20       -3.549041   hypothetical protein
Sputcn32_2189  2       18       -3.599185   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Sputcn32_2165  SALSPVBPROT  46  2e-06  Salmonella virulence plasmid 65kDa B protein signature.
		>SALSPVBPROT#Salmonella virulence plasmid 65kDa B protein signature.

Length = 591

Score = 45.5 bits (107), Expect = 2e-06
Identities = 45/180 (25%), Positives = 70/180 (38%), Gaps = 20/180 (11%)

Query: 296 GAAVFSVPIDIPPGRNGMQPAVSLDYSSRSGQGIAGVGWSVTAGSALHRCDTTVAQEGLS 355
G A ++P+ I R G PA++L YSS G G GVGWS S V Q S
Sbjct: 34 GLASITLPLPISAER-GFAPALALHYSSGGGNGPFGVGWSCATMSIARSTSHGVPQYNDS 92

Query: 356 --------RAVIMSASDRLCLDGQKLMAVSG-----QYGTSGAQYRTELDQFARVTQYGA 402
++ + S + A Y + Q RTE F R+ +
Sbjct: 93 DEFLGPDGEVLVQTLSTGDAPNPVTCFAYGDVSFPQSYTVTRYQPRTESS-FYRLEYWVG 151

Query: 403 LTSATTYFVVERKDNIVATYGGTTDSRHI---ALGHTLPMTWAINKQQDRAGNTMTYAYL 459
++ ++++ + I+ G T +R A HT W + + AG + Y+YL
Sbjct: 152 NSNGDDFWLLHDSNGILHLLGKTAAARLSDPQAASHT--AQWLVEESVTPAGEHIYYSYL 209


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Sputcn32_2172  ACETATEKNASE  31  0.006  Acetate kinase family signature.
		>ACETATEKNASE#Acetate kinase family signature.

Length = 400

Score = 30.5 bits (69), Expect = 0.006
Identities = 18/66 (27%), Positives = 32/66 (48%), Gaps = 15/66 (22%)

Query: 78 SVVTTREVETEADFARLAKMPQVSGIAPLNRILLPENLDSIKSLREAAARLKADILLIYT 137
SV+ T +V L + +APL+ P N++ I +A ++ D+ ++
Sbjct: 101 SVLITDDV--------LKAITDCIELAPLHN---PANIEGI----KACTQIMPDVPMVAV 145

Query: 138 FDTSFH 143
FDT+FH
Sbjct: 146 FDTAFH 151


35  Sputcn32_2252  Sputcn32_2270  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias     Product
Sputcn32_2252  2       19       -2.589391   Ig domain-containing protein
Sputcn32_2253  -1      15       -3.449014   sodium:dicarboxylate symporter
Sputcn32_2254  0       13       -3.426338   hypothetical protein
Sputcn32_2255  0       14       -3.249872   two component transcriptional regulator
Sputcn32_2256  -1      14       -3.296560   integral membrane sensor signal transduction
Sputcn32_2257  -1      18       -2.777502   prolyl 4-hydroxylase subunit alpha
Sputcn32_2258  -1      16       -2.558549   metal dependent phosphohydrolase
Sputcn32_2259  0       19       -2.530303   magnesium and cobalt transport protein CorA
Sputcn32_2260  0       20       -2.053003   hypothetical protein
Sputcn32_2261  0       16       -0.777898   hypothetical protein
Sputcn32_2262  0       19       0.529428    NAD-dependent deacetylase
Sputcn32_2263  2       26       0.431507    ferric uptake regulator
Sputcn32_2264  2       24       1.004199    N-acetyltransferase GCN5
Sputcn32_2265  4       26       1.112283    GreA/GreB family elongation factor
Sputcn32_2266  4       27       1.525089    succinyl-CoA synthetase subunit alpha
Sputcn32_2267  4       25       1.310890    succinyl-CoA synthetase subunit beta
Sputcn32_2268  4       23       1.039924    2-oxoglutarate dehydrogenase, E2 subunit,
Sputcn32_2269  3       23       0.863480    2-oxoglutarate dehydrogenase E1 component
Sputcn32_2270  2       20       0.094934    succinate dehydrogenase iron-sulfur subunit
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Sputcn32_2252  INTIMIN  31  0.031  Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 30.8 bits (69), Expect = 0.031
Identities = 51/254 (20%), Positives = 78/254 (30%), Gaps = 31/254 (12%)

Query: 48 NTNISAATPATVSAKVVDSKLGVLANKLVTFTLDDSALGVFQTLEDTASNGAVVTDANGV 107
N A + SA + T TL G TA +AN V
Sbjct: 599 NIVSGTAVLSANSANTN-------GSGKATVTLKSDKPGQVVVSAKTA-EMTSALNANAV 650

Query: 108 ASLKLLTAQKAGGGVITAKVSSGESGTIPFMMKGDGG------------NAGGGPQVTLK 155
+ A + I + +K G G + +
Sbjct: 651 IFVDQTKASITEIKADKTTAVANGQDAITYTVKVMKGDKPVSNQEVTFTTTLGKLSNSTE 710

Query: 156 LTDAAGNSINSITSTTPG--ILSATVTGLTKTVIVTFDSTIGDLPVKTAITDLNGKASVN 213
TD G + ++TSTTPG ++SA V+ + V L + ++ G
Sbjct: 711 KTDTNGYAKVTLTSTTPGKSLVSARVSDVAVDVKAPEVEFFTTLTIDDGNIEIVGTGV-- 768

Query: 214 LYAGSTPGAGTATASLSTGEVGEQVFVVGATNVLMGSGSPFVSGKADVSALTLSAGGTAT 273
G P T L G+V + + S A +TL GT T
Sbjct: 769 --KGKLP-----TVWLQYGQVNLKASGGNGKYTWRSANPAIASVDASSGQVTLKEKGTTT 821

Query: 274 ISILLQDDQGKAFT 287
IS++ D+Q +T
Sbjct: 822 ISVISSDNQTATYT 835


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Sputcn32_2255  HTHFIS  77  2e-18  FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 76.8 bits (189), Expect = 2e-18
Identities = 32/120 (26%), Positives = 57/120 (47%), Gaps = 1/120 (0%)

Query: 2 RILVVEDDVILSHHLKVQLSDLGNQVQVALTAKEGFYQATNYPIDVAIVDLGLPDQDGIS 61
ILV +DD + L LS G V++ A + D+ + D+ +PD++
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 LIQQLRDAELKAPILILTARVNWQDKVEGLNAGADDYLVKPFQKEELVARLD-ALVRRSA 120
L+ +++ A P+L+++A+ + ++ GA DYL KPF EL+ + AL
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Sputcn32_2256  FLGFLIH  31  0.011  Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 30.5 bits (68), Expect = 0.011
Identities = 18/59 (30%), Positives = 29/59 (49%), Gaps = 3/59 (5%)

Query: 205 INEGKTKSLSEGYPVELEGVTQALNQMLQQSSAQQQRYQNAMNDLAHSLKTRLAAVHAI 263
I EG+ + +GY EG+ Q L Q L ++ +QQ M L +T L A+ ++
Sbjct: 60 IAEGRQQGHKQGYQ---EGLAQGLEQGLAEAKSQQAPIHARMQQLVSEFQTTLDALDSV 115


36  Sputcn32_2400  Sputcn32_2407  Y  N  N  Genomic Island
LocusTag       DNBias  CDNBias  %GCBias     Product
Sputcn32_2400  1       18       3.770704    chromosome segregation and condensation protein
Sputcn32_2401  1       19       4.370450    Sua5/YciO/YrdC/YwlC family protein
Sputcn32_2402  2       21       4.721170    phosphotransferase domain-containing protein
Sputcn32_2403  2       24       5.222501    anthranilate synthase component I
Sputcn32_2404  3       23       5.495341    anthranilate synthase component II
Sputcn32_2405  3       23       4.974224    anthranilate phosphoribosyltransferase
Sputcn32_2406  1       21       3.654202    bifunctional indole-3-glycerol phosphate
Sputcn32_2407  0       20       3.154616    tryptophan synthase subunit beta
37  Sputcn32_2441  Sputcn32_2460  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias     Product
Sputcn32_2441  2       10       1.064970    tRNA pseudouridine synthase A
Sputcn32_2442  2       10       1.502715    hypothetical protein
Sputcn32_2443  1       12       2.910667    aspartate-semialdehyde dehydrogenase
Sputcn32_2444  1       15       3.736438    D-isomer specific 2-hydroxyacid dehydrogenase
Sputcn32_2445  1       15       4.407785    beta-ketoacyl synthase
Sputcn32_2446  0       18       4.882284    FAD dependent oxidoreductase
Sputcn32_2447  0       17       4.369101    tRNA/rRNA methyltransferase SpoU
Sputcn32_2448  -1      16       3.941122    hypothetical protein
Sputcn32_2449  -1      16       3.800487    ATP-NAD/AcoX kinase
Sputcn32_2450  -2      10       0.860664    major facilitator superfamily transporter
Sputcn32_2451  -1      12       -4.892944   chorismate synthase
Sputcn32_2452  0       14       -5.853445   N5-glutamine S-adenosyl-L-methionine-dependent
Sputcn32_2453  2       17       -6.497447   hypothetical protein
Sputcn32_2454  0       13       -5.291218   phosphohistidine phosphatase, SixA
Sputcn32_2455  -1      11       -4.390558   peptidase M16 domain-containing protein
Sputcn32_2456  -1      12       -3.668689   PAS/PAC and GAF sensor-containing diguanylate
Sputcn32_2457  1       18       3.164481    hypothetical protein
Sputcn32_2458  1       17       3.570704    hypothetical protein
Sputcn32_2459  1       18       3.524012    multifunctional fatty acid oxidation complex
Sputcn32_2460  0       16       3.392374    3-ketoacyl-CoA thiolase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Sputcn32_2442  IGASERPTASE  34  0.004  IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 33.9 bits (77), Expect = 0.004
Identities = 29/162 (17%), Positives = 58/162 (35%), Gaps = 4/162 (2%)

Query: 137 AGVKSPAKTTTPRAKEPNTTKAIAQTSPVVAPVVKVEPKTELKVEPKVEPKSEPVESKQD 196
A S K T + T + + V + PK +V PK E +SE V+ + +
Sbjct: 1086 AQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQE-QSETVQPQAE 1144

Query: 197 AVAKTATVVKSEPKMLSSDVNARLEAAERKTLSLTDELARTQDQLSVRNSDVEALKAKVE 256
+ V + ++ A E ++T S ++ ++ NS VE +
Sbjct: 1145 PARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTP 1204

Query: 257 ELNQQIAVLEETLLASKQQNQALKAELEAAQSADVVSVSEPD 298
Q E + + +++++ + A + S D
Sbjct: 1205 ATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPA---TTSSND 1243


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Sputcn32_2450  TCRTETA  42  2e-06  Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 42.1 bits (99), Expect = 2e-06
Identities = 66/356 (18%), Positives = 121/356 (33%), Gaps = 48/356 (13%)

Query: 47 GFLLAILMATRIVAPNVWAKVADRTGMRAELIKMGAGAAALAYLSFFYHGGFVYMALSLA 106
G LLA+ + V ++DR G R L+ AGAA + MA +
Sbjct: 46 GILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAI----------MATAPF 95

Query: 107 LYTFFWNAILAQLEVIT----------LETLGENASRYGQIRSFGSIGFICLVVGAGFAI 156
L+ + I+A + T + E A +G + + G + AG +
Sbjct: 96 LWVLYIGRIVAGITGATGAVAGAYIADITDGDERARHFGFMSACFGFGMV-----AGPVL 150

Query: 157 GQWGTEVLPYI---------GLTLFTGMLLSALPLPANRAVRPAGQERHRLK-------W 200
G P+ GL TG L LP RP +E
Sbjct: 151 GGLMGGFSPHAPFFAAAALNGLNFLTGCFL--LPESHKGERRPLRREALNPLASFRWARG 208

Query: 201 TKPIVWFMISAMLLQMSAGPFYGFFVLYLKQA-GYSESSAGI-FVALGAMAEIVMFMFAP 258
+ M ++Q+ +V++ + + ++ GI A G + + M
Sbjct: 209 MTVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITG 268

Query: 259 RLLGRYGVNTLLIVSIGMTCLRWLLVAFGVDSVLWLGFSQLLHAFTFGLTHAASIQFVHK 318
+ R G L++ + ++L+AF W+ F ++ + G+ A + +
Sbjct: 269 PVAARLGERRALMLGMIADGTGYILLAFATRG--WMAFPIMVLLASGGIGMPALQAMLSR 326

Query: 319 HFDASHRSQGQALYASLSFGVGGALGTWICGYIWGDGSGAVWSWVFAAVCAFAAML 374
D + Q Q A+L+ + +G + I+ W + A A +
Sbjct: 327 QVDEERQGQLQGSLAALT-SLTSIVGPLLFTAIYAASITTWNGWAWIAGAALYLLC 381


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Sputcn32_2456  BINARYTOXINB  33  0.014  Binary toxin B family signature.
		>BINARYTOXINB#Binary toxin B family signature.

Length = 764

Score = 32.7 bits (74), Expect = 0.014
Identities = 32/161 (19%), Positives = 62/161 (38%), Gaps = 24/161 (14%)

Query: 111 ITNENELLIGTDKGLVLYNKNDESFNTLDIKSEYLDGEIWSLSNNDFNNEILVGINKGIV 170
+ NE+E + +GL+ Y +D +F + + G++ S+ +++ N + N+
Sbjct: 37 LLNESES---SSQGLLGYYFSDLNFQAPMVVTSSTTGDL-SIPSSELEN--IPSENQYFQ 90

Query: 171 SLDLKNNNIRNDYIGKDYLEVKKSLNIDNKIFIKSYDG--FLYEVNENKIKIIYDNVFDI 228
S K E + + DN + + D N NKI++ ++ I
Sbjct: 91 SAIWSGF-----IKVKKSDEYTFATSADNHVTMWVDDQEVINKASNSNKIRLEKGRLYQI 145

Query: 229 ----EKENKT-------LYISTNNGLYTYSQEGILKLSEYK 258
++EN T LY + + L+L E K
Sbjct: 146 KIQYQRENPTEKGLDFKLYWTDSQNKKEVISSDNLQLPELK 186


38  Sputcn32_2523  Sputcn32_2541  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias     Product
Sputcn32_2523  0       15       -3.293111   hypothetical protein
Sputcn32_2524  0       24       -4.974779   excinuclease ABC subunit C
Sputcn32_2525  -1      26       -6.073719   GAF sensor signal transduction histidine kinase
Sputcn32_2526  0       30       -7.840051   copper resistance lipoprotein NlpE
Sputcn32_2527  2       35       -9.841524   polysaccharide biosynthesis protein CapD
Sputcn32_2528  4       39       -12.696641  sugar transferase
Sputcn32_2529  5       40       -13.746259  NAD-dependent epimerase/dehydratase
Sputcn32_2530  4       42       -15.223268  glycosyl transferase family protein
Sputcn32_2531  5       41       -15.833936  glycosyl transferase family protein
Sputcn32_2532  4       40       -15.146317  hypothetical protein
Sputcn32_2533  2       35       -12.594723  glycosyl transferase family protein
Sputcn32_2534  1       31       -10.939066  hypothetical protein
Sputcn32_2535  0       27       -8.562232   polysaccharide biosynthesis protein
Sputcn32_2536  0       23       -6.158480   group 1 glycosyl transferase
Sputcn32_2537  0       21       -3.787834   dTDP-4-dehydrorhamnose 3,5-epimerase
Sputcn32_2538  -2      17       -2.876624   dTDP-4-dehydrorhamnose reductase
Sputcn32_2539  -2      16       -3.203625   glucose-1-phosphate thymidylyltransferase
Sputcn32_2540  -1      16       -2.685888   dTDP-glucose-4,6-dehydratase
Sputcn32_2541  -1      15       -3.137511   lipopolysaccharide biosynthesis protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Sputcn32_2525  PF06580  44  1e-06  Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 43.7 bits (103), Expect = 1e-06
Identities = 26/188 (13%), Positives = 66/188 (35%), Gaps = 45/188 (23%)

Query: 269 SLVDRNLARAAELV--------HNFKRTAADQSIWERERFNLKAY--ILQVFSSLKPLMR 318
+L+ + +A E++ ++ + + A Q E + +Y + ++
Sbjct: 184 ALILEDPTKAREMLTSLSELMRYSLRYSNARQVSLADELTVVDSYLQLAS--------IQ 235

Query: 319 -KKSISLTINVDDGIYLYSYPGAIAQIFTNLVSNSFRHGFPDNFTGEKQIIVEADKTSST 377
+ + ++ I P + Q LV N +HG G +I+++ K + T
Sbjct: 236 FEDRLQFENQINPAIMDVQVPPMLVQT---LVENGIKHGIAQLPQG-GKILLKGTKDNGT 291

Query: 378 INIIYQDSGVGMSDDIKARAFEPFFTTAKQSGGTGLGMSIIHNLVTQKLNGRISLISAPN 437
+ + +++G + K TG G+ + R+ ++
Sbjct: 292 VTLEVENTGSLALKNTK--------------ESTGTGLQNVRE--------RLQMLYGTE 329

Query: 438 QGVKIEMQ 445
+K+ +
Sbjct: 330 AQIKLSEK 337


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
Sputcn32_2527  NUCEPIMERASE  55  3e-10  Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 55.2 bits (133), Expect = 3e-10
Identities = 48/298 (16%), Positives = 101/298 (33%), Gaps = 44/298 (14%)

Query: 283 VMVTGAGGSIGSELCRQILKQSPKKLVLFELSEFALYSIERELCATAAELGLELDILPIM 342
+VTGA G IG + +++L+ + + + L+++ S+++ A L +
Sbjct: 3 YLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQ---ARLELLA-QPGFQFHK 58

Query: 343 GSVQRENRVQAVMQAFKVQTVYHAAAYKHVPLVEHNVVEGVRNNVFGTLYTARAAIAAKV 402
+ + + + + V+ + V N +N+ G L K+
Sbjct: 59 IDLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKI 118

Query: 403 ETFVLVST---------------DKAVRPTNIMGTTKRMAELALQALAKEEHQTRFCMVR 447
+ + S+ D P ++ TK+ EL + + +R
Sbjct: 119 QHLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYS-HLYGLPATGLR 177

Query: 448 FGNVLGSSGS---VVPLFRKQIANGGPVTV-THPEITRFFMTIPEASQLVIQA------- 496
F V G G + F K + G + V + ++ R F I + ++ +I+
Sbjct: 178 FFTVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAEAIIRLQDVIPHA 237

Query: 497 -----------GAMGKGGDVFVLDMGKSVKIVDLAAKMIRLSGYEVKDE--DHPNGDI 541
A V+ + V+++D + G E K GD+
Sbjct: 238 DTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEAKKNMLPLQPGDV 295


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2529  NUCEPIMERASE  58     9e-12    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 57.9 bits (140), Expect = 9e-12
Identities = 49/230 (21%), Positives = 95/230 (41%), Gaps = 36/230 (15%)

Query: 4 KFLLTGASGFVGKHL--------------------YSINPSQFRC-VVREDGERFPDAYK 42
K+L+TGA+GF+G H+ Y ++ Q R ++ + G +F +K
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQF---HK 58

Query: 43 VNGINSKTDWTDSFLD--IDCIIHL---AGLAHSNNNRNDEYRETNFLGTLCLAQQASKS 97
++ + + TD F + + + +S N Y ++N G L + + +
Sbjct: 59 ID-LADREGMTDLFASGHFERVFISPHRLAVRYSLEN-PHAYADSNLTGFLNILEGCRHN 116

Query: 98 GVKRFVFVSSIGVNGNATLEKPFSIFDE-PKPLNSYTNSKYDAEIGLKKIAAETGLEVVI 156
++ ++ SS V G + PFS D P++ Y +K E+ + GL
Sbjct: 117 KIQHLLYASSSSVYGLNR-KMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYGLPATG 175

Query: 157 VRPTLVYGPNAPGN---FGLLTKLVKKLPVLPFGLANNKRDFIAVQNLAD 203
+R VYGP + F +++ + + KRDF + ++A+
Sbjct: 176 LRFFTVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIAE 225


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2538  NUCEPIMERASE  46     5e-08    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 46.3 bits (110), Expect = 5e-08
Identities = 30/161 (18%), Positives = 59/161 (36%), Gaps = 27/161 (16%)

Query: 1 MKILVTGSHGQVGSCLVKQLSQMPDVEFLAVD--------------REQL---------- 36
MK LVTG+ G +G + K+L + + + +D E L
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGH-QVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKI 59

Query: 37 DITNSAAVSKLVNQFKPDAIINAAAHTAVDKAEQEVELSYAINRDGPQFLAQSANRVG-A 95
D+ + ++ L + + + AV + + N G + +
Sbjct: 60 DLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQ 119

Query: 96 TILHISTDYVFAGDKDGEYVETDAVA-PQGVYGHSKLAGEL 135
+L+ S+ V+ ++ + D+V P +Y +K A EL
Sbjct: 120 HLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANEL 160


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2540  NUCEPIMERASE  184    1e-57    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 184 bits (469), Expect = 1e-57
Identities = 82/361 (22%), Positives = 147/361 (40%), Gaps = 51/361 (14%)

Query: 1 MKILVTGGAGFIGSAVVRHIIGNTPDSVVNVDKLT--YAGNL-ESLSSVASNARYTFEKV 57
MK LVTG AGFIG V + ++ VV +D L Y +L ++ + + + F K+
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEA-GHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKI 59

Query: 58 DICDRAQLDRVFLLHQPDVVMHLAAESHVDRSITGPADFIQTNIVGTYTLLEAARNYWMQ 117
D+ DR + +F + V V S+ P + +N+ G +LE R+ +Q
Sbjct: 60 DLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQ 119

Query: 118 LDTERKLAFRFHHISTDEVYGDLPHPDEQEGQVVNQELPLFTETTPYAPSSPYSASKASS 177
+ S+ VYG N+++P T+ + P S Y+A+K ++
Sbjct: 120 ---------HLLYASSSSVYGL------------NRKMPFSTDDSVDHPVSLYAATKKAN 158

Query: 178 DHLVRAWLRTYGLPTIVTNCSNNYGPYHFPEKLIPLVILNALEGKPLPIYGKGDQIRDWL 237
+ + + YGLP YGP+ P+ + LEGK + +Y G RD+
Sbjct: 159 ELMAHTYSHLYGLPATGLRFFTVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFT 218

Query: 238 YVEDHARALYKVV------------------TEGKVGETYNIGGHNEKQNLEVVQTICRI 279
Y++D A A+ ++ YNIG + + ++ +Q +
Sbjct: 219 YIDDIAEAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDA 278

Query: 280 LDSLVPKATPYAEQITYVTDRPGHDRRYAIDASKMSAELDWQPQETFETGLRKTVEWYLA 339
L +A + + +PG + D + + + P+ T + G++ V WY
Sbjct: 279 LG---IEA-----KKNMLPLQPGDVLETSADTKALYEVIGFTPETTVKDGVKNFVNWYRD 330

Query: 340 N 340

Sbjct: 331 F 331


39  Sputcn32_2591  Sputcn32_2630  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias        Product
Sputcn32_2591  2       17       -0.870460      flagellar basal body L-ring protein
Sputcn32_2592  2       19       -1.161549      flagellar basal body rod protein FlgG
Sputcn32_2593  0       19       -1.724707      flagellar basal body rod protein FlgF
Sputcn32_2594  -1      20       -2.725978      flagellar hook protein FlgE
Sputcn32_2595  -2      19       -4.318877      flagellar basal body rod modification protein
Sputcn32_2596  -1      19       -4.111378      flagellar basal body rod protein FlgC
Sputcn32_2597  1       17       -4.158482      flagellar basal body rod protein FlgB
Sputcn32_2598  1       17       -4.199749      protein-glutamate O-methyltransferase
Sputcn32_2599  0       19       -4.399653      putative CheW protein
Sputcn32_2600  0       23       -4.254480      flagellar basal body P-ring biosynthesis protein
Sputcn32_2601  1       20       -4.539996      anti-sigma-28 factor FlgM
Sputcn32_2602  0       19       -4.324383      FlgN family protein
Sputcn32_2603  0       20       -4.008510      hypothetical protein
Sputcn32_2604  -2      19       -1.853896      hypothetical protein
Sputcn32_2605  0       25       -6.329637      hypothetical protein
Sputcn32_2606  1       27       -8.618181*     hypothetical protein
Sputcn32_2607  0       23       -7.032291      hypothetical protein
Sputcn32_2608  0       20       -7.508548      hypothetical protein
Sputcn32_2609  1       21       -7.895079      bacteriophage replication gene A
Sputcn32_2610  3       26       -10.914142     hypothetical protein
Sputcn32_2611  2       22       -9.059129      retron-type reverse transcriptase
Sputcn32_2612  1       23       -7.708728      phage integrase family protein
Sputcn32_2613  4       36       -11.626469     hypothetical protein
Sputcn32_2614  2       37       -10.548598     hypothetical protein
Sputcn32_2615  3       35       -9.477906      putative sugar nucleotidyltransferase
Sputcn32_2616  2       31       -7.930266      HAD family hydrolase
Sputcn32_2617  1       30       -6.801791      type 12 methyltransferase
Sputcn32_2618  1       29       -5.893540      hypothetical protein
Sputcn32_2619  0       27       -4.776478      hypothetical protein
Sputcn32_2620  0       25       -4.595357      acyl-protein synthetase, LuxE
Sputcn32_2621  -1      25       -4.740741      AMP-dependent synthetase and ligase
Sputcn32_2622  -1      27       -4.954721      short-chain dehydrogenase/reductase SDR
Sputcn32_2623  -1      27       -4.938054      acyl carrier protein
Sputcn32_2624  -1      25       -4.666963      hypothetical protein
Sputcn32_2625  0       23       -6.608003      N-acylneuraminate-9-phosphate synthase
Sputcn32_2626  1       22       -7.063019      FlaR protein (FlaR)
Sputcn32_2627  0       20       -6.886520      acylneuraminate cytidylyltransferase
Sputcn32_2628  0       20       -6.586299      DegT/DnrJ/EryC1/StrS aminotransferase
Sputcn32_2629  0       20       -6.274871      polysaccharide biosynthesis protein CapD
Sputcn32_2630  0       22       -5.768985      hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2591  FLGLRINGFLGH  143    7e-45    Flagellar L-ring protein signature.
		>FLGLRINGFLGH#Flagellar L-ring protein signature.

Length = 232

Score = 143 bits (362), Expect = 7e-45
Identities = 71/215 (33%), Positives = 106/215 (49%), Gaps = 9/215 (4%)

Query: 11 LLLSACSSTQKKPIADDPFYAPVYPEAPPTKIAATGSIYQDSQAA-----SLYSDIRAHK 65
L L+ C+ P+ A P P A GSI+Q +Q L+ D R
Sbjct: 17 LSLTGCAWIPSTPLVQGATSAQPVPGPTP---VANGSIFQSAQPINYGYQPLFEDRRPRN 73

Query: 66 VGDIITIVLKEATQAKKSAGNQIKKGSDMSLDPIYAAGSNISV-AGVPLDLRYKDSMNTK 124
+GD +TIVL+E A KS+ + + + D+
Sbjct: 74 IGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYLQGLFGNARADVEASGGNTFN 133

Query: 125 RESDADQSNSLDGSISANIMQVLNNGNLVVRGEKWISINNGDEFIRVTGIVRSQDIKPDN 184
+ A+ SN+ G+++ + QVL NGNL V GEK I+IN G EFIR +G+V + I N
Sbjct: 134 GKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRFSGVVNPRTISGSN 193

Query: 185 TIDSTRMANARIQYSGTGTFADAQKVGWLSQFFMS 219
T+ ST++A+ARI+Y G G +AQ +GWL +FF++
Sbjct: 194 TVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLN 228


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2592  FLGHOOKAP1    43     1e-06    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 42.6 bits (100), Expect = 1e-06
Identities = 19/119 (15%), Positives = 41/119 (34%), Gaps = 4/119 (3%)

Query: 145 EDATSITVSAEGEVSVKTAGAAENQVVGQLSMTDFINPSGLDPMGQNLYTETG---ASGT 201
D I +++E + + + Q + + +L ++ G A+
Sbjct: 427 TDEAKIAMASEEDAGDSDNRNGQALLDLQSNSKTVGGAKSFNDAYASLVSDIGNKTATLK 486

Query: 202 PIQGTASLDGMGAIRQGALETSNVNVTEELVNLIESQRIYEMNSKVISAVDQMLAYVNQ 260
T + + S VN+ EE NL Q+ Y N++V+ + + +
Sbjct: 487 TSSATQGNV-VTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALIN 544



Score = 35.3 bits (81), Expect = 2e-04
Identities = 9/36 (25%), Positives = 20/36 (55%)

Query: 5 LWISKTGLDAQQTDIAVISNNVANASTVGYKKSRAV 40
+ + +GL+A Q + SNN+++ + GY + +
Sbjct: 4 INNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTI 39


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2593  FLGHOOKAP1    29     0.023    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 28.8 bits (64), Expect = 0.023
Identities = 9/32 (28%), Positives = 17/32 (53%)

Query: 205 SNVNPVDEMVSLIELQRQFEMQVKMMKTAEEI 236
S VN +E +L Q+ + ++++TA I
Sbjct: 507 SGVNLDEEYGNLQRFQQYYLANAQVLQTANAI 538


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2594  FLGHOOKAP1    40     1e-05    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 40.3 bits (94), Expect = 1e-05
Identities = 15/35 (42%), Positives = 22/35 (62%)

Query: 2 SFNIALSGISAAQKDLNTTANNIANANTIGFKESR 36
N A+SG++AAQ LNT +NNI++ N G+
Sbjct: 3 LINNAMSGLNAAQAALNTASNNISSYNVAGYTRQT 37



Score = 37.6 bits (87), Expect = 1e-04
Identities = 13/49 (26%), Positives = 25/49 (51%)

Query: 405 SLSSSALEQSNIDLTTELVDLISAQRNFQANSRTLEVNNTLQQTVLQIR 453
LS+ S ++L E +L Q+ + AN++ L+ N + ++ IR
Sbjct: 498 QLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINIR 546


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2596  FLGHOOKAP1    33     3e-04    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 33.0 bits (75), Expect = 3e-04
Identities = 9/38 (23%), Positives = 18/38 (47%)

Query: 99 NVNVMEEMADMISASRSYQMNVQVAEAAKSMLQQTLGM 136
VN+ EE ++ + Y N QV + A ++ + +
Sbjct: 508 GVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545



Score = 29.9 bits (67), Expect = 0.003
Identities = 16/67 (23%), Positives = 29/67 (43%), Gaps = 6/67 (8%)

Query: 5 SIFDVAGSGMSAQSVRLNTTASNIANADSVSSSIDKTYRSRHPIFEAEMAKAQSQQQTSQ 64
S+ + A SG++A LNT ++NI++ + Y + I + +
Sbjct: 2 SLINNAMSGLNAAQAALNTASNNISSYN------VAGYTRQTTIMAQANSTLGAGGWVGN 55

Query: 65 GVTVRGI 71
GV V G+
Sbjct: 56 GVYVSGV 62


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2599  HTHFIS        61     1e-12    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 61.4 bits (149), Expect = 1e-12
Identities = 23/128 (17%), Positives = 52/128 (40%), Gaps = 12/128 (9%)

Query: 180 HIMVIDDSAVARKQIIRSLESLNLQIDTAKDGREALDKLKAIASEMDNVADEIPLIISDI 239
I+V DD A R + ++L + + + A + L+++D+
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAA---------GDGDLVVTDV 55

Query: 240 EMPEMDGYTLTAEIRDDPKLKHIKVVLHTSLSGVFNQAMVQKVGANDFIAK-FNPDELAA 298
MP+ + + L I+ + V++ ++ + + GA D++ K F+ EL
Sbjct: 56 VMPDENAFDLLPRIKK--ARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIG 113

Query: 299 AVNKHLSL 306
+ + L+
Sbjct: 114 IIGRALAE 121


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2622  DHBDHDRGNASE  129    2e-38    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 129 bits (324), Expect = 2e-38
Identities = 79/253 (31%), Positives = 123/253 (48%), Gaps = 13/253 (5%)

Query: 4 LRGKKALVTGANRGIGLAIAHKFAQQGAELWINGLDSNKIEQVRVEIVSKY-GTDCHALC 62
+ GK A +TGA +GIG A+A A QGA I +D N + +V K A
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAH--IAAVDYNPEKLEKVVSSLKAEARHAEAFP 63

Query: 63 FDVSDPLAVKAGFQYLFKQTKTLDALVNNAGIMDDALLGMVTHQQLERSFSTNTYSVIYC 122
DV D A+ + ++ +D LVN AG++ L+ ++ ++ E +FS N+ V
Sbjct: 64 ADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNA 123

Query: 123 TQYAARLMERAGSGSVINIASIMGRVGNAGQSVYAGSKAAVIGITQSLAKELASKQIRVN 182
++ ++ M SGS++ + S V + YA SKAA + T+ L ELA IR N
Sbjct: 124 SRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCN 183

Query: 183 AIAPGFIETDLVKIL--NENTYESRLQ--------SIAMGRAGFASEVADVAAFLASNMA 232
++PG ETD+ L +EN E ++ I + + S++AD FL S A
Sbjct: 184 IVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQA 243

Query: 233 SYVTGQVIGVDGG 245
++T + VDGG
Sbjct: 244 GHITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2629  NUCEPIMERASE  80     5e-19    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 79.8 bits (197), Expect = 5e-19
Identities = 43/245 (17%), Positives = 87/245 (35%), Gaps = 54/245 (22%)

Query: 6 TILITGGTGSFGQKYTKTILERY-----------------KPKRLIIFSRDELKQYEMQQ 48
L+TG G G +K +LE K RL + ++ + ++
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHK--- 58

Query: 49 VFNAPCMRYFIGDVRDGERLKQAFKDVDF--VIHAAALKQVPAAEYNPMECIKTNIHGAE 106
D+ D E + F F V + V + NP +N+ G
Sbjct: 59 -----------IDLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFL 107

Query: 107 NVIRAAISNNVKKVIALST---------------DKAASPINLYGATKLASDKLFVAANN 151
N++ N ++ ++ S+ D P++LY ATK A++ + ++
Sbjct: 108 NILEGCRHNKIQHLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSH 167

Query: 152 VVGDGKTRFSAVRYGNVVGSRGS---VVPFFKQLIANGATSLPITHPDMTRFWITLQDGV 208
+ G + +R+ V G G + F + + G + + M R + + D
Sbjct: 168 LYG---LPATGLRFFTVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIA 224

Query: 209 DFVLK 213
+ +++
Sbjct: 225 EAIIR 229


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2630  SYCDCHAPRONE  31     0.013    Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 30.7 bits (69), Expect = 0.013
Identities = 18/99 (18%), Positives = 38/99 (38%), Gaps = 8/99 (8%)

Query: 727 TLVKILAHSDEYMPQ---YAYILKLQGKVQESINIY--LDYLEKYPSDTQTWVKLGLFMV 781
T+ + S + + Q A+ GK +++ ++ L L+ D++ ++ LG
Sbjct: 24 TIAMLNEISSDTLEQLYSLAFNQYQSGKYEDAHKVFQALCVLD--HYDSRFFLGLGACRQ 81

Query: 782 EINQIEPAHTAFSNAVNADPTNQVAQHYLTE-LTQLMTP 819
+ Q + A ++S D + E L Q
Sbjct: 82 AMGQYDLAIHSYSYGAIMDIKEPRFPFHAAECLLQKGEL 120


40  Sputcn32_2662  Sputcn32_2678  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias        Product
Sputcn32_2662  3       18       -2.746221      hypothetical protein
Sputcn32_2663  3       17       -2.591337      DoxX family protein
Sputcn32_2664  3       19       -2.464770      pirin domain-containing protein
Sputcn32_2665  2       19       -2.567302      N-acetyltransferase GCN5
Sputcn32_2666  0       19       -3.575325      N-acetyltransferase GCN5
Sputcn32_2667  0       21       -3.498739      NrfJ-like protein
Sputcn32_2668  0       17       -2.958049      hypothetical protein
Sputcn32_2669  0       18       -3.276175      N-acetyltransferase GCN5
Sputcn32_2670  0       17       -4.022157      extracellular solute-binding protein
Sputcn32_2671  -1      17       -4.578018      diguanylate cyclase
Sputcn32_2672  -1      19       -3.955992      hypothetical protein
Sputcn32_2673  0       23       -4.753954      threonine aldolase
Sputcn32_2674  0       23       -5.842780      OmpA/MotB domain-containing protein
Sputcn32_2675  1       22       -5.568995      mechanosensitive ion channel protein MscS
Sputcn32_2676  1       20       -3.683901      redoxin domain-containing protein
Sputcn32_2677  0       21       -3.300866      hypothetical protein
Sputcn32_2678  0       21       -3.307097      hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2662  FRAGILYSIN    28     0.024    Fragilysin metallopeptidase (M10C) enterotoxin signat...
		>FRAGILYSIN#Fragilysin metallopeptidase (M10C) enterotoxin signature.

Length = 405

Score = 27.7 bits (61), Expect = 0.024
Identities = 14/47 (29%), Positives = 23/47 (48%), Gaps = 1/47 (2%)

Query: 1 MKRFKLLIICSVIALVSGCASNQYSVTYDSDPKGAEVFCNGRSYGYT 47
MK KLL++ AL++ C++ S+T D + +S YT
Sbjct: 9 MKNVKLLLMLGTAALLAACSNEADSLTTSIDAP-VTASIDLQSVSYT 54


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2666  SACTRNSFRASE  30     0.005    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 29.5 bits (66), Expect = 0.005
Identities = 11/51 (21%), Positives = 27/51 (52%)

Query: 79 CVGFINVETDYYHRGYIDSFYIHPDWQRQGVGGRVYCELEQWARGQGYASL 129
C+G I + +++ I+ + D++++GVG + + +WA+ + L
Sbjct: 76 CIGRIKIRSNWNGYALIEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGL 126


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2671  ARGREPRESSOR  33     0.001    Bacterial arginine repressor signature.
		>ARGREPRESSOR#Bacterial arginine repressor signature.

Length = 149

Score = 33.3 bits (76), Expect = 0.001
Identities = 25/103 (24%), Positives = 51/103 (49%), Gaps = 11/103 (10%)

Query: 29 IIALLLSHQLRT--EIEDLVTEDVVKVSTALELAKDTEKLQIITTRLNHYD-----NELQ 81
I ++ ++++ T E+ D++ +D V+ A +++D ++L ++ N+ Q
Sbjct: 10 IREIITANEIETQDELVDILKKDGYNVTQA-TVSRDIKELHLVKVPTNNGSYKYSLPADQ 68

Query: 82 RLQLLSELATLWG---LLIDSSRTIAKLETSPQLQQAINNIID 121
R LS+L + IDS+ + L+T P QAI ++D
Sbjct: 69 RFNPLSKLKRSLMDAFVKIDSASHLIVLKTMPGNAQAIGALMD 111


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2674  OMPADOMAIN    71     8e-17    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 70.7 bits (173), Expect = 8e-17
Identities = 29/116 (25%), Positives = 54/116 (46%), Gaps = 12/116 (10%)

Query: 90 VYFEFAIAEVDLSQWKSLALVKAFLEANN--DVALILVGHTDIVGTPEFNYQLSLQRAEN 147
V F F A + +L + + L + D +++++G+TD +G+ +N LS +RA++
Sbjct: 221 VLFNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAYNQGLSERRAQS 280

Query: 148 VRRILVEDYGFNPNRFTIVGKGISEPVADNRSSEGRRL---------NRRVQFIVN 194
V L+ G ++ + G G S PV N ++ +RRV+ V
Sbjct: 281 VVDYLISK-GIPADKISARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEVK 335


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2678  TYPE4SSCAGA   28     0.031    Type IV secretion system CagA exotoxin signature.
		>TYPE4SSCAGA#Type IV secretion system CagA exotoxin signature.

Length = 1147

Score = 27.7 bits (61), Expect = 0.031
Identities = 19/65 (29%), Positives = 28/65 (43%), Gaps = 3/65 (4%)

Query: 38 AWVAQKNDESGYHLDGLMDQRVRDAVNAQLAAKGMSLVDAKEADVLVNYLTKVDKKINVD 97
A AQKN+ + Q V++ VN L G+S EA L + + K++N
Sbjct: 819 AQQAQKNESLNARKKSEIYQSVKNGVNGTLVGNGLSQA---EATTLSKNFSDIKKELNAK 875

Query: 98 TFNTN 102
N N
Sbjct: 876 LGNFN 880


41  Sputcn32_2701  Sputcn32_2708  Y  N  Y  Genomic Island
LocusTag       DNBias  CDNBias  %GCBias        Product
Sputcn32_2701  0       19       -4.882186      hypothetical protein
Sputcn32_2702  -1      20       -4.228152      hypothetical protein
Sputcn32_2703  -1      19       -4.159409      cytochrome B561
Sputcn32_2704  -1      20       -3.310984****  hypothetical protein
Sputcn32_2705  -2      19       -3.478771      hypothetical protein
Sputcn32_2706  -2      19       -3.280426      hypothetical protein
Sputcn32_2707  -2      19       -2.805521      hypothetical protein
Sputcn32_2708  -1      18       -3.008602      hypothetical protein
42  Sputcn32_2771  Sputcn32_2789  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias        Product
Sputcn32_2771  1       15       3.272796       recombination and repair protein
Sputcn32_2772  0       15       3.921680       phosphatidylglycerophosphatase
Sputcn32_2773  -1      16       4.426597       thiamine-monophosphate kinase
Sputcn32_2774  1       18       4.228324       transcription antitermination protein NusB
Sputcn32_2775  1       18       3.708355       6,7-dimethyl-8-ribityllumazine synthase
Sputcn32_2776  1       23       4.412874       3,4-dihydroxy-2-butanone 4-phosphate synthase
Sputcn32_2777  0       22       4.691000       riboflavin synthase subunit alpha
Sputcn32_2778  -1      22       3.966362       riboflavin biosynthesis protein RibD
Sputcn32_2779  0       22       3.768082       transcriptional regulator NrdR
Sputcn32_2780  -1      21       3.193060       serine hydroxymethyltransferase
Sputcn32_2781  0       18       2.996717       putative ABC transporter ATP-binding protein
Sputcn32_2782  -2      14       1.371245       LysR family transcriptional regulator
Sputcn32_2783  -2      15       0.793918       beta-lactamase domain-containing protein
Sputcn32_2784  -2      15       1.006681       putative protein-disulfide isomerase
Sputcn32_2785  -2      14       0.485455       ketosteroid isomerase-like protein
Sputcn32_2786  -2      12       1.889233       RND family efflux transporter MFP subunit
Sputcn32_2787  -2      12       2.273798       acriflavin resistance protein
Sputcn32_2788  -2      16       3.135381       Bcr/CflA subfamily drug resistance transporter
Sputcn32_2789  -2      16       3.237926       AraC family transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2771  GPOSANCHOR    39     6e-05    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 38.5 bits (89), Expect = 6e-05
Identities = 39/217 (17%), Positives = 74/217 (34%), Gaps = 14/217 (6%)

Query: 167 KQIEADLKQLEASQHERIARKQLVQYQVEELD-EFDLKVDEFDEIEQEHKRLANGTELIA 225
+ A A K ++ + EL+ + ++ + K L +A
Sbjct: 165 EGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALA 224

Query: 226 TCQASLDLLTDGEENNIESLLNRVVSLAEDLQSYDPALSNVSTMLNDALIQVQESAGELQ 285
+A L+ +G N + ++ +L + + + + L AL +
Sbjct: 225 ARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAE----LEKALEGAMNFSTADS 280

Query: 286 HYLSKLELDPTHFAYLEERLSKAMQLARKHHVSPNKLAEHHLALKAELSSLDSDESKLEE 345
+ LE A LE + ++ + + L A + L+++ KLEE
Sbjct: 281 AKIKTLE---AEKAALEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEAEHQKLEE 337

Query: 346 VQQQVEASRAAYLSNAQKLSQSRARYAK---ELDKLV 379
+ EASR + L L SR + E KL
Sbjct: 338 QNKISEASRQS-LRR--DLDASREAKKQLEAEHQKLE 371



Score = 36.2 bits (83), Expect = 4e-04
Identities = 48/247 (19%), Positives = 85/247 (34%), Gaps = 19/247 (7%)

Query: 138 KSEHQLTLLDSYANHRLLIDTVAASFQRCKQIEADLKQLEASQHERIARKQLVQYQVEEL 197
SE + + A L + + A +K LEA + ARK ++ +E
Sbjct: 108 LSEKASKIQELEARKADLEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGA 167

Query: 198 DEFDLKVDEFDEIEQEHKRLANGTELIATCQASLDLLTDGEENNIESLLNRVVSLAEDLQ 257
F + K L + QA L+ + + + L+
Sbjct: 168 MNFST------ADSAKIKTLEAEKAALEARQAELE----KALEGAMNFSTADSAKIKTLE 217

Query: 258 SYDPALSNVSTMLNDALIQVQESAGELQHYLSKLELDPTHFAYLEERLSKAMQLARKHHV 317
+ AL+ L AL + + LE + A LE R ++ +
Sbjct: 218 AEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAE---KAALEARQAELEKALEGAMN 274

Query: 318 SPNKLAEHHLALKAELSSLDSDESKLEEVQQQVEASRAAYLSNAQKLSQSRARYAK---E 374
+ L+AE ++L+++++ LE Q + A+R S + L SR + E
Sbjct: 275 FSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQ---SLRRDLDASREAKKQLEAE 331

Query: 375 LDKLVTQ 381
KL Q
Sbjct: 332 HQKLEEQ 338


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2773  TYPE3IMQPROT  27     0.027    Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.

Length = 86

Score = 27.0 bits (60), Expect = 0.027
Identities = 9/39 (23%), Positives = 16/39 (41%)

Query: 76 LSDLAAMGAEPAWMTLALTLPEVDETWLSGFSEGLFEAA 114
+ DL G + ++ L L+ + G GLF+
Sbjct: 1 MDDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTV 39


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2781  PYOCINKILLER  30     0.028    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 30.1 bits (67), Expect = 0.028
Identities = 16/52 (30%), Positives = 26/52 (50%), Gaps = 1/52 (1%)

Query: 103 RLDEVYAAYAEPDADFDALAKEQGELEAIIQAQDAHNLEHILERAANALRLP 154
R++ + AA A +A A+EQ EA +A++ + RAAN +P
Sbjct: 203 RMNTLTAAKASIEAAAANKAREQAAAEAKRKAEEQAR-QQAAIRAANTYAMP 253


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2786  RTXTOXIND     49     1e-08    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 49.4 bits (118), Expect = 1e-08
Identities = 21/108 (19%), Positives = 47/108 (43%), Gaps = 3/108 (2%)

Query: 50 PLTQSISLIGKLA-AERAVVIAPQVTGKIKQIAVTSNQAVKKGQLLIELDDMKAQAAVAE 108
+ + GKL + R+ I P +K+I V ++V+KG +L++L + A+A +
Sbjct: 79 QVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLK 138

Query: 109 ANAYLNDEKRKLKEFEKLISRNAITQTEIDAQKASVDIAQARLTSAQA 156
+ L +L++ I +I ++ K + ++ +
Sbjct: 139 TQSSLLQA--RLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEV 184



Score = 48.7 bits (116), Expect = 2e-08
Identities = 41/246 (16%), Positives = 84/246 (34%), Gaps = 33/246 (13%)

Query: 51 LTQSISLIGKLAAERAVVIAPQVTGKIKQIAV-----TSNQAVKKGQLLIELDDMKAQAA 105
Q + K AER V+A ++ V ++ Q + + ++ +
Sbjct: 202 KYQKELNLDKKRAERLTVLA-RINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENK 260

Query: 106 VAEANAYLNDEKRKLKEFEKLISRNAITQTEIDAQ----------KASVDIAQ--ARLTS 153
EA L K +L++ E I + + + +I L
Sbjct: 261 YVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAK 320

Query: 154 AQADLHYHSLIAPFAGKT-GLINFSEGKMVSVGTELMTL-DDLSSMRLDLQVPEHYLAQL 211
+ + AP + K L +EG +V+ LM + + ++ + V + +
Sbjct: 321 NEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFI 380

Query: 212 SIGMPVSATSRAWPGETF---MGKVVAIDP-RVNEETLNL--KIRVQFD-------NPKD 258
++G A+P + +GKV I+ + ++ L L + + + N
Sbjct: 381 NVGQNAIIKVEAFPYTRYGYLVGKVKNINLDAIEDQRLGLVFNVIISIEENCLSTGNKNI 440

Query: 259 RLKPGM 264
L GM
Sbjct: 441 PLSSGM 446


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2787  ACRIFLAVINRP  778    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 778 bits (2011), Expect = 0.0
Identities = 310/1032 (30%), Positives = 520/1032 (50%), Gaps = 28/1032 (2%)

Query: 3 LSDVSVKRPVVAIVLSLLLCVFGFVSFTKLSVREMPDVESPVVTISTSYSGASASIMESQ 62
+++ ++RP+ A VL+++L + G ++ +L V + P + P V++S +Y GA A ++
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 63 ITKTLEDELTGISGIDEITSTT-RNGSSRITVKFLLGWNLTEGVSDVRDAVARAQRRLPE 121
+T+ +E + GI + ++ST+ GS IT+ F G + V++ + A LP+
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 122 DAKDPIVSKDNGSGEPSVYVNLSSSIMDRTQ--LTDYAQRVLEDRFSLISGVSSISISGG 179
+ + +S + S + S TQ ++DY ++D S ++GV + + G
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 180 LYKVMYVKLRPEQMAGRNVTVTDITNALRKENVETPGGQVRNDTTV------MSVRTKRL 233
Y + + L + + +T D+ N L+ +N + GQ+ + S+ +
Sbjct: 181 QYAMR-IWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 234 YYTPKDFDYLVVRTASDGTPIYLKDVADVAVGAQNENSTFKSDGIVNLSLGIITQSDANP 293
+ P++F + +R SDG+ + LKDVA V +G +N N + +G LGI + AN
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANA 299

Query: 294 LVVAQEVHKEVDRVQNFLPEGTSLVVDFDSTVFIDRSINEVYNTLFVTGALVVLVLYIFI 353
L A+ + ++ +Q F P+G ++ +D+T F+ SI+EV TLF LV LV+Y+F+
Sbjct: 300 LDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFL 359

Query: 354 GQARATLIPAVTVPVSLISAFIAANMFGYSINLLTLMALILAIGLVVDDAIVVVENIFHH 413
RATLIP + VPV L+ F FGYSIN LT+ ++LAIGL+VDDAIVVVEN+
Sbjct: 360 QNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERV 419

Query: 414 I-EKGEEPLLAAYKGTREVGFAVVATTAVLVMVFLPISFMEGMVGLLFTEFSVMLAVSVM 472
+ E P A K ++ A+V VL VF+P++F G G ++ +FS+ + ++
Sbjct: 420 MMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMA 479

Query: 473 FSSLIALTLTPVLSSKLLKANVK-----PNRFNRWVDSGFARMEKVYRAAVSRAIQFRLI 527
S L+AL LTP L + LLK F W ++ F Y +V + +
Sbjct: 480 LSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 528 APLVILACIGGSAWLMQQVPSQLAPQEDRGVLYAFVKGAEGTSYNRMTANMDIVEDRLMP 587
L+ + G L ++PS P+ED+GV ++ G + R +D V D +
Sbjct: 540 YLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLK 599

Query: 588 LLGQGVLRSFSVQAPAFGGRAGDQTGFVIMQLEDWEHRDVTAQQALGIISSA---LKDIP 644
V F+V +F G+ G + L+ WE R+ A +I A L I
Sbjct: 600 NEKANVESVFTVNGFSFSGQ-AQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIR 658

Query: 645 DVMVRPM-MPGFRGQ-SSEPVQFVL---GGSDYTELFKWAQILKEEANASP-MMEGADLD 698
D V P MP ++ F L G + L + L A P + +
Sbjct: 659 DGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPN 718

Query: 699 YAETTPELIVTVDKERAAELGISVDEVSQTLEVMLGGRTETTYVDRGEEYDVYLRGDENS 758
E T + + VD+E+A LG+S+ +++QT+ LGG ++DRG +Y++ D
Sbjct: 719 GLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKF 778

Query: 759 FNNVGDLSQIYMRSAKGELVTLDTVTHIEEVASAQKLSHTNKQKSITLKANISEGYTLGE 818
D+ ++Y+RSA GE+V T V + +L N S+ ++ + G + G+
Sbjct: 779 RMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGD 838

Query: 819 SLAFLENKAVELLPKDISVGYTGESKEFKENQSSILIVFGLALLVAYLVLAAQFESFINP 878
++A +EN A + LP I +TG S + + + + + ++ +V +L LAA +ES+ P
Sbjct: 839 AMALMENLASK-LPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIP 897

Query: 879 LVVMFTVPMGVFGGFLGLLITSQGINIYSQIGMIMLIGMVTKNGILIVEFANQLRDR-GF 937
+ VM VP+G+ G L + +Q ++Y +G++ IG+ KN ILIVEFA L ++ G
Sbjct: 898 VSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGK 957

Query: 938 ELDKAIIDASTRRLRPILMTAFTTLVGAIPLIFSTGAGSESRIAVGTVVFFGMAFATFVT 997
+ +A + A RLRPILMT+ ++G +PL S GAGS ++ AVG V GM AT +
Sbjct: 958 GVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLA 1017

Query: 998 LFVIPAMYRLIS 1009
+F +P + +I
Sbjct: 1018 IFFVPVFFVVIR 1029


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2788  TCRTETB       58     3e-11    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 57.6 bits (139), Expect = 3e-11
Identities = 41/178 (23%), Positives = 86/178 (48%), Gaps = 10/178 (5%)

Query: 12 LLMIFPQAMETIYSPALPNIAENFAVSVGGASQTLSVYFIAFAIGVFCWGRLADIIGRRK 71
+L F E + + +LP+IA +F + + + + F+IG +G+L+D +G ++
Sbjct: 21 ILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKR 80

Query: 72 AMLAGLVCYAIGSALALMV-NDFSLLLLARVLSAFGAA----VGSVITQTMMRDSYSGEE 126
+L G++ GS + + + FSLL++AR + GAA + V+ + G+
Sbjct: 81 LLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKA 140

Query: 127 LAKVFSVMGMSLGISPVIGLLLGSVLSAYWGYQGVFVALMVSAIVLLFLSIKSLPETK 184
+ S++ M G+ P IG ++ + +W Y + + + I+ + +K L +
Sbjct: 141 FGLIGSIVAMGEGVGPAIGGMIAHYI--HWSY---LLLIPMITIITVPFLMKLLKKEV 193


43  Sputcn32_2826  Sputcn32_2834  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias        Product
Sputcn32_2826  2       17       -1.646044      hypothetical protein
Sputcn32_2827  2       16       -1.246434      peptide chain release factor 3
Sputcn32_2828  2       18       -1.465453      lipoprotein NlpI
Sputcn32_2829  4       21       0.312548       polynucleotide phosphorylase/polyadenylase
Sputcn32_2830  4       18       0.918476       diguanylate cyclase/phosphodiesterase
Sputcn32_2831  5       21       2.181607       30S ribosomal protein S15
Sputcn32_2832  5       20       1.892448       tRNA pseudouridine synthase B
Sputcn32_2833  5       22       1.813271       ribosome-binding factor A
Sputcn32_2834  4       21       1.992218       translation initiation factor IF-2
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2827  TCRTETOQM     200    5e-59    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 200 bits (510), Expect = 5e-59
Identities = 107/461 (23%), Positives = 211/461 (45%), Gaps = 47/461 (10%)

Query: 10 KRRTFAIISHPDAGKTTITEKVLLFGNALQKAGTV-KGKKSGQHAKSDWMEMEKDRGISI 68
K +++H DAGKTT+TE +L A+ + G+V KG ++D +E+ RGI+I
Sbjct: 2 KIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGT-----TRTDNTLLERQRGITI 56

Query: 69 TTSVMQFPYGGALVNLLDTPGHEDFSEDTYRTLTAVDSCLMVIDSAKGVEDRTIKLMEVT 128
T + F + VN++DTPGH DF + YR+L+ +D +++I + GV+ +T L
Sbjct: 57 QTGITSFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHAL 116

Query: 129 RLRDTPIVTFMNKLDRDIRDPIDLMDEVENVLNIACAPITWPIGSGKEFKGVYHILRDEV 188
R P + F+NK+D++ D + +++ L+ +++ +V
Sbjct: 117 RKMGIPTIFFINKIDQNGIDLSTVYQDIKEKLSAEI------------------VIKQKV 158

Query: 189 VLYQSGMGHTIQERRVIEGINNPDLDKAIGSYAADLR-DEMELVRGASNEFDHAAFLKGE 247
LY + E + + + D + Y + + +EL + S F
Sbjct: 159 ELYPNMCVTNFTESEQWDTVIEGN-DDLLEKYMSGKSLEALELEQEES-----IRFHNCS 212

Query: 248 LTPVFFGTALGNFGVDHILDGIVEWAPKPLPRESDTRVIMPDEEKFTGFVFKIQANMDPK 307
L PV+ G+A N G+D++++ I R + + G VFKI+ K
Sbjct: 213 LFPVYHGSAKNNIGIDNLIEVITNKFYSSTHR---------GQSELCGKVFKIE--YSEK 261

Query: 308 HRDRVAFMRVCSGRYEQGMKMHHVRIGKDVNVSDALTFMAGDRERAEEAYPGDIIGLHNH 367
R R+A++R+ SG + K + +++ T + G+ + ++AY G+I+ L N
Sbjct: 262 -RQRLAYIRLYSGVLHLRDSVRISEKEK-IKITEMYTSINGELCKIDKAYSGEIVILQNE 319

Query: 368 GTIRIGDTFTQGEKFRFTGVPNFAPEMFR-RIRLRDPLKQKQLLKGLVQLSEEG-AVQVF 425
+++ + + + + P +++ LL L+++S+ ++ +
Sbjct: 320 F-LKLNSVLGDTKLLPQRERIENPLPLLQTTVEPSKPQQREMLLDALLEISDSDPLLRYY 378

Query: 426 RPLDTNDLIVGAVGVLQFEVVVGRLKSEYNVEAIYEGISVS 466
T+++I+ +G +Q EV L+ +Y+VE + +V
Sbjct: 379 VDSATHEIILSFLGKVQMEVTCALLQEKYHVEIEIKEPTVI 419


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2828  SYCDCHAPRONE  31     0.004    Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 30.7 bits (69), Expect = 0.004
Identities = 12/71 (16%), Positives = 20/71 (28%)

Query: 70 NEQRARFHYDRGVIYDSVGLRLLARIDFMQALKLQPDLADAYNFLGIYYTQEGEYDSAYE 129
+ +RF G ++G LA + + Q+GE A
Sbjct: 66 DHYDSRFFLGLGACRQAMGQYDLAIHSYSYGAIMDIKEPRFPFHAAECLLQKGELAEAES 125

Query: 130 AFDGVLELAPN 140
EL +
Sbjct: 126 GLFLAQELIAD 136


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2834  TCRTETOQM     73     4e-15    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 72.6 bits (178), Expect = 4e-15
Identities = 52/202 (25%), Positives = 78/202 (38%), Gaps = 30/202 (14%)

Query: 387 IMGHVDHGKTSLLDYIRRAKVAAGEAG------------------GITQHIGAYHVETEN 428
++ HVD GKT+L + + A E G GIT G + EN
Sbjct: 8 VLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITSFQWEN 67

Query: 429 GMITFLDTPGHAAFTAMRARGAKATDIVVLVVAADDGVMPQTIEAIQHAKAGNVPLIVAV 488
+ +DTPGH F A R D +L+++A DGV QT + +P I +
Sbjct: 68 TKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGIPTIFFI 127

Query: 489 NKMDKPEADIDRV----KSELSQHGVMS-------EDWGGDNMFAFVSAKTGEGVDDLLE 537
NK+D+ D+ V K +LS V+ + + EG DDLLE
Sbjct: 128 NKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEGNDDLLE 187

Query: 538 GILLQAEVLELKAVRDGMAAGV 559
+ + LE + +
Sbjct: 188 K-YMSGKSLEALELEQEESIRF 208


44  Sputcn32_2881  Sputcn32_2922  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_2881  2       18       -3.033026  TonB-dependent siderophore receptor
Sputcn32_2882  2       23       -4.870428  hypothetical protein
Sputcn32_2883  3       25       -4.791282  ISSod8, transposase
Sputcn32_2884  5       28       -4.714012  hypothetical protein
Sputcn32_2886  5       27       -4.313547  hypothetical protein
Sputcn32_2888  4       24       -5.889569  beta-lactamase domain-containing protein
Sputcn32_2889  3       22       -6.858498  hypothetical protein
Sputcn32_2890  2       19       -6.352496  metallophosphoesterase
Sputcn32_2891  2       19       -6.594766  hypothetical protein
Sputcn32_2892  2       18       -6.502296  hypothetical protein
Sputcn32_2893  2       18       -6.379864  hypothetical protein
Sputcn32_2894  2       18       -5.659938  phage integrase family protein
Sputcn32_2895  0       16       -2.938385  hypothetical protein
Sputcn32_2896  1       18       -3.419896  hypothetical protein
Sputcn32_2897  2       19       -2.881264  phage integrase family protein
Sputcn32_2898  1       21       -2.300595  hypothetical protein
Sputcn32_2899  0       22       -1.653523  helix-turn-helix domain-containing protein
Sputcn32_2900  -1      21       -0.802372  phage integrase family protein
Sputcn32_2901  1       23       -0.992278  hypothetical protein
Sputcn32_2902  0       25       0.579865   phage transcriptional regulator AlpA
Sputcn32_2903  0       29       3.139200   hypothetical protein
Sputcn32_2904  1       26       4.591607   hypothetical protein
Sputcn32_2905  1       22       4.094566   hypothetical protein
Sputcn32_2906  2       24       3.984192   putative DNA-binding protein
Sputcn32_2907  0       22       4.138422   hypothetical protein
Sputcn32_2908  0       21       3.815404   N-6 DNA methylase
Sputcn32_2909  0       20       2.825181   restriction modification system DNA specificity
Sputcn32_2910  1       21       2.374088   ATPase
Sputcn32_2911  1       21       2.312336   hypothetical protein
Sputcn32_2912  0       19       1.559945   HsdR family type I site-specific
Sputcn32_2913  1       23       0.007509   XRE family transcriptional regulator
Sputcn32_2914  1       22       -0.209702  hypothetical protein
Sputcn32_2915  2       20       -0.619606  XRE family transcriptional regulator
Sputcn32_2916  4       20       -1.772968  HipA domain-containing protein
Sputcn32_2917  5       21       -3.240375  hypothetical protein
Sputcn32_2918  4       21       -3.271880  transposase, IS4 family protein
Sputcn32_2919  5       23       -4.534909  hypothetical protein
Sputcn32_2920  4       22       -4.498532  von Willebrand factor type A domain-containing
Sputcn32_2921  2       20       -4.739395  ATPase
Sputcn32_2922  1       17       -3.843088  sigma-54 dependent transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2898  ACRIFLAVINRP  27     0.006    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 26.7 bits (59), Expect = 0.006
Identities = 15/61 (24%), Positives = 27/61 (44%), Gaps = 6/61 (9%)

Query: 2 SNTYLTAEELSLRIKYDARTIRDQLKDAILLEGVHYIRPFGGRKILFIWESVEQLMLFGY 61
N T +++S Y A ++D L L GV ++ FG + + IW + L +
Sbjct: 145 DNPGTTQDDIS---DYVASNVKDTLSR---LNGVGDVQLFGAQYAMRIWLDADLLNKYKL 198

Query: 62 S 62
+
Sbjct: 199 T 199


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2922  HTHFIS        335    e-112    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 335 bits (861), Expect = e-112
Identities = 134/392 (34%), Positives = 204/392 (52%), Gaps = 30/392 (7%)

Query: 125 TFYQASKEQGIEAVNIPFNVSAEYVPAVTQLNQSELSKLAQLEAPIDSAFDDIITIGTDI 184
T +AS++ + + PF+++ E + + + + ++LE ++ +
Sbjct: 89 TAIKASEKGAYDYLPKPFDLT-ELIGIIGRALAEPKRRPSKLEDDSQD-GMPLVGRSAAM 146

Query: 185 QTLKQQAQILAQQEMPVLICGESGTGKEMFARAIHNASTGKDKPFIAVNCGAFPSELIDS 244
Q + + L Q ++ ++I GESGTGKE+ ARA+H+ ++ PF+A+N A P +LI+S
Sbjct: 147 QEIYRVLARLMQTDLTLMITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPRDLIES 206

Query: 245 ILFGHKRGAFTGAVSDKAGVFEQAHKGTLFLDEFGELEPAVQVRLLRVLQDGKYTRLGDS 304
LFGH++GAFTGA + G FEQA GTLFLDE G++ Q RLLRVLQ G+YT +G
Sbjct: 207 ELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDMPMDAQTRLLRVLQQGEYTTVGGR 266

Query: 305 EEQSSNFRLITATNKNLISEVAYGRFREDLFYRVAIGVLQLPPLRSRQGDLDFLVDSLLQ 364
S+ R++ ATNK+L + G FREDL+YR+ + L+LPPLR R D+ LV +Q
Sbjct: 267 TPIRSDVRIVAATNKDLKQSINQGLFREDLYYRLNVVPLRLPPLRDRAEDIPDLVRHFVQ 326

Query: 365 TLAKEHPILQGKNISYLAKKVIKEHKWPGNIRELKATLLRAALWASSDCID--------- 415
KE L K A +++K H WPGN+REL+ + R D I
Sbjct: 327 QAEKEG--LDVKRFDQEALELMKAHPWPGNVRELENLVRRLTALYPQDVITREIIENELR 384

Query: 416 ----------------EVEMQKALFTMEDKQDSLLDRELDQGFDIKDILAEVERRYIIKA 459
+ + +A+ + + L +LAE+E I+ A
Sbjct: 385 SEIPDSPIEKAAARSGSLSISQAVEENMRQYFASFGDALPPSGLYDRVLAEMEYPLILAA 444

Query: 460 LEQTDGNKSKVARLLGYNNHQTLSNKLKKLGI 491
L T GN+ K A LLG N TL K+++LG+
Sbjct: 445 LTATRGNQIKAADLLGL-NRNTLRKKIRELGV 475


45  Sputcn32_2936  Sputcn32_2966  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_2936  2       16       2.196883   methyl-accepting chemotaxis sensory transducer
Sputcn32_2937  2       19       3.151613   hypothetical protein
Sputcn32_2938  2       19       3.486269   pseudouridine synthase
Sputcn32_2939  1       19       3.616556   carbamoyl phosphate synthase large subunit
Sputcn32_2940  -1      11       3.226027   carbamoyl phosphate synthase small subunit
Sputcn32_2941  -1      14       4.147148   dihydrodipicolinate reductase
Sputcn32_2942  -1      13       3.680068   peptidylprolyl isomerase, FKBP-type
Sputcn32_2943  -1      16       3.484497   peptidase M48, Ste24p
Sputcn32_2944  -1      17       3.213662   DEAD/DEAH box helicase
Sputcn32_2945  -1      15       1.547991   glycerol dehydrogenase
Sputcn32_2946  -1      15       -0.211580  aldehyde dehydrogenase
Sputcn32_2947  0       17       -3.314471  periplasmic binding protein
Sputcn32_2948  0       20       -4.715743  hypothetical protein
Sputcn32_2949  3       22       -5.426127  N-acetyltransferase GCN5
Sputcn32_2950  4       22       -5.665880  hypothetical protein
Sputcn32_2951  4       23       -3.770811  hypothetical protein
Sputcn32_2952  -1      19       -0.734856  hypothetical protein
Sputcn32_2953  -1      14       1.963190   hypothetical protein
Sputcn32_2954  -1      14       2.878803   hypothetical protein
Sputcn32_2955  1       15       2.453764   hypothetical protein
Sputcn32_2956  3       18       3.202494   hypothetical protein
Sputcn32_2957  3       20       3.348248   cob(I)yrinic acid a,c-diamide
Sputcn32_2958  2       21       3.389198   cobyric acid synthase
Sputcn32_2959  2       21       3.755011   cobalbumin biosynthesis protein
Sputcn32_2960  1       23       4.885589   cobalamin synthase
Sputcn32_2961  1       22       4.188737   nicotinate-nucleotide--dimethylbenzimidazole
Sputcn32_2962  0       23       3.413928   transport system permease
Sputcn32_2963  0       16       0.273668   ABC transporter-like protein
Sputcn32_2964  1       13       -0.803375  phosphoglycerate mutase
Sputcn32_2965  2       12       -0.782555  B12-dependent methionine synthase
Sputcn32_2966  0       18       -4.244497  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2942  INFPOTNTIATR  151    4e-48    Macrophage infectivity potentiator signature.
		>INFPOTNTIATR#Macrophage infectivity potentiator signature.

Length = 233

Score = 151 bits (382), Expect = 4e-48
Identities = 76/203 (37%), Positives = 118/203 (58%), Gaps = 5/203 (2%)

Query: 6 STVEQQASYGVGRQMGEQLAANSFDGVDIPAVQAGLADAFAGLESAVS---MQDLQVAFT 62
+T + + SY +G +G+ D ++ + G+ D +G + ++ M+D+ F
Sbjct: 28 TTDKDKLSYSIGADLGKNFKNQGID-INPDVLAKGMQDGMSGAQLILTEEQMKDVLSKFQ 86

Query: 63 -EISGRIQAAQEQAAAAASAEGEAFLAQNAKREGVTVTDSGLQYEVLVQGSGAKPTYEDT 121
++ + A + A A+G+AFL+ N + G+ V SGLQY+++ G+GAKP DT
Sbjct: 87 KDLMAKRSAEFNKKAEENKAKGDAFLSANKSKPGIVVLPSGLQYKIIDAGTGAKPGKSDT 146

Query: 122 VRTHYHGSFINGDVFDSSVVRGQPAEFPVSGVIAGWTEALQLMPVGTKLKLYVPHHLAYG 181
V Y G+ I+G VFDS+ G+PA F VS VI GWTEALQLMP G+ +++VP LAYG
Sbjct: 147 VTVEYTGTLIDGTVFDSTEKAGKPATFQVSQVIPGWTEALQLMPAGSTWEVFVPADLAYG 206

Query: 182 ERGAGASIPPYSTLVFEVELLDI 204
R G I P TL+F++ L+ +
Sbjct: 207 PRSVGGPIGPNETLIFKIHLISV 229


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2947  FERRIBNDNGPP  45     1e-07    Ferrichrome-binding periplasmic protein signature.
		>FERRIBNDNGPP#Ferrichrome-binding periplasmic protein signature.

Length = 296

Score = 45.3 bits (107), Expect = 1e-07
Identities = 42/175 (24%), Positives = 65/175 (37%), Gaps = 15/175 (8%)

Query: 23 AQPAKRIIALSPHAVEMLYAIGAGDTIVAATDYADY------PEAAKKIPRIGGYYGIQM 76
A RI+AL VE+L A+G D +Y P + +G +
Sbjct: 32 AIDPNRIVALEWLPVELLLALGI--VPYGVADTINYRLWVSEPPLPDSVIDVGLRTEPNL 89

Query: 77 ERVMELNPDLIVVWDSGNKA--EDINQL-RTLGFNLYGSDPKTLEGVAKELEELGQLTGH 133
E + E+ P + VW +G E + ++ GFN + + L K L E+ L
Sbjct: 90 ELLTEMKPSFM-VWSAGYGPSPEMLARIAPGRGFN-FSDGKQPLAMARKSLTEMADLLNL 147

Query: 134 VEEASKAAAAYRAELIRLRVENAKKSE-PKVFYQLWSTPLMTV-SKNSWIQEIMS 186
A A Y + ++ K+ P + L M V NS QEI+
Sbjct: 148 QSAAETHLAQYEDFIRSMKPRFVKRGARPLLLTTLIDPRHMLVFGPNSLFQEILD 202


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2949  SACTRNSFRASE  34     1e-04    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 33.8 bits (77), Expect = 1e-04
Identities = 15/59 (25%), Positives = 23/59 (38%), Gaps = 5/59 (8%)

Query: 72 IEMLFISPDVRGKGIGALLAAHAI-----KDQGATKVDVNEQNEQALGFYQHIGFTVTG 125
IE + ++ D R KG+G L AI ++ + N A FY F +
Sbjct: 92 IEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFIIGA 150


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2951  TYPE4SSCAGA   27     0.036    Type IV secretion system CagA exotoxin signature.
		>TYPE4SSCAGA#Type IV secretion system CagA exotoxin signature.

Length = 1147

Score = 27.0 bits (59), Expect = 0.036
Identities = 15/61 (24%), Positives = 27/61 (44%)

Query: 41 KLDEVTESEWERFNQLVRLLHSEYEMYAVNLENDTIDKLEDLDKYLPTYEEDMNRGQMNF 100
KLD ++E E E+F ++ + + Y L ND I + D + G +++
Sbjct: 409 KLDNLSEKEKEKFRTEIKDFQKDSKAYLDALGNDRIAFVSKKDTKHSALITEFGNGDLSY 468

Query: 101 T 101
T
Sbjct: 469 T 469


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2965  BCTERIALGSPD  31     0.028    Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D signature.

Length = 660

Score = 31.4 bits (71), Expect = 0.028
Identities = 14/71 (19%), Positives = 31/71 (43%), Gaps = 5/71 (7%)

Query: 354 AGLEPLTIDAQTLFVNVGERTN---VTGSAKFLKLIKEGKFEQALDVAREQVESGAQIID 410
+P+ + + + +TN VT + + ++ + LD+ R QV A I +
Sbjct: 298 QAAKPVAALDKNIIIKAHGQTNALIVTAAPDVMNDLE--RVIAQLDIRRPQVLVEAIIAE 355

Query: 411 INMDEGMLDGV 421
+ +G+ G+
Sbjct: 356 VQDADGLNLGI 366


46  Sputcn32_3029  Sputcn32_3034  Y  N  N  Genomic Island
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_3029  2       15       -2.681818  ABC transporter-like protein
Sputcn32_3030  4       20       -4.167603  hypothetical protein
Sputcn32_3031  3       17       -3.686391  hypothetical protein
Sputcn32_3032  2       16       -3.410700  hypothetical protein
Sputcn32_3033  1       15       -3.550332  hypothetical protein
Sputcn32_3034  2       14       -2.750361  dihydropteridine reductase
47  Sputcn32_3201  Sputcn32_3212  Y  N  N  Genomic Island
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_3201  3       28       -7.251770  hypothetical protein
Sputcn32_3202  4       33       -8.017706  hypothetical protein
Sputcn32_3203  7       34       -9.185339  hypothetical protein
Sputcn32_3204  -1      20       -3.053710  hypothetical protein
Sputcn32_3205  1       22       -1.849623  hypothetical protein
Sputcn32_3206  0       22       -0.799000  ATPase
Sputcn32_3207  0       22       2.701467   hypothetical protein
Sputcn32_3208  1       25       4.919338   hypothetical protein
Sputcn32_3209  1       26       5.321374   glycine dehydrogenase
Sputcn32_3210  0       19       4.832140   glycine cleavage system protein H
Sputcn32_3211  0       18       4.456745   glycine cleavage system aminomethyltransferase
Sputcn32_3212  1       17       3.737193   UbiH/UbiF/VisC/COQ6 family ubiquinone
48  Sputcn32_3271  Sputcn32_3276  Y  N  N  Genomic Island
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_3271  2       24       -2.429820  hypothetical protein
Sputcn32_3272  3       28       -1.087748  ClpXP protease specificity-enhancing factor
Sputcn32_3273  4       26       -0.849977  stringent starvation protein A
Sputcn32_3274  4       25       -0.531635  cytochrome c1
Sputcn32_3275  4       26       -0.352488  cytochrome b/b6 domain-containing protein
Sputcn32_3276  4       21       -0.121319  ubiquinol-cytochrome c reductase, iron-sulfur
49  Sputcn32_3318  Sputcn32_3327  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_3318  1       18       -3.292087  hypothetical protein
Sputcn32_3319  1       17       -3.118381  diguanylate cyclase/phosphodiesterase
Sputcn32_3320  0       15       -3.286977  hypothetical protein
Sputcn32_3321  1       18       -3.010840  hypothetical protein
Sputcn32_3322  1       17       -2.392773  hypothetical protein
Sputcn32_3323  1       14       -2.496666  hypothetical protein
Sputcn32_3324  2       14       -2.562105  response regulator receiver protein
Sputcn32_3325  2       17       -3.104521  histone family protein DNA-binding protein
Sputcn32_3326  3       18       -3.178310  hypothetical protein
Sputcn32_3327  2       16       -2.589755  alpha-L-glutamate ligase
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3324  HTHFIS        63     2e-13    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 62.5 bits (152), Expect = 2e-13
Identities = 25/108 (23%), Positives = 44/108 (40%), Gaps = 3/108 (2%)

Query: 146 RVLVVDDSRMARNVIKRTIGNLGIKLITEAEDGAQAIELMRNNMFDLVITDYNMPSIDGL 205
+LV DD R V+ + + G + + A + DLV+TD MP +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 206 ALTQFIRNESQQSHIPILMVSSEANDTHLSNVSQAGVNALCDKPFEPQ 253
L I+ + +P+L++S++ S+ G KPF+
Sbjct: 64 DLLPRIKKA--RPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLT 109



Score = 47.9 bits (114), Expect = 2e-08
Identities = 32/155 (20%), Positives = 57/155 (36%), Gaps = 6/155 (3%)

Query: 10 SILLVEPSDTQRRIIIQHLQQEGIVSIQTAANIEEAKAVVGRHKPDLIASAMHFEDGTAI 69
+IL+ + R ++ Q L + G ++ +N + DL+ + + D A
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGY-DVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 70 DLLSYLRVNSDYKDIQFMLVSSECRREQLEIFRQSGVVAILPKPFHAEHLGKALNATIDL 129
DLL R+ D+ +++S++ + G LPKPF L + +
Sbjct: 64 DLLP--RIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121

Query: 130 LSHDELDLSHFDVHDVRVLVVDDSRM--ARNVIKR 162
L D D LV + M V+ R
Sbjct: 122 PKRRPSKLED-DSQDGMPLVGRSAAMQEIYRVLAR 155


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3325  DNABINDINGHU  109    2e-35    Prokaryotic integration host factor signature.
		>DNABINDINGHU#Prokaryotic integration host factor signature.

Length = 91

Score = 109 bits (275), Expect = 2e-35
Identities = 45/88 (51%), Positives = 66/88 (75%)

Query: 2 NKTELIAKIAENADLTKVEAARALKSFEAAITESMKNGDKISIVGFGSFETATRAARTGR 61
NK +LIAK+AE +LTK ++A A+ + +A++ + G+K+ ++GFG+FE RAAR GR
Sbjct: 3 NKQDLIAKVAEATELTKKDSAAAVDAVFSAVSSYLAKGEKVQLIGFGNFEVRERAARKGR 62

Query: 62 NPQTGKEIQIAEATVPKFKAGKTLRDSV 89
NPQTG+EI+I + VP FKAGK L+D+V
Sbjct: 63 NPQTGEEIKIKASKVPAFKAGKALKDAV 90


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3326  OUTRMMBRANEA  28     0.027    Outer membrane protein A signature.
		>OUTRMMBRANEA#Outer membrane protein A signature.

Length = 346

Score = 28.4 bits (63), Expect = 0.027
Identities = 16/76 (21%), Positives = 29/76 (38%), Gaps = 9/76 (11%)

Query: 47 VAIQGGIDYSHDSGFYAGTWASNVDFGDETSYELDLYVGYAGNITEDISYDIGYLYYGYP 106
+ G HD+GF +N E + GY + + +++GY + G
Sbjct: 30 TGAKLGWSQYHDTGFI-----NNNGPTHENQLGAGAFGGY--QVNPYVGFEMGYDWLGRM 82

Query: 107 DAPGSIDFG--ELHGA 120
GS++ G + G
Sbjct: 83 PYKGSVENGAYKAQGV 98


50  Sputcn32_3336  Sputcn32_3344  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_3336  1       29       5.398574   redox-active disulfide protein 2
Sputcn32_3337  2       30       6.429008   antibiotic biosynthesis monooxygenase
Sputcn32_3338  1       28       6.318341   N-acetyltransferase GCN5
Sputcn32_3339  1       28       6.404951   large-conductance mechanosensitive channel
Sputcn32_3340  1       28       6.627747   antibiotic biosynthesis monooxygenase
Sputcn32_3341  1       28       6.687689   CzcA family heavy metal efflux protein
Sputcn32_3342  3       22       5.176006   RND family efflux transporter MFP subunit
Sputcn32_3343  3       22       4.000353   outer membrane efflux protein
Sputcn32_3344  0       22       3.293982   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3338  SACTRNSFRASE  38     5e-06    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 37.6 bits (87), Expect = 5e-06
Identities = 21/72 (29%), Positives = 32/72 (44%), Gaps = 5/72 (6%)

Query: 75 ASIGRVVVSPAGRGKGLAMPLMQQAIESALTTWPDAGIQIGAQDY-LKA--FYQKLGFVA 131
A I + V+ R KG+ L+ +AIE A G+ + QD + A FY K F+
Sbjct: 90 ALIEDIAVAKDYRKKGVGTALLHKAIEWAKEN-HFCGLMLETQDINISACHFYAKHHFII 148

Query: 132 CS-EMYLEDGIP 142
+ + L P
Sbjct: 149 GAVDTMLYSNFP 160


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3339  MECHCHANNEL   176    2e-60    Bacterial mechano-sensitive ion channel signature.
		>MECHCHANNEL#Bacterial mechano-sensitive ion channel signature.

Length = 136

Score = 176 bits (448), Expect = 2e-60
Identities = 89/136 (65%), Positives = 111/136 (81%), Gaps = 1/136 (0%)

Query: 1 MSLIKEFKAFASRGNVIDMAVGIIIGAAFGKIVSSFVADVIMPPIGIILGGVNFSDLSIV 60
MS+IKEF+ FA RGNV+D+AVG+IIGAAFGKIVSS VAD+IMPP+G+++GG++F ++
Sbjct: 1 MSIIKEFREFAMRGNVVDLAVGVIIGAAFGKIVSSLVADIIMPPLGLLIGGIDFKQFAVT 60

Query: 61 LQAAQGDAPSVVIAYGKFIQTVIDFTIIAFAIFMGLKAINTLKRKEEEAPKAPPAPTKEE 120
L+ AQGD P+VV+ YG FIQ V DF I+AFAIFM +K IN L RK+EE P A PAPTKEE
Sbjct: 61 LRDAQGDIPAVVMHYGVFIQNVFDFLIVAFAIFMAIKLINKLNRKKEE-PAAAPAPTKEE 119

Query: 121 ELLSEIRDLLKAQQEK 136
LL+EIRDLLK Q +
Sbjct: 120 VLLTEIRDLLKEQNNR 135


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3341  ACRIFLAVINRP  661    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 661 bits (1708), Expect = 0.0
Identities = 224/1072 (20%), Positives = 431/1072 (40%), Gaps = 67/1072 (6%)

Query: 9 AIKNRLLVVLALLAVIVGCVAMLSKLNLDAFPDVTNVQVTINTAAEGLAAEEVEKLISYP 68
I+ + + + +++ + +L + +P + V+++ G A+ V+ ++
Sbjct: 5 FIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTVTQV 64

Query: 69 VESAMYALPAVTEVRSLS-RTGLSIVTVVFAEGTDIYFARQQVFEQLQAAREMIPSGVGV 127
+E M + + + S S G +T+ F GTD A+ QV +LQ A ++P V
Sbjct: 65 IEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQEVQQ 124

Query: 128 PEIGPNTSGLGQIYQYILRAEPNSGINAAELRSLNDYLVKLILMPVGGVTEVLSFGGDVR 187
I S + ++ N G ++ VK L + GV +V FG
Sbjct: 125 QGISVEKSSSSYLMVAGFVSD-NPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGAQ-Y 182

Query: 188 QYQVQVDPNKLRAYGLSMAQVTEALESNNRNAGGWFMDQGQEQLVVRGYGMLPAGDAGLA 247
++ +D + L Y L+ V L+ N + G L +
Sbjct: 183 AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLG-GTPALPGQQLNASIIAQTRFK 241

Query: 248 AIAQ----IPLTEASGTPVRIGDIAQVDFGSEIRVGAVTMTRRDESGQVQNLGEVVAGVV 303
+ + G+ VR+ D+A+V+ G E + N +
Sbjct: 242 NPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARI----------NGKPAAGLGI 291

Query: 304 LKRMGANTKATIDDIGARVSLIEQALPDGVSFEVFYDQADLVDKAVTTVRDALLMAFVFI 363
GAN T I A+++ ++ P G+ YD V ++ V L A + +
Sbjct: 292 KLATGANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLV 351

Query: 364 VVILALFLVNIRATLLVLLSIPVSIGLALMVMSYYGLSANLMSLGGLAVAIGMLVDGSVV 423
+++ LFL N+RATL+ +++PV + +++ +G S N +++ G+ +AIG+LVD ++V
Sbjct: 352 FLVMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIV 411

Query: 424 MVENIFKHLTQPDRRHLQEARTRADGEIDPYHSDEDGGLQANMAVRIMLAAKEVCSPIFF 483
+VEN+ + + + L A + ++ +
Sbjct: 412 VVENVERVM-------------------------MEDKLPPKEATE--KSMSQIQGALVG 444

Query: 484 ATAIIIVVFAPLFALEGVEGKLFQPMAVSIILAMISALLVALIAVPALAVYLFK------ 537
++ VF P+ G G +++ +++I+ AM ++LVALI PAL L K
Sbjct: 445 IAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALSVLVALILTPALCATLLKPVSAEH 504

Query: 538 ----RGVMLKQSVILAPLDAAYRKLLTATLARPKVVMLSALLMFALSLLLLPRLGTEFVP 593
G + Y + L +L L+ A ++L RL + F+P
Sbjct: 505 HENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRYLLIYALIVAGMVVLFLRLPSSFLP 564

Query: 594 ELEEGTINLRVTLAPTASLGTSLQVAPKLEAILLEFPEVEYALSRIGAPELGGDPEPVSN 653
E ++G + L A+ + +V ++ L+ + S + +
Sbjct: 565 EEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKNEKANVE-SVFTVNGFSFSGQAQNA 623

Query: 654 IEVYIGLKPIAEWQSASSRLE--LQRLMEEKLSVFPGLLLTFSQPIATRVDELLSGVKAQ 711
++ LKP E + E + R E + G ++ F+ P + EL +
Sbjct: 624 GMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDGFVIPFNMP---AIVELGTATGFD 680

Query: 712 LA-IKIFGPDLAVLSQKGQALSDLVAKIPGAV-DVSLEQVSGEAQLVVRPKRELLARYGI 769
I G L+Q L + A+ P ++ V + AQ + +E G+
Sbjct: 681 FELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNGLEDTAQFKLEVDQEKAQALGV 740

Query: 770 SVDQVMSLVSQGIGGVSAGQVIDGNARYDINVRLAAEFRQSPDAIKDLLLSGTNGATVRL 829
S+ + +S +GG ID + V+ A+FR P+ + L + NG V
Sbjct: 741 SLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFRMLPEDVDKLYVRSANGEMVPF 800

Query: 830 GEVASVEVEMAPPNIRRDDVQRRVVVQANVA-GRDMGSVVKDIYALVPKADLPAGYTVII 888
+ P + R + + +Q A G G + + L K LPAG
Sbjct: 801 SAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDAMALMENLASK--LPAGIGYDW 858

Query: 889 GGQYENQQRAQQKLMLVVPVSIALIALLLYFSFGSFKQVLLIMANVPLALIGGIVALYVS 948
G ++ + + +V +S ++ L L + S+ + +M VPL ++G ++A +
Sbjct: 859 TGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLF 918

Query: 949 GTYLSVPSSIGFITLFGVAVLNGVVLVDSINQ-RRQSGEALYDCVYEGTVGRLRPVLMTA 1007
V +G +T G++ N +++V+ + G+ + + RLRP+LMT+
Sbjct: 919 NQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTS 978

Query: 1008 LTSALGLIPILLSSGVGSEIQKPLAVVIIGGLFSSTALTLLVLPTLYCWLYR 1059
L LG++P+ +S+G GS Q + + ++GG+ S+T L + +P + + R
Sbjct: 979 LAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIFFVPVFFVVIRR 1030



Score = 111 bits (280), Expect = 5e-27
Identities = 83/551 (15%), Positives = 189/551 (34%), Gaps = 63/551 (11%)

Query: 4 KLIEAAIKNRLLVVLALLAVIVGCVAMLSKLNLDAFPDVTNVQVTIN-TAAEGLAAEEVE 62
+ + + +L ++ G V + +L P+ G E +
Sbjct: 528 NSVGKILGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQ 587

Query: 63 KLI----SYPVESAMYALPAVTEVRSLSRTGLS----IVTVVFAEGTDIYFARQQVFEQL 114
K++ Y +++ + +V V S +G + + V + +
Sbjct: 588 KVLDQVTDYYLKNEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVI 647

Query: 115 QAAREM---IPSGVGVPEIGPNTSGLGQIYQYILRAEPNSGINAAELRSLNDYLVKLILM 171
A+ I G +P P LG + +G+ L + L+ +
Sbjct: 648 HRAKMELGKIRDGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQ 707

Query: 172 PVGGVTEV-LSFGGDVRQYQVQVDPNKLRAYGLSMAQVTEALES--NNRNAGGWFMDQGQ 228
+ V + D Q++++VD K +A G+S++ + + + + +
Sbjct: 708 HPASLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRV 767

Query: 229 EQLVVRGYGMLPAGDA-GLAAIAQIPLTEASGTPVRIGDIAQVDFGSEIRVGAVTMTRRD 287
++L V+ A + ++ + A+G V + G+ + R +
Sbjct: 768 KKLYVQ----ADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVY----GSPRLERYN 819

Query: 288 ESGQVQNLGEVVAGVVLKRMGANTKATIDDIGARVSLIEQALPDGVSFEVFYDQADLVDK 347
++ GE G D A + + LP G+ ++ + +
Sbjct: 820 GLPSMEIQGEAAPGTSS-----------GDAMALMENLASKLPAGIGYD-WTGMSYQERL 867

Query: 348 AVTTVRDALLMAFVFIVVILALFLVNIRATLLVLLSIPVSIGLALMVMSYYGLSANLMSL 407
+ + ++FV + + LA + + V+L +P+ I L+ + + ++ +
Sbjct: 868 SGNQAPALVAISFVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFM 927

Query: 408 GGLAVAIGMLVDGSVVMVENIFKHLTQPDRRHLQEARTRADGEIDPYHSDEDGGLQANMA 467
GL IG+ ++++VE K L + + + + EA
Sbjct: 928 VGLLTTIGLSAKNAILIVEFA-KDLMEKEGKGVVEA------------------------ 962

Query: 468 VRIMLAAKEVCSPIFFATAIIIVVFAPLFALEGVEGKLFQPMAVSIILAMISALLVALIA 527
++A + PI + I+ PL G + + ++ M+SA L+A+
Sbjct: 963 --TLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIFF 1020

Query: 528 VPALAVYLFKR 538
VP V + +
Sbjct: 1021 VPVFFVVIRRC 1031



Score = 102 bits (257), Expect = 2e-24
Identities = 90/515 (17%), Positives = 193/515 (37%), Gaps = 36/515 (6%)

Query: 565 RPKVVMLSALLMFALSLLLLPRLGTEFVPELEEGTINLRVTLAPTASLGT-SLQVAPKLE 623
RP + A+++ L + +L P + +++ P A T V +E
Sbjct: 8 RPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSAN-YPGADAQTVQDTVTQVIE 66

Query: 624 AILLEFPEVEYALSRIGAPELGGDPEPVSNIEVYIGLKPIAEWQSASSRLELQRLMEEKL 683
+ + Y S + ++ + + + + A Q ++ KL
Sbjct: 67 QNMNGIDNLMYMSST---------SDSAGSVTITLTFQSGTDPDIA------QVQVQNKL 111

Query: 684 SVFPGLLLTFSQPIATRVDELLSGVKAQLAIKIFGPDLAVLSQKGQA---LSDLVAKIPG 740
+ LL Q V++ S P + D ++++ G
Sbjct: 112 QLATPLLPQEVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNG 171

Query: 741 AVDVSLEQVSGEAQLVVRPKRELLARYGISVDQVMSLVSQGIGGVSAGQVIDGNARYD-- 798
DV L + + + +LL +Y ++ V++ + ++AGQ+ A
Sbjct: 172 VGDVQL--FGAQYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQ 229

Query: 799 INVRLAAEFR-QSPDAIKDLLL-SGTNGATVRLGEVASVEVEMAPPNIR-RDDVQRRVVV 855
+N + A+ R ++P+ + L ++G+ VRL +VA VE+ N+ R + + +
Sbjct: 230 LNASIIAQTRFKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGL 289

Query: 856 QANVA-GRDMGSVVKDIYALVP--KADLPAGYTVIIGGQYENQQRAQQKLMLVVP---VS 909
+A G + K I A + + P G V+ Y+ Q + VV +
Sbjct: 290 GIKLATGANALDTAKAIKAKLAELQPFFPQGMKVLY--PYDTTPFVQLSIHEVVKTLFEA 347

Query: 910 IALIALLLYFSFGSFKQVLLIMANVPLALIGGIVALYVSGTYLSVPSSIGFITLFGVAVL 969
I L+ L++Y + + L+ VP+ L+G L G ++ + G + G+ V
Sbjct: 348 IMLVFLVMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVD 407

Query: 970 NGVVLVDSINQRRQS-GEALYDCVYEGTVGRLRPVLMTALTSALGLIPILLSSGVGSEIQ 1028
+ +V+V+++ + + + ++ A+ + IP+ G I
Sbjct: 408 DAIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIY 467

Query: 1029 KPLAVVIIGGLFSSTALTLLVLPTLYCWLYRGDKR 1063
+ ++ I+ + S + L++ P L L +
Sbjct: 468 RQFSITIVSAMALSVLVALILTPALCATLLKPVSA 502


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3342  RTXTOXIND     53     9e-10    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 53.3 bits (128), Expect = 9e-10
Identities = 36/182 (19%), Positives = 65/182 (35%), Gaps = 22/182 (12%)

Query: 109 RATATLVVDRDRTATLAPQLDVRVLARHVVPGQEVKKGEPLLTLGGAAVAQAQADYINAA 168
R V++ R + L + +A+H V QE + +N
Sbjct: 225 RYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQE----------------NKYVEAVNEL 268

Query: 169 AEWSRVKRMSEGAVSVSRRMQAQVDAELKRAILEAIKMTPAQIRALE----STPEAIGSY 224
+ E + ++ V K IL+ ++ T I L E +
Sbjct: 269 RVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQAS 328

Query: 225 QLLAPIDGRVQQ-DIAMLGQVFSAGTPLMQLT-DESYLWVEAQLTPTQTMHIQVGSSALV 282
+ AP+ +VQQ + G V + LM + ++ L V A + I VG +A++
Sbjct: 329 VIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAII 388

Query: 283 QV 284
+V
Sbjct: 389 KV 390



Score = 40.2 bits (94), Expect = 1e-05
Identities = 26/148 (17%), Positives = 53/148 (35%), Gaps = 5/148 (3%)

Query: 101 LANLNLDIRATATLVVDRDRTATLAPQLDVRVLARHVVPGQEVKKGEPLLTLGG----AA 156
L + + A L R+ + P + V V G+ V+KG+ LL L A
Sbjct: 77 LGQVEIVATANGKLTHS-GRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEAD 135

Query: 157 VAQAQADYINAAAEWSRVKRMSEGAVSVSRRMQAQVDAELKRAILEAIKMTPAQIRALES 216
+ Q+ + A E +R + +S D + + E + + +
Sbjct: 136 TLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQF 195

Query: 217 TPEAIGSYQLLAPIDGRVQQDIAMLGQV 244
+ YQ +D + + + +L ++
Sbjct: 196 STWQNQKYQKELNLDKKRAERLTVLARI 223


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3343  RTXTOXIND     30     0.029    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 29.8 bits (67), Expect = 0.029
Identities = 21/174 (12%), Positives = 56/174 (32%), Gaps = 15/174 (8%)

Query: 82 HVNFGQWLPEL-LTQFN-QLPEVQAQLVRQQQAKLAIQAANRAVYNPELGLNYQNADTDA 139
V G L +L + Q+ L++ + + Q +R++ L +
Sbjct: 117 SVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSI--ELNKLPELKLPDEP 174

Query: 140 YSLGLSQTLDWGDKRGVATKRAELEAQILFADIGLERSQMLAERLLALAEQAQSRKALTF 199
Y +S+ + + + + Q ++ L++ + AE+ +
Sbjct: 175 YFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKR---------AERLTVLARINR 225

Query: 200 AEQQLRFTKAQLNIAEQRLAAGDLSNVELQLIQLEVASNTADYALAEQVALVAD 253
E R K++L+ L ++ + + + + L + +
Sbjct: 226 YENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNE--LRVYKSQLEQ 277


51  Sputcn32_3359  Sputcn32_3368  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_3359  2       12       -0.186664  periplasmic copper-binding protein
Sputcn32_3360  2       11       -0.250543  NosL family protein
Sputcn32_3361  2       13       -0.426151  formate-dependent nitrite reductase complex
Sputcn32_3362  2       12       -0.825600  peptidylprolyl isomerase, FKBP-type
Sputcn32_3363  1       13       -0.750349  rhodanese domain-containing protein
Sputcn32_3364  1       13       -0.753189  cytochrome c
Sputcn32_3365  2       14       0.296618   cytochrome c-type biogenesis protein CcmF
Sputcn32_3366  2       14       1.480143   cytochrome C biogenesis protein
Sputcn32_3367  2       13       1.406123   redoxin domain-containing protein
Sputcn32_3368  2       16       2.863817   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3362  INFPOTNTIATR  162    1e-51    Macrophage infectivity potentiator signature.
		>INFPOTNTIATR#Macrophage infectivity potentiator signature.

Length = 233

Score = 162 bits (411), Expect = 1e-51
Identities = 92/227 (40%), Positives = 131/227 (57%), Gaps = 9/227 (3%)

Query: 21 ALFVSVASFAGTGLKTDADKTSYSIGASIGNYISGQVYNQVELGSEVNVDLVVQGFVDAL 80
A+ ++A+ T L TD DK SYSIGA +G Q G ++N D++ +G D +
Sbjct: 14 AMSTAMAATDATSLTTDKDKLSYSIGADLGKNFKNQ-------GIDINPDVLAKGMQDGM 66

Query: 81 KDKQ-QLTDEEVLTYLNQRAEQLNAARAVVAEKQMAETKKASAAYLAENQKKSGVKVTAS 139
Q LT+E++ L++ + L A R+ K+ E K A+L+ N+ K G+ V S
Sbjct: 67 SGAQLILTEEQMKDVLSKFQKDLMAKRSAEFNKKAEENKAKGDAFLSANKSKPGIVVLPS 126

Query: 140 GLQYEVLTQGKGHKPNPEDVVTVEYVGTLINGTEFENTVGRKEPTRFALMSVIPGWEEGL 199
GLQY+++ G G KP D VTVEY GTLI+GT F++T +P F + VIPGW E L
Sbjct: 127 GLQYKIIDAGTGAKPGKSDTVTVEYTGTLIDGTVFDSTEKAGKPATFQVSQVIPGWTEAL 186

Query: 200 KLMPVGSKYRFVVPASLAYGAEAV-GIIPPESALIFEIELKNIEKPS 245
+LMP GS + VPA LAYG +V G I P LIF+I L +++K +
Sbjct: 187 QLMPAGSTWEVFVPADLAYGPRSVGGPIGPNETLIFKIHLISVKKAA 233


52  Sputcn32_3388  Sputcn32_3399  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_3388  0       16       -3.083885  hypothetical protein
Sputcn32_3389  2       16       -2.360451  hypothetical protein
Sputcn32_3390  1       16       -1.568460  PAS/PAC sensor-containing diguanylate cyclase
Sputcn32_3391  2       16       -1.472668  peptidylprolyl isomerase, FKBP-type
Sputcn32_3392  3       16       -1.639695  thioredoxin
Sputcn32_3393  2       15       -1.466467  anion transporter
Sputcn32_3394  1       14       0.438417   major facilitator superfamily transporter
Sputcn32_3395  0       15       1.421589   PepSY-associated TM helix domain-containing
Sputcn32_3396  0       19       3.724591   hypothetical protein
Sputcn32_3397  0       20       5.013912   hypothetical protein
Sputcn32_3398  0       20       5.198634   hypothetical protein
Sputcn32_3399  0       19       5.627874   permease
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3391  INFPOTNTIATR  67     7e-17    Macrophage infectivity potentiator signature.
		>INFPOTNTIATR#Macrophage infectivity potentiator signature.

Length = 233

Score = 66.9 bits (163), Expect = 7e-17
Identities = 39/98 (39%), Positives = 53/98 (54%), Gaps = 6/98 (6%)

Query: 12 GEGKEAVKGALITTQYRGFLEDGTQFDASYDRGQ--AFQCVIGTGRVIKGWDQGIMGMKI 69
G G + K +T +Y G L DGT FD++ G+ FQ +VI GW + + M
Sbjct: 136 GTGAKPGKSDTVTVEYTGTLIDGTVFDSTEKAGKPATFQ----VSQVIPGWTEALQLMPA 191

Query: 70 GGKRKLLVPAHLAYGERQVGAHIKPHSNLIFEIELLEV 107
G ++ VPA LAYG R VG I P+ LIF+I L+ V
Sbjct: 192 GSTWEVFVPADLAYGPRSVGGPIGPNETLIFKIHLISV 229


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3394  TCRTETA       34     9e-04    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 34.0 bits (78), Expect = 9e-04
Identities = 62/355 (17%), Positives = 132/355 (37%), Gaps = 27/355 (7%)

Query: 13 NSLFVPVAGLSLFALASGYLMSLIPLSLTFFELNTSLAP---LLASIFYLGLLLGAPCIA 69
L V ++ ++L A+ G +M ++P L + + +L +++ L AP +
Sbjct: 5 RPLIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLG 64

Query: 70 PIVARIGHSKAFILFLNILLCSVVAMILIPQSGVWL--ASRLVAGFAVAGVFVVVESWLL 127
+ R G + +L +++ +V I+ +W+ R+VAG A V +++
Sbjct: 65 ALSDRFG--RRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGA-TGAVAGAYIA 121

Query: 128 MADTQKQRAKRLGLYMTALYG-GTAIGQLAIDYLGTAGNLPYLVIMGLLAAASLPALLVK 186
+RA+ G +M+A +G G G + +G L +
Sbjct: 122 DITDGDERARHFG-FMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFL 180

Query: 187 RGQPIASEQQSMSLSGLKNLSQPAIVGCLVSGLLLGPIYGLLPIYVAIDMAL------DR 240
+ E++ + L L+ + L ++ ++ + + AL DR
Sbjct: 181 LPESHKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGEDR 240

Query: 241 ------QTGLFMALIILGGMLVQPLVS-YLSPRVNKSGLIM-GFCLLGTAALFLLTQYSN 292
G+ +A + L Q +++ ++ R+ + +M G GT + L
Sbjct: 241 FHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRG 300

Query: 293 MSLIIGFLLLGASAFALYPIAISLACDDLPASQIVSATQVMLLSY-SIGSVIGPL 346
+LL + + A+ + Q L + S+ S++GPL
Sbjct: 301 WMAFPIMVLLASGGIGM--PALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPL 353


53  Sputcn32_3418  Sputcn32_3423  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_3418  3       18       -0.163181  transcriptional regulator PdhR
Sputcn32_3419  3       19       -0.243032  regulatory protein AmpE
Sputcn32_3420  3       18       0.054609   N-acetyl-anhydromuranmyl-L-alanine amidase
Sputcn32_3421  3       19       -0.031057  nicotinate-nucleotide pyrophosphorylase
Sputcn32_3422  2       17       -0.482349  fimbrial protein pilin
Sputcn32_3423  2       17       0.125405   type IV-A pilus assembly ATPase PilB
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3422  BCTERIALGSPG  49     2e-10    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 49.1 bits (117), Expect = 2e-10
Identities = 23/74 (31%), Positives = 44/74 (59%), Gaps = 5/74 (6%)

Query: 13 KAKGFTLIELMIVVAIIGILAAIALPAYQDYTIKSQINGGLSEVSALKNSFEVAVLDD-- 70
K +GFTL+E+M+V+ IIG+LA++ +P K+ +S++ AL+N+ ++ LD+
Sbjct: 6 KQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYKLDNHH 65

Query: 71 ---TDPSLTATAKG 81
T+ L + +
Sbjct: 66 YPTTNQGLESLVEA 79


54  Sputcn32_3506  Sputcn32_3523  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_3506  0       15       -3.589980  coproporphyrinogen III oxidase
Sputcn32_3507  1       15       -4.257031  bile acid:sodium symporter
Sputcn32_3508  1       15       -4.176606  **RNP-1 like RNA-binding protein
Sputcn32_3509  0       14       -3.938459  glutamate racemase
Sputcn32_3510  1       15       -3.791782  tRNA (uracil-5-)-methyltransferase
Sputcn32_3511  1       14       -2.816815  transposase
Sputcn32_3512  0       15       -0.289613  hypothetical protein
Sputcn32_3513  0       23       1.718104   transposase
Sputcn32_3514  -1      26       2.930887   plasmid stabilization system protein
Sputcn32_3515  -1      27       3.140645   CopG family transcriptional regulator
Sputcn32_3516  -1      28       3.461426   retinol acyltransferase domain-containing
Sputcn32_3517  -3      27       2.512665   phage integrase family protein
Sputcn32_3518  1       21       -0.460794  hypothetical protein
Sputcn32_3519  1       22       -0.786608  hypothetical protein
Sputcn32_3520  1       17       -1.537652  hypothetical protein
Sputcn32_3521  2       17       -1.953815  DNA repair protein RadC
Sputcn32_3522  3       16       -2.833674  phage transcriptional regulator AlpA
Sputcn32_3523  2       15       -2.663476  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits         Score  E-value  Comments
Sputcn32_3511  TYPE4SSCAGX  30     0.021    Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein signature.

Length = 522

Score = 30.1 bits (67), Expect = 0.021
Identities = 30/134 (22%), Positives = 54/134 (40%), Gaps = 8/134 (5%)

Query: 36 PKELAPQKKQASSKKQAQQSGAKPAGENVIDFEHEKKSREKGLKNDHGIKNKFSASDIID 95
PKEL QKK +K+A++ K + E K+ R K N + N S +
Sbjct: 138 PKELEEQKKALEKEKEAKEQAQKAQKDKR---EKRKEERAKNRANLENLTNAMSNPQNLS 194

Query: 96 GHEDLDQQTKDKIEESTPTAHEYFNANPTATTNTITQINKELLSNRTKQGELIEPEVLNE 155
+++L + K + E + A N + QI + N+ + E + +
Sbjct: 195 NNKNLSELIKQQRENELDQMERLEDMQEQAQANALKQIEE---LNKKQAEEAVRQRA--K 249

Query: 156 HKDTMKCDRASSLP 169
K ++K D++ P
Sbjct: 250 DKISIKTDKSQKSP 263


55  Sputcn32_3545  Sputcn32_3554  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_3545  2       12       3.551033   phospholipid/glycerol acyltransferase
Sputcn32_3546  3       12       4.029406   TetR family transcriptional regulator
Sputcn32_3547  4       12       3.838232   N-acetyltransferase GCN5
Sputcn32_3548  3       11       3.140446   3'(2'),5'-bisphosphate nucleotidase
Sputcn32_3549  3       11       3.080540   ADP-ribose diphosphatase NudE
Sputcn32_3550  3       11       3.112042   fibronectin type III domain-containing protein
Sputcn32_3551  2       14       -1.136353  phage tail collar domain-containing protein
Sputcn32_3552  2       15       -1.459834  N-acetyltransferase GCN5
Sputcn32_3553  2       15       -1.037543  hypothetical protein
Sputcn32_3554  2       14       -0.009199  phytanoyl-CoA dioxygenase
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
Sputcn32_3546  HTHTETR  45     4e-08    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 45.0 bits (106), Expect = 4e-08
Identities = 22/102 (21%), Positives = 36/102 (35%), Gaps = 3/102 (2%)

Query: 1 MARRKEHSHDEIRAMAIQAATELLIDQGVAGLSLRKVASQIGYVPSTLINIFGSYNYLLL 60
MAR+ + E R + A L QGV+ SL ++A G + F + L
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60

Query: 61 AV---SEATLRALHLRLAEVCAADSLGKIIAMAWEYSQFAHE 99
+ SE+ + L L D L + + +
Sbjct: 61 EIWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVT 102


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score  E-value  Comments
Sputcn32_3550  OMPADOMAIN  40     1e-04    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 39.9 bits (93), Expect = 1e-04
Identities = 36/184 (19%), Positives = 54/184 (29%), Gaps = 24/184 (13%)

Query: 2276 LLSVVLLLVGMVNPVNAA-NEGYWYAEGFIGQSQVDNGKRVLQPQAGAGAVTSVDTTDTA 2334
+++ + L G AA + WY +G SQ +
Sbjct: 5 AIAIAVALAGFATVAQAAPKDNTWYTGAKLGWSQYHDTGF-------INNNGPTHENQLG 57

Query: 2335 IGVSVGYQWTPLVAVELGYADFGNGSARIEGASLTPEQYHEQVKGVTPVLADGVTLGLRF 2394
G GYQ P V E+GY G R+ ++ A GV L +
Sbjct: 58 AGAFGGYQVNPYVGFEMGYDWLG----RMPYKGSVENGAYK---------AQGVQLTAKL 104

Query: 2395 TLLQHEAWRFEVPIGLFRWQADISSTMGNSRLTTALDGTDWYAGVRFSYQFTESWSVGLG 2454
+ +G W+AD S + T G Y T + L
Sbjct: 105 GYPITDDLDIYTRLGGMVWRADTKSNVYGKNHDT---GVSPVFAGGVEYAITPEIATRLE 161

Query: 2455 YQYV 2458
YQ+
Sbjct: 162 YQWT 165


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3552  SACTRNSFRASE  42     1e-07    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 42.2 bits (99), Expect = 1e-07
Identities = 17/81 (20%), Positives = 36/81 (44%), Gaps = 2/81 (2%)

Query: 71 YILFYHQQAVGKVMLDISEHRVHLI-DLIVIHSMRGQGFGSAILAFIKQEAAIRNLP-VG 128
++ + +G++ + + + LI D+ V R +G G+A+L + A + +
Sbjct: 68 FLYYLENNCIGRIKIRSNWNGYALIEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLM 127

Query: 129 LSVEKENARAKKLYLQHGFKL 149
L + N A Y +H F +
Sbjct: 128 LETQDINISACHFYAKHHFII 148


56  Sputcn32_3586  Sputcn32_3614  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_3586  6       28       4.795896   adenylate cyclase
Sputcn32_3587  6       29       4.847423   porphobilinogen deaminase
Sputcn32_3588  6       27       4.512914   uroporphyrinogen III synthase HEM4
Sputcn32_3589  5       26       4.181815   hypothetical protein
Sputcn32_3590  6       27       4.369509   HemY domain-containing protein
Sputcn32_3591  7       28       4.382678   outer membrane adhesin like protein
Sputcn32_3592  0       17       -3.095682  ABC transporter-like protein
Sputcn32_3593  1       16       -3.504486  HlyD family type I secretion membrane fusion
Sputcn32_3594  1       13       -3.142422  TolC family type I secretion outer membrane
Sputcn32_3595  0       13       -3.324193  OmpA/MotB domain-containing protein
Sputcn32_3596  0       14       -3.757915  hypothetical protein
Sputcn32_3597  1       14       -3.677377  diguanylate cyclase/phosphodiesterase
Sputcn32_3598  0       14       -2.640652  ****GAF sensor-containing diguanylate
Sputcn32_3599  0       13       -1.547515  ATP-dependent DNA helicase Rep
Sputcn32_3600  2       17       -1.843199  hypothetical protein
Sputcn32_3601  2       16       -1.663208  hypothetical protein
Sputcn32_3602  1       15       -0.666720  hypothetical protein
Sputcn32_3603  0       13       -1.194402  NnrS family protein
Sputcn32_3604  -1      11       -2.895692  nitrite reductase
Sputcn32_3605  -1      11       -0.838056  hypothetical protein
Sputcn32_3606  0       15       1.692492   PA-phosphatase-like phosphoesterase
Sputcn32_3607  1       20       4.027030   import inner membrane translocase subunit Tim44
Sputcn32_3608  1       22       4.191055   hypothetical protein
Sputcn32_3609  0       25       6.026320   hypothetical protein
Sputcn32_3610  0       26       6.782467   alanine--glyoxylate transaminase
Sputcn32_3611  0       28       6.652712   threonine dehydratase
Sputcn32_3612  0       24       5.201748   dihydroxy-acid dehydratase
Sputcn32_3613  -1      21       3.718742   acetolactate synthase 2 regulatory subunit
Sputcn32_3614  -2      20       3.396808   acetolactate synthase 2 catalytic subunit
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits       Score  E-value  Comments
Sputcn32_3589  RTXTOXIND  32     0.006    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 31.7 bits (72), Expect = 0.006
Identities = 15/78 (19%), Positives = 37/78 (47%), Gaps = 7/78 (8%)

Query: 86 YQQLQQQIQQQQLAQDEKNNALQSQLAQALLQPNQRIEQLEQQQLNDAKT-----YQELS 140
+ Q Q Q++L D+K + LA+ + + + ++E+ +L+D +
Sbjct: 195 FSTWQNQKYQKELNLDKKRAERLTVLAR--INRYENLSRVEKSRLDDFSSLLHKQAIAKH 252

Query: 141 KLVENQSQLQDRVNKLAE 158
++E +++ + VN+L
Sbjct: 253 AVLEQENKYVEAVNELRV 270



Score = 29.4 bits (66), Expect = 0.025
Identities = 6/41 (14%), Positives = 15/41 (36%)

Query: 88 QLQQQIQQQQLAQDEKNNALQSQLAQALLQPNQRIEQLEQQ 128
Q++ +I + ++++ L Q I L +
Sbjct: 277 QIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLE 317


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score  E-value  Comments
Sputcn32_3591  CABNDNGRPT  80     8e-17    NodO calcium binding signature.
		>CABNDNGRPT#NodO calcium binding signature.

Length = 479

Score = 80.0 bits (197), Expect = 8e-17
Identities = 51/218 (23%), Positives = 77/218 (35%), Gaps = 26/218 (11%)

Query: 3920 GGGQSGIITNSNGKEVVAS-----GANNKSYSTTDAQFVNGGDGNDHIETGKGNDVIYAG 3974
G +G + + +A+ GAN + + N D + +
Sbjct: 235 GADYNGHYGGAPMIDDIAAIQRLYGANMTTRTGDSVYGFNSNTDRDFYTATDSSKALIFS 294

Query: 3975 RTGSTGYGSDDALELSVNTLLNHHIMTGELTGANRMVDSNGLLLANDVASHKADIVNGGS 4034
+ G + D S N +N G L N + G
Sbjct: 295 VWDAGGTDTFDFSGYSNNQRIN---------LNEGSFSDVGGLKGNVS-------IAHGV 338

Query: 4035 GDDRIYGQSGSDILYGHTGNDYIDGGSHNDALRGGEGNDTLIGGLGDDVLRGDSGADTFV 4094
+ G SG+DIL G++ ++ + GG+ ND L GG G DTL GG G D SG D+
Sbjct: 339 TIENAIGGSGNDILVGNSADNILQGGAGNDVLYGGAGADTLYGGAGRDTFVYGSGQDST- 397

Query: 4095 WRYAEFGTDHIMDFKVTEDKLDLSDLLQGESANNLDSY 4132
D I DF+ DK+DLS + +
Sbjct: 398 ----VAAYDWIADFQKGIDKIDLSAFRNEGQLSFVQDQ 431



Score = 38.8 bits (90), Expect = 5e-04
Identities = 24/176 (13%), Positives = 43/176 (24%), Gaps = 52/176 (29%)

Query: 3909 NPNQKILNVSFGGGQSGIITNSNGKEVVASGANNKSYSTTDAQFVNGGDGNDHIETGKGN 3968
+ Q + + +V N + GG GND + +
Sbjct: 299 GGTDTFDFSGYSNNQRINLNEGSFSDVGGLKGNVSIAHGVTIENAIGGSGNDILVGNSAD 358

Query: 3969 DVIYAGRTGSTGYGSDDALELSVNTLLNHHIMTGELTGANRMVDSNGLLLANDVASHKAD 4028
+++ G G+D
Sbjct: 359 NILQGGA------GND-------------------------------------------- 368

Query: 4029 IVNGGSGDDRIYGQSGSDILYGHTGNDYIDGGSH--NDALRGGEGNDTLIGGLGDD 4082
++ GG+G D +YG +G D +G D D +G + D
Sbjct: 369 VLYGGAGADTLYGGAGRDTFVYGSGQDSTVAAYDWIADFQKGIDKIDLSAFRNEGQ 424


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits       Score  E-value  Comments
Sputcn32_3593  RTXTOXIND  318    e-106    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 318 bits (816), Expect = e-106
Identities = 96/434 (22%), Positives = 196/434 (45%), Gaps = 15/434 (3%)

Query: 29 RLIIWAMAAMIVCFLLWAAFAKLDKVTTGSGKVIPSSQVQVIQSLDGGIMQELYVREGEL 88
RL+ + + +V + + +++ V T +GK+ S + + I+ ++ I++E+ V+EGE
Sbjct: 58 RLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGES 117

Query: 89 VTKGQPLVRIDDTRFRSDFAQQEQEVFGLKTNAIRMRAELDSILISDMTSDWREQVLITK 148
V KG L+++ +D + + + + R + SI L
Sbjct: 118 VRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSI------------ELNKL 165

Query: 149 KALVFPDSIV--AAEPALVHRQQEEYNGRLDNLSNQLEILVRQIQQRQQEIDELASKTTT 206
L PD V R + NQ + +++ E + ++
Sbjct: 166 PELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINR 225

Query: 207 LTTSMQLISRELELTRPLAKKGIVPEVELLKLERAVNDAQGELNSLRLLRPKLKSALDEA 266
++ L+ L K + + +L+ E +A EL + +++S + A
Sbjct: 226 YENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSA 285

Query: 267 ILKRREAVFVYAADLRAQLNETQTRLSRMNEAQVGAQDKVSKAIITSPVNGTIKTTHINT 326
+ + ++ ++ +L +T + + +++ ++I +PV+ ++ ++T
Sbjct: 286 KEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHT 345

Query: 327 LGGVVQPGVNIIEIVPSEDQLLIETKVLPKDIAFLHPGLPAIVKVTAYDFTRYGGLKGTV 386
GGVV ++ IVP +D L + V KDI F++ G AI+KV A+ +TRYG L G V
Sbjct: 346 EGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKV 405

Query: 387 EHISADTSQDEEGNSFYLIRVRTEESSLVKDDGTQMPIIPGMLTTVDVITGQRSILEYIL 446
++I+ D +D+ + + + EE+ L +P+ GM T ++ TG RS++ Y+L
Sbjct: 406 KNINLDAIEDQRLGLVFNVIISIEENCLS-TGNKNIPLSSGMAVTAEIKTGMRSVISYLL 464

Query: 447 NPILRAKDTALRER 460
+P+ + +LRER
Sbjct: 465 SPLEESVTESLRER 478


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score  E-value  Comments
Sputcn32_3595  OMPADOMAIN  91     6e-24    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 90.8 bits (225), Expect = 6e-24
Identities = 32/118 (27%), Positives = 53/118 (44%), Gaps = 12/118 (10%)

Query: 77 NILFPNDSAFIAPEYYSQIEDIAAFLRQY--PTTKVTIEGHTSRTGTDERNAVLSQERAD 134
++LF + A + PE + ++ + + L V + G+T R G+D N LS+ RA
Sbjct: 220 DVLFNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAYNQGLSERRAQ 279

Query: 135 AVTAALADRFNIDRSRLTAIGYGSSRPIVLEQTPEAEMR---------NRRVVAEVTG 183
+V L + I +++A G G S P+ + R +RRV EV G
Sbjct: 280 SVVDYLISK-GIPADKISARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEVKG 336


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
Sputcn32_3600  PF06580  25     0.041    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 25.2 bits (55), Expect = 0.041
Identities = 8/29 (27%), Positives = 14/29 (48%)

Query: 47 MVSREEFDVQQHVLLKTREKLEALQAQVN 75
+ E D + + +L AL+AQ+N
Sbjct: 143 NYKQAEIDQWKMASMAQEAQLMALKAQIN 171


57  Sputcn32_3677  Sputcn32_3688  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_3677  2       13       0.982782   hypothetical protein
Sputcn32_3678  2       13       0.892837   hypothetical protein
Sputcn32_3679  2       13       0.967791   ornithine decarboxylase
Sputcn32_3680  2       11       1.550303   putrescine transporter
Sputcn32_3681  1       12       1.379081   porin
Sputcn32_3682  1       12       1.171153   hypothetical protein
Sputcn32_3683  -1      13       -0.876960  hypothetical protein
Sputcn32_3684  -1      13       -0.580995  hypothetical protein
Sputcn32_3685  0       17       -0.602219  hypothetical protein
Sputcn32_3686  3       18       0.647569   ThiJ/PfpI domain-containing protein
Sputcn32_3687  2       14       1.543518   TetR family transcriptional regulator
Sputcn32_3688  2       13       2.067555   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score  E-value  Comments
Sputcn32_3681  ECOLIPORIN  77     1e-17    E.coli/Salmonella-type porin signature.
		>ECOLIPORIN#E.coli/Salmonella-type porin signature.

Length = 383

Score = 76.9 bits (189), Expect = 1e-17
Identities = 99/410 (24%), Positives = 160/410 (39%), Gaps = 53/410 (12%)

Query: 1 MNKTCIALVLPVLLTATSSQAIELYKDSKNSLDLSGWLGFAALNDSHDTSVIDDLSRVRF 60
M + +ALV+P LL A ++ A E+Y N LDL G + S D+S D + +R
Sbjct: 1 MKRKVLALVIPALLAAGAAHAAEIYNKDGNKLDLYGKVD-GLHYFSDDSSKDGDQTYMRV 59

Query: 61 SF--ERNEKHGWTAFATTEWGINMVSSDDSLVMQGGKLAAEKNEDFLYNRLGYVGMSHDE 118
F E T + E+ + Q E + RL + G+ +
Sbjct: 60 GFKGETQINDQLTGYGQWEYNV-----------QANTTEGEGANSW--TRLAFAGLKFGD 106

Query: 119 WGSLSFGKQWGVYYDIAGTTDLPNVFAGYSVGAYAFSDGGLTGTGRADSAFIYRNT--LG 176
+GS +G+ +GV YD+ G TD+ F G S Y ++D + TGRA+ YRNT G
Sbjct: 107 YGSFDYGRNYGVLYDVEGWTDMLPEFGGDS---YTYADNYM--TGRANGVATYRNTDFFG 161

Query: 177 PV---HIALQYAAKTNGDIVLKNADGSEMADSELNFDSS----YGASLTYSVTDKFKLLA 229
V + ALQY K G+ ++ + +G S TY + F A
Sbjct: 162 LVDGLNFALQYQGKNESQSADDVNIGTNNRNNGDDIRYDNGDGFGISTTYDIGMGFSAGA 221

Query: 230 GFNRGDFDGHLGNTQVDETNKIVGIGAMYGDYYHYAENREASGLYVGFNAHQSKNNELVD 289
+ D N QV+ I G D + +A+ +Y+ +++N
Sbjct: 222 AYTTSD----RTNEQVNAGGTIA--GGDKADAWTAGLKYDANNIYLATMYSETRNMTPYG 275

Query: 290 GELYDATG--------VELMTAYQFDNGFVPMLVLSYLDLDTDATTPIQGKWTRQ----F 337
G E+ YQFD G P +S+L T + +
Sbjct: 276 KTDKGYDGGVANKTQNFEVTAQYQFDFGLRPA--VSFLMSKGKDLTYNNVNGDDKDLVKY 333

Query: 338 AMLGLHYRYSNDTVMFAEMKLDFSKMDDAALEAL---EDNGFAVGINYFF 384
A +G Y ++ + + + K++ DD + D+ A+G+ Y F
Sbjct: 334 ADVGATYYFNKNFSTYVDYKINLLDDDDPFYKDAGISTDDIVALGMVYQF 383


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
Sputcn32_3687  HTHTETR  47     9e-09    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 46.9 bits (111), Expect = 9e-09
Identities = 18/79 (22%), Positives = 30/79 (37%), Gaps = 4/79 (5%)

Query: 18 KVLHAFKEMLKQQDYRDIAVADIAYKADVGRTTFYRYFKRKLDILIALHQNIFEDIFADL 77
+L + QQ ++ +IA A V R Y +FK K D+ I+E +++
Sbjct: 15 HILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSE----IWELSESNI 70

Query: 78 QSAEDWLSTNTPPSLEKLL 96
E P +L
Sbjct: 71 GELELEYQAKFPGDPLSVL 89


58  Sputcn32_3702  Sputcn32_3714  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_3702  -2      18       -3.279765  hypothetical protein
Sputcn32_3703  -1      18       -3.294302  DNA adenine methylase
Sputcn32_3704  -1      19       -3.739275  sporulation domain-containing protein
Sputcn32_3705  0       20       -4.698524  3-dehydroquinate synthase
Sputcn32_3706  0       18       -5.070340  shikimate kinase I
Sputcn32_3707  -1      16       -3.908478  type IV pilus secretin PilQ
Sputcn32_3708  -2      13       -1.697343  pilus assembly protein, PilQ
Sputcn32_3709  -1      12       -0.541277  pilus assembly protein, PilO
Sputcn32_3710  -1      12       0.332350   fimbrial assembly family protein
Sputcn32_3711  -1      12       1.114564   type IV pilus assembly protein PilM
Sputcn32_3712  0       15       1.745225   1A family penicillin-binding protein
Sputcn32_3713  -1      16       3.299834   argininosuccinate lyase
Sputcn32_3714  0       17       3.125756   argininosuccinate synthase
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3703  TYPE3IMSPROT  33     0.001    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 32.8 bits (75), Expect = 0.001
Identities = 22/91 (24%), Positives = 30/91 (32%), Gaps = 17/91 (18%)

Query: 164 IGYEKAFEQIRTGDVIYCDPP-------YAPLSTTASFTTYVGAGFSLDDQALLARHSRH 216
I E ++ V+ +P Y T T+ D Q R
Sbjct: 245 IQSRNMRENVKRSSVVVANPTHIAIGILYKRGETPLPLVTFKYT----DAQVQ---TVRK 297

Query: 217 TALERGIPVLISNHDIPLTRELYRGAHLAKL 247
A E G+P+L IPL R LY A +
Sbjct: 298 IAEEEGVPIL---QRIPLARALYWDALVDHY 325


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
Sputcn32_3704  PF05272  32     0.007    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 31.6 bits (71), Expect = 0.007
Identities = 15/65 (23%), Positives = 24/65 (36%)

Query: 14 ALIQRLHHIASYSDQLLVLSGAQGSGKTTLVTALATDFDESNAALVICPMHADNAEIRRK 73
+ R+ D +VL G G GK+TL+ L S+ I +I
Sbjct: 583 GHVARVMEPGCKFDYSVVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGTGKDSYEQIAGI 642

Query: 74 ILVQL 78
+ +L
Sbjct: 643 VAYEL 647


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
Sputcn32_3706  PF05272  31     0.002    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 31.2 bits (70), Expect = 0.002
Identities = 16/64 (25%), Positives = 25/64 (39%), Gaps = 8/64 (12%)

Query: 9 LVGPMGAGKSTIGRHLAQML-----HLEFHDSDQEIEQRTGADIAWVFDVEGEEGFRRRE 63
L G G GKST+ L + H + EQ G +++ FRR +
Sbjct: 601 LEGTGGIGKSTLINTLVGLDFFSDTHFDIGTGKDSYEQIAG---IVAYELSEMTAFRRAD 657

Query: 64 AQVI 67
A+ +
Sbjct: 658 AEAV 661


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3707  BCTERIALGSPD  247    1e-74    Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D signature.

Length = 660

Score = 247 bits (631), Expect = 1e-74
Identities = 93/401 (23%), Positives = 178/401 (44%), Gaps = 38/401 (9%)

Query: 313 DVPWDQALDLILQTKGLDKRIEGNILMVAPSEELAIRESQNLKNKQEVKELAPLYSEYLQ 372
+ W A D++ L+K + L + + E N ++
Sbjct: 198 PLSWASAADVVKLVTELNKDTSKSALPGSMVANVVADERTNAVLVSGEPNSRQRIIAMIK 257

Query: 373 ----------------INYAKAIDIAELLKSADSSLLSPRG------------SVAVDER 404
+ YAKA D+ E+L S++ S + + +
Sbjct: 258 QLDRQQATQGNTKVIYLKYAKASDLVEVLTGISSTMQSEKQAAKPVAALDKNIIIKAHGQ 317

Query: 405 TNTVLVKDTAEIIENIHRLVEVLDIPIRQVLIESRMVTVKDNVSEDLGIRWGITDQQGNK 464
TN ++V +++ ++ R++ LDI QVL+E+ + V+D +LGI+W + +
Sbjct: 318 TNALIVTAAPDVMNDLERVIAQLDIRRPQVLVEAIIAEVQDADGLNLGIQWANKNAGMTQ 377

Query: 465 GSSGSLEGAQDIANGIVPSIGDRLNVNLPAQVDSAASIAFHVAKLADGTILDLELSALEQ 524
++ L + IA + ++ +L + + S IA + + L+AL
Sbjct: 378 FTNSGLPISTAIAGANQYNKDGTVSSSLASALSSFNGIAAGFYQGN----WAMLLTALSS 433

Query: 525 ENKGEIIASPRITTSNQKAAYIEQGIEIPYV-----QSTSSGATSVTFKKAVLSLRVTPQ 579
K +I+A+P I T + A G E+P + S + +V K + L+V PQ
Sbjct: 434 STKNDILATPSIVTLDNMEATFNVGQEVPVLTGSQTTSGDNIFNTVERKTVGIKLKVKPQ 493

Query: 580 ITPDNRVILDLEITQDSEGKTVPTSTGP-AVAIDTQRIGTQVLVNNGETIVLGGIYQQNL 638
I + V+L++E S +++ +T+ + VLV +GET+V+GG+ +++
Sbjct: 494 INEGDSVLLEIEQEVSSVADAASSTSSDLGATFNTRTVNNAVLVGSGETVVVGGLLDKSV 553

Query: 639 ISRVSKVPVLGDIPLVGFLFRNTTDKNERQELLIFVTPKIV 679
KVP+LGDIP++G LFR+T+ K ++ L++F+ P ++
Sbjct: 554 SDTADKVPLLGDIPVIGALFRSTSKKVSKRNLMLFIRPTVI 594



Score = 48.4 bits (115), Expect = 8e-08
Identities = 33/175 (18%), Positives = 75/175 (42%), Gaps = 14/175 (8%)

Query: 274 SLNFQNISVRTVLQIIADYNNFNLVTSDTVEGNITLR-LDDVPWDQALDLILQTKGLDKR 332
S +F+ ++ + ++ N ++ +V G IT+R D + +Q L
Sbjct: 31 SASFKGTDIQEFINTVSKNLNKTVIIDPSVRGTITVRSYDMLNEEQYYQFFLSVL----D 86

Query: 333 IEGNILMVAPSEELAIRESQNLKNKQEVK--ELAP-----LYSEYLQINYAKAIDIAELL 385
+ G ++ + L + S++ K + AP + + + + A D+A LL
Sbjct: 87 VYGFAVINMNNGVLKVVRSKDAKTAAVPVASDAAPGIGDEVVTRVVPLTNVAARDLAPLL 146

Query: 386 KSADSSLLSPRGSVAVDERTNTVLVKDTAEIIENIHRLVEVLDIPIRQVLIESRM 440
+ + + + GSV E +N +L+ A +I+ + +VE +D + ++ +
Sbjct: 147 RQLNDN--AGVGSVVHYEPSNVLLMTGRAAVIKRLLTIVERVDNAGDRSVVTVPL 199


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3711  SHAPEPROTEIN  42     2e-06    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 42.1 bits (99), Expect = 2e-06
Identities = 33/156 (21%), Positives = 59/156 (37%), Gaps = 34/156 (21%)

Query: 199 VDIGANMTTFSVVESGETTFIREQAFGGELFTQSILSFYGMSY------EQAEKAKIE-- 250
VDIG T +V+ + GG+ F ++I+++ +Y AE+ K E
Sbjct: 164 VDIGGGTTEVAVISLNGVVYSSSVRIGGDRFDEAIINYVRRNYGSLIGEATAERIKHEIG 223

Query: 251 -------------------GDLPRNY------MFEVLSPFQTQLLQQVKRTLQIYCTSSG 285
+PR + + E L T ++ V L+
Sbjct: 224 SAYPGDEVREIEVRGRNLAEGVPRGFTLNSNEILEALQEPLTGIVSAVMVALEQCPPELA 283

Query: 286 RDKVDY-LVLCGGTSKLEGMANLLINELGVHTIIAD 320
D + +VL GG + L + LL+ E G+ ++A+
Sbjct: 284 SDISERGMVLTGGGALLRNLDRLLMEETGIPVVVAE 319


59  Sputcn32_3783  Sputcn32_3794  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_3783  2       14       -0.078895  cytochrome-c oxidase
Sputcn32_3784  2       15       0.108510   cytochrome C oxidase assembly protein
Sputcn32_3785  2       16       -0.657629  cytochrome c oxidase subunit III
Sputcn32_3786  1       14       -0.335802  hypothetical protein
Sputcn32_3787  2       14       0.128800   hypothetical protein
Sputcn32_3788  2       13       -1.959302  cytochrome oxidase assembly
Sputcn32_3789  0       13       -1.872395  protoheme IX farnesyltransferase
Sputcn32_3790  -1      13       -2.518231  electron transport protein SCO1/SenC
Sputcn32_3791  -2      15       -3.427464  polysaccharide deacetylase
Sputcn32_3792  -1      16       -4.269607  MATE efflux family protein
Sputcn32_3793  -3      18       -4.849054  hypothetical protein
Sputcn32_3794  -3      16       -3.651560  putative DNA uptake protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
Sputcn32_3790  PF06057  29     0.017    Type IV secretory pathway VirJ component
		>PF06057#Type IV secretory pathway VirJ component

Length = 243

Score = 28.7 bits (64), Expect = 0.017
Identities = 13/56 (23%), Positives = 25/56 (44%), Gaps = 7/56 (12%)

Query: 59 ADLQGKW---NLFFIGFTFCPDICPTTLNKLAAAYHELNKIAPIQVVFLSVDPNRD 111
Q ++ + IG++F ++ P LN++ A Y + + V LS + D
Sbjct: 108 DKYQAEFGTQKVILIGYSFGAEVIPFVLNEMPARYR--KNV--LGAVLLSPSQSSD 159


60  Sputcn32_3804  Sputcn32_3815  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_3804  2       15       -0.961740  RNA-binding S1 domain-containing protein
Sputcn32_3805  4       22       -2.261017  GreA/GreB family elongation factor
Sputcn32_3806  4       22       -1.931833  phage integrase family protein
Sputcn32_3807  4       21       -1.550866  phage integrase family protein
Sputcn32_3808  0       16       -0.410459  hypothetical protein
Sputcn32_3809  -3      14       1.432416   transcription elongation factor GreB
Sputcn32_3810  -2      15       2.424083   hypothetical protein
Sputcn32_3812  -2      18       3.427274   ISSod10, transposase OrfA
Sputcn32_3813  0       19       3.406332   hypothetical protein
Sputcn32_3814  1       21       3.952200   major facilitator superfamily transporter
Sputcn32_3815  0       19       3.586168   glyceraldehyde-3-phosphate dehydrogenase
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits         Score  E-value  Comments
Sputcn32_3813  THERMOLYSIN  27     0.017    Thermolysin metalloprotease (M4) family signature.
		>THERMOLYSIN#Thermolysin metalloprotease (M4) family signature.

Length = 544

Score = 26.9 bits (59), Expect = 0.017
Identities = 21/80 (26%), Positives = 35/80 (43%), Gaps = 15/80 (18%)

Query: 12 MKKTALMSLLGL--GLFACVAHAS----------EFDLPGFVTEVEDGRLW---VFKENS 56
M K A++ +GL GL A AS ++ P FV+ GR V++
Sbjct: 1 MNKRAMLGAIGLAFGLMAWPFGASAKGKSMVWNEQWKTPSFVSGSLLGRCSQELVYRYLD 60

Query: 57 AELTEFKQHGEPAKQFTVIG 76
E F+ G+ ++ ++IG
Sbjct: 61 QEKNTFQLGGQARERLSLIG 80


61  Sputcn32_3834  Sputcn32_3842  Y  N  Y  Genomic Island
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_3834  3       20       -3.534169  hypothetical protein
Sputcn32_3835  7       24       -5.397424  redox-active disulfide protein 2
Sputcn32_3836  5       20       -4.908709  regulatory protein ArsR
Sputcn32_3837  5       20       -4.558052  helix-turn-helix domain-containing protein
Sputcn32_3838  6       21       -4.875334  hypothetical protein
Sputcn32_3839  5       23       -5.573403  type III restriction enzyme, res subunit
Sputcn32_3840  5       24       -5.971060  hypothetical protein
Sputcn32_3841  1       19       -4.212903  integrase catalytic subunit
Sputcn32_3842  1       17       -3.530846  transposase IS3/IS911 family protein
62  Sputcn32_3857  Sputcn32_3865  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_3857  2       26       2.567679   hypothetical protein
Sputcn32_3858  2       20       1.721692   ubiquinol oxidase subunit II
Sputcn32_3859  1       24       4.158485   cytochrome-c oxidase
Sputcn32_3860  0       30       6.928632   cytochrome c oxidase subunit III
Sputcn32_3861  0       26       6.262701   cytochrome C oxidase subunit IV
Sputcn32_3862  -1      25       6.110665   protoheme IX farnesyltransferase
Sputcn32_3863  -1      23       5.891503   TonB-dependent siderophore receptor
Sputcn32_3864  1       35       8.818065   amino acid permease-associated protein
Sputcn32_3865  1       22       5.344637   homocysteine S-methyltransferase
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3862  PYOCINKILLER  31     0.007    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 30.9 bits (69), Expect = 0.007
Identities = 21/76 (27%), Positives = 27/76 (35%), Gaps = 10/76 (13%)

Query: 192 AIAIFRFNDYA----------AANIPVLPVAEGMTKAKLHIVLYIAVFALVSALLPLAGY 241
AI N YA AA ++ VA+G I IAV V A P
Sbjct: 241 QAAIRAANTYAMPANGSVVATAAGRGLIQVAQGAASLAQAISDAIAVLGRVLASAPSVMA 300

Query: 242 TGIAFMAVTCATSLWW 257
G A + + T+ W
Sbjct: 301 VGFASLTYSSRTAEQW 316


63  Sputcn32_3888  Sputcn32_3895  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_3888  -1      15       -3.348076  phosphopantetheine adenylyltransferase
Sputcn32_3889  0       17       -5.084167  sulfatase
Sputcn32_3890  1       22       -6.561611  dTDP-4-dehydrorhamnose 3,5-epimerase
Sputcn32_3891  2       26       -7.541607  dTDP-4-dehydrorhamnose reductase
Sputcn32_3892  2       22       -6.858700  glycosyl transferase family protein
Sputcn32_3893  1       21       -6.619072  CDP-glycerol:poly(glycerophosphate)
Sputcn32_3894  0       20       -5.075172  UDP-galactopyranose mutase
Sputcn32_3895  -1      16       -3.118144  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3888  LPSBIOSNTHSS  223    3e-78    Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein signature.

Length = 166

Score = 223 bits (571), Expect = 3e-78
Identities = 81/157 (51%), Positives = 113/157 (71%)

Query: 5 AIYPGTFDPITNGHVDLIERAAKLFKHVTIGIAANPSKQPRFTLEERVELVNRVTAHLDN 64
AIYPG+FDPIT GH+D+IER +LF V + + NP+KQP F+++ER+E + + AHL N
Sbjct: 3 AIYPGSFDPITFGHLDIIERGCRLFDQVYVAVLRNPNKQPMFSVQERLEQIAKAIAHLPN 62

Query: 65 VDVVGFSGLLVDFAKEQKASVLVRGLRAVSDFEYEFQLANMNRRLSPDLESVFLTPAEEN 124
V F GL V++A++++A ++RGLR +SDFE E Q+AN N+ L+ DLE+VFLT + E
Sbjct: 63 AQVDSFEGLTVNYARQRQAGAILRGLRVLSDFELELQMANTNKTLASDLETVFLTTSTEY 122

Query: 125 SFISSTLVKEVALHGGDVSQFVHPEVAAALAAKLTSI 161
SF+SS+LVKEVA GG+V FV VAAAL + +
Sbjct: 123 SFLSSSLVKEVARFGGNVEHFVPSHVAAALYDQFHPV 159


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3891  NUCEPIMERASE  45     1e-07    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 45.2 bits (107), Expect = 1e-07
Identities = 22/103 (21%), Positives = 44/103 (42%), Gaps = 4/103 (3%)

Query: 38 RLDITNREQVDTVVNQFHPDVIINAAAYTAVDKAEQEIELCYAINRDGPKYLAEAA--NQ 95
++D+ +RE + + H + + + AV + + N G + E N+
Sbjct: 58 KIDLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNK 117

Query: 96 VGALILHISSDYVFSGRKHGLYTEDDIPD-PLGVYGQSKLAGE 137
+ L+ SS V+ + ++ DD D P+ +Y +K A E
Sbjct: 118 IQHLLY-ASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANE 159


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
Sputcn32_3895  PF05272  29     0.025    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 28.9 bits (64), Expect = 0.025
Identities = 10/58 (17%), Positives = 17/58 (29%), Gaps = 7/58 (12%)

Query: 6 IIARVGDNSLHPT--WINSEK-----RNFDLFISYFGNEEKKYSEHADYYEHMKGGKW 56
I N +HP W+ +++ R + G Y Y + G
Sbjct: 523 INVAADMNRVHPFRDWVKAQQWDEVPRLEKWLVHVLGKTPDDYKPRRLRYLQLVGKYI 580


64  Sputcn32_3921  Sputcn32_3941  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_3921  2       23       -1.355349  hypothetical protein
Sputcn32_3922  3       24       -1.733615  superfamily I DNA/RNA helicase
Sputcn32_3923  3       25       -1.395196  transposase IS3/IS911 family protein
Sputcn32_3924  3       27       -2.072796  integrase catalytic subunit
Sputcn32_3925  4       25       -2.585700  hypothetical protein
Sputcn32_3926  3       26       -1.685443  putative DNA modification methylase
Sputcn32_3927  3       27       -1.636482  helix-turn-helix domain-containing protein
Sputcn32_3928  3       27       -1.664502  hypothetical protein
Sputcn32_3929  4       23       -0.791916  hypothetical protein
Sputcn32_3930  3       24       0.430923   transposition regulatory protein
Sputcn32_3931  3       24       0.091915   transposition protein
Sputcn32_3932  4       25       -0.687431  hypothetical protein
Sputcn32_3933  3       25       -1.921556  hypothetical protein
Sputcn32_3934  3       25       -1.105869  ATPase AAA
Sputcn32_3935  3       27       -2.363387  putative transposase
Sputcn32_3937  5       29       -4.167511  hypothetical protein
Sputcn32_3938  5       27       -3.880748  hypothetical protein
Sputcn32_3939  3       27       -3.209853  DNA methylase N-4/N-6 domain-containing protein
Sputcn32_3940  2       26       -1.894509  integrase catalytic subunit
Sputcn32_3941  2       25       -1.689594  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3930  TYPE3OMGPROT  29     0.048    Type III secretion system outer membrane G protein ...
		>TYPE3OMGPROT#Type III secretion system outer membrane G protein family signature.

Length = 607

Score = 29.1 bits (65), Expect = 0.048
Identities = 11/31 (35%), Positives = 16/31 (51%), Gaps = 2/31 (6%)

Query: 33 LANIPVISRGKLINSQKRQSRRVDRVFTFKP 63
L +IP I G L + +RR R+F +P
Sbjct: 478 LGDIPYI--GALFRRKSELTRRTVRLFIIEP 506


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3933  FLGMRINGFLIF  27     0.021    Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 27.2 bits (60), Expect = 0.021
Identities = 16/50 (32%), Positives = 22/50 (44%), Gaps = 7/50 (14%)

Query: 51 EFFDE----QSKLASMMNQIRELETEVAR-FNESQGVK--RGQLTDPERS 93
E D+ S+ + +N R LE E+AR VK R L P+ S
Sbjct: 113 ELLDQEKFGISQFSEQVNYQRALEGELARTIETLGPVKSARVHLAMPKPS 162


65  Sputcn32_3951  Sputcn32_3987  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_3951  3       21       0.144465   glucosamine--fructose-6-phosphate
Sputcn32_3952  2       23       -0.196880  DeoR family transcriptional regulator
Sputcn32_3953  3       29       -0.258631  TonB-dependent siderophore receptor
Sputcn32_3954  3       31       0.591670   UDP-N-acetylglucosamine pyrophosphorylase
Sputcn32_3955  5       36       0.616135   F0F1 ATP synthase subunit epsilon
Sputcn32_3956  5       39       0.436028   F0F1 ATP synthase subunit beta
Sputcn32_3957  4       33       -0.819261  F0F1 ATP synthase subunit gamma
Sputcn32_3958  4       32       -1.575376  F0F1 ATP synthase subunit alpha
Sputcn32_3959  5       25       -3.861006  F0F1 ATP synthase subunit delta
Sputcn32_3960  3       23       -3.960997  F0F1 ATP synthase subunit B
Sputcn32_3961  1       22       -4.510704  F0F1 ATP synthase subunit C
Sputcn32_3962  -1      24       -4.344296  F0F1 ATP synthase subunit A
Sputcn32_3963  -2      24       -4.178637  ATP synthase I chain
Sputcn32_3964  -1      23       -4.009502  parB-like partition protein
Sputcn32_3965  -3      20       -1.834694  cobyrinic acid a,c-diamide synthase
Sputcn32_3966  -3      17       -1.069497  16S rRNA methyltransferase GidB
Sputcn32_3967  -2      14       0.416323   tRNA uridine 5-carboxymethylaminomethyl
Sputcn32_3968  0       16       1.806414   flavodoxin
Sputcn32_3969  0       17       2.279923   TGF-beta receptor, type I/II extracellular
Sputcn32_3971  4       19       3.683282   RND family efflux transporter MFP subunit
Sputcn32_3972  3       21       1.490996   two component transcriptional regulator
Sputcn32_3973  4       22       0.974590   integral membrane sensor signal transduction
Sputcn32_3974  4       23       -0.775064  hypothetical protein
Sputcn32_3976  0       19       0.960591   hypothetical protein
Sputcn32_3977  -1      21       1.203075   hypothetical protein
Sputcn32_3978  1       27       3.205994   hypothetical protein
Sputcn32_3979  2       32       4.372008   resolvase domain-containing protein
Sputcn32_3980  1       29       2.390265   transcriptional regulator
Sputcn32_3981  2       31       3.151972   N-6 DNA methylase
Sputcn32_3982  1       24       1.256894   hypothetical protein
Sputcn32_3983  1       23       1.251665   putative transcriptional regulator
Sputcn32_3984  1       22       0.243487   N-6 DNA methylase
Sputcn32_3985  1       17       -0.575635  restriction modification system DNA specificity
Sputcn32_3986  1       19       0.358227   HsdR family type I site-specific
Sputcn32_3987  2       17       -2.204173  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3960  IGASERPTASE   31     0.002    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 31.2 bits (70), Expect = 0.002
Identities = 21/108 (19%), Positives = 43/108 (39%), Gaps = 10/108 (9%)

Query: 27 PPLMNAIEERQKKIADGLADAGRAAKDLELAQAKATEQLKEAKVTANEIIE------QAN 80
PP A + ++ + +K +E + ATE + + A E Q N
Sbjct: 1026 PPPAPATPSETTETVA--ENSKQESKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTN 1083

Query: 81 K--RKAQIVEEAKAEADAERAKIIAQGKAEIENERSRVKDDLRKQVAA 126
+ + +E + E A + + KA++E E+++ + QV+
Sbjct: 1084 EVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSP 1131


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3971  RTXTOXIND     33     0.002    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 33.3 bits (76), Expect = 0.002
Identities = 25/131 (19%), Positives = 51/131 (38%), Gaps = 15/131 (11%)

Query: 56 PGRVAAV-RSAEIRAQISGIVQGRLFEQGAEITSGTVLFQINPAPFKADVDIAAAALLRA 114
G++ RS EI+ + IV+ + ++G + G VL ++ +AD ++LL+A
Sbjct: 87 NGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQA 146

Query: 115 EAVWVRAR-----QEADRLASLIQT-----EAVSQQMYDDAI----SQRDQAAADVAQTK 160
R + E ++L L + VS++ Q Q +
Sbjct: 147 RLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKE 206

Query: 161 ATLARRQLDLQ 171
L +++ +
Sbjct: 207 LNLDKKRAERL 217



Score = 32.1 bits (73), Expect = 0.003
Identities = 19/109 (17%), Positives = 51/109 (46%), Gaps = 2/109 (1%)

Query: 120 RARQEADRLASLIQTEAVSQQMYDDAISQRDQAAADVAQTKATLARRQLDLQFASVEAPI 179
+ E++ L++ + + V+Q ++ + + Q ++ LA+ + Q + + AP+
Sbjct: 275 LEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPV 334

Query: 180 SGRIDEALV-SEGALVSPTDTTPMARIQQIDQVYVDVRLPASMLKAMRQ 227
S ++ + V +EG +V+ +T M + + D + V + + +
Sbjct: 335 SVKVQQLKVHTEGGVVTTAETL-MVIVPEDDTLEVTALVQNKDIGFINV 382


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3972  HTHFIS        85     3e-21    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 85.3 bits (211), Expect = 3e-21
Identities = 37/142 (26%), Positives = 68/142 (47%), Gaps = 6/142 (4%)

Query: 11 MTSALVLIAEDEAEIADILIAYLQRSGLRTQHAIDGIQALAMHQTFKPDLLLLDVQMPNL 70
MT A +L+A+D+A I +L L R+G + + DL++ DV MP+
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 71 DGWNVLNEIRSRG-DTPVIMLTALDQDIDKLMGLRLGADDYVVKPFNPAEVVARVQAVL- 128
+ +++L I+ D PV++++A + + + GA DY+ KPF+ E++ + L
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 129 ----RRARTNTPINTQMLRVGQ 146
R ++ M VG+
Sbjct: 121 EPKRRPSKLEDDSQDGMPLVGR 142


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3985  BONTOXILYSIN  29     0.037    Bontoxilysin signature.
		>BONTOXILYSIN#Bontoxilysin signature.

Length = 1196

Score = 29.1 bits (65), Expect = 0.037
Identities = 16/72 (22%), Positives = 36/72 (50%), Gaps = 8/72 (11%)

Query: 189 IDDDTTYPLRINSTSNVEIKLLKELLLNKPQNGQFVKKGSGGSVDCSFLNVVDG---YVN 245
++D+ L + +T + I +K L+N ++ +V+K D + V+DG Y++
Sbjct: 1070 VNDNNKSYLSLKNTDGINISSVKFKLINIDESKVYVQK-----WDECIICVLDGTEKYLD 1124

Query: 246 SYSTEDRREIIS 257
+R +++S
Sbjct: 1125 ISPENNRIQLVS 1136


66  Sputcn32_0300  Sputcn32_0310  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_0300  3       15       -3.740498  fimbrial biogenesis outer membrane usher
Sputcn32_0301  2       19       -5.707695  fimbrial subunit
Sputcn32_0302  2       17       -5.322534  pili assembly chaperone
Sputcn32_0303  1       17       -4.594739  multi-sensor hybrid histidine kinase
Sputcn32_0304  0       15       -3.391233  two component LuxR family transcriptional
Sputcn32_0305  0       11       -1.655885  response regulator receiver modulated
Sputcn32_0306  1       13       -1.329553  hypothetical protein
Sputcn32_0307  0       12       -0.887329  o-succinylbenzoate--CoA ligase
Sputcn32_0308  0       12       -1.128308  O-succinylbenzoate synthase
Sputcn32_0309  1       13       -1.241363  alpha/beta hydrolase fold domain-containing
Sputcn32_0310  1       12       -1.781292  2-succinyl-5-enolpyruvyl-6-hydroxy-3-
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0300  PF00577       702    0.0      Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 702 bits (1814), Expect = 0.0
Identities = 269/867 (31%), Positives = 418/867 (48%), Gaps = 52/867 (5%)

Query: 14 LRLLPLLLLSAPLYADDMEYTFDEALLLGPGYNSDYLQRLANGPDILPGQYQVDLFINGR 73
+RL +A E F+ L L R NG ++ PG Y+VD+++N
Sbjct: 28 VRLFVACAFAAQAPLSSAELYFNPRFLADDPQAVADLSRFENGQELPPGTYRVDIYLNNG 87

Query: 74 FVSRELLQFIELSDNK-VEPCIEEAIWTKAGVRPEFMAPAPITGG--C-SLGYTVKGNNF 129
+++ + F + + PC+ A G+ ++ + C L +
Sbjct: 88 YMATRDVTFNTGDSEQGIVPCLTRAQLASMGLNTASVSGMNLLADDACVPLTSMIHDATA 147

Query: 130 SFDSGSLRLELTVPQAYMDNKPRGYISPDEWDSGETALFVNYSGNYYRSESSYGRRSTSE 189
D G RL LT+PQA+M N+ RGYI P+ WD G A +NY+ + ++ G S
Sbjct: 148 QLDVGQQRLNLTIPQAFMSNRARGYIPPELWDPGINAGLLNYNFSGNSVQNRIGG--NSH 205

Query: 190 SGYLGLSSGFNLGLWRIRNQSTYQYRDSGFGSDSQ--FDSLRTYATRALPFWQSELSLGE 247
YL L SG N+G WR+R+ +T+ Y S S S+ + + T+ R + +S L+LG+
Sbjct: 206 YAYLNLQSGLNIGAWRLRDNTTWSYNSSDSSSGSKNKWQHINTWLERDIIPLRSRLTLGD 265

Query: 248 LYTRSTVFGSISFKGFQIQSDTRMLPVSQRGYAPTVTGIAQTTAKVVIKQNGREIYQTTV 307
YT+ +F I+F+G Q+ SD MLP SQRG+AP + GIA+ TA+V IKQNG +IY +TV
Sbjct: 266 GYTQGDIFDGINFRGAQLASDDNMLPDSQRGFAPVIHGIARGTAQVTIKQNGYDIYNSTV 325

Query: 308 AAGPFEINDLYPTNYQGDLLVEITEADGRVSSFTVPFSAVPGSMRAGQSQYALSMGKSID 367
GPF IND+Y GDL V I EADG FTVP+S+VP R G ++Y+++ G+
Sbjct: 326 PPGPFTINDIYAAGNSGDLQVTIKEADGSTQIFTVPYSSVPLLQREGHTRYSITAGEYRS 385

Query: 368 TG---EDGYFTDFTYELGLSNAMTFNSGLRIASGYVAASAGSVFTT-PIGAIGATGVYSH 423
E F T GL T G ++A Y A + G +GA+ ++
Sbjct: 386 GNAQQEKPRFFQSTLLHGLPAGWTIYGGTQLADRYRAFNFGIGKNMGALGALSVDMTQAN 445

Query: 424 SKLKPELGDVETDSGWRLGLNYSRAF-ESGTSVTLAGYKYSTEGFRELSDIFRQRAYLDN 482
S L D G + Y+++ ESGT++ L GY+YST G+ +D R N
Sbjct: 446 ST----LPDDSQHDGQSVRFLYNKSLNESGTNIQLVGYRYSTSGYFNFADTTYSRMNGYN 501

Query: 483 GY-------------DYSSENYLQKAEMVLSMSQHLGDWGSLAVSGSKRQYRDGRDDDES 529
DY + Y ++ ++ L+++Q LG +L +SGS + Y + DE
Sbjct: 502 IETQDGVIQVKPKFTDYYNLAYNKRGKLQLTVTQQLGRTSTLYLSGSHQTYWGTSNVDEQ 561

Query: 530 YQMGYSNQFGLVSLGVNYSRQYLNQTENSIAGPIPVKTKDDVWSLSLSVPLG-------A 582
+Q G + F ++ ++YS + K +D + +L++++P
Sbjct: 562 FQAGLNTAFEDINWTLSYSL---TKNAWQ-------KGRDQMLALNVNIPFSHWLRSDSK 611

Query: 583 SSNHAASTGYSNSGDSN---NYYAGISGMLDDDRTLSYSLNASR-LDDNDFSGNSYSATL 638
S AS YS S D N AG+ G L +D LSYS+ + SG++ ATL
Sbjct: 612 SQWRHASASYSMSHDLNGRMTNLAGVYGTLLEDNNLSYSVQTGYAGGGDGNSGSTGYATL 671

Query: 639 NKRMSLASMGLNYSRSDSYQQWGGNIRGAVVVHEGGLTLGQSVGETFAIIEAPGAEGAAV 698
N R + + YS SD +Q + G V+ H G+TLGQ + +T +++APGA+ A V
Sbjct: 672 NYRGGYGNANIGYSHSDDIKQLYYGVSGGVLAHANGVTLGQPLNDTVVLVKAPGAKDAKV 731

Query: 699 KNNWGTYIDSNGYALMPSLMPYRSNDVSLDSANIDEDVELVDSRKTVTPYAGAAVKLKYE 758
+N G D GYA++P YR N V+LD+ + ++V+L ++ V P GA V+ +++
Sbjct: 732 ENQTGVRTDWRGYAVLPYATEYRENRVALDTNTLADNVDLDNAVANVVPTRGAIVRAEFK 791

Query: 759 TRKGKAALFIAKLDSGEVAPLGANILGSLGENIGTVGQAGLAYVRLAEPVGKLTLKWGER 818
R G L + + P GA + ++ G V G Y+ GK+ +KWGE
Sbjct: 792 ARVGIKLLMTLT-HNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGMPLAGKVQVKWGEE 850

Query: 819 REDSCQIHYDLTEQPTDVRLHRLPTVC 845
C +Y L + L +L C
Sbjct: 851 ENAHCVANYQLPPESQQQLLTQLSAEC 877


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0303  HTHFIS        77     1e-16    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 77.2 bits (190), Expect = 1e-16
Identities = 33/157 (21%), Positives = 63/157 (40%), Gaps = 10/157 (6%)

Query: 960 HILIVDDHSANRLLLSQQLRYLGHSVDEANNGLEAIQLFRQHPYRIVLTDCNMPIMDGYE 1019
IL+ DD +A R +L+Q L G+ V +N + +V+TD MP + ++
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 1020 FSRRLRQFEQNNNLPAAVVLGYTANAQLEAKQACIDAGMNDCLFKPISLEDLRQKLESYC 1079
R++ + +LP V+ +A + G D L KP L +L +
Sbjct: 65 LLPRIK--KARPDLPVLVM---SAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIG--- 116

Query: 1080 QKLTLEHPQQAFLPEALTKLTG--GNTPLFEQLLKEL 1114
+ L + + L + G + +++ + L
Sbjct: 117 RALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVL 153


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0304  HTHFIS        86     6e-22    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 86.0 bits (213), Expect = 6e-22
Identities = 34/165 (20%), Positives = 74/165 (44%), Gaps = 11/165 (6%)

Query: 1 MKR-KILIVDDHPVVVLALKIILEQNGFEVIADTNNGVDALKLVKDLSPDAVILDIGIPQ 59
M IL+ DD + L L + G++V T+N + + D V+ D+ +P
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPD 59

Query: 60 LDGLEVIERSRKLANPPPILVLTAQPSDHFVVRCIQAGASGFVSKQKDMTEVTGALRAIL 119
+ +++ R +K P+LV++AQ + ++ + GA ++ K D+TE+ G + L
Sbjct: 60 ENAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRAL 119

Query: 120 S-------GHSYFPIFGNNIITQS--HQQEAELIKKLSTREMVVL 155
+ G ++ +S Q+ ++ +L ++ ++
Sbjct: 120 AEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLM 164


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0305  HTHFIS        50     8e-09    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 50.2 bits (120), Expect = 8e-09
Identities = 17/72 (23%), Positives = 28/72 (38%), Gaps = 2/72 (2%)

Query: 1 MGKIRVIVLEDHPFQRTVLEYNLASFPCVEVFSFGTAQDALAWLDIHHSADIVICDLMMT 60
M ++V +D RTVL S +V A W+ D+V+ D++M
Sbjct: 1 MTGATILVADDDAAIRTVLN-QALSRAGYDVRITSNAATLWRWIAAGDG-DLVVTDVVMP 58

Query: 61 GVDGLSFLRKAK 72
+ L + K
Sbjct: 59 DENAFDLLPRIK 70


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0310  BINARYTOXINA  33     0.002    Clostridial binary toxin A signature.
		>BINARYTOXINA#Clostridial binary toxin A signature.

Length = 454

Score = 33.5 bits (76), Expect = 0.002
Identities = 24/93 (25%), Positives = 41/93 (44%), Gaps = 10/93 (10%)

Query: 483 LNNDGGNIFNLL---PVPNEQVRNDYYRLSHGLEFGYAAAMFNLPYNQVDNLADFQDSYN 539
L++ NI N L P+P+ + YR S EFG +N+++N+ F++ +
Sbjct: 312 LDSKVNNIENALKLTPIPSNLI---VYRRSGPQEFGLTLTSPEYDFNKIENIDAFKEKWE 368

Query: 540 ----EALDFQGASIIEVNVSQTQASDQIAELNL 568
+F SI VN+S I +N+
Sbjct: 369 GKVITYPNFISTSIGSVNMSAFAKRKIILRINI 401


67  Sputcn32_0365  Sputcn32_0372  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_0365  0       12       -0.088745  Fis family GAF modulated sigma54 specific
Sputcn32_0366  0       13       -0.333026  integral membrane sensor signal transduction
Sputcn32_0367  -1      11       0.514908   two component transcriptional regulator
Sputcn32_0368  -2      12       0.108563   hypothetical protein
Sputcn32_0369  -2      12       0.290806   cation diffusion facilitator family transporter
Sputcn32_0370  0       15       0.487729   hypothetical protein
Sputcn32_0371  -2      16       1.058248   nitrogen metabolism transcriptional regulator,
Sputcn32_0372  -1      17       0.607461   signal transduction histidine kinase, nitrogen
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0365  HTHFIS        336    e-111    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 336 bits (864), Expect = e-111
Identities = 120/340 (35%), Positives = 191/340 (56%), Gaps = 10/340 (2%)

Query: 296 RDPQLERAWQHANKVITKQIPLLVLGETGVGKEQFVKKLHAQSARRTEPLVAVNCAALPA 355
R ++ ++ +++ + L++ GE+G GKE + LH RR P VA+N AA+P
Sbjct: 142 RSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPR 201

Query: 356 ELVESELFGYQAGAFTGANRTGFIGKIRQAHGGFLFLDEIGEMPLAAQSRLLRVLQEREV 415
+L+ESELFG++ GAFTGA G+ QA GG LFLDEIG+MP+ AQ+RLLRVLQ+ E
Sbjct: 202 DLIESELFGHEKGAFTGAQTRS-TGRFEQAEGGTLFLDEIGDMPMDAQTRLLRVLQQGEY 260

Query: 416 VPVGSNQSFKVDIQIIAATHMDLEKQVAQGLFRQDLFYRLNGLQVRLPALRERQ-DIERI 474
VG + D++I+AAT+ DL++ + QGLFR+DL+YRLN + +RLP LR+R DI +
Sbjct: 261 TTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRLNVVPLRLPPLRDRAEDIPDL 320

Query: 475 IH---KLHRKHRIAAQDICPELLGKMMQYDWPGNLRELDNLMQVACLMAEGDNTLNWQHL 531
+ + K + + E L M + WPGN+REL+NL++ + D + + +
Sbjct: 321 VRHFVQQAEKEGLDVKRFDQEALELMKAHPWPGNVRELENLVRRLTALYPQD-VITREII 379

Query: 532 PDYLAAKLTCEPLKGDLLQAASLCEEFKEVSGQPSASPLPSAKFAVEPNVTKVQSDSLNE 591
+ L +++ P++ A S + + S A+ P+ + L E
Sbjct: 380 ENELRSEIPDSPIEKA--AARSGSLSISQAVEENMRQYFASFGDALPPSG--LYDRVLAE 435

Query: 592 AIYSNVQQAFQACNGNVSQCAKRLGISRNALYRKLKQMGL 631
Y + A A GN + A LG++RN L +K++++G+
Sbjct: 436 MEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGV 475


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0366  PF06580       38     5e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 38.3 bits (89), Expect = 5e-05
Identities = 20/118 (16%), Positives = 47/118 (39%), Gaps = 12/118 (10%)

Query: 277 EAEQLEKLIAELLELSRVKLNTNETKVRLGLAESLSQVLDDAEFEADQQDKKIT--IDID 334
+ + +++ L EL R L + + + LA+ L+ V + + Q + ++ I+
Sbjct: 189 DPTKAREMLTSLSELMRYSLRYSNARQ-VSLADELTVVDSYLQLASIQFEDRLQFENQIN 247

Query: 335 EDIELAHFPKSLSRAIENLLRNAIRYA------KNDIYIHASATADEVYITIKDDGPG 386
I P L ++ L+ N I++ I + + V + +++ G
Sbjct: 248 PAIMDVQVPPML---VQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSL 302


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0367  HTHFIS        92     7e-24    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 92.2 bits (229), Expect = 7e-24
Identities = 44/163 (26%), Positives = 75/163 (46%), Gaps = 3/163 (1%)

Query: 2 SRILLIDDDLGLSELLGQLLELEGFTLTLAYDGKKGLDLALTTDFDLILLDVMLPKLNGF 61
+ IL+ DDD + +L Q L G+ + + + D DL++ DV++P N F
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 62 EVLRALRQH-KQTPVLMLTARGDEIDRVVGLEIGADDYLPKPFNDRELIARIRAIIRRAH 120
++L +++ PVL+++A+ + + E GA DYLPKPF+ ELI I +
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 121 LTAQEIHAAPAQEFGDLRLDPSRQEAYCNEQLIILTGTEFTLL 163
++ + + QE Y L L T+ TL+
Sbjct: 124 RRPSKLEDDSQDGMPLVGRSAAMQEIY--RVLARLMQTDLTLM 164


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0371  HTHFIS        560    0.0      FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 560 bits (1445), Expect = 0.0
Identities = 197/473 (41%), Positives = 294/473 (62%), Gaps = 11/473 (2%)

Query: 7 VWILDDDSSIRWVLEKALQGAKLSTASFAAAESLWQALEISQPRVIVSDIRMPGTDGLSL 66
+ + DDD++IR VL +AL A + A +LW+ + ++V+D+ MP + L
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 67 LERLQIHYPHIPVIIMTAHSDLDSAVSAYQAGAFEYLPKPFDIDEAISLVERALTHATEQ 126
L R++ P +PV++M+A + +A+ A + GA++YLPKPFD+ E I ++ RAL +
Sbjct: 66 LPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKRR 125

Query: 127 SPAPVQETQVKTPEIIGEAPAMQEVFRAIGRLSRSSISVLINGQSGTGKELVAGALHKHS 186
P+ +++ ++G + AMQE++R + RL ++ ++++I G+SGTGKELVA ALH +
Sbjct: 126 -PSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARALHDYG 184

Query: 187 PRKDKPFIALNMAAIPKDLIESELFGHEKGAFTGAANVRQGRFEQANGGTLFLDEIGDMP 246
R++ PF+A+NMAAIP+DLIESELFGHEKGAFTGA GRFEQA GGTLFLDEIGDMP
Sbjct: 185 KRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDMP 244

Query: 247 LDVQTRLLRVLADGQFYRVGGHNAVQVDVRIIAATHQDLELLVQKGGFREDLFHRLNVIR 306
+D QTRLLRVL G++ VGG ++ DVRI+AAT++DL+ + +G FREDL++RLNV+
Sbjct: 245 MDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRLNVVP 304

Query: 307 VHLPPLSQRREDIPQLATHFLASAAKEIGVEPKILTKETAAKLSQLPWPGNVRQLENTCR 366
+ LPPL R EDIP L HF+ A KE G++ K +E + PWPGNVR+LEN R
Sbjct: 305 LRLPPLRDRAEDIPDLVRHFVQQAEKE-GLDVKRFDQEALELMKAHPWPGNVRELENLVR 363

Query: 367 WLTVMASGQEILPQDLPPELLKDPVSITHATKVGQDWQSALTEWIDQKLSE--------- 417
LT + I + + EL + + ++++ +++ + +
Sbjct: 364 RLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFASFGDAL 423

Query: 418 GNSDLLTEVQPAFERILLETALRHTQGHKQEAAKRLGWGRNTLTRKLKELSMD 470
S L V E L+ AL T+G++ +AA LG RNTL +K++EL +
Sbjct: 424 PPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRELGVS 476


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0372  PF06580       40     8e-06    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 40.2 bits (94), Expect = 8e-06
Identities = 35/188 (18%), Positives = 71/188 (37%), Gaps = 33/188 (17%)

Query: 166 TLIIEQADRLRNLVDRL-------LGPQRPTQHSLHNIHKVVQKVYKLVEMALPANIQLK 218
LI+E + R ++ L L Q SL + VV +L + +Q +
Sbjct: 184 ALILEDPTKAREMLTSLSELMRYSLRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFE 243

Query: 219 RDYDPSIPDIEMDPDQMQQAVLNILQNAVQALEHTGGEILIRTRTQHQVTIGSQRHKLVL 278
+P+I D+++ P +Q V N +++ + L GG+IL++ + +
Sbjct: 244 NQINPAIMDVQVPPMLVQTLVENGIKHGIAQLPQ-GGKILLKGTKDNGT----------V 292

Query: 279 TLSIIDNGPGIPPELMDTLFYPMVTSREQGSGLGLSIAHNIARLHSG---RIDCISSAGH 335
TL + + G + + ++ +G GL ++ G +I G
Sbjct: 293 TLEVENTGSL------------ALKNTKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGK 340

Query: 336 TEFTISLP 343
+ +P
Sbjct: 341 VNAMVLIP 348


68  Sputcn32_0557  Sputcn32_0574  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_0557  -2      19       0.199413   hypothetical protein
Sputcn32_0558  -2      15       0.844728   MSHA biogenesis protein MshJ
Sputcn32_0559  -2      15       1.154384   MSHA biogenesis protein MshK
Sputcn32_0560  -1      15       1.000014   pilus (MSHA type) biogenesis protein MshL
Sputcn32_0561  0       17       1.665173   MSHA biogenesis protein MshM
Sputcn32_0562  -1      16       0.972426   hypothetical protein
Sputcn32_0563  0       17       0.016827   type II secretion system protein E
Sputcn32_0564  0       19       -1.428891  type II secretion system protein
Sputcn32_0565  1       19       -2.412187  hypothetical protein
Sputcn32_0566  2       21       -1.888032  MSHA pilin protein MshB
Sputcn32_0567  2       22       -2.140770  methylation site containing protein
Sputcn32_0568  3       16       -0.554078  methylation site containing protein
Sputcn32_0569  3       15       0.219749   MSHA pilin protein MshC
Sputcn32_0570  2       16       0.820891   methylation site containing protein
Sputcn32_0571  2       16       0.947984   MSHA biogenesis protein MshO
Sputcn32_0572  2       16       0.950153   MSHA biogenesis protein MshP
Sputcn32_0573  0       15       0.599217   MSHA biogenesis protein MshQ
Sputcn32_0574  -2      17       0.450100   rod shape-determining protein MreB
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0557  RTXTOXIND     29     0.012    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 29.0 bits (65), Expect = 0.012
Identities = 13/75 (17%), Positives = 24/75 (32%), Gaps = 9/75 (12%)

Query: 51 EQALQQASQQKQQLEQQKAALEAQIAARKPDPALVARVELESQQLELKQLLISELALRSA 110
++ QK Q E A+ ++AR+ +++ S L S+
Sbjct: 192 KEQFSTWQNQKYQKELNLDKKRAERL------TVLARINRYENLSRVEK---SRLDDFSS 242

Query: 111 LTSRGFAPVLKDLAQ 125
L + L Q
Sbjct: 243 LLHKQAIAKHAVLEQ 257


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0560  BCTERIALGSPD  180    2e-51    Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D signature.

Length = 660

Score = 180 bits (459), Expect = 2e-51
Identities = 76/293 (25%), Positives = 131/293 (44%), Gaps = 27/293 (9%)

Query: 257 PQAGLVTIRAFPSELRQVRSFLNSAETHLQRQVILEAKILEVTLSDDFQQGIQWENVLGH 316
Q + + A P + + + + + QV++EA I EV +D GIQW N
Sbjct: 316 GQTNALIVTAAPDVMNDLERVIAQLDIR-RPQVLVEAIIAEVQDADGLNLGIQWANKNAG 374

Query: 317 VGN-TNINFGTTAGTVGNKVTSTLGGVTS-------------LSIKGSDFTTMINLLDTQ 362
+ TN + G + G V+S ++ ++ L +
Sbjct: 375 MTQFTNSGLPISTAIAGANQYNKDGTVSSSLASALSSFNGIAAGFYQGNWAMLLTALSSS 434

Query: 363 GDVDVLSSPRVTASNNQKAVIKVGTDEYFVTDVSSTTVAGTTPVTTPQVELTPFFSGIAL 422
D+L++P + +N +A VG + +T S T +G T + + GI L
Sbjct: 435 TKNDILATPSIVTLDNMEATFNVGQEVPVLT--GSQTTSGDNIFNTVERKTV----GIKL 488

Query: 423 DVTPQIDKDGNVLLHVHPSVIDVKEQTKNIKVSNESLELPLAQSEIRESDTVIRAASGDV 482
V PQI++ +VLL + V V + S+ S +L + R + + SG+
Sbjct: 489 KVKPQINEGDSVLLEIEQEVSSVADAA-----SSTSSDLGATFN-TRTVNNAVLVGSGET 542

Query: 483 VVIGGLMKSENVEVVSQVPLLGDIPLLGELFKNRSKQKKKTELIILLKPTVVG 535
VV+GGL+ + +VPLLGDIP++G LF++ SK+ K L++ ++PTV+
Sbjct: 543 VVVGGLLDKSVSDTADKVPLLGDIPVIGALFRSTSKKVSKRNLMLFIRPTVIR 595


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0562  SYCDCHAPRONE  31     0.005    Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 30.7 bits (69), Expect = 0.005
Identities = 9/48 (18%), Positives = 22/48 (45%)

Query: 324 QGQFTLAEQAYRQLLQQEPQQGKWWMGLGYALDSQQQFAKASQAYRTA 371
G++ A + ++ L + ++++GLG + Q+ A +Y
Sbjct: 49 SGKYEDAHKVFQALCVLDHYDSRFFLGLGACRQAMGQYDLAIHSYSYG 96


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0564  BCTERIALGSPF  305    e-103    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 305 bits (782), Expect = e-103
Identities = 112/406 (27%), Positives = 202/406 (49%), Gaps = 4/406 (0%)

Query: 1 MPVYQYRGRSGQGQAVTGQLDAASESAAADMLLARGIIPLEVKVAKVVK----SFSVTQL 56
M Y Y+ QG+ G +A S A +L RG++PL V + + S ++
Sbjct: 1 MAQYHYQALDAQGKKCRGTQEADSARQARQLLRERGLVPLSVDENRGDQQKSGSTGLSLR 60

Query: 57 FGGKVGLDELQIFTRQMYSLTRSGIPILRAIAGLSETAHSQRMKDALNDISEQLTAGRPL 116
++ +L + TRQ+ +L + +P+ A+ +++ + + + + ++ G L
Sbjct: 61 RKIRLSTSDLALLTRQLATLVAASMPLEEALDAVAKQSEKPHLSQLMAAVRSKVMEGHSL 120

Query: 117 SSAMNQHPEVFDSLFVSMVHVGENTGKLEDAFIQLSGYIEREQETRRRIKAAMRYPIFVL 176
+ AM P F+ L+ +MV GE +G L+ +L+ Y E+ Q+ R RI+ AM YP +
Sbjct: 121 ADAMKCFPGSFERLYCAMVAAGETSGHLDAVLNRLADYTEQRQQMRSRIQQAMIYPCVLT 180

Query: 177 IAITIAMVILNIMVIPKFAEMFSRFGADLPWATKVLIGTSNLFVNYWPLMLVALIGAIVG 236
+ + IL +V+PK E F LP +T+VL+G S+ + P ML+AL+ +
Sbjct: 181 VVAIAVVSILLSVVVPKVVEQFIHMKQALPLSTRVLMGMSDAVRTFGPWMLLALLAGFMA 240

Query: 237 IRYWHHTEKGEKQWDKWKLHIPAVGSIIERSTLSRYCRSFSMMLSAGVPMTQALSLVADA 296
R EK + + LH+P +G I +RY R+ S++ ++ VP+ QA+ + D
Sbjct: 241 FRVMLRQEKRRVSFHRRLLHLPLIGRIARGLNTARYARTLSILNASAVPLLQAMRISGDV 300

Query: 297 VDNAYMHDRIVGMRRGIESGDSVLRVSNQSQLFTPLVLQMVAVGEETGQIDQLLNDAADF 356
+ N Y R+ + G S+ + Q+ LF P++ M+A GE +G++D +L AAD
Sbjct: 301 MSNDYARHRLSLATDAVREGVSLHKALEQTALFPPMMRHMIASGERSGELDSMLERAADN 360

Query: 357 YEGEVDYDLKNLTAKLEPILIGIVAVIVLVLALGIYLPMWDMLNVV 402
+ E + EP+L+ +A +VL + L I P+ + ++
Sbjct: 361 QDREFSSQMTLALGLFEPLLVVSMAAVVLFIVLAILQPILQLNTLM 406


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0566  BCTERIALGSPG  44     6e-08    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 43.7 bits (103), Expect = 6e-08
Identities = 20/46 (43%), Positives = 31/46 (67%), Gaps = 2/46 (4%)

Query: 4 KQNGFSLIELVIVIVILGLLAATAIPRFLNVTD--DAQDASVDGVA 47
KQ GF+L+E+++VIVI+G+LA+ +P + + D Q A D VA
Sbjct: 6 KQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVA 51


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0567  BCTERIALGSPG  45     1e-08    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 44.9 bits (106), Expect = 1e-08
Identities = 14/31 (45%), Positives = 23/31 (74%)

Query: 2 MKRQQGFTLIELVVVIIILGILAVTAAPKFI 32
+Q+GFTL+E++VVI+I+G+LA P +
Sbjct: 4 TDKQRGFTLLEIMVVIVIIGVLASLVVPNLM 34


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0568  BCTERIALGSPG  49     4e-10    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 49.1 bits (117), Expect = 4e-10
Identities = 18/52 (34%), Positives = 28/52 (53%)

Query: 1 MKKQIGFTLIELVVVIIILGILAVTAAPKFINLQSDARASTVKGLEAAIKGA 52
KQ GFTL+E++VVI+I+G+LA P + + A A++ A
Sbjct: 4 TDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENA 55


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0569  BCTERIALGSPH  45     2e-08    Bacterial general secretion pathway protein H signa...
		>BCTERIALGSPH#Bacterial general secretion pathway protein H signature.

Length = 170

Score = 44.9 bits (106), Expect = 2e-08
Identities = 25/80 (31%), Positives = 40/80 (50%), Gaps = 1/80 (1%)

Query: 8 KQAGFTLVELVTTVILIGILSVTVLPRLFTQSSYSAFSLRNEFMAELRQVQQRALNNTDR 67
+Q GFTL+E++ ++L+G+ + VL SA F A+LR VQQR L +
Sbjct: 2 RQRGFTLLEMMLILLLMGVSAGMVLLAFPASRDDSAAQTLARFEAQLRFVQQRGLQT-GQ 60

Query: 68 CYRIAVSSVGYQVSQFATRD 87
+ ++V +Q RD
Sbjct: 61 FFGVSVHPDRWQFLVLEARD 80


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0570  BCTERIALGSPH  34     1e-04    Bacterial general secretion pathway protein H signa...
		>BCTERIALGSPH#Bacterial general secretion pathway protein H signature.

Length = 170

Score = 34.2 bits (78), Expect = 1e-04
Identities = 14/56 (25%), Positives = 29/56 (51%), Gaps = 4/56 (7%)

Query: 16 KGFTLIELVVGMLVIAIAIVM-LSSMLFPQADRAAKTLHRVRSA-ELA--HSVMNE 67
+GFTL+E+++ +L++ ++ M L + + D AA+TL R + +
Sbjct: 4 RGFTLLEMMLILLLMGVSAGMVLLAFPASRDDSAAQTLARFEAQLRFVQQRGLQTG 59


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0571  BCTERIALGSPG  33     6e-04    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 32.9 bits (75), Expect = 6e-04
Identities = 11/18 (61%), Positives = 17/18 (94%)

Query: 13 RGFTLVEMVTVILILGIL 30
RGFTL+E++ VI+I+G+L
Sbjct: 8 RGFTLLEIMVVIVIIGVL 25


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0574  SHAPEPROTEIN  556    0.0      Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 556 bits (1435), Expect = 0.0
Identities = 313/348 (89%), Positives = 331/348 (95%), Gaps = 1/348 (0%)

Query: 1 MFKKLRGIFSNDLSIDLGTANTLIYVRDEGIVLNEPSVVAIRGERNSSGQKSVAAVGTEA 60
M KK RG+FSNDLSIDLGTANTLIYV+ +GIVLNEPSVVAIR +R + KSVAAVG +A
Sbjct: 1 MLKKFRGMFSNDLSIDLGTANTLIYVKGQGIVLNEPSVVAIRQDR-AGSPKSVAAVGHDA 59

Query: 61 KQMLGRTPGNIQAIRPMKDGVIADFYVTEKMLQHFIKQVHNNSFFRPSPRVLVCVPVGAT 120
KQMLGRTPGNI AIRPMKDGVIADF+VTEKMLQHFIKQVH+NSF RPSPRVLVCVPVGAT
Sbjct: 60 KQMLGRTPGNIAAIRPMKDGVIADFFVTEKMLQHFIKQVHSNSFMRPSPRVLVCVPVGAT 119

Query: 121 QVERRAIRESAMGAGAREVYLIEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVAIISLN 180
QVERRAIRESA GAGAREV+LIEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVA+ISLN
Sbjct: 120 QVERRAIRESAQGAGAREVFLIEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVAVISLN 179

Query: 181 GVVYSSSVRIGGDKFDDAIINYVRRNYGSLIGEATAERIKHTIGTAYPGDEVLEIEVRGR 240
GVVYSSSVRIGGD+FD+AIINYVRRNYGSLIGEATAERIKH IG+AYPGDEV EIEVRGR
Sbjct: 180 GVVYSSSVRIGGDRFDEAIINYVRRNYGSLIGEATAERIKHEIGSAYPGDEVREIEVRGR 239

Query: 241 NLAEGVPRSFILNSNEILEALQEPLSGIVSAVMVALEQSPPELASDISERGMVLTGGGAL 300
NLAEGVPR F LNSNEILEALQEPL+GIVSAVMVALEQ PPELASDISERGMVLTGGGAL
Sbjct: 240 NLAEGVPRGFTLNSNEILEALQEPLTGIVSAVMVALEQCPPELASDISERGMVLTGGGAL 299

Query: 301 LRDLDRLLMQETGIPVMVADDPLTCVARGGGKALEMIDMHGGDLFSEE 348
LR+LDRLLM+ETGIPV+VA+DPLTCVARGGGKALEMIDMHGGDLFSEE
Sbjct: 300 LRNLDRLLMEETGIPVVVAEDPLTCVARGGGKALEMIDMHGGDLFSEE 347


69  Sputcn32_0665  Sputcn32_0672  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_0665  3       22       4.030256   histidine kinase
Sputcn32_0666  2       24       3.460763   response regulator receiver protein
Sputcn32_0667  0       22       3.546765   hypothetical protein
Sputcn32_0668  -1      19       2.606712   hypothetical protein
Sputcn32_0669  -1      18       2.191910   hypothetical protein
Sputcn32_0670  0       13       0.152071   hypothetical protein
Sputcn32_0671  0       16       0.818330   sodium:dicarboxylate symporter
Sputcn32_0672  -2      13       0.470059   methyl-accepting chemotaxis sensory transducer
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0665  PF06580       30     0.031    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 29.8 bits (67), Expect = 0.031
Identities = 38/198 (19%), Positives = 72/198 (36%), Gaps = 33/198 (16%)

Query: 469 LQSVLTLIQQEVTRADSIISRLRNLLKK--RPVSKQPLYLHELVNETVPLLAYEFEQHQI 526
L ++ LI ++ T+A +++ L L++ R + + + L + + L Q +
Sbjct: 179 LNNIRALILEDPTKAREMLTSLSELMRYSLRYSNARQVSLADELTVVDSYLQLASIQFED 238

Query: 527 NLAVNVNGEPYLQSLDEVGMQQLLLN-LLKNAFDACVQRLELESSGTEQIITQKPYTPTI 585
L P ++ +V + +L+ L++N + I Q P I
Sbjct: 239 RLQFENQINP---AIMDVQVPPMLVQTLVENGI--------------KHGIAQLPQGGKI 281

Query: 586 DIDLRYQECTLLLTVTDNGTGLTEETSLLMQAFYSTKSEGLGLGLVICRDIAESHGGTFS 645
+ T+ L V + G+ + T +S G GL V R + +G
Sbjct: 282 LLKGTKDNGTVTLEVENTGSLALKNTK---------ESTGTGLQNVRER-LQMLYGTEAQ 331

Query: 646 L--ESAMGGGCQAQVAIP 661
+ G A V IP
Sbjct: 332 IKLSEKQGKV-NAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0666  HTHFIS        88     2e-22    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 87.6 bits (217), Expect = 2e-22
Identities = 31/150 (20%), Positives = 57/150 (38%), Gaps = 4/150 (2%)

Query: 7 VYLIDDDDSVRRSLRFMLESYGLKIIDFDSAEAFFTAVDLTLPGCALVDVRMPGLSGQQL 66
+ + DDD ++R L L G + +A + + + DV MP + L
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 67 HLELVAKNSPLAVIYLTGHGDVPMAVDALKLGAVDFFQKPADGAKLAEAVVKALEHT--- 123
+ L V+ ++ A+ A + GA D+ KP D +L + +AL
Sbjct: 66 LPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKRR 125

Query: 124 -KAHHQDNQYLETYQALTPREREILNLIAQ 152
D+Q + +EI ++A+
Sbjct: 126 PSKLEDDSQDGMPLVGRSAAMQEIYRVLAR 155


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0667  GPOSANCHOR    52     9e-09    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 52.4 bits (125), Expect = 9e-09
Identities = 52/316 (16%), Positives = 101/316 (31%), Gaps = 10/316 (3%)

Query: 598 EYAASEQELRIRLSKAEEALQSAQELQTEAESQLISINGELDNLSRELTFARTAYKNSRD 657
EL LS A+E L+ + +E S++ + +L + L A
Sbjct: 82 ALKDHNDELTEELSNAKEKLRKNDKSLSEKASKIQELEARKADLEKALEGAMNFSTADSA 141

Query: 658 DLRRLFDEKRSEQDKINKALSERKAHAGQRLTQLDGELKQLKHQHELWLEDQKEQALEAR 717
++ L EK + KA G K + E + ++ LE
Sbjct: 142 KIKTLEAEK---AALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKA 198

Query: 718 MEKNAYWQEVIGVLDNQLGQIKATIEGRRESAKIEQKACETWYKNELKSRGVDEDNILKL 777
+E + L KA + R+ + + + + E L
Sbjct: 199 LEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKIKTLEAEKAAL 258

Query: 778 KQQIRELETKISRAEQRRSDVLRFDDWY-----QHTWLIRKPKLQTQLSDVKR-AVSEID 831
+ + ELE + A + + Q+Q+ + R ++
Sbjct: 259 EARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDL 318

Query: 832 QQLKAKTLEVKTRRQKLDTELKASNAAQVEASENLTKLRAVMRKLAELKLPTNNEEAQGS 891
+ +++ QKL+ + K S A++ +L R ++L E + E+ + S
Sbjct: 319 DASREAKKQLEAEHQKLEEQNKISEASRQSLRRDLDASREAKKQL-EAEHQKLEEQNKIS 377

Query: 892 LGERLRQGEDLLLKRD 907
R DL R+
Sbjct: 378 EASRQSLRRDLDASRE 393



Score = 32.3 bits (73), Expect = 0.011
Identities = 51/346 (14%), Positives = 118/346 (34%), Gaps = 27/346 (7%)

Query: 360 WRTDVENLSERHKLQTEKHQDIEAAYNARRSKIGEQLNRELESLHSDQDKQREARDKQRE 419
+ + + K+ D+ A + N EL S+ ++ DK
Sbjct: 55 VQERADKFEIENNTLKLKNSDLSFNNKAL-----KDHNDELTEELSNAKEKLRKNDKSLS 109

Query: 420 VARADIDALEAQWRNQIDAGKASFSEQEYQFKLTAAELKLRVDGVTYTEEEKLSLAIFDE 479
+ I LEA+ + D KA + +A L + + ++
Sbjct: 110 EKASKIQELEAR---KADLEKALEGAMNFSTADSAKIKTLEAEKAALAARKADL----EK 162

Query: 480 RIHRADEEQESCNAKVERLASDERKLRSKRDQANEALRIASLRVNERQAELDELHHML-- 537
+ A + +AK++ L +++ L +++ + +AL A A++ L
Sbjct: 163 ALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAA 222

Query: 538 FPESHTLLEFLRKEAQGWEQSLGKVIAPELLHRTDLHPSVTGSGDTLFGVHLDLKAIDVP 597
LE + A + + I + L +L+ +
Sbjct: 223 LAARKADLEKALEGAMNFSTADSAKIKTLEAEKAALEARQA-----------ELEKA-LE 270

Query: 598 EYAASEQELRIRLSKAEEALQSAQELQTEAESQLISINGELDNLSRELTFARTAYKNSRD 657
++ E + + + + E Q +N +L R+L +R A K
Sbjct: 271 GAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQLEA 330

Query: 658 DLRRLFDEKRSEQDKINKALSERKAHAGQRLTQLDGELKQLKHQHE 703
+ ++L ++ + + ++L + + QL+ E ++L+ Q++
Sbjct: 331 EHQKLEEQNKI-SEASRQSLRRDLDASREAKKQLEAEHQKLEEQNK 375


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0672  CHANLCOLICIN  30     0.036    Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 29.7 bits (66), Expect = 0.036
Identities = 37/209 (17%), Positives = 76/209 (36%), Gaps = 17/209 (8%)

Query: 340 EDMARSATLAAKATRDADTEAKNGVTSVGQTITAIDALKVKLEQVSDVIGQLSKRGDEI- 398
E + A A KA ++A+ K +T + + E+ + + +K +
Sbjct: 137 EKARKEAEAAEKAFQEAEQRRKEIEREKAETERQLKLAE-AEEKRLAALSEEAKAVEIAQ 195

Query: 399 ----GAVTDVIGAIAEQTNLLALNAAIEAARAGEMGR------GFAVVADEVRTLASRSQ 448
A ++V+ E L + ++ AR EM A + + + L +
Sbjct: 196 KKLSAAQSEVVKMDGEIKTLNSRLSSSIHARDAEMKTLAGKRNELAQASAKYKELDELVK 255

Query: 449 ASTQDINRRIQGIQQDSANAVQSMAQSRTETEQTIVCSQQASEALTRINTAVSSITDVND 508
+ N +Q A + A E +Q +Q + + TRIN + IT +
Sbjct: 256 KLSPRANDPLQNRPFFEATRRRVGAGKIREEKQ-----KQVTASETRINRINADITQIQK 310

Query: 509 QLASATEQLAVVSGTINQNMENIAQAVEN 537
++ + +++ EN+ +A N
Sbjct: 311 AISQVSNNRNAGIARVHEAEENLKKAQNN 339


70  Sputcn32_0678  Sputcn32_0684  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_0678  0       13       -0.138860  two-component response regulator
Sputcn32_0679  -1      14       -1.524042  hypothetical protein
Sputcn32_0680  0       12       -1.252703  aspartate kinase III
Sputcn32_0681  0       12       -1.853828  Mg2+ transporter protein, CorA family protein
Sputcn32_0682  1       13       -2.064393  hypothetical protein
Sputcn32_0683  0       14       -1.944679  two component LuxR family transcriptional
Sputcn32_0684  1       15       -1.749177  nitrate/nitrite sensor protein NarQ
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0678  HTHFIS        83     2e-20    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 82.6 bits (204), Expect = 2e-20
Identities = 31/131 (23%), Positives = 61/131 (46%), Gaps = 1/131 (0%)

Query: 1 MQNPHILIVEDEAVTRNTLRSIFEAEGYVVTEANDGAEMHKAMQENKINLVVMDINLPGK 60
M IL+ +D+A R L GY V ++ A + + + +LVV D+ +P +
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 NGLLLARELREIN-NIGLIFLTGRDNEVDKILGLEIGADDYITKPFNPRELTIRARNLLT 119
N L +++ ++ ++ ++ ++ + I E GA DY+ KPF+ EL L
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 120 RVNSAGNEVEE 130
+++E+
Sbjct: 121 EPKRRPSKLED 131


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0680  CARBMTKINASE  31     0.010    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 30.9 bits (70), Expect = 0.010
Identities = 18/81 (22%), Positives = 27/81 (33%), Gaps = 5/81 (6%)

Query: 202 DYSAALLAEALRASAVEIWTDVAGIYTTDPRLAPNAHPIAEISFNEAAEMATFGAKVLHP 261
D + LAE + A I TDV G + E+ E + G
Sbjct: 216 DLAGEKLAEEVNADIFMILTDVNGAALY--YGTEKEQWLREVKVEELRKYYEEGH--FKA 271

Query: 262 ATILPAVRQQIQVFVGSSKEP 282
++ P V I+ F+ E
Sbjct: 272 GSMGPKVLAAIR-FIEWGGER 291


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0683  HTHFIS        62     1e-13    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 62.2 bits (151), Expect = 1e-13
Identities = 22/137 (16%), Positives = 55/137 (40%), Gaps = 2/137 (1%)

Query: 6 SILVVDDHPLLRKGICQLITSDPDFSLFGEAGGGLDALSAVATDEPDIILLDLNMKGMTG 65
+ILV DD +R + Q ++ + + +A + D+++ D+ M
Sbjct: 5 TILVADDDAAIRTVLNQALS-RAGYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDENA 62

Query: 66 LDTLNAMRQEGVTSRIVILTVSDAKQDVIRLLRAGADGYLLKDTEPDLLLEKLKNAMLGH 125
D L +++ +++++ + I+ GA YL K + L+ + A+
Sbjct: 63 FDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEP 122

Query: 126 RVISEEIEEYLYELKNV 142
+ ++E+ + +
Sbjct: 123 KRRPSKLEDDSQDGMPL 139


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0684  PF06580       42     5e-06    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 41.8 bits (98), Expect = 5e-06
Identities = 25/149 (16%), Positives = 58/149 (38%), Gaps = 21/149 (14%)

Query: 405 NQLNEINEGVSTAYAQLRELL----STFRLTIKEPNLKS-AMEAMLDQLRAKTDI----- 454
N LN I + + RE+L R +++ N + ++ L + + +
Sbjct: 177 NALNNIRALILEDPTKAREMLTSLSELMRYSLRYSNARQVSLADELTVVDSYLQLASIQF 236

Query: 455 --KISLDYKLSPQWLEAKQHIHILQITREATLNAIKH-----SKATLIHIRCYKDDNAMV 507
++ + +++P ++ + ++Q E N IKH + I ++ KD+ +
Sbjct: 237 EDRLQFENQINPAIMDVQVPPMLVQTLVE---NGIKHGIAQLPQGGKILLKGTKDNGTVT 293

Query: 508 NISVCDNGVGIEHLKERDQHFGIGIMHER 536
+ V + G + G+ + ER
Sbjct: 294 -LEVENTGSLALKNTKESTGTGLQNVRER 321


71  Sputcn32_0844  Sputcn32_0851  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_0844  0       14       1.089058   two component transcriptional regulator
Sputcn32_0845  0       11       1.103410   integral membrane sensor signal transduction
Sputcn32_0846  2       14       1.341122   hypothetical protein
Sputcn32_0847  2       14       1.194421   MltA-interacting MipA family protein
Sputcn32_0848  2       15       1.277617   aldo/keto reductase
Sputcn32_0849  0       17       0.962175   Na(+)-translocating NADH-quinone reductase
Sputcn32_0850  0       18       0.659674   Na(+)-translocating NADH-quinone reductase
Sputcn32_0851  -2      15       0.244931   Na(+)-translocating NADH-quinone reductase
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0844  HTHFIS        83     2e-20    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 83.0 bits (205), Expect = 2e-20
Identities = 41/131 (31%), Positives = 60/131 (45%), Gaps = 2/131 (1%)

Query: 1 MSVAKHILVVEDDTSLAEWISDYLLDHGYEVTVASQGDFALEMIADETPDLVLLDVMMPV 60
M+ A ILV +DD ++ ++ L GY+V + S IA DLV+ DV+MP
Sbjct: 1 MTGAT-ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPD 59

Query: 61 KNGFDVCKEARAFYAG-PILFMTACVEDGDEIRGLDAGADDYLTKPIRPQVLLARIKALL 119
+N FD+ + P+L M+A I+ + GA DYL KP L+ I L
Sbjct: 60 ENAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRAL 119

Query: 120 RRVGDEEQKLQ 130
KL+
Sbjct: 120 AEPKRRPSKLE 130


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0847  IGASERPTASE   28     0.045    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 28.1 bits (62), Expect = 0.045
Identities = 26/115 (22%), Positives = 43/115 (37%), Gaps = 6/115 (5%)

Query: 127 IHLGNGTLSSKFQHDVTGVYN--GFQADITYYHPINVGFGDLVPYAGVHYFSKDFANYYT 184
I LG G SK Q + + Q +T N+G + P GV Y A++
Sbjct: 1377 IDLGYGKFQSKLQTNHNAKFARHTAQFGLTAGKAFNLGNFGITPIVGVRYSYLSNADFAL 1436

Query: 185 G---VTSSEATAFRPAYQADGTFAYKLGYALVIPV-TKHLDITQNTGYSHIGANM 235
+ + + Q D ++ Y LG V P+ + D Q +G ++
Sbjct: 1437 DQARIKVNPISVKTAFAQVDLSYTYHLGEFSVTPILSARYDANQGSGKINVNGYD 1491


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0848  HELNAPAPROT   29     0.015    Helicobacter neutrophil-activating protein A family ...
		>HELNAPAPROT#Helicobacter neutrophil-activating protein A family signature.

Length = 153

Score = 29.1 bits (65), Expect = 0.015
Identities = 19/88 (21%), Positives = 35/88 (39%), Gaps = 16/88 (18%)

Query: 109 IHQAVDASLARLQIDTIDLYQLHWPDRNTNFFG--ELFYDEQDHEHQTPILETLEALAEV 166
+ +++ L+ + L++ HW + +FF E F E ET++ +AE
Sbjct: 13 VENSLNTQLSNWFLLYSKLHRFHWYVKGPHFFTLHEKF-----EELYDHAAETVDTIAER 67

Query: 167 IRQGKVRYIGVSNETPWGLMK-YLQLAE 193
+ IG P +K Y + A
Sbjct: 68 LLA-----IGGQ---PVATVKEYTEHAS 87


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_0851  SECA          28     0.043    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 28.3 bits (63), Expect = 0.043
Identities = 19/59 (32%), Positives = 28/59 (47%), Gaps = 15/59 (25%)

Query: 210 VKEGDVHGVDAVSGATMTGR----GVQRAMEFWFGVE-----------GFQTFFNQLKT 253
VK+G+V VD +G TM GR G+ +A+E GV+ FQ +F +
Sbjct: 328 VKDGEVIIVDEHTGRTMQGRRWSDGLHQAVEAKEGVQIQNENQTLASITFQNYFRLYEK 386


72  Sputcn32_1063  Sputcn32_1070  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_1063  1       30       -5.684045  type IV pilus modification protein PilV
Sputcn32_1064  2       31       -6.223236  prepilin-type cleavage/methylation-like protein
Sputcn32_1065  2       30       -6.417906  type IV pilus assembly protein PilX
Sputcn32_1066  2       29       -6.647229  type IV pilin biogenesis protein
Sputcn32_1067  1       24       -5.517864  methylation site containing protein
Sputcn32_1068  1       25       -5.879253  hypothetical protein
Sputcn32_1069  0       27       -5.709551  methylation site containing protein
Sputcn32_1070  -1      16       -2.805877  methylation site containing protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1063  BCTERIALGSPG  30     0.002    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 30.2 bits (68), Expect = 0.002
Identities = 9/24 (37%), Positives = 18/24 (75%), Gaps = 2/24 (8%)

Query: 13 QQGFSLIEVLVALVIL--VIGLIG 34
Q+GF+L+E++V +VI+ + L+
Sbjct: 7 QRGFTLLEIMVVIVIIGVLASLVV 30


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1064  PF05307       28     0.049    Bundlin
		>PF05307#Bundlin

Length = 193

Score = 27.8 bits (61), Expect = 0.049
Identities = 29/101 (28%), Positives = 46/101 (45%), Gaps = 11/101 (10%)

Query: 13 YQTGLSLVELMVAMVIGLFLTAGVFTMFSMSSSNVTTTSQFNQLQENGRIALAILERDLS 72
Y+ GLSL+E + + + +TAGV MF S++ + SQ N + E AI +
Sbjct: 10 YEKGLSLIESAMVLALAATVTAGV--MFYYQSASDSNKSQ-NAISEVMSATSAINGLYIG 66

Query: 73 QLGFMGDMTGTDFVLGSNTQVNIAAVANDCVGDGLNNATLP 113
Q + G L SN +N +A+ ++ N T P
Sbjct: 67 QTSYTG--------LNSNILLNTSAIPDNYKDTKNNKITNP 99


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1067  BCTERIALGSPG  58     6e-14    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 58.0 bits (140), Expect = 6e-14
Identities = 24/70 (34%), Positives = 43/70 (61%), Gaps = 2/70 (2%)

Query: 5 RGFTLIELMITVAIVGILAAIAYPSYIEYVTKSGRSEGVAAVMRVANLQEQYYLDNKAYA 64
RGFTL+E+M+ + I+G+LA++ P+ + K+ + + V+ ++ + N + Y LDN Y
Sbjct: 8 RGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYKLDNHHYP 67

Query: 65 TDMTKLGLSA 74
T T GL +
Sbjct: 68 T--TNQGLES 75


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1069  BCTERIALGSPG  36     2e-05    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 35.6 bits (82), Expect = 2e-05
Identities = 12/28 (42%), Positives = 19/28 (67%)

Query: 6 TGFTLVELMVTIAVAAILLSIGAPSLIS 33
GFTL+E+MV I + +L S+ P+L+
Sbjct: 8 RGFTLLEIMVVIVIIGVLASLVVPNLMG 35


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1070  BCTERIALGSPG  29     0.003    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.

Length = 145

Score = 29.5 bits (66), Expect = 0.003
Identities = 10/27 (37%), Positives = 20/27 (74%)

Query: 5 QKGFSLIELITTLSISTILLTVGVPSL 31
Q+GF+L+E++ + I +L ++ VP+L
Sbjct: 7 QRGFTLLEIMVVIVIIGVLASLVVPNL 33


73  Sputcn32_1294  Sputcn32_1302  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_1294  0       13       -1.586495  recombination associated protein
Sputcn32_1295  0       14       -1.754289  porin
Sputcn32_1296  -1      14       -1.791001  two component transcriptional regulator
Sputcn32_1297  -1      16       -2.413296  PAS/PAC sensor signal transduction histidine
Sputcn32_1298  0       17       -2.782806  phosphate binding protein
Sputcn32_1299  0       19       -2.939755  peptidase M1, membrane alanine aminopeptidase
Sputcn32_1300  -2      13       -0.798362  glutathione peroxidase
Sputcn32_1301  -1      11       0.092769   magnesium transporter
Sputcn32_1302  -1      14       0.870138   N-acetyltransferase GCN5
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1294  SECA          31     0.008    SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 31.0 bits (70), Expect = 0.008
Identities = 11/41 (26%), Positives = 24/41 (58%), Gaps = 1/41 (2%)

Query: 81 EALEEKVALIEDEENRKLAKKEKDALKD-EIITSLLPRAFS 120
A+E ++ + DEE + + + L+ E++ +L+P AF+
Sbjct: 29 NAMEPEMEKLSDEELKGKTAEFRARLEKGEVLENLIPEAFA 69


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1295  ECOLNEIPORIN  88     8e-22    E.coli/Neisseria porin superfamily signature.
		>ECOLNEIPORIN#E.coli/Neisseria porin superfamily signature.

Length = 331

Score = 87.5 bits (217), Expect = 8e-22
Identities = 75/335 (22%), Positives = 127/335 (37%), Gaps = 31/335 (9%)

Query: 7 KTLLASALASTTLASAYAAEPLTVYGKLNV---TAQSNDEQGDAT------TTIQSNASR 57
K+L+A LA+ +A A +T+YG + T++S G T I S+
Sbjct: 3 KSLIALTLAALPVA---AMADVTLYGTIKAGVETSRSVAHNGAQAASVETGTGIVDLGSK 59

Query: 58 FGVKGDFELSSSLQAFYTIEYEVDTGAETSNNFTARNQFVGLKGSFGSFSVGRNDTLLKT 117
G KG +L + L+A + +E + A T + + R F+GLKG FG VGR +++LK
Sbjct: 60 IGFKGQEDLGNGLKAIWQVEQKASI-AGTDSGWGNRQSFIGLKGGFGKLRVGRLNSVLK- 117

Query: 118 SQGGVDQFNDLYETGDIKVLFKGENRLSQTATYLTPSFGGFVFGATYVAEGDANQQGQDG 177
G ++ ++ + + + + E RL Y +P F G Y +A + +
Sbjct: 118 DTGDINPWDSKSDYLGVNKIAEPEARLIS-VRYDSPEFAGLSGSVQYALNDNAGRHNSES 176

Query: 178 FSLAAMYGDAKLKESAFYAAIAYDSDVKGYEIIRATVQGKIAGFTLGG----MYQQQEET 233
+ Y + A + + I + + ++G+ + QQ++
Sbjct: 177 YHAGFNYKNGGFFVQYGGAYKRHHQVQENVNIEKYQIHRLVSGYDNDALYASVAVQQQDA 236

Query: 234 YKDELPVTTDSVNGYLFSAAYDIKAVT-----LKAQFQDMEDKGDS-----WSVGADYSL 283
E + +S + AY VT + + VGA+Y
Sbjct: 237 KLVEENYSHNSQTEVAATLAYRFGNVTPRVSYAHGFKGSFDATNYNNDYDQVVVGAEYDF 296

Query: 284 GKPTKLLAFYT--NRSFEASSDDDKYIGVGLEHKF 316
K T L S GVGL HKF
Sbjct: 297 SKRTSALVSAGWLQEGKGESKFVSTAGGVGLRHKF 331


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1296  HTHFIS        91     2e-23    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 91.4 bits (227), Expect = 2e-23
Identities = 34/130 (26%), Positives = 64/130 (49%), Gaps = 4/130 (3%)

Query: 3 ARILIVEDELAIREMLTFVMEQHGFTTSAAEDFDSAIALLKEPYPDLILLDWMFPGGSGI 62
A IL+ +D+ AIR +L + + G+ + + + DL++ D + P +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 63 QLAKRLKQDEFTRQIPIIMLTARGEEEDKVKGLEVGADDYITKPFSPKELVARIKAVL-- 120
L R+K+ +P+++++A+ +K E GA DY+ KPF EL+ I L
Sbjct: 64 DLLPRIKKAR--PDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121

Query: 121 RRSAPTRLEE 130
+ P++LE+
Sbjct: 122 PKRRPSKLED 131


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1297  PF06580       33     0.002    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 32.9 bits (75), Expect = 0.002
Identities = 19/105 (18%), Positives = 34/105 (32%), Gaps = 26/105 (24%)

Query: 327 LISNAIRY----TEPGGKITVQWRSVATGGLFSVTDTGEGIAPQHIARLTERFYRVDSAR 382
L+ N I++ GGKI ++ V +TG
Sbjct: 263 LVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKN---------------- 306

Query: 383 SRQTGGSGLGLAIVKHALNHHHSE---LTITSEVGKGSTFSFVIP 424
+G GL V+ L + + ++ + GK + +IP
Sbjct: 307 --TKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAM-VLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1302  SACTRNSFRASE  31     0.001    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 31.5 bits (71), Expect = 0.001
Identities = 29/135 (21%), Positives = 48/135 (35%), Gaps = 7/135 (5%)

Query: 19 ELMLAASLIEQRDDNAHPLEHSVFFRSRAVVLAKTPQGNIVGCAAIKAGEGKIGEFGYLV 78
E + +Q +D+ + + V +A L N +G I++ +
Sbjct: 39 EERFSKPYFKQYEDDDMDVSY-VEEEGKAAFLYYLEN-NCIGRIKIRSNWNGYALIEDIA 96

Query: 79 VSPLYRRQGIAQGLTQKRIEVAKSLGIAILFATIRAENISSRANLLKAGFKFWR-DYLSI 137
V+ YR++G+ L K IE AK L + NIS+ K F D +
Sbjct: 97 VAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHFIIGAVDTMLY 156

Query: 138 RGTGNT----VGWYY 148
+ WYY
Sbjct: 157 SNFPTANEIAIFWYY 171


74  Sputcn32_1419  Sputcn32_1422  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_1419  2       22       4.256496   TetR family transcriptional regulator
Sputcn32_1420  2       23       3.776831   secretion protein HlyD family protein
Sputcn32_1421  1       20       2.602848   ABC transporter-like protein
Sputcn32_1422  1       17       1.462219   ABC transporter
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1419  HTHTETR       77     2e-19    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 77.0 bits (189), Expect = 2e-19
Identities = 29/163 (17%), Positives = 60/163 (36%), Gaps = 6/163 (3%)

Query: 31 SSDARQRLIIAALSLFSHRSYPTVSTREIAREAGVDAALIRYYFGSKAGLFEQMVRETLE 90
+ + RQ ++ AL LFS + + S EIA+ AGV I ++F K+ LF ++ +
Sbjct: 9 AQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSES 68

Query: 91 PVLARLREISAAQAPNN---MSELMQTYYRVMAPNPGLPRLIMRVLQEGDGTEPYHIMLS 147
+ E A + + E++ L+ + + + ++
Sbjct: 69 NIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQQ 128

Query: 148 VFDHILSLSRQWLESTL---VNAGYLKEGIDPDLVRLSFVSLM 187
++ S +E TL + A L + + +
Sbjct: 129 AQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYI 171


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1420  RTXTOXIND     57     3e-11    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 57.1 bits (138), Expect = 3e-11
Identities = 44/318 (13%), Positives = 98/318 (30%), Gaps = 77/318 (24%)

Query: 33 TVERDRLTLTAPVGELITQINVVEGQRVKAGEVLIQLDATSANA---------------- 76
T + ++ +I V EG+ V+ G+VL++L A A A
Sbjct: 91 THSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQ 150

Query: 77 ---RLALRQAELDQAKAKLSEAVTGARLE----------------------------DID 105
++ R EL++ + ++D
Sbjct: 151 TRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLD 210

Query: 106 RAKAVLDGANATVKEAQRAFERTN-------RLFATKVLS--------------QADLDT 144
+ +A A + + L + ++ +L
Sbjct: 211 KKRAERLTVLARINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRV 270

Query: 145 ARAARDTSLAKQAEAQQSLRLLENGTRSE---QLAQAKAAVAAASASVAVEQKALADLSL 201
++ + ++ A++ +L+ ++E +L Q + + +A ++ +
Sbjct: 271 YKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVI 330

Query: 202 VAARDAVVDTLP-WREGDRIAAGTQLIGLLASDNPY-VRVYLPATWLDRVKVGDSVNILV 259
A V L EG + L+ ++ D+ V + + + VG + I V
Sbjct: 331 RAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKV 390

Query: 260 DG----REAPIAGTVRNI 273
+ R + G V+NI
Sbjct: 391 EAFPYTRYGYLVGKVKNI 408


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1421  adhesinb      31     0.007    Adhesin B signature.
		>adhesinb#Adhesin B signature.

Length = 310

Score = 30.6 bits (69), Expect = 0.007
Identities = 15/87 (17%), Positives = 28/87 (32%), Gaps = 12/87 (13%)

Query: 220 SPQQLMAAMGARVIEVSGDDLRT---------LKQSLMSESA---VLSAAQIGSRLRVLV 267
P+ + A +I +G +L T ++ + E+ +S L
Sbjct: 73 LPEDVKKTSQADLIFYNGINLETGGNAWFTKLVENAKKKENKDYYAVSEGVDVIYLEGQS 132

Query: 268 RSDIADPLAWLKPKIANRAMEEVRASL 294
DP AWL + + + L
Sbjct: 133 EKGKEDPHAWLNLENGIIYAQNIAKRL 159


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1422  ABC2TRNSPORT  39     2e-05    ABC-2 type transport system membrane protein signat...
		>ABC2TRNSPORT#ABC-2 type transport system membrane protein signature.

Length = 262

Score = 38.8 bits (90), Expect = 2e-05
Identities = 42/160 (26%), Positives = 70/160 (43%), Gaps = 11/160 (6%)

Query: 186 GVILTMTMVMFT----SAAIVREREQGNMEFLITTPVRPLELMLGKIVPYVLVGFVQVTI 241
G++ T M T AA R Q E ++ T +R +++LG++ +
Sbjct: 72 GMVATSAMTAATFETIYAAFGRMEGQRTWEAMLYTQLRLGDIVLGEMAWAATKAALAGAG 131

Query: 242 ILSAGHLLFDVP---IRGGLDSIALAAMLFICASLTLGLVISTMAKTQLQSMQMTVFVLL 298
I L + L IAL + F +LG+V++ +A + + V+
Sbjct: 132 IGVVAAALGYTQWLSLLYALPVIALTGLAFA----SLGMVVTALAPSYDYFIFYQTLVIT 187

Query: 299 PSILLSGFMFPFDAMPIAAQWIAEALPATHFMRMSRAIVL 338
P + LSG +FP D +PI Q A LP +H + + R I+L
Sbjct: 188 PILFLSGAVFPVDQLPIVFQTAARFLPLSHSIDLIRPIML 227


75  Sputcn32_1491  Sputcn32_1497  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_1491  1       15       -0.319549  ATP-dependent protease ATP-binding subunit ClpX
Sputcn32_1492  0       14       -0.482490  ATP-dependent protease La
Sputcn32_1493  0       15       -0.789987  histone family protein DNA-binding protein
Sputcn32_1494  -1      15       -0.979332  PpiC-type peptidyl-prolyl cis-trans isomerase
Sputcn32_1495  -3      15       -1.080374  trans-2-enoyl-CoA reductase
Sputcn32_1496  -2      14       -0.493652  ABC transporter-like protein
Sputcn32_1497  -1      15       -0.419349  oligopeptide/dipeptide ABC transporter ATPase
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1491  HTHFIS        31     0.009    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 31.0 bits (70), Expect = 0.009
Identities = 14/70 (20%), Positives = 28/70 (40%), Gaps = 13/70 (18%)

Query: 64 KLPTPHELRAHLDDYVIGQDRAKKVLSVAVYNHYKRLKNSSPKDGVELGKSNILLIGPTG 123
+ P+ E + ++G+ A +Y RL + +++ G +G
Sbjct: 124 RRPSKLEDDSQDGMPLVGRSAA----MQEIYRVLARLMQT---------DLTLMITGESG 170

Query: 124 SGKTLLAETL 133
+GK L+A L
Sbjct: 171 TGKELVARAL 180


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1492  HTHFIS        35     0.001    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 35.2 bits (81), Expect = 0.001
Identities = 45/211 (21%), Positives = 76/211 (36%), Gaps = 37/211 (17%)

Query: 262 NMPSEAKEKALAELNKLRMMSP---MSAEATV---VRSY----VDWMTSVPWSQRSKIKR 311
MP E L + K R P MSA+ T +++ D++ P+ I
Sbjct: 56 VMPDENAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPK-PFDLTELIGI 114

Query: 312 D---------LAKAQEVLDTDHYGLEKVKERILEYLAVQSRVRQLKGPILCLVGPPGVGK 362
E D L + E V +R+ Q ++ + G G GK
Sbjct: 115 IGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLM-ITGESGTGK 173

Query: 363 TSLGQSIAKATGRK---YVRVALGGVRD---EAEIRGHRRTYIGSMPGKVIQKMAKVGVK 416
+ +++ R+ +V + + + E+E+ GH + G+ G + +
Sbjct: 174 ELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFGHEK---GAFTGAQTRSTGRFEQA 230

Query: 417 N--PLFLLDEIDKMSSDMRGDPASALLEVLD 445
LFL DEI M D + + LL VL
Sbjct: 231 EGGTLFL-DEIGDMPMDAQ----TRLLRVLQ 256


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1493  DNABINDINGHU  119    4e-39    Prokaryotic integration host factor signature.
		>DNABINDINGHU#Prokaryotic integration host factor signature.

Length = 91

Score = 119 bits (300), Expect = 4e-39
Identities = 53/88 (60%), Positives = 69/88 (78%)

Query: 2 NKSELIEKIASGADISKAAAGRALDSFIAAVTEGLKEGDKISLVGFGTFEVRERAERTGR 61
NK +LI K+A +++K + A+D+ +AV+ L +G+K+ L+GFG FEVRERA R GR
Sbjct: 3 NKQDLIAKVAEATELTKKDSAAAVDAVFSAVSSYLAKGEKVQLIGFGNFEVRERAARKGR 62

Query: 62 NPQTGEEIKIAAAKIPAFKAGKALKDAV 89
NPQTGEEIKI A+K+PAFKAGKALKDAV
Sbjct: 63 NPQTGEEIKIKASKVPAFKAGKALKDAV 90


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1496  HTHFIS        29     0.022    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 29.0 bits (65), Expect = 0.022
Identities = 13/28 (46%), Positives = 16/28 (57%)

Query: 42 TLAIVGEAGSGKSTLARILVGAEPRSGG 69
TL I GE+G+GK +AR L R G
Sbjct: 162 TLMITGESGTGKELVARALHDYGKRRNG 189


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1497  HTHFIS        31     0.011    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 30.6 bits (69), Expect = 0.011
Identities = 9/16 (56%), Positives = 14/16 (87%)

Query: 38 LVGESGSGRSLLARAI 53
+ GESG+G+ L+ARA+
Sbjct: 165 ITGESGTGKELVARAL 180


76  Sputcn32_1704  Sputcn32_1709  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_1704  -1      12       1.400750   RND family efflux transporter MFP subunit
Sputcn32_1705  0       14       2.098883   acriflavin resistance protein
Sputcn32_1706  0       18       1.834369   N-acetyltransferase GCN5
Sputcn32_1707  1       18       2.335237   glucose sorbosone dehydrogenase
Sputcn32_1708  0       17       2.605605   hypothetical protein
Sputcn32_1709  1       20       2.877942   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1704  RTXTOXIND     53     1e-09    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 52.5 bits (126), Expect = 1e-09
Identities = 34/202 (16%), Positives = 71/202 (35%), Gaps = 21/202 (10%)

Query: 91 VQLQNAEQIAKVKAAQVKVTDNKRELNRISSLVTSRTVAELERDRLQTLIDTTRAELEQA 150
+ ++++ + ++ K E ++ L + + +L + I EL +
Sbjct: 264 AVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDN--IGLLTLELAKN 321

Query: 151 QVSLNDRRILAPFDGR-LGLRQVSVGSLVTPGT---EITTLDDISKIKLDFSVPERFIQE 206
+ I AP + L+ + G +VT I DD +++ V + I
Sbjct: 322 EERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDD--TLEVTALVQNKDIGF 379

Query: 207 LRPGKLVEATAIAFPDETF---NGIVTSI------DSRVNPTTRAVI--VRAEIP--NPD 253
+ G+ AFP + G V +I D R+ +I + N +
Sbjct: 380 INVGQNAIIKVEAFPYTRYGYLVGKVKNINLDAIEDQRLGLVFNVIISIEENCLSTGNKN 439

Query: 254 LRLLPGMLMKVKLIKRSRDALM 275
+ L GM + ++ R +
Sbjct: 440 IPLSSGMAVTAEIKTGMRSVIS 461



Score = 32.5 bits (74), Expect = 0.002
Identities = 19/90 (21%), Positives = 34/90 (37%), Gaps = 5/90 (5%)

Query: 62 SVTITPKVTDMVMSLNFDDGDIVKRGDLLVQLQNAEQIAKVKAAQVKVTDNKRELNRISS 121
S I P +V + +G+ V++GD+L++L A Q + + E R
Sbjct: 96 SKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQI 155

Query: 122 LVTSRTVAELERDRLQTLIDTTRAELEQAQ 151
L S +E ++L L +
Sbjct: 156 LSRS-----IELNKLPELKLPDEPYFQNVS 180


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1705  ACRIFLAVINRP  841    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 841 bits (2174), Expect = 0.0
Identities = 321/1035 (31%), Positives = 551/1035 (53%), Gaps = 30/1035 (2%)

Query: 3 LTDLSVKRPVFASVISLLVVAFGLVSFDKLPLREYPNIDPPIVSIETNYRGASAAVVESR 62
+ + ++RP+FA V++++++ G ++ +LP+ +YP I PP VS+ NY GA A V+
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 63 ITQLIEDRISGVEGIRHVSSSS-SDGRSSVTLEFDISRNIEDAANDVRDRISGLLDNLPE 121
+TQ+IE ++G++ + ++SS+S S G ++TL F + + A V++++ LP+
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 122 EADPPEVQKANGGDEVIMWLNLVSD--NMTTLELTDYTNRYLSDRLSVVDGVARIRIGGG 179
E + +M VSD T +++DY + D LS ++GV +++ G
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 180 KVYAMRIWLDRQALASRSLTVADVEAALRAENVELPAGSI------ESKERHFTVRLERS 233
+ YAMRIWLD L LT DV L+ +N ++ AG + ++ + ++ +
Sbjct: 181 Q-YAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 234 YRTAEDFANLVLTQGNDGYLVKLGDVAKVEIGSEEERIMFRGNKEAMIGLGVSKQSTANT 293
++ E+F + L +DG +V+L DVA+VE+G E ++ R N + GLG+ + AN
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANA 299

Query: 294 LEVARAVNALIDKINPTLPAGMSIKRSYDSSVFIEASIKEVYQTLFIAMILVIIVIYLFL 353
L+ A+A+ A + ++ P P GM + YD++ F++ SI EV +TLF A++LV +V+YLFL
Sbjct: 300 LDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFL 359

Query: 354 GSARAMLIPAITVPVSLLGTFIVLYALGYTINLLTLLAMILAIGMVVDDAIVMLENIHRR 413
+ RA LIP I VPV LLGTF +L A GY+IN LT+ M+LAIG++VDDAIV++EN+ R
Sbjct: 360 QNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERV 419

Query: 414 I-EEGDSPLKAAYLGAREVAFAVIATTLVLVSVFMPITFLEGDLGKLFKEFAVAMSAAVI 472
+ E+ P +A ++ A++ +VL +VF+P+ F G G ++++F++ + +A+
Sbjct: 420 MMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMA 479

Query: 473 FSSIVALTLSPMMCSKLLKPATQD-----SWLVRKVDGIMASIARGYQTSLQKAMAKPLL 527
S +VAL L+P +C+ LLKP + + + Y S+ K +
Sbjct: 480 LSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 528 MSVLVLCALGSSFFLVQKVPQEFAPQEDRGSLFLMVNGPQGASYEYIESYMTEVENRLMP 587
++ + L ++P F P+ED+G M+ P GA+ E + + +V + +
Sbjct: 540 YLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLK 599

Query: 588 LVDAGDIKRLLIRAPRGFGRAADFSNGMAIVVLEDWGQRRPIKE----VIVDINKRLADL 643
+ +++ + F A + GMA V L+ W +R + VI L +
Sbjct: 600 -NEKANVESVFTVNGFSFSGQAQ-NAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKI 657

Query: 644 --AGVQAFPVMRQA-FGRGVGKPVQFV-IGGPSYEELARWRDIMMEKAAENP-MLLGLDH 698
V F + G G + + G ++ L + R+ ++ AA++P L+ +
Sbjct: 658 RDGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRP 717

Query: 699 DYKETKPQLRVVIDRDRAASLGVSIANIGRTLESMLGSRLVTTFMRDGEEYDVIVEGERN 758
+ E Q ++ +D+++A +LGVS+++I +T+ + LG V F+ G + V+ +
Sbjct: 718 NGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAK 777

Query: 759 NQNTAADLQNIYVRSERSKELIPLSNLVTVEEFADASSLNRYNRMRAITIEASLADGYSL 818
+ D+ +YVRS + E++P S T + L RYN + ++ I+ A G S
Sbjct: 778 FRMLPEDVDKLYVRS-ANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSS 836

Query: 819 GAALDDLNQMARAYLPAEAVISYKGQSLDYQESGNSMYFVFLLALGIVFLVLAAQFESYI 878
G A+ + +A LPA + G S + SGN + ++ +VFL LAA +ES+
Sbjct: 837 GDAMALMENLASK-LPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWS 895

Query: 879 HPIVIMLTVPLATVGALAGLWFTGQSLNIYSQIGIIMLVGLAAKNGILIVEFANQLRDK- 937
P+ +ML VPL VG L Q ++Y +G++ +GL+AKN ILIVEFA L +K
Sbjct: 896 IPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKE 955

Query: 938 GVDFDRAIIQAASQRLRPILMTGITTAAGAIPLVMAVGAGAETRFVIGVVVLSGILLATL 997
G A + A RLRPILMT + G +PL ++ GAG+ + +G+ V+ G++ ATL
Sbjct: 956 GKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATL 1015

Query: 998 FTIFVIPTAYGFFAR 1012
IF +P + R
Sbjct: 1016 LAIFFVPVFFVVIRR 1030



Score = 91.1 bits (226), Expect = 1e-20
Identities = 51/324 (15%), Positives = 126/324 (38%), Gaps = 13/324 (4%)

Query: 706 QLRVVIDRDRAASLGVSIANIGRTLES----MLGSRLVTTFMRDGEEYDVIVEGERNNQN 761
+R+ +D D ++ ++ L+ + +L T G++ + + + +
Sbjct: 183 AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFK- 241

Query: 762 TAADLQNIYVRSERSKELIPLSNLVTVEE-FADASSLNRYNRMRAITIEASLADGYSLGA 820
+ + +R ++ L ++ VE + + + R N A + LA G +
Sbjct: 242 NPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANALD 301

Query: 821 ALDDLNQMA---RAYLPA--EAVISYKGQSLDYQESGNSMYFVFLLALGIVFLVLAAQFE 875
+ + + P + + Y + Q S + + A+ +VFLV+ +
Sbjct: 302 TAKAIKAKLAELQPFFPQGMKVLYPYD-TTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQ 360

Query: 876 SYIHPIVIMLTVPLATVGALAGLWFTGQSLNIYSQIGIIMLVGLAAKNGILIVE-FANQL 934
+ ++ + VP+ +G A L G S+N + G+++ +GL + I++VE +
Sbjct: 361 NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420

Query: 935 RDKGVDFDRAIIQAASQRLRPILMTGITTAAGAIPLVMAVGAGAETRFVIGVVVLSGILL 994
+ + A ++ SQ ++ + +A IP+ G+ + ++S + L
Sbjct: 421 MEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL 480

Query: 995 ATLFTIFVIPTAYGFFARNSGSPE 1018
+ L + + P + +
Sbjct: 481 SVLVALILTPALCATLLKPVSAEH 504


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1706  SACTRNSFRASE  30     0.004    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 29.5 bits (66), Expect = 0.004
Identities = 23/112 (20%), Positives = 35/112 (31%), Gaps = 6/112 (5%)

Query: 20 LLLQLGYSSTQEQLQMYLEKSERTDE-IYIAEEKGNIIGLISLLFFDYFPAQQQICRITA 78
Y E M + E + ++ + N IG I + I
Sbjct: 40 ERFSKPYFKQYEDDDMDVSYVEEEGKAAFLYYLENNCIGRIKIR-----SNWNGYALIED 94

Query: 79 LIVTQACRGLGVGTQLINFAKARANEQGCHQLEVTTSMRREKTQAYYEAIGF 130
+ V + R GVGT L++ A A E L + T +Y F
Sbjct: 95 IAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHF 146


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1709  ARGREPRESSOR  28     0.006    Bacterial arginine repressor signature.
		>ARGREPRESSOR#Bacterial arginine repressor signature.

Length = 149

Score = 27.9 bits (62), Expect = 0.006
Identities = 9/30 (30%), Positives = 17/30 (56%)

Query: 7 LLEAIKAEHQITTQNELVALLSQNELLIQQ 36
+ I ++I TQ+ELV +L ++ + Q
Sbjct: 9 KIREIITANEIETQDELVDILKKDGYNVTQ 38


77  Sputcn32_1727  Sputcn32_1738  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_1727  0       14       -1.543475  short-chain dehydrogenase/reductase SDR
Sputcn32_1728  -1      11       -2.037561  acyl-CoA thioesterase II
Sputcn32_1729  -2      13       -1.424897  hypothetical protein
Sputcn32_1730  -2      13       -1.141868  LysR family transcriptional regulator
Sputcn32_1731  -1      13       -0.250510  major facilitator superfamily transporter
Sputcn32_1732  -1      14       0.586436   Ppx/GppA phosphatase
Sputcn32_1733  1       15       1.052536   polyphosphate kinase
Sputcn32_1734  -2      13       1.904253   putative chaperone
Sputcn32_1735  -2      11       1.704921   CreA family protein
Sputcn32_1736  -2      13       1.636885   cystathionine beta-lyase
Sputcn32_1737  -2      12       0.976175   integral membrane sensor signal transduction
Sputcn32_1738  -2      11       -0.229448  two component transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1727  DHBDHDRGNASE  83     2e-20    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 82.8 bits (204), Expect = 2e-20
Identities = 53/204 (25%), Positives = 84/204 (41%), Gaps = 19/204 (9%)

Query: 6 KVALITGAGSGLGRAYAIMLAERGAKVVLIDQPTVHCAEVSTDAQSVTHSINDNLNQTYD 65
K+A ITGA G+G A A LA +GA + +D + L +
Sbjct: 9 KIAFITGAAQGIGEAVARTLASQGAHIAAVDYNP------------------EKLEKVVS 50

Query: 66 SIIKLGCDCLQFVLDVSDKAAVDRMVDTVAKSWQRIDILINNAGIYGACAFERITPEQWQ 125
S+ F DV D AA+D + + + IDIL+N AG+ ++ E+W+
Sbjct: 51 SLKAEARHAEAFPADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWE 110

Query: 126 RQLDVDLNGSFYLTQAVWPLMKLQNYGRIVMTTGVSGLFGDLHQVGFSAAKMALVGMVNS 185
V+ G F +++V M + G IV ++++K A V
Sbjct: 111 ATFSVNSTGVFNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKC 170

Query: 186 LSIEGEMHNIRVNSLCPQAV-TAM 208
L +E +NIR N + P + T M
Sbjct: 171 LGLELAEYNIRCNIVSPGSTETDM 194


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1731  TCRTETB       62     2e-12    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 61.8 bits (150), Expect = 2e-12
Identities = 39/184 (21%), Positives = 83/184 (45%), Gaps = 2/184 (1%)

Query: 21 RDTRLMWALCVASVVVYINLYLMQGMLPLIAEHFAVSGSKATLILSVTSFSLAFSLLIYA 80
R +++ LC+ S +N ++ LP IA F + + + + + +Y
Sbjct: 11 RHNQILIWLCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYG 70

Query: 81 VISDRIGRHAPIVVSLWLLALSNLL-LIWTPDFNGLLLVRLLQGVVLAAVPAIAMAYFKE 139
+SD++G ++ + + +++ + F+ L++ R +QG AA PA+ M
Sbjct: 71 KLSDQLGIKRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVAR 130

Query: 140 QLSPSTMLKAAGIYIMANSIGGIVGRLLGGMMSQFLSWQASMWLLFLVTLAGVALTSYLL 199
+ KA G+ ++G VG +GGM++ ++ W + + L+ ++T+ V LL
Sbjct: 131 YIPKENRGKAFGLIGSIVAMGEGVGPAIGGMIAHYIHW-SYLLLIPMITIITVPFLMKLL 189

Query: 200 PSGA 203

Sbjct: 190 KKEV 193


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1732  SHAPEPROTEIN  30     0.024    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 30.1 bits (68), Expect = 0.024
Identities = 16/36 (44%), Positives = 24/36 (66%)

Query: 137 NLVIDIGGGSTEVVLGQKNTPTHLSSLRCGCVSFNE 172
++V+DIGGG+TEV + N + SS+R G F+E
Sbjct: 161 SMVVDIGGGTTEVAVISLNGVVYSSSVRIGGDRFDE 196


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1734  SHAPEPROTEIN  39     2e-05    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 39.4 bits (92), Expect = 2e-05
Identities = 24/81 (29%), Positives = 41/81 (50%), Gaps = 11/81 (13%)

Query: 192 AAKRAGFIDVAFLFEPLAAGMDYEASLIDNQTVLVVDVGGGTTDCSVVKMGPAHKANLDR 251
+A+ AG +V + EP+AA + + + +VVD+GGGTT+ +V+ +
Sbjct: 129 SAQGAGAREVFLIEEPMAAAIGAGLPVSEATGSMVVDIGGGTTEVAVISLN--------- 179

Query: 252 AADCLGHSGQRIGGNDLDIAL 272
+ S RIGG+ D A+
Sbjct: 180 --GVVYSSSVRIGGDRFDEAI 198


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_1738  HTHFIS        63     7e-14    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 63.3 bits (154), Expect = 7e-14
Identities = 27/130 (20%), Positives = 53/130 (40%), Gaps = 2/130 (1%)

Query: 3 RIAIVEDEAAIRENYKDVLQQHGYSVQTYADRPSAMLAFNTRLPDLAIIDIGLGNEIDGG 62
I + +D+AAIR L + GY V+ ++ + DL + D+ + +E
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE--NA 62

Query: 63 FMLCQSLRAMSNTLPIIFLTARDSDFDTVCGLRLGADDYLSKEVSFPHLTARLAALFRRS 122
F L ++ LP++ ++A+++ + GA DYL K L +
Sbjct: 63 FDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEP 122

Query: 123 ELAASQTPQE 132
+ S+ +
Sbjct: 123 KRRPSKLEDD 132


78  Sputcn32_2413  Sputcn32_2417  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_2413  -2      21       2.122261   IucA/IucC family protein
Sputcn32_2414  -1      20       2.086919   TonB-dependent siderophore receptor
Sputcn32_2415  -1      19       0.101433   ferric iron reductase
Sputcn32_2416  0       17       -1.666959  intracellular septation protein A
Sputcn32_2417  0       12       0.254690   YciI-like protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2413  PF04183       620    0.0      IucA / IucC family
		>PF04183#IucA / IucC family

Length = 580

Score = 620 bits (1600), Expect = 0.0
Identities = 166/593 (27%), Positives = 294/593 (49%), Gaps = 22/593 (3%)

Query: 42 LTPAYWQAANRHLVKKILCEFTHEKLITPTLYGQKAGLNHYELRLKDSTYYFSARHYQLD 101
+ W NR LV K+L E +E++ + G + Y + L + + F A
Sbjct: 1 MNHKDWDLVNRRLVAKMLSELEYEQVFHA----ESQGDDRYCINLPGAQWRFIAERGIWG 56

Query: 102 HLAIDADTIRVSLAGKEQTLDAMSLIISLKNDLGISETLLPTYLEEITSTLYSKAYKL-A 160
L IDA T+R + ++ + A +L++ LK L +S+ + +++++ +TL L A
Sbjct: 57 WLWIDAQTLRCA----DEPVLAQTLLMQLKQVLSMSDATVAEHMQDLYATLLGDLQLLKA 112

Query: 161 HQAIPAATLARADYQSIEAGMTEGHPVFIANNGRIGFDMQDYRQFAPESAMPMQLVWLGV 220
+ + A+ L + ++ + GHP F+ N GR G+ + ++APE A +L WL V
Sbjct: 113 RRGLSASDLINLNADRLQC-LLSGHPKFVFNKGRRGWGKEALERYAPEYANTFRLHWLAV 171

Query: 221 RKSKTTFAALENLSHDALLKKELG-QQFDDFQQHLKTKQHDPQDFYFMPVHPWQWREKIA 279
++ + + LL + Q+F F Q + D ++ +PVHPWQW++KIA
Sbjct: 172 KREHMIWRCDNEMDIHQLLTAAMDPQEFARFSQVWQENGLD-HNWLPLPVHPWQWQQKIA 230

Query: 280 RVFAGDIARGDLVYLGEGNEQYQVQQSIRTFFNLASPQKCYVKTALSILNMGFMRGLSPL 339
F D A G +V LGE +Q+ QQS+RT N + +K L+I N RG+
Sbjct: 231 TDFIADFAEGRMVSLGEFGDQWLAQQSLRTLTNASRRGGLDIKLPLTIYNTSCYRGIPGR 290

Query: 340 YMSCTPQINAWVANLVESDPYFAQQGFVILKEIAAIGYHHTYYEQALTQDSAYKKMLSAL 399
Y++ P + W+ + +D Q G VIL E AA H Y Y++ML +
Sbjct: 291 YIAAGPLASRWLQQVFATDATLVQSGAVILGEPAAGYVSHEGYAALARAPYRYQEMLGVI 350

Query: 400 WRESPLPHIEPKQNLMTMAALLHTDHEDKALIAALIVASGLPAKDWVSRYLNLYLSPLLH 459
WRE+P ++P ++ + MA L+ D ++ L A I SGL A+ W+++ + + PL H
Sbjct: 351 WRENPCRWLKPDESPVLMATLMECDENNQPLAGAYIDRSGLDAETWLTQLFRVVVVPLYH 410

Query: 460 AFFAYDLVFMPHGENLILVLDEYVPVKILMKDIGEEVAVLNG----AKPLPDDVKRLAVS 515
Y + + HG+N+ L + E VP ++L+KD ++ ++ LP +V+ +
Sbjct: 411 LLCRYGVALIAHGQNITLAMKEGVPQRVLLKDFQGDMRLVKEEFPEMDSLPQEVRDVTSR 470

Query: 516 LEEEMKLNYILLDIFDCIFRYLAPLLDEQTSVSESQFWELVADNVRDYQAQHPHLADKFA 575
L + ++ + F + R+++PL+ + V E +F++L+A + DY +HP ++++FA
Sbjct: 471 LSADYLIHDLQTGHFVTVLRFISPLMV-RLGVPERRFYQLLAAVLSDYMKKHPQMSERFA 529

Query: 576 QYDLFKDSFVRTCLNRIQLNNNQQMIDLADREKNL-RFAGGIDNPLAAFRQSH 627
+ LF+ +R LN ++L DL + L + + NPL Q +
Sbjct: 530 LFSLFRPQIIRVVLNPVKL----TWPDLDGGSRMLPNYLEDLQNPLWLVTQEY 578


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2414  PRTACTNFAMLY  31     0.025    Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 30.8 bits (69), Expect = 0.025
Identities = 31/130 (23%), Positives = 43/130 (33%), Gaps = 21/130 (16%)

Query: 224 DSGSVRGRVVAAYQDKDSFQDRYEQQRTTLYGIVETDIGDSTLFTLGVDYQDATPSGTMS 283
D+G GR A Q D+ R Q + G F LG D+ A G
Sbjct: 645 DAGGAWGRGFAQRQQLDNRAGRRFDQ--KVAG-----------FELGADHAVAVAGGRWH 691

Query: 284 GGLPLFYSDGSRTNYDRATSTAPDWGSAHTQGLNTFASLEHRLDNGWNLKGTYTYGDNSL 343
G Y+ G R G HT ++ + D+G+ L T
Sbjct: 692 LGGLAGYTRGDRG--------FTGDGGGHTDSVHVGGYATYIADSGFYLDATLRASRLEN 743

Query: 344 EFDVLWATGY 353
+F V + GY
Sbjct: 744 DFKVAGSDGY 753


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2415  2FE2SRDCTASE  108    1e-29    Ferric iron reductase signature.
		>2FE2SRDCTASE#Ferric iron reductase signature.

Length = 262

Score = 108 bits (272), Expect = 1e-29
Identities = 70/238 (29%), Positives = 95/238 (39%), Gaps = 67/238 (28%)

Query: 129 KALHSLWGQWYFGLLVPPIMEWIFNAPKAIFEPIHWQPQSIFMQVHPSGRVAKFEFNLAK 188
K L SLW QWY GL+VPP+M + KA+ P+ + H +GRVA F ++ +
Sbjct: 89 KPLISLWAQWYIGLMVPPLMLALLTQEKAL----DVSPEHFHAEFHETGRVACFWVDVCE 144

Query: 189 RQPNTALTFKKHHGIEPLSQTNTKPSFKTDSKVHSPLTPHKPPVDKELALQGLILNLLQP 248
+ T PH P E LI L P
Sbjct: 145 DKNAT---------------------------------PHSPQHRMET----LISQALVP 167

Query: 249 SVERLLTLSPTPAKLYWSHLGYLIHWYLGELG--LSQQQNQRLKQALFRQATFQDGSINP 306
V+ L KL WS+ GYLI+WYL E+ L + + L+ ALF + T +G NP
Sbjct: 168 VVQALEATGEINGKLIWSNTGYLINWYLTEMKQLLGEATVESLRHALFFEKTLTNGEDNP 227

Query: 307 LYNSINLFIDTEQSNANSNTKTSVKTNSKPGPKISCIRRTCCLRYQLANTGQCHDCPL 364
L+ ++ L RRTCC RY+L + QC DC L
Sbjct: 228 LWRTVVLRDGLLV------------------------RRTCCQRYRLPDVQQCGDCTL 261


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2417  adhesinmafb   25     0.040    Neisseria meningitidis: adhesin MafB signature.
		>adhesinmafb#Neisseria meningitidis: adhesin MafB signature.

Length = 467

Score = 25.4 bits (55), Expect = 0.040
Identities = 9/44 (20%), Positives = 16/44 (36%)

Query: 54 AGFSGSLVVADFDSLASAQAWANADPYFAAGVYQSVVVKPFKRV 97
G GS+ + ++ + W +P A V V +V
Sbjct: 279 IGGLGSVAGFEKNTREAVDRWIQENPNAAETVEAVFNVAAAAKV 322


79  Sputcn32_2555  Sputcn32_2599  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_2555  0       15       -0.521054  chemotaxis-specific methylesterase
Sputcn32_2556  1       14       -0.587732  CheA signal transduction histidine kinase
Sputcn32_2557  0       13       -0.517766  chemotaxis phosphatase, CheZ
Sputcn32_2558  -2      12       -0.108267  response regulator receiver protein
Sputcn32_2559  -2      12       0.018769   flagellar biosynthesis sigma factor
Sputcn32_2560  -2      14       0.037342   cobyrinic acid a,c-diamide synthase
Sputcn32_2561  -2      14       -0.301644  flagellar biosynthesis regulator FlhF
Sputcn32_2562  -1      15       -0.789340  flagellar biosynthesis protein FlhA
Sputcn32_2563  0       19       -1.327533  flagellar biosynthesis protein FlhB
Sputcn32_2564  -2      17       -1.477256  flagellar biosynthesis protein FliR
Sputcn32_2565  -2      16       -2.231706  flagellar biosynthetic protein FliQ
Sputcn32_2566  0       16       -2.044035  flagellar biosynthesis protein FliP
Sputcn32_2567  1       15       -1.972214  flagellar biosynthesis protein, FliO
Sputcn32_2568  0       15       -0.599517  flagellar motor switch protein
Sputcn32_2569  1       16       -0.983670  flagellar motor switch protein FliM
Sputcn32_2570  1       16       -0.873729  flagellar basal body-associated protein FliL
Sputcn32_2571  0       13       -0.347662  flagellar hook-length control protein
Sputcn32_2572  0       12       0.073811   flagellar export protein FliJ
Sputcn32_2573  -1      13       0.073649   flagellum-specific ATP synthase
Sputcn32_2574  0       12       -0.706481  flagellar assembly protein H
Sputcn32_2575  -1      12       -0.702368  flagellar motor switch protein G
Sputcn32_2576  -1      15       -1.232800  flagellar MS-ring protein
Sputcn32_2577  0       18       -2.162417  flagellar hook-basal body complex subunit FliE
Sputcn32_2578  0       19       -3.080879  two component, sigma54 specific, Fis family
Sputcn32_2579  2       22       -3.649160  PAS/PAC sensor signal transduction histidine
Sputcn32_2580  2       21       -4.150424  sigma-54 dependent transcriptional regulator
Sputcn32_2581  3       21       -4.420363  flagellar protein FliS
Sputcn32_2582  3       18       -3.639265  hypothetical protein
Sputcn32_2583  1       16       -2.686187  flagellar hook-associated 2 domain-containing
Sputcn32_2584  1       14       -1.689228  flagellar protein FlaG protein
Sputcn32_2585  0       15       -1.449457  flagellin domain-containing protein
Sputcn32_2586  -1      14       -1.219530  flagellin domain-containing protein
Sputcn32_2587  -1      13       -1.036418  flagellar hook-associated protein FlgL
Sputcn32_2588  -1      13       -0.501367  flagellar hook-associated protein FlgK
Sputcn32_2589  1       17       -0.146227  flagellar rod assembly protein/muramidase FlgJ
Sputcn32_2590  1       17       -0.610601  flagellar basal body P-ring protein
Sputcn32_2591  2       17       -0.870460  flagellar basal body L-ring protein
Sputcn32_2592  2       19       -1.161549  flagellar basal body rod protein FlgG
Sputcn32_2593  0       19       -1.724707  flagellar basal body rod protein FlgF
Sputcn32_2594  -1      20       -2.725978  flagellar hook protein FlgE
Sputcn32_2595  -2      19       -4.318877  flagellar basal body rod modification protein
Sputcn32_2596  -1      19       -4.111378  flagellar basal body rod protein FlgC
Sputcn32_2597  1       17       -4.158482  flagellar basal body rod protein FlgB
Sputcn32_2598  1       17       -4.199749  protein-glutamate O-methyltransferase
Sputcn32_2599  0       19       -4.399653  putative CheW protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2555  HTHFIS        64     2e-13    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 64.5 bits (157), Expect = 2e-13
Identities = 28/135 (20%), Positives = 56/135 (41%), Gaps = 7/135 (5%)

Query: 2 AIKVLVVDDSSFFRRRVSEIVNQDPELEVIATASNGAEAVKMAAELNPQVITMDIEMPVM 61
+LV DD + R +++ +++ + SN A + A + ++ D+ MP
Sbjct: 3 GATILVADDDAAIRTVLNQALSR--AGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 62 DGITAVREIMAKCP-TPILMFSSLTHDGAKATLDALDAGALDFLPKRF--EDIATNKDDA 118
+ + I P P+L+ S+ + + A + GA D+LPK F ++ A
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSA--QNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRA 118

Query: 119 ILLLQQRVKALGRRR 133
+ ++R L
Sbjct: 119 LAEPKRRPSKLEDDS 133


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2556  PF06580       46     2e-07    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 46.4 bits (110), Expect = 2e-07
Identities = 28/182 (15%), Positives = 54/182 (29%), Gaps = 68/182 (37%)

Query: 433 TLNKEIDLVM---------IGEETDLDKNLVEALADPLVH------LVRNSVDHGIEMPN 477
+L E+ +V + + + A+ D V LV N + HGI
Sbjct: 217 SLADELTVVDSYLQLASIQFEDRLQFENQINPAIMDVQVPPMLVQTLVENGIKHGIA--- 273

Query: 478 EREASGKPRTGTITLSASQEGDHILLKIEDDGAGMDPEKLKQIAIKRGVLDDDAAARMTD 537
P+ G I L +++ + L++E+ G+
Sbjct: 274 -----QLPQGGKILLKGTKDNGTVTLEVENTGSLALKN---------------------- 306

Query: 538 TEAYNLIFAPGFSTKVEISDISGRGVGMDVVKTRIAQLNG---TVHIDSMKGKGTVLEIK 594
G G+ V+ R+ L G + + +GK + +
Sbjct: 307 -------------------TKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAM-VL 346

Query: 595 VP 596
+P
Sbjct: 347 IP 348


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2558  HTHFIS        89     7e-24    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 88.7 bits (220), Expect = 7e-24
Identities = 33/105 (31%), Positives = 51/105 (48%), Gaps = 3/105 (2%)

Query: 6 KILIVDDFSTMRRIIKNLLRDLGFNNTQEADDGSTALPMLQKGDFDFVVTDWNMPGMQGI 65
IL+ DD + +R ++ L G++ + +T + GD D VVTD MP
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 66 DLLRAIRADDSLKHLPVLMVTAEAKREQIIAAAQAGVNGYVVKPF 110
DLL I+ LPVL+++A+ I A++ G Y+ KPF
Sbjct: 64 DLLPRIKKAR--PDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPF 106


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2561  PF05272       30     0.023    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.0 bits (67), Expect = 0.023
Identities = 8/21 (38%), Positives = 11/21 (52%)

Query: 244 GVVALVGPTGVGKTTSLAKLA 264
V L G G+GK+T + L
Sbjct: 597 YSVVLEGTGGIGKSTLINTLV 617


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2563  TYPE3IMSPROT  335    e-116    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 335 bits (860), Expect = e-116
Identities = 95/347 (27%), Positives = 177/347 (51%), Gaps = 2/347 (0%)

Query: 6 SGERSEEPTGRRLEQAREKGQVARSKELGTAAVLLSAATGFYMLGPGIATALSHVFERVF 65
SGE++E+PT +++ AR+KGQVA+SKE+ + A++++ + L S + +
Sbjct: 2 SGEKTEQPTPKKIRDARKKGQVAKSKEVVSTALIVALSAMLMGLSDYYFEHFSKLM--LI 59

Query: 66 TMDRAQIYDTNQMFNVWGVVAGEIAWPMLKIMLLIVVVAFIGNVSLGGMNFSTQAMMPKA 125
+++ + + + V V E + ++ + ++A +V G S +A+ P
Sbjct: 60 PAEQSYLPFSQALSYVVDNVLLEFFYLCFPLLTVAALMAIASHVVQYGFLISGEAIKPDI 119

Query: 126 SKMSPLAGFKRMFGVQALVELTKGIAKFSVVAFSAYFLLSYYFNDILLLSSDHLPGNVHH 185
K++P+ G KR+F +++LVE K I K +++ + ++ +L L + +
Sbjct: 120 KKINPIEGAKRIFSIKSLVEFLKSILKVVLLSILIWIIIKGNLVTLLQLPTCGIECITPL 179

Query: 186 ALDLLIWMFILLCSSVLLIVVIDVPFQIWNHNKQLKMTKQEVKDEYKDTEGKPEVKGRVR 245
+L + ++ ++I + D F+ + + K+LKM+K E+K EYK+ EG PE+K + R
Sbjct: 180 LGQILRQLMVICTVGFVVISIADYAFEYYQYIKELKMSKDEIKREYKEMEGSPEIKSKRR 239

Query: 246 QMQRELAQRRMMAEVPNADVIVVNPEHYAVAVKYDVKRSAAPFVIAKGVDDVAFKIREVA 305
Q +E+ R M V + V+V NP H A+ + Y + P V K D +R++A
Sbjct: 240 QFHQEIQSRNMRENVKRSSVVVANPTHIAIGILYKRGETPLPLVTFKYTDAQVQTVRKIA 299

Query: 306 REYNIAIVSAPPLARAIYHTTKLDQQIPEGLFTAVAQVLAYVFQLRQ 352
E + I+ PLARA+Y +D IP A A+VL ++ +
Sbjct: 300 EEEGVPILQRIPLARALYWDALVDHYIPAEQIEATAEVLRWLERQNI 346


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2564  TYPE3IMRPROT  122    5e-36    Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.

Length = 261

Score = 122 bits (309), Expect = 5e-36
Identities = 93/243 (38%), Positives = 143/243 (58%), Gaps = 1/243 (0%)

Query: 15 YMWPLFRVASMLMVMVVFGAATTPARVRLLLAMAITFAIAPVLPPVENADLFSLSAVFIT 74
Y WPL RV +++ + + P RV+L LAM ITFAIAP LP + +FS A+++
Sbjct: 16 YFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLP-ANDVPVFSFFALWLA 74

Query: 75 AQQIIIGVAMGFVSQMVMQTFVLTGQIIGMQTSLGFASMVDPGSGQQTPVIGNFFLLLAT 134
QQI+IG+A+GF Q G+IIG+Q L FA+ VDP S PV+ +LA
Sbjct: 75 VQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASHLNMPVLARIMDMLAL 134

Query: 135 LIFLAVDGHLLMIRMLVASFETLPISNQGLTLTSYRSLAEWGSYMFGAALTMSISAIIAL 194
L+FL +GHL +I +LV +F TLPI + L ++ +L + GS +F L +++ I L
Sbjct: 135 LLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLIFLNGLMLALPLITLL 194

Query: 195 LLVNLSFGVMTRAAPQLNIFSIGFPITMIGGLLILWLTLTPVMAHFDEVWAAAQVLLCDI 254
L +NL+ G++ R APQL+IF IGFP+T+ G+ ++ + + + +++ LL DI
Sbjct: 195 LTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPFCEHLFSEIFNLLADI 254

Query: 255 LGL 257
+
Sbjct: 255 ISE 257


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2565  TYPE3IMQPROT  47     1e-10    Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.

Length = 86

Score = 46.7 bits (111), Expect = 1e-10
Identities = 20/73 (27%), Positives = 37/73 (50%)

Query: 4 EALIDIFREALAVIVMMVSAIVLPGLGIGLVVAVFQAATSINEQTLSFLPRLLVTLFGLM 63
+ L+ +AL +++++ + IGL+V +FQ T + EQTL F +LL L
Sbjct: 2 DDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLF 61

Query: 64 FMGHWLVQTLMDF 76
+ W + L+ +
Sbjct: 62 LLSGWYGEVLLSY 74


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2566  FLGBIOSNFLIP  276    3e-96    Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP signature.

Length = 245

Score = 276 bits (707), Expect = 3e-96
Identities = 120/240 (50%), Positives = 176/240 (73%)

Query: 8 LIGLAILFFSVSVGAADGVLPAVTVKTAADGSTEYSVTMQILLLMTSLSFLPAMVIMLTS 67
L+ +A + + A LP +T + G +S+ +Q L+ +TSL+F+PA+++M+TS
Sbjct: 4 LLSVAPVLLWLITPLAFAQLPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILLMMTS 63

Query: 68 FTRIIVVLSILRQAIGLQQTPSNQVLIGMSLFMTFFIMAPVFDKIYDQGVKPYIDEQLTL 127
FTRII+V +LR A+G P NQVL+G++LF+TFFIM+PV DKIY +P+ +E++++
Sbjct: 64 FTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEKISM 123

Query: 128 QQAFDKGKEPLRAFMLGQVRTTDLKTFIDISGYQNINSPEEAPMSVLVPAFITSELKTAF 187
Q+A +KG +PLR FML Q R DL F ++ + PE PM +L+PA++TSELKTAF
Sbjct: 124 QEALEKGAQPLREFMLRQTREADLGLFARLANTGPLQGPEAVPMRILLPAYVTSELKTAF 183

Query: 188 QIGFMLFVPFLVLDLVVASILMAMGMMMLSPMIVSLPFKIMLFVLVDGWGLVLGTLANSF 247
QIGF +F+PFL++DLV+AS+LMA+GMMM+ P ++LPFK+MLFVLVDGW L++G+LA SF
Sbjct: 184 QIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSLAQSF 243


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2568  FLGMOTORFLIN  111    1e-34    Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 111 bits (278), Expect = 1e-34
Identities = 58/128 (45%), Positives = 85/128 (66%), Gaps = 3/128 (2%)

Query: 2 STEDTG---DDWAAAMAEQALEEANAVALDELVDDSQPISKADAAKLDTILDIPVTISME 58
S E+TG D WA A+ EQ + A +D I+DIPV +++E
Sbjct: 8 SDENTGALDDLWADALNEQKATTTKSAADAVFQQLGGGDVSGAMQDIDLIMDIPVKLTVE 67

Query: 59 VGRSFISIRNLLQLNQGSVVELDRVAGEPLDVMVNGTLIAHGEVVVVNDKFGIRLTDVIS 118
+GR+ ++I+ LL+L QGSVV LD +AGEPLD+++NG LIA GEVVVV DK+G+R+TD+I+
Sbjct: 68 LGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGVRITDIIT 127

Query: 119 QTERIKKL 126
+ER+++L
Sbjct: 128 PSERMRRL 135


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2569  FLGMOTORFLIM  248    2e-82    Flagellar motor switch protein FliM signature.
		>FLGMOTORFLIM#Flagellar motor switch protein FliM signature.

Length = 344

Score = 248 bits (634), Expect = 2e-82
Identities = 87/326 (26%), Positives = 166/326 (50%), Gaps = 10/326 (3%)

Query: 1 MSDLLSQDEIDALLHGVD--DVDDDDDLD-AASQDARSYDFSSQDRIVRGRMPTLEIVNE 57
M+++LSQDEID LL + D +D + ++ YDF D+ + +M TL +++E
Sbjct: 1 MTEVLSQDEIDQLLTAISSGDASIEDARPISDTRKITLYDFRRPDKFSKEQMRTLSLMHE 60

Query: 58 RFARHLRISMFNMMRRAAEVSINGVQMLKFGEYVHTLFVPTSLNMVRFSPLKGTALITME 117
FAR S+ +R V + V L + E++ ++ P++L ++ PLKG A++ ++
Sbjct: 61 TFARLTTTSLSAQLRSMVHVHVASVDQLTYEEFIRSIPTPSTLAVITMDPLKGNAVLEVD 120

Query: 118 ARLVFILVDNFFGGDGRFHAKIEGREFTPTERRIVQLLLKIIFEDYKDAWAPVMDVEFDY 177
+ F ++D FGG G+ R+ T E +++ ++ I + +++W V+D+
Sbjct: 121 PSITFSIIDRLFGGTGQAAKVQ--RDLTDIENSVMEGVIVRILANVRESWTQVIDLRPRL 178

Query: 178 LDSEVNPAMANIVSPTEVVVINSFHIEVDGGGGDFHITMPYSMIEPIRELLDAG--VQSD 235
E NP A IV P+E+VV+ + +V G + +PY IEPI L + S
Sbjct: 179 GQIETNPQFAQIVPPSEMVVLVTLETKVGEEEGMMNFCIPYITIEPIISKLSSQFWFSSV 238

Query: 236 KQDTDMRWSQALHDEIMDVKVGFDACVVEHELTLKDVMNFKAGDIIPVE---LPEYIMMK 292
++ + ++ L D++ V + A V L+++D++ + GDII + + + ++
Sbjct: 239 RRSSTTQYMGVLRDKLSTVDMDVVAEVGSLRLSVRDILGLRVGDIIRLHDTHVGDPFVLS 298

Query: 293 IEDLPTYRCKMGRSRDNLALKIYEKI 318
I + + C+ G +A +I E+I
Sbjct: 299 IGNRKKFLCQPGVVGKKIAAQILERI 324


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2571  FLGHOOKFLIK   54     7e-10    Flagellar hook-length control protein signature.
		>FLGHOOKFLIK#Flagellar hook-length control protein signature.

Length = 375

Score = 53.7 bits (128), Expect = 7e-10
Identities = 43/162 (26%), Positives = 68/162 (41%), Gaps = 7/162 (4%)

Query: 313 EVASEFKPVSVTTSPTQPQVNRQDIPQIQLSLRQGVETPNQMQEMIQRFSPVMKQQLITM 372
EV S PV+ SP Q +P + + + P E Q S Q +
Sbjct: 199 EVISTPSPVTAAASPLITPHQTQPLPTVAAPV---LSAPLGSHEWQQSLS----QHISLF 251

Query: 373 VSNGIQHAEIRLDPPELGHMTVKIQVHGDQTQVQFHVTQSQTRDMVEQAIPRLRELLQEQ 432
G Q AE+RL P +LG + + ++V +Q Q+Q R +E A+P LR L E
Sbjct: 252 TRQGQQSAELRLHPQDLGEVQISLKVDDNQAQIQMVSPHQHVRAALEAALPVLRTQLAES 311

Query: 433 GMQLADSHVSQGEQEQGRDGGFGESNGSGSTNLDEFSAEELD 474
G+QL S++S + + + N + + E+ D
Sbjct: 312 GIQLGQSNISGESFSGQQQAASQQQQSQRTANHEPLAGEDDD 353


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2572  FLGFLIJ       42     7e-08    Flagellar FliJ protein signature.
		>FLGFLIJ#Flagellar FliJ protein signature.

Length = 147

Score = 42.5 bits (99), Expect = 7e-08
Identities = 36/145 (24%), Positives = 72/145 (49%)

Query: 1 MANADPLLLVLNLALDAEEQASLLLKSAQLECQKRQHQLNALNNYRLDYMKQMQSQQGQA 60
MA L + +LA E A+ LL + CQ+ + QL L +Y+ +Y + S
Sbjct: 1 MAEHGALATLKDLAEKEVEDAARLLGEMRRGCQQAEEQLKMLIDYQNEYRNNLNSDMSAG 60

Query: 61 ISASHYHQFHRFIRQIDDAIAQQNRVVADGEKQKEYRQHHWLEKQKKRKAVELLLANKAK 120
I+++ + + +FI+ ++ AI Q + + ++ + + W EK+++ +A + L ++
Sbjct: 61 ITSNRWINYQQFIQTLEKAITQHRQQLNQWTQKVDIALNSWREKKQRLQAWQTLQERQST 120

Query: 121 KRDALELKREQKMTDEFASQQFYRR 145
E + +QK DEFA + R+
Sbjct: 121 AALLAENRLDQKKMDEFAQRAAMRK 145


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2574  FLGFLIH       88     8e-23    Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 88.3 bits (218), Expect = 8e-23
Identities = 57/192 (29%), Positives = 98/192 (51%), Gaps = 9/192 (4%)

Query: 55 VEAIAPPTMAEIEHIRAQAEEEGFSEGKTQGFNEGLEKGRLEGLAQGHQEGFTQGHEQGL 114
+E P ++ ++ QA E QG+ G+ +GR +G QG+QEG QG EQGL
Sbjct: 33 IEEAEPSLEQQLAQLQMQAHE--------QGYQAGIAEGRQQGHKQGYQEGLAQGLEQGL 84

Query: 115 ETGLEEAKGLINRFESLLNQFEKPLQLLDGDIELSLMTLAMALAKSVIGHELKTHPEQIL 174
+ + R + L+++F+ L LD I LM +A+ A+ VIG ++
Sbjct: 85 AEAKSQQAPIHARMQQLVSEFQTTLDALDSVIASRLMQMALEAARQVIGQTPTVDNSALI 144

Query: 175 SALRLGIESLPIKEQAVTIRLHPDDVILVEKLYSAAQLARNQWQLEVDPSLSPGECIISS 234
++ ++ P+ +R+HPDD+ V+ + A L+ + W+L DP+L PG C +S+
Sbjct: 145 KQIQQLLQQEPLFSGKPQLRVHPDDLQRVDDMLGAT-LSLHGWRLRGDPTLHPGGCKVSA 203

Query: 235 QRSLVDLSLPSR 246
+D S+ +R
Sbjct: 204 DEGDLDASVATR 215


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2575  FLGMOTORFLIG  285    9e-97    Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 285 bits (730), Expect = 9e-97
Identities = 111/348 (31%), Positives = 196/348 (56%), Gaps = 5/348 (1%)

Query: 1 MAENKTKEAAEVSSFNVKDMSGIEKTAILLLSLSEADAASILKHLEPKQVQKVGMAMAAM 60
M E K KE +VS+ ++G +K AILL+S+ ++ + K+L ++++ + +A +
Sbjct: 1 MEEKKEKEILDVSA-----LTGKQKAAILLVSIGSEISSKVFKYLSQEEIESLTFEIAKL 55

Query: 61 EDFGQEKVVGVHKLFLDDIQKYSSIGFNSEEFVRKALTAALGADKAGNLIEQIIMGSGAK 120
E E V F + + I ++ R+ L +LG KA ++I + ++
Sbjct: 56 ETITSELKDNVLLEFKELMMAQEFIQKGGIDYARELLEKSLGTQKAVDIINNLGSALQSR 115

Query: 121 GLDSLKWMDARQVATIIQNEHPQIQTIVLSYLEPDQAAEIFGQFPENTRLDLMMRIANLE 180
+ ++ D + IQ EHPQ ++LSYL+P +A+ I P + ++ RIA ++
Sbjct: 116 PFEFVRRADPANILNFIQQEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMD 175

Query: 181 EVQPAALQELNDIMEKQFAGQGGAQAAKMGGLKAAANIMNYLDTGIESQLMETMRESDEE 240
P ++E+ ++EK+ A GG+ I+N D E ++E++ E D E
Sbjct: 176 RTSPEVVREVERVLEKKLASLSSEDYTSAGGVDNVVEIINMADRKTEKFIIESLEEEDPE 235

Query: 241 MAQQIQDLMFVFENLIDVDDRGIQILLREVQQDVLLKALKGTDDQLKEKLLGNMSKRAAE 300
+A++I+ MFVFE+++ +DDR IQ +LRE+ L KALK D ++EK+ NMSKRAA
Sbjct: 236 LAEEIKKKMFVFEDIVLLDDRSIQRVLREIDGQELAKALKSVDIPVQEKIFKNMSKRAAS 295

Query: 301 LLRDDLEAMGPIRISEVEVAQKEILSIARRLSDSGEIMLGGGGGDEFL 348
+L++D+E +GP R +VE +Q++I+S+ R+L + GEI++ GG ++ L
Sbjct: 296 MLKEDMEFLGPTRRKDVEESQQKIVSLIRKLEEQGEIVISRGGEEDVL 343


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2576  FLGMRINGFLIF  299    5e-97    Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 299 bits (768), Expect = 5e-97
Identities = 163/566 (28%), Positives = 263/566 (46%), Gaps = 56/566 (9%)

Query: 26 LGGVDMMRQITMILALAICLALAVFVMLWAQEPEYRPL-GKMETQEMVQVLDVLDKNKIK 84
L + +I +I+A + +A+ V ++LWA+ P+YR L + Q+ ++ L + I
Sbjct: 16 LNRLRANPRIPLIVAGSAAVAIVVAMVLWAKTPDYRTLFSNLSDQDGGAIVAQLTQMNIP 75

Query: 85 YQIEVD--VIKVPEDKYQEVKLMLSRAGIDGGTTSKDFLTQDSGFGVSQRMEQARLKHSQ 142
Y+ I+VP DK E++L L++ G+ G L FG+SQ EQ + +
Sbjct: 76 YRFANGSGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQFSEQVNYQRAL 135

Query: 143 EENLARAIEQLQSVSRAKVILALPKENVFARNTSQPSATVVINTRRG-GLGQGEVDAIVD 201
E LAR IE L V A+V LA+PK ++F R PSA+V + G L +G++ A+V
Sbjct: 136 EGELARTIETLGPVKSARVHLAMPKPSLFVREQKSPSASVTVTLEPGRALDEGQISAVVH 195

Query: 202 IVASAVQGLEPSRVTVTDSNGRLLNSGSQDGVSARARRELELVQQKEAEYRTKIESILVP 261
+V+SAV GL P VT+ D +G LL + G +L+ E+ + +IE+IL P
Sbjct: 196 LVSSAVAGLPPGNVTLVDQSGHLLTQSNTSGRDL-NDAQLKFANDVESRIQRRIEAILSP 254

Query: 262 ILGPDNFTSQVDVSMDFTAVEQTAKRFNPDLPALRSEMTVENNST-----GGTTGGIPGA 316
I+G N +QV +DF EQT + ++P+ A ++ + + G GG+PGA
Sbjct: 255 IVGNGNVHAQVTAQLDFANKEQTEEHYSPNGDASKATLRSRQLNISEQVGAGYPGGVPGA 314

Query: 317 LSNQPPMESDIP--------EDATNASEKVVAGNSH--------REATRNFELDTTISHT 360
LSNQP ++ P ++A N + + NS+ R T N+E+D TI HT
Sbjct: 315 LSNQPAPPNEAPIATPPTNQQNAQNTPQTSTSTNSNSAGPRSTQRNETSNYEVDRTIRHT 374

Query: 361 RQQIGVVRRISVSVAVDFKPGAADENGQVARVARTEQELTNIRRLLEGAVGFSTQRGDVL 420
+ +G + R+SV+V V++K A + + T ++ I L A+GFS +RGD L
Sbjct: 375 KMNVGDIERLSVAVVVNYKTLADGKP-----LPLTADQMKQIEDLTREAMGFSDKRGDTL 429

Query: 421 EVVTVPFMDQLMEDIPSPELWEQPWFWRAVKLGVGALVILV----LILAVVRPMLKRLIY 476
VV PF + W+Q F + L++LV L VRP L R +
Sbjct: 430 NVVNSPF-SAVDNTGGELPFWQQQSFIDQLLAAGRWLLVLVVAWILWRKAVRPQLTRRVE 488

Query: 477 PDSVNMPDDSRLGNELAEIEDQYAADTLGMLNTKEAEYSYADDGSIL---IPNLHKDDDM 533
E E+ E + D + + M
Sbjct: 489 ----EAKAAQEQAQVRQETEE-------------AVEVRLSKDEQLQQRRANQRLGAEVM 531

Query: 534 IKAIRALVANEPELSTQVVKNWLQDN 559
+ IR + N+P + V++ W+ ++
Sbjct: 532 SQRIREMSDNDPRVVALVIRQWMSND 557


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2577  FLGHOOKFLIE   57     3e-14    Flagellar hook-basal body complex protein FliE signa...
		>FLGHOOKFLIE#Flagellar hook-basal body complex protein FliE signature.

Length = 103

Score = 57.4 bits (138), Expect = 3e-14
Identities = 29/81 (35%), Positives = 46/81 (56%)

Query: 31 QQVNNTSGADFGQLLSQAVGNVSGLQSTSSNLATRLEMGDTTVSLSDTVIAREKASVAFE 90
Q+ F L A+ +S Q+ + A + +G+ V+L+D + +KASV+ +
Sbjct: 23 QESLPQPTISFAGQLHAALDRISDTQTAARTQAEKFTLGEPGVALNDVMTDMQKASVSMQ 82

Query: 91 ATVQVRNKLVEAYKEIMSMPV 111
+QVRNKLV AY+E+MSM V
Sbjct: 83 MGIQVRNKLVAAYQEVMSMQV 103


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2578  HTHFIS        455    e-160    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 455 bits (1171), Expect = e-160
Identities = 168/484 (34%), Positives = 249/484 (51%), Gaps = 43/484 (8%)

Query: 1 MSEAKLLLVEDDASLREALLDTLLLAQYECIDVACGEDAILALKQHQFDLVISDVQMQGI 60
M+ A +L+ +DDA++R L L A Y+ + + DLV++DV M
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 61 GGLGLLNYLQQHHPKLPVLLMTAYATIGSAVSAIKLGAVDYLAKPFAPEVLLNQVSRYLP 120
LL +++ P LPVL+M+A T +A+ A + GA DYL KPF L+ + R L
Sbjct: 61 NAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALA 120

Query: 121 LKTNADKPIVAD-----------EKSLALLSLAQRVAASDASVMIMGPSGSGKEVLARYI 169
+ D + + R+ +D ++MI G SG+GKE++AR +
Sbjct: 121 EPKRRPSKLEDDSQDGMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELVARAL 180

Query: 170 HQHSSRAEQAFVAINCAAIPENMLEATLFGYEKGAFTGAYQACPGKFEQAQGGTLLLDEI 229
H + R FVAIN AAIP +++E+ LFG+EKGAFTGA G+FEQA+GGTL LDEI
Sbjct: 181 HDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEI 240

Query: 230 SEMDLGLQAKLLRVLQEREVERLGGRKTIKLDVRVLATSNRDLKAVVAAGGFREDLYYRI 289
+M + Q +LLRVLQ+ E +GGR I+ DVR++A +N+DLK + G FREDLYYR+
Sbjct: 241 GDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDLYYRL 300

Query: 290 NVFPLTWPALNQRPADILPLARHLLAKHAKALNIGATPELDEQACRRLLSHRWPGNVREL 349
NV PL P L R DI L RH + + K D++A + +H WPGNVREL
Sbjct: 301 NVVPLRLPPLRDRAEDIPDLVRHFVQQAEK--EGLDVKRFDQEALELMKAHPWPGNVREL 358

Query: 350 DNIVQRALILRTGSVITANDIIIDVQGALINDESEVSASEPEG----------------- 392
+N+V+R L VIT I +++ + + E +A+
Sbjct: 359 ENLVRRLTALYPQDVITREIIENELRSEIPDSPIEKAAARSGSLSISQAVEENMRQYFAS 418

Query: 393 ----------LGEELKAQEHVIILETLAQCQGSRKLVAEKLGISARTLRYKMARMRDMGI 442
L E+ +IL L +G++ A+ LG++ TLR K +R++G+
Sbjct: 419 FGDALPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKK---IRELGV 475

Query: 443 QLPS 446
+
Sbjct: 476 SVYR 479


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
Sputcn32_2579  PF06580  29     0.023    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 29.4 bits (66), Expect = 0.023
Identities = 21/95 (22%), Positives = 37/95 (38%), Gaps = 19/95 (20%)

Query: 256 LVINSLEAGASQ------IRILATESKDQLMLEVIDNGKGLDAKMQQKVMEPFFTTKAQG 309
LV N ++ G +Q I + T+ + LEV + G +
Sbjct: 263 LVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSL------------ALKNTKES 310

Query: 310 TGLGLA-VVQSVVRNHGGEIQLRCLPNKGCTVSLV 343
TG GL V + + +G E Q++ +G ++V
Sbjct: 311 TGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMV 345


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits    Score  E-value  Comments
Sputcn32_2580  HTHFIS  443    e-155    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 443 bits (1142), Expect = e-155
Identities = 169/484 (34%), Positives = 262/484 (54%), Gaps = 27/484 (5%)

Query: 7 RILLVGTPSERLSRLCCIFEFLGEQIDVI-----APEKLNSYLQDTRYRALVLFTDTMPS 61
IL+ + + L G + + + + D LV+ MP
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGD-----LVVTDVVMPD 59

Query: 62 -DAIKLLATQFAWQP----ILL--FGEIGDFQVSNVLG---QIEEPLSYPQLTELLHFCQ 111
+A LL +P +++ ++ G + +P +L ++
Sbjct: 60 ENAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRAL 119

Query: 112 VYGQVKRPQVPTSANQTKLFRSLVGRSDGIAHVRHLINQVATSDATVLVLGQSGTGKEVV 171
+ + ++ + LVGRS + + ++ ++ +D T+++ G+SGTGKE+V
Sbjct: 120 AEPKRRPSKLEDDSQD---GMPLVGRSAAMQEIYRVLARLMQTDLTLMITGESGTGKELV 176

Query: 172 ARNIHYLSERRDGPFIPVNCGAIPPELLESELFGHEKGSFTGAICSRKGRFELAEGGTLF 231
AR +H +RR+GPF+ +N AIP +L+ESELFGHEKG+FTGA GRFE AEGGTLF
Sbjct: 177 ARALHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTRSTGRFEQAEGGTLF 236

Query: 232 LDEIGDMPLQMQVKLLRVLQERVFERVGGTKTINVDVRVVAATHRDLESMISGNEFREDL 291
LDEIGDMP+ Q +LLRVLQ+ + VGG I DVR+VAAT++DL+ I+ FREDL
Sbjct: 237 LDEIGDMPMDAQTRLLRVLQQGEYTTVGGRTPIRSDVRIVAATNKDLKQSINQGLFREDL 296

Query: 292 YYRLNVFPIEMPALSERKDDVPLLLQELVSRVYNEGRGKVRFTQRAIESLKEHAWSGNVR 351
YYRLNV P+ +P L +R +D+P L++ V + EG RF Q A+E +K H W GNVR
Sbjct: 297 YYRLNVVPLRLPPLRDRAEDIPDLVRHFVQQAEKEGLDVKRFDQEALELMKAHPWPGNVR 356

Query: 352 ELSNLVERLTILYPGGLVDVNDLPVKYRHIDVPEYCVEMSEEQQERDALASIFSSEEPVE 411
EL NLV RLT LYP ++ + + R ++P+ +E + + +++ + EE +
Sbjct: 357 ELENLVRRLTALYPQDVITREIIENELRS-EIPDSPIEKAAARSGSLSISQ--AVEENMR 413

Query: 412 IPETRFPNELPPEGVNLKDLLAELEIDMIRQALELQDNVVARAAEMLGIRRTTLVEKMRK 471
F + LPP G+ +LAE+E +I AL +AA++LG+ R TL +K+R+
Sbjct: 414 QYFASFGDALPPSGL-YDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRNTLRKKIRE 472

Query: 472 YGMT 475
G++
Sbjct: 473 LGVS 476


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits       Score  E-value  Comments
Sputcn32_2583  FLAGELLIN  32     0.006    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 31.6 bits (71), Expect = 0.006
Identities = 25/220 (11%), Positives = 48/220 (21%)

Query: 4 TATGMGSGLDITNIVKVLVDAEKTPKEAMFNKTEDSIKAKVSAMGTLKSALSAFQDAVKK 63
G+ + T V + V T KS +
Sbjct: 207 VDVNSGAVVTDTTAPTVPDKVYVNAANGQLTTDDAENNTAVDLFKTTKSTAGTAEAKAIA 266

Query: 64 LQTGDALNQRKISVSNSTFLTATADKTAQAGSYGIKVEQLAVNHKIAGANVANPASGVGE 123
TF T G + V +A
Sbjct: 267 GAIKGGKEGDTFDYKGVTFTIDTKTGNDGNGKVSTTINGEKVTLTVADITAGAANVDAAT 326

Query: 124 GSLDFDINGKNFSVDIAATDSLDAIAKKVNKASDNVGVTATVVTSDAGSRLVFSSNKTGE 183
++ + D + K++ N V + G+ ++
Sbjct: 327 LQSSKNVYTSVVNGQFTFDDKTKNESAKLSDLEANNAVKGESKITVNGAEYTANAAGDKV 386

Query: 184 DNQINITATDTSGSGLSDMFDASNITTLQDAKNAVIYIDN 223
D + SG+S + + + N + ID+
Sbjct: 387 TLAGKTMFIDKTASGVSTLINEDAAAAKKSTANPLASIDS 426


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits       Score  E-value  Comments
Sputcn32_2585  FLAGELLIN  134    3e-38    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 134 bits (338), Expect = 3e-38
Identities = 97/271 (35%), Positives = 126/271 (46%), Gaps = 11/271 (4%)

Query: 2 AITVNTNVTSLKAQKNLNTSASGLATSMERLSSGLRINGAKDDAAGLAISNRLNSQVRGL 61
A +NTN SL Q NLN S S L++++ERLSSGLRIN AKDDAAG AI+NR S ++GL
Sbjct: 1 AQVINTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGL 60

Query: 62 DVGMRNANDAISIAQISEGAMQEQTNMLQRMRDLTVQAENGANSSDDLTSIQKEIDQLAL 121
RNAND ISIAQ +EGA+ E N LQR+R+L+VQA NG NS DL SIQ EI Q
Sbjct: 61 TQASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLE 120

Query: 122 EITAIGTNTAFGTTKLLDGTFSAGKTFQVGHQSGEDITISVSKTTASALKVGSLDIKGSA 181
EI + T F K+L QVG GE ITI + K +L + ++ G
Sbjct: 121 EIDRVSNQTQFNGVKVLSQDNQ--MKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPK 178

Query: 182 RASALAA---------IDAAIKTIDSQRADLGAKQNRLAYNISNSANTQANISDAKSRIV 232
A+ D + R D+ + + +
Sbjct: 179 EATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTT 238

Query: 233 DVDFAKETSQMTKNQVLQQTGSAMLAQANQL 263
D + K + A A +
Sbjct: 239 DDAENNTAVDLFKTTKSTAGTAEAKAIAGAI 269



Score = 85.5 bits (211), Expect = 5e-21
Identities = 64/212 (30%), Positives = 102/212 (48%), Gaps = 3/212 (1%)

Query: 60 GLDVGMRNANDAISIAQISEGAMQEQTNMLQRMRDLTVQAENGANSSDDLTSIQKEIDQL 119
+ + +++A I+ GA LQ +++ NG + DD T +
Sbjct: 298 KVSTTINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKTKNESAKLSD 357

Query: 120 ALEITAIGTNTAFGTTKLLDGTFSAGKTFQVGHQSGEDITISVSKTTASALKVGSLDIKG 179
A+ + +AG + ++ I + + S L
Sbjct: 358 LEANNAVKGESKITVNGAEYTANAAGDKVTLAGKTM---FIDKTASGVSTLINEDAAAAK 414

Query: 180 SARASALAAIDAAIKTIDSQRADLGAKQNRLAYNISNSANTQANISDAKSRIVDVDFAKE 239
+ A+ LA+ID+A+ +D+ R+ LGA QNR I+N NT N++ A+SRI D D+A E
Sbjct: 415 KSTANPLASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARSRIEDADYATE 474

Query: 240 TSQMTKNQVLQQTGSAMLAQANQLPQVALSLL 271
S M+K Q+LQQ G+++LAQANQ+PQ LSLL
Sbjct: 475 VSNMSKAQILQQAGTSVLAQANQVPQNVLSLL 506


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits       Score  E-value  Comments
Sputcn32_2586  FLAGELLIN  134    2e-38    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 134 bits (339), Expect = 2e-38
Identities = 95/271 (35%), Positives = 128/271 (47%), Gaps = 10/271 (3%)

Query: 2 AITVNTNVTSMKAQKNLNTSSSGLATSMERLSSGLRINSAKDDAAGLAISNRLNSQVRGL 61
A +NTN S+ Q NLN S S L++++ERLSSGLRINSAKDDAAG AI+NR S ++GL
Sbjct: 1 AQVINTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGL 60

Query: 62 DVGMRNANDAISIAQIAEGAMQEQTNMLQRMRDLTVQAENGANSTDDLDAIQKEIDQLAE 121
RNAND ISIAQ EGA+ E N LQR+R+L+VQA NG NS DL +IQ EI Q E
Sbjct: 61 TQASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLE 120

Query: 122 EITAIGDSTAFGNTKLMTGNFSAGKTFQVGHQEGEDITISVGTNNAGSLMV--------S 173
EI + + T F K+++ + QVG +GE ITI + + SL +
Sbjct: 121 EIDRVSNQTQFNGVKVLSQDNQ--MKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPK 178

Query: 174 TLTIATSGGRSTALAAIDAAIKNIDNQRAALGAKQNRLAYNISNSANTQANVADAKSRIV 233
T+ + D + R + + + A
Sbjct: 179 EATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTT 238

Query: 234 DVDFAKETSVMTKNQVLQQTGSAMLAQANQL 264
D + K + A A +
Sbjct: 239 DDAENNTAVDLFKTTKSTAGTAEAKAIAGAI 269



Score = 84.7 bits (209), Expect = 1e-20
Identities = 63/290 (21%), Positives = 113/290 (38%), Gaps = 23/290 (7%)

Query: 6 NTNVTSMKAQKNLNTSSSGLATSMERLSSGLRINSAKDDAAGLAISNRLNSQVRGLDVGM 65
+T ++ + +N ++ L T ++ + + AG A + + ++G G
Sbjct: 217 DTTAPTVPDKVYVNAANGQLTTDDAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGD 276

Query: 66 RNANDAISIAQIAEGAMQEQTNMLQRMRDLTVQAENGANSTDDLDAIQKEIDQLAE---- 121
++ + + + V + + +
Sbjct: 277 TFDYKGVTFTIDTKTGNDGNGKVSTTINGEKVTLTVADITAGAANVDAATLQSSKNVYTS 336

Query: 122 -------------EITAIGDSTAFGNTKLMTGNFSAGKTFQVGHQEGEDITISVGTNNAG 168
+A N + + G+ +T++ T
Sbjct: 337 VVNGQFTFDDKTKNESAKLSDLEANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFID 396

Query: 169 ------SLMVSTLTIATSGGRSTALAAIDAAIKNIDNQRAALGAKQNRLAYNISNSANTQ 222
S +++ A + LA+ID+A+ +D R++LGA QNR I+N NT
Sbjct: 397 KTASGVSTLINEDAAAAKKSTANPLASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNTV 456

Query: 223 ANVADAKSRIVDVDFAKETSVMTKNQVLQQTGSAMLAQANQLPQVALSLL 272
N+ A+SRI D D+A E S M+K Q+LQQ G+++LAQANQ+PQ LSLL
Sbjct: 457 TNLNSARSRIEDADYATEVSNMSKAQILQQAGTSVLAQANQVPQNVLSLL 506


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits       Score  E-value  Comments
Sputcn32_2587  FLAGELLIN  58     3e-11    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 57.7 bits (139), Expect = 3e-11
Identities = 60/358 (16%), Positives = 121/358 (33%), Gaps = 8/358 (2%)

Query: 20 QTATSKILEQLSSGKKVNTAGDDPVAALGIDNLNQRNALVDQFMKNIDYATNRLAVTESK 79
Q++ S +E+LSSG ++N+A DD + + Q +N + + TE
Sbjct: 21 QSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGLTQASRNANDGISIAQTTEGA 80

Query: 80 LGSAENLASSIREQVMRAVNGTLADSERQMIADEMKGSLEELLSIANSKDESGNYMFSGY 139
L N +RE ++A NGT +DS+ + I DE++ LEE+ ++N +G + S
Sbjct: 81 LNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEEIDRVSNQTQFNGVKVLSQ- 139

Query: 140 STDKEPFAFDNSTPPKIVYSGDSGIQNSLVQSGVALGTNVPGDSAFMKAPNGLGDYSVNY 199
+ N + +++ + G NV G +V
Sbjct: 140 DNQMKIQVGANDGETITIDLQKIDVKSLGLD-----GFNVNGPKEATVGDLKSSFKNVTG 194

Query: 200 LASQQGEFSVKTAKIADASTYLADTYTFNFIDNGLGGTNLQVLDSANNPVANIANFDATT 259
+ + + + T + N Q+ + F T
Sbjct: 195 YDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTTDDAENNTAVDLFKTTK 254

Query: 260 PVSFNGIEVKVSGKPSAGDSFTMEPQAEVSIFDTISSAIALIEDPNSANTPQGRAQLAQI 319
+ ++G G V+ TI + + + T G +
Sbjct: 255 STAGTAEAKAIAGAIKGGKEGDTFDYKGVTF--TIDTKTGNDGNGKVSTTINGEKVTLTV 312

Query: 320 LNNIDSGVNQISSARSVAGNNLKAVESYKETHTEEKVVNTSALSRLEDLDYASAITEF 377
+ N ++ + N +V + + T ++ ++ LS LE + ++
Sbjct: 313 ADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKTKNESAKLSDLEANNAVKGESKI 370



Score = 29.2 bits (65), Expect = 0.030
Identities = 42/277 (15%), Positives = 80/277 (28%), Gaps = 18/277 (6%)

Query: 140 STDKEPFAFDNSTPPKIVYSGDSGIQNSLVQSGVALGTNVPGDSAFMKAPNGLGDYSVNY 199
DK N ++ + A + +K +
Sbjct: 223 VPDKVYVNAANGQLTTDDAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKG 282

Query: 200 LASQQGE--FSVKTAKIADASTYLADTYTFNFIDNGLGGTNLQVLDSANNPVANIANF-- 255
+ + K++ T T I G + L S+ N ++ N
Sbjct: 283 VTFTIDTKTGNDGNGKVSTTINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQF 342

Query: 256 ---DATTPVSFNGIEVKVSGKPSAGDSFTMEPQAEVSIFDTISSAIALIEDPNSANTPQG 312
D T S +++ + T+ + +A
Sbjct: 343 TFDDKTKNESAKLSDLEANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFIDKTASGV 402

Query: 313 RAQLAQI-----------LNNIDSGVNQISSARSVAGNNLKAVESYKETHTEEKVVNTSA 361
+ + L +IDS ++++ + RS G +S SA
Sbjct: 403 STLINEDAAAAKKSTANPLASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSA 462

Query: 362 LSRLEDLDYASAITEFEKQQLALNAVSSVFSKVGSVS 398
SR+ED DYA+ ++ K Q+ A +SV ++ V
Sbjct: 463 RSRIEDADYATEVSNMSKAQILQQAGTSVLAQANQVP 499


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score  E-value  Comments
Sputcn32_2588  FLGHOOKAP1  220    5e-66    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 220 bits (561), Expect = 5e-66
Identities = 128/460 (27%), Positives = 194/460 (42%), Gaps = 29/460 (6%)

Query: 4 DLLNIARTGVLASQSQLGVTSNNIANANTAGYHRQVATQSTLESQRFGNSFYGTGTYVDD 63
L+N A +G+ A+Q+ L SNNI++ N AGY RQ + S + G G YV
Sbjct: 2 SLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVSG 61

Query: 64 VKRIYNDYAARELRIGQTTLSGAEASYGKLSELDQLFSQIGKMVPQSLNSLFTGLNSLAD 123
V+R Y+ + +LR QT SG A Y ++S++D + S + + FT L +L
Sbjct: 62 VQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTLVS 121

Query: 124 LPADLGIRSSTLNDAKQLANSLNQMQSSLNGQLTQTNDQITGMTKRINEISKELANLNLE 183
D R + + ++ L N L Q Q N I +IN +K++A+LN +
Sbjct: 122 NAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLNDQ 181

Query: 184 LMKSPNQDAI-----LLDKQDALVQELSQYAQVNVIPQDNGAKSIMLGGSVMLVSGEIAM 238
+ + A LLD++D LV EL+Q V V QD G +I + LV G A
Sbjct: 182 ISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGSTAR 241

Query: 239 TMDTKTGNPFPNELQLMSSIGSQSVSADPSKL--GGQLGALFEYRDQTLIPASHELDQLA 296
+ + P+ + G+ P KL G LG + +R Q L + L QLA
Sbjct: 242 QLAAVPSSADPSRTTVAYVDGTAGNIEIPEKLLNTGSLGGILTFRSQDLDQTRNTLGQLA 301

Query: 297 LGIADNFNKMQQQGLDLNGQVGANIFRDINDPLMSLGRVGGYSNNTGNATLGVNIDDTRL 356
L A+ FN + G D NG G + F + V + N G+ +G + D
Sbjct: 302 LAFAEAFNTQHKAGFDANGDAGEDFFA------IGKPAVLQNTKNKGDVAIGATVTDASA 355

Query: 357 LTGGSYELSF-------TAPASYELRDTETGVITPLTLNGSTLEGGAGFSINIKAGAMAS 409
+ Y++SF T AS + +G L G A
Sbjct: 356 VLATDYKISFDNNQWQVTRLASNTTFTVTPDANGKVAFDGLELTFT---------GTPAV 406

Query: 410 GDRFAIRPTAGAANGITVEMTDPKGIAAASPKITADAANS 449
D F ++P + A + V +TD IA AS + D+ N
Sbjct: 407 NDSFTLKPVSDAIVNMDVLITDEAKIAMASEEDAGDSDNR 446



Score = 89.2 bits (221), Expect = 9e-21
Identities = 38/103 (36%), Positives = 55/103 (53%)

Query: 535 AEGDNSNAVAMAKLSESKVMNGGKSTLADVFENTKIDIGSKTKAAEVRVGSAEAIYQQAY 594
+ DN N A+ L + GG + D + + DIG+KT + + + Q
Sbjct: 441 GDSDNRNGQALLDLQSNSKTVGGAKSFNDAYASLVSDIGNKTATLKTSSATQGNVVTQLS 500

Query: 595 ARVESESGVNLDEEAANLMRFQQAYQASARIMTTAQQIFDTLL 637
+ +S SGVNLDEE NL RFQQ Y A+A+++ TA IFD L+
Sbjct: 501 NQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALI 543


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
Sputcn32_2589  FLGFLGJ  213    2e-68    Flagellar protein FlgJ signature.
		>FLGFLGJ#Flagellar protein FlgJ signature.

Length = 313

Score = 213 bits (542), Expect = 2e-68
Identities = 112/363 (30%), Positives = 168/363 (46%), Gaps = 80/363 (22%)

Query: 12 DLGGLDSLRAQAQKDEKGTLKQVAQQFEGIFVQMLMKSMRDANAVFESDSPLNSQYTKFY 71
D L+ L+A+A +D ++ VA+Q EG+FVQM++KSMRDA D +S++T+ Y
Sbjct: 14 DAQSLNELKAKAGEDPAANIRPVARQVEGMFVQMMLKSMRDALP---KDGLFSSEHTRLY 70

Query: 72 EQMRDQQLSVDLSDKGVLGLADMMVQQLSPESSQLTPASVLRNDGGEKLQRGDKAFTAPA 131
M DQQ++ ++ LGLA+MMV+Q++PE P
Sbjct: 71 TSMYDQQIAQQMTAGKGLGLAEMMVKQMTPEQ------------------------PLPE 106

Query: 132 QNTSTQDVLDASTPPVSTSTQIPTYIARPTFESDRPEAVTSSLDIDTPIPSLAINTPKPA 191
++T P T + +A++ +
Sbjct: 107 ESTP----------------AAPMKFPLETVVRYQNQALSQLV----------------- 133

Query: 192 WSEQPLSPIEPVISGQILPTVAFKETQKTLKFGSREEFLATLYPHAEKAAKALGTKPEVL 251
Q P S G + FLA L A+ A++ G ++
Sbjct: 134 ---QKAVPRNYDDSLP----------------GDSKAFLAQLSLPAQLASQQSGVPHHLI 174

Query: 252 LAQSALETGWGQKIVRGSNGAPSHNLFNIKADRRWLGDKANVSTLEFEQGIAVRQKADFR 311
LAQ+ALE+GWGQ+ +R NG PS+NLF +KA W G ++T E+E G A + KA FR
Sbjct: 175 LAQAALESGWGQRQIRRENGEPSYNLFGVKASGNWKGPVTEITTTEYENGEAKKVKAKFR 234

Query: 312 VYTDFEHSFNDFVTFIAEGERYQGATKVAASPTQFIRALQDAGYATDPKYAEKVIKVMQT 371
VY+ + + +D+V + RY A AAS Q +ALQDAGYATDP YA K+ ++Q
Sbjct: 235 VYSSYLEALSDYVGLLTRNPRY-AAVTTAASAEQGAQALQDAGYATDPHYARKLTNMIQQ 293

Query: 372 ISQ 374
+
Sbjct: 294 MKS 296


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2590  FLGPRINGFLGI  374    e-131    Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 374 bits (962), Expect = e-131
Identities = 161/377 (42%), Positives = 225/377 (59%), Gaps = 20/377 (5%)

Query: 1 MRFKRIIALAILIFSLP--------SQAERIKDIANVQGVRSNQLIGYGLVVGLPGTGEK 52
MR RIIA A++ +LP + RIKDIA++Q R NQLIGYGLVVGL GTG+
Sbjct: 1 MRVLRIIAAALVFSALPFLSTPPAQADTSRIKDIASLQAGRDNQLIGYGLVVGLQGTGDS 60

Query: 53 ---TNYTEQTFTTMLKNFGINLPDNFRPKIKNVAVVAVHADMPAFIKPGQELDVTVSSLG 109
+ +TEQ+ ML+N GI + KN+A V V A++P F PG +DVTVSSLG
Sbjct: 61 LRSSPFTEQSMRAMLQNLGITTQGG-QSNAKNIAAVMVTANLPPFASPGSRVDVTVSSLG 119

Query: 110 EAKSLRGGTLLQTFLKGVDGNVYAIAQGSLVVSGFSAEGLDGSKVIQNTPTVGRIPNGAI 169
+A SLRGG L+ T L G DG +YA+AQG+L+V+GFSA+G D + + Q T R+PNGAI
Sbjct: 120 DATSLRGGNLIMTSLSGADGQIYAVAQGALIVNGFSAQG-DAATLTQGVTTSARVPNGAI 178

Query: 170 VERSVATPFSSGDYLTFNLRRADFSTAQRMADAINDL----LGPDMARPLDATSVQVSAP 225
+ER + + F L LR DFSTA R+AD +N G +A P D+ + V P
Sbjct: 179 IERELPSKFKDSVNLVLQLRNPDFSTAVRVADVVNAFARARYGDPIAEPRDSQEIAVQKP 238

Query: 226 RDVSQRVSFLATLENLDVIPAEESAKVIVNSRTGTIVVGQNVKLLPAAITHGGLTVTIAE 285
R V+ +A +ENL + + AKV++N RTGTIV+G +V++ A+++G LTV + E
Sbjct: 239 R-VADLTRLMAEIENL-TVETDTPAKVVINERTGTIVIGADVRISRVAVSYGTLTVQVTE 296

Query: 286 ATQVSQPNALANGETVVTADTTIGVSESDRRMFMFSPGTTLDELVRAVNLVGAAPSDVLA 345
+ QV QP + G+T V T I + ++ G L LV +N +G ++A
Sbjct: 297 SPQVIQPAPFSRGQTAVQPQTDIMAMQEGSKVA-IVEGPDLRTLVAGLNSIGLKADGIIA 355

Query: 346 ILEALKVAGALHGELII 362
IL+ +K AGAL EL++
Sbjct: 356 ILQGIKSAGALQAELVL 372


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2591  FLGLRINGFLGH  143    7e-45    Flagellar L-ring protein signature.
		>FLGLRINGFLGH#Flagellar L-ring protein signature.

Length = 232

Score = 143 bits (362), Expect = 7e-45
Identities = 71/215 (33%), Positives = 106/215 (49%), Gaps = 9/215 (4%)

Query: 11 LLLSACSSTQKKPIADDPFYAPVYPEAPPTKIAATGSIYQDSQAA-----SLYSDIRAHK 65
L L+ C+ P+ A P P A GSI+Q +Q L+ D R
Sbjct: 17 LSLTGCAWIPSTPLVQGATSAQPVPGPTP---VANGSIFQSAQPINYGYQPLFEDRRPRN 73

Query: 66 VGDIITIVLKEATQAKKSAGNQIKKGSDMSLDPIYAAGSNISV-AGVPLDLRYKDSMNTK 124
+GD +TIVL+E A KS+ + + + D+
Sbjct: 74 IGDTLTIVLQENVSASKSSSANASRDGKTNFGFDTVPRYLQGLFGNARADVEASGGNTFN 133

Query: 125 RESDADQSNSLDGSISANIMQVLNNGNLVVRGEKWISINNGDEFIRVTGIVRSQDIKPDN 184
+ A+ SN+ G+++ + QVL NGNL V GEK I+IN G EFIR +G+V + I N
Sbjct: 134 GKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRFSGVVNPRTISGSN 193

Query: 185 TIDSTRMANARIQYSGTGTFADAQKVGWLSQFFMS 219
T+ ST++A+ARI+Y G G +AQ +GWL +FF++
Sbjct: 194 TVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLN 228


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score  E-value  Comments
Sputcn32_2592  FLGHOOKAP1  43     1e-06    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 42.6 bits (100), Expect = 1e-06
Identities = 19/119 (15%), Positives = 41/119 (34%), Gaps = 4/119 (3%)

Query: 145 EDATSITVSAEGEVSVKTAGAAENQVVGQLSMTDFINPSGLDPMGQNLYTETG---ASGT 201
D I +++E + + + Q + + +L ++ G A+
Sbjct: 427 TDEAKIAMASEEDAGDSDNRNGQALLDLQSNSKTVGGAKSFNDAYASLVSDIGNKTATLK 486

Query: 202 PIQGTASLDGMGAIRQGALETSNVNVTEELVNLIESQRIYEMNSKVISAVDQMLAYVNQ 260
T + + S VN+ EE NL Q+ Y N++V+ + + +
Sbjct: 487 TSSATQGNV-VTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALIN 544



Score = 35.3 bits (81), Expect = 2e-04
Identities = 9/36 (25%), Positives = 20/36 (55%)

Query: 5 LWISKTGLDAQQTDIAVISNNVANASTVGYKKSRAV 40
+ + +GL+A Q + SNN+++ + GY + +
Sbjct: 4 INNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTI 39


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score  E-value  Comments
Sputcn32_2593  FLGHOOKAP1  29     0.023    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 28.8 bits (64), Expect = 0.023
Identities = 9/32 (28%), Positives = 17/32 (53%)

Query: 205 SNVNPVDEMVSLIELQRQFEMQVKMMKTAEEI 236
S VN +E +L Q+ + ++++TA I
Sbjct: 507 SGVNLDEEYGNLQRFQQYYLANAQVLQTANAI 538


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score  E-value  Comments
Sputcn32_2594  FLGHOOKAP1  40     1e-05    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 40.3 bits (94), Expect = 1e-05
Identities = 15/35 (42%), Positives = 22/35 (62%)

Query: 2 SFNIALSGISAAQKDLNTTANNIANANTIGFKESR 36
N A+SG++AAQ LNT +NNI++ N G+
Sbjct: 3 LINNAMSGLNAAQAALNTASNNISSYNVAGYTRQT 37



Score = 37.6 bits (87), Expect = 1e-04
Identities = 13/49 (26%), Positives = 25/49 (51%)

Query: 405 SLSSSALEQSNIDLTTELVDLISAQRNFQANSRTLEVNNTLQQTVLQIR 453
LS+ S ++L E +L Q+ + AN++ L+ N + ++ IR
Sbjct: 498 QLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINIR 546


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score  E-value  Comments
Sputcn32_2596  FLGHOOKAP1  33     3e-04    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 33.0 bits (75), Expect = 3e-04
Identities = 9/38 (23%), Positives = 18/38 (47%)

Query: 99 NVNVMEEMADMISASRSYQMNVQVAEAAKSMLQQTLGM 136
VN+ EE ++ + Y N QV + A ++ + +
Sbjct: 508 GVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545



Score = 29.9 bits (67), Expect = 0.003
Identities = 16/67 (23%), Positives = 29/67 (43%), Gaps = 6/67 (8%)

Query: 5 SIFDVAGSGMSAQSVRLNTTASNIANADSVSSSIDKTYRSRHPIFEAEMAKAQSQQQTSQ 64
S+ + A SG++A LNT ++NI++ + Y + I + +
Sbjct: 2 SLINNAMSGLNAAQAALNTASNNISSYN------VAGYTRQTTIMAQANSTLGAGGWVGN 55

Query: 65 GVTVRGI 71
GV V G+
Sbjct: 56 GVYVSGV 62


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits    Score  E-value  Comments
Sputcn32_2599  HTHFIS  61     1e-12    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 61.4 bits (149), Expect = 1e-12
Identities = 23/128 (17%), Positives = 52/128 (40%), Gaps = 12/128 (9%)

Query: 180 HIMVIDDSAVARKQIIRSLESLNLQIDTAKDGREALDKLKAIASEMDNVADEIPLIISDI 239
I+V DD A R + ++L + + + A + L+++D+
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAA---------GDGDLVVTDV 55

Query: 240 EMPEMDGYTLTAEIRDDPKLKHIKVVLHTSLSGVFNQAMVQKVGANDFIAK-FNPDELAA 298
MP+ + + L I+ + V++ ++ + + GA D++ K F+ EL
Sbjct: 56 VMPDENAFDLLPRIKK--ARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIG 113

Query: 299 AVNKHLSL 306
+ + L+
Sbjct: 114 IIGRALAE 121


80  Sputcn32_2629  Sputcn32_2637  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_2629   0      20       -6.274871  polysaccharide biosynthesis protein CapD
Sputcn32_2630   0      22       -5.768985  hypothetical protein
Sputcn32_2631   0      17       -1.287351  hypothetical protein
Sputcn32_2632  -1      15       -1.200518  hypothetical protein
Sputcn32_2633  -1      16       -0.952171  hypothetical protein
Sputcn32_2634  -1      16       -0.726413  TetR family transcriptional regulator
Sputcn32_2635  -1      14       -0.704690  RND family efflux transporter MFP subunit
Sputcn32_2636  -1      11       -0.889210  acriflavin resistance protein
Sputcn32_2637  -1      16        1.661202  methyl-accepting chemotaxis sensory transducer
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2629  NUCEPIMERASE  80     5e-19    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 79.8 bits (197), Expect = 5e-19
Identities = 43/245 (17%), Positives = 87/245 (35%), Gaps = 54/245 (22%)

Query: 6 TILITGGTGSFGQKYTKTILERY-----------------KPKRLIIFSRDELKQYEMQQ 48
L+TG G G +K +LE K RL + ++ + ++
Sbjct: 2 KYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHK--- 58

Query: 49 VFNAPCMRYFIGDVRDGERLKQAFKDVDF--VIHAAALKQVPAAEYNPMECIKTNIHGAE 106
D+ D E + F F V + V + NP +N+ G
Sbjct: 59 -----------IDLADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFL 107

Query: 107 NVIRAAISNNVKKVIALST---------------DKAASPINLYGATKLASDKLFVAANN 151
N++ N ++ ++ S+ D P++LY ATK A++ + ++
Sbjct: 108 NILEGCRHNKIQHLLYASSSSVYGLNRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSH 167

Query: 152 VVGDGKTRFSAVRYGNVVGSRGS---VVPFFKQLIANGATSLPITHPDMTRFWITLQDGV 208
+ G + +R+ V G G + F + + G + + M R + + D
Sbjct: 168 LYG---LPATGLRFFTVYGPWGRPDMALFKFTKAMLEGKSIDVYNYGKMKRDFTYIDDIA 224

Query: 209 DFVLK 213
+ +++
Sbjct: 225 EAIIR 229


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2630  SYCDCHAPRONE  31     0.013    Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 30.7 bits (69), Expect = 0.013
Identities = 18/99 (18%), Positives = 38/99 (38%), Gaps = 8/99 (8%)

Query: 727 TLVKILAHSDEYMPQ---YAYILKLQGKVQESINIY--LDYLEKYPSDTQTWVKLGLFMV 781
T+ + S + + Q A+ GK +++ ++ L L+ D++ ++ LG
Sbjct: 24 TIAMLNEISSDTLEQLYSLAFNQYQSGKYEDAHKVFQALCVLD--HYDSRFFLGLGACRQ 81

Query: 782 EINQIEPAHTAFSNAVNADPTNQVAQHYLTE-LTQLMTP 819
+ Q + A ++S D + E L Q
Sbjct: 82 AMGQYDLAIHSYSYGAIMDIKEPRFPFHAAECLLQKGEL 120


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
Sputcn32_2634  HTHTETR  69     9e-17    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 69.3 bits (169), Expect = 9e-17
Identities = 31/145 (21%), Positives = 55/145 (37%), Gaps = 5/145 (3%)

Query: 13 RSEQKRQQVLVAAIDLFCRQGFPHTSMDEVAKLAGVSKQTVYSHYGSKDELFVAAIE--S 70
+++ RQ +L A+ LF +QG TS+ E+AK AGV++ +Y H+ K +LF E
Sbjct: 8 EAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWELSE 67

Query: 71 KCVGHNLHDDLLNDPSQPEAALTQFALQFGEMIVSPEAITVFKACVAQSESHP---EVSR 127
+G + P P + L + + E V+ E + + V +
Sbjct: 68 SNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVGEMAVVQ 127

Query: 128 LFFEAGPQQIVGILADYLLAVEALG 152
+ + L
Sbjct: 128 QAQRNLCLESYDRIEQTLKHCIEAK 152


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits       Score  E-value  Comments
Sputcn32_2635  RTXTOXIND  42     3e-06    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 41.7 bits (98), Expect = 3e-06
Identities = 27/111 (24%), Positives = 48/111 (43%), Gaps = 9/111 (8%)

Query: 75 SGKLSELYVDSGTKVVQGQILAKLDTHLLEAERQEIQASLAQTQADVDLARSTLKRNLEL 134
+ + E+ V G V +G +L KL EA+ + Q+SL Q + + L R++EL
Sbjct: 104 NSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQ-TRYQILSRSIEL 162

Query: 135 KKSGYVS-------EQLLDENRSQLVSL-ESAKQRLMASQHANRLKLDKSQ 177
K + + + +E +L SL + ++ L LDK +
Sbjct: 163 NKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKR 213


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2636  ACRIFLAVINRP  379    e-117    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 379 bits (975), Expect = e-117
Identities = 208/1044 (19%), Positives = 435/1044 (41%), Gaps = 52/1044 (4%)

Query: 1 MIKAFVENGRLVSLVIALLIVAGLGAISSLPRTEDPHITNRFASVITSYPGASAERVEAL 60
M F+ ++ +L++AG AI LP + P I SV +YPGA A+ V+
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VTEVIENQLRRLEEIKLIQSTS-RPGVSVIQLELKDTVMETAPVWSR--ARDLLADAKAN 117
VT+VIE + ++ + + STS G I L + T P ++ ++ L A
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQS---GTDPDIAQVQVQNKLQLATPL 117

Query: 118 LPNGIQTPTLDDQIGYAYTAILSLVWNADTPIRADILNRYAKE-LQSRLRLLPGTDFVKL 176
LP +Q + + + +++ + + D ++ Y ++ L L G V+L
Sbjct: 118 LPQEVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQL 177

Query: 177 YGAPTEEILVQLNGSQMSQLQLTPSTIAQILTNADSKISAGEINNT------TFRALVEV 230
+GA + + L+ +++ +LTP + L + +I+AG++ T A +
Sbjct: 178 FGAQ-YAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIA 236

Query: 231 SGELDSQTRIRQVPLKIDTQGQIIRLGDIATVTRQSQTPADSIALVDGKQSVLVAVRMLD 290
+ +V L++++ G ++RL D+A V + + IA ++GK + + +++
Sbjct: 237 QTRFKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENY-NVIARINGKPAAGLGIKLAT 295

Query: 291 NTRVDLWQAQVNKVVDELSRDVPANITIQWLFEQNSYTSVRLGDLVINLLQGFIIILLVL 350
+ + EL P + + + ++ + + + ++V L + +++ LV+
Sbjct: 296 GANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVM 355

Query: 351 LLTLG-VRNAIIVAISLPLTALFTLACMKYINLPIHQMSVTGLVVALGIMVDNAIVIVDA 409
L L +R +I I++P+ L T A + I+ +++ G+V+A+G++VD+AIV+V+
Sbjct: 356 YLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVEN 415

Query: 410 IAQRRQ-QGMSRLAAVSQTLHHLWLPLAGSTITTILAFAPIVLMPGAAGEFVGGIAMSVI 468
+ + + A +++ + L G + F P+ G+ G +++++
Sbjct: 416 VERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIV 475

Query: 469 FALLGSYLISHTLIAGLAGRF---------SIDGKHDAWYQHGISMPLLSHYFQASLRFA 519
A+ S L++ L L G W+ +++ S+
Sbjct: 476 SAMALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDH--SVNHYTNSVGKI 533

Query: 520 LNRPIISAIAIGVIPALGFFASSKMTEQFFPPSDRDMFQIEIYLAPHVSLENTLNQV-QL 578
L + +I A ++ F P D+ +F I L + E T + Q+
Sbjct: 534 LGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQV 593

Query: 579 IDNTLRSINGITQVDWVVGGNSPSFYYNLTQRQQGAANYAQAMVK-----VTDFKRANAL 633
D L++ + + V G ++ + + Q A A +K D A A+
Sbjct: 594 TDYYLKNEKANVESVFTVNG------FSFSGQAQNAG-MAFVSLKPWEERNGDENSAEAV 646

Query: 634 IPELQQQLDS---AFPEVQVLVRKLEQGPPFNAPVELM-IFGSNLDTLRAIGDEIRLILS 689
I + +L F + +E G EL+ G D L +++ + +
Sbjct: 647 IHRAKMELGKIRDGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAA 706

Query: 690 KTP-DVLHTRATLSAGAPKVWLEVNEDASLMSGLSLTEIAKQIQMSTTGVIGGSILEQTE 748
+ P ++ R + LEV+++ + G+SL++I + I + G +++
Sbjct: 707 QHPASLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGR 766

Query: 749 SLPIRVRLSDENREQVNRLSEIQLVSPSGETVSLSALSHSEIHVSRGAIPRRNGQRVNTI 808
+ V+ + R + ++ + S +GE V SA + S + R NG I
Sbjct: 767 VKKLYVQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEI 826

Query: 809 EAYIVSGVLPAQVLNDVKDKIAQLTLPSGYRIEIGGESAKRNEAVGNLLSSVMLVVTLLL 868
+ G + +++ ++ LP+G + G S + + + V + ++
Sbjct: 827 QGEAAPGTSSGDAMALMENLASK--LPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVF 884

Query: 869 ATVVLSFNSFRLTAIILLSAMQSAGLGLLAVFVFGYPFGFPVIIGLLGLMGLAINAAIVI 928
+ + S+ + ++L LLA +F ++GLL +GL+ AI+I
Sbjct: 885 LCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILI 944

Query: 929 LAELEEIPAAR-LGDKDVIISTVSSCGRHIGSTTVTTVGGFIPLII---AGGGFWPPFAI 984
+ +++ G + + V R I T++ + G +PL I AG G I
Sbjct: 945 VEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGI 1004

Query: 985 AIAGGTLLTTLLSLVWVPTMYLLL 1008
+ GG + TLL++ +VP ++++
Sbjct: 1005 GVMGGMVSATLLAIFFVPVFFVVI 1028


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits         Score  E-value  Comments
Sputcn32_2637  IGASERPTASE  30     0.028    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 30.4 bits (68), Expect = 0.028
Identities = 36/243 (14%), Positives = 86/243 (35%), Gaps = 25/243 (10%)

Query: 393 QDSVESLEQQASKAQSIAKQNGEEAQALM---LQTDQIATAIEEMSTSIRDVANHAQDGA 449
+VE EQ A++ + ++ +EA++ + QT+++A + E +
Sbjct: 1048 SKTVEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVE 1107

Query: 450 EQSLQVDSAAKEGYNQQTKVVQDLLKLSQQLSNSHQSIEKVSQESEAISKVTEVINSIAE 509
++ AK + +V + ++S + S + E + T I
Sbjct: 1108 KE-----EKAKVETEKTQEVPKVTSQVSPKQEQSET--VQPQAEPARENDPTVNIKEPQS 1160

Query: 510 QTNLLA--LNAAIEAARAGEQGRGFAVVADEVRTLAQRTQSSILEISQTIEKLQTQ---- 563
QTN A A E + EQ + + ++ + +++ +Q ++
Sbjct: 1161 QTNTTADTEQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPK 1220

Query: 564 VKNATEQMAQSHQLGTTSATQGEITGEQLKEITRRIGELAISSRNIASATEQQSSVAQEI 623
++ + H + + + + + L ++T S N + + AQ +
Sbjct: 1221 NRHRRSVRSVPHNVEPATTSSNDRSTVALCDLT---------STNTNAVLSDARAKAQFV 1271

Query: 624 THN 626
N
Sbjct: 1272 ALN 1274


81  Sputcn32_2781  Sputcn32_2798  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_2781   0      18        2.996717  putative ABC transporter ATP-binding protein
Sputcn32_2782  -2      14        1.371245  LysR family transcriptional regulator
Sputcn32_2783  -2      15        0.793918  beta-lactamase domain-containing protein
Sputcn32_2784  -2      15        1.006681  putative protein-disulfide isomerase
Sputcn32_2785  -2      14        0.485455  ketosteroid isomerase-like protein
Sputcn32_2786  -2      12        1.889233  RND family efflux transporter MFP subunit
Sputcn32_2787  -2      12        2.273798  acriflavin resistance protein
Sputcn32_2788  -2      16        3.135381  Bcr/CflA subfamily drug resistance transporter
Sputcn32_2789  -2      16        3.237926  AraC family transcriptional regulator
Sputcn32_2790  -3      14        2.398249  hypothetical protein
Sputcn32_2791  -3      15        2.357881  hydrophobe/amphiphile efflux-1 (HAE1) family
Sputcn32_2792  -1      10        1.240730  RND family efflux transporter MFP subunit
Sputcn32_2793  -2      10       -0.610689  TetR family transcriptional regulator
Sputcn32_2794  -2      11       -1.175866  OsmC family protein
Sputcn32_2795  -3      11       -1.759625  extracellular solute-binding protein
Sputcn32_2796  -3      13       -1.059984  LysR family transcriptional regulator
Sputcn32_2797  -2      15       -0.495467  adenylosuccinate synthase
Sputcn32_2798  -2      15       -1.104853  putative CheW protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2781  PYOCINKILLER  30     0.028    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 30.1 bits (67), Expect = 0.028
Identities = 16/52 (30%), Positives = 26/52 (50%), Gaps = 1/52 (1%)

Query: 103 RLDEVYAAYAEPDADFDALAKEQGELEAIIQAQDAHNLEHILERAANALRLP 154
R++ + AA A +A A+EQ EA +A++ + RAAN +P
Sbjct: 203 RMNTLTAAKASIEAAAANKAREQAAAEAKRKAEEQAR-QQAAIRAANTYAMP 253


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2786  RTXTOXIND     49     1e-08    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 49.4 bits (118), Expect = 1e-08
Identities = 21/108 (19%), Positives = 47/108 (43%), Gaps = 3/108 (2%)

Query: 50 PLTQSISLIGKLA-AERAVVIAPQVTGKIKQIAVTSNQAVKKGQLLIELDDMKAQAAVAE 108
+ + GKL + R+ I P +K+I V ++V+KG +L++L + A+A +
Sbjct: 79 QVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLK 138

Query: 109 ANAYLNDEKRKLKEFEKLISRNAITQTEIDAQKASVDIAQARLTSAQA 156
+ L +L++ I +I ++ K + ++ +
Sbjct: 139 TQSSLLQA--RLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEV 184



Score = 48.7 bits (116), Expect = 2e-08
Identities = 41/246 (16%), Positives = 84/246 (34%), Gaps = 33/246 (13%)

Query: 51 LTQSISLIGKLAAERAVVIAPQVTGKIKQIAV-----TSNQAVKKGQLLIELDDMKAQAA 105
Q + K AER V+A ++ V ++ Q + + ++ +
Sbjct: 202 KYQKELNLDKKRAERLTVLA-RINRYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENK 260

Query: 106 VAEANAYLNDEKRKLKEFEKLISRNAITQTEIDAQ----------KASVDIAQ--ARLTS 153
EA L K +L++ E I + + + +I L
Sbjct: 261 YVEAVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAK 320

Query: 154 AQADLHYHSLIAPFAGKT-GLINFSEGKMVSVGTELMTL-DDLSSMRLDLQVPEHYLAQL 211
+ + AP + K L +EG +V+ LM + + ++ + V + +
Sbjct: 321 NEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFI 380

Query: 212 SIGMPVSATSRAWPGETF---MGKVVAIDP-RVNEETLNL--KIRVQFD-------NPKD 258
++G A+P + +GKV I+ + ++ L L + + + N
Sbjct: 381 NVGQNAIIKVEAFPYTRYGYLVGKVKNINLDAIEDQRLGLVFNVIISIEENCLSTGNKNI 440

Query: 259 RLKPGM 264
L GM
Sbjct: 441 PLSSGM 446


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2787  ACRIFLAVINRP  778    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 778 bits (2011), Expect = 0.0
Identities = 310/1032 (30%), Positives = 520/1032 (50%), Gaps = 28/1032 (2%)

Query: 3 LSDVSVKRPVVAIVLSLLLCVFGFVSFTKLSVREMPDVESPVVTISTSYSGASASIMESQ 62
+++ ++RP+ A VL+++L + G ++ +L V + P + P V++S +Y GA A ++
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 63 ITKTLEDELTGISGIDEITSTT-RNGSSRITVKFLLGWNLTEGVSDVRDAVARAQRRLPE 121
+T+ +E + GI + ++ST+ GS IT+ F G + V++ + A LP+
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 122 DAKDPIVSKDNGSGEPSVYVNLSSSIMDRTQ--LTDYAQRVLEDRFSLISGVSSISISGG 179
+ + +S + S + S TQ ++DY ++D S ++GV + + G
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 180 LYKVMYVKLRPEQMAGRNVTVTDITNALRKENVETPGGQVRNDTTV------MSVRTKRL 233
Y + + L + + +T D+ N L+ +N + GQ+ + S+ +
Sbjct: 181 QYAMR-IWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 234 YYTPKDFDYLVVRTASDGTPIYLKDVADVAVGAQNENSTFKSDGIVNLSLGIITQSDANP 293
+ P++F + +R SDG+ + LKDVA V +G +N N + +G LGI + AN
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANA 299

Query: 294 LVVAQEVHKEVDRVQNFLPEGTSLVVDFDSTVFIDRSINEVYNTLFVTGALVVLVLYIFI 353
L A+ + ++ +Q F P+G ++ +D+T F+ SI+EV TLF LV LV+Y+F+
Sbjct: 300 LDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFL 359

Query: 354 GQARATLIPAVTVPVSLISAFIAANMFGYSINLLTLMALILAIGLVVDDAIVVVENIFHH 413
RATLIP + VPV L+ F FGYSIN LT+ ++LAIGL+VDDAIVVVEN+
Sbjct: 360 QNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERV 419

Query: 414 I-EKGEEPLLAAYKGTREVGFAVVATTAVLVMVFLPISFMEGMVGLLFTEFSVMLAVSVM 472
+ E P A K ++ A+V VL VF+P++F G G ++ +FS+ + ++
Sbjct: 420 MMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMA 479

Query: 473 FSSLIALTLTPVLSSKLLKANVK-----PNRFNRWVDSGFARMEKVYRAAVSRAIQFRLI 527
S L+AL LTP L + LLK F W ++ F Y +V + +
Sbjct: 480 LSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGR 539

Query: 528 APLVILACIGGSAWLMQQVPSQLAPQEDRGVLYAFVKGAEGTSYNRMTANMDIVEDRLMP 587
L+ + G L ++PS P+ED+GV ++ G + R +D V D +
Sbjct: 540 YLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLK 599

Query: 588 LLGQGVLRSFSVQAPAFGGRAGDQTGFVIMQLEDWEHRDVTAQQALGIISSA---LKDIP 644
V F+V +F G+ G + L+ WE R+ A +I A L I
Sbjct: 600 NEKANVESVFTVNGFSFSGQ-AQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIR 658

Query: 645 DVMVRPM-MPGFRGQ-SSEPVQFVL---GGSDYTELFKWAQILKEEANASP-MMEGADLD 698
D V P MP ++ F L G + L + L A P + +
Sbjct: 659 DGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPN 718

Query: 699 YAETTPELIVTVDKERAAELGISVDEVSQTLEVMLGGRTETTYVDRGEEYDVYLRGDENS 758
E T + + VD+E+A LG+S+ +++QT+ LGG ++DRG +Y++ D
Sbjct: 719 GLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKF 778

Query: 759 FNNVGDLSQIYMRSAKGELVTLDTVTHIEEVASAQKLSHTNKQKSITLKANISEGYTLGE 818
D+ ++Y+RSA GE+V T V + +L N S+ ++ + G + G+
Sbjct: 779 RMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGD 838

Query: 819 SLAFLENKAVELLPKDISVGYTGESKEFKENQSSILIVFGLALLVAYLVLAAQFESFINP 878
++A +EN A + LP I +TG S + + + + + ++ +V +L LAA +ES+ P
Sbjct: 839 AMALMENLASK-LPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIP 897

Query: 879 LVVMFTVPMGVFGGFLGLLITSQGINIYSQIGMIMLIGMVTKNGILIVEFANQLRDR-GF 937
+ VM VP+G+ G L + +Q ++Y +G++ IG+ KN ILIVEFA L ++ G
Sbjct: 898 VSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGK 957

Query: 938 ELDKAIIDASTRRLRPILMTAFTTLVGAIPLIFSTGAGSESRIAVGTVVFFGMAFATFVT 997
+ +A + A RLRPILMT+ ++G +PL S GAGS ++ AVG V GM AT +
Sbjct: 958 GVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLA 1017

Query: 998 LFVIPAMYRLIS 1009
+F +P + +I
Sbjct: 1018 IFFVPVFFVVIR 1029


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2788  TCRTETB       58     3e-11    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 57.6 bits (139), Expect = 3e-11
Identities = 41/178 (23%), Positives = 86/178 (48%), Gaps = 10/178 (5%)

Query: 12 LLMIFPQAMETIYSPALPNIAENFAVSVGGASQTLSVYFIAFAIGVFCWGRLADIIGRRK 71
+L F E + + +LP+IA +F + + + + F+IG +G+L+D +G ++
Sbjct: 21 ILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKR 80

Query: 72 AMLAGLVCYAIGSALALMV-NDFSLLLLARVLSAFGAA----VGSVITQTMMRDSYSGEE 126
+L G++ GS + + + FSLL++AR + GAA + V+ + G+
Sbjct: 81 LLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRGKA 140

Query: 127 LAKVFSVMGMSLGISPVIGLLLGSVLSAYWGYQGVFVALMVSAIVLLFLSIKSLPETK 184
+ S++ M G+ P IG ++ + +W Y + + + I+ + +K L +
Sbjct: 141 FGLIGSIVAMGEGVGPAIGGMIAHYI--HWSY---LLLIPMITIITVPFLMKLLKKEV 193


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2791  ACRIFLAVINRP  1036   0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 1036 bits (2681), Expect = 0.0
Identities = 414/1043 (39%), Positives = 637/1043 (61%), Gaps = 18/1043 (1%)

Query: 2 LSQFFIKRPIFAAVLSLLFLITGAIAVWQLPITEYPEVVPPTVVVTANYPGANPKVIAET 61
++ FFI+RPIFA VL+++ ++ GA+A+ QLP+ +YP + PP V V+ANYPGA+ + + +T
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 62 VASPLEQEINGVEDMLYMSSQATSDGRMTLTITFAIGTDVDRAQTQVQSRVDRAMPRLPQ 121
V +EQ +NG+++++YMSS + S G +T+T+TF GTD D AQ QVQ+++ A P LPQ
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 122 EVQRLGIVTEKSSPDLTMVVHLLSPDNRYDMLYLSNYAALNVKDELARIKGVGAVRLFGA 181
EVQ+ GI EKSS MV +S + +S+Y A NVKD L+R+ GVG V+LFG
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFG- 179

Query: 182 GEYSLRIWLDPNKVSALGLSPADIIAAVREQNQQAAAGSLGAQPSGSA-DFQLLINVKGR 240
+Y++RIWLD + ++ L+P D+I ++ QN Q AAG LG P+ I + R
Sbjct: 180 AQYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 241 LTELSEFEDIIIKVGQNAEVIRLKDVARVELGATSYALRSLLDNKDAVAIPVFQASGSNA 300
EF + ++V + V+RLKDVARVELG +Y + + ++ K A + + A+G+NA
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANA 299

Query: 301 IQISDDVRAEMARLAKSFPEGLQYEIVYDPTVFVRGSIEAVVKTLLEAVLLVVLVVVLFL 360
+ + ++A++A L FP+G++ YD T FV+ SI VVKTL EA++LV LV+ LFL
Sbjct: 300 LDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFL 359

Query: 361 QTWRASIIPLVAVPVSLVGTFAFMHLLGFSLNALSLFGLVLAIGIVVDDAIVVVENVERN 420
Q RA++IP +AVPV L+GTFA + G+S+N L++FG+VLAIG++VDDAIVVVENVER
Sbjct: 360 QNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERV 419

Query: 421 IA-AGLSPMAATQKAMKEVTGPIVATTLVLAAVFIPTAFMSGLTGQFYKQFALTITISTF 479
+ L P AT+K+M ++ G +V +VL+AVFIP AF G TG Y+QF++TI +
Sbjct: 420 MMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMA 479

Query: 480 ISAINSLTLSPALSALLLKSHDAPKDGLTRLMDKLFGAWLFVPFNRLFNRASDGYGYLVR 539
+S + +L L+PAL A LLK A F FN F+ + + Y V
Sbjct: 480 LSVLVALILTPALCATLLKPVSAEHHENKG--------GFFGWFNTTFDHSVNHYTNSVG 531

Query: 540 KVIRFGGIIGLVYLGMVALTGVQFANTPTGYVPGQDKQYLVAFAQLPDAASLERTDTVIK 599
K++ G L+Y +VA V F P+ ++P +D+ + QLP A+ ERT V+
Sbjct: 532 KILGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLD 591

Query: 600 KMSEIALNH--PGVAHSIAFPGLSINGFTNSPNSGVVFVALDDFELRKSPELSANAIAGQ 657
++++ L + V G S +G + N+G+ FV+L +E R E SA A+ +
Sbjct: 592 QVTDYYLKNEKANVESVFTVNGFSFSG--QAQNAGMAFVSLKPWEERNGDENSAEAVIHR 649

Query: 658 LNQQFADIQDAFIAIFPPPPVQGLGTIGGFRLQIQDRANLGYEALYQVTQQVMYKAWADP 717
+ I+D F+ F P + LGT GF ++ D+A LG++AL Q Q++ A P
Sbjct: 650 AKMELGKIRDGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHP 709

Query: 718 -QLAGIFSSYQVNVPQLELDINRTKAKQQAVSLDQIFQTLQTYMGSTYVNDFNRFGRTYQ 776
L + + + Q +L++++ KA+ VSL I QT+ T +G TYVNDF GR +
Sbjct: 710 ASLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKK 769

Query: 777 VNMQADEAFRQSPQQISQLKVPNVNGDMIPLGSFINVSQSAGPDRVMHYNGFTTAEINGS 836
+ +QAD FR P+ + +L V + NG+M+P +F G R+ YNG + EI G
Sbjct: 770 LYVQADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGE 829

Query: 837 PAPGVSTGQAQAAIEKILAETLPIGMTYEWTELTYQQILAGNTGLLVFPLVILLVFMVLA 896
APG S+G A A +E + ++ LP G+ Y+WT ++YQ+ L+GN + + ++VF+ LA
Sbjct: 830 AAPGTSSGDAMALMENLASK-LPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLA 888

Query: 897 AQYESLSLPLAIILIIPMTLLSALSGVLIYGGDNNIFTQIGLIVLVGLATKNAILIVEFA 956
A YES S+P++++L++P+ ++ L ++ N+++ +GL+ +GL+ KNAILIVEFA
Sbjct: 889 ALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFA 948

Query: 957 KEKQDH-GMAPMEAILEAARLRLRPILMTSIAFIMGVVPMVFSTGAGAEMRQAMGVAVFA 1015
K+ + G +EA L A R+RLRPILMTS+AFI+GV+P+ S GAG+ + A+G+ V
Sbjct: 949 KDLMEKEGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMG 1008

Query: 1016 GMIGVTVFGLILTPLFYYALAKR 1038
GM+ T+ + P+F+ + +
Sbjct: 1009 GMVSATLLAIFFVPVFFVVIRRC 1031



Score = 96.1 bits (239), Expect = 3e-22
Identities = 80/511 (15%), Positives = 176/511 (34%), Gaps = 47/511 (9%)

Query: 553 LGMVALTGVQFANTPTGYVPGQDKQYLVAFAQLPDAASLERTDTVIKKMSEIALNHPGVA 612
+ ++ + P P + A P A + DTV + + + +
Sbjct: 17 IILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTVTQVIEQNMNGIDNL- 75

Query: 613 HSIAFPGLSINGFTNSPNSGVVFVALDDFELRKSPELSANAIAGQLNQQFADIQDAFIAI 672
+ ++ ++S S + + F+ P+++ Q+ +
Sbjct: 76 -------MYMSSTSDSAGSVTITL---TFQSGTDPDIAQV----QVQNKLQLATPLLPQE 121

Query: 673 FPPPPVQGLGTIGGFRLQIQDRANLGYEALYQVTQQVMYKAWAD----PQLAGIFSSYQV 728
+ + + + G+ + T Q + L+ + V
Sbjct: 122 VQQQGISVEKSSSSYLMVA------GFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDV 175

Query: 729 NV----PQLELDINRTKAKQQAVSLDQIFQTLQT----YMGSTYVNDFNRFGRTYQVNMQ 780
+ + + ++ + ++ + L+ G+ ++
Sbjct: 176 QLFGAQYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASII 235

Query: 781 ADEAFRQSPQQISQLKVP-NVNGDMIPLGSFINVSQSAGPDRVM-HYNGFTTAEINGSPA 838
A F ++P++ ++ + N +G ++ L V V+ NG A + A
Sbjct: 236 AQTRF-KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLA 294

Query: 839 PGVSTGQAQAAIEKILAE---TLPIGM----TYEWTELTYQQILAGNTGLLVFPLVILLV 891
G + AI+ LAE P GM Y+ T I L I+LV
Sbjct: 295 TGANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLF---EAIMLV 351

Query: 892 FMVLAAQYESLSLPLAIILIIPMTLLSALSGVLIYGGDNNIFTQIGLIVLVGLATKNAIL 951
F+V+ +++ L + +P+ LL + + +G N T G+++ +GL +AI+
Sbjct: 352 FLVMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIV 411

Query: 952 IVE-FAKEKQDHGMAPMEAILEAARLRLRPILMTSIAFIMGVVPMVFSTGAGAEMRQAMG 1010
+VE + + + P EA ++ ++ ++ +PM F G+ + +
Sbjct: 412 VVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFS 471

Query: 1011 VAVFAGMIGVTVFGLILTPLFYYALAKRGSK 1041
+ + + M + LILTP L K S
Sbjct: 472 ITIVSAMALSVLVALILTPALCATLLKPVSA 502


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2792  RTXTOXIND     44     8e-07    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 44.0 bits (104), Expect = 8e-07
Identities = 19/123 (15%), Positives = 46/123 (37%), Gaps = 5/123 (4%)

Query: 64 SVTLIPRVSGYIESVNFKEGALVKKGDVLFRIDPSVFEVEVARLKADLASAISAE---QL 120
S + P + ++ + KEG V+KGDVL ++ E + + ++ L A + Q+
Sbjct: 96 SKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQI 155

Query: 121 ATNDLERARKLFDQKAVSAELLDTRESNKRQTAAAVASVKAALMR--AELDLAYTQVQAP 178
+ +E + + + E + + + + + +L + +A
Sbjct: 156 LSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAE 215

Query: 179 IDG 181

Sbjct: 216 RLT 218



Score = 41.7 bits (98), Expect = 4e-06
Identities = 21/100 (21%), Positives = 41/100 (41%), Gaps = 10/100 (10%)

Query: 103 EVARLKADLASAISAEQLATNDLERARKLFDQKAVSAELLDTRESNKRQTAAAVASVKAA 162
E+ K+ L S A + + +LF + + +L RQT + +
Sbjct: 267 ELRVYKSQLEQIESEILSAKEEYQLVTQLF-KNEILDKL--------RQTTDNIGLLTLE 317

Query: 163 LMRAELDLAYTQVQAPIDGRVSYANVTT-GNYVTAGQSVL 201
L + E + ++AP+ +V V T G VT ++++
Sbjct: 318 LAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAETLM 357


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2793  HTHTETR       62     7e-14    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 61.6 bits (149), Expect = 7e-14
Identities = 40/207 (19%), Positives = 73/207 (35%), Gaps = 10/207 (4%)

Query: 12 GRPRAFDTEDA-LAKALEVFWRKGFEGTSLTDLTQAMGINKPSLYAAFGNKEQLFLKAIE 70
+ A +T L AL +F ++G TSL ++ +A G+ + ++Y F +K LF + E
Sbjct: 5 TKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSEIWE 64

Query: 71 LYEQRPCAFFYPALEKETAYL--VVESMLLGAADSLVDKSHPQGCLIVQGALTCSEAGQA 128
L E K V+ +L+ +S V + + + + C G+
Sbjct: 65 LSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHK-CEFVGEM 123

Query: 129 IKDTLINRRRDGEI--ALCERLQRAKDEGDLPADADPLLLARYIGTVLQGMAVQA----T 182
R E + + L+ + LPAD A + + G+
Sbjct: 124 AVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLFAPQ 183

Query: 183 NGICPNELRQVAELTLANFPRNHINHN 209
+ E R + L + N
Sbjct: 184 SFDLKKEARDYVAILLEMYLLCPTLRN 210


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2798  HTHFIS        72     3e-16    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 72.2 bits (177), Expect = 3e-16
Identities = 31/116 (26%), Positives = 52/116 (44%), Gaps = 15/116 (12%)

Query: 195 TILIVDDSAFIRKMIENTLRSAGYNIITAKDGGDALEMLMEFETLADQDNASISDFVSAI 254
TIL+ DD A IR ++ L AGY++ + + + +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIA-------------AGDGDLV 51

Query: 255 ITDVEMPRMDGMHLVKRLRESKAYRQMPIVMFSSLMSEDNRIKAISLGANDTITKP 310
+TDV MP + L+ R++ KA +P+++ S+ + IKA GA D + KP
Sbjct: 52 VTDVVMPDENAFDLLPRIK--KARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKP 105


82  Sputcn32_2834  Sputcn32_2845  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_2834  4       21       1.992218   translation initiation factor IF-2
Sputcn32_2835  1       12       1.335307   transcription elongation factor NusA
Sputcn32_2836  0       14       1.614009   hypothetical protein
Sputcn32_2837  1       15       1.874707   **preprotein translocase subunit SecG
Sputcn32_2838  1       15       1.993894   triosephosphate isomerase
Sputcn32_2839  1       17       1.811138   phosphoglucosamine mutase
Sputcn32_2840  0       15       1.799909   dihydropteroate synthase
Sputcn32_2841  0       15       1.648177   ATP-dependent metalloprotease FtsH
Sputcn32_2842  -3      15       1.044257   23S rRNA methyltransferase J
Sputcn32_2843  -2      16       0.311001   hypothetical protein
Sputcn32_2844  -3      15       0.544802   preprotein translocase subunit SecF
Sputcn32_2845  -2      13       0.197159   preprotein translocase subunit SecD
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2834  TCRTETOQM     73     4e-15    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 72.6 bits (178), Expect = 4e-15
Identities = 52/202 (25%), Positives = 78/202 (38%), Gaps = 30/202 (14%)

Query: 387 IMGHVDHGKTSLLDYIRRAKVAAGEAG------------------GITQHIGAYHVETEN 428
++ HVD GKT+L + + A E G GIT G + EN
Sbjct: 8 VLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGITSFQWEN 67

Query: 429 GMITFLDTPGHAAFTAMRARGAKATDIVVLVVAADDGVMPQTIEAIQHAKAGNVPLIVAV 488
+ +DTPGH F A R D +L+++A DGV QT + +P I +
Sbjct: 68 TKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMGIPTIFFI 127

Query: 489 NKMDKPEADIDRV----KSELSQHGVMS-------EDWGGDNMFAFVSAKTGEGVDDLLE 537
NK+D+ D+ V K +LS V+ + + EG DDLLE
Sbjct: 128 NKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIEGNDDLLE 187

Query: 538 GILLQAEVLELKAVRDGMAAGV 559
+ + LE + +
Sbjct: 188 K-YMSGKSLEALELEQEESIRF 208


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2837  SECGEXPORT    120    7e-39    Protein-export SecG membrane protein signature.
		>SECGEXPORT#Protein-export SecG membrane protein signature.

Length = 110

Score = 120 bits (301), Expect = 7e-39
Identities = 62/111 (55%), Positives = 82/111 (73%), Gaps = 1/111 (0%)

Query: 1 MYEVLLVIYLLVALGLIGLVLIQQGKGADMGASFGAGASGTLFGSSGSGNFLTRTTAILA 60
MYE LLV++L+VA+GL+GL+++QQGKGADMGASFGAGAS TLFGSSGSGNF+TR TA+LA
Sbjct: 1 MYEALLVVFLIVAIGLVGLIMLQQGKGADMGASFGAGASATLFGSSGSGNFMTRMTALLA 60

Query: 61 IAFFTLSLIIGNLSANHAKNEDSWKNLGSDTEQVTQPVQDGTEKSETKIPD 111
FF +SL++GN+++N W+NL S + Q K + IP+
Sbjct: 61 TLFFIISLVLGNINSNKTNKGSEWENL-SAPAKTEQTQPAAPAKPTSDIPN 110


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2838  adhesinb      31     0.003    Adhesin B signature.
		>adhesinb#Adhesin B signature.

Length = 310

Score = 31.4 bits (71), Expect = 0.003
Identities = 18/95 (18%), Positives = 34/95 (35%), Gaps = 16/95 (16%)

Query: 142 REARRTFEVIAEELDIVIQKNGTMAFDNAIIAY----EPLWAVGTGKSATPEQAQEVHAF 197
+EA+ F I E +++ G F AY +W + T + TP+Q + +
Sbjct: 186 KEAKEKFNNIPGEKKMIVTSEG--CFKYFSKAYNVPSAYIWEINTEEEGTPDQIKTLVEK 243

Query: 198 IRKRLSEVSPFIGENIRILYGGSVTPSNAADLFAQ 232
+RK + L+ S ++
Sbjct: 244 LRKT----------KVPSLFVESSVDDRPMKTVSK 268


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2841  HTHFIS        34     0.002    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 34.0 bits (78), Expect = 0.002
Identities = 23/82 (28%), Positives = 32/82 (39%), Gaps = 18/82 (21%)

Query: 198 VLMVGPPGTGKTLLAKAIAGESK---VPFFT-----ISGSDFVEMFVGV------GASRV 243
+++ G GTGK L+A+A+ K PF I G GA
Sbjct: 163 LMITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTR 222

Query: 244 RD-MFEQAKKSAPCIIFIDEID 264
FEQA+ +F+DEI
Sbjct: 223 STGRFEQAEGGT---LFLDEIG 241


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2844  SECFTRNLCASE  235    4e-78    Bacterial translocase SecF protein signature.
		>SECFTRNLCASE#Bacterial translocase SecF protein signature.

Length = 333

Score = 235 bits (600), Expect = 4e-78
Identities = 91/298 (30%), Positives = 150/298 (50%), Gaps = 20/298 (6%)

Query: 13 WRYISSAISIFLMLASLTIIGVKGFNWGLDFTGGVVTEVQIDRKITSSELQPLLNAAYQQ 72
W++ + +I +M+AS+ + V G N+G+DF GG + I + L
Sbjct: 19 WQWATFGAAIVMMIASVILPLVIGLNFGIDFKGGTTIRTESTTAIDVGVYRAALEPLELG 78

Query: 73 EVSVVSASEP--------------------GRWVLRYADTAQSNVDIAQTLAPLGEIQVL 112
+V + +P G N A +++
Sbjct: 79 DVIISEVRDPSFREDQHVAMIRIQMQEDGQGAEGQGAQGQELVNKVETALTAVDPALKIT 138

Query: 113 NTSIVGPQVGKELAEQGGLALLVAMLCILGYLSYRFEWRLASGALFALVHDVIFVLAFFA 172
+ VGP+V EL +LL A + I+ Y+ RFEW+ A GA+ ALVHDV+ + FA
Sbjct: 139 SFESVGPKVSGELVWTAVWSLLAATVVIMFYIWVRFEWQFALGAVVALVHDVLLTVGLFA 198

Query: 173 LTQMEFNLTVLAAVLAILGYSLNDSIIIADRIRELLIAKPKLAIQEINNQAIVATFSRTM 232
+ Q++F+LT +AA+L I GYS+ND++++ DR+RE LI + ++++ N ++ T SRT+
Sbjct: 199 VLQLKFDLTTVAALLTITGYSINDTVVVFDRLRENLIKYKTMPLRDVMNLSVNETLSRTV 258

Query: 233 VTSGTTLMTVGALWIMGGGPLEGFSIAMFIGILTGTFSSISVGTSLPELLGLSPEHYK 290
+T TTL+ + + I GG + GF AM G+ TGT+SS+ V ++ +GL K
Sbjct: 259 MTGMTTLLALVPMLIWGGDVIRGFVFAMVWGVFTGTYSSVYVAKNIVLFIGLDRNKEK 316


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2845  SECFTRNLCASE  78     8e-18    Bacterial translocase SecF protein signature.
		>SECFTRNLCASE#Bacterial translocase SecF protein signature.

Length = 333

Score = 78.3 bits (193), Expect = 8e-18
Identities = 30/172 (17%), Positives = 82/172 (47%), Gaps = 4/172 (2%)

Query: 422 VTIVEERTIGPTLGAENIENGFAALGLGMGITLLFMALWYR-RLGWVANIALISNMVILF 480
+ I ++GP + E + +L + + ++ + + + A +AL+ ++++
Sbjct: 135 LKITSFESVGPKVSGELVWTAVWSLLAATVVIMFYIWVRFEWQFALGAVVALVHDVLLTV 194

Query: 481 GLLALIPGAVLTLPGIAGLVLTVGMAVDTNVLIFERIKDKLKEGRSFALA--IDTGFDSA 538
GL A++ L +A L+ G +++ V++F+R+++ L + ++ L ++ +
Sbjct: 195 GLFAVL-QLKFDLTTVAALLTITGYSINDTVVVFDRLRENLIKYKTMPLRDVMNLSVNET 253

Query: 539 FSTIFDANFTTMITAVVLYSIGNGPIQGFALTLGLGLLTSMFTGIFASRALI 590
S TT++ V + G I+GF + G+ T ++ ++ ++ ++
Sbjct: 254 LSRTVMTGMTTLLALVPMLIWGGDVIRGFVFAMVWGVFTGTYSSVYVAKNIV 305


83  Sputcn32_2873  Sputcn32_2879  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_2873  -2      17       -1.094638  serine-type D-Ala-D-Ala carboxypeptidase
Sputcn32_2874  -2      16       0.275458   hypothetical protein
Sputcn32_2875  -2      13       0.532467   lipoate-protein ligase B
Sputcn32_2876  -1      13       0.081720   lipoyl synthase
Sputcn32_2877  -1      15       -0.430607  ribosomal-protein-alanine acetyltransferase
Sputcn32_2878  -1      16       -0.453798  hypothetical protein
Sputcn32_2879  0       16       -1.433123  Ferritin, Dps family protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2873  BLACTAMASEA   29     0.026    Beta-lactamase class A signature.
		>BLACTAMASEA#Beta-lactamase class A signature.

Length = 286

Score = 29.0 bits (65), Expect = 0.026
Identities = 21/109 (19%), Positives = 39/109 (35%), Gaps = 3/109 (2%)

Query: 37 AAKAYVLMDYYSGQIIAESNAYESLNPASLTKMMTSYVIGQEIKAGNVSPEDDVTISKNA 96
+ MD SG+ + A E S K++ + + AG+ E + +
Sbjct: 38 GRVGMIEMDLASGRTLTAWRADERFPMMSTFKVVLCGAVLARVDAGDEQLERKIHYRQQD 97

Query: 97 WSKNFSDSSKMFIEVGKTVKVADLNRGIIIQSGNDACVAMAEHIAGTEG 145
++S S+ + + V +L I S N A + + G G
Sbjct: 98 L-VDYSPVSEKH--LADGMTVGELCAAAITMSDNSAANLLLATVGGPAG 143


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2877  SACTRNSFRASE  50     4e-10    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 49.6 bits (118), Expect = 4e-10
Identities = 18/53 (33%), Positives = 29/53 (54%)

Query: 84 DICLAPEHQGYGYGKLLLSEVIEAAKTSGAVVVMLEVRESNLAARTLYQTMGF 136
DI +A +++ G G LL + IE AK + +MLE ++ N++A Y F
Sbjct: 94 DIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHHF 146


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2878  BCTERIALGSPF  29     0.006    Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.

Length = 408

Score = 28.6 bits (64), Expect = 0.006
Identities = 25/103 (24%), Positives = 43/103 (41%), Gaps = 7/103 (6%)

Query: 10 MAITSWRIRDTKVKPYQVIWDADGALPQAQTLIEQVLELIGVTPDECDFDCQIHKGKQII 69
MA ++ D + K + +AD A Q L E+ L + V + D Q G +
Sbjct: 1 MAQYHYQALDAQGKKCRGTQEADSARQARQLLRERGLVPLSVDENRGD---QQKSGSTGL 57

Query: 70 WDLRRHKVRPRTAWLVSEPLASLLVGS----EAKRALWSQICQ 108
R+ ++ L++ LA+L+ S EA A+ Q +
Sbjct: 58 SLRRKIRLSTSDLALLTRQLATLVAASMPLEEALDAVAKQSEK 100


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_2879  HELNAPAPROT   143    7e-47    Helicobacter neutrophil-activating protein A family ...
		>HELNAPAPROT#Helicobacter neutrophil-activating protein A family signature.

Length = 153

Score = 143 bits (362), Expect = 7e-47
Identities = 53/147 (36%), Positives = 77/147 (52%)

Query: 8 QTNREEIAAGLNQLLADSYSLYLKTHSFHWNVTGPMFTSLHLLFEQQYTELALAVDLIAE 67
+TN+ + LN L++ + LY K H FHW V GP F +LH FE+ Y A VD IAE
Sbjct: 7 KTNQTLVENSLNTQLSNWFLLYSKLHRFHWYVKGPHFFTLHEKFEELYDHAAETVDTIAE 66

Query: 68 RVRALGARALGSYSAYASLTEIKEDQGVTKAETMIRELLNDQEIVIRNARALYPLVSKAN 127
R+ A+G + + + Y I + T A M++ L+ND + + ++ + L +
Sbjct: 67 RLLAIGGQPVATVKEYTEHASITDGGNETSASEMVQALVNDYKQISSESKFVIGLAEENQ 126

Query: 128 DEATADLLTQRIQLHEKNAWMLRSLLA 154
D ATADL I+ EK WML S L
Sbjct: 127 DNATADLFVGLIEEVEKQVWMLSSYLG 153


84  Sputcn32_3064  Sputcn32_3069  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_3064  -2      14       2.692777   sulfite reductase (NADPH) flavoprotein, alpha
Sputcn32_3065  -2      13       2.283675   NAD(P) transhydrogenase subunit alpha
Sputcn32_3066  -2      11       1.313030   pyridine nucleotide transhydrogenase
Sputcn32_3067  0       10       0.573462   TetR family transcriptional regulator
Sputcn32_3068  0       11       0.737998   hypothetical protein
Sputcn32_3069  0       12       1.249680   bifunctional heptose 7-phosphate kinase/heptose
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3064  TACYTOLYSIN   31     0.014    Bacterial thiol-activated pore-forming cytolysin sig...
		>TACYTOLYSIN#Bacterial thiol-activated pore-forming cytolysin signature.

Length = 574

Score = 31.1 bits (70), Expect = 0.014
Identities = 20/108 (18%), Positives = 36/108 (33%), Gaps = 8/108 (7%)

Query: 229 KQNPYSAEVLVSQKITGRGSDRDVRHVEIDLGDSGLTYQAGDALGVWFSNNEALVEEILT 288
K+ P + +K + EI+ L Y + L V N E + +
Sbjct: 84 KEMPLESAEKEEKKSEDNKKSEEDHTEEINDKIYSLNY---NELEVLAKNGETIENFVPK 140

Query: 289 ALSLSGDEQVVVEKESLTLKQALVDKKEL----TQLYPG-LVKAWAEL 331
D+ +V+E++ + VD + + YP L A
Sbjct: 141 EGVKKADKFIVIERKKKNINTTPVDISIIDSVTDRTYPAALQLANKGF 188


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3065  PYOCINKILLER  30     0.026    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 30.1 bits (67), Expect = 0.026
Identities = 15/37 (40%), Positives = 18/37 (48%), Gaps = 5/37 (13%)

Query: 364 EVTFP-----PPPISVSAAPAKPVAKIEPKSTTPKAP 395
EVT P PP+ ++ PA P P STTP P
Sbjct: 399 EVTVPSTTAEAPPLILTWTPASPPGNQNPSSTTPVVP 435


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3067  HTHTETR       48     1e-09    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 48.5 bits (115), Expect = 1e-09
Identities = 24/146 (16%), Positives = 50/146 (34%), Gaps = 13/146 (8%)

Query: 1 MAKRSRVQTEQTINQIMDEALKQILTIGFETMSYTTLSEATGISRTGISHHFPRKSDFLV 60
MA++++ + ++T I+D AL+ G + S +++A G++R I HF KSD
Sbjct: 1 MARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFS 60

Query: 61 RLDSRIGNLFVAALDFSSQEALETSWMQAMQEEHYRAVLRLFFSLCGGANNEITLFRAVS 120
+ ++ + E + R +L L +
Sbjct: 61 EI------WELSESNIGELELEYQAKFPGDPLSVLREILIHVLEST-VTEERRRLLMEII 113

Query: 121 SARQQAISELGLAGDRTINHLLGRTA 146
+ + G+ + R
Sbjct: 114 FHKCEF------VGEMAVVQQAQRNL 133


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3069  LPSBIOSNTHSS  31     0.006    Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein signature.

Length = 166

Score = 30.9 bits (70), Expect = 0.006
Identities = 10/28 (35%), Positives = 17/28 (60%)

Query: 348 GCFDILHAGHVSYLKQAKALGDRLIVAV 375
G FD + GH+ +++ L D++ VAV
Sbjct: 7 GSFDPITFGHLDIIERGCRLFDQVYVAV 34


85  Sputcn32_3216  Sputcn32_3221  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_3216  -1      16       1.989000   5-formyltetrahydrofolate cyclo-ligase
Sputcn32_3217  -2      16       2.191759   short chain dehydrogenase
Sputcn32_3218  -1      16       1.057677   putative thiol-disulfide oxidoreductase DCC
Sputcn32_3219  -2      12       0.572660   malate dehydrogenase
Sputcn32_3220  -2      13       0.007257   arginine repressor
Sputcn32_3221  -2      14       -0.071081  NAD-dependent epimerase/dehydratase
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3216  GPOSANCHOR    29     0.014    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 29.3 bits (65), Expect = 0.014
Identities = 16/47 (34%), Positives = 25/47 (53%), Gaps = 4/47 (8%)

Query: 28 KASRNQLRKTIRAARNAL----SATEQNQASLCASQKMLNELQAKKA 70
+ASR LR+ + A+R A A E+ + L A +K+ EL+ K
Sbjct: 378 EASRQSLRRDLDASREAKKQVEKALEEANSKLAALEKLNKELEESKK 424


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3217  DHBDHDRGNASE  44     3e-07    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 43.5 bits (102), Expect = 3e-07
Identities = 49/254 (19%), Positives = 91/254 (35%), Gaps = 25/254 (9%)

Query: 5 IIITGVGKRIGYALAKHLLAQGHSVIG-----TYRSHYPSIDKLRVLGATLIQCDFYDNV 59
ITG + IG A+A+ L +QG + S K A D D+
Sbjct: 11 AFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRDSA 70

Query: 60 QVQGLIDQLGQYPKIRAIIHNASDWLADSSQTFTASEVIQRMMQVHVSVPYQLNLALASQ 119
+ + ++ + I+ N + L + E + V+ + + + +++
Sbjct: 71 AIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVSKY 130

Query: 120 LQAGAEEEIGASDIIHITDYVAEKGSAKHIAYAASKAALHNLTLSFAAKFAPE-VKVNSI 178
+ G+ I+ + A AYA+SKAA T + A ++ N +
Sbjct: 131 MMD---RRSGS--IVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIV 185

Query: 179 APAMI-------LFNQGDDEAYQQKTLAKAL-----LPKEAGNEEIIDLVEYLL--NSRY 224
+P L+ + K + L K A +I D V +L+ + +
Sbjct: 186 SPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGH 245

Query: 225 VTGRSHHVDGGRHL 238
+T + VDGG L
Sbjct: 246 ITMHNLCVDGGATL 259


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3220  ARGREPRESSOR  146    3e-48    Bacterial arginine repressor signature.
		>ARGREPRESSOR#Bacterial arginine repressor signature.

Length = 149

Score = 146 bits (371), Expect = 3e-48
Identities = 43/150 (28%), Positives = 71/150 (47%), Gaps = 5/150 (3%)

Query: 6 NQDDLVRIFKSILKEERFGSQSEIVAALQAEGFGNINQSKVSRMLSKFGAVRTRNAKQEM 65
N+ + I+ +Q E+V L+ +G+ N+ Q+ VSR + + V+
Sbjct: 2 NKGQRHIKIREIITANEIETQDELVDILKKDGY-NVTQATVSRDIKELHLVKVPTNNGSY 60

Query: 66 VYCLPAELGVPTAGSPLKNLV---LDVDHNQAMIVVRTSPGAAQLIARLLDSIGKPEGIL 122
Y LPA+ ++L+ + +D +IV++T PG AQ I L+D++ E I+
Sbjct: 61 KYSLPADQRFNPLSKLKRSLMDAFVKIDSASHLIVLKTMPGNAQAIGALMDNLDWEE-IM 119

Query: 123 GTIAGDDTIFICPSSIQDIADTLETIKSLF 152
GTI GDDTI I + D + I L
Sbjct: 120 GTICGDDTILIICRTHDDTKVVQKKILELL 149


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3221  NUCEPIMERASE  37     4e-05    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 36.7 bits (85), Expect = 4e-05
Identities = 27/124 (21%), Positives = 43/124 (34%), Gaps = 25/124 (20%)

Query: 1 MKIAILGATGWIGGAILKEALSRGHEVTAL-----VRDPS-------KLPTTNAAVRTVD 48
MK + GA G+IG + K L GH+V + D S L +D
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKID 60

Query: 49 L-NQPLVADSFTNQ--DVVI---AAIGGR------AAQNHEIVAGTATHLLAILPKAKVP 96
L ++ + D F + + V + R A + G +L K+
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLN-ILEGCRHNKIQ 119

Query: 97 RLLW 100
LL+
Sbjct: 120 HLLY 123


86  Sputcn32_3248  Sputcn32_3255  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_3248  -1      12       0.253749   sodium/hydrogen exchanger
Sputcn32_3249  0       15       1.180811   galactokinase
Sputcn32_3250  0       15       0.506188   aldose 1-epimerase
Sputcn32_3251  0       19       0.312576   hypothetical protein
Sputcn32_3252  0       16       0.169323   alcohol dehydrogenase
Sputcn32_3253  -1      15       -0.511639  hypothetical protein
Sputcn32_3254  -1      13       -0.135506  acriflavin resistance protein
Sputcn32_3255  0       12       -1.331681  RND family efflux transporter MFP subunit
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score  E-value  Comments
Sputcn32_3248  FLGHOOKAP1  34     0.001    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 34.2 bits (78), Expect = 0.001
Identities = 20/120 (16%), Positives = 44/120 (36%), Gaps = 16/120 (13%)

Query: 402 AYDELRNKFEGEILGIE---HKQELVDLHRANGRNVVKGDASDTDFWEKLDHAPNLELVL 458
D+L ++ +I+G+E ++ ANG ++V+G S + + +
Sbjct: 200 QRDQLVSEL-NQIVGVEVSVQDGGTYNITMANGYSLVQG--STARQLAAVPSSADPSRTT 256

Query: 459 LAMPHHAGNMFAVEQLKKLDYQGKISAIV--------QYSDDADALRASGVHSVYNLYEA 510
+A +E +KL G + I+ Q + L + + ++A
Sbjct: 257 VAYVDGTAG--NIEIPEKLLNTGSLGGILTFRSQDLDQTRNTLGQLALAFAEAFNTQHKA 314


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits       Score  E-value  Comments
Sputcn32_3249  RTXTOXINA  29     0.041    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 28.8 bits (64), Expect = 0.041
Identities = 11/41 (26%), Positives = 21/41 (51%), Gaps = 4/41 (9%)

Query: 124 LAAGLSSSGALVVAFGTAISDSSQLHLSPMAVAQLAQRGEH 164
A GLS+S A +A++ L +SP++ +A + +
Sbjct: 296 AAQGLSTSAAAAGLIASAVT----LAISPLSFLSIADKFKR 332


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3254  ACRIFLAVINRP  656    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 656 bits (1693), Expect = 0.0
Identities = 302/1032 (29%), Positives = 513/1032 (49%), Gaps = 44/1032 (4%)

Query: 8 IRHPIFASVLSIMAVLLGLIAFHKLDIQYFPEHTTHSASVNASIAGASADFMSSNVADKL 67
IR PIFA VL+I+ ++ G +A +L + +P + SV+A+ GA A + V +
Sbjct: 6 IRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTVTQVI 65

Query: 68 VAAASGLDKVDTM-STDCSEGRCSLTIKFKDDT-TDIEYTNLMNKLRSSVEGINDFPQSM 125
+G+D + M ST S G ++T+ F+ T DI + NKL+ + PQ
Sbjct: 66 EQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLAT---PLLPQE- 121

Query: 126 IDKPTVTDDTSATDSASNIITFVNTGGMEKQAMYDYISQQLVPQLKQVQGVGAVWGPYGG 185
+ + ++ + S++ + G + + DY++ + L ++ GVG V G
Sbjct: 122 VQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDV--QLFG 179

Query: 186 SQKAVRVWLNPEQMKALNIKAADVVGTLGSYNASFTSG------AIKGKSRDFSINPLNQ 239
+Q A+R+WL+ + + + DV+ L N +G A+ G+ + SI +
Sbjct: 180 AQYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTR 239

Query: 240 VETLEDVKDLVIKVS-DGKIIRVSDVAEVVMGEESLSPSLLSIGGHSAMSLQILPLSNAN 298
+ E+ + ++V+ DG ++R+ DVA V +G E+ + + I G A L I + AN
Sbjct: 240 FKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYN-VIARINGKPAAGLGIKLATGAN 298

Query: 299 PVTVASNIKAEIARMQKHLPQGLEMNLAYNQADFIEAAIDEGFAALIEAVILVSLIVVLF 358
+ A IKA++A +Q PQG+++ Y+ F++ +I E L EA++LV L++ LF
Sbjct: 299 ALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLF 358

Query: 359 LGSLRAASIPIITIPVCVIGVFAVMSALGFSINVLTILAIILAIGLVVDDAIVVVENCYR 418
L ++RA IP I +PV ++G FA+++A G+SIN LT+ ++LAIGL+VDDAIVVVEN R
Sbjct: 359 LQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVER 418

Query: 419 HI-ENGETPFNAAINGCQEIIFPIIAMTLTLAAVYLPIGLMSGLTADLFRQFSFTLAAAV 477
+ E+ P A +I ++ + + L+AV++P+ G T ++RQFS T+ +A+
Sbjct: 419 VMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAM 478

Query: 478 IISGMVALTLSPMMSAYLINTTEQQPK-----WFSRVEGALQQLNHLYINELSKWFTRKR 532
+S +VAL L+P + A L+ + +F + Y N + K
Sbjct: 479 ALSVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTG 538

Query: 533 QMLGMALVLMGLAGIAYWQLPKILLPVEDSGFIDVAANGPTGVGREYHLDHNSELNGVID 592
+ L + +++ + + +LP LP ED G P G +E ++
Sbjct: 539 RYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYL 598

Query: 593 GHPAVGANLSY------IEGEPVN----HVLLKPWGER---REGIDEVIADLIAKSKESV 639
+ + G+ N V LKPW ER + VI + +
Sbjct: 599 KNEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIR 658

Query: 640 SAYNMSFTIRSADNLNIATNVRLELTTLDRNK---DKLSETAAKVQKLMEDYPG-LTNVG 695
+ + F + + L AT EL +D+ D L++ ++ + +P L +V
Sbjct: 659 DGFVIPFNMPAIVELGTATGFDFEL--IDQAGLGHDALTQARNQLLGMAAQHPASLVSVR 716

Query: 696 NSVLRDQLRYDLSIDRNAIILSGVSYGDVTNALSTFLGSVKAADLHATDGFTYPIQVQVN 755
+ L D ++ L +D+ GVS D+ +ST LG D G + VQ +
Sbjct: 717 PNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDF-IDRGRVKKLYVQAD 775

Query: 756 LDKLSDFRVMDKLYVTSESGQALPLSQFVSINQTTAESNIKTFMGLDSAELTADIMPGYS 815
+DKLYV S +G+ +P S F + + ++ + GL S E+ + PG S
Sbjct: 776 AKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTS 835

Query: 816 TDEIKAYLDEQLPTLLTDAQGFKYNGLIKELMDSQAGTQSLFLLALVFIYLILAAQFESF 875
+ + A + E L + L G+ + G+ + S +L ++ V ++L LAA +ES+
Sbjct: 836 SGDAMALM-ENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESW 894

Query: 876 VDPLIILLTVPLCIVGALLTLTLFGQSLNIYSQIGLLTLVGLVTKHGILLVEFANK-QQL 934
P+ ++L VPL IVG LL TLF Q ++Y +GLLT +GL K+ IL+VEFA +
Sbjct: 895 SIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEK 954

Query: 935 QGLSAIEAAQSSAKSRLRPILMTSLTMILSAIPLALASGPGSLGLANIGLVLVGGLLAGT 994
+G +EA + + RLRPILMTSL IL +PLA+++G GS +G+ ++GG+++ T
Sbjct: 955 EGKGVVEATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSAT 1014

Query: 995 FFSLFVVPVAYV 1006
++F VPV +V
Sbjct: 1015 LLAIFFVPVFFV 1026



Score = 87.6 bits (217), Expect = 1e-19
Identities = 62/365 (16%), Positives = 123/365 (33%), Gaps = 28/365 (7%)

Query: 662 LELTTLDRNKDKLSETAAK-VQKLMEDYPGLTNVGNSVLRDQLRYDLSIDRNAIILSGVS 720
+D +S+ A V+ + G+ +V + +R + +D + + ++
Sbjct: 142 FVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGAQYAMR--IWLDADLLNKYKLT 199

Query: 721 YGDVTNALS-----TFLGSVKAADLHATDGFTYPIQVQVNLDKLSDFRVMDKLYVTSESG 775
DV N L G + I Q +F + G
Sbjct: 200 PVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRFKNPEEFG--KVTLRVNSDG 257

Query: 776 QALPLSQFVSINQTTAESNIK-TFMGLDSAELTADIMPGYST----DEIKAYLDEQLPTL 830
+ L + N+ G +A L + G + IKA L E P
Sbjct: 258 SVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANALDTAKAIKAKLAELQPFF 317

Query: 831 LTDAQGFKYN------GLIKELMDSQAGTQSLFLLALVFIYLILAAQFESFVDPLIILLT 884
QG K ++ S A++ ++L++ ++ LI +
Sbjct: 318 ---PQGMKVLYPYDTTPFVQL---SIHEVVKTLFEAIMLVFLVMYLFLQNMRATLIPTIA 371

Query: 885 VPLCIVGALLTLTLFGQSLNIYSQIGLLTLVGLVTKHGILLVEFANKQQL-QGLSAIEAA 943
VP+ ++G L FG S+N + G++ +GL+ I++VE + + L EA
Sbjct: 372 VPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVMMEDKLPPKEAT 431

Query: 944 QSSAKSRLRPILMTSLTMILSAIPLALASGPGSLGLANIGLVLVGGLLAGTFFSLFVVPV 1003
+ S ++ ++ + IP+A G + +V + +L + P
Sbjct: 432 EKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALSVLVALILTPA 491

Query: 1004 AYVAM 1008
+
Sbjct: 492 LCATL 496


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits       Score  E-value  Comments
Sputcn32_3255  RTXTOXIND  51     4e-09    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 51.0 bits (122), Expect = 4e-09
Identities = 40/192 (20%), Positives = 69/192 (35%), Gaps = 29/192 (15%)

Query: 115 AELDNTKAKADLDKAKSALALAKTKLERVEDLL---IKEPFALAKQDVDELRENVNLADA 171
A + K+ L++ +S + AK + + V L I + ++ L LA
Sbjct: 264 AVNELRVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLE--LAKN 321

Query: 172 DFRQKQAIMNDYLIKAPFDG---QLTSFSQSIGSQIGAGTALVTLY-SLHPVEVRYAISQ 227
+ RQ+ ++ I+AP QL ++ G + L+ + +EV +
Sbjct: 322 EERQQASV-----IRAPVSVKVQQLKVHTE--GGVVTTAETLMVIVPEDDTLEVTALVQN 374

Query: 228 NDFGKAKKGQEVDVTVEAY---GTQIFNGVVNYVAP--AIDESAG---RVEI-----HAT 274
D G GQ + VEA+ G V + D+ G V I +
Sbjct: 375 KDIGFINVGQNAIIKVEAFPYTRYGYLVGKVKNINLDAIEDQRLGLVFNVIISIEENCLS 434

Query: 275 IDNAEFKLAPGM 286
N L+ GM
Sbjct: 435 TGNKNIPLSSGM 446



Score = 49.4 bits (118), Expect = 1e-08
Identities = 24/108 (22%), Positives = 46/108 (42%), Gaps = 7/108 (6%)

Query: 97 ISAIHFSNGDKVTKGQVIAELDNTKAKADLDKAKSALALAKTKLERVEDLLIKEPFALAK 156
+ I G+ V KG V+ +L A+AD K +S+L A+ + R + L ++
Sbjct: 107 VKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILS----RSIEL 162

Query: 157 QDVDELRENVNLADADFRQKQAIMNDYLIKAPFDGQLTSFSQSIGSQI 204
+ EL+ + +++ + LIK F T +Q ++
Sbjct: 163 NKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFS---TWQNQKYQKEL 207


87  Sputcn32_3324  Sputcn32_3330  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_3324  2       14       -2.562105  response regulator receiver protein
Sputcn32_3325  2       17       -3.104521  histone family protein DNA-binding protein
Sputcn32_3326  3       18       -3.178310  hypothetical protein
Sputcn32_3327  2       16       -2.589755  alpha-L-glutamate ligase
Sputcn32_3328  1       16       -2.079705  response regulator receiver modulated
Sputcn32_3329  1       15       -1.586910  response regulator receiver protein
Sputcn32_3330  0       14       -1.568730  multi-sensor signal transduction histidine
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits    Score  E-value  Comments
Sputcn32_3324  HTHFIS  63     2e-13    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 62.5 bits (152), Expect = 2e-13
Identities = 25/108 (23%), Positives = 44/108 (40%), Gaps = 3/108 (2%)

Query: 146 RVLVVDDSRMARNVIKRTIGNLGIKLITEAEDGAQAIELMRNNMFDLVITDYNMPSIDGL 205
+LV DD R V+ + + G + + A + DLV+TD MP +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRI-TSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 206 ALTQFIRNESQQSHIPILMVSSEANDTHLSNVSQAGVNALCDKPFEPQ 253
L I+ + +P+L++S++ S+ G KPF+
Sbjct: 64 DLLPRIKKA--RPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLT 109



Score = 47.9 bits (114), Expect = 2e-08
Identities = 32/155 (20%), Positives = 57/155 (36%), Gaps = 6/155 (3%)

Query: 10 SILLVEPSDTQRRIIIQHLQQEGIVSIQTAANIEEAKAVVGRHKPDLIASAMHFEDGTAI 69
+IL+ + R ++ Q L + G ++ +N + DL+ + + D A
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGY-DVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 70 DLLSYLRVNSDYKDIQFMLVSSECRREQLEIFRQSGVVAILPKPFHAEHLGKALNATIDL 129
DLL R+ D+ +++S++ + G LPKPF L + +
Sbjct: 64 DLLP--RIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121

Query: 130 LSHDELDLSHFDVHDVRVLVVDDSRM--ARNVIKR 162
L D D LV + M V+ R
Sbjct: 122 PKRRPSKLED-DSQDGMPLVGRSAAMQEIYRVLAR 155


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3325  DNABINDINGHU  109    2e-35    Prokaryotic integration host factor signature.
		>DNABINDINGHU#Prokaryotic integration host factor signature.

Length = 91

Score = 109 bits (275), Expect = 2e-35
Identities = 45/88 (51%), Positives = 66/88 (75%)

Query: 2 NKTELIAKIAENADLTKVEAARALKSFEAAITESMKNGDKISIVGFGSFETATRAARTGR 61
NK +LIAK+AE +LTK ++A A+ + +A++ + G+K+ ++GFG+FE RAAR GR
Sbjct: 3 NKQDLIAKVAEATELTKKDSAAAVDAVFSAVSSYLAKGEKVQLIGFGNFEVRERAARKGR 62

Query: 62 NPQTGKEIQIAEATVPKFKAGKTLRDSV 89
NPQTG+EI+I + VP FKAGK L+D+V
Sbjct: 63 NPQTGEEIKIKASKVPAFKAGKALKDAV 90


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3326  OUTRMMBRANEA  28     0.027    Outer membrane protein A signature.
		>OUTRMMBRANEA#Outer membrane protein A signature.

Length = 346

Score = 28.4 bits (63), Expect = 0.027
Identities = 16/76 (21%), Positives = 29/76 (38%), Gaps = 9/76 (11%)

Query: 47 VAIQGGIDYSHDSGFYAGTWASNVDFGDETSYELDLYVGYAGNITEDISYDIGYLYYGYP 106
+ G HD+GF +N E + GY + + +++GY + G
Sbjct: 30 TGAKLGWSQYHDTGFI-----NNNGPTHENQLGAGAFGGY--QVNPYVGFEMGYDWLGRM 82

Query: 107 DAPGSIDFG--ELHGA 120
GS++ G + G
Sbjct: 83 PYKGSVENGAYKAQGV 98


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits    Score  E-value  Comments
Sputcn32_3328  HTHFIS  63     2e-12    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 62.9 bits (153), Expect = 2e-12
Identities = 36/137 (26%), Positives = 58/137 (42%), Gaps = 6/137 (4%)

Query: 3 LLLIDDDEVDRTAVIRALRQSKLAFNVIEANCAFDGLNLALERHFDGILLDYMLPDANGL 62
+L+ DDD RT + +AL S+ ++V + A D ++ D ++PD N
Sbjct: 6 ILVADDDAAIRTVLNQAL--SRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 63 EVLIKLNAMTQDQTVVVMLSRYEDEKLAQRCIELGAQDFLLK---DEVNSRILTRAIRYA 119
++L ++ D V+VM S A + E GA D+L K I+ RA+
Sbjct: 64 DLLPRIKKARPDLPVLVM-SAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEP 122

Query: 120 KQRASMALALRNSHQKL 136
K+R S L
Sbjct: 123 KRRPSKLEDDSQDGMPL 139


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits    Score  E-value  Comments
Sputcn32_3329  HTHFIS  46     8e-09    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 46.4 bits (110), Expect = 8e-09
Identities = 23/109 (21%), Positives = 43/109 (39%), Gaps = 10/109 (9%)

Query: 11 TILLVDDDDVDYMAVQRAMKQLRLLNPLIRARDGLEALHILTNPEAIKGPYLILLDLNMP 70
TIL+ DDD + +A+ + + + + A L++ D+ MP
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGY--DVRITSNAATLWRWI----AAGDGDLVVTDVVMP 58

Query: 71 RMNGFEFLEHLRS-DPTLSSSVVFMLTTSSTDEDRMKAYSHHVAGYMVK 118
N F+ L ++ P L V +++ +T +KA Y+ K
Sbjct: 59 DENAFDLLPRIKKARPDL---PVLVMSAQNTFMTAIKASEKGAYDYLPK 104


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
Sputcn32_3330  PF06580  39     4e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 39.5 bits (92), Expect = 4e-05
Identities = 22/107 (20%), Positives = 37/107 (34%), Gaps = 22/107 (20%)

Query: 608 LVVRNLMSNAIKH---HDRDTGVIKVQCEPKGDVYWFSVVDDGPGISKAYHGKVFEMFQT 664
++V+ L+ N IKH G I ++ V + G
Sbjct: 258 MLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGS---------------- 301

Query: 665 LKPRDEVEGSGLGLSLVKKTIESLGGE---IKLESEGRGCRFRFSWP 708
L ++ E +G GL V++ ++ L G IKL + P
Sbjct: 302 LALKNTKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMVLIP 348


88  Sputcn32_3338  Sputcn32_3343  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias   Product
Sputcn32_3338  1       28       6.318341  N-acetyltransferase GCN5
Sputcn32_3339  1       28       6.404951  large-conductance mechanosensitive channel
Sputcn32_3340  1       28       6.627747  antibiotic biosynthesis monooxygenase
Sputcn32_3341  1       28       6.687689  CzcA family heavy metal efflux protein
Sputcn32_3342  3       22       5.176006  RND family efflux transporter MFP subunit
Sputcn32_3343  3       22       4.000353  outer membrane efflux protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3338  SACTRNSFRASE  38     5e-06    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 37.6 bits (87), Expect = 5e-06
Identities = 21/72 (29%), Positives = 32/72 (44%), Gaps = 5/72 (6%)

Query: 75 ASIGRVVVSPAGRGKGLAMPLMQQAIESALTTWPDAGIQIGAQDY-LKA--FYQKLGFVA 131
A I + V+ R KG+ L+ +AIE A G+ + QD + A FY K F+
Sbjct: 90 ALIEDIAVAKDYRKKGVGTALLHKAIEWAKEN-HFCGLMLETQDINISACHFYAKHHFII 148

Query: 132 CS-EMYLEDGIP 142
+ + L P
Sbjct: 149 GAVDTMLYSNFP 160


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits         Score  E-value  Comments
Sputcn32_3339  MECHCHANNEL  176    2e-60    Bacterial mechano-sensitive ion channel signature.
		>MECHCHANNEL#Bacterial mechano-sensitive ion channel signature.

Length = 136

Score = 176 bits (448), Expect = 2e-60
Identities = 89/136 (65%), Positives = 111/136 (81%), Gaps = 1/136 (0%)

Query: 1 MSLIKEFKAFASRGNVIDMAVGIIIGAAFGKIVSSFVADVIMPPIGIILGGVNFSDLSIV 60
MS+IKEF+ FA RGNV+D+AVG+IIGAAFGKIVSS VAD+IMPP+G+++GG++F ++
Sbjct: 1 MSIIKEFREFAMRGNVVDLAVGVIIGAAFGKIVSSLVADIIMPPLGLLIGGIDFKQFAVT 60

Query: 61 LQAAQGDAPSVVIAYGKFIQTVIDFTIIAFAIFMGLKAINTLKRKEEEAPKAPPAPTKEE 120
L+ AQGD P+VV+ YG FIQ V DF I+AFAIFM +K IN L RK+EE P A PAPTKEE
Sbjct: 61 LRDAQGDIPAVVMHYGVFIQNVFDFLIVAFAIFMAIKLINKLNRKKEE-PAAAPAPTKEE 119

Query: 121 ELLSEIRDLLKAQQEK 136
LL+EIRDLLK Q +
Sbjct: 120 VLLTEIRDLLKEQNNR 135


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3341  ACRIFLAVINRP  661    0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 661 bits (1708), Expect = 0.0
Identities = 224/1072 (20%), Positives = 431/1072 (40%), Gaps = 67/1072 (6%)

Query: 9 AIKNRLLVVLALLAVIVGCVAMLSKLNLDAFPDVTNVQVTINTAAEGLAAEEVEKLISYP 68
I+ + + + +++ + +L + +P + V+++ G A+ V+ ++
Sbjct: 5 FIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTVTQV 64

Query: 69 VESAMYALPAVTEVRSLS-RTGLSIVTVVFAEGTDIYFARQQVFEQLQAAREMIPSGVGV 127
+E M + + + S S G +T+ F GTD A+ QV +LQ A ++P V
Sbjct: 65 IEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQEVQQ 124

Query: 128 PEIGPNTSGLGQIYQYILRAEPNSGINAAELRSLNDYLVKLILMPVGGVTEVLSFGGDVR 187
I S + ++ N G ++ VK L + GV +V FG
Sbjct: 125 QGISVEKSSSSYLMVAGFVSD-NPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGAQ-Y 182

Query: 188 QYQVQVDPNKLRAYGLSMAQVTEALESNNRNAGGWFMDQGQEQLVVRGYGMLPAGDAGLA 247
++ +D + L Y L+ V L+ N + G L +
Sbjct: 183 AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLG-GTPALPGQQLNASIIAQTRFK 241

Query: 248 AIAQ----IPLTEASGTPVRIGDIAQVDFGSEIRVGAVTMTRRDESGQVQNLGEVVAGVV 303
+ + G+ VR+ D+A+V+ G E + N +
Sbjct: 242 NPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARI----------NGKPAAGLGI 291

Query: 304 LKRMGANTKATIDDIGARVSLIEQALPDGVSFEVFYDQADLVDKAVTTVRDALLMAFVFI 363
GAN T I A+++ ++ P G+ YD V ++ V L A + +
Sbjct: 292 KLATGANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLV 351

Query: 364 VVILALFLVNIRATLLVLLSIPVSIGLALMVMSYYGLSANLMSLGGLAVAIGMLVDGSVV 423
+++ LFL N+RATL+ +++PV + +++ +G S N +++ G+ +AIG+LVD ++V
Sbjct: 352 FLVMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIV 411

Query: 424 MVENIFKHLTQPDRRHLQEARTRADGEIDPYHSDEDGGLQANMAVRIMLAAKEVCSPIFF 483
+VEN+ + + + L A + ++ +
Sbjct: 412 VVENVERVM-------------------------MEDKLPPKEATE--KSMSQIQGALVG 444

Query: 484 ATAIIIVVFAPLFALEGVEGKLFQPMAVSIILAMISALLVALIAVPALAVYLFK------ 537
++ VF P+ G G +++ +++I+ AM ++LVALI PAL L K
Sbjct: 445 IAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMALSVLVALILTPALCATLLKPVSAEH 504

Query: 538 ----RGVMLKQSVILAPLDAAYRKLLTATLARPKVVMLSALLMFALSLLLLPRLGTEFVP 593
G + Y + L +L L+ A ++L RL + F+P
Sbjct: 505 HENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRYLLIYALIVAGMVVLFLRLPSSFLP 564

Query: 594 ELEEGTINLRVTLAPTASLGTSLQVAPKLEAILLEFPEVEYALSRIGAPELGGDPEPVSN 653
E ++G + L A+ + +V ++ L+ + S + +
Sbjct: 565 EEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKNEKANVE-SVFTVNGFSFSGQAQNA 623

Query: 654 IEVYIGLKPIAEWQSASSRLE--LQRLMEEKLSVFPGLLLTFSQPIATRVDELLSGVKAQ 711
++ LKP E + E + R E + G ++ F+ P + EL +
Sbjct: 624 GMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDGFVIPFNMP---AIVELGTATGFD 680

Query: 712 LA-IKIFGPDLAVLSQKGQALSDLVAKIPGAV-DVSLEQVSGEAQLVVRPKRELLARYGI 769
I G L+Q L + A+ P ++ V + AQ + +E G+
Sbjct: 681 FELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNGLEDTAQFKLEVDQEKAQALGV 740

Query: 770 SVDQVMSLVSQGIGGVSAGQVIDGNARYDINVRLAAEFRQSPDAIKDLLLSGTNGATVRL 829
S+ + +S +GG ID + V+ A+FR P+ + L + NG V
Sbjct: 741 SLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFRMLPEDVDKLYVRSANGEMVPF 800

Query: 830 GEVASVEVEMAPPNIRRDDVQRRVVVQANVA-GRDMGSVVKDIYALVPKADLPAGYTVII 888
+ P + R + + +Q A G G + + L K LPAG
Sbjct: 801 SAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDAMALMENLASK--LPAGIGYDW 858

Query: 889 GGQYENQQRAQQKLMLVVPVSIALIALLLYFSFGSFKQVLLIMANVPLALIGGIVALYVS 948
G ++ + + +V +S ++ L L + S+ + +M VPL ++G ++A +
Sbjct: 859 TGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLF 918

Query: 949 GTYLSVPSSIGFITLFGVAVLNGVVLVDSINQ-RRQSGEALYDCVYEGTVGRLRPVLMTA 1007
V +G +T G++ N +++V+ + G+ + + RLRP+LMT+
Sbjct: 919 NQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGVVEATLMAVRMRLRPILMTS 978

Query: 1008 LTSALGLIPILLSSGVGSEIQKPLAVVIIGGLFSSTALTLLVLPTLYCWLYR 1059
L LG++P+ +S+G GS Q + + ++GG+ S+T L + +P + + R
Sbjct: 979 LAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIFFVPVFFVVIRR 1030



Score = 111 bits (280), Expect = 5e-27
Identities = 83/551 (15%), Positives = 189/551 (34%), Gaps = 63/551 (11%)

Query: 4 KLIEAAIKNRLLVVLALLAVIVGCVAMLSKLNLDAFPDVTNVQVTIN-TAAEGLAAEEVE 62
+ + + +L ++ G V + +L P+ G E +
Sbjct: 528 NSVGKILGSTGRYLLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQ 587

Query: 63 KLI----SYPVESAMYALPAVTEVRSLSRTGLS----IVTVVFAEGTDIYFARQQVFEQL 114
K++ Y +++ + +V V S +G + + V + +
Sbjct: 588 KVLDQVTDYYLKNEKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVI 647

Query: 115 QAAREM---IPSGVGVPEIGPNTSGLGQIYQYILRAEPNSGINAAELRSLNDYLVKLILM 171
A+ I G +P P LG + +G+ L + L+ +
Sbjct: 648 HRAKMELGKIRDGFVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQ 707

Query: 172 PVGGVTEV-LSFGGDVRQYQVQVDPNKLRAYGLSMAQVTEALES--NNRNAGGWFMDQGQ 228
+ V + D Q++++VD K +A G+S++ + + + + +
Sbjct: 708 HPASLVSVRPNGLEDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRV 767

Query: 229 EQLVVRGYGMLPAGDA-GLAAIAQIPLTEASGTPVRIGDIAQVDFGSEIRVGAVTMTRRD 287
++L V+ A + ++ + A+G V + G+ + R +
Sbjct: 768 KKLYVQ----ADAKFRMLPEDVDKLYVRSANGEMVPFSAFTTSHWVY----GSPRLERYN 819

Query: 288 ESGQVQNLGEVVAGVVLKRMGANTKATIDDIGARVSLIEQALPDGVSFEVFYDQADLVDK 347
++ GE G D A + + LP G+ ++ + +
Sbjct: 820 GLPSMEIQGEAAPGTSS-----------GDAMALMENLASKLPAGIGYD-WTGMSYQERL 867

Query: 348 AVTTVRDALLMAFVFIVVILALFLVNIRATLLVLLSIPVSIGLALMVMSYYGLSANLMSL 407
+ + ++FV + + LA + + V+L +P+ I L+ + + ++ +
Sbjct: 868 SGNQAPALVAISFVVVFLCLAALYESWSIPVSVMLVVPLGIVGVLLAATLFNQKNDVYFM 927

Query: 408 GGLAVAIGMLVDGSVVMVENIFKHLTQPDRRHLQEARTRADGEIDPYHSDEDGGLQANMA 467
GL IG+ ++++VE K L + + + + EA
Sbjct: 928 VGLLTTIGLSAKNAILIVEFA-KDLMEKEGKGVVEA------------------------ 962

Query: 468 VRIMLAAKEVCSPIFFATAIIIVVFAPLFALEGVEGKLFQPMAVSIILAMISALLVALIA 527
++A + PI + I+ PL G + + ++ M+SA L+A+
Sbjct: 963 --TLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIFF 1020

Query: 528 VPALAVYLFKR 538
VP V + +
Sbjct: 1021 VPVFFVVIRRC 1031



Score = 102 bits (257), Expect = 2e-24
Identities = 90/515 (17%), Positives = 193/515 (37%), Gaps = 36/515 (6%)

Query: 565 RPKVVMLSALLMFALSLLLLPRLGTEFVPELEEGTINLRVTLAPTASLGT-SLQVAPKLE 623
RP + A+++ L + +L P + +++ P A T V +E
Sbjct: 8 RPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSAN-YPGADAQTVQDTVTQVIE 66

Query: 624 AILLEFPEVEYALSRIGAPELGGDPEPVSNIEVYIGLKPIAEWQSASSRLELQRLMEEKL 683
+ + Y S + ++ + + + + A Q ++ KL
Sbjct: 67 QNMNGIDNLMYMSST---------SDSAGSVTITLTFQSGTDPDIA------QVQVQNKL 111

Query: 684 SVFPGLLLTFSQPIATRVDELLSGVKAQLAIKIFGPDLAVLSQKGQA---LSDLVAKIPG 740
+ LL Q V++ S P + D ++++ G
Sbjct: 112 QLATPLLPQEVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNG 171

Query: 741 AVDVSLEQVSGEAQLVVRPKRELLARYGISVDQVMSLVSQGIGGVSAGQVIDGNARYD-- 798
DV L + + + +LL +Y ++ V++ + ++AGQ+ A
Sbjct: 172 VGDVQL--FGAQYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQ 229

Query: 799 INVRLAAEFR-QSPDAIKDLLL-SGTNGATVRLGEVASVEVEMAPPNIR-RDDVQRRVVV 855
+N + A+ R ++P+ + L ++G+ VRL +VA VE+ N+ R + + +
Sbjct: 230 LNASIIAQTRFKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGL 289

Query: 856 QANVA-GRDMGSVVKDIYALVP--KADLPAGYTVIIGGQYENQQRAQQKLMLVVP---VS 909
+A G + K I A + + P G V+ Y+ Q + VV +
Sbjct: 290 GIKLATGANALDTAKAIKAKLAELQPFFPQGMKVLY--PYDTTPFVQLSIHEVVKTLFEA 347

Query: 910 IALIALLLYFSFGSFKQVLLIMANVPLALIGGIVALYVSGTYLSVPSSIGFITLFGVAVL 969
I L+ L++Y + + L+ VP+ L+G L G ++ + G + G+ V
Sbjct: 348 IMLVFLVMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVD 407

Query: 970 NGVVLVDSINQRRQS-GEALYDCVYEGTVGRLRPVLMTALTSALGLIPILLSSGVGSEIQ 1028
+ +V+V+++ + + + ++ A+ + IP+ G I
Sbjct: 408 DAIVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIY 467

Query: 1029 KPLAVVIIGGLFSSTALTLLVLPTLYCWLYRGDKR 1063
+ ++ I+ + S + L++ P L L +
Sbjct: 468 RQFSITIVSAMALSVLVALILTPALCATLLKPVSA 502


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits       Score  E-value  Comments
Sputcn32_3342  RTXTOXIND  53     9e-10    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 53.3 bits (128), Expect = 9e-10
Identities = 36/182 (19%), Positives = 65/182 (35%), Gaps = 22/182 (12%)

Query: 109 RATATLVVDRDRTATLAPQLDVRVLARHVVPGQEVKKGEPLLTLGGAAVAQAQADYINAA 168
R V++ R + L + +A+H V QE + +N
Sbjct: 225 RYENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQE----------------NKYVEAVNEL 268

Query: 169 AEWSRVKRMSEGAVSVSRRMQAQVDAELKRAILEAIKMTPAQIRALE----STPEAIGSY 224
+ E + ++ V K IL+ ++ T I L E +
Sbjct: 269 RVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQAS 328

Query: 225 QLLAPIDGRVQQ-DIAMLGQVFSAGTPLMQLT-DESYLWVEAQLTPTQTMHIQVGSSALV 282
+ AP+ +VQQ + G V + LM + ++ L V A + I VG +A++
Sbjct: 329 VIRAPVSVKVQQLKVHTEGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAII 388

Query: 283 QV 284
+V
Sbjct: 389 KV 390



Score = 40.2 bits (94), Expect = 1e-05
Identities = 26/148 (17%), Positives = 53/148 (35%), Gaps = 5/148 (3%)

Query: 101 LANLNLDIRATATLVVDRDRTATLAPQLDVRVLARHVVPGQEVKKGEPLLTLGG----AA 156
L + + A L R+ + P + V V G+ V+KG+ LL L A
Sbjct: 77 LGQVEIVATANGKLTHS-GRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEAD 135

Query: 157 VAQAQADYINAAAEWSRVKRMSEGAVSVSRRMQAQVDAELKRAILEAIKMTPAQIRALES 216
+ Q+ + A E +R + +S D + + E + + +
Sbjct: 136 TLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQF 195

Query: 217 TPEAIGSYQLLAPIDGRVQQDIAMLGQV 244
+ YQ +D + + + +L ++
Sbjct: 196 STWQNQKYQKELNLDKKRAERLTVLARI 223


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits       Score  E-value  Comments
Sputcn32_3343  RTXTOXIND  30     0.029    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 29.8 bits (67), Expect = 0.029
Identities = 21/174 (12%), Positives = 56/174 (32%), Gaps = 15/174 (8%)

Query: 82 HVNFGQWLPEL-LTQFN-QLPEVQAQLVRQQQAKLAIQAANRAVYNPELGLNYQNADTDA 139
V G L +L + Q+ L++ + + Q +R++ L +
Sbjct: 117 SVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSI--ELNKLPELKLPDEP 174

Query: 140 YSLGLSQTLDWGDKRGVATKRAELEAQILFADIGLERSQMLAERLLALAEQAQSRKALTF 199
Y +S+ + + + + Q ++ L++ + AE+ +
Sbjct: 175 YFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKR---------AERLTVLARINR 225

Query: 200 AEQQLRFTKAQLNIAEQRLAAGDLSNVELQLIQLEVASNTADYALAEQVALVAD 253
E R K++L+ L ++ + + + + L + +
Sbjct: 226 YENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNE--LRVYKSQLEQ 277


89  Sputcn32_3451  Sputcn32_3467  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_3451  0       17       -0.719078  flagellar hook-length control protein
Sputcn32_3452  1       18       -1.446833  hypothetical protein
Sputcn32_3453  1       16       -2.246100  flagellar protein FliS
Sputcn32_3454  0       16       -1.728298  flagellar hook-associated 2 domain-containing
Sputcn32_3455  1       18       -1.481695  flagellin domain-containing protein
Sputcn32_3456  -1      17       -1.057474  flagellin domain-containing protein
Sputcn32_3457  -2      16       -1.214258  hypothetical protein
Sputcn32_3458  -2      15       -0.464116  flagellar hook-associated protein 3
Sputcn32_3459  -2      16       0.775830   flagellar hook-associated protein FlgK
Sputcn32_3460  -1      15       0.892235   peptidoglycan hydrolase
Sputcn32_3461  -1      14       0.779848   flagellar basal body P-ring protein
Sputcn32_3462  -2      14       0.835702   flagellar basal body L-ring protein
Sputcn32_3463  -1      14       0.689472   flagellar basal-body rod protein FlgG
Sputcn32_3464  -1      14       0.300307   flagellar basal-body rod protein FlgF
Sputcn32_3465  0       14       -0.929641  flagellar hook protein FlgE
Sputcn32_3466  2       17       -2.104990  flagellar hook capping protein
Sputcn32_3467  2       16       -2.379518  flagellar basal-body rod protein FlgC
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits         Score  E-value  Comments
Sputcn32_3451  FLGHOOKFLIK  39     3e-05    Flagellar hook-length control protein signature.
		>FLGHOOKFLIK#Flagellar hook-length control protein signature.

Length = 375

Score = 38.7 bits (89), Expect = 3e-05
Identities = 37/139 (26%), Positives = 60/139 (43%), Gaps = 3/139 (2%)

Query: 244 TQAASATQWGPVSLTPTASLAQQAQEILTPLREHLRFQVDQHIKKAELRLDPPELGKIDL 303
TQ P P S + E L +H+ Q + AELRL P +LG++ +
Sbjct: 214 LITPHQTQPLPTVAAPVLSAPLGSHEWQQSLSQHISLFTRQGQQSAELRLHPQDLGEVQI 273

Query: 304 NIRLEGDRLQVHMHAVNPAIRDALLNGLERLRMDLAMD--HGGQIDVDVGQGGSQQQQQE 361
+++++ ++ Q+ M + + +R AL L LR LA GQ ++ G+ S QQQ
Sbjct: 274 SLKVDDNQAQIQMVSPHQHVRAALEAALPVLRTQLAESGIQLGQSNIS-GESFSGQQQAA 332

Query: 362 TALFASSIAPETAMENGAD 380
+ S G D
Sbjct: 333 SQQQQSQRTANHEPLAGED 351


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits       Score  E-value  Comments
Sputcn32_3455  FLAGELLIN  111    5e-30    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 111 bits (278), Expect = 5e-30
Identities = 88/268 (32%), Positives = 132/268 (49%), Gaps = 9/268 (3%)

Query: 4 VHTNYASIVAQGAVNKSNNLLTNAMERLSTGLRINSASDDAAGLQIANRMNANVKGMETA 63
++TN S++ Q +NKS + L++A+ERLS+GLRINSA DDAAG IANR +N+KG+ A
Sbjct: 4 INTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGLTQA 63

Query: 64 SRNISDATSMLQTADGALEELTTIANRQKELATQAANGVNSAADLTALNDEFTQLNAEIT 123
SRN +D S+ QT +GAL E+ R +EL+ QA NG NS +DL ++ DE Q EI
Sbjct: 64 SRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEEID 123

Query: 124 RIVENTTYAGNKLFDTGVLTTGTGVKFQIGAGTAETLDVKLGAI----PKTVAGTLTGGT 179
R+ T + G K+ L+ +K Q+GA ET+ + L I + G
Sbjct: 124 RVSNQTQFNGVKV-----LSQDNQMKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPK 178

Query: 180 ANAAIALVDTFLETVGTERSTLGANINRLGHTAANLASVTENTKAAAGRIMDADFAVESA 239
L +F G + +GAN R+ + + + T ++A +
Sbjct: 179 EATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTT 238

Query: 240 NMTRNQLLVQAGTTVLSSANQNTGLVMG 267
+ N V T S+A +
Sbjct: 239 DDAENNTAVDLFKTTKSTAGTAEAKAIA 266



Score = 73.2 bits (179), Expect = 8e-17
Identities = 55/266 (20%), Positives = 86/266 (32%), Gaps = 1/266 (0%)

Query: 6 TNYASIVAQGAVNKSNNLLTNAMERLSTGLRINSASDDAAGLQIANRMNANVKGMETASR 65
N ++ + + + D G+ G S
Sbjct: 242 ENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTIDTKTGNDGNGKVST 301

Query: 66 NISDATSMLQTADGALEELTTIANRQKELATQAANGVNSAADLTALNDEFTQLNAEITRI 125
I+ L AD A + + VN + +++
Sbjct: 302 TINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKTKNESAKLSDLEAN 361

Query: 126 VENTTYAGNKLFDTGVLTTGTGVKFQIGAGTAETLDVKLGAIPKTVAGTLTG-GTANAAI 184
+ + G K + T G + +
Sbjct: 362 NAVKGESKITVNGAEYTANAAGDKVTLAGKTMFIDKTASGVSTLINEDAAAAKKSTANPL 421

Query: 185 ALVDTFLETVGTERSTLGANINRLGHTAANLASVTENTKAAAGRIMDADFAVESANMTRN 244
A +D+ L V RS+LGA NR NL + N +A RI DAD+A E +NM++
Sbjct: 422 ASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARSRIEDADYATEVSNMSKA 481

Query: 245 QLLVQAGTTVLSSANQNTGLVMGLLR 270
Q+L QAGT+VL+ ANQ V+ LLR
Sbjct: 482 QILQQAGTSVLAQANQVPQNVLSLLR 507


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits       Score  E-value  Comments
Sputcn32_3456  FLAGELLIN  105    4e-28    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 105 bits (264), Expect = 4e-28
Identities = 79/269 (29%), Positives = 126/269 (46%), Gaps = 13/269 (4%)

Query: 4 VHTNYASIVAQGAVNKSNNLLTNAMERLSTGLRINSASDDAAGLQIANRMNANVKGMETA 63
++TN S++ Q +NKS + L++A+ERLS+GLRINSA DDAAG IANR +N+KG+ A
Sbjct: 4 INTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGLTQA 63

Query: 64 SRNISDATSMLQTADGALEELTTIANRQKELATQAANGVNSADDLTALDAEFKELNKEIS 123
SRN +D S+ QT +GAL E+ R +EL+ QA NG NS DL ++ E ++ +EI
Sbjct: 64 SRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEEID 123

Query: 124 RILDNTTYAGNNLFTKLEAGVTFQIGAGTAETLAVTTTAIDDAALAAGDLTT-------- 175
R+ + T + G + ++ + + Q+GA ET+ + ID +L
Sbjct: 124 RVSNQTQFNGVKVLSQ-DNQMKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPKEATV 182

Query: 176 ----GANAAITLVDTFLAAVGTERSTLGANINRLGHTAANLASVTENTKAAAGRIMDADF 231
+ +T DT+ R + + TA + A D
Sbjct: 183 GDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTTDDAE 242

Query: 232 AVESANMTRNQLLVQAGTTVLSSANQNTG 260
+ ++ + + A G
Sbjct: 243 NNTAVDLFKTTKSTAGTAEAKAIAGAIKG 271



Score = 69.7 bits (170), Expect = 1e-15
Identities = 56/276 (20%), Positives = 99/276 (35%), Gaps = 15/276 (5%)

Query: 7 NYASIVAQGAVNKSNNLLTNAMERLSTGLRINSASDDAAGLQIANRMNANVKGMETASRN 66
+ A N + L + + + + G + + + ++
Sbjct: 232 ANGQLTTDDAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTIDTKT 291

Query: 67 ISDATSMLQTADGALEELTTIANRQKELATQAANGVNSADDLTA---------------L 111
+D + T + T+A+ A A + S+ ++
Sbjct: 292 GNDGNGKVSTTINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKTKNE 351

Query: 112 DAEFKELNKEISRILDNTTYAGNNLFTKLEAGVTFQIGAGTAETLAVTTTAIDDAALAAG 171
A+ +L + ++ +T AG + T + A
Sbjct: 352 SAKLSDLEANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFIDKTASGVSTLINEDAA 411

Query: 172 DLTTGANAAITLVDTFLAAVGTERSTLGANINRLGHTAANLASVTENTKAAAGRIMDADF 231
+ +D+ L+ V RS+LGA NR NL + N +A RI DAD+
Sbjct: 412 AAKKSTANPLASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARSRIEDADY 471

Query: 232 AVESANMTRNQLLVQAGTTVLSSANQNTGLVMGLLR 267
A E +NM++ Q+L QAGT+VL+ ANQ V+ LLR
Sbjct: 472 ATEVSNMSKAQILQQAGTSVLAQANQVPQNVLSLLR 507


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits       Score  E-value  Comments
Sputcn32_3458  FLAGELLIN  34     6e-04    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 34.2 bits (78), Expect = 6e-04
Identities = 27/139 (19%), Positives = 58/139 (41%)

Query: 1 MRVTMQNLYTNNLNSLQNTTYDVARLNQMLSKGVSILAPSDDPIGVVRVMDNQRDLALVQ 60
+ +L N+L + ++ + LS G+ I + DD G ++ +
Sbjct: 2 QVINTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGLT 61

Query: 61 QYIKNIDSLSTSMSRSETYLSSMVETQQRMKEISIATNSSNLSAEDRASYASEMEELLKG 120
Q +N + + +E L+ + QR++E+S+ + S D S E+++ L+
Sbjct: 62 QASRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEE 121

Query: 121 LVDNINATDESGNYLFSGN 139
+ N T +G + S +
Sbjct: 122 IDRVSNQTQFNGVKVLSQD 140


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score  E-value  Comments
Sputcn32_3459  FLGHOOKAP1  147    4e-41    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 147 bits (373), Expect = 4e-41
Identities = 85/319 (26%), Positives = 148/319 (46%), Gaps = 8/319 (2%)

Query: 2 SMLNIGKSGLLASMAALNATSNNVANAMVAGYSRQQVMLSSVGGGAYGS---GAGVFVDG 58
S++N SGL A+ AALN SNN+++ VAGY+RQ +++ G GV+V G
Sbjct: 2 SLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVSG 61

Query: 59 VRRISDQYEVAQLWQTTSAVGFSKVQSSYLRQAEQVFGADGNNISKGLDQLFAALNSSME 118
V+R D + QL + + + + + + ++++ + F +L + +
Sbjct: 62 VQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNMLSTSTSSLATQMQDFFTSLQTLVS 121

Query: 119 QPNLIAYRQSVLNEAKAVAQRVNAINDNIDSQRNQINGQLGTSVKEINAQLAIIANFNRD 178
A RQ+++ +++ + + + + Q Q+N +G SV +IN IA+ N
Sbjct: 122 NAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASLNDQ 181

Query: 179 IQAASVTGTIPPA--LQDGRDAAIDDLAAIIDIRVVEDSQGMLNISLARGEPLLTGNTA- 235
I + G L D RD + +L I+ + V G NI++A G L+ G+TA
Sbjct: 182 ISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGSTAR 241

Query: 236 --AKLTSAPDPANPKNNLVSIQFGASQFNVDETAGGSLGALLNYRDTQLADSQAYIDELA 293
A + S+ DP+ V G + GSLG +L +R L ++ + +LA
Sbjct: 242 QLAAVPSSADPSRTTVAYVDGTAGNIEIPEKLLNTGSLGGILTFRSQDLDQTRNTLGQLA 301

Query: 294 VMLATEFNSILASGTDLNG 312
+ A FN+ +G D NG
Sbjct: 302 LAFAEAFNTQHKAGFDANG 320



Score = 66.9 bits (163), Expect = 6e-14
Identities = 34/110 (30%), Positives = 61/110 (55%), Gaps = 3/110 (2%)

Query: 346 QDGTQGDNTNLKALVQLANKELSFTSLGSSTSLAESFSSKVGQLGSASRQAISFAKTSVD 405
+D DN N +AL+ L + + +G + S ++++S V +G+ + + + T +
Sbjct: 438 EDAGDSDNRNGQALLDLQSNSKT---VGGAKSFNDAYASLVSDIGNKTATLKTSSATQGN 494

Query: 406 LQKDAQVQWASTSGVNPDEEGINLIIYQQSYMANAKVISTADQLFQTMLS 455
+ Q S SGVN DEE NL +QQ Y+ANA+V+ TA+ +F +++
Sbjct: 495 VVTQLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALIN 544


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
Sputcn32_3460  FLGFLGJ  45     1e-08    Flagellar protein FlgJ signature.
		>FLGFLGJ#Flagellar protein FlgJ signature.

Length = 313

Score = 45.1 bits (106), Expect = 1e-08
Identities = 24/74 (32%), Positives = 45/74 (60%), Gaps = 5/74 (6%)

Query: 26 GALKMVSQQFEAQFLQTVLKQMRSATDAMADEDNPLTTKSNNGLYQDLHDAELANRLSQV 85
++ V++Q E F+Q +LK MR DA+ + L + + LY ++D ++A +++
Sbjct: 31 ANIRPVARQVEGMFVQMMLKSMR---DALPKDG--LFSSEHTRLYTSMYDQQIAQQMTAG 85

Query: 86 NGMGLAEVMTKQLS 99
G+GLAE+M KQ++
Sbjct: 86 KGLGLAEMMVKQMT 99


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3461  FLGPRINGFLGI  331    e-114    Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 331 bits (849), Expect = e-114
Identities = 139/372 (37%), Positives = 211/372 (56%), Gaps = 14/372 (3%)

Query: 5 LLLLLLVSGTLQAQEQSQSRYLMDIVDVQGLRDNQLVGYGLVVGLSGTGDR-SQVKFTSQ 63
L+ + Q+ + + DI +Q RDNQL+GYGLVVGL GTGD FT Q
Sbjct: 10 ALVFSALPFLSTPPAQADTSRIKDIASLQAGRDNQLIGYGLVVGLQGTGDSLRSSPFTEQ 69

Query: 64 SVVNMLKQFGVQIDDKTDPKLKNVAAVAVHATITSLASPGQSLDVTVSSLGDAKSLQGGT 123
S+ ML+ G+ KN+AAV V A + ASPG +DVTVSSLGDA SL+GG
Sbjct: 70 SMRAMLQNLGITTQG-GQSNAKNIAAVMVTANLPPFASPGSRVDVTVSSLGDATSLRGGN 128

Query: 124 LLMTPLRAVDGEIYAVAQGNLVVGGISAEGRNGSSVTVNVPTVGTIPNGALLEAPIKSNF 183
L+MT L DG+IYAVAQG L+V G SA+G + +++T V T +PNGA++E + S F
Sbjct: 129 LIMTSLSGADGQIYAVAQGALIVNGFSAQG-DAATLTQGVTTSARVPNGAIIERELPSKF 187

Query: 184 SDNEDIILNLKDPNFKTARNIERAVNEL----FGPDVARAQDHAKVLIHAPKSNRERVTF 239
D+ +++L L++P+F TA + VN +G +A +D ++ + P +
Sbjct: 188 KDSVNLVLQLRNPDFSTAVRVADVVNAFARARYGDPIAEPRDSQEIAVQKP-RVADLTRL 246

Query: 240 MSMLEELKIDQGRRSPRIVFNSRTGTVVMGGDVVARKAAVSHGNLTVTIVERQNVSQPNG 299
M+ +E L ++ ++V N RTGT+V+G DV + AVS+G LTV + E V QP
Sbjct: 247 MAEIENLTVET-DTPAKVVINERTGTIVIGADVRISRVAVSYGTLTVQVTESPQVIQPAP 305

Query: 300 AYLGNAAGETVVTNDSQVLVEQGNKRMFVWPEGTSIEEIVRAVNSLGATPMDLMAILEAL 359
+ G+T V + ++ Q ++ + EG + +V +NS+G ++AIL+ +
Sbjct: 306 F----SRGQTAVQPQTDIMAMQEGSKVAI-VEGPDLRTLVAGLNSIGLKADGIIAILQGI 360

Query: 360 SEAGSLEADLVV 371
AG+L+A+LV+
Sbjct: 361 KSAGALQAELVL 372


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3462  FLGLRINGFLGH  150    9e-48    Flagellar L-ring protein signature.
		>FLGLRINGFLGH#Flagellar L-ring protein signature.

Length = 232

Score = 150 bits (381), Expect = 9e-48
Identities = 71/225 (31%), Positives = 110/225 (48%), Gaps = 17/225 (7%)

Query: 7 LCFALLSGCMSHIPDQETKPGTKEWAPPEIDYSLPDAKDGSLYRPGYMLT-----LFKDK 61
L L+GC + IP + P + + +GS+++ + LF+D+
Sbjct: 14 LLVLSLTGC-AWIP---STPLVQGATSAQPVPGPTPVANGSIFQSAQPINYGYQPLFEDR 69

Query: 62 RAFREGDILTVALDEKTYSSKKADTKTNKEQDIGMGLKGNVGEKT------ADADGKTSF 115
R GD LT+ L E +SK + +++ G V A AD + S
Sbjct: 70 RPRNIGDTLTIVLQENVSASKSSSANASRDGKTNFGF-DTVPRYLQGLFGNARADVEASG 128

Query: 116 SRGFNGAGSSTQQNQLSGSITVTVSKVLPNGTLLIRGEKWLRLNQGDEYLRLLGIIRTDD 175
FNG G + N SG++TVTV +VL NG L + GEK + +NQG E++R G++
Sbjct: 129 GNTFNGKGGANASNTFSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRFSGVVNPRT 188

Query: 176 IGNNNTISSQRIADARIIYGGQGAIADSNAMGWASRYFNSPWFPL 220
I +NT+ S ++ADARI Y G G I ++ MGW R+F + P+
Sbjct: 189 ISGSNTVPSTQVADARIEYVGNGYINEAQNMGWLQRFFLN-LSPM 232


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score  E-value  Comments
Sputcn32_3463  FLGHOOKAP1  42     2e-06    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 41.9 bits (98), Expect = 2e-06
Identities = 11/47 (23%), Positives = 20/47 (42%)

Query: 213 ALRQGALEGANVNVVEEMVEMISTQRAYEMNAKVVSASDDMLKFLNQ 259
L + VN+ EE + Q+ Y NA+V+ ++ + L
Sbjct: 498 QLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALIN 544



Score = 37.6 bits (87), Expect = 4e-05
Identities = 9/37 (24%), Positives = 18/37 (48%)

Query: 3 SALWVSKTGLTAQDTKMTAIANNLANVNTTGFKRDRV 39
S + + +GL A + +NN+++ N G+ R
Sbjct: 2 SLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTT 38


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score  E-value  Comments
Sputcn32_3465  FLGHOOKAP1  34     8e-04    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 34.2 bits (78), Expect = 8e-04
Identities = 20/57 (35%), Positives = 28/57 (49%), Gaps = 5/57 (8%)

Query: 2 SFNIALSGLQATTQDLNTISNNIANASTNGFRGGR----SEFASIYNGGQAG-GVSV 53
N A+SGL A LNT SNNI++ + G+ +++ GG G GV V
Sbjct: 3 LINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYV 59



Score = 32.6 bits (74), Expect = 0.002
Identities = 14/41 (34%), Positives = 21/41 (51%)

Query: 353 LEGSNVDTTAEMVNLMSAQRNYQSNAKVLDVNSTMQQALLN 393
S V+ E NL Q+ Y +NA+VL + + AL+N
Sbjct: 504 QSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALIN 544


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score  E-value  Comments
Sputcn32_3467  FLGHOOKAP1  31     0.001    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 31.1 bits (70), Expect = 0.001
Identities = 8/39 (20%), Positives = 19/39 (48%)

Query: 97 SNVNTIEEMADMMAASRSFETSVEVMNRARSMQQGLLQL 135
S VN EE ++ + + + +V+ A ++ L+ +
Sbjct: 507 SGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545


90  Sputcn32_3474  Sputcn32_3484  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_3474  0       18       -0.517668  flagellar assembly protein H
Sputcn32_3475  0       17       -0.910272  flagellar motor switch protein G
Sputcn32_3476  1       17       -1.213632  flagellar MS-ring protein
Sputcn32_3477  2       18       -1.998208  flagellar hook-basal body complex subunit FliE
Sputcn32_3478  0       17       -1.675780  sigma-54 dependent transcriptional regulator
Sputcn32_3479  2       18       -2.238321  hypothetical protein
Sputcn32_3480  -2      18       -0.856422  flagellar motor switch protein FliN
Sputcn32_3481  -1      17       -0.237803  flagellar biosynthesis protein FliP
Sputcn32_3482  0       16       1.203037   flagellar biosynthetic protein FliQ
Sputcn32_3483  0       17       2.143939   flagellar biosynthetic protein FliR
Sputcn32_3484  -1      18       2.674652   flagellar biosynthetic protein FlhB
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
Sputcn32_3474  FLGFLIH  60     4e-13    Flagellar assembly protein FliH signature.
		>FLGFLIH#Flagellar assembly protein FliH signature.

Length = 228

Score = 60.2 bits (145), Expect = 4e-13
Identities = 40/186 (21%), Positives = 83/186 (44%), Gaps = 2/186 (1%)

Query: 40 QQAFDEGYEEGVLQGKNAGYEAGIEEGRIAGHAAGFHQGKLEGVAAGKTSINEQLNSLLV 99
+ + ++ + +Q GY+AGI EGR GH G+ +G +G+ G Q +
Sbjct: 37 EPSLEQQLAQLQMQAHEQGYQAGIAEGRQQGHKQGYQEGLAQGLEQGLAEAKSQQAPIHA 96

Query: 100 PLGALRELLEDGHAKQVREQQNLILDLVRRVSQQVIRCELTLQPQQILKLVEETLSALPD 159
+ L + + ++ + ++QVI T+ ++K +++ L P
Sbjct: 97 RMQQLVSEFQTTLDALDSVIASRLMQMALEAARQVIGQTPTVDNSALIKQIQQLLQQEPL 156

Query: 160 DQSDMKIHLEPNAVIKLKEL--AEDKIRGWNLIADSNISAGSCRIVSNKSDADASVETRL 217
++ + P+ + ++ ++ A + GW L D + G C++ +++ D DASV TR
Sbjct: 157 FSGKPQLRVHPDDLQRVDDMLGATLSLHGWRLRGDPTLHPGGCKVSADEGDLDASVATRW 216

Query: 218 DTCMKL 223
+L
Sbjct: 217 QELCRL 222


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3475  FLGMOTORFLIG  172    3e-53    Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 172 bits (437), Expect = 3e-53
Identities = 80/324 (24%), Positives = 170/324 (52%), Gaps = 1/324 (0%)

Query: 6 QAAMLLLSMGEEGAARVMAHLDRNDVQHLSHKMARLSSITQQEAEAVLSRFFQRYKEQSG 65
+AA+LL+S+G E +++V +L + +++ L+ ++A+L +IT + + VL F + Q
Sbjct: 20 KAAILLVSIGSEISSKVFKYLSQEEIESLTFEIAKLETITSELKDNVLLEFKELMMAQEF 79

Query: 66 IARASRSYLQKTLDIALGDRLSKSLIDSIYGDEIKVLVKRLEWVDPQLLAREITHEHCQL 125
I + Y ++ L+ +LG + + +I+++ + + DP + I EH Q
Sbjct: 80 IQKGGIDYARELLEKSLGTQKAVDIINNLGSALQSRPFEFVRRADPANILNFIQQEHPQT 139

Query: 126 QAVLLGLLPAESAAKILKMLPSDSQDEVLVRIAQLGELDRNVVDELRELVERCMLMAMEK 185
A++L L + A+ IL LP++ Q V RIA + VV E+ ++E+ + +
Sbjct: 140 IALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEVVREVERVLEKKLASLSSE 199

Query: 186 SHTQVAGVKQVADILNRFE-GDREQLMEMIKLHDKQMAIDVTDNMFDFIILGRQKQETLQ 244
+T GV V +I+N + + ++E ++ D ++A ++ MF F + ++Q
Sbjct: 200 DYTSAGGVDNVVEIINMADRKTEKFIIESLEEEDPELAEEIKKKMFVFEDIVLLDDRSIQ 259

Query: 245 TLLGQVPSETLSLALKGIDFELKDALLNALPKRMSSAIETQIEALGGVPVSRASGARKEI 304
+L ++ + L+ ALK +D +++ + + KR +S ++ +E LG ++++I
Sbjct: 260 RVLREIDGQELAKALKSVDIPVQEKIFKNMSKRAASMLKEDMEFLGPTRRKDVEESQQKI 319

Query: 305 MELAKQLMQEGEIELQLFEEQVVV 328
+ L ++L ++GEI + E+ V+
Sbjct: 320 VSLIRKLEEQGEIVISRGGEEDVL 343


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3476  FLGMRINGFLIF  314    e-103    Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 314 bits (807), Expect = e-103
Identities = 154/555 (27%), Positives = 276/555 (49%), Gaps = 56/555 (10%)

Query: 34 RGDRQVIALALLAVVVASVIVLMLWTATAGYRPLYGSQENVDTSQVLNVLEAEGIHYRLE 93
R + ++ + + VA V+ ++LW T YR L+ + + D ++ L I YR
Sbjct: 20 RANPRIPLIVAGSAAVAIVVAMVLWAKTPDYRTLFSNLSDQDGGAIVAQLTQMNIPYRFA 79

Query: 94 ANTGAVLVAEEQVGNARMLLAAKGVKAKVPSGMEALDNNALGTSQFMEQAKYRHSLEGEL 153
+GA+ V ++V R+ LA +G+ G E LD G SQF EQ Y+ +LEGEL
Sbjct: 80 NGSGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQEKFGISQFSEQVNYQRALEGEL 139

Query: 154 ARTIMSLKLVRAARVHLAIPKQSLFIRQEPELPTASVMLQLDPNTRLSESQVEAIVNLVA 213
ARTI +L V++ARVHLA+PK SLF+R++ + P+ASV + L+P L E Q+ A+V+LV+
Sbjct: 140 ARTIETLGPVKSARVHLAMPKPSLFVREQ-KSPSASVTVTLEPGRALDEGQISAVVHLVS 198

Query: 214 GSVTGLTASNIKVVDQDGRYLSENISGNQDLSQSRNKQLQYTRELEHSLVANAASMLEPV 273
+V GL N+ +VDQ G L+++ + +DL+ + QL++ ++E + ++L P+
Sbjct: 199 SAVAGLPPGNVTLVDQSGHLLTQSNTSGRDLN---DAQLKFANDVESRIQRRIEAILSPI 255

Query: 274 LGHDNFQVRVTAKVNFNQVEETKESLDPQ------NVVTQERTSIDDSSNSIAAGIPGAL 327
+G+ N +VTA+++F E+T+E P + +++ + G+PGAL
Sbjct: 256 VGNGNVHAQVTAQLDFANKEQTEEHYSPNGDASKATLRSRQLNISEQVGAGYPGGVPGAL 315

Query: 328 SNKPP-----------------------QVGTAATDDKTRNLKQEESRQYDVGRSVRHVR 364
SN+P T + R+ ++ E+ Y+V R++RH +
Sbjct: 316 SNQPAPPNEAPIATPPTNQQNAQNTPQTSTSTNSNSAGPRSTQRNETSNYEVDRTIRHTK 375

Query: 365 YQQMQLENLSVSVLINS---AAGQGVFNNEAQLVKFGNMVKDAIGFSAARGDSFTINAFE 421
+E LSV+V++N A G+ + Q+ + ++ ++A+GFS RGD+ +
Sbjct: 376 MNVGDIERLSVAVVVNYKTLADGKPLPLTADQMKQIEDLTREAMGFSDKRGDTLNVVNSP 435

Query: 422 FTPTVVAEIPTSPWWQSENY----QSYLRYIIGGILGFGLILFVLRPLVKHLTRTAQITA 477
F V P+WQ +++ + R+++ ++ + L +RP LTR +
Sbjct: 436 F-SAVDNTGGELPFWQQQSFIDQLLAAGRWLLVLVVAWILWRKAVRPQ---LTRRVEEAK 491

Query: 478 PSIEPAALSAASANALDGPTVDYAPNQAHQLPSADWLGSQGLPEPGSPLTVKMEHLALLA 537
+ E A + + A++ Q Q + LG++ V + + ++
Sbjct: 492 AAQEQAQVRQETEEAVEVRLSK--DEQLQQRRANQRLGAE----------VMSQRIREMS 539

Query: 538 NKEPARVAEVIAHWI 552
+ +P VA VI W+
Sbjct: 540 DNDPRVVALVIRQWM 554


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits         Score  E-value  Comments
Sputcn32_3477  FLGHOOKFLIE  47     3e-10    Flagellar hook-basal body complex protein FliE signa...
		>FLGHOOKFLIE#Flagellar hook-basal body complex protein FliE signature.
Length = 103

Score = 47.0 bits (111), Expect = 3e-10
Identities = 20/72 (27%), Positives = 34/72 (47%), Gaps = 1/72 (1%)

Query: 42 SFTDLIKSKVAAVNQDQNQSSMAMTAVDSGKSD-DLVGAMVASQKASLSFATMLQIRNRL 100
SF + + + ++ Q + G+ L M QKAS+S +Q+RN+L
Sbjct: 32 SFAGQLHAALDRISDTQTAARTQAEKFTLGEPGVALNDVMTDMQKASVSMQMGIQVRNKL 91

Query: 101 VQAFDDVMKMPI 112
V A+ +VM M +
Sbjct: 92 VAAYQEVMSMQV 103


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits    Score  E-value  Comments
Sputcn32_3478  HTHFIS  373    e-128    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 373 bits (960), Expect = e-128
Identities = 139/401 (34%), Positives = 205/401 (51%), Gaps = 45/401 (11%)

Query: 73 ELAAAAMRAGVQDYLLIPVESEQLLASIHR----LRRLELPDSS-------LVVSASVSR 121
A A G DYL P + +L+ I R +R LV ++ +
Sbjct: 88 MTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQ 147

Query: 122 QLLMLAHRAATTEASVLLLGESGTGKEPLARYIHRHSSRSHKPFIAINCAAIPESILESV 181
++ + R T+ ++++ GESGTGKE +AR +H + R + PF+AIN AAIP ++ES
Sbjct: 148 EIYRVLARLMQTDLTLMITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPRDLIESE 207

Query: 182 LFGHVKGAFTGAICDKAGKFEQANGGTLLLDEIGEMPLTLQAKLLRVLQEREVERLGGQH 241
LFGH KGAFTGA G+FEQA GGTL LDEIG+MP+ Q +LLRVLQ+ E +GG+
Sbjct: 208 LFGHEKGAFTGAQTRSTGRFEQAEGGTLFLDEIGDMPMDAQTRLLRVLQQGEYTTVGGRT 267

Query: 242 AIALDIRIIASTNRDLRQAVELGHFREDLFYRLDVLPLKISPLRDRKADILPLAEHFLDL 301
I D+RI+A+TN+DL+Q++ G FREDL+YRL+V+PL++ PLRDR DI L HF+
Sbjct: 268 PIRSDVRIVAATNKDLKQSINQGLFREDLYYRLNVVPLRLPPLRDRAEDIPDLVRHFVQQ 327

Query: 302 YHQPDTTSTCYFSEHAKQALVSYDWPGNVRELENCIQRALVMRRGLAIQAADLGLNIQLE 361
+ F + A + + ++ WPGNVRELEN ++R + I + ++ E
Sbjct: 328 AEKEG-LDVKRFDQEALELMKAHPWPGNVRELENLVRRLTALYPQDVITREIIENELRSE 386

Query: 362 V------------------QAVEH------------PEVTDGLRASKQQAEFQYIIDVLK 391
+ QAVE + + E+ I+ L
Sbjct: 387 IPDSPIEKAAARSGSLSISQAVEENMRQYFASFGDALPPSGLYDRVLAEMEYPLILAALT 446

Query: 392 RFNGQRTLSAQALGMTTRALRYRLVQMREAGIDIEMLLEQA 432
G + +A LG+ LR + +RE G+ + A
Sbjct: 447 ATRGNQIKAADLLGLNRNTLRKK---IRELGVSVYRSSRSA 484


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3480  FLGMOTORFLIN  80     1e-22    Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 79.9 bits (197), Expect = 1e-22
Identities = 45/119 (37%), Positives = 71/119 (59%), Gaps = 19/119 (15%)

Query: 13 SDDDFLLDDDIFSEKSLSKQDTQSRKLKNNNFFQQL------------------PVQVTL 54
SD++ DD++++ +L++Q + K + FQQL PV++T+
Sbjct: 8 SDENTGALDDLWAD-ALNEQKATTTKSAADAVFQQLGGGDVSGAMQDIDLIMDIPVKLTV 66

Query: 55 ELASAEMSLGELNRMGEGDVIALDRMVGEPLDIRVNGALLGRGEVVEVAGRYGVRLLEI 113
EL M++ EL R+ +G V+ALD + GEPLDI +NG L+ +GEVV VA +YGVR+ +I
Sbjct: 67 ELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIAQGEVVVVADKYGVRITDI 125


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3481  FLGBIOSNFLIP  227    4e-77    Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP signature.
Length = 245

Score = 227 bits (580), Expect = 4e-77
Identities = 120/244 (49%), Positives = 166/244 (68%), Gaps = 2/244 (0%)

Query: 1 MIVRLLLLSTFLFVPHAIASEGL-TLFTLDSGESSQAVNIKLEILALMTAISFLPVMLMM 59
M L + L++ +A L + + Q+ ++ ++ L +T+++F+P +L+M
Sbjct: 1 MRRLLSVAPVLLWLITPLAFAQLPGITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILLM 60

Query: 60 LTSFTRIIIVLAILRQALGLQQSPPNRVLVGIALILTIFIMRPVGDKIYKEAYLPYDQGK 119
+TSFTRIIIV +LR ALG +PPN+VL+G+AL LT FIM PV DKIY +AY P+ + K
Sbjct: 61 MTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEEK 120

Query: 120 IELMEAISIAKVPLTRFMLAQTRETDLEQMLKIANEPTHMKTAEEVPFFVLMPAFVLSEL 179
I + EA+ PL FML QTRE DL ++AN ++ E VP +L+PA+V SEL
Sbjct: 121 ISMQEALEKGAQPLREFMLRQTREADLGLFARLANTGP-LQGPEAVPMRILLPAYVTSEL 179

Query: 180 KTAFQIGFLIFLPFLVIDLVVASVLMSMGMMMLSPLIISLPFKLMIFVLVDGWAMTVSTL 239
KTAFQIGF IF+PFL+IDLV+ASVLM++GMMM+ P I+LPFKLM+FVLVDGW + V +L
Sbjct: 180 KTAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVGSL 239

Query: 240 TASF 243
SF
Sbjct: 240 AQSF 243


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3482  TYPE3IMQPROT  47     9e-11    Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.
Length = 86

Score = 47.1 bits (112), Expect = 9e-11
Identities = 20/80 (25%), Positives = 39/80 (48%)

Query: 3 INELTALFADSMFLVIMMVSALVTPGLILGLIVAVFQAATQVNEQTLSFLPRLIITLLMV 62
+++L +++LV+++ I+GL+V +FQ TQ+ EQTL F +L+ L +
Sbjct: 1 MDDLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCL 60

Query: 63 LFSGHWLIQQLSDLFDRLFM 82
W + L ++
Sbjct: 61 FLLSGWYGEVLLSYGRQVIF 80


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3483  TYPE3IMRPROT  123    2e-36    Type III secretion system inner membrane R protein ...
		>TYPE3IMRPROT#Type III secretion system inner membrane R protein family signature.
Length = 261

Score = 123 bits (311), Expect = 2e-36
Identities = 78/256 (30%), Positives = 132/256 (51%), Gaps = 2/256 (0%)

Query: 1 MLSLTSTEISMLIGGFWWPFCRIMGAFMIMPLLGNAYVPVMVRIFLALSIAALLAPMLPP 60
ML +TS + + ++WP R++ P+L VP V++ LA+ I +AP LP
Sbjct: 1 MLQVTSEQWLSWLNLYFWPLLRVLALISTAPILSERSVPKRVKLGLAMMITFAIAPSLPA 60

Query: 61 VPPVDAISLASLFLAIEQLLIGFMLALFLTLLIHVMTMLGNIMSMQMGLAMAVMNDPANG 120
S +L+LA++Q+LIG L + + G I+ +QMGL+ A DPA+
Sbjct: 61 NDVPV-FSFFALWLAVQQILIGIALGFTMQFAFAAVRTAGEIIGLQMGLSFATFVDPASH 119

Query: 121 DSSPIISEWFQIFGTLIFLALNGHLVAINIIVDSFRLWPIGH-GIFELPLMGLVSRLAWL 179
+ P+++ + L+FL NGHL I+++VD+F PIG + + L + +
Sbjct: 120 LNMPVLARIMDMLALLLFLTFNGHLWLISLLVDTFHTLPIGGEPLNSNAFLALTKAGSLI 179

Query: 180 FAASLMLAIPAVLAMLMVNITFGVLSRSAPSLNIFSLGFPMTMLMGLLCVFLSLSGIPSR 239
F LMLA+P + +L +N+ G+L+R AP L+IF +GFP+T+ +G+ + + I
Sbjct: 180 FLNGLMLALPLITLLLTLNLALGLLNRMAPQLSIFVIGFPLTLTVGISLMAALMPLIAPF 239

Query: 240 YSDLCLDALTAMYQFI 255
L + + I
Sbjct: 240 CEHLFSEIFNLLADII 255


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3484  TYPE3IMSPROT  323    e-111    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.
Length = 354

Score = 323 bits (829), Expect = e-111
Identities = 114/350 (32%), Positives = 188/350 (53%), Gaps = 6/350 (1%)

Query: 8 DKTEEATPQKLRKARKEGQVPRSKDLASAALVLGCSVMLTSNANWFATKVSGLTKYNMLL 67
+KTE+ TP+K+R ARK+GQV +SK++ S AL++ S ML ++++ S L ML+
Sbjct: 4 EKTEQPTPKKIRDARKKGQVAKSKEVVSTALIVALSAMLMGLSDYYFEHFSKL----MLI 59

Query: 68 TRAELDQPD--MMVRHLGTSLVEMLTILSPLFIMVALLAAIAGALPGGPIFNVGNANFKY 125
+ P + + L+E + PL + AL+A + + G + +
Sbjct: 60 PAEQSYLPFSQALSYVVDNVLLEFFYLCFPLLTVAALMAIASHVVQYGFLISGEAIKPDI 119

Query: 126 SRIDPIAGVGRIFSTQSLVELLKSCLKIVLLISIMLVFLNGHLQSLLSYSHRPIDEAVRD 185
+I+PI G RIFS +SLVE LKS LK+VLL ++ + + G+L +LL I+
Sbjct: 120 KKINPIEGAKRIFSIKSLVEFLKSILKVVLLSILIWIIIKGNLVTLLQLPTCGIECITPL 179

Query: 186 GINLLSQGMLYLGVGLLVIAFIDVPYQYWHHKKQLRMSRQEVKDEHTQQEGKPEIKAKIR 245
+L Q M+ VG +VI+ D ++Y+ + K+L+MS+ E+K E+ + EG PEIK+K R
Sbjct: 180 LGQILRQLMVICTVGFVVISIADYAFEYYQYIKELKMSKDEIKREYKEMEGSPEIKSKRR 239

Query: 246 QLQQRMARSRADTTIPKADVLLVNPTHYAVALKYNPDLADAPYVLTKGTEELALYMRELA 305
Q Q + + ++ V++ NPTH A+ + Y P V K T+ +R++A
Sbjct: 240 QFHQEIQSRNMRENVKRSSVVVANPTHIAIGILYKRGETPLPLVTFKYTDAQVQTVRKIA 299

Query: 306 KKHGVEIIDIPPLTRAIYHSTQVDQQIPSALFIAIAHVLSYVMQIKASRK 355
++ GV I+ PL RA+Y VD IP+ A A VL ++ + ++
Sbjct: 300 EEEGVPILQRIPLARALYWDALVDHYIPAEQIEATAEVLRWLERQNIEKQ 349


91  Sputcn32_3561  Sputcn32_3568  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias   Product
Sputcn32_3561  1       15       0.902950  general secretion pathway protein J
Sputcn32_3562  0       15       1.155576  general secretion pathway protein I
Sputcn32_3563  0       14       1.225388  general secretion pathway protein H
Sputcn32_3564  0       13       1.464855  general secretion pathway protein G
Sputcn32_3565  0       14       1.560468  general secretion pathway protein F
Sputcn32_3566  0       14       2.005736  general secretion pathway protein E
Sputcn32_3567  0       14       1.750597  general secretion pathway protein D
Sputcn32_3568  -2      17       1.108090  general secretion pathway protein C
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3561  BCTERIALGSPG  29     0.007    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.
Length = 145

Score = 29.5 bits (66), Expect = 0.007
Identities = 16/41 (39%), Positives = 25/41 (60%), Gaps = 3/41 (7%)

Query: 3 LKLTKAHTGFTLLEMLIAIAIFAMIGVAANAVLSTVLTNDE 43
++ T GFTLLE+++ I I IGV A+ V+ ++ N E
Sbjct: 1 MRATDKQRGFTLLEIMVVIVI---IGVLASLVVPNLMGNKE 38


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3562  PilS_PF08805  29     0.004    PilS N terminal
		>PilS_PF08805#PilS N terminal

Length = 185

Score = 28.7 bits (64), Expect = 0.004
Identities = 12/32 (37%), Positives = 17/32 (53%)

Query: 5 KGMTLLEVIVALAVFSVAAVSITKSLGEQMAN 36
KG TL+EV++ + V V A S K +N
Sbjct: 26 KGATLMEVLLVVGVIVVLAASAYKLYSMVQSN 57


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3563  BCTERIALGSPH  88     3e-24    Bacterial general secretion pathway protein H signa...
		>BCTERIALGSPH#Bacterial general secretion pathway protein H signature.
Length = 170

Score = 87.7 bits (217), Expect = 3e-24
Identities = 42/171 (24%), Positives = 72/171 (42%), Gaps = 39/171 (22%)

Query: 4 LRQTGFTLMEVMLVILLMGLTAAAVTMSIGNSGPQQALERTAQQFMAATELVLDETVLSG 63
+RQ GFTL+E+ML++LLMG++A V ++ S A + T +F A V + +G
Sbjct: 1 MRQRGFTLLEMMLILLLMGVSAGMVLLAFPASRDDSAAQ-TLARFEAQLRFVQQRGLQTG 59

Query: 64 QFIGIVIEKTSYQFVFYKDG---------------KWNPLEKDRMLSEKQMETGVSLNLV 108
QF G+ + +QF+ + +W PL R+ +
Sbjct: 60 QFFGVSVHPDRWQFLVLEARDGADPAPADDGWSGYRWLPLRAGRVATSGS---------- 109

Query: 109 LDGLPLVQEDEEEKSWFDEPFIEAKTEDKKKHPEPQIMLFPSGEMSAFELS 159
+ G L + ++W P +++FP GEM+ F L+
Sbjct: 110 IAGGKLNLAFAQGEAW-------------TPGDNPDVLIFPGGEMTPFRLT 147


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3564  BCTERIALGSPG  234    4e-83    Bacterial general secretion pathway protein G signa...
		>BCTERIALGSPG#Bacterial general secretion pathway protein G signature.
Length = 145

Score = 234 bits (599), Expect = 4e-83
Identities = 98/144 (68%), Positives = 121/144 (84%)

Query: 1 MQIDNRQKGFTLLEVMVVIVILGILASMVVPNLMGNKDKADQQKAVSDIVALENALDMYK 60
M+ ++Q+GFTLLE+MVVIVI+G+LAS+VVPNLMGNK+KAD+QKAVSDIVALENALDMYK
Sbjct: 1 MRATDKQRGFTLLEIMVVIVIIGVLASLVVPNLMGNKEKADKQKAVSDIVALENALDMYK 60

Query: 61 LDNGVYPTTEQGLEALVQKPTISPEPRNYREEGYVKRLPQDPWRNNYLLLSPGENSKLDI 120
LDN YPTT QGLE+LV+ PT+ P NY +EGY+KRLP DPW N+Y+L++PGE+ D+
Sbjct: 61 LDNHHYPTTNQGLESLVEAPTLPPLAANYNKEGYIKRLPADPWGNDYVLVNPGEHGAYDL 120

Query: 121 FSAGPDGQPGTEDDIGNWNLQNFQ 144
SAGPDG+ GTEDDI NW L +
Sbjct: 121 LSAGPDGEMGTEDDITNWGLSKKK 144


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3565  BCTERIALGSPF  507    0.0      Bacterial general secretion pathway protein F signa...
		>BCTERIALGSPF#Bacterial general secretion pathway protein F signature.
Length = 408

Score = 507 bits (1307), Expect = 0.0
Identities = 229/408 (56%), Positives = 308/408 (75%), Gaps = 1/408 (0%)

Query: 1 MPAFEYKALDAKGKQLKGVIEADTARHARSQLREQRMMPLEILPVTEKEAKAKSSGFSV- 59
M + Y+ALDA+GK+ +G EAD+AR AR LRE+ ++PL + + K+ S+G S+
Sbjct: 1 MAQYHYQALDAQGKKCRGTQEADSARQARQLLRERGLVPLSVDENRGDQQKSGSTGLSLR 60

Query: 60 FKRGISVAELALITRQIATLVAAGLPIEESLKAVGQQCEKDRLASMIMAVRSRVVEGYSF 119
K +S ++LAL+TRQ+ATLVAA +P+EE+L AV +Q EK L+ ++ AVRS+V+EG+S
Sbjct: 61 RKIRLSTSDLALLTRQLATLVAASMPLEEALDAVAKQSEKPHLSQLMAAVRSKVMEGHSL 120

Query: 120 ADSLAEFPHIFDDLYRAMVASGEKSGHLEVVLNRLADYTERRQQLKSKLQQAMIYPIVLT 179
AD++ FP F+ LY AMVA+GE SGHL+ VLNRLADYTE+RQQ++S++QQAMIYP VLT
Sbjct: 121 ADAMKCFPGSFERLYCAMVAAGETSGHLDAVLNRLADYTEQRQQMRSRIQQAMIYPCVLT 180

Query: 180 VVAIGVISVLLAAVVPKVVGQFEHMGAELPATTRFLISASDFVQNYGLFVVLAIVLLAVV 239
VVAI V+S+LL+ VVPKVV QF HM LP +TR L+ SD V+ +G +++LA++ +
Sbjct: 181 VVAIAVVSILLSVVVPKVVEQFIHMKQALPLSTRVLMGMSDAVRTFGPWMLLALLAGFMA 240

Query: 240 FQRMLKSPAFRMKYDSFLLKMPVIGRVSKGLNTARFARTLSILSASSVPLLDGMRIASEV 299
F+ ML+ R+ + LL +P+IGR+++GLNTAR+ARTLSIL+AS+VPLL MRI+ +V
Sbjct: 241 FRVMLRQEKRRVSFHRRLLHLPLIGRIARGLNTARYARTLSILNASAVPLLQAMRISGDV 300

Query: 300 LQNMRVRAAVDDATARVREGTSLGAALTNTKLFPAMMLYMIASGEKSGQLEQMLERAADN 359
+ N R + AT VREG SL AL T LFP MM +MIASGE+SG+L+ MLERAADN
Sbjct: 301 MSNDYARHRLSLATDAVREGVSLHKALEQTALFPPMMRHMIASGERSGELDSMLERAADN 360

Query: 360 QDREFEGNVNIALGVFEPMLVVSMAGIVLFIVMAILQPILALNNLISS 407
QDREF + +ALG+FEP+LVVSMA +VLFIV+AILQPIL LN L+SS
Sbjct: 361 QDREFSSQMTLALGLFEPLLVVSMAAVVLFIVLAILQPILQLNTLMSS 408


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3567  BCTERIALGSPD  596    0.0      Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D signature.
Length = 660

Score = 596 bits (1539), Expect = 0.0
Identities = 327/680 (48%), Positives = 446/680 (65%), Gaps = 33/680 (4%)

Query: 6 IRRKLIAGVVAGAAMFTSQFVWSEQYAANFKGTDIQEFINIVGKNLNKTIIVDPTIRGKI 65
IR + ++ A +F +E+++A+FKGTDIQEFIN V KNLNKT+I+DP++RG I
Sbjct: 7 IRSFSLTLLIFAALLFRP--AAAEEFSASFKGTDIQEFINTVSKNLNKTVIIDPSVRGTI 64

Query: 66 NVRSYDLLNDEQYYQFFLNVLQVYGYAIVEMENGVIKVVKDKSAQTGAIRVANDNDPGIG 125
VRSYD+LN+EQYYQFFL+VL VYG+A++ M NGV+KVV+ K A+T A+ VA+D PGIG
Sbjct: 65 TVRSYDMLNEEQYYQFFLSVLDVYGFAVINMNNGVLKVVRSKDAKTAAVPVASDAAPGIG 124

Query: 126 DEMVTRIVALYNTEAKQLAPLLRQLNDNAGGGNVVNYDPSNVLMLTGRASVVNKLVEIIR 185
DE+VTR+V L N A+ LAPLLRQLNDNAG G+VV+Y+PSNVL++TGRA+V+ +L+ I+
Sbjct: 125 DEVVTRVVPLTNVAARDLAPLLRQLNDNAGVGSVVHYEPSNVLLMTGRAAVIKRLLTIVE 184

Query: 186 RVDKQGDTSVQVVPLEYASAGEMVRIIDTLYRASANQAQLPGQAPKVVADERINAVIVSG 245
RVD GD SV VPL +ASA ++V+++ L + ++ A VVADER NAV+VSG
Sbjct: 185 RVDNAGDRSVVTVPLSWASAADVVKLVTELNKDTSKSALPGSMVANVVADERTNAVLVSG 244

Query: 246 DEKSRQRVVELIHRLDAEQASTGNTKVRYLRYAKAEDLVEVLTGFAQKLEGEKDPSAQAG 305
+ SRQR++ +I +LD +QA+ GNTKV YL+YAKA DLVEVLTG + ++ EK +
Sbjct: 245 EPNSRQRIIAMIKQLDRQQATQGNTKVIYLKYAKASDLVEVLTGISSTMQSEKQAAKPVA 304

Query: 306 AKRRNEINIMAHTDTNALVISAEPDQMRTIESVINQLDIRRAQVLVEAIIVEVAEGDNVG 365
A +N I I AH TNAL+++A PD M +E VI QLDIRR QVLVEAII EV + D +
Sbjct: 305 ALDKN-IIIKAHGQTNALIVTAAPDVMNDLERVIAQLDIRRPQVLVEAIIAEVQDADGLN 363

Query: 366 FGVQWAAKAGGGTQFNNLGPTIGEIGAGIWQAQDKEGTYITNPSTGEVIGKNPDTQGDVT 425
G+QWA K G TQF N G I AG + G V+
Sbjct: 364 LGIQWANKNAGMTQFTNSGLPISTAIAG---------------------ANQYNKDGTVS 402

Query: 426 -LLAQALGKVNGMAWGVAMGDFGALIQAVSSDTNSNVLATPSITTLDNQEASFIVGDEVP 484
LA AL NG+A G G++ L+ A+SS T +++LATPSI TLDN EA+F VG EVP
Sbjct: 403 SSLASALSSFNGIAAGFYQGNWAMLLTALSSSTKNDILATPSIVTLDNMEATFNVGQEVP 462

Query: 485 ILTGSTASSNNSNPFQTVERKEVGVKLKVVPQINEGNAVKLAIEQEVSGVNG-----NTG 539
+LTGS +++ N F TVERK VG+KLKV PQINEG++V L IEQEVS V ++
Sbjct: 463 VLTGSQ-TTSGDNIFNTVERKTVGIKLKVKPQINEGDSVLLEIEQEVSSVADAASSTSSD 521

Query: 540 VDISFATRRLTTTVMADSGQIVVLGGLINEEVQESVQKVPFLGDIPVLGHLFKSSSSKKT 599
+ +F TR + V+ SG+ VV+GGL+++ V ++ KVP LGDIPV+G LF+S+S K +
Sbjct: 522 LGATFNTRTVNNAVLVGSGETVVVGGLLDKSVSDTADKVPLLGDIPVIGALFRSTSKKVS 581

Query: 600 KKNLMIFIKPTIIRDGMTMEGIAGRKYNYFRALQLEQQ-KRGVNLMPDTQVPVLDEWNQS 658
K+NLM+FI+PT+IRD + +Y F Q +Q+ K + M + + + Q
Sbjct: 582 KRNLMLFIRPTVIRDRDEYRQASSGQYTAFNDAQSKQRGKENNDAMLNQDLLEIYP-RQD 640

Query: 659 EYLPPEVNDILERYKDGRGL 678
+V+ ++ + G L
Sbjct: 641 TAAFRQVSAAIDAFNLGGNL 660


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3568  BCTERIALGSPC  177    2e-56    Bacterial general secretion pathway protein C signa...
		>BCTERIALGSPC#Bacterial general secretion pathway protein C signature.
Length = 272

Score = 177 bits (451), Expect = 2e-56
Identities = 70/295 (23%), Positives = 141/295 (47%), Gaps = 33/295 (11%)

Query: 8 IAKAASIPHKPLSQVTFWFGFIVSLLLAAQITWKLVPTRSSPSAWSPTATTVGSGAGQID 67
I+K + + ++ F+ ++ A I W++ ++P S T + A Q
Sbjct: 3 ISKLPPLSPSVIRRILFYLLMLLFCQQLAMIFWRIGLPDNAPV-SSVQIT--PAQARQQP 59

Query: 68 LTGLQQLGLFGKADAQSERPKVEVVEAITDAPKTSLSILLTGVVASTADQKGLAIIESSG 127
+T L LFG + +++ ++ + ++ P ++L++ LTGV+A D + +AII
Sbjct: 60 VT-LNDFTLFGVSPEKNKAGALDASQM-SNLPPSTLNLSLTGVMAGDDDSRSIAIISKDN 117

Query: 128 IQETYSLGDKIKGTSASLKEVYADRIIITNAGRYETLMLDGLVYTSQSAVNQQLQQAKTS 187
Q + + +++ G +A + + DR+++ GRYE L L + V
Sbjct: 118 EQFSRGVNEEVPGYNAKIVSIRPDRVVLQYQGRYEVLGLYSQEDSGSDGVPG-------- 169

Query: 188 KVEQTVSRVDQRQNAEISQELAESRNELLADPSKITDYIAISPVRQGDAVVGYRLNPGKD 247
A+++++L + + ++DY++ SP+ + + GYRLNPG
Sbjct: 170 --------------AQVNEQLQQR------ASTTMSDYVSFSPIMNDNKLQGYRLNPGPK 209

Query: 248 VNLFRQAGFKANDLAKSINGYDLTVMTQALEMMSQLPELTEVSIMVEREGQLVEI 302
+ F + G + ND+A ++NG DL QA + M ++ ++ ++ VER+GQ +I
Sbjct: 210 SDSFYRVGLQDNDMAVALNGLDLRDAEQAKKAMERMADVHNFTLTVERDGQRQDI 264


92  Sputcn32_3589  Sputcn32_3595  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_3589  5       26       4.181815   hypothetical protein
Sputcn32_3590  6       27       4.369509   HemY domain-containing protein
Sputcn32_3591  7       28       4.382678   outer membrane adhesin like protein
Sputcn32_3592  0       17       -3.095682  ABC transporter-like protein
Sputcn32_3593  1       16       -3.504486  HlyD family type I secretion membrane fusion
Sputcn32_3594  1       13       -3.142422  TolC family type I secretion outer membrane
Sputcn32_3595  0       13       -3.324193  OmpA/MotB domain-containing protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits       Score  E-value  Comments
Sputcn32_3589  RTXTOXIND  32     0.006    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.
Length = 478

Score = 31.7 bits (72), Expect = 0.006
Identities = 15/78 (19%), Positives = 37/78 (47%), Gaps = 7/78 (8%)

Query: 86 YQQLQQQIQQQQLAQDEKNNALQSQLAQALLQPNQRIEQLEQQQLNDAKT-----YQELS 140
+ Q Q Q++L D+K + LA+ + + + ++E+ +L+D +
Sbjct: 195 FSTWQNQKYQKELNLDKKRAERLTVLAR--INRYENLSRVEKSRLDDFSSLLHKQAIAKH 252

Query: 141 KLVENQSQLQDRVNKLAE 158
++E +++ + VN+L
Sbjct: 253 AVLEQENKYVEAVNELRV 270



Score = 29.4 bits (66), Expect = 0.025
Identities = 6/41 (14%), Positives = 15/41 (36%)

Query: 88 QLQQQIQQQQLAQDEKNNALQSQLAQALLQPNQRIEQLEQQ 128
Q++ +I + ++++ L Q I L +
Sbjct: 277 QIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLE 317


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score  E-value  Comments
Sputcn32_3591  CABNDNGRPT  80     8e-17    NodO calcium binding signature.
		>CABNDNGRPT#NodO calcium binding signature.

Length = 479

Score = 80.0 bits (197), Expect = 8e-17
Identities = 51/218 (23%), Positives = 77/218 (35%), Gaps = 26/218 (11%)

Query: 3920 GGGQSGIITNSNGKEVVAS-----GANNKSYSTTDAQFVNGGDGNDHIETGKGNDVIYAG 3974
G +G + + +A+ GAN + + N D + +
Sbjct: 235 GADYNGHYGGAPMIDDIAAIQRLYGANMTTRTGDSVYGFNSNTDRDFYTATDSSKALIFS 294

Query: 3975 RTGSTGYGSDDALELSVNTLLNHHIMTGELTGANRMVDSNGLLLANDVASHKADIVNGGS 4034
+ G + D S N +N G L N + G
Sbjct: 295 VWDAGGTDTFDFSGYSNNQRIN---------LNEGSFSDVGGLKGNVS-------IAHGV 338

Query: 4035 GDDRIYGQSGSDILYGHTGNDYIDGGSHNDALRGGEGNDTLIGGLGDDVLRGDSGADTFV 4094
+ G SG+DIL G++ ++ + GG+ ND L GG G DTL GG G D SG D+
Sbjct: 339 TIENAIGGSGNDILVGNSADNILQGGAGNDVLYGGAGADTLYGGAGRDTFVYGSGQDST- 397

Query: 4095 WRYAEFGTDHIMDFKVTEDKLDLSDLLQGESANNLDSY 4132
D I DF+ DK+DLS + +
Sbjct: 398 ----VAAYDWIADFQKGIDKIDLSAFRNEGQLSFVQDQ 431



Score = 38.8 bits (90), Expect = 5e-04
Identities = 24/176 (13%), Positives = 43/176 (24%), Gaps = 52/176 (29%)

Query: 3909 NPNQKILNVSFGGGQSGIITNSNGKEVVASGANNKSYSTTDAQFVNGGDGNDHIETGKGN 3968
+ Q + + +V N + GG GND + +
Sbjct: 299 GGTDTFDFSGYSNNQRINLNEGSFSDVGGLKGNVSIAHGVTIENAIGGSGNDILVGNSAD 358

Query: 3969 DVIYAGRTGSTGYGSDDALELSVNTLLNHHIMTGELTGANRMVDSNGLLLANDVASHKAD 4028
+++ G G+D
Sbjct: 359 NILQGGA------GND-------------------------------------------- 368

Query: 4029 IVNGGSGDDRIYGQSGSDILYGHTGNDYIDGGSH--NDALRGGEGNDTLIGGLGDD 4082
++ GG+G D +YG +G D +G D D +G + D
Sbjct: 369 VLYGGAGADTLYGGAGRDTFVYGSGQDSTVAAYDWIADFQKGIDKIDLSAFRNEGQ 424


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits       Score  E-value  Comments
Sputcn32_3593  RTXTOXIND  318    e-106    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 318 bits (816), Expect = e-106
Identities = 96/434 (22%), Positives = 196/434 (45%), Gaps = 15/434 (3%)

Query: 29 RLIIWAMAAMIVCFLLWAAFAKLDKVTTGSGKVIPSSQVQVIQSLDGGIMQELYVREGEL 88
RL+ + + +V + + +++ V T +GK+ S + + I+ ++ I++E+ V+EGE
Sbjct: 58 RLVAYFIMGFLVIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEIIVKEGES 117

Query: 89 VTKGQPLVRIDDTRFRSDFAQQEQEVFGLKTNAIRMRAELDSILISDMTSDWREQVLITK 148
V KG L+++ +D + + + + R + SI L
Sbjct: 118 VRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSI------------ELNKL 165

Query: 149 KALVFPDSIV--AAEPALVHRQQEEYNGRLDNLSNQLEILVRQIQQRQQEIDELASKTTT 206
L PD V R + NQ + +++ E + ++
Sbjct: 166 PELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKRAERLTVLARINR 225

Query: 207 LTTSMQLISRELELTRPLAKKGIVPEVELLKLERAVNDAQGELNSLRLLRPKLKSALDEA 266
++ L+ L K + + +L+ E +A EL + +++S + A
Sbjct: 226 YENLSRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQIESEILSA 285

Query: 267 ILKRREAVFVYAADLRAQLNETQTRLSRMNEAQVGAQDKVSKAIITSPVNGTIKTTHINT 326
+ + ++ ++ +L +T + + +++ ++I +PV+ ++ ++T
Sbjct: 286 KEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQASVIRAPVSVKVQQLKVHT 345

Query: 327 LGGVVQPGVNIIEIVPSEDQLLIETKVLPKDIAFLHPGLPAIVKVTAYDFTRYGGLKGTV 386
GGVV ++ IVP +D L + V KDI F++ G AI+KV A+ +TRYG L G V
Sbjct: 346 EGGVVTTAETLMVIVPEDDTLEVTALVQNKDIGFINVGQNAIIKVEAFPYTRYGYLVGKV 405

Query: 387 EHISADTSQDEEGNSFYLIRVRTEESSLVKDDGTQMPIIPGMLTTVDVITGQRSILEYIL 446
++I+ D +D+ + + + EE+ L +P+ GM T ++ TG RS++ Y+L
Sbjct: 406 KNINLDAIEDQRLGLVFNVIISIEENCLS-TGNKNIPLSSGMAVTAEIKTGMRSVISYLL 464

Query: 447 NPILRAKDTALRER 460
+P+ + +LRER
Sbjct: 465 SPLEESVTESLRER 478


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits        Score  E-value  Comments
Sputcn32_3595  OMPADOMAIN  91     6e-24    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 90.8 bits (225), Expect = 6e-24
Identities = 32/118 (27%), Positives = 53/118 (44%), Gaps = 12/118 (10%)

Query: 77 NILFPNDSAFIAPEYYSQIEDIAAFLRQY--PTTKVTIEGHTSRTGTDERNAVLSQERAD 134
++LF + A + PE + ++ + + L V + G+T R G+D N LS+ RA
Sbjct: 220 DVLFNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAYNQGLSERRAQ 279

Query: 135 AVTAALADRFNIDRSRLTAIGYGSSRPIVLEQTPEAEMR---------NRRVVAEVTG 183
+V L + I +++A G G S P+ + R +RRV EV G
Sbjct: 280 SVVDYLISK-GIPADKISARGMGESNPVTGNTCDNVKQRAALIDCLAPDRRVEIEVKG 336


93  Sputcn32_3703  Sputcn32_3711  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_3703  -1      18       -3.294302  DNA adenine methylase
Sputcn32_3704  -1      19       -3.739275  sporulation domain-containing protein
Sputcn32_3705  0       20       -4.698524  3-dehydroquinate synthase
Sputcn32_3706  0       18       -5.070340  shikimate kinase I
Sputcn32_3707  -1      16       -3.908478  type IV pilus secretin PilQ
Sputcn32_3708  -2      13       -1.697343  pilus assembly protein, PilQ
Sputcn32_3709  -1      12       -0.541277  pilus assembly protein, PilO
Sputcn32_3710  -1      12       0.332350   fimbrial assembly family protein
Sputcn32_3711  -1      12       1.114564   type IV pilus assembly protein PilM
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3703  TYPE3IMSPROT  33     0.001    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 32.8 bits (75), Expect = 0.001
Identities = 22/91 (24%), Positives = 30/91 (32%), Gaps = 17/91 (18%)

Query: 164 IGYEKAFEQIRTGDVIYCDPP-------YAPLSTTASFTTYVGAGFSLDDQALLARHSRH 216
I E ++ V+ +P Y T T+ D Q R
Sbjct: 245 IQSRNMRENVKRSSVVVANPTHIAIGILYKRGETPLPLVTFKYT----DAQVQ---TVRK 297

Query: 217 TALERGIPVLISNHDIPLTRELYRGAHLAKL 247
A E G+P+L IPL R LY A +
Sbjct: 298 IAEEEGVPIL---QRIPLARALYWDALVDHY 325


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
Sputcn32_3704  PF05272  32     0.007    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 31.6 bits (71), Expect = 0.007
Identities = 15/65 (23%), Positives = 24/65 (36%)

Query: 14 ALIQRLHHIASYSDQLLVLSGAQGSGKTTLVTALATDFDESNAALVICPMHADNAEIRRK 73
+ R+ D +VL G G GK+TL+ L S+ I +I
Sbjct: 583 GHVARVMEPGCKFDYSVVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGTGKDSYEQIAGI 642

Query: 74 ILVQL 78
+ +L
Sbjct: 643 VAYEL 647


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
Sputcn32_3706  PF05272  31     0.002    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 31.2 bits (70), Expect = 0.002
Identities = 16/64 (25%), Positives = 25/64 (39%), Gaps = 8/64 (12%)

Query: 9 LVGPMGAGKSTIGRHLAQML-----HLEFHDSDQEIEQRTGADIAWVFDVEGEEGFRRRE 63
L G G GKST+ L + H + EQ G +++ FRR +
Sbjct: 601 LEGTGGIGKSTLINTLVGLDFFSDTHFDIGTGKDSYEQIAG---IVAYELSEMTAFRRAD 657

Query: 64 AQVI 67
A+ +
Sbjct: 658 AEAV 661


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3707  BCTERIALGSPD  247    1e-74    Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D signature.

Length = 660

Score = 247 bits (631), Expect = 1e-74
Identities = 93/401 (23%), Positives = 178/401 (44%), Gaps = 38/401 (9%)

Query: 313 DVPWDQALDLILQTKGLDKRIEGNILMVAPSEELAIRESQNLKNKQEVKELAPLYSEYLQ 372
+ W A D++ L+K + L + + E N ++
Sbjct: 198 PLSWASAADVVKLVTELNKDTSKSALPGSMVANVVADERTNAVLVSGEPNSRQRIIAMIK 257

Query: 373 ----------------INYAKAIDIAELLKSADSSLLSPRG------------SVAVDER 404
+ YAKA D+ E+L S++ S + + +
Sbjct: 258 QLDRQQATQGNTKVIYLKYAKASDLVEVLTGISSTMQSEKQAAKPVAALDKNIIIKAHGQ 317

Query: 405 TNTVLVKDTAEIIENIHRLVEVLDIPIRQVLIESRMVTVKDNVSEDLGIRWGITDQQGNK 464
TN ++V +++ ++ R++ LDI QVL+E+ + V+D +LGI+W + +
Sbjct: 318 TNALIVTAAPDVMNDLERVIAQLDIRRPQVLVEAIIAEVQDADGLNLGIQWANKNAGMTQ 377

Query: 465 GSSGSLEGAQDIANGIVPSIGDRLNVNLPAQVDSAASIAFHVAKLADGTILDLELSALEQ 524
++ L + IA + ++ +L + + S IA + + L+AL
Sbjct: 378 FTNSGLPISTAIAGANQYNKDGTVSSSLASALSSFNGIAAGFYQGN----WAMLLTALSS 433

Query: 525 ENKGEIIASPRITTSNQKAAYIEQGIEIPYV-----QSTSSGATSVTFKKAVLSLRVTPQ 579
K +I+A+P I T + A G E+P + S + +V K + L+V PQ
Sbjct: 434 STKNDILATPSIVTLDNMEATFNVGQEVPVLTGSQTTSGDNIFNTVERKTVGIKLKVKPQ 493

Query: 580 ITPDNRVILDLEITQDSEGKTVPTSTGP-AVAIDTQRIGTQVLVNNGETIVLGGIYQQNL 638
I + V+L++E S +++ +T+ + VLV +GET+V+GG+ +++
Sbjct: 494 INEGDSVLLEIEQEVSSVADAASSTSSDLGATFNTRTVNNAVLVGSGETVVVGGLLDKSV 553

Query: 639 ISRVSKVPVLGDIPLVGFLFRNTTDKNERQELLIFVTPKIV 679
KVP+LGDIP++G LFR+T+ K ++ L++F+ P ++
Sbjct: 554 SDTADKVPLLGDIPVIGALFRSTSKKVSKRNLMLFIRPTVI 594



Score = 48.4 bits (115), Expect = 8e-08
Identities = 33/175 (18%), Positives = 75/175 (42%), Gaps = 14/175 (8%)

Query: 274 SLNFQNISVRTVLQIIADYNNFNLVTSDTVEGNITLR-LDDVPWDQALDLILQTKGLDKR 332
S +F+ ++ + ++ N ++ +V G IT+R D + +Q L
Sbjct: 31 SASFKGTDIQEFINTVSKNLNKTVIIDPSVRGTITVRSYDMLNEEQYYQFFLSVL----D 86

Query: 333 IEGNILMVAPSEELAIRESQNLKNKQEVK--ELAP-----LYSEYLQINYAKAIDIAELL 385
+ G ++ + L + S++ K + AP + + + + A D+A LL
Sbjct: 87 VYGFAVINMNNGVLKVVRSKDAKTAAVPVASDAAPGIGDEVVTRVVPLTNVAARDLAPLL 146

Query: 386 KSADSSLLSPRGSVAVDERTNTVLVKDTAEIIENIHRLVEVLDIPIRQVLIESRM 440
+ + + + GSV E +N +L+ A +I+ + +VE +D + ++ +
Sbjct: 147 RQLNDN--AGVGSVVHYEPSNVLLMTGRAAVIKRLLTIVERVDNAGDRSVVTVPL 199


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits          Score  E-value  Comments
Sputcn32_3711  SHAPEPROTEIN  42     2e-06    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 42.1 bits (99), Expect = 2e-06
Identities = 33/156 (21%), Positives = 59/156 (37%), Gaps = 34/156 (21%)

Query: 199 VDIGANMTTFSVVESGETTFIREQAFGGELFTQSILSFYGMSY------EQAEKAKIE-- 250
VDIG T +V+ + GG+ F ++I+++ +Y AE+ K E
Sbjct: 164 VDIGGGTTEVAVISLNGVVYSSSVRIGGDRFDEAIINYVRRNYGSLIGEATAERIKHEIG 223

Query: 251 -------------------GDLPRNY------MFEVLSPFQTQLLQQVKRTLQIYCTSSG 285
+PR + + E L T ++ V L+
Sbjct: 224 SAYPGDEVREIEVRGRNLAEGVPRGFTLNSNEILEALQEPLTGIVSAVMVALEQCPPELA 283

Query: 286 RDKVDY-LVLCGGTSKLEGMANLLINELGVHTIIAD 320
D + +VL GG + L + LL+ E G+ ++A+
Sbjct: 284 SDISERGMVLTGGGALLRNLDRLLMEETGIPVVVAE 319


94  Sputcn32_3846  Sputcn32_3854  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag       DNBias  CDNBias  %GCBias    Product
Sputcn32_3846  -1      13       1.532815   osmolarity response regulator
Sputcn32_3847  -1      14       0.571817   osmolarity sensor protein
Sputcn32_3848  0       14       0.033855   methyl-accepting chemotaxis sensory transducer
Sputcn32_3849  -1      14       0.209027   redoxin domain-containing protein
Sputcn32_3850  -1      13       0.126021   hypothetical protein
Sputcn32_3851  -2      11       -0.429554  peptidase
Sputcn32_3852  -1      11       -0.752928  two component transcriptional regulator
Sputcn32_3853  -2      10       0.080086   integral membrane sensor signal transduction
Sputcn32_3854  0       11       0.844447   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag       Hits    Score  E-value  Comments
Sputcn32_3846  HTHFIS  98     7e-26    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 98.4 bits (245), Expect = 7e-26
Identities = 38/129 (29%), Positives = 66/129 (51%)

Query: 6 SKILVVDDDMRLRALLERYLMEQGYQVRSAANAEQMDRLLERENFHLLVLDLMLPGEDGL 65
+ ILV DDD +R +L + L GY VR +NA + R + + L+V D+++P E+
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 66 SICRRLRQQGSLIPIVMLTAKGDEVDRIIGLELGADDYLPKPFNPRELLARIKAVMRRQT 125
+ R+++ +P+++++A+ + I E GA DYLPKPF+ EL+ I +
Sbjct: 64 DLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPK 123

Query: 126 LEVPGAPAQ 134

Sbjct: 124 RRPSKLEDD 132


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
Sputcn32_3847  PF06580  54     4e-10    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 54.1 bits (130), Expect = 4e-10
Identities = 27/179 (15%), Positives = 60/179 (33%), Gaps = 28/179 (15%)

Query: 270 IVNDIEDMDAIISQFIAYIRQ----DQEANRELAQINKLIQDIVQAEANRAGEIEMVLTD 325
I+ D +++ +R LA ++ +Q + + +
Sbjct: 186 ILEDPTKAREMLTSLSELMRYSLRYSNARQVSLADELTVVDSYLQLASIQFEDRLQFENQ 245

Query: 326 CPEALFQAIAIKRVLSNLVENAFRYG------SGWIRISSQYDGKRIGFSVEDNGPGIDE 379
A+ ++ LVEN ++G G I + D + VE+ G +
Sbjct: 246 INPAIMDVQVPPMLVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALK 305

Query: 380 SQIQKLFQPFTQGDIARGSVGSGLGLA-IIKRIIDRHQGQITLS-NRKEGGLKAQVWLP 436
+ + +G GL + +R+ + + + + K+G + A V +P
Sbjct: 306 NTKE----------------STGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits    Score  E-value  Comments
Sputcn32_3852  HTHFIS  83     2e-20    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 82.6 bits (204), Expect = 2e-20
Identities = 31/125 (24%), Positives = 62/125 (49%)

Query: 2 RLLLVEDDIELQTNLKQHLLDAHYSIDVASDGEEGLFQALECNYDAAIIDVGLPKLDGIS 61
+L+ +DD ++T L Q L A Y + + S+ + D + DV +P +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 LIRSVRQKERDFPILILTARDSWQDKVEGLDAGADDYLTKPFHPQELVARLKALIRRSAG 121
L+ +++ D P+L+++A++++ ++ + GA DYL KPF EL+ + +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 122 KASPL 126
+ S L
Sbjct: 125 RPSKL 129


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits     Score  E-value  Comments
Sputcn32_3853  PF06580  30     0.021    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 29.8 bits (67), Expect = 0.021
Identities = 16/80 (20%), Positives = 26/80 (32%), Gaps = 15/80 (18%)

Query: 365 KAAKSTVKLTVTGDAYQLQICIEDDGPGISESLKNQIFERGIRADSYRQGNGIGLAIVRD 424
+ L T D + + +E+ G ++ K + G GL VR+
Sbjct: 275 LPQGGKILLKGTKDNGTVTLEVENTGSLALKNTK--------------ESTGTGLQNVRE 320

Query: 425 -LVDSYNGRISVSHSETLGG 443
L Y + SE G
Sbjct: 321 RLQMLYGTEAQIKLSEKQGK 340


ORFs having significant similarity with Known Virulence factors
LocusTag       Hits         Score  E-value  Comments
Sputcn32_3854  IGASERPTASE  33     0.003    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 33.1 bits (75), Expect = 0.003
Identities = 24/148 (16%), Positives = 48/148 (32%), Gaps = 25/148 (16%)

Query: 345 RERDNRAPTYQNTKAELKERRS--ANAEQMPATKGRSDPVHTQRERSDAQAKQMQANKDI 402
E+D T QN + + + + AN + + S+ TQ + A + K
Sbjct: 1054 NEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAK 1113

Query: 403 QQKQYQTREPQRQENSRAITPRVESPRQEIKRSEPQRMEQPRQSVPRQREEVKVRQSEPR 462
+ + P+ ++P+ E ++EP R P ++ EP+
Sbjct: 1114 VETEKTQEVPKVTSQ---VSPKQEQSETVQPQAEPARENDPTVNI-----------KEPQ 1159

Query: 463 QNQNMQATRAVDQNKGRSTQSQERRNRE 490
N A + Q + +
Sbjct: 1160 SQTNTTAD---------TEQPAKETSSN 1178



Score = 32.0 bits (72), Expect = 0.008
Identities = 25/160 (15%), Positives = 46/160 (28%), Gaps = 11/160 (6%)

Query: 337 KETHNYNSRERDNRAPTYQNTKAELKERRSANAEQMP--------ATKGRSDPVHTQRER 388
N R + AP A E AE + ++ RE
Sbjct: 1009 SVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREV 1068

Query: 389 SDAQAKQMQANK---DIQQKQYQTREPQRQENSRAITPRVESPRQEIKRSEPQRMEQPRQ 445
+ ++AN ++ Q +T+E Q E T E + + + Q
Sbjct: 1069 AKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQ 1128

Query: 446 SVPRQREEVKVRQSEPRQNQNMQATRAVDQNKGRSTQSQE 485
P+Q + V+ +N + +T +
Sbjct: 1129 VSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADT 1168



 
Contact Sachin Pundhir for Bugs/Comments.