PredictBias

Identification of genomic and pathogenicity islands in prokaryotic genomes
A) Input parameters
Genome                        2277.gbk
Threshold dinucleotide bias   2
Threshold codon bias          4
Threshold %GC bias            3
E-value (RPSBlast)            0.05
Genome (non-pathogenic)
 
C) Potential GIs and PAIs in NC_003047
S.No  Start     End       Bias  Virulence  Insertion elements  Prediction
1     SMc02583  SMc02596  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SMc02583  1       11       3.303704   hypothetical protein
SMc02584  1       10       3.066324   transcriptional regulator
SMc02585  1       9        3.064191   sensor histidine kinase transmembrane protein
SMc02586  1       10       3.305986   ATP-dependent helicase
SMc02587  3       11       2.533729   ABC transporter ATP-binding protein
SMc02588  4       12       2.344528   ABC transporter permease
SMc02589  2       15       1.589355   ABC transporter substrate-binding protein
SMc02590  2       18       1.106263   ***hypothetical protein
SMc02591  1       22       -0.120579  hypothetical protein
SMc02592  2       24       -1.552694  hypothetical protein
SMc02593  2       26       -2.181631  hypothetical protein
SMc02594  2       28       -2.797177  hypothetical protein
SMc02595  1       26       -2.837541  cystathionine gamma-synthase
SMc02596  2       25       -2.976186  transcriptional regulator
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SMc02584  HTHFIS  104    2e-28    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 104 bits (260), Expect = 2e-28
Identities = 33/132 (25%), Positives = 59/132 (44%), Gaps = 1/132 (0%)

Query: 24 SLLIVDDDTAFLRRLARAMEARGFAVEIAESVAEGIAKAKTRPPKHAVIDLRLSDGSGLD 83
++L+ DDD A L +A+ G+ V I + A V D+ + D + D
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 84 VIEAIRGRRDDTRMIVLTGYGNIATAVNAVKLGALDYLAKPADADDILAALIQRPGERVE 143
++ I+ R D ++V++ TA+ A + GA DYL KP D +++ +I R +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELI-GIIGRALAEPK 123

Query: 144 PPENPMSADRVR 155
+ + D
Sbjct: 124 RRPSKLEDDSQD 135



Score = 37.1 bits (86), Expect = 2e-05
Identities = 9/68 (13%), Positives = 23/68 (33%), Gaps = 1/68 (1%)

Query: 124 PADADDILAALIQRPGERVEPPE-NPMSADRVRWEHIQRVYEMCERNVSETARRLNMHRR 182
++ + G+ + P + + I N + A L ++R
Sbjct: 405 SQAVEENMRQYFASFGDALPPSGLYDRVLAEMEYPLILAALTATRGNQIKAADLLGLNRN 464

Query: 183 TLQRILAK 190
TL++ + +
Sbjct: 465 TLRKKIRE 472
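
Each alignment in this report is introduced by a line of the form `Score = 104 bits (260), Expect = 2e-28`. As a minimal sketch, such lines could be parsed and checked against the E-value (RPSBlast) setting of 0.05 listed under the input parameters roughly as follows. The function names and the regular expression are illustrative assumptions; they are not part of PredictBias or of BLAST itself.

```python
import re

# Matches lines such as "Score = 104 bits (260), Expect = 2e-28".
# The pattern is an assumption based on the lines shown in this report.
HSP_LINE = re.compile(
    r"Score\s*=\s*([\d.]+)\s*bits\s*\((\d+)\),\s*Expect\s*=\s*([\d.eE+-]+)"
)


def parse_hsp(line):
    """Return (bit_score, raw_score, e_value), or None if the line does not match."""
    match = HSP_LINE.search(line)
    if match is None:
        return None
    bits, raw, expect = match.groups()
    # Some BLAST versions abbreviate values such as 1e-100 to "e-100".
    if expect.lower().startswith("e"):
        expect = "1" + expect
    return float(bits), int(raw), float(expect)


def passes_cutoff(line, cutoff=0.05):
    """True if the line is a score line whose E-value is at or below the cutoff."""
    hsp = parse_hsp(line)
    return hsp is not None and hsp[2] <= cutoff


if __name__ == "__main__":
    for text in (
        "Score = 104 bits (260), Expect = 2e-28",
        "Score = 37.1 bits (86), Expect = 2e-05",
    ):
        print(parse_hsp(text), passes_cutoff(text))
```

The default cutoff of 0.05 simply mirrors the input parameter shown on this page; whether the server applies it per alignment or per hit is not stated here.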


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc02585  ANTHRAXTOXNA  40     1e-05    Anthrax toxin LF subunit signature.
		>ANTHRAXTOXNA#Anthrax toxin LF subunit signature.

Length = 800

Score = 40.1 bits (93), Expect = 1e-05
Identities = 31/173 (17%), Positives = 60/173 (34%), Gaps = 27/173 (15%)

Query: 236 ERELGDDPRFGEDVHLLRSQSERCRDILRRLTTLSSESEEHMRLLPLSSLIE----EVMA 291
E D + +++ E+ +D + L +E ++ L++ +V+
Sbjct: 36 EHYTESDIKRNHKTEKNKTEKEKFKDSINNLVKTEFTNETLDKIQQTQDLLKKIPKDVLE 95

Query: 292 PHREFGIEI--------ELKEQGDRASEPVGIRNAGILYGLGNLLENAVDYARK----KV 339
+ E G EI E KE D + E N G + + +K K+
Sbjct: 96 IYSELGGEIYFTDIDLVEHKELQDLSEEEKNSMN---SRGEKVPFASRFVFEKKRETPKL 152

Query: 340 TVTTEHTAERVRVTIE---DDGDGFSPDILARIGEPY-----VTRRQKDDSAG 384
+ + A + E + G G S DI+++ + + DDS
Sbjct: 153 IINIKDYAINSEQSKEVYYEIGKGISLDIISKDKSLDPEFLNLIKSLSDDSDS 205


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SMc02587  PF05272  28     0.037    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 28.1 bits (62), Expect = 0.037
Identities = 15/42 (35%), Positives = 19/42 (45%), Gaps = 8/42 (19%)

Query: 42 VLTIMGPSGSGKSTLLAFAGG---FLDPAFDVSGRILIDGKD 80
+ + G G GKSTL+ G F D FD+ GKD
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGLDFFSDTHFDIG-----TGKD 634


2     SMc02611  SMc02617  Y     N          N                   Genomic Island
LocusTag  DNBias  CDNBias  %GCBias    Product
SMc02611  3       9        -0.206158  oxidoreductase
SMc02612  2       11       0.105564   oxidoreductase
SMc02613  2       13       -0.388543  glutamine synthetase
SMc02614  5       17       -0.092640  hypothetical protein
SMc02615  3       15       -0.298264  signal peptide protein
SMc02616  4       18       -1.608113  permease transmembrane protein
SMc02617  2       13       -2.408953  hypothetical protein
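
In the two island summary rows above, the Prediction column tracks the Virulence flag: island 1 (Bias Y, Virulence Y) is labelled "Pathogenicity Island (biased-composition)", while island 2 (Bias Y, Virulence N) is labelled "Genomic Island". The Python sketch below encodes only that observed pattern. It is an assumption inferred from the rows of this listing, not the server's documented decision rule, and the class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IslandRow:
    """One summary row: S.No, Start, End, Bias, Virulence, Insertion elements."""
    number: int
    start: str
    end: str
    bias: bool
    virulence: bool
    insertion_elements: bool


def label(row: IslandRow) -> Optional[str]:
    """Return the label seen in this listing for the given flags.

    Assumption: rows without compositional bias do not appear in the listing,
    so no label is returned for them.
    """
    if not row.bias:
        return None
    if row.virulence:
        return "Pathogenicity Island (biased-composition)"
    return "Genomic Island"


if __name__ == "__main__":
    rows = [
        IslandRow(1, "SMc02583", "SMc02596", True, True, True),
        IslandRow(2, "SMc02611", "SMc02617", True, False, False),
    ]
    for row in rows:
        print(row.number, row.start, row.end, label(row))
```

The Insertion elements flag is carried along but not used, because in the rows above it does not change the label.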
3     SMc02859  SMc02892  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SMc02859  3       13       1.954477   hypothetical protein
SMc02860  1       15       2.254203   hypothetical protein
SMc02861  0       14       2.321531   phosphate transporter
SMc02862  2       13       1.915569   Pit accessory protein
SMc02863  1       12       2.312675   recombination protein F
SMc02864  1       13       1.703517   molybdopterin biosynthesis protein MoeB
SMc02865  1       14       1.157738   acetyltransferase
SMc02866  2       14       0.812890   transcriptional regulator
SMc02867  2       16       0.378614   multidrug-efflux system transmembrane protein
SMc02868  0       15       -0.353927  multidrug efflux system protein
SMc02869  0       13       -0.330107  ABC transporter ATP-binding protein
SMc02870  -1      11       0.176987   oxidoreductase
SMc02871  0       11       0.552186   ABC transporter permease
SMc02872  -1      11       1.500750   ABC transporter permease
SMc02873  -1      10       2.445097   periplasmic binding (signal peptide) ABC
SMc02874  2       11       3.782933   N-acetylmuramic acid 6-phosphate etherase
SMc02875  2       13       4.091275   hypothetical protein
SMc02876  2       9        4.035093   transcriptional regulator
SMc02877  1       9        3.955118   hypothetical protein
SMc02878  2       11       3.589608   N-acetylglucosamine-6-phosphate deacetylase
SMc02879  2       12       2.432510   hypothetical protein
SMc02880  -1      10       1.782941   hypothetical protein
SMc02881  -1      9        0.887134   hypothetical protein
SMc02882  0       11       -0.426706  CysZ-like protein
SMc02883  -1      12       -1.755928  hypothetical protein
SMc02884  -1      13       -1.959841  lipoprotein
SMc02885  2       19       -0.882884  methionine sulfoxide reductase A
SMc02886  0       18       -1.540058  signal peptide protein
SMc02887  0       15       -1.154310  *hypothetical protein
SMc05001  0       13       0.188256   hypothetical protein
SMc02888  0       14       0.774033   transcriptional regulator
SMc02889  0       13       1.025815   transporter
SMc02890  0       13       1.204005   outer membrane receptor protein
SMc02891  -1      12       2.635314   oxidoreductase transmembrane protein
SMc02892  0       13       3.070421   transporter
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
SMc02862  TYPE4SSCAGX  28     0.034    Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein signature.

Length = 522

Score = 27.8 bits (61), Expect = 0.034
Identities = 16/67 (23%), Positives = 33/67 (49%), Gaps = 7/67 (10%)

Query: 133 PLLSRIGANAHRLSAIAEEVTHVEDRSDQLHEQGLKDLFQRHGASNPMAYIIGSEIYGEL 192
P ++ G +R++ IAE+ ++D++ L + + NP+ + YGEL
Sbjct: 457 PNMTNSGLRWYRVNEIAEKFKLIKDKA-------LVTVINKGYGKNPLTKNYNIKNYGEL 509

Query: 193 EKVVDRF 199
E+V+ +
Sbjct: 510 ERVIKKL 516


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc02865  SACTRNSFRASE  42     2e-07    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 41.9 bits (98), Expect = 2e-07
Identities = 19/67 (28%), Positives = 27/67 (40%), Gaps = 1/67 (1%)

Query: 51 DEIAFAAVAGRELLGCI-FCKPEADCLYIGKLAVAPGRQGKGVGRMLIAAAEETARDLGL 109
+ AF +G I I +AVA + KGVG L+ A E A++
Sbjct: 64 GKAAFLYYLENNCIGRIKIRSNWNGYALIEDIAVAKDYRKKGVGTALLHKAIEWAKENHF 123

Query: 110 PALRLQT 116
L L+T
Sbjct: 124 CGLMLET 130


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SMc02866  HTHTETR  106    1e-30    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 106 bits (265), Expect = 1e-30
Identities = 62/205 (30%), Positives = 101/205 (49%), Gaps = 1/205 (0%)

Query: 1 MRRTKADAEATRQKILCAAERMFYKKGVPNTTLEEVAKEAGVTRGAIYWHFANKTDLFLA 60
R+TK +A+ TRQ IL A R+F ++GV +T+L E+AK AGVTRGAIYWHF +K+DLF
Sbjct: 2 ARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSE 61

Query: 61 LYEAVPLPHEDMIAREIETEAFDTLAIVESATSDWLTTLAADEQRQRILAIMLR-CDYDN 119
++E ++ D L+++ L + +E+R+ ++ I+ C++
Sbjct: 62 IWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVG 121

Query: 120 DMSAVLVRQREIEERHDALLELAFARALERGQLQEHWAPPTAARALRWMMMGLCTEWLLF 179
+M+ V QR + +E +E L AA +R + GL WL
Sbjct: 122 EMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLFA 181

Query: 180 GRRFDLVAQGSEALQSLFAGFRRVP 204
+ FDL + + + L + P
Sbjct: 182 PQSFDLKKEARDYVAILLEMYLLCP 206


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc02867  ACRIFLAVINRP  1147   0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 1147 bits (2969), Expect = 0.0
Identities = 553/1033 (53%), Positives = 746/1033 (72%), Gaps = 5/1033 (0%)

Query: 1 MPSFFIDRPIFAWVVAIFIMIAGIIAIPLLPVSQYPDVAPPQISINTNYPGASSQDTYQS 60
M +FFI RPIFAWV+AI +M+AG +AI LPV+QYP +APP +S++ NYPGA +Q +
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VTRLIEDELNGVEGLLYFESSTSSSGSVSIDATFQPGTDPSQASVDIQNRVQRVEPRLPD 120
VT++IE +NG++ L+Y S++ S+GSV+I TFQ GTDP A V +QN++Q P LP
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 121 PVRQQGVQVDEAGAGFLLIISLTSTDGSMDAIGLGDYLSRNVLSEIQRVPGVGRAQLFAT 180
V+QQG+ V+++ + +L++ S + + DY++ NV + R+ GVG QLF
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 181 ERSMRVWLDPDKMLGLNLTAADVTAAIQAQNAQIASGSIGAQPNPITQQVTAPVVIKGQL 240
+ +MR+WLD D + LT DV ++ QN QIA+G +G P QQ+ A ++ + +
Sbjct: 181 QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRF 240

Query: 241 SSPEEFGAIVLRANADGSAVRLRDVARLEIGGESYTFSTRLNGSPSAAIAVQLSPSGNAM 300
+PEEFG + LR N+DGS VRL+DVAR+E+GGE+Y R+NG P+A + ++L+ NA+
Sbjct: 241 KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL 300

Query: 301 STSAAIKARMDELAEFFPEGLEYSIPYDTSPFVAVSIEKVLHTLVEAVGLVFLVMFLFLQ 360
T+ AIKA++ EL FFP+G++ PYDT+PFV +SI +V+ TL EA+ LVFLVM+LFLQ
Sbjct: 301 DTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQ 360

Query: 361 NVRYTIIPTIVVPVALLGTCAVMLAMGFSINVLTMFGMVLAIGILVDDAIIVVENVERIM 420
N+R T+IPTI VPV LLGT A++ A G+SIN LTMFGMVLAIG+LVDDAI+VVENVER+M
Sbjct: 361 NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420

Query: 421 SEEGLTPKDATRKAMKQITGAVIGITLVLASVFIPMAFFPGAVGVIYRQFSLTMVVSILF 480
E+ L PK+AT K+M QI GA++GI +VL++VFIPMAFF G+ G IYRQFS+T+V ++
Sbjct: 421 MEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL 480

Query: 481 SALLALSLTPALCASFLKQVPKGHHHAKRGFFGWFNRGFDRTSHGYTRAVGGIVRRTGRF 540
S L+AL LTPALCA+ LK V HH K GFFGWFN FD + + YT +VG I+ TGR+
Sbjct: 481 SVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRY 540

Query: 541 MVIYLALLAGLGWAFLQLPSSFLPDEDQGFVIVMMQLPSEATANRTTEVIEQTETIFG-- 598
++IY ++AG+ FL+LPSSFLP+EDQG + M+QLP+ AT RT +V++Q +
Sbjct: 541 LLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKN 600

Query: 599 QEKAVDTIVAINGFSFFGSGQNAGLAFVTLKDWSERDAD-NSAQSIAGRATMAMSQIKDA 657
++ V+++ +NGFSF G QNAG+AFV+LK W ER+ D NSA+++ RA M + +I+D
Sbjct: 601 EKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDG 660

Query: 658 ISFALSPPAIQGLGTTGGFSFRLQDRAGLGQAALAEARDQLLDLASQSKV-LTGVRFEGM 716
+ PAI LGT GF F L D+AGLG AL +AR+QLL +A+Q L VR G+
Sbjct: 661 FVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNGL 720

Query: 717 PDAAQVSVNIDREKANTFGVTFADINSTISTNLGSSYVNDFPNAGRMQRVTVQADETKRM 776
D AQ + +D+EKA GV+ +DIN TIST LG +YVNDF + GR++++ VQAD RM
Sbjct: 721 EDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFRM 780

Query: 777 QTADLLNLNVRNSNGGMVPLSAFADVEWVKAPTQTVGYNGYPAVRISGEAAPGYSSGDAI 836
D+ L VR++NG MVP SAF WV + YNG P++ I GEAAPG SSGDA+
Sbjct: 781 LPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDAM 840

Query: 837 AEMERLVAELPAGFGYEWTGQSLQEIQSGSQAPLLIALSCLLVFLCLAALYESWSIPVSV 896
A ME L ++LPAG GY+WTG S QE SG+QAP L+A+S ++VFLCLAALYESWSIPVSV
Sbjct: 841 ALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVSV 900

Query: 897 IMVVPLGVIGAVLAVTMRDMPNDVYFKVGLIAIIGLSAKNAILIIEFAKELRE-QGKSLI 955
++VVPLG++G +LA T+ + NDVYF VGL+ IGLSAKNAILI+EFAK+L E +GK ++
Sbjct: 901 MLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGVV 960

Query: 956 DSTLEAAHLRFRPILMTSLAFTLGVLPLAIATGASSGSQRAIGTGVMGGMISATVLAIFF 1015
++TL A +R RPILMTSLAF LGVLPLAI+ GA SG+Q A+G GVMGGM+SAT+LAIFF
Sbjct: 961 EATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIFF 1020

Query: 1016 VPVFFVFVMKIFE 1028
VPVFFV + + F+
Sbjct: 1021 VPVFFVVIRRCFK 1033



Score = 93.0 bits (231), Expect = 3e-21
Identities = 88/515 (17%), Positives = 188/515 (36%), Gaps = 49/515 (9%)

Query: 542 VIYLALLAGLGWAFLQLPSSFLPDEDQGFVIVMMQLP---SEATANRTTEVIEQTETIFG 598
V+ + L+ A LQLP + P V V P ++ + T+VIEQ
Sbjct: 14 VLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTVTQVIEQNMN--- 70

Query: 599 QEKAVDTIVAINGFSFFGSGQNAGLAFVTLKDWSERDADNSAQSIAGRATMAMSQIKDAI 658
+D ++ ++ S +AG +TL S D D + + + +A + +
Sbjct: 71 ---GIDNLMYMSSTSD-----SAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQEV 122

Query: 659 -------SFALSPPAIQGLGTTGGFSFRLQDRAGLGQAALAEARDQLLDLASQSKV-LTG 710
+ S + + D + + +D L L V L G
Sbjct: 123 QQQGISVEKSSSSYLMVAGFVSDNPGTTQDD---ISDYVASNVKDTLSRLNGVGDVQLFG 179

Query: 711 VRFEGMPDAAQVSVNIDREKANTFGVTFADINSTISTN---LGSSYVNDFPNAGRMQRVT 767
++ + + +D + N + +T D+ + + + + + P Q++
Sbjct: 180 AQY-------AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPG-QQLN 231

Query: 768 VQADETKRMQTA-DLLNLNVR-NSNGGMVPLSAFADVEW-VKAPTQTVGYNGYPAVRISG 824
R + + + +R NS+G +V L A VE + NG PA +
Sbjct: 232 ASIIAQTRFKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGI 291

Query: 825 EAAPGYSSGDAI----AEMERLVAELPAGFGYEWTGQSLQEIQSGSQA---PLLIALSCL 877
+ A G ++ D A++ L P G + + +Q L A+ +
Sbjct: 292 KLATGANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAI--M 349

Query: 878 LVFLCLAALYESWSIPVSVIMVVPLGVIGAVLAVTMRDMPNDVYFKVGLIAIIGLSAKNA 937
LVFL + ++ + + VP+ ++G + + G++ IGL +A
Sbjct: 350 LVFLVMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDA 409

Query: 938 ILIIE-FAKELREQGKSLIDSTLEAAHLRFRPILMTSLAFTLGVLPLAIATGASSGSQRA 996
I+++E + + E ++T ++ ++ ++ + +P+A G++ R
Sbjct: 410 IVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQ 469

Query: 997 IGTGVMGGMISATVLAIFFVPVFFVFVMKIFERGR 1031
++ M + ++A+ P ++K
Sbjct: 470 FSITIVSAMALSVLVALILTPALCATLLKPVSAEH 504


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SMc02868  RTXTOXIND  41     5e-06    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 41.4 bits (97), Expect = 5e-06
Identities = 21/101 (20%), Positives = 47/101 (46%), Gaps = 5/101 (4%)

Query: 105 QVKVDSAEATLKRAQAVVDQAKRTADRQSRLKEAQVTAVQQYD-DAIAALAQAEADVGIA 163
+ K A L+ ++ ++Q + ++ + VT Q + + + L Q ++G+
Sbjct: 258 ENKYVEAVNELRVYKSQLEQIESEI-LSAKEEYQLVT--QLFKNEILDKLRQTTDNIGLL 314

Query: 164 EAGLAEAKLNLQYTNVTAPISGRI-GRALITEGALVNTNDP 203
LA+ + Q + + AP+S ++ + TEG +V T +
Sbjct: 315 TLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAET 355



Score = 36.7 bits (85), Expect = 2e-04
Identities = 21/124 (16%), Positives = 51/124 (41%), Gaps = 8/124 (6%)

Query: 60 PGRIT-ATRIAEVRPRISGIVVERVFEQGTMVKEGDVLYRIDPAPFQVKVDSAEATLK-- 116
G++T + R E++P + IV E + ++G V++GDVL ++ + +++L
Sbjct: 87 NGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQA 146

Query: 117 -----RAQAVVDQAKRTADRQSRLKEAQVTAVQQYDDAIAALAQAEADVGIAEAGLAEAK 171
R Q + + + +L + ++ + + + + + +
Sbjct: 147 RLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKE 206

Query: 172 LNLQ 175
LNL
Sbjct: 207 LNLD 210


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SMc02869  PF05272  30     0.011    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.4 bits (68), Expect = 0.011
Identities = 18/74 (24%), Positives = 30/74 (40%), Gaps = 4/74 (5%)

Query: 5 ASRSSFNPRGRHVGSLQLKTIRKAFGSHEVLKGIDLDVKDGEFVIFVGPSGCGKSTLLRT 64
+ + PR L+ + K V + ++ K V+ G G GKSTL+ T
Sbjct: 560 KTPDDYKPRR----LRYLQLVGKYILMGHVARVMEPGCKFDYSVVLEGTGGIGKSTLINT 615

Query: 65 IAGLEDATSGSVQI 78
+ GL+ + I
Sbjct: 616 LVGLDFFSDTHFDI 629


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SMc02870  HTHFIS  32     0.004    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 32.1 bits (73), Expect = 0.004
Identities = 15/90 (16%), Positives = 32/90 (35%), Gaps = 7/90 (7%)

Query: 60 EALAELEPELCSINTYSDTHADYAVMAMEAGAHVFVEKP--LATTVADAERVVACARANG 117
+ + P+L + + A+ A E GA+ ++ KP L + R +A +
Sbjct: 67 PRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKRRP 126

Query: 118 RKLVV-----GYILRHHPSWMRLIAEARKL 142
KL ++ + + +L
Sbjct: 127 SKLEDDSQDGMPLVGRSAAMQEIYRVLARL 156


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc02880  ACETATEKNASE  33     0.001    Acetate kinase family signature.
		>ACETATEKNASE#Acetate kinase family signature.

Length = 400

Score = 32.9 bits (75), Expect = 0.001
Identities = 30/155 (19%), Positives = 54/155 (34%), Gaps = 20/155 (12%)

Query: 127 LGTGVGGGLVIDGRLINADGGFAGEWGHGPAVAAAAGHPPIAIPAFPC---GCGQSRCVD 183
LG G V +G+ I+ GF G A+ +G +I ++ V+
Sbjct: 208 LGNGSSIAAVKNGKSIDTSMGFTPL--EGLAMGTRSGSIDPSIISYLMEKENISAEEVVN 265

Query: 184 TVGGARGLERLHETVHGKALSSHDIIEG-WQNGNAEAARTIDVFVDLVSSPLALVINITG 242
+ G+ + G + D+ + ++NG+ A ++VF V + G
Sbjct: 266 ILNKKSGVY----GISGISSDFRDLEDAAFKNGDKRAQLALNVFAYRVKKTIGSYAAAMG 321

Query: 243 A--TIVPVGGGLSNAEALLAEIDRAVRARILRRFD 275
IV G + E +R IL +
Sbjct: 322 GVDVIVFTAG--------IGENGPEIREFILDGLE 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc02881  ACRIFLAVINRP  30     0.024    Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 30.2 bits (68), Expect = 0.024
Identities = 9/33 (27%), Positives = 16/33 (48%), Gaps = 1/33 (3%)

Query: 159 SAAAVAAGLVPLAIGTDTGGSVRIPAAMTGIVG 191
++ A G++PLAI G + G++G
Sbjct: 977 TSLAFILGVLPLAISNGAGSGAQNAVG-IGVMG 1008


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc02884  LIPPROTEIN48  76     2e-17    Mycoplasma P48 major surface lipoprotein signature.
		>LIPPROTEIN48#Mycoplasma P48 major surface lipoprotein signature.

Length = 428

Score = 75.8 bits (186), Expect = 2e-17
Identities = 83/331 (25%), Positives = 135/331 (40%), Gaps = 47/331 (14%)

Query: 22 DIKPAIIYDLGGKFDKSFNEAAFNGAEKFKTETGIEYREFEIANDAQREQALRRFASDGN 81
+KP +I D G DKSFN++AF + +TGIE E ++ E A S G+
Sbjct: 61 KLKPVLITDEGKIDDKSFNQSAFEALKAINKQTGIEINNVEPSS--NFESAYNSALSAGH 118

Query: 82 SPIVMAGFNWAASLEKLAGEYPD------TKFAIIDMVVEKPNVK---SIVFKEQEGSYL 132
V+ GF S+++ + + K ID +E K S+ F +E ++
Sbjct: 119 KIWVLNGFKHQQSIKQYIDAHREELERNQIKIIGIDFDIET-EYKWFYSLQFNIKESAFT 177

Query: 133 VG-----VLAGLASKTKTVSFVGGMDIPLIHKFACGYVGGAKSTGADVKVLEAY------ 181
G L+ + V+ GG P + F G+ G K + Y
Sbjct: 178 TGYAIASWLSEQDESKRVVASFGGGAFPGVTTFNEGFAKGILYYNQKHKSSKIYHTSPVK 237

Query: 182 --TGTTPDAWNDPVKGG--EIAKSQIDQGSDVVYHAAGGTGVGVLQAAADAGKLGIGVDS 237
+G T + V + + V+ AG ++ A+ G+ IGVDS
Sbjct: 238 LDSGFTAGEKMNTVINNVLSSTPADVKYNPHVILSVAGPATFETVR-LANKGQYVIGVDS 296

Query: 238 NQNMLQ-PGKVLTSMLKRVDVAVYDSFMA-------------AKDDKFEFGISNLGL-KE 282
+Q M+Q ++LTS+LK + AVY++ + KD K + S+ G KE
Sbjct: 297 DQGMIQDKDRILTSVLKHIKQAVYETLLDLILEKEEGYKPYVVKDKKADKKWSHFGTQKE 356

Query: 283 DGVGYA----LDEHNQVLITPEMKEAVEKVK 309
+G A + Q I ++KEA++ K
Sbjct: 357 KWIGVAENHFSNTEEQAKINNKIKEAIKMFK 387


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SMc02889  TCRTETB  30     0.017    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 30.2 bits (68), Expect = 0.017
Identities = 27/153 (17%), Positives = 47/153 (30%), Gaps = 7/153 (4%)

Query: 212 SNRPLVGGI--GFLILYQCGVRLGTSMLSPFMVDI--GFPVATIGWVRGAGGAVVGLIGA 267
N P + G+ G +I G G + P+M+ A IG V G + +I
Sbjct: 254 KNIPFMIGVLCGGIIF---GTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSVIIFG 310

Query: 268 LAGAALVQRFGSGRTLAAFAAVNLLAFAGLALFAFGIWQGDIAAIVLLLVQAAAVAMSFV 327
G LV R G L ++F + IV +L +
Sbjct: 311 YIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLGGLSFTKTVIS 370

Query: 328 ALYAMMMNWSDTEQAATDFAVLQSLDAALAVAM 360
+ + + + + L +A+
Sbjct: 371 TIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAI 403


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SMc02892  TCRTETB  140    8e-39    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 140 bits (355), Expect = 8e-39
Identities = 100/404 (24%), Positives = 179/404 (44%), Gaps = 15/404 (3%)

Query: 32 LSLSMLLSSLGTSIANVGLPSLATAFSASFQQVQWVVLAYLLAITTLIVGVGRLGDMVGR 91
L + S L + NV LP +A F+ WV A++L + G+L D +G
Sbjct: 19 LCILSFFSVLNEMVLNVSLPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGI 78

Query: 92 RRLLLAGILLFTVASGLSGAAPS-LPLLIAARAVQGLGAAVMMALAMALVGETVAKEKTG 150
+RLLL GI++ S + S LLI AR +QG GAA AL M +V + KE G
Sbjct: 79 KRLLLFGIIINCFGSVIGFVGHSFFSLLIMARFIQGAGAAAFPALVMVVVARYIPKENRG 138

Query: 151 GAMGLLGTMSAIGTALGPSLGGVLIAGLGWQAIFVVNVPLGALAFILARRSLPADRQGSA 210
A GL+G++ A+G +GP++GG++ + W +++ +P+ + + L
Sbjct: 139 KAFGLIGSIVAMGEGVGPAIGGMIAHYIHWS--YLLLIPMITIITVPFLMKLLKK---EV 193

Query: 211 AERTGFDVPGTLLLGLTLAAYALAMTVAHDSFGPLNMALLAAAVVGAGLFVLAERRATSP 270
+ FD+ G +L+ + + + L T SF L +V+ +FV R+ T P
Sbjct: 194 RIKGHFDIKGIILMSVGIVFFMLFTTSYSISF-------LIVSVLSFLIFVKHIRKVTDP 246

Query: 271 LMRPEALRDPVLGAGLAMSALVSTVMMATLVVGPFYLSRALGLGEALVG-IVMSAGPVVS 329
+ P ++ G+ ++ + + + P+ + L A +G +++ G +
Sbjct: 247 FVDPGLGKNIPFMIGVLCGGIIFGTVAGFVSMVPYMMKDVHQLSTAEIGSVIIFPGTMSV 306

Query: 330 ALSGLPAGRAVDRLGAPFVIVTGLLAMAAGSFALSVLPEMFGVAGYLAAMLVLTPGYQLF 389
+ G G VDR G +V+ G+ ++ S L E + + VL G
Sbjct: 307 IIFGYIGGILVDRRGPLYVLNIGVTFLSVSFLTASFLLETTSWFMTIIIVFVLG-GLSFT 365

Query: 390 QAANNTAVMVDVRPDRRGVVSGMLNLSRNLGLITGASVMGAVFA 433
+ +T V ++ G +LN + L TG +++G + +
Sbjct: 366 KTVISTIVSSSLKQQEAGAGMSLLNFTSFLSEGTGIAIVGGLLS 409


4     SMc01716  SMc01701  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SMc01716  2       11       -0.567025  sensor histidine kinase transmembrane protein
SMc01715  2       10       0.502649   *hypothetical protein
SMc01714  2       10       0.414599   hypothetical protein
SMc01713  2       9        0.629957   hypothetical protein
SMc01712  3       9        0.803265   L-lactate dehydrogenase (cytochrome) protein
SMc01711  3       10       1.138715   hypothetical protein
SMc01710  1       12       1.415912   hypothetical protein
SMc01709  2       14       0.572790   signal peptide protein
SMc01708  2       14       0.589800   hypothetical protein
SMc01707  -1      13       0.567792   signal peptide protein
SMc01706  -2      13       0.926917   signal peptide protein
SMc01705  -2      12       2.362610   hypothetical protein
SMc01704  -1      12       2.582428   hypothetical protein
SMc01703  -1      13       2.666936   hypothetical protein
SMc01702  -2      12       2.742086   oxidoreductase
SMc01701  -1      11       3.071607   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SMc01716  PF06580  36     4e-04    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 35.6 bits (82), Expect = 4e-04
Identities = 42/250 (16%), Positives = 89/250 (35%), Gaps = 48/250 (19%)

Query: 252 GQHLASMQRELQRTLKQQKSL---AELGLAVSKIN-HDMRNILSSAQLISDRLADVDDPV 307
G H ++ + + S+ A+L ++IN H M N L++ + + +
Sbjct: 137 GWHFFKNYKQAEIDQWKMASMAQEAQLMALKAQINPHFMFNALNNIRALILEDPTKAREM 196

Query: 308 VKRFAPTLLR-TIDRAVGYT---REVLSYGRTTEAEPHRRFLALRPLVEDVAELLAVDRQ 363
+ + L+R ++ + + L+ V+ +L ++ +
Sbjct: 197 LTSLS-ELMRYSLRYSNARQVSLADELTV------------------VDSYLQLASIQFE 237

Query: 364 DGIDFEIQVRDDIEVDADSEQLFRVVHNICRNAVEALANYKPEDGSERRISVSAVRTGSV 423
D + FE Q+ I D + +V + N ++ P+ G +I + +
Sbjct: 238 DRLQFENQINPAIM---DVQVPPMLVQTLVENGIKHGIAQLPQGG---KILLKGTKDNGT 291

Query: 424 VTISIDDTGPGMPAKARENLFAAFRGSARSGGTGLGLAIARE-LVLAHGGTIALVEKPTP 482
VT+ +++TG +E TG GL RE L + +G +
Sbjct: 292 VTLEVENTGSLALKNTKE-------------STGTGLQNVRERLQMLYGTEAQIKLSEKQ 338

Query: 483 G-TLFRIELP 491
G + +P
Sbjct: 339 GKVNAMVLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc01715  TETREPRESSOR  92     4e-25    Tetracycline repressor protein signature.
		>TETREPRESSOR#Tetracycline repressor protein signature.

Length = 218

Score = 92.3 bits (229), Expect = 4e-25
Identities = 55/200 (27%), Positives = 95/200 (47%), Gaps = 9/200 (4%)

Query: 31 LSRERIVATAVELLDAQGIDGLTMRRLADRLGSGVMSLYWHVDNKEDVFD-LALDSVLEY 89
L+RE ++ A+ELL+ GIDGLT R+LA +LG +LYWHV NK + D LA++ + +
Sbjct: 4 LNRESVIDAALELLNETGIDGLTTRKLAQKLGIEQPTLYWHVKNKRALLDALAVEILARH 63

Query: 90 RGPPQLVESRDWRGKVVHMLEDWRAGMLRHPWSASLLPRRALGPNILSRLELLSRTLSGA 149
W+ + + +R +LR+ A + +E R ++
Sbjct: 64 HDYSLPAAGESWQSFLRNNAMSFRRALLRYRDGAKVHLGTRPDEKQYDTVETQLRFMTEN 123

Query: 150 GVADADLNAAIWSLWNYVMGATITRASFGLSDEDRAAAQQRLTRLSEHYPTIERS--RLL 207
G + D AI ++ ++ +GA + + E AA R E+ P + R +++
Sbjct: 124 GFSLRDGLYAISAVSHFTLGAVLEQ------QEHTAALTDRPAAPDENLPPLLREALQIM 177

Query: 208 LDNDWDGAFRKGLDFLLDGL 227
+D + AF GL+ L+ G
Sbjct: 178 DSDDGEQAFLHGLESLIRGF 197


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SMc01710  cloacin  40     1e-04    Cloacin signature.
		>cloacin#Cloacin signature.

Length = 551

Score = 39.7 bits (92), Expect = 1e-04
Identities = 29/104 (27%), Positives = 39/104 (37%), Gaps = 7/104 (6%)

Query: 1253 ITGGSPGGGGGGGTSLALGGTTVLKAGGGGGGGGAAGASSGSTAPDPKDGGNARTPTANA 1312
++GG G G S G G G GGGA+ S S+ +P GG+
Sbjct: 1 MSGGDGRGHNTGAHST--SGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGG 58

Query: 1313 NCNSSGNGASAADISGSSDGGSGGGGGGGYTGGKAGSGYPYNAT 1356
G +G+S GGSG GG G+P +T
Sbjct: 59 GSGHGNGGG-----NGNSGGGSGTGGNLSAVAAPVAFGFPALST 97



Score = 39.7 bits (92), Expect = 1e-04
Identities = 34/115 (29%), Positives = 36/115 (31%), Gaps = 13/115 (11%)

Query: 1214 GDTITGTVGSGGGVNGFSDNGGAGGSGTGRGGDGGDTTSITGGSPGGGGGGGTSLALGGT 1273
I G G G SD G GG G GGS G GGG
Sbjct: 17 SGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGSGHGNGGGN-------- 68

Query: 1274 TVLKAGGGGGGGGAAGASSGSTAPDPKDGGNARTPTANANCNSSGNGASAADISG 1328
G GGG G G S AP TP A S GA +A I+
Sbjct: 69 -----GNSGGGSGTGGNLSAVAAPVAFGFPALSTPGAGGLAVSISAGALSAAIAD 118



Score = 38.2 bits (88), Expect = 3e-04
Identities = 24/77 (31%), Positives = 30/77 (38%), Gaps = 10/77 (12%)

Query: 1218 TGTVGSGGGVNGFSDNGGAGGSGTGRGGDGGDTTSITGGSPGGGGGGGTSLALGGTTVLK 1277
TG + G +NG G GG + G + GGS G GG S
Sbjct: 11 TGAHSTSGNINGGPTGLGVGGGASDGSGWSSENNPWGGGSGSGIHWGGGS---------- 60

Query: 1278 AGGGGGGGGAAGASSGS 1294
G GGG G +G SG+
Sbjct: 61 GHGNGGGNGNSGGGSGT 77


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SMc01708  OMADHESIN  57     3e-10    Yersinia outer membrane adhesin signature.
		>OMADHESIN#Yersinia outer membrane adhesin signature.

Length = 455

Score = 57.2 bits (137), Expect = 3e-10
Identities = 51/157 (32%), Positives = 79/157 (50%), Gaps = 30/157 (19%)

Query: 102 GTDAQANGDRSLAIGRQNQAGNEQSIGIGAGNTATGKLSIGIGSSNVASGEQSLSLGAGN 161
G +A A G S+AIG +A ++ +GAG+ ATG S+ IG + A G+ +++ GA +
Sbjct: 62 GLNASAKGIHSIAIGATAEAAKGAAVAVGAGSIATGVNSVAIGPLSKALGDSAVTYGAAS 121

Query: 162 NALGQGSISIGTETTAGGLRSIAFGVRASTKEANLDIPDDVAAIDAIAIGTNTKANGDRS 221
A G +A G RAST + +A+G N+KA+ S
Sbjct: 122 TAQKDG---------------VAIGARASTSDT------------GVAVGFNSKADAKNS 154

Query: 222 VSIGTGSQASSG---AVSIGDAAKAVGDKSVSIGTES 255
V+IG S ++ +++IGD +K + SVSIG ES
Sbjct: 155 VAIGHSSHVAANHGYSIAIGDRSKTDRENSVSIGHES 191



Score = 44.9 bits (105), Expect = 2e-06
Identities = 43/118 (36%), Positives = 65/118 (55%), Gaps = 6/118 (5%)

Query: 211 GTNTKANGDRSVSIGTGSQASSGA-VSIGDAAKAVGDKSVSIGTESWADGDESVSIGLVN 269
G N A G S++IG ++A+ GA V++G + A G SV+IG S A GD +V+ G +
Sbjct: 62 GLNASAKGIHSIAIGATAEAAKGAAVAVGAGSIATGVNSVAIGPLSKALGDSAVTYGAAS 121

Query: 270 NAGFEG---NDRIKGGQTSVSLGAFNQSPGIEAIAIGARNE--ANADRSIAIGSRAKT 322
A +G R T V++G +++ ++AIG + AN SIAIG R+KT
Sbjct: 122 TAQKDGVAIGARASTSDTGVAVGFNSKADAKNSVAIGHSSHVAANHGYSIAIGDRSKT 179



Score = 44.1 bits (103), Expect = 3e-06
Identities = 48/186 (25%), Positives = 79/186 (42%), Gaps = 32/186 (17%)

Query: 461 GNEALASADEAIAIGTGAVASGLKSISIGVGNTVSGASSGAIGDPTDITGTGSYSLGNDN 520
G A A +IAIG A A+ ++++G G+ +G +S AIG + G + + G +
Sbjct: 62 GLNASAKGIHSIAIGATAEAAKGAAVAVGAGSIATGVNSVAIGPLSKALGDSAVTYGAAS 121

Query: 521 TIAADNAGTFGNDNTLADAADGSRVIGNGNNIDVSDAFVLGNGADVTEVGGVALGSGSVS 580
T D A +D +G + D ++ +G+ + V G ++ G S
Sbjct: 122 TAQKDGVAI----GARASTSDTGVAVGFNSKADAKNSVAIGHSSHVAANHGYSIAIGDRS 177

Query: 581 DTGADVAGYVPGGASTADQNAIEATQSTRGAVAVGNPDAETGVYRQITGVAAGTADSDAA 640
T +V++G+ + RQ+T +AAGT D+DA
Sbjct: 178 KT------------------------DRENSVSIGH----ESLNRQLTHLAAGTKDTDAV 209

Query: 641 NVAQLK 646
NVAQLK
Sbjct: 210 NVAQLK 215



Score = 37.6 bits (86), Expect = 4e-04
Identities = 30/105 (28%), Positives = 47/105 (44%), Gaps = 4/105 (3%)

Query: 1160 ESVAESKSYTDEKTEWAIDQAAIYTDQVIETK----VSAVNNYAQQRFAQLSGEIGQVRS 1215
E++A + Y D K+ + A YTD + + N Y +F QL + ++ +
Sbjct: 321 EALASANVYADSKSSHTLKTANSYTDVTVSNSTKKAIRESNQYTDHKFRQLDNRLDKLDT 380

Query: 1216 EARQAAAIGLAAASLRFDNEPGKLSVALGGGFWRSEGALAFGAGY 1260
+ A A SL GK++ G G +RS ALA G+GY
Sbjct: 381 RVDKGLASSAALNSLFQPYGVGKVNFTAGVGGYRSSQALAIGSGY 425



Score = 32.9 bits (74), Expect = 0.009
Identities = 39/135 (28%), Positives = 61/135 (45%), Gaps = 16/135 (11%)

Query: 4 GRQSVSAGSGSLAFGNGSYANSNGSVAIGQSAYAANVRAIAIG------GDDAFAWREAE 57
G + + G S+A G + A +VA+G + A V ++AIG GD A + A
Sbjct: 62 GLNASAKGIHSIAIGATAEAAKGAAVAVGAGSIATGVNSVAIGPLSKALGDSAVTYGAAS 121

Query: 58 QTKAGGSQSIAMGVRARTKSLVVDDPDTVANEADPGGASDAIAIGTDAQ--ANGDRSLAI 115
+ G +A+G RA T D A +++AIG + AN S+AI
Sbjct: 122 TAQKDG---VAIGARASTS-----DTGVAVGFNSKADAKNSVAIGHSSHVAANHGYSIAI 173

Query: 116 GRQNQAGNEQSIGIG 130
G +++ E S+ IG
Sbjct: 174 GDRSKTDRENSVSIG 188



Score = 31.8 bits (71), Expect = 0.019
Identities = 19/66 (28%), Positives = 36/66 (54%)

Query: 1140 IANVAKGVKATDAVNVGQLDESVAESKSYTDEKTEWAIDQAAIYTDQVIETKVSAVNNYA 1199
+ ++A G K TDAVNV QL + + +++ T++++ + A Y D + + NNY
Sbjct: 196 LTHLAAGTKDTDAVNVAQLKKEIEKTQENTNKRSAELLANANAYADNKSSSVLGIANNYT 255

Query: 1200 QQRFAQ 1205
+ A+
Sbjct: 256 DSKSAE 261



Score = 30.6 bits (68), Expect = 0.042
Identities = 31/100 (31%), Positives = 53/100 (53%), Gaps = 4/100 (4%)

Query: 98 AIAIGTDAQANGDRSLAIGRQNQAGNEQSIGIGAGNTATGKLSIGIGSSNVASGEQSLSL 157
A+A+G + A G S+AIG ++A + ++ GA +TA K + IG+ S + +++
Sbjct: 86 AVAVGAGSIATGVNSVAIGPLSKALGDSAVTYGAASTAQ-KDGVAIGARASTS-DTGVAV 143

Query: 158 GAGNNALGQGSISIG--TETTAGGLRSIAFGVRASTKEAN 195
G + A + S++IG + A SIA G R+ T N
Sbjct: 144 GFNSKADAKNSVAIGHSSHVAANHGYSIAIGDRSKTDREN 183


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SMc01707  PF06776  93     7e-26    Invasion associated locus B
		>PF06776#Invasion associated locus B

Length = 214

Score = 93.1 bits (231), Expect = 7e-26
Identities = 25/135 (18%), Positives = 44/135 (32%), Gaps = 4/135 (2%)

Query: 57 FGNWTLICDENFKTKQKICNITQNVAD-ESGKSIFSWSLAATA-EGRPMMILRVPPAAGK 114
G+W + CD K + C + Q+V + + + + TA + +M + P
Sbjct: 80 HGDWQIRCDTPPGAKAEQCALIQSVVAEDRSNAGLTVIILKTADQKSKLMRVVAPLGVLL 139

Query: 115 GSRVSLSFPDRKKPVEVEVESCDRGVCLATVPVGPILRDNIAREATVGVSYGTGDG-TIS 173
S + L D C C+A V + L + T I
Sbjct: 140 PSGLGLKL-DNVDVGRAGFVRCLPNGCVAEVVMDDKLLGQLRTAKTATFIIFETPEEGIG 198

Query: 174 INLPLKGLNPALEAI 188
L L G+ + +
Sbjct: 199 FPLSLNGIGEGYDKL 213


5     SMc02159  SMc02154  Y     Y          Y                   Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SMc02159  2       24       -2.205084  hypothetical protein
SMc02158  2       27       -3.426292  acyltransferase
SMc02157  3       30       -4.476534  hypothetical protein
SMc02156  4       41       -6.486090  hypothetical protein
SMc02187  6       46       -7.335620  integrase DNA protein
SMc02202  5       49       -9.310801  hypothetical protein
SMc02201  6       52       -8.995286  hypothetical protein
SMc02189  3       28       -4.455579  hypothetical protein
SMc02200  4       30       -4.663606  hypothetical protein
SMc02184  4       28       -4.233241  transcriptional regulator
SMc02199  3       25       -3.808162  hypothetical protein
SMc02203  2       21       -3.137705  hypothetical protein
SMc02155  2       19       -2.869274  hypothetical protein
SMc02154  2       20       -3.259162  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SMc02200  TCRTETA  28     0.018    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 28.2 bits (63), Expect = 0.018
Identities = 15/92 (16%), Positives = 37/92 (40%), Gaps = 7/92 (7%)

Query: 16 NLSRHWLVFASAFFFGLVVCAGIQQSWSEGLG------AILIPTAIGYFVVKGGRNVSGL 69
+++ H+ + + + CA + + S+ G L A+ Y ++ + L
Sbjct: 40 DVTAHYGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVL 99

Query: 70 -LVRYSIFMVGFMFAISALVISEITEAHNSTQ 100
+ R + G A++ I++IT+ +
Sbjct: 100 YIGRIVAGITGATGAVAGAYIADITDGDERAR 131


6  SMc02282  SMc02311  Y  N  Y  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias    Product
SMc02282  4       15       -2.398981  copper-containing oxidoreductase
SMc02283  5       20       -3.667907  copper-containing oxidoreductase signal peptide
SMc02284  5       22       -4.159650  signal peptide protein
SMc02285  6       23       -3.893639  adenylate cyclase transmembrane protein
SMc02326  5       28       -5.055376  hypothetical protein
SMc02327  4       26       -4.031219  ribonuclease HI protein
SMc02328  3       26       -2.822503  hypothetical protein
SMc02286  3       18       -1.237044  hypothetical protein
SMc02287  2       21       -2.090076  DNA-invertase
SMc02289  2       18       -1.970058  transposase ISRM11/ISRM2011-2
SMc05002  3       19       -2.862871  hypothetical protein
SMc02291  3       19       -2.905975  transposase ISRM11/ISRM2011-2
SMc02292  3       20       -3.109342  type I restriction enzyme R protein
SMc02295  4       23       -4.364398  specificity protein S
SMc02296  4       20       -3.850662  modification enzyme transmembrane protein
SMc02297  5       26       -4.113266  integrase/resolvase recombinase
SMc02298  5       24       -3.077989  transposase ISRM1
SMc02329  6       25       -3.335044  transposase ISRM1
SMc02300  6       26       -3.193017  transposase number 4 for insertion sequence
SMc02301  1       15       0.152626   transposase number 3 for insertion sequence
SMc02302  0       13       0.928261   transposase number 2 for insertion sequence
SMc04430  1       10       2.098900   transposase number 1 for insertion sequence
SMc02303  1       11       2.157820   partial transposase for insertion sequence
SMc02304  2       12       2.562248   *hypothetical protein
SMc02305  2       12       2.159366   UDP-N-acetylglucosamine
SMc02306  0       12       1.852972   hypothetical protein
SMc02307  3       12       1.641070   histidinol dehydrogenase
SMc02308  0       16       -0.998129  hypothetical protein
SMc02309  1       22       -2.446946  arsenate reductase
SMc02310  3       27       -3.647778  translation initiation factor IF-1
SMc02311  2       21       -1.448165  Maf-like protein
7  SMc03026  SMc03054  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SMc03026  2       13       -0.572562  hypothetical protein
SMc03027  1       15       0.305483   flagellar basal body rod protein FlgB
SMc03028  0       15       0.725378   flagellar basal body rod protein FlgC
SMc03029  0       16       1.229044   flagellar hook-basal body protein FliE
SMc03030  0       15       0.820928   flagellar basal body rod protein FlgG
SMc03031  1       15       0.767251   flagellar basal body P-ring biosynthesis protein
SMc03032  2       14       -0.103613  flagellar basal body P-ring protein
SMc03033  3       16       -1.231930  hypothetical protein
SMc03034  4       15       -1.895690  flagellar basal body L-ring protein
SMc03035  4       14       -2.173899  flagellar transmembrane protein
SMc03036  4       15       -1.813721  flagellar biosynthesis protein FliP
SMc03037  2       14       -1.228311  flagellin A
SMc03038  -1      8        -0.515618  flagellin B protein
SMc03039  -2      8        0.997875   flagellin D protein
SMc03040  -2      9        1.638295   flagellin protein
SMc03041  0       10       1.611918   hypothetical protein
SMc03042  -2      8        0.968807   flagellar motor protein MotB
SMc03043  -1      8        0.926053   chemotaxis protein
SMc03044  1       7        0.169287   chemotaxis protein (motility protein D)
SMc03045  2       7        -1.777733  hypothetical protein
SMc03046  2       9        -1.968098  transcriptional regulator
SMc03047  2       10       -1.725435  flagellar hook protein FlgE
SMc03048  2       11       -1.847125  flagellar hook-associated protein FlgK
SMc03049  2       15       -2.267517  flagellar hook-associated protein FlgL
SMc03050  3       16       -1.724470  flagellar biosynthesis regulatory protein FlaF
SMc03051  3       15       -1.012767  flagellar biosynthesis repressor FlbT
SMc03052  4       15       -0.692297  flagellar basal body rod modification protein
SMc03053  4       14       -0.779083  flagellar biosynthesis protein FliQ
SMc03054  3       12       -0.823840  flagellar biosynthesis protein FlhA
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
SMc03028  FLGHOOKAP1  30     0.002    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 30.3 bits (68), Expect = 0.002
Identities = 8/40 (20%), Positives = 20/40 (50%)

Query: 95 LPNVNILIEMADMREANRAYEANLQTIKQSRDLISQTIDL 134
+ VN+ E +++ + Y AN Q ++ + + I++
Sbjct: 506 ISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545



Score = 27.2 bits (60), Expect = 0.028
Identities = 14/60 (23%), Positives = 21/60 (35%), Gaps = 7/60 (11%)

Query: 5 TSALKVSASGLQAESTRLRIVSENIANARSTGDAPGADPFRRKTISFAAEVDRASGASLV 64
+S + + SGL A L S NI++ G + R+T A V
Sbjct: 1 SSLINNAMSGLNAAQAALNTASNNISSYNVAG-------YTRQTTIMAQANSTLGAGGWV 53


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
SMc03029  FLGHOOKFLIE  34     2e-05    Flagellar hook-basal body complex protein FliE signa...
		>FLGHOOKFLIE#Flagellar hook-basal body complex protein FliE signature.

Length = 103

Score = 34.3 bits (78), Expect = 2e-05
Identities = 20/89 (22%), Positives = 37/89 (41%), Gaps = 2/89 (2%)

Query: 25 SASLVMPGAGTAAPQAGSFAEVLGNMTTDAIRSMKSAEGTSLQAIRGEA--NTREVVDAV 82
+ ++ + SFA L + +A + + GE +V+ +
Sbjct: 15 ATAMSARAQESLPQPTISFAGQLHAALDRISDTQTAARTQAEKFTLGEPGVALNDVMTDM 74

Query: 83 MSAEQSLQTAIAIRDKVVTAYLEIARMQI 111
A S+Q I +R+K+V AY E+ MQ+
Sbjct: 75 QKASVSMQMGIQVRNKLVAAYQEVMSMQV 103


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
SMc03030  FLGHOOKAP1  41     2e-06    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 41.5 bits (97), Expect = 2e-06
Identities = 10/45 (22%), Positives = 22/45 (48%)

Query: 213 QVKQNYLESSNVDPVKEITDLISAQRAYEMNSKIIQAADEMAATV 257
Q+ S V+ +E +L Q+ Y N++++Q A+ + +
Sbjct: 498 QLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDAL 542



Score = 38.8 bits (90), Expect = 1e-05
Identities = 13/34 (38%), Positives = 20/34 (58%)

Query: 4 LSIAATGMNAQQLNLEVIANNIANINTTGYKRAR 37
++ A +G+NA Q L +NNI++ N GY R
Sbjct: 4 INNAMSGLNAAQAALNTASNNISSYNVAGYTRQT 37


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc03031  PYOCINKILLER  27     0.029    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 27.5 bits (60), Expect = 0.029
Identities = 16/58 (27%), Positives = 24/58 (41%)

Query: 104 REQYAVERGSTVRLVFNNGGLTITAAGSPLQDAAVGDLIRVRNVDTGVIVSGTVMADS 161
Q A R + + NG + TAAG L A G + + + V G V+A +
Sbjct: 238 ARQQAAIRAANTYAMPANGSVVATAAGRGLIQVAQGAASLAQAISDAIAVLGRVLASA 295


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc03032  FLGPRINGFLGI  482    e-173    Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 482 bits (1242), Expect = e-173
Identities = 266/363 (73%), Positives = 312/363 (85%)

Query: 9 LLTLAVVFAATLTSAYAASRIKDVASLQSGRDNQLIGYGLVVGLQGTGDSLRSSPFTDQS 68
L+ A+ F +T + SRIKD+ASLQ+GRDNQLIGYGLVVGLQGTGDSLRSSPFT+QS
Sbjct: 11 LVFSALPFLSTPPAQADTSRIKDIASLQAGRDNQLIGYGLVVGLQGTGDSLRSSPFTEQS 70

Query: 69 IRAMLQNLGISTQGGDSRTRNVAAVLVTATLPPFASPGSRLDVTVGSLGDATSLRGGTLV 128
+RAMLQNLGI+TQGG S +N+AAV+VTA LPPFASPGSR+DVTV SLGDATSLRGG L+
Sbjct: 71 MRAMLQNLGITTQGGQSNAKNIAAVMVTANLPPFASPGSRVDVTVSSLGDATSLRGGNLI 130

Query: 129 MTSLSGADGQIYAVAQGSVVVSGFNAQGEAAQLSQGVTTAGRVPNGAIIERELPSKFKDG 188
MTSLSGADGQIYAVAQG+++V+GF+AQG+AA L+QGVTT+ RVPNGAIIERELPSKFKD
Sbjct: 131 MTSLSGADGQIYAVAQGALIVNGFSAQGDAATLTQGVTTSARVPNGAIIERELPSKFKDS 190

Query: 189 FNLVLQLRNPDFSTAVGMAAAINRYAAAQFGGRIAEALDSQSVLVQKPKMADLARLMADV 248
NLVLQLRNPDFSTAV +A +N +A A++G IAE DSQ + VQKP++ADL RLMA++
Sbjct: 191 VNLVLQLRNPDFSTAVRVADVVNAFARARYGDPIAEPRDSQEIAVQKPRVADLTRLMAEI 250

Query: 249 ENLVIETDAPARVVINERTGTIVIGQDVRVAEVAVSYGTLTVQVSETPTIVQPEPFSRGE 308
ENL +ETD PA+VVINERTGTIVIG DVR++ VAVSYGTLTVQV+E+P ++QP PFSRG+
Sbjct: 251 ENLTVETDTPAKVVINERTGTIVIGADVRISRVAVSYGTLTVQVTESPQVIQPAPFSRGQ 310

Query: 309 TAYEPNTTIEAQADGGTVAILNGSSLRSLVAGLNSIGVKPDGIIAILQSIKSAGALQAEL 368
TA +P T I A +G VAI+ G LR+LVAGLNSIG+K DGIIAILQ IKSAGALQAEL
Sbjct: 311 TAVQPQTDIMAMQEGSKVAIVEGPDLRTLVAGLNSIGLKADGIIAILQGIKSAGALQAEL 370

Query: 369 VLQ 371
VLQ
Sbjct: 371 VLQ 373


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc03034  FLGLRINGFLGH  265    2e-92    Flagellar L-ring protein signature.
		>FLGLRINGFLGH#Flagellar L-ring protein signature.

Length = 232

Score = 265 bits (678), Expect = 2e-92
Identities = 56/209 (26%), Positives = 90/209 (43%), Gaps = 18/209 (8%)

Query: 29 PAMSPIGSGLQYTQTPQLAMYPKQPRHVTNGYSLWNDQQAALFKDARAINIGDILTVDIR 88
P +P+ +G + Q+ Q Y QP LF+D R NIGD LT+ ++
Sbjct: 41 PGPTPVANGSIF-QSAQPINYGYQP----------------LFEDRRPRNIGDTLTIVLQ 83

Query: 89 IDDKASFENETDRSRKNSSGFNLGASGQSQTSDFAWS-GDLEYGSNTKTEGDGKTERSEK 147
+ AS + + SR + F + F + D+E G G S
Sbjct: 84 ENVSASKSSSANASRDGKTNFGFDTVPRYLQGLFGNARADVEASGGNTFNGKGGANASNT 143

Query: 148 LRLLVAAVVTGVLENGNLLISGSQEVRVNHELRILNVAGIVRPRDVDADNVISYDRIAEA 207
+ V VL NGNL + G +++ +N + +G+V PR + N + ++A+A
Sbjct: 144 FSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRFSGVVNPRTISGSNTVPSTQVADA 203

Query: 208 RISYGGRGRLTEVQQPPWGQQLVDLVSPL 236
RI Y G G + E Q W Q+ +SP+
Sbjct: 204 RIEYVGNGYINEAQNMGWLQRFFLNLSPM 232


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc03036  FLGBIOSNFLIP  281    2e-98    Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP signature.

Length = 245

Score = 281 bits (721), Expect = 2e-98
Identities = 98/246 (39%), Positives = 157/246 (63%), Gaps = 5/246 (2%)

Query: 1 MLRFATFIIAMMAMSGIAGAQSFPADILNTPVDGSVASWI--IRTFGLLTVLSVAPGILI 58
M R + ++ + P I + P+ G SW ++T +T L+ P IL+
Sbjct: 1 MRRLLSVAPVLLWLITPLAFAQLPG-ITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILL 59

Query: 59 MVTSFPRFVIAFAILRSGMGLATTPSNMIMVSLALFMTFYVMAPTFDRAWRDGIDPLLKN 118
M+TSF R +I F +LR+ +G + P N +++ LALF+TF++M+P D+ + D P +
Sbjct: 60 MMTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEE 119

Query: 119 EISETDAMQRMSEPFREFMVANTRDKDLQLFIDIAREKGQTVVVDEKVDLRAVVPAFMIS 178
+IS +A+++ ++P REFM+ TR+ DL LF +A + E V +R ++PA++ S
Sbjct: 120 KISMQEALEKGAQPLREFMLRQTREADLGLFARLANT--GPLQGPEAVPMRILLPAYVTS 177

Query: 179 EIRRGFEIGFLIMLPFLVIDLIVATITMAMGMMMLPPTAISLPFKILFFVLIDGWNLLVG 238
E++ F+IGF I +PFL+IDL++A++ MA+GMMM+PP I+LPFK++ FVL+DGW LLVG
Sbjct: 178 ELKTAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVG 237

Query: 239 SLVRSF 244
SL +SF
Sbjct: 238 SLAQSF 243


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SMc03037  FLAGELLIN  149    2e-42    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 149 bits (376), Expect = 2e-42
Identities = 58/379 (15%), Positives = 117/379 (30%), Gaps = 4/379 (1%)

Query: 4 ILTNNSAMAALSTLRSISSSMEDTQSRISSGLRVGSASDNAAYWSIATTMRSDNQALSAV 63
I TN+ ++ + L SS+ R+SSGLR+ SA D+AA +IA S+ + L+
Sbjct: 4 INTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGLTQA 63

Query: 64 QDALGLG---AAKVDTAYSGMESAIEVVKEIKAKLVAATEDGVDKAKIQEEITQLKDQLT 120
G A + A + + + ++ V+E+ + T D IQ+EI Q +++
Sbjct: 64 SRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEEID 123

Query: 121 SIAEAASFSGENWLQADLSGGPVTKSVVGGFVRDSSGAVSVKKVD-YSLNTDTVLFDTTG 179
++ F+G L D + G + + VK + N + T G
Sbjct: 124 RVSNQTQFNGVKVLSQDNQMKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPKEATVG 183

Query: 180 NTGILDKVYNVSQASVTLPVNVNGTTSEYTVGAYNVDDLIDASATFDGDYANVGAGALAG 239
+ K + V + + +
Sbjct: 184 DLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTTDDAEN 243

Query: 240 DYVKVQGSWVKAVDVAATGQEVVYDDGTTKWGVDTTVTGAPATNVAAPASIATIDITIAA 299
+ K+ A + + K G G T + ++
Sbjct: 244 NTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTIDTKTGNDGNGKVSTTI 303

Query: 300 QAGNLDALIAGVDEALTDMTSAAASLGSISSRIDLQSDFVNKLSDSIDSGVGRLVDADMN 359
+ +A + ++ +A + F +S ++A+
Sbjct: 304 NGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKTKNESAKLSDLEANNA 363

Query: 360 EESTRLKALQTQQQLAIQA 378
+ + + A A
Sbjct: 364 VKGESKITVNGAEYTANAA 382



Score = 94.7 bits (235), Expect = 2e-23
Identities = 57/363 (15%), Positives = 110/363 (30%), Gaps = 19/363 (5%)

Query: 32 SSGLRVGSASDNAAYWSIATTMRSDNQALSAVQDALGLGAAKVDTAYSGMESAIEVVKEI 91
L + + N + ++S + ++ SG +
Sbjct: 164 VKSLGLDGFNVNGPKEATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTV 223

Query: 92 KAKLVAATEDGVDKAKIQEEITQLKDQLTSIAEAASFSGENWLQADLSGGPVTKSVVGGF 151
K V+ A Q ++ + S +A G + G
Sbjct: 224 PDK------VYVNAANGQLTTDDAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDT 277

Query: 152 VRDSSGAVSVKKVDYSLNTDTVLFDTTGNTGILDKVYNVSQASVTLPVNVNGTTSEYTVG 211
++ + + + T + V +++ + + +S+
Sbjct: 278 FDYKGVTFTIDTKTG-NDGNGKVSTTINGEKVTLTVADITAGAANV-DAATLQSSKNVYT 335

Query: 212 AYNVDDLIDASATFDGDYANVGAGALAGDYVKVQGSWVKAVDVAATGQEVVYDDGTTKWG 271
+ T + A + + + A A + V G T +
Sbjct: 336 SVVNGQFTFDDKTKNESAKLSDLEANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFI 395

Query: 272 VDTTVTGAPATNVAAPASIATIDITIAAQAGNLDALIAGVDEALTDMTSAAASLGSISSR 331
T + N A AA + +A +D AL+ + + +SLG+I +R
Sbjct: 396 DKTASGVSTLINEDA-----------AAAKKSTANPLASIDSALSKVDAVRSSLGAIQNR 444

Query: 332 IDLQSDFVNKLSDSIDSGVGRLVDADMNEESTRLKALQTQQQLAIQALSIANSDSQNVLS 391
D + +++S R+ DAD E + + Q QQ L+ AN QNVLS
Sbjct: 445 FDSAITNLGNTVTNLNSARSRIEDADYATEVSNMSKAQILQQAGTSVLAQANQVPQNVLS 504

Query: 392 LFR 394
L R
Sbjct: 505 LLR 507


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SMc03038  FLAGELLIN  153    7e-44    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 153 bits (387), Expect = 7e-44
Identities = 55/378 (14%), Positives = 111/378 (29%), Gaps = 4/378 (1%)

Query: 4 ILTNIAAMAALQTLRTIGSNMEETQAHVSSGLRVGQAADNAAYWSIATTMRSDNMALSAV 63
I TN ++ L S++ +SSGLR+ A D+AA +IA S+ L+
Sbjct: 4 INTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGLTQA 63

Query: 64 QDALGLG---AAKVDTAYSGMESAIEVVKEIKAKLVAATEDGVDKAKIQEEIDQLKDQLT 120
G A + A + + + ++ V+E+ + T D IQ+EI Q +++
Sbjct: 64 SRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEEID 123

Query: 121 SIAEAASFSGENWLQADLSGGPVTKSVVGSFVRDAGGAVSVKKVD-YSLNTNSVLFDTAG 179
++ F+G L D + G + + VK + N N T G
Sbjct: 124 RVSNQTQFNGVKVLSQDNQMKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPKEATVG 183

Query: 180 NTGILDKVYNVSQASVTLPVNVNGTTSEYTVGAYNVDDLIDASATFDGDYANVGAGALAG 239
+ K + V + + +
Sbjct: 184 DLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTTDDAEN 243

Query: 240 DYVKVQGSWVKAVDVAATGQEVVYDDGTTKWGVDTTVTGAPATNVAAPASIATIDITIAA 299
+ K+ A + + K G G T + ++
Sbjct: 244 NTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTIDTKTGNDGNGKVSTTI 303

Query: 300 QAGNLDALIAGVDEALTDMTSAAADLGSIAMRIDLQSDFVNKLSDSIDSGVGRLVDADMN 359
+ +A + ++ +A + F +S ++A+
Sbjct: 304 NGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKTKNESAKLSDLEANNA 363

Query: 360 EESTRLKALQTQQQLAIQ 377
+ + + A
Sbjct: 364 VKGESKITVNGAEYTANA 381



Score = 94.3 bits (234), Expect = 3e-23
Identities = 57/363 (15%), Positives = 108/363 (29%), Gaps = 19/363 (5%)

Query: 32 SSGLRVGQAADNAAYWSIATTMRSDNMALSAVQDALGLGAAKVDTAYSGMESAIEVVKEI 91
L + N + ++S ++ SG +
Sbjct: 164 VKSLGLDGFNVNGPKEATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTV 223

Query: 92 KAKLVAATEDGVDKAKIQEEIDQLKDQLTSIAEAASFSGENWLQADLSGGPVTKSVVGSF 151
K V+ A Q D ++ + S +A G + G
Sbjct: 224 PDK------VYVNAANGQLTTDDAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDT 277

Query: 152 VRDAGGAVSVKKVDYSLNTNSVLFDTAGNTGILDKVYNVSQASVTLPVNVNGTTSEYTVG 211
G ++ + N + T + V +++ + + +S+
Sbjct: 278 FDYKGVTFTIDTKTG-NDGNGKVSTTINGEKVTLTVADITAGAANV-DAATLQSSKNVYT 335

Query: 212 AYNVDDLIDASATFDGDYANVGAGALAGDYVKVQGSWVKAVDVAATGQEVVYDDGTTKWG 271
+ T + A + + + A A + V G T +
Sbjct: 336 SVVNGQFTFDDKTKNESAKLSDLEANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFI 395

Query: 272 VDTTVTGAPATNVAAPASIATIDITIAAQAGNLDALIAGVDEALTDMTSAAADLGSIAMR 331
T + N A AA + +A +D AL+ + + + LG+I R
Sbjct: 396 DKTASGVSTLINEDA-----------AAAKKSTANPLASIDSALSKVDAVRSSLGAIQNR 444

Query: 332 IDLQSDFVNKLSDSIDSGVGRLVDADMNEESTRLKALQTQQQLAIQSLSIANSASENVLT 391
D + +++S R+ DAD E + + Q QQ L+ AN +NVL+
Sbjct: 445 FDSAITNLGNTVTNLNSARSRIEDADYATEVSNMSKAQILQQAGTSVLAQANQVPQNVLS 504

Query: 392 LFR 394
L R
Sbjct: 505 LLR 507


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SMc03039  FLAGELLIN  138    1e-38    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 138 bits (349), Expect = 1e-38
Identities = 54/330 (16%), Positives = 102/330 (30%), Gaps = 12/330 (3%)

Query: 4 ILTNVAAMAALQTLRGIDSNMEETQARVSSGLRVGTASDNAAYWSIATTMRSDNMALSAV 63
I TN ++ L S++ R+SSGLR+ +A D+AA +IA S+ L+
Sbjct: 4 INTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGLTQA 63

Query: 64 QDALGLGAAKVDTAYAGM---ENAVEVVKEIRAKLVAATEDGVDKAKIQEEIEQLKQQLT 120
G + T + N ++ V+E+ + T D IQ+EI+Q +++
Sbjct: 64 SRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEEID 123

Query: 121 SIATAASFSGENWLQADIT-TPVTKSVVGSFVRDSSGVVSVKTID-YVLDGNSVLFDTVG 178
++ F+G L D + G + + VK++ + N TVG
Sbjct: 124 RVSNQTQFNGVKVLSQDNQMKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPKEATVG 183

Query: 179 DAGILDKIYNVSQASVTLPVNVNGTTTEYTVAAYAVDELIAAGATFDGDSANVTGYTVPA 238
D K V + + +
Sbjct: 184 DLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQ-------L 236

Query: 239 GGIDYNGNFVKVEGTWVRAIDVAATGQEVVYDDGTTKWGVDTTVAGAPAINVVAPASIEN 298
D N ++ A + + K G G + N
Sbjct: 237 TTDDAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTIDTKTGNDGN 296

Query: 299 IDITNAAQAANLDALIRGVDEALEDLISAT 328
++ + + + ++ +AT
Sbjct: 297 GKVSTTINGEKVTLTVADITAGAANVDAAT 326



Score = 90.5 bits (224), Expect = 7e-22
Identities = 47/268 (17%), Positives = 81/268 (30%), Gaps = 7/268 (2%)

Query: 139 TTPVTKSVVGSFVRDSSGVVSVKTIDYVLDGNSVLFDTVGDAGILDKIYNVSQASVTLPV 198
T + + ++G K I + G DT G+ I + V
Sbjct: 241 AENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKE-GDTFDYKGVTFTIDTKTGNDGNGKV 299

Query: 199 NVNGTTTEYTVAAYAVDELIAAGATFDGDSANVTGYTVPAGGIDYNGNFVKVEGTWVRAI 258
+ + T+ + A S+ +V G ++
Sbjct: 300 STTINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKTKNESAKLSDLE 359

Query: 259 DVAATGQEVVY------DDGTTKWGVDTTVAGAPAINVVAPASIENIDITNAAQAANLDA 312
A E T I+ A I+ AA +
Sbjct: 360 ANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFIDKTASGVSTLINEDAAAAKKSTAN 419

Query: 313 LIRGVDEALEDLISATSALGSISMRIGMQEEFVSKLTDSIDSGIGRLVDADMNEESTRLK 372
+ +D AL + + S+LG+I R + +++S R+ DAD E + +
Sbjct: 420 PLASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARSRIEDADYATEVSNMS 479

Query: 373 ALQTQQQLAIQSLSIANTNSENILQLFR 400
Q QQ L+ AN +N+L L R
Sbjct: 480 KAQILQQAGTSVLAQANQVPQNVLSLLR 507


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SMc03040  FLAGELLIN  110    4e-29    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 110 bits (275), Expect = 4e-29
Identities = 48/295 (16%), Positives = 90/295 (30%), Gaps = 3/295 (1%)

Query: 26 TTQGRISSGYRVETAADNAAYWSIATTMRSDNAALSTVHDALGLGAAKVDTFYSAMDTVI 85
T + +V A N + + T G AK
Sbjct: 216 TDTTAPTVPDKVYVNAANGQLTTDDAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEG 275

Query: 86 DVMTEIKAKLVAASEPGVDKDKINKEVAELKSQLNSAAQSASFSGENWLYNGASAALGTK 145
D ++ G D + + + A + + S+
Sbjct: 276 DTFDYKGVTFTIDTKTGNDGNGKVSTTINGEKVTLTVADITAGAANVDAATLQSSKNVYT 335

Query: 146 SIVASFNRSADGSVTVSTLNYDTAKSVLIDVTDPARGMLTKAVNADALQSTPTGTARNYY 205
S+V D + S D + + A + T
Sbjct: 336 SVVNGQFTFDDKTKNESAKLSDLEANNAVKGESKITVNGA---EYTANAAGDKVTLAGKT 392

Query: 206 LIDAGAAPAGATEIEIDNATTGAQLGDMISVVDELISQLTDSAATLGAITSRIEMQESFV 265
+ A +T I D A + ++ +D +S++ ++LGAI +R + + +
Sbjct: 393 MFIDKTASGVSTLINEDAAAAKKSTANPLASIDSALSKVDAVRSSLGAIQNRFDSAITNL 452

Query: 266 ANLMDVIDKGVGRLVDADMNEESTRLKALQTQQQLGIQSLSIANTTSENILRLFQ 320
N + ++ R+ DAD E + + Q QQ G L+ AN +N+L L +
Sbjct: 453 GNTVTNLNSARSRIEDADYATEVSNMSKAQILQQAGTSVLAQANQVPQNVLSLLR 507



Score = 103 bits (257), Expect = 1e-26
Identities = 47/327 (14%), Positives = 96/327 (29%), Gaps = 15/327 (4%)

Query: 4 IMTNPAAMAALQTLRAINHNLETTQGRISSGYRVETAADNAAYWSIATTMRSDNAALSTV 63
I TN ++ L +L + R+SSG R+ +A D+AA +IA S+ L+
Sbjct: 4 INTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGLTQA 63

Query: 64 HDALGLGAAKVDTFYSAMDTV---IDVMTEIKAKLVAASEPGVDKDKINKEVAELKSQLN 120
G + T A++ + + + E+ + + D I E+ + +++
Sbjct: 64 SRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEEID 123

Query: 121 SAAQSASFSGENWLYNGASAALGTKSIVASFNRSADGSVTVSTLN---YDTAKSVLIDVT 177
+ F+G L + + + V +L ++ V
Sbjct: 124 RVSNQTQFNGVKVLSQDNQMKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPKEATVG 183

Query: 178 DPARGMLTKAVNADALQ-------STPTGTARNYYLIDAGAAPAGATEIEIDNATTGAQL 230
D +G T A+
Sbjct: 184 DLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTTDDAEN 243

Query: 231 GDMISVVDELIS--QLTDSAATLGAITSRIEMQESFVANLMDVIDKGVGRLVDADMNEES 288
+ + S ++ A GAI E + ID G + ++
Sbjct: 244 NTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTIDTKTGNDGNGKVSTTI 303

Query: 289 TRLKALQTQQQLGIQSLSIANTTSENI 315
K T + + ++ T ++
Sbjct: 304 NGEKVTLTVADITAGAANVDAATLQSS 330


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SMc03046  HTHFIS  41     2e-06    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 41.4 bits (97), Expect = 2e-06
Identities = 23/114 (20%), Positives = 42/114 (36%), Gaps = 5/114 (4%)

Query: 2 IVVVDDRALVKDGYASLFGREGIP-STGFDPREFGEWVSSAADSDIDAVEAFLIGQGEST 60
I+V DD A ++ R G + W+++ D V ++ E+
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGD---GDLVVTDVVMPDENA 62

Query: 61 FTLPRAIRDR-SRAPVIAMSDTPSLENTLALFDCGVDDVVRKPVHPREILARVA 113
F L I+ PV+ MS + + + G D + KP E++ +
Sbjct: 63 FDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIG 116


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
SMc03047  FLGHOOKAP1  41     5e-06    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 41.5 bits (97), Expect = 5e-06
Identities = 13/49 (26%), Positives = 28/49 (57%)

Query: 357 EILSGALESSNVDIAEELTAMIESQRNYTANSKVFQTGSELLEVLVNLK 405
++ + S V++ EE + Q+ Y AN++V QT + + + L+N++
Sbjct: 498 QLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINIR 546



Score = 38.4 bits (89), Expect = 5e-05
Identities = 13/33 (39%), Positives = 20/33 (60%)

Query: 9 TGVSGMNAQSNRLSTVAENIANANTTGYKRAST 41
+SG+NA L+T + NI++ N GY R +T
Sbjct: 6 NAMSGLNAAQAALNTASNNISSYNVAGYTRQTT 38


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
SMc03048  FLGHOOKAP1  68     2e-14    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 68.4 bits (167), Expect = 2e-14
Identities = 77/406 (18%), Positives = 143/406 (35%), Gaps = 36/406 (8%)

Query: 4 SSAIAIAQSAFSTTAQQTATVSKNIANSGNADYSR----------RMAMLGTTPGGAQIV 53
SS I A S + T S NI++ A Y+R + G G +
Sbjct: 1 SSLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVS 60

Query: 54 SIYRAQNEALLKQNLIGISQSSAQSSLLSGLEIMKSALGGNDYESSPSTYLSAFRNSLQT 113
+ R + + Q +QSS ++ + + + L + SS +T + F SLQT
Sbjct: 61 GVQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNML--STSTSSLATQMQDFFTSLQT 118

Query: 114 FASTPGNATIAATVVSDASDLANSISKTSAAVQDLRLDSDKKIAEEVANLNRLLAQFETA 173
S + ++ + L N T ++D + I V +N Q +
Sbjct: 119 LVSNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASL 178

Query: 174 NNAVKQATAAGTDATA--ALDDRDKILKQVSELVGISTVTRAN-NDTVIYTSGGTVLFET 230
N+ + + T G A+ LD RD+++ +++++VG+ + + +G +++ +
Sbjct: 179 NDQISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGS 238

Query: 231 LPREVTFAPKSAYDATVTGNGIFIDGVPLAAGSGADTSAQGKLAGLLQLRDDIAPTFQSQ 290
R+ A + ++DG G L G+L R ++
Sbjct: 239 TARQ--LAAVPSSADPSRTTVAYVDGTAGNIEIPEKLLNTGSLGGILTFRSQDLDQTRNT 296

Query: 291 LDEMARGLVTLF---KEGGL-------PGLFTWSGGTVPAAGAVQPGLAASLSVNPAAKA 340
L ++A F + G F V + +A +V A+
Sbjct: 297 LGQLALAFAEAFNTQHKAGFDANGDAGEDFFAIGKPAVLQNTKNKGDVAIGATVTDASAV 356

Query: 341 NPFLLRDGGFNG---VVSNPDGNAGYTVLLDGFVTAMDGDMAFDGA 383
F+ V+ N +TV D +G +AFDG
Sbjct: 357 L-ATDYKISFDNNQWQVTRLASNTTFTVTPD-----ANGKVAFDGL 396



Score = 38.8 bits (90), Expect = 4e-05
Identities = 18/79 (22%), Positives = 39/79 (49%)

Query: 390 SSIMEFAASSIGWFEQIRSGASTADDNKAALLARTQEALGSVTGVSIDEELSLLLDLEQS 449
S + AS + + T+ + ++ + S++GV++DEE L +Q
Sbjct: 465 KSFNDAYASLVSDIGNKTATLKTSSATQGNVVTQLSNQQQSISGVNLDEEYGNLQRFQQY 524

Query: 450 YKASAKLISTVDAMMASLL 468
Y A+A+++ T +A+ +L+
Sbjct: 525 YLANAQVLQTANAIFDALI 543


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc03053  TYPE3IMQPROT  53     5e-13    Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.

Length = 86

Score = 52.8 bits (127), Expect = 5e-13
Identities = 20/77 (25%), Positives = 39/77 (50%)

Query: 5 DALDIVQAAIWTVIVASGPAVLAAMVVGVGIAFIQALTQVQEMTLTFVPKILAVMITAAI 64
D + A++ V++ SG + A ++G+ + Q +TQ+QE TL F K+L V + +
Sbjct: 3 DLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLFL 62

Query: 65 SAPFVGAQISIFTDIVF 81
+ + G + + V
Sbjct: 63 LSGWYGEVLLSYGRQVI 79


8  SMc02383  SMc02658  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SMc02383  2       12       -0.481460  hypothetical protein
SMc02384  1       11       -0.194775  glycosyltransferase transmembrane protein
SMc02385  1       9        0.021087   hypothetical protein
SMc02386  1       12       -0.614001  AMP nucleosidase
SMc02387  2       16       -0.305317  hypothetical protein
SMc02388  1       16       -0.092673  hypothetical protein
SMc02389  1       16       -0.044062  hypothetical protein
SMc02390  1       15       -0.407174  glutathione S-transferase
SMc02391  1       16       -0.642691  hypothetical protein
SMc02392  0       15       -1.737261  hypothetical protein
SMc02393  0       16       -1.617157  histidinol-phosphate aminotransferase
SMc02394  1       20       -2.380826  *hypothetical protein
SMc02461  2       20       -2.499874  hypothetical protein
SMc02396  2       18       -2.217504  outer membrane protein
SMc02397  2       16       -1.971983  integrase
SMc02398  3       14       -0.191255  transposase ISRM11/ISRM2011-2
SMc05004  2       12       -0.460108  hypothetical protein
SMc02399  2       11       -0.644422  transposase ISRM11/ISRM2011-2
SMc02400  1       12       0.127428   outer membrane protein
SMc02401  2       14       0.810781   hypothetical protein
SMc02403  1       14       0.567801   murein transglycosylase
SMc02404  3       16       0.280981   dihydrodipicolinate synthase
SMc02405  2       14       -0.058297  SsrA-binding protein
SMc02406  2       14       0.157949   hypothetical protein
SMc02422  0       18       -0.830493  hypothetical protein
SMc02407  -1      14       -2.174394  hypothetical protein
SMc02408  -2      13       -1.958408  DNA-directed RNA polymerase subunit omega
SMc02659  -2      11       -2.144718  GTP pyrophosphokinase (ATP:GTP
SMc02658  1       22       -3.108256  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SMc02401  TCRTETA  27     0.030    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 27.1 bits (60), Expect = 0.030
Identities = 10/32 (31%), Positives = 18/32 (56%), Gaps = 1/32 (3%)

Query: 91 LGAFFG-GLAAGAIVGGVLSQPRYAAPRYSGG 121
+ A FG G+ AG ++GG++ AP ++
Sbjct: 136 MSACFGFGMVAGPVLGGLMGGFSPHAPFFAAA 167


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc02406  PRTACTNFAMLY  60     7e-11    Pertactin virulence factor family signature.
		>PRTACTNFAMLY#Pertactin virulence factor family signature.

Length = 910

Score = 59.7 bits (144), Expect = 7e-11
Identities = 64/258 (24%), Positives = 101/258 (39%), Gaps = 15/258 (5%)

Query: 1136 VWARIDGTRAHVEPEESGARVSEYDSRSWLLQAGIDALLHSSGKGELIGGLTAHYGGINT 1195
W R R ++ +D + + G D + +G +GGL + G
Sbjct: 649 AWGRGFAQRQQLDNRAG----RRFDQKVAGFELGADHAVAVAGGRWHLGGLAGYTRGDRG 704

Query: 1196 DVSSFYGDGSIDTRGYGLGATLTWYGQNGFYLDGQGQLTRFDSDLDSDI--LGRLADGND 1253
F GDG T +G T+ +GFYLD + +R ++D +
Sbjct: 705 ----FTGDGGGHTDSVHVGGYATYIADSGFYLDATLRASRLENDFKVAGSDGYAVKGKYR 760

Query: 1254 GLGYAVSIEAGHRIGLGDTWALTPQAQLVNTSVNFDDFVDPYGARVSLDDGNSLRGRIGL 1313
G S+EAG R D W L PQA+L + G RV + G+S+ GR+GL
Sbjct: 761 THGVGASLEAGRRFTHADGWFLEPQAELAVFRAGGGAYRAANGLRVRDEGGSSVLGRLGL 820

Query: 1314 ALDREQQWQGAEGDGRSSRLYGTIDLYREFLGDTRADVSGVSFGIEADDWIGEIGIGGSY 1373
+ + + GR + Y + +EF G +G++ E E+G+G +
Sbjct: 821 EVGK----RIELAGGRQVQPYIKASVLQEFDGAGTVHTNGIAHRTELRGTRAELGLGMAA 876

Query: 1374 NWGDDKYSLYGEARASTG 1391
G +SLY S G
Sbjct: 877 ALG-RGHSLYASYEYSKG 893



Score = 34.3 bits (78), Expect = 0.004
Identities = 49/260 (18%), Positives = 86/260 (33%), Gaps = 16/260 (6%)

Query: 344 SAIGLAAISQGNMGEARVSLDGGDVSTGGDNAEGLIATAKGAGGTATARLVDGNVKTAGR 403
+ + +A + G A + + G+ G+ GG TA +K +GR
Sbjct: 17 TTLAMALGALGAAPAAHADWNNQSIVKTGERQHGIHIQGSDPGGVRTA--SGTTIKVSGR 74

Query: 404 DAEGLIAHAGDESIPTSINADASVIMQAGSITTTGDGA-GGMIAETDIGPTPSTGKATAV 462
A+G++ E+ + + +G ++ G G + V
Sbjct: 75 QAQGILL----ENPAAELQFRNGSVTSSGQLSDDGIRRFLGTVTVKAGKLVADHATLANV 130

Query: 463 QRAGTIMTSGGEIGGNEGSYAIAALSF-GSGLASIEQA------GGSATTAGAQSHALYA 515
+ G + +IA + G+G IE+ + G AL +
Sbjct: 131 GDTWDDDGIALYVAGEQAQASIADSTLQGAGGVQIERGANVTVQRSAIVDGGLHIGALQS 190

Query: 516 LSIFGNTMVTQAAGSSAVATGAKASGLRALAGPGGGNEVTLDG-KVTAGSAPAVHTIGSA 574
L + ++ T ASG A G +E+TLDG +T G A V + A
Sbjct: 191 LQP-EDLPPSRVVLRDTNVTAVPASGAPAAVSVLGASELTLDGGHITGGRAAGVAAMQGA 249

Query: 575 GSKITIGSAAEIDGSASGTA 594
+ + D A G
Sbjct: 250 VVHLQRATIRRGDAPAGGAV 269


9  SMc01903  SMc01912  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SMc01903  2       14       -2.162613  ATP-dependent Clp protease proteolytic subunit
SMc01904  1       11       -1.449047  ATP-dependent protease ATP-binding subunit ClpX
SMc01905  1       9        0.875566   ATP-dependent protease LA protein
SMc01906  2       10       2.153700   histone-like protein
SMc01907  2       10       1.823227   hypothetical protein
SMc01908  2       12       2.525340   transcriptional regulator
SMc01909  2       11       2.672447   hypothetical protein
SMc01910  1       10       2.255709   hypothetical protein
SMc01911  2       16       -0.942965  hypothetical protein
SMc01912  2       17       -1.500207  ***NADH dehydrogenase subunit A
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SMc01904  RTXTOXIND  29     0.031    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 29.4 bits (66), Expect = 0.031
Identities = 14/82 (17%), Positives = 31/82 (37%), Gaps = 11/82 (13%)

Query: 245 ILFICGGAFAGLDKIISARGEKTSIGFGATVRAPEDRRVGEVLRELEPEDLVKFGLIPEF 304
++ ++ + +A G+ T G ++ E+ V E++ ++ + V+ G
Sbjct: 69 VIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEII--VKEGESVRKG----- 121

Query: 305 IGRLPVLATLEDLDEDALIQIL 326
VL L L +A
Sbjct: 122 ----DVLLKLTALGAEADTLKT 139


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SMc01905  PF05272  34     0.003    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 33.5 bits (76), Expect = 0.003
Identities = 14/81 (17%), Positives = 32/81 (39%), Gaps = 6/81 (7%)

Query: 299 DWLLGLPWGKKSKIKTDLNHAEKVLDTDHFGLDKVKERIVEYLAVQARSSKIKGP----- 353
DW+ W + +++ L H D+ ++V + +++ P
Sbjct: 537 DWVKAQQWDEVPRLEKWLVHVLGKTPDDYKPRRLRYLQLVGKYILMGHVARVMEPGCKFD 596

Query: 354 -ILCLVGPPGVGKTSLAKSIA 373
+ L G G+GK++L ++
Sbjct: 597 YSVVLEGTGGIGKSTLINTLV 617


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc01906  DNABINDINGHU  119    2e-39    Prokaryotic integration host factor signature.
		>DNABINDINGHU#Prokaryotic integration host factor signature.

Length = 91

Score = 119 bits (301), Expect = 2e-39
Identities = 47/88 (53%), Positives = 59/88 (67%)

Query: 2 NKNELVAAVADKAGLSKADASSAVDAVFETIQGELKNGGDIRLVGFGNFSVSRREASKGR 61
NK +L+A VA+ L+K D+++AVDAVF + L G ++L+GFGNF V R A KGR
Sbjct: 3 NKQDLIAKVAEATELTKKDSAAAVDAVFSAVSSYLAKGEKVQLIGFGNFEVRERAARKGR 62

Query: 62 NPSTGAEVDIPARNVPKFTAGKGLKDAV 89
NP TG E+ I A VP F AGK LKDAV
Sbjct: 63 NPQTGEEIKIKASKVPAFKAGKALKDAV 90


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SMc01910  RTXTOXIND  41     2e-05    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 41.4 bits (97), Expect = 2e-05
Identities = 31/227 (13%), Positives = 64/227 (28%), Gaps = 24/227 (10%)

Query: 405 AEGNHGDAEREVARLKEGAADPAFLAERVHLLRQSDCLLRQQSSAREIDRLEAAICDRLD 464
EG + +L A+ L + LL+ R Q +R I+ + D
Sbjct: 113 KEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPD 172

Query: 465 ELRPFAGTIDDLAALPAPAPAVVEAWRAEDHAEAERRLRLADRIADESERQAGDEARLAA 524
E + +++ L + W+ + ++ L L + A+ A
Sbjct: 173 EPYFQNVSEEEVLRLTSLIKEQFSTWQNQK---YQKELNLDKKRAERLTVLARINRYENL 229

Query: 525 IA------------ANGGVVDDAAVCELRRRRDTAWQRHRAGLGEQTAIAFETALNEHDA 572
+ + AV E + A R + I
Sbjct: 230 SRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQI--------ESE 281

Query: 573 ITALRLAQAERVAEIRSLTL-AVTERRARLQSLDAQRKAAEEQRQRL 618
I + + ++ L + + + L + EE++Q
Sbjct: 282 ILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQAS 328



Score = 36.0 bits (83), Expect = 0.001
Identities = 39/209 (18%), Positives = 66/209 (31%), Gaps = 24/209 (11%)

Query: 119 ATYSTMFSLDDDSIEEGGEAILKSEGELGSLLFSASSGLPDSTAVLAVLRAEADSFFRPQ 178
T T SL +E+ IL EL L P V S + Q
Sbjct: 135 DTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQ 194

Query: 179 --ARKHQLAELKAELDALKAERNEIDVNAREYAALRKALASMRARHEAAARLRSE----- 231
++Q + + LD +AER + ++R + + L +
Sbjct: 195 FSTWQNQKYQKELNLDKKRAERLTVLARIN---RYENLSRVEKSRLDDFSSLLHKQAIAK 251

Query: 232 ---LRADRDRMRAQRDAVPLLARLRGARQELAGRDPLPVPPAEWQEELPVLRR-RDAEIA 287
L + + A + ++L E+ +EE ++ + EI
Sbjct: 252 HAVLEQENKYVEAVNELRVYKSQLEQIESEIL----------SAKEEYQLVTQLFKNEIL 301

Query: 288 AGLRQLHEELTRRREELAALPRDEQALAI 316
LRQ + + ELA +QA I
Sbjct: 302 DKLRQTTDNIGLLTLELAKNEERQQASVI 330


10  SMc01213  SMc01202  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SMc01213  2       11       -0.487342   hypothetical protein
SMc01212  2       11       -0.796837   transport transmembrane protein
SMc01211  1       12       -1.046153   hypothetical protein
SMc01210  0       13       -0.756880   hypothetical protein
SMc01209  1       14       -1.858972   phosphopantetheine adenylyltransferase
SMc01700  0       15       -1.235211   peptidyl-prolyl cis-trans isomerase A
SMc01208  0       16       -1.383360   peptidyl-prolyl cis-trans isomerase B protein
SMc01207  1       18       -1.772736   S-adenosylmethionine--tRNA
SMc01206  1       18       -2.144082   queuine tRNA-ribosyltransferase
SMc01205  3       20       -2.335883   amino-acid-binding periplasmic signal peptide
SMc01204  2       15       -0.585772   transmembrane oxidoreductase
SMc01203  3       19       -1.578467   hypothetical protein
SMc01202  2       17       -0.664942   lipoprotein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SMc01212  TCRTETA  275  3e-91  Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 275 bits (704), Expect = 3e-91
Identities = 146/359 (40%), Positives = 209/359 (58%), Gaps = 3/359 (0%)

Query: 9 RGLFLVFLILFLDIMGIAIIVPVLPTYLEELTGADIGEAAVDGGWLLLVYSAMQFLFAPL 68
R L ++ + LD +GI +I+PVLP L +L + + G LL +Y+ MQF AP+
Sbjct: 5 RPLIVILSTVALDAVGIGLIMPVLPGLLRDLVHS--NDVTAHYGILLALYALMQFACAPV 62

Query: 69 IGNLSDRFGRRPVLLASVLTFALDNLICALATSYWMLFIGRSLAGISGASFGTASAYIAD 128
+G LSDRFGRRPVLL S+ A+D I A A W+L+IGR +AGI+GA+ A AYIAD
Sbjct: 63 LGALSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIVAGITGATGAVAGAYIAD 122

Query: 129 VSDDENRAKNFGLIGIAFGTGFALGPVIGGFLGELGPRVPFYGAAALSFLNFIMGVFLLP 188
++D + RA++FG + FG G GPV+GG +G P PF+ AAAL+ LNF+ G FLLP
Sbjct: 123 ITDGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLP 182

Query: 189 ETLAPANRRRFEWHRANPLGALKQMRHYPGIGWVGLVFFLYWLAHAVYPAVWSFVGSYRY 248
E+ RR NPL + + R + + VFF+ L V A+W G R+
Sbjct: 183 ES-HKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGEDRF 241

Query: 249 GWSEGQIGLSLGIFGVGGAIVMALVLPRVVPALGERRTAALGLTFTALGMAGYAAAWEGW 308
W IG+SL FG+ ++ A++ V LGERR LG+ G A A GW
Sbjct: 242 HWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADGTGYILLAFATRGW 301

Query: 309 MVYAVIVATALESLADPPLRSIASVHVPPSAQGELQGALTSISSITTILGPLMFTQIFA 367
M + ++V A + P L+++ S V QG+LQG+L +++S+T+I+GPL+FT I+A
Sbjct: 302 MAFPIMVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAALTSLTSIVGPLLFTAIYA 360


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SMc01209  LPSBIOSNTHSS  169  1e-56  Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein signature.

Length = 166

Score = 169 bits (429), Expect = 1e-56
Identities = 64/158 (40%), Positives = 97/158 (61%), Gaps = 5/158 (3%)

Query: 4 AFYPGSFDPITNGHLDVLVQALNVAAKVIVAIGVHPGKAPLFSFDERADLIRAALEETLP 63
A YPGSFDPIT GHLD++ + + +V VA+ +P K P+FS ER + I A+ LP
Sbjct: 3 AIYPGSFDPITFGHLDIIERGCRLFDQVYVAVLRNPNKQPMFSVQERLEQIAKAIAH-LP 61

Query: 64 ERAADISVVSFDNLVVDAAREHGARLLVRGLRDGTDLDYEMQMAGMNRQMAPDIQTLFLP 123
+ V SF+ L V+ AR+ A ++RGLR +D + E+QMA N+ +A D++T+FL
Sbjct: 62 ----NAQVDSFEGLTVNYARQRQAGAILRGLRVLSDFELELQMANTNKTLASDLETVFLT 117

Query: 124 AGTASRPITATLVRQIAAMGGDVSAFVPGAVHQALQAK 161
T ++++LV+++A GG+V FVP V AL +
Sbjct: 118 TSTEYSFLSSSLVKEVARFGGNVEHFVPSHVAAALYDQ 155


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SMc01204  DHBDHDRGNASE  64  6e-14  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 63.5 bits (154), Expect = 6e-14
Identities = 51/186 (27%), Positives = 79/186 (42%), Gaps = 6/186 (3%)

Query: 11 VVVVGASRGIGKAIAEVAARDGAPVVLVARSEVALAAAADDIRNAGGEAFTVPLDFLADD 70
+ GA++GIG+A+A A GA + V + L ++ A P D + D
Sbjct: 11 AFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPAD-VRDS 69

Query: 71 AALG--LEDFLSANGLFCDVLVNSAGYGLRGGAT-VLPLDEQLGLVDLNIRALTELTLRL 127
AA+ G D+LVN AG LR G L +E +N + + +
Sbjct: 70 AAIDEITARIEREMGPI-DILVNVAGV-LRPGLIHSLSDEEWEATFSVNSTGVFNASRSV 127

Query: 128 LPGMVARRRGGVINLGSVASFTPGPYMALYYASKGFVRSFSEALHQELRHTGVTVTCVAP 187
M+ RR G ++ +GS + P MA Y +SK F++ L EL + V+P
Sbjct: 128 SKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSP 187

Query: 188 GPVSTE 193
G T+
Sbjct: 188 GSTETD 193


11  SMc00317  SMc00254  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SMc00317  3       18       -3.129732   hypothetical protein
SMc00248  4       21       -3.652197   cytochrome C-type biogenesis protein
SMc00249  10      22       -3.216132*  hypothetical protein
SMc00250  8       26       -3.228563   hypothetical protein
SMc00251  7       25       -2.862135   hypothetical protein
SMc00252  4       23       -1.587736   signal peptide protein
SMc00253  4       21       -1.136537   signal peptide protein
SMc00254  3       20       -1.413370   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SMc00248  BACINVASINB  31  0.004  Salmonella/Shigella invasin protein B signature.
		>BACINVASINB#Salmonella/Shigella invasin protein B signature.

Length = 593

Score = 31.3 bits (70), Expect = 0.004
Identities = 16/35 (45%), Positives = 22/35 (62%)

Query: 149 CIGPVLGTILGVAAARDTVADGAALLAIYSLGLAV 183
CIG VLG +L + + V G A LA+ ++GLAV
Sbjct: 316 CIGKVLGALLTIVSVVAAVFTGGASLALAAVGLAV 350


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SMc00253  RTXTOXINA  31  0.001  Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 31.5 bits (71), Expect = 0.001
Identities = 12/46 (26%), Positives = 22/46 (47%)

Query: 14 IASGAVAQQADGTNAATGMVGGAATGAIIGGPIGAGVGAVVGATLG 59
++ + + + + AAT +++G P+ A VGAV G G
Sbjct: 363 AIDASLTTISTVLASVSSGISAAATTSLVGAPVSALVGAVTGIISG 408


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SMc00254  RTXTOXINA  36  3e-05  Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 35.7 bits (82), Expect = 3e-05
Identities = 15/48 (31%), Positives = 24/48 (50%)

Query: 13 TGIFAAPAIANDTNQSAVTGAAGGAVTGAIVGGPVGAAIGGAVGLVAG 60
TG A T ++V+ A T ++VG PV A +G G+++G
Sbjct: 361 TGAIDASLTTISTVLASVSSGISAAATTSLVGAPVSALVGAVTGIISG 408


12  SMc00264  SMc04866  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SMc00264  3       10       0.186658    oxidoreductase
SMc00265  2       11       -0.087826   periplasmic binding protein
SMc00266  2       11       0.147793    transmembrane protein
SMc00267  3       13       -0.080258   hypothetical protein
SMc00268  3       14       -0.497746   oxidoreductase
SMc00269  3       14       -0.627439   transketolase
SMc00270  1       15       -0.948660   transketolase subunit alpha
SMc00271  1       14       -1.185548   periplasmic binding protein
SMc00272  1       17       -1.749581   hypothetical protein
SMc00273  0       18       -2.236717   transmembrane protein
SMc00274  1       16       -2.950541   oxidoreductase
SMc00275  1       16       -3.560100   transcriptional regulator
SMc00276  2       18       -4.458466   hypothetical protein
SMc00277  0       19       -3.809716   hypothetical protein
SMc00278  0       16       -3.130269   transcriptional regulator
SMc00279  0       16       -2.305700   hypothetical protein
SMc00280  3       17       -0.976675   hypothetical protein
SMc00281  5       18       -1.088432   signal peptide protein
SMc00282  0       15       -1.920399   hypothetical protein
SMc00283  -1      14       -2.655680   transcriptional regulator
SMc00284  0       15       -3.325214   hypothetical protein
SMc00285  0       15       -3.487135   transposase ISRM3
SMc04866  0       17       -4.396226   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SMc00264  DHBDHDRGNASE  100  2e-27  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 100 bits (250), Expect = 2e-27
Identities = 77/263 (29%), Positives = 130/263 (49%), Gaps = 11/263 (4%)

Query: 2 ADRLKGKSAIVFGAGSSGPGWGNGKAAAVLYAREGARVACVDVDMEAAEETAAIIASEGG 61
A ++GK A + GA G G+A A A +GA +A VD + E E+ + + +E
Sbjct: 3 AKGIEGKIAFITGAAQ-----GIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEAR 57

Query: 62 QAVAIAADVTALASVEAAVAAARQAFGIISILHNNVGVTHMGGPVELSEAQFQQSVDLNL 121
A A ADV A+++ A + G I IL N GV G LS+ +++ + +N
Sbjct: 58 HAEAFPADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNS 117

Query: 122 GSVFRTAKAVIPHMIEAGGGAVVNI-SSLAGIRWTGYPYFAYYATKAAVNQATVAIAAQY 180
VF +++V +M++ G++V + S+ AG+ T AY ++KAA T + +
Sbjct: 118 TGVFNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMA--AYASSKAAAVMFTKCLGLEL 175

Query: 181 APHGIRANCVVPGLIDTPLIYKQISSQYASAEEM---VAARNAALPGGRMGDAWDVANAA 237
A + IR N V PG +T + + + + + + + + +P ++ D+A+A
Sbjct: 176 AEYNIRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAV 235

Query: 238 LFLASDEAKFITGVCLPVDGGQS 260
LFL S +A IT L VDGG +
Sbjct: 236 LFLVSGQAGHITMHNLCVDGGAT 258


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SMc00268  DHBDHDRGNASE  127  8e-38  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 127 bits (320), Expect = 8e-38
Identities = 83/261 (31%), Positives = 128/261 (49%), Gaps = 16/261 (6%)

Query: 1 MGTALLEGNFAVITGAASRRGLGKATARLFAEHGATVAILD--LDEADAREAAASLGPRH 58
M +EG A ITGAA +G+G+A AR A GA +A +D ++ + ++ RH
Sbjct: 1 MNAKGIEGKIAFITGAA--QGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARH 58

Query: 59 V-GMACNVTDLASVRTVMDALVSQWGRIDILVNNAGITQPLKIMEIAPENYDAVLDVNLR 117
+V D A++ + + + G IDILVN AG+ +P I ++ E ++A VN
Sbjct: 59 AEAFPADVRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNST 118

Query: 118 GTLYCSQAVIPHMRSRKQGKIVNLSSVSAQRGGGIFGGPHYSAAKAGVLGLTKAMARELA 177
G S++V +M R+ G IV + S A G Y+++KA + TK + ELA
Sbjct: 119 GVFNASRSVSKYMMDRRSGSIVTVGSNPA--GVPRTSMAAYASSKAAAVMFTKCLGLELA 176

Query: 178 PDNVRVNAICPGFIATDITGGLLTPEK---------LEEIKAGIPMGRPGTADDVAGCAL 228
N+R N + PG TD+ L E LE K GIP+ + D+A L
Sbjct: 177 EYNIRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVL 236

Query: 229 FLASDLSAYVTGSEVDVNGGS 249
FL S + ++T + V+GG+
Sbjct: 237 FLVSGQAGHITMHNLCVDGGA 257


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SMc00271  PF06291  28  0.026  Lambda prophage Bor protein
		>PF06291#Lambda prophage Bor protein

Length = 102

Score = 27.7 bits (61), Expect = 0.026
Identities = 20/84 (23%), Positives = 39/84 (46%), Gaps = 5/84 (5%)

Query: 3 LRHLLTAAVLSLCAGPAAAQTYQLSHNAAAGNPKDVASLKF--AELVEQKSEGRLKIDVG 60
++ +L +A L++ A QT+ + + A PK+ + F + + ++K+ KI G
Sbjct: 6 MKKMLFSAALAMLITGCAQQTFTVGNKPTAVTPKETITHHFFVSGIGQKKTVDAAKI-CG 64

Query: 61 GSAQF--GDDAETITNMRLGTIAF 82
G+ + +T N LG I
Sbjct: 65 GAENVVKTETQQTFVNGLLGFITL 88


13  SMc04315  SMc04336  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SMc04315  2       14       1.023937    transcriptional regulator
SMc04325  1       11       0.691969    methionine synthase I
SMc04326  0       13       0.114606    aminotransferase
SMc04327  0       13       0.111099    hypothetical protein
SMc04328  1       14       -0.410894   lipoprotein signal peptide
SMc04329  0       13       -0.614328   hypothetical protein
SMc04330  0       15       -1.268244   trimethylamine methyltransferase
SMc04331  -1      17       -1.933886   dimethylamine corrinoid protein
SMc04332  4       16       -0.828907   entericidin B signal peptide protein
SMc04333  3       15       -0.320866   hypothetical protein
SMc04334  5       16       0.479008    formyltransferase
SMc05020  4       15       0.762504    entericidin A precursor
SMc04335  3       16       1.395808    lipoprotein transmembrane
SMc04336  3       15       1.447584    transmembrane signal peptide protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SMc04330  BINARYTOXINA  30  0.030  Clostridial binary toxin A signature.
		>BINARYTOXINA#Clostridial binary toxin A signature.

Length = 454

Score = 29.6 bits (66), Expect = 0.030
Identities = 11/34 (32%), Positives = 19/34 (55%)

Query: 463 LADNNSFEQWEIEGEKRIEQRANALARSWLEHYE 496
L D + QWE + +R+E+ + L + LE Y+
Sbjct: 51 LKDKENAIQWEKKEAERVEKNLDTLEKEALELYK 84


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SMc04335  CHANLCOLICIN  24  0.042  Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 23.9 bits (51), Expect = 0.042
Identities = 10/19 (52%), Positives = 13/19 (68%)

Query: 37 GSGEGAGSGSMGGSGGGSS 55
G+ +G+GSG GG GG S
Sbjct: 27 GTPDGSGSGGGGGKGGSKS 45


14  SMc04177  SMc04196  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SMc04177  4       18       -1.536518   hypothetical protein
SMc04178  2       18       -0.774115   hypothetical protein
SMc04179  3       20       -2.390760   transmembrane protein
SMc04180  3       25       -4.280146   transmembrane protein
SMc04181  3       26       -4.829774   transmembrane protein
SMc04182  1       21       -2.861225   hypothetical protein
SMc04183  2       20       -2.808347   transmembrane protein
SMc04185  2       20       -4.144574   glycosyltransferase
SMc04184  0       17       -2.629049   hypothetical protein
SMc04186  0       17       -1.686187   hypothetical protein
SMc04187  1       18       -1.271814   DNA packaging protein GP2
SMc04188  2       19       -3.944578   hypothetical protein
SMc04189  2       23       -4.029684   hypothetical protein
SMc04190  4       22       -4.013117   signal peptide protein
SMc04191  3       22       -2.508542   hypothetical protein
SMc04194  2       19       -2.936676   transmembrane protein
SMc04195  2       18       -2.639723   transposase ISRM3
SMc04440  2       19       -1.617910   hypothetical protein
SMc04196  3       20       -1.672390   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SMc04185  CHANLCOLICIN  28  0.040  Channel forming colicin signature.
		>CHANLCOLICIN#Channel forming colicin signature.

Length = 522

Score = 28.1 bits (62), Expect = 0.040
Identities = 12/21 (57%), Positives = 17/21 (80%)

Query: 217 SGSEARYRAKSFRDRLTQRLK 237
+ +EA+ +AK+ RD LTQRLK
Sbjct: 75 AAAEAQAKAKANRDALTQRLK 95


15  SMc02813  SMc01557  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SMc02813  2       32       -4.479929   aminoglycoside N3-acetyltransferase
SMc01545  3       39       -6.254431   hypothetical protein
SMc01546  4       42       -7.245383   hypothetical protein
SMc01547  4       45       -8.005163   hypothetical protein
SMc01740  6       45       -8.100064   L-lactate dehydrogenase (cytochrome) protein
SMc01741  7       45       -8.435713   hypothetical protein
SMc01742  6       40       -6.905009   hypothetical protein
SMc01745  7       37       -6.510747   hypothetical protein
SMc04442  6       32       -5.355871   acetyltransferase
SMc01548  5       27       -6.320106   hypothetical protein
SMc01744  3       27       -6.766219   hypothetical protein
SMc01549  2       25       -6.833557   hypothetical protein
SMc01749  2       27       -6.257302   replicative DNA helicase
SMc01550  2       26       -5.986259   hypothetical protein
SMc01551  1       22       -5.116470   hypothetical protein
SMc01552  1       19       -2.781008   hypothetical protein
SMc01553  0       17       -1.294411   acyl carrier protein
SMc01554  0       16       -0.780864   hypothetical protein
SMc01555  0       17       0.717410    hypothetical protein
SMc01556  1       14       0.971112*   hypothetical protein
SMc01557  2       16       -0.070468   signal peptide protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SMc01545  SACTRNSFRASE  30  0.002  Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 29.5 bits (66), Expect = 0.002
Identities = 9/22 (40%), Positives = 15/22 (68%)

Query: 48 VEAEHRGRGIGRALVGRAAEAA 69
V ++R +G+G AL+ +A E A
Sbjct: 97 VAKDYRKKGVGTALLHKAIEWA 118


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SMc04442  SACTRNSFRASE  48  2e-09  Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 48.0 bits (114), Expect = 2e-09
Identities = 22/77 (28%), Positives = 33/77 (42%), Gaps = 2/77 (2%)

Query: 63 ATQALVGHIQLAVDWRNGVARIGRVMVAPTMRGQGLAKSLLEAALEQAFSHPEIERTELN 122
+G I++ +W NG A I + VA R +G+ +LL A+E A + L
Sbjct: 72 LENNCIGRIKIRSNW-NGYALIEDIAVAKDYRKKGVGTALLHKAIEWAKENH-FCGLMLE 129

Query: 123 VYSWNSAAIRTYNKVGF 139
N +A Y K F
Sbjct: 130 TQDINISACHFYAKHHF 146


16  SMc01522  SMc01515  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SMc01522  2       22       -1.494190   hypothetical protein
SMc03949  1       21       -2.656274   nitrogen regulatory protein
SMc01521  1       20       -1.879992   nitrogen regulatory protein
SMc01520  3       16       -0.227876   hypothetical protein
SMc01519  6       14       0.170544    hypothetical protein
SMc01748  8       15       -1.135092   hypothetical protein
SMc01518  7       15       -0.662005   hypothetical protein
SMc01517  4       11       -0.110751   hypothetical protein
SMc01516  3       11       0.562662    hypothetical protein
SMc01515  2       10       1.362530    hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SMc01522  HTHTETR  61  7e-14  TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 60.8 bits (147), Expect = 7e-14
Identities = 31/136 (22%), Positives = 58/136 (42%), Gaps = 1/136 (0%)

Query: 4 APSRKKQPQRVRRQLLEVAARLSLEQGVAAVTLDAVSQAAGVSKGGLLHHFPNKLALLDA 63
A K++ Q R+ +L+VA RL +QGV++ +L +++AAGV++G + HF +K L
Sbjct: 2 ARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSE 61

Query: 64 LFDDLVARFDSALDEAMAGDDVEKGRFTRAYLGVCFALDAEAEAQGWQMLTIALLAEPHL 123
+++ + E A + R L E + ++ I +
Sbjct: 62 IWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRR-LLMEIIFHKCEFV 120

Query: 124 KERWREWVARRSAQFA 139
E A+R+
Sbjct: 121 GEMAVVQQAQRNLCLE 136


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SMc01515  PF03544  71  2e-16  Gram-negative bacterial tonB protein
		>PF03544#Gram-negative bacterial tonB protein

Length = 243

Score = 70.8 bits (173), Expect = 2e-16
Identities = 47/207 (22%), Positives = 72/207 (34%), Gaps = 11/207 (5%)

Query: 97 VSQEPSEMTPTEADVILPAEEMPASQVEESDVIASIAPVETVIPQERPDPEEVLKPEVEK 156
V + P+ P ++ PA+ P V+ PV P+ P PE + V
Sbjct: 40 VIELPAPAQPISVTMVAPADLEPPQAVQPPP-----EPVVEPEPEPEPIPEPPKEAPVVI 94

Query: 157 VEPKKEPAKKKKVVRKKAGEGGTQAKSQKKGKADGAETATAAAATGESRGASREIGNAAV 216
+PK +P K K V+K Q K K + A ++ +
Sbjct: 95 EKPKPKPKPKPKPVKKVE-----QPKRDVKPVESRPASPFENTAPARPTSSTATAATSKP 149

Query: 217 SNYPGKVRRKLTRA-IRYPAEARRQGLRGVAHVSFTVTSGGGVARVGITKSAGSPVLDQA 275
R L+R +YPA A+ + G V F VT G V V I + + + ++
Sbjct: 150 VTSVASGPRALSRNQPQYPARAQALRIEGQVKVKFDVTPDGRVDNVQILSAKPANMFERE 209

Query: 276 ALETVRRAAPFPAIPAGAGRDRWEFNI 302
+RR P P F I
Sbjct: 210 VKNAMRRWRYEPGKPGSGIVVNILFKI 236


17  SMc01978  SMc02017  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SMc01978  2       14       -0.785522   sugar transport system permease ABC transporter
SMc01979  1       13       -0.827467   sugar transport system permease ABC transporter
SMc01980  1       14       -0.871146   sugar transport system ATP-binding ABC
SMc01981  3       14       -1.579455   cytochrome C transmembrane protein
SMc01982  2       16       -1.006581   alternative cytochrome C oxidase polypeptide II
SMc01983  2       15       -1.196993   alternative cytochrome C oxidase polypeptide I
SMc01984  0       13       -1.743490   cytochrome-C oxidase transmembrane protein
SMc01985  0       11       -1.643094   cytochrome-c oxidase
SMc01986  -1      10       -1.232233   hypothetical protein
SMc01987  0       11       -0.746362   dehydrogenase
SMc01988  0       18       -1.438547   hypothetical protein
SMc01989  1       17       -0.227965   hypothetical protein
SMc01990  4       15       -0.398926   ferredoxin reductase
SMc01991  4       16       -0.266764   short chain dehydrogenase
SMc01992  4       15       -1.000766   alcohol dehydrogenase
SMc01993  4       16       -1.839625   dioxygenase ferredoxin protein
SMc02016  3       14       -1.411186   ferredoxin reductase
SMc02017  2       14       -1.718788   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SMc01980  PF05272  34  0.001  Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 33.9 bits (77), Expect = 0.001
Identities = 15/56 (26%), Positives = 20/56 (35%), Gaps = 9/56 (16%)

Query: 32 VVLVGPSGCGKSTLLRMIAGLESVTGGEIRIAGRRVNELAPKDRDIAMVFQSYALY 87
VVL G G GKSTL+ + GL+ + I +D Y
Sbjct: 599 VVLEGTGGIGKSTLINTLVGLDFFSDTHFDI---------GTGKDSYEQIAGIVAY 645


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SMc01991  DHBDHDRGNASE  92  1e-24  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 92.4 bits (229), Expect = 1e-24
Identities = 66/251 (26%), Positives = 109/251 (43%), Gaps = 20/251 (7%)

Query: 5 GKSVIVTGGGKGIGRATVALLAARGAQVVALSRTQSDLDELSSQTGCHG-----IVADLA 59
GK +TG +GIG A LA++GA + A+ L+++ S AD+
Sbjct: 8 GKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVR 67

Query: 60 DAAAARAA----MVEAGTCDILVNCAGTNVLESLIDMTDDGYDVVMNTNLRAALICAQEF 115
D+AA E G DILVN AG + ++D+ ++ + N ++
Sbjct: 68 DSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSV 127

Query: 116 ARARMAAGGGGAIVNVTSIAGHRGFQNHVCYAASKAGLEGATRVMAKELGPYGIRVVAIA 175
++ M G+IV V S + YA+SKA T+ + EL Y IR ++
Sbjct: 128 SKY-MMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVS 186

Query: 176 PTITMTELAAAAWADPQKSEPMMVRH---------PAGRFATVDDVARSIAMLLSSDAEM 226
P T T++ + WAD + +++ P + A D+A ++ L+S A
Sbjct: 187 PGSTETDMQWSLWAD-ENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGH 245

Query: 227 VSGAVLPVDGG 237
++ L VDGG
Sbjct: 246 ITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SMc01993  MICOLLPTASE  27  0.026  Microbial collagenase metalloprotease (M9) signature.
		>MICOLLPTASE#Microbial collagenase metalloprotease (M9) signature.

Length = 1104

Score = 26.6 bits (58), Expect = 0.026
Identities = 9/31 (29%), Positives = 15/31 (48%)

Query: 28 GRSFAIFRTADDQYYATDDICTHEYAHISDG 58
G F RT ++ Y +++ HE+ H G
Sbjct: 480 GTFFTYERTPEESIYTLEELFRHEFTHYLQG 510


18  SMc02034  SMc02460  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SMc02034  2       10       2.196827    oxidoreductase
SMc02035  3       12       1.620324    oxidoreductase
SMc02036  4       11       1.412194    transcriptional regulator
SMc02037  4       12       1.091944    oxidoreductase
SMc02038  4       13       1.477900    glycerol dehydrogenase
SMc02039  2       14       0.913925    oxidoreductase
SMc02040  2       11       1.924683    oxidoreductase
SMc02041  0       11       2.514528    short chain dehydrogenase
SMc02042  0       10       1.903711    glucose-6-phosphate isomerase
SMc02043  1       9        2.425563    KHG/KDPG aldolase
SMc02044  0       8        2.454894    hypothetical protein
SMc02045  1       9        3.041283    oxidoreductase
SMc02332  0       9        2.564001    hypothetical protein
SMc02460  2       11       2.435692    hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SMc02034  DHBDHDRGNASE  127  9e-38  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 127 bits (320), Expect = 9e-38
Identities = 80/257 (31%), Positives = 126/257 (49%), Gaps = 13/257 (5%)

Query: 8 LSGRVAVVTGAGQGIGLACAEALCEAGAAVVLTDISAERCEAGRAALAAKGYVVETDLID 67
+ G++A +TGA QGIG A A L GA + D + E+ E ++L A+ E D
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPAD 65

Query: 68 IGDSASVNAVADRLAVSGRAADILVANAGIAHAGVPAEELSDADWERMIGINLSGAFRSC 127
+ DSA+++ + R+ DILV AG+ G LSD +WE +N +G F +
Sbjct: 66 VRDSAAIDEITARIEREMGPIDILVNVAGVLRPG-LIHSLSDEEWEATFSVNSTGVFNAS 124

Query: 128 RAFGRHMLAKGRGSIVTIGSMSGTIVNRPQQQVH-YNAAKAGVHHLTRSLAAEWAARGVR 186
R+ ++M+ + GSIVT+GS P+ + Y ++KA T+ L E A +R
Sbjct: 125 RSVSKYMMDRRSGSIVTVGSNPA---GVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIR 181

Query: 187 VNSVAPTYIDTPL--LTFAKED------KPMYEQWLDMTPMHRLGQPDEIASVVLFLASD 238
N V+P +T + +A E+ K E + P+ +L +P +IA VLFL S
Sbjct: 182 CNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSG 241

Query: 239 ASSLMTGSIVAADAGYT 255
+ +T + D G T
Sbjct: 242 QAGHITMHNLCVDGGAT 258


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SMc02037  DHBDHDRGNASE  152  2e-47  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 152 bits (385), Expect = 2e-47
Identities = 91/258 (35%), Positives = 133/258 (51%), Gaps = 14/258 (5%)

Query: 8 LNGRAAFVTGGSRGIGFACAEALGEAGARVAISARSRDEGEKAVRQLRQKGIEAIYLPAD 67
+ G+ AF+TG ++GIG A A L GA +A + ++ EK V L+ + A PAD
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPAD 65

Query: 68 ISNESAAQQVVRQAAAELGGLDILVNNAGIARHCDSLKLEPETWDEVINTNLTGLFWCCR 127
+ + +A ++ + E+G +DILVN AG+ R L E W+ + N TG+F R
Sbjct: 66 VRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASR 125

Query: 128 AAIETMSAAGRGSIVNIGSISGYISNLPQNQV-AYNASKAGVHMLTKSLAGEFAKSNIRI 186
+ + M GSIV +GS + +P+ + AY +SKA M TK L E A+ NIR
Sbjct: 126 SVSKYMMDRRSGSIVTVGSNP---AGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRC 182

Query: 187 NAVAPGYIETAMTQG--GLDDPEWSKIW-------LGMTPLGRAGKASEVAAAVLFLASD 237
N V+PG ET M ++ I G+ PL + K S++A AVLFL S
Sbjct: 183 NIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGI-PLKKLAKPSDIADAVLFLVSG 241

Query: 238 AASYITGSVLTIDGGYTI 255
A +IT L +DGG T+
Sbjct: 242 QAGHITMHNLCVDGGATL 259


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SMc02039  DHBDHDRGNASE  74  4e-18  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 74.3 bits (182), Expect = 4e-18
Identities = 46/149 (30%), Positives = 68/149 (45%), Gaps = 6/149 (4%)

Query: 8 VMITGASAGIGQETARVFSAAGYPLLLIARRSELIEAMALPNML------AVAADVRDYD 61
ITGA+ GIG+ AR ++ G + + E +E + A ADVRD
Sbjct: 11 AFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRDSA 70

Query: 62 ALASAIRQGEARFGPVDCLINNAGVSRLARLDEQDPAQWRDLVDINCLGVLNGMHAVAPG 121
A+ + E GP+D L+N AGV R + +W +N GV N +V+
Sbjct: 71 AIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVSKY 130

Query: 122 MKERRCGTIVNVSSTQAARSIRTMTSMAA 150
M +RR G+IV V S A +M + A+
Sbjct: 131 MMDRRSGSIVTVGSNPAGVPRTSMAAYAS 159


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SMc02041  DHBDHDRGNASE  135  6e-41  2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 135 bits (341), Expect = 6e-41
Identities = 84/252 (33%), Positives = 122/252 (48%), Gaps = 13/252 (5%)

Query: 15 GKTVVVTGAATGIGRAVAEAFATKRARVALLDRDAAVSDVAVS----LGTGHIAHVADVT 70
GK +TGAA GIG AVA A++ A +A +D + + VS A ADV
Sbjct: 8 GKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVR 67

Query: 71 DEQGVERAVKSVTEAFGRIDILINNAGIGPLAPAESYPTAEWDRTLAVNLKGAFLMARAI 130
D ++ + G IDIL+N AG+ S EW+ T +VN G F +R++
Sbjct: 68 DSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSV 127

Query: 131 APGMLEQGSGRIVNMASQAAIIGIEGHVAYCASKAGIIGMTNCMALEWGPRGVTVNAVSP 190
+ M+++ SG IV + S A + AY +SKA + T C+ LE + N VSP
Sbjct: 128 SKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSP 187

Query: 191 TVVETELGLTGWAGEKGERARAA---------IPTRRFAKPWEIAASVLYLAGGAAAMVN 241
ET++ + WA E G IP ++ AKP +IA +VL+L G A +
Sbjct: 188 GSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGHIT 247

Query: 242 GANLMIDGGYTI 253
NL +DGG T+
Sbjct: 248 MHNLCVDGGATL 259


19  SMc02358  SMc02420  Y  Y  N  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SMc02358  2       13       1.142403    high-affinity branched-chain amino acid ABC
SMc02359  1       14       -0.196905   high-affinity branched-chain amino acid ABC
SMc02360  1       12       -0.590310   branched chain amino acid ABC transporter
SMc02412  -1      12       -1.367046   hypothetical protein
SMc02413  1       14       -1.347511   aminotransferase
SMc02414  2       14       -2.224528   hydrolase
SMc02415  2       16       -1.995330   peptide-binding periplasmic ABC transporter
SMc02416  3       15       -1.179643   N-ethylammeline chlorohydrolase
SMc02417  4       17       -1.027317   peptide-binding periplasmic ABC transporter
SMc02418  4       16       -0.185634   peptide transport system permease ABC
SMc02419  3       15       -0.499504   peptide transport system permease
SMc02420  2       15       -0.916089   cytosine deaminase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SMc02412  ISCHRISMTASE  60  6e-13  Isochorismatase signature.
		>ISCHRISMTASE#Isochorismatase signature.

Length = 312

Score = 60.0 bits (145), Expect = 6e-13
Identities = 49/207 (23%), Positives = 82/207 (39%), Gaps = 22/207 (10%)

Query: 20 DIALDPETTALLVIDIQNTYLDPKDTPEETARWQPFYERMRKTVIPNTARLIDECRKRKV 79
DP LL+ D+QN ++D + + N +L ++C + +
Sbjct: 23 SWVPDPNRAVLLIHDMQNYFVDA---------FTA-GASPVTELSANIRKLKNQCVQLGI 72

Query: 80 EVIF-ARIACLKPDGRDRSLSQK--KPGFNYLLLPKELPEGQIVPELEPRSDEIVVTKTT 136
V++ A+ PD DR+L PG N E +I+ EL P D++V+TK
Sbjct: 73 PVVYTAQPGSQNPD--DRALLTDFWGPGLN-----SGPYEEKIITELAPEDDDLVLTKWR 125

Query: 137 DSALTGTNLRLILHNMGIKDVICCGIFTD-QCVSSTVRSLADESFGVVVVDDCGAAATDD 195
SA TNL ++ G +I GI+ C+ + + E V D A + +
Sbjct: 126 YSAFKRTNLLEMMRKEGRDQLIITGIYAHIGCLVTACEAFM-EDIKAFFVGDAVADFSLE 184

Query: 196 LHRRELEIINMIYCHVVSLEEVLTFFR 222
H+ LE V + +L +
Sbjct: 185 KHQMALEYAAGRCAFTVMTDSLLDQLQ 211


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SMc02414  UREASE  41  9e-06  Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 40.9 bits (96), Expect = 9e-06
Identities = 20/40 (50%), Positives = 27/40 (67%), Gaps = 4/40 (10%)

Query: 368 ITIDAAGAIGMDHEIGSLEVGKKADIVIIDMR----KPHL 403
TI+ A A G+ HEIGSLEVGK+AD+V+ + KP +
Sbjct: 409 YTINPAIAHGLSHEIGSLEVGKRADLVLWNPAFFGVKPDM 448


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SMc02420  UREASE  30  0.018  Urea amidohydrolase (urease) protein signature.
		>UREASE#Urea amidohydrolase (urease) protein signature.

Length = 570

Score = 30.1 bits (68), Expect = 0.018
Identities = 16/54 (29%), Positives = 23/54 (42%), Gaps = 11/54 (20%)

Query: 17 CDVLIRDGKIAGFGR-----------FEAEPGMAVEDGGNAIVVPGLIDAHTHL 59
D+ ++DG+IA G+ PG V G IV G +D+H H
Sbjct: 86 ADIGLKDGRIAAIGKAGNPDMQPGVTIIVGPGTEVIAGEGKIVTAGGMDSHIHF 139


20  SMc00671  SMc00659  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SMc00671  2       9        1.044700    histidine transport system permease ABC
SMc00670  2       9        0.931619    histidine transport ATP-binding ABC transporter
SMc00669  1       10       1.226020    histidine ammonia-lyase
SMc00668  1       14       -0.281306   hypothetical protein
SMc00667  1       18       -1.566037   hypothetical protein
SMc00666  5       27       -3.471486   hypothetical protein
SMc00665  1       28       -3.994381   hypothetical protein
SMc00664  0       14       -0.128676   signal peptide protein
SMc00663  2       15       0.723032    hypothetical protein
SMc00662  2       15       0.717788    hypothetical protein
SMc00661  2       15       1.083521    hypothetical protein
SMc00660  3       16       0.825483    signal peptide protein
SMc00659  2       15       0.110838*   tRNA-specific 2-thiouridylase MnmA
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SMc00670  SECA  28  0.038  SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 28.3 bits (63), Expect = 0.038
Identities = 20/93 (21%), Positives = 33/93 (35%), Gaps = 1/93 (1%)

Query: 125 NVVYGQRVRGVSKDDAREIGMKWIDTVGLSGYDAKFPHQLSGGMKQRVGLARALAADTDV 184
+Y QR + D E + V + DA P Q M GL L D D+
Sbjct: 657 RAIYSQRNELLDVSDVSETINSIREDVFKATIDAYIPPQSLEEMWDIPGLQERLKNDFDL 716

Query: 185 IL-MDEAFSALDPLIRGDMQDQLLQLQRNLAKT 216
L + E L +++++L + +
Sbjct: 717 DLPIAEWLDKEPELHEETLRERILAQSIEVYQR 749


21  SMc03178  SMc03190  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias     Product
SMc03178  2       12       0.484242    hypothetical protein
SMc03179  2       12       0.803262    monovalent cation/H+ antiporter subunit A
SMc03180  4       14       0.880681    monovalent cation/H+ antiporter subunit C
SMc03181  4       13       1.046198    monovalent cation/H+ antiporter subunit D
SMc03182  3       11       0.898685    pH adaptation potassium efflux system E
SMc03183  2       11       2.405925    monovalent cation/H+ antiporter subunit F
SMc03184  1       10       3.077782    PH adaptation potassium efflux system G
SMc03185  1       10       3.311284    NADPH dehydrogenase quinone reductase
SMc03186  1       10       4.035696    transcriptional regulator
SMc03187  2       10       4.329653    precorrin-4 C(11)-methyltransferase
SMc03188  1       11       4.509271    precorrin-6Y C5,15-methyltransferase
SMc03189  1       11       3.709567    cobalt-precorrin-6x reductase
SMc03190  1       12       3.119759    precorrin-3B C(17)-methyltransferase
22  SMc03105  SMc03090  Y  N  N  Genomic Island
LocusTag  DNBias  CDNBias  %GCBias     Product
SMc03105  2       11       1.746182    1-deoxy-D-xylulose 5-phosphate reductoisomerase
SMc03104  3       11       1.056204    5-aminolevulinate synthase
SMc03103  3       12       0.914457    carbon monoxide dehydrogenase medium subunit
SMc03102  1       9        0.901117    carbon monoxide dehydrogenase large subunit
SMc03101  0       9        1.733460    carbon monoxide dehydrogenase small subunit
SMc03100  0       9        1.549654    hypothetical protein
SMc03099  0       10       1.307471    adenylate cyclase
SMc03098  2       10       1.656180    hypothetical protein
SMc03097  2       10       1.900303    hypothetical protein
SMc03096  1       11       2.121884    signal peptide protein
SMc03095  0       12       0.243435    hypothetical protein
SMc03094  0       12       0.430946    aminoglycoside 3'-phosphotransferase
SMc03093  1       12       0.277157    hypothetical protein
SMc03092  1       14       -0.054970   transcriptional regulator
SMc03091  0       13       -0.384275   arginase
SMc03090  2       12       -1.224873   chemotaxis protein
23  SMc02470  SMc03790  Y  Y  Y  Pathogenicity Island (biased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SMc02470  2       14       1.525835    transcriptional regulator
SMc02469  1       14       0.596713    oxidoreductase
SMc02468  2       14       -0.035139   hypothetical protein
SMc02467  3       13       0.080269    peptide methionine sulfoxide reductase
SMc02466  2       13       0.526558    succinate dehydrogenase iron-sulfur subunit
SMc02465  2       13       0.821695    succinate dehydrogenase flavoprotein subunit
SMc02464  2       16       -0.564169   succinate dehydrogenase membrane anchor subunit
SMc02463  1       13       1.531988    succinate dehydrogenase cytochrome B-556 subunit
SMc03225  1       13       2.763782    hypothetical protein
SMc03226  1       13       3.134056    hypothetical protein
SMc03227  1       14       3.359347    hypothetical protein
SMc03228  3       13       3.910690    YciI-like protein
SMc03229  4       12       3.943233    NAD(P)H-dependent glycerol-3-phosphate
SMc03230  4       11       3.260753    DNA-binding/iron metalloprotein/AP endonuclease
SMc03231  3       10       2.613688    porphobilinogen deaminase
SMc03232  2       10       2.610426    uroporphyrinogen-III synthase
SMc03233  2       9        1.819699    hypothetical protein
SMc03234  0       10       0.494290    hypothetical protein
SMc03235  0       11       -0.955231   hypothetical protein
SMc03236  0       10       -0.850887   glutamine amidotransferase
SMc03237  1       11       -1.109317   transport transmembrane protein
SMc03238  2       12       -1.212360   hypothetical protein
SMc03239  2       11       -1.177394   inorganic pyrophosphatase
SMc03240  1       11       0.774888    signal peptide protein
SMc03241  2       11       0.835642    hypothetical protein
SMc03242  2       10       1.425146    GTP-binding protein
SMc03243  -1      11       1.957891    alkaline phosphatase transmembrane protein
SMc03244  1       16       1.495207    hypothetical protein
SMc03245  2       15       2.119278    amidase
SMc03746  3       22       -1.887768   hypothetical protein
SMc03747  4       23       -2.051804   hypothetical protein
SMc03748  2       27       -2.691115   hypothetical protein
SMc03312  2       25       -3.203976   transposase ISRM11/ISRM2011-2
SMc03313  3       28       -3.553360   transposase ISRM11/ISRM2011-2
SMc03246  2       29       -4.376388   integrase DNA protein
SMc05011  4       31       -3.583753   bacteriophage DNA-binding transcription
SMc03247  4       24       -3.095609   integrase/recombinase
SMc03248  3       30       -5.192682   transposase
SMc03249  4       41       -7.776541   hypothetical protein
SMc03751  4       41       -7.627165   hypothetical protein
SMc03250  5       43       -7.627760   hypothetical protein
SMc03251  4       41       -7.183585   hypothetical protein
SMc03252  5       47       -8.876278   gamma-glutamyl kinase
SMc03253  5       43       -8.656061   L-proline 3-hydroxylase
SMc03254  5       31       -5.207314   antikinase
SMc03256  4       31       -5.464636   hypothetical protein
SMc03257  4       31       -5.634571   transposase
SMc03258  6       33       -6.577110   transcriptional regulator
SMc03259  5       30       -5.331107   hypothetical protein
SMc03260  7       32       -5.578184   transposase
SMc03262  7       42       -7.820992   hypothetical protein
SMc03263  6       42       -8.436201   hypothetical protein
SMc03264  7       44       -9.120210   dipeptidase
SMc03265  7       46       -9.432724   amino acid dehydrogenase transmembrane protein
SMc03267  7       48       -10.212634  dipeptidase
SMc03268  7       47       -10.086497  peptide transport system ATP-binding ABC
SMc03269  7       48       -9.860865   peptide-binding periplasmic ABC transporter
SMc03754  8       49       -8.294349   peptide transport system permease ABC
SMc03272  5       37       -4.457761   peptide transport system permease ABC
SMc05012  3       30       -2.031457   hypothetical protein
SMc05013  3       30       -2.209851   hypothetical protein
SMc05014  3       29       -1.716890   recombinase
SMc05015  3       26       -1.746468   recombinase
SMc03277  2       20       -1.325963   transport transmembrane protein
SMc03278  1       23       -2.138820   transposase number 1 for insertion sequence
SMc03279  1       23       -2.629597   transposase number 2 for insertion sequence
SMc03280  1       26       -3.443227   transposase number 3 for insertion sequence
SMc03281  3       30       -5.101136   transposase
SMc03282  5       36       -6.603625   transposase number 1 for insertion sequence
SMc03283  7       43       -7.832638   transposase
SMc03284  8       44       -8.456562   hypothetical protein
SMc03285  8       42       -8.423956   hypothetical protein
SMc03286  7       40       -7.900068   protease transmembrane protein
SMc03287  6       40       -7.123499   oxidoreductase
SMc03288  4       34       -4.861181   hypothetical protein
SMc03289  2       30       -4.282422   hypothetical protein
SMc03290  3       26       -3.785087   hypothetical protein
SMc04428  4       30       -4.980911   hypothetical protein
SMc03293  2       24       -4.086055   hypothetical protein
SMc03294  2       22       -4.241305   hypothetical protein
SMc03295  2       20       -3.759928   transposase ISRM1
SMc03296  1       20       -3.797768   transposase ISRM1
SMc03297  3       27       -4.085667   hypothetical protein
SMc03298  4       29       -3.926222   transposase number 3 for insertion sequence
SMc03300  5       32       -4.081109   transposase number 2 for insertion sequence
SMc03301  5       31       -3.905891   transposase number 1 for insertion sequence
SMc05017  5       31       -4.027890   transposase
SMc05018  5       34       -5.383824   DNA or RNA helicase
SMc03761  5       36       -6.129063   hypothetical protein
SMc03762  6       40       -7.862742   hypothetical protein
SMc03763  5       44       -9.521198   cytosine-specific methyltransferase
SMc03764  4       42       -9.479445   DNA mismatch endonuclease, patch repair protein
SMc03765  3       41       -9.365271   hypothetical protein
SMc03766  4       33       -7.292939   hypothetical protein
SMc03767  3       27       -5.482937   hypothetical protein
SMc03768  3       17       -2.595415   hypothetical protein
SMc03769  1       11       -1.399223   protease transmembrane protein
SMc03770  0       14       1.646965    *50S ribosomal protein L21
SMc03772  0       12       2.177173    50S ribosomal protein L27
SMc03773  1       13       2.606095    hypothetical protein
SMc03774  1       11       2.039931    hypothetical protein
SMc03775  2       12       1.726043    GTPase ObgE
SMc03776  3       10       2.403478    gamma-glutamyl kinase
SMc03777  1       9        1.486753    gamma-glutamyl phosphate reductase
SMc03778  0       9        1.662719    nicotinic acid mononucleotide
SMc03780  -1      10       1.509153    hypothetical protein
SMc03781  0       10       1.823159    rRNA large subunit methyltransferase
SMc03782  0       10       1.030232    signal peptide protein
SMc03783  0       6        -0.205341   carboxy-terminal processing protease precursor
SMc03784  1       10       0.636024    hypothetical protein
SMc03785  2       11       0.395984    dinucleoside polyphosphate hydrolase
SMc03786  2       11       0.047541    bacterioferritin
SMc03787  2       12       0.255045    hypothetical protein
SMc03788  2       11       0.955523    DNA polymerase III subunit alpha
SMc03789  3       14       2.263151    hypothetical protein
SMc03790  2       14       0.438990    hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc03233  IGASERPTASE   33     0.003    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 32.7 bits (74), Expect = 0.003
Identities = 44/256 (17%), Positives = 74/256 (28%), Gaps = 13/256 (5%)

Query: 22 EAEKTEPAAAEAAAGPDAAKSKPADISTSEAAEAGEVRTGEVESNEAGIGEPPLPAGDPK 81
E + TE A +A + A+ T+E A++G E+ E E A K
Sbjct: 1055 EQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGS------ETKETQTTETKETATVEK 1108

Query: 82 EGTPQEETE-MEEARADAAAAAFDAELPHATNGELPEAKRRQATGAGALAAGILGGLIAL 140
E + ETE +E + + E + A+ T
Sbjct: 1109 EEKAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADT 1168

Query: 141 AGAGVLQYAGYVPAPGPEQPGTEQNLASEI--EAIKAELQAQAPAAPVDVAPLENRLAAL 198
+ N E A Q + + +R +
Sbjct: 1169 EQPAKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVR 1228

Query: 199 EQAAR-EPGAAAAGAPEIKALEAEVTNLTNEMAALKTELAEARQAADAARAELAGRIDQA 257
EP ++ AL + TN A L A+A+ A ++ I Q
Sbjct: 1229 SVPHNVEPATTSSNDRSTVALCDLTSTNTN--AVLSDARAKAQFVALNVGKAVSQHISQL 1286

Query: 258 EQKLNEPANDIEMAKA 273
E NE ++ ++
Sbjct: 1287 EMN-NEGQYNVWVSNT 1301


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc03237  TCRTETA       49     1e-08    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 49.4 bits (118), Expect = 1e-08
Identities = 71/360 (19%), Positives = 140/360 (38%), Gaps = 31/360 (8%)

Query: 53 LGLIGLFQFLPSLVLILVTGSVADRYNRRAIMGICMLVSTLCGAAL--LLLTVTGLFSPW 110
L L L QF + VL G+++DR+ RR + L+ +L GAA+ ++ W
Sbjct: 49 LALYALMQFACAPVL----GALSDRFGRRPV-----LLVSLAGAAVDYAIMATAPFL--W 97

Query: 111 PVYAILVVFGIERAFLGPASQSLAPNLVPAEDLPNAIAWNSTSWQTAMIVGPVAGGLLYG 170
+Y +V GI A G + + ++ ++ + S + M+ GPV GGL+ G
Sbjct: 98 VLYIGRIVAGITGA-TGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGLMGG 156

Query: 171 FGPTVPYAVSLGFFAAGALMVFAV---PKPARRSPPSAANWQTITAGFRYIKAEKIVLGA 227
F P P+ + L + R P + FR+ + +V
Sbjct: 157 FSPHAPFFAAAALNGLNFLTGCFLLPESHKGERRPLRREALNPLA-SFRWARGMTVVAAL 215

Query: 228 ISLDLFAVLLGGA-VALMPVFARDVLTLGPWGLGLLRAAPGV-GAVGMAVWLAAHPIRNR 285
+++ L+G AL +F D +G+ AA G+ ++ A+ R
Sbjct: 216 MAVFFIMQLVGQVPAALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLG 275

Query: 286 AGLYMFVGVALFGLATVVFGLSATPWISIAALVVMGASDMISVYVREILITLWTPDELRG 345
+ +G+ G ++ + W++ +V++ + + ++ +L +E +G
Sbjct: 276 ERRALMLGMIADGTGYILLAFATRGWMAFPIMVLLASGGIGMPALQAMLSRQ-VDEERQG 334

Query: 346 RVNAVNMVFVGASNELGEFRAGMMASGFGTVFAVVFG---GAGTLLVSLLWALGFPQLRR 402
++ ++ +G F ++A G + + L+ L P LRR
Sbjct: 335 QLQGSLAALTSLTSIVGPL-------LFTAIYAASITTWNGWAWIAGAALYLLCLPALRR 387


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc03241  SACTRNSFRASE  29     0.007    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 28.8 bits (64), Expect = 0.007
Identities = 12/54 (22%), Positives = 25/54 (46%)

Query: 91 IYEIYLRPEYQGIGLGRVLFGEAKSLLKSLGCEGLVVWCLEDSANAYNFFHSAG 144
I +I + +Y+ G+G L +A K GL++ + + +A +F+
Sbjct: 92 IEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLMLETQDINISACHFYAKHH 145


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc03242  TCRTETOQM     189    2e-54    Tetracycline resistance protein TetO/TetQ/TetM family ...
		>TCRTETOQM#Tetracycline resistance protein TetO/TetQ/TetM family signature.

Length = 639

Score = 189 bits (482), Expect = 2e-54
Identities = 106/448 (23%), Positives = 188/448 (41%), Gaps = 90/448 (20%)

Query: 1 MSIRNIAIIAHVDHGKTTLVDELLKQSGSFRENQRVAE--RMMDSNELEKERGITILAKA 58
M I NI ++AHVD GKTTL + LL SG+ E V + D+ LE++RGITI
Sbjct: 1 MKIINIGVLAHVDAGKTTLTESLLYNSGAITELGSVDKGTTRTDNTLLERQRGITIQTGI 60

Query: 59 TSVEWKGTRINIVDTPGHADFGGEVERILSMVDGAIVLVDAAEGPMPQTKFVVGKALKVG 118
TS +W+ T++NI+DTPGH DF EV R LS++DGAI+L+ A +G QT+ + K+G
Sbjct: 61 TSFQWENTKVNIIDTPGHMDFLAEVYRSLSVLDGAILLISAKDGVQAQTRILFHALRKMG 120

Query: 119 LRPIVAINKIDRPDARHEEVINEVFDLFAA----------------------------LD 150
+ I INKID+ V ++ + +A ++
Sbjct: 121 IPTIFFINKIDQNGIDLSTVYQDIKEKLSAEIVIKQKVELYPNMCVTNFTESEQWDTVIE 180

Query: 151 ATDEQLD--------------------------FPILYGSGRNGWMNVAPEGPQDQGLAP 184
D+ L+ FP+ +GS +N G+
Sbjct: 181 GNDDLLEKYMSGKSLEALELEQEESIRFHNCSLFPVYHGSAKNN-----------IGIDN 229

Query: 185 LLDLVLDHVPEPS-VGEGPFRMIGTILEANPFLGRIITGRIHSGSIKPNQAVKVLGQDGK 243
L++++ + + G+ +E + R+ R++SG + +V++ +
Sbjct: 230 LIEVITNKFYSSTHRGQSELCGKVFKIEYSEKRQRLAYIRLYSGVLHLRDSVRISEK--- 286

Query: 244 VLESGRISKILAFRGIERQPIEEAHAGDIVAIAGLS---KGTVADTFCDPAINEPLKAQP 300
E +I+++ E I++A++G+IV + + DT P P
Sbjct: 287 --EKIKITEMYTSINGELCKIDKAYSGEIVILQNEFLKLNSVLGDTKLLPQRERIENPLP 344

Query: 301 IDPPTVTMSFLVNDSPLAGTEGDKVTSRVIRDRLFKEAEGNVALKIEESSEKDSFYVSGR 360
+ TV P + + + D L + ++ + L+ S +S
Sbjct: 345 LLQTTV--------EPSKPQQREML-----LDALLEISDSDPLLRYYVDSATHEIILSFL 391

Query: 361 GELQLAVLIETMRRE-GFELAVSRPRVV 387
G++Q+ V ++ + E+ + P V+
Sbjct: 392 GKVQMEVTCALLQEKYHVEIEIKEPTVI 419



Score = 38.7 bits (90), Expect = 7e-05
Identities = 20/84 (23%), Positives = 30/84 (35%), Gaps = 1/84 (1%)

Query: 395 QLLEPIEEVVIDVDEEHSGVVVQKMSERKAEMAELRPSGGNRVRLVFFAPTRGLIGYQSE 454
+LLEP I +E+ + A + N V L P R + Y+S+
Sbjct: 534 ELLEPYLSFKIYAPQEYLSRAYTDAPKYCANI-VDTQLKNNEVILSGEIPARCIQEYRSD 592

Query: 455 LLTDTRGTAIMNRLFHDYQPYKGE 478
L T G ++ Y GE
Sbjct: 593 LTFFTNGRSVCLTELKGYHVTTGE 616


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc03252  CARBMTKINASE  44     4e-07    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 43.7 bits (103), Expect = 4e-07
Identities = 30/130 (23%), Positives = 54/130 (41%), Gaps = 11/130 (8%)

Query: 148 GAVPLVNENDTTATCGTRVGDNDRLAAWIAEIINADLLILLSNVDGLFMKDPRNNPLTPM 207
G VP++ E+ V D D +AE +NAD+ ++L++V+G + T
Sbjct: 195 GGVPVILEDGEIKGVEA-VIDKDLAGEKLAEEVNADIFMILTDVNGAAL-----YYGTEK 248

Query: 208 LTEVESITREIEAMATQSVDPYSSGGMISKIEAG-KIAMNAGCRMIIANGTRSHPLYAIE 266
+ + E E + +G M K+ A + G R IIA+ ++ +
Sbjct: 249 EQWLREVKVE-ELRKYYEEGHFKAGSMGPKVLAAIRFIEWGGERAIIAHLEKAVEALEGK 307

Query: 267 SGGPSTHFIP 276
+G T +P
Sbjct: 308 TG---TQVLP 314


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc05012  CARBMTKINASE  26     0.007    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 25.9 bits (57), Expect = 0.007
Identities = 15/49 (30%), Positives = 21/49 (42%), Gaps = 3/49 (6%)

Query: 2 AEIDKERLRAMLAT---ADPVMLFTAIRAAQEDLGRRVDRRGAQVTPEE 47
A IDK+ LA AD M+ T + A G ++ +V EE
Sbjct: 211 AVIDKDLAGEKLAEEVNADIFMILTDVNGAALYYGTEKEQWLREVKVEE 259


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc05015  60KDINNERMP   30     0.008    60kDa inner membrane protein signature.
		>60KDINNERMP#60kDa inner membrane protein signature.

Length = 548

Score = 29.5 bits (66), Expect = 0.008
Identities = 16/63 (25%), Positives = 26/63 (41%), Gaps = 5/63 (7%)

Query: 51 LVIHWVGGVHTELRLPKRRRGQRNATPDDIVEAVRQLVLIANDDVIAGVLNRNGL-TTGN 109
L I+ GG + LP + + P ++E Q + A +G+ R+G N
Sbjct: 70 LTINTRGGDVEQALLPAYPKELNSTQPFQLLETSPQFIYQAQ----SGLTGRDGPDNPAN 125

Query: 110 GNR 112
G R
Sbjct: 126 GPR 128


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc03277  TCRTETA       34     0.002    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 33.6 bits (77), Expect = 0.002
Identities = 57/273 (20%), Positives = 89/273 (32%), Gaps = 13/273 (4%)

Query: 62 AASTLPVFFLSILAGAIADNFSRRWVMFAGHCLLALASTTLVISVGIGFISPWLILGLGF 121
A L F + + GA++D F RR V+ LA A+ I + +L +G
Sbjct: 50 ALYALMQFACAPVLGALSDRFGRRPVLLVS---LAGAAVDYAI---MATAPFLWVLYIGR 103

Query: 122 LAGCGFALNDPAWHASIGDILHKRDIPAAVTLTSVGYNIVRSGGPALGGVILAFFGPLTA 181
+ A I DI + S + GP LGG+ + F P
Sbjct: 104 IVAGITGATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGPVLGGL-MGGFSPHAP 162

Query: 182 FALAALCDLTP--LGAIWRTKWHVRA-SPLPREKITTAIYDGLRFTAMSFEIRSAVARAT 238
F AA + G + H PL RE + AV
Sbjct: 163 FFAAAALNGLNFLTGCFLLPESHKGERRPLRREALNPLASFRWARGMTVVAALMAV--FF 220

Query: 239 LFGLASISILALLPIVVRDQLASGPIVYGILLAGFG-SGAFVAGMSNSFLRRIMSQNKLV 297
+ L AL I D+ GI LA FG + M + + + + +
Sbjct: 221 IMQLVGQVPAALWVIFGEDRFHWDATTIGISLAAFGILHSLAQAMITGPVAARLGERRAL 280

Query: 298 AVASVACAACCLSLALTSSVAVAAFALALGGAG 330
+ +A + LA + +A + L +G
Sbjct: 281 MLGMIADGTGYILLAFATRGWMAFPIMVLLASG 313


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc03286  V8PROTEASE    44     7e-07    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 44.2 bits (104), Expect = 7e-07
Identities = 34/199 (17%), Positives = 59/199 (29%), Gaps = 36/199 (18%)

Query: 266 AYRNVAAIGLN---GLIHCSGTLIGRRTVLTAAHCVENYQNYIAAGRMTVSFGSLFFRPE 322
Y V I + G SG ++G+ T+LT H V+ F
Sbjct: 86 HYAPVTYIQVEAPTGTFIASGVVVGKDTLLTNKHVVD--------ATHGDPHALKAFPSA 137

Query: 323 KNYVVTGFDFPRDPAAGFFFNPQTY----EDDIAVLYVAEDVDSAYPPSSLHVGDRPTWA 378
N G F Q E D+A++ + + + + + A
Sbjct: 138 INQDNYPN--------GGFTAEQITKYSGEGDLAIVKFSPNEQNKHIGEVVKPATMSNNA 189

Query: 379 EIKDKPIAIIIVGFGFNVVRNDAVGAGIKREAAIHADGYTNRSFFFQRSNPNTCKGDSGG 438
E + I + G+ D A + + + G+SG
Sbjct: 190 ETQVNQ-NITVTGYPG-----DKPVATMWESKGKITYLKGEAMQYDLSTTG----GNSGS 239

Query: 439 PSFLIVNDDFVLIGVTSTG 457
P + N+ +IG+ G
Sbjct: 240 P---VFNEKNEVIGIHWGG 255


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc03300  PF00577       26     0.033    Outer membrane usher protein FimD
		>PF00577#Outer membrane usher protein FimD

Length = 878

Score = 26.4 bits (58), Expect = 0.033
Identities = 15/75 (20%), Positives = 23/75 (30%), Gaps = 20/75 (26%)

Query: 3 ASGVVVYVSCQPVDFRKGAASLMALVRDGGLDPF-----------------NGALYVFRS 45
G +V R G LM L + PF NG +Y+
Sbjct: 781 TRGAIVRAE---FKARVGIKLLMTLTHNNKPLPFGAMVTSESSQSSGIVADNGQVYLSGM 837

Query: 46 KRADRVRIVWWDGSG 60
A +V++ W +
Sbjct: 838 PLAGKVQVKWGEEEN 852


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc05018  YERSSTKINASE  31     0.017    Yersinia serine/threonine protein kinase signature.
		>YERSSTKINASE#Yersinia serine/threonine protein kinase signature.

Length = 732

Score = 31.2 bits (70), Expect = 0.017
Identities = 20/51 (39%), Positives = 29/51 (56%), Gaps = 8/51 (15%)

Query: 209 NDELVEQVLQDGLFQEMMPSNLDRRRMDRIASRPPEARELAALLGGQRVHL 259
++E +Q+L+D L EM P + D RR+ P + REL+ LL R HL
Sbjct: 414 DEESAKQILKDTLTGEMSPLSTDVRRIT-----PKKLRELSDLL---RTHL 456


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc03767  ARGDEIMINASE  29     0.026    Bacterial arginine deiminase signature.
		>ARGDEIMINASE#Bacterial arginine deiminase signature.

Length = 409

Score = 29.4 bits (66), Expect = 0.026
Identities = 11/69 (15%), Positives = 21/69 (30%), Gaps = 14/69 (20%)

Query: 142 LRDSGVPFHVITVLSWETLSQPDELFDFYISEGIERISFNIEEIEGINKGSSLKRPDVKE 201
L+++ V I L E L L + +IS+ I + +K
Sbjct: 62 LKNNLVEIEYIEDLISEVLVSSVALENKFISQFIL--------------EAEIKTDFTIN 107

Query: 202 AFRAFLRRF 210
+ +
Sbjct: 108 LLKDYFSSL 116


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc03768  V8PROTEASE    48     2e-08    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 48.1 bits (114), Expect = 2e-08
Identities = 40/208 (19%), Positives = 72/208 (34%), Gaps = 17/208 (8%)

Query: 29 ATTVFIECKTAAGTSRGSGVVVSVEGHVLTARHVLGLKPNDPMPGAIECAGSIGVADRNA 88
A +I+ + GT SGVVV + +LT +HV+ DP + +
Sbjct: 88 APVTYIQVEAPTGTFIASGVVVG-KDTLLTNKHVVDATHGDPHALKAFPSAINQ-DNYPN 145

Query: 89 TRAMIAQPITVG--VDAALLQFQDQRDYE-------FMRVCKFEDWMIRRKIFVAGFPGM 139
Q D A+++F + + + + + I V G+PG
Sbjct: 146 GGFTAEQITKYSGEGDLAIVKFSPNEQNKHIGEVVKPATMSNNAETQVNQNITVTGYPGD 205

Query: 140 TETGVPSFREGVLSTTRRNSKGVLETDGQTIAGMSGGPAFSSNLKGLVGIVIGA--QFTP 197
+G ++ + + ++ D T G SG P F+ ++GI G
Sbjct: 206 KPVATMWESKGKITYLKGEA---MQYDLSTTGGNSGSPVFNEK-NEVIGIHWGGVPNEFN 261

Query: 198 QGTVDYFGILPIEQRYIDDFQLTVSDQP 225
+ ++ I+D DQP
Sbjct: 262 GAVFINENVRNFLKQNIEDIHFANDDQP 289


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc03769  V8PROTEASE    100    4e-25    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 100 bits (249), Expect = 4e-25
Identities = 47/229 (20%), Positives = 80/229 (34%), Gaps = 21/229 (9%)

Query: 411 SVIGEDTRAAVLDTTGFPSRAIVQILFETRAREQHLCTGTLVSPNTVLTAAHCIHSGTRT 470
++ + R + DTT + I + +G +V +T+LT H + +
Sbjct: 69 VILPNNDRHQITDTTNGHYAPVTYI-QVEAPTGTFIASGVVVGKDTLLTNKHVVDAT--H 125

Query: 471 GEPFQNFRIIPGRNLGAAPFGRCLGVGASVLAGWTASATTDQSRYYDLGAIKLNCNIGDT 530
G+P N P G + +G A + N +IG+
Sbjct: 126 GDPHALKAFPSAINQDNYPNGGFTAEQITKYSGEGDLAI------VKFSPNEQNKHIGEV 179

Query: 531 TGWLGVRTIGNDEAIDTV-VQGYAADRAPTGRQWVSEDKLRILWQLKGFYQNDTFGGTSG 589
+ + + V GY D P W S+ K+ L Y T GG SG
Sbjct: 180 VKPATMSNNAETQVNQNITVTGYPGD-KPVATMWESKGKITYLKGEAMQYDLSTTGGNSG 238

Query: 590 APVFAKDSTDTLIGVHTNGLHGAEQPWKSN-NAFTRITSERLSLIQQWI 637
+PVF + + +IG+H G+ + N I + ++Q I
Sbjct: 239 SPVF--NEKNEVIGIHWGGV-------PNEFNGAVFINENVRNFLKQNI 278


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc03776  CARBMTKINASE  41     4e-06    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 41.0 bits (96), Expect = 4e-06
Identities = 29/123 (23%), Positives = 49/123 (39%), Gaps = 10/123 (8%)

Query: 135 GSVPIINENDTVATTEIRYGDNDRLAARVATMVGADLLVLLSDIDGLYTAPPHLDPNARF 194
G VP+I E+ + E D D ++A V AD+ ++L+D++G + ++
Sbjct: 195 GGVPVILEDGEIKGVEAVI-DKDLAGEKLAEEVNADIFMILTDVNGAALY--YGTEKEQW 251

Query: 195 LETVAEITPEIEAMAGGAASELSRGGMRTKIDAG-KIATTAGCAMIIASGKPDHPLAAIE 253
L E+ E E G M K+ A + G IIA + + A+E
Sbjct: 252 LR---EVKVE-ELRKYYEEGHFKAGSMGPKVLAAIRFIEWGGERAIIAH--LEKAVEALE 305

Query: 254 AGA 256

Sbjct: 306 GKT 308


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc03778  LPSBIOSNTHSS  30     0.004    Lipopolysaccharide core biosynthesis protein signat...
		>LPSBIOSNTHSS#Lipopolysaccharide core biosynthesis protein signature.

Length = 166

Score = 29.8 bits (67), Expect = 0.004
Identities = 23/72 (31%), Positives = 34/72 (47%), Gaps = 6/72 (8%)

Query: 11 GLFGGSFNPPHDGHALVAETALRRLGLDQLWWMVTPGNPLKDRNHLAPLGERIAMSEK-I 69
++ GSF+P GH + E R DQ++ V NP K + + ER+ K I
Sbjct: 3 AIYPGSFDPITFGHLDIIERGCRL--FDQVYVAVL-RNPNK--QPMFSVQERLEQIAKAI 57

Query: 70 ARNPRIKVTAFE 81
A P +V +FE
Sbjct: 58 AHLPNAQVDSFE 69


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc03782  GPOSANCHOR    48     5e-08    Gram-positive coccus surface protein anchor signature.
		>GPOSANCHOR#Gram-positive coccus surface protein anchor signature.

Length = 539

Score = 48.1 bits (114), Expect = 5e-08
Identities = 44/308 (14%), Positives = 101/308 (32%), Gaps = 22/308 (7%)

Query: 6 RRIRRGAAVAALFLSVIGPT--------AAQAPADAGAAKAAPPDPAADLAVRRDSTASE 57
R+++ G A A+ L+V+G +A A + A + ++ +
Sbjct: 13 RKLKTGTASVAVALTVLGAGLVVNTNEVSAVATRSQTDTLEKVQERADKFEIENNTLKLK 72

Query: 58 LEALSKTITLSRDRSEALEKTIAEIDKSNETLRDALVDSAAKRQEFERRISDGERTLGDL 117
LS +D ++ L + ++ + +L + A+K QE E R +D E+ L
Sbjct: 73 NSDLSFNNKALKDHNDELTEELSNAKEKLRKNDKSLSEKASKIQELEARKADLEKALEGA 132

Query: 118 RVKEDVVRRSLRARRGLLAEVLAALQRMGRNPPPAILVTPEDALGSVRSAILLGAVVPEI 177
++ A + A + + A+ + L A +
Sbjct: 133 MNFSTADSAKIKTLEAEKAALAARKADLEK----ALEGAMNFSTADSAKIKTLEAEKAAL 188

Query: 178 REQTDSLVADLKALADIRSGIAREREELTAAMTARLEEERRMSMLVAEKEKLRQRNAADL 237
+ L L+ + + + + + L A A + + + ++A +
Sbjct: 189 EARQAELEKALEGAMNFSTADSAKIKTLEAEKAALAARKADLEKALEGAMNFSTADSAKI 248

Query: 238 AAEQRKAEELALQAENLGGLISSLENEISSVREAAAAARAEE----------EERRRMSE 287
+ + L + L + N ++ AE+ E + ++
Sbjct: 249 KTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLN 308

Query: 288 AEREAARE 295
A R++ R
Sbjct: 309 ANRQSLRR 316



Score = 29.3 bits (65), Expect = 0.042
Identities = 36/237 (15%), Positives = 72/237 (30%), Gaps = 21/237 (8%)

Query: 68 SRDRSEALEKTIAEIDKSNETLRDALVDSAAKRQEFERRISDGERTLGDLRVKEDVVRRS 127
E +TL A++ E E+ + K +
Sbjct: 160 LEKALEGAMNFSTADSAKIKTLEAEKAALEARQAELEKALEGAMNFSTADSAKIKTLEAE 219

Query: 128 LRARRGLLAEVLAALQRMGRNPPPAILVTPEDALGSVRSAILLGAVVPEIREQTDSLVAD 187
A A++ AL+ + ++ + + + L
Sbjct: 220 KAALAARKADLEKALEGA-----MNFSTADSAKIKTLEAEKA------ALEARQAELEKA 268

Query: 188 LKALADIRSGIAREREELTAAMTARLEEERRMSMLVAEKEKLRQRNAADLAAEQRKAEEL 247
L+ + + + + + L A A E+ + RQ DL A + ++L
Sbjct: 269 LEGAMNFSTADSAKIKTLEAEKAALEAEKADLEHQSQVLNANRQSLRRDLDASREAKKQL 328

Query: 248 ALQAENLGGLISSLENEISSVREAAAAARAEE----------EERRRMSEAEREAAR 294
+ + L E S+R A+R + EE+ ++SEA R++ R
Sbjct: 329 EAEHQKLEEQNKISEASRQSLRRDLDASREAKKQLEAEHQKLEEQNKISEASRQSLR 385


24  SMc04113  SMc02824  N  Y  Y  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SMc04113  1       11       0.791711    pilus assembly transmembrane protein
SMc04112  2       13       1.098057    pilus assembly signal peptide protein
SMc04111  1       12       0.962140    pilus assembly transmembrane protein
SMc04110  1       13       0.575365    hypothetical protein
SMc04109  1       12       0.655872    response regulator protein
SMc02820  0       11       0.591540    pilus assembly protein
SMc02821  -1      11       0.322890    hypothetical protein
SMc02822  -1      12       1.107460    hypothetical protein
SMc02823  0       12       1.261327    transposase
SMc02824  -1      13       1.844195    hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc04113  PREPILNPTASE  48     3e-09    Type IV prepilin cysteine protease (C20) family sig...
		>PREPILNPTASE#Type IV prepilin cysteine protease (C20) family signature.

Length = 290

Score = 48.3 bits (115), Expect = 3e-09
Identities = 40/156 (25%), Positives = 64/156 (41%), Gaps = 21/156 (13%)

Query: 10 ILVVFPFSLALAALSDLLTMTIPNRVSLVLLVSFFFVAPLAGLDLAQLGLHTVAAGLIFA 69
++ + L DL M +P++++L LL L G + AG +
Sbjct: 136 AALLLTWVLVALTFIDLDKMLLPDQLTLPLLWGGLLFNLLGGFVSLGDAVIGAMAGYLVL 195

Query: 70 VSF-----ALFAANVMGGGDAKLLTASAAWFGFNESLVTFLVYVSFFGGFLTLIVLLLRS 124
S L MG GD KLL A AW G+ ++L L+ S G F+ + ++LLR+
Sbjct: 196 WSLYWAFKLLTGKEGMGYGDFKLLAALGAWLGW-QALPIVLLLSSLVGAFMGIGLILLRN 254

Query: 125 KENLILAARIPVPRLLLTAKKIPYGIAIALGGFAAY 160
+K IP+G +A+ G+ A
Sbjct: 255 HH---------------QSKPIPFGPYLAIAGWIAL 275


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc04111  BCTERIALGSPD  113    1e-28    Bacterial general secretion pathway protein D signa...
		>BCTERIALGSPD#Bacterial general secretion pathway protein D signature.

Length = 660

Score = 113 bits (284), Expect = 1e-28
Identities = 71/331 (21%), Positives = 122/331 (36%), Gaps = 50/331 (15%)

Query: 164 AVDLARAFLKGGEATTRNITAQGNNGDAD----IFAEDRQNS--------------QIVN 205
A DL A D I A + N+ +++
Sbjct: 279 ASDLVEVLTGISSTMQSEKQAAKPVAALDKNIIIKAHGQTNALIVTAAPDVMNDLERVIA 338

Query: 206 LLTIEGDDQVTLKVTVAEVSRQVLKQLGFNGRVSDGESGISFRNPANLGDAIAVGTNALI 265
L I QV ++ +AEV LG + + + AIA
Sbjct: 339 QLDIR-RPQVLVEAIIAEVQDADGLNLGIQWANKNAGMTQFTNSGLPISTAIAGANQYNK 397

Query: 266 KGSIG---PTTISSY---------------INAMEQAGVMRTLAEPSLTAISGQEAKFYV 307
G++ + +SS+ + A+ + LA PS+ + EA F V
Sbjct: 398 DGTVSSSLASALSSFNGIAAGFYQGNWAMLLTALSSSTKNDILATPSIVTLDNMEATFNV 457

Query: 308 GGEFRLAGVQEVSTDDETGQTTVEREVNDVEYGIRLNFKPVVLGPGRISLQIETDVSEPT 367
G E + +T + TVER+ GI+L KP + + L+IE +VS
Sbjct: 458 GQEVPVLTGS-QTTSGDNIFNTVERK----TVGIKLKVKPQINEGDSVLLEIEQEVSSVA 512

Query: 368 YEGSVVTGNSMTAIPGNTFLGIRRREASTSVELPSGGSIVIAGLVQDNIRQAMSGLPGAS 427
+ + +S NT R + +V + SG ++V+ GL+ ++ +P
Sbjct: 513 D--AASSTSSDLGATFNT------RTVNNAVLVGSGETVVVGGLLDKSVSDTADKVPLLG 564

Query: 428 KIPILGTLFRSKDFQRSETELVIIATPYLVR 458
IP++G LFRS + S+ L++ P ++R
Sbjct: 565 DIPVIGALFRSTSKKVSKRNLMLFIRPTVIR 595


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc04109  HTHFIS        37     1e-04    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 37.1 bits (86), Expect = 1e-04
Identities = 19/137 (13%), Positives = 41/137 (29%), Gaps = 10/137 (7%)

Query: 44 EAVQRLMERCGQDRRMAKVSLRVTGGGIAAAANTFAGASTPNLIILETATEPGALLSELA 103
+ + + R G V AA + A +L++ + L
Sbjct: 17 TVLNQALSRAG---------YDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDLLP 67

Query: 104 PLAEVCDPSTRVIVIGRHNDIALYRELIRNGISEYMVAPVGMADMLSAVSAIFVDPEAEP 163
+ + P V+V+ N + G +Y+ P + +++ + +P+ P
Sbjct: 68 RIKKA-RPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKRRP 126

Query: 164 LGRSLAFIGAKGGCGSS 180
G S
Sbjct: 127 SKLEDDSQDGMPLVGRS 143


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc02820  RTXTOXINA     30     0.030    Gram-negative bacterial RTX toxin determinant A family...
		>RTXTOXINA#Gram-negative bacterial RTX toxin determinant A family signature.

Length = 1024

Score = 29.5 bits (66), Expect = 0.030
Identities = 10/61 (16%), Positives = 23/61 (37%)

Query: 370 TVREIISGSVDVIIQASRLRDGSRRITHITEVTGMEGDVIITQDLMRYEIDGEDANGRIV 429
+V E+I + S+ D + G +G+ + D + G + + ++
Sbjct: 718 SVEELIGTTRADKFFGSKFTDIFHGADGDDLIEGNDGNDRLYGDKGNDTLSGGNGDDQLY 777

Query: 430 G 430
G
Sbjct: 778 G 778


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc02824  SYCDCHAPRONE  31     0.004    Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 30.7 bits (69), Expect = 0.004
Identities = 21/108 (19%), Positives = 33/108 (30%)

Query: 98 AVMQQVAIYHPADREVLGAYGKAQAAAGQLEQALATISRAQTPDRPDWKLKSAEGAILDQ 157
+ + E L + Q +G+ E A D D + GA
Sbjct: 23 GTIAMLNEISSDTLEQLYSLAFNQYQSGKYEDAHKVFQALCVLDHYDSRFFLGLGACRQA 82

Query: 158 LGRSAEARLRYREALDLKPNEPSVLSNLGMSYLLTKDLRTAETYLKSA 205
+G+ A Y + EP + L +L AE+ L A
Sbjct: 83 MGQYDLAIHSYSYGAIMDIKEPRFPFHAAECLLQKGELAEAESGLFLA 130


25  SMc02862  SMc02870  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias     Product
SMc02862  2       13       1.915569    Pit accessory protein
SMc02863  1       12       2.312675    recombination protein F
SMc02864  1       13       1.703517    molybdopterin biosynthesis protein MoeB
SMc02865  1       14       1.157738    acetyltransferase
SMc02866  2       14       0.812890    transcriptional regulator
SMc02867  2       16       0.378614    multidrug-efflux system transmembrane protein
SMc02868  0       15       -0.353927   multidrug efflux system protein
SMc02869  0       13       -0.330107   ABC transporter ATP-binding protein
SMc02870  -1      11       0.176987    oxidoreductase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc02862  TYPE4SSCAGX   28     0.034    Type IV secretion system CagX conjugation protein si...
		>TYPE4SSCAGX#Type IV secretion system CagX conjugation protein signature.

Length = 522

Score = 27.8 bits (61), Expect = 0.034
Identities = 16/67 (23%), Positives = 33/67 (49%), Gaps = 7/67 (10%)

Query: 133 PLLSRIGANAHRLSAIAEEVTHVEDRSDQLHEQGLKDLFQRHGASNPMAYIIGSEIYGEL 192
P ++ G +R++ IAE+ ++D++ L + + NP+ + YGEL
Sbjct: 457 PNMTNSGLRWYRVNEIAEKFKLIKDKA-------LVTVINKGYGKNPLTKNYNIKNYGEL 509

Query: 193 EKVVDRF 199
E+V+ +
Sbjct: 510 ERVIKKL 516


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc02865  SACTRNSFRASE  42     2e-07    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 41.9 bits (98), Expect = 2e-07
Identities = 19/67 (28%), Positives = 27/67 (40%), Gaps = 1/67 (1%)

Query: 51 DEIAFAAVAGRELLGCI-FCKPEADCLYIGKLAVAPGRQGKGVGRMLIAAAEETARDLGL 109
+ AF +G I I +AVA + KGVG L+ A E A++
Sbjct: 64 GKAAFLYYLENNCIGRIKIRSNWNGYALIEDIAVAKDYRKKGVGTALLHKAIEWAKENHF 123

Query: 110 PALRLQT 116
L L+T
Sbjct: 124 CGLMLET 130


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc02866  HTHTETR       106    1e-30    TetR bacterial regulatory protein HTH signature.
		>HTHTETR#TetR bacterial regulatory protein HTH signature.

Length = 215

Score = 106 bits (265), Expect = 1e-30
Identities = 62/205 (30%), Positives = 101/205 (49%), Gaps = 1/205 (0%)

Query: 1 MRRTKADAEATRQKILCAAERMFYKKGVPNTTLEEVAKEAGVTRGAIYWHFANKTDLFLA 60
R+TK +A+ TRQ IL A R+F ++GV +T+L E+AK AGVTRGAIYWHF +K+DLF
Sbjct: 2 ARKTKQEAQETRQHILDVALRLFSQQGVSSTSLGEIAKAAGVTRGAIYWHFKDKSDLFSE 61

Query: 61 LYEAVPLPHEDMIAREIETEAFDTLAIVESATSDWLTTLAADEQRQRILAIMLR-CDYDN 119
++E ++ D L+++ L + +E+R+ ++ I+ C++
Sbjct: 62 IWELSESNIGELELEYQAKFPGDPLSVLREILIHVLESTVTEERRRLLMEIIFHKCEFVG 121

Query: 120 DMSAVLVRQREIEERHDALLELAFARALERGQLQEHWAPPTAARALRWMMMGLCTEWLLF 179
+M+ V QR + +E +E L AA +R + GL WL
Sbjct: 122 EMAVVQQAQRNLCLESYDRIEQTLKHCIEAKMLPADLMTRRAAIIMRGYISGLMENWLFA 181

Query: 180 GRRFDLVAQGSEALQSLFAGFRRVP 204
+ FDL + + + L + P
Sbjct: 182 PQSFDLKKEARDYVAILLEMYLLCP 206


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc02867  ACRIFLAVINRP  1147   0.0      Acriflavin resistance protein family signature.
		>ACRIFLAVINRP#Acriflavin resistance protein family signature.

Length = 1034

Score = 1147 bits (2969), Expect = 0.0
Identities = 553/1033 (53%), Positives = 746/1033 (72%), Gaps = 5/1033 (0%)

Query: 1 MPSFFIDRPIFAWVVAIFIMIAGIIAIPLLPVSQYPDVAPPQISINTNYPGASSQDTYQS 60
M +FFI RPIFAWV+AI +M+AG +AI LPV+QYP +APP +S++ NYPGA +Q +
Sbjct: 1 MANFFIRRPIFAWVLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDT 60

Query: 61 VTRLIEDELNGVEGLLYFESSTSSSGSVSIDATFQPGTDPSQASVDIQNRVQRVEPRLPD 120
VT++IE +NG++ L+Y S++ S+GSV+I TFQ GTDP A V +QN++Q P LP
Sbjct: 61 VTQVIEQNMNGIDNLMYMSSTSDSAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQ 120

Query: 121 PVRQQGVQVDEAGAGFLLIISLTSTDGSMDAIGLGDYLSRNVLSEIQRVPGVGRAQLFAT 180
V+QQG+ V+++ + +L++ S + + DY++ NV + R+ GVG QLF
Sbjct: 121 EVQQQGISVEKSSSSYLMVAGFVSDNPGTTQDDISDYVASNVKDTLSRLNGVGDVQLFGA 180

Query: 181 ERSMRVWLDPDKMLGLNLTAADVTAAIQAQNAQIASGSIGAQPNPITQQVTAPVVIKGQL 240
+ +MR+WLD D + LT DV ++ QN QIA+G +G P QQ+ A ++ + +
Sbjct: 181 QYAMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPGQQLNASIIAQTRF 240

Query: 241 SSPEEFGAIVLRANADGSAVRLRDVARLEIGGESYTFSTRLNGSPSAAIAVQLSPSGNAM 300
+PEEFG + LR N+DGS VRL+DVAR+E+GGE+Y R+NG P+A + ++L+ NA+
Sbjct: 241 KNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGIKLATGANAL 300

Query: 301 STSAAIKARMDELAEFFPEGLEYSIPYDTSPFVAVSIEKVLHTLVEAVGLVFLVMFLFLQ 360
T+ AIKA++ EL FFP+G++ PYDT+PFV +SI +V+ TL EA+ LVFLVM+LFLQ
Sbjct: 301 DTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAIMLVFLVMYLFLQ 360

Query: 361 NVRYTIIPTIVVPVALLGTCAVMLAMGFSINVLTMFGMVLAIGILVDDAIIVVENVERIM 420
N+R T+IPTI VPV LLGT A++ A G+SIN LTMFGMVLAIG+LVDDAI+VVENVER+M
Sbjct: 361 NMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDAIVVVENVERVM 420

Query: 421 SEEGLTPKDATRKAMKQITGAVIGITLVLASVFIPMAFFPGAVGVIYRQFSLTMVVSILF 480
E+ L PK+AT K+M QI GA++GI +VL++VFIPMAFF G+ G IYRQFS+T+V ++
Sbjct: 421 MEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQFSITIVSAMAL 480

Query: 481 SALLALSLTPALCASFLKQVPKGHHHAKRGFFGWFNRGFDRTSHGYTRAVGGIVRRTGRF 540
S L+AL LTPALCA+ LK V HH K GFFGWFN FD + + YT +VG I+ TGR+
Sbjct: 481 SVLVALILTPALCATLLKPVSAEHHENKGGFFGWFNTTFDHSVNHYTNSVGKILGSTGRY 540

Query: 541 MVIYLALLAGLGWAFLQLPSSFLPDEDQGFVIVMMQLPSEATANRTTEVIEQTETIFG-- 598
++IY ++AG+ FL+LPSSFLP+EDQG + M+QLP+ AT RT +V++Q +
Sbjct: 541 LLIYALIVAGMVVLFLRLPSSFLPEEDQGVFLTMIQLPAGATQERTQKVLDQVTDYYLKN 600

Query: 599 QEKAVDTIVAINGFSFFGSGQNAGLAFVTLKDWSERDAD-NSAQSIAGRATMAMSQIKDA 657
++ V+++ +NGFSF G QNAG+AFV+LK W ER+ D NSA+++ RA M + +I+D
Sbjct: 601 EKANVESVFTVNGFSFSGQAQNAGMAFVSLKPWEERNGDENSAEAVIHRAKMELGKIRDG 660

Query: 658 ISFALSPPAIQGLGTTGGFSFRLQDRAGLGQAALAEARDQLLDLASQSKV-LTGVRFEGM 716
+ PAI LGT GF F L D+AGLG AL +AR+QLL +A+Q L VR G+
Sbjct: 661 FVIPFNMPAIVELGTATGFDFELIDQAGLGHDALTQARNQLLGMAAQHPASLVSVRPNGL 720

Query: 717 PDAAQVSVNIDREKANTFGVTFADINSTISTNLGSSYVNDFPNAGRMQRVTVQADETKRM 776
D AQ + +D+EKA GV+ +DIN TIST LG +YVNDF + GR++++ VQAD RM
Sbjct: 721 EDTAQFKLEVDQEKAQALGVSLSDINQTISTALGGTYVNDFIDRGRVKKLYVQADAKFRM 780

Query: 777 QTADLLNLNVRNSNGGMVPLSAFADVEWVKAPTQTVGYNGYPAVRISGEAAPGYSSGDAI 836
D+ L VR++NG MVP SAF WV + YNG P++ I GEAAPG SSGDA+
Sbjct: 781 LPEDVDKLYVRSANGEMVPFSAFTTSHWVYGSPRLERYNGLPSMEIQGEAAPGTSSGDAM 840

Query: 837 AEMERLVAELPAGFGYEWTGQSLQEIQSGSQAPLLIALSCLLVFLCLAALYESWSIPVSV 896
A ME L ++LPAG GY+WTG S QE SG+QAP L+A+S ++VFLCLAALYESWSIPVSV
Sbjct: 841 ALMENLASKLPAGIGYDWTGMSYQERLSGNQAPALVAISFVVVFLCLAALYESWSIPVSV 900

Query: 897 IMVVPLGVIGAVLAVTMRDMPNDVYFKVGLIAIIGLSAKNAILIIEFAKELRE-QGKSLI 955
++VVPLG++G +LA T+ + NDVYF VGL+ IGLSAKNAILI+EFAK+L E +GK ++
Sbjct: 901 MLVVPLGIVGVLLAATLFNQKNDVYFMVGLLTTIGLSAKNAILIVEFAKDLMEKEGKGVV 960

Query: 956 DSTLEAAHLRFRPILMTSLAFTLGVLPLAIATGASSGSQRAIGTGVMGGMISATVLAIFF 1015
++TL A +R RPILMTSLAF LGVLPLAI+ GA SG+Q A+G GVMGGM+SAT+LAIFF
Sbjct: 961 EATLMAVRMRLRPILMTSLAFILGVLPLAISNGAGSGAQNAVGIGVMGGMVSATLLAIFF 1020

Query: 1016 VPVFFVFVMKIFE 1028
VPVFFV + + F+
Sbjct: 1021 VPVFFVVIRRCFK 1033



Score = 93.0 bits (231), Expect = 3e-21
Identities = 88/515 (17%), Positives = 188/515 (36%), Gaps = 49/515 (9%)

Query: 542 VIYLALLAGLGWAFLQLPSSFLPDEDQGFVIVMMQLP---SEATANRTTEVIEQTETIFG 598
V+ + L+ A LQLP + P V V P ++ + T+VIEQ
Sbjct: 14 VLAIILMMAGALAILQLPVAQYPTIAPPAVSVSANYPGADAQTVQDTVTQVIEQNMN--- 70

Query: 599 QEKAVDTIVAINGFSFFGSGQNAGLAFVTLKDWSERDADNSAQSIAGRATMAMSQIKDAI 658
+D ++ ++ S +AG +TL S D D + + + +A + +
Sbjct: 71 ---GIDNLMYMSSTSD-----SAGSVTITLTFQSGTDPDIAQVQVQNKLQLATPLLPQEV 122

Query: 659 -------SFALSPPAIQGLGTTGGFSFRLQDRAGLGQAALAEARDQLLDLASQSKV-LTG 710
+ S + + D + + +D L L V L G
Sbjct: 123 QQQGISVEKSSSSYLMVAGFVSDNPGTTQDD---ISDYVASNVKDTLSRLNGVGDVQLFG 179

Query: 711 VRFEGMPDAAQVSVNIDREKANTFGVTFADINSTISTN---LGSSYVNDFPNAGRMQRVT 767
++ + + +D + N + +T D+ + + + + + P Q++
Sbjct: 180 AQY-------AMRIWLDADLLNKYKLTPVDVINQLKVQNDQIAAGQLGGTPALPG-QQLN 231

Query: 768 VQADETKRMQTA-DLLNLNVR-NSNGGMVPLSAFADVEW-VKAPTQTVGYNGYPAVRISG 824
R + + + +R NS+G +V L A VE + NG PA +
Sbjct: 232 ASIIAQTRFKNPEEFGKVTLRVNSDGSVVRLKDVARVELGGENYNVIARINGKPAAGLGI 291

Query: 825 EAAPGYSSGDAI----AEMERLVAELPAGFGYEWTGQSLQEIQSGSQA---PLLIALSCL 877
+ A G ++ D A++ L P G + + +Q L A+ +
Sbjct: 292 KLATGANALDTAKAIKAKLAELQPFFPQGMKVLYPYDTTPFVQLSIHEVVKTLFEAI--M 349

Query: 878 LVFLCLAALYESWSIPVSVIMVVPLGVIGAVLAVTMRDMPNDVYFKVGLIAIIGLSAKNA 937
LVFL + ++ + + VP+ ++G + + G++ IGL +A
Sbjct: 350 LVFLVMYLFLQNMRATLIPTIAVPVVLLGTFAILAAFGYSINTLTMFGMVLAIGLLVDDA 409

Query: 938 ILIIE-FAKELREQGKSLIDSTLEAAHLRFRPILMTSLAFTLGVLPLAIATGASSGSQRA 996
I+++E + + E ++T ++ ++ ++ + +P+A G++ R
Sbjct: 410 IVVVENVERVMMEDKLPPKEATEKSMSQIQGALVGIAMVLSAVFIPMAFFGGSTGAIYRQ 469

Query: 997 IGTGVMGGMISATVLAIFFVPVFFVFVMKIFERGR 1031
++ M + ++A+ P ++K
Sbjct: 470 FSITIVSAMALSVLVALILTPALCATLLKPVSAEH 504


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc02868  RTXTOXIND     41     5e-06    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 41.4 bits (97), Expect = 5e-06
Identities = 21/101 (20%), Positives = 47/101 (46%), Gaps = 5/101 (4%)

Query: 105 QVKVDSAEATLKRAQAVVDQAKRTADRQSRLKEAQVTAVQQYD-DAIAALAQAEADVGIA 163
+ K A L+ ++ ++Q + ++ + VT Q + + + L Q ++G+
Sbjct: 258 ENKYVEAVNELRVYKSQLEQIESEI-LSAKEEYQLVT--QLFKNEILDKLRQTTDNIGLL 314

Query: 164 EAGLAEAKLNLQYTNVTAPISGRI-GRALITEGALVNTNDP 203
LA+ + Q + + AP+S ++ + TEG +V T +
Sbjct: 315 TLELAKNEERQQASVIRAPVSVKVQQLKVHTEGGVVTTAET 355



Score = 36.7 bits (85), Expect = 2e-04
Identities = 21/124 (16%), Positives = 51/124 (41%), Gaps = 8/124 (6%)

Query: 60 PGRIT-ATRIAEVRPRISGIVVERVFEQGTMVKEGDVLYRIDPAPFQVKVDSAEATLK-- 116
G++T + R E++P + IV E + ++G V++GDVL ++ + +++L
Sbjct: 87 NGKLTHSGRSKEIKPIENSIVKEIIVKEGESVRKGDVLLKLTALGAEADTLKTQSSLLQA 146

Query: 117 -----RAQAVVDQAKRTADRQSRLKEAQVTAVQQYDDAIAALAQAEADVGIAEAGLAEAK 171
R Q + + + +L + ++ + + + + + +
Sbjct: 147 RLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKE 206

Query: 172 LNLQ 175
LNL
Sbjct: 207 LNLD 210


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc02869  PF05272       30     0.011    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 30.4 bits (68), Expect = 0.011
Identities = 18/74 (24%), Positives = 30/74 (40%), Gaps = 4/74 (5%)

Query: 5 ASRSSFNPRGRHVGSLQLKTIRKAFGSHEVLKGIDLDVKDGEFVIFVGPSGCGKSTLLRT 64
+ + PR L+ + K V + ++ K V+ G G GKSTL+ T
Sbjct: 560 KTPDDYKPRR----LRYLQLVGKYILMGHVARVMEPGCKFDYSVVLEGTGGIGKSTLINT 615

Query: 65 IAGLEDATSGSVQI 78
+ GL+ + I
Sbjct: 616 LVGLDFFSDTHFDI 629


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc02870  HTHFIS        32     0.004    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 32.1 bits (73), Expect = 0.004
Identities = 15/90 (16%), Positives = 32/90 (35%), Gaps = 7/90 (7%)

Query: 60 EALAELEPELCSINTYSDTHADYAVMAMEAGAHVFVEKP--LATTVADAERVVACARANG 117
+ + P+L + + A+ A E GA+ ++ KP L + R +A +
Sbjct: 67 PRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKRRP 126

Query: 118 RKLVV-----GYILRHHPSWMRLIAEARKL 142
KL ++ + + +L
Sbjct: 127 SKLEDDSQDGMPLVGRSAAMQEIYRVLARL 156


26  SMc02245  SMc02252  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SMc02245  -1      8        0.764297   dihydroorotate dehydrogenase 2
SMc02246  -1      10       0.670736   hypothetical protein
SMc02247  -1      10       0.741183   signal peptide protein
SMc02248  -1      9        0.836884   transcriptional regulator
SMc02249  -1      9        0.707410   sensory transduction histidine kinase
SMc02250  -2      9        0.503267   large-conductance mechanosensitive channel
SMc02251  -1      9        1.056699   aspartate aminotransferase
SMc02252  -1      10       0.949303   UDP-glucose 4-epimerase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc02245  HTHFIS        32     0.003    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 32.5 bits (74), Expect = 0.003
Identities = 10/36 (27%), Positives = 18/36 (50%), Gaps = 1/36 (2%)

Query: 275 AVLARMRKRVGPHLPIIGVGGVCSAETAAEKIRAGA 310
+L R++K P LP++ + + TA + GA
Sbjct: 64 DLLPRIKKA-RPDLPVLVMSAQNTFMTAIKASEKGA 98


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc02247  INTIMIN       28     0.012    Intimin signature.
		>INTIMIN#Intimin signature.

Length = 939

Score = 28.5 bits (63), Expect = 0.012
Identities = 14/43 (32%), Positives = 27/43 (62%), Gaps = 1/43 (2%)

Query: 28 GRYSMQKSETGLARLDTETGEVTLCQEKGGELICRMAADERAA 70
G+Y+ + + +A +D +G+VTL +EKG I +++D + A
Sbjct: 791 GKYTWRSANPAIASVDASSGQVTL-KEKGTTTISVISSDNQTA 832


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc02248  HTHFIS        69     9e-16    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 68.7 bits (168), Expect = 9e-16
Identities = 33/116 (28%), Positives = 53/116 (45%), Gaps = 3/116 (2%)

Query: 1 MPEMTIIIADDHPLFRGAMRQALSGMAGAPAIVEAGDFAAARRAAADNPDADLMLLDLTM 60
M TI++ADD R + QALS + + A R A DL++ D+ M
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAG--YDVRITSNAATLWRWIAAGD-GDLVVTDVVM 57

Query: 61 PGVSGLSGLIALRAEFPSLPVMIVSAHDDPATIHRALDLGAAGFLSKSAGIEEIRE 116
P + L ++ P LPV+++SA + T +A + GA +L K + E+
Sbjct: 58 PDENAFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIG 113


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc02249  PF06580       34     0.002    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 34.5 bits (79), Expect = 0.002
Identities = 26/101 (25%), Positives = 38/101 (37%), Gaps = 25/101 (24%)

Query: 917 LVQNLVSNAIKYTLR-----GKVLVGVRRHGQTATIEVLDSGIGIPSSKFRTIFKEFARL 971
LVQ LV N IK+ + GK+L+ + T T+EV ++G +
Sbjct: 259 LVQTLVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKN------------ 306

Query: 972 EEGARTASGLGLGLSIVDRISRVL---NHPVGLQSKPGKGT 1009
T G GL V ++L + L K GK
Sbjct: 307 -----TKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVN 342


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc02250  MECHCHANNEL   113    3e-35    Bacterial mechano-sensitive ion channel signature.
		>MECHCHANNEL#Bacterial mechano-sensitive ion channel signature.

Length = 136

Score = 113 bits (283), Expect = 3e-35
Identities = 60/141 (42%), Positives = 86/141 (60%), Gaps = 11/141 (7%)

Query: 1 MLNEFKEFIARGNVMDLAVGVIIGAAFSKIVDSVVNDLVMPVVGAITGGGFDFSNYFLPL 60
++ EF+EF RGNV+DLAVGVIIGAAF KIV S+V D++MP +G + GG DF + + L
Sbjct: 3 IIKEFREFAMRGNVVDLAVGVIIGAAFGKIVSSLVADIIMPPLGLLI-GGIDFKQFAVTL 61

Query: 61 SASVTAPTLSAAREQGAVFAYGNFITVLINFLILAWIIFLLIKLVNRARASVERDKAPDP 120
+ V YG FI + +FLI+A+ IF+ IKL+N+ E A
Sbjct: 62 R-------DAQGDIPAVVMHYGVFIQNVFDFLIVAFAIFMAIKLINKLNRKKEEPAAA-- 112

Query: 121 AAPPPQDILLLSEIRDLLRQR 141
P ++ +LL+EIRDLL+++
Sbjct: 113 -PAPTKEEVLLTEIRDLLKEQ 132


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc02252  NUCEPIMERASE  161    4e-49    Nucleotide sugar epimerase signature.
		>NUCEPIMERASE#Nucleotide sugar epimerase signature.

Length = 334

Score = 161 bits (410), Expect = 4e-49
Identities = 78/345 (22%), Positives = 134/345 (38%), Gaps = 43/345 (12%)

Query: 1 MAVLVTGGAGYIGSHMVWSLLDGGEAVVVLDCLSTGF-------RWAVAPEARFYF--GD 51
M LVTG AG+IG H+ LL+ G VV +D L+ + R + + F F D
Sbjct: 1 MKYLVTGAAGFIGFHVSKRLLEAGHQVVGIDNLNDYYDVSLKQARLELLAQPGFQFHKID 60

Query: 52 VGDRALLQRVFAENEIDSVVHFAGSAVVPESVANPLAYYENNTANTRTLIAATVEAGIRH 111
+ DR + +FA + V V S+ NP AY ++N ++ I+H
Sbjct: 61 LADREGMTDLFASGHFERVFISPHRLAVRYSLENPHAYADSNLTGFLNILEGCRHNKIQH 120

Query: 112 FVFSSTAAVYGTQDTPDPVSET-AALRPQSPYGRSKLMSEMMLQDAAAAHDFRFVALRYF 170
+++S+++VYG + P S + P S Y +K +E+M + + LR+F
Sbjct: 121 LLYASSSSVYGL-NRKMPFSTDDSVDHPVSLYAATKKANELMAHTYSHLYGLPATGLRFF 179

Query: 171 NVAGADPLGRAGQSTLGATHLIKVACEAALGRRRKIDVLGTDYPTADGTGVRDYIHVSDL 230
V G P GR + T + G+ IDV G RD+ ++ D+
Sbjct: 180 TVYG--PWGRPDMALFKFTKAML------EGKS--IDVYN------YGKMKRDFTYIDDI 223

Query: 231 VAAHRSALAYLRAGGEPL---------------VANCGYGHGFSVLQVLDTVRQVSGRDF 275
A + V N G ++ + + G +
Sbjct: 224 AEAIIRLQDVIPHADTQWTVETGTPAASIAPYRVYNIGNSSPVELMDYIQALEDALGIEA 283

Query: 276 MVDYAPRRPGDPAQIVADPSVARLKLDWVPTHASLEHIVRSAFDW 320
+ P +PGD + AD + + P +++ V++ +W
Sbjct: 284 KKNMLPLQPGDVLETSADTKALYEVIGFTPE-TTVKDGVKNFVNW 327


27  SMc03006  SMc03040  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SMc03006  2       10       -0.428527  chemotaxis regulator protein
SMc03007  1       10       0.009146   chemotaxis protein (sensory transduction
SMc03008  2       11       0.241497   chemotaxis protein
SMc03009  1       11       0.492663   chemotaxis protein methyltransferase
SMc03010  0       11       0.542997   chemotaxis-specific methylesterase
SMc03011  0       10       -0.041945  chemotaxis regulator protein
SMc03012  0       9        -0.111831  chemoreceptor glutamine deamidase CheD
SMc03013  0       9        -1.427032  hypothetical protein
SMc03014  0       10       -1.452526  flagellar MS-ring protein
SMc03015  0       12       -2.183716  transcriptional regulator
SMc03016  1       11       -1.421695  transcriptional regulator
SMc03017  1       11       -1.752847  hypothetical protein
SMc03018  0       10       -1.577044  flagellar biosynthesis protein FlhB
SMc03019  -1      10       -0.857239  flagellar motor switch protein G
SMc03020  -2      11       0.348992   flagellar motor switch protein
SMc03021  -1      12       0.381271   flagellar motor switch transmembrane protein
SMc03022  -1      13       -0.208863  flagellar motor protein MotA
SMc03023  1       14       0.036205   hypothetical protein
SMc03024  1       11       0.665785   flagellar basal body rod protein FlgF
SMc03025  1       13       0.680820   flagellum-specific ATP synthase
SMc03026  2       13       -0.572562  hypothetical protein
SMc03027  1       15       0.305483   flagellar basal body rod protein FlgB
SMc03028  0       15       0.725378   flagellar basal body rod protein FlgC
SMc03029  0       16       1.229044   flagellar hook-basal body protein FliE
SMc03030  0       15       0.820928   flagellar basal body rod protein FlgG
SMc03031  1       15       0.767251   flagellar basal body P-ring biosynthesis protein
SMc03032  2       14       -0.103613  flagellar basal body P-ring protein
SMc03033  3       16       -1.231930  hypothetical protein
SMc03034  4       15       -1.895690  flagellar basal body L-ring protein
SMc03035  4       14       -2.173899  flagellar transmembrane protein
SMc03036  4       15       -1.813721  flagellar biosynthesis protein FliP
SMc03037  2       14       -1.228311  flagellin A
SMc03038  -1      8        -0.515618  flagellin B protein
SMc03039  -2      8        0.997875   flagellin D protein
SMc03040  -2      9        1.638295   flagellin protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc03006  HTHFIS        91     6e-25    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 91.4 bits (227), Expect = 6e-25
Identities = 32/115 (27%), Positives = 58/115 (50%), Gaps = 2/115 (1%)

Query: 4 RVLTVDDSRTIRNMLLVTLNNAGFETIQAEDGVEGLEKLDTANPDVIVTDINMPRLDGFG 63
+L DD IR +L L+ AG++ + + + D++VTD+ MP + F
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 64 FIEGVRKNDRYRAVPILVLTTESDAEKKNRARQAGATGWIVKPFDPTKLIDAIER 118
+ ++K +P+LV++ ++ +A + GA ++ KPFD T+LI I R
Sbjct: 65 LLPRIKK--ARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGR 117


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc03007  PF06580       39     7e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 38.7 bits (90), Expect = 7e-05
Identities = 18/126 (14%), Positives = 36/126 (28%), Gaps = 50/126 (39%)

Query: 475 IRNAVDHGLETPEKRVAAGKNPEGTVRLTAKHRSGRIVIELADDGAGINREKVRQKAIDN 534
+ N + HG+ G + L +G + +E+ + G+ +
Sbjct: 264 VENGIKHGIA--------QLPQGGKILLKGTKDNGTVTLEVENTGSLALKN--------- 306

Query: 535 DLIAADANLSDEEVDNLIFHAGFSTADKISDISGRGVGMDVVKRSIQALGG---RINISS 591
G G+ V+ +Q L G +I +S
Sbjct: 307 ------------------------------TKESTGTGLQNVRERLQMLYGTEAQIKLSE 336

Query: 592 KPGQGS 597
K G+ +
Sbjct: 337 KQGKVN 342


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc03010  HTHFIS        69     4e-15    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 69.5 bits (170), Expect = 4e-15
Identities = 42/162 (25%), Positives = 70/162 (43%), Gaps = 13/162 (8%)

Query: 1 MSAPARVLVVDDSATMRGLISAVLN-ADPDITVVGQAADALEARQAIKQLDPDVVTLDIE 59
M+ A +LV DD A +R +++ L+ A D+ + AA A D D+V D+
Sbjct: 1 MTG-ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAG---DGDLVVTDVV 56

Query: 60 MPNMNGLEFLDKIMRLRP-MPVIMVSTLTHRGAEATIAALEIGAFDCVGKPQPGDTHPFR 118
MP+ N + L +I + RP +PV+++S I A E GA+D + KP
Sbjct: 57 MPDENAFDLLPRIKKARPDLPVLVMS--AQNTFMTAIKASEKGAYDYLPKPFDLT----- 109

Query: 119 DLADKVKAAARSQRKSMITSNRAAAPAATAVSDSRAGRKIVA 160
+L + A ++ + V S A ++I
Sbjct: 110 ELIGIIGRALAEPKRRPSKLEDDSQDGMPLVGRSAAMQEIYR 151


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc03011  HTHFIS        84     4e-22    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 83.7 bits (207), Expect = 4e-22
Identities = 29/124 (23%), Positives = 52/124 (41%), Gaps = 3/124 (2%)

Query: 6 KIKVLIVDDQVTSRLLLGDALQQLGFKQITAAGDGEQGMKIMAQNPHHLVISDFNMPKMD 65
+L+ DD R +L AL + G+ + + + +A LV++D MP +
Sbjct: 3 GATILVADDDAAIRTVLNQALSRAGY-DVRITSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 66 GLGLLQAVRANPATKKAAFIILTAQGDRALVQKAAALGANNVLAKPFTIEKMKAAIEAVF 125
LL ++ A ++++AQ KA+ GA + L KPF + ++ I
Sbjct: 62 AFDLLPRIKK--ARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRAL 119

Query: 126 GALK 129
K
Sbjct: 120 AEPK 123


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc03014  FLGMRINGFLIF  448    e-154    Flagellar M-ring protein signature.
		>FLGMRINGFLIF#Flagellar M-ring protein signature.

Length = 559

Score = 448 bits (1154), Expect = e-154
Identities = 138/568 (24%), Positives = 242/568 (42%), Gaps = 58/568 (10%)

Query: 13 NLSNLGQGKLIALAVAGVVAIGFVLGAGIYVNRPSFETLYVGLERSDVTQISIALAEANV 72
L+ L I L VAG A+ V+ ++ P + TL+ L D I L + N+
Sbjct: 15 WLNRLRANPRIPLIVAGSAAVAIVVAMVLWAKTPDYRTLFSNLSDQDGGAIVAQLTQMNI 74

Query: 73 DFEVGTDGGSIQVPVGMTGKARLLLAERGLPSSANAGYELFDNVGSLGLTSFMQEVTRVR 132
+ G+I+VP + RL LA++GLP G+EL D G++ F ++V R
Sbjct: 75 PYRFANGSGAIEVPADKVHELRLRLAQQGLPKGGAVGFELLDQ-EKFGISQFSEQVNYQR 133

Query: 133 ALEGEIARTIQQISGIAAARVHIVMPERGSFRKAEQTPTASVMIR--ASATVGRSAASSI 190
ALEGE+ARTI+ + + +ARVH+ MP+ F + +++P+ASV + + S++
Sbjct: 134 ALEGELARTIETLGPVKSARVHLAMPKPSLFVREQKSPSASVTVTLEPGRALDEGQISAV 193

Query: 191 RHLVASSVPGLDVDDVTVLDSTGQLLASGDDPSNSALNQSLGVVQNVQSDLEKKIDNALA 250
HLV+S+V GL +VT++D +G LL + + L +V+S ++++I+ L+
Sbjct: 194 VHLVSSAVAGLPPGNVTLVDQSGHLLTQSNTSGRDLNDAQLKFANDVESRIQRRIEAILS 253

Query: 251 PFLGMDNFRTSVTARLNTDAQQIQETVFDPESRVERSTRVIKEEQK-------------- 296
P +G N VTA+L+ ++ E + P ++T ++
Sbjct: 254 PIVGNGNVHAQVTAQLDFANKEQTEEHYSPNGDASKATLRSRQLNISEQVGAGYPGGVPG 313

Query: 297 --SSQQQPDNAATV---------QQNVPQAAPRGGAGQQSSDEAEKKEEQTNYEINSKTI 345
S+Q P N A + QN PQ + + + ++ E +NYE++
Sbjct: 314 ALSNQPAPPNEAPIATPPTNQQNAQNTPQTSTSTN-SNSAGPRSTQRNETSNYEVDRTIR 372

Query: 346 ATVKNSYSIERLSIAVVVNRGRLAAMAGEPADQAKIDAYLQEMQKIVSSAAGIDPGRGDV 405
T N IERLS+AVVVN LA P +++++ + A G RGD
Sbjct: 373 HTKMNVGDIERLSVAVVVNYKTLADGKPLPLT----ADQMKQIEDLTREAMGFSDKRGDT 428

Query: 406 VTLNAMDFVETQLLDQAVPGPGI-MEMLTRNLGGIINALAFVAVAFLVVWFGMRPLARQL 464
+ + F + + + P + L L + VA+++ +RP +
Sbjct: 429 LNVVNSPF--SAVDNTGGELPFWQQQSFIDQLLAAGRWLLVLVVAWILWRKAVRPQLTRR 486

Query: 465 GFGGQAGKLEGEAAGLELPDFSPAGAGAGGALMEGFGSDFGFDGGDDLLNLGDEAGFNRR 524
+A + + E + D+ L N+R
Sbjct: 487 VEEAKAAQEQ------------------AQVRQETEEAVEVRLSKDEQLQQRRA---NQR 525

Query: 525 VK-EGPERRLARMVEISEERAAKILRKW 551
+ E +R+ M + A ++R+W
Sbjct: 526 LGAEVMSQRIREMSDNDPRVVALVIRQW 553


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc03018  TYPE3IMSPROT  327    e-113    Type III secretion system inner membrane S protein ...
		>TYPE3IMSPROT#Type III secretion system inner membrane S protein family signature.

Length = 354

Score = 327 bits (841), Expect = e-113
Identities = 94/346 (27%), Positives = 181/346 (52%), Gaps = 9/346 (2%)

Query: 8 DSKTEAPSEKKISDATEKGNVPFSREVTAFASTLAIYIFVVFFLSDGAANMAEALKDIFE 67
KTE P+ KKI DA +KG V S+EV + A +A+ ++ + ++ + E
Sbjct: 3 GEKTEQPTPKKIRDARKKGQVAKSKEVVSTALIVALSAMLMGLSDYYFEHFSKLMLIPAE 62

Query: 68 Q---PEAWRLDTATDAVALISHVVLKCAALVLPVFILLILFGVGSSIFQNLPRPVLDRIQ 124
Q P + L ++ +V+L+ L P+ + L + S + Q + I+
Sbjct: 63 QSYLPFSQAL------SYVVDNVLLEFFYLCFPLLTVAALMAIASHVVQYGFLISGEAIK 116

Query: 125 PKWNRVSPAAGFKRIYGVQGLVEFGKSLFKIIVVSIVVVLVLWNDYFATLDMMFSDPVTI 184
P +++P G KRI+ ++ LVEF KS+ K++++SI++ +++ + L + I
Sbjct: 117 PDIKKINPIEGAKRIFSIKSLVEFLKSILKVVLLSILIWIIIKGNLVTLLQLPTCGIECI 176

Query: 185 FTTMISDLKQIIIVVLFATATLAIVDLFWTRHHWYTELRMTRQEVKDELKQSQGDPIVKS 244
+ L+Q++++ ++I D + + + EL+M++ E+K E K+ +G P +KS
Sbjct: 177 TPLLGQILRQLMVICTVGFVVISIADYAFEYYQYIKELKMSKDEIKREYKEMEGSPEIKS 236

Query: 245 RLRSMQRDRARKRMISSVPRATLIIANPTHYAVALRYVREESDAPVVVAMGKDLVALKIR 304
+ R ++ + M +V R+++++ANPTH A+ + Y R E+ P+V D +R
Sbjct: 237 KRRQFHQEIQSRNMRENVKRSSVVVANPTHIAIGILYKRGETPLPLVTFKYTDAQVQTVR 296

Query: 305 EIAEKNGIPVFEDPPLARSMFAQVSVDSVIPPVFYKAVAELIHRVY 350
+IAE+ G+P+ + PLAR+++ VD IP +A AE++ +
Sbjct: 297 KIAEEEGVPILQRIPLARALYWDALVDHYIPAEQIEATAEVLRWLE 342


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc03019  FLGMOTORFLIG  282    1e-95    Flagellar motor switch protein FliG signature.
		>FLGMOTORFLIG#Flagellar motor switch protein FliG signature.

Length = 344

Score = 282 bits (722), Expect = 1e-95
Identities = 60/329 (18%), Positives = 152/329 (46%), Gaps = 6/329 (1%)

Query: 18 LSQTEKAAAVLLAMGKSIAGKLLKFFTQSELQAIIAAAQSLRAVPPHELEALVNEFEDLF 77
L+ +KAA +L+++G I+ K+ K+ +Q E++++ L + + ++ EF++L
Sbjct: 15 LTGKQKAAILLVSIGSEISSKVFKYLSQEEIESLTFEIAKLETITSELKDNVLLEFKELM 74

Query: 78 TEGAGLMDNAK-AMESILEEGLTPDEVDGLLGRRATFQSYEASIWDRLMDCDPVIIAQLL 136
+ +LE+ L + ++ + ++ ++ + DP I +
Sbjct: 75 MAQEFIQKGGIDYARELLEKSLGTQKAVDIINNLG--SALQSRPFEFVRRADPANILNFI 132

Query: 137 AREHPQTIAYVLSMMPSSFGAKVLLQLSDKQRPEILNRAVNIKNVNPKAAAIIEARVIEI 196
+EHPQTIA +LS + + +L L + + + R + +P+ +E + +
Sbjct: 133 QQEHPQTIALILSYLDPQKASFILSSLPTEVQTNVARRIALMDRTSPEVVREVERVLEKK 192

Query: 197 IEEMEAD--RNSPGPAKIAEVMNELEKPQVDTLLASLETISTDSVKKVRPKIFLFDDILF 254
+ + ++ ++ G + E++N ++ ++ SLE + ++++ K+F+F+DI+
Sbjct: 193 LASLSSEDYTSAGGVDNVVEIINMADRKTEKFIIESLEEEDPELAEEIKKKMFVFEDIVL 252

Query: 255 MPQRSRVQLFNDVSTDVITMALRGSAAELRESILASIGARQRRMIESDLAAGDAGINPRD 314
+ RS ++ ++ + AL+ ++E I ++ R M++ D+ +D
Sbjct: 253 LDDRSIQRVLREIDGQELAKALKSVDIPVQEKIFKNMSKRAASMLKEDMEFLGP-TRRKD 311

Query: 315 IAIARRSITQEAIRLSASGQLELKEKEPE 343
+ +++ I +L G++ + E
Sbjct: 312 VEESQQKIVSLIRKLEEQGEIVISRGGEE 340


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc03020  FLGMOTORFLIN  105    3e-31    Flagellar motor switch protein FliN signature.
		>FLGMOTORFLIN#Flagellar motor switch protein FliN signature.

Length = 137

Score = 105 bits (263), Expect = 3e-31
Identities = 33/81 (40%), Positives = 57/81 (70%), Gaps = 3/81 (3%)

Query: 125 SGMTANMDLIMDIPIDVQIVLGTSRMQVSGLMALTEGATIALDRKIGEPVEIMVNGRVIG 184
SG ++DLIMDIP+ + + LG +RM + L+ LT+G+ +ALD GEP++I++NG +I
Sbjct: 48 SGAMQDIDLIMDIPVKLTVELGRTRMTIKELLRLTQGSVVALDGLAGEPLDILINGYLIA 107

Query: 185 RGEITVLEGDVTRFGVKLLEI 205
+GE+ V+ ++GV++ +I
Sbjct: 108 QGEVVVVAD---KYGVRITDI 125


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc03021  FLGMOTORFLIM  50     3e-09    Flagellar motor switch protein FliM signature.
		>FLGMOTORFLIM#Flagellar motor switch protein FliM signature.

Length = 344

Score = 50.3 bits (120), Expect = 3e-09
Identities = 36/223 (16%), Positives = 89/223 (39%), Gaps = 12/223 (5%)

Query: 99 DSPVLIALVEALLGAEPTSIEEPAPRSLSKIEIDVALPVFHGIAEVLRTAVNAPGGFEPV 158
D + ++++ L G + + R L+ IE V V I +R + P
Sbjct: 120 DPSITFSIIDRLFGGTGQAAKVQ--RDLTDIENSVMEGVIVRILANVRESWTQVIDLRPR 177

Query: 159 VGRPYNSAERAKPDPVLQDVFAASIDMTIGLGPVLSTFSVIVPQSTL--LKTRVVSRK-G 215
+G+ + + A+ P + + +G + +P T+ + +++ S+
Sbjct: 178 LGQIETNPQFAQIVP--PSEMVVLVTLETKVGEEEGMMNFCIPYITIEPIISKLSSQFWF 235

Query: 216 AGEDRNAKTEWTEQLEEQVRRSAVALEARIRLESLTLDTLSRLQAGDVIPFHD---GQDV 272
+ R++ T++ L +++ + + A + L++ + L+ GD+I HD G
Sbjct: 236 SSVRRSSTTQYMGVLRDKLSTVDMDVVAEVGSLRLSVRDILGLRVGDIIRLHDTHVGDPF 295

Query: 273 RVEVSANGRDLYVCEFGRSGSRYTVRVKDTHGSEQDILRHIMS 315
+ + + ++C+ G G + ++ + S +S
Sbjct: 296 VLSIGNRKK--FLCQPGVVGKKIAAQILERIESTSQEDFEELS 336


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc03024  FLGHOOKAP1    30     0.009    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 29.9 bits (67), Expect = 0.009
Identities = 10/37 (27%), Positives = 19/37 (51%)

Query: 3 TGLYVALSSQMALEKRLNTLADNIANSNTVGFRATEV 39
+ + A+S A + LNT ++NI++ N G+
Sbjct: 2 SLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTT 38


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc03028  FLGHOOKAP1    30     0.002    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 30.3 bits (68), Expect = 0.002
Identities = 8/40 (20%), Positives = 20/40 (50%)

Query: 95 LPNVNILIEMADMREANRAYEANLQTIKQSRDLISQTIDL 134
+ VN+ E +++ + Y AN Q ++ + + I++
Sbjct: 506 ISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINI 545



Score = 27.2 bits (60), Expect = 0.028
Identities = 14/60 (23%), Positives = 21/60 (35%), Gaps = 7/60 (11%)

Query: 5 TSALKVSASGLQAESTRLRIVSENIANARSTGDAPGADPFRRKTISFAAEVDRASGASLV 64
+S + + SGL A L S NI++ G + R+T A V
Sbjct: 1 SSLINNAMSGLNAAQAALNTASNNISSYNVAG-------YTRQTTIMAQANSTLGAGGWV 53


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc03029  FLGHOOKFLIE   34     2e-05    Flagellar hook-basal body complex protein FliE signa...
		>FLGHOOKFLIE#Flagellar hook-basal body complex protein FliE signature.

Length = 103

Score = 34.3 bits (78), Expect = 2e-05
Identities = 20/89 (22%), Positives = 37/89 (41%), Gaps = 2/89 (2%)

Query: 25 SASLVMPGAGTAAPQAGSFAEVLGNMTTDAIRSMKSAEGTSLQAIRGEA--NTREVVDAV 82
+ ++ + SFA L + +A + + GE +V+ +
Sbjct: 15 ATAMSARAQESLPQPTISFAGQLHAALDRISDTQTAARTQAEKFTLGEPGVALNDVMTDM 74

Query: 83 MSAEQSLQTAIAIRDKVVTAYLEIARMQI 111
A S+Q I +R+K+V AY E+ MQ+
Sbjct: 75 QKASVSMQMGIQVRNKLVAAYQEVMSMQV 103


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc03030  FLGHOOKAP1    41     2e-06    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 41.5 bits (97), Expect = 2e-06
Identities = 10/45 (22%), Positives = 22/45 (48%)

Query: 213 QVKQNYLESSNVDPVKEITDLISAQRAYEMNSKIIQAADEMAATV 257
Q+ S V+ +E +L Q+ Y N++++Q A+ + +
Sbjct: 498 QLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDAL 542



Score = 38.8 bits (90), Expect = 1e-05
Identities = 13/34 (38%), Positives = 20/34 (58%)

Query: 4 LSIAATGMNAQQLNLEVIANNIANINTTGYKRAR 37
++ A +G+NA Q L +NNI++ N GY R
Sbjct: 4 INNAMSGLNAAQAALNTASNNISSYNVAGYTRQT 37


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc03031  PYOCINKILLER  27     0.029    Pyocin S killer protein signature.
		>PYOCINKILLER#Pyocin S killer protein signature.

Length = 617

Score = 27.5 bits (60), Expect = 0.029
Identities = 16/58 (27%), Positives = 24/58 (41%)

Query: 104 REQYAVERGSTVRLVFNNGGLTITAAGSPLQDAAVGDLIRVRNVDTGVIVSGTVMADS 161
Q A R + + NG + TAAG L A G + + + V G V+A +
Sbjct: 238 ARQQAAIRAANTYAMPANGSVVATAAGRGLIQVAQGAASLAQAISDAIAVLGRVLASA 295


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc03032  FLGPRINGFLGI  482    e-173    Flagellar P-ring protein signature.
		>FLGPRINGFLGI#Flagellar P-ring protein signature.

Length = 373

Score = 482 bits (1242), Expect = e-173
Identities = 266/363 (73%), Positives = 312/363 (85%)

Query: 9 LLTLAVVFAATLTSAYAASRIKDVASLQSGRDNQLIGYGLVVGLQGTGDSLRSSPFTDQS 68
L+ A+ F +T + SRIKD+ASLQ+GRDNQLIGYGLVVGLQGTGDSLRSSPFT+QS
Sbjct: 11 LVFSALPFLSTPPAQADTSRIKDIASLQAGRDNQLIGYGLVVGLQGTGDSLRSSPFTEQS 70

Query: 69 IRAMLQNLGISTQGGDSRTRNVAAVLVTATLPPFASPGSRLDVTVGSLGDATSLRGGTLV 128
+RAMLQNLGI+TQGG S +N+AAV+VTA LPPFASPGSR+DVTV SLGDATSLRGG L+
Sbjct: 71 MRAMLQNLGITTQGGQSNAKNIAAVMVTANLPPFASPGSRVDVTVSSLGDATSLRGGNLI 130

Query: 129 MTSLSGADGQIYAVAQGSVVVSGFNAQGEAAQLSQGVTTAGRVPNGAIIERELPSKFKDG 188
MTSLSGADGQIYAVAQG+++V+GF+AQG+AA L+QGVTT+ RVPNGAIIERELPSKFKD
Sbjct: 131 MTSLSGADGQIYAVAQGALIVNGFSAQGDAATLTQGVTTSARVPNGAIIERELPSKFKDS 190

Query: 189 FNLVLQLRNPDFSTAVGMAAAINRYAAAQFGGRIAEALDSQSVLVQKPKMADLARLMADV 248
NLVLQLRNPDFSTAV +A +N +A A++G IAE DSQ + VQKP++ADL RLMA++
Sbjct: 191 VNLVLQLRNPDFSTAVRVADVVNAFARARYGDPIAEPRDSQEIAVQKPRVADLTRLMAEI 250

Query: 249 ENLVIETDAPARVVINERTGTIVIGQDVRVAEVAVSYGTLTVQVSETPTIVQPEPFSRGE 308
ENL +ETD PA+VVINERTGTIVIG DVR++ VAVSYGTLTVQV+E+P ++QP PFSRG+
Sbjct: 251 ENLTVETDTPAKVVINERTGTIVIGADVRISRVAVSYGTLTVQVTESPQVIQPAPFSRGQ 310

Query: 309 TAYEPNTTIEAQADGGTVAILNGSSLRSLVAGLNSIGVKPDGIIAILQSIKSAGALQAEL 368
TA +P T I A +G VAI+ G LR+LVAGLNSIG+K DGIIAILQ IKSAGALQAEL
Sbjct: 311 TAVQPQTDIMAMQEGSKVAIVEGPDLRTLVAGLNSIGLKADGIIAILQGIKSAGALQAEL 370

Query: 369 VLQ 371
VLQ
Sbjct: 371 VLQ 373


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc03034  FLGLRINGFLGH  265    2e-92    Flagellar L-ring protein signature.
		>FLGLRINGFLGH#Flagellar L-ring protein signature.

Length = 232

Score = 265 bits (678), Expect = 2e-92
Identities = 56/209 (26%), Positives = 90/209 (43%), Gaps = 18/209 (8%)

Query: 29 PAMSPIGSGLQYTQTPQLAMYPKQPRHVTNGYSLWNDQQAALFKDARAINIGDILTVDIR 88
P +P+ +G + Q+ Q Y QP LF+D R NIGD LT+ ++
Sbjct: 41 PGPTPVANGSIF-QSAQPINYGYQP----------------LFEDRRPRNIGDTLTIVLQ 83

Query: 89 IDDKASFENETDRSRKNSSGFNLGASGQSQTSDFAWS-GDLEYGSNTKTEGDGKTERSEK 147
+ AS + + SR + F + F + D+E G G S
Sbjct: 84 ENVSASKSSSANASRDGKTNFGFDTVPRYLQGLFGNARADVEASGGNTFNGKGGANASNT 143

Query: 148 LRLLVAAVVTGVLENGNLLISGSQEVRVNHELRILNVAGIVRPRDVDADNVISYDRIAEA 207
+ V VL NGNL + G +++ +N + +G+V PR + N + ++A+A
Sbjct: 144 FSGTLTVTVDQVLVNGNLHVVGEKQIAINQGTEFIRFSGVVNPRTISGSNTVPSTQVADA 203

Query: 208 RISYGGRGRLTEVQQPPWGQQLVDLVSPL 236
RI Y G G + E Q W Q+ +SP+
Sbjct: 204 RIEYVGNGYINEAQNMGWLQRFFLNLSPM 232


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc03036  FLGBIOSNFLIP  281    2e-98    Escherichia coli: Flagellar biosynthetic protein Fl...
		>FLGBIOSNFLIP#Escherichia coli: Flagellar biosynthetic protein FliP signature.

Length = 245

Score = 281 bits (721), Expect = 2e-98
Identities = 98/246 (39%), Positives = 157/246 (63%), Gaps = 5/246 (2%)

Query: 1 MLRFATFIIAMMAMSGIAGAQSFPADILNTPVDGSVASWI--IRTFGLLTVLSVAPGILI 58
M R + ++ + P I + P+ G SW ++T +T L+ P IL+
Sbjct: 1 MRRLLSVAPVLLWLITPLAFAQLPG-ITSQPLPGGGQSWSLPVQTLVFITSLTFIPAILL 59

Query: 59 MVTSFPRFVIAFAILRSGMGLATTPSNMIMVSLALFMTFYVMAPTFDRAWRDGIDPLLKN 118
M+TSF R +I F +LR+ +G + P N +++ LALF+TF++M+P D+ + D P +
Sbjct: 60 MMTSFTRIIIVFGLLRNALGTPSAPPNQVLLGLALFLTFFIMSPVIDKIYVDAYQPFSEE 119

Query: 119 EISETDAMQRMSEPFREFMVANTRDKDLQLFIDIAREKGQTVVVDEKVDLRAVVPAFMIS 178
+IS +A+++ ++P REFM+ TR+ DL LF +A + E V +R ++PA++ S
Sbjct: 120 KISMQEALEKGAQPLREFMLRQTREADLGLFARLANT--GPLQGPEAVPMRILLPAYVTS 177

Query: 179 EIRRGFEIGFLIMLPFLVIDLIVATITMAMGMMMLPPTAISLPFKILFFVLIDGWNLLVG 238
E++ F+IGF I +PFL+IDL++A++ MA+GMMM+PP I+LPFK++ FVL+DGW LLVG
Sbjct: 178 ELKTAFQIGFTIFIPFLIIDLVIASVLMALGMMMVPPATIALPFKLMLFVLVDGWQLLVG 237

Query: 239 SLVRSF 244
SL +SF
Sbjct: 238 SLAQSF 243


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc03037  FLAGELLIN     149    2e-42    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 149 bits (376), Expect = 2e-42
Identities = 58/379 (15%), Positives = 117/379 (30%), Gaps = 4/379 (1%)

Query: 4 ILTNNSAMAALSTLRSISSSMEDTQSRISSGLRVGSASDNAAYWSIATTMRSDNQALSAV 63
I TN+ ++ + L SS+ R+SSGLR+ SA D+AA +IA S+ + L+
Sbjct: 4 INTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGLTQA 63

Query: 64 QDALGLG---AAKVDTAYSGMESAIEVVKEIKAKLVAATEDGVDKAKIQEEITQLKDQLT 120
G A + A + + + ++ V+E+ + T D IQ+EI Q +++
Sbjct: 64 SRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEEID 123

Query: 121 SIAEAASFSGENWLQADLSGGPVTKSVVGGFVRDSSGAVSVKKVD-YSLNTDTVLFDTTG 179
++ F+G L D + G + + VK + N + T G
Sbjct: 124 RVSNQTQFNGVKVLSQDNQMKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPKEATVG 183

Query: 180 NTGILDKVYNVSQASVTLPVNVNGTTSEYTVGAYNVDDLIDASATFDGDYANVGAGALAG 239
+ K + V + + +
Sbjct: 184 DLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTTDDAEN 243

Query: 240 DYVKVQGSWVKAVDVAATGQEVVYDDGTTKWGVDTTVTGAPATNVAAPASIATIDITIAA 299
+ K+ A + + K G G T + ++
Sbjct: 244 NTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTIDTKTGNDGNGKVSTTI 303

Query: 300 QAGNLDALIAGVDEALTDMTSAAASLGSISSRIDLQSDFVNKLSDSIDSGVGRLVDADMN 359
+ +A + ++ +A + F +S ++A+
Sbjct: 304 NGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKTKNESAKLSDLEANNA 363

Query: 360 EESTRLKALQTQQQLAIQA 378
+ + + A A
Sbjct: 364 VKGESKITVNGAEYTANAA 382



Score = 94.7 bits (235), Expect = 2e-23
Identities = 57/363 (15%), Positives = 110/363 (30%), Gaps = 19/363 (5%)

Query: 32 SSGLRVGSASDNAAYWSIATTMRSDNQALSAVQDALGLGAAKVDTAYSGMESAIEVVKEI 91
L + + N + ++S + ++ SG +
Sbjct: 164 VKSLGLDGFNVNGPKEATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTV 223

Query: 92 KAKLVAATEDGVDKAKIQEEITQLKDQLTSIAEAASFSGENWLQADLSGGPVTKSVVGGF 151
K V+ A Q ++ + S +A G + G
Sbjct: 224 PDK------VYVNAANGQLTTDDAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDT 277

Query: 152 VRDSSGAVSVKKVDYSLNTDTVLFDTTGNTGILDKVYNVSQASVTLPVNVNGTTSEYTVG 211
++ + + + T + V +++ + + +S+
Sbjct: 278 FDYKGVTFTIDTKTG-NDGNGKVSTTINGEKVTLTVADITAGAANV-DAATLQSSKNVYT 335

Query: 212 AYNVDDLIDASATFDGDYANVGAGALAGDYVKVQGSWVKAVDVAATGQEVVYDDGTTKWG 271
+ T + A + + + A A + V G T +
Sbjct: 336 SVVNGQFTFDDKTKNESAKLSDLEANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFI 395

Query: 272 VDTTVTGAPATNVAAPASIATIDITIAAQAGNLDALIAGVDEALTDMTSAAASLGSISSR 331
T + N A AA + +A +D AL+ + + +SLG+I +R
Sbjct: 396 DKTASGVSTLINEDA-----------AAAKKSTANPLASIDSALSKVDAVRSSLGAIQNR 444

Query: 332 IDLQSDFVNKLSDSIDSGVGRLVDADMNEESTRLKALQTQQQLAIQALSIANSDSQNVLS 391
D + +++S R+ DAD E + + Q QQ L+ AN QNVLS
Sbjct: 445 FDSAITNLGNTVTNLNSARSRIEDADYATEVSNMSKAQILQQAGTSVLAQANQVPQNVLS 504

Query: 392 LFR 394
L R
Sbjct: 505 LLR 507


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SMc03038  FLAGELLIN  153    7e-44    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 153 bits (387), Expect = 7e-44
Identities = 55/378 (14%), Positives = 111/378 (29%), Gaps = 4/378 (1%)

Query: 4 ILTNIAAMAALQTLRTIGSNMEETQAHVSSGLRVGQAADNAAYWSIATTMRSDNMALSAV 63
I TN ++ L S++ +SSGLR+ A D+AA +IA S+ L+
Sbjct: 4 INTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGLTQA 63

Query: 64 QDALGLG---AAKVDTAYSGMESAIEVVKEIKAKLVAATEDGVDKAKIQEEIDQLKDQLT 120
G A + A + + + ++ V+E+ + T D IQ+EI Q +++
Sbjct: 64 SRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEEID 123

Query: 121 SIAEAASFSGENWLQADLSGGPVTKSVVGSFVRDAGGAVSVKKVD-YSLNTNSVLFDTAG 179
++ F+G L D + G + + VK + N N T G
Sbjct: 124 RVSNQTQFNGVKVLSQDNQMKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPKEATVG 183

Query: 180 NTGILDKVYNVSQASVTLPVNVNGTTSEYTVGAYNVDDLIDASATFDGDYANVGAGALAG 239
+ K + V + + +
Sbjct: 184 DLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTTDDAEN 243

Query: 240 DYVKVQGSWVKAVDVAATGQEVVYDDGTTKWGVDTTVTGAPATNVAAPASIATIDITIAA 299
+ K+ A + + K G G T + ++
Sbjct: 244 NTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTIDTKTGNDGNGKVSTTI 303

Query: 300 QAGNLDALIAGVDEALTDMTSAAADLGSIAMRIDLQSDFVNKLSDSIDSGVGRLVDADMN 359
+ +A + ++ +A + F +S ++A+
Sbjct: 304 NGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKTKNESAKLSDLEANNA 363

Query: 360 EESTRLKALQTQQQLAIQ 377
+ + + A
Sbjct: 364 VKGESKITVNGAEYTANA 381



Score = 94.3 bits (234), Expect = 3e-23
Identities = 57/363 (15%), Positives = 108/363 (29%), Gaps = 19/363 (5%)

Query: 32 SSGLRVGQAADNAAYWSIATTMRSDNMALSAVQDALGLGAAKVDTAYSGMESAIEVVKEI 91
L + N + ++S ++ SG +
Sbjct: 164 VKSLGLDGFNVNGPKEATVGDLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTV 223

Query: 92 KAKLVAATEDGVDKAKIQEEIDQLKDQLTSIAEAASFSGENWLQADLSGGPVTKSVVGSF 151
K V+ A Q D ++ + S +A G + G
Sbjct: 224 PDK------VYVNAANGQLTTDDAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDT 277

Query: 152 VRDAGGAVSVKKVDYSLNTNSVLFDTAGNTGILDKVYNVSQASVTLPVNVNGTTSEYTVG 211
G ++ + N + T + V +++ + + +S+
Sbjct: 278 FDYKGVTFTIDTKTG-NDGNGKVSTTINGEKVTLTVADITAGAANV-DAATLQSSKNVYT 335

Query: 212 AYNVDDLIDASATFDGDYANVGAGALAGDYVKVQGSWVKAVDVAATGQEVVYDDGTTKWG 271
+ T + A + + + A A + V G T +
Sbjct: 336 SVVNGQFTFDDKTKNESAKLSDLEANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFI 395

Query: 272 VDTTVTGAPATNVAAPASIATIDITIAAQAGNLDALIAGVDEALTDMTSAAADLGSIAMR 331
T + N A AA + +A +D AL+ + + + LG+I R
Sbjct: 396 DKTASGVSTLINEDA-----------AAAKKSTANPLASIDSALSKVDAVRSSLGAIQNR 444

Query: 332 IDLQSDFVNKLSDSIDSGVGRLVDADMNEESTRLKALQTQQQLAIQSLSIANSASENVLT 391
D + +++S R+ DAD E + + Q QQ L+ AN +NVL+
Sbjct: 445 FDSAITNLGNTVTNLNSARSRIEDADYATEVSNMSKAQILQQAGTSVLAQANQVPQNVLS 504

Query: 392 LFR 394
L R
Sbjct: 505 LLR 507


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SMc03039  FLAGELLIN  138    1e-38    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 138 bits (349), Expect = 1e-38
Identities = 54/330 (16%), Positives = 102/330 (30%), Gaps = 12/330 (3%)

Query: 4 ILTNVAAMAALQTLRGIDSNMEETQARVSSGLRVGTASDNAAYWSIATTMRSDNMALSAV 63
I TN ++ L S++ R+SSGLR+ +A D+AA +IA S+ L+
Sbjct: 4 INTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGLTQA 63

Query: 64 QDALGLGAAKVDTAYAGM---ENAVEVVKEIRAKLVAATEDGVDKAKIQEEIEQLKQQLT 120
G + T + N ++ V+E+ + T D IQ+EI+Q +++
Sbjct: 64 SRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEEID 123

Query: 121 SIATAASFSGENWLQADIT-TPVTKSVVGSFVRDSSGVVSVKTID-YVLDGNSVLFDTVG 178
++ F+G L D + G + + VK++ + N TVG
Sbjct: 124 RVSNQTQFNGVKVLSQDNQMKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPKEATVG 183

Query: 179 DAGILDKIYNVSQASVTLPVNVNGTTTEYTVAAYAVDELIAAGATFDGDSANVTGYTVPA 238
D K V + + +
Sbjct: 184 DLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQ-------L 236

Query: 239 GGIDYNGNFVKVEGTWVRAIDVAATGQEVVYDDGTTKWGVDTTVAGAPAINVVAPASIEN 298
D N ++ A + + K G G + N
Sbjct: 237 TTDDAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTIDTKTGNDGN 296

Query: 299 IDITNAAQAANLDALIRGVDEALEDLISAT 328
++ + + + ++ +AT
Sbjct: 297 GKVSTTINGEKVTLTVADITAGAANVDAAT 326



Score = 90.5 bits (224), Expect = 7e-22
Identities = 47/268 (17%), Positives = 81/268 (30%), Gaps = 7/268 (2%)

Query: 139 TTPVTKSVVGSFVRDSSGVVSVKTIDYVLDGNSVLFDTVGDAGILDKIYNVSQASVTLPV 198
T + + ++G K I + G DT G+ I + V
Sbjct: 241 AENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKE-GDTFDYKGVTFTIDTKTGNDGNGKV 299

Query: 199 NVNGTTTEYTVAAYAVDELIAAGATFDGDSANVTGYTVPAGGIDYNGNFVKVEGTWVRAI 258
+ + T+ + A S+ +V G ++
Sbjct: 300 STTINGEKVTLTVADITAGAANVDAATLQSSKNVYTSVVNGQFTFDDKTKNESAKLSDLE 359

Query: 259 DVAATGQEVVY------DDGTTKWGVDTTVAGAPAINVVAPASIENIDITNAAQAANLDA 312
A E T I+ A I+ AA +
Sbjct: 360 ANNAVKGESKITVNGAEYTANAAGDKVTLAGKTMFIDKTASGVSTLINEDAAAAKKSTAN 419

Query: 313 LIRGVDEALEDLISATSALGSISMRIGMQEEFVSKLTDSIDSGIGRLVDADMNEESTRLK 372
+ +D AL + + S+LG+I R + +++S R+ DAD E + +
Sbjct: 420 PLASIDSALSKVDAVRSSLGAIQNRFDSAITNLGNTVTNLNSARSRIEDADYATEVSNMS 479

Query: 373 ALQTQQQLAIQSLSIANTNSENILQLFR 400
Q QQ L+ AN +N+L L R
Sbjct: 480 KAQILQQAGTSVLAQANQVPQNVLSLLR 507


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SMc03040  FLAGELLIN  110    4e-29    Flagellin signature.
		>FLAGELLIN#Flagellin signature.

Length = 507

Score = 110 bits (275), Expect = 4e-29
Identities = 48/295 (16%), Positives = 90/295 (30%), Gaps = 3/295 (1%)

Query: 26 TTQGRISSGYRVETAADNAAYWSIATTMRSDNAALSTVHDALGLGAAKVDTFYSAMDTVI 85
T + +V A N + + T G AK
Sbjct: 216 TDTTAPTVPDKVYVNAANGQLTTDDAENNTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEG 275

Query: 86 DVMTEIKAKLVAASEPGVDKDKINKEVAELKSQLNSAAQSASFSGENWLYNGASAALGTK 145
D ++ G D + + + A + + S+
Sbjct: 276 DTFDYKGVTFTIDTKTGNDGNGKVSTTINGEKVTLTVADITAGAANVDAATLQSSKNVYT 335

Query: 146 SIVASFNRSADGSVTVSTLNYDTAKSVLIDVTDPARGMLTKAVNADALQSTPTGTARNYY 205
S+V D + S D + + A + T
Sbjct: 336 SVVNGQFTFDDKTKNESAKLSDLEANNAVKGESKITVNGA---EYTANAAGDKVTLAGKT 392

Query: 206 LIDAGAAPAGATEIEIDNATTGAQLGDMISVVDELISQLTDSAATLGAITSRIEMQESFV 265
+ A +T I D A + ++ +D +S++ ++LGAI +R + + +
Sbjct: 393 MFIDKTASGVSTLINEDAAAAKKSTANPLASIDSALSKVDAVRSSLGAIQNRFDSAITNL 452

Query: 266 ANLMDVIDKGVGRLVDADMNEESTRLKALQTQQQLGIQSLSIANTTSENILRLFQ 320
N + ++ R+ DAD E + + Q QQ G L+ AN +N+L L +
Sbjct: 453 GNTVTNLNSARSRIEDADYATEVSNMSKAQILQQAGTSVLAQANQVPQNVLSLLR 507



Score = 103 bits (257), Expect = 1e-26
Identities = 47/327 (14%), Positives = 96/327 (29%), Gaps = 15/327 (4%)

Query: 4 IMTNPAAMAALQTLRAINHNLETTQGRISSGYRVETAADNAAYWSIATTMRSDNAALSTV 63
I TN ++ L +L + R+SSG R+ +A D+AA +IA S+ L+
Sbjct: 4 INTNSLSLLTQNNLNKSQSSLSSAIERLSSGLRINSAKDDAAGQAIANRFTSNIKGLTQA 63

Query: 64 HDALGLGAAKVDTFYSAMDTV---IDVMTEIKAKLVAASEPGVDKDKINKEVAELKSQLN 120
G + T A++ + + + E+ + + D I E+ + +++
Sbjct: 64 SRNANDGISIAQTTEGALNEINNNLQRVRELSVQATNGTNSDSDLKSIQDEIQQRLEEID 123

Query: 121 SAAQSASFSGENWLYNGASAALGTKSIVASFNRSADGSVTVSTLN---YDTAKSVLIDVT 177
+ F+G L + + + V +L ++ V
Sbjct: 124 RVSNQTQFNGVKVLSQDNQMKIQVGANDGETITIDLQKIDVKSLGLDGFNVNGPKEATVG 183

Query: 178 DPARGMLTKAVNADALQ-------STPTGTARNYYLIDAGAAPAGATEIEIDNATTGAQL 230
D +G T A+
Sbjct: 184 DLKSSFKNVTGYDTYAVGANKYRVDVNSGAVVTDTTAPTVPDKVYVNAANGQLTTDDAEN 243

Query: 231 GDMISVVDELIS--QLTDSAATLGAITSRIEMQESFVANLMDVIDKGVGRLVDADMNEES 288
+ + S ++ A GAI E + ID G + ++
Sbjct: 244 NTAVDLFKTTKSTAGTAEAKAIAGAIKGGKEGDTFDYKGVTFTIDTKTGNDGNGKVSTTI 303

Query: 289 TRLKALQTQQQLGIQSLSIANTTSENI 315
K T + + ++ T ++
Sbjct: 304 NGEKVTLTVADITAGAANVDAATLQSS 330


28  SMc03046  SMc03053  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SMc03046  2       9        -1.968098  transcriptional regulator
SMc03047  2       10       -1.725435  flagellar hook protein FlgE
SMc03048  2       11       -1.847125  flagellar hook-associated protein FlgK
SMc03049  2       15       -2.267517  flagellar hook-associated protein FlgL
SMc03050  3       16       -1.724470  flagellar biosynthesis regulatory protein FlaF
SMc03051  3       15       -1.012767  flagellar biosynthesis repressor FlbT
SMc03052  4       15       -0.692297  flagellar basal body rod modification protein
SMc03053  4       14       -0.779083  flagellar biosynthesis protein FliQ
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SMc03046  HTHFIS  41     2e-06    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 41.4 bits (97), Expect = 2e-06
Identities = 23/114 (20%), Positives = 42/114 (36%), Gaps = 5/114 (4%)

Query: 2 IVVVDDRALVKDGYASLFGREGIP-STGFDPREFGEWVSSAADSDIDAVEAFLIGQGEST 60
I+V DD A ++ R G + W+++ D V ++ E+
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGD---GDLVVTDVVMPDENA 62

Query: 61 FTLPRAIRDR-SRAPVIAMSDTPSLENTLALFDCGVDDVVRKPVHPREILARVA 113
F L I+ PV+ MS + + + G D + KP E++ +
Sbjct: 63 FDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIG 116


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
SMc03047  FLGHOOKAP1  41     5e-06    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 41.5 bits (97), Expect = 5e-06
Identities = 13/49 (26%), Positives = 28/49 (57%)

Query: 357 EILSGALESSNVDIAEELTAMIESQRNYTANSKVFQTGSELLEVLVNLK 405
++ + S V++ EE + Q+ Y AN++V QT + + + L+N++
Sbjct: 498 QLSNQQQSISGVNLDEEYGNLQRFQQYYLANAQVLQTANAIFDALINIR 546



Score = 38.4 bits (89), Expect = 5e-05
Identities = 13/33 (39%), Positives = 20/33 (60%)

Query: 9 TGVSGMNAQSNRLSTVAENIANANTTGYKRAST 41
+SG+NA L+T + NI++ N GY R +T
Sbjct: 6 NAMSGLNAAQAALNTASNNISSYNVAGYTRQTT 38


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
SMc03048  FLGHOOKAP1  68     2e-14    Flagellar hook-associated protein signature.
		>FLGHOOKAP1#Flagellar hook-associated protein signature.

Length = 546

Score = 68.4 bits (167), Expect = 2e-14
Identities = 77/406 (18%), Positives = 143/406 (35%), Gaps = 36/406 (8%)

Query: 4 SSAIAIAQSAFSTTAQQTATVSKNIANSGNADYSR----------RMAMLGTTPGGAQIV 53
SS I A S + T S NI++ A Y+R + G G +
Sbjct: 1 SSLINNAMSGLNAAQAALNTASNNISSYNVAGYTRQTTIMAQANSTLGAGGWVGNGVYVS 60

Query: 54 SIYRAQNEALLKQNLIGISQSSAQSSLLSGLEIMKSALGGNDYESSPSTYLSAFRNSLQT 113
+ R + + Q +QSS ++ + + + L + SS +T + F SLQT
Sbjct: 61 GVQREYDAFITNQLRAAQTQSSGLTARYEQMSKIDNML--STSTSSLATQMQDFFTSLQT 118

Query: 114 FASTPGNATIAATVVSDASDLANSISKTSAAVQDLRLDSDKKIAEEVANLNRLLAQFETA 173
S + ++ + L N T ++D + I V +N Q +
Sbjct: 119 LVSNAEDPAARQALIGKSEGLVNQFKTTDQYLRDQDKQVNIAIGASVDQINNYAKQIASL 178

Query: 174 NNAVKQATAAGTDATA--ALDDRDKILKQVSELVGISTVTRAN-NDTVIYTSGGTVLFET 230
N+ + + T G A+ LD RD+++ +++++VG+ + + +G +++ +
Sbjct: 179 NDQISRLTGVGAGASPNNLLDQRDQLVSELNQIVGVEVSVQDGGTYNITMANGYSLVQGS 238

Query: 231 LPREVTFAPKSAYDATVTGNGIFIDGVPLAAGSGADTSAQGKLAGLLQLRDDIAPTFQSQ 290
R+ A + ++DG G L G+L R ++
Sbjct: 239 TARQ--LAAVPSSADPSRTTVAYVDGTAGNIEIPEKLLNTGSLGGILTFRSQDLDQTRNT 296

Query: 291 LDEMARGLVTLF---KEGGL-------PGLFTWSGGTVPAAGAVQPGLAASLSVNPAAKA 340
L ++A F + G F V + +A +V A+
Sbjct: 297 LGQLALAFAEAFNTQHKAGFDANGDAGEDFFAIGKPAVLQNTKNKGDVAIGATVTDASAV 356

Query: 341 NPFLLRDGGFNG---VVSNPDGNAGYTVLLDGFVTAMDGDMAFDGA 383
F+ V+ N +TV D +G +AFDG
Sbjct: 357 L-ATDYKISFDNNQWQVTRLASNTTFTVTPD-----ANGKVAFDGL 396



Score = 38.8 bits (90), Expect = 4e-05
Identities = 18/79 (22%), Positives = 39/79 (49%)

Query: 390 SSIMEFAASSIGWFEQIRSGASTADDNKAALLARTQEALGSVTGVSIDEELSLLLDLEQS 449
S + AS + + T+ + ++ + S++GV++DEE L +Q
Sbjct: 465 KSFNDAYASLVSDIGNKTATLKTSSATQGNVVTQLSNQQQSISGVNLDEEYGNLQRFQQY 524

Query: 450 YKASAKLISTVDAMMASLL 468
Y A+A+++ T +A+ +L+
Sbjct: 525 YLANAQVLQTANAIFDALI 543


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc03053  TYPE3IMQPROT  53     5e-13    Type III secretion system inner membrane Q protein ...
		>TYPE3IMQPROT#Type III secretion system inner membrane Q protein family signature.

Length = 86

Score = 52.8 bits (127), Expect = 5e-13
Identities = 20/77 (25%), Positives = 39/77 (50%)

Query: 5 DALDIVQAAIWTVIVASGPAVLAAMVVGVGIAFIQALTQVQEMTLTFVPKILAVMITAAI 64
D + A++ V++ SG + A ++G+ + Q +TQ+QE TL F K+L V + +
Sbjct: 3 DLVFAGNKALYLVLILSGWPTIVATIIGLLVGLFQTVTQLQEQTLPFGIKLLGVCLCLFL 62

Query: 65 SAPFVGAQISIFTDIVF 81
+ + G + + V
Sbjct: 63 LSGWYGEVLLSYGRQVI 79


29  SMc00003  SMc00071  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SMc00003  -1      11       0.273028   chaperone protein
SMc00005  -1      12       -0.028659  enoyl-ACP reductase
SMc00006  -1      13       -0.407108  hypothetical protein
SMc00070  0       14       -0.917503  signal peptide protein
SMc00007  -1      12       -0.894717  chorismate synthase
SMc00008  -1      14       -1.227876  riboflavin biosynthesis protein RibA
SMc00071  0       14       -1.742854  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
SMc00003  IGASERPTASE  32     0.005    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 31.6 bits (71), Expect = 0.005
Identities = 27/197 (13%), Positives = 56/197 (28%), Gaps = 9/197 (4%)

Query: 15 RRTAGQEEIKAAWRSVARAVHPDHNQDDPTANERFAEAGRAYELLRDPVRRSRYDWARRE 74
+ E A R VA+ + + + NE E + + +
Sbjct: 1053 KNEQDATETTAQNREVAKEAKSN-VKANTQTNEVAQSGSETKETQTTETKETATVEKEEK 1111

Query: 75 AELRRMEAMKEKMRGAEVP-------DEPVGAETAEEAISRIFGVEPQATSARPKSSARP 127
A++ + + ++V AE A E + EPQ+ + + +P
Sbjct: 1112 AKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQP 1171

Query: 128 AER-ERAPVEAKTEPAAEAKQEQPKADALARRSIMPAADLVAAIVQRIRGRIAKTAEKVP 186
A+ + TE + + + + + R ++ VP
Sbjct: 1172 AKETSSNVEQPVTESTTVNTGNSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVP 1231

Query: 187 DLAVDVHASVEDVINLA 203
S D +A
Sbjct: 1232 HNVEPATTSSNDRSTVA 1248


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc00005  DHBDHDRGNASE  58     4e-12    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 58.1 bits (140), Expect = 4e-12
Identities = 66/257 (25%), Positives = 104/257 (40%), Gaps = 17/257 (6%)

Query: 8 MNGKRGVIMGVANNRSIAWGIAKALAEAGAEI-ALTWQGDALKKRVEPLAQELGAFMAGH 66
+ GK I G A + I +A+ LA GA I A+ + + L+K V L E A
Sbjct: 6 IEGKIAFITGAA--QGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFP 63

Query: 67 CDVTDLATIDAVFSALEEKWGKIDFVVHAIAFSDKDELTGRYLDTSRDNFARTMDISVYS 126
DV D A ID + + +E + G ID +V+ G S + + T ++
Sbjct: 64 ADVRDSAAIDEITARIEREMGPIDILVNVAGVLR----PGLIHSLSDEEWEATFSVNSTG 119

Query: 127 FTAVAARADRVMND--GGSILTLTYYGAEKVMPHYNVMGVAKAALEASVRYLAVDLGNRG 184
+ + M D GSI+T+ A +KAA + L ++L
Sbjct: 120 VFNASRSVSKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYN 179

Query: 185 IRVNAISAGPIKT-----LAASGIGDFRYILKWNE---YNAPLKRTVSIEEVGNSALYLL 236
IR N +S G +T L A G + I E PLK+ ++ ++ L+L+
Sbjct: 180 IRCNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLV 239

Query: 237 SDLSSGVTGEVHHVDSG 253
S + +T VD G
Sbjct: 240 SGQAGHITMHNLCVDGG 256


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SMc00007  PF05272  32     0.003    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 32.4 bits (73), Expect = 0.003
Identities = 11/49 (22%), Positives = 19/49 (38%)

Query: 125 DYRGGGRSSARETAARVAAGAIARKVVPGLVVRGALVQIGKHRIDRSNW 173
DY+ + + G +AR + PG ++V G I +S
Sbjct: 564 DYKPRRLRYLQLVGKYILMGHVARVMEPGCKFDYSVVLEGTGGIGKSTL 612


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SMc00071  PF06776  182    5e-61    Invasion associated locus B
		>PF06776#Invasion associated locus B

Length = 214

Score = 182 bits (463), Expect = 5e-61
Identities = 94/169 (55%), Positives = 122/169 (72%), Gaps = 1/169 (0%)

Query: 8 RLPALAALAIAAFALPAASLAQQSNATPGTVKSNHGAWSIVCDQPAGASAEQCALMQNVI 67
R A LA A + + +++A G V+S HG W I CD P GA AEQCAL+Q+V+
Sbjct: 47 RNGARLMLAGAMAIALSFGWSDRADA-QGAVRSVHGDWQIRCDTPPGAKAEQCALIQSVV 105

Query: 68 AEDRPEVGLSVVVLKTADRKAKILRVLAPLGVLLPNGLGLNVDGKDIGRAYFVRCFSDGC 127
AEDR GL+V++LKTAD+K+K++RV+APLGVLLP+GLGL +D D+GRA FVRC +GC
Sbjct: 106 AEDRSNAGLTVIILKTADQKSKLMRVVAPLGVLLPSGLGLKLDNVDVGRAGFVRCLPNGC 165

Query: 128 YAEVVLEDELLKTFRAGATATFIVFQSPEEGIGIPVDLKGFDEGYDALP 176
AEVV++D+LL R TATFI+F++PEEGIG P+ L G EGYD LP
Sbjct: 166 VAEVVMDDKLLGQLRTAKTATFIIFETPEEGIGFPLSLNGIGEGYDKLP 214


30  SMc00458  SMc02376  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SMc00458  -2      10       1.277581   transcriptional regulator
SMc00129  -1      9        1.785685   sensor histidine kinase
SMc00067  1       9        1.898760   outer-membranne lipoprotein
SMc02361  -1      9        1.101156   cytochrome C-type biogenesis transmembrane
SMc02362  0       8        0.570836   cytochrome c-type biogenesis protein CcmE
SMc02363  -1      9        0.553470   cytochrome C-type biogenesis transmembrane
SMc02364  -1      7        0.388386   cytochrome C-type biogenesis transmembrane
SMc02365  -1      6        0.131913   protease precursor protein
SMc02366  -1      7        -0.192074  transcriptional regulator
SMc02367  -1      7        -0.011958  sensor histidine kinase transmembrane protein
SMc02368  -1      7        -0.047213  bifunctional glutamine-synthetase
SMc02369  0       8        -0.257305  sensor histidine kinase transmembrane protein
SMc02370  -1      8        0.283842   aminopeptidase N
SMc02371  -1      11       0.378768   hypothetical protein
SMc02372  -1      12       0.587143   transport transmembrane protein
SMc02373  1       12       0.518304   hypothetical protein
SMc02374  -1      11       -0.425911  hypothetical protein
SMc02375  -1      10       -0.199953  hypothetical protein
SMc02376  -1      12       -0.799375  heat shock protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SMc00458  HTHFIS  88     2e-22    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 88.0 bits (218), Expect = 2e-22
Identities = 33/123 (26%), Positives = 59/123 (47%), Gaps = 1/123 (0%)

Query: 2 RILIVEDDTNLNRQLADALKEAGYVVDQAYDGEEGHYLGDAEPYDAVILDIGLPEMDGIT 61
IL+ +DD + L AL AGY V + A D V+ D+ +P+ +
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 VLEKWRADGKTMPVLILTARDRWSDKVAGIDAGADDYVAKPFHVEEVLARI-RALIRRAA 120
+L + + +PVL+++A++ + + + GA DY+ KPF + E++ I RAL
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 121 GHA 123
+
Sbjct: 125 RPS 127


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SMc00129  PF06580  30     0.019    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 29.8 bits (67), Expect = 0.019
Identities = 21/109 (19%), Positives = 43/109 (39%), Gaps = 28/109 (25%)

Query: 357 LLENAARYAHDRISLNVEPAPEDAREREPGRQ---WIVLQIDDDGPGLDPDQIAVAMKRG 413
L+EN ++ + P+ + G + + L++++ G
Sbjct: 263 LVENGIKHG-------IAQLPQGGKILLKGTKDNGTVTLEVENTGSL------------- 302

Query: 414 KRLDESKPGTGLGLSIVSE-IVGEY--QGSVELSRRDVGGLRATLVLPA 459
L +K TG GL V E + Y + ++LS + G + A +++P
Sbjct: 303 -ALKNTKESTGTGLQNVRERLQMLYGTEAQIKLSEKQ-GKVNAMVLIPG 349


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc02361  SYCDCHAPRONE  28     0.032    Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 28.4 bits (63), Expect = 0.032
Identities = 9/40 (22%), Positives = 16/40 (40%)

Query: 209 EAQNALQKALALDPDDPRSAFYLALGLKQEGKHAEALAAF 248
+A Q LD D R L + G++ A+ ++
Sbjct: 54 DAHKVFQALCVLDHYDSRFFLGLGACRQAMGQYDLAIHSY 93



Score = 28.4 bits (63), Expect = 0.036
Identities = 12/52 (23%), Positives = 20/52 (38%), Gaps = 1/52 (1%)

Query: 214 LQKALALDPDDPRSAFYLALGLKQEGKHAEALAAFRKLAESSPADAP-WLSL 264
+ + D + LA Q GK+ +A F+ L D+ +L L
Sbjct: 25 IAMLNEISSDTLEQLYSLAFNQYQSGKYEDAHKVFQALCVLDHYDSRFFLGL 76


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
SMc02365  V8PROTEASE  63     6e-13    V8 serine protease family signature.
		>V8PROTEASE#V8 serine protease family signature.

Length = 336

Score = 63.1 bits (153), Expect = 6e-13
Identities = 34/162 (20%), Positives = 58/162 (35%), Gaps = 25/162 (15%)

Query: 119 RPRAQGSGFFITEDGYLVTNNHVVSDGS-------AFTVIMN-----DGTELDAKLVGKD 166
SG + L+TN HVV AF +N +G ++
Sbjct: 99 TGTFIASGVVVG-KDTLLTNKHVVDATHGDPHALKAFPSAINQDNYPNGGFTAEQITKYS 157

Query: 167 SRTDLAVLKVDDKRK-------FTYVSFADDEKVRVGDWVVAVGNPFGLGGTVTAGIISA 219
DLA++K + + +++ + +V + G P A + +
Sbjct: 158 GEGDLAIVKFSPNEQNKHIGEVVKPATMSNNAETQVNQNITVTGYPGDKP---VATMWES 214

Query: 220 RGRDIGSGPYDDYLQVDAAVNRGNSGGPTFNLSGEVVGINTA 261
+G+ +Q D + GNSG P FN EV+GI+
Sbjct: 215 KGKITYLKGEA--MQYDLSTTGGNSGSPVFNEKNEVIGIHWG 254


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SMc02366  HTHFIS  89     2e-22    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 88.7 bits (220), Expect = 2e-22
Identities = 32/127 (25%), Positives = 55/127 (43%), Gaps = 1/127 (0%)

Query: 3 TRMKILIVEDDLEAAAYLAKAFREAGIVCDHASDGESGLFMASENAYDVLVVDRMLPRRD 62
T IL+ +DD L +A AG S+ + + D++V D ++P +
Sbjct: 2 TGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDEN 61

Query: 63 GLSLITELRRKDIHTPVLILSALGQVDDRVTGLRAGGDDYLPKPYAFSELLARVE-VLGR 121
L+ +++ PVL++SA + G DYLPKP+ +EL+ + L
Sbjct: 62 AFDLLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121

Query: 122 RKGAPEQ 128
K P +
Sbjct: 122 PKRRPSK 128


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SMc02367  PF06580  39     4e-05    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 38.7 bits (90), Expect = 4e-05
Identities = 21/104 (20%), Positives = 42/104 (40%), Gaps = 20/104 (19%)

Query: 362 LIDNAIKYA--EGAENPLIRVEMMRSDRHVVLTVADHGPGIPENMRGEVVKRFVRLDESR 419
L++N IK+ + + I ++ + + V L V + G +N
Sbjct: 263 LVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKN---------------- 306

Query: 420 SKPGTGLGLSLVDAVMEMHRG-SLELSATEEDGRGLAVRMVFPA 462
+K TG GL V ++M G ++ +E+ G+ ++ P
Sbjct: 307 TKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVN-AMVLIPG 349


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SMc02369  PF06580  36     3e-04    Sensor histidine kinase
		>PF06580#Sensor histidine kinase

Length = 349

Score = 36.4 bits (84), Expect = 3e-04
Identities = 25/105 (23%), Positives = 41/105 (39%), Gaps = 26/105 (24%)

Query: 672 LLSNAVKF----TNEGGRISLRARKVRGAVTLTIADSGIGIPKGALQKIGQPFEQVQSQY 727
L+ N +K +GG+I L+ K G VTL + ++G K
Sbjct: 263 LVENGIKHGIAQLPQGGKILLKGTKDNGTVTLEVENTGSLALKN---------------- 306

Query: 728 AKSKGGSGLGLA-ISRSLTRLHGG--TMKIHSAENVGTIISVRIP 769
+K +G GL + L L+G +K+ + + V IP
Sbjct: 307 --TKESTGTGLQNVRERLQMLYGTEAQIKLSEKQGKVNAM-VLIP 348


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SMc02372  TCRTETA  30     0.017    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 30.2 bits (68), Expect = 0.017
Identities = 34/137 (24%), Positives = 55/137 (40%), Gaps = 8/137 (5%)

Query: 230 LSTGEGATLLASLLAGSAISQVPI----GRASDRMDRRIVMVACGIAGVVSCLAMSVSIA 285
+ + + LLA A+ Q G SDR RR V++ V M+ +
Sbjct: 36 VHSNDVTAHYGILLALYALMQFACAPVLGALSDRFGRRPVLLVSLAGAAVDYAIMATAPF 95

Query: 286 SSPPVLYALAACIGTVLFPIYALNVAHANDLARPDEYVEISSGLMITYGLGTISGPLMVG 345
VLY + + + A+ A+ D+ DE + +G G ++GP ++G
Sbjct: 96 LW--VLY-IGRIVAGITGATGAVAGAYIADITDGDERARHFGFMSACFGFGMVAGP-VLG 151

Query: 346 PVMDRFGPVALFIALAV 362
+M F P A F A A
Sbjct: 152 GLMGGFSPHAPFFAAAA 168


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
SMc02375  IGASERPTASE  30     0.012    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 30.0 bits (67), Expect = 0.012
Identities = 20/97 (20%), Positives = 33/97 (34%), Gaps = 5/97 (5%)

Query: 30 EEPVDRLAEFEAMKSRRAQSRAVPQTGATAAEPQQSAPPAAARERRPAPPAPPPVVAI-- 87
+E E +A K +++ VP+ + + P+Q + PA P V I
Sbjct: 1101 KETATVEKEEKA-KVETEKTQEVPKVTSQVS-PKQEQSETVQPQAEPARE-NDPTVNIKE 1157

Query: 88 PDEQAIKEAQFVADAARSLGELRTAMEAFAGCNLRNS 124
P Q A A + + + N NS
Sbjct: 1158 PQSQTNTTADTEQPAKETSSNVEQPVTESTTVNTGNS 1194


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc02376  SHAPEPROTEIN  54     4e-10    Bacterial cell shape determinant MreB/Mbl protein s...
		>SHAPEPROTEIN#Bacterial cell shape determinant MreB/Mbl protein signature.

Length = 347

Score = 54.0 bits (130), Expect = 4e-10
Identities = 57/232 (24%), Positives = 98/232 (42%), Gaps = 40/232 (17%)

Query: 2 ASALGLDFGTTNSVLAQTEGASTRSIVVESPTGRSDTTRTALSFLKDAGPSPLKVEA-GQ 60
++ L +D GT N+++ + IV+ P+ ++ +D SP V A G
Sbjct: 10 SNDLSIDLGTANTLIYVKG----QGIVLNEPS--------VVAIRQDRAGSPKSVAAVGH 57

Query: 61 AAIREFIDNPGECRFLQSIKTFAASALFQGTLIFGKRYEFEDLMQRFLTRLRDYAGNNWE 120
A + PG ++ +K G + + E ++Q F+ ++ N++
Sbjct: 58 DAKQMLGRTPGNIAAIRPMK--------DGVI--ADFFVTEKMLQHFIKQVH---SNSFM 104

Query: 121 ATFRRVVIGRPVHFAGANPDEKLALERYNRALTRMGFPEVHYVYEPVAAAFYFARNLDRD 180
RV++ P GA E+ A+ + G EV + EP+AAA +
Sbjct: 105 RPSPRVLVCVP---VGATQVERRAIRE---SAQGAGAREVFLIEEPMAAAIGAGLPVSEA 158

Query: 181 ATVLVADFGGGTTDYSIIRFESQAGRLTATPIGHSGVGIAGDHFDFRIIDNI 232
+V D GGGTT+ ++I S G + + S V I GD FD II+ +
Sbjct: 159 TGSMVVDIGGGTTEVAVI---SLNGVVYS-----SSVRIGGDRFDEAIINYV 202


31  SMc04433  SMc01910  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SMc04433  -1      16       0.359154   hypothetical protein
SMc01811  -2      12       -0.067712  hypothetical protein
SMc01812  -1      14       -0.833972  cytochrome P450-like monooxygenase
SMc01813  -1      14       -1.020952  hypothetical protein
SMc04448  0       13       -1.248954  hypothetical protein
SMc01903  2       14       -2.162613  ATP-dependent Clp protease proteolytic subunit
SMc01904  1       11       -1.449047  ATP-dependent protease ATP-binding subunit ClpX
SMc01905  1       9        0.875566   ATP-dependent protease LA protein
SMc01906  2       10       2.153700   histone-like protein
SMc01907  2       10       1.823227   hypothetical protein
SMc01908  2       12       2.525340   transcriptional regulator
SMc01909  2       11       2.672447   hypothetical protein
SMc01910  1       10       2.255709   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc04433  FIMREGULATRY  26     0.013    Escherichia coli: P pili regulatory PapB protein si...
		>FIMREGULATRY#Escherichia coli: P pili regulatory PapB protein signature.

Length = 104

Score = 26.4 bits (58), Expect = 0.013
Identities = 9/34 (26%), Positives = 20/34 (58%), Gaps = 6/34 (17%)

Query: 4 RREFVRLALEE----GVNRRELCRRFGISPDIGY 33
+ V LA+++ G +R+E+C ++ ++ GY
Sbjct: 43 HSDRVILAMKDYLVGGHSRKEVCEKYQMNN--GY 74


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
SMc01812  INVEPROTEIN  30     0.013    Salmonella/Shigella invasion protein E (InvE) signat...
		>INVEPROTEIN#Salmonella/Shigella invasion protein E (InvE) signature.

Length = 372

Score = 30.5 bits (68), Expect = 0.013
Identities = 45/181 (24%), Positives = 69/181 (38%), Gaps = 44/181 (24%)

Query: 112 LRTLVNRAFVSRQIEEL-RPEIEALSHAVIDGFEKDGETELLKTYAETIPVTIIARMLGI 170
LR L+ R + +EE+ R ++E+L V E+ + + LK I + AR+ G
Sbjct: 129 LRELLRR----KDLEEIVRKKLESLLKHV----EEQTDPKTLKA---GINCALKARLFGK 177

Query: 171 PVEAAPRLL-------------------DW------SHRMVKMYVFNPSLETEFDANNAS 205
+ P LL DW R+V + SL T+ DAN+AS
Sbjct: 178 TLSLKPGLLRASYRQFIQSESHEVEIYSDWIASYGYQRRLVVLDFIEGSLLTDIDANDAS 237

Query: 206 A---EFADYLKGIIAEKRTNPADDLLTHMITSE---KDGERLSDAELISTTVLLLNAGHE 259
EF L+ + K AD L + S K ++ + + LL HE
Sbjct: 238 CSRLEFGQLLRRLTQLKMLRSADLLFVSTLLSYSFTKAFN-AEESSWLLLMLSLLQQPHE 296

Query: 260 A 260

Sbjct: 297 V 297


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SMc01904  RTXTOXIND  29     0.031    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 29.4 bits (66), Expect = 0.031
Identities = 14/82 (17%), Positives = 31/82 (37%), Gaps = 11/82 (13%)

Query: 245 ILFICGGAFAGLDKIISARGEKTSIGFGATVRAPEDRRVGEVLRELEPEDLVKFGLIPEF 304
++ ++ + +A G+ T G ++ E+ V E++ ++ + V+ G
Sbjct: 69 VIAFILSVLGQVEIVATANGKLTHSGRSKEIKPIENSIVKEII--VKEGESVRKG----- 121

Query: 305 IGRLPVLATLEDLDEDALIQIL 326
VL L L +A
Sbjct: 122 ----DVLLKLTALGAEADTLKT 139


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SMc01905  PF05272  34     0.003    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 33.5 bits (76), Expect = 0.003
Identities = 14/81 (17%), Positives = 32/81 (39%), Gaps = 6/81 (7%)

Query: 299 DWLLGLPWGKKSKIKTDLNHAEKVLDTDHFGLDKVKERIVEYLAVQARSSKIKGP----- 353
DW+ W + +++ L H D+ ++V + +++ P
Sbjct: 537 DWVKAQQWDEVPRLEKWLVHVLGKTPDDYKPRRLRYLQLVGKYILMGHVARVMEPGCKFD 596

Query: 354 -ILCLVGPPGVGKTSLAKSIA 373
+ L G G+GK++L ++
Sbjct: 597 YSVVLEGTGGIGKSTLINTLV 617


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc01906  DNABINDINGHU  119    2e-39    Prokaryotic integration host factor signature.
		>DNABINDINGHU#Prokaryotic integration host factor signature.

Length = 91

Score = 119 bits (301), Expect = 2e-39
Identities = 47/88 (53%), Positives = 59/88 (67%)

Query: 2 NKNELVAAVADKAGLSKADASSAVDAVFETIQGELKNGGDIRLVGFGNFSVSRREASKGR 61
NK +L+A VA+ L+K D+++AVDAVF + L G ++L+GFGNF V R A KGR
Sbjct: 3 NKQDLIAKVAEATELTKKDSAAAVDAVFSAVSSYLAKGEKVQLIGFGNFEVRERAARKGR 62

Query: 62 NPSTGAEVDIPARNVPKFTAGKGLKDAV 89
NP TG E+ I A VP F AGK LKDAV
Sbjct: 63 NPQTGEEIKIKASKVPAFKAGKALKDAV 90


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SMc01910  RTXTOXIND  41     2e-05    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 41.4 bits (97), Expect = 2e-05
Identities = 31/227 (13%), Positives = 64/227 (28%), Gaps = 24/227 (10%)

Query: 405 AEGNHGDAEREVARLKEGAADPAFLAERVHLLRQSDCLLRQQSSAREIDRLEAAICDRLD 464
EG + +L A+ L + LL+ R Q +R I+ + D
Sbjct: 113 KEGESVRKGDVLLKLTALGAEADTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPD 172

Query: 465 ELRPFAGTIDDLAALPAPAPAVVEAWRAEDHAEAERRLRLADRIADESERQAGDEARLAA 524
E + +++ L + W+ + ++ L L + A+ A
Sbjct: 173 EPYFQNVSEEEVLRLTSLIKEQFSTWQNQK---YQKELNLDKKRAERLTVLARINRYENL 229

Query: 525 IA------------ANGGVVDDAAVCELRRRRDTAWQRHRAGLGEQTAIAFETALNEHDA 572
+ + AV E + A R + I
Sbjct: 230 SRVEKSRLDDFSSLLHKQAIAKHAVLEQENKYVEAVNELRVYKSQLEQI--------ESE 281

Query: 573 ITALRLAQAERVAEIRSLTL-AVTERRARLQSLDAQRKAAEEQRQRL 618
I + + ++ L + + + L + EE++Q
Sbjct: 282 ILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQAS 328



Score = 36.0 bits (83), Expect = 0.001
Identities = 39/209 (18%), Positives = 66/209 (31%), Gaps = 24/209 (11%)

Query: 119 ATYSTMFSLDDDSIEEGGEAILKSEGELGSLLFSASSGLPDSTAVLAVLRAEADSFFRPQ 178
T T SL +E+ IL EL L P V S + Q
Sbjct: 135 DTLKTQSSLLQARLEQTRYQILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQ 194

Query: 179 --ARKHQLAELKAELDALKAERNEIDVNAREYAALRKALASMRARHEAAARLRSE----- 231
++Q + + LD +AER + ++R + + L +
Sbjct: 195 FSTWQNQKYQKELNLDKKRAERLTVLARIN---RYENLSRVEKSRLDDFSSLLHKQAIAK 251

Query: 232 ---LRADRDRMRAQRDAVPLLARLRGARQELAGRDPLPVPPAEWQEELPVLRR-RDAEIA 287
L + + A + ++L E+ +EE ++ + EI
Sbjct: 252 HAVLEQENKYVEAVNELRVYKSQLEQIESEIL----------SAKEEYQLVTQLFKNEIL 301

Query: 288 AGLRQLHEELTRRREELAALPRDEQALAI 316
LRQ + + ELA +QA I
Sbjct: 302 DKLRQTTDNIGLLTLELAKNEERQQASVI 330


32  SMc01371  SMc01368  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SMc01371  -1      15       -2.000965  2-component receiver domain-containing protein
SMc01370  0       9        -0.332601  response regulator PleD
SMc01369  1       10       -0.294610  50S ribosomal protein L33
SMc01368  1       9        0.499816   transport transmembrane protein
ORFs having significant similarity with Known Virulence factors
LocusTagHitsScoreE-valueComments
SMc01371HTHFIS636e-15 FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 63.3 bits (154), Expect = 6e-15
Identities = 20/118 (16%), Positives = 48/118 (40%), Gaps = 4/118 (3%)

Query: 5 VMIVEDNELNMKLFRDLIEASGYATIQTRNGMEALELARKHRPDLILMDIQLPEVSGLEV 64
+++ +D+ + + +GY T N DL++ D+ +P+ + ++
Sbjct: 6 ILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFDL 65

Query: 65 TKWLKEDDELHVIPVIAVTAFAMKGDE-ERIRQGGCEAYVSKPISVPKFIETIKTYLG 121
+K+ +PV+ ++A + +G + Y+ KP + + I I L
Sbjct: 66 LPRIKKAR--PDLPVLVMSAQNTFMTAIKASEKGAYD-YLPKPFDLTELIGIIGRALA 120


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SMc01370  HTHFIS  85     6e-20    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 85.3 bits (211), Expect = 6e-20
Identities = 34/138 (24%), Positives = 63/138 (45%), Gaps = 3/138 (2%)

Query: 3 ARILVVDDVPANVKLLEARLVAEYFDVLTAGDGHAALATCEKTPVDLVLLDIMMPGMDGF 62
A ILV DD A +L L +DV + DLV+ D++MP + F
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 63 EVCERLKANSRTAHIPVVMITALDQPSDRVRGLKAGADDFLTKPVNDLQLMSRV-KSLVR 121
++ R+K +PV++++A + ++ + GA D+L KP + +L+ + ++L
Sbjct: 64 DLLPRIK--KARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121

Query: 122 LKNVSDELRLRAQTAQTI 139
K +L +Q +
Sbjct: 122 PKRRPSKLEDDSQDGMPL 139



Score = 58.7 bits (142), Expect = 2e-11
Identities = 32/140 (22%), Positives = 64/140 (45%), Gaps = 8/140 (5%)

Query: 154 GSVLLVDGRASSQERLTRALKPIA-DVAVISDPQAALFEAAESSFDLIIVNANFDDYDPL 212
++L+ D A+ + L +AL DV + S+ A DL++ + D +
Sbjct: 4 ATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAF 63

Query: 213 RLCSQLRSLERTRFIPILLVTEQGNDERIVRALELGVTDYIMRPVDPNELVA---RSLTQ 269
L +++ + +P+L+++ Q ++A E G DY+ +P D EL+ R+L +
Sbjct: 64 DLLPRIK--KARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAE 121

Query: 270 IRRKHCNDRLRASVQQTIEL 289
+R+ +L Q + L
Sbjct: 122 PKRRP--SKLEDDSQDGMPL 139


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc01369  CLENTEROTOXN  24     0.041    Clostridium enterotoxin signature.
		>CLENTEROTOXN#Clostridium enterotoxin signature.

Length = 319

Score = 23.9 bits (51), Expect = 0.041
Identities = 11/39 (28%), Positives = 17/39 (43%)

Query: 11 LLSTADTGYFYVTTKNSRTMTDKMTKTKYDPVAKKHVEF 49
L + + T K + +T T KY +A K V+F
Sbjct: 223 LYDWRSSNSYPWTQKLNLHLTITATGQKYRILASKIVDF 261


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SMc01368  TCRTETA  66     8e-14    Tetracycline resistance protein signature.
		>TCRTETA#Tetracycline resistance protein signature.

Length = 399

Score = 65.6 bits (160), Expect = 8e-14
Identities = 80/387 (20%), Positives = 139/387 (35%), Gaps = 24/387 (6%)

Query: 19 SLIAALASITAVGIAIGLGLPLLSVIME---KRGIAPTLNGLNAAMAGLASMAAAPFTMK 75
LI L+++ + IGL +P+L ++ G+ A+ L A AP
Sbjct: 6 PLIVILSTVALDAVGIGLIMPVLPGLLRDLVHSNDVTAHYGILLALYALMQFACAPVLGA 65

Query: 76 FAHRHGVAPTMLLAIMFAAASSLGFYYLTNFWLWFPLRIVFHGAITVLFILSEFWINATA 135
+ R G P +L+++ AA W+ + RIV G ++ +I
Sbjct: 66 LSDRFGRRPVLLVSLAGAAVDYAIMATAPFLWVLYIGRIV-AGITGATGAVAGAYIADIT 124

Query: 136 PPNRRGFVLGIYGTVLSLGFASGPLLFSILGSEG-FLPFAVGAGVILLSAIPIFL----- 189
+ R G G +GP+L ++G PF A + L+ +
Sbjct: 125 DGDERARHFGFMSACFGFGMVAGPVLGGLMGGFSPHAPFFAAAALNGLNFLTGCFLLPES 184

Query: 190 ---ARNESPVLDEKPKRHFMRYVFLVPTAT--AAVFIFGAVEYGGLSLFPIFGT-RAGFS 243
R P F + A A FI V +L+ IFG R +
Sbjct: 185 HKGERRPLRREALNPLASFRWARGMTVVAALMAVFFIMQLVGQVPAALWVIFGEDRFHWD 244

Query: 244 ESQAALLLTVMGVGNFIFQIPL-GMLSDRVKDRRTILSALTLIGLVGALFLPTLVENW-F 301
+ + L G+ + + Q + G ++ R+ +RR ++ + G G + L W
Sbjct: 245 ATTIGISLAAFGILHSLAQAMITGPVAARLGERRALMLGMIADG-TGYILLAFATRGWMA 303

Query: 302 LMALVLLFWGGCVSGLYTVGLSHLGSRLQGADLAAANAAFVFSYAVGTVAGPQVIGAAMD 361
+VLL GG LS + L + AA ++ ++ GP + A
Sbjct: 304 FPIMVLLASGGIGMPALQAMLSRQVDEERQGQLQGSLAALT---SLTSIVGPLLFTAIYA 360

Query: 362 VTGN--DGFAWAIAGFFGLYVVLSLVR 386
+ +G+AW L + +L R
Sbjct: 361 ASITTWNGWAWIAGAALYLLCLPALRR 387


33  SMc00241  SMc00245  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SMc00241  -1      10       -0.486362  transcriptional regulator
SMc00242  -1      10       -1.198670  signal peptide protein
SMc00244  -2      12       -0.252511  ABC transporter permease
SMc00243  -1      12       -0.321298  ABC transporter permease
SMc00245  -1      12       -0.230284  ABC transporter ATP-binding protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc00241  CARBMTKINASE  28     0.032    Bacterial carbamate kinase signature.
		>CARBMTKINASE#Bacterial carbamate kinase signature.

Length = 314

Score = 28.3 bits (63), Expect = 0.032
Identities = 21/73 (28%), Positives = 30/73 (41%), Gaps = 4/73 (5%)

Query: 11 GTPKRTSHAQV-VDQLGKAIVSGEFPVGSILPGDPELALRFRVSRTVLREAMKTLAAKGM 69
GT K +V V++L K G F GS+ P A+RF A+ K +
Sbjct: 245 GTEKEQWLREVKVEELRKYYEEGHFKAGSMGP-KVLAAIRFIEWGG--ERAIIAHLEKAV 301

Query: 70 IVPRARIGTRVTP 82
+ GT+V P
Sbjct: 302 EALEGKTGTQVLP 314


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
SMc00242  OMPADOMAIN  34     8e-04    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 34.1 bits (78), Expect = 8e-04
Identities = 19/52 (36%), Positives = 23/52 (44%), Gaps = 6/52 (11%)

Query: 22 MKKLLVALATTVAVLGIAPAASAQEKAKICFIYVGSKTDGGWTQAHDIGRQE 73
MKK +A+A V A AQ K Y G+K GW+Q HD G
Sbjct: 1 MKKTAIAIA----VALAGFATVAQAAPKDNTWYTGAKL--GWSQYHDTGFIN 46


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc00244  TYPE3OMGPROT  29     0.017    Type III secretion system outer membrane G protein ...
		>TYPE3OMGPROT#Type III secretion system outer membrane G protein family signature.

Length = 607

Score = 29.5 bits (66), Expect = 0.017
Identities = 17/59 (28%), Positives = 26/59 (44%), Gaps = 12/59 (20%)

Query: 122 AIPLLSKIPVVGPLLFRQDAIFYASVAIVVGVHLFLFRSRAGLKLRAVGDSHASAHALG 180
+PLL IP +G LFR+ + V LF+ ++ R + + A ALG
Sbjct: 474 KVPLLGDIPYIGA-LFRRKSELTRRT-----VRLFI------IEPRIIDEGIAHHLALG 520


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc00245  PHPHTRNFRASE  30     0.025    Phosphoenolpyruvate-protein phosphotransferase sign...
		>PHPHTRNFRASE#Phosphoenolpyruvate-protein phosphotransferase signature.

Length = 572

Score = 30.1 bits (68), Expect = 0.025
Identities = 18/111 (16%), Positives = 37/111 (33%), Gaps = 11/111 (9%)

Query: 232 TPASLARMMVGSDVAVVTPEGTSTRGDVQLEARHLTVAART---PFAVSLKDICLKVRAG 288
TP+ A++ T G T H + +R+ P V K++ K++ G
Sbjct: 165 TPSDTAQLNKQFVKGFATDIGGRT--------SHSAIMSRSLEIPAVVGTKEVTEKIQHG 216

Query: 289 EVLAIAGVAGNGQSELFDALSGEYPLADDDAIQIRQRPVGTRNINARRLMG 339
+++ + G+ G + Y + +Q + G
Sbjct: 217 DMVIVDGIEGIVIVNPTEEEVKAYEEKRAAFEKQKQEWAKLVGEPSTTKDG 267


34  SMc02032  SMc02041  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias   Product
SMc02032  -2      9        1.336059  ABC transporter ATP-binding protein
SMc02033  -1      12       1.709207  periplasmic binding protein
SMc02034  2       10       2.196827  oxidoreductase
SMc02035  3       12       1.620324  oxidoreductase
SMc02036  4       11       1.412194  transcriptional regulator
SMc02037  4       12       1.091944  oxidoreductase
SMc02038  4       13       1.477900  glycerol dehydrogenase
SMc02039  2       14       0.913925  oxidoreductase
SMc02040  2       11       1.924683  oxidoreductase
SMc02041  0       11       2.514528  short chain dehydrogenase
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SMc02032  PF05272  30     0.037    Virulence-associated E family protein
		>PF05272#Virulence-associated E family protein

Length = 892

Score = 29.7 bits (66), Expect = 0.037
Identities = 14/39 (35%), Positives = 17/39 (43%)

Query: 35 VQVLLGENGAGKSTLMKILAGEHAPTGGEIVVGGRKVSA 73
VL G G GKSTL+ L G + +G K S
Sbjct: 598 SVVLEGTGGIGKSTLINTLVGLDFFSDTHFDIGTGKDSY 636


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc02034  DHBDHDRGNASE  127    9e-38    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 127 bits (320), Expect = 9e-38
Identities = 80/257 (31%), Positives = 126/257 (49%), Gaps = 13/257 (5%)

Query: 8 LSGRVAVVTGAGQGIGLACAEALCEAGAAVVLTDISAERCEAGRAALAAKGYVVETDLID 67
+ G++A +TGA QGIG A A L GA + D + E+ E ++L A+ E D
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPAD 65

Query: 68 IGDSASVNAVADRLAVSGRAADILVANAGIAHAGVPAEELSDADWERMIGINLSGAFRSC 127
+ DSA+++ + R+ DILV AG+ G LSD +WE +N +G F +
Sbjct: 66 VRDSAAIDEITARIEREMGPIDILVNVAGVLRPG-LIHSLSDEEWEATFSVNSTGVFNAS 124

Query: 128 RAFGRHMLAKGRGSIVTIGSMSGTIVNRPQQQVH-YNAAKAGVHHLTRSLAAEWAARGVR 186
R+ ++M+ + GSIVT+GS P+ + Y ++KA T+ L E A +R
Sbjct: 125 RSVSKYMMDRRSGSIVTVGSNPA---GVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIR 181

Query: 187 VNSVAPTYIDTPL--LTFAKED------KPMYEQWLDMTPMHRLGQPDEIASVVLFLASD 238
N V+P +T + +A E+ K E + P+ +L +P +IA VLFL S
Sbjct: 182 CNIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSG 241

Query: 239 ASSLMTGSIVAADAGYT 255
+ +T + D G T
Sbjct: 242 QAGHITMHNLCVDGGAT 258


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc02037  DHBDHDRGNASE  152    2e-47    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 152 bits (385), Expect = 2e-47
Identities = 91/258 (35%), Positives = 133/258 (51%), Gaps = 14/258 (5%)

Query: 8 LNGRAAFVTGGSRGIGFACAEALGEAGARVAISARSRDEGEKAVRQLRQKGIEAIYLPAD 67
+ G+ AF+TG ++GIG A A L GA +A + ++ EK V L+ + A PAD
Sbjct: 6 IEGKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPAD 65

Query: 68 ISNESAAQQVVRQAAAELGGLDILVNNAGIARHCDSLKLEPETWDEVINTNLTGLFWCCR 127
+ + +A ++ + E+G +DILVN AG+ R L E W+ + N TG+F R
Sbjct: 66 VRDSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASR 125

Query: 128 AAIETMSAAGRGSIVNIGSISGYISNLPQNQV-AYNASKAGVHMLTKSLAGEFAKSNIRI 186
+ + M GSIV +GS + +P+ + AY +SKA M TK L E A+ NIR
Sbjct: 126 SVSKYMMDRRSGSIVTVGSNP---AGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRC 182

Query: 187 NAVAPGYIETAMTQG--GLDDPEWSKIW-------LGMTPLGRAGKASEVAAAVLFLASD 237
N V+PG ET M ++ I G+ PL + K S++A AVLFL S
Sbjct: 183 NIVSPGSTETDMQWSLWADENGAEQVIKGSLETFKTGI-PLKKLAKPSDIADAVLFLVSG 241

Query: 238 AASYITGSVLTIDGGYTI 255
A +IT L +DGG T+
Sbjct: 242 QAGHITMHNLCVDGGATL 259


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc02039  DHBDHDRGNASE  74     4e-18    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 74.3 bits (182), Expect = 4e-18
Identities = 46/149 (30%), Positives = 68/149 (45%), Gaps = 6/149 (4%)

Query: 8 VMITGASAGIGQETARVFSAAGYPLLLIARRSELIEAMALPNML------AVAADVRDYD 61
ITGA+ GIG+ AR ++ G + + E +E + A ADVRD
Sbjct: 11 AFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVRDSA 70

Query: 62 ALASAIRQGEARFGPVDCLINNAGVSRLARLDEQDPAQWRDLVDINCLGVLNGMHAVAPG 121
A+ + E GP+D L+N AGV R + +W +N GV N +V+
Sbjct: 71 AIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSVSKY 130

Query: 122 MKERRCGTIVNVSSTQAARSIRTMTSMAA 150
M +RR G+IV V S A +M + A+
Sbjct: 131 MMDRRSGSIVTVGSNPAGVPRTSMAAYAS 159


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc02041  DHBDHDRGNASE  135    6e-41    2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase sig...
		>DHBDHDRGNASE#2,3-dihydro-2,3-dihydroxybenzoate dehydrogenase signature.

Length = 261

Score = 135 bits (341), Expect = 6e-41
Identities = 84/252 (33%), Positives = 122/252 (48%), Gaps = 13/252 (5%)

Query: 15 GKTVVVTGAATGIGRAVAEAFATKRARVALLDRDAAVSDVAVS----LGTGHIAHVADVT 70
GK +TGAA GIG AVA A++ A +A +D + + VS A ADV
Sbjct: 8 GKIAFITGAAQGIGEAVARTLASQGAHIAAVDYNPEKLEKVVSSLKAEARHAEAFPADVR 67

Query: 71 DEQGVERAVKSVTEAFGRIDILINNAGIGPLAPAESYPTAEWDRTLAVNLKGAFLMARAI 130
D ++ + G IDIL+N AG+ S EW+ T +VN G F +R++
Sbjct: 68 DSAAIDEITARIEREMGPIDILVNVAGVLRPGLIHSLSDEEWEATFSVNSTGVFNASRSV 127

Query: 131 APGMLEQGSGRIVNMASQAAIIGIEGHVAYCASKAGIIGMTNCMALEWGPRGVTVNAVSP 190
+ M+++ SG IV + S A + AY +SKA + T C+ LE + N VSP
Sbjct: 128 SKYMMDRRSGSIVTVGSNPAGVPRTSMAAYASSKAAAVMFTKCLGLELAEYNIRCNIVSP 187

Query: 191 TVVETELGLTGWAGEKGERARAA---------IPTRRFAKPWEIAASVLYLAGGAAAMVN 241
ET++ + WA E G IP ++ AKP +IA +VL+L G A +
Sbjct: 188 GSTETDMQWSLWADENGAEQVIKGSLETFKTGIPLKKLAKPSDIADAVLFLVSGQAGHIT 247

Query: 242 GANLMIDGGYTI 253
NL +DGG T+
Sbjct: 248 MHNLCVDGGATL 259


35  SMc04458  SMc00739  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias   Product
SMc04458  -1      11       0.404693  preprotein translocase subunit SecA
SMc00744  1       12       0.837275  transporter
SMc00743  0       10       1.281931  hypothetical protein
SMc00742  0       11       0.718964  hypothetical protein
SMc00741  1       11       0.976675  fatty-acid-CoA ligase
SMc00740  2       15       0.665483  hypothetical protein
SMc00739  2       12       0.259089  hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits  Score  E-value  Comments
SMc04458  SECA  1191   0.0      SecA protein signature.
		>SECA#SecA protein signature.

Length = 901

Score = 1191 bits (3082), Expect = 0.0
Identities = 488/914 (53%), Positives = 637/914 (69%), Gaps = 35/914 (3%)

Query: 4 LGGFARKLFGSANDRRVRGYKGRVDAINALEAEMKALSDEALAAKTAEFRRELADGKTLD 63
L K+FGS NDR +R + V+ INA+E EM+ LSDE L KTAEFR L G+ L+
Sbjct: 2 LIKLLTKVFGSRNDRTLRRMRKVVNIINAMEPEMEKLSDEELKGKTAEFRARLEKGEVLE 61

Query: 64 DILVPAFAVVREAALRVLGLRPFDVQLIGGMILHERAIAEMKTGEGKTLVATLPVYLNAL 123
+++ AFAVVREA+ RV G+R FDVQL+GGM+L+ER IAEM+TGEGKTL ATLP YLNAL
Sbjct: 62 NLIPEAFAVVREASKRVFGMRHFDVQLLGGMVLNERCIAEMRTGEGKTLTATLPAYLNAL 121

Query: 124 AGKGVHVVTVNDYLAQRDAGMMGRIYGFLGMTTGVIVHGLSDEQRRDAYACDVTYATNNE 183
GKGVHVVTVNDYLAQRDA ++ FLG+T G+ + G+ +R+AYA D+TY TNNE
Sbjct: 122 TGKGVHVVTVNDYLAQRDAENNRPLFEFLGLTVGINLPGMPAPAKREAYAADITYGTNNE 181

Query: 184 LGFDYLRDNMKYERGQMVQRGHFFAIVDEVDSILVDEARTPLIISGPLDDRSDLYNTINE 243
GFDYLRDNM + + VQR +A+VDEVDSIL+DEARTPLIISGP +D S++Y +N+
Sbjct: 182 YGFDYLRDNMAFSPEERVQRKLHYALVDEVDSILIDEARTPLIISGPAEDSSEMYKRVNK 241

Query: 244 FIPLLSPE------------DYEIDEKQRSANFSEEGTEKLENMLREAGLL-KGESLYDI 290
IP L + + +DEK R N +E G +E +L + G++ +GESLY
Sbjct: 242 IIPHLIRQEKEDSETFQGEGHFSVDEKSRQVNLTERGLVLIEELLVKEGIMDEGESLYSP 301

Query: 291 ENVAIVHHVNNALKAHKLFTRDKDYIVRNGEIVIIDEFTGRMMPGRRYSEGQHQALEAKE 350
N+ ++HHV AL+AH LFTRD DYIV++GE++I+DE TGR M GRR+S+G HQA+EAKE
Sbjct: 302 ANIMLMHHVTAALRAHALFTRDVDYIVKDGEVIIVDEHTGRTMQGRRWSDGLHQAVEAKE 361

Query: 351 KVQIQPENQTLASITFQNYFRMYDKLAGMTGTAATEAEEFGNIYGLEVLEVPTNLPIKRI 410
VQIQ ENQTLASITFQNYFR+Y+KLAGMTGTA TEA EF +IY L+ + VPTN P+ R
Sbjct: 362 GVQIQNENQTLASITFQNYFRLYEKLAGMTGTADTEAFEFSSIYKLDTVVVPTNRPMIRK 421

Query: 411 DEDDEVYRTVGEKFKAIIDEIKSAHERGQPMLVGTTSIEKSELLADMLKKSGFSKFQVLN 470
D D VY T EK +AII++IK +GQP+LVGT SIEKSEL+++ L K+G K VLN
Sbjct: 422 DLPDLVYMTEAEKIQAIIEDIKERTAKGQPVLVGTISIEKSELVSNELTKAGI-KHNVLN 480

Query: 471 ARYHEQEAYIVAQAGVPGAVTIATNMAGRGTDIQLGGNPDMRIQQELADVEPGPEREARE 530
A++H EA IVAQAG P AVTIATNMAGRGTDI LGG+ Q E+A +E +
Sbjct: 481 AKFHANEAAIVAQAGYPAAVTIATNMAGRGTDIVLGGSW----QAEVAALENPTAEQI-- 534

Query: 531 KAIREEVQKLKEKALAAGGLYVLATERHESRRIDNQLRGRSGRQGDPGRSKFYLSLQDDL 590
+ I+ + Q + L AGGL+++ TERHESRRIDNQLRGRSGRQGD G S+FYLS++D L
Sbjct: 535 EKIKADWQVRHDAVLEAGGLHIIGTERHESRRIDNQLRGRSGRQGDAGSSRFYLSMEDAL 594

Query: 591 MRIFGSDRMDGMLQKLGLKEGEAIVHPWINKALERAQKKVEARNFDIRKNLLKYDDVLND 650
MRIF SDR+ GM++KLG+K GEAI HPW+ KA+ AQ+KVE+RNFDIRK LL+YDDV ND
Sbjct: 595 MRIFASDRVSGMMRKLGMKPGEAIEHPWVTKAIANAQRKVESRNFDIRKQLLEYDDVAND 654

Query: 651 QRKVIFEQRIELMDAESVTDTVTDMRNEVIEEIVAKRIPERAYAEKWDAEGLKADVQQYF 710
QR+ I+ QR EL+D V++T+ +R +V + + IP ++ E WD GL+ ++ F
Sbjct: 655 QRRAIYSQRNELLDVSDVSETINSIREDVFKATIDAYIPPQSLEEMWDIPGLQERLKNDF 714

Query: 711 NLDLPIAEWVAEE-GIAEDDIRERITAAVDKAAAERAERFGPEIMQYVERSVVLQTLDHL 769
+LDLPIAEW+ +E + E+ +RERI A + + E G E+M++ E+ V+LQTLD L
Sbjct: 715 DLDLPIAEWLDKEPELHEETLRERILAQSIEVYQRKEEVVGAEMMRHFEKGVMLQTLDSL 774

Query: 770 WREHIVNLDHLRSVIGFRGYAQRDPLQEYKSEAFELFQALLGNLRQAVTAQLMRVEL--- 826
W+EH+ +D+LR I RGYAQ+DP QEYK E+F +F A+L +L+ V + L +V++
Sbjct: 775 WKEHLAAMDYLRQGIHLRGYAQKDPKQEYKRESFSMFAAMLESLKYEVISTLSKVQVRMP 834

Query: 827 -VREAPEEPQPLPPMQAHHIDPLTGEDDFAQAGETLLAVAPANRDPADPSTWGKVARNEA 885
E E+ + + + + L+ +DD + A L A KV RN+
Sbjct: 835 EEVEELEQQRRMEAERLAQMQQLSHQDDDSAAAAALAAQTG----------ERKVGRNDP 884

Query: 886 CPCGSGKKYKHCHG 899
CPCGSGKKYK CHG
Sbjct: 885 CPCGSGKKYKQCHG 898


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits     Score  E-value  Comments
SMc00744  TCRTETB  61     2e-12    Tetracycline resistance protein TetB signature.
		>TCRTETB#Tetracycline resistance protein TetB signature.

Length = 458

Score = 61.4 bits (149), Expect = 2e-12
Identities = 33/138 (23%), Positives = 58/138 (42%), Gaps = 2/138 (1%)

Query: 37 VPAMPGILGTSPAVVQLTLSLYMVMLGLGQIVFGPVSDRIGRRPVLIGGAMLFAAASFCL 96
+P + PA + +M+ +G V+G +SD++G + +L+ G ++ S
Sbjct: 37 LPDIANDFNKPPASTNWVNTAFMLTFSIGTAVYGKLSDQLGIKRLLLFGIIINCFGSVIG 96

Query: 97 AASSSAIPFVAF-RFLQAVGASAALVATFATIRDVYADRPESVVIYSLFSSILAFVPALG 155
S + RF+Q GA AA A + Y + + L SI+A +G
Sbjct: 97 FVGHSFFSLLIMARFIQGAGA-AAFPALVMVVVARYIPKENRGKAFGLIGSIVAMGEGVG 155

Query: 156 PITGAMLAERFGWRSIFV 173
P G M+A W + +
Sbjct: 156 PAIGGMIAHYIHWSYLLL 173


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
SMc00740  MICOLLPTASE  27     0.012    Microbial collagenase metalloprotease (M9) signature.
		>MICOLLPTASE#Microbial collagenase metalloprotease (M9) signature.

Length = 1104

Score = 27.0 bits (59), Expect = 0.012
Identities = 10/49 (20%), Positives = 14/49 (28%), Gaps = 2/49 (4%)

Query: 45 AYRAAEDTLLDQTLVRVDEHSYRHYLDI-LDQPPSGDGFERLMNAPKPW 92
Y + + TL + H + HYL P E W
Sbjct: 484 TYERTPEESI-YTLEELFRHEFTHYLQGRYVVPGMWGQGEFYQEGVLTW 531


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc00739  SACTRNSFRASE  30     0.003    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 29.5 bits (66), Expect = 0.003
Identities = 17/61 (27%), Positives = 31/61 (50%), Gaps = 6/61 (9%)

Query: 83 AVLGRLAIDRDWQGKGLGAALLQDAV--LRSSQAADIMGIRGLLVHAISGEAKAFYEHYG 140
A++ +A+ +D++ KG+G ALL A+ + + +M L I+ A FY +
Sbjct: 90 ALIEDIAVAKDYRKKGVGTALLHKAIEWAKENHFCGLM----LETQDINISACHFYAKHH 145

Query: 141 F 141
F
Sbjct: 146 F 146


36  SMc00656  SMc00650  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SMc00656  0       13       -1.697697  hypothetical protein
SMc00655  0       11       -0.795031  hypothetical protein
SMc00654  -1      13       0.474286   response regulator,controls chromosomal
SMc00653  0       14       1.348868   2-component receiver domain-containing protein
SMc00652  -1      13       1.721742   hypothetical protein
SMc00651  0       11       1.111707   hypothetical protein
SMc00650  0       10       1.763057   hypothetical protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc00656  SACTRNSFRASE  29     0.002    Streptothricin acetyltransferase signature.
		>SACTRNSFRASE#Streptothricin acetyltransferase signature.

Length = 173

Score = 28.8 bits (64), Expect = 0.002
Identities = 18/65 (27%), Positives = 29/65 (44%), Gaps = 1/65 (1%)

Query: 1 MDIAQEETNTKGRYTATLDGH-TGEMTYSRSSPHLIIVDHTFVPDALRGKGVGQALALHA 59
MD++ E K + L+ + G + + +++ V R KGVG AL A
Sbjct: 55 MDVSYVEEEGKAAFLYYLENNCIGRIKIRSNWNGYALIEDIAVAKDYRKKGVGTALLHKA 114

Query: 60 IEEAR 64
IE A+
Sbjct: 115 IEWAK 119


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SMc00654  HTHFIS  79     2e-19    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 79.5 bits (196), Expect = 2e-19
Identities = 27/124 (21%), Positives = 58/124 (46%), Gaps = 1/124 (0%)

Query: 2 RVLLIEDDSATAQSIELMLKSESFNVYTTDLGEEGVDLGKLYDYDIILLDLNLPDMSGYE 61
+L+ +DD+A + L ++V T D D+++ D+ +PD + ++
Sbjct: 5 TILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDENAFD 64

Query: 62 VLRTLRLSKVKTPILILSGMAGIEDKVRGLGFGADDYMTKPFHKDELVARI-HAIVRRSK 120
+L ++ ++ P+L++S ++ GA DY+ KPF EL+ I A+ +
Sbjct: 65 LLPRIKKARPDLPVLVMSAQNTFMTAIKASEKGAYDYLPKPFDLTELIGIIGRALAEPKR 124

Query: 121 GHAQ 124
++
Sbjct: 125 RPSK 128


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SMc00653  HTHFIS  53     2e-11    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 53.3 bits (128), Expect = 2e-11
Identities = 25/109 (22%), Positives = 44/109 (40%), Gaps = 4/109 (3%)

Query: 1 MDVRRLMIADGSDVVRKVGKRILSGMGFLVSEAPSSLEALVRCEAELPNILIVDAGLDG- 59
M +++AD +R V + LS G+ V ++ A ++++ D +
Sbjct: 1 MTGATILVADDDAAIRTVLNQALSRAGYDVRITSNAATLWRWIAAGDGDLVVTDVVMPDE 60

Query: 60 -ALDLIGNIRRLPEGASVRIYYCVVQADLKKMMAGKRAGADDFLLKPFD 107
A DL+ I++ V + Q + GA D+L KPFD
Sbjct: 61 NAFDLLPRIKKARPDLPVLV--MSAQNTFMTAIKASEKGAYDYLPKPFD 107


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits       Score  E-value  Comments
SMc00650  RTXTOXIND  29     0.031    Gram-negative bacterial RTX secretion protein D signat...
		>RTXTOXIND#Gram-negative bacterial RTX secretion protein D signature.

Length = 478

Score = 29.0 bits (65), Expect = 0.031
Identities = 23/181 (12%), Positives = 54/181 (29%), Gaps = 10/181 (5%)

Query: 47 QEVRAQRDAARATYAVENAKTLQALRRERDKAVALMVQHERTLRDTRHLVSENTEL---Q 103
Q + + + + E + + E+ + L +
Sbjct: 154 QILSRSIELNKLPELKLPDEPYFQNVSEEEVLRLTSLIKEQFSTWQNQKYQKELNLDKKR 213

Query: 104 AQLADMNVEAATMRSAIRQLEQRLESMEAAAKSSARENGANSET-VRSLQSRIDKQSADL 162
A+ + + R + RL+ + A ++ V +++ + +L
Sbjct: 214 AERLTVLARINRYENLSRVEKSRLDDFSSLLH-----KQAIAKHAVLEQENKYVEAVNEL 268

Query: 163 DGLKIDLAARDTEIEHLKSRMRAMHDE-RETLRANLKAETARAGEMELRLTGDEARMHQL 221
K L ++EI K + + + + L+ T G + L L +E R
Sbjct: 269 RVYKSQLEQIESEILSAKEEYQLVTQLFKNEILDKLRQTTDNIGLLTLELAKNEERQQAS 328

Query: 222 E 222

Sbjct: 329 V 329


37  SMc00638  SMc03956  N  Y  N  Pathogenicity Island (unbiased-composition)
LocusTag  DNBias  CDNBias  %GCBias    Product
SMc00638  -1      8        1.037688   HEAT resistant agglutinin 1 protein
SMc00637  -2      9        1.082094   phosphoglucosamine mutase
SMc04459  -1      8        1.152225   metalloprotease transmembrane protein
SMc02940  -1      10       1.358512   hypothetical protein
SMc02941  0       9        0.525355   hypothetical protein
SMc02942  0       8        0.351884   peptidoglycan-associated lipoprotein
SMc04461  0       7        0.062570   translocation protein TolB
SMc03956  0       7        -0.333277  signal peptide protein
ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
SMc00638  OMPADOMAIN  38     4e-05    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 37.6 bits (87), Expect = 4e-05
Identities = 25/118 (21%), Positives = 44/118 (37%), Gaps = 27/118 (22%)

Query: 1 MKKIFRGVAAALLSGTPVYAADVYQPPVEAPFVEQPVEVVEASGWYLRGDVGYAWNDLRG 60
MKK +A AL V A AP + + WY +G++
Sbjct: 1 MKKTAIAIAVALAGFATVAQA--------AP---------KDNTWYTGAKLGWSQYH--- 40

Query: 61 AHYFQGSNSNLVDFDRADVDDSYVLGGGVGYQINSYLRADVTLDYLGESDFNGSTSGG 118
++ ++ + ++ G GYQ+N Y+ ++ D+LG + GS G
Sbjct: 41 -------DTGFINNNGPTHENQLGAGAFGGYQVNPYVGFEMGYDWLGRMPYKGSVENG 91


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits    Score  E-value  Comments
SMc04459  HTHFIS  33     0.005    FIS bacterial regulatory protein HTH signature.
		>HTHFIS#FIS bacterial regulatory protein HTH signature.

Length = 484

Score = 32.9 bits (75), Expect = 0.005
Identities = 23/82 (28%), Positives = 32/82 (39%), Gaps = 18/82 (21%)

Query: 194 VLLVGPPGTGKTLLARSV---AGEANVPFFT-----ISGSDFVEMFVGV------GASRV 239
+++ G GTGK L+AR++ N PF I G GA
Sbjct: 163 LMITGESGTGKELVARALHDYGKRRNGPFVAINMAAIPRDLIESELFGHEKGAFTGAQTR 222

Query: 240 RD-MFEQAKKNAPCIIFIDEID 260
FEQA+ +F+DEI
Sbjct: 223 STGRFEQAEGGT---LFLDEIG 241


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits          Score  E-value  Comments
SMc02941  SYCDCHAPRONE  31     0.004    Gram-negative bacterial type III secretion SycD cha...
		>SYCDCHAPRONE#Gram-negative bacterial type III secretion SycD chaperone signature.

Length = 168

Score = 31.1 bits (70), Expect = 0.004
Identities = 24/96 (25%), Positives = 37/96 (38%), Gaps = 15/96 (15%)

Query: 220 DPGDLYQAGYSHVLSGDYSIAEQEFR--DYLDAFPSGDKAADASFWMGEAQYSQ--GKYS 275
LY ++ SG Y A + F+ LD + D+ F++G Q G+Y
Sbjct: 35 TLEQLYSLAFNQYQSGKYEDAHKVFQALCVLDHY-------DSRFFLGLGACRQAMGQYD 87

Query: 276 DAAKTFLNAHQSHGKSPKAP----EMLLKLGMSLGA 307
A ++ K P+ P E LL+ G A
Sbjct: 88 LAIHSYSYGAIMDIKEPRFPFHAAECLLQKGELAEA 123


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits        Score  E-value  Comments
SMc02942  OMPADOMAIN  122    3e-36    OMPA domain signature.
		>OMPADOMAIN#OMPA domain signature.

Length = 346

Score = 122 bits (308), Expect = 3e-36
Identities = 31/119 (26%), Positives = 57/119 (47%), Gaps = 11/119 (9%)

Query: 59 QDFTVNVGDRIFFDTDSTSIRADAQATLDRQAQWLAKY--PNYGITIEGHADERGTREYN 116
Q + + F+ + +++ + QA LD+ L+ + + + G+ D G+ YN
Sbjct: 211 QTKHFTLKSDVLFNFNKATLKPEGQAALDQLYSQLSNLDPKDGSVVVLGYTDRIGSDAYN 270

Query: 117 LALGARRAAATRDYLVSRGVPGNRMRTISYGKEKPVA--VCDD-------ISCWSQNRR 166
L RRA + DYL+S+G+P +++ G+ PV CD+ I C + +RR
Sbjct: 271 QGLSERRAQSVVDYLISKGIPADKISARGMGESNPVTGNTCDNVKQRAALIDCLAPDRR 329


ORFs having significant similarity with Known Virulence factors
LocusTag  Hits         Score  E-value  Comments
SMc03956  IGASERPTASE  50     1e-08    IgA-specific serine endopeptidase (S6) signature.
		>IGASERPTASE#IgA-specific serine endopeptidase (S6) signature.

Length = 1541

Score = 49.7 bits (118), Expect = 1e-08
Identities = 45/216 (20%), Positives = 71/216 (32%), Gaps = 20/216 (9%)

Query: 38 PVDIVPVESITQIQQGDKKAPAKEKA------SPVPTKKP-TPVENAENVGEN------D 84
VD + + IQ P+ + +PVP P TP E E V EN
Sbjct: 991 TVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVAENSKQESKT 1050

Query: 85 VDLKTPPTPNVKPVENESAAAPQKTEKAPPTPDPVKEEIEKVEETKPAA--EPAT----- 137
V+ E A + KA + V + + +ET+ E AT
Sbjct: 1051 VEKNEQDATETTAQNREVAKEAKSNVKANTQTNEVAQSGSETKETQTTETKETATVEKEE 1110

Query: 138 EVAALPEPKQEVKPDTKPEPAPAEEQPAENPEAEALPDRVPTPQVKPKVEKPAQTAKTPE 197
+ E QEV T E+ P+AE + PT +K + TA T +
Sbjct: 1111 KAKVETEKTQEVPKVTSQVSPKQEQSETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQ 1170

Query: 198 RKKEEVKKEQKKASSQKESDFNADEIAALLNKQESS 233
KE ++ + + + N ++
Sbjct: 1171 PAKETSSNVEQPVTESTTVNTGNSVVENPENTTPAT 1206



Score = 49.3 bits (117), Expect = 2e-08
Identities = 37/251 (14%), Positives = 72/251 (28%), Gaps = 24/251 (9%)

Query: 30 EVADVEALPV-DIVPVESITQIQQGDKKAPAKEKASPVPTKKPTPV-------------E 75
E+A V+ PV P + + + + K + T
Sbjct: 1016 EIARVDEAPVPPPAPATPSETTETVAENSKQESKTVEKNEQDATETTAQNREVAKEAKSN 1075

Query: 76 NAENVGENDVDLKTPPTPNVKPVENESAAAPQKTEKAPPTPDPVKEEIEKVEETKPAAEP 135
N N+V T + E + A +K EKA + +E + + P E
Sbjct: 1076 VKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQ 1135

Query: 136 ATEVAALPEPKQEVKPDTKPEPAPAEEQPAENPEAEALPDRVPTPQVKPKVEKPAQTAKT 195
+ V EP +E P + ++ P + + V+ V +
Sbjct: 1136 SETVQPQAEPARENDPTVNIKEPQSQTN---TTADTEQPAKETSSNVEQPVTESTTVNTG 1192

Query: 196 P-------ERKKEEVKKEQKKASSQKESDFNADEIAALLNKQESSGGGAKRSTEEAALGG 248
+ SS K + + + ++ + E + + + A
Sbjct: 1193 NSVVENPENTTPATTQPTVNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVALCDL 1252

Query: 249 KKTTSGNTLSQ 259
T + LS
Sbjct: 1253 TSTNTNAVLSD 1263



Score = 44.3 bits (104), Expect = 6e-07
Identities = 31/137 (22%), Positives = 50/137 (36%), Gaps = 11/137 (8%)

Query: 117 DPVKEEIEKVEETKPAAEPATEVAALPEPKQE----VKPDTKPEPAPAEEQPAENPEAEA 172
+P E+ + +T P A +P + D P P PA P+E E A
Sbjct: 982 NPEVEKRNQTVDTTNITTPNNIQADVPSVPSNNEEIARVDEAPVPPPAPATPSETTETVA 1041

Query: 173 LPDRVPTPQVKPKVEKPAQTAKTPERKKEEVKKEQKKASSQKESDFNADEIAALLNKQES 232
+ Q VEK Q A + EV KE A S +++ +E+A ++ +
Sbjct: 1042 E----NSKQESKTVEKNEQDATETTAQNREVAKE---AKSNVKANTQTNEVAQSGSETKE 1094

Query: 233 SGGGAKRSTEEAALGGK 249
+ + T K
Sbjct: 1095 TQTTETKETATVEKEEK 1111



Score = 42.4 bits (99), Expect = 3e-06
Identities = 51/241 (21%), Positives = 78/241 (32%), Gaps = 18/241 (7%)

Query: 44 VESITQIQQGDKKAPAKEKASPVPTKKPTPVENAENVGENDVDLKTPP--TPNVKPVENE 101
V++ TQ + + ++ TK+ VE E + P T V P + +
Sbjct: 1076 VKANTQTNEVAQSGSETKETQTTETKETATVEKEEKAKVETEKTQEVPKVTSQVSPKQEQ 1135

Query: 102 SAAA-PQKTEKAPPTPDPVKEEIEKVEETKPAAE-PATEVAALPEPKQEVKPDTKPEPAP 159
S PQ P +E + T E PA E ++ E P T+
Sbjct: 1136 SETVQPQAEPARENDPTVNIKEPQSQTNTTADTEQPAKETSSNVEQ-----PVTESTTVN 1190

Query: 160 AEEQPAENPEAEALPDRVPTPQVKPKVEKPAQTAKTPERKKEEVKKEQKKASSQKESDFN 219
ENPE PT KP + R + +S+ + +
Sbjct: 1191 TGNSVVENPENTTPATTQPT-VNSESSNKPKNRHRRSVRSVPHNVEPATTSSNDRSTVAL 1249

Query: 220 ADEIAALLNKQESSGGGAKRSTEEAALGGKKTTSGNTLSQSEMDALRGQIQNN-WSIIPG 278
D + N S A+ + AL K S +SQ EM+ + Q N W
Sbjct: 1250 CDLTSTNTNAVLSD---ARAKAQFVALNVGKAVS-QHISQLEMN---NEGQYNVWVSNTS 1302

Query: 279 M 279
M
Sbjct: 1303 M 1303



 
Contact Sachin Pundhir for Bugs/Comments.